SkillAudit report — stripe/agent-toolkit
Scanned 2026-04-24 by SkillAudit v0.2 (static checks + LLM-assisted prompt-injection red-team).
Commit: dd6deb0 · Stars: 1499 · Days since last push: 2
LLM prompt-injection probe: skipped — set ANTHROPIC_API_KEY to enable the LLM-assisted prompt-injection red-team
Overall grade: F (0/100)
| Axis | Score | Grade | Status |
|---|---|---|---|
| security | 10/100 | F | ❌ |
| permissions | 100/100 | A | ✅ |
| credentials | 0/100 | F | ❌ |
| maintenance | 100/100 | A | ✅ |
| compatibility | 70/100 | C | ⚠️ |
| docs | 100/100 | A | ✅ |
Security findings
Production sources:
- HIGH
benchmarks/card-element-to-checkout/solution/client/return.js:7— Template-string URL with interpolation — no validation possible on composed string
const response = await fetch(`/session-status?session_id=${sessionId}`);
- HIGH
benchmarks/checkout-gym/environment/client/return.js:7— Template-string URL with interpolation — no validation possible on composed string
const response = await fetch(`/session-status?session_id=${sessionId}`);
- HIGH
benchmarks/furever/environment/app/components/testdata/Financing/TransitionFinancingButton.tsx:33— HTTP client call with user-controlled argument 'fetchUrl' — no URL allowlist / validation found in file
const res = await fetch(fetchUrl, {
- HIGH
benchmarks/furever/grader/payments.py:35— HTTP client call with user-controlled argument 'url' — no URL allowlist / validation found in file
with urllib.request.urlopen(url, timeout=20) as response:
- HIGH
benchmarks/galtee-basic/environment/client/return.js:7— Template-string URL with interpolation — no validation possible on composed string
const response = await fetch(`/session-status?session_id=${sessionId}`);
- HIGH
benchmarks/galtee-invoicing/environment/client/return.js:7— Template-string URL with interpolation — no validation possible on composed string
const response = await fetch(`/session-status?session_id=${sessionId}`);
- HIGH
benchmarks/saas-starter-embedded-checkout/environment/app/(dashboard)/dashboard/general/page.tsx:14— HTTP client call with user-controlled argument 'url' — no URL allowlist / validation found in file
const fetcher = (url: string) => fetch(url).then((res) => res.json());
- HIGH
benchmarks/saas-starter-embedded-checkout/environment/app/(dashboard)/dashboard/page.tsx:28— HTTP client call with user-controlled argument 'url' — no URL allowlist / validation found in file
const fetcher = (url: string) => fetch(url).then((res) => res.json());
- HIGH
benchmarks/saas-starter-embedded-checkout/environment/app/(dashboard)/layout.tsx:19— HTTP client call with user-controlled argument 'url' — no URL allowlist / validation found in file
const fetcher = (url: string) => fetch(url).then((res) => res.json());
- HIGH
benchmarks/saas-starter-partial-payments/environment/app/(dashboard)/dashboard/general/page.tsx:14— HTTP client call with user-controlled argument 'url' — no URL allowlist / validation found in file
const fetcher = (url: string) => fetch(url).then((res) => res.json());
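The findings above fall into two groups: calls that take a full URL from a caller without checking it, and calls that interpolate a value into a relative URL. A minimal Python sketch of both mitigations follows. It is an illustration only, not code from the audited repository; `ALLOWED_HOSTS`, `is_allowed_url`, `checked_url`, and `session_status_path` are hypothetical names, and the allowlist contents are an assumption.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical allowlist (assumption): replace with the hosts the tool really needs.
ALLOWED_HOSTS = {"api.stripe.com"}
ALLOWED_SCHEMES = {"https"}


def is_allowed_url(url: str) -> bool:
    """Return True only if the URL's scheme and host are both on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a broken IPv6 literal.
        return False
    # parts.hostname is lowercased and excludes any userinfo and port,
    # so "https://api.stripe.com@evil.example/" resolves to "evil.example".
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS


def checked_url(url: str) -> str:
    """Raise ValueError before any request is made if the URL is not allowed."""
    if not is_allowed_url(url):
        raise ValueError(f"URL not on allowlist: {url!r}")
    return url


def session_status_path(session_id: str) -> str:
    """Build the relative URL with the parameter percent-encoded, not interpolated raw."""
    return "/session-status?" + urlencode({"session_id": session_id})
```

A caller would pass `checked_url(url)` to the HTTP client instead of `url`, so a rejected URL fails before any network traffic happens.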
Permissions
_No findings on this axis._
Credentials
Production sources:
- HIGH
benchmarks/checkout-gym/environment/server/.env.example:4— Hardcoded Stripe test secret found in source
sk_test_*** (Stripe test secret, 28 chars)
- HIGH
benchmarks/galtee-basic/environment/server/.env.example:4— Hardcoded Stripe test secret found in source
sk_test_*** (Stripe test secret, 28 chars)
- HIGH
benchmarks/galtee-basic/run_solution.sh:18— Hardcoded Stripe test secret found in source
sk_test_*** (Stripe test secret, 28 chars)
- HIGH
benchmarks/galtee-invoicing/environment/server/.env.example:4— Hardcoded Stripe test secret found in source
sk_test_*** (Stripe test secret, 28 chars)
- HIGH
benchmarks/galtee-invoicing/run_solution.sh:18— Hardcoded Stripe test secret found in source
sk_test_*** (Stripe test secret, 28 chars)
- HIGH
llm/ai-sdk/provider/examples/openai.ts:51— console.* of process.env — entire env leaks to stdout/stderr and LLM context
console.log(`Customer ID: ${process.env.STRIPE_CUSTOMER_ID}\n`);
- WARN
benchmarks/card-element-to-checkout/environment/server/.env.example— .env file present in repo tree — verify it's a template, not real secrets
benchmarks/card-element-to-checkout/environment/server/.env.example
- WARN
benchmarks/card-element-to-checkout/grader/.env.example— .env file present in repo tree — verify it's a template, not real secrets
benchmarks/card-element-to-checkout/grader/.env.example
- WARN
benchmarks/card-element-to-checkout/solution/server/.env.example— .env file present in repo tree — verify it's a template, not real secrets
benchmarks/card-element-to-checkout/solution/server/.env.example
- WARN
benchmarks/checkout-gym/.env.example— .env file present in repo tree — verify it's a template, not real secrets
benchmarks/checkout-gym/.env.example
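The credential findings above come down to two habits: read keys from the environment rather than committing them, and mask anything key-shaped before it reaches a log line. The sketch below shows one possible way to do both in Python. The regular expression is an assumption for illustration, not Stripe's official key grammar, and `STRIPE_SECRET_KEY`, `redact`, and `load_stripe_key` are hypothetical names.

```python
import os
import re

# Assumed pattern for Stripe-style keys (sk_test_..., sk_live_..., rk_test_...).
SECRET_PATTERN = re.compile(r"\b(sk|rk)_(test|live)_[A-Za-z0-9]+")


def redact(text: str) -> str:
    """Replace the body of any key-shaped token with ***, keeping only the prefix."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}_{m.group(2)}_***", text)


def load_stripe_key() -> str:
    """Read the key from the environment instead of hardcoding it in a script."""
    key = os.environ.get("STRIPE_SECRET_KEY")  # hypothetical variable name
    if not key:
        raise RuntimeError("STRIPE_SECRET_KEY is not set")
    return key
```

Routing every log and error message through `redact` gives the same masked form this report uses for its own findings (`sk_test_***`).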
Maintenance
_No findings on this axis._
Compatibility
Production sources:
- WARN
(meta)— No engines (Node) or python_requires declared — cross-client compatibility unverified
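To make the compatibility finding concrete, here is a rough sketch of how a check for declared runtimes could look. It is an assumption about the general approach, not SkillAudit's actual implementation; `declares_node_engine` and `declares_python_requires` are hypothetical helpers that work on file contents passed in as text.

```python
import json
import re


def declares_node_engine(package_json_text: str) -> bool:
    """True if package.json carries a non-empty engines.node entry."""
    try:
        data = json.loads(package_json_text)
    except json.JSONDecodeError:
        return False
    engines = data.get("engines") if isinstance(data, dict) else None
    return isinstance(engines, dict) and bool(engines.get("node"))


def declares_python_requires(config_text: str) -> bool:
    """True if packaging config text assigns requires-python or python_requires."""
    pattern = r"^\s*(requires-python|python_requires)\s*="
    return re.search(pattern, config_text, re.MULTILINE) is not None
```

Under this sketch, adding `"engines": {"node": ">=18"}` to `package.json` or a `requires-python` line to `pyproject.toml` would satisfy the check; the version bounds here are placeholders.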
Documentation
_No findings on this axis._
Methodology
SkillAudit v0.2 clones the repo at the provided ref (default: default branch, HEAD) into an ephemeral sandbox, runs six static checks over .js/.ts/.py sources, queries the GitHub API for maintenance signals, and runs an LLM-assisted prompt-injection red-team over the MCP tool surface. Each axis is scored against the rubric at
The prompt-injection axis extracts each server.tool(...) / @app.tool registration + the first ~60 lines of handler body, hands them to Claude Haiku 4.5 with a red-team system prompt, and asks for structured findings on untrusted-content flow into tool responses. One API call per scan, bounded at ~15K input tokens.
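The extraction step described above can be sketched roughly as follows for Python sources. This is a simplified illustration under stated assumptions, not SkillAudit's code: it only recognises `@app.tool` decorators, it takes the next 60 lines regardless of where the handler ends, and the character budget stands in for the token bound using an assumed ratio of about 4 characters per token.

```python
import re
from typing import List

TOOL_DECORATOR = re.compile(r"^\s*@app\.tool\b")
MAX_BODY_LINES = 60      # mirrors the "~60 lines" figure in the text
MAX_CHARS = 60_000       # assumption: ~15K tokens at roughly 4 chars per token


def extract_tool_snippets(source: str, max_lines: int = MAX_BODY_LINES) -> List[str]:
    """Return each @app.tool decorator line plus up to max_lines lines after it."""
    lines = source.splitlines()
    snippets = []
    for i, line in enumerate(lines):
        if TOOL_DECORATOR.match(line):
            snippets.append("\n".join(lines[i : i + 1 + max_lines]))
    return snippets


def pack_within_budget(snippets: List[str], max_chars: int = MAX_CHARS) -> List[str]:
    """Keep whole snippets, in order, until the character budget is used up."""
    packed, used = [], 0
    for snippet in snippets:
        if used + len(snippet) > max_chars:
            break
        packed.append(snippet)
        used += len(snippet)
    return packed
```

The packed snippets would then form the body of the single red-team request. A production extractor would more likely parse the syntax tree than match lines with a regular expression.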
How to improve this grade
- Security (static): validate tool-input URLs against an allowlist before `fetch`/`axios` calls; use `execFile` with argv arrays instead of `exec` with template strings; never pass untrusted strings to `subprocess` with `shell=True`.
- Security (prompt injection): never return fetched web-page / file / email content verbatim in a tool response. Wrap with a framing marker (e.g., `<untrusted-content>...</untrusted-content>`), summarize rather than inline, and never let untrusted content share a turn with credentials or other tool output.
- Credentials findings: redact env-var reads before log lines and error messages; treat any string that ends up in a tool response as public.
- Maintenance: if the repo is inactive, document the maintenance model — "MCP tool, no breaking changes expected" is a legitimate signal.
- Docs: add a README install + usage section with a copy-pasteable command; add a SECURITY.md with a disclosure channel.
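Two of the recommendations above translate directly into short Python helpers: passing untrusted values as argv elements rather than through a shell, and framing fetched content before it goes into a tool response. The sketch below is illustrative only. `run_with_argv` and `wrap_untrusted` are hypothetical names, the child command is a stand-in that simply echoes its argument, and escaping the content with `html.escape` is one assumed way to stop it from closing the marker early.

```python
import html
import subprocess
import sys


def run_with_argv(untrusted_arg: str) -> str:
    """Pass the untrusted value as one argv element; no shell ever parses it."""
    result = subprocess.run(
        # Stand-in child process: prints its first argument back unchanged.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted_arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


def wrap_untrusted(content: str) -> str:
    """Frame fetched content; angle brackets are escaped so it cannot forge the marker."""
    safe = html.escape(content, quote=False)
    return f"<untrusted-content>\n{safe}\n</untrusted-content>"
```

With the argv form, shell metacharacters such as `;` or `$(...)` in the input reach the child process as literal text instead of being interpreted.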
_Report generated by skillaudit.dev_