SkillAudit report — lastmile-ai/mcp-agent
Scanned 2026-04-30 by SkillAudit v0.3 (surface-tiered static checks + LLM-assisted prompt-injection red-team).
Commit: f62d849 · Stars: 8288 · Days since last push: 88
LLM prompt-injection probe: skipped — set ANTHROPIC_API_KEY to enable the LLM-assisted prompt-injection red-team
Overall grade: C (70/100)
| Axis | Score | Grade | Status |
|---|---|---|---|
| security | 90/100 | A | ✅ |
| permissions | 100/100 | A | ✅ |
| credentials | 70/100 | C | ⚠️ |
| maintenance | 90/100 | A | ✅ |
| compatibility | 100/100 | A | ✅ |
| docs | 100/100 | A | ✅ |
Security findings
Examples / samples (low-weight) — 2 total, deduct 5/0 per high/warn:
- HIGH
examples/oauth/protected_by_oauth/registration.py:16— HTTP client call with user-controlled argument 'well_known_url' — no URL allowlist / validation found in file
response = requests.get(well_known_url)
- HIGH
examples/oauth/protected_by_oauth/registration.py:40— HTTP client call with user-controlled argument 'registration_endpoint' — no URL allowlist / validation found in file
response = requests.post(
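Both findings above concern a `requests` call whose URL comes from the caller with no validation in the same file. A minimal sketch of the kind of check that would address this, assuming a fixed set of trusted hosts; `ALLOWED_HOSTS` and `validate_url` are hypothetical names for illustration, not part of the repository:

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts your deployment trusts.
ALLOWED_HOSTS = {"auth.example.com"}


def validate_url(url: str) -> str:
    """Return url unchanged if it is HTTPS and its host is allowlisted."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url!r}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not in allowlist: {parsed.hostname!r}")
    return url
```

With such a helper in place, the flagged call could become `requests.get(validate_url(well_known_url))`. Whether a static allowlist suits an OAuth example that is meant to talk to arbitrary authorization servers is a design decision for the maintainers.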
Permissions
_No findings on this axis._
Credentials
Examples / samples (low-weight) — 10 total, deduct 5/0 per high/warn:
- HIGH
examples/basic/mcp_basic_agent/main.py:37— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 20 chars)
- HIGH
examples/basic/mcp_basic_agent/main.py:41— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 23 chars)
- HIGH
examples/basic/mcp_basic_agent/mcp_agent.secrets.yaml.example:10— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 21 chars)
- HIGH
examples/basic/mcp_tool_filter/mcp_agent.secrets.yaml.example:4— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 22 chars)
- HIGH
examples/basic/mcp_tool_filter/mcp_agent.secrets.yaml.example:7— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 25 chars)
Test source (low-weight) — 3 total, deduct 5/0 per high/warn:
- HIGH
tests/cli/fixtures/test_secrets_deploy.sh:6— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 21 chars)
- HIGH
tests/utils/test_config_preload.py:38— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 20 chars)
- HIGH
tests/utils/test_config_preload.py:41— Hardcoded OpenAI / Anthropic-style API key found in source
sk-*** (OpenAI / Anthropic-style API key, 23 chars)
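The report does not say whether the flagged strings are live keys or placeholders; in example and fixture files they are often the latter. Either way, one common way to keep `sk-`-prefixed literals out of source is to read the key from the environment at startup. A minimal sketch, where `load_api_key` is a hypothetical helper and the default variable name `OPENAI_API_KEY` is an assumption about how the examples are configured:

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment instead of a source literal."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running")
    return key
```

For `.example` config files and test fixtures, a placeholder that does not match the `sk-` pattern (for instance `YOUR_API_KEY_HERE`) would likely avoid the finding without changing behavior.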
Maintenance
Production sources:
- WARN
(meta)— 120 open issues — triage backlog
120 open
Compatibility
_No findings on this axis._
Documentation
_No findings on this axis._
Methodology
SkillAudit v0.3 clones the repo at the provided ref (default: default branch, HEAD) into an ephemeral sandbox, runs six static checks over .js/.ts/.py sources, queries the GitHub API for maintenance signals, and runs an LLM-assisted prompt-injection red-team over the MCP tool surface. Each axis is scored against the published rubric — surface tiers, per-(axis, surface) caps, grade buckets, and worked examples are all documented there.
The v0.3 calibration update introduces surface tiering: every finding is tagged with the code path it lives in (production / installer / examples / benchmarks / scripts / test). Production findings deduct at full weight (-30 high, -10 warn); installer findings deduct at half (-15 / -5); examples, benchmarks, top-level scripts, and tests deduct at low weight (-5 / 0). This stops a chatty benchmarks/ or samples/ directory from dominating an otherwise-clean MCP server's grade.
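The tier weights can be checked against the scores in this report. The sketch below is an illustration, not SkillAudit's code: it assumes each axis starts at 100 and floors at 0, and it ignores the per-(axis, surface) caps, whose values are not given here. `DEDUCTIONS` and `axis_score` are hypothetical names.

```python
# Deduction per finding, keyed by surface tier, as (high, warn).
# Values are taken from the methodology paragraph above.
DEDUCTIONS = {
    "production": (30, 10),
    "installer": (15, 5),
    "examples": (5, 0),
    "benchmarks": (5, 0),
    "scripts": (5, 0),
    "test": (5, 0),
}


def axis_score(findings):
    """Score one axis from (surface, severity) pairs, ignoring rubric caps."""
    total = 0
    for surface, severity in findings:
        high, warn = DEDUCTIONS[surface]
        total += high if severity == "high" else warn
    # Assumption: scores start at 100 and never go below 0.
    return max(0, 100 - total)


print(axis_score([("examples", "high")] * 2))  # security axis: prints 90
print(axis_score([("production", "warn")]))    # maintenance axis: prints 90
```

Both results match the table. The credentials axis does not follow from this uncapped sum (13 high findings at 5 points each would leave 35, not 70), which suggests the per-(axis, surface) caps are in effect for that axis.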
The prompt-injection axis extracts each server.tool(...) / @app.tool registration + the first ~60 lines of handler body, hands them to Claude Haiku 4.5 with a red-team system prompt, and asks for structured findings on untrusted-content flow into tool responses. One API call per scan, bounded at ~15K input tokens.
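To make the extraction step concrete, here is a rough sketch of how `@app.tool` handlers could be located in a Python file with the standard `ast` module. This is an assumption about one possible approach, not SkillAudit's implementation; it covers only the Python decorator form, and `extract_tool_handlers` is a hypothetical name.

```python
import ast

MAX_BODY_LINES = 60  # mirrors the "~60 lines" figure in the text


def _is_tool_decorator(node: ast.expr) -> bool:
    """True for decorators spelled `@x.tool` or `@x.tool(...)`."""
    target = node.func if isinstance(node, ast.Call) else node
    return isinstance(target, ast.Attribute) and target.attr == "tool"


def extract_tool_handlers(source: str) -> dict[str, str]:
    """Map each tool handler's name to the first lines of its source."""
    lines = source.splitlines()
    handlers = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if any(_is_tool_decorator(d) for d in node.decorator_list):
            start = node.lineno - 1  # lineno points at the `def` line
            end = min(node.end_lineno, start + MAX_BODY_LINES)
            handlers[node.name] = "\n".join(lines[start:end])
    return handlers
```

The returned snippets would then be concatenated into a single prompt, which is consistent with the one-call-per-scan budget described above.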
How to improve this grade
- Security (static): validate tool-input URLs against an allowlist before fetch/axios calls; use `execFile` with argv arrays instead of `exec` with template strings; never pass untrusted strings to `subprocess` with `shell=True`.
- Security (prompt injection): never return fetched web-page / file / email content verbatim in a tool response. Wrap with a framing marker (e.g., `<untrusted-content>...</untrusted-content>`), summarize rather than inline, and never let untrusted content share a turn with credentials or other tool output.
- Credentials findings: redact env-var reads before log lines and error messages; treat any string that ends up in a tool response as public.
- Maintenance: if the repo is inactive, document the maintenance model — "MCP tool, no breaking changes expected" is a legitimate signal.
- Docs: add a README install + usage section with a copy-pasteable command; add a SECURITY.md with a disclosure channel.
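Two of the recommendations in this list, framing untrusted content and redacting secrets from messages, can be sketched in a few lines of Python. Both helpers are hypothetical illustrations; the escaping strategy and the `[redacted:NAME]` format are assumptions, not something the rubric prescribes.

```python
import html
import os


def wrap_untrusted(content: str) -> str:
    """Frame fetched content so it cannot close the marker early."""
    # Escaping < and > prevents the content from containing a literal
    # </untrusted-content> tag that would end the frame prematurely.
    escaped = html.escape(content, quote=False)
    return f"<untrusted-content>{escaped}</untrusted-content>"


def redact_env_values(message: str, var_names: list[str]) -> str:
    """Replace the values of the named environment variables in message."""
    for name in var_names:
        value = os.environ.get(name)
        if value:
            message = message.replace(value, f"[redacted:{name}]")
    return message
```

A tool handler might call `wrap_untrusted` on fetched page text before returning it, and pass log or error strings through `redact_env_values` with the names of the variables that hold credentials.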
_Report generated by skillaudit.dev_