What 561 Repositories Taught Us About AI Agent Security
We scanned 561 open-source AI agent repositories, drove the false-positive rate to zero, and opened a disclosure pipeline. Here is what we learned — methodology, top patterns, and the raw numbers.
Over the last several months we ran the Inkog scanner against a curated benchmark of 561 open-source AI agent repositories — everything from the household-name frameworks (LangChain, CrewAI, AutoGen, LangGraph, pydantic-ai) to research projects and demo agents. We used the findings to drive the CLI's false-positive rate to zero, build the framework adapter catalog, and kick off a responsible-disclosure pipeline. This post shares what we learned.
It's also the most honest picture we can give of the state of AI agent security in April 2026: not a fear-mongering survey, not a vendor case study, just the raw numbers from a large-scale scan, what they mean, and what you can do about them.
Methodology
We built a benchmark of 561 repositories from the AI agent ecosystem. Selection criteria:
- Framework coverage: every major code-first framework (LangChain, LangGraph, CrewAI, AutoGen/AG2, pydantic-ai, OpenAI Agents, Google ADK, Semantic Kernel, Smolagents, DSPy, Haystack, LlamaIndex) and every major no-code tool (n8n, Flowise, Langflow, Dify, Copilot Studio, Agentforce).
- Diversity: library code, reference agents, production SaaS agents, research projects, and deliberately vulnerable training targets.
- Staleness filter: only repositories with a commit in the last 12 months.
For each repo we ran the Inkog CLI at main with the balanced security policy (vulnerabilities + risk patterns, minus low-noise hardening recommendations). Every finding was reviewed by a human analyst and classified as:
- DISCLOSE — true positive with a concrete exploit path, eligible for responsible disclosure
- DOWNGRADE — real architectural issue but low confidence or low practical impact; reported as best-practice feedback
- SKIP — false positive, test fixture, or duplicate
We then used the SKIP set to drive detector improvements. The V14 fix cycle alone eliminated 356 unsafe_env_access false positives, 25 SQLAlchemy ORM false positives, and 126 missing_oversight false positives. After V14 we re-ran the benchmark and verified a 0% false-positive rate on a 24-repo validation subset (73 true positives, 0 false positives).
The headline number
Of the 561 repositories scanned, we found concrete security issues in 9 repositories that we classified as DISCLOSE. Five of those are currently in the disclosure pipeline at various stages — coordinated with maintainers under a standard 90-day window.
Note: CVE identifiers and acknowledgments will be added to this post as each advisory publishes.
The other 552 repositories? Most had findings — Inkog flagged something in roughly 68% of them — but the overwhelming majority were architectural risk patterns (missing rate limits, missing human oversight, unbounded loops on development paths) rather than exploitable vulnerabilities with a proof of concept. That distinction is the whole reason we built the three-tier risk classification: vulnerability (exploitable), risk pattern (structural), hardening (best practice).
Treating all three as equivalent is how you end up with a scanner nobody runs.
The top 5 patterns we actually found
Across the 9 DISCLOSE repositories, the same handful of patterns kept showing up:
1. eval() or exec() on LLM output
The oldest anti-pattern in the book, but still the single most common high-impact finding. A tool or agent takes the LLM's response and passes it to eval(), exec(), or a shell. An attacker with any prompt-injection vector — direct or indirect — gets code execution.
Inkog detects this via the unvalidated_exec_eval pattern, which walks the data-flow graph from LLM output nodes to code-execution sinks. With V14's sanitization wiring, a validate_sql() or sanitize_python() call upstream now correctly suppresses the finding.
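To make the anti-pattern concrete, here is a minimal sketch. The function names and the AST allow-list mitigation are illustrative, not Inkog's API — the point is that raw LLM text must never reach `eval()` without a structural check:

```python
import ast

def unsafe_tool(llm_response: str):
    # Anti-pattern: LLM output flows straight into eval().
    # Any prompt injection — direct or indirect — becomes code execution.
    return eval(llm_response)  # DO NOT DO THIS

def safe_arithmetic_tool(llm_response: str) -> float:
    # One mitigation: parse the response and allow only literal arithmetic
    # nodes before evaluating. Anything else (names, calls, attributes,
    # imports) is rejected.
    tree = ast.parse(llm_response, mode="eval")
    allowed = (ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
               ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub)
    for node in ast.walk(tree):
        if not isinstance(node, allowed):
            raise ValueError(f"disallowed node: {type(node).__name__}")
    return eval(compile(tree, "<llm>", "eval"))
```

The allow-list approach is one option among several; a sandboxed interpreter or a structured tool schema (where the LLM never emits code at all) are stronger choices for anything beyond arithmetic.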
2. SQL injection via agent-to-agent delegation
Multi-agent systems build SQL queries from LLM-generated intent. One agent plans the query, another executes it, and neither validates the result. When the planning agent's output is influenced by untrusted input, the executing agent runs tainted SQL.
This is different from SQLAlchemy ORM patterns that look similar to regex scanners but are actually safe — ORM query builders parameterize by construction. V14's callTextHasSQLAlchemySession() check eliminated the 25 false positives this previously caused.
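A hedged sketch of the safe shape for the executing agent, using `sqlite3` as a stand-in for the real database layer. The structured `intent` dict, the table allow-list, and the schema are all invented for illustration — the planner hands over structured intent, never SQL text:

```python
import sqlite3

# Identifiers (table names) cannot be parameterized, so they must be
# validated against an allow-list. Values go through placeholders.
ALLOWED_TABLES = {"orders", "customers"}

def execute_intent(conn: sqlite3.Connection, intent: dict) -> list:
    # `intent` comes from the planning agent and is attacker-influenced.
    table = intent["table"]
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {table}")
    # The value is bound via a placeholder, never interpolated.
    return conn.execute(
        f"SELECT id FROM {table} WHERE customer = ?",
        (intent["customer"],),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'alice')")
rows = execute_intent(conn, {"table": "orders", "customer": "alice"})
```

The key design choice is that validation lives in the executor: it cannot trust the planner, because the planner cannot trust its own inputs.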
3. Missing human oversight on dangerous tools
EU AI Act Article 14 requires human oversight for agent actions with real-world effects. We found production-shipping multi-agent systems where the "send email," "transfer funds," and "commit code" tools had no approval callback — the LLM decided, and the tool executed.
Inkog's missing_human_oversight pattern uses EffectCategory (Financial / Destructive / Communication / DataMutation) inferred from tool descriptions to decide which tools need oversight. Logging tools, cache writes, and serialization are excluded — those aren't the Article 14 concern.
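The missing piece in the repositories we flagged is an approval callback between the LLM's decision and the tool's execution. A minimal sketch of that gate — the `Tool` dataclass, the effect-category strings, and the callback shape are hypothetical, not Inkog's or any framework's API:

```python
from dataclasses import dataclass
from typing import Callable

# Effect categories that require a human in the loop; logging- and
# cache-style tools run without interruption.
DANGEROUS = {"financial", "destructive", "communication", "data_mutation"}

@dataclass
class Tool:
    name: str
    effect: str                     # e.g. "financial", "logging"
    run: Callable[[str], str]

def execute_with_oversight(tool: Tool, arg: str,
                           approve: Callable[[str], bool]) -> str:
    # Tools with real-world effects go through the approval callback
    # before executing; everything else runs directly.
    if tool.effect in DANGEROUS and not approve(f"{tool.name}({arg})"):
        return "denied by human reviewer"
    return tool.run(arg)
```

In production the `approve` callback would route to a Slack message, a queue, or an interrupt point rather than a synchronous function, but the invariant is the same: the LLM proposes, a human disposes.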
4. Hardcoded LLM API keys
Unsurprising but ubiquitous. We found Anthropic, OpenAI, Gemini, Groq, and HuggingFace keys embedded in source code across the benchmark. The CLI v1.2.0 release added detection for 11 modern AI provider key formats specifically because of the gap we measured: before v1.2.0, Inkog detected AWS/GitHub/Slack/Stripe keys but missed the keys that AI agent developers actually leak.
The fix is trivial — move the keys to environment variables and add a .env.example — but the finding matters because leaked AI provider keys get scraped and pressed into Discord-bot abuse within hours of hitting a public repo.
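The fixed shape is a few lines. The variable name below follows Anthropic's documented convention; the helper itself is illustrative:

```python
import os

def get_api_key() -> str:
    # The key comes from the environment, never from source code.
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; see .env.example")
    return key
```

Pair it with a committed `.env.example` that lists the variable names with empty values, and a `.gitignore` entry for the real `.env`.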
5. Token bombing via unbounded loops
The agent retries on failure. The retry has no cap. Each failed attempt appends to the context, so the next attempt costs more tokens than the last. Your OpenAI bill buys someone a car.
This is Inkog's headline detection pattern — token_bombing — and it was the highest-volume true-positive category across the benchmark. Most repositories caught this in code review eventually, but the median time from introduction to fix was three weeks. In the dev-flow loop with the Inkog MCP server, that drops to three minutes.
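The fix is to bound both dimensions of the loop: the retry count and the cumulative token spend. A minimal sketch — the `call_llm` signature (returning text plus a token count) and the success criterion are assumptions for illustration:

```python
from typing import Callable, Tuple

def run_with_budget(call_llm: Callable[[str], Tuple[str, int]],
                    prompt: str,
                    max_retries: int = 3,
                    token_budget: int = 10_000) -> str:
    # Cap both the retry count and the cumulative token spend, so a
    # compounding failure loop cannot turn into an unbounded bill.
    spent = 0
    for attempt in range(max_retries):
        text, tokens = call_llm(prompt)
        spent += tokens
        if spent > token_budget:
            raise RuntimeError(f"token budget exceeded: {spent}")
        if text:  # illustrative success check; use your real criterion
            return text
    raise RuntimeError(f"gave up after {max_retries} attempts")
```

Provider-side spend limits are a second line of defense, but an in-process budget fails fast enough to preserve the rest of the run.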
Framework breakdown (anonymized)
We can't publish per-repository findings until disclosures complete, but we can share aggregate framework distributions. Every finding below is DISCLOSE-grade (concrete exploit path):
- CrewAI ecosystem: 3 findings across the benchmark. See our CrewAI multi-agent authentication post for the architectural pattern.
- LangGraph / LangChain ecosystem: 2 findings. Both on the delegation/interrupt path.
- Multi-agent research frameworks (CAMEL, MetaGPT, SuperAGI, Agent-S): 3 findings in flight. Advisories pending.
- MCP/tool ecosystem: 1 finding on a popular MCP server. Advisory pending.
The shape of the distribution matches what you'd expect: the more autonomous the agent, the more ways to hurt yourself. No framework is "safe" and no framework is "broken" — they all have sharp edges, and they all deserve better defaults.
Why most scanners find nothing
When we started, we ran SonarQube, Semgrep, and Snyk against the same benchmark as a sanity check. The results were striking: traditional scanners found zero AI-specific vulnerabilities across the 9 repositories we'd classified as DISCLOSE.
It's not that they're bad scanners. It's that they don't understand what an agent is. There's no Semgrep rule for "this LLM response reaches subprocess.run without validation" because Semgrep doesn't know what an LLM response is. There's no Snyk check for "this agent delegates to a sibling without signing the message" because Snyk doesn't model agent topology.
This is the Universal IR argument in a nutshell: AI agent security isn't a plugin for existing scanners — it needs its own representation of the code. Inkog's IR has 18 node types specifically for agent behavior: LoopNode, LLMCallNode, ToolCallNode, DelegationNode, HumanApprovalNode, AuthorizationCheckNode, and so on. Detection rules query those nodes. The framework adapters are the only piece that knows about LangChain or CrewAI — once the code hits the IR, the rules are framework-agnostic.
That's also why V14 could eliminate 356 unsafe-env-access false positives with a single new handler: the rule operates on UserInputNode nodes coming out of os.getenv() calls, and the new IsStandardConfigEnvVarHandler classifies the variable name. No per-framework special case. No regex fragility.
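To make the framework-agnostic idea concrete, here is a toy sketch of a rule querying an IR graph. The node kind strings mirror the names in this post, but the graph structure, the `reaches` walker, and the rule function are invented for illustration — Inkog's actual IR is richer than this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class IRNode:
    kind: str                                  # e.g. "LLMCallNode", "ToolCallNode"
    successors: List["IRNode"] = field(default_factory=list)

def reaches(node: IRNode, target_kind: str, seen=None) -> bool:
    # Walk data-flow edges; by this point all framework-specific
    # detail has been lowered away by the adapter.
    seen = seen if seen is not None else set()
    if id(node) in seen:
        return False
    seen.add(id(node))
    if node.kind == target_kind:
        return True
    return any(reaches(s, target_kind, seen) for s in node.successors)

def unvalidated_exec_eval(llm_node: IRNode) -> bool:
    # Rule: flag any LLM output that reaches a code-execution sink.
    # (A real rule would also check for sanitizer nodes on the path.)
    return reaches(llm_node, "CodeExecutionNode")
```

The payoff is the one described above: a rule written once against node kinds applies to every framework an adapter can lower into the IR.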
What this means for your project
If you ship AI agents — as a product, a library, or a research tool — here's what the data says you should do:
- Scan your repository. The free tier runs 5 scans/month. Start with the `balanced` policy. If you find nothing, try `comprehensive`. If you find something, fix it before your next release.
- Connect the MCP server to Claude Code or Cursor. Catch findings while you're writing the code, not three weeks later in a PR review. We covered the dev-flow loop in Building Secure AI Agents with Claude Code and the Inkog MCP.
- Add the GitHub Action to your CI. `diff: true` mode only fails PRs on new findings, so you can roll out cleanly even on a codebase that has existing risk patterns.
- Write an AGENTS.md. Then run `inkog_verify_governance` to check it against the code. Governance that doesn't match reality is governance theater.
- If you're shipping a multi-agent system, run `inkog_audit_a2a`. Delegation loops and unsigned handoffs are the 2026 equivalent of the 2018 SSRF era of cloud security — everywhere, under-reported, and overdue for a reckoning.
What's next
We're continuing the disclosure campaign. As each advisory publishes, we'll update this post with the CVE identifier, a link to the advisory, and (where the maintainer approves) a link to the fix commit. Expect updates in the coming weeks.
We're also expanding the benchmark. The 561-repo run covered the ecosystem as of early 2026, but the ecosystem is growing faster than the benchmark. If you maintain an open-source AI agent framework or reference implementation and want it included in the next run, open an issue — we add repos on a rolling basis.
Finally, if you're building agents and none of the patterns above felt familiar, that's the best possible outcome and you should keep doing what you're doing. Most teams we talked to during this research had no process for catching any of these issues. The goal of Inkog — and of this post — is to make that the exception, not the rule.
Get started:
- Install the CLI or the MCP server — free tier, no credit card
- Read the dev-flow walkthrough — building secure agents with Claude Code
- Join our Slack — questions, findings from your own scans, disclosure coordination