AI Agent Audit
Audit every layer of your AI agent.
Guardra's agent audit covers the full attack surface: prompts, memory, tools, and outputs. Send us a trace, an SDK call, or a batch of logs — we return production-ready findings and fixes mapped to the OWASP LLM Top 10.
Prompt layer
- Direct prompt injection (instruction override, role redefinition)
- Indirect injection through retrieved documents and tool outputs
- Jailbreak & roleplay guardrail bypass
- Prompt leakage (system prompt exfiltration)
- PII and secret leakage inside prompts
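Guardra's prompt-layer detectors are proprietary, but the direct-injection class above can be illustrated with a minimal heuristic sketch. Everything here (the pattern list, the `flag_prompt` helper) is a hypothetical example, not Guardra's actual detector set:

```python
import re

# Hypothetical patterns for common direct-injection phrasings:
# instruction override, role redefinition, and system-prompt exfiltration.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",  # instruction override
    r"you are now",                                  # role redefinition
    r"reveal (your )?system prompt",                 # prompt leakage attempt
]

def flag_prompt(text: str) -> list[str]:
    """Return every pattern the prompt text matches (case-insensitive)."""
    return [p for p in INJECTION_PATTERNS if re.search(p, text, re.IGNORECASE)]
```

Real detectors go far beyond regexes (paraphrase, encoding tricks, multi-turn setups), which is why this class is paired with an LLM-as-judge pass in production systems.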
Memory & RAG
- Cross-session memory bleed
- Poisoned retrieval documents
- Unbounded memory growth / cost attacks
- Vector-store injection
- Long-term PII persistence
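One common mitigation for poisoned retrieval and vector-store injection is to quarantine instruction-like content before it ever reaches the prompt. The sketch below is a hypothetical pre-filter (the marker list and `filter_retrieved` name are assumptions for illustration):

```python
# Hypothetical markers of instruction-like text inside retrieved documents;
# a document telling the model what to do is a red flag for indirect injection.
SUSPECT_MARKERS = ("ignore previous", "system prompt", "you must now")

def filter_retrieved(chunks: list[str]) -> tuple[list[str], list[str]]:
    """Split retrieved chunks into (clean, quarantined) before prompt assembly."""
    clean, quarantined = [], []
    for chunk in chunks:
        if any(marker in chunk.lower() for marker in SUSPECT_MARKERS):
            quarantined.append(chunk)
        else:
            clean.append(chunk)
    return clean, quarantined
```

Quarantined chunks can then be logged and audited rather than silently dropped, preserving a trail for the poisoned-document finding.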
Tools & actions
- Over-privileged tool scopes
- Confused-deputy patterns across chained tools
- Fan-out / broadcast misuse
- Destructive actions without human-in-the-loop
- Shadow tools registered at runtime
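The human-in-the-loop finding above usually resolves to a dispatch-layer guard. A minimal sketch, assuming a plain dict-based tool registry and a caller-supplied `approve` callback (both hypothetical, not Guardra's API):

```python
# Hypothetical set of tool names considered destructive in this example.
DESTRUCTIVE_TOOLS = {"delete_records", "send_payment"}

def run_tool(name: str, args: dict, registry: dict, approve) -> object:
    """Dispatch one tool call; destructive tools require human approval.

    `approve(name, args)` should block until a human accepts or rejects.
    """
    if name in DESTRUCTIVE_TOOLS and not approve(name, args):
        raise PermissionError(f"'{name}' requires human sign-off")
    return registry[name](**args)
```

Keeping the destructive-tool list in policy (not in each agent) also addresses the shadow-tool case: a tool registered at runtime is denied by default until it appears in the reviewed registry.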
Outputs
- Hallucinated APIs, endpoints, or package names
- Markdown / hyperlink injection
- Unsafe code suggestions
- Data exfiltration via images / URLs
- Policy-violating content
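Exfiltration via images and URLs typically rides on markdown rendering: the model emits an image or link whose URL carries stolen data as query parameters. A hedged sketch of a host-allow-list scrubber (the allow-list and `scrub_output` helper are assumptions for illustration):

```python
import re

# Assumption: hosts you trust for rendered links/images in agent output.
ALLOWED_HOSTS = ("docs.example.com",)

def scrub_output(text: str) -> str:
    """Strip markdown links/images to untrusted hosts, keeping the link text."""
    def replace(match: re.Match) -> str:
        url = match.group(2)
        if any(url.startswith(f"https://{host}") for host in ALLOWED_HOSTS):
            return match.group(0)  # trusted host: leave the link intact
        return match.group(1)      # untrusted: keep text, drop the URL
    return re.sub(r"!?\[([^\]]*)\]\(([^)]+)\)", replace, text)
```

An image to an attacker host is the worst case, since most renderers fetch it automatically with no user click.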
How an agent audit works
1. Install the SDK or send traces via REST — zero changes to your agent logic.
2. Guardra replays each span through 12,000+ deterministic detectors and an LLM-as-judge.
3. Findings are ranked by real-world exploitability and mapped to OWASP LLM / MITRE ATLAS.
4. Every finding ships with a policy-level fix and a regression test.
5. Continuous monitoring kicks in — new runs are audited automatically.
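Step 1 of the flow above can be sketched as a plain REST call. The endpoint URL, payload shape, and header names below are placeholders, not Guardra's documented API; consult the actual API reference before integrating:

```python
import json
import urllib.request

GUARDRA_TRACES_URL = "https://api.guardra.example/v1/traces"  # placeholder URL

def build_trace_request(trace: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that ships one trace."""
    return urllib.request.Request(
        GUARDRA_TRACES_URL,
        data=json.dumps(trace).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",   # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending is then a one-liner:
# urllib.request.urlopen(build_trace_request({"spans": [...]}, api_key))
```

Because the audit replays recorded spans, nothing in the agent's control flow changes: the trace is a read-only copy of what already happened.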