MITRE ATLAS for practitioners: what to instrument first
MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is the AI-focused counterpart to ATT&CK: a knowledge base of adversary tactics and techniques used against AI systems. It's comprehensive (14 tactics, 80+ techniques) and consequently overwhelming for a team trying to prioritize its first quarter of AI security work. After implementing ATLAS-aligned detections across 900+ customer deployments, we have seen a clear priority order emerge.
Start with Reconnaissance (AML.TA0002). Attackers probe your agent before they attack it. Rate-limit the input surface. Log unusual query patterns — repeated variations, encoded payloads, multi-turn setup attempts. Most serious attacks have a reconnaissance signature 1–3 hours before the payload lands.
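As a rough illustration of the two controls above, here is a minimal Python sketch that combines a per-client sliding-window rate limit with a flag for repeated near-identical queries. The class name `ReconMonitor`, its parameters, and the thresholds are assumptions made for this example, not part of ATLAS or any particular product, and string similarity is only one possible signal for "repeated variations".

```python
import time
from collections import defaultdict, deque
from difflib import SequenceMatcher


class ReconMonitor:
    """Hypothetical per-client rate limit plus a near-duplicate query flag."""

    def __init__(self, max_requests=20, window_seconds=60.0,
                 similarity_threshold=0.85, variation_alert=3):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.similarity_threshold = similarity_threshold
        self.variation_alert = variation_alert
        # client_id -> deque of (timestamp, query), oldest first
        self._history = defaultdict(deque)

    def check(self, client_id, query, now=None):
        """Record the query and return {'allowed': bool, 'flags': [str, ...]}."""
        now = time.monotonic() if now is None else now
        history = self._history[client_id]

        # Drop entries that have aged out of the sliding window.
        while history and now - history[0][0] > self.window_seconds:
            history.popleft()

        flags = []
        # Count earlier queries that are close to, but not identical to, this one.
        variations = sum(
            1 for _, past in history
            if past != query
            and SequenceMatcher(None, past, query).ratio() >= self.similarity_threshold
        )
        if variations >= self.variation_alert:
            flags.append("repeated_variations")

        allowed = len(history) < self.max_requests
        if not allowed:
            flags.append("rate_limited")

        # Rejected requests are recorded too, so persistent probing stays visible.
        history.append((now, query))
        return {"allowed": allowed, "flags": flags}
```

The flags are meant to be logged and reviewed rather than used as a hard block: near-duplicate queries also come from ordinary users retrying a question, so the thresholds would need tuning against real traffic.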
Next: Initial Access (AML.TA0004). For agents, this is overwhelmingly prompt injection — direct via user input, indirect via retrieved content. Instrument both. Do not trust 'our RAG index only contains our content' — attackers plant content in your SEO surface or vendor documentation specifically to get it retrieved later.
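One way to instrument both channels is to run the same scan over user input and over every retrieved chunk, and to record which channel and source produced each hit. The sketch below assumes a handful of illustrative regex patterns and hypothetical names (`scan_request`, `Finding`); a pattern list this short will miss most real injections and is only meant to show where the hooks go.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; a real deployment needs a broader, maintained set
# or a trained classifier.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"you are now\b", re.I),
    re.compile(r"system prompt", re.I),
    re.compile(r"<\s*/?\s*(system|assistant)\s*>", re.I),
]


@dataclass
class Finding:
    channel: str  # "direct" (user input) or "indirect" (retrieved content)
    source: str   # "user" or the identifier of the retrieved document
    pattern: str  # the regex that matched


def scan_text(text, channel, source):
    """Return one Finding per pattern that matches the text."""
    return [
        Finding(channel, source, p.pattern)
        for p in INJECTION_PATTERNS
        if p.search(text)
    ]


def scan_request(user_input, retrieved_docs):
    """Scan user input and retrieved documents given as (source_id, text) pairs."""
    findings = scan_text(user_input, "direct", "user")
    for source_id, text in retrieved_docs:
        findings.extend(scan_text(text, "indirect", source_id))
    return findings
```

Keeping the source identifier on each finding matters for the indirect case: it tells you which page or vendor document carried the planted content, so it can be removed from the index.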
Third: Execution (AML.TA0005). This is where tool-call policy matters. Every function your agent can invoke is an execution primitive. Treat tools like shell commands — default-deny, explicit allow, argument validation, rate limits.
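The four controls named above can be sketched as a small policy gate that sits between the model's proposed tool call and the actual invocation. Everything here is hypothetical naming for illustration (`ToolPolicy`, `ToolPolicyError`, the `lookup_order` validator); it is a sketch of the pattern, not the API of any agent framework.

```python
import time
from collections import defaultdict, deque


class ToolPolicyError(Exception):
    """Raised when a tool call is rejected by policy."""


class ToolPolicy:
    """Hypothetical default-deny gate for agent tool calls."""

    def __init__(self):
        self._rules = {}                  # tool -> (validator, max_calls, window_seconds)
        self._calls = defaultdict(deque)  # tool -> timestamps of authorized calls

    def allow(self, tool, validator, max_calls, window_seconds):
        """Explicitly allow a tool, with an argument validator and a rate limit."""
        self._rules[tool] = (validator, max_calls, window_seconds)

    def authorize(self, tool, args, now=None):
        """Raise ToolPolicyError unless the call passes every check."""
        now = time.monotonic() if now is None else now
        rule = self._rules.get(tool)
        if rule is None:
            # Default-deny: anything not explicitly allowed is rejected.
            raise ToolPolicyError(f"tool not allowed: {tool}")
        validator, max_calls, window_seconds = rule
        if not validator(args):
            raise ToolPolicyError(f"invalid arguments for {tool}")
        calls = self._calls[tool]
        while calls and now - calls[0] > window_seconds:
            calls.popleft()
        if len(calls) >= max_calls:
            raise ToolPolicyError(f"rate limit exceeded for {tool}")
        calls.append(now)


def valid_lookup_args(args):
    """Example validator: exactly one key, a short all-digit order id."""
    order_id = args.get("order_id")
    return (
        set(args) == {"order_id"}
        and isinstance(order_id, str)
        and order_id.isdigit()
        and len(order_id) <= 12
    )
```

A caller would register each tool once, for example `policy.allow("lookup_order", valid_lookup_args, max_calls=5, window_seconds=60)`, and call `policy.authorize(name, args)` before every invocation. Validators are deliberately strict allow-lists on argument shape, in the same spirit as refusing to pass unsanitized strings to a shell.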
Persistence (AML.TA0006) is the one most teams miss. Long-term memory and RAG indexes are persistence mechanisms for an attacker as much as for the agent: a payload planted today can activate next week. Scan memory writes with the same rigor as input prompts.
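A minimal sketch of scanning at write time, assuming a hypothetical in-process store called `GuardedMemory`: every write is checked before it is persisted, suspicious records are quarantined instead of stored, and each record keeps its source so a planted payload can be traced back later. The regex is illustrative and far from complete.

```python
import re
import time
from dataclasses import dataclass, field

# Illustrative phrases that suggest an instruction meant to fire later.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions"
    r"|from now on"
    r"|in (all )?future (conversations|sessions)",
    re.I,
)


@dataclass
class MemoryRecord:
    text: str
    source: str        # where the content came from, e.g. a chat or URL id
    written_at: float  # timestamp, useful when a payload activates days later


@dataclass
class GuardedMemory:
    """Hypothetical memory store that scans every write before persisting it."""
    records: list = field(default_factory=list)
    quarantine: list = field(default_factory=list)

    def write(self, text, source, now=None):
        """Persist the text if it passes the scan; return True if stored."""
        record = MemoryRecord(text, source, time.time() if now is None else now)
        if SUSPICIOUS.search(text):
            # Keep the record for review, but never feed it back to the agent.
            self.quarantine.append(record)
            return False
        self.records.append(record)
        return True

    def recall(self):
        """Return only the texts that passed the write-time scan."""
        return [r.text for r in self.records]
```

Scanning at write time rather than only at read time means a payload is caught once, with its provenance attached, instead of on every retrieval.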
Exfiltration (AML.TA0010) is where the money is. Data leaves via outputs, tool calls, error messages, or side channels (image URLs, markdown links). Tag every sensitive value in memory or retrieved context and alarm when it appears in outputs.
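The tagging idea can be sketched as a registry of sensitive values that is checked against every outbound string, including a few common encodings, since values smuggled into image URLs or markdown links are often percent-encoded or base64-encoded. `SensitiveValueRegistry` and its methods are hypothetical names for this example. Substring matching on encoded forms only catches a value that was encoded on its own; a value embedded in a larger encoded blob would slip past this check.

```python
import base64
from urllib.parse import quote


class SensitiveValueRegistry:
    """Hypothetical registry of tagged sensitive values to look for in outputs."""

    def __init__(self):
        self._needles = {}  # string to search for -> tag label

    def tag(self, label, value):
        """Register a value plus a few encodings commonly seen in URLs."""
        raw = value.encode("utf-8")
        variants = {
            value,
            quote(value, safe=""),                   # percent-encoded
            base64.b64encode(raw).decode(),          # standard base64
            base64.urlsafe_b64encode(raw).decode(),  # URL-safe base64
        }
        for variant in variants:
            self._needles[variant] = label

    def leaks_in(self, output):
        """Return the sorted labels of tagged values found in the output."""
        return sorted(
            {label for needle, label in self._needles.items() if needle in output}
        )
```

The same check applies to every exit path listed above: model responses, tool-call arguments, and error messages, not only the final chat output. Very short tagged values will produce false positives, so tagging works best on high-entropy values such as keys, tokens, and account numbers.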
Impact (AML.TA0011) is where your detection story becomes a business story: financial fraud, reputation damage, safety violations. Instrumenting the five tactics above is what makes Impact detectable in minutes instead of months.