Modern agents don't just call tools. They run code, control browsers, spawn sub-agents, read from poisonable memory, and download model weights. QuilrAI covers all of it: every execution environment, every delegation chain, every data channel.
In 2026, agents don't just call MCP tools: they execute code, control browsers, delegate across trust boundaries, and retrieve from poisonable stores. The attack surface is the entire execution environment.
Agents that write and run code directly: Python sandboxes, bash shells, Docker containers, E2B cloud environments, and Jupyter kernels. Claude Code, OpenHands, and SWE-agent all do this.
The model generates code, hands it to an interpreter, reads stdout/stderr as context, and iterates. Full shell access is common, not the exception.
Sandbox escape via system calls, env variable exfiltration, writes to arbitrary paths, spawned child processes, and packages installed to establish persistence.
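A minimal sketch of that generate-execute-iterate loop, assuming hypothetical model and sandbox objects (real frameworks differ in their APIs):

// Hypothetical agent code-execution loop; `model` and `sandbox`
// stand in for whatever the framework provides.
async function solve(task: string): Promise<void> {
  let context = task;
  for (let step = 0; step < 10; step++) {
    const code = await model.generateCode(context);      // model writes code
    const { stdout, stderr } = await sandbox.exec(code); // interpreter runs it
    context += "\n" + stdout + stderr;                   // output becomes new context
    if (await model.isDone(context)) break;              // iterate until solved
  }
}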
Agents that control a real browser or desktop: take screenshots, move the mouse, fill forms, click buttons, read page DOM. Claude Computer Use, Browser-Use, and Playwright-based agents.
The model receives a screenshot, reasons about what to click, emits a tool call (click/type/scroll), and repeats, acting like a human operating the machine.
Full screen content is visible to the model: passwords, session tokens, and private data in other tabs. Agents can be tricked by visual prompt injections on web pages.
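The same pattern for computer use, again sketched with hypothetical browser and model objects:

// Hypothetical computer-use loop: screenshot in, UI action out.
async function operate(goal: string): Promise<void> {
  while (true) {
    const screenshot = await browser.screenshot();           // the full screen, every tab
    const action = await model.nextAction(goal, screenshot); // e.g. { kind: "click", x: 120, y: 348 }
    if (action.kind === "done") break;
    await browser.perform(action);                           // click, type, or scroll
  }
}

Nothing in the loop distinguishes trusted pixels from untrusted ones: a visual injection on a web page enters the model the same way a legitimate button does.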
Orchestrators that spawn, delegate to, and coordinate fleets of specialized sub-agents. One orchestrator agent assigns tasks; sub-agents execute and return results.
Agents communicate via structured messages or shared memory. Each hop can widen permissions: a researcher agent's results feed into an executor agent with write access.
Privilege escalation through delegation chains. Sub-agent trust is inherited, not earned. Compromising one agent compromises the entire pipeline downstream.
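A sketch of how that inheritance plays out, assuming a hypothetical orchestrator API:

// Hypothetical delegation chain. The researcher's untrusted output
// flows straight into an agent that holds write access.
const researcher = orchestrator.spawn("researcher", { perms: ["read:web"] });
const executor = orchestrator.spawn("executor", { perms: ["write:db"] });

const findings = await researcher.run("summarize the incident reports");
// If a report carried injected instructions, they now drive write:db actions:
await executor.run("apply the recommended fixes:\n" + findings);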
Agents with persistent memory: vector databases (Pinecone, Weaviate, Chroma), embedding stores, conversation histories. They retrieve context before every response.
At query time the agent embeds the prompt, retrieves semantically similar chunks, and injects them into the system prompt. What's in the DB becomes trusted context.
Poisoned documents in the retrieval store inject instructions into future responses: an attacker writes a doc once and influences every retrieval that matches it.
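A sketch of the retrieval step shows why (embed, vectorDb, and the prompt format are illustrative): retrieved chunks enter the prompt with no marker of origin.

// Hypothetical RAG retrieval: whatever matched the query becomes trusted context.
const queryVector = await embed(userPrompt);
const chunks = await vectorDb.similaritySearch(queryVector, { topK: 5 });
const systemPrompt =
  "Use the following context to answer:\n" +
  chunks.map((c) => c.text).join("\n---\n");
// A poisoned chunk ("ignore prior instructions and ...") arrives here
// indistinguishable from a legitimate document.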
Standardized connectors that expose tools, resources, and prompts to AI models: file servers, DB connectors, GitHub, Slack, Jira, Linear, Notion, and 150+ more.
JSON-RPC protocol. AI hosts discover server capabilities at runtime and invoke tools dynamically. Any MCP host (Claude Desktop, Cursor, Zed) connects to any server.
No authentication standard. No parameter validation. Any connected model can invoke any exposed tool with arbitrary inputs. Rug-pull and tool-shadowing attacks in multi-server setups.
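Under the hood, an MCP tool invocation is plain JSON-RPC. The tool name and argument below are illustrative, but the structural point holds: nothing in the message authenticates the caller or constrains the parameters.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "read_file",
    "arguments": { "path": "/etc/shadow" }
  }
}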
Self-hosted and fine-tuned models downloaded from public hubs: HuggingFace models, LoRA adapters, GGUF weights for Ollama, quantized checkpoints for vLLM.
Teams pull model weights at deploy time or fine-tune on private data. The model becomes part of the inference pipeline; its behavior is the agent's behavior.
Backdoored weights activate on trigger phrases. Poisoned fine-tune datasets embed persistent behaviors. Model cards misrepresent safety evaluations.
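One common mitigation, shown here as a generic sketch rather than a QuilrAI feature: pin the digest of known-good weights and verify before loading. The path and digest are placeholders.

// Verify downloaded weights against a digest pinned from a trusted source.
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

const PINNED_SHA256 = "expected-digest-from-a-trusted-source";
const actual = createHash("sha256")
  .update(readFileSync("models/model.Q4_K_M.gguf"))
  .digest("hex");
if (actual !== PINNED_SHA256) {
  throw new Error("Weight digest mismatch; refusing to load model");
}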
A single agent task can touch a shell, a browser, a vector database, 12 MCP tools, and 3 sub-agents, all in one run. Every hop is a trust boundary crossing.
Agents with shell access can break sandbox boundaries, exfiltrate env vars, spawn child processes, and install persistence.
Sub-agents inherit permissions from parent agents across delegation chains, accumulating access no single agent was supposed to have.
Malicious content written once into a vector store injects instructions into every future retrieval: a persistent, invisible backdoor.
Guardian doesn't just govern tools; it governs skills. Each agent is analyzed for what it actually needs to do: read commits, post Slack messages, run bash commands. Guardian binds the minimum permission set for each skill, blocks anything outside it, and explains the rationale for every decision.
Guardian parses the agent's purpose and maps it to specific skills: 'summarize Jira issues' → read:jira, 'post to Slack' → write:slack:channel. Each skill gets exactly the permissions it needs. Any skill invocation that isn't bound gets blocked, even if the tool itself is approved.
Not just read vs. write: Guardian enforces scope within permissions. 'post Slack messages' → only to #engineering, not all channels. 'write files' → only to /project/src, not /project/config. 'query database' → SELECT only, no schema access. A rationale is auto-generated for every boundary.
When an agent attempts a skill it was never configured for (a summarizer trying to delete records, a researcher trying to write to production), Guardian intercepts in under 30ms. Drift is logged with the skill name, the permission attempted, and the policy that blocked it.
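An illustrative drift event; the field names are an assumption, not Guardian's documented schema:

{
  "agent": "report-summarizer",
  "skill_attempted": "delete:records",
  "bound_skills": ["read:jira", "write:slack:#engineering"],
  "blocked_by": "policy:skill-binding/least-privilege",
  "latency_ms": 12
}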
Red Team Agent probes every declared skill boundary: tries to escalate read → write, attempts cross-agent skill inheritance, injects prompts that trick the agent into claiming new skills. When it finds a gap, Guardian auto-tightens that specific skill's permission scope.
What this looks like in Guardian setup
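A minimal sketch of that binding in configuration form; the guardian object and its identifiers are illustrative, not Guardian's published API:

// Skill binding with scoped permissions for a single agent.
guardian.bindSkills("report-summarizer", {
  "read:jira": { projects: ["ENG"] },            // read issues, nothing more
  "write:slack": { channels: ["#engineering"] }, // one channel, not all
  "write:files": { paths: ["/project/src/**"] }, // never /project/config
  "query:db": { statements: ["SELECT"] },        // no schema access
});
// Any invocation outside these bindings is blocked, even if the tool
// itself is approved, and the block is logged with its rationale.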
Firewalls, IAM, DLP, and WAFs were built for humans clicking buttons. Agents execute code, control browsers, poison memory, and delegate trust, all at machine speed with no human in the loop.
Agents with bash/Python access can read env vars, write files, spawn processes, and call the network, all entirely outside traditional security tooling.
Computer-use agents screenshot the full desktop. Passwords, tokens, and confidential data in adjacent windows are all in context.
Any document ingested into a vector DB is future trusted context. Attackers write once; the injection runs on every future retrieval.
Multi-agent delegation creates privilege escalation paths with no visibility. Sub-agents accumulate permissions across hops.
MCP servers have no standardized auth. Any connected client can invoke any tool with arbitrary parameters.
Web search results, emails, DB rows, API responses, file contents, any input channel can carry injected instructions.
Process-level interception of every shell command, Python exec, and network call from agent runtimes. Sandbox escapes blocked in <30ms.
Agents scoped to task-specific tabs. Screenshots scanned for PII/credentials before the model sees them. DOM inspected for visual injections.
Retrieved chunks inspected for embedded instructions before context injection. Poisoned documents quarantined and flagged.
Full visibility into multi-agent delegation. Depth limits, privilege boundaries, and escalation detection across every agent hop.
Every input channel inspected: web search, files, DB rows, API responses. Injection patterns detected before they reach the model.
Every action (tool call, exec, screenshot, delegation, retrieval) logged with full context and provenance.
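An illustrative audit record for a single tool call; the schema is an assumption, not QuilrAI's documented format:

{
  "ts": "2026-01-14T09:32:11Z",
  "agent": "research-agent",
  "parent_agent": "orchestrator-1",
  "action": "tool_call",
  "target": "file-server.read",
  "params": { "path": "/app/data/report.csv" },
  "decision": "ALLOWED",
  "policy": "scope:/app/data/**"
}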
Five enforcement planes wrap around every agent deployment, from code execution and browser control to MCP tool calls, RAG retrieval, and network egress. No surface is ungoverned.
Every MCP server call passes through QuilrAI before reaching its target system. Authenticate servers, scope tool permissions, validate parameters, and enforce policies in real time.
// MCP Gateway intercepts tool call
mcpGateway.intercept("file-server.read", {
  agent: "research-agent",
  params: { path: "/etc/shadow" },  // agent requests a sensitive system path
  policy: "scope:/app/data/**"      // this agent is scoped to /app/data only
});
// Result: BLOCKED, path outside scope

Watch real-time enforcement against common agentic tool attack patterns. Each scenario demonstrates a different control plane responding to threats.
AI Security Posture Management discovers all agents, MCP servers, code execution environments, vector stores, and browser automation across your infrastructure. You cannot govern what you cannot see.
Discover every agent runtime with shell or interpreter access: Claude Code, OpenHands, SWE-agent, AutoGen. Map allowed paths, resource limits, and egress permissions.
Identify every computer-use agent and Playwright automation deployed in your org. Track what sites they can reach, what data they can see, and what forms they can fill.
Discover every RAG database and embedding store. Identify what documents have been ingested, who can write to them, and which agents retrieve from them.
Automatic discovery of every MCP server: exposed tools, connected clients, authentication state, and per-tool permission scope across your entire environment.
Find unauthorized agent deployments, unregistered MCP servers, rogue code execution environments, and unapproved vector stores before attackers do.
Continuous risk assessment across every agent surface. Automated compliance reports for SOC 2, HIPAA, and regulatory frameworks covering all agentic activity.
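What a single discovered asset might look like in the resulting inventory; the shape is illustrative, not QuilrAI's actual schema:

{
  "type": "mcp_server",
  "name": "github-connector",
  "tools": ["read_repo", "create_pr"],
  "clients": ["claude-desktop", "cursor"],
  "auth": "none",
  "registered": false
}

An unauthenticated, unregistered server like this one is exactly the shadow-AI signal the posture assessment flags for review.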
Every MCP server, agent framework, and tool chain in your organization needs governance. QuilrAI provides it without slowing your teams down.
If your agents are being written or run by Claude Code, Cursor, or Copilot, there's a dedicated enforcement plane for that. See exactly what QuilrAI intercepts at the machine level.