QuilrAI
For Engineering & IT

One line of code.
Every AI call governed.

Change your base URL. Identity, guardrails, rate limits, token saving, and routing — all enforced automatically. Works with every provider.

integration.py
# Before
base_url = "https://api.openai.com/v1"
# After — one line, full governance
base_url = "https://guardrails.quilr.ai/openai_compatible/"
/openai_compatible/ · OpenAI Compatible
/anthropic_messages/ · Anthropic Messages
/vertex_ai/ · Vertex AI
/sdk/v1/check · SDK Mode
~40ms overhead · 99.6% SLA · 43% token savings · 150+ MCP servers · 5 providers · 1 line to integrate

Three Ways to Integrate

OpenAI-compatible proxy, native Anthropic support, or SDK mode for any provider.

/openai_compatible/

OpenAI Compatible

Drop-in replacement for any OpenAI SDK call. Chat completions, embeddings, assistants — all governed.

Supported Models
GPT-4o · GPT-4o-mini · GPT-4 Turbo · o1 · o1-mini
example.py
from openai import OpenAI

client = OpenAI(
    base_url="https://guardrails.quilr.ai/openai_compatible/",
    api_key="<your-quilr-token>",  # Quilr token is passed via the auth header
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "..."}],
)
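
The same client covers the other OpenAI-compatible surfaces. A minimal embeddings sketch, reusing the governed client above; the embedding model name is illustrative, not taken from the supported-model list:

embeddings_example.py
# Reuses the governed client from example.py; embeddings pass through the
# same gateway. The model name here is illustrative.
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="Summarize the Q3 report",
)
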
Regional Endpoints
Auto · guardrails.quilr.ai · Automatic routing
US · us.guardrails.quilr.ai · US East (Virginia)
India · in.guardrails.quilr.ai · Asia South (Mumbai)
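
The /anthropic_messages/ endpoint follows the same pattern for native Anthropic SDK traffic. A minimal sketch, assuming the same base-URL swap and Quilr token auth as the OpenAI example; the model name is illustrative:

anthropic_example.py
from anthropic import Anthropic

# Same pattern as above: point the SDK at the Quilr gateway and authenticate
# with your Quilr token. The endpoint path is from the list at the top of the page.
client = Anthropic(
    base_url="https://guardrails.quilr.ai/anthropic_messages/",
    api_key="<your-quilr-token>",
)

response = client.messages.create(
    model="claude-sonnet-4",  # illustrative model name
    max_tokens=1024,
    messages=[{"role": "user", "content": "..."}],
)
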
Architecture

One Gateway. Every Connection Governed.

AI systems on the left. Tools and providers on the right. QuilrAI sits in the middle: every LLM call and MCP tool invocation passes through the Decision Engine.

── AI Systems ──

OpenAI · GPT-4o, o1
Anthropic · Claude Sonnet/Opus
Google · Gemini Pro/Ultra
Self-Hosted · Llama, Mistral, vLLM
Cursor
Claude Code
OpenAI Agents
Custom Agents
QuilrAI

LLM Gateway + MCP Gateway

Identity & Auth
Security Guardrails
Guardian Agent
Decision Engine
~40ms overhead · 150+ MCP servers

── LLM Providers ──

OpenAI · Anthropic · Azure · Bedrock · Vertex AI · vLLM

── Tools & MCP Servers ──

GitHub
Slack
Jira
PostgreSQL
Google Drive
AWS
Brave Search
Salesforce
MongoDB

+ 140 more servers

1. AI System connects

Any model or agent connects via one base_url change. OpenAI-compatible, Anthropic, Vertex AI, or MCP.

2. QuilrAI governs

Every call passes through Identity, Guardrails, Guardian Agent, and the Decision Engine. ~40ms overhead.

3. Tools & providers execute

Approved calls route to 5+ LLM providers or 150+ MCP servers. Automatic failover. Token optimization.

Pipeline Architecture

Every request passes through a multi-stage pipeline.

1. Identity & Auth
Know exactly who’s using AI and how much
JWT/JWKS (Auth0, Okta, Google) · per-user usage tracking · domain allowlists

2. Rate Limits
Prevent runaway costs and noisy-neighbor problems
Requests per min/hr/day · token budgets · per-team and per-model limits

3. Security Guardrails
Stop sensitive data leaks and prompt attacks — both directions
PII/PHI/PCI detection · prompt injection blocking · block, redact, anonymize, or monitor

4. Custom Intents
Block topics unique to your business (e.g. competitor mentions, legal risk)
Train your own classifier with examples · block, monitor, or redact matches

5. Prompt Store
Update prompts across all apps instantly — no code deploys
Centralized versioned prompts · {{variable}} templates · enforce mode rejects freeform

6. Token Saving
Cut input costs 43% automatically — no code changes
JSON→TOON compression · HTML/Markdown stripping · responses untouched

7. Request Routing
Split traffic across providers and auto-failover when one goes down
Weighted routing groups · group-name-as-model-parameter · automatic failover

MCP Gateway

150+ managed MCP servers via one URL. Dynamic Tool Calling. Auto-detected agents.

MCP Multiplexing

A single URL for all MCPs. Agents connect to one endpoint — the gateway routes to the right server based on the tool call.

mcp.quilr.ai/mcp/<slug>/
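
A minimal connection sketch using the MCP Python SDK's streamable-HTTP client. The "github" slug and the SDK import paths are assumptions, not taken from this page:

mcp_connect.py
import asyncio

# Assumes the official MCP Python SDK (the `mcp` package) with streamable-HTTP
# transport; the "github" slug below is illustrative.
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main():
    async with streamablehttp_client("https://mcp.quilr.ai/mcp/github/") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())
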

Dynamic Tool Calling

Reduces tool selection context from 10-20K tokens to ~200 tokens. Higher accuracy — LLMs pick the right tool without noise.

2x the productive usage from the same tokens

Web Search MCP

Built-in web search with enterprise security gateway integration. URL filtering enforced through your existing security stack.

Zscaler ZIAPrisma AccessFortiGateCisco Umbrella

MCP Library — 150+ Servers

Developer Tools
GitHub
GitLab
Jira
Linear
Sentry
Communication
Slack
Discord
Email
Teams
Databases
PostgreSQL
MongoDB
Redis
Supabase
Cloud
AWS
GCP
Azure
Cloudflare
Productivity
Google Drive
Notion
Confluence
Asana
Data & Analytics
BigQuery
Snowflake
Tableau
Web Search
Brave Search
Google Search
Bing
Security
Zscaler ZIA
Prisma Access
FortiGate
File Systems
Local FS
S3
GCS
Dropbox

Tool Risk Categorization

Read Only · Safe operations — no state changes
get_file · list_repos · search_issues · read_channel
Write · Creates or modifies resources
create_issue · send_message · update_record · push_commit
Destructive · Irreversible operations — requires approval
delete_repo · drop_table · remove_user · purge_cache

Auto-Detected Agents

Cursor
User-Agent: cursor/*
Claude Code
User-Agent: claude-code/*
OpenAI Agents
User-Agent: openai-agents/*
Gemini
User-Agent: gemini-cli/*

Agents are automatically identified via User-Agent header matching. Per-agent policies, rate limits, and tool access controls apply instantly. Add custom agents with your own keywords.
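
For an agent that is not on the auto-detected list, the User-Agent can be set explicitly so per-agent policies still apply. A minimal sketch with the OpenAI SDK; the header value "acme-agent/1.0" is an illustrative keyword, not a documented one:

custom_agent.py
from openai import OpenAI

# Send a custom User-Agent so the gateway can match it against your own
# agent keywords; "acme-agent/1.0" is an illustrative value.
client = OpenAI(
    base_url="https://guardrails.quilr.ai/openai_compatible/",
    api_key="<your-quilr-token>",
    default_headers={"User-Agent": "acme-agent/1.0"},
)
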

MCP/AI Portal

Self-service portal for end users to browse available MCPs, connect their accounts via OAuth, and start using tools. Not admin-only — engineers can self-serve.

Auth Mediation

The gateway brokers OAuth tokens — agents never see raw credentials. Modes: OAuth→Token, Token→Token, No Auth→OAuth. Auth options: Bearer token + mcpuser header, OAuth DCR, OAuth Manual.

Built for Production

Cost control, intelligent routing, prompt management, and custom classifiers — all built in.

Routing Groups

Weighted distribution across providers with automatic fallback

OpenAI · gpt-4o · 40%
Anthropic · claude-sonnet-4 · 35%
Azure OpenAI · gpt-4o · 25%
Automatic failover: if one provider goes down, traffic shifts to the rest
+ Bedrock, Vertex AI, vLLM / custom endpoints
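
Routing groups are addressed by name where the model name usually goes (the "group-name-as-model-parameter" behavior from the pipeline above). A sketch, assuming a routing group has already been configured; the group name "prod-chat" is illustrative:

routing_group.py
from openai import OpenAI

client = OpenAI(
    base_url="https://guardrails.quilr.ai/openai_compatible/",
    api_key="<your-quilr-token>",
)

# Pass the routing group name as the model; the gateway picks a provider by
# weight and fails over automatically. "prod-chat" is an illustrative name.
response = client.chat.completions.create(
    model="prod-chat",
    messages=[{"role": "user", "content": "..."}],
)
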

Token Saving — JSON → TOON

Lossless compression cuts token count by 43%

JSON (before)
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant..."},
    {"role": "user", "content": "Summarize the Q3 report"}
  ],
  "model": "gpt-4o",
  "temperature": 0.7,
  "max_tokens": 2048
}
TOON (after)
m:[
  {r:"s",c:"You are a helpful assistant..."},
  {r:"u",c:"Summarize the Q3 report"}
],M:"gpt-4o",t:0.7,x:2048
43% fewer tokens

Prompt Store

Versioned, centralized prompts with variable injection. No code deploys.

quilrai-prompt-store-summarize-v3
You are a {{role}} at {{company}}.
Summarize the following document in {{format}} format.
Focus on: {{focus_areas}}
Max length: {{max_words}} words.
{{role}} · {{company}} · {{format}} · {{focus_areas}} · {{max_words}}
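
The injection itself is plain template substitution performed by the gateway at request time. A purely local illustration of the rendered result — not the Prompt Store API, and all variable values are made up:

prompt_render.py
# Local illustration only — the gateway performs this injection server-side.
template = (
    "You are a {{role}} at {{company}}.\n"
    "Summarize the following document in {{format}} format.\n"
    "Focus on: {{focus_areas}}\n"
    "Max length: {{max_words}} words."
)

variables = {
    "role": "financial analyst",
    "company": "Acme Corp",
    "format": "bullet-point",
    "focus_areas": "revenue, risks",
    "max_words": "200",
}

rendered = template
for name, value in variables.items():
    rendered = rendered.replace("{{" + name + "}}", value)
print(rendered)
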

Custom Intents

Train classifiers with examples. Block, monitor, or redact matches.

competitor_mention · block
Positive Examples
"How does QuilrAI compare to Prompt Security?"
"What advantages does Protect AI have over us?"
Negative Examples
"What are our product advantages?"
"How is our security posture?"

Start building securely

One line change. Full governance. Deploy in minutes.