QuilrAI

Open Source AI ships with no guardrails.
Your enterprise needs them.

Llama on dev laptops, vLLM clusters with no auth, community MCPs wired straight to production, model weights from unknown sources. QuilrAI discovers all of it, scans it, rates the risk, and enforces policy, all with a single base_url change.

Discovers shadow AI · Zero model changes · <50ms added latency

The problem

Four surfaces. Zero governance.

Shadow AI on Dev Machines

Ollama · LM Studio · Jan · GPT4All

Llama 3 runs on 100 developer laptops. IT sees zero requests. No DLP event. No SIEM alert. Completely invisible.

Self-Hosted Model Servers

vLLM · HuggingFace TGI · Triton

An OpenAI-compatible API with no auth by default. Any internal service can query the model. No rate limits, no logging, no content policy.

Community MCPs & Agents

GitHub · npm · PyPI · 1,000+ servers

Developers install MCP servers from the internet and wire them straight to production. No security audit. No CVE process. Tool scope is self-declared.

Model Supply Chain

HuggingFace · LoRA · GGUF quantizations

Weights from unknown sources. Backdoored models activate on trigger phrases. Fine-tunes trained on internal data leak IP through outputs.

npm audit for AI tools.

Every model weight, MCP server, and agent framework goes through 5 stages before it touches your infrastructure.

Step 01

Discover

Find every open source AI asset in your org

Ollama on port 11434 across the network
pip- and npm-installed MCP servers
HuggingFace cached models
Running vLLM / TGI endpoints
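The Ollama sweep in the list above can be sketched as a TCP probe on port 11434 followed by a call to Ollama's unauthenticated /api/tags model-listing endpoint (host list and timeouts are illustrative, not QuilrAI's scanner):

```python
import json
import socket
from urllib import request

def find_ollama(hosts, port: int = 11434):
    """Yield (host, [model names]) for each host running an Ollama daemon."""
    for host in hosts:
        try:  # cheap reachability check before the HTTP call
            socket.create_connection((host, port), timeout=0.5).close()
        except OSError:
            continue
        try:
            with request.urlopen(f"http://{host}:{port}/api/tags", timeout=2) as resp:
                tags = json.load(resp)
            yield host, [m["name"] for m in tags.get("models", [])]
        except OSError:
            continue
```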
quilrai / oss-scanner / result
{
  "tool": "github-mcp@0.3.1",
  "source": "npm (unverified publisher)",
  "risk_score": "HIGH (74/100)",
  "permissions_declared": ["read:repo"],
  "permissions_actual": ["read:repo", "write:repo", "read:org", "network:*"],
  "over_permissioned": "true, 3 undeclared scopes",
  "enforced_scope": ["read:repo"],   // ← QuilrAI least-privilege
  "action": "QUARANTINED, pending review"
}

Experience Center

See it in action.

Ollama found on 12 dev machines


What this looks like in Guardian setup

Allow model weight downloads? Approve
  Source: HuggingFace, Ollama (signed)
Allow inference on PII data? Approve
  Redaction: auto-strip before inference
Allow outbound network calls? Deny
  Blocked, no external API egress
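Conceptually, the three toggles above act like a policy table consulted before each action. A sketch of that shape (key names are illustrative, not QuilrAI's actual schema):

```python
# Illustrative policy table mirroring the Guardian toggles above.
POLICY = {
    "model_weight_downloads": {"allow": True, "sources": ["HuggingFace", "Ollama (signed)"]},
    "inference_on_pii": {"allow": True, "redaction": "auto-strip before inference"},
    "outbound_network": {"allow": False, "note": "no external API egress"},
}

def is_allowed(action: str, policy: dict = POLICY) -> bool:
    """Default-deny: unknown actions and explicit denies both return False."""
    rule = policy.get(action)
    return bool(rule and rule.get("allow"))
```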
100% of Ollama / vLLM endpoints governed
<50ms added latency per request
Zero model weight changes needed
Full audit trail, even for local inference

Govern your open source AI today.

One base_url change. Every Ollama, vLLM, and MCP server in your org, fully governed.
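Concretely, "one base_url change" means any OpenAI-compatible client keeps working when it is pointed at the governance gateway instead of the raw model server. A standard-library sketch; the gateway hostname is a placeholder, not a real QuilrAI endpoint:

```python
import json
from urllib import request

def chat(base_url: str, model: str, prompt: str, timeout: float = 30.0) -> dict:
    """Minimal OpenAI-style chat completion call against any compatible server."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

# Before: straight to a local Ollama daemon, ungoverned
#   chat("http://localhost:11434/v1", "llama3", "hi")
# After: same call through the gateway (placeholder URL), governed
#   chat("https://ai-gateway.example.internal/v1", "llama3", "hi")
```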