Llama on dev laptops, vLLM clusters with no auth, community MCP servers wired straight to production, model weights from unknown sources. QuilrAI discovers all of it, scans it, rates the risk, and enforces policy with a single base_url change.
The problem
Ollama · LM Studio · Jan · GPT4All
Llama 3 runs on 100 developer laptops. IT sees zero requests. No DLP event. No SIEM alert. Completely invisible.
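Why "invisible"? Ollama listens on its documented default port, 11434, with no API key or token required, so any process on the laptop can enumerate and query local models without producing a single network event IT can see. A minimal stdlib-only probe sketch (the endpoint and port are Ollama's real defaults; everything else is illustrative):

```python
import http.client

def ollama_exposed(host: str = "127.0.0.1", port: int = 11434) -> bool:
    """Return True if an Ollama-style API answers on host:port.

    GET /api/tags lists every locally pulled model -- and by default
    it requires no authentication of any kind.
    """
    conn = http.client.HTTPConnection(host, port, timeout=1)
    try:
        conn.request("GET", "/api/tags")
        return conn.getresponse().status == 200
    except OSError:
        # Connection refused / timeout: nothing listening there.
        return False
    finally:
        conn.close()
```

A laptop fleet scan is just this function run against each host's loopback from an endpoint agent; a closed port simply returns `False`.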
vLLM · HuggingFace TGI · Triton
OpenAI-compatible API, no auth by default. Any internal service can query the model. No rate limit, no logging, no content policy.
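To make the exposure concrete, here is a sketch of the request any internal service could send to such a server. The `/v1/chat/completions` path is the standard OpenAI-compatible route vLLM serves; the hostname and model name are hypothetical. Note what the request does not contain: no API key, no bearer token, no client certificate.

```python
import json
import urllib.request

INTERNAL_HOST = "http://vllm.internal:8000"  # hypothetical cluster address

def build_unauthenticated_request(prompt: str) -> urllib.request.Request:
    """Build a chat request that a default vLLM deployment will accept."""
    body = json.dumps({
        "model": "meta-llama/Llama-3-8b",  # whatever the server loaded
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    # The only header needed is Content-Type -- no credential at all.
    return urllib.request.Request(
        f"{INTERNAL_HOST}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_unauthenticated_request("Summarize our Q3 incident report")
```

Because nothing in the request identifies the caller, there is also nothing to rate-limit, log, or apply content policy against.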
GitHub · npm · PyPI · 1,000+ servers
Developers install MCP servers from the internet and wire them straight to production. No security audit. No CVE process. Tool scope is self-declared.
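"Self-declared" is the key risk: a tool's description is just metadata the server ships about itself, and nothing enforces that the handler's behavior matches it. A small illustrative sketch (the manifest and handler are hypothetical, not from any real MCP server):

```python
import os
import tempfile

# What the server *declares* about itself -- innocuous, read-only.
DECLARED = {
    "name": "read_changelog",
    "description": "Read-only access to the project changelog",
}

def read_changelog(path: str) -> str:
    # The declared scope does not constrain this code: no sandbox,
    # no path restriction. It reads whatever it is pointed at.
    with open(path) as f:
        return f.read()

# Demo: the handler happily reads a file far outside its stated scope.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("secret outside declared scope")
    probe = f.name
leaked = read_changelog(probe)
os.unlink(probe)
```

Auditing the declaration alone therefore proves nothing; the server's actual code and runtime behavior have to be scanned.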
HuggingFace · LoRA · GGUF quantizations
Weights from unknown sources. Backdoored models activate on trigger phrases. Fine-tunes trained on internal data leak IP through outputs.
Every model weight, MCP server, and agent framework goes through 5 stages before it touches your infrastructure.
Find every open-source AI asset in your org
What this looks like in Guardian setup
One base_url change. Every Ollama, vLLM, and MCP server in your org, fully governed.
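The mechanics behind that claim: because Ollama, vLLM, and most serving stacks expose the same OpenAI-compatible `/v1` API shape, every client can be repointed at a single governed gateway without touching request or response formats. A minimal sketch with hypothetical hostnames (the gateway URL is illustrative, not a documented QuilrAI endpoint):

```python
# Before: each client talks directly to its own ungoverned endpoint.
DIRECT = {
    "ollama": "http://127.0.0.1:11434/v1",     # local, invisible to IT
    "vllm":   "http://vllm.internal:8000/v1",  # hypothetical cluster
}

GATEWAY = "https://ai-gateway.internal/v1"     # hypothetical gateway URL

def govern(endpoints: dict) -> dict:
    """Route every OpenAI-compatible endpoint through one gateway.

    The gateway preserves the /v1 API shape, so the only thing a
    client changes is its base_url; the gateway then applies
    discovery, scanning, and policy before forwarding upstream.
    """
    return {name: GATEWAY for name in endpoints}

governed = govern(DIRECT)
```

One config value per client, and every request gains the auth, logging, and policy enforcement the direct endpoints lacked.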