One router. Every model.
Always the cheapest that works.
RealityRouter is a smart proxy between you, your agents and every LLM. It scores every provider per query and routes to the cheapest model that will still nail the answer — escalating only when it can't.
Works with OpenAI · Anthropic · Gemini · Cohere · Ollama · Others
See it route
Watch how RealityRouter picks the right model for each kind of query — and gets sharper over time.
Cheap when it's enough
Your routine queries don't need a $10 model.
Most agentic queries are routine — formatting, small refactors, quick lookups. The router scores them in milliseconds and ships them to whatever's cheapest that won't fail. Self hosted models often win, costing you nothing.
Smart when it counts
When you really need heavy lifting, you get it.
The math isn't just "pick cheapest." It's argmax(P × Reward − α × Cost − β × Time). When a query will likely fail on a small model, P(success) collapses for cheap options and Opus wins despite the cost — and you don't lose hours debugging a wrong answer.
Self-correcting
Broken answers never reach your agent.
RealityRouter always picks the model with highest expected utility, validates the output for protocol issues — truncated JSON, AI refusals, mid-word cuts — and silently escalates if the output won't survive contact with your agent. Your client sees only the good response.
RealityRouter learns
Your frustration becomes a training signal.
When you correct a model in the next turn, RealityRouter reads it as negative feedback and lowers that model's P(success) for similar tasks. Tomorrow's routing reflects yesterday's complaints, not only yours but also your peers — without you ever filling out a survey.
Without a router
You default to Opus or GPT flagship models for trivial tasks. 20× the cost, same answer.
Even Claude Code hits the wall mid-task. Agents stall. You wait for the window to reset.
When Anthropic lags or OpenAI 500s, your whole stack follows. No fallback, no provider competition.
With RealityRouter
One drop-in proxy between you and every LLM.
Every prompt from every agent flows through RealityRouter. It scores every model on Expected Utility — probability of success, cost, latency — and picks the most rewarding one for that specific query.
Every prompt scored against every model · most rewarding picked, automatically · with probabilities provided by Reality Signal™
See where your money goes
And where you save it.
Every routing decision is logged with its full utility breakdown. The web dashboard shows you cost vs. counterfactual, per-model reliability, per-agent spend, and live calibration health.
Real numbers from a real install: $147 saved against the always-flagship counterfactual on 1,097 requests across Zed, RooCode, and a Python client. 95% success rate.
The math, not the magic
Expected Utility, every query.
The router picks argmax EU — every model, every query. P(success) comes from Reality Signal™, calibrated against historical outcomes for similar tasks. α, β are your sensitivities — tune them according to your preferences, the router stays loyal forever.
Built in the open
Self-hosted. Auditable. Yours.
RealityRouter runs on your laptop, your server, or your VPC. Your API keys live in your .env. Your logs live on disk. No telemetry, no vendor lock-in.
- ✓CursorOpenAI-compatible base URL
- ✓Claude CodeAnthropic-compatible passthrough
- ✓ZedAgent Client Protocol via translation layer
- ✓Continue / VSCodiumNative session tracking
- ✓OpenClaw / AutoGPTAgent-card discovery + sticky sessions
- ✓Your own scriptsAny OpenAI-compatible client
Ready in 60 seconds.
One install command. One URL change. Every agent gets smarter.
Open · Self-hosted · Powered by Reality Signal™