One router. Every model.
Always the cheapest that works.

RealityRouter is a smart proxy between you, your agents and every LLM. It scores every provider per query and routes to the cheapest model that will still nail the answer — escalating only when it can't.

$curl -fsSL realityrouter.dev/install.sh | sh

Works with OpenAI · Anthropic · Gemini · Cohere · Ollama · Others

See it route

Watch how RealityRouter picks the right model for each kind of query — and gets sharper over time.

rc-watch · easy task

Cheap when it's enough

Your routine queries don't need a $10 model.

Most agentic queries are routine — formatting, small refactors, quick lookups. The router scores them in milliseconds and ships them to whatever's cheapest that won't fail. Self hosted models often win, costing you nothing.

Smart when it counts

When you really need heavy lifting, you get it.

The math isn't just "pick cheapest." It's argmax(P × Reward − α × Cost − β × Time). When a query will likely fail on a small model, P(success) collapses for cheap options and Opus wins despite the cost — and you don't lose hours debugging a wrong answer.

Self-correcting

Broken answers never reach your agent.

RealityRouter always picks the model with highest expected utility, validates the output for protocol issues — truncated JSON, AI refusals, mid-word cuts — and silently escalates if the output won't survive contact with your agent. Your client sees only the good response.

RealityRouter learns

Your frustration becomes a training signal.

When you correct a model in the next turn, RealityRouter reads it as negative feedback and lowers that model's P(success) for similar tasks. Tomorrow's routing reflects yesterday's complaints, not only yours but also your peers — without you ever filling out a survey.

Without a router

Every query hits flagship.

You default to Opus or GPT flagship models for trivial tasks. 20× the cost, same answer.

You keep running out.

Even Claude Code hits the wall mid-task. Agents stall. You wait for the window to reset.

One provider. One outage.

When Anthropic lags or OpenAI 500s, your whole stack follows. No fallback, no provider competition.

With RealityRouter

One drop-in proxy between you and every LLM.

Every prompt from every agent flows through RealityRouter. It scores every model on Expected Utility — probability of success, cost, latency — and picks the most rewarding one for that specific query.

YOUR STACKCursorClaude CodeZedYour scriptsHermesONE PROXYRealityRouterargmax EUYOUR LLMsGPT-5.4 ThinkingClaude Opus 4Claude Haiku 4.5Gemini 2.5 Flashqwen3-coder:30b · local

Every prompt scored against every model · most rewarding picked, automatically · with probabilities provided by Reality Signal™

See where your money goes

And where you save it.

Every routing decision is logged with its full utility breakdown. The web dashboard shows you cost vs. counterfactual, per-model reliability, per-agent spend, and live calibration health.

RealityRouterRealityRouter
System Health & Usage
Total volume
1,097
Requests
Accrued expense
$71.81
Actual USD
Potential cost
$219.37
Max model USD
Total savings
$147.56
Retained value
Success density
95.0%
Operational
Most reliable
gemini-2.5-pro
0.6787 med prob
Most economical
qwen3-coder:30b
$0.000 med cost
Fastest response
gemini-2.5-flash
3.41s med time
Least reliable
gemini-2.5-flash
0.5517 med prob
Chattiest
gemini-2.5-pro
1,482 avg tokens
Most shy
qwen3-coder:30b
479 avg tokens
Clumsiest
gemini-3-flash-preview
14.7% error rate

Real numbers from a real install: $147 saved against the always-flagship counterfactual on 1,097 requests across Zed, RooCode, and a Python client. 95% success rate.

The math, not the magic

Expected Utility, every query.

EU(model) = P(success) × Rewardα × Costβ × Latency

The router picks argmax EU — every model, every query. P(success) comes from Reality Signal™, calibrated against historical outcomes for similar tasks. α, β are your sensitivities — tune them according to your preferences, the router stays loyal forever.

Built in the open

Self-hosted. Auditable. Yours.

RealityRouter runs on your laptop, your server, or your VPC. Your API keys live in your .env. Your logs live on disk. No telemetry, no vendor lock-in.

Drop-in support
  • Cursor
    OpenAI-compatible base URL
  • Claude Code
    Anthropic-compatible passthrough
  • Zed
    Agent Client Protocol via translation layer
  • Continue / VSCodium
    Native session tracking
  • OpenClaw / AutoGPT
    Agent-card discovery + sticky sessions
  • Your own scripts
    Any OpenAI-compatible client

Ready in 60 seconds.

One install command. One URL change. Every agent gets smarter.

$curl -fsSL realityrouter.dev/install.sh | sh

Open · Self-hosted · Powered by Reality Signal™