One router. Every model.
Always the cheapest that works.

RealityRouter is a smart proxy between you, your agents, and every LLM. It scores every provider per query and routes to the cheapest model that will still nail the answer, escalating only when that model falls short.

$ curl -fsSL realityrouter.dev/install.sh | sh

Works with OpenAI · Anthropic · Gemini · Cohere · Ollama · Others

See it route

Watch how RealityRouter picks the right model for each kind of query — and gets sharper over time.


Cheap when it's enough

Your routine queries don't need a $10 model.

Most agentic queries are routine: formatting, small refactors, quick lookups. The router scores them in milliseconds and ships them to the cheapest model that won't fail. Self-hosted models often win and add nothing to your API bill.

Smart when it counts

When you really need heavy lifting, you get it.

The math isn't just "pick cheapest." It's argmax(P × Reward − α × Cost − β × Latency). When a query will likely fail on a small model, P(success) collapses for the cheap options and Opus wins despite the cost, so you don't lose hours debugging a wrong answer.
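To make the formula concrete, here is a minimal sketch of the argmax, not RealityRouter's actual code. The model names, probabilities, costs, and the default values for reward, alpha, and beta are all made-up assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    p_success: float  # estimated probability the model gets it right
    cost_usd: float   # expected cost of the call
    latency_s: float  # expected time to a full response


def expected_utility(c, reward=1.0, alpha=1.0, beta=0.01):
    # P x Reward - alpha x Cost - beta x Time, as in the formula above.
    return c.p_success * reward - alpha * c.cost_usd - beta * c.latency_s


def pick(candidates, **weights):
    # argmax over all candidates for this one query.
    return max(candidates, key=lambda c: expected_utility(c, **weights))


# Hypothetical scores for a routine query and for a hard one.
easy = [Candidate("local-small", 0.95, 0.00, 2.0),
        Candidate("flagship", 0.99, 0.20, 6.0)]
hard = [Candidate("local-small", 0.20, 0.00, 2.0),
        Candidate("flagship", 0.90, 0.20, 6.0)]

print(pick(easy).name)  # local-small
print(pick(hard).name)  # flagship
```

On the easy query the small model's near-equal P(success) makes its zero cost decisive. On the hard query its P(success) collapses, and the flagship wins even after paying for cost and latency.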

Self-correcting

Broken answers never reach your agent.

RealityRouter always picks the model with the highest expected utility, checks the output for protocol issues (truncated JSON, AI refusals, mid-word cuts), and silently escalates if the output won't survive contact with your agent. Your client sees only the good response.
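The validate-then-escalate loop could look roughly like the sketch below. It is an illustration under assumptions, not RealityRouter's validator: the refusal phrases, the function names, and the `call` callback are hypothetical, and it checks only empty output, refusals, and unparseable JSON (mid-word cuts are left out).

```python
import json

# Hypothetical refusal phrases; a real validator would need a richer check.
REFUSAL_MARKERS = ("i can't help with", "i cannot help with", "i'm unable to")


def looks_broken(text, expect_json=False):
    stripped = text.strip()
    if not stripped:
        return True
    if any(marker in stripped.lower() for marker in REFUSAL_MARKERS):
        return True
    if expect_json:
        try:
            json.loads(stripped)
        except json.JSONDecodeError:
            return True  # truncated or malformed JSON
    return False


def route_with_escalation(prompt, ranked_models, call, expect_json=False):
    # ranked_models is ordered by expected utility, best first.
    for model in ranked_models:
        output = call(model, prompt)
        if not looks_broken(output, expect_json):
            return model, output
    raise RuntimeError("every candidate produced an unusable response")


def fake_call(model, prompt):
    # Stand-in for a provider call: the small model returns truncated JSON.
    return '{"status": "ok"' if model == "small" else '{"status": "ok"}'


model, output = route_with_escalation(
    "summarize as JSON", ["small", "large"], fake_call, expect_json=True
)
print(model, output)  # large {"status": "ok"}
```

The caller only ever receives the second, valid response. The truncated first attempt stays inside the loop.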

RealityRouter learns

Your frustration becomes a training signal.

When you correct a model in the next turn, RealityRouter reads it as negative feedback and lowers that model's P(success) for similar tasks. Tomorrow's routing reflects yesterday's complaints, yours and your peers', without you ever filling out a survey.
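How the update works internally is not described here. One simple way to picture it is a running success/failure count per model and task type, as in the sketch below. The class name, the task labels, and the uniform prior are assumptions made for illustration.

```python
from collections import defaultdict


class SuccessEstimator:
    """Tracks P(success) per (model, task kind) from observed outcomes."""

    def __init__(self, prior_success=1.0, prior_failure=1.0):
        # Each key starts from the same prior pseudo-counts.
        self._counts = defaultdict(lambda: [prior_success, prior_failure])

    def record(self, model, task_kind, success):
        # A user correction in the next turn would be recorded as success=False.
        self._counts[(model, task_kind)][0 if success else 1] += 1.0

    def p_success(self, model, task_kind):
        successes, failures = self._counts[(model, task_kind)]
        return successes / (successes + failures)


estimator = SuccessEstimator()
before = estimator.p_success("small-model", "refactor")
estimator.record("small-model", "refactor", success=False)
after = estimator.p_success("small-model", "refactor")
print(before, round(after, 3))  # 0.5 0.333
```

Only the estimate for that model on that kind of task moves. Other task kinds keep their own counts.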

Without a router

Every query hits flagship.

You default to Opus or GPT flagship models for trivial tasks. 20× the cost, same answer.

You keep running out.

Even Claude Code hits its usage limit mid-task. Agents stall. You wait for the window to reset.

One provider. One outage.

When Anthropic lags or OpenAI 500s, your whole stack follows. No fallback, no provider competition.

With RealityRouter

One drop-in proxy between you and every LLM.

Every prompt from every agent flows through RealityRouter. It scores every model on Expected Utility — probability of success, cost, latency — and picks the most rewarding one for that specific query.

Every prompt scored against every model · most rewarding picked, automatically · with probabilities provided by Reality Signal™

See where your money goes

And where you save it.

Every routing decision is logged with its full utility breakdown. The web dashboard shows you cost vs. counterfactual, per-model reliability, per-agent spend, and live calibration health.

RealityRouter · Control Center (Live)

System Health & Usage

Total volume: 1,097 requests
Accrued expense: $71.81 (actual USD)
Potential cost: $219.37 (max model USD)
Total savings: $147.56 (retained value)
Success density: 95.0% (operational)

Most reliable: gemini-2.5-pro (0.6787 med prob)
Most economical: qwen3-coder:30b ($0.000 med cost)
Fastest response: gemini-2.5-flash (3.41s med time)
Least reliable: gemini-2.5-flash (0.5517 med prob)
Chattiest: gemini-2.5-pro (1,482 avg tokens)
Most shy: qwen3-coder:30b (479 avg tokens)
Clumsiest: gemini-3-flash-preview (14.7% error rate)

Real numbers from a real install: $147 saved against the always-flagship counterfactual on 1,097 requests across Zed, RooCode, and a Python client. 95% success rate.
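The savings figure follows directly from the two dashboard totals. A quick check of the arithmetic, using only the numbers quoted above (the variable names are ours):

```python
actual_usd = 71.81           # accrued expense
counterfactual_usd = 219.37  # cost if every request had hit the priciest model
requests = 1097

savings_usd = round(counterfactual_usd - actual_usd, 2)
savings_share = savings_usd / counterfactual_usd
avg_cost_per_request = actual_usd / requests

print(f"saved ${savings_usd:.2f} ({savings_share:.1%}), "
      f"${avg_cost_per_request:.4f} per request")
# saved $147.56 (67.3%), $0.0655 per request
```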

The math, not the magic

Expected Utility, every query.

EU(model) = P(success) × Reward − α × Cost − β × Latency

The router picks argmax EU over every model, for every query. P(success) comes from Reality Signal™, calibrated against historical outcomes for similar tasks. α and β are your sensitivities to cost and latency: tune them to your preferences and the router applies them to every decision.
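To see what tuning α does, the sketch below scores the same two hypothetical models under a cost-sensitive and a quality-first setting. All numbers are invented for illustration, and the function name is not part of any RealityRouter API.

```python
# name: (P(success), cost in USD, latency in seconds); made-up values
MODELS = {
    "local-small": (0.80, 0.00, 3.0),
    "flagship": (0.95, 0.10, 6.0),
}


def best_model(alpha, beta, reward=1.0):
    def eu(stats):
        p, cost, latency = stats
        return p * reward - alpha * cost - beta * latency

    return max(MODELS, key=lambda name: eu(MODELS[name]))


cost_sensitive = best_model(alpha=5.0, beta=0.01)
quality_first = best_model(alpha=0.1, beta=0.01)
print(cost_sensitive, quality_first)  # local-small flagship
```

With a high α the flagship's extra ten cents outweighs its accuracy edge. With a low α the same query goes to the flagship.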

Built in the open

Self-hosted. Auditable. Yours.

RealityRouter runs on your laptop, your server, or your VPC. Your API keys live in your .env. Your logs live on disk. No telemetry, no vendor lock-in.

Drop-in support