Every model. Every provider behind it.
Two routes in one SDK. Every model from every provider — and the same model across provider deployments, with priority, weighting, and failover.
Routing
Every model. Every provider. One SDK that routes both.
- Failover
- BYOK
- Region pin
- Weighted
- Retries
- Health
The three things routing actually does.
Every provider's models, one call.
OpenAI, Anthropic, Google, Bedrock, Mistral, Groq, Cerebras, Together — every model they ship, accessible through one OpenAI-compatible call.
→ universal model access
Same model, different provider deployments.
gpt-4o on OpenAI ↔ Azure. claude-3.5 on Anthropic ↔ Bedrock. llama-3 on Together ↔ Groq. One model name, priority-ordered providers behind it.
→ same-model failover
Sub-200ms switch when one provider dies.
Priority + weighted routing, automatic retry on 5xx, timeouts, capacity errors. Your client sees one stable response — not the chaos behind it.
→ cost-aware routing
Hand-rolling LiteLLM glue, retry wrappers, and a status-page poller is the kind of code you'll regret writing at 3 a.m. when GPT-4o goes red.
Urgent backstory
We watched five AI startups in a row hand-roll the same routing stack — and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once; this product, plus the rest of the suite, comes with it.
Four common ways teams ship with Routing.
Universal model access
Every model from every provider through one SDK. Switch from gpt-4o to claude-3.5-sonnet to gemini-2 by changing one string.
Same-model failover
gpt-4o priority list: OpenAI → Azure → fallback. The model name stays the same; the deployment behind it doesn't.
Cost-aware routing
Cheap-first, fast-first, balanced. Route per user tier, per request type. Health-checked every minute.
BYOK + region pinning
Mix our keys with yours. Cap per-key daily spend. Pin EU traffic to EU. Audit-logged per request.
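The routing modes above can be pictured as one candidate-ordering function. This is a minimal sketch, not the SDK's implementation: the `Deployment` shape, the `orderCandidates` function, and every field name are hypothetical. It drops deployments outside a pinned region or at their daily key cap, then orders the rest by priority, with a weighted shuffle inside each priority tier.

```typescript
// Hypothetical shapes for illustration only; not the SDK's actual types.
type Region = "eu" | "us";

interface Deployment {
  provider: string;      // label for the deployment, e.g. "openai"
  region: Region;        // where this deployment serves traffic
  priority: number;      // lower number is tried first
  weight: number;        // relative share among equal-priority deployments
  spentTodayUsd: number; // spend recorded against this key today
  dailyCapUsd: number;   // per-key daily spend cap
}

// Returns the eligible deployments in the order they should be tried.
function orderCandidates(
  deployments: Deployment[],
  pinnedRegion: Region | null,
  rand: () => number = Math.random,
): Deployment[] {
  // Region pin and spend cap act as hard filters.
  // filter() returns a new array, so the sort below does not mutate the input.
  const eligible = deployments
    .filter(
      (d) =>
        (pinnedRegion === null || d.region === pinnedRegion) &&
        d.spentTodayUsd < d.dailyCapUsd,
    )
    .sort((a, b) => a.priority - b.priority);

  const ordered: Deployment[] = [];
  let i = 0;
  while (i < eligible.length) {
    // Collect one priority tier.
    const priority = eligible[i].priority;
    const pool: Deployment[] = [];
    while (i < eligible.length && eligible[i].priority === priority) {
      pool.push(eligible[i]);
      i++;
    }
    // Weighted sampling without replacement inside the tier.
    while (pool.length > 0) {
      const total = pool.reduce((sum, d) => sum + d.weight, 0);
      let r = rand() * total;
      let idx = 0;
      while (idx < pool.length - 1 && r >= pool[idx].weight) {
        r -= pool[idx].weight;
        idx++;
      }
      ordered.push(pool.splice(idx, 1)[0]);
    }
  }
  return ordered;
}
```

Passing `rand` as a parameter keeps the weighted pick deterministic in tests. A cheap-first or fast-first policy could plausibly reuse the same shape by deriving `priority` from price or measured latency.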
Four steps, ten minutes.
Define your model
model: "smart" → [gpt-4o, claude-3.5, ...]
Drop in your keys
providers.byok({ openai: "sk-..." })
Make the call
client.chat({ model: "smart", ... })
Failover transparent
Logs show which provider answered
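For a feel of what transparent failover involves, here is a rough sketch that tries providers in priority order and records which one answered. `Provider`, `chatWithFailover`, the 2-second default timeout, and treating HTTP 429 as a capacity error are all assumptions made for illustration; none of this is the SDK's actual API.

```typescript
// Hypothetical types for illustration only; not the SDK's actual API.
interface ProviderResult {
  status: number; // HTTP-style status code
  body: string;
}

interface Provider {
  name: string;
  call: (prompt: string) => Promise<ProviderResult>;
}

// Assumption: 5xx and 429 mean "try the next provider".
// Any other status, including 4xx client errors, goes back to the caller.
function isRetryable(status: number): boolean {
  return status >= 500 || status === 429;
}

// Rejects if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Tries each provider in the given order; returns the first usable response
// plus a per-attempt log so the caller can see which provider answered.
async function chatWithFailover(
  providers: Provider[],
  prompt: string,
  timeoutMs = 2000,
): Promise<{ answeredBy: string; body: string; attempts: string[] }> {
  const attempts: string[] = [];
  for (const provider of providers) {
    try {
      const result = await withTimeout(provider.call(prompt), timeoutMs);
      if (!isRetryable(result.status)) {
        attempts.push(`${provider.name}: ${result.status}`);
        return { answeredBy: provider.name, body: result.body, attempts };
      }
      attempts.push(`${provider.name}: ${result.status}, trying next`);
    } catch (err) {
      // Timeouts and network errors also fall through to the next provider.
      attempts.push(`${provider.name}: ${String(err)}, trying next`);
    }
  }
  throw new Error(`all providers failed: ${attempts.join("; ")}`);
}
```

The `attempts` array stands in for the request log: the client gets one response, and the log keeps the record of every provider that was tried first.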
The stack you already have.
We stopped getting woken up by OpenAI 503s. Just stopped. The router handled it before our pager fired.
— Niko · CTO, Routedeck
Free to ship. Pay when you scale.
per request, after free tier. No markup on tokens. Cancel anytime.
Shipping this quarter.
Ship multi-provider routing this afternoon.
We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a routing stack, talk to us.