Two routes in one SDK: every model from every provider, and the same model across provider deployments, with priority, weighting, and failover.
Every model. Every provider. One SDK that routes both.
OpenAI, Anthropic, Google, Bedrock, Mistral, Groq, Cerebras, Together — every model they ship, accessible through one OpenAI-compatible call.
gpt-4o on OpenAI ↔ Azure. claude-3.5 on Anthropic ↔ Bedrock. llama-3 on Together ↔ Groq. One model name, priority-ordered providers behind it.
Priority and weighted routing, with automatic retries on 5xx errors, timeouts, and capacity errors. Your client sees one stable response, not the chaos behind it.
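Under the hood, the failover loop behaves roughly like this sketch. The deployment list, the 10-second timeout, and the error classification are our illustration, not the SDK's actual internals.

```ts
// Sketch of priority failover: try each deployment in order, fall through on
// 5xx, 429 (capacity), or timeout. Deployments and thresholds are illustrative.
type Deployment = { name: string; url: string };

const deployments: Deployment[] = [
  { name: "openai", url: "https://api.openai.com/v1/chat/completions" },
  { name: "azure", url: "https://example.azure.example/v1/chat/completions" }, // hypothetical
];

async function chatWithFailover(body: unknown): Promise<Response> {
  let lastError: unknown;
  for (const dep of deployments) {
    try {
      const res = await fetch(dep.url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(10_000), // slow deployments count as failed
      });
      if (res.status >= 500 || res.status === 429) {
        lastError = new Error(`${dep.name} returned ${res.status}`);
        continue; // server or capacity error: try the next deployment
      }
      return res; // the caller sees one stable response
    } catch (err) {
      lastError = err; // network error or timeout: try the next deployment
    }
  }
  throw lastError; // every deployment failed
}
```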
Hand-rolling LiteLLM glue, retry wrappers, and a status-page poller is the kind of code you'll regret writing at 3 a.m. when GPT-4o goes red.
We watched five AI startups in a row hand-roll the same routing stack and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once and you get this product along with the rest of the suite.
Every model from every provider through one SDK. Switch from gpt-4o to claude-3.5-sonnet to gemini-2 by changing one string.
gpt-4o priority list: OpenAI → Azure → fallback. The model name stays the same; the deployment behind it doesn't.
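In config terms, that could look like the sketch below; the field names are assumptions for illustration, not the published schema.

```ts
// Hypothetical route config: one model name, an ordered deployment list.
// The router tries OpenAI first and falls back to Azure on failure.
const routes = {
  "gpt-4o": {
    strategy: "priority" as const,
    deployments: [
      { provider: "openai", model: "gpt-4o" },
      { provider: "azure", model: "gpt-4o" }, // same model, different deployment
    ],
  },
};
```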
Cheap-first, fast-first, balanced. Route per user tier, per request type. Health-checked every minute.
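A weighted pick over the healthy candidates is one way to implement this. The tiers, weights, and health flags below are illustrative, not the SDK's real defaults.

```ts
// Sketch of weighted routing per user tier. Health flags would be refreshed
// by the minutely health check; here they are hardcoded for the example.
type Candidate = { provider: string; weight: number; healthy: boolean };

const tiers: Record<string, Candidate[]> = {
  free: [
    { provider: "groq", weight: 70, healthy: true },   // cheap-first
    { provider: "together", weight: 30, healthy: true },
  ],
  pro: [
    { provider: "openai", weight: 80, healthy: true }, // fast-first
    { provider: "azure", weight: 20, healthy: true },
  ],
};

function pickProvider(tier: string): string {
  const healthy = (tiers[tier] ?? []).filter((c) => c.healthy);
  if (healthy.length === 0) throw new Error(`no healthy provider for ${tier}`);
  const total = healthy.reduce((sum, c) => sum + c.weight, 0);
  let roll = Math.random() * total;
  for (const c of healthy) {
    roll -= c.weight;
    if (roll <= 0) return c.provider;
  }
  return healthy[healthy.length - 1].provider; // float-rounding guard
}
```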
Mix our keys with yours. Cap per-key daily spend. Pin EU traffic to EU. Audit-logged per request.
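Those governance knobs might be expressed as a config like this sketch; every field name here is an assumption, not the real schema.

```ts
// Illustrative governance config: key mixing, spend caps, region pinning.
// Field names are assumptions; the shipped config shape may differ.
const governance = {
  keys: {
    openai: { source: "byok", dailySpendCapUsd: 50 }, // your key, capped per day
    anthropic: { source: "managed" },                 // our key
  },
  regions: {
    eu: { pinTo: ["azure-eu", "bedrock-eu"] }, // EU traffic stays on EU deployments
  },
  auditLog: true, // each request records the key and provider that served it
};
```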
model: "smart" → [gpt-4o, claude-3.5, ...]
providers.byok({ openai: "sk-..." })
client.chat({ model: "smart", ... })
Logs show which provider answered.
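Stitched together, the quickstart could look like the sketch below. The package name, client shape, and response fields are assumptions pieced together from the fragments above, not the published API.

```ts
// Hypothetical end-to-end quickstart. The "router-sdk" package name, the
// Router class, and reply.meta are illustrative assumptions.
import { Router } from "router-sdk";

const client = new Router({
  models: {
    // One alias, a fallback chain behind it.
    smart: ["gpt-4o", "claude-3.5-sonnet", "gemini-2"],
  },
});

// Mix your own OpenAI key with managed keys.
client.providers.byok({ openai: "sk-..." });

const reply = await client.chat({
  model: "smart", // change this one string to reroute every request
  messages: [{ role: "user", content: "Hello" }],
});

console.log(reply.meta.provider); // logs show which provider answered
```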
We stopped getting woken up by OpenAI 503s. Just stopped. The router handled it before our pager fired.
— Niko · CTO, Routedeck
Priced per request, after free tier. No markup on tokens. Cancel anytime.
We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a routing stack, talk to us.