PRODUCT · ROUTING

Every model. Every provider behind it.

Two routing modes in one SDK: every model from every provider, and the same model across provider deployments, with priority, weighting, and failover.

Get started — free
ROUTE · model: gpt-4o · 24h
openai/gpt-4o      P0 · 82%
azure/gpt-4o       P0 · 11%
anthropic/sonnet   P1 · 7%
groq/llama-3.3     P2 · 0%
✓ auto-failover · 6 retries · 0 user-visible errors

Routing

Every model. Every provider. One SDK that routes both.

  • Failover
  • BYOK
  • Region pin
  • Weighted
  • Retries
  • Health
60+ models wired
8 providers live
41 ms median routing
0.04% request loss
What it does

The three things routing actually does.

EVERY MODEL

Every provider's models, one call.

OpenAI, Anthropic, Google, Bedrock, Mistral, Groq, Cerebras, Together — every model they ship, accessible through one OpenAI-compatible call.

every model · ok
universal model access
200 · 41 ms
ANY PROVIDER

Same model, different provider deployments.

gpt-4o on OpenAI ↔ Azure. claude-3.5 on Anthropic ↔ Bedrock. llama-3 on Together ↔ Groq. One model name, priority-ordered providers behind it.

any provider · ok
same-model failover
200 · 41 ms
FAILOVER

Sub-200ms switch when one provider dies.

Priority + weighted routing, automatic retry on 5xx, timeouts, capacity errors. Your client sees one stable response — not the chaos behind it.

failover · ok
automatic failover
200 · 41 ms
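
The priority + weighted behavior above can be sketched in a few lines. This is a minimal illustration of the technique, not this SDK's implementation; all names (`Provider`, `routeWithFailover`, the weights) are made up for the example.

```typescript
// Hypothetical sketch of priority + weighted failover routing.
// Lower priority number = tried first; within a tier, one deployment
// is picked by weight; a failure (5xx, timeout) falls through to the
// next candidate, then to the next tier.

type Provider = { name: string; priority: number; weight: number };
type CallFn = (provider: string) => Promise<string>;

// Weighted random pick within one priority tier.
function pickFromTier(tier: Provider[], rand: () => number = Math.random): Provider {
  const total = tier.reduce((sum, p) => sum + p.weight, 0);
  let r = rand() * total;
  for (const p of tier) {
    r -= p.weight;
    if (r <= 0) return p;
  }
  return tier[tier.length - 1];
}

// Walk priority tiers in order, retrying within each tier until it is
// drained, so the caller sees one stable response or one final error.
async function routeWithFailover(providers: Provider[], call: CallFn): Promise<string> {
  const tiers = [...new Set(providers.map((p) => p.priority))].sort((a, b) => a - b);
  for (const t of tiers) {
    const remaining = providers.filter((p) => p.priority === t);
    while (remaining.length > 0) {
      const chosen = pickFromTier(remaining);
      try {
        return await call(chosen.name);
      } catch {
        // Treat as a 5xx / timeout: drop this deployment, retry the rest.
        remaining.splice(remaining.indexOf(chosen), 1);
      }
    }
  }
  throw new Error("all providers exhausted");
}
```

The priority split is what makes failover deterministic while the weights stay probabilistic: traffic shaping happens inside a tier, ordering happens across tiers.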

Hand-rolling LiteLLM glue, retry wrappers, and a status-page poller is the kind of code you'll regret writing at 3 a.m. when GPT-4o goes red.

Why we built this

Urgent backstory

We watched five AI startups in a row hand-roll the same routing stack — and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once; this product, plus the rest of the suite, comes with it.

Use it for

Four common ways teams ship with Routing.

01

Universal model access

Every model from every provider through one SDK. Switch from gpt-4o to claude-3.5-sonnet to gemini-2 by changing one string.

02

Same-model failover

gpt-4o priority list: OpenAI → Azure → fallback. The model name stays the same; the deployment behind it doesn't.

03

Cost-aware routing

Cheap-first, fast-first, balanced. Route per user tier, per request type. Health-checked every minute.

04

BYOK + region pinning

Mix our keys with yours. Cap per-key daily spend. Pin EU traffic to EU. Audit-logged per request.
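
The "cheap-first, fast-first, balanced" strategies from use case 03 reduce to sorting candidates by different keys. A minimal sketch; the type names, provider entries, and the balanced weighting are illustrative assumptions, not this product's pricing data or API.

```typescript
// Hypothetical cost-aware candidate ordering. Each candidate carries a
// per-1k-token price and a measured p50 latency; the strategy string
// picks the sort key.

type Candidate = { name: string; usdPer1kTokens: number; p50Ms: number };
type Strategy = "cheap-first" | "fast-first" | "balanced";

function orderCandidates(candidates: Candidate[], strategy: Strategy): Candidate[] {
  const key = (c: Candidate): number => {
    switch (strategy) {
      case "cheap-first":
        return c.usdPer1kTokens;
      case "fast-first":
        return c.p50Ms;
      case "balanced":
        // Assumption: scale both axes into a comparable range and sum them.
        return c.usdPer1kTokens * 100 + c.p50Ms / 10;
    }
  };
  return [...candidates].sort((a, b) => key(a) - key(b));
}
```

Routing per user tier then just means choosing the strategy before ordering: free users get `"cheap-first"`, latency-sensitive paths get `"fast-first"`.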

How it works

Four steps, ten minutes.

1

Define your model

model: "smart" → [gpt-4o, claude-3.5, ...]
2

Drop in your keys

providers.byok({ openai: "sk-..." })
3

Make the call

client.chat({ model: "smart", ... })
4

Failover is transparent

Logs show which provider answered
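
The four steps above can be sketched as one tiny client. This is an illustrative toy, assuming an alias map and a priority list per alias; `RouterClient` and its method names are invented for the example and are not this SDK's real API.

```typescript
// Hypothetical client tying the four steps together:
//   1. defineModel  - logical name -> priority-ordered deployments
//   2. byok         - register your own provider keys
//   3. chat         - call by logical name
//   4. lastProvider - see which deployment actually answered

class RouterClient {
  private routes = new Map<string, string[]>();
  private keys: Record<string, string> = {};
  public lastProvider: string | null = null;

  // Step 1: "smart" -> ["openai/gpt-4o", "azure/gpt-4o", ...]
  defineModel(alias: string, deployments: string[]): void {
    this.routes.set(alias, deployments);
  }

  // Step 2: keys by provider name. (A real client would attach these
  // to outgoing requests; here they are just stored.)
  byok(keys: Record<string, string>): void {
    Object.assign(this.keys, keys);
  }

  // Steps 3 and 4: try deployments in order; failover is invisible to
  // the caller, but lastProvider records who answered.
  async chat(alias: string, send: (deployment: string) => Promise<string>): Promise<string> {
    for (const deployment of this.routes.get(alias) ?? []) {
      try {
        const reply = await send(deployment);
        this.lastProvider = deployment;
        return reply;
      } catch {
        // Fall through to the next deployment in the priority list.
      }
    }
    throw new Error(`no deployment answered for "${alias}"`);
  }
}
```

Passing the transport in as `send` keeps the sketch self-contained; the real SDK would own the HTTP layer.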
It works with

The stack you already have.

OpenAI
Anthropic
Bedrock
Vertex
Mistral
Groq
Cerebras
Together

We stopped getting woken up by OpenAI 503s. Just stopped. The router handled it before our pager fired.

Niko · CTO, Routedeck
Pricing

Free to ship. Pay when you scale.

FREE · $0

Up to 1k requests/mo. Every product included.

Start free →
SCALE · $0.0008

per request, after free tier. No markup on tokens. Cancel anytime.

What's next

Shipping this quarter.

Read full changelog →
WK 12 · Vertex AI provider
WK 13 · Sticky sessions
WK 14 · Per-route circuit breaker
WK 15 · Latency-shaped routing
ONE AFTERNOON AWAY

Ship multi-provider routing this afternoon.

We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a routing stack, talk to us.

Get started — free