PRODUCT · ROUTING

Every model. Every provider behind it.

Two routes in one SDK. Every model from every provider — and the same model across provider deployments, with priority, weighting, and failover.

Get started — free

ROUTE · model: gpt-4o · 24h

openai/gpt-4oP082%

azure/gpt-4oP011%

anthropic/sonnetP17%

groq/llama-3.3P20%

/routing

Routing

Every model. Every provider. One SDK that routes both.

Failover
BYOK
Region pin
Weighted
Retries
Health

60+models wired

8providers live

41 msmedian routing

0.04%request loss

What it does

The three things routing actually does.

EVERY MODEL

Every provider's models, one call.

OpenAI, Anthropic, Google, Bedrock, Mistral, Groq, Cerebras, Together — every model they ship, accessible through one OpenAI-compatible call.

every model · ok
→ universal model access
200 · 41 ms

ANY PROVIDER

Same model, different provider deployments.

gpt-4o on OpenAI ↔ Azure. claude-3.5 on Anthropic ↔ Bedrock. llama-3 on Together ↔ Groq. One model name, priority-ordered providers behind it.

any provider · ok
→ same-model failover
200 · 41 ms

FAILOVER

Sub-200ms switch when one provider dies.

Priority + weighted routing, automatic retry on 5xx, timeouts, capacity errors. Your client sees one stable response — not the chaos behind it.

failover · ok
→ cost-aware routing
200 · 41 ms

Hand-rolling LiteLLM glue, retry wrappers, and a status-page poller is the kind of code you'll regret writing at 3 a.m. when GPT-4o goes red.

Why we built this

Urgent backstory

We watched five AI startups in a row hand-roll the same routing stack — and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once; this product, plus the rest of the suite, comes with it.

Use it for

Four common
ways teams ship with Routing.

01U

Universal model access

Every model from every provider through one SDK. Switch from gpt-4o to claude-3.5-sonnet to gemini-2 by changing one string.

02S

Same-model failover

gpt-4o priority list: OpenAI → Azure → fallback. The model name stays the same; the deployment behind it doesn't.

03C

Cost-aware routing

Cheap-first, fast-first, balanced. Route per user tier, per request type. Health-checked every minute.

04B

BYOK + region pinning

Mix our keys with yours. Cap per-key daily spend. Pin EU traffic to EU. Audit-logged per request.

How it works

Four steps, ten minutes.

Define your model

model: "smart" → [gpt-4o, claude-3.5, ...]

Drop in your keys

providers.byok({ openai: "sk-..." })

Make the call

client.chat({ model: "smart", ... })

Failover transparent

Logs show which provider answered

It works with

The stack you already have.

OpenAI

Anthropic

Bedrock

Vertex

Mistral

Groq

Cerebras

Together

“

We stopped getting woken up by OpenAI 503s. Just stopped. The router handled it before our pager fired.

— Niko · CTO, Routedeck

Pricing

Free to ship. Pay when you scale.

FREE$0

Up to 1k requests/mo. Every product included.

Start free →

SCALE$0.0008

per request, after free tier. No markup on tokens. Cancel anytime.

What's next

Shipping this quarter.

Read full changelog →

WK 12Vertex AI provider

WK 13Sticky sessions

WK 14Per-route circuit breaker

WK 15Latency-shaped routing

ONE AFTERNOON AWAY

Ship multi-provider routing this afternoon.

We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a routing stack, talk to us.