The complete AI backend, built for startups.

Routing, tools, rate limits, billing, and logs — every piece a startup needs to ship an AI product, in one SDK call.

Get started — free
ROUTE · model: gpt-4o · 24h
openai/gpt-4o · P0 · 82%
azure/gpt-4o · P0 · 11%
anthropic/sonnet · P1 · 7%
groq/llama-3.3 · P2 · 0%
✓ auto-failover · 6 retries · 0 user-visible errors
01 / How it flows

One call in, one log out. We do the five things in between.

Your app makes one call. We check the user's tier, route to a provider, run any tools, debit the wallet, and write a log row. You handle the feature. We handle the rest.

Your app · one HTTP call
Limits · check tier budget
Routing · pick provider, retry
Tools · run MCP tool, OAuth'd
Billing · debit the wallet
Logs · write log row
Your user · wallet debited
One canonical log row per request: req_84a3 · user_28f3a · gpt-4o · 1.2k tok · 412ms · github.create_issue · $0.018 · ok
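The five steps above can be sketched as a single function. Everything here — names, shapes, the fixed cost — is an illustrative assumption for exposition, not the actual Assistiv SDK.

```typescript
// Illustrative sketch of the five steps between "one call in" and "one log out".
// All names and shapes are hypothetical stand-ins, not the real SDK.
type LogRow = { user: string; model: string; cost: number; status: "ok" | "blocked" };

function handleRequest(user: string, model: string, wallet: Map<string, number>): LogRow {
  const balance = wallet.get(user) ?? 0;
  // 1. Limits: gate on the user's balance before the provider ever sees the call
  if (balance <= 0) return { user, model, cost: 0, status: "blocked" };
  // 2. Routing: pick a provider deployment (stubbed to a fixed choice here)
  const deployment = `openai/${model}`;
  // 3. Tools would run here; 4. Billing: debit the wallet
  const cost = 0.018; // stand-in for real per-token pricing
  wallet.set(user, balance - cost);
  // 5. Logs: one canonical row per request
  return { user, model: deployment, cost, status: "ok" };
}
```

A call for a funded user returns an `ok` row and debits the wallet; a user at zero is blocked before any provider call.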
02 / ROUTING

Every model. Every provider. One SDK that routes both.

Two routes in one. (1) Every model from every provider — OpenAI, Anthropic, Google, Bedrock, Mistral, Groq — through one OpenAI-compatible call. (2) The same model across provider deployments — gpt-4o on OpenAI ↔ Azure, claude-3.5 on Anthropic ↔ Bedrock — with priority and failover.

  • Every model from every provider — one SDK call
  • Same model across providers (OpenAI ↔ Azure, Anthropic ↔ Bedrock)
  • Priority + weighted routing, sub-200ms failover
  • BYOK or Assistiv-managed keys, swap freely
ROUTE · model: gpt-4o · 24h
openai/gpt-4o · P0 · 82%
azure/gpt-4o · P0 · 11%
anthropic/sonnet · P1 · 7%
groq/llama-3.3 · P2 · 0%
✓ auto-failover · 6 retries · 0 user-visible errors
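One way to read the priority table above: try deployments in priority order and fall through on failure. A minimal synchronous sketch — deployment names mirror the table, but `callModel` and the overall shape are assumptions, not the SDK:

```typescript
// Priority failover sketch: deployments tried in P0..Pn order, falling
// through on error. callModel is a hypothetical stand-in for a provider call.
type Deployment = { name: string; priority: number };

function routeWithFailover(
  deployments: Deployment[],
  callModel: (name: string) => string,
): string {
  const ordered = [...deployments].sort((a, b) => a.priority - b.priority);
  let lastErr: unknown = new Error("no deployments configured");
  for (const d of ordered) {
    try {
      return callModel(d.name); // first deployment that answers wins
    } catch (err) {
      lastErr = err; // provider down or rate-limited: try the next priority
    }
  }
  throw lastErr; // every deployment failed; surface the last error
}
```

With `openai/gpt-4o` at P0 and `azure/gpt-4o` at P1, a healthy OpenAI gets the call; if it throws, Azure does.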
03 / TOOLS

Hosted MCP. We host the servers, handle end-user auth, and verify every tool call.

We host the MCP server. We handle end-user OAuth in your dashboard, under your brand. We verify every tool call before it runs — scopes, allowlists, dry-run when you want it. You write zero OAuth callbacks, store zero refresh tokens, and audit every action.

  • We host the MCP servers — 30+ apps live, growing weekly
  • We handle end-user auth — OAuth in your dashboard, your brand
  • Verified tool calls — scopes, allowlists, dry-run, audit row each call
  • Bring-your-own MCP servers supported alongside hosted
TOOLS · session_84a3 · 5 calls · oauth'd as user_28f3a
github.create_issue() · 200
slack.send_message() · 200
linear.create_ticket() · 200
gmail.search() · 200
stripe.create_invoice() · 200
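The verification step described above — scopes, allowlists, dry-run — can be sketched as a single gate. The types and function are illustrative assumptions, not the hosted MCP API:

```typescript
// Tool-call gate sketch: allowlist check, then scope check, then optional
// dry-run. Shapes are hypothetical, for exposition only.
type ToolCall = { tool: string; scopes: string[] };
type Policy = { allowlist: Set<string>; grantedScopes: Set<string>; dryRun: boolean };

function verifyToolCall(call: ToolCall, policy: Policy): "run" | "dry-run" | "deny" {
  if (!policy.allowlist.has(call.tool)) return "deny"; // not on the allowlist
  const missingScope = call.scopes.some((s) => !policy.grantedScopes.has(s));
  if (missingScope) return "deny"; // the end user never OAuth'd this scope
  return policy.dryRun ? "dry-run" : "run"; // dry-run logs without executing
}
```

An allowed call with granted scopes runs; anything off the allowlist, or asking for a scope the user never granted, is denied before execution.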
04 / LIMITS

Per-platform AND per-end-user. Rate, wallet-balance, and usage-per-time gates.

Three kinds of gate. Rate limits (RPM, TPM, sliding window, token bucket). Wallet-balance gates (block when below threshold). Usage-per-time budgets ($X per day/week/month). Set them on the platform, on each end-user, or both. Enforced before the provider sees the call.

  • Per-platform gates (you set defaults)
  • Per-end-user gates (you set per signup)
  • Rate limits — RPM, TPM, sliding window, token bucket
  • Wallet-balance + usage-per-time budgets
TIER BUDGETS · today
free · $0.42 / $0.50/day
pro · $8.60 / $20/day
team · $112 / $200/day
ent · $2,140 / unlimited
no cap · billed monthly
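Of the limit kinds listed above, the token bucket is the easiest to sketch: a capacity for bursts and a steady refill rate for sustained throughput. This is a generic illustration of the technique, not Assistiv's implementation; timestamps are injected so the behavior is deterministic.

```typescript
// Token-bucket rate limiter sketch: capacity = burst size,
// refillPerSec = sustained rate. Requests are gated before the provider call.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(private capacity: number, private refillPerSec: number, now = 0) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.last = now;
  }

  tryRemove(n: number, now: number): boolean {
    const elapsed = now - this.last; // seconds since last check
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.last = now;
    if (this.tokens < n) return false; // over the limit: block the request
    this.tokens -= n;
    return true;
  }
}
```

A bucket of capacity 2 refilling at 1 token/s allows a 2-token burst at t=0, rejects the next call immediately, and allows one more a second later.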
05 / BILLING

Per-user wallets you top up via API. Plus client-side calling — end users hit Assistiv directly.

Each end-user gets a wallet. You handle payments however you want — Stripe, Paddle, invoicing, internal credits — and call walletTopUp() to credit the wallet. We debit per token, per tool, per request — atomic, audit-logged. End users can also call Assistiv directly with their own sk-eu_ key, bypassing your server entirely.

  • You own the payment flow; we own the wallet ledger
  • walletTopUp(user, amount) — one API call, idempotent
  • Client-side calling — end users hit api.assistiv.ai directly
  • Tokens, tools, agents all settle against the same wallet
WALLET · user_28f3a · pro tier
$5.80 · balance
62% of $10/day budget · resets in 6h 12m
gpt-4o · 1.2k tok · −$0.018
claude · 890 tok · −$0.012
github.create_issue · −$0.001
stripe top-up · +$10.00
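The wallet model above boils down to two operations: an idempotent top-up and a guarded debit. `walletTopUp` mirrors the call named in the copy; the class shape, idempotency-key handling, and everything else are assumptions for illustration:

```typescript
// Wallet ledger sketch: idempotent top-ups plus per-request debits.
// Not the real SDK; a minimal model of the behavior described above.
class Wallet {
  private balance = 0;
  private seen = new Set<string>(); // idempotency keys already applied

  walletTopUp(amount: number, idempotencyKey: string): number {
    if (!this.seen.has(idempotencyKey)) {
      // replaying the same key (e.g. a retried webhook) is a no-op
      this.seen.add(idempotencyKey);
      this.balance += amount;
    }
    return this.balance;
  }

  debit(amount: number): boolean {
    if (this.balance < amount) return false; // gate instead of going negative
    this.balance -= amount;
    return true;
  }
}
```

Replaying a top-up with the same key leaves the balance unchanged, and a debit larger than the balance is refused rather than overdrawing.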
06 / LOGS

LangSmith-grade observability. Full traces, tokens + latency + cost, replay, evals.

Every request becomes a full trace: each LLM call, retry, and tool step in the chain. Tokens, latency, cost, and status attached at every step. Filter by model, user, route, error. Replay any trace. Score traces against datasets. Built-in — not a separate SaaS.

  • Full traces — every step, every retry, every tool call
  • Tokens, latency, cost on every step (per user, per model, per route)
  • Replay traces, score against datasets, run evals
  • Webhook export to your warehouse — Snowflake, BigQuery, S3
● LIVE LOGS · last 60s · 6 of 1,284
14:02 · user_28f3a · gpt-4o · 1.2k · 412ms · $0.018
14:02 · user_91f0c · sonnet · 0.9k · 380ms · $0.012
14:02 · user_28f3a · github · 190ms · $0.001
14:01 · user_55a2e · gpt-4o · 2.1k · 590ms · $0.034
14:01 · user_28f3a · stripe · 120ms · $0.001
14:01 · user_91f0c · gpt-4o-m · 0.4k · 210ms · $0.0008
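A trace, as described above, is a list of steps each carrying tokens, latency, cost, and status; the per-trace totals are a rollup over those steps. A minimal sketch with illustrative shapes:

```typescript
// Trace rollup sketch: sum tokens, latency, and cost across the steps of one
// trace, matching the per-step fields described above. Shapes are hypothetical.
type Step = { tokens: number; ms: number; cost: number; status: "ok" | "error" };

function rollupTrace(steps: Step[]): { tokens: number; ms: number; cost: number; ok: boolean } {
  return {
    tokens: steps.reduce((sum, s) => sum + s.tokens, 0),
    ms: steps.reduce((sum, s) => sum + s.ms, 0), // total wall time if steps run sequentially
    cost: steps.reduce((sum, s) => sum + s.cost, 0),
    ok: steps.every((s) => s.status === "ok"), // one failed step marks the trace
  };
}
```

An LLM call plus a tool step roll up into one row with combined tokens, latency, and cost — the same shape as the live-log rows above.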
07 / What it replaces

Fifteen tabs of glue code. Closed.

Every product in the suite replaces something you'd otherwise glue together. Here's the receipt.

ROUTING
replaces…
  • LiteLLM glue code
  • Hand-rolled retry wrappers
  • Status-page polling
TOOLS
replaces…
  • OAuth swamp
  • Token refresh code
  • Custom GitHub/Slack/Gmail integrations
LIMITS
replaces…
  • Sliding-window Redis
  • Free-tier abuse triage
  • 2 a.m. throttle pages
BILLING
replaces…
  • DIY wallet tables
  • Per-user metering
  • Token-counting cron jobs
LOGS
replaces…
  • LangSmith
  • Datadog AI spend reports
  • Custom BigQuery cost views
08 / Ship weekly

We ship every week.

Small surface area means we can move fast. Here are the last six weeks.

Apr 28 · Tools

Linear, Notion, Jira MCPs

Three new hosted MCP servers. Same OAuth flow, same wallet debits.

Apr 21 · Routing

Region-pinned routing

Pin requests to US, EU, or AU regions. Per-key, per-user, or per-tier.

Apr 14 · Billing

Manual wallet adjustments

Issue refunds, holds, and credits via a single endpoint. Auditable.

Apr 07 · Limits

Token-bucket rate limits

New limit kind alongside sliding-window and fixed-budget. Mix freely.

Mar 31 · Logs

Webhook log export

Stream one canonical row per request to your warehouse. Real-time.

Mar 24 · Routing

Groq + Cerebras providers

Two more options for the priority list. BYOK or Assistiv-managed.

09 / Pricing

Free to start.
Usage on top.

Generous free tier across every product. Pay-as-you-go after that — top up the wallet, no monthly base, no "contact sales."

FREE
$0

For prototyping. All 5 products on, generous limits across the board.

  • 10k monthly active end-users
  • 5k traces/month, 7-day retention
  • BYOK requests — unlimited, free
  • All 5 products enabled
Start free →
most picked
PAY-AS-YOU-GO
Usage only

For shipping. No monthly base. Top up the wallet, drain on usage.

  • Provider price + 5% on tokens
  • BYOK requests stay free
  • $0.00005 / MAU after 10k
  • $0.0025 / trace after 5k/mo
See pricing →
SCALE
Let's talk

For real volume. Negotiated commission, dedicated support, SOC 2 docs.

  • Commission below 5% at volume
  • Dedicated routing region
  • SOC 2 + DPA on request
  • Slack channel with founders
See full pricing →
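The pay-as-you-go math quoted above — provider price + 5% on tokens, $0.00005/MAU after 10k, $0.0025/trace after 5k per month — works out like this. The function name and shape are ours, for illustration only:

```typescript
// Pay-as-you-go bill sketch using the numbers quoted in the pricing cards above.
// A hypothetical calculator for exposition, not an official estimate.
function monthlyBill(providerSpend: number, mau: number, traces: number): number {
  const tokenFee = providerSpend * 0.05;              // 5% commission on token spend
  const mauFee = Math.max(0, mau - 10_000) * 0.00005; // first 10k MAU free
  const traceFee = Math.max(0, traces - 5_000) * 0.0025; // first 5k traces/mo free
  return tokenFee + mauFee + traceFee;
}
```

For example, $100 of provider spend, 20k MAU, and 10k traces would come to $5 + $0.50 + $12.50 = $18 on top of the provider bill.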
ONE AFTERNOON AWAY

You build the product.
We are the backend.

We onboard 1–2 indie startups a week. We reply within a day; you're integrated within a week.

npm i @assistiv/sdk