Three gates, two scopes.
Rate, wallet-balance, and usage-per-time gates. Set them on the platform, on each end-user, or both. Enforced server-side before the provider sees the call.
Limits
Per-platform AND per-end-user. Rate, wallet-balance, and usage-per-time gates.
- Per-user
- Per-tier
- Per-key
- Sliding
- Token bucket
- Soft+hard
The three things Limits actually does.
RPM, TPM, sliding window, token bucket.
Pick the gate that fits the route. Bursts allowed, abuse blocked. One config flag — no Lua scripts, no Redis to babysit.
→ per-platform defaults
200 · 8 ms
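For the curious, here is roughly what a token-bucket gate does under the hood. This is a minimal sketch, not the product's actual implementation: the `TokenBucket` class, its parameters, and the burst size of 10 are all assumptions made for illustration. Time is passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical sketch of a token-bucket rate gate (illustrative names only).
class TokenBucket {
  private capacity: number;        // largest burst allowed
  private refillPerSecond: number; // sustained rate
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true if the call may proceed, false if it should be rejected.
  tryConsume(cost: number = 1, nowMs: number = Date.now()): boolean {
    // Refill in proportion to elapsed time, never above capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// Assumed example: 600 RPM sustained (10 tokens per second), bursts of up to 10 calls.
const proBucket = new TokenBucket(10, 600 / 60);
const allowedNow = proBucket.tryConsume();
```

The capacity sets how large a burst gets through; the refill rate sets the sustained ceiling. That split is what lets bursts pass while sustained abuse is blocked.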
Block when the wallet runs dry.
Set a minimum balance threshold per end-user. Below it? 402, with a webhook. Top-up arrives, calls flow again — no manual reset.
→ per-end-user limits
200 · 8 ms
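A wallet gate comes down to one comparison made on every call. The sketch below assumes a hypothetical `checkWallet` helper and a made-up webhook event name, `wallet.below_threshold`; neither comes from the SDK.

```typescript
// Hypothetical sketch of a wallet-balance gate (names are illustrative).
type WalletDecision =
  | { allowed: true }
  | { allowed: false; status: 402; webhookEvent: string };

function checkWallet(balanceUsd: number, minBalanceUsd: number): WalletDecision {
  // Below the threshold: reject with 402 and name the webhook event to emit.
  if (balanceUsd < minBalanceUsd) {
    return { allowed: false, status: 402, webhookEvent: "wallet.below_threshold" };
  }
  // At or above the threshold: the call goes through.
  return { allowed: true };
}

// Assumed example: a $1.00 minimum balance.
const beforeTopUp = checkWallet(0.4, 1.0);
const afterTopUp = checkWallet(10.4, 1.0);
```

Because the check runs against the current balance on each request, a top-up unblocks the user on the very next call with nothing to reset.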
Per-day, per-week, per-month budgets.
$5/day per end-user on embeddings. $200/month per platform on the slow models. Soft warn at 80%, hard cap at 100%.
→ wallet-balance gates
200 · 8 ms
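The soft-warn and hard-cap split can be pictured as a small state function over spend so far. This is a sketch under assumptions: `evaluateBudget` and the `BudgetState` labels are hypothetical names, and only the 80% and 100% thresholds come from the copy above.

```typescript
// Hypothetical sketch of a usage-per-time budget check (names are illustrative).
type BudgetState = "ok" | "soft_warn" | "hard_cap";

function evaluateBudget(spentUsd: number, budgetUsd: number, softRatio: number = 0.8): BudgetState {
  // Treat a missing or zero budget as fully used rather than dividing by zero.
  if (budgetUsd <= 0) return "hard_cap";

  const ratio = spentUsd / budgetUsd;
  if (ratio >= 1) return "hard_cap";          // 100%: block further calls
  if (ratio >= softRatio) return "soft_warn"; // 80%: warn, still serve
  return "ok";
}

// Assumed example: the $5/day embeddings budget.
const morning = evaluateBudget(1.25, 5);
const afternoon = evaluateBudget(4.0, 5);
const evening = evaluateBudget(5.0, 5);
```

Resetting `spentUsd` at the start of each day, week, or month is what turns the same check into a daily, weekly, or monthly budget.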
Sliding-window Redis, two abuse-triage Slack channels, and a 2 a.m. page when the free tier gets ratioed — that's the alternative. Pick a real one.
Urgent backstory
We watched five AI startups in a row hand-roll the same limits stack — and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once; this product, plus the rest of the suite, comes with it.
Four common ways teams ship with Limits.
Per-platform defaults
Set RPM and daily-spend caps for the whole platform. Inherited by every end-user unless overridden.
Per-end-user limits
Override per signup. Free-tier user gets 60 RPM and $0.50/day. Pro gets 600 RPM and $20/day. Live-reloaded.
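One way to picture inheritance plus overrides: start from the platform defaults and layer the tier's settings on top. The `Limits` shape, `resolveLimits`, and the default values here are assumptions for illustration; only the pro figures (600 RPM, $20/day) come from the copy above.

```typescript
// Hypothetical sketch of default-plus-override resolution (names are illustrative).
interface Limits {
  rpm: number;
  dailySpendUsd: number;
}

// Assumed platform-wide defaults, applied to every end-user.
const platformDefaults: Limits = { rpm: 60, dailySpendUsd: 0.5 };

// Assumed per-tier overrides; a tier only lists what it changes.
const tierOverrides: Record<string, Partial<Limits>> = {
  pro: { rpm: 600, dailySpendUsd: 20 },
};

function resolveLimits(defaults: Limits, override?: Partial<Limits>): Limits {
  // Later spreads win, so any field set in the override replaces the default.
  return { ...defaults, ...override };
}

const freeUserLimits = resolveLimits(platformDefaults, tierOverrides["free"]);
const proUserLimits = resolveLimits(platformDefaults, tierOverrides["pro"]);
```

A user whose tier has no override entry simply gets the platform defaults, which is the "inherited unless overridden" behavior described above.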
Wallet-balance gates
Auto-block once the wallet drops below your threshold. Webhook fires; UI nudges them to top up. No manual unblock.
Usage-per-time
Daily, weekly, monthly budgets. Soft warn at 80%, hard 429 at 100%. Per-route or per-user.
Four steps, ten minutes.
Define a tier
tiers.create({ name: "pro", rpm: 600 })
Tag the request
client.chat({ user: "u_28f3a", ... })
Limits applied
Token bucket + sliding window, server-side
Promote, instantly
wallet.setTier("u_28f3a", "pro")
The stack you already have.
Free-tier abuse used to cost us $4k/mo. We turned on limits, the bill dropped 80% the same week.
— Sasha · solo founder, Tablecast
Free to ship. Pay when you scale.
per request, after free tier. No markup on tokens. Cancel anytime.
Shipping this quarter.
Ship rate limits + budgets this afternoon.
We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a limits stack, talk to us.