Rate, wallet-balance, and usage-per-time gates. Set them on the platform, on each end-user, or both. Enforced server-side before the provider sees the call.
Per-platform AND per-end-user. Rate, wallet-balance, and usage-per-time gates.
Pick the gate that fits the route. Bursts allowed, abuse blocked. One config flag — no Lua scripts, no Redis to babysit.
Set a minimum balance threshold per end-user. Below it? 402, with a webhook. Top-up arrives, calls flow again — no manual reset.
$5/day per end-user on embeddings. $200/month per platform on the slow models. Soft warn at 80%, hard cap at 100%.
Sliding-window Redis, two abuse-triage Slack channels, and a 2 a.m. page when the free tier gets ratioed — that's the alternative. Pick a real one.
We watched five AI startups in a row hand-roll the same limits stack — and burn three weeks doing it. We packaged ours so you don't have to. Drop the SDK in once; this product, plus the rest of the suite, comes with it.
Set RPM and daily-spend caps for the whole platform. Inherited by every end-user unless overridden.
Override per signup. Free-tier user gets 60 RPM and $0.50/day. Pro gets 600 RPM and $20/day. Live-reloaded.
Auto-block once the wallet drops below your threshold. Webhook fires; UI nudges them to top up. No manual unblock.
Daily, weekly, monthly budgets. Soft warn at 80%, hard 429 at 100%. Per-route or per-user.
tiers.create({ name: "pro", rpm: 600 })client.chat({ user: "u_28f3a", ... })Token bucket + sliding window, server-side
wallet.setTier("u_28f3a", "pro")Free-tier abuse used to cost us $4k/mo. We turned on limits, the bill dropped 80% the same week.
— Sasha · solo founder, Tablecastper request, after free tier. No markup on tokens. Cancel anytime.
We onboard 1–2 indie startups a week. If you'd rather ship features than maintain a limits stack, talk to us.