# Assistiv Gateway — Integration Guide

This document is the agent-readable reference for integrating Assistiv Gateway as the backend for your platform. It is organized by the steps a platform follows during integration. Each step is a self-contained file with full API details, request/response examples, and code snippets.

Website-only admin features (platform signup, LLM provider keys, wallet topup, team management, MCP app activation, platform default rate limits) live on `assistiv.ai/dashboard` and are not documented here — you configure those in the UI before you start integrating.

---

## Nomenclature

- **website** — `assistiv.ai`, Assistiv's own admin dashboard.
- **platform** — you, the B2B customer of Assistiv integrating our API into your product.
- **end user** — a user of your product. Your product provisions them in Assistiv.

---

## Integration Steps

Follow these in order. Each is a self-contained file you can read independently.

1. **[Setup & Hello World](/docs/integration/step-1-setup-and-hello-world.txt)** — Dashboard config, server-side setup, verify connection
2. **[Provision End Users](/docs/integration/step-2-provision-end-users.txt)** — Create users with `external_id`, manage API keys, idempotent create
3. **[Budgets & Rate Limits](/docs/integration/step-3-budgets-and-rate-limits.txt)** — Per-user USD budgets, platform wallet, rate-limit overrides, tier patterns
4. **[Inference](/docs/integration/step-4-inference.txt)** — Chat Completions, Responses API, Models, streaming, tool calling, OpenAI SDK
5. **[MCP Tools (Optional)](/docs/integration/step-5-mcp-tools.txt)** — OAuth flow, hosted execution via Responses API, free tier, reference endpoints
6. **[Monitor & Operate](/docs/integration/step-6-monitor-and-operate.txt)** — Self-service endpoints, logs, cost reconciliation patterns

---

## Integration Flow (narrative summary)

Assuming you completed the website prerequisites:

1. When a user signs up in your product, call `POST /v1/platforms/{platformId}/end-users` to provision them in Assistiv. The response contains a `sk-eu_*` raw key — store it in your DB keyed by your user's ID.
2. (Optional) Create a per-user budget and/or rate-limit override for them. Send `Idempotency-Key` on `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` so retries replay cached ledger rows instead of double-crediting. Use `PATCH /budget { is_suspended: true }` to pause inference for a user without losing budget state.
3. Make inference calls with that user's `sk-eu_*` key against `POST /v1/chat/completions` (function calling) or `POST /v1/responses` (function calling **and** hosted MCP). Branch on `error.code` when you get a 402: `budget_suspended`, `budget_exhausted`, and `wallet_insufficient` each need a different fix.
4. (Optional, hosted MCP) For end users who need third-party tools (GitHub issues, Slack messages, etc.):
   - Send them through the OAuth flow once: your backend calls `GET mcp.assistiv.ai/oauth/authorize?app=github` server-side, you redirect their browser to the returned URL, they approve at the provider, and they land back on your site at `{your_base_url}/mcp/oauth-callback?status=connected`.
   - From then on, every `POST /v1/responses` call you make for that user can include a `{ type: "mcp", server_label: "assistiv", server_url: "https://mcp.assistiv.ai/mcp", authorization: "Bearer <end-user key>" }` tool item, and the model will autonomously call those tools server-side — you get one final answer back.
5. Register an outbound webhook endpoint (Dashboard → Webhooks, or `POST /v1/platforms/{pid}/webhook-endpoints`) to receive real-time `budget.topped_up`, `budget.low_balance`, `budget.suspended`, and `budget.unsuspended` events. Svix handles delivery, retries, and signing — your receiver just verifies with the Svix SDK. `budget.debited` is available but off by default because at LLM cadence it's a firehose; opt in only if you actually need per-call notifications.
6. Monitor usage via `GET /v1/platforms/{platformId}/logs` (per-call detail) and `GET /v1/platforms/{platformId}/end-users/{euid}/budget/transactions` (the full budget ledger — topups, debits, opening, adjustments, suspensions). The ledger is the source of truth for per-user spend; logs are complementary for per-model breakdowns and debugging.

---

## Complete Integration Example (TypeScript)

Assumes you've done the website prerequisites: signed up, added an LLM provider key, and topped up your wallet.

```typescript
const PLATFORM_KEY = process.env.ASSISTIV_PLATFORM_KEY!; // sk-plat_*
const PLATFORM_ID = process.env.ASSISTIV_PLATFORM_ID!;
const API_BASE = "https://api.assistiv.ai/v1";

// ── Step 0: Discover which models are actually enabled on this platform ──
// Do NOT hardcode "gpt-4o". Every platform has different provider keys → a
// different set of enabled models. Call this once at startup and cache the
// result; pass the chosen model slug into your inference calls.
async function listAvailableModels(): Promise<string[]> {
  const res = await fetch(`${API_BASE}/models`, {
    headers: { Authorization: `Bearer ${PLATFORM_KEY}` },
  });
  if (!res.ok) throw new Error("Failed to list models");
  const body = await res.json();
  return body.data.map((m: { id: string }) => m.id);
}

// Typical startup: verify your chosen model is actually available.
// const available = await listAvailableModels();
// const MODEL = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];

// ── Step 1: Provision an end user when they sign up in your product ──────
// This is idempotent on external_id: re-running it for the same user returns
// 200 with the existing user + a freshly minted sk-eu_* key (instead of 409).
// Safe to call from a retry loop or a re-run signup webhook.
async function createAssistivUser(yourUserId: string, displayName: string) {
  const res = await fetch(`${API_BASE}/platforms/${PLATFORM_ID}/end-users`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${PLATFORM_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      external_id: yourUserId,
      display_name: displayName,
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Failed to create end user");
  // 201 = newly created, 200 = re-provisioned (existing user, fresh key).
  // Either way, save data.api_key.raw_key in YOUR DB keyed by yourUserId.
  return { assistivUserId: data.id, endUserKey: data.api_key.raw_key };
}

// ── Step 1b (optional): Look up an existing end user by your stable ID ───
// Cheap idempotent lookup, no key minted. Use this if you already saved the
// raw key in your DB and just want to confirm the user still exists.
async function findAssistivUser(yourUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users?external_id=${encodeURIComponent(yourUserId)}`,
    { headers: { Authorization: `Bearer ${PLATFORM_KEY}` } },
  );
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Lookup failed");
  return data.data[0] ?? null; // null if not found
}

// ── Step 2: (Optional) Cap the user's monthly spend ──────────────────────
async function createUserBudget(assistivUserId: string, maxUsd: number) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/budget`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        max_usd: maxUsd,
        period: "monthly",
        auto_replenish: true,
        replenish_amount: maxUsd,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to create budget");
  return res.json();
}

// ── Step 3: (Optional) Raise rate limits for a premium user ──────────────
async function raiseRateLimitsForPremium(assistivUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/rate-limits`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        rpm_limit: 300,
        tpm_limit: 500000,
        rpd_limit: 50000,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to set rate limits");
  return res.json();
}

// ── Step 4: Inference with the end-user key ──────────────────────────────
// NOTE: `model` here must match a slug returned by listAvailableModels().
// Passing a slug that isn't enabled on this platform returns 404 model_not_found.
async function chat(endUserKey: string, model: string, userMessage: string) {
  const res = await fetch(`${API_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
      max_tokens: 256, // >= 16 for gpt-5 family (Responses API floor)
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Inference failed");
  return data.choices[0].message.content;
}

// ── Step 5: (Optional, hosted MCP) Kick off the OAuth flow for one app ───
// Call this from a backend route the user hits when they click "Connect GitHub"
// in your UI. Return the resulting URL to the frontend; have the frontend set
// window.location.href = url. After approval, the user lands back on your site
// at {your_base_url}/mcp/oauth-callback?app=github&status=connected.
async function getMcpConnectUrl(endUserKey: string, appSlug: string): Promise<string> {
  const res = await fetch(
    `https://mcp.assistiv.ai/oauth/authorize?app=${encodeURIComponent(appSlug)}`,
    {
      headers: { Authorization: `Bearer ${endUserKey}` },
      redirect: "manual",
    }
  );
  if (res.status !== 302) throw new Error("OAuth authorize failed");
  return res.headers.get("location")!;
}

// ── Step 6: (Optional, hosted MCP) Inference with hosted MCP tools ───────
// Once the user has connected at least one MCP app via step 5, every call
// to /v1/responses can attach the MCP tool item. The model will autonomously
// call GitHub/Slack/etc. on behalf of the user, server-side, and return one
// final answer. This only works on /v1/responses (not /v1/chat/completions).
async function chatWithMcp(endUserKey: string, model: string, prompt: string) {
  const res = await fetch(`${API_BASE}/responses`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      input: prompt,
      max_output_tokens: 256, // >= 16 required
      tools: [
        {
          type: "mcp",
          server_label: "assistiv",
          server_url: "https://mcp.assistiv.ai/mcp",
          authorization: `Bearer ${endUserKey}`,
          require_approval: "never", // required — see "Hosted Execution" section
        },
      ],
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Hosted MCP inference failed");
  // data.output is a mixed array of mcp_call items (one per executed tool)
  // and message items (the model's final assistant text). Pull the final text:
  const message = data.output.find((o: { type: string }) => o.type === "message");
  return message?.content?.[0]?.text ?? "";
}

// ── Full flow ─────────────────────────────────────────────────────────────
async function onYourUserSignup(yourUserId: string, name: string) {
  // Discover models once at startup, not here
  const available = await listAvailableModels();
  const model = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];
  if (!model) throw new Error("No models enabled on this platform — add a provider key");

  const { assistivUserId, endUserKey } = await createAssistivUser(yourUserId, name);
  await createUserBudget(assistivUserId, 10.00);

  // Plain inference (no tools)
  const reply = await chat(endUserKey, model, "Hello!");
  console.log(reply);

  // Later, after the user has gone through the OAuth flow for GitHub:
  const mcpReply = await chatWithMcp(
    endUserKey,
    model,
    "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
  );
  console.log(mcpReply);
}
```

---

## Complete Integration Example (Python)

```python
import os

import requests

PLATFORM_KEY = os.environ["ASSISTIV_PLATFORM_KEY"]
PLATFORM_ID = os.environ["ASSISTIV_PLATFORM_ID"]
API_BASE = "https://api.assistiv.ai/v1"

platform_headers = {
    "Authorization": f"Bearer {PLATFORM_KEY}",
    "Content-Type": "application/json",
}

# Discover which models are actually enabled on this platform.
# Do NOT hardcode "gpt-4o" — different platforms have different provider keys
# configured, so the set of enabled models varies. Call this at startup and
# pass the chosen slug into your inference calls.
def list_available_models() -> list[str]:
    resp = requests.get(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {PLATFORM_KEY}"},
    )
    resp.raise_for_status()
    return [m["id"] for m in resp.json()["data"]]

# Create an end user when they sign up in your product.
# This is idempotent on external_id: re-running for the same user returns 200
# with the existing user + a freshly minted sk-eu_* key (instead of 409). Safe
# to call from a retry loop or a re-run signup webhook.
def create_assistiv_user(your_user_id: str, display_name: str) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        json={"external_id": your_user_id, "display_name": display_name},
    )
    resp.raise_for_status()
    data = resp.json()
    # 201 = newly created, 200 = re-provisioned (existing user, fresh key).
    # Save data["api_key"]["raw_key"] in your DB keyed by your_user_id.
    return {
        "assistiv_user_id": data["id"],
        "end_user_key": data["api_key"]["raw_key"],
    }

# Look up an existing end user by your stable ID. Cheap idempotent lookup,
# no key minted. Returns None if not found.
def find_assistiv_user(your_user_id: str) -> dict | None:
    resp = requests.get(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        params={"external_id": your_user_id},
    )
    resp.raise_for_status()
    rows = resp.json()["data"]
    return rows[0] if rows else None

# Cap their monthly spend
def create_user_budget(assistiv_user_id: str, max_usd: float) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users/{assistiv_user_id}/budget",
        headers=platform_headers,
        json={
            "max_usd": max_usd,
            "period": "monthly",
            "auto_replenish": True,
            "replenish_amount": max_usd,
        },
    )
    resp.raise_for_status()
    return resp.json()

# Plain inference with the end-user key (no tools).
# `model` must be a slug returned by list_available_models(). Passing a slug
# that isn't enabled on this platform returns 404 model_not_found.
def chat(end_user_key: str, model: str, user_message: str) -> str:
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
            "max_tokens": 256,  # >= 16 for gpt-5 family
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# (Optional, hosted MCP) Kick off the OAuth flow for one app.
# Call this from a backend route the user hits when they click "Connect GitHub"
# in your UI. Return the URL to your frontend; have the frontend set
# window.location.href = url. After approval, the user lands back at
# {your_base_url}/mcp/oauth-callback?app=github&status=connected.
def get_mcp_connect_url(end_user_key: str, app_slug: str) -> str:
    resp = requests.get(
        f"https://mcp.assistiv.ai/oauth/authorize?app={app_slug}",
        headers={"Authorization": f"Bearer {end_user_key}"},
        allow_redirects=False,
    )
    if resp.status_code != 302:
        raise RuntimeError(f"OAuth authorize failed: {resp.status_code}")
    return resp.headers["Location"]

# (Optional, hosted MCP) Inference with hosted MCP tools.
# Once the user has connected at least one MCP app via the OAuth flow above,
# every call to /v1/responses can attach the MCP tool item. The model will
# autonomously call GitHub/Slack/etc. server-side and return one final answer.
# This only works on /v1/responses, not /v1/chat/completions.
def chat_with_mcp(end_user_key: str, model: str, prompt: str) -> str:
    resp = requests.post(
        f"{API_BASE}/responses",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "input": prompt,
            "max_output_tokens": 256,  # >= 16 required
            "tools": [
                {
                    "type": "mcp",
                    "server_label": "assistiv",
                    "server_url": "https://mcp.assistiv.ai/mcp",
                    "authorization": f"Bearer {end_user_key}",
                    "require_approval": "never",  # required
                }
            ],
        },
    )
    resp.raise_for_status()
    data = resp.json()
    # data["output"] interleaves mcp_call items (one per executed tool) and
    # message items (the model's final assistant text). Pull the final text:
    for item in data["output"]:
        if item.get("type") == "message":
            return item["content"][0]["text"]
    return ""

# Same call via the OpenAI SDK (also works because the wire format is OpenAI-native):
#
# from openai import OpenAI
# client = OpenAI(api_key=end_user_key, base_url="https://api.assistiv.ai/v1")
# models = [m.id for m in client.models.list().data]  # discover first
# response = client.responses.create(
#     model=models[0],  # use a discovered slug, never hardcode
#     input=prompt,
#     tools=[{
#         "type": "mcp",
#         "server_label": "assistiv",
#         "server_url": "https://mcp.assistiv.ai/mcp",
#         "authorization": f"Bearer {end_user_key}",
#     }],
# )

# Full flow
if __name__ == "__main__":
    # Discover models once at startup
    available = list_available_models()
    if not available:
        raise RuntimeError("No models enabled — add a provider key in the dashboard")
    model = "gpt-4o-mini" if "gpt-4o-mini" in available else available[0]

    result = create_assistiv_user("user-001", "Alice")
    create_user_budget(result["assistiv_user_id"], 10.00)

    # Plain inference (no tools)
    reply = chat(result["end_user_key"], model, "Hello!")
    print(reply)

    # Later, after Alice has gone through the OAuth flow for GitHub:
    mcp_reply = chat_with_mcp(
        result["end_user_key"],
        model,
        "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
    )
    print(mcp_reply)
```

---

## Environment Variables Reference

```bash
# Required in your platform backend
ASSISTIV_PLATFORM_KEY=sk-plat_...   # From assistiv.ai/dashboard
ASSISTIV_PLATFORM_ID=uuid           # From assistiv.ai/dashboard

# Optional — defaults shown
ASSISTIV_API_BASE=https://api.assistiv.ai/v1
ASSISTIV_MCP_BASE=https://mcp.assistiv.ai
```

---

## Troubleshooting Common Failures

| Symptom | Likely cause | First thing to check |
|---|---|---|
| `401 unauthorized` | Wrong key or typo in `Authorization` header | Re-copy the key from the dashboard; verify it starts with `sk-plat_` or `sk-eu_` |
| `402 payment_required` `code: wallet_insufficient` | Platform wallet empty | Dashboard → Wallet → Top up. `GET /v1/platforms/{id}/wallet` to verify balance |
| `402 payment_required` `code: budget_exhausted` | This end user's budget is exhausted | `GET /v1/me/budget` → check `remaining_usd`. `POST /budget/topup` (with `Idempotency-Key` — Epic 1) or wait for the period reset |
| `402 payment_required` `code: budget_suspended` | This end user's budget is paused via `is_suspended=true` | Admin action: `PATCH /budget { is_suspended: false }`. Distinct code from `budget_exhausted` — branch on `error.code` in your client |
| `403 forbidden` | Platform key used on a `/v1/me/*` endpoint, or end-user key used on `/end-users/*` CRUD | Review "Two key types" — each endpoint's `Auth:` line names which key type is required |
| `404 not_found` `code: model_not_found` | Hardcoded model slug not enabled on this platform | `GET /v1/models` to list actually-enabled slugs. Do NOT assume `gpt-4o` is universal |
| `409 conflict` on `POST /end-users` | **Should not happen** — that endpoint is idempotent | You're on an old backend build. Pull the latest, or use the `?external_id=` filter + POST pattern |
| `422 validation_error` | Request body shape wrong, required field missing, or value out of range | Read `error.message` — it names the specific field |
| `422 integer_below_min_value` on `max_output_tokens` | Value below the OpenAI Responses API floor of 16 | Set `max_output_tokens >= 16` (or omit it) |
| `422 unsupported require_approval value` on `/v1/responses` | `require_approval: "always"` in the MCP tool item | Set `"never"` (or omit the field entirely) |
| `429 rate_limit_exceeded` | Sliding-window counter exceeded | Check `error.denied_by`: `eu_rpm`/`eu_rpd`/`plat_rpm`/`plat_rpd` tells you which bucket. Respect the `Retry-After` header |
| `500 internal_error` | Unhandled backend exception | Check your uvicorn (self-hosted) or support logs. If reproducible, file a bug with the request that caused it |
| `500 max_iterations_exceeded` (MCP) | Agent loop didn't converge in 10 iterations | The model is stuck in a tool-call loop. Try a different prompt, a larger model, or add clearer "stop when done" instructions |
| `503 mcp_unreachable` | MCP service down, end user has no connections, or tool-list fetch failed | `GET mcp.assistiv.ai/health`; `GET mcp.assistiv.ai/connections` with the end-user key to confirm at least one is active |
| `200` but `output` is empty or only has `mcp_call` items | Small models (e.g. gpt-5.4-nano) sometimes return only the tool call with no wrapping assistant message | Look in `output[].mcp_call.output` — that's the actual tool result and the de facto answer |
| Wallet balance unchanged after a successful call | Debit is fire-and-forget (~1.5s delay) OR MCP free tier (first 100 calls/month) | Wait 2s and re-check. For MCP calls, check if you've exhausted the free tier this month |
| Budget shows stale state after a PATCH | Budget is cached in Redis for 30s | Wait 30s, or if you direct-wrote to the DB in a test, flush `budget:{pid}:{euid}` manually |
| Rate-limit override change doesn't take effect | Should invalidate automatically — if it doesn't, the backend is outdated | Bounce uvicorn. See `src/assistiv_inference/ratelimit/config.py::invalidate_rate_limit_config_cache` |
| OAuth flow 400s with "No redirect URL configured" | `platform_app_configs.redirect_url` is NULL | Set it via the dashboard during MCP app activation, or direct-update the column |
| OAuth provider shows "redirect_uri is not associated" | Callback-URL mismatch between what the MCP service sends and what's registered at the provider | See **Testing OAuth locally** in Step 5 — you likely need a separate dev OAuth app |

---

## Key Constraints to Be Aware Of

1. **Platform key is server-side only.** Never expose `sk-plat_*` to browsers.
2. **End-user key is scoped.** `sk-eu_*` is safe client-side; it can only see its own user's data.
3. **`raw_key` is shown once.** Store it in your DB immediately on creation — it cannot be retrieved later.
4. **Website setup is required.** Inference returns errors until you've added a provider key and topped up the wallet on the website.
5. **Wallet must have balance.** `402 payment_required` when the platform wallet is empty.
6. **OAuth kickoff is server-side.** `GET /oauth/authorize` needs the `Authorization` header; browsers can't send headers on redirects — proxy through your backend.
7. **All request/response fields are snake_case.** `external_id`, `max_usd`, `app_slug`, `raw_key`, etc. The exception is `mcp.assistiv.ai/connections`, which returns `camelCase` (`appSlug`, `createdAt`) for historical reasons.
8. **Pagination is flat.** The response shape is `{ data, total, page, limit }` — no nested `pagination` object, no `totalPages`.
9. **Single-resource endpoints are unwrapped.** `POST /end-users` returns the end user directly, NOT `{ data: { ... } }`. Only list endpoints have the `data` wrapper.
10. **Budgets are USD, not tokens.** `max_usd`, `used_usd`, `remaining_usd`.
11. **MCP tool names.** Derived from the Pipedream `action.key` with dashes replaced by underscores (single underscore separators). Examples: `github_create_issue`, `github_get_current_user`, `slack_send_message`. The app slug is a naming convention, not a code-enforced prefix.
12. **Atomic billing.** Wallet and budget debits happen in one transaction — if either fails, both roll back. `DELETE` on a user cascades everywhere.
13. **MCP app activation is website-only.** You cannot activate, update, or deactivate an MCP app via the API. Use `assistiv.ai/dashboard/mcp`. You CAN read the activated list via `GET /v1/platforms/{id}/mcp/apps`.
14. **Platform default rate limits are website-only.** Use the website to set defaults. Override per-user via the API.
15. **`POST /mcp` protocol has a required handshake.** Do not try to call `tools/call` cold — use an MCP SDK client or wait for the protocol walkthrough doc.
16. **Outbound webhooks are live (Epic 3).** Assistiv pushes real-time events to a URL you register via Dashboard → Webhooks or `POST /v1/platforms/{pid}/webhook-endpoints`. Event types: `budget.topped_up`, `budget.debited` (firehose, opt-in only), `budget.low_balance` (edge-triggered, auto re-arms on topup), `budget.suspended`, `budget.unsuspended`. Delivery is backed by Svix — HMAC-signed, retried with exponential backoff, replayable via a vended consumer-portal URL. Verify with the Svix SDK in your language. The payload shape is hand-picked and stable — internal columns (`actor_key_id`, `request_fingerprint`, snapshot columns) never leak. See Step 6 for registration, signature-verification snippets, and event payload examples.
17. **Idempotency-Key on money movers (Epic 1+2).** `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` honor the `Idempotency-Key` header with strict Stripe-style semantics: same key + same body → replay the cached ledger row; same key + different body → `409 Conflict`. Use a stable key (Stripe invoice ID, your internal payment ID) across retries — NOT a fresh UUID per retry.
18. **Manual budget debit + negative balances (Epic 2).** `POST /budget/debit` records chargebacks, refunds, and write-offs. It allows `remaining_usd` to go below zero — you don't need a parallel "debt ledger". Inference still gates at `remaining_usd <= 0` (a user in debt can't spend on new requests) but the debit endpoint does not. Manual debits bypass `is_suspended`.
19. **Budget suspension (Epic 2).** `PATCH /budget { is_suspended: true }` pauses inference for a user without deleting the budget. Topups and manual debits still land; only inference is blocked, with `402 budget_suspended` (distinct from `budget_exhausted`). Flip back to `false` to resume. Both transitions fire webhooks.
20. **Full budget ledger (Epic 1+2).** Every topup, debit (manual AND inference), opening balance, suspension, and config adjustment writes a row to `end_user_budget_transactions`. Query via `GET /v1/platforms/{pid}/end-users/{euid}/budget/transactions?since=&limit=200`. The row `type` is one of `opening`, `topup`, `debit`, `adjustment`. This is the canonical source for per-user spend reconciliation; inference logs are complementary for per-model breakdowns.
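The three 402 codes called out in the flow summary and troubleshooting table each need a different remediation. A minimal sketch of the client-side branching, assuming the `{"error": {"code": ..., "message": ...}}` body shape used throughout this guide (the `classify_402` helper name is illustrative, not part of any SDK):

```python
# Hypothetical helper: map a 402 response body to the remediation its code
# calls for. Assumes the {"error": {"code": ...}} shape used in this guide.
def classify_402(body: dict) -> str:
    code = body.get("error", {}).get("code")
    if code == "budget_suspended":
        return "unsuspend"       # admin: PATCH /budget { is_suspended: false }
    if code == "budget_exhausted":
        return "topup_user"      # POST /budget/topup, or wait for the period reset
    if code == "wallet_insufficient":
        return "topup_wallet"    # platform-level: top up at assistiv.ai/dashboard
    return "unknown"
```

Branch on `error.code`, never on the error message text — the message is for humans and may change.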
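The Idempotency-Key rule (constraint 17) is easy to get wrong in a retry loop: the key must be minted once per logical payment and reused verbatim on every attempt. A sketch of the pattern, with the HTTP call injected as `post` so the retry logic is testable; the wrapper name and shape are illustrative, not an Assistiv SDK function:

```python
import uuid

# Hypothetical retry wrapper: mint ONE Idempotency-Key per logical topup and
# send the SAME key on every retry, so the backend replays the cached ledger
# row instead of double-crediting. `post` stands in for your HTTP client
# (e.g. requests.post with the platform headers already applied).
def topup_with_retry(post, url: str, body: dict, attempts: int = 3):
    idem_key = str(uuid.uuid4())  # stable for THIS topup, never per attempt
    last_err = None
    for _ in range(attempts):
        try:
            return post(url, json=body, headers={"Idempotency-Key": idem_key})
        except ConnectionError as err:
            last_err = err  # transient failure: retry with the same key
    raise last_err
```

In production, prefer an ID you already have (Stripe invoice ID, internal payment ID) over a fresh UUID, so a crash-and-restart of your own process also replays rather than double-credits.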
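For the webhook receiver (constraint 16), the recommended path is the Svix SDK's `Webhook.verify()`. For illustration only, here is a stdlib-only sketch of Svix's documented signature scheme — signed content is `{msg_id}.{timestamp}.{body}`, the secret is the base64 portion after the `whsec_` prefix, and the `svix-signature` header carries space-separated `v1,<base64>` entries. Use the SDK in production; it also enforces timestamp tolerance against replay:

```python
import base64
import hashlib
import hmac

# Stdlib sketch of Svix's documented signature scheme (illustrative only).
# msg_id, timestamp, and signature_header come from the svix-id,
# svix-timestamp, and svix-signature request headers; body is the raw text.
def verify_svix_signature(secret: str, msg_id: str, timestamp: str,
                          body: str, signature_header: str) -> bool:
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.{body}".encode()
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    ).decode()
    # The header may list several versioned signatures; accept any v1 match.
    for versioned in signature_header.split(" "):
        version, _, sig = versioned.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

Always verify against the raw request body bytes as received — re-serializing the parsed JSON will change whitespace and break the signature.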