# Assistiv Gateway — Integration Guide

This document is the agent-readable reference for integrating Assistiv Gateway
as the backend for your platform. It is organized by the steps a platform
follows during integration. Each step is a self-contained file with full API
details, request/response examples, and code snippets.

Website-only admin features (platform signup, LLM provider keys, wallet topup,
team management, MCP app activation, platform default rate limits) live on
`assistiv.ai/dashboard` and are not documented here — you configure those in the
UI before you start integrating.

---

## Nomenclature

- **website** — `assistiv.ai`, Assistiv's own admin dashboard.
- **platform** — you, the B2B customer of Assistiv integrating our API into your product.
- **end user** — a user of your product. Your product provisions them in Assistiv.

---

## Integration Steps

Follow these in order. Each is a self-contained file you can read independently.

1. **[Setup & Hello World](/docs/integration/step-1-setup-and-hello-world.txt)** — Dashboard config, server-side setup, verify connection
2. **[Provision End Users](/docs/integration/step-2-provision-end-users.txt)** — Create users with `external_id`, manage API keys, idempotent create
3. **[Budgets & Rate Limits](/docs/integration/step-3-budgets-and-rate-limits.txt)** — Per-user USD budgets, platform wallet, rate-limit overrides, tier patterns
4. **[Inference](/docs/integration/step-4-inference.txt)** — Chat Completions, Responses API, Models, streaming, tool calling, OpenAI SDK
5. **[MCP Tools (Optional)](/docs/integration/step-5-mcp-tools.txt)** — OAuth flow, hosted execution via Responses API, free tier, reference endpoints
6. **[Monitor & Operate](/docs/integration/step-6-monitor-and-operate.txt)** — Self-service endpoints, logs, cost reconciliation patterns

---

## Integration Flow (narrative summary)

Assuming you completed the website prerequisites:

1. When a user signs up in your product, call `POST /v1/platforms/{platformId}/end-users` to provision them in Assistiv. The response contains an `sk-eu_*` raw key; store it in your DB keyed by your user's ID.
2. (Optional) Create a per-user budget and/or rate-limit override for them. Send `Idempotency-Key` on `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` so retries replay cached ledger rows instead of double-crediting. Use `PATCH /budget { is_suspended: true }` to pause inference for a user without losing budget state.
3. Make inference calls with that user's `sk-eu_*` key against `POST /v1/chat/completions` (function calling) or `POST /v1/responses` (function calling **and** hosted MCP). Branch on `error.code` when you get 402: `budget_suspended`, `budget_exhausted`, `wallet_insufficient` each need a different fix.
4. (Optional, hosted MCP) For end users who need third-party tools (GitHub issues, Slack messages, etc.):
   - Send them through the OAuth flow once: your backend calls `GET mcp.assistiv.ai/oauth/authorize?app=github` server-side, you redirect their browser to the returned URL, they approve at the provider, and they land back on your site at `{your_base_url}/mcp/oauth-callback?app=github&status=connected`.
   - From then on, every `POST /v1/responses` call you make for that user can include an `{type: "mcp", server_label: "assistiv", server_url: "https://mcp.assistiv.ai/mcp", authorization: "Bearer <their key>"}` tool item, and the model will autonomously call those tools server-side — you get one final answer back.
5. Register an outbound webhook endpoint (Dashboard → Webhooks, or `POST /v1/platforms/{pid}/webhook-endpoints`) to receive real-time `budget.topped_up`, `budget.low_balance`, `budget.suspended`, and `budget.unsuspended` events. Svix handles delivery, retries, and signing, so your receiver only has to verify each payload with the Svix SDK. `budget.debited` is available but off by default because at LLM cadence it's a firehose; opt in only if you actually need per-call notifications.
6. Monitor usage via `GET /v1/platforms/{platformId}/logs` (per-call detail) and `GET /v1/platforms/{platformId}/end-users/{euid}/budget/transactions` (full budget ledger — topups, debits, opening, adjustments, suspensions). The ledger is the source of truth for per-user spend; logs are complementary for per-model breakdowns and debugging.
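
Step 3 above asks you to branch on `error.code` for 402 responses. A minimal sketch of that branching follows, assuming the error body has the shape `{"error": {"code": ..., "message": ...}}` that the examples in this guide read from. The helper name `classify_payment_error` and the hint strings are illustrative, not part of the Assistiv API.

```python
from __future__ import annotations

from typing import Any

# Remediation hints paraphrased from the troubleshooting table in this guide.
REMEDIATION_402 = {
    "budget_suspended": "unsuspend the user: PATCH /budget { is_suspended: false }",
    "budget_exhausted": "top up the user's budget or wait for the period reset",
    "wallet_insufficient": "top up the platform wallet in the dashboard",
}


def classify_payment_error(status_code: int, body: dict[str, Any]) -> str | None:
    """Return a remediation hint for a 402 response, or None for any other status."""
    if status_code != 402:
        return None
    # Tolerate a missing or null "error" object rather than raising.
    code = (body.get("error") or {}).get("code")
    return REMEDIATION_402.get(code, f"unrecognized 402 code: {code!r}")
```

Call it with `resp.status_code` and `resp.json()` before raising, so your backend can decide whether to alert an admin (`wallet_insufficient`, `budget_suspended`) or prompt the user to upgrade (`budget_exhausted`).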

---

## Complete Integration Example (TypeScript)

Assumes you've done the website prerequisites: signed up, added an LLM provider
key, and topped up your wallet.

```typescript
const PLATFORM_KEY = process.env.ASSISTIV_PLATFORM_KEY!;  // sk-plat_*
const PLATFORM_ID = process.env.ASSISTIV_PLATFORM_ID!;
const API_BASE = "https://api.assistiv.ai/v1";

// ── Step 0: Discover which models are actually enabled on this platform ──
// Do NOT hardcode "gpt-4o". Every platform has different provider keys → a
// different set of enabled models. Call this once at startup and cache the
// result; pass the chosen model slug into your inference calls.
async function listAvailableModels(): Promise<string[]> {
  const res = await fetch(`${API_BASE}/models`, {
    headers: { Authorization: `Bearer ${PLATFORM_KEY}` },
  });
  if (!res.ok) throw new Error("Failed to list models");
  const body = await res.json();
  return body.data.map((m: { id: string }) => m.id);
}

// Typical startup: verify your chosen model is actually available.
// const available = await listAvailableModels();
// const MODEL = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];

// ── Step 1: Provision an end user when they sign up in your product ──────
// This is idempotent on external_id: re-running it for the same user returns
// 200 with the existing user + a freshly minted sk-eu_* key (instead of 409).
// Safe to call from a retry loop or a re-run signup webhook.
async function createAssistivUser(yourUserId: string, displayName: string) {
  const res = await fetch(`${API_BASE}/platforms/${PLATFORM_ID}/end-users`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${PLATFORM_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      external_id: yourUserId,
      display_name: displayName,
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Failed to create end user");
  // 201 = newly created, 200 = re-provisioned (existing user, fresh key).
  // Either way, save data.api_key.raw_key in YOUR DB keyed by yourUserId.
  return { assistivUserId: data.id, endUserKey: data.api_key.raw_key };
}

// ── Step 1b (optional): Look up an existing end user by your stable ID ───
// Cheap idempotent lookup, no key minted. Use this if you already saved the
// raw key in your DB and just want to confirm the user still exists.
async function findAssistivUser(yourUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users?external_id=${encodeURIComponent(yourUserId)}`,
    { headers: { Authorization: `Bearer ${PLATFORM_KEY}` } },
  );
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Lookup failed");
  return data.data[0] ?? null;  // null if not found
}

// ── Step 2: (Optional) Cap the user's monthly spend ──────────────────────
async function createUserBudget(assistivUserId: string, maxUsd: number) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/budget`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        max_usd: maxUsd,
        period: "monthly",
        auto_replenish: true,
        replenish_amount: maxUsd,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to create budget");
  return res.json();
}

// ── Step 3: (Optional) Raise rate limits for a premium user ──────────────
async function raiseRateLimitsForPremium(assistivUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/rate-limits`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        rpm_limit: 300,
        tpm_limit: 500000,
        rpd_limit: 50000,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to set rate limits");
  return res.json();
}

// ── Step 4: Inference with the end-user key ──────────────────────────────
// NOTE: `model` here must match a slug returned by listAvailableModels().
// Passing a slug that isn't enabled on this platform returns 404 model_not_found.
async function chat(endUserKey: string, model: string, userMessage: string) {
  const res = await fetch(`${API_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
      max_tokens: 256,  // >= 16 for gpt-5 family (Responses API floor)
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Inference failed");
  return data.choices[0].message.content;
}

// ── Step 5: (Optional, hosted MCP) Kick off the OAuth flow for one app ───
// Call this from a backend route the user hits when they click "Connect GitHub"
// in your UI. Return the resulting URL to the frontend; have the frontend set
// window.location.href = url. After approval, the user lands back on your site
// at {your_base_url}/mcp/oauth-callback?app=github&status=connected.
async function getMcpConnectUrl(endUserKey: string, appSlug: string): Promise<string> {
  const res = await fetch(
    `https://mcp.assistiv.ai/oauth/authorize?app=${encodeURIComponent(appSlug)}`,
    {
      headers: { Authorization: `Bearer ${endUserKey}` },
      redirect: "manual",
    }
  );
  if (res.status !== 302) throw new Error("OAuth authorize failed");
  return res.headers.get("location")!;
}

// ── Step 6: (Optional, hosted MCP) Inference with hosted MCP tools ───────
// Once the user has connected at least one MCP app via step 5, every call
// to /v1/responses can attach the MCP tool item. The model will autonomously
// call GitHub/Slack/etc. on behalf of the user, server-side, and return one
// final answer. This only works on /v1/responses (not /v1/chat/completions).
async function chatWithMcp(endUserKey: string, model: string, prompt: string) {
  const res = await fetch(`${API_BASE}/responses`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      input: prompt,
      max_output_tokens: 256,  // >= 16 required
      tools: [
        {
          type: "mcp",
          server_label: "assistiv",
          server_url: "https://mcp.assistiv.ai/mcp",
          authorization: `Bearer ${endUserKey}`,
          require_approval: "never",  // required — see "Hosted Execution" section
        },
      ],
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Hosted MCP inference failed");

  // data.output is a mixed array of mcp_call items (one per executed tool)
  // and message items (the model's final assistant text). Pull the final text:
  const message = data.output.find((o: { type: string }) => o.type === "message");
  return message?.content?.[0]?.text ?? "";
}

// ── Full flow ─────────────────────────────────────────────────────────────
async function onYourUserSignup(yourUserId: string, name: string) {
  // In production, discover models once at startup and cache the result.
  // It is done inline here only to keep the example self-contained.
  const available = await listAvailableModels();
  const model = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];
  if (!model) throw new Error("No models enabled on this platform — add a provider key");

  const { assistivUserId, endUserKey } = await createAssistivUser(yourUserId, name);
  await createUserBudget(assistivUserId, 10.00);

  // Plain inference (no tools)
  const reply = await chat(endUserKey, model, "Hello!");
  console.log(reply);

  // Later, after the user has gone through the OAuth flow for GitHub:
  const mcpReply = await chatWithMcp(
    endUserKey,
    model,
    "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
  );
  console.log(mcpReply);
}
```

---

## Complete Integration Example (Python)

```python
import os
import requests

PLATFORM_KEY = os.environ["ASSISTIV_PLATFORM_KEY"]
PLATFORM_ID = os.environ["ASSISTIV_PLATFORM_ID"]
API_BASE = "https://api.assistiv.ai/v1"

platform_headers = {
    "Authorization": f"Bearer {PLATFORM_KEY}",
    "Content-Type": "application/json",
}


# Discover which models are actually enabled on this platform.
# Do NOT hardcode "gpt-4o" — different platforms have different provider keys
# configured, so the set of enabled models varies. Call this at startup and
# pass the chosen slug into your inference calls.
def list_available_models() -> list[str]:
    resp = requests.get(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {PLATFORM_KEY}"},
    )
    resp.raise_for_status()
    return [m["id"] for m in resp.json()["data"]]


# Create an end user when they sign up in your product.
# This is idempotent on external_id: re-running for the same user returns 200
# with the existing user + a freshly minted sk-eu_* key (instead of 409). Safe
# to call from a retry loop or a re-run signup webhook.
def create_assistiv_user(your_user_id: str, display_name: str) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        json={"external_id": your_user_id, "display_name": display_name},
    )
    resp.raise_for_status()
    data = resp.json()
    # 201 = newly created, 200 = re-provisioned (existing user, fresh key).
    # Save data["api_key"]["raw_key"] in your DB keyed by your_user_id.
    return {
        "assistiv_user_id": data["id"],
        "end_user_key": data["api_key"]["raw_key"],
    }


# Look up an existing end user by your stable ID. Cheap idempotent lookup,
# no key minted. Returns None if not found.
def find_assistiv_user(your_user_id: str) -> dict | None:
    resp = requests.get(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        params={"external_id": your_user_id},
    )
    resp.raise_for_status()
    rows = resp.json()["data"]
    return rows[0] if rows else None

# Cap their monthly spend
def create_user_budget(assistiv_user_id: str, max_usd: float) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users/{assistiv_user_id}/budget",
        headers=platform_headers,
        json={
            "max_usd": max_usd,
            "period": "monthly",
            "auto_replenish": True,
            "replenish_amount": max_usd,
        },
    )
    resp.raise_for_status()
    return resp.json()

# Plain inference with end-user key (no tools).
# `model` must be a slug returned by list_available_models(). Passing a slug
# that isn't enabled on this platform returns 404 model_not_found.
def chat(end_user_key: str, model: str, user_message: str) -> str:
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
            "max_tokens": 256,  # >= 16 for gpt-5 family
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# (Optional, hosted MCP) Kick off the OAuth flow for one app.
# Call this from a backend route the user hits when they click "Connect GitHub"
# in your UI. Return the URL to your frontend; have the frontend set
# window.location.href = url. After approval, the user lands back at
# {your_base_url}/mcp/oauth-callback?app=github&status=connected.
def get_mcp_connect_url(end_user_key: str, app_slug: str) -> str:
    resp = requests.get(
        "https://mcp.assistiv.ai/oauth/authorize",
        params={"app": app_slug},  # requests URL-encodes the slug for us
        headers={"Authorization": f"Bearer {end_user_key}"},
        allow_redirects=False,
    )
    if resp.status_code != 302:
        raise RuntimeError(f"OAuth authorize failed: {resp.status_code}")
    return resp.headers["Location"]

# (Optional, hosted MCP) Inference with hosted MCP tools.
# Once the user has connected at least one MCP app via the OAuth flow above,
# every call to /v1/responses can attach the MCP tool item. The model will
# autonomously call GitHub/Slack/etc. server-side and return one final answer.
# This only works on /v1/responses, not /v1/chat/completions.
def chat_with_mcp(end_user_key: str, model: str, prompt: str) -> str:
    resp = requests.post(
        f"{API_BASE}/responses",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "input": prompt,
            "max_output_tokens": 256,  # >= 16 required
            "tools": [
                {
                    "type": "mcp",
                    "server_label": "assistiv",
                    "server_url": "https://mcp.assistiv.ai/mcp",
                    "authorization": f"Bearer {end_user_key}",
                    "require_approval": "never",  # required
                }
            ],
        },
    )
    resp.raise_for_status()
    data = resp.json()
    # data["output"] interleaves mcp_call items (one per executed tool) and
    # message items (the model's final assistant text). Pull the final text:
    for item in data["output"]:
        if item.get("type") == "message":
            return item["content"][0]["text"]
    return ""

# Same call via the OpenAI SDK (also works because the wire format is OpenAI-native):
#
#     from openai import OpenAI
#     client = OpenAI(api_key=end_user_key, base_url="https://api.assistiv.ai/v1")
#     models = [m.id for m in client.models.list().data]  # discover first
#     response = client.responses.create(
#         model=models[0],  # use a discovered slug, never hardcode
#         input=prompt,
#         tools=[{
#             "type": "mcp",
#             "server_label": "assistiv",
#             "server_url": "https://mcp.assistiv.ai/mcp",
#             "authorization": f"Bearer {end_user_key}",
#             "require_approval": "never",
#         }],
#     )

# Full flow
if __name__ == "__main__":
    # Discover models once at startup
    available = list_available_models()
    if not available:
        raise RuntimeError("No models enabled — add a provider key in the dashboard")
    model = "gpt-4o-mini" if "gpt-4o-mini" in available else available[0]

    result = create_assistiv_user("user-001", "Alice")
    create_user_budget(result["assistiv_user_id"], 10.00)

    # Plain inference (no tools)
    reply = chat(result["end_user_key"], model, "Hello!")
    print(reply)

    # Later, after Alice has gone through the OAuth flow for GitHub:
    mcp_reply = chat_with_mcp(
        result["end_user_key"],
        model,
        "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
    )
    print(mcp_reply)
```
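
The Python example above does not cover budget topups. The sketch below shows one way to attach a stable `Idempotency-Key` to a topup, as recommended in the integration flow. It only builds the arguments for `requests.post` and sends nothing. Several details are assumptions: the full path is inferred from the `/budget` and `/budget/transactions` paths shown in this guide, `amount_usd` is a hypothetical body field (check Step 3 for the real schema), and `build_topup_request` and `payment_id` are illustrative names.

```python
from __future__ import annotations

API_BASE = "https://api.assistiv.ai/v1"


def build_topup_request(
    platform_id: str,
    platform_key: str,
    assistiv_user_id: str,
    amount_usd: float,
    payment_id: str,
) -> dict:
    """Build keyword arguments for requests.post(); nothing is sent here."""
    return {
        # Assumed path: the guide abbreviates this endpoint as POST /budget/topup.
        "url": f"{API_BASE}/platforms/{platform_id}/end-users/{assistiv_user_id}/budget/topup",
        "headers": {
            "Authorization": f"Bearer {platform_key}",
            "Content-Type": "application/json",
            # Derived from the payment, so every retry sends the same key.
            "Idempotency-Key": f"topup-{payment_id}",
        },
        # Hypothetical field name; confirm against the Step 3 reference.
        "json": {"amount_usd": amount_usd},
    }
```

Usage would look like `requests.post(**build_topup_request(PLATFORM_ID, PLATFORM_KEY, user_id, 5.00, "inv_123"))`. Because the key is derived from your payment ID rather than generated per attempt, a retry after a timeout should replay the cached ledger row instead of crediting twice.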

---

## Environment Variables Reference

```bash
# Required in your platform backend
ASSISTIV_PLATFORM_KEY=sk-plat_...     # From assistiv.ai/dashboard
ASSISTIV_PLATFORM_ID=uuid             # From assistiv.ai/dashboard

# Optional — defaults shown
ASSISTIV_API_BASE=https://api.assistiv.ai/v1
ASSISTIV_MCP_BASE=https://mcp.assistiv.ai
```
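
The variables above could be loaded once at startup with a small helper like the one below. This is a sketch, not part of any Assistiv SDK: `AssistivConfig` and `load_config` are hypothetical names, and the defaults are copied from the block above.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AssistivConfig:
    platform_key: str
    platform_id: str
    api_base: str
    mcp_base: str


def load_config(env: Mapping[str, str] = os.environ) -> AssistivConfig:
    """Read the Assistiv variables, failing fast when a required one is missing."""
    required = ("ASSISTIV_PLATFORM_KEY", "ASSISTIV_PLATFORM_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return AssistivConfig(
        platform_key=env["ASSISTIV_PLATFORM_KEY"],
        platform_id=env["ASSISTIV_PLATFORM_ID"],
        # Optional variables fall back to the documented defaults.
        api_base=env.get("ASSISTIV_API_BASE", "https://api.assistiv.ai/v1"),
        mcp_base=env.get("ASSISTIV_MCP_BASE", "https://mcp.assistiv.ai"),
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dict.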

---

## Troubleshooting Common Failures

| Symptom | Likely cause | First thing to check |
|---|---|---|
| `401 unauthorized` | Wrong key or typo in `Authorization` header | Re-copy the key from the dashboard; verify it starts with `sk-plat_` or `sk-eu_` |
| `402 payment_required` `code: wallet_insufficient` | Platform wallet empty | Dashboard → Wallet → Top up. `GET /v1/platforms/{id}/wallet` to verify balance |
| `402 payment_required` `code: budget_exhausted` | This end user's budget is exhausted | `GET /v1/me/budget` → check `remaining_usd`. `POST /budget/topup` (with `Idempotency-Key`, Epic 1) or wait for period reset |
| `402 payment_required` `code: budget_suspended` | This end user's budget is paused via `is_suspended=true` | Admin action: `PATCH /budget { is_suspended: false }`. Distinct code from `budget_exhausted` — branch on `error.code` in your client |
| `403 forbidden` | Platform key used on an `/v1/me/*` endpoint, or end-user key used on `/end-users/*` CRUD | Review "Two key types" — each endpoint's `Auth:` line names which key type is required |
| `404 not_found` `code: model_not_found` | Hardcoded model slug not enabled on this platform | `GET /v1/models` to list actually-enabled slugs. Do NOT assume `gpt-4o` is universal |
| `409 conflict` on `POST /end-users` | **Should not happen** — that endpoint is idempotent | You're on an old backend build. Pull the latest, or use the `?external_id=` filter + POST pattern |
| `422 validation_error` | Request body shape wrong, required field missing, or value out of range | Read `error.message` — it names the specific field |
| `422 integer_below_min_value` on `max_output_tokens` | Value below OpenAI Responses API floor of 16 | Set `max_output_tokens >= 16` (or omit) |
| `422 unsupported require_approval value` on `/v1/responses` | `require_approval: "always"` in MCP tool item | Set `"never"` (or omit the field entirely) |
| `429 rate_limit_exceeded` | Sliding window counter exceeded | Check `error.denied_by`: `eu_rpm`/`eu_rpd`/`plat_rpm`/`plat_rpd` tells you which bucket. Respect `Retry-After` header |
| `500 internal_error` | Unhandled backend exception | Check your uvicorn (self-hosted) or support logs. If reproducible, file a bug with the request that caused it |
| `500 max_iterations_exceeded` (MCP) | Agent loop didn't converge in 10 iterations | The model is stuck in a tool-call loop. Try a different prompt, a larger model, or add clearer "stop when done" instructions |
| `503 mcp_unreachable` | MCP service down, end user has no connections, or tool list fetch failed | `GET mcp.assistiv.ai/health`; `GET mcp.assistiv.ai/connections` with the end-user key to confirm at least one is active |
| `200` but `output` is empty or only has `mcp_call` items | Small models (e.g. gpt-5.4-nano) sometimes return only the tool call with no wrapping assistant message | Look in `output[].mcp_call.output` — that's the actual tool result and the de facto answer |
| Wallet balance unchanged after a successful call | Debit is fire-and-forget (~1.5s delay) OR MCP free tier (first 100 calls/month) | Wait 2s and re-check. For MCP calls, check if you've exhausted the free tier this month |
| Budget shows stale state after a PATCH | Budget is cached in Redis for 30s | Wait 30s, or if you direct-wrote to the DB in a test, flush `budget:{pid}:{euid}` manually |
| Rate limit override change doesn't take effect | Should invalidate automatically — if it doesn't, backend is outdated | Bounce uvicorn. See `src/assistiv_inference/ratelimit/config.py::invalidate_rate_limit_config_cache` |
| OAuth flow 400s with "No redirect URL configured" | `platform_app_configs.redirect_url` is NULL | Set it via the dashboard during MCP app activation, or direct-update the column |
| OAuth provider shows "redirect_uri is not associated" | Callback URL mismatch between what MCP service sends and what's registered at provider | See **Testing OAuth locally** in Step 5 — you likely need a separate dev OAuth app |

---

## Key Constraints to Be Aware Of

1. **Platform key is server-side only.** Never expose `sk-plat_*` to browsers.
2. **End-user key is scoped.** `sk-eu_*` is safe client-side; it can only see its own user's data.
3. **`raw_key` is shown once.** Store it in your DB immediately on creation — cannot be retrieved later.
4. **Website setup is required.** Inference returns errors until you've added a provider key and topped up the wallet on the website.
5. **Wallet must have balance.** `402 payment_required` when the platform wallet is empty.
6. **OAuth kickoff is server-side.** `GET /oauth/authorize` needs the `Authorization` header; browsers can't send headers on redirects — proxy through your backend.
7. **All request/response fields are snake_case.** `external_id`, `max_usd`, `app_slug`, `raw_key`, etc. The one exception is `mcp.assistiv.ai/connections`, which returns camelCase (`appSlug`, `createdAt`) for historical reasons.
8. **Pagination is flat.** Response shape is `{ data, total, page, limit }` — no nested `pagination` object, no `totalPages`.
9. **Single-resource endpoints are unwrapped.** `POST /end-users` returns the end user directly, NOT `{ data: { ... } }`. Only list endpoints have the `data` wrapper.
10. **Budgets are USD, not tokens.** `max_usd`, `used_usd`, `remaining_usd`.
11. **MCP tool names.** Derived from the Pipedream `action.key` with dashes replaced by underscores (single underscore separators). Examples: `github_create_issue`, `github_get_current_user`, `slack_send_message`. The app slug is a naming convention, not a code-enforced prefix.
12. **Atomic billing.** Wallet and budget debits happen in one transaction — if either fails, both roll back. `DELETE` on a user cascades everywhere.
13. **MCP app activation is website-only.** You cannot activate, update, or deactivate an MCP app via API. Use `assistiv.ai/dashboard/mcp`. You CAN read the activated list via `GET /v1/platforms/{id}/mcp/apps`.
14. **Platform default rate limits are website-only.** Use the website to set defaults. Override per-user via the API.
15. **`POST /mcp` protocol has a required handshake.** Do not try to call `tools/call` cold — use an MCP SDK client or wait for the protocol walkthrough doc.
16. **Outbound webhooks are live (Epic 3).** Assistiv pushes real-time events to a URL you register via Dashboard → Webhooks or `POST /v1/platforms/{pid}/webhook-endpoints`. Event types: `budget.topped_up`, `budget.debited` (firehose, opt-in only), `budget.low_balance` (edge-triggered, auto re-arms on topup), `budget.suspended`, `budget.unsuspended`. Delivery is backed by Svix — HMAC-signed, retried with exponential backoff, replayable via a vended consumer-portal URL. Verify with the Svix SDK in your language. Hand-picked stable payload shape — internal columns (`actor_key_id`, `request_fingerprint`, snapshot columns) never leak. See Step 6 for registration, signature verification snippets, and event payload examples.
17. **Idempotency-Key on money movers (Epic 1+2).** `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` honor the `Idempotency-Key` header with strict Stripe-style semantics: same key + same body → replay cached ledger row; same key + different body → `409 Conflict`. Use a stable key (Stripe invoice ID, your internal payment ID) across retries — NOT a fresh UUID per retry.
18. **Manual budget debit + negative balances (Epic 2).** `POST /budget/debit` records chargebacks, refunds, and write-offs. It allows `remaining_usd` to go below zero, so you don't need a parallel "debt ledger". Inference is still blocked while `remaining_usd <= 0` (a user in debt cannot make new requests), but the debit endpoint is not. Manual debits bypass `is_suspended`.
19. **Budget suspension (Epic 2).** `PATCH /budget { is_suspended: true }` pauses inference for a user without deleting the budget. Topups and manual debits still land; only inference is blocked with `402 budget_suspended` (distinct from `budget_exhausted`). Flip back to `false` to resume. Both transitions fire webhooks.
20. **Full budget ledger (Epic 1+2).** Every topup, debit (manual AND inference), opening balance, suspension, and config adjustment writes a row to `end_user_budget_transactions`. Query via `GET /v1/platforms/{pid}/end-users/{euid}/budget/transactions?since=<ISO>&limit=200`. Row `type` is one of `opening`, `topup`, `debit`, `adjustment`. This is the canonical source for per-user spend reconciliation; inference logs are complementary for per-model breakdowns.
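
Constraint 8 describes the flat pagination shape `{ data, total, page, limit }`. A minimal sketch of walking every page of a list endpoint such as the logs or the budget ledger follows. The HTTP call is left to a caller-supplied function, because the `page` query parameter and its 1-based numbering are assumptions here; `iter_rows` is an illustrative name.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_rows(fetch_page: Callable[[int], Page]) -> Iterator[dict[str, Any]]:
    """Yield every row from a flat-paginated list endpoint.

    fetch_page(page_number) must return the documented shape:
    {"data": [...], "total": int, "page": int, "limit": int}.
    """
    page = 1  # Assumption: pages are numbered from 1.
    seen = 0
    while True:
        body = fetch_page(page)
        rows = body["data"]
        yield from rows
        seen += len(rows)
        # Stop once all rows are consumed, or on an empty page to avoid looping forever.
        if not rows or seen >= body["total"]:
            return
        page += 1
```

In practice `fetch_page` would wrap a `requests.get` call against the list endpoint and return `resp.json()`.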