# Assistiv Gateway — Integration Guide

This document is the agent-readable reference for integrating Assistiv Gateway as the backend for your platform. It is organized by the steps a platform follows during integration. Each step is a self-contained file with full API details, request/response examples, and code snippets.

Website-only admin features (platform signup, LLM provider keys, wallet topup, team management, MCP app activation, platform default rate limits, skills source repos) live on `assistiv.ai/dashboard` and are not documented here — you configure those in the UI before you start integrating.

A separate, optional product — **Skills (private beta)** — lets agents discover, fetch, and run code packages on demand. It is admin-provisioned (not self-serve) and out of band of the main integration flow. See the dedicated section below.

---

## Nomenclature

- **website** — `assistiv.ai`, Assistiv's own admin dashboard.
- **platform** — you, the B2B customer of Assistiv integrating our API into your product.
- **end user** — a user of your product. Your product provisions them in Assistiv.

---

## Integration Steps

Follow these in order. Each is a self-contained file you can read independently.

1. **[Setup & Hello World](/docs/integration/step-1-setup-and-hello-world.txt)** — Dashboard config, server-side setup, verify connection
2. **[Provision End Users](/docs/integration/step-2-provision-end-users.txt)** — Create users with `external_id`, manage API keys, idempotent create
3. **[Budgets & Rate Limits](/docs/integration/step-3-budgets-and-rate-limits.txt)** — Per-user USD budgets, platform wallet, rate-limit overrides, tier patterns
4. **[Inference](/docs/integration/step-4-inference.txt)** — Chat Completions, Responses API, Models, streaming, tool calling, OpenAI SDK
5. **[MCP Tools (Optional)](/docs/integration/step-5-mcp-tools.txt)** — OAuth flow, hosted execution via Responses API, free tier, reference endpoints
6. **[Monitor & Operate](/docs/integration/step-6-monitor-and-operate.txt)** — Self-service endpoints, logs, cost reconciliation patterns

---

## Integration Flow (narrative summary)

Assuming you completed the website prerequisites:

1. When a user signs up in your product, call `POST /v1/platforms/{platformId}/end-users` to provision them in Assistiv. The response contains a `sk-eu_*` raw key — store it in your DB keyed by your user's ID.
2. (Optional) Create a per-user budget and/or rate-limit override for them. Send `Idempotency-Key` on `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` so retries replay cached ledger rows instead of double-crediting. Use `PATCH /budget { is_suspended: true }` to pause inference for a user without losing budget state.
3. Make inference calls with that user's `sk-eu_*` key against `POST /v1/chat/completions` (function calling) or `POST /v1/responses` (function calling **and** hosted MCP). Branch on `error.code` when you get 402: `budget_suspended`, `budget_exhausted`, `wallet_insufficient` each need a different fix.
4. (Optional, hosted MCP) For end users who need third-party tools (GitHub issues, Slack messages, etc.):
   - Send them through the OAuth flow once: your backend calls `GET mcp.assistiv.ai/oauth/authorize?app=github` server-side, you redirect their browser to the returned URL, they approve at the provider, they land back on your site at `{your_base_url}/mcp/oauth-callback?status=connected`.
   - From then on, every `POST /v1/responses` call you make for that user can include an `{type: "mcp", server_label: "assistiv", server_url: "https://mcp.assistiv.ai/mcp", authorization: "Bearer <end-user sk-eu_* key>"}` tool item, and the model will autonomously call those tools server-side — you get one final answer back.
5. Register an outbound webhook endpoint (Dashboard → Webhooks, or `POST /v1/platforms/{pid}/webhook-endpoints`) to receive real-time `budget.topped_up`, `budget.low_balance`, `budget.suspended`, and `budget.unsuspended` events.
   Svix handles delivery, retries, and signature-signing — your receiver just verifies with the Svix SDK. `budget.debited` is available but off by default because at LLM cadence it's a firehose; opt in only if you actually need per-call notifications.
6. Monitor usage via `GET /v1/platforms/{platformId}/logs` (per-call detail) and `GET /v1/platforms/{platformId}/end-users/{euid}/budget/transactions` (full budget ledger — topups, debits, opening, adjustments, suspensions). The ledger is the source of truth for per-user spend; logs are complementary for per-model breakdowns and debugging.

---

## Complete Integration Example (TypeScript)

Assumes you've done the website prerequisites: signed up, added an LLM provider key, and topped up your wallet.

```typescript
const PLATFORM_KEY = process.env.ASSISTIV_PLATFORM_KEY!; // sk-plat_*
const PLATFORM_ID = process.env.ASSISTIV_PLATFORM_ID!;
const API_BASE = "https://api.assistiv.ai/v1";

// ── Step 0: Discover which models are actually enabled on this platform ──
// Do NOT hardcode "gpt-4o". Every platform has different provider keys → a
// different set of enabled models. Call this once at startup and cache the
// result; pass the chosen model slug into your inference calls.
async function listAvailableModels(): Promise<string[]> {
  const res = await fetch(`${API_BASE}/models`, {
    headers: { Authorization: `Bearer ${PLATFORM_KEY}` },
  });
  if (!res.ok) throw new Error("Failed to list models");
  const body = await res.json();
  return body.data.map((m: { id: string }) => m.id);
}

// Typical startup: verify your chosen model is actually available.
// const available = await listAvailableModels();
// const MODEL = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];

// ── Step 1: Provision an end user when they sign up in your product ──────
// This is idempotent on external_id: re-running it for the same user returns
// 200 with the existing user + a freshly minted sk-eu_* key (instead of 409).
// Safe to call from a retry loop or a re-run signup webhook.
async function createAssistivUser(yourUserId: string, displayName: string) {
  const res = await fetch(`${API_BASE}/platforms/${PLATFORM_ID}/end-users`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${PLATFORM_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      external_id: yourUserId,
      display_name: displayName,
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Failed to create end user");
  // 201 = newly created, 200 = re-provisioned (existing user, fresh key).
  // Either way, save data.api_key.raw_key in YOUR DB keyed by yourUserId.
  return { assistivUserId: data.id, endUserKey: data.api_key.raw_key };
}

// ── Step 1b (optional): Look up an existing end user by your stable ID ───
// Cheap idempotent lookup, no key minted. Use this if you already saved the
// raw key in your DB and just want to confirm the user still exists.
async function findAssistivUser(yourUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users?external_id=${encodeURIComponent(yourUserId)}`,
    { headers: { Authorization: `Bearer ${PLATFORM_KEY}` } },
  );
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Lookup failed");
  return data.data[0] ?? null; // null if not found
}

// ── Step 2: (Optional) Cap the user's monthly spend ──────────────────────
async function createUserBudget(assistivUserId: string, maxUsd: number) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/budget`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        max_usd: maxUsd,
        period: "monthly",
        auto_replenish: true,
        replenish_amount: maxUsd,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to create budget");
  return res.json();
}

// ── Step 3: (Optional) Raise rate limits for a premium user ──────────────
async function raiseRateLimitsForPremium(assistivUserId: string) {
  const res = await fetch(
    `${API_BASE}/platforms/${PLATFORM_ID}/end-users/${assistivUserId}/rate-limits`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${PLATFORM_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        rpm_limit: 300,
        tpm_limit: 500000,
        rpd_limit: 50000,
      }),
    }
  );
  if (!res.ok) throw new Error("Failed to set rate limits");
  return res.json();
}

// ── Step 4: Inference with the end-user key ──────────────────────────────
// NOTE: `model` here must match a slug returned by listAvailableModels().
// Passing a slug that isn't enabled on this platform returns 404 model_not_found.
async function chat(endUserKey: string, model: string, userMessage: string) {
  const res = await fetch(`${API_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: userMessage }],
      max_tokens: 256, // >= 16 for gpt-5 family (Responses API floor)
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Inference failed");
  return data.choices[0].message.content;
}

// ── Step 5: (Optional, hosted MCP) Kick off the OAuth flow for one app ───
// Call this from a backend route the user hits when they click "Connect GitHub"
// in your UI. Return the resulting URL to the frontend; have the frontend set
// window.location.href = url. After approval, the user lands back on your site
// at {your_base_url}/mcp/oauth-callback?app=github&status=connected.
async function getMcpConnectUrl(endUserKey: string, appSlug: string): Promise<string> {
  const res = await fetch(
    `https://mcp.assistiv.ai/oauth/authorize?app=${encodeURIComponent(appSlug)}`,
    {
      headers: { Authorization: `Bearer ${endUserKey}` },
      redirect: "manual",
    }
  );
  if (res.status !== 302) throw new Error("OAuth authorize failed");
  return res.headers.get("location")!;
}

// ── Step 6: (Optional, hosted MCP) Inference with hosted MCP tools ───────
// Once the user has connected at least one MCP app via step 5, every call
// to /v1/responses can attach the MCP tool item. The model will autonomously
// call GitHub/Slack/etc. on behalf of the user, server-side, and return one
// final answer. This only works on /v1/responses (not /v1/chat/completions).
async function chatWithMcp(endUserKey: string, model: string, prompt: string) {
  const res = await fetch(`${API_BASE}/responses`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${endUserKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      input: prompt,
      max_output_tokens: 256, // >= 16 required
      tools: [
        {
          type: "mcp",
          server_label: "assistiv",
          server_url: "https://mcp.assistiv.ai/mcp",
          authorization: `Bearer ${endUserKey}`,
          require_approval: "never", // required — see "Hosted Execution" section
        },
      ],
    }),
  });
  const data = await res.json();
  if (!res.ok) throw new Error(data.error?.message ?? "Hosted MCP inference failed");
  // data.output is a mixed array of mcp_call items (one per executed tool)
  // and message items (the model's final assistant text).
  // Pull the final text:
  const message = data.output.find((o: { type: string }) => o.type === "message");
  return message?.content?.[0]?.text ?? "";
}

// ── Full flow ─────────────────────────────────────────────────────────────
async function onYourUserSignup(yourUserId: string, name: string) {
  // In production, discover models once at startup and cache the result;
  // the lookup is done inline here only to keep the example self-contained.
  const available = await listAvailableModels();
  const model = available.includes("gpt-4o-mini") ? "gpt-4o-mini" : available[0];
  if (!model) throw new Error("No models enabled on this platform — add a provider key");

  const { assistivUserId, endUserKey } = await createAssistivUser(yourUserId, name);
  await createUserBudget(assistivUserId, 10.00);

  // Plain inference (no tools)
  const reply = await chat(endUserKey, model, "Hello!");
  console.log(reply);

  // Later, after the user has gone through the OAuth flow for GitHub:
  const mcpReply = await chatWithMcp(
    endUserKey,
    model,
    "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
  );
  console.log(mcpReply);
}
```

---

## Complete Integration Example (Python)

```python
import os
import requests

PLATFORM_KEY = os.environ["ASSISTIV_PLATFORM_KEY"]
PLATFORM_ID = os.environ["ASSISTIV_PLATFORM_ID"]
API_BASE = "https://api.assistiv.ai/v1"

platform_headers = {
    "Authorization": f"Bearer {PLATFORM_KEY}",
    "Content-Type": "application/json",
}


# Discover which models are actually enabled on this platform.
# Do NOT hardcode "gpt-4o" — different platforms have different provider keys
# configured, so the set of enabled models varies. Call this at startup and
# pass the chosen slug into your inference calls.
def list_available_models() -> list[str]:
    resp = requests.get(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {PLATFORM_KEY}"},
    )
    resp.raise_for_status()
    return [m["id"] for m in resp.json()["data"]]


# Create an end user when they sign up in your product.
# This is idempotent on external_id: re-running for the same user returns 200
# with the existing user + a freshly minted sk-eu_* key (instead of 409). Safe
# to call from a retry loop or a re-run signup webhook.
def create_assistiv_user(your_user_id: str, display_name: str) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        json={"external_id": your_user_id, "display_name": display_name},
    )
    resp.raise_for_status()
    data = resp.json()
    # 201 = newly created, 200 = re-provisioned (existing user, fresh key).
    # Save data["api_key"]["raw_key"] in your DB keyed by your_user_id.
    return {
        "assistiv_user_id": data["id"],
        "end_user_key": data["api_key"]["raw_key"],
    }


# Look up an existing end user by your stable ID. Cheap idempotent lookup,
# no key minted. Returns None if not found.
def find_assistiv_user(your_user_id: str) -> dict | None:
    resp = requests.get(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users",
        headers=platform_headers,
        params={"external_id": your_user_id},
    )
    resp.raise_for_status()
    rows = resp.json()["data"]
    return rows[0] if rows else None


# Cap their monthly spend
def create_user_budget(assistiv_user_id: str, max_usd: float) -> dict:
    resp = requests.post(
        f"{API_BASE}/platforms/{PLATFORM_ID}/end-users/{assistiv_user_id}/budget",
        headers=platform_headers,
        json={
            "max_usd": max_usd,
            "period": "monthly",
            "auto_replenish": True,
            "replenish_amount": max_usd,
        },
    )
    resp.raise_for_status()
    return resp.json()


# Plain inference with end-user key (no tools).
# `model` must be a slug returned by list_available_models(). Passing a slug
# that isn't enabled on this platform returns 404 model_not_found.
def chat(end_user_key: str, model: str, user_message: str) -> str:
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
            "max_tokens": 256,  # >= 16 for gpt-5 family
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


# (Optional, hosted MCP) Kick off the OAuth flow for one app.
# Call this from a backend route the user hits when they click "Connect GitHub"
# in your UI. Return the URL to your frontend; have the frontend set
# window.location.href = url. After approval, the user lands back at
# {your_base_url}/mcp/oauth-callback?app=github&status=connected.
def get_mcp_connect_url(end_user_key: str, app_slug: str) -> str:
    resp = requests.get(
        "https://mcp.assistiv.ai/oauth/authorize",
        params={"app": app_slug},  # requests URL-encodes the slug
        headers={"Authorization": f"Bearer {end_user_key}"},
        allow_redirects=False,
    )
    if resp.status_code != 302:
        raise RuntimeError(f"OAuth authorize failed: {resp.status_code}")
    return resp.headers["Location"]


# (Optional, hosted MCP) Inference with hosted MCP tools.
# Once the user has connected at least one MCP app via the OAuth flow above,
# every call to /v1/responses can attach the MCP tool item. The model will
# autonomously call GitHub/Slack/etc. server-side and return one final answer.
# This only works on /v1/responses, not /v1/chat/completions.
def chat_with_mcp(end_user_key: str, model: str, prompt: str) -> str:
    resp = requests.post(
        f"{API_BASE}/responses",
        headers={
            "Authorization": f"Bearer {end_user_key}",
            "Content-Type": "application/json",
        },
        json={
            "model": model,
            "input": prompt,
            "max_output_tokens": 256,  # >= 16 required
            "tools": [
                {
                    "type": "mcp",
                    "server_label": "assistiv",
                    "server_url": "https://mcp.assistiv.ai/mcp",
                    "authorization": f"Bearer {end_user_key}",
                    "require_approval": "never",  # required
                }
            ],
        },
    )
    resp.raise_for_status()
    data = resp.json()
    # data["output"] interleaves mcp_call items (one per executed tool) and
    # message items (the model's final assistant text). Pull the final text:
    for item in data["output"]:
        if item.get("type") == "message":
            return item["content"][0]["text"]
    return ""


# Same call via the OpenAI SDK (also works because the wire format is OpenAI-native):
#
# from openai import OpenAI
# client = OpenAI(api_key=end_user_key, base_url="https://api.assistiv.ai/v1")
# models = [m.id for m in client.models.list().data]  # discover first
# response = client.responses.create(
#     model=models[0],  # use a discovered slug, never hardcode
#     input=prompt,
#     tools=[{
#         "type": "mcp",
#         "server_label": "assistiv",
#         "server_url": "https://mcp.assistiv.ai/mcp",
#         "authorization": f"Bearer {end_user_key}",
#     }],
# )


# Full flow
if __name__ == "__main__":
    # Discover models once at startup
    available = list_available_models()
    if not available:
        raise RuntimeError("No models enabled — add a provider key in the dashboard")
    model = "gpt-4o-mini" if "gpt-4o-mini" in available else available[0]

    result = create_assistiv_user("user-001", "Alice")
    create_user_budget(result["assistiv_user_id"], 10.00)

    # Plain inference (no tools)
    reply = chat(result["end_user_key"], model, "Hello!")
    print(reply)

    # Later, after Alice has gone through the OAuth flow for GitHub:
    mcp_reply = chat_with_mcp(
        result["end_user_key"],
        model,
        "Create a GitHub issue in my-org/test-repo titled 'investigate flaky test'",
    )
    print(mcp_reply)
```

---

## Skills (private beta) — agent-fetchable code packages

Skills are folders an agent discovers, fetches, and runs at runtime. A skill is just a directory with a `SKILL.md` (YAML frontmatter for `name` + `description`) plus any helper files the author shipped (Python, shell, configs, prompts, examples). The agent fetches the entire folder and the agent's runtime decides what to do with it.

**Provisioning is admin-only.** All five REST endpoints and the four MCP tools return 403 unless `platform.skills_enabled = true` is set by an Assistiv admin. Email `founders@assistiv.ai` to provision. Self-serve is not available.

### Storage model

- **Public** — synced from admin-vetted GitHub repos. Sweeper clones each repo every ~30s, walks every `SKILL.md`, content-hashes the folder, dedupes (plugin-bundle repos with byte-identical copies collapse to one row), and indexes for hybrid search. Updates land within ~30s of a commit.
- **Private** — uploaded by a platform via `POST /skills`. Files inlined as base64. Encrypted at rest in GCS. Owned by the platform (`sk-plat_*` → `owner_platform_id`) or the end-user (`sk-eu_*` → `owner_end_user_id`, survives API key rotation).

### REST endpoints

All under `/v1/platforms/{platform_id}/skills`:

| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | `/search` | sk-plat_ or sk-eu_ | Hybrid retrieval (BM25 + pgvector + RRF). Query params: `query`, `top_k` (default 10, max 50), `visibility` (public/private/all). Returns matches with summaries, review aggregates, and recent comments. |
| GET | `/{skill_id}` | sk-plat_ or sk-eu_ | Returns the full skill folder (multi-file, base64). Optional query params: `target_path` (string — when set, response includes a plain-language `instruction` telling the agent runtime where to write files), `wrap_in_folder` (default true), `force` (default false). |
| POST | (base path) | sk-plat_ or sk-eu_ | Mints a private skill. Body: `{ name, description, files: [{path, content_base64}], idempotency_key? }`. 1-200 files, max 5 MB each. Path traversal rejected. Returns 201 with `{skill_id, queued_at}`. Embedding generates in background (fetchable immediately, vector-searchable in seconds). |
| DELETE | `/{skill_id}` | owner only | Soft-delete. Subsequent fetch returns 410. Public skills cannot be deleted via this endpoint (returns 403; admin-only takedown is internal). |
| POST | `/{skill_id}/reviews` | sk-eu_ only | Submit a 4-dim review. Body: `{ rating_accuracy, rating_value, rating_efficiency, rating_clarity }` each 1-5, plus optional `comment` (max 2000 chars). One review per (skill, end-user). Resubmits overwrite. Aggregates refresh atomically. |

### Quality model

There is no LLM judge. Quality comes from agent-submitted reviews along four fixed dimensions: **accuracy** (did it do what it said?), **value** (was the result useful?), **efficiency** (tokens, time, side effects within reason?), **clarity** (was SKILL.md clear enough first try?). Each dim is 1-5. The `find_skill` ranker scores each match as `rrf_score × (1 + REVIEW_BOOST × normalized_rating × log(review_count + 1))`, so well-reviewed skills boost on the next search but new skills are never punished — they just don't get the bonus until they have reviews.

### MCP tools (auto-registered when `skills_enabled = true`)

The mcp-service registers four tools on every per-user MCP URL:

- `find_skill(query, top_k?, visibility?)` — wraps `GET /skills/search`.
- `fetch_skill(skill_id, target_path?, wrap_in_folder?, force?)` — wraps `GET /skills/{id}`.
- `create_skill(name, description, files)` — wraps `POST /skills`.
- `submit_skill_review(skill_id, rating_accuracy, rating_value, rating_efficiency, rating_clarity, comment?)` — wraps `POST /skills/{id}/reviews`.

These appear automatically when an agent fetches its tool list; no extra configuration is needed once the platform is provisioned.

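
To make the ranking formula from the Quality model section concrete, here is a minimal sketch of the review boost. The value of `REVIEW_BOOST`, the mapping of a 1-5 mean rating onto 0-1, and the use of the natural logarithm are all assumptions for illustration; this guide does not publish the ranker's actual constants or normalization.

```python
import math

# Hypothetical constant: the guide names REVIEW_BOOST but does not give its value.
REVIEW_BOOST = 0.1


def normalized_rating(mean_rating: float) -> float:
    """Map a 1-5 mean rating onto 0-1 (assumed normalization)."""
    return (mean_rating - 1.0) / 4.0


def ranked_score(rrf_score: float, mean_rating: float, review_count: int) -> float:
    """rrf_score * (1 + REVIEW_BOOST * normalized_rating * log(review_count + 1))."""
    # With zero reviews log(1) == 0, so the multiplier is exactly 1: an
    # unreviewed skill keeps its base retrieval score and is never penalized.
    boost = REVIEW_BOOST * normalized_rating(mean_rating) * math.log(review_count + 1)
    return rrf_score * (1.0 + boost)
```

Because the multiplier never drops below 1, reviews can only lift a skill relative to its retrieval score, which matches the stated behavior that new skills miss the bonus rather than being pushed down.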
### Pricing

First **5,000 skill calls per platform per calendar month are free** (combined across find_skill, fetch_skill, create_skill, submit_skill_review). After 5k, **$0.0004 per call ($0.40 per 1k)** — about 2.5× cheaper than the MCP tool-call rate because search and FS reads are simpler to serve than tool execution. The rate is flat regardless of skill folder size: size is set by the source repo (admin-vetted GitHub or your own private upload), not by the caller, so fetching a 5MB skill costs the same per call as fetching a 50KB one. No wallet debit and no log row inside the free tier. Counter resets on the 1st of each month. Scale tier negotiates the rate and adds a dedicated catalog mirror for sustained 100M+ calls/month.

### Minimal usage example (TypeScript)

```typescript
// Assumes platform.skills_enabled = true and the end-user is provisioned.
// API_BASE and PLATFORM_ID are the constants from the complete example above.
const END_USER_KEY = "sk-eu_..."; // from your DB

// 1. Search for a skill
const search = await fetch(
  `${API_BASE}/platforms/${PLATFORM_ID}/skills/search?query=summarize+pdfs&top_k=5`,
  { headers: { Authorization: `Bearer ${END_USER_KEY}` } }
);
const { matches } = await search.json();
const top = matches[0];

// 2. Fetch the full skill folder, ask runtime to materialize at ./workspace/
const fetchRes = await fetch(
  `${API_BASE}/platforms/${PLATFORM_ID}/skills/${top.skill_id}?target_path=./workspace`,
  { headers: { Authorization: `Bearer ${END_USER_KEY}` } }
);
const skill = await fetchRes.json();
// skill.files = [{path, content_base64}, ...]
// skill.instruction tells the agent runtime exactly what to do

// 3.
// (After running the skill) submit a review
await fetch(
  `${API_BASE}/platforms/${PLATFORM_ID}/skills/${top.skill_id}/reviews`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${END_USER_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      rating_accuracy: 5,
      rating_value: 4,
      rating_efficiency: 4,
      rating_clarity: 5,
      comment: "Worked first try on a 30-page PDF.",
    }),
  }
);
```

For an MCP-using agent the same flow happens automatically — the model calls `find_skill`, then `fetch_skill`, then runs whatever the runtime materialized, then calls `submit_skill_review`. No custom integration code needed beyond provisioning.

---

## Environment Variables Reference

```bash
# Required in your platform backend
ASSISTIV_PLATFORM_KEY=sk-plat_...   # From assistiv.ai/dashboard
ASSISTIV_PLATFORM_ID=uuid           # From assistiv.ai/dashboard

# Optional — defaults shown
ASSISTIV_API_BASE=https://api.assistiv.ai/v1
ASSISTIV_MCP_BASE=https://mcp.assistiv.ai
```

---

## Troubleshooting Common Failures

| Symptom | Likely cause | First thing to check |
|---|---|---|
| `401 unauthorized` | Wrong key or typo in `Authorization` header | Re-copy the key from the dashboard; verify it starts with `sk-plat_` or `sk-eu_` |
| `402 payment_required` `code: wallet_insufficient` | Platform wallet empty | Dashboard → Wallet → Top up. `GET /v1/platforms/{id}/wallet` to verify balance |
| `402 payment_required` `code: budget_exhausted` | This end user's budget exhausted | `GET /v1/me/budget` → check `remaining_usd`. `POST /budget/topup` (with `Idempotency-Key` — Epic 1) or wait for period reset |
| `402 payment_required` `code: budget_suspended` | This end user's budget is paused via `is_suspended=true` | Admin action: `PATCH /budget { is_suspended: false }`. Distinct code from `budget_exhausted` — branch on `error.code` in your client |
| `403 forbidden` | Platform key used on an `/v1/me/*` endpoint, or end-user key used on `/end-users/*` CRUD | Review "Two key types" — each endpoint's `Auth:` line names which key type is required |
| `404 not_found` `code: model_not_found` | Hardcoded model slug not enabled on this platform | `GET /v1/models` to list actually-enabled slugs. Do NOT assume `gpt-4o` is universal |
| `409 conflict` on `POST /end-users` | **Should not happen** — that endpoint is idempotent | You're on an old backend build. Pull the latest, or use the `?external_id=` filter + POST pattern |
| `422 validation_error` | Request body shape wrong, required field missing, or value out of range | Read `error.message` — it names the specific field |
| `422 integer_below_min_value` on `max_output_tokens` | Value below OpenAI Responses API floor of 16 | Set `max_output_tokens >= 16` (or omit) |
| `422 unsupported require_approval value` on `/v1/responses` | `require_approval: "always"` in MCP tool item | Set `"never"` (or omit the field entirely) |
| `429 rate_limit_exceeded` | Sliding window counter exceeded | Check `error.denied_by`: `eu_rpm`/`eu_rpd`/`plat_rpm`/`plat_rpd` tells you which bucket. Respect `Retry-After` header |
| `500 internal_error` | Unhandled backend exception | Check your uvicorn (self-hosted) or support logs. If reproducible, file a bug with the request that caused it |
| `500 max_iterations_exceeded` (MCP) | Agent loop didn't converge in 10 iterations | The model is stuck in a tool-call loop. Try a different prompt, a larger model, or add clearer "stop when done" instructions |
| `503 mcp_unreachable` | MCP service down, end user has no connections, or tool list fetch failed | `GET mcp.assistiv.ai/health`; `GET mcp.assistiv.ai/connections` with the end-user key to confirm at least one is active |
| `200` but `output` is empty or only has `mcp_call` items | Small models (e.g. gpt-5.4-nano) sometimes return only the tool call with no wrapping assistant message | Look in `output[].mcp_call.output` — that's the actual tool result and the de facto answer |
| Wallet balance unchanged after a successful call | Debit is fire-and-forget (~1.5s delay) OR free tier hit (first 1k MCP calls/month or 5k skill calls/month) | Wait 2s and re-check. For MCP/skills calls, check whether you are still inside the monthly free allowance |
| Budget shows stale state after a PATCH | Budget is cached in Redis for 30s | Wait 30s, or if you direct-wrote to the DB in a test, flush `budget:{pid}:{euid}` manually |
| Rate limit override change doesn't take effect | Should invalidate automatically — if it doesn't, backend is outdated | Bounce uvicorn. See `src/assistiv_inference/ratelimit/config.py::invalidate_rate_limit_config_cache` |
| OAuth flow 400s with "No redirect URL configured" | `platform_app_configs.redirect_url` is NULL | Set it via the dashboard during MCP app activation, or direct-update the column |
| OAuth provider shows "redirect_uri is not associated" | Callback URL mismatch between what MCP service sends and what's registered at provider | See **Testing OAuth locally** in Step 5 — you likely need a separate dev OAuth app |
| `403 forbidden` on any `/skills/*` endpoint or skills MCP tool | `platform.skills_enabled = false` on this platform (admin not yet provisioned) | Email `founders@assistiv.ai` to enable. Self-serve provisioning is not available in private beta |
| `404 not_found` on `GET /skills/{id}` after a successful create | Embedding still generating in background, or wrong skill_id | Wait a few seconds and retry. Vector search lags BM25 by ~5s on a fresh skill. Confirm `skill_id` matches what `POST /skills` returned |
| `410 gone` on `GET /skills/{id}` | Skill was soft-deleted by its owner | Distinct from 404. The row exists with `is_deleted=true`; rolling back requires admin action |
| `502 bad_gateway` on `GET /skills/{id}` (public skill) | Local sync clone missing on disk (sweeper recovering or pod restarted mid-sync) | Wait ~30s for the next sweep iteration and retry. Persists past 2 sweeps means the sweeper itself is stuck — check `/healthz` |
| `413 payload_too_large` on `GET /skills/{id}` | Skill folder exceeds 20 MB inlined (rare; usually means a vendored binary) | Author needs to trim the folder. Files >5 MB are also rejected at create time |
| Skills MCP tools (`find_skill`, `fetch_skill`, etc.) don't appear in agent's tool list | `platform.skills_enabled = false` (most common) OR using a stale MCP URL cached before provisioning | Provision the platform, then refetch `/v1/me/mcp-config`. Tools auto-register at session creation |
| `find_skill` returns the same skill at 7 different paths | You're hitting an old DB state from before content_sha dedupe was deployed | Toggle the source repo's `is_active` flag in the admin dashboard to force a re-sync. Sync will collapse duplicates to one row |

---

## Key Constraints to Be Aware Of

1. **Platform key is server-side only.** Never expose `sk-plat_*` to browsers.
2. **End-user key is scoped.** `sk-eu_*` is safe client-side; it can only see its own user's data.
3. **`raw_key` is shown once.** Store it in your DB immediately on creation — cannot be retrieved later.
4. **Website setup is required.** Inference returns errors until you've added a provider key and topped up the wallet on the website.
5. **Wallet must have balance.** `402 payment_required` when the platform wallet is empty.
6. **OAuth kickoff is server-side.** `GET /oauth/authorize` needs the `Authorization` header; browsers can't send headers on redirects — proxy through your backend.
7. **All request/response fields are snake_case.** `external_id`, `max_usd`, `app_slug`, `raw_key`, etc.
   Except `mcp.assistiv.ai/connections` which returns `camelCase` (`appSlug`, `createdAt`) for historical reasons.
8. **Pagination is flat.** Response shape is `{ data, total, page, limit }` — no nested `pagination` object, no `totalPages`.
9. **Single-resource endpoints are unwrapped.** `POST /end-users` returns the end user directly, NOT `{ data: { ... } }`. Only list endpoints have the `data` wrapper.
10. **Budgets are USD, not tokens.** `max_usd`, `used_usd`, `remaining_usd`.
11. **MCP tool names.** Derived from the Pipedream `action.key` with dashes replaced by underscores (single underscore separators). Examples: `github_create_issue`, `github_get_current_user`, `slack_send_message`. The app slug is a naming convention, not a code-enforced prefix.
12. **Atomic billing.** Wallet and budget debits happen in one transaction — if either fails, both roll back. `DELETE` on a user cascades everywhere.
13. **MCP app activation is website-only.** You cannot activate, update, or deactivate an MCP app via API. Use `assistiv.ai/dashboard/mcp`. You CAN read the activated list via `GET /v1/platforms/{id}/mcp/apps`.
14. **Platform default rate limits are website-only.** Use the website to set defaults. Override per-user via the API.
15. **`POST /mcp` protocol has a required handshake.** Do not try to call `tools/call` cold — use an MCP SDK client or wait for the protocol walkthrough doc.
16. **Outbound webhooks are live (Epic 3).** Assistiv pushes real-time events to a URL you register via Dashboard → Webhooks or `POST /v1/platforms/{pid}/webhook-endpoints`. Event types: `budget.topped_up`, `budget.debited` (firehose, opt-in only), `budget.low_balance` (edge-triggered, auto re-arms on topup), `budget.suspended`, `budget.unsuspended`. Delivery is backed by Svix — HMAC-signed, retried with exponential backoff, replayable via a vended consumer-portal URL. Verify with the Svix SDK in your language.
The payload shape is hand-picked and stable; internal columns (`actor_key_id`, `request_fingerprint`, snapshot columns) never leak. See Step 6 for registration, signature verification snippets, and event payload examples.
17. **Idempotency-Key on money movers (Epic 1+2).** `POST /budget/topup`, `POST /budget/debit`, and `PATCH /budget` honor the `Idempotency-Key` header with strict Stripe-style semantics: same key + same body → replay cached ledger row; same key + different body → `409 Conflict`. Use a stable key (Stripe invoice ID, your internal payment ID) across retries, NOT a fresh UUID per retry.
18. **Manual budget debit + negative balances (Epic 2).** `POST /budget/debit` records chargebacks, refunds, and write-offs. It allows `remaining_usd` to go below zero, so you don't need a parallel "debt ledger". Inference still gates at `remaining_usd <= 0` (a user in debt can't make new requests), but the debit endpoint does not. Manual debits bypass `is_suspended`.
19. **Budget suspension (Epic 2).** `PATCH /budget { is_suspended: true }` pauses inference for a user without deleting the budget. Topups and manual debits still land; only inference is blocked with `402 budget_suspended` (distinct from `budget_exhausted`). Flip back to `false` to resume. Both transitions fire webhooks.
20. **Full budget ledger (Epic 1+2).** Every topup, debit (manual AND inference), opening balance, suspension, and config adjustment writes a row to `end_user_budget_transactions`. Query via `GET /v1/platforms/{pid}/end-users/{euid}/budget/transactions?since=&limit=200`. Row `type` is one of `opening`, `topup`, `debit`, `adjustment`. This is the canonical source for per-user spend reconciliation; inference logs are complementary for per-model breakdowns.
21. **Skills are admin-provisioned (private beta).** All `/skills/*` endpoints and the four skills MCP tools (`find_skill`, `fetch_skill`, `create_skill`, `submit_skill_review`) return 403 unless `platform.skills_enabled = true`.
Self-serve provisioning is not available; email `founders@assistiv.ai`. Public skill content is synced from admin-vetted GitHub repos every ~30s; you cannot register source repos via API.
22. **Skill review dimensions are fixed and end-user-only.** All four dims (`rating_accuracy`, `rating_value`, `rating_efficiency`, `rating_clarity`) are required, each scored 1-5. Platform keys (`sk-plat_*`) cannot submit reviews; only end-user keys (`sk-eu_*`) can. One review per (skill, end-user); resubmits overwrite the prior row.
23. **Skill ownership survives key rotation when end-user-owned.** A private skill created with `sk-eu_*` is owned by the `end_user_id`, not the API key, so rotating the key keeps the skill. Platform-owned skills are tied to `owner_platform_id`. Public skills have neither owner field set; their canonical source is the registered GitHub repo.
24. **Skill folder content is content-hashed and deduped.** The sync sweeper collapses byte-identical SKILL.md folders (across plugin-bundle repos, etc.) to one row per unique content. Ties break on the existing winner first and the lexicographically lowest path second, so `skill_id` is stable across syncs as long as content doesn't change.
25. **Skill fetch returns ALL files in the folder.** Unlike a single-file API, `GET /skills/{id}` returns every file the author shipped (SKILL.md + helpers, base64-encoded). The optional `target_path` query param triggers a plain-language `instruction` field telling the agent runtime where to write each file. Folder size cap: 20 MB total, 5 MB per file, 200 files max.
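
The `Idempotency-Key` semantics in constraint 17 can be illustrated with a small in-memory model. This is a sketch for reasoning about retries, not Assistiv's implementation: `IdempotencyStore`, `stable_idempotency_key`, the `amount_usd` field, the `topup-` key prefix, and the `applied` / `replayed` / `conflict` outcome labels are all hypothetical names chosen for the example. In the real API the conflict case surfaces as `409 Conflict`.

```python
import hashlib
import json


def stable_idempotency_key(payment_id: str) -> str:
    """Derive the key from a stable business ID (hypothetical format).

    The same payment always maps to the same key, so every retry of
    that payment reuses it instead of generating a fresh UUID.
    """
    return f"topup-{payment_id}"


class IdempotencyStore:
    """Toy model of same-key replay and same-key/different-body conflict."""

    def __init__(self):
        # key -> (body fingerprint, cached result of the first call)
        self._seen = {}

    @staticmethod
    def _fingerprint(body: dict) -> str:
        # Canonical JSON so key order does not change the fingerprint.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key: str, body: dict, apply):
        fingerprint = self._fingerprint(body)
        if key in self._seen:
            cached_fingerprint, cached_row = self._seen[key]
            if cached_fingerprint == fingerprint:
                return "replayed", cached_row  # no second side effect
            return "conflict", None  # the real API answers 409 Conflict
        row = apply(body)
        self._seen[key] = (fingerprint, row)
        return "applied", row


# Demo: one payment, retried once, then reused with a different body.
balance = {"usd": 0.0}


def apply_topup(body: dict) -> dict:
    balance["usd"] += body["amount_usd"]  # hypothetical field name
    return {"type": "topup", "amount_usd": body["amount_usd"]}


store = IdempotencyStore()
key = stable_idempotency_key("inv_123")
first = store.handle(key, {"amount_usd": 10.0}, apply_topup)
retry = store.handle(key, {"amount_usd": 10.0}, apply_topup)
conflict = store.handle(key, {"amount_usd": 25.0}, apply_topup)
```

In this model the retry returns the cached row and the balance is credited once. Had the client generated a new key for the retry, the store would have treated it as a new request and credited the balance twice, which is why the key should come from a stable payment identifier.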