# Errors

> Deterministic error schema. Always includes a request_id.

Cogito returns a single, deterministic error shape on every failure:

```json
{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_found",
    "message": "Model 'foo' is not available.",
    "request_id": "req_..."
  }
}
```

Always log `request_id`. We trace it through the gateway, scheduler, and inference cluster. Support can pinpoint your request from this single ID.
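
As an illustration, the error shape above can be parsed and logged in a few lines. This is a minimal sketch using only the Python standard library; the `CogitoError` dataclass, the `parse_error` and `log_error` helpers, and the logger name are hypothetical names chosen for this example, not part of any Cogito SDK.

```python
import json
import logging
from dataclasses import dataclass

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("cogito_client")


@dataclass(frozen=True)
class CogitoError:
    """The four fields of the documented error object."""
    type: str
    code: str
    message: str
    request_id: str


def parse_error(body: str) -> CogitoError:
    """Parse a JSON error response body into a CogitoError."""
    err = json.loads(body)["error"]
    return CogitoError(
        type=err["type"],
        code=err["code"],
        message=err["message"],
        request_id=err["request_id"],
    )


def log_error(err: CogitoError) -> str:
    """Log the fields needed to trace a failed request and return the log line."""
    line = (
        f"cogito error type={err.type} code={err.code} "
        f"request_id={err.request_id}"
    )
    logger.error(line)
    return line
```

Keeping `request_id` in a structured log line makes it easy to copy into a support ticket later.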

## Error types

`error.type` is set per HTTP status by the gateway. The mapping is
deterministic — the table below is the full set you can encounter.

| Type                    | HTTP          | When                                                                                                                                                                                                                  |
| ----------------------- | ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `invalid_request_error` | 400, 404      | Malformed JSON, missing required field, invalid params, unknown model slug. All 404s file here (both body-level and gateway-level missing routes) — distinguish via `error.code` (`model_not_found` vs. `not_found`). |
| `authentication_error`  | 401           | Missing / revoked API key                                                                                                                                                                                             |
| `insufficient_quota`    | 402           | Balance exhausted (out of credit)                                                                                                                                                                                     |
| `permission_denied`     | 403           | Key doesn't have access to a model or feature                                                                                                                                                                         |
| `rate_limit_error`      | 429           | Token / request rate limit hit, **or** monthly hard spend cap reached                                                                                                                                                 |
| `api_error`             | 500–502, 504+ | Cogito problem; retry safe                                                                                                                                                                                            |
| `service_unavailable`   | 503           | Billing not configured, or inference cluster temporarily degraded; retry with backoff                                                                                                                                 |
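
Because the mapping is deterministic, a client can check that a response's `error.type` matches its HTTP status and decide whether a retry is worthwhile. The sketch below transcribes the table; the function names are hypothetical. It rests on two assumptions: "504+" is read as 504 through 599, and `spend_cap_reached` is treated as non-retryable because an immediate retry cannot succeed until the cap is raised or the month rolls over.

```python
from typing import Optional

# Status-to-type mapping transcribed from the table above.
_EXACT = {
    400: "invalid_request_error",
    404: "invalid_request_error",
    401: "authentication_error",
    402: "insufficient_quota",
    403: "permission_denied",
    429: "rate_limit_error",
    503: "service_unavailable",
}

# Types the table or the Retry-After guidance marks as worth retrying.
_RETRYABLE_TYPES = {"rate_limit_error", "api_error", "service_unavailable"}


def expected_type(status: int) -> Optional[str]:
    """Return the error.type expected for an HTTP status, or None if not listed."""
    if status in _EXACT:
        return _EXACT[status]
    # Assumption: "500-502, 504+" covers every remaining 5xx status.
    if 500 <= status <= 599:
        return "api_error"
    return None


def is_retryable(error_type: str, error_code: str = "") -> bool:
    """Decide whether retrying the same request could plausibly succeed."""
    # Assumption: a hard spend cap will not clear within a retry window.
    if error_code == "spend_cap_reached":
        return False
    return error_type in _RETRYABLE_TYPES
```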

## Common codes

`error.code` is the machine-readable handle. Always log it alongside
`request_id`. Grouped by category:

**Auth**

* `invalid_api_key` — API key in the Bearer header is invalid or revoked (401)
* `missing_authorization` — no Bearer header at all (401)

**Billing**

* `insufficient_quota` — balance ≤ 0 (402). The wire code matches
  `error.type` — both are `insufficient_quota`. Renamed from the
  earlier `insufficient_balance` for OpenAI parity.
* `spend_cap_reached` — month-to-date spend has hit your hard cap (429).
  Renamed from the earlier `spend_capped`. Raise the cap in the
  dashboard or wait for the month to roll over.
* `billing_not_configured` — server misconfiguration on our side (503)

**Validation**

* `invalid_request_error` — generic body-shape failure (400)
* `model_not_found` — slug typo or model retired (404)

**Upstream / cluster**

* `backend_unavailable` — gateway-side error before the request reaches
  the upstream, e.g. missing backend credentials (500, surfaces as
  `error.type: api_error`)
* `upstream_unavailable` — TCP/connection failure to the upstream
  cluster (502, surfaces as `error.type: api_error`)
* `upstream_<status>` — verbatim upstream HTTP status pass-through
  (e.g. `upstream_503` when the inference cluster itself returned 503)
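
One way to act on `error.code` is a small dispatcher that folds the renamed codes into their current names and extracts the status from `upstream_<status>`. This is a sketch under stated assumptions: the action labels (`fix_credentials`, `add_credit`, and so on) and all function names are hypothetical, and handling the older names is only a precaution for old logs or stored responses; the list above does not say they are still emitted.

```python
import re
from typing import Optional

# Older code names mapped to their current replacements (from the list above).
LEGACY_CODES = {
    "insufficient_balance": "insufficient_quota",
    "spend_capped": "spend_cap_reached",
}

# Matches pass-through codes such as "upstream_503".
_UPSTREAM_STATUS = re.compile(r"^upstream_(\d{3})$")


def normalize_code(code: str) -> str:
    """Map a legacy code to its current name; leave other codes unchanged."""
    return LEGACY_CODES.get(code, code)


def upstream_status(code: str) -> Optional[int]:
    """Return the upstream HTTP status embedded in the code, if there is one."""
    match = _UPSTREAM_STATUS.match(code)
    return int(match.group(1)) if match else None


def suggested_action(code: str) -> str:
    """Return a hypothetical action label for a given error.code."""
    code = normalize_code(code)
    if code in {"invalid_api_key", "missing_authorization"}:
        return "fix_credentials"
    if code == "insufficient_quota":
        return "add_credit"
    if code == "spend_cap_reached":
        return "raise_cap_or_wait"
    if code in {"invalid_request_error", "model_not_found", "not_found"}:
        return "fix_request"
    if code in {"billing_not_configured", "backend_unavailable", "upstream_unavailable"}:
        return "retry_with_backoff"
    if upstream_status(code) is not None:
        return "retry_with_backoff"
    return "unknown"
```

Falling back to `"unknown"` keeps the client from crashing on a code this page does not list.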

## Retry-after

`429` and `503` responses include a `Retry-After` header (seconds). Honor it. Exponential backoff on top of `Retry-After` is fine; ignoring it will get you rate-limited harder.
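
A minimal sketch of combining `Retry-After` with exponential backoff follows. The function names and the `base`, `cap`, and `jitter` parameters are assumptions for illustration. The wait is never shorter than the server's `Retry-After` value; backoff can only lengthen it.

```python
import random
from typing import Mapping, Optional


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Read Retry-After (in seconds) case-insensitively; None if absent or malformed."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return None
    return None


def retry_delay(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.0,
) -> float:
    """Seconds to sleep before retry number `attempt` (0-based).

    Exponential backoff is capped at `cap`; the cap does not apply to the
    server-provided Retry-After value, which always acts as a lower bound.
    """
    backoff = min(cap, base * (2 ** attempt))
    retry_after = parse_retry_after(headers) or 0.0
    delay = max(retry_after, backoff)
    # Optional random jitter spreads out clients that failed at the same time.
    return delay + random.uniform(0.0, jitter)
```

A caller would sleep for `retry_delay(response_headers, attempt)` seconds before resending, and stop after a fixed number of attempts.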
