# Create completion

> POST /v1/completions

Legacy OpenAI-compatible text completions endpoint for prompt-based clients and
benchmark harnesses. Use [chat completions](/api-reference/chat-completions) for
new applications.

This route is available for Dynamo-backed models. Placeholder catalog rows are
rejected with `model_not_found` instead of returning a synthetic completion.
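
A client may want to tell this rejection apart from other failures. The sketch below assumes an OpenAI-style error body of the form `{"error": {"code": "model_not_found", ...}}`. This page does not document the error shape, so treat the field names, the helper `is_model_not_found`, and the sample message as assumptions to check against a real response.

```python
import json


def is_model_not_found(body_text):
    """Return True if an error body reports `model_not_found`.

    Assumption: errors look like {"error": {"code": "...", ...}} (OpenAI style).
    """
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return False  # not JSON at all, so not a structured API error
    error = body.get("error") if isinstance(body, dict) else None
    return isinstance(error, dict) and error.get("code") == "model_not_found"


# Hypothetical error body, for illustration only.
sample_error = '{"error": {"code": "model_not_found", "message": "unknown model"}}'
found = is_model_not_found(sample_error)
```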

## Request body

<ParamField body="model" type="string" required>
  Model id from the [catalog](/getting-started/models). Example:
  `moonshotai/kimi-k2.6:appliedcompute`.
</ParamField>

<ParamField body="prompt" type="string | string[] | integer[] | integer[][]" required>
  Prompt to complete: a string, an array of strings, an array of token ids, or
  an array of token-id arrays.
</ParamField>
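
The four accepted `prompt` shapes can be illustrated with a short Python sketch. The helper name `build_payload` is hypothetical and the token ids are made-up placeholders, not real tokenizer output. Treating the array forms as a batch of prompts follows the OpenAI convention and is an assumption here.

```python
import json

MODEL = "moonshotai/kimi-k2.6:appliedcompute"


def build_payload(prompt, max_tokens=32):
    """Return a JSON request body for POST /v1/completions (hypothetical helper)."""
    return json.dumps({"model": MODEL, "prompt": prompt, "max_tokens": max_tokens})


# string: one text prompt
single_text = build_payload("Write one sentence about fast inference:")
# string[]: several text prompts
many_texts = build_payload(["First prompt", "Second prompt"])
# integer[]: one prompt given as token ids (placeholder values)
single_ids = build_payload([101, 2023, 2003])
# integer[][]: several prompts given as token ids (placeholder values)
many_ids = build_payload([[101, 2023], [101, 2003]])
```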

<ParamField body="stream" type="boolean" default="false">
  When `true`, responses are streamed as Server-Sent Events and end with
  `data: [DONE]`.
</ParamField>
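
A minimal sketch of consuming such a stream in Python follows. It assumes each event carries a single `data:` line whose JSON chunk exposes `choices[].text`, as in OpenAI-compatible streams; the chunk shape is not specified on this page, and `iter_completion_text` and the sample lines are illustrative only.

```python
import json


def iter_completion_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("text")  # assumed chunk field
            if text:
                yield text


# Hand-written sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"index": 0, "text": " Fast"}]}',
    "",
    'data: {"choices": [{"index": 0, "text": " inference"}]}',
    "",
    "data: [DONE]",
]
streamed = "".join(iter_completion_text(sample))
```

With a real HTTP client, `lines` would be the response body decoded line by line.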

<ParamField body="max_tokens" type="integer">
  Maximum output tokens. Clamped to the model's `max_output_length`.
</ParamField>

<ParamField body="temperature" type="number" default="1">
  Sampling temperature, `0` to `2`. Lower values are more deterministic.
</ParamField>

<ParamField body="top_p" type="number" default="1">
  Nucleus sampling: sample only from the smallest set of tokens whose cumulative
  probability reaches `top_p`. Adjust either `temperature` or `top_p`, not both.
</ParamField>

<ParamField body="stop" type="string | string[]">
  One or more sequences at which generation stops.
</ParamField>

<ParamField body="frequency_penalty" type="number" default="0">
  `-2.0` to `2.0`. Penalize tokens by their frequency in the response so far.
</ParamField>

<ParamField body="presence_penalty" type="number" default="0">
  `-2.0` to `2.0`. Penalize tokens that have already appeared in the response so
  far, regardless of how often.
</ParamField>

<ParamField body="ignore_eos" type="boolean">
  Engine extension used by fixed-length benchmark harnesses. When supported by
  the selected upstream, the model continues until `max_tokens` or another stop
  condition is reached.
</ParamField>
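
For a fixed-length benchmark run, the request and a follow-up check might look like the sketch below. Both helper names are hypothetical. Comparing `usage.completion_tokens` with the requested length is one way to notice an upstream that did not honor `ignore_eos`, and it assumes no stop sequence fired and that `max_tokens` was not clamped.

```python
def fixed_length_payload(model, prompt, output_tokens):
    """Build a request body that asks for exactly `output_tokens` tokens."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": output_tokens,
        "ignore_eos": True,  # engine extension; only effective if the upstream supports it
    }


def reached_target_length(response, output_tokens):
    """Return True if the response reports exactly the requested output length."""
    usage = response.get("usage", {})
    return usage.get("completion_tokens") == output_tokens


payload = fixed_length_payload(
    "moonshotai/kimi-k2.6:appliedcompute", "Benchmark prompt", 128
)
```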

## Example

```bash
curl https://api.cogito.decart.ai/v1/completions \
  -H "Authorization: Bearer $COGITO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "moonshotai/kimi-k2.6:appliedcompute",
    "prompt": "Write one sentence about fast inference:",
    "max_tokens": 32
  }'
```
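
The same request can be sketched in Python with only the standard library. The helper names `build_request` and `send` are illustrative, not part of any official client. Building the request is kept separate from sending it so the payload can be inspected without network access.

```python
import json
import os
import urllib.request

API_URL = "https://api.cogito.decart.ai/v1/completions"


def build_request(api_key, prompt, max_tokens=32):
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {
        "model": "moonshotai/kimi-k2.6:appliedcompute",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON body (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request(
    os.environ.get("COGITO_API_KEY", ""),
    "Write one sentence about fast inference:",
)
```

Calling `send(request)` performs the actual HTTP call; `urllib` raises `urllib.error.HTTPError` for non-2xx statuses.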

## Response

```json
{
  "id": "cmpl_...",
  "object": "text_completion",
  "created": 1714521600,
  "model": "moonshotai/kimi-k2.6",
  "choices": [
    {
      "index": 0,
      "text": " Fast inference keeps agent loops tight.",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 8,
    "total_tokens": 16
  }
}
```
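
Reading the fields a client usually needs from this body is straightforward. The sketch below parses the sample response shown above; the helper name `summarize` is hypothetical.

```python
import json

# The sample response from this page, verbatim.
response_text = """
{
  "id": "cmpl_...",
  "object": "text_completion",
  "created": 1714521600,
  "model": "moonshotai/kimi-k2.6",
  "choices": [
    {
      "index": 0,
      "text": " Fast inference keeps agent loops tight.",
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 8,
    "total_tokens": 16
  }
}
"""


def summarize(body):
    """Return the first choice's text, its finish reason, and total token usage."""
    choice = body["choices"][0]
    return {
        "text": choice["text"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": body["usage"]["total_tokens"],
    }


summary = summarize(json.loads(response_text))
```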

## Headers on every response

* `x-request-id` — opaque ID. Log it. We trace it through every layer.
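
One way to log the header from a Python client is sketched below. The logger name, the helper `request_id_from`, and the example id are made up for illustration. The lookup relies on `email.message.Message`, the base class of the headers object that `urllib` responses expose, which matches header names case-insensitively.

```python
import logging
from email.message import Message

logger = logging.getLogger("cogito.client")  # hypothetical logger name


def request_id_from(headers):
    """Log and return the `x-request-id` header value, or None if it is absent."""
    request_id = headers.get("x-request-id")
    if request_id is not None:
        logger.info("cogito request id: %s", request_id)
    return request_id


# Stand-in for `response.headers` from urllib; the id is a made-up placeholder.
example_headers = Message()
example_headers["X-Request-Id"] = "req_example_123"
example_id = request_id_from(example_headers)
```

With `urllib`, the same lookup works on the `headers` attribute of an `HTTPError`, so failed requests can be logged as well.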
