How billing works
Cogito charges per input token and per output token, with separate rates for each. Output generally costs 1.5–3× the input rate per token, because output tokens are generated sequentially on GPU/Trainium and consume more compute. For each request you’re billed for the input tokens at the input rate plus the output tokens at the output rate.
Context cache discount
When you re-send the same prefix tokens (a long system prompt, a reference document, a multi-turn conversation), Cogito automatically caches them and bills cached input tokens at 50% of the standard input rate. No code change is required. To verify that the discount was applied, check the response: every response includes usage.cached_input_tokens alongside prompt_tokens and completion_tokens.
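The billing rule and the cache discount described above can be sketched as a small cost estimator. This is an illustration, not Cogito's billing code: the `Rates` class, the `estimate_cost` function, and the example prices are hypothetical. It also assumes that `cached_input_tokens` is counted as part of `prompt_tokens`, which this page does not state.

```python
from dataclasses import dataclass

# Cached input tokens are billed at 50% of the standard input rate.
CACHED_INPUT_DISCOUNT = 0.5


@dataclass(frozen=True)
class Rates:
    """Per-token prices in USD (hypothetical placeholders, not a price list)."""
    input_per_token: float
    output_per_token: float


def estimate_cost(usage: dict, rates: Rates) -> float:
    """Estimate the cost of one request from its usage fields.

    Assumption: cached_input_tokens is a subset of prompt_tokens, so the
    uncached portion is prompt_tokens - cached_input_tokens.
    """
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    cached = usage.get("cached_input_tokens", 0)
    if not 0 <= cached <= prompt:
        raise ValueError("cached_input_tokens must be between 0 and prompt_tokens")

    uncached = prompt - cached
    return (
        uncached * rates.input_per_token
        + cached * rates.input_per_token * CACHED_INPUT_DISCOUNT
        + completion * rates.output_per_token
    )


if __name__ == "__main__":
    # Placeholder prices: $1 per million input tokens, $2 per million output tokens.
    rates = Rates(input_per_token=1e-6, output_per_token=2e-6)
    usage = {"prompt_tokens": 10_000, "completion_tokens": 500, "cached_input_tokens": 8_000}
    print(f"estimated cost: ${estimate_cost(usage, rates):.6f}")
```

With these placeholder prices, caching 8,000 of the 10,000 prompt tokens brings the estimate down from about $0.011 to about $0.007. Substitute the real rates for your model before relying on the numbers.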
Fine-tuned variants
Your fine-tunes are priced at the same per-token rate as the base model. We don’t charge a premium for serving them.
Hard spend caps
Set a maximum monthly spend in Dashboard → Billing. Once you hit the cap, further requests return an error instead of a completion.
Free credits
New accounts start with $5 in free credits. No card is required to spend them. A card is required only when you top up beyond the free tier.
Plans
| Plan | Pricing | When to pick it |
|---|---|---|
| Free | $0 + free credits | Evaluation, prototyping |
| Pro | Pay as you go | Production traffic |
| Enterprise | Contract | VPC isolation, P99 SLA, BAA, SSO |