The cheap workhorse with a 1M-token window. Built for high-volume pipelines where the bill matters as much as the answer.
Slug: deepseek-v4-flash
Parameters: Mid-tier MoE
Context: 1,000,000 tokens (1M)
Throughput: 70 tokens/sec
TTFT: 220 ms
License: DeepSeek License
Pricing: cogito.decart.ai/models/deepseek-v4-flash
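
The throughput and TTFT figures can be combined into a rough per-request latency estimate. The sketch below assumes that the 70 tokens/sec figure is steady-state output throughput for a single request and that TTFT does not grow with prompt length; neither assumption is stated on this page, and the function name is illustrative only.

```python
# Figures copied from the spec list; how they were measured is an assumption.
TTFT_SECONDS = 0.220       # time to first token (220 ms)
TOKENS_PER_SECOND = 70.0   # assumed steady-state output throughput


def estimate_latency_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate for one response: TTFT plus decode time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 700, 2000):
        print(f"{n:>5} output tokens: ~{estimate_latency_seconds(n):.1f} s")
```

Under these assumptions a 700-token summary takes roughly 10 seconds, so for high-volume pipelines the total turnaround depends more on how many requests run concurrently than on single-request speed.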

Best for

  • High-volume RAG with long retrieved contexts
  • Document and codebase summarization
  • Synthetic data generation
  • Anywhere $/token matters more than peak intelligence
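# Assumes the OpenAI Python SDK and an OpenAI-compatible endpoint (set OPENAI_BASE_URL and OPENAI_API_KEY).
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Summarize: ..."}],
)
print(response.choices[0].message.content)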
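
For the long-context RAG use case, retrieved chunks still have to fit inside the 1M-token window with room left for the answer. The following is a minimal sketch of greedy packing, not a documented recipe: `approx_tokens`, `pack_chunks`, the 4-characters-per-token ratio, and the 8,000-token reserve are all hypothetical, and a real pipeline should count tokens with the model's actual tokenizer.

```python
from typing import List

CONTEXT_WINDOW_TOKENS = 1_000_000  # context size from the spec list
CHARS_PER_TOKEN = 4                # crude heuristic, NOT the model's tokenizer


def approx_tokens(text: str) -> int:
    """Estimate token count by rounding len(text) / CHARS_PER_TOKEN up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_chunks(
    chunks: List[str],
    reserve_tokens: int = 8_000,
    window_tokens: int = CONTEXT_WINDOW_TOKENS,
) -> List[str]:
    """Keep chunks in ranked order until the estimated budget is used up.

    reserve_tokens is held back for the instructions and the model's reply.
    """
    budget = window_tokens - reserve_tokens
    packed: List[str] = []
    used = 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break  # stop at the first chunk that would overflow the budget
        packed.append(chunk)
        used += cost
    return packed


if __name__ == "__main__":
    retrieved = ["a" * 400, "b" * 400, "c" * 400]  # about 100 tokens each
    kept = pack_chunks(retrieved, reserve_tokens=0, window_tokens=250)
    print(len(kept))
```

The packed chunks can then be joined into the user message passed to `client.chat.completions.create`. Stopping at the first overflow preserves retrieval ranking; skipping oversized chunks and continuing would fill the window more tightly at the cost of reordering.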