DeepSeek V4 Pro
DeepSeek V4 Pro is a DeepSeek API model listed with 1M context, 384K maximum output, thinking mode, JSON output, tool calls, OpenAI-format and Anthropic-format endpoints, and cache-aware token pricing.
DeepSeek V4 Pro became a high-signal model and pricing topic because long coding-agent and agent-harness sessions are sensitive to input cache hits, output cost, context length, and provider-specific API behavior.
DeepSeek API Docs are the factual source for model ID, base URLs, context length, max output, thinking mode, JSON output, tool calls, beta completion features, concurrency limits, and token pricing. Hacker News and Reddit discussions in the week of May 20-27, 2026 show user demand around whether the 75% V4 Pro discount becomes the new price floor and how that affects coding-agent cost.
- Evaluate DeepSeek API pricing for long-context chat and agent workflows.
- Compare cache-hit and cache-miss token costs before running coding-agent loops.
- Use tool calls, JSON output, and thinking mode in DeepSeek API experiments.
- Screen DeepSeek V4 Pro against GPT, Claude, Gemini, and Qwen model choices.
DeepSeek lists deepseek-v4-pro alongside deepseek-v4-flash in its Models & Pricing table. The page gives both OpenAI-format and Anthropic-format base URLs, 1M context length, 384K maximum output, thinking mode support, JSON output, tool calls, chat prefix completion, FIM completion in non-thinking mode, and V4 Pro pricing fields.
- Model ID: deepseek-v4-pro.
- API surfaces: https://api.deepseek.com and https://api.deepseek.com/anthropic.
- Pricing fields: cache-hit input, cache-miss input, output tokens, and concurrency limit.
DeepSeek says V4 Pro pricing will be officially adjusted to one quarter of the original price after the 75% discount promotion ends on 2026-05-31 15:59 UTC. That matters for agent loops because cached input and output tokens can dominate the cost of long sessions, especially when tools, context, and review iterations are repeated.
This candidate warrants both an entity page and a model-directory record. The entity page explains the pricing and agent-workflow significance, while /models/deepseek-v4-pro keeps the structured model ID, category, IO, context, output, tool, pricing, and endpoint fields readers need for selection.
Hacker News and Reddit are useful here for demand and wording: users are asking whether DeepSeek has created a new low-cost floor for long-context agent work. Those discussions should not override DeepSeek pricing docs, and users should recheck the official pricing page before committing spend.
The API-selection fields that make DeepSeek V4 Pro comparable with other model endpoints.
Long-running harnesses are where cache behavior, tool calling, and output cost become operational decisions.
DeepSeek V4 Pro is not local, but it is often evaluated as a low-cost hosted escalation or provider-native option.
DeepSeek-native coding agent that makes cache-aware API behavior visible in a terminal workflow.
High-end Anthropic coding and agent model used in cost and quality comparisons.
OpenAI frontier model used as a long-context and coding-agent comparison point.
Google Flash-tier model positioned around agentic coding, multimodal input, and long-horizon workflows.
DeepSeek V4 Pro FAQ
Page-level questions for DeepSeek V4 Pro.
Is DeepSeek V4 Pro getting a permanent price cut?+
Yes, according to the DeepSeek Models & Pricing page, V4 Pro pricing will be officially adjusted to one quarter of the original price after the 75% discount promotion ends on 2026-05-31 at 15:59 UTC. Always verify the live pricing page before budgeting production usage.
Should DeepSeek V4 Pro be a model-directory page or only an explainer?+
DeepSeek V4 Pro should be both. It has explainer demand because of the pricing and agent-cost discussion, and it has enough official structured fields for a model-directory record: model ID, context length, max output, supported features, base URLs, pricing, and concurrency limit.
Why does cache-hit pricing matter for coding agents?+
Coding agents repeatedly send project context, tool results, plans, diffs, and review instructions. If the provider cache hits, repeated input can be much cheaper; if it misses, long sessions can become expensive. That makes cache policy a practical harness decision rather than a minor billing detail.