DeepSeek V4 Pro

DeepSeek V4 Pro is a DeepSeek API model with a 1M-token context window and a 384K-token maximum output. It supports thinking mode, JSON output, tool calls, and cache-based input pricing, and it is reachable through OpenAI-format or Anthropic-format base URLs.

Platform: Replicate

Language Model, Reasoning, Tool Calling, Long Context

0 runs

DeepSeek API

$0.003625/1M cache-hit input tokens, $0.435/1M cache-miss input tokens, $0.87/1M output tokens after the 75% V4 Pro adjustment

Commercial

🚀Function Overview

A DeepSeek API language model for long-context chat, reasoning, tool-using workflows, and cost-sensitive agent or coding loops that can benefit from cache-hit pricing.

Key Features

•1M token context length in the official DeepSeek model table
•384K maximum output in the official DeepSeek model table
•Thinking mode with non-thinking mode switching guidance
•JSON output and tool calls
•Chat prefix completion beta and FIM completion in non-thinking mode
•OpenAI-format and Anthropic-format API base URLs

Use Cases

•Long-context chat and agent workflows
•Coding-agent sessions that need tool calls and cache-aware cost control
•Structured JSON output generation
•API experiments comparing DeepSeek pricing against other frontier or long-context models

⚙️Input Parameters

messages (array)

Chat messages sent to the DeepSeek API through the OpenAI-format or Anthropic-format endpoints.

thinking_mode (string)

DeepSeek documents support for both thinking and non-thinking modes, with thinking enabled by default.

tools (array)

Optional tool definitions for tool-calling workflows.

response_format (object)

Optional JSON output controls when structured output is needed.
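
As a rough illustration of how these parameters fit together, the sketch below assembles a request body in Python. It is not taken from the DeepSeek documentation: the helper `build_request`, the `get_weather` tool, the OpenAI-style function-tool layout, and the `{"type": "json_object"}` value for `response_format` are all assumptions based on the OpenAI-format convention. `thinking_mode` is left out because this page does not list its accepted values and thinking is described as enabled by default.

```python
import json


def build_request(prompt, tools=None, want_json=False):
    """Assemble a chat request body from the parameters listed on this page.

    Hypothetical helper for illustration; it only builds a dict and sends nothing.
    """
    body = {
        "model": "deepseek-v4-pro",
        "messages": [{"role": "user", "content": prompt}],
    }
    if tools:
        # Only include tool definitions when the caller supplies some.
        body["tools"] = tools
    if want_json:
        # Assumption: OpenAI-format JSON mode; confirm against the official docs.
        body["response_format"] = {"type": "json_object"}
    return body


# Hypothetical tool definition in the OpenAI function-tool shape (an assumption).
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_request(
    "Return JSON describing the weather in Warsaw.",
    tools=[get_weather_tool],
    want_json=True,
)
print(json.dumps(payload, indent=2))
```

Keeping the optional fields out of the body unless they are requested makes it easier to compare a plain chat call against a tool-calling or JSON-output call.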

💡Usage Examples

Example 1

Input Parameters

{
  "model": "deepseek-v4-pro",
  "messages": [
    {
      "role": "user",
      "content": "Summarize the tradeoffs of cached input pricing for a long coding-agent session."
    }
  ],
  "thinking_mode": "default"
}

Output Results

A chat completion response from the DeepSeek API. Verify the current request and response schema in the official DeepSeek API documentation before production use.
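
Since the response schema is not reproduced on this page, the following is only a sketch of how a reply might be unpacked, assuming the OpenAI-format chat completion shape (`choices[0].message`, with tool-call arguments delivered as a JSON string). The function `extract_reply` and the hand-written `sample_response` are hypothetical and are not real API output.

```python
import json


def extract_reply(response):
    """Pull the assistant text and any tool calls out of a chat completion dict.

    Assumes the OpenAI-format response shape; verify against the official docs.
    """
    message = response["choices"][0]["message"]
    calls = []
    # "tool_calls" may be missing or None when the model answers with text only.
    for call in message.get("tool_calls") or []:
        function = call["function"]
        calls.append(
            {
                "name": function["name"],
                # Arguments are assumed to arrive as a JSON-encoded string.
                "arguments": json.loads(function["arguments"]),
            }
        )
    return {"content": message.get("content"), "tool_calls": calls}


# Hand-written example in the assumed shape, for illustration only.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": '{"city": "Warsaw"}',
                        },
                    }
                ],
            }
        }
    ]
}

reply = extract_reply(sample_response)
print(reply)
```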

Technical Specifications

Hardware Type: DeepSeek API
Run Count: 0
Commercial Use: Supported
Pricing: $0.003625/1M cache-hit input tokens, $0.435/1M cache-miss input tokens, $0.87/1M output tokens after the 75% V4 Pro adjustment
Platform: Replicate
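
To make the listed prices concrete, here is a small cost estimator in Python. It is a sketch under stated assumptions: all three listed prices are treated as US dollars per 1M tokens, the rates are copied from this page rather than fetched from DeepSeek, and the function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Prices per 1M tokens as listed on this page (assumed to be USD).
PRICE_PER_MILLION = {
    "cache_hit_input": Decimal("0.003625"),
    "cache_miss_input": Decimal("0.435"),
    "output": Decimal("0.87"),
}


def estimate_cost(cache_hit_tokens, cache_miss_tokens, output_tokens):
    """Return the estimated cost of one request at the listed prices."""
    total = (
        PRICE_PER_MILLION["cache_hit_input"] * cache_hit_tokens
        + PRICE_PER_MILLION["cache_miss_input"] * cache_miss_tokens
        + PRICE_PER_MILLION["output"] * output_tokens
    )
    # Prices are quoted per 1M tokens, so scale the weighted sum down.
    return total / Decimal(1_000_000)


# A 1M-token prompt with 90% cache hits, plus 10K output tokens.
mostly_cached = estimate_cost(900_000, 100_000, 10_000)
# The same prompt with no cache hits at all.
uncached = estimate_cost(0, 1_000_000, 10_000)
print(mostly_cached, uncached)
```

At these rates the mostly cached request comes to about $0.055 and the uncached one to about $0.444, which is why long agent or coding sessions that reuse a stable prompt prefix are the cases most likely to benefit from cache-hit pricing.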
