GLM 5.2

GLM 5.2 is Z. Ready to experience the power of AI? Start your journey here!

Platform: Replicate

ReasoningCoding AssistantTool CallingLong ContextMCP

0 runs

z.ai cloud / Cloudflare Workers AI

Cloudflare Workers AI: input $1.40 per M, output $4.40 per M, cached input $0.26 per M

Commercial

🚀Function Overview

GLM 5. Ready to experience the power of AI? Start your journey here!

Key Features

1M base context window in Z.AI documentation and text-input/output focus
Up to 128K max output capacity in base model documentation
Thinking mode for deeper reasoning steps in long workflows
Built-in function-calling/tool-calling and streaming behavior
Structured output and MCP-oriented integrations

Use Cases

•Long-horizon coding and software engineering support
•Agent workflows requiring function calls and structured output
•Provider-aware pricing and token-budget comparisons
•MCP-aware orchestration and model routing
•One-shot app, UI, and game prototyping where users compare result quality against Opus/Fable-class models
•Hosted open-model routing when users care about access control, cached-input pricing, and avoiding single-provider lock-in
•Community-informed trials in OpenCode, Ollama cloud, Codex-style clients, and other coding harnesses

⚙️Input Parameters

model

string

The model id value, such as glm-5.2.

messages

array

Chat message array in OpenAI-compatible format for reasoning and coding workflows.

tools

array

Optional tool definitions for function-calling workflows.

thinking

string

Optional thinking mode switch used by Z.AI and provider clients.

response_format

object

Optional response format for structured output.

stream

boolean

Whether to stream partial outputs.

💡Usage Examples

Example 1

Input Parameters

{
  "model": "glm-5.2",
  "messages": [
    {
      "role": "user",
      "content": "Draft a migration plan for upgrading a monorepo from JavaScript to TypeScript under a canary rollout."
    }
  ],
  "thinking": "high",
  "stream": true
}

Output Results

GLM 5.2 returns a staged migration plan with package graph mapping, risk zones, validation gates, and fallback paths, while keeping the output suitable for structured tool execution.

Example 2

Input Parameters

{
  "model": "glm-5.2",
  "messages": [
    {
      "role": "user",
      "content": "Build a single-file prototype for a small browser game and explain which mechanics were completed."
    }
  ],
  "thinking": "max",
  "stream": true
}

Output Results

GLM 5.2 is a practical candidate for one-shot coding prototypes, but compare the generated mechanics, runtime behavior, token cost, and harness settings rather than relying on benchmark rank alone.

Quick Actions

Use NowView Documentation

Technical Specifications

Hardware Type: z.ai cloud / Cloudflare Workers AI
Run Count: 0
Commercial Use: Supported
Pricing: Cloudflare Workers AI: input $1.40 per M, output $4.40 per M, cached input $0.26 per M
Platform: Replicate

Related Keywords

GLM 5.2glm-5.2Z.AI GLM 5.2GLM 5.2 APIGLM 5.2 reasoning model@cf/zai-org/glm-5.2GLM 5.2 OpenCodeGLM 5.2 OllamaGLM 5.2 local deploymentGLM 5.2 pricing

Related Models

Bielik 1.5B v3 Instruct

Bielik-1.5B-v3-Instruct is a generative text model featuring 1.6 billion parameters. It is result of collaboration between the open-science/open-souce project SpeakLeash and the High Performance Computing (HPC)

Cordia-A6 Text Generation Model

A model for generating text sequences based on input prompts and adjustable parameters.

Claude Sonnet 4

Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions