Holo3.1 35B-A3B

Holo3.1 35B-A3B is H Company's OpenAI-compatible computer-use vision-language model for web, desktop, and mobile agent workflows. It accepts text and image input, provides a 65,536-token context window and function calling, and ships as open weights under Apache 2.0.

Platform: Replicate

Vision-Language Model, Computer Use, Function Calling, Local Agents

H Company Models API / open-weight local checkpoints

$0.25/1M input tokens and $1.80/1M output tokens for holo3-1-35b-a3b; free tier is rate-limited at 10 RPM

Commercial

🚀Function Overview

A computer-use VLM for agent loops that need to inspect screenshots, reason over UI state, return tool calls or structured JSON, and run through H Company's hosted API or open-weight local checkpoints.

Key Features

OpenAI-compatible H Company Models API endpoint
Text and image input with text output
65,536-token context window and up to 5 images per request
Native function calling on the Holo3.1 API model
Structured output mode for agent-loop JSON responses
Apache 2.0 open weights with BF16, FP8, NVFP4, and Q4 GGUF availability across the Holo3.1 family

Use Cases

• Browser, desktop, and mobile computer-use agents
• UI grounding and click-coordinate prediction from screenshots
• Agent harnesses that need structured JSON or native tool calls
• Local or edge computer-use experiments using Holo3.1 checkpoints
• Cost-sensitive automation prototypes using the rate-limited free tier

⚙️Input Parameters

messages

array

OpenAI-compatible chat messages. Holo supports text and image content for computer-use observations.
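
As an illustration, a screenshot observation can be packed into a single user message. This is a minimal sketch that assumes the endpoint accepts OpenAI-style content parts with base64 data URLs; the helper name `screenshot_message` is hypothetical, and the 5-image cap comes from the Key Features list.

```python
import base64

MAX_IMAGES_PER_REQUEST = 5  # limit stated under Key Features


def screenshot_message(instruction: str, png_screenshots: list[bytes]) -> dict:
    """Build one OpenAI-style user message with text plus screenshot images."""
    if len(png_screenshots) > MAX_IMAGES_PER_REQUEST:
        raise ValueError(f"at most {MAX_IMAGES_PER_REQUEST} images per request")
    content = [{"type": "text", "text": instruction}]
    for png in png_screenshots:
        encoded = base64.b64encode(png).decode("ascii")
        # Assumption: images travel as base64 data URLs, as in the OpenAI chat format.
        content.append(
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encoded}"}}
        )
    return {"role": "user", "content": content}


# Placeholder bytes stand in for a real PNG screenshot.
message = screenshot_message("Click the Submit button.", [b"placeholder-png-bytes"])
```

The resulting dictionary would go into the `messages` array alongside any earlier turns.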

model

string

Use holo3-1-35b-a3b for the Holo3.1 35B-A3B Models API endpoint.

tools

array

Optional OpenAI-style tool definitions for native function-calling mode on Holo3.1.
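
A tool definition in the OpenAI style might look like the sketch below. The `click` tool, its description, and its `x`/`y` parameters are hypothetical and chosen only to illustrate the shape; they are not taken from H Company's documentation.

```python
# Hypothetical tool: the name "click" and its parameters are illustrative only.
click_tool = {
    "type": "function",
    "function": {
        "name": "click",
        "description": "Click at a pixel coordinate on the current screenshot.",
        "parameters": {
            "type": "object",
            "properties": {
                "x": {"type": "integer", "description": "Horizontal pixel position."},
                "y": {"type": "integer", "description": "Vertical pixel position."},
            },
            "required": ["x", "y"],
        },
    },
}

# Request body that offers the tool to the model.
request_body = {
    "model": "holo3-1-35b-a3b",
    "messages": [{"role": "user", "content": "Open the settings menu."}],
    "tools": [click_tool],
}
```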

extra_body.structured_outputs

object

Optional JSON schema constraint for structured-output agent loops.
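
The sketch below shows one way a schema constraint could be prepared and a reply checked against it. The exact shape of `structured_outputs` is not given here, so nesting the schema under a `json` key is an assumption to verify against H Company's documentation; the action schema and the sample reply are also hypothetical.

```python
import json

# Hypothetical action schema for one step of an agent loop.
action_schema = {
    "type": "object",
    "properties": {
        "action": {"type": "string", "enum": ["click", "type", "scroll", "done"]},
        "x": {"type": "integer"},
        "y": {"type": "integer"},
        "text": {"type": "string"},
    },
    "required": ["action"],
}

# Assumption: the schema is nested under a "json" key; the real shape may differ.
extra_body = {"structured_outputs": {"json": action_schema}}


def parse_action(raw: str) -> dict:
    """Decode a model reply and check the fields the schema marks as required."""
    action = json.loads(raw)
    missing = [key for key in action_schema["required"] if key not in action]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return action


# Hand-written sample reply for illustration; not real model output.
parsed = parse_action('{"action": "click", "x": 120, "y": 48}')
```

Checking required fields on the client side is a cheap safeguard even when the server enforces the schema.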

extra_body.chat_template_kwargs.enable_thinking

boolean

Optional Holo-specific reasoning-channel toggle; H Company recommends enabling reasoning for agent loops and disabling it for single-shot grounding.
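
The recommendation above can be expressed as a small helper. This is a sketch: the mode labels `agent_loop` and `grounding` are invented here, and the merge step assumes the `extra_body` fields end up at the top level of the JSON body, which is how the OpenAI Python SDK handles its `extra_body` argument.

```python
def thinking_options(mode: str) -> dict:
    """Return the extra_body fragment for a mode.

    "agent_loop" and "grounding" are labels invented for this sketch.
    """
    if mode not in ("agent_loop", "grounding"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Reasoning on for multi-step agent loops, off for single-shot grounding.
    return {"chat_template_kwargs": {"enable_thinking": mode == "agent_loop"}}


def to_http_body(base: dict, extra_body: dict) -> dict:
    """Merge extra_body keys into the top level of the request body."""
    return {**base, **extra_body}


base = {
    "model": "holo3-1-35b-a3b",
    "messages": [{"role": "user", "content": "Locate the search box."}],
}
body = to_http_body(base, thinking_options("grounding"))
```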

reasoning_effort

string

Optional reasoning-effort level, such as low, medium, or high, that sets how much the model plans before taking UI actions.

💡Usage Examples

Example 1

Input Parameters

{
  "model": "holo3-1-35b-a3b",
  "messages": [
    {
      "role": "user",
      "content": "In one sentence, what is a computer-use agent?"
    }
  ],
  "reasoning_effort": "medium"
}
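
The parameters above could be sent with the Python standard library roughly as follows. This is a sketch under assumptions: the base URL and API key are placeholders, and the `/chat/completions` path and Bearer authentication are inferred from the OpenAI-compatible description rather than confirmed here.

```python
import json
import urllib.request

# Assumptions: both values below are placeholders, not real ones.
BASE_URL = "https://models.example.invalid/v1"
API_KEY = "YOUR_API_KEY"

payload = {
    "model": "holo3-1-35b-a3b",
    "messages": [
        {"role": "user", "content": "In one sentence, what is a computer-use agent?"}
    ],
    "reasoning_effort": "medium",
}

# Build the POST request without sending it.
request = urllib.request.Request(
    url=f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Sending is left commented out because the placeholder URL does not resolve:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```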

Output Results

A chat completion response from the H Company Models API. For screen-control workflows, send screenshots as image inputs and follow H Company's agent-loop or element-localization conventions.
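
When tools are supplied, the response may carry tool calls instead of plain text. The sketch below reads them from a hand-written sample in the OpenAI chat-completion shape; the sample values and the `extract_tool_calls` helper are hypothetical, and it assumes the API returns tool calls in that shape.

```python
import json

# Hand-written sample in the OpenAI chat-completion shape; not real model output.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {"name": "click", "arguments": '{"x": 120, "y": 48}'},
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ]
}


def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (tool name, decoded arguments) pairs from the first choice."""
    message = response["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # Arguments arrive as a JSON string in the OpenAI format, so decode them.
        calls.append((function["name"], json.loads(function["arguments"])))
    return calls


tool_calls = extract_tool_calls(sample_response)
```

An agent harness would execute each returned call, capture a fresh screenshot, and send it back as the next observation.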

Technical Specifications

Hardware Type: H Company Models API / open-weight local checkpoints
Commercial Use: Supported
Pricing: $0.25/1M input tokens and $1.80/1M output tokens for holo3-1-35b-a3b; free tier is rate-limited at 10 RPM
Platform: Replicate
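
For budgeting, the listed prices translate into a per-request estimate as sketched below. The token counts in the example are hypothetical, and the function name is illustrative.

```python
INPUT_USD_PER_MILLION = 0.25   # listed price per 1M input tokens
OUTPUT_USD_PER_MILLION = 1.80  # listed price per 1M output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical agent step: 40,000 input tokens and 2,000 output tokens.
cost = estimate_cost_usd(40_000, 2_000)
print(f"${cost:.4f}")  # $0.0136
```

At the free tier's 10 RPM limit, a client can send at most one request every 6 seconds on average.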

Related Keywords

Holo 3.1, Holo3.1, holo3-1-35b-a3b, H Company Models API, computer-use agent model, vision-language model, local GUI agent, function calling model

Related Models

Bielik 1.5B v3 Instruct

Bielik-1.5B-v3-Instruct is a generative text model featuring 1.6 billion parameters. It is the result of a collaboration between the open-science/open-source project SpeakLeash and the High Performance Computing (HPC)

Cordia-A6 Text Generation Model

A model for generating text sequences based on input prompts and adjustable parameters.

Claude Sonnet 4

Claude Sonnet 4 is a significant upgrade to 3.7, delivering superior coding and reasoning while responding more precisely to your instructions.