Models

Choosing the Right ASI:One Model

The ASI:One family has three models. They share the same API surface and capability set, so switching between them is a one-line change. The differences are in reasoning depth, tool budget, response size, latency, and cost.

|                         | asi1             | asi1-ultra        | asi1-mini        |
|-------------------------|------------------|-------------------|------------------|
| Position                | Adaptive default | Deepest reasoning | Fastest response |
| Context Window          | 200,000 tokens   | 200,000 tokens    | 200,000 tokens   |
| Max Tool Calls per Turn | Standard         | Up to 500         | Standard         |
| Reasoning Depth         | Adaptive         | Deepest           | Light            |
| Latency Profile         | Balanced         | Slowest           | Fastest          |
| Cost Profile            | Mid              | Highest           | Lowest           |
| Streaming               | Supported        | Supported         | Supported        |
| OpenAI Compatibility    | Full SDK         | Full SDK          | Full SDK         |
| Chat Completions API    | Yes              | Yes               | Yes              |
| Responses API           | Yes              | Yes               | Yes              |

All three models work with the same API key. Switch by changing the model field in the request body.
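Because the request body is identical across the family, the model choice can be isolated in one place. A minimal sketch (the helper name `build_request` is illustrative, not part of the API):

```python
def build_request(model: str, user_message: str) -> dict:
    """Build a Chat Completions request body for any ASI:One model.

    The only field that changes between models is "model"; everything
    else keeps the same shape.
    """
    return {
        "model": model,  # "asi1", "asi1-ultra", or "asi1-mini"
        "messages": [{"role": "user", "content": user_message}],
    }

# The same body shape works for all three models:
for model in ("asi1", "asi1-ultra", "asi1-mini"):
    body = build_request(model, "Summarize this document.")
    assert body["model"] == model
```

Send this body to the Chat Completions endpoint with your usual HTTP client or the OpenAI SDK; the API key is the same for all three models.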


Decision tree

Use this flow to pick a starting model. You can always re-evaluate per-workload.

Step 1. Is your workload latency-critical? Real-time chat, voice, autocomplete, classification, or routing where every millisecond matters?

  • Yes → start with asi1-mini
  • No → continue to Step 2

Step 2. Does your workload involve long agentic runs, deep research, code audits, or any task where you currently chunk work across multiple turns to fit within the tool budget?

  • Yes → start with asi1-ultra
  • No → start with asi1

If you are still unsure, start with asi1. It adapts to the request and is the right call for the broadest set of workloads.
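The two-step flow above can be encoded directly as a routing helper, which is handy if you classify workloads programmatically. This is a sketch of the decision tree itself, not an API call:

```python
def pick_model(latency_critical: bool, long_agentic: bool) -> str:
    """Encode the decision tree: Step 1 (latency) before Step 2 (depth)."""
    if latency_critical:
        return "asi1-mini"   # Step 1: every millisecond matters
    if long_agentic:
        return "asi1-ultra"  # Step 2: long runs, deep research, big tool budget
    return "asi1"            # otherwise: the adaptive default

assert pick_model(latency_critical=True, long_agentic=False) == "asi1-mini"
assert pick_model(latency_critical=False, long_agentic=True) == "asi1-ultra"
assert pick_model(latency_critical=False, long_agentic=False) == "asi1"
```

Note that Step 1 wins when both are true: a workload that is both latency-critical and agentic is still routed to asi1-mini, matching the order of the flow above.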


When to use each model

asi1 - the adaptive default

Pick asi1 for:

  • General agent workflows
  • Chat applications
  • Routine tool calling
  • Mixed workloads where requests vary in complexity
  • Code generation and editing
  • Content drafting
  • Web3 application logic

This is the right starting point for most projects. The other two models exist for specific tradeoffs - latency and cost on one side, depth and tool budget on the other.

Read the family overview →

asi1-ultra - deepest reasoning, largest tool budget

Pick asi1-ultra for:

  • Long-running agentic tasks (research agents, coding agents, planning agents)
  • Multi-hop research and synthesis
  • Code review and contract analysis
  • Strategic planning with multiple constraints
  • Workloads where you currently split work across many turns to fit within the tool budget

The 500-tool-calls-per-turn budget is the headline differentiator. It is what makes long agentic flows possible without forcing your application to manage state across many requests.
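To make the tradeoff concrete, here is a simplified sketch of budget-aware dispatch. With asi1-ultra the tool loop runs server-side within one turn; this stub only illustrates why a 500-call budget lets a long plan finish without chunking (the function and error handling are hypothetical, not part of the API):

```python
MAX_TOOL_CALLS = 500  # asi1-ultra's per-turn budget

def run_agent_turn(plan: list[str]) -> int:
    """Dispatch a plan of tool calls, enforcing the per-turn budget.

    With a smaller budget, a long plan would have to be split across
    multiple turns, with your application carrying state in between.
    """
    calls = 0
    for _step in plan:
        if calls >= MAX_TOOL_CALLS:
            raise RuntimeError("tool budget exhausted; split the task across turns")
        calls += 1  # stand-in for dispatching one tool call
    return calls

# A 300-step research plan fits comfortably in a single asi1-ultra turn.
assert run_agent_turn(["fetch"] * 300) == 300
```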

Read the asi1-ultra docs →

asi1-mini - fastest, lightest, lowest cost

Pick asi1-mini for:

  • Real-time chat and voice
  • Classification and routing
  • Autocomplete and inline suggestions
  • Simple tool calls (small number of well-defined actions)
  • High-volume workloads where cost dominates

The same 200,000-token context window as the rest of the family means you can pass large inputs - what you trade for speed is reasoning depth and response size, not the ability to read context.

Read the asi1-mini docs →


Switching models

Models share the same API surface. Switching is a single field change.

```diff
 {
-  "model": "asi1",
+  "model": "asi1-ultra",
   "messages": [...]
 }
```

This means you can:

  • Use different models for different routes in the same application
  • A/B test models against each other for the same workload
  • Cascade: try asi1-mini first, fall back to asi1 or asi1-ultra for cases where the smaller model’s confidence is low
  • Promote workloads from asi1 to asi1-ultra as they grow in complexity, or demote to asi1-mini as they stabilize
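The cascade pattern can be sketched in a few lines. Here `ask` is a hypothetical wrapper around your client that returns an answer plus a confidence signal; how you derive that signal (self-report, log-probabilities, a validator) is up to your workload and is not prescribed by the API:

```python
ESCALATION_ORDER = ("asi1-mini", "asi1", "asi1-ultra")

def cascade(ask, prompt: str) -> tuple[str, str]:
    """Try models cheapest-first; escalate while confidence is low.

    `ask(model, prompt)` must return (answer, confident: bool).
    """
    answer = ""
    for model in ESCALATION_ORDER:
        answer, confident = ask(model, prompt)
        if confident:
            return model, answer
    # Out of models: keep the strongest model's answer.
    return ESCALATION_ORDER[-1], answer

# Demo with a stub: only asi1 and above are "confident" on this prompt.
def fake_ask(model, prompt):
    return f"{model}-answer", model != "asi1-mini"

assert cascade(fake_ask, "hard question") == ("asi1", "asi1-answer")
```

The same structure supports the promote/demote pattern: change `ESCALATION_ORDER` per route as workloads grow in complexity or stabilize.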

FAQ

Are all three models tool-calling capable? Yes. All three share the same tool-calling API. The difference is the per-turn tool budget - asi1-ultra supports up to 500 tool calls per turn for long agentic flows.

Do I need separate API keys? No. The same key works across all three models. Billing is based on usage.

Do all three support the OpenAI SDK? Yes. All three are fully compatible with the OpenAI Chat Completions and Responses APIs.

What about streaming? All three models support streaming responses.

How big is the context window? 200,000 tokens for all three models in the family.

Which is the cheapest? asi1-mini, then asi1, then asi1-ultra. Cost scales with reasoning depth and response size.

Which is the fastest? asi1-mini. It is purpose-built for low-latency workloads.

Which is the most capable? asi1-ultra. It produces the deepest reasoning and supports the longest agentic runs.

Can I cascade between models? Yes. A common pattern is to try asi1-mini first for routine cases and only escalate to asi1 or asi1-ultra when the smaller model is uncertain or the task expands in scope.