asi1-mini
asi1-mini is the speed-optimized model in the ASI:One family. It is designed for workloads where latency and cost are the primary constraints: real-time chat, voice assistants, classification, autocomplete, and high-volume routing.
asi1-mini shares the full 200,000-token context window with the rest of the family, so you can give it the same context you would give asi1. What you trade for speed is reasoning depth and response size, not the ability to read large inputs.
If you are unsure whether you need it, start with asi1 and downgrade to asi1-mini when:
- You are running high-volume, short-turn workloads where every millisecond matters
- The task is well-defined and bounded - classify this, summarize that, extract these fields
- You want a cheaper option for paths that do not need the deepest reasoning
For tasks that involve long agentic runs, code audits, or deep research, see asi1-ultra instead.
Quickstart
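As a starting point, here is a minimal single-turn request sketch in Python, assuming an OpenAI-compatible chat completions API. The base URL and auth header shape are assumptions; check the official Quickstart for the authoritative values.

```python
# Minimal asi1-mini request sketch, assuming an OpenAI-compatible
# chat completions endpoint. The API_URL below is an assumption.
import json
import os
import urllib.request

API_URL = "https://api.asi1.ai/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str) -> dict:
    """Build the JSON body for a single-turn asi1-mini request."""
    return {
        "model": "asi1-mini",
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str) -> str:
    """Send one prompt and return the assistant's reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['ASI1_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

With an `ASI1_API_KEY` in the environment, `chat("Hello")` sends the request and returns the reply.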
Specifications
asi1-mini is tuned for low-latency, short-output workloads. It is the right pick when you want fast responses and you do not need the deeper reasoning of asi1 or the long agentic runs of asi1-ultra.
What asi1-mini is for
Real-time chat and voice
Interactive workloads where the user is waiting for a response. The faster the model returns the first token, the better the experience. asi1-mini is built for this.
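For interactive use, the metric that matters is time to first token, which means streaming the response. The sketch below assumes an OpenAI-style `"stream": true` flag that yields server-sent-event lines; `parse_sse_line` and `first_token_latency` are illustrative helpers, not part of any official SDK.

```python
# Sketch: measuring time-to-first-token over a streamed response,
# assuming OpenAI-style SSE lines of the form `data: {...}`.
import json
import time

def parse_sse_line(line: str):
    """Extract the text delta from one `data: {...}` SSE line, if any."""
    if not line.startswith("data: ") or line == "data: [DONE]":
        return None
    chunk = json.loads(line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content")

def first_token_latency(lines, start):
    """Seconds from `start` until the first non-empty content delta."""
    for line in lines:
        if parse_sse_line(line):
            return time.monotonic() - start
    return None
```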
Classification and routing
Short input, short output, well-defined task. Triage support tickets, route messages to the right queue, label incoming events, detect intent. asi1-mini is purpose-built for this profile.
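A routing call in this profile can be as small as a system prompt with a fixed label set. The sketch below assumes an OpenAI-style message format; the queue labels and prompt wording are illustrative.

```python
# Sketch of a triage request: classify a support ticket into one of a
# few queues. The label set and prompt pattern are illustrative.
QUEUES = ["billing", "technical", "account", "other"]

def triage_request(ticket_text: str) -> dict:
    """Build a request body that asks asi1-mini for a single label."""
    system = (
        "Classify the support ticket into exactly one of: "
        + ", ".join(QUEUES)
        + ". Reply with the label only."
    )
    return {
        "model": "asi1-mini",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": ticket_text},
        ],
        "temperature": 0,  # deterministic labels for routing
        "max_tokens": 5,   # the answer is a single short label
    }
```

Capping `max_tokens` and pinning `temperature` to 0 keeps both latency and output variance low, which is what routing paths want.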
Autocomplete and inline suggestions
When the model is part of a tight feedback loop - the user types, the model suggests, the user keeps typing - latency dominates the experience. asi1-mini keeps the loop tight.
Simple tool calls
Workloads where the model decides between a small set of well-defined actions and the right answer is rarely ambiguous. asi1-mini handles these efficiently.
High-volume, cost-sensitive workloads
When you are running thousands or millions of requests per day and each one is bounded in scope, the cost difference adds up. asi1-mini lets you scale workloads that would not be economical otherwise.
When to use a different model
asi1-mini is not the best choice for every workload.
A useful heuristic: if you find yourself wishing the response were longer or more thorough, you have outgrown asi1-mini for that workload. Move it to asi1 (or asi1-ultra if depth is the issue).
Tool calling with asi1-mini
asi1-mini supports the full tool-calling API. It is well suited for workflows that branch between a small number of well-defined tools.
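A small branching setup might look like the sketch below, assuming the tool-calling API follows the common OpenAI-style function schema (the Tool Calling guide has the authoritative format). Both tools here are hypothetical.

```python
# Sketch: two well-defined tools for asi1-mini to branch between,
# assuming an OpenAI-style tool schema. Both tools are hypothetical.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "lookup_order",
            "description": "Fetch an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "open_ticket",
            "description": "Open a support ticket with a short summary.",
            "parameters": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
            },
        },
    },
]

def tool_request(user_message: str) -> dict:
    """Build a request that lets the model choose between the tools."""
    return {
        "model": "asi1-mini",
        "messages": [{"role": "user", "content": user_message}],
        "tools": TOOLS,
        "tool_choice": "auto",  # let the model pick, or answer directly
    }
```

Keeping the tool set small and the descriptions unambiguous is what makes this profile a good fit for asi1-mini.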
For schemas, multi-turn flows, and parallel tool calls, see the Tool Calling guide.
For long agentic flows that benefit from a much larger per-turn tool budget, use asi1-ultra.
Migration from asi1
Switching from asi1 to asi1-mini requires only changing the model field in the request body.
Recommended approach:
- Identify the workloads where the current asi1 response is consistently short, focused, and well-bounded
- A/B test those workloads against asi1-mini and measure quality
- Roll out asi1-mini for the workloads where quality holds and latency or cost improves materially
- Keep asi1 for workloads that vary in complexity or that benefit from the adaptive default
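The one-line nature of the switch can be sketched as follows; the request builder is illustrative, but it shows that only the `model` field differs between the two calls.

```python
# Sketch: switching models changes only the `model` field.
def make_request(model: str, prompt: str) -> dict:
    """Illustrative request builder shared by both models."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

before = make_request("asi1", "Summarize this ticket.")
after = make_request("asi1-mini", "Summarize this ticket.")

# Collect the fields that differ between the two request bodies.
diff = {k for k in before if before[k] != after[k]}
```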
Related
- ASI:One Models - the family overview
- Model Selection - side-by-side comparison
- asi1-ultra - the depth-optimized counterpart
- Tool Calling