# Generate a chat message
`POST /api/chat` — Generate the next message in a conversation between a user and an assistant.
**Server:** `http://localhost:11434`
## Request
| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name |
| `messages` | ChatMessage[] | yes | Chat history (array of message objects) |
| `tools` | ToolDefinition[] | no | Function tools the model may call |
| `format` | `"json"` \| object | no | Response format — `"json"` or a JSON schema |
| `options` | ModelOptions | no | Runtime generation options (see generate.md) |
| `stream` | boolean | no | Stream partial responses (default: `true`) |
| `think` | boolean \| string | no | Enable thinking output (`true`/`false` or `"high"`, `"medium"`, `"low"`) |
| `keep_alive` | string \| number | no | Keep-alive duration (e.g. `"5m"` or `0` to unload) |
### ChatMessage
| Field | Type | Required | Description |
|---|---|---|---|
| `role` | string | yes | `"system"`, `"user"`, `"assistant"`, or `"tool"` |
| `content` | string | yes | Message text |
| `images` | string[] | no | Base64-encoded images (multimodal) |
| `tool_calls` | ToolCall[] | no | Tool calls from the model |
### ToolDefinition
```json
{
"type": "function",
"function": {
"name": "function_name",
"description": "What the function does",
"parameters": { /* JSON Schema */ }
}
}
```
### ToolCall
```json
{
"function": {
"name": "function_name",
"arguments": { /* key-value args */ }
}
}
```
## Response (non-streaming, `stream: false`)
| Field | Type | Description |
|---|---|---|
| `model` | string | Model name |
| `created_at` | string | ISO 8601 timestamp |
| `message.role` | string | Always `"assistant"` |
| `message.content` | string | Assistant reply text |
| `message.thinking` | string | Thinking trace (when `think` enabled) |
| `message.tool_calls` | ToolCall[] | Tool calls requested by assistant |
| `done` | boolean | Whether the response finished |
| `done_reason` | string | Why it finished |
| `total_duration` | integer | Total time (nanoseconds) |
| `load_duration` | integer | Model load time (nanoseconds) |
| `prompt_eval_count` | integer | Input token count |
| `prompt_eval_duration` | integer | Prompt eval time (nanoseconds) |
| `eval_count` | integer | Output token count |
| `eval_duration` | integer | Token generation time (nanoseconds) |
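Putting those fields together, a non-streaming response looks roughly like the following. All values here are illustrative, not actual output; `message.thinking` and `message.tool_calls` only appear when applicable:

```json
{
  "model": "gemma3",
  "created_at": "2024-07-22T20:33:28.123456Z",
  "message": {
    "role": "assistant",
    "content": "The sky is blue because of Rayleigh scattering..."
  },
  "done": true,
  "done_reason": "stop",
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}
```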
## Streaming Response (`stream: true`, default)
Returns `application/x-ndjson`, one JSON object per line. Each chunk carries a fragment of the reply in `message.content`; the final chunk has `done: true` along with the duration and token-count stats.
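For illustration, a stream might look like this (timestamps and stats are made up). Concatenating the `message.content` fragments yields the full reply:

```json
{"model":"gemma3","created_at":"2024-07-22T20:33:28.290154Z","message":{"role":"assistant","content":"The"},"done":false}
{"model":"gemma3","created_at":"2024-07-22T20:33:28.311412Z","message":{"role":"assistant","content":" sky"},"done":false}
{"model":"gemma3","created_at":"2024-07-22T20:33:32.860407Z","message":{"role":"assistant","content":""},"done":true,"done_reason":"stop","total_duration":4883583458,"load_duration":1334875,"prompt_eval_count":26,"prompt_eval_duration":342546000,"eval_count":282,"eval_duration":4535599000}
```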
## Examples
### Basic (streaming)
```bash
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{"role": "user", "content": "why is the sky blue?"}
]
}'
```
### Non-streaming
```bash
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{"role": "user", "content": "why is the sky blue?"}
],
"stream": false
}'
```
### Structured output
```bash
curl http://localhost:11434/api/chat -d '{
"model": "gemma3",
"messages": [
{"role": "user", "content": "What are the populations of the United States and Canada?"}
],
"stream": false,
"format": {
"type": "object",
"properties": {
"countries": {
"type": "array",
"items": {
"type": "object",
"properties": {
"country": {"type": "string"},
"population": {"type": "integer"}
},
"required": ["country", "population"]
}
}
},
"required": ["countries"]
}
}'
```
### Tool calling
```bash
curl http://localhost:11434/api/chat -d '{
"model": "qwen3",
"messages": [
{"role": "user", "content": "What is the weather today in Paris?"}
],
"stream": false,
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The location, e.g. San Francisco, CA"
},
"format": {
"type": "string",
"description": "celsius or fahrenheit",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location", "format"]
}
}
}
]
}'
```
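The model does not execute tools itself: when the response contains `message.tool_calls`, you run the function in your own code, then append both the assistant's message and a `"role": "tool"` message carrying the result, and call the endpoint again so the model can compose a final answer. A sketch of that follow-up request (the weather value is made up):

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the weather today in Paris?"},
    {"role": "assistant", "content": "", "tool_calls": [
      {"function": {"name": "get_current_weather", "arguments": {"location": "Paris", "format": "celsius"}}}
    ]},
    {"role": "tool", "content": "22 degrees celsius"}
  ],
  "stream": false
}'
```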
### Thinking
```bash
curl http://localhost:11434/api/chat -d '{
"model": "gpt-oss",
"messages": [
{"role": "user", "content": "What is 1+1?"}
],
"think": "low"
}'
```