# Thinking / Reasoning Traces
Thinking-capable models emit a separate `thinking` field containing their reasoning trace, distinct from the final answer in `content`/`response`.
## Enabling Thinking
Set the `think` field on `/api/chat` or `/api/generate` requests:
| `think` value | Behavior |
|---|---|
| `true` | Enable thinking (most models) |
| `false` | Disable thinking |
| `"low"` / `"medium"` / `"high"` | GPT-OSS only — controls trace length; `true`/`false` is ignored for this model |
Thinking is **enabled by default** for supported models.
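As a sketch of how a client might map the `think` values above, here is a small hypothetical helper. The boolean-to-effort mapping for GPT-OSS (`True` → `"medium"`) is an assumption made for illustration, not documented API behavior; the docs only say GPT-OSS takes an effort string instead of a boolean.

```python
def build_chat_request(model, messages, think=True):
    """Build an /api/chat request body, mapping `think` per model.

    Hypothetical helper: gpt-oss expects "low"/"medium"/"high" rather
    than a boolean, so a boolean is converted to an assumed default
    effort level here (an illustration, not guaranteed behavior).
    """
    if model.startswith("gpt-oss") and isinstance(think, bool):
        # Assumed mapping: True -> "medium", False -> "low"
        think = "medium" if think else "low"
    return {"model": model, "messages": messages, "think": think}

# Most models accept a boolean directly:
req = build_chat_request("qwen3", [{"role": "user", "content": "hi"}])
print(req["think"])  # True

# GPT-OSS gets an effort string instead:
req = build_chat_request("gpt-oss", [{"role": "user", "content": "hi"}])
print(req["think"])  # medium
```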
## Response Fields
| Endpoint | Thinking field | Answer field |
|---|---|---|
| `/api/chat` | `message.thinking` | `message.content` |
| `/api/generate` | `thinking` | `response` |
### Non-streaming example
```bash
curl http://localhost:11434/api/chat -d '{
"model": "qwen3",
"messages": [{"role": "user", "content": "How many r's in strawberry?"}],
"think": true,
"stream": false
}'
```
Response includes both `message.thinking` (reasoning) and `message.content` (final answer).
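Because the two endpoints place these fields differently (per the table above), a client may want a single accessor. A minimal sketch, assuming the response has already been parsed from JSON into a dict:

```python
def extract_thinking_and_answer(body, endpoint="/api/chat"):
    """Return (thinking, answer) from a non-streaming response body.

    /api/chat nests both fields under "message"; /api/generate puts
    "thinking" and "response" at the top level.
    """
    if endpoint == "/api/chat":
        msg = body.get("message", {})
        return msg.get("thinking", ""), msg.get("content", "")
    return body.get("thinking", ""), body.get("response", "")

chat_body = {
    "model": "qwen3",
    "message": {"role": "assistant",
                "thinking": "Count the r's in strawberry...",
                "content": "There are 3 r's."},
}
print(extract_thinking_and_answer(chat_body))
# ('Count the r's in strawberry...', "There are 3 r's.")
```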
## Streaming Thinking
When streaming with `think` enabled, chunks arrive in two phases:
1. **Thinking phase** — chunks have `message.thinking` (or `thinking`) populated, `content` empty.
2. **Answer phase** — chunks have `message.content` (or `response`) populated, `thinking` empty.
Detect the transition by checking which field is non-empty on each chunk. See `streaming.md` for accumulation details.
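The two-phase pattern above can be sketched as a small accumulator over parsed `/api/chat` chunks. The sample chunks are illustrative, not captured output:

```python
def accumulate_stream(chunks):
    """Fold a stream of /api/chat chunks into (thinking, answer).

    Each chunk populates either message.thinking or message.content;
    checking which field is non-empty separates the two phases.
    """
    thinking_parts, answer_parts = [], []
    for chunk in chunks:
        msg = chunk.get("message", {})
        if msg.get("thinking"):
            thinking_parts.append(msg["thinking"])
        if msg.get("content"):
            answer_parts.append(msg["content"])
    return "".join(thinking_parts), "".join(answer_parts)

# Illustrative chunk sequence: thinking phase, then answer phase.
chunks = [
    {"message": {"role": "assistant", "thinking": "Count: s-t-r-a-w-b-e-r-r-y.", "content": ""}},
    {"message": {"role": "assistant", "thinking": " That is 3.", "content": ""}},
    {"message": {"role": "assistant", "thinking": "", "content": "There are 3 r's."}},
    {"message": {"role": "assistant", "thinking": "", "content": ""}, "done": True},
]
thinking, answer = accumulate_stream(chunks)
print(answer)  # There are 3 r's.
```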
## Supported Models
- Qwen 3 (`qwen3`)
- GPT-OSS (`gpt-oss`) — requires `"low"` / `"medium"` / `"high"` instead of boolean
- DeepSeek V3.1 (`deepseek-v3.1`)
- DeepSeek R1 (`deepseek-r1`)
- Browse latest: [thinking models](https://ollama.com/search?c=thinking)
## Conversation History with Thinking
When maintaining chat history, include the `thinking` field in the assistant message so the model retains context of its reasoning:
```json
{
"role": "assistant",
"thinking": "<accumulated thinking>",
"content": "<accumulated content>"
}
```
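Putting that together, a multi-turn loop might carry history like this. This is a sketch of one reasonable way to manage the message list client-side; the helper name is made up for illustration:

```python
def append_assistant_turn(messages, thinking, content):
    """Append the assistant's reply, keeping the accumulated thinking
    alongside the content so later turns retain the reasoning context."""
    messages.append({"role": "assistant", "thinking": thinking, "content": content})
    return messages

history = [{"role": "user", "content": "How many r's in strawberry?"}]
# After receiving (and accumulating) a response:
append_assistant_turn(history,
                      "Count: s-t-r-a-w-b-e-r-r-y -> 3.",
                      "There are 3 r's.")
# The next user turn is appended as usual, and the full list is sent
# back as "messages" on the next /api/chat request.
history.append({"role": "user", "content": "And in raspberry?"})
print(len(history))  # 3
```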