Diffstat (limited to '.rules/docs/ollama/thinking.md')
| -rw-r--r-- | .rules/docs/ollama/thinking.md | 64 |
1 files changed, 64 insertions, 0 deletions
diff --git a/.rules/docs/ollama/thinking.md b/.rules/docs/ollama/thinking.md
new file mode 100644
index 0000000..eceb064
--- /dev/null
+++ b/.rules/docs/ollama/thinking.md
@@ -0,0 +1,64 @@
+# Thinking / Reasoning Traces
+
+Thinking-capable models emit a separate `thinking` field containing their reasoning trace, distinct from the final answer in `content`/`response`.
+
+## Enabling Thinking
+
+Set the `think` field on `/api/chat` or `/api/generate` requests:
+
+| `think` value | Behavior |
+|---|---|
+| `true` | Enable thinking (most models) |
+| `false` | Disable thinking |
+| `"low"` / `"medium"` / `"high"` | GPT-OSS only — controls trace length; `true`/`false` is ignored for this model |
+
+Thinking is **enabled by default** for supported models.
+
+## Response Fields
+
+| Endpoint | Thinking field | Answer field |
+|---|---|---|
+| `/api/chat` | `message.thinking` | `message.content` |
+| `/api/generate` | `thinking` | `response` |
+
+### Non-streaming example
+
+```bash
+curl http://localhost:11434/api/chat -d '{
+  "model": "qwen3",
+  "messages": [{"role": "user", "content": "How many r's in strawberry?"}],
+  "think": true,
+  "stream": false
+}'
+```
+
+Response includes both `message.thinking` (reasoning) and `message.content` (final answer).
+
+## Streaming Thinking
+
+When streaming with `think` enabled, chunks arrive in two phases:
+
+1. **Thinking phase** — chunks have `message.thinking` (or `thinking`) populated, `content` empty.
+2. **Answer phase** — chunks have `message.content` (or `response`) populated, `thinking` empty.
+
+Detect the transition by checking which field is non-empty on each chunk. See `streaming.md` for accumulation details.
+
+## Supported Models
+
+- Qwen 3 (`qwen3`)
+- GPT-OSS (`gpt-oss`) — requires `"low"` / `"medium"` / `"high"` instead of boolean
+- DeepSeek V3.1 (`deepseek-v3.1`)
+- DeepSeek R1 (`deepseek-r1`)
+- Browse latest: [thinking models](https://ollama.com/search?c=thinking)
+
+## Conversation History with Thinking
+
+When maintaining chat history, include the `thinking` field in the assistant message so the model retains context of its reasoning:
+
+```json
+{
+  "role": "assistant",
+  "thinking": "<accumulated thinking>",
+  "content": "<accumulated content>"
+}
+```
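
Note on the added doc: the streaming and history sections can be tied together with a small sketch. The Python below is an illustration, not part of the commit. It assumes each streamed `/api/chat` chunk arrives as one JSON object per line with a `message` object and a `done` flag. The helper name `accumulate_chat_stream` and the sample chunks are hypothetical, hand-written for this example rather than captured from a server.

```python
import json
from typing import Iterable


def accumulate_chat_stream(lines: Iterable[str]) -> dict:
    """Fold streamed /api/chat chunks (one JSON object per line) into one assistant message."""
    thinking_parts = []
    content_parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chunk = json.loads(line)
        message = chunk.get("message", {})
        # Whichever field is non-empty tells us which phase this chunk belongs to.
        if message.get("thinking"):
            thinking_parts.append(message["thinking"])
        if message.get("content"):
            content_parts.append(message["content"])
        if chunk.get("done"):  # assumed end-of-stream marker
            break
    # Same shape as the assistant history entry described in the doc.
    return {
        "role": "assistant",
        "thinking": "".join(thinking_parts),
        "content": "".join(content_parts),
    }


# Hypothetical chunks for illustration only.
sample_stream = [
    '{"message": {"role": "assistant", "thinking": "Count the ", "content": ""}, "done": false}',
    '{"message": {"role": "assistant", "thinking": "letters.", "content": ""}, "done": false}',
    '{"message": {"role": "assistant", "content": "Three."}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]

assistant_message = accumulate_chat_stream(sample_stream)
history = [
    {"role": "user", "content": "How many r's in strawberry?"},
    assistant_message,  # carries both the reasoning trace and the final answer
]
print(json.dumps(assistant_message, indent=2))
```

Keeping the two accumulators separate means the phase transition needs no explicit state: a chunk contributes to whichever field it populates, and the resulting dict can be appended to the chat history as-is.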
