Diffstat (limited to '.rules/docs/ollama')

```
 .rules/docs/ollama/chat.md        | 171 +
 .rules/docs/ollama/embed.md       |  56 +
 .rules/docs/ollama/generate.md    | 121 +
 .rules/docs/ollama/list-models.md |  56 +
 .rules/docs/ollama/show-model.md  |  43 +
 .rules/docs/ollama/version.md     |  19 +
 6 files changed, 466 insertions(+), 0 deletions(-)
```
diff --git a/.rules/docs/ollama/chat.md b/.rules/docs/ollama/chat.md
new file mode 100644
index 0000000..874243d
--- /dev/null
+++ b/.rules/docs/ollama/chat.md
@@ -0,0 +1,171 @@

# Generate a chat message

`POST /api/chat` — Generate the next message in a conversation between a user and an assistant.

**Server:** `http://localhost:11434`

## Request

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name |
| `messages` | ChatMessage[] | yes | Chat history (array of message objects) |
| `tools` | ToolDefinition[] | no | Function tools the model may call |
| `format` | `"json"` \| object | no | Response format — `"json"` or a JSON schema |
| `options` | ModelOptions | no | Runtime generation options (see generate.md) |
| `stream` | boolean | no | Stream partial responses (default: `true`) |
| `think` | boolean \| string | no | Enable thinking output (`true`/`false` or `"high"`, `"medium"`, `"low"`) |
| `keep_alive` | string \| number | no | Keep-alive duration (e.g. `"5m"`, or `0` to unload) |

### ChatMessage

| Field | Type | Required | Description |
|---|---|---|---|
| `role` | string | yes | `"system"`, `"user"`, `"assistant"`, or `"tool"` |
| `content` | string | yes | Message text |
| `images` | string[] | no | Base64-encoded images (multimodal) |
| `tool_calls` | ToolCall[] | no | Tool calls from the model |

### ToolDefinition

```json
{
  "type": "function",
  "function": {
    "name": "function_name",
    "description": "What the function does",
    "parameters": { /* JSON Schema */ }
  }
}
```

### ToolCall

```json
{
  "function": {
    "name": "function_name",
    "arguments": { /* key-value args */ }
  }
}
```

## Response (non-streaming, `stream: false`)

| Field | Type | Description |
|---|---|---|
| `model` | string | Model name |
| `created_at` | string | ISO 8601 timestamp |
| `message.role` | string | Always `"assistant"` |
| `message.content` | string | Assistant reply text |
| `message.thinking` | string | Thinking trace (when `think` enabled) |
| `message.tool_calls` | ToolCall[] | Tool calls requested by the assistant |
| `done` | boolean | Whether the response finished |
| `done_reason` | string | Why it finished |
| `total_duration` | integer | Total time (nanoseconds) |
| `load_duration` | integer | Model load time (nanoseconds) |
| `prompt_eval_count` | integer | Input token count |
| `prompt_eval_duration` | integer | Prompt eval time (nanoseconds) |
| `eval_count` | integer | Output token count |
| `eval_duration` | integer | Token generation time (nanoseconds) |

## Streaming Response (`stream: true`, default)

Returns `application/x-ndjson`. Each chunk carries a partial reply in `message.content`. The final chunk has `done: true` plus the duration/count stats listed above.
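Because each streamed line is a complete JSON object, a client can rebuild the reply by concatenating the `message.content` fragments and keeping the final `done: true` chunk for stats. A minimal sketch in Python; the `accumulate_chat_stream` helper is illustrative (not part of the Ollama API) and takes already-decoded text lines:

```python
import json

def accumulate_chat_stream(ndjson_lines):
    """Rebuild a chat reply from /api/chat NDJSON stream lines.

    Each element of ndjson_lines is one line of the
    application/x-ndjson response body. Returns (full_text, final_chunk),
    where final_chunk is the done:true object with timing stats.
    """
    parts = []
    final = None
    for line in ndjson_lines:
        if not line.strip():
            continue  # tolerate blank keep-alive lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            final = chunk  # carries done_reason, eval_count, durations
    return "".join(parts), final
```

In a real client the lines would come from iterating over the HTTP response body; the parsing logic is the same.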
## Examples

### Basic (streaming)
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {"role": "user", "content": "why is the sky blue?"}
  ]
}'
```

### Non-streaming
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {"role": "user", "content": "why is the sky blue?"}
  ],
  "stream": false
}'
```

### Structured output
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [
    {"role": "user", "content": "What are the populations of the United States and Canada?"}
  ],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "countries": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "country": {"type": "string"},
            "population": {"type": "integer"}
          },
          "required": ["country", "population"]
        }
      }
    },
    "required": ["countries"]
  }
}'
```

### Tool calling
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the weather today in Paris?"}
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The location, e.g. San Francisco, CA"
            },
            "format": {
              "type": "string",
              "description": "celsius or fahrenheit",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location", "format"]
        }
      }
    }
  ]
}'
```

### Thinking
```bash
curl http://localhost:11434/api/chat -d '{
  "model": "gpt-oss",
  "messages": [
    {"role": "user", "content": "What is 1+1?"}
  ],
  "think": "low"
}'
```

diff --git a/.rules/docs/ollama/embed.md b/.rules/docs/ollama/embed.md
new file mode 100644
index 0000000..9c81ebf
--- /dev/null
+++ b/.rules/docs/ollama/embed.md
@@ -0,0 +1,56 @@

# Generate embeddings

`POST /api/embed` — Creates vector embeddings representing the input text.

**Server:** `http://localhost:11434`

## Request

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name (e.g. `"embeddinggemma"`) |
| `input` | string \| string[] | yes | Text or array of texts to embed |
| `truncate` | boolean | no | Truncate inputs that exceed the context window (default: `true`). If `false`, oversized inputs return an error. |
| `dimensions` | integer | no | Number of dimensions for the embedding vectors |
| `keep_alive` | string | no | Model keep-alive duration |
| `options` | ModelOptions | no | Runtime options (see generate.md) |

## Response

| Field | Type | Description |
|---|---|---|
| `model` | string | Model that produced the embeddings |
| `embeddings` | number[][] | Array of embedding vectors (one per input) |
| `total_duration` | integer | Total time (nanoseconds) |
| `load_duration` | integer | Model load time (nanoseconds) |
| `prompt_eval_count` | integer | Number of input tokens processed |

## Examples

### Single input
```bash
curl http://localhost:11434/api/embed -d '{
  "model": "embeddinggemma",
  "input": "Why is the sky blue?"
}'
```

### Multiple inputs (batch)
```bash
curl http://localhost:11434/api/embed -d '{
  "model": "embeddinggemma",
  "input": [
    "Why is the sky blue?",
    "Why is the grass green?"
  ]
}'
```

### Custom dimensions
```bash
curl http://localhost:11434/api/embed -d '{
  "model": "embeddinggemma",
  "input": "Generate embeddings for this text",
  "dimensions": 128
}'
```

diff --git a/.rules/docs/ollama/generate.md b/.rules/docs/ollama/generate.md
new file mode 100644
index 0000000..30534c2
--- /dev/null
+++ b/.rules/docs/ollama/generate.md
@@ -0,0 +1,121 @@

# Generate a response

`POST /api/generate` — Generates a response for a provided prompt.

**Server:** `http://localhost:11434`

## Request

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name |
| `prompt` | string | no | Text for the model to generate a response from (omit to simply load the model) |
| `suffix` | string | no | Fill-in-the-middle text after the prompt, before the response |
| `images` | string[] | no | Base64-encoded images (for multimodal models) |
| `format` | string \| object | no | `"json"` or a JSON schema object for structured output |
| `system` | string | no | System prompt |
| `stream` | boolean | no | Stream partial responses (default: `true`) |
| `think` | boolean \| string | no | Enable thinking output (`true`/`false` or `"high"`, `"medium"`, `"low"`) |
| `raw` | boolean | no | Return raw response without prompt templating |
| `keep_alive` | string \| number | no | Keep-alive duration (e.g. `"5m"`, or `0` to unload immediately) |
| `options` | ModelOptions | no | Runtime generation options (see below) |

### ModelOptions

| Field | Type | Description |
|---|---|---|
| `seed` | integer | Random seed for reproducible outputs |
| `temperature` | float | Randomness (higher = more random) |
| `top_k` | integer | Limit next token to the K most likely |
| `top_p` | float | Nucleus sampling cumulative probability threshold |
| `min_p` | float | Minimum probability threshold |
| `stop` | string \| string[] | Stop sequences |
| `num_ctx` | integer | Context length (number of tokens) |
| `num_predict` | integer | Max tokens to generate |

## Response (non-streaming, `stream: false`)

| Field | Type | Description |
|---|---|---|
| `model` | string | Model name |
| `created_at` | string | ISO 8601 timestamp |
| `response` | string | Generated text |
| `thinking` | string | Thinking output (when `think` enabled) |
| `done` | boolean | Whether generation finished |
| `done_reason` | string | Why generation stopped |
| `total_duration` | integer | Total time (nanoseconds) |
| `load_duration` | integer | Model load time (nanoseconds) |
| `prompt_eval_count` | integer | Number of input tokens |
| `prompt_eval_duration` | integer | Prompt eval time (nanoseconds) |
| `eval_count` | integer | Number of output tokens |
| `eval_duration` | integer | Token generation time (nanoseconds) |

## Streaming Response (`stream: true`, default)

Returns `application/x-ndjson` — one JSON object per line. Each chunk has the same fields as the non-streaming response. The final chunk has `done: true` with duration/count stats.

## Examples

### Basic (streaming)
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?"
}'
```

### Non-streaming
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

### With options
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "options": {
    "temperature": 0.8,
    "top_p": 0.9,
    "seed": 42
  }
}'
```

### Structured output (JSON schema)
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "What are the populations of the United States and Canada?",
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "countries": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "country": {"type": "string"},
            "population": {"type": "integer"}
          },
          "required": ["country", "population"]
        }
      }
    },
    "required": ["countries"]
  }
}'
```

### Load / Unload model
```bash
# Load
curl http://localhost:11434/api/generate -d '{"model": "gemma3"}'
# Unload
curl http://localhost:11434/api/generate -d '{"model": "gemma3", "keep_alive": 0}'
```

diff --git a/.rules/docs/ollama/list-models.md b/.rules/docs/ollama/list-models.md
new file mode 100644
index 0000000..f5da57f
--- /dev/null
+++ b/.rules/docs/ollama/list-models.md
@@ -0,0 +1,56 @@

# List models

`GET /api/tags` — Fetch a list of locally available models and their details.

**Server:** `http://localhost:11434`

## Request

No parameters required.

```bash
curl http://localhost:11434/api/tags
```

## Response

| Field | Type | Description |
|---|---|---|
| `models` | ModelSummary[] | Array of available models |

### ModelSummary

| Field | Type | Description |
|---|---|---|
| `name` | string | Model name |
| `model` | string | Model identifier (same as `name`) |
| `modified_at` | string | Last modified (ISO 8601) |
| `size` | integer | Size on disk (bytes) |
| `digest` | string | SHA256 digest |
| `details.format` | string | File format (e.g. `"gguf"`) |
| `details.family` | string | Primary model family (e.g. `"llama"`) |
| `details.families` | string[] | All families the model belongs to |
| `details.parameter_size` | string | Parameter count label (e.g. `"7B"`) |
| `details.quantization_level` | string | Quantization level (e.g. `"Q4_0"`) |

### Example response
```json
{
  "models": [
    {
      "name": "gemma3",
      "model": "gemma3",
      "modified_at": "2025-10-03T23:34:03.409490317-07:00",
      "size": 3338801804,
      "digest": "a2af6cc3eb7fa8be8504abaf9b04e88f17a119ec3f04a3addf55f92841195f5a",
      "details": {
        "format": "gguf",
        "family": "gemma",
        "families": ["gemma"],
        "parameter_size": "4.3B",
        "quantization_level": "Q4_K_M"
      }
    }
  ]
}
```

diff --git a/.rules/docs/ollama/show-model.md b/.rules/docs/ollama/show-model.md
new file mode 100644
index 0000000..befbd22
--- /dev/null
+++ b/.rules/docs/ollama/show-model.md
@@ -0,0 +1,43 @@

# Show model details

`POST /api/show` — Get detailed information about a specific model.

**Server:** `http://localhost:11434`

## Request

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model name to show |
| `verbose` | boolean | no | Include large verbose fields in the response |

## Response

| Field | Type | Description |
|---|---|---|
| `parameters` | string | Model parameter settings (text) |
| `modified_at` | string | Last modified (ISO 8601) |
| `template` | string | Prompt template used by the model |
| `capabilities` | string[] | Supported features (e.g. `"completion"`, `"vision"`) |
| `details.format` | string | File format (e.g. `"gguf"`) |
| `details.family` | string | Model family |
| `details.families` | string[] | All families |
| `details.parameter_size` | string | Parameter count label (e.g. `"4.3B"`) |
| `details.quantization_level` | string | Quantization level (e.g. `"Q4_K_M"`) |
| `model_info` | object | Architecture metadata (context length, embedding size, etc.) |

## Examples

```bash
curl http://localhost:11434/api/show -d '{
  "model": "gemma3"
}'
```

### Verbose
```bash
curl http://localhost:11434/api/show -d '{
  "model": "gemma3",
  "verbose": true
}'
```

diff --git a/.rules/docs/ollama/version.md b/.rules/docs/ollama/version.md
new file mode 100644
index 0000000..29fb757
--- /dev/null
+++ b/.rules/docs/ollama/version.md
@@ -0,0 +1,19 @@

# Get version

`GET /api/version` — Retrieve the Ollama server version.

**Server:** `http://localhost:11434`

## Request

No parameters required.

```bash
curl http://localhost:11434/api/version
```

## Response

| Field | Type | Description |
|---|---|---|
| `version` | string | Ollama version (e.g. `"0.12.6"`) |
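When a chat response comes back with `message.tool_calls` (see chat.md above), the client is expected to run the named functions itself and append the results as `role: "tool"` messages before calling `/api/chat` again. A minimal dispatch sketch in Python; `run_tool_calls` and the registry mapping are illustrative assumptions, not part of the API:

```python
def run_tool_calls(tool_calls, registry):
    """Execute assistant tool calls against a local function registry.

    tool_calls follows the ToolCall shape from chat.md:
    {"function": {"name": ..., "arguments": {...}}}.
    registry maps tool names to local Python callables.
    Returns the role:"tool" messages to append to the conversation.
    """
    messages = []
    for call in tool_calls:
        fn = call["function"]
        # Look up and invoke the local implementation with the model's args.
        result = registry[fn["name"]](**fn.get("arguments", {}))
        messages.append({"role": "tool", "content": str(result)})
    return messages
```

The returned messages go back into the `messages` array of the next `/api/chat` request so the model can produce its final answer.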

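The vectors returned in `embeddings` by `/api/embed` are usually compared with cosine similarity. A pure-Python sketch, independent of any server:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors from /api/embed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

With a batch request, pairwise similarities over the `embeddings` array give a simple semantic ranking of the inputs.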