| author | Adam Malczewski <[email protected]> | 2026-06-12 19:26:31 +0900 |
|---|---|---|
| committer | Adam Malczewski <[email protected]> | 2026-06-12 19:26:31 +0900 |
| commit | 35197ed933044d322d0a653c4e88a5f3e475fe76 (patch) | |
| tree | f768be26a61b28551a0671f2519c3da4ff682a1f | |
| parent | dbf77ba78ff840e0ed5f6294030523fe3ab121fa (diff) | |
| download | dispatch-35197ed933044d322d0a653c4e88a5f3e475fe76.tar.gz dispatch-35197ed933044d322d0a653c4e88a5f3e475fe76.zip | |
feat(contracts): reasoning effort — ReasoningEffort ladder (low..max), ProviderStreamOptions/ChatRequest fields, per-conversation GET/PUT types
wire 0.6.1->0.7.0, transport-contract 0.10.0->0.11.0. Additive only; typecheck+biome clean.

| Mode | File | Lines changed |
|---|---|---|
| -rw-r--r-- | GLOSSARY.md | 1 |
| -rw-r--r-- | packages/kernel/src/contracts/index.ts | 1 |
| -rw-r--r-- | packages/kernel/src/contracts/provider.ts | 12 |
| -rw-r--r-- | packages/transport-contract/package.json | 2 |
| -rw-r--r-- | packages/transport-contract/src/index.ts | 40 |
| -rw-r--r-- | packages/wire/package.json | 2 |
| -rw-r--r-- | packages/wire/src/index.ts | 14 |
| -rw-r--r-- | tasks.md | 18 |

8 files changed, 84 insertions, 6 deletions
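
The commit message and the new doc comments describe a resolution chain (per-turn override, then the persisted per-conversation value, then the default `"high"`) and an HTTP 400 for unrecognized levels. A minimal sketch of both, assuming hypothetical helper names (`isReasoningEffort`, `resolveReasoningEffort`) that are not part of this commit, which adds types only:

```typescript
// Mirrors the `ReasoningEffort` type added to `@dispatch/wire` in this commit.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Hypothetical constants: the commit documents the ladder and the default,
// but does not export runtime values for them.
const REASONING_EFFORT_LEVELS: readonly string[] = ["low", "medium", "high", "xhigh", "max"];
const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

/** Type guard a transport could run before answering HTTP 400 for an unknown level. */
function isReasoningEffort(value: unknown): value is ReasoningEffort {
	return typeof value === "string" && REASONING_EFFORT_LEVELS.indexOf(value) !== -1;
}

/** Per-turn request value, then persisted per-conversation value, then the default. */
function resolveReasoningEffort(
	requestValue: ReasoningEffort | undefined,
	storedValue: ReasoningEffort | null,
): ReasoningEffort {
	return requestValue ?? storedValue ?? DEFAULT_REASONING_EFFORT;
}
```

Under this sketch a turn that sends no `reasoningEffort` on a conversation that never stored one resolves to `"high"`, and a per-turn value always wins over the stored one without changing it.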
diff --git a/GLOSSARY.md b/GLOSSARY.md
index 7564276..61a555d 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -51,6 +51,7 @@
 | **diagnostics** | The errors/warnings/hints a `language server` reports for a file — received both push (`textDocument/publishDiagnostics`) and pull (`textDocument/diagnostic`), then merged + deduped. | lints (when meaning LSP diagnostics) |
 | **workspace root** | The directory a `language server` is rooted at (its `rootUri` and spawn cwd): the nearest root-marker ancestor of a file, bounded above by the conversation's `working directory`. | project root (when meaning the per-server root) |
 | **working directory** | The per-conversation filesystem directory that tools and `language server`s operate within (`ToolExecuteContext.cwd`). Persisted per conversation by `conversation-store`; gettable/settable via the cwd endpoint; defaults a turn's cwd when `/chat` omits it. | cwd (spell out on first use), workdir (when meaning the conversation's directory) |
+| **reasoning effort** | The per-request thinking-depth knob: how much extended thinking the model spends before answering. Ladder `low \| medium \| high \| xhigh \| max` (`ReasoningEffort` in `@dispatch/wire`). Resolution per turn: `ChatRequest.reasoningEffort` override → persisted per-conversation value → default `high`. Each provider maps a level to its native knob in its own code (e.g. Anthropic `thinking.budget_tokens`); providers without such a knob ignore it. | thinking level, effort level, thinking budget (that's the provider-NATIVE knob a level maps to) |
 | **context size** | The number of tokens a conversation currently occupies: the most recent turn's FINAL step `inputTokens + outputTokens` (NOT the aggregate per-turn `usage`, which sums per-step prompts and overcounts a multi-step turn). Stamped on `TurnDoneEvent.contextSize` (live) + `TurnMetrics.contextSize` (persisted); a client reads the LATEST turn's value as current usage. Distinct from the model's **context window** (its max token limit — a later feature). | context window (when meaning current usage), context length, tokens used, context usage |
 
 ## Known vocabulary drift
diff --git a/packages/kernel/src/contracts/index.ts b/packages/kernel/src/contracts/index.ts
index 10025e2..ffcbe76 100644
--- a/packages/kernel/src/contracts/index.ts
+++ b/packages/kernel/src/contracts/index.ts
@@ -95,6 +95,7 @@ export type {
 	ProviderStreamOptions,
 	ProviderToolCallEvent,
 	ReasoningDeltaEvent,
+	ReasoningEffort,
 	TextDeltaEvent,
 	Usage,
 	UsageEvent,
diff --git a/packages/kernel/src/contracts/provider.ts b/packages/kernel/src/contracts/provider.ts
index 0686c19..7f920c5 100644
--- a/packages/kernel/src/contracts/provider.ts
+++ b/packages/kernel/src/contracts/provider.ts
@@ -6,12 +6,12 @@
  * translates its responses into `ProviderEvent`s.
  */
 
-import type { Usage } from "@dispatch/wire";
+import type { ReasoningEffort, Usage } from "@dispatch/wire";
 import type { ChatMessage } from "./conversation.js";
 import type { Logger } from "./logging.js";
 import type { ToolContract } from "./tool.js";
 
-export type { Usage } from "@dispatch/wire";
+export type { ReasoningEffort, Usage } from "@dispatch/wire";
 
 /**
  * Events a provider yields during a single `stream` call. The kernel consumes
@@ -86,6 +86,14 @@ export interface ProviderStreamOptions {
 	/** System prompt to prepend. */
 	readonly systemPrompt?: string;
 	/**
+	 * Reasoning-effort level for this request (already RESOLVED by the caller —
+	 * the session-orchestrator applies the request → conversation → `"high"`
+	 * default chain, so a provider receiving `undefined` may treat it as "no
+	 * preference"). The provider maps the level to its native thinking knob in
+	 * its own code; providers without such a knob ignore it.
+	 */
+	readonly reasoningEffort?: ReasoningEffort;
+	/**
 	 * Correlated logger for this turn's step (Phase A logging ABI). When present,
 	 * the provider should open a child `provider.request` span and capture the
 	 * verbatim post-transform request + raw response/error there, self-redacting
diff --git a/packages/transport-contract/package.json b/packages/transport-contract/package.json
index 84ceada..3a0a983 100644
--- a/packages/transport-contract/package.json
+++ b/packages/transport-contract/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@dispatch/transport-contract",
-	"version": "0.10.0",
+	"version": "0.11.0",
 	"type": "module",
 	"private": true,
 	"main": "dist/index.js",
diff --git a/packages/transport-contract/src/index.ts b/packages/transport-contract/src/index.ts
index b0f6e20..e992f8b 100644
--- a/packages/transport-contract/src/index.ts
+++ b/packages/transport-contract/src/index.ts
@@ -20,9 +20,15 @@
  */
 
 import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
-import type { AgentEvent, StoredChunk, TurnMetrics } from "@dispatch/wire";
+import type { AgentEvent, ReasoningEffort, StoredChunk, TurnMetrics } from "@dispatch/wire";
 
-export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatch/wire";
+export type {
+	AgentEvent,
+	ReasoningEffort,
+	StepMetrics,
+	StoredChunk,
+	TurnMetrics,
+} from "@dispatch/wire";
 
 /**
  * Request body for `POST /chat` (sent as JSON).
@@ -54,6 +60,14 @@ export interface ChatRequest {
 	 * prompt (so it does not affect prompt caching).
 	 */
 	readonly cwd?: string;
+
+	/**
+	 * Reasoning-effort override for THIS turn only (does not persist). When
+	 * omitted, the server resolves the conversation's persisted value, falling
+	 * back to `"high"`. Must be one of the `ReasoningEffort` levels; an
+	 * unrecognized value → HTTP 400 `{ error }`.
+	 */
+	readonly reasoningEffort?: ReasoningEffort;
 }
 
 /**
@@ -191,6 +205,28 @@ export interface SetCwdRequest {
 	readonly cwd: string;
 }
 
+// ─── Per-conversation reasoning effort ────────────────────────────────────────
+
+/**
+ * Response of `GET /conversations/:id/reasoning-effort`. `reasoningEffort` is
+ * null when never set (the server then resolves turns at the default,
+ * `"high"`).
+ */
+export interface ReasoningEffortResponse {
+	readonly conversationId: string;
+	readonly reasoningEffort: ReasoningEffort | null;
+}
+
+/**
+ * Body of `PUT /conversations/:id/reasoning-effort` — persists the
+ * conversation's sticky reasoning-effort level (used for every later turn that
+ * does not carry a per-turn `ChatRequest.reasoningEffort` override). An
+ * unrecognized level → HTTP 400 `{ error }`.
+ */
+export interface SetReasoningEffortRequest {
+	readonly reasoningEffort: ReasoningEffort;
+}
+
 // ─── Conversation close (explicit tab close) ──────────────────────────────────
 
 /**
diff --git a/packages/wire/package.json b/packages/wire/package.json
index d00772d..07c20b7 100644
--- a/packages/wire/package.json
+++ b/packages/wire/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@dispatch/wire",
-	"version": "0.6.1",
+	"version": "0.7.0",
 	"type": "module",
 	"private": true,
 	"main": "dist/index.js",
diff --git a/packages/wire/src/index.ts b/packages/wire/src/index.ts
index 4fdf389..6a6de7d 100644
--- a/packages/wire/src/index.ts
+++ b/packages/wire/src/index.ts
@@ -146,6 +146,20 @@ export interface StoredChunk {
 	readonly chunk: Chunk;
 }
 
+// ─── Reasoning effort ───────────────────────────────────────────────────────
+
+/**
+ * The per-request thinking-depth knob: how much extended thinking / reasoning
+ * the model should spend before answering. Provider-agnostic ladder; each
+ * provider maps a level to its native knob in its own code (e.g. an Anthropic
+ * provider maps it to a `thinking.budget_tokens` value) and MAY ignore levels
+ * (or the field entirely) that its backend cannot express.
+ *
+ * Resolution (owned by the session-orchestrator): per-turn request value →
+ * persisted per-conversation value → default `"high"`.
+ */
+export type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+
 // ─── Usage ──────────────────────────────────────────────────────────────────
 
 /**
diff --git a/tasks.md b/tasks.md
--- a/tasks.md
+++ b/tasks.md
@@ -353,6 +353,24 @@ derives `hasOlder` from `chunks[0].seq > 1`.
 DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'`
 self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands.
 
+## Reasoning effort (current milestone)
+User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max`
+(Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn
+`ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks sensible
+budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now.
+- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`);
+  `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is
+  forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/
+  `SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`);
+  glossary entry. typecheck + biome clean.
+- [ ] **Wave 1 (parallel):** `conversation-store` get/set persisted effort (mirror cwd);
+  `provider-anthropic` (../claude) level→thinking.budget_tokens mapping; `cli` `--effort` flag.
+- [ ] **Wave 2:** `session-orchestrator` — resolution chain (turn override → stored → `high`),
+  thread into providerOpts via `StartTurnInput.reasoningEffort`.
+- [ ] **Wave 3:** `transport-http` (validate `reasoningEffort`, thread to startTurn, GET/PUT
+  `/conversations/:id/reasoning-effort`) + `transport-ws` (thread on `chat.send`).
+- [ ] Live-verify vs claude; FE courier handoff.
+
 ## Open items
 - **Context window LIMIT (deferred, sibling of context size):** expose the selected model's max
   context-window token limit so the FE can render `contextSize / limit` (e.g. `1286 / 200000`).
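
The Wave 1 item above leaves the level to `thinking.budget_tokens` mapping to the Anthropic provider ("provider picks sensible budget_tokens"). One way that mapping could look is sketched below. The budget numbers and the names `HYPOTHETICAL_BUDGET_TOKENS` and `toThinkingConfig` are assumptions for illustration only; the commit does not define them.

```typescript
// Mirrors the `ReasoningEffort` type added to `@dispatch/wire` in this commit.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Illustrative budgets only (assumption): the real values are left to the provider.
const HYPOTHETICAL_BUDGET_TOKENS: Record<ReasoningEffort, number> = {
	low: 1024,
	medium: 4096,
	high: 16384,
	xhigh: 32768,
	max: 65536,
};

// Shape of the `thinking` request parameter this sketch targets.
interface ThinkingConfig {
	readonly type: "enabled";
	readonly budget_tokens: number;
}

/**
 * Maps a resolved level to a provider-native thinking config.
 * `undefined` is treated as "no preference", so the parameter is omitted.
 */
function toThinkingConfig(effort: ReasoningEffort | undefined): ThinkingConfig | undefined {
	if (effort === undefined) return undefined;
	return { type: "enabled", budget_tokens: HYPOTHETICAL_BUDGET_TOKENS[effort] };
}
```

Keeping the table inside the provider matches the contract comment that each provider maps a level to its native knob in its own code, so the shared `ReasoningEffort` type stays provider-agnostic.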
