| author | Adam Malczewski <[email protected]> | 2026-06-12 20:38:57 +0900 |
|---|---|---|
| committer | Adam Malczewski <[email protected]> | 2026-06-12 20:38:57 +0900 |
| commit | baa6f6c9d21de2f6ffc60e00f53c61d026155933 (patch) | |
| tree | fecae91d99d906a7b5054b398e4d3d90894567a0 /.dispatch | |
| parent | 7dcc06eecb5b691b0c0daec26db9d5e407d0a60e (diff) | |
feat(chat): reasoning-effort selector — sticky per-conversation thinking-depth knob
Consume the backend's reasoning-effort handoff (wire@0.7.0 ReasoningEffort +
transport-contract@0.11.0 GET/PUT /conversations/:id/reasoning-effort,
ChatRequest.reasoningEffort): a 5-level selector in the sidebar Model view,
under the provider + model dropdowns. null renders as 'high (default)' per
the server-owned resolution chain; PUT on change (effective next turn);
error + revert on 400; per-conversation re-mount incl. drafts (the draft id
survives promotion, so an effort set on a draft applies from turn 1).
Re-mirrored .dispatch references; GLOSSARY 'reasoning effort'; handoff
updated. 616 tests green; live curl probe passed.
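
The selector behaviour described in this message (null shown as 'high (default)', PUT on change, error + revert on 400) could look roughly like the sketch below. This is an illustration, not the code from this commit: `effortLabel`, `changeEffort`, `SelectorState`, and the injected `put` transport are hypothetical names. Only the five levels and the response shape come from the mirrored contract.

```typescript
// The five levels, as listed in the mirrored wire reference.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Hypothetical helper: label for the persisted value.
// null means "never set", which the server resolves to "high".
function effortLabel(value: ReasoningEffort | null): string {
  return value === null ? "high (default)" : value;
}

// Hypothetical transport: performs the PUT and resolves to the HTTP status
// plus the parsed JSON body (ReasoningEffortResponse on 200, { error } on 400).
type PutEffort = (
  conversationId: string,
  level: ReasoningEffort,
) => Promise<{
  status: number;
  body: { reasoningEffort?: ReasoningEffort | null; error?: string };
}>;

// Hypothetical view state for the selector.
interface SelectorState {
  readonly value: ReasoningEffort | null;
  readonly error: string | null;
}

// On change: PUT the new level. On 200, adopt the server's value.
// On any other status, keep the previous value (revert) and surface the error.
async function changeEffort(
  prev: SelectorState,
  next: ReasoningEffort,
  conversationId: string,
  put: PutEffort,
): Promise<SelectorState> {
  const res = await put(conversationId, next);
  if (res.status === 200) {
    return { value: res.body.reasoningEffort ?? next, error: null };
  }
  return { value: prev.value, error: res.body.error ?? `HTTP ${res.status}` };
}
```

Injecting the transport keeps the revert logic testable without a live backend. A real component would presumably wrap `fetch` and re-run the GET when the conversation changes.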
Diffstat (limited to '.dispatch')
| -rw-r--r-- | .dispatch/transport-contract.reference.md | 66 | ||||
| -rw-r--r-- | .dispatch/wire.reference.md | 28 |
2 files changed, 88 insertions, 6 deletions
diff --git a/.dispatch/transport-contract.reference.md b/.dispatch/transport-contract.reference.md
index 774cfb0..1c3d993 100644
--- a/.dispatch/transport-contract.reference.md
+++ b/.dispatch/transport-contract.reference.md
@@ -5,10 +5,27 @@
 > hangs on a permission prompt). Your CODE still imports `@dispatch/transport-contract` normally —
 > this file is for READING only.
 >
-> **Orchestrator:** SNAPSHOT of `[email protected]` (CR-5 history windowing shipped).
-> Depends on `@dispatch/[email protected]` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
+> **Orchestrator:** SNAPSHOT of `[email protected]` (reasoning effort shipped).
+> Depends on `@dispatch/[email protected]` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
 > `ui-contract.reference.md`).
 >
+> **2026-06-12 delta (reasoning-effort handoff — package bumped `0.10.0` → `0.11.0`, ADDITIVE):**
+> the thinking-depth knob (`ReasoningEffort`, re-exported from `[email protected]`) lands in TWO scopes,
+> resolved server-side per turn (per-turn override → persisted conversation value → default
+> `"high"`; do NOT re-implement the chain client-side):
+> 1. **Per-turn override** — optional `reasoningEffort?: ReasoningEffort` on `ChatRequest` (and
+>    therefore on WS `chat.send`, which extends it). Applies to THAT turn only; never persists.
+>    OMIT the key for "no override" (never send `null`/`""`).
+> 2. **Persisted per-conversation setting** — `GET /conversations/:id/reasoning-effort` →
+>    `ReasoningEffortResponse { conversationId, reasoningEffort: ReasoningEffort | null }`
+>    (`null` = never set ⇒ the default `"high"` applies, NOT "off") and
+>    `PUT /conversations/:id/reasoning-effort` body `SetReasoningEffortRequest
+>    { reasoningEffort }`. Takes effect from the NEXT turn.
+> Validation: an unrecognized level → HTTP 400 `{ error }` listing the valid levels (same for the
+> WS path via the standard `chat.send` error reply). Cache note: CHANGING the level changes the
+> provider request shape and can bust the prompt cache for the next turn (one-time re-prefill);
+> a stable setting stays cache-safe (warming uses the same resolved effort).
+>
 > **2026-06-12 delta (CR-5 history windowing — package bumped `0.9.0` → `0.10.0`):** NO type-shape
 > change — `GET /conversations/:id` gains two OPTIONAL query params alongside `sinceSeq`:
 > **`limit=<k>`** (the NEWEST `k` chunks of the selection, still ASCENDING; a selection with ≤ `k`
@@ -126,6 +143,11 @@
 - `GET /conversations/:id/lsp` — `LspStatusResponse`. LAZILY spawns+initializes the configured
   servers on the first call per cwd (can take a moment; cached after); returns once each settles to
   `connected`/`error`. `servers` is `[]` when `cwd` is null.
+- `GET /conversations/:id/reasoning-effort` — `ReasoningEffortResponse` (`reasoningEffort` is `null`
+  when never set ⇒ default `"high"` applies). Works for an unseen/draft id.
+- `PUT /conversations/:id/reasoning-effort` — body `SetReasoningEffortRequest` →
+  `200 ReasoningEffortResponse`; `400 { error }` on an unrecognized level (the message lists the
+  valid levels). Persists the conversation's sticky level; effective from the NEXT turn.
 - WebSocket on :24205 — ONE path-agnostic socket multiplexes surface ops (`@dispatch/ui-contract`)
   + chat ops (below). Open once, send `WsClientMessage`, receive `WsServerMessage`.
   Live `AgentEvent` deltas carry `conversationId`+`turnId` but **no `seq`**
@@ -150,9 +172,15 @@
  */
 
 import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
-import type { AgentEvent, StoredChunk, TurnMetrics } from "@dispatch/wire";
+import type { AgentEvent, ReasoningEffort, StoredChunk, TurnMetrics } from "@dispatch/wire";
 
-export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatch/wire";
+export type {
+  AgentEvent,
+  ReasoningEffort,
+  StepMetrics,
+  StoredChunk,
+  TurnMetrics,
+} from "@dispatch/wire";
 
 /**
  * Request body for `POST /chat` (sent as JSON).
@@ -184,6 +212,14 @@ export interface ChatRequest {
    * prompt (so it does not affect prompt caching).
    */
   readonly cwd?: string;
+
+  /**
+   * Reasoning-effort override for THIS turn only (does not persist). When
+   * omitted, the server resolves the conversation's persisted value, falling
+   * back to `"high"`. Must be one of the `ReasoningEffort` levels; an
+   * unrecognized value → HTTP 400 `{ error }`.
+   */
+  readonly reasoningEffort?: ReasoningEffort;
 }
 
 /**
@@ -315,6 +351,28 @@ export interface SetCwdRequest {
   readonly cwd: string;
 }
 
+// ─── Per-conversation reasoning effort ────────────────────────────────────────
+
+/**
+ * Response of `GET /conversations/:id/reasoning-effort`. `reasoningEffort` is
+ * null when never set (the server then resolves turns at the default,
+ * `"high"`).
+ */
+export interface ReasoningEffortResponse {
+  readonly conversationId: string;
+  readonly reasoningEffort: ReasoningEffort | null;
+}
+
+/**
+ * Body of `PUT /conversations/:id/reasoning-effort` — persists the
+ * conversation's sticky reasoning-effort level (used for every later turn that
+ * does not carry a per-turn `ChatRequest.reasoningEffort` override). An
+ * unrecognized level → HTTP 400 `{ error }`.
+ */
+export interface SetReasoningEffortRequest {
+  readonly reasoningEffort: ReasoningEffort;
+}
+
 // ─── Conversation close (explicit tab close) ──────────────────────────────────
 
 /**
diff --git a/.dispatch/wire.reference.md b/.dispatch/wire.reference.md
index 1d761bf..34984d2 100644
--- a/.dispatch/wire.reference.md
+++ b/.dispatch/wire.reference.md
@@ -4,8 +4,18 @@
 > types WITHOUT following the `file:` dep symlink out of this repo (which hangs on a permission
 > prompt). Your CODE still imports `@dispatch/wire` normally — this file is for READING only.
 >
-> **Orchestrator:** SNAPSHOT of `[email protected]` (doc-only bump: the 1-based gap-free seq guarantee
-> codified on `StoredChunk`). Regenerate whenever `@dispatch/wire` changes.
+> **Orchestrator:** SNAPSHOT of `[email protected]` (reasoning effort — the thinking-depth knob).
+> Regenerate whenever `@dispatch/wire` changes.
+>
+> **2026-06-12 delta (reasoning-effort handoff — package bumped `0.6.1` → `0.7.0`, ADDITIVE):**
+> adds the **`ReasoningEffort`** type — the per-request thinking-depth ladder
+> `"low" | "medium" | "high" | "xhigh" | "max"`. Provider-agnostic; the Anthropic provider maps
+> levels to extended-thinking token budgets (low 4096 · medium 10240 · high 16384 · xhigh 32768 ·
+> max 65536); providers without a thinking knob ignore it. Resolution is SERVER-owned (do not
+> re-implement): per-turn `ChatRequest.reasoningEffort` override → persisted per-conversation value
+> (`GET`/`PUT /conversations/:id/reasoning-effort`, see `[email protected]`) → default
+> `"high"`. Higher levels mean longer runs of `reasoning-delta` events before the first text delta.
+> See the `ReasoningEffort` definition below.
 >
 > **2026-06-12 delta (CR-5 history windowing — package bumped `0.6.0` → `0.6.1`, DOC-ONLY):** the
 > per-conversation `seq` numbering is now a WRITTEN CONTRACTUAL GUARANTEE on `StoredChunk`:
@@ -196,6 +206,20 @@ export interface StoredChunk {
   readonly chunk: Chunk;
 }
 
+// ─── Reasoning effort ───────────────────────────────────────────────────────
+
+/**
+ * The per-request thinking-depth knob: how much extended thinking / reasoning
+ * the model should spend before answering. Provider-agnostic ladder; each
+ * provider maps a level to its native knob in its own code (e.g. an Anthropic
+ * provider maps it to a `thinking.budget_tokens` value) and MAY ignore levels
+ * (or the field entirely) that its backend cannot express.
+ *
+ * Resolution (owned by the session-orchestrator): per-turn request value →
+ * persisted per-conversation value → default `"high"`.
+ */
+export type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+
 // ─── Usage ──────────────────────────────────────────────────────────────────
 
 /**
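
To make the resolution order and the "omit the key" rule from the mirrored references concrete, here is a small sketch. It is illustrative only: the chain is server-owned and the references say clients should not re-implement it. `resolveEffort`, `ANTHROPIC_BUDGET_TOKENS`, `PartialChatRequest`, and `withEffortOverride` are hypothetical names, and the request type is cut down to the two fields visible in this diff. The budget numbers are the ones quoted in the wire reference.

```typescript
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Default applied when neither a per-turn override nor a persisted value exists.
const DEFAULT_EFFORT: ReasoningEffort = "high";

// Resolution order from the reference:
// per-turn override, then persisted conversation value, then the default.
function resolveEffort(
  perTurn: ReasoningEffort | undefined,
  persisted: ReasoningEffort | null,
): ReasoningEffort {
  return perTurn ?? persisted ?? DEFAULT_EFFORT;
}

// Token budgets quoted in the wire reference for the Anthropic provider.
const ANTHROPIC_BUDGET_TOKENS: Record<ReasoningEffort, number> = {
  low: 4096,
  medium: 10240,
  high: 16384,
  xhigh: 32768,
  max: 65536,
};

// Cut-down stand-in for ChatRequest (assumption: the other fields are omitted here).
interface PartialChatRequest {
  readonly cwd?: string;
  readonly reasoningEffort?: ReasoningEffort;
}

// "No override" means the key is absent, never null or "".
function withEffortOverride(
  base: PartialChatRequest,
  override?: ReasoningEffort,
): PartialChatRequest {
  return override === undefined ? base : { ...base, reasoningEffort: override };
}
```

Because `null` on the GET response means "never set" rather than "off", a client only needs the default to render a label. The budget table matters to provider code, not to the UI.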
