feat(chat): reasoning-effort selector — sticky per-conversation thinking-depth knob

Consume the backend's reasoning-effort handoff (wire@0.7.0 ReasoningEffort + transport-contract@0.11.0 GET/PUT /conversations/:id/reasoning-effort, ChatRequest.reasoningEffort): a 5-level selector in the sidebar Model view, under the provider + model dropdowns. null renders as 'high (default)' per the server-owned resolution chain; PUT on change (effective next turn); error + revert on 400; per-conversation re-mount incl. drafts (the draft id survives promotion, so an effort set on a draft applies from turn 1). Re-mirrored .dispatch references; GLOSSARY 'reasoning effort'; handoff updated. 616 tests green; live curl probe passed.
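
The selector behavior described above (a null value shown as 'high (default)', a PUT on change, and error plus revert on rejection) could be sketched roughly as follows. This is a minimal illustration, not the component from this commit: `SelectorState`, `PutEffort`, `effortLabel`, and `changeEffort` are hypothetical names, and the transport is reduced to a function that resolves with an HTTP status code.

```typescript
// Local mirror of the ladder from @dispatch/wire so the sketch is self-contained.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// null means "never set"; the server resolves such turns at "high",
// so the UI labels it instead of re-implementing the resolution chain.
function effortLabel(value: ReasoningEffort | null): string {
	return value === null ? "high (default)" : value;
}

// Hypothetical transport: performs the PUT and resolves with the HTTP status.
type PutEffort = (conversationId: string, level: ReasoningEffort) => Promise<number>;

// Hypothetical view state for the selector.
interface SelectorState {
	readonly value: ReasoningEffort | null;
	readonly error: string | null;
}

// Send the PUT. Keep the new level on 200; otherwise keep the previous
// value (the "revert") and surface an error message.
async function changeEffort(
	state: SelectorState,
	conversationId: string,
	next: ReasoningEffort,
	put: PutEffort,
): Promise<SelectorState> {
	const status = await put(conversationId, next);
	if (status === 200) {
		return { value: next, error: null };
	}
	return { value: state.value, error: `server rejected "${next}" (HTTP ${status})` };
}
```

Keeping the state transition free of DOM and network details makes the revert path easy to unit test with a stubbed `put`.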
author: Adam Malczewski <[email protected]> 2026-06-12 20:38:57 +0900
committer: Adam Malczewski <[email protected]> 2026-06-12 20:38:57 +0900
commit: baa6f6c9d21de2f6ffc60e00f53c61d026155933 (patch)
tree: fecae91d99d906a7b5054b398e4d3d90894567a0 /.dispatch
parent: 7dcc06eecb5b691b0c0daec26db9d5e407d0a60e (diff)
download: dispatch-web-baa6f6c9d21de2f6ffc60e00f53c61d026155933.tar.gz
dispatch-web-baa6f6c9d21de2f6ffc60e00f53c61d026155933.zip
2 files changed, 88 insertions, 6 deletions
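
For orientation, the patch below documents a server-owned resolution order (per-turn override, then persisted conversation value, then the default `"high"`) and the Anthropic provider's token budgets per level. The sketch here only illustrates that documented order; it is not the orchestrator's code, clients are told not to re-implement it, and `resolveEffort` and `ANTHROPIC_BUDGET_TOKENS` are hypothetical names.

```typescript
// Local mirror of the ladder from @dispatch/wire so the sketch is self-contained.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Illustration of the documented precedence:
// per-turn override -> persisted conversation value -> default "high".
function resolveEffort(
	perTurn: ReasoningEffort | undefined,
	persisted: ReasoningEffort | null,
): ReasoningEffort {
	return perTurn ?? persisted ?? "high";
}

// Budgets as listed in the wire delta for the Anthropic provider.
// The constant name is an assumption; the numbers come from the patch text.
const ANTHROPIC_BUDGET_TOKENS: Record<ReasoningEffort, number> = {
	low: 4096,
	medium: 10240,
	high: 16384,
	xhigh: 32768,
	max: 65536,
};
```

For example, a conversation persisted at `"low"` with a per-turn override of `"max"` resolves to `"max"` for that turn only, and the next turn without an override goes back to `"low"`.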
diff --git a/.dispatch/transport-contract.reference.md b/.dispatch/transport-contract.reference.md
index 774cfb0..1c3d993 100644
--- a/.dispatch/transport-contract.reference.md
+++ b/.dispatch/transport-contract.reference.md
@@ -5,10 +5,27 @@
 > hangs on a permission prompt). Your CODE still imports `@dispatch/transport-contract` normally —
 > this file is for READING only.
 >
-> **Orchestrator:** SNAPSHOT of `transport-contract@0.10.0` (CR-5 history windowing shipped).
-> Depends on `@dispatch/wire@0.6.1` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
+> **Orchestrator:** SNAPSHOT of `transport-contract@0.11.0` (reasoning effort shipped).
+> Depends on `@dispatch/wire@0.7.0` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
 > `ui-contract.reference.md`).
 >
+> **2026-06-12 delta (reasoning-effort handoff — package bumped `0.10.0` → `0.11.0`, ADDITIVE):**
+> the thinking-depth knob (`ReasoningEffort`, re-exported from `wire@0.7.0`) lands in TWO scopes,
+> resolved server-side per turn (per-turn override → persisted conversation value → default
+> `"high"`; do NOT re-implement the chain client-side):
+> 1. **Per-turn override** — optional `reasoningEffort?: ReasoningEffort` on `ChatRequest` (and
+>    therefore on WS `chat.send`, which extends it). Applies to THAT turn only; never persists.
+>    OMIT the key for "no override" (never send `null`/`""`).
+> 2. **Persisted per-conversation setting** — `GET /conversations/:id/reasoning-effort` →
+>    `ReasoningEffortResponse { conversationId, reasoningEffort: ReasoningEffort | null }`
+>    (`null` = never set ⇒ the default `"high"` applies, NOT "off") and
+>    `PUT /conversations/:id/reasoning-effort` body `SetReasoningEffortRequest
+>    { reasoningEffort }`. Takes effect from the NEXT turn.
+> Validation: an unrecognized level → HTTP 400 `{ error }` listing the valid levels (same for the
+> WS path via the standard `chat.send` error reply). Cache note: CHANGING the level changes the
+> provider request shape and can bust the prompt cache for the next turn (one-time re-prefill);
+> a stable setting stays cache-safe (warming uses the same resolved effort).
+>
 > **2026-06-12 delta (CR-5 history windowing — package bumped `0.9.0` → `0.10.0`):** NO type-shape
 > change — `GET /conversations/:id` gains two OPTIONAL query params alongside `sinceSeq`:
 > **`limit=<k>`** (the NEWEST `k` chunks of the selection, still ASCENDING; a selection with ≤ `k`
@@ -126,6 +143,11 @@
 - `GET /conversations/:id/lsp` — `LspStatusResponse`. LAZILY spawns+initializes the configured servers
   on the first call per cwd (can take a moment; cached after); returns once each settles to
   `connected`/`error`. `servers` is `[]` when `cwd` is null.
+- `GET /conversations/:id/reasoning-effort` — `ReasoningEffortResponse` (`reasoningEffort` is `null`
+  when never set ⇒ default `"high"` applies). Works for an unseen/draft id.
+- `PUT /conversations/:id/reasoning-effort` — body `SetReasoningEffortRequest` →
+  `200 ReasoningEffortResponse`; `400 { error }` on an unrecognized level (the message lists the
+  valid levels). Persists the conversation's sticky level; effective from the NEXT turn.
 - WebSocket on :24205 — ONE path-agnostic socket multiplexes surface ops
   (`@dispatch/ui-contract`) + chat ops (below). Open once, send `WsClientMessage`, receive
   `WsServerMessage`. Live `AgentEvent` deltas carry `conversationId`+`turnId` but **no `seq`**
@@ -150,9 +172,15 @@
  */
 
 import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
-import type { AgentEvent, StoredChunk, TurnMetrics } from "@dispatch/wire";
+import type { AgentEvent, ReasoningEffort, StoredChunk, TurnMetrics } from "@dispatch/wire";
 
-export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatch/wire";
+export type {
+	AgentEvent,
+	ReasoningEffort,
+	StepMetrics,
+	StoredChunk,
+	TurnMetrics,
+} from "@dispatch/wire";
 
 /**
  * Request body for `POST /chat` (sent as JSON).
@@ -184,6 +212,14 @@ export interface ChatRequest {
 	 * prompt (so it does not affect prompt caching).
 	 */
 	readonly cwd?: string;
+
+	/**
+	 * Reasoning-effort override for THIS turn only (does not persist). When
+	 * omitted, the server resolves the conversation's persisted value, falling
+	 * back to `"high"`. Must be one of the `ReasoningEffort` levels; an
+	 * unrecognized value → HTTP 400 `{ error }`.
+	 */
+	readonly reasoningEffort?: ReasoningEffort;
 }
 
 /**
@@ -315,6 +351,28 @@ export interface SetCwdRequest {
 	readonly cwd: string;
 }
 
+// ─── Per-conversation reasoning effort ────────────────────────────────────────
+
+/**
+ * Response of `GET /conversations/:id/reasoning-effort`. `reasoningEffort` is
+ * null when never set (the server then resolves turns at the default,
+ * `"high"`).
+ */
+export interface ReasoningEffortResponse {
+	readonly conversationId: string;
+	readonly reasoningEffort: ReasoningEffort | null;
+}
+
+/**
+ * Body of `PUT /conversations/:id/reasoning-effort` — persists the
+ * conversation's sticky reasoning-effort level (used for every later turn that
+ * does not carry a per-turn `ChatRequest.reasoningEffort` override). An
+ * unrecognized level → HTTP 400 `{ error }`.
+ */
+export interface SetReasoningEffortRequest {
+	readonly reasoningEffort: ReasoningEffort;
+}
+
 // ─── Conversation close (explicit tab close) ──────────────────────────────────
 
 /**
diff --git a/.dispatch/wire.reference.md b/.dispatch/wire.reference.md
index 1d761bf..34984d2 100644
--- a/.dispatch/wire.reference.md
+++ b/.dispatch/wire.reference.md
@@ -4,8 +4,18 @@
 > types WITHOUT following the `file:` dep symlink out of this repo (which hangs on a permission
 > prompt). Your CODE still imports `@dispatch/wire` normally — this file is for READING only.
 >
-> **Orchestrator:** SNAPSHOT of `wire@0.6.1` (doc-only bump: the 1-based gap-free seq guarantee
-> codified on `StoredChunk`). Regenerate whenever `@dispatch/wire` changes.
+> **Orchestrator:** SNAPSHOT of `wire@0.7.0` (reasoning effort — the thinking-depth knob).
+> Regenerate whenever `@dispatch/wire` changes.
+>
+> **2026-06-12 delta (reasoning-effort handoff — package bumped `0.6.1` → `0.7.0`, ADDITIVE):**
+> adds the **`ReasoningEffort`** type — the per-request thinking-depth ladder
+> `"low" | "medium" | "high" | "xhigh" | "max"`. Provider-agnostic; the Anthropic provider maps
+> levels to extended-thinking token budgets (low 4096 · medium 10240 · high 16384 · xhigh 32768 ·
+> max 65536); providers without a thinking knob ignore it. Resolution is SERVER-owned (do not
+> re-implement): per-turn `ChatRequest.reasoningEffort` override → persisted per-conversation value
+> (`GET`/`PUT /conversations/:id/reasoning-effort`, see `transport-contract@0.11.0`) → default
+> `"high"`. Higher levels mean longer runs of `reasoning-delta` events before the first text delta.
+> See the `ReasoningEffort` definition below.
 >
 > **2026-06-12 delta (CR-5 history windowing — package bumped `0.6.0` → `0.6.1`, DOC-ONLY):** the
 > per-conversation `seq` numbering is now a WRITTEN CONTRACTUAL GUARANTEE on `StoredChunk`:
@@ -196,6 +206,20 @@ export interface StoredChunk {
 	readonly chunk: Chunk;
 }
 
+// ─── Reasoning effort ───────────────────────────────────────────────────────
+
+/**
+ * The per-request thinking-depth knob: how much extended thinking / reasoning
+ * the model should spend before answering. Provider-agnostic ladder; each
+ * provider maps a level to its native knob in its own code (e.g. an Anthropic
+ * provider maps it to a `thinking.budget_tokens` value) and MAY ignore levels
+ * (or the field entirely) that its backend cannot express.
+ *
+ * Resolution (owned by the session-orchestrator): per-turn request value →
+ * persisted per-conversation value → default `"high"`.
+ */
+export type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+
 // ─── Usage ──────────────────────────────────────────────────────────────────
 
 /**
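
The contract above asks clients to omit the `reasoningEffort` key for "no override" (never `null` or `""`) and rejects unrecognized levels with HTTP 400. A client-side guard along these lines could avoid sending a request the server would reject. This is a sketch under assumptions: `isReasoningEffort` and `withEffortOverride` are hypothetical helpers, and the `message` field in the usage note is a placeholder, not a field confirmed by this patch.

```typescript
// Local mirror of the ladder from @dispatch/wire so the sketch is self-contained.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

const REASONING_EFFORT_LEVELS: readonly ReasoningEffort[] = [
	"low",
	"medium",
	"high",
	"xhigh",
	"max",
];

// Runtime type guard for values arriving from untyped sources (form inputs, storage).
function isReasoningEffort(value: unknown): value is ReasoningEffort {
	return (
		typeof value === "string" &&
		(REASONING_EFFORT_LEVELS as readonly string[]).indexOf(value) !== -1
	);
}

// Attach a per-turn override to a request body.
// undefined, null, and "" all mean "no override": the key is left off entirely.
function withEffortOverride<T extends object>(
	base: T,
	override: unknown,
): T & { readonly reasoningEffort?: ReasoningEffort } {
	if (override === undefined || override === null || override === "") {
		return base;
	}
	if (!isReasoningEffort(override)) {
		throw new RangeError(
			`unrecognized reasoning effort; valid levels: ${REASONING_EFFORT_LEVELS.join(", ")}`,
		);
	}
	return { ...base, reasoningEffort: override };
}
```

With a placeholder body such as `{ message: "hi" }`, calling `withEffortOverride(body, undefined)` returns the body without a `reasoningEffort` key, while `withEffortOverride(body, "max")` adds the override for that turn.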