feat(contracts): reasoning effort — ReasoningEffort ladder (low..max), ProviderStreamOptions/ChatRequest fields, per-conversation GET/PUT types

wire 0.6.1->0.7.0, transport-contract 0.10.0->0.11.0. Additive only; typecheck+biome clean.
author: Adam Malczewski <[email protected]> 2026-06-12 19:26:31 +0900
committer: Adam Malczewski <[email protected]> 2026-06-12 19:26:31 +0900
commit: 35197ed933044d322d0a653c4e88a5f3e475fe76
tree: f768be26a61b28551a0671f2519c3da4ff682a1f
parent: dbf77ba78ff840e0ed5f6294030523fe3ab121fa
8 files changed, 84 insertions, 6 deletions
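
Reviewer note: this commit only adds types, so the resolution chain it documents (per-turn `ChatRequest.reasoningEffort` override, then the persisted per-conversation value, then the default `high`) is not implemented here. A minimal sketch of how a later wave might express it follows. The union is copied from the `@dispatch/wire` change in this patch; `REASONING_EFFORT_LEVELS`, `DEFAULT_REASONING_EFFORT`, `isReasoningEffort`, and `resolveReasoningEffort` are hypothetical names, not part of the patch.

```typescript
// Copied from the @dispatch/wire change in this commit.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Hypothetical helpers below: only the ladder and the "high" default come from the commit.
const REASONING_EFFORT_LEVELS: readonly ReasoningEffort[] = [
	"low",
	"medium",
	"high",
	"xhigh",
	"max",
];
const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

/** Runtime guard a transport could use before answering HTTP 400 on an unrecognized level. */
function isReasoningEffort(value: unknown): value is ReasoningEffort {
	return (
		typeof value === "string" &&
		(REASONING_EFFORT_LEVELS as readonly string[]).includes(value)
	);
}

/** Per-turn override, then persisted per-conversation value, then the default. */
function resolveReasoningEffort(
	turnOverride: ReasoningEffort | undefined,
	stored: ReasoningEffort | null,
): ReasoningEffort {
	return turnOverride ?? stored ?? DEFAULT_REASONING_EFFORT;
}
```

Under this sketch, `resolveReasoningEffort(undefined, null)` yields the default, and a stored value only applies when the turn carries no override. The `null` in the `stored` parameter matches the `ReasoningEffortResponse.reasoningEffort` shape for a conversation that never set a level.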
diff --git a/GLOSSARY.md b/GLOSSARY.md
index 7564276..61a555d 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -51,6 +51,7 @@
 | **diagnostics** | The errors/warnings/hints a `language server` reports for a file — received both push (`textDocument/publishDiagnostics`) and pull (`textDocument/diagnostic`), then merged + deduped. | lints (when meaning LSP diagnostics) |
 | **workspace root** | The directory a `language server` is rooted at (its `rootUri` and spawn cwd): the nearest root-marker ancestor of a file, bounded above by the conversation's `working directory`. | project root (when meaning the per-server root) |
 | **working directory** | The per-conversation filesystem directory that tools and `language server`s operate within (`ToolExecuteContext.cwd`). Persisted per conversation by `conversation-store`; gettable/settable via the cwd endpoint; defaults a turn's cwd when `/chat` omits it. | cwd (spell out on first use), workdir (when meaning the conversation's directory) |
+| **reasoning effort** | The per-request thinking-depth knob: how much extended thinking the model spends before answering. Ladder `low \| medium \| high \| xhigh \| max` (`ReasoningEffort` in `@dispatch/wire`). Resolution per turn: `ChatRequest.reasoningEffort` override → persisted per-conversation value → default `high`. Each provider maps a level to its native knob in its own code (e.g. Anthropic `thinking.budget_tokens`); providers without such a knob ignore it. | thinking level, effort level, thinking budget (that's the provider-NATIVE knob a level maps to) |
 | **context size** | The number of tokens a conversation currently occupies: the most recent turn's FINAL step `inputTokens + outputTokens` (NOT the aggregate per-turn `usage`, which sums per-step prompts and overcounts a multi-step turn). Stamped on `TurnDoneEvent.contextSize` (live) + `TurnMetrics.contextSize` (persisted); a client reads the LATEST turn's value as current usage. Distinct from the model's **context window** (its max token limit — a later feature). | context window (when meaning current usage), context length, tokens used, context usage |
 
 ## Known vocabulary drift
diff --git a/packages/kernel/src/contracts/index.ts b/packages/kernel/src/contracts/index.ts
index 10025e2..ffcbe76 100644
--- a/packages/kernel/src/contracts/index.ts
+++ b/packages/kernel/src/contracts/index.ts
@@ -95,6 +95,7 @@ export type {
 	ProviderStreamOptions,
 	ProviderToolCallEvent,
 	ReasoningDeltaEvent,
+	ReasoningEffort,
 	TextDeltaEvent,
 	Usage,
 	UsageEvent,
diff --git a/packages/kernel/src/contracts/provider.ts b/packages/kernel/src/contracts/provider.ts
index 0686c19..7f920c5 100644
--- a/packages/kernel/src/contracts/provider.ts
+++ b/packages/kernel/src/contracts/provider.ts
@@ -6,12 +6,12 @@
  * translates its responses into `ProviderEvent`s.
  */
 
-import type { Usage } from "@dispatch/wire";
+import type { ReasoningEffort, Usage } from "@dispatch/wire";
 import type { ChatMessage } from "./conversation.js";
 import type { Logger } from "./logging.js";
 import type { ToolContract } from "./tool.js";
 
-export type { Usage } from "@dispatch/wire";
+export type { ReasoningEffort, Usage } from "@dispatch/wire";
 
 /**
  * Events a provider yields during a single `stream` call. The kernel consumes
@@ -86,6 +86,14 @@ export interface ProviderStreamOptions {
 	/** System prompt to prepend. */
 	readonly systemPrompt?: string;
 	/**
+	 * Reasoning-effort level for this request (already RESOLVED by the caller —
+	 * the session-orchestrator applies the request → conversation → `"high"`
+	 * default chain, so a provider receiving `undefined` may treat it as "no
+	 * preference"). The provider maps the level to its native thinking knob in
+	 * its own code; providers without such a knob ignore it.
+	 */
+	readonly reasoningEffort?: ReasoningEffort;
+	/**
 	 * Correlated logger for this turn's step (Phase A logging ABI). When present,
 	 * the provider should open a child `provider.request` span and capture the
 	 * verbatim post-transform request + raw response/error there, self-redacting
diff --git a/packages/transport-contract/package.json b/packages/transport-contract/package.json
index 84ceada..3a0a983 100644
--- a/packages/transport-contract/package.json
+++ b/packages/transport-contract/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@dispatch/transport-contract",
-	"version": "0.10.0",
+	"version": "0.11.0",
 	"type": "module",
 	"private": true,
 	"main": "dist/index.js",
diff --git a/packages/transport-contract/src/index.ts b/packages/transport-contract/src/index.ts
index b0f6e20..e992f8b 100644
--- a/packages/transport-contract/src/index.ts
+++ b/packages/transport-contract/src/index.ts
@@ -20,9 +20,15 @@
  */
 
 import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
-import type { AgentEvent, StoredChunk, TurnMetrics } from "@dispatch/wire";
+import type { AgentEvent, ReasoningEffort, StoredChunk, TurnMetrics } from "@dispatch/wire";
 
-export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatch/wire";
+export type {
+	AgentEvent,
+	ReasoningEffort,
+	StepMetrics,
+	StoredChunk,
+	TurnMetrics,
+} from "@dispatch/wire";
 
 /**
  * Request body for `POST /chat` (sent as JSON).
@@ -54,6 +60,14 @@ export interface ChatRequest {
 	 * prompt (so it does not affect prompt caching).
 	 */
 	readonly cwd?: string;
+
+	/**
+	 * Reasoning-effort override for THIS turn only (does not persist). When
+	 * omitted, the server resolves the conversation's persisted value, falling
+	 * back to `"high"`. Must be one of the `ReasoningEffort` levels; an
+	 * unrecognized value → HTTP 400 `{ error }`.
+	 */
+	readonly reasoningEffort?: ReasoningEffort;
 }
 
 /**
@@ -191,6 +205,28 @@ export interface SetCwdRequest {
 	readonly cwd: string;
 }
 
+// ─── Per-conversation reasoning effort ────────────────────────────────────────
+
+/**
+ * Response of `GET /conversations/:id/reasoning-effort`. `reasoningEffort` is
+ * null when never set (the server then resolves turns at the default,
+ * `"high"`).
+ */
+export interface ReasoningEffortResponse {
+	readonly conversationId: string;
+	readonly reasoningEffort: ReasoningEffort | null;
+}
+
+/**
+ * Body of `PUT /conversations/:id/reasoning-effort` — persists the
+ * conversation's sticky reasoning-effort level (used for every later turn that
+ * does not carry a per-turn `ChatRequest.reasoningEffort` override). An
+ * unrecognized level → HTTP 400 `{ error }`.
+ */
+export interface SetReasoningEffortRequest {
+	readonly reasoningEffort: ReasoningEffort;
+}
+
 // ─── Conversation close (explicit tab close) ──────────────────────────────────
 
 /**
diff --git a/packages/wire/package.json b/packages/wire/package.json
index d00772d..07c20b7 100644
--- a/packages/wire/package.json
+++ b/packages/wire/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@dispatch/wire",
-	"version": "0.6.1",
+	"version": "0.7.0",
 	"type": "module",
 	"private": true,
 	"main": "dist/index.js",
diff --git a/packages/wire/src/index.ts b/packages/wire/src/index.ts
index 4fdf389..6a6de7d 100644
--- a/packages/wire/src/index.ts
+++ b/packages/wire/src/index.ts
@@ -146,6 +146,20 @@ export interface StoredChunk {
 	readonly chunk: Chunk;
 }
 
+// ─── Reasoning effort ───────────────────────────────────────────────────────
+
+/**
+ * The per-request thinking-depth knob: how much extended thinking / reasoning
+ * the model should spend before answering. Provider-agnostic ladder; each
+ * provider maps a level to its native knob in its own code (e.g. an Anthropic
+ * provider maps it to a `thinking.budget_tokens` value) and MAY ignore levels
+ * (or the field entirely) that its backend cannot express.
+ *
+ * Resolution (owned by the session-orchestrator): per-turn request value →
+ * persisted per-conversation value → default `"high"`.
+ */
+export type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+
 // ─── Usage ──────────────────────────────────────────────────────────────────
 
 /**
diff --git a/tasks.md b/tasks.md
index 205d9c9..ceb5557 100644
--- a/tasks.md
+++ b/tasks.md
@@ -353,6 +353,24 @@ derives `hasOlder` from `chunks[0].seq > 1`.
   DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'`
   self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands.
 
+## Reasoning effort (current milestone)
+User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max`
+(Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn
+`ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks sensible
+budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now.
+- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`);
+  `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is
+  forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/
+  `SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`);
+  glossary entry. typecheck + biome clean.
+- [ ] **Wave 1 (parallel):** `conversation-store` get/set persisted effort (mirror cwd);
+  `provider-anthropic` (../claude) level→thinking.budget_tokens mapping; `cli` `--effort` flag.
+- [ ] **Wave 2:** `session-orchestrator` — resolution chain (turn override → stored → `high`),
+  thread into providerOpts via `StartTurnInput.reasoningEffort`.
+- [ ] **Wave 3:** `transport-http` (validate `reasoningEffort`, thread to startTurn, GET/PUT
+  `/conversations/:id/reasoning-effort`) + `transport-ws` (thread on `chat.send`).
+- [ ] Live-verify vs claude; FE courier handoff.
+
 ## Open items
 - **Context window LIMIT (deferred, sibling of context size):** expose the selected model's max
   context-window token limit so the FE can render `contextSize / limit` (e.g. `1286 / 200000`).
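
Reviewer note on Wave 1: the glossary entry and the `ProviderStreamOptions.reasoningEffort` doc comment say each provider maps a level to its native knob (for Anthropic, `thinking.budget_tokens`) and may ignore levels it cannot express. The sketch below shows one possible shape for that mapping. The token budgets are illustrative placeholders, not values from this repository, and `BUDGET_BY_LEVEL`, `MIN_THINKING_BUDGET`, and `toThinkingParam` are hypothetical names. It also assumes the Anthropic constraint that a thinking budget must be at least 1024 tokens and smaller than `max_tokens`; that should be checked against the current API documentation.

```typescript
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// ILLUSTRATIVE budgets only; the real provider picks its own values.
const BUDGET_BY_LEVEL: Record<ReasoningEffort, number> = {
	low: 1024,
	medium: 4096,
	high: 16384,
	xhigh: 32768,
	max: 65536,
};

// Assumed lower bound for a thinking budget (see the note above).
const MIN_THINKING_BUDGET = 1024;

interface ThinkingParam {
	readonly type: "enabled";
	readonly budget_tokens: number;
}

/**
 * Maps a resolved level to a provider-native thinking parameter.
 * Returns undefined when there is no preference or the level cannot be
 * expressed within the request's max_tokens, i.e. the provider ignores it.
 */
function toThinkingParam(
	effort: ReasoningEffort | undefined,
	maxTokens: number,
): ThinkingParam | undefined {
	if (effort === undefined) return undefined;
	// Keep the budget strictly below max_tokens so room remains for the answer.
	const budget = Math.min(BUDGET_BY_LEVEL[effort], maxTokens - 1);
	if (budget < MIN_THINKING_BUDGET) return undefined;
	return { type: "enabled", budget_tokens: budget };
}
```

Clamping rather than rejecting keeps the provider consistent with the contract's "MAY ignore levels" wording: an oversized level degrades to the largest budget that fits instead of failing the turn.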