author: Adam Malczewski <[email protected]> 2026-06-12 19:26:31 +0900
committer: Adam Malczewski <[email protected]> 2026-06-12 19:26:31 +0900
commit: 35197ed933044d322d0a653c4e88a5f3e475fe76
tree: f768be26a61b28551a0671f2519c3da4ff682a1f
parent: dbf77ba78ff840e0ed5f6294030523fe3ab121fa
feat(contracts): reasoning effort — ReasoningEffort ladder (low..max), ProviderStreamOptions/ChatRequest fields, per-conversation GET/PUT types
wire 0.6.1->0.7.0, transport-contract 0.10.0->0.11.0. Additive only; typecheck+biome clean.
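
The resolution chain named in the contract comments (per-turn request value, then persisted per-conversation value, then default `high`) could be sketched as below. This is only an illustration: the commit adds types, not runtime code, and `resolveReasoningEffort` and `DEFAULT_REASONING_EFFORT` are hypothetical names, not identifiers from the repository. The `ReasoningEffort` type is redeclared locally so the sketch stands alone.

```typescript
// Local copy of the ladder this commit adds to @dispatch/wire.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// The default stated in the contract comments.
const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

// Hypothetical helper (not part of this commit): applies the
// turn override -> stored value -> default chain.
function resolveReasoningEffort(
  turnOverride: ReasoningEffort | undefined,
  stored: ReasoningEffort | null | undefined,
): ReasoningEffort {
  // `??` skips only null/undefined, so any valid level short-circuits the chain.
  return turnOverride ?? stored ?? DEFAULT_REASONING_EFFORT;
}
```

With this shape, a conversation that never stored a level (the `null` case in `ReasoningEffortResponse`) and a turn without an override both fall through to `high`.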
-rw-r--r-- GLOSSARY.md | 1
-rw-r--r-- packages/kernel/src/contracts/index.ts | 1
-rw-r--r-- packages/kernel/src/contracts/provider.ts | 12
-rw-r--r-- packages/transport-contract/package.json | 2
-rw-r--r-- packages/transport-contract/src/index.ts | 40
-rw-r--r-- packages/wire/package.json | 2
-rw-r--r-- packages/wire/src/index.ts | 14
-rw-r--r-- tasks.md | 18
8 files changed, 84 insertions, 6 deletions
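
The `ChatRequest` and `SetReasoningEffortRequest` comments in the diff state that an unrecognized level yields HTTP 400 `{ error }`. A transport could check an untrusted JSON value with a type guard along these lines. This is a minimal sketch under assumptions: `isReasoningEffort` and `REASONING_EFFORT_LEVELS` are hypothetical names, and the actual validation is listed as pending Wave 3 work in `tasks.md`.

```typescript
// Local copy of the ladder this commit adds to @dispatch/wire.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Hypothetical runtime list mirroring the type-level ladder.
const REASONING_EFFORT_LEVELS: readonly ReasoningEffort[] = [
  "low",
  "medium",
  "high",
  "xhigh",
  "max",
];

// Hypothetical type guard: narrows an untrusted value to ReasoningEffort.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return (
    typeof value === "string" &&
    REASONING_EFFORT_LEVELS.some((level) => level === value)
  );
}
```

A handler would presumably answer 400 when the field is present and the guard returns `false`, and treat an absent field as "no override".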
diff --git a/GLOSSARY.md b/GLOSSARY.md
index 7564276..61a555d 100644
--- a/GLOSSARY.md
+++ b/GLOSSARY.md
@@ -51,6 +51,7 @@
| **diagnostics** | The errors/warnings/hints a `language server` reports for a file — received both push (`textDocument/publishDiagnostics`) and pull (`textDocument/diagnostic`), then merged + deduped. | lints (when meaning LSP diagnostics) |
| **workspace root** | The directory a `language server` is rooted at (its `rootUri` and spawn cwd): the nearest root-marker ancestor of a file, bounded above by the conversation's `working directory`. | project root (when meaning the per-server root) |
| **working directory** | The per-conversation filesystem directory that tools and `language server`s operate within (`ToolExecuteContext.cwd`). Persisted per conversation by `conversation-store`; gettable/settable via the cwd endpoint; defaults a turn's cwd when `/chat` omits it. | cwd (spell out on first use), workdir (when meaning the conversation's directory) |
+| **reasoning effort** | The per-request thinking-depth knob: how much extended thinking the model spends before answering. Ladder `low \| medium \| high \| xhigh \| max` (`ReasoningEffort` in `@dispatch/wire`). Resolution per turn: `ChatRequest.reasoningEffort` override → persisted per-conversation value → default `high`. Each provider maps a level to its native knob in its own code (e.g. Anthropic `thinking.budget_tokens`); providers without such a knob ignore it. | thinking level, effort level, thinking budget (that's the provider-NATIVE knob a level maps to) |
| **context size** | The number of tokens a conversation currently occupies: the most recent turn's FINAL step `inputTokens + outputTokens` (NOT the aggregate per-turn `usage`, which sums per-step prompts and overcounts a multi-step turn). Stamped on `TurnDoneEvent.contextSize` (live) + `TurnMetrics.contextSize` (persisted); a client reads the LATEST turn's value as current usage. Distinct from the model's **context window** (its max token limit — a later feature). | context window (when meaning current usage), context length, tokens used, context usage |
## Known vocabulary drift
diff --git a/packages/kernel/src/contracts/index.ts b/packages/kernel/src/contracts/index.ts
index 10025e2..ffcbe76 100644
--- a/packages/kernel/src/contracts/index.ts
+++ b/packages/kernel/src/contracts/index.ts
@@ -95,6 +95,7 @@ export type {
ProviderStreamOptions,
ProviderToolCallEvent,
ReasoningDeltaEvent,
+ ReasoningEffort,
TextDeltaEvent,
Usage,
UsageEvent,
diff --git a/packages/kernel/src/contracts/provider.ts b/packages/kernel/src/contracts/provider.ts
index 0686c19..7f920c5 100644
--- a/packages/kernel/src/contracts/provider.ts
+++ b/packages/kernel/src/contracts/provider.ts
@@ -6,12 +6,12 @@
* translates its responses into `ProviderEvent`s.
*/
-import type { Usage } from "@dispatch/wire";
+import type { ReasoningEffort, Usage } from "@dispatch/wire";
import type { ChatMessage } from "./conversation.js";
import type { Logger } from "./logging.js";
import type { ToolContract } from "./tool.js";
-export type { Usage } from "@dispatch/wire";
+export type { ReasoningEffort, Usage } from "@dispatch/wire";
/**
* Events a provider yields during a single `stream` call. The kernel consumes
@@ -86,6 +86,14 @@ export interface ProviderStreamOptions {
/** System prompt to prepend. */
readonly systemPrompt?: string;
/**
+ * Reasoning-effort level for this request (already RESOLVED by the caller —
+ * the session-orchestrator applies the request → conversation → `"high"`
+ * default chain, so a provider receiving `undefined` may treat it as "no
+ * preference"). The provider maps the level to its native thinking knob in
+ * its own code; providers without such a knob ignore it.
+ */
+ readonly reasoningEffort?: ReasoningEffort;
+ /**
* Correlated logger for this turn's step (Phase A logging ABI). When present,
* the provider should open a child `provider.request` span and capture the
* verbatim post-transform request + raw response/error there, self-redacting
diff --git a/packages/transport-contract/package.json b/packages/transport-contract/package.json
index 84ceada..3a0a983 100644
--- a/packages/transport-contract/package.json
+++ b/packages/transport-contract/package.json
@@ -1,6 +1,6 @@
{
"name": "@dispatch/transport-contract",
- "version": "0.10.0",
+ "version": "0.11.0",
"type": "module",
"private": true,
"main": "dist/index.js",
diff --git a/packages/transport-contract/src/index.ts b/packages/transport-contract/src/index.ts
index b0f6e20..e992f8b 100644
--- a/packages/transport-contract/src/index.ts
+++ b/packages/transport-contract/src/index.ts
@@ -20,9 +20,15 @@
*/
import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
-import type { AgentEvent, StoredChunk, TurnMetrics } from "@dispatch/wire";
+import type { AgentEvent, ReasoningEffort, StoredChunk, TurnMetrics } from "@dispatch/wire";
-export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatch/wire";
+export type {
+ AgentEvent,
+ ReasoningEffort,
+ StepMetrics,
+ StoredChunk,
+ TurnMetrics,
+} from "@dispatch/wire";
/**
* Request body for `POST /chat` (sent as JSON).
@@ -54,6 +60,14 @@ export interface ChatRequest {
* prompt (so it does not affect prompt caching).
*/
readonly cwd?: string;
+
+ /**
+ * Reasoning-effort override for THIS turn only (does not persist). When
+ * omitted, the server resolves the conversation's persisted value, falling
+ * back to `"high"`. Must be one of the `ReasoningEffort` levels; an
+ * unrecognized value → HTTP 400 `{ error }`.
+ */
+ readonly reasoningEffort?: ReasoningEffort;
}
/**
@@ -191,6 +205,28 @@ export interface SetCwdRequest {
readonly cwd: string;
}
+// ─── Per-conversation reasoning effort ────────────────────────────────────────
+
+/**
+ * Response of `GET /conversations/:id/reasoning-effort`. `reasoningEffort` is
+ * null when never set (the server then resolves turns at the default,
+ * `"high"`).
+ */
+export interface ReasoningEffortResponse {
+ readonly conversationId: string;
+ readonly reasoningEffort: ReasoningEffort | null;
+}
+
+/**
+ * Body of `PUT /conversations/:id/reasoning-effort` — persists the
+ * conversation's sticky reasoning-effort level (used for every later turn that
+ * does not carry a per-turn `ChatRequest.reasoningEffort` override). An
+ * unrecognized level → HTTP 400 `{ error }`.
+ */
+export interface SetReasoningEffortRequest {
+ readonly reasoningEffort: ReasoningEffort;
+}
+
// ─── Conversation close (explicit tab close) ──────────────────────────────────
/**
diff --git a/packages/wire/package.json b/packages/wire/package.json
index d00772d..07c20b7 100644
--- a/packages/wire/package.json
+++ b/packages/wire/package.json
@@ -1,6 +1,6 @@
{
"name": "@dispatch/wire",
- "version": "0.6.1",
+ "version": "0.7.0",
"type": "module",
"private": true,
"main": "dist/index.js",
diff --git a/packages/wire/src/index.ts b/packages/wire/src/index.ts
index 4fdf389..6a6de7d 100644
--- a/packages/wire/src/index.ts
+++ b/packages/wire/src/index.ts
@@ -146,6 +146,20 @@ export interface StoredChunk {
readonly chunk: Chunk;
}
+// ─── Reasoning effort ───────────────────────────────────────────────────────
+
+/**
+ * The per-request thinking-depth knob: how much extended thinking / reasoning
+ * the model should spend before answering. Provider-agnostic ladder; each
+ * provider maps a level to its native knob in its own code (e.g. an Anthropic
+ * provider maps it to a `thinking.budget_tokens` value) and MAY ignore levels
+ * (or the field entirely) that its backend cannot express.
+ *
+ * Resolution (owned by the session-orchestrator): per-turn request value →
+ * persisted per-conversation value → default `"high"`.
+ */
+export type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+
// ─── Usage ──────────────────────────────────────────────────────────────────
/**
diff --git a/tasks.md b/tasks.md
index 205d9c9..ceb5557 100644
--- a/tasks.md
+++ b/tasks.md
@@ -353,6 +353,24 @@ derives `hasOlder` from `chunks[0].seq > 1`.
DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'`
self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands.
+## Reasoning effort (current milestone)
+User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max`
+(Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn
+`ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks sensible
+budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now.
+- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`);
+ `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is
+ forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/
+ `SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`);
+ glossary entry. typecheck + biome clean.
+- [ ] **Wave 1 (parallel):** `conversation-store` get/set persisted effort (mirror cwd);
+ `provider-anthropic` (../claude) level→thinking.budget_tokens mapping; `cli` `--effort` flag.
+- [ ] **Wave 2:** `session-orchestrator` — resolution chain (turn override → stored → `high`),
+ thread into providerOpts via `StartTurnInput.reasoningEffort`.
+- [ ] **Wave 3:** `transport-http` (validate `reasoningEffort`, thread to startTurn, GET/PUT
+ `/conversations/:id/reasoning-effort`) + `transport-ws` (thread on `chat.send`).
+- [ ] Live-verify vs claude; FE courier handoff.
+
## Open items
- **Context window LIMIT (deferred, sibling of context size):** expose the selected model's max
context-window token limit so the FE can render `contextSize / limit` (e.g. `1286 / 200000`).