docs(handoff): FE courier — reasoning effort (selector, per-turn override, endpoints)

author: Adam Malczewski <[email protected]> 2026-06-12 20:16:02 +0900
committer: Adam Malczewski <[email protected]> 2026-06-12 20:16:02 +0900
commit: a1639b72103e4f038950a9dfe51c86fdda9f2771 (patch)
tree: fba088d628dedda245b5b4f3c2111dd623b057a2
parent: 57b53105a3a1cf4587244c92e4f8af7c12176249 (diff)
2 files changed, 84 insertions, 2 deletions
diff --git a/frontend-reasoning-effort-handoff.md b/frontend-reasoning-effort-handoff.md
new file mode 100644
index 0000000..8647f36
--- /dev/null
+++ b/frontend-reasoning-effort-handoff.md
@@ -0,0 +1,81 @@
+# FE handoff — reasoning effort (thinking-depth knob)
+
+Courier this to `../dispatch-web` (cross-repo contract change; `lsp references` does not
+span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks.
+
+## What shipped (backend)
+
+A new user-settable knob, **reasoning effort**: how much extended thinking the model spends
+before answering. Canonical ladder (type `ReasoningEffort`, exported by `@dispatch/wire` and
+re-exported by `@dispatch/transport-contract`):
+
+```ts
+type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
+```
+
+Versions: `@dispatch/wire` `0.6.1 → 0.7.0`, `@dispatch/transport-contract`
+`0.10.0 → 0.11.0`. Bump the pinned `file:` deps.
+
+It has TWO setting scopes, resolved server-side per turn:
+
+1. **Per-turn override** — optional `reasoningEffort` on `ChatRequest` (HTTP `POST /chat`)
+   and therefore on the WS `chat.send` message (`ChatSendMessage extends ChatRequest`).
+   Applies to THAT turn only; does NOT persist.
+2. **Persisted per-conversation setting** — sticky; used for every turn that has no per-turn
+   override:
+   - `GET /conversations/:id/reasoning-effort` → `ReasoningEffortResponse`
+     `{ conversationId, reasoningEffort: ReasoningEffort | null }` (`null` = never set).
+   - `PUT /conversations/:id/reasoning-effort` with body `SetReasoningEffortRequest`
+     `{ reasoningEffort }` → persists it.
+
+**Resolution chain (server-owned — do not re-implement):** per-turn override → persisted
+conversation value → **default `"high"`**. So a conversation with nothing set already runs at
+`high`; `null` from the GET means "default (`high`) applies", not "off".
+
+**Validation:** an unrecognized level → HTTP 400 `{ error }` (the error message lists the
+valid levels). Same for the WS path (the standard `chat.send` error reply). Send only the
+five ladder strings; omit the key entirely for "no override" (don't send `null`/`""`).
+
+## What the model does with it (context for UX copy)
+
+The Anthropic provider maps the level to an extended-thinking token budget
+(`low` 4 096 · `medium` 10 240 · `high` 16 384 · `xhigh` 32 768 · `max` 65 536). Higher
+levels = the model thinks longer before answering (more `reasoning-delta` events / thinking
+chunks ahead of the text — the FE already renders those). Providers without a thinking knob
+ignore the field — sending it is always safe.
+
+## What we need the FE to do
+
+1. **Per-conversation effort selector** — a 5-option control (plus an implicit "default"
+   state when the GET returns `null`):
+   - On conversation open: `GET /conversations/:id/reasoning-effort`; render `null` as
+     "high (default)".
+   - On change: `PUT` the chosen level. It takes effect from the NEXT turn — no turn restart
+     needed.
+2. **(Optional) per-turn override** — if the composer grows a "think harder for this one
+   message" affordance, set `reasoningEffort` on that `chat.send` only. The persisted setting
+   is untouched by overrides.
+3. **Expect more thinking** — at `xhigh`/`max` the pre-answer thinking phase can be long;
+   whatever spinner/"thinking…" treatment exists should tolerate extended runs of
+   reasoning deltas before the first text delta.
+
+## Cache note (don't surprise users)
+
+Changing the effort level changes the provider request shape, which can bust the prompt
+cache for the next turn (one-time re-prefill cost). The backend's cache-warming path already
+warms with the SAME resolved effort as a real turn, so a STABLE setting stays cache-safe;
+only the act of changing it costs. If the FE wants, it can mention this in the selector's
+tooltip — no functional handling required.
+
+## Verify (manual)
+
+```bash
+# sticky setting round-trip
+curl -s localhost:24203/conversations/<id>/reasoning-effort          # → reasoningEffort: null the first time
+curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \
+  -H 'content-type: application/json' -d '{"reasoningEffort":"xhigh"}'
+curl -s localhost:24203/conversations/<id>/reasoning-effort          # → reasoningEffort: "xhigh"
+# bad level → 400
+curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \
+  -H 'content-type: application/json' -d '{"reasoningEffort":"banana"}'
+```
diff --git a/tasks.md b/tasks.md
index 1dbbb08..189980f 100644
--- a/tasks.md
+++ b/tasks.md
@@ -380,8 +380,9 @@ budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now.
 - [x] Verified: `tsc -b` EXIT 0, biome clean, **993 vitest + 189 bun** green; all agents in-lane.
   Commits: arch-rewrite `35197ed` (contracts) + `020e051` (impl); ../claude `c0835a4`.
 - [ ] Live-verify vs claude (thinking deltas streamed at xhigh; persisted PUT honored next turn).
-- [ ] FE courier handoff (`frontend-reasoning-effort-handoff.md`): ChatRequest field + GET/PUT
-  endpoints + ladder.
+- [x] FE courier handoff written: `frontend-reasoning-effort-handoff.md` (user couriers to
+  `../dispatch-web`): ChatRequest/chat.send field + GET/PUT endpoints + ladder + default-`high`
+  semantics + cache note.
 
 ## Open items
 - **Context window LIMIT (deferred, sibling of context size):** expose the selected model's max
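
The validation rule in the handoff (send only the five ladder strings, and omit the key entirely for "no override") can be sketched on the client side roughly as follows. This is a minimal illustration, not the backend's or `dispatch-web`'s actual code: `isReasoningEffort`, `buildChatRequest`, and the `ChatRequestSketch` shape (including its `message` field) are hypothetical names introduced here. In the real app the `ReasoningEffort` type would be imported from `@dispatch/wire` rather than redeclared.

```typescript
// Ladder copied from the handoff; redeclared here only to keep the sketch self-contained.
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

const LADDER: readonly ReasoningEffort[] = ["low", "medium", "high", "xhigh", "max"];

// Hypothetical type guard: true only for the five ladder strings.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && (LADDER as readonly string[]).includes(value);
}

// Hypothetical stand-in for ChatRequest: only `reasoningEffort` comes from the
// handoff; `message` is an assumed placeholder for the rest of the request.
interface ChatRequestSketch {
  message: string;
  reasoningEffort?: ReasoningEffort;
}

// Builds a per-turn request. With no override the key is left off entirely,
// rather than being sent as null or "".
function buildChatRequest(
  message: string,
  override?: ReasoningEffort | null,
): ChatRequestSketch {
  const request: ChatRequestSketch = { message };
  if (override != null) {
    request.reasoningEffort = override;
  }
  return request;
}
```

Guarding untrusted input (for example a value restored from local storage) with `isReasoningEffort` before calling `buildChatRequest` should keep the client clear of the HTTP 400 path described in the handoff.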
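
For the selector itself, the handoff asks the FE to render a `null` GET result as "high (default)" and leaves the resolution chain to the server. A display-only helper along those lines might look like the sketch below. `selectorLabel`, `displayedEffort`, and `THINKING_BUDGET_TOKENS` are hypothetical names. The budget numbers are the ones the handoff lists for the Anthropic provider and are included only as possible tooltip copy. `displayedEffort` merely mirrors the documented order (override, then persisted value, then `"high"`) for UI purposes and is not meant to replace the server's resolution.

```typescript
// Ladder copied from the handoff; in the app this would come from "@dispatch/wire".
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Default named in the handoff's resolution chain.
const DEFAULT_EFFORT: ReasoningEffort = "high";

// Anthropic-provider thinking budgets as listed in the handoff (UX copy only;
// other providers may ignore the field).
const THINKING_BUDGET_TOKENS: Record<ReasoningEffort, number> = {
  low: 4096,
  medium: 10240,
  high: 16384,
  xhigh: 32768,
  max: 65536,
};

// Label for the per-conversation selector: null means "never set",
// which the handoff asks to show as "high (default)".
function selectorLabel(persisted: ReasoningEffort | null): string {
  return persisted === null ? `${DEFAULT_EFFORT} (default)` : persisted;
}

// Display-only mirror of the documented chain, e.g. for a composer badge.
// The server remains the source of truth for what a turn actually uses.
function displayedEffort(
  persisted: ReasoningEffort | null,
  override?: ReasoningEffort,
): ReasoningEffort {
  return override ?? persisted ?? DEFAULT_EFFORT;
}
```

A tooltip could then combine the two, for instance by showing `THINKING_BUDGET_TOKENS[displayedEffort(persisted)]` next to the label, assuming the product wants to expose token budgets at all.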