author: Adam Malczewski <[email protected]> 2026-06-30 01:30:06 +0900
committer: Adam Malczewski <[email protected]> 2026-06-30 01:30:06 +0900
commit: bf74aeab143a49005c380706ae9847cf064fd2f2
tree: c9e93dc0ebe818e7c0d0aafeba8387afd161da3f
parent: 6dd9ea9b935e5011c16faed6c869c976cf5ff172
chore: remove old handoff docs, plans, review reports, and task lists from root
Removed 40+ markdown files that were cluttering the repo root:
- frontend-*-handoff.md (28 files) — historical API contract handoffs, features all implemented
- backend-to-fe-handoff.md, backend-to-fe-handoff-2.md — old handoff docs
- broken-chat-repair-handoff.md — old repair handoff
- PLAN-mcp.md, PLAN-per-edit-diagnostics.md — old planning docs
- ai-review-report.md, crash-review-report.md — one-time review reports
- tasks.md, HANDOFF.md — outdated status docs (git log is the source of truth)

Kept: AGENTS.md, GLOSSARY.md, ORCHESTRATOR.md, README.md

Also: gitignored ai-review-report.md so future Gemini reviews don't commit it
-rw-r--r--  .gitignore  1
-rw-r--r--  HANDOFF.md  45
-rw-r--r--  PLAN-mcp.md  128
-rw-r--r--  PLAN-per-edit-diagnostics.md  44
-rw-r--r--  ai-review-report.md  32
-rw-r--r--  backend-to-fe-handoff-2.md  124
-rw-r--r--  backend-to-fe-handoff.md  141
-rw-r--r--  broken-chat-repair-handoff.md  180
-rw-r--r--  crash-review-report.md  86
-rw-r--r--  frontend-cache-rate-handoff.md  126
-rw-r--r--  frontend-cache-warming-handoff.md  91
-rw-r--r--  frontend-cache-warming-lifecycle-handoff.md  94
-rw-r--r--  frontend-compaction-handoff.md  167
-rw-r--r--  frontend-context-size-handoff.md  47
-rw-r--r--  frontend-conversation-lifecycle-handoff.md  102
-rw-r--r--  frontend-conversation-list-handoff.md  100
-rw-r--r--  frontend-conversation-open-handoff.md  53
-rw-r--r--  frontend-cr3-user-message-handoff.md  54
-rw-r--r--  frontend-cwd-resolution-handoff.md  95
-rw-r--r--  frontend-history-windowing-handoff.md  70
-rw-r--r--  frontend-lsp-cwd-handoff.md  133
-rw-r--r--  frontend-lsp-cwd-workspace-handoff.md  75
-rw-r--r--  frontend-mcp-status-handoff.md  117
-rw-r--r--  frontend-message-queue-handoff.md  189
-rw-r--r--  frontend-metrics-handoff.md  121
-rw-r--r--  frontend-metrics-pass2-handoff.md  67
-rw-r--r--  frontend-model-persistence-handoff.md  91
-rw-r--r--  frontend-reasoning-effort-handoff.md  81
-rw-r--r--  frontend-stop-generation-handoff.md  49
-rw-r--r--  frontend-system-prompt-handoff.md  73
-rw-r--r--  frontend-todo-handoff.md  91
-rw-r--r--  frontend-turn-continuity-handoff.md  83
-rw-r--r--  frontend-workspace-open-handoff.md  47
-rw-r--r--  frontend-workspaces-handoff.md  216
-rw-r--r--  tasks.md  1050
35 files changed, 1 insertions, 4262 deletions
diff --git a/.gitignore b/.gitignore
index c40e88f..9f55c28 100644
--- a/.gitignore
+++ b/.gitignore
@@ -20,3 +20,4 @@ reports/
# Local observability journal (runtime artifact)
.dispatch/journal/
+ai-review-report.md
diff --git a/HANDOFF.md b/HANDOFF.md
deleted file mode 100644
index 29f1689..0000000
--- a/HANDOFF.md
+++ /dev/null
@@ -1,45 +0,0 @@
-# HANDOFF — next steps for the incoming orchestrator
-
-> Read `ORCHESTRATOR.md` first (your operating manual), then `tasks.md` (live
-> status), then this file (what to do next). The project is mature; this file
-> points at the live source of truth and the current open work.
-
-## Where things stand (one paragraph)
-
-Kernel + core extensions + host-bin are built, full-fidelity (every core feature
-is a real manifest-loaded extension). The turn loop runs real tools end-to-end
-against live models. LSP integration, observability (journal/collector/trace-store),
-cache warming, turn continuity (detached turns + multi-client), skills, message
-queue + steering, metrics (live + persisted), per-conversation model/cwd/reasoning
-persistence, and broken-chat self-repair are all DONE and live-verified.
-**`tsc -b` EXIT 0 · biome clean · 1468 vitest green.** The web frontend is a
-separate repo (`../frontend`); contract changes are couriered via the user.
-
-## How to boot & smoke-test
-```bash
-cd /home/tradam/projects/dispat../backend
-# .env auto-loads DISPATCH_API_KEY + BACKEND_PORT (24203).
-# Dev stack (live-reload): bin/up (ports 24203/24205/24204)
-# Stable second stack: ../bin/up2 (ports 25203/25205/25204, isolated data)
-bun packages/host-bin/src/main.ts # boots app + collector
-curl -s -X POST localhost:24203/chat -H 'content-type: application/json' \
- -d '{"conversationId":"c1","message":"Say hello in 3 words."}'
-```
-Process cleanup uses the `[x]` bracket trick (ORCHESTRATOR §8) — leaked
-server/collector procs poison the next run's counts.
-
-## What's open now
-
-See `tasks.md` for the live checklist. As of this writing:
-- **Per-edit LSP diagnostics** (commit `8f6114b`) — committed + green, NOT yet
- live-verified against a real running server.
-- **MCP (Model Context Protocol) integration** — the next major feature. Research
- + plan in progress; see `notes/mcp-design.md` (when written) + `PLAN-mcp.md`.
-- `notes/pending-issues.md` item 1 (workspace tab) — awaiting a user handoff.
-
-## Standing reminders (from ORCHESTRATOR.md — don't relearn the hard way)
-- Summon via `opencode run` (ORCHESTRATOR §2). Parallel wave = multiple concurrent
- summons on disjoint file sets only.
-- Verify independently (typecheck/test/check) + confirm single-lane edits.
-- Keep `tasks.md` current; write decisions down before pivoting.
-- Be careful with destructive git; back up `notes/` before any reset/clean.
diff --git a/PLAN-mcp.md b/PLAN-mcp.md
deleted file mode 100644
index 560da8e..0000000
--- a/PLAN-mcp.md
+++ /dev/null
@@ -1,128 +0,0 @@
-# Plan — MCP (Model Context Protocol) Integration
-
-> **Status:** PROPOSED — awaiting user approval of design decisions (§7 of
-> `notes/mcp-design.md`).
-> Design: `notes/mcp-design.md`.
-
-## Decisions (to confirm with user)
-
-1. **One `mcp` extension** managing multiple servers (like `lsp`).
-2. **Tool name format:** `<serverId>__<toolName>` (double-underscore separator).
-3. **Phase 1: stdio transport only** (covers freecad-mcp + chrome-devtools-mcp).
-4. **Phase 1: Tools only** (no Resources/Prompts).
-5. **Phase 1: no enable/disable surface** (per-cwd config is sufficient).
-6. **Hand-rolled JSON-RPC** (adapt LSP's rpc.ts + framing.ts; no MCP SDK dep).
-
-## Implementation waves
-
-### Wave 0: Orchestrator (contracts + wiring)
-
-| What | File | Change |
-|---|---|---|
-| No kernel contract change needed | — | The existing `ToolContract` + `host.defineTool()` + `host.getTools()` + `toolsFilter` + `ToolAssembly` are sufficient. MCP tools are just `ToolContract`s registered at runtime. |
-| Glossary | `GLOSSARY.md` | Add `MCP`, `MCP server`, `MCP host` (see design §6). |
-| Root tsconfig | `tsconfig.json` | Add `@dispatch/mcp` project reference (after Wave 1). |
-| host-bin registration | `packages/host-bin/src/main.ts` | Register `mcpExt` in `CORE_EXTENSIONS` (same pattern as `lspExt`). |
-| `bun install` | `bun.lock` | Link the new workspace package. |
-
-> **No `@dispatch/transport-contract` or `@dispatch/wire` version bump** in Phase 1.
-> MCP tools are transparent to the wire (they're just tools the model calls).
-> A future surface (enable/disable, status endpoint) would bump versions.
-
-### Wave 1: `packages/mcp/` (single unit — the extension)
-
-This is the main implementation. One owner-agent builds the entire `packages/mcp/`
-directory. It depends only on `@dispatch/kernel` (contracts) and
-`@dispatch/session-orchestrator` (for the `toolsFilter` handle).
-
-| File | Responsibility |
-|---|---|
-| `src/framing.ts` | `Content-Length` framing for stdio (adapt from LSP's framing.ts — encode/decode). PURE. |
-| `src/framing.test.ts` | Unit tests for encode/decode. |
-| `src/rpc.ts` | JSON-RPC 2.0 client: `request(method, params) → result`, `notify(method, params)`, `onNotification(method, handler)`. Adapts LSP's rpc.ts. PURE (injected `writeFn`). |
-| `src/rpc.test.ts` | Unit tests for request/response/notification handling. |
-| `src/transport.ts` | Transport abstraction: `StdioTransport` (spawn child, pipe stdin/stdout through framing + rpc) + the interface for a future `HttpTransport`. Injected `spawn` (like LSP). |
-| `src/transport.test.ts` | Tests against an in-memory pipe pair (no real spawn). |
-| `src/client.ts` | MCP client: `initialize()` (send proto version + caps, receive server caps), `listTools()` → `tools/list`, `callTool(name, args, signal)` → `tools/call`, listen for `notifications/tools/list_changed`. Tracks connection state. |
-| `src/client.test.ts` | Tests with a mock JSON-RPC connection (injected transport). |
-| `src/config.ts` | PURE config resolution: `.dispatch/mcp.json` → `opencode.json` `mcp` key. Returns `ResolvedMcpServer[]` + `shadowed` flag. Mirrors LSP config.ts. |
-| `src/config.test.ts` | Config resolution tests (precedence, shadow, empty). |
-| `src/registry.ts` | Tool name namespacing (`<serverId>__<toolName>`) + `adaptTool(serverId, mcpTool, client)` → `ToolContract`. The `execute()` proxies to `client.callTool()` and flattens MCP content to a string. PURE (injected client). |
-| `src/registry.test.ts` | Tests for namespacing, content flattening, error handling. |
-| `src/manager.ts` | `McpManager`: one client per server config; lazy-spawn on first access; `status(cwd)`; `getClient(serverId)`; `shutdownAll()`. Mirrors LSP manager.ts. Injected spawn + logger. |
-| `src/manager.test.ts` | Manager lifecycle tests (lazy spawn, shutdown, broken server). |
-| `src/types.ts` | `McpServerConfig`, `McpServerStatus`, `McpService`, `McpToolInfo`, `McpContentItem`. |
-| `src/extension.ts` | manifest + `activate(host)`: real spawn adapter, config resolution per-cwd, manager, register tools via `host.defineTool` (on connect + on `list_changed`), register `toolsFilter` (drop tools from disconnected servers), `mcpServiceHandle`, `deactivate()`. |
-| `src/index.ts` | Public surface exports. |
-
-**Scoping rules for the summon:**
-- `.dispatch/package-agent.md` + `.dispatch/extension-agent.md`
-- `.dispatch/rules/`: `one-owner.md`, `isolation-over-dry.md`, `biome-clean.md`,
- `pure-core.md`, `no-internal-mocks.md`, `typed-handles.md`,
- `extension-logging.md`.
-
-**Key guidance for the agent:**
-- Read `packages/lsp/src/` (framing.ts, rpc.ts, config.ts, manager.ts,
- extension.ts) as the architectural precedent — same pattern, simpler protocol.
-- Read `packages/kernel/src/contracts/tool.ts` for `ToolContract`.
-- Read `packages/kernel/src/contracts/extension.ts` for `HostAPI`,
- `defineTool`, `addFilter`, `provideService`, `defineService`.
-- Read `packages/session-orchestrator/src/tools-filter.ts` for `ToolAssembly`
- + `toolsFilter`.
-- The MCP `initialize` flow: send `{ method: "initialize", params: {
- protocolVersion: "2025-11-25", capabilities: {}, clientInfo: { name:
- "dispatch", version: "0.0.0" } } }`, receive server capabilities, then send
- `notifications/initialized`.
-- `tools/list` returns `{ tools: [{ name, description, inputSchema }] }`.
-- `tools/call` takes `{ name, arguments }` and returns `{ content: [...],
- isError?: boolean }`.
-- Tool names must be namespaced `<serverId>__<toolName>`.
-- `concurrencySafe: false` on all MCP-adapted tools (conservative — MCP servers
- are generally stateful single-client processes).
-- `Content-Length` framing for stdio (same as LSP — the MCP spec inherited
- this from LSP).
-- No external dependencies — hand-roll the JSON-RPC + framing (adapt LSP's).
-
-### Wave 2: host-bin registration (orchestrator)
-
-After Wave 1 is verified in isolation:
-- Add `@dispatch/mcp` to root `tsconfig.json` project references.
-- `bun install` to link the workspace package.
-- Register `mcpExt` in `CORE_EXTENSIONS` in `packages/host-bin/src/main.ts`.
-- Verify: `tsc -b` EXIT 0, biome clean, full vitest pass.
-
-### Wave 3: Live verification (orchestrator)
-
-- Boot the dev stack (`bin/up`).
-- Create a `.dispatch/mcp.json` in a test cwd with a simple MCP server
- (e.g. a trivial stdio server that exposes one tool).
-- Verify: `GET /conversations/:id/lsp`-equivalent — actually, verify by
- sending a chat that triggers the model to call the MCP tool.
-- Or: test with chrome-devtools-mcp (`npx chrome-devtools-mcp`) if available.
-- Confirm: the model sees the MCP tool, calls it, gets a result.
-- Clean up test config.
-
-## Test strategy (per the asymmetric testing rule)
-
-- **Pure core** (framing, rpc, config, registry, types): zero internal mocks,
- high coverage. The RPC + framing tests use in-memory pipe pairs (injected
- transport, not mocked `@dispatch/*`). Config tests use string fixtures.
-- **Shell** (transport, manager, extension): integration tests against
- in-memory/real child processes. A few tests, not exhaustive unit coverage.
- Do NOT mock sibling extensions.
-
-## Estimated size
-
-- ~12 source files + ~11 test files.
-- Closest precedent: `packages/lsp/` (~20 files). MCP is simpler (no
- diagnostics, no incremental sync, no file watching, no sidecars).
-- Expected test count: ~60-80 new tests.
-
-## What is explicitly OUT of scope for Phase 1
-
-- Streamable HTTP transport (Phase 2).
-- MCP Resources and Prompts primitives (Phase 2).
-- Client → Server capabilities (sampling, roots, elicitation) (Phase 2+).
-- Per-conversation enable/disable surface + transport endpoints (Phase 2).
-- Tool poisoning / rug-pull hash validation (security hardening, Phase 2).
-- `mcp-scan`-style static analysis (Phase 2+).
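Reviewer note on the removed `PLAN-mcp.md`: the `<serverId>__<toolName>` namespacing and the "flatten MCP content to a string" step it assigns to `src/registry.ts` could look roughly like the sketch below. This is not code from the repo. The names `namespaceToolName`, `parseToolName`, and `flattenContent`, the shape of `McpContentItem`, and the placeholder text for non-text items are all assumptions made for illustration.

```ts
// Hypothetical content item shape; the plan only names the type `McpContentItem`.
interface McpContentItem {
  readonly type: string;
  readonly text?: string;
}

const SEPARATOR = "__";

// Build the namespaced name "<serverId>__<toolName>".
// Assumption: server ids may not contain the separator or end with "_",
// otherwise the split below would be ambiguous.
function namespaceToolName(serverId: string, toolName: string): string {
  if (serverId.length === 0 || serverId.includes(SEPARATOR) || serverId.endsWith("_")) {
    throw new Error(`invalid serverId for namespacing: "${serverId}"`);
  }
  return `${serverId}${SEPARATOR}${toolName}`;
}

// Split on the FIRST separator, so a tool name may itself contain "__".
function parseToolName(
  namespaced: string,
): { serverId: string; toolName: string } | undefined {
  const idx = namespaced.indexOf(SEPARATOR);
  if (idx <= 0) return undefined; // no separator, or empty server id
  return {
    serverId: namespaced.slice(0, idx),
    toolName: namespaced.slice(idx + SEPARATOR.length),
  };
}

// Flatten a tools/call `content` array to one string: text items verbatim,
// anything else as a short placeholder (an assumed policy, not the plan's).
function flattenContent(items: readonly McpContentItem[]): string {
  return items
    .map((item) =>
      item.type === "text" && item.text !== undefined
        ? item.text
        : `[${item.type} content omitted]`,
    )
    .join("\n");
}
```

Splitting on the first separator keeps the round trip unambiguous as long as the server id itself is constrained, which is why the sketch validates `serverId` rather than `toolName`.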
diff --git a/PLAN-per-edit-diagnostics.md b/PLAN-per-edit-diagnostics.md
deleted file mode 100644
index 20671c2..0000000
--- a/PLAN-per-edit-diagnostics.md
+++ /dev/null
@@ -1,44 +0,0 @@
-# Plan — Live Per-Edit Diagnostics (General LSP)
-
-> **Status:** APPROVED — implementing.
-
-## Decisions (confirmed with user)
-
-1. **Multi-server aggregation** — query ALL connected servers matching the file's extension, merge diagnostics tagged by source.
-2. **Incremental sync** — capture each server's `textDocumentSync.change` during `initialize`; compute prefix/suffix diff ranges for `change: 2`; full content for `change: 1`. Generic, works for ALL LSPs.
-3. **`languageId` mapping** — extend the existing `language.ts` with `.rb`/`.rbs`, `.c`/`.cpp`, etc.
-4. **Auto-append to `edit_file`** — after a successful edit, run diagnostics on the post-edit buffer. Only append diagnostics if there are errors/warnings (severity ≤ 2). Don't append on clean edits (no noise).
-5. **60s timeout** — if diagnostics take >10s, prepend a warning: "LSP is taking unusually long. If this happens more than once, raise it to the user." Always append this if slow, regardless of whether there are errors.
-6. **General** — not Steep-specific. Works for any LSP server.
-
-## Implementation waves
-
-### Wave 1: `packages/lsp/` (single unit)
-
-| File | Change |
-|---|---|
-| `src/diff.ts` (NEW) | Pure diff: `computeChangeRange(oldText, newText)` + `offsetToPosition(text, offset)` |
-| `src/language.ts` | Add `.rb`/`.rbs` → `"ruby"`, `.c`/`.h` → `"c"`, `.cpp`/`.cc`/`.hpp` → `"cpp"` |
-| `src/diagnostics.ts` | Add `hasReceivedPush(uri)` tracking, `clearReceived(uri)`, `formatFiltered(uri, minSeverity?)` |
-| `src/client.ts` | Capture `textDocumentSync.change` from init; track open doc text; add `change(filePath, newText)` with incremental/full sync; fix `languageId` in `open()`; extend `waitForDiagnostics(filePath, opts?)` to accept `text` + `timeoutMs` + return `{ formatted, slow, timedOut }` |
-| `src/tool.ts` | `diagnostics` op: query ALL matching connected servers (not just first); merge tagged by source |
-| `src/types.ts` | Add `getDiagnostics(opts)` to `LspService` + `DiagnosticsResult` type |
-| `src/extension.ts` | Implement `getDiagnostics` (calls manager → all matching clients → merge) |
-| `src/diff.test.ts` (NEW) | Unit tests for diff functions |
-| `src/tool.test.ts` | Multi-server aggregation test |
-| `src/client.test.ts` | `change()`, `languageId`, `waitForDiagnostics` with text tests |
-
-### Wave 2: `packages/tool-edit-file/` (cross-extension)
-
-| File | Change |
-|---|---|
-| `src/extension.ts` | Import `lspServiceHandle` from `@dispatch/lsp`; `host.getService()` in activate; pass to tool |
-| `src/edit-file.ts` | After successful edit: call `getDiagnostics({ filePath, text: newContent, cwd, minSeverity: 2, timeoutMs: 60_000 })`; append if errors; append slow warning if >10s |
-| `package.json` | Add `@dispatch/lsp` dep |
-| `tsconfig.json` | Add `@dispatch/lsp` reference |
-
-### Wave 3: Build wiring (orchestrator)
-
-- Root `tsconfig.json`: add `@dispatch/tool-edit-file` → `@dispatch/lsp` ref if needed
-- `bun install` to link
-- Verify: typecheck + test + biome
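Reviewer note on the removed `PLAN-per-edit-diagnostics.md`: the prefix/suffix diff planned for `src/diff.ts` can be sketched as follows. The function names `computeChangeRange` and `offsetToPosition` come from the plan, but the return shapes (`ChangeRange`, `Position`) and every implementation detail are assumptions, not the repo's code. Offsets are JavaScript string indices, that is, UTF-16 code units.

```ts
// Assumed result shapes; the plan does not spell them out.
interface Position {
  line: number;
  character: number;
}
interface ChangeRange {
  start: number; // offset where old and new text first differ
  oldEnd: number; // exclusive end of the replaced span in the OLD text
  newText: string; // replacement for oldText[start, oldEnd)
}

// Find the smallest single replaced span by trimming the common prefix
// and then the common suffix (without letting the two overlap).
function computeChangeRange(oldText: string, newText: string): ChangeRange {
  const shorter = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < shorter && oldText.charCodeAt(prefix) === newText.charCodeAt(prefix)) {
    prefix++;
  }
  let suffix = 0;
  const maxSuffix = shorter - prefix;
  while (
    suffix < maxSuffix &&
    oldText.charCodeAt(oldText.length - 1 - suffix) ===
      newText.charCodeAt(newText.length - 1 - suffix)
  ) {
    suffix++;
  }
  return {
    start: prefix,
    oldEnd: oldText.length - suffix,
    newText: newText.slice(prefix, newText.length - suffix),
  };
}

// Convert a string offset into a zero-based line/character position.
function offsetToPosition(text: string, offset: number): Position {
  const end = Math.min(Math.max(offset, 0), text.length);
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    if (text.charCodeAt(i) === 10 /* "\n" */) {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: end - lineStart };
}
```

For a server that advertised `change: 2`, the range would be `offsetToPosition(oldText, start)` to `offsetToPosition(oldText, oldEnd)`, both computed against the old text. A production version would probably also avoid splitting a surrogate pair at the prefix or suffix boundary, which this sketch does not handle.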
diff --git a/ai-review-report.md b/ai-review-report.md
deleted file mode 100644
index 570ba05..0000000
--- a/ai-review-report.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# LSP Fixes Verification Report
-
-## Executive Summary
-The fixes implemented in `feature/lsp-fixes` successfully address the immediate crashes and the primary source of the 9.5 GB memory leak. The optional chaining fix, `fs.watch` error listener, bounded LRU document cache, and `initialize` timeout propagation are all correctly designed and do not introduce regressions. However, the memory leak fix for the diagnostics cache is **incomplete**: while actively opened documents are properly managed, diagnostics pushed for *unopened* background files will still accumulate and leak memory over time.
-
-## Per-Fix Verification
-
-### 1. Crash — TypeError from broken optional chaining (`client.ts`)
-- **Correctness:** **Correct.** The addition of the second optional chaining operator (`void this.rpc?.handleMessage(msg)?.catch(...)`) correctly evaluates to `undefined` when `this.rpc` is null, preventing the synchronous `TypeError` that previously crashed the server.
-- **Completeness:** **Complete.** Because `handleMessage` is an `async` function, it is guaranteed to return a Promise, meaning any internal errors are returned as rejections rather than synchronous throws.
-- **New Issues/Regressions:** None.
-
-### 2. Crash — Unhandled 'error' event on `fs.watch` (`extension.ts`)
-- **Correctness:** **Correct.** Attaching a no-op `'error'` listener to the `fs.watch` instance (`watcher.on("error", () => {})`) prevents Node/Bun from escalating transient filesystem errors (such as a directory vanishing during `bun install`) into unhandled exceptions that crash the main process.
-- **Completeness:** **Complete.** The watcher is properly treated as best-effort.
-- **New Issues/Regressions:** None.
-
-### 3. Memory leak — Unbounded document/diagnostic caches (`client.ts` + `diagnostics.ts`)
-- **Correctness:** **Correct (for opened files).** The `evictIfOverCap` method properly acts as an LRU cache. It uses the JavaScript `Map`'s insertion-order property by grabbing the oldest key via `this.openDocuments.keys().next().value`. The `change()` method successfully maintains LRU order by deleting and re-inserting documents so they move to the tail. Evicted files correctly send `textDocument/didClose` and clear their state via `this.diagnostics.purge()`.
-- **Completeness:** **Incomplete.** The fix bounds the memory for files the agent explicitly opens, but misses files analyzed passively. Language servers (like `pyright`, `rust-analyzer`, or `tsserver`) often scan the workspace in the background and emit `textDocument/publishDiagnostics` for files the client never touched. These trigger `DiagnosticsStore.setPushDiagnostics()`, which caches them unconditionally in the `pushDiagnostics` map. Because these background files are never placed in `openDocuments`, they are never reached by the `evictIfOverCap` loop. As a result, the `DiagnosticsStore` will still slowly leak memory over time as background files accumulate.
-- **New Issues/Regressions:** None introduced. The `closeDocument` method is carefully guarded with `wasOpen` so it doesn't send invalid `didClose` notifications for unopened files. There are no race conditions in the polling logic (`waitForDiagnostics`).
-
-### 4. Minor — Leaked promises on initialize timeout (`rpc.ts`)
-- **Correctness:** **Correct.** Passing the `timeoutMs` parameter directly into `rpc.sendRequest` (rather than wrapping it in an external `Promise.race`) successfully utilizes the connection's internal timeout handler. When the timeout triggers, `rpc.ts` explicitly deletes the request ID from the `this.pending` Map, freeing the memory.
-- **Completeness:** **Complete.** The leaked closure and promise are both cleared safely.
-- **New Issues/Regressions:** None.
-
-## Remaining Concerns
-1. **Unbounded `DiagnosticsStore` Map:** As detailed in Bug 3, `this.pushDiagnostics` and `this.pushReceived` inside `DiagnosticsStore` have no size bounds. A server pushing diagnostics for thousands of untouched files across a large monorepo will keep those strings and objects in memory forever until the client is destroyed.
-
-## Recommendations
-- **Bound passive diagnostics:** Modify `DiagnosticsStore` to implement its own LRU cache for `pushDiagnostics`, or have it coordinate with `client.ts` to ensure that even unopened files with diagnostics are eventually aged out and purged when a maximum threshold is reached.
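Reviewer note on the removed `ai-review-report.md`: its recommendation to bound `pushDiagnostics` could be met with the same `Map` insertion-order trick the report describes for `openDocuments`. The class below is a minimal sketch under that assumption. `LruMap` and its methods are hypothetical names, and nothing here is taken from `DiagnosticsStore`.

```ts
// A small LRU built on Map's insertion order: the first key is the oldest.
class LruMap<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be >= 1");
    this.capacity = capacity;
  }

  get size(): number {
    return this.entries.size;
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  // Reading an entry moves it to the tail (most recently used).
  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  // Writing moves the entry to the tail, then evicts from the head
  // until the map is back under capacity. Returns the evicted keys so
  // the caller can release any state tied to them.
  set(key: K, value: V): K[] {
    this.entries.delete(key);
    this.entries.set(key, value);
    const evicted: K[] = [];
    while (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next();
      if (oldest.done) break;
      this.entries.delete(oldest.value);
      evicted.push(oldest.value);
    }
    return evicted;
  }
}
```

Returning the evicted keys from `set` lets a caller clear companion state for the same URIs, for example a received-push flag, in one place.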
diff --git a/backend-to-fe-handoff-2.md b/backend-to-fe-handoff-2.md
deleted file mode 100644
index 51a3e34..0000000
--- a/backend-to-fe-handoff-2.md
+++ /dev/null
@@ -1,124 +0,0 @@
-# Backend → FE handoff — context window + percentage-based compact
-
-> Courier to `../frontend`. Response to the context-window ask in
-> `backend-handoff.md` §3 + compacting rework.
-
-## What shipped
-
-1. **`GET /models` now includes `contextWindow` per model** — the FE can replace
- the hardcoded `MAX_CONTEXT = 1,000,000` with the real value.
-2. **Auto-compact is now percentage-based** (default 85% of `contextWindow`)
- instead of a flat token count (was 350k).
-
-## Bump pinned deps
-- `@dispatch/wire` → `0.11.0` (unchanged)
-- `@dispatch/transport-contract` → `0.15.0` (unchanged — the new types are
- additive to the existing version)
-
-## `GET /models` — now includes `modelInfo`
-
-The response now includes an optional `modelInfo` map alongside the existing
-`models` array. The `models` array is unchanged (backward compatible).
-
-```ts
-interface ModelsResponse {
- readonly models: readonly string[];
- readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
-}
-
-interface ModelMetadata {
- readonly contextWindow?: number; // max tokens (e.g. 200000)
-}
-```
-
-**Example response:**
-```json
-{
- "models": ["opencode/deepseek-v4-flash", "umans/umans-glm-5.2"],
- "modelInfo": {
- "opencode/deepseek-v4-flash": { "contextWindow": 128000 },
- "umans/umans-glm-5.2": { "contextWindow": 200000 }
- }
-}
-```
-
-`modelInfo` is absent when no provider reports `contextWindow`. Each key is the
-same `<credentialName>/<model>` string from the `models` array.
-
-**What the FE should do:**
-- When rendering the composer status bar, use `modelInfo[selectedModel].contextWindow`
- as the denominator for `contextSize / contextWindow · pct%`.
-- Fall back to the current hardcoded `1,000,000` when `modelInfo` is absent or
- the selected model has no `contextWindow`.
-
-## Auto-compact: now percentage-based
-
-**Old:** flat token threshold (default 350000). `contextSize >= threshold`.
-**New:** percentage of `contextWindow` (default 85%). `contextSize >= contextWindow × (percent / 100)`.
-
-Also fixed: the check now uses `contextSize` (true context occupancy = last
-step's `inputTokens + outputTokens`) instead of the overcounted aggregate
-`usage.inputTokens` (which summed every step's re-prefilled prompt).
-
-### `GET /conversations/:id/compact-percent` — read percent
-
-200: `CompactPercentResponse { conversationId, percent }`
-- `percent: 0` — auto-compact explicitly disabled (manual only).
-- `percent: null` (not stored) — **default: 85** (85% of the model's context window).
-- Any positive number (1-100) — auto-compact triggers when `contextSize`
- exceeds `percent`% of the model's `contextWindow`.
-
-### `PUT /conversations/:id/compact-percent` — set percent
-
-Body: `SetCompactPercentRequest { percent: number }`
-- `0` explicitly disables auto-compact.
-- Any positive number (1-100) sets the trigger percentage.
-- Default (when not stored) is 85.
-
-200: `CompactPercentResponse`
-
-**Renamed from `compact-threshold`** — the old endpoint paths, request types,
-and response types are gone. Update any FE code that referenced
-`compact-threshold`.
-
-## New types
-
-```ts
-// @dispatch/transport-contract
-export interface ModelMetadata {
- readonly contextWindow?: number;
-}
-
-// ModelsResponse now has modelInfo (additive — models array unchanged)
-export interface ModelsResponse {
- readonly models: readonly string[];
- readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
-}
-
-// Renamed from CompactThresholdResponse
-export interface CompactPercentResponse {
- readonly conversationId: string;
- readonly percent: number; // 0 = manual; null = default 85
-}
-
-// Renamed from SetCompactThresholdRequest
-export interface SetCompactPercentRequest {
- readonly percent: number;
-}
-```
-
-## What the FE needs to do
-
-1. **Use real `contextWindow`** from `GET /models` → `modelInfo[model].contextWindow`
- instead of hardcoded `MAX_CONTEXT = 1,000,000`.
-
-2. **Rename compact-threshold → compact-percent** in any FE code:
- - `GET /conversations/:id/compact-percent` (was `compact-threshold`)
- - `PUT /conversations/:id/compact-percent` (was `compact-threshold`)
- - `percent` field (was `threshold`)
-
-3. **Settings UI**: change the number input from "token count" to "percent
- (0-100)". Default 85. 0 = manual only.
-
-4. **No other changes** — the compact endpoint, WS message, and chain
- architecture are unchanged.
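Reviewer note on the removed `backend-to-fe-handoff-2.md`: the context-window fallback and the percentage trigger it describes can be sketched as below. The interfaces are copied from the handoff. The helper names `resolveContextWindow` and `shouldAutoCompact`, and the choice to model "not stored" as `null`, are assumptions for illustration only.

```ts
interface ModelMetadata {
  readonly contextWindow?: number;
}
interface ModelsResponse {
  readonly models: readonly string[];
  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
}

const FALLBACK_CONTEXT_WINDOW = 1_000_000; // the FE's previous hardcoded value
const DEFAULT_COMPACT_PERCENT = 85;

// Use the reported context window, falling back when modelInfo is absent
// or the selected model has no contextWindow.
function resolveContextWindow(res: ModelsResponse, model: string): number {
  return res.modelInfo?.[model]?.contextWindow ?? FALLBACK_CONTEXT_WINDOW;
}

// contextSize >= contextWindow * (percent / 100), written with integer
// arithmetic to avoid floating-point surprises at the exact boundary.
// percent 0 disables auto-compact; null (not stored) means the default.
function shouldAutoCompact(
  contextSize: number,
  contextWindow: number,
  percent: number | null,
): boolean {
  const effective = percent ?? DEFAULT_COMPACT_PERCENT;
  if (effective === 0) return false;
  return contextSize * 100 >= contextWindow * effective;
}
```

With the handoff's example response, `resolveContextWindow(res, "umans/umans-glm-5.2")` yields 200000, so the default trigger point is 170000 tokens.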
diff --git a/backend-to-fe-handoff.md b/backend-to-fe-handoff.md
deleted file mode 100644
index a6e726b..0000000
--- a/backend-to-fe-handoff.md
+++ /dev/null
@@ -1,141 +0,0 @@
-# Backend → FE handoff — CR-6 resolved + full endpoint list
-
-> Response to `backend-handoff.md` §2 CR-6. Courier back to `../frontend`.
-
-## CR-6: Assign seq during generation — RESOLVED
-
-**What changed:** The backend now persists chunks **incrementally at step
-boundaries** during generation, not only at turn-seal. The user message is
-persisted at turn start (before the first step), and each step's messages
-(assistant + tool-results) are persisted as soon as that step completes.
-
-**How it works:**
-1. Turn starts → user message is `append`ed immediately (gets seq numbers).
-2. Each step completes → step's messages are `append`ed immediately (get seq numbers).
-3. Turn seals → `turn-sealed` emitted (no batch `append` needed — already persisted).
-
-**What this means for the FE:**
-- `GET /conversations/:id?sinceSeq=N` returns committed, seq'd chunks **during
- generation**. The FE's existing `syncTail` already polls this — it will now
- find new chunks as each step completes.
-- The FE can adopt option (c) from the CR: fold events for the **current
- in-progress step** only (streaming text, thinking dots), and `syncTail` for
- **sealed steps**. The provisional state shrinks to just one step's worth of
- chunks — never a trim concern.
-- `turn-sealed` becomes a "refresh" signal — all chunks are already committed.
- The `done` event still carries final usage + contextSize (unchanged).
-
-**No wire/transport-contract change needed.** `StoredChunk` already has `seq`.
-`AgentEvent` types are unchanged. The FE just needs `syncTail` to find seq'd
-chunks during generation (which it already does).
-
-**Implementation detail:** The kernel calls a new `onStepComplete` callback
-(`RunTurnInput.onStepComplete`) after each step's messages are finalized.
-The orchestrator persists them via `conversationStore.append`. If the callback
-isn't called (e.g., test fakes), the orchestrator falls back to batch persist
-after `runTurn` returns — backward compatible.
-
----
-
-## Full endpoint list (current as of [email protected] / [email protected])
-
-### HTTP (port 24203)
-
-| Method | Path | Purpose |
-|---|---|---|
-| `POST` | `/chat` | Stream a turn (NDJSON response, `X-Conversation-Id` header) |
-| `POST` | `/chat/warm` | Cache-warm probe |
-| `GET` | `/models` | Model catalog (now includes `modelInfo` with `contextWindow` per model) |
-| `GET` | `/conversations` | List conversations (`?q=` prefix filter, `?status=active,idle` status filter) |
-| `GET` | `/conversations/:id` | Conversation history (`?sinceSeq=`, `?beforeSeq=`, `?limit=` windowing) |
-| `GET` | `/conversations/:id/metrics` | Per-turn metrics (tokens, timing) |
-| `GET` | `/conversations/:id/last` | Blocking last assistant message |
-| `GET` | `/conversations/:id/cwd` | Per-conversation working directory |
-| `PUT` | `/conversations/:id/cwd` | Set working directory |
-| `GET` | `/conversations/:id/reasoning-effort` | Per-conversation reasoning effort |
-| `PUT` | `/conversations/:id/reasoning-effort` | Set reasoning effort |
-| `GET` | `/conversations/:id/lsp` | LSP server status |
-| `GET` | `/conversations/:id/compact-percent` | Auto-compact percent (0=manual, null=default 85%) |
-| `PUT` | `/conversations/:id/compact-percent` | Set auto-compact percent |
-| `GET` | `/conversations/:id/title` | Read conversation title |
-| `PUT` | `/conversations/:id/title` | Set conversation title |
-| `POST` | `/conversations/:id/close` | Close tab (abort turn + mark `closed`) |
-| `POST` | `/conversations/:id/stop` | **NEW** — Stop generation (abort turn, keep conversation `idle`) |
-| `POST` | `/conversations/:id/compact` | **NEW** — Manual compaction (fork history + replace with summary) |
-| `POST` | `/conversations/:id/open` | **NEW** — Signal FE to open/focus tab (broadcasts `conversation.open`) |
-| `POST` | `/conversations/:id/queue` | Enqueue steering message |
-| `GET` | `/health` | Health check |
-| `GET` | `/metrics/throughput` | Per-model throughput samples |
-| `GET` | `/*` | Static frontend serving (SPA fallback, when `DISPATCH_WEB_DIR` is set) |
-
-### WebSocket (port 24205)
-
-**Client → Server:**
-| Type | Purpose |
-|---|---|
-| `chat.send` | Start a turn (stream events back via `chat.delta`) |
-| `chat.subscribe` | Watch a conversation's turns without sending |
-| `chat.unsubscribe` | Stop watching |
-| `chat.queue` | Enqueue steering (fire-and-forget) |
-| Surface ops | `surface.subscribe`, `surface.invoke`, etc. |
-
-**Server → Client (broadcasts):**
-| Type | Purpose |
-|---|---|
-| `chat.delta` | Per-conversation event (turn-start, text-delta, tool-call, usage, done, etc.) |
-| `chat.error` | Turn error |
-| `conversation.open` | **NEW** — CLI `--open` flag → open/focus a tab |
-| `conversation.statusChanged` | **NEW** — Lifecycle status change (`active`/`idle`/`closed`) |
-| `conversation.compacted` | **NEW** — History compacted (includes `newConversationId` = archive ID) |
-| Surface ops | Catalog, surface data, etc. |
-
-### New types the FE should consume
-
-```ts
-// ConversationMeta ([email protected]) — now has status + compactedFrom
-interface ConversationMeta {
- id: string;
- createdAt: number;
- lastActivityAt: number;
- title: string;
- status: "active" | "idle" | "closed";
- compactedFrom?: string; // archive ID (pre-compaction history)
-}
-
-// WS messages ([email protected])
-interface ConversationCompactedMessage {
- type: "conversation.compacted";
- conversationId: string;
- newConversationId: string; // archive ID
- messagesSummarized: number;
- messagesKept: number;
-}
-
-// HTTP response types
-interface CompactResponse {
- conversationId: string;
- newConversationId: string; // archive ID
- messagesSummarized: number;
- messagesKept: number;
-}
-
-interface CompactPercentResponse {
- conversationId: string;
- percent: number; // 0 = manual; null = default 85
-}
-
-interface SetCompactPercentRequest {
- percent: number;
-}
-```
-
-### FE handoff docs (in the backend repo)
-
-| File | Feature |
-|---|---|
-| `frontend-conversation-lifecycle-handoff.md` | Tab persistence (active/idle/closed) |
-| `frontend-compaction-handoff.md` | Compacting (non-destructive, chained archives) |
-| `frontend-stop-generation-handoff.md` | Stop generation mid-turn |
-| `frontend-conversation-list-handoff.md` | Conversation list + title + open tab |
-| `frontend-conversation-open-handoff.md` | CLI `--open` → `conversation.open` WS message |
-| `frontend-cache-rate-handoff.md` | Cache hit/miss calculation (updated for providers that don't report cache) |
diff --git a/broken-chat-repair-handoff.md b/broken-chat-repair-handoff.md
deleted file mode 100644
index 12deec0..0000000
--- a/broken-chat-repair-handoff.md
+++ /dev/null
@@ -1,180 +0,0 @@
-# Handoff → orchestrator (bcb5): broken-chat self-repair
-
-> From: diagnostic session. Agent/conversation `77574596`
-> (`77574596-3e7b-46f8-8d67-c9e17a529dee`) "broke unrecoverably." User goal:
-> **chats must self-heal when broken so they can continue.** Implement the fixes
-> below. Full diagnosis + plan also in `reports/broken-chat-repair-diagnosis.md`.
-
-## 0. Your job (TL;DR)
-
-`reconcile()` only repairs orphaned tool-calls. The production DB has **two other
-broken states** it doesn't handle, and they make a chat uncontinuable. Implement a
-read-time repair so broken chats auto-heal on next open — **no DB surgery**
-(append-only durability preserved; repair is a turn-path transform that runs on
-every `load()`). Three units, two repos:
-
-- **Wave 1 (arch-rewrite, PARALLEL — disjoint packages):**
- - `conversation-store` — extend `reconcile` (Layer 1) + harden `load()`.
- - `openai-stream` — harden `convertMessages` args (Layer 2).
-- **Wave 2 (separate repo `../claude`, SEPARATE agent):**
- - `provider-anthropic` — harden its `safeJson` (Layer 2 equivalent).
-
-**Key architectural insight that shapes the waves:** Layer 1 lives in
-`conversation-store.reconcile`, which runs in `load()` BEFORE any provider sees
-the messages. So the Layer 1 fix protects **every** provider (openai-compat AND
-anthropic) — the Claude plugin needs **no** Layer 1 change. Layer 2 (malformed
-tool-call args) is **per-provider** serialization safety, so it must be applied
-in each provider's converter (openai-stream + provider-anthropic).
-
-## 1. The break (what actually happened in `77574596`)
-
-Production DB: `/var/lib/dispatch/dispatch.db` (systemd `dispatch.service`).
-136 chunks; seq counter = 136; **all JSON valid; no orphaned tool-calls** — so
-`reconcile()` finds nothing wrong, yet the chat is uncontinuable. The tail:
-
-| seq | role | type | note |
-|---|---|---|---|
-| 133 | assistant | text | "Wave 0 fully verified…" |
-| 134 | assistant | tool-call | `todo_write`, `input` = **malformed JSON** (`json_type=text`, raw string) |
-| 135 | tool | tool-result | isError: "todo_write args must be an object with a `todos` array" |
-| 136 | assistant | **error** | `HTTP 400: unexpected character: line 1 column 1413 (char 1412). Received Model Group=glm-5.2` |
-
-### Root cause (confirmed byte-for-byte)
-- seq 134's `input` is a raw string. Parsing it fails
- `Expecting ':' delimiter: line 1 column 1413 (char 1412)` — an **exact match** to
- the provider's `unexpected character: line 1 column 1413`. The **model emitted
- malformed JSON as the `todo_write` arguments**.
-- Chain: model emits text + malformed-args tool-call (step 5) → kernel dispatches
- the tool, which returns an error result (seq 135) → kernel calls the provider
- again (step 6); the request re-includes the assistant message carrying the
- malformed `arguments` → provider 400s → persisted as an `error` chunk (seq 136).
-
-### Why it's "unrecoverable"
-- `openai-stream` `convertAssistantMessage` serializes tool-call args as
- `typeof c.input === "string" ? c.input : JSON.stringify(c.input)` — passes the
- malformed string straight through as the OpenAI `arguments` field → provider
- 400s on **every** continuation.
-- The trailing `assistant` message whose only chunk is `error` serializes to
- `content:""` + no tool_calls (error chunk is filtered out, leaving an empty
- assistant message) → also uncontinuable.
-- `reconcile()` touches neither. `load()` also has no try/catch on
- `JSON.parse(value)` — a single corrupt row would throw and brick the chat.
-
-### Scope (production DB, 140 conversations)
-- **6 conversations end in a trailing `error` chunk:** `102587c0` (seq 2, HTTP 401
- model-not-supported), `2bf78252` (seq 2), `61127511` (seq 250), `77574596` (seq 136),
- `d0d85eca` (seq 2), `e1ee0989` (seq 20).
-- **2 tool-calls total** carry a raw malformed-string `input`.
-- `102587c0` has **only** the trailing-error break (no args, no tool-calls) —
- proving Layer 1 is independently necessary. `77574596` has **both**.
-
-## 2. The fix
-
-### Layer 1 — `conversation-store` `reconcile.ts` (structural repair)
-Extend `reconcileWithReport` to:
-1. **Strip `error` chunks from assistant messages.** An `error` chunk is a
- failed-generation marker, never valid provider content (no provider understands
- an "error" content type) — provider-agnostic.
-2. **Drop any assistant message left with no `text` and no `tool-call` chunks**
- (the now-empty error-only message). This is what unblocks continuation. **Safe:**
- an error-only step ends with no tool-calls, so it is never followed by a `tool`
- message — no "tool-without-preceding-assistant-tool_calls" 400 can result. Keep
- the existing orphaned-tool-call synthesis unchanged.
-3. Extend `ReconcileReport` with counts of stripped error chunks / dropped messages
- (for the existing `reconcile.repair` boot/log span).
-
-Why here: the constitution designates `reconcile` as "the pure function run on load
-that repairs any partial turn." A trailing error-only assistant message IS a
-partial/broken turn. Pure, provider-agnostic, runs on every `load()` → auto-repairs
-all 6 broken chats. Repair is read-time only; storage (append-only) untouched.
-`loadSince` (FE reads) is intentionally NOT reconciled, so the user still SEES the
-error while the provider gets clean history.
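
The two Layer 1 rules above can be sketched as a pure transform. This is a minimal illustration only: the `Chunk`, `Message`, and `ErrorRepairReport` shapes and the `stripErrorChunks` name are hypothetical stand-ins, since the real `conversation-store` types and the `reconcileWithReport` signature are not shown in this handoff.

```ts
// Hypothetical, simplified shapes; the real store types may differ.
type Chunk =
  | { type: "text"; text: string }
  | { type: "tool-call"; id: string; name: string; input: unknown }
  | { type: "tool-result"; id: string; output: unknown }
  | { type: "error"; message: string };

interface Message {
  role: "user" | "assistant" | "tool";
  chunks: Chunk[];
}

interface ErrorRepairReport {
  strippedErrorChunks: number;
  droppedMessages: number;
}

// Pure: returns a new array and never mutates the input, so storage stays append-only.
function stripErrorChunks(
  messages: readonly Message[],
): { messages: Message[]; report: ErrorRepairReport } {
  const report: ErrorRepairReport = { strippedErrorChunks: 0, droppedMessages: 0 };
  const out: Message[] = [];
  for (const m of messages) {
    if (m.role !== "assistant") {
      out.push(m);
      continue;
    }
    // Rule 1: error chunks are never valid provider content.
    const kept = m.chunks.filter((c) => c.type !== "error");
    report.strippedErrorChunks += m.chunks.length - kept.length;
    // Rule 2: drop an assistant message left with no text and no tool-call chunks.
    const hasContent = kept.some((c) => c.type === "text" || c.type === "tool-call");
    if (!hasContent) {
      report.droppedMessages += 1;
      continue;
    }
    out.push({ ...m, chunks: kept });
  }
  return { messages: out, report };
}
```

Returning the counts alongside the repaired history keeps the function free of logging side effects; the caller can feed them into the `reconcile.repair` span.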
-
-### Hardening — `conversation-store` `store.ts` `load()` (same unit)
-Wrap the per-chunk `JSON.parse(value)` in try/catch: on a corrupt/unparseable row,
-log + skip it (don't throw) so `reconcile` can still run on the rest. Today a single
-bad row makes `load()` throw, which leaves the chat unrecoverable. (There are 0 such rows
-today, but the "never leave the system broken" principle calls for the guard anyway.)
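
A rough sketch of the skip-on-corrupt-row behavior described above. The `ChunkRow` shape, the `parseRows` name, and the injectable `warn` callback are assumptions for illustration; the actual `store.ts` schema and logger are not shown here.

```ts
// Hypothetical row shape; the real `store.ts` schema is not shown in this handoff.
interface ChunkRow {
  seq: number;
  value: string; // JSON text as persisted
}

// Parses every row it can and reports (rather than throws on) the ones it cannot.
function parseRows(
  rows: readonly ChunkRow[],
  warn: (msg: string) => void = console.warn,
): unknown[] {
  const chunks: unknown[] = [];
  for (const row of rows) {
    try {
      chunks.push(JSON.parse(row.value));
    } catch (err) {
      // Skip the corrupt row so the remaining history can still be reconciled.
      warn(`skipping unparseable chunk at seq ${row.seq}: ${String(err)}`);
    }
  }
  return chunks;
}
```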
-
-### Layer 2 — `openai-stream` `convert-messages.ts` (serialization safety)
-In `convertAssistantMessage`, ensure a tool-call's `arguments` is **always a valid
-JSON string**: if `input` is a string, `JSON.parse` it; on failure substitute a
-valid fallback object (e.g. `JSON.stringify({})` or a wrapped
-`{ _malformed_arguments: <truncated> }`). Objects pass through `JSON.stringify` as
-today. This neutralizes already-stored malformed args (seq 134) so the provider
-stops 400ing on continuation. Follow the SAME semantics as the Claude fix below
-(isolation over DRY: each provider reimplements locally, same behavior).
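
One possible shape for the Layer 2 guard, as a sketch under stated assumptions: `toArgumentsJson` and `MAX_RAW_PREVIEW` are hypothetical names, and the wrapped-fallback variant is chosen here only to show the truncation step (the handoff allows either `{}` or a wrapped object).

```ts
// Assumed truncation length for the raw-string preview; not specified in the handoff.
const MAX_RAW_PREVIEW = 200;

// Always returns text that JSON.parse accepts, whatever was stored as `input`.
function toArgumentsJson(input: unknown): string {
  if (typeof input === "string") {
    try {
      JSON.parse(input);
      return input; // already valid JSON text: pass through unchanged
    } catch {
      // Malformed model output: substitute a valid object carrying a truncated preview.
      return JSON.stringify({ _malformed_arguments: input.slice(0, MAX_RAW_PREVIEW) });
    }
  }
  // JSON.stringify(undefined) yields undefined, so fall back to an empty object.
  return JSON.stringify(input ?? {});
}
```

Keeping a truncated preview of the raw string lets the model see roughly what it emitted, at the cost of a slightly larger payload than a bare `{}`.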
-
-### Layer 2 (equivalent) — `../claude` `provider-anthropic` `convert.ts` (SEPARATE agent)
-The Claude plugin already has a `safeJson(s)` helper (line ~115) used at
-`input: typeof c.input === "string" ? safeJson(c.input) : c.input`. But its fallback
-**returns the raw string `s` on parse failure** — for Anthropic, `tool_use.input`
-must be an object, so a raw string can still 400 when a historical malformed tool_use
-is re-sent. Fix: make `safeJson` return a **valid object fallback** (e.g. `{}`) on
-parse failure instead of the raw string. (Layer 1 does NOT apply here — the
-arch-rewrite `reconcile` already strips error chunks before the Claude provider sees
-the messages, so the Claude converter never receives error-only assistant messages.)
-
-## 3. Waves & summoning
-
-- **Wave 1 (arch-rewrite, PARALLEL):** `conversation-store` (Layer 1 + `load()`
- hardening) and `openai-stream` (Layer 2). Disjoint packages, no contract/type
- change, both depend only on already-built `@dispatch/kernel` contracts. Standard
- summon per ORCHESTRATOR §2/§3 (attach the scoped rules: conversation-store gets
- `pure-core.md`+`no-internal-mocks.md`+`typed-handles.md`+`extension-logging.md`;
- openai-stream gets `pure-core.md`+`no-internal-mocks.md`+`extension-logging.md`;
- both get `one-owner.md`+`isolation-over-dry.md`+`biome-clean.md`+`package-agent.md`+
- `extension-agent.md`).
-- **Wave 2 (separate repo, SEPARATE agent):** summon against
- `--cwd /home/tradam/projects/dispatch/claude` for `packages/provider-anthropic`
- (`convert.ts` `safeJson`). That repo has its own `AGENTS.md`; attach the
- arch-rewrite `package-agent.md`+scoped rules as needed. Can run in parallel with
- Wave 1 (different repo, no shared files).
-
-## 4. Why this auto-heals `77574596` (and the other 5) — no DB surgery
-On next open/continue, `load()` returns history ending at seq 135 (the tool-result):
-Layer 1 strips the seq-136 error message; Layer 2 sanitizes the seq-134 args to
-valid JSON. The provider receives
-`[…, assistant{text+tool-call(args:{})}, tool{error result}]` — a valid "continue
-after a tool result" state. The model sees its `todo_write` failed and adjusts.
-Chat continues. Same auto-repair applies to the other 5 (Layer 1 alone for the
-401/empty cases; Layer 1+2 for any malformed-args case).
-
-## 5. Test requirements (regression scar tissue)
-
-**conversation-store `reconcile.test.ts`:**
-- `reconcile strips error-only trailing assistant message` (the 77574596/102587c0
- shape: `[user, assistant{error}]` → `[user]`).
-- `reconcile strips error chunk but keeps sibling text`
- (`assistant{text,error}` → `assistant{text}`).
-- `reconcile drops assistant message left empty after stripping error`
- (`assistant{error}` only → dropped).
-- `reconcile keeps tool-call + strips error` (`assistant{tool-call,error}` with a
- matching result → `assistant{tool-call}`).
-- existing orphaned-tool-call behavior unchanged (regression).
-- (hardening) corrupt-JSON chunk row is skipped, rest load + reconcile.
-
-**openai-stream `convert-messages.test.ts`:**
-- `arguments is valid JSON when input is a malformed string` (seed from seq 134's
- raw string → output `JSON.parse`s, no throw).
-- `arguments passes through valid string input` and `stringifies object input`
- (regression).
-
-**provider-anthropic `convert.test.ts` (claude repo):**
-- `safeJson returns a valid object fallback on malformed string` (raw malformed
- string → `{}` or wrapped object, not the raw string).
-- `safeJson parses valid string input` (regression).
-
-## 6. Verify (ORCHESTRATOR §4)
-`bun run typecheck && bun run test && bun run check` whole-project green; both agents
-in-lane (`git status --short`); zero internal mocks in the pure-core units. Live-spot:
-open `77574596` against a probe/`bin/up` and confirm it now continues past the tool
-result instead of 400-looping.
-
-## 7. Notes / out of scope
-- **Parse-time prevention** (openai-stream / provider-anthropic could reject or
- repair malformed args when the model emits them, instead of storing a raw
- string) is a deeper follow-up; Layer 2 is the safety net that also repairs
- already-stored data.
-- Deploying the fix auto-repairs the 6 broken production chats on next load — no
- migration needed.
diff --git a/crash-review-report.md b/crash-review-report.md
deleted file mode 100644
index 272abdb..0000000
--- a/crash-review-report.md
+++ /dev/null
@@ -1,86 +0,0 @@
-# Production Crash Investigation — Independent Review
-
-## Executive Summary
-
-The production Dispatch server is experiencing two distinct failure modes under load:
-1. **Exit-code 1 Crashes**: Driven by an unhandled `EventEmitter` `'error'` event from the `ssh2` connection pool, **not** the AI-SDK as previously suspected.
-2. **Bun Runtime Segfaults**: Triggered by massive memory pressure from unbounded conversation history serialization during long multi-step agent turns, confirming the "leak" is actually a massive live working set, not a persistent memory leak.
-
-Additionally, a suspected latent crash path in the cache-warming probe has been confirmed as an Unhandled Promise Rejection.
-
----
-
-## 1. The Exit-1 Crash ("Timed out while waiting for handshake")
-
-### Finding: Confirmed Dispatch Bug in SSH Pool (Incorrect Preliminary Finding)
-The preliminary analysis hypothesized that the `error: Timed out while waiting for handshake` crash was caused by an unhandled `'error'` on an outbound TLS socket to the AI provider. **This is incorrect.**
-
-The crash actually originates from the `ssh2` package managing outbound remote computer connections, specifically within `packages/ssh/src/pool.ts`.
-
-### Technical Analysis
-- **The Evidence**: The exact string `'Timed out while waiting for handshake'` is hardcoded in `ssh2/lib/client.js` when the SSH handshake times out or keepalives fail during a re-keying phase.
-- **The Code Path**: In `packages/ssh/src/pool.ts`, the `doConnect` function attaches an `onError` listener to the `ssh2.Client` instance to catch connection failures:
- ```typescript
- client.on("error", onError);
- ```
- However, when the connection succeeds (`onReady` fires), the `cleanup()` function is called, which **removes the error listener**:
- ```typescript
- function cleanup(): void {
- clearTimeout(timer);
- client.removeListener("ready", onReady);
- client.removeListener("error", onError); // <-- Listener removed here
- }
- ```
-- **The Crash Mechanism**: After `doConnect` succeeds, the client is placed in the pool and returned to callers. If the SSH connection drops later or a timeout occurs, the `ssh2.Client` emits an `'error'` event. Because there are no longer any listeners attached for `'error'`, Node.js's `EventEmitter` escalates it to an uncaught exception, instantly crashing the process with exit code 1.
-
-### Recommendation
-**Dispatch Code Change**: Add a persistent `.on("error", ...)` handler to the `client` in `buildConnection` (or refrain from removing it) to gracefully catch post-connection drops, tear down the connection, and transition `state.value = "error"`.
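
A minimal sketch of the recommended persistent handler. It uses a plain `EventEmitter` as a stand-in for the `ssh2.Client` instance (only the emitter behavior matters for this crash), and `guardConnection`, `PoolState`, and `teardown` are hypothetical names rather than the actual `pool.ts` API.

```ts
import { EventEmitter } from "node:events";

// Hypothetical pool state holder, mirroring the `state.value = "error"` transition.
interface PoolState {
  value: "ready" | "error";
}

// Attach once after the handshake succeeds and leave it attached for the
// lifetime of the connection, so a later drop cannot become an uncaught exception.
function guardConnection(client: EventEmitter, state: PoolState, teardown: () => void): void {
  client.on("error", (err: Error) => {
    state.value = "error";
    teardown();
    console.warn(`ssh connection dropped: ${err.message}`);
  });
}
```

With no `'error'` listener attached, `emit("error", err)` throws `err`; with this guard in place the same event is handled and the process keeps running.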
-
----
-
-## 2. The Bun Native Segfaults (The 6.2 GB "Leak")
-
-### Finding: Massive Live Working Set, Not a Persistent Leak
-The preliminary investigation suspected a 2.5 GB/hour slow leak. The telemetry data confirms that the memory is **not permanently leaked**—when `activeConversations` drops to `0`, the RSS cleanly drops back down to the ~84 MB baseline. The crash is caused by unbounded live working set growth during concurrent agent turns, which fragments and overwhelms Bun's allocator.
-
-### Technical Analysis
-- **The Code Path**: `MAX_STEPS` in `packages/kernel/src/runtime/run-turn.ts` is set to `0` (unlimited). A single turn can run for hundreds of steps.
-- **The Mechanism**: In `executeStep`, every step appends new tool calls and results to the `messages` array. This array is then passed to `provider.stream()`.
-- Inside `packages/openai-stream/src/stream.ts`, the entire unbounded array is serialized into a single contiguous string every step:
- ```typescript
- const bodyString = JSON.stringify(body);
- ```
-- **The Crash**: If 4 concurrent conversations (`activeConversations = 4`) run for hundreds of steps, the `messages` arrays grow to hundreds of megabytes each. Serializing these arrays copies them into massive contiguous strings on the JavaScriptCore heap (Bun's engine) on *every step*. This causes gigabytes of memory allocation churn, memory pressure spikes (peaking at 6.2 GB), and eventually triggers a native `SIGSEGV`/`SIGILL` in Bun's allocator.
-
-### Recommendation
-**Dispatch Code Change**:
-1. Reintroduce a sane `MAX_STEPS` limit (e.g., `50` or `100`) to bound the maximum length of a single turn.
-2. Implement a sliding window or context-truncation strategy for `messages` before serializing to prevent the payload from growing infinitely.
-3. **Operational Mitigation**: Apply the `MemoryMax` cgroup circuit breaker to turn the segfault into a controlled recycle while the codebase fix is developed.
-
----
-
-## 3. The Cache-Warming Latent Crash
-
-### Finding: Confirmed Latent Unhandled Promise Rejection
-The preliminary finding suspected a latent crash path due to a missing `try/catch` in the cache-warming probe. This is confirmed.
-
-### Technical Analysis
-- **The Code Path**: In `packages/session-orchestrator/src/orchestrator.ts`, the `createWarmService`'s `warm` function consumes the provider stream:
- ```typescript
- for await (const event of provider.stream(messages, assembled.tools, providerOpts)) {
- ```
-- **The Mechanism**: If the AI provider connection fails or aborts, `provider.stream` throws. Because there is no `try/catch` around this loop, the async `warm` function rejects its returned promise.
-- In `packages/cache-warming/src/warmer.ts`, the timer fires `void fireWarm(conversationId, token);`. Since the returned promise is not `await`ed or `.catch()`'d at the top level, it results in an **Unhandled Promise Rejection**.
-
-### Recommendation
-**Dispatch Code Change**: Add a `try/catch` block around the `for await` loop in `createWarmService` (or add `.catch()` to `fireWarm`) to gracefully emit an error result instead of throwing an unhandled rejection.
-
----
-
-## Assessment of Preliminary Findings
-
-- ❌ **"Unhandled TLS Socket to AI Provider"**: Incorrect. The exit-1 crash was caused by the `'error'` listener being removed after a successful outbound SSH connect (`ssh2`), which left later connection errors unhandled; it was not the AI SDK's TLS socket.
-- ✅ **"MAX_STEPS = 0 ... structural enabler for large working sets"**: Correct. Unbounded history serialization caused the massive gigabyte allocations that crashed Bun.
-- ✅ **"Cache-warming missing try/catch"**: Correct. It is an unhandled promise rejection waiting to happen.
-- ✅ **"Not an LSP leak"**: Confirmed. The memory growth is strictly tied to `activeConversations` and the unbounded turn array serialization.
diff --git a/frontend-cache-rate-handoff.md b/frontend-cache-rate-handoff.md
deleted file mode 100644
index b64a612..0000000
--- a/frontend-cache-rate-handoff.md
+++ /dev/null
@@ -1,126 +0,0 @@
-# FE handoff — cache hit/miss + percentage (calculation guide)
-
-> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7
-> the backend does not write the FE repo. This describes ONLY how to compute cache
-> hit/miss + percentages from data the backend ALREADY exposes — **no UI design here**
-> (the look is specified separately) and **no backend change is required**.
-> Contracts: `@dispatch/wire` + `@dispatch/transport-contract` `0.4.0`.
-
-## TL;DR
-The cache hit rate is `cacheReadTokens / inputTokens`. Everything you need is already
-on the `usage` + `done` live events and in `GET /conversations/:id/metrics`. There is
-**no separate cache endpoint or boolean** — it's derived from token counts, exactly as
-the old `CacheRatePanel` did.
-
-## The data shape (`Usage`, from `@dispatch/wire`)
-```ts
-interface Usage {
- inputTokens: number; // TOTAL prompt tokens this step/turn, INCLUDING cached ones
- outputTokens: number;
- cacheReadTokens?: number; // input tokens served FROM cache (the "hit" count). Optional.
- cacheWriteTokens?: number; // cache-creation count. Optional; usually ABSENT (see caveats).
-}
-```
-Field semantics that matter for the math:
-- `inputTokens` is the **whole** prompt, so `cacheReadTokens ≤ inputTokens` and the rate is in `[0,1]`.
-- The cache fields are **optional** — treat `undefined` as `0` in all arithmetic.
-
-## Formulas
-```ts
-const read = u.cacheReadTokens ?? 0;
-const write = u.cacheWriteTokens ?? 0;
-
-const isHit = read > 0; // hit vs miss
-const hitRate = u.inputTokens > 0 ? read / u.inputTokens : 0; // 0..1 (guard /0)
-const hitPct = Math.round(hitRate * 100);
-const fresh = Math.max(0, u.inputTokens - read - write); // uncached input tokens
-```
-(These are byte-identical to the old `CacheRatePanel.svelte` formulas: hit rate =
-`cacheReadTokens/inputTokens` clamped; uncached = `max(0, input − read − write)`.)
-
-## Where to get `Usage` — three granularities, two channels
-
-| Scope | LIVE (WS `chat.delta` / NDJSON) | REPLAY (`GET /conversations/:id/metrics`) |
-|---|---|---|
-| **Per step** | `usage` event (`type:"usage"`, carries `stepId`, `usage`) | `TurnMetrics.steps[].usage` (each has `stepId`) |
-| **Per turn** (authoritative aggregate) | `done` event (`type:"done"`, carries `usage`, `durationMs`) | `TurnMetrics.usage` |
-| **Cumulative** (conversation) | Σ of each turn's `done.usage` | Σ of `turns[].usage` |
-
-Notes:
-- The **per-turn aggregate IS the sum of its steps** (the runtime aggregates). So when
- summing a cumulative figure, pick ONE granularity — sum `done.usage`/`TurnMetrics.usage`
- per turn, **or** sum all steps — never both (double-count).
-- `done.usage` is the authoritative per-turn total. (`turn-sealed` does NOT carry usage in
- this backend — it's just `{conversationId, turnId}`; the numbers ride the immediately
- preceding `done` event.)
-- `step-complete` is timing only (ttft/decode) — no tokens; ignore it for cache.
-
-## Live accumulation + reconcile (recommended pattern)
-1. **In-progress turn (optional live counter):** as `usage` events stream, you may sum
- `read`/`input` across the turn's steps to show a live-updating hit % for the current turn.
-2. **Turn finished:** take that turn's authoritative totals from its `done.usage`. Use it as
- the turn's final value (replace any live partial for that turn).
-3. **Cumulative (session/conversation):** add each completed turn's `done.usage` to a running
- total. Compute the cumulative hit % from the running totals (`ΣcacheRead / Σinput`).
-4. **"Last request" rate:** the most recent turn's `done.usage` (or most recent step's `usage`
- if you want per-round-trip granularity).
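
The cumulative step above can be sketched as a small reducer over each turn's `done.usage`. `CacheTotals`, `addTurn`, and `cumulativeHitPct` are hypothetical helper names; `Usage` is copied from the shape given earlier in this doc.

```ts
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

// Running totals for one conversation; only the fields the hit rate needs.
interface CacheTotals {
  input: number;
  read: number;
}

// Fold one completed turn's authoritative `done.usage` into the running totals.
function addTurn(totals: CacheTotals, done: Usage): CacheTotals {
  return {
    input: totals.input + done.inputTokens,
    read: totals.read + (done.cacheReadTokens ?? 0), // undefined counts as 0
  };
}

// Cumulative hit % = sum(cacheRead) / sum(input), guarded against division by zero.
function cumulativeHitPct(totals: CacheTotals): number {
  return totals.input > 0 ? Math.round((totals.read / totals.input) * 100) : 0;
}
```

Summing raw token counts and dividing once avoids the skew that averaging per-turn percentages would introduce when turns differ in size.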
-
-## Replay / reopening a conversation
-On open, `GET /conversations/:id/metrics` → `ConversationMetricsResponse { turns: TurnMetrics[] }`.
-Seed the cumulative totals from `Σ turns[].usage`, the "last request" from `turns.at(-1).usage`,
-and you can render a per-turn (and per-step, via `steps[]`) breakdown — a superset of what the
-old session-cumulative-only panel could show.
-
-## Caveats (be honest in the UI)
-- **`cacheWriteTokens` is usually absent.** The current provider is OpenAI-compatible
- (OpenCode Go): it reports a cache **read** count (`cached_tokens`) but **no cache-creation**
- count. So the old panel's separate "write" row will be 0/empty. Hit/miss and the read
- percentage are unaffected. It would populate only if an Anthropic-native (or
- `cache_write`-reporting) provider is added.
-- **Optional fields:** any of the cache fields can be `undefined` (provider-dependent). Default
- to 0; never assume presence.
-- **A legitimate 0% is not a bug.** OpenAI-style providers auto-cache (no `cache_control`
- breakpoints), and short prompts below the provider's cache threshold simply won't be cached —
- `cacheReadTokens: 0` is a real "miss", not missing data. Cache reads grow as a conversation's
- resent prefix gets large enough.
-- **Provider doesn't report cache at all — distinguish from 0.** Some providers (e.g.
- **Umans**) never include `cache_read_tokens` / `cache_write_tokens` in their usage
- payload. In that case `cacheReadTokens` is `undefined` — the provider can't tell you
- whether cache was hit or missed. This is **different from `cacheReadTokens: 0`**,
- which means "cache was checked and there were 0 hits" (a real miss).
-
- The FE should distinguish these three states:
-
- | `cacheReadTokens` | Meaning | FE display |
- |---|---|---|
- | `undefined` | Provider doesn't report cache | Hide cache panel, or show "N/A" |
- | `0` | Provider reports cache; this request had 0 hits | Show "0%" (genuine miss) |
- | `> 0` | Cache hit | Show percentage |
-
- ```ts
- function cacheDisplay(u: Usage): { kind: "not-reported" } | { kind: "reported"; hitPct: number } {
- if (u.cacheReadTokens === undefined) return { kind: "not-reported" };
- const read = u.cacheReadTokens;
- const hitRate = u.inputTokens > 0 ? read / u.inputTokens : 0;
- return { kind: "reported", hitPct: Math.round(hitRate * 100) };
- }
- ```
-
- When `kind === "not-reported"`, do NOT show "0%" — that's misleading. Either hide the
- cache panel entirely or show "Cache: not reported". This also applies to `cacheWriteTokens`
- (if `undefined`, don't show a write row).
-
-## Worked example (real numbers, captured live against OpenCode Go flash)
-| Turn | inputTokens | cacheReadTokens | hit % |
-|---|---|---|---|
-| 1 | 2669 | 384 | 14% |
-| 2 (history resent) | 2737 | 2560 | **94%** |
-
-Cumulative: read `2944` / input `5406` → **54%**. These exact values appear both on the live
-`done.usage` stream and in `GET /conversations/:id/metrics` (`turns[].usage`).
-
-## Type references
-- `@dispatch/wire`: `Usage`, `TurnUsageEvent` (`usage`), `TurnDoneEvent` (`done`),
- `TurnMetrics`, `StepMetrics`.
-- `@dispatch/transport-contract`: `ConversationMetricsResponse`, and the WS `chat.delta`
- envelope carrying each `AgentEvent`.
diff --git a/frontend-cache-warming-handoff.md b/frontend-cache-warming-handoff.md
deleted file mode 100644
index cf1f402..0000000
--- a/frontend-cache-warming-handoff.md
+++ /dev/null
@@ -1,91 +0,0 @@
-# FE handoff — cache warming: cache-rate fix + "expected cache" metric
-
-> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7 the backend does
-> NOT write the FE repo. `lsp references` does not span the two repos.
-> Backend commits: `7ffb6b2` (arch-rewrite), `0e9d118` (`../claude/provider-anthropic`).
-
-## Status — most of the original handoff is DONE (removed)
-Per the FE's `backend-handoff.md` (2026-06-11), the frontend has already consumed the bulk of the
-earlier version of this doc — those sections are **removed**:
-- ✅ `NumberField` (`kind:"number"`) renderer.
-- ✅ Conversation-scoped surface subscriptions (focused `conversationId` on subscribe/invoke +
- staleness rule; re-scope on conversation switch).
-- ✅ The "Cache Warming" sidebar view: enabled toggle, minutes+seconds interval
- (`cache-warming/set-interval`), `cache-warming/toggle`, manual **Warm now** (`POST /chat/warm`),
- live countdown, hit-% history.
-- ✅ `warmNow()` posting `/chat/warm` with the conversation's model.
-
-What remains below is the ONE piece the FE has not yet consumed: a cache-rate **correctness fix** and
-a new **retention** metric.
-
-## Cache-rate metric — a correctness fix + the "expected cache" metric (TO CONSUME)
-A backend bug made the cache-hit % read **100% on Claude whenever anything was cached** (it inflated).
-Root cause: Anthropic's `input_tokens` is the *uncached remainder*, with cache read/creation reported
-separately — but the wire `Usage.inputTokens` convention (which the flash/OpenAI-compat provider
-already follows) is the **TOTAL prompt incl. cached**. Fixed in `../claude/provider-anthropic`
-(`inputTokens = input + cacheRead + cacheWrite`). **No FE change needed for the fix itself** — your
-existing `cacheRead/inputTokens` math (in `frontend-cache-rate-handoff.md`) now yields the *true* rate
-on Claude. (That older handoff's caveat "cacheWriteTokens is usually absent" is **not** true for
-Claude — it reports both.)
-
-Show two distinct cache numbers:
-- **Cache rate** = `cacheReadTokens / inputTokens` — *what fraction of THIS turn's prompt came from
- cache*. Legitimately **drops when a turn adds a lot of new content** (e.g. pasting a big file: reads
- the old prefix back but also writes the new file → rate < 100%). Per-turn efficiency; on every
- `usage`/`done` event + persisted metrics.
-- **Expected cache (retention)** = *of the cache that existed going into this turn, how much we read
- back* — ideally **~100% every turn after the first**. **<100% = the cache busted/expired.** It is a
- **cross-turn** derivation (FE-side, from two consecutive turns' usage you already have):
- ```
- expectedCache(turn N) = clamp01( cacheRead_N / (cacheRead_{N-1} + cacheWrite_{N-1}) )
- ```
- (denominator = the prior turn's cached prefix = what it read + what it wrote).
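
The cross-turn derivation above could be computed on the FE roughly as follows. `TurnCache` and `expectedCache` are hypothetical names, and returning `null` when the prior turn cached nothing is an assumption of this sketch (the formula itself leaves a zero denominator undefined).

```ts
// Only the two cache fields of `Usage` are needed for this metric.
interface TurnCache {
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Fraction (0..1) of the prior turn's cached prefix that this turn read back.
// Returns null when the prior turn had no cached prefix to compare against.
function expectedCache(prev: TurnCache, cur: TurnCache): number | null {
  const priorPrefix = (prev.cacheReadTokens ?? 0) + (prev.cacheWriteTokens ?? 0);
  if (priorPrefix === 0) return null;
  return clamp01((cur.cacheReadTokens ?? 0) / priorPrefix);
}
```

A `null` result maps naturally to the "—" shown for a first turn, which keeps "nothing to compare" distinct from a genuine 0% retention.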
-
-**Worked example (live, Claude haiku), one chat, two real turns:**
-| turn | inputTokens (total) | cacheRead | cacheWrite | cache rate `cr/input` | expected cache (cross-turn) |
-|---|---|---|---|---|---|
-| 1 (fresh) | 5149 | 0 | 5146 | 0% | — |
-| 2 (new msg) | 8462 | 5146 | 3313 | **61%** | `5146/(0+5146)` = **100%** |
-
-So on turn 2 the prompt was 61% cache (the rest was the new message), yet you read back **100%** of
-what turn 1 cached — two true, complementary signals. (Pre-fix, the rate wrongly showed 100% because
-the denominator excluded the 5146 cached tokens.)
-
-### Warming-specific (already on the wire — small additions)
-For the warming feature, the backend now also reports a **single-shot** retention so you don't have to
-track cross-turn state there:
-- **`WarmResponse.expectedCacheRate`** (new field on `POST /chat/warm`) =
- `round(cacheReadTokens / (cacheReadTokens + cacheWriteTokens) * 100)` — ~**100%** when the warm
- found the cache still warm, **0%** when it had expired (rewrote everything). This is the **"is
- warming working?"** signal — headline this for the Warm-now result rather than `cachePct`.
-- The conversation-scoped `cache-warming` surface gained a matching **`stat` "cache retention"** field
- (alongside the existing "last cache rate" stat). It's a generic `stat`, so your existing renderer
- already shows it — just relabel/position as desired.
-
-Types: `@dispatch/transport-contract` `WarmResponse` now carries `expectedCacheRate` (additive).
-
-## CR-3 — DONE (next-warm timestamps + manual-warm resets the timer)
-Both asks from `backend-handoff-cache-warming-timer.md` are implemented (commit `bfbad3a`). No
-contract bump (uses the `custom` escape hatch, as you suggested).
-
-**Ask 1 — authoritative timestamps on the `cache-warming` surface.** The conversation-scoped spec now
-includes a `custom` field:
-```ts
-{ kind: "custom", rendererId: "cache-warming-timer",
- payload: { nextWarmAt: number | null, lastWarmAt: number | null } } // epoch-ms
-```
-- `nextWarmAt` = epoch-ms the next AUTOMATIC warm will fire, or `null` when not scheduled (disabled,
- or a turn is generating so the timer is cancelled). Drive your countdown off this directly.
-- `lastWarmAt` = epoch-ms of the most recent completed warm, or `null` if none. Use its changes for
- the history. (The hit-% for that warm is the `last cache rate` / `cache retention` stats in the
- same spec.)
-- Pushed via the normal surface `update` on every change (warm complete, toggle, interval, turn
- start/settle). You can drop the FE-side best-effort countdown anchor.
-
-**Ask 2 — a manual `POST /chat/warm` now resets the cycle + refreshes the surface.** Implemented via
-an inversion (no new endpoint, no change to the `/chat/warm` request/response): the backend's warm
-service emits an internal event that the cache-warming extension consumes, so a manual warm now
-re-arms the automatic timer (new `nextWarmAt`), updates `lastPct`/`lastWarmAt`, and **pushes a surface
-`update`**. So after a "Warm now" click you'll get an authoritative surface `update` — you can drop the
-workaround of reading the % from the HTTP response (though the HTTP `WarmResponse` is still returned and
-fine to use for immediate feedback). Live-verified against Claude haiku.
diff --git a/frontend-cache-warming-lifecycle-handoff.md b/frontend-cache-warming-lifecycle-handoff.md
deleted file mode 100644
index 49bee0a..0000000
--- a/frontend-cache-warming-lifecycle-handoff.md
+++ /dev/null
@@ -1,94 +0,0 @@
-# FE handoff — CR-4 cache-warming lifecycle SHIPPED (+ CR-1 table, CR-2 scope)
-
-> **Courier doc** (backend → `../frontend`, via the user). Response to your
-> `backend-handoff-cache-warming.md` (CR-4) and the open asks CR-1 / CR-2 in
-> `backend-handoff.md`. Everything below is live on `bin/up` and verified with a
-> headless probe (same flow as your `scripts/probe-cache-warming.ts` — re-run it to
-> confirm; default-off means Phase C's toggle-enable branch now executes).
->
-> **Contract bumps to re-pin:** `@dispatch/ui-contract` **0.1.0 → 0.2.0**,
-> `@dispatch/transport-contract` **0.8.0 → 0.9.0**. `wire` unchanged (0.6.0).
-
-## CR-4a — warming now defaults OFF ✅
-A new conversation starts `enabled: false`, `nextWarmAt: null` — no warm is scheduled
-until the user opts in via the toggle. Interval default is still 240s. Bonus fix:
-re-enabling restores the conversation's PERSISTED interval (not the 240s default).
-One caveat (pre-existing behavior, now fail-safe): opt-in is not yet re-hydrated
-across a backend RESTART — after a restart a conversation reads disabled until
-toggled again. Flag it if that matters to you and we'll add boot hydration.
-
-## CR-4b — post-warm updates now carry the FUTURE `nextWarmAt` ✅
-Root cause was notify-before-reschedule in the warmer. Fixed; additionally:
-- after every automatic warm, the pushed `cache-warming-timer` payload is
- `{ nextWarmAt: <future>, lastWarmAt: <just now> }` (probe: 2 warms @5s, both FUTURE);
-- after `turn-sealed` the surface now pushes the fresh post-turn schedule (this was
- the "still past after a real chat turn" case in your probe);
-- on `turn-start` the surface pushes `nextWarmAt: null` (nothing scheduled while
- generating — render as your "waiting…" state);
-- if a warm completes after warming has been disabled in the meantime, the update
- carries `nextWarmAt: null`, never a stale past timestamp.
-Your countdown can stay authoritative off `nextWarmAt`; the cosmetic past-value guard
-should now be dead code.
-
-## CR-4c — `POST /conversations/:id/close` ✅ (the tab-close affordance)
-New endpoint (no request body), `transport-contract@0.9.0`:
-
-```ts
-interface CloseConversationResponse {
- conversationId: string;
- abortedTurn: boolean; // true iff an in-flight turn existed and was aborted
-}
-```
-
-Semantics — exactly the asymmetry the user wanted:
-- **Aborts any in-flight turn.** The kernel stops at the next event boundary; the
- partial turn is PERSISTED and the turn SEALS normally — watchers receive
- `done` (with `reason: "aborted"`) then `turn-sealed`, so your stream-derived
- `generating` flag clears with no special-casing. Live-verified.
-- **Stops + disables cache-warming** for the conversation (persisted OFF — reopening
- the conversation later does not resume warming), and pushes a surface update
- (`enabled: false`, `nextWarmAt: null`) to subscribers.
-- **Idempotent**: closing an idle/unknown conversation is a 200 with
- `abortedTurn: false`.
-- Browser/socket disconnect and `chat.unsubscribe` are UNCHANGED — they still never
- touch the turn or the warming schedule (your "keep running when the window closes"
- half is regression-tested).
-Wire this into `store.closeTab()`; `fetch`/`sendBeacon` both fine (CORS already
-allows POST).
-
-## CR-4d — initial `surface` echo ✅ (no backend change was needed)
-HEAD already echoes `conversationId` on the initial `surface` reply (shipped in the
-per-conversation-scoping commit; unit-tested). We live-probed BOTH stacks today —
-:24205 and your :25205 — and the echo is present. Your probe most likely ran against
-a `bin/up2` instance booted before that commit (up2 freezes code at boot). Re-run
-`bin/up2` and your probe; if you still see a missing echo, send us the raw frame.
-
-## CR-1 — Loaded Extensions table ✅
-The surface now emits the "Loaded" count stat plus ONE custom field:
-
-```ts
-{ kind: "custom", rendererId: "table", payload: { columns, rows } }
-// columns: ["Name", "Version", "Trust", "Activation"]
-// rows: one per loaded extension (ALL trust tiers), cell-for-cell aligned
-```
-
-Typed payload is exported as `TablePayload` (+ `TABLE_RENDERER_ID`) from
-`@dispatch/surface-loaded-extensions` if you want to narrow instead of duck-typing.
-Note: `Version` cells all read `0.0.0` — manifests are genuinely unversioned today
-(the optional data-quality item from your handoff; not done).
-
-## CR-2 — catalog `scope` flag ✅ (`ui-contract@0.2.0`)
-`SurfaceCatalogEntry` gains `scope?: "global" | "conversation"`. Emitted today:
-`loaded-extensions` → `"global"`, `cache-warming` → `"conversation"`. Treat ABSENT as
-conversation-scoped (conservative — your current always-send-conversationId policy
-remains correct for both). You can now skip re-subscribing `scope: "global"` surfaces
-on conversation switch.
-
-## Suggested FE follow-ups (from your own queue)
-- Re-pin + re-mirror `.dispatch/{ui-contract,transport-contract}.reference.md`.
-- Wire `POST /conversations/:id/close` into the tab-close path.
-- Extend `probe-cache-warming.ts`: assert default-off, post-warm FUTURE `nextWarmAt`,
- and (new) close → `abortedTurn` + `done.reason === "aborted"`.
-- The "waiting…" guard for a past `nextWarmAt` can stay as a belt-and-braces guard
- but should never trigger now; `nextWarmAt: null` while generating is the real state
- to render.
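The countdown rules in the handoff above (drive the timer off `nextWarmAt`, render "waiting…" when it is `null`, keep a guard for past values) could be sketched roughly as follows. This is a minimal sketch, not the frontend's actual code: `countdownLabel` and the `CacheWarmingTimerPayload` interface name are hypothetical, and only the two payload fields come from the handoff.

```typescript
// Payload fields as described in the handoff; both values are epoch-ms.
// The interface name is an assumption made for this sketch.
interface CacheWarmingTimerPayload {
  nextWarmAt: number | null;
  lastWarmAt: number | null;
}

// Hypothetical helper: turns the pushed payload into a countdown label.
function countdownLabel(payload: CacheWarmingTimerPayload, nowMs: number): string {
  // null means nothing is scheduled (warming disabled, or a turn is generating).
  if (payload.nextWarmAt === null) return "waiting…";
  const remainingMs = payload.nextWarmAt - nowMs;
  // Belt-and-braces guard for a past timestamp; per the handoff it should not trigger.
  if (remainingMs <= 0) return "waiting…";
  // Round up so the label never shows "0s" while time is still left.
  return `${Math.ceil(remainingMs / 1000)}s`;
}
```

Taking `nowMs` as a parameter rather than reading the clock inside the helper keeps it a pure function, so it can be re-evaluated on each tick and tested without fake timers.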
diff --git a/frontend-compaction-handoff.md b/frontend-compaction-handoff.md
deleted file mode 100644
index 195bc1e..0000000
--- a/frontend-compaction-handoff.md
+++ /dev/null
@@ -1,167 +0,0 @@
-# FE handoff — conversation compacting
-
-Courier this to `../frontend`. All changes are ADDITIVE.
-
-## What shipped (backend)
-
-Conversation compaction: summarize old history into a summary + recent N,
-preserving the full pre-compaction history in a separate archive conversation.
-Creates a linked chain of archives you can walk backward.
-
-Two modes:
-- **Manual**: `POST /conversations/:id/compact` — triggers immediately.
-- **Automatic**: after each turn settles, the backend checks if the last turn's
- input tokens exceeded the per-conversation `compactThreshold` (default 85).
- If so, compaction runs automatically (fire-and-forget, non-blocking).
-
-## How compaction works — non-destructive, chained
-
-The compacted conversation **keeps its original ID** (so messaging between
-agents still works). The old full history is **forked** to a new archive
-conversation (new UUID). The archive inherits the source's `compactedFrom`,
-creating a chain:
-
-```
-Compaction 1: A (ID "abc") — full history forked to X (new ID).
- A's history replaced with [summary + recent N].
- A.compactedFrom = X
-
-Compaction 2: A (ID "abc") — current history forked to Y (new ID).
- A's history replaced with [new summary + recent N].
- A.compactedFrom = Y
- Y.compactedFrom = X (inherited from A's pre-compaction state)
-
-Chain: A → Y → X (walk compactedFrom backward)
-```
-
-Each archive is an **immutable snapshot** — a complete copy of the conversation
-at the time of that compaction. History is never destroyed.
-
-The FE **does not switch tabs** — the conversation ID doesn't change. Just
-reload the history.
-
-## Bump pinned deps
-- `@dispatch/wire` → `0.11.0`
-- `@dispatch/transport-contract` → `0.15.0`
-
-## New types
-
-```ts
-// @dispatch/wire — ConversationMeta now has compactedFrom
-export interface ConversationMeta {
- readonly id: string;
- readonly createdAt: number;
- readonly lastActivityAt: number;
- readonly title: string;
- readonly status: ConversationStatus; // "active" | "idle" | "closed"
- /** Points to the archive conversation with full pre-compaction history. */
- readonly compactedFrom?: string;
-}
-
-// @dispatch/wire
-export interface CompactionResult {
- readonly summary: string;
- readonly newConversationId: string; // ID of the archive (old full history)
- readonly messagesSummarized: number;
- readonly messagesKept: number;
-}
-
-// @dispatch/transport-contract — WS message (server → client)
-export interface ConversationCompactedMessage {
- readonly type: "conversation.compacted";
- readonly conversationId: string; // the conversation (ID unchanged)
- readonly newConversationId: string; // the archive ID (old full history)
- readonly messagesSummarized: number;
- readonly messagesKept: number;
-}
-// Added to WsServerMessage union.
-
-// @dispatch/transport-contract — HTTP response types
-export interface CompactResponse {
- readonly conversationId: string; // the conversation (ID unchanged)
- readonly newConversationId: string; // the archive ID (old full history)
- readonly messagesSummarized: number;
- readonly messagesKept: number;
-}
-
-export interface CompactPercentResponse {
- readonly conversationId: string;
- readonly percent: number | null; // 0 = manual only; null = not stored (default 85)
-}
-
-export interface SetCompactPercentRequest {
- readonly percent: number;
-}
-```
-
-## `POST /conversations/:id/compact` — manual compaction
-
-Triggers compaction on demand. Optional JSON body:
-```json
-{ "keepLastN": 10, "modelName": "umans/umans-glm-5.2" }
-```
-- `keepLastN` (default 10): how many recent messages to retain.
-- `modelName`: override the model used for summarization.
-
-200 response: `CompactResponse` — includes `newConversationId` (the archive ID).
-The conversation ID in the response is the same as the request — the ID doesn't
-change. The FE should reload the conversation history.
-
-409: `{ error: string }` — conversation is generating, too short, percent not exceeded, etc.
-503: compaction service not available.
-
-## `GET /conversations/:id/compact-percent` — read percent
-
-200: `CompactPercentResponse { conversationId, percent }`
-- `percent: 0` — auto-compact explicitly disabled (manual only).
-- `percent: null` (not stored) — **default: 85** (85% tokens). The FE
- should display 85 as the default value in the settings UI.
-- Any positive number — auto-compact triggers when the last turn's input tokens
- exceed this value.
-
-## `PUT /conversations/:id/compact-percent` — set percent
-
-Body: `SetCompactPercentRequest { percent: number }`
-- `0` explicitly disables auto-compact.
-- Any positive number sets the trigger percent.
-- To "reset to default", set it to 85.
-
-## `conversation.compacted` WS message
-
-Broadcast to all connected WS clients when compaction completes. The FE should
-**reload the conversation history** via `GET /conversations/:id` (the
-conversation ID hasn't changed — just reload the same ID). The first message
-will now be a system summary.
-
-No tab switching needed — the ID is the same.
-
-## What the FE needs to do
-
-1. **Compact button** in the conversation toolbar → `POST /conversations/:id/compact`.
- Show a loading indicator while waiting. On success, reload the conversation
- history (same ID — just re-fetch).
-
-2. **Settings UI** for compact percent: `PUT /conversations/:id/compact-percent`
- with `{ percent: number }`. A number input (0 = manual only, default 85).
- Read the current value via `GET /conversations/:id/compact-percent`.
-
-3. **Handle `conversation.compacted` WS messages**: reload the conversation
- history via `GET /conversations/:id` (same ID, no tab switch).
-
-4. **"View predecessor" link**: when `ConversationMeta.compactedFrom` is present,
- show a link that opens the archive conversation in a read-only view (or a new
- tab). Load it via `GET /conversations/:compactedFrom`. The archive has
- `status: "closed"` and title `"Archive: <original>"`. Each archive may also
- have its own `compactedFrom` — walk the chain backward to see every snapshot.
-
-5. **Archives in conversation list**: archives appear in
- `GET /conversations?status=closed`. They have `compactedFrom` chaining to
- the previous archive (if any). The FE can show them in a history view.
-
-6. **Visual indicator**: show a badge on conversations that have a
- `compactedFrom` (they've been compacted). E.g. "Compacted" badge or chain icon.
-
-## CLI
-
-`dispatch compact <conversationId>` — triggers manual compaction. Resolves
-short IDs like other commands. The response includes the archive ID.
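Walking the `compactedFrom` chain described in the compaction handoff above might look like the sketch below. It assumes the metadata is already available through an in-memory lookup; `MetaLike`, `archiveChain`, and the `lookup` callback are hypothetical names. A real client would presumably fetch each archive via `GET /conversations/:id` instead.

```typescript
// Only the two fields the walk needs; a subset of ConversationMeta.
interface MetaLike {
  readonly id: string;
  readonly compactedFrom?: string;
}

// Hypothetical helper: returns archive IDs from newest to oldest.
function archiveChain(
  startId: string,
  lookup: (id: string) => MetaLike | undefined,
): string[] {
  const chain: string[] = [];
  // Track visited IDs so a malformed (cyclic) chain cannot loop forever.
  const seen = new Set<string>([startId]);
  let next = lookup(startId)?.compactedFrom;
  while (next !== undefined && !seen.has(next)) {
    chain.push(next);
    seen.add(next);
    next = lookup(next)?.compactedFrom;
  }
  return chain;
}
```

For the two-compaction example in the handoff (A points to Y, Y points to X), calling this with A's ID yields Y's ID followed by X's ID.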
diff --git a/frontend-context-size-handoff.md b/frontend-context-size-handoff.md
deleted file mode 100644
index a774a0c..0000000
--- a/frontend-context-size-handoff.md
+++ /dev/null
@@ -1,47 +0,0 @@
-# FE handoff — context size (current context-window usage)
-
-Courier this to `../frontend` (cross-repo contract change; `lsp references` does not
-span repos — ORCHESTRATOR §7). Backend commit adds an optional `contextSize` field; no
-breaking change.
-
-## What shipped (backend)
-
-A new optional field **`contextSize`** (a token count) now flows to the frontend on two
-existing carriers. Both are computed identically and are EQUAL for the same turn:
-
-1. **Live** — `TurnDoneEvent.contextSize?: number` (the `done` AgentEvent, arriving in a
- `chat.delta` WS message / the NDJSON stream).
-2. **Persisted** — `TurnMetrics.contextSize?: number`, served by
- `GET /conversations/:id/metrics` (`ConversationMetricsResponse.turns[].contextSize`).
-
-Types: `@dispatch/wire` (`0.4.0 → 0.5.0`), re-exported by
-`@dispatch/transport-contract` (`0.5.0 → 0.6.0`). Bump the pinned `file:` deps.
-
-## Definition (read this — it's subtle)
-
-`contextSize` = **the turn's FINAL step `inputTokens + outputTokens`** — the tokens the
-conversation occupies right now.
-
-It is deliberately **NOT** the aggregate `usage` already on `done` / `TurnMetrics`.
-`usage.inputTokens` is the SUM across steps, which **overcounts** a multi-step / tool-calling
-turn (each step re-prefills the growing prompt). The final step's input already contains all
-prior context, so `finalStep.input + finalStep.output` is the true occupancy. Do not derive
-context size from `usage` yourself — read `contextSize`.
-
-## How to render it
-
-- **Current value = the LATEST turn's `contextSize`.** The chat's "current context usage" is
- whatever the most recent turn reported.
-- **Live update:** when a `done` event arrives, if `event.contextSize !== undefined`, set the
- displayed context size to it.
-- **On (re)hydrate:** call `GET /conversations/:id/metrics`, take the LAST element of `turns`
- that has a defined `contextSize`, and show its value. (Turns appear only after they seal.)
-- **Optionality:** `contextSize` may be `undefined` (provider reported no per-step usage).
- Treat absent as "unknown" — render a placeholder, NOT `0`.
-
-## Not included yet (next step)
-
-The model's **max context-window limit** is a SEPARATE, later field — so a UI like
-`contextSize / limit` (e.g. `34,102 / 200,000`) can't show the denominator yet. For now show
-only the current size (e.g. "34,102 tokens in context"). "context size" = current usage;
-"context window" = the future limit (see GLOSSARY).
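The rehydrate rule from the context-size handoff above (take the last turn that has a defined `contextSize`, and treat absent as unknown rather than `0`) could be sketched like this. `TurnMetricsLike`, `latestContextSize`, and `contextSizeLabel` are hypothetical names, and the placeholder text "unknown" is an assumption.

```typescript
// Subset of TurnMetrics: only the optional field this sketch reads.
interface TurnMetricsLike {
  readonly contextSize?: number;
}

// Hypothetical helper: scan from the newest turn backwards for a defined value.
function latestContextSize(turns: readonly TurnMetricsLike[]): number | undefined {
  for (let i = turns.length - 1; i >= 0; i--) {
    const size = turns[i]?.contextSize;
    if (size !== undefined) return size;
  }
  // No turn reported a size: the caller should show a placeholder, not 0.
  return undefined;
}

// Hypothetical label formatter; "unknown" stands in for the placeholder.
function contextSizeLabel(size: number | undefined): string {
  return size === undefined
    ? "unknown"
    : `${size.toLocaleString("en-US")} tokens in context`;
}
```

The same `contextSizeLabel` can be fed from a live `done` event whenever `event.contextSize` is defined, so both carriers render identically.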
diff --git a/frontend-conversation-lifecycle-handoff.md b/frontend-conversation-lifecycle-handoff.md
deleted file mode 100644
index ca6de57..0000000
--- a/frontend-conversation-lifecycle-handoff.md
+++ /dev/null
@@ -1,102 +0,0 @@
-# FE handoff — conversation lifecycle (tab persistence across devices)
-
-Courier this to `../frontend`. All changes are ADDITIVE — nothing existing breaks.
-
-## What shipped (backend)
-
-Conversations now have a lifecycle **status** field: `active`, `idle`, or `closed`.
-This enables tab persistence: when a new browser connects, it fetches all
-`active` + `idle` conversations and restores the tab bar.
-
-- **`active`** — an agent is currently generating (a turn is in-flight).
-- **`idle`** — conversation exists, not generating. User can send a message to resume.
-- **`closed`** — user dismissed the tab (hidden from the tab bar, not deleted).
-
-Status transitions are driven by the backend:
-- `idle → active` when a turn starts.
-- `active → idle` when a turn settles (done/error).
-- `→ closed` when `POST /conversations/:id/close` is called.
-
-## Bump pinned deps
-- `@dispatch/wire` → `0.10.0`
-- `@dispatch/transport-contract` → `0.14.0`
-
-## New types (`@dispatch/wire` + `@dispatch/transport-contract`)
-
-```ts
-export type ConversationStatus = "active" | "idle" | "closed";
-
-// ConversationMeta now has a status field:
-export interface ConversationMeta {
- readonly id: string;
- readonly createdAt: number;
- readonly lastActivityAt: number;
- readonly title: string;
- readonly status: ConversationStatus;
-}
-
-// New WS message (server → client):
-export interface ConversationStatusChangedMessage {
- readonly type: "conversation.statusChanged";
- readonly conversationId: string;
- readonly status: ConversationStatus;
-}
-```
-
-`ConversationStatusChangedMessage` is added to the `WsServerMessage` union.
-
-## `GET /conversations?status=active,idle` — filter by status
-
-The existing `GET /conversations` endpoint now accepts an optional `?status=`
-query param: a comma-separated list of statuses to filter by.
-
-- **Default (no param):** returns ALL conversations (all statuses).
-- `?status=active,idle` → only active + idle (what the FE tab bar wants).
-- `?status=closed` → only closed conversations (for a history view).
-- Invalid values are silently dropped. If all values are invalid, no filter
- is applied (returns all).
-
-## `POST /conversations/:id/close` — marks as closed
-
-The existing close endpoint now also sets the conversation's status to `closed`
-in the store. This persists across server restarts. The response is unchanged
-(`{ conversationId, abortedTurn }`).
-
-## `conversation.statusChanged` WS message
-
-Broadcast to ALL connected WS clients whenever a conversation's status changes.
-The backend emits this synchronously alongside the existing `turnStarted` /
-`turnSettled` / `conversationClosed` hooks.
-
-```ts
-{ type: "conversation.statusChanged", conversationId: "conv-1", status: "active" }
-```
-
-## What the FE needs to do
-
-1. **On connect:** call `GET /conversations?status=active,idle` to fetch
- conversations for the tab bar. Render tabs for each.
-
-2. **`active` tabs:** subscribe to the conversation's live stream
- (`chat.subscribe` WS op) to receive in-flight events.
-
-3. **`idle` tabs:** load history via `GET /conversations/:id`. No live
- subscription needed until the user sends a message.
-
-4. **Tab close button:** call `POST /conversations/:id/close` to mark the
- conversation as `closed`. Remove it from the tab bar.
-
-5. **Handle `conversation.statusChanged` WS messages:** update the tab's
- status indicator. When a conversation goes `idle → active`, show a
- loading/generating indicator. When it goes `active → idle`, remove the
- indicator. When it goes `closed`, remove the tab.
-
-6. **Closed conversations:** accessible from a history view
- (`GET /conversations?status=closed`). Can be reopened by sending a message
- (which transitions `closed → active`).
-
-## CLI
-
-`dispatch list` now defaults to `active,idle` (excludes closed). New flags:
-- `--status <active|idle|closed>` — filter by a single status.
-- `--all` — include closed (show all statuses).
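Step 5 of the lifecycle handoff above (update the tab on `conversation.statusChanged`, remove it on `closed`) could be expressed as a small reducer. This is a sketch under assumptions: the `Tab` shape and `applyStatusChange` are hypothetical, and ignoring messages for conversations with no open tab is a choice made here, not something the handoff specifies.

```typescript
type ConversationStatus = "active" | "idle" | "closed";

// Hypothetical tab-bar entry; the real FE store shape is not in the handoff.
interface Tab {
  readonly conversationId: string;
  readonly status: ConversationStatus;
}

// Message shape as given in the handoff.
interface ConversationStatusChangedMessage {
  readonly type: "conversation.statusChanged";
  readonly conversationId: string;
  readonly status: ConversationStatus;
}

// Hypothetical reducer: returns the next tab list without mutating the input.
function applyStatusChange(
  tabs: readonly Tab[],
  msg: ConversationStatusChangedMessage,
): Tab[] {
  // closed: drop the tab from the bar.
  if (msg.status === "closed") {
    return tabs.filter((t) => t.conversationId !== msg.conversationId);
  }
  // active / idle: update the matching tab; unknown IDs are left alone (assumption).
  return tabs.map((t) =>
    t.conversationId === msg.conversationId ? { ...t, status: msg.status } : t,
  );
}
```

A generating indicator can then be derived from `status === "active"` rather than tracked separately.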
diff --git a/frontend-conversation-list-handoff.md b/frontend-conversation-list-handoff.md
deleted file mode 100644
index dd3fd63..0000000
--- a/frontend-conversation-list-handoff.md
+++ /dev/null
@@ -1,100 +0,0 @@
-# FE handoff — conversation list, title, and open tab
-
-Courier this to `../frontend`. All changes are ADDITIVE — nothing existing breaks.
-
-## What shipped (backend)
-
-Three new features for conversation management:
-
-1. **Conversation list** — `GET /conversations` returns all known conversations with
- metadata (id, title, createdAt, lastActivityAt). The backend auto-tracks metadata
- on every message append; title defaults to the first user message (truncated to 80 chars).
-
-2. **Conversation title** — `GET/PUT /conversations/:id/title` lets the FE read and
- set a human-readable title for any conversation.
-
-3. **Open tab signal** — `POST /conversations/:id/open` broadcasts a `conversation.open`
- WS message to all connected clients (e.g. when the CLI uses `--open`). See also
- `frontend-conversation-open-handoff.md` for the WS message details.
-
-No version bumps needed — all types are already in `@dispatch/transport-contract` `0.13.0`
-and `@dispatch/wire` `0.9.0`.
-
-## `GET /conversations` — conversation list
-
-Returns all conversations sorted by `lastActivityAt` descending (most recent first).
-
-- Optional `?q=<prefix>` query param filters by conversation ID prefix (short-ID
- resolution — used by the CLI; the FE can ignore it or use it for search).
-- 200 response: `ConversationListResponse`
-
-```ts
-interface ConversationListResponse {
- readonly conversations: readonly ConversationMeta[];
-}
-
-interface ConversationMeta {
- readonly id: string;
- readonly createdAt: number; // epoch-ms
- readonly lastActivityAt: number; // epoch-ms
- readonly title: string;
-}
-```
-
-**FE use case:** render a conversation sidebar/picker showing title + relative time.
-Click a conversation to open it (load its history via the existing `GET /conversations/:id`).
-
-## `GET /conversations/:id/title` — read title
-
-- 200 response: `TitleResponse { conversationId, title }`
-
-## `PUT /conversations/:id/title` — set title
-
-- Body: `SetTitleRequest { title: string }` (non-empty after trim, else 400)
-- 200 response: `TitleResponse { conversationId, title }`
-
-**FE use case:** let the user rename a conversation. The title is also auto-set from
-the first user message, so a newly created conversation already has a title.
-
-## `GET /conversations/:id/last` — blocking last message
-
-Blocks server-side until any in-flight turn settles, then returns the last AI text
-message. Mainly for CLI use (`dispatch read <id>`), but the FE could use it for
-notifications or previews.
-
-- 200 response: `LastMessageResponse { conversationId, content, turnId? }`
-- `content` is empty string if the conversation has no assistant message.
-- Unknown conversation → `content: ""` (200, not an error).
-
-## `POST /conversations/:id/open` — signal frontend
-
-Calls the backend to broadcast `conversation.open` to all connected WS clients. See
-`frontend-conversation-open-handoff.md` for the WS message format and FE handling.
-
-- 200 response: `OpenConversationResponse { conversationId }`
-
-## What the FE needs to do
-
-1. **Bump pinned deps:** `@dispatch/wire` → `0.9.0`, `@dispatch/transport-contract`
- → `0.13.0`.
-
-2. **Conversation sidebar/picker:** call `GET /conversations` on load (and periodically
- or on focus) to show a list of conversations. Each entry shows `title` + relative
- time from `lastActivityAt`. Click to open → load history via `GET /conversations/:id`.
-
-3. **Title editing:** add an inline edit affordance on the conversation title.
- `PUT /conversations/:id/title` with `{ title }` to update.
-
-4. **Handle `conversation.open` WS message:** when a `"conversation.open"` message
- arrives, open (or focus) a tab for that `conversationId`. See
- `frontend-conversation-open-handoff.md`.
-
-## Notes
-
-- Conversations are **in-memory only** on the backend — the list resets on server
- restart. New conversations appear as users chat; old ones may disappear after a
- restart.
-- The title is auto-set from the first user message (truncated to 80 chars). Users can
- override it via `PUT /conversations/:id/title`.
-- `createdAt` is set on the first message append; `lastActivityAt` updates on every
- append.
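The sidebar use case in the conversation-list handoff above (title plus a relative time from `lastActivityAt`) might be sketched as below. The thresholds and label wording in `relativeTime` and the `SidebarEntry` shape are illustrative assumptions; only the `ConversationMeta` fields come from the handoff.

```typescript
// Fields as listed in the handoff's ConversationMeta.
interface ConversationMeta {
  readonly id: string;
  readonly createdAt: number; // epoch-ms
  readonly lastActivityAt: number; // epoch-ms
  readonly title: string;
}

// Hypothetical view model for one sidebar row.
interface SidebarEntry {
  readonly id: string;
  readonly title: string;
  readonly when: string;
}

// Hypothetical formatter; the buckets and wording are assumptions.
function relativeTime(thenMs: number, nowMs: number): string {
  const seconds = Math.max(0, Math.floor((nowMs - thenMs) / 1000));
  if (seconds < 60) return "just now";
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}

// The response is already sorted by lastActivityAt descending, so order is kept.
function sidebarEntries(
  conversations: readonly ConversationMeta[],
  nowMs: number,
): SidebarEntry[] {
  return conversations.map((c) => ({
    id: c.id,
    title: c.title,
    when: relativeTime(c.lastActivityAt, nowMs),
  }));
}
```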
diff --git a/frontend-conversation-open-handoff.md b/frontend-conversation-open-handoff.md
deleted file mode 100644
index c4064ea..0000000
--- a/frontend-conversation-open-handoff.md
+++ /dev/null
@@ -1,53 +0,0 @@
-# FE handoff — conversation.open (CLI --open flag)
-
-Courier this to `../frontend`. All changes are ADDITIVE.
-
-## What shipped (backend)
-
-The CLI's `--open` flag (`dispatch send <id> --text "..." --open`) calls
-`POST /conversations/:id/open`, which emits a `conversationOpened` bus event. The
-transport-ws extension subscribes and broadcasts a new WS message to ALL connected
-frontend clients.
-
-## New WS message (additive to `WsServerMessage`)
-
-```ts
-interface ConversationOpenMessage {
- readonly type: "conversation.open";
- readonly conversationId: string;
-}
-```
-
-The `type` is `"conversation.open"` — add it to the FE's `WsServerMessage` union
-handler. It arrives as a top-level WS message (not inside `chat.delta`).
-
-## What the FE needs to do
-
-1. **Handle the WS message** — when a `"conversation.open"` message arrives, open
- (or focus) a tab for `conversationId`. The backend just signals; the FE decides
- whether to actually open/focus or just notify.
-
-2. **Suggested behavior:**
- - If the conversation is already open in a tab, focus that tab.
- - If not, open a new tab for it (load its history via `GET /conversations/:id`).
- - Do NOT auto-focus if the user is actively typing in another tab (optional —
- your discretion).
-
-3. **No version bumps needed** — `@dispatch/transport-contract` already exports
- `ConversationOpenMessage` (additive to `WsServerMessage` since `0.13.0`). The FE
- just needs to add the `type: "conversation.open"` case to its WS message handler.
-
-## No other integration points
-
-- No new HTTP endpoints for the FE (the CLI calls `POST /conversations/:id/open`).
-- No new surface types.
-- No new `AgentEvent` types.
-- The message is a global broadcast (sent to ALL connected clients), not per-
- conversation.
-
-## Notes
-
-- The `--open` flag can be combined with `--queue` (enqueue + signal) or used
- without `--queue` (blocking send + signal).
-- Multiple clients (e.g. phone + laptop) all receive the broadcast — each decides
- independently whether to open/focus.
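The suggested `conversation.open` behavior in the handoff above (focus an existing tab, otherwise open a new one, and optionally skip the focus change while the user is typing) could be sketched as a pure state update. `TabState`, `handleConversationOpen`, and the `userIsTyping` flag are hypothetical; only the message shape comes from the handoff.

```typescript
// Message shape as given in the handoff.
interface ConversationOpenMessage {
  readonly type: "conversation.open";
  readonly conversationId: string;
}

// Hypothetical minimal tab state for this sketch.
interface TabState {
  readonly openIds: readonly string[];
  readonly focusedId: string | null;
}

// Hypothetical handler: returns the next state without mutating the input.
function handleConversationOpen(
  state: TabState,
  msg: ConversationOpenMessage,
  userIsTyping: boolean,
): TabState {
  const alreadyOpen = state.openIds.includes(msg.conversationId);
  // Open a new tab only if the conversation has none yet.
  const openIds = alreadyOpen ? state.openIds : [...state.openIds, msg.conversationId];
  // Optional courtesy from the handoff: do not steal focus from someone typing.
  const focusedId = userIsTyping ? state.focusedId : msg.conversationId;
  return { openIds, focusedId };
}
```

Loading the history for a newly opened tab via `GET /conversations/:id` would happen as a side effect outside this function.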
diff --git a/frontend-cr3-user-message-handoff.md b/frontend-cr3-user-message-handoff.md
deleted file mode 100644
index 0fb20e5..0000000
--- a/frontend-cr3-user-message-handoff.md
+++ /dev/null
@@ -1,54 +0,0 @@
-# FE handoff — CR-3 fixed: user prompt is now on the turn's event stream
-
-Courier to `../frontend`. This resolves CR-3 from `backend-handoff.md` ("a watcher can't see
-the turn's USER prompt until seal"). **Option B implemented + live-verified.** Your staged-but-inert
-consumption can now be turned on.
-
-## What shipped (backend)
-
-A new **additive** `AgentEvent` variant carries the user prompt INTO the turn's outward stream:
-
-```ts
-// @dispatch/wire — added to the AgentEvent union
-interface TurnInputEvent {
- type: "user-message";
- conversationId: string;
- turnId: string;
- text: string; // the raw prompt, exactly as sent
-}
-```
-
-`session-orchestrator` emits it via the broadcast/buffer path as the **FIRST event of every turn**
-(before `turn-start`), so it is replayed to every subscriber — live AND late-join — and arrives on
-the HTTP/NDJSON path too. Persistence is unchanged (the user message is still appended atomically at
-seal); this only adds a buffered/broadcast event. Metrics are unaffected (it is not usage).
-
-## Version bumps (re-pin both)
-
-- `@dispatch/wire` **`0.5.0 → 0.6.0`** (additive union member).
-- `@dispatch/transport-contract` **`0.7.0 → 0.8.0`** (re-exports `AgentEvent`/`chat.delta`, which now
- carries `user-message`; no other transport-contract change).
-
-Re-mirror `.dispatch/{wire,transport-contract}.reference.md` and add `user-message` to the FE
-exhaustiveness guard.
-
-## FE action
-
-Flip on the already-staged `core/chunks` branch that folds a `user-message` event into a provisional
-user chunk for watchers, with your text dedup against the sender's optimistic echo. After re-pin:
-- a **pure watcher** (second device / `chat.subscribe` only) now shows the user bubble the moment the
- turn starts, not at seal;
-- the **sender** is unchanged (its optimistic echo dedups against the replayed `user-message`);
-- a **late-joiner** gets `user-message` first in the replay, then the rest of the in-flight turn.
-
-## Live-verified (backend, vs flash)
-
-Two WS clients on one conversation; client B subscribed but never sent. On A's `chat.send`, B received
-`chat.delta { event:{ type:"user-message", text:"…", turnId, conversationId } }` as its **first** delta
-(index 0), **before** `turn-sealed`, with `text` equal to A's prompt, then the streaming reply. `RESULT: OK`.
-
-## Note
-
-The ordering guarantee is: `user-message` is the first event of the turn, immediately followed by
-`turn-start`, then the usual deltas → `done` → `turn-sealed`. Treat `user-message` as turn-scoped
-(it carries `turnId`) so a multi-turn transcript attributes each prompt to its turn.
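The fold-and-dedup behavior described in the CR-3 handoff above (a watcher gets a provisional user chunk, while the sender's optimistic echo is not duplicated) might look like the sketch below. The `UserChunk` shape and `foldUserMessage` are hypothetical; the actual `core/chunks` branch is not shown in the handoff, so this only illustrates the idea with exact-text matching as the dedup rule.

```typescript
// Event shape as given in the handoff.
interface TurnInputEvent {
  type: "user-message";
  conversationId: string;
  turnId: string;
  text: string;
}

// Hypothetical chunk model: an optimistic echo has no turnId yet.
interface UserChunk {
  readonly text: string;
  readonly turnId: string | null;
  readonly provisional: boolean;
}

// Hypothetical fold: returns the next chunk list without mutating the input.
function foldUserMessage(chunks: readonly UserChunk[], ev: TurnInputEvent): UserChunk[] {
  // Replay (for example a late-join buffer): this turn's prompt is already shown.
  if (chunks.some((c) => c.turnId === ev.turnId)) return [...chunks];
  // Sender: bind the optimistic echo to the turn instead of adding a second bubble.
  const echoIndex = chunks.findIndex((c) => c.turnId === null && c.text === ev.text);
  if (echoIndex !== -1) {
    return chunks.map((c, i) => (i === echoIndex ? { ...c, turnId: ev.turnId } : c));
  }
  // Pure watcher: show the prompt now; persistence still happens at seal.
  return [...chunks, { text: ev.text, turnId: ev.turnId, provisional: true }];
}
```

Keying replays on `turnId` follows the handoff's note that `user-message` is turn-scoped.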
diff --git a/frontend-cwd-resolution-handoff.md b/frontend-cwd-resolution-handoff.md
deleted file mode 100644
index 80d5105..0000000
--- a/frontend-cwd-resolution-handoff.md
+++ /dev/null
@@ -1,95 +0,0 @@
-# Backend handoff — cwd resolution fixes (backend → FE) — courier doc
-
-> **From:** arch-rewrite orchestrator · **To:** frontend orchestrator (b18a) · **Courier:** the user.
-> Response to the cwd bug report you sent to backend agent ab13. The fixes are DONE and
-> live-verified on the dev stack.
-
-## Version bumps
-
-| Package | From | To | Notes |
-|---|---|---|---|
-| `@dispatch/wire` | — | — | **Unchanged** |
-| `@dispatch/transport-contract` | — | — | **Unchanged** |
-| `@dispatch/ui-contract` | — | — | **Unchanged** |
-
-**This is a behavior-only change.** No wire/transport-contract types changed. No FE re-pin or
-re-mirror needed. The FE needs NO contract change to benefit.
-
----
-
-## 1. The fix (what was broken → what now works)
-
-You reported: a workspace `defaultCwd` set, a conversation with no explicit cwd, and `pwd` ran in
-the server default (`process.cwd()`) instead of the workspace `defaultCwd`. Plus your desired
-behavior: a per-conversation cwd **relative to the workspace `defaultCwd`** unless absolute.
-
-**Root cause (backend-only):** the workspace-relative resolution lived in
-`conversation-store.getEffectiveCwd`, which only resolved the *persisted* cwd. But the FE sends the
-CwdField value as a **per-turn `cwd` on `chat.send`**, and `session-orchestrator` used a per-turn
-`cwd` **as-is** — bypassing `getEffectiveCwd` entirely. So a relative `cwd` like `"arch-rewrite"`
-reached `run_shell` raw → resolved against `process.cwd()` → a nonexistent path → `pwd` broke.
-
-**Three backend fixes (all live-verified):**
-
-1. **Per-turn `cwd` is now resolved.** `session-orchestrator` passes the per-turn `cwd` (on
- `chat.send`/`POST /chat` AND on manual `POST /chat/warm`) through `getEffectiveCwd` as an
- override, so it goes through the same workspace-relative algorithm as the persisted cwd.
-2. **New-conversation timing.** A brand-new conversation's first turn previously ran
- `getEffectiveCwd` *before* the workspace was assigned (so it saw `"default"`, not the request's
- workspace). Now the workspace is assigned first. A relative per-turn `cwd` on the FIRST message
- of a new conversation now resolves against the intended workspace.
-3. **`DELETE /conversations/:id/cwd` was a stub** (returned `{cwd:null}` but did NOT clear the
- persisted key). It now calls `clearCwd` and truly deletes the persisted cwd.
-
-## 2. The resolution algorithm (now applied to BOTH persisted and per-turn cwd)
-
-```
-workspaceId = persisted conversation workspaceId ("default" fallback)
-workspaceCwd = workspace.defaultCwd ?? null
-conversationCwd = the explicit cwd (persisted via PUT /cwd, OR the per-turn chat.send cwd)
-
-if (conversationCwd == null) → workspaceCwd ?? serverDefaultCwd // process.cwd()
-else if (conversationCwd absolute) → conversationCwd // starts with "/"
-else → path.resolve(workspaceCwd ?? serverDefaultCwd, conversationCwd)
-```
-
-`serverDefaultCwd` = `process.cwd()` (the server's cwd).
-
-## 3. FE impact (minimal — no contract change)
-
-You do NOT need to change anything. Both FE patterns now work correctly:
-
-- **If you omit `cwd` on `chat.send`** (your current code): the backend resolves the persisted
- conversation cwd (set via `PUT /conversations/:id/cwd`) through the algorithm. ✅
-- **If you send a relative `cwd` on `chat.send`**: it is resolved against the workspace
- `defaultCwd`. ✅ (was broken — used raw)
-- **If you send an absolute `cwd`** (starts with `/`): overrides outright. ✅
-
-### Endpoints (semantics — shapes unchanged)
-
-- `GET /conversations/:id/cwd` → **unchanged**: the RAW explicit conversation cwd (`null` =
- inheriting workspace default). Your CwdField shows what the user typed.
-- `GET /conversations/:id/lsp` → returns the **effective** (resolved) cwd. It now roots LSP at the
- effective cwd INCLUDING the server-default fallthrough (when neither conversation nor workspace
- cwd is set, LSP roots at `process.cwd()`). Previously returned `cwd: null` + empty `servers` when
- no cwd was set.
-- `DELETE /conversations/:id/cwd` → **now actually clears** the persisted cwd (was a no-op stub).
- Returns `{ conversationId, cwd: null }` (unchanged shape). Use this to reset a conversation's cwd
- to "inherit workspace default".
-- `PUT /conversations/:id/cwd` → unchanged (persists the raw value).
-
-## 4. Optional FE simplification (not required)
-
-You MAY now safely **omit `cwd` on `chat.send`** entirely and rely on the backend resolving the
-persisted conversation cwd (set via `PUT /conversations/:id/cwd`). This was the design you
-described in the original report. Either path (send cwd, or omit it) is correct; the backend
-resolves both consistently. Sending it is harmless; omitting it avoids sending redundant data.
-
-## 5. Live-verified (dev stack, workspace `test` defaultCwd `/home/tradam/projects/dispatch`)
-
-- Existing conversation, per-turn `cwd:"arch-rewrite"` → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅
-- Brand-new conversation, per-turn `cwd:"arch-rewrite"` → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅
-- Chat omitting `cwd` (persisted cwd `arch-rewrite`) → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅
-- `PUT` cwd `/tmp/test` → `GET` returns `/tmp/test` → `DELETE` → `GET` returns `null` (actually cleared) ✅
-
-`tsc -b` EXIT 0, biome clean, 1311 vitest pass.
diff --git a/frontend-history-windowing-handoff.md b/frontend-history-windowing-handoff.md
deleted file mode 100644
index 6792c38..0000000
--- a/frontend-history-windowing-handoff.md
+++ /dev/null
@@ -1,70 +0,0 @@
-# Backend → frontend handoff — CR-5: history windowing (`limit` / `beforeSeq`)
-
-> **From:** arch-rewrite · **To:** frontend · **Courier:** the user.
-> Reply to `backend-handoff-chat-limit.md` (CR-5). 2026-06-12. SHIPPED.
-
-## What shipped
-
-`GET /conversations/:id` now takes two OPTIONAL query params on top of
-`sinceSeq` (all combinable; authoritative spec = the
-`ConversationHistoryResponse` JSDoc in `@dispatch/transport-contract`):
-
-1. **`limit=<k>`** — returns only the NEWEST `k` chunks of the selection,
- still ASCENDING by seq. A selection with ≤ `k` chunks is returned whole
- (so your `limit=192` against a short conversation gets exactly the normal
- full response). `limit` absent → exactly the previous behavior.
-2. **`beforeSeq=<s>`** — restricts the selection to `seq < s` (exclusive).
- Combined semantics: `sinceSeq < seq < beforeSeq`; with `limit`: the newest
- `k` chunks below `s`, ascending — your "Show earlier messages" page-in path.
-
-Your three flows, verbatim from your handoff, all work as written:
-
-- Fresh load: `?sinceSeq=0&limit=192`
-- Tail sync: `?sinceSeq=<maxCachedSeq>` (unchanged)
-- Page older in: `?beforeSeq=<oldestLoadedSeq>&limit=<ceil(L/4)>`
-
-## Ask #3 — our pick: the seq invariant, no new field (your "cheapest option")
-
-We CONFIRM IN WRITING, as a contractual guarantee (now codified in the
-`StoredChunk` doc in `@dispatch/wire` and referenced from the history-response
-doc): **per-conversation `seq` is 1-based, monotonic, and gap-free** — a
-conversation's first chunk is always `seq === 1` and numbering never skips.
-
-So derive it exactly as you proposed: `hasOlder = oldestLoaded.seq > 1`.
-There is deliberately NO `earliestSeq`/`hasOlder` response field.
-
-## Validation (new, only for the new params)
-
-`limit` and `beforeSeq` must be **positive integers** when present
-(`sinceSeq` keeps its existing semantics — `0` = from the start). Malformed,
-zero, or negative values → **HTTP 400 `{ error }`** (the error message names
-the offending param). Don't send `beforeSeq=0` — and you never need to:
-`oldestLoaded.seq === 1` already means there is nothing older.
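-
-A hedged client-side sketch of the three flows and the `hasOlder` derivation. The names
-`HistoryWindow`, `historyQuery`, and `hasOlder` are hypothetical; the client-side check
-only mirrors the positive-integer rule above and is not the server's validator.
-
-```ts
-// Hypothetical shape for the optional query params of GET /conversations/:id.
-interface HistoryWindow {
-  sinceSeq?: number;
-  beforeSeq?: number;
-  limit?: number;
-}
-
-// Mirrors the rule that `limit` and `beforeSeq` must be positive integers.
-function assertPositiveInt(name: string, value: number): void {
-  if (!Number.isInteger(value) || value < 1) {
-    throw new RangeError(`${name} must be a positive integer`);
-  }
-}
-
-// Builds the query string; `sinceSeq` keeps its existing semantics (0 = from the start).
-function historyQuery(w: HistoryWindow): string {
-  const params = new URLSearchParams();
-  if (w.sinceSeq !== undefined) params.set("sinceSeq", String(w.sinceSeq));
-  if (w.beforeSeq !== undefined) {
-    assertPositiveInt("beforeSeq", w.beforeSeq);
-    params.set("beforeSeq", String(w.beforeSeq));
-  }
-  if (w.limit !== undefined) {
-    assertPositiveInt("limit", w.limit);
-    params.set("limit", String(w.limit));
-  }
-  const qs = params.toString();
-  return qs === "" ? "" : `?${qs}`;
-}
-
-// seq is 1-based and gap-free, so anything above 1 means older chunks exist.
-function hasOlder(oldestLoadedSeq: number): boolean {
-  return oldestLoadedSeq > 1;
-}
-```
-
-Under these assumptions a fresh load is `historyQuery({ sinceSeq: 0, limit: 192 })` and a
-page-in is `historyQuery({ beforeSeq: oldestLoadedSeq, limit: Math.ceil(L / 4) })`.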
-
-## `latestSeq` caveat (important for your cursor logic)
-
-`latestSeq` semantics are UNCHANGED (seq of the last returned chunk; the
-requested `sinceSeq` when the slice is empty) — but on a **windowed** read it
-describes the returned window, NOT the conversation's high-water mark:
-
-- A fresh `?sinceSeq=0&limit=192` load DID reach the true tail → `latestSeq`
- is a valid sync cursor.
-- A `?beforeSeq=...` backfill page did NOT → do not regress your tail cursor
- from a backfill response. (Your seq-keyed dedup cache makes this natural —
- just don't feed backfill `latestSeq` into the tail cursor.)
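-
-One way to keep the tail cursor from regressing, sketched in TypeScript. The helper
-`nextTailCursor` and the `wasBackfill` flag are assumptions for illustration; the caller is
-expected to know whether the request carried `beforeSeq`.
-
-```ts
-// Minimal slice of the history response needed for cursor handling.
-interface HistoryResponseLike {
-  latestSeq: number;
-}
-
-// Returns the tail-sync cursor to use after a response arrives.
-function nextTailCursor(
-  current: number,
-  res: HistoryResponseLike,
-  wasBackfill: boolean,
-): number {
-  // A `beforeSeq` backfill page describes an older window: ignore its latestSeq.
-  if (wasBackfill) return current;
-  // Otherwise only ever move the cursor forward.
-  return Math.max(current, res.latestSeq);
-}
-```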
-
-## Versions (re-pin + re-mirror)
-
-- `@dispatch/transport-contract` **`0.9.0 → 0.10.0`** — the param/validation/
- caveat docs above (response TYPE shape unchanged; no new fields).
-- `@dispatch/wire` **`0.6.0 → 0.6.1`** — doc-only: the 1-based seq guarantee
- codified on `StoredChunk`.
-
-## Test coverage (backend, for your confidence)
-
-- conversation-store: +8 windowing tests (newest-N ascending, bounds,
- combined bounds, page-in, empty selection, garbage-in, no-window regression
- guard; the "gap-free 1-based seq" test now backs a written contract).
-- transport-http: +20 route/param tests incl. all five 400 validation cases
- and a no-params byte-identical regression guard.
-- Full suite: typecheck clean · biome clean · 935 vitest + 112 bun tests green.
diff --git a/frontend-lsp-cwd-handoff.md b/frontend-lsp-cwd-handoff.md
deleted file mode 100644
index e7c8417..0000000
--- a/frontend-lsp-cwd-handoff.md
+++ /dev/null
@@ -1,133 +0,0 @@
-# Frontend handoff — LSP status + per-conversation CWD
-
-> Backend milestone complete (this repo). The web frontend is a SEPARATE repo
-> (`../frontend`); this document is couriered to it by the user (ORCHESTRATOR
-> §7 — `lsp references` does not span repos). All types below are exported from
-> `@dispatch/transport-contract` (bumped to **0.5.0**).
-
-## TL;DR for the FE
-Two new capabilities are now on the backend:
-1. **Per-conversation working directory (cwd)** — get/set per tab (a tab = a
- `conversationId`). Persisted server-side; defaults a turn's cwd when `/chat`
- omits one.
-2. **Per-conversation LSP status** — which language servers are configured for a
- tab's cwd and whether each is connected.
-
-## Endpoints
-
-### `GET /conversations/:id/cwd` → `CwdResponse`
-```ts
-interface CwdResponse { conversationId: string; cwd: string | null }
-```
-`cwd` is `null` until set.
-
-### `PUT /conversations/:id/cwd` (body `SetCwdRequest`) → `CwdResponse`
-```ts
-interface SetCwdRequest { cwd: string }
-```
-- `200` with the new `CwdResponse` on success.
-- `400` `{ error }` if `cwd` is missing/empty.
-- Content-Type `application/json`. CORS now allows `PUT`.
-
-### `GET /conversations/:id/lsp` → `LspStatusResponse`
-```ts
-type LspServerState = "connected" | "starting" | "error" | "not-started";
-interface LspServerInfo {
- id: string; // "typescript", "luau-lsp"
- name: string; // display name
- root: string; // absolute workspace root the server is rooted at
- extensions: string[]; // e.g. [".ts",".tsx"] or [".luau"]
- state: LspServerState;
- error?: string; // present only when state === "error"
-}
-interface LspStatusResponse {
- conversationId: string;
- cwd: string | null; // the tab's persisted cwd
- servers: LspServerInfo[]; // [] when cwd is null
-}
-```
-
-## Behavior notes (important for UX)
-- **`GET /conversations/:id/lsp` lazily connects.** The first call for a cwd
- resolves the configured servers and **spawns + initializes** them, so it can take
- a moment (typically <1s; a cold luau-lsp loading Roblox types can take longer) and
- returns once each server reaches `connected`/`error`. Subsequent calls are fast
- (cached). Suggested UX: call it when a tab opens / cwd changes, show a spinner per
- server until `state` settles, then a connected/error badge.
-- **`servers` is empty when `cwd` is null** — prompt the user to set a cwd first.
-- **States:** `connected` = ready; `error` = failed to start (`error` has the
- reason, e.g. binary not found); `not-started`/`starting` = transient.
-- **cwd defaulting:** if a `/chat` (or `/chat/warm`) request omits `cwd`, the
- backend now uses the conversation's persisted cwd. If a request DOES send `cwd`,
- that value is used AND persisted (so the CLI `--cwd` keeps the stored value
- fresh). The FE's PUT and the chat `cwd` field write the same per-conversation
- store.
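-
-The suggested spinner/badge UX could be derived from `state` roughly as follows. This is a
-sketch only: the `Badge` type and `badgeFor` helper are hypothetical FE names, and the
-`LspServerInfo` type is re-declared locally from the shapes above.
-
-```ts
-type LspServerState = "connected" | "starting" | "error" | "not-started";
-
-interface LspServerInfo {
-  id: string;
-  name: string;
-  root: string;
-  extensions: string[];
-  state: LspServerState;
-  error?: string;
-}
-
-// Hypothetical view model for the per-server indicator.
-type Badge =
-  | { kind: "spinner" }
-  | { kind: "connected" }
-  | { kind: "error"; reason: string };
-
-function badgeFor(server: LspServerInfo): Badge {
-  switch (server.state) {
-    case "connected":
-      return { kind: "connected" };
-    case "error":
-      // Show the backend's error text directly so users can diagnose it.
-      return { kind: "error", reason: server.error ?? "unknown error" };
-    case "starting":
-    case "not-started":
-      // Transient states: keep spinning until the state settles.
-      return { kind: "spinner" };
-  }
-}
-```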
-
-## How servers are configured (so you can explain it to users)
-Per the tab's cwd, the backend resolves language servers from, in order:
-1. `<cwd>/.dispatch/lsp.json` (`{ servers: { <id>: { command, extensions,
- rootMarkers?, env?, initialization?, watch? } } }`)
-2. fallback `<cwd>/opencode.json` `lsp` key (opencode-compatible)
-3. a built-in `typescript` server (so a TS project works with zero config).
-No FE work needed for this — just display `LspStatusResponse`.
-
-## Operational note (surface to users on `state:"error"`)
-Language-server binaries must be on the **backend process's PATH**. A binary in a
-non-standard location (e.g. `~/.local/bin/typescript-language-server`) won't be
-found if the server daemon's PATH lacks that dir, yielding
-`state:"error", error:"ENOENT ... posix_spawn '<bin>'"`. luau-lsp
-(`/usr/local/bin`) and standard-PATH binaries work out of the box. Consider showing
-the `error` text directly so users can diagnose a missing/unfound binary.
-
-## Verified live
-- Roblox project (`luau-lsp`) → `connected` through the full HTTP path
- (`GET /conversations/:id/lsp`), using the project's existing `opencode.json` +
- an auto-spawned `rojo sourcemap --watch` sidecar.
-- This repo (`typescript`) → `connected`.
-- cwd PUT/GET round-trip → `200` + correct value.
-
-## Not in this slice (potential future FE asks)
-- A live WS surface for LSP status (currently HTTP-poll on tab open / cwd change).
-- An LSP-diagnostics stream pushed into the chat (the agent can pull diagnostics
- via the `lsp` tool today; auto-inject-on-write was deliberately deferred).
-
----
-
-## CONFIRMED — answers to `backend-handoff-cwd-lsp.md` (your 6 asks)
-
-> Re your courier doc. All six hold in the current backend. Code refs are
-> `packages/transport-http/src/app.ts` and `packages/session-orchestrator/src/
-> orchestrator.ts`. None require a backend change. **The draft → first-message cwd
-> path you built is fully supported.**
-
-| # | Your ask | Confirmed | Where |
-|---|----------|-----------|-------|
-| 1 | Unseen id: `GET /cwd` ⇒ `200 {cwd:null}`; `GET /lsp` ⇒ `200 {cwd:null,servers:[]}` (no 404/500) | ✅ | `getCwd` returns `null` for any id; `/lsp` early-returns `{cwd:null,servers:[]}` before touching the LSP — `app.ts:322-333, 364-372` |
-| 2 | `PUT /cwd` on an unseen/draft id persists (no prior turn/row) | ✅ | `setCwd` is a plain per-id upsert (key `conv:<id>:cwd`) — `app.ts:335-362` |
-| 3 | Draft cwd carries into turn 1 (`PUT D/cwd`, then `chat.send` D with no `cwd`) | ✅ | orchestrator uses the persisted cwd when the request omits it; same store key the PUT writes — `orchestrator.ts:122-125`. Unit-tested ("uses the persisted cwd when the request omits cwd") |
-| 4 | CORS **preflight** (`OPTIONS` + `Access-Control-Request-Method: PUT`) is answered | ✅ | global Hono `cors`, `allowMethods:["GET","POST","PUT","OPTIONS"]` applied to all routes — `app.ts:112-114`; preflight test passes |
-| 5 | No spawn when `cwd` is null | ✅ | `/lsp` returns `servers:[]` before calling the LSP service when `cwd===null` — `app.ts:367-372` |
-| 6 | Error body is `{ error: string }` | ✅ | every error path returns `{error}` (e.g. empty-cwd PUT ⇒ `400 {error:"Field 'cwd' is required and must be a non-empty string"}`) — `app.ts:342,346,350,360,376,400` |
-
-### Setting the cwd on the first message — two supported flows
-- **(a) Pre-set, then send (your flow):** `PUT /conversations/D/cwd {cwd}` on the
- client-minted draft id → then `POST /chat {conversationId:D}` **without** a `cwd`
- field → the turn loads and runs in the persisted `D` cwd.
-- **(b) cwd on the first `/chat`:** include `cwd` in the first `POST /chat` → it is
- used for that turn **and** persisted for subsequent turns.
-Both write/read the same per-conversation store, so they're interchangeable; a draft
-that has never sent a message works because the cwd store is independent of history.
-
-### One edge to be aware of (FE currently safe)
-`PUT /cwd` rejects an empty-string `cwd` (`400`), but the **`/chat` `cwd` field**
-does not — the orchestrator treats any non-`undefined` `cwd` as "provided", so a
-literal `cwd:""` on `/chat` would override the persisted cwd with empty. Your FE
-omits the field (sends `undefined`) on cwd-less sends, so this never triggers. **Keep
-omitting the field (don't send `cwd:""` / `cwd:null`)** when you want the persisted
-draft cwd to apply. (If preferred, the backend can harden this to treat empty/blank
-as "not provided" — say the word.)
-
-### Live-verified
-Unseen-id `GET /cwd` ⇒ `{cwd:null}`, `GET /lsp` ⇒ `{cwd:null,servers:[]}`,
-`PUT` round-trip `200`, and the empty-cwd `400 {error}` shape were all observed live;
-Roblox `luau-lsp` and this repo's `typescript` both reach `state:"connected"`.
diff --git a/frontend-lsp-cwd-workspace-handoff.md b/frontend-lsp-cwd-workspace-handoff.md
deleted file mode 100644
index d17f0a5..0000000
--- a/frontend-lsp-cwd-workspace-handoff.md
+++ /dev/null
@@ -1,75 +0,0 @@
-# FE Courier Handoff: LSP cwd resolution fix + PUT cwd workspaceId
-
-> Backend→FE courier. The user couriers this to `../frontend` (FE agent `ffe3`).
-> Neither `@dispatch/wire` nor `@dispatch/transport-contract` gets a breaking version
-> bump: `SetCwdRequest.workspaceId` is additive (optional), and only the semantics of
-> `LspStatusResponse.cwd` changed (it was always a non-null effective cwd; it is now
-> `null` when no cwd is set).
-
-## What changed (backend)
-
-### 1. `GET /conversations/:id/lsp` — behavior change
-
-**Before:** The endpoint called `getEffectiveCwd(conversationId)` directly. When no
-cwd was persisted, this fell through to the server default (`process.cwd()`), so the
-LSP was rooted at the wrong directory (the server's cwd, not the conversation's
-workspace).
-
-**After:** The endpoint now gates on the **persisted** cwd (`getCwd`) first:
-- When no cwd is persisted → response is `{ cwd: null, servers: [] }` (HTTP 200, no
- LSP connection). The LSP does NOT connect when no working directory is set.
-- When a cwd IS persisted → the endpoint resolves the **effective** cwd (relative cwd
- resolved against the workspace `defaultCwd`; absolute → as-is) and returns
- `{ cwd: "<effectiveCwd>", servers: [...] }`.
-
-**FE impact:**
-- `LspStatusResponse.cwd` can now be `null` (previously it was always a string, even
- when no cwd was set — it returned `process.cwd()`). The FE should handle `null` by
- showing "no LSP connected" or "set a working directory."
-- When `cwd` is non-null, it is the RESOLVED (effective) cwd — an absolute path. The FE
- can display this as the directory the LSP is connected on.
-
-### 2. `PUT /conversations/:id/cwd` — new optional `workspaceId` field
-
-**Before:** The `PUT /conversations/:id/cwd` body was `{ cwd: string }` — only set
-the persisted cwd, no workspace assignment.
-
-**After:** The body now accepts an optional `workspaceId`:
-```json
-{ "cwd": "/home/user/project", "workspaceId": "my-team" }
-```
-
-When `workspaceId` is provided:
-1. The conversation is assigned to that workspace (via `ensureWorkspace` +
- `setWorkspaceId`) BEFORE the cwd is persisted.
-2. This ensures a subsequent `GET /conversations/:id/lsp` resolves a relative cwd
- against the workspace's `defaultCwd` (not the server default).
-3. Invalid `workspaceId` (not a valid slug: lowercase `[a-z0-9-]`, 1–40 chars) →
- HTTP 400 `{ error: "Invalid workspaceId" }`.
-
-When `workspaceId` is absent → behavior is unchanged (just `setCwd`).
-
-**FE action:** When the user sets the working directory on a new chat tab, send the
-`workspaceId` alongside the `cwd` in the `PUT /conversations/:id/cwd` request. This
-ensures the LSP resolves correctly even before the first turn.
-
-### Example flow (new chat tab)
-
-1. User opens a new chat tab (selects workspace "my-team" with
- `defaultCwd: "/home/tradam/projects/dispatch"`)
-2. User sets working dir to `"arch-rewrite"` (relative)
-3. FE sends: `PUT /conversations/abc/cwd { "cwd": "arch-rewrite", "workspaceId": "my-team" }`
-4. Backend: assigns conversation to workspace "my-team", then persists cwd "arch-rewrite"
-5. FE calls: `GET /conversations/abc/lsp`
-6. Backend: `getCwd("abc")` → `"arch-rewrite"` (non-null) → `getEffectiveCwd("abc")` →
- resolves "arch-rewrite" against workspace "my-team"'s `defaultCwd`
- (`"/home/tradam/projects/dispatch"`) → `"/home/tradam/projects/dispatch/arch-rewrite"`
-7. Response: `{ cwd: "/home/tradam/projects/dispatch/arch-rewrite", servers: [...] }`
-
-Without the `workspaceId` on the PUT (step 3), the conversation would be in the
-`"default"` workspace (defaultCwd: null), and the relative cwd "arch-rewrite" would
-resolve against `process.cwd()` — the wrong directory.
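-
-A small sketch of building the `PUT /conversations/:id/cwd` body on the FE. The helper
-names are hypothetical, and the regex is only an assumption derived from the slug rule
-stated above (lowercase `[a-z0-9-]`, 1–40 chars); the backend remains the authority.
-
-```ts
-interface SetCwdRequest {
-  cwd: string;
-  workspaceId?: string;
-}
-
-// Assumed client-side mirror of the slug rule; the server may check more.
-const WORKSPACE_SLUG = /^[a-z0-9-]{1,40}$/;
-
-function isValidWorkspaceId(id: string): boolean {
-  return WORKSPACE_SLUG.test(id);
-}
-
-// Builds the PUT body, omitting `workspaceId` entirely when it is not provided.
-function buildSetCwdRequest(cwd: string, workspaceId?: string): SetCwdRequest {
-  if (workspaceId === undefined) return { cwd };
-  if (!isValidWorkspaceId(workspaceId)) {
-    throw new RangeError(`Invalid workspaceId: ${workspaceId}`);
-  }
-  return { cwd, workspaceId };
-}
-```
-
-For step 3 of the flow above, `buildSetCwdRequest("arch-rewrite", "my-team")` yields the
-body `{ "cwd": "arch-rewrite", "workspaceId": "my-team" }`.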
-
-## Contract version
-
-`@dispatch/transport-contract` bumped to `0.17.0` (additive: `SetCwdRequest.workspaceId`
-is optional; `LspStatusResponse.cwd` comment updated — no field type change).
diff --git a/frontend-mcp-status-handoff.md b/frontend-mcp-status-handoff.md
deleted file mode 100644
index d19a9f1..0000000
--- a/frontend-mcp-status-handoff.md
+++ /dev/null
@@ -1,117 +0,0 @@
-# Handoff — MCP Status Endpoint (backend + frontend)
-
-## Backend: `GET /conversations/:id/mcp` (transport-http)
-
-Mirror the existing `GET /conversations/:id/lsp` endpoint exactly. The contract
-types are already in `@dispatch/transport-contract` 0.22.0:
-
-```typescript
-export type McpServerState = "connecting" | "connected" | "error" | "disconnected";
-
-export interface McpServerInfo {
- readonly id: string;
- readonly state: McpServerState;
- readonly error?: string;
- readonly toolCount: number;
- readonly configSource?: string;
-}
-
-export interface McpStatusResponse {
- readonly conversationId: string;
- readonly cwd: string | null;
- readonly servers: readonly McpServerInfo[];
-}
-```
-
-### What to change in `packages/transport-http/`
-
-1. **`src/seam.ts`** — add re-exports from `@dispatch/mcp`:
- ```typescript
- export type { McpServerStatus, McpService } from "@dispatch/mcp";
- export { mcpServiceHandle } from "@dispatch/mcp";
- ```
-
-2. **`src/app.ts`** — add `mcpService?` to `CreateServerOptions` (optional, same
- as `lspService?`), then add the route:
- ```typescript
- app.get("/conversations/:id/mcp", async (c) => {
- // Mirror the LSP route exactly:
- // 1. Gate on persisted cwd (getCwd) — return {cwd:null, servers:[]} when null
- // 2. Resolve effective cwd (getEffectiveCwd) — return {cwd:null, servers:[]} when null
- // 3. If opts.mcpService === undefined → 503 { error: "MCP service not available" }
- // 4. Call opts.mcpService.status(effectiveCwd) → McpServerStatus[]
- // 5. Map McpServerStatus → McpServerInfo (id, state, error?, toolCount, configSource?)
- // 6. Return McpStatusResponse { conversationId, cwd: effectiveCwd, servers }
- });
- ```
-
-3. **`src/extension.ts`** — add `host.getService(mcpServiceHandle)` alongside
- `lspService`, and pass `mcpService` to `createApp({...})`.
-
-4. **`package.json`** — add `"@dispatch/mcp": "workspace:*"` to dependencies.
-
-5. **Tests** — mirror the LSP status tests:
- - Returns null+empty when no persisted cwd — `mcpService.status` NOT called.
- - Returns servers when cwd is set.
- - Returns 503 when `mcpService` is undefined.
- - Maps `McpServerStatus` → `McpServerInfo` correctly (error omitted when
- undefined, configSource omitted when undefined — honor `exactOptionalPropertyTypes`).
-
-### McpServerStatus → McpServerInfo mapping
-
-The `McpService.status(cwd)` returns `McpServerStatus[]` from `@dispatch/mcp`:
-```typescript
-interface McpServerStatus {
- readonly id: string;
- readonly state: "connecting" | "connected" | "error" | "disconnected";
- readonly error?: string;
- readonly toolCount: number;
-}
-```
-Map to `McpServerInfo` (same fields, conditionally include `error` per
-`exactOptionalPropertyTypes`). Note: `McpServerStatus` does NOT have
-`configSource` — that field is on `ResolvedMcpServer` (from config resolution).
-If you want to include `configSource` in the status response, the `McpService`
-interface or `McpServerStatus` would need to be extended. For Phase 2, omit
-`configSource` (it's optional on `McpServerInfo`) unless the MCP extension is
-updated to include it in the status.
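-
-A possible shape for the mapping, as a sketch. The types are re-declared locally from the
-snippets above, `toMcpServerInfo` is a hypothetical name, and `configSource` is left out as
-the note above suggests for Phase 2.
-
-```typescript
-type McpServerState = "connecting" | "connected" | "error" | "disconnected";
-
-interface McpServerStatus {
-  readonly id: string;
-  readonly state: McpServerState;
-  readonly error?: string;
-  readonly toolCount: number;
-}
-
-interface McpServerInfo {
-  readonly id: string;
-  readonly state: McpServerState;
-  readonly error?: string;
-  readonly toolCount: number;
-  readonly configSource?: string;
-}
-
-function toMcpServerInfo(status: McpServerStatus): McpServerInfo {
-  return {
-    id: status.id,
-    state: status.state,
-    toolCount: status.toolCount,
-    // Only add the key when a value exists: under exactOptionalPropertyTypes,
-    // `error: undefined` is not assignable to an optional `error?: string`.
-    ...(status.error !== undefined ? { error: status.error } : {}),
-  };
-}
-```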
-
----
-
-## Frontend (dispatch-web): consume `GET /conversations/:id/mcp`
-
-### What to do
-
-1. **Re-pin** `@dispatch/transport-contract` to `0.22.0`.
-2. **Re-mirror** the reference snapshot if one exists.
-3. **Add a fetch** for `GET /conversations/:id/mcp` — mirror how `GET /conversations/:id/lsp`
- is fetched and displayed.
-4. **Render** the MCP server status: each server's `id`, `state` (with the same
- visual treatment as LSP for connected/error; MCP's `connecting` plays the role
- of LSP's `starting`), `toolCount`, and optional `error`.
-5. Place the MCP status UI alongside (or below) the LSP status in the conversation
- settings/panel — they're sibling features.
-
-### Response shape
-
-```json
-{
- "conversationId": "abc-123",
- "cwd": "/home/user/project",
- "servers": [
- {
- "id": "freecad",
- "state": "connected",
- "toolCount": 12
- },
- {
- "id": "chrome-devtools",
- "state": "error",
- "error": "Executable not found in $PATH: npx",
- "toolCount": 0
- }
- ]
-}
-```
-
-When no cwd is set: `{ "conversationId": "abc-123", "cwd": null, "servers": [] }`.
diff --git a/frontend-message-queue-handoff.md b/frontend-message-queue-handoff.md
deleted file mode 100644
index b9c2a6d..0000000
--- a/frontend-message-queue-handoff.md
+++ /dev/null
@@ -1,189 +0,0 @@
-# FE handoff — message queue + steering injection
-
-Courier this to `../frontend` (cross-repo contract change; `lsp references` does
-not span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks.
-
-## What shipped (backend)
-
-A per-conversation **message queue** + **steering** feature. While a turn is
-GENERATING, a client can enqueue a user message onto the conversation's queue;
-it is delivered mid-turn as **steering** — injected at the next tool-result
-boundary so the model sees it alongside the tool results and can adjust course.
-If the turn ends with a non-empty queue (no tool call fired), the queue is
-carried into a NEW turn as its opening prompt.
-
-- **`message queue`** — the per-conversation buffer (owned by a new
- `@dispatch/message-queue` extension). Transient + in-memory; the queue is
- NOT on the chat stream — it is exposed to the frontend as a per-conversation
- SURFACE (see below).
-- **`steering`** — a user message injected into an in-flight turn at the
- tool-result boundary (drawn from the queue). Emitted on the chat stream as a
- new `steering` `AgentEvent` so it appears in the transcript live.
-
-Versions: `@dispatch/wire` `0.7.0 → 0.8.0`, `@dispatch/transport-contract`
-`0.11.0 → 0.12.0`. Bump the pinned `file:` deps. (`@dispatch/ui-contract` is
-unchanged — the queue uses the existing `custom` surface field kind.)
-
-## Wire types (in `@dispatch/wire`, re-exported by `@dispatch/transport-contract`)
-
-```ts
-/** A message held in the conversation's queue, awaiting steering delivery. */
-interface QueuedMessage {
- readonly id: string; // stable, client-visible (UI key + dedup)
- readonly text: string;
- readonly queuedAt: number; // epoch-ms
-}
-
-/** Payload of the message-queue surface's `custom` field (see below). */
-interface QueuePayload {
- readonly messages: readonly QueuedMessage[];
-}
-
-/** New `AgentEvent` variant (additive to the union). */
-interface TurnSteeringEvent {
- readonly type: "steering";
- readonly conversationId: string;
- readonly turnId: string;
- readonly text: string; // the combined text of all drained messages
-}
-```
-
-## How the frontend reads queue STATE: a surface (NOT the chat stream)
-
-The queue is control/state, so it rides the **surface** channel (like
-cache-warming), not the chat event stream. The `message-queue` extension
-contributes a per-conversation surface:
-
-- **Surface id:** `"message-queue"`; **scope:** `"conversation"` (subscribe with
- the `conversationId`).
-- **One `custom` field**, `rendererId: "message-queue"`, `payload: QueuePayload`
- (`{ messages: QueuedMessage[] }` — the current queue snapshot).
-- The surface updates (full new spec) on every change: enqueue (queue grew) and
- drain (queue emptied). An idle conversation's queue is empty → the field's
- `messages` is `[]`.
-
-So: **subscribe** to the `message-queue` surface per conversation and render
-the queue list from `payload.messages`. You need a bespoke renderer for
-`rendererId: "message-queue"` (the `custom` escape hatch — see the loaded-
-extensions `table` renderer precedent). The surface is **read-only** (no
-`invoke` actions); enqueuing is a chat op (below).
-
-## How the frontend ENQUEUES: the `chat.queue` WS op
-
-```ts
-interface ChatQueueMessage {
- readonly type: "chat.queue";
- readonly conversationId: string;
- readonly text: string;
-}
-```
-(additive to `WsClientMessage`.)
-
-- **Fire-and-forget.** On success the server emits NOTHING back — the
- `message-queue` SURFACE updates (the new message appears in the snapshot).
- On failure (empty/missing `text`, unknown conversation) the server replies
- `chat.error` (`{ type: "chat.error"; conversationId?; message }`).
-- **`text` must be non-empty** after trim (otherwise the server replies `chat.error`;
- the HTTP path returns 400).
-- **Auto-start when idle (server-owned decision):** if NO turn is active for the
- conversation, `chat.queue` does NOT queue — it STARTS A NEW TURN with the
- message as its opening prompt (equivalent to `chat.send`). The sender is
- auto-subscribed and the turn's events stream as `chat.delta`s (the opening
- `user-message` carries the text). So a single `chat.queue` op works for both
- "steer during generation" and "send" — you don't need to pick. When a turn IS
- active, the message is appended to the queue (surface updates) and delivered
- at the next tool-result boundary.
-
-## How the frontend shows steering in the TRANSCRIPT: the `steering` event
-
-When the kernel drains a non-empty queue at a tool-result boundary, the
-session-orchestrator emits a **`steering`** `AgentEvent` on the chat stream
-(arrives inside a `chat.delta` `{ event }`, like every other `AgentEvent`):
-
-```ts
-{ type: "chat.delta", event: { type: "steering", conversationId, turnId, text } }
-```
-
-- Render `text` as a **user bubble in the transcript**, positioned after the
- tool-call/tool-result it followed (it is a user message the model saw mid-turn,
- alongside the tool results). One `steering` event per drain; `text` is the
- combined text of all messages drained at that boundary (joined by a blank
- line).
-- **Move, don't duplicate:** the drained messages were already shown in the
- queue surface; when the surface then updates to empty (the drain cleared the
- queue), they should leave the queue UI (they now live in the transcript as the
- `steering` bubble). A simple rule: on `steering`, append the bubble to the
- transcript; the surface's subsequent empty snapshot clears the queue UI.
-- **Late-join safe:** like `user-message`, `steering` is buffered into the
- in-flight turn's event buffer, so a client that subscribes mid-turn (or a
- second device) sees it before seal (mirrors the CR-3 `user-message` fix).
- (Carry-to-new-turn, below, does NOT emit `steering` — the new turn's
- `user-message` covers it.)
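-
-As a sketch of the transcript side, a `steering` event could be folded into the FE's
-transcript state like this. `Bubble` and `applySteering` are hypothetical FE names; only
-the `TurnSteeringEvent` shape comes from the wire types above.
-
-```ts
-interface TurnSteeringEvent {
-  readonly type: "steering";
-  readonly conversationId: string;
-  readonly turnId: string;
-  readonly text: string;
-}
-
-// Hypothetical transcript entry; `steering` marks a mid-turn user message.
-interface Bubble {
-  readonly role: "user" | "assistant" | "tool";
-  readonly text: string;
-  readonly steering?: boolean;
-}
-
-// Appends the steering text as a user bubble after whatever is already rendered
-// (the tool-call/tool-result it followed). The queue UI is cleared separately,
-// by the surface's subsequent empty snapshot.
-function applySteering(
-  transcript: readonly Bubble[],
-  event: TurnSteeringEvent,
-): Bubble[] {
-  return [...transcript, { role: "user", text: event.text, steering: true }];
-}
-```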
-
-## Carry to a new turn (no `steering` event)
-
-If a turn ENDS with a non-empty queue (the model finished without making a tool
-call, so no tool-result boundary was hit), the orchestrator drains the queue,
-combines the messages, and **starts a NEW turn** whose opening prompt is the
-combined text. You will see: the old turn's `done` + `turn-sealed`, then a new
-`turn-start` + `user-message` carrying the combined text (rendered as the new
-turn's normal user bubble). The queue surface also clears (empty snapshot). No
-`steering` event in this case — handle the carried text as an ordinary new-turn
-user message.
-
-## HTTP path (for the CLI / non-WS clients; the FE uses the WS op above)
-
-`POST /conversations/:id/queue` with body `QueueRequest { text }` → `QueueResponse`:
-
-```ts
-interface QueueResponse {
- readonly conversationId: string;
- readonly startedTurn: boolean; // true = was idle, a new turn started
- readonly queue: readonly QueuedMessage[]; // snapshot after the enqueue
-}
-```
-- Empty/whitespace `text` → HTTP 400 `{ error }`.
-- `startedTurn: true` means no turn was active and the enqueue started one (the
- message is the turn's opening prompt, NOT a queued steering message).
-- `startedTurn: false` means a turn was active and the message was queued (the
- `queue` snapshot includes it).
-
-## What we need the FE to do
-
-1. **Bump pinned deps:** `@dispatch/wire` → `0.8.0`, `@dispatch/transport-contract`
- → `0.12.0`.
-2. **Queue UI (per conversation):** subscribe to the `message-queue` surface
- (scope `conversation`) and render `payload.messages` (`QueuedMessage[]`) with a
- `rendererId: "message-queue"` custom renderer — a list of pending messages
- with their text (and maybe `queuedAt` as a timestamp). Empty `messages` =
- nothing to show (hide the panel).
-3. **Enqueue affordance:** while a turn is generating, show an input that sends
- `chat.queue { conversationId, text }` (NOT `chat.send` — `chat.queue` is the
- steering entry; it auto-starts a turn if idle, so it's safe to offer it
- whenever the user wants to add input). Trim/validate non-empty client-side
- too; expect a `chat.error` on failure.
-4. **Steering bubble:** handle the new `steering` `AgentEvent` (type `"steering"`)
- on the `chat.delta` stream → render `event.text` as a user bubble in the
- transcript after the tool calls; clear the queue UI when the surface updates
- to empty.
-5. **Carry:** no special handling — a carried queue surfaces as a normal new
- turn (`turn-start` + `user-message`); just let the existing new-turn flow
- render it. The queue surface clears automatically.
-
-## Notes / known gaps
-
-- **Live end-to-end (a real steering turn via a tool-calling model) is not yet
- exercised** — the logic is unit/integration tested and the app boots clean with
- the `message-queue` extension registered, but a live `chat.queue` → tool-call
- → `steering` event flow against a real model has not been run. Worth a live
- smoke once the FE wires it (or ask the backend to run one).
-- **Close-with-queued-messages (open product question):** if a client sends
- `POST /conversations/:id/close` (explicit tab close) while the queue is
- non-empty, the in-flight turn aborts and the carry currently STILL fires
- (starting a new turn on the closed conversation). This may or may not be
- desired (does closing discard pending steering, or honor it?). The backend is
- flagging this for a decision; if "discard on close" is wanted, the backend will
- gate the carry on `finishReason !== "aborted"`. No FE action either way; just be
- aware a closed conversation might briefly start a turn from a queued message.
-- **`steering` is additive** to the `AgentEvent` union — no exhaustive switches
- broke on the backend (verified: `tsc -b` EXIT 0). If the FE has an exhaustive
- switch on `AgentEvent`, add a `steering` case.
diff --git a/frontend-metrics-handoff.md b/frontend-metrics-handoff.md
deleted file mode 100644
index be033d8..0000000
--- a/frontend-metrics-handoff.md
+++ /dev/null
@@ -1,121 +0,0 @@
-# Frontend handoff — live turn metrics (tokens + timing)
-
-> From: arch-rewrite (backend) orchestrator · For: the frontend team.
-> Status: **LIVE on the stream now** (backend committed + live-verified). Consume via the pinned
-> contracts `@dispatch/wire` + `@dispatch/transport-contract` (reference snapshots
-> regenerated in `dispatch-web/.dispatch/{wire,transport-contract}.reference.md`).
-
-## 1. What you can now access
-The backend's **authoritative** token + timing metrics are now on the live turn stream:
-
-| Metric | Where | Field(s) |
-|---|---|---|
-| Per-step tokens | `usage` event | `usage` (`inputTokens`/`outputTokens`/`cacheReadTokens?`/`cacheWriteTokens?`) + new `stepId?` |
-| Per-step **TTFT** | new `step-complete` event | `ttftMs?` |
-| Per-step **decode** time | new `step-complete` event | `decodeMs?` |
-| Per-step total generation | new `step-complete` event | `genTotalMs?` |
-| **Tool execution** time | `tool-result` event | `durationMs?` |
-| **Turn** wall-clock | `done` event | `durationMs?` |
-| **Turn** total tokens | `done` event | `usage?` |
-| **Tokens/sec** (TPS) | derive | `usage.outputTokens / (step-complete.decodeMs / 1000)` |
-| Context-size proxy | `usage` event | `usage.inputTokens` (size the model counted; `cacheReadTokens` = cached portion) |
-
-"Authoritative" = measured by the backend runtime, not client wall-clock. They differ from
-anything you'd time in the browser, because they exclude network and buffering delay.
-
-## 2. How they're delivered
-**Inline, in the same chat stream you already consume** — WS `chat.delta` frames (and the
-`POST /chat` NDJSON stream) carry the `AgentEvent` union; metrics are additional event types /
-fields in that union. **No new endpoint, no subscription/negotiation.** You already `switch` on
-`event.type`; route the metric events to a telemetry handler and ignore any you don't render
-(zero cost). They do **not** appear in message content — keep your transcript rendering as-is.
-
-These events are **low-frequency** (one `step-complete` per step, one `done` per turn, a
-`durationMs` per tool result) — not per-token — so there's no stream-volume concern.
-
-## 3. The new/changed events (shapes)
-All new fields are **optional** — see §5. Every event still carries `conversationId` + `turnId`.
-
-```ts
-// NEW variant in AgentEvent — emitted once per step, AT STEP END (timing is final here)
-interface TurnStepCompleteEvent {
- type: "step-complete";
- conversationId: string;
- turnId: string;
- stepId: StepId; // join key to the step's `usage` event + tool events
- ttftMs?: number; // time to first token (stream start → first text|reasoning delta)
- decodeMs?: number; // first token → stream end (== genTotalMs - ttftMs)
- genTotalMs?: number; // whole-step generation (present even if no first token was seen)
-}
-
-// usage event — now labeled by step
-interface TurnUsageEvent {
- type: "usage";
- conversationId: string; turnId: string;
- stepId?: StepId; // NEW — attribute tokens to a step / join to step-complete
- usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? }
-}
-
-// tool-result — now carries execution time
-interface TurnToolResultEvent {
- type: "tool-result";
- conversationId: string; turnId: string;
- stepId: StepId; toolCallId: string; toolName: string;
- content: string; isError: boolean;
- durationMs?: number; // NEW — tool execution time (dispatch → result)
-}
-
-// done — now carries turn totals
-interface TurnDoneEvent {
- type: "done";
- conversationId: string; turnId: string;
- reason: string;
- durationMs?: number; // NEW — whole-turn wall-clock
- usage?: Usage; // NEW — aggregate turn tokens (so you needn't sum the usage events)
-}
-```
-
-## 4. Correlation & derived metrics
-Keys: `turnId` groups a turn; `stepId` groups a step within it; `toolCallId` pairs a tool call
-with its result. A turn has **one `step-complete` (and usually one `usage`) per step**.
-
-- **Per-step TPS** = `usage.outputTokens / (step-complete.decodeMs / 1000)` — join `usage` and
- `step-complete` by `stepId`. (Use `decodeMs`, not `genTotalMs`, for decode-rate TPS; it excludes
- first-token latency. See "which TPS" caveat below.)
-- **Turn TPS** = `done.usage.outputTokens / (Σ step-complete.decodeMs / 1000)`.
-- **Generation total per step** = `genTotalMs` (or `ttftMs + decodeMs`).
-- **Turn-visible first-token latency** = the `ttftMs` of **step 0** (the first `step-complete`).
-- **Total prefill overhead** = `Σ ttftMs` across steps; **pure generation** = `Σ decodeMs`.
-- **Tool time** = `tool-result.durationMs` per call; sum per `stepId` for a batch.
-
-"Which TPS": `decodeMs` is first-token → end, so TPS over it is the decode rate (first-token
-latency removed). If you want end-to-end rate including the wait, use `ttftMs + decodeMs`.
-
-## 5. Optionality — you MUST tolerate absence
-- `step-complete` is always emitted per step, but its **timing fields are present only when the
- server runs with a clock** (it does in normal operation). `ttftMs`/`decodeMs` are additionally
- absent for a step that produced **no text/reasoning token** (e.g. a tool-call-only step) —
- `genTotalMs` is still present in that case.
-- `usage.stepId`, `tool-result.durationMs`, `done.durationMs`, `done.usage` are all optional.
-- Render gracefully when a value is missing (omit the figure; don't show `NaN`/`undefined`).
-
-## 6. What is NOT available yet (deferred — Pass 2)
-**Metrics are LIVE-ONLY.** They are **not persisted**, so:
-- `GET /conversations/:id` (history) returns messages/chunks but **no tokens/timing**. Reopening a
- past conversation will show content without metrics.
-- If you need historical metrics (e.g. show TPS on a reloaded conversation), that's the planned
- **Pass 2** (persist per-turn metrics + a read path) — see `tasks.md` "Pass 2 — DEFERRED". Tell
- us if you need it and we'll prioritize.
-- TPS is not sent pre-computed (derive it, §4). No per-token timing (metrics are per-step/per-turn).
-
-## 7. Integration checklist
-1. Refresh deps: `bun run typecheck` in frontend (picks up `wire@…` / `transport-contract@…`).
-2. Extend your `chat.delta` event handler: add a `case "step-complete"` and read the new optional
- fields on `usage`/`tool-result`/`done`. (No exhaustive-switch break — these are additive.)
-3. Keep a per-turn (and per-step, keyed by `stepId`) telemetry accumulator alongside the transcript
- store; fold metric events into it; render where you want (e.g. a turn footer / per-step badges).
-4. Treat every metric field as optional (§5).
-
-## 8. Carrier facts (unchanged)
-HTTP 24203 (`POST /chat` NDJSON, `GET /conversations/:id`, `GET /models`), WS 24205 (one socket,
-`chat.delta` carries each `AgentEvent`), CORS `*`. Same events on both carriers.
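
The per-turn telemetry accumulator suggested in §7 of the metrics handoff above, together with the TPS derivation from §4, could be sketched roughly as follows. This is an illustration under assumptions, not the frontend's actual store: the event shapes are trimmed local copies of §3, `StepId` is assumed to be a plain string, and `TurnTelemetry`, `foldMetricEvent`, and `stepTps` are hypothetical names.

```ts
// Trimmed local copies of the §3 shapes (assumption: StepId is a string).
type StepId = string;
interface Usage { inputTokens: number; outputTokens: number; cacheReadTokens?: number; cacheWriteTokens?: number }

type MetricEvent =
  | { type: "usage"; turnId: string; stepId?: StepId; usage: Usage }
  | { type: "step-complete"; turnId: string; stepId: StepId; ttftMs?: number; decodeMs?: number; genTotalMs?: number }
  | { type: "tool-result"; turnId: string; stepId: StepId; toolCallId: string; durationMs?: number }
  | { type: "done"; turnId: string; durationMs?: number; usage?: Usage };

// Hypothetical accumulator: one entry per step, plus turn totals.
interface StepTelemetry { usage?: Usage; ttftMs?: number; decodeMs?: number; genTotalMs?: number; toolMs: number }
interface TurnTelemetry { steps: Map<StepId, StepTelemetry>; durationMs?: number; usage?: Usage }

function newTurnTelemetry(): TurnTelemetry {
  return { steps: new Map() };
}

function stepOf(t: TurnTelemetry, id: StepId): StepTelemetry {
  let s = t.steps.get(id);
  if (!s) {
    s = { toolMs: 0 };
    t.steps.set(id, s);
  }
  return s;
}

// Fold one metric event into the accumulator; every metric field is optional (§5).
function foldMetricEvent(t: TurnTelemetry, e: MetricEvent): void {
  switch (e.type) {
    case "usage":
      if (e.stepId !== undefined) stepOf(t, e.stepId).usage = e.usage;
      break;
    case "step-complete": {
      const s = stepOf(t, e.stepId);
      if (e.ttftMs !== undefined) s.ttftMs = e.ttftMs;
      if (e.decodeMs !== undefined) s.decodeMs = e.decodeMs;
      if (e.genTotalMs !== undefined) s.genTotalMs = e.genTotalMs;
      break;
    }
    case "tool-result":
      stepOf(t, e.stepId).toolMs += e.durationMs ?? 0;
      break;
    case "done":
      if (e.durationMs !== undefined) t.durationMs = e.durationMs;
      if (e.usage !== undefined) t.usage = e.usage;
      break;
  }
}

// Decode-rate TPS for one step; undefined when an input is missing, so the UI can omit the figure.
function stepTps(s: StepTelemetry): number | undefined {
  if (s.usage === undefined || s.decodeMs === undefined || s.decodeMs <= 0) return undefined;
  return s.usage.outputTokens / (s.decodeMs / 1000);
}
```

Returning `undefined` rather than `NaN` keeps the "omit the figure" rule from §5 in one place instead of in every renderer.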
diff --git a/frontend-metrics-pass2-handoff.md b/frontend-metrics-pass2-handoff.md
deleted file mode 100644
index adf0404..0000000
--- a/frontend-metrics-pass2-handoff.md
+++ /dev/null
@@ -1,67 +0,0 @@
-# FE handoff — persisted replay metrics (Pass 2) + metrics endpoint
-
-> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7
-> the backend does NOT write the FE repo; the FE orchestrator applies this delta
-> on its side (regenerate the in-repo `.dispatch/*.reference.md` snapshots + bump
-> the `file:` dep). `lsp references` does not span the two repos. Backend commit:
-> `6db12ff`.
-
-## Versions
-- `@dispatch/wire` `0.3.0 → 0.4.0` (additive)
-- `@dispatch/transport-contract` `0.3.0 → 0.4.0` (additive)
-
-Pure-type, additive change — no breaking edits to existing types.
-
-## New wire types (`@dispatch/wire`, re-exported by `@dispatch/transport-contract`)
-
-```ts
-interface StepMetrics {
- stepId: StepId; // `<turnId>#<index>`, join key to the live stream
- usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? }
- ttftMs?: number; // time to first token (optional — clock + first-token gated)
- decodeMs?: number; // first token → stream end
- genTotalMs?: number; // stream start → end (== ttftMs + decodeMs when a first token was seen)
-}
-
-interface TurnMetrics {
- turnId: string; // plain wire turn id, join key to AgentEvents
- usage: Usage; // aggregate across all steps
- durationMs?: number; // turn wall-clock (optional — clock gated)
- steps: readonly StepMetrics[]; // per-step, in step order
-}
-```
-
-These are the **persisted, replayable** counterparts of the live `usage` /
-`step-complete` / `done` events (which remain transient and unchanged).
-
-## New read endpoint
-
-`GET /conversations/:id/metrics` → `ConversationMetricsResponse`:
-
-```ts
-interface ConversationMetricsResponse { turns: readonly TurnMetrics[] }
-```
-
-Semantics:
-- `turns` = every **sealed** turn's `TurnMetrics`, in **turn-append order**.
-- A turn appears only **after seal** (post-persist); an in-flight/unsealed turn is absent.
-- This is a **separate axis** from `GET /conversations/:id?sinceSeq=` (which returns
- seq-cursor chunk CONTENT). Metrics are keyed per **turn**, not per chunk, so they are
- **not** seq-filtered — hence a sibling route, not a field on the history response.
-- Unknown / metric-less conversation → `{ turns: [] }`.
-- CORS: same wildcard as the other routes.
-
-## Suggested FE consumption
-On (re)opening a conversation, the chat feature can `GET /conversations/:id/metrics`
-once alongside the history hydrate (`?sinceSeq=`), then render historical
-tokens/latency per turn (and per step via `stepId`) — identical fields to what it
-already routes from the live `step-complete` / `usage` / `done` stream. TPS is
-still derived FE-side (`usage.outputTokens / (decodeMs / 1000)`); context-size proxy =
-`usage.inputTokens`.
-
-## Invariants (confirmed live)
-- Persisted `TurnMetrics.usage` / `durationMs` and each `StepMetrics`
- (`stepId` + `usage` + `ttftMs`/`decodeMs`/`genTotalMs`) **byte-match** what the
- live stream emitted for the same turn (verified end-to-end against flash).
-- `stepId` is the SAME value on the live `step-complete`/`usage` events, the persisted
- `StepMetrics`, and the tool chunks — one grouping key across live + replay.
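
As a rough sketch of the suggested consumption in the Pass 2 handoff above, a `ConversationMetricsResponse` could be flattened into a lookup keyed by `stepId`, so replayed steps join the same way live ones do. The types are local copies of the ones quoted in the handoff (with `StepId` assumed to be a string); `StepView` and `indexStepMetrics` are hypothetical names, and the HTTP call itself is left out.

```ts
type StepId = string;
interface Usage { inputTokens: number; outputTokens: number; cacheReadTokens?: number; cacheWriteTokens?: number }
interface StepMetrics { stepId: StepId; usage: Usage; ttftMs?: number; decodeMs?: number; genTotalMs?: number }
interface TurnMetrics { turnId: string; usage: Usage; durationMs?: number; steps: readonly StepMetrics[] }
interface ConversationMetricsResponse { turns: readonly TurnMetrics[] }

// Hypothetical render-ready view of one persisted step.
interface StepView { turnId: string; stepId: StepId; outputTokens: number; tps?: number }

// Flatten the per-turn response into a map keyed by stepId (the shared live + replay key).
function indexStepMetrics(resp: ConversationMetricsResponse): Map<StepId, StepView> {
  const byStep = new Map<StepId, StepView>();
  for (const turn of resp.turns) {
    for (const step of turn.steps) {
      const view: StepView = { turnId: turn.turnId, stepId: step.stepId, outputTokens: step.usage.outputTokens };
      // TPS stays derived on the FE side; skip it when decodeMs is absent.
      if (step.decodeMs !== undefined && step.decodeMs > 0) {
        view.tps = step.usage.outputTokens / (step.decodeMs / 1000);
      }
      byStep.set(step.stepId, view);
    }
  }
  return byStep;
}
```

An unknown or metric-less conversation (`{ turns: [] }`) simply yields an empty map.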
diff --git a/frontend-model-persistence-handoff.md b/frontend-model-persistence-handoff.md
deleted file mode 100644
index 912cea6..0000000
--- a/frontend-model-persistence-handoff.md
+++ /dev/null
@@ -1,91 +0,0 @@
-# Frontend handoff — per-conversation model persistence
-
-## What changed
-
-A chat's selected provider + model is now **persisted per conversation**
-(like `cwd` and `reasoningEffort` already are). Opening a conversation in a new
-browser session recalls the originally selected model instead of defaulting to
-the server default.
-
-## Contract version bump
-
-`@dispatch/transport-contract` `0.19.0 → 0.20.0` — re-pin the `file:` dep and
-re-mirror `.dispatch/transport-contract.reference.md`.
-
-## New types (additive)
-
-```ts
-// GET /conversations/:id/model
-export interface ModelResponse {
- readonly conversationId: string;
- readonly model: string | null; // <credentialName>/<model> form, or null
-}
-
-// PUT /conversations/:id/model
-export interface SetModelRequest {
- readonly model: string | null; // null clears the persisted selection
-}
-```
-
-## New endpoints
-
-### `GET /conversations/:id/model`
-Returns `ModelResponse`. `model` is `null` when never set (the server then
-resolves turns using the default provider + model).
-
-### `PUT /conversations/:id/model`
-Body: `SetModelRequest`. Set `model` to a `<credentialName>/<model>` string
-(one of the values from `GET /models`) to persist it. Set `model` to `null`
-to clear the persisted selection. Returns `ModelResponse` with the resulting
-value.
-
-## What the FE should do
-
-1. **On conversation open** — call `GET /conversations/:id/model` to fetch the
- persisted model. If non-null, set the model selector to that value. If null,
- use the global default (current behavior).
-
-2. **On model select** — call `PUT /conversations/:id/model` with the selected
- model name (`<credentialName>/<model>` form). This persists it so future
- turns (and new browser sessions) use the same model.
-
-3. **On model clear** (if the FE supports clearing back to default) — call
- `PUT /conversations/:id/model` with `{ model: null }`.
-
-4. **No `ChatRequest.model` change needed** — the FE may continue sending
- `model` on `chat.send` (per-turn override); the backend persists it. Or the
- FE may omit `model` on `chat.send` and rely on the persisted value — the
- backend resolves it. Either way works.
-
-## Backend behavior
-
-- **Per-turn override** (`ChatRequest.model` / `chat.send` model) takes
- precedence and is persisted.
-- **No per-turn override** → backend checks `getModel(conversationId)` → if
- non-null, uses it; if null, falls through to the default provider.
-- **Warm path** also resolves the model from persistence when no explicit
- override is given (parity with real turns).
-
-## No FE handoff needed for tasks 1 & 2
-
-- **Task 1** (workspace tab broadcast): already couriered to 29ae by a prior
- orchestrator agent (`frontend-workspace-open-handoff.md`).
-- **Task 2** (system-prompt cwd reconstruction): backend-only fix, no contract
- version bump, no FE action needed.
-
-## Assumptions made (user was away)
-
-1. **Persist the model name string** (`<credentialName>/<model>` form), not
- the provider/credential separately — the model name already encodes both
- (the credential binds to a provider). This mirrors how the CLI sends
- `--model` and how `ChatRequest.model` works.
-2. **No model validation on PUT** — the backend doesn't validate the model
- name on `PUT /conversations/:id/model` (it's just a string). The provider
- resolves it at turn time; an unknown model → turn error, not a 400. This
- matches the contract doc on `SetModelRequest`.
-3. **Empty string clears** — `setModel(id, "")` deletes the key. The HTTP
- `PUT` with `{ model: null }` maps to this. This is an implementation detail
- the FE doesn't need to know about (it sends `null`).
-4. **No `model` field on `ConversationMeta`** — following the precedent of `cwd`
- and `reasoningEffort` (which are NOT on `ConversationMeta` but fetched via
- dedicated endpoints). The FE calls `GET /conversations/:id/model` to read.
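
The resolution order described under "Backend behavior" in the model-persistence handoff above can be illustrated with a small in-memory sketch. It is not the backend's implementation: `InMemoryModelStore`, `resolveTurnModel`, and `putModel` are hypothetical names, and only the `getModel` / `setModel(id, "")` behavior mentioned in the handoff is mirrored.

```ts
// Hypothetical in-memory stand-in for the per-conversation model persistence.
class InMemoryModelStore {
  private readonly models = new Map<string, string>();

  getModel(conversationId: string): string | null {
    return this.models.get(conversationId) ?? null;
  }

  // Mirrors the handoff's note: an empty string deletes the key.
  setModel(conversationId: string, model: string): void {
    if (model === "") this.models.delete(conversationId);
    else this.models.set(conversationId, model);
  }
}

// Per-turn override wins and is persisted; otherwise persisted value; otherwise the default.
function resolveTurnModel(
  store: InMemoryModelStore,
  conversationId: string,
  override: string | undefined,
  defaultModel: string,
): string {
  if (override) {
    store.setModel(conversationId, override);
    return override;
  }
  return store.getModel(conversationId) ?? defaultModel;
}

// Sketch of the PUT handler's mapping: `null` clears, a string persists (no validation).
function putModel(
  store: InMemoryModelStore,
  conversationId: string,
  body: { model: string | null },
): { conversationId: string; model: string | null } {
  store.setModel(conversationId, body.model ?? "");
  return { conversationId, model: store.getModel(conversationId) };
}
```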
diff --git a/frontend-reasoning-effort-handoff.md b/frontend-reasoning-effort-handoff.md
deleted file mode 100644
index 656dede..0000000
--- a/frontend-reasoning-effort-handoff.md
+++ /dev/null
@@ -1,81 +0,0 @@
-# FE handoff — reasoning effort (thinking-depth knob)
-
-Courier this to `../frontend` (cross-repo contract change; `lsp references` does not
-span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks.
-
-## What shipped (backend)
-
-A new user-settable knob, **reasoning effort**: how much extended thinking the model spends
-before answering. Canonical ladder (type `ReasoningEffort`, exported by `@dispatch/wire` and
-re-exported by `@dispatch/transport-contract`):
-
-```ts
-type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";
-```
-
-Versions: `@dispatch/wire` `0.6.1 → 0.7.0`, `@dispatch/transport-contract`
-`0.10.0 → 0.11.0`. Bump the pinned `file:` deps.
-
-It has TWO setting scopes, resolved server-side per turn:
-
-1. **Per-turn override** — optional `reasoningEffort` on `ChatRequest` (HTTP `POST /chat`)
- and therefore on the WS `chat.send` message (`ChatSendMessage extends ChatRequest`).
- Applies to THAT turn only; does NOT persist.
-2. **Persisted per-conversation setting** — sticky; used for every turn that has no per-turn
- override:
- - `GET /conversations/:id/reasoning-effort` → `ReasoningEffortResponse`
- `{ conversationId, reasoningEffort: ReasoningEffort | null }` (`null` = never set).
- - `PUT /conversations/:id/reasoning-effort` with body `SetReasoningEffortRequest`
- `{ reasoningEffort }` → persists it.
-
-**Resolution chain (server-owned — do not re-implement):** per-turn override → persisted
-conversation value → **default `"high"`**. So a conversation with nothing set already runs at
-`high`; `null` from the GET means "default (`high`) applies", not "off".
-
-**Validation:** an unrecognized level → HTTP 400 `{ error }` (the error message lists the
-valid levels). Same for the WS path (the standard `chat.send` error reply). Send only the
-five ladder strings; omit the key entirely for "no override" (don't send `null`/`""`).
-
-## What the model does with it (context for UX copy)
-
-The Anthropic provider maps the level to an extended-thinking token budget
-(`low` 4 096 · `medium` 10 240 · `high` 16 384 · `xhigh` 32 768 · `max` 65 536). Higher
-levels = the model thinks longer before answering (more `reasoning-delta` events / thinking
-chunks ahead of the text — the FE already renders those). Providers without a thinking knob
-ignore the field — sending it is always safe.
-
-## What we need the FE to do
-
-1. **Per-conversation effort selector** — a 5-option control (plus an implicit "default"
- state when the GET returns `null`):
- - On conversation open: `GET /conversations/:id/reasoning-effort`; render `null` as
- "high (default)".
- - On change: `PUT` the chosen level. It takes effect from the NEXT turn — no turn restart
- needed.
-2. **(Optional) per-turn override** — if the composer grows a "think harder for this one
- message" affordance, set `reasoningEffort` on that `chat.send` only. The persisted setting
- is untouched by overrides.
-3. **Expect more thinking** — at `xhigh`/`max` the pre-answer thinking phase can be long;
- whatever spinner/"thinking…" treatment exists should tolerate extended runs of
- reasoning deltas before the first text delta.
-
-## Cache note (don't surprise users)
-
-Changing the effort level changes the provider request shape, which can bust the prompt
-cache for the next turn (one-time re-prefill cost). The backend's cache-warming path already
-warms with the SAME resolved effort as a real turn, so a STABLE setting stays cache-safe;
-only the act of changing it costs. If the FE wants, it can mention this in the selector's
-tooltip — no functional handling required.
-
-## Verify (manual)
-
-```bash
-# sticky setting round-trip
-curl -s localhost:24203/conversations/<id>/reasoning-effort # → null first time
-curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \
- -H 'content-type: application/json' -d '{"reasoningEffort":"xhigh"}'
-curl -s localhost:24203/conversations/<id>/reasoning-effort # → "xhigh"
-# bad level → 400
-curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \
- -H 'content-type: application/json' -d '{"reasoningEffort":"banana"}'
-```
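
On the FE side, the reasoning-effort handoff above mostly needs a type guard, a label for the `null` case, and a way to omit the key when there is no override. A minimal sketch, assuming nothing beyond the ladder and budgets quoted in the handoff; `isReasoningEffort`, `effortLabel`, and `effortField` are hypothetical helper names, and the budget table is only useful for UX copy since resolution stays server-owned.

```ts
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

const REASONING_EFFORTS: readonly ReasoningEffort[] = ["low", "medium", "high", "xhigh", "max"];

// Token budgets the handoff quotes for the Anthropic provider (display/tooltip use only).
const ANTHROPIC_THINKING_BUDGET: Record<ReasoningEffort, number> = {
  low: 4096,
  medium: 10240,
  high: 16384,
  xhigh: 32768,
  max: 65536,
};

// Guard untrusted input before sending it; the server answers 400 for anything else.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && (REASONING_EFFORTS as readonly string[]).indexOf(value) !== -1;
}

// `null` from the GET means "default applies", which the handoff says to render as "high (default)".
function effortLabel(persisted: ReasoningEffort | null): string {
  return persisted ?? "high (default)";
}

// For "no override", omit the key entirely rather than sending null or "".
function effortField(override?: ReasoningEffort): { reasoningEffort?: ReasoningEffort } {
  return override === undefined ? {} : { reasoningEffort: override };
}
```

A per-turn override could then be spread into whatever request object the composer already builds, for example `{ ...request, ...effortField("max") }`.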
diff --git a/frontend-stop-generation-handoff.md b/frontend-stop-generation-handoff.md
deleted file mode 100644
index 117b65e..0000000
--- a/frontend-stop-generation-handoff.md
+++ /dev/null
@@ -1,49 +0,0 @@
-# FE handoff — stop generation mid-turn
-
-Courier this to `../frontend`. All changes are ADDITIVE.
-
-## What shipped (backend)
-
-A "stop" button: aborts an in-flight generation without closing the conversation.
-The conversation stays open — it transitions `active → idle` via the normal
-turn-settle path. Partial messages are persisted. The turn seals with
-`finishReason: "aborted"`.
-
-This is distinct from `POST /conversations/:id/close` which marks the
-conversation as `closed` (tab dismiss).
-
-## `POST /conversations/:id/stop` — stop generation
-
-Aborts the in-flight turn's `AbortController`. The kernel finishes generation,
-persists partial messages, and seals the turn normally. The conversation
-transitions `active → idle` (not `closed`).
-
-- 200 response: `{ conversationId: string, abortedTurn: boolean }`
-- `abortedTurn: true` — a turn was active and has been aborted.
-- `abortedTurn: false` — no active turn (no-op, conversation was already idle).
-- Idempotent — stopping an idle conversation is safe.
-
-## What the FE receives after stopping
-
-The existing event flow handles everything — no new WS message needed:
-
-1. The `done` event arrives with `reason: "aborted"` (the turn sealed normally).
-2. The `conversation.statusChanged` WS message arrives with `status: "idle"`.
-3. The FE should reload history via `GET /conversations/:id` to see the partial
- messages that were persisted before the abort.
-
-## What the FE needs to do
-
-1. **Stop button** in the conversation toolbar (only visible when `status: "active"`).
- On click → `POST /conversations/:id/stop`. Disable the button after clicking
- (wait for the `done` event + `statusChanged: idle` before re-enabling).
-
-2. **Handle the response**: `abortedTurn: true` means the stop worked.
- `abortedTurn: false` means there was nothing to stop (the turn may have
- already finished between the click and the request).
-
-3. **Reload history** after receiving the `done` event to show partial output.
-
-## CLI
-
-`dispatch stop <conversationId>` — stops generation. Resolves short IDs.
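
The stop-button behavior in the stop-generation handoff above (visible only while `active`, disabled after a click until both `done` and `statusChanged: idle` arrive) can be sketched as a small reducer. The state shape, input names, and `reduceStop` are hypothetical; only the status values and the `abortedTurn` flag come from the handoff.

```ts
type ConversationStatus = "active" | "idle" | "closed";

interface StopButtonState {
  status: ConversationStatus;
  stopPending: boolean; // a stop request was sent and has not settled yet
  sawDone: boolean;     // the current turn's `done` event has arrived
}

type StopInput =
  | { kind: "click" }
  | { kind: "stopResponse"; abortedTurn: boolean }
  | { kind: "done" }
  | { kind: "statusChanged"; status: ConversationStatus };

// A pending stop settles once `done` was seen and the conversation is no longer active.
function settle(s: StopButtonState): StopButtonState {
  return s.sawDone && s.status !== "active" ? { ...s, stopPending: false } : s;
}

function reduceStop(s: StopButtonState, input: StopInput): StopButtonState {
  switch (input.kind) {
    case "click":
      return s.status === "active" && !s.stopPending ? { ...s, stopPending: true } : s;
    case "stopResponse":
      // abortedTurn: false means there was nothing to stop, so release the button.
      return input.abortedTurn ? s : { ...s, stopPending: false };
    case "done":
      return settle({ ...s, sawDone: true });
    case "statusChanged":
      return input.status === "active"
        ? { status: "active", stopPending: false, sawDone: false } // a new turn started
        : settle({ ...s, status: input.status });
  }
}

const stopVisible = (s: StopButtonState): boolean => s.status === "active";
const stopEnabled = (s: StopButtonState): boolean => stopVisible(s) && !s.stopPending;
```

The `done` input is also the natural point to trigger the history reload the handoff asks for.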
diff --git a/frontend-system-prompt-handoff.md b/frontend-system-prompt-handoff.md
deleted file mode 100644
index c135145..0000000
--- a/frontend-system-prompt-handoff.md
+++ /dev/null
@@ -1,73 +0,0 @@
-# FE Courier Handoff: System Prompt Builder (Updated)
-
-> Backend→FE courier. Send to FE agent `ffe3`.
-> Supersedes the earlier `frontend-system-prompt-handoff.md` — adds `prompt:workspace_id`.
-
-## API endpoints
-
-### `GET /system-prompt` → `{ template: string }`
-Returns the current global template. When none is stored, returns the built-in default.
-
-### `PUT /system-prompt` ← `{ template: string }` → `{ template: string }`
-Set the global template. Empty string = "no system prompt". 400 if `template` missing/wrong type. 503 if service unavailable.
-
-### `GET /system-prompt/variables` → `{ variables: SystemPromptVariable[] }`
-Static catalog — always available (no service dependency). Use this to render the variable selector buttons.
-
-## Template format
-
-### Variable insertion
-```
-[type:name]
-```
-Resolves at construction time. Unknown type → blank. Non-existent variable (e.g. file not found) → blank.
-
-### Conditional blocks
-```
-[if type:name]
- ...if variable exists...
-[else]
- ...if not...
-[endif]
-```
-Negated:
-```
-[if !type:name]
- ...if variable does NOT exist...
-[endif]
-```
-Nested `[if]`: supported. Multi-line: supported. Unmatched `[if]`/`[endif]`: literal text.
-
-## Available variables (updated)
-
-| Type:Name | Description | Dynamic? |
-|---|---|---|
-| `system:time` | Current time (ISO 8601) | No |
-| `system:date` | Current date (YYYY-MM-DD) | No |
-| `system:os` | Operating system | No |
-| `system:hostname` | Machine hostname | No |
-| `prompt:cwd` | Working directory | No |
-| `prompt:model` | Current model name | No |
-| `prompt:conversation_id` | Conversation ID | No |
-| `prompt:workspace_id` | Workspace identifier — lets the AI know which workspace it's in, useful when summoning agents | No |
-| `git:branch` | Current git branch | No |
-| `git:status` | Short git status | No |
-| `file:<path>` | File contents (relative to cwd, or absolute if starts `/`) | **Yes** |
-
-For `file:<path>`, allow free-text input for the path.
-
-## Caching behavior
-
-System prompt is **constructed once** (first turn of a new conversation) and **persisted**. Reused on all subsequent turns (cache-safe). Reconstructed only on **compaction**. Changing the template does NOT affect existing conversations until compacted.
-
-## Default template
-
-```
-You are a helpful coding assistant.
-
-[if file:AGENTS.md]
-[file:AGENTS.md]
-[endif]
-
-The current working directory is [prompt:cwd].
-```
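
To make the template format in the system-prompt handoff above concrete, here is a minimal resolver sketch for `[type:name]` variables and `[if]`/`[else]`/`[endif]` blocks. It is an assumption-laden illustration, not the backend's builder: `renderTemplate` and `VariableLookup` are hypothetical names, "exists" is taken to mean "the lookup returned a value", whitespace around blocks is left untouched, and condition names containing spaces are not handled.

```ts
// Resolves a variable to its text, or undefined when it does not exist.
type VariableLookup = (type: string, name: string) => string | undefined;

// Innermost conditional: an [if ...] whose body contains no further "[if ".
const IF_BLOCK = /\[if (!?)([^\]\s]+)\]((?:(?!\[if )[\s\S])*?)\[endif\]/;
const VARIABLE = /\[([a-z]+):([^\]]+)\]/g;

function variableExists(key: string, lookup: VariableLookup): boolean {
  const colon = key.indexOf(":");
  if (colon < 0) return false;
  return lookup(key.slice(0, colon), key.slice(colon + 1)) !== undefined;
}

function renderTemplate(template: string, lookup: VariableLookup): string {
  let out = template;
  // Resolve conditionals from the inside out so nested blocks work.
  for (let m = IF_BLOCK.exec(out); m !== null; m = IF_BLOCK.exec(out)) {
    const [whole = "", bang = "", key = "", body = ""] = m;
    const elseAt = body.indexOf("[else]");
    const thenPart = elseAt < 0 ? body : body.slice(0, elseAt);
    const elsePart = elseAt < 0 ? "" : body.slice(elseAt + "[else]".length);
    // "!" negates the existence check.
    const takeThen = variableExists(key, lookup) !== (bang === "!");
    out = out.slice(0, m.index) + (takeThen ? thenPart : elsePart) + out.slice(m.index + whole.length);
  }
  // Unknown or non-existent variables resolve to blank; unmatched [if]/[endif] stay literal.
  return out.replace(VARIABLE, (_match: string, type: string, name: string) => lookup(type, name) ?? "");
}
```

Variables are substituted last, so text pulled in through `file:<path>` is never re-scanned for directives in this sketch; whether the real builder behaves that way is not stated in the handoff.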
diff --git a/frontend-todo-handoff.md b/frontend-todo-handoff.md
deleted file mode 100644
index 4a81296..0000000
--- a/frontend-todo-handoff.md
+++ /dev/null
@@ -1,91 +0,0 @@
-# FE handoff — todo task list surface
-
-Courier this to `../frontend` (cross-repo contract change; `lsp references` does
-not span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks.
-
-## What shipped (backend)
-
-A per-conversation **task list** the AI model maintains via a `todo_write` tool. The
-list is exposed to the frontend as a per-conversation **surface** (read-only). The
-model creates/updates the list during a turn; the surface updates live so the FE can
-render the current state.
-
-- **`todo_write` tool** — the model passes the FULL list each call (replaces the
- existing list). Returns the list as JSON. The tool description guides the model on
- when to use it (3+ step tasks, planning, etc.).
-- **State** — in-memory, per-conversation. No persistence (the list lives for the
- process lifetime of the conversation).
-- **No new wire types, no version bumps.** The todo surface uses the existing
- `custom` surface field kind (`ui-contract` unchanged). The `TodoItem` type is
- defined by the `todo` extension and carried in the surface payload — it is NOT
- in `@dispatch/wire` or `@dispatch/transport-contract`.
-
-## The surface
-
-The `todo` extension contributes a per-conversation surface:
-
-- **Surface id:** `"todo"`
-- **Scope:** `"conversation"` (subscribe with the `conversationId`)
-- **Region:** `"side"`
-- **Title:** `"Tasks"`
-- **One `custom` field**, `rendererId: "todo"`, `payload: TodoPayload`
-
-```ts
-interface TodoPayload {
- todos: readonly TodoItem[];
-}
-
-interface TodoItem {
- content: string;
- status: "pending" | "in_progress" | "completed" | "cancelled";
-}
-```
-
-- **Read-only** — no `invoke` actions. The model mutates the list via the
- `todo_write` tool; the FE only renders.
-- **Updates** on every `todo_write` call (subscriber-notify → full new spec with the
- updated `todos` array).
-- **Empty list** — an idle conversation (no todo list created yet, or the model
- cleared it with an empty array) renders `todos: []`. Hide the panel when empty.
-
-## What the FE needs to do
-
-1. **Subscribe** to the `todo` surface per conversation (same pattern as
- `message-queue` and `cache-warming` — `scope: "conversation"`, pass
- `conversationId` on subscribe).
-
-2. **Custom renderer** for `rendererId: "todo"` — render the `payload.todos` array
- as a task list. Suggested UI:
- - Each item shows `content` with a status indicator:
- - `pending` — empty circle / checkbox
- - `in_progress` — spinner / filled circle (highlight)
- - `completed` — checkmark (strikethrough or dim the content)
- - `cancelled` — X / dash (dim/strikethrough)
- - Order is significant — items are in the order the model provided them (array
- index = identity).
- - Only one item should be `in_progress` at a time (the tool description asks for
- this via guidance, not validation, so it is expected rather than guaranteed).
-
-3. **Live updates** — the surface pushes a new spec on every `todo_write` call. No
- polling needed. Just re-render from the new `payload.todos`.
-
-4. **Empty state** — when `todos` is `[]`, hide the panel (the model hasn't created
- a list yet, or cleared it).
-
-## No other integration points
-
-- No new WS ops (no `chat.queue` equivalent — the model is the only writer).
-- No new HTTP endpoints (the list is tool-driven, not API-driven).
-- No new `AgentEvent` types (the list is not on the chat stream).
-- No version bumps in `@dispatch/wire` or `@dispatch/transport-contract`.
-
-## Notes
-
-- **In-memory only** — the todo list does NOT persist across server restarts. If
- the server restarts, the list is cleared. The model recreates it on the next
- `todo_write` call. This mirrors the message-queue behavior.
-- **Per-conversation** — each conversation has its own list. Switching conversations
- means subscribing to a different `conversationId` and rendering that conversation's
- list.
-- **Model-driven** — the FE has no control over the list (read-only surface). The
- model creates, updates, and clears items. The FE just displays the current state.
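
The custom renderer described in the todo handoff above could start from something like the sketch below. The payload types are the ones quoted in the handoff; the ASCII status markers and the `renderTodoLines` name are placeholders chosen for illustration, not part of any contract.

```ts
interface TodoItem {
  content: string;
  status: "pending" | "in_progress" | "completed" | "cancelled";
}

interface TodoPayload {
  todos: readonly TodoItem[];
}

// Placeholder markers; a real UI would use icons and styling instead.
const STATUS_MARKER: Record<TodoItem["status"], string> = {
  pending: "[ ]",
  in_progress: "[~]",
  completed: "[x]",
  cancelled: "[-]",
};

// Returns one line per item in the model's order, or null when the panel should be hidden.
function renderTodoLines(payload: TodoPayload): string[] | null {
  if (payload.todos.length === 0) return null;
  return payload.todos.map((item) => `${STATUS_MARKER[item.status]} ${item.content}`);
}
```

Because every `todo_write` call pushes a full new spec, re-running this on each surface update is enough; no diffing against the previous list is needed.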
diff --git a/frontend-turn-continuity-handoff.md b/frontend-turn-continuity-handoff.md
deleted file mode 100644
index e0be4a3..0000000
--- a/frontend-turn-continuity-handoff.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# FE handoff — turn continuity + multi-client live view
-
-Courier to `../frontend` (cross-repo; `lsp references` does not span repos —
-ORCHESTRATOR §7). Backend is implemented + live-verified against flash. This unblocks
-the "turn keeps running when the browser is backgrounded/reloaded" + "watch the same
-chat from a second device" behavior.
-
-## What changed in the backend (principle now enforced)
-
-A turn is **no longer bound to the WebSocket connection**. It runs to completion on the
-server regardless of any client, and **any number of connections can watch the same
-conversation's live events** — including a client that connects mid-turn (late-join
-replay). The old behavior (socket close → `AbortController.abort()` → turn killed) is
-gone.
-
-## New WS protocol (additive — `@dispatch/transport-contract` `0.6.0 → 0.7.0`)
-
-Two new client→server messages on the existing socket:
-
-```ts
-{ type: "chat.subscribe"; conversationId: string } // start watching a conversation's turns
-{ type: "chat.unsubscribe"; conversationId: string } // stop watching (does NOT stop the turn)
-```
-
-Server→client is UNCHANGED: turn events still arrive as
-`{ type: "chat.delta", event: AgentEvent }` (and `{ type: "chat.error", ... }`). Both
-replayed and live events use `chat.delta`.
-
-Semantics:
-- **`chat.subscribe`** registers this connection to receive the conversation's turn
- events. If a turn is in-flight, the server immediately **replays that turn's events so
- far** (from its `turn-start`) as `chat.delta`, then streams live ones. If idle, nothing
- is replayed (rely on the history read).
-- **`chat.send`** still starts a turn AND **auto-subscribes the sending connection** — so
- the sender needs no separate `chat.subscribe`. (If a turn is already generating for that
- conversation, the server replies `chat.error` "a turn is already generating…" and you
- stay subscribed to watch the running one.)
-- **`chat.unsubscribe`** / socket close → the server drops this connection's subscription
- but **never stops the turn**.
-- Subscriptions **persist across turns** on the backend: subscribe once and you receive
- every subsequent turn on that conversation until you unsubscribe/close.
-
-## What the FE must change (from the FE investigation)
-
-1. **On WS (re)connect — re-subscribe chat, not just surfaces.** Today `onReopen`
- (`src/app/store.svelte.ts`) only re-sends *surface* subscriptions. It must ALSO, for
- every open conversation, send `chat.subscribe { conversationId }`. This is what makes a
- backgrounded/reconnected client re-attach to a still-running turn and resume live
- streaming. (Pair it with a `syncTail()` so any turn that sealed while you were gone is
- committed from history.)
-2. **On page load — subscribe each restored tab's conversation** (in addition to the
- existing IndexedDB + `GET /conversations/:id?sinceSeq=` rehydrate). After a reload
- mid-turn you'll get the in-flight turn replayed and can keep rendering it live.
-3. **Render a real "running" state.** Derive it from the stream: a `turn-start` (or any
- delta) with no matching `done`/`turn-sealed` yet = generating. Today the Composer status
- is hard-wired idle and the `status` AgentEvent is a no-op reducer — wire it up so a
- watching device shows "generating…".
-4. **Don't lose a missed `turn-sealed`.** If you reconnect after the turn sealed while you
- were away, you won't get a live `turn-sealed`; `syncTail()` on (re)connect (point 1)
- commits the finished turn from history. If you reconnect WHILE it's still running, the
- replay + live tail carry you to the real `turn-sealed`.
-5. **Multi-device handoff (the goal):** opening the same conversation on device B is just
- `chat.subscribe { conversationId }`. B will see the in-flight turn (replayed) and watch
- it finish — even if device A (the sender) closed. No special handling beyond points 1–3.
-
-## Out of scope (backend will NOT do these yet)
-
-- **Per-step persistence / crash-resume:** if the backend PROCESS crashes mid-turn, the
- in-flight turn is still lost (the in-flight buffer is in-memory; only sealed turns are
- persisted). Reconnecting to a *running* turn works; surviving a *backend crash* mid-turn
- does not. Separate durability milestone (R1).
-- **Concurrent-send arbitration:** sending from two devices at once is not handled (by
- product decision — won't happen). A second `chat.send` while generating gets a
- `chat.error`.
-- **Explicit "stop generating":** there is no stop op (disconnect no longer stops a turn).
- A future `chat.stop` would be deliberate.
-
-## Quick manual check (mirrors the backend live test)
-
-Open two WS connections, `chat.subscribe` the same `conversationId` on both, `chat.send`
-on one → both receive identical `chat.delta` streams. Close the sender mid-turn → the other
-keeps receiving through `done`. Connect a third mid-turn + `chat.subscribe` → it receives
-`turn-start` replayed then the rest.
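Reviewer's aside on the handoff removed above: its reconnect rule (point 1) and its running-state rule (point 3) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the frontend's actual code. The `StreamEvent` shape, the `delta` tag, and both function names are hypothetical, and the `type` field on the subscribe message is assumed by analogy with the other WS messages in these docs.

```typescript
// Hypothetical event tags: "turn-start", "done" and "turn-sealed" are named in
// the handoff; "delta" stands in for any streamed delta.
type StreamEventKind = "turn-start" | "delta" | "done" | "turn-sealed";

interface StreamEvent {
  readonly kind: StreamEventKind;
}

// Assumed message shape for `chat.subscribe { conversationId }`.
interface ChatSubscribeMessage {
  readonly type: "chat.subscribe";
  readonly conversationId: string;
}

/** Point 1: on (re)connect, build one subscribe message per open conversation. */
function resubscribeMessages(
  openConversationIds: readonly string[],
): ChatSubscribeMessage[] {
  return openConversationIds.map((conversationId) => ({
    type: "chat.subscribe" as const,
    conversationId,
  }));
}

/** Point 3: generating = a turn-start (or delta) with no later done/turn-sealed. */
function isGenerating(events: readonly StreamEvent[]): boolean {
  let generating = false;
  for (const event of events) {
    // The most recent event decides: a start or delta means a turn is in flight,
    // while done or turn-sealed means it has finished.
    generating = event.kind === "turn-start" || event.kind === "delta";
  }
  return generating;
}
```

Because the backend replays the in-flight turn to late subscribers, feeding the replayed events through the same reducer should give a second device the same "generating" answer as the sender.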
diff --git a/frontend-workspace-open-handoff.md b/frontend-workspace-open-handoff.md
deleted file mode 100644
index 8005a2f..0000000
--- a/frontend-workspace-open-handoff.md
+++ /dev/null
@@ -1,47 +0,0 @@
-# Frontend handoff — workspace id on conversation.open / statusChanged
-
-## What changed
-
-The backend now resolves the conversation's actual persisted workspace id and
-includes it on the WS broadcast for both `conversation.open` and
-`conversation.statusChanged`.
-
-- `@dispatch/transport-contract` `0.18.0 → 0.19.0` — additive `workspaceId: string`
- on both `ConversationOpenMessage` and `ConversationStatusChangedMessage`.
-- The backend uses the conversation's stored workspace (`"default"` fallback),
- not the per-turn start option.
-
-## What the frontend must do
-
-1. **Re-pin the `file:` dep** on `@dispatch/transport-contract` from the backend
- repo once this commit lands.
-2. **Re-mirror `.dispatch/transport-contract.reference.md`** to match the `0.19.0`
- contract.
-3. **Parser update** (`src/adapters/ws/logic.ts:116-123`): parse `workspaceId`
- from the incoming `conversation.open` and `conversation.statusChanged` messages.
-4. **`openConversation()`** (`src/app/store.svelte.ts:588-600`): use the message's
- `workspaceId` to stamp/focus the tab instead of `activeWorkspaceId` (the
- viewer's current workspace). This fixes the bug where a tab opened via the
- CLI `--open --workspace my-ws` was appearing in every workspace.
-5. **`onConversationStatusChanged()`** (`src/app/store.svelte.ts:703-718`): same
- fix when the FE calls `openConversation(conversationId)` on a status change and
- has no existing tab — use the `workspaceId` from the message.
-6. **Tests** (`logic.test.ts`, `index.test.ts`, `App.test.ts`, `conformance.test.ts`):
- update fixtures/assertions to carry `workspaceId`.
-
-## Backend status
-
-- `@dispatch/transport-contract`: `0.19.0` with additive `workspaceId`.
-- `session-orchestrator`: payload types widened; status-change emits resolve
- workspace id from store.
-- `transport-ws`: broadcasts include `workspaceId`.
-- `transport-http`: `POST /conversations/:id/open` resolves workspace id and emits
- it.
-- Verified: `tsc -b` EXIT 0, biome clean, **1405 vitest** green.
-
-## Note on assumptions
-
-I'm working autonomously while the user is away. Every assumption I make is
-recorded in `notes/assumptions-log.md` in this repo. Please record any
-assumptions you make while implementing this handoff in your own assumptions log
-so we can raise them when the user returns.
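Reviewer's aside on the parser update (step 3 of the handoff removed above): a minimal sketch of reading `workspaceId` off an incoming `conversation.open` message. Only `workspaceId: string` comes from the `0.19.0` contract described in the doc; the `type` and `conversationId` field names and the function name are assumptions for illustration, not the code in `src/adapters/ws/logic.ts`.

```typescript
// Hypothetical wire shape: `workspaceId` is from the handoff; the other
// field names are assumed.
interface ConversationOpenMessage {
  readonly type: "conversation.open";
  readonly conversationId: string;
  readonly workspaceId: string;
}

/** Returns the parsed message, or null when `raw` is not a valid conversation.open. */
function parseConversationOpen(raw: unknown): ConversationOpenMessage | null {
  if (typeof raw !== "object" || raw === null) return null;
  const record = raw as Record<string, unknown>;
  const conversationId = record["conversationId"];
  const workspaceId = record["workspaceId"];
  if (record["type"] !== "conversation.open") return null;
  if (typeof conversationId !== "string") return null;
  // Reject messages that predate the additive field instead of guessing a workspace.
  if (typeof workspaceId !== "string") return null;
  return { type: "conversation.open", conversationId, workspaceId };
}
```

The tab would then be stamped with the parsed `workspaceId` rather than the viewer's `activeWorkspaceId`, which is the fix steps 4 and 5 describe.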
diff --git a/frontend-workspaces-handoff.md b/frontend-workspaces-handoff.md
deleted file mode 100644
index 1fed1bf..0000000
--- a/frontend-workspaces-handoff.md
+++ /dev/null
@@ -1,216 +0,0 @@
-# Backend handoff — Workspaces (backend → FE) — courier doc
-
-> **From:** arch-rewrite orchestrator · **To:** frontend orchestrator · **Courier:** the user.
-> Response to `backend-handoff-workspaces.md`. This doc finalizes the contract shapes
-> the backend will implement. The FE should re-pin `@dispatch/wire` and
-> `@dispatch/transport-contract` `file:` deps and re-mirror any `.dispatch/*.reference.md`.
-
-## Version bumps
-
-| Package | From | To | Notes |
-|---|---|---|---|
-| `@dispatch/wire` | `0.11.0` | `0.12.0` | Additive: `Workspace`, `WorkspaceEntry`, `ConversationMeta.workspaceId` |
-| `@dispatch/transport-contract` | `0.15.0` | `0.16.0` | Additive: workspace endpoints + `workspaceId` on chat/queue ops |
-| `@dispatch/ui-contract` | `0.2.0` | `0.2.0` | **Unchanged** |
-
----
-
-## 1. Final types — `@dispatch/[email protected]`
-
-```ts
-/**
- * A named, URL-driven grouping of conversations that owns a default cwd.
- * Every conversation belongs to exactly one workspace; conversations that
- * haven't set their own per-conversation cwd inherit `defaultCwd`.
- */
-export interface Workspace {
-  /** The URL slug (immutable). Lowercase `[a-z0-9-]`, 1–40 chars. */
-  readonly id: string;
-  /** Display title (editable). Defaults to `id` on creation. */
-  readonly title: string;
-  /** The workspace's default cwd, or `null` (fall through to server default). */
-  readonly defaultCwd: string | null;
-  /** Epoch-ms when the workspace was first created. */
-  readonly createdAt: number;
-  /** Epoch-ms of the most recent conversation activity in this workspace. */
-  readonly lastActivityAt: number;
-}
-
-/**
- * A workspace entry in the list response — a `Workspace` plus a conversation count.
- */
-export interface WorkspaceEntry extends Workspace {
-  /** Number of conversations assigned to this workspace. */
-  readonly conversationCount: number;
-}
-```
-
-`ConversationMeta` gains a required `workspaceId`:
-
-```ts
-export interface ConversationMeta {
-  readonly id: string;
-  readonly createdAt: number;
-  readonly lastActivityAt: number;
-  readonly title: string;
-  readonly status: ConversationStatus;
-  /** Always present; "default" for legacy/unspecified conversations. */
-  readonly workspaceId: string;
-  readonly compactedFrom?: string;
-}
-```
-
----
-
-## 2. Final types — `@dispatch/[email protected]`
-
-### Additive fields on existing request types
-
-```ts
-export interface ChatRequest {
-  readonly conversationId?: string;
-  readonly message: string;
-  readonly model?: string;
-  readonly cwd?: string;
-  readonly reasoningEffort?: ReasoningEffort;
-  /** Workspace to assign the conversation to. Default "default". Auto-creates if missing. */
-  readonly workspaceId?: string;
-}
-
-export interface QueueRequest {
-  readonly text: string;
-  /** Default "default". Auto-creates if missing. */
-  readonly workspaceId?: string;
-}
-
-export interface ChatQueueMessage {
-  readonly type: "chat.queue";
-  readonly conversationId: string;
-  readonly text: string;
-  /** Default "default". Auto-creates if missing. */
-  readonly workspaceId?: string;
-}
-```
-
-### Workspace endpoint types
-
-```ts
-/** Body of `PUT /workspaces/:id` (all fields optional — the ensure/create call). */
-export interface EnsureWorkspaceRequest {
-  /** Display title. Default: the workspace id. Only used on create; ignored if workspace exists. */
-  readonly title?: string;
-  /** Default cwd. Default: null (inherit server default). Only used on create. */
-  readonly defaultCwd?: string | null;
-}
-
-/** Response of GET/PUT /workspaces/:id — the workspace itself. */
-export interface WorkspaceResponse extends Workspace {}
-
-/** Response of `GET /workspaces` — all workspaces sorted by lastActivityAt desc. */
-export interface WorkspaceListResponse {
-  readonly workspaces: readonly WorkspaceEntry[];
-}
-
-/** Body of `PUT /workspaces/:id/title`. */
-export interface SetWorkspaceTitleRequest {
-  readonly title: string;
-}
-
-/** Body of `PUT /workspaces/:id/default-cwd`. null/absent = clear to server default. */
-export interface SetWorkspaceDefaultCwdRequest {
-  readonly defaultCwd: string | null;
-}
-
-/** Response of `DELETE /workspaces/:id`. */
-export interface DeleteWorkspaceResponse {
-  readonly workspaceId: string;
-  /** Conversations that were closed (status → "closed") by this delete. */
-  readonly closedCount: number;
-}
-```
-
----
-
-## 3. Final endpoint list
-
-| Method & Path | Body | Returns | Notes |
-|---|---|---|---|
-| `GET /workspaces` | — | `WorkspaceListResponse` | Sorted by `lastActivityAt` desc. Includes `conversationCount`. |
-| `PUT /workspaces/:id` | `EnsureWorkspaceRequest?` | `WorkspaceResponse` | **Create-on-miss** (idempotent). Creates with `title=id`, `defaultCwd=null` if missing. Returns existing as-is if present. Slug validated. |
-| `GET /workspaces/:id` | — | `WorkspaceResponse` | Pure read. 404 if missing. |
-| `PUT /workspaces/:id/title` | `SetWorkspaceTitleRequest` | `WorkspaceResponse` | Rename (display only; id unchanged). |
-| `PUT /workspaces/:id/default-cwd` | `SetWorkspaceDefaultCwdRequest` | `WorkspaceResponse` | Set/clear workspace default cwd. |
-| `DELETE /workspaces/:id` | — | `DeleteWorkspaceResponse` | **Closes all conversations** (status → "closed"), reassigns them to "default", then deletes the workspace. 409 for `"default"`. |
-| `GET /conversations` | `?workspaceId=`, `?status=`, `?q=` | `ConversationListResponse` | Additive `?workspaceId=` filter, composable with existing filters. |
-| `DELETE /conversations/:id/cwd` | — | `CwdResponse` | Clears explicit conversation cwd (returns `cwd: null`). |
-
-### Existing endpoints (semantic note, no type change)
-
-- `GET /conversations/:id/cwd` — unchanged: returns the **explicit** conversation cwd (`null` = inheriting workspace default).
-- `GET /conversations/:id/lsp` — now roots LSP at the **effective** cwd; `LspStatusResponse.cwd` returns the effective cwd.
-
----
-
-## 4. cwd resolution (backend-owned)
-
-```
-effectiveCwd = conversationStore.getCwd(conversationId) // explicit per-conversation
-if (effectiveCwd == null) {
-  workspaceId = conversationStore.getWorkspaceId(conversationId) // "default" fallback
-  workspace = conversationStore.getWorkspace(workspaceId)
-  effectiveCwd = workspace?.defaultCwd ?? null
-}
-if (effectiveCwd == null) effectiveCwd = serverDefaultCwd // process.cwd() today
-```
-
-- `GET /conversations/:id/cwd` → explicit cwd only (`null` = inherit).
-- `GET /conversations/:id/lsp` → effective cwd.
-- Turn start (`runTurn` / `warm`) → effective cwd.
-
----
-
-## 5. `DELETE /workspaces/:id` semantics
-
-1. Close all conversations in that workspace (set `status = "closed"`).
-2. Reassign their `workspaceId` to `"default"` (so no dangling reference).
-3. Delete the workspace entity.
-4. Return `{ workspaceId, closedCount }`.
-5. `DELETE /workspaces/default` → HTTP 409.
-
-Closed conversations are hidden from tab-restore (`?status=active,idle` excludes `closed`).
-
----
-
-## 6. Workspace lifecycle / auto-creation
-
-- **Auto-create on turn start:** if `workspaceId` is provided and doesn't exist, the backend auto-creates it (`title = id`, `defaultCwd = null`).
-- **`PUT /workspaces/:id` create-on-miss:** if absent, creates with optional `title`/`defaultCwd` from the body (defaults: `title = id`, `defaultCwd = null`). If present, returns existing as-is.
-- **Slug validation:** `^[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?$` (1–40 chars, lowercase, digits, internal hyphens only). Reject invalid with 400. No normalization. `"default"` allowed but non-deletable.
-- **`"default"` workspace:** always synthesized if not persisted; guaranteed in `GET /workspaces` list.
-- **`lastActivityAt`:** updates when a conversation in the workspace appends, or on first creation. Does NOT update on title/default-cwd changes.
-- **Compaction:** post-compaction conversations inherit the original's `workspaceId`.
-
----
-
-## 7. Answers to FE open questions (Q1–Q8)
-
-| # | Decision |
-|---|---|
-| Q1 | **Close all conversations** in the workspace (status → "closed"), reassign to "default", then delete the workspace. Return `closedCount`. |
-| Q2 | **Add `DELETE /conversations/:id/cwd`** to clear explicit cwd (fall back to workspace default). `PUT` validation unchanged (empty string still 400). |
-| Q3 | **Deferred to v1** — no WS lifecycle push. Fetch-on-mount + manual refresh sufficient. Can add `workspace.created/updated/deleted` later, additively. |
-| Q4 | **`PUT /workspaces/:id`** is the create-on-miss entry point (idempotent, 200). `GET /workspaces/:id` is a pure read (404 if missing). |
-| Q5 | Slug regex `^[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?$`. Reject, don't normalize. `"default"` non-deletable. |
-| Q6 | `Workspace` in `@dispatch/wire`. Request/response bodies in `@dispatch/transport-contract`. |
-| Q7 | Confirmed — backend does nothing beyond `workspaceId` on `ConversationMeta` + `?workspaceId=` filter. |
-| Q8 | Yes — post-compaction conversations inherit `workspaceId`. `forkHistory` copies it. |
-
----
-
-## 8. Gaps resolved (from FE handoff §3)
-
-1. **Unknown workspaceId on turn start** → auto-create (title = id, defaultCwd = null). Typos can be deleted.
-2. **PUT /workspaces/:id initial state** → body accepts optional `title`/`defaultCwd` with defaults (`title = id`, `defaultCwd = null`). Only applied on create; existing workspace returned as-is.
-3. **lastActivityAt on title/default-cwd changes** → no.
-4. **LSP cwd field** → returns effective cwd.
-5. **Conversation count in list** → yes, included as `WorkspaceEntry.conversationCount`.
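Reviewer's aside on the workspaces handoff removed above: the slug rule (§6, Q5) and the cwd fallback chain (§4) are both small enough to sketch as pure functions. The regex is copied from the doc. The function names and the `CwdInputs` shape are hypothetical stand-ins for the store lookups the backend performs, so treat this as an illustration of the rules, not the backend's implementation.

```typescript
// Slug pattern copied verbatim from the handoff: 1 to 40 chars, lowercase
// letters and digits, hyphens only in interior positions.
const WORKSPACE_SLUG = /^[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?$/;

/** True when `id` is an acceptable workspace slug (reject, never normalize). */
function isValidWorkspaceSlug(id: string): boolean {
  return WORKSPACE_SLUG.test(id);
}

// Hypothetical inputs: each field stands in for one lookup in §4.
interface CwdInputs {
  /** Explicit per-conversation cwd, or null when inheriting. */
  readonly conversationCwd: string | null;
  /** The workspace's defaultCwd, or null when unset or the workspace is missing. */
  readonly workspaceDefaultCwd: string | null;
  /** Server default (the doc says this is process.cwd() today). */
  readonly serverDefaultCwd: string;
}

/** Conversation cwd wins, then the workspace default, then the server default. */
function resolveEffectiveCwd(inputs: CwdInputs): string {
  return (
    inputs.conversationCwd ??
    inputs.workspaceDefaultCwd ??
    inputs.serverDefaultCwd
  );
}
```

Using `??` rather than `||` keeps the chain keyed on `null` only, matching the `== null` checks in the doc's pseudocode.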
diff --git a/tasks.md b/tasks.md
deleted file mode 100644
index 137101a..0000000
--- a/tasks.md
+++ /dev/null
@@ -1,1050 +0,0 @@
-# Dispatch — tasks (live progress)
-
-> **Live status + roadmap only.** Completed milestones are summarized, not
-> narrated. Old blow-by-blow history is pruned — it lives in git (`git log`).
-> Keep this lean and current; do not let it re-accrete a step-by-step changelog.
-
-## Status (current)
-`tsc -b` EXIT 0 · biome clean · **1730 vitest** pass (+6 sshd-integration skipped). (worktree `feature/ssh-support`;
-merged `dev` — brings retry-with-backoff (`provider-retry` AgentEvent) + the LSP-dead-server fix alongside the
-SSH waves below.)
-
-## Retry with backoff on retryable provider errors (DONE — from dev)
-When the upstream LLM API returns a retryable error (HTTP 429 / 5xx "overloaded"),
-the kernel now retries `provider.stream()` with a stepped backoff, visibly, until
-the 8h cumulative-sleep budget is exhausted — then emits the final error and
-seals the turn. Retries fire ONLY when no content was emitted yet this step (the
-safety invariant — never duplicate partial output). Plan:
-`notes/retry-with-backoff-plan.md`; report: `reports/retry-with-backoff.md`.
-- **Architecture (kernel hook + shell policy/I/O):** kernel provides the hook
- (`RetryStrategy` contract + the retry loop in `runTurn`); the shell
- (session-orchestrator) provides the policy (the schedule) + the I/O (an
- abortable `setTimeout` sleep). Kernel imports no timer. `retry?` is optional
- → omit = no retry (backward-compatible).
-- **New transient `AgentEvent` variant** `provider-retry` (`@dispatch/wire`),
- emitted once per scheduled retry BEFORE the sleep so the UI can show
- "⚠ retrying in Ns…" immediately; NOT persisted to model history (never
- pollutes the prompt). Final failure is still a persisted `error` + seal.
-- **Schedule:** `5s,10s,30s,60s,5m,10m,15m,30m`, then repeat 30m until 8h of
- cumulative scheduled sleep → ~21 retries then give up. Pure `delayFor(attempt)`.
-- **Retry trigger:** emitted `error` with `retryable===true` → retry;
- `retryable` false/absent → give up; a THROWN error → retryable-by-default
- ONLY when pre-content. All gated on `!hadContent` (text/reasoning/tool-call/usage).
-- **Frontend handoff (5d3f, separate repo `../frontend`):** render
- `provider-retry` as a yellow warning system-message bubble showing `message`
- (+`code`) with the `delayMs` countdown.
-
-## SSH support — transparent remote execution (DONE — waves 0-5c)
-Plan: `notes/ssh-support-plan.md` (decisions locked in §0.5/§13). Orchestrated in
-waves (ORCHESTRATOR.md §2a — pre-author the contract seam, then parallel
-owner-agents on disjoint packages).
-- [x] **Wave 0** (orchestrator): kernel contract seam — `computerId` on
- `ToolExecuteContext` + `RunTurnInput` (additive optional; backward
- compatible). `tsc -b` EXIT 0.
-- [x] **Wave 1** (parallel): `wire` (Computer/defaultComputerId types) +
- `exec-backend` (NEW pkg: ExecBackend contract + LocalExecBackend + handle +
- resolver) + `kernel` runtime (thread computerId through dispatch/run-turn) +
- `conversation-store` (contract fan-out: defaultComputerId + getEffectiveComputer
- + per-conv computerId get/set/clear). `tsc -b` EXIT 0, biome clean, **1592 vitest**
- (was 1549, +43).
-- [x] **Wave 2** (parallel): refactor `tool-shell`/`read-file`/`write-file`/
- `edit-file` behind `ExecBackend` (local-only; spawn.ts deleted — logic moved
- to exec-backend; edit_file gains forward-compatible remote-diagnostics skip).
- `tsc -b` EXIT 0, biome clean, **1599 vitest** (was 1592).
-- [x] **Wave 3** (parallel): `session-orchestrator` (thread computerId end-to-end
- + remote tool-drop filter: drops `lsp` + `__`-namespaced MCP tools when
- remote) + `transport-contract` (ChatRequest.computerId + computer endpoint
- API types). `tsc -b` EXIT 0, biome clean, **1620 vitest** (was 1599).
-- [x] **Wave 4** (parallel): `transport-http` (computer endpoints + `/chat`
- threading + the `ComputerService` seam the ssh package will provide) +
- `transport-ws` (computerId through chat.send/queue) + `mcp` (CR-1: preserve
- computerId in filter). `tsc -b` EXIT 0, biome clean, **1641 vitest** (was 1620).
-- [x] **Wave 5a**: `exec-backend` — remote-backend factory handle (lazy lookup;
- computerId set -> SshExecBackend via factory; absent -> clear error). +24 tests.
-- [x] **Wave 5b**: `ssh` package (NEW) — SshConnectionPool (per-alias ssh2.Client,
- lazy connect, keep-alive, idle reap), SshExecBackend (ssh2 exec+sftp, node:fs
- .code error mapping), ~/.ssh/config reader (ssh-config), known_hosts
- auto-trust-and-pin, key-only auth from ~/.ssh. LOAD-BEARING: ssh2 verified
- under Bun (connected to local sshd :22, exec OK) — decision #1 confirmed.
- Provides remoteExecBackendFactoryHandle + computerServiceHandle. +45 tests
- (6 sshd integration tests skipped). tsc -b EXIT 0, biome clean, **1690 vitest**
- (was 1641).
-- [x] **Wave 5c**: host-bin — register exec-backend + ssh extensions in
- CORE_EXTENSIONS (correct DAG order); transport-http CR-5 barrel re-export of
- computerServiceHandle. orchestrator added missing @dispatch/exec-backend dep to
- host-bin + bun install. **LIVE-VERIFIED**: server boots clean ("Dispatch booted",
- no disabled extensions). tsc -b EXIT 0, biome clean, 1690 vitest (+6 sshd skipped).
-- [x] **Merge dev**: brought retry-with-backoff (`provider-retry` AgentEvent — what
- the FE consumes) + LSP-dead-server fix into the SSH branch. All code files
- auto-merged cleanly; only `tasks.md` conflicted (orchestrator-resolved).
-- [x] **FE handoff #3 (provider-retry merge) — RESOLVED**: FE re-synced both pinned
- file: deps (`@dispatch/wire` + `@dispatch/transport-contract`) against merged
- `feature/ssh-support`; both resolve `TurnProviderRetryEvent`. The 11 provider-
- retry svelte-check errors cleared with ZERO further FE code changes (consumer
- already complete + tested). FE full suite green: typecheck 0/0, 795/795 tests,
- biome clean, vite build OK. Earlier SSH handoffs (#1 wire types, #2 computer
- HTTP API) now also typecheck-clean against the merged wire. Nothing further
- needed from backend on this.
-- [x] **FE final sync check — GREEN, all three handoffs + cross-cutting verified**:
- FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build
- OK, git clean). (1) provider-retry (§2c): TurnProviderRetryEvent resolves;
- assertAgentEventExhaustive covers it (typecheck-green = exhaustive); ChatView
- renders yellow alert-warning bubble w/ attemptLabel + delayLabel (delayMs via
- viewProviderRetry/formatRetryDelay) + code badge, gated {#if providerRetry}.
- (2) SSH handoff #1: Workspace.defaultComputerId + Computer/ComputerEntry resolve;
- 2 Workspace literals supply defaultComputerId: null; catalog flows through
- store.computers. (3) SSH handoff #2: full src/features/computer/ (ComputerField
- w/ per-conv selector + connection-status badge + Test-connection polling;
- ComputerSelect reusable; store computerId/refreshComputer/setComputer + computers
- catalog on boot + computerStatus/testComputer; WorkspaceCard default-computer
- selector via setDefaultComputer) — 20 view-model tests, typecheck-clean, chat.send
- unchanged. CROSS-CUTTING (key integration question): GREEN, no collision —
- provider-retry is WS-stream → TranscriptState.providerRetry → ChatView (transcript,
- keyed activeConversationId); computer is HTTP-ONLY (imports NO AgentEvent/chunks/
- TranscriptState) → AppStore.computerId (per-conv persisted) → ComputerField (sidebar,
- keyed currentConversationId). Disjoint state, disjoint channels (WS vs HTTP),
- disjoint regions (transcript vs sidebar), disjoint mount keys. The conversation-
- switch lifecycle is the only shared touchpoint and is correct + independent.
- assertAgentEventExhaustive confirms computer is NOT an AgentEvent (HTTP-only).
- We're done — nothing further needed from either side.
-- [ ] **DEFERRED — CR-6 usageCount**: `listComputers()` returns `usageCount: 0` until a
- conversation-store count-by-alias helper + host-bin wiring is added (non-blocking —
- discovery/connect/execute all work; only the count badge shows 0). Follow-up.
-- [ ] **DEFERRED — cache-warming**: computerId threading intentionally NOT done
- (user-deferred — cache-warming is not needed right now). Known limitation:
- a warm probe on a remote turn assembles the tool set WITHOUT the remote-drop
- → a potential prompt-cache miss (performance-only, not correctness). Revisit
- when cache-warming is re-enabled.
-Key decisions: ssh2 + ssh-config (project-local deps); key-only auth from
-`~/.ssh`; auto-trust-and-pin host keys; computers discovered read-only from
-`~/.ssh/config` (no CRUD entity); computerId persisted per-conversation; LSP/MCP
-silently dropped on remote turns; edit_file works w/o diagnostics remotely.
-
-## Per-edit LSP diagnostics auto-append (DONE)
-After a successful `edit_file`, the extension now calls LSP `getDiagnostics` on the
-post-edit buffer and appends any errors/warnings (severity ≤ 2) to the tool result —
-so the model sees lint/diagnostics feedback inline without a separate round-trip.
-Multi-server aggregation queries ALL connected servers matching the file's extension
-(not just the first), merging diagnostics tagged by source (`[steep]`, `[ruby-lsp]`, etc.).
-Incremental sync (`textDocument/didChange`) captures each server's `change` kind during
-`initialize` and computes prefix/suffix diff ranges for `change:2` servers, full content
-for `change:1`. New pure `diff.ts` (`computeChangeRange` + `offsetToPosition`, O(n)).
-60s timeout; slow warning if >10s; graceful degradation when no LSP available. Generic
-— works for any LSP. `languageId` mapping extended (`.rb`/`.rbs`/`.c`/`.cpp`/etc.).
-- [x] Wave 1 — `packages/lsp/` (single unit): diff.ts, client, tool, diagnostics, language, types, extension. 15 new diff tests + multi-server tool test.
-- [x] Wave 2 — `packages/tool-edit-file/`: optional dep on `@dispatch/lsp` via `host.getService()` (not manifest `dependsOn`); appends diagnostics after successful edit.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1468 vitest** pass (was 1453, +15).
-- [x] **LIVE-VERIFIED** (production dispatch-server :24991): edit_file now surfaces LSP diagnostics inline — a deliberate type error (`const x: number = "not a number"`) in a .ts file produces `[TypeScript Language Server] ERROR (2322) L3:9: Type 'string' is not assignable to type 'number'` appended to the edit result. Required a lazy LSP service lookup fix (commits e03a96e + d4ff45c) — tool-edit-file activates at position 5 in CORE_EXTENSIONS while lsp activates at position 20, so getService always threw at activation time.
-
-## MCP (Model Context Protocol) integration (DONE)
-Dispatch is now an MCP host. A new `mcp` standard extension (`packages/mcp/`) spawns
-configured MCP servers (stdio child processes), performs the MCP handshake, discovers
-tools via `tools/list`, and registers each as a first-class Dispatch `ToolContract` via
-`host.defineTool`. When the model calls an MCP tool, the extension proxies to `tools/call`
-on the MCP server and returns the flattened result. Config: `.dispatch/mcp.json` (servers
-key) → `opencode.json` mcp key fallback, resolved per-cwd (mirrors LSP). Tool names namespaced
-as `<serverId>__<toolName>`. A `toolsFilter` drops tools from disconnected servers. Phase 1:
-stdio only, Tools only (no Resources/Prompts/HTTP/sampling). Hand-rolled JSON-RPC (zero deps).
-- **Design:** `notes/mcp-design.md` + `PLAN-mcp.md`.
-- [x] Wave 1 — `packages/mcp/` (agent via dispatch CLI): 12 source + 8 test files, 69 tests.
-- [x] Wave 2 — orchestrator: root tsconfig ref, host-bin CORE_EXTENSIONS registration, bun install.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1537 vitest** pass (was 1468, +69).
-- [x] **LIVE-VERIFIED** (production dispatch-server :24991): a minimal test MCP server (stdio,
- one `ping` tool) configured in `.dispatch/mcp.json` → model discovered `test__ping`,
- called it with `{"msg":"hello"}`, received `pong` — full turn lifecycle (tool-call →
- tool-result → done). Tool name namespacing (`<serverId>__<toolName>`) confirmed on the wire.
-- **Bug found + fixed during live-verify:** `edit_file` tool was missing from the toolset
- because the per-edit diagnostics change called `host.getService(lspServiceHandle)` at
- activation time, but `tool-edit-file` activates BEFORE `lsp` in CORE_EXTENSIONS → getService
- threw → activate crashed → tool never registered. Fix: lazy lookup at edit time (commits
- e03a96e, d4ff45c).
-
-## Broken-chat self-repair (read-time reconcile) (DONE)
-Conversation `77574596` broke unrecoverably: `reconcile()` only repaired orphaned
-tool-calls, not (a) a trailing assistant message whose only chunk is `error`
-(serializes to empty content → uncontinuable) and (b) a `tool-call` whose `input`
-is a raw malformed-JSON string (re-sent as OpenAI `arguments` → provider 400s on
-every continuation). `load()` also had no try/catch on `JSON.parse` (one corrupt
-row would brick a chat). Fix = read-time repair so broken chats auto-heal on next
-open — NO DB surgery (append-only preserved; repair is a turn-path transform on
-`load()`). Full diagnosis + plan: `broken-chat-repair-handoff.md` +
-`reports/broken-chat-repair-diagnosis.md`.
-- **Layer 1 — `conversation-store` `reconcile.ts` (protects ALL providers):**
- `reconcileWithReport` now (1) strips `error` chunks from assistant messages, (2)
- drops any assistant message left with no `text`/`tool-call` (the emptied error-only
- msg — safe: never followed by a `tool` msg), (3) keeps orphaned-tool-call synthesis
- unchanged. `ReconcileReport` +2 additive counts (`strippedErrorChunks`,
- `droppedEmptyMessages`) for the repair span. `loadSince` (FE reads) intentionally
- NOT reconciled — the user still SEES the error while the provider gets clean history.
- **Hardening:** `store.ts` `load()` wraps per-chunk `JSON.parse` in try/catch →
- corrupt row skipped (log + continue), reconcile runs on the rest. +6 reconcile/store
- tests.
-- **Layer 2 — `openai-stream` `convert-messages.ts` (per-provider args safety):** new
- pure `serializeToolArguments` — object→stringify; valid-string→parse+restringify;
- malformed-string→fallback `{ _malformed_arguments: <truncated 200> }`. Output ALWAYS
- `JSON.parse`s → provider stops 400ing on stored malformed args. +4 tests.
-- **Layer 2 (equiv) — `../claude` `provider-anthropic` `convert.ts`:** `safeJson` now
- returns a valid object fallback (`{ _malformed_arguments: s.slice(0,200) }`) on
- parse failure, not the raw string (`tool_use.input` must be an object for Anthropic).
- Exported for direct testing. +3 tests. (Separate repo, separate agent.)
-- **Wave 1+2 (parallel, disjoint):** conversation-store + openai-stream (arch-rewrite)
- + provider-anthropic (`../claude`). All in-lane; zero internal mocks; no contract/type
- change. Reports: `reports/conversation-store.md`, `reports/openai-stream.md`,
- `../claude/reports/provider-anthropic.md`.
-- [x] Verified: arch-rewrite `tsc -b` EXIT 0, biome clean, **1453 vitest** (was 1443);
- `../claude` `tsc -b` EXIT 0, 71 vitest, biome clean. Both pure-core units zero
- internal mocks.
-- [x] **LIVE-VERIFIED** (dev stack `bin/up` :24203): reproduced 77574596's REAL broken
- tail (the actual malformed-args tool-call + trailing error chunk) in the dev DB;
- `POST /chat` continued it cleanly (`text-delta:"OK"` → `done` reason `"stop"`, no
- 400) — the provider accepted the reconciled history (error stripped, args sanitized).
- The historical error chunk remains in storage by design (read-time repair only); no
- new error was appended. Cleaned up the test conversation after.
-
-## LSP — broken-server recovery + config source attribution (DONE)
-Handoff from an agent running in raylib-jamstack (configuring ruby-lsp under the
-installed Dispatch harness `/usr/bin/dispatch-server`): two issues found by
-decompiling the running binary. (Previous orchestrator session 77574596 did the
-investigation + Wave 0 + wrote the prompt; its chat broke mid-summon — resumed.)
-- **Issue 2 (blocker):** a failed LSP server was `broken` FOREVER — the manager's
- `broken` set (keyed `${serverId}:${root}`) was cleared ONLY in `shutdownAll()`, so a
- server that failed (bad env, missing binary, OR a since-fixed bad config) stayed
- `state:"error"` for the whole process. For an agent running *inside* dispatch the
- only recovery (server restart) kills its own session.
-- **Issue 1:** `.dispatch/lsp.json` (read first) silently shadowed `opencode.json`'s
- `lsp` key — a broken entry won with no warning, and the caller couldn't tell which
- config source a server came from (`status()` was its only visibility).
-- **Wave 0 (orchestrator, contracts):** additive `readonly configSource?: string` on
- `LspServerInfo` (`@dispatch/transport-contract` `0.20.0→0.21.0`) + a type-test
- assertion (8→9). tsc/biome/vitest clean.
-- **Wave 1 — `lsp` extension:** (a) broken-server now self-heals when its *resolved
- config changes* since it was marked broken (a config edit is a discrete event → no
- retry storm; bounded backoff for transient failures); (b) `configSource?` mirrored on
- `LspServerStatus` + populated in `status()` (`.dispatch/lsp.json` / `opencode.json` /
- `built-in`); (c) shadow warning via `host.logger` when both configs declare lsp; (d)
- spawn-failure `error` strings now name the config source. 6 required named tests +
- extras. Report: (agent cut off before writing `reports/lsp.md`; work independently
- verified — 50 lsp tests, tsc EXIT 0, biome clean).
-- **Wave 1 CR (transport-http):** the `GET /conversations/:id/lsp` handler mapped
- `LspServerStatus`→`LspServerInfo` field-by-field and DROPPED `configSource` (never
- reached the wire). Summoned the transport-http owner for the one-line conditional-spread
- pass-through (mirrors `error`, honors `exactOptionalPropertyTypes`) + a named pass-through
- test (present + undefined-omitted). Report: `reports/transport-http.md`.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1443 vitest** pass; all agents in-lane
- (only packages/lsp + transport-contract + transport-http touched; pre-existing
- uncommitted WIP in kernel/tool-shell left untouched). Zero internal mocks.
-- [x] **LIVE-VERIFIED** (dev stack `bin/up` on :24203, new code via `--watch`):
- (A) `configSource` reaches the wire — built-in TS server reports
- `configSource:"built-in"`, `state:"connected"` (Wave 0 + transport-http pass-through
- confirmed end-to-end); (B) a broken server (`.dispatch/lsp.json` → nonexistent binary)
- reports `state:"error"` + `configSource:".dispatch/lsp.json"` + a source-named error
- string (`broken-ts [from .dispatch/lsp.json]: Executable not found in $PATH: …`);
- (C) **recovery without restart** (the blocker) — same conversation/process went
- `error`→`connected` after the config was fixed (config change clears the broken key →
- re-spawn → connects); (D) no retry storm — repeated `status()` with no config change
- stays `error`; (E) shadow warning logged via `host.logger` (`extensionId:"lsp"`,
- level `warn`) when both `.dispatch/lsp.json` and `opencode.json` declare lsp.
-
-## Per-conversation model persistence (DONE)
-Bug: a chat's selected provider + model was NOT persisted per conversation.
-Opening the same chat in a new browser session defaulted to the server's
-default model rather than recalling the originally selected one.
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract`
- `0.19.0→0.20.0` — additive `ModelResponse` + `SetModelRequest` types for
- `GET/PUT /conversations/:id/model`.
-- **Wave 1 — `conversation-store`:** `getModel`/`setModel` (`model:<id>` key,
- mirrors `getReasoningEffort`/`setReasoningEffort`); `forkHistory` copies model;
- empty string clears (idempotent). +13 tests.
-- **Wave 2 (parallel):** `session-orchestrator` (resolve model from persisted
- store when no per-turn override → `resolveModel`; persist the resolved model
- so it sticks; warm path parity; `resolveModelName` pure helper; +4 tests) +
- `transport-http` (`GET/PUT /conversations/:id/model` with validation +
- `parseModelBody` pure validator; +10 tests).
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1433 vitest** pass; all in-lane.
-
-## System-prompt stale on cwd change (DONE)
-Bug: the system-prompt service constructed the resolved prompt once on the first
-turn and reused it via `get()` on subsequent turns (cache-safe design). But the
-prompt is cwd-sensitive (`[file:AGENTS.md]`, `[prompt:cwd]` variables). When a
-conversation's cwd changed after the first turn, the cached prompt was stale —
-referenced files from the new cwd were not loaded.
-- **Wave 1 — `system-prompt`:** added `getWithMeta(conversationId)` returning
- `{ prompt, cwd }` — reads both `resolved:<id>` and a new `resolved-cwd:<id>`
- sibling key. `construct()` now also stores the cwd. All additive, no existing
- method signature/behavior changed. +5 tests.
-- **Wave 2 — `session-orchestrator`:** subsequent turns call `getWithMeta`,
- compare stored cwd vs `effectiveCwd ?? process.cwd()`, and `construct` if they
- differ (or if no stored prompt exists). Compaction path (always constructs)
- and warm path (no system prompt) unaffected. +1 test.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1411 vitest** pass; both in-lane.
-- No FE handoff needed (backend-only fix; no contract version bump).
-
-## Workspace tab issue — conversation.open drops workspaceId (DONE)
-Cross-repo additive fix: `conversation.open` / `conversation.statusChanged` WS
-broadcasts now carry the conversation's persisted workspace id, so a frontend
-opens/focuses a tab in the correct workspace instead of the viewer's current
-workspace (`activeWorkspaceId`). CLI `dispatch <model> --open --workspace my-ws`
-now opens only in `my-ws`.
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract`
- `0.18.0→0.19.0` — additive `readonly workspaceId: string` on
- `ConversationOpenMessage` and `ConversationStatusChangedMessage`.
-- **Wave 1 (parallel):** `session-orchestrator` (add `workspaceId` to
- `ConversationOpenedPayload`/`ConversationStatusChangedPayload`; resolve from
- `conversationStore.getWorkspaceId` at all status-change emit sites) +
- `transport-ws` (thread `workspaceId` from hook payload into WS broadcasts) —
- disjoint packages.
-- **Wave 2:** `transport-http` — `POST /conversations/:id/open` now awaits
- `getWorkspaceId(conversationId)` and emits `conversationOpened` with it.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1405 vitest** green; all agents in-lane.
-- [x] **FE courier** to `29ae`: `frontend-workspace-open-handoff.md` — parse/use
- `workspaceId` from `conversation.open` and `conversation.statusChanged`;
- re-pin `@dispatch/transport-contract` `0.19.0`; re-mirror reference.md.
-
-## LSP cwd resolution — server-default fallthrough + workspace assignment (DONE)
-Bug: `GET /conversations/:id/lsp` called `getEffectiveCwd` directly, which falls through
-to `serverDefaultCwd` (`process.cwd()`) when no conversation cwd is set — the LSP
-connected on the wrong dir. Additionally, a new conversation's workspace isn't assigned
-until the first `chat.send`, so `getEffectiveCwd` resolved against `"default"` (not the
-intended workspace) when the FE set the cwd before the first turn.
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.16.0→0.17.0` —
- additive `SetCwdRequest.workspaceId?: string` + updated `LspStatusResponse.cwd` comment
- ("resolved working directory the LSP connects on, or null when no cwd is set").
-- **Wave 1 — transport-http:** `GET /conversations/:id/lsp` now gates on `getCwd`
- (persisted) first — returns `{ cwd: null, servers: [] }` when no cwd set (LSP does NOT
- connect); only calls `getEffectiveCwd` + `lspService.status()` when a persisted cwd
- exists. `PUT /conversations/:id/cwd` now accepts optional `workspaceId` — validates
- with `isValidWorkspaceSlug`, then `ensureWorkspace` → `setWorkspaceId` → `setCwd`
- (assigns workspace before persisting cwd). 5 new tests + 1 assertion updated.
- Report: `reports/transport-http.md`.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1332 vitest** pass; agent in-lane.
-- [x] **FE courier** sent to FE agent `ffe3`: `frontend-lsp-cwd-workspace-handoff.md`
- — send `workspaceId` on `PUT /conversations/:id/cwd`; `GET /conversations/:id/lsp`
- now returns `cwd: null` + empty `servers` when no working dir is set.
-
-## Workspace cwd fallthrough + relative resolution (DONE)
-FE courier in: bug report + behavior change (`workspace defaultCwd` not used at turn start when
-a conversation has no explicit cwd; plus per-conversation cwd should be **relative to the workspace
-`defaultCwd`** unless absolute). Resolution is backend-owned (the FE omits `cwd` on `chat.send`).
-- **Scope:** single unit — `conversation-store` owns `getEffectiveCwd` (already consumed unchanged
- by `session-orchestrator` turn/warm + `transport-http` `GET /conversations/:id/lsp`), so no
- cross-package surface change and no fan-out. `GET /conversations/:id/cwd` uses `getCwd` (raw
- explicit cwd) — unchanged.
-- [x] **conversation-store** — added injectable `serverDefaultCwd` (default `process.cwd()`) to
- `createConversationStore`; rewrote `getEffectiveCwd` with the new algorithm: explicit conversation
- cwd null → `workspaceCwd ?? serverDefaultCwd` (bug fix: was returning null, skipping the workspace
- default); absolute (starts `/`) → overrides; relative → `path.resolve(workspaceCwd ??
- serverDefaultCwd, conversationCwd)`. Public signature `(conversationId) => Promise<string | null>`
- unchanged. 8 regression tests. Report: `reports/conversation-store-workspace-cwd.md`.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1289 vitest** pass; agent in-lane; zero internal mocks.
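The resolution algorithm described for `getEffectiveCwd` can be sketched as a pure function. This is a sketch under assumptions: `resolveEffectiveCwd` and `resolvePosix` are hypothetical names, `resolvePosix` is a simplified POSIX-only stand-in for Node's `path.resolve` (which the notes say the real code uses), and the real method is async and takes a conversation id rather than the three values directly.

```typescript
// Simplified POSIX-only stand-in for `path.resolve(base, rel)`.
// Assumes `base` is an absolute path.
function resolvePosix(base: string, rel: string): string {
  const out: string[] = [];
  for (const seg of `${base}/${rel}`.split("/")) {
    if (seg === "" || seg === ".") continue; // skip empty and current-dir segments
    if (seg === "..") out.pop(); // step up one directory
    else out.push(seg);
  }
  return `/${out.join("/")}`;
}

function resolveEffectiveCwd(
  conversationCwd: string | null,
  workspaceCwd: string | null,
  serverDefaultCwd: string,
): string {
  const base = workspaceCwd ?? serverDefaultCwd;
  // No explicit cwd: fall through to the workspace default, then the server default.
  if (conversationCwd === null) return base;
  // Absolute cwd (starts with "/"): overrides the base entirely.
  if (conversationCwd.startsWith("/")) return conversationCwd;
  // Relative cwd: resolved against the base.
  return resolvePosix(base, conversationCwd);
}
```

With a workspace default of `/home/tradam/projects/dispatch`, a relative `arch-rewrite` resolves to `/home/tradam/projects/dispatch/arch-rewrite`, and a missing conversation cwd yields the workspace default rather than null.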
-
-## Per-turn cwd override not resolved relative to workspace (CURRENT — live-found)
-Live investigation (dev stack, tab 4ef4 in workspace `test` with
-`defaultCwd=/home/tradam/projects/dispatch`): `getEffectiveCwd` resolves a persisted relative cwd
-correctly (LSP endpoint + a chat **omitting** `cwd` both return
-`/home/tradam/projects/dispatch/arch-rewrite`). BUT a per-turn `cwd`
-sent on `chat.send` is used **as-is** by `session-orchestrator` (`cwd !== undefined ?
-Promise.resolve(cwd)`, orchestrator.ts:360), bypassing `getEffectiveCwd`. So raw `arch-rewrite`
-reaches `run_shell` → `resolve("arch-rewrite")` = `<process.cwd>/arch-rewrite` (nonexistent) → `pwd`
-broken; `./` → `resolve("./")` = `process.cwd()` (valid) → "works". The FE sends the CwdField value
-as a per-turn `cwd` (transport-ws threads it: router.ts:173 → extension.ts:277).
-- **Fix (2 waves):** add an optional `overrideCwd?: string` to `ConversationStore.getEffectiveCwd`
- (resolve the override if provided, else the persisted `getCwd` — same relative algorithm), then
- `session-orchestrator` passes the per-turn `cwd` (turn start + warm `opts.cwd`) as the override.
-- [x] **Wave 1 — conversation-store:** added `overrideCwd?` param + impl + tests.
-- [x] **Wave 2 — session-orchestrator:** pass per-turn cwd as override (turn start + warm) + tests.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1298 vitest** pass; both agents in-lane; zero
- internal mocks.
-- [x] **LIVE-VERIFIED** (dev stack, workspace `test` defaultCwd `/home/tradam/projects/dispatch`):
- a per-turn `cwd:"arch-rewrite"` on an existing conversation (assigned to `test`) → `pwd`
- returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved, not broken). Both the
- omit-cwd path (Wave 0) and the per-turn-cwd path (Wave 2) confirmed working.
-- **Known edge case (pre-existing, not a regression):** a brand-NEW conversation's FIRST turn runs
- `getEffectiveCwd` *before* the workspace is assigned (orchestrator.ts assigns it later in the
- IIFE), so a relative per-turn cwd resolves against the "default" workspace (server default)
- instead of the intended one. Uncommon (CwdField typically set after the first message). Deferred.
-- **Note (separate pre-existing bug, not touched):** `DELETE /conversations/:id/cwd` returns
- `cwd:null` but does NOT clear the persisted cwd (transport-http app.ts:538 — the route is a stub).
-
-## Cwd edge cases — timing + DELETE stub (DONE)
-Two pre-existing bugs surfaced during live-verify of the relative-cwd fix:
-- **Edge 1 (timing):** a NEW conversation's first turn ran `getEffectiveCwd` BEFORE the workspace
- was assigned, so a relative per-turn cwd resolved against `"default"` (server default) not the
- intended workspace. **Fix:** session-orchestrator now assigns the workspace (for new
- conversations, detected via `getConversationMeta === null`) BEFORE resolving the effective cwd;
- removed the duplicate assignment site. 3 tests.
-- **Edge 2 (DELETE stub):** `DELETE /conversations/:id/cwd` returned `{cwd:null}` but did NOT
- clear the persisted cwd (no `clearCwd` on the store). **Fix:** conversation-store added
- `clearCwd(id)` (`storage.delete(cwdKey)`, idempotent) + tests; transport-http DELETE handler now
- `await clearCwd` for real.
-- [x] **Wave A (parallel):** conversation-store (clearCwd) + session-orchestrator (timing) — disjoint.
-- [x] **Wave B:** transport-http (DELETE handler uses clearCwd).
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1311 vitest** pass; all in-lane; zero internal mocks.
-- [x] **LIVE-VERIFIED** (dev stack): Edge 2 — PUT→GET(`/tmp/test`)→DELETE→GET(`null`) actually
- cleared. Edge 1 — NEW conversation, workspace `test`, per-turn `cwd:"arch-rewrite"` → `pwd`
- returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved against workspace default, not
- broken).
-- [x] **FE courier handoff** written + sent: `frontend-cwd-resolution-handoff.md` couriered to FE
- orchestrator conversation `b18a` via `dispatch send b18a --queue` (turn started). Behavior-only
- — no `@dispatch/wire`/`transport-contract`/`ui-contract` version bumps; no FE contract change
- needed. Notes: `DELETE /conversations/:id/cwd` now actually clears; per-turn `cwd` on `chat.send`
- resolved relative to workspace `defaultCwd`; FE MAY omit `cwd` on `chat.send` (backend resolves
- persisted).
-
-Built and verified live (full-fidelity: every feature is a manifest-loaded
-extension through the host):
-- **kernel** — contracts (ABI), bus, `runTurn` turn loop, extension host.
-- **core extensions** — storage-sqlite, auth-apikey, provider-openai-compat
- (OpenCode Go), conversation-store, session-orchestrator, transport-http,
- credential-store; tool extensions `read_file` (files + directory listing), `run_shell`,
- `edit_file`, `write_file`.
-- **observability** — structured Logger/Span ABI + journal-sink → out-of-process
- collector → trace-store (`bun:sqlite`); host-bin supervises the collector;
- nested turn→step→{prompt, provider.request, ttft, decode} spans; D5 verbatim
- provider capture (self-redacted); `trace-replay` record/replay lib + fixtures.
-- **CLI** — one-shot HTTP client (`bun packages/cli/src/main.ts`); `GET /models`,
- `--cwd`, `--conversation`.
-- **web frontend** — SEPARATE repo `../frontend`. Slice 1 (surface system)
- shipped via `ui-contract` + `surface-registry` + `transport-ws` +
- `surface-loaded-extensions`. Slice 2 (browser chat) in progress there.
-
-## How to run
-```bash
-# .env auto-loads DISPATCH_API_KEY (do NOT re-export) and pins BACKEND_PORT (beats PORT).
-# Private probe instance: override the port + ISOLATE data paths (ORCHESTRATOR §8):
-BACKEND_PORT=4567 SURFACE_WS_PORT=4569 DISPATCH_DB=/tmp/opencode/probe/dispatch.db \
- DISPATCH_TRACE_DB=/tmp/opencode/probe/traces.db DISPATCH_JOURNAL=/tmp/opencode/probe/app.ndjson \
- bun packages/host-bin/src/main.ts # boots app + collector
-curl -s -X POST localhost:4567/chat -H 'content-type: application/json' \
- -d '{"conversationId":"c1","message":"Say hello in 3 words."}' # field = conversationId
-```
-Process cleanup uses the `[x]` bracket trick (ORCHESTRATOR §8) — leaked
-server/collector procs poison the next run's counts.
-
-**Two stacks:** `bin/up` = dev (live-reload backend, ports 24203/24205/24204).
-`../bin/up2` = a **stable, no-watch** second stack on **25203/25205/25204** with
-ISOLATED data (`./.dispatch-data/up2/`, `./.dispatch/journal/up2/`) — runs ALONGSIDE
-`bin/up`, edit backend code freely without restarting it; Ctrl-C stops only itself.
-Enabled by a new env knob **`SURFACE_WS_PORT`** → `surfaceWsPort` config
-(`host-bin/config.ts`; default 24205 when unset, so dev is unchanged).
-
-## Foundation (done — summarized; details in git)
-- **MVP + multi-turn:** curl → transport-http → session-orchestrator →
- host/registry → provider → OpenCode Go → AgentEvents → NDJSON;
- `conversationId` threads history.
-- **Post-MVP:** auth→provider seam; `read_file` tool (live tool-dispatch loop);
- `getHostAPI()` hygiene; `tabId → conversationId` rename.
-- **Observability Phase A/B:** the substrate + collector/store + supervision +
- replay fixtures (see bullet list above).
-- **CLI MVP:** credential-store + transport-contract + cli; model catalog; cwd
- threading; multi-turn.
-- **FE Slice 1:** the surface system across both repos (live WS probe verified).
-- **FE Slice 2 backend prereqs:** `@dispatch/wire` split; per-chunk `seq` cursor;
- read endpoint `GET /conversations/:id?sinceSeq=`; WS chat-deltas (transport-ws);
- turn-lifecycle events (`turn-start`/`done`/`turn-sealed`); step grouping
- (`stepId` on tool chunks/events); live stream metrics (`step-complete` +
- `usage`/`done` token/timing — "Pass 1"); CORS.
-
-## Metrics — token + timing (current milestone)
-- [x] **Pass 1 — live stream metrics** (done): `step-complete` event +
- `usage`(stepId) + `done`(durationMs + aggregate usage).
-- [x] **Observability spans** (done): turn & step span-close stamp all four
- `Usage` fields (added cacheRead/cacheWrite; normalized `usage_*` → `usage.*`).
-- [x] **Pass 2 — persisted replay metrics** (done, was deferred): `StepMetrics`/
- `TurnMetrics` wire types; conversation-store `appendMetrics`/`loadMetrics`
- (separate key space, turn-append order); session-orchestrator accumulates
- per-step+turn metrics from the event stream and persists after seal;
- transport-http `GET /conversations/:id/metrics` → `ConversationMetricsResponse`.
- `@dispatch/wire` + `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`.
-- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done`
- metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate +
- per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans
- carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix).
-- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in
- this repo; user couriers to `../frontend`; ORCHESTRATOR §7).
-
-## dedup / storage growth (DONE)
-Design `notes/observability-design.md` §12. User-gated calls: extend existing
-pipeline (no new ext); scope = **de-dup + retention/rotation** (D9 roll-ups
-deferred); dedup = **content-addressed bodies** (body-hash, NOT fingerprint-gated).
-- [x] **Wave 1 — `trace-store`**: content-addressed `bodies` table (SHA-256),
- at-rest gzip (>1 KiB), `prune(policy)` (age + drop-oldest byte-cap + orphan GC) /
- `RetentionPolicy` / `PruneSummary` / `DEFAULT_RETENTION` (7d/256MiB); reads
- transparent.
-- [x] **Wave 2 — `observability-collector`**: pure `shouldPrune` cadence helper;
- `main.ts` calls `store.prune(DEFAULT_RETENTION)` on a coarse cadence
- (`--prune-interval-ms`, default 60s; host-bin-overridable), log-and-continue on
- error.
-- [x] Glossary: added content-addressed body, trace retention, prefix fingerprint,
- warm vs real.
-- [x] **Migration bug** (found by live boot, fixed): Wave 1 created the
- `idx_records_bodyHash` index BEFORE running `migrateOldBodies`, so opening a
- pre-existing OLD-schema `traces.db` crashed the collector
- (`no such column: bodyHash`, crash-looped). Fix = reorder migration before the
- index + 3 regression tests that seed a real old-schema DB. bun 106→109.
-- Tests: bun 89→109. typecheck/biome clean. **Live-verified** against a real
- old-schema `traces.db`: 0 crashes, collector stays up, schema migrates
- (bodyHash + content-addressed bodies), real-data dedup (318 body refs → 270
- stored bodies), prune cadence fires cleanly (14× `prune completed`). Optional
- follow-up: host-bin env-override for the retention policy.
-
-## Standard tools — fs + shell (DONE)
-User-gated calls: **one tool per extension** (matches `tool-read-file` precedent); tools are
-**standard** tier (a turn completes with `tools:[]`, §2.6/§2.8). **Zero ABI change** — the
-`ToolContract`/`ToolExecuteContext` already carry `signal`/`onOutput`/`cwd`/`log`.
-- **Wave 1 (parallel, disjoint pkgs, kernel-only dep) — all green:**
- - [x] `tool-read-file` — EXTENDED `read_file` to list directory contents (sorted, `/`-suffixed
- subdirs; files unchanged). 41 tests.
- - [x] `tool-shell` (new) — `run_shell`: foreground, streamed via `ctx.onOutput`, `ctx.signal`
- cancel, `ctx.cwd`, timeout + output cap, `concurrencySafe:false`; injected `spawn`. 31 tests.
- - [x] `tool-edit-file` (new) — `edit_file`: `oldString`/`newString`/`replaceAll`; errors on
- absent/non-unique/identical; workdir-contained; `concurrencySafe:false`. 38 tests.
- - [x] `tool-write-file` (new) — `write_file`: explicit `overwrite` flag (absent+unset→create;
- exists+unset→error; exists+true→overwrite; absent+true→error); no parent auto-create. 33 tests.
-- **Wave 2 (done):** orchestrator added 3 root tsconfig refs + `bun install`; host-bin owner
- registered the 3 new extensions in `CORE_EXTENSIONS` (same pattern as `read_file`).
-- **Live-verified:** clean boot (`Dispatch booted`, collector up, no activation/capability-gate
- error — the new `shell` capability is accepted); full-graph `tsc -b` EXIT 0, biome clean.
-- **Recovery notes (scar tissue):** `tool-write-file` first returned plan-only (§5a) → re-summoned
- with "IMPLEMENT NOW". `tool-edit-file` hung vitest at collection — `computeReplacement` infinite-
- looped on empty `oldString` (`"".indexOf("") === 0`, index never advances) invoked at a test's
- `describe` scope; fixed with an early empty-string guard + validation. One agent deleted
- `ORCHESTRATOR.md` out-of-lane → caught by post-wave `git status`, restored from git.
-- Deferred (not selected): `glob`, `grep`/`search_code`, background shells.
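The empty-`oldString` hang from the recovery notes can be illustrated with a small replacement helper. This is a sketch, not the `tool-edit-file` source: the signature of `computeReplacement`, the `EditResult` type, and the error messages are all assumptions. The point is the early guard placed before any scanning.

```typescript
type EditResult = { ok: true; text: string } | { ok: false; error: string };

function computeReplacement(
  source: string,
  oldString: string,
  newString: string,
  replaceAll = false,
): EditResult {
  // Guard first: "".indexOf("") is 0 and the match has zero length, so a
  // scan loop that advances by the match length would never terminate.
  if (oldString === "") return { ok: false, error: "oldString must not be empty" };
  if (oldString === newString) return { ok: false, error: "oldString and newString are identical" };

  // Count non-overlapping occurrences, advancing past each match.
  let count = 0;
  let from = source.indexOf(oldString);
  while (from !== -1) {
    count += 1;
    from = source.indexOf(oldString, from + oldString.length);
  }
  if (count === 0) return { ok: false, error: "oldString not found" };
  if (count > 1 && !replaceAll) {
    return { ok: false, error: "oldString is not unique; set replaceAll" };
  }
  // split/join replaces every non-overlapping occurrence (exactly one here
  // unless replaceAll is set).
  return { ok: true, text: source.split(oldString).join(newString) };
}
```

Because the guard returns before the loop, calling the helper with an empty `oldString` at module or `describe` scope fails fast instead of stalling test collection.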
-
-## Skill system + load_skill tool (DONE)
-User-gated calls: skills list lives in the **`load_skill` tool definition** (NOT the system prompt),
-refreshed **per new turn** (cache-stable across steps), **live file read** on execute. One `skills`
-standard extension (loader + filter + tool). Skill = md in `.skills/`; discovered from `~/.skills` +
-`<cwd>/.skills` (cwd shadows home); name = filename w/o `.md`. Format: line1 = summary,
-line2 = `---`, body = line3+; on load the first two lines are stripped; malformed (no `---`) =
-no summary but still loadable. Glossary: added `skill`, `skill summary`, `tools filter`.
-- **Mechanism — the per-turn `tools` filter chain** (first concrete use of the §3.2 context-assembly
- chain; reusable for persona/agents later):
- - [x] **kernel** — exposed `HostAPI.applyFilters` (delegates to the bus's existing `applyFilters`).
- - [x] **session-orchestrator** — defines+exports `toolsFilter`/`ToolAssembly`; applies it ONCE per
- turn (injected `applyToolsFilter` dep) before `runTurn`, threading `cwd`+`conversationId`.
- - [x] **skills** (new ext, `dependsOn session-orchestrator`) — pure parse/merge/render +
- `load_skill` tool (live read, strips first two lines, path-contained) + a `toolsFilter` filter
- that rewrites `load_skill`'s description + `name` enum with the per-cwd catalog. 42 tests.
- - [x] **host-bin** — registered `skills` in `CORE_EXTENSIONS`.
- - [x] **Fan-out (§5.3):** `applyFilters` was a required `HostAPI` addition → broke one consumer
- (transport-http `server.bun.test.ts` inline HostAPI stub) → fixed by its owner.
-- **Live-verified:** clean boot (`skills` activates, filter registered, no crash); full-graph
- `tsc -b` EXIT 0, biome clean. (End-to-end load_skill via a real LLM turn not yet exercised —
- unit/integration tests cover the filter rewrite + live read.)
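The skill file format above (line 1 = summary, line 2 = `---`, body from line 3) can be sketched as a pure parser. `ParsedSkill`, `parseSkill`, and `skillName` are hypothetical names, and returning the full text as the body of a malformed file is an assumption; the notes only say such a file has no summary but is still loadable.

```typescript
interface ParsedSkill {
  summary: string | null; // null when the file has no `---` separator on line 2
  body: string;
}

function parseSkill(markdown: string): ParsedSkill {
  const lines = markdown.split("\n");
  if (lines.length >= 2 && lines[1]?.trim() === "---") {
    // Well-formed: strip the summary line and the separator.
    return { summary: (lines[0] ?? "").trim(), body: lines.slice(2).join("\n") };
  }
  // Malformed (assumption): no summary, the whole file is the body.
  return { summary: null, body: markdown };
}

// Skill name = filename without the `.md` extension.
function skillName(fileName: string): string {
  return fileName.endsWith(".md") ? fileName.slice(0, -3) : fileName;
}
```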
-
-## Cache warming (core DONE; control surface PARTIAL)
-User-gated calls: target the external **Claude** provider (`../claude` provider-anthropic, loaded via
-`DISPATCH_EXTERNAL_EXTENSIONS`); warm-assembly lives in **session-orchestrator** (`warm()` reuses the
-real turn's assembly → byte-identical prefix, provider-agnostic); **surface system** for controls;
-**per-conversation** controls; interval default 4 min, free value. Old-code invariants honored
-(primary-model/full-prefix via reuse; refuse mid-turn; never persist/emit; in-flight invalidation;
-arm-on-settle/cancel-on-start; `pct = round(clamp(cacheRead/input,0,1)*100)`).
-- **Mechanism (2nd use of bus hooks; first event-hook emit):**
- - [x] **kernel** — exposed `HostAPI.emit` (delegates to bus.emit), counterpart of `on`.
- - [x] **session-orchestrator** — `turnStarted`/`turnSettled` event hooks (carry conversationId/cwd/
- modelName) emitted per turn; `warm()` service (`cacheWarmHandle`) reusing assembly, refusing
- mid-turn, never persisting/emitting; returns Usage.
- - [x] **cache-warming** (new ext) — per-conversation timers (arm/cancel/in-flight token),
- calls `warm()`, computes `lastPct`, persists `{enabled,intervalMs}` (default on/240s) in
- host.storage; registers a controls Surface. 19 tests.
- - [x] **host-bin** — registered cache-warming; **transport-http** HostAPI stub fixed for `emit`.
-- **Manual trigger endpoint:** `POST /chat/warm {conversationId, model?, cwd?}` → `WarmResponse`
- `{inputTokens,outputTokens,cacheReadTokens,cacheWriteTokens,cachePct}` (409 if generating). Powers a
- FE "warm now" button + fast tests. Types in `@dispatch/transport-contract`; route in transport-http.
-- **LIVE-VERIFIED against Claude haiku:** automatic timer warm → journal `warm complete pct:100`;
- manual `POST /chat/warm` → `cacheReadTokens:6799, cachePct:100` (100% hit), HTTP 200. The external
- `../claude` provider-anthropic is loaded via `bin/up` (`DISPATCH_EXTERNAL_EXTENSIONS`).
-- **Cache-metric fix + retention metric:** `provider-anthropic` (in `../claude`, commit `0e9d118`)
- now reports `Usage.inputTokens` as the TOTAL prompt (was the uncached remainder → the cache rate
- inflated/clamped to 100% on Claude). So `cacheRead/inputTokens` is now the true rate (live: a turn
- adding new content reads 61%, not 100%). Added **`expectedCacheRate`** =
- `cacheRead/(cacheRead+cacheWrite)` (retention/health, ~100% when warm, 0% when the cache
- expired) to `WarmResponse` + `POST /chat/warm` + the cache-warming surface (a "cache
- retention" stat). Live-verified: warm
- within TTL → 100%; warm after >5 min idle → 0% (cache expired). FE handoff updated with both
- metrics + the cross-turn real-turn `expectedCache = cacheRead_N/(cacheRead_{N-1}+cacheWrite_{N-1})`.
-- **Surface framework extended (DONE):** added `NumberField` to `ui-contract` + per-conversation
- surface scoping (optional `conversationId` on subscribe/unsubscribe/invoke + surface/update; new
- `SurfaceContext` on `SurfaceProvider.getSpec/invoke`; transport-ws keys subscriptions by
- `(surfaceId, conversationId)` and tags updates). cache-warming now serves a PER-CONVERSATION
- surface: `Toggle`(enabled) · `Number`(interval, seconds, `cache-warming/set-interval`) ·
- `Stat`(last cache %). All backward-compatible (global surfaces like `surface-loaded-extensions`
- unchanged). **FE courier:** `frontend-cache-warming-handoff.md` (this repo) — the web must render
- the `number` field kind + send/handle `conversationId` on the surface WS protocol.
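The two cache metrics in this section can be written out as small pure functions. This is a sketch under assumptions: the `Usage` field names follow the `WarmResponse` fields listed above, the zero-denominator handling is a guess, and expressing `expectedCacheRate` as a rounded percentage (rather than a 0..1 ratio) is chosen here only for symmetry with `cachePct`.

```typescript
interface Usage {
  inputTokens: number; // TOTAL prompt tokens (after the provider-anthropic fix)
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// pct = round(clamp(cacheRead / input, 0, 1) * 100)
function cachePct(u: Usage): number {
  if (u.inputTokens <= 0) return 0; // assumption: treat an empty prompt as 0%
  return Math.round(clamp(u.cacheReadTokens / u.inputTokens, 0, 1) * 100);
}

// expectedCacheRate = cacheRead / (cacheRead + cacheWrite), as a percentage
function expectedCacheRate(u: Usage): number {
  const denom = u.cacheReadTokens + u.cacheWriteTokens;
  if (denom <= 0) return 0; // assumption: nothing cached at all counts as 0%
  return Math.round((u.cacheReadTokens / denom) * 100);
}
```

The two numbers answer different questions: `cachePct` says how much of this prompt was served from cache, while `expectedCacheRate` says whether the previously cached prefix survived (near 100% within the TTL, 0% once it expired and everything was rewritten).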
-
-## Cache warming — FE CR-3 (DONE)
-FE asked (frontend `backend-handoff-cache-warming-timer.md`): expose next/last-warm timestamps +
-make a manual warm reset the timer/refresh the surface. Done via an **inversion** (commit `bfbad3a`):
-session-orchestrator `warm()` (the single chokepoint for manual `/chat/warm` AND the auto timer) emits
-a `warmCompleted` bus event; cache-warming subscribes and does all post-warm handling — so manual
-warms re-arm the timer + push a surface update with **no transport-http change** (core can't depend on
-the standard cache-warming ext). Added `nextWarmAt`/`lastWarmAt` state + a `custom`
-`rendererId:"cache-warming-timer"` surface field (no ui-contract bump). Caught + fixed a wiring bug
-(`createWarmService` missed the `emit` dep → `deps.emit?.` silently no-oped; made it required).
-Live-verified vs claude haiku (manual warm logs `warm complete` ~2s after the turn, not the 4-min
-timer). FE handoff updated. (FE CR-1 table + CR-2 catalog `scope` flag still open, not requested.)
-
-## LSP integration + per-conversation CWD (DONE)
-Design: `notes/lsp-design.md`. FE courier: `frontend-lsp-cwd-handoff.md`. Decisions
-(locked): **single `lsp` extension**; **hand-rolled pure JSON-RPC codec** (zero dep,
-injected-stream tested); **diagnostics-on-write deferred** (on-demand `lsp` tool
-only); **cwd persisted in `conversation-store`**; config = **built-in TypeScript +
-`<cwd>/.dispatch/lsp.json` + `<cwd>/opencode.json` `lsp` fallback** (Roblox works
-with its existing config). Glossary: added LSP, language server, diagnostics,
-workspace root, working directory.
-- **The bug we fixed** (opencode root cause, confirmed): opencode's
- `client/registerCapability` ignores all but `textDocument/diagnostic`, so
- `workspace/didChangeWatchedFiles` registrations are dropped + no real fs watcher
- → stale `sourcemap.json` → "Unknown require" mid-session. Fix = honor the
- registration + real fs watcher + forward `didChangeWatchedFiles` + auto-spawn
- `rojo sourcemap --watch` sidecar when `luau-lsp.sourcemap.autogenerate`. Covered
- by a regression test in `packages/lsp/src/client.test.ts`.
-- **`lsp` extension** (new, bundled core): hand-rolled LSP client (framing + rpc +
- watched-files + diagnostics + config + root + tool + manager), zero external deps.
- Lazy-spawn one server per `(serverID, root)`; config resolved **per cwd**;
- `lspServiceHandle.status(cwd)` lazy-connects + reports state; `deactivate` kills
- all child procs (host-bin shutdown now calls `host.deactivate()`).
-- **CWD:** `conversation-store.getCwd/setCwd`; `session-orchestrator` defaults a
- turn's cwd from the store; endpoints `GET`/`PUT /conversations/:id/cwd` +
- `GET /conversations/:id/lsp` in transport-http; wire types in
- `@dispatch/transport-contract` (→ `0.5.0`).
-- **LIVE-VERIFIED:** this repo (`typescript`) → `connected`;
- `/home/tradam/projects/roblox` (`luau-lsp`) → `connected` (via the project's own `opencode.json` + rojo
- sidecar); cwd PUT/GET round-trip 200. Op note: LSP binaries must be on the server
- process PATH (`~/.local/bin` daemon-PATH caveat for `typescript-language-server`).
-- **Recovery (scar tissue):** the `lsp` agent stalled on the final stretch (1 hung
- test + ~40 biome `!`/dot-key findings) → at the user's request the orchestrator
- finished it directly; also fixed a real design bug the agent missed: the manager
- read config statically instead of per-cwd (would have broken Roblox).
-
-## Context size — current context-window usage (DONE)
-User-gated decisions: term = **context size** (current usage; reserve "context window" for the
-model's max LIMIT, a later feature); definition = the turn's **FINAL step `inputTokens +
-outputTokens`** (NOT the aggregate `usage`, which sums per-step prompts and overcounts a
-multi-step turn); delivery = a backend-computed field on BOTH the live `done` event and the
-persisted `TurnMetrics`.
-- [x] **Contract (orchestrator):** optional `contextSize?: number` added to `TurnDoneEvent` +
- `TurnMetrics` in `@dispatch/wire` (`0.4.0→0.5.0`); `@dispatch/transport-contract`
- `0.5.0→0.6.0` (re-exports both — no other change). Glossary: added **context size**.
-- [x] **Wave (parallel, disjoint pkgs):**
- - [x] **kernel** — `run-turn.ts` tracks the last step's `Usage`; `doneEvent()` stamps
- `done.contextSize = lastStep.input + lastStep.output` (omitted when no usage). +3 tests.
- - [x] **session-orchestrator** — `metrics.ts build()` stamps `TurnMetrics.contextSize` from
- the final per-step metrics (same definition; equals the live value). +5 tests.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, 881 vitest pass; both owners stayed in-lane.
- `conversation-store` (JSON passthrough) + `transport-http` (forwards/serves) unchanged.
-- [x] **LIVE-VERIFIED against flash** (`deepseek-v4-flash`): turn 1 → live `done.contextSize`
- 1255 == persisted `turns[-1].contextSize` 1255 == final-step `1206 in + 49 out` (NOT the
- aggregate); turn 2 (same conversation) → 1286 (grew cumulatively), live == persisted. Both
- carriers agree; "current" = latest turn's value.
-- [x] **FE courier handoff:** `frontend-context-size-handoff.md` (user couriers to
- `../frontend`).
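The difference between context size and aggregate usage can be shown with a short sketch. `StepUsage`, `contextSize`, and `aggregateTokens` are hypothetical names for illustration; the real computation lives in `run-turn.ts` and `metrics.ts` as noted above.

```typescript
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

// Context size = the FINAL step's input + output tokens.
// Omitted (undefined) when the turn recorded no usage.
function contextSize(steps: readonly StepUsage[]): number | undefined {
  const last = steps[steps.length - 1];
  if (last === undefined) return undefined;
  return last.inputTokens + last.outputTokens;
}

// Aggregate usage sums every step. Each step's prompt already includes the
// earlier steps, so this overcounts the window for a multi-step turn.
function aggregateTokens(steps: readonly StepUsage[]): number {
  return steps.reduce((sum, s) => sum + s.inputTokens + s.outputTokens, 0);
}
```

For the live-verified turn, a final step of 1206 input and 49 output tokens gives a context size of 1255, regardless of how many earlier steps the turn had.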
-
-## Turn continuity — detached turns + multi-client live view (DONE)
-Design: `notes/turn-continuity-design.md`. FE courier: `frontend-turn-continuity-handoff.md`.
-Problem (code-traced): a turn's lifetime was bound to the WS connection — `transport-ws` aborted
-the in-flight turn on socket close, so a backgrounded/reloaded mobile browser killed generation.
-Principle enforced: **the FE is only a control interface; the AI runs independent of it**, and
-**multiple clients may watch the same conversation** (multi-device handoff).
-- **Decisions (locked):** broadcast hub lives in the CORE (`session-orchestrator`), not a
- transport; additive `SessionOrchestrator` handle (keep `handleMessage`); persist-at-seal kept,
- per-step R1 deferred; late-join served by an in-memory in-flight buffer; subscribers persist
- per-conversation independent of turns; no concurrent-send arbitration; no explicit stop op.
-- **Contract (orchestrator):** `@dispatch/transport-contract` `0.6.0→0.7.0` — additive WS ops
- `chat.subscribe`/`chat.unsubscribe` on `WsClientMessage` (events still arrive as `chat.delta`).
-- **Wave 1 — `session-orchestrator`:** detached per-conversation turn ownership + broadcast;
- `startTurn`/`subscribe`/`isActive` added to the handle; `handleMessage` → convenience wrapper
- (dropped `signal`). **Two-map model** (`subscribers` persistent + `activeTurns` buffer) — the
- fix for the live-found bug where pre-turn subscribers were dropped. 63 tests.
-- **Wave 2 (parallel) — `transport-ws`** (fan-out: per-connection chat-subscription map;
- `chat.send` auto-subscribes sender + `startTurn`; new ops in pure `router.ts`; `close` drops
- subs but NEVER aborts a turn; removed the turn `AbortController`) + **`transport-http`** (only
- test fakes updated for the 3 new methods; runtime unchanged). host-bin untouched.
-- **LIVE-VERIFIED against flash** (2-client WS test, `/tmp/ws_multi.ts`): (S1) two clients both
- stream a turn; closing the SENDER mid-turn → the other keeps receiving through `done` and the
- turn persists (1197 chars) — AI kept going independent of the interface; (S2) a client joining
- mid-turn gets `turn-start` replayed + the rest live. `RESULT OVERALL: OK`.
-- **Recovery (scar tissue):** first Wave-1 impl stored listeners INSIDE the per-turn hub and
- `startTurn` made a fresh empty-listener hub → every pre-turn subscriber dropped; live test got
- zero deltas though the turn ran+persisted. Caught by live-verify (unit test had subscribed
- AFTER start, masking it). Fixed via the persistent-subscribers / per-turn-buffer split.
-
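The two-map split from Wave 1 (persistent `subscribers` plus a per-turn `activeTurns` buffer) can be sketched roughly as below. This is a minimal illustration, not the `session-orchestrator` code: the class name `ConversationHub`, the `emit`/`endTurn` methods, and the `HubEvent` shape are assumptions made for the example. Only `subscribe`, `startTurn`, `isActive`, and the two map names come from the notes.

```typescript
// Hypothetical sketch: ConversationHub, emit, endTurn and HubEvent are illustrative names.
type HubEvent = { type: string; text?: string };
type Listener = (event: HubEvent) => void;

class ConversationHub {
  // Map 1: listeners live per conversation and survive across turns.
  private readonly subscribers = new Map<string, Set<Listener>>();
  // Map 2: events of the in-flight turn only, kept for late-join replay.
  private readonly activeTurns = new Map<string, HubEvent[]>();

  subscribe(conversationId: string, listener: Listener): () => void {
    const listeners = this.subscribers.get(conversationId) ?? new Set<Listener>();
    this.subscribers.set(conversationId, listeners);
    listeners.add(listener);
    // Late join: replay whatever the in-flight turn has buffered so far.
    (this.activeTurns.get(conversationId) ?? []).forEach((event) => listener(event));
    return () => {
      listeners.delete(listener);
    };
  }

  startTurn(conversationId: string): void {
    // Fresh buffer for the new turn; the subscriber set is left untouched.
    this.activeTurns.set(conversationId, []);
  }

  emit(conversationId: string, event: HubEvent): void {
    this.activeTurns.get(conversationId)?.push(event);
    this.subscribers.get(conversationId)?.forEach((listener) => listener(event));
  }

  endTurn(conversationId: string): void {
    // Dropping the buffer does not drop subscribers.
    this.activeTurns.delete(conversationId);
  }

  isActive(conversationId: string): boolean {
    return this.activeTurns.has(conversationId);
  }
}
```

Because `startTurn` only touches `activeTurns`, a client that subscribed before the turn began keeps receiving events, which is the behavior the first Wave-1 implementation lost.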
-## Turn continuity — CR-3: user prompt on the event stream (DONE)
-FE bug (multi-client): a pure watcher (subscribed, not the sender) couldn't see the USER prompt until
-seal — the user message was passed to the provider + persisted only at seal, never on the turn's
-outward stream/buffer. FE courier: `frontend-cr3-user-message-handoff.md`.
-- **Contract:** `@dispatch/wire` `0.5.0→0.6.0` — additive `TurnInputEvent`
- `{ type:"user-message"; conversationId; turnId; text }` on the `AgentEvent` union (kernel barrels
- re-export it). `@dispatch/transport-contract` `0.7.0→0.8.0` (re-export only). Widening broke NO
- exhaustive switch (typecheck clean) — zero consumer fan-out.
-- **session-orchestrator:** `emitToHub({type:"user-message",…})` as the FIRST event of `runTurnDetached`
- (before `runTurn`) → buffered + broadcast to all subscribers (live + late-join); HTTP path covered via
- `handleMessage`'s buffer replay. Persistence + metrics unchanged. +3 tests; 3 Wave-1 tests updated
- (user-message now precedes turn-start).
-- **LIVE-VERIFIED vs flash:** a watcher that never sent receives `user-message` (correct text) as its
- FIRST `chat.delta`, before `turn-sealed`, then the streaming reply. `RESULT: OK`.
-- **Process note:** implemented directly by the orchestrator as a one-off (user-approved at the
- time). SUPERSEDED — the user has since confirmed the ORCHESTRATOR.md model governs: the
- orchestrator summons owner-agents and does not write feature code itself.
-
-## Cache warming — FE CR-4 lifecycle + CR-1 extensions table + CR-2 catalog scope (DONE)
-FE courier in: `../frontend/backend-handoff-cache-warming.md` (+ CR-1/CR-2 from their living
-`backend-handoff.md`). Courier out: `frontend-cache-warming-lifecycle-handoff.md`. Full report:
-`reports/cr4-cache-warming-lifecycle.md`.
-- **CR-4a:** warming defaults OFF (opt-in per conversation) — `parseSettings` + `DEFAULT_STATE`;
- re-enabling now restores the persisted interval. Known gap (pre-existing, fail-safe): no boot
- hydration of persisted opt-in across server restarts.
-- **CR-4b:** post-warm surface updates now carry the FUTURE `nextWarmAt` (re-arm BEFORE notify);
- `turnSettled`/`turnStarted` also push (fresh schedule after seal / `null` while generating).
-- **CR-4c:** new `POST /conversations/:id/close` (tab close ≠ disconnect): aborts the in-flight
- turn via a per-turn `AbortController` → kernel `runTurn` `signal` (partial persist + normal seal,
- `done.reason:"aborted"`), and emits new typed hook `conversationClosed` → cache-warming disables
- sync + persists OFF. Disconnect/`chat.unsubscribe` semantics unchanged.
-- **CR-4d:** no change — initial `surface` echo already at HEAD (FE probed a stale up2 boot).
-- **CR-1:** loaded-extensions emits count stat + ONE `custom`/`rendererId:"table"` field
- (`TablePayload` exported); columns Name|Version|Trust|Activation, all trust tiers.
-- **CR-2:** `SurfaceCatalogEntry.scope?: "global"|"conversation"` (`ui-contract` `0.1.0→0.2.0`);
- set on both surfaces. `transport-contract` `0.8.0→0.9.0` (additive `CloseConversationResponse`).
-- 907 tests pass (+13 new); typecheck + biome clean. **LIVE-VERIFIED vs `bin/up`:** default-off,
- 2 automatic warms @5s each pushing future `nextWarmAt`, mid-turn close → `abortedTurn:true` +
- `done.reason:"aborted"` + warming disabled, catalog scopes + table field present, echo present.
-
-## History windowing — FE CR-5 (DONE)
-FE courier in: `../frontend/backend-handoff-chat-limit.md` (+ living `backend-handoff.md` §2
-CR-5). Courier out: `frontend-history-windowing-handoff.md`. User-gated call: ask #3 shipped as
-the INVARIANT option (no new field) — seq is contractually **1-based, monotonic, gap-free**; FE
-derives `hasOlder` from `chunks[0].seq > 1`.
-- **Wave 0 (orchestrator, contracts):** `limit`/`beforeSeq` query-param semantics + validation +
- `latestSeq` windowed-read caveat documented on `ConversationHistoryResponse`
- (`@dispatch/transport-contract` `0.9.0→0.10.0`); 1-based seq guarantee codified on
- `StoredChunk` (`@dispatch/wire` `0.6.0→0.6.1`, doc-only).
-- **Wave 1 — `conversation-store`:** additive `loadSince(id, sinceSeq?, window?: { beforeSeq?,
- limit? })` — selection `sinceSeq < seq < beforeSeq`, newest-`limit` window, result stays
- ascending; garbage-in treated as absent (transport validates upstream). +8 tests.
-- **Wave 2 — `transport-http`:** parses + validates the params (positive integers; malformed/
- zero/negative → 400 `{ error }`, store never called with an invalid window); two-arg call
- shape preserved when no params (regression-guarded). +20 tests.
-- 935 vitest + 112 bun tests, typecheck + biome clean. **LIVE-VERIFIED** (isolated boot, real
- flash turns): firstSeq=1; `limit=2`→`[5,6]` ascending w/ correct `latestSeq`; `limit=9999`→
- full log; `beforeSeq=3`→`[1,2]`; `beforeSeq=3&limit=1`→`[2]`; `limit=0`/`beforeSeq=0`/
- `limit=abc`→400×3. `RESULT: OK` ×6.
-- **Scar tissue (process):** (1) probing with a PRIVATE boot was overkill — the windowing checks
- are read-only GETs and the dev stack was running; prefer probing `bin/up`/`up2` or asking the
- user (ORCHESTRATOR §8 updated). (2) The §8 boot recipe was stale (`DISPATCH_API_KEY_OPENCODE1`
- doesn't exist; an empty re-export OVERRIDES `.env` → "No providers registered"; `.env`'s
- `BACKEND_PORT` beats `PORT`; un-isolated data paths spawn a duplicate collector on the dev
- DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'`
- self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands.
-
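The windowing rules above (selection `sinceSeq < seq < beforeSeq`, newest-`limit` window, ascending result, garbage-in treated as absent, and the FE-side `hasOlder` derivation) can be sketched as follows. This is an illustration under assumptions, not the `conversation-store` or `transport-http` code: `selectWindow`, `parsePositiveInt`, `hasOlder`, and the `Chunk` shape are hypothetical names, and the input log is assumed to be sorted ascending by `seq`.

```typescript
// Hypothetical sketch: Chunk, HistoryWindow, selectWindow, parsePositiveInt and hasOlder
// are illustrative names, not the real API.
interface Chunk {
  seq: number;
}
interface HistoryWindow {
  beforeSeq?: number;
  limit?: number;
}

const isPositiveInt = (n: number | undefined): n is number =>
  n !== undefined && Number.isInteger(n) && n > 0;

// Store side: assumes `chunks` is already ascending by seq.
function selectWindow<T extends Chunk>(
  chunks: readonly T[],
  sinceSeq = 0,
  win: HistoryWindow = {},
): T[] {
  // Garbage-in is treated as absent; validation happens upstream in the transport.
  const beforeSeq = isPositiveInt(win.beforeSeq) ? win.beforeSeq : undefined;
  const limit = isPositiveInt(win.limit) ? win.limit : undefined;
  const selected = chunks.filter(
    (chunk) => chunk.seq > sinceSeq && (beforeSeq === undefined || chunk.seq < beforeSeq),
  );
  // Keep the newest `limit` entries; slicing from the end preserves ascending order.
  return limit === undefined ? selected : selected.slice(-limit);
}

// Transport side: null means "respond 400 and never call the store".
function parsePositiveInt(raw: string): number | null {
  return /^[1-9]\d*$/.test(raw) ? Number(raw) : null;
}

// FE side: seq is 1-based and gap-free, so older history exists iff the window starts above 1.
function hasOlder(chunks: readonly Chunk[]): boolean {
  const first = chunks[0];
  return first !== undefined && first.seq > 1;
}
```

With a six-chunk log this reproduces the live-verified cases: `limit=2` selects seqs 5 and 6, `beforeSeq=3` selects 1 and 2, and `beforeSeq=3&limit=1` selects 2.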
-## Reasoning effort (current milestone)
-User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max`
-(Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn
-`ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks a sensible
-`budget_tokens`; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` flag ships now.
-- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`);
- `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is
- forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/
- `SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`);
- glossary entry. typecheck + biome clean.
-- [x] **Wave 1 (parallel ×3, disjoint):** `conversation-store` get/setReasoningEffort (own key
- space, mirrors cwd; +12 tests); `provider-anthropic` (../claude commit `c0835a4`, mode A summon
- with `--dir ../claude`, contract excerpt INLINED per the cross-`--dir` hang rule) —
- `REASONING_EFFORT_BUDGETS` 4096/10240/16384/32768/65536, raises max_tokens above budget, strips
- temperature when thinking on, absent → byte-stable body (+12 tests); `cli` `--effort` flag,
- parse-validated, body key omitted when unset (+8 tests).
-- [x] **Wave 2:** `session-orchestrator` — exported pure `resolveReasoningEffort` (override →
- stored → `"high"`), additive `StartTurnInput.reasoningEffort`, providerOpts always stamped,
- **warm() parity** (same resolved effort as a real turn — prompt-cache safe), own fakes fixed
- (+9 tests).
-- [x] **Wave 3 (parallel ×2):** `transport-http` — `/chat` validation (400 names valid levels,
- orchestrator never sees bad input), threads to startTurn, GET/PUT
- `/conversations/:id/reasoning-effort` mirroring cwd endpoints, own fakes fixed; `transport-ws` —
- `chat.send` threading + validation (+3 tests).
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **993 vitest + 189 bun** green; all agents in-lane.
- Commits: arch-rewrite `35197ed` (contracts) + `020e051` (impl); ../claude `c0835a4`.
-- [x] Live-verified vs claude (thinking deltas streamed at xhigh; persisted PUT honored next turn).
-- [x] FE courier handoff written: `frontend-reasoning-effort-handoff.md` (user couriers to
- `../frontend`): ChatRequest/chat.send field + GET/PUT endpoints + ladder + default-`high`
- semantics + cache note.
-
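The resolution order from Wave 2 (per-turn override, then the stored per-conversation value, then `"high"`) and the transport-side level check can be sketched as below. The signature of `resolveReasoningEffort` and the `isReasoningEffort` guard are assumptions for illustration; the notes only state the ladder and the precedence.

```typescript
// Hypothetical sketch: the real signatures in session-orchestrator / transport-http may differ.
const REASONING_EFFORTS = ["low", "medium", "high", "xhigh", "max"] as const;
type ReasoningEffort = (typeof REASONING_EFFORTS)[number];

// Transport-side guard: anything failing this would be answered with a 400
// so the orchestrator never sees bad input.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return (
    typeof value === "string" &&
    (REASONING_EFFORTS as readonly string[]).indexOf(value) !== -1
  );
}

// Pure resolution: per-turn override, then persisted value, then the default.
function resolveReasoningEffort(
  override?: ReasoningEffort,
  stored?: ReasoningEffort,
): ReasoningEffort {
  return override ?? stored ?? "high";
}
```

Keeping the resolver pure means a real turn and `warm()` can call the same function and obtain the same effort, which is what the warm-parity note relies on.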
-## Message queue + steering injection (DONE)
-Design: this file's roadmap item 3 (now implemented). User-gated calls: a **separate
-`message-queue` standard extension** (dependsOn `surface-registry`) owns the queue STATE +
-a per-conversation `custom` surface; the **session-orchestrator** owns delivery (drain →
-inject → carry) + emits the `steering` event (it owns the chat hub — no `chatEmit` service
-needed); the **kernel** gets a generic `drainSteering` callback. Glossary: added
-**message queue**, **steering**, **queued message**. Enqueue when idle **starts a turn**
-(user choice; `chat.queue` degrades to `chat.send`). Steering text rendered live via a new
-additive `steering` `AgentEvent`; queue state via the surface (NOT the chat stream).
-- **Wave 0 (orchestrator, contracts):** `RunTurnInput.drainSteering?: () => readonly
- ChatMessage[]` (kernel contract — generic, kernel stays pure); `QueuedMessage` +
- `QueuePayload` + `TurnSteeringEvent` (type `"steering"`, additive to `AgentEvent`) in
- `@dispatch/wire` (`0.7.0→0.8.0`); `POST /conversations/:id/queue` + WS `chat.queue` op +
- `QueueRequest`/`QueueResponse` in `@dispatch/transport-contract` (`0.11.0→0.12.0`). typecheck
- clean except the expected transport-ws exhaustive-switch fan-out (fixed in Wave 3).
-- **Wave 1 (parallel ×2, disjoint):** `kernel` runtime — calls `drainSteering` at the
- tool-result boundary only when continuing to a next step (gated; no drain on max-steps),
- +6 pure tests (65 total); `message-queue` (NEW ext) — pure queue core (enqueue/getQueue/
- drain/combine) + `MessageQueueService`/`messageQueueHandle` + per-conversation `custom`
- surface (`rendererId:"message-queue"`, `QueuePayload`), 12 tests. (The message-queue agent
- DIED mid-task after writing all src+tests but before verifying/reporting; orchestrator
- recovered by running `bun install` + root tsconfig ref + verifying directly — tsc/vitest/
- biome clean, 12 tests pass; no hand-fixing of impl.)
-- **Wave 2:** `session-orchestrator` — added `enqueue` facade (idle→`startTurn`,
- active→queue.enqueue) + `resolveQueue?` dep (self-wired lazily in `activate` via
- `host.getService(messageQueueHandle)` — host-bin does NOT wire it) + `drainSteering` wrapper
- (drain → emit `steering` → return one combined user `ChatMessage`) + post-seal carry
- (non-empty queue → new turn), +8 tests (85 total). `message-queue` is an OPTIONAL dep
- (feature degrades off if absent).
-- **Wave 3 (parallel ×3):** `host-bin` — registered `message-queue` in `CORE_EXTENSIONS`
- (+dep+ref), 28 tests; `transport-http` — `POST /conversations/:id/queue` route + validation,
- 145 tests; `transport-ws` — `chat.queue` op + fixed the Wave-0 exhaustive-switch fan-out,
- 29 vitest + 20 bun.
-- Verified: `tsc -b` EXIT 0, biome clean (280 files), **1043 vitest + 199 transport bun** pass;
- all agents in-lane. **Boot smoke:** private instance boots clean with `message-queue`
- registered (no activation crash).
-- [x] FE courier handoff written: `frontend-message-queue-handoff.md` (user couriers to
- `../frontend`): surface (`rendererId:"message-queue"`), `chat.queue` WS op, `steering`
- event, HTTP `POST /queue`, auto-start-when-idle, carry semantics, version bumps.
-
-## Umans AI Coding Plan provider (DONE)
-User-gated calls: a new **`provider-umans`** standard extension wrapping the Umans
-OpenAI-compatible backend (`https://api.code.umans.ai/v1`). Built via the **full-refactor
-path**: first extract a generic `@dispatch/openai-stream` library from
-`provider-openai-compat`, then build `provider-umans` on top. Self-contained (reads
-`UMANS_API_KEY` from env directly — no `auth-apikey` dep).
-- **Wave 1 — `@dispatch/openai-stream` lib (NEW package):** extracted the generic OpenAI
- functions (convert-messages, convert-tools, parse-sse, listModels, stream, provider)
- from `provider-openai-compat` into a pure library package. `createOpenAICompatProvider`
- parameterized: `id: string` (was hardcoded `"openai-compat"`) + `transformBody?: (body,
- opts) => Record<string,unknown>` hook (for provider-specific body fields). Refactored
- `provider-openai-compat` to import from the lib (thin extension.ts, backward-compat
- re-exports, manifest unchanged, byte-identical behavior). Full tsc EXIT 0, 66 vitest,
- biome clean. Report: `reports/provider-umans-wave1-openai-stream.md`.
-- **Wave 2 — `provider-umans` (NEW ext):** imports `createOpenAICompatProvider` from the
- lib; registers provider id `"umans"`; `transformBody` maps Dispatch `reasoningEffort`
- (`low|medium|high|xhigh|max`) → Umans `reasoning_effort` (`none|low|medium|high`,
- capping `xhigh`/`max`→`high`); dynamic `listModels` (GET /v1/models); default model
- `umans-coder` (env `UMANS_MODEL` or config `provider.umans.model`); baseURL env
- `UMANS_BASE_URL`; absent key → warn + skip registration (graceful). Pure core:
- `mapReasoningEffort` + `resolveUmansConfig` (factored out for direct unit testing).
- 12 tests. Report: `reports/provider-umans.md`.
-- **Wave 3 — host-bin wiring:** registered `provider-umans` in `CORE_EXTENSIONS` + added
- `@dispatch/provider-umans` dep + root tsconfig ref. No credential-store entry needed
- (self-contained — reads env directly, doesn't go through `auth-apikey`). 28 host-bin
- tests.
-- Verified: full-graph `tsc -b` EXIT 0, biome clean (293 files), **1059 vitest** pass.
- **Boot smoke:** without `UMANS_API_KEY` → `"provider-umans: no UMANS_API_KEY. Provider
- not registered."` (graceful skip); with `UMANS_API_KEY=sk-test` → `"provider-umans:
- registered (model=umans-coder)"`.
-- [x] **LIVE-VERIFIED against the real Umans API:** the dev stack (umans-glm-5.2) called
- `web_search` (Firecrawl) in a real turn — first live Umans API call, clean response.
-
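The effort mapping from Wave 2 (Dispatch ladder to Umans `reasoning_effort`, capping `xhigh`/`max` at `high`) might look roughly like this. It is a sketch, not the `provider-umans` source: the `transformBody` behavior when no effort is set, and the fact that no Dispatch level maps to Umans `none`, are assumptions.

```typescript
// Hypothetical sketch of the pure mapping; type names are illustrative.
type DispatchEffort = "low" | "medium" | "high" | "xhigh" | "max";
type UmansEffort = "none" | "low" | "medium" | "high";

function mapReasoningEffort(effort: DispatchEffort): UmansEffort {
  switch (effort) {
    case "low":
      return "low";
    case "medium":
      return "medium";
    case "high":
    case "xhigh":
    case "max":
      // Umans tops out at "high", so the two upper rungs are capped.
      return "high";
  }
}

// Assumption: when no effort is supplied the body passes through untouched.
function transformBody(
  body: Record<string, unknown>,
  opts: { reasoningEffort?: DispatchEffort },
): Record<string, unknown> {
  return opts.reasoningEffort === undefined
    ? body
    : { ...body, reasoning_effort: mapReasoningEffort(opts.reasoningEffort) };
}
```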
-## web_search tool — Firecrawl (DONE)
-Standard tool extension `tool-web-search` backed by a self-hosted Firecrawl instance
-(`http://100.102.55.49:31329/v1`, Tailscale, no API key). One tool `web_search` with 4
-modes: search, scrape, crawl (polls status URL), map — mirroring the proven opencode tool.
-Pure core: `validateArgs` (discriminated union by mode) + `format*` functions + `truncateOutput`.
-Injected edge: `FirecrawlClient` (injectable `fetchFn` + `sleep` + `now`), `AbortSignal.any`
-for per-request timeout + caller cancellation. `concurrencySafe: true`, `capabilities: { network: true }`.
-38 tests. Report: `reports/tool-web-search.md`.
-- **LIVE-VERIFIED:** the dev stack (umans-glm-5.2) called `web_search` → Firecrawl returned
- real results (Paris, France) — first live Umans API call too.
-
-## todo tool — per-conversation task list + surface (DONE)
-Standard tool extension with a single `todo_write` tool (opencode `todowrite` pattern:
-full-list replace, returns JSON, no business-rule enforcement — the description guides
-the model). Per-conversation in-memory state (`Map<conversationId, TodoItem[]>`). Per-
-conversation surface (`rendererId: "todo"`, `scope: "conversation"`) via subscriber-notify
-(message-queue pattern). `concurrencySafe: false` (mutates shared state).
-- **Wave 0 (orchestrator, kernel contract):** added `conversationId?: string` to
- `ToolExecuteContext` (additive, backward-compatible). Wired in `dispatch.ts` — the
- kernel already had `conversationId` as a parameter, just wasn't passing it through to
- the tool context. 170 kernel tests pass.
-- **Wave 1 (todo extension):** pure core (`validateTodos` — shape only; `getTodos`/
- `setTodos`/`clearTodos` — fresh array copies; `buildTodoSpec`; `formatTodoResult` →
- `JSON.stringify`). Shell: `createTodoWriteTool({ state, notify })` + surface provider.
- 26 tests. Report: `reports/todo.md`.
-- **Wave 2 (host-bin wiring):** registered `todo` in `CORE_EXTENSIONS` + dep + root tsconfig
- ref. 28 host-bin tests.
-- Verified: full-graph `tsc -b` EXIT 0, biome clean (314 files), **1123 vitest** pass.
- **Boot smoke:** `"todo: registered"` + activated.
-- [x] Live-verified (model uses `todo_write` in a real turn).
-
-## youtube_transcript tool (DONE)
-Standard tool extension `tool-youtube-transcript` backed by a self-hosted transcriber
-service (`http://100.102.55.49:41090`, Tailscale, no API key). One tool
-`youtube_transcript` — takes a YouTube URL, fetches the transcript (completed → full
-text + timestamped segments; queued/processing → position + ETA + `.youtube_subtitles_pending`
-retry convention; failed → error). Pure core: `validateUrl` + `format*` functions +
-`truncateOutput`. Injected edge: `TranscriptClient` (injectable `fetchFn`, `AbortSignal.any`
-for cancellation). `concurrencySafe: true`, `capabilities: { network: true }`. 30 tests.
-Report: `reports/tool-youtube-transcript.md`.
-
-## CLI — cross-client messaging + open tab (DONE)
-Roadmap items 2 + 4. The CLI can now list conversations, read the last AI message
-(blocking), send messages (blocking or `--queue`), and signal the frontend to open a
-conversation tab. Short-ID prefix resolution (4+ chars → full ID via `GET /conversations?q=`).
-- **Wave 0 (orchestrator, contracts):** `ConversationMeta` in `@dispatch/wire`
- (`0.8.0→0.9.0`); `ConversationListResponse`, `LastMessageResponse`,
- `OpenConversationResponse`, `SetTitleRequest`, `TitleResponse`, WS
- `conversation.open` in `@dispatch/transport-contract` (`0.12.0→0.13.0`);
- `listConversations()`/`getConversationMeta()`/`setConversationTitle()` on
- `ConversationStore`; new routes declared in transport-http manifest;
- `conversationOpened` hook in session-orchestrator.
-- **Wave 1 (conversation-store):** metadata tracking (createdAt on first write,
- lastActivityAt on every append, title from first user message truncated 80 chars);
- `conv-index` key tracks all conversation IDs; `extractTitle` pure helper. 21 new
- tests (81 total).
-- **Wave 2 (parallel, transport-http + transport-ws):** `GET /conversations` (list
- with `?q=` prefix filter), `GET /conversations/:id/last` (blocks until turn settles
- via subscribe-then-checkIsActive, returns last assistant text via pure
- `extractLastAssistantText`), `POST /conversations/:id/open` (emits
- `conversationOpened` hook), `PUT /conversations/:id/title`; `emit` threaded from
- `host.emit` → `createApp`. transport-ws subscribes to `conversationOpened` +
- broadcasts `ConversationOpenMessage` to all connected WS clients. 21+2 new tests.
-- **Wave 3 (CLI):** `dispatch list` (table: short ID + title + activity),
- `dispatch read <id>` (blocking, prints last AI message), `dispatch send <id> --text`
- (blocking by default; `--queue` for non-blocking enqueue; `--open` signals FE).
- Short-ID resolution (4+ chars → prefix search; 32+ chars = full UUID). 48 new
- tests (108 total).
-- Verified: full-graph `tsc -b` EXIT 0, biome clean (327 files), **1240 vitest** pass.
- **Boot smoke + endpoint smoke:** `GET /conversations` → `[]`, `GET /conversations/:id/last`
- → `{content:""}`, `POST /conversations/:id/open` → `{conversationId}`.
-- [x] Live-verified end-to-end (CLI → real conversation → FE tab open).
-
-## Workspaces (DONE)
-Cross-repo design ask from `../frontend` (`backend-handoff-workspaces.md`).
-Outbound courier: `frontend-workspaces-handoff.md` (final shapes + Q1–Q8).
-- **Boundary decision:** workspaces live inside `conversation-store` (metadata +
- cwd persistence owner); no new extension. Single owner-agent for all workspace
- storage + service methods.
-- **Versions:** `@dispatch/wire` `0.11.0→0.12.0`, `@dispatch/transport-contract`
- `0.15.0→0.16.0`, `@dispatch/ui-contract` unchanged. Kernel re-exports
- `Workspace`/`WorkspaceEntry`.
-- **Key decisions:** `DELETE /workspaces/:id` closes all conversations (status→
- "closed") + reassigns to "default" + deletes workspace; auto-create workspace on
- turn start if missing; `PUT /workspaces/:id` create-on-miss with optional
- `title`/`defaultCwd`; `DELETE /conversations/:id/cwd` to clear explicit cwd;
- `GET /conversations/:id/lsp` roots at effective cwd; WS lifecycle push deferred.
-- **Waves:**
- - **Wave 0 (orchestrator):** contracts (wire `0.12.0` + transport-contract
- `0.16.0` + kernel re-exports). tsc + biome clean.
- - **Wave 1 (conversation-store):** workspace persistence + service methods
- (`getWorkspace`, `ensureWorkspace`, `setWorkspaceTitle`, `setWorkspaceDefaultCwd`,
- `deleteWorkspace`, `listWorkspaces`, `getWorkspaceId`, `setWorkspaceId`,
- `getEffectiveCwd`, `isValidWorkspaceSlug`); `listConversations` filter;
- `forkHistory`/`replaceHistory` preserve `workspaceId`. 111 bun tests. CRs
- (kernel re-exports, `bun install`) resolved by orchestrator.
- - **Wave 2 (session-orchestrator):** `workspaceId` on `StartTurnInput`/
- `EnqueueInput`; effective cwd resolution (`getCwd` → `getEffectiveCwd`); auto-
- create workspace on turn start; warm parity. 93 vitest (+8).
- - **Wave 3 (parallel):** `transport-http` (workspace routes, `workspaceId`
- threading, `?workspaceId=` filter, `DELETE /conversations/:id/cwd`, effective
- cwd for LSP, slug validation; 166 tests), `transport-ws` (`workspaceId` on
- `chat.send`/`chat.queue`; 32 tests), `cli` (`--workspace`/`-w` flag; 123 tests).
- - FE handoff sent to agent 4091 via `dispatch send --queue` (non-blocking).
-- Verified: full-graph `tsc -b` EXIT 0, biome clean (328 files), **1283 vitest +
- 199 transport bun** pass (1 pre-existing `tool-shell` failure unrelated).
-- **LIVE-VERIFIED** against dev stack (`bin/up`): 11/11 workspace checks pass —
- create-on-miss, rename, set default-cwd, invalid-slug 400, unknown 404, delete-
- default 409, chat with workspaceId stamps conversation, workspace filter, cwd
- inheritance (null = inheriting), delete cascade (closedCount:1, workspace→404).
-- `dist/` rebuilt for FE (wire + transport-contract + kernel .d.ts contain Workspace
- types). FE agent 4091 notified twice (handoff + dist-ready).
-
-## Open items
-- **`prefix.fingerprint` / `warm|real` cache-bust attributes (deferred):** decoupled
- from dedup by the content-addressed decision; also gated on cache-warming being
- built (not yet) so `warm|real` can't be honestly stamped. Later cache-bust-debug
- milestone (`notes/observability-design.md` §3.1, §12).
-- **D9 analytics roll-ups (deferred):** rollup table shape + `GROUP BY` indexes +
- retention asymmetry + periodic rollup job (`notes/observability-design.md` §2 D9,
- §12). The scheduler mechanism (`host.scheduler.register`) already exists.
-- **D8 `prompt.assembly` segments:** deferred-by-design (await the context-filter
- chain).
-- **In-memory state persistence (message queue + todo list):** both the message
- queue and the todo list are in-memory only (`Map<conversationId, …>` in the
- extension's `activate`). Neither persists across server restarts. If persistence
- is needed later, both would write through `host.storage` (the conversation-store
- pattern: separate key space per feature, append/write per conversation).
-
-## Roadmap
-1. **Web frontend** (in progress, SEPARATE repo `../frontend`; Svelte +
- DaisyUI, same methodology). Slice 2 = browser chat MVP consuming the
- wire/transport-contract + metrics. Cross-repo contract changes are couriered
- via the user (ORCHESTRATOR §7); `lsp references` does not span repos.
-2. **Message queue — close-with-queued-messages (deferred product decision):**
- if a client closes a conversation (`POST /conversations/:id/close`) while the
- queue is non-empty, the carry currently still fires (starts a new turn on the
- closed conversation). Decide: does closing discard pending steering, or honor
- it? If "discard," gate the carry on `finishReason !== "aborted"` in
- session-orchestrator (a one-line change). No FE action either way.
-3. **FE: consume `GET /conversations/:id/status` for crash-recovery re-sync.**
- Backend endpoint shipped: returns `{ conversationId, isActive, status }` where
- `isActive` is the orchestrator's in-memory truth and `status` is the persisted
- lifecycle status. On reconnect (WS re-establish or page reload), the FE should
- call this for any tab it believes is "generating"; if `isActive: false`,
- override the local spinner to idle regardless of the persisted `status`
- (defense-in-depth against status drift the boot-sweep didn't catch).
-
-(Done and dropped from the list: CLI; dedup / storage growth; message queue +
-steering injection; CLI open-tab handoff; `todo` tool; `web_search` tool; tab
-persistence across devices; conversation compacting; live-verify steering flow.)
-
-## Stop generation must abort a hanging tool + not brick the conversation (DONE)
-FE courier in: "Stop generation doesn't abort a hanging tool call." When the user clicks Stop during
-a tool that hangs (e.g. `run_shell` with a blocking/grandchild-holding process), the turn never
-sealed → the FE spinner spun forever AND the conversation was bricked (next `chat.send` rejected as
-`"already-active"` because `activeTurns` was never cleared).
-- **Root cause:** the kernel's `executeToolCall` awaited `tool.execute(...)` with **no race against
- the abort signal** — a tool that ignored `ctx.signal` (or blocked on something it couldn't
- interrupt) blocked `drain` → `runTurn` never returned → session-orchestrator's `finally` (which
- clears `activeTurns`) never ran. (The `/stop` endpoint, `stopTurn`, and the `finally` cleanup were
- already correct — they just needed `runTurn` to return.) Secondary: `realSpawn` resolved on
- `child.on("close")` (waits for stdio) and killed only the immediate child, so a grandchild holding
- the pipes could stall the spawn promise + leak.
-- [x] **kernel** — `executeToolCall` now **races** `tool.execute` against `signal` via `Promise.race`;
- on abort it **resolves** (not rejects) `{ content: "Aborted", isError: true }` so the step completes
- normally → kernel's existing `signal.aborted → finishReason "aborted"` path runs → turn seals
- cleanly (`done` + `turn-sealed`) → `finally` clears `activeTurns` → **conversation freed, next
- message accepted**. Late rejections from the orphaned tool promise are swallowed. 11 tests incl.
- the durability test (hanging tool `new Promise(() => {})` + abort → `runTurn` returns
- `finishReason "aborted"`, doesn't hang). Report: `reports/kernel-abort-race.md`.
-- [x] **tool-shell** — `realSpawn` spawns `detached: true` (own process group); on abort **and**
- timeout kills the **group** (`process.kill(-pgid, "SIGKILL")`) AND resolves immediately (no
- `close`-dependency) so a grandchild holding the pipes can't stall the spawn or leak. 4 tests
- (grandchild abort, grandchild timeout, normal-completion stdout capture, simple abort). Report:
- `reports/tool-shell-process-group-kill.md`.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1326 vitest** pass; both in-lane; kernel zero
- internal mocks.
-- [x] **Live-verified** (fresh `bin/up`): started a hanging tool (`run_shell` sleep/grandchild),
- clicked Stop, then sent a NEW message → it was ACCEPTED (conversation not bricked) and the
- spinner cleared.
-
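The kernel fix above (race `tool.execute` against the abort signal and resolve, not reject, with an error result) can be sketched as follows. This is a minimal illustration assuming a simplified `ToolResult` shape; `executeWithAbort` is a hypothetical name and not the kernel's `executeToolCall`.

```typescript
// Hypothetical sketch: executeWithAbort and ToolResult are illustrative, not the kernel API.
interface ToolResult {
  content: string;
  isError: boolean;
}

async function executeWithAbort(
  execute: (signal: AbortSignal) => Promise<ToolResult>,
  signal: AbortSignal,
): Promise<ToolResult> {
  const aborted: ToolResult = { content: "Aborted", isError: true };
  if (signal.aborted) return aborted;

  let onAbort: () => void = () => {};
  const abortPromise = new Promise<ToolResult>((resolve) => {
    // Resolve (not reject) so the step completes and the turn can seal normally.
    onAbort = () => resolve(aborted);
    signal.addEventListener("abort", onAbort, { once: true });
  });

  const toolPromise = execute(signal);
  // Swallow late rejections from a tool promise that lost the race.
  toolPromise.catch(() => {});

  try {
    return await Promise.race([toolPromise, abortPromise]);
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
}
```

A tool that never settles, such as `new Promise(() => {})`, no longer blocks the caller: once the signal fires, the race settles with the `Aborted` result and the surrounding `finally` cleanup can run.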
-## System prompt builder — template-based system context (DONE)
-Design: `notes/system-prompt-design.md`. FE courier: `frontend-system-prompt-handoff.md`.
-Problem: no system prompt was sent to the provider for regular turns (the messages array
-started with the user message; `providerOpts.systemPrompt` was never set). This adds a
-template-based system prompt builder with variable placeholders (`[type:name]`) and
-conditionals (`[if]`/`[else]`/`[endif]`).
-- **Cache constraint (critical):** the system prompt is constructed ONCE (first turn of
- a new conversation) and persisted. Reused on all subsequent turns (no reconstruction —
- cache-safe). Reconstructed only on **compaction** (fresh variable resolution + compaction
- instructions appended).
-- **Variable types:** `system:time/date/os/hostname`, `prompt:cwd/model/conversation_id`,
- `git:branch/status`, `file:<path>` (dynamic — any path).
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.17.0→0.18.0` —
- `SystemPromptTemplateResponse`, `SetSystemPromptTemplateRequest`, `SystemPromptVariable`,
- `SystemPromptVariablesResponse`.
-- **Wave 1 — `system-prompt` (NEW ext):** pure parser (29 tests) + variable resolver
- (injected adapters, 12 tests) + catalog (3 tests) + service handle (`construct` +
- `get` + `getTemplate` + `setTemplate`, 8 tests). 52 tests total. Default template:
- persona + AGENTS.md if exists + cwd.
-- **Wave 2 (parallel):** `session-orchestrator` (wire service: construct on first turn,
- get on subsequent, construct+append on compaction; 12 tests) + `transport-http`
- (GET/PUT `/system-prompt`, GET `/system-prompt/variables`; 6 tests).
-- **Wave 3 — host-bin:** registered `system-prompt` in `CORE_EXTENSIONS`.
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1396 vitest** pass.
-- [x] Live-verified (boot smoke: extension activates, `GET /system-prompt` returns default
- template, `GET /system-prompt/variables` returns catalog).
-- [x] **FE courier** sent to FE agent `ffe3`: `frontend-system-prompt-handoff.md`.