# Dispatch — tasks (live progress)

> **Live status + roadmap only.** Completed milestones are summarized, not
> narrated. Old blow-by-blow history is pruned — it lives in git (`git log`).
> Keep this lean and current; do not let it re-accrete a step-by-step changelog.

## Status (current)

`tsc -b` EXIT 0 · biome clean · **1453 vitest** green.

## Broken-chat self-repair (read-time reconcile) (DONE)

Conversation `77574596` broke unrecoverably: `reconcile()` only repaired orphaned tool-calls, not (a) a trailing assistant message whose only chunk is `error` (serializes to empty content → uncontinuable) and (b) a `tool-call` whose `input` is a raw malformed-JSON string (re-sent as OpenAI `arguments` → provider 400s on every continuation). `load()` also had no try/catch on `JSON.parse` (one corrupt row would brick a chat). Fix = read-time repair so broken chats auto-heal on next open — NO DB surgery (append-only preserved; repair is a turn-path transform on `load()`). Full diagnosis + plan: `broken-chat-repair-handoff.md` + `reports/broken-chat-repair-diagnosis.md`.

- **Layer 1 — `conversation-store` `reconcile.ts` (protects ALL providers):** `reconcileWithReport` now (1) strips `error` chunks from assistant messages, (2) drops any assistant message left with no `text`/`tool-call` (the emptied error-only msg — safe: never followed by a `tool` msg), (3) keeps orphaned-tool-call synthesis unchanged. `ReconcileReport` +2 additive counts (`strippedErrorChunks`, `droppedEmptyMessages`) for the repair span. `loadSince` (FE reads) intentionally NOT reconciled — the user still SEES the error while the provider gets clean history. **Hardening:** `store.ts` `load()` wraps per-chunk `JSON.parse` in try/catch → corrupt row skipped (log + continue), reconcile runs on the rest. +6 reconcile/store tests.
- **Layer 2 — `openai-stream` `convert-messages.ts` (per-provider args safety):** new pure `serializeToolArguments` — object→stringify; valid-string→parse+restringify; malformed-string→fallback `{ _malformed_arguments: }`. Output ALWAYS `JSON.parse`s → provider stops 400ing on stored malformed args. +4 tests.
- **Layer 2 (equiv) — `../claude` `provider-anthropic` `convert.ts`:** `safeJson` now returns a valid object fallback (`{ _malformed_arguments: s.slice(0,200) }`) on parse failure, not the raw string (`tool_use.input` must be an object for Anthropic). Exported for direct testing. +3 tests. (Separate repo, separate agent.)
- **Wave 1+2 (parallel, disjoint):** conversation-store + openai-stream (arch-rewrite) + provider-anthropic (`../claude`). All in-lane; zero internal mocks; no contract/type change. Reports: `reports/conversation-store.md`, `reports/openai-stream.md`, `../claude/reports/provider-anthropic.md`.
- [x] Verified: arch-rewrite `tsc -b` EXIT 0, biome clean, **1453 vitest** (was 1443); `../claude` `tsc -b` EXIT 0, 71 vitest, biome clean. Both pure-core units zero internal mocks.
- [x] **LIVE-VERIFIED** (dev stack `bin/up` :24203): reproduced 77574596's REAL broken tail (the actual malformed-args tool-call + trailing error chunk) in the dev DB; `POST /chat` continued it cleanly (`text-delta:"OK"` → `done` reason `"stop"`, no 400) — the provider accepted the reconciled history (error stripped, args sanitized). The historical error chunk remains in storage by design (read-time repair only); no new error was appended. Cleaned up the test conversation after.

## LSP — broken-server recovery + config source attribution (DONE)

Handoff from an agent running in raylib-jamstack (configuring ruby-lsp under the installed Dispatch harness `/usr/bin/dispatch-server`): two issues found by decompiling the running binary. (Previous orchestrator session 77574596 did the investigation + Wave 0 + wrote the prompt; its chat broke mid-summon — resumed.)
- **Issue 2 (blocker):** a failed LSP server was `broken` FOREVER — the manager's `broken` set (keyed `${serverId}:${root}`) was cleared ONLY in `shutdownAll()`, so a server that failed (bad env, missing binary, OR a since-fixed bad config) stayed `state:"error"` for the whole process. For an agent running *inside* dispatch the only recovery (server restart) kills its own session.
- **Issue 1:** `.dispatch/lsp.json` (read first) silently shadowed `opencode.json`'s `lsp` key — a broken entry won with no warning, and the caller couldn't tell which config source a server came from (`status()` was its only visibility).
- **Wave 0 (orchestrator, contracts):** additive `readonly configSource?: string` on `LspServerInfo` (`@dispatch/transport-contract` `0.20.0→0.21.0`) + a type-test assertion (8→9). tsc/biome/vitest clean.
- **Wave 1 — `lsp` extension:** (a) broken-server now self-heals when its *resolved config changes* since it was marked broken (a config edit is a discrete event → no retry storm; bounded backoff for transient failures); (b) `configSource?` mirrored on `LspServerStatus` + populated in `status()` (`.dispatch/lsp.json` / `opencode.json` / `built-in`); (c) shadow warning via `host.logger` when both configs declare lsp; (d) spawn-failure `error` strings now name the config source. 6 required named tests + extras. Report: (agent cut off before writing `reports/lsp.md`; work independently verified — 50 lsp tests, tsc EXIT 0, biome clean).
- **Wave 1 CR (transport-http):** the `GET /conversations/:id/lsp` handler mapped `LspServerStatus`→`LspServerInfo` field-by-field and DROPPED `configSource` (never reached the wire). Summoned the transport-http owner for the one-line conditional-spread pass-through (mirrors `error`, honors `exactOptionalPropertyTypes`) + a named pass-through test (present + undefined-omitted). Report: `reports/transport-http.md`.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1443 vitest** pass; all agents in-lane (only packages/lsp + transport-contract + transport-http touched; pre-existing uncommitted WIP in kernel/tool-shell left untouched). Zero internal mocks.
- [x] **LIVE-VERIFIED** (dev stack `bin/up` on :24203, new code via `--watch`): (A) `configSource` reaches the wire — built-in TS server reports `configSource:"built-in"`, `state:"connected"` (Wave 0 + transport-http pass-through confirmed end-to-end); (B) a broken server (`.dispatch/lsp.json` → nonexistent binary) reports `state:"error"` + `configSource:".dispatch/lsp.json"` + a source-named error string (`broken-ts [from .dispatch/lsp.json]: Executable not found in $PATH: …`); (C) **recovery without restart** (the blocker) — same conversation/process went `error`→`connected` after the config was fixed (config change clears the broken key → re-spawn → connects); (D) no retry storm — repeated `status()` with no config change stays `error`; (E) shadow warning logged via `host.logger` (`extensionId:"lsp"`, level `warn`) when both `.dispatch/lsp.json` and `opencode.json` declare lsp.

## Per-conversation model persistence (DONE)

Bug: a chat's selected provider + model was NOT persisted per conversation. Opening the same chat in a new browser session defaulted to the server's default model rather than recalling the originally selected one.

- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.19.0→0.20.0` — additive `ModelResponse` + `SetModelRequest` types for `GET/PUT /conversations/:id/model`.
- **Wave 1 — `conversation-store`:** `getModel`/`setModel` (`model:` key, mirrors `getReasoningEffort`/`setReasoningEffort`); `forkHistory` copies model; empty string clears (idempotent). +13 tests.
- **Wave 2 (parallel):** `session-orchestrator` (resolve model from persisted store when no per-turn override → `resolveModel`; persist the resolved model so it sticks; warm path parity; `resolveModelName` pure helper; +4 tests) + `transport-http` (`GET/PUT /conversations/:id/model` with validation + `parseModelBody` pure validator; +10 tests).
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1433 vitest** pass; all in-lane.

## System-prompt stale on cwd change (DONE)

Bug: the system-prompt service constructed the resolved prompt once on the first turn and reused it via `get()` on subsequent turns (cache-safe design). But the prompt is cwd-sensitive (`[file:AGENTS.md]`, `[prompt:cwd]` variables). When a conversation's cwd changed after the first turn, the cached prompt was stale — referenced files from the new cwd were not loaded.

- **Wave 1 — `system-prompt`:** added `getWithMeta(conversationId)` returning `{ prompt, cwd }` — reads both `resolved:` and a new `resolved-cwd:` sibling key. `construct()` now also stores the cwd. All additive, no existing method signature/behavior changed. +5 tests.
- **Wave 2 — `session-orchestrator`:** subsequent turns call `getWithMeta`, compare stored cwd vs `effectiveCwd ?? process.cwd()`, and `construct` if they differ (or if no stored prompt exists). Compaction path (always constructs) and warm path (no system prompt) unaffected. +1 test.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1411 vitest** pass; both in-lane.
- No FE handoff needed (backend-only fix; no contract version bump).

## Workspace tab issue — conversation.open drops workspaceId (DONE)

Cross-repo additive fix: `conversation.open` / `conversation.statusChanged` WS broadcasts now carry the conversation's persisted workspace id, so a frontend opens/focuses a tab in the correct workspace instead of the viewer's current workspace (`activeWorkspaceId`). CLI `dispatch --open --workspace my-ws` now opens only in `my-ws`.
- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.18.0→0.19.0` — additive `readonly workspaceId: string` on `ConversationOpenMessage` and `ConversationStatusChangedMessage`.
- **Wave 1 (parallel):** `session-orchestrator` (add `workspaceId` to `ConversationOpenedPayload`/`ConversationStatusChangedPayload`; resolve from `conversationStore.getWorkspaceId` at all status-change emit sites) + `transport-ws` (thread `workspaceId` from hook payload into WS broadcasts) — disjoint packages.
- **Wave 2:** `transport-http` — `POST /conversations/:id/open` now awaits `getWorkspaceId(conversationId)` and emits `conversationOpened` with it.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1405 vitest** green; all agents in-lane.
- [x] **FE courier** to `29ae`: `frontend-workspace-open-handoff.md` — parse/use `workspaceId` from `conversation.open` and `conversation.statusChanged`; re-pin `@dispatch/transport-contract` `0.19.0`; re-mirror reference.md.

## LSP cwd resolution — server-default fallthrough + workspace assignment (DONE)

Bug: `GET /conversations/:id/lsp` called `getEffectiveCwd` directly, which falls through to `serverDefaultCwd` (`process.cwd()`) when no conversation cwd is set — the LSP connected on the wrong dir. Additionally, a new conversation's workspace isn't assigned until the first `chat.send`, so `getEffectiveCwd` resolved against `"default"` (not the intended workspace) when the FE set the cwd before the first turn.

- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.16.0→0.17.0` — additive `SetCwdRequest.workspaceId?: string` + updated `LspStatusResponse.cwd` comment ("resolved working directory the LSP connects on, or null when no cwd is set").
- **Wave 1 — transport-http:** `GET /conversations/:id/lsp` now gates on `getCwd` (persisted) first — returns `{ cwd: null, servers: [] }` when no cwd set (LSP does NOT connect); only calls `getEffectiveCwd` + `lspService.status()` when a persisted cwd exists.
  `PUT /conversations/:id/cwd` now accepts optional `workspaceId` — validates with `isValidWorkspaceSlug`, then `ensureWorkspace` → `setWorkspaceId` → `setCwd` (assigns workspace before persisting cwd). 5 new tests + 1 assertion updated. Report: `reports/transport-http.md`.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1332 vitest** pass; agent in-lane.
- [x] **FE courier** sent to FE agent `ffe3`: `frontend-lsp-cwd-workspace-handoff.md` — send `workspaceId` on `PUT /conversations/:id/cwd`; `GET /conversations/:id/lsp` now returns `cwd: null` + empty `servers` when no working dir is set.

## Workspace cwd fallthrough + relative resolution (DONE)

FE courier in: bug report + behavior change (`workspace defaultCwd` not used at turn start when a conversation has no explicit cwd; plus per-conversation cwd should be **relative to the workspace `defaultCwd`** unless absolute). Resolution is backend-owned (the FE omits `cwd` on `chat.send`).

- **Scope:** single unit — `conversation-store` owns `getEffectiveCwd` (already consumed unchanged by `session-orchestrator` turn/warm + `transport-http` `GET /conversations/:id/lsp`), so no cross-package surface change and no fan-out. `GET /conversations/:id/cwd` uses `getCwd` (raw explicit cwd) — unchanged.
- [x] **conversation-store** — added injectable `serverDefaultCwd` (default `process.cwd()`) to `createConversationStore`; rewrote `getEffectiveCwd` with the new algorithm: explicit conversation cwd null → `workspaceCwd ?? serverDefaultCwd` (bug fix: was returning null, skipping the workspace default); absolute (starts `/`) → overrides; relative → `path.resolve(workspaceCwd ?? serverDefaultCwd, conversationCwd)`. Public signature `(conversationId) => Promise` unchanged. 8 regression tests. Report: `reports/conversation-store-workspace-cwd.md`.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1289 vitest** pass; agent in-lane; zero internal mocks.
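
The three-way cwd resolution described in the `conversation-store` bullet above can be sketched as a pure function. This is a minimal illustration, not the store's actual code: the function name `resolveEffectiveCwd` and its parameter list are hypothetical, and the real `getEffectiveCwd` takes a `conversationId` and looks these values up asynchronously.

```typescript
import * as path from "node:path";

/**
 * Hypothetical pure core of the cwd resolution rule (names are assumptions).
 * - no explicit conversation cwd  -> workspace default, else server default
 * - absolute cwd (starts with /)  -> used as-is, overrides everything
 * - relative cwd                  -> resolved against the same base
 */
function resolveEffectiveCwd(
  conversationCwd: string | null, // explicit per-conversation cwd, if any
  workspaceCwd: string | null, // the workspace's defaultCwd, if any
  serverDefaultCwd: string, // injectable; process.cwd() in the real store
): string {
  const base = workspaceCwd ?? serverDefaultCwd;
  if (conversationCwd === null) return base;
  if (conversationCwd.startsWith("/")) return conversationCwd;
  // posix variant keeps the sketch deterministic across platforms
  return path.posix.resolve(base, conversationCwd);
}

// Example paths below are made up for illustration.
const fallthrough = resolveEffectiveCwd(null, "/srv/ws", "/srv");
const absolute = resolveEffectiveCwd("/opt/other", "/srv/ws", "/srv");
const relative = resolveEffectiveCwd("app", "/srv/ws", "/srv");
const noWorkspace = resolveEffectiveCwd("app", null, "/srv");
console.log({ fallthrough, absolute, relative, noWorkspace });
```

Keeping the rule in a pure function like this is what makes the "zero internal mocks" style of regression test straightforward: each branch can be asserted with plain string inputs.
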
## Per-turn cwd override not resolved relative to workspace (CURRENT — live-found)

Live investigation (dev stack, tab 4ef4 in workspace `test` with `defaultCwd=/home/tradam/projects/dispatch`): `getEffectiveCwd` resolves a persisted relative cwd correctly (LSP endpoint + a chat **omitting** `cwd` both return `/home/tradam/projects/dispatch/arch-rewrite`). BUT a per-turn `cwd` sent on `chat.send` is used **as-is** by `session-orchestrator` (`cwd !== undefined ? Promise.resolve(cwd)`, orchestrator.ts:360), bypassing `getEffectiveCwd`. So raw `arch-rewrite` reaches `run_shell` → `resolve("arch-rewrite")` = `/arch-rewrite` (nonexistent) → `pwd` broken; `./` → `resolve("./")` = `process.cwd()` (valid) → "works". The FE sends the CwdField value as a per-turn `cwd` (transport-ws threads it: router.ts:173 → extension.ts:277).

- **Fix (2 waves):** add an optional `overrideCwd?: string` to `ConversationStore.getEffectiveCwd` (resolve the override if provided, else the persisted `getCwd` — same relative algorithm), then `session-orchestrator` passes the per-turn `cwd` (turn start + warm `opts.cwd`) as the override.
- [x] **Wave 1 — conversation-store:** added `overrideCwd?` param + impl + tests.
- [x] **Wave 2 — session-orchestrator:** pass per-turn cwd as override (turn start + warm) + tests.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1298 vitest** pass; both agents in-lane; zero internal mocks.
- [x] **LIVE-VERIFIED** (dev stack, workspace `test` defaultCwd `/home/tradam/projects/dispatch`): a per-turn `cwd:"arch-rewrite"` on an existing conversation (assigned to `test`) → `pwd` returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved, not broken). Both the omit-cwd path (Wave 0) and the per-turn-cwd path (Wave 2) confirmed working.
- **Known edge case (pre-existing, not a regression):** a brand-NEW conversation's FIRST turn runs `getEffectiveCwd` *before* the workspace is assigned (orchestrator.ts assigns it later in the IIFE), so a relative per-turn cwd resolves against the "default" workspace (server default) instead of the intended one. Uncommon (CwdField typically set after the first message). Deferred.
- **Note (separate pre-existing bug, not touched):** `DELETE /conversations/:id/cwd` returns `cwd:null` but does NOT clear the persisted cwd (transport-http app.ts:538 — the route is a stub).

## Cwd edge cases — timing + DELETE stub (DONE)

Two pre-existing bugs surfaced during live-verify of the relative-cwd fix:

- **Edge 1 (timing):** a NEW conversation's first turn ran `getEffectiveCwd` BEFORE the workspace was assigned, so a relative per-turn cwd resolved against `"default"` (server default) not the intended workspace. **Fix:** session-orchestrator now assigns the workspace (for new conversations, detected via `getConversationMeta === null`) BEFORE resolving the effective cwd; removed the duplicate assignment site. 3 tests.
- **Edge 2 (DELETE stub):** `DELETE /conversations/:id/cwd` returned `{cwd:null}` but did NOT clear the persisted cwd (no `clearCwd` on the store). **Fix:** conversation-store added `clearCwd(id)` (`storage.delete(cwdKey)`, idempotent) + tests; transport-http DELETE handler now `await clearCwd` for real.
- [x] **Wave A (parallel):** conversation-store (clearCwd) + session-orchestrator (timing) — disjoint.
- [x] **Wave B:** transport-http (DELETE handler uses clearCwd).
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1311 vitest** pass; all in-lane; zero internal mocks.
- [x] **LIVE-VERIFIED** (dev stack): Edge 2 — PUT→GET(`/tmp/test`)→DELETE→GET(`null`) actually cleared. Edge 1 — NEW conversation, workspace `test`, per-turn `cwd:"arch-rewrite"` → `pwd` returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved against workspace default, not broken).
- [x] **FE courier handoff** written + sent: `frontend-cwd-resolution-handoff.md` couriered to FE orchestrator conversation `b18a` via `dispatch send b18a --queue` (turn started). Behavior-only — no `@dispatch/wire`/`transport-contract`/`ui-contract` version bumps; no FE contract change needed. Notes: `DELETE /conversations/:id/cwd` now actually clears; per-turn `cwd` on `chat.send` resolved relative to workspace `defaultCwd`; FE MAY omit `cwd` on `chat.send` (backend resolves persisted).

Built and verified live (full-fidelity: every feature is a manifest-loaded extension through the host):

- **kernel** — contracts (ABI), bus, `runTurn` turn loop, extension host.
- **core extensions** — storage-sqlite, auth-apikey, provider-openai-compat (OpenCode Go), conversation-store, session-orchestrator, transport-http, credential-store; tool extensions `read_file` (files + directory listing), `run_shell`, `edit_file`, `write_file`.
- **observability** — structured Logger/Span ABI + journal-sink → out-of-process collector → trace-store (`bun:sqlite`); host-bin supervises the collector; nested turn→step→{prompt, provider.request, ttft, decode} spans; D5 verbatim provider capture (self-redacted); `trace-replay` record/replay lib + fixtures.
- **CLI** — one-shot HTTP client (`bun packages/cli/src/main.ts`); `GET /models`, `--cwd`, `--conversation`.
- **web frontend** — SEPARATE repo `../dispatch-web`. Slice 1 (surface system) shipped via `ui-contract` + `surface-registry` + `transport-ws` + `surface-loaded-extensions`. Slice 2 (browser chat) in progress there.

## How to run

```bash
# .env auto-loads DISPATCH_API_KEY (do NOT re-export) and pins BACKEND_PORT (beats PORT).
# Private probe instance: override the port + ISOLATE data paths (ORCHESTRATOR §8):
BACKEND_PORT=4567 SURFACE_WS_PORT=4569 DISPATCH_DB=/tmp/opencode/probe/dispatch.db \
  DISPATCH_TRACE_DB=/tmp/opencode/probe/traces.db DISPATCH_JOURNAL=/tmp/opencode/probe/app.ndjson \
  bun packages/host-bin/src/main.ts # boots app + collector
curl -s -X POST localhost:4567/chat -H 'content-type: application/json' \
  -d '{"conversationId":"c1","message":"Say hello in 3 words."}' # field = conversationId
```

Process cleanup uses the `[x]` bracket trick (ORCHESTRATOR §8) — leaked server/collector procs poison the next run's counts.

**Two stacks:** `bin/up` = dev (live-reload backend, ports 24203/24205/24204). `../bin/up2` = a **stable, no-watch** second stack on **25203/25205/25204** with ISOLATED data (`./.dispatch-data/up2/`, `./.dispatch/journal/up2/`) — runs ALONGSIDE `bin/up`, edit backend code freely without restarting it; Ctrl-C stops only itself. Enabled by a new env knob **`SURFACE_WS_PORT`** → `surfaceWsPort` config (`host-bin/config.ts`; default 24205 when unset, so dev is unchanged).

## Foundation (done — summarized; details in git)

- **MVP + multi-turn:** curl → transport-http → session-orchestrator → host/registry → provider → OpenCode Go → AgentEvents → NDJSON; `conversationId` threads history.
- **Post-MVP:** auth→provider seam; `read_file` tool (live tool-dispatch loop); `getHostAPI()` hygiene; `tabId → conversationId` rename.
- **Observability Phase A/B:** the substrate + collector/store + supervision + replay fixtures (see bullet list above).
- **CLI MVP:** credential-store + transport-contract + cli; model catalog; cwd threading; multi-turn.
- **FE Slice 1:** the surface system across both repos (live WS probe verified).
- **FE Slice 2 backend prereqs:** `@dispatch/wire` split; per-chunk `seq` cursor; read endpoint `GET /conversations/:id?sinceSeq=`; WS chat-deltas (transport-ws); turn-lifecycle events (`turn-start`/`done`/`turn-sealed`); step grouping (`stepId` on tool chunks/events); live stream metrics (`step-complete` + `usage`/`done` token/timing — "Pass 1"); CORS.

## Metrics — token + timing (current milestone)

- [x] **Pass 1 — live stream metrics** (done): `step-complete` event + `usage`(stepId) + `done`(durationMs + aggregate usage).
- [x] **Observability spans** (done): turn & step span-close stamp all four `Usage` fields (added cacheRead/cacheWrite; normalized `usage_*` → `usage.*`).
- [x] **Pass 2 — persisted replay metrics** (done, was deferred): `StepMetrics`/`TurnMetrics` wire types; conversation-store `appendMetrics`/`loadMetrics` (separate key space, turn-append order); session-orchestrator accumulates per-step+turn metrics from the event stream and persists after seal; transport-http `GET /conversations/:id/metrics` → `ConversationMetricsResponse`. `@dispatch/wire` + `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`.
- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done` metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate + per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix).
- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in this repo; user couriers to `../dispatch-web`; ORCHESTRATOR §7).

## dedup / storage growth (DONE)

Design `notes/observability-design.md` §12. User-gated calls: extend existing pipeline (no new ext); scope = **de-dup + retention/rotation** (D9 roll-ups deferred); dedup = **content-addressed bodies** (body-hash, NOT fingerprint-gated).
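
As a rough illustration of what "content-addressed bodies" means here, the sketch below stores each body once under its SHA-256 hash and hands back the hash for records to reference. The `BodyStore` class and its in-memory `Map` are hypothetical stand-ins for the `trace-store` `bodies` table; compression and pruning are deliberately left out.

```typescript
import { createHash } from "node:crypto";

/** Hypothetical in-memory stand-in for a content-addressed bodies table. */
class BodyStore {
  private readonly bodies = new Map<string, string>();

  /** Stores the body if unseen and returns its SHA-256 hex digest. */
  put(body: string): string {
    const hash = createHash("sha256").update(body).digest("hex");
    if (!this.bodies.has(hash)) this.bodies.set(hash, body);
    return hash;
  }

  /** Looks a body up by the hash a record holds. */
  get(hash: string): string | undefined {
    return this.bodies.get(hash);
  }

  /** Number of distinct bodies actually stored. */
  get storedCount(): number {
    return this.bodies.size;
  }
}

const store = new BodyStore();
// Three records reference bodies, but two of the bodies are identical.
const refs = ['{"prompt":"a"}', '{"prompt":"b"}', '{"prompt":"a"}'].map((b) => store.put(b));
console.log(`${refs.length} body refs -> ${store.storedCount} stored bodies`);
```

Because the key is derived from the bytes alone, identical bodies collapse to a single row no matter which record wrote them first, which is the property the dedup relies on.
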
- [x] **Wave 1 — `trace-store`**: content-addressed `bodies` table (SHA-256), at-rest gzip (>1 KiB), `prune(policy)` (age + drop-oldest byte-cap + orphan GC) / `RetentionPolicy` / `PruneSummary` / `DEFAULT_RETENTION` (7d/256MiB); reads transparent.
- [x] **Wave 2 — `observability-collector`**: pure `shouldPrune` cadence helper; `main.ts` calls `store.prune(DEFAULT_RETENTION)` on a coarse cadence (`--prune-interval-ms`, default 60s; host-bin-overridable), log-and-continue on error.
- [x] Glossary: added content-addressed body, trace retention, prefix fingerprint, warm vs real.
- [x] **Migration bug** (found by live boot, fixed): Wave 1 created the `idx_records_bodyHash` index BEFORE running `migrateOldBodies`, so opening a pre-existing OLD-schema `traces.db` crashed the collector (`no such column: bodyHash`, crash-looped). Fix = reorder migration before the index + 3 regression tests that seed a real old-schema DB. bun 106→109.
- Tests: bun 89→109. typecheck/biome clean. **Live-verified** against a real old-schema `traces.db`: 0 crashes, collector stays up, schema migrates (bodyHash + content-addressed bodies), real-data dedup (318 body refs → 270 stored bodies), prune cadence fires cleanly (14× `prune completed`). Optional follow-up: host-bin env-override for the retention policy.

## Standard tools — fs + shell (DONE)

User-gated calls: **one tool per extension** (matches `tool-read-file` precedent); tools are **standard** tier (a turn completes with `tools:[]`, §2.6/§2.8). **Zero ABI change** — the `ToolContract`/`ToolExecuteContext` already carry `signal`/`onOutput`/`cwd`/`log`.

- **Wave 1 (parallel, disjoint pkgs, kernel-only dep) — all green:**
  - [x] `tool-read-file` — EXTENDED `read_file` to list directory contents (sorted, `/`-suffixed subdirs; files unchanged). 41 tests.
  - [x] `tool-shell` (new) — `run_shell`: foreground, streamed via `ctx.onOutput`, `ctx.signal` cancel, `ctx.cwd`, timeout + output cap, `concurrencySafe:false`; injected `spawn`. 31 tests.
  - [x] `tool-edit-file` (new) — `edit_file`: `oldString`/`newString`/`replaceAll`; errors on absent/non-unique/identical; workdir-contained; `concurrencySafe:false`. 38 tests.
  - [x] `tool-write-file` (new) — `write_file`: explicit `overwrite` flag (absent+unset→create; exists+unset→error; exists+true→overwrite; absent+true→error); no parent auto-create. 33 tests.
- **Wave 2 (done):** orchestrator added 3 root tsconfig refs + `bun install`; host-bin owner registered the 3 new extensions in `CORE_EXTENSIONS` (same pattern as `read_file`).
- **Live-verified:** clean boot (`Dispatch booted`, collector up, no activation/capability-gate error — the new `shell` capability is accepted); full-graph `tsc -b` EXIT 0, biome clean.
- **Recovery notes (scar tissue):** `tool-write-file` first returned plan-only (§5a) → re-summoned with "IMPLEMENT NOW". `tool-edit-file` hung vitest at collection — `computeReplacement` infinite-looped on empty `oldString` (`"".indexOf("") === 0`, index never advances) invoked at a test's `describe` scope; fixed with an early empty-string guard + validation. One agent deleted `ORCHESTRATOR.md` out-of-lane → caught by post-wave `git status`, restored from git.
- Deferred (not selected): `glob`, `grep`/`search_code`, background shells.

## Skill system + load_skill tool (DONE)

User-gated calls: skills list lives in the **`load_skill` tool definition** (NOT the system prompt), refreshed **per new turn** (cache-stable across steps), **live file read** on execute. One `skills` standard extension (loader + filter + tool). Skill = md in `.skills/`; discovered from `~/.skills` + `/.skills` (cwd shadows home); name = filename w/o `.md`. Format: line1 = summary, line2 = `---`, body = line3+; on load the first two lines are stripped; malformed (no `---`) = no summary but still loadable. Glossary: added `skill`, `skill summary`, `tools filter`.
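
The skill file format and the cwd-over-home shadowing rule above can be sketched with three small pure helpers. The names `parseSkill`, `skillNameFromFile`, and `mergeCatalogs` are hypothetical, not the extension's actual exports. One point the notes leave open is what the body of a malformed file is; this sketch assumes the whole file is returned unchanged.

```typescript
interface ParsedSkill {
  summary: string | undefined; // undefined for a malformed file (no `---` on line 2)
  body: string;
}

/** Line 1 = summary, line 2 = `---`, body = line 3 onward. */
function parseSkill(text: string): ParsedSkill {
  const lines = text.split("\n");
  if (lines[1]?.trim() === "---") {
    return { summary: (lines[0] ?? "").trim(), body: lines.slice(2).join("\n") };
  }
  // Assumption: a malformed skill stays loadable with its full text as the body.
  return { summary: undefined, body: text };
}

/** Skill name = filename without the `.md` extension. */
function skillNameFromFile(fileName: string): string {
  return fileName.endsWith(".md") ? fileName.slice(0, -3) : fileName;
}

/** Entries discovered under the cwd shadow same-named entries from home. */
function mergeCatalogs(
  home: Map<string, string>,
  cwd: Map<string, string>,
): Map<string, string> {
  const merged = new Map(home);
  cwd.forEach((summary, name) => merged.set(name, summary));
  return merged;
}

const wellFormed = parseSkill("Deploy the app\n---\nStep 1\nStep 2");
const malformed = parseSkill("just some notes\nwithout a separator");
const catalog = mergeCatalogs(
  new Map([["deploy", "home deploy"], ["lint", "home lint"]]),
  new Map([["deploy", "project deploy"]]),
);
console.log(wellFormed, malformed, skillNameFromFile("deploy.md"), [...catalog.entries()]);
```
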
- **Mechanism — the per-turn `tools` filter chain** (first concrete use of the §3.2 context-assembly chain; reusable for persona/agents later):
  - [x] **kernel** — exposed `HostAPI.applyFilters` (delegates to the bus's existing `applyFilters`).
  - [x] **session-orchestrator** — defines+exports `toolsFilter`/`ToolAssembly`; applies it ONCE per turn (injected `applyToolsFilter` dep) before `runTurn`, threading `cwd`+`conversationId`.
  - [x] **skills** (new ext, `dependsOn session-orchestrator`) — pure parse/merge/render + `load_skill` tool (live read, strips first two lines, path-contained) + a `toolsFilter` filter that rewrites `load_skill`'s description + `name` enum with the per-cwd catalog. 42 tests.
  - [x] **host-bin** — registered `skills` in `CORE_EXTENSIONS`.
  - [x] **Fan-out (§5.3):** `applyFilters` was a required `HostAPI` addition → broke one consumer (transport-http `server.bun.test.ts` inline HostAPI stub) → fixed by its owner.
- **Live-verified:** clean boot (`skills` activates, filter registered, no crash); full-graph `tsc -b` EXIT 0, biome clean. (End-to-end load_skill via a real LLM turn not yet exercised — unit/integration tests cover the filter rewrite + live read.)

## Cache warming (core DONE; control surface PARTIAL)

User-gated calls: target the external **Claude** provider (`../claude` provider-anthropic, loaded via `DISPATCH_EXTERNAL_EXTENSIONS`); warm-assembly lives in **session-orchestrator** (`warm()` reuses the real turn's assembly → byte-identical prefix, provider-agnostic); **surface system** for controls; **per-conversation** controls; interval default 4 min, free value. Old-code invariants honored (primary-model/full-prefix via reuse; refuse mid-turn; never persist/emit; in-flight invalidation; arm-on-settle/cancel-on-start; `pct = round(clamp(cacheRead/input,0,1)*100)`).

- **Mechanism (2nd use of bus hooks; first event-hook emit):**
  - [x] **kernel** — exposed `HostAPI.emit` (delegates to bus.emit), counterpart of `on`.
  - [x] **session-orchestrator** — `turnStarted`/`turnSettled` event hooks (carry conversationId/cwd/modelName) emitted per turn; `warm()` service (`cacheWarmHandle`) reusing assembly, refusing mid-turn, never persisting/emitting; returns Usage.
  - [x] **cache-warming** (new ext) — per-conversation timers (arm/cancel/in-flight token), calls `warm()`, computes `lastPct`, persists `{enabled,intervalMs}` (default on/240s) in host.storage; registers a controls Surface. 19 tests.
  - [x] **host-bin** — registered cache-warming; **transport-http** HostAPI stub fixed for `emit`.
- **Manual trigger endpoint:** `POST /chat/warm {conversationId, model?, cwd?}` → `WarmResponse` `{inputTokens,outputTokens,cacheReadTokens,cacheWriteTokens,cachePct}` (409 if generating). Powers a FE "warm now" button + fast tests. Types in `@dispatch/transport-contract`; route in transport-http.
- **LIVE-VERIFIED against Claude haiku:** automatic timer warm → journal `warm complete pct:100`; manual `POST /chat/warm` → `cacheReadTokens:6799, cachePct:100` (100% hit), HTTP 200. The external `../claude` provider-anthropic is loaded via `bin/up` (`DISPATCH_EXTERNAL_EXTENSIONS`).
- **Cache-metric fix + retention metric:** `provider-anthropic` (in `../claude`, commit `0e9d118`) now reports `Usage.inputTokens` as the TOTAL prompt (was the uncached remainder → the cache rate inflated/clamped to 100% on Claude). So `cacheRead/inputTokens` is now the true rate (live: a turn adding new content reads 61%, not 100%). Added **`expectedCacheRate`** = `cacheRead/(cacheRead+cacheWrite)` (retention/health, ~100% when warm, 0% when the cache expired) to `WarmResponse` + `POST /chat/warm` + the cache-warming surface (a "cache retention" stat). Live-verified: warm within TTL → 100%; warm after >5 min idle → 0% (cache expired). FE handoff updated with both metrics + the cross-turn real-turn `expectedCache = cacheRead_N/(cacheRead_{N-1}+cacheWrite_{N-1})`.
- **Surface framework extended (DONE):** added `NumberField` to `ui-contract` + per-conversation surface scoping (optional `conversationId` on subscribe/unsubscribe/invoke + surface/update; new `SurfaceContext` on `SurfaceProvider.getSpec/invoke`; transport-ws keys subscriptions by `(surfaceId, conversationId)` and tags updates). cache-warming now serves a PER-CONVERSATION surface: `Toggle`(enabled) · `Number`(interval, seconds, `cache-warming/set-interval`) · `Stat`(last cache %). All backward-compatible (global surfaces like `surface-loaded-extensions` unchanged). **FE courier:** `frontend-cache-warming-handoff.md` (this repo) — the web must render the `number` field kind + send/handle `conversationId` on the surface WS protocol.

## Cache warming — FE CR-3 (DONE)

FE asked (dispatch-web `backend-handoff-cache-warming-timer.md`): expose next/last-warm timestamps + make a manual warm reset the timer/refresh the surface. Done via an **inversion** (commit `bfbad3a`): session-orchestrator `warm()` (the single chokepoint for manual `/chat/warm` AND the auto timer) emits a `warmCompleted` bus event; cache-warming subscribes and does all post-warm handling — so manual warms re-arm the timer + push a surface update with **no transport-http change** (core can't depend on the standard cache-warming ext). Added `nextWarmAt`/`lastWarmAt` state + a `custom` `rendererId:"cache-warming-timer"` surface field (no ui-contract bump). Caught + fixed a wiring bug (`createWarmService` missed the `emit` dep → `deps.emit?.` silently no-oped; made it required). Live-verified vs claude haiku (manual warm logs `warm complete` ~2s after the turn, not the 4-min timer). FE handoff updated. (FE CR-1 table + CR-2 catalog `scope` flag still open, not requested.)

## LSP integration + per-conversation CWD (DONE)

Design: `notes/lsp-design.md`. FE courier: `frontend-lsp-cwd-handoff.md`.
Decisions (locked): **single `lsp` extension**; **hand-rolled pure JSON-RPC codec** (zero dep, injected-stream tested); **diagnostics-on-write deferred** (on-demand `lsp` tool only); **cwd persisted in `conversation-store`**; config = **built-in TypeScript + `/.dispatch/lsp.json` + `/opencode.json` `lsp` fallback** (Roblox works with its existing config). Glossary: added LSP, language server, diagnostics, workspace root, working directory.

- **The bug we fixed** (opencode root cause, confirmed): opencode's `client/registerCapability` ignores all but `textDocument/diagnostic`, so `workspace/didChangeWatchedFiles` registrations are dropped + no real fs watcher → stale `sourcemap.json` → "Unknown require" mid-session. Fix = honor the registration + real fs watcher + forward `didChangeWatchedFiles` + auto-spawn `rojo sourcemap --watch` sidecar when `luau-lsp.sourcemap.autogenerate`. Covered by a regression test in `packages/lsp/src/client.test.ts`.
- **`lsp` extension** (new, bundled core): hand-rolled LSP client (framing + rpc + watched-files + diagnostics + config + root + tool + manager), zero external deps. Lazy-spawn one server per `(serverID, root)`; config resolved **per cwd**; `lspServiceHandle.status(cwd)` lazy-connects + reports state; `deactivate` kills all child procs (host-bin shutdown now calls `host.deactivate()`).
- **CWD:** `conversation-store.getCwd/setCwd`; `session-orchestrator` defaults a turn's cwd from the store; endpoints `GET`/`PUT /conversations/:id/cwd` + `GET /conversations/:id/lsp` in transport-http; wire types in `@dispatch/transport-contract` (→ `0.5.0`).
- **LIVE-VERIFIED:** this repo (`typescript`) → `connected`; `/home/tradam/projects/roblox` (`luau-lsp`) → `connected` (via the project's own `opencode.json` + rojo sidecar); cwd PUT/GET round-trip 200. Op note: LSP binaries must be on the server process PATH (`~/.local/bin` daemon-PATH caveat for `typescript-language-server`).
- **Recovery (scar tissue):** the `lsp` agent stalled on the final stretch (1 hung test + ~40 biome `!`/dot-key findings) → at the user's request the orchestrator finished it directly; also fixed a real design bug the agent missed: the manager read config statically instead of per-cwd (would have broken Roblox).

## Context size — current context-window usage (DONE)

User-gated decisions: term = **context size** (current usage; reserve "context window" for the model's max LIMIT, a later feature); definition = the turn's **FINAL step `inputTokens + outputTokens`** (NOT the aggregate `usage`, which sums per-step prompts and overcounts a multi-step turn); delivery = a backend-computed field on BOTH the live `done` event and the persisted `TurnMetrics`.

- [x] **Contract (orchestrator):** optional `contextSize?: number` added to `TurnDoneEvent` + `TurnMetrics` in `@dispatch/wire` (`0.4.0→0.5.0`); `@dispatch/transport-contract` `0.5.0→0.6.0` (re-exports both — no other change). Glossary: added **context size**.
- [x] **Wave (parallel, disjoint pkgs):**
  - [x] **kernel** — `run-turn.ts` tracks the last step's `Usage`; `doneEvent()` stamps `done.contextSize = lastStep.input + lastStep.output` (omitted when no usage). +3 tests.
  - [x] **session-orchestrator** — `metrics.ts build()` stamps `TurnMetrics.contextSize` from the final per-step metrics (same definition; equals the live value). +5 tests.
- [x] Verified: `tsc -b` EXIT 0, biome clean, 881 vitest pass; both owners stayed in-lane. `conversation-store` (JSON passthrough) + `transport-http` (forwards/serves) unchanged.
- [x] **LIVE-VERIFIED against flash** (`deepseek-v4-flash`): turn 1 → live `done.contextSize` 1255 == persisted `turns[-1].contextSize` 1255 == final-step `1206 in + 49 out` (NOT the aggregate); turn 2 (same conversation) → 1286 (grew cumulatively), live == persisted. Both carriers agree; "current" = latest turn's value.
- [x] **FE courier handoff:** `frontend-context-size-handoff.md` (user couriers to `../dispatch-web`).

## Turn continuity — detached turns + multi-client live view (DONE)

Design: `notes/turn-continuity-design.md`. FE courier: `frontend-turn-continuity-handoff.md`.

Problem (code-traced): a turn's lifetime was bound to the WS connection — `transport-ws` aborted the in-flight turn on socket close, so a backgrounded/reloaded mobile browser killed generation. Principle enforced: **the FE is only a control interface; the AI runs independent of it**, and **multiple clients may watch the same conversation** (multi-device handoff).

- **Decisions (locked):** broadcast hub lives in the CORE (`session-orchestrator`), not a transport; additive `SessionOrchestrator` handle (keep `handleMessage`); persist-at-seal kept, per-step R1 deferred; late-join served by an in-memory in-flight buffer; subscribers persist per-conversation independent of turns; no concurrent-send arbitration; no explicit stop op.
- **Contract (orchestrator):** `@dispatch/transport-contract` `0.6.0→0.7.0` — additive WS ops `chat.subscribe`/`chat.unsubscribe` on `WsClientMessage` (events still arrive as `chat.delta`).
- **Wave 1 — `session-orchestrator`:** detached per-conversation turn ownership + broadcast; `startTurn`/`subscribe`/`isActive` added to the handle; `handleMessage` → convenience wrapper (dropped `signal`). **Two-map model** (`subscribers` persistent + `activeTurns` buffer) — the fix for the live-found bug where pre-turn subscribers were dropped. 63 tests.
- **Wave 2 (parallel) — `transport-ws`** (fan-out: per-connection chat-subscription map; `chat.send` auto-subscribes sender + `startTurn`; new ops in pure `router.ts`; `close` drops subs but NEVER aborts a turn; removed the turn `AbortController`) + **`transport-http`** (only test fakes updated for the 3 new methods; runtime unchanged). host-bin untouched.
- **LIVE-VERIFIED against flash** (2-client WS test, `/tmp/ws_multi.ts`): (S1) two clients both stream a turn; closing the SENDER mid-turn → the other keeps receiving through `done` and the turn persists (1197 chars) — AI kept going independent of the interface; (S2) a client joining mid-turn gets `turn-start` replayed + the rest live. `RESULT OVERALL: OK`.
- **Recovery (scar tissue):** first Wave-1 impl stored listeners INSIDE the per-turn hub and `startTurn` made a fresh empty-listener hub → every pre-turn subscriber dropped; live test got zero deltas though the turn ran+persisted. Caught by live-verify (unit test had subscribed AFTER start, masking it). Fixed via the persistent-subscribers / per-turn-buffer split.

## Turn continuity — CR-3: user prompt on the event stream (DONE)

FE bug (multi-client): a pure watcher (subscribed, not the sender) couldn't see the USER prompt until seal — the user message was passed to the provider + persisted only at seal, never on the turn's outward stream/buffer. FE courier: `frontend-cr3-user-message-handoff.md`.

- **Contract:** `@dispatch/wire` `0.5.0→0.6.0` — additive `TurnInputEvent` `{ type:"user-message"; conversationId; turnId; text }` on the `AgentEvent` union (kernel barrels re-export it). `@dispatch/transport-contract` `0.7.0→0.8.0` (re-export only). Widening broke NO exhaustive switch (typecheck clean) — zero consumer fan-out.
- **session-orchestrator:** `emitToHub({type:"user-message",…})` as the FIRST event of `runTurnDetached` (before `runTurn`) → buffered + broadcast to all subscribers (live + late-join); HTTP path covered via `handleMessage`'s buffer replay. Persistence + metrics unchanged. +3 tests; 3 Wave-1 tests updated (user-message now precedes turn-start).
- **LIVE-VERIFIED vs flash:** a watcher that never sent receives `user-message` (correct text) as its FIRST `chat.delta`, before `turn-sealed`, then the streaming reply. `RESULT: OK`.
- **Process note:** implemented directly by the orchestrator as a one-off (user-approved at the time). SUPERSEDED — the user has since confirmed the ORCHESTRATOR.md model governs: the orchestrator summons owner-agents and does not write feature code itself.

## Cache warming — FE CR-4 lifecycle + CR-1 extensions table + CR-2 catalog scope (DONE)

FE courier in: `../dispatch-web/backend-handoff-cache-warming.md` (+ CR-1/CR-2 from their living `backend-handoff.md`). Courier out: `frontend-cache-warming-lifecycle-handoff.md`. Full report: `reports/cr4-cache-warming-lifecycle.md`.

- **CR-4a:** warming defaults OFF (opt-in per conversation) — `parseSettings` + `DEFAULT_STATE`; re-enabling now restores the persisted interval. Known gap (pre-existing, fail-safe): no boot hydration of persisted opt-in across server restarts.
- **CR-4b:** post-warm surface updates now carry the FUTURE `nextWarmAt` (re-arm BEFORE notify); `turnSettled`/`turnStarted` also push (fresh schedule after seal / `null` while generating).
- **CR-4c:** new `POST /conversations/:id/close` (tab close ≠ disconnect): aborts the in-flight turn via a per-turn `AbortController` → kernel `runTurn` `signal` (partial persist + normal seal, `done.reason:"aborted"`), and emits new typed hook `conversationClosed` → cache-warming disables sync + persists OFF. Disconnect/`chat.unsubscribe` semantics unchanged.
- **CR-4d:** no change — initial `surface` echo already at HEAD (FE probed a stale up2 boot).
- **CR-1:** loaded-extensions emits count stat + ONE `custom`/`rendererId:"table"` field (`TablePayload` exported); columns Name|Version|Trust|Activation, all trust tiers.
- **CR-2:** `SurfaceCatalogEntry.scope?: "global"|"conversation"` (`ui-contract` `0.1.0→0.2.0`); set on both surfaces. `transport-contract` `0.8.0→0.9.0` (additive `CloseConversationResponse`).
- 907 tests pass (+13 new); typecheck + biome clean.
  **LIVE-VERIFIED vs `bin/up`:** default-off, 2 automatic warms @5s each pushing future `nextWarmAt`, mid-turn close → `abortedTurn:true` + `done.reason:"aborted"` + warming disabled, catalog scopes + table field present, echo present.

## History windowing — FE CR-5 (DONE)

FE courier in: `../dispatch-web/backend-handoff-chat-limit.md` (+ living `backend-handoff.md` §2 CR-5). Courier out: `frontend-history-windowing-handoff.md`. User-gated call: ask #3 shipped as the INVARIANT option (no new field) — seq is contractually **1-based, monotonic, gap-free**; FE derives `hasOlder` from `chunks[0].seq > 1`.

- **Wave 0 (orchestrator, contracts):** `limit`/`beforeSeq` query-param semantics + validation + `latestSeq` windowed-read caveat documented on `ConversationHistoryResponse` (`@dispatch/transport-contract` `0.9.0→0.10.0`); 1-based seq guarantee codified on `StoredChunk` (`@dispatch/wire` `0.6.0→0.6.1`, doc-only).
- **Wave 1 — `conversation-store`:** additive `loadSince(id, sinceSeq?, window?: { beforeSeq?, limit? })` — selection `sinceSeq < seq < beforeSeq`, newest-`limit` window, result stays ascending; garbage-in treated as absent (transport validates upstream). +8 tests.
- **Wave 2 — `transport-http`:** parses + validates the params (positive integers; malformed/zero/negative → 400 `{ error }`, store never called with an invalid window); two-arg call shape preserved when no params (regression-guarded). +20 tests.
- 935 vitest + 112 bun tests, typecheck + biome clean. **LIVE-VERIFIED** (isolated boot, real flash turns): firstSeq=1; `limit=2`→`[5,6]` ascending w/ correct `latestSeq`; `limit=9999`→ full log; `beforeSeq=3`→`[1,2]`; `beforeSeq=3&limit=1`→`[2]`; `limit=0`/`beforeSeq=0`/`limit=abc`→400×3. `RESULT: OK` ×6.
- **Scar tissue (process):** (1) probing with a PRIVATE boot was overkill — the windowing checks are read-only GETs and the dev stack was running; prefer probing `bin/up`/`up2` or asking the user (ORCHESTRATOR §8 updated).
  (2) The §8 boot recipe was stale (`DISPATCH_API_KEY_OPENCODE1` doesn't exist; an empty re-export OVERRIDES `.env` → "No providers registered"; `.env`'s `BACKEND_PORT` beats `PORT`; un-isolated data paths spawn a duplicate collector on the dev DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'` self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands.

## Reasoning effort (current milestone)

User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max` (Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn `ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks sensible budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now.

- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`); `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/`SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`); glossary entry. typecheck + biome clean.
- [x] **Wave 1 (parallel ×3, disjoint):** `conversation-store` get/setReasoningEffort (own key space, mirrors cwd; +12 tests); `provider-anthropic` (../claude commit `c0835a4`, mode A summon with `--dir ../claude`, contract excerpt INLINED per the cross-`--dir` hang rule) — `REASONING_EFFORT_BUDGETS` 4096/10240/16384/32768/65536, raises max_tokens above budget, strips temperature when thinking on, absent → byte-stable body (+12 tests); `cli` `--effort` flag, parse-validated, body key omitted when unset (+8 tests).
- [x] **Wave 2:** `session-orchestrator` — exported pure `resolveReasoningEffort` (override → stored → `"high"`), additive `StartTurnInput.reasoningEffort`, providerOpts always stamped, **warm() parity** (same resolved effort as a real turn — prompt-cache safe), own fakes fixed (+9 tests).
- [x] **Wave 3 (parallel ×2):** `transport-http` — `/chat` validation (400 names valid levels, orchestrator never sees bad input), threads to startTurn, GET/PUT `/conversations/:id/reasoning-effort` mirroring cwd endpoints, own fakes fixed; `transport-ws` — `chat.send` threading + validation (+3 tests).
- [x] Verified: `tsc -b` EXIT 0, biome clean, **993 vitest + 189 bun** green; all agents in-lane. Commits: arch-rewrite `35197ed` (contracts) + `020e051` (impl); ../claude `c0835a4`.
- [ ] Live-verify vs claude (thinking deltas streamed at xhigh; persisted PUT honored next turn).
- [x] FE courier handoff written: `frontend-reasoning-effort-handoff.md` (user couriers to `../dispatch-web`): ChatRequest/chat.send field + GET/PUT endpoints + ladder + default-`high` semantics + cache note.

## Message queue + steering injection (DONE)

Design: this file's roadmap item 3 (now implemented). User-gated calls: a **separate `message-queue` standard extension** (dependsOn `surface-registry`) owns the queue STATE + a per-conversation `custom` surface; the **session-orchestrator** owns delivery (drain → inject → carry) + emits the `steering` event (it owns the chat hub — no `chatEmit` service needed); the **kernel** gets a generic `drainSteering` callback. Glossary: added **message queue**, **steering**, **queued message**. Enqueue when idle **starts a turn** (user choice; `chat.queue` degrades to `chat.send`). Steering text rendered live via a new additive `steering` `AgentEvent`; queue state via the surface (NOT the chat stream).
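The split described above (the queue extension owns state, the orchestrator owns delivery) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the shipped code: the `QueuedMessage` shape, the `TurnRunner` interface, the `enqueueMessage` name, and joining drained texts with a blank line are all hypothetical.

```typescript
// Illustrative sketch only: shapes and names are assumptions, not the real contracts.
interface QueuedMessage {
  readonly text: string;
}

/** The slice of the orchestrator this sketch needs (hypothetical interface). */
interface TurnRunner {
  isActive(conversationId: string): boolean;
  startTurn(conversationId: string, text: string): void;
}

/** Per-conversation in-memory queue state. */
class MessageQueue {
  private readonly queues = new Map<string, QueuedMessage[]>();

  enqueue(conversationId: string, text: string): void {
    const queue = this.queues.get(conversationId) ?? [];
    queue.push({ text });
    this.queues.set(conversationId, queue);
  }

  /** Remove and return everything queued for the conversation, oldest first. */
  drain(conversationId: string): QueuedMessage[] {
    const queue = this.queues.get(conversationId) ?? [];
    this.queues.delete(conversationId);
    return queue;
  }
}

/** Enqueue facade: an idle conversation starts a turn, an active one queues. */
function enqueueMessage(
  runner: TurnRunner,
  queue: MessageQueue,
  conversationId: string,
  text: string,
): "started" | "queued" {
  if (!runner.isActive(conversationId)) {
    runner.startTurn(conversationId, text);
    return "started";
  }
  queue.enqueue(conversationId, text);
  return "queued";
}

/** Drain everything queued and combine it into one steering text (assumed separator). */
function drainSteering(queue: MessageQueue, conversationId: string): string | undefined {
  const drained = queue.drain(conversationId);
  if (drained.length === 0) return undefined;
  return drained.map((message) => message.text).join("\n\n");
}
```

Keeping `drainSteering` a plain function over the queue is what lets the kernel stay generic: it only needs a callback that returns the next batch of messages, with no knowledge of where they came from.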
- **Wave 0 (orchestrator, contracts):** `RunTurnInput.drainSteering?: () => readonly ChatMessage[]` (kernel contract — generic, kernel stays pure); `QueuedMessage` + `QueuePayload` + `TurnSteeringEvent` (type `"steering"`, additive to `AgentEvent`) in `@dispatch/wire` (`0.7.0→0.8.0`); `POST /conversations/:id/queue` + WS `chat.queue` op + `QueueRequest`/`QueueResponse` in `@dispatch/transport-contract` (`0.11.0→0.12.0`). typecheck clean except the expected transport-ws exhaustive-switch fan-out (fixed in Wave 3).
- **Wave 1 (parallel ×2, disjoint):** `kernel` runtime — calls `drainSteering` at the tool-result boundary only when continuing to a next step (gated; no drain on max-steps), +6 pure tests (65 total); `message-queue` (NEW ext) — pure queue core (enqueue/getQueue/drain/combine) + `MessageQueueService`/`messageQueueHandle` + per-conversation `custom` surface (`rendererId:"message-queue"`, `QueuePayload`), 12 tests. (The message-queue agent DIED mid-task after writing all src+tests but before verifying/reporting; orchestrator recovered by running `bun install` + root tsconfig ref + verifying directly — tsc/vitest/biome clean, 12 tests pass; no hand-fixing of impl.)
- **Wave 2:** `session-orchestrator` — added `enqueue` facade (idle→`startTurn`, active→queue.enqueue) + `resolveQueue?` dep (self-wired lazily in `activate` via `host.getService(messageQueueHandle)` — host-bin does NOT wire it) + `drainSteering` wrapper (drain → emit `steering` → return one combined user `ChatMessage`) + post-seal carry (non-empty queue → new turn), +8 tests (85 total). `message-queue` is an OPTIONAL dep (feature degrades off if absent).
- **Wave 3 (parallel ×3):** `host-bin` — registered `message-queue` in `CORE_EXTENSIONS` (+dep+ref), 28 tests; `transport-http` — `POST /conversations/:id/queue` route + validation, 145 tests; `transport-ws` — `chat.queue` op + fixed the Wave-0 exhaustive-switch fan-out, 29 vitest + 20 bun.
- Verified: `tsc -b` EXIT 0, biome clean (280 files), **1043 vitest + 199 transport bun** pass; all agents in-lane. **Boot smoke:** private instance boots clean with `message-queue` registered (no activation crash).
- [x] FE courier handoff written: `frontend-message-queue-handoff.md` (user couriers to `../dispatch-web`): surface (`rendererId:"message-queue"`), `chat.queue` WS op, `steering` event, HTTP `POST /queue`, auto-start-when-idle, carry semantics, version bumps.

## Umans AI Coding Plan provider (DONE)

User-gated calls: a new **`provider-umans`** standard extension wrapping the Umans OpenAI-compatible backend (`https://api.code.umans.ai/v1`). Built via the **full-refactor path**: first extract a generic `@dispatch/openai-stream` library from `provider-openai-compat`, then build `provider-umans` on top. Self-contained (reads `UMANS_API_KEY` from env directly — no `auth-apikey` dep).

- **Wave 1 — `@dispatch/openai-stream` lib (NEW package):** extracted the generic OpenAI functions (convert-messages, convert-tools, parse-sse, listModels, stream, provider) from `provider-openai-compat` into a pure library package. `createOpenAICompatProvider` parameterized: `id: string` (was hardcoded `"openai-compat"`) + `transformBody?: (body, opts) => Record` hook (for provider-specific body fields). Refactored `provider-openai-compat` to import from the lib (thin extension.ts, backward-compat re-exports, manifest unchanged, byte-identical behavior). Full tsc EXIT 0, 66 vitest, biome clean. Report: `reports/provider-umans-wave1-openai-stream.md`.
- **Wave 2 — `provider-umans` (NEW ext):** imports `createOpenAICompatProvider` from the lib; registers provider id `"umans"`; `transformBody` maps Dispatch `reasoningEffort` (`low|medium|high|xhigh|max`) → Umans `reasoning_effort` (`none|low|medium|high`, capping `xhigh`/`max`→`high`); dynamic `listModels` (GET /v1/models); default model `umans-coder` (env `UMANS_MODEL` or config `provider.umans.model`); baseURL env `UMANS_BASE_URL`; absent key → warn + skip registration (graceful). Pure core: `mapReasoningEffort` + `resolveUmansConfig` (factored out for direct unit testing). 12 tests. Report: `reports/provider-umans.md`.
- **Wave 3 — host-bin wiring:** registered `provider-umans` in `CORE_EXTENSIONS` + added `@dispatch/provider-umans` dep + root tsconfig ref. No credential-store entry needed (self-contained — reads env directly, doesn't go through `auth-apikey`). 28 host-bin tests.
- Verified: full-graph `tsc -b` EXIT 0, biome clean (293 files), **1059 vitest** pass. **Boot smoke:** without `UMANS_API_KEY` → `"provider-umans: no UMANS_API_KEY. Provider not registered."` (graceful skip); with `UMANS_API_KEY=sk-test` → `"provider-umans: registered (model=umans-coder)"`.
- [x] **LIVE-VERIFIED against the real Umans API:** the dev stack (umans-glm-5.2) called `web_search` (Firecrawl) in a real turn — first live Umans API call, clean response.

## web_search tool — Firecrawl (DONE)

Standard tool extension `tool-web-search` backed by a self-hosted Firecrawl instance (`http://100.102.55.49:31329/v1`, Tailscale, no API key). One tool `web_search` with 4 modes: search, scrape, crawl (polls status URL), map — mirroring the proven opencode tool. Pure core: `validateArgs` (discriminated union by mode) + `format*` functions + `truncateOutput`. Injected edge: `FirecrawlClient` (injectable `fetchFn` + `sleep` + `now`), `AbortSignal.any` for per-request timeout + caller cancellation. `concurrencySafe: true`, `capabilities: { network: true }`. 38 tests.
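A pure validator over a union discriminated by `mode` might look roughly like the sketch below. It is an assumption-heavy illustration: the field names `query` and `url`, the error strings, and the `Validation` result shape are hypothetical and not taken from `tool-web-search`; only the four mode names come from the description above.

```typescript
// Illustrative sketch only: field names and error messages are assumptions.
type WebSearchArgs =
  | { mode: "search"; query: string }
  | { mode: "scrape" | "crawl" | "map"; url: string };

type Validation =
  | { ok: true; args: WebSearchArgs }
  | { ok: false; error: string };

/** True when the value parses as an absolute http(s) URL. */
function isHttpUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

/** Validate untrusted tool arguments; never throws, always returns a tagged result. */
function validateArgs(input: unknown): Validation {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "arguments must be an object" };
  }
  const { mode, query, url } = input as Record<string, unknown>;

  if (mode === "search") {
    if (typeof query !== "string" || query.trim() === "") {
      return { ok: false, error: "search requires a non-empty query" };
    }
    return { ok: true, args: { mode, query } };
  }

  if (mode === "scrape" || mode === "crawl" || mode === "map") {
    if (typeof url !== "string" || !isHttpUrl(url)) {
      return { ok: false, error: `${mode} requires an http(s) url` };
    }
    return { ok: true, args: { mode, url } };
  }

  return { ok: false, error: "mode must be one of: search, scrape, crawl, map" };
}
```

Returning a tagged result instead of throwing keeps the validator pure and lets the tool shell turn a failed validation into an ordinary error result for the model.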
Report: `reports/tool-web-search.md`.

- **LIVE-VERIFIED:** the dev stack (umans-glm-5.2) called `web_search` → Firecrawl returned real results (Paris, France) — first live Umans API call too.

## todo tool — per-conversation task list + surface (DONE)

Standard tool extension with a single `todo_write` tool (opencode `todowrite` pattern: full-list replace, returns JSON, no business-rule enforcement — the description guides the model). Per-conversation in-memory state (`Map`). Per-conversation surface (`rendererId: "todo"`, `scope: "conversation"`) via subscriber-notify (message-queue pattern). `concurrencySafe: false` (mutates shared state).

- **Wave 0 (orchestrator, kernel contract):** added `conversationId?: string` to `ToolExecuteContext` (additive, backward-compatible). Wired in `dispatch.ts` — the kernel already had `conversationId` as a parameter, just wasn't passing it through to the tool context. 170 kernel tests pass.
- **Wave 1 (todo extension):** pure core (`validateTodos` — shape only; `getTodos`/`setTodos`/`clearTodos` — fresh array copies; `buildTodoSpec`; `formatTodoResult` → `JSON.stringify`). Shell: `createTodoWriteTool({ state, notify })` + surface provider. 26 tests. Report: `reports/todo.md`.
- **Wave 2 (host-bin wiring):** registered `todo` in `CORE_EXTENSIONS` + dep + root tsconfig ref. 28 host-bin tests.
- Verified: full-graph `tsc -b` EXIT 0, biome clean (314 files), **1123 vitest** pass. **Boot smoke:** `"todo: registered"` + activated.
- [ ] Live-verify (model uses `todo_write` in a real turn — the dev stack has it loaded).

## youtube_transcript tool (DONE)

Standard tool extension `tool-youtube-transcript` backed by a self-hosted transcriber service (`http://100.102.55.49:41090`, Tailscale, no API key). One tool `youtube_transcript` — takes a YouTube URL, fetches the transcript (completed → full text + timestamped segments; queued/processing → position + ETA + `.youtube_subtitles_pending` retry convention; failed → error).
Pure core: `validateUrl` + `format*` functions + `truncateOutput`. Injected edge: `TranscriptClient` (injectable `fetchFn`, `AbortSignal.any` for cancellation). `concurrencySafe: true`, `capabilities: { network: true }`. 30 tests. Report: `reports/tool-youtube-transcript.md`.

## CLI — cross-client messaging + open tab (DONE)

Roadmap items 2 + 4. The CLI can now list conversations, read the last AI message (blocking), send messages (blocking or `--queue`), and signal the frontend to open a conversation tab. Short-ID prefix resolution (4+ chars → full ID via `GET /conversations?q=`).

- **Wave 0 (orchestrator, contracts):** `ConversationMeta` in `@dispatch/wire` (`0.8.0→0.9.0`); `ConversationListResponse`, `LastMessageResponse`, `OpenConversationResponse`, `SetTitleRequest`, `TitleResponse`, WS `conversation.open` in `@dispatch/transport-contract` (`0.12.0→0.13.0`); `listConversations()`/`getConversationMeta()`/`setConversationTitle()` on `ConversationStore`; new routes declared in transport-http manifest; `conversationOpened` hook in session-orchestrator.
- **Wave 1 (conversation-store):** metadata tracking (createdAt on first write, lastActivityAt on every append, title from first user message truncated 80 chars); `conv-index` key tracks all conversation IDs; `extractTitle` pure helper. 21 new tests (81 total).
- **Wave 2 (parallel, transport-http + transport-ws):** `GET /conversations` (list with `?q=` prefix filter), `GET /conversations/:id/last` (blocks until turn settles via subscribe-then-checkIsActive, returns last assistant text via pure `extractLastAssistantText`), `POST /conversations/:id/open` (emits `conversationOpened` hook), `PUT /conversations/:id/title`; `emit` threaded from `host.emit` → `createApp`. transport-ws subscribes to `conversationOpened` + broadcasts `ConversationOpenMessage` to all connected WS clients. 21+2 new tests.
- **Wave 3 (CLI):** `dispatch list` (table: short ID + title + activity), `dispatch read ` (blocking, prints last AI message), `dispatch send --text` (blocking by default; `--queue` for non-blocking enqueue; `--open` signals FE). Short-ID resolution (4+ chars → prefix search; 32+ chars = full UUID). 48 new tests (108 total).
- Verified: full-graph `tsc -b` EXIT 0, biome clean (327 files), **1240 vitest** pass. **Boot smoke + endpoint smoke:** `GET /conversations` → `[]`, `GET /conversations/:id/last` → `{content:""}`, `POST /conversations/:id/open` → `{conversationId}`.
- [ ] Live-verify end-to-end (CLI → real conversation → FE tab open).

## Workspaces (DONE)

Cross-repo design ask from `../dispatch-web` (`backend-handoff-workspaces.md`). Outbound courier: `frontend-workspaces-handoff.md` (final shapes + Q1–Q8).

- **Boundary decision:** workspaces live inside `conversation-store` (metadata + cwd persistence owner); no new extension. Single owner-agent for all workspace storage + service methods.
- **Versions:** `@dispatch/wire` `0.11.0→0.12.0`, `@dispatch/transport-contract` `0.15.0→0.16.0`, `@dispatch/ui-contract` unchanged. Kernel re-exports `Workspace`/`WorkspaceEntry`.
- **Key decisions:** `DELETE /workspaces/:id` closes all conversations (status→"closed") + reassigns to "default" + deletes workspace; auto-create workspace on turn start if missing; `PUT /workspaces/:id` create-on-miss with optional `title`/`defaultCwd`; `DELETE /conversations/:id/cwd` to clear explicit cwd; `GET /conversations/:id/lsp` roots at effective cwd; WS lifecycle push deferred.
- **Waves:**
  - **Wave 0 (orchestrator):** contracts (wire `0.12.0` + transport-contract `0.16.0` + kernel re-exports). tsc + biome clean.
  - **Wave 1 (conversation-store):** workspace persistence + service methods (`getWorkspace`, `ensureWorkspace`, `setWorkspaceTitle`, `setWorkspaceDefaultCwd`, `deleteWorkspace`, `listWorkspaces`, `getWorkspaceId`, `setWorkspaceId`, `getEffectiveCwd`, `isValidWorkspaceSlug`); `listConversations` filter; `forkHistory`/`replaceHistory` preserve `workspaceId`. 111 bun tests. CRs (kernel re-exports, `bun install`) resolved by orchestrator.
  - **Wave 2 (session-orchestrator):** `workspaceId` on `StartTurnInput`/`EnqueueInput`; effective cwd resolution (`getCwd` → `getEffectiveCwd`); auto-create workspace on turn start; warm parity. 93 vitest (+8).
  - **Wave 3 (parallel):** `transport-http` (workspace routes, `workspaceId` threading, `?workspaceId=` filter, `DELETE /conversations/:id/cwd`, effective cwd for LSP, slug validation; 166 tests), `transport-ws` (`workspaceId` on `chat.send`/`chat.queue`; 32 tests), `cli` (`--workspace`/`-w` flag; 123 tests).
- FE handoff sent to agent 4091 via `dispatch send --queue` (non-blocking).
- Verified: full-graph `tsc -b` EXIT 0, biome clean (328 files), **1283 vitest + 199 transport bun** pass (1 pre-existing `tool-shell` failure unrelated).
- **LIVE-VERIFIED** against dev stack (`bin/up`): 11/11 workspace checks pass — create-on-miss, rename, set default-cwd, invalid-slug 400, unknown 404, delete-default 409, chat with workspaceId stamps conversation, workspace filter, cwd inheritance (null = inheriting), delete cascade (closedCount:1, workspace→404).
- `dist/` rebuilt for FE (wire + transport-contract + kernel .d.ts contain Workspace types). FE agent 4091 notified twice (handoff + dist-ready).

## Open items

- **`prefix.fingerprint` / `warm|real` cache-bust attributes (deferred):** decoupled from dedup by the content-addressed decision; also gated on cache-warming being built (not yet) so `warm|real` can't be honestly stamped. Later cache-bust-debug milestone (`notes/observability-design.md` §3.1, §12).
- **D9 analytics roll-ups (deferred):** rollup table shape + `GROUP BY` indexes + retention asymmetry + periodic rollup job (`notes/observability-design.md` §2 D9, §12). The scheduler mechanism (`host.scheduler.register`) already exists.
- **D8 `prompt.assembly` segments:** deferred-by-design (await the context-filter chain).
- **In-memory state persistence (message queue + todo list):** both the message queue and the todo list are in-memory only (`Map` in the extension's `activate`). Neither persists across server restarts. If persistence is needed later, both would write through `host.storage` (the conversation-store pattern: separate key space per feature, append/write per conversation).

## Roadmap

1. **Web frontend** (in progress, SEPARATE repo `../dispatch-web`; Svelte + DaisyUI, same methodology). Slice 2 = browser chat MVP consuming the wire/transport-contract + metrics. Cross-repo contract changes are couriered via the user (ORCHESTRATOR §7); `lsp references` does not span repos.
2. ~~**CLI → open-tab handoff (cross-client messaging)**~~ — **DONE** (see CLI milestone section above; list, read, send, --queue, --open, short-ID resolution).
3. **Message queue + steering injection — DONE** (see the milestone section above; prerequisite for item 2's `--queue` flag met).
4. ~~**CLI flag to open/activate an FE tab**~~ — **DONE** (the `--open` flag on `dispatch send` calls `POST /conversations/:id/open` → backend broadcasts `conversation.open` WS message to all connected FE clients).
5. ~~**`todo` tool**~~ — **DONE** (see milestone section above).
6. ~~**`web_search` tool**~~ — **DONE** (see milestone section above).
7. **Message queue — close-with-queued-messages (deferred product decision):** if a client closes a conversation (`POST /conversations/:id/close`) while the queue is non-empty, the carry currently still fires (starts a new turn on the closed conversation). Decide: does closing discard pending steering, or honor it?
   If "discard," gate the carry on `finishReason !== "aborted"` in session-orchestrator (one-line). No FE action either way.
8. **Live-verify the steering flow (once the frontend is complete):** run a live `chat.queue` → tool-call → `steering` event flow against a real tool-calling model, end-to-end. The logic is unit/integration tested + boot-smoke-clean; this is the live end-to-end smoke. Blocked on the frontend wiring the queue surface + `chat.queue` op (or run it backend-only with a probe client).
9. ~~**Tab persistence across devices (conversation lifecycle)**~~ — **DONE**. Conversations have `status: "active" | "idle" | "closed"` on `ConversationMeta`. Orchestrator transitions: `idle → active` on turn-start, `active → idle` on settle, `→ closed` on close. `conversation.statusChanged` WS broadcast. `GET /conversations?status=` filter. CLI `dispatch list` defaults to `active,idle`; `--status`/`--all` flags. FE handoff: `frontend-conversation-lifecycle-handoff.md`.
10. ~~**Conversation compacting**~~ — **DONE**. Non-destructive: forks old history to a new archive conversation (new UUID), replaces the original conversation's history with `[system: summary] + recent N` (ID stays the same so messaging is unaffected). `compactedFrom` chains backward: A → Y → X. Manual via `POST /conversations/:id/compact`; automatic after turn settles if `compactThreshold` (default 85%) is exceeded. `GET/PUT /conversations/:id/compact-percent` for the setting. `conversation.compacted` WS broadcast. CLI `dispatch compact `. FE handoff: `frontend-compaction-handoff.md`.
11. **FE: consume `GET /conversations/:id/status` for crash-recovery re-sync.** Backend endpoint shipped (branch `fix/stuck-generating`): returns `{ conversationId, isActive, status }` where `isActive` is the orchestrator's in-memory truth and `status` is the persisted lifecycle status.
On reconnect (WS re-establish or page reload), the FE should call this for any tab it believes is "generating"; if `isActive: false`, override the local spinner to idle regardless of the persisted `status` (defense-in-depth against status drift the boot-sweep didn't catch). No FE handoff doc needed — the endpoint is self-documenting (`GET /conversations/:id/status`).

(Done and dropped from the list: CLI; dedup / storage growth; message queue + steering injection.)

## Stop generation must abort a hanging tool + not brick the conversation (DONE)

FE courier in: "Stop generation doesn't abort a hanging tool call." When the user clicks Stop during a tool that hangs (e.g. `run_shell` with a blocking/grandchild-holding process), the turn never sealed → the FE spinner spun forever AND the conversation was bricked (next `chat.send` rejected as `"already-active"` because `activeTurns` was never cleared).

- **Root cause:** the kernel's `executeToolCall` awaited `tool.execute(...)` with **no race against the abort signal** — a tool that ignored `ctx.signal` (or blocked on something it couldn't interrupt) blocked `drain` → `runTurn` never returned → session-orchestrator's `finally` (which clears `activeTurns`) never ran. (The `/stop` endpoint, `stopTurn`, and the `finally` cleanup were already correct — they just needed `runTurn` to return.) Secondary: `realSpawn` resolved on `child.on("close")` (waits for stdio) and killed only the immediate child, so a grandchild holding the pipes could stall the spawn promise + leak.
- [x] **kernel** — `executeToolCall` now **races** `tool.execute` against `signal` via `Promise.race`; on abort it **resolves** (not rejects) `{ content: "Aborted", isError: true }` so the step completes normally → kernel's existing `signal.aborted → finishReason "aborted"` path runs → turn seals cleanly (`done` + `turn-sealed`) → `finally` clears `activeTurns` → **conversation freed, next message accepted**.
Late rejections from the orphaned tool promise are swallowed. 11 tests incl. the durability test (hanging tool `new Promise(() => {})` + abort → `runTurn` returns `finishReason "aborted"`, doesn't hang). Report: `reports/kernel-abort-race.md`.
- [x] **tool-shell** — `realSpawn` spawns `detached: true` (own process group); on abort **and** timeout kills the **group** (`process.kill(-pgid, "SIGKILL")`) AND resolves immediately (no `close`-dependency) so a grandchild holding the pipes can't stall the spawn or leak. 4 tests (grandchild abort, grandchild timeout, normal-completion stdout capture, simple abort). Report: `reports/tool-shell-process-group-kill.md`.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1326 vitest** pass; both in-lane; kernel zero internal mocks.
- [ ] **Live-verify** (needs a fresh `bin/up` — the dev stack is currently wedged, the very symptom of this bug): start a hanging tool (`run_shell` sleep/grandchild), Stop, then send a NEW message → it must be ACCEPTED (conversation not bricked) and the spinner clears.

## System prompt builder — template-based system context (DONE)

Design: `notes/system-prompt-design.md`. FE courier: `frontend-system-prompt-handoff.md`.

Problem: no system prompt was sent to the provider for regular turns (the messages array started with the user message; `providerOpts.systemPrompt` was never set). This adds a template-based system prompt builder with variable placeholders (`[type:name]`) and conditionals (`[if]`/`[else]`/`[endif]`).

- **Cache constraint (critical):** the system prompt is constructed ONCE (first turn of a new conversation) and persisted. Reused on all subsequent turns (no reconstruction — cache-safe). Reconstructed only on **compaction** (fresh variable resolution + compaction instructions appended).
- **Variable types:** `system:time/date/os/hostname`, `prompt:cwd/model/conversation_id`, `git:branch/status`, `file:` (dynamic — any path).
- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.17.0→0.18.0` — `SystemPromptTemplateResponse`, `SetSystemPromptTemplateRequest`, `SystemPromptVariable`, `SystemPromptVariablesResponse`.
- **Wave 1 — `system-prompt` (NEW ext):** pure parser (29 tests) + variable resolver (injected adapters, 12 tests) + catalog (3 tests) + service handle (`construct` + `get` + `getTemplate` + `setTemplate`, 8 tests). 52 tests total. Default template: persona + AGENTS.md if exists + cwd.
- **Wave 2 (parallel):** `session-orchestrator` (wire service: construct on first turn, get on subsequent, construct+append on compaction; 12 tests) + `transport-http` (GET/PUT `/system-prompt`, GET `/system-prompt/variables`; 6 tests).
- **Wave 3 — host-bin:** registered `system-prompt` in `CORE_EXTENSIONS`.
- [x] Verified: `tsc -b` EXIT 0, biome clean, **1396 vitest** pass.
- [ ] Live-verify (boot smoke: extension activates, `GET /system-prompt` returns default template, `GET /system-prompt/variables` returns catalog).
- [x] **FE courier** sent to FE agent `ffe3`: `frontend-system-prompt-handoff.md`.
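
The template syntax described in this section (`[type:name]` placeholders plus `[if]`/`[else]`/`[endif]`) can be sketched as a single-pass renderer. This is a minimal illustration under stated assumptions, not the shipped `system-prompt` parser: the names `renderTemplate` and `Resolver` are hypothetical, the condition form `[if type:name]` (true when the variable resolves to a non-empty string) is assumed because the notes above do not spell out the condition syntax, and rendering an unknown variable as an empty string is likewise an assumption.

```typescript
// Hypothetical sketch: renderTemplate and Resolver are illustrative names,
// not the real system-prompt extension API.
type Resolver = (type: string, name: string) => string | undefined;

// One token per match: [if type:name], [else], [endif], or [type:name].
// ASSUMPTION: the condition is written as [if type:name].
const TOKEN = /\[if ([a-z_]+):([^\]\s]+)\]|\[(else)\]|\[(endif)\]|\[([a-z_]+):([^\]\s]+)\]/;

// One frame per open [if]: whether the enclosing scope emits text,
// and whether this [if]'s condition held.
interface Frame {
  parentActive: boolean;
  cond: boolean;
}

function renderTemplate(template: string, resolve: Resolver): string {
  const re = new RegExp(TOKEN.source, "g"); // fresh regex, so no shared lastIndex
  const stack: Frame[] = [];
  let out = "";
  let active = true; // are we currently emitting text?
  let last = 0; // end offset of the previous token
  let m: RegExpExecArray | null;

  while ((m = re.exec(template)) !== null) {
    const [whole, ifType, ifName, elseKw, endifKw, varType, varName] = m;
    if (active) out += template.slice(last, m.index); // literal text before the token
    last = m.index + whole.length;

    if (ifType !== undefined) {
      // ASSUMPTION: a condition holds when the variable is non-empty.
      const value = resolve(ifType, ifName);
      const cond = value !== undefined && value !== "";
      stack.push({ parentActive: active, cond });
      active = active && cond;
    } else if (elseKw !== undefined) {
      const frame = stack[stack.length - 1];
      if (frame === undefined) throw new Error("[else] without [if]");
      active = frame.parentActive && !frame.cond;
    } else if (endifKw !== undefined) {
      const frame = stack.pop();
      if (frame === undefined) throw new Error("[endif] without [if]");
      active = frame.parentActive;
    } else if (active) {
      // ASSUMPTION: unknown variables render as an empty string.
      out += resolve(varType, varName) ?? "";
    }
  }
  if (stack.length > 0) throw new Error("unclosed [if]");
  return out + template.slice(last);
}

// Usage with a fixed variable table standing in for the injected adapters.
const vars: Record<string, string> = { "git:branch": "main", "prompt:cwd": "/work" };
const prompt = renderTemplate(
  "You are Dispatch.\n[if git:branch]Branch: [git:branch][else]No git repo.[endif]\ncwd: [prompt:cwd]",
  (type, name) => vars[`${type}:${name}`],
);
console.log(prompt);
```

Under the cache constraint noted above, a renderer like this would run only on the first turn of a conversation and again on compaction, with the rendered string persisted and reused in between. Taking the resolver as a parameter is one way to keep the renderer pure and testable without internal mocks.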