35 files changed, 1 insertions, 4262 deletions
@@ -20,3 +20,4 @@ reports/
 # Local observability journal (runtime artifact)
 .dispatch/journal/
+ai-review-report.md
diff --git a/HANDOFF.md b/HANDOFF.md
deleted file mode 100644
index 29f1689..0000000
--- a/HANDOFF.md
+++ /dev/null
@@ -1,45 +0,0 @@
-# HANDOFF — next steps for the incoming orchestrator
-
-> Read `ORCHESTRATOR.md` first (your operating manual), then `tasks.md` (live
-> status), then this file (what to do next). The project is mature; this file
-> points at the live source of truth and the current open work.
-
-## Where things stand (one paragraph)
-
-Kernel + core extensions + host-bin are built, full-fidelity (every core feature
-is a real manifest-loaded extension). The turn loop runs real tools end-to-end
-against live models. LSP integration, observability (journal/collector/trace-store),
-cache warming, turn continuity (detached turns + multi-client), skills, message
-queue + steering, metrics (live + persisted), per-conversation model/cwd/reasoning
-persistence, and broken-chat self-repair are all DONE and live-verified.
-**`tsc -b` EXIT 0 · biome clean · 1468 vitest green.** The web frontend is a
-separate repo (`../frontend`); contract changes are couriered via the user.
-
-## How to boot & smoke-test
-```bash
-cd /home/tradam/projects/dispat../backend
-# .env auto-loads DISPATCH_API_KEY + BACKEND_PORT (24203).
-# Dev stack (live-reload): bin/up (ports 24203/24205/24204)
-# Stable second stack: ../bin/up2 (ports 25203/25205/25204, isolated data)
-bun packages/host-bin/src/main.ts # boots app + collector
-curl -s -X POST localhost:24203/chat -H 'content-type: application/json' \
-  -d '{"conversationId":"c1","message":"Say hello in 3 words."}'
-```
-Process cleanup uses the `[x]` bracket trick (ORCHESTRATOR §8) — leaked
-server/collector procs poison the next run's counts.
-
-## What's open now
-
-See `tasks.md` for the live checklist. As of this writing:
-- **Per-edit LSP diagnostics** (commit `8f6114b`) — committed + green, NOT yet
-  live-verified against a real running server.
-- **MCP (Model Context Protocol) integration** — the next major feature. Research
-  + plan in progress; see `notes/mcp-design.md` (when written) + `PLAN-mcp.md`.
-- `notes/pending-issues.md` item 1 (workspace tab) — awaiting a user handoff.
-
-## Standing reminders (from ORCHESTRATOR.md — don't relearn the hard way)
-- Summon via `opencode run` (ORCHESTRATOR §2). Parallel wave = multiple concurrent
-  summons on disjoint file sets only.
-- Verify independently (typecheck/test/check) + confirm single-lane edits.
-- Keep `tasks.md` current; write decisions down before pivoting.
-- Be careful with destructive git; back up `notes/` before any reset/clean.
diff --git a/PLAN-mcp.md b/PLAN-mcp.md
deleted file mode 100644
index 560da8e..0000000
--- a/PLAN-mcp.md
+++ /dev/null
@@ -1,128 +0,0 @@
-# Plan — MCP (Model Context Protocol) Integration
-
-> **Status:** PROPOSED — awaiting user approval of design decisions (§7 of
-> `notes/mcp-design.md`).
-> Design: `notes/mcp-design.md`.
-
-## Decisions (to confirm with user)
-
-1. **One `mcp` extension** managing multiple servers (like `lsp`).
-2. **Tool name format:** `<serverId>__<toolName>` (double-underscore separator).
-3. **Phase 1: stdio transport only** (covers freecad-mcp + chrome-devtools-mcp).
-4. **Phase 1: Tools only** (no Resources/Prompts).
-5. **Phase 1: no enable/disable surface** (per-cwd config is sufficient).
-6. **Hand-rolled JSON-RPC** (adapt LSP's rpc.ts + framing.ts; no MCP SDK dep).
-
-## Implementation waves
-
-### Wave 0: Orchestrator (contracts + wiring)
-
-| What | File | Change |
-|---|---|---|
-| No kernel contract change needed | — | The existing `ToolContract` + `host.defineTool()` + `host.getTools()` + `toolsFilter` + `ToolAssembly` are sufficient. MCP tools are just `ToolContract`s registered at runtime. |
-| Glossary | `GLOSSARY.md` | Add `MCP`, `MCP server`, `MCP host` (see design §6). |
-| Root tsconfig | `tsconfig.json` | Add `@dispatch/mcp` project reference (after Wave 1). |
-| host-bin registration | `packages/host-bin/src/main.ts` | Register `mcpExt` in `CORE_EXTENSIONS` (same pattern as `lspExt`). |
-| `bun install` | `bun.lock` | Link the new workspace package. |
-
-> **No `@dispatch/transport-contract` or `@dispatch/wire` version bump** in Phase 1.
-> MCP tools are transparent to the wire (they're just tools the model calls).
-> A future surface (enable/disable, status endpoint) would bump versions.
-
-### Wave 1: `packages/mcp/` (single unit — the extension)
-
-This is the main implementation. One owner-agent builds the entire `packages/mcp/`
-directory. It depends only on `@dispatch/kernel` (contracts) and
-`@dispatch/session-orchestrator` (for the `toolsFilter` handle).
-
-| File | Responsibility |
-|---|---|
-| `src/framing.ts` | `Content-Length` framing for stdio (adapt from LSP's framing.ts — encode/decode). PURE. |
-| `src/framing.test.ts` | Unit tests for encode/decode. |
-| `src/rpc.ts` | JSON-RPC 2.0 client: `request(method, params) → result`, `notify(method, params)`, `onNotification(method, handler)`. Adapts LSP's rpc.ts. PURE (injected `writeFn`). |
-| `src/rpc.test.ts` | Unit tests for request/response/notification handling. |
-| `src/transport.ts` | Transport abstraction: `StdioTransport` (spawn child, pipe stdin/stdout through framing + rpc) + the interface for a future `HttpTransport`. Injected `spawn` (like LSP). |
-| `src/transport.test.ts` | Tests against an in-memory pipe pair (no real spawn). |
-| `src/client.ts` | MCP client: `initialize()` (send proto version + caps, receive server caps), `listTools()` → `tools/list`, `callTool(name, args, signal)` → `tools/call`, listen for `notifications/tools/list_changed`. Tracks connection state. |
-| `src/client.test.ts` | Tests with a mock JSON-RPC connection (injected transport). |
-| `src/config.ts` | PURE config resolution: `.dispatch/mcp.json` → `opencode.json` `mcp` key. Returns `ResolvedMcpServer[]` + `shadowed` flag. Mirrors LSP config.ts. |
-| `src/config.test.ts` | Config resolution tests (precedence, shadow, empty). |
-| `src/registry.ts` | Tool name namespacing (`<serverId>__<toolName>`) + `adaptTool(serverId, mcpTool, client)` → `ToolContract`. The `execute()` proxies to `client.callTool()` and flattens MCP content to a string. PURE (injected client). |
-| `src/registry.test.ts` | Tests for namespacing, content flattening, error handling. |
-| `src/manager.ts` | `McpManager`: one client per server config; lazy-spawn on first access; `status(cwd)`; `getClient(serverId)`; `shutdownAll()`. Mirrors LSP manager.ts. Injected spawn + logger. |
-| `src/manager.test.ts` | Manager lifecycle tests (lazy spawn, shutdown, broken server). |
-| `src/types.ts` | `McpServerConfig`, `McpServerStatus`, `McpService`, `McpToolInfo`, `McpContentItem`. |
-| `src/extension.ts` | manifest + `activate(host)`: real spawn adapter, config resolution per-cwd, manager, register tools via `host.defineTool` (on connect + on `list_changed`), register `toolsFilter` (drop tools from disconnected servers), `mcpServiceHandle`, `deactivate()`. |
-| `src/index.ts` | Public surface exports. |
-
-**Scoping rules for the summon:**
-- `.dispatch/package-agent.md` + `.dispatch/extension-agent.md`
-- `.dispatch/rules/`: `one-owner.md`, `isolation-over-dry.md`, `biome-clean.md`,
-  `pure-core.md`, `no-internal-mocks.md`, `typed-handles.md`,
-  `extension-logging.md`.
-
-**Key guidance for the agent:**
-- Read `packages/lsp/src/` (framing.ts, rpc.ts, config.ts, manager.ts,
-  extension.ts) as the architectural precedent — same pattern, simpler protocol.
-- Read `packages/kernel/src/contracts/tool.ts` for `ToolContract`.
-- Read `packages/kernel/src/contracts/extension.ts` for `HostAPI`,
-  `defineTool`, `addFilter`, `provideService`, `defineService`.
-- Read `packages/session-orchestrator/src/tools-filter.ts` for `ToolAssembly`
-  + `toolsFilter`.
-- The MCP `initialize` flow: send `{ method: "initialize", params: {
-  protocolVersion: "2025-11-25", capabilities: {}, clientInfo: { name:
-  "dispatch", version: "0.0.0" } } }`, receive server capabilities, then send
-  `notifications/initialized`.
-- `tools/list` returns `{ tools: [{ name, description, inputSchema }] }`.
-- `tools/call` takes `{ name, arguments }` and returns `{ content: [...],
-  isError?: boolean }`.
-- Tool names must be namespaced `<serverId>__<toolName>`.
-- `concurrencySafe: false` on all MCP-adapted tools (conservative — MCP servers
-  are generally stateful single-client processes).
-- `Content-Length` framing for stdio (same as LSP — the MCP spec inherited
-  this from LSP).
-- No external dependencies — hand-roll the JSON-RPC + framing (adapt LSP's).
-
-### Wave 2: host-bin registration (orchestrator)
-
-After Wave 1 is verified in isolation:
-- Add `@dispatch/mcp` to root `tsconfig.json` project references.
-- `bun install` to link the workspace package.
-- Register `mcpExt` in `CORE_EXTENSIONS` in `packages/host-bin/src/main.ts`.
-- Verify: `tsc -b` EXIT 0, biome clean, full vitest pass.
-
-### Wave 3: Live verification (orchestrator)
-
-- Boot the dev stack (`bin/up`).
-- Create a `.dispatch/mcp.json` in a test cwd with a simple MCP server
-  (e.g. a trivial stdio server that exposes one tool).
-- Verify: `GET /conversations/:id/lsp`-equivalent — actually, verify by
-  sending a chat that triggers the model to call the MCP tool.
-- Or: test with chrome-devtools-mcp (`npx chrome-devtools-mcp`) if available.
-- Confirm: the model sees the MCP tool, calls it, gets a result.
-- Clean up test config.
-
-## Test strategy (per the asymmetric testing rule)
-
-- **Pure core** (framing, rpc, config, registry, types): zero internal mocks,
-  high coverage. The RPC + framing tests use in-memory pipe pairs (injected
-  transport, not mocked `@dispatch/*`). Config tests use string fixtures.
-- **Shell** (transport, manager, extension): integration tests against
-  in-memory/real child processes. A few tests, not exhaustive unit coverage.
-  Do NOT mock sibling extensions.
-
-## Estimated size
-
-- ~12 source files + ~11 test files.
-- Closest precedent: `packages/lsp/` (~20 files). MCP is simpler (no
-  diagnostics, no incremental sync, no file watching, no sidecars).
-- Expected test count: ~60-80 new tests.
-
-## What is explicitly OUT of scope for Phase 1
-
-- Streamable HTTP transport (Phase 2).
-- MCP Resources and Prompts primitives (Phase 2).
-- Client → Server capabilities (sampling, roots, elicitation) (Phase 2+).
-- Per-conversation enable/disable surface + transport endpoints (Phase 2).
-- Tool poisoning / rug-pull hash validation (security hardening, Phase 2).
-- `mcp-scan`-style static analysis (Phase 2+).
diff --git a/PLAN-per-edit-diagnostics.md b/PLAN-per-edit-diagnostics.md
deleted file mode 100644
index 20671c2..0000000
--- a/PLAN-per-edit-diagnostics.md
+++ /dev/null
@@ -1,44 +0,0 @@
-# Plan — Live Per-Edit Diagnostics (General LSP)
-
-> **Status:** APPROVED — implementing.
-
-## Decisions (confirmed with user)
-
-1. **Multi-server aggregation** — query ALL connected servers matching the file's extension, merge diagnostics tagged by source.
-2. **Incremental sync** — capture each server's `textDocumentSync.change` during `initialize`; compute prefix/suffix diff ranges for `change: 2`; full content for `change: 1`. Generic, works for ALL LSPs.
-3. **`languageId` mapping** — extend the existing `language.ts` with `.rb`/`.rbs`, `.c`/`.cpp`, etc.
-4. **Auto-append to `edit_file`** — after a successful edit, run diagnostics on the post-edit buffer. Only append diagnostics if there are errors/warnings (severity ≤ 2). Don't append on clean edits (no noise).
-5. **60s timeout** — if diagnostics take >10s, prepend a warning: "LSP is taking unusually long. If this happens more than once, raise it to the user." Always append this if slow, regardless of whether there are errors.
-6. **General** — not Steep-specific. Works for any LSP server.
-
-## Implementation waves
-
-### Wave 1: `packages/lsp/` (single unit)
-
-| File | Change |
-|---|---|
-| `src/diff.ts` (NEW) | Pure diff: `computeChangeRange(oldText, newText)` + `offsetToPosition(text, offset)` |
-| `src/language.ts` | Add `.rb`/`.rbs` → `"ruby"`, `.c`/`.h` → `"c"`, `.cpp`/`.cc`/`.hpp` → `"cpp"` |
-| `src/diagnostics.ts` | Add `hasReceivedPush(uri)` tracking, `clearReceived(uri)`, `formatFiltered(uri, minSeverity?)` |
-| `src/client.ts` | Capture `textDocumentSync.change` from init; track open doc text; add `change(filePath, newText)` with incremental/full sync; fix `languageId` in `open()`; extend `waitForDiagnostics(filePath, opts?)` to accept `text` + `timeoutMs` + return `{ formatted, slow, timedOut }` |
-| `src/tool.ts` | `diagnostics` op: query ALL matching connected servers (not just first); merge tagged by source |
-| `src/types.ts` | Add `getDiagnostics(opts)` to `LspService` + `DiagnosticsResult` type |
-| `src/extension.ts` | Implement `getDiagnostics` (calls manager → all matching clients → merge) |
-| `src/diff.test.ts` (NEW) | Unit tests for diff functions |
-| `src/tool.test.ts` | Multi-server aggregation test |
-| `src/client.test.ts` | `change()`, `languageId`, `waitForDiagnostics` with text tests |
-
-### Wave 2: `packages/tool-edit-file/` (cross-extension)
-
-| File | Change |
-|---|---|
-| `src/extension.ts` | Import `lspServiceHandle` from `@dispatch/lsp`; `host.getService()` in activate; pass to tool |
-| `src/edit-file.ts` | After successful edit: call `getDiagnostics({ filePath, text: newContent, cwd, minSeverity: 2, timeoutMs: 60_000 })`; append if errors; append slow warning if >10s |
-| `package.json` | Add `@dispatch/lsp` dep |
-| `tsconfig.json` | Add `@dispatch/lsp` reference |
-
-### Wave 3: Build wiring (orchestrator)
-
-- Root `tsconfig.json`: add `@dispatch/tool-edit-file` → `@dispatch/lsp` ref if needed
-- `bun install` to link
-- Verify: typecheck + test + biome
diff --git a/ai-review-report.md b/ai-review-report.md
deleted file mode 100644
index 570ba05..0000000
--- a/ai-review-report.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# LSP Fixes Verification Report
-
-## Executive Summary
-The fixes implemented in `feature/lsp-fixes` successfully address the immediate crashes and the primary source of the 9.5 GB memory leak. The optional chaining fix, `fs.watch` error listener, bounded LRU document cache, and `initialize` timeout propagation are all correctly designed and do not introduce regressions. However, the memory leak fix for the diagnostics cache is **incomplete**: while actively opened documents are properly managed, diagnostics pushed for *unopened* background files will still accumulate and leak memory over time.
-
-## Per-Fix Verification
-
-### 1. Crash — TypeError from broken optional chaining (`client.ts`)
-- **Correctness:** **Correct.** The addition of the second optional chaining operator (`void this.rpc?.handleMessage(msg)?.catch(...)`) correctly evaluates to `undefined` when `this.rpc` is null, preventing the synchronous `TypeError` that previously crashed the server.
-- **Completeness:** **Complete.** Because `handleMessage` is an `async` function, it is guaranteed to return a Promise, meaning any internal errors are returned as rejections rather than synchronous throws.
-- **New Issues/Regressions:** None.
-
-### 2. Crash — Unhandled 'error' event on `fs.watch` (`extension.ts`)
-- **Correctness:** **Correct.** Attaching a no-op `'error'` listener to the `fs.watch` instance (`watcher.on("error", () => {})`) prevents Node/Bun from escalating transient filesystem errors (such as a directory vanishing during `bun install`) into unhandled exceptions that crash the main process.
-- **Completeness:** **Complete.** The watcher is properly treated as best-effort.
-- **New Issues/Regressions:** None.
-
-### 3. Memory leak — Unbounded document/diagnostic caches (`client.ts` + `diagnostics.ts`)
-- **Correctness:** **Correct (for opened files).** The `evictIfOverCap` method properly acts as an LRU cache. It uses the JavaScript `Map`'s insertion-order property by grabbing the oldest key via `this.openDocuments.keys().next().value`. The `change()` method successfully maintains LRU order by deleting and re-inserting documents so they move to the tail. Evicted files correctly send `textDocument/didClose` and clear their state via `this.diagnostics.purge()`.
-- **Completeness:** **Incomplete.** The fix bounds the memory for files the agent explicitly opens, but misses files analyzed passively. Language servers (like `pyright`, `rust-analyzer`, or `tsserver`) often scan the workspace in the background and emit `textDocument/publishDiagnostics` for files the client never touched. These trigger `DiagnosticsStore.setPushDiagnostics()`, which caches them unconditionally in the `pushDiagnostics` map. Because these background files are never placed in `openDocuments`, they are never reached by the `evictIfOverCap` loop. As a result, the `DiagnosticsStore` will still slowly leak memory over time as background files accumulate.
-- **New Issues/Regressions:** None introduced. The `closeDocument` method is carefully guarded with `wasOpen` so it doesn't send invalid `didClose` notifications for unopened files. There are no race conditions in the polling logic (`waitForDiagnostics`).
-
-### 4. Minor — Leaked promises on initialize timeout (`rpc.ts`)
-- **Correctness:** **Correct.** Passing the `timeoutMs` parameter directly into `rpc.sendRequest` (rather than wrapping it in an external `Promise.race`) successfully utilizes the connection's internal timeout handler. When the timeout triggers, `rpc.ts` explicitly deletes the request ID from the `this.pending` Map, freeing the memory.
-- **Completeness:** **Complete.** The leaked closure and promise are both cleared safely.
-- **New Issues/Regressions:** None.
-
-## Remaining Concerns
-1. **Unbounded `DiagnosticsStore` Map:** As detailed in Bug 3, `this.pushDiagnostics` and `this.pushReceived` inside `DiagnosticsStore` have no size bounds. A server pushing diagnostics for thousands of untouched files across a large monorepo will keep those strings and objects in memory forever until the client is destroyed.
-
-## Recommendations
-- **Bound passive diagnostics:** Modify `DiagnosticsStore` to implement its own LRU cache for `pushDiagnostics`, or have it coordinate with `client.ts` to ensure that even unopened files with diagnostics are eventually aged out and purged when a maximum threshold is reached.
diff --git a/backend-to-fe-handoff-2.md b/backend-to-fe-handoff-2.md
deleted file mode 100644
index 51a3e34..0000000
--- a/backend-to-fe-handoff-2.md
+++ /dev/null
@@ -1,124 +0,0 @@
-# Backend → FE handoff — context window + percentage-based compact
-
-> Courier to `../frontend`. Response to the context-window ask in
-> `backend-handoff.md` §3 + compacting rework.
-
-## What shipped
-
-1. **`GET /models` now includes `contextWindow` per model** — the FE can replace
-   the hardcoded `MAX_CONTEXT = 1,000,000` with the real value.
-2. **Auto-compact is now percentage-based** (default 85% of `contextWindow`)
-   instead of a flat token count (was 350k).
-
-## Bump pinned deps
-- `@dispatch/wire` → `0.11.0` (unchanged)
-- `@dispatch/transport-contract` → `0.15.0` (unchanged — the new types are
-  additive to the existing version)
-
-## `GET /models` — now includes `modelInfo`
-
-The response now includes an optional `modelInfo` map alongside the existing
-`models` array. The `models` array is unchanged (backward compatible).
-
-```ts
-interface ModelsResponse {
-  readonly models: readonly string[];
-  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
-}
-
-interface ModelMetadata {
-  readonly contextWindow?: number; // max tokens (e.g. 200000)
-}
-```
-
-**Example response:**
-```json
-{
-  "models": ["opencode/deepseek-v4-flash", "umans/umans-glm-5.2"],
-  "modelInfo": {
-    "opencode/deepseek-v4-flash": { "contextWindow": 128000 },
-    "umans/umans-glm-5.2": { "contextWindow": 200000 }
-  }
-}
-```
-
-`modelInfo` is absent when no provider reports `contextWindow`. Each key is the
-same `<credentialName>/<model>` string from the `models` array.
-
-**What the FE should do:**
-- When rendering the composer status bar, use `modelInfo[selectedModel].contextWindow`
-  as the denominator for `contextSize / contextWindow · pct%`.
-- Fall back to the current hardcoded `1,000,000` when `modelInfo` is absent or
-  the selected model has no `contextWindow`.
-
-## Auto-compact: now percentage-based
-
-**Old:** flat token threshold (default 350000). `contextSize >= threshold`.
-**New:** percentage of `contextWindow` (default 85%). `contextSize >= contextWindow × (percent / 100)`.
-
-Also fixed: the check now uses `contextSize` (true context occupancy = last
-step's `inputTokens + outputTokens`) instead of the overcounted aggregate
-`usage.inputTokens` (which summed every step's re-prefilled prompt).
-
-### `GET /conversations/:id/compact-percent` — read percent
-
-200: `CompactPercentResponse { conversationId, percent }`
-- `percent: 0` — auto-compact explicitly disabled (manual only).
-- `percent: null` (not stored) — **default: 85** (85% of the model's context window).
-- Any positive number (1-100) — auto-compact triggers when `contextSize`
-  exceeds `percent`% of the model's `contextWindow`.
-
-### `PUT /conversations/:id/compact-percent` — set percent
-
-Body: `SetCompactPercentRequest { percent: number }`
-- `0` explicitly disables auto-compact.
-- Any positive number (1-100) sets the trigger percentage.
-- Default (when not stored) is 85.
-
-200: `CompactPercentResponse`
-
-**Renamed from `compact-threshold`** — the old endpoint paths, request types,
-and response types are gone. Update any FE code that referenced
-`compact-threshold`.
-
-## New types
-
-```ts
-// @dispatch/transport-contract
-export interface ModelMetadata {
-  readonly contextWindow?: number;
-}
-
-// ModelsResponse now has modelInfo (additive — models array unchanged)
-export interface ModelsResponse {
-  readonly models: readonly string[];
-  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
-}
-
-// Renamed from CompactThresholdResponse
-export interface CompactPercentResponse {
-  readonly conversationId: string;
-  readonly percent: number; // 0 = manual; null = default 85
-}
-
-// Renamed from SetCompactThresholdRequest
-export interface SetCompactPercentRequest {
-  readonly percent: number;
-}
-```
-
-## What the FE needs to do
-
-1. **Use real `contextWindow`** from `GET /models` → `modelInfo[model].contextWindow`
-   instead of hardcoded `MAX_CONTEXT = 1,000,000`.
-
-2. **Rename compact-threshold → compact-percent** in any FE code:
-   - `GET /conversations/:id/compact-percent` (was `compact-threshold`)
-   - `PUT /conversations/:id/compact-percent` (was `compact-threshold`)
-   - `percent` field (was `threshold`)
-
-3. **Settings UI**: change the number input from "token count" to "percent
-   (0-100)". Default 85. 0 = manual only.
-
-4. **No other changes** — the compact endpoint, WS message, and chain
-   architecture are unchanged.
diff --git a/backend-to-fe-handoff.md b/backend-to-fe-handoff.md
deleted file mode 100644
index a6e726b..0000000
--- a/backend-to-fe-handoff.md
+++ /dev/null
@@ -1,141 +0,0 @@
-# Backend → FE handoff — CR-6 resolved + full endpoint list
-
-> Response to `backend-handoff.md` §2 CR-6. Courier back to `../frontend`.
-
-## CR-6: Assign seq during generation — RESOLVED
-
-**What changed:** The backend now persists chunks **incrementally at step
-boundaries** during generation, not only at turn-seal. The user message is
-persisted at turn start (before the first step), and each step's messages
-(assistant + tool-results) are persisted as soon as that step completes.
-
-**How it works:**
-1. Turn starts → user message is `append`ed immediately (gets seq numbers).
-2. Each step completes → step's messages are `append`ed immediately (get seq numbers).
-3. Turn seals → `turn-sealed` emitted (no batch `append` needed — already persisted).
-
-**What this means for the FE:**
-- `GET /conversations/:id?sinceSeq=N` returns committed, seq'd chunks **during
-  generation**. The FE's existing `syncTail` already polls this — it will now
-  find new chunks as each step completes.
-- The FE can adopt option (c) from the CR: fold events for the **current
-  in-progress step** only (streaming text, thinking dots), and `syncTail` for
-  **sealed steps**. The provisional state shrinks to just one step's worth of
-  chunks — never a trim concern.
-- `turn-sealed` becomes a "refresh" signal — all chunks are already committed.
-  The `done` event still carries final usage + contextSize (unchanged).
-
-**No wire/transport-contract change needed.** `StoredChunk` already has `seq`.
-`AgentEvent` types are unchanged. The FE just needs `syncTail` to find seq'd
-chunks during generation (which it already does).
-
-**Implementation detail:** The kernel calls a new `onStepComplete` callback
-(`RunTurnInput.onStepComplete`) after each step's messages are finalized.
-The orchestrator persists them via `conversationStore.append`. If the callback
-isn't called (e.g., test fakes), the orchestrator falls back to batch persist
-after `runTurn` returns — backward compatible.
-
----
-
-## Full endpoint list (current as of [email protected] / [email protected])
-
-### HTTP (port 24203)
-
-| Method | Path | Purpose |
-|---|---|---|
-| `POST` | `/chat` | Stream a turn (NDJSON response, `X-Conversation-Id` header) |
-| `POST` | `/chat/warm` | Cache-warm probe |
-| `GET` | `/models` | Model catalog (now includes `modelInfo` with `contextWindow` per model) |
-| `GET` | `/conversations` | List conversations (`?q=` prefix filter, `?status=active,idle` status filter) |
-| `GET` | `/conversations/:id` | Conversation history (`?sinceSeq=`, `?beforeSeq=`, `?limit=` windowing) |
-| `GET` | `/conversations/:id/metrics` | Per-turn metrics (tokens, timing) |
-| `GET` | `/conversations/:id/last` | Blocking last assistant message |
-| `GET` | `/conversations/:id/cwd` | Per-conversation working directory |
-| `PUT` | `/conversations/:id/cwd` | Set working directory |
-| `GET` | `/conversations/:id/reasoning-effort` | Per-conversation reasoning effort |
-| `PUT` | `/conversations/:id/reasoning-effort` | Set reasoning effort |
-| `GET` | `/conversations/:id/lsp` | LSP server status |
-| `GET` | `/conversations/:id/compact-percent` | Auto-compact percent (0=manual, null=default 85%) |
-| `PUT` | `/conversations/:id/compact-percent` | Set auto-compact percent |
-| `GET` | `/conversations/:id/title` | Read conversation title |
-| `PUT` | `/conversations/:id/title` | Set conversation title |
-| `POST` | `/conversations/:id/close` | Close tab (abort turn + mark `closed`) |
-| `POST` | `/conversations/:id/stop` | **NEW** — Stop generation (abort turn, keep conversation `idle`) |
-| `POST` | `/conversations/:id/compact` | **NEW** — Manual compaction (fork history + replace with summary) |
-| `POST` | `/conversations/:id/open` | **NEW** — Signal FE to open/focus tab (broadcasts `conversation.open`) |
-| `POST` | `/conversations/:id/queue` | Enqueue steering message |
-| `GET` | `/health` | Health check |
-| `GET` | `/metrics/throughput` | Per-model throughput samples |
-| `GET` | `/*` | Static frontend serving (SPA fallback, when `DISPATCH_WEB_DIR` is set) |
-
-### WebSocket (port 24205)
-
-**Client → Server:**
-| Type | Purpose |
-|---|---|
-| `chat.send` | Start a turn (stream events back via `chat.delta`) |
-| `chat.subscribe` | Watch a conversation's turns without sending |
-| `chat.unsubscribe` | Stop watching |
-| `chat.queue` | Enqueue steering (fire-and-forget) |
-| Surface ops | `surface.subscribe`, `surface.invoke`, etc. |
-
-**Server → Client (broadcasts):**
-| Type | Purpose |
-|---|---|
-| `chat.delta` | Per-conversation event (turn-start, text-delta, tool-call, usage, done, etc.) |
-| `chat.error` | Turn error |
-| `conversation.open` | **NEW** — CLI `--open` flag → open/focus a tab |
-| `conversation.statusChanged` | **NEW** — Lifecycle status change (`active`/`idle`/`closed`) |
-| `conversation.compacted` | **NEW** — History compacted (includes `newConversationId` = archive ID) |
-| Surface ops | Catalog, surface data, etc. |
-
-### New types the FE should consume
-
-```ts
-// ConversationMeta ([email protected]) — now has status + compactedFrom
-interface ConversationMeta {
-  id: string;
-  createdAt: number;
-  lastActivityAt: number;
-  title: string;
-  status: "active" | "idle" | "closed";
-  compactedFrom?: string; // archive ID (pre-compaction history)
-}
-
-// WS messages ([email protected])
-interface ConversationCompactedMessage {
-  type: "conversation.compacted";
-  conversationId: string;
-  newConversationId: string; // archive ID
-  messagesSummarized: number;
-  messagesKept: number;
-}
-
-// HTTP response types
-interface CompactResponse {
-  conversationId: string;
-  newConversationId: string; // archive ID
-  messagesSummarized: number;
-  messagesKept: number;
-}
-
-interface CompactPercentResponse {
-  conversationId: string;
-  percent: number; // 0 = manual; null = default 85
-}
-
-interface SetCompactPercentRequest {
-  percent: number;
-}
-```
-
-### FE handoff docs (in the backend repo)
-
-| File | Feature |
-|---|---|
-| `frontend-conversation-lifecycle-handoff.md` | Tab persistence (active/idle/closed) |
-| `frontend-compaction-handoff.md` | Compacting (non-destructive, chained archives) |
-| `frontend-stop-generation-handoff.md` | Stop generation mid-turn |
-| `frontend-conversation-list-handoff.md` | Conversation list + title + open tab |
-| `frontend-conversation-open-handoff.md` | CLI `--open` → `conversation.open` WS message |
-| `frontend-cache-rate-handoff.md` | Cache hit/miss calculation (updated for providers that don't report cache) |
diff --git a/broken-chat-repair-handoff.md b/broken-chat-repair-handoff.md
deleted file mode 100644
index 12deec0..0000000
--- a/broken-chat-repair-handoff.md
+++ /dev/null
@@ -1,180 +0,0 @@
-# Handoff → orchestrator (bcb5): broken-chat self-repair
-
-> From: diagnostic session. Agent/conversation `77574596`
-> (`77574596-3e7b-46f8-8d67-c9e17a529dee`) "broke unrecoverably." User goal:
-> **chats must self-heal when broken so they can continue.** Implement the fixes
-> below. Full diagnosis + plan also in `reports/broken-chat-repair-diagnosis.md`.
-
-## 0. Your job (TL;DR)
-
-`reconcile()` only repairs orphaned tool-calls. The production DB has **two other
-broken states** it doesn't handle, and they make a chat uncontinuable. Implement a
-read-time repair so broken chats auto-heal on next open — **no DB surgery**
-(append-only durability preserved; repair is a turn-path transform that runs on
-every `load()`). Three units, two repos:
-
-- **Wave 1 (arch-rewrite, PARALLEL — disjoint packages):**
-  - `conversation-store` — extend `reconcile` (Layer 1) + harden `load()`.
-  - `openai-stream` — harden `convertMessages` args (Layer 2).
-- **Wave 2 (separate repo `../claude`, SEPARATE agent):**
-  - `provider-anthropic` — harden its `safeJson` (Layer 2 equivalent).
-
-**Key architectural insight that shapes the waves:** Layer 1 lives in
-`conversation-store.reconcile`, which runs in `load()` BEFORE any provider sees
-the messages. So the Layer 1 fix protects **every** provider (openai-compat AND
-anthropic) — the Claude plugin needs **no** Layer 1 change. Layer 2 (malformed
-tool-call args) is **per-provider** serialization safety, so it must be applied
-in each provider's converter (openai-stream + provider-anthropic).
-
-## 1. The break (what actually happened in `77574596`)
-
-Production DB: `/var/lib/dispatch/dispatch.db` (systemd `dispatch.service`).
-136 chunks; seq counter = 136; **all JSON valid; no orphaned tool-calls** — so
-`reconcile()` finds nothing wrong, yet the chat is uncontinuable. The tail:
-
-| seq | role | type | note |
-|---|---|---|---|
-| 133 | assistant | text | "Wave 0 fully verified…" |
-| 134 | assistant | tool-call | `todo_write`, `input` = **malformed JSON** (`json_type=text`, raw string) |
-| 135 | tool | tool-result | isError: "todo_write args must be an object with a `todos` array" |
-| 136 | assistant | **error** | `HTTP 400: unexpected character: line 1 column 1413 (char 1412). Received Model Group=glm-5.2` |
-
-### Root cause (confirmed byte-for-byte)
-- seq 134's `input` is a raw string. Parsing it fails
-  `Expecting ':' delimiter: line 1 column 1413 (char 1412)` — an **exact match** to
-  the provider's `unexpected character: line 1 column 1413`. The **model emitted
-  malformed JSON as the `todo_write` arguments**.
-- Chain: model emits text + malformed-args tool-call (step 5) → kernel dispatches
-  the tool, which returns an error result (seq 135) → kernel calls the provider
-  again (step 6); the request re-includes the assistant message carrying the
-  malformed `arguments` → provider 400s → persisted as an `error` chunk (seq 136).
-
-### Why it's "unrecoverable"
-- `openai-stream` `convertAssistantMessage` serializes tool-call args as
-  `typeof c.input === "string" ? c.input : JSON.stringify(c.input)` — passes the
-  malformed string straight through as the OpenAI `arguments` field → provider
-  400s on **every** continuation.
-- The trailing `assistant` message whose only chunk is `error` serializes to
-  `content:""` + no tool_calls (error chunk is filtered out, leaving an empty
-  assistant message) → also uncontinuable.
-- `reconcile()` touches neither. `load()` also has no try/catch on
-  `JSON.parse(value)` — a single corrupt row would throw and brick the chat.
-
-### Scope (production DB, 140 conversations)
-- **6 conversations end in a trailing `error` chunk:** `102587c0`(seq2, HTTP 401
-  model-not-supported), `2bf78252`(seq2), `61127511`(seq250), `77574596`(seq136),
-  `d0d85eca`(seq2), `e1ee0989`(seq20).
-- **2 tool-calls total** carry a raw malformed-string `input`.
-- `102587c0` has **only** the trailing-error break (no args, no tool-calls) —
-  proving Layer 1 is independently necessary. `77574596` has **both**.
-
-## 2. The fix
-
-### Layer 1 — `conversation-store` `reconcile.ts` (structural repair)
-Extend `reconcileWithReport` to:
-1. **Strip `error` chunks from assistant messages.** An `error` chunk is a
-   failed-generation marker, never valid provider content (no provider understands
-   an "error" content type) — provider-agnostic.
-2. **Drop any assistant message left with no `text` and no `tool-call` chunks**
-   (the now-empty error-only message). This is what unblocks continuation. **Safe:**
-   an error-only step ends with no tool-calls, so it is never followed by a `tool`
-   message — no "tool-without-preceding-assistant-tool_calls" 400 can result. Keep
-   the existing orphaned-tool-call synthesis unchanged.
-3. Extend `ReconcileReport` with counts of stripped error chunks / dropped messages
-   (for the existing `reconcile.repair` boot/log span).
-
-Why here: the constitution designates `reconcile` as "the pure function run on load
-that repairs any partial turn." A trailing error-only assistant message IS a
-partial/broken turn. Pure, provider-agnostic, runs on every `load()` → auto-repairs
-all 6 broken chats. Repair is read-time only; storage (append-only) untouched.
-`loadSince` (FE reads) is intentionally NOT reconciled, so the user still SEES the
-error while the provider gets clean history.
-
-### Hardening — `conversation-store` `store.ts` `load()` (same unit)
-Wrap the per-chunk `JSON.parse(value)` in try/catch: on a corrupt/unparseable row,
-log + skip it (don't throw) so `reconcile` can still run on the rest. Today a single
-bad row makes `load()` throw → unrecoverable. (0 such rows today; "never leave the
-system broken" asks for it.)
-
-### Layer 2 — `openai-stream` `convert-messages.ts` (serialization safety)
-In `convertAssistantMessage`, ensure a tool-call's `arguments` is **always a valid
-JSON string**: if `input` is a string, `JSON.parse` it; on failure substitute a
-valid fallback object (e.g. `JSON.stringify({})` or a wrapped
-`{ _malformed_arguments: <truncated> }`). Objects pass through `JSON.stringify` as
-today. This neutralizes already-stored malformed args (seq 134) so the provider
-stops 400ing on continuation. Follow the SAME semantics as the Claude fix below
-(isolation over DRY: each provider reimplements locally, same behavior).
-
-### Layer 2 (equivalent) — `../claude` `provider-anthropic` `convert.ts` (SEPARATE agent)
-The Claude plugin already has a `safeJson(s)` helper (line ~115) used at
-`input: typeof c.input === "string" ? safeJson(c.input) : c.input`. But its fallback
-**returns the raw string `s` on parse failure** — for Anthropic, `tool_use.input`
-must be an object, so a raw string can still 400 when a historical malformed tool_use
-is re-sent. Fix: make `safeJson` return a **valid object fallback** (e.g. `{}`) on
-parse failure instead of the raw string.
(Layer 1 does NOT apply here — the -arch-rewrite `reconcile` already strips error chunks before the Claude provider sees -the messages, so the Claude converter never receives error-only assistant messages.) - -## 3. Waves & summoning - -- **Wave 1 (arch-rewrite, PARALLEL):** `conversation-store` (Layer 1 + `load()` - hardening) and `openai-stream` (Layer 2). Disjoint packages, no contract/type - change, both depend only on already-built `@dispatch/kernel` contracts. Standard - summon per ORCHESTRATOR §2/§3 (attach the scoped rules: conversation-store gets - `pure-core.md`+`no-internal-mocks.md`+`typed-handles.md`+`extension-logging.md`; - openai-stream gets `pure-core.md`+`no-internal-mocks.md`+`extension-logging.md`; - both get `one-owner.md`+`isolation-over-dry.md`+`biome-clean.md`+`package-agent.md`+ - `extension-agent.md`). -- **Wave 2 (separate repo, SEPARATE agent):** summon against - `--cwd /home/tradam/projects/dispatch/claude` for `packages/provider-anthropic` - (`convert.ts` `safeJson`). That repo has its own `AGENTS.md`; attach the - arch-rewrite `package-agent.md`+scoped rules as needed. Can run in parallel with - Wave 1 (different repo, no shared files). - -## 4. Why this auto-heals `77574596` (and the other 5) — no DB surgery -On next open/continue, `load()` returns history ending at seq 135 (the tool-result): -Layer 1 strips the seq-136 error message; Layer 2 sanitizes the seq-134 args to -valid JSON. The provider receives -`[…, assistant{text+tool-call(args:{})}, tool{error result}]` — a valid "continue -after a tool result" state. The model sees its `todo_write` failed and adjusts. -Chat continues. Same auto-repair applies to the other 5 (Layer 1 alone for the -401/empty cases; Layer 1+2 for any malformed-args case). - -## 5. Test requirements (regression scar tissue) - -**conversation-store `reconcile.test.ts`:** -- `reconcile strips error-only trailing assistant message` (the 77574596/102587c0 - shape: `[user, assistant{error}]` → `[user]`). 
-- `reconcile strips error chunk but keeps sibling text` - (`assistant{text,error}` → `assistant{text}`). -- `reconcile drops assistant message left empty after stripping error` - (`assistant{error}` only → dropped). -- `reconcile keeps tool-call + strips error` (`assistant{tool-call,error}` with a - matching result → `assistant{tool-call}`). -- existing orphaned-tool-call behavior unchanged (regression). -- (hardening) corrupt-JSON chunk row is skipped, rest load + reconcile. - -**openai-stream `convert-messages.test.ts`:** -- `arguments is valid JSON when input is a malformed string` (seed from seq 134's - raw string → output `JSON.parse`s, no throw). -- `arguments passes through valid string input` and `stringifies object input` - (regression). - -**provider-anthropic `convert.test.ts` (claude repo):** -- `safeJson returns a valid object fallback on malformed string` (raw malformed - string → `{}` or wrapped object, not the raw string). -- `safeJson parses valid string input` (regression). - -## 6. Verify (ORCHESTRATOR §4) -`bun run typecheck && bun run test && bun run check` whole-project green; both agents -in-lane (`git status --short`); zero internal mocks in the pure-core units. Live-spot: -open `77574596` against a probe/`bin/up` and confirm it now continues past the tool -result instead of 400-looping. - -## 7. Notes / out of scope -- **Parse-time prevention** (openai-stream / provider-anthropic could reject or - repair malformed args when the model emits them, instead of storing a raw - string) is a deeper follow-up; Layer 2 is the safety net that also repairs - already-stored data. -- Deploying the fix auto-repairs the 6 broken production chats on next load — no - migration needed. 
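The two repair layers from §2 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the shipped code: the `Chunk`, `Message`, and `RepairReport` shapes and the names `stripErrorChunks` and `safeToolArguments` are hypothetical stand-ins, and the real `reconcileWithReport` and `convertAssistantMessage` may be structured differently. The 200-character truncation of the malformed string is likewise an arbitrary choice.

```ts
// Hypothetical, simplified shapes; the real kernel types may differ.
type Chunk =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "tool-result"; toolCallId: string; output: unknown; isError?: boolean }
  | { type: "error"; message: string };

interface Message {
  role: "user" | "assistant" | "tool";
  chunks: Chunk[];
}

interface RepairReport {
  strippedErrorChunks: number;
  droppedMessages: number;
}

// Layer 1 (sketch): strip `error` chunks from assistant messages, then drop any
// assistant message left with neither text nor tool-call chunks. Pure function:
// the input array and its messages are never mutated.
function stripErrorChunks(
  messages: readonly Message[],
): { messages: Message[]; report: RepairReport } {
  const report: RepairReport = { strippedErrorChunks: 0, droppedMessages: 0 };
  const out: Message[] = [];
  for (const m of messages) {
    if (m.role !== "assistant") {
      out.push(m);
      continue;
    }
    const kept = m.chunks.filter((c) => c.type !== "error");
    report.strippedErrorChunks += m.chunks.length - kept.length;
    const hasContent = kept.some((c) => c.type === "text" || c.type === "tool-call");
    if (!hasContent) {
      report.droppedMessages += 1;
      continue;
    }
    out.push(kept.length === m.chunks.length ? m : { ...m, chunks: kept });
  }
  return { messages: out, report };
}

// Layer 2 (sketch): always return a string that JSON.parse accepts.
// Valid strings pass through, objects are stringified, and malformed
// strings are wrapped in a fallback object.
function safeToolArguments(input: unknown): string {
  if (typeof input !== "string") return JSON.stringify(input ?? {});
  try {
    JSON.parse(input);
    return input;
  } catch {
    return JSON.stringify({ _malformed_arguments: input.slice(0, 200) });
  }
}

// Usage: the `[user, assistant{error}]` shape described in §5.
const repaired = stripErrorChunks([
  { role: "user", chunks: [{ type: "text", text: "hello" }] },
  { role: "assistant", chunks: [{ type: "error", message: "HTTP 400" }] },
]);
console.log(repaired.messages.length, repaired.report);
console.log(safeToolArguments('{"todos": ['));
```

Keeping both functions pure and free of provider-specific types matches the handoff's intent: Layer 1 can run once in `load()` for every provider, while each provider's converter would carry its own local copy of the Layer 2 guard.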
diff --git a/crash-review-report.md b/crash-review-report.md deleted file mode 100644 index 272abdb..0000000 --- a/crash-review-report.md +++ /dev/null @@ -1,86 +0,0 @@ -# Production Crash Investigation — Independent Review - -## Executive Summary - -The production Dispatch server is experiencing two distinct failure modes under load: -1. **Exit-code 1 Crashes**: Driven by an unhandled `EventEmitter` `'error'` event from the `ssh2` connection pool, **not** the AI-SDK as previously suspected. -2. **Bun Runtime Segfaults**: Triggered by massive memory pressure from unbounded conversation history serialization during long multi-step agent turns, confirming the "leak" is actually a massive live working set, not a persistent memory leak. - -Additionally, a suspected latent crash path in the cache-warming probe has been confirmed as an Unhandled Promise Rejection. - ---- - -## 1. The Exit-1 Crash ("Timed out while waiting for handshake") - -### Finding: Confirmed Dispatch Bug in SSH Pool (Incorrect Preliminary Finding) -The preliminary analysis hypothesized that the `error: Timed out while waiting for handshake` crash was caused by an unhandled `'error'` on an outbound TLS socket to the AI provider. **This is incorrect.** - -The crash actually originates from the `ssh2` package managing outbound remote computer connections, specifically within `packages/ssh/src/pool.ts`. - -### Technical Analysis -- **The Evidence**: The exact string `'Timed out while waiting for handshake'` is hardcoded in `ssh2/lib/client.js` when the SSH handshake times out or keepalives fail during a re-keying phase. 
-- **The Code Path**: In `packages/ssh/src/pool.ts`, the `doConnect` function attaches an `onError` listener to the `ssh2.Client` instance to catch connection failures: - ```typescript - client.on("error", onError); - ``` - However, when the connection succeeds (`onReady` fires), the `cleanup()` function is called, which **removes the error listener**: - ```typescript - function cleanup(): void { - clearTimeout(timer); - client.removeListener("ready", onReady); - client.removeListener("error", onError); // <-- Listener removed here - } - ``` -- **The Crash Mechanism**: After `doConnect` succeeds, the client is placed in the pool and returned to callers. If the SSH connection drops later or a timeout occurs, the `ssh2.Client` emits an `'error'` event. Because there are no longer any listeners attached for `'error'`, Node.js's `EventEmitter` escalates it to an uncaught exception, instantly crashing the process with exit code 1. - -### Recommendation -**Dispatch Code Change**: Add a persistent `.on("error", ...)` handler to the `client` in `buildConnection` (or refrain from removing it) to gracefully catch post-connection drops, tear down the connection, and transition `state.value = "error"`. - ---- - -## 2. The Bun Native Segfaults (The 6.2 GB "Leak") - -### Finding: Massive Live Working Set, Not a Persistent Leak -The preliminary investigation suspected a 2.5 GB/hour slow leak. The telemetry data confirms that the memory is **not permanently leaked**—when `activeConversations` drops to `0`, the RSS cleanly drops back down to the ~84 MB baseline. The crash is caused by unbounded live working set growth during concurrent agent turns, which fragments and overwhelms Bun's allocator. - -### Technical Analysis -- **The Code Path**: `MAX_STEPS` in `packages/kernel/src/runtime/run-turn.ts` is set to `0` (unlimited). A single turn can run for hundreds of steps. -- **The Mechanism**: In `executeStep`, every step appends new tool calls and results to the `messages` array. 
This array is then passed to `provider.stream()`. -- Inside `packages/openai-stream/src/stream.ts`, the entire unbounded array is serialized into a single contiguous string every step: - ```typescript - const bodyString = JSON.stringify(body); - ``` -- **The Crash**: If 4 concurrent conversations (`activeConversations = 4`) run for hundreds of steps, the `messages` arrays grow to hundreds of megabytes each. Serializing these arrays copies them into massive contiguous strings on the JavaScriptCore heap on *every step*. This causes gigabytes of memory allocation churn, memory pressure spikes (peaking at 6.2 GB), and eventually triggers a native `SIGSEGV`/`SIGILL` in Bun's allocator. - -### Recommendation -**Dispatch Code Change**: -1. Reintroduce a sane `MAX_STEPS` limit (e.g., `50` or `100`) to bound the maximum length of a single turn. -2. Implement a sliding window or context-truncation strategy for `messages` before serializing to prevent the payload from growing without bound. -3. **Operational Mitigation**: Apply the `MemoryMax` cgroup circuit breaker to turn the segfault into a controlled recycle while the codebase fix is developed. - ---- - -## 3. The Cache-Warming Latent Crash - -### Finding: Confirmed Latent Unhandled Promise Rejection -The preliminary finding suspected a latent crash path due to a missing `try/catch` in the cache-warming probe. This is confirmed. - -### Technical Analysis -- **The Code Path**: In `packages/session-orchestrator/src/orchestrator.ts`, the `createWarmService`'s `warm` function consumes the provider stream: - ```typescript - for await (const event of provider.stream(messages, assembled.tools, providerOpts)) { - ``` -- **The Mechanism**: If the AI provider connection fails or aborts, `provider.stream` throws. Because there is no `try/catch` around this loop, the async `warm` function rejects its returned promise. -- In `packages/cache-warming/src/warmer.ts`, the timer fires `void fireWarm(conversationId, token);`.
Since the returned promise is not `await`ed or `.catch()`'d at the top level, it results in an **Unhandled Promise Rejection**. - -### Recommendation -**Dispatch Code Change**: Add a `try/catch` block around the `for await` loop in `createWarmService` (or add `.catch()` to `fireWarm`) to gracefully emit an error result instead of throwing an unhandled rejection. - ---- - -## Assessment of Preliminary Findings - -- ❌ **"Unhandled TLS Socket to AI Provider"**: Incorrect. The exit-1 crash was a race condition in error listener attachment for outbound SSH connections (`ssh2`), not the AI SDK's TLS socket. -- ✅ **"MAX_STEPS = 0 ... structural enabler for large working sets"**: Correct. Unbounded history serialization caused the massive gigabyte allocations that crashed Bun. -- ✅ **"Cache-warming missing try/catch"**: Correct. It is an unhandled promise rejection waiting to happen. -- ✅ **"Not an LSP leak"**: Confirmed. The memory growth is strictly tied to `activeConversations` and the unbounded turn array serialization. diff --git a/frontend-cache-rate-handoff.md b/frontend-cache-rate-handoff.md deleted file mode 100644 index b64a612..0000000 --- a/frontend-cache-rate-handoff.md +++ /dev/null @@ -1,126 +0,0 @@ -# FE handoff — cache hit/miss + percentage (calculation guide) - -> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7 -> the backend does not write the FE repo. This describes ONLY how to compute cache -> hit/miss + percentages from data the backend ALREADY exposes — **no UI design here** -> (the look is specified separately) and **no backend change is required**. -> Contracts: `@dispatch/wire` + `@dispatch/transport-contract` `0.4.0`. - -## TL;DR -The cache hit rate is `cacheReadTokens / inputTokens`. Everything you need is already -on the `usage` + `done` live events and in `GET /conversations/:id/metrics`. There is -**no separate cache endpoint or boolean** — it's derived from token counts, exactly as -the old `CacheRatePanel` did. 
- -## The data shape (`Usage`, from `@dispatch/wire`) -```ts -interface Usage { - inputTokens: number; // TOTAL prompt tokens this step/turn, INCLUDING cached ones - outputTokens: number; - cacheReadTokens?: number; // input tokens served FROM cache (the "hit" count). Optional. - cacheWriteTokens?: number; // cache-creation count. Optional; usually ABSENT (see caveats). -} -``` -Field semantics that matter for the math: -- `inputTokens` is the **whole** prompt, so `cacheReadTokens ≤ inputTokens` and the rate is in `[0,1]`. -- The cache fields are **optional** — treat `undefined` as `0` in all arithmetic. - -## Formulas -```ts -const read = u.cacheReadTokens ?? 0; -const write = u.cacheWriteTokens ?? 0; - -const isHit = read > 0; // hit vs miss -const hitRate = u.inputTokens > 0 ? read / u.inputTokens : 0; // 0..1 (guard /0) -const hitPct = Math.round(hitRate * 100); -const fresh = Math.max(0, u.inputTokens - read - write); // uncached input tokens -``` -(These are byte-identical to the old `CacheRatePanel.svelte` formulas: hit rate = -`cacheReadTokens/inputTokens` clamped; uncached = `max(0, input − read − write)`.) - -## Where to get `Usage` — three granularities, two channels - -| Scope | LIVE (WS `chat.delta` / NDJSON) | REPLAY (`GET /conversations/:id/metrics`) | -|---|---|---| -| **Per step** | `usage` event (`type:"usage"`, carries `stepId`, `usage`) | `TurnMetrics.steps[].usage` (each has `stepId`) | -| **Per turn** (authoritative aggregate) | `done` event (`type:"done"`, carries `usage`, `durationMs`) | `TurnMetrics.usage` | -| **Cumulative** (conversation) | Σ of each turn's `done.usage` | Σ of `turns[].usage` | - -Notes: -- The **per-turn aggregate IS the sum of its steps** (the runtime aggregates). So when - summing a cumulative figure, pick ONE granularity — sum `done.usage`/`TurnMetrics.usage` - per turn, **or** sum all steps — never both (double-count). -- `done.usage` is the authoritative per-turn total. 
(`turn-sealed` does NOT carry usage in - this backend — it's just `{conversationId, turnId}`; the numbers ride the immediately - preceding `done` event.) -- `step-complete` is timing only (ttft/decode) — no tokens; ignore it for cache. - -## Live accumulation + reconcile (recommended pattern) -1. **In-progress turn (optional live counter):** as `usage` events stream, you may sum - `read`/`input` across the turn's steps to show a live-updating hit % for the current turn. -2. **Turn finished:** take that turn's authoritative totals from its `done.usage`. Use it as - the turn's final value (replace any live partial for that turn). -3. **Cumulative (session/conversation):** add each completed turn's `done.usage` to a running - total. Compute the cumulative hit % from the running totals (`ΣcacheRead / Σinput`). -4. **"Last request" rate:** the most recent turn's `done.usage` (or most recent step's `usage` - if you want per-round-trip granularity). - -## Replay / reopening a conversation -On open, `GET /conversations/:id/metrics` → `ConversationMetricsResponse { turns: TurnMetrics[] }`. -Seed the cumulative totals from `Σ turns[].usage`, the "last request" from `turns.at(-1).usage`, -and you can render a per-turn (and per-step, via `steps[]`) breakdown — a superset of what the -old session-cumulative-only panel could show. - -## Caveats (be honest in the UI) -- **`cacheWriteTokens` is usually absent.** The current provider is OpenAI-compatible - (OpenCode Go): it reports a cache **read** count (`cached_tokens`) but **no cache-creation** - count. So the old panel's separate "write" row will be 0/empty. Hit/miss and the read - percentage are unaffected. It would populate only if an Anthropic-native (or - `cache_write`-reporting) provider is added. -- **Optional fields:** any of the cache fields can be `undefined` (provider-dependent). Default - to 0; never assume presence. 
-- **A legitimate 0% is not a bug.** OpenAI-style providers auto-cache (no `cache_control` - breakpoints), and short prompts below the provider's cache threshold simply won't be cached — - `cacheReadTokens: 0` is a real "miss", not missing data. Cache reads grow as a conversation's - resent prefix gets large enough. -- **Provider doesn't report cache at all — distinguish from 0.** Some providers (e.g. - **Umans**) never include `cache_read_tokens` / `cache_write_tokens` in their usage - payload. In that case `cacheReadTokens` is `undefined` — the provider can't tell you - whether cache was hit or missed. This is **different from `cacheReadTokens: 0`**, - which means "cache was checked and there were 0 hits" (a real miss). - - The FE should distinguish these three states: - - | `cacheReadTokens` | Meaning | FE display | - |---|---|---| - | `undefined` | Provider doesn't report cache | Hide cache panel, or show "N/A" | - | `0` | Provider reports cache; this request had 0 hits | Show "0%" (genuine miss) | - | `> 0` | Cache hit | Show percentage | - - ```ts - function cacheDisplay(u: Usage): { kind: "not-reported" } | { kind: "reported"; hitPct: number } { - if (u.cacheReadTokens === undefined) return { kind: "not-reported" }; - const read = u.cacheReadTokens; - const hitRate = u.inputTokens > 0 ? read / u.inputTokens : 0; - return { kind: "reported", hitPct: Math.round(hitRate * 100) }; - } - ``` - - When `kind === "not-reported"`, do NOT show "0%" — that's misleading. Either hide the - cache panel entirely or show "Cache: not reported". This also applies to `cacheWriteTokens` - (if `undefined`, don't show a write row). - -## Worked example (real numbers, captured live against OpenCode Go flash) -| Turn | inputTokens | cacheReadTokens | hit % | -|---|---|---|---| -| 1 | 2669 | 384 | 14% | -| 2 (history resent) | 2737 | 2560 | **94%** | - -Cumulative: read `2944` / input `5406` → **54%**.
These exact values appear both on the live -`done.usage` stream and in `GET /conversations/:id/metrics` (`turns[].usage`). - -## Type references -- `@dispatch/wire`: `Usage`, `TurnUsageEvent` (`usage`), `TurnDoneEvent` (`done`), - `TurnMetrics`, `StepMetrics`. -- `@dispatch/transport-contract`: `ConversationMetricsResponse`, and the WS `chat.delta` - envelope carrying each `AgentEvent`. diff --git a/frontend-cache-warming-handoff.md b/frontend-cache-warming-handoff.md deleted file mode 100644 index cf1f402..0000000 --- a/frontend-cache-warming-handoff.md +++ /dev/null @@ -1,91 +0,0 @@ -# FE handoff — cache warming: cache-rate fix + "expected cache" metric - -> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7 the backend does -> NOT write the FE repo. `lsp references` does not span the two repos. -> Backend commits: `7ffb6b2` (arch-rewrite), `0e9d118` (`../claude/provider-anthropic`). - -## Status — most of the original handoff is DONE (removed) -Per the FE's `backend-handoff.md` (2026-06-11), the frontend has already consumed the bulk of the -earlier version of this doc — those sections are **removed**: -- ✅ `NumberField` (`kind:"number"`) renderer. -- ✅ Conversation-scoped surface subscriptions (focused `conversationId` on subscribe/invoke + - staleness rule; re-scope on conversation switch). -- ✅ The "Cache Warming" sidebar view: enabled toggle, minutes+seconds interval (`cache-warming/ - set-interval`), `cache-warming/toggle`, manual **Warm now** (`POST /chat/warm`), live countdown, - hit-% history. -- ✅ `warmNow()` posting `/chat/warm` with the conversation's model. - -What remains below is the ONE piece the FE has not yet consumed: a cache-rate **correctness fix** and -a new **retention** metric. - -## Cache-rate metric — a correctness fix + the "expected cache" metric (TO CONSUME) -A backend bug made the cache-hit % read **100% on Claude whenever anything was cached** (it inflated). 
-Root cause: Anthropic's `input_tokens` is the *uncached remainder*, with cache read/creation reported -separately — but the wire `Usage.inputTokens` convention (which the flash/OpenAI-compat provider -already follows) is the **TOTAL prompt incl. cached**. Fixed in `../claude/provider-anthropic` -(`inputTokens = input + cacheRead + cacheWrite`). **No FE change needed for the fix itself** — your -existing `cacheRead/inputTokens` math (in `frontend-cache-rate-handoff.md`) now yields the *true* rate -on Claude. (That older handoff's caveat "cacheWriteTokens is usually absent" is **not** true for -Claude — it reports both.) - -Show two distinct cache numbers: -- **Cache rate** = `cacheReadTokens / inputTokens` — *what fraction of THIS turn's prompt came from - cache*. Legitimately **drops when a turn adds a lot of new content** (e.g. pasting a big file: reads - the old prefix back but also writes the new file → rate < 100%). Per-turn efficiency; on every - `usage`/`done` event + persisted metrics. -- **Expected cache (retention)** = *of the cache that existed going into this turn, how much we read - back* — ideally **~100% every turn after the first**. **<100% = the cache busted/expired.** It is a - **cross-turn** derivation (FE-side, from two consecutive turns' usage you already have): - ``` - expectedCache(turn N) = clamp01( cacheRead_N / (cacheRead_{N-1} + cacheWrite_{N-1}) ) - ``` - (denominator = the prior turn's cached prefix = what it read + what it wrote). - -**Worked example (live, Claude haiku), one chat, two real turns:** -| turn | inputTokens (total) | cacheRead | cacheWrite | cache rate `cr/input` | expected cache (cross-turn) | -|---|---|---|---|---|---| -| 1 (fresh) | 5149 | 0 | 5146 | 0% | — | -| 2 (new msg) | 8462 | 5146 | 3313 | **61%** | `5146/(0+5146)` = **100%** | - -So on turn 2 the prompt was 61% cache (the rest was the new message), yet you read back **100%** of -what turn 1 cached — two true, complementary signals. 
(Pre-fix, the rate wrongly showed 100% because -the denominator excluded the 5146 cached tokens.) - -### Warming-specific (already on the wire — small additions) -For the warming feature, the backend now also reports a **single-shot** retention so you don't have to -track cross-turn state there: -- **`WarmResponse.expectedCacheRate`** (new field on `POST /chat/warm`) = - `round(cacheReadTokens / (cacheReadTokens + cacheWriteTokens) * 100)` — ~**100%** when the warm - found the cache still warm, **0%** when it had expired (rewrote everything). This is the **"is - warming working?"** signal — headline this for the Warm-now result rather than `cachePct`. -- The conversation-scoped `cache-warming` surface gained a matching **`stat` "cache retention"** field - (alongside the existing "last cache rate" stat). It's a generic `stat`, so your existing renderer - already shows it — just relabel/position as desired. - -Types: `@dispatch/transport-contract` `WarmResponse` now carries `expectedCacheRate` (additive). - -## CR-3 — DONE (next-warm timestamps + manual-warm resets the timer) -Both asks from `backend-handoff-cache-warming-timer.md` are implemented (commit `bfbad3a`). No -contract bump (uses the `custom` escape hatch, as you suggested). - -**Ask 1 — authoritative timestamps on the `cache-warming` surface.** The conversation-scoped spec now -includes a `custom` field: -```ts -{ kind: "custom", rendererId: "cache-warming-timer", - payload: { nextWarmAt: number | null, lastWarmAt: number | null } } // epoch-ms -``` -- `nextWarmAt` = epoch-ms the next AUTOMATIC warm will fire, or `null` when not scheduled (disabled, - or a turn is generating so the timer is cancelled). Drive your countdown off this directly. -- `lastWarmAt` = epoch-ms of the most recent completed warm, or `null` if none. Use its changes for - the history. (The hit-% for that warm is the `last cache rate` / `cache retention` stats in the - same spec.) 
-- Pushed via the normal surface `update` on every change (warm complete, toggle, interval, turn - start/settle). You can drop the FE-side best-effort countdown anchor. - -**Ask 2 — a manual `POST /chat/warm` now resets the cycle + refreshes the surface.** Implemented via -an inversion (no new endpoint, no change to the `/chat/warm` request/response): the backend's warm -service emits an internal event that the cache-warming extension consumes, so a manual warm now -re-arms the automatic timer (new `nextWarmAt`), updates `lastPct`/`lastWarmAt`, and **pushes a surface -`update`**. So after a "Warm now" click you'll get an authoritative surface `update` — you can drop the -workaround of reading the % from the HTTP response (though the HTTP `WarmResponse` is still returned and -fine to use for immediate feedback). Live-verified against Claude haiku. diff --git a/frontend-cache-warming-lifecycle-handoff.md b/frontend-cache-warming-lifecycle-handoff.md deleted file mode 100644 index 49bee0a..0000000 --- a/frontend-cache-warming-lifecycle-handoff.md +++ /dev/null @@ -1,94 +0,0 @@ -# FE handoff — CR-4 cache-warming lifecycle SHIPPED (+ CR-1 table, CR-2 scope) - -> **Courier doc** (backend → `../frontend`, via the user). Response to your -> `backend-handoff-cache-warming.md` (CR-4) and the open asks CR-1 / CR-2 in -> `backend-handoff.md`. Everything below is live on `bin/up` and verified with a -> headless probe (same flow as your `scripts/probe-cache-warming.ts` — re-run it to -> confirm; default-off means Phase C's toggle-enable branch now executes). -> -> **Contract bumps to re-pin:** `@dispatch/ui-contract` **0.1.0 → 0.2.0**, -> `@dispatch/transport-contract` **0.8.0 → 0.9.0**. `wire` unchanged (0.6.0). - -## CR-4a — warming now defaults OFF ✅ -A new conversation starts `enabled: false`, `nextWarmAt: null` — no warm is scheduled -until the user opts in via the toggle. Interval default is still 240s. 
Bonus fix: -re-enabling restores the conversation's PERSISTED interval (not the 240s default). -One caveat (pre-existing behavior, now fail-safe): opt-in is not yet re-hydrated -across a backend RESTART — after a restart a conversation reads disabled until -toggled again. Flag it if that matters to you and we'll add boot hydration. - -## CR-4b — post-warm updates now carry the FUTURE `nextWarmAt` ✅ -Root cause was notify-before-reschedule in the warmer. Fixed; additionally: -- after every automatic warm, the pushed `cache-warming-timer` payload is - `{ nextWarmAt: <future>, lastWarmAt: <just now> }` (probe: 2 warms @5s, both FUTURE); -- after `turn-sealed` the surface now pushes the fresh post-turn schedule (this was - the "still past after a real chat turn" case in your probe); -- on `turn-start` the surface pushes `nextWarmAt: null` (nothing scheduled while - generating — render as your "waiting…" state); -- if a warm completes with warming since-disabled, the update carries - `nextWarmAt: null`, never a stale past timestamp. -Your countdown can stay authoritative off `nextWarmAt`; the cosmetic past-value guard -should now be dead code. - -## CR-4c — `POST /conversations/:id/close` ✅ (the tab-close affordance) -New endpoint (no request body), `[email protected]`: - -```ts -interface CloseConversationResponse { - conversationId: string; - abortedTurn: boolean; // true iff an in-flight turn existed and was aborted -} -``` - -Semantics — exactly the asymmetry the user wanted: -- **Aborts any in-flight turn.** The kernel stops at the next event boundary; the - partial turn is PERSISTED and the turn SEALS normally — watchers receive - `done` (with `reason: "aborted"`) then `turn-sealed`, so your stream-derived - `generating` flag clears with no special-casing. Live-verified. 
-- **Stops + disables cache-warming** for the conversation (persisted OFF — reopening - the conversation later does not resume warming), and pushes a surface update - (`enabled: false`, `nextWarmAt: null`) to subscribers. -- **Idempotent**: closing an idle/unknown conversation is a 200 with - `abortedTurn: false`. -- Browser/socket disconnect and `chat.unsubscribe` are UNCHANGED — they still never - touch the turn or the warming schedule (your "keep running when the window closes" - half is regression-tested). -Wire this into `store.closeTab()`; `fetch`/`sendBeacon` both fine (CORS already -allows POST). - -## CR-4d — initial `surface` echo ✅ (no backend change was needed) -HEAD already echoes `conversationId` on the initial `surface` reply (shipped in the -per-conversation-scoping commit; unit-tested). We live-probed BOTH stacks today — -:24205 and your :25205 — and the echo is present. Your probe most likely ran against -a `bin/up2` instance booted before that commit (up2 freezes code at boot). Re-run -`bin/up2` and your probe; if you still see a missing echo, send us the raw frame. - -## CR-1 — Loaded Extensions table ✅ -The surface now emits the "Loaded" count stat plus ONE custom field: - -```ts -{ kind: "custom", rendererId: "table", payload: { columns, rows } } -// columns: ["Name", "Version", "Trust", "Activation"] -// rows: one per loaded extension (ALL trust tiers), cell-for-cell aligned -``` - -Typed payload is exported as `TablePayload` (+ `TABLE_RENDERER_ID`) from -`@dispatch/surface-loaded-extensions` if you want to narrow instead of duck-typing. -Note: `Version` cells all read `0.0.0` — manifests are genuinely unversioned today -(the optional data-quality item from your handoff; not done). - -## CR-2 — catalog `scope` flag ✅ (`[email protected]`) -`SurfaceCatalogEntry` gains `scope?: "global" | "conversation"`. Emitted today: -`loaded-extensions` → `"global"`, `cache-warming` → `"conversation"`. 
Treat ABSENT as -conversation-scoped (conservative — your current always-send-conversationId policy -remains correct for both). You can now skip re-subscribing `scope: "global"` surfaces -on conversation switch. - -## Suggested FE follow-ups (from your own queue) -- Re-pin + re-mirror `.dispatch/{ui-contract,transport-contract}.reference.md`. -- Wire `POST /conversations/:id/close` into the tab-close path. -- Extend `probe-cache-warming.ts`: assert default-off, post-warm FUTURE `nextWarmAt`, - and (new) close → `abortedTurn` + `done.reason === "aborted"`. -- The "waiting…" guard for a past `nextWarmAt` can stay as a belt-and-braces guard - but should never trigger now; `nextWarmAt: null` while generating is the real state - to render. diff --git a/frontend-compaction-handoff.md b/frontend-compaction-handoff.md deleted file mode 100644 index 195bc1e..0000000 --- a/frontend-compaction-handoff.md +++ /dev/null @@ -1,167 +0,0 @@ -# FE handoff — conversation compacting - -Courier this to `../frontend`. All changes are ADDITIVE. - -## What shipped (backend) - -Conversation compaction: summarize old history into a summary + recent N, -preserving the full pre-compaction history in a separate archive conversation. -Creates a linked chain of archives you can walk backward. - -Two modes: -- **Manual**: `POST /conversations/:id/compact` — triggers immediately. -- **Automatic**: after each turn settles, the backend checks if the last turn's - input tokens exceeded the per-conversation `compactThreshold` (default 85). - If so, compaction runs automatically (fire-and-forget, non-blocking). - -## How compaction works — non-destructive, chained - -The compacted conversation **keeps its original ID** (so messaging between -agents still works). The old full history is **forked** to a new archive -conversation (new UUID). The archive inherits the source's `compactedFrom`, -creating a chain: - -``` -Compaction 1: A (ID "abc") — full history forked to X (new ID). 
- A's history replaced with [summary + recent N]. - A.compactedFrom = X - -Compaction 2: A (ID "abc") — current history forked to Y (new ID). - A's history replaced with [new summary + recent N]. - A.compactedFrom = Y - Y.compactedFrom = X (inherited from A's pre-compaction state) - -Chain: A → Y → X (walk compactedFrom backward) -``` - -Each archive is an **immutable snapshot** — a complete copy of the conversation -at the time of that compaction. History is never destroyed. - -The FE **does not switch tabs** — the conversation ID doesn't change. Just -reload the history. - -## Bump pinned deps -- `@dispatch/wire` → `0.11.0` -- `@dispatch/transport-contract` → `0.15.0` - -## New types - -```ts -// @dispatch/wire — ConversationMeta now has compactedFrom -export interface ConversationMeta { - readonly id: string; - readonly createdAt: number; - readonly lastActivityAt: number; - readonly title: string; - readonly status: ConversationStatus; // "active" | "idle" | "closed" - /** Points to the archive conversation with full pre-compaction history. */ - readonly compactedFrom?: string; -} - -// @dispatch/wire -export interface CompactionResult { - readonly summary: string; - readonly newConversationId: string; // ID of the archive (old full history) - readonly messagesSummarized: number; - readonly messagesKept: number; -} - -// @dispatch/transport-contract — WS message (server → client) -export interface ConversationCompactedMessage { - readonly type: "conversation.compacted"; - readonly conversationId: string; // the conversation (ID unchanged) - readonly newConversationId: string; // the archive ID (old full history) - readonly messagesSummarized: number; - readonly messagesKept: number; -} -// Added to WsServerMessage union. 
- -// @dispatch/transport-contract — HTTP response types -export interface CompactResponse { - readonly conversationId: string; // the conversation (ID unchanged) - readonly newConversationId: string; // the archive ID (old full history) - readonly messagesSummarized: number; - readonly messagesKept: number; -} - -export interface CompactPercentResponse { - readonly conversationId: string; - readonly percent: number; // 0 = manual only; null = default 85 -} - -export interface SetCompactPercentRequest { - readonly percent: number; -} -``` - -## `POST /conversations/:id/compact` — manual compaction - -Triggers compaction on demand. Optional JSON body: -```json -{ "keepLastN": 10, "modelName": "umans/umans-glm-5.2" } -``` -- `keepLastN` (default 10): how many recent messages to retain. -- `modelName`: override the model used for summarization. - -200 response: `CompactResponse` — includes `newConversationId` (the archive ID). -The conversation ID in the response is the same as the request — the ID doesn't -change. The FE should reload the conversation history. - -409: `{ error: string }` — conversation is generating, too short, percent not exceeded, etc. -503: compaction service not available. - -## `GET /conversations/:id/compact-percent` — read percent - -200: `CompactPercentResponse { conversationId, percent }` -- `percent: 0` — auto-compact explicitly disabled (manual only). -- `percent: null` (not stored) — **default: 85** (85% tokens). The FE - should display 85 as the default value in the settings UI. -- Any positive number — auto-compact triggers when the last turn's input tokens - exceed this value. - -## `PUT /conversations/:id/compact-percent` — set percent - -Body: `SetCompactPercentRequest { percent: number }` -- `0` explicitly disables auto-compact. -- Any positive number sets the trigger percent. -- To "reset to default", set it to 85. - -## `conversation.compacted` WS message - -Broadcast to all connected WS clients when compaction completes. 
The FE should -**reload the conversation history** via `GET /conversations/:id` (the -conversation ID hasn't changed — just reload the same ID). The first message -will now be a system summary. - -No tab switching needed — the ID is the same. - -## What the FE needs to do - -1. **Compact button** in the conversation toolbar → `POST /conversations/:id/compact`. - Show a loading indicator while waiting. On success, reload the conversation - history (same ID — just re-fetch). - -2. **Settings UI** for compact percent: `PUT /conversations/:id/compact-percent` - with `{ percent: number }`. A number input (0 = manual only, default 85). - Read the current value via `GET /conversations/:id/compact-percent`. - -3. **Handle `conversation.compacted` WS messages**: reload the conversation - history via `GET /conversations/:id` (same ID, no tab switch). - -4. **"View predecessor" link**: when `ConversationMeta.compactedFrom` is present, - show a link that opens the archive conversation in a read-only view (or a new - tab). Load it via `GET /conversations/:compactedFrom`. The archive has - `status: "closed"` and title `"Archive: <original>"`. Each archive may also - have its own `compactedFrom` — walk the chain backward to see every snapshot. - -5. **Archives in conversation list**: archives appear in - `GET /conversations?status=closed`. They have `compactedFrom` chaining to - the previous archive (if any). The FE can show them in a history view. - -6. **Visual indicator**: show a badge on conversations that have a - `compactedFrom` (they've been compacted). E.g. "Compacted" badge or chain icon. - -## CLI - -`dispatch compact <conversationId>` — triggers manual compaction. Resolves -short IDs like other commands. The response includes the archive ID. 
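The "View predecessor" chain walk described above can be sketched as a small helper. This is only an illustration under stated assumptions: `MetaSlice`, `MetaLookup`, and `walkArchiveChain` are hypothetical names, and the lookup is injected because this handoff does not say how the FE obtains a single conversation's metadata.

```ts
// Minimal slice of ConversationMeta needed to follow the chain.
interface MetaSlice {
  readonly id: string;
  readonly compactedFrom?: string;
}

// Hypothetical lookup (assumption): in a real FE this would be backed by
// whatever metadata cache or fetch layer the app already has.
type MetaLookup = (id: string) => MetaSlice | undefined;

// Returns the archive IDs reachable from `startId`, newest archive first.
function walkArchiveChain(startId: string, lookup: MetaLookup): string[] {
  const archives: string[] = [];
  const seen: string[] = [startId]; // guard against a malformed, cyclic chain
  let current = lookup(startId);
  let next = current ? current.compactedFrom : undefined;
  while (next !== undefined && seen.indexOf(next) === -1) {
    archives.push(next);
    seen.push(next);
    current = lookup(next);
    next = current ? current.compactedFrom : undefined;
  }
  return archives;
}

// The A -> Y -> X example from the chain diagram, held in memory.
const metas: Record<string, MetaSlice> = {
  abc: { id: "abc", compactedFrom: "Y" },
  Y: { id: "Y", compactedFrom: "X" },
  X: { id: "X" },
};
const chain = walkArchiveChain("abc", (id) => metas[id]);
console.log(chain.join(" -> ")); // prints "Y -> X"
```

The cycle guard is a defensive choice of this sketch, not something the backend is documented to require.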
diff --git a/frontend-context-size-handoff.md b/frontend-context-size-handoff.md deleted file mode 100644 index a774a0c..0000000 --- a/frontend-context-size-handoff.md +++ /dev/null @@ -1,47 +0,0 @@ -# FE handoff — context size (current context-window usage) - -Courier this to `../frontend` (cross-repo contract change; `lsp references` does not -span repos — ORCHESTRATOR §7). Backend commit adds an optional `contextSize` field; no -breaking change. - -## What shipped (backend) - -A new optional field **`contextSize`** (a token count) now flows to the frontend on two -existing carriers. Both are computed identically and are EQUAL for the same turn: - -1. **Live** — `TurnDoneEvent.contextSize?: number` (the `done` AgentEvent, arriving in a - `chat.delta` WS message / the NDJSON stream). -2. **Persisted** — `TurnMetrics.contextSize?: number`, served by - `GET /conversations/:id/metrics` (`ConversationMetricsResponse.turns[].contextSize`). - -Types: `@dispatch/wire` (`0.4.0 → 0.5.0`), re-exported by -`@dispatch/transport-contract` (`0.5.0 → 0.6.0`). Bump the pinned `file:` deps. - -## Definition (read this — it's subtle) - -`contextSize` = **the turn's FINAL step `inputTokens + outputTokens`** — the tokens the -conversation occupies right now. - -It is deliberately **NOT** the aggregate `usage` already on `done` / `TurnMetrics`. -`usage.inputTokens` is the SUM across steps, which **overcounts** a multi-step / tool-calling -turn (each step re-prefills the growing prompt). The final step's input already contains all -prior context, so `finalStep.input + finalStep.output` is the true occupancy. Do not derive -context size from `usage` yourself — read `contextSize`. - -## How to render it - -- **Current value = the LATEST turn's `contextSize`.** The chat's "current context usage" is - whatever the most recent turn reported. -- **Live update:** when a `done` event arrives, if `event.contextSize !== undefined`, set the - displayed context size to it. 
-- **On (re)hydrate:** call `GET /conversations/:id/metrics`, take the LAST element of `turns` - that has a defined `contextSize`, and show its value. (Turns appear only after they seal.) -- **Optionality:** `contextSize` may be `undefined` (provider reported no per-step usage). - Treat absent as "unknown" — render a placeholder, NOT `0`. - -## Not included yet (next step) - -The model's **max context-window limit** is a SEPARATE, later field — so a UI like -`contextSize / limit` (e.g. `34,102 / 200,000`) can't show the denominator yet. For now show -only the current size (e.g. "34,102 tokens in context"). "context size" = current usage; -"context window" = the future limit (see GLOSSARY). diff --git a/frontend-conversation-lifecycle-handoff.md b/frontend-conversation-lifecycle-handoff.md deleted file mode 100644 index ca6de57..0000000 --- a/frontend-conversation-lifecycle-handoff.md +++ /dev/null @@ -1,102 +0,0 @@ -# FE handoff — conversation lifecycle (tab persistence across devices) - -Courier this to `../frontend`. All changes are ADDITIVE — nothing existing breaks. - -## What shipped (backend) - -Conversations now have a lifecycle **status** field: `active`, `idle`, or `closed`. -This enables tab persistence: when a new browser connects, it fetches all -`active` + `idle` conversations and restores the tab bar. - -- **`active`** — an agent is currently generating (a turn is in-flight). -- **`idle`** — conversation exists, not generating. User can send a message to resume. -- **`closed`** — user dismissed the tab (hidden from the tab bar, not deleted). - -Status transitions are driven by the backend: -- `idle → active` when a turn starts. -- `active → idle` when a turn settles (done/error). -- `→ closed` when `POST /conversations/:id/close` is called. 
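The status transitions listed above can be sketched as a pure function, which may be handy for FE tests. This is a hedged model, not the backend's implementation: `LifecycleEvent` and `nextStatus` are hypothetical names, and two behaviors are assumptions of this sketch: a turn start moves any status to `active`, and a settle leaves a `closed` conversation closed.

```ts
type ConversationStatus = "active" | "idle" | "closed";

// Hypothetical event labels (assumption) for the three triggers in the list.
type LifecycleEvent = "turnStarted" | "turnSettled" | "closed";

function nextStatus(
  current: ConversationStatus,
  event: LifecycleEvent,
): ConversationStatus {
  switch (event) {
    case "turnStarted":
      // Assumption: a starting turn makes the conversation active from any status.
      return "active";
    case "turnSettled":
      // Assumption: settling only moves active -> idle; other statuses are kept.
      return current === "active" ? "idle" : current;
    case "closed":
      // POST /conversations/:id/close always ends in closed.
      return "closed";
  }
}

console.log(nextStatus("idle", "turnStarted")); // prints "active"
console.log(nextStatus("active", "turnSettled")); // prints "idle"
```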
- -## Bump pinned deps -- `@dispatch/wire` → `0.10.0` -- `@dispatch/transport-contract` → `0.14.0` - -## New types (`@dispatch/wire` + `@dispatch/transport-contract`) - -```ts -export type ConversationStatus = "active" | "idle" | "closed"; - -// ConversationMeta now has a status field: -export interface ConversationMeta { - readonly id: string; - readonly createdAt: number; - readonly lastActivityAt: number; - readonly title: string; - readonly status: ConversationStatus; -} - -// New WS message (server → client): -export interface ConversationStatusChangedMessage { - readonly type: "conversation.statusChanged"; - readonly conversationId: string; - readonly status: ConversationStatus; -} -``` - -`ConversationStatusChangedMessage` is added to the `WsServerMessage` union. - -## `GET /conversations?status=active,idle` — filter by status - -The existing `GET /conversations` endpoint now accepts an optional `?status=` -query param: a comma-separated list of statuses to filter by. - -- **Default (no param):** returns ALL conversations (all statuses). -- `?status=active,idle` → only active + idle (what the FE tab bar wants). -- `?status=closed` → only closed conversations (for a history view). -- Invalid values are silently dropped. If all values are invalid, no filter - is applied (returns all). - -## `POST /conversations/:id/close` — marks as closed - -The existing close endpoint now also sets the conversation's status to `closed` -in the store. This persists across server restarts. The response is unchanged -(`{ conversationId, abortedTurn }`). - -## `conversation.statusChanged` WS message - -Broadcast to ALL connected WS clients whenever a conversation's status changes. -The backend emits this synchronously alongside the existing `turnStarted` / -`turnSettled` / `conversationClosed` hooks. - -```ts -{ type: "conversation.statusChanged", conversationId: "conv-1", status: "active" } -``` - -## What the FE needs to do - -1. 
**On connect:** call `GET /conversations?status=active,idle` to fetch - conversations for the tab bar. Render tabs for each. - -2. **`active` tabs:** subscribe to the conversation's live stream - (`chat.subscribe` WS op) to receive in-flight events. - -3. **`idle` tabs:** load history via `GET /conversations/:id`. No live - subscription needed until the user sends a message. - -4. **Tab close button:** call `POST /conversations/:id/close` to mark the - conversation as `closed`. Remove it from the tab bar. - -5. **Handle `conversation.statusChanged` WS messages:** update the tab's - status indicator. When a conversation goes `idle → active`, show a - loading/generating indicator. When it goes `active → idle`, remove the - indicator. When it goes `closed`, remove the tab. - -6. **Closed conversations:** accessible from a history view - (`GET /conversations?status=closed`). Can be reopened by sending a message - (which transitions `closed → active`). - -## CLI - -`dispatch list` now defaults to `active,idle` (excludes closed). New flags: -- `--status <active|idle|closed>` — filter by a single status. -- `--all` — include closed (show all statuses). diff --git a/frontend-conversation-list-handoff.md b/frontend-conversation-list-handoff.md deleted file mode 100644 index dd3fd63..0000000 --- a/frontend-conversation-list-handoff.md +++ /dev/null @@ -1,100 +0,0 @@ -# FE handoff — conversation list, title, and open tab - -Courier this to `../frontend`. All changes are ADDITIVE — nothing existing breaks. - -## What shipped (backend) - -Three new features for conversation management: - -1. **Conversation list** — `GET /conversations` returns all known conversations with - metadata (id, title, createdAt, lastActivityAt). The backend auto-tracks metadata - on every message append; title defaults to the first user message (truncated 80 chars). - -2. **Conversation title** — `GET/PUT /conversations/:id/title` lets the FE read and - set a human-readable title for any conversation. 
- -3. **Open tab signal** — `POST /conversations/:id/open` broadcasts a `conversation.open` - WS message to all connected clients (e.g. when the CLI uses `--open`). See also - `frontend-conversation-open-handoff.md` for the WS message details. - -No version bumps needed — all types are already in `@dispatch/transport-contract` `0.13.0` -and `@dispatch/wire` `0.9.0`. - -## `GET /conversations` — conversation list - -Returns all conversations sorted by `lastActivityAt` descending (most recent first). - -- Optional `?q=<prefix>` query param filters by conversation ID prefix (short-ID - resolution — used by the CLI; the FE can ignore it or use it for search). -- 200 response: `ConversationListResponse` - -```ts -interface ConversationListResponse { - readonly conversations: readonly ConversationMeta[]; -} - -interface ConversationMeta { - readonly id: string; - readonly createdAt: number; // epoch-ms - readonly lastActivityAt: number; // epoch-ms - readonly title: string; -} -``` - -**FE use case:** render a conversation sidebar/picker showing title + relative time. -Click a conversation to open it (load its history via the existing `GET /conversations/:id`). - -## `GET /conversations/:id/title` — read title - -- 200 response: `TitleResponse { conversationId, title }` - -## `PUT /conversations/:id/title` — set title - -- Body: `SetTitleRequest { title: string }` (non-empty after trim, else 400) -- 200 response: `TitleResponse { conversationId, title }` - -**FE use case:** let the user rename a conversation. The title is also auto-set from -the first user message, so a newly created conversation already has a title. - -## `GET /conversations/:id/last` — blocking last message - -Blocks server-side until any in-flight turn settles, then returns the last AI text -message. Mainly for CLI use (`dispatch read <id>`), but the FE could use it for -notifications or previews. - -- 200 response: `LastMessageResponse { conversationId, content, turnId? 
}` -- `content` is an empty string if the conversation has no assistant message. -- Unknown conversation → `content: ""` (200, not an error). - -## `POST /conversations/:id/open` — signal frontend - -Asks the backend to broadcast `conversation.open` to all connected WS clients. See -`frontend-conversation-open-handoff.md` for the WS message format and FE handling. - -- 200 response: `OpenConversationResponse { conversationId }` - -## What the FE needs to do - -1. **Bump pinned deps:** `@dispatch/wire` → `0.9.0`, `@dispatch/transport-contract` - → `0.13.0`. - -2. **Conversation sidebar/picker:** call `GET /conversations` on load (and periodically - or on focus) to show a list of conversations. Each entry shows `title` + relative - time from `lastActivityAt`. Click to open → load history via `GET /conversations/:id`. - -3. **Title editing:** add an inline edit affordance on the conversation title. - `PUT /conversations/:id/title` with `{ title }` to update. - -4. **Handle `conversation.open` WS message:** when a `"conversation.open"` message - arrives, open (or focus) a tab for that `conversationId`. See - `frontend-conversation-open-handoff.md`. - -## Notes - -- Conversations are **in-memory only** on the backend — the list resets on server - restart. New conversations appear as users chat; old ones may disappear after a - restart. -- The title is auto-set from the first user message (truncated to 80 chars). Users can - override it via `PUT /conversations/:id/title`. -- `createdAt` is set on the first message append; `lastActivityAt` updates on every - append. diff --git a/frontend-conversation-open-handoff.md b/frontend-conversation-open-handoff.md deleted file mode 100644 index c4064ea..0000000 --- a/frontend-conversation-open-handoff.md +++ /dev/null @@ -1,53 +0,0 @@ -# FE handoff — conversation.open (CLI --open flag) - -Courier this to `../frontend`. All changes are ADDITIVE. - -## What shipped (backend) - -The CLI's `--open` flag (`dispatch send <id> --text "..." 
--open`) calls -`POST /conversations/:id/open`, which emits a `conversationOpened` bus event. The -transport-ws extension subscribes and broadcasts a new WS message to ALL connected -frontend clients. - -## New WS message (additive to `WsServerMessage`) - -```ts -interface ConversationOpenMessage { - readonly type: "conversation.open"; - readonly conversationId: string; -} -``` - -The `type` is `"conversation.open"` — add it to the FE's `WsServerMessage` union -handler. It arrives as a top-level WS message (not inside `chat.delta`). - -## What the FE needs to do - -1. **Handle the WS message** — when a `"conversation.open"` message arrives, open - (or focus) a tab for `conversationId`. The backend just signals; the FE decides - whether to actually open/focus or just notify. - -2. **Suggested behavior:** - - If the conversation is already open in a tab, focus that tab. - - If not, open a new tab for it (load its history via `GET /conversations/:id`). - - Do NOT auto-focus if the user is actively typing in another tab (optional — - your discretion). - -3. **No version bumps needed** — `@dispatch/transport-contract` already exports - `ConversationOpenMessage` (additive to `WsServerMessage` since `0.13.0`). The FE - just needs to add the `type: "conversation.open"` case to its WS message handler. - -## No other integration points - -- No new HTTP endpoints for the FE (the CLI calls `POST /conversations/:id/open`). -- No new surface types. -- No new `AgentEvent` types. -- The message is a global broadcast (sent to ALL connected clients), not per- - conversation. - -## Notes - -- The `--open` flag can be combined with `--queue` (enqueue + signal) or used - without `--queue` (blocking send + signal). -- Multiple clients (e.g. phone + laptop) all receive the broadcast — each decides - independently whether to open/focus. 
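The suggested behavior above can be sketched as a reducer over tab state. Everything except `ConversationOpenMessage` is an assumption for illustration: `TabState`, `applyConversationOpen`, and the `userIsTyping` flag are hypothetical, and the FE remains free to only notify instead of focusing.

```ts
interface ConversationOpenMessage {
  readonly type: "conversation.open";
  readonly conversationId: string;
}

// Hypothetical FE tab state (assumption): open tab IDs plus the focused one.
interface TabState {
  readonly tabs: readonly string[];
  readonly focused: string | null;
}

function applyConversationOpen(
  state: TabState,
  msg: ConversationOpenMessage,
  userIsTyping: boolean = false,
): TabState {
  const alreadyOpen = state.tabs.indexOf(msg.conversationId) !== -1;
  // Open a new tab only when the conversation is not open yet.
  const tabs = alreadyOpen ? state.tabs : state.tabs.concat(msg.conversationId);
  // Optional courtesy: do not steal focus while the user types elsewhere.
  const keepFocus = userIsTyping && state.focused !== null;
  const focused = keepFocus ? state.focused : msg.conversationId;
  return { tabs, focused };
}

const before: TabState = { tabs: ["a"], focused: "a" };
const opened = applyConversationOpen(before, {
  type: "conversation.open",
  conversationId: "b",
});
console.log(opened.tabs.join(","), opened.focused); // prints "a,b b"
```

Keeping the handler pure makes the focus policy easy to unit-test separately from the WS plumbing.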
diff --git a/frontend-cr3-user-message-handoff.md b/frontend-cr3-user-message-handoff.md deleted file mode 100644 index 0fb20e5..0000000 --- a/frontend-cr3-user-message-handoff.md +++ /dev/null @@ -1,54 +0,0 @@ -# FE handoff — CR-3 fixed: user prompt is now on the turn's event stream - -Courier to `../frontend`. This resolves CR-3 from `backend-handoff.md` ("a watcher can't see -the turn's USER prompt until seal"). **Option B implemented + live-verified.** Your staged-but-inert -consumption can now be turned on. - -## What shipped (backend) - -A new **additive** `AgentEvent` variant carries the user prompt INTO the turn's outward stream: - -```ts -// @dispatch/wire — added to the AgentEvent union -interface TurnInputEvent { - type: "user-message"; - conversationId: string; - turnId: string; - text: string; // the raw prompt, exactly as sent -} -``` - -`session-orchestrator` emits it via the broadcast/buffer path as the **FIRST event of every turn** -(before `turn-start`), so it is replayed to every subscriber — live AND late-join — and arrives on -the HTTP/NDJSON path too. Persistence is unchanged (the user message is still appended atomically at -seal); this only adds a buffered/broadcast event. Metrics are unaffected (it is not usage). - -## Version bumps (re-pin both) - -- `@dispatch/wire` **`0.5.0 → 0.6.0`** (additive union member). -- `@dispatch/transport-contract` **`0.7.0 → 0.8.0`** (re-exports `AgentEvent`/`chat.delta`, which now - carries `user-message`; no other transport-contract change). - -Re-mirror `.dispatch/{wire,transport-contract}.reference.md` and add `user-message` to the FE -exhaustiveness guard. - -## FE action - -Flip on the already-staged `core/chunks` branch that folds a `user-message` event into a provisional -user chunk for watchers, with your text dedup against the sender's optimistic echo. 
After re-pin: -- a **pure watcher** (second device / `chat.subscribe` only) now shows the user bubble the moment the - turn starts, not at seal; -- the **sender** is unchanged (its optimistic echo dedups against the replayed `user-message`); -- a **late-joiner** gets `user-message` first in the replay, then the rest of the in-flight turn. - -## Live-verified (backend, vs flash) - -Two WS clients on one conversation; client B subscribed but never sent. On A's `chat.send`, B received -`chat.delta { event:{ type:"user-message", text:"…", turnId, conversationId } }` as its **first** delta -(index 0), **before** `turn-sealed`, with `text` equal to A's prompt, then the streaming reply. `RESULT: OK`. - -## Note - -The ordering guarantee is: `user-message` is the first event of the turn, immediately followed by -`turn-start`, then the usual deltas → `done` → `turn-sealed`. Treat `user-message` as turn-scoped -(it carries `turnId`) so a multi-turn transcript attributes each prompt to its turn. diff --git a/frontend-cwd-resolution-handoff.md b/frontend-cwd-resolution-handoff.md deleted file mode 100644 index 80d5105..0000000 --- a/frontend-cwd-resolution-handoff.md +++ /dev/null @@ -1,95 +0,0 @@ -# Backend handoff — cwd resolution fixes (backend → FE) — courier doc - -> **From:** arch-rewrite orchestrator · **To:** frontend orchestrator (b18a) · **Courier:** the user. -> Response to the cwd bug report you sent to backend agent ab13. The fixes are DONE and -> live-verified on the dev stack. - -## Version bumps - -| Package | From | To | Notes | -|---|---|---|---| -| `@dispatch/wire` | — | — | **Unchanged** | -| `@dispatch/transport-contract` | — | — | **Unchanged** | -| `@dispatch/ui-contract` | — | — | **Unchanged** | - -**This is a behavior-only change.** No wire/transport-contract types changed. No FE re-pin or -re-mirror needed. The FE needs NO contract change to benefit. - ---- - -## 1. 
The fix (what was broken → what now works) - -You reported: a workspace `defaultCwd` set, a conversation with no explicit cwd, and `pwd` ran in -the server default (`process.cwd()`) instead of the workspace `defaultCwd`. Plus your desired -behavior: a per-conversation cwd **relative to the workspace `defaultCwd`** unless absolute. - -**Root cause (backend-only):** the workspace-relative resolution lived in -`conversation-store.getEffectiveCwd`, which only resolved the *persisted* cwd. But the FE sends the -CwdField value as a **per-turn `cwd` on `chat.send`**, and `session-orchestrator` used a per-turn -`cwd` **as-is** — bypassing `getEffectiveCwd` entirely. So a relative `cwd` like `"arch-rewrite"` -reached `run_shell` raw → resolved against `process.cwd()` → a nonexistent path → `pwd` broke. - -**Three backend fixes (all live-verified):** - -1. **Per-turn `cwd` is now resolved.** `session-orchestrator` passes the per-turn `cwd` (on - `chat.send`/`POST /chat` AND on manual `POST /chat/warm`) through `getEffectiveCwd` as an - override, so it goes through the same workspace-relative algorithm as the persisted cwd. -2. **New-conversation timing.** A brand-new conversation's first turn previously ran - `getEffectiveCwd` *before* the workspace was assigned (so it saw `"default"`, not the request's - workspace). Now the workspace is assigned first. A relative per-turn `cwd` on the FIRST message - of a new conversation now resolves against the intended workspace. -3. **`DELETE /conversations/:id/cwd` was a stub** (returned `{cwd:null}` but did NOT clear the - persisted key). It now calls `clearCwd` and truly deletes the persisted cwd. - -## 2. The resolution algorithm (now applied to BOTH persisted and per-turn cwd) - -``` -workspaceId = persisted conversation workspaceId ("default" fallback) -workspaceCwd = workspace.defaultCwd ?? 
null -conversationCwd = the explicit cwd (persisted via GET /cwd, OR the per-turn chat.send cwd) - -if (conversationCwd == null) → workspaceCwd ?? serverDefaultCwd // process.cwd() -else if (conversationCwd absolute) → conversationCwd // starts with "/" -else → path.resolve(workspaceCwd ?? serverDefaultCwd, conversationCwd) -``` - -`serverDefaultCwd` = `process.cwd()` (the server's cwd). - -## 3. FE impact (minimal — no contract change) - -You do NOT need to change anything. Both FE patterns now work correctly: - -- **If you omit `cwd` on `chat.send`** (your current code): the backend resolves the persisted - conversation cwd (set via `PUT /conversations/:id/cwd`) through the algorithm. ✅ -- **If you send a relative `cwd` on `chat.send`**: it is resolved against the workspace - `defaultCwd`. ✅ (was broken — used raw) -- **If you send an absolute `cwd`** (starts `/`): overrides outright. ✅ - -### Endpoints (semantics — shapes unchanged) - -- `GET /conversations/:id/cwd` → **unchanged**: the RAW explicit conversation cwd (`null` = - inheriting workspace default). Your CwdField shows what the user typed. -- `GET /conversations/:id/lsp` → returns the **effective** (resolved) cwd. It now roots LSP at the - effective cwd INCLUDING the server-default fallthrough (when neither conversation nor workspace - cwd is set, LSP roots at `process.cwd()`). Previously returned `cwd: null` + empty `servers` when - no cwd was set. -- `DELETE /conversations/:id/cwd` → **now actually clears** the persisted cwd (was a no-op stub). - Returns `{ conversationId, cwd: null }` (unchanged shape). Use this to reset a conversation's cwd - to "inherit workspace default". -- `PUT /conversations/:id/cwd` → unchanged (persists the raw value). - -## 4. Optional FE simplification (not required) - -You MAY now safely **omit `cwd` on `chat.send`** entirely and rely on the backend resolving the -persisted conversation cwd (set via `PUT /conversations/:id/cwd`). 
This was the design you -described in the original report. Either path (send cwd, or omit it) is correct; the backend -resolves both consistently. Sending it is harmless; omitting it avoids sending redundant data. - -## 5. Live-verified (dev stack, workspace `test` defaultCwd `/home/tradam/projects/dispatch`) - -- Existing conversation, per-turn `cwd:"arch-rewrite"` → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅ -- Brand-new conversation, per-turn `cwd:"arch-rewrite"` → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅ -- Chat omitting `cwd` (persisted cwd `arch-rewrite`) → `pwd` = `/home/tradam/projects/dispatch/arch-rewrite` ✅ -- `PUT /tmp/test` → GET `/tmp/test` → DELETE → GET `null` (actually cleared) ✅ - -`tsc -b` EXIT 0, biome clean, 1311 vitest pass. diff --git a/frontend-history-windowing-handoff.md b/frontend-history-windowing-handoff.md deleted file mode 100644 index 6792c38..0000000 --- a/frontend-history-windowing-handoff.md +++ /dev/null @@ -1,70 +0,0 @@ -# Backend → frontend handoff — CR-5: history windowing (`limit` / `beforeSeq`) - -> **From:** arch-rewrite · **To:** frontend · **Courier:** the user. -> Reply to `backend-handoff-chat-limit.md` (CR-5). 2026-06-12. SHIPPED. - -## What shipped - -`GET /conversations/:id` now takes two OPTIONAL query params on top of -`sinceSeq` (all combinable; authoritative spec = the -`ConversationHistoryResponse` JSDoc in `@dispatch/transport-contract`): - -1. **`limit=<k>`** — returns only the NEWEST `k` chunks of the selection, - still ASCENDING by seq. A selection with ≤ `k` chunks is returned whole - (your `limit=192` against a short conversation gets the normal full - response, exact). `limit` absent → exactly the previous behavior. -2. **`beforeSeq=<s>`** — restricts the selection to `seq < s` (exclusive). - Combined semantics: `sinceSeq < seq < beforeSeq`; with `limit`: the newest - `k` chunks below `s`, ascending — your "Show earlier messages" page-in path. 
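The selection rules for `sinceSeq`, `beforeSeq`, and `limit` can be modeled in a few lines, for example to unit-test an FE cache against the same semantics. This is a sketch of the documented behavior, not the backend's code; `Chunk`, `WindowParams`, and `selectWindow` are hypothetical names.

```ts
interface Chunk {
  readonly seq: number;
}

interface WindowParams {
  sinceSeq?: number; // exclusive lower bound; treated as 0 when absent
  beforeSeq?: number; // exclusive upper bound
  limit?: number; // keep only the newest `limit` chunks of the selection
}

function selectWindow<T extends Chunk>(chunks: readonly T[], params: WindowParams): T[] {
  const since = params.sinceSeq === undefined ? 0 : params.sinceSeq;
  const before = params.beforeSeq;
  const limit = params.limit;
  // Selection: sinceSeq < seq < beforeSeq, ascending by seq.
  const selection = chunks
    .filter((c) => c.seq > since && (before === undefined || c.seq < before))
    .sort((a, b) => a.seq - b.seq);
  // A selection with at most `limit` chunks is returned whole.
  if (limit === undefined || selection.length <= limit) return selection;
  return selection.slice(selection.length - limit);
}

const all: Chunk[] = [];
for (let seq = 1; seq <= 10; seq++) all.push({ seq });

const seqs = (xs: Chunk[]): string => xs.map((c) => c.seq).join(",");
console.log(seqs(selectWindow(all, { sinceSeq: 0, limit: 3 }))); // prints "8,9,10"
console.log(seqs(selectWindow(all, { beforeSeq: 8, limit: 3 }))); // prints "5,6,7"
```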
- -Your three flows, verbatim from your handoff, all work as written: - -- Fresh load: `?sinceSeq=0&limit=192` -- Tail sync: `?sinceSeq=<maxCachedSeq>` (unchanged) -- Page older in: `?beforeSeq=<oldestLoadedSeq>&limit=<ceil(L/4)>` - -## Ask #3 — our pick: the seq invariant, no new field (your "cheapest option") - -We CONFIRM IN WRITING, as a contractual guarantee (now codified in the -`StoredChunk` doc in `@dispatch/wire` and referenced from the history-response -doc): **per-conversation `seq` is 1-based, monotonic, and gap-free** — a -conversation's first chunk is always `seq === 1` and numbering never skips. - -So derive it exactly as you proposed: `hasOlder = oldestLoaded.seq > 1`. -There is deliberately NO `earliestSeq`/`hasOlder` response field. - -## Validation (new, only for the new params) - -`limit` and `beforeSeq` must be **positive integers** when present -(`sinceSeq` keeps its existing semantics — `0` = from the start). Malformed, -zero, or negative values → **HTTP 400 `{ error }`** (the error message names -the offending param). Don't send `beforeSeq=0` — and you never need to: -`oldestLoaded.seq === 1` already means there is nothing older. - -## `latestSeq` caveat (important for your cursor logic) - -`latestSeq` semantics are UNCHANGED (seq of the last returned chunk; the -requested `sinceSeq` when the slice is empty) — but on a **windowed** read it -describes the returned window, NOT the conversation's high-water mark: - -- A fresh `?sinceSeq=0&limit=192` load DID reach the true tail → `latestSeq` - is a valid sync cursor. -- A `?beforeSeq=...` backfill page did NOT → do not regress your tail cursor - from a backfill response. (Your seq-keyed dedup cache makes this natural — - just don't feed backfill `latestSeq` into the tail cursor.) - -## Versions (re-pin + re-mirror) - -- `@dispatch/transport-contract` **`0.9.0 → 0.10.0`** — the param/validation/ - caveat docs above (response TYPE shape unchanged; no new fields). 
-- `@dispatch/wire` **`0.6.0 → 0.6.1`** — doc-only: the 1-based seq guarantee - codified on `StoredChunk`. - -## Test coverage (backend, for your confidence) - -- conversation-store: +8 windowing tests (newest-N ascending, bounds, - combined bounds, page-in, empty selection, garbage-in, no-window regression - guard; the "gap-free 1-based seq" test now backs a written contract). -- transport-http: +20 route/param tests incl. all five 400 validation cases - and a no-params byte-identical regression guard. -- Full suite: typecheck clean · biome clean · 935 vitest + 112 bun tests green. diff --git a/frontend-lsp-cwd-handoff.md b/frontend-lsp-cwd-handoff.md deleted file mode 100644 index e7c8417..0000000 --- a/frontend-lsp-cwd-handoff.md +++ /dev/null @@ -1,133 +0,0 @@ -# Frontend handoff — LSP status + per-conversation CWD - -> Backend milestone complete (this repo). The web frontend is a SEPARATE repo -> (`../frontend`); this document is couriered to it by the user (ORCHESTRATOR -> §7 — `lsp references` does not span repos). All types below are exported from -> `@dispatch/transport-contract` (bumped to **0.5.0**). - -## TL;DR for the FE -Two new capabilities are now on the backend: -1. **Per-conversation working directory (cwd)** — get/set per tab (a tab = a - `conversationId`). Persisted server-side; defaults a turn's cwd when `/chat` - omits one. -2. **Per-conversation LSP status** — which language servers are configured for a - tab's cwd and whether each is connected. - -## Endpoints - -### `GET /conversations/:id/cwd` → `CwdResponse` -```ts -interface CwdResponse { conversationId: string; cwd: string | null } -``` -`cwd` is `null` until set. - -### `PUT /conversations/:id/cwd` (body `SetCwdRequest`) → `CwdResponse` -```ts -interface SetCwdRequest { cwd: string } -``` -- `200` with the new `CwdResponse` on success. -- `400` `{ error }` if `cwd` is missing/empty. -- Content-Type `application/json`. CORS now allows `PUT`. 
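
A minimal sketch of how a client might prepare the `PUT` request, under stated assumptions: `prepareSetCwd` and `PreparedRequest` are hypothetical names for this example, and the base URL is whatever your client already uses. The empty-string check mirrors the documented `400` case so the request is never sent in that state.

```ts
interface SetCwdRequest {
  cwd: string;
}

// Hypothetical plain-data description of the request; pass it to your HTTP client.
interface PreparedRequest {
  readonly url: string;
  readonly method: "PUT";
  readonly headers: Record<string, string>;
  readonly body: string;
}

function prepareSetCwd(baseUrl: string, conversationId: string, cwd: string): PreparedRequest {
  // The backend answers 400 { error } for a missing/empty cwd; fail early instead.
  if (cwd.length === 0) throw new Error("cwd must be a non-empty string");
  const body: SetCwdRequest = { cwd };
  return {
    url: `${baseUrl}/conversations/${encodeURIComponent(conversationId)}/cwd`,
    method: "PUT",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  };
}

const prepared = prepareSetCwd("http://localhost:24203", "c1", "/home/user/project");
```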
- -### `GET /conversations/:id/lsp` → `LspStatusResponse` -```ts -type LspServerState = "connected" | "starting" | "error" | "not-started"; -interface LspServerInfo { - id: string; // "typescript", "luau-lsp" - name: string; // display name - root: string; // absolute workspace root the server is rooted at - extensions: string[]; // e.g. [".ts",".tsx"] or [".luau"] - state: LspServerState; - error?: string; // present only when state === "error" -} -interface LspStatusResponse { - conversationId: string; - cwd: string | null; // the tab's persisted cwd - servers: LspServerInfo[]; // [] when cwd is null -} -``` - -## Behavior notes (important for UX) -- **`GET /conversations/:id/lsp` lazily connects.** The first call for a cwd - resolves the configured servers and **spawns + initializes** them, so it can take - a moment (typically <1s; a cold luau-lsp loading Roblox types can take longer) and - returns once each server reaches `connected`/`error`. Subsequent calls are fast - (cached). Suggested UX: call it when a tab opens / cwd changes, show a spinner per - server until `state` settles, then a connected/error badge. -- **`servers` is empty when `cwd` is null** — prompt the user to set a cwd first. -- **States:** `connected` = ready; `error` = failed to start (`error` has the - reason, e.g. binary not found); `not-started`/`starting` = transient. -- **cwd defaulting:** if a `/chat` (or `/chat/warm`) request omits `cwd`, the - backend now uses the conversation's persisted cwd. If a request DOES send `cwd`, - that value is used AND persisted (so the CLI `--cwd` keeps the stored value - fresh). The FE's PUT and the chat `cwd` field write the same per-conversation - store. - -## How servers are configured (so you can explain it to users) -Per the tab's cwd, the backend resolves language servers from, in order: -1. `<cwd>/.dispatch/lsp.json` (`{ servers: { <id>: { command, extensions, - rootMarkers?, env?, initialization?, watch? } } }`) -2. 
fallback `<cwd>/opencode.json` `lsp` key (opencode-compatible) -3. a built-in `typescript` server (so a TS project works with zero config). -No FE work needed for this — just display `LspStatusResponse`. - -## Operational note (surface to users on `state:"error"`) -Language-server binaries must be on the **backend process's PATH**. A binary in a -non-standard location (e.g. `~/.local/bin/typescript-language-server`) won't be -found if the server daemon's PATH lacks that dir, yielding -`state:"error", error:"ENOENT ... posix_spawn '<bin>'"`. luau-lsp -(`/usr/local/bin`) and standard-PATH binaries work out of the box. Consider showing -the `error` text directly so users can diagnose a missing/unfound binary. - -## Verified live -- Roblox project (`luau-lsp`) → `connected` through the full HTTP path - (`GET /conversations/:id/lsp`), using the project's existing `opencode.json` + - an auto-spawned `rojo sourcemap --watch` sidecar. -- This repo (`typescript`) → `connected`. -- cwd PUT/GET round-trip → `200` + correct value. - -## Not in this slice (potential future FE asks) -- A live WS surface for LSP status (currently HTTP-poll on tab open / cwd change). -- An LSP-diagnostics stream pushed into the chat (the agent can pull diagnostics - via the `lsp` tool today; auto-inject-on-write was deliberately deferred). - ---- - -## CONFIRMED — answers to `backend-handoff-cwd-lsp.md` (your 6 asks) - -> Re your courier doc. All six hold in the current backend. Code refs are -> `packages/transport-http/src/app.ts` and `packages/session-orchestrator/src/ -> orchestrator.ts`. None require a backend change. 
**The draft → first-message cwd -> path you built is fully supported.** - -| # | Your ask | Confirmed | Where | -|---|----------|-----------|-------| -| 1 | Unseen id: `GET /cwd` ⇒ `200 {cwd:null}`; `GET /lsp` ⇒ `200 {cwd:null,servers:[]}` (no 404/500) | ✅ | `getCwd` returns `null` for any id; `/lsp` early-returns `{cwd:null,servers:[]}` before touching the LSP — `app.ts:322-333, 364-372` | -| 2 | `PUT /cwd` on an unseen/draft id persists (no prior turn/row) | ✅ | `setCwd` is a plain per-id upsert (key `conv:<id>:cwd`) — `app.ts:335-362` | -| 3 | Draft cwd carries into turn 1 (`PUT D/cwd`, then `chat.send` D with no `cwd`) | ✅ | orchestrator uses the persisted cwd when the request omits it; same store key the PUT writes — `orchestrator.ts:122-125`. Unit-tested ("uses the persisted cwd when the request omits cwd") | -| 4 | CORS **preflight** (`OPTIONS` + `Access-Control-Request-Method: PUT`) is answered | ✅ | global Hono `cors`, `allowMethods:["GET","POST","PUT","OPTIONS"]` applied to all routes — `app.ts:112-114`; preflight test passes | -| 5 | No spawn when `cwd` is null | ✅ | `/lsp` returns `servers:[]` before calling the LSP service when `cwd===null` — `app.ts:367-372` | -| 6 | Error body is `{ error: string }` | ✅ | every error path returns `{error}` (e.g. empty-cwd PUT ⇒ `400 {error:"Field 'cwd' is required and must be a non-empty string"}`) — `app.ts:342,346,350,360,376,400` | - -### Setting the cwd on the first message — two supported flows -- **(a) Pre-set, then send (your flow):** `PUT /conversations/D/cwd {cwd}` on the - client-minted draft id → then `POST /chat {conversationId:D}` **without** a `cwd` - field → the turn loads and runs in the persisted `D` cwd. -- **(b) cwd on the first `/chat`:** include `cwd` in the first `POST /chat` → it is - used for that turn **and** persisted for subsequent turns. 
-Both write/read the same per-conversation store, so they're interchangeable; a draft -that has never sent a message works because the cwd store is independent of history. - -### One edge to be aware of (FE currently safe) -`PUT /cwd` rejects an empty-string `cwd` (`400`), but the **`/chat` `cwd` field** -does not — the orchestrator treats any non-`undefined` `cwd` as "provided", so a -literal `cwd:""` on `/chat` would override the persisted cwd with empty. Your FE -omits the field (sends `undefined`) on cwd-less sends, so this never triggers. **Keep -omitting the field (don't send `cwd:""` / `cwd:null`)** when you want the persisted -draft cwd to apply. (If preferred, the backend can harden this to treat empty/blank -as "not provided" — say the word.) - -### Live-verified -Unseen-id `GET /cwd` ⇒ `{cwd:null}`, `GET /lsp` ⇒ `{cwd:null,servers:[]}`, -`PUT` round-trip `200`, and the empty-cwd `400 {error}` shape were all observed live; -Roblox `luau-lsp` and this repo's `typescript` both reach `state:"connected"`. diff --git a/frontend-lsp-cwd-workspace-handoff.md b/frontend-lsp-cwd-workspace-handoff.md deleted file mode 100644 index d17f0a5..0000000 --- a/frontend-lsp-cwd-workspace-handoff.md +++ /dev/null @@ -1,75 +0,0 @@ -# FE Courier Handoff: LSP cwd resolution fix + PUT cwd workspaceId - -> Backend→FE courier. The user couriers this to `../frontend` (FE agent `ffe3`). -> No `@dispatch/wire` or `@dispatch/transport-contract` version bump is breaking — -> the `SetCwdRequest.workspaceId` is additive (optional); the `LspStatusResponse.cwd` -> semantics changed (was always non-null effective cwd; now null when no cwd set). - -## What changed (backend) - -### 1. `GET /conversations/:id/lsp` — behavior change - -**Before:** The endpoint called `getEffectiveCwd(conversationId)` directly. 
When no -cwd was persisted, this fell through to the server default (`process.cwd()`) — so the -LSP connected on the wrong directory (the server's cwd, not the conversation's -workspace). - -**After:** The endpoint now gates on the **persisted** cwd (`getCwd`) first: -- When no cwd is persisted → response is `{ cwd: null, servers: [] }` (HTTP 200, no - LSP connection). The LSP does NOT connect when no working directory is set. -- When a cwd IS persisted → the endpoint resolves the **effective** cwd (relative cwd - resolved against the workspace `defaultCwd`; absolute → as-is) and returns - `{ cwd: "<effectiveCwd>", servers: [...] }`. - -**FE impact:** -- `LspStatusResponse.cwd` can now be `null` (previously it was always a string, even - when no cwd was set — it returned `process.cwd()`). The FE should handle `null` by - showing "no LSP connected" or "set a working directory." -- When `cwd` is non-null, it is the RESOLVED (effective) cwd — an absolute path. The FE - can display this as the directory the LSP is connected on. - -### 2. `PUT /conversations/:id/cwd` — new optional `workspaceId` field - -**Before:** The `PUT /conversations/:id/cwd` body was `{ cwd: string }` — only set -the persisted cwd, no workspace assignment. - -**After:** The body now accepts an optional `workspaceId`: -```json -{ "cwd": "/home/user/project", "workspaceId": "my-team" } -``` - -When `workspaceId` is provided: -1. The conversation is assigned to that workspace (via `ensureWorkspace` + - `setWorkspaceId`) BEFORE the cwd is persisted. -2. This ensures a subsequent `GET /conversations/:id/lsp` resolves a relative cwd - against the workspace's `defaultCwd` (not the server default). -3. Invalid `workspaceId` (not a valid slug: lowercase `[a-z0-9-]`, 1–40 chars) → - HTTP 400 `{ error: "Invalid workspaceId" }`. - -When `workspaceId` is absent → behavior is unchanged (just `setCwd`). 
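
A minimal client-side sketch of the body rules above. The regular expression only encodes the stated slug rule (lowercase `[a-z0-9-]`, 1–40 chars); the backend's check may be stricter, and `isValidWorkspaceId` / `buildSetCwdBody` are hypothetical helper names.

```ts
interface SetCwdRequest {
  cwd: string;
  workspaceId?: string;
}

// Assumption: this mirrors the documented slug rule, not the backend's exact validator.
const WORKSPACE_SLUG = /^[a-z0-9-]{1,40}$/;

function isValidWorkspaceId(id: string): boolean {
  return WORKSPACE_SLUG.test(id);
}

function buildSetCwdBody(cwd: string, workspaceId?: string): SetCwdRequest {
  // Absent workspaceId: omit the field entirely so the unchanged behavior applies.
  if (workspaceId === undefined) return { cwd };
  if (!isValidWorkspaceId(workspaceId)) throw new Error("Invalid workspaceId");
  return { cwd, workspaceId };
}

const withWorkspace = JSON.stringify(buildSetCwdBody("arch-rewrite", "my-team"));
const withoutWorkspace = JSON.stringify(buildSetCwdBody("arch-rewrite"));
```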
- -**FE action:** When the user sets the working directory on a new chat tab, send the -`workspaceId` alongside the `cwd` in the `PUT /conversations/:id/cwd` request. This -ensures the LSP resolves correctly even before the first turn. - -### Example flow (new chat tab) - -1. User opens a new chat tab (selects workspace "my-team" with - `defaultCwd: "/home/tradam/projects/dispatch"`) -2. User sets working dir to `"arch-rewrite"` (relative) -3. FE sends: `PUT /conversations/abc/cwd { "cwd": "arch-rewrite", "workspaceId": "my-team" }` -4. Backend: assigns conversation to workspace "my-team", then persists cwd "arch-rewrite" -5. FE calls: `GET /conversations/abc/lsp` -6. Backend: `getCwd("abc")` → `"arch-rewrite"` (non-null) → `getEffectiveCwd("abc")` → - resolves "arch-rewrite" against workspace "my-team"'s `defaultCwd` - (`"/home/tradam/projects/dispatch"`) → `"/home/tradam/projects/dispatch/arch-rewrite"` -7. Response: `{ cwd: "/home/tradam/projects/dispatch/arch-rewrite", servers: [...] }` - -Without the `workspaceId` on the PUT (step 3), the conversation would be in the -`"default"` workspace (defaultCwd: null), and the relative cwd "arch-rewrite" would -resolve against `process.cwd()` — the wrong directory. - -## Contract version - -`@dispatch/transport-contract` bumped to `0.17.0` (additive: `SetCwdRequest.workspaceId` -is optional; `LspStatusResponse.cwd` comment updated — no field type change). diff --git a/frontend-mcp-status-handoff.md b/frontend-mcp-status-handoff.md deleted file mode 100644 index d19a9f1..0000000 --- a/frontend-mcp-status-handoff.md +++ /dev/null @@ -1,117 +0,0 @@ -# Handoff — MCP Status Endpoint (backend + frontend) - -## Backend: `GET /conversations/:id/mcp` (transport-http) - -Mirror the existing `GET /conversations/:id/lsp` endpoint exactly. 
The contract -types are already in `@dispatch/transport-contract` 0.22.0: - -```typescript -export type McpServerState = "connecting" | "connected" | "error" | "disconnected"; - -export interface McpServerInfo { - readonly id: string; - readonly state: McpServerState; - readonly error?: string; - readonly toolCount: number; - readonly configSource?: string; -} - -export interface McpStatusResponse { - readonly conversationId: string; - readonly cwd: string | null; - readonly servers: readonly McpServerInfo[]; -} -``` - -### What to change in `packages/transport-http/` - -1. **`src/seam.ts`** — add re-exports from `@dispatch/mcp`: - ```typescript - export type { McpServerStatus, McpService } from "@dispatch/mcp"; - export { mcpServiceHandle } from "@dispatch/mcp"; - ``` - -2. **`src/app.ts`** — add `mcpService?` to `CreateServerOptions` (optional, same - as `lspService?`), then add the route: - ```typescript - app.get("/conversations/:id/mcp", async (c) => { - // Mirror the LSP route exactly: - // 1. Gate on persisted cwd (getCwd) — return {cwd:null, servers:[]} when null - // 2. Resolve effective cwd (getEffectiveCwd) — return {cwd:null, servers:[]} when null - // 3. If opts.mcpService === undefined → 503 { error: "MCP service not available" } - // 4. Call opts.mcpService.status(effectiveCwd) → McpServerStatus[] - // 5. Map McpServerStatus → McpServerInfo (id, state, error?, toolCount, configSource?) - // 6. Return McpStatusResponse { conversationId, cwd: effectiveCwd, servers } - }); - ``` - -3. **`src/extension.ts`** — add `host.getService(mcpServiceHandle)` alongside - `lspService`, and pass `mcpService` to `createApp({...})`. - -4. **`package.json`** — add `"@dispatch/mcp": "workspace:*"` to dependencies. - -5. **Tests** — mirror the LSP status tests: - - Returns null+empty when no persisted cwd — `mcpService.status` NOT called. - - Returns servers when cwd is set. - - Returns 503 when `mcpService` is undefined. 
- - Maps `McpServerStatus` → `McpServerInfo` correctly (error omitted when - undefined, configSource omitted when undefined — honor `exactOptionalPropertyTypes`). - -### McpServerStatus → McpServerInfo mapping - -The `McpService.status(cwd)` returns `McpServerStatus[]` from `@dispatch/mcp`: -```typescript -interface McpServerStatus { - readonly id: string; - readonly state: "connecting" | "connected" | "error" | "disconnected"; - readonly error?: string; - readonly toolCount: number; -} -``` -Map to `McpServerInfo` (same fields, conditionally include `error` per -`exactOptionalPropertyTypes`). Note: `McpServerStatus` does NOT have -`configSource` — that field is on `ResolvedMcpServer` (from config resolution). -If you want to include `configSource` in the status response, the `McpService` -interface or `McpServerStatus` would need to be extended. For Phase 2, omit -`configSource` (it's optional on `McpServerInfo`) unless the MCP extension is -updated to include it in the status. - ---- - -## Frontend (dispatch-web): consume `GET /conversations/:id/mcp` - -### What to do - -1. **Re-pin** `@dispatch/transport-contract` to `0.22.0`. -2. **Re-mirror** the reference snapshot if one exists. -3. **Add a fetch** for `GET /conversations/:id/mcp` — mirror how `GET /conversations/:id/lsp` - is fetched and displayed. -4. **Render** the MCP server status: each server's `id`, `state` (with the same - connected/error/starting visual treatment as LSP), `toolCount`, and optional - `error`. -5. Place the MCP status UI alongside (or below) the LSP status in the conversation - settings/panel — they're sibling features. 
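
A minimal sketch of the render step, assuming a plain-text status line per server. The types are copied from the contract above; `describeMcpServer`, `describeMcpStatus`, and the wording of the strings are hypothetical and only illustrate handling of `cwd: null` and the optional `error`.

```ts
type McpServerState = "connecting" | "connected" | "error" | "disconnected";

interface McpServerInfo {
  readonly id: string;
  readonly state: McpServerState;
  readonly error?: string;
  readonly toolCount: number;
  readonly configSource?: string;
}

interface McpStatusResponse {
  readonly conversationId: string;
  readonly cwd: string | null;
  readonly servers: readonly McpServerInfo[];
}

function describeMcpServer(s: McpServerInfo): string {
  // `error` is optional even in the error state, so fall back to a generic label.
  if (s.state === "error") return `${s.id}: error (${s.error ?? "unknown error"})`;
  if (s.state === "connected") return `${s.id}: connected, ${s.toolCount} tools`;
  return `${s.id}: ${s.state}`;
}

function describeMcpStatus(res: McpStatusResponse): string[] {
  // No cwd set: the backend returns servers: [] and connects nothing.
  if (res.cwd === null) return ["Set a working directory to see MCP servers."];
  return res.servers.map(describeMcpServer);
}

const statusLines = describeMcpStatus({
  conversationId: "abc-123",
  cwd: "/home/user/project",
  servers: [
    { id: "freecad", state: "connected", toolCount: 12 },
    { id: "chrome-devtools", state: "error", error: "Executable not found in $PATH: npx", toolCount: 0 },
  ],
});
```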
- -### Response shape - -```json -{ - "conversationId": "abc-123", - "cwd": "/home/user/project", - "servers": [ - { - "id": "freecad", - "state": "connected", - "toolCount": 12 - }, - { - "id": "chrome-devtools", - "state": "error", - "error": "Executable not found in $PATH: npx", - "toolCount": 0 - } - ] -} -``` - -When no cwd is set: `{ "conversationId": "abc-123", "cwd": null, "servers": [] }`. diff --git a/frontend-message-queue-handoff.md b/frontend-message-queue-handoff.md deleted file mode 100644 index b9c2a6d..0000000 --- a/frontend-message-queue-handoff.md +++ /dev/null @@ -1,189 +0,0 @@ -# FE handoff — message queue + steering injection - -Courier this to `../frontend` (cross-repo contract change; `lsp references` does -not span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks. - -## What shipped (backend) - -A per-conversation **message queue** + **steering** feature. While a turn is -GENERATING, a client can enqueue a user message onto the conversation's queue; -it is delivered mid-turn as **steering** — injected at the next tool-result -boundary so the model sees it alongside the tool results and can adjust course. -If the turn ends with a non-empty queue (no tool call fired), the queue is -carried into a NEW turn as its opening prompt. - -- **`message queue`** — the per-conversation buffer (owned by a new - `@dispatch/message-queue` extension). Transient + in-memory; the queue is - NOT on the chat stream — it is exposed to the frontend as a per-conversation - SURFACE (see below). -- **`steering`** — a user message injected into an in-flight turn at the - tool-result boundary (drawn from the queue). Emitted on the chat stream as a - new `steering` `AgentEvent` so it appears in the transcript live. - -Versions: `@dispatch/wire` `0.7.0 → 0.8.0`, `@dispatch/transport-contract` -`0.11.0 → 0.12.0`. Bump the pinned `file:` deps. (`@dispatch/ui-contract` is -unchanged — the queue uses the existing `custom` surface field kind.) 
- -## Wire types (in `@dispatch/wire`, re-exported by `@dispatch/transport-contract`) - -```ts -/** A message held in the conversation's queue, awaiting steering delivery. */ -interface QueuedMessage { - readonly id: string; // stable, client-visible (UI key + dedup) - readonly text: string; - readonly queuedAt: number; // epoch-ms -} - -/** Payload of the message-queue surface's `custom` field (see below). */ -interface QueuePayload { - readonly messages: readonly QueuedMessage[]; -} - -/** New `AgentEvent` variant (additive to the union). */ -interface TurnSteeringEvent { - readonly type: "steering"; - readonly conversationId: string; - readonly turnId: string; - readonly text: string; // the combined text of all drained messages -} -``` - -## How the frontend reads queue STATE: a surface (NOT the chat stream) - -The queue is control/state, so it rides the **surface** channel (like -cache-warming), not the chat event stream. The `message-queue` extension -contributes a per-conversation surface: - -- **Surface id:** `"message-queue"`; **scope:** `"conversation"` (subscribe with - the `conversationId`). -- **One `custom` field**, `rendererId: "message-queue"`, `payload: QueuePayload` - (`{ messages: QueuedMessage[] }` — the current queue snapshot). -- The surface updates (full new spec) on every change: enqueue (queue grew) and - drain (queue emptied). An idle conversation's queue is empty → the field's - `messages` is `[]`. - -So: **subscribe** to the `message-queue` surface per conversation and render -the queue list from `payload.messages`. You need a bespoke renderer for -`rendererId: "message-queue"` (the `custom` escape hatch — see the loaded- -extensions `table` renderer precedent). The surface is **read-only** (no -`invoke` actions); enqueuing is a chat op (below). 
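
A minimal sketch of the renderer's data step, assuming each surface update delivers a full `QueuePayload` snapshot that replaces the previous one. `QueueView` and `toQueueView` are hypothetical names for this example; how the result is drawn is up to the FE.

```ts
interface QueuedMessage {
  readonly id: string;
  readonly text: string;
  readonly queuedAt: number; // epoch-ms
}

interface QueuePayload {
  readonly messages: readonly QueuedMessage[];
}

// Hypothetical view model for the custom `message-queue` renderer.
interface QueueView {
  readonly visible: boolean;
  readonly items: readonly { key: string; text: string; queuedAtIso: string }[];
}

function toQueueView(payload: QueuePayload): QueueView {
  // Each snapshot is the whole queue, so the view is rebuilt rather than merged.
  const items = payload.messages.map((m) => ({
    key: m.id, // stable id doubles as the UI key
    text: m.text,
    queuedAtIso: new Date(m.queuedAt).toISOString(),
  }));
  return { visible: items.length > 0, items };
}

const idleView = toQueueView({ messages: [] });
const busyView = toQueueView({ messages: [{ id: "q1", text: "use the other file", queuedAt: 0 }] });
```

An empty snapshot yields `visible: false`, which corresponds to an idle conversation or a queue that has just been drained.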
- -## How the frontend ENQUEUES: the `chat.queue` WS op - -```ts -interface ChatQueueMessage { - readonly type: "chat.queue"; - readonly conversationId: string; - readonly text: string; -} -``` -(additive to `WsClientMessage`.) - -- **Fire-and-forget.** On success the server emits NOTHING back — the - `message-queue` SURFACE updates (the new message appears in the snapshot). - On failure (empty/missing `text`, unknown conversation) the server replies - `chat.error` (`{ type: "chat.error"; conversationId?; message }`). -- **`text` must be non-empty** after trim (the server 400/errors otherwise). -- **Auto-start when idle (server-owned decision):** if NO turn is active for the - conversation, `chat.queue` does NOT queue — it STARTS A NEW TURN with the - message as its opening prompt (equivalent to `chat.send`). The sender is - auto-subscribed and the turn's events stream as `chat.delta`s (the opening - `user-message` carries the text). So a single `chat.queue` op works for both - "steer during generation" and "send" — you don't need to pick. When a turn IS - active, the message is appended to the queue (surface updates) and delivered - at the next tool-result boundary. - -## How the frontend shows steering in the TRANSCRIPT: the `steering` event - -When the kernel drains a non-empty queue at a tool-result boundary, the -session-orchestrator emits a **`steering`** `AgentEvent` on the chat stream -(arrives inside a `chat.delta` `{ event }`, like every other `AgentEvent`): - -```ts -{ type: "chat.delta", event: { type: "steering", conversationId, turnId, text } } -``` - -- Render `text` as a **user bubble in the transcript**, positioned after the - tool-call/tool-result it followed (it is a user message the model saw mid-turn, - alongside the tool results). One `steering` event per drain; `text` is the - combined text of all messages drained at that boundary (joined by a blank - line). 
-- **Move, don't duplicate:** the drained messages were already shown in the - queue surface; when the surface then updates to empty (the drain cleared the - queue), they should leave the queue UI (they now live in the transcript as the - `steering` bubble). A simple rule: on `steering`, append the bubble to the - transcript; the surface's subsequent empty snapshot clears the queue UI. -- **Late-join safe:** like `user-message`, `steering` is buffered into the - in-flight turn's event buffer, so a client that subscribes mid-turn (or a - second device) sees it before seal (mirrors the CR-3 `user-message` fix). - (Carry-to-new-turn, below, does NOT emit `steering` — the new turn's - `user-message` covers it.) - -## Carry to a new turn (no `steering` event) - -If a turn ENDS with a non-empty queue (the model finished without making a tool -call, so no tool-result boundary was hit), the orchestrator drains the queue, -combines the messages, and **starts a NEW turn** whose opening prompt is the -combined text. You will see: the old turn's `done` + `turn-sealed`, then a new -`turn-start` + `user-message` carrying the combined text (rendered as the new -turn's normal user bubble). The queue surface also clears (empty snapshot). No -`steering` event in this case — handle the carried text as an ordinary new-turn -user message. - -## HTTP path (for the CLI / non-WS clients; the FE uses the WS op above) - -`POST /conversations/:id/queue` with body `QueueRequest { text }` → `QueueResponse`: - -```ts -interface QueueResponse { - readonly conversationId: string; - readonly startedTurn: boolean; // true = was idle, a new turn started - readonly queue: readonly QueuedMessage[]; // snapshot after the enqueue -} -``` -- Empty/whitespace `text` → HTTP 400 `{ error }`. -- `startedTurn: true` means no turn was active and the enqueue started one (the - message is the turn's opening prompt, NOT a queued steering message). 
-- `startedTurn: false` means a turn was active and the message was queued (the - `queue` snapshot includes it). - -## What we need the FE to do - -1. **Bump pinned deps:** `@dispatch/wire` → `0.8.0`, `@dispatch/transport-contract` - → `0.12.0`. -2. **Queue UI (per conversation):** subscribe to the `message-queue` surface - (scope `conversation`) and render `payload.messages` (`QueuedMessage[]`) with a - `rendererId: "message-queue"` custom renderer — a list of pending messages - with their text (and maybe `queuedAt` as a timestamp). Empty `messages` = - nothing to show (hide the panel). -3. **Enqueue affordance:** while a turn is generating, show an input that sends - `chat.queue { conversationId, text }` (NOT `chat.send` — `chat.queue` is the - steering entry; it auto-starts a turn if idle, so it's safe to offer it - whenever the user wants to add input). Trim/validate non-empty client-side - too; expect a `chat.error` on failure. -4. **Steering bubble:** handle the new `steering` `AgentEvent` (type `"steering"`) - on the `chat.delta` stream → render `event.text` as a user bubble in the - transcript after the tool calls; clear the queue UI when the surface updates - to empty. -5. **Carry:** no special handling — a carried queue surfaces as a normal new - turn (`turn-start` + `user-message`); just let the existing new-turn flow - render it. The queue surface clears automatically. - -## Notes / known gaps - -- **Live end-to-end (a real steering turn via a tool-calling model) is not yet - exercised** — the logic is unit/integration tested and the app boots clean with - the `message-queue` extension registered, but a live `chat.queue` → tool-call - → `steering` event flow against a real model has not been run. Worth a live - smoke once the FE wires it (or ask the backend to run one). 
-- **Close-with-queued-messages (open product question):** if a client - `POST /conversations/:id/close` (explicit tab close) while the queue is - non-empty, the in-flight turn aborts and the carry currently STILL fires - (starting a new turn on the closed conversation). This may or may not be - desired (does closing discard pending steering, or honor it?). Backend flag - for a decision; if "discard on close" is wanted, the backend will gate the - carry on `finishReason !== "aborted"`. No FE action either way — just be aware - a closed conversation might briefly start a turn from a queued message. -- **`steering` is additive** to the `AgentEvent` union — no exhaustive switches - broke on the backend (verified: `tsc -b` EXIT 0). If the FE has an exhaustive - switch on `AgentEvent`, add a `steering` case. diff --git a/frontend-metrics-handoff.md b/frontend-metrics-handoff.md deleted file mode 100644 index be033d8..0000000 --- a/frontend-metrics-handoff.md +++ /dev/null @@ -1,121 +0,0 @@ -# Frontend handoff — live turn metrics (tokens + timing) - -> From: arch-rewrite (backend) orchestrator · For: the frontend FE team. -> Status: **LIVE on the stream now** (backend committed + live-verified). Consume via the pinned -> contracts `@dispatch/[email protected]` + `@dispatch/[email protected]` (reference snapshots -> regenerated in `dispatch-web/.dispatch/{wire,transport-contract}.reference.md`). - -## 1. 
What you can now access -The backend's **authoritative** token + timing metrics are now on the live turn stream: - -| Metric | Where | Field(s) | -|---|---|---| -| Per-step tokens | `usage` event | `usage` (`inputTokens`/`outputTokens`/`cacheReadTokens?`/`cacheWriteTokens?`) + new `stepId?` | -| Per-step **TTFT** | new `step-complete` event | `ttftMs?` | -| Per-step **decode** time | new `step-complete` event | `decodeMs?` | -| Per-step total generation | new `step-complete` event | `genTotalMs?` | -| **Tool execution** time | `tool-result` event | `durationMs?` | -| **Turn** wall-clock | `done` event | `durationMs?` | -| **Turn** total tokens | `done` event | `usage?` | -| **Tokens/sec** (TPS) | derive | `usage.outputTokens / (step-complete.decodeMs / 1000)` | -| Context-size proxy | `usage` event | `usage.inputTokens` (size the model counted; `cacheReadTokens` = cached portion) | - -"Authoritative" = measured by the backend runtime, not client wall-clock. They differ from -anything you'd time in the browser (no network/buffering in them). - -## 2. How they're delivered -**Inline, in the same chat stream you already consume** — WS `chat.delta` frames (and the -`POST /chat` NDJSON stream) carry the `AgentEvent` union; metrics are additional event types / -fields in that union. **No new endpoint, no subscription/negotiation.** You already `switch` on -`event.type`; route the metric events to a telemetry handler and ignore any you don't render -(zero cost). They do **not** appear in message content — keep your transcript rendering as-is. - -These events are **low-frequency** (one `step-complete` per step, one `done` per turn, a -`durationMs` per tool result) — not per-token — so there's no stream-volume concern. - -## 3. The new/changed events (shapes) -All new fields are **optional** — see §5. Every event still carries `conversationId` + `turnId`. 
- -```ts -// NEW variant in AgentEvent — emitted once per step, AT STEP END (timing is final here) -interface TurnStepCompleteEvent { - type: "step-complete"; - conversationId: string; - turnId: string; - stepId: StepId; // join key to the step's `usage` event + tool events - ttftMs?: number; // time to first token (stream start → first text|reasoning delta) - decodeMs?: number; // first token → stream end (== genTotalMs - ttftMs) - genTotalMs?: number; // whole-step generation (present even if no first token was seen) -} - -// usage event — now labeled by step -interface TurnUsageEvent { - type: "usage"; - conversationId: string; turnId: string; - stepId?: StepId; // NEW — attribute tokens to a step / join to step-complete - usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? } -} - -// tool-result — now carries execution time -interface TurnToolResultEvent { - type: "tool-result"; - conversationId: string; turnId: string; - stepId: StepId; toolCallId: string; toolName: string; - content: string; isError: boolean; - durationMs?: number; // NEW — tool execution time (dispatch → result) -} - -// done — now carries turn totals -interface TurnDoneEvent { - type: "done"; - conversationId: string; turnId: string; - reason: string; - durationMs?: number; // NEW — whole-turn wall-clock - usage?: Usage; // NEW — aggregate turn tokens (so you needn't sum the usage events) -} -``` - -## 4. Correlation & derived metrics -Keys: `turnId` groups a turn; `stepId` groups a step within it; `toolCallId` pairs a tool call -with its result. A turn has **one `step-complete` (and usually one `usage`) per step**. - -- **Per-step TPS** = `usage.outputTokens / (step-complete.decodeMs / 1000)` — join `usage` and - `step-complete` by `stepId`. (Use `decodeMs`, not `genTotalMs`, for decode-rate TPS; it excludes - first-token latency. See "which TPS" caveat below.) -- **Turn TPS** = `done.usage.outputTokens / (Σ step-complete.decodeMs / 1000)`. 
-- **Generation total per step** = `genTotalMs` (or `ttftMs + decodeMs`). -- **Turn-visible first-token latency** = the `ttftMs` of **step 0** (the first `step-complete`). -- **Total prefill overhead** = `Σ ttftMs` across steps; **pure generation** = `Σ decodeMs`. -- **Tool time** = `tool-result.durationMs` per call; sum per `stepId` for a batch. - -"Which TPS": `decodeMs` is first-token → end, so TPS over it is the decode rate (first-token -latency removed). If you want end-to-end rate including the wait, use `ttftMs + decodeMs`. - -## 5. Optionality — you MUST tolerate absence -- `step-complete` is always emitted per step, but its **timing fields are present only when the - server runs with a clock** (it does in normal operation). `ttftMs`/`decodeMs` are additionally - absent for a step that produced **no text/reasoning token** (e.g. a tool-call-only step) — - `genTotalMs` is still present in that case. -- `usage.stepId`, `tool-result.durationMs`, `done.durationMs`, `done.usage` are all optional. -- Render gracefully when a value is missing (omit the figure; don't show `NaN`/`undefined`). - -## 6. What is NOT available yet (deferred — Pass 2) -**Metrics are LIVE-ONLY.** They are **not persisted**, so: -- `GET /conversations/:id` (history) returns messages/chunks but **no tokens/timing**. Reopening a - past conversation will show content without metrics. -- If you need historical metrics (e.g. show TPS on a reloaded conversation), that's the planned - **Pass 2** (persist per-turn metrics + a read path) — see `tasks.md` "Pass 2 — DEFERRED". Tell - us if you need it and we'll prioritize. -- TPS is not sent pre-computed (derive it, §4). No per-token timing (metrics are per-step/per-turn). - -## 7. Integration checklist -1. Refresh deps: `bun run typecheck` in frontend (picks up `[email protected]` / `[email protected]`). -2. Extend your `chat.delta` event handler: add a `case "step-complete"` and read the new optional - fields on `usage`/`tool-result`/`done`. 
(No exhaustive-switch break — these are additive.) -3. Keep a per-turn (and per-step, keyed by `stepId`) telemetry accumulator alongside the transcript - store; fold metric events into it; render where you want (e.g. a turn footer / per-step badges). -4. Treat every metric field as optional (§5). - -## 8. Carrier facts (unchanged) -HTTP 24203 (`POST /chat` NDJSON, `GET /conversations/:id`, `GET /models`), WS 24205 (one socket, -`chat.delta` carries each `AgentEvent`), CORS `*`. Same events on both carriers. diff --git a/frontend-metrics-pass2-handoff.md b/frontend-metrics-pass2-handoff.md deleted file mode 100644 index adf0404..0000000 --- a/frontend-metrics-pass2-handoff.md +++ /dev/null @@ -1,67 +0,0 @@ -# FE handoff — persisted replay metrics (Pass 2) + metrics endpoint - -> **Courier doc** (backend → `../frontend`, via the user). Per ORCHESTRATOR §7 -> the backend does NOT write the FE repo; the FE orchestrator applies this delta -> on its side (regenerate the in-repo `.dispatch/*.reference.md` snapshots + bump -> the `file:` dep). `lsp references` does not span the two repos. Backend commit: -> `6db12ff`. - -## Versions -- `@dispatch/wire` `0.3.0 → 0.4.0` (additive) -- `@dispatch/transport-contract` `0.3.0 → 0.4.0` (additive) - -Pure-type, additive change — no breaking edits to existing types. - -## New wire types (`@dispatch/wire`, re-exported by `@dispatch/transport-contract`) - -```ts -interface StepMetrics { - stepId: StepId; // `<turnId>#<index>`, join key to the live stream - usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? 
} - ttftMs?: number; // time to first token (optional — clock + first-token gated) - decodeMs?: number; // first token → stream end - genTotalMs?: number; // stream start → end (== ttftMs + decodeMs when a first token was seen) -} - -interface TurnMetrics { - turnId: string; // plain wire turn id, join key to AgentEvents - usage: Usage; // aggregate across all steps - durationMs?: number; // turn wall-clock (optional — clock gated) - steps: readonly StepMetrics[]; // per-step, in step order -} -``` - -These are the **persisted, replayable** counterparts of the live `usage` / -`step-complete` / `done` events (which remain transient and unchanged). - -## New read endpoint - -`GET /conversations/:id/metrics` → `ConversationMetricsResponse`: - -```ts -interface ConversationMetricsResponse { turns: readonly TurnMetrics[] } -``` - -Semantics: -- `turns` = every **sealed** turn's `TurnMetrics`, in **turn-append order**. -- A turn appears only **after seal** (post-persist); an in-flight/unsealed turn is absent. -- This is a **separate axis** from `GET /conversations/:id?sinceSeq=` (which returns - seq-cursor chunk CONTENT). Metrics are keyed per **turn**, not per chunk, so they are - **not** seq-filtered — hence a sibling route, not a field on the history response. -- Unknown / metric-less conversation → `{ turns: [] }`. -- CORS: same wildcard as the other routes. - -## Suggested FE consumption -On (re)opening a conversation, the chat feature can `GET /conversations/:id/metrics` -once alongside the history hydrate (`?sinceSeq=`), then render historical -tokens/latency per turn (and per step via `stepId`) — identical fields to what it -already routes from the live `step-complete` / `usage` / `done` stream. TPS is -still derived FE-side (`usage.outputTokens / decodeMs`); context-size proxy = -`usage.inputTokens`. 
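The FE-side TPS derivation mentioned above can be sketched as follows. This is a minimal sketch, not the backend's or the FE's actual code: the helper names `stepTps` and `turnTps` are hypothetical, `StepId` is assumed to be a plain string, and the interfaces are local copies of the wire shapes from this handoff. Results are in tokens per second (`decodeMs` is divided by 1000), and both helpers return `undefined` rather than `NaN` when `decodeMs` is absent.

```typescript
// Local copies of the wire shapes described in this handoff.
// Assumption: StepId is a plain string of the form `<turnId>#<index>`.
type StepId = string;

interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

interface StepMetrics {
  stepId: StepId;
  usage: Usage;
  ttftMs?: number;
  decodeMs?: number;
  genTotalMs?: number;
}

interface TurnMetrics {
  turnId: string;
  usage: Usage;
  durationMs?: number;
  steps: readonly StepMetrics[];
}

// Hypothetical helper: decode-rate TPS for one step.
// Returns undefined when decodeMs is missing or zero (e.g. a tool-call-only step).
function stepTps(step: StepMetrics): number | undefined {
  if (step.decodeMs === undefined || step.decodeMs <= 0) return undefined;
  return step.usage.outputTokens / (step.decodeMs / 1000);
}

// Hypothetical helper: turn TPS over the summed decode time of all steps.
// Steps without decodeMs contribute 0 ms to the sum.
function turnTps(turn: TurnMetrics): number | undefined {
  const totalDecodeMs = turn.steps.reduce((sum, s) => sum + (s.decodeMs ?? 0), 0);
  if (totalDecodeMs <= 0) return undefined;
  return turn.usage.outputTokens / (totalDecodeMs / 1000);
}
```

A renderer would call these per `TurnMetrics` entry from the metrics response and omit the figure whenever the result is `undefined`.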
- -## Invariants (confirmed live) -- Persisted `TurnMetrics.usage` / `durationMs` and each `StepMetrics` - (`stepId` + `usage` + `ttftMs`/`decodeMs`/`genTotalMs`) **byte-match** what the - live stream emitted for the same turn (verified end-to-end against flash). -- `stepId` is the SAME value on the live `step-complete`/`usage` events, the persisted - `StepMetrics`, and the tool chunks — one grouping key across live + replay. diff --git a/frontend-model-persistence-handoff.md b/frontend-model-persistence-handoff.md deleted file mode 100644 index 912cea6..0000000 --- a/frontend-model-persistence-handoff.md +++ /dev/null @@ -1,91 +0,0 @@ -# Frontend handoff — per-conversation model persistence - -## What changed - -A chat's selected provider + model is now **persisted per conversation** -(like `cwd` and `reasoningEffort` already are). Opening a conversation in a new -browser session recalls the originally selected model instead of defaulting to -the server default. - -## Contract version bump - -`@dispatch/transport-contract` `0.19.0 → 0.20.0` — re-pin the `file:` dep and -re-mirror `.dispatch/transport-contract.reference.md`. - -## New types (additive) - -```ts -// GET /conversations/:id/model -export interface ModelResponse { - readonly conversationId: string; - readonly model: string | null; // <credentialName>/<model> form, or null -} - -// PUT /conversations/:id/model -export interface SetModelRequest { - readonly model: string | null; // null clears the persisted selection -} -``` - -## New endpoints - -### `GET /conversations/:id/model` -Returns `ModelResponse`. `model` is `null` when never set (the server then -resolves turns using the default provider + model). - -### `PUT /conversations/:id/model` -Body: `SetModelRequest`. Set `model` to a `<credentialName>/<model>` string -(one of the values from `GET /models`) to persist it. Set `model` to `null` -to clear the persisted selection. Returns `ModelResponse` with the resulting -value. 
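A minimal sketch of how a client might build the `PUT` request and interpret the `GET` response. The helper names (`buildSetModelRequest`, `selectorValue`) and the `content-type` header are assumptions for illustration; only the route, the `SetModelRequest` body, and the `ModelResponse` shape come from this handoff.

```typescript
// Local copies of the contract types from this handoff.
interface ModelResponse {
  readonly conversationId: string;
  readonly model: string | null;
}

interface SetModelRequest {
  readonly model: string | null;
}

// Hypothetical helper: builds the PUT request that persists (or, with null, clears) the model.
// Assumption: the endpoint accepts a JSON body with a content-type header.
function buildSetModelRequest(baseUrl: string, conversationId: string, model: string | null) {
  const body: SetModelRequest = { model };
  return {
    url: `${baseUrl}/conversations/${encodeURIComponent(conversationId)}/model`,
    init: {
      method: "PUT",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: the value a model selector would show for a GET response.
// A null model means nothing was persisted, so the caller's global default applies.
function selectorValue(res: ModelResponse, globalDefault: string): string {
  return res.model ?? globalDefault;
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`; keeping the builder pure makes it easy to unit-test without a running server.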
- -## What the FE should do - -1. **On conversation open** — call `GET /conversations/:id/model` to fetch the - persisted model. If non-null, set the model selector to that value. If null, - use the global default (current behavior). - -2. **On model select** — call `PUT /conversations/:id/model` with the selected - model name (`<credentialName>/<model>` form). This persists it so future - turns (and new browser sessions) use the same model. - -3. **On model clear** (if the FE supports clearing back to default) — call - `PUT /conversations/:id/model` with `{ model: null }`. - -4. **No `ChatRequest.model` change needed** — the FE may continue sending - `model` on `chat.send` (per-turn override); the backend persists it. Or the - FE may omit `model` on `chat.send` and rely on the persisted value — the - backend resolves it. Either way works. - -## Backend behavior - -- **Per-turn override** (`ChatRequest.model` / `chat.send` model) takes - precedence and is persisted. -- **No per-turn override** → backend checks `getModel(conversationId)` → if - non-null, uses it; if null, falls through to the default provider. -- **Warm path** also resolves the model from persistence when no explicit - override is given (parity with real turns). - -## No FE handoff needed for tasks 1 & 2 - -- **Task 1** (workspace tab broadcast): already couriered to 29ae by a prior - orchestrator agent (`frontend-workspace-open-handoff.md`). -- **Task 2** (system-prompt cwd reconstruction): backend-only fix, no contract - version bump, no FE action needed. - -## Assumptions made (user was away) - -1. **Persist the model name string** (`<credentialName>/<model>` form), not - the provider/credential separately — the model name already encodes both - (the credential binds to a provider). This mirrors how the CLI sends - `--model` and how `ChatRequest.model` works. -2. **No model validation on PUT** — the backend doesn't validate the model - name on `PUT /conversations/:id/model` (it's just a string). 
The provider - resolves it at turn time; an unknown model → turn error, not a 400. This - matches the contract doc on `SetModelRequest`. -3. **Empty string clears** — `setModel(id, "")` deletes the key. The HTTP - `PUT` with `{ model: null }` maps to this. This is an implementation detail - the FE doesn't need to know about (it sends `null`). -4. **No `model` field on `ConversationMeta`** — following the precedent of `cwd` - and `reasoningEffort` (which are NOT on `ConversationMeta` but fetched via - dedicated endpoints). The FE calls `GET /conversations/:id/model` to read. diff --git a/frontend-reasoning-effort-handoff.md b/frontend-reasoning-effort-handoff.md deleted file mode 100644 index 656dede..0000000 --- a/frontend-reasoning-effort-handoff.md +++ /dev/null @@ -1,81 +0,0 @@ -# FE handoff — reasoning effort (thinking-depth knob) - -Courier this to `../frontend` (cross-repo contract change; `lsp references` does not -span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks. - -## What shipped (backend) - -A new user-settable knob, **reasoning effort**: how much extended thinking the model spends -before answering. Canonical ladder (type `ReasoningEffort`, exported by `@dispatch/wire` and -re-exported by `@dispatch/transport-contract`): - -```ts -type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max"; -``` - -Versions: `@dispatch/wire` `0.6.1 → 0.7.0`, `@dispatch/transport-contract` -`0.10.0 → 0.11.0`. Bump the pinned `file:` deps. - -It has TWO setting scopes, resolved server-side per turn: - -1. **Per-turn override** — optional `reasoningEffort` on `ChatRequest` (HTTP `POST /chat`) - and therefore on the WS `chat.send` message (`ChatSendMessage extends ChatRequest`). - Applies to THAT turn only; does NOT persist. -2. 
**Persisted per-conversation setting** — sticky; used for every turn that has no per-turn - override: - - `GET /conversations/:id/reasoning-effort` → `ReasoningEffortResponse` - `{ conversationId, reasoningEffort: ReasoningEffort | null }` (`null` = never set). - - `PUT /conversations/:id/reasoning-effort` with body `SetReasoningEffortRequest` - `{ reasoningEffort }` → persists it. - -**Resolution chain (server-owned — do not re-implement):** per-turn override → persisted -conversation value → **default `"high"`**. So a conversation with nothing set already runs at -`high`; `null` from the GET means "default (`high`) applies", not "off". - -**Validation:** an unrecognized level → HTTP 400 `{ error }` (the error message lists the -valid levels). Same for the WS path (the standard `chat.send` error reply). Send only the -five ladder strings; omit the key entirely for "no override" (don't send `null`/`""`). - -## What the model does with it (context for UX copy) - -The Anthropic provider maps the level to an extended-thinking token budget -(`low` 4 096 · `medium` 10 240 · `high` 16 384 · `xhigh` 32 768 · `max` 65 536). Higher -levels = the model thinks longer before answering (more `reasoning-delta` events / thinking -chunks ahead of the text — the FE already renders those). Providers without a thinking knob -ignore the field — sending it is always safe. - -## What we need the FE to do - -1. **Per-conversation effort selector** — a 5-option control (plus an implicit "default" - state when the GET returns `null`): - - On conversation open: `GET /conversations/:id/reasoning-effort`; render `null` as - "high (default)". - - On change: `PUT` the chosen level. It takes effect from the NEXT turn — no turn restart - needed. -2. **(Optional) per-turn override** — if the composer grows a "think harder for this one - message" affordance, set `reasoningEffort` on that `chat.send` only. The persisted setting - is untouched by overrides. -3. 
**Expect more thinking** — at `xhigh`/`max` the pre-answer thinking phase can be long; - whatever spinner/" thinking…" treatment exists should tolerate extended runs of - reasoning deltas before the first text delta. - -## Cache note (don't surprise users) - -Changing the effort level changes the provider request shape, which can bust the prompt -cache for the next turn (one-time re-prefill cost). The backend's cache-warming path already -warms with the SAME resolved effort as a real turn, so a STABLE setting stays cache-safe; -only the act of changing it costs. If the FE wants, it can mention this in the selector's -tooltip — no functional handling required. - -## Verify (manual) - -```bash -# sticky setting round-trip -curl -s localhost:24203/conversations/<id>/reasoning-effort # → null first time -curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \ - -H 'content-type: application/json' -d '{"reasoningEffort":"xhigh"}' -curl -s localhost:24203/conversations/<id>/reasoning-effort # → "xhigh" -# bad level → 400 -curl -s -X PUT localhost:24203/conversations/<id>/reasoning-effort \ - -H 'content-type: application/json' -d '{"reasoningEffort":"banana"}' -``` diff --git a/frontend-stop-generation-handoff.md b/frontend-stop-generation-handoff.md deleted file mode 100644 index 117b65e..0000000 --- a/frontend-stop-generation-handoff.md +++ /dev/null @@ -1,49 +0,0 @@ -# FE handoff — stop generation mid-turn - -Courier this to `../frontend`. All changes are ADDITIVE. - -## What shipped (backend) - -A "stop" button: aborts an in-flight generation without closing the conversation. -The conversation stays open — it transitions `active → idle` via the normal -turn-settle path. Partial messages are persisted. The turn seals with -`finishReason: "aborted"`. - -This is distinct from `POST /conversations/:id/close` which marks the -conversation as `closed` (tab dismiss). 
- -## `POST /conversations/:id/stop` — stop generation - -Aborts the in-flight turn's `AbortController`. The kernel finishes generation, -persists partial messages, and seals the turn normally. The conversation -transitions `active → idle` (not `closed`). - -- 200 response: `{ conversationId: string, abortedTurn: boolean }` -- `abortedTurn: true` — a turn was active and has been aborted. -- `abortedTurn: false` — no active turn (no-op, conversation was already idle). -- Idempotent — stopping an idle conversation is safe. - -## What the FE receives after stopping - -The existing event flow handles everything — no new WS message needed: - -1. The `done` event arrives with `reason: "aborted"` (the turn sealed normally). -2. The `conversation.statusChanged` WS message arrives with `status: "idle"`. -3. The FE should reload history via `GET /conversations/:id` to see the partial - messages that were persisted before the abort. - -## What the FE needs to do - -1. **Stop button** in the conversation toolbar (only visible when `status: "active"`). - On click → `POST /conversations/:id/stop`. Disable the button after clicking - (wait for the `done` event + `statusChanged: idle` before re-enabling). - -2. **Handle the response**: `abortedTurn: true` means the stop worked. - `abortedTurn: false` means there was nothing to stop (the turn may have - already finished between the click and the request). - -3. **Reload history** after receiving the `done` event to show partial output. - -## CLI - -`dispatch stop <conversationId>` — stops generation. Resolves short IDs. diff --git a/frontend-system-prompt-handoff.md b/frontend-system-prompt-handoff.md deleted file mode 100644 index c135145..0000000 --- a/frontend-system-prompt-handoff.md +++ /dev/null @@ -1,73 +0,0 @@ -# FE Courier Handoff: System Prompt Builder (Updated) - -> Backend→FE courier. Send to FE agent `ffe3`. -> Supersedes the earlier `frontend-system-prompt-handoff.md` — adds `prompt:workspace_id`. 
- -## API endpoints - -### `GET /system-prompt` → `{ template: string }` -Returns the current global template. When none is stored, returns the built-in default. - -### `PUT /system-prompt` ← `{ template: string }` → `{ template: string }` -Set the global template. Empty string = "no system prompt". 400 if `template` missing/wrong type. 503 if service unavailable. - -### `GET /system-prompt/variables` → `{ variables: SystemPromptVariable[] }` -Static catalog — always available (no service dependency). Use this to render the variable selector buttons. - -## Template format - -### Variable insertion -``` -[type:name] -``` -Resolves at construction time. Unknown type → blank. Non-existent variable (e.g. file not found) → blank. - -### Conditional blocks -``` -[if type:name] - ...if variable exists... -[else] - ...if not... -[endif] -``` -Negated: -``` -[if !type:name] - ...if variable does NOT exist... -[endif] -``` -Nested `[if]`: supported. Multi-line: supported. Unmatched `[if]`/`[endif]`: literal text. - -## Available variables (updated) - -| Type:Name | Description | Dynamic? | -|---|---|---| -| `system:time` | Current time (ISO 8601) | No | -| `system:date` | Current date (YYYY-MM-DD) | No | -| `system:os` | Operating system | No | -| `system:hostname` | Machine hostname | No | -| `prompt:cwd` | Working directory | No | -| `prompt:model` | Current model name | No | -| `prompt:conversation_id` | Conversation ID | No | -| `prompt:workspace_id` | Workspace identifier — lets the AI know which workspace it's in, useful when summoning agents | No | -| `git:branch` | Current git branch | No | -| `git:status` | Short git status | No | -| `file:<path>` | File contents (relative to cwd, or absolute if starts `/`) | **Yes** | - -For `file:<path>`, allow free-text input for the path. - -## Caching behavior - -System prompt is **constructed once** (first turn of a new conversation) and **persisted**. Reused on all subsequent turns (cache-safe). 
Reconstructed only on **compaction**. Changing the template does NOT affect existing conversations until compacted. - -## Default template - -``` -You are a helpful coding assistant. - -[if file:AGENTS.md] -[file:AGENTS.md] -[endif] - -The current working directory is [prompt:cwd]. -``` diff --git a/frontend-todo-handoff.md b/frontend-todo-handoff.md deleted file mode 100644 index 4a81296..0000000 --- a/frontend-todo-handoff.md +++ /dev/null @@ -1,91 +0,0 @@ -# FE handoff — todo task list surface - -Courier this to `../frontend` (cross-repo contract change; `lsp references` does -not span repos — ORCHESTRATOR §7). All changes are ADDITIVE — nothing existing breaks. - -## What shipped (backend) - -A per-conversation **task list** the AI model maintains via a `todo_write` tool. The -list is exposed to the frontend as a per-conversation **surface** (read-only). The -model creates/updates the list during a turn; the surface updates live so the FE can -render the current state. - -- **`todo_write` tool** — the model passes the FULL list each call (replaces the - existing list). Returns the list as JSON. The tool description guides the model on - when to use it (3+ step tasks, planning, etc.). -- **State** — in-memory, per-conversation. No persistence (the list lives for the - process lifetime of the conversation). -- **No new wire types, no version bumps.** The todo surface uses the existing - `custom` surface field kind (`ui-contract` unchanged). The `TodoItem` type is - defined by the `todo` extension and carried in the surface payload — it is NOT - in `@dispatch/wire` or `@dispatch/transport-contract`. 
- -## The surface - -The `todo` extension contributes a per-conversation surface: - -- **Surface id:** `"todo"` -- **Scope:** `"conversation"` (subscribe with the `conversationId`) -- **Region:** `"side"` -- **Title:** `"Tasks"` -- **One `custom` field**, `rendererId: "todo"`, `payload: TodoPayload` - -```ts -interface TodoPayload { - todos: readonly TodoItem[]; -} - -interface TodoItem { - content: string; - status: "pending" | "in_progress" | "completed" | "cancelled"; -} -``` - -- **Read-only** — no `invoke` actions. The model mutates the list via the - `todo_write` tool; the FE only renders. -- **Updates** on every `todo_write` call (subscriber-notify → full new spec with the - updated `todos` array). -- **Empty list** — an idle conversation (no todo list created yet, or the model - cleared it with an empty array) renders `todos: []`. Hide the panel when empty. - -## What the FE needs to do - -1. **Subscribe** to the `todo` surface per conversation (same pattern as - `message-queue` and `cache-warming` — `scope: "conversation"`, pass - `conversationId` on subscribe). - -2. **Custom renderer** for `rendererId: "todo"` — render the `payload.todos` array - as a task list. Suggested UI: - - Each item shows `content` with a status indicator: - - `pending` — empty circle / checkbox - - `in_progress` — spinner / filled circle (highlight) - - `completed` — checkmark (strikethrough or dim the content) - - `cancelled` — X / dash (dim/strikethrough) - - Order is significant — items are in the order the model provided them (array - index = identity). - - Only one item should be `in_progress` at a time (the tool description enforces - this via guidance, not validation — but the model should comply). - -3. **Live updates** — the surface pushes a new spec on every `todo_write` call. No - polling needed. Just re-render from the new `payload.todos`. - -4. **Empty state** — when `todos` is `[]`, hide the panel (the model hasn't created - a list yet, or cleared it). 
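The renderer steps above can be sketched as a pure mapping from the surface payload to display rows. This is only an illustration under assumptions: the glyphs, the `TodoRow` shape, and the helper names `toRows` and `panelVisible` are hypothetical; `TodoPayload` and `TodoItem` are the shapes given in this handoff.

```typescript
type TodoStatus = "pending" | "in_progress" | "completed" | "cancelled";

interface TodoItem {
  content: string;
  status: TodoStatus;
}

interface TodoPayload {
  todos: readonly TodoItem[];
}

// Hypothetical glyphs: the handoff only suggests the kind of indicator per status.
const STATUS_GLYPH: Record<TodoStatus, string> = {
  pending: "○",
  in_progress: "◐",
  completed: "✓",
  cancelled: "✕",
};

// Hypothetical view-model row for one task.
interface TodoRow {
  key: number; // array index, since order is the item's identity
  glyph: string;
  text: string;
  dimmed: boolean; // completed and cancelled items are de-emphasised
}

// Re-derive all rows from the latest payload; the surface always pushes the full list.
function toRows(payload: TodoPayload): TodoRow[] {
  return payload.todos.map((item, index) => ({
    key: index,
    glyph: STATUS_GLYPH[item.status],
    text: item.content,
    dimmed: item.status === "completed" || item.status === "cancelled",
  }));
}

// The panel is hidden while the list is empty.
function panelVisible(payload: TodoPayload): boolean {
  return payload.todos.length > 0;
}
```

Because every `todo_write` replaces the whole list, the renderer needs no diffing: it can recompute `toRows` on each pushed spec.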
- -## No other integration points - -- No new WS ops (no `chat.queue` equivalent — the model is the only writer). -- No new HTTP endpoints (the list is tool-driven, not API-driven). -- No new `AgentEvent` types (the list is not on the chat stream). -- No version bumps in `@dispatch/wire` or `@dispatch/transport-contract`. - -## Notes - -- **In-memory only** — the todo list does NOT persist across server restarts. If - the server restarts, the list is cleared. The model recreates it on the next - `todo_write` call. This mirrors the message-queue behavior. -- **Per-conversation** — each conversation has its own list. Switching conversations - means subscribing to a different `conversationId` and rendering that conversation's - list. -- **Model-driven** — the FE has no control over the list (read-only surface). The - model creates, updates, and clears items. The FE just displays the current state. diff --git a/frontend-turn-continuity-handoff.md b/frontend-turn-continuity-handoff.md deleted file mode 100644 index e0be4a3..0000000 --- a/frontend-turn-continuity-handoff.md +++ /dev/null @@ -1,83 +0,0 @@ -# FE handoff — turn continuity + multi-client live view - -Courier to `../frontend` (cross-repo; `lsp references` does not span repos — -ORCHESTRATOR §7). Backend is implemented + live-verified against flash. This unblocks -the "turn keeps running when the browser is backgrounded/reloaded" + "watch the same -chat from a second device" behavior. - -## What changed in the backend (principle now enforced) - -A turn is **no longer bound to the WebSocket connection**. It runs to completion on the -server regardless of any client, and **any number of connections can watch the same -conversation's live events** — including a client that connects mid-turn (late-join -replay). The old behavior (socket close → `AbortController.abort()` → turn killed) is -gone. 
- -## New WS protocol (additive — `@dispatch/transport-contract` `0.6.0 → 0.7.0`) - -Two new client→server messages on the existing socket: - -```ts -{ type: "chat.subscribe"; conversationId: string } // start watching a conversation's turns -{ type: "chat.unsubscribe"; conversationId: string } // stop watching (does NOT stop the turn) -``` - -Server→client is UNCHANGED: turn events still arrive as -`{ type: "chat.delta", event: AgentEvent }` (and `{ type: "chat.error", ... }`). Both -replayed and live events use `chat.delta`. - -Semantics: -- **`chat.subscribe`** registers this connection to receive the conversation's turn - events. If a turn is in-flight, the server immediately **replays that turn's events so - far** (from its `turn-start`) as `chat.delta`, then streams live ones. If idle, nothing - is replayed (rely on the history read). -- **`chat.send`** still starts a turn AND **auto-subscribes the sending connection** — so - the sender needs no separate `chat.subscribe`. (If a turn is already generating for that - conversation, the server replies `chat.error` "a turn is already generating…" and you - stay subscribed to watch the running one.) -- **`chat.unsubscribe`** / socket close → the server drops this connection's subscription - but **never stops the turn**. -- Subscriptions **persist across turns** on the backend: subscribe once and you receive - every subsequent turn on that conversation until you unsubscribe/close. - -## What the FE must change (from the FE investigation) - -1. **On WS (re)connect — re-subscribe chat, not just surfaces.** Today `onReopen` - (`src/app/store.svelte.ts`) only re-sends *surface* subscriptions. It must ALSO, for - every open conversation, send `chat.subscribe { conversationId }`. This is what makes a - backgrounded/reconnected client re-attach to a still-running turn and resume live - streaming. (Pair it with a `syncTail()` so any turn that sealed while you were gone is - committed from history.) -2. 
**On page load — subscribe each restored tab's conversation** (in addition to the - existing IndexedDB + `GET /conversations/:id?sinceSeq=` rehydrate). After a reload - mid-turn you'll get the in-flight turn replayed and can keep rendering it live. -3. **Render a real "running" state.** Derive it from the stream: a `turn-start` (or any - delta) with no matching `done`/`turn-sealed` yet = generating. Today the Composer status - is hard-wired idle and the `status` AgentEvent is a no-op reducer — wire it up so a - watching device shows "generating…". -4. **Don't lose a missed `turn-sealed`.** If you reconnect after the turn sealed while you - were away, you won't get a live `turn-sealed`; `syncTail()` on (re)connect (point 1) - commits the finished turn from history. If you reconnect WHILE it's still running, the - replay + live tail carry you to the real `turn-sealed`. -5. **Multi-device handoff (the goal):** opening the same conversation on device B is just - `chat.subscribe { conversationId }`. B will see the in-flight turn (replayed) and watch - it finish — even if device A (the sender) closed. No special handling beyond points 1–3. - -## Out of scope (backend will NOT do these yet) - -- **Per-step persistence / crash-resume:** if the backend PROCESS crashes mid-turn, the - in-flight turn is still lost (the in-flight buffer is in-memory; only sealed turns are - persisted). Reconnecting to a *running* turn works; surviving a *backend crash* mid-turn - does not. Separate durability milestone (R1). -- **Concurrent-send arbitration:** sending from two devices at once is not handled (by - product decision — won't happen). A second `chat.send` while generating gets a - `chat.error`. -- **Explicit "stop generating":** there is no stop op (disconnect no longer stops a turn). - A future `chat.stop` would be deliberate. 
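Two of the FE changes listed earlier in this handoff (re-subscribing on reconnect, and deriving a "running" state from the stream) can be sketched as pure functions. This is a minimal sketch under assumptions: `resubscribeMessages` and `isGenerating` are hypothetical names, and `isGenerating` only treats `turn-start` as the start marker and `done` / `turn-sealed` as end markers, which is a simplification of the rule described in this handoff.

```typescript
// Client-to-server message shape from this handoff.
interface ChatSubscribeMessage {
  type: "chat.subscribe";
  conversationId: string;
}

// Hypothetical helper: one chat.subscribe per open conversation,
// to be sent after the socket (re)opens.
function resubscribeMessages(openConversationIds: readonly string[]): ChatSubscribeMessage[] {
  return openConversationIds.map((conversationId) => ({
    type: "chat.subscribe" as const,
    conversationId,
  }));
}

// Hypothetical helper: true while the most recent turn-start has no matching end event.
// Assumption: events are given in arrival order, replayed events first.
function isGenerating(events: readonly { type: string }[]): boolean {
  let generating = false;
  for (const event of events) {
    if (event.type === "turn-start") generating = true;
    else if (event.type === "done" || event.type === "turn-sealed") generating = false;
  }
  return generating;
}
```

A reconnect handler would serialise each message from `resubscribeMessages` onto the socket, and the composer status could be driven by `isGenerating` over the events received for the active conversation.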
- -## Quick manual check (mirrors the backend live test) - -Open two WS connections, `chat.subscribe` the same `conversationId` on both, `chat.send` -on one → both receive identical `chat.delta` streams. Close the sender mid-turn → the other -keeps receiving through `done`. Connect a third mid-turn + `chat.subscribe` → it receives -`turn-start` replayed then the rest. diff --git a/frontend-workspace-open-handoff.md b/frontend-workspace-open-handoff.md deleted file mode 100644 index 8005a2f..0000000 --- a/frontend-workspace-open-handoff.md +++ /dev/null @@ -1,47 +0,0 @@ -# Frontend handoff — workspace id on conversation.open / statusChanged - -## What changed - -The backend now resolves the conversation's actual persisted workspace id and -includes it on the WS broadcast for both `conversation.open` and -`conversation.statusChanged`. - -- `@dispatch/transport-contract` `0.18.0 → 0.19.0` — additive `workspaceId: string` - on both `ConversationOpenMessage` and `ConversationStatusChangedMessage`. -- The backend uses the conversation's stored workspace (`"default"` fallback), - not the per-turn start option. - -## What the frontend must do - -1. **Re-pin the `file:` dep** on `@dispatch/transport-contract` from the backend - repo once this commit lands. -2. **Re-mirror `.dispatch/transport-contract.reference.md`** to match the `0.19.0` - contract. -3. **Parser update** (`src/adapters/ws/logic.ts:116-123`): parse `workspaceId` - from the incoming `conversation.open` and `conversation.statusChanged` messages. -4. **`openConversation()`** (`src/app/store.svelte.ts:588-600`): use the message's - `workspaceId` to stamp/focus the tab instead of `activeWorkspaceId` (the - viewer's current workspace). This fixes the bug where a tab opened via the - CLI `--open --workspace my-ws` was appearing in every workspace. -5. 
**`onConversationStatusChanged()`** (`src/app/store.svelte.ts:703-718`): same - fix when the FE calls `openConversation(conversationId)` on a status change and - has no existing tab — use the `workspaceId` from the message. -6. **Tests** (`logic.test.ts`, `index.test.ts`, `App.test.ts`, `conformance.test.ts`): - update fixtures/assertions to carry `workspaceId`. - -## Backend status - -- `@dispatch/transport-contract`: `0.19.0` with additive `workspaceId`. -- `session-orchestrator`: payload types widened; status-change emits resolve - workspace id from store. -- `transport-ws`: broadcasts include `workspaceId`. -- `transport-http`: `POST /conversations/:id/open` resolves workspace id and emits - it. -- Verified: `tsc -b` EXIT 0, biome clean, **1405 vitest** green. - -## Note on assumptions - -I'm working autonomously while the user is away. Every assumption I make is -recorded in `notes/assumptions-log.md` in this repo. Please record any -assumptions you make while implementing this handoff in your own assumptions log -so we can raise them when the user returns. diff --git a/frontend-workspaces-handoff.md b/frontend-workspaces-handoff.md deleted file mode 100644 index 1fed1bf..0000000 --- a/frontend-workspaces-handoff.md +++ /dev/null @@ -1,216 +0,0 @@ -# Backend handoff — Workspaces (backend → FE) — courier doc - -> **From:** arch-rewrite orchestrator · **To:** frontend orchestrator · **Courier:** the user. -> Response to `backend-handoff-workspaces.md`. This doc finalizes the contract shapes -> the backend will implement. The FE should re-pin `@dispatch/wire` and -> `@dispatch/transport-contract` `file:` deps and re-mirror any `.dispatch/*.reference.md`. 
- -## Version bumps - -| Package | From | To | Notes | -|---|---|---|---| -| `@dispatch/wire` | `0.11.0` | `0.12.0` | Additive: `Workspace`, `WorkspaceEntry`, `ConversationMeta.workspaceId` | -| `@dispatch/transport-contract` | `0.15.0` | `0.16.0` | Additive: workspace endpoints + `workspaceId` on chat/queue ops | -| `@dispatch/ui-contract` | `0.2.0` | `0.2.0` | **Unchanged** | - ---- - -## 1. Final types — `@dispatch/wire@0.12.0` - -```ts -/** - * A named, URL-driven grouping of conversations that owns a default cwd. - * Every conversation belongs to exactly one workspace; conversations that - * haven't set their own per-conversation cwd inherit `defaultCwd`. - */ -export interface Workspace { - /** The URL slug (immutable). Lowercase `[a-z0-9-]`, 1–40 chars. */ - readonly id: string; - /** Display title (editable). Defaults to `id` on creation. */ - readonly title: string; - /** The workspace's default cwd, or `null` (fall through to server default). */ - readonly defaultCwd: string | null; - /** Epoch-ms when the workspace was first created. */ - readonly createdAt: number; - /** Epoch-ms of the most recent conversation activity in this workspace. */ - readonly lastActivityAt: number; -} - -/** - * A workspace entry in the list response — a `Workspace` plus a conversation count. - */ -export interface WorkspaceEntry extends Workspace { - /** Number of conversations assigned to this workspace. */ - readonly conversationCount: number; -} -``` - -`ConversationMeta` gains a required `workspaceId`: - -```ts -export interface ConversationMeta { - readonly id: string; - readonly createdAt: number; - readonly lastActivityAt: number; - readonly title: string; - readonly status: ConversationStatus; - /** Always present; "default" for legacy/unspecified conversations. */ - readonly workspaceId: string; - readonly compactedFrom?: string; -} -``` - ---- - -## 2.
Final types — `@dispatch/transport-contract@0.16.0` - -### Additive fields on existing request types - -```ts -export interface ChatRequest { - readonly conversationId?: string; - readonly message: string; - readonly model?: string; - readonly cwd?: string; - readonly reasoningEffort?: ReasoningEffort; - /** Workspace to assign the conversation to. Default "default". Auto-creates if missing. */ - readonly workspaceId?: string; -} - -export interface QueueRequest { - readonly text: string; - /** Default "default". Auto-creates if missing. */ - readonly workspaceId?: string; -} - -export interface ChatQueueMessage { - readonly type: "chat.queue"; - readonly conversationId: string; - readonly text: string; - /** Default "default". Auto-creates if missing. */ - readonly workspaceId?: string; -} -``` - -### Workspace endpoint types - -```ts -/** Body of `PUT /workspaces/:id` (all fields optional — the ensure/create call). */ -export interface EnsureWorkspaceRequest { - /** Display title. Default: the workspace id. Only used on create; ignored if workspace exists. */ - readonly title?: string; - /** Default cwd. Default: null (inherit server default). Only used on create. */ - readonly defaultCwd?: string | null; -} - -/** Response of GET/PUT /workspaces/:id — the workspace itself. */ -export interface WorkspaceResponse extends Workspace {} - -/** Response of `GET /workspaces` — all workspaces sorted by lastActivityAt desc. */ -export interface WorkspaceListResponse { - readonly workspaces: readonly WorkspaceEntry[]; -} - -/** Body of `PUT /workspaces/:id/title`. */ -export interface SetWorkspaceTitleRequest { - readonly title: string; -} - -/** Body of `PUT /workspaces/:id/default-cwd`. null/absent = clear to server default. */ -export interface SetWorkspaceDefaultCwdRequest { - readonly defaultCwd: string | null; -} - -/** Response of `DELETE /workspaces/:id`.
*/ -export interface DeleteWorkspaceResponse { - readonly workspaceId: string; - /** Conversations that were closed (status → "closed") by this delete. */ - readonly closedCount: number; -} -``` - ---- - -## 3. Final endpoint list - -| Method & Path | Body | Returns | Notes | -|---|---|---|---| -| `GET /workspaces` | — | `WorkspaceListResponse` | Sorted by `lastActivityAt` desc. Includes `conversationCount`. | -| `PUT /workspaces/:id` | `EnsureWorkspaceRequest?` | `WorkspaceResponse` | **Create-on-miss** (idempotent). Creates with `title=id`, `defaultCwd=null` if missing. Returns existing as-is if present. Slug validated. | -| `GET /workspaces/:id` | — | `WorkspaceResponse` | Pure read. 404 if missing. | -| `PUT /workspaces/:id/title` | `SetWorkspaceTitleRequest` | `WorkspaceResponse` | Rename (display only; id unchanged). | -| `PUT /workspaces/:id/default-cwd` | `SetWorkspaceDefaultCwdRequest` | `WorkspaceResponse` | Set/clear workspace default cwd. | -| `DELETE /workspaces/:id` | — | `DeleteWorkspaceResponse` | **Closes all conversations** (status → "closed"), reassigns them to "default", then deletes the workspace. 409 for `"default"`. | -| `GET /conversations` | `?workspaceId=`, `?status=`, `?q=` | `ConversationListResponse` | Additive `?workspaceId=` filter, composable with existing filters. | -| `DELETE /conversations/:id/cwd` | — | `CwdResponse` | Clears explicit conversation cwd (returns `cwd: null`). | - -### Existing endpoints (semantic note, no type change) - -- `GET /conversations/:id/cwd` — unchanged: returns the **explicit** conversation cwd (`null` = inheriting workspace default). -- `GET /conversations/:id/lsp` — now roots LSP at the **effective** cwd; `LspStatusResponse.cwd` returns the effective cwd. - ---- - -## 4. 
cwd resolution (backend-owned) - -``` -effectiveCwd = conversationStore.getCwd(conversationId) // explicit per-conversation -if (effectiveCwd == null) { - workspaceId = conversationStore.getWorkspaceId(conversationId) // "default" fallback - workspace = conversationStore.getWorkspace(workspaceId) - effectiveCwd = workspace?.defaultCwd ?? null -} -if (effectiveCwd == null) effectiveCwd = serverDefaultCwd // process.cwd() today -``` - -- `GET /conversations/:id/cwd` → explicit cwd only (`null` = inherit). -- `GET /conversations/:id/lsp` → effective cwd. -- Turn start (`runTurn` / `warm`) → effective cwd. - ---- - -## 5. `DELETE /workspaces/:id` semantics - -1. Close all conversations in that workspace (set `status = "closed"`). -2. Reassign their `workspaceId` to `"default"` (so no dangling reference). -3. Delete the workspace entity. -4. Return `{ workspaceId, closedCount }`. -5. `DELETE /workspaces/default` → HTTP 409. - -Closed conversations are hidden from tab-restore (`?status=active,idle` excludes `closed`). - ---- - -## 6. Workspace lifecycle / auto-creation - -- **Auto-create on turn start:** if `workspaceId` is provided and doesn't exist, the backend auto-creates it (`title = id`, `defaultCwd = null`). -- **`PUT /workspaces/:id` create-on-miss:** if absent, creates with optional `title`/`defaultCwd` from the body (defaults: `title = id`, `defaultCwd = null`). If present, returns existing as-is. -- **Slug validation:** `^[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?$` (1–40 chars, lowercase, digits, internal hyphens only). Reject invalid with 400. No normalization. `"default"` allowed but non-deletable. -- **`"default"` workspace:** always synthesized if not persisted; guaranteed in `GET /workspaces` list. -- **`lastActivityAt`:** updates when a conversation in the workspace appends, or on first creation. Does NOT update on title/default-cwd changes. -- **Compaction:** post-compaction conversations inherit the original's `workspaceId`. - ---- - -## 7. 
Answers to FE open questions (Q1–Q8) - -| # | Decision | -|---|---| -| Q1 | **Close all conversations** in the workspace (status → "closed"), reassign to "default", then delete the workspace. Return `closedCount`. | -| Q2 | **Add `DELETE /conversations/:id/cwd`** to clear explicit cwd (fall back to workspace default). `PUT` validation unchanged (empty string still 400). | -| Q3 | **Deferred to v1** — no WS lifecycle push. Fetch-on-mount + manual refresh sufficient. Can add `workspace.created/updated/deleted` later, additively. | -| Q4 | **`PUT /workspaces/:id`** is the create-on-miss entry point (idempotent, 200). `GET /workspaces/:id` is a pure read (404 if missing). | -| Q5 | Slug regex `^[a-z0-9](?:[a-z0-9-]{0,38}[a-z0-9])?$`. Reject, don't normalize. `"default"` non-deletable. | -| Q6 | `Workspace` in `@dispatch/wire`. Request/response bodies in `@dispatch/transport-contract`. | -| Q7 | Confirmed — backend does nothing beyond `workspaceId` on `ConversationMeta` + `?workspaceId=` filter. | -| Q8 | Yes — post-compaction conversations inherit `workspaceId`. `forkHistory` copies it. | - ---- - -## 8. Gaps resolved (from FE handoff §3) - -1. **Unknown workspaceId on turn start** → auto-create (title = id, defaultCwd = null). Typos can be deleted. -2. **PUT /workspaces/:id initial state** → body accepts optional `title`/`defaultCwd` with defaults (`title = id`, `defaultCwd = null`). Only applied on create; existing workspace returned as-is. -3. **lastActivityAt on title/default-cwd changes** → no. -4. **LSP cwd field** → returns effective cwd. -5. **Conversation count in list** → yes, included as `WorkspaceEntry.conversationCount`. diff --git a/tasks.md b/tasks.md deleted file mode 100644 index 137101a..0000000 --- a/tasks.md +++ /dev/null @@ -1,1050 +0,0 @@ -# Dispatch — tasks (live progress) - -> **Live status + roadmap only.** Completed milestones are summarized, not -> narrated. Old blow-by-blow history is pruned — it lives in git (`git log`). 
-> Keep this lean and current; do not let it re-accrete a step-by-step changelog. - -## Status (current) -`tsc -b` EXIT 0 · biome clean · **1730 vitest** pass (+6 sshd-integration skipped). (worktree `feature/ssh-support`; -merged `dev` — brings retry-with-backoff (`provider-retry` AgentEvent) + the LSP-dead-server fix alongside the -SSH waves below.) - -## Retry with backoff on retryable provider errors (DONE — from dev) -When the upstream LLM API returns a retryable error (HTTP 429 / 5xx "overloaded"), -the kernel now retries `provider.stream()` with a stepped backoff, visibly, until -the 8h cumulative-sleep budget is exhausted — then emits the final error and -seals the turn. Retries fire ONLY when no content was emitted yet this step (the -safety invariant — never duplicate partial output). Plan: -`notes/retry-with-backoff-plan.md`; report: `reports/retry-with-backoff.md`. -- **Architecture (kernel hook + shell policy/I/O):** kernel provides the hook - (`RetryStrategy` contract + the retry loop in `runTurn`); the shell - (session-orchestrator) provides the policy (the schedule) + the I/O (an - abortable `setTimeout` sleep). Kernel imports no timer. `retry?` is optional - → omit = no retry (backward-compatible). -- **New transient `AgentEvent` variant** `provider-retry` (`@dispatch/wire`), - emitted once per scheduled retry BEFORE the sleep so the UI can show - "⚠ retrying in Ns…" immediately; NOT persisted to model history (never - pollutes the prompt). Final failure is still a persisted `error` + seal. -- **Schedule:** `5s,10s,30s,60s,5m,10m,15m,30m`, then repeat 30m until 8h of - cumulative scheduled sleep → ~21 retries then give up. Pure `delayFor(attempt)`. -- **Retry trigger:** emitted `error` with `retryable===true` → retry; - `retryable` false/absent → give up; a THROWN error → retryable-by-default - ONLY when pre-content. All gated on `!hadContent` (text/reasoning/tool-call/usage). 
-- **Frontend handoff (5d3f, separate repo `../frontend`):** render - `provider-retry` as a yellow warning system-message bubble showing `message` - (+`code`) with the `delayMs` countdown. - -## SSH support — transparent remote execution (DONE — waves 0-5c) -Plan: `notes/ssh-support-plan.md` (decisions locked in §0.5/§13). Orchestrated in -waves (ORCHESTRATOR.md §2a — pre-author the contract seam, then parallel -owner-agents on disjoint packages). -- [x] **Wave 0** (orchestrator): kernel contract seam — `computerId` on - `ToolExecuteContext` + `RunTurnInput` (additive optional; backward - compatible). `tsc -b` EXIT 0. -- [x] **Wave 1** (parallel): `wire` (Computer/defaultComputerId types) + - `exec-backend` (NEW pkg: ExecBackend contract + LocalExecBackend + handle + - resolver) + `kernel` runtime (thread computerId through dispatch/run-turn) + - `conversation-store` (contract fan-out: defaultComputerId + getEffectiveComputer - + per-conv computerId get/set/clear). `tsc -b` EXIT 0, biome clean, **1592 vitest** - (was 1549, +43). -- [x] **Wave 2** (parallel): refactor `tool-shell`/`read-file`/`write-file`/ - `edit-file` behind `ExecBackend` (local-only; spawn.ts deleted — logic moved - to exec-backend; edit_file gains forward-compatible remote-diagnostics skip). - `tsc -b` EXIT 0, biome clean, **1599 vitest** (was 1592). -- [x] **Wave 3** (parallel): `session-orchestrator` (thread computerId end-to-end - + remote tool-drop filter: drops `lsp` + `__`-namespaced MCP tools when - remote) + `transport-contract` (ChatRequest.computerId + computer endpoint - API types). `tsc -b` EXIT 0, biome clean, **1620 vitest** (was 1599). -- [x] **Wave 4** (parallel): `transport-http` (computer endpoints + `/chat` - threading + the `ComputerService` seam the ssh package will provide) + - `transport-ws` (computerId through chat.send/queue) + `mcp` (CR-1: preserve - computerId in filter). `tsc -b` EXIT 0, biome clean, **1641 vitest** (was 1620). 
-- [x] **Wave 5a**: `exec-backend` — remote-backend factory handle (lazy lookup; - computerId set -> SshExecBackend via factory; absent -> clear error). +24 tests. -- [x] **Wave 5b**: `ssh` package (NEW) — SshConnectionPool (per-alias ssh2.Client, - lazy connect, keep-alive, idle reap), SshExecBackend (ssh2 exec+sftp, node:fs - .code error mapping), ~/.ssh/config reader (ssh-config), known_hosts - auto-trust-and-pin, key-only auth from ~/.ssh. LOAD-BEARING: ssh2 verified - under Bun (connected to local sshd :22, exec OK) — decision #1 confirmed. - Provides remoteExecBackendFactoryHandle + computerServiceHandle. +45 tests - (6 sshd integration tests skipped). tsc -b EXIT 0, biome clean, **1690 vitest** - (was 1641). -- [x] **Wave 5c**: host-bin — register exec-backend + ssh extensions in - CORE_EXTENSIONS (correct DAG order); transport-http CR-5 barrel re-export of - computerServiceHandle. orchestrator added missing @dispatch/exec-backend dep to - host-bin + bun install. **LIVE-VERIFIED**: server boots clean ("Dispatch booted", - no disabled extensions). tsc -b EXIT 0, biome clean, 1690 vitest (+6 sshd skipped). -- [x] **Merge dev**: brought retry-with-backoff (`provider-retry` AgentEvent — what - the FE consumes) + LSP-dead-server fix into the SSH branch. All code files - auto-merged cleanly; only `tasks.md` conflicted (orchestrator-resolved). -- [x] **FE handoff #3 (provider-retry merge) — RESOLVED**: FE re-synced both pinned - file: deps (`@dispatch/wire` + `@dispatch/transport-contract`) against merged - `feature/ssh-support`; both resolve `TurnProviderRetryEvent`. The 11 provider- - retry svelte-check errors cleared with ZERO further FE code changes (consumer - already complete + tested). FE full suite green: typecheck 0/0, 795/795 tests, - biome clean, vite build OK. Earlier SSH handoffs (#1 wire types, #2 computer - HTTP API) now also typecheck-clean against the merged wire. Nothing further - needed from backend on this. 
-- [x] **FE final sync check — GREEN, all three handoffs + cross-cutting verified**: - FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build - OK, git clean). (1) provider-retry (§2c): TurnProviderRetryEvent resolves; - assertAgentEventExhaustive covers it (typecheck-green = exhaustive); ChatView - renders yellow alert-warning bubble w/ attemptLabel + delayLabel (delayMs via - viewProviderRetry/formatRetryDelay) + code badge, gated {#if providerRetry}. - (2) SSH handoff #1: Workspace.defaultComputerId + Computer/ComputerEntry resolve; - 2 Workspace literals supply defaultComputerId: null; catalog flows through - store.computers. (3) SSH handoff #2: full src/features/computer/ (ComputerField - w/ per-conv selector + connection-status badge + Test-connection polling; - ComputerSelect reusable; store computerId/refreshComputer/setComputer + computers - catalog on boot + computerStatus/testComputer; WorkspaceCard default-computer - selector via setDefaultComputer) — 20 view-model tests, typecheck-clean, chat.send - unchanged. CROSS-CUTTING (key integration question): GREEN, no collision — - provider-retry is WS-stream → TranscriptState.providerRetry → ChatView (transcript, - keyed activeConversationId); computer is HTTP-ONLY (imports NO AgentEvent/chunks/ - TranscriptState) → AppStore.computerId (per-conv persisted) → ComputerField (sidebar, - keyed currentConversationId). Disjoint state, disjoint channels (WS vs HTTP), - disjoint regions (transcript vs sidebar), disjoint mount keys. The conversation- - switch lifecycle is the only shared touchpoint and is correct + independent. - assertAgentEventExhaustive confirms computer is NOT an AgentEvent (HTTP-only). - We're done — nothing further needed from either side. 
-- [ ] **DEFERRED — CR-6 usageCount**: `listComputers()` returns `usageCount: 0` until a - conversation-store count-by-alias helper + host-bin wiring is added (non-blocking — - discovery/connect/execute all work; only the count badge shows 0). Follow-up. -- [ ] **DEFERRED — cache-warming**: computerId threading intentionally NOT done - (user-deferred — cache-warming is not needed right now). Known limitation: - a warm probe on a remote turn assembles the tool set WITHOUT the remote-drop - → a potential prompt-cache miss (performance-only, not correctness). Revisit - when cache-warming is re-enabled. -Key decisions: ssh2 + ssh-config (project-local deps); key-only auth from -`~/.ssh`; auto-trust-and-pin host keys; computers discovered read-only from -`~/.ssh/config` (no CRUD entity); computerId persisted per-conversation; LSP/MCP -silently dropped on remote turns; edit_file works w/o diagnostics remotely. - -## Per-edit LSP diagnostics auto-append (DONE) -After a successful `edit_file`, the extension now calls LSP `getDiagnostics` on the -post-edit buffer and appends any errors/warnings (severity ≤ 2) to the tool result — -so the model sees lint/diagnostics feedback inline without a separate round-trip. -Multi-server aggregation queries ALL connected servers matching the file's extension -(not just the first), merging diagnostics tagged by source (`[steep]`, `[ruby-lsp]`, etc.). -Incremental sync (`textDocument/didChange`) captures each server's `change` kind during -`initialize` and computes prefix/suffix diff ranges for `change:2` servers, full content -for `change:1`. New pure `diff.ts` (`computeChangeRange` + `offsetToPosition`, O(n)). -60s timeout; slow warning if >10s; graceful degradation when no LSP available. Generic -— works for any LSP. `languageId` mapping extended (`.rb`/`.rbs`/`.c`/`.cpp`/etc.). -- [x] Wave 1 — `packages/lsp/` (single unit): diff.ts, client, tool, diagnostics, language, types, extension. 15 new diff tests + multi-server tool test. 
-- [x] Wave 2 — `packages/tool-edit-file/`: optional dep on `@dispatch/lsp` via `host.getService()` (not manifest `dependsOn`); appends diagnostics after successful edit. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1468 vitest** pass (was 1453, +15). -- [x] **LIVE-VERIFIED** (production dispatch-server :24991): edit_file now surfaces LSP diagnostics inline — a deliberate type error (`const x: number = "not a number"`) in a .ts file produces `[TypeScript Language Server] ERROR (2322) L3:9: Type 'string' is not assignable to type 'number'` appended to the edit result. Required a lazy LSP service lookup fix (commits e03a96e + d4ff45c) — tool-edit-file activates at position 5 in CORE_EXTENSIONS while lsp activates at position 20, so getService always threw at activation time. - -## MCP (Model Context Protocol) integration (DONE) -Dispatch is now an MCP host. A new `mcp` standard extension (`packages/mcp/`) spawns -configured MCP servers (stdio child processes), performs the MCP handshake, discovers -tools via `tools/list`, and registers each as a first-class Dispatch `ToolContract` via -`host.defineTool`. When the model calls an MCP tool, the extension proxies to `tools/call` -on the MCP server and returns the flattened result. Config: `.dispatch/mcp.json` (servers -key) → `opencode.json` mcp key fallback, resolved per-cwd (mirrors LSP). Tool names namespaced -as `<serverId>__<toolName>`. A `toolsFilter` drops tools from disconnected servers. Phase 1: -stdio only, Tools only (no Resources/Prompts/HTTP/sampling). Hand-rolled JSON-RPC (zero deps). -- **Design:** `notes/mcp-design.md` + `PLAN-mcp.md`. -- [x] Wave 1 — `packages/mcp/` (agent via dispatch CLI): 12 source + 8 test files, 69 tests. -- [x] Wave 2 — orchestrator: root tsconfig ref, host-bin CORE_EXTENSIONS registration, bun install. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1537 vitest** pass (was 1468, +69). 
-- [x] **LIVE-VERIFIED** (production dispatch-server :24991): a minimal test MCP server (stdio, - one `ping` tool) configured in `.dispatch/mcp.json` → model discovered `test__ping`, - called it with `{"msg":"hello"}`, received `pong` — full turn lifecycle (tool-call → - tool-result → done). Tool name namespacing (`<serverId>__<toolName>`) confirmed on the wire. -- **Bug found + fixed during live-verify:** `edit_file` tool was missing from the toolset - because the per-edit diagnostics change called `host.getService(lspServiceHandle)` at - activation time, but `tool-edit-file` activates BEFORE `lsp` in CORE_EXTENSIONS → getService - threw → activate crashed → tool never registered. Fix: lazy lookup at edit time (commits - e03a96e, d4ff45c). - -## Broken-chat self-repair (read-time reconcile) (DONE) -Conversation `77574596` broke unrecoverably: `reconcile()` only repaired orphaned -tool-calls, not (a) a trailing assistant message whose only chunk is `error` -(serializes to empty content → uncontinuable) and (b) a `tool-call` whose `input` -is a raw malformed-JSON string (re-sent as OpenAI `arguments` → provider 400s on -every continuation). `load()` also had no try/catch on `JSON.parse` (one corrupt -row would brick a chat). Fix = read-time repair so broken chats auto-heal on next -open — NO DB surgery (append-only preserved; repair is a turn-path transform on -`load()`). Full diagnosis + plan: `broken-chat-repair-handoff.md` + -`reports/broken-chat-repair-diagnosis.md`. -- **Layer 1 — `conversation-store` `reconcile.ts` (protects ALL providers):** - `reconcileWithReport` now (1) strips `error` chunks from assistant messages, (2) - drops any assistant message left with no `text`/`tool-call` (the emptied error-only - msg — safe: never followed by a `tool` msg), (3) keeps orphaned-tool-call synthesis - unchanged. `ReconcileReport` +2 additive counts (`strippedErrorChunks`, - `droppedEmptyMessages`) for the repair span. 
`loadSince` (FE reads) intentionally - NOT reconciled — the user still SEES the error while the provider gets clean history. - **Hardening:** `store.ts` `load()` wraps per-chunk `JSON.parse` in try/catch → - corrupt row skipped (log + continue), reconcile runs on the rest. +6 reconcile/store - tests. -- **Layer 2 — `openai-stream` `convert-messages.ts` (per-provider args safety):** new - pure `serializeToolArguments` — object→stringify; valid-string→parse+restringify; - malformed-string→fallback `{ _malformed_arguments: <truncated 200> }`. Output ALWAYS - `JSON.parse`s → provider stops 400ing on stored malformed args. +4 tests. -- **Layer 2 (equiv) — `../claude` `provider-anthropic` `convert.ts`:** `safeJson` now - returns a valid object fallback (`{ _malformed_arguments: s.slice(0,200) }`) on - parse failure, not the raw string (`tool_use.input` must be an object for Anthropic). - Exported for direct testing. +3 tests. (Separate repo, separate agent.) -- **Wave 1+2 (parallel, disjoint):** conversation-store + openai-stream (arch-rewrite) - + provider-anthropic (`../claude`). All in-lane; zero internal mocks; no contract/type - change. Reports: `reports/conversation-store.md`, `reports/openai-stream.md`, - `../claude/reports/provider-anthropic.md`. -- [x] Verified: arch-rewrite `tsc -b` EXIT 0, biome clean, **1453 vitest** (was 1443); - `../claude` `tsc -b` EXIT 0, 71 vitest, biome clean. Both pure-core units zero - internal mocks. -- [x] **LIVE-VERIFIED** (dev stack `bin/up` :24203): reproduced 77574596's REAL broken - tail (the actual malformed-args tool-call + trailing error chunk) in the dev DB; - `POST /chat` continued it cleanly (`text-delta:"OK"` → `done` reason `"stop"`, no - 400) — the provider accepted the reconciled history (error stripped, args sanitized). - The historical error chunk remains in storage by design (read-time repair only); no - new error was appended. Cleaned up the test conversation after. 
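The Layer 2 argument-safety rule described above (object → stringify; valid string → parse + restringify; malformed string → truncated fallback object) could look roughly like the sketch below. This is an illustration only, not the code in `convert-messages.ts`: the function body, the `MalformedFallback` type name, and the handling of `null`/`undefined` input are assumptions; only the function name, the `_malformed_arguments` key, and the 200-character truncation come from the notes.

```ts
/** Hypothetical fallback shape; the key name follows the note above. */
type MalformedFallback = { _malformed_arguments: string };

/**
 * Sketch: always return a string that JSON.parse accepts, so a stored
 * malformed tool-call input cannot be re-sent as invalid `arguments`.
 * Assumes non-string inputs are JSON-serialisable values.
 */
export function serializeToolArguments(input: unknown): string {
  if (typeof input === "string") {
    try {
      // Valid JSON string: parse and re-stringify to normalise it.
      return JSON.stringify(JSON.parse(input));
    } catch {
      // Malformed string: wrap a truncated copy in a valid object.
      const fallback: MalformedFallback = {
        _malformed_arguments: input.slice(0, 200),
      };
      return JSON.stringify(fallback);
    }
  }
  // Objects are stringified directly; null/undefined become "{}" (assumption).
  return JSON.stringify(input ?? {});
}
```

Because every branch ends in `JSON.stringify`, the output round-trips through `JSON.parse` whichever branch ran, which is the property the notes rely on to stop the provider 400s.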
- -## LSP — broken-server recovery + config source attribution (DONE) -Handoff from an agent running in raylib-jamstack (configuring ruby-lsp under the -installed Dispatch harness `/usr/bin/dispatch-server`): two issues found by -decompiling the running binary. (Previous orchestrator session 77574596 did the -investigation + Wave 0 + wrote the prompt; its chat broke mid-summon — resumed.) -- **Issue 2 (blocker):** a failed LSP server was `broken` FOREVER — the manager's - `broken` set (keyed `${serverId}:${root}`) was cleared ONLY in `shutdownAll()`, so a - server that failed (bad env, missing binary, OR a since-fixed bad config) stayed - `state:"error"` for the whole process. For an agent running *inside* dispatch the - only recovery (server restart) kills its own session. -- **Issue 1:** `.dispatch/lsp.json` (read first) silently shadowed `opencode.json`'s - `lsp` key — a broken entry won with no warning, and the caller couldn't tell which - config source a server came from (`status()` was its only visibility). -- **Wave 0 (orchestrator, contracts):** additive `readonly configSource?: string` on - `LspServerInfo` (`@dispatch/transport-contract` `0.20.0→0.21.0`) + a type-test - assertion (8→9). tsc/biome/vitest clean. -- **Wave 1 — `lsp` extension:** (a) broken-server now self-heals when its *resolved - config changes* since it was marked broken (a config edit is a discrete event → no - retry storm; bounded backoff for transient failures); (b) `configSource?` mirrored on - `LspServerStatus` + populated in `status()` (`.dispatch/lsp.json` / `opencode.json` / - `built-in`); (c) shadow warning via `host.logger` when both configs declare lsp; (d) - spawn-failure `error` strings now name the config source. 6 required named tests + - extras. Report: (agent cut off before writing `reports/lsp.md`; work independently - verified — 50 lsp tests, tsc EXIT 0, biome clean). 
-- **Wave 1 CR (transport-http):** the `GET /conversations/:id/lsp` handler mapped - `LspServerStatus`→`LspServerInfo` field-by-field and DROPPED `configSource` (never - reached the wire). Summoned the transport-http owner for the one-line conditional-spread - pass-through (mirrors `error`, honors `exactOptionalPropertyTypes`) + a named pass-through - test (present + undefined-omitted). Report: `reports/transport-http.md`. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1443 vitest** pass; all agents in-lane - (only packages/lsp + transport-contract + transport-http touched; pre-existing - uncommitted WIP in kernel/tool-shell left untouched). Zero internal mocks. -- [x] **LIVE-VERIFIED** (dev stack `bin/up` on :24203, new code via `--watch`): - (A) `configSource` reaches the wire — built-in TS server reports - `configSource:"built-in"`, `state:"connected"` (Wave 0 + transport-http pass-through - confirmed end-to-end); (B) a broken server (`.dispatch/lsp.json` → nonexistent binary) - reports `state:"error"` + `configSource:".dispatch/lsp.json"` + a source-named error - string (`broken-ts [from .dispatch/lsp.json]: Executable not found in $PATH: …`); - (C) **recovery without restart** (the blocker) — same conversation/process went - `error`→`connected` after the config was fixed (config change clears the broken key → - re-spawn → connects); (D) no retry storm — repeated `status()` with no config change - stays `error`; (E) shadow warning logged via `host.logger` (`extensionId:"lsp"`, - level `warn`) when both `.dispatch/lsp.json` and `opencode.json` declare lsp. - -## Per-conversation model persistence (DONE) -Bug: a chat's selected provider + model was NOT persisted per conversation. -Opening the same chat in a new browser session defaulted to the server's -default model rather than recalling the originally selected one. 
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` - `0.19.0→0.20.0` — additive `ModelResponse` + `SetModelRequest` types for - `GET/PUT /conversations/:id/model`. -- **Wave 1 — `conversation-store`:** `getModel`/`setModel` (`model:<id>` key, - mirrors `getReasoningEffort`/`setReasoningEffort`); `forkHistory` copies model; - empty string clears (idempotent). +13 tests. -- **Wave 2 (parallel):** `session-orchestrator` (resolve model from persisted - store when no per-turn override → `resolveModel`; persist the resolved model - so it sticks; warm path parity; `resolveModelName` pure helper; +4 tests) + - `transport-http` (`GET/PUT /conversations/:id/model` with validation + - `parseModelBody` pure validator; +10 tests). -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1433 vitest** pass; all in-lane. - -## System-prompt stale on cwd change (DONE) -Bug: the system-prompt service constructed the resolved prompt once on the first -turn and reused it via `get()` on subsequent turns (cache-safe design). But the -prompt is cwd-sensitive (`[file:AGENTS.md]`, `[prompt:cwd]` variables). When a -conversation's cwd changed after the first turn, the cached prompt was stale — -referenced files from the new cwd were not loaded. -- **Wave 1 — `system-prompt`:** added `getWithMeta(conversationId)` returning - `{ prompt, cwd }` — reads both `resolved:<id>` and a new `resolved-cwd:<id>` - sibling key. `construct()` now also stores the cwd. All additive, no existing - method signature/behavior changed. +5 tests. -- **Wave 2 — `session-orchestrator`:** subsequent turns call `getWithMeta`, - compare stored cwd vs `effectiveCwd ?? process.cwd()`, and `construct` if they - differ (or if no stored prompt exists). Compaction path (always constructs) - and warm path (no system prompt) unaffected. +1 test. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1411 vitest** pass; both in-lane. -- No FE handoff needed (backend-only fix; no contract version bump). 
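The cwd-staleness check from the system-prompt fix above can be sketched as follows. Only `getWithMeta(conversationId)` returning `{ prompt, cwd }` and the "compare stored cwd against the effective cwd, else `construct`" rule come from the notes; the `SystemPromptService` interface, the `construct(conversationId, cwd)` signature, and the helper names are hypothetical, and the server default cwd is passed in rather than read from `process.cwd()` to keep the sketch self-contained.

```ts
interface PromptMeta {
  readonly prompt: string;
  readonly cwd: string;
}

/** Hypothetical minimal view of the system-prompt service. */
interface SystemPromptService {
  getWithMeta(conversationId: string): Promise<PromptMeta | null>;
  // Assumed signature: builds the prompt for `cwd` and stores both.
  construct(conversationId: string, cwd: string): Promise<string>;
}

/** Pure decision: rebuild when nothing is stored or the cwd has moved. */
export function isPromptStale(
  stored: PromptMeta | null,
  effectiveCwd: string | null,
  serverDefaultCwd: string,
): boolean {
  const cwd = effectiveCwd ?? serverDefaultCwd;
  return stored === null || stored.cwd !== cwd;
}

/** Sketch of the subsequent-turn path: reuse the cached prompt unless stale. */
export async function resolveSystemPrompt(
  service: SystemPromptService,
  conversationId: string,
  effectiveCwd: string | null,
  serverDefaultCwd: string,
): Promise<string> {
  const stored = await service.getWithMeta(conversationId);
  if (stored !== null && !isPromptStale(stored, effectiveCwd, serverDefaultCwd)) {
    return stored.prompt;
  }
  return service.construct(conversationId, effectiveCwd ?? serverDefaultCwd);
}
```

Keeping the comparison in a pure helper means the common case (cwd unchanged) still reuses the cached prompt, so the cache-safe design is preserved and only a real cwd change triggers a rebuild.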
- -## Workspace tab issue — conversation.open drops workspaceId (DONE) -Cross-repo additive fix: `conversation.open` / `conversation.statusChanged` WS -broadcasts now carry the conversation's persisted workspace id, so a frontend -opens/focuses a tab in the correct workspace instead of the viewer's current -workspace (`activeWorkspaceId`). CLI `dispatch <model> --open --workspace my-ws` -now opens only in `my-ws`. -- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` - `0.18.0→0.19.0` — additive `readonly workspaceId: string` on - `ConversationOpenMessage` and `ConversationStatusChangedMessage`. -- **Wave 1 (parallel):** `session-orchestrator` (add `workspaceId` to - `ConversationOpenedPayload`/`ConversationStatusChangedPayload`; resolve from - `conversationStore.getWorkspaceId` at all status-change emit sites) + - `transport-ws` (thread `workspaceId` from hook payload into WS broadcasts) — - disjoint packages. -- **Wave 2:** `transport-http` — `POST /conversations/:id/open` now awaits - `getWorkspaceId(conversationId)` and emits `conversationOpened` with it. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1405 vitest** green; all agents in-lane. -- [x] **FE courier** to `29ae`: `frontend-workspace-open-handoff.md` — parse/use - `workspaceId` from `conversation.open` and `conversation.statusChanged`; - re-pin `@dispatch/transport-contract` `0.19.0`; re-mirror reference.md. - -## LSP cwd resolution — server-default fallthrough + workspace assignment (DONE) -Bug: `GET /conversations/:id/lsp` called `getEffectiveCwd` directly, which falls through -to `serverDefaultCwd` (`process.cwd()`) when no conversation cwd is set — the LSP -connected on the wrong dir. Additionally, a new conversation's workspace isn't assigned -until the first `chat.send`, so `getEffectiveCwd` resolved against `"default"` (not the -intended workspace) when the FE set the cwd before the first turn. 
-- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.16.0→0.17.0` — - additive `SetCwdRequest.workspaceId?: string` + updated `LspStatusResponse.cwd` comment - ("resolved working directory the LSP connects on, or null when no cwd is set"). -- **Wave 1 — transport-http:** `GET /conversations/:id/lsp` now gates on `getCwd` - (persisted) first — returns `{ cwd: null, servers: [] }` when no cwd set (LSP does NOT - connect); only calls `getEffectiveCwd` + `lspService.status()` when a persisted cwd - exists. `PUT /conversations/:id/cwd` now accepts optional `workspaceId` — validates - with `isValidWorkspaceSlug`, then `ensureWorkspace` → `setWorkspaceId` → `setCwd` - (assigns workspace before persisting cwd). 5 new tests + 1 assertion updated. - Report: `reports/transport-http.md`. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1332 vitest** pass; agent in-lane. -- [x] **FE courier** sent to FE agent `ffe3`: `frontend-lsp-cwd-workspace-handoff.md` - — send `workspaceId` on `PUT /conversations/:id/cwd`; `GET /conversations/:id/lsp` - now returns `cwd: null` + empty `servers` when no working dir is set. - -## Workspace cwd fallthrough + relative resolution (DONE) -FE courier in: bug report + behavior change (`workspace defaultCwd` not used at turn start when -a conversation has no explicit cwd; plus per-conversation cwd should be **relative to the workspace -`defaultCwd`** unless absolute). Resolution is backend-owned (the FE omits `cwd` on `chat.send`). -- **Scope:** single unit — `conversation-store` owns `getEffectiveCwd` (already consumed unchanged - by `session-orchestrator` turn/warm + `transport-http` `GET /conversations/:id/lsp`), so no - cross-package surface change and no fan-out. `GET /conversations/:id/cwd` uses `getCwd` (raw - explicit cwd) — unchanged. 
-- [x] **conversation-store** — added injectable `serverDefaultCwd` (default `process.cwd()`) to - `createConversationStore`; rewrote `getEffectiveCwd` with the new algorithm: explicit conversation - cwd null → `workspaceCwd ?? serverDefaultCwd` (bug fix: was returning null, skipping the workspace - default); absolute (starts `/`) → overrides; relative → `path.resolve(workspaceCwd ?? - serverDefaultCwd, conversationCwd)`. Public signature `(conversationId) => Promise<string | null>` - unchanged. 8 regression tests. Report: `reports/conversation-store-workspace-cwd.md`. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1289 vitest** pass; agent in-lane; zero internal mocks. - -## Per-turn cwd override not resolved relative to workspace (CURRENT — live-found) -Live investigation (dev stack, tab 4ef4 in workspace `test` with `defaultCwd=/home/tradam/projects/ -dispatch`): `getEffectiveCwd` resolves a persisted relative cwd correctly (LSP endpoint + a chat -**omitting** `cwd` both return `/home/tradam/projects/dispatch/arch-rewrite`). BUT a per-turn `cwd` -sent on `chat.send` is used **as-is** by `session-orchestrator` (`cwd !== undefined ? -Promise.resolve(cwd)`, orchestrator.ts:360), bypassing `getEffectiveCwd`. So raw `arch-rewrite` -reaches `run_shell` → `resolve("arch-rewrite")` = `<process.cwd>/arch-rewrite` (nonexistent) → `pwd` -broken; `./` → `resolve("./")` = `process.cwd()` (valid) → "works". The FE sends the CwdField value -as a per-turn `cwd` (transport-ws threads it: router.ts:173 → extension.ts:277). -- **Fix (2 waves):** add an optional `overrideCwd?: string` to `ConversationStore.getEffectiveCwd` - (resolve the override if provided, else the persisted `getCwd` — same relative algorithm), then - `session-orchestrator` passes the per-turn `cwd` (turn start + warm `opts.cwd`) as the override. -- [x] **Wave 1 — conversation-store:** added `overrideCwd?` param + impl + tests. 
-- [x] **Wave 2 — session-orchestrator:** pass per-turn cwd as override (turn start + warm) + tests. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1298 vitest** pass; both agents in-lane; zero - internal mocks. -- [x] **LIVE-VERIFIED** (dev stack, workspace `test` defaultCwd `/home/tradam/projects/dispatch`): - a per-turn `cwd:"arch-rewrite"` on an existing conversation (assigned to `test`) → `pwd` - returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved, not broken). Both the - omit-cwd path (Wave 0) and the per-turn-cwd path (Wave 2) confirmed working. -- **Known edge case (pre-existing, not a regression):** a brand-NEW conversation's FIRST turn runs - `getEffectiveCwd` *before* the workspace is assigned (orchestrator.ts assigns it later in the - IIFE), so a relative per-turn cwd resolves against the "default" workspace (server default) - instead of the intended one. Uncommon (CwdField typically set after the first message). Deferred. -- **Note (separate pre-existing bug, not touched):** `DELETE /conversations/:id/cwd` returns - `cwd:null` but does NOT clear the persisted cwd (transport-http app.ts:538 — the route is a stub). - -## Cwd edge cases — timing + DELETE stub (DONE) -Two pre-existing bugs surfaced during live-verify of the relative-cwd fix: -- **Edge 1 (timing):** a NEW conversation's first turn ran `getEffectiveCwd` BEFORE the workspace - was assigned, so a relative per-turn cwd resolved against `"default"` (server default) not the - intended workspace. **Fix:** session-orchestrator now assigns the workspace (for new - conversations, detected via `getConversationMeta === null`) BEFORE resolving the effective cwd; - removed the duplicate assignment site. 3 tests. -- **Edge 2 (DELETE stub):** `DELETE /conversations/:id/cwd` returned `{cwd:null}` but did NOT - clear the persisted cwd (no `clearCwd` on the store). 
**Fix:** conversation-store added - `clearCwd(id)` (`storage.delete(cwdKey)`, idempotent) + tests; transport-http DELETE handler now - `await clearCwd` for real. -- [x] **Wave A (parallel):** conversation-store (clearCwd) + session-orchestrator (timing) — disjoint. -- [x] **Wave B:** transport-http (DELETE handler uses clearCwd). -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1311 vitest** pass; all in-lane; zero internal mocks. -- [x] **LIVE-VERIFIED** (dev stack): Edge 2 — PUT→GET(`/tmp/test`)→DELETE→GET(`null`) actually - cleared. Edge 1 — NEW conversation, workspace `test`, per-turn `cwd:"arch-rewrite"` → `pwd` - returns `/home/tradam/projects/dispatch/arch-rewrite` (resolved against workspace default, not - broken). -- [x] **FE courier handoff** written + sent: `frontend-cwd-resolution-handoff.md` couriered to FE - orchestrator conversation `b18a` via `dispatch send b18a --queue` (turn started). Behavior-only - — no `@dispatch/wire`/`transport-contract`/`ui-contract` version bumps; no FE contract change - needed. Notes: `DELETE /conversations/:id/cwd` now actually clears; per-turn `cwd` on `chat.send` - resolved relative to workspace `defaultCwd`; FE MAY omit `cwd` on `chat.send` (backend resolves - persisted). - -Built and verified live (full-fidelity: every feature is a manifest-loaded -extension through the host): -- **kernel** — contracts (ABI), bus, `runTurn` turn loop, extension host. -- **core extensions** — storage-sqlite, auth-apikey, provider-openai-compat - (OpenCode Go), conversation-store, session-orchestrator, transport-http, - credential-store; tool extensions `read_file` (files + directory listing), `run_shell`, - `edit_file`, `write_file`. 
-- **observability** — structured Logger/Span ABI + journal-sink → out-of-process - collector → trace-store (`bun:sqlite`); host-bin supervises the collector; - nested turn→step→{prompt, provider.request, ttft, decode} spans; D5 verbatim - provider capture (self-redacted); `trace-replay` record/replay lib + fixtures. -- **CLI** — one-shot HTTP client (`bun packages/cli/src/main.ts`); `GET /models`, - `--cwd`, `--conversation`. -- **web frontend** — SEPARATE repo `../frontend`. Slice 1 (surface system) - shipped via `ui-contract` + `surface-registry` + `transport-ws` + - `surface-loaded-extensions`. Slice 2 (browser chat) in progress there. - -## How to run -```bash -# .env auto-loads DISPATCH_API_KEY (do NOT re-export) and pins BACKEND_PORT (beats PORT). -# Private probe instance: override the port + ISOLATE data paths (ORCHESTRATOR §8): -BACKEND_PORT=4567 SURFACE_WS_PORT=4569 DISPATCH_DB=/tmp/opencode/probe/dispatch.db \ - DISPATCH_TRACE_DB=/tmp/opencode/probe/traces.db DISPATCH_JOURNAL=/tmp/opencode/probe/app.ndjson \ - bun packages/host-bin/src/main.ts # boots app + collector -curl -s -X POST localhost:4567/chat -H 'content-type: application/json' \ - -d '{"conversationId":"c1","message":"Say hello in 3 words."}' # field = conversationId -``` -Process cleanup uses the `[x]` bracket trick (ORCHESTRATOR §8) — leaked -server/collector procs poison the next run's counts. - -**Two stacks:** `bin/up` = dev (live-reload backend, ports 24203/24205/24204). -`../bin/up2` = a **stable, no-watch** second stack on **25203/25205/25204** with -ISOLATED data (`./.dispatch-data/up2/`, `./.dispatch/journal/up2/`) — runs ALONGSIDE -`bin/up`, edit backend code freely without restarting it; Ctrl-C stops only itself. -Enabled by a new env knob **`SURFACE_WS_PORT`** → `surfaceWsPort` config -(`host-bin/config.ts`; default 24205 when unset, so dev is unchanged). 
- -## Foundation (done — summarized; details in git) -- **MVP + multi-turn:** curl → transport-http → session-orchestrator → - host/registry → provider → OpenCode Go → AgentEvents → NDJSON; - `conversationId` threads history. -- **Post-MVP:** auth→provider seam; `read_file` tool (live tool-dispatch loop); - `getHostAPI()` hygiene; `tabId → conversationId` rename. -- **Observability Phase A/B:** the substrate + collector/store + supervision + - replay fixtures (see bullet list above). -- **CLI MVP:** credential-store + transport-contract + cli; model catalog; cwd - threading; multi-turn. -- **FE Slice 1:** the surface system across both repos (live WS probe verified). -- **FE Slice 2 backend prereqs:** `@dispatch/wire` split; per-chunk `seq` cursor; - read endpoint `GET /conversations/:id?sinceSeq=`; WS chat-deltas (transport-ws); - turn-lifecycle events (`turn-start`/`done`/`turn-sealed`); step grouping - (`stepId` on tool chunks/events); live stream metrics (`step-complete` + - `usage`/`done` token/timing — "Pass 1"); CORS. - -## Metrics — token + timing (current milestone) -- [x] **Pass 1 — live stream metrics** (done): `step-complete` event + - `usage`(stepId) + `done`(durationMs + aggregate usage). -- [x] **Observability spans** (done): turn & step span-close stamp all four - `Usage` fields (added cacheRead/cacheWrite; normalized `usage_*` → `usage.*`). -- [x] **Pass 2 — persisted replay metrics** (done, was deferred): `StepMetrics`/ - `TurnMetrics` wire types; conversation-store `appendMetrics`/`loadMetrics` - (separate key space, turn-append order); session-orchestrator accumulates - per-step+turn metrics from the event stream and persists after seal; - transport-http `GET /conversations/:id/metrics` → `ConversationMetricsResponse`. - `@dispatch/wire` + `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`. 
-- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done` - metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate + - per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans - carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix). -- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in - this repo; user couriers to `../frontend`; ORCHESTRATOR §7). - -## dedup / storage growth (DONE) -Design `notes/observability-design.md` §12. User-gated calls: extend existing -pipeline (no new ext); scope = **de-dup + retention/rotation** (D9 roll-ups -deferred); dedup = **content-addressed bodies** (body-hash, NOT fingerprint-gated). -- [x] **Wave 1 — `trace-store`**: content-addressed `bodies` table (SHA-256), - at-rest gzip (>1 KiB), `prune(policy)` (age + drop-oldest byte-cap + orphan GC) / - `RetentionPolicy` / `PruneSummary` / `DEFAULT_RETENTION` (7d/256MiB); reads - transparent. -- [x] **Wave 2 — `observability-collector`**: pure `shouldPrune` cadence helper; - `main.ts` calls `store.prune(DEFAULT_RETENTION)` on a coarse cadence - (`--prune-interval-ms`, default 60s; host-bin-overridable), log-and-continue on - error. -- [x] Glossary: added content-addressed body, trace retention, prefix fingerprint, - warm vs real. -- [x] **Migration bug** (found by live boot, fixed): Wave 1 created the - `idx_records_bodyHash` index BEFORE running `migrateOldBodies`, so opening a - pre-existing OLD-schema `traces.db` crashed the collector - (`no such column: bodyHash`, crash-looped). Fix = reorder migration before the - index + 3 regression tests that seed a real old-schema DB. bun 106→109. -- Tests: bun 89→109. typecheck/biome clean. **Live-verified** against a real - old-schema `traces.db`: 0 crashes, collector stays up, schema migrates - (bodyHash + content-addressed bodies), real-data dedup (318 body refs → 270 - stored bodies), prune cadence fires cleanly (14× `prune completed`). 
Optional - follow-up: host-bin env-override for the retention policy. - -## Standard tools — fs + shell (DONE) -User-gated calls: **one tool per extension** (matches `tool-read-file` precedent); tools are -**standard** tier (a turn completes with `tools:[]`, §2.6/§2.8). **Zero ABI change** — the -`ToolContract`/`ToolExecuteContext` already carry `signal`/`onOutput`/`cwd`/`log`. -- **Wave 1 (parallel, disjoint pkgs, kernel-only dep) — all green:** - - [x] `tool-read-file` — EXTENDED `read_file` to list directory contents (sorted, `/`-suffixed - subdirs; files unchanged). 41 tests. - - [x] `tool-shell` (new) — `run_shell`: foreground, streamed via `ctx.onOutput`, `ctx.signal` - cancel, `ctx.cwd`, timeout + output cap, `concurrencySafe:false`; injected `spawn`. 31 tests. - - [x] `tool-edit-file` (new) — `edit_file`: `oldString`/`newString`/`replaceAll`; errors on - absent/non-unique/identical; workdir-contained; `concurrencySafe:false`. 38 tests. - - [x] `tool-write-file` (new) — `write_file`: explicit `overwrite` flag (absent+unset→create; - exists+unset→error; exists+true→overwrite; absent+true→error); no parent auto-create. 33 tests. -- **Wave 2 (done):** orchestrator added 3 root tsconfig refs + `bun install`; host-bin owner - registered the 3 new extensions in `CORE_EXTENSIONS` (same pattern as `read_file`). -- **Live-verified:** clean boot (`Dispatch booted`, collector up, no activation/capability-gate - error — the new `shell` capability is accepted); full-graph `tsc -b` EXIT 0, biome clean. -- **Recovery notes (scar tissue):** `tool-write-file` first returned plan-only (§5a) → re-summoned - with "IMPLEMENT NOW". `tool-edit-file` hung vitest at collection — `computeReplacement` infinite- - looped on empty `oldString` (`"".indexOf("") === 0`, index never advances) invoked at a test's - `describe` scope; fixed with an early empty-string guard + validation. One agent deleted - `ORCHESTRATOR.md` out-of-lane → caught by post-wave `git status`, restored from git. 
-- Deferred (not selected): `glob`, `grep`/`search_code`, background shells. - -## Skill system + load_skill tool (DONE) -User-gated calls: skills list lives in the **`load_skill` tool definition** (NOT the system prompt), -refreshed **per new turn** (cache-stable across steps), **live file read** on execute. One `skills` -standard extension (loader + filter + tool). Skill = md in `.skills/`; discovered from `~/.skills` + -`<cwd>/.skills` (cwd shadows home); name = filename w/o `.md`. Format: line1 = summary, -line2 = `---`, body = line3+; on load the first two lines are stripped; malformed (no `---`) = -no summary but still loadable. Glossary: added `skill`, `skill summary`, `tools filter`. -- **Mechanism — the per-turn `tools` filter chain** (first concrete use of the §3.2 context-assembly - chain; reusable for persona/agents later): - - [x] **kernel** — exposed `HostAPI.applyFilters` (delegates to the bus's existing `applyFilters`). - - [x] **session-orchestrator** — defines+exports `toolsFilter`/`ToolAssembly`; applies it ONCE per - turn (injected `applyToolsFilter` dep) before `runTurn`, threading `cwd`+`conversationId`. - - [x] **skills** (new ext, `dependsOn session-orchestrator`) — pure parse/merge/render + - `load_skill` tool (live read, strips first two lines, path-contained) + a `toolsFilter` filter - that rewrites `load_skill`'s description + `name` enum with the per-cwd catalog. 42 tests. - - [x] **host-bin** — registered `skills` in `CORE_EXTENSIONS`. - - [x] **Fan-out (§5.3):** `applyFilters` was a required `HostAPI` addition → broke one consumer - (transport-http `server.bun.test.ts` inline HostAPI stub) → fixed by its owner. -- **Live-verified:** clean boot (`skills` activates, filter registered, no crash); full-graph - `tsc -b` EXIT 0, biome clean. (End-to-end load_skill via a real LLM turn not yet exercised — - unit/integration tests cover the filter rewrite + live read.) 
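The skill file format described above (line 1 = summary, line 2 = `---`, body from line 3, cwd shadowing home) can be sketched as follows. This is a minimal illustration, not the `skills` extension's actual code: `ParsedSkill`, `parseSkill`, and `mergeCatalogs` are hypothetical names, and keeping the whole file as the body of a malformed skill is an assumption.

```typescript
// Hypothetical shapes and names; a sketch of the format rules only.
interface ParsedSkill {
  name: string;
  summary: string | null; // null when the file has no `---` on line 2
  body: string;
}

// Skill name = filename without the `.md` suffix.
function skillNameFromFile(fileName: string): string {
  return fileName.endsWith(".md") ? fileName.slice(0, -3) : fileName;
}

function parseSkill(fileName: string, content: string): ParsedSkill {
  const name = skillNameFromFile(fileName);
  const lines = content.split("\n");
  if (lines.length >= 2 && lines[1]?.trim() === "---") {
    // Well-formed: strip the first two lines, keep the rest as the body.
    return {
      name,
      summary: (lines[0] ?? "").trim(),
      body: lines.slice(2).join("\n"),
    };
  }
  // Malformed: no summary, still loadable.
  // Assumption: the whole file is returned as the body.
  return { name, summary: null, body: content };
}

// Entries from `<cwd>/.skills` shadow same-named entries from `~/.skills`.
function mergeCatalogs(
  home: readonly ParsedSkill[],
  cwd: readonly ParsedSkill[],
): ParsedSkill[] {
  const byName = new Map<string, ParsedSkill>();
  for (const skill of home) byName.set(skill.name, skill);
  for (const skill of cwd) byName.set(skill.name, skill);
  return Array.from(byName.values()).sort((a, b) =>
    a.name.localeCompare(b.name),
  );
}
```

The merged catalog is what a per-turn filter would render into the `load_skill` description and `name` enum; the rendering step is omitted here.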
- -## Cache warming (core DONE; control surface PARTIAL) -User-gated calls: target the external **Claude** provider (`../claude` provider-anthropic, loaded via -`DISPATCH_EXTERNAL_EXTENSIONS`); warm-assembly lives in **session-orchestrator** (`warm()` reuses the -real turn's assembly → byte-identical prefix, provider-agnostic); **surface system** for controls; -**per-conversation** controls; interval default 4 min, free value. Old-code invariants honored -(primary-model/full-prefix via reuse; refuse mid-turn; never persist/emit; in-flight invalidation; -arm-on-settle/cancel-on-start; `pct = round(clamp(cacheRead/input,0,1)*100)`). -- **Mechanism (2nd use of bus hooks; first event-hook emit):** - - [x] **kernel** — exposed `HostAPI.emit` (delegates to bus.emit), counterpart of `on`. - - [x] **session-orchestrator** — `turnStarted`/`turnSettled` event hooks (carry conversationId/cwd/ - modelName) emitted per turn; `warm()` service (`cacheWarmHandle`) reusing assembly, refusing - mid-turn, never persisting/emitting; returns Usage. - - [x] **cache-warming** (new ext) — per-conversation timers (arm/cancel/in-flight token), - calls `warm()`, computes `lastPct`, persists `{enabled,intervalMs}` (default on/240s) in - host.storage; registers a controls Surface. 19 tests. - - [x] **host-bin** — registered cache-warming; **transport-http** HostAPI stub fixed for `emit`. -- **Manual trigger endpoint:** `POST /chat/warm {conversationId, model?, cwd?}` → `WarmResponse` - `{inputTokens,outputTokens,cacheReadTokens,cacheWriteTokens,cachePct}` (409 if generating). Powers a - FE "warm now" button + fast tests. Types in `@dispatch/transport-contract`; route in transport-http. -- **LIVE-VERIFIED against Claude haiku:** automatic timer warm → journal `warm complete pct:100`; - manual `POST /chat/warm` → `cacheReadTokens:6799, cachePct:100` (100% hit), HTTP 200. The external - `../claude` provider-anthropic is loaded via `bin/up` (`DISPATCH_EXTERNAL_EXTENSIONS`). 
-- **Cache-metric fix + retention metric:** `provider-anthropic` (in `../claude`, commit `0e9d118`) - now reports `Usage.inputTokens` as the TOTAL prompt (was the uncached remainder → the cache rate - inflated/clamped to 100% on Claude). So `cacheRead/inputTokens` is now the true rate (live: a turn - adding new content reads 61%, not 100%). Added **`expectedCacheRate`** = `cacheRead/(cacheRead+ - cacheWrite)` (retention/health, ~100% when warm, 0% when the cache expired) to `WarmResponse` + - `POST /chat/warm` + the cache-warming surface (a "cache retention" stat). Live-verified: warm - within TTL → 100%; warm after >5 min idle → 0% (cache expired). FE handoff updated with both - metrics + the cross-turn real-turn `expectedCache = cacheRead_N/(cacheRead_{N-1}+cacheWrite_{N-1})`. -- **Surface framework extended (DONE):** added `NumberField` to `ui-contract` + per-conversation - surface scoping (optional `conversationId` on subscribe/unsubscribe/invoke + surface/update; new - `SurfaceContext` on `SurfaceProvider.getSpec/invoke`; transport-ws keys subscriptions by - `(surfaceId, conversationId)` and tags updates). cache-warming now serves a PER-CONVERSATION - surface: `Toggle`(enabled) · `Number`(interval, seconds, `cache-warming/set-interval`) · - `Stat`(last cache %). All backward-compatible (global surfaces like `surface-loaded-extensions` - unchanged). **FE courier:** `frontend-cache-warming-handoff.md` (this repo) — the web must render - the `number` field kind + send/handle `conversationId` on the surface WS protocol. - -## Cache warming — FE CR-3 (DONE) -FE asked (frontend `backend-handoff-cache-warming-timer.md`): expose next/last-warm timestamps + -make a manual warm reset the timer/refresh the surface. 
Done via an **inversion** (commit `bfbad3a`): -session-orchestrator `warm()` (the single chokepoint for manual `/chat/warm` AND the auto timer) emits -a `warmCompleted` bus event; cache-warming subscribes and does all post-warm handling — so manual -warms re-arm the timer + push a surface update with **no transport-http change** (core can't depend on -the standard cache-warming ext). Added `nextWarmAt`/`lastWarmAt` state + a `custom` -`rendererId:"cache-warming-timer"` surface field (no ui-contract bump). Caught + fixed a wiring bug -(`createWarmService` missed the `emit` dep → `deps.emit?.` silently no-oped; made it required). -Live-verified vs claude haiku (manual warm logs `warm complete` ~2s after the turn, not the 4-min -timer). FE handoff updated. (FE CR-1 table + CR-2 catalog `scope` flag still open, not requested.) - -## LSP integration + per-conversation CWD (DONE) -Design: `notes/lsp-design.md`. FE courier: `frontend-lsp-cwd-handoff.md`. Decisions -(locked): **single `lsp` extension**; **hand-rolled pure JSON-RPC codec** (zero dep, -injected-stream tested); **diagnostics-on-write deferred** (on-demand `lsp` tool -only); **cwd persisted in `conversation-store`**; config = **built-in TypeScript + -`<cwd>/.dispatch/lsp.json` + `<cwd>/opencode.json` `lsp` fallback** (Roblox works -with its existing config). Glossary: added LSP, language server, diagnostics, -workspace root, working directory. -- **The bug we fixed** (opencode root cause, confirmed): opencode's - `client/registerCapability` ignores all but `textDocument/diagnostic`, so - `workspace/didChangeWatchedFiles` registrations are dropped + no real fs watcher - → stale `sourcemap.json` → "Unknown require" mid-session. Fix = honor the - registration + real fs watcher + forward `didChangeWatchedFiles` + auto-spawn - `rojo sourcemap --watch` sidecar when `luau-lsp.sourcemap.autogenerate`. Covered - by a regression test in `packages/lsp/src/client.test.ts`. 
-- **`lsp` extension** (new, bundled core): hand-rolled LSP client (framing + rpc + - watched-files + diagnostics + config + root + tool + manager), zero external deps. - Lazy-spawn one server per `(serverID, root)`; config resolved **per cwd**; - `lspServiceHandle.status(cwd)` lazy-connects + reports state; `deactivate` kills - all child procs (host-bin shutdown now calls `host.deactivate()`). -- **CWD:** `conversation-store.getCwd/setCwd`; `session-orchestrator` defaults a - turn's cwd from the store; endpoints `GET`/`PUT /conversations/:id/cwd` + - `GET /conversations/:id/lsp` in transport-http; wire types in - `@dispatch/transport-contract` (→ `0.5.0`). -- **LIVE-VERIFIED:** this repo (`typescript`) → `connected`; `/home/tradam/projects/ - roblox` (`luau-lsp`) → `connected` (via the project's own `opencode.json` + rojo - sidecar); cwd PUT/GET round-trip 200. Op note: LSP binaries must be on the server - process PATH (`~/.local/bin` daemon-PATH caveat for `typescript-language-server`). -- **Recovery (scar tissue):** the `lsp` agent stalled on the final stretch (1 hung - test + ~40 biome `!`/dot-key findings) → at the user's request the orchestrator - finished it directly; also fixed a real design bug the agent missed: the manager - read config statically instead of per-cwd (would have broken Roblox). - -## Context size — current context-window usage (DONE) -User-gated decisions: term = **context size** (current usage; reserve "context window" for the -model's max LIMIT, a later feature); definition = the turn's **FINAL step `inputTokens + -outputTokens`** (NOT the aggregate `usage`, which sums per-step prompts and overcounts a -multi-step turn); delivery = a backend-computed field on BOTH the live `done` event and the -persisted `TurnMetrics`. 
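To make the definition concrete, here is a small sketch contrasting context size (final step only) with the aggregate usage, which overcounts on a multi-step turn. `StepUsage`, `contextSizeOf`, and `aggregateTokens` are illustrative names, not the kernel's; the first step's numbers are made up, while the final step reuses the `1206 in + 49 out` figures from the live verification.

```typescript
// Illustrative shape; the real `Usage` type carries more fields.
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

// Context size = the FINAL step's input + output tokens.
// Returns undefined when no step reported usage, so the field can be omitted.
function contextSizeOf(steps: readonly StepUsage[]): number | undefined {
  const last = steps[steps.length - 1];
  if (last === undefined) return undefined;
  return last.inputTokens + last.outputTokens;
}

// The aggregate sums every step's prompt again, so it overcounts.
function aggregateTokens(steps: readonly StepUsage[]): number {
  return steps.reduce((sum, s) => sum + s.inputTokens + s.outputTokens, 0);
}

const steps: StepUsage[] = [
  { inputTokens: 1000, outputTokens: 40 }, // hypothetical tool-calling step
  { inputTokens: 1206, outputTokens: 49 }, // final step
];

console.log(contextSizeOf(steps)); // 1255
console.log(aggregateTokens(steps)); // 2295
```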
-- [x] **Contract (orchestrator):** optional `contextSize?: number` added to `TurnDoneEvent` + - `TurnMetrics` in `@dispatch/wire` (`0.4.0→0.5.0`); `@dispatch/transport-contract` - `0.5.0→0.6.0` (re-exports both — no other change). Glossary: added **context size**. -- [x] **Wave (parallel, disjoint pkgs):** - - [x] **kernel** — `run-turn.ts` tracks the last step's `Usage`; `doneEvent()` stamps - `done.contextSize = lastStep.input + lastStep.output` (omitted when no usage). +3 tests. - - [x] **session-orchestrator** — `metrics.ts build()` stamps `TurnMetrics.contextSize` from - the final per-step metrics (same definition; equals the live value). +5 tests. -- [x] Verified: `tsc -b` EXIT 0, biome clean, 881 vitest pass; both owners stayed in-lane. - `conversation-store` (JSON passthrough) + `transport-http` (forwards/serves) unchanged. -- [x] **LIVE-VERIFIED against flash** (`deepseek-v4-flash`): turn 1 → live `done.contextSize` - 1255 == persisted `turns[-1].contextSize` 1255 == final-step `1206 in + 49 out` (NOT the - aggregate); turn 2 (same conversation) → 1286 (grew cumulatively), live == persisted. Both - carriers agree; "current" = latest turn's value. -- [x] **FE courier handoff:** `frontend-context-size-handoff.md` (user couriers to - `../frontend`). - -## Turn continuity — detached turns + multi-client live view (DONE) -Design: `notes/turn-continuity-design.md`. FE courier: `frontend-turn-continuity-handoff.md`. -Problem (code-traced): a turn's lifetime was bound to the WS connection — `transport-ws` aborted -the in-flight turn on socket close, so a backgrounded/reloaded mobile browser killed generation. -Principle enforced: **the FE is only a control interface; the AI runs independent of it**, and -**multiple clients may watch the same conversation** (multi-device handoff). 
-- **Decisions (locked):** broadcast hub lives in the CORE (`session-orchestrator`), not a - transport; additive `SessionOrchestrator` handle (keep `handleMessage`); persist-at-seal kept, - per-step R1 deferred; late-join served by an in-memory in-flight buffer; subscribers persist - per-conversation independent of turns; no concurrent-send arbitration; no explicit stop op. -- **Contract (orchestrator):** `@dispatch/transport-contract` `0.6.0→0.7.0` — additive WS ops - `chat.subscribe`/`chat.unsubscribe` on `WsClientMessage` (events still arrive as `chat.delta`). -- **Wave 1 — `session-orchestrator`:** detached per-conversation turn ownership + broadcast; - `startTurn`/`subscribe`/`isActive` added to the handle; `handleMessage` → convenience wrapper - (dropped `signal`). **Two-map model** (`subscribers` persistent + `activeTurns` buffer) — the - fix for the live-found bug where pre-turn subscribers were dropped. 63 tests. -- **Wave 2 (parallel) — `transport-ws`** (fan-out: per-connection chat-subscription map; - `chat.send` auto-subscribes sender + `startTurn`; new ops in pure `router.ts`; `close` drops - subs but NEVER aborts a turn; removed the turn `AbortController`) + **`transport-http`** (only - test fakes updated for the 3 new methods; runtime unchanged). host-bin untouched. -- **LIVE-VERIFIED against flash** (2-client WS test, `/tmp/ws_multi.ts`): (S1) two clients both - stream a turn; closing the SENDER mid-turn → the other keeps receiving through `done` and the - turn persists (1197 chars) — AI kept going independent of the interface; (S2) a client joining - mid-turn gets `turn-start` replayed + the rest live. `RESULT OVERALL: OK`. -- **Recovery (scar tissue):** first Wave-1 impl stored listeners INSIDE the per-turn hub and - `startTurn` made a fresh empty-listener hub → every pre-turn subscriber dropped; live test got - zero deltas though the turn ran+persisted. Caught by live-verify (unit test had subscribed - AFTER start, masking it). 
Fixed via the persistent-subscribers / per-turn-buffer split. - -## Turn continuity — CR-3: user prompt on the event stream (DONE) -FE bug (multi-client): a pure watcher (subscribed, not the sender) couldn't see the USER prompt until -seal — the user message was passed to the provider + persisted only at seal, never on the turn's -outward stream/buffer. FE courier: `frontend-cr3-user-message-handoff.md`. -- **Contract:** `@dispatch/wire` `0.5.0→0.6.0` — additive `TurnInputEvent` - `{ type:"user-message"; conversationId; turnId; text }` on the `AgentEvent` union (kernel barrels - re-export it). `@dispatch/transport-contract` `0.7.0→0.8.0` (re-export only). Widening broke NO - exhaustive switch (typecheck clean) — zero consumer fan-out. -- **session-orchestrator:** `emitToHub({type:"user-message",…})` as the FIRST event of `runTurnDetached` - (before `runTurn`) → buffered + broadcast to all subscribers (live + late-join); HTTP path covered via - `handleMessage`'s buffer replay. Persistence + metrics unchanged. +3 tests; 3 Wave-1 tests updated - (user-message now precedes turn-start). -- **LIVE-VERIFIED vs flash:** a watcher that never sent receives `user-message` (correct text) as its - FIRST `chat.delta`, before `turn-sealed`, then the streaming reply. `RESULT: OK`. -- **Process note:** implemented directly by the orchestrator as a one-off (user-approved at the - time). SUPERSEDED — the user has since confirmed the ORCHESTRATOR.md model governs: the - orchestrator summons owner-agents and does not write feature code itself. - -## Cache warming — FE CR-4 lifecycle + CR-1 extensions table + CR-2 catalog scope (DONE) -FE courier in: `../frontend/backend-handoff-cache-warming.md` (+ CR-1/CR-2 from their living -`backend-handoff.md`). Courier out: `frontend-cache-warming-lifecycle-handoff.md`. Full report: -`reports/cr4-cache-warming-lifecycle.md`. 
-- **CR-4a:** warming defaults OFF (opt-in per conversation) — `parseSettings` + `DEFAULT_STATE`; - re-enabling now restores the persisted interval. Known gap (pre-existing, fail-safe): no boot - hydration of persisted opt-in across server restarts. -- **CR-4b:** post-warm surface updates now carry the FUTURE `nextWarmAt` (re-arm BEFORE notify); - `turnSettled`/`turnStarted` also push (fresh schedule after seal / `null` while generating). -- **CR-4c:** new `POST /conversations/:id/close` (tab close ≠ disconnect): aborts the in-flight - turn via a per-turn `AbortController` → kernel `runTurn` `signal` (partial persist + normal seal, - `done.reason:"aborted"`), and emits new typed hook `conversationClosed` → cache-warming disables - sync + persists OFF. Disconnect/`chat.unsubscribe` semantics unchanged. -- **CR-4d:** no change — initial `surface` echo already at HEAD (FE probed a stale up2 boot). -- **CR-1:** loaded-extensions emits count stat + ONE `custom`/`rendererId:"table"` field - (`TablePayload` exported); columns Name|Version|Trust|Activation, all trust tiers. -- **CR-2:** `SurfaceCatalogEntry.scope?: "global"|"conversation"` (`ui-contract` `0.1.0→0.2.0`); - set on both surfaces. `transport-contract` `0.8.0→0.9.0` (additive `CloseConversationResponse`). -- 907 tests pass (+13 new); typecheck + biome clean. **LIVE-VERIFIED vs `bin/up`:** default-off, - 2 automatic warms @5s each pushing future `nextWarmAt`, mid-turn close → `abortedTurn:true` + - `done.reason:"aborted"` + warming disabled, catalog scopes + table field present, echo present. - -## History windowing — FE CR-5 (DONE) -FE courier in: `../frontend/backend-handoff-chat-limit.md` (+ living `backend-handoff.md` §2 -CR-5). Courier out: `frontend-history-windowing-handoff.md`. User-gated call: ask #3 shipped as -the INVARIANT option (no new field) — seq is contractually **1-based, monotonic, gap-free**; FE -derives `hasOlder` from `chunks[0].seq > 1`. 
-- **Wave 0 (orchestrator, contracts):** `limit`/`beforeSeq` query-param semantics + validation + - `latestSeq` windowed-read caveat documented on `ConversationHistoryResponse` - (`@dispatch/transport-contract` `0.9.0→0.10.0`); 1-based seq guarantee codified on - `StoredChunk` (`@dispatch/wire` `0.6.0→0.6.1`, doc-only). -- **Wave 1 — `conversation-store`:** additive `loadSince(id, sinceSeq?, window?: { beforeSeq?, - limit? })` — selection `sinceSeq < seq < beforeSeq`, newest-`limit` window, result stays - ascending; garbage-in treated as absent (transport validates upstream). +8 tests. -- **Wave 2 — `transport-http`:** parses + validates the params (positive integers; malformed/ - zero/negative → 400 `{ error }`, store never called with an invalid window); two-arg call - shape preserved when no params (regression-guarded). +20 tests. -- 935 vitest + 112 bun tests, typecheck + biome clean. **LIVE-VERIFIED** (isolated boot, real - flash turns): firstSeq=1; `limit=2`→`[5,6]` ascending w/ correct `latestSeq`; `limit=9999`→ - full log; `beforeSeq=3`→`[1,2]`; `beforeSeq=3&limit=1`→`[2]`; `limit=0`/`beforeSeq=0`/ - `limit=abc`→400×3. `RESULT: OK` ×6. -- **Scar tissue (process):** (1) probing with a PRIVATE boot was overkill — the windowing checks - are read-only GETs and the dev stack was running; prefer probing `bin/up`/`up2` or asking the - user (ORCHESTRATOR §8 updated). (2) The §8 boot recipe was stale (`DISPATCH_API_KEY_OPENCODE1` - doesn't exist; an empty re-export OVERRIDES `.env` → "No providers registered"; `.env`'s - `BACKEND_PORT` beats `PORT`; un-isolated data paths spawn a duplicate collector on the dev - DB) — recipe fixed in §8 + above. (3) Violated the bracket trick once (`pkill -f 'cr5-data'` - self-matched → killed parent shell, timeout-with-no-output); the existing §8 rule stands. 
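The windowing rule above (`sinceSeq < seq < beforeSeq`, newest-`limit`, ascending result, garbage treated as absent) and the FE-side `hasOlder` derivation can be sketched as a pure function. This is an illustration under assumptions: `selectWindow`, `hasOlder`, and `SeqWindow` are hypothetical names, and the input is assumed to be already sorted ascending by `seq`.

```typescript
interface Chunk {
  seq: number; // 1-based, monotonic, gap-free
}

interface SeqWindow {
  beforeSeq?: number;
  limit?: number;
}

function isPositiveInt(n: number | undefined): n is number {
  return n !== undefined && Number.isInteger(n) && n > 0;
}

// Assumes `chunks` is ascending by seq. Invalid bounds are treated as absent.
function selectWindow<T extends Chunk>(
  chunks: readonly T[],
  sinceSeq?: number,
  win?: SeqWindow,
): T[] {
  const before = win?.beforeSeq;
  const limit = win?.limit;
  const lower = isPositiveInt(sinceSeq) ? sinceSeq : 0;
  const upper = isPositiveInt(before) ? before : Number.POSITIVE_INFINITY;
  const selected = chunks.filter((c) => c.seq > lower && c.seq < upper);
  // Keep the newest `limit` entries; the slice preserves ascending order.
  return isPositiveInt(limit) ? selected.slice(-limit) : selected;
}

// Because seq starts at 1 with no gaps, a first seq above 1 means older chunks exist.
function hasOlder(chunks: readonly Chunk[]): boolean {
  const first = chunks[0];
  return first !== undefined && first.seq > 1;
}
```

With a six-chunk log this reproduces the live-verified cases: `limit=2` yields seqs 5 and 6, `beforeSeq=3` yields 1 and 2, and `beforeSeq=3` with `limit=1` yields 2.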
- -## Reasoning effort (current milestone) -User-gated calls: canonical term **reasoning effort** (GLOSSARY); ladder `low|medium|high|xhigh|max` -(Anthropic-driven, includes xhigh/max); scope = **(c)** persisted per-conversation + per-turn -`ChatRequest.reasoningEffort` override; resolution default **`high`**; provider picks sensible -budget_tokens; `../claude` orchestrated DIRECTLY (mode A); CLI `--effort` now. -- [x] **Wave 0 (orchestrator, contracts):** `ReasoningEffort` in `@dispatch/wire` (`0.6.1→0.7.0`); - `ProviderStreamOptions.reasoningEffort` (kernel contract; runtime untouched — providerOpts is - forwarded verbatim); `ChatRequest.reasoningEffort` + `ReasoningEffortResponse`/ - `SetReasoningEffortRequest` GET/PUT types (`@dispatch/transport-contract` `0.10.0→0.11.0`); - glossary entry. typecheck + biome clean. -- [x] **Wave 1 (parallel ×3, disjoint):** `conversation-store` get/setReasoningEffort (own key - space, mirrors cwd; +12 tests); `provider-anthropic` (../claude commit `c0835a4`, mode A summon - with `--dir ../claude`, contract excerpt INLINED per the cross-`--dir` hang rule) — - `REASONING_EFFORT_BUDGETS` 4096/10240/16384/32768/65536, raises max_tokens above budget, strips - temperature when thinking on, absent → byte-stable body (+12 tests); `cli` `--effort` flag, - parse-validated, body key omitted when unset (+8 tests). -- [x] **Wave 2:** `session-orchestrator` — exported pure `resolveReasoningEffort` (override → - stored → `"high"`), additive `StartTurnInput.reasoningEffort`, providerOpts always stamped, - **warm() parity** (same resolved effort as a real turn — prompt-cache safe), own fakes fixed - (+9 tests). -- [x] **Wave 3 (parallel ×2):** `transport-http` — `/chat` validation (400 names valid levels, - orchestrator never sees bad input), threads to startTurn, GET/PUT - `/conversations/:id/reasoning-effort` mirroring cwd endpoints, own fakes fixed; `transport-ws` — - `chat.send` threading + validation (+3 tests). 
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **993 vitest + 189 bun** green; all agents in-lane. - Commits: arch-rewrite `35197ed` (contracts) + `020e051` (impl); ../claude `c0835a4`. -- [x] Live-verified vs claude (thinking deltas streamed at xhigh; persisted PUT honored next turn). -- [x] FE courier handoff written: `frontend-reasoning-effort-handoff.md` (user couriers to - `../frontend`): ChatRequest/chat.send field + GET/PUT endpoints + ladder + default-`high` - semantics + cache note. - -## Message queue + steering injection (DONE) -Design: this file's roadmap item 3 (now implemented). User-gated calls: a **separate -`message-queue` standard extension** (dependsOn `surface-registry`) owns the queue STATE + -a per-conversation `custom` surface; the **session-orchestrator** owns delivery (drain → -inject → carry) + emits the `steering` event (it owns the chat hub — no `chatEmit` service -needed); the **kernel** gets a generic `drainSteering` callback. Glossary: added -**message queue**, **steering**, **queued message**. Enqueue when idle **starts a turn** -(user choice; `chat.queue` degrades to `chat.send`). Steering text rendered live via a new -additive `steering` `AgentEvent`; queue state via the surface (NOT the chat stream). -- **Wave 0 (orchestrator, contracts):** `RunTurnInput.drainSteering?: () => readonly - ChatMessage[]` (kernel contract — generic, kernel stays pure); `QueuedMessage` + - `QueuePayload` + `TurnSteeringEvent` (type `"steering"`, additive to `AgentEvent`) in - `@dispatch/wire` (`0.7.0→0.8.0`); `POST /conversations/:id/queue` + WS `chat.queue` op + - `QueueRequest`/`QueueResponse` in `@dispatch/transport-contract` (`0.11.0→0.12.0`). typecheck - clean except the expected transport-ws exhaustive-switch fan-out (fixed in Wave 3). 
-- **Wave 1 (parallel ×2, disjoint):** `kernel` runtime — calls `drainSteering` at the - tool-result boundary only when continuing to a next step (gated; no drain on max-steps), - +6 pure tests (65 total); `message-queue` (NEW ext) — pure queue core (enqueue/getQueue/ - drain/combine) + `MessageQueueService`/`messageQueueHandle` + per-conversation `custom` - surface (`rendererId:"message-queue"`, `QueuePayload`), 12 tests. (The message-queue agent - DIED mid-task after writing all src+tests but before verifying/reporting; orchestrator - recovered by running `bun install` + root tsconfig ref + verifying directly — tsc/vitest/ - biome clean, 12 tests pass; no hand-fixing of impl.) -- **Wave 2:** `session-orchestrator` — added `enqueue` facade (idle→`startTurn`, - active→queue.enqueue) + `resolveQueue?` dep (self-wired lazily in `activate` via - `host.getService(messageQueueHandle)` — host-bin does NOT wire it) + `drainSteering` wrapper - (drain → emit `steering` → return one combined user `ChatMessage`) + post-seal carry - (non-empty queue → new turn), +8 tests (85 total). `message-queue` is an OPTIONAL dep - (feature degrades off if absent). -- **Wave 3 (parallel ×3):** `host-bin` — registered `message-queue` in `CORE_EXTENSIONS` - (+dep+ref), 28 tests; `transport-http` — `POST /conversations/:id/queue` route + validation, - 145 tests; `transport-ws` — `chat.queue` op + fixed the Wave-0 exhaustive-switch fan-out, - 29 vitest + 20 bun. -- Verified: `tsc -b` EXIT 0, biome clean (280 files), **1043 vitest + 199 transport bun** pass; - all agents in-lane. **Boot smoke:** private instance boots clean with `message-queue` - registered (no activation crash). -- [x] FE courier handoff written: `frontend-message-queue-handoff.md` (user couriers to - `../frontend`): surface (`rendererId:"message-queue"`), `chat.queue` WS op, `steering` - event, HTTP `POST /queue`, auto-start-when-idle, carry semantics, version bumps. 
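The queue core and the idle/active enqueue facade described above can be sketched as follows. It is a rough illustration, not the shipped `message-queue` or `session-orchestrator` code: the `QueuedMessage` fields, the `EnqueueDeps` shape, the `enqueueOrStart` name and the blank-line join used by `combine` are all assumptions, and the `steering` event emission and post-seal carry are left out.

```typescript
// Hypothetical, simplified shapes; the real wire types carry more fields.
interface QueuedMessage {
  id: string;
  text: string;
}

interface ChatMessage {
  role: "user";
  content: string;
}

class MessageQueue {
  private readonly queues = new Map<string, QueuedMessage[]>();
  private nextId = 1;

  enqueue(conversationId: string, text: string): QueuedMessage {
    const message: QueuedMessage = { id: `q${this.nextId++}`, text };
    const queue = this.queues.get(conversationId) ?? [];
    queue.push(message);
    this.queues.set(conversationId, queue);
    return message;
  }

  // Returns a copy so callers cannot mutate queue state.
  getQueue(conversationId: string): readonly QueuedMessage[] {
    return [...(this.queues.get(conversationId) ?? [])];
  }

  // Removes and returns everything queued for the conversation.
  drain(conversationId: string): QueuedMessage[] {
    const drained = this.queues.get(conversationId) ?? [];
    this.queues.delete(conversationId);
    return drained;
  }
}

// Assumption: queued texts are joined with a blank line into one user message.
function combine(messages: readonly QueuedMessage[]): ChatMessage | undefined {
  if (messages.length === 0) return undefined;
  return { role: "user", content: messages.map((m) => m.text).join("\n\n") };
}

interface EnqueueDeps {
  isActive(conversationId: string): boolean;
  startTurn(conversationId: string, text: string): void;
  queue: MessageQueue;
}

// Idle conversation: start a turn. Active conversation: queue for steering.
function enqueueOrStart(deps: EnqueueDeps, conversationId: string, text: string): "started" | "queued" {
  if (!deps.isActive(conversationId)) {
    deps.startTurn(conversationId, text);
    return "started";
  }
  deps.queue.enqueue(conversationId, text);
  return "queued";
}

// Shape of the callback handed to the kernel: zero or one combined message.
function drainSteering(queue: MessageQueue, conversationId: string): readonly ChatMessage[] {
  const combined = combine(queue.drain(conversationId));
  return combined ? [combined] : [];
}
```

Returning an empty array when nothing is queued lets the kernel call the callback unconditionally at the tool-result boundary.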
- -## Umans AI Coding Plan provider (DONE) -User-gated calls: a new **`provider-umans`** standard extension wrapping the Umans -OpenAI-compatible backend (`https://api.code.umans.ai/v1`). Built via the **full-refactor -path**: first extract a generic `@dispatch/openai-stream` library from -`provider-openai-compat`, then build `provider-umans` on top. Self-contained (reads -`UMANS_API_KEY` from env directly — no `auth-apikey` dep). -- **Wave 1 — `@dispatch/openai-stream` lib (NEW package):** extracted the generic OpenAI - functions (convert-messages, convert-tools, parse-sse, listModels, stream, provider) - from `provider-openai-compat` into a pure library package. `createOpenAICompatProvider` - parameterized: `id: string` (was hardcoded `"openai-compat"`) + `transformBody?: (body, - opts) => Record<string,unknown>` hook (for provider-specific body fields). Refactored - `provider-openai-compat` to import from the lib (thin extension.ts, backward-compat - re-exports, manifest unchanged, byte-identical behavior). Full tsc EXIT 0, 66 vitest, - biome clean. Report: `reports/provider-umans-wave1-openai-stream.md`. -- **Wave 2 — `provider-umans` (NEW ext):** imports `createOpenAICompatProvider` from the - lib; registers provider id `"umans"`; `transformBody` maps Dispatch `reasoningEffort` - (`low|medium|high|xhigh|max`) → Umans `reasoning_effort` (`none|low|medium|high`, - capping `xhigh`/`max`→`high`); dynamic `listModels` (GET /v1/models); default model - `umans-coder` (env `UMANS_MODEL` or config `provider.umans.model`); baseURL env - `UMANS_BASE_URL`; absent key → warn + skip registration (graceful). Pure core: - `mapReasoningEffort` + `resolveUmansConfig` (factored out for direct unit testing). - 12 tests. Report: `reports/provider-umans.md`. -- **Wave 3 — host-bin wiring:** registered `provider-umans` in `CORE_EXTENSIONS` + added - `@dispatch/provider-umans` dep + root tsconfig ref. 
No credential-store entry needed - (self-contained — reads env directly, doesn't go through `auth-apikey`). 28 host-bin - tests. -- Verified: full-graph `tsc -b` EXIT 0, biome clean (293 files), **1059 vitest** pass. - **Boot smoke:** without `UMANS_API_KEY` → `"provider-umans: no UMANS_API_KEY. Provider - not registered."` (graceful skip); with `UMANS_API_KEY=sk-test` → `"provider-umans: - registered (model=umans-coder)"`. -- [x] **LIVE-VERIFIED against the real Umans API:** the dev stack (umans-glm-5.2) called - `web_search` (Firecrawl) in a real turn — first live Umans API call, clean response. - -## web_search tool — Firecrawl (DONE) -Standard tool extension `tool-web-search` backed by a self-hosted Firecrawl instance -(`http://100.102.55.49:31329/v1`, Tailscale, no API key). One tool `web_search` with 4 -modes: search, scrape, crawl (polls status URL), map — mirroring the proven opencode tool. -Pure core: `validateArgs` (discriminated union by mode) + `format*` functions + `truncateOutput`. -Injected edge: `FirecrawlClient` (injectable `fetchFn` + `sleep` + `now`), `AbortSignal.any` -for per-request timeout + caller cancellation. `concurrencySafe: true`, `capabilities: { network: true }`. -38 tests. Report: `reports/tool-web-search.md`. -- **LIVE-VERIFIED:** the dev stack (umans-glm-5.2) called `web_search` → Firecrawl returned - real results (Paris, France) — first live Umans API call too. - -## todo tool — per-conversation task list + surface (DONE) -Standard tool extension with a single `todo_write` tool (opencode `todowrite` pattern: -full-list replace, returns JSON, no business-rule enforcement — the description guides -the model). Per-conversation in-memory state (`Map<conversationId, TodoItem[]>`). Per- -conversation surface (`rendererId: "todo"`, `scope: "conversation"`) via subscriber-notify -(message-queue pattern). `concurrencySafe: false` (mutates shared state). 
-- **Wave 0 (orchestrator, kernel contract):** added `conversationId?: string` to - `ToolExecuteContext` (additive, backward-compatible). Wired in `dispatch.ts` — the - kernel already had `conversationId` as a parameter, just wasn't passing it through to - the tool context. 170 kernel tests pass. -- **Wave 1 (todo extension):** pure core (`validateTodos` — shape only; `getTodos`/ - `setTodos`/`clearTodos` — fresh array copies; `buildTodoSpec`; `formatTodoResult` → - `JSON.stringify`). Shell: `createTodoWriteTool({ state, notify })` + surface provider. - 26 tests. Report: `reports/todo.md`. -- **Wave 2 (host-bin wiring):** registered `todo` in `CORE_EXTENSIONS` + dep + root tsconfig - ref. 28 host-bin tests. -- Verified: full-graph `tsc -b` EXIT 0, biome clean (314 files), **1123 vitest** pass. - **Boot smoke:** `"todo: registered"` + activated. -- [x] Live-verified (model uses `todo_write` in a real turn). - -## youtube_transcript tool (DONE) -Standard tool extension `tool-youtube-transcript` backed by a self-hosted transcriber -service (`http://100.102.55.49:41090`, Tailscale, no API key). One tool -`youtube_transcript` — takes a YouTube URL, fetches the transcript (completed → full -text + timestamped segments; queued/processing → position + ETA + `.youtube_subtitles_pending` -retry convention; failed → error). Pure core: `validateUrl` + `format*` functions + -`truncateOutput`. Injected edge: `TranscriptClient` (injectable `fetchFn`, `AbortSignal.any` -for cancellation). `concurrencySafe: true`, `capabilities: { network: true }`. 30 tests. -Report: `reports/tool-youtube-transcript.md`. - -## CLI — cross-client messaging + open tab (DONE) -Roadmap items 2 + 4. The CLI can now list conversations, read the last AI message -(blocking), send messages (blocking or `--queue`), and signal the frontend to open a -conversation tab. Short-ID prefix resolution (4+ chars → full ID via `GET /conversations?q=`). 
-- **Wave 0 (orchestrator, contracts):** `ConversationMeta` in `@dispatch/wire` - (`0.8.0→0.9.0`); `ConversationListResponse`, `LastMessageResponse`, - `OpenConversationResponse`, `SetTitleRequest`, `TitleResponse`, WS - `conversation.open` in `@dispatch/transport-contract` (`0.12.0→0.13.0`); - `listConversations()`/`getConversationMeta()`/`setConversationTitle()` on - `ConversationStore`; new routes declared in transport-http manifest; - `conversationOpened` hook in session-orchestrator. -- **Wave 1 (conversation-store):** metadata tracking (createdAt on first write, - lastActivityAt on every append, title from first user message truncated 80 chars); - `conv-index` key tracks all conversation IDs; `extractTitle` pure helper. 21 new - tests (81 total). -- **Wave 2 (parallel, transport-http + transport-ws):** `GET /conversations` (list - with `?q=` prefix filter), `GET /conversations/:id/last` (blocks until turn settles - via subscribe-then-checkIsActive, returns last assistant text via pure - `extractLastAssistantText`), `POST /conversations/:id/open` (emits - `conversationOpened` hook), `PUT /conversations/:id/title`; `emit` threaded from - `host.emit` → `createApp`. transport-ws subscribes to `conversationOpened` + - broadcasts `ConversationOpenMessage` to all connected WS clients. 21+2 new tests. -- **Wave 3 (CLI):** `dispatch list` (table: short ID + title + activity), - `dispatch read <id>` (blocking, prints last AI message), `dispatch send <id> --text` - (blocking by default; `--queue` for non-blocking enqueue; `--open` signals FE). - Short-ID resolution (4+ chars → prefix search; 32+ chars = full UUID). 48 new - tests (108 total). -- Verified: full-graph `tsc -b` EXIT 0, biome clean (327 files), **1240 vitest** pass. - **Boot smoke + endpoint smoke:** `GET /conversations` → `[]`, `GET /conversations/:id/last` - → `{content:""}`, `POST /conversations/:id/open` → `{conversationId}`. -- [x] Live-verified end-to-end (CLI → real conversation → FE tab open). 
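The short-ID rule (4+ characters trigger a prefix search, 32+ characters are taken as a full ID) and the 80-character title truncation can be sketched as below. The names `resolveConversationId`, `Resolution` and the injected `search` callback are hypothetical stand-ins for the CLI's call to `GET /conversations?q=`. The handling of ambiguous and too-short input, and the whitespace trim in `extractTitle`, are assumptions that the notes above do not spell out.

```typescript
type Resolution =
  | { kind: "resolved"; id: string }
  | { kind: "too-short" }
  | { kind: "not-found" }
  | { kind: "ambiguous"; matches: string[] };

const MIN_PREFIX_LENGTH = 4;
const FULL_ID_MIN_LENGTH = 32;

function resolveConversationId(
  input: string,
  // Stand-in for the prefix search endpoint; returns candidate IDs.
  search: (prefix: string) => readonly string[],
): Resolution {
  // 32+ characters: treat as a full ID, no lookup needed.
  if (input.length >= FULL_ID_MIN_LENGTH) return { kind: "resolved", id: input };
  if (input.length < MIN_PREFIX_LENGTH) return { kind: "too-short" };

  const matches = search(input).filter((id) => id.startsWith(input));
  const [first, ...rest] = matches;
  if (first === undefined) return { kind: "not-found" };
  // Assumption: more than one match is reported rather than guessed.
  if (rest.length > 0) return { kind: "ambiguous", matches: [first, ...rest] };
  return { kind: "resolved", id: first };
}

const TITLE_MAX_LENGTH = 80;

// Title from the first user message, truncated to 80 characters.
// Trimming surrounding whitespace first is an assumption of this sketch.
function extractTitle(firstUserMessage: string): string {
  return firstUserMessage.trim().slice(0, TITLE_MAX_LENGTH);
}
```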
- -## Workspaces (DONE) -Cross-repo design ask from `../frontend` (`backend-handoff-workspaces.md`). -Outbound courier: `frontend-workspaces-handoff.md` (final shapes + Q1–Q8). -- **Boundary decision:** workspaces live inside `conversation-store` (metadata + - cwd persistence owner); no new extension. Single owner-agent for all workspace - storage + service methods. -- **Versions:** `@dispatch/wire` `0.11.0→0.12.0`, `@dispatch/transport-contract` - `0.15.0→0.16.0`, `@dispatch/ui-contract` unchanged. Kernel re-exports - `Workspace`/`WorkspaceEntry`. -- **Key decisions:** `DELETE /workspaces/:id` closes all conversations (status→ - "closed") + reassigns to "default" + deletes workspace; auto-create workspace on - turn start if missing; `PUT /workspaces/:id` create-on-miss with optional - `title`/`defaultCwd`; `DELETE /conversations/:id/cwd` to clear explicit cwd; - `GET /conversations/:id/lsp` roots at effective cwd; WS lifecycle push deferred. -- **Waves:** - - **Wave 0 (orchestrator):** contracts (wire `0.12.0` + transport-contract - `0.16.0` + kernel re-exports). tsc + biome clean. - - **Wave 1 (conversation-store):** workspace persistence + service methods - (`getWorkspace`, `ensureWorkspace`, `setWorkspaceTitle`, `setWorkspaceDefaultCwd`, - `deleteWorkspace`, `listWorkspaces`, `getWorkspaceId`, `setWorkspaceId`, - `getEffectiveCwd`, `isValidWorkspaceSlug`); `listConversations` filter; - `forkHistory`/`replaceHistory` preserve `workspaceId`. 111 bun tests. CRs - (kernel re-exports, `bun install`) resolved by orchestrator. - - **Wave 2 (session-orchestrator):** `workspaceId` on `StartTurnInput`/ - `EnqueueInput`; effective cwd resolution (`getCwd` → `getEffectiveCwd`); auto- - create workspace on turn start; warm parity. 93 vitest (+8). 
- - **Wave 3 (parallel):** `transport-http` (workspace routes, `workspaceId` - threading, `?workspaceId=` filter, `DELETE /conversations/:id/cwd`, effective - cwd for LSP, slug validation; 166 tests), `transport-ws` (`workspaceId` on - `chat.send`/`chat.queue`; 32 tests), `cli` (`--workspace`/`-w` flag; 123 tests). - - FE handoff sent to agent 4091 via `dispatch send --queue` (non-blocking). -- Verified: full-graph `tsc -b` EXIT 0, biome clean (328 files), **1283 vitest + - 199 transport bun** pass (1 pre-existing `tool-shell` failure unrelated). -- **LIVE-VERIFIED** against dev stack (`bin/up`): 11/11 workspace checks pass — - create-on-miss, rename, set default-cwd, invalid-slug 400, unknown 404, delete- - default 409, chat with workspaceId stamps conversation, workspace filter, cwd - inheritance (null = inheriting), delete cascade (closedCount:1, workspace→404). -- `dist/` rebuilt for FE (wire + transport-contract + kernel .d.ts contain Workspace - types). FE agent 4091 notified twice (handoff + dist-ready). - -## Open items -- **`prefix.fingerprint` / `warm|real` cache-bust attributes (deferred):** decoupled - from dedup by the content-addressed decision; also gated on cache-warming being - built (not yet) so `warm|real` can't be honestly stamped. Later cache-bust-debug - milestone (`notes/observability-design.md` §3.1, §12). -- **D9 analytics roll-ups (deferred):** rollup table shape + `GROUP BY` indexes + - retention asymmetry + periodic rollup job (`notes/observability-design.md` §2 D9, - §12). The scheduler mechanism (`host.scheduler.register`) already exists. -- **D8 `prompt.assembly` segments:** deferred-by-design (await the context-filter - chain). -- **In-memory state persistence (message queue + todo list):** both the message - queue and the todo list are in-memory only (`Map<conversationId, …>` in the - extension's `activate`). Neither persists across server restarts. 
If persistence - is needed later, both would write through `host.storage` (the conversation-store - pattern: separate key space per feature, append/write per conversation). - -## Roadmap -1. **Web frontend** (in progress, SEPARATE repo `../frontend`; Svelte + - DaisyUI, same methodology). Slice 2 = browser chat MVP consuming the - wire/transport-contract + metrics. Cross-repo contract changes are couriered - via the user (ORCHESTRATOR §7); `lsp references` does not span repos. -2. **Message queue — close-with-queued-messages (deferred product decision):** - if a client closes a conversation (`POST /conversations/:id/close`) while the - queue is non-empty, the carry currently still fires (starts a new turn on the - closed conversation). Decide: does closing discard pending steering, or honor - it? If "discard," gate the carry on `finishReason !== "aborted"` in - session-orchestrator (one-line). No FE action either way. -3. **FE: consume `GET /conversations/:id/status` for crash-recovery re-sync.** - Backend endpoint shipped: returns `{ conversationId, isActive, status }` where - `isActive` is the orchestrator's in-memory truth and `status` is the persisted - lifecycle status. On reconnect (WS re-establish or page reload), the FE should - call this for any tab it believes is "generating"; if `isActive: false`, - override the local spinner to idle regardless of the persisted `status` - (defense-in-depth against status drift the boot-sweep didn't catch). - -(Done and dropped from the list: CLI; dedup / storage growth; message queue + -steering injection; CLI open-tab handoff; `todo` tool; `web_search` tool; tab -persistence across devices; conversation compacting; live-verify steering flow.) - -## Stop generation must abort a hanging tool + not brick the conversation (DONE) -FE courier in: "Stop generation doesn't abort a hanging tool call." When the user clicks Stop during -a tool that hangs (e.g. 
`run_shell` with a blocking/grandchild-holding process), the turn never -sealed → the FE spinner spun forever AND the conversation was bricked (next `chat.send` rejected as -`"already-active"` because `activeTurns` was never cleared). -- **Root cause:** the kernel's `executeToolCall` awaited `tool.execute(...)` with **no race against - the abort signal** — a tool that ignored `ctx.signal` (or blocked on something it couldn't - interrupt) blocked `drain` → `runTurn` never returned → session-orchestrator's `finally` (which - clears `activeTurns`) never ran. (The `/stop` endpoint, `stopTurn`, and the `finally` cleanup were - already correct — they just needed `runTurn` to return.) Secondary: `realSpawn` resolved on - `child.on("close")` (waits for stdio) and killed only the immediate child, so a grandchild holding - the pipes could stall the spawn promise + leak. -- [x] **kernel** — `executeToolCall` now **races** `tool.execute` against `signal` via `Promise.race`; - on abort it **resolves** (not rejects) `{ content: "Aborted", isError: true }` so the step completes - normally → kernel's existing `signal.aborted → finishReason "aborted"` path runs → turn seals - cleanly (`done` + `turn-sealed`) → `finally` clears `activeTurns` → **conversation freed, next - message accepted**. Late rejections from the orphaned tool promise are swallowed. 11 tests incl. - the durability test (hanging tool `new Promise(() => {})` + abort → `runTurn` returns - `finishReason "aborted"`, doesn't hang). Report: `reports/kernel-abort-race.md`. -- [x] **tool-shell** — `realSpawn` spawns `detached: true` (own process group); on abort **and** - timeout kills the **group** (`process.kill(-pgid, "SIGKILL")`) AND resolves immediately (no - `close`-dependency) so a grandchild holding the pipes can't stall the spawn or leak. 4 tests - (grandchild abort, grandchild timeout, normal-completion stdout capture, simple abort). Report: - `reports/tool-shell-process-group-kill.md`. 
-- [x] Verified: `tsc -b` EXIT 0, biome clean, **1326 vitest** pass; both in-lane; kernel zero - internal mocks. -- [x] **Live-verified** (fresh `bin/up`): start a hanging tool (`run_shell` sleep/grandchild), - Stop, then send a NEW message → it must be ACCEPTED (conversation not bricked) and the spinner - clears. - -## System prompt builder — template-based system context (DONE) -Design: `notes/system-prompt-design.md`. FE courier: `frontend-system-prompt-handoff.md`. -Problem: no system prompt was sent to the provider for regular turns (the messages array -started with the user message; `providerOpts.systemPrompt` was never set). This adds a -template-based system prompt builder with variable placeholders (`[type:name]`) and -conditionals (`[if]`/`[else]`/`[endif]`). -- **Cache constraint (critical):** the system prompt is constructed ONCE (first turn of - a new conversation) and persisted. Reused on all subsequent turns (no reconstruction — - cache-safe). Reconstructed only on **compaction** (fresh variable resolution + compaction - instructions appended). -- **Variable types:** `system:time/date/os/hostname`, `prompt:cwd/model/conversation_id`, - `git:branch/status`, `file:<path>` (dynamic — any path). -- **Wave 0 (orchestrator, contracts):** `@dispatch/transport-contract` `0.17.0→0.18.0` — - `SystemPromptTemplateResponse`, `SetSystemPromptTemplateRequest`, `SystemPromptVariable`, - `SystemPromptVariablesResponse`. -- **Wave 1 — `system-prompt` (NEW ext):** pure parser (29 tests) + variable resolver - (injected adapters, 12 tests) + catalog (3 tests) + service handle (`construct` + - `get` + `getTemplate` + `setTemplate`, 8 tests). 52 tests total. Default template: - persona + AGENTS.md if exists + cwd. -- **Wave 2 (parallel):** `session-orchestrator` (wire service: construct on first turn, - get on subsequent, construct+append on compaction; 12 tests) + `transport-http` - (GET/PUT `/system-prompt`, GET `/system-prompt/variables`; 6 tests). 
-- **Wave 3 — host-bin:** registered `system-prompt` in `CORE_EXTENSIONS`. -- [x] Verified: `tsc -b` EXIT 0, biome clean, **1396 vitest** pass. -- [x] Live-verified (boot smoke: extension activates, `GET /system-prompt` returns default - template, `GET /system-prompt/variables` returns catalog). -- [x] **FE courier** sent to FE agent `ffe3`: `frontend-system-prompt-handoff.md`. |
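The `[type:name]` placeholder substitution and the construct-once cache constraint described above can be sketched as follows. This is a loose illustration only: `renderTemplate`, `Resolver` and `SystemPromptCache` are hypothetical names, unknown placeholders are left untouched by assumption, and the `[if]`/`[else]`/`[endif]` conditionals are deliberately omitted because their condition syntax is not given here.

```typescript
// One resolver per variable type (e.g. "system", "prompt", "git", "file").
type Resolver = (name: string) => string | undefined;

// Matches [type:name]; `name` may be a path, so it allows anything but `]` and whitespace.
const PLACEHOLDER = /\[([a-z]+):([^\]\s]+)\]/g;

function renderTemplate(template: string, resolvers: Readonly<Record<string, Resolver>>): string {
  return template.replace(PLACEHOLDER, (whole: string, type: string, name: string) => {
    const value = resolvers[type]?.(name);
    // Assumption: an unresolvable placeholder is left as written.
    return value ?? whole;
  });
}

// Construct once per conversation, then reuse the persisted text verbatim so
// the provider's prompt cache is not invalidated by changing variables.
class SystemPromptCache {
  private readonly persisted = new Map<string, string>();

  get(conversationId: string): string | undefined {
    return this.persisted.get(conversationId);
  }

  // Also the path compaction would take: fresh resolution, overwrite.
  construct(conversationId: string, template: string, resolvers: Readonly<Record<string, Resolver>>): string {
    const prompt = renderTemplate(template, resolvers);
    this.persisted.set(conversationId, prompt);
    return prompt;
  }

  getOrConstruct(conversationId: string, template: string, resolvers: Readonly<Record<string, Resolver>>): string {
    return this.get(conversationId) ?? this.construct(conversationId, template, resolvers);
  }
}
```

Because `getOrConstruct` never re-renders an existing entry, a time-dependent variable such as `[system:time]` keeps its first-turn value until `construct` is called again.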
