1 file changed, 194 insertions, 77 deletions
diff --git a/README.md b/README.md
index de95ed0..cfeed90 100644
--- a/README.md
+++ b/README.md
@@ -8,6 +8,7 @@ in a separate repo and talks to the same typed contracts over HTTP + a surface W
   the Vercel AI SDK for providers.
 - **Architecture:** `kernel → core extensions → standard extensions`. The kernel touches no I/O
   and names no concrete feature; effects live in extensions, injected through typed contracts.
+- **Tests:** 1453 vitest + bun:sqlite integration tests, zero internal mocks on pure-core packages.
 
 ---
 
@@ -25,25 +26,21 @@ bun install
 
 ---
 
-## Deploy the server
+## Quick start (dev)
 
-1. **Create a `.env`** in the repo root (it is gitignored):
+1. **Create a `.env`** in the repo root (gitignored; see `.env.example`):
 
    ```sh
    DISPATCH_API_KEY=sk-...                         # your OpenAI-compatible API key (the secret)
    DISPATCH_BASE_URL=https://opencode.ai/zen/go/v1 # the provider base URL
    DISPATCH_MODEL=deepseek-v4-flash                # default model when a request omits one
-   BACKEND_PORT=24203                              # port the HTTP server listens on
-   FRONTEND_PORT=24204                             # reserved for the future web UI
+   BACKEND_PORT=24203                              # HTTP server port (dev default)
+   SURFACE_WS_PORT=24205                           # surface WebSocket port (dev default)
 
    # Optional — Umans AI Coding Plan provider (https://code.umans.ai)
    UMANS_API_KEY=sk-...                            # if set, the "umans" provider is registered
-   # UMANS_BASE_URL=https://api.code.umans.ai/v1   # override the default base URL
-   # UMANS_MODEL=umans-coder                       # default model (umans-coder|umans-kimi-k2.7|umans-glm-5.2|umans-flash)
    ```
 
-   Bun auto-loads `.env`. (If your shell also needs the vars: `set -a; source .env; set +a`.)
-
 2. **Boot the server:**
 
    ```sh
@@ -51,16 +48,11 @@ bun install
    ```
 
    It loads config, activates every extension through the host, and serves HTTP on
-   `BACKEND_PORT`. It also spawns and supervises an out-of-process **observability collector**
-   (restart-on-crash, drain-on-shutdown) and writes a structured journal to `.dispatch/journal/`
-   plus a trace database. A collector failure never crashes the server.
-
-   ```
-   Dispatch listening on http://localhost:24203
-   ```
-
-   It also serves a **surface WebSocket** on `:24205` (the `transport-ws` extension) — the channel
-   the web frontend uses to discover and render backend-declared *surfaces*.
+   `BACKEND_PORT` (default 24203). It also spawns and supervises an out-of-process
+   **observability collector** (restart-on-crash, drain-on-shutdown) and writes a structured
+   journal to `.dispatch/journal/` plus a trace database. A collector failure never crashes the
+   server. A **surface WebSocket** on `SURFACE_WS_PORT` (default 24205) carries live updates
+   to connected frontends.
 
 3. **Smoke-test it:**
 
@@ -71,15 +63,87 @@ bun install
    # one turn (NDJSON stream of events back); X-Conversation-Id header threads multi-turn
    curl -s -X POST localhost:24203/chat \
      -H 'content-type: application/json' \
-     -d '{"model":"opencode/deepseek-v4-flash","message":"Say hello in 3 words."}'
+     -d '{"conversationId":"c1","message":"Say hello in 3 words."}'
    ```
 
-### HTTP API (for any client)
+---
+
+## Deploy as a systemd service (Arch Linux)
+
+`bin/install` builds the binaries + frontend, installs them system-wide, and sets up a
+systemd service.
+
+```sh
+sudo bin/install              # build + install + enable + start
+sudo bin/install --no-build   # install only (skip the build step)
+sudo bin/install --uninstall  # stop + disable + remove files (keeps config + data)
+```
+
+**What it installs:**
+
+| Path | Description |
+|---|---|
+| `/usr/bin/dispatch-server` | Standalone backend binary (Bun compile) |
+| `/usr/bin/dispatch` | Standalone CLI binary (Bun compile) |
+| `/usr/share/dispatch/web/` | Built frontend static files |
+| `/etc/dispatch/env` | Server config (systemd EnvironmentFile) |
+| `/etc/systemd/system/dispatch.service` | systemd unit |
+| `/var/lib/dispatch/` | Data directory (SQLite DBs) |
+| `/var/log/dispatch/` | Journal + trace logs |
+
+The production config (`systemd/dispatch.env`) uses ports **24991** (HTTP) and **24990**
+(surface WS), distinct from the dev defaults (24203/24205). After install:
+
+```sh
+systemctl status dispatch
+journalctl -u dispatch -f          # live logs
+curl -s localhost:24991/health     # → {"ok":true}
+```
+
+`bin/sync-env` updates the API keys in `/etc/dispatch/env` without touching the ports.
+`bin/setup-env` is the interactive first-time setup (prompts for keys, writes the env file).
+
+---
+
+## HTTP API
 
 | Method & path | Body / params | Returns |
 |---|---|---|
-| `GET /models` | — | `{ "models": ["opencode/<model>", ...] }` — the model catalog |
-| `POST /chat` | `{ conversationId?, message, model?, cwd? }` | NDJSON stream of `AgentEvent`s; resolved id in the `X-Conversation-Id` header |
+| `GET /health` | — | `{ "ok": true }` |
+| `GET /models` | — | `{ "models": ["opencode/<model>", ...] }` — the catalog |
+| `POST /chat` | `{ conversationId?, message, model?, cwd?, reasoningEffort? }` | NDJSON stream of `AgentEvent`s; resolved id in the `X-Conversation-Id` header |
+| `POST /chat/warm` | `{ conversationId, model?, cwd? }` | Cache-warming result (tokens + cache %) |
+| `GET /conversations` | `?q=<prefix>&status=<active\|idle\|closed>&workspaceId=<id>` | Conversation list |
+| `GET /conversations/:id` | `?sinceSeq=<n>&beforeSeq=<n>&limit=<n>` | Conversation history (chunk log) |
+| `GET /conversations/:id/status` | — | `{ conversationId, isActive, status }` |
+| `GET /conversations/:id/last` | — | Last assistant message (blocks until turn settles) |
+| `GET /conversations/:id/metrics` | — | Per-turn + per-step token/timing metrics |
+| `GET /conversations/:id/cwd` | — | Persisted working directory |
+| `PUT /conversations/:id/cwd` | `{ cwd, workspaceId? }` | Set working directory |
+| `DELETE /conversations/:id/cwd` | — | Clear working directory |
+| `GET /conversations/:id/model` | — | Persisted model selection |
+| `PUT /conversations/:id/model` | `{ model }` | Set model selection |
+| `GET /conversations/:id/reasoning-effort` | — | Persisted reasoning effort |
+| `PUT /conversations/:id/reasoning-effort` | `{ reasoningEffort }` | Set reasoning effort |
+| `GET /conversations/:id/lsp` | — | LSP server status for the conversation's cwd |
+| `POST /conversations/:id/queue` | `{ message }` | Enqueue a steering message |
+| `POST /conversations/:id/stop` | — | Stop generation (aborts in-flight turn) |
+| `POST /conversations/:id/close` | — | Close conversation (aborts turn, marks closed) |
+| `POST /conversations/:id/open` | — | Signal frontend to open a tab |
+| `PUT /conversations/:id/title` | `{ title }` | Set conversation title |
+| `POST /conversations/:id/compact` | — | Compact history (non-destructive fork + summary) |
+| `GET /conversations/:id/compact-percent` | — | Auto-compaction threshold |
+| `PUT /conversations/:id/compact-percent` | `{ percent }` | Set auto-compaction threshold |
+| `GET /workspaces` | — | Workspace list |
+| `PUT /workspaces/:id` | `{ title?, defaultCwd? }` | Create/update workspace |
+| `GET /workspaces/:id` | — | Workspace detail |
+| `PUT /workspaces/:id/title` | `{ title }` | Rename workspace |
+| `PUT /workspaces/:id/default-cwd` | `{ defaultCwd }` | Set workspace default cwd |
+| `DELETE /workspaces/:id` | — | Delete workspace (closes conversations, reassigns to default) |
+| `GET /system-prompt` | — | Current system prompt template |
+| `PUT /system-prompt` | `{ template }` | Set system prompt template |
+| `GET /system-prompt/variables` | — | Available template variables |
+| `GET /metrics/throughput` | — | Aggregate throughput metrics |
 
 The request/response shapes are the `@dispatch/transport-contract` package — import it to build
 any new frontend.
@@ -89,30 +153,33 @@ any new frontend.
 ## Use the CLI
 
 The CLI (`packages/cli`) is a one-shot HTTP client of the server above, so **the server must be
-running** (the CLI reads `BACKEND_PORT`, or pass `--server <url>`). Run it via:
+running** (the CLI reads `BACKEND_PORT`, or pass `--server <url>`).
 
-```sh
-bun run dispatch -- <args>
-# or directly:
-bun packages/cli/src/main.ts <args>
 ```
-
-### Commands
-
-```
-dispatch models [--server <url>]
-dispatch <modelName> --text "..." [--file <path>] [--cwd <dir>] [--conversation <id>] [--server <url>] [--show-reasoning]
-dispatch --help
+Usage:
+  dispatch models [--server <url>]
+  dispatch list [<prefix>] [--status <active|idle|closed>] [--all] [--server <url>]
+  dispatch stop <conversationId> [--server <url>]
+  dispatch compact <conversationId> [--server <url>]
+  dispatch read <conversationId> [--server <url>]
+  dispatch open <conversationId> [--server <url>]
+  dispatch send <conversationId> --text "..." [--queue] [--open] [--cwd <dir>] [--effort <level>] [--workspace <id>] [--server <url>]
+  dispatch <modelName> --text "..." [--file <path>] [--cwd <dir>] [--conversation <id>] [--effort <level>] [--workspace <id>] [--server <url>] [--show-reasoning] [--open]
+  dispatch --help
+
+Effort levels: low, medium, high (default), xhigh, max
 ```
 
 - **`<modelName>`** is a **model name** in `<credentialName>/<model>` form — exactly a line from
   `dispatch models` (e.g. `opencode/deepseek-v4-flash`).
-- **`--text`** and/or **`--file`** supply the message (at least one is required); `--file` folds
-  the file's contents into the message.
-- **`--cwd`** sets the working directory for tools this turn (defaults to the current directory).
-- **`--conversation <id>`** continues a prior conversation; each turn prints its id so you can
-  pass it back.
-- **`--show-reasoning`** also prints the model's reasoning stream (hidden by default).
+- **`send <id> --text "..."`** sends a message to an existing conversation (`--queue` for
+  non-blocking enqueue, `--open` to signal the frontend to open a tab).
+- **`read <id>`** blocks until the turn settles, then prints the last assistant message.
+- **`list`** shows conversations (short ID + title + activity); `--status` filters, `--all`
+  includes closed.
+- **`compact <id>`** manually compacts a conversation's history.
+- **`--effort`** sets reasoning effort for the turn (low|medium|high|xhigh|max; default high).
+- **`--workspace <id>`** scopes the conversation to a workspace.
 
 ### Examples
 
@@ -126,40 +193,38 @@ bun run dispatch -- opencode/deepseek-v4-flash --text "Say hello in 3 words."
 # let the model read a file in a given directory (uses the read_file tool, contained to --cwd)
 bun run dispatch -- opencode/deepseek-v4-flash --cwd ./src --text "Read main.ts and summarize it."
 
-# attach a file's contents to your message
-bun run dispatch -- opencode/deepseek-v4-flash --file notes.md --text "Summarize this."
-
 # continue a conversation (id is printed after each turn)
 bun run dispatch -- opencode/deepseek-v4-flash --conversation <id> --text "and in French?"
+
+# list conversations, read the last reply, send a queued message
+bun run dispatch -- list
+bun run dispatch -- read <id>
+bun run dispatch -- send <id> --text "follow up" --queue --open
 ```
 
 ---
 
 ## Web frontend (dispatch-web)
 
-The web UI is a **separate repo** at `../dispatch-web` (Svelte 5 + Vite), built to the same
-methodology and consuming the backend's typed contracts. As of slice 1 it renders the backend's
-**surface system** (e.g. the live "Loaded Extensions" surface); chat UI is a later slice.
+The web UI is a **separate repo** at `../dispatch-web` (Svelte 5 + Vite + DaisyUI), built to the
+same methodology and consuming the backend's typed contracts (`@dispatch/wire`,
+`@dispatch/transport-contract`, `@dispatch/ui-contract`). The browser chat MVP is in progress — it
+streams turns over the chat WebSocket, renders the surface system (loaded extensions, cache-warming
+controls, message queue, todo list), and supports conversation lifecycle, workspaces, LSP status,
+and per-conversation settings.
 
-It needs **this server running** — it connects to the surface WebSocket on `:24205`. To run both:
+**Run both at once with live reload:**
 
 ```sh
-# terminal 1 — backend (this repo)
-bun run dev                       # HTTP :24203 + surface WS :24205
-
-# terminal 2 — frontend (sibling repo)
-cd ../dispatch-web
-bun install                       # links @dispatch/ui-contract via a file: dep to this repo
-bun run dev                       # Vite dev server on http://localhost:24204
+cd /home/tradam/projects/dispatch
+bin/up      # backend (bun --watch :24203 + WS :24205) + frontend (vite HMR :24204)
 ```
 
-Then open **http://localhost:24204**. See `../dispatch-web/README.md` for full setup, including
-visiting over a LAN / Tailscale.
+`bin/up2` starts a second, stable stack on ports 25203/25205/25204 with isolated data; it runs
+alongside `bin/up` without interference. Both stacks stop cleanly on Ctrl-C (including the collector child).
 
-**Or run both at once with live reload:** `bin/up` (also `bun run dev:all`) starts the backend
-(`bun --watch`) and the frontend (Vite HMR) together and **Ctrl-C stops both** — including the
-backend's observability collector. (The backend reloads via a full process restart; the frontend
-hot-reloads in place.)
+Then open **http://localhost:24204** (or your Tailscale hostname). See `../dispatch-web/README.md`
+for full setup.
 
 ---
 
@@ -170,20 +235,45 @@ hot-reloads in place.)
 The **Depends on** column is each extension's manifest `dependsOn` (other extensions, resolved
 topologically at activation). Every extension also depends implicitly on the kernel ABI.
 
-| Package | Tier | Description | Depends on |
-|---|---|---|---|
-| **kernel** | kernel | The minimal runtime core — contracts (the ABI), the extension host, the turn loop (`runTurn`), and the event/hook/service bus; touches no I/O and names no concrete feature. | — |
-| **storage-sqlite** | core | Concrete `bun:sqlite` backend behind the kernel's storage interface (a host bootstrap dependency; its `activate` is an intentional no-op). | — |
-| **auth-apikey** | core | Resolves an API key (the secret) from the environment into `ApiKeyCredentials` for a provider to consume. | — |
-| **credential-store** | core | Owns named **credentials** and the **model catalog** — resolves a `<credential>/<model>` model name to a provider + model and aggregates `GET /models`. | — |
-| **provider-openai-compat** | core | Wraps an OpenAI-compatible LLM backend (streaming chat + `listModels`); the OpenCode Go path, holding opencode-go specifics for now. | auth-apikey |
-| **conversation-store** | core | Append-only persistence of the turn/chunk log, with a pure `reconcile` that repairs any interrupted turn on load. | — |
-| **session-orchestrator** | core | Drives one turn end-to-end: load history → resolve provider/model/tools → call `runTurn` → persist. | conversation-store, credential-store |
-| **transport-http** | core | Hono HTTP transport exposing `POST /chat` (NDJSON event stream) and `GET /models` (the catalog). | credential-store, session-orchestrator |
-| **tool-read-file** | standard | A `read_file` tool with offset/limit pagination and two-layer workdir containment, honoring the per-turn `cwd`. | — |
-| **surface-registry** | standard | In-process registry where extensions contribute UI **surfaces** (frontend-agnostic data); exposes a typed `surfaceRegistryHandle` service. | — |
-| **transport-ws** | standard | WebSocket transport (`:24205`) serving the surface catalog + per-surface subscribe / update / invoke to clients. | surface-registry |
-| **surface-loaded-extensions** | standard | Contributes the live "Loaded Extensions" surface (a `stat` per activated extension) — the first real surface. | surface-registry |
+#### Kernel (not an extension)
+
+| Package | Description |
+|---|---|
+| **kernel** | The minimal runtime core — contracts (the ABI), the extension host, the turn loop (`runTurn`), and the event/hook/service bus; touches no I/O and names no concrete feature. |
+
+#### Core extensions (minimum to complete one turn end-to-end)
+
+| Package | Description | Depends on |
+|---|---|---|
+| **storage-sqlite** | Concrete `bun:sqlite` backend behind the kernel's storage interface (a host bootstrap dependency; its `activate` is an intentional no-op). | — |
+| **auth-apikey** | Resolves an API key (the secret) from the environment into `ApiKeyCredentials` for a provider to consume. | — |
+| **credential-store** | Owns named **credentials** and the **model catalog** — resolves a `<credential>/<model>` model name to a provider + model and aggregates `GET /models`. | — |
+| **provider-openai-compat** | Wraps an OpenAI-compatible LLM backend (streaming chat + `listModels`); the OpenCode Go path. | auth-apikey |
+| **conversation-store** | Append-only persistence of the turn/chunk log, with a pure `reconcile` that repairs any interrupted turn on load. | — |
+| **session-orchestrator** | Drives one turn end-to-end: load history → resolve provider/model/tools → call `runTurn` → persist. | conversation-store, credential-store |
+| **transport-http** | Hono HTTP transport exposing `POST /chat` (NDJSON event stream), `GET /models`, and the full conversation/workspace/LSP/system-prompt API. | session-orchestrator |
+
+#### Standard extensions (shipped on-by-default)
+
+| Package | Description | Depends on |
+|---|---|---|
+| **tool-read-file** | `read_file` tool with offset/limit pagination, directory listing, and workdir containment. | — |
+| **tool-shell** | `run_shell` tool — foreground, streamed output, process-group kill on abort/timeout. | — |
+| **tool-edit-file** | `edit_file` tool — exact-string replacement, `replaceAll` flag, workdir-contained. | — |
+| **tool-write-file** | `write_file` tool — explicit `overwrite` flag, no parent auto-create. | — |
+| **tool-web-search** | `web_search` tool — Firecrawl-backed (search, scrape, crawl, map). | — |
+| **tool-youtube-transcript** | `youtube_transcript` tool — fetches transcripts from a transcriber service. | — |
+| **todo** | `todo_write` tool — per-conversation task list with a surface. | surface-registry |
+| **skills** | `load_skill` tool + per-turn tools filter that rewrites the skill list per cwd. | session-orchestrator |
+| **system-prompt** | Template-based system prompt builder with variable placeholders (`[type:name]`) and conditionals. | — |
+| **cache-warming** | Per-conversation prompt-cache warming timers + manual trigger (`POST /chat/warm`) + a surface. | session-orchestrator, surface-registry |
+| **message-queue** | Per-conversation steering queue + surface; enqueue when idle starts a new turn. | surface-registry |
+| **lsp** | Language Server Protocol client (hand-rolled JSON-RPC over stdio) — lazy-spawn, per-cwd config, diagnostics, `lsp` tool. | — |
+| **provider-umans** | Umans OpenAI-compatible provider (`api.code.umans.ai`); self-contained (reads `UMANS_API_KEY` from env). | — |
+| **surface-registry** | In-process registry where extensions contribute UI **surfaces** (frontend-agnostic data); exposes a typed `surfaceRegistryHandle` service. | — |
+| **transport-ws** | WebSocket transport serving the surface catalog + per-surface subscribe / update / invoke to clients. | surface-registry |
+| **surface-loaded-extensions** | Contributes the live "Loaded Extensions" surface (a `stat` per activated extension). | surface-registry |
+| **throughput-store** | Aggregate throughput metrics storage + `GET /metrics/throughput`. | — |
 
 ### Supporting packages (not extensions)
 
@@ -191,13 +281,15 @@ The **Depends on** column is each package's `@dispatch/*` workspace dependencies
 
 | Package | Description | Depends on |
 |---|---|---|
-| **transport-contract** | Types-only description of the HTTP API (`ChatRequest`, `ModelsResponse`, `AgentEvent`) shared by the server and every client. | kernel |
-| **ui-contract** | Types-only, **frontend-agnostic** vocabulary for backend-declared **surfaces** (`SurfaceSpec`, field kinds, the surface WS protocol) — shared by the backend and any client (web, CLI). | — |
+| **wire** | Types-only wire ABI (`AgentEvent` + conversation model + `Usage`); kernel + transport-contract re-export it so clients consume the wire without the kernel runtime. | — |
+| **transport-contract** | Types-only description of the full HTTP API shared by the server and every client. | wire, kernel |
+| **ui-contract** | Types-only, **frontend-agnostic** vocabulary for backend-declared **surfaces** (`SurfaceSpec`, field kinds, the surface WS protocol). | — |
+| **openai-stream** | Generic OpenAI-compatible stream/convert/listModels library extracted from `provider-openai-compat` (shared by `provider-umans`). | wire |
 | **cli** | The bundled one-shot terminal client documented above. | transport-contract |
 | **host-bin** | The composition root: loads config, activates all extensions through the host, serves HTTP, and supervises the observability collector. | kernel, all extensions, journal-sink |
 | **journal-sink** | Bootstrap `LogSink` that appends structured logs/spans to an NDJSON journal (rotation, fail-safe). | kernel |
 | **observability-collector** | Out-of-process binary that tails the journal and inserts records into the trace store (idempotent, at-least-once). | kernel, trace-store |
-| **trace-store** | `bun:sqlite` store for trace records/bodies, plus a `trace` CLI to render a turn's timeline. | kernel |
+| **trace-store** | `bun:sqlite` store for trace records/bodies (content-addressed dedup + retention), plus a `trace` CLI. | kernel |
 | **trace-replay** | Generic HTTP-exchange record/replay library for hermetic, network-free provider tests. | — |
 
 ---
@@ -208,7 +300,26 @@ The **Depends on** column is each package's `@dispatch/*` workspace dependencies
 bun run typecheck   # tsc -b --pretty
 bun run test        # vitest (pure/unit + integration)
 bun run test:bun    # bun:sqlite-backed tests
+bun run test:all    # both test suites
 bun run check       # biome (lint + format)
+bun run check:fix   # biome --write (auto-fix)
+```
+
+### Dev stacks
+
+| Script | Backend | Frontend | Notes |
+|---|---|---|---|
+| `bin/up` | `:24203` + WS `:24205` (`bun --watch`) | `:24204` (vite HMR) | Full restart on backend change; loads `../claude` as an external extension |
+| `bin/up2` | `:25203` + WS `:25205` (no watch) | `:25204` (vite preview) | Stable second stack; isolated data; runs alongside `bin/up` |
+
+### Workspace layout
+
+```
+/home/tradam/projects/dispatch/
+├── dispatch-backend/   this repo (branch dev)
+├── dispatch-web/       separate repo — web frontend (Svelte + DaisyUI)
+├── claude/             separate repo — Claude provider-anthropic extension
+└── bin/                shared dev scripts (up, up2)
 ```
 
 ---
@@ -220,3 +331,9 @@ bun run check       # biome (lint + format)
 - **Orchestration workflow:** `ORCHESTRATOR.md`
 - **Canonical vocabulary:** `GLOSSARY.md`
 - **Live status / task log:** `tasks.md`
+- **Observability design:** `notes/observability-design.md`
+- **LSP design:** `notes/lsp-design.md`
+- **System prompt design:** `notes/system-prompt-design.md`
+- **Turn continuity design:** `notes/turn-continuity-design.md`
+- **CLI design:** `notes/cli-design.md`
+- **Frontend design:** `notes/frontend-design.md`
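
Note on the `POST /chat` row in the HTTP API table: the README says the endpoint returns an NDJSON stream of `AgentEvent`s with the resolved id in the `X-Conversation-Id` header. A client has to handle JSON lines that are split across network chunks. The sketch below shows one way to do that, under stated assumptions: `NdjsonDecoder` and `streamChat` are hypothetical names, not part of the repo; events are typed as `unknown` rather than importing `AgentEvent` from `@dispatch/transport-contract`; and the runtime is assumed to provide `fetch`, `TextDecoder`, and web streams (Bun, recent Node, or a browser).

```typescript
// Hypothetical helper: incrementally splits an NDJSON byte stream into parsed values.
class NdjsonDecoder {
  private buffer = "";
  private readonly decoder = new TextDecoder();

  // Feed one chunk; returns every complete line received so far, parsed as JSON.
  push(chunk: Uint8Array): unknown[] {
    this.buffer += this.decoder.decode(chunk, { stream: true });
    const lines = this.buffer.split("\n");
    // The last element is "" (chunk ended on a newline) or a partial line to keep.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }

  // Call once at end of stream to parse a final line that had no trailing newline.
  flush(): unknown[] {
    const rest = this.buffer + this.decoder.decode();
    this.buffer = "";
    return rest.trim() === "" ? [] : [JSON.parse(rest) as unknown];
  }
}

// Assumed usage against the dev server (24203 is the README's dev default port).
// Returns the conversation id from the response header, or null if it is absent.
async function streamChat(
  message: string,
  onEvent: (event: unknown) => void,
  baseUrl = "http://localhost:24203",
): Promise<string | null> {
  const res = await fetch(`${baseUrl}/chat`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ message }),
  });
  if (!res.ok || !res.body) throw new Error(`chat failed: HTTP ${res.status}`);

  const decoder = new NdjsonDecoder();
  const reader = res.body.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const event of decoder.push(value)) onEvent(event);
  }
  for (const event of decoder.flush()) onEvent(event);
  return res.headers.get("x-conversation-id");
}
```

Passing the returned id back as `conversationId` in the next request body would continue the same conversation, per the `{ conversationId?, message, ... }` shape listed in the table. Buffering the trailing partial line, rather than parsing each chunk on its own, is what keeps the decoder correct when an event straddles two reads.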