dispatch

Age	Commit message	Author
9 days	fix(kernel): disable MAX_STEPS limit (0 = unlimited)	Adam Malczewski
	Agents were being cut off mid-task at 50 steps. The MAX_STEPS=50 hardcoded limit was silently terminating turns while the model was actively making tool calls, leaving conversations idle with a dangling tool-result as the last chunk. Setting MAX_STEPS to 0 disables the limit — the loop runs until the model stops making tool calls naturally or the abort signal fires. The max-steps code path is preserved for when MAX_STEPS > 0.
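The "0 = unlimited" convention described above can be sketched as a loop guard. This is a minimal illustration, not the kernel's code: `runLoop`, `StepResult`, and the `reason` values are hypothetical names, and only the `maxSteps > 0` check reflects what the commit message states.

```typescript
// Hypothetical sketch of a turn loop where maxSteps = 0 disables the limit.
type StepResult = { madeToolCalls: boolean };
type StopReason = "done" | "aborted" | "max-steps";

function runLoop(
  step: (n: number) => StepResult,
  maxSteps: number,
  isAborted: () => boolean = () => false,
): { steps: number; reason: StopReason } {
  let steps = 0;
  while (true) {
    if (isAborted()) return { steps, reason: "aborted" };
    // The max-steps path is only reachable when maxSteps > 0.
    if (maxSteps > 0 && steps >= maxSteps) return { steps, reason: "max-steps" };
    steps += 1;
    // The loop ends naturally once the model stops making tool calls.
    if (!step(steps).madeToolCalls) return { steps, reason: "done" };
  }
}
```

With a model that keeps calling tools for 60 steps, a limit of 50 cuts the turn short, while a limit of 0 lets it finish.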
9 days	feat(heartbeat): workspace heartbeat loop with configurable AI monitoring	Adam Malczewski

9 days	fix(ssh): POST /computers/:alias/test hangs after successful SSH connect	Adam Malczewski
	The test endpoint's runProbe() waited for the ssh2 stream's 'close' event, which some SSH servers never emit for short-lived exec channels (the command 'true' exits instantly). This caused the promise to hang forever — the HTTP response never returned, and the FE's Test spinner spun indefinitely. Three fixes: 1. runProbe now resolves on the 'exit' event (not 'close') — the command has finished and the exit code is available. 'close' is kept as a fallback. Stream data/stderr are drained to prevent buffer deadlocks. 2. runProbe has a 15s timeout safety net — if the exec callback or 'exit' event never fires (e.g. server requires a pty for exec), the probe resolves false instead of hanging forever. 3. The entire test() method is wrapped in a 30s Promise.race timeout — even if pool.acquire() or pool.drop() hangs, the endpoint ALWAYS responds with { ok, error? }. The probe is fully non-interactive (no blocking prompts). tsc EXIT 0, biome clean, 1756 tests pass.
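The timeout safety nets in fixes 2 and 3 both follow the same Promise.race pattern. The helper below is a minimal sketch of that pattern; `withTimeout` is a hypothetical name and is not taken from the repository.

```typescript
// Hypothetical helper: resolve with `fallback` if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever settles first wins; the timer is cleared so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Under these assumptions, a probe that never emits 'exit' would be wrapped as `withTimeout(probe, 15_000, false)`, and the whole endpoint as `withTimeout(test, 30_000, { ok: false, error: "timed out" })`, so the caller always gets an answer.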
9 days	feat(ssh): discover computers from ~/.ssh/known_hosts + remote system-prompt	Adam Malczewski
	Two improvements to the SSH support feature: 1. KNOWN_HOSTS DISCOVERY (packages/ssh): Computers are now auto-discovered from ~/.ssh/known_hosts (every hostname you've ever connected to) in ADDITION to ~/.ssh/config (explicit Host aliases). Config entries take precedence (full params); known_hosts entries get defaulted params (User=defaultUser, IdentityFile=null→pool probes default keys, Port from [host]:port or 22, knownHost=true). Zero-config — no ~/.ssh/config file needed; hosts just appear. Reject list: dispatch.toml [ssh].reject = [...] (glob patterns like github.com, *.ts.net) filters noise from the catalog. Read from both the global ~/.config/dispatch/dispatch.toml and the project dispatch.toml. Parsed with Bun.TOML.parse (zero deps). Only filters discovery (catalog); specific lookups (getComputer/getStatus/test/connect) ignore the reject list (it's a visibility filter, not access control). New pure functions: parseKnownHosts(), isRejected(), globMatch(). +26 tests. tsc EXIT 0, biome clean, 1756 tests pass. 2. REMOTE SYSTEM-PROMPT AWARENESS (packages/system-prompt): When a conversation has a computerId set (remote turn), the system prompt now resolves system:os, system:hostname, git:branch/git:status, and file: reads against the REMOTE machine — not the local host. Previously the prompt always said 'Arch Linux (WSL)' + local hostname even when the agent was connected to a remote Artix Linux machine. The ResolverAdapters' hostname()/platform() are now async (so a remote adapter can run 'hostname'/'uname -s' over SSH). The system-prompt extension builds remote adapters from the ExecBackend (readFile→SFTP, spawn→SSH exec). Cache invalidation now checks computerId (switching computers rebuilds the prompt). The compaction path also threads computerId. @dispatch/system-prompt now depends on @dispatch/exec-backend.
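To make the discovery and reject-list behaviour concrete, here is a hedged sketch of the three pure functions named above. The commit gives only their names; the signatures, the `KnownHost` shape, and the handling of hashed entries are assumptions made for illustration.

```typescript
// Hypothetical shapes: the commit names these functions but not their signatures.
interface KnownHost { host: string; port: number }

// Match a hostname against a glob where "*" is any run of characters and "?" is one character.
function globMatch(pattern: string, host: string): boolean {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`, "i").test(host);
}

function isRejected(host: string, rejectPatterns: string[]): boolean {
  return rejectPatterns.some((pattern) => globMatch(pattern, host));
}

// Extract host/port pairs from known_hosts text; the port comes from "[host]:port", else 22.
function parseKnownHosts(text: string): KnownHost[] {
  const seen = new Map<string, KnownHost>();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    // Skip blanks, comments, and @cert-authority / @revoked marker lines.
    if (line === "" || line.startsWith("#") || line.startsWith("@")) continue;
    const field = line.split(/\s+/)[0] ?? "";
    for (const name of field.split(",")) {
      // Hashed entries ("|1|...") hide the hostname, so they cannot be listed.
      if (name === "" || name.startsWith("|")) continue;
      const bracketed = /^\[(.+)\]:(\d+)$/.exec(name);
      const entry: KnownHost = bracketed
        ? { host: bracketed[1] ?? "", port: Number(bracketed[2]) }
        : { host: name, port: 22 };
      seen.set(`${entry.host}:${entry.port}`, entry);
    }
  }
  return Array.from(seen.values());
}
```

The reject list is applied only when building the catalog, for example `parseKnownHosts(text).filter((h) => !isRejected(h.host, reject))`; a direct lookup by alias would skip that filter.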
9 days	docs(tasks): mark FE final sync check GREEN - all 3 handoffs + cross-cutting verified	Adam Malczewski
	FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build OK, git clean). All three handoffs GREEN with no integration gaps:
	- provider-retry: yellow alert-warning bubble renders w/ countdown.
	- SSH #1 wire types: defaultComputerId + Computer/ComputerEntry resolve.
	- SSH #2 computer API: full src/features/computer/ feature wired + typecheck-clean.
	Cross-cutting verified: provider-retry is WS-stream (TranscriptState.providerRetry → ChatView), computer is HTTP-only (AppStore.computerId → ComputerField sidebar); the two use disjoint state/channels/regions/mount-keys, so there is no collision. SSH support + provider-retry integration is complete and validated end-to-end on both repos.
10 days	Merge branch 'dev' into feature/ssh-support	Adam Malczewski
	Brings dev's retry-with-backoff (the transient `provider-retry` AgentEvent the web frontend consumes) + the LSP-dead-server per-edit-hang fix into the SSH feature branch, alongside the SSH waves 0-5c. All code files auto-merged cleanly (run-turn.ts, orchestrator.ts, runtime.ts, wire/index.ts, tool-edit-file/extension.ts, run-turn.test.ts — both computerId threading and retry-with-backoff coexist). Only tasks.md conflicted (status section — orchestrator-resolved; both feature sections kept). Verified post-merge: tsc -b EXIT 0, biome clean (391 files), 1730 vitest pass +6 sshd-integration skipped (was 1690; +40 from dev's retry/LSP tests). Wire dist rebuilt so the FE can re-sync the pinned @dispatch/wire dep and pick up BOTH provider-retry AND the SSH Computer/defaultComputerId types. No merge or push (into dev or otherwise).
10 days	Merge branch 'feature/lsp-bugfix' into dev	Adam Malczewski

10 days	fix(lsp): stop per-edit hangs on dead/slow servers (10s cap + skip + self-heal)	Adam Malczewski
	The LSP diagnostics path hung up to 60s per edit whenever a configured Ruby language server was dead or slow (the reported Steep langserver case): a killed/crashed server was never detected (stayed "connected" forever), servers were queried sequentially with a 60s budget each, and a corrupted-but-alive server (Steep's ~3h phantom-SyntaxError drift) had no recovery. Four fixes, all in packages/lsp/ (the tool-edit-file call site lowered to 10s): 1. Dead-process detection: SpawnedProcess.onExit (Bun proc.exited) + stdout-end defence flip the client to error, dispose the rpc, kill the proc. The manager re-spawns a fresh server after the 30s backoff. Dead servers are now skipped (0s) instead of polled for 60s. 2. Concurrent fan-out + 10s hard cap: new aggregateDiagnostics queries all matching servers at once, each capped at 10s. A non-responder is skipped with "LSP took too long (>10s), skipped — raise this to the user" instead of blocking the fast server's results. Replaces the vague "unusually long" warning (now structurally impossible: slow is always false). 3. Corruption self-heal: a detector flags a server re-emitting identical non-empty diagnostics despite the file changing; after 5 repeats the client is marked broken and re-spawned. Clean files never trip it. (Acknowledged false-positive risk on persistent unfixed errors; CLI type-check gate stays authoritative.) 4. sendRequest timeout: hover/definition/references cap at 10s so they can't hang the turn against a dead server; the initialize handshake keeps its 45s race. Verification: typecheck clean; 1573 tests pass (96 files), +15 new LSP tests (86 in packages/lsp); biome clean. No kernel/contract changes; onExit is internal to packages/lsp.
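The corruption self-heal in fix 3 amounts to a small repeat counter. The class below is a sketch of that idea only: `StaleDiagnosticsDetector`, its method name, and the exact reset rules are assumptions, with the threshold of 5 taken from the commit message.

```typescript
// Hypothetical detector: flags a server that keeps re-emitting identical,
// non-empty diagnostics even though the file text keeps changing.
class StaleDiagnosticsDetector {
  private readonly threshold: number;
  private lastText: string | undefined;
  private lastKey: string | undefined;
  private repeats = 0;

  constructor(threshold = 5) {
    this.threshold = threshold;
  }

  /** Record one diagnostics result; returns true once the server should be re-spawned. */
  observe(fileText: string, diagnostics: string[]): boolean {
    const key = JSON.stringify(diagnostics);
    const identical = diagnostics.length > 0 && key === this.lastKey;
    const fileChanged = this.lastText !== undefined && fileText !== this.lastText;
    if (!identical) {
      this.repeats = 0; // clean or different output never counts against the server
    } else if (fileChanged) {
      this.repeats += 1; // same errors after an edit: suspicious
    }
    this.lastText = fileText;
    this.lastKey = key;
    return this.repeats >= this.threshold;
  }
}
```

As the commit notes, a real error that stays unfixed across five edits would also trip this, which is why the CLI type-check remains the authority.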
10 days	chore: remove stale .skills/ORCHESTRATOR.md duplicate	Adam Malczewski
	This was the OLD orchestrator manual (references the retired `opencode run` CLI + `opencode-go/mimo-v2.5-pro`, MVP-era content). The current manual lives at root ORCHESTRATOR.md (references the `dispatch` CLI + umans/umans-glm-5.2). Unrelated housekeeping; split from the retry feature commit.
10 days	feat(kernel): retry-with-backoff on retryable provider errors	Adam Malczewski
	When the upstream LLM API returns a retryable error (HTTP 429 / 5xx "overloaded"), the kernel now retries provider.stream() with a stepped backoff, visibly, until the 8h cumulative-sleep budget is exhausted — then emits the final error and seals the turn. Retries fire only when no content was emitted yet this step (safety invariant: never duplicate partial output). - wire: new transient TurnProviderRetryEvent AgentEvent variant (emitted before each sleep; not persisted to model history). - kernel contracts: RetryStrategy (pure delayFor + injected sleep) + optional retry? on RunTurnInput (omit = no retry, backward-compatible). - kernel run-turn: retry loop in executeStep; providerRetryEvent constructor. Kernel imports no timer (sleep injected). - session-orchestrator: concrete schedule (5s..30m, repeat 30m, 8h budget) + abortable setTimeout sleep, wired into RunTurnInput.retry. tsc -b EXIT 0; biome clean; 1574 vitest pass (+16 new: 11 kernel retry tests with injected fake sleep + pure delayFor, zero @dispatch/* mocks; 5 schedule tests). Transports unchanged (transport-ws forwards AgentEvent verbatim in chat.delta; transport-http is generic JSON.stringify). Plan: notes/retry-with-backoff-plan.md. tasks.md updated with milestone + optional CLI-renderer roadmap follow-up.
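A pure `delayFor` with a cumulative-sleep budget could look like the sketch below. The commit states only the endpoints (5s up to 30m, then repeat 30m, 8h budget); the intermediate steps, the `delayFor` signature, and the rule for when the budget counts as exhausted are assumptions.

```typescript
// Hypothetical RetryStrategy shape; the real contract lives in the kernel package.
interface RetryStrategy {
  /** Delay in ms before retry `attempt` (0-based), or null once the budget is spent. */
  delayFor(attempt: number, sleptMs: number): number | null;
}

const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;

// Assumed intermediate steps; only the 5s start and the 30m ceiling come from the source.
const STEPS_MS = [5 * SECOND, 15 * SECOND, 1 * MINUTE, 5 * MINUTE, 15 * MINUTE, 30 * MINUTE];
const BUDGET_MS = 8 * HOUR;

const steppedBackoff: RetryStrategy = {
  delayFor(attempt, sleptMs) {
    // Past the end of the table, keep repeating the last (30m) step.
    const delay = STEPS_MS[Math.min(attempt, STEPS_MS.length - 1)] ?? 30 * MINUTE;
    // Stop when the next sleep would push cumulative sleep past the budget.
    return sleptMs + delay > BUDGET_MS ? null : delay;
  },
};
```

Because the strategy is pure and the sleep is injected by the caller, a test can walk the whole schedule with a fake sleep and never touch a real timer.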
10 days	feat(ssh): wave 5c - host-bin registers exec-backend + ssh; transport-http barrel	Adam Malczewski
	Wave 5c (final wiring) of transparent SSH support.
	- host-bin: register exec-backend + ssh in CORE_EXTENSIONS (exec-backend before the tool extensions that dependsOn it; ssh after, provides the remote-backend factory + ComputerService at boot). +@dispatch/exec-backend/@dispatch/ssh deps + tsconfig refs.
	- transport-http: CR-5: re-export computerServiceHandle + ComputerService type from the package barrel (src/index.ts), mirroring lsp/mcp handles, so ssh imports the typed symbol cleanly (no more dist/seam.js subpath workaround).
	- orchestrator: added the @dispatch/exec-backend dep the host-bin agent missed + bun install.
	LIVE-VERIFIED: bun packages/host-bin/src/main.ts boots clean ('Dispatch booted', no disabled extensions); exec-backend + ssh + all tool extensions load together. Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (+6 sshd-integration skipped). DEFERRED (CR-6): listComputers usageCount stays 0 until a conversation-store count-by-alias helper is added (non-blocking). Refs: notes/ssh-support-plan.md. No merge or push.
10 days	feat(ssh): wave 5b — the ssh package (remote ExecBackend over ssh2)	Adam Malczewski
	Wave 5b of transparent SSH support. NEW standard extension @dispatch/ssh makes remote execution actually work over SSH, transparently. ssh2 verified to run under Bun (load-bearing decision #1 confirmed: connects to local sshd :22 + execs). - config.ts: ~/.ssh/config reader via ssh-config -> Computer[]/ComputerEntry[] (read-only discovery; resolves hostName/port/user/identityFile/knownHost). - hostkey.ts: known_hosts auto-trust-and-pin (present->verify/reject-on-mismatch, absent->accept+append; the accept-new analog). - errors.ts: pure ssh2/SFTP -> node:fs-style .code error mapping (so tools' existing ENOENT branches work unchanged). - pool.ts: SshConnectionPool (per-alias ssh2.Client, lazy connect, keep-alive, idle reap ~15m); key-only auth from ~/.ssh (config IdentityFile or default id_ed25519/id_rsa); no agent-forwarding, no PTY. - backend.ts: SshExecBackend implements ExecBackend (spawn via client.exec with shell-quoted cwd; fs via SFTP). - service.ts + extension.ts: activate provides BOTH handles the other units consume — remoteExecBackendFactoryHandle (exec-backend: computerId->SshExecBackend) AND computerServiceHandle (transport-http: listComputers/getComputer/getStatus/test). - orchestrator: added packages/ssh to root tsconfig.json refs + bun install. Tests: 45 pass + 6 sshd-integration skipped (it.skipIf(!process.env.SSH_TEST_HOST)). Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (was 1641, +49). CRs for wave 5c: host-bin registration; CR-5 transport-http barrel re-export; CR-6 usageCount wiring (deferred-ok, defaults to 0). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
10 days	feat(ssh): wave 5a — exec-backend remote-backend factory handle	Adam Malczewski
	exec-backend declares remoteExecBackendFactoryHandle (a consumer-defined ServiceHandle<(computerId) => ExecBackend>) that the ssh package will provide (standard→core layering). The resolver's computerId-set branch now lazy-looks-up this factory (at tool-execute time, runtime) and calls it; if ssh isn't loaded, getService throws → a clear 'SSH remote execution is not configured' error. The computerId-undefined (local) branch is byte-identical to before. This is the seam wave 5b (the ssh package) plugs into. +tests for both branches. Verified: tsc -b EXIT 0, biome clean. No merge or push.
10 days	feat(ssh): wave 4 — computer HTTP/WS endpoints + chat computerId threading	Adam Malczewski
	Wave 4 of transparent SSH support (3 parallel owner-agents on disjoint packages). - transport-http: computer routes — GET /computers, GET /computers/:alias, GET /computers/:alias/status, POST /computers/:alias/test (all delegate to a new ComputerService seam, graceful []/disconnected when ssh not loaded); GET/PUT/DELETE /conversations/:id/computer; PUT /workspaces/:id/default-computer (mirror the cwd/default-cwd routes); /chat threads computerId into the orchestrator. Defines ComputerService interface + computerServiceHandle (defineService<ComputerService>('ssh')) in seam.ts — the seam the ssh package provides via host.provideService in wave 5. - transport-ws: chat.send + chat.queue thread computerId onto the route result (mirrors cwd/workspaceId), forwarded to the orchestrator input. - mcp: CR-1 fix — filterMcpTools now preserves computerId on the returned ToolAssembly (mirrors cwd preservation), so the filter chain stays consistent. - orchestrator: added @dispatch/wire dep to transport-http (build/config, my lane) so its seam.ts Computer/ComputerEntry import resolves. Verified: tsc -b EXIT 0, biome clean, 1641 vitest pass (was 1620, +21). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
10 days	feat(ssh): wave 3 - session-orchestrator computerId threading + transport-contract API types	Adam Malczewski
	Wave 3 of transparent SSH support (2 parallel owner-agents on disjoint packages).
	- session-orchestrator: thread computerId end-to-end through the turn, mirroring cwd exactly. StartTurnInput/EnqueueInput/handleMessage/TurnLifecyclePayload gain computerId; runTurnDetached resolves effectiveComputerId via conversationStore.getEffectiveComputer(convId, override), persists the override, threads into RunTurnInput + ToolAssembly. Register a remote-degradation tools-filter (filterRemoteIncompatibleTools) that, when assembly.computerId is set (REMOTE), drops the 'lsp' tool + any '__'-namespaced MCP tool (local processes that can't see remote files); LOCAL (computerId undefined) is a passthrough, byte-identical to today. +21 tests.
	- transport-contract: + computerId on ChatRequest (flows to ChatSendMessage) + computer endpoint API types (ComputerListResponse, ComputerResponse, ComputerStatusResponse, SetConversationComputerRequest, ConversationComputerResponse, SetWorkspaceDefaultComputerRequest, TestComputerResponse), mirroring the cwd/workspace endpoint types.
	- CR-1 (non-blocking, folded into wave 4): MCP filter doesn't preserve computerId on the returned ToolAssembly.
	- cache-warming computerId threading intentionally DEFERRED (user request); noted as a known performance-only limitation in tasks.md.
	Verified: tsc -b EXIT 0, biome clean, 1620 vitest pass (was 1599, +21). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
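The remote-degradation filter described above can be sketched in a few lines. The `ToolAssembly` shape here is a simplified assumption (the real type has more fields); the drop rules follow the commit message.

```typescript
// Simplified, assumed shape of the assembly the filter receives.
interface ToolAssembly {
  tools: { name: string }[];
  cwd?: string;
  computerId?: string;
}

function filterRemoteIncompatibleTools(assembly: ToolAssembly): ToolAssembly {
  // LOCAL turn: pass the assembly through untouched.
  if (assembly.computerId === undefined) return assembly;
  // REMOTE turn: drop tools backed by local processes that cannot see remote files.
  const tools = assembly.tools.filter(
    (tool) => tool.name !== "lsp" && !tool.name.includes("__"),
  );
  // Spreading keeps cwd and computerId on the result so later filters still see them.
  return { ...assembly, tools };
}
```

Returning the same object on the local path keeps that branch trivially identical to the pre-SSH behaviour.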
10 days	feat(ssh): wave 2 — route filesystem/shell tools behind ExecBackend	Adam Malczewski
	Wave 2 of transparent SSH support (4 parallel owner-agents on disjoint tool packages). The tools now resolve an ExecBackend per-call from ctx.computerId and call backend.spawn / backend.readFile / etc. instead of node:fs and node:child_process directly — so they are transport-agnostic (local now; remote over SSH later, transparent to the agent). Still LOCAL-ONLY this wave (computerId always undefined -> LocalExecBackend, behavior-identical). - tool-shell: factory takes resolveBackend; execute calls backend.spawn. spawn.ts DELETED (realSpawn was a verbatim duplicate of exec-backend's LocalExecBackend.spawn — logic moved to the sanctioned shared package). manifest dependsOn:[exec-backend]; host.getService at activation. - tool-read-file: readFile/stat/readdir -> backend.* (pure logic untouched; ENOENT .code branches kept). - tool-write-file: exists/stat/writeFile -> backend.* (pure logic untouched). - tool-edit-file: readFile/writeFile -> backend.* + forward-compatible REMOTE diagnostics skip (ctx.computerId set -> skip LSP, return empty — plan §6.1; local path byte-identical to today). LSP lookup stays lazy. - orchestrator: pre-wired @dispatch/exec-backend dep into the 4 tool package.jsons + bun install (build/config, my lane) so isolated verify resolved cleanly; agents added the ../exec-backend tsconfig ref. Verified: tsc -b EXIT 0, biome clean, 1599 vitest pass (was 1592). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
10 days	feat(ssh): wave 1 — ExecBackend + computer data model + runtime threading	Adam Malczewski
	Wave 1 of transparent SSH support (parallel owner-agents on disjoint packages, plus the orchestrator-authored kernel contract seam from wave 0):
	- packages/wire: + Computer/ComputerEntry (read-only view over ~/.ssh/config Host aliases) + Workspace.defaultComputerId (string|null, null=local). Types only; 3 conformance tests.
	- packages/exec-backend (NEW core extension): the ExecBackend abstraction (spawn + minimal fs surface) the bundled tools will program against instead of node:fs/child_process. LocalExecBackend wraps today's node calls (behavior-identical; node:fs-style .code errors). execBackendHandle + ExecBackendResolver (sync; computerId undefined -> local; set -> throws until the ssh package wires remote resolution in wave 5). 20 tests.
	- packages/kernel (runtime only): thread computerId through dispatch.ts + run-turn.ts exactly as cwd is threaded (opaque, forwarded to ToolExecuteContext; absent = local = byte-identical to today). +2 tests.
	- packages/conversation-store: computer (SSH alias) assignment + resolution mirroring cwd: WorkspaceRow.defaultComputerId + setWorkspaceDefaultComputerId + getComputerId/setComputerId/clearComputerId + getEffectiveComputer (override -> per-conv -> workspace default -> null/local). Fixes the 3 Workspace literal sites the new required wire field broke. +18 tests.
	- orchestrator: root tsconfig.json ref for exec-backend + bun install.
	Verified: tsc -b EXIT 0, biome clean, 1592 vitest pass (was 1549, +43). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
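The resolution ladder for the effective computer (override, then per-conversation, then workspace default, then null for local) reduces to a chain of nullish fallbacks. The function below is a pure stand-in with a hypothetical name and signature; the store method in the commit takes a conversation id and an override and looks the other values up itself.

```typescript
// Hypothetical pure version of the ladder: the first non-null value wins, null means local.
function resolveEffectiveComputer(
  override: string | null | undefined,
  conversationComputerId: string | null | undefined,
  workspaceDefaultComputerId: string | null,
): string | null {
  return override ?? conversationComputerId ?? workspaceDefaultComputerId ?? null;
}
```

For example, `resolveEffectiveComputer(undefined, null, "artix")` falls through to the workspace default, while passing all three as absent yields `null`, which the rest of the pipeline treats as local execution.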
10 days	feat(ssh): wave 0 — kernel contract seam (computerId)	Adam Malczewski
	Add additive optional `computerId` field to ToolExecuteContext + RunTurnInput. The kernel never interprets it (forwards verbatim to tools, like cwd) — it never enters the model prompt (no prompt-cache impact). When omitted/undefined, execution is LOCAL (today's behavior), so this is fully backward compatible. This is the orchestrator-authored seam (ORCHESTRATOR.md §2a) that lets Wave 1's producers (wire Computer types, exec-backend contract) and the consumer (kernel runtime threading) run in parallel against a fixed type. Refs: notes/ssh-support-plan.md (decisions resolved in §0.5/§13). No merge or push.
10 days	feat(cli): add --workspace filter to 'dispatch list'	Adam Malczewski
	The backend already supported GET /conversations?workspaceId= but the CLI never sent it. Wire the list command to that filter: - args.ts: parse --workspace / -w on 'list' (placed before the --catch-all so the single-dash -w shorthand isn't taken for a positional prefix); add workspaceId? to the list ParsedCommand. - http.ts: add workspaceId? to FetchConversationsOpts; send ?workspaceId= (after q/status, preserving URLSearchParams order). - main.ts: forward parsed.workspaceId into fetchConversations; update USAGE. Composable with --status and the <prefix> short-id arg. 'Open conversations in workspace X' is now: dispatch list --workspace X (status defaults to active,idle). No contract changes — purely additive CLI wiring. Tests: +4 args (incl. composability + missing-value error), +2 http (exact ?workspaceId= URL + combined status/workspaceId with %2C encoding). typecheck EXIT 0, biome clean (364 files), full suite 1558 passed. Live-verified against an isolated server.
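The query-string wiring can be illustrated with URLSearchParams. This is a sketch under assumptions: `conversationsUrl` is a hypothetical helper, and the fields of `FetchConversationsOpts` other than `workspaceId` are guessed from the filters the commit mentions.

```typescript
// Assumed option shape; the commit confirms only that workspaceId? was added.
interface FetchConversationsOpts {
  q?: string;
  status?: string[];
  workspaceId?: string;
}

function conversationsUrl(base: string, opts: FetchConversationsOpts): string {
  const params = new URLSearchParams();
  // Insertion order is preserved, so workspaceId lands after q and status.
  if (opts.q) params.set("q", opts.q);
  if (opts.status && opts.status.length > 0) params.set("status", opts.status.join(","));
  if (opts.workspaceId) params.set("workspaceId", opts.workspaceId);
  const query = params.toString();
  return query ? `${base}/conversations?${query}` : `${base}/conversations`;
}
```

URLSearchParams percent-encodes the comma in a combined status value, which is the `%2C` encoding the tests above check for.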
10 days	plan(ssh): lock final decision — take ssh-config dep; no open questions remain	Adam Malczewski
	Resolve the last open question: take the ssh-config npm package (project-local, alongside ssh2) for correct ~/.ssh/config parsing rather than hand-rolling. §13 now lists all 8 decisions as resolved and marks the plan decision-complete. Also records minor adopted defaults (config reader lives in ssh extension; stale alias surfaced as unresolved not silent-local; default identity probing order; assume unencrypted keys for MVP). Planning document only; no code changed. No merge or push.
10 days	plan(ssh): bake in resolved decisions + ~/.ssh/config discovery	Adam Malczewski
	Update the SSH support plan to reflect user-confirmed decisions and a key simplification from a new requirement: - New §0.5 'Resolved decisions' records all 7 confirmed answers. - Computer is now a READ-ONLY view over ~/.ssh/config (Host aliases), not a persisted CRUD entity: no computer-store package, no create/update/delete API. computerId IS an SSH config alias. ~/.ssh/known_hosts is the host-key trust store (auto-trust-and-pin). - Auth simplified to key-only from ~/.ssh (no gopass/SecretsAccess/secretRef anywhere). - ssh2 only (no bun-ssh2 fork); verifying under Bun is the load-bearing Phase-3 first step. - LSP/MCP silently dropped on remote turns (no system-prompt note); edit_file works with no diagnostics on remote. - computerId persisted per-conversation (like cwd). - Updated data model (§3), connection mgmt (§4), security (§7), edge cases (§8), API surface (§9 read-only), frontend (§10), packages table (§11, no computer-store), phases (§12), and resolved open questions (§13). Planning document only; no code changed. No merge or push.
10 days	docs(notes): research — list conversations filtered by worktree/workspace	Adam Malczewski
	Investigation of whether the backend supports listing open conversations filtered by a specific worktree/workspace. Findings: - 'worktree' is not a Dispatch domain concept; canonical term is 'workspace' (logical grouping) vs 'working directory' (cwd, filesystem path). - GET /conversations already supports composable ?workspaceId=, ?status=, ?q= filters. 'Open conversations in workspace X' = ?workspaceId=X&status=active,idle. - Every conversation carries a workspaceId (default 'default'); ConversationMeta is in @dispatch/wire; filter lives in conversation-store listConversations. - A literal directory (git worktree) filter (?cwd=) is NOT supported; §3b documents the small additive change needed across wire/store/transport-http. - Test coverage verified: store-workspace.test.ts:369, store.test.ts:1463, app.test.ts:3696. Research notes only — no code/contract changes.
10 days	plan(ssh): add transparent SSH support design & implementation plan	Adam Malczewski
	Research and plan transparent SSH execution so an agent runs commands on a remote computer as if local; the agent never learns it is using SSH. Covers:
	- How the cwd → ToolExecuteContext pipeline works today and where a computerId threads in (mirroring cwd end-to-end)
	- The ExecBackend abstraction (spawn + fs) behind which tool-shell/read-file/write-file/edit-file are refactored, with LocalExecBackend (node) and SshExecBackend (ssh2) implementations
	- Computer data model + workspace defaultComputerId + per-conversation override, mirroring the getEffectiveCwd resolution ladder (null = local)
	- SSH connection pooling (one per computer, lazy connect, keep-alive, idle reaping), auth via SecretsAccess/gopass, host-key verification
	- Turn loop / dispatch integration (additive optional computerId field, backward-compatible: absent = today's local behavior)
	- LSP/MCP degrade by dropping those tools on remote turns (future: remote server spawn over SSH)
	- API surface (computer CRUD, per-conv + workspace-default endpoints, chat.send gains computerId), frontend impact
	- Security, edge cases, phased implementation, contract gaps reported to unit owners (one-owner-per-unit honored: planner does not edit others)
	No code changed; planning document only. No merge or push.
10 days	feat(cli): add --file flag to 'dispatch send' subcommand	Adam Malczewski
	Add the same --file <path> support that the summon (chat) command has to the 'dispatch send' subcommand. When --file is given, the file's contents are read and attached to the message (composed via composeMessage, identical to chat). - args.ts: add 'file' to the send ParsedCommand, make 'text' optional, parse --file, and require at least one of --text or --file. - main.ts: read the file and compose the message in the send case, using the composed message in both the --queue and streaming branches; update USAGE. - args.test.ts: cover --file parsing (alone, with --text, missing value) and update the existing send expectations + the both-missing error message.
10 days	fix(bin): pin dev ports in bin/up so shell BACKEND_PORT can't override .env	Adam Malczewski
	bin/up ran `bun --watch main.ts` without setting BACKEND_PORT, so a shell-exported BACKEND_PORT (e.g. 24991 in ~/.bashrc, set so the Dispatch CLI hits the prod server) overrode .env's dev value 24203 — Bun lets shell env win over .env — binding the dev server onto the production port and colliding with the active dispatch-server systemd service. transport-http then failed to activate (Bun.serve "port in use"), so the HTTP server never came up and the frontend got "Failed to fetch". Force BACKEND_PORT=24203 + SURFACE_WS_PORT=24205 in the setsid invocation so the dev stack is deterministic regardless of the shell environment.
10 days	feat(transport-http): add GET /conversations/:id/mcp status endpoint	Adam Malczewski
	Mirrors the existing GET /conversations/:id/lsp route exactly: gates on the persisted then effective cwd (null → empty servers), returns 503 when the MCP service isn't loaded, and maps McpServerStatus → McpServerInfo (conditionally including `error` per exactOptionalPropertyTypes). Wires mcpService into CreateServerOptions + extension activate via a plain host.getService (mirroring lspService; "mcp" added to dependsOn, route added to contributes.routes), adds the @dispatch/mcp workspace dep, and re-exports mcpServiceHandle / McpService / McpServerStatus from seam.ts. Adds 4 tests mirroring the LSP status tests.
10 days	fix(lsp): prevent server crash from malformed LSP messages	Adam Malczewski
	Two bugs caused the dispatch server to crash (15 times since Jun 24) when chat cc6c edited packages/transport-http/src/app.ts — a 40KB file with 23 multi-byte UTF-8 lines. The edit_file diagnostics hook sends the file to tsserver, which sends back a large publishDiagnostics response. When the response was split across stdout chunks at a multi-byte character boundary, the server crashed. Layer 1 — rpc.ts handleMessage: JSON.parse had no try/catch. A corrupted message threw an unhandled SyntaxError → unhandled rejection → process exit. Wrapped in try/catch; malformed messages are now skipped. Also hardened client.ts handleBytes: the async handleMessage Promise was fire-and-forget. Added .catch(() => {}) as defence-in-depth so no rejection from the RPC layer can ever crash the server. Layer 2 — framing.ts FrameDecoder: used a string buffer with new TextDecoder().decode(chunk) (no { stream: true }), corrupting multi-byte characters split across chunks. Worse, Content-Length counts bytes but the buffer was sliced by character count — for multi-byte content byte length ≠ char length, so the decoder extracted the wrong slice as a message. Rewrote to use a Uint8Array byte buffer: header separator search is byte-level, Content-Length comparison is byte-level, and the body is decoded only after all bytes are confirmed present. Tests: 5 new multi-byte framing tests (split at char boundary, byte-vs-char Content-Length, two messages in one chunk, three-way split) + 1 rpc test (malformed JSON does not throw). All 1545 tests pass.
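The Layer 2 fix is easiest to see in code. The decoder below is a minimal sketch of byte-level Content-Length framing written for illustration; it is not the repository's FrameDecoder, and `push` and `indexOfSeparator` are hypothetical names.

```typescript
// Find the "\r\n\r\n" header separator by scanning bytes, not characters.
function indexOfSeparator(bytes: Uint8Array): number {
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === 13 && bytes[i + 1] === 10 && bytes[i + 2] === 13 && bytes[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

class FrameDecoder {
  private buffer = new Uint8Array(0);
  private readonly decoder = new TextDecoder();

  /** Feed one stdout chunk; returns every complete message body now available. */
  push(chunk: Uint8Array): string[] {
    const merged = new Uint8Array(this.buffer.length + chunk.length);
    merged.set(this.buffer, 0);
    merged.set(chunk, this.buffer.length);
    this.buffer = merged;

    const messages: string[] = [];
    for (;;) {
      const headerEnd = indexOfSeparator(this.buffer);
      if (headerEnd === -1) break; // header not complete yet
      const header = this.decoder.decode(this.buffer.subarray(0, headerEnd));
      const match = /Content-Length:\s*(\d+)/i.exec(header);
      const bodyStart = headerEnd + 4;
      if (!match) {
        this.buffer = this.buffer.slice(bodyStart); // skip a header with no length
        continue;
      }
      const length = Number(match[1]); // Content-Length counts BYTES
      if (this.buffer.length < bodyStart + length) break; // body not complete yet
      // Decode only once every byte of the body is present, so a multi-byte
      // character split across chunks is never decoded in halves.
      messages.push(this.decoder.decode(this.buffer.subarray(bodyStart, bodyStart + length)));
      this.buffer = this.buffer.slice(bodyStart + length);
    }
    return messages;
  }
}
```

Keeping the buffer as bytes means the length comparison and the slice use the same unit as the header, which is the mismatch the original string buffer got wrong.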
10 days	feat(transport-contract): add McpServerInfo + McpStatusResponse (0.22.0)	Adam Malczewski
	Additive types for GET /conversations/:id/mcp status endpoint, mirroring the existing LSP status types. McpServerState, McpServerInfo, McpStatusResponse. +2 type-test assertions. Version bump 0.21.0 → 0.22.0. Handoff written: frontend-mcp-status-handoff.md (backend route + FE consumption).
10 days	docs: live-verify MCP + per-edit diagnostics; update tasks.md (1537 tests)	Adam Malczewski
	- MCP live-verified: test MCP server → tool discovery (test__ping) → tool call → pong result. Full turn lifecycle confirmed on production server. - Per-edit diagnostics live-verified: type error in .ts file surfaces [TypeScript Language Server] ERROR (2322) inline after edit. - edit_file bug found + fixed during live-verify (lazy LSP lookup).
11 days	fix(tool-edit-file): lazy LSP service lookup — diagnostics now actually work	Adam Malczewski
	The previous fix (e03a96e) wrapped getService in try/catch to prevent the activation crash, but that wasn't enough: tool-edit-file activates at position 5 in CORE_EXTENSIONS while lsp activates at position 20. So getService ALWAYS threw at activation time, lspService was ALWAYS undefined, and the diagnostics hook was NEVER wired — edits succeeded but never showed LSP feedback. Fix: make the LSP service lookup LAZY — defer it to edit time (when the tool is actually called), not activation time. By then all extensions have activated. The diagnostics function tries getService on each edit call; if LSP isn't loaded, it returns a no-op (graceful degradation).
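The difference between activation-time and call-time lookup can be shown with a toy host. Everything here is a simplified stand-in: `MiniHost` uses string keys where the real host uses typed service handles, and `activateEditTool` is a hypothetical name.

```typescript
// Toy service registry; getService throws when no provider exists, as described above.
class MiniHost {
  private readonly services = new Map<string, unknown>();

  provideService(name: string, impl: unknown): void {
    this.services.set(name, impl);
  }

  getService<T>(name: string): T {
    if (!this.services.has(name)) throw new Error(`no provider for service "${name}"`);
    return this.services.get(name) as T;
  }
}

type Diagnose = (file: string) => string[];

// Activation captures only the host; the lookup happens on every edit call.
function activateEditTool(host: MiniHost): (file: string) => string[] {
  return (file) => {
    let diagnose: Diagnose;
    try {
      diagnose = host.getService<Diagnose>("lsp");
    } catch {
      return []; // LSP not loaded: degrade to "no diagnostics"
    }
    return diagnose(file);
  };
}
```

Since the tool activates before the LSP extension, an eager lookup would fail every time; the lazy version starts returning diagnostics as soon as a provider is registered.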
11 days	fix(tool-edit-file): wrap getService in try/catch to prevent activation crash	Adam Malczewski
	The per-edit diagnostics change (8f6114b) called host.getService(lspServiceHandle) during activate(). But getService THROWS when a service has no provider — so if the LSP extension activates AFTER tool-edit-file (or isn't loaded at all), the activate() function crashes and the edit_file tool is NEVER REGISTERED. This is why the edit_file tool was missing from the agent toolset. Fix: wrap getService in try/catch — if the LSP service isn't available yet, lspService becomes undefined and edits proceed without diagnostics (the graceful degradation the comment always promised but the code didn't deliver).
11 days	feat(mcp): Model Context Protocol host extension	Adam Malczewski
	New `mcp` standard extension (`packages/mcp/`) that makes Dispatch an MCP host: spawns configured MCP servers (stdio child processes), performs the MCP handshake (initialize → notifications/initialized), discovers tools via tools/list, and registers each as a first-class Dispatch ToolContract via host.defineTool. When the model calls an MCP tool, the extension proxies the call to tools/call on the MCP server and returns the flattened result.
	Architecture (sibling of `lsp` extension):
	- Config: .dispatch/mcp.json (servers key) → opencode.json mcp key fallback, resolved per-cwd (mirrors LSP config resolution)
	- Transport: StdioTransport (spawn child, Content-Length framing + JSON-RPC 2.0)
	- Client: initialize → tools/list → tools/call; handles list_changed notifications for dynamic tool updates
	- Registry: tool name namespacing (<serverId>__<toolName>), ToolContract adapter that proxies execute → callTool, content flattening (text/image/resource → string)
	- Manager: one client per server, lazy-spawn, status(), shutdownAll()
	- Extension: manifest (dependsOn session-orchestrator, capabilities spawn), registers tools + a toolsFilter (drops disconnected server's tools), mcpServiceHandle, deactivate kills all child processes
	Phase 1 scope: stdio only, Tools only (no Resources/Prompts/HTTP/sampling). Hand-rolled JSON-RPC + framing (zero external deps, adapts LSP patterns). Wave 1 (agent): 12 source + 8 test files, 69 new tests. Wave 2 (orchestrator): root tsconfig ref, host-bin CORE_EXTENSIONS registration + package.json dep, bun install. Verified: tsc -b EXIT 0, biome clean, 1537 vitest pass (was 1468, +69).
11 days	docs(mcp): add MCP/MCP server/MCP host glossary entries	Adam Malczewski

11 days	docs: MCP (Model Context Protocol) integration design + implementation plan	Adam Malczewski
	- notes/mcp-design.md: full design, covering architecture fit (sibling of lsp ext), per-cwd config (.dispatch/mcp.json + opencode.json mcp key), tool name namespacing (<serverId>__<toolName>), ToolContract adapter, content flattening, security, glossary additions, 6 open design decisions
	- PLAN-mcp.md: wave breakdown (Wave 0 contracts/wiring, Wave 1 the mcp extension, Wave 2 host-bin registration, Wave 3 live verification)
	- Phase 1 scope: stdio only, Tools only, no surface, hand-rolled JSON-RPC
	- No kernel contract change needed (existing ToolContract + defineTool + toolsFilter are sufficient)
11 days	docs: update tasks.md (per-edit diagnostics milestone, 1468 tests) + retire stale HANDOFF.md	Adam Malczewski
	- tasks.md: record per-edit LSP diagnostics auto-append milestone (commit 8f6114b), fix test count 1453→1468
	- HANDOFF.md: retire stale post-MVP handoff (referenced arch-rewrite path, 178 tests, next-steps all done) → current accurate pointer file
11 days	feat(lsp+tool-edit-file): multi-server diagnostics + per-edit auto-append	Adam Malczewski
	LSP extension:
	- Multi-server aggregation: query ALL connected servers matching the file's extension (not just the first), merge diagnostics tagged by source
	- Incremental sync: capture each server's textDocumentSync.change during initialize; compute prefix/suffix diff ranges for change:2 servers; full content for change:1 (generic, works for any LSP)
	- New diff.ts: pure computeChangeRange + offsetToPosition (O(n), tested)
	- Buffer sync: change(filePath, newText) sends didChange with post-edit in-memory content; openWithText for first open; tracks open doc text
	- languageId mapping: extended with .rb/.rbs/.c/.cpp/etc. (was 'unknown')
	- waitForDiagnostics: accepts text override + timeoutMs; returns { formatted, slow, timedOut }; polls for publishDiagnostics push
	- DiagnosticsStore: hasReceivedPush/clearReceived tracking; formatFiltered with minSeverity (1=Error, 2=Warning) for edit_file integration
	- LspService.getDiagnostics: service method for cross-extension use
	tool-edit-file:
	- After successful edit, calls LSP getDiagnostics with post-edit buffer
	- Only appends diagnostics with severity ≤ 2 (errors+warnings, no noise)
	- Appends slow warning (>10s): 'LSP is taking unusually long...'
	- 60s timeout; graceful degradation when no LSP available
	- Optional dep on @dispatch/lsp (getService pattern, not manifest depOn)
	1468 vitest pass (was 1453, +15 new diff tests).
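The prefix/suffix diff used for incremental sync can be sketched as follows. The function names match those listed for diff.ts, but the signatures, the `ChangeRange` shape, and the bodies are assumptions made for illustration rather than the actual implementation.

```typescript
interface Position {
  line: number;
  character: number; // UTF-16 code units, matching JS string indices
}

interface ChangeRange {
  start: number;  // first differing offset (shared by old and new text)
  oldEnd: number; // end of the replaced span in the old text (exclusive)
  newEnd: number; // end of the replacement span in the new text (exclusive)
}

// Strip the longest common prefix, then the longest common suffix of what
// remains; the middle is the single edit that turns oldText into newText.
function computeChangeRange(oldText: string, newText: string): ChangeRange {
  const maxPrefix = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < maxPrefix && oldText[prefix] === newText[prefix]) prefix++;

  const maxSuffix = maxPrefix - prefix; // suffix must not overlap the prefix
  let suffix = 0;
  while (
    suffix < maxSuffix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }

  return {
    start: prefix,
    oldEnd: oldText.length - suffix,
    newEnd: newText.length - suffix,
  };
}

// Convert a character offset into a zero-based line/character position.
function offsetToPosition(text: string, offset: number): Position {
  const end = Math.min(offset, text.length);
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: end - lineStart };
}
```

A didChange event for a change:2 server would then carry the range from `offsetToPosition(oldText, start)` to `offsetToPosition(oldText, oldEnd)` with `newText.slice(start, newEnd)` as the replacement text.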
11 days	docs: remove claude extension references from README	Adam Malczewski

11 days	docs: replace local paths with GitHub repo links	Adam Malczewski
	- Web frontend section links to github.com/realtradam/dispatch-web
	- Claude extension links to github.com/realtradam/dispatch-adapter-claude
	- Workspace layout uses generic clone instructions, not /home/tradam paths
	- Dev stack bin/up instructions show clone-both-repos-as-siblings
11 days	docs: rewrite README for current project state	Adam Malczewski
	- Full HTTP API table (30+ endpoints: conversations, workspaces, LSP, system-prompt, metrics, queue, cache-warming, compaction, etc.)
	- Complete CLI commands (list, read, send, stop, compact, open + flags)
	- All 37 packages documented with tiers and dependencies
	- systemd deploy section (bin/build, bin/install, bin/sync-env)
	- Dev stacks table (bin/up 24203, bin/up2 25203)
	- Workspace layout (dispatch-backend, dispatch-web, claude, bin)
	- Updated web frontend section (Slice 2 browser chat in progress)
	- Links to all design docs
11 days	docs: update paths from arch-rewrite to dispatch-backend	Adam Malczewski
	After consolidating to the dev branch and renaming the worktree, update all path references in ORCHESTRATOR.md and .skills/ORCHESTRATOR.md.
11 days	docs(tasks): mark live-verifies complete + slim roadmap	Adam Malczewski
	Mark all 5 live-verify checkboxes as done (reasoning effort, todo tool, CLI cross-client, abort-race, system-prompt builder). Slim the roadmap from 11 items down to 3 open items (web frontend, close-with-queued-messages product decision, FE crash-recovery status endpoint) by dropping the 8 completed/verified items.
11 days	fix(kernel+tool-shell): abort hanging tool calls without bricking the conversation	Adam Malczewski
	kernel: executeToolCall now races tool.execute against the abort signal via Promise.race; on abort resolves (not rejects) with an "Aborted" result so the step completes normally → finishReason "aborted" → turn seals cleanly (done event) → finally clears activeTurns → conversation freed, next message accepted. run-turn strips tool-call chunks from the assistant message on abort (keeps text/thinking) and omits tool-result messages to avoid persisting dangling tool calls that would 400 the provider next turn.
	tool-shell: realSpawn spawns detached (own process group); on abort AND timeout kills the entire group (process.kill(-pgid, SIGKILL)) and resolves immediately, with no child.on("close") dependency, so a grandchild holding the pipes can't stall the spawn promise or leak.
	Also: ORCHESTRATOR.md migrated to dispatch CLI summon mechanism; .skills summary; bin/sync-env PATH injection; frontend handoff docs.
	1453 vitest pass · tsc -b EXIT 0 · biome clean.
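The abort race in the kernel can be sketched as below. `ToolResult` and `executeWithAbort` are hypothetical names, and the result shape is an assumption; what the sketch takes from the commit is the technique itself: race the tool against the abort signal and resolve, rather than reject, with an "Aborted" result.

```typescript
interface ToolResult {
  content: string;
  isError: boolean;
}

// Race a tool's execution against an AbortSignal. On abort the returned
// promise RESOLVES with an "Aborted" result, so the caller's step can finish
// normally even if the tool itself never settles.
function executeWithAbort(
  execute: (signal: AbortSignal) => Promise<ToolResult>,
  signal: AbortSignal,
): Promise<ToolResult> {
  const aborted: ToolResult = { content: "Aborted", isError: true };
  if (signal.aborted) return Promise.resolve(aborted);

  let onAbort: (() => void) | undefined;
  const abortPromise = new Promise<ToolResult>((resolve) => {
    onAbort = () => resolve(aborted);
    signal.addEventListener("abort", onAbort, { once: true });
  });

  return Promise.race([execute(signal), abortPromise]).finally(() => {
    // Remove the listener so completed calls do not accumulate handlers.
    if (onAbort) signal.removeEventListener("abort", onAbort);
  });
}
```

Note that the losing side of the race keeps running; that is why the tool-shell half of the commit also kills the process group instead of relying on the race alone.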
11 days	fix(broken-chat): read-time self-repair of unrecoverable chats	Adam Malczewski
	reconcile() only repaired orphaned tool-calls. Two other broken states made chats uncontinuable, and load() had no parse-error guard:
	- A trailing assistant message whose only chunk is 'error' (a failed-generation marker) serializes to empty content -> provider rejects/empty -> chat never continues. 6 of 140 production conversations were stuck.
	- A tool-call whose input is a raw malformed-JSON string (model emitted broken JSON) re-sent as OpenAI arguments -> provider 400s on every continuation (the 77574596 break).
	- load() JSON.parse had no try/catch -> one corrupt row bricked the chat.
	Fix = read-time repair (no DB surgery; append-only preserved). reconcile runs on every load() BEFORE any provider sees messages, so Layer 1 protects ALL providers.
	Layer 1 (conversation-store reconcile): strip error chunks from assistant messages + drop the now-empty error-only messages (safe: never followed by a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport +2 additive counts. loadSince (FE reads) intentionally unreconciled so the user still SEES the error. load() wraps JSON.parse in try/catch (skip corrupt rows).
	Layer 2 (openai-stream): serializeToolArguments ensures tool-call arguments is always valid JSON (malformed string -> fallback object), neutralizing already-stored malformed args.
	Layer 2 equiv (../claude provider-anthropic): safeJson returns a valid object fallback on parse failure, not the raw string. (Separate repo.)
	Live-verified: reproduced 77574596's real broken tail in the dev DB; POST /chat continued it cleanly (no 400, model replied); the provider accepted the reconciled history.
	tsc -b EXIT 0, biome clean, 1453 vitest pass.
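A minimal sketch of the Layer 2 idea, assuming tool-call input may arrive either as a parsed value or as a raw string. The function name comes from the commit message; the body and the `_malformed` fallback key are assumptions made for this example.

```typescript
// Always return a string that parses as JSON, whatever was stored.
function serializeToolArguments(input: unknown): string {
  if (typeof input === "string") {
    try {
      JSON.parse(input);
      return input; // already valid JSON text, send as-is
    } catch {
      // Malformed JSON emitted by the model: wrap it in a fallback object
      // (hypothetical shape) so the request itself stays well-formed.
      return JSON.stringify({ _malformed: input });
    }
  }
  // JSON.stringify yields undefined for undefined/functions, so default to {}.
  const text: string | undefined = JSON.stringify(input ?? {});
  return text ?? "{}";
}
```

Because the repair happens at serialization time, already-stored malformed arguments are neutralized without rewriting the append-only store.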
11 days	docs(lsp): live-verify passed - broken-server recovery + configSource + shadow warning	Adam Malczewski
	All five live-verify checks passed against the dev stack (bin/up :24203): configSource reaches the wire (built-in TS, 'built-in'); broken server reports error + configSource + source-named error; recovery without restart (blocker, error->connected after config fix); no retry storm; shadow warning logged via host.logger when both configs declare lsp.
11 days	fix(lsp): broken-server recovery + config source attribution	Adam Malczewski
	Two issues found by decompiling the running dispatch-server binary (handoff from a ruby-lsp setup in raylib-jamstack): Issue 2 (blocker): a failed LSP server was "broken" FOREVER — the manager's broken set was cleared only in shutdownAll(), so a server that failed (bad env, missing binary, or a since-fixed config) stayed state:"error" for the whole process. For an agent running inside dispatch the only recovery (server restart) kills its own session. Now a broken server self-heals when its resolved config changes since it was marked broken (discrete event → no retry storm), with a bounded backoff for transient failures. Issue 1: .dispatch/lsp.json silently shadowed opencode.json's lsp key with no warning and no source attribution. Now: shadow warning via host.logger when both declare lsp; configSource populated on status (.dispatch/lsp.json / opencode.json / built-in); spawn-failure error strings name the config source. Contract: additive configSource?: string on LspServerInfo (@dispatch/transport-contract 0.20.0→0.21.0). transport-http passes it through to the wire (was a field-by-field map that dropped it — CR resolved by the transport-http owner). tsc -b EXIT 0, biome clean, 1443 vitest pass.
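One way to picture the self-healing rule is the sketch below. The class name, the exponential backoff constants, and the use of `JSON.stringify` as a config fingerprint (which is key-order sensitive) are all assumptions; the commit only states that a broken server retries when its resolved config changes, with a bounded backoff for transient failures.

```typescript
interface BrokenEntry {
  configKey: string; // fingerprint of the config that failed
  failedAt: number;  // timestamp of the most recent failure (ms)
  attempts: number;  // consecutive failures with this config
}

class BrokenServers {
  private broken = new Map<string, BrokenEntry>();
  private baseDelayMs: number;
  private maxDelayMs: number;

  constructor(baseDelayMs = 1_000, maxDelayMs = 60_000) {
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
  }

  markBroken(serverId: string, config: unknown, now: number): void {
    const configKey = JSON.stringify(config);
    const previous = this.broken.get(serverId);
    const attempts =
      previous && previous.configKey === configKey ? previous.attempts + 1 : 1;
    this.broken.set(serverId, { configKey, failedAt: now, attempts });
  }

  shouldRetry(serverId: string, config: unknown, now: number): boolean {
    const entry = this.broken.get(serverId);
    if (!entry) return true; // never failed

    // Config changed since the failure: a discrete event, so retry at once.
    if (entry.configKey !== JSON.stringify(config)) {
      this.broken.delete(serverId);
      return true;
    }

    // Same config: wait out a bounded exponential backoff.
    const delay = Math.min(
      this.baseDelayMs * 2 ** (entry.attempts - 1),
      this.maxDelayMs,
    );
    return now - entry.failedAt >= delay;
  }
}
```

Keying the retry on a config change rather than on a timer alone is what avoids a retry storm: an unchanged, persistently bad config is retried only at the capped backoff interval.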
11 days	docs: task 3 (per-conversation model persistence) done	Adam Malczewski

11 days	feat: persistent per-conversation model selection	Adam Malczewski
	A chat's selected provider + model is now persisted per conversation (like cwd and reasoningEffort). Opening a conversation in a new browser recalls the originally selected model instead of defaulting.
	- transport-contract 0.19.0→0.20.0: ModelResponse + SetModelRequest types for GET/PUT /conversations/:id/model.
	- conversation-store: getModel/setModel (model:<id> key, mirrors getReasoningEffort/setReasoningEffort); forkHistory copies model; empty string clears.
	- session-orchestrator: resolve model from persisted store when no per-turn override; persist the resolved model so it sticks; warm path parity.
	- transport-http: GET/PUT /conversations/:id/model endpoints with validation.
	1433 vitest pass; tsc + biome clean.
11 days	docs: task 2 (system-prompt cwd reconstruction) done	Adam Malczewski

11 days	fix(system-prompt): reconstruct on cwd change via getWithMeta	Adam Malczewski
	The system-prompt service cached the resolved prompt on first turn and reused it on subsequent turns via get(). But the prompt is cwd-sensitive (file:AGENTS.md, prompt:cwd variables). When a conversation's cwd changed after the first turn, the cached prompt was stale — referenced files from the new cwd were not loaded. system-prompt: added getWithMeta(conversationId) returning { prompt, cwd } and stores resolved-cwd:<id> alongside resolved:<id> in construct(). session-orchestrator: subsequent turns now call getWithMeta, compare stored cwd vs effective cwd, and reconstruct if they differ. Compaction path (always constructs) and warm path (no system prompt) are unaffected. 1411 vitest pass; tsc + biome clean.
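The cwd check can be sketched as follows. The `resolved:<id>` and `resolved-cwd:<id>` keys and the `getWithMeta`/`construct` names come from the commit message; the class, the `build` callback, and `promptForTurn` are hypothetical stand-ins for the real service and orchestrator.

```typescript
interface PromptWithMeta {
  prompt: string;
  cwd: string;
}

class SystemPromptCache {
  private store = new Map<string, string>();

  // Resolve the prompt for a cwd and remember both the prompt and the cwd.
  construct(
    conversationId: string,
    cwd: string,
    build: (cwd: string) => string,
  ): string {
    const prompt = build(cwd);
    this.store.set(`resolved:${conversationId}`, prompt);
    this.store.set(`resolved-cwd:${conversationId}`, cwd);
    return prompt;
  }

  getWithMeta(conversationId: string): PromptWithMeta | undefined {
    const prompt = this.store.get(`resolved:${conversationId}`);
    const cwd = this.store.get(`resolved-cwd:${conversationId}`);
    if (prompt === undefined || cwd === undefined) return undefined;
    return { prompt, cwd };
  }
}

// Orchestrator side: reuse the cached prompt only if the cwd still matches.
function promptForTurn(
  cache: SystemPromptCache,
  conversationId: string,
  effectiveCwd: string,
  build: (cwd: string) => string,
): string {
  const cached = cache.getWithMeta(conversationId);
  if (cached && cached.cwd === effectiveCwd) return cached.prompt;
  return cache.construct(conversationId, effectiveCwd, build);
}
```

Storing the cwd next to the prompt keeps the staleness check cheap: a string comparison per turn instead of re-resolving cwd-sensitive variables every time.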
11 days	workspace: conversation.open/statusChanged carry workspaceId (1405 vitest)	Adam Malczewski
	- @dispatch/transport-contract 0.18.0 -> 0.19.0: add workspaceId: string to ConversationOpenMessage and ConversationStatusChangedMessage
	- session-orchestrator: include persisted workspaceId in conversationOpened/conversationStatusChanged payloads
	- transport-ws: forward workspaceId in WS broadcasts
	- transport-http: POST /conversations/:id/open resolves workspaceId before emit
	- FE handoff to 29ae: frontend-workspace-open-handoff.md