Age  Commit message  Author
2 days  fix(install): restart service instead of start (no-op when already running)  [dev]  Adam Malczewski
2 days  fix(install): run build as user, sudo only on privileged lines  Adam Malczewski
The script previously required sudo for the entire script (id -u check), which meant bin/build ran as root and created root-owned dist/ files. On the next build, the normal user couldn't overwrite them (EACCES). Now the script runs without a sudo prefix: the build step runs as the normal user (dist/ files are user-owned), and sudo is used only on the specific lines that write to system directories (/usr/bin, /etc, /usr/share) or call systemctl.
2 days  fix(build): run tsc --build before bun build --compile  Adam Malczewski
bin/build was compiling the binary directly from stale dist/*.js files without first recompiling the TypeScript packages. Since package.json main fields point to dist/index.js, source edits to .ts files were silently lost in the compiled binary. Now tsc --build runs first (composite project references rebuild all packages in dependency order), then bun build --compile bundles the fresh dist/ output.
2 days  fix(kernel): disable MAX_STEPS limit (0 = unlimited)  Adam Malczewski
Agents were being cut off mid-task at 50 steps. The MAX_STEPS=50 hardcoded limit was silently terminating turns while the model was actively making tool calls, leaving conversations idle with a dangling tool-result as the last chunk. Setting MAX_STEPS to 0 disables the limit — the loop runs until the model stops making tool calls naturally or the abort signal fires. The max-steps code path is preserved for when MAX_STEPS > 0.
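The "0 = unlimited" convention can be sketched as a loop guard. This is a minimal illustration, not the kernel's code: `runLoop`, `StepResult` and the `reason` values are hypothetical names; only `MAX_STEPS`, the meaning of 0, and the two natural exits (no more tool calls, abort signal) come from the commit message.

```typescript
// Hypothetical loop; the real kernel turn loop is not shown in the commit.
const MAX_STEPS = 0; // 0 = unlimited

type StepResult = { madeToolCalls: boolean };
type LoopOutcome = { steps: number; reason: "done" | "aborted" | "max-steps" };

function runLoop(
  step: (n: number) => StepResult,
  signal: AbortSignal,
  maxSteps: number = MAX_STEPS,
): LoopOutcome {
  let steps = 0;
  for (;;) {
    if (signal.aborted) return { steps, reason: "aborted" };
    // The limit is only enforced when maxSteps > 0.
    if (maxSteps > 0 && steps >= maxSteps) return { steps, reason: "max-steps" };
    const result = step(steps);
    steps += 1;
    // The model stopped making tool calls: the turn ends naturally.
    if (!result.madeToolCalls) return { steps, reason: "done" };
  }
}
```

With `maxSteps` left at 0, a task that needs 60 steps runs to completion; passing 50 reproduces the old cut-off.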
3 days  fix(ssh): POST /computers/:alias/test hangs after successful SSH connect  Adam Malczewski
The test endpoint's runProbe() waited for the ssh2 stream's 'close' event, which some SSH servers never emit for short-lived exec channels (the command 'true' exits instantly). This caused the promise to hang forever: the HTTP response never returned, and the FE's Test spinner spun indefinitely.

Three fixes:
1. runProbe now resolves on the 'exit' event (not 'close'): the command has finished and the exit code is available. 'close' is kept as a fallback. Stream data/stderr are drained to prevent buffer deadlocks.
2. runProbe has a 15s timeout safety net: if the exec callback or 'exit' event never fires (e.g. server requires a pty for exec), the probe resolves false instead of hanging forever.
3. The entire test() method is wrapped in a 30s Promise.race timeout: even if pool.acquire() or pool.drop() hangs, the endpoint ALWAYS responds with { ok, error? }.

The probe is fully non-interactive (no blocking prompts). tsc EXIT 0, biome clean, 1756 tests pass.
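The Promise.race safety net from fix 3 might look roughly like this. It is a sketch under assumptions: `withTimeout` and `TestResult` are hypothetical names, and the 50 ms value only keeps the example fast (the commit uses 15s and 30s).

```typescript
// Hypothetical helper; the actual test() implementation is not shown in the commit.
function withTimeout<T>(work: Promise<T>, ms: number, onTimeout: () => T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(onTimeout()), ms);
  });
  // Whichever settles first wins; the timer is cleared either way so it
  // cannot keep the process alive after the race is decided.
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer);
      return value;
    },
    (error) => {
      clearTimeout(timer);
      throw error;
    },
  );
}

type TestResult = { ok: boolean; error?: string };

// A promise that never settles stands in for a hung pool.acquire()
// or an 'exit' event that never fires.
const hungProbe = new Promise<TestResult>(() => {});

const probeOutcome: Promise<TestResult> = withTimeout(hungProbe, 50, () => ({
  ok: false,
  error: "timed out",
}));
```

The caller always gets a settled `{ ok, error? }` value, which is the property the endpoint needs in order to respond.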
3 days  feat(ssh): discover computers from ~/.ssh/known_hosts + remote system-prompt  Adam Malczewski
Two improvements to the SSH support feature:

1. KNOWN_HOSTS DISCOVERY (packages/ssh): Computers are now auto-discovered from ~/.ssh/known_hosts (every hostname you've ever connected to) in ADDITION to ~/.ssh/config (explicit Host aliases). Config entries take precedence (full params); known_hosts entries get defaulted params (User=defaultUser, IdentityFile=null→pool probes default keys, Port from [host]:port or 22, knownHost=true). Zero-config: no ~/.ssh/config file needed; hosts just appear.
   Reject list: dispatch.toml [ssh].reject = [...] (glob patterns like github.com, *.ts.net) filters noise from the catalog. Read from both the global ~/.config/dispatch/dispatch.toml and the project dispatch.toml. Parsed with Bun.TOML.parse (zero deps). Only filters discovery (catalog); specific lookups (getComputer/getStatus/test/connect) ignore the reject list (it's a visibility filter, not access control).
   New pure functions: parseKnownHosts(), isRejected(), globMatch(). +26 tests. tsc EXIT 0, biome clean, 1756 tests pass.

2. REMOTE SYSTEM-PROMPT AWARENESS (packages/system-prompt): When a conversation has a computerId set (remote turn), the system prompt now resolves system:os, system:hostname, git:branch/git:status, and file: reads against the REMOTE machine, not the local host. Previously the prompt always said 'Arch Linux (WSL)' + local hostname even when the agent was connected to a remote Artix Linux machine. The ResolverAdapters' hostname()/platform() are now async (so a remote adapter can run 'hostname'/'uname -s' over SSH). The system-prompt extension builds remote adapters from the ExecBackend (readFile→SFTP, spawn→SSH exec). Cache invalidation now checks computerId (switching computers rebuilds the prompt). The compaction path also threads computerId. @dispatch/system-prompt now depends on @dispatch/exec-backend.
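To make the discovery and reject-list behavior concrete, here is a rough sketch of what the three pure functions could look like. The function names come from the commit; their signatures, the `KnownHost` shape, and the choice to skip hashed entries and `@` marker lines are assumptions made for this illustration.

```typescript
// Assumed shape; the real return type of parseKnownHosts() is not shown.
interface KnownHost {
  host: string;
  port: number;
}

function parseKnownHosts(text: string): KnownHost[] {
  const seen = new Map<string, KnownHost>();
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const hostField = line.split(/\s+/)[0];
    // Marker lines (@cert-authority, @revoked) are skipped in this sketch.
    if (hostField.startsWith("@")) continue;
    for (const entry of hostField.split(",")) {
      // Hashed entries (|1|salt|hash) hide the hostname; wildcard patterns are not hosts.
      if (entry.startsWith("|") || /[*?!]/.test(entry)) continue;
      const bracketed = /^\[(.+)\]:(\d+)$/.exec(entry);
      const parsed: KnownHost = bracketed
        ? { host: bracketed[1], port: Number(bracketed[2]) }
        : { host: entry, port: 22 };
      if (!seen.has(parsed.host)) seen.set(parsed.host, parsed);
    }
  }
  return Array.from(seen.values());
}

function globMatch(pattern: string, value: string): boolean {
  // Escape regex metacharacters, then translate * and ? into regex equivalents.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + source + "$").test(value);
}

function isRejected(host: string, reject: readonly string[]): boolean {
  return reject.some((pattern) => globMatch(pattern, host));
}
```

Applying `isRejected` only when building the catalog, and never in single-host lookups, matches the "visibility filter, not access control" rule stated above.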
3 days  docs(tasks): mark FE final sync check GREEN - all 3 handoffs + cross-cutting verified  Adam Malczewski
FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build OK, git clean). All three handoffs GREEN with no integration gaps:
- provider-retry: yellow alert-warning bubble renders w/ countdown.
- SSH #1 wire types: defaultComputerId + Computer/ComputerEntry resolve.
- SSH #2 computer API: full src/features/computer/ feature wired + typecheck-clean.

Cross-cutting verified: provider-retry is WS-stream (TranscriptState.providerRetry → ChatView), computer is HTTP-only (AppStore.computerId → ComputerField sidebar); disjoint state/channels/regions/mount-keys; no collision. SSH support + provider-retry integration is complete and validated end-to-end on both repos.
3 days  Merge branch 'dev' into feature/ssh-support  Adam Malczewski
Brings dev's retry-with-backoff (the transient `provider-retry` AgentEvent the web frontend consumes) + the LSP-dead-server per-edit-hang fix into the SSH feature branch, alongside the SSH waves 0-5c. All code files auto-merged cleanly (run-turn.ts, orchestrator.ts, runtime.ts, wire/index.ts, tool-edit-file/extension.ts, run-turn.test.ts — both computerId threading and retry-with-backoff coexist). Only tasks.md conflicted (status section — orchestrator-resolved; both feature sections kept). Verified post-merge: tsc -b EXIT 0, biome clean (391 files), 1730 vitest pass +6 sshd-integration skipped (was 1690; +40 from dev's retry/LSP tests). Wire dist rebuilt so the FE can re-sync the pinned @dispatch/wire dep and pick up BOTH provider-retry AND the SSH Computer/defaultComputerId types. No merge or push (into dev or otherwise).
3 days  Merge branch 'feature/lsp-bugfix' into dev  Adam Malczewski
3 days  fix(lsp): stop per-edit hangs on dead/slow servers (10s cap + skip + self-heal)  Adam Malczewski
The LSP diagnostics path hung up to 60s per edit whenever a configured Ruby language server was dead or slow (the reported Steep langserver case): a killed/crashed server was never detected (stayed "connected" forever), servers were queried sequentially with a 60s budget each, and a corrupted-but-alive server (Steep's ~3h phantom-SyntaxError drift) had no recovery.

Four fixes, all in packages/lsp/ (the tool-edit-file call site lowered to 10s):
1. Dead-process detection: SpawnedProcess.onExit (Bun proc.exited) + stdout-end defence flip the client to error, dispose the rpc, kill the proc. The manager re-spawns a fresh server after the 30s backoff. Dead servers are now skipped (0s) instead of polled for 60s.
2. Concurrent fan-out + 10s hard cap: new aggregateDiagnostics queries all matching servers at once, each capped at 10s. A non-responder is skipped with "LSP took too long (>10s), skipped — raise this to the user" instead of blocking the fast server's results. Replaces the vague "unusually long" warning (now structurally impossible: slow is always false).
3. Corruption self-heal: a detector flags a server re-emitting identical non-empty diagnostics despite the file changing; after 5 repeats the client is marked broken and re-spawned. Clean files never trip it. (Acknowledged false-positive risk on persistent unfixed errors; CLI type-check gate stays authoritative.)
4. sendRequest timeout: hover/definition/references cap at 10s so they can't hang the turn against a dead server; the initialize handshake keeps its 45s race.

Verification: typecheck clean; 1573 tests pass (96 files), +15 new LSP tests (86 in packages/lsp); biome clean. No kernel/contract changes; onExit is internal to packages/lsp.
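The corruption self-heal rule (fix 3) can be illustrated with a small counter. This is a sketch only: `StaleDiagnosticsDetector` and `observe` are hypothetical names, and diagnostics are represented as one formatted string; the threshold of 5 and the "identical non-empty diagnostics despite the file changing" condition come from the commit message.

```typescript
// Hypothetical detector; the real one in packages/lsp is not shown in the commit.
class StaleDiagnosticsDetector {
  private readonly threshold: number;
  private lastFileText: string | undefined = undefined;
  private lastDiagnostics: string | undefined = undefined;
  private repeats = 0;

  constructor(threshold: number = 5) {
    this.threshold = threshold;
  }

  /** Returns true once the server should be marked broken and re-spawned. */
  observe(fileText: string, diagnostics: string): boolean {
    const fileChanged = this.lastFileText !== undefined && fileText !== this.lastFileText;
    const sameNonEmpty = diagnostics !== "" && diagnostics === this.lastDiagnostics;
    if (fileChanged && sameNonEmpty) {
      // Identical, non-empty output survived a file change: suspicious.
      this.repeats += 1;
    } else if (!sameNonEmpty) {
      // Diagnostics changed or the file is clean: reset the streak.
      this.repeats = 0;
    }
    this.lastFileText = fileText;
    this.lastDiagnostics = diagnostics;
    return this.repeats >= this.threshold;
  }
}
```

A file with a persistent, genuinely unfixed error would also trip this counter, which is the false-positive risk the commit acknowledges.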
3 days  chore: remove stale .skills/ORCHESTRATOR.md duplicate  Adam Malczewski
This was the OLD orchestrator manual (references the retired `opencode run` CLI + `opencode-go/mimo-v2.5-pro`, MVP-era content). The current manual lives at root ORCHESTRATOR.md (references the `dispatch` CLI + umans/umans-glm-5.2). Unrelated housekeeping; split from the retry feature commit.
3 days  feat(kernel): retry-with-backoff on retryable provider errors  Adam Malczewski
When the upstream LLM API returns a retryable error (HTTP 429 / 5xx "overloaded"), the kernel now retries provider.stream() with a stepped backoff, visibly, until the 8h cumulative-sleep budget is exhausted, then emits the final error and seals the turn. Retries fire only when no content was emitted yet this step (safety invariant: never duplicate partial output).
- wire: new transient TurnProviderRetryEvent AgentEvent variant (emitted before each sleep; not persisted to model history).
- kernel contracts: RetryStrategy (pure delayFor + injected sleep) + optional retry? on RunTurnInput (omit = no retry, backward-compatible).
- kernel run-turn: retry loop in executeStep; providerRetryEvent constructor. Kernel imports no timer (sleep injected).
- session-orchestrator: concrete schedule (5s..30m, repeat 30m, 8h budget) + abortable setTimeout sleep, wired into RunTurnInput.retry.

tsc -b EXIT 0; biome clean; 1574 vitest pass (+16 new: 11 kernel retry tests with injected fake sleep + pure delayFor, zero @dispatch/* mocks; 5 schedule tests). Transports unchanged (transport-ws forwards AgentEvent verbatim in chat.delta; transport-http is generic JSON.stringify). Plan: notes/retry-with-backoff-plan.md. tasks.md updated with milestone + optional CLI-renderer roadmap follow-up.
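A pure `delayFor` plus an injected `sleep` could be arranged as below. Treat this as a sketch: the `RetryStrategy` field signatures, `withRetry`, and the intermediate steps of the ladder are assumptions (the commit only states "5s..30m, repeat 30m, 8h budget"), and the "no content emitted yet" guard is left out for brevity.

```typescript
// Assumed contract; only "pure delayFor + injected sleep" is stated in the commit.
interface RetryStrategy {
  /** Delay before retry number `attempt` (0-based), or null when the budget is spent. */
  delayFor(attempt: number, sleptMs: number): number | null;
  sleep(ms: number): Promise<void>;
}

// Invented ladder between the stated endpoints of 5s and 30m.
const STEPS_MS: number[] = [5000, 15000, 60000, 300000, 1800000];
const BUDGET_MS = 8 * 60 * 60 * 1000; // 8h of cumulative sleep

function delayFor(attempt: number, sleptMs: number): number | null {
  // Past the end of the ladder the last step (30m) repeats.
  const delay = STEPS_MS[Math.min(attempt, STEPS_MS.length - 1)];
  return sleptMs + delay > BUDGET_MS ? null : delay;
}

async function withRetry<T>(
  call: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  strategy: RetryStrategy,
): Promise<T> {
  let slept = 0;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await call();
    } catch (err) {
      const delay = isRetryable(err) ? strategy.delayFor(attempt, slept) : null;
      if (delay === null) throw err; // not retryable, or budget exhausted
      await strategy.sleep(delay);
      slept += delay;
    }
  }
}
```

Because `sleep` is injected, a test can pass a fake that records the requested delays and returns immediately, which is how the retry tests described above avoid real timers.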
4 days  feat(ssh): wave 5c - host-bin registers exec-backend + ssh; transport-http barrel  Adam Malczewski
Wave 5c (final wiring) of transparent SSH support.
- host-bin: register exec-backend + ssh in CORE_EXTENSIONS (exec-backend before the tool extensions that dependsOn it; ssh after, provides the remote-backend factory + ComputerService at boot). +@dispatch/exec-backend/@dispatch/ssh deps + tsconfig refs.
- transport-http (CR-5): re-export computerServiceHandle + ComputerService type from the package barrel (src/index.ts), mirroring lsp/mcp handles, so ssh imports the typed symbol cleanly (no more dist/seam.js subpath workaround).
- orchestrator: added the @dispatch/exec-backend dep the host-bin agent missed + bun install.

LIVE-VERIFIED: bun packages/host-bin/src/main.ts boots clean ('Dispatch booted', no disabled extensions); exec-backend + ssh + all tool extensions load together. Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (+6 sshd-integration skipped). DEFERRED (CR-6): listComputers usageCount stays 0 until a conversation-store count-by-alias helper is added (non-blocking). Refs: notes/ssh-support-plan.md. No merge or push.
4 days  feat(ssh): wave 5b - the ssh package (remote ExecBackend over ssh2)  Adam Malczewski
Wave 5b of transparent SSH support. NEW standard extension @dispatch/ssh makes remote execution actually work over SSH, transparently. ssh2 verified to run under Bun (load-bearing decision #1 confirmed: connects to local sshd :22 + execs).
- config.ts: ~/.ssh/config reader via ssh-config -> Computer[]/ComputerEntry[] (read-only discovery; resolves hostName/port/user/identityFile/knownHost).
- hostkey.ts: known_hosts auto-trust-and-pin (present->verify/reject-on-mismatch, absent->accept+append; the accept-new analog).
- errors.ts: pure ssh2/SFTP -> node:fs-style .code error mapping (so tools' existing ENOENT branches work unchanged).
- pool.ts: SshConnectionPool (per-alias ssh2.Client, lazy connect, keep-alive, idle reap ~15m); key-only auth from ~/.ssh (config IdentityFile or default id_ed25519/id_rsa); no agent-forwarding, no PTY.
- backend.ts: SshExecBackend implements ExecBackend (spawn via client.exec with shell-quoted cwd; fs via SFTP).
- service.ts + extension.ts: activate provides BOTH handles the other units consume: remoteExecBackendFactoryHandle (exec-backend: computerId->SshExecBackend) AND computerServiceHandle (transport-http: listComputers/getComputer/getStatus/test).
- orchestrator: added packages/ssh to root tsconfig.json refs + bun install.

Tests: 45 pass + 6 sshd-integration skipped (it.skipIf(!process.env.SSH_TEST_HOST)). Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (was 1641, +49). CRs for wave 5c: host-bin registration; CR-5 transport-http barrel re-export; CR-6 usageCount wiring (deferred-ok, defaults to 0). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
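The idea behind errors.ts, mapping SFTP failures onto node:fs-style `.code` values, might be sketched like this. The numeric status codes are the standard SFTP ones (2 = no such file, 3 = permission denied); `toFsError`, the input error shape (numeric status on `code`), and the `EIO` fallback are assumptions for illustration, not the package's actual mapping.

```typescript
// SFTP protocol status codes (SSH_FX_*) mapped to node:fs-style error codes.
const SFTP_STATUS_TO_CODE: { [status: number]: string | undefined } = {
  2: "ENOENT", // SSH_FX_NO_SUCH_FILE
  3: "EACCES", // SSH_FX_PERMISSION_DENIED
};

interface FsStyleError extends Error {
  code: string;
}

// Assumes the incoming SFTP error carries the numeric status on `code`.
function toFsError(err: { code?: unknown; message: string }, path: string): FsStyleError {
  const mappedCode = typeof err.code === "number" ? SFTP_STATUS_TO_CODE[err.code] : undefined;
  const code = mappedCode ?? "EIO"; // assumed fallback for anything unmapped
  const mapped = new Error(code + ": " + err.message + ", '" + path + "'") as FsStyleError;
  mapped.code = code;
  return mapped;
}
```

A tool that already branches on `err.code === "ENOENT"` for local files then needs no remote-specific handling.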
4 days  feat(ssh): wave 5a - exec-backend remote-backend factory handle  Adam Malczewski
exec-backend declares remoteExecBackendFactoryHandle (a consumer-defined ServiceHandle<(computerId) => ExecBackend>) that the ssh package will provide (standard→core layering). The resolver's computerId-set branch now lazy-looks-up this factory (at tool-execute time, runtime) and calls it; if ssh isn't loaded, getService throws → a clear 'SSH remote execution is not configured' error. The computerId-undefined (local) branch is byte-identical to before. This is the seam wave 5b (the ssh package) plugs into. +tests for both branches. Verified: tsc -b EXIT 0, biome clean. No merge or push.
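The lazy factory lookup described here could be sketched as follows. `Host`, the service name string, and the `ExecBackend` shape are simplified stand-ins invented for this example; the two branches and the error text follow the commit message.

```typescript
// Simplified stand-ins; the real ServiceHandle/getService types are not shown.
interface ExecBackend {
  kind: "local" | "remote";
  computerId?: string;
}
type RemoteFactory = (computerId: string) => ExecBackend;

class Host {
  private services = new Map<string, unknown>();

  provideService(name: string, impl: unknown): void {
    this.services.set(name, impl);
  }

  getService<T>(name: string): T {
    if (!this.services.has(name)) throw new Error("no provider for service '" + name + "'");
    return this.services.get(name) as T;
  }
}

function resolveBackend(host: Host, computerId: string | undefined): ExecBackend {
  // Local branch: no service lookup at all, so behavior is unchanged.
  if (computerId === undefined) return { kind: "local" };
  let factory: RemoteFactory;
  try {
    // Looked up at call time, so the provider may register after this module loads.
    factory = host.getService<RemoteFactory>("remote-exec-backend-factory");
  } catch {
    throw new Error("SSH remote execution is not configured");
  }
  return factory(computerId);
}
```

Deferring the lookup to tool-execute time is what lets a core extension consume a factory that a later-activating standard extension provides.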
4 days  feat(ssh): wave 4 - computer HTTP/WS endpoints + chat computerId threading  Adam Malczewski
Wave 4 of transparent SSH support (3 parallel owner-agents on disjoint packages). - transport-http: computer routes — GET /computers, GET /computers/:alias, GET /computers/:alias/status, POST /computers/:alias/test (all delegate to a new ComputerService seam, graceful []/disconnected when ssh not loaded); GET/PUT/DELETE /conversations/:id/computer; PUT /workspaces/:id/default-computer (mirror the cwd/default-cwd routes); /chat threads computerId into the orchestrator. Defines ComputerService interface + computerServiceHandle (defineService<ComputerService>('ssh')) in seam.ts — the seam the ssh package provides via host.provideService in wave 5. - transport-ws: chat.send + chat.queue thread computerId onto the route result (mirrors cwd/workspaceId), forwarded to the orchestrator input. - mcp: CR-1 fix — filterMcpTools now preserves computerId on the returned ToolAssembly (mirrors cwd preservation), so the filter chain stays consistent. - orchestrator: added @dispatch/wire dep to transport-http (build/config, my lane) so its seam.ts Computer/ComputerEntry import resolves. Verified: tsc -b EXIT 0, biome clean, 1641 vitest pass (was 1620, +21). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
4 days  feat(ssh): wave 3 - session-orchestrator computerId threading + transport-contract API types  Adam Malczewski
Wave 3 of transparent SSH support (2 parallel owner-agents on disjoint packages).
- session-orchestrator: thread computerId end-to-end through the turn, mirroring cwd exactly: StartTurnInput/EnqueueInput/handleMessage/TurnLifecyclePayload gain computerId; runTurnDetached resolves effectiveComputerId via conversationStore.getEffectiveComputer(convId, override), persists the override, threads into RunTurnInput + ToolAssembly. Register a remote-degradation tools-filter (filterRemoteIncompatibleTools) that, when assembly.computerId is set (REMOTE), drops the 'lsp' tool + any '__'-namespaced MCP tool (local processes that can't see remote files); LOCAL (computerId undefined) is a passthrough, byte-identical to today. +21 tests.
- transport-contract: + computerId on ChatRequest (flows to ChatSendMessage) + computer endpoint API types (ComputerListResponse, ComputerResponse, ComputerStatusResponse, SetConversationComputerRequest, ConversationComputerResponse, SetWorkspaceDefaultComputerRequest, TestComputerResponse), mirroring the cwd/workspace endpoint types.
- CR-1 (non-blocking, folded into wave 4): MCP filter doesn't preserve computerId on the returned ToolAssembly.
- cache-warming computerId threading intentionally DEFERRED (user request); noted as a known performance-only limitation in tasks.md.

Verified: tsc -b EXIT 0, biome clean, 1620 vitest pass (was 1599, +21). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
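The remote-degradation filter is simple enough to sketch in a few lines. The `Tool` and `ToolAssembly` shapes below are assumptions; the rule itself (drop 'lsp' and any '__'-namespaced tool when computerId is set, pass through otherwise) is taken from the commit message.

```typescript
// Assumed minimal shapes; the real ToolAssembly has more fields.
interface Tool {
  name: string;
}
interface ToolAssembly {
  tools: Tool[];
  cwd?: string;
  computerId?: string;
}

function filterRemoteIncompatibleTools(assembly: ToolAssembly): ToolAssembly {
  // Local turn: return the same object untouched.
  if (assembly.computerId === undefined) return assembly;
  const tools = assembly.tools.filter(
    (tool) => tool.name !== "lsp" && !tool.name.includes("__"),
  );
  // Spreading keeps cwd and computerId on the result, which is the property
  // CR-1 asks the MCP filter to preserve as well.
  return { ...assembly, tools };
}
```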
4 days  feat(ssh): wave 2 - route filesystem/shell tools behind ExecBackend  Adam Malczewski
Wave 2 of transparent SSH support (4 parallel owner-agents on disjoint tool packages). The tools now resolve an ExecBackend per-call from ctx.computerId and call backend.spawn / backend.readFile / etc. instead of node:fs and node:child_process directly — so they are transport-agnostic (local now; remote over SSH later, transparent to the agent). Still LOCAL-ONLY this wave (computerId always undefined -> LocalExecBackend, behavior-identical). - tool-shell: factory takes resolveBackend; execute calls backend.spawn. spawn.ts DELETED (realSpawn was a verbatim duplicate of exec-backend's LocalExecBackend.spawn — logic moved to the sanctioned shared package). manifest dependsOn:[exec-backend]; host.getService at activation. - tool-read-file: readFile/stat/readdir -> backend.* (pure logic untouched; ENOENT .code branches kept). - tool-write-file: exists/stat/writeFile -> backend.* (pure logic untouched). - tool-edit-file: readFile/writeFile -> backend.* + forward-compatible REMOTE diagnostics skip (ctx.computerId set -> skip LSP, return empty — plan §6.1; local path byte-identical to today). LSP lookup stays lazy. - orchestrator: pre-wired @dispatch/exec-backend dep into the 4 tool package.jsons + bun install (build/config, my lane) so isolated verify resolved cleanly; agents added the ../exec-backend tsconfig ref. Verified: tsc -b EXIT 0, biome clean, 1599 vitest pass (was 1592). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
4 days  feat(ssh): wave 1 - ExecBackend + computer data model + runtime threading  Adam Malczewski
Wave 1 of transparent SSH support (parallel owner-agents on disjoint packages, plus the orchestrator-authored kernel contract seam from wave 0): - packages/wire: + Computer/ComputerEntry (read-only view over ~/.ssh/config Host aliases) + Workspace.defaultComputerId (string|null, null=local). Types only; 3 conformance tests. - packages/exec-backend (NEW core extension): the ExecBackend abstraction (spawn + minimal fs surface) the bundled tools will program against instead of node:fs/child_process. LocalExecBackend wraps today's node calls (behavior-identical; node:fs-style .code errors). execBackendHandle + ExecBackendResolver (sync; computerId undefined -> local; set -> throws until the ssh package wires remote resolution in wave 5). 20 tests. - packages/kernel (runtime only): thread computerId through dispatch.ts + run-turn.ts exactly as cwd is threaded (opaque, forwarded to ToolExecuteContext; absent = local = byte-identical to today). +2 tests. - packages/conversation-store: computer (SSH alias) assignment + resolution mirroring cwd — WorkspaceRow.defaultComputerId + setWorkspaceDefaultComputerId + getComputerId/setComputerId/clearComputerId + getEffectiveComputer (override -> per-conv -> workspace default -> null/local). Fixes the 3 Workspace literal sites the new required wire field broke. +18 tests. - orchestrator: root tsconfig.json ref for exec-backend + bun install. Verified: tsc -b EXIT 0, biome clean, 1592 vitest pass (was 1549, +43). Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
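The getEffectiveComputer resolution ladder (override -> per-conv -> workspace default -> null/local) can be written out as a small pure function. The in-memory `Workspace`/`Conversation` shapes and the parameter list are assumptions for this sketch; the real store method takes a conversation id and reads from persistence.

```typescript
// Hypothetical in-memory rows; the real conversation-store schema is not shown.
interface Workspace {
  defaultComputerId: string | null; // null = local
}
interface Conversation {
  workspaceId: string;
  computerId: string | null;
}

function getEffectiveComputer(
  conversation: Conversation,
  workspaces: Map<string, Workspace>,
  override?: string,
): string | null {
  // 1. An explicit per-turn override wins.
  if (override !== undefined) return override;
  // 2. Then the computer assigned to the conversation.
  if (conversation.computerId !== null) return conversation.computerId;
  // 3. Then the workspace default; null means "run locally".
  const workspace = workspaces.get(conversation.workspaceId);
  return workspace === undefined ? null : workspace.defaultComputerId;
}
```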
4 days  feat(ssh): wave 0 - kernel contract seam (computerId)  Adam Malczewski
Add additive optional `computerId` field to ToolExecuteContext + RunTurnInput. The kernel never interprets it (forwards verbatim to tools, like cwd) — it never enters the model prompt (no prompt-cache impact). When omitted/undefined, execution is LOCAL (today's behavior), so this is fully backward compatible. This is the orchestrator-authored seam (ORCHESTRATOR.md §2a) that lets Wave 1's producers (wire Computer types, exec-backend contract) and the consumer (kernel runtime threading) run in parallel against a fixed type. Refs: notes/ssh-support-plan.md (decisions resolved in §0.5/§13). No merge or push.
4 days  feat(cli): add --workspace filter to 'dispatch list'  Adam Malczewski
The backend already supported GET /conversations?workspaceId= but the CLI never sent it. Wire the list command to that filter: - args.ts: parse --workspace / -w on 'list' (placed before the --catch-all so the single-dash -w shorthand isn't taken for a positional prefix); add workspaceId? to the list ParsedCommand. - http.ts: add workspaceId? to FetchConversationsOpts; send ?workspaceId= (after q/status, preserving URLSearchParams order). - main.ts: forward parsed.workspaceId into fetchConversations; update USAGE. Composable with --status and the <prefix> short-id arg. 'Open conversations in workspace X' is now: dispatch list --workspace X (status defaults to active,idle). No contract changes — purely additive CLI wiring. Tests: +4 args (incl. composability + missing-value error), +2 http (exact ?workspaceId= URL + combined status/workspaceId with %2C encoding). typecheck EXIT 0, biome clean (364 files), full suite 1558 passed. Live-verified against an isolated server.
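The query-string behavior mentioned in the tests (parameter order and %2C encoding of the comma-joined status list) follows from how URLSearchParams serializes values. A minimal sketch, assuming a hypothetical `conversationsUrl` helper and a placeholder base URL; only the option name `workspaceId` and the `FetchConversationsOpts` type name appear in the commit.

```typescript
// Field names other than workspaceId are assumptions modeled on the commit text.
interface FetchConversationsOpts {
  q?: string;
  status?: string[];
  workspaceId?: string;
}

function conversationsUrl(base: string, opts: FetchConversationsOpts): string {
  const params = new URLSearchParams();
  if (opts.q !== undefined) params.set("q", opts.q);
  if (opts.status !== undefined) params.set("status", opts.status.join(","));
  // Added last so it follows q/status in the serialized query string.
  if (opts.workspaceId !== undefined) params.set("workspaceId", opts.workspaceId);
  const query = params.toString();
  return query === "" ? base + "/conversations" : base + "/conversations?" + query;
}

// "Open conversations in workspace X": the comma is percent-encoded as %2C.
const listUrl = conversationsUrl("http://localhost", {
  status: ["active", "idle"],
  workspaceId: "X",
});
```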
4 days  plan(ssh): lock final decision - take ssh-config dep; no open questions remain  Adam Malczewski
Resolve the last open question: take the ssh-config npm package (project-local, alongside ssh2) for correct ~/.ssh/config parsing rather than hand-rolling. §13 now lists all 8 decisions as resolved and marks the plan decision-complete. Also records minor adopted defaults (config reader lives in ssh extension; stale alias surfaced as unresolved not silent-local; default identity probing order; assume unencrypted keys for MVP). Planning document only; no code changed. No merge or push.
4 days  plan(ssh): bake in resolved decisions + ~/.ssh/config discovery  Adam Malczewski
Update the SSH support plan to reflect user-confirmed decisions and a key simplification from a new requirement: - New §0.5 'Resolved decisions' records all 7 confirmed answers. - Computer is now a READ-ONLY view over ~/.ssh/config (Host aliases), not a persisted CRUD entity: no computer-store package, no create/update/delete API. computerId IS an SSH config alias. ~/.ssh/known_hosts is the host-key trust store (auto-trust-and-pin). - Auth simplified to key-only from ~/.ssh (no gopass/SecretsAccess/secretRef anywhere). - ssh2 only (no bun-ssh2 fork); verifying under Bun is the load-bearing Phase-3 first step. - LSP/MCP silently dropped on remote turns (no system-prompt note); edit_file works with no diagnostics on remote. - computerId persisted per-conversation (like cwd). - Updated data model (§3), connection mgmt (§4), security (§7), edge cases (§8), API surface (§9 read-only), frontend (§10), packages table (§11, no computer-store), phases (§12), and resolved open questions (§13). Planning document only; no code changed. No merge or push.
4 days  docs(notes): research - list conversations filtered by worktree/workspace  Adam Malczewski
Investigation of whether the backend supports listing open conversations filtered by a specific worktree/workspace. Findings: - 'worktree' is not a Dispatch domain concept; canonical term is 'workspace' (logical grouping) vs 'working directory' (cwd, filesystem path). - GET /conversations already supports composable ?workspaceId=, ?status=, ?q= filters. 'Open conversations in workspace X' = ?workspaceId=X&status=active,idle. - Every conversation carries a workspaceId (default 'default'); ConversationMeta is in @dispatch/wire; filter lives in conversation-store listConversations. - A literal directory (git worktree) filter (?cwd=) is NOT supported; §3b documents the small additive change needed across wire/store/transport-http. - Test coverage verified: store-workspace.test.ts:369, store.test.ts:1463, app.test.ts:3696. Research notes only — no code/contract changes.
4 days  plan(ssh): add transparent SSH support design & implementation plan  Adam Malczewski
Research and plan transparent SSH execution so an agent runs commands on a remote computer as if local; the agent never learns it is using SSH. Covers:
- How the cwd → ToolExecuteContext pipeline works today and where a computerId threads in (mirroring cwd end-to-end)
- The ExecBackend abstraction (spawn + fs) behind which tool-shell/read-file/write-file/edit-file are refactored, with LocalExecBackend (node) and SshExecBackend (ssh2) implementations
- Computer data model + workspace defaultComputerId + per-conversation override, mirroring the getEffectiveCwd resolution ladder (null = local)
- SSH connection pooling (one per computer, lazy connect, keep-alive, idle reaping), auth via SecretsAccess/gopass, host-key verification
- Turn loop / dispatch integration (additive optional computerId field, backward-compatible: absent = today's local behavior)
- LSP/MCP degrade by dropping those tools on remote turns (future: remote server spawn over SSH)
- API surface (computer CRUD, per-conv + workspace-default endpoints, chat.send gains computerId), frontend impact
- Security, edge cases, phased implementation, contract gaps reported to unit owners (one-owner-per-unit honored; planner does not edit others)

No code changed; planning document only. No merge or push.
4 days  feat(cli): add --file flag to 'dispatch send' subcommand  Adam Malczewski
Add the same --file <path> support that the summon (chat) command has to the 'dispatch send' subcommand. When --file is given, the file's contents are read and attached to the message (composed via composeMessage, identical to chat). - args.ts: add 'file' to the send ParsedCommand, make 'text' optional, parse --file, and require at least one of --text or --file. - main.ts: read the file and compose the message in the send case, using the composed message in both the --queue and streaming branches; update USAGE. - args.test.ts: cover --file parsing (alone, with --text, missing value) and update the existing send expectations + the both-missing error message.
4 days  fix(bin): pin dev ports in bin/up so shell BACKEND_PORT can't override .env  Adam Malczewski
bin/up ran `bun --watch main.ts` without setting BACKEND_PORT, so a shell-exported BACKEND_PORT (e.g. 24991 in ~/.bashrc, set so the Dispatch CLI hits the prod server) overrode .env's dev value 24203 — Bun lets shell env win over .env — binding the dev server onto the production port and colliding with the active dispatch-server systemd service. transport-http then failed to activate (Bun.serve "port in use"), so the HTTP server never came up and the frontend got "Failed to fetch". Force BACKEND_PORT=24203 + SURFACE_WS_PORT=24205 in the setsid invocation so the dev stack is deterministic regardless of the shell environment.
4 days  feat(transport-http): add GET /conversations/:id/mcp status endpoint  Adam Malczewski
Mirrors the existing GET /conversations/:id/lsp route exactly: gates on the persisted then effective cwd (null → empty servers), returns 503 when the MCP service isn't loaded, and maps McpServerStatus → McpServerInfo (conditionally including `error` per exactOptionalPropertyTypes). Wires mcpService into CreateServerOptions + extension activate via a plain host.getService (mirroring lspService; "mcp" added to dependsOn, route added to contributes.routes), adds the @dispatch/mcp workspace dep, and re-exports mcpServiceHandle / McpService / McpServerStatus from seam.ts. Adds 4 tests mirroring the LSP status tests.
4 days  fix(lsp): prevent server crash from malformed LSP messages  Adam Malczewski
Two bugs caused the dispatch server to crash (15 times since Jun 24) when chat cc6c edited packages/transport-http/src/app.ts, a 40KB file with 23 multi-byte UTF-8 lines. The edit_file diagnostics hook sends the file to tsserver, which sends back a large publishDiagnostics response. When the response was split across stdout chunks at a multi-byte character boundary, the server crashed.

Layer 1 (rpc.ts handleMessage): JSON.parse had no try/catch. A corrupted message threw an unhandled SyntaxError → unhandled rejection → process exit. Wrapped in try/catch; malformed messages are now skipped. Also hardened client.ts handleBytes: the async handleMessage Promise was fire-and-forget. Added .catch(() => {}) as defence-in-depth so no rejection from the RPC layer can ever crash the server.

Layer 2 (framing.ts FrameDecoder): used a string buffer with new TextDecoder().decode(chunk) (no { stream: true }), corrupting multi-byte characters split across chunks. Worse, Content-Length counts bytes but the buffer was sliced by character count; for multi-byte content byte length ≠ char length, so the decoder extracted the wrong slice as a message. Rewrote to use a Uint8Array byte buffer: header separator search is byte-level, Content-Length comparison is byte-level, and the body is decoded only after all bytes are confirmed present.

Tests: 5 new multi-byte framing tests (split at char boundary, byte-vs-char Content-Length, two messages in one chunk, three-way split) + 1 rpc test (malformed JSON does not throw). All 1545 tests pass.
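The byte-buffer approach from Layer 2 can be illustrated with a compact decoder. This is a sketch written for this log, not the contents of framing.ts: the `push` method and `indexOfSeparator` helper are assumed names, and error handling is reduced to skipping a header block that lacks Content-Length.

```typescript
// Sketch of byte-level Content-Length framing; names are assumptions.
class FrameDecoder {
  private buffer = new Uint8Array(0);
  private readonly text = new TextDecoder();

  /** Feed one stdout chunk; returns every complete message body now available. */
  push(chunk: Uint8Array): string[] {
    const joined = new Uint8Array(this.buffer.length + chunk.length);
    joined.set(this.buffer, 0);
    joined.set(chunk, this.buffer.length);
    this.buffer = joined;

    const messages: string[] = [];
    for (;;) {
      const headerEnd = indexOfSeparator(this.buffer);
      if (headerEnd === -1) break; // header not complete yet
      const header = this.text.decode(this.buffer.subarray(0, headerEnd)); // headers are ASCII
      const bodyStart = headerEnd + 4;
      const match = /content-length:\s*(\d+)/i.exec(header);
      if (match === null) {
        this.buffer = this.buffer.slice(bodyStart); // unusable header block: drop it
        continue;
      }
      const bodyEnd = bodyStart + Number(match[1]); // a byte count, not a character count
      if (this.buffer.length < bodyEnd) break; // wait for the remaining bytes
      // Decode only once every byte of the body is present.
      messages.push(this.text.decode(this.buffer.subarray(bodyStart, bodyEnd)));
      this.buffer = this.buffer.slice(bodyEnd);
    }
    return messages;
  }
}

/** Index of the first \r\n\r\n sequence in the bytes, or -1 if absent. */
function indexOfSeparator(bytes: Uint8Array): number {
  for (let i = 0; i + 3 < bytes.length; i += 1) {
    if (bytes[i] === 13 && bytes[i + 1] === 10 && bytes[i + 2] === 13 && bytes[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}
```

Because slicing and length checks operate on bytes, a chunk boundary that falls in the middle of a multi-byte character no longer matters: the partial character simply stays in the buffer until the rest arrives.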
4 days  feat(transport-contract): add McpServerInfo + McpStatusResponse (0.22.0)  Adam Malczewski
Additive types for GET /conversations/:id/mcp status endpoint, mirroring the existing LSP status types. McpServerState, McpServerInfo, McpStatusResponse. +2 type-test assertions. Version bump 0.21.0 → 0.22.0. Handoff written: frontend-mcp-status-handoff.md (backend route + FE consumption).
4 days  docs: live-verify MCP + per-edit diagnostics; update tasks.md (1537 tests)  Adam Malczewski
- MCP live-verified: test MCP server → tool discovery (test__ping) → tool call → pong result. Full turn lifecycle confirmed on production server.
- Per-edit diagnostics live-verified: type error in .ts file surfaces [TypeScript Language Server] ERROR (2322) inline after edit.
- edit_file bug found + fixed during live-verify (lazy LSP lookup).
4 days  fix(tool-edit-file): lazy LSP service lookup - diagnostics now actually work  Adam Malczewski
The previous fix (e03a96e) wrapped getService in try/catch to prevent the activation crash, but that wasn't enough: tool-edit-file activates at position 5 in CORE_EXTENSIONS while lsp activates at position 20. So getService ALWAYS threw at activation time, lspService was ALWAYS undefined, and the diagnostics hook was NEVER wired — edits succeeded but never showed LSP feedback. Fix: make the LSP service lookup LAZY — defer it to edit time (when the tool is actually called), not activation time. By then all extensions have activated. The diagnostics function tries getService on each edit call; if LSP isn't loaded, it returns a no-op (graceful degradation).
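The difference between activation-time and edit-time lookup can be shown in a few lines. Everything here is a simplified stand-in: `LspService`, `makeDiagnostics` and the zero-argument `getService` are hypothetical, chosen only to show the lookup moving inside the per-edit function.

```typescript
// Hypothetical shapes used only to illustrate lazy lookup.
interface LspService {
  getDiagnostics(path: string): string;
}
type GetService = () => LspService; // throws while no provider is registered

function makeDiagnostics(getService: GetService): (path: string) => string {
  // Nothing is resolved here, at "activation" time.
  return (path) => {
    let lsp: LspService;
    try {
      lsp = getService(); // resolved on every edit call
    } catch {
      return ""; // LSP not loaded: degrade to no diagnostics
    }
    return lsp.getDiagnostics(path);
  };
}
```

An extension that activates at position 5 can build this function immediately, and it starts returning diagnostics as soon as the provider at position 20 has registered.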
4 days  fix(tool-edit-file): wrap getService in try/catch to prevent activation crash  Adam Malczewski
The per-edit diagnostics change (8f6114b) called host.getService(lspServiceHandle) during activate(). But getService THROWS when a service has no provider — so if the LSP extension activates AFTER tool-edit-file (or isn't loaded at all), the activate() function crashes and the edit_file tool is NEVER REGISTERED. This is why the edit_file tool was missing from the agent toolset. Fix: wrap getService in try/catch — if the LSP service isn't available yet, lspService becomes undefined and edits proceed without diagnostics (the graceful degradation the comment always promised but the code didn't deliver).
4 days  feat(mcp): Model Context Protocol host extension  Adam Malczewski
New `mcp` standard extension (`packages/mcp/`) that makes Dispatch an MCP host: spawns configured MCP servers (stdio child processes), performs the MCP handshake (initialize → notifications/initialized), discovers tools via tools/list, and registers each as a first-class Dispatch ToolContract via host.defineTool. When the model calls an MCP tool, the extension proxies the call to tools/call on the MCP server and returns the flattened result.

Architecture (sibling of `lsp` extension):
- Config: .dispatch/mcp.json (servers key) → opencode.json mcp key fallback, resolved per-cwd (mirrors LSP config resolution)
- Transport: StdioTransport (spawn child, Content-Length framing + JSON-RPC 2.0)
- Client: initialize → tools/list → tools/call; handles list_changed notifications for dynamic tool updates
- Registry: tool name namespacing (<serverId>__<toolName>), ToolContract adapter that proxies execute → callTool, content flattening (text/image/resource → string)
- Manager: one client per server, lazy-spawn, status(), shutdownAll()
- Extension: manifest (dependsOn session-orchestrator, capabilities spawn), registers tools + a toolsFilter (drops disconnected server's tools), mcpServiceHandle, deactivate kills all child processes

Phase 1 scope: stdio only, Tools only (no Resources/Prompts/HTTP/sampling). Hand-rolled JSON-RPC + framing (zero external deps, adapts LSP patterns).

Wave 1 (agent): 12 source + 8 test files, 69 new tests. Wave 2 (orchestrator): root tsconfig ref, host-bin CORE_EXTENSIONS registration + package.json dep, bun install.

Verified: tsc -b EXIT 0, biome clean, 1537 vitest pass (was 1468, +69).
5 daysdocs(mcp): add MCP/MCP server/MCP host glossary entriesAdam Malczewski
5 daysdocs: MCP (Model Context Protocol) integration design + implementation planAdam Malczewski
- notes/mcp-design.md: full design — architecture fit (sibling of lsp ext), per-cwd config (.dispatch/mcp.json + opencode.json mcp key), tool name namespacing (<serverId>__<toolName>), ToolContract adapter, content flattening, security, glossary additions, 6 open design decisions - PLAN-mcp.md: wave breakdown (Wave 0 contracts/wiring, Wave 1 the mcp extension, Wave 2 host-bin registration, Wave 3 live verification) - Phase 1 scope: stdio only, Tools only, no surface, hand-rolled JSON-RPC - No kernel contract change needed (existing ToolContract + defineTool + toolsFilter are sufficient)
5 daysdocs: update tasks.md (per-edit diagnostics milestone, 1468 tests) + retire stale HANDOFF.mdAdam Malczewski
- tasks.md: record per-edit LSP diagnostics auto-append milestone (commit 8f6114b), fix test count 1453→1468 - HANDOFF.md: retire stale post-MVP handoff (referenced arch-rewrite path, 178 tests, next-steps all done) → current accurate pointer file
5 daysfeat(lsp+tool-edit-file): multi-server diagnostics + per-edit auto-appendAdam Malczewski
LSP extension: - Multi-server aggregation: query ALL connected servers matching the file's extension (not just the first), merge diagnostics tagged by source - Incremental sync: capture each server's textDocumentSync.change during initialize; compute prefix/suffix diff ranges for change:2 servers; full content for change:1 (generic, works for any LSP) - New diff.ts: pure computeChangeRange + offsetToPosition (O(n), tested) - Buffer sync: change(filePath, newText) sends didChange with post-edit in-memory content; openWithText for first open; tracks open doc text - languageId mapping: extended with .rb/.rbs/.c/.cpp/etc. (was 'unknown') - waitForDiagnostics: accepts text override + timeoutMs; returns { formatted, slow, timedOut }; polls for publishDiagnostics push - DiagnosticsStore: hasReceivedPush/clearReceived tracking; formatFiltered with minSeverity (1=Error, 2=Warning) for edit_file integration - LspService.getDiagnostics: service method for cross-extension use tool-edit-file: - After successful edit, calls LSP getDiagnostics with post-edit buffer - Only appends diagnostics with severity ≤ 2 (errors+warnings, no noise) - Appends slow warning (>10s): 'LSP is taking unusually long...' - 60s timeout; graceful degradation when no LSP available - Optional dep on @dispatch/lsp (getService pattern, not manifest depOn) 1468 vitest pass (was 1453, +15 new diff tests).
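A prefix/suffix diff of the kind described for diff.ts could be sketched as below. The names `computeChangeRange` and `offsetToPosition` come from the commit message, but their signatures and the `ChangeRange` return shape are assumptions. Offsets are JavaScript string indices (UTF-16 code units), which matches the LSP default position encoding.

```typescript
interface Position {
  line: number;
  character: number;
}

// Assumed return shape: the changed span as offsets into the old and new text.
interface ChangeRange {
  start: number;  // first differing offset (same in both texts)
  oldEnd: number; // exclusive end of the replaced span in the old text
  newEnd: number; // exclusive end of the replacement span in the new text
}

// Trim the common prefix, then the common suffix, leaving one changed span. O(n).
function computeChangeRange(oldText: string, newText: string): ChangeRange {
  const maxPrefix = Math.min(oldText.length, newText.length);
  let start = 0;
  while (start < maxPrefix && oldText.charCodeAt(start) === newText.charCodeAt(start)) {
    start++;
  }
  let oldEnd = oldText.length;
  let newEnd = newText.length;
  while (
    oldEnd > start &&
    newEnd > start &&
    oldText.charCodeAt(oldEnd - 1) === newText.charCodeAt(newEnd - 1)
  ) {
    oldEnd--;
    newEnd--;
  }
  return { start, oldEnd, newEnd };
}

// Convert a string offset to a zero-based line/character position. O(n).
function offsetToPosition(text: string, offset: number): Position {
  const end = Math.min(offset, text.length);
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < end; i++) {
    if (text.charCodeAt(i) === 10) { // "\n"
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: end - lineStart };
}
```

For a change:2 server, the didChange range would run from `offsetToPosition(oldText, start)` to `offsetToPosition(oldText, oldEnd)`, with `newText.slice(start, newEnd)` as the replacement text.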
5 daysdocs: remove claude extension references from READMEAdam Malczewski
5 daysdocs: replace local paths with GitHub repo linksAdam Malczewski
- Web frontend section links to github.com/realtradam/dispatch-web - Claude extension links to github.com/realtradam/dispatch-adapter-claude - Workspace layout uses generic clone instructions, not /home/tradam paths - Dev stack bin/up instructions show clone-both-repos-as-siblings
5 daysdocs: rewrite README for current project stateAdam Malczewski
- Full HTTP API table (30+ endpoints: conversations, workspaces, LSP, system-prompt, metrics, queue, cache-warming, compaction, etc.) - Complete CLI commands (list, read, send, stop, compact, open + flags) - All 37 packages documented with tiers and dependencies - systemd deploy section (bin/build, bin/install, bin/sync-env) - Dev stacks table (bin/up 24203, bin/up2 25203) - Workspace layout (dispatch-backend, dispatch-web, claude, bin) - Updated web frontend section (Slice 2 browser chat in progress) - Links to all design docs
5 daysdocs: update paths from arch-rewrite to dispatch-backendAdam Malczewski
After consolidating to the dev branch and renaming the worktree, update all path references in ORCHESTRATOR.md and .skills/ORCHESTRATOR.md.
5 daysdocs(tasks): mark live-verifies complete + slim roadmapAdam Malczewski
Mark all 5 live-verify checkboxes as done (reasoning effort, todo tool, CLI cross-client, abort-race, system-prompt builder). Slim the roadmap from 11 items down to 3 open items (web frontend, close-with-queued-messages product decision, FE crash-recovery status endpoint) by dropping the 8 completed/verified items.
5 daysfix(kernel+tool-shell): abort hanging tool calls without bricking the conversationAdam Malczewski
kernel: executeToolCall now races tool.execute against the abort signal via Promise.race; on abort resolves (not rejects) with an "Aborted" result so the step completes normally → finishReason "aborted" → turn seals cleanly (done event) → finally clears activeTurns → conversation freed, next message accepted. run-turn strips tool-call chunks from the assistant message on abort (keeps text/thinking) and omits tool-result messages to avoid persisting dangling tool calls that would 400 the provider next turn. tool-shell: realSpawn spawns detached (own process group); on abort AND timeout kills the entire group (process.kill(-pgid, SIGKILL)) and resolves immediately — no child.on("close") dependency, so a grandchild holding the pipes can't stall the spawn promise or leak. Also: ORCHESTRATOR.md migrated to dispatch CLI summon mechanism; .skills summary; bin/sync-env PATH injection; frontend handoff docs. 1453 vitest pass · tsc -b EXIT 0 · biome clean.
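The race against the abort signal described for the kernel might be sketched as follows. This is only an illustration: the `ToolResult` shape and the `raceAgainstAbort` helper are hypothetical, and the real executeToolCall is not shown in this log. The key point from the commit message is that an abort resolves with an "Aborted" result rather than rejecting.

```typescript
// Hypothetical result shape for illustration.
interface ToolResult {
  ok: boolean;
  output: string;
}

const ABORTED: ToolResult = { ok: false, output: "Aborted" };

// Resolve with an "Aborted" result as soon as the signal fires, even if the
// tool's own promise never settles. A rejection from the tool still propagates.
function raceAgainstAbort(work: Promise<ToolResult>, signal: AbortSignal): Promise<ToolResult> {
  if (signal.aborted) return Promise.resolve(ABORTED);
  let onAbort: () => void = () => {};
  const aborted = new Promise<ToolResult>((resolve) => {
    onAbort = () => resolve(ABORTED);
    signal.addEventListener("abort", onAbort, { once: true });
  });
  // Remove the listener once either side wins so long-lived signals do not accumulate handlers.
  return Promise.race([work, aborted]).finally(() => {
    signal.removeEventListener("abort", onAbort);
  });
}
```

Because the abort path resolves, the caller's normal completion logic runs and no special error branch is needed to free the conversation.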
5 daysfix(broken-chat): read-time self-repair of unrecoverable chatsAdam Malczewski
reconcile() only repaired orphaned tool-calls. Two other broken states made chats uncontinuable, and load() had no parse-error guard: - A trailing assistant message whose only chunk is 'error' (a failed- generation marker) serializes to empty content -> provider rejects/empty -> chat never continues. 6 of 140 production conversations were stuck. - A tool-call whose input is a raw malformed-JSON string (model emitted broken JSON) re-sent as OpenAI arguments -> provider 400s on every continuation (the 77574596 break). - load() JSON.parse had no try/catch -> one corrupt row bricked the chat. Fix = read-time repair (no DB surgery; append-only preserved). reconcile runs on every load() BEFORE any provider sees messages, so Layer 1 protects ALL providers. Layer 1 (conversation-store reconcile): strip error chunks from assistant messages + drop the now-empty error-only messages (safe: never followed by a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport +2 additive counts. loadSince (FE reads) intentionally unreconciled so the user still SEES the error. load() wraps JSON.parse in try/catch (skip corrupt rows). Layer 2 (openai-stream): serializeToolArguments ensures tool-call arguments is always valid JSON (malformed string -> fallback object), neutralizing already-stored malformed args. Layer 2 equiv (../claude provider-anthropic): safeJson returns a valid object fallback on parse failure, not the raw string. (Separate repo.) Live-verified: reproduced 77574596's real broken tail in the dev DB; POST /chat continued it cleanly (no 400, model replied) — the provider accepted the reconciled history. tsc -b EXIT 0, biome clean, 1453 vitest pass.
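The Layer 2 guard on tool-call arguments could look roughly like this. The function name `serializeToolArguments` comes from the commit message; its signature and the shape of the fallback object (the `_malformed` key) are assumptions made for this sketch.

```typescript
// Always return a string that parses as JSON, suitable for a tool-call `arguments` field.
function serializeToolArguments(input: unknown): string {
  if (typeof input === "string") {
    try {
      JSON.parse(input); // already valid JSON text: pass it through unchanged
      return input;
    } catch {
      // Malformed JSON emitted by the model: wrap it in a valid fallback object
      // (hypothetical shape) instead of re-sending the raw string.
      return JSON.stringify({ _malformed: input });
    }
  }
  // Structured input: serialize it, treating null/undefined as an empty object.
  return JSON.stringify(input ?? {});
}
```

Applying the guard at serialization time also neutralizes malformed arguments that were stored before the fix, since nothing in the database has to be rewritten.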
5 daysdocs(lsp): live-verify passed — broken-server recovery + configSource + shadow warningAdam Malczewski
All five live-verify checks passed against the dev stack (bin/up :24203): configSource reaches the wire (built-in TS, 'built-in'); broken server reports error + configSource + source-named error; recovery without restart (blocker, error->connected after config fix); no retry storm; shadow warning logged via host.logger when both configs declare lsp.
5 daysfix(lsp): broken-server recovery + config source attributionAdam Malczewski
Two issues found by decompiling the running dispatch-server binary (handoff from a ruby-lsp setup in raylib-jamstack): Issue 2 (blocker): a failed LSP server was "broken" FOREVER — the manager's broken set was cleared only in shutdownAll(), so a server that failed (bad env, missing binary, or a since-fixed config) stayed state:"error" for the whole process. For an agent running *inside* dispatch the only recovery (server restart) kills its own session. Now a broken server self-heals when its resolved config changes since it was marked broken (discrete event → no retry storm), with a bounded backoff for transient failures. Issue 1: .dispatch/lsp.json silently shadowed opencode.json's lsp key with no warning and no source attribution. Now: shadow warning via host.logger when both declare lsp; configSource populated on status (.dispatch/lsp.json / opencode.json / built-in); spawn-failure error strings name the config source. Contract: additive configSource?: string on LspServerInfo (@dispatch/transport-contract 0.20.0→0.21.0). transport-http passes it through to the wire (was a field-by-field map that dropped it — CR resolved by the transport-http owner). tsc -b EXIT 0, biome clean, 1443 vitest pass.
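One way to picture the self-healing broken set is the sketch below. It is not the manager's actual code: the `BrokenServers` class, its method names, the use of a string config key (for example a JSON serialization of the resolved config), and the specific backoff constants are all assumptions. It shows the two behaviours the commit message names: a config change clears the broken state immediately, and an unchanged config is retried only after a bounded, growing delay.

```typescript
interface BrokenEntry {
  configKey: string; // resolved config at the time of failure (hypothetical: a JSON string)
  failures: number;  // consecutive failures with this config
  retryAt: number;   // earliest time (ms) another spawn attempt is allowed
}

class BrokenServers {
  private readonly entries = new Map<string, BrokenEntry>();
  private readonly baseMs: number;
  private readonly maxMs: number;

  constructor(baseMs = 1_000, maxMs = 60_000) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
  }

  // Record a failed spawn; the delay doubles per consecutive failure up to maxMs.
  markBroken(serverId: string, configKey: string, now: number): void {
    const previous = this.entries.get(serverId);
    const failures = previous && previous.configKey === configKey ? previous.failures + 1 : 1;
    const delay = Math.min(this.baseMs * 2 ** (failures - 1), this.maxMs);
    this.entries.set(serverId, { configKey, failures, retryAt: now + delay });
  }

  // A changed config heals the server at once; otherwise wait out the backoff.
  canSpawn(serverId: string, configKey: string, now: number): boolean {
    const entry = this.entries.get(serverId);
    if (!entry) return true;
    if (entry.configKey !== configKey) {
      this.entries.delete(serverId);
      return true;
    }
    return now >= entry.retryAt;
  }
}
```

Keying recovery on a config change makes it a discrete event, so a persistently bad config cannot trigger a retry storm.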
5 daysdocs: task 3 (per-conversation model persistence) doneAdam Malczewski
5 daysfeat: persistent per-conversation model selectionAdam Malczewski
A chat's selected provider + model is now persisted per conversation (like cwd and reasoningEffort). Opening a conversation in a new browser recalls the originally selected model instead of defaulting. - transport-contract 0.19.0→0.20.0: ModelResponse + SetModelRequest types for GET/PUT /conversations/:id/model. - conversation-store: getModel/setModel (model:<id> key, mirrors getReasoningEffort/setReasoningEffort); forkHistory copies model; empty string clears. - session-orchestrator: resolve model from persisted store when no per-turn override; persist the resolved model so it sticks; warm path parity. - transport-http: GET/PUT /conversations/:id/model endpoints with validation. 1433 vitest pass; tsc + biome clean.
5 daysdocs: task 2 (system-prompt cwd reconstruction) doneAdam Malczewski