| Age | Commit message | Author |
|
|
|
The script previously required sudo for its entire run (id -u check),
which meant bin/build ran as root and created root-owned dist/ files.
On the next build, the normal user couldn't overwrite them (EACCES).
Now the script runs without a sudo prefix: the build step runs as the
normal user (dist/ files are user-owned), and sudo is used only on the
specific lines that write to system directories (/usr/bin, /etc,
/usr/share) or call systemctl.
|
|
bin/build was compiling the binary directly from stale dist/*.js files
without first recompiling the TypeScript packages. Since package.json
main fields point to dist/index.js, source edits to .ts files were
silently lost in the compiled binary.
Now tsc --build runs first (composite project references rebuild all
packages in dependency order), then bun build --compile bundles the
fresh dist/ output.
|
|
Agents were being cut off mid-task at 50 steps. The MAX_STEPS=50
hardcoded limit was silently terminating turns while the model was
actively making tool calls, leaving conversations idle with a
dangling tool-result as the last chunk.
Setting MAX_STEPS to 0 disables the limit — the loop runs until the
model stops making tool calls naturally or the abort signal fires.
The max-steps code path is preserved for when MAX_STEPS > 0.
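
A minimal sketch of the loop shape described above, assuming a step function that reports how many tool calls the model made. Only MAX_STEPS comes from the commit; runLoop, StepResult and the reason strings are hypothetical names for illustration.

```typescript
// Hypothetical sketch: only MAX_STEPS is named in the commit message.
type StepResult = { toolCalls: number };
type LoopEnd = { steps: number; reason: "done" | "aborted" | "max-steps" };

const MAX_STEPS = 0; // 0 disables the cap; any value > 0 re-enables it

async function runLoop(
  step: (n: number) => Promise<StepResult>,
  signal: AbortSignal,
  maxSteps: number = MAX_STEPS,
): Promise<LoopEnd> {
  let steps = 0;
  while (true) {
    if (signal.aborted) return { steps, reason: "aborted" };
    // The max-steps path only exists when a positive limit is configured.
    if (maxSteps > 0 && steps >= maxSteps) return { steps, reason: "max-steps" };
    const result = await step(steps);
    steps += 1;
    // The model stopped calling tools, so the turn ends naturally.
    if (result.toolCalls === 0) return { steps, reason: "done" };
  }
}
```

With maxSteps set to 0 the only exits are a step with no tool calls or the abort signal, which matches the behaviour described in the message.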
|
|
The test endpoint's runProbe() waited for the ssh2 stream's 'close' event,
which some SSH servers never emit for short-lived exec channels (the command
'true' exits instantly). This caused the promise to hang forever — the HTTP
response never returned, and the FE's Test spinner spun indefinitely.
Three fixes:
1. runProbe now resolves on the 'exit' event (not 'close') — the command has
finished and the exit code is available. 'close' is kept as a fallback.
Stream data/stderr are drained to prevent buffer deadlocks.
2. runProbe has a 15s timeout safety net — if the exec callback or 'exit'
event never fires (e.g. server requires a pty for exec), the probe
resolves false instead of hanging forever.
3. The entire test() method is wrapped in a 30s Promise.race timeout —
even if pool.acquire() or pool.drop() hangs, the endpoint ALWAYS
responds with { ok, error? }.
The probe is fully non-interactive (no blocking prompts). tsc EXIT 0,
biome clean, 1756 tests pass.
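
The first and third fixes can be sketched roughly as follows. This is an illustration, not the code in packages/ssh: ExecStreamLike is a hypothetical stand-in for the part of an ssh2 exec stream the probe touches, waitForExit and withTimeout are invented names, and treating a bare 'close' as a failed probe is an assumption.

```typescript
// Hypothetical stand-in for the subset of an ssh2 exec stream used here.
interface ExecStreamLike {
  on(event: string, cb: (...args: unknown[]) => void): unknown;
  stderr: { on(event: string, cb: (...args: unknown[]) => void): unknown };
}

// Resolve on 'exit' (exit code available), fall back to 'close', and give up
// after timeoutMs so the promise can never hang forever.
function waitForExit(stream: ExecStreamLike, timeoutMs = 15_000): Promise<boolean> {
  return new Promise((resolve) => {
    let settled = false;
    const timer = setTimeout(() => finish(false), timeoutMs);
    function finish(ok: boolean): void {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(ok);
    }
    // Drain both channels so a full buffer cannot stall the remote command.
    stream.on("data", () => {});
    stream.stderr.on("data", () => {});
    stream.on("exit", (code) => finish(code === 0));
    // Assumption: a 'close' with no prior 'exit' counts as a failed probe.
    stream.on("close", () => finish(false));
  });
}

// Outer safety net: race any promise against a timer that yields a fallback.
function withTimeout<T>(work: Promise<T>, ms: number, onTimeout: () => T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(onTimeout()), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A caller would wrap the whole test in something like withTimeout(test(), 30_000, () => ({ ok: false, error: "timed out" })), so the endpoint responds even if the pool itself hangs.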
|
|
Two improvements to the SSH support feature:
1. KNOWN_HOSTS DISCOVERY (packages/ssh):
Computers are now auto-discovered from ~/.ssh/known_hosts (every hostname
you've ever connected to) in ADDITION to ~/.ssh/config (explicit Host
aliases). Config entries take precedence (full params); known_hosts entries
get defaulted params (User=defaultUser, IdentityFile=null so the pool probes
default keys, Port from [host]:port or 22, knownHost=true). Zero-config —
no ~/.ssh/config file needed; hosts just appear.
Reject list: dispatch.toml [ssh].reject = [...] (glob patterns like
github.com, *.ts.net) filters noise from the catalog. Read from both
the global ~/.config/dispatch/dispatch.toml and the project dispatch.toml.
Parsed with Bun.TOML.parse (zero deps). Only filters discovery (catalog);
specific lookups (getComputer/getStatus/test/connect) ignore the reject
list (it's a visibility filter, not access control).
New pure functions: parseKnownHosts(), isRejected(), globMatch().
+26 tests. tsc EXIT 0, biome clean, 1756 tests pass.
2. REMOTE SYSTEM-PROMPT AWARENESS (packages/system-prompt):
When a conversation has a computerId set (remote turn), the system prompt
now resolves system:os, system:hostname, git:branch/git:status, and
file: reads against the REMOTE machine — not the local host. Previously
the prompt always said 'Arch Linux (WSL)' + local hostname even when the
agent was connected to a remote Artix Linux machine.
The ResolverAdapters' hostname()/platform() are now async (so a remote
adapter can run 'hostname'/'uname -s' over SSH). The system-prompt
extension builds remote adapters from the ExecBackend (readFile→SFTP,
spawn→SSH exec). Cache invalidation now checks computerId (switching
computers rebuilds the prompt). The compaction path also threads
computerId. @dispatch/system-prompt now depends on @dispatch/exec-backend.
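
For the known_hosts discovery in item 1, the three pure functions might look roughly like this. The function names come from the commit; the bodies, the KnownHost shape, and the choice to skip hashed entries, marker lines and wildcard host patterns are assumptions. Only `*` is treated as a wildcard here.

```typescript
// Sketch only: signatures and skip rules are assumptions, not the real code.
interface KnownHost { host: string; port: number }

function parseKnownHosts(text: string): KnownHost[] {
  const seen = new Set<string>();
  const out: KnownHost[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    // Skip blanks, comments, and @cert-authority / @revoked marker lines.
    if (line === "" || line.startsWith("#") || line.startsWith("@")) continue;
    const hostField = line.split(/\s+/)[0] ?? "";
    for (const name of hostField.split(",")) {
      // Hashed entries (|1|...) cannot be reversed; patterns are not hosts.
      if (name === "" || name.startsWith("|") || /[*?!]/.test(name)) continue;
      const bracketed = /^\[(.+)\]:(\d+)$/.exec(name);
      const host = bracketed ? bracketed[1]! : name;
      const port = bracketed ? Number(bracketed[2]) : 22;
      const key = `${host}:${port}`;
      if (seen.has(key)) continue;
      seen.add(key);
      out.push({ host, port });
    }
  }
  return out;
}

// Glob match where '*' matches any run of characters; everything else is literal.
function globMatch(pattern: string, value: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`).test(value);
}

function isRejected(host: string, rejectPatterns: string[]): boolean {
  return rejectPatterns.some((pattern) => globMatch(pattern, host));
}
```

Because the reject list is a visibility filter, a catalog builder would apply isRejected only when listing hosts, never when looking up a specific alias.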
|
|
cross-cutting verified
FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build
OK, git clean). All three handoffs GREEN with no integration gaps:
- provider-retry: yellow alert-warning bubble renders w/ countdown.
- SSH #1 wire types: defaultComputerId + Computer/ComputerEntry resolve.
- SSH #2 computer API: full src/features/computer/ feature wired + typecheck-clean.
Cross-cutting verified: provider-retry is WS-stream (TranscriptState.providerRetry
→ ChatView), computer is HTTP-only (AppStore.computerId → ComputerField sidebar) —
disjoint state/channels/regions/mount-keys; no collision. SSH support + provider-
retry integration is complete and validated end-to-end on both repos.
|
|
Brings dev's retry-with-backoff (the transient `provider-retry` AgentEvent the
web frontend consumes) + the LSP-dead-server per-edit-hang fix into the SSH
feature branch, alongside the SSH waves 0-5c.
All code files auto-merged cleanly (run-turn.ts, orchestrator.ts, runtime.ts,
wire/index.ts, tool-edit-file/extension.ts, run-turn.test.ts — both computerId
threading and retry-with-backoff coexist). Only tasks.md conflicted (status
section — orchestrator-resolved; both feature sections kept).
Verified post-merge: tsc -b EXIT 0, biome clean (391 files), 1730 vitest pass
+6 sshd-integration skipped (was 1690; +40 from dev's retry/LSP tests).
Wire dist rebuilt so the FE can re-sync the pinned @dispatch/wire dep and pick
up BOTH provider-retry AND the SSH Computer/defaultComputerId types.
No merge or push (into dev or otherwise).
|
|
|
|
The LSP diagnostics path hung for up to 60s per edit whenever a configured Ruby
language server was dead or slow (the reported Steep langserver case): a
killed/crashed server was never detected (stayed "connected" forever), servers
were queried sequentially with a 60s budget each, and a corrupted-but-alive
server (Steep's ~3h phantom-SyntaxError drift) had no recovery.
Four fixes, all in packages/lsp/ (the tool-edit-file call site lowered to 10s):
1. Dead-process detection: SpawnedProcess.onExit (Bun proc.exited) + stdout-end
defence flip the client to error, dispose the rpc, kill the proc. The manager
re-spawns a fresh server after the 30s backoff. Dead servers are now skipped
(0s) instead of polled for 60s.
2. Concurrent fan-out + 10s hard cap: new aggregateDiagnostics queries all
matching servers at once, each capped at 10s. A non-responder is skipped
with "LSP took too long (>10s), skipped — raise this to the user" instead of
blocking the fast server's results. Replaces the vague "unusually long"
warning (now structurally impossible: slow is always false).
3. Corruption self-heal: a detector flags a server re-emitting identical
non-empty diagnostics despite the file changing; after 5 repeats the client
is marked broken and re-spawned. Clean files never trip it. (Acknowledged
false-positive risk on persistent unfixed errors; CLI type-check gate stays
authoritative.)
4. sendRequest timeout: hover/definition/references cap at 10s so they can't
hang the turn against a dead server; the initialize handshake keeps its 45s
race.
Verification: typecheck clean; 1573 tests pass (96 files), +15 new LSP tests
(86 in packages/lsp); biome clean. No kernel/contract changes; onExit is
internal to packages/lsp.
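
The concurrent fan-out in fix 2 could be sketched along these lines. aggregateDiagnostics is the name used above, but ServerLike, ServerOutcome and the outcome strings are hypothetical, and the real function's signature is not shown in this log.

```typescript
// Illustrative shapes; the real client and result types are not shown here.
interface ServerLike {
  id: string;
  alive: boolean;
  diagnostics(): Promise<string[]>;
}

type ServerOutcome =
  | { id: string; status: "ok"; diagnostics: string[] }
  | { id: string; status: "skipped"; reason: "dead" | "timeout" | "error" };

async function aggregateDiagnostics(
  servers: ServerLike[],
  capMs = 10_000,
): Promise<ServerOutcome[]> {
  // All servers are queried at once; each gets its own hard cap.
  return Promise.all(
    servers.map(async (server): Promise<ServerOutcome> => {
      // Dead servers cost nothing: they are skipped without waiting.
      if (!server.alive) return { id: server.id, status: "skipped", reason: "dead" };
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<"timeout">((resolve) => {
        timer = setTimeout(() => resolve("timeout"), capMs);
      });
      try {
        const result = await Promise.race([server.diagnostics(), timeout]);
        if (result === "timeout") {
          return { id: server.id, status: "skipped", reason: "timeout" };
        }
        return { id: server.id, status: "ok", diagnostics: result };
      } catch {
        return { id: server.id, status: "skipped", reason: "error" };
      } finally {
        clearTimeout(timer);
      }
    }),
  );
}
```

In this shape the total wait is bounded by the cap rather than by the cap multiplied by the number of servers, and a fast server's results are returned even when a sibling never answers.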
|
|
This was the OLD orchestrator manual (references the retired `opencode run`
CLI + `opencode-go/mimo-v2.5-pro`, MVP-era content). The current manual lives
at root ORCHESTRATOR.md (references the `dispatch` CLI + umans/umans-glm-5.2).
Unrelated housekeeping; split from the retry feature commit.
|
|
When the upstream LLM API returns a retryable error (HTTP 429 / 5xx
"overloaded"), the kernel now retries provider.stream() with a stepped
backoff, visibly, until the 8h cumulative-sleep budget is exhausted — then
emits the final error and seals the turn. Retries fire only when no content
was emitted yet this step (safety invariant: never duplicate partial output).
- wire: new transient TurnProviderRetryEvent AgentEvent variant (emitted
before each sleep; not persisted to model history).
- kernel contracts: RetryStrategy (pure delayFor + injected sleep) + optional
retry? on RunTurnInput (omit = no retry, backward-compatible).
- kernel run-turn: retry loop in executeStep; providerRetryEvent constructor.
Kernel imports no timer (sleep injected).
- session-orchestrator: concrete schedule (5s..30m, repeat 30m, 8h budget) +
abortable setTimeout sleep, wired into RunTurnInput.retry.
tsc -b EXIT 0; biome clean; 1574 vitest pass (+16 new: 11 kernel retry tests
with injected fake sleep + pure delayFor, zero @dispatch/* mocks; 5 schedule
tests). Transports unchanged (transport-ws forwards AgentEvent verbatim in
chat.delta; transport-http is generic JSON.stringify).
Plan: notes/retry-with-backoff-plan.md. tasks.md updated with milestone +
optional CLI-renderer roadmap follow-up.
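
A rough sketch of the pure-delay plus injected-sleep split. Only the 5s start, the 30m repeat and the 8h budget come from this message; the intermediate steps, the delayFor signature and the withRetry wrapper are assumptions, and the "no content emitted yet" guard is left out for brevity.

```typescript
const SECOND = 1000;
const MINUTE = 60 * SECOND;
const HOUR = 60 * MINUTE;

// Assumption: the middle steps are invented; only 5s, 30m and 8h are sourced.
const STEPS_MS = [5 * SECOND, 15 * SECOND, 1 * MINUTE, 5 * MINUTE, 30 * MINUTE];
const BUDGET_MS = 8 * HOUR;

// Pure: next delay for this attempt, or null once the sleep budget would be exceeded.
function delayFor(attempt: number, sleptMs: number): number | null {
  const step = STEPS_MS[Math.min(attempt, STEPS_MS.length - 1)] ?? 30 * MINUTE;
  return sleptMs + step > BUDGET_MS ? null : step;
}

// The caller injects sleep, so this code never touches a timer itself.
async function withRetry<T>(
  op: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
  onRetry: (attempt: number, delayMs: number) => void,
): Promise<T> {
  let sleptMs = 0;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await op();
    } catch (err) {
      const delay = isRetryable(err) ? delayFor(attempt, sleptMs) : null;
      if (delay === null) throw err; // not retryable, or budget exhausted
      onRetry(attempt, delay); // emitted before each sleep
      await sleep(delay);
      sleptMs += delay;
    }
  }
}
```

Injecting sleep is what lets the tests pass a fake that records delays and returns immediately, so a schedule spanning hours can be checked in milliseconds.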
|
|
barrel
Wave 5c (final wiring) of transparent SSH support.
- host-bin: register exec-backend + ssh in CORE_EXTENSIONS (exec-backend before
the tool extensions that dependsOn it; ssh after, provides the remote-backend
factory + ComputerService at boot). +@dispatch/exec-backend/@dispatch/ssh deps +
tsconfig refs.
- transport-http: CR-5 — re-export computerServiceHandle + ComputerService type
from the package barrel (src/index.ts), mirroring lsp/mcp handles, so ssh imports
the typed symbol cleanly (no more dist/seam.js subpath workaround).
- orchestrator: added the @dispatch/exec-backend dep the host-bin agent missed +
bun install.
LIVE-VERIFIED: bun packages/host-bin/src/main.ts boots clean ('Dispatch booted',
no disabled extensions) — exec-backend + ssh + all tool extensions load together.
Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (+6 sshd-integration skipped).
DEFERRED (CR-6): listComputers usageCount stays 0 until a conversation-store
count-by-alias helper is added (non-blocking).
Refs: notes/ssh-support-plan.md. No merge or push.
|
|
Wave 5b of transparent SSH support. NEW standard extension @dispatch/ssh makes
remote execution actually work over SSH, transparently. ssh2 verified to run under
Bun (load-bearing decision #1 confirmed: connects to local sshd :22 + execs).
- config.ts: ~/.ssh/config reader via ssh-config -> Computer[]/ComputerEntry[]
(read-only discovery; resolves hostName/port/user/identityFile/knownHost).
- hostkey.ts: known_hosts auto-trust-and-pin (present->verify/reject-on-mismatch,
absent->accept+append; the accept-new analog).
- errors.ts: pure ssh2/SFTP -> node:fs-style .code error mapping (so tools'
existing ENOENT branches work unchanged).
- pool.ts: SshConnectionPool (per-alias ssh2.Client, lazy connect, keep-alive,
idle reap ~15m); key-only auth from ~/.ssh (config IdentityFile or default
id_ed25519/id_rsa); no agent-forwarding, no PTY.
- backend.ts: SshExecBackend implements ExecBackend (spawn via client.exec with
shell-quoted cwd; fs via SFTP).
- service.ts + extension.ts: activate provides BOTH handles the other units
consume — remoteExecBackendFactoryHandle (exec-backend: computerId->SshExecBackend)
AND computerServiceHandle (transport-http: listComputers/getComputer/getStatus/test).
- orchestrator: added packages/ssh to root tsconfig.json refs + bun install.
Tests: 45 pass + 6 sshd-integration skipped (it.skipIf(!process.env.SSH_TEST_HOST)).
Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (was 1641, +49).
CRs for wave 5c: host-bin registration; CR-5 transport-http barrel re-export;
CR-6 usageCount wiring (deferred-ok, defaults to 0).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
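
To illustrate the errors.ts idea, here is a minimal sketch of mapping an SFTP failure to a node:fs-style error. It assumes the incoming error carries the numeric SFTP status in its code property (2 = no such file, 3 = permission denied, per the SFTP protocol); toFsError, the EIO fallback and the message format are hypothetical.

```typescript
// SFTP status codes (SSH_FX_*) from the SFTP protocol definition.
const SFTP_NO_SUCH_FILE = 2;
const SFTP_PERMISSION_DENIED = 3;

interface FsStyleError extends Error {
  code: string;
}

// Pure mapping: numeric SFTP status in, node:fs-style string code out.
function toFsError(err: unknown, path: string): FsStyleError {
  const status =
    typeof err === "object" && err !== null && "code" in err
      ? (err as { code: unknown }).code
      : undefined;
  let code = "EIO"; // assumption: generic fallback for unmapped statuses
  if (status === SFTP_NO_SUCH_FILE) code = "ENOENT";
  else if (status === SFTP_PERMISSION_DENIED) code = "EACCES";
  const detail = err instanceof Error ? err.message : String(err);
  return Object.assign(new Error(`${code}: ${detail}, '${path}'`), { code });
}
```

A tool that already branches on err.code === "ENOENT" for local files would then behave the same way for a missing remote file.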
|
|
exec-backend declares remoteExecBackendFactoryHandle (a consumer-defined
ServiceHandle<(computerId) => ExecBackend>) that the ssh package will provide
(standard→core layering). The resolver's computerId-set branch now lazy-looks-up
this factory (at tool-execute time, runtime) and calls it; if ssh isn't loaded,
getService throws → a clear 'SSH remote execution is not configured' error. The
computerId-undefined (local) branch is byte-identical to before.
This is the seam wave 5b (the ssh package) plugs into. +tests for both branches.
Verified: tsc -b EXIT 0, biome clean. No merge or push.
|
|
Wave 4 of transparent SSH support (3 parallel owner-agents on disjoint packages).
- transport-http: computer routes — GET /computers, GET /computers/:alias,
GET /computers/:alias/status, POST /computers/:alias/test (all delegate to a
new ComputerService seam, graceful []/disconnected when ssh not loaded);
GET/PUT/DELETE /conversations/:id/computer; PUT /workspaces/:id/default-computer
(mirror the cwd/default-cwd routes); /chat threads computerId into the
orchestrator. Defines ComputerService interface + computerServiceHandle
(defineService<ComputerService>('ssh')) in seam.ts — the seam the ssh package
provides via host.provideService in wave 5.
- transport-ws: chat.send + chat.queue thread computerId onto the route result
(mirrors cwd/workspaceId), forwarded to the orchestrator input.
- mcp: CR-1 fix — filterMcpTools now preserves computerId on the returned
ToolAssembly (mirrors cwd preservation), so the filter chain stays consistent.
- orchestrator: added @dispatch/wire dep to transport-http (build/config, my lane)
so its seam.ts Computer/ComputerEntry import resolves.
Verified: tsc -b EXIT 0, biome clean, 1641 vitest pass (was 1620, +21).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
|
|
transport-contract API types
Wave 3 of transparent SSH support (2 parallel owner-agents on disjoint packages).
- session-orchestrator: thread computerId end-to-end through the turn, mirroring
cwd exactly — StartTurnInput/EnqueueInput/handleMessage/TurnLifecyclePayload
gain computerId; runTurnDetached resolves effectiveComputerId via
conversationStore.getEffectiveComputer(convId, override), persists the override,
threads into RunTurnInput + ToolAssembly. Register a remote-degradation
tools-filter (filterRemoteIncompatibleTools) that, when assembly.computerId is
set (REMOTE), drops the 'lsp' tool + any '__'-namespaced MCP tool (local
processes that can't see remote files); LOCAL (computerId undefined) is a
passthrough — byte-identical to today. +21 tests.
- transport-contract: + computerId on ChatRequest (flows to ChatSendMessage) +
computer endpoint API types (ComputerListResponse, ComputerResponse,
ComputerStatusResponse, SetConversationComputerRequest,
ConversationComputerResponse, SetWorkspaceDefaultComputerRequest,
TestComputerResponse) — mirrors the cwd/workspace endpoint types.
- CR-1 (non-blocking, folded into wave 4): MCP filter doesn't preserve computerId
on the returned ToolAssembly.
- cache-warming computerId threading intentionally DEFERRED (user request) —
noted as a known performance-only limitation in tasks.md.
Verified: tsc -b EXIT 0, biome clean, 1620 vitest pass (was 1599, +21).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
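
The remote-degradation filter described in the first bullet might look roughly like this. The function name and the drop rules come from the message; ToolLike and ToolAssemblyLike are simplified stand-ins for the real ToolAssembly type.

```typescript
// Simplified stand-ins for the real types, for illustration only.
interface ToolLike { name: string }
interface ToolAssemblyLike {
  tools: ToolLike[];
  cwd?: string;
  computerId?: string;
}

function filterRemoteIncompatibleTools(assembly: ToolAssemblyLike): ToolAssemblyLike {
  // LOCAL: return the input untouched so behaviour stays identical.
  if (assembly.computerId === undefined) return assembly;
  // REMOTE: drop tools backed by local processes that cannot see remote files.
  const tools = assembly.tools.filter(
    (tool) => tool.name !== "lsp" && !tool.name.includes("__"),
  );
  // Spreading keeps cwd and computerId on the result (the gap noted in CR-1).
  return { ...assembly, tools };
}
```

Returning the same object on the local path is the simplest way to keep that branch a true passthrough.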
|
|
Wave 2 of transparent SSH support (4 parallel owner-agents on disjoint
tool packages). The tools now resolve an ExecBackend per-call from
ctx.computerId and call backend.spawn / backend.readFile / etc. instead of
node:fs and node:child_process directly — so they are transport-agnostic
(local now; remote over SSH later, transparent to the agent). Still LOCAL-ONLY
this wave (computerId always undefined -> LocalExecBackend, behavior-identical).
- tool-shell: factory takes resolveBackend; execute calls backend.spawn.
spawn.ts DELETED (realSpawn was a verbatim duplicate of exec-backend's
LocalExecBackend.spawn — logic moved to the sanctioned shared package).
manifest dependsOn:[exec-backend]; host.getService at activation.
- tool-read-file: readFile/stat/readdir -> backend.* (pure logic untouched;
ENOENT .code branches kept).
- tool-write-file: exists/stat/writeFile -> backend.* (pure logic untouched).
- tool-edit-file: readFile/writeFile -> backend.* + forward-compatible REMOTE
diagnostics skip (ctx.computerId set -> skip LSP, return empty — plan §6.1;
local path byte-identical to today). LSP lookup stays lazy.
- orchestrator: pre-wired @dispatch/exec-backend dep into the 4 tool
package.jsons + bun install (build/config, my lane) so isolated verify
resolved cleanly; agents added the ../exec-backend tsconfig ref.
Verified: tsc -b EXIT 0, biome clean, 1599 vitest pass (was 1592).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
|
|
Wave 1 of transparent SSH support (parallel owner-agents on disjoint packages,
plus the orchestrator-authored kernel contract seam from wave 0):
- packages/wire: + Computer/ComputerEntry (read-only view over ~/.ssh/config
Host aliases) + Workspace.defaultComputerId (string|null, null=local). Types
only; 3 conformance tests.
- packages/exec-backend (NEW core extension): the ExecBackend abstraction
(spawn + minimal fs surface) the bundled tools will program against instead
of node:fs/child_process. LocalExecBackend wraps today's node calls
(behavior-identical; node:fs-style .code errors). execBackendHandle +
ExecBackendResolver (sync; computerId undefined -> local; set -> throws until
the ssh package wires remote resolution in wave 5). 20 tests.
- packages/kernel (runtime only): thread computerId through dispatch.ts +
run-turn.ts exactly as cwd is threaded (opaque, forwarded to
ToolExecuteContext; absent = local = byte-identical to today). +2 tests.
- packages/conversation-store: computer (SSH alias) assignment + resolution
mirroring cwd — WorkspaceRow.defaultComputerId + setWorkspaceDefaultComputerId
+ getComputerId/setComputerId/clearComputerId + getEffectiveComputer
(override -> per-conv -> workspace default -> null/local). Fixes the 3
Workspace literal sites the new required wire field broke. +18 tests.
- orchestrator: root tsconfig.json ref for exec-backend + bun install.
Verified: tsc -b EXIT 0, biome clean, 1592 vitest pass (was 1549, +43).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
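
The resolution ladder from the conversation-store bullet, reduced to a pure function. The real getEffectiveComputer reads from the store; resolveEffectiveComputer and its parameter shape are hypothetical.

```typescript
// Hypothetical pure form of the ladder: override, then per-conversation,
// then workspace default, then null (meaning local).
function resolveEffectiveComputer(
  override: string | undefined,
  perConversation: string | null,
  workspaceDefault: string | null,
): string | null {
  return override ?? perConversation ?? workspaceDefault ?? null;
}
```

Using ?? rather than || means only null and undefined fall through to the next rung.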
|
|
Add additive optional `computerId` field to ToolExecuteContext + RunTurnInput.
The kernel never interprets it (forwards verbatim to tools, like cwd) — it never
enters the model prompt (no prompt-cache impact). When omitted/undefined,
execution is LOCAL (today's behavior), so this is fully backward compatible.
This is the orchestrator-authored seam (ORCHESTRATOR.md §2a) that lets Wave 1's
producers (wire Computer types, exec-backend contract) and the consumer
(kernel runtime threading) run in parallel against a fixed type.
Refs: notes/ssh-support-plan.md (decisions resolved in §0.5/§13).
No merge or push.
|
|
The backend already supported GET /conversations?workspaceId= but the CLI
never sent it. Wire the list command to that filter:
- args.ts: parse --workspace / -w on 'list' (placed before the --catch-all
so the single-dash -w shorthand isn't taken for a positional prefix);
add workspaceId? to the list ParsedCommand.
- http.ts: add workspaceId? to FetchConversationsOpts; send ?workspaceId=
(after q/status, preserving URLSearchParams order).
- main.ts: forward parsed.workspaceId into fetchConversations; update USAGE.
Composable with --status and the <prefix> short-id arg. 'Open conversations
in workspace X' is now: dispatch list --workspace X (status defaults to
active,idle). No contract changes — purely additive CLI wiring.
Tests: +4 args (incl. composability + missing-value error), +2 http
(exact ?workspaceId= URL + combined status/workspaceId with %2C encoding).
typecheck EXIT 0, biome clean (364 files), full suite 1558 passed.
Live-verified against an isolated server.
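
As a sketch of the http.ts change, the query string can be built with URLSearchParams in the q, status, workspaceId order. conversationsUrl and ListOpts are illustrative names; the real option type is FetchConversationsOpts, whose full shape is not shown here.

```typescript
// Illustrative option shape; not the real FetchConversationsOpts.
interface ListOpts {
  q?: string;
  status?: string[];
  workspaceId?: string;
}

function conversationsUrl(base: string, opts: ListOpts): string {
  const params = new URLSearchParams();
  // Insertion order is preserved, so workspaceId always follows q and status.
  if (opts.q !== undefined) params.set("q", opts.q);
  if (opts.status !== undefined && opts.status.length > 0) {
    params.set("status", opts.status.join(","));
  }
  if (opts.workspaceId !== undefined) params.set("workspaceId", opts.workspaceId);
  const query = params.toString();
  return query === "" ? `${base}/conversations` : `${base}/conversations?${query}`;
}
```

URLSearchParams percent-encodes the comma in a joined status list, which is where the %2C in the combined test comes from.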
|
|
Resolve the last open question: take the ssh-config npm package (project-local,
alongside ssh2) for correct ~/.ssh/config parsing rather than hand-rolling.
§13 now lists all 8 decisions as resolved and marks the plan decision-complete.
Also records minor adopted defaults (config reader lives in ssh extension;
stale alias surfaced as unresolved rather than silently treated as local; default identity probing
order; assume unencrypted keys for MVP).
Planning document only; no code changed. No merge or push.
|
|
Update the SSH support plan to reflect user-confirmed decisions and a key
simplification from a new requirement:
- New §0.5 'Resolved decisions' records all 7 confirmed answers.
- Computer is now a READ-ONLY view over ~/.ssh/config (Host aliases), not a
persisted CRUD entity: no computer-store package, no create/update/delete
API. computerId IS an SSH config alias. ~/.ssh/known_hosts is the host-key
trust store (auto-trust-and-pin).
- Auth simplified to key-only from ~/.ssh (no gopass/SecretsAccess/secretRef
anywhere).
- ssh2 only (no bun-ssh2 fork); verifying under Bun is the load-bearing
Phase-3 first step.
- LSP/MCP silently dropped on remote turns (no system-prompt note);
edit_file works with no diagnostics on remote.
- computerId persisted per-conversation (like cwd).
- Updated data model (§3), connection mgmt (§4), security (§7), edge cases
(§8), API surface (§9 read-only), frontend (§10), packages table (§11,
no computer-store), phases (§12), and resolved open questions (§13).
Planning document only; no code changed. No merge or push.
|
|
Investigation of whether the backend supports listing open conversations
filtered by a specific worktree/workspace.
Findings:
- 'worktree' is not a Dispatch domain concept; canonical term is 'workspace'
(logical grouping) vs 'working directory' (cwd, filesystem path).
- GET /conversations already supports composable ?workspaceId=, ?status=, ?q=
filters. 'Open conversations in workspace X' = ?workspaceId=X&status=active,idle.
- Every conversation carries a workspaceId (default 'default'); ConversationMeta
is in @dispatch/wire; filter lives in conversation-store listConversations.
- A literal directory (git worktree) filter (?cwd=) is NOT supported; §3b
documents the small additive change needed across wire/store/transport-http.
- Test coverage verified: store-workspace.test.ts:369, store.test.ts:1463,
app.test.ts:3696.
Research notes only — no code/contract changes.
|
|
Research and plan transparent SSH execution so an agent runs commands on a
remote computer as if local — the agent never learns it is using SSH.
Covers:
- How the cwd → ToolExecuteContext pipeline works today and where a
computerId threads in (mirroring cwd end-to-end)
- The ExecBackend abstraction (spawn + fs) behind which tool-shell/
read-file/write-file/edit-file are refactored, with LocalExecBackend
(node) and SshExecBackend (ssh2) implementations
- Computer data model + workspace defaultComputerId + per-conversation
override, mirroring the getEffectiveCwd resolution ladder (null = local)
- SSH connection pooling (one per computer, lazy connect, keep-alive, idle
reaping), auth via SecretsAccess/gopass, host-key verification
- Turn loop / dispatch integration (additive optional computerId field,
backward-compatible — absent = today's local behavior)
- LSP/MCP degrade by dropping those tools on remote turns (future: remote
server spawn over SSH)
- API surface (computer CRUD, per-conv + workspace-default endpoints,
chat.send gains computerId), frontend impact
- Security, edge cases, phased implementation, contract gaps reported to
unit owners (one-owner-per-unit honored — planner does not edit others)
No code changed; planning document only. No merge or push.
|
|
Add the same --file <path> support that the summon (chat) command has to the
'dispatch send' subcommand. When --file is given, the file's contents are read
and attached to the message (composed via composeMessage, identical to chat).
- args.ts: add 'file' to the send ParsedCommand, make 'text' optional, parse
--file, and require at least one of --text or --file.
- main.ts: read the file and compose the message in the send case, using the
composed message in both the --queue and streaming branches; update USAGE.
- args.test.ts: cover --file parsing (alone, with --text, missing value) and
update the existing send expectations + the both-missing error message.
|
|
bin/up ran `bun --watch main.ts` without setting BACKEND_PORT, so a
shell-exported BACKEND_PORT (e.g. 24991 in ~/.bashrc, set so the Dispatch
CLI hits the prod server) overrode .env's dev value 24203 — Bun lets shell
env win over .env — binding the dev server onto the production port and
colliding with the active dispatch-server systemd service. transport-http
then failed to activate (Bun.serve "port in use"), so the HTTP server
never came up and the frontend got "Failed to fetch".
Force BACKEND_PORT=24203 + SURFACE_WS_PORT=24205 in the setsid invocation
so the dev stack is deterministic regardless of the shell environment.
|
|
Mirrors the existing GET /conversations/:id/lsp route exactly: gates on the
persisted then effective cwd (null → empty servers), returns 503 when the
MCP service isn't loaded, and maps McpServerStatus → McpServerInfo
(conditionally including `error` per exactOptionalPropertyTypes).
Wires mcpService into CreateServerOptions + extension activate via a plain
host.getService (mirroring lspService; "mcp" added to dependsOn, route added
to contributes.routes), adds the @dispatch/mcp workspace dep, and re-exports
mcpServiceHandle / McpService / McpServerStatus from seam.ts. Adds 4 tests
mirroring the LSP status tests.
|
|
Two bugs caused the dispatch server to crash (15 times since Jun 24)
when chat cc6c edited packages/transport-http/src/app.ts — a 40KB file
with 23 multi-byte UTF-8 lines. The edit_file diagnostics hook sends the
file to tsserver, which sends back a large publishDiagnostics response.
When the response was split across stdout chunks at a multi-byte
character boundary, the server crashed.
Layer 1 — rpc.ts handleMessage: JSON.parse had no try/catch. A corrupted
message threw an unhandled SyntaxError → unhandled rejection → process
exit. Wrapped in try/catch; malformed messages are now skipped.
Also hardened client.ts handleBytes: the async handleMessage Promise was
fire-and-forget. Added .catch(() => {}) as defence-in-depth so no
rejection from the RPC layer can ever crash the server.
Layer 2 — framing.ts FrameDecoder: used a string buffer with
new TextDecoder().decode(chunk) (no { stream: true }), corrupting
multi-byte characters split across chunks. Worse, Content-Length counts
bytes but the buffer was sliced by character count — for multi-byte
content byte length ≠ char length, so the decoder extracted the wrong
slice as a message. Rewrote to use a Uint8Array byte buffer: header
separator search is byte-level, Content-Length comparison is byte-level,
and the body is decoded only after all bytes are confirmed present.
Tests: 5 new multi-byte framing tests (split at char boundary,
byte-vs-char Content-Length, two messages in one chunk, three-way split)
+ 1 rpc test (malformed JSON does not throw). All 1545 tests pass.
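
A minimal sketch of the byte-buffer approach in Layer 2, assuming only Content-Length framing with a CRLF CRLF separator. The class name matches the commit, but the body is an illustration and the push method name is an assumption.

```typescript
// Find the first CRLF CRLF in the byte buffer, or -1 if it is not there yet.
function indexOfSeparator(bytes: Uint8Array): number {
  for (let i = 0; i + 3 < bytes.length; i += 1) {
    if (bytes[i] === 13 && bytes[i + 1] === 10 && bytes[i + 2] === 13 && bytes[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

class FrameDecoder {
  private buffer: Uint8Array = new Uint8Array(0);
  private readonly decoder = new TextDecoder();

  // Append a chunk and return every message body that is now complete.
  push(chunk: Uint8Array): string[] {
    const merged = new Uint8Array(this.buffer.length + chunk.length);
    merged.set(this.buffer, 0);
    merged.set(chunk, this.buffer.length);
    this.buffer = merged;

    const messages: string[] = [];
    while (true) {
      const headerEnd = indexOfSeparator(this.buffer);
      if (headerEnd === -1) break; // header not complete yet
      const bodyStart = headerEnd + 4;
      const header = this.decoder.decode(this.buffer.subarray(0, headerEnd));
      const match = /content-length:\s*(\d+)/i.exec(header);
      if (match === null) {
        this.buffer = this.buffer.slice(bodyStart); // drop an unusable header
        continue;
      }
      const length = Number(match[1]); // a byte count, compared against bytes
      if (this.buffer.length < bodyStart + length) break; // body incomplete
      // Decode only once every byte of the body is present.
      messages.push(this.decoder.decode(this.buffer.subarray(bodyStart, bodyStart + length)));
      this.buffer = this.buffer.slice(bodyStart + length);
    }
    return messages;
  }
}
```

Since nothing is decoded until the full body has arrived, a chunk boundary inside a multi-byte character no longer matters.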
|
|
Additive types for GET /conversations/:id/mcp status endpoint, mirroring the
existing LSP status types. McpServerState, McpServerInfo, McpStatusResponse.
+2 type-test assertions. Version bump 0.21.0 → 0.22.0.
Handoff written: frontend-mcp-status-handoff.md (backend route + FE consumption).
|
|
- MCP live-verified: test MCP server → tool discovery (test__ping) → tool call
→ pong result. Full turn lifecycle confirmed on production server.
- Per-edit diagnostics live-verified: type error in .ts file surfaces
[TypeScript Language Server] ERROR (2322) inline after edit.
- edit_file bug found + fixed during live-verify (lazy LSP lookup).
|
|
The previous fix (e03a96e) wrapped getService in try/catch to prevent the
activation crash, but that wasn't enough: tool-edit-file activates at position
5 in CORE_EXTENSIONS while lsp activates at position 20. So getService ALWAYS
threw at activation time, lspService was ALWAYS undefined, and the diagnostics
hook was NEVER wired — edits succeeded but never showed LSP feedback.
Fix: make the LSP service lookup LAZY — defer it to edit time (when the tool is
actually called), not activation time. By then all extensions have activated.
The diagnostics function tries getService on each edit call; if LSP isn't
loaded, it returns a no-op (graceful degradation).
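
The lazy lookup can be sketched as a closure that resolves the service on every call instead of once at activation. HostLike and LspLike are simplified stand-ins: the real host takes a typed handle such as lspServiceHandle rather than a string key, and the empty-string result for "no diagnostics" is an assumption.

```typescript
// Simplified stand-ins; the real host API uses typed service handles.
interface LspLike {
  getDiagnostics(path: string, text: string): Promise<string>;
}
interface HostLike {
  getService(name: string): LspLike; // throws when nothing provides the service
}

type Diagnose = (path: string, text: string) => Promise<string>;

// Safe to call during activation: nothing is looked up until an edit happens.
function makeLazyDiagnostics(host: HostLike): Diagnose {
  return async (path, text) => {
    let lsp: LspLike;
    try {
      lsp = host.getService("lsp"); // resolved at edit time, after all activations
    } catch {
      return ""; // LSP not loaded: degrade to no diagnostics
    }
    return lsp.getDiagnostics(path, text);
  };
}
```

Activation order stops mattering with this shape, because the lookup happens long after every extension has registered its services.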
|
|
The per-edit diagnostics change (8f6114b) called host.getService(lspServiceHandle)
during activate(). But getService THROWS when a service has no provider — so if
the LSP extension activates AFTER tool-edit-file (or isn't loaded at all), the
activate() function crashes and the edit_file tool is NEVER REGISTERED. This is
why the edit_file tool was missing from the agent toolset.
Fix: wrap getService in try/catch — if the LSP service isn't available yet,
lspService becomes undefined and edits proceed without diagnostics (the graceful
degradation the comment always promised but the code didn't deliver).
|
|
New `mcp` standard extension (`packages/mcp/`) that makes Dispatch an MCP
host: spawns configured MCP servers (stdio child processes), performs the MCP
handshake (initialize → notifications/initialized), discovers tools via
tools/list, and registers each as a first-class Dispatch ToolContract via
host.defineTool. When the model calls an MCP tool, the extension proxies the
call to tools/call on the MCP server and returns the flattened result.
Architecture (sibling of `lsp` extension):
- Config: .dispatch/mcp.json (servers key) → opencode.json mcp key fallback,
resolved per-cwd (mirrors LSP config resolution)
- Transport: StdioTransport (spawn child, Content-Length framing + JSON-RPC 2.0)
- Client: initialize → tools/list → tools/call; handles list_changed
notifications for dynamic tool updates
- Registry: tool name namespacing (<serverId>__<toolName>), ToolContract
adapter that proxies execute → callTool, content flattening (text/image/
resource → string)
- Manager: one client per server, lazy-spawn, status(), shutdownAll()
- Extension: manifest (dependsOn session-orchestrator, capabilities spawn),
registers tools + a toolsFilter (drops disconnected server's tools),
mcpServiceHandle, deactivate kills all child processes
Phase 1 scope: stdio only, Tools only (no Resources/Prompts/HTTP/sampling).
Hand-rolled JSON-RPC + framing (zero external deps, adapts LSP patterns).
Wave 1 (agent): 12 source + 8 test files, 69 new tests.
Wave 2 (orchestrator): root tsconfig ref, host-bin CORE_EXTENSIONS
registration + package.json dep, bun install.
Verified: tsc -b EXIT 0, biome clean, 1537 vitest pass (was 1468, +69).
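
The registry's naming and flattening rules might be sketched as follows. The <serverId>__<toolName> scheme is from the message; the helper names, the split-at-first-separator rule and the placeholder strings used for image and resource content are assumptions.

```typescript
const SEPARATOR = "__";

function namespacedToolName(serverId: string, toolName: string): string {
  return `${serverId}${SEPARATOR}${toolName}`;
}

// Assumption: split at the first separator, so server ids must not contain "__".
function splitToolName(name: string): { serverId: string; toolName: string } | null {
  const at = name.indexOf(SEPARATOR);
  if (at <= 0) return null;
  return { serverId: name.slice(0, at), toolName: name.slice(at + SEPARATOR.length) };
}

// Content item shapes as used in MCP tool results.
type McpContent =
  | { type: "text"; text: string }
  | { type: "image"; mimeType: string; data: string }
  | { type: "resource"; resource: { uri: string; text?: string } };

// Flatten a tool result to one string; the placeholder formats are invented.
function flattenContent(items: McpContent[]): string {
  return items
    .map((item) => {
      switch (item.type) {
        case "text":
          return item.text;
        case "image":
          return `[image ${item.mimeType}, ${item.data.length} base64 chars]`;
        case "resource":
          return item.resource.text ?? `[resource ${item.resource.uri}]`;
      }
    })
    .join("\n");
}
```

The double underscore also gives other code a cheap way to recognise MCP tools by name alone.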
|
|
|
|
- notes/mcp-design.md: full design — architecture fit (sibling of lsp ext),
per-cwd config (.dispatch/mcp.json + opencode.json mcp key), tool name
namespacing (<serverId>__<toolName>), ToolContract adapter, content
flattening, security, glossary additions, 6 open design decisions
- PLAN-mcp.md: wave breakdown (Wave 0 contracts/wiring, Wave 1 the mcp
extension, Wave 2 host-bin registration, Wave 3 live verification)
- Phase 1 scope: stdio only, Tools only, no surface, hand-rolled JSON-RPC
- No kernel contract change needed (existing ToolContract + defineTool +
toolsFilter are sufficient)
|
|
stale HANDOFF.md
- tasks.md: record per-edit LSP diagnostics auto-append milestone (commit
8f6114b), fix test count 1453→1468
- HANDOFF.md: retire stale post-MVP handoff (referenced arch-rewrite path,
178 tests, next-steps all done) → current accurate pointer file
|
|
LSP extension:
- Multi-server aggregation: query ALL connected servers matching the
file's extension (not just the first), merge diagnostics tagged by source
- Incremental sync: capture each server's textDocumentSync.change during
initialize; compute prefix/suffix diff ranges for change:2 servers;
full content for change:1 (generic, works for any LSP)
- New diff.ts: pure computeChangeRange + offsetToPosition (O(n), tested)
- Buffer sync: change(filePath, newText) sends didChange with post-edit
in-memory content; openWithText for first open; tracks open doc text
- languageId mapping: extended with .rb/.rbs/.c/.cpp/etc. (was 'unknown')
- waitForDiagnostics: accepts text override + timeoutMs; returns
{ formatted, slow, timedOut }; polls for publishDiagnostics push
- DiagnosticsStore: hasReceivedPush/clearReceived tracking; formatFiltered
with minSeverity (1=Error, 2=Warning) for edit_file integration
- LspService.getDiagnostics: service method for cross-extension use
tool-edit-file:
- After successful edit, calls LSP getDiagnostics with post-edit buffer
- Only appends diagnostics with severity ≤ 2 (errors+warnings, no noise)
- Appends slow warning (>10s): 'LSP is taking unusually long...'
- 60s timeout; graceful degradation when no LSP available
- Optional dep on @dispatch/lsp (getService pattern, not manifest depOn)
1468 vitest pass (was 1453, +15 new diff tests).
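
The prefix/suffix diff used for incremental sync can be sketched as follows. The function names match those mentioned above, but the signatures and return shapes are assumptions, not the contents of diff.ts. Offsets are UTF-16 code units, which is the LSP default position encoding.

```typescript
interface Position { line: number; character: number }

// Assumed shape: [start, end) is the replaced span in the OLD text.
interface ChangeRange { start: number; end: number; text: string }

// O(n): skip the common prefix, then the common suffix of what remains.
function computeChangeRange(oldText: string, newText: string): ChangeRange {
  const maxPrefix = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < maxPrefix && oldText[prefix] === newText[prefix]) prefix++;

  // The suffix must not overlap the prefix in either string.
  const maxSuffix = maxPrefix - prefix;
  let suffix = 0;
  while (
    suffix < maxSuffix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }

  return {
    start: prefix,
    end: oldText.length - suffix,
    text: newText.slice(prefix, newText.length - suffix),
  };
}

// Convert a character offset into a zero-based line/character position.
function offsetToPosition(text: string, offset: number): Position {
  const limit = Math.min(offset, text.length);
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < limit; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: limit - lineStart };
}
```

A change:2 server would receive the range built from offsetToPosition(oldText, start) and offsetToPosition(oldText, end) together with text; a change:1 server receives the full new content instead.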
|
|
|
|
- Web frontend section links to github.com/realtradam/dispatch-web
- Claude extension links to github.com/realtradam/dispatch-adapter-claude
- Workspace layout uses generic clone instructions, not /home/tradam paths
- Dev stack bin/up instructions show clone-both-repos-as-siblings
|
|
- Full HTTP API table (30+ endpoints: conversations, workspaces, LSP,
system-prompt, metrics, queue, cache-warming, compaction, etc.)
- Complete CLI commands (list, read, send, stop, compact, open + flags)
- All 37 packages documented with tiers and dependencies
- systemd deploy section (bin/build, bin/install, bin/sync-env)
- Dev stacks table (bin/up 24203, bin/up2 25203)
- Workspace layout (dispatch-backend, dispatch-web, claude, bin)
- Updated web frontend section (Slice 2 browser chat in progress)
- Links to all design docs
|
|
After consolidating to the dev branch and renaming the worktree,
update all path references in ORCHESTRATOR.md and .skills/ORCHESTRATOR.md.
|
|
Mark all 5 live-verify checkboxes as done (reasoning effort, todo tool,
CLI cross-client, abort-race, system-prompt builder). Slim the roadmap
from 11 items down to 3 open items (web frontend, close-with-queued-messages
product decision, FE crash-recovery status endpoint) by dropping the 8
completed/verified items.
|
|
conversation
kernel: executeToolCall now races tool.execute against the abort signal
via Promise.race; on abort resolves (not rejects) with an "Aborted" result
so the step completes normally → finishReason "aborted" → turn seals
cleanly (done event) → finally clears activeTurns → conversation freed,
next message accepted. run-turn strips tool-call chunks from the assistant
message on abort (keeps text/thinking) and omits tool-result messages to
avoid persisting dangling tool calls that would 400 the provider next turn.
tool-shell: realSpawn spawns detached (own process group); on abort AND
timeout kills the entire group (process.kill(-pgid, SIGKILL)) and resolves
immediately — no child.on("close") dependency, so a grandchild holding the
pipes can't stall the spawn promise or leak.
Also: ORCHESTRATOR.md migrated to dispatch CLI summon mechanism; .skills
summary; bin/sync-env PATH injection; frontend handoff docs.
1453 vitest pass · tsc -b EXIT 0 · biome clean.
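
A minimal sketch of racing a tool against the abort signal so that an abort resolves with an "Aborted" result rather than rejecting. The ToolResult shape and the helper name raceAgainstAbort are assumptions; the kernel's executeToolCall is not reproduced here.

```typescript
// Assumed result shape, for illustration only.
type ToolResult = { content: string; isError: boolean };

function raceAgainstAbort(
  execute: () => Promise<ToolResult>,
  signal: AbortSignal,
): Promise<ToolResult> {
  const aborted: ToolResult = { content: "Aborted", isError: true };

  // Already aborted: do not start the tool at all.
  if (signal.aborted) return Promise.resolve(aborted);

  let onAbort: () => void = () => {};
  const abortPromise = new Promise<ToolResult>((resolve) => {
    // Resolve (not reject) so the step completes through the normal path.
    onAbort = () => resolve(aborted);
    signal.addEventListener("abort", onAbort, { once: true });
  });

  return Promise.race([execute(), abortPromise]).finally(() => {
    // Remove the listener when the tool wins the race.
    signal.removeEventListener("abort", onAbort);
  });
}
```

Resolving on abort means callers need no separate catch path for cancellation; the tool's own promise may still settle later and is simply ignored, so any real cleanup (such as killing a process group) has to happen in the tool itself.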
|
|
reconcile() only repaired orphaned tool-calls. Two other broken states made
chats uncontinuable, and load() had no parse-error guard:
- A trailing assistant message whose only chunk is 'error' (a failed-
generation marker) serializes to empty content -> provider rejects/empty
-> chat never continues. 6 of 140 production conversations were stuck.
- A tool-call whose input is a raw malformed-JSON string (model emitted
broken JSON) re-sent as OpenAI arguments -> provider 400s on every
continuation (the 77574596 break).
- load() JSON.parse had no try/catch -> one corrupt row bricked the chat.
Fix = read-time repair (no DB surgery; append-only preserved). reconcile
runs on every load() BEFORE any provider sees messages, so Layer 1
protects ALL providers.
Layer 1 (conversation-store reconcile): strip error chunks from assistant
messages + drop the now-empty error-only messages (safe: never followed by
a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport
+2 additive counts. loadSince (FE reads) intentionally unreconciled so the
user still SEES the error. load() wraps JSON.parse in try/catch (skip
corrupt rows).
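
The read-time repair in Layer 1 can be sketched as below. The Chunk and Message shapes, the report field names, and the helper names parseRows and stripErrorChunks are assumptions made for illustration; the real reconcile also synthesizes results for orphaned tool-calls, which is omitted here.

```typescript
// Assumed, simplified shapes; the real store types may differ.
type Chunk =
  | { type: "text"; text: string }
  | { type: "error"; message: string };

interface Message { role: "user" | "assistant" | "tool"; chunks: Chunk[] }

interface RepairReport { errorChunksStripped: number; emptyMessagesDropped: number }

// Parse stored rows, skipping any row that is not valid JSON.
function parseRows(rows: string[]): Message[] {
  const messages: Message[] = [];
  for (const row of rows) {
    try {
      messages.push(JSON.parse(row) as Message);
    } catch {
      // Corrupt row: skip it instead of failing the whole load.
    }
  }
  return messages;
}

// Strip error chunks from assistant messages and drop messages left empty.
function stripErrorChunks(
  messages: Message[],
): { messages: Message[]; report: RepairReport } {
  const report: RepairReport = { errorChunksStripped: 0, emptyMessagesDropped: 0 };
  const out: Message[] = [];
  for (const msg of messages) {
    if (msg.role !== "assistant") {
      out.push(msg);
      continue;
    }
    const kept = msg.chunks.filter((c) => c.type !== "error");
    const stripped = msg.chunks.length - kept.length;
    if (stripped === 0) {
      out.push(msg);
      continue;
    }
    report.errorChunksStripped += stripped;
    if (kept.length === 0) {
      report.emptyMessagesDropped++;
      continue;
    }
    out.push({ ...msg, chunks: kept });
  }
  return { messages: out, report };
}
```

Because the repair returns new arrays and never writes back, the stored rows stay untouched and a read path that wants the raw history can simply skip the second step.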
Layer 2 (openai-stream): serializeToolArguments ensures the tool-call
arguments field is always valid JSON (malformed string -> fallback object),
neutralizing already-stored malformed args.
Layer 2 equiv (../claude provider-anthropic): safeJson returns a valid
object fallback on parse failure, not the raw string. (Separate repo.)
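
A sketch of the Layer 2 idea: whatever was stored, the string sent as tool-call arguments always parses as JSON. The function name toValidArgumentsJson and the shape of the fallback object are hypothetical; the source only states that a malformed string becomes a fallback object.

```typescript
// Hypothetical helper: always returns a string that JSON.parse accepts.
function toValidArgumentsJson(input: unknown): string {
  if (typeof input === "string") {
    try {
      const parsed: unknown = JSON.parse(input);
      // Only pass through strings that decode to a JSON object.
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        return JSON.stringify(parsed);
      }
    } catch {
      // Malformed JSON: fall through to the fallback below.
    }
    // Assumed fallback shape: keep the raw text under an illustrative key.
    return JSON.stringify({ _malformed: input });
  }
  if (input !== null && typeof input === "object") {
    return JSON.stringify(input);
  }
  // Anything else (undefined, numbers, ...) becomes an empty object.
  return "{}";
}
```

Keeping the raw text inside the fallback object lets the model still see what it emitted, while the provider only ever receives well-formed JSON.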
Live-verified: reproduced 77574596's real broken tail in the dev DB;
POST /chat continued it cleanly (no 400, model replied) — the provider
accepted the reconciled history.
tsc -b EXIT 0, biome clean, 1453 vitest pass.
|
|
shadow warning
All five live-verify checks passed against the dev stack (bin/up :24203):
configSource reaches the wire (built-in TS, 'built-in'); broken server reports
error + configSource + source-named error; recovery without restart (blocker,
error->connected after config fix); no retry storm; shadow warning logged via
host.logger when both configs declare lsp.
|
|
Two issues found by decompiling the running dispatch-server binary
(handoff from a ruby-lsp setup in raylib-jamstack):
Issue 2 (blocker): a failed LSP server was "broken" FOREVER — the
manager's broken set was cleared only in shutdownAll(), so a server
that failed (bad env, missing binary, or a since-fixed config) stayed
state:"error" for the whole process. For an agent running *inside*
dispatch the only recovery (server restart) kills its own session.
Now a broken server self-heals once its resolved config has changed since
it was marked broken (a discrete event, so no retry storm), with a bounded
backoff for transient failures.
Issue 1: .dispatch/lsp.json silently shadowed opencode.json's lsp key
with no warning and no source attribution. Now: shadow warning via
host.logger when both declare lsp; configSource populated on status
(.dispatch/lsp.json / opencode.json / built-in); spawn-failure error
strings name the config source.
Contract: additive configSource?: string on LspServerInfo
(@dispatch/transport-contract 0.20.0→0.21.0). transport-http passes it
through to the wire (was a field-by-field map that dropped it — CR
resolved by the transport-http owner).
tsc -b EXIT 0, biome clean, 1443 vitest pass.
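
The self-heal rule can be sketched as a small tracker. Everything here is an assumption for illustration: the class name, comparing configs via JSON.stringify, and reading "bounded backoff" as a capped exponential delay. The manager's actual logic is not shown in this log.

```typescript
interface BrokenEntry { configKey: string; failures: number; retryAt: number }

// Hypothetical tracker for servers that failed to start.
class BrokenServerTracker {
  private readonly broken = new Map<string, BrokenEntry>();
  private readonly baseDelayMs: number;
  private readonly maxDelayMs: number;

  constructor(baseDelayMs = 1000, maxDelayMs = 60000) {
    this.baseDelayMs = baseDelayMs;
    this.maxDelayMs = maxDelayMs;
  }

  // Record a failure for the config that was in effect when it failed.
  markBroken(serverId: string, resolvedConfig: unknown, now: number): void {
    // Assumption: stringified config is a good-enough change detector.
    const configKey = JSON.stringify(resolvedConfig);
    const prev = this.broken.get(serverId);
    const failures = prev && prev.configKey === configKey ? prev.failures + 1 : 1;
    const delay = Math.min(this.baseDelayMs * 2 ** (failures - 1), this.maxDelayMs);
    this.broken.set(serverId, { configKey, failures, retryAt: now + delay });
  }

  markHealthy(serverId: string): void {
    this.broken.delete(serverId);
  }

  // Retry immediately if the config changed; otherwise wait out the backoff.
  shouldAttempt(serverId: string, resolvedConfig: unknown, now: number): boolean {
    const entry = this.broken.get(serverId);
    if (!entry) return true;
    if (entry.configKey !== JSON.stringify(resolvedConfig)) return true;
    return now >= entry.retryAt;
  }
}
```

A config edit is a discrete event, so it triggers exactly one fresh attempt; repeated failures under an unchanged config are spaced further apart up to the cap.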
|
|
|
|
A chat's selected provider + model is now persisted per conversation (like cwd
and reasoningEffort). Opening a conversation in a new browser recalls the
originally selected model instead of defaulting.
- transport-contract 0.19.0→0.20.0: ModelResponse + SetModelRequest types
for GET/PUT /conversations/:id/model.
- conversation-store: getModel/setModel (model:<id> key, mirrors
getReasoningEffort/setReasoningEffort); forkHistory copies model; empty
string clears.
- session-orchestrator: resolve model from persisted store when no per-turn
override; persist the resolved model so it sticks; warm path parity.
- transport-http: GET/PUT /conversations/:id/model endpoints with validation.
1433 vitest pass; tsc + biome clean.
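
A sketch of the per-conversation model persistence described above, using an in-memory map in place of the real store. The class name, the resolveModel helper, and the default-model parameter are assumptions; only the model:<id> key, getModel/setModel, and the empty-string-clears rule come from the notes above.

```typescript
// In-memory stand-in for the conversation store's key/value layer.
class InMemoryModelStore {
  private readonly kv = new Map<string, string>();

  getModel(conversationId: string): string | undefined {
    return this.kv.get(`model:${conversationId}`);
  }

  // An empty string clears the persisted model.
  setModel(conversationId: string, model: string): void {
    const key = `model:${conversationId}`;
    if (model === "") {
      this.kv.delete(key);
    } else {
      this.kv.set(key, model);
    }
  }
}

// Hypothetical resolution order: per-turn override, then persisted, then default.
function resolveModel(
  store: InMemoryModelStore,
  conversationId: string,
  defaultModel: string,
  perTurnOverride?: string,
): string {
  const resolved = perTurnOverride || store.getModel(conversationId) || defaultModel;
  // Persist the resolved model so later turns and other clients see it.
  store.setModel(conversationId, resolved);
  return resolved;
}
```

Persisting the resolved value, not only explicit selections, is what lets a conversation opened in a new browser recall the same model even if the default changes later.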
|
|
|