| Age | Commit message | Author |
|
Agents were being cut off mid-task at 50 steps. The MAX_STEPS=50
hardcoded limit was silently terminating turns while the model was
actively making tool calls, leaving conversations idle with a
dangling tool-result as the last chunk.
Setting MAX_STEPS to 0 disables the limit — the loop runs until the
model stops making tool calls naturally or the abort signal fires.
The max-steps code path is preserved for when MAX_STEPS > 0.
|
|
|
|
The test endpoint's runProbe() waited for the ssh2 stream's 'close' event,
which some SSH servers never emit for short-lived exec channels (the command
'true' exits instantly). This caused the promise to hang forever — the HTTP
response never returned, and the FE's Test spinner spun indefinitely.
Three fixes:
1. runProbe now resolves on the 'exit' event (not 'close') — the command has
finished and the exit code is available. 'close' is kept as a fallback.
Stream data/stderr are drained to prevent buffer deadlocks.
2. runProbe has a 15s timeout safety net — if the exec callback or 'exit'
event never fires (e.g. server requires a pty for exec), the probe
resolves false instead of hanging forever.
3. The entire test() method is wrapped in a 30s Promise.race timeout —
even if pool.acquire() or pool.drop() hangs, the endpoint ALWAYS
responds with { ok, error? }.
The probe is fully non-interactive (no blocking prompts). tsc EXIT 0,
biome clean, 1756 tests pass.
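
A minimal sketch of the timeout safety net described in fixes 2 and 3, assuming a generic helper built on Promise.race. The name withTimeout and its signature are hypothetical and are not taken from packages/ssh; the real code may structure this differently.

```typescript
// Hypothetical helper: settle with `fallback` if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Clear the timer once either side wins so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Assumed response shape, modelled on the { ok, error? } mentioned above.
type TestResult = { ok: boolean; error?: string };

// Usage idea: a probe that never settles still produces a response.
function testWithDeadline(probe: Promise<TestResult>, ms: number): Promise<TestResult> {
  return withTimeout(probe, ms, { ok: false, error: `timed out after ${ms}ms` });
}
```

Resolving with a fallback value (rather than rejecting) matches the stated goal that the endpoint always responds; a rejecting variant would work equally well if the caller maps the rejection to { ok: false }.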
|
|
Two improvements to the SSH support feature:
1. KNOWN_HOSTS DISCOVERY (packages/ssh):
Computers are now auto-discovered from ~/.ssh/known_hosts (every hostname
you've ever connected to) in ADDITION to ~/.ssh/config (explicit Host
aliases). Config entries take precedence (full params); known_hosts entries
get defaulted params (User=defaultUser, IdentityFile=null→pool probes
default keys, Port from [host]:port or 22, knownHost=true). Zero-config —
no ~/.ssh/config file needed; hosts just appear.
Reject list: dispatch.toml [ssh].reject = [...] (glob patterns like
github.com, *.ts.net) filters noise from the catalog. Read from both
the global ~/.config/dispatch/dispatch.toml and the project dispatch.toml.
Parsed with Bun.TOML.parse (zero deps). Only filters discovery (catalog);
specific lookups (getComputer/getStatus/test/connect) ignore the reject
list (it's a visibility filter, not access control).
New pure functions: parseKnownHosts(), isRejected(), globMatch().
+26 tests. tsc EXIT 0, biome clean, 1756 tests pass.
2. REMOTE SYSTEM-PROMPT AWARENESS (packages/system-prompt):
When a conversation has a computerId set (remote turn), the system prompt
now resolves system:os, system:hostname, git:branch/git:status, and
file: reads against the REMOTE machine — not the local host. Previously
the prompt always said 'Arch Linux (WSL)' + local hostname even when the
agent was connected to a remote Artix Linux machine.
The ResolverAdapters' hostname()/platform() are now async (so a remote
adapter can run 'hostname'/'uname -s' over SSH). The system-prompt
extension builds remote adapters from the ExecBackend (readFile→SFTP,
spawn→SSH exec). Cache invalidation now checks computerId (switching
computers rebuilds the prompt). The compaction path also threads
computerId. @dispatch/system-prompt now depends on @dispatch/exec-backend.
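
The reject-list filter from item 1 could look roughly like the sketch below. The function names globMatch and isRejected come from the commit message, but the bodies are assumptions: only '*' is treated as a wildcard, and matching is case-sensitive. The actual implementation may support more glob syntax.

```typescript
// Assumed semantics: '*' matches any run of characters, everything else is literal.
function globMatch(pattern: string, host: string): boolean {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts between wildcards.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(host);
}

// A host is hidden from the catalog when any reject pattern matches it.
function isRejected(host: string, reject: readonly string[]): boolean {
  return reject.some((pattern) => globMatch(pattern, host));
}

// Discovery-only filter: specific lookups would bypass this, as described above.
function visibleHosts(hosts: readonly string[], reject: readonly string[]): string[] {
  return hosts.filter((host) => !isRejected(host, reject));
}
```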
|
|
cross-cutting verified
FE confirmed whole-tree green (typecheck 0/0, 795/795 tests, biome clean, build
OK, git clean). All three handoffs GREEN with no integration gaps:
- provider-retry: yellow alert-warning bubble renders w/ countdown.
- SSH #1 wire types: defaultComputerId + Computer/ComputerEntry resolve.
- SSH #2 computer API: full src/features/computer/ feature wired + typecheck-clean.
Cross-cutting verified: provider-retry is WS-stream (TranscriptState.providerRetry
→ ChatView), computer is HTTP-only (AppStore.computerId → ComputerField sidebar) —
disjoint state/channels/regions/mount-keys; no collision. SSH support + provider-
retry integration is complete and validated end-to-end on both repos.
|
|
Brings dev's retry-with-backoff (the transient `provider-retry` AgentEvent the
web frontend consumes) + the LSP-dead-server per-edit-hang fix into the SSH
feature branch, alongside the SSH waves 0-5c.
All code files auto-merged cleanly (run-turn.ts, orchestrator.ts, runtime.ts,
wire/index.ts, tool-edit-file/extension.ts, run-turn.test.ts — both computerId
threading and retry-with-backoff coexist). Only tasks.md conflicted (status
section — orchestrator-resolved; both feature sections kept).
Verified post-merge: tsc -b EXIT 0, biome clean (391 files), 1730 vitest pass
+6 sshd-integration skipped (was 1690; +40 from dev's retry/LSP tests).
Wire dist rebuilt so the FE can re-sync the pinned @dispatch/wire dep and pick
up BOTH provider-retry AND the SSH Computer/defaultComputerId types.
No merge or push (into dev or otherwise).
|
|
|
|
The LSP diagnostics path hung up to 60s per edit whenever a configured Ruby
language server was dead or slow (the reported Steep langserver case): a
killed/crashed server was never detected (stayed "connected" forever), servers
were queried sequentially with a 60s budget each, and a corrupted-but-alive
server (Steep's ~3h phantom-SyntaxError drift) had no recovery.
Four fixes, all in packages/lsp/ (the tool-edit-file call site lowered to 10s):
1. Dead-process detection: SpawnedProcess.onExit (Bun proc.exited) + stdout-end
defence flip the client to error, dispose the rpc, kill the proc. The manager
re-spawns a fresh server after the 30s backoff. Dead servers are now skipped
(0s) instead of polled for 60s.
2. Concurrent fan-out + 10s hard cap: new aggregateDiagnostics queries all
matching servers at once, each capped at 10s. A non-responder is skipped
with "LSP took too long (>10s), skipped — raise this to the user" instead of
blocking the fast server's results. Replaces the vague "unusually long"
warning (now structurally impossible: slow is always false).
3. Corruption self-heal: a detector flags a server re-emitting identical
non-empty diagnostics despite the file changing; after 5 repeats the client
is marked broken and re-spawned. Clean files never trip it. (Acknowledged
false-positive risk on persistent unfixed errors; CLI type-check gate stays
authoritative.)
4. sendRequest timeout: hover/definition/references cap at 10s so they can't
hang the turn against a dead server; the initialize handshake keeps its 45s
race.
Verification: typecheck clean; 1573 tests pass (96 files), +15 new LSP tests
(86 in packages/lsp); biome clean. No kernel/contract changes; onExit is
internal to packages/lsp.
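
As an illustration of fix 2 (concurrent fan-out with a hard cap), here is a hedged sketch. DiagnosticsSource, ServerResult and aggregateSketch are hypothetical names; the real aggregateDiagnostics signature is not shown in this log.

```typescript
type ServerResult =
  | { server: string; status: "ok"; diagnostics: string[] }
  | { server: string; status: "skipped"; reason: string };

// Hypothetical shape: each server exposes an id and an async query.
interface DiagnosticsSource {
  id: string;
  query(): Promise<string[]>;
}

// Query every source at once; a slow or failing source is skipped
// instead of delaying the results of the fast ones.
async function aggregateSketch(
  sources: readonly DiagnosticsSource[],
  capMs: number,
): Promise<ServerResult[]> {
  return Promise.all(
    sources.map(async (source): Promise<ServerResult> => {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timedOut = new Promise<"timeout">((resolve) => {
        timer = setTimeout(() => resolve("timeout"), capMs);
      });
      try {
        const outcome = await Promise.race([source.query(), timedOut]);
        if (outcome === "timeout") {
          return { server: source.id, status: "skipped", reason: `took longer than ${capMs}ms` };
        }
        return { server: source.id, status: "ok", diagnostics: outcome };
      } catch (err) {
        return { server: source.id, status: "skipped", reason: String(err) };
      } finally {
        if (timer !== undefined) clearTimeout(timer);
      }
    }),
  );
}
```

Because every source races against its own timer, the total wait is bounded by the cap rather than by the number of servers times the cap.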
|
|
This was the OLD orchestrator manual (references the retired `opencode run`
CLI + `opencode-go/mimo-v2.5-pro`, MVP-era content). The current manual lives
at root ORCHESTRATOR.md (references the `dispatch` CLI + umans/umans-glm-5.2).
Unrelated housekeeping; split from the retry feature commit.
|
|
When the upstream LLM API returns a retryable error (HTTP 429 / 5xx
"overloaded"), the kernel now retries provider.stream() with a stepped
backoff, visibly, until the 8h cumulative-sleep budget is exhausted — then
emits the final error and seals the turn. Retries fire only when no content
was emitted yet this step (safety invariant: never duplicate partial output).
- wire: new transient TurnProviderRetryEvent AgentEvent variant (emitted
before each sleep; not persisted to model history).
- kernel contracts: RetryStrategy (pure delayFor + injected sleep) + optional
retry? on RunTurnInput (omit = no retry, backward-compatible).
- kernel run-turn: retry loop in executeStep; providerRetryEvent constructor.
Kernel imports no timer (sleep injected).
- session-orchestrator: concrete schedule (5s..30m, repeat 30m, 8h budget) +
abortable setTimeout sleep, wired into RunTurnInput.retry.
tsc -b EXIT 0; biome clean; 1574 vitest pass (+16 new: 11 kernel retry tests
with injected fake sleep + pure delayFor, zero @dispatch/* mocks; 5 schedule
tests). Transports unchanged (transport-ws forwards AgentEvent verbatim in
chat.delta; transport-http is generic JSON.stringify).
Plan: notes/retry-with-backoff-plan.md. tasks.md updated with milestone +
optional CLI-renderer roadmap follow-up.
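
A rough sketch of a pure delayFor plus an injected sleep, under stated assumptions: only the 5 s first step, the 30 min ceiling and the 8 h budget come from the text above. The intermediate steps, the function signatures and the isRetryable predicate are hypothetical, and the "no content emitted yet" invariant is not modelled here.

```typescript
// Assumed schedule: intermediate values are placeholders, not the real ones.
const STEPS_MS = [5_000, 15_000, 60_000, 300_000, 1_800_000];
const BUDGET_MS = 8 * 60 * 60 * 1000;

// Pure: returns the next delay, or null once sleeping would exceed the budget.
function delayFor(attempt: number, sleptMs: number): number | null {
  const delay = STEPS_MS[Math.min(attempt, STEPS_MS.length - 1)]!;
  return sleptMs + delay > BUDGET_MS ? null : delay;
}

// The sleep function is injected, so this loop imports no timer itself.
async function retryWithBackoff<T>(
  op: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  isRetryable: (err: unknown) => boolean,
): Promise<T> {
  let sleptMs = 0;
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const delay = isRetryable(err) ? delayFor(attempt, sleptMs) : null;
      if (delay === null) throw err; // not retryable, or budget exhausted
      await sleep(delay);
      sleptMs += delay;
    }
  }
}
```

Injecting sleep is what lets tests drive the loop with a fake that records delays instead of waiting.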
|
|
barrel
Wave 5c (final wiring) of transparent SSH support.
- host-bin: register exec-backend + ssh in CORE_EXTENSIONS (exec-backend before
the tool extensions that dependsOn it; ssh after, provides the remote-backend
factory + ComputerService at boot). +@dispatch/exec-backend/@dispatch/ssh deps +
tsconfig refs.
- transport-http: CR-5 — re-export computerServiceHandle + ComputerService type
from the package barrel (src/index.ts), mirroring lsp/mcp handles, so ssh imports
the typed symbol cleanly (no more dist/seam.js subpath workaround).
- orchestrator: added the @dispatch/exec-backend dep the host-bin agent missed +
bun install.
LIVE-VERIFIED: bun packages/host-bin/src/main.ts boots clean ('Dispatch booted',
no disabled extensions) — exec-backend + ssh + all tool extensions load together.
Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (+6 sshd-integration skipped).
DEFERRED (CR-6): listComputers usageCount stays 0 until a conversation-store
count-by-alias helper is added (non-blocking).
Refs: notes/ssh-support-plan.md. No merge or push.
|
|
Wave 5b of transparent SSH support. NEW standard extension @dispatch/ssh makes
remote execution actually work over SSH, transparently. ssh2 verified to run under
Bun (load-bearing decision #1 confirmed: connects to local sshd :22 + execs).
- config.ts: ~/.ssh/config reader via ssh-config -> Computer[]/ComputerEntry[]
(read-only discovery; resolves hostName/port/user/identityFile/knownHost).
- hostkey.ts: known_hosts auto-trust-and-pin (present->verify/reject-on-mismatch,
absent->accept+append; the accept-new analog).
- errors.ts: pure ssh2/SFTP -> node:fs-style .code error mapping (so tools'
existing ENOENT branches work unchanged).
- pool.ts: SshConnectionPool (per-alias ssh2.Client, lazy connect, keep-alive,
idle reap ~15m); key-only auth from ~/.ssh (config IdentityFile or default
id_ed25519/id_rsa); no agent-forwarding, no PTY.
- backend.ts: SshExecBackend implements ExecBackend (spawn via client.exec with
shell-quoted cwd; fs via SFTP).
- service.ts + extension.ts: activate provides BOTH handles the other units
consume — remoteExecBackendFactoryHandle (exec-backend: computerId->SshExecBackend)
AND computerServiceHandle (transport-http: listComputers/getComputer/getStatus/test).
- orchestrator: added packages/ssh to root tsconfig.json refs + bun install.
Tests: 45 pass + 6 sshd-integration skipped (it.skipIf(!process.env.SSH_TEST_HOST)).
Verified: tsc -b EXIT 0, biome clean, 1690 vitest pass (was 1641, +49).
CRs for wave 5c: host-bin registration; CR-5 transport-http barrel re-export;
CR-6 usageCount wiring (deferred-ok, defaults to 0).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
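
The errors.ts idea (mapping SFTP failures to node:fs-style .code values) might be sketched as follows. This assumes the incoming error carries a numeric SFTP status in its code property; the table, the EIO fallback and the function name are illustrative assumptions, not the package's actual mapping.

```typescript
// Status numbers follow the SFTP protocol draft; targets are node:fs-style codes.
const SFTP_TO_FS_CODE: Record<number, string> = {
  2: "ENOENT", // SSH_FX_NO_SUCH_FILE
  3: "EACCES", // SSH_FX_PERMISSION_DENIED
};

// Pure mapping: produce an Error whose .code existing ENOENT branches can test.
function toFsStyleError(err: unknown, path: string): Error & { code: string } {
  const status =
    typeof err === "object" && err !== null ? (err as { code?: unknown }).code : undefined;
  // Assumed fallback: anything unrecognised becomes a generic I/O error.
  const code = typeof status === "number" ? (SFTP_TO_FS_CODE[status] ?? "EIO") : "EIO";
  const message = err instanceof Error ? err.message : String(err);
  return Object.assign(new Error(`${code}: ${message}, '${path}'`), { code });
}
```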
|
|
exec-backend declares remoteExecBackendFactoryHandle (a consumer-defined
ServiceHandle<(computerId) => ExecBackend>) that the ssh package will provide
(standard→core layering). The resolver's computerId-set branch now lazy-looks-up
this factory (at tool-execute time, runtime) and calls it; if ssh isn't loaded,
getService throws → a clear 'SSH remote execution is not configured' error. The
computerId-undefined (local) branch is byte-identical to before.
This is the seam wave 5b (the ssh package) plugs into. +tests for both branches.
Verified: tsc -b EXIT 0, biome clean. No merge or push.
|
|
Wave 4 of transparent SSH support (3 parallel owner-agents on disjoint packages).
- transport-http: computer routes — GET /computers, GET /computers/:alias,
GET /computers/:alias/status, POST /computers/:alias/test (all delegate to a
new ComputerService seam, graceful []/disconnected when ssh not loaded);
GET/PUT/DELETE /conversations/:id/computer; PUT /workspaces/:id/default-computer
(mirror the cwd/default-cwd routes); /chat threads computerId into the
orchestrator. Defines ComputerService interface + computerServiceHandle
(defineService<ComputerService>('ssh')) in seam.ts — the seam the ssh package
provides via host.provideService in wave 5.
- transport-ws: chat.send + chat.queue thread computerId onto the route result
(mirrors cwd/workspaceId), forwarded to the orchestrator input.
- mcp: CR-1 fix — filterMcpTools now preserves computerId on the returned
ToolAssembly (mirrors cwd preservation), so the filter chain stays consistent.
- orchestrator: added @dispatch/wire dep to transport-http (build/config, my lane)
so its seam.ts Computer/ComputerEntry import resolves.
Verified: tsc -b EXIT 0, biome clean, 1641 vitest pass (was 1620, +21).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
|
|
transport-contract API types
Wave 3 of transparent SSH support (2 parallel owner-agents on disjoint packages).
- session-orchestrator: thread computerId end-to-end through the turn, mirroring
cwd exactly — StartTurnInput/EnqueueInput/handleMessage/TurnLifecyclePayload
gain computerId; runTurnDetached resolves effectiveComputerId via
conversationStore.getEffectiveComputer(convId, override), persists the override,
threads into RunTurnInput + ToolAssembly. Register a remote-degradation
tools-filter (filterRemoteIncompatibleTools) that, when assembly.computerId is
set (REMOTE), drops the 'lsp' tool + any '__'-namespaced MCP tool (local
processes that can't see remote files); LOCAL (computerId undefined) is a
passthrough — byte-identical to today. +21 tests.
- transport-contract: + computerId on ChatRequest (flows to ChatSendMessage) +
computer endpoint API types (ComputerListResponse, ComputerResponse,
ComputerStatusResponse, SetConversationComputerRequest,
ConversationComputerResponse, SetWorkspaceDefaultComputerRequest,
TestComputerResponse) — mirrors the cwd/workspace endpoint types.
- CR-1 (non-blocking, folded into wave 4): MCP filter doesn't preserve computerId
on the returned ToolAssembly.
- cache-warming computerId threading intentionally DEFERRED (user request) —
noted as a known performance-only limitation in tasks.md.
Verified: tsc -b EXIT 0, biome clean, 1620 vitest pass (was 1599, +21).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
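
The remote-degradation filter described above can be pictured with this sketch. ToolAssemblySketch is a hypothetical, reduced stand-in for the real ToolAssembly type, and the function body is an assumption based only on the rules stated in the message.

```typescript
// Hypothetical, reduced stand-in for the real ToolAssembly.
interface ToolAssemblySketch {
  computerId?: string;
  tools: { name: string }[];
}

function filterRemoteIncompatibleSketch(assembly: ToolAssemblySketch): ToolAssemblySketch {
  // LOCAL turn: return the very same object, untouched.
  if (assembly.computerId === undefined) return assembly;
  // REMOTE turn: drop 'lsp' and any '__'-namespaced (MCP) tool, keep computerId.
  return {
    ...assembly,
    tools: assembly.tools.filter((tool) => tool.name !== "lsp" && !tool.name.includes("__")),
  };
}
```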
|
|
Wave 2 of transparent SSH support (4 parallel owner-agents on disjoint
tool packages). The tools now resolve an ExecBackend per-call from
ctx.computerId and call backend.spawn / backend.readFile / etc. instead of
node:fs and node:child_process directly — so they are transport-agnostic
(local now; remote over SSH later, transparent to the agent). Still LOCAL-ONLY
this wave (computerId always undefined -> LocalExecBackend, behavior-identical).
- tool-shell: factory takes resolveBackend; execute calls backend.spawn.
spawn.ts DELETED (realSpawn was a verbatim duplicate of exec-backend's
LocalExecBackend.spawn — logic moved to the sanctioned shared package).
manifest dependsOn:[exec-backend]; host.getService at activation.
- tool-read-file: readFile/stat/readdir -> backend.* (pure logic untouched;
ENOENT .code branches kept).
- tool-write-file: exists/stat/writeFile -> backend.* (pure logic untouched).
- tool-edit-file: readFile/writeFile -> backend.* + forward-compatible REMOTE
diagnostics skip (ctx.computerId set -> skip LSP, return empty — plan §6.1;
local path byte-identical to today). LSP lookup stays lazy.
- orchestrator: pre-wired @dispatch/exec-backend dep into the 4 tool
package.jsons + bun install (build/config, my lane) so isolated verify
resolved cleanly; agents added the ../exec-backend tsconfig ref.
Verified: tsc -b EXIT 0, biome clean, 1599 vitest pass (was 1592).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
|
|
Wave 1 of transparent SSH support (parallel owner-agents on disjoint packages,
plus the orchestrator-authored kernel contract seam from wave 0):
- packages/wire: + Computer/ComputerEntry (read-only view over ~/.ssh/config
Host aliases) + Workspace.defaultComputerId (string|null, null=local). Types
only; 3 conformance tests.
- packages/exec-backend (NEW core extension): the ExecBackend abstraction
(spawn + minimal fs surface) the bundled tools will program against instead
of node:fs/child_process. LocalExecBackend wraps today's node calls
(behavior-identical; node:fs-style .code errors). execBackendHandle +
ExecBackendResolver (sync; computerId undefined -> local; set -> throws until
the ssh package wires remote resolution in wave 5). 20 tests.
- packages/kernel (runtime only): thread computerId through dispatch.ts +
run-turn.ts exactly as cwd is threaded (opaque, forwarded to
ToolExecuteContext; absent = local = byte-identical to today). +2 tests.
- packages/conversation-store: computer (SSH alias) assignment + resolution
mirroring cwd — WorkspaceRow.defaultComputerId + setWorkspaceDefaultComputerId
+ getComputerId/setComputerId/clearComputerId + getEffectiveComputer
(override -> per-conv -> workspace default -> null/local). Fixes the 3
Workspace literal sites the new required wire field broke. +18 tests.
- orchestrator: root tsconfig.json ref for exec-backend + bun install.
Verified: tsc -b EXIT 0, biome clean, 1592 vitest pass (was 1549, +43).
Refs: notes/ssh-support-plan.md (decisions §0.5/§13). No merge or push.
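
The resolution ladder (override, then per-conversation, then workspace default, then null meaning local) reduces to a few lines. This is a pure sketch with a hypothetical signature; the real getEffectiveComputer takes a conversation id and reads from the store.

```typescript
// null means "run locally"; a string is an SSH config alias.
type ComputerId = string | null;

function effectiveComputerSketch(
  override: string | undefined,
  perConversation: ComputerId,
  workspaceDefault: ComputerId,
): ComputerId {
  // First defined rung wins; falling off the ladder means local execution.
  return override ?? perConversation ?? workspaceDefault ?? null;
}
```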
|
|
Add additive optional `computerId` field to ToolExecuteContext + RunTurnInput.
The kernel never interprets it (forwards verbatim to tools, like cwd) — it never
enters the model prompt (no prompt-cache impact). When omitted/undefined,
execution is LOCAL (today's behavior), so this is fully backward compatible.
This is the orchestrator-authored seam (ORCHESTRATOR.md §2a) that lets Wave 1's
producers (wire Computer types, exec-backend contract) and the consumer
(kernel runtime threading) run in parallel against a fixed type.
Refs: notes/ssh-support-plan.md (decisions resolved in §0.5/§13).
No merge or push.
|
|
The backend already supported GET /conversations?workspaceId= but the CLI
never sent it. Wire the list command to that filter:
- args.ts: parse --workspace / -w on 'list' (placed before the --catch-all
so the single-dash -w shorthand isn't taken for a positional prefix);
add workspaceId? to the list ParsedCommand.
- http.ts: add workspaceId? to FetchConversationsOpts; send ?workspaceId=
(after q/status, preserving URLSearchParams order).
- main.ts: forward parsed.workspaceId into fetchConversations; update USAGE.
Composable with --status and the <prefix> short-id arg. 'Open conversations
in workspace X' is now: dispatch list --workspace X (status defaults to
active,idle). No contract changes — purely additive CLI wiring.
Tests: +4 args (incl. composability + missing-value error), +2 http
(exact ?workspaceId= URL + combined status/workspaceId with %2C encoding).
typecheck EXIT 0, biome clean (364 files), full suite 1558 passed.
Live-verified against an isolated server.
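
For illustration, the query construction in http.ts could resemble the sketch below. The option names q, status and workspaceId come from the message; the function name, the base-URL handling and the array type for status are assumptions.

```typescript
// Hypothetical reduced options type, modelled on FetchConversationsOpts.
interface ListOptsSketch {
  q?: string;
  status?: string[];
  workspaceId?: string;
}

function conversationsUrl(base: string, opts: ListOptsSketch): string {
  const params = new URLSearchParams();
  // Insertion order is preserved: q, then status, then workspaceId.
  if (opts.q !== undefined) params.set("q", opts.q);
  if (opts.status !== undefined && opts.status.length > 0) {
    params.set("status", opts.status.join(","));
  }
  if (opts.workspaceId !== undefined) params.set("workspaceId", opts.workspaceId);
  const query = params.toString();
  return query === "" ? `${base}/conversations` : `${base}/conversations?${query}`;
}
```

URLSearchParams percent-encodes the comma, which is where the %2C in the combined status/workspaceId test comes from.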
|
|
Resolve the last open question: take the ssh-config npm package (project-local,
alongside ssh2) for correct ~/.ssh/config parsing rather than hand-rolling.
§13 now lists all 8 decisions as resolved and marks the plan decision-complete.
Also records minor adopted defaults (config reader lives in ssh extension;
stale alias surfaced as unresolved not silent-local; default identity probing
order; assume unencrypted keys for MVP).
Planning document only; no code changed. No merge or push.
|
|
Update the SSH support plan to reflect user-confirmed decisions and a key
simplification from a new requirement:
- New §0.5 'Resolved decisions' records all 7 confirmed answers.
- Computer is now a READ-ONLY view over ~/.ssh/config (Host aliases), not a
persisted CRUD entity: no computer-store package, no create/update/delete
API. computerId IS an SSH config alias. ~/.ssh/known_hosts is the host-key
trust store (auto-trust-and-pin).
- Auth simplified to key-only from ~/.ssh (no gopass/SecretsAccess/secretRef
anywhere).
- ssh2 only (no bun-ssh2 fork); verifying under Bun is the load-bearing
Phase-3 first step.
- LSP/MCP silently dropped on remote turns (no system-prompt note);
edit_file works with no diagnostics on remote.
- computerId persisted per-conversation (like cwd).
- Updated data model (§3), connection mgmt (§4), security (§7), edge cases
(§8), API surface (§9 read-only), frontend (§10), packages table (§11,
no computer-store), phases (§12), and resolved open questions (§13).
Planning document only; no code changed. No merge or push.
|
|
Investigation of whether the backend supports listing open conversations
filtered by a specific worktree/workspace.
Findings:
- 'worktree' is not a Dispatch domain concept; canonical term is 'workspace'
(logical grouping) vs 'working directory' (cwd, filesystem path).
- GET /conversations already supports composable ?workspaceId=, ?status=, ?q=
filters. 'Open conversations in workspace X' = ?workspaceId=X&status=active,idle.
- Every conversation carries a workspaceId (default 'default'); ConversationMeta
is in @dispatch/wire; filter lives in conversation-store listConversations.
- A literal directory (git worktree) filter (?cwd=) is NOT supported; §3b
documents the small additive change needed across wire/store/transport-http.
- Test coverage verified: store-workspace.test.ts:369, store.test.ts:1463,
app.test.ts:3696.
Research notes only — no code/contract changes.
|
|
Research and plan transparent SSH execution so an agent runs commands on a
remote computer as if local — the agent never learns it is using SSH.
Covers:
- How the cwd → ToolExecuteContext pipeline works today and where a
computerId threads in (mirroring cwd end-to-end)
- The ExecBackend abstraction (spawn + fs) behind which tool-shell/
read-file/write-file/edit-file are refactored, with LocalExecBackend
(node) and SshExecBackend (ssh2) implementations
- Computer data model + workspace defaultComputerId + per-conversation
override, mirroring the getEffectiveCwd resolution ladder (null = local)
- SSH connection pooling (one per computer, lazy connect, keep-alive, idle
reaping), auth via SecretsAccess/gopass, host-key verification
- Turn loop / dispatch integration (additive optional computerId field,
backward-compatible — absent = today's local behavior)
- LSP/MCP degrade by dropping those tools on remote turns (future: remote
server spawn over SSH)
- API surface (computer CRUD, per-conv + workspace-default endpoints,
chat.send gains computerId), frontend impact
- Security, edge cases, phased implementation, contract gaps reported to
unit owners (one-owner-per-unit honored — planner does not edit others)
No code changed; planning document only. No merge or push.
|
|
Add the same --file <path> support that the summon (chat) command has to the
'dispatch send' subcommand. When --file is given, the file's contents are read
and attached to the message (composed via composeMessage, identical to chat).
- args.ts: add 'file' to the send ParsedCommand, make 'text' optional, parse
--file, and require at least one of --text or --file.
- main.ts: read the file and compose the message in the send case, using the
composed message in both the --queue and streaming branches; update USAGE.
- args.test.ts: cover --file parsing (alone, with --text, missing value) and
update the existing send expectations + the both-missing error message.
|
|
bin/up ran `bun --watch main.ts` without setting BACKEND_PORT, so a
shell-exported BACKEND_PORT (e.g. 24991 in ~/.bashrc, set so the Dispatch
CLI hits the prod server) overrode .env's dev value 24203, because Bun lets
shell env win over .env. The dev server therefore bound to the production
port and collided with the active dispatch-server systemd service.
transport-http then failed to activate (Bun.serve "port in use"), so the
HTTP server never came up and the frontend got "Failed to fetch".
Force BACKEND_PORT=24203 + SURFACE_WS_PORT=24205 in the setsid invocation
so the dev stack is deterministic regardless of the shell environment.
|
|
Mirrors the existing GET /conversations/:id/lsp route exactly: gates on the
persisted cwd, then the effective cwd (null → empty servers), returns 503 when the
MCP service isn't loaded, and maps McpServerStatus → McpServerInfo
(conditionally including `error` per exactOptionalPropertyTypes).
Wires mcpService into CreateServerOptions + extension activate via a plain
host.getService (mirroring lspService; "mcp" added to dependsOn, route added
to contributes.routes), adds the @dispatch/mcp workspace dep, and re-exports
mcpServiceHandle / McpService / McpServerStatus from seam.ts. Adds 4 tests
mirroring the LSP status tests.
|
|
Two bugs caused the dispatch server to crash (15 times since Jun 24)
when chat cc6c edited packages/transport-http/src/app.ts — a 40KB file
with 23 multi-byte UTF-8 lines. The edit_file diagnostics hook sends the
file to tsserver, which sends back a large publishDiagnostics response.
When the response was split across stdout chunks at a multi-byte
character boundary, the server crashed.
Layer 1 — rpc.ts handleMessage: JSON.parse had no try/catch. A corrupted
message threw an unhandled SyntaxError → unhandled rejection → process
exit. Wrapped in try/catch; malformed messages are now skipped.
Also hardened client.ts handleBytes: the async handleMessage Promise was
fire-and-forget. Added .catch(() => {}) as defence-in-depth so no
rejection from the RPC layer can ever crash the server.
Layer 2 — framing.ts FrameDecoder: used a string buffer with
new TextDecoder().decode(chunk) (no { stream: true }), corrupting
multi-byte characters split across chunks. Worse, Content-Length counts
bytes but the buffer was sliced by character count — for multi-byte
content byte length ≠ char length, so the decoder extracted the wrong
slice as a message. Rewrote to use a Uint8Array byte buffer: header
separator search is byte-level, Content-Length comparison is byte-level,
and the body is decoded only after all bytes are confirmed present.
Tests: 5 new multi-byte framing tests (split at char boundary,
byte-vs-char Content-Length, two messages in one chunk, three-way split)
+ 1 rpc test (malformed JSON does not throw). All 1545 tests pass.
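
The byte-level approach from Layer 2 can be sketched as below. The class and helper names are hypothetical and the handling of a header without Content-Length (drop it and continue) is an assumption; the point is only that the separator search, the length comparison and the slicing all operate on bytes, with decoding deferred until the whole body is present.

```typescript
// Find the first "\r\n\r\n" in a byte buffer, or -1 if absent.
function indexOfSeparator(bytes: Uint8Array): number {
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === 13 && bytes[i + 1] === 10 && bytes[i + 2] === 13 && bytes[i + 3] === 10) {
      return i;
    }
  }
  return -1;
}

class FrameDecoderSketch {
  private buffer = new Uint8Array(0);
  private readonly decoder = new TextDecoder();

  // Feed one chunk; returns every complete message body now available.
  push(chunk: Uint8Array): string[] {
    const merged = new Uint8Array(this.buffer.length + chunk.length);
    merged.set(this.buffer, 0);
    merged.set(chunk, this.buffer.length);
    this.buffer = merged;

    const messages: string[] = [];
    for (;;) {
      const headerEnd = indexOfSeparator(this.buffer);
      if (headerEnd === -1) break; // header not complete yet
      const bodyStart = headerEnd + 4;
      const header = this.decoder.decode(this.buffer.subarray(0, headerEnd));
      const match = /Content-Length:\s*(\d+)/i.exec(header);
      if (match === null) {
        // Assumed policy: discard a header block that lacks Content-Length.
        this.buffer = this.buffer.slice(bodyStart);
        continue;
      }
      const length = Number(match[1]); // a BYTE count, not a character count
      if (this.buffer.length < bodyStart + length) break; // body not complete yet
      // Decode only once all body bytes are present, so split characters stay intact.
      messages.push(this.decoder.decode(this.buffer.subarray(bodyStart, bodyStart + length)));
      this.buffer = this.buffer.slice(bodyStart + length);
    }
    return messages;
  }
}
```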
|
|
Additive types for GET /conversations/:id/mcp status endpoint, mirroring the
existing LSP status types. McpServerState, McpServerInfo, McpStatusResponse.
+2 type-test assertions. Version bump 0.21.0 → 0.22.0.
Handoff written: frontend-mcp-status-handoff.md (backend route + FE consumption).
|
|
- MCP live-verified: test MCP server → tool discovery (test__ping) → tool call
→ pong result. Full turn lifecycle confirmed on production server.
- Per-edit diagnostics live-verified: type error in .ts file surfaces
[TypeScript Language Server] ERROR (2322) inline after edit.
- edit_file bug found + fixed during live-verify (lazy LSP lookup).
|
|
The previous fix (e03a96e) wrapped getService in try/catch to prevent the
activation crash, but that wasn't enough: tool-edit-file activates at position
5 in CORE_EXTENSIONS while lsp activates at position 20. So getService ALWAYS
threw at activation time, lspService was ALWAYS undefined, and the diagnostics
hook was NEVER wired — edits succeeded but never showed LSP feedback.
Fix: make the LSP service lookup LAZY — defer it to edit time (when the tool is
actually called), not activation time. By then all extensions have activated.
The diagnostics function tries getService on each edit call; if LSP isn't
loaded, it returns a no-op (graceful degradation).
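
A small sketch of the lazy lookup, assuming the service is reached through a getter that throws when no provider exists (as getService is described to do). LspServiceSketch and makeLazyDiagnostics are hypothetical names, and returning an empty string stands in for the no-op.

```typescript
// Hypothetical, reduced service surface.
interface LspServiceSketch {
  getDiagnostics(path: string, text: string): Promise<string>;
}

// Built at activation time, but the lookup only runs when an edit happens.
function makeLazyDiagnostics(
  lookup: () => LspServiceSketch, // assumed to throw when the service has no provider
): (path: string, text: string) => Promise<string> {
  return async (path, text) => {
    let service: LspServiceSketch;
    try {
      service = lookup();
    } catch {
      return ""; // graceful degradation: edit proceeds without diagnostics
    }
    return service.getDiagnostics(path, text);
  };
}
```

Only the lookup is wrapped in try/catch here, so a failure inside getDiagnostics itself would still surface to the caller; whether the real code swallows those too is not stated.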
|
|
The per-edit diagnostics change (8f6114b) called host.getService(lspServiceHandle)
during activate(). But getService THROWS when a service has no provider — so if
the LSP extension activates AFTER tool-edit-file (or isn't loaded at all), the
activate() function crashes and the edit_file tool is NEVER REGISTERED. This is
why the edit_file tool was missing from the agent toolset.
Fix: wrap getService in try/catch — if the LSP service isn't available yet,
lspService becomes undefined and edits proceed without diagnostics (the graceful
degradation the comment always promised but the code didn't deliver).
|
|
New `mcp` standard extension (`packages/mcp/`) that makes Dispatch an MCP
host: spawns configured MCP servers (stdio child processes), performs the MCP
handshake (initialize → notifications/initialized), discovers tools via
tools/list, and registers each as a first-class Dispatch ToolContract via
host.defineTool. When the model calls an MCP tool, the extension proxies the
call to tools/call on the MCP server and returns the flattened result.
Architecture (sibling of `lsp` extension):
- Config: .dispatch/mcp.json (servers key) → opencode.json mcp key fallback,
resolved per-cwd (mirrors LSP config resolution)
- Transport: StdioTransport (spawn child, Content-Length framing + JSON-RPC 2.0)
- Client: initialize → tools/list → tools/call; handles list_changed
notifications for dynamic tool updates
- Registry: tool name namespacing (<serverId>__<toolName>), ToolContract
adapter that proxies execute → callTool, content flattening (text/image/
resource → string)
- Manager: one client per server, lazy-spawn, status(), shutdownAll()
- Extension: manifest (dependsOn session-orchestrator, capabilities spawn),
registers tools + a toolsFilter (drops disconnected server's tools),
mcpServiceHandle, deactivate kills all child processes
Phase 1 scope: stdio only, Tools only (no Resources/Prompts/HTTP/sampling).
Hand-rolled JSON-RPC + framing (zero external deps, adapts LSP patterns).
Wave 1 (agent): 12 source + 8 test files, 69 new tests.
Wave 2 (orchestrator): root tsconfig ref, host-bin CORE_EXTENSIONS
registration + package.json dep, bun install.
Verified: tsc -b EXIT 0, biome clean, 1537 vitest pass (was 1468, +69).
|
|
|
|
- notes/mcp-design.md: full design — architecture fit (sibling of lsp ext),
per-cwd config (.dispatch/mcp.json + opencode.json mcp key), tool name
namespacing (<serverId>__<toolName>), ToolContract adapter, content
flattening, security, glossary additions, 6 open design decisions
- PLAN-mcp.md: wave breakdown (Wave 0 contracts/wiring, Wave 1 the mcp
extension, Wave 2 host-bin registration, Wave 3 live verification)
- Phase 1 scope: stdio only, Tools only, no surface, hand-rolled JSON-RPC
- No kernel contract change needed (existing ToolContract + defineTool +
toolsFilter are sufficient)
|
|
stale HANDOFF.md
- tasks.md: record per-edit LSP diagnostics auto-append milestone (commit
8f6114b), fix test count 1453→1468
- HANDOFF.md: retire stale post-MVP handoff (referenced arch-rewrite path,
178 tests, next-steps all done) → current accurate pointer file
|
|
LSP extension:
- Multi-server aggregation: query ALL connected servers matching the
file's extension (not just the first), merge diagnostics tagged by source
- Incremental sync: capture each server's textDocumentSync.change during
initialize; compute prefix/suffix diff ranges for change:2 servers;
full content for change:1 (generic, works for any LSP)
- New diff.ts: pure computeChangeRange + offsetToPosition (O(n), tested)
- Buffer sync: change(filePath, newText) sends didChange with post-edit
in-memory content; openWithText for first open; tracks open doc text
- languageId mapping: extended with .rb/.rbs/.c/.cpp/etc. (was 'unknown')
- waitForDiagnostics: accepts text override + timeoutMs; returns
{ formatted, slow, timedOut }; polls for publishDiagnostics push
- DiagnosticsStore: hasReceivedPush/clearReceived tracking; formatFiltered
with minSeverity (1=Error, 2=Warning) for edit_file integration
- LspService.getDiagnostics: service method for cross-extension use
tool-edit-file:
- After successful edit, calls LSP getDiagnostics with post-edit buffer
- Only appends diagnostics with severity ≤ 2 (errors+warnings, no noise)
- Appends slow warning (>10s): 'LSP is taking unusually long...'
- 60s timeout; graceful degradation when no LSP available
- Optional dep on @dispatch/lsp (getService pattern, not manifest dependsOn)
1468 vitest pass (was 1453, +15 new diff tests).
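
The prefix/suffix diff behind incremental sync can be illustrated as follows. The function names echo diff.ts, but the return shapes and bodies are assumptions; offsets are UTF-16 code units, which is what JavaScript string indexing provides.

```typescript
interface ChangeRangeSketch {
  start: number; // first differing offset (shared by old and new text)
  oldEnd: number; // end of the replaced span in the old text
  newEnd: number; // end of the replacement span in the new text
}

// O(n): skip the common prefix, then the common suffix of what remains.
function computeChangeRangeSketch(oldText: string, newText: string): ChangeRangeSketch {
  const max = Math.min(oldText.length, newText.length);
  let prefix = 0;
  while (prefix < max && oldText[prefix] === newText[prefix]) prefix++;
  let suffix = 0;
  while (
    suffix < max - prefix &&
    oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]
  ) {
    suffix++;
  }
  return { start: prefix, oldEnd: oldText.length - suffix, newEnd: newText.length - suffix };
}

// Convert a string offset into a zero-based line/character position.
function offsetToPositionSketch(text: string, offset: number): { line: number; character: number } {
  const clamped = Math.min(offset, text.length);
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < clamped; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: clamped - lineStart };
}
```

A server that requested incremental sync would then receive only newText.slice(start, newEnd) together with the old range, instead of the full buffer.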
|
|
|
|
- Web frontend section links to github.com/realtradam/dispatch-web
- Claude extension links to github.com/realtradam/dispatch-adapter-claude
- Workspace layout uses generic clone instructions, not /home/tradam paths
- Dev stack bin/up instructions show clone-both-repos-as-siblings
|
|
- Full HTTP API table (30+ endpoints: conversations, workspaces, LSP,
system-prompt, metrics, queue, cache-warming, compaction, etc.)
- Complete CLI commands (list, read, send, stop, compact, open + flags)
- All 37 packages documented with tiers and dependencies
- systemd deploy section (bin/build, bin/install, bin/sync-env)
- Dev stacks table (bin/up 24203, bin/up2 25203)
- Workspace layout (dispatch-backend, dispatch-web, claude, bin)
- Updated web frontend section (Slice 2 browser chat in progress)
- Links to all design docs
|
|
After consolidating to the dev branch and renaming the worktree,
update all path references in ORCHESTRATOR.md and .skills/ORCHESTRATOR.md.
|
|
Mark all 5 live-verify checkboxes as done (reasoning effort, todo tool,
CLI cross-client, abort-race, system-prompt builder). Slim the roadmap
from 11 items down to 3 open items (web frontend, close-with-queued-messages
product decision, FE crash-recovery status endpoint) by dropping the 8
completed/verified items.
|
|
conversation
kernel: executeToolCall now races tool.execute against the abort signal
via Promise.race; on abort resolves (not rejects) with an "Aborted" result
so the step completes normally → finishReason "aborted" → turn seals
cleanly (done event) → finally clears activeTurns → conversation freed,
next message accepted. run-turn strips tool-call chunks from the assistant
message on abort (keeps text/thinking) and omits tool-result messages to
avoid persisting dangling tool calls that would 400 the provider next turn.
tool-shell: realSpawn spawns detached (own process group); on abort AND
timeout kills the entire group (process.kill(-pgid, SIGKILL)) and resolves
immediately — no child.on("close") dependency, so a grandchild holding the
pipes can't stall the spawn promise or leak.
Also: ORCHESTRATOR.md migrated to dispatch CLI summon mechanism; .skills
summary; bin/sync-env PATH injection; frontend handoff docs.
1453 vitest pass · tsc -b EXIT 0 · biome clean.
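The abort race in executeToolCall can be sketched as below. This is an
illustration, not the kernel code: raceAgainstAbort and the ToolResult shape
are hypothetical names. The point it shows is that abort resolves with a
marker result instead of rejecting, so the caller's normal completion path
still runs.

```typescript
// Hypothetical result shape; the real kernel type is not shown in this log.
interface ToolResult {
  content: string;
  isError: boolean;
}

// Race a tool's work against an AbortSignal. On abort, RESOLVE (not reject)
// with an "Aborted" result so the step can complete normally.
function raceAgainstAbort(work: Promise<ToolResult>, signal: AbortSignal): Promise<ToolResult> {
  const aborted: ToolResult = { content: "Aborted", isError: true };
  if (signal.aborted) {
    work.catch(() => undefined); // avoid an unhandled rejection from the abandoned work
    return Promise.resolve(aborted);
  }
  let onAbort: () => void = () => {};
  const abortPromise = new Promise<ToolResult>((resolve) => {
    onAbort = () => resolve(aborted);
    signal.addEventListener("abort", onAbort, { once: true });
  });
  // Remove the listener once either side settles so it cannot leak.
  return Promise.race([work, abortPromise]).finally(() => {
    signal.removeEventListener("abort", onAbort);
  });
}
```

Racing only stops the caller from waiting; it does not stop the work itself,
which is why the tool-shell change above also kills the process group.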
|
|
reconcile() only repaired orphaned tool-calls. Two other broken states made
chats uncontinuable, and load() had no parse-error guard:
- A trailing assistant message whose only chunk is 'error' (a
failed-generation marker) serializes to empty content, so the provider
rejects it or returns an empty reply and the chat never continues. 6 of 140
production conversations were stuck.
- A tool-call whose input is a raw malformed-JSON string (the model emitted
broken JSON) is re-sent as OpenAI arguments, so the provider returns 400 on
every continuation (the 77574596 break).
- load() called JSON.parse with no try/catch, so one corrupt row bricked the
chat.
Fix = read-time repair (no DB surgery; append-only preserved). reconcile
runs on every load() BEFORE any provider sees messages, so Layer 1
protects ALL providers.
Layer 1 (conversation-store reconcile): strip error chunks from assistant
messages + drop the now-empty error-only messages (safe: never followed by
a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport
+2 additive counts. loadSince (FE reads) intentionally unreconciled so the
user still SEES the error. load() wraps JSON.parse in try/catch (skip
corrupt rows).
Layer 2 (openai-stream): serializeToolArguments ensures tool-call
arguments is always valid JSON (malformed string -> fallback object),
neutralizing already-stored malformed args.
Layer 2 equivalent (../claude provider-anthropic): safeJson returns a valid
object fallback on parse failure, not the raw string. (Separate repo.)
Live-verified: reproduced 77574596's real broken tail in the dev DB;
POST /chat continued it cleanly (no 400, model replied) — the provider
accepted the reconciled history.
tsc -b EXIT 0, biome clean, 1453 vitest pass.
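A rough sketch of the two defensive parses described above. The function
bodies, the `_raw` fallback key, and the parseRows helper are assumptions for
illustration; only the name serializeToolArguments comes from this commit.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Always return JSON text for an object, whatever was stored as the input.
function serializeToolArguments(input: unknown): string {
  if (typeof input === "string") {
    try {
      if (isPlainObject(JSON.parse(input))) return input; // already a valid JSON object
    } catch {
      // malformed JSON: fall through to the fallback object
    }
    return JSON.stringify({ _raw: input }); // assumed fallback shape
  }
  if (isPlainObject(input)) return JSON.stringify(input);
  return JSON.stringify({ _raw: input });
}

// Parse stored rows, skipping corrupt ones instead of failing the whole load.
function parseRows<T>(rows: string[]): { messages: T[]; skipped: number } {
  const messages: T[] = [];
  let skipped = 0;
  for (const row of rows) {
    try {
      messages.push(JSON.parse(row) as T);
    } catch {
      skipped++;
    }
  }
  return { messages, skipped };
}
```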
|
|
shadow warning
All five live-verify checks passed against the dev stack (bin/up :24203):
configSource reaches the wire (built-in TS, 'built-in'); broken server reports
error + configSource + source-named error; recovery without restart (blocker,
error->connected after config fix); no retry storm; shadow warning logged via
host.logger when both configs declare lsp.
|
|
Two issues found by decompiling the running dispatch-server binary
(handoff from a ruby-lsp setup in raylib-jamstack):
Issue 2 (blocker): a failed LSP server was "broken" FOREVER — the
manager's broken set was cleared only in shutdownAll(), so a server
that failed (bad env, missing binary, or a since-fixed config) stayed
state:"error" for the whole process. For an agent running *inside*
dispatch the only recovery (server restart) kills its own session.
Now a broken server self-heals when its resolved config changes since
it was marked broken (discrete event → no retry storm), with a bounded
backoff for transient failures.
Issue 1: .dispatch/lsp.json silently shadowed opencode.json's lsp key
with no warning and no source attribution. Now: shadow warning via
host.logger when both declare lsp; configSource populated on status
(.dispatch/lsp.json / opencode.json / built-in); spawn-failure error
strings name the config source.
Contract: additive configSource?: string on LspServerInfo
(@dispatch/transport-contract 0.20.0→0.21.0). transport-http passes it
through to the wire (was a field-by-field map that dropped it — CR
resolved by the transport-http owner).
tsc -b EXIT 0, biome clean, 1443 vitest pass.
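One way to model the self-healing broken set, as a sketch under assumptions:
BrokenServers, its method names, the config hash, and the backoff constants
are all hypothetical and not taken from the LSP manager. A changed config
allows an immediate retry; an unchanged config retries only after an
exponentially growing, capped delay.

```typescript
interface BrokenEntry {
  configHash: string; // fingerprint of the resolved config when the server failed
  failures: number;
  markedAt: number; // ms timestamp
}

class BrokenServers {
  private readonly entries = new Map<string, BrokenEntry>();
  private readonly baseMs: number;
  private readonly maxMs: number;

  constructor(baseMs = 1_000, maxMs = 60_000) {
    this.baseMs = baseMs;
    this.maxMs = maxMs;
  }

  markBroken(id: string, configHash: string, now: number): void {
    const prev = this.entries.get(id);
    // Failures accumulate only while the config stays the same.
    const failures = prev && prev.configHash === configHash ? prev.failures + 1 : 1;
    this.entries.set(id, { configHash, failures, markedAt: now });
  }

  markHealthy(id: string): void {
    this.entries.delete(id);
  }

  shouldRetry(id: string, configHash: string, now: number): boolean {
    const entry = this.entries.get(id);
    if (!entry) return true; // never failed
    if (entry.configHash !== configHash) return true; // config changed: discrete event
    const delay = Math.min(this.maxMs, this.baseMs * 2 ** (entry.failures - 1));
    return now - entry.markedAt >= delay; // bounded backoff for transient failures
  }
}
```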
|
|
|
|
A chat's selected provider + model is now persisted per conversation (like cwd
and reasoningEffort). Opening a conversation in a new browser recalls the
originally selected model instead of defaulting.
- transport-contract 0.19.0→0.20.0: ModelResponse + SetModelRequest types
for GET/PUT /conversations/:id/model.
- conversation-store: getModel/setModel (model:<id> key, mirrors
getReasoningEffort/setReasoningEffort); forkHistory copies model; empty
string clears.
- session-orchestrator: resolve model from persisted store when no per-turn
override; persist the resolved model so it sticks; warm path parity.
- transport-http: GET/PUT /conversations/:id/model endpoints with validation.
1433 vitest pass; tsc + biome clean.
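The persistence and resolution order might be sketched like this. ModelStore,
resolveModel, the in-memory Map, and the model names are stand-ins; whether a
per-turn override is itself persisted is an assumption read from "persist the
resolved model".

```typescript
// In-memory stand-in for the conversation store's key-value layer.
class ModelStore {
  private readonly kv = new Map<string, string>();

  getModel(conversationId: string): string | undefined {
    return this.kv.get(`model:${conversationId}`);
  }

  // An empty string clears the persisted model.
  setModel(conversationId: string, model: string): void {
    if (model === "") this.kv.delete(`model:${conversationId}`);
    else this.kv.set(`model:${conversationId}`, model);
  }

  // Forking copies the parent's model to the new conversation.
  forkModel(fromId: string, toId: string): void {
    const model = this.getModel(fromId);
    if (model !== undefined) this.setModel(toId, model);
  }
}

// Per-turn override first, then the persisted model, then the default.
function resolveModel(
  store: ModelStore,
  conversationId: string,
  override: string | undefined,
  fallback: string,
): string {
  const model = override ?? store.getModel(conversationId) ?? fallback;
  store.setModel(conversationId, model); // persist so the choice sticks
  return model;
}
```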
|
|
|
|
The system-prompt service cached the resolved prompt on first turn and reused
it on subsequent turns via get(). But the prompt is cwd-sensitive (file:AGENTS.md,
prompt:cwd variables). When a conversation's cwd changed after the first turn,
the cached prompt was stale — referenced files from the new cwd were not loaded.
system-prompt: added getWithMeta(conversationId) returning { prompt, cwd } and
stores resolved-cwd:<id> alongside resolved:<id> in construct().
session-orchestrator: subsequent turns now call getWithMeta, compare stored cwd
vs effective cwd, and reconstruct if they differ. Compaction path (always
constructs) and warm path (no system prompt) are unaffected.
1411 vitest pass; tsc + biome clean.
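The cwd check on subsequent turns can be sketched as follows. PromptCache,
promptForTurn, and the build callback are hypothetical; only getWithMeta,
construct, and the resolved:<id> / resolved-cwd:<id> keys come from this
commit.

```typescript
type PromptBuilder = (cwd: string) => string;

class PromptCache {
  private readonly kv = new Map<string, string>();

  // Resolve the prompt and store it together with the cwd it was built for.
  construct(conversationId: string, cwd: string, build: PromptBuilder): string {
    const prompt = build(cwd);
    this.kv.set(`resolved:${conversationId}`, prompt);
    this.kv.set(`resolved-cwd:${conversationId}`, cwd);
    return prompt;
  }

  getWithMeta(conversationId: string): { prompt: string; cwd: string } | undefined {
    const prompt = this.kv.get(`resolved:${conversationId}`);
    const cwd = this.kv.get(`resolved-cwd:${conversationId}`);
    if (prompt === undefined || cwd === undefined) return undefined;
    return { prompt, cwd };
  }
}

// Reuse the cached prompt only when it was built for the effective cwd.
function promptForTurn(
  cache: PromptCache,
  conversationId: string,
  effectiveCwd: string,
  build: PromptBuilder,
): string {
  const cached = cache.getWithMeta(conversationId);
  if (cached && cached.cwd === effectiveCwd) return cached.prompt;
  return cache.construct(conversationId, effectiveCwd, build);
}
```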
|
|
- @dispatch/transport-contract 0.18.0 -> 0.19.0:
add workspaceId: string to ConversationOpenMessage and ConversationStatusChangedMessage
- session-orchestrator: include persisted workspaceId in conversationOpened/
conversationStatusChanged payloads
- transport-ws: forward workspaceId in WS broadcasts
- transport-http: POST /conversations/:id/open resolves workspaceId before emit
- FE handoff to 29ae: frontend-workspace-open-handoff.md
|