path: root/packages
Age  Commit message  Author
12 days  feat(todo): per-conversation task list tool + surface  Adam Malczewski
New standard tool extension with a single todo_write tool (opencode todowrite pattern: full-list replace, returns JSON, no business-rule enforcement — the description guides the model). Per-conversation in-memory state + per-conversation surface (rendererId: todo, scope: conversation) via subscriber-notify (message-queue pattern). Wave 0 (kernel contract): added conversationId?: string to ToolExecuteContext (additive, backward-compatible). Wired in dispatch.ts — the kernel already had it but wasn't passing it through to tools. Wave 1 (todo extension): pure core (validateTodos — shape only; getTodos/ setTodos/clearTodos; buildTodoSpec; formatTodoResult). Shell: createTodoWriteTool + surface provider. Tool description matches opencode's todowrite.txt depth (when-to-use, examples, task states). Priority field removed (bloats the tool with little value). 25 tests. Wave 2 (host-bin): registered todo in CORE_EXTENSIONS + dep + root tsconfig ref. Verified: tsc EXIT 0, 1123 vitest, biome clean (314 files). Boot smoke clean. FE handoff: frontend-todo-handoff.md.
12 days  feat(tool-web-search): Firecrawl-backed web search tool  Adam Malczewski
New standard tool extension with one tool web_search supporting 4 modes (search, scrape, crawl, map) against a self-hosted Firecrawl instance. Pure core: validateArgs (discriminated union by mode) + format* functions + truncateOutput. Injected edge: FirecrawlClient (injectable fetchFn/sleep/now, AbortSignal.any for per-request timeout + caller cancellation). concurrencySafe true, capabilities network. 38 tests, zero vi.mock. Live-verified: umans-glm-5.2 called web_search → real Firecrawl results (also the first live Umans API call).
12 days  feat(provider-umans): Umans AI Coding Plan provider + openai-stream lib  Adam Malczewski
Extract a generic @dispatch/openai-stream library from provider-openai-compat (convert-messages, convert-tools, parse-sse, listModels, stream, provider), parameterizing createOpenAICompatProvider with `id` + hook. Refactor provider-openai-compat to import from the lib (byte-identical behavior).
New @dispatch/provider-umans extension wraps the Umans OpenAI-compatible backend (https://api.code.umans.ai/v1). Self-contained: reads UMANS_API_KEY from env directly (no auth-apikey dep). transformBody maps reasoningEffort → reasoning_effort (capping xhigh/max → high). Dynamic listModels via GET /v1/models.
host-bin: registered provider-umans in CORE_EXTENSIONS + umans credential (gated on UMANS_API_KEY — the credential is the model-catalog index).
Verified: tsc EXIT 0, 1059 vitest, biome clean (293 files). Boot smoke: umans models appear in GET /models (7 models live).
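The effort mapping described above can be sketched as follows. This is only an illustration: the `transformBody` signature, the `capEffort` helper, and the body shape are assumptions, and the five-level ladder is taken from the `--effort` flag mentioned elsewhere in this log.

```typescript
// Effort ladder as listed by the CLI flag in this log (low..max).
type ReasoningEffort = "low" | "medium" | "high" | "xhigh" | "max";

// Hypothetical helper: collapse the two top rungs to "high".
function capEffort(effort: ReasoningEffort): "low" | "medium" | "high" {
  return effort === "xhigh" || effort === "max" ? "high" : effort;
}

// Assumed signature: takes the outgoing OpenAI-style body plus the requested
// effort and returns a new body (the input object is not mutated).
function transformBody(
  body: Record<string, unknown>,
  reasoningEffort?: ReasoningEffort,
): Record<string, unknown> {
  if (reasoningEffort === undefined) return body;
  return { ...body, reasoning_effort: capEffort(reasoningEffort) };
}
```

Returning a fresh object keeps the shared request-building path free of provider-specific mutation, which fits the "byte-identical behavior" goal stated for the refactor.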
13 days  feat(message-queue): per-conversation queue + steering injection  Adam Malczewski
A per-conversation message queue (new message-queue extension) holds user messages enqueued while a turn generates; delivered mid-turn as steering at the tool-result boundary (or carried to a new turn if no tool call fires).
- kernel: RunTurnInput.drainSteering callback (generic; kernel stays pure)
- wire 0.7.0->0.8.0: QueuedMessage, QueuePayload, TurnSteeringEvent (additive)
- transport-contract 0.11.0->0.12.0: POST /conversations/:id/queue + chat.queue WS op
- message-queue ext: queue state + per-conversation custom surface (rendererId message-queue)
- session-orchestrator: enqueue facade + drainSteering wiring + post-seal carry
- transport-http/ws: queue endpoint + chat.queue op (fixes WsClientMessage exhaustive switch)
- host-bin: register message-queue
1043 vitest + 199 transport bun pass; tsc/biome clean; boot smoke clean. FE courier: frontend-message-queue-handoff.md.
13 days  fix(history): harden loadSince sinceSeq lower bound (forgiving, like beforeSeq/limit)  Adam Malczewski
Coerce sinceSeq to a non-negative integer lower bound in loadSince (omitted/0/non-positive/non-integer/NaN/Infinity -> 0; valid as-is). The transport layer 400s these upstream, but loadSince stays total for direct callers. Byte-identical to the prior ?? 0 for the only values any caller ever passed. 58 bun tests pass.
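A minimal sketch of the coercion rule listed above, assuming a standalone helper; the name `coerceSinceSeq` and its signature are hypothetical and not taken from the repository.

```typescript
// Hypothetical helper: turn any input into a usable lower bound.
// A positive integer passes through unchanged; everything else
// (omitted, 0, negative, fractional, NaN, Infinity) becomes 0.
function coerceSinceSeq(sinceSeq?: number): number {
  if (sinceSeq !== undefined && Number.isInteger(sinceSeq) && sinceSeq > 0) {
    return sinceSeq;
  }
  return 0;
}
```

`Number.isInteger` already rejects NaN and both infinities, so one guard covers every case the commit lists.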
2026-06-12  feat(reasoning-effort): persisted per-conversation + per-turn override, threaded to providers  Adam Malczewski
- conversation-store: get/setReasoningEffort (own key space, mirrors cwd)
- session-orchestrator: resolveReasoningEffort (override -> stored -> 'high'), StartTurnInput.reasoningEffort, warm() parity (cache-safe)
- transport-http: /chat validation (400 on bad level) + GET/PUT /conversations/:id/reasoning-effort
- transport-ws: chat.send threading + validation
- cli: --effort <low|medium|high|xhigh|max>
993 vitest + 189 bun tests green; typecheck + biome clean.
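The resolution order (override -> stored -> 'high') and the level check behind the 400 response can be sketched like this. The function name `resolveReasoningEffort` appears in the commit; its signature here, and the `isReasoningEffort` guard, are assumptions.

```typescript
// Levels as listed by the --effort CLI flag.
const EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"] as const;
type ReasoningEffort = (typeof EFFORT_LEVELS)[number];

// Hypothetical guard a transport could use before answering 400 on a bad level.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return (
    typeof value === "string" &&
    (EFFORT_LEVELS as readonly string[]).indexOf(value) !== -1
  );
}

// Per-turn override wins, then the persisted per-conversation value,
// then the default 'high'.
function resolveReasoningEffort(
  override: ReasoningEffort | undefined,
  stored: ReasoningEffort | undefined,
): ReasoningEffort {
  return override ?? stored ?? "high";
}
```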
2026-06-12  feat(contracts): reasoning effort — ReasoningEffort ladder (low..max), ProviderStreamOptions/ChatRequest fields, per-conversation GET/PUT types  Adam Malczewski
wire 0.6.1->0.7.0, transport-contract 0.10.0->0.11.0. Additive only; typecheck+biome clean.
2026-06-12  feat(history): CR-5 windowed reads — ?limit= / ?beforeSeq= on GET ↵  Adam Malczewski
/conversations/:id Selection sinceSeq < seq < beforeSeq; newest-limit window, ascending; positive- integer validation (400, store never sees an invalid window); 1-based gap-free seq codified as the contractual has-older mechanism (no earliestSeq field). transport-contract 0.9.0->0.10.0, wire 0.6.0->0.6.1 (doc-only). conversation-store +8 tests, transport-http +20; 935 vitest + 112 bun green. Live-verified: 6/6 probe checks OK. FE courier: frontend-history-windowing-handoff.md
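The selection rule above (sinceSeq < seq < beforeSeq, newest-limit window, ascending) might look roughly like this on an in-memory array. `selectWindow`, `hasOlder`, and the option shape are hypothetical names, and the sketch assumes `limit` has already passed the positive-integer validation described in the commit.

```typescript
interface Sequenced {
  seq: number;
}

interface WindowOptions {
  sinceSeq?: number; // exclusive lower bound
  beforeSeq?: number; // exclusive upper bound
  limit?: number; // assumed already validated as a positive integer
}

function selectWindow<T extends Sequenced>(
  chunks: readonly T[],
  opts: WindowOptions = {},
): T[] {
  const since = opts.sinceSeq ?? 0;
  const before = opts.beforeSeq ?? Number.POSITIVE_INFINITY;
  const inRange = chunks
    .filter((c) => c.seq > since && c.seq < before)
    .sort((a, b) => a.seq - b.seq);
  // Keep the newest `limit` entries while preserving ascending order.
  return opts.limit === undefined ? inRange : inRange.slice(-opts.limit);
}

// With 1-based, gap-free seq values and no sinceSeq, a first seq above 1
// implies that older chunks exist, so no earliestSeq field is needed.
function hasOlder(window: readonly Sequenced[]): boolean {
  const first = window[0];
  return first !== undefined && first.seq > 1;
}
```

A client paging backwards would pass the lowest seq it holds as the next `beforeSeq` and stop once `hasOlder` turns false.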
2026-06-12  feat(cache-warming): lifecycle CR-4 — default-off, fresh nextWarmAt, ↵  Adam Malczewski
conversation close (+CR-1 table, CR-2 scope) CR-4a: warming defaults OFF (opt-in per conversation); re-enabling restores the persisted interval. CR-4b: re-arm BEFORE surface notify so post-warm updates carry the FUTURE nextWarmAt; turnSettled/turnStarted now also push (fresh schedule after seal, null while generating). CR-4c: POST /conversations/:id/close — per-turn AbortController wired to the kernel runTurn signal (partial persist + normal seal, done.reason "aborted"), new conversationClosed hook, cache-warming disables sync + persists OFF. Disconnect/chat.unsubscribe semantics unchanged. CR-4d: no change needed — initial surface echo already at HEAD (stale up2 boot on the FE probe). CR-1: loaded-extensions emits a single custom rendererId:"table" field (TablePayload exported; Name|Version|Trust|Activation, all trust tiers). CR-2: SurfaceCatalogEntry.scope?: "global"|"conversation" on both surfaces. Contracts: ui-contract 0.1.0→0.2.0, transport-contract 0.8.0→0.9.0 (additive). 907 tests pass (+13); live-verified against bin/up (warms @5s with future nextWarmAt; mid-turn close → abortedTurn:true + done.reason aborted). Courier: frontend-cache-warming-lifecycle-handoff.md.
2026-06-12  fix(turns): emit user prompt on the turn event stream (CR-3)  Adam Malczewski
A pure watcher (subscribed but not the sender) couldn't see the user prompt until the turn sealed: the user message was only persisted at seal and never entered the live/replayable stream. Add an additive TurnInputEvent {type:"user-message", conversationId, turnId, text} to the AgentEvent union and emit it via the broadcast/buffer path as the first event of every turn, so it is replayed to all subscribers (live + late-join) and on the HTTP path. Persistence and metrics unchanged; the union widening breaks no exhaustive switch. - @dispatch/wire 0.5.0->0.6.0; @dispatch/transport-contract 0.7.0->0.8.0 (re-export) - session-orchestrator: emit user-message at runTurnDetached start; +3 tests, 3 Wave-1 tests updated (user-message precedes turn-start) - FE courier: frontend-cr3-user-message-handoff.md Live-verified vs flash: watcher receives user-message (correct text) as its first chat.delta before turn-sealed. 894 vitest + transport bun green; tsc -b EXIT 0.
2026-06-12  feat(turns): detached turns + multi-client live view  Adam Malczewski
A turn no longer dies when its WebSocket connection closes. The turn-broadcast hub moves into the core (session-orchestrator): turns run detached, persist at seal regardless of clients, and fan out AgentEvents to N subscribers per conversation with in-flight buffer replay for late-joiners. transport-ws stops aborting turns on socket close and gains chat.subscribe/chat.unsubscribe so a second device (or a reloaded browser) can watch a running turn. - @dispatch/transport-contract 0.6.0->0.7.0: chat.subscribe/chat.unsubscribe WS ops - session-orchestrator: startTurn/subscribe/isActive; persistent subscribers + per-turn buffer (two-map model); handleMessage = convenience wrapper (no signal) - transport-ws: per-connection chat-subscription fan-out; no turn-abort-on-close - transport-http: test fakes updated for the widened interface (runtime unchanged) - design notes/turn-continuity-design.md; FE courier frontend-turn-continuity-handoff.md Live-verified vs flash (2-client WS): sender disconnect mid-turn -> other client streams to done + turn persists; late-join replays turn from turn-start. 891 vitest + transport bun green; tsc -b EXIT 0; biome clean.
2026-06-12  feat(metrics): expose current context size to the frontend  Adam Malczewski
contextSize = the turn's FINAL step inputTokens+outputTokens (true context occupancy; NOT the aggregate usage, which sums per-step prompts and overcounts multi-step turns). Stamped on both the live done event (kernel) and persisted TurnMetrics (session-orchestrator); a client reads the latest turn's value.
- @dispatch/wire 0.4.0->0.5.0: optional contextSize on TurnDoneEvent + TurnMetrics
- @dispatch/transport-contract 0.5.0->0.6.0 (re-export only)
- glossary: context size (reserve 'context window' for the model limit, later)
- FE courier: frontend-context-size-handoff.md
881 vitest pass; tsc -b EXIT 0; biome clean.
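To make the difference between context size and aggregate usage concrete, here is a small sketch. The `StepUsage` shape and both function names are assumptions for illustration only.

```typescript
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

// Context size: only the FINAL step counts, because its prompt already
// contains everything produced by the earlier steps.
function contextSize(steps: readonly StepUsage[]): number | undefined {
  const last = steps[steps.length - 1];
  return last === undefined ? undefined : last.inputTokens + last.outputTokens;
}

// Aggregate usage: sums every step, so repeated prompts are counted again.
function aggregateTokens(steps: readonly StepUsage[]): number {
  return steps.reduce((sum, s) => sum + s.inputTokens + s.outputTokens, 0);
}
```

For a two-step turn with usage (1000 in, 50 out) and then (1100 in, 40 out), the context size is 1140 while the aggregate is 2190, which is the overcount the commit describes.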
2026-06-11  feat(lsp,cwd): LSP integration + per-conversation cwd; fix cache-warming ↵  Adam Malczewski
cache bust LSP + per-conversation CWD feature: - new bundled `lsp` extension: hand-rolled JSON-RPC codec (framing/rpc), lazy one-server-per-(serverID,root), per-cwd config resolution, on-demand `lsp` tool - `conversation-store`: getCwd/setCwd (cwdKey); `session-orchestrator` defaults a turn's cwd from the store - `transport-http`: cwd + lsp status endpoints; wire types in transport-contract - host-bin: register lsp; config wiring Cache-warming fix (the warm read 0% on the first reheat after a message): - warm assembled tools under a different cwd than the real turn (a reheat sends no cwd, and the warm service had no store fallback). The skills filter rewrites the cwd-sensitive `load_skill` description, so the tools block — the first bytes of the prompt-cache prefix — diverged and the cache missed entirely. Warm now resolves cwd as opts.cwd ?? conversationStore.getCwd(), mirroring handleMessage. - capture warm sends as `provider.request` spans flagged `warm:true` (thread a child logger into providerOpts) so warm vs real bodies are diffable (obs §3.1). - kernel logger: span-close now merges child-bound attrs like span-open, so a `warm:true` query finds the closed span (with usage/status), not just the open. Tests: warm forwards a warm-flagged logger; warm falls back to stored cwd; logger open/close attr consistency. Full suite green (873).
2026-06-11  feat(cache-warming): CR-3 — manual warm resets timer + ↵  Adam Malczewski
nextWarmAt/lastWarmAt surface FE CR-3 (backend-handoff-cache-warming-timer.md). The inversion: session-orchestrator's warm() (the single chokepoint for manual /chat/warm AND the automatic timer) emits a warmCompleted bus event; cache-warming subscribes and does ALL post-warm handling. So a manual warm now re-arms the timer + refreshes the surface with NO transport-http change (core can't depend on the standard cache-warming ext). - session-orchestrator: warmCompleted event hook + emit from warm() on success - cache-warming: warmCompleted subscriber unifies result handling (manual + automatic); adds nextWarmAt/lastWarmAt state + a custom 'cache-warming-timer' surface field - fix: createWarmService was missing the emit dep (deps.emit?. silently no-oped) → wired it + made emit REQUIRED so it can't regress Live-verified vs claude haiku: manual POST /chat/warm now logs cache-warming 'warm complete' ~2s after the turn (not the 4-min timer) → manual warm reaches the warmer. 800 vitest + 109 bun green; tsc -b 0; biome clean.
2026-06-11  fix(cache-warming): accurate cache rate + expectedCacheRate (retention) metric  Adam Malczewski
The Claude cache % read 100% whenever anything was cached, because the metric's denominator (inputTokens) excluded cached tokens on Anthropic. Fixed upstream in ../claude/provider-anthropic (inputTokens = total prompt); this commit adds the companion retention metric and exposes it: - transport-contract: WarmResponse += expectedCacheRate - transport-http: POST /chat/warm returns expectedCacheRate = cacheRead/(cacheRead+cacheWrite) - cache-warming: computeExpectedCacheRate + a per-conversation 'cache retention' surface stat - handoff: documents the fix + cache-rate vs expected-cache (cross-turn) for the FE Live-verified vs claude haiku: real turn cache rate 61% (was inflated 100%); warm within TTL expectedCacheRate=100%, after expiry=0%.
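The retention formula quoted above, expectedCacheRate = cacheRead/(cacheRead+cacheWrite), can be sketched as a pure function. The name `computeExpectedCacheRate` comes from the commit; the signature, the choice to return a fraction rather than a percentage, and the zero-denominator behaviour are assumptions.

```typescript
// Assumed signature: returns a fraction in [0, 1].
function computeExpectedCacheRate(cacheRead: number, cacheWrite: number): number {
  const total = cacheRead + cacheWrite;
  // Assumption: with no cache traffic at all, report 0 instead of NaN.
  return total > 0 ? cacheRead / total : 0;
}
```

A warm inside the TTL reads everything and writes nothing, which gives 1 (the 100% case in the commit); after expiry the prefix is rewritten from scratch, which gives 0.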
2026-06-11  feat(surfaces): NumberField + per-conversation surface scoping; ↵  Adam Malczewski
cache-warming controls Extend the surface framework so cache-warming exposes per-conversation controls: - ui-contract: add NumberField (settable free-value numeric) to SurfaceField; add optional conversationId to subscribe/unsubscribe/invoke + surface/update - surface-registry: SurfaceContext { conversationId? } on getSpec/invoke (backward-compatible) - transport-ws: thread conversationId; key subscriptions by (surfaceId, conversationId); tag surface/update replies with conversationId - cache-warming: per-conversation surface — Toggle(enabled) + Number(interval seconds, cache-warming/set-interval) + Stat(last cache %); drop the currentConversationId closure Global surfaces (surface-loaded-extensions) unchanged. 784 vitest + 109 bun = 893 tests; tsc -b EXIT 0; biome clean.
2026-06-11  feat(cache-warming): manual POST /chat/warm trigger endpoint  Adam Malczewski
A frontend 'warm now' button (and fast tests) can trigger a warm on demand instead of waiting for the automatic timer.
- transport-contract: WarmRequest / WarmResponse wire types
- transport-http: POST /chat/warm → cacheWarmHandle.warm(); 200 with cachePct, 409 when the conversation is generating, 400 on missing conversationId
Live-verified vs claude haiku: seed turn cacheWrite=6799 → POST /chat/warm returns cacheReadTokens=6799 cachePct=100 (100% hit). 760 vitest + 109 bun green.
2026-06-11  feat(cache-warming): per-conversation prompt-cache warming + warm() service  Adam Malczewski
Backend-driven warming targeting whatever provider a conversation uses (incl. the external Claude provider-anthropic). Core engine + on/off + last-cache-% done; interval-as-view-control pending a ui-contract NumberField (surface-system gap). Mechanism: - kernel: expose HostAPI.emit (typed bus event emit; counterpart of on) - session-orchestrator: turnStarted/turnSettled event hooks (conversationId/cwd/model); warm() service (cacheWarmHandle) reusing the real-turn assembly (byte-identical prefix, provider-agnostic), refuses mid-turn, never persists/emits, returns Usage - cache-warming (new ext): per-conversation timers (arm on settle, cancel on start, in-flight invalidation), calls warm(), pct=round(clamp(cacheRead/input,0,1)*100), persists {enabled,intervalMs} (default on/240s), registers a controls surface - host-bin: register cache-warming; transport-http: HostAPI stub +emit (fan-out) Honors old-code invariants. 760 vitest + 109 bun = 869 tests; tsc -b EXIT 0; biome clean.
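The percentage formula in the mechanism list, pct=round(clamp(cacheRead/input,0,1)*100), can be written out as below. The function names and the guard for a non-positive input count are assumptions.

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// Hypothetical helper mirroring pct = round(clamp(cacheRead / input, 0, 1) * 100).
function cachePct(cacheReadTokens: number, inputTokens: number): number {
  // Assumption: treat an empty prompt as a 0% hit rather than dividing by zero.
  if (inputTokens <= 0) return 0;
  return Math.round(clamp(cacheReadTokens / inputTokens, 0, 1) * 100);
}
```

The clamp matters when a provider reports cached tokens outside its input count, which a later entry in this log notes was the case for Anthropic before the denominator fix.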
2026-06-10  feat(skills): skill system + load_skill tool via per-turn tools filter  Adam Malczewski
Skills are markdown in .skills/ dirs (~/.skills + <cwd>/.skills, cwd shadows home; name = filename). Format: line1 summary, line2 ---, body line3+; load strips the first two lines; malformed = no summary but still loadable. Mechanism (first use of the context-assembly filter chain, §3.2): - kernel: expose HostAPI.applyFilters (delegates to bus.applyFilters) - session-orchestrator: define/export toolsFilter + ToolAssembly; apply once per turn before runTurn (cache-stable across steps), threading cwd + conversationId - skills (new ext): pure parse/merge/render + load_skill tool (live read, path-contained) + a toolsFilter filter rewriting load_skill's description + name enum per cwd - host-bin: register skills in CORE_EXTENSIONS - transport-http: fix HostAPI test stub for the new applyFilters method (fan-out) 734 vitest + 109 bun = 843 tests; tsc -b EXIT 0; biome clean; clean live boot.
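A rough sketch of the skill file format described above (line 1 summary, line 2 `---`, body from line 3). `parseSkill` and `ParsedSkill` are hypothetical names, and returning the whole text as the body of a malformed file is an assumption; the commit only says such files have no summary but stay loadable.

```typescript
interface ParsedSkill {
  summary: string | undefined;
  body: string;
}

function parseSkill(text: string): ParsedSkill {
  const lines = text.split("\n");
  const summaryLine = lines[0];
  const separator = lines[1];
  if (
    summaryLine !== undefined &&
    separator !== undefined &&
    separator.trim() === "---"
  ) {
    // Well-formed: strip the first two lines and keep the rest as the body.
    return { summary: summaryLine.trim(), body: lines.slice(2).join("\n") };
  }
  // Malformed: no summary, but the content is still loadable (assumed: as-is).
  return { summary: undefined, body: text };
}
```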
2026-06-10  feat(tools): add run_shell, edit_file, write_file + read_file directory listing  Adam Malczewski
Four standard-tier tool extensions (one tool per extension, zero ABI change):
- tool-read-file: read_file now lists directory contents (sorted, /-suffixed subdirs)
- tool-shell: run_shell (foreground, streamed, cancellable, cwd, timeout + output cap)
- tool-edit-file: edit_file (oldString/newString/replaceAll; errors on absent/non-unique)
- tool-write-file: write_file (explicit overwrite flag)
Registered in host-bin CORE_EXTENSIONS. Live boot clean (shell capability accepted). 686 vitest + 89 bun = 775 tests; tsc -b EXIT 0; biome clean.
2026-06-10  trace-store: fix old-schema migration crash (found by live boot)  Adam Malczewski
Wave 1 created idx_records_bodyHash BEFORE migrateOldBodies ran, so opening a pre-existing old-schema traces.db crashed the collector with 'no such column: bodyHash' (crash-looped 168x in ~20s). Fresh DBs hid it (CREATE TABLE already has bodyHash); only a real old-schema DB exposed it. - reorder schema(): migrateOldBodies (ALTER ADD bodyHash + content-address backfill + drop old bodies) runs BEFORE the bodyHash index. - add 3 regression tests that seed a real old-schema DB and open it. Live-verified: old-schema traces.db migrates on boot with 0 crashes; 318 body refs collapse to 270 content-addressed bodies; prune cadence fires cleanly. typecheck EXIT 0; biome clean; bun 106->109, 0 fail.
2026-06-10  observability-collector: drive trace-store prune on a cadence  Adam Malczewski
Wave 2 (final) of the dedup/storage-growth milestone (notes §12).
- pure shouldPrune(now,lastPruneAt,intervalMs) cadence helper (injected clock).
- main.ts calls store.prune(DEFAULT_RETENTION) on a coarse cadence (--prune-interval-ms, default 60s; host-bin-overridable), far less frequent than a drain. Prune errors are logged and never stop the tail loop.
- confirmed body inserts flow through trace-store's content-addressed path.
- glossary: content-addressed body, trace retention, prefix fingerprint, warm vs real.
typecheck EXIT 0; biome clean; vitest 576; bun 100->106, 0 fail.
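A pure cadence helper with an injected clock, as named above, could be as small as this. The parameter order follows the commit's `shouldPrune(now,lastPruneAt,intervalMs)`; the types and the treatment of a never-pruned store are assumptions.

```typescript
// `now` is passed in by the caller, so the helper needs no real clock.
function shouldPrune(
  now: number,
  lastPruneAt: number | undefined,
  intervalMs: number,
): boolean {
  // Assumption: a store that has never been pruned is due immediately.
  return lastPruneAt === undefined || now - lastPruneAt >= intervalMs;
}
```

Because the function never reads a clock itself, a test can step through time with plain numbers, for example checking the 60s default interval without waiting.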
2026-06-10  trace-store: content-addressed body dedup + retention/prune  Adam Malczewski
Wave 1 of the dedup/storage-growth milestone (notes §12).
- bodies table is now content-addressed (SHA-256 hash key); identical verbatim bodies (cache-warming resends, any repeat) collapse to one stored row, referenced by hash from records. Transparent to insert/read callers.
- at-rest gzip compression for bodies >1 KiB (node:zlib), decompressed on read.
- prune(policy): age-based delete + drop-oldest byte-cap eviction + orphan-body GC. Exports RetentionPolicy/PruneSummary/DEFAULT_RETENTION (7d / 256 MiB).
typecheck EXIT 0; biome clean; vitest 576; bun 89->100, 0 fail.
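Content addressing as described above can be illustrated with an in-memory map keyed by the SHA-256 digest of the body. This is a sketch only: `BodyStore` and its methods are hypothetical, and it leaves out the SQLite table, the gzip step, and pruning.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory stand-in for a content-addressed bodies table.
class BodyStore {
  private readonly bodies = new Map<string, string>();

  // Store a body and return its hash; identical bodies share one entry.
  put(body: string): string {
    const hash = createHash("sha256").update(body).digest("hex");
    if (!this.bodies.has(hash)) this.bodies.set(hash, body);
    return hash;
  }

  get(hash: string): string | undefined {
    return this.bodies.get(hash);
  }

  count(): number {
    return this.bodies.size;
  }
}
```

Records would then keep only the returned hash, so a resent body, such as a cache-warming repeat, adds a reference instead of a second copy.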
2026-06-10  feat: per-model throughput (tok/s) tracking + metrics endpoint  Adam Malczewski
New throughput-store extension records one token-weighted sample per turn (model, output tokens, pure generation time = Σ step genTotalMs) into a day-bucketed KV store, and aggregates per-model tok/s = Σtokens / Σgen-seconds over a day/week/month (server-local boundaries; week = ISO Mon–Sun). transport-http records a sample per turn (logged) and serves GET /metrics/throughput?period=day|week|month&date=<...>. The response is typed as transport-contract's ThroughputResponse, so store/wire drift is a compile error. Pure period + aggregate logic fully unit-tested.
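The token-weighted aggregate above (tok/s = Σtokens / Σgen-seconds per model) differs from averaging per-turn rates, which the following sketch tries to show. The sample shape and function name are assumptions, and the day/week/month bucketing is omitted.

```typescript
interface ThroughputSample {
  model: string;
  outputTokens: number;
  genTotalMs: number; // pure generation time for the turn
}

function tokensPerSecondByModel(
  samples: readonly ThroughputSample[],
): Record<string, number> {
  const totals: Record<string, { tokens: number; ms: number }> = {};
  for (const s of samples) {
    const t = totals[s.model] ?? { tokens: 0, ms: 0 };
    t.tokens += s.outputTokens;
    t.ms += s.genTotalMs;
    totals[s.model] = t;
  }
  const rates: Record<string, number> = {};
  for (const model of Object.keys(totals)) {
    const t = totals[model];
    // Skip models with no measured generation time to avoid dividing by zero.
    if (t !== undefined && t.ms > 0) rates[model] = t.tokens / (t.ms / 1000);
  }
  return rates;
}
```

With one turn of 100 tokens in 1 s and another of 900 tokens in 3 s, the weighted result is 1000 / 4 = 250 tok/s, whereas a plain mean of the two per-turn rates (100 and 300) would give 200.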
2026-06-10  kernel/run-turn: thread providerOpts (model) into provider.stream  Adam Malczewski
executeStep built the stream opts with only the logger, so providerOpts.model (the selected model) never reached any provider — each fell back to its own default. Carry providerOpts through StepContext into the per-step stream opts, plus a regression test asserting the model is forwarded.
2026-06-10  host-bin: external-extension loader + claude credential wiring  Adam Malczewski
Add loadExternalExtensions(): fault-isolated dynamic import of out-of-repo extensions declared via DISPATCH_EXTERNAL_EXTENSIONS. main.ts assembles the credential-store in boot() so a 'claude' credential is registered when an external anthropic provider is loaded; config.ts surfaces the anthropic model / credential-key settings those extensions read.
2026-06-10  feat(conversation-store): reconcile.repair span (logging-audit #1)  Adam Malczewski
Load-time history repair was invisible (createConversationStore got no logger). Now: optional logger injected (extension passes host.logger); reconcile logic moved into pure reconcileWithReport() returning a ReconcileReport (reconcile() stays a thin byte-identical wrapper); load() emits a reconcile.repair span (childed with conversationId, flat attrs repairedCount/firstRepairedToolCallId) ONLY when a real repair occurs. No contract fan-out (factory is package-internal). typecheck EXIT 0, biome clean, 550 vitest (+4) + 89 bun.
2026-06-10  feat(metrics): durable per-turn/step token+timing metrics (observability ↵  Adam Malczewski
spans + persisted replay) Two-part token-data improvement: #2 Observability spans (kernel run-turn): turn & step span-close now stamp ALL four Usage fields — added usage.cacheReadTokens/cacheWriteTokens (were silently dropped) and normalized usage_* -> usage.* to match the provider.request span (consistent D9 GROUP BY). No contract change. #3 Persisted replay metrics (conversation-store + read endpoint): new StepMetrics/TurnMetrics wire types; conversation-store persists per-turn metrics in a separate key space (appendMetrics/loadMetrics, turn-append order); session-orchestrator accumulates per-step+turn metrics from the event stream (pure metrics.ts) and persists after seal; transport-http serves GET /conversations/:id/metrics -> ConversationMetricsResponse. Contracts: @dispatch/wire + @dispatch/transport-contract bumped 0.3.0->0.4.0 (additive). GLOSSARY: turn metrics / step metrics. typecheck EXIT 0, biome clean, 546 vitest + 89 bun = 635 tests.
2026-06-07  feat(wire,kernel,session-orchestrator): live turn metrics on the stream  Adam Malczewski
Expose the backend's authoritative token+timing metrics on the live AgentEvent stream (observability-only -> now also client-facing). All additive/optional. - [email protected]: new TurnStepCompleteEvent (type:step-complete) with per-step ttftMs/decodeMs/genTotalMs; usage += stepId; tool-result += durationMs (exec); done += durationMs (turn wall-clock) + usage (turn total). RunTurnInput += now?. [email protected] (re-export bump). - kernel-runtime: when now injected, measures + emits the above (reuses the ttft/decode first-token detection); omits timing gracefully without a clock. - session-orchestrator: adds now? to deps, threads into RunTurnInput; extension activate injects () => Date.now(). - transport/cli/host-bin: untouched (verbatim pass-through; additive fields). FE handoff: frontend-metrics-handoff.md. typecheck clean; 520 vitest + 89 bun; biome 0/0. Replay/persistence = deferred Pass 2 (documented in tasks.md).
2026-06-07  feat(kernel-runtime): per-step TTFT + decode timing spans (observability)  Adam Malczewski
Split each step's generation into a ttft span (stream start -> first text|reasoning token) and a decode span (first token -> stream end), children of the step span. decode = generation total - TTFT; both retrievable from the trace-store. First token counts reasoning deltas; a step with no content token ends ttft with firstToken:false (no misleading decode). Span-based (no clock injection), no wire/contract change. +3 runtime tests. GLOSSARY: TTFT + decode time. typecheck clean; 512 vitest; biome 0/0.
2026-06-07  feat(wire,kernel,conversation-store): step grouping via stepId for batched ↵  Adam Malczewski
tool calls Expose a per-step grouping key so a client can render a model's batched/parallel tool calls (those emitted in one step) as one unit, on both the live stream and replayed history. Key = branded StepId, derived turnId#stepIndex (0-based). - [email protected]: required stepId on Turn{Tool,ToolResult}Event; optional stepId on Tool{Call,Result}Chunk (generation provenance on the chunk, not the StoredChunk envelope — StoredChunk unchanged). [email protected] (re-export bump). - kernel-runtime: mint stepId per step; stamp on tool chunks + tool events. - conversation-store: chunk-carried stepId round-trips append/load/loadSince for free; reconcile copies it onto synthesized (interrupted) results. - cli: stepId added to event test fixtures (renderer unchanged). typecheck clean; 509 vitest + 89 bun; biome 0/0. FE courier reply + reference snapshots regenerated in ../dispatch-web.
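The derived key (turnId#stepIndex, 0-based) and the client-side grouping it enables can be sketched as follows. The branding technique, `mintStepId`, `groupByStep`, and the `ToolEvent` shape are illustrative assumptions, not the package's actual definitions.

```typescript
// Branded string types: plain strings at runtime, distinct at compile time.
type TurnId = string & { readonly __brand: "TurnId" };
type StepId = string & { readonly __brand: "StepId" };

function mintStepId(turnId: TurnId, stepIndex: number): StepId {
  return `${turnId}#${stepIndex}` as StepId;
}

interface ToolEvent {
  stepId: StepId;
  toolName: string;
}

// Group tool events so calls emitted in one step can render as one unit.
function groupByStep(events: readonly ToolEvent[]): Record<string, ToolEvent[]> {
  const groups: Record<string, ToolEvent[]> = {};
  for (const e of events) {
    const bucket = groups[e.stepId] ?? [];
    bucket.push(e);
    groups[e.stepId] = bucket;
  }
  return groups;
}
```

Since the key is derived rather than random, the live stream and replayed history produce the same grouping for the same turn.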
2026-06-06  feat(transport-http): wildcard CORS + bump contract pkgs to 0.1.0 (FE Slice ↵  Adam Malczewski
2 handoff) Unblock the browser frontend (Vite origin :24204 -> HTTP backend :24203): - transport-http: wildcard CORS via hono/cors on all routes (Access-Control-Allow-Origin: *, Allow-Methods GET/POST/OPTIONS, Allow-Headers Content-Type) + OPTIONS preflight (204). Headers present on the streamed POST /chat NDJSON response too. +4 app.fetch tests. - wire / transport-contract / ui-contract: 0.0.0 -> 0.1.0 as the FE-consumable baseline (semver convention §2.9: major = cross-repo fan-out signal). Verified live: OPTIONS /chat -> 204 with CORS headers; GET /models -> 200 with Access-Control-Allow-Origin: *. typecheck clean, 502 vitest + 89 bun, biome clean.
2026-06-06  refactor(transport-http,host-bin): transport-http owns its Bun.serve (fix ↵  Adam Malczewski
log scope) Make transport-http a full-fidelity extension that runs its own Bun.serve inside activate(host) — symmetric with transport-ws. The Hono app is now built with the extension-scoped host, so all HTTP edge logs are correctly attributed extensionId=transport-http instead of the host-bin __host__ scope (verified live in the journal). - transport-http: createTransportHttpExtension() factory; activate builds the app + Bun.serve, reads host.config httpPort (?? 24203); deactivate stops it. - host-bin: drops the HTTP Bun.serve + createServer call; config.ts maps BACKEND_PORT/PORT -> httpPort. host-bin now serves no transport (both transports self-serve); boot log -> 'Dispatch booted'. - +5 bun lifecycle tests wired into test:bun. No contract change (composition wiring). Verified live: HTTP serves on :24203; journal edge logs now scoped transport-http. typecheck clean, 498 vitest + 89 bun, biome clean.
2026-06-06  feat(transport-http,transport-ws): structured edge logging (close coverage ↵  Adam Malczewski
gap #2) Both HTTP + WS transport edges now emit structured logs via the injected logger (D7-compliant: no per-AgentEvent/chat.delta frame logging). Verified live — the journal contains the edge records. - transport-ws: connection open/close (debug), chat.send accepted (info), surface-op + malformed-chat.send (warn), abort-on-close (debug). +4 bun tests. Correctly scoped extensionId=transport-ws (owns its Bun.serve). - transport-http: /chat accepted (info) / 400 (warn) / turn-failure (error), GET /conversations read (info), /models + store failure (error). +4 vitest. Known follow-up: transport-http edge logs are attributed to '__host__' (not 'transport-http') because host-bin runs the HTTP server via createServer(getHostAPI()) rather than the extension owning its Bun.serve. Logs are captured + correlated; only the per-extension filter is mis-scoped. Tracked in tasks.md. typecheck clean, 498 vitest + 84 bun, biome clean.
2026-06-06  feat(kernel-runtime,session-orchestrator): emit turn lifecycle events  Adam Malczewski
Close a gap found live: neither transport emitted turn-start/done/turn-sealed (the wire defined them; nothing fired them). turn-sealed is the FE's cache-commit signal (frontend-design §6.3); done ends the stream. - kernel-runtime: runTurn emits turn-start first and done (with finishReason) last, on every exit path (stop/tool-calls/max-steps/error/aborted). - session-orchestrator: emits turn-sealed after conversationStore.append succeeds (the kernel touches no DB, so the post-persist seal is the orchestrator's). Not emitted if append throws. No contract change (all three wire types already existed). Verified live: HTTP /chat and WS chat both stream turn-start … done turn-sealed. typecheck clean, 494 vitest + 80 bun, biome clean.
2026-06-06  feat(transport-ws,transport-contract): multiplex chat ops onto the surface WS  Adam Malczewski
Add chat WS ops (chat.send / chat.delta / chat.error) + unified WsClientMessage/WsServerMessage unions to @dispatch/transport-contract (imports ui-contract; surface protocol unchanged — additive non-colliding type variants, no channel wrapper). transport-ws drives sessionOrchestrator.handleMessage, streaming each AgentEvent as chat.delta over the same connection that carries surface ops; per-connection AbortController cancels in-flight turns on socket close; error-isolated. Verified live: one WS connection delivered the surface catalog AND a real flash chat turn (chat.delta stream, reply 'Hello my friend'). Completes the FE Slice 2 backend prereqs. typecheck clean, 485 vitest + 80 bun, biome clean. Discovered (separate, pre-existing): runtime does not emit turn-start/done/turn-sealed on either transport — needed for FE cache-commit; tracked in tasks.md.
2026-06-06  feat(transport-http): GET /conversations/:id?sinceSeq= read-side history ↵  Adam Malczewski
endpoint Incremental rehydration endpoint for long-lived clients. Returns ConversationHistoryResponse { chunks: StoredChunk[], latestSeq } — the RAW, append-order, seq-filtered slice from conversation-store.loadSince, NOT reconciled (reconcile conflicts with the per-chunk seq cursor, so it stays on the turn path; the read path is a pure sync primitive). - transport-contract: add ConversationHistoryResponse + StoredChunk re-export. - transport-http: GET /conversations/:id route reaching the log directly via conversationStoreHandle (dependsOn conversation-store); pure parseSinceSeq (absent->0, invalid->400). - build wiring: conversation-store dep + project ref. FE Slice 2 backend prereq (read-side). typecheck clean, 481 vitest, biome clean.
2026-06-06  feat(wire,conversation-store): per-chunk seq sync cursor (StoredChunk)  Adam Malczewski
Add StoredChunk { seq, role, chunk } to @dispatch/wire (re-exported via the kernel contract shims). Keeps Chunk pure (provider-facing, no cursor); the sync cursor lives only on the envelope. conversation-store: rekey conv:<id>:msg:<seq> -> conv:<id>:chunk:<seq>; append explodes messages into role-tagged seq'd chunks (1-based, gap-free, monotonic) with internal boundary metadata so load() round-trips ChatMessage[] losslessly and still reconciles; new loadSince(id, sinceSeq?) raw sync stream. session-orchestrator test fake conforms to the widened interface. FE Slice 2 backend prereq (per-chunk seq). typecheck clean, 469 vitest, biome clean.
2026-06-06  feat(frontend,wire): surface system (FE slice 1) + @dispatch/wire types-only ↵  Adam Malczewski
split (B2) FE slice 1 — backend-declared, frontend-agnostic surface system (verified live): new types-only @dispatch/ui-contract (SurfaceSpec / field kinds / region / ActionRef / catalog), surface-registry (typed service handle), transport-ws (Bun WS :24205, path-agnostic upgrade), surface-loaded-extensions (first real surface); kernel HostAPI.getExtensions; host-bin wiring; bin/up. Harness: retire AGENTS 'backend only', ORCHESTRATOR §3/§7/§8, frontend-design.md locked. B2 — wire-types split (chat-slice prerequisite): new types-only @dispatch/wire single-sources the wire ABI (AgentEvent + 11 variants; conversation model Chunk/ChatMessage/Role/TurnId/StepId + 6 chunk variants; Usage) with zero @dispatch/* deps. @dispatch/kernel re-exports via shims so its public surface is byte-identical (zero consumer blast radius). transport-contract re-exports AgentEvent from @dispatch/wire and drops its @dispatch/kernel dependency, so HTTP clients (the web frontend) consume the wire without the kernel runtime. tsc -b + biome clean; 460 vitest + 77 bun pass.
2026-06-05 feat(cli): one-shot terminal client (models, chat, --text/--file/--cwd/--conversation) Adam Malczewski
HTTP client of transport-contract; pure-core arg/render/ndjson + injected fetch/fs shell. Docs: GLOSSARY (credential/key/model name/model catalog), tasks.md milestone, ORCHESTRATOR geography.
2026-06-05 feat(backend): credential-store + model selection/catalog (GET /models) + per-turn cwd through orchestrator/transport/host-bin Adam Malczewski
2026-06-05 feat(kernel): listModels/ModelInfo + per-turn cwd contracts; add transport-contract wire package Adam Malczewski
2026-06-05 feat(observability): map DeepSeek nested cache tokens (prompt_tokens_details.cached_tokens) -> Usage.cacheReadTokens Adam Malczewski
The real flash fixture showed that flash reports cache usage in the NESTED prompt_tokens_details.cached_tokens form (384 cached of 665 prompt); the parser only mapped the flat cache_read_tokens form, so cache tokens never surfaced. Now: cacheReadTokens = usage.cache_read_tokens ?? usage.prompt_tokens_details?.cached_tokens (flat wins; cacheWriteTokens flat-only, never fabricated; partial/null *_details safe). No kernel contract change (Usage already has the fields). +5 parser tests + a real-fixture regression (cacheReadTokens === 384).
These counts (+ a future prefix.fingerprint) are the cheap signals for body de-duplication. The broader trace-body storage-growth concern (verbatim body stored per request -> ~O(N^2) for long conversations) is logged DEFERRED in tasks.md; mitigation already designed (D5 volume control + §6 retention/rotation), not yet built.
339 tests, typecheck + biome 0/0.
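The fallback expression quoted in this commit can be isolated as a small pure function. The RawUsage type and the function name below are hypothetical; only the `??` chain itself comes from the commit message, and the trailing `?? undefined` is an added normalization so that a null nested value does not leak through.

```typescript
// Assumed shape of the provider's raw usage payload (only the fields used here).
type RawUsage = {
  cache_read_tokens?: number | null;
  prompt_tokens_details?: { cached_tokens?: number | null } | null;
};

// Flat form wins; the nested DeepSeek form is the fallback.
// `??` skips only null/undefined, so a flat value of 0 is still respected.
function mapCacheReadTokens(usage: RawUsage): number | undefined {
  return (
    usage.cache_read_tokens ??
    usage.prompt_tokens_details?.cached_tokens ??
    undefined
  );
}
```

Using `??` rather than `||` matters here: a legitimate flat count of 0 should not fall through to the nested field.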
2026-06-05 feat(observability): replace synthetic text-turn fixture with a real sanitized flash capture (D5) Adam Malczewski
Installed a real DISPATCH_RECORD_FIXTURE capture (200 text/event-stream, deepseek-v4-flash, finish_reason stop, reply 'Hello there friend') as src/__fixtures__/flash-text-turn.json, replacing the hand-authored one. Auth header masked (Bearer sk-…redacted…UN0); fixture re-verified secret-free before commit. Text-turn replay assertions updated to real values (inputTokens 665 / outputTokens 90); structural assertions kept (multi-chunk replay, getCapturedRequest deep-equals the outgoing request, finish/stop event, concatenated deltas). The provider's request-building + SSE parsing now run against genuine flash bytes.
Real-data finding (logged in tasks.md): flash reports cache tokens via DeepSeek's NESTED prompt_tokens_details.cached_tokens (384 cached of 665 prompt); the parser only maps the FLAT cache_read/creation form, so cache tokens don't surface, an observability gap vs the §3.1 cache-debugging goal. Deferred pending decision.
334 tests, typecheck + biome 0/0.
2026-06-05 fix(observability): record-mode redaction leaked capitalized Authorization header; case-insensitive mask (+3 regression tests) Adam Malczewski
A live capture exposed that provider record mode saved the request Authorization header with the API key in CLEARTEXT: onExchange redaction matched lowercase 'authorization', but streamChat sends 'Authorization' (capital A) and recordFetch captures headers verbatim, so the real header slipped through. The provider.request SPAN redaction was unaffected (it masks from config.apiKey directly; journal + trace-DB showed zero leaks); the leak was record-mode-only and caught PRE-COMMIT (fixture was /tmp-only, scrubbed).
Fix: redact auth case-insensitively across all captured header keys (strip Bearer, maskSecret the token, re-prepend, preserve key casing). New tests: reproduce the exact capital-Authorization leak (would have caught it), a lowercase case, and a guard that no authorization header of ANY casing survives carrying a raw sk- token.
334 tests (331 -> +3), typecheck + biome 0/0. This is the live-capture step (D5) earning its keep: real data exposed what the synthetic redaction test assumed away.
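The fix steps listed in this commit (match the key case-insensitively, strip Bearer, mask the token, re-prepend, preserve key casing) could be sketched as follows. The maskSecret here is a hypothetical stand-in, not the provider's graduated-tier mask, and redactAuthHeaders is an illustrative name.

```typescript
// Hypothetical stand-in for the provider's maskSecret: keeps a short prefix and suffix.
function maskSecret(secret: string): string {
  if (secret.length <= 8) return "***";
  return `${secret.slice(0, 3)}…redacted…${secret.slice(-3)}`;
}

// Redact any header whose name equals "authorization" in ANY casing.
function redactAuthHeaders(headers: Record<string, string>): Record<string, string> {
  const redacted: Record<string, string> = {};
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() !== "authorization") {
      redacted[key] = value; // non-auth headers pass through untouched
      continue;
    }
    // Strip the Bearer scheme if present, mask only the token, then re-prepend.
    const token = /^Bearer\s+(.+)$/i.exec(value)?.[1];
    redacted[key] = token !== undefined ? `Bearer ${maskSecret(token)}` : maskSecret(value);
  }
  return redacted; // original key casing is preserved
}
```

Comparing on `key.toLowerCase()` while writing back under the original `key` is what keeps the captured fixture faithful to the request apart from the secret.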
2026-06-05 feat(observability): provider record/replay via @dispatch/trace-replay: env-gated capture + hermetic fixture tests (331 tests) Adam Malczewski
provider-openai-compat now consumes @dispatch/trace-replay.
(A) Opt-in record mode: when DISPATCH_RECORD_FIXTURE is set, the fetch edge wraps recordFetch and saves a fixture of the verbatim post-transform request + raw SSE response, self-redacting the auth header in the provider's OWN code (reuses its existing maskSecret graduated-tier mask; no shared helper, isolation over DRY). Zero overhead when unset; fail-safe.
(B) Hermetic replay tests: stream.test.ts drives the provider off committed SSE fixtures via replayFetch (chunk-split to exercise SSE parsing across boundaries), asserting ProviderEvents + that the outgoing request still matches the recorded one (transform-drift regression). Injectable fetch via an internal StreamConfig.fetchFn; NO kernel contract change. 2 committed fixtures (text-turn + tool-call, currently hand-authored-faithful; a real flash text-turn swap follows).
Verified: tsc -b clean, 331 vitest (327 -> +4: 2 replay + 2 redaction), biome 0/0. Provider 44 -> 48.
2026-06-05 feat(observability): trace-replay, a generic HTTP-exchange record/replay library (39 tests) Adam Malczewski
New standalone package @dispatch/trace-replay: replayFetch (pure: fixture -> fetch double + captured request, optional chunking to simulate streaming), recordFetch (tees a real fetch into a fixture WITHOUT consuming the caller's stream), and serialize/parse + save/load fixture I/O. Redaction-free by design: calling extensions self-redact in their OWN code before saving (isolation over DRY, D5/§9). Zero @dispatch/* deps, no bun:sqlite (runs under vitest). The shared unit realizing the §7/D5 replay affordance for hermetic provider tests; provider-openai-compat will consume it next. Root tsconfig ref wired.
Verified: tsc -b clean, 327 vitest (288 -> +39: replay 12 / record 8 / fixture 19), biome 0/0. Agent stayed in lane (packages/trace-replay only).
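To illustrate the replay half of this commit, here is a rough sketch of a fetch double built from a fixture, with optional byte-level chunking to simulate streaming. The Fixture shape, the function signatures, and splitBytes are assumptions for illustration; only the names replayFetch and getCapturedRequest appear in the log. It relies on the standard Response, ReadableStream and TextEncoder globals (available in Bun and recent Node).

```typescript
// Assumed fixture shape: just enough of the recorded response to replay it.
type Fixture = {
  response: { status: number; headers: Record<string, string>; body: string };
};
type CapturedRequest = { url: string; method: string; body: string | undefined };
type RequestInitLite = { method?: string; body?: string };

// Split bytes into fixed-size pieces so a parser sees events cut at arbitrary boundaries.
function splitBytes(bytes: Uint8Array, size: number): Uint8Array[] {
  if (size <= 0) return [bytes]; // no chunking requested: deliver in one piece
  const parts: Uint8Array[] = [];
  for (let offset = 0; offset < bytes.length; offset += size) {
    parts.push(bytes.subarray(offset, offset + size));
  }
  return parts;
}

// Build a fetch double that never touches the network.
function replayFetch(fixture: Fixture, chunkSize = 0) {
  let captured: CapturedRequest | undefined;
  const fetchFn = async (url: string, init?: RequestInitLite): Promise<Response> => {
    // Remember the outgoing request so a test can compare it to the recorded one.
    captured = { url, method: init?.method ?? "GET", body: init?.body };
    const parts = splitBytes(new TextEncoder().encode(fixture.response.body), chunkSize);
    const stream = new ReadableStream<Uint8Array>({
      start(controller) {
        for (const part of parts) controller.enqueue(part);
        controller.close();
      },
    });
    return new Response(stream, {
      status: fixture.response.status,
      headers: fixture.response.headers,
    });
  };
  return { fetchFn, getCapturedRequest: () => captured };
}
```

Chunking at the byte level rather than per SSE event is deliberate in this sketch: it forces the consumer's parser to handle an event split across two reads.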
2026-06-05 feat(observability): host-bin supervises the collector (spawn-first / drain-last / restart), 288 tests Adam Malczewski
host-bin spawns the out-of-process collector before serving (real Bun.spawn adapted to a ChildHandle), restarts on unexpected exit (backoff + restart-guard cap), drains on SIGINT/SIGTERM (collector final-drain, SIGKILL fallback on timeout). createCollectorSupervisor takes an injected spawn so the lifecycle is unit-tested with a fake (no real subprocess). Collector failures never crash the app (D3 subordinate/fail-safe). New env DISPATCH_TRACE_DB (default ./.dispatch-data/traces.db).
Verified: tsc -b clean, 288 tests (279 + 9 supervisor), biome 0/0. Live (clean single run): 1 collector during, trace DB auto-populated (nested easy-view), 0 collectors after shutdown.
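The injected-spawn idea in this commit can be shown with a stripped-down supervisor. The ChildHandle shape, the options object, and the method names are assumptions; backoff timing and the SIGKILL fallback are left out so the restart-guard logic stays synchronous and easy to drive from a fake.

```typescript
// Assumed minimal child abstraction (the real ChildHandle adapts Bun.spawn).
type ChildHandle = {
  onExit(callback: (code: number | null) => void): void;
  kill(signal?: string): void;
};
type SpawnFn = () => ChildHandle;

function createCollectorSupervisor(options: { spawn: SpawnFn; maxRestarts: number }) {
  let restarts = 0;
  let stopping = false;
  let current: ChildHandle | undefined;

  const start = (): void => {
    const child = options.spawn();
    current = child;
    child.onExit(() => {
      if (stopping) return; // expected exit during shutdown: do not respawn
      if (restarts >= options.maxRestarts) return; // restart-guard cap reached: give up quietly
      restarts += 1;
      start(); // a real supervisor would wait for a backoff delay here
    });
  };

  const stop = (): void => {
    stopping = true;
    current?.kill("SIGTERM"); // ask the collector to drain and exit
  };

  return { start, stop, restartCount: () => restarts };
}
```

Because spawn is a parameter, a test can pass a fake that records exit callbacks and fires them by hand, so no real subprocess is needed.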
2026-06-05 fix(observability): nest turn/step/prompt/provider.request spans into a tree (+ buildSpanOpen parent propagation) Adam Malczewski
run-turn: step is now turnSpan.child; prompt/provider.request/tool-call are step's children (stepSpan.log passed into provider.stream). logger.ts: buildSpanOpen now propagates the child's computed parentSpanId onto the span-open record, fixing a latent bug where span.child(...) never set parentSpanId on open (close was already correct).
Verified: tsc -b clean, 279 tests, biome 0/0. Live: span tree turn->step->{prompt,provider.request}; the trace CLI easy-view renders the nesting.
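The parent-propagation fix in this commit amounts to writing the parent's id onto the child's span-open record. A toy tracer showing that behaviour follows; the record shape, the id scheme, and createTracer are hypothetical and much simpler than the real logger.

```typescript
// Assumed span-open record: the field this commit fixes is parentSpanId.
type SpanOpenRecord = {
  kind: "span-open";
  spanId: string;
  parentSpanId: string | null; // null marks a root span
  name: string;
};
type Span = { spanId: string; child(name: string): Span };

function createTracer(): { records: SpanOpenRecord[]; root(name: string): Span } {
  const records: SpanOpenRecord[] = [];
  let counter = 0;

  // Every span, root or child, goes through the same open path,
  // so the parent id is recorded at open time rather than only at close.
  const open = (name: string, parentSpanId: string | null): Span => {
    counter += 1;
    const spanId = `span-${counter}`;
    records.push({ kind: "span-open", spanId, parentSpanId, name });
    return { spanId, child: (childName: string) => open(childName, spanId) };
  };

  return { records, root: (name: string) => open(name, null) };
}
```

With this in place, opening turn, then step as its child, then prompt and provider.request as step's children yields open records that already describe the tree turn->step->{prompt,provider.request}.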
2026-06-05 feat(observability): Phase B: trace-store (SQLite) + out-of-process collector + trace CLI (345 tests) Adam Malczewski
trace-store (bun:sqlite): records+bodies schema (thin/fat split), idempotent insertRecords (FNV-1a id + INSERT OR IGNORE), getTurn/getBody, pure renderEasyView (D8 timeline skeleton), trace CLI. Its own DB, isolated from storage-sqlite.
observability-collector (out-of-process bin): tail journal -> splitLines/drainOnce -> trace-store.insertRecords; offset sidecar; at-least-once + idempotent; fail-safe; clean SIGINT/SIGTERM drain.
Build-config (orchestrator): root tsconfig refs; both excluded from vitest + added to test:bun (bun:sqlite); bun install.
Verified: tsc -b clean, 345 tests (273 vitest + 72 bun), biome 0 warnings/0 infos. Pipeline proven end-to-end: app -> journal -> collector -> SQLite -> 'trace <turnId>' easy-view. Known follow-up (next commit): kernel spans are flat (parent=ROOT); run-turn nesting fix.
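The idempotent-insert idea in this commit (a content-derived FNV-1a id plus INSERT OR IGNORE) is what makes at-least-once delivery safe. A sketch with an in-memory Map standing in for the SQLite table is given below. Hashing the raw journal line, using the 32-bit variant, and the class name are assumptions; the real store uses bun:sqlite.

```typescript
// 32-bit FNV-1a over the UTF-8 bytes of the input, returned as 8 hex digits.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  new TextEncoder().encode(input).forEach((byte) => {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32 bits
  });
  return hash.toString(16).padStart(8, "0");
}

// In-memory stand-in for a table with a primary key on id.
class InMemoryTraceStore {
  private readonly rows = new Map<string, string>();

  // Returns how many records were newly stored; replays of the same line are ignored,
  // mirroring the effect of INSERT OR IGNORE on a primary-key conflict.
  insertRecords(lines: string[]): number {
    let inserted = 0;
    for (const line of lines) {
      const id = fnv1a32(line);
      if (this.rows.has(id)) continue;
      this.rows.set(id, line);
      inserted += 1;
    }
    return inserted;
  }

  size(): number {
    return this.rows.size;
  }
}
```

If the collector crashes after inserting but before persisting its offset, it re-reads the same journal lines on restart, and the content-derived id turns that replay into a no-op. A 32-bit hash can collide on large volumes, so a production store would likely want a wider id.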