summaryrefslogtreecommitdiffhomepage
path: root/tasks.md
AgeCommit message (Collapse)Author
2026-06-11feat(cache-warming): manual POST /chat/warm trigger endpointAdam Malczewski
A frontend 'warm now' button (and fast tests) can trigger a warm on demand instead of waiting for the automatic timer. - transport-contract: WarmRequest / WarmResponse wire types - transport-http: POST /chat/warm → cacheWarmHandle.warm(); 200 with cachePct, 409 when the conversation is generating, 400 on missing conversationId Live-verified vs claude haiku: seed turn cacheWrite=6799 → POST /chat/warm returns cacheReadTokens=6799 cachePct=100 (100% hit). 760 vitest + 109 bun green.
2026-06-11feat(cache-warming): per-conversation prompt-cache warming + warm() serviceAdam Malczewski
Backend-driven warming targeting whatever provider a conversation uses (incl. the external Claude provider-anthropic). Core engine + on/off + last-cache-% done; interval-as-view-control pending a ui-contract NumberField (surface-system gap). Mechanism: - kernel: expose HostAPI.emit (typed bus event emit; counterpart of on) - session-orchestrator: turnStarted/turnSettled event hooks (conversationId/cwd/model); warm() service (cacheWarmHandle) reusing the real-turn assembly (byte-identical prefix, provider-agnostic), refuses mid-turn, never persists/emits, returns Usage - cache-warming (new ext): per-conversation timers (arm on settle, cancel on start, in-flight invalidation), calls warm(), pct=round(clamp(cacheRead/input,0,1)*100), persists {enabled,intervalMs} (default on/240s), registers a controls surface - host-bin: register cache-warming; transport-http: HostAPI stub +emit (fan-out) Honors old-code invariants. 760 vitest + 109 bun = 869 tests; tsc -b EXIT 0; biome clean.
2026-06-10feat(skills): skill system + load_skill tool via per-turn tools filterAdam Malczewski
Skills are markdown in .skills/ dirs (~/.skills + <cwd>/.skills, cwd shadows home; name = filename). Format: line1 summary, line2 ---, body line3+; load strips the first two lines; malformed = no summary but still loadable. Mechanism (first use of the context-assembly filter chain, §3.2): - kernel: expose HostAPI.applyFilters (delegates to bus.applyFilters) - session-orchestrator: define/export toolsFilter + ToolAssembly; apply once per turn before runTurn (cache-stable across steps), threading cwd + conversationId - skills (new ext): pure parse/merge/render + load_skill tool (live read, path-contained) + a toolsFilter filter rewriting load_skill's description + name enum per cwd - host-bin: register skills in CORE_EXTENSIONS - transport-http: fix HostAPI test stub for the new applyFilters method (fan-out) 734 vitest + 109 bun = 843 tests; tsc -b EXIT 0; biome clean; clean live boot.
2026-06-10feat(tools): add run_shell, edit_file, write_file + read_file directory listingAdam Malczewski
Four standard-tier tool extensions (one tool per extension, zero ABI change): - tool-read-file: read_file now lists directory contents (sorted, /-suffixed subdirs) - tool-shell: run_shell (foreground, streamed, cancellable, cwd, timeout + output cap) - tool-edit-file: edit_file (oldString/newString/replaceAll; errors on absent/non-unique) - tool-write-file: write_file (explicit overwrite flag) Registered in host-bin CORE_EXTENSIONS. Live boot clean (shell capability accepted). 686 vitest + 89 bun = 775 tests; tsc -b EXIT 0; biome clean.
2026-06-10trace-store: fix old-schema migration crash (found by live boot)Adam Malczewski
Wave 1 created idx_records_bodyHash BEFORE migrateOldBodies ran, so opening a pre-existing old-schema traces.db crashed the collector with 'no such column: bodyHash' (crash-looped 168x in ~20s). Fresh DBs hid it (CREATE TABLE already has bodyHash); only a real old-schema DB exposed it. - reorder schema(): migrateOldBodies (ALTER ADD bodyHash + content-address backfill + drop old bodies) runs BEFORE the bodyHash index. - add 3 regression tests that seed a real old-schema DB and open it. Live-verified: old-schema traces.db migrates on boot with 0 crashes; 318 body refs collapse to 270 content-addressed bodies; prune cadence fires cleanly. typecheck EXIT 0; biome clean; bun 106->109, 0 fail.
2026-06-10observability-collector: drive trace-store prune on a cadenceAdam Malczewski
Wave 2 (final) of the dedup/storage-growth milestone (notes §12). - pure shouldPrune(now,lastPruneAt,intervalMs) cadence helper (injected clock). - main.ts calls store.prune(DEFAULT_RETENTION) on a coarse cadence (--prune-interval-ms, default 60s; host-bin-overridable), far less frequent than a drain. Prune errors are logged and never stop the tail loop. - confirmed body inserts flow through trace-store's content-addressed path. - glossary: content-addressed body, trace retention, prefix fingerprint, warm vs real. typecheck EXIT 0; biome clean; vitest 576; bun 100->106, 0 fail.
2026-06-10trace-store: content-addressed body dedup + retention/pruneAdam Malczewski
Wave 1 of the dedup/storage-growth milestone (notes §12). - bodies table is now content-addressed (SHA-256 hash key); identical verbatim bodies (cache-warming resends, any repeat) collapse to one stored row, referenced by hash from records. Transparent to insert/read callers. - at-rest gzip compression for bodies >1 KiB (node:zlib), decompressed on read. - prune(policy): age-based delete + drop-oldest byte-cap eviction + orphan-body GC. Exports RetentionPolicy/PruneSummary/DEFAULT_RETENTION (7d / 256 MiB). typecheck EXIT 0; biome clean; vitest 576; bun 89->100, 0 fail.
2026-06-10feat(conversation-store): reconcile.repair span (logging-audit #1)Adam Malczewski
Load-time history repair was invisible (createConversationStore got no logger). Now: optional logger injected (extension passes host.logger); reconcile logic moved into pure reconcileWithReport() returning a ReconcileReport (reconcile() stays a thin byte-identical wrapper); load() emits a reconcile.repair span (childed with conversationId, flat attrs repairedCount/firstRepairedToolCallId) ONLY when a real repair occurs. No contract fan-out (factory is package-internal). typecheck EXIT 0, biome clean, 550 vitest (+4) + 89 bun.
2026-06-10docs(metrics): FE Pass-2 courier handoff + mark live-verifiedAdam Malczewski
GET /conversations/:id/metrics verified end-to-end against flash (live stream metrics byte-match the persisted TurnMetrics; journal turn/step spans carry dotted usage.* incl. cacheReadTokens). Handoff doc for the user to courier the wire/transport-contract 0.4.0 delta to ../dispatch-web (ORCHESTRATOR \xc2\xa77).
2026-06-10docs(tasks): prune stale milestone history to a lean current-state docAdam Malczewski
tasks.md had accreted 635 lines of blow-by-blow milestone narration; that history lives in git. Collapse completed milestones into a compact summary, keep operational/open/roadmap, record the metrics milestone (incl. Pass 2). Removes the stale 'regen FE .reference.md' practice notes that contradicted ORCHESTRATOR \xc2\xa77 (cross-repo changes are couriered via the user; the backend does not write the FE repo) \u2014 \xc2\xa77 stands as the single source of truth.
2026-06-07docs(tasks): record live metrics Pass 1 done + live-verifiedAdam Malczewski
2026-06-07feat(wire,kernel,session-orchestrator): live turn metrics on the streamAdam Malczewski
Expose the backend's authoritative token+timing metrics on the live AgentEvent stream (observability-only -> now also client-facing). All additive/optional. - [email protected]: new TurnStepCompleteEvent (type:step-complete) with per-step ttftMs/decodeMs/genTotalMs; usage += stepId; tool-result += durationMs (exec); done += durationMs (turn wall-clock) + usage (turn total). RunTurnInput += now?. [email protected] (re-export bump). - kernel-runtime: when now injected, measures + emits the above (reuses the ttft/decode first-token detection); omits timing gracefully without a clock. - session-orchestrator: adds now? to deps, threads into RunTurnInput; extension activate injects () => Date.now(). - transport/cli/host-bin: untouched (verbatim pass-through; additive fields). FE handoff: frontend-metrics-handoff.md. typecheck clean; 520 vitest + 89 bun; biome 0/0. Replay/persistence = deferred Pass 2 (documented in tasks.md).
2026-06-07docs(tasks): record per-step TTFT+decode timing done + live-verifiedAdam Malczewski
2026-06-07docs(tasks): record stepId step-grouping done + live-verifiedAdam Malczewski
2026-06-07feat(wire,kernel,conversation-store): step grouping via stepId for batched ↵Adam Malczewski
tool calls Expose a per-step grouping key so a client can render a model's batched/parallel tool calls (those emitted in one step) as one unit, on both the live stream and replayed history. Key = branded StepId, derived turnId#stepIndex (0-based). - [email protected]: required stepId on Turn{Tool,ToolResult}Event; optional stepId on Tool{Call,Result}Chunk (generation provenance on the chunk, not the StoredChunk envelope — StoredChunk unchanged). [email protected] (re-export bump). - kernel-runtime: mint stepId per step; stamp on tool chunks + tool events. - conversation-store: chunk-carried stepId round-trips append/load/loadSince for free; reconcile copies it onto synthesized (interrupted) results. - cli: stepId added to event test fixtures (renderer unchanged). typecheck clean; 509 vitest + 89 bun; biome 0/0. FE courier reply + reference snapshots regenerated in ../dispatch-web.
2026-06-06docs(tasks): record FE Slice-2 backend handoff resolution (A-E answered)Adam Malczewski
2026-06-06refactor(transport-http,host-bin): transport-http owns its Bun.serve (fix ↵Adam Malczewski
log scope) Make transport-http a full-fidelity extension that runs its own Bun.serve inside activate(host) — symmetric with transport-ws. The Hono app is now built with the extension-scoped host, so all HTTP edge logs are correctly attributed extensionId=transport-http instead of the host-bin __host__ scope (verified live in the journal). - transport-http: createTransportHttpExtension() factory; activate builds the app + Bun.serve, reads host.config httpPort (?? 24203); deactivate stops it. - host-bin: drops the HTTP Bun.serve + createServer call; config.ts maps BACKEND_PORT/PORT -> httpPort. host-bin now serves no transport (both transports self-serve); boot log -> 'Dispatch booted'. - +5 bun lifecycle tests wired into test:bun. No contract change (composition wiring). Verified live: HTTP serves on :24203; journal edge logs now scoped transport-http. typecheck clean, 498 vitest + 89 bun, biome clean.
2026-06-06feat(transport-http,transport-ws): structured edge logging (close coverage ↵Adam Malczewski
gap #2) Both HTTP + WS transport edges now emit structured logs via the injected logger (D7-compliant: no per-AgentEvent/chat.delta frame logging). Verified live — the journal contains the edge records. - transport-ws: connection open/close (debug), chat.send accepted (info), surface-op + malformed-chat.send (warn), abort-on-close (debug). +4 bun tests. Correctly scoped extensionId=transport-ws (owns its Bun.serve). - transport-http: /chat accepted (info) / 400 (warn) / turn-failure (error), GET /conversations read (info), /models + store failure (error). +4 vitest. Known follow-up: transport-http edge logs are attributed to '__host__' (not 'transport-http') because host-bin runs the HTTP server via createServer(getHostAPI()) rather than the extension owning its Bun.serve. Logs are captured + correlated; only the per-extension filter is mis-scoped. Tracked in tasks.md. typecheck clean, 498 vitest + 84 bun, biome clean.
2026-06-06docs(harness): author extension-logging rule (close the pending logging gap)Adam Malczewski
The .dispatch/rules/extension-logging.md rule was '(pending)' in ORCHESTRATOR §3 for the entire life of the observability substrate, so every extension summon was built without logging/self-redaction guidance — leaving most extensions silent (a coverage audit found conversation-store, transport-http, credential-store, tool-read-file, storage-sqlite, auth-apikey, surface-* all with zero logger refs). - Author .dispatch/rules/extension-logging.md (tribal-knowledge only, P6/P7): self-redact your own secrets in your own code (no shared helper; §6 tiers), use injected host.logger/ctx.log, flat scalar attrs, no token-delta logging, one-way logs, edge verbatim capture. - Wire it into ORCHESTRATOR §3 as 'every extension' — include on EVERY extension summon; remove the (pending) note. - Record the coverage audit + remaining instrumentation debt (#1 reconcile.repair span in conversation-store, #2 transport-edge logging) in tasks.md. Future extensions now get logging by construction.
2026-06-06feat(kernel-runtime,session-orchestrator): emit turn lifecycle eventsAdam Malczewski
Close a gap found live: neither transport emitted turn-start/done/turn-sealed (the wire defined them; nothing fired them). turn-sealed is the FE's cache-commit signal (frontend-design §6.3); done ends the stream. - kernel-runtime: runTurn emits turn-start first and done (with finishReason) last, on every exit path (stop/tool-calls/max-steps/error/aborted). - session-orchestrator: emits turn-sealed after conversationStore.append succeeds (the kernel touches no DB, so the post-persist seal is the orchestrator's). Not emitted if append throws. No contract change (all three wire types already existed). Verified live: HTTP /chat and WS chat both stream turn-start … done turn-sealed. typecheck clean, 494 vitest + 80 bun, biome clean.
2026-06-06feat(transport-ws,transport-contract): multiplex chat ops onto the surface WSAdam Malczewski
Add chat WS ops (chat.send / chat.delta / chat.error) + unified WsClientMessage/WsServerMessage unions to @dispatch/transport-contract (imports ui-contract; surface protocol unchanged — additive non-colliding type variants, no channel wrapper). transport-ws drives sessionOrchestrator.handleMessage, streaming each AgentEvent as chat.delta over the same connection that carries surface ops; per-connection AbortController cancels in-flight turns on socket close; error-isolated. Verified live: one WS connection delivered the surface catalog AND a real flash chat turn (chat.delta stream, reply 'Hello my friend'). Completes the FE Slice 2 backend prereqs. typecheck clean, 485 vitest + 80 bun, biome clean. Discovered (separate, pre-existing): runtime does not emit turn-start/done/turn-sealed on either transport — needed for FE cache-commit; tracked in tasks.md.
2026-06-06feat(transport-http): GET /conversations/:id?sinceSeq= read-side history ↵Adam Malczewski
endpoint Incremental rehydration endpoint for long-lived clients. Returns ConversationHistoryResponse { chunks: StoredChunk[], latestSeq } — the RAW, append-order, seq-filtered slice from conversation-store.loadSince, NOT reconciled (reconcile conflicts with the per-chunk seq cursor, so it stays on the turn path; the read path is a pure sync primitive). - transport-contract: add ConversationHistoryResponse + StoredChunk re-export. - transport-http: GET /conversations/:id route reaching the log directly via conversationStoreHandle (dependsOn conversation-store); pure parseSinceSeq (absent->0, invalid->400). - build wiring: conversation-store dep + project ref. FE Slice 2 backend prereq (read-side). typecheck clean, 481 vitest, biome clean.
2026-06-06feat(wire,conversation-store): per-chunk seq sync cursor (StoredChunk)Adam Malczewski
Add StoredChunk { seq, role, chunk } to @dispatch/wire (re-exported via the kernel contract shims). Keeps Chunk pure (provider-facing, no cursor); the sync cursor lives only on the envelope. conversation-store: rekey conv:<id>:msg:<seq> -> conv:<id>:chunk:<seq>; append explodes messages into role-tagged seq'd chunks (1-based, gap-free, monotonic) with internal boundary metadata so load() round-trips ChatMessage[] losslessly and still reconciles; new loadSince(id, sinceSeq?) raw sync stream. session-orchestrator test fake conforms to the widened interface. FE Slice 2 backend prereq (per-chunk seq). typecheck clean, 469 vitest, biome clean.
2026-06-06feat(frontend,wire): surface system (FE slice 1) + @dispatch/wire types-only ↵Adam Malczewski
split (B2) FE slice 1 — backend-declared, frontend-agnostic surface system (verified live): new types-only @dispatch/ui-contract (SurfaceSpec / field kinds / region / ActionRef / catalog), surface-registry (typed service handle), transport-ws (Bun WS :24205, path-agnostic upgrade), surface-loaded-extensions (first real surface); kernel HostAPI.getExtensions; host-bin wiring; bin/up. Harness: retire AGENTS 'backend only', ORCHESTRATOR §3/§7/§8, frontend-design.md locked. B2 — wire-types split (chat-slice prerequisite): new types-only @dispatch/wire single-sources the wire ABI (AgentEvent + 11 variants; conversation model Chunk/ChatMessage/Role/TurnId/StepId + 6 chunk variants; Usage) with zero @dispatch/* deps. @dispatch/kernel re-exports via shims so its public surface is byte-identical (zero consumer blast radius). transport-contract re-exports AgentEvent from @dispatch/wire and drops its @dispatch/kernel dependency, so HTTP clients (the web frontend) consume the wire without the kernel runtime. tsc -b + biome clean; 460 vitest + 77 bun pass.
2026-06-05feat(cli): one-shot terminal client (models, chat, ↵Adam Malczewski
--text/--file/--cwd/--conversation) HTTP client of transport-contract; pure-core arg/render/ndjson + injected fetch/fs shell. Docs: GLOSSARY (credential/key/model name/model catalog), tasks.md milestone, ORCHESTRATOR geography.
2026-06-05docs: reorder roadmap — CLI first, then web frontend, then dedup/storageAdam Malczewski
User-set ordering: (1) CLI MVP (line-oriented, NOT a TUI; may have basic selectors; same mirrored-backend methodology with a careful design pass first — seed at notes/cli-design.md), (2) web frontend (Svelte + DaisyUI, notes/frontend-design.md), (3) dedup/storage growth. CLI design seed frames the same open questions as the web FE (pure-core/shell split, unit boundaries, transport, testing) adapted to a terminal client.
2026-06-05docs: roadmap — Frontend MVP next (Svelte + DaisyUI, mirrored-backend ↵Adam Malczewski
methodology), then dedup/storage User-set roadmap: (1) Frontend MVP — Svelte + DaisyUI, but ONLY after a careful design pass that maps the backend's methodology (minimal core + extensions, typed contracts, pure-core/inject-effects, one-owner, asymmetric testing) onto the frontend. Old Dispatch FE is reference-only; port 24204 reserved. Seed doc at notes/frontend-design.md (IDEATION mode — design WITH the user before any summon). (2) dedup/storage growth (D5 volume-control + prefix.fingerprint + §6 retention) — already designed, sequenced after the FE. Re-sequenced the deferred storage item. When FE build begins: retire AGENTS.md 'Backend only' line + author new frontend scoped rules + update ORCHESTRATOR §3/§7.
2026-06-05feat(observability): map DeepSeek nested cache tokens ↵Adam Malczewski
(prompt_tokens_details.cached_tokens) -> Usage.cacheReadTokens The real flash fixture showed flash reports cache usage in the NESTED prompt_tokens_details.cached_tokens form (384 cached of 665 prompt); the parser only mapped the flat cache_read_tokens form, so cache tokens never surfaced. Now: cacheReadTokens = usage.cache_read_tokens ?? usage.prompt_tokens_details?.cached_tokens (flat wins; cacheWriteTokens flat-only, never fabricated; partial/null *_details safe). No kernel contract change (Usage already has the fields). +5 parser tests + a real-fixture regression (cacheReadTokens === 384). These counts (+ a future prefix.fingerprint) are the cheap signals for body de-duplication. The broader trace-body storage-growth concern (verbatim body stored per request -> ~O(N^2) for long conversations) is logged DEFERRED in tasks.md; mitigation already designed (D5 volume control + §6 retention/rotation), not yet built. 339 tests, typecheck + biome 0/0.
2026-06-05feat(observability): replace synthetic text-turn fixture with a real ↵Adam Malczewski
sanitized flash capture (D5) Installed a real DISPATCH_RECORD_FIXTURE capture (200 text/event-stream, deepseek-v4-flash, finish_reason stop, reply 'Hello there friend') as src/__fixtures__/flash-text-turn.json, replacing the hand-authored one. Auth header masked (Bearer sk-…redacted…UN0); fixture re-verified secret-free before commit. Text-turn replay assertions updated to real values (inputTokens 665 / outputTokens 90); structural assertions kept (multi-chunk replay, getCapturedRequest deep-equals the outgoing request, finish/stop event, concatenated deltas). The provider's request-building + SSE parsing now run against genuine flash bytes. Real-data finding (logged in tasks.md): flash reports cache tokens via DeepSeek's NESTED prompt_tokens_details.cached_tokens (384 cached of 665 prompt); the parser only maps the FLAT cache_read/creation form, so cache tokens don't surface — an observability gap vs the §3.1 cache-debugging goal. Deferred pending decision. 334 tests, typecheck + biome 0/0.
2026-06-05fix(observability): record-mode redaction leaked capitalized Authorization ↵Adam Malczewski
header — case-insensitive mask (+3 regression tests) A live capture exposed that provider record mode saved the request Authorization header with the API key in CLEARTEXT: onExchange redaction matched lowercase 'authorization', but streamChat sends 'Authorization' (capital A) and recordFetch captures headers verbatim, so the real header slipped through. The provider.request SPAN redaction was unaffected (it masks from config.apiKey directly — journal + trace-DB showed zero leaks); the leak was record-mode-only and caught PRE-COMMIT (fixture was /tmp-only, scrubbed). Fix: redact auth case-insensitively across all captured header keys (strip Bearer, maskSecret the token, re-prepend, preserve key casing). New tests: reproduce the exact capital-Authorization leak (would have caught it), a lowercase case, and a guard that no authorization header of ANY casing survives carrying a raw sk- token. 334 tests (331 -> +3), typecheck + biome 0/0. This is the live-capture step (D5) earning its keep — real data exposed what the synthetic redaction test assumed away.
2026-06-05feat(observability): provider record/replay via @dispatch/trace-replay — ↵Adam Malczewski
env-gated capture + hermetic fixture tests (331 tests) provider-openai-compat now consumes @dispatch/trace-replay. (A) Opt-in record mode: when DISPATCH_RECORD_FIXTURE is set, the fetch edge wraps recordFetch and saves a fixture of the verbatim post-transform request + raw SSE response, self-redacting the auth header in the provider's OWN code (reuses its existing maskSecret graduated-tier mask — no shared helper, isolation over DRY). Zero overhead when unset; fail-safe. (B) Hermetic replay tests: stream.test.ts drives the provider off committed SSE fixtures via replayFetch (chunk-split to exercise SSE parsing across boundaries), asserting ProviderEvents + that the outgoing request still matches the recorded one (transform-drift regression). Injectable fetch via an internal StreamConfig.fetchFn — NO kernel contract change. 2 committed fixtures (text-turn + tool-call, currently hand-authored-faithful; a real flash text-turn swap follows). Verified: tsc -b clean, 331 vitest (327 -> +4: 2 replay + 2 redaction), biome 0/0. Provider 44 -> 48.
2026-06-05feat(observability): trace-replay — generic HTTP-exchange record/replay ↵Adam Malczewski
library (39 tests) New standalone package @dispatch/trace-replay: replayFetch (pure — fixture -> fetch double + captured request, optional chunking to simulate streaming), recordFetch (tees a real fetch into a fixture WITHOUT consuming the caller's stream), and serialize/parse + save/load fixture I/O. Redaction-free by design: calling extensions self-redact in their OWN code before saving (isolation over DRY, D5/§9). Zero @dispatch/* deps, no bun:sqlite (runs under vitest). The shared unit realizing the §7/D5 replay affordance for hermetic provider tests; provider-openai-compat will consume it next. Root tsconfig ref wired. Verified: tsc -b clean, 327 vitest (288 -> +39: replay 12 / record 8 / fixture 19), biome 0/0. Agent stayed in lane (packages/trace-replay only).
2026-06-05feat(observability): Phase B — trace-store (SQLite) + out-of-process ↵Adam Malczewski
collector + trace CLI (345 tests) trace-store (bun:sqlite): records+bodies schema (thin/fat split), idempotent insertRecords (FNV-1a id + INSERT OR IGNORE), getTurn/getBody, pure renderEasyView (D8 timeline skeleton), trace CLI. Its own DB, isolated from storage-sqlite. observability-collector: out-of-process bin — tail journal -> splitLines/drainOnce -> trace-store.insertRecords; offset sidecar; at-least-once + idempotent; fail-safe; clean SIGINT/SIGTERM drain. Build-config (orchestrator): root tsconfig refs; both excluded from vitest + added to test:bun (bun:sqlite); bun install. Verified: tsc -b clean, 345 tests (273 vitest + 72 bun), biome 0 warnings/0 infos. Pipeline proven end-to-end: app -> journal -> collector -> SQLite -> 'trace <turnId>' easy-view. Known follow-up (next commit): kernel spans are flat (parent=ROOT) — run-turn nesting fix.
2026-06-05refactor(observability): pure-types contracts/logging + Span body channel; ↵Adam Malczewski
verbatim before/after -> LogRecord.body (273 tests) contracts/logging.ts reduced to pure types; createLogger (+ helpers) moved to kernel/src/logging/ — @dispatch/kernel still exports it (host-bin/tool-read-file unaffected). Span body channel (Option A): Logger.span / Span.child / Span.end accept an optional body string -> SpanOpenRecord.body / SpanCloseRecord.body. Large verbatim payloads now use body, not stringified attributes (store-fat-serve-thin; attributes stay thin/queryable for D9). before: run-turn emits a 'prompt' span with the verbatim messages+tools in body (small scalars in attrs). after: provider.request span carries the verbatim request in body; attrs thin, auth self-redacted. Verified: tsc -b clean, 273 tests, biome 0 warnings/0 infos. Live boot: prompt + provider.request bodies present and correlated (shared turnId); request.body no longer in attributes; auth-key leak count = 0.
2026-06-05feat(observability): Phase A.2 — verbatim provider.request "after" capture ↵Adam Malczewski
+ self-redaction (267 tests) Threads the step span's correlated logger into provider.stream (new optional ProviderStreamOptions.logger) so provider-openai-compat opens a child provider.request span at the fetch edge, capturing the verbatim post-transform request + response status/cache-tokens/raw-error. Auth header self-redacted in the provider's OWN code (graduated mask tiers; no shared helper). Capture is fail-safe (never throws into the turn). Adds the first hermetic provider HTTP test (stream.test.ts: fetch mocked, 15 cases). Large payloads use attributes for now; the LogRecord.body channel is a deferred ABI design (notes §10). Verified: tsc -b clean, 267 tests (250->+17), biome 0 warnings/0 infos. Live boot: provider.request shares turnId with prompt:before (before<->after diffable); auth-key leak count = 0 (self-redaction proven on a real request).
2026-06-05feat(observability): Phase A logging substrate — Logger/Span ABI + journal ↵Adam Malczewski
sink (250 tests) Structured, agent-first logging captured durably to an append-only journal file. Kernel (contracts/logging.ts): leveled/attributed Logger + Span, auto-scoped per extension (host stamps manifest.id, unspoofable), incremental span records (open/close) for crash-reconstructable traces, injected LogSink (pure record-builder). ctx.log on ToolContract; runTurn opens turn/step/tool-call spans and captures the verbatim pre-mutation prompt (the 'before') on the step span. journal-sink (new package, bootstrap dep — not an extension): LogSink appending NDJSON to a rotating journal; pure serialize + thin fs edge; fail-safe drop, never blocks a turn. host-bin injects it via HostDeps; session-orchestrator threads host.logger (childed per turn) into runTurn. Redaction is per-extension self-redaction (no shared helper — isolation over DRY). The out-of-process collector + SQLite store + the verbatim 'after' provider.request capture are Phase B / next (notes/observability-design.md §10/§11). Verified: tsc -b clean, 250 tests (218→+32), biome clean. Live boot: a turn's journal holds host logs + turn/step spans (open+close) + the prompt:before record with the verbatim messages array. Harness: ORCHESTRATOR §3 rule-scoping map; .dispatch/rules/isolation-over-dry.md; notes/observability-design.md (design D1–D10 + Phase A/B plan).
2026-06-05refactor(kernel): rename tabId → conversationId across contracts + ↵Adam Malczewski
consumers (218 tests) Step 4 of the post-MVP backlog: resolve the last vocab drift. The canonical term for a thread of turns is `conversationId` (GLOSSARY), but `AgentEvent` variants and `RunTurnInput` still used the legacy `tabId` from the old frontend "tab" concept, with session-orchestrator bridging `conversationId → tabId`. Atomic, type-driven rename across the full 10-file consumer set: - contracts/events.ts: all 11 AgentEvent variants tabId → conversationId - contracts/runtime.ts: RunTurnInput.tabId → conversationId - runtime/{events,run-turn,dispatch}.ts: factory params, ctx field, locals - session-orchestrator: drop the redundant `tabId: conversationId` bridge line - transport-http: emit wiring; external /chat field + X-Conversation-Id header unchanged (already canonical) — only the emitted NDJSON event field flips - tests (run-turn, app, logic): inputs + assertions now use conversationId Pure rename, zero behavior change: typecheck clean, 218 tests pass (unchanged count), biome clean, `grep tabId packages/` → zero matches. Verified live: multi-turn curl emits conversationId-keyed NDJSON and threads history correctly. GLOSSARY drift note removed. Closes the post-MVP backlog (Steps 1–4).
2026-06-05refactor(host): expose getHostAPI(); host-bin drops duplicate adapter; ↵Adam Malczewski
storage-sqlite manifest honesty host CR-1: createHost.getHostAPI() returns the canonical post-activation HostAPI (registration closed) via a single builder — host-bin deletes its buildPostActivationHostAPI duplicate and calls host.getHostAPI(). storage-sqlite CR-2: remove false contributes.services:["storage"] (backend is a kernel bootstrap dep injected as HostDeps.storageFactory, not a bus service); document the intentional no-op activate. typecheck clean, 218 tests pass, biome clean; live boot + curl verified.
2026-06-05feat(tool-read-file): add read_file tool extension + wire into host-binAdam Malczewski
First TOOL extension (standard tier, fs capability). Pure-core/shell split with workdir containment (realpath symlink guard). host-bin registers it in CORE_EXTENSIONS; flows into runTurn via session-orchestrator's resolveTools. Verified: typecheck clean, 214 tests pass (was 185), biome clean. Live curl against flash produced a real tool-call + tool-result round-trip with correct final answer. Proves the kernel tool-dispatch loop end-to-end (plan §3.3).
2026-06-05feat(auth): provider resolves credentials via AuthContract, not configAdam Malczewski
- kernel HostAPI: add getAuthProviders()/getAuthProvider(id) read-views (mirrors getProviders) - provider-openai-compat: activate() resolves creds via host.getAuthProvider("apikey").resolve(); dependsOn auth-apikey; model stays config-driven - host-bin: mirror the new getters in post-activation HostAPI stub - auth-apikey is no longer vestigial; auth seam exercised end-to-end - 185 tests pass; typecheck + biome clean; verified live (curl returns real response)
2026-06-05docs(tasks): MVP achieved — live multi-turn curl verified against OpenCode ↵Adam Malczewski
Go flash
2026-06-04docs: tasks tracker — orchestrator+transport done; kernel-crs nextAdam Malczewski
2026-06-04docs: conversation-store done; orchestrator+transport in parallelAdam Malczewski
2026-06-04docs: host done; conversation-store in progressAdam Malczewski
2026-06-04feat(core-ext): storage-sqlite, auth-apikey, provider-openai-compatAdam Malczewski
- storage-sqlite: bun:sqlite StorageNamespace backend + migrations (21 bun tests) - auth-apikey: pure resolver from env → ApiKeyCredentials (4 tests) - provider-openai-compat: OpenAI-compatible SSE stream → ProviderEvents - orchestrator fixes: provider imports (@dispatch/kernel), missing dep, exactOptionalPropertyTypes (omit-when-undefined), root tsconfig refs - vitest excludes storage-sqlite (bun:sqlite); test:bun runs it under bun
2026-06-04docs: add MVP tasks trackerAdam Malczewski