Age | Commit message | Author
8 days | refactor(tool-youtube-transcript): remove .youtube_subtitles_pending file convention | Adam Malczewski
Leaner tool description and queued response: the tool no longer instructs the model to append URLs to a pending file. It just returns status, ETA, and URL.
8 days | feat(tool-youtube-transcript): YouTube transcription tool | Adam Malczewski
New standard tool extension backed by a self-hosted transcriber service (http://100.102.55.49:41090, Tailscale, no API key). One tool youtube_transcript — fetches transcripts for YouTube videos. Returns completed (full text + timestamped segments), queued/processing (position + ETA + .youtube_subtitles_pending retry convention), or failed (error). Pure core: validateUrl + format* functions + truncateOutput. Injected edge: TranscriptClient (injectable fetchFn, AbortSignal.any for cancellation). concurrencySafe true, capabilities network. 30 tests. Verified: tsc EXIT 0, 1152 vitest, biome clean (327 files). Boot smoke clean.
8 days | feat(todo): per-conversation task list tool + surface | Adam Malczewski
New standard tool extension with a single todo_write tool (opencode todowrite pattern: full-list replace, returns JSON, no business-rule enforcement — the description guides the model). Per-conversation in-memory state + per-conversation surface (rendererId: todo, scope: conversation) via subscriber-notify (message-queue pattern). Wave 0 (kernel contract): added conversationId?: string to ToolExecuteContext (additive, backward-compatible). Wired in dispatch.ts — the kernel already had it but wasn't passing it through to tools. Wave 1 (todo extension): pure core (validateTodos — shape only; getTodos/ setTodos/clearTodos; buildTodoSpec; formatTodoResult). Shell: createTodoWriteTool + surface provider. Tool description matches opencode's todowrite.txt depth (when-to-use, examples, task states). Priority field removed (bloats the tool with little value). 25 tests. Wave 2 (host-bin): registered todo in CORE_EXTENSIONS + dep + root tsconfig ref. Verified: tsc EXIT 0, 1123 vitest, biome clean (314 files). Boot smoke clean. FE handoff: frontend-todo-handoff.md.
8 days | feat(tool-web-search): Firecrawl-backed web search tool | Adam Malczewski
New standard tool extension with one tool web_search supporting 4 modes (search, scrape, crawl, map) against a self-hosted Firecrawl instance. Pure core: validateArgs (discriminated union by mode) + format* functions + truncateOutput. Injected edge: FirecrawlClient (injectable fetchFn/sleep/now, AbortSignal.any for per-request timeout + caller cancellation). concurrencySafe true, capabilities network. 38 tests, zero vi.mock. Live-verified: umans-glm-5.2 called web_search → real Firecrawl results (also the first live Umans API call).
8 days | feat(provider-umans): Umans AI Coding Plan provider + openai-stream lib | Adam Malczewski
Extract a generic @dispatch/openai-stream library from provider-openai-compat (convert-messages, convert-tools, parse-sse, listModels, stream, provider), parameterizing createOpenAICompatProvider with `id` + hook. Refactor provider-openai-compat to import from the lib (byte-identical behavior). New @dispatch/provider-umans extension wraps the Umans OpenAI-compatible backend (https://api.code.umans.ai/v1). Self-contained: reads UMANS_API_KEY from env directly (no auth-apikey dep). transformBody maps reasoningEffort → reasoning_effort (capping xhigh/max → high). Dynamic listModels via GET /v1/models. host-bin: registered provider-umans in CORE_EXTENSIONS + umans credential (gated on UMANS_API_KEY; the credential is the model-catalog index). Verified: tsc EXIT 0, 1059 vitest, biome clean (293 files). Boot smoke: umans models appear in GET /models (7 models live).
8 days | feat(message-queue): per-conversation queue + steering injection | Adam Malczewski
A per-conversation message queue (new message-queue extension) holds user messages enqueued while a turn generates; delivered mid-turn as steering at the tool-result boundary (or carried to a new turn if no tool call fires). - kernel: RunTurnInput.drainSteering callback (generic; kernel stays pure) - wire 0.7.0->0.8.0: QueuedMessage, QueuePayload, TurnSteeringEvent (additive) - transport-contract 0.11.0->0.12.0: POST /conversations/:id/queue + chat.queue WS op - message-queue ext: queue state + per-conversation custom surface (rendererId message-queue) - session-orchestrator: enqueue facade + drainSteering wiring + post-seal carry - transport-http/ws: queue endpoint + chat.queue op (fixes WsClientMessage exhaustive switch) - host-bin: register message-queue 1043 vitest + 199 transport bun pass; tsc/biome clean; boot smoke clean. FE courier: frontend-message-queue-handoff.md.
8 days | fix(history): harden loadSince sinceSeq lower bound (forgiving, like beforeSeq/limit) | Adam Malczewski
Coerce sinceSeq to a non-negative integer lower bound in loadSince (omitted/0/non-positive/non-integer/NaN/Infinity -> 0; valid as-is). The transport layer 400s these upstream, but loadSince stays total for direct callers. Byte-identical to the prior ?? 0 for the only values any caller ever passed. 58 bun tests pass.
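The coercion rule above can be sketched as a small total function. This is only an illustration: `coerceSinceSeq` is a hypothetical name, and the real loadSince internals are not shown in this log.

```typescript
// Hypothetical helper (name and signature are assumptions, not the repo's code).
// Maps any "forgiving" input to a non-negative integer lower bound.
function coerceSinceSeq(sinceSeq?: unknown): number {
  // Omitted or non-numeric input falls back to 0.
  if (typeof sinceSeq !== "number") return 0;
  // NaN, +/-Infinity and fractional values are not integers, so they map to 0.
  if (!Number.isInteger(sinceSeq)) return 0;
  // Zero and negative values map to 0; valid positive integers pass through.
  return sinceSeq > 0 ? sinceSeq : 0;
}
```

For the values a well-behaved caller passes (undefined or a positive integer) this behaves like `sinceSeq ?? 0`, which is the sense in which the change can stay byte-identical for existing callers.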
2026-06-12 | docs(harness): Task-tool summon mechanism rework (briefs, ORCHESTRATOR, HANDOFF) + .skills | Adam Malczewski
2026-06-12 | docs(handoff): FE courier - reasoning effort (selector, per-turn override, endpoints) | Adam Malczewski
2026-06-12 | docs(tasks): reasoning-effort milestone - waves 1-3 done, live-verify + FE handoff pending | Adam Malczewski
2026-06-12 | feat(reasoning-effort): persisted per-conversation + per-turn override, threaded to providers | Adam Malczewski
- conversation-store: get/setReasoningEffort (own key space, mirrors cwd)
- session-orchestrator: resolveReasoningEffort (override -> stored -> 'high'), StartTurnInput.reasoningEffort, warm() parity (cache-safe)
- transport-http: /chat validation (400 on bad level) + GET/PUT /conversations/:id/reasoning-effort
- transport-ws: chat.send threading + validation
- cli: --effort <low|medium|high|xhigh|max>
993 vitest + 189 bun tests green; typecheck + biome clean.
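The resolution order (override -> stored -> 'high') and the bad-level check can be sketched as below. The function name and the five levels come from the message above; the signature and the `isReasoningEffort` guard are assumptions for illustration.

```typescript
// The five levels listed for the cli --effort flag.
const EFFORT_LEVELS = ["low", "medium", "high", "xhigh", "max"] as const;
type ReasoningEffort = (typeof EFFORT_LEVELS)[number];

// Hypothetical validation guard: a transport could answer 400 when this is false.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && (EFFORT_LEVELS as readonly string[]).includes(value);
}

// Per-turn override wins, then the stored per-conversation value, then 'high'.
function resolveReasoningEffort(
  override?: ReasoningEffort,
  stored?: ReasoningEffort,
): ReasoningEffort {
  return override ?? stored ?? "high";
}
```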
2026-06-12 | feat(contracts): reasoning effort - ReasoningEffort ladder (low..max), ProviderStreamOptions/ChatRequest fields, per-conversation GET/PUT types | Adam Malczewski
wire 0.6.1->0.7.0, transport-contract 0.10.0->0.11.0. Additive only; typecheck+biome clean.
2026-06-12 | feat(history): CR-5 windowed reads - ?limit= / ?beforeSeq= on GET /conversations/:id | Adam Malczewski
Selection sinceSeq < seq < beforeSeq; newest-limit window, ascending; positive-integer validation (400, store never sees an invalid window); 1-based gap-free seq codified as the contractual has-older mechanism (no earliestSeq field). transport-contract 0.9.0->0.10.0, wire 0.6.0->0.6.1 (doc-only). conversation-store +8 tests, transport-http +20; 935 vitest + 112 bun green. Live-verified: 6/6 probe checks OK. FE courier: frontend-history-windowing-handoff.md
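A minimal sketch of the window selection described above, over an in-memory array. `selectWindow` and `hasOlder` are hypothetical names. Reading "has-older" as "the first returned seq is greater than 1" is an interpretation of the gap-free 1-based rule, not something the log states verbatim.

```typescript
interface Chunk {
  seq: number; // 1-based, gap-free within a conversation
}

interface WindowOpts {
  sinceSeq?: number;  // exclusive lower bound
  beforeSeq?: number; // exclusive upper bound
  limit?: number;     // assumed already validated as a positive integer
}

// Keep sinceSeq < seq < beforeSeq, then take the newest `limit`, ascending.
function selectWindow<T extends Chunk>(chunks: readonly T[], opts: WindowOpts = {}): T[] {
  const since = opts.sinceSeq ?? 0;
  const before = opts.beforeSeq ?? Number.POSITIVE_INFINITY;
  const inRange = chunks
    .filter((c) => c.seq > since && c.seq < before)
    .sort((a, b) => a.seq - b.seq);
  return opts.limit === undefined ? inRange : inRange.slice(-opts.limit);
}

// With gap-free 1-based seqs, older history exists exactly when the window
// does not start at seq 1, so no earliestSeq field is needed.
function hasOlder(window: readonly Chunk[]): boolean {
  const first = window[0];
  return first !== undefined && first.seq > 1;
}
```

A client paging backwards would pass the first seq of its current window as the next `beforeSeq` until `hasOlder` turns false.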
2026-06-12 | feat(cache-warming): lifecycle CR-4 - default-off, fresh nextWarmAt, conversation close (+CR-1 table, CR-2 scope) | Adam Malczewski
CR-4a: warming defaults OFF (opt-in per conversation); re-enabling restores the persisted interval.
CR-4b: re-arm BEFORE surface notify so post-warm updates carry the FUTURE nextWarmAt; turnSettled/turnStarted now also push (fresh schedule after seal, null while generating).
CR-4c: POST /conversations/:id/close, with a per-turn AbortController wired to the kernel runTurn signal (partial persist + normal seal, done.reason "aborted"), new conversationClosed hook, cache-warming disables sync + persists OFF. Disconnect/chat.unsubscribe semantics unchanged.
CR-4d: no change needed; initial surface echo already at HEAD (stale up2 boot on the FE probe).
CR-1: loaded-extensions emits a single custom rendererId:"table" field (TablePayload exported; Name|Version|Trust|Activation, all trust tiers).
CR-2: SurfaceCatalogEntry.scope?: "global"|"conversation" on both surfaces.
Contracts: ui-contract 0.1.0→0.2.0, transport-contract 0.8.0→0.9.0 (additive). 907 tests pass (+13); live-verified against bin/up (warms @5s with future nextWarmAt; mid-turn close → abortedTurn:true + done.reason aborted). Courier: frontend-cache-warming-lifecycle-handoff.md.
2026-06-12 | fix(turns): emit user prompt on the turn event stream (CR-3) | Adam Malczewski
A pure watcher (subscribed but not the sender) couldn't see the user prompt until the turn sealed: the user message was only persisted at seal and never entered the live/replayable stream. Add an additive TurnInputEvent {type:"user-message", conversationId, turnId, text} to the AgentEvent union and emit it via the broadcast/buffer path as the first event of every turn, so it is replayed to all subscribers (live + late-join) and on the HTTP path. Persistence and metrics unchanged; the union widening breaks no exhaustive switch. - @dispatch/wire 0.5.0->0.6.0; @dispatch/transport-contract 0.7.0->0.8.0 (re-export) - session-orchestrator: emit user-message at runTurnDetached start; +3 tests, 3 Wave-1 tests updated (user-message precedes turn-start) - FE courier: frontend-cr3-user-message-handoff.md Live-verified vs flash: watcher receives user-message (correct text) as its first chat.delta before turn-sealed. 894 vitest + transport bun green; tsc -b EXIT 0.
2026-06-12 | feat(turns): detached turns + multi-client live view | Adam Malczewski
A turn no longer dies when its WebSocket connection closes. The turn-broadcast hub moves into the core (session-orchestrator): turns run detached, persist at seal regardless of clients, and fan out AgentEvents to N subscribers per conversation with in-flight buffer replay for late-joiners. transport-ws stops aborting turns on socket close and gains chat.subscribe/chat.unsubscribe so a second device (or a reloaded browser) can watch a running turn. - @dispatch/transport-contract 0.6.0->0.7.0: chat.subscribe/chat.unsubscribe WS ops - session-orchestrator: startTurn/subscribe/isActive; persistent subscribers + per-turn buffer (two-map model); handleMessage = convenience wrapper (no signal) - transport-ws: per-connection chat-subscription fan-out; no turn-abort-on-close - transport-http: test fakes updated for the widened interface (runtime unchanged) - design notes/turn-continuity-design.md; FE courier frontend-turn-continuity-handoff.md Live-verified vs flash (2-client WS): sender disconnect mid-turn -> other client streams to done + turn persists; late-join replays turn from turn-start. 891 vitest + transport bun green; tsc -b EXIT 0; biome clean.
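The two-map model (persistent subscribers plus a per-turn buffer replayed to late joiners) might look roughly like this. `TurnHub` and its method names are hypothetical; the real session-orchestrator API (startTurn/subscribe/isActive) is only named in the message, not shown.

```typescript
type Listener<E> = (event: E) => void;

// Sketch only: one map of long-lived subscribers, one map of in-flight buffers.
class TurnHub<E> {
  private subscribers = new Map<string, Set<Listener<E>>>();
  private buffers = new Map<string, E[]>();

  // A turn starts with an empty replay buffer, independent of any client.
  startTurn(conversationId: string): void {
    this.buffers.set(conversationId, []);
  }

  isActive(conversationId: string): boolean {
    return this.buffers.has(conversationId);
  }

  // Buffer the event for late joiners, then fan out to current subscribers.
  emit(conversationId: string, event: E): void {
    this.buffers.get(conversationId)?.push(event);
    this.subscribers.get(conversationId)?.forEach((fn) => fn(event));
  }

  // Late joiners first receive the in-flight buffer, then live events.
  subscribe(conversationId: string, listener: Listener<E>): () => void {
    (this.buffers.get(conversationId) ?? []).forEach((e) => listener(e));
    const set = this.subscribers.get(conversationId) ?? new Set<Listener<E>>();
    set.add(listener);
    this.subscribers.set(conversationId, set);
    return () => {
      set.delete(listener);
    };
  }

  // Sealing drops the buffer; subscribers persist for the next turn.
  sealTurn(conversationId: string): void {
    this.buffers.delete(conversationId);
  }
}
```

Because the buffer lives in the hub rather than on a socket, closing the sender's connection only removes one listener; the turn keeps emitting and can still seal.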
2026-06-12 | docs(tasks): add context window limit to roadmap (open items) | Adam Malczewski
2026-06-12 | docs(tasks): record live-verify of context size against flash | Adam Malczewski
2026-06-12 | feat(metrics): expose current context size to the frontend | Adam Malczewski
contextSize = the turn's FINAL step inputTokens+outputTokens (true context occupancy; NOT the aggregate usage, which sums per-step prompts and overcounts multi-step turns). Stamped on both the live done event (kernel) and persisted TurnMetrics (session-orchestrator); a client reads the latest turn's value. - @dispatch/wire 0.4.0->0.5.0: optional contextSize on TurnDoneEvent + TurnMetrics - @dispatch/transport-contract 0.5.0->0.6.0 (re-export only) - glossary: context size (reserve 'context window' for the model limit, later) - FE courier: frontend-context-size-handoff.md 881 vitest pass; tsc -b EXIT 0; biome clean.
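To make the difference between context occupancy and aggregate usage concrete, here is a small sketch. The `StepUsage` shape and the function names are assumptions; only the formula (final step inputTokens + outputTokens) comes from the message above.

```typescript
interface StepUsage {
  inputTokens: number;
  outputTokens: number;
}

// Context occupancy: the FINAL step's prompt plus its output.
function contextSize(steps: readonly StepUsage[]): number | undefined {
  const last = steps[steps.length - 1];
  return last === undefined ? undefined : last.inputTokens + last.outputTokens;
}

// Aggregate usage: sums every step, so each step's prompt is counted again.
function aggregateTokens(steps: readonly StepUsage[]): number {
  return steps.reduce((sum, s) => sum + s.inputTokens + s.outputTokens, 0);
}
```

For a two-step turn with usage (1000 in, 50 out) and then (1100 in, 40 out), the context size is 1140 while the aggregate is 2190, because the second prompt already contains most of the first.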
2026-06-11 | feat(lsp,cwd): LSP integration + per-conversation cwd; fix cache-warming cache bust | Adam Malczewski
LSP + per-conversation CWD feature:
- new bundled `lsp` extension: hand-rolled JSON-RPC codec (framing/rpc), lazy one-server-per-(serverID,root), per-cwd config resolution, on-demand `lsp` tool
- `conversation-store`: getCwd/setCwd (cwdKey); `session-orchestrator` defaults a turn's cwd from the store
- `transport-http`: cwd + lsp status endpoints; wire types in transport-contract
- host-bin: register lsp; config wiring
Cache-warming fix (the warm read 0% on the first reheat after a message):
- warm assembled tools under a different cwd than the real turn (a reheat sends no cwd, and the warm service had no store fallback). The skills filter rewrites the cwd-sensitive `load_skill` description, so the tools block (the first bytes of the prompt-cache prefix) diverged and the cache missed entirely. Warm now resolves cwd as opts.cwd ?? conversationStore.getCwd(), mirroring handleMessage.
- capture warm sends as `provider.request` spans flagged `warm:true` (thread a child logger into providerOpts) so warm vs real bodies are diffable (obs §3.1).
- kernel logger: span-close now merges child-bound attrs like span-open, so a `warm:true` query finds the closed span (with usage/status), not just the open.
Tests: warm forwards a warm-flagged logger; warm falls back to stored cwd; logger open/close attr consistency. Full suite green (873).
2026-06-11 | docs: CR-3 resolution courier (timer field + manual-warm reset) + tasks | Adam Malczewski
2026-06-11 | feat(cache-warming): CR-3 - manual warm resets timer + nextWarmAt/lastWarmAt surface | Adam Malczewski
FE CR-3 (backend-handoff-cache-warming-timer.md). The inversion: session-orchestrator's warm() (the single chokepoint for manual /chat/warm AND the automatic timer) emits a warmCompleted bus event; cache-warming subscribes and does ALL post-warm handling. So a manual warm now re-arms the timer + refreshes the surface with NO transport-http change (core can't depend on the standard cache-warming ext).
- session-orchestrator: warmCompleted event hook + emit from warm() on success
- cache-warming: warmCompleted subscriber unifies result handling (manual + automatic); adds nextWarmAt/lastWarmAt state + a custom 'cache-warming-timer' surface field
- fix: createWarmService was missing the emit dep (deps.emit?. silently no-oped) → wired it + made emit REQUIRED so it can't regress
Live-verified vs claude haiku: manual POST /chat/warm now logs cache-warming 'warm complete' ~2s after the turn (not the 4-min timer) → manual warm reaches the warmer. 800 vitest + 109 bun green; tsc -b 0; biome clean.
2026-06-11 | docs(handoff): prune cache-warming FE handoff to what's unconsumed | Adam Malczewski
Per the FE's backend-handoff.md (2026-06-11) the frontend shipped the NumberField renderer, conversation-scoped subscriptions, the Cache Warming view, and warmNow(). Removed those sections; kept only the new cache-rate fix + expectedCacheRate (retention) metric the FE has not yet consumed.
2026-06-11 | fix(cache-warming): accurate cache rate + expectedCacheRate (retention) metric | Adam Malczewski
The Claude cache % read 100% whenever anything was cached, because the metric's denominator (inputTokens) excluded cached tokens on Anthropic. Fixed upstream in ../claude/provider-anthropic (inputTokens = total prompt); this commit adds the companion retention metric and exposes it: - transport-contract: WarmResponse += expectedCacheRate - transport-http: POST /chat/warm returns expectedCacheRate = cacheRead/(cacheRead+cacheWrite) - cache-warming: computeExpectedCacheRate + a per-conversation 'cache retention' surface stat - handoff: documents the fix + cache-rate vs expected-cache (cross-turn) for the FE Live-verified vs claude haiku: real turn cache rate 61% (was inflated 100%); warm within TTL expectedCacheRate=100%, after expiry=0%.
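The retention formula expectedCacheRate = cacheRead/(cacheRead+cacheWrite) can be sketched as follows. `computeExpectedCacheRate` is named in the message, but its real signature is not shown. Returning a 0..1 fraction and returning 0 for an empty denominator are both assumptions of this sketch.

```typescript
// Sketch under stated assumptions: fraction in [0, 1]; 0 when nothing was
// read from or written to the cache (avoids a 0/0 NaN).
function computeExpectedCacheRate(cacheReadTokens: number, cacheWriteTokens: number): number {
  const total = cacheReadTokens + cacheWriteTokens;
  return total > 0 ? cacheReadTokens / total : 0;
}
```

A warm inside the TTL reads everything and writes nothing, giving 1 (reported as 100%); after expiry the prefix is rewritten from scratch, giving 0.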
2026-06-11 | docs(handoff): FE courier for cache-warming controls + surface protocol | Adam Malczewski
NumberField render, conversationId on the surface WS protocol, the cache-warming control surface (toggle/interval/last-%), and POST /chat/warm.
2026-06-11 | feat(surfaces): NumberField + per-conversation surface scoping; cache-warming controls | Adam Malczewski
Extend the surface framework so cache-warming exposes per-conversation controls:
- ui-contract: add NumberField (settable free-value numeric) to SurfaceField; add optional conversationId to subscribe/unsubscribe/invoke + surface/update
- surface-registry: SurfaceContext { conversationId? } on getSpec/invoke (backward-compatible)
- transport-ws: thread conversationId; key subscriptions by (surfaceId, conversationId); tag surface/update replies with conversationId
- cache-warming: per-conversation surface with Toggle(enabled) + Number(interval seconds, cache-warming/set-interval) + Stat(last cache %); drop the currentConversationId closure
Global surfaces (surface-loaded-extensions) unchanged. 784 vitest + 109 bun = 893 tests; tsc -b EXIT 0; biome clean.
2026-06-11 | feat(cache-warming): manual POST /chat/warm trigger endpoint | Adam Malczewski
A frontend 'warm now' button (and fast tests) can trigger a warm on demand instead of waiting for the automatic timer. - transport-contract: WarmRequest / WarmResponse wire types - transport-http: POST /chat/warm → cacheWarmHandle.warm(); 200 with cachePct, 409 when the conversation is generating, 400 on missing conversationId Live-verified vs claude haiku: seed turn cacheWrite=6799 → POST /chat/warm returns cacheReadTokens=6799 cachePct=100 (100% hit). 760 vitest + 109 bun green.
2026-06-11 | feat(cache-warming): per-conversation prompt-cache warming + warm() service | Adam Malczewski
Backend-driven warming targeting whatever provider a conversation uses (incl. the external Claude provider-anthropic). Core engine + on/off + last-cache-% done; interval-as-view-control pending a ui-contract NumberField (surface-system gap). Mechanism: - kernel: expose HostAPI.emit (typed bus event emit; counterpart of on) - session-orchestrator: turnStarted/turnSettled event hooks (conversationId/cwd/model); warm() service (cacheWarmHandle) reusing the real-turn assembly (byte-identical prefix, provider-agnostic), refuses mid-turn, never persists/emits, returns Usage - cache-warming (new ext): per-conversation timers (arm on settle, cancel on start, in-flight invalidation), calls warm(), pct=round(clamp(cacheRead/input,0,1)*100), persists {enabled,intervalMs} (default on/240s), registers a controls surface - host-bin: register cache-warming; transport-http: HostAPI stub +emit (fan-out) Honors old-code invariants. 760 vitest + 109 bun = 869 tests; tsc -b EXIT 0; biome clean.
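The percentage formula quoted above, pct=round(clamp(cacheRead/input,0,1)*100), can be written out directly. The function names are hypothetical, and treating a zero input count as 0% is an assumption to avoid NaN.

```typescript
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// pct = round(clamp(cacheRead / input, 0, 1) * 100)
function cachePct(cacheReadTokens: number, inputTokens: number): number {
  if (inputTokens <= 0) return 0; // assumption: no prompt tokens means 0%
  return Math.round(clamp(cacheReadTokens / inputTokens, 0, 1) * 100);
}
```

The clamp matters when a provider reports inputTokens without the cached portion, in which case the raw ratio can exceed 1.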
2026-06-10 | feat(skills): skill system + load_skill tool via per-turn tools filter | Adam Malczewski
Skills are markdown in .skills/ dirs (~/.skills + <cwd>/.skills, cwd shadows home; name = filename). Format: line1 summary, line2 ---, body line3+; load strips the first two lines; malformed = no summary but still loadable. Mechanism (first use of the context-assembly filter chain, §3.2): - kernel: expose HostAPI.applyFilters (delegates to bus.applyFilters) - session-orchestrator: define/export toolsFilter + ToolAssembly; apply once per turn before runTurn (cache-stable across steps), threading cwd + conversationId - skills (new ext): pure parse/merge/render + load_skill tool (live read, path-contained) + a toolsFilter filter rewriting load_skill's description + name enum per cwd - host-bin: register skills in CORE_EXTENSIONS - transport-http: fix HostAPI test stub for the new applyFilters method (fan-out) 734 vitest + 109 bun = 843 tests; tsc -b EXIT 0; biome clean; clean live boot.
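The skill file format (line 1 summary, line 2 `---`, body from line 3) can be parsed with a few lines. `parseSkill` and `ParsedSkill` are hypothetical names. Keeping the whole text as the body for a malformed file is an assumption of this sketch; the message only says such a file has no summary but is still loadable.

```typescript
interface ParsedSkill {
  summary?: string; // absent for malformed files
  body: string;
}

function parseSkill(text: string): ParsedSkill {
  const lines = text.split("\n");
  // Well-formed: a summary line followed by a `---` separator line.
  if (lines.length >= 2 && lines[1]?.trim() === "---") {
    return {
      summary: (lines[0] ?? "").trim(),
      body: lines.slice(2).join("\n"), // load strips the first two lines
    };
  }
  // Malformed: no summary; assumption: the full text is still usable as body.
  return { body: text };
}
```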
2026-06-10 | feat(tools): add run_shell, edit_file, write_file + read_file directory listing | Adam Malczewski
Four standard-tier tool extensions (one tool per extension, zero ABI change): - tool-read-file: read_file now lists directory contents (sorted, /-suffixed subdirs) - tool-shell: run_shell (foreground, streamed, cancellable, cwd, timeout + output cap) - tool-edit-file: edit_file (oldString/newString/replaceAll; errors on absent/non-unique) - tool-write-file: write_file (explicit overwrite flag) Registered in host-bin CORE_EXTENSIONS. Live boot clean (shell capability accepted). 686 vitest + 89 bun = 775 tests; tsc -b EXIT 0; biome clean.
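The edit_file contract (oldString/newString/replaceAll, erroring on absent or non-unique matches) can be illustrated on an in-memory string. `applyEdit` is a hypothetical pure helper, not the tool's actual code, and rejecting an empty oldString is an added assumption.

```typescript
function applyEdit(
  content: string,
  oldString: string,
  newString: string,
  replaceAll = false,
): string {
  if (oldString === "") throw new Error("oldString must not be empty"); // assumption
  // Count non-overlapping occurrences.
  const parts = content.split(oldString);
  const count = parts.length - 1;
  if (count === 0) throw new Error("oldString not found");
  if (count > 1 && !replaceAll) {
    throw new Error(`oldString is not unique (${count} matches); pass replaceAll`);
  }
  // Joining the split parts replaces every occurrence literally, with no
  // special handling of `$` sequences in newString.
  return parts.join(newString);
}
```

Failing on a non-unique match forces the caller to supply enough surrounding context to pin down a single location, rather than silently editing the first hit.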
2026-06-10 | trace-store: fix old-schema migration crash (found by live boot) | Adam Malczewski
Wave 1 created idx_records_bodyHash BEFORE migrateOldBodies ran, so opening a pre-existing old-schema traces.db crashed the collector with 'no such column: bodyHash' (crash-looped 168x in ~20s). Fresh DBs hid it (CREATE TABLE already has bodyHash); only a real old-schema DB exposed it. - reorder schema(): migrateOldBodies (ALTER ADD bodyHash + content-address backfill + drop old bodies) runs BEFORE the bodyHash index. - add 3 regression tests that seed a real old-schema DB and open it. Live-verified: old-schema traces.db migrates on boot with 0 crashes; 318 body refs collapse to 270 content-addressed bodies; prune cadence fires cleanly. typecheck EXIT 0; biome clean; bun 106->109, 0 fail.
2026-06-10 | observability-collector: drive trace-store prune on a cadence | Adam Malczewski
Wave 2 (final) of the dedup/storage-growth milestone (notes §12). - pure shouldPrune(now,lastPruneAt,intervalMs) cadence helper (injected clock). - main.ts calls store.prune(DEFAULT_RETENTION) on a coarse cadence (--prune-interval-ms, default 60s; host-bin-overridable), far less frequent than a drain. Prune errors are logged and never stop the tail loop. - confirmed body inserts flow through trace-store's content-addressed path. - glossary: content-addressed body, trace retention, prefix fingerprint, warm vs real. typecheck EXIT 0; biome clean; vitest 576; bun 100->106, 0 fail.
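A pure cadence helper with the shouldPrune(now,lastPruneAt,intervalMs) parameter list named above might look like this. The body is a guess at the semantics: treating "never pruned" as due is an assumption.

```typescript
// Pure: the clock is injected as `now`, so tests need no timers.
function shouldPrune(
  now: number,
  lastPruneAt: number | undefined,
  intervalMs: number,
): boolean {
  if (lastPruneAt === undefined) return true; // assumption: first pass prunes
  return now - lastPruneAt >= intervalMs;
}
```

A tail loop could call this on every drain and only invoke the store's prune when it returns true, which keeps pruning far less frequent than draining.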
2026-06-10 | trace-store: content-addressed body dedup + retention/prune | Adam Malczewski
Wave 1 of the dedup/storage-growth milestone (notes §12). - bodies table is now content-addressed (SHA-256 hash key); identical verbatim bodies (cache-warming resends, any repeat) collapse to one stored row, referenced by hash from records. Transparent to insert/read callers. - at-rest gzip compression for bodies >1 KiB (node:zlib), decompressed on read. - prune(policy): age-based delete + drop-oldest byte-cap eviction + orphan-body GC. Exports RetentionPolicy/PruneSummary/DEFAULT_RETENTION (7d / 256 MiB). typecheck EXIT 0; biome clean; vitest 576; bun 89->100, 0 fail.
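Content addressing plus at-rest gzip can be sketched with an in-memory map standing in for the bodies table. `BodyStore` is a hypothetical class; the real store is SQLite-backed and its schema is not shown here. Only the SHA-256 key and the 1 KiB compression threshold come from the message.

```typescript
import { createHash } from "node:crypto";
import { gunzipSync, gzipSync } from "node:zlib";

const COMPRESS_THRESHOLD_BYTES = 1024; // bodies larger than 1 KiB are gzipped

interface StoredBody {
  gzip: boolean;
  data: Buffer;
}

class BodyStore {
  private bodies = new Map<string, StoredBody>();

  // Returns the SHA-256 hex digest; identical bodies collapse to one row.
  put(body: string): string {
    const raw = Buffer.from(body, "utf8");
    const hash = createHash("sha256").update(raw).digest("hex");
    if (!this.bodies.has(hash)) {
      const gzip = raw.byteLength > COMPRESS_THRESHOLD_BYTES;
      this.bodies.set(hash, { gzip, data: gzip ? gzipSync(raw) : raw });
    }
    return hash;
  }

  // Decompression is transparent to the reader.
  get(hash: string): string | undefined {
    const row = this.bodies.get(hash);
    if (row === undefined) return undefined;
    return (row.gzip ? gunzipSync(row.data) : row.data).toString("utf8");
  }

  get size(): number {
    return this.bodies.size;
  }
}
```

Records would then hold only the hash, so a cache-warming resend of an identical body adds a reference but no new stored bytes.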
2026-06-10 | feat: per-model throughput (tok/s) tracking + metrics endpoint | Adam Malczewski
New throughput-store extension records one token-weighted sample per turn (model, output tokens, pure generation time = Σ step genTotalMs) into a day-bucketed KV store, and aggregates per-model tok/s = Σtokens / Σgen-seconds over a day/week/month (server-local boundaries; week = ISO Mon–Sun). transport-http records a sample per turn (logged) and serves GET /metrics/throughput?period=day|week|month&date=<...>. The response is typed as transport-contract's ThroughputResponse, so store/wire drift is a compile error. Pure period + aggregate logic fully unit-tested.
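The token-weighted aggregate, tok/s = Σtokens / Σgen-seconds, differs from averaging per-turn rates. The sketch below shows it with a hypothetical `Sample` shape and function name; day bucketing and period boundaries are left out.

```typescript
interface Sample {
  model: string;
  outputTokens: number;
  genMs: number; // pure generation time for the turn
}

// Per-model tok/s = sum(tokens) / sum(generation seconds).
function tokensPerSecond(samples: readonly Sample[]): Map<string, number> {
  const totals = new Map<string, { tokens: number; ms: number }>();
  samples.forEach((s) => {
    const t = totals.get(s.model) ?? { tokens: 0, ms: 0 };
    t.tokens += s.outputTokens;
    t.ms += s.genMs;
    totals.set(s.model, t);
  });
  const rates = new Map<string, number>();
  totals.forEach((t, model) => {
    // Skip models with no measured generation time to avoid dividing by zero.
    if (t.ms > 0) rates.set(model, t.tokens / (t.ms / 1000));
  });
  return rates;
}
```

With one turn of 100 tokens in 2 s and another of 300 tokens in 1 s, the weighted rate is 400/3 tok/s, whereas averaging the two per-turn rates (50 and 300) would give 175.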
2026-06-10 | kernel/run-turn: thread providerOpts (model) into provider.stream | Adam Malczewski
executeStep built the stream opts with only the logger, so providerOpts.model (the selected model) never reached any provider — each fell back to its own default. Carry providerOpts through StepContext into the per-step stream opts, plus a regression test asserting the model is forwarded.
2026-06-10 | host-bin: external-extension loader + claude credential wiring | Adam Malczewski
Add loadExternalExtensions(): fault-isolated dynamic import of out-of-repo extensions declared via DISPATCH_EXTERNAL_EXTENSIONS. main.ts assembles the credential-store in boot() so a 'claude' credential is registered when an external anthropic provider is loaded; config.ts surfaces the anthropic model / credential-key settings those extensions read.
2026-06-10 | docs: FE cache hit/miss + percentage calculation handoff | Adam Malczewski
Calculation-only courier doc for ../dispatch-web: Usage field semantics, hitRate = cacheReadTokens/inputTokens, where to source it (usage/done events + GET /conversations/:id/metrics, per-step/turn/cumulative), live accumulate + done.usage reconcile, replay seeding, and the cacheWriteTokens-absent caveat. No backend change required; UI design left to the FE.
2026-06-10 | feat(conversation-store): reconcile.repair span (logging-audit #1) | Adam Malczewski
Load-time history repair was invisible (createConversationStore got no logger). Now: optional logger injected (extension passes host.logger); reconcile logic moved into pure reconcileWithReport() returning a ReconcileReport (reconcile() stays a thin byte-identical wrapper); load() emits a reconcile.repair span (childed with conversationId, flat attrs repairedCount/firstRepairedToolCallId) ONLY when a real repair occurs. No contract fan-out (factory is package-internal). typecheck EXIT 0, biome clean, 550 vitest (+4) + 89 bun.
2026-06-10 | docs(metrics): FE Pass-2 courier handoff + mark live-verified | Adam Malczewski
GET /conversations/:id/metrics verified end-to-end against flash (live stream metrics byte-match the persisted TurnMetrics; journal turn/step spans carry dotted usage.* incl. cacheReadTokens). Handoff doc for the user to courier the wire/transport-contract 0.4.0 delta to ../dispatch-web (ORCHESTRATOR §7).
2026-06-10 | docs(tasks): prune stale milestone history to a lean current-state doc | Adam Malczewski
tasks.md had accreted 635 lines of blow-by-blow milestone narration; that history lives in git. Collapse completed milestones into a compact summary, keep operational/open/roadmap, record the metrics milestone (incl. Pass 2). Removes the stale 'regen FE .reference.md' practice notes that contradicted ORCHESTRATOR §7 (cross-repo changes are couriered via the user; the backend does not write the FE repo); §7 stands as the single source of truth.
2026-06-10 | feat(metrics): durable per-turn/step token+timing metrics (observability spans + persisted replay) | Adam Malczewski
Two-part token-data improvement:
#2 Observability spans (kernel run-turn): turn & step span-close now stamp ALL four Usage fields: added usage.cacheReadTokens/cacheWriteTokens (were silently dropped) and normalized usage_* -> usage.* to match the provider.request span (consistent D9 GROUP BY). No contract change.
#3 Persisted replay metrics (conversation-store + read endpoint): new StepMetrics/TurnMetrics wire types; conversation-store persists per-turn metrics in a separate key space (appendMetrics/loadMetrics, turn-append order); session-orchestrator accumulates per-step+turn metrics from the event stream (pure metrics.ts) and persists after seal; transport-http serves GET /conversations/:id/metrics -> ConversationMetricsResponse.
Contracts: @dispatch/wire + @dispatch/transport-contract bumped 0.3.0->0.4.0 (additive). GLOSSARY: turn metrics / step metrics. typecheck EXIT 0, biome clean, 546 vitest + 89 bun = 635 tests.
2026-06-07 | docs(tasks): record live metrics Pass 1 done + live-verified | Adam Malczewski
2026-06-07 | feat(wire,kernel,session-orchestrator): live turn metrics on the stream | Adam Malczewski
Expose the backend's authoritative token+timing metrics on the live AgentEvent stream (observability-only -> now also client-facing). All additive/optional. - [email protected]: new TurnStepCompleteEvent (type:step-complete) with per-step ttftMs/decodeMs/genTotalMs; usage += stepId; tool-result += durationMs (exec); done += durationMs (turn wall-clock) + usage (turn total). RunTurnInput += now?. [email protected] (re-export bump). - kernel-runtime: when now injected, measures + emits the above (reuses the ttft/decode first-token detection); omits timing gracefully without a clock. - session-orchestrator: adds now? to deps, threads into RunTurnInput; extension activate injects () => Date.now(). - transport/cli/host-bin: untouched (verbatim pass-through; additive fields). FE handoff: frontend-metrics-handoff.md. typecheck clean; 520 vitest + 89 bun; biome 0/0. Replay/persistence = deferred Pass 2 (documented in tasks.md).
2026-06-07 | docs(tasks): record per-step TTFT+decode timing done + live-verified | Adam Malczewski
2026-06-07 | feat(kernel-runtime): per-step TTFT + decode timing spans (observability) | Adam Malczewski
Split each step's generation into a ttft span (stream start -> first text|reasoning token) and a decode span (first token -> stream end), children of the step span. decode = generation total - TTFT; both retrievable from the trace-store. First token counts reasoning deltas; a step with no content token ends ttft with firstToken:false (no misleading decode). Span-based (no clock injection), no wire/contract change. +3 runtime tests. GLOSSARY: TTFT + decode time. typecheck clean; 512 vitest; biome 0/0.
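The split itself is simple arithmetic: decode = generation total - TTFT. The sketch below uses raw timestamps for clarity, whereas the commit derives the durations from spans, so the function and its parameters are purely illustrative.

```typescript
interface StepTiming {
  ttftMs: number;
  decodeMs?: number; // omitted when no content token arrived
  firstToken: boolean;
}

// startMs: stream start; firstTokenMs: first text or reasoning token, if any;
// endMs: stream end. All names are hypothetical.
function splitTiming(
  startMs: number,
  firstTokenMs: number | undefined,
  endMs: number,
): StepTiming {
  if (firstTokenMs === undefined) {
    // No content token: the ttft phase runs to stream end, and no decode
    // duration is reported, so nothing misleading is recorded.
    return { ttftMs: endMs - startMs, firstToken: false };
  }
  return {
    ttftMs: firstTokenMs - startMs,
    decodeMs: endMs - firstTokenMs,
    firstToken: true,
  };
}
```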
2026-06-07 | docs(tasks): record stepId step-grouping done + live-verified | Adam Malczewski
2026-06-07 | feat(wire,kernel,conversation-store): step grouping via stepId for batched tool calls | Adam Malczewski
Expose a per-step grouping key so a client can render a model's batched/parallel tool calls (those emitted in one step) as one unit, on both the live stream and replayed history. Key = branded StepId, derived turnId#stepIndex (0-based).
- [email protected]: required stepId on Turn{Tool,ToolResult}Event; optional stepId on Tool{Call,Result}Chunk (generation provenance on the chunk, not the StoredChunk envelope; StoredChunk unchanged). [email protected] (re-export bump).
- kernel-runtime: mint stepId per step; stamp on tool chunks + tool events.
- conversation-store: chunk-carried stepId round-trips append/load/loadSince for free; reconcile copies it onto synthesized (interrupted) results.
- cli: stepId added to event test fixtures (renderer unchanged).
typecheck clean; 509 vitest + 89 bun; biome 0/0. FE courier reply + reference snapshots regenerated in ../dispatch-web.
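A branded StepId derived as turnId#stepIndex, and the client-side grouping it enables, can be sketched like this. The brand shape and the helper names are assumptions; only the derivation rule comes from the message.

```typescript
// Branded string: structurally a string, but not assignable from a plain one.
type StepId = string & { readonly __brand: "StepId" };

// stepIndex is 0-based within the turn.
function makeStepId(turnId: string, stepIndex: number): StepId {
  return `${turnId}#${stepIndex}` as StepId;
}

// Group tool events that were emitted in the same step, preserving order.
function groupByStep<T extends { stepId: StepId }>(events: readonly T[]): Map<StepId, T[]> {
  const groups = new Map<StepId, T[]>();
  events.forEach((event) => {
    const group = groups.get(event.stepId) ?? [];
    group.push(event);
    groups.set(event.stepId, group);
  });
  return groups;
}
```

A renderer could then draw each map entry as one unit, so two tool calls batched in step 0 appear together rather than as unrelated rows.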
2026-06-07 | docs(harness): biome-clean rule + parallel-wave orchestration | Adam Malczewski
- add .dispatch/rules/biome-clean.md (0 warnings/0 infos; no `!`/useLiteralKeys), wired into the every-agent scoping map + canonical invocation - package-agent: note biome zero-tolerance; verify line clarifies 0 warnings AND 0 infos - ORCHESTRATOR: document parallel-execution waves (2a) + agent-failure recovery patterns (5a) + concurrency caveats
2026-06-06 | docs(tasks): record FE Slice-2 backend handoff resolution (A-E answered) | Adam Malczewski
2026-06-06 | feat(transport-http): wildcard CORS + bump contract pkgs to 0.1.0 (FE Slice 2 handoff) | Adam Malczewski
Unblock the browser frontend (Vite origin :24204 -> HTTP backend :24203):
- transport-http: wildcard CORS via hono/cors on all routes (Access-Control-Allow-Origin: *, Allow-Methods GET/POST/OPTIONS, Allow-Headers Content-Type) + OPTIONS preflight (204). Headers present on the streamed POST /chat NDJSON response too. +4 app.fetch tests.
- wire / transport-contract / ui-contract: 0.0.0 -> 0.1.0 as the FE-consumable baseline (semver convention §2.9: major = cross-repo fan-out signal).
Verified live: OPTIONS /chat -> 204 with CORS headers; GET /models -> 200 with Access-Control-Allow-Origin: *. typecheck clean, 502 vitest + 89 bun, biome clean.