author Adam Malczewski <[email protected]> 2026-06-07 17:47:58 +0900
committer Adam Malczewski <[email protected]> 2026-06-07 17:47:58 +0900
commit 4248cd1d546a4c1fb4e68940c11b5e309c2c2736
tree 50bbfda39bb1c52f154f44c7656342241527a2ab
parent 7c459c7d919d1e08a228e8abc56129be174d8abe
docs(tasks): record live metrics Pass 1 done + live-verified
-rw-r--r-- tasks.md 31
1 files changed, 16 insertions, 15 deletions
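
The task notes in this commit say TPS is derived on the frontend as `outputTokens/decodeMs` and that the live check confirmed `genTotal == ttft + decode`. A minimal sketch of those two derivations follows. The `StepComplete` interface and both helper names are hypothetical: only the field names `stepId`, `ttftMs`, `decodeMs` and `genTotalMs` come from tasks.md, and the real wire type may differ. The factor of 1000 is an assumption about units: dividing tokens by milliseconds gives tokens per millisecond, so it is scaled to report tok/s.

```typescript
// Hypothetical shape of a `step-complete` payload; field names follow
// tasks.md, everything else here is an assumption for illustration.
interface StepComplete {
  stepId: string;
  ttftMs: number;
  decodeMs: number;
  genTotalMs: number;
}

// Tokens per second from an output-token count and a decode time in ms.
// Returns undefined when decode time is missing, zero, or negative,
// mirroring the "additive/optional" nature of the wire fields.
function deriveTps(outputTokens: number, decodeMs: number | undefined): number | undefined {
  if (decodeMs === undefined || !(decodeMs > 0)) return undefined;
  return (outputTokens * 1000) / decodeMs;
}

// True when total generation time equals TTFT plus decode time,
// within an optional tolerance in milliseconds.
function isTimingConsistent(step: StepComplete, toleranceMs = 0): boolean {
  return Math.abs(step.genTotalMs - (step.ttftMs + step.decodeMs)) <= toleranceMs;
}
```

For example, a step with `ttftMs: 1090` and `decodeMs: 1673` (the span values quoted in the diff context) is consistent only if `genTotalMs` is 2763.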
diff --git a/tasks.md b/tasks.md
index 46849cb..47f6627 100644
--- a/tasks.md
+++ b/tasks.md
@@ -580,7 +580,7 @@ measured** (provider-agnostic, self-consistent); **first token = first text OR r
spans with valid `durationMs` (ttft 1090ms, decode 1673ms) + `firstToken:true`. GLOSSARY: TTFT,
decode time. NOT on the wire (clients don't receive it) — a future wire+FE step if desired.
-### Expose backend metrics to clients (timing/tokens) — IN PROGRESS
+### Expose backend metrics to clients (timing/tokens) — Pass 1 DONE + verified live (`7c459c7`)
User ask: surface the backend's authoritative metrics (tokens, TTFT/decode, TPS, tool-exec +
turn durations) to clients (CLI/web FE). Decisions (user, §5.2):
- **Delivery = inline in the existing chat stream**, as distinct `AgentEvent` types/fields —
@@ -589,7 +589,7 @@ turn durations) to clients (CLI/web FE). Decisions (user, §5.2):
- **Carrier = a new `step-complete` event** (per-step end, ordering-safe timing) + additive fields.
- **Scope = LIVE now**; persisted-for-replay is a documented fast-follow (below).
-#### Pass 1 — LIVE stream metrics [~] IN PROGRESS
+#### Pass 1 — LIVE stream metrics [x] DONE + verified live
Metric set (all wire fields additive/optional): per-step tokens (`usage` += `stepId`), per-step
`ttftMs`/`decodeMs`/`genTotalMs` (new `step-complete` event), tool-exec `durationMs` (`tool-result`),
turn `durationMs` + aggregate `usage` (`done`). TPS derived FE-side (`outputTokens/decodeMs`);
@@ -600,19 +600,20 @@ context-size proxy = `usage.inputTokens` (already present).
number` (additive optional clock — runtime has no clock today; needed to put real numbers on the
wire). **Whole-graph typecheck clean** → new variant breaks NO consumer (no exhaustive switches;
cli/transport unaffected). GLOSSARY: TTFT/decode already added.
-- [ ] **Build wave (2 disjoint owner-agents, parallel, mimo-v2.5-pro):**
- - **kernel-runtime:** when `now` provided, measure + emit timing — `step-complete` per step
- (ttft/decode/genTotal, reusing the just-built first-token detection), `stepId` on `usage`,
- `durationMs` on `tool-result`, `durationMs`+`usage` on `done`. Keep the trace spans (may unify
- span timing with the numeric measurement). Omit timing gracefully when `now` absent. +tests.
- - **session-orchestrator:** add `now?` to `SessionOrchestratorDeps`, thread into the
- `RunTurnInput` it builds; provide `() => Date.now()` from the extension's `activate` (shell
- edge). +test asserting forwarding.
- - **NOT touched (typecheck-confirmed):** transport-http/ws (verbatim pass-through), cli (optional
- fields, non-exhaustive switch), host-bin (orchestrator self-injects the clock in activate).
-- [ ] **Post-wave (orchestrator):** full typecheck/test/biome; bump `wire`+`transport-contract`
- minor (`0.2.0→0.3.0`); regen FE `.dispatch/{wire,transport-contract}.reference.md`; courier reply.
- Live: confirm `step-complete`/durations/`usage.stepId` on the real stream.
+- [x] **Build wave (2 disjoint owner-agents, parallel, mimo-v2.5-pro):**
+ - **kernel-runtime:** emits `step-complete` per step (ttft/decode/genTotal), `stepId` on `usage`,
+ `durationMs` on `tool-result`, `durationMs`+`usage` on `done`; gated on injected `now`; trace
+ spans kept. +6 tests. reports/kernel-runtime.md.
+ - **session-orchestrator:** `now?` on `SessionOrchestratorDeps`, threaded into `RunTurnInput`;
+ extension `activate` injects `() => Date.now()`. +2 tests. reports/session-orchestrator.md.
+ - **NOT touched (typecheck-confirmed):** transport-http/ws, cli, host-bin.
+- [x] **Post-wave (orchestrator):** typecheck EXIT 0, **520 vitest** (512→+8) + 89 bun, biome 0/0,
+ in-lane, no internal mocks. Bumped `wire`+`transport-contract` `0.2.0→0.3.0`; regen FE
+ `.dispatch/{wire,transport-contract}.reference.md`; wrote **`frontend-metrics-handoff.md`** (in
+ arch-rewrite root, for the user to courier — full FE consumption guide). **Live (host-bin :24235,
+ real flash, tool turn):** 2 `step-complete` (genTotal==ttft+decode ✓), both `usage` carry
+ `stepId`, `tool-result.durationMs=3`, `done.durationMs=3294`+turn `usage` (cacheRead 768),
+ derived TPS≈107 tok/s. All green.
#### Pass 2 — DEFERRED: persisted metrics for REPLAY
Today usage/timing are NOT persisted (conversation-store stores chunks only), so reopening a past