From ee502ba1228fdaec4a15413a973ffce7ca89a0b6 Mon Sep 17 00:00:00 2001
From: Adam Malczewski
Date: Wed, 10 Jun 2026 08:41:26 +0900
Subject: docs(metrics): FE Pass-2 courier handoff + mark live-verified

GET /conversations/:id/metrics verified end-to-end against flash (live
stream metrics byte-match the persisted TurnMetrics; journal turn/step
spans carry dotted usage.* incl. cacheReadTokens). Handoff doc for the
user to courier the wire/transport-contract 0.4.0 delta to
../dispatch-web (ORCHESTRATOR §7).
---
 frontend-metrics-pass2-handoff.md | 67 +++++++++++++++++++++++++++++++++++++++
 tasks.md                          | 11 ++++---
 2 files changed, 73 insertions(+), 5 deletions(-)
 create mode 100644 frontend-metrics-pass2-handoff.md

diff --git a/frontend-metrics-pass2-handoff.md b/frontend-metrics-pass2-handoff.md
new file mode 100644
index 0000000..9019a85
--- /dev/null
+++ b/frontend-metrics-pass2-handoff.md
@@ -0,0 +1,67 @@
+# FE handoff — persisted replay metrics (Pass 2) + metrics endpoint
+
+> **Courier doc** (backend → `../dispatch-web`, via the user). Per ORCHESTRATOR §7
+> the backend does NOT write the FE repo; the FE orchestrator applies this delta
+> on its side (regenerate the in-repo `.dispatch/*.reference.md` snapshots + bump
+> the `file:` dep). `lsp references` does not span the two repos. Backend commit:
+> `6db12ff`.
+
+## Versions
+- `@dispatch/wire` `0.3.0 → 0.4.0` (additive)
+- `@dispatch/transport-contract` `0.3.0 → 0.4.0` (additive)
+
+Pure-type, additive change — no breaking edits to existing types.
+
+## New wire types (`@dispatch/wire`, re-exported by `@dispatch/transport-contract`)
+
+```ts
+interface StepMetrics {
+  stepId: StepId; // `#`, join key to the live stream
+  usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? }
+  ttftMs?: number; // time to first token (optional — clock + first-token gated)
+  decodeMs?: number; // first token → stream end
+  genTotalMs?: number; // stream start → end (== ttftMs + decodeMs when a first token was seen)
+}
+
+interface TurnMetrics {
+  turnId: string; // plain wire turn id, join key to AgentEvents
+  usage: Usage; // aggregate across all steps
+  durationMs?: number; // turn wall-clock (optional — clock gated)
+  steps: readonly StepMetrics[]; // per-step, in step order
+}
+```
+
+These are the **persisted, replayable** counterparts of the live `usage` /
+`step-complete` / `done` events (which remain transient and unchanged).
+
+## New read endpoint
+
+`GET /conversations/:id/metrics` → `ConversationMetricsResponse`:
+
+```ts
+interface ConversationMetricsResponse { turns: readonly TurnMetrics[] }
+```
+
+Semantics:
+- `turns` = every **sealed** turn's `TurnMetrics`, in **turn-append order**.
+- A turn appears only **after seal** (post-persist); an in-flight/unsealed turn is absent.
+- This is a **separate axis** from `GET /conversations/:id?sinceSeq=` (which returns
+  seq-cursor chunk CONTENT). Metrics are keyed per **turn**, not per chunk, so they are
+  **not** seq-filtered — hence a sibling route, not a field on the history response.
+- Unknown / metric-less conversation → `{ turns: [] }`.
+- CORS: same wildcard as the other routes.
+
+## Suggested FE consumption
+On (re)opening a conversation, the chat feature can `GET /conversations/:id/metrics`
+once alongside the history hydrate (`?sinceSeq=`), then render historical
+tokens/latency per turn (and per step via `stepId`) — identical fields to what it
+already routes from the live `step-complete` / `usage` / `done` stream. TPS is
+still derived FE-side (`usage.outputTokens / decodeMs`); context-size proxy =
+`usage.inputTokens`.
+
+## Invariants (confirmed live)
+- Persisted `TurnMetrics.usage` / `durationMs` and each `StepMetrics`
+  (`stepId` + `usage` + `ttftMs`/`decodeMs`/`genTotalMs`) **byte-match** what the
+  live stream emitted for the same turn (verified end-to-end against flash).
+- `stepId` is the SAME value on the live `step-complete`/`usage` events, the persisted
+  `StepMetrics`, and the tool chunks — one grouping key across live + replay.
diff --git a/tasks.md b/tasks.md
index f94aec2..8692551 100644
--- a/tasks.md
+++ b/tasks.md
@@ -61,11 +61,12 @@ server/collector procs poison the next run's counts.
   per-step+turn metrics from the event stream and persists after seal; transport-http
   `GET /conversations/:id/metrics` → `ConversationMetricsResponse`. `@dispatch/wire`
   + `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`.
-- [ ] **FE courier handoff** for the new types + endpoint (in-repo handoff doc;
-  user couriers to `../dispatch-web`; ORCHESTRATOR §7 — backend does not write
-  the FE repo).
-- [ ] **Live re-probe** of `GET /conversations/:id/metrics` end-to-end (first
-  probe came back empty — re-run with a longer boot wait).
+- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done`
+  metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate +
+  per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans
+  carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix).
+- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in
+  this repo; user couriers to `../dispatch-web`; ORCHESTRATOR §7).
 
 ## Open items
 - **logging-audit #1:** conversation-store has no injected logger, so a load-time
-- 
cgit v1.2.3
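
Reviewer note (not part of the patch): the FE-side derivation suggested in the handoff doc could look roughly like the sketch below. It restates the wire types from the doc so that it is self-contained. `StepId` as a string alias, the numeric fields of `Usage`, and the helpers `stepTps`, `contextSizeProxy`, and `indexByTurnId` are assumptions made for illustration, not names from `@dispatch/wire`. The doc gives TPS as `usage.outputTokens / decodeMs`; the sketch assumes TPS means tokens per second and therefore scales by 1000.

```typescript
// Wire types restated from the handoff doc. StepId is ASSUMED to be a string
// alias here; the real definition lives in @dispatch/wire.
type StepId = string;

interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

interface StepMetrics {
  stepId: StepId;
  usage: Usage;
  ttftMs?: number;
  decodeMs?: number;
  genTotalMs?: number;
}

interface TurnMetrics {
  turnId: string;
  usage: Usage;
  durationMs?: number;
  steps: readonly StepMetrics[];
}

interface ConversationMetricsResponse {
  turns: readonly TurnMetrics[];
}

// Hypothetical helper: output tokens per second for one step.
// decodeMs is optional on the wire, so the result is undefined when it is
// missing or not positive (no first token seen, or no clock).
function stepTps(step: StepMetrics): number | undefined {
  if (step.decodeMs === undefined || step.decodeMs <= 0) return undefined;
  return (step.usage.outputTokens * 1000) / step.decodeMs;
}

// Hypothetical helper: the context-size proxy named in the handoff doc.
function contextSizeProxy(metrics: { usage: Usage }): number {
  return metrics.usage.inputTokens;
}

// Hypothetical helper: index sealed turns by turnId so that replayed history
// can look up its metrics. Unsealed turns are simply absent from the map.
function indexByTurnId(
  res: ConversationMetricsResponse,
): Map<string, TurnMetrics> {
  const byTurn = new Map<string, TurnMetrics>();
  for (const turn of res.turns) {
    byTurn.set(turn.turnId, turn);
  }
  return byTurn;
}
```

Because `{ turns: [] }` is the documented response for an unknown or metric-less conversation, `indexByTurnId` returns an empty map in that case and callers need no special branch.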