| author | Adam Malczewski <[email protected]> | 2026-06-10 08:41:26 +0900 |
|---|---|---|
| committer | Adam Malczewski <[email protected]> | 2026-06-10 08:41:26 +0900 |
| commit | ee502ba1228fdaec4a15413a973ffce7ca89a0b6 (patch) | |
| tree | 8d48f9a509219caf004098dd61375868d7f3fdcd | |
| parent | 18e9f583c695230d59fa03f21c6abf58936eb928 (diff) | |
docs(metrics): FE Pass-2 courier handoff + mark live-verified
GET /conversations/:id/metrics verified end-to-end against flash (live stream
metrics byte-match the persisted TurnMetrics; journal turn/step spans carry
dotted usage.* incl. cacheReadTokens). Handoff doc for the user to courier the
wire/transport-contract 0.4.0 delta to ../dispatch-web (ORCHESTRATOR §7).
| -rw-r--r-- | frontend-metrics-pass2-handoff.md | 67 | ||||
| -rw-r--r-- | tasks.md | 11 |
2 files changed, 73 insertions, 5 deletions
diff --git a/frontend-metrics-pass2-handoff.md b/frontend-metrics-pass2-handoff.md
new file mode 100644
index 0000000..9019a85
--- /dev/null
+++ b/frontend-metrics-pass2-handoff.md
@@ -0,0 +1,67 @@
+# FE handoff — persisted replay metrics (Pass 2) + metrics endpoint
+
+> **Courier doc** (backend → `../dispatch-web`, via the user). Per ORCHESTRATOR §7
+> the backend does NOT write the FE repo; the FE orchestrator applies this delta
+> on its side (regenerate the in-repo `.dispatch/*.reference.md` snapshots + bump
+> the `file:` dep). `lsp references` does not span the two repos. Backend commit:
+> `6db12ff`.
+
+## Versions
+- `@dispatch/wire` `0.3.0 → 0.4.0` (additive)
+- `@dispatch/transport-contract` `0.3.0 → 0.4.0` (additive)
+
+Pure-type, additive change — no breaking edits to existing types.
+
+## New wire types (`@dispatch/wire`, re-exported by `@dispatch/transport-contract`)
+
+```ts
+interface StepMetrics {
+  stepId: StepId; // `<turnId>#<index>`, join key to the live stream
+  usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? }
+  ttftMs?: number; // time to first token (optional — clock + first-token gated)
+  decodeMs?: number; // first token → stream end
+  genTotalMs?: number; // stream start → end (== ttftMs + decodeMs when a first token was seen)
+}
+
+interface TurnMetrics {
+  turnId: string; // plain wire turn id, join key to AgentEvents
+  usage: Usage; // aggregate across all steps
+  durationMs?: number; // turn wall-clock (optional — clock gated)
+  steps: readonly StepMetrics[]; // per-step, in step order
+}
+```
+
+These are the **persisted, replayable** counterparts of the live `usage` /
+`step-complete` / `done` events (which remain transient and unchanged).
+
+## New read endpoint
+
+`GET /conversations/:id/metrics` → `ConversationMetricsResponse`:
+
+```ts
+interface ConversationMetricsResponse { turns: readonly TurnMetrics[] }
+```
+
+Semantics:
+- `turns` = every **sealed** turn's `TurnMetrics`, in **turn-append order**.
+- A turn appears only **after seal** (post-persist); an in-flight/unsealed turn is absent.
+- This is a **separate axis** from `GET /conversations/:id?sinceSeq=` (which returns
+  seq-cursor chunk CONTENT). Metrics are keyed per **turn**, not per chunk, so they are
+  **not** seq-filtered — hence a sibling route, not a field on the history response.
+- Unknown / metric-less conversation → `{ turns: [] }`.
+- CORS: same wildcard as the other routes.
+
+## Suggested FE consumption
+On (re)opening a conversation, the chat feature can `GET /conversations/:id/metrics`
+once alongside the history hydrate (`?sinceSeq=`), then render historical
+tokens/latency per turn (and per step via `stepId`) — identical fields to what it
+already routes from the live `step-complete` / `usage` / `done` stream. TPS is
+still derived FE-side (`usage.outputTokens / decodeMs`); context-size proxy =
+`usage.inputTokens`.
+
+## Invariants (confirmed live)
+- Persisted `TurnMetrics.usage` / `durationMs` and each `StepMetrics`
+  (`stepId` + `usage` + `ttftMs`/`decodeMs`/`genTotalMs`) **byte-match** what the
+  live stream emitted for the same turn (verified end-to-end against flash).
+- `stepId` is the SAME value on the live `step-complete`/`usage` events, the persisted
+  `StepMetrics`, and the tool chunks — one grouping key across live + replay.
diff --git a/tasks.md b/tasks.md
--- a/tasks.md
+++ b/tasks.md
@@ -61,11 +61,12 @@ server/collector procs poison the next run's counts.
   per-step+turn metrics from the event stream and persists after seal; transport-http
   `GET /conversations/:id/metrics` → `ConversationMetricsResponse`. `@dispatch/wire` +
   `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`.
-- [ ] **FE courier handoff** for the new types + endpoint (in-repo handoff doc;
-  user couriers to `../dispatch-web`; ORCHESTRATOR §7 — backend does not write
-  the FE repo).
-- [ ] **Live re-probe** of `GET /conversations/:id/metrics` end-to-end (first
-  probe came back empty — re-run with a longer boot wait).
+- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done`
+  metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate +
+  per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans
+  carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix).
+- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in
+  this repo; user couriers to `../dispatch-web`; ORCHESTRATOR §7).
 
 ## Open items
 - **logging-audit #1:** conversation-store has no injected logger, so a load-time
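
The handoff doc in this commit leaves TPS to be derived FE-side from `usage.outputTokens` and `decodeMs`, and names `stepId` as the join key between live and persisted metrics. The following is a minimal sketch of that consumption, not code from either repo. The types are restated from the diff so the block is self-contained. `StepId` is assumed to be a string alias, and `tokensPerSecond` and `indexStepsById` are hypothetical helper names. The sketch also assumes "TPS" means tokens per second, so the millisecond `decodeMs` is scaled by 1000.

```typescript
// Types restated from the handoff doc. `StepId` as a plain string is an assumption.
type StepId = string;

interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

interface StepMetrics {
  stepId: StepId;
  usage: Usage;
  ttftMs?: number;
  decodeMs?: number;
  genTotalMs?: number;
}

interface TurnMetrics {
  turnId: string;
  usage: Usage;
  durationMs?: number;
  steps: readonly StepMetrics[];
}

interface ConversationMetricsResponse {
  turns: readonly TurnMetrics[];
}

// Hypothetical helper: decode throughput in tokens per second.
// Returns undefined when decodeMs is absent (timings are optional) or not positive.
function tokensPerSecond(step: StepMetrics): number | undefined {
  if (step.decodeMs === undefined || step.decodeMs <= 0) return undefined;
  return step.usage.outputTokens / (step.decodeMs / 1000);
}

// Hypothetical helper: index every persisted step by stepId so replayed
// metrics can be joined to chunks or live events that carry the same key.
function indexStepsById(
  res: ConversationMetricsResponse,
): Map<StepId, StepMetrics> {
  const byId = new Map<StepId, StepMetrics>();
  for (const turn of res.turns) {
    for (const step of turn.steps) byId.set(step.stepId, step);
  }
  return byId;
}

// Made-up sample data in the documented shape (values are illustrative only).
const timedStep: StepMetrics = {
  stepId: "t1#0",
  usage: { inputTokens: 1200, outputTokens: 300 },
  ttftMs: 400,
  decodeMs: 1500,
  genTotalMs: 1900,
};
const untimedStep: StepMetrics = {
  stepId: "t1#1",
  usage: { inputTokens: 1500, outputTokens: 50 },
};
const sample: ConversationMetricsResponse = {
  turns: [
    {
      turnId: "t1",
      usage: { inputTokens: 2700, outputTokens: 350 },
      steps: [timedStep, untimedStep],
    },
  ],
};

const stepsById = indexStepsById(sample);
console.log(tokensPerSecond(timedStep)); // 200
console.log(tokensPerSecond(untimedStep)); // undefined
console.log(stepsById.size); // 2
```

Returning `undefined` instead of `0` or `Infinity` lets the UI tell "no timing recorded" apart from a real throughput value, which matters because the doc marks every timing field as optional.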
