author: Adam Malczewski <[email protected]> 2026-06-10 08:41:26 +0900
committer: Adam Malczewski <[email protected]> 2026-06-10 08:41:26 +0900
commit: ee502ba1228fdaec4a15413a973ffce7ca89a0b6 (patch)
tree: 8d48f9a509219caf004098dd61375868d7f3fdcd
parent: 18e9f583c695230d59fa03f21c6abf58936eb928 (diff)
docs(metrics): FE Pass-2 courier handoff + mark live-verified
GET /conversations/:id/metrics verified end-to-end against flash (live stream metrics byte-match the persisted TurnMetrics; journal turn/step spans carry dotted usage.* incl. cacheReadTokens). Handoff doc for the user to courier the wire/transport-contract 0.4.0 delta to ../dispatch-web (ORCHESTRATOR §7).
-rw-r--r-- frontend-metrics-pass2-handoff.md | 67
-rw-r--r-- tasks.md | 11
2 files changed, 73 insertions, 5 deletions
diff --git a/frontend-metrics-pass2-handoff.md b/frontend-metrics-pass2-handoff.md
new file mode 100644
index 0000000..9019a85
--- /dev/null
+++ b/frontend-metrics-pass2-handoff.md
@@ -0,0 +1,67 @@
+# FE handoff — persisted replay metrics (Pass 2) + metrics endpoint
+
+> **Courier doc** (backend → `../dispatch-web`, via the user). Per ORCHESTRATOR §7
+> the backend does NOT write the FE repo; the FE orchestrator applies this delta
+> on its side (regenerate the in-repo `.dispatch/*.reference.md` snapshots + bump
+> the `file:` dep). `lsp references` does not span the two repos. Backend commit:
+> `6db12ff`.
+
+## Versions
+- `@dispatch/wire` `0.3.0 → 0.4.0` (additive)
+- `@dispatch/transport-contract` `0.3.0 → 0.4.0` (additive)
+
+Pure-type, additive change — no breaking edits to existing types.
+
+## New wire types (`@dispatch/wire`, re-exported by `@dispatch/transport-contract`)
+
+```ts
+interface StepMetrics {
+ stepId: StepId; // `<turnId>#<index>`, join key to the live stream
+ usage: Usage; // { inputTokens, outputTokens, cacheReadTokens?, cacheWriteTokens? }
+ ttftMs?: number; // time to first token (optional — clock + first-token gated)
+ decodeMs?: number; // first token → stream end
+ genTotalMs?: number; // stream start → end (== ttftMs + decodeMs when a first token was seen)
+}
+
+interface TurnMetrics {
+ turnId: string; // plain wire turn id, join key to AgentEvents
+ usage: Usage; // aggregate across all steps
+ durationMs?: number; // turn wall-clock (optional — clock gated)
+ steps: readonly StepMetrics[]; // per-step, in step order
+}
+```
+
+These are the **persisted, replayable** counterparts of the live `usage` /
+`step-complete` / `done` events (which remain transient and unchanged).
+
+## New read endpoint
+
+`GET /conversations/:id/metrics` → `ConversationMetricsResponse`:
+
+```ts
+interface ConversationMetricsResponse { turns: readonly TurnMetrics[] }
+```
+
+Semantics:
+- `turns` = every **sealed** turn's `TurnMetrics`, in **turn-append order**.
+- A turn appears only **after seal** (post-persist); an in-flight/unsealed turn is absent.
+- This is a **separate axis** from `GET /conversations/:id?sinceSeq=` (which returns
+ seq-cursor chunk CONTENT). Metrics are keyed per **turn**, not per chunk, so they are
+ **not** seq-filtered — hence a sibling route, not a field on the history response.
+- Unknown / metric-less conversation → `{ turns: [] }`.
+- CORS: same wildcard as the other routes.
+
+## Suggested FE consumption
+On (re)opening a conversation, the chat feature can `GET /conversations/:id/metrics`
+once alongside the history hydrate (`?sinceSeq=`), then render historical
+tokens/latency per turn (and per step via `stepId`) — identical fields to what it
+already routes from the live `step-complete` / `usage` / `done` stream. TPS is
+still derived FE-side (`usage.outputTokens / decodeMs`); context-size proxy =
+`usage.inputTokens`.
+
+## Invariants (confirmed live)
+- Persisted `TurnMetrics.usage` / `durationMs` and each `StepMetrics`
+ (`stepId` + `usage` + `ttftMs`/`decodeMs`/`genTotalMs`) **byte-match** what the
+ live stream emitted for the same turn (verified end-to-end against flash).
+- `stepId` is the SAME value on the live `step-complete`/`usage` events, the persisted
+ `StepMetrics`, and the tool chunks — one grouping key across live + replay.
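
Editorial sketch (not part of the commit): the handoff doc above says TPS is derived FE-side from `usage.outputTokens` and `decodeMs`, and that `stepId` is the one grouping key across live and replayed data. A minimal TypeScript illustration of both follows. The interfaces are copied from the doc; `StepId` is simplified to `string`, the `Usage` fields are assumed to be numbers, and `tokensPerSecond` and `indexStepsById` are hypothetical helper names. The milliseconds-to-seconds scaling is also an assumption, since the doc gives only the ratio.

```ts
// Types as given in the handoff doc; StepId simplified to string (assumption).
type StepId = string;

interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

interface StepMetrics {
  stepId: StepId;
  usage: Usage;
  ttftMs?: number;
  decodeMs?: number;
  genTotalMs?: number;
}

interface TurnMetrics {
  turnId: string;
  usage: Usage;
  durationMs?: number;
  steps: readonly StepMetrics[];
}

interface ConversationMetricsResponse {
  turns: readonly TurnMetrics[];
}

// Hypothetical helper: decode throughput in output tokens per second.
// Returns undefined when decodeMs is missing or not positive, because the
// doc marks the timing fields as optional (clock and first-token gated).
function tokensPerSecond(step: StepMetrics): number | undefined {
  if (step.decodeMs === undefined || step.decodeMs <= 0) return undefined;
  return step.usage.outputTokens / (step.decodeMs / 1000);
}

// Hypothetical helper: index every persisted step by stepId so that replayed
// metrics can be joined to chunks or events carrying the same stepId.
function indexStepsById(
  res: ConversationMetricsResponse,
): Map<StepId, StepMetrics> {
  const byId = new Map<StepId, StepMetrics>();
  for (const turn of res.turns) {
    for (const step of turn.steps) {
      byId.set(step.stepId, step);
    }
  }
  return byId;
}
```

A consumer could call `indexStepsById` once on the `GET /conversations/:id/metrics` response and look up each step while rendering history, falling back to "no latency data" whenever `tokensPerSecond` returns `undefined`.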
diff --git a/tasks.md b/tasks.md
index f94aec2..8692551 100644
--- a/tasks.md
+++ b/tasks.md
@@ -61,11 +61,12 @@ server/collector procs poison the next run's counts.
per-step+turn metrics from the event stream and persists after seal;
transport-http `GET /conversations/:id/metrics` → `ConversationMetricsResponse`.
`@dispatch/wire` + `@dispatch/transport-contract` → `0.4.0`. Commit `6db12ff`.
-- [ ] **FE courier handoff** for the new types + endpoint (in-repo handoff doc;
- user couriers to `../dispatch-web`; ORCHESTRATOR §7 — backend does not write
- the FE repo).
-- [ ] **Live re-probe** of `GET /conversations/:id/metrics` end-to-end (first
- probe came back empty — re-run with a longer boot wait).
+- [x] **Live-verified end-to-end** (against flash): live `step-complete`/`done`
+ metrics ↔ persisted `GET /conversations/:id/metrics` byte-match (aggregate +
+ per-step `stepId` + ttft/decode/genTotal + durationMs); journal turn/step spans
+ carry dotted `usage.*` incl. `usage.cacheReadTokens` (the #2 fix).
+- [x] **FE courier handoff** written: `frontend-metrics-pass2-handoff.md` (in
+ this repo; user couriers to `../dispatch-web`; ORCHESTRATOR §7).
## Open items
- **logging-audit #1:** conversation-store has no injected logger, so a load-time