feat(cache-warming): consume CR-4 lifecycle — tab-close cancel + scope-aware subscriptions

- closeTab now POSTs /conversations/:id/close (abort in-flight turn + stop/disable warming server-side); disconnect still leaves both running ([email protected]) - syncSubscriptions honors catalog scope ([email protected]): global surfaces are not re-subscribed on conversation switch - fix(ws): the surface-message parser dropped the conversationId echo (CR-4d was ours, not the backend's) — preserved + unit-tested - secondsUntilNext: 3s stale guard — a past nextWarmAt renders as waiting, not 0s - re-pinned + re-mirrored [email protected] / [email protected] - scripts/probe-cache-warming.ts: live CR-4 probe (default-off, future nextWarmAt, repeated warms, mid-turn close abort, idempotent re-close) — 17/17 against bin/up
author: Adam Malczewski <[email protected]> 2026-06-12 16:28:07 +0900
committer: Adam Malczewski <[email protected]> 2026-06-12 16:28:07 +0900
commit: 4001274e3ba25a3946df1e9f2dc82ca6781cd2bf (patch)
tree: 24af95e69bda5c38ab7eefd6b71d55b4c247040a
parent: e6f6bd86eab07954d8f06e740659969c3dfecc7f (diff)
download: dispatch-web-4001274e3ba25a3946df1e9f2dc82ca6781cd2bf.tar.gz
dispatch-web-4001274e3ba25a3946df1e9f2dc82ca6781cd2bf.zip
11 files changed, 883 insertions, 178 deletions
diff --git a/.dispatch/transport-contract.reference.md b/.dispatch/transport-contract.reference.md
index 86eac50..e6ab799 100644
--- a/.dispatch/transport-contract.reference.md
+++ b/.dispatch/transport-contract.reference.md
@@ -5,10 +5,25 @@
 > hangs on a permission prompt). Your CODE still imports `@dispatch/transport-contract` normally —
 > this file is for READING only.
 >
-> **Orchestrator:** SNAPSHOT of `[email protected]` (CR-3 user-message shipped). Depends on
-> `@dispatch/[email protected]` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
+> **Orchestrator:** SNAPSHOT of `[email protected]` (CR-4 cache-warming lifecycle shipped).
+> Depends on `@dispatch/[email protected]` (see `wire.reference.md`) + `@dispatch/[email protected]` (see
 > `ui-contract.reference.md`).
 >
+> **2026-06-12 delta (CR-4 cache-warming lifecycle — package bumped `0.8.0` → `0.9.0`):** adds
+> `POST /conversations/:id/close` (`CloseConversationResponse`) — the EXPLICIT "user closed this
+> conversation's tab" affordance, distinct from a socket disconnect / `chat.unsubscribe` (which
+> still NEVER touch the turn or the warming schedule). Closing (1) aborts any in-flight turn — the
+> kernel stops at the next event boundary, partial messages are PERSISTED, and the turn SEALS
+> normally with `finishReason: "aborted"` (watchers receive `done` then `turn-sealed`, so a
+> stream-derived "generating" flag clears with no special-casing) — and (2) stops + DISABLES
+> cache-warming for the conversation (persisted OFF; reopening does not resume warming). Idempotent:
+> closing an idle/unknown conversation is `200` with `abortedTurn: false`. Backend behavior fixes
+> riding EXISTING shapes (no other contract change): warming now defaults OFF for a new conversation
+> (240s interval default kept; re-enable restores the persisted interval); post-warm surface updates
+> now carry the FUTURE `nextWarmAt` (notify-before-reschedule fixed); `nextWarmAt: null` is pushed on
+> `turn-start` (nothing scheduled while generating) and when warming is/became disabled. Caveat: the
+> warming opt-in is NOT yet re-hydrated across a backend restart (reads disabled until toggled again).
+>
 > **2026-06-12 delta (CR-3 user-message handoff — package bumped `0.7.0` → `0.8.0`):** NO transport
 > shape change — it re-exports `AgentEvent` (which `chat.delta` / `/chat` NDJSON carry), and that union
 > gained the additive `TurnInputEvent` (`{ type: "user-message"; conversationId; turnId; text }`), the
@@ -29,7 +44,7 @@
 > persist across turns. FE consumes via the `chat` feature + app store (re-subscribe every open
 > conversation on (re)connect + page load; derive a "running" state structurally from
 > `turn-start`…no-`done`/`turn-sealed`-yet). OUT of scope: per-step crash-resume, concurrent-send
-> arbitration, explicit stop.
+> arbitration.
 >
 > **2026-06-12 delta (context-size handoff — package bumped `0.5.0` → `0.6.0`, depends on
 > `[email protected]`):** no NEW transport shape — the optional `contextSize?: number` rides the
@@ -85,6 +100,9 @@
 - `POST /chat/warm` — body `WarmRequest` (JSON) → `200 WarmResponse` (cache-warm usage incl.
   `cachePct`); `409 { error }` when the conversation is currently generating; `400 { error }` on a
   missing/invalid `conversationId`. The warm is NEVER persisted/streamed/folded into real usage.
+- `POST /conversations/:id/close` — no body → `200 CloseConversationResponse`. The EXPLICIT tab-close
+  affordance: aborts any in-flight turn (persists the partial; seals with `finishReason: "aborted"`)
+  AND stops + disables cache-warming (persisted OFF). Idempotent (`abortedTurn: false` when idle/unknown).
 - `GET /metrics/throughput?period=day|week|month&date=<...>` — `ThroughputResponse` (token-weighted
   tokens/sec per model over the window). Not part of cache-warming; listed for completeness.
 - `GET /conversations/:id/cwd` — `CwdResponse` (`cwd` is `null` until set).
@@ -97,18 +115,23 @@
   (`@dispatch/ui-contract`) + chat ops (below). Open once, send `WsClientMessage`, receive
   `WsServerMessage`. Live `AgentEvent` deltas carry `conversationId`+`turnId` but **no `seq`**
   (seq lives only on `StoredChunk`, obtained via the `sinceSeq` sync after `turn-sealed`).
-- DEFERRED (not built; do not depend on): `GET /conversations` (list), `POST /conversations/:id/cancel`.
+- DEFERRED (not built; do not depend on): `GET /conversations` (list). (The former deferred
+  `POST /conversations/:id/cancel` is superseded by `POST /conversations/:id/close`.)
 
 ```ts
 /**
  * Transport contract — the typed description of Dispatch's client–server API
- * (HTTP + WebSocket). Types-only (zero runtime). Each side owns its own
- * (de)serialization — the contract is the SHAPES, not the codec.
+ * (HTTP + WebSocket).
+ *
+ * This package is types-only (zero runtime). It is the single shared surface
+ * every client imports to know how to talk to the backend. Each side owns its
+ * OWN (de)serialization: the contract is the SHAPES, not the codec. The
+ * streaming response payload is the kernel's `AgentEvent` union, re-exported
+ * here so a client has one import for the whole wire.
  *
- * The WebSocket carries BOTH chat ops (here) and surface ops (in
+ * The WebSocket carries BOTH chat ops (defined here) and surface ops (defined in
  * `@dispatch/ui-contract`) over one connection; the unified `WsClientMessage` /
- * `WsServerMessage` unions below compose them. Chat ops are new, non-colliding
- * `type` variants (`chat.*`) — the shipped surface protocol is unchanged.
+ * `WsServerMessage` unions below compose them.
  */
 
 import type { SurfaceClientMessage, SurfaceServerMessage } from "@dispatch/ui-contract";
@@ -124,19 +147,35 @@ export type { AgentEvent, StepMetrics, StoredChunk, TurnMetrics } from "@dispatc
  * response header (useful when `conversationId` was omitted).
  */
 export interface ChatRequest {
-	/** The conversation to continue. Omit to start fresh — server mints an id (X-Conversation-Id). */
+	/**
+	 * The conversation to continue. Omit to start a fresh conversation — the
+	 * server mints an id and returns it via the `X-Conversation-Id` header.
+	 */
 	readonly conversationId?: string;
+
 	/** The user's message text for this turn. */
 	readonly message: string;
-	/** Model name in `<credentialName>/<model>` form (one of `GET /models`). Omit = server default. */
+
+	/**
+	 * The model to use, as a model name in `<credentialName>/<model>` form — one
+	 * of the exact strings returned by `GET /models`. Omit to use the server's
+	 * default credential + model.
+	 */
 	readonly model?: string;
-	/** Working directory for this turn's tool execution. Defaults server-side. Not part of the prompt. */
+
+	/**
+	 * Working directory for this turn's tool execution. Defaults server-side when
+	 * omitted. Forwarded to tools for path resolution; never part of the model
+	 * prompt (so it does not affect prompt caching).
+	 */
 	readonly cwd?: string;
 }
 
 /**
- * Response body for `GET /models` — the model catalog. Each entry is a model
- * name in `<credentialName>/<model>` form (exactly `ChatRequest.model`).
+ * Response body for `GET /models` — the model catalog.
+ *
+ * Each entry is a model name in `<credentialName>/<model>` form: exactly the
+ * string a client passes back as `ChatRequest.model`.
  */
 export interface ModelsResponse {
 	readonly models: readonly string[];
@@ -144,14 +183,22 @@ export interface ModelsResponse {
 
 /**
  * Response body for `GET /conversations/:id?sinceSeq=<n>` — the incremental
- * read-side history endpoint a long-lived client uses to (re)hydrate cheaply.
+ * read-side history endpoint a long-lived client uses to (re)hydrate a
+ * conversation cheaply.
+ *
+ * `chunks` is the RAW, append-order, seq-ordered slice of the conversation log
+ * with `seq > sinceSeq` (or the whole log when `sinceSeq` is omitted/0). It is
+ * NOT reconciled: a dangling tool-call is returned as-is (rendered as an
+ * interrupted call). Reconciliation is a turn-path concern — the server repairs
+ * history only when it feeds a provider, never on this read path — which is what
+ * preserves the per-chunk `seq` cursor invariant (a synthesized repair chunk
+ * would have no seq).
  *
- * `chunks` is the RAW, append-order, seq-ordered slice with `seq > sinceSeq`
- * (or the whole log when `sinceSeq` is omitted/0). NOT reconciled: a dangling
- * tool-call is returned as-is. `latestSeq` is the `seq` of the LAST chunk, or —
- * when the slice is empty (caught up) — the requested `sinceSeq` (0 for a full
- * read of an empty conversation). After applying, the client's new cursor is
- * always `latestSeq`; empty `chunks` means "nothing new past your cursor".
+ * `latestSeq` is the `seq` of the LAST chunk in this response, or — when the
+ * slice is empty (the client is already caught up) — the requested `sinceSeq`
+ * (0 for a full read of an empty conversation). So after applying the response a
+ * client's new cursor is always `latestSeq`, and an empty `chunks` means
+ * "nothing new past your cursor".
  */
 export interface ConversationHistoryResponse {
 	readonly chunks: readonly StoredChunk[];
@@ -163,12 +210,6 @@ export interface ConversationHistoryResponse {
  * (and per-step) token + timing metrics for a conversation, for a client
  * reopening a past conversation to render historical usage/latency.
  *
- * This is a SEPARATE axis from the two other read concerns and is deliberately
- * its own endpoint: the live `usage`/`step-complete`/`done` events are transient
- * (not persisted), and `ConversationHistoryResponse` carries seq-cursor chunk
- * CONTENT. Metrics are keyed per TURN (not per chunk) and so are not seq-filtered
- * — hence a sibling route rather than a field on the history response.
- *
  * `turns` is every SEALED turn's `TurnMetrics` in turn order. A turn appears only
  * after its metrics were persisted (post-seal); an in-flight or unsealed turn is
  * absent until then.
@@ -180,63 +221,46 @@ export interface ConversationMetricsResponse {
 /** The aggregation window for `GET /metrics/throughput`. */
 export type ThroughputPeriod = "day" | "week" | "month";
 
-/** One model's token-weighted throughput over a period. */
+/**
+ * One model's throughput over a period. `tokensPerSecond` is the TOKEN-WEIGHTED
+ * average — `Σ(output tokens) / Σ(generation seconds)` across the period's
+ * turns — so larger turns count proportionally more than smaller ones.
+ * Generation time is the model's pure decode time (it excludes tool-execution
+ * waits).
+ */
 export interface ThroughputModelStat {
+	/** The model name in `<credentialName>/<model>` form (as selected). */
 	readonly model: string;
+	/** Token-weighted average tokens/second over the period. */
 	readonly tokensPerSecond: number;
+	/** Total output tokens generated across the period's turns. */
 	readonly totalOutputTokens: number;
+	/** Total pure generation time across the period's turns, in milliseconds. */
 	readonly totalGenMs: number;
+	/** Number of turns that contributed. */
 	readonly turns: number;
 }
 
-/** Response body for `GET /metrics/throughput?period=...&date=...`. */
+/**
+ * Response body for
+ * `GET /metrics/throughput?period=day|week|month&date=<...>`.
+ *
+ * `date` is `YYYY-MM-DD` for day/week (week = the ISO Mon–Sun week containing
+ * that date) and `YYYY-MM` for month. Boundaries are computed in the server's
+ * local timezone; `start`/`end` are the resolved half-open `[start, end)` range
+ * in epoch-ms. `models` lists every model active in the window, sorted by
+ * `tokensPerSecond` descending.
+ */
 export interface ThroughputResponse {
 	readonly period: ThroughputPeriod;
 	readonly date: string;
-	readonly start: number; // inclusive window start, epoch-ms
-	readonly end: number; // exclusive window end, epoch-ms
+	/** Inclusive start of the window, epoch-ms. */
+	readonly start: number;
+	/** Exclusive end of the window, epoch-ms. */
+	readonly end: number;
 	readonly models: readonly ThroughputModelStat[];
 }
 
-/**
- * Request body for `POST /chat/warm` — manually trigger a prompt-cache WARMING
- * request for a conversation (e.g. a "warm now" button). The warm replays the
- * conversation's existing prefix to refresh the provider cache; it is NEVER
- * persisted and NEVER streamed. Pass the SAME `model`/`cwd` the conversation
- * chats with so the prefix is byte-identical to a real turn (that's the cache hit).
- */
-export interface WarmRequest {
-	readonly conversationId: string;
-	readonly model?: string; // `<credentialName>/<model>`; omit = server default
-	readonly cwd?: string;
-}
-
-/**
- * Response body for `POST /chat/warm` (HTTP 200). The warm's usage — never folded
- * into the conversation's real usage. A client surfaces `cachePct` as the "last
- * warming" cache-hit indicator. A 409 (currently generating) returns `{ error }` instead.
- */
-export interface WarmResponse {
-	readonly inputTokens: number;
-	readonly outputTokens: number;
-	readonly cacheReadTokens: number;
-	readonly cacheWriteTokens: number;
-	/**
-	 * **Cache rate** — what fraction of THIS request's prompt was served from cache:
-	 * `round(cacheReadTokens / inputTokens * 100)` (0 when `inputTokens <= 0`).
-	 * (`inputTokens` is the TOTAL prompt incl. cached, so this is in [0,100].)
-	 */
-	readonly cachePct: number;
-	/**
-	 * **Expected cache (retention)** — of the cacheable prefix this warm touched, how
-	 * much was still warm and read back vs. had to be (re)written:
-	 * `round(cacheReadTokens / (cacheReadTokens + cacheWriteTokens) * 100)` (0 when the
-	 * sum is 0). For a healthy warm this is ~**100%**; it drops toward 0 as the cache
-	 * expires/busts. This is the warming HEALTH signal — headline it for "Warm now".
-	 */
-	readonly expectedCacheRate: number;
-}
-
 // ─── Per-conversation working directory (cwd) ─────────────────────────────────
 
 /** Response of `GET /conversations/:id/cwd`. `cwd` is null when never set. */
@@ -250,6 +274,28 @@ export interface SetCwdRequest {
 	readonly cwd: string;
 }
 
+// ─── Conversation close (explicit tab close) ──────────────────────────────────
+
+/**
+ * Response of `POST /conversations/:id/close` (no request body).
+ *
+ * The EXPLICIT "the user closed this conversation's tab" affordance — distinct
+ * from a socket disconnect or `chat.unsubscribe`, which deliberately never touch
+ * the turn or the warming schedule. Closing:
+ *  1. aborts any in-flight turn (the kernel stops at the next event boundary,
+ *     partial messages are persisted, and the turn SEALS normally with
+ *     `finishReason: "aborted"` — watchers see `done` + `turn-sealed`), and
+ *  2. stops + disables cache-warming for the conversation (persisted OFF, so a
+ *     reopened conversation stays opt-in).
+ * Idempotent: closing an idle or unknown conversation succeeds with
+ * `abortedTurn: false`.
+ */
+export interface CloseConversationResponse {
+	readonly conversationId: string;
+	/** True when an in-flight turn existed and was aborted by this close. */
+	readonly abortedTurn: boolean;
+}
+
 // ─── Per-conversation LSP status ──────────────────────────────────────────────
 
 /** The connection state of a single language server for a workspace. */
@@ -280,15 +326,71 @@ export interface LspStatusResponse {
 	readonly servers: readonly LspServerInfo[];
 }
 
+/**
+ * Request body for `POST /chat/warm` — manually trigger a prompt-cache WARMING
+ * request for a conversation (e.g. a frontend "warm now" button, or fast tests
+ * that don't want to wait for the automatic warming timer).
+ *
+ * The warm replays the conversation's existing prefix to the provider to refresh
+ * its prompt cache; it is NEVER persisted and NEVER streamed (no `AgentEvent`s).
+ * Pass the same `model`/`cwd` the conversation chats with so the warm request's
+ * prefix is byte-identical to a real turn (which is what makes the cache hit).
+ */
+export interface WarmRequest {
+	/** The conversation whose prompt cache to warm. */
+	readonly conversationId: string;
+
+	/**
+	 * The model name in `<credentialName>/<model>` form the conversation uses, so
+	 * the warm resolves the same provider + prefix. Omit to use the server default.
+	 */
+	readonly model?: string;
+
+	/** Working directory matching the conversation's turns (for cwd-aware tool assembly). */
+	readonly cwd?: string;
+}
+
+/**
+ * Response body for `POST /chat/warm` (HTTP 200). The warm request's usage —
+ * never folded into the conversation's real usage. A client surfaces `cachePct`
+ * as the "last warming" cache-hit indicator.
+ *
+ * When warming cannot run because the conversation is currently generating, the
+ * server responds `409` with `{ error }` instead of this body.
+ */
+export interface WarmResponse {
+	readonly inputTokens: number;
+	readonly outputTokens: number;
+	readonly cacheReadTokens: number;
+	readonly cacheWriteTokens: number;
+	/**
+	 * **Cache rate** — what fraction of THIS request's prompt was served from cache:
+	 * `round(cacheReadTokens / inputTokens * 100)` (0 when `inputTokens <= 0`).
+	 * (`inputTokens` is the TOTAL prompt incl. cached, so this is in [0,100].)
+	 */
+	readonly cachePct: number;
+	/**
+	 * **Expected cache (retention)** — of the cacheable prefix this warm touched, how
+	 * much was still warm and read back vs. had to be (re)written:
+	 * `round(cacheReadTokens / (cacheReadTokens + cacheWriteTokens) * 100)` (0 when the
+	 * sum is 0). For a healthy warm this is ~**100%** (the whole prefix was still
+	 * cached); it drops toward 0 as the cache expires/busts and the warm has to rewrite
+	 * it. This is the warming HEALTH signal — distinct from `cachePct` (which a warm's
+	 * tiny fresh probe makes ~equal, but which on a real turn reflects new content).
+	 */
+	readonly expectedCacheRate: number;
+}
+
 // ─── WebSocket chat ops ───────────────────────────────────────────────────────
 // The persistent WS connection multiplexes chat ops (below) with surface ops
-// (`@dispatch/ui-contract`). Chat `type`s are namespaced (`chat.*`) so they
-// never collide with surface ones.
+// (`@dispatch/ui-contract`). The unified unions at the bottom compose both. Chat
+// `type`s are namespaced (`chat.*`) so they never collide with surface ones.
 
 /**
- * Client → server: start or continue a turn over the WS connection. Same fields
- * as the HTTP `ChatRequest`; omit `conversationId` to start fresh — the resolved
- * id arrives on the streamed `AgentEvent`s (each carries `conversationId`).
+ * Client → server: start or continue a turn over the WS connection. Carries the
+ * same fields as the HTTP `ChatRequest` (so one shape drives both transports);
+ * omit `conversationId` to start fresh — the resolved id arrives on the streamed
+ * `AgentEvent`s (each carries `conversationId`).
  */
 export interface ChatSendMessage extends ChatRequest {
 	readonly type: "chat.send";
@@ -296,8 +398,9 @@ export interface ChatSendMessage extends ChatRequest {
 
 /**
  * Server → client: one `AgentEvent` from an in-flight turn (text-delta,
- * tool-call, usage, done, turn-sealed, …). Fold these into the transcript
- * exactly as the HTTP NDJSON stream — same events, different carrier.
+ * tool-call, usage, done, turn-sealed, …). The client folds these into its
+ * transcript exactly as it folds the HTTP NDJSON stream — same events, different
+ * carrier.
  */
 export interface ChatDeltaMessage {
 	readonly type: "chat.delta";
@@ -316,13 +419,20 @@ export interface ChatErrorMessage {
 }
 
 /**
- * Client → server: start WATCHING a conversation's live turn events WITHOUT sending.
- * On subscribe the server REPLAYS the current in-flight turn so far (from its
- * `turn-start`) as `chat.delta`, then streams live; nothing replayed if idle (rely
- * on `GET /conversations/:id` history). Infer "generating" from a replayed
- * `turn-start` with no matching `done`/`turn-sealed` yet. `chat.send` already
- * auto-subscribes the sender, so this is for conversations you VIEW but didn't send to
- * (a 2nd device, or a reloaded/reconnected client). Idempotent per (connection, id).
+ * Client → server: start WATCHING a conversation's live turn events WITHOUT
+ * sending a message. This is what makes a turn viewable independently of who
+ * started it — a second device (multi-client handoff) or a client that reloaded
+ * mid-turn subscribes to receive the in-flight turn.
+ *
+ * On subscribe the server replays the CURRENT in-flight turn's events so far as
+ * `chat.delta` messages (so a late-joiner sees the whole running turn from its
+ * `turn-start`), then streams subsequent live events. If no turn is in-flight,
+ * nothing is replayed (the client relies on `GET /conversations/:id` history).
+ * A client infers "generating" from a replayed `turn-start` with no matching
+ * `done`/`turn-sealed` yet. Idempotent per `(connection, conversationId)`.
+ *
+ * NOTE: `chat.send` auto-subscribes the sending connection, so a client only needs
+ * `chat.subscribe` for conversations it is viewing but did not send to.
  */
 export interface ChatSubscribeMessage {
 	readonly type: "chat.subscribe";
@@ -331,22 +441,28 @@ export interface ChatSubscribeMessage {
 
 /**
  * Client → server: stop watching a conversation's turn events on this connection.
- * Does NOT stop/affect the turn (it runs to completion regardless of subscribers).
- * Socket close drops all of a connection's subscriptions the same way — again
- * WITHOUT aborting any in-flight turn.
+ * Does NOT stop or affect the turn itself (the turn runs to completion regardless
+ * of subscribers). The server also drops all of a connection's subscriptions when
+ * the socket closes — again WITHOUT aborting any in-flight turn.
  */
 export interface ChatUnsubscribeMessage {
 	readonly type: "chat.unsubscribe";
 	readonly conversationId: string;
 }
 
-/** Every client → server WS message: surface ops + chat ops. Discriminate on `type`. */
+/**
+ * Every client → server WS message: surface ops (`@dispatch/ui-contract`) + chat
+ * ops. A server discriminates on `type`.
+ */
 export type WsClientMessage =
 	| SurfaceClientMessage
 	| ChatSendMessage
 	| ChatSubscribeMessage
 	| ChatUnsubscribeMessage;
 
-/** Every server → client WS message: surface ops + chat ops. Discriminate on `type`. */
+/**
+ * Every server → client WS message: surface ops (`@dispatch/ui-contract`) + chat
+ * ops. A client discriminates on `type`.
+ */
 export type WsServerMessage = SurfaceServerMessage | ChatDeltaMessage | ChatErrorMessage;
 ```
diff --git a/.dispatch/ui-contract.reference.md b/.dispatch/ui-contract.reference.md
index 00d354f..d751af8 100644
--- a/.dispatch/ui-contract.reference.md
+++ b/.dispatch/ui-contract.reference.md
@@ -5,29 +5,49 @@
 > hangs on a permission prompt). Your CODE still imports `@dispatch/ui-contract` normally — this
 > file is for READING only.
 >
-> **Orchestrator:** this is a SNAPSHOT — regenerate it whenever `ui-contract` changes.
+> **Orchestrator:** this is a SNAPSHOT of `[email protected]` — regenerate it whenever
+> `ui-contract` changes.
+>
+> **2026-06-12 delta (CR-2/CR-4 handoff — package bumped `0.1.0` → `0.2.0`):** adds the optional
+> `scope?: "global" | "conversation"` to `SurfaceCatalogEntry` so a client can skip re-subscribing
+> GLOBAL surfaces on a conversation switch. ABSENT means assume conversation-scoped (the
+> conservative always-send-conversationId policy remains correct for both). Emitted today:
+> `loaded-extensions` → `"global"`, `cache-warming` → `"conversation"`. Also (CR-4d, no shape
+> change): the initial `surface` reply to a conversation-scoped subscribe ECHOES `conversationId`
+> as documented (was already on backend HEAD; verify with a freshly-booted backend).
 >
 > **2026-06 delta (cache-warming handoff):** adds the `NumberField` variant (`kind:"number"`) to
 > the `SurfaceField` union, and an OPTIONAL `conversationId?` to `SubscribeMessage` /
 > `UnsubscribeMessage` / `InvokeMessage` / `SurfaceMessage` / `SurfaceUpdate` so a surface can be
 > CONVERSATION-SCOPED (state differs per conversation, e.g. `cache-warming`) vs GLOBAL (one state for
 > all, e.g. `loaded-extensions`). All additive / backward-compatible: a global surface omits
-> `conversationId` and behaves exactly as before. (Backend left the package version at `0.1.0`.)
+> `conversationId` and behaves exactly as before.
 
 ```ts
 /**
  * UI contract — the frontend-agnostic vocabulary for backend-declared "surfaces".
  *
  * A SURFACE is a "data transportation surface": a typed description of what data an
- * extension exposes, its semantics, and the actions that can act on it — NOT UI.
- * Any client renders a surface in its own idiom (web/Svelte, CLI, future TUI/mobile).
- * Types-only, zero runtime, zero `@dispatch/*` deps.
+ * extension exposes, its semantics, and the actions that can act on it — NOT UI. It
+ * carries STRUCTURE + SEMANTICS + ACTIONS, never styling and never a rendering-
+ * framework token. Any client (web/Svelte, CLI, future TUI/mobile) renders a surface
+ * in its own idiom, so swapping or adding a client is a zero-backend-change event.
+ *
+ * This package is types-only (zero runtime) and has ZERO `@dispatch/*` dependencies,
+ * so a separate client repo can depend on JUST this contract.
  */
 
-/** Where a surface mounts — a coarse, semantic placement hint, NOT layout/CSS. Open string. */
+/**
+ * Where a surface mounts — a coarse, semantic placement hint, NOT a layout/CSS
+ * instruction. A client maps a region to its own idiom; an unknown region falls back
+ * to the client's default placement. Deliberately left open (a `string`).
+ */
 export type Region = string;
 
-/** A typed reference to a backend action a field can invoke (client posts payload back). */
+/**
+ * A typed reference to a backend action a field can invoke. The client posts it back
+ * (with a payload); the surface id comes from context.
+ */
 export interface ActionRef {
 	readonly actionId: string;
 }
@@ -38,7 +58,10 @@ export interface SurfaceOption {
 	readonly label: string;
 }
 
-/** A field within a surface — a SEMANTIC value, not a widget. `kind` is the discriminant. */
+/**
+ * A field within a surface — a SEMANTIC value, not a widget. `kind` is the
+ * discriminant a client switches on to pick a renderer.
+ */
 export type SurfaceField =
 	| ToggleField
 	| ProgressField
@@ -56,7 +79,7 @@ export interface ToggleField {
 	readonly action: ActionRef;
 }
 
-/** A bounded ratio in [0, 1] with a label. Read-only. */
+/** A bounded ratio in [0, 1] with a label (e.g. a cache-hit rate). Read-only. */
 export interface ProgressField {
 	readonly kind: "progress";
 	readonly label: string;
@@ -83,7 +106,7 @@ export interface StatField {
  * A settable numeric value plus the action that sets it — the free-value
  * counterpart to `selector` (which is a fixed enum). Optional `min`/`max`/`step`
  * are SEMANTIC bounds a client may use to validate/step input; `unit` is a
- * display hint (e.g. "ms", "s"). The client posts the new number as the action
+ * display hint (e.g. "ms", "min"). The client posts the new number as the action
  * payload. Unlike `progress`/`stat` (read-only), this field is interactive.
  */
 export interface NumberField {
@@ -105,8 +128,10 @@ export interface ButtonField {
 }
 
 /**
- * The escape hatch: data that fits no semantic field kind. Opaque `payload` + a
- * `rendererId`; clients WITH a renderer for that id show it, others GRACEFULLY SKIP.
+ * The escape hatch: data that fits no semantic field kind. Carries an opaque
+ * `payload` + a `rendererId`; clients WITH a renderer for that id show it, others
+ * GRACEFULLY SKIP. Keep rare — and the owning extension should export a typed
+ * payload type so its bespoke renderer narrows `payload` via a typed symbol.
  */
 export interface CustomField {
 	readonly kind: "custom";
@@ -114,7 +139,10 @@ export interface CustomField {
 	readonly payload: unknown;
 }
 
-/** A surface: an ordered set of fields mounted in a region, with a title. */
+/**
+ * A surface: an ordered set of fields mounted in a region, with a title. The atomic
+ * unit a backend extension contributes and a client renders.
+ */
 export interface SurfaceSpec {
 	readonly id: string;
 	readonly region: Region;
@@ -122,20 +150,33 @@ export interface SurfaceSpec {
 	readonly fields: readonly SurfaceField[];
 }
 
-/** A surface-catalog entry — discovery metadata only (no field data). */
+/**
+ * A surface-catalog entry — discovery metadata only (no field data).
+ */
 export interface SurfaceCatalogEntry {
 	readonly id: string;
 	readonly region: Region;
 	readonly title: string;
+	/**
+	 * Whether the surface's spec/values differ per conversation ("conversation")
+	 * or are app-wide ("global"). A client may skip re-subscribing GLOBAL surfaces
+	 * on a conversation switch (they ignore `conversationId`). Optional + additive:
+	 * when absent, a client should assume conversation-scoped (the conservative
+	 * "always send the focused conversationId" policy still works for both).
+	 */
+	readonly scope?: "global" | "conversation";
 }
 
 /** The surface catalog: the list of available surfaces a client can choose to show. */
 export type SurfaceCatalog = readonly SurfaceCatalogEntry[];
 
 /**
- * A live update for a subscribed surface. v1 carries the full new spec.
- * `conversationId` is present only for a CONVERSATION-SCOPED surface (tells the
- * client which conversation this update is for); a global surface omits it.
+ * A live update for a subscribed surface (pushed over the WS channel). v1 carries
+ * the full new spec (the simplest "patch").
+ *
+ * `conversationId` is present only for a CONVERSATION-SCOPED surface (one whose
+ * spec/values differ per conversation, e.g. cache-warming controls): it tells the
+ * client which conversation this update pertains to. A global surface omits it.
  */
 export interface SurfaceUpdate {
 	readonly surfaceId: string;
@@ -143,14 +184,18 @@ export interface SurfaceUpdate {
 	readonly conversationId?: string;
 }
 
-// ── Surface WebSocket protocol (slice 1: surfaces only) ──────────────────────
+// ── Surface WebSocket protocol ────────────────────────────────────────────────
 
 /** A client → server message on the surface channel. */
 export type SurfaceClientMessage = SubscribeMessage | UnsubscribeMessage | InvokeMessage;
 
 /**
- * Begin receiving live updates for a surface. For a CONVERSATION-SCOPED surface,
- * include the `conversationId` whose state you want; omit it for a global surface.
+ * Begin receiving live updates for a surface (server replies with `surface`, then `update`s).
+ *
+ * For a CONVERSATION-SCOPED surface, include the `conversationId` whose state you
+ * want — the server resolves the spec for that conversation and pushes its updates.
+ * Omit it for a global surface (or to view a conversation-scoped surface with no
+ * conversation in focus → the surface decides its default/empty state).
  */
 export interface SubscribeMessage {
 	readonly type: "subscribe";
@@ -208,7 +253,7 @@ export interface SurfaceUpdateMessage {
 	readonly update: SurfaceUpdate;
 }
 
-/** A surface-scoped error. */
+/** A surface-scoped error (e.g. unknown surface id, invoke failed). */
 export interface SurfaceErrorMessage {
 	readonly type: "error";
 	readonly surfaceId?: string;
diff --git a/backend-handoff-cache-warming.md b/backend-handoff-cache-warming.md
new file mode 100644
index 0000000..a0019f9
--- /dev/null
+++ b/backend-handoff-cache-warming.md
@@ -0,0 +1,102 @@
+# Cache-warming lifecycle handoff (FE → backend) — CR-4 — **RESOLVED ✅ 2026-06-12**
+
+> **Closed.** Backend reply: `../arch-rewrite/frontend-cache-warming-lifecycle-handoff.md`
+> (`[email protected]` + `[email protected]`). All asks shipped; FE consumed + live-probed
+> 17/17 (`scripts/probe-cache-warming.ts` against `bin/up`). CR-4d turned out to be an FE bug (our
+> WS parser dropped the `conversationId` echo on the initial `surface` message) — fixed FE-side.
+> Current status lives in `backend-handoff.md` §2. Original report kept below for history.
+
+> **From:** dispatch-web · **To:** arch-rewrite · **Courier:** the user.
+> User-reported symptoms, investigated FE-side with a live probe against a running backend
+> (`bin/up2` stack, HTTP :25203 / surface WS :25205, 2026-06-12). Repro tool:
+> `dispatch-web/scripts/probe-cache-warming.ts` (drives the FE's real WS adapter + the
+> `cache-warming` surface; safe to re-run to verify fixes).
+>
+> **Verdict up front:** the FE renders the surface data faithfully — symptoms 1 and 2 are
+> backend data/behavior; symptom 3 needs a new backend affordance (FE will wire it on arrival).
+
+## User-reported symptoms
+
+1. Warming is **ON by default** for a new conversation — the user has to manually turn it off.
+   Wanted: default OFF, opt-in per conversation.
+2. With warming enabled, **no usable countdown** to the next refresh — the user can't tell
+   whether refreshes are happening at all.
+3. Wanted lifecycle: refreshes **keep running when the browser window closes** (✅ already true,
+   verified — see below), but **closing the conversation's tab in the app should stop the
+   refreshes AND abort any in-flight generation** (closing the tab = "done with this chat for now").
+
+## Probe evidence (verbatim observations)
+
+Fresh conversation (first turn sealed), then `subscribe {surfaceId:"cache-warming", conversationId}`:
+
+- **Initial spec:** `toggle value: true`, `number value: 240` (s), timer payload
+  `{ nextWarmAt: <now+240s>, lastWarmAt: null }` → **enabled by default, warm already scheduled**.
+  Confirms symptom 1 is backend default state.
+- `invoke cache-warming/set-interval payload:20` → update with a FUTURE `nextWarmAt` (+20s). ✅
+- **Automatic warms DO repeat and DO push updates** — 3 warms observed at ~21s spacing
+  (interval 20s), each pushing an `update` with fresh `Last Cache %` / `Cache retention` stats.
+  So the engine itself works.
+- **BUG (symptom 2 root cause): every post-warm `update` carries a STALE `nextWarmAt` — the fire
+  time of the warm that JUST completed (i.e. in the past), never the next scheduled one.**
+  Observed sequence (epoch ms):
+
+  | update after | nextWarmAt | lastWarmAt | note |
+  |---|---|---|---|
+  | warm #1 | 1781246273405 | 1781246274299 | nextWarmAt < lastWarmAt (past) |
+  | warm #2 | 1781246294299 | 1781246295269 | = warm#1.lastWarmAt + 20 000 → still past |
+  | warm #3 | 1781246315269 | 1781246315998 | = warm#2.lastWarmAt + 20 000 → still past |
+
+  The pattern shows the reschedule math exists (`next = lastWarm + interval`) but the surface
+  update is emitted with the PRE-warm snapshot; the post-reschedule (future) `nextWarmAt` is
+  never pushed. The FE countdown is authoritative off `nextWarmAt` (per the cache-warming
+  handoff design), so after the FIRST automatic warm the UI shows "Next warm in 0s" forever —
+  exactly the user's "I can't tell if it's working".
+- Same staleness after a real chat turn while subscribed: last update after `turn-sealed` still
+  carried a past `nextWarmAt` (−10s and counting), even though a warm was presumably scheduled.
+- **Browser-closed continuity ✅:** the schedule is fully server-side — warms fired with no
+  browser attached (only the headless probe socket). Symptom 3's "keep running when the window
+  closes" half already works; do not regress it.
+- **Contract deviation (minor):** the initial `surface` reply to a conversation-scoped subscribe
+  does NOT echo `conversationId` (updates do). `ui-contract` says the echo should be present
+  ("echoes the subscribe's conversation … so the client routes it"). The FE currently tolerates
+  the missing echo (treats no-echo as current), but that weakens stale-scope filtering on fast
+  conversation switches — please echo it.
+
+## Asks
+
+### CR-4a — default warming to OFF for a new conversation
+New conversations currently start `enabled: true`, interval 240s, first warm scheduled
+immediately. Make the default `enabled: false` (no warm scheduled until the user opts in).
+No contract change — it's the initial state of the existing surface.
+
+### CR-4b — push the refreshed (future) `nextWarmAt` after each automatic warm
+After a warm completes + the next one is scheduled, the emitted surface `update`'s
+`cache-warming-timer` payload must carry the NEW future `nextWarmAt` (and the new `lastWarmAt`).
+Either emit the update after rescheduling or emit a second update — FE is indifferent; it just
+renders the authoritative timestamp. (Same applies to the post-`turn-sealed` reschedule path.)
+No contract change — it's the payload of the existing custom field.
+
+### CR-4c — a "conversation closed" affordance (stop warming + abort generation)
+The FE needs to tell the backend "the user closed this conversation's tab": that should
+(1) disable/stop cache-warming for the conversation and (2) abort any in-flight turn.
+Today there is no path:
+- `chat.unsubscribe` / socket close explicitly never stops the turn (by design — keep that);
+- surface `unsubscribe` doesn't touch the warming schedule (correct for mere disconnects);
+- `POST /conversations/:id/cancel` is DEFERRED in `transport-contract`;
+- programmatically invoking `cache-warming/toggle` is unsuitable: it FLIPS with no payload, so
+  it's racy as an explicit "disable" (and doesn't abort generation).
+
+Preferred shape (backend's call): a single explicit `POST /conversations/:id/close` (or WS
+message) that does both, OR un-defer `/cancel` + accept an optional explicit boolean payload on
+`cache-warming/toggle`. Whatever ships, the FE wires it into its tab-close path. Note the
+asymmetry the user wants: browser/socket disconnect ⇒ warming continues; explicit tab close ⇒
+warming + generation stop.
+
+### CR-4d (minor) — echo `conversationId` on the initial `surface` message
+Per the `ui-contract` doc comment on `SurfaceMessage` (see deviation above).
+
+## FE-side follow-ups (ours, queued behind the above)
+- Harden the countdown display: a past `nextWarmAt` renders as "waiting…" instead of a stuck
+  "0s" (cosmetic guard; CR-4b is the real fix).
+- On CR-4c shipping: call the close affordance from `store.closeTab()`; re-pin + re-mirror the
+  contract; extend `scripts/probe-cache-warming.ts` to verify default-off + post-warm countdown.
diff --git a/backend-handoff.md b/backend-handoff.md
index b5cf1eb..30a1d64 100644
--- a/backend-handoff.md
+++ b/backend-handoff.md
@@ -5,18 +5,35 @@
 > **From:** dispatch-web orchestrator · **To:** arch-rewrite orchestrator · **Courier:** the user.
 > `lsp` does NOT span the repos (AGENTS.md § Backend seam) — every cross-repo ask flows through here.
 
-_Last updated: 2026-06-12. **FE is current on `[email protected]` / `[email protected]`.** All handoffs
-to date are consumed: surfaces + WS, conversation transcript/metrics, tabs + model selector,
-cache-warming (incl. authoritative timer + retention + cache-rate fix), **per-conversation cwd + LSP
-status**, **context size** (the `contextSize` field — `done` live + `TurnMetrics` persisted —
-rendered as a current-usage readout above the composer), and **turn continuity + multi-client live
-view** (`chat.subscribe`/`chat.unsubscribe`; re-attach to a running turn on reconnect/reload/second
-device; stream-derived "generating…" state).
-**Open asks:** CR-1 (Loaded Extensions as a real table) + CR-2 (optional catalog `scope` flag) below.
+_Last updated: 2026-06-12 (CR-4 consumed). **FE is current on `[email protected]` /
+`[email protected]` / `[email protected]`.** All handoffs to date are consumed: surfaces + WS,
+conversation transcript/metrics, tabs + model selector, cache-warming (incl. authoritative timer +
+retention + cache-rate fix + the CR-4 lifecycle below), **per-conversation cwd + LSP status**,
+**context size**, and **turn continuity + multi-client live view**.
+**Open asks: NONE.** CR-1/CR-2/CR-4 all RESOLVED ✅ (see §2); §3 lists likely next asks.
 **CR-3 (watcher couldn't see the USER prompt until seal) → RESOLVED ✅** — backend shipped the
-`user-message` turn event (`[email protected]` / `[email protected]`); FE re-pinned + consumption live.
-The cwd/LSP draft-path verification (`backend-handoff-cwd-lsp.md`) came back **all ✅ confirmed** by the
-backend (answers in their `frontend-lsp-cwd-handoff.md`) — see §2._
+`user-message` turn event; FE re-pinned + consumption live.
+The cwd/LSP draft-path verification (`backend-handoff-cwd-lsp.md`) came back **all ✅ confirmed**._
+
+**CR-4 cache-warming lifecycle (`frontend-cache-warming-lifecycle-handoff.md`) → CONSUMED ✅
+(live-probed 17/17 against `bin/up`).** Re-pinned `[email protected]→0.2.0` +
+`[email protected]→0.9.0` (`wire` unchanged); re-mirrored both `.dispatch/*.reference.md`. FE
+work: `store.closeTab()` now POSTs `POST /conversations/:id/close` (fire-and-forget, idempotent) —
+the explicit "done with this chat" affordance that aborts an in-flight turn + stops/disables that
+conversation's warming, while disconnect/`chat.unsubscribe` still leave both running;
+`syncSubscriptions` honors the new catalog `scope` flag (a `scope:"global"` surface is no longer
+re-subscribed on every conversation switch; absent = conversation-scoped, conservative);
+`secondsUntilNext` gained a 3s belt-and-braces stale guard (a past `nextWarmAt` renders "waiting…",
+should never trigger now). **CR-4d correction: the missing `conversationId` echo on the initial
+`surface` message was OURS** — the backend was right, HEAD echoes it; our WS parser
+(`adapters/ws/logic.ts` `case "surface"`) rebuilt the message and DROPPED the field. Fixed + unit
+tests; the protocol reducer's stale-scope filtering now actually bites on the initial reply too.
+Probe verified live: default OFF + nothing scheduled on a fresh conversation; toggle-on/interval
+updates carry FUTURE `nextWarmAt`; repeated automatic warms each push a FUTURE `nextWarmAt`;
+`nextWarmAt: null` pushed on `turn-start`; close mid-turn → 200 `abortedTurn:true`, watcher gets
+`done` `reason:"aborted"` + `turn-sealed`, surface flips `enabled:false`/`nextWarmAt:null`; second
+close idempotent (`abortedTurn:false`). CR-1 table payload also verified arriving (FE renderer
+pre-existing). 568 tests green._
 
 **Turn-continuity handoff (`frontend-turn-continuity-handoff.md`) → CONSUMED ✅.** Re-pinned
 `[email protected]→0.7.0` (additive; `wire` unchanged at `0.5.0`); re-mirrored
@@ -43,7 +60,7 @@ backend ask — but the max-limit denominator is now a live FE need; see §3.
 
 ## 1. Pinned backend contracts (consumed by the FE)
 
-Pinned as `file:` deps: **`[email protected]`; `[email protected]`; `[email protected]`**.
+Pinned as `file:` deps: **`[email protected]`; `[email protected]`; `[email protected]`**.
 
 | Package | Used for |
 |---|---|
@@ -54,62 +71,58 @@ Pinned as `file:` deps: **`[email protected]`; `[email protected]`; `transport-contract
 Endpoints in use (HTTP **24203**, WS **24205**, CORS `*` incl. `PUT`):
 `POST /chat` (NDJSON) · `GET /models` · `GET /conversations/:id?sinceSeq=<n>` ·
 `GET /conversations/:id/metrics` · `GET`/`PUT /conversations/:id/cwd` ·
-`GET /conversations/:id/lsp` · `POST /chat/warm` · WS `chat.send`→`chat.delta` ·
+`GET /conversations/:id/lsp` · `POST /chat/warm` · `POST /conversations/:id/close` (explicit
+tab-close: abort turn + stop/disable warming) · WS `chat.send`→`chat.delta` ·
 WS `chat.subscribe`/`chat.unsubscribe` (watch a conversation's turns without sending; replay + live).
 
 Mirrored in-repo for headless agents: `.dispatch/{ui-contract,wire,transport-contract}.reference.md`
-(regenerate on any contract bump; all current as of `[email protected]` / `[email protected]`).
+(regenerate on any contract bump; all current as of `[email protected]` /
+`[email protected]` / `[email protected]`).
 
 ## 2. Open asks FOR THE BACKEND
 
-### CR-1 — emit the **Loaded Extensions** surface as a true table
-
-The user wants the Loaded Extensions surface rendered as a nice multi-column
-table (e.g. `Name | Version | Trust | Scope`), listing **all** loaded extensions.
-
-**Already covered — do NOT redo these (no contract change needed):**
-- The `custom` field kind + `rendererId` + graceful-skip already exist in
-  `[email protected]`. CR-1 uses that escape hatch — no `@dispatch/ui-contract` bump.
-- The FE renderer is **done and shipped**: `SurfaceView` → `SurfaceTable` →
-  shared `Table`, dispatched on `rendererId === "table"`. It renders the moment
-  the surface emits the field below.
-- The FE already groups consecutive `stat` fields into an aligned 2-column
-  (label → value) table, so the current surface (one `stat` per extension:
-  name → version) is **already readable as a table today**. CR-1 is the upgrade
-  to real columns, not a fix for something broken.
-- The "frontend modules" half of the Extensions view is **100% FE-owned**
-  (aggregated from each FE feature's `manifest`) — backend has nothing to provide there.
-
-**What I NEED from the backend to finish it:** replace the N per-extension
-`stat` fields with a SINGLE `custom` field:
-```ts
-{
-  kind: "custom",
-  rendererId: "table",
-  payload: {
-    columns: string[],                      // header labels
-    rows: (string | number | boolean)[][],  // each row aligns cell-for-cell to columns
-  },
-}
-```
-- Cells are coerced to strings; a malformed payload renders nothing (safe skip).
-- `rows` should enumerate **every** loaded extension (all trust tiers / kinds),
-  so "show all" is satisfied from this one surface.
-
-**Optional (data quality, not a blocker):** extension manifest `version`s all
-read `0.0.0` (unversioned). If real versions should appear in the table column,
-bump each extension's manifest `version` — otherwise the column is all `0.0.0`.
-
-### CR-2 (optional, low priority) — a `scope` flag on the surface catalog entry
-
-The catalog (`SurfaceCatalogEntry`) carries no hint of whether a surface is GLOBAL or
-CONVERSATION-SCOPED, so the FE follows the handoff's "always send the focused `conversationId`"
-policy. That works (global surfaces ignore it; the FE's routing accepts the no-echo global reply),
-but it means the FE **re-subscribes every surface — including global ones like `loaded-extensions` —
-on every conversation switch**, which is needless churn (one redundant unsubscribe+subscribe round
-trip per global surface per switch; no user-visible bug, the old spec is retained so there's no
-flicker). An optional `scope?: "global" | "conversation"` on `SurfaceCatalogEntry` would let the FE
-skip re-subscribing globals on switch. **Not blocking** — only raise if cheap.
+**None open.** Resolved history below.
+
+### CR-1 — Loaded Extensions as a true table → **RESOLVED ✅** (shipped + consumed)
+
+Backend now emits the "Loaded" count stat plus ONE
+`{ kind: "custom", rendererId: "table", payload: { columns, rows } }` field
+(`columns: ["Name", "Version", "Trust", "Activation"]`, one row per loaded extension, all trust
+tiers). Verified arriving live; the FE's pre-existing `SurfaceTable` renderer (dispatch on
+`rendererId === "table"`) shows it with no FE change. A typed `TablePayload` (+ `TABLE_RENDERER_ID`)
+is exported from `@dispatch/surface-loaded-extensions` if we ever want to narrow instead of
+duck-typing — not consumed (would add a dep for no behavior change). Data-quality note stands:
+`Version` cells all read `0.0.0` (manifests genuinely unversioned; optional backend cleanup).
+
+### CR-2 — catalog `scope` flag → **RESOLVED ✅** (`[email protected]`, consumed)
+
+`SurfaceCatalogEntry.scope?: "global" | "conversation"` shipped (emitted: `loaded-extensions` →
+global, `cache-warming` → conversation). FE consumed: `syncSubscriptions` subscribes a
+`scope:"global"` surface WITHOUT a conversationId, so a conversation switch no longer churns a
+redundant unsubscribe+subscribe per global surface. ABSENT scope = assume conversation-scoped
+(conservative, per contract).
+
+### CR-4 — cache-warming lifecycle → **RESOLVED ✅** (courier `backend-handoff-cache-warming.md`; reply `frontend-cache-warming-lifecycle-handoff.md`; live-probed 17/17)
+
+All four asks shipped + consumed (`[email protected]`):
+- **(a) default OFF** for a new conversation (interval default still 240s; re-enable restores the
+  persisted interval). Verified live.
+- **(b) FUTURE `nextWarmAt`** pushed after every automatic warm + after `turn-sealed`;
+  `nextWarmAt: null` pushed on `turn-start` (FE renders "waiting…" while generating) and when
+  disabled. Verified live (2 automatic warms @10s, both future).
+- **(c) `POST /conversations/:id/close`** (`CloseConversationResponse { conversationId,
+  abortedTurn }`): aborts an in-flight turn (partial persisted, seals with `reason: "aborted"` →
+  watchers' `generating` clears structurally) + stops/disables warming (persisted OFF), idempotent;
+  disconnect/`chat.unsubscribe` still never touch either. FE wires it in `store.closeTab()`
+  (fire-and-forget). Verified live incl. mid-turn abort + idempotent re-close.
+- **(d) `conversationId` echo on the initial `surface` message — was an FE BUG, not backend.**
+  The backend's frame carries it (raw-frame verified); our WS parser
+  (`adapters/ws/logic.ts` `case "surface"`) rebuilt the message and dropped the field. Fixed FE-side
+  + unit-tested; stale-scope filtering now applies to the initial reply too. Backend owes nothing.
+
+**Known caveat (accepted, fail-safe):** the warming opt-in is NOT re-hydrated across a backend
+RESTART — a conversation reads disabled until toggled again. Flag to the backend if persistence
+across restarts becomes a product need (they offered boot hydration).
 
 ### cwd + LSP draft path → **VERIFIED ✅ (all 6 asks confirmed; courier `backend-handoff-cwd-lsp.md`)**
 
@@ -184,7 +197,11 @@ errored turn would leave a persisted prompt with no reply).
   source the model's advertised window.
 - `GET /conversations` — conversation list / sidebar (history explorer / switcher); could also expose a
   per-conversation "last model" so a reopened tab seeds its model from the server instead of localStorage.
-- `POST /conversations/:id/cancel` — "stop generating".
+- ~~`POST /conversations/:id/cancel`~~ — **superseded by `POST /conversations/:id/close`
+  (CR-4c, shipped)**. A standalone "stop generating WITHOUT closing/disabling warming" button would
+  still need a separate affordance if the product ever wants it.
+- **Warming opt-in persistence across backend restarts** — currently fail-safe-off after a restart;
+  backend offered boot hydration if it becomes a need (see CR-4 caveat in §2).
 - **LSP status over WS** (push) — today the FE HTTP-polls `GET /conversations/:id/lsp` on panel mount /
   cwd change + a manual refresh; a live surface/WS push would remove the manual refresh and reflect a
   server flipping to `error`/`connected` without a reload. (Backend flagged this as a future option.)
diff --git a/scripts/probe-cache-warming.ts b/scripts/probe-cache-warming.ts
new file mode 100644
index 0000000..470e43b
--- /dev/null
+++ b/scripts/probe-cache-warming.ts
@@ -0,0 +1,277 @@
+/**
+ * scripts/probe-cache-warming.ts — LIVE probe of the `cache-warming` surface +
+ * conversation-close lifecycle against a RUNNING backend (bin/up: HTTP :24203 +
+ * surface WS :24205; override with PROBE_HTTP / PROBE_WS for bin/up2's +1000
+ * ports). NOT part of `bun run test`. Verifies the CR-4 handoff end-to-end:
+ *
+ *   A. draft subscribe (no conversationId) → degenerate "no conversation" spec
+ *   B. fresh conversation → warming defaults OFF, nothing scheduled (CR-4a)
+ *   C. toggle on + 10s interval → repeated automatic warms, each update carrying
+ *      a FUTURE nextWarmAt (CR-4b), initial `surface` echoes conversationId (CR-4d)
+ *   D. POST /conversations/:id/close mid-turn → abortedTurn, done.reason
+ *      "aborted", turn-sealed, warming disabled + unscheduled (CR-4c)
+ *
+ *     bun scripts/probe-cache-warming.ts
+ */
+import type {
+	ChatDeltaMessage,
+	ChatErrorMessage,
+	CloseConversationResponse,
+} from "@dispatch/transport-contract";
+import type { SurfaceServerMessage, SurfaceSpec } from "@dispatch/ui-contract";
+import { createSurfaceSocket } from "../src/adapters/ws/index.ts";
+import { parseControls } from "../src/features/cache-warming/logic/view-model.ts";
+
+const WS_URL = process.env.PROBE_WS ?? "ws://localhost:24205";
+const HTTP_BASE = process.env.PROBE_HTTP ?? "http://localhost:24203";
+const SURFACE_ID = "cache-warming";
+
+const checks: { name: string; ok: boolean }[] = [];
+const record = (name: string, ok: boolean, detail?: string) => {
+	checks.push({ name, ok });
+	console.log(`  ${ok ? "✅" : "❌"} ${name}${detail ? ` — ${detail}` : ""}`);
+};
+const log = (msg: string) => console.log(`[${new Date().toISOString().slice(11, 19)}] ${msg}`);
+const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
+
+function summarize(spec: SurfaceSpec | null): string {
+	const c = parseControls(spec);
+	const next =
+		c.nextWarmAt === null ? "null" : `${Math.round((c.nextWarmAt - Date.now()) / 1000)}s`;
+	return `enabled=${c.enabled} interval=${c.intervalSeconds}s lastPct=${c.lastPct} next=${next} lastWarmAt=${c.lastWarmAt}`;
+}
+
+let catalog: { id: string; scope?: string }[] = [];
+let latestSpec: SurfaceSpec | null = null;
+let latestSpecConv: string | undefined;
+let specWaiter: (() => void) | null = null;
+
+const chatHandlers = new Map<string, (msg: ChatDeltaMessage | ChatErrorMessage) => void>();
+
+const socket = createSurfaceSocket({
+	url: WS_URL,
+	onMessage: (m: SurfaceServerMessage) => {
+		if (m.type === "catalog") {
+			catalog = [...m.catalog];
+			log(`catalog: ${m.catalog.map((e) => `${e.id}(scope=${e.scope ?? "—"})`).join(", ")}`);
+		} else if (m.type === "surface") {
+			latestSpec = m.spec;
+			latestSpecConv = m.conversationId;
+			log(`surface(initial) conv=${m.conversationId ?? "—"}: ${summarize(m.spec)}`);
+			specWaiter?.();
+		} else if (m.type === "update") {
+			if (m.update.surfaceId !== SURFACE_ID) return;
+			latestSpec = m.update.spec;
+			latestSpecConv = m.update.conversationId;
+			log(`update conv=${m.update.conversationId ?? "—"}: ${summarize(m.update.spec)}`);
+			specWaiter?.();
+		} else if (m.type === "error") {
+			log(`surface ERROR: ${m.surfaceId ?? "—"}: ${m.message}`);
+		}
+	},
+	onChat: (msg) => {
+		const id = msg.type === "chat.error" ? msg.conversationId : msg.event.conversationId;
+		if (id !== undefined) chatHandlers.get(id)?.(msg);
+	},
+});
+
+/** Wait for the next surface/update message (or time out). */
+function nextSpec(timeoutMs: number): Promise<boolean> {
+	return new Promise((resolve) => {
+		const t = setTimeout(() => {
+			specWaiter = null;
+			resolve(false);
+		}, timeoutMs);
+		specWaiter = () => {
+			clearTimeout(t);
+			specWaiter = null;
+			resolve(true);
+		};
+	});
+}
+
+async function runTinyTurn(conversationId: string, prompt: string): Promise<boolean> {
+	const done = Promise.withResolvers<boolean>();
+	chatHandlers.set(conversationId, (msg) => {
+		if (msg.type === "chat.error") {
+			log(`chat.error: ${msg.message}`);
+			done.resolve(false);
+		} else if (msg.event.type === "turn-sealed") {
+			done.resolve(true);
+		}
+	});
+	socket.send({ type: "chat.send", conversationId, message: prompt });
+	const t = setTimeout(() => done.resolve(false), 90_000);
+	const ok = await done.promise;
+	clearTimeout(t);
+	chatHandlers.delete(conversationId);
+	return ok;
+}
+
+function invoke(actionId: string, conversationId: string, payload?: unknown): void {
+	socket.send(
+		payload === undefined
+			? { type: "invoke", surfaceId: SURFACE_ID, actionId, conversationId }
+			: { type: "invoke", surfaceId: SURFACE_ID, actionId, payload, conversationId },
+	);
+}
+
+async function main() {
+	await sleep(600);
+	record(
+		"catalog includes cache-warming with scope=conversation",
+		catalog.some((e) => e.id === SURFACE_ID && e.scope === "conversation"),
+	);
+
+	// ── A: the DRAFT/new-tab path — subscribe with NO conversationId ───────────
+	log("PHASE A: subscribe with NO conversationId (draft / new tab)");
+	socket.send({ type: "subscribe", surfaceId: SURFACE_ID });
+	await nextSpec(3000);
+	record(
+		"draft subscribe → degenerate spec (no toggle parsed)",
+		!parseControls(latestSpec).enabled,
+	);
+	socket.send({ type: "unsubscribe", surfaceId: SURFACE_ID });
+	await sleep(300);
+
+	// ── B: a FRESH conversation defaults OFF (CR-4a) + echo (CR-4d) ────────────
+	const conv = crypto.randomUUID();
+	log(`PHASE B: creating conversation ${conv}`);
+	if (!(await runTinyTurn(conv, "Reply with exactly: ok"))) {
+		log("FATAL: could not create a conversation");
+		process.exit(1);
+	}
+	socket.send({ type: "subscribe", surfaceId: SURFACE_ID, conversationId: conv });
+	await nextSpec(3000);
+	const fresh = parseControls(latestSpec);
+	record("CR-4d: initial surface message echoes conversationId", latestSpecConv === conv);
+	record("CR-4a: fresh conversation defaults to warming OFF", fresh.enabled === false);
+	record("CR-4a: nothing scheduled while off (nextWarmAt null)", fresh.nextWarmAt === null);
+
+	// ── C: opt in + 10s interval → repeated warms, FUTURE nextWarmAt (CR-4b) ───
+	log("PHASE C: toggling warming ON");
+	const toggleId = fresh.toggleActionId;
+	if (toggleId === null) {
+		record("toggle action present", false);
+		process.exit(1);
+	}
+	invoke(toggleId, conv);
+	await nextSpec(3000);
+	let c = parseControls(latestSpec);
+	record("toggle-on update arrived (enabled)", c.enabled === true);
+	record(
+		"CR-4b: enable schedules a FUTURE nextWarmAt",
+		c.nextWarmAt !== null && c.nextWarmAt > Date.now(),
+	);
+
+	const setIntervalId = c.setIntervalActionId;
+	if (setIntervalId !== null) {
+		log("PHASE C: set-interval = 10s");
+		invoke(setIntervalId, conv, 10);
+		await nextSpec(3000);
+		c = parseControls(latestSpec);
+		record(
+			"set-interval update: interval=10 + FUTURE nextWarmAt",
+			c.intervalSeconds === 10 && c.nextWarmAt !== null && c.nextWarmAt > Date.now(),
+		);
+	}
+
+	log("PHASE C: waiting up to 45s for 2 automatic warms…");
+	const deadline = Date.now() + 45_000;
+	let lastSeen = c.lastWarmAt;
+	let warms = 0;
+	let allFuture = true;
+	while (Date.now() < deadline && warms < 2) {
+		await nextSpec(Math.max(1, deadline - Date.now()));
+		const now = parseControls(latestSpec);
+		if (now.lastWarmAt !== null && now.lastWarmAt !== lastSeen) {
+			lastSeen = now.lastWarmAt;
+			warms++;
+			const future = now.nextWarmAt !== null && now.nextWarmAt > Date.now() - 1000;
+			if (!future) allFuture = false;
+			log(
+				`  automatic warm #${warms}: pct=${now.lastPct} retention=${now.retentionPct} ` +
+					`nextWarmAt ${future ? "FUTURE" : "STALE/PAST"}`,
+			);
+		}
+	}
+	record("automatic warms repeat (2 observed @10s)", warms >= 2, `${warms} warm(s)`);
+	record("CR-4b: every post-warm update carries a FUTURE nextWarmAt", warms >= 2 && allFuture);
+
+	// ── D: close mid-turn → abort + warming disabled (CR-4c) ───────────────────
+	log("PHASE D: starting a long turn, then closing the conversation mid-turn…");
+	const seenDone = Promise.withResolvers<string>(); // resolves with done.reason
+	const seenSealed = Promise.withResolvers<void>();
+	let turnStarted = false;
+	const started = Promise.withResolvers<void>();
+	chatHandlers.set(conv, (msg) => {
+		if (msg.type === "chat.error") {
+			log(`chat.error: ${msg.message}`);
+			return;
+		}
+		const ev = msg.event;
+		if (ev.type === "turn-start") {
+			turnStarted = true;
+			started.resolve();
+		} else if (ev.type === "done") {
+			seenDone.resolve(ev.reason);
+		} else if (ev.type === "turn-sealed") {
+			seenSealed.resolve();
+		}
+	});
+	socket.send({
+		type: "chat.send",
+		conversationId: conv,
+		message:
+			"Write a detailed 1000-word essay about the history of computing. Take your time and be thorough.",
+	});
+	const startTimeout = setTimeout(() => started.resolve(), 15_000);
+	await started.promise;
+	clearTimeout(startTimeout);
+	record("turn started (watcher saw turn-start)", turnStarted);
+	await sleep(1000); // let it generate a moment
+
+	const res = await fetch(`${HTTP_BASE}/conversations/${encodeURIComponent(conv)}/close`, {
+		method: "POST",
+		headers: { Origin: "http://localhost:24204" },
+	});
+	record("POST /conversations/:id/close → 200", res.ok, `HTTP ${res.status}`);
+	const body = (await res.json()) as CloseConversationResponse;
+	record("close aborted the in-flight turn (abortedTurn)", body.abortedTurn === true);
+
+	const doneReason = await Promise.race([seenDone.promise, sleep(15_000).then(() => "(timeout)")]);
+	record('watcher received done with reason "aborted"', doneReason === "aborted", doneReason);
+	const sealed = await Promise.race([
+		seenSealed.promise.then(() => true),
+		sleep(15_000).then(() => false),
+	]);
+	record("turn sealed normally after abort", sealed);
+	chatHandlers.delete(conv);
+
+	// The close also pushed a surface update: warming disabled + unscheduled.
+	await sleep(1500);
+	const closed = parseControls(latestSpec);
+	record(
+		"CR-4c: close disabled warming + cleared the schedule",
+		closed.enabled === false && closed.nextWarmAt === null,
+		summarize(latestSpec),
+	);
+
+	// Idempotency: closing again (now idle) succeeds with abortedTurn false.
+	const res2 = await fetch(`${HTTP_BASE}/conversations/${encodeURIComponent(conv)}/close`, {
+		method: "POST",
+		headers: { Origin: "http://localhost:24204" },
+	});
+	const body2 = (await res2.json()) as CloseConversationResponse;
+	record("close is idempotent (200 + abortedTurn:false)", res2.ok && body2.abortedTurn === false);
+
+	socket.close();
+	const passed = checks.filter((x) => x.ok).length;
+	console.log(`\n[probe-cache-warming] ${passed}/${checks.length} checks passed`);
+	process.exit(passed === checks.length ? 0 : 1);
+}
+
+main().catch((e) => {
+	console.error(`[probe] FATAL: ${e}`);
+	process.exit(1);
+});
diff --git a/src/adapters/ws/logic.test.ts b/src/adapters/ws/logic.test.ts
index 546afe1..2784295 100644
--- a/src/adapters/ws/logic.test.ts
+++ b/src/adapters/ws/logic.test.ts
@@ -64,6 +64,29 @@ describe("parseServerMessage", () => {
 		});
 	});
 
+	it("preserves the conversationId echo on a scoped surface message", () => {
+		const data = JSON.stringify({
+			type: "surface",
+			spec: { id: "s1", region: "r", title: "S1", fields: [] },
+			conversationId: "c1",
+		});
+		const result = parseServerMessage(data);
+		expect(result).toEqual({
+			type: "surface",
+			spec: { id: "s1", region: "r", title: "S1", fields: [] },
+			conversationId: "c1",
+		});
+	});
+
+	it("rejects a surface message with a non-string conversationId", () => {
+		const data = JSON.stringify({
+			type: "surface",
+			spec: { id: "s1", region: "r", title: "S1", fields: [] },
+			conversationId: 42,
+		});
+		expect(parseServerMessage(data)).toBeNull();
+	});
+
 	it("parses an update message", () => {
 		const data = JSON.stringify({
 			type: "update",
diff --git a/src/adapters/ws/logic.ts b/src/adapters/ws/logic.ts
index 6592f1b..17e3951 100644
--- a/src/adapters/ws/logic.ts
+++ b/src/adapters/ws/logic.ts
@@ -59,7 +59,15 @@ export function parseServerMessage(data: string): WsServerMessage | null {
 			if (typeof spec.region !== "string") return null;
 			if (typeof spec.title !== "string") return null;
 			if (!Array.isArray(spec.fields)) return null;
-			return { type: "surface", spec: spec as unknown as SurfaceMessage["spec"] };
+			// Preserve the conversationId echo (a conversation-scoped surface's initial
+			// reply carries it) — dropping it would defeat the protocol reducer's
+			// stale-scope filtering on a fast conversation switch.
+			const conversationId = parsed.conversationId;
+			if (conversationId !== undefined && typeof conversationId !== "string") return null;
+			const surfaceSpec = spec as unknown as SurfaceMessage["spec"];
+			return conversationId !== undefined
+				? { type: "surface", spec: surfaceSpec, conversationId }
+				: { type: "surface", spec: surfaceSpec };
 		}
 		case "update": {
 			const update = parsed.update;
diff --git a/src/app/store.svelte.ts b/src/app/store.svelte.ts
index df92b31..2837bb5 100644
--- a/src/app/store.svelte.ts
+++ b/src/app/store.svelte.ts
@@ -256,6 +256,23 @@ export function createAppStore(opts?: CreateAppStoreOptions): AppStore {
 		socket?.send({ type: "chat.unsubscribe", conversationId });
 	}
 
+	/**
+	 * Tell the backend the user EXPLICITLY closed this conversation's tab
+	 * (`POST /conversations/:id/close`): aborts any in-flight turn (it seals with
+	 * `reason: "aborted"`) and stops + DISABLES its cache-warming (persisted OFF).
+	 * Distinct from a disconnect / `chat.unsubscribe`, which deliberately leave
+	 * both running. Fire-and-forget: a failure is non-fatal (worst case the
+	 * warming keeps running until a later close/toggle), and the endpoint is
+	 * idempotent server-side.
+	 */
+	function closeConversation(conversationId: string): void {
+		void fetchImpl(`${httpBase}/conversations/${encodeURIComponent(conversationId)}/close`, {
+			method: "POST",
+		}).catch(() => {
+			// Non-fatal — see doc comment.
+		});
+	}
+
 	/** The conversation the surfaces should scope to (undefined for a draft). */
 	function focusedConversationId(): string | undefined {
 		return tabsStore.activeConversationId ?? undefined;
@@ -289,7 +306,12 @@ export function createAppStore(opts?: CreateAppStoreOptions): AppStore {
 	function syncSubscriptions(): void {
 		const cid = focusedConversationId();
 		for (const entry of protocol.catalog) {
-			const result = protocolSubscribe(protocol, entry.id, cid);
+			// A GLOBAL surface ignores conversation scope — subscribe it WITHOUT an id
+			// so a conversation switch doesn't churn a redundant unsubscribe+subscribe
+			// round trip ([email protected] catalog `scope`; ABSENT = assume
+			// conversation-scoped, the conservative pre-0.2.0 policy).
+			const scoped = entry.scope === "global" ? undefined : cid;
+			const result = protocolSubscribe(protocol, entry.id, scoped);
 			protocol = result.state;
 			for (const msg of result.outgoing) {
 				socket?.send(msg);
@@ -489,7 +511,10 @@ export function createAppStore(opts?: CreateAppStoreOptions): AppStore {
 
 		closeTab(conversationId: string): void {
 			tabsStore.closeTab(conversationId);
-			// Stop watching the closed conversation's turns (does NOT stop the turn).
+			// The user is DONE with this chat for now: abort any in-flight turn and
+			// stop + disable its cache-warming, server-side.
+			closeConversation(conversationId);
+			// Stop watching the closed conversation's turns.
 			unsubscribeChat(conversationId);
 			const store = chatStores.get(conversationId);
 			if (store !== undefined) {
diff --git a/src/app/store.test.ts b/src/app/store.test.ts
index 803d7dc..f4b5a0f 100644
--- a/src/app/store.test.ts
+++ b/src/app/store.test.ts
@@ -674,6 +674,78 @@ describe("createAppStore", () => {
 		store.dispose();
 	});
 
+	it("closing a tab POSTs /conversations/:id/close (abort turn + stop warming)", async () => {
+		const calls: { url: string; method: string }[] = [];
+		const base = fakeFetchImpl();
+		const fetchImpl: typeof fetch = async (input, init) => {
+			const url = typeof input === "string" ? input : input instanceof URL ? input.href : input.url;
+			calls.push({ url, method: init?.method ?? "GET" });
+			if (url.endsWith("/close")) {
+				return new Response(
+					JSON.stringify({ conversationId: url.split("/").at(-2), abortedTurn: false }),
+					{ status: 200 },
+				);
+			}
+			return base(input, init);
+		};
+		const ws = fakeSocket();
+		const store = createAppStore({
+			socketFactory: () => ws,
+			fetchImpl,
+			localStorage: createFakeStorage(),
+		});
+		ws.resolveOpen();
+
+		store.send("first");
+		const convId = activeConversationId(store);
+		store.closeTab(convId);
+		await Promise.resolve(); // flush the fire-and-forget fetch
+
+		const close = calls.find((c) => c.url.endsWith(`/conversations/${convId}/close`));
+		expect(close).toBeDefined();
+		expect(close?.method).toBe("POST");
+
+		store.dispose();
+	});
+
+	it("does NOT re-scope a scope:'global' surface on conversation switch (no churn)", () => {
+		const ws = fakeSocket();
+		const store = createAppStore({
+			socketFactory: () => ws,
+			fetchImpl: fakeFetchImpl(),
+			localStorage: createFakeStorage(),
+		});
+		ws.resolveOpen();
+
+		ws.feedSurfaceMessage({
+			type: "catalog",
+			catalog: [
+				{ id: "s-global", region: "side", title: "Global", scope: "global" },
+				{ id: "s-conv", region: "side", title: "Scoped", scope: "conversation" },
+			],
+		});
+
+		ws.sent.length = 0;
+		store.send("promote the draft"); // draft → real conversation: surfaces re-scope
+		const convId = activeConversationId(store);
+
+		const surfaceMsgs = parseSent(ws).filter(
+			(p): p is { type: string; surfaceId: string; conversationId?: string } =>
+				(p as { type: string }).type === "subscribe" ||
+				(p as { type: string }).type === "unsubscribe",
+		);
+		// The conversation-scoped surface re-scopes: unsubscribe old + subscribe new id.
+		expect(
+			surfaceMsgs.some(
+				(m) => m.type === "subscribe" && m.surfaceId === "s-conv" && m.conversationId === convId,
+			),
+		).toBe(true);
+		// The global surface is untouched — no redundant unsubscribe+subscribe round trip.
+		expect(surfaceMsgs.some((m) => m.surfaceId === "s-global")).toBe(false);
+
+		store.dispose();
+	});
+
 	it("tabs persist to the injected storage and restore on a new store", () => {
 		const ws = fakeSocket();
 		const storage = createFakeStorage();
diff --git a/src/features/cache-warming/logic/view-model.test.ts b/src/features/cache-warming/logic/view-model.test.ts
index 3d6f6d0..d5ea901 100644
--- a/src/features/cache-warming/logic/view-model.test.ts
+++ b/src/features/cache-warming/logic/view-model.test.ts
@@ -215,6 +215,14 @@ describe("secondsUntilNext (authoritative, from nextWarmAt)", () => {
 		expect(secondsUntilNext(10_000, 10_000)).toBe(0);
 		expect(secondsUntilNext(250_000, 10_000)).toBe(240);
 		expect(secondsUntilNext(70_000, 10_000)).toBe(60);
-		expect(secondsUntilNext(5_000, 999_999)).toBe(0); // already past
+	});
+
+	it("treats a nextWarmAt past the stale grace as not scheduled (belt-and-braces)", () => {
+		// Within the 3s grace an on-time warm may briefly read "0s"…
+		expect(secondsUntilNext(10_000, 11_000)).toBe(0);
+		expect(secondsUntilNext(10_000, 13_000)).toBe(0);
+		// …but beyond it the value is stale → null (the "waiting…" state).
+		expect(secondsUntilNext(10_000, 13_001)).toBeNull();
+		expect(secondsUntilNext(5_000, 999_999)).toBeNull();
 	});
 });
diff --git a/src/features/cache-warming/logic/view-model.ts b/src/features/cache-warming/logic/view-model.ts
index f7740d7..eb105f6 100644
--- a/src/features/cache-warming/logic/view-model.ts
+++ b/src/features/cache-warming/logic/view-model.ts
@@ -232,11 +232,23 @@ export function observeWarm(
 }
 
 /**
+ * Grace before a PAST `nextWarmAt` is treated as "not scheduled" (→ the
+ * "waiting…" state instead of a perpetual "0s"). The backend pushes the FUTURE
+ * `nextWarmAt` after every warm (CR-4b) and `null` while generating/disabled, so
+ * this is a belt-and-braces guard that should never trigger — it only matters
+ * against a stale/buggy emitter, and the small window lets an on-time warm show
+ * "0s" for the second it takes to complete.
+ */
+const STALE_NEXT_WARM_MS = 3000;
+
+/**
  * Seconds until the next automatic warm, AUTHORITATIVE: derived straight from the
  * backend's `nextWarmAt` epoch-ms (never FE-anchored/guessed). `null` when nothing
- * is scheduled (disabled, or a turn is generating so the timer is cancelled).
+ * is scheduled (disabled, or a turn is generating so the timer is cancelled) — or
+ * when `nextWarmAt` is stale (further than the grace into the past).
  */
 export function secondsUntilNext(nextWarmAt: number | null, now: number): number | null {
 	if (nextWarmAt === null) return null;
+	if (now - nextWarmAt > STALE_NEXT_WARM_MS) return null;
 	return Math.max(0, Math.ceil((nextWarmAt - now) / 1000));
 }
author	Adam Malczewski <[email protected]>	2026-06-12 16:28:07 +0900
committer	Adam Malczewski <[email protected]>	2026-06-12 16:28:07 +0900
commit	4001274e3ba25a3946df1e9f2dc82ca6781cd2bf (patch)
tree	24af95e69bda5c38ab7eefd6b71d55b4c247040a
parent	e6f6bd86eab07954d8f06e740659969c3dfecc7f (diff)
download	dispatch-web-4001274e3ba25a3946df1e9f2dc82ca6781cd2bf.tar.gz dispatch-web-4001274e3ba25a3946df1e9f2dc82ca6781cd2bf.zip