author Adam Malczewski <[email protected]> 2026-06-22 15:01:34 +0900
committer Adam Malczewski <[email protected]> 2026-06-22 15:01:34 +0900
commit e95d526635fd11d75e43e2dd219a6a05b49d43b9
tree dadea212fac43a9b5a12edec1158b163cbf5852f
parent e6e2d31b987043d0325a93d23295af0408b723ec
docs: FE handoff for context window + percentage-based compact
-rw-r--r-- backend-to-fe-handoff-2.md 124
1 files changed, 124 insertions, 0 deletions
diff --git a/backend-to-fe-handoff-2.md b/backend-to-fe-handoff-2.md
new file mode 100644
index 0000000..942381b
--- /dev/null
+++ b/backend-to-fe-handoff-2.md
@@ -0,0 +1,124 @@
+# Backend → FE handoff — context window + percentage-based compact
+
+> Courier to `../dispatch-web`. Response to the context-window ask in
+> `backend-handoff.md` §3 + compacting rework.
+
+## What shipped
+
+1. **`GET /models` now includes `contextWindow` per model** — the FE can replace
+ the hardcoded `MAX_CONTEXT = 1,000,000` with the real value.
+2. **Auto-compact is now percentage-based** (default 85% of `contextWindow`)
+ instead of a flat token count (was 350k).
+
+## Pinned deps
+
+No version bump is needed; both pins stay where they are:
+
+- `@dispatch/wire` → `0.11.0` (unchanged)
+- `@dispatch/transport-contract` → `0.15.0` (unchanged; the new types are
+ additive to the existing version)
+
+## `GET /models` — now includes `modelInfo`
+
+The response now includes an optional `modelInfo` map alongside the existing
+`models` array. The `models` array is unchanged (backward compatible).
+
+```ts
+interface ModelsResponse {
+  readonly models: readonly string[];
+  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
+}
+
+interface ModelMetadata {
+  readonly contextWindow?: number; // max tokens (e.g. 200000)
+}
+```
+
+**Example response:**
+```json
+{
+  "models": ["opencode/deepseek-v4-flash", "umans/umans-glm-5.2"],
+  "modelInfo": {
+    "opencode/deepseek-v4-flash": { "contextWindow": 128000 },
+    "umans/umans-glm-5.2": { "contextWindow": 200000 }
+  }
+}
+```
+
+`modelInfo` is absent when no provider reports `contextWindow`. Each key is the
+same `<credentialName>/<model>` string from the `models` array.
+
+**What the FE should do:**
+- When rendering the composer status bar, use `modelInfo[selectedModel].contextWindow`
+ as the denominator for `contextSize / contextWindow · pct%`.
+- Fall back to the current hardcoded `1,000,000` when `modelInfo` is absent or
+ the selected model has no `contextWindow`.
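+
+A minimal sketch of that fallback, assuming the FE keeps its current constant as
+the fallback value. `resolveContextWindow` and `formatContextUsage` are
+hypothetical helper names (not part of `@dispatch/transport-contract`), and the
+rounding of `pct` is an assumption:
+
+```ts
+interface ModelMetadata {
+  readonly contextWindow?: number;
+}
+
+interface ModelsResponse {
+  readonly models: readonly string[];
+  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
+}
+
+// Fallback when the backend reports no context window (the FE's current hardcoded value).
+const MAX_CONTEXT = 1_000_000;
+
+// Hypothetical helper: pick the denominator for the composer status bar.
+function resolveContextWindow(resp: ModelsResponse, selectedModel: string): number {
+  const reported = resp.modelInfo?.[selectedModel]?.contextWindow;
+  // Treat missing, non-finite, or non-positive values as "not reported".
+  return typeof reported === "number" && Number.isFinite(reported) && reported > 0
+    ? reported
+    : MAX_CONTEXT;
+}
+
+// Hypothetical helper: render "contextSize / contextWindow · pct%".
+function formatContextUsage(contextSize: number, contextWindow: number): string {
+  const pct = Math.round((contextSize / contextWindow) * 100);
+  return `${contextSize} / ${contextWindow} · ${pct}%`;
+}
+```
+
+With the example response above, `resolveContextWindow(resp, "umans/umans-glm-5.2")`
+yields `200000`, while a model missing from `modelInfo` yields `MAX_CONTEXT`.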
+
+## Auto-compact: now percentage-based
+
+- **Old:** flat token threshold (default 350000), triggered when `contextSize >= threshold`.
+- **New:** percentage of `contextWindow` (default 85%), triggered when `contextSize >= contextWindow × (percent / 100)`.
+
+Also fixed: the check now uses `contextSize` (true context occupancy = last
+step's `inputTokens + outputTokens`) instead of the overcounted aggregate
+`usage.inputTokens` (which summed every step's re-prefilled prompt).
+
+### `GET /conversations/:id/compact-percent` — read percent
+
+200: `CompactPercentResponse { conversationId, percent }`
+- `percent: 0`: auto-compact explicitly disabled (manual only).
+- `percent: null` (not stored): the **default of 85** applies (85% of the model's context window).
+- Any positive number (1-100): auto-compact triggers when `contextSize`
+ is at least `percent`% of the model's `contextWindow`
+ (`contextSize >= contextWindow × (percent / 100)`).
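+
+The trigger rule and the three `percent` cases can be sketched as one predicate.
+This is an illustration only, not the backend's implementation:
+`shouldAutoCompact` and `DEFAULT_COMPACT_PERCENT` are hypothetical names, and
+treating `undefined` the same as `null` is an assumption.
+
+```ts
+// Default trigger percentage when none is stored for the conversation.
+const DEFAULT_COMPACT_PERCENT = 85;
+
+// Hypothetical helper mirroring the documented rule.
+// percent 0 disables auto-compact; null/undefined means "not stored", so the default applies.
+function shouldAutoCompact(
+  contextSize: number,
+  contextWindow: number,
+  percent: number | null | undefined,
+): boolean {
+  const effective = percent ?? DEFAULT_COMPACT_PERCENT;
+  if (effective === 0) return false; // manual compaction only
+  return contextSize >= contextWindow * (effective / 100);
+}
+```
+
+For a model with a `contextWindow` of 200000 and no stored percent, the
+threshold works out to 200000 × 0.85 = 170000 tokens.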
+
+### `PUT /conversations/:id/compact-percent` — set percent
+
+Body: `SetCompactPercentRequest { percent: number }`
+- `0` explicitly disables auto-compact.
+- Any positive number (1-100) sets the trigger percentage.
+- Default (when not stored) is 85.
+
+200: `CompactPercentResponse`
+
+**Renamed from `compact-threshold`** — the old endpoint paths, request types,
+and response types are gone. Update any FE code that referenced
+`compact-threshold`.
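+
+A possible shape for the FE-side call to the renamed `PUT` endpoint, as a sketch
+under stated assumptions: `buildSetCompactPercent` and `PutInit` are hypothetical
+names, the JSON content type and a `baseUrl` without a trailing slash are
+assumed, and the client-side range check (0 or 1-100) is an assumption about
+what the FE may want to enforce, not documented backend validation.
+
+```ts
+interface SetCompactPercentRequest {
+  readonly percent: number;
+}
+
+// Structural subset of the fetch() init object, declared locally so the sketch
+// does not depend on DOM typings.
+interface PutInit {
+  readonly method: "PUT";
+  readonly headers: Readonly<Record<string, string>>;
+  readonly body: string;
+}
+
+// Hypothetical helper: validate the percent and build the request pieces.
+function buildSetCompactPercent(
+  baseUrl: string,
+  conversationId: string,
+  percent: number,
+): { url: string; init: PutInit } {
+  const valid = percent === 0 || (percent >= 1 && percent <= 100);
+  if (!valid) {
+    throw new RangeError(`percent must be 0 or within 1-100, got ${percent}`);
+  }
+  const body: SetCompactPercentRequest = { percent };
+  return {
+    url: `${baseUrl}/conversations/${encodeURIComponent(conversationId)}/compact-percent`,
+    init: {
+      method: "PUT",
+      headers: { "Content-Type": "application/json" },
+      body: JSON.stringify(body),
+    },
+  };
+}
+```
+
+The returned pair can be passed directly to `fetch(url, init)`; the response
+body is then a `CompactPercentResponse`.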
+
+## New types
+
+```ts
+// @dispatch/transport-contract
+export interface ModelMetadata {
+  readonly contextWindow?: number;
+}
+
+// ModelsResponse now has modelInfo (additive; models array unchanged)
+export interface ModelsResponse {
+  readonly models: readonly string[];
+  readonly modelInfo?: Readonly<Record<string, ModelMetadata>>;
+}
+
+// Renamed from CompactThresholdResponse
+export interface CompactPercentResponse {
+  readonly conversationId: string;
+  readonly percent: number; // 0 = manual; null = default 85
+}
+
+// Renamed from SetCompactThresholdRequest
+export interface SetCompactPercentRequest {
+  readonly percent: number;
+}
+```
+
+## What the FE needs to do
+
+1. **Use real `contextWindow`** from `GET /models` → `modelInfo[model].contextWindow`
+ instead of hardcoded `MAX_CONTEXT = 1,000,000`.
+
+2. **Rename compact-threshold → compact-percent** in any FE code:
+   - `GET /conversations/:id/compact-percent` (was `compact-threshold`)
+   - `PUT /conversations/:id/compact-percent` (was `compact-threshold`)
+   - `percent` field (was `threshold`)
+
+3. **Settings UI**: change the number input from "token count" to "percent
+ (0-100)". Default 85. 0 = manual only.
+
+4. **No other changes** — the compact endpoint, WS message, and chain
+ architecture are unchanged.