| Age | Commit message (Collapse) | Author |
|
1. Real context window: GET /models now returns modelInfo[model].contextWindow.
The Composer uses this instead of the hardcoded MAX_CONTEXT = 1,000,000.
Falls back to 1M when modelInfo is absent or the model has no contextWindow.
2. Percentage-based auto-compact: the compact-threshold endpoint is renamed
to compact-percent. The CompactionView now shows a percent input (0-100,
default 85, 0 = manual) instead of a token count input. Types renamed:
CompactThresholdResponse → CompactPercentResponse,
SetCompactThresholdRequest → SetCompactPercentRequest.
Note: the field name in the backend types is still 'threshold' (not
'percent') — the FE maps between them.
Re-mirrored .dispatch/transport-contract.reference.md.
686 tests green. 0 svelte-check errors + warnings.
|
|
Tool calls and results now use the same DaisyUI collapse pattern as
thinking blocks — collapsed by default, click to expand. Each card
shows the tool name + a wrench icon in the title; expanding reveals
the input/output with overflow-x-auto for long lines and max-h-96
overflow-y-auto for very long output.
Batched tool calls: each entry is its own collapse card (was a DaisyUI
list). Pending results show a spinner in the title. Errors show a red
badge.
686 tests green.
|
|
Add overflow-hidden to tool card containers and overflow-x-auto to
<pre> elements so long tool output (file contents, JSON, etc.) gets
its own scrollbar instead of expanding the chat width and creating
a horizontal scrollbar on the entire transcript.
686 tests green.
|
|
The turn number comes from the entry's position in the metrics array
(1-based), which is correct regardless of trimming since stepId matching
aligns segments to the right entry. Now displays 'turn 3 · 12k tok' instead
of just 'turn · 12k tok'.
|
|
One button to the right of the text input:
- idle → Send (starts a turn)
- generating + text → Queue (steers via chat.queue)
- generating + empty → Stop (aborts via POST /stop)
|
|
Consume the stop-generation handoff (no version bumps, no new types).
- App store: stopGeneration() → POST /conversations/:id/stop (fire-and-forget)
- Composer: stop button (square, error color) visible only while generating,
next to the send/queue button
- Existing event flow handles the rest: done with reason 'aborted' clears
generating; conversation.statusChanged: idle updates the tab spinner
686 tests green.
|
|
Consume the compaction handoff ([email protected], [email protected]).
Re-pinned file: deps + re-mirrored .dispatch/*.reference.md.
- New 'Compaction' sidebar view (CompactionView.svelte):
- 'Compact now' button → POST /conversations/:id/compact (loading indicator
+ result: 'N messages summarized, M kept')
- Auto-compact threshold number input → GET/PUT
/conversations/:id/compact-threshold (0 = disabled, default 350000)
- Re-mounts per conversation via {#key}
- App store: compactNow() + compactThreshold reactive state +
setCompactThreshold(), seeded on focus change (like reasoning-effort + cwd)
- conversation.compacted WS handler: reloads the SAME conversation's history
(ID unchanged — old history forked to an archive, not a tab switch)
- WS adapter parses newConversationId field on ConversationCompactedMessage
- conformance guards + tests cover the new type
686 tests green.
|
|
boundaries
Consume the message-queue + steering handoff ([email protected], [email protected]).
Re-pinned file: deps + re-mirrored .dispatch/*.reference.md.
- fold steering AgentEvent into the transcript as a provisional user bubble
(after the tool-result it followed; no de-dup — the queue surface carried it)
- add rendererId: "message-queue" custom renderer (pure parser + MessageQueueList)
rendered as a compact panel above the Composer (hidden when queue is empty)
- add ChatStore.queueMessage / AppStore.queueMessage — sends chat.queue WS op
(trim/validate non-empty; auto-starts a turn if idle)
- Composer switches to chat.queue while generating (button → Queue, placeholder
→ Steer the conversation...)
- exhaustiveness guards updated for steering + chat.queue
- carry-to-new-turn needs no special handling (normal new turn)
664 tests green.
|
|
thinking-depth knob
Consume the backend's reasoning-effort handoff ([email protected] ReasoningEffort +
[email protected] GET/PUT /conversations/:id/reasoning-effort,
ChatRequest.reasoningEffort): a 5-level selector in the sidebar Model view,
under the provider + model dropdowns. null renders as 'high (default)' per
the server-owned resolution chain; PUT on change (effective next turn);
error + revert on 400; per-conversation re-mount incl. drafts (the draft id
survives promotion, so an effort set on a draft applies from turn 1).
Re-mirrored .dispatch references; GLOSSARY 'reasoning effort'; handoff
updated. 616 tests green; live curl probe passed.
|
|
show-earlier page-in
Long transcripts no longer grow unbounded: past the chat limit (default 256
chunks, localStorage dispatch.chatLimit) the oldest ceil(limit/4) committed
chunks are unloaded in ONE bulk pass — never one-per-delta (old Dispatch's
scroll-jump-per-step bug) — and only while the reader is stuck to the bottom
(scrolled-up readers defer the trim; it catches up in whole quarters). A fresh
page load windows to the newest floor(0.75*limit). Unloading is purely local
(IndexedDB cache + server keep everything); a hiddenBeforeSeq watermark keeps
history merges from resurrecting unloaded chunks, and a 'Show earlier messages'
affordance pages a quarter back in from the cache with scroll-anchor
preservation. Thinking-collapse render keys stay stable across trims via a
hiddenThinkingCount ordinal base.
- core/chunks/trim.ts: pure policy (trim/window/restore/normalize) + tests
- chat store: chatLimit + canUnload deps, windowed load, showEarlier()
- composition root: dispatch.chatLimit localStorage knob + unload gate wired
to smart-scroll isAtBottom()
- backend CR-5 OPENED (not a blocker): ?limit=/?beforeSeq= on
GET /conversations/:id (courier backend-handoff-chat-limit.md)
- scripts/live-probe.ts: fix pre-existing stale TurnMetricsEntry reads
(m1.usage -> total.usage) that crashed the probe; 17/17 live checks pass
|
|
Restore the ergonomic composer from old Dispatch: an auto-resizing textarea
(1→7 lines) with a fixed-width Send button beside it, and a status bar BELOW
holding a status icon · context-window fill bar (escalating success/warning/
error color) · compact token count (current / limit · pct%).
The bar reuses the latest turn's contextSize as current usage and HARDCODES a
1,000,000-token window limit as a placeholder (real per-model limit is the next
backend ask). Absorbs the standalone ContextSizeBadge (removed). Pure helpers
computeContextUsage + formatCompactTokens added to core/metrics (tested).
540 tests green.
|
|
Backend context-size handoff: re-pin [email protected] / [email protected]
(+ re-mirror .dispatch reference snapshots). Thread the optional contextSize
through core/metrics (done fold + durable + selectCurrentContextSize: latest
turn's defined value, undefined=>unknown never 0, durable-wins-over-live).
Chat store exposes currentContextSize; ContextSizeBadge renders
"N tokens in context" / "context size unknown" above the composer.
GLOSSARY: add context size / context window. 533 tests green.
|
|
cache warming + retention, markdown
Consumes the backend cache-warming + cache-rate handoffs end-to-end and adds supporting infra:
- protocol/transport: conversation-scoped surfaces (conversationId on subscribe/invoke/surface + staleness routing); store auto-subscribes the catalog with the focused conversation and re-scopes on switch.
- surface-host: generic Number field renderer + custom rendererId dispatch (graceful skip on unknown).
- cache-warming feature: enabled toggle, min+sec interval, AUTHORITATIVE countdown from the surface's cache-warming-timer nextWarmAt, manual Warm now (POST /chat/warm), lastWarmAt-keyed history, cache-retention stat, expectedCacheRate headline.
- metrics: cross-turn expected-cache (retention) derivation + bubble badge; cache-rate fix needs no code change (inputTokens now total).
- markdown feature: marked + marked-highlight + highlight.js + dompurify, rendered in ChatView.
- fixes (gemini review): {#key activeConversationId} remount of CacheWarmingView to stop history/feedback leaking across tabs; guard NaN interval inputs from committing 0.
- docs/contracts: regenerated transport/ui-contract mirrors; backend-handoff updated (CR-3 resolved).
Verified: svelte-check 0 errors, biome clean, 494 tests pass, vite build OK.
|
|
- move the model picker out of the chat header into a dedicated "Model" sidebar
view; sidebar now seeds two default panels (Model on top, Extensions below)
- split the single model dropdown into two stacked selects: a key selector
(distinct credential keys) + a model selector (models under the current key)
- pure model-select helpers (splitModelName/joinModelName/modelKeys/modelsForKey),
split on the FIRST slash so multi-slash model names stay intact
- onSelect still emits the full `<key>/<model>` string (ChatRequest.model unchanged)
|
|
Derive cache hit rate (cacheReadTokens / inputTokens) from data already
folded in core/metrics — no backend/contract change.
- core/metrics: computeCachePct + viewCacheRate (pct + success/warning/error
level by 66/33 thresholds + isHit); thread a running cumulativeUsage onto
each finalized turn-metrics row for the conversation total.
- ChatView: render two labelled, colour-coded percentage badges in the
turn-total bubble — "Last turn:" (that turn) and "Chat Total:" (cumulative).
- Honour backend caveats: absent cache fields -> 0, divide-by-zero guarded,
a legitimate 0% renders plainly (not "no data").
|
|
Consume [email protected] / [email protected] metrics: usage.stepId,
step-complete (ttft/decode/genTotal), done.durationMs/usage, and the
durable GET /conversations/:id/metrics endpoint.
- core/metrics: pure live-fold + durable-merge reducer; decode-rate TPS;
head-aligned, stable placement; progressive per-step rows (each shown as
its step ends) with the turn-total row gated on the done event.
- features/chat: store folds metric events + hydrates durable TurnMetrics;
ChatView renders inline step bubbles + a turn-total bubble.
- app: MetricsSync HTTP effect (tolerates 404) injected into chat stores.
- scripts/live-probe: drives the metrics path; live-verified 17/17 vs bin/up.
- docs: regenerate .dispatch wire/transport mirrors to 0.4.0; glossary terms
(turn/step metrics, TTFT, decode time, TPS, metrics bubble); trim handoff.
|
|
This reverts commit 48c6d85c3cc5a57a729f14068e2346b17ed62088.
|
|
Consume wire/transport-contract 0.3.0 (step-complete event + timing
fields on usage/tool-result/done). Pure core/telemetry module:
foldMetricEvent (reducer) + derived selectors (stepTps, turnTps, etc).
TelemetryState is pure data, no active-turn tracking — consumers pass
turnId to selectors. ChatStore wires foldMetricEvent into handleDelta
and exposes telemetry + currentTurnId. ChatView shows step-metrics
footer (time/TPS/tokens) on assistant text bubbles and durationMs badge
on tool cards. New TurnSummary component renders turn-level stats
(wall-clock, tokens, steps, TPS) in a DaisyUI stats block. Extended
live-probe to verify telemetry events against bin/up (pending backend
restart). 336 tests, typecheck 0, biome clean, build ok.
|
|
persisted open
Thinking renders inside a visible rounded-card bubble (like tool calls),
capped to the same max-w-5xl column as assistant text. Uses a DaisyUI
checkbox collapse (no arrow/plus icon) with smooth animation. Title reads
"Thinking" + loading-dots while the model is actively generating, then
flips to "Thoughts" with no dots once done. Open/closed state persists
across the generating→completed→sealed transition via stable ordinal keys
(per-conversation isolation via {#key} in App). Added optional streaming
flag to RenderedChunk (pure selector, only on the accumulating chunk).
|
|
Remove the opacity-50 dimming applied to provisional (streaming) chunks
across user/assistant/tool/batch rendering; in-flight content now renders
at full opacity. Test updated to assert no dimming.
|
|
Consume the backend's new stepId grouping key (wire/transport-contract
0.1.0 -> 0.2.0). foldEvent copies event.stepId onto live tool chunks so
live and replay group identically. New pure selector groupRenderedChunks
(core/chunks) folds a step's 2+ tool calls into one tool-batch group,
pairing each call with its result by toolCallId; single/no-stepId calls
stay as cards. ChatView renders a batch as a DaisyUI list (list-row per
pair). Fixtures updated for the now-required event stepId.
|
|
cards
All messages flow left in one column via the DaisyUI chat-start grid:
- user keeps a primary speech bubble;
- assistant/system/error render in a transparent (invisible) chat-bubble so
they read as plain prose yet inherit identical left spacing, capped to a
readable max-w-5xl column;
- tool call/result render as regular (non-speech) rounded cards, nested in the
same grid so they line up too;
- role header labels dropped; chat-wide left padding added.
Alignment uses specificity-based variants (no !important).
|
|
- features/tabs: pure tab-workspace reducer (create/select/close/setModel/
setTitle/deriveTitle, draft=null active) + injected-persistence runes store
- features/chat: mutable per-tab model (setModel) + delta routing guard
(ignore foreign conversationId) + ModelSelector.svelte + DaisyUI chat bubbles
/ composer (keeps streaming <details> keying fix)
- features/conversation-cache: surface delete(conversationId) on the wrapper
for tab-close local-forget
- adapters/local-storage: generic injected JSON localStore<T> (quota/corrupt-safe)
Verified: svelte-check 0/0, vitest 273, biome clean, build ok.
|
|
ChatView keyed the transcript each-block by object identity, but core/chunks
returns new RenderedChunk objects per delta, so Svelte recreated each
<article>/<details> every frame — an opened Thinking element snapped shut on
the next token. Key by stable identity instead (c${seq} for committed, p${i}
for append-only provisional) so streaming reuses the DOM. Adds a regression
test that the <details> stays open across a streaming update.
Verified: svelte-check 0/0, vitest 222, biome clean, build ok.
|
|
- adapters/idb: createIdbChunkStore implements the ConversationChunkStore port
over IndexedDB (compound [conversationId,seq] key, idempotent append, meta
store for lastAccess); 8 tests with fake-indexeddb
- features/chat: createChatStore (runes-thin over the core/chunks reducer, all
effects injected via ChatTransport/HistorySync/ConversationCache ports) +
ChatView/Composer svelte-thin UI; folds chat.delta, syncs on turn-sealed,
hydrates from cache then catches up; 25 tests
Verified green: svelte-check 0/0, vitest 202, biome clean, build ok.
|