| Age | Commit message | Author |
|
|
|
Add multimodal image/PDF input to the chat box via clipboard paste, gated by a
graceful per-model capability check.
UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the
draft, so attachments have ORDER relative to typed text and can be referenced
positionally. The token is the only handle — deleting it (atomic Backspace/
Delete, or selection overlap) detaches the file; an input-reconciliation safety
net detaches any attachment whose token is no longer intact. No preview strip.
Capability check: resolveModelCapabilities reads models.dev modalities.input
(new GET /models/capabilities, mirrors /context-limit). The input blocks Send
(no tokens spent) only on a definitive 'no'; unknown capability (catalog offline
/ unmapped provider) stays permissive. Attachments require a fresh turn — Send is
blocked while generating and /chat rejects content mid-turn (409).
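The gating rule described in the capability check can be sketched as a tri-state decision. This is a minimal illustration, not the project's code: `supportsInput`, `canSend`, and the modality strings "image"/"pdf" are assumptions, and `undefined` stands in for an offline catalog or unmapped provider.

```typescript
// Tri-state answer: Send is blocked only on a definitive "no".
type Capability = "yes" | "no" | "unknown";

// Hypothetical helper. `inputModalities` stands in for models.dev
// `modalities.input`; undefined models "catalog offline / unmapped provider".
function supportsInput(
  inputModalities: string[] | undefined,
  kind: "image" | "pdf",
): Capability {
  if (inputModalities === undefined) return "unknown";
  return inputModalities.includes(kind) ? "yes" : "no";
}

// Hypothetical helper: decides whether a draft with attachments may be sent.
function canSend(cap: Capability, generating: boolean): boolean {
  if (generating) return false; // attachments require a fresh turn
  return cap !== "no"; // unknown capability stays permissive
}
```

Keeping "unknown" distinct from "no" is what lets the UI stay permissive when the catalog cannot be reached.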
Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK
ImagePart/FilePart content, but never persisted (history keeps the text with
[image]/[pdf] markers). Text-only turns serialize byte-identically to before.
Limits (Anthropic-aligned, enforced at paste + re-validated server-side):
PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total.
core: UserContentPart types, models/attachments validator, capability resolver,
agent.run+toModelMessages thread ordered content. api: /chat content validation +
passthrough. frontend: attachment-tokens helper, ChatInput paste/token/gating,
per-tab staged attachments, App.svelte capability fetch. +44 tests.
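A validator enforcing the limits listed above might look roughly like the following. Only the numeric limits and the accepted formats come from the commit message; the `Attachment` shape, the MIME strings, the function name, and the choice of binary megabytes are assumptions made for illustration.

```typescript
// Hypothetical attachment shape: MIME type plus decoded size in bytes.
type Attachment = { mime: string; bytes: number };

const MB = 1024 * 1024; // assumption: limits are counted in binary megabytes
const IMAGE_MIMES = new Set(["image/png", "image/jpeg", "image/webp", "image/gif"]);
const PDF_MIME = "application/pdf";
const LIMITS = { imageBytes: 5 * MB, pdfBytes: 32 * MB, maxCount: 20, totalBytes: 32 * MB };

// Returns a list of human-readable problems; an empty list means "valid".
function validateAttachments(items: Attachment[]): string[] {
  const errors: string[] = [];
  if (items.length > LIMITS.maxCount) {
    errors.push(`too many attachments (${items.length} > ${LIMITS.maxCount})`);
  }
  let total = 0;
  items.forEach((item, index) => {
    total += item.bytes;
    if (IMAGE_MIMES.has(item.mime)) {
      if (item.bytes > LIMITS.imageBytes) errors.push(`#${index}: image exceeds 5MB`);
    } else if (item.mime === PDF_MIME) {
      if (item.bytes > LIMITS.pdfBytes) errors.push(`#${index}: PDF exceeds 32MB`);
    } else {
      errors.push(`#${index}: unsupported type ${item.mime}`);
    }
  });
  if (total > LIMITS.totalBytes) errors.push("total size exceeds 32MB");
  return errors;
}
```

Because the function is pure, the same check could in principle run at paste time and again on the server, which matches the "enforced at paste + re-validated server-side" wording.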
|
|
- TabBar: HTML5 drag-and-drop to reorder user tabs (subagent tabs untouched);
double-click a tab title to rename (Enter/blur confirm, Escape cancel).
- Store: add reorderTabs/renameTab/setDraft; per-tab in-memory `draft` and
`manualTitle` fields. Manual rename suppresses first-message auto-title.
- ChatInput: bind to the active tab's draft so switching tabs saves/restores
unsent text instead of clobbering it.
- Backend: updateTabPositions() + PATCH /tabs/reorder persist tab order to the
existing `position` column; tabs without a stored position fall to the end
then get explicit positions on first reorder.
- Tests: store reorder/rename/auto-title-guard/draft coverage; core
updateTabPositions coverage (FakeDatabase extended with transaction support).
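The ordering rule in the Backend bullet (tabs without a stored position fall to the end, then get explicit positions on first reorder) can be sketched as below. `TabRow`, `orderTabs`, and `assignPositions` are hypothetical names; the real `updateTabPositions()` persists to the database, which this sketch does not model.

```typescript
// Hypothetical row shape: `position` is null until the first reorder.
type TabRow = { id: string; position: number | null };

// Positioned tabs come first in ascending order; unpositioned tabs follow,
// keeping their original relative order (the index acts as a tie-breaker).
function orderTabs(rows: TabRow[]): TabRow[] {
  return rows
    .map((row, index) => ({ row, index }))
    .sort((a, b) => {
      const pa = a.row.position ?? Number.POSITIVE_INFINITY;
      const pb = b.row.position ?? Number.POSITIVE_INFINITY;
      return pa === pb ? a.index - b.index : pa - pb;
    })
    .map(({ row }) => row);
}

// On reorder, every tab receives an explicit, dense position.
function assignPositions(orderedIds: string[]): TabRow[] {
  return orderedIds.map((id, position) => ({ id, position }));
}
```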
|
|
Cross-branch contract test (u2/context-window-view merged from dev): the
Context Window panel derives current context from cacheStats.last via
computeContextUsage. This drives the full path — persisted usage aggregate ->
hydrateFromBackend -> cacheStats.last -> computeContextUsage -> '48,200 /
200,000' — proving the view shows real context size immediately after a reload
on a new device (not 'No context data yet'). Guards the contract so neither
persistence nor the view can silently break it.
|
|
Addresses the live-accumulator overshoot a Gemini review surfaced: the
frontend adds every streamed usage event to cacheStats, but a rate-limited
fallback attempt's usage is discarded server-side (never persisted). Live
numbers overshot until a reload re-seeded from the DB aggregate.
Fix: turn-sealed (emitted AFTER the atomic usage-row write) now carries the
authoritative getUsageStatsForTab aggregate. The store REPLACES (not adds)
cacheStats with it every turn — landing the just-sealed turn's usage AND
self-healing any live drift, including the discarded-fallback overshoot. No
extra round-trip (piggybacks turn-sealed); idempotent in the happy path.
- core: add UsageStats type; getUsageStatsForTab returns it; turn-sealed gains
optional usageStats field.
- api: agent-manager reads getUsageStatsForTab post-flush and attaches it to
the turn-sealed emit (try/catch: omit on DB error).
- frontend: turn-sealed handler replaces cacheStats (undefined ⇒ untouched
back-compat; null ⇒ clear).
Tests: frontend reconcile/self-heal/back-compat/null-clear; api turn-sealed
carries aggregate. 509 -> 514 passing; typecheck + biome green.
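The three-way handling of `usageStats` on turn-sealed (value replaces, undefined leaves untouched, null clears) can be sketched as a pure function. The `UsageStats` fields and the name `reconcileCacheStats` are assumptions; only the replace/untouched/clear semantics come from the commit message.

```typescript
// Hypothetical field names; the real UsageStats type is not shown in the log.
type UsageStats = { inputTokens: number; outputTokens: number };
type TurnSealed = { type: "turn-sealed"; usageStats?: UsageStats | null };

function reconcileCacheStats(
  current: UsageStats | null,
  event: TurnSealed,
): UsageStats | null {
  if (event.usageStats === undefined) return current; // older backend: leave untouched
  if (event.usageStats === null) return null; // explicit clear
  return { ...event.usageStats }; // REPLACE with the authoritative aggregate, never add
}
```

Since the aggregate replaces rather than accumulates, applying the same event twice yields the same state, and any live overshoot is corrected at the next seal.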
|
|
Persist usage as invisible type:"usage" chunk rows (side channel):
- core: add "usage" ChunkType + UsageData; exclude usage rows from
getChunksForTab/getTotalChunkCount; add getUsageStatsForTab aggregate
(exported from barrel); defensive skip in groupRowsToMessages.
- api: agent-manager accumulates per-attempt usageRows and flushes them in
the same atomic appendChunks call as the turn's content (discarded on a
superseded fallback attempt). GET /tabs enriches rows with usageStats.
- frontend: hydrateFromBackend seeds cacheStats from usageStats (reload only;
no re-seed on statuses reconnect, so no double-count with live events).
Tests: core DB-backed usage persistence/aggregate; api usage-row-per-event +
fallback discard; routes GET /tabs usageStats; frontend hydrate seed +
no-double-count + live-accumulation-after-seed. 495 -> 509 passing.
|
|
Add a frontend store test (flagged by a Gemini review) that queues TWO
messages mid-turn and asserts they collapse into a single untagged
initiator row joined with "\n---\n" — matching the backend's joined user
turn — and that the next turn-start tags that single row. The prior test
only covered the single-message case, leaving the join logic structurally
correct but untested.
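The behaviour under test can be illustrated with a small sketch: several queued rows collapse into one untagged row joined with "\n---\n", and the next turn-start tags that row. The `Row` shape and both function names are hypothetical; the store's real data model is not shown in this log.

```typescript
// Hypothetical row shape for the sketch.
type Row = { role: "user"; text: string; queued: boolean; turnId?: string };

const QUEUE_JOIN = "\n---\n";

// Collapse all queued rows into a single untagged row at the end.
function collapseQueued(rows: Row[]): Row[] {
  const queued = rows.filter((r) => r.queued);
  if (queued.length === 0) return rows;
  const merged: Row = {
    role: "user",
    text: queued.map((r) => r.text).join(QUEUE_JOIN),
    queued: false,
  };
  return [...rows.filter((r) => !r.queued), merged];
}

// On turn-start, tag the last untagged row as the initiator of that turn.
function tagInitiator(rows: Row[], turnId: string): Row[] {
  let target = -1;
  rows.forEach((r, i) => {
    if (r.turnId === undefined) target = i;
  });
  return rows.map((r, i) => (i === target ? { ...r, turnId } : r));
}
```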
|
|
A message queued while the agent was mid-turn was only handled if it
arrived DURING a tool batch (injected as a [USER INTERRUPT]). If it
landed after the last tool call — or the turn had no tools — the agent
silently appended it to history and ended the turn with no response, so
it sat there unanswered. This affected both user-queued messages and
agent-queued ones (send_to_tab).
- agent.ts: stop the end-of-turn drain that swallowed trailing queued
messages into history. They now stay on the queue.
- agent-manager: after a CLEAN turn settles, continueFromQueue() drains
the queue and starts a fresh turn to answer it. Skipped on a
user-stopped or errored turn (queue preserved for the next send).
- Loop safety: continuation draws from the existing autoWakeBudget, so a
runaway agent<->agent chain is bounded; human sends refill it, so human
conversations are never throttled.
- dequeueMessages now tags message-consumed with reason
"interrupt" | "continuation"; the frontend collapses continuation-
consumed queued bubbles into the next turn's initiator row (avoids the
linger/dup traps documented in queue-interrupt-reconcile-edge-cases.md).
- Tests: agent (no-swallow + interrupt regression), agent-manager
(continuation, no-op when empty, user-stop preserves queue, bounded
loop), frontend (continuation bubble becomes next initiator).
- wishlist: remove the now-fixed item.
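The continuation and loop-safety rules above can be sketched as a small gate. This is an illustration under assumptions: the class name, `TurnOutcome`, and the numeric budget are hypothetical, and joining with "\n---\n" mirrors the join described elsewhere in this log rather than a confirmed detail of `continueFromQueue()`.

```typescript
type TurnOutcome = "clean" | "stopped" | "error";

// Hypothetical gate modelling continueFromQueue() plus the autoWakeBudget.
class ContinuationGate {
  private queue: string[] = [];
  private budget: number;
  private readonly maxBudget: number;

  constructor(maxBudget: number) {
    this.maxBudget = maxBudget;
    this.budget = maxBudget;
  }

  enqueue(message: string): void {
    this.queue.push(message);
  }

  // A human send refills the budget, so human conversations are never throttled.
  onHumanSend(): void {
    this.budget = this.maxBudget;
  }

  // Returns the prompt for a follow-up turn, or null when no turn should start.
  continueFromQueue(outcome: TurnOutcome): string | null {
    if (outcome !== "clean") return null; // queue preserved for the next send
    if (this.queue.length === 0) return null; // nothing to answer
    if (this.budget <= 0) return null; // bounded agent<->agent chain
    this.budget -= 1;
    const prompt = this.queue.join("\n---\n");
    this.queue = [];
    return prompt;
  }
}
```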
|
|
per-chunk eviction
Replace the stored ChatMessage[] with a chunk-native model: tab.chunks (sealed
ChunkRow[]) + tab.live (transient in-flight turn buffer) + derived tab.renderGroups.
This enables per-chunk eviction (trimming WITHIN a large turn) and raw-chunk
pagination (loadOlderChunks), removing the whole-message eviction limitation.
Backend:
- Emit turn-start/turn-sealed around each turn; expose currentTurnId in the status
snapshot. turn-sealed fires after the durable write (status:idle fires before it).
- New GET /tabs/:id/chunks raw paginated endpoint (limit/before).
- Wrap appendChunks in a single SQLite transaction.
Frontend:
- turn-sealed drives a turn-aware reconcile that folds the sealed turn into chunks
while preserving a concurrent newer in-flight turn and pending queued messages;
deferred while the user is scrolled up.
- Stable turn-scoped render keys (${turnId}:${role}:${n}) avoid remount/flash.
Reconcile correctness (three review passes):
- preserve a concurrent newer turn when an earlier deferred reconcile flushes;
- keep optimistic queued user messages (no loss);
- turn-start backfill skips pending queued rows and tags only the turn initiator;
- bind consumed interrupt messages to the in-flight turn so they collapse on seal
(no lingering/duplicated bubble).
Tests: chat-store reconcile/eviction/pagination suite; api chunks endpoint + events.
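The turn-scoped render key format (${turnId}:${role}:${n}) can be illustrated as follows. The `Group` shape and the interpretation of `n` as a per-turn, per-role counter are assumptions; the commit only gives the key template.

```typescript
// Hypothetical render-group shape.
type Group = { turnId: string; role: "user" | "assistant" };

// Build keys from turn identity instead of array index, so prepending older
// history or folding a sealed turn does not change the keys of existing groups.
function renderKeys(groups: Group[]): string[] {
  const counters = new Map<string, number>();
  return groups.map((group) => {
    const prefix = `${group.turnId}:${group.role}`;
    const n = counters.get(prefix) ?? 0;
    counters.set(prefix, n + 1);
    return `${prefix}:${n}`;
  });
}
```

A key that is stable across reconcile is what lets the UI framework reuse the existing DOM node instead of remounting it.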
|
|
- send prompt-caching + oauth anthropic-beta headers on the Claude OAuth provider
- restructure the OAuth request body (billing header, identity split, relocate
third-party system prompt to the first user message) to match Claude Code
- apply rolling cache_control breakpoints and group a turn's tool results into a
single role:tool message for correct breakpoint placement
- emit per-step usage events (cache read/write split) and add the Cache Rate
sidebar panel
- dedup byte-identical tool calls within a single batch
|
|
backend pagination
Frontend keeps only a bounded window of chunks in memory (configurable via
settings slider, default 100). Older messages are evicted when at the bottom
and re-fetched from the backend on scroll-up.
- Backend: paginated GET /tabs/:id/messages with ?limit=N&before=seq
- Store: evictMessages trims oldest messages until total chunks ≤ limit
- Store: loadMoreMessages fetches next page and prepends with dedup
- ChatPanel: smart scroll hooks trigger eviction on return-to-bottom
- ChatPanel: onNearTop loads older history with scroll-position maintenance
- Settings: chunk limit slider in Memory section
- Fix: oldestLoadedSeq recalculated after eviction (pagination cursor stays valid)
- Fix: seq preserved on ChatMessage for cursor tracking
- Fix: scrolledUpTabs cleaned up on tab switch (no memory leak)
- Fix: evictMessages reads appSettings.chunkLimit directly (live updates)
|
|
running in background
Implements the 'background-running agents + restore-layout-on-reopen'
feature. Full design and parallel-implementation plan in
`plan-bg-restore.md`; Gemini code review (SHIP verdict, no findings) in
`report.md`.
User-visible behaviors:
1. Browser-close keeps agents alive. If an agent is mid-stream when
the browser closes / reloads / loses the network, it continues
processing on the backend. (This was already the case in code —
agents run fire-and-forget in app.ts:77-79 — but it was previously
pointless because the UI never restored the tab to receive the
output.)
2. Layout restore on browser reopen. Every tab that existed at the
time the window was closed is restored, in original `position`
order, with full persisted message history. Tabs whose agents
finished while disconnected appear with the completed message.
Tabs whose agents are still running appear streaming live — the
in-flight assistant message is reconstructed from the backend's
in-memory `currentChunks` (sent over the wire on connect) and
accumulates new deltas as they arrive.
3. Explicit tab-close cancels + forgets. Clicking the X still
cancels the agent (existing `stopTab` in DELETE /tabs/:id) and
archives the row (`is_open = 0`), so it is not restored. No
change to that path.
The gap that the implementation closes: previously, App.svelte:onMount
unconditionally called `createNewTab()` with a fresh UUID, ignoring
every existing row in the `tabs` table. Every browser open was a
clean slate. The DB had the conversation history but no way for the
UI to discover it.
Implementation:
• New `TabStatusSnapshot` interface in
packages/core/src/types/index.ts (auto-exported via existing
`export * from "./types"`):
    interface TabStatusSnapshot {
      status: AgentStatus;
      currentChunks?: Chunk[];      // present iff running
      currentAssistantId?: string;  // present iff running
    }
• `agent-manager.ts:getAllStatuses()` rewritten to return
`Record<string, TabStatusSnapshot>` (was
`Record<string, AgentStatus>`). For running tabs only, attaches a
defensive shallow copy of `tabAgent.currentChunks` (the live
streaming array the per-message loop appends to) plus the DB id
of the in-flight assistant message. The defensive copy is the
consumer's to mutate. Idle / error tabs get `{ status }` only.
`GET /status` and the WS `onOpen` snapshot both pick up the new
shape automatically — neither call site changed.
• Frontend mirror of `TabStatusSnapshot` in
packages/frontend/src/lib/types.ts; `AgentEvent.statuses` variant
updated to use `Record<string, TabStatusSnapshot>`.
• New `hydrateFromBackend()` on the tab store
(packages/frontend/src/lib/tabs.svelte.ts). Sequence on app
mount:
1. Bail with 0 if `tabs.length > 0` (hot-reload idempotency).
2. GET /tabs → list of `is_open=1` rows in
`position` order.
3. GET /status → in-flight TabStatusSnapshot map.
4. GET /tabs/:id/messages for each tab in parallel via
Promise.all → persisted ChatMessage[].
5. Build the Tab objects, splicing the snapshot's live
chunks into the in-flight assistant message for every
running tab (two paths: merge into the existing DB row
with matching id, or append a fresh in-flight message
if no row matches).
6. `tabs = restored; activeTabId = restored[0]?.id ?? null;`
Every fetch is wrapped in try/catch so one tab's failure can't
destroy the whole restore pass.
• WS `statuses` handler in `tabs.svelte.ts:handleEvent` rewritten
for the new shape. Still fires `reloadTabMessagesFromApi` on the
desync case (frontend thinks running, backend says idle — the
pre-existing recovery path is preserved). When backend says
running, seeds in-flight chunks into the assistant message
matching `snap.currentAssistantId` (creating it if needed).
When backend says non-running, clears `isStreaming` on the
previous in-flight message and nulls `currentAssistantId`.
• `App.svelte:onMount` now awaits `tabStore.hydrateFromBackend()`
before deciding whether to fall back to `createNewTab()`.
Fallback condition is the doubly-defensive
`restored === 0 && tabStore.tabs.length === 0`. `wsClient.connect()`
fires in parallel with hydration — the resulting WS `statuses`
event is per-tab idempotent against the hydrated state, so there
is no race even if it arrives mid-hydration.
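The snapshot shape returned by getAllStatuses(), as described in the Implementation list, can be sketched as follows. The `Chunk` and `TabAgent` shapes are simplified assumptions, and the function takes the agent map as a parameter only to keep the sketch self-contained.

```typescript
type AgentStatus = "idle" | "running" | "error" | "waiting_for_key";
type Chunk = { type: string; text: string }; // simplified stand-in

interface TabStatusSnapshot {
  status: AgentStatus;
  currentChunks?: Chunk[]; // present iff running
  currentAssistantId?: string; // present iff running
}

// Hypothetical in-memory agent record.
type TabAgent = {
  status: AgentStatus;
  currentChunks: Chunk[] | null;
  currentAssistantId: string | null;
};

function getAllStatuses(agents: Map<string, TabAgent>): Record<string, TabStatusSnapshot> {
  const out: Record<string, TabStatusSnapshot> = {};
  agents.forEach((agent, tabId) => {
    const snapshot: TabStatusSnapshot = { status: agent.status };
    if (agent.status === "running") {
      // Shallow copy: the consumer may mutate the array without touching
      // the live streaming buffer the agent keeps appending to.
      if (agent.currentChunks) snapshot.currentChunks = [...agent.currentChunks];
      if (agent.currentAssistantId) snapshot.currentAssistantId = agent.currentAssistantId;
    }
    out[tabId] = snapshot;
  });
  return out;
}
```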
What was NOT done (deliberately, deferred to wishlist):
• Pre-existing inconsistency: core `AgentStatus` includes
"waiting_for_key" but frontend `TabStatusSnapshot.status` uses
only the existing 3-state pattern ("idle" | "running" | "error").
Not introduced here; mirrored the existing precedent.
• Restored tabs use defaults for `reasoningEffort`, `agentSlug`,
`agentScope`, `agentModels`, `workingDirectory` — these are not
in the DB `tabs` schema. Future schema expansion.
• Per-delta DB flushing — not needed; the in-memory snapshot
covers the gap between flushAssistant calls.
• LocalStorage cache of tab ids — backend DB is the source of truth.
Process notes:
• Implemented via parallel programmer subagents (flash agents were
requested but unavailable in this environment — substituted with
"programmer" agents, which share the "reads a plan, implements a
single step" charter). Backend (Segment A: getAllStatuses + 5
tests) and frontend (Segment B: types + hydrateFromBackend +
statuses handler + onMount + 8 tests) ran disjoint-file-ownership
in parallel.
• Gemini code review (yolo mode for tool access, explicit
prompt-level write restriction to `report.md` only) returned a
SHIP verdict with no findings against the plan.
• Self-review surfaced one followup gap that Gemini's earlier
plan-mode pass also caught: no explicit test for
`/tabs/:id/messages` failure isolation. Added a test covering
both HTTP-500 and network-error variants alongside a healthy
tab, asserting per-tab failures don't destroy the whole restore.
Tests:
• api/tests/agent-manager.test.ts: +5 (snapshot empty record,
idle-tab field omission, running-tab field inclusion, defensive
copy invariant, omits chunks for running tab with null
currentChunks). 31 total (was 26).
• frontend/tests/chat-store.test.ts: +9 (restore-with-messages,
in-flight seeding, /tabs failure → 0 returned, empty /tabs
array, idempotency when tabs already exist, idle-status when
/status omits, running-snapshot statuses handler seeding,
idle-snapshot statuses handler clearing, per-tab failure
isolation across HTTP-500 and network-error). 44 total (was 35).
Totals: 243 tests across 3 packages all green; typecheck clean on
core + api + frontend; biome clean across 124 files.
|
|
reasoning round-trip + max-thinking budget audit
Migrates the LLM stack (the ai package and its @ai-sdk/* provider
packages) from AI SDK v4 to AI SDK v6. Full design in plan-v6-upgrade.md;
two rounds of Gemini code review captured in report.md.
Motivation: the recurring 'reasoning-signature without reasoning' error
on Claude Opus 4.7 was a v4 SDK artefact: the v4 Anthropic provider emitted
Anthropic signature_delta as a separate stream chunk that orphaned when
the model produced a signed-but-empty thinking block, and our chunk
store had no signature field so the round-trip back to Anthropic was
rejected on the next turn. In v6, signatures arrive inside
providerMetadata on the reasoning-end event, and the orphan-signature
class of bug is gone at the SDK level.
Core changes:
• ThinkingChunk gains optional metadata?: Record<string, unknown>
(the v6 providerMetadata blob). A non-undefined metadata 'seals'
the chunk: subsequent reasoning-delta opens a new chunk rather
than extending the sealed one.
• AgentEvent gains { type: 'reasoning-end'; metadata? } (replaces
the v4 reasoning-signature variant).
• toModelMessages (replaces toCoreMessages):
- returns ModelMessage[] (was CoreMessage[])
- thinking → { type: 'reasoning', text, providerOptions: metadata }
- tool-batch entries → { type: 'tool-call', input } (was 'args')
- tool results → { output: { type: 'text', value } } ToolResultOutput
• Claude OAuth uses createAnthropic({ authToken }) natively — no more
custom-fetch x-api-key → Bearer swap.
• rewriteBodyForOpus47 deleted — Opus 4.7 adaptive thinking is native
via providerOptions.anthropic.thinking = { type: 'adaptive' }.
• V1 middleware → V3 (specificationVersion: 'v3').
• v4-era normalizeMessages openai-compatible middleware deleted; the
v6 openai-compatible provider extracts reasoning_content natively
from { type: 'reasoning' } content parts.
• applyAnthropicStructuralNormalisations (mirrors opencode
provider/transform.ts:53-148): drops empty text/reasoning parts,
scrubs non-[a-zA-Z0-9_-] toolCallIds, splits [tool-call, non-tool]
assistant turns (Anthropic rejects tool_use followed by text).
• applyOpenAICompatibleReasoningNormalisation (mirrors opencode
transform.ts:217-249): lifts reasoning text into
providerOptions.openaiCompatible.reasoning_content (always, even
empty). Solves DeepSeek 'The reasoning_content in the thinking
mode must be passed back' — the v6 SDK skips emitting
reasoning_content when text is empty (dist/index.mjs:245), but
DeepSeek requires the field present once thinking was used.
• Tools: tool({ inputSchema: jsonSchema(zodToJsonSchema(...)) })
(was parameters: ZodSchema). AI SDK tools have no execute
callback — the agent runs tools manually for permission prompts
and shell-output streaming. New dep: zod-to-json-schema@^3.25.2.
• fullStream event loop rewritten for v6 event shape: text-delta
(text not textDelta), reasoning-start/delta/end, tool-input-*,
tool-call (input not args), tool-result, tool-error (new), abort
(new), start-step/finish-step, finish.
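The sealing rule from the first Core changes bullet (defined metadata seals a ThinkingChunk, so the next reasoning-delta opens a new one) can be sketched as a reducer. The event and chunk shapes are simplified, and two details are assumptions of this sketch: a reasoning-end without metadata seals with an empty object, and a reasoning-end with no open chunk is ignored.

```typescript
type ThinkingChunk = { type: "thinking"; text: string; metadata?: Record<string, unknown> };
type ReasoningEvent =
  | { type: "reasoning-delta"; text: string }
  | { type: "reasoning-end"; metadata?: Record<string, unknown> };

function applyReasoningEvent(chunks: ThinkingChunk[], event: ReasoningEvent): ThinkingChunk[] {
  const last = chunks[chunks.length - 1];
  // A chunk is "open" while its metadata is still undefined.
  const openChunk = last !== undefined && last.metadata === undefined ? last : undefined;
  const head = openChunk ? chunks.slice(0, -1) : chunks;

  if (event.type === "reasoning-delta") {
    // Extend the open chunk, or start a new one after a sealed chunk.
    const text = (openChunk ? openChunk.text : "") + event.text;
    return [...head, { type: "thinking", text }];
  }
  if (!openChunk) return chunks; // assumption: nothing to seal, ignore
  // Assumption: seal with {} when the provider sent no metadata.
  return [...head, { ...openChunk, metadata: event.metadata ?? {} }];
}
```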
Max-thinking audit (matches opencode transform.ts:642-671 budgets):
• Claude enabled-thinking max budget 16000 → 31999 (Anthropic ceiling)
• Claude enabled-thinking high budget 10000 → 16000
• maxOutputTokens 'budget + 8000' → fixed 32000 (matches opencode's
OUTPUT_TOKEN_MAX; model self-allocates thinking vs response within)
• Opus 4.7 adaptive thinking gains display: 'summarized' and sibling
effort field (without these, thinking content is hidden by Anthropic
and the model barely thinks).
Frontend mirrors:
• types.ts — ThinkingChunk.metadata?, AgentEvent reasoning-end
• tabs.svelte.ts — routes reasoning-end through applyChunkEvent
• ChatMessage.svelte — hides empty thinking chunks; hides the entire
assistant bubble when no chunk has renderable content
Gemini-review-driven fixes:
• tool-error and abort stream events now surface as error chunks
(were silently ignored)
• toolCallId scrubbing pass (opencode transform.ts:96-122 parity)
• Empty-reasoning-cull explicit test coverage for both Anthropic
structural normalisation and DeepSeek path
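The toolCallId scrubbing pass can be illustrated with a one-line helper. The character class comes from the commit message; replacing each offending character with an underscore is an assumption of this sketch, as is the function name.

```typescript
// Replace every character outside [a-zA-Z0-9_-] with "_" (assumed replacement).
function scrubToolCallId(id: string): string {
  return id.replace(/[^a-zA-Z0-9_-]/g, "_");
}
```

The same function would need to be applied to both the tool-call and its matching tool-result so the pair still shares one id after scrubbing.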
Test counts (223 tests across 3 packages, all green):
• tests/chunks/append.test.ts: 44 (was 38) — reasoning-end sealing,
orphan walk-back, multi-block interleaving
• tests/agent/agent.test.ts: 24 (was 5) — exhaustive v6 event
mappings, structural normalisations, signature/reasoning_content
round-trip, tool-error/abort branches, DeepSeek scenario, empty
reasoning edge case
• tests/llm/provider.test.ts: 9 (was 22) — dropped 13 obsolete v4
middleware tests; new minimal tests confirm no middleware wrapping
on default openai-compat path and that createAnthropic gets
authToken vs apiKey correctly for OAuth vs api-key flows
• tests/tools/registry.test.ts: 10 (was 4) — v6 tool() contract
(inputSchema, no execute, JSON Schema for nested zod)
• packages/api/tests/agent-manager.test.ts: 12 (was 7) — mock Agent
emits v6 reasoning events; reasoning-end broadcast + ordering
• packages/frontend/tests/chat-store.test.ts: 35 (was 32) —
reasoning-end flow through Svelte $state store
typecheck clean (tsc --noEmit on core + api, svelte-check on frontend),
biome clean across 124 files.
|
|
createTabStore + handleEvent (replaces POJO harness)
|
|
batching, error/system chunks
|
|
handling
- Agent Builder: full CRUD with card grid, drag-and-drop model reorder, edit/delete
- Auto-save on edit with 600ms debounce, AbortController for concurrency, fieldset disabled until name entered
- Agent definitions stored as TOML with cwd field, loaded from global/project dirs
- Working directory: per-tab CWD override in Chat Settings, agent default CWD, auto-create on first message
- CWD validation: check-dir endpoint with ~ expansion, real-time validity indicator
- Subagent CWD validated against parent's effective CWD using path.relative
- Unavailable tool calls: caught gracefully, shown as tool call with error badge, model retries
- UI: tab bar border radius, sidebar border removed, chat input ghost style, scroll-to-bottom rectangle
- Skills dir collapse uses CSS rotation, Model Choice renamed to Chat Settings, System Prompt view removed
- Reusable SkillsBrowser/ToolPermissions with external mode for Agent Builder
- ModelSelector: Agent/Manual toggle, agent list, Agent Settings link
- Page router, skills recursive scanning, bin/up gopass removed, docker volume mounts
|
|
effort, and dynamic model listing
|
|
Permission engine:
- Rule-based engine: wildcard matching, last-match-wins, reject cascade
- PermissionService with pending/approved state, PermissionChecker interface
- dispatch.yaml config loader with per-permission pattern rules
Shell tool:
- run_shell tool with child_process spawn, timeout, streaming output
- Tree-sitter static analysis (web-tree-sitter + tree-sitter-bash WASM)
- BashArity command normalization for 'always allow' patterns
- FILE_COMMANDS set: rm, cp, mv, mkdir, ls, find, grep, cat, etc.
Agent loop refactored:
- Removed maxSteps, manual step loop with tool execution
- Permission checks on shell commands (external_directory only)
- Permission checks on file tools outside workspace boundary
- Symlink bypass fix (realpathSync), .. false positive fix
- Shell output streaming via Promise.race + setImmediate polling
API layer:
- PermissionManager wraps PermissionService, broadcasts via WebSocket
- WebSocket handles permission-reply messages from frontend
- Config loaded from dispatch.yaml, converted to ruleset
Frontend:
- Permission prompt modal (native dialog, focus trap, ARIA)
- Always-allow confirmation flow with pattern preview
- Shell output display (live streaming + final parsed result)
- Permission log panel (fixed bottom-right overlay)
- Exit code badge (green 0, red non-zero)
134 tests, typecheck clean on all 3 packages
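The wildcard matching and last-match-wins behaviour of the permission engine can be sketched as below. The `Action` names, the `Rule` shape, and the "ask" fallback are assumptions; the reject cascade and the dispatch.yaml loader are not modelled.

```typescript
type Action = "allow" | "ask" | "deny"; // hypothetical action names
type Rule = { pattern: string; action: Action };

// Convert a pattern where "*" matches any run of characters into a RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Last-match-wins: every matching rule overrides the previous result.
function evaluate(rules: Rule[], command: string, fallback: Action = "ask"): Action {
  let result = fallback;
  for (const rule of rules) {
    if (wildcardToRegExp(rule.pattern).test(command)) result = rule.action;
  }
  return result;
}
```

With last-match-wins, broad rules go first and narrower exceptions follow them, so a later `git push *` rule can override an earlier `git *` rule.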
|
|
- Tool calls now appear at their stream position within messages (ContentSegment model)
- Added reasoning/thinking display: collapsible <details> block above content
- Set DeepSeek V4 Flash reasoningEffort to max via providerOptions
- ChatMessage.content changed from string to ContentSegment[] (text | tool-call)
- Agent handles AI SDK reasoning events, yields reasoning-delta
- Fixed duplicate key in ChatMessage.svelte each block
|
|
- Bun monorepo with @dispatch/core, @dispatch/api, @dispatch/frontend
- Agent runtime with Vercel AI SDK, streaming via WebSocket
- Tools: read_file, write_file, list_files (scoped to working directory)
- Hono API server with POST /chat, GET /status, GET /health, WS /ws
- Svelte 5 + DaisyUI frontend with chat UI, theme switcher, copy button
- OpenCode Go (Zen) as LLM provider, deepseek-v4-flash-free model
- Docker setup (dev + prod) with bin/ scripts and gopass secrets
- Biome v2 linting/formatting, Vitest tests (44 passing)
- Debug info attached to error messages for diagnostics
|