dispatch

Age	Commit message	Author
2026-06-04	feat(config): add subdirectory LSP config watchers and fix permission ordering [v2-deprecated]	Adam Malczewski
	- Add watchDirConfig() for per-directory config watching - Register watchers for subdirectories with their own dispatch.toml - Fix permission ordering: move "*" wildcard to front so findLast reaches specific rules first (was silently breaking all specific bash permission rules) - Add comprehensive tests for watcher functionality - Update mocks in test files
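A minimal sketch of why the wildcard position matters, assuming rules are resolved last-match-wins as the commit describes. The rule shape, the prefix matching, and the names `PermissionRule`, `resolvePermission` and `wildcardFirst` are hypothetical, not taken from the repository.

```typescript
// Hypothetical rule shape; the real dispatch.toml schema is not shown in this log.
type PermissionRule = { pattern: string; action: "allow" | "deny" | "ask" };

// Assumed matching: "*" matches everything, "foo*" matches by prefix, else exact.
function matches(pattern: string, command: string): boolean {
  if (pattern === "*") return true;
  if (pattern.endsWith("*")) return command.startsWith(pattern.slice(0, -1));
  return pattern === command;
}

// Last-match-wins lookup: the same result as rules.findLast(...), written as a loop.
function resolvePermission(
  rules: PermissionRule[],
  command: string,
): PermissionRule | undefined {
  for (let i = rules.length - 1; i >= 0; i--) {
    const rule = rules[i];
    if (rule !== undefined && matches(rule.pattern, command)) return rule;
  }
  return undefined;
}

// Move the catch-all to the front so a specific rule is the last match.
function wildcardFirst(rules: PermissionRule[]): PermissionRule[] {
  return [
    ...rules.filter((r) => r.pattern === "*"),
    ...rules.filter((r) => r.pattern !== "*"),
  ];
}
```

With the wildcard left at the end of the list, the backwards scan stops at "*" for every command, which is the silent failure the commit describes.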
2026-06-03	Merge branch 'dev' into warm/prompt-cache-warming	Adam Malczewski
	# Conflicts: # packages/api/src/agent-manager.ts # packages/api/tests/agent-manager.test.ts # packages/frontend/src/lib/tabs.svelte.ts
2026-06-03	feat: prompt cache warming for idle tabs	Adam Malczewski
	Keep a tab's provider prompt-cache warm while idle by periodically replaying the exact cached conversation prefix plus a single trivial throwaway turn, resetting the provider's ~5-min cache TTL so the user's next real message hits a warm cache. Backend: - Agent.warmCache(history): extracts buildLlmContext() shared with run(), then re-sends the identical system+tools+history prefix (same Anthropic cache_control breakpoints) plus a 'reply with just a .' probe turn via toolChoice:none. Returns the request usage; mutates no history, emits/persists nothing. - AgentManager.warmCacheForTab(): resolves the same agent the next real turn would use, replays the FULL genuine history, refuses while a turn is running. - POST /chat/warm: returns ONLY the warming request's usage (never persisted, never folded into the real usage aggregate). Frontend: - cache-warming.svelte.ts store: per-tab 4-min repeating idle timer with countdown, warming-specific last-request cache %, and error capture. Arms on turn end, pauses during a turn, disables+resets on a real user message. - cache-warm-storage.ts: per-tab localStorage persistence of the toggle. - Lifecycle hooks wired into tabs.svelte.ts (status/statuses/sendMessage/ hydrate/create/open/close). - ModelSelector: bottom-of-panel checkbox + debug strip (last-% / countdown / error), shown only when enabled. Warming cache data never touches the real Cache Rate view. Tests: core warmCache (5), api warm route (3) + warmCacheForTab (3), frontend store (12) + storage (10). check / test (779) / frontend build / typecheck all green.
2026-06-03	Merge branch 'dev' into cmp7/compaction-tool	Adam Malczewski
	# Conflicts: # packages/frontend/src/lib/components/ChatInput.svelte
2026-06-03	feat(compaction): add UI-driven conversation compaction	Adam Malczewski
	Summarize a conversation's older "head" into a structured anchored Markdown summary while preserving the most recent turns verbatim, shrinking context size while keeping the information needed to continue coherently. Triggered by a "Compact conversation" button in Chat Settings (not an agent tool). Approach informed by OpenCode's session/compaction.ts: - Ported SUMMARY_TEMPLATE (Goal / Constraints / Progress / Key Decisions / Next Steps / Critical Context / Relevant Files) and the anchored-summary buildPrompt (re-summarizes a prior summary when present). - Ported the TOOL_OUTPUT_MAX_CHARS (2000) cap on tool results in the summary request. - Simplified tail selection to a fixed recent-turn count (DEFAULT_TAIL_TURNS=2) instead of OpenCode's token-budget splitTurn. core: - New src/compaction/ module (pure, DB-free): template, prompt builder, head/tail selection, transcript renderer with tool-output capping, prior summary extraction. Generic over ChatMessage so callers keep turnId/seq. - db/chunks.ts: rekeyChunks(from,to) relocates a tab's full history to a backup tab (reversible — nothing is deleted). - AgentEvent: compaction-started / -complete / -error variants. api: - AgentManager.compactTab(tempTabId, sourceTabId): side-effect-free resolveConnection() for the compactor model (configured compaction_model_, else the source tab's own key+model), one-shot tool-less summary generation via a transient Agent, then relocate full history to a fresh backup tab and re-seed the canonical source id with [summary turn + preserved tail]. Source tab is locked (messages queue) during the run; queue drains afterward. - Routes: POST /tabs/:id/compact, GET/PUT /tabs/settings/compaction-model. frontend: - "Compact conversation" button in ModelSelector (Chat Settings), between Working Directory and the agent toggle; idle-gated. - Compaction-model key+model selector in Settings, beside the title model. 
	- Transient placeholder tab shows a large, non-faded "Please wait, compacting conversation…" screen; closing it cancels. Source input locked while running. - Handle compaction-* events: reload compacted source, insert backup tab, refocus source, discard placeholder. tests: core compaction unit tests, rekeyChunks DB test, AgentManager.compactTab orchestration tests, and compaction route tests. All green (713 tests), biome clean, all typechecks pass, frontend builds.
2026-06-03	Merge branch 'dev' into img8/image-attachments	Adam Malczewski

2026-06-02	feat(chat): paste-to-attach images/PDFs with model capability check	Adam Malczewski
	Add multimodal image/PDF input to the chat box via clipboard paste, gated by a graceful per-model capability check. UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the draft, so attachments have ORDER relative to typed text and can be referenced positionally. The token is the only handle — deleting it (atomic Backspace/ Delete, or selection overlap) detaches the file; an input-reconciliation safety net detaches any attachment whose token is no longer intact. No preview strip. Capability check: resolveModelCapabilities reads models.dev modalities.input (new GET /models/capabilities, mirrors /context-limit). The input blocks Send (no tokens spent) only on a definitive 'no'; unknown capability (catalog offline / unmapped provider) stays permissive. Attachments require a fresh turn — Send is blocked while generating and /chat rejects content mid-turn (409). Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK ImagePart/FilePart content, but never persisted (history keeps the text with [image]/[pdf] markers). Text-only turns serialize byte-identically to before. Limits (Anthropic-aligned, enforced at paste + re-validated server-side): PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total. core: UserContentPart types, models/attachments validator, capability resolver, agent.run+toModelMessages thread ordered content. api: /chat content validation + passthrough. frontend: attachment-tokens helper, ChatInput paste/token/gating, per-tab staged attachments, App.svelte capability fetch. +44 tests.
2026-06-02	feat(tools): add key_usage tool reporting API-key usage levels	Adam Malczewski
	Adds an agent-callable `key_usage` tool that reports current usage for configured API keys so the agent can pick a key with headroom, warn before hitting a rate limit, and diagnose exhausted-key failures. Per key it reports: provider, active/exhausted status (with last error + when it was exhausted), remaining rate-limit headroom and reset timestamp per window (5-hour, weekly, and monthly where the provider exposes it), and whether the figures are live or served from cache (with the cache's last-fetched-from-source time). Supports anthropic and opencode-go keys (live with cache fallback for anthropic; live scrape for opencode-go). Optional `key_id` reports one key; omitted reports all. Hard permission gate `perm_key_usage` (default off): when disabled the tool is completely removed from the toolset/context. Registered in both the parent permission-gated path and the child whitelist path, advertised in the system prompt (TOOL_DESCRIPTIONS), grantable to subagents via the summon enum, and exposed as a frontend tool-permission checkbox. To report data freshness, claude.ts gains `getAccountUsageWithSource` + `ClaudeUsageResult` (live vs cache + cachedAt from usage_cache.cached_at); the existing `getAccountUsage` now delegates to it, preserving behavior. Tests: core key-usage tool suite (windows, %-conversion, freshness, exhausted status, unsupported/unavailable, filtering) + agent-manager perm-gate test.
2026-06-02	Merge branch 'dev' into feat/cs-code-search-tool	Adam Malczewski
	# Conflicts: # packages/api/src/agent-manager.ts # packages/api/tests/agent-manager.test.ts # packages/frontend/src/lib/components/ToolPermissions.svelte # packages/frontend/src/lib/settings.svelte.ts
2026-06-02	feat(lsp): add config-driven LSP support (Roblox Luau via luau-lsp)	Adam Malczewski
	Add Language Server Protocol integration modeled on opencode's, wired for this codebase's plain-TypeScript tool/agent architecture. Core (@dispatch/core): - lsp/client.ts: LSP/JSON-RPC client over stdio (vscode-jsonrpc) with the initialize handshake, didOpen/didChange sync, push + pull diagnostics (textDocument/diagnostic, workspace/diagnostic), and a generic request() passthrough for hover/definition/references/documentSymbol. - lsp/server.ts: resolves dispatch.toml [lsp] entries into spawn specs. Config-driven only — no builtin registry, no auto-download. - lsp/manager.ts: process-wide LspManager owning client lifecycles, keyed by root+serverID, lazy spawn + reuse + graceful shutdown. - lsp/language.ts: extension->languageId map incl. .luau -> "luau". - lsp/diagnostic.ts: error-only <diagnostics> block formatting (1-based). - tools/lsp.ts: on-demand 'lsp' tool (1-based coords -> 0-based wire). - write-file.ts: optional onAfterWrite hook for diagnostics-on-write. - config schema: validate [lsp] block; DispatchConfig.lsp + LspServerConfig. API (@dispatch/api): - AgentManager owns one LspManager; per-working-directory server cache cleared on config reload; diagnostics appended to write_file results; 'lsp' tool gated by new perm_lsp setting; shutdownAll on destroy(). Config: - dispatch.toml: documented, commented [lsp.luau-lsp] Roblox example. Tests: fake-lsp-server fixture + client/manager/server/diagnostic/schema/ tool/write-hook suites, plus an opt-in real-binary luau-lsp smoke test (auto-skipped when luau-lsp is absent). 652 pass; biome + 3 typechecks green.
2026-06-02	feat: add search_code tool wrapping the cs code-search engine	Adam Malczewski
	Add a dedicated, permission-gated search_code tool that wraps boyter/cs (code spelunker) — a fast, relevance-ranked, structure-aware code search engine — giving agents a better default than grep/find for exploratory 'where is X / how does Y work' searches (ranked results, snippets, ~5x smaller payloads). - packages/core/src/tools/search-code.ts: createSearchCodeTool factory; -f json invocation, workdir path containment, graceful missing-binary handling (DISPATCH_CS_BIN override), readable per-file formatted output. - Wire-up: export from core; register in agent-manager (both child-whitelist and parent perm paths) behind new perm_search_code; add to summon catalog + tools enum; frontend ToolPermissions + settings. - Docker: build a patched, statically-linked cs (pinned v3.1.0 commit) in a golang builder stage and bundle at /usr/local/bin/cs. - docker/cs/luau-declarations.patch: additive Luau declaration table so --only-declarations / definition ranking works for Roblox .luau files (upstream has Lua but not Luau). Applied during the Docker build. - Tests: new search-code.test.ts (stubbed JSON formatting + live-cs integration, skipped when cs absent); agent-manager/routes mocks + perm-gating assertions; loader pass-through. All tests (596), biome, and tsc (core/api/frontend) pass. cs-builder Docker stage verified to build and produce a working patched binary.
2026-06-02	Merge branch 'dev' into perm/fix-user-agent-summon-permission	Adam Malczewski
	# Conflicts: # packages/api/tests/agent-manager.test.ts
2026-06-02	fix(perm): decouple perm_user_agent from perm_summon for spawning user agents	Adam Malczewski
	Granting only the user-agent (top-level) permission without the subagent-summon permission left the agent unable to summon user agents: the whole summon tool was gated behind perm_summon, so perm_user_agent alone produced no summon tool. Register summon when EITHER perm_summon OR perm_user_agent is granted. createSummonTool now takes an independent subagentEnabled flag (mirrors perm_summon) alongside userAgentEnabled (mirrors perm_user_agent): - subagent-only -> ordinary subagents, no top_level - user-agent-only -> spawns ONLY top-level user agents (top_level forced, background/top_level params dropped, user-agent catalog only) - both -> unchanged full behavior retrieve stays bundled with perm_summon (user agents are fire-and-forget). Adds core summon tests (user-agent-only mode + legacy-default regression) and an agent-manager summon/user_agent permission-split suite.
2026-06-02	fix(tabs): advertise send_to_tab/read_tab in the agent system prompt	Adam Malczewski
	Granted tab-messaging tools were registered in the API tool payload but buildSystemPrompt built its 'You have access to the following tools' list by filtering toolNames through TOOL_DESCRIPTIONS, which had no entries for send_to_tab/read_tab. The model was therefore told it lacked those tools and refused to use them even when explicitly granted. Add the two missing TOOL_DESCRIPTIONS entries so the capability list matches the granted toolset. Add regression tests that capture the constructed Agent's systemPrompt and assert the tab-messaging tools are listed when granted (and omitted when not), locking the prompt list to the schema list so they can't drift again.
2026-06-02	feat(todo): port opencode's declarative whole-list todo tool	Adam Malczewski
	Replace the imperative id-based CRUD todo tool (add/update/list/get/remove) with opencode's declarative whole-list design: a single `todos` param that replaces the entire list each call. No model-visible ids, no delta reasoning, no "task not found" spirals. - core: TaskItem { id, content, status }; statuses pending|in_progress|completed|cancelled. TaskList.setTasks/getTasks/onChange. New rich TODO_DESCRIPTION adapted from opencode's todowrite.txt. - api: TASK_MANAGEMENT_GUIDANCE system-prompt section (from anthropic.txt); updated TOOL_DESCRIPTIONS.todo. Reload fix: TabStatusSnapshot now carries per-tab tasks so getAllStatuses rehydrates the panel on reconnect. - frontend: mirror types; hydrate tasks from snapshot in both restore paths; upgrade sidebar Tasks panel to render content + all four statuses + progress. - tests: new core task-list.test.ts (15); updated api TaskList mocks + getAllStatuses task-snapshot coverage. bun run check clean; 569 tests pass; all packages typecheck.
2026-06-02	Merge branch 'dev' into u3/agent-effort-level	Adam Malczewski
	# Conflicts: # packages/api/tests/agent-manager.test.ts
2026-06-02	Merge branch 'dev' into u1/usage-persistence	Adam Malczewski

2026-06-02	fix: reconcile live cacheStats to DB truth on turn-sealed	Adam Malczewski
	Addresses the live-accumulator overshoot a Gemini review surfaced: the frontend adds every streamed usage event to cacheStats, but a rate-limited fallback attempt's usage is discarded server-side (never persisted). Live numbers overshot until a reload re-seeded from the DB aggregate. Fix: turn-sealed (emitted AFTER the atomic usage-row write) now carries the authoritative getUsageStatsForTab aggregate. The store REPLACES (not adds) cacheStats with it every turn — landing the just-sealed turn's usage AND self-healing any live drift, including the discarded-fallback overshoot. No extra round-trip (piggybacks turn-sealed); idempotent in the happy path. - core: add UsageStats type; getUsageStatsForTab returns it; turn-sealed gains optional usageStats field. - api: agent-manager reads getUsageStatsForTab post-flush and attaches it to the turn-sealed emit (try/catch: omit on DB error). - frontend: turn-sealed handler replaces cacheStats (undefined ⇒ untouched back-compat; null ⇒ clear). Tests: frontend reconcile/self-heal/back-compat/null-clear; api turn-sealed carries aggregate. 509 -> 514 passing; typecheck + biome green.
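The replace-not-add rule above can be sketched as a pure reducer. This is an illustration only: the field names in `UsageStats` and the function name `reconcileOnTurnSealed` are assumptions, since the log names the type but not its shape.

```typescript
// Assumed shape; the log names UsageStats but does not list its fields.
type UsageStats = { inputTokens: number; outputTokens: number; cacheReadTokens: number };

// Live streaming path: every usage event is ADDED, which is where drift can creep in.
function addLiveUsage(current: UsageStats | null, event: UsageStats): UsageStats {
  const base = current ?? { inputTokens: 0, outputTokens: 0, cacheReadTokens: 0 };
  return {
    inputTokens: base.inputTokens + event.inputTokens,
    outputTokens: base.outputTokens + event.outputTokens,
    cacheReadTokens: base.cacheReadTokens + event.cacheReadTokens,
  };
}

// turn-sealed path: the DB aggregate REPLACES the live value.
// undefined = older server, leave untouched; null = clear; object = authoritative.
function reconcileOnTurnSealed(
  current: UsageStats | null,
  sealed: UsageStats | null | undefined,
): UsageStats | null {
  if (sealed === undefined) return current;
  return sealed;
}
```

Because the sealed aggregate overwrites rather than accumulates, applying it twice gives the same result, which matches the idempotency the commit claims for the happy path.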
2026-06-02	feat(context-window): show current/max context usage per tab/model	Adam Malczewski
	Add a 'Context Window' sidebar view showing the live context occupancy (latest request's input+output) against the model's maximum context window, resolved dynamically from the models.dev catalog. - core: models.dev catalog module (resolveContextLimit) with disk cache, TTL, stale-fallback + offline penalty memo; null for unknown models. - api: GET /models/context-limit?provider=&modelId=. - frontend: ContextWindowPanel + computeContextUsage helper; App resolves + caches the active model's max (anthropic/opencode-anthropic only); percent shown to 2 decimals; degrades to bare token count when max unknown. - tests: core catalog (13), api route (3), frontend helper (6).
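As a rough illustration of the occupancy calculation described above, the sketch below sums the latest request's input and output tokens and formats the percentage to two decimals, returning no percentage when the maximum is unknown. The function name `sketchContextUsage` and its return shape are hypothetical; the project's own `computeContextUsage` signature is not shown in this log.

```typescript
// Hypothetical result shape for illustration only.
type ContextUsage = {
  usedTokens: number;
  maxTokens: number | null;
  percent: string | null; // two-decimal string, or null when the max is unknown
};

function sketchContextUsage(
  inputTokens: number,
  outputTokens: number,
  maxTokens: number | null,
): ContextUsage {
  const usedTokens = inputTokens + outputTokens;
  // Unknown (or nonsensical) maximum: degrade to the bare token count.
  if (maxTokens === null || maxTokens <= 0) {
    return { usedTokens, maxTokens: null, percent: null };
  }
  const percent = ((usedTokens * 100) / maxTokens).toFixed(2);
  return { usedTokens, maxTokens, percent };
}
```

For example, 30,000 input plus 2,000 output tokens against a 200,000-token window yields `"16.00"`.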
2026-06-02	feat: persist per-tab token/cache usage across reload	Adam Malczewski
	Persist usage as invisible type:"usage" chunk rows (side channel): - core: add "usage" ChunkType + UsageData; exclude usage rows from getChunksForTab/getTotalChunkCount; add getUsageStatsForTab aggregate (exported from barrel); defensive skip in groupRowsToMessages. - api: agent-manager accumulates per-attempt usageRows and flushes them in the same atomic appendChunks call as the turn's content (discarded on a superseded fallback attempt). GET /tabs enriches rows with usageStats. - frontend: hydrateFromBackend seeds cacheStats from usageStats (reload only; no re-seed on statuses reconnect, so no double-count with live events). Tests: core DB-backed usage persistence/aggregate; api usage-row-per-event + fallback discard; routes GET /tabs usageStats; frontend hydrate seed + no-double-count + live-accumulation-after-seed. 495 -> 509 passing.
2026-06-02	feat(agents): per-model reasoning effort level	Adam Malczewski
	Add a per-model/key reasoning effort setting to agent definitions, surfaced and editable in the Agent Settings page and displayed at a glance in the model selector views. - core: single source of truth for effort levels (REASONING_EFFORTS, DEFAULT_REASONING_EFFORT='high', labels, isReasoningEffort guard); add 'xhigh' level; AgentModelEntry.effort; xhigh budget=24000 for classic-thinking Claude; default floor 'high'. Persist/parse effort in the agent TOML loader. - api: thread effort through the fallback chain with per-model -> per-tab -> default precedence; validate /chat + agentModels effort from the canonical list. - frontend: effort <select> per model row in AgentBuilder; effort badges in ModelSelector (agent + subagent chains); Thinking dropdown sourced from canonical list; per-tab default raised to 'high'. - tests: +15 (loader round-trip, agent xhigh budget, canonical list + guard, api precedence, route validation).
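The per-model, then per-tab, then default precedence could look roughly like the following. Only 'high' and 'xhigh' are named in the log, so the other entries in the list below are assumptions, as is the function name `resolveEffort`.

```typescript
// Assumed canonical list: the log confirms only 'high' (default) and 'xhigh'.
const REASONING_EFFORTS = ["low", "medium", "high", "xhigh"] as const;
type ReasoningEffort = (typeof REASONING_EFFORTS)[number];
const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

// Type guard against the canonical list, usable for route validation too.
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return (
    typeof value === "string" &&
    (REASONING_EFFORTS as readonly string[]).indexOf(value) !== -1
  );
}

// Precedence: per-model entry, then per-tab setting, then the default.
function resolveEffort(perModel: unknown, perTab: unknown): ReasoningEffort {
  if (isReasoningEffort(perModel)) return perModel;
  if (isReasoningEffort(perTab)) return perTab;
  return DEFAULT_REASONING_EFFORT;
}
```

Invalid values fall through to the next level instead of throwing, which is one possible policy; the commit itself says the routes validate against the canonical list.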
2026-06-01	merge: dev into r1/claude-reset-fix	Adam Malczewski
	Brings in the n2/ntfy-notifications feature (ntfy.sh push notifications with per-event toggles, subagent-suppression flag, topic-only input, Settings UI, dispatcher + transport + config modules, 12+ new tests), the header declutter (theme picker + Debug panel moved into Settings / sidebar), the shared theme boot-apply module, and an a11y label for the remove-panel button. No code changes from this branch were touched by the merge — the overlap was purely textual. Conflict resolution: 1. HANDOFF.md (add/add conflict). Both branches independently put a single-purpose HANDOFF.md at the repo root for their respective in-flight feature, matching the existing convention (c351719 did the same for this branch; 29bdd00 did the same for ntfy). After this merge both features ship, so neither is in-flight anymore. Archive both into notes/: - notes/wake-schedule-handoff.md (this branch — git tracks as a rename from HANDOFF.md) - notes/ntfy-notifications-handoff.md (dev — recovered from MERGE_HEAD before deletion) The root HANDOFF.md is intentionally absent post-merge; the next in-flight branch will create its own. 2. packages/api/tests/routes.test.ts (auto-merged). dev appended ntfy stubs to the vi.mock('@dispatch/core', ...) factory; this branch appended a 'Wake schedule routes' describe block at the bottom. The two regions don't overlap and the textual auto-merge is correct (verified: 6 describe blocks, both mock-stub regions and the new describe present, no conflict markers). Verification on the merge commit: bun run test → 31 files, 495 / 495 passing (was 431 on the branch + 64 from dev) bun run check → biome clean, 156 files bun run --cwd packages/frontend typecheck → svelte-check 0 errors, 0 warnings dev can now fast-forward to this commit: git checkout dev && git merge --ff-only r1/claude-reset-fix
2026-06-01	fix(api): wake-schedule toggle requires explicit action: 'on' | 'off'	Adam Malczewski
	Round-2 Gemini review surfaced that the toggle endpoint derived add-vs- remove from its own in-memory state, which combined catastrophically with any UI desync: a user clicking to turn ON an hour the UI showed as off, but the server had as on, would silently get the hour turned OFF. The clicks felt 'inverted' and the only recovery was a full reload. Fix: require an explicit `action` field on every /toggle request. The client must declare its intent; the server is no longer allowed to guess. Idempotency rules: - action: 'off' on an already-off hour → 200, no-op success. - action: 'on' on an already-on hour → 200, REPLACES timestamps (so a recovering UI can re-assert the user's wall-clock intent without a delete-then-add round trip). - Missing or invalid action → 400. The 'off' path no longer reads or requires `timestamps`. The 'on' path still requires all four slot timestamps as finite Unix-ms numbers (the skewed-toggle relaxation from round 1 is preserved). Tests: - toggle() helper auto-derives action from `timestamps` presence, so the existing 12 tests stayed terse. One test that relied on the old 'empty body = add' behavior now passes `action: 'on'` explicitly. - Added 4 new contract tests: * rejects requests missing/with-invalid action * action='off' on an already-off hour is idempotent * action='on' on an already-on hour replaces timestamps (the round-2 desync-recovery scenario) * action='off' ignores stray timestamps payloads 29 / 29 routes tests pass; 431 / 431 across the workspace.
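The idempotency rules above can be modelled with a small in-memory function. This is a sketch of the contract only, not the project's route code: `applyToggle`, `parseTimestamps` and the result shape are hypothetical names.

```typescript
type SlotTimestamps = { "0": number; "15": number; "30": number; "45": number };
type ToggleResult = { status: 200 | 400 };

const SLOTS = ["0", "15", "30", "45"] as const;

// All four slots must be present as finite numbers (Unix ms).
function parseTimestamps(value: unknown): SlotTimestamps | null {
  if (typeof value !== "object" || value === null) return null;
  const record = value as Record<string, unknown>;
  const out: SlotTimestamps = { "0": 0, "15": 0, "30": 0, "45": 0 };
  for (const slot of SLOTS) {
    const ts = record[slot];
    if (typeof ts !== "number" || !Number.isFinite(ts)) return null;
    out[slot] = ts;
  }
  return out;
}

function applyToggle(
  schedule: Map<number, SlotTimestamps>,
  body: { hour: number; action?: unknown; timestamps?: unknown },
): ToggleResult {
  // The client must state its intent; the server never infers it from state.
  if (body.action !== "on" && body.action !== "off") return { status: 400 };
  if (body.action === "off") {
    schedule.delete(body.hour); // no-op when already off; timestamps are ignored
    return { status: 200 };
  }
  const timestamps = parseTimestamps(body.timestamps);
  if (timestamps === null) return { status: 400 };
  schedule.set(body.hour, timestamps); // replaces when the hour is already on
  return { status: 200 };
}
```

Because the action is explicit, a desynced client that repeats 'on' simply re-asserts its timestamps instead of flipping the hour off.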
2026-06-01	feat(notifications): topic-only input (drop URL validation)	Adam Malczewski
	The Settings field is now a plain topic name (e.g. `my-secret-topic`) instead of a full URL. The transport always posts to `https://ntfy.sh/<topic>` (URL-encoded), and the only server-side check is "non-empty when enabled". Removes the user-visible "string does not match the expected pattern" error people hit when typing a bare topic. - packages/core/src/notifications/ntfy.ts: drop validateTopicUrl; add buildNtfyUrl(topic) + exported NTFY_BASE_URL. - packages/core/src/notifications/types.ts, config.ts: rename topicUrl -> topic; update docs. - packages/api/src/routes/notifications.ts: only validates non-empty topic when enabled. Also fixes a latent bug where notifySubagents was dropped on every PUT (was not passed to normalizeNtfyConfig). - packages/frontend/src/lib/components/SettingsPanel.svelte: relabel field "Topic URL" -> "Topic"; placeholder "your-secret-topic"; updated helper copy. - Tests updated: rewrote validateTopicUrl coverage as buildNtfyUrl coverage + proof that previously-rejected topics (dots, spaces, unicode, "Any Topic Whatsoever") now POST cleanly. - HANDOFF.md: added a short "topic-only input" section.
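A minimal sketch of the topic-to-URL mapping described above. The commit names `buildNtfyUrl` and `NTFY_BASE_URL` but not their bodies, so the constant's value and the use of `encodeURIComponent` as the encoder are assumptions, and `topicError` is a hypothetical helper for the "non-empty when enabled" check.

```typescript
// Assumed value; the log only says the transport posts to https://ntfy.sh/<topic>.
const NTFY_BASE_URL = "https://ntfy.sh";

// URL-encode the bare topic so dots, spaces and unicode are all accepted.
function buildNtfyUrl(topic: string): string {
  return `${NTFY_BASE_URL}/${encodeURIComponent(topic)}`;
}

// The only server-side rule described: the topic must be non-empty when enabled.
function topicError(enabled: boolean, topic: string): string | null {
  if (enabled && topic.trim().length === 0) return "topic is required when enabled";
  return null;
}
```

Under these assumptions `buildNtfyUrl("Any Topic Whatsoever")` gives `https://ntfy.sh/Any%20Topic%20Whatsoever`.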
2026-06-01	fix(api): wake-schedule — accept skewed toggles, atomic persist, ↵	Adam Malczewski
	boot-recovery reason Three review-finding fixes in models.ts + regression tests: 1. POST /wake-schedule/toggle no longer rejects 'past' timestamps (Gemini #1, High). Client-server clock skew + request latency meant a freshly-computed nextOccurrenceAt(HH:MM) for an imminent slot could land in the past by the time the server validated it, silently failing the UI toggle. The scheduler's recoverScheduleEntry already fires within MISSED_WAKE_GRACE_MS and rolls forward by 24h × N, so the strict <= now check was actively harmful. Kept Number.isFinite + slot-present validation. 2. persistSchedule is now transactional (Gemini #3, Medium). The old DELETE-then-N-INSERTs path, when an INSERT failed mid-loop, left the table empty (DELETE had committed) and silently wiped the user's schedule on next boot — the catch swallowed the error. Wrapped both in db.transaction(...): on failure everything rolls back, in-memory state is untouched, and the previously persisted snapshot stays intact. 3. Boot-recovery reason no longer masked when boot recovery + due slots coincide (Gemini #5, Nit). Capture bootFireRequested before clearing the flag and append ' (boot recovery)' to the reason so the lastWake/pendingRetry surface tells the truth. Tests: - Replaced 'POST toggle rejects past timestamp' (the bug-as-feature test) with 'POST toggle ACCEPTS a slightly-past timestamp (clock skew / latency)' regression guard. - Added 'POST toggle rejects NaN / Infinity / non-number slot values' to lock the malformed-input path. - Added 'snapshot remains consistent across toggle round-trips (persistSchedule atomicity)' — exercises GET/POST cycles to ensure the transactional impl agrees with itself across add/remove. All 427 tests pass; biome clean.
2026-06-01	feat(notifications): add notifySubagents toggle to suppress subagent turn pings	Adam Malczewski
	A parent agent that spawns 8 subagents was producing 9 "Turn complete" notifications per round — almost always noise. New `notifySubagents` config flag (defaults to false) gates `turn-completed` and `turn-error` from any tab with a `parentTabId`. The flag is intentionally NOT applied to `permission-required` — a subagent's permission prompt still needs a human tap to proceed, so suppressing it would silently hang the subagent. `agent-spawned` is already top-level-only by construction. Wiring: - core/notifications/types.ts: NtfyConfig.notifySubagents: boolean - core/notifications/config.ts: defaults to false; normalize() tolerates missing / wrong-typed values and falls back to false - core/notifications/dispatcher.ts: new optional TabParentLookup option (getTabParentId). When notifySubagents=false AND the lookup returns a non-empty parent id string, turn-completed/turn-error are dropped. Lookup failures (no lookup configured, throws, returns undefined) fall back to "treat as top-level" so legitimate top-level events are never silently dropped when the DB is briefly unreadable. - api/app.ts: wires getTabParentId via core's getTab(id)?.parentTabId - frontend SettingsPanel.svelte: "Include subagent tabs" checkbox with an explanatory hint that permission prompts still fire Tests (+9): - 3 in config.test.ts: default-false, explicit-true, wrong-typed fallback - 6 in dispatcher.test.ts: suppression of turn-completed/turn-error from subagents, no suppression when flag is true, permission-required not gated, graceful fallback when lookup is missing/throws/returns undefined Live ntfy.sh round-trip re-verified (status: 200).
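The gating rule above, including the fail-open behaviour when the parent lookup is unavailable, might be expressed like this. The function name `shouldNotify` and its parameter order are hypothetical; the event names come from the commit.

```typescript
type NotifyEvent =
  | "turn-completed"
  | "turn-error"
  | "permission-required"
  | "agent-spawned";

// Optional lookup, mirroring the commit's getTabParentId option.
type TabParentLookup = (tabId: string) => string | null | undefined;

function shouldNotify(
  event: NotifyEvent,
  tabId: string,
  notifySubagents: boolean,
  getTabParentId?: TabParentLookup,
): boolean {
  if (notifySubagents) return true;
  // Only turn pings are gated; permission prompts always need a human.
  if (event !== "turn-completed" && event !== "turn-error") return true;

  let parentId: string | null | undefined;
  try {
    parentId = getTabParentId ? getTabParentId(tabId) : undefined;
  } catch (_err) {
    parentId = undefined; // lookup failure: treat the tab as top-level
  }
  const isSubagent = typeof parentId === "string" && parentId.length > 0;
  return !isSubagent;
}
```

Failing open means a briefly unreadable DB can let a subagent ping through, but never drops a legitimate top-level one.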
2026-06-01	feat(wake): probe 4 times per marked hour (:00 :15 :30 :45), coalesce ↵	Adam Malczewski
	same-tick fires Marking an hour on the Claude Wake Schedule panel now schedules FOUR probes within that hour instead of one. Rate-window edges are unforgiving — a single probe at :15 can miss the actual reset moment by up to 14 minutes; hitting :00 / :15 / :30 / :45 puts us within ~7 minutes of any reset that happens during that hour. When multiple slots come due in the same 30s scheduler tick (or recover together at boot), they coalesce into a SINGLE upstream wake call — no point hitting Anthropic 4× in the same window. DB schema - wake_schedule is now (hour, slot_minute, next_wake_at) PK (hour, slot_minute). Destructive migration: detect old single-row-per-hour schema by absence of the slot_minute column and DROP TABLE. No other table is touched. Per user direction: no back-compat for old rows. API - POST /models/wake-schedule/toggle add: { hour, timestamps: { '0': ms, '15': ms, '30': ms, '45': ms } } — all 4 slots required, all must be future Unix ms. Delete shape unchanged ({ hour }). - GET /models/wake-schedule shape: schedule: { '9': { '0': ts, '15': ts, '30': ts, '45': ts }, ... } probeSlotMinutes: [0, 15, 30, 45] resetOffsetHours, lastWake, pendingRetry (unchanged from prior commit) Frontend - Computes 4 timestamps client-side (next occurrence of HH:MM in local TZ) and sends them in one request. - markedHours summary now says 'Probes :00 :15 :30 :45 → reset by ~Xh later'. - Same in-flight tracking / current-hour ring / status row as before. Tests - wake-scheduler.test.ts unchanged (pure helpers still correct; added PROBE_SLOT_MINUTES + isProbeSlotMinute exports). - routes.test.ts rewritten for the new payload shape: 12 wake-schedule tests covering snapshot shape, add/remove (full 4-slot round-trip), validation (range, integer, past-slot, missing slot, non-object, missing timestamps), independent multi-hour scheduling, and re-toggle replacement. 417 tests total (was 414).
2026-06-01	fix(api): wake scheduler — missed-wake recovery, retry consolidation, ↵	Adam Malczewski
	status surface Bugs fixed - Missed wakes silently lost. The old loadScheduleFromDB just pushed any past next_wake_at to its 'next occurrence' in server local time, so a wake that fired while the API was down never ran — defeating the whole point of the panel (overnight task picks up after a 5h rate-window reset). Now: if missed by <= 2h we fire it on the next tick; either way the entry is rolled forward by 24h-multiple steps. - Server-TZ drift. nextOccurrenceAt15 used the server's local TZ, so on a UTC Docker host running for a user in PST the reschedule slowly migrated the fire time. Now we advance by 24h * N from the original client-supplied timestamp, preserving the user's wall-clock intent. - Retry storm. Every failed wake pushed a new entry into a retries[] array, all converging at the same +5min instant. Replaced with a single shared pending-retry slot whose budget resets on subsequent failures. - Retry race with fresh fires. If a tick fired AND a retry was due in the same iteration we'd double-hit the upstream. Now retries only run on ticks where no fresh wake fired. New behavior surfaced on /wake-schedule: { schedule, resetOffsetHours, lastWake, pendingRetry } POST /wake-schedule/toggle now also rejects non-integer hours (4.5, etc.) and returns the same snapshot shape so the client can stay in sync. Tests: 9 new HTTP route tests covering snapshot shape, add/remove, validation (range, integer, past timestamp, missing timestamp), and independent multi-hour scheduling.
2026-06-01	feat(api): extract pure wake-scheduler helpers (nextDailyAfter, ↵	Adam Malczewski
	recoverScheduleEntry) Side-effect-free module so missed-wake recovery and rescheduling can be unit-tested without booting Hono or touching SQLite. - nextDailyAfter: advances by 24h increments until strictly > now (handles multi-day gaps in a single step instead of looping a day at a time). - recoverScheduleEntry: classifies a past next_wake_at into 'fire now, then advance' vs 'silently advance' based on MISSED_WAKE_GRACE_MS (2h). - CLAUDE_RESET_OFFSET_HOURS / resetHourFor: single source of truth for the '+5h reset' display, previously hardcoded in three places. Includes 12 unit tests covering grace boundaries, multi-day skip, custom grace windows, and midnight wraparound.
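The two helpers described above can be sketched as pure functions. The names and the 2h grace value come from the commit; the parameter lists and the `{ fireNow, nextWakeAt }` return shape are assumptions made for illustration.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const MISSED_WAKE_GRACE_MS = 2 * 60 * 60 * 1000; // 2h, per the commit

// Advance by whole 24h steps until strictly greater than `now`,
// computed in one step rather than by looping a day at a time.
function nextDailyAfter(timestamp: number, now: number): number {
  if (timestamp > now) return timestamp;
  const steps = Math.floor((now - timestamp) / DAY_MS) + 1;
  return timestamp + steps * DAY_MS;
}

// Classify a stored next_wake_at: fire now if missed within the grace
// window, and in either case roll the entry forward past `now`.
function recoverScheduleEntry(
  nextWakeAt: number,
  now: number,
  graceMs: number = MISSED_WAKE_GRACE_MS,
): { fireNow: boolean; nextWakeAt: number } {
  if (nextWakeAt > now) return { fireNow: false, nextWakeAt };
  const missedBy = now - nextWakeAt;
  return { fireNow: missedBy <= graceMs, nextWakeAt: nextDailyAfter(nextWakeAt, now) };
}
```

Stepping from the original timestamp in 24h multiples keeps the client's wall-clock intent without consulting the server's time zone.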
2026-06-01	feat(api): wire notification dispatcher into app + /notifications routes	Adam Malczewski
	PermissionManager: add onPromptAdded(listener) callback. Fires exactly once per unique pending prompt id, even when broadcastPending is called repeatedly for unrelated mutations (e.g. another prompt resolving while this one is still pending). app.ts: instantiate NotificationDispatcher, attach to both AgentManager and PermissionManager. Tab-title lookup via core's getTab so the notifications carry human-readable context instead of raw UUIDs. routes/notifications.ts: - GET /notifications — current config (auth token redacted) plus the event-type catalog and defaults - PUT /notifications — partial update; auth token semantics are undefined=keep, ''=clear, otherwise replace - POST /notifications/test — sends a test notification with the current config (rejects if disabled or topic invalid) Tests: - new permission-manager.test.ts covers the onPromptAdded contract (one-fire-per-prompt, dedup across rebroadcasts, unsubscribe, listener throws don't break siblings) - existing routes.test.ts gets stubs for the new core notification exports so the @dispatch/core mock stays complete
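The three-way auth token semantics on PUT (undefined keeps, empty string clears, anything else replaces) reduce to a small merge function. The name `mergeAuthToken` and the redaction helper are hypothetical illustrations, not the project's code.

```typescript
// PUT semantics: undefined = keep, '' = clear, otherwise replace.
function mergeAuthToken(
  current: string | undefined,
  incoming: string | undefined,
): string | undefined {
  if (incoming === undefined) return current;
  if (incoming === "") return undefined;
  return incoming;
}

// GET side: never echo the token back, only whether one is set (assumed shape).
function redactAuthToken(token: string | undefined): { hasAuthToken: boolean } {
  return { hasAuthToken: token !== undefined && token.length > 0 };
}
```

Distinguishing "field omitted" from "field emptied" lets a partial update change other settings without forcing the client to resend a secret it never received.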
2026-06-01	fix(queue): consume queued messages after a turn ends (start a new turn)	Adam Malczewski
	A message queued while the agent was mid-turn was only handled if it arrived DURING a tool batch (injected as a [USER INTERRUPT]). If it landed after the last tool call — or the turn had no tools — the agent silently appended it to history and ended the turn with no response, so it sat there unanswered. This affected both user-queued messages and agent-queued ones (send_to_tab). - agent.ts: stop the end-of-turn drain that swallowed trailing queued messages into history. They now stay on the queue. - agent-manager: after a CLEAN turn settles, continueFromQueue() drains the queue and starts a fresh turn to answer it. Skipped on a user-stopped or errored turn (queue preserved for the next send). - Loop safety: continuation draws from the existing autoWakeBudget, so a runaway agent<->agent chain is bounded; human sends refill it, so human conversations are never throttled. - dequeueMessages now tags message-consumed with reason "interrupt" | "continuation"; the frontend collapses continuation-consumed queued bubbles into the next turn's initiator row (avoids the linger/dup traps documented in queue-interrupt-reconcile-edge-cases.md). - Tests: agent (no-swallow + interrupt regression), agent-manager (continuation, no-op when empty, user-stop preserves queue, bounded loop), frontend (continuation bubble becomes next initiator). - wishlist: remove the now-fixed item.
2026-06-01	feat(tabs): tab-to-tab agent communication via short handles	Adam Malczewski
	Add send_to_tab / read_tab tools so an agent can message or read another tab by a git-style short handle (shortest unique prefix of the tab UUID, min 4 chars), shown in the tab bar. - core/db/tabs: resolveTabPrefix + shortestUniquePrefix (open tabs only, LIKE-sanitized prefix matching) - new tools read-tab.ts / send-to-tab.ts (+ tests) decoupled from the DB TabRow via a minimal ResolvedTabRef projection - agent-manager: unified deliverMessage routing (busy -> queue, idle -> new turn) shared by POST /chat and send_to_tab; agent->agent auto-wake budget (MAX_AGENT_AUTO_WAKES) to bound ping-pong loops - summon/loader: send_to_tab + read_tab as grantable tools - frontend: shortHandleFor + handle badge in TabBar; perm toggles - notes: tab-comm / user-agents / todo-redesign plans - chore: biome format fixes (debug-logger, summon.test) Refs notes/plan-tab-comm.md
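The short-handle idea above can be sketched in memory as follows. This is an illustration only: the commit describes LIKE-based prefix matching against open tabs in the database, whereas this version works on a plain array of ids, and both signatures are assumed.

```typescript
const MIN_HANDLE_LENGTH = 4; // minimum handle length stated in the commit message

// Shortest prefix of `id` (at least `minLength` chars) that no other open tab shares.
function shortestUniquePrefix(
  id: string,
  openIds: readonly string[],
  minLength: number = MIN_HANDLE_LENGTH,
): string {
  const others = openIds.filter((other) => other !== id);
  for (let len = Math.min(minLength, id.length); len <= id.length; len++) {
    const prefix = id.slice(0, len);
    if (!others.some((other) => other.startsWith(prefix))) return prefix;
  }
  return id; // only reachable if another id starts with the full id
}

// Resolve a handle back to exactly one open tab id; ambiguous or unknown gives null.
function resolveTabPrefix(prefix: string, openIds: readonly string[]): string | null {
  const [only, ...rest] = openIds.filter((id) => id.startsWith(prefix));
  return only !== undefined && rest.length === 0 ? only : null;
}
```

As with git's abbreviated hashes, a handle grows only as long as needed to stay unambiguous among the currently open tabs.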
2026-05-30	feat(chunks): chunk-native frontend store with turn-sealed reconcile + ↵	Adam Malczewski
	per-chunk eviction Replace the stored ChatMessage[] with a chunk-native model: tab.chunks (sealed ChunkRow[]) + tab.live (transient in-flight turn buffer) + derived tab.renderGroups. This enables per-chunk eviction (trimming WITHIN a large turn) and raw-chunk pagination (loadOlderChunks), removing the whole-message eviction limitation. Backend: - Emit turn-start/turn-sealed around each turn; expose currentTurnId in the status snapshot. turn-sealed fires after the durable write (status:idle fires before it). - New GET /tabs/:id/chunks raw paginated endpoint (limit/before). - Wrap appendChunks in a single SQLite transaction. Frontend: - turn-sealed drives a turn-aware reconcile that folds the sealed turn into chunks while preserving a concurrent newer in-flight turn and pending queued messages; deferred while the user is scrolled up. - Stable turn-scoped render keys (${turnId}:${role}:${n}) avoid remount/flash. Reconcile correctness (three review passes): - preserve a concurrent newer turn when an earlier deferred reconcile flushes; - keep optimistic queued user messages (no loss); - turn-start backfill skips pending queued rows and tags only the turn initiator; - bind consumed interrupt messages to the in-flight turn so they collapse on seal (no lingering/duplicated bubble). Tests: chat-store reconcile/eviction/pagination suite; api chunks endpoint + events.
2026-05-30	refactor(chunks): append-only chunk log with per-step cache-stable wire	Adam Malczewski
	Replace the message-as-container model with a flat, append-only chunk log. - chunks table (id, tab_id, seq, turn_id, step, role, type, data_json): one row per chunk; tool_call (assistant) and tool_result (tool) are SEPARATE rows linked by callId. Message/turn are derived groupings, not stored. - chunks/transform.ts: DB-free explode (Chunk[] -> rows) / group (rows -> messages), shared by backend and the browser frontend. - Cache fix: toModelMessages segments each turn at tool-batch boundaries into stable [assistant, tool] pairs per step, so earlier steps serialize byte-identically across requests (kills the prompt-cache churn). - agent-manager persists a turn's chunks on seal (once), discarding a failed fallback attempt's partial chunks; rebuilds agent history from the log. - GET /messages windows the log by chunk seq then groups; loadMoreMessages merges a turn split across the window boundary by turnId. - One-shot migration drops the legacy messages table and clears tabs; settings/credentials/keys/usage preserved. Full suite green (317 tests); biome, tsc, and svelte-check clean.
2026-05-29	feat: stop generation button with abort signal plumbing	Adam Malczewski
	- Add POST /chat/stop endpoint on API - Thread abortSignal from agent-manager through Agent.run() to streamText - Thread abortSignal option through the Agent.run() signature - Emit status:idle on stopTab() so frontend WS gets the update - Add stopGeneration() store method on frontend tabStore - Add stop button in ChatInput (btn-sm lg:btn-xs for mobile tap target) - Add tests for /chat/stop endpoint - Refactor processMessage to pass abortSignal to agent.run
2026-05-28	feat: restore tab layout + in-flight chunks on browser reopen; agents keep ↵	Adam Malczewski
	running in background Implements the 'background-running agents + restore-layout-on-reopen' feature. Full design and parallel-implementation plan in `plan-bg-restore.md`; Gemini code review (SHIP verdict, no findings) in `report.md`. User-visible behaviors: 1. Browser-close keeps agents alive. If an agent is mid-stream when the browser closes / reloads / loses the network, it continues processing on the backend. (This was already the case in code — agents run fire-and-forget in app.ts:77-79 — but it was previously pointless because the UI never restored the tab to receive the output.) 2. Layout restore on browser reopen. Every tab that existed at the time the window was closed is restored, in original `position` order, with full persisted message history. Tabs whose agents finished while disconnected appear with the completed message. Tabs whose agents are still running appear streaming live — the in-flight assistant message is reconstructed from the backend's in-memory `currentChunks` (sent over the wire on connect) and accumulates new deltas as they arrive. 3. Explicit tab-close cancels + forgets. Clicking the X still cancels the agent (existing `stopTab` in DELETE /tabs/:id) and archives the row (`is_open = 0`), so it is not restored. No change to that path. The gap that the implementation closes: previously, App.svelte:onMount unconditionally called `createNewTab()` with a fresh UUID, ignoring every existing row in the `tabs` table. Every browser open was a clean slate. The DB had the conversation history but no way for the UI to discover it. 
Implementation: • New `TabStatusSnapshot` interface in packages/core/src/types/index.ts (auto-exported via existing `export * from "./types"`): interface TabStatusSnapshot { status: AgentStatus; currentChunks?: Chunk[]; // present iff running currentAssistantId?: string; // present iff running } • `agent-manager.ts:getAllStatuses()` rewritten to return `Record<string, TabStatusSnapshot>` (was `Record<string, AgentStatus>`). For running tabs only, attaches a defensive shallow copy of `tabAgent.currentChunks` (the live streaming array the per-message loop appends to) plus the DB id of the in-flight assistant message. The defensive copy is the consumer's to mutate. Idle / error tabs get `{ status }` only. `GET /status` and the WS `onOpen` snapshot both pick up the new shape automatically — neither call site changed. • Frontend mirror of `TabStatusSnapshot` in packages/frontend/src/lib/types.ts; `AgentEvent.statuses` variant updated to use `Record<string, TabStatusSnapshot>`. • New `hydrateFromBackend()` on the tab store (packages/frontend/src/lib/tabs.svelte.ts). Sequence on app mount: 1. Bail with 0 if `tabs.length > 0` (hot-reload idempotency). 2. GET /tabs → list of `is_open=1` rows in `position` order. 3. GET /status → in-flight TabStatusSnapshot map. 4. GET /tabs/:id/messages for each tab in parallel via Promise.all → persisted ChatMessage[]. 5. Build the Tab objects, splicing the snapshot's live chunks into the in-flight assistant message for every running tab (two paths: merge into the existing DB row with matching id, or append a fresh in-flight message if no row matches). 6. `tabs = restored; activeTabId = restored[0]?.id ?? null;` Every fetch is wrapped in try/catch so one tab's failure can't destroy the whole restore pass. • WS `statuses` handler in `tabs.svelte.ts:handleEvent` rewritten for the new shape. Still fires `reloadTabMessagesFromApi` on the desync case (frontend thinks running, backend says idle — the pre-existing recovery path is preserved). 
When backend says running, seeds in-flight chunks into the assistant message matching `snap.currentAssistantId` (creating it if needed). When backend says non-running, clears `isStreaming` on the previous in-flight message and nulls `currentAssistantId`. • `App.svelte:onMount` now awaits `tabStore.hydrateFromBackend()` before deciding whether to fall back to `createNewTab()`. Fallback condition is the doubly-defensive `restored === 0 && tabStore.tabs.length === 0`. `wsClient.connect()` fires in parallel with hydration — the resulting WS `statuses` event is per-tab idempotent against the hydrated state, so there is no race even if it arrives mid-hydration. What was NOT done (deliberately, deferred to wishlist): • Pre-existing inconsistency: core `AgentStatus` includes "waiting_for_key" but frontend `TabStatusSnapshot.status` uses only the existing 3-state pattern ("idle" | "running" | "error"). Not introduced here; mirrored the existing precedent. • Restored tabs use defaults for `reasoningEffort`, `agentSlug`, `agentScope`, `agentModels`, `workingDirectory` — these are not in the DB `tabs` schema. Future schema expansion. • Per-delta DB flushing — not needed; the in-memory snapshot covers the gap between flushAssistant calls. • LocalStorage cache of tab ids — backend DB is the source of truth. Process notes: • Implemented via parallel programmer subagents (flash agents were requested but unavailable in this environment — substituted with "programmer" agents, which share the "reads a plan, implements a single step" charter). Backend (Segment A: getAllStatuses + 5 tests) and frontend (Segment B: types + hydrateFromBackend + statuses handler + onMount + 8 tests) ran disjoint-file-ownership in parallel. • Gemini code review (yolo mode for tool access, explicit prompt-level write restriction to `report.md` only) returned a SHIP verdict with no findings against the plan. 
• Self-review surfaced one followup gap that Gemini's earlier plan-mode pass also caught: no explicit test for `/tabs/:id/messages` failure isolation. Added a test covering both HTTP-500 and network-error variants alongside a healthy tab, asserting per-tab failures don't destroy the whole restore. Tests: • api/tests/agent-manager.test.ts: +5 (snapshot empty record, idle-tab field omission, running-tab field inclusion, defensive copy invariant, omits chunks for running tab with null currentChunks). 31 total (was 26). • frontend/tests/chat-store.test.ts: +9 (restore-with-messages, in-flight seeding, /tabs failure → 0 returned, empty /tabs array, idempotency when tabs already exist, idle-status when /status omits, running-snapshot statuses handler seeding, idle-snapshot statuses handler clearing, per-tab failure isolation across HTTP-500 and network-error). 44 total (was 35). Totals: 243 tests across 3 packages all green; typecheck clean on core + api + frontend; biome clean across 124 files.
2026-05-28	fix(api): pre-populate Agent.messages from DB on construction so model ↵	Adam Malczewski
	switches preserve chat history Before this change, swapping the model mid-conversation via the sidebar slider lost all prior turns: the new model saw only the current user message and treated the conversation as brand-new. Root cause: `getOrCreateAgentForTab` invalidates the cached Agent (`tabAgent.agent = null`) whenever the effective keyId, modelId, permission key, or working directory differs from the cached values. The replacement Agent was then constructed with `messages: []` and the post-construction step that loads prior turns from the SQLite `messages` table simply did not exist. `processMessage` had already appended the current turn's user message to the DB (line 960) before calling `getOrCreateAgentForTab` (line 1015), so the DB held the full context — it was just never read. Fix: after every `new Agent(...)` in `getOrCreateAgentForTab`, call `getMessagesForTab(tabId)`, walk backwards to the most recent user-role row, and assign all strictly-prior rows to `tabAgent.agent.messages`. The walk-backwards strategy correctly handles two boundary cases: 1. Simple model switch — last DB row is the current user message; drop it (`Agent.run()` will push it again at line 546). 2. Agent-mode auto-fallback retry — last DB row may be a partial assistant response flushed by the previous failed attempt; we drop both that and the current user message in one pass. System-role rows (config-reload notices, etc.) are preserved verbatim; `toModelMessages` already strips them before the wire payload, so this is safe. The fix covers every Agent-reconstruction trigger, not just the model slider: - Sidebar model/key change (the reported case) - Permission setting change - Working-directory change (`processMessage` line 951) - dispatch.toml config-watcher reload (lines 236–237) - Skills directory watcher reload (lines 249–250) - `stopTab` after user cancellation (line 775) If `getMessagesForTab` throws (e.g. 
DB locked, schema mismatch), we swallow the error and leave `messages: []` — matching pre-fix behaviour for that case so this commit never regresses. Tests (+6 in `packages/api/tests/agent-manager.test.ts`, total 26): - pre-populates Agent.messages from DB history - leaves messages empty when DB has only the current turn (first msg) - excludes a partial assistant trail from a prior fallback attempt - preserves system-role rows in pre-populated history - survives a getMessagesForTab failure without crashing - reloads history on every Agent reconstruction (simulated slider switch from Opus to DeepSeek across two processMessage calls) The test rig was extended with: - `fakeMessagesByTab` map + `setFakeMessages` helper letting tests inject arbitrary DB rows for the mocked `getMessagesForTab`. - `constructedAgents` array captured at `run()` entry (not at construction) so each test sees the post-pre-populate snapshot; the production code reassigns `agent.messages` after `new Agent()` returns, so capturing at construction yielded a stale empty array. - Pluggable `runImpl` hook for tests that want a custom event stream (not yet exercised; staged for the next round of agent-mode fallback tests). Totals: 229 tests across 3 packages all green; typecheck clean on core + api + frontend; biome clean across 124 files.
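The walk-backwards step described in the fix can be sketched as a pure function over the stored rows. The `Row` shape and the `priorHistory` name are hypothetical, and returning an empty history when no user row exists is an assumption the commit message does not spell out.

```typescript
// Hypothetical row shape; the real DB rows carry more fields.
type Row = { id: string; role: "user" | "assistant" | "system" | "tool"; content: string };

// Find the most recent user-role row and return everything strictly before it.
// This drops the current user message (Agent.run() pushes it again) together with
// any partial assistant rows flushed after it by a failed fallback attempt.
function priorHistory(rows: readonly Row[]): Row[] {
  for (let i = rows.length - 1; i >= 0; i--) {
    if (rows[i]?.role === "user") return rows.slice(0, i);
  }
  return []; // assumption: with no user row there is nothing safe to replay
}
```

System-role rows earlier in the log are kept as-is, which matches the commit's note that `toModelMessages` strips them before the wire payload.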
2026-05-28	refactor(core): upgrade ai-sdk v4 → v6 + Anthropic/openai-compatible ↵	Adam Malczewski
	reasoning round-trip + max-thinking budget audit Migrates the LLM stack from ai v4 (with the v4-era @ai-sdk/anthropic and @ai-sdk/openai-compatible providers) to ai v6 and the matching @ai-sdk/anthropic and @ai-sdk/openai-compatible releases. Full design in plan-v6-upgrade.md; two rounds of Gemini code review captured in report.md. Motivation: the recurring 'reasoning-signature without reasoning' error on Claude Opus 4.7 was a v4 SDK artefact — the v4-era @ai-sdk/anthropic provider emitted Anthropic signature_delta as a separate stream chunk that orphaned when the model produced a signed-but-empty thinking block, and our chunk store had no signature field so the round-trip back to Anthropic was rejected on the next turn. In v6, signatures arrive inside providerMetadata on the reasoning-end event, and the orphan-signature class of bug is gone at the SDK level. Core changes: • ThinkingChunk gains optional metadata?: Record<string, unknown> (the v6 providerMetadata blob). A non-undefined metadata 'seals' the chunk: subsequent reasoning-delta opens a new chunk rather than extending the sealed one. • AgentEvent gains { type: 'reasoning-end'; metadata? } (replaces the v4 reasoning-signature variant). • toModelMessages (replaces toCoreMessages): - returns ModelMessage[] (was CoreMessage[]) - thinking → { type: 'reasoning', text, providerOptions: metadata } - tool-batch entries → { type: 'tool-call', input } (was 'args') - tool results → { output: { type: 'text', value } } ToolResultOutput • Claude OAuth uses createAnthropic({ authToken }) natively — no more custom-fetch x-api-key → Bearer swap. • rewriteBodyForOpus47 deleted — Opus 4.7 adaptive thinking is native via providerOptions.anthropic.thinking = { type: 'adaptive' }. • V1 middleware → V3 (specificationVersion: 'v3'). • v4-era normalizeMessages openai-compatible middleware deleted; the v6 openai-compatible provider extracts reasoning_content natively from { type: 'reasoning' } content parts. 
• applyAnthropicStructuralNormalisations (mirrors opencode provider/transform.ts:53-148): drops empty text/reasoning parts, scrubs non-[a-zA-Z0-9_-] toolCallIds, splits [tool-call, non-tool] assistant turns (Anthropic rejects tool_use followed by text). • applyOpenAICompatibleReasoningNormalisation (mirrors opencode transform.ts:217-249): lifts reasoning text into providerOptions.openaiCompatible.reasoning_content (always, even empty). Solves DeepSeek 'The reasoning_content in the thinking mode must be passed back' — the v6 SDK skips emitting reasoning_content when text is empty (dist/index.mjs:245), but DeepSeek requires the field present once thinking was used. • Tools: tool({ inputSchema: jsonSchema(zodToJsonSchema(...)) }) (was parameters: ZodSchema). AI SDK tools have no execute callback — the agent runs tools manually for permission prompts and shell-output streaming. New dep: zod-to-json-schema@^3.25.2. • fullStream event loop rewritten for v6 event shape: text-delta (text not textDelta), reasoning-start/delta/end, tool-input-*, tool-call (input not args), tool-result, tool-error (new), abort (new), start-step/finish-step, finish. Max-thinking audit (matches opencode transform.ts:642-671 budgets): • Claude enabled-thinking max budget 16000 → 31999 (Anthropic ceiling) • Claude enabled-thinking high budget 10000 → 16000 • maxOutputTokens 'budget + 8000' → fixed 32000 (matches opencode's OUTPUT_TOKEN_MAX; model self-allocates thinking vs response within) • Opus 4.7 adaptive thinking gains display: 'summarized' and sibling effort field (without these, thinking content is hidden by Anthropic and the model barely thinks). 
Frontend mirrors: • types.ts — ThinkingChunk.metadata?, AgentEvent reasoning-end • tabs.svelte.ts — routes reasoning-end through applyChunkEvent • ChatMessage.svelte — hides empty thinking chunks; hides the entire assistant bubble when no chunk has renderable content Gemini-review-driven fixes: • tool-error and abort stream events now surface as error chunks (were silently ignored) • toolCallId scrubbing pass (opencode transform.ts:96-122 parity) • Empty-reasoning-cull explicit test coverage for both Anthropic structural normalisation and DeepSeek path Test counts (223 tests across 3 packages, all green): • tests/chunks/append.test.ts: 44 (was 38) — reasoning-end sealing, orphan walk-back, multi-block interleaving • tests/agent/agent.test.ts: 24 (was 5) — exhaustive v6 event mappings, structural normalisations, signature/reasoning_content round-trip, tool-error/abort branches, DeepSeek scenario, empty reasoning edge case • tests/llm/provider.test.ts: 9 (was 22) — dropped 13 obsolete v4 middleware tests; new minimal tests confirm no middleware wrapping on default openai-compat path and that createAnthropic gets authToken vs apiKey correctly for OAuth vs api-key flows • tests/tools/registry.test.ts: 10 (was 4) — v6 tool() contract (inputSchema, no execute, JSON Schema for nested zod) • packages/api/tests/agent-manager.test.ts: 12 (was 7) — mock Agent emits v6 reasoning events; reasoning-end broadcast + ordering • packages/frontend/tests/chat-store.test.ts: 35 (was 32) — reasoning-end flow through Svelte $state store typecheck clean (tsc --noEmit on core + api, svelte-check on frontend), biome clean across 124 files.
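Two of the Anthropic structural normalisations listed above (dropping empty text/reasoning parts and scrubbing toolCallIds) can be sketched like this. The `Part` union is a simplified stand-in for the SDK's message content parts, and replacing disallowed characters with `_` is an assumption; the commit only states which characters are scrubbed.

```typescript
// Simplified, hypothetical content-part union for illustration.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown };

// Replace every character outside [a-zA-Z0-9_-] (replacement char is assumed).
function scrubToolCallId(id: string): string {
  return id.replace(/[^a-zA-Z0-9_-]/g, "_");
}

// Drop empty text/reasoning parts and scrub tool-call ids; other parts pass through.
function normaliseParts(parts: readonly Part[]): Part[] {
  return parts
    .filter((p) => p.type === "tool-call" || p.text !== "")
    .map((p) => (p.type === "tool-call" ? { ...p, toolCallId: scrubToolCallId(p.toolCallId) } : p));
}
```

This covers only the Anthropic path; per the commit, the openai-compatible path instead passes reasoning_content back even when it is empty.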
2026-05-27	refactor: ChatMessage.chunks[] union — interleaved thinking, tool ↵	Adam Malczewski
	batching, error/system chunks
2026-05-27	feat: tool-output truncation+spill, read_file pagination, read_file_slice, ↵	Adam Malczewski
	symlink-safe path resolution
2026-05-23	feat: add is_subagent flag to agents, fix all lint/type/test issues	Adam Malczewski
	- Add is_subagent checkbox to agent editor; subagents are hidden from Chat Settings - Add is_subagent field to AgentDefinition type, TOML serialization, and API route - Filter subagents from ModelSelector agent list - Fix all biome lint/format errors across codebase (useLiteralKeys, noNonNullAssertion, noExplicitAny, formatting, import sorting) - Fix svelte-check errors (type narrowing in SkillsBrowser, ToolPermissions, SidebarPanel) - Fix a11y warnings in App.svelte (label-control associations) - Fix test mocks missing BackgroundShellStore, BackgroundTranscriptStore, createWebSearchTool, createYoutubeTranscribeTool - Update stale 409 test to match current message-queuing behavior - Exclude packaging/ and release/ dirs from biome to avoid linting stale build artifacts
2026-05-22	feat: agent summoning system, todo improvements, security fixes, ↵	Adam Malczewski
	double-execution bug fix - Add summon/retrieve tools for spawning child agents in new tabs - summon: non-blocking, returns agent_id immediately - retrieve: blocking, waits for child to finish, returns result - Child tools are intersected with parent permissions (no privilege escalation) - Working directory validated to stay within workspace - Abort controller stops orphaned agents on tab close - Rename task_list tool to todo with comprehensive usage guidance in system prompt - Rename PermissionLog.svelte to ToolPermissions.svelte - Add 'Summon agents' toggle to tool permissions UI - Redesign TaskListPanel with DaisyUI checkboxes (indeterminate for in-progress) - Remove 'blocked' status from task system - Add tab-created WebSocket event for child agent tab visibility - Add HMR cleanup for WebSocket connections (close stale connections on hot reload) - Fix ensureAssistantMessage to not throw on closed tabs - Fix double tool execution: remove execute from AI SDK tool() in registry.ts (agent.ts already executes tools manually via executeToolWithStreaming) - Fix all pre-existing test failures (missing mocks, stale API signatures) - Add debug info to copy button (tab ID, injected skills, all tab IDs) - Add tab ID and tools to conversation copy output
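The "no privilege escalation" rule for summoned children amounts to a set intersection. A minimal sketch, assuming tools are identified by plain string names; the `grantChildTools` name and signature are hypothetical.

```typescript
// A child agent receives only the requested tools its parent is itself allowed to use.
function grantChildTools(requested: readonly string[], parentAllowed: readonly string[]): string[] {
  const allowed = new Set(parentAllowed);
  // Preserve request order and drop duplicates as well as anything the parent lacks.
  return [...new Set(requested)].filter((tool) => allowed.has(tool));
}
```

For example, a parent limited to `read_file` and `list_files` that summons a child requesting `read_file` and `run_shell` would hand over `read_file` only.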
2026-05-20	feat: phase 3 — config, skills, model groups, task list, and sidebar UI	Adam Malczewski
	- Config system: TOML-based dispatch.toml with hot-reload via chokidar - Model/key resolution: tag-based model selection, key fallback chains - Skills system: directory loader with TOML frontmatter, agent mappings - Task list tool: add/update/list/get operations with WebSocket events - API routes: GET /config, /skills, /skills/:name, /models, /models/resolve - Frontend: sidebar with model status, task list, config viewer, skills browser, permission log - Sliding sidebar animation using CSS transitions (not Svelte transitions)
2026-05-19	feat: Phase 2 — shell permissions, tree-sitter analysis, permission UI	Adam Malczewski
	Permission engine: - Rule-based engine: wildcard matching, last-match-wins, reject cascade - PermissionService with pending/approved state, PermissionChecker interface - dispatch.yaml config loader with per-permission pattern rules Shell tool: - run_shell tool with child_process spawn, timeout, streaming output - Tree-sitter static analysis (web-tree-sitter + tree-sitter-bash WASM) - BashArity command normalization for 'always allow' patterns - FILE_COMMANDS set: rm, cp, mv, mkdir, ls, find, grep, cat, etc. Agent loop refactored: - Removed maxSteps, manual step loop with tool execution - Permission checks on shell commands (external_directory only) - Permission checks on file tools outside workspace boundary - Symlink bypass fix (realpathSync), .. false positive fix - Shell output streaming via Promise.race + setImmediate polling API layer: - PermissionManager wraps PermissionService, broadcasts via WebSocket - WebSocket handles permission-reply messages from frontend - Config loaded from dispatch.yaml, converted to ruleset Frontend: - Permission prompt modal (native dialog, focus trap, ARIA) - Always-allow confirmation flow with pattern preview - Shell output display (live streaming + final parsed result) - Permission log panel (fixed bottom-right overlay) - Exit code badge (green 0, red non-zero) 134 tests, typecheck clean on all 3 packages
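Wildcard matching with last-match-wins can be sketched as below. The `Rule` shape, the action names, and the `ask` fallback are assumptions for illustration, and the reject cascade is not modelled.

```typescript
type Action = "allow" | "ask" | "deny";
type Rule = { pattern: string; action: Action };

// Turn a "*" wildcard pattern into an anchored RegExp, escaping other metacharacters.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Scan from the end so the LAST matching rule decides the outcome.
function evaluate(rules: readonly Rule[], command: string, fallback: Action = "ask"): Action {
  for (let i = rules.length - 1; i >= 0; i--) {
    const rule = rules[i];
    if (rule && wildcardToRegExp(rule.pattern).test(command)) return rule.action;
  }
  return fallback;
}
```

Under last-match-wins a catch-all `*` rule has to sit first in the list; placed last, it would shadow every more specific rule.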
2026-05-19	fix: DeepSeek reasoning_content dropped on multi-step tool calls	Adam Malczewski
	- Base URL corrected: zen/v1 -> zen/go/v1 (opencode-go provider) - Model changed: deepseek-v4-flash-free -> deepseek-v4-flash - Added wrapLanguageModel middleware to inject reasoning_content via providerMetadata.openaiCompatible before each stream call - Fixed test mocks: removed vi.importActual (unsupported in Bun), added tool factory mocks, preserved real tool export in ai mock - Added 11 tests for the normalizeMessages middleware
2026-05-19	Phase 1: single agent + basic UI	Adam Malczewski
	- Bun monorepo with @dispatch/core, @dispatch/api, @dispatch/frontend - Agent runtime with Vercel AI SDK, streaming via WebSocket - Tools: read_file, write_file, list_files (scoped to working directory) - Hono API server with POST /chat, GET /status, GET /health, WS /ws - Svelte 5 + DaisyUI frontend with chat UI, theme switcher, copy button - OpenCode Go (Zen) as LLM provider, deepseek-v4-flash-free model - Docker setup (dev + prod) with bin/ scripts and gopass secrets - Biome v2 linting/formatting, Vitest tests (44 passing) - Debug info attached to error messages for diagnostics