summaryrefslogtreecommitdiffhomepage
path: root/packages/core/src/agent
AgeCommit message (Collapse)Author
2026-06-03fix: warm the SAME Anthropic message-cache bucket as real turnsAdam Malczewski
Root cause of the 'first warmup misses' + 'switch to chat misses' bugs: Anthropic keys the MESSAGE-level prompt cache on `tool_choice` AND the extended-thinking parameters (both rows in their cache-invalidation table mark the messages cache as invalidated on change). The original warmCache() sent toolChoice:'none' and NO thinking providerOptions, while real turns send toolChoice:'auto' + thinking config for the effort. So warming and chat wrote TWO different message-cache buckets: - warmup #1 missed (no warm-only bucket existed yet), every later warmup hit its own bucket; - the next real chat message read the OTHER bucket → miss. Fix: extract a shared buildStreamOptions() that produces the cache-affecting params (toolChoice + thinking providerOptions + maxOutputTokens). Both run() and warmCache() now call it with the SAME resolved reasoning effort, so the warming replay refreshes the exact cache the next real message reads. The trivial probe turn is still appended AFTER the last cache breakpoint, so it never disturbs the cached prefix. Threaded the per-tab reasoning effort (per-model -> per-tab selector -> default, mirroring processMessage) from the frontend resolver through POST /chat/warm to warmCacheForTab to warmCache. Tests: updated the warmCache toolChoice test to assert it MATCHES a real turn, added an invariant test driving run() and warmCache() and asserting identical cache-affecting params, and assert effort forwarding in the frontend store. check / test (780) / frontend build / typecheck all green.
2026-06-03feat: prompt cache warming for idle tabsAdam Malczewski
Keep a tab's provider prompt-cache warm while idle by periodically replaying the exact cached conversation prefix plus a single trivial throwaway turn, resetting the provider's ~5-min cache TTL so the user's next real message hits a warm cache. Backend: - Agent.warmCache(history): extracts buildLlmContext() shared with run(), then re-sends the identical system+tools+history prefix (same Anthropic cache_control breakpoints) plus a 'reply with just a .' probe turn via toolChoice:none. Returns the request usage; mutates no history, emits/persists nothing. - AgentManager.warmCacheForTab(): resolves the same agent the next real turn would use, replays the FULL genuine history, refuses while a turn is running. - POST /chat/warm: returns ONLY the warming request's usage (never persisted, never folded into the real usage aggregate). Frontend: - cache-warming.svelte.ts store: per-tab 4-min repeating idle timer with countdown, warming-specific last-request cache %, and error capture. Arms on turn end, pauses during a turn, disables+resets on a real user message. - cache-warm-storage.ts: per-tab localStorage persistence of the toggle. - Lifecycle hooks wired into tabs.svelte.ts (status/statuses/sendMessage/ hydrate/create/open/close). - ModelSelector: bottom-of-panel checkbox + debug strip (last-% / countdown / error), shown only when enabled. Warming cache data never touches the real Cache Rate view. Tests: core warmCache (5), api warm route (3) + warmCacheForTab (3), frontend store (12) + storage (10). check / test (779) / frontend build / typecheck all green.
2026-06-02feat(chat): paste-to-attach images/PDFs with model capability checkAdam Malczewski
Add multimodal image/PDF input to the chat box via clipboard paste, gated by a graceful per-model capability check. UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the draft, so attachments have ORDER relative to typed text and can be referenced positionally. The token is the only handle — deleting it (atomic Backspace/ Delete, or selection overlap) detaches the file; an input-reconciliation safety net detaches any attachment whose token is no longer intact. No preview strip. Capability check: resolveModelCapabilities reads models.dev modalities.input (new GET /models/capabilities, mirrors /context-limit). The input blocks Send (no tokens spent) only on a definitive 'no'; unknown capability (catalog offline / unmapped provider) stays permissive. Attachments require a fresh turn — Send is blocked while generating and /chat rejects content mid-turn (409). Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK ImagePart/FilePart content, but never persisted (history keeps the text with [image]/[pdf] markers). Text-only turns serialize byte-identically to before. Limits (Anthropic-aligned, enforced at paste + re-validated server-side): PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total. core: UserContentPart types, models/attachments validator, capability resolver, agent.run+toModelMessages thread ordered content. api: /chat content validation + passthrough. frontend: attachment-tokens helper, ChatInput paste/token/gating, per-tab staged attachments, App.svelte capability fetch. +44 tests.
2026-06-02feat(agents): per-model reasoning effort levelAdam Malczewski
Add a per-model/key reasoning effort setting to agent definitions, surfaced and editable in the Agent Settings page and displayed at a glance in the model selector views. - core: single source of truth for effort levels (REASONING_EFFORTS, DEFAULT_REASONING_EFFORT='high', labels, isReasoningEffort guard); add 'xhigh' level; AgentModelEntry.effort; xhigh budget=24000 for classic-thinking Claude; default floor 'high'. Persist/parse effort in the agent TOML loader. - api: thread effort through the fallback chain with per-model -> per-tab -> default precedence; validate /chat + agentModels effort from the canonical list. - frontend: effort <select> per model row in AgentBuilder; effort badges in ModelSelector (agent + subagent chains); Thinking dropdown sourced from canonical list; per-tab default raised to 'high'. - tests: +15 (loader round-trip, agent xhigh budget, canonical list + guard, api precedence, route validation).
2026-06-01fix(queue): consume queued messages after a turn ends (start a new turn)Adam Malczewski
A message queued while the agent was mid-turn was only handled if it arrived DURING a tool batch (injected as a [USER INTERRUPT]). If it landed after the last tool call — or the turn had no tools — the agent silently appended it to history and ended the turn with no response, so it sat there unanswered. This affected both user-queued messages and agent-queued ones (send_to_tab). - agent.ts: stop the end-of-turn drain that swallowed trailing queued messages into history. They now stay on the queue. - agent-manager: after a CLEAN turn settles, continueFromQueue() drains the queue and starts a fresh turn to answer it. Skipped on a user-stopped or errored turn (queue preserved for the next send). - Loop safety: continuation draws from the existing autoWakeBudget, so a runaway agent<->agent chain is bounded; human sends refill it, so human conversations are never throttled. - dequeueMessages now tags message-consumed with reason "interrupt" | "continuation"; the frontend collapses continuation- consumed queued bubbles into the next turn's initiator row (avoids the linger/dup traps documented in queue-interrupt-reconcile-edge-cases.md). - Tests: agent (no-swallow + interrupt regression), agent-manager (continuation, no-op when empty, user-stop preserves queue, bounded loop), frontend (continuation bubble becomes next initiator). - wishlist: remove the now-fixed item.
2026-05-31feat(debug): wire LLM debug logger end-to-endAdam Malczewski
The debug-logger.ts module existed but was completely orphaned — none of its functions had any callsites, so DISPATCH_DEBUG_LLM=1 did nothing. Wires it in across the stack: - llm/debug-logger.ts: add wrapFetchWithLogging() that tees SSE bodies via TransformStream + response.clone() so we capture every chunk without draining the body the AI SDK consumes. Redacts authorization / x-api-key / cookie headers in logs. Also exports nextDebugSeq() so requests and log files share an id. - llm/provider.ts: all 3 factories (Claude OAuth, plain-API-key Anthropic, OpenAI-compatible) now pass fetch: wrapFetchWithLogging(globalThis.fetch). For Claude OAuth the wrap goes on the inner base fetch so logged bodies reflect the post-transform shape + Claude-Code session headers. Added tabId to ProviderConfig for log labelling. - agent/agent.ts: threads tabId through createProvider and emits logAgentLoop / logStepLifecycle / logStreamEvent at every meaningful point in the run loop — step start/end, tool count, every fullStream event. All are no-ops when DISPATCH_DEBUG_LLM is unset. - core/index.ts: re-exports the debug helpers. - tests/llm/provider.test.ts: switch one full-object equality assertion to property assertions so the test survives the new fetch: wrapper. Plumbing the env var into the container required three more fixes: - bin/up: re-export DISPATCH_DEBUG_LLM* so docker compose forwards them (compose only forwards vars referenced in the environment: block). Also pre-creates /tmp/dispatch/llm-debug and chowns it on first run so the container's UID-1000 bun process can write into it without EACCES. - docker-compose.yml: declare the debug vars on api.environment and bind-mount /tmp/dispatch/llm-debug:/tmp/dispatch/llm-debug so logs are inspectable from the host without docker exec. - docker/entrypoint.dev.sh: explicitly forward DISPATCH_DEBUG_* through the 'su -' login-shell barrier — su - resets the environment to TERM/ PATH/HOME/SHELL/USER/LOGNAME only, silently stripping everything else. This is why the vars appeared via 'docker exec env' (which spawns a new process inheriting the container env) but were absent from the actual bun process's /proc/<pid>/environ. bin/build: drop stray sudo for consistency with bin/up and bin/down.
2026-05-30fix(agent): stream thinking for all adaptive Claude models, not just Opus 4.7Adam Malczewski
Extended thinking was gated on a hardcoded `model === "claude-opus-4-7"` check, so newer/other adaptive models (Opus 4.8, Opus/Sonnet 4.6) fell into the classic `thinking: { type: "enabled" }` branch. Adaptive models default thinking display to "omitted", so no thinking was streamed — the UI showed nothing for Claude while DeepSeek (a separate openai-compatible path) worked. Replace the string check with a pure helper `anthropicThinkingProviderOptions` that mirrors opencode's transform.ts detection: - adaptive (`type: "adaptive"`) for Opus 4.7+ (version-parsed) and Opus/Sonnet 4.6 (id substring; handles dash and dot forms); - `display: "summarized"` ONLY for Opus 4.7+ (they default to omitted and must be forced); Opus/Sonnet 4.6 stream thinking without it; - all other Claude models keep classic `enabled` + budgetTokens. Pure function (no provider/streamText/network), unit-tested directly: Opus 4.8 (the reported bug), Opus 4.7, Sonnet/Opus 4.6, Opus 4.5 + dated Sonnet (enabled), a future Opus 4.9 (proves version-parse), and effort->budget mapping.
2026-05-30refactor(chunks): append-only chunk log with per-step cache-stable wireAdam Malczewski
Replace the message-as-container model with a flat, append-only chunk log. - chunks table (id, tab_id, seq, turn_id, step, role, type, data_json): one row per chunk; tool_call (assistant) and tool_result (tool) are SEPARATE rows linked by callId. Message/turn are derived groupings, not stored. - chunks/transform.ts: DB-free explode (Chunk[] -> rows) / group (rows -> messages), shared by backend and the browser frontend. - Cache fix: toModelMessages segments each turn at tool-batch boundaries into stable [assistant, tool] pairs per step, so earlier steps serialize byte-identically across requests (kills the prompt-cache churn). - agent-manager persists a turn's chunks on seal (once), discarding a failed fallback attempt's partial chunks; rebuilds agent history from the log. - GET /messages windows the log by chunk seq then groups; loadMoreMessages merges a turn split across the window boundary by turnId. - One-shot migration drops the legacy messages table and clears tabs; settings/credentials/keys/usage preserved. Full suite green (317 tests); biome, tsc, and svelte-check clean.
2026-05-30feat(cache): Anthropic prompt caching, usage telemetry, and Cache Rate viewAdam Malczewski
- send prompt-caching + oauth anthropic-beta headers on the Claude OAuth provider - restructure the OAuth request body (billing header, identity split, relocate third-party system prompt to the first user message) to match Claude Code - apply rolling cache_control breakpoints and group a turn's tool results into a single role:tool message for correct breakpoint placement - emit per-step usage events (cache read/write split) and add the Cache Rate sidebar panel - dedup byte-identical tool calls within a single batch
2026-05-29feat: stop generation button with abort signal plumbingAdam Malczewski
- Add POST /chat/stop endpoint on API - Thread abortSignal from agent-manager through Agent.run() to streamText - Thread abortSignal option through the Agent.run() signature - Emit status:idle on stopTab() so frontend WS gets the update - Add stopGeneration() store method on frontend tabStore - Add stop button in ChatInput (btn-sm lg:btn-xs for mobile tap target) - Add tests for /chat/stop endpoint - Refactor processMessage to pass abortSignal to agent.run
2026-05-29fix: handle unavailable tool calls via native v6 tool-error event, not ↵Adam Malczewski
synthetic invalid tool - Removed __invalid__ tool definition, experimental_repairToolCall, and v4-era NoSuchToolError catch block — AI SDK v6 already emits a native tool-error stream event with the original tool name - Added synthesizeResidualToolResults() helper to fill orphaned tool-call IDs with isError: true results for abort/error terminal paths - tool-error handler now break's instead of return's — lets sibling tools execute normally via the manual executor loop - Added final safety net after execution loop to catch any genuinely orphaned tool-call IDs before round-tripping to the LLM - Propagated isError through toModelMessages so error results are properly flagged in conversation history - Updated tests: tool-error event now continues to idle (not error), added sibling-orphan prevention test
2026-05-28fix(core): normalize tool schemas for Anthropic, add toolChoice=auto; ↵Adam Malczewski
feat(summon): agent definition support; docs: cc/ research findings - registry.ts: add normalizeForAnthropic() to strip , additionalProperties, default, nullable from zodToJsonSchema output so Anthropic doesn't silently reject tool definitions - agent.ts: add toolChoice=auto for Claude OAuth to prevent Opus thinking forever without calling tools - summon.ts: add agentSlug parameter, build agents catalog in description, add toAvailableAgents helper - agent-manager.ts: wire agent definition loading into spawnChildAgent, agent model fallback - loader.ts: export loadAgent, expandAgentToolNames, getAgentDirPaths; add getAgentDirPaths for permission gate - agent.ts: auto-allow read-only tools in agent definition directories - packaging/PKGBUILD: exclude ARM64 prebuilds from x86_64 package - cc/: research findings on Claude Opus tool calling issues - tests: loader tests, summon tool tests
2026-05-28fix(core): strip stale [USER INTERRUPT] from LLM history; inject into last ↵Adam Malczewski
tool of batch The interrupt block embedded in a tool-result was persistent in the assistant message history, so the imperative 'You MUST address these before continuing' got re-evaluated as fresh on every subsequent LLM step. Result: the model repeatedly thought about and re-acknowledged the same interrupt 5-10+ times per chat (verified in production DB traces — e.g. tab 4c5727aa had 11 thinking chunks quoting a single interrupt verbatim). agent.ts (toModelMessages): strip [USER INTERRUPT] from every tool- result except the one in the freshest tool-batch (last chunk of the last assistant message, which itself must be the last message). The strip is a serialization-time transform only — this.messages, the DB row, and the UI display all keep the full text. The LLM sees the imperative exactly once: the step immediately after injection. agent.ts (tool execution loop): batch queued messages across the group's tool calls and inject them only into the LAST executable tool's result. Previously the first tool to dequeue won; now the interrupt lands in a single deterministic spot regardless of timing. Tool-level handlers (run-shell/youtube/retrieve) are untouched — they still embed their own interrupt text when they background work. Also fix pre-existing tabs.test.ts: it referenced a getDescendantIds function that didn't exist (added: BFS, leaf-first, cycle-safe, skips archived) and imported bun:sqlite directly which vite couldn't resolve (rewrote with a minimal FakeDatabase + vi.mock pattern matching the rest of the suite).
2026-05-28refactor(core): upgrade ai-sdk v4 → v6 + Anthropic/openai-compatible ↵Adam Malczewski
reasoning round-trip + max-thinking budget audit Migrates the LLM stack from [email protected] + @ai-sdk/[email protected] + @ai-sdk/[email protected] to [email protected] + @ai-sdk/[email protected] + @ai-sdk/[email protected]. Full design in plan-v6-upgrade.md; two rounds of Gemini code review captured in report.md. Motivation: the recurring 'reasoning-signature without reasoning' error on Claude Opus 4.7 was a v4 SDK artefact — @ai-sdk/[email protected] emitted Anthropic signature_delta as a separate stream chunk that orphaned when the model produced a signed-but-empty thinking block, and our chunk store had no signature field so the round-trip back to Anthropic was rejected on the next turn. In v6, signatures arrive inside providerMetadata on the reasoning-end event, and the orphan-signature class of bug is gone at the SDK level. Core changes: • ThinkingChunk gains optional metadata?: Record<string, unknown> (the v6 providerMetadata blob). A non-undefined metadata 'seals' the chunk: subsequent reasoning-delta opens a new chunk rather than extending the sealed one. • AgentEvent gains { type: 'reasoning-end'; metadata? } (replaces the v4 reasoning-signature variant). • toModelMessages (replaces toCoreMessages): - returns ModelMessage[] (was CoreMessage[]) - thinking → { type: 'reasoning', text, providerOptions: metadata } - tool-batch entries → { type: 'tool-call', input } (was 'args') - tool results → { output: { type: 'text', value } } ToolResultOutput • Claude OAuth uses createAnthropic({ authToken }) natively — no more custom-fetch x-api-key → Bearer swap. • rewriteBodyForOpus47 deleted — Opus 4.7 adaptive thinking is native via providerOptions.anthropic.thinking = { type: 'adaptive' }. • V1 middleware → V3 (specificationVersion: 'v3'). • v4-era normalizeMessages openai-compatible middleware deleted; the v6 openai-compatible provider extracts reasoning_content natively from { type: 'reasoning' } content parts. • applyAnthropicStructuralNormalisations (mirrors opencode provider/transform.ts:53-148): drops empty text/reasoning parts, scrubs non-[a-zA-Z0-9_-] toolCallIds, splits [tool-call, non-tool] assistant turns (Anthropic rejects tool_use followed by text). • applyOpenAICompatibleReasoningNormalisation (mirrors opencode transform.ts:217-249): lifts reasoning text into providerOptions.openaiCompatible.reasoning_content (always, even empty). Solves DeepSeek 'The reasoning_content in the thinking mode must be passed back' — the v6 SDK skips emitting reasoning_content when text is empty (dist/index.mjs:245), but DeepSeek requires the field present once thinking was used. • Tools: tool({ inputSchema: jsonSchema(zodToJsonSchema(...)) }) (was parameters: ZodSchema). AI SDK tools have no execute callback — the agent runs tools manually for permission prompts and shell-output streaming. New dep: zod-to-json-schema@^3.25.2. • fullStream event loop rewritten for v6 event shape: text-delta (text not textDelta), reasoning-start/delta/end, tool-input-*, tool-call (input not args), tool-result, tool-error (new), abort (new), start-step/finish-step, finish. Max-thinking audit (matches opencode transform.ts:642-671 budgets): • Claude enabled-thinking max budget 16000 → 31999 (Anthropic ceiling) • Claude enabled-thinking high budget 10000 → 16000 • maxOutputTokens 'budget + 8000' → fixed 32000 (matches opencode's OUTPUT_TOKEN_MAX; model self-allocates thinking vs response within) • Opus 4.7 adaptive thinking gains display: 'summarized' and sibling effort field (without these, thinking content is hidden by Anthropic and the model barely thinks). Frontend mirrors: • types.ts — ThinkingChunk.metadata?, AgentEvent reasoning-end • tabs.svelte.ts — routes reasoning-end through applyChunkEvent • ChatMessage.svelte — hides empty thinking chunks; hides the entire assistant bubble when no chunk has renderable content Gemini-review-driven fixes: • tool-error and abort stream events now surface as error chunks (were silently ignored) • toolCallId scrubbing pass (opencode transform.ts:96-122 parity) • Empty-reasoning-cull explicit test coverage for both Anthropic structural normalisation and DeepSeek path Test counts (223 tests across 3 packages, all green): • tests/chunks/append.test.ts: 44 (was 38) — reasoning-end sealing, orphan walk-back, multi-block interleaving • tests/agent/agent.test.ts: 24 (was 5) — exhaustive v6 event mappings, structural normalisations, signature/reasoning_content round-trip, tool-error/abort branches, DeepSeek scenario, empty reasoning edge case • tests/llm/provider.test.ts: 9 (was 22) — dropped 13 obsolete v4 middleware tests; new minimal tests confirm no middleware wrapping on default openai-compat path and that createAnthropic gets authToken vs apiKey correctly for OAuth vs api-key flows • tests/tools/registry.test.ts: 10 (was 4) — v6 tool() contract (inputSchema, no execute, JSON Schema for nested zod) • packages/api/tests/agent-manager.test.ts: 12 (was 7) — mock Agent emits v6 reasoning events; reasoning-end broadcast + ordering • packages/frontend/tests/chat-store.test.ts: 35 (was 32) — reasoning-end flow through Svelte $state store typecheck clean (tsc --noEmit on core + api, svelte-check on frontend), biome clean across 124 files.
2026-05-27refactor: ChatMessage.chunks[] union — interleaved thinking, tool ↵Adam Malczewski
batching, error/system chunks
2026-05-27feat: tool-output truncation+spill, read_file pagination, read_file_slice, ↵Adam Malczewski
symlink-safe path resolution
2026-05-24fix: prompt caching, OpenCode Go MiniMax/Qwen support, Opus 4.7 thinking, ↵Adam Malczewski
SDK compat - Implement Anthropic prompt caching: first system message + last 2 non-system messages get cache_control: ephemeral, mirroring OpenCode's applyCaching strategy. Move system prompt inline into messages array so providerOptions can attach. - Add opencode-anthropic provider variant routing MiniMax/Qwen models through the /messages endpoint with x-api-key auth, distinct from the Claude OAuth flow's Bearer auth and Claude Code mimicry. - Split isAnthropic into isClaudeOAuth (billing header, mcp_ tool prefix, thinking config) and usesAnthropicSDK (cache markers) so non-OAuth Anthropic-format gateways get the right treatment. - Pin @ai-sdk/anthropic to ^1.2.12: v3 returns LanguageModelV3-spec models that ai v4's streamText rejects at runtime ('AI SDK 4 only supports models that implement specification version v1'). Drop unnecessary V1 casts. - Restore Opus 4.7 extended thinking by rewriting the outgoing /messages body in the Claude OAuth fetch interceptor: inject thinking: { type: 'adaptive' } (v1 SDK can't emit it), strip temperature/top_p/top_k (Anthropic rejects them with thinking enabled). Gated on max_tokens > 4096 so effort=none still works. - Bump MAX_STEPS from 10 to 50 to align with AI SDK's stepCountIs(20) default and reduce mid-task halts. - Fix pre-existing typecheck errors in agent-manager.ts (entry/nextEntry narrowing), app.ts (agentModels body field), KeyUsage.svelte (m guards), and a TS2742 in provider.ts via explicit ModelFactory return type. - buildFallbackSequence now always returns at least one entry so processMessage runs the agent loop even without keyId/modelId (fixes 4 broken agent-manager tests).
2026-05-23feat: google gemini provider, adaptive thinking for opus 4.7, model search ↵Adam Malczewski
filter - Added Google (Gemini) as a provider: add-key UI, env var resolution via resolveApiKey, usage tracking via native models endpoint + gemini.google.com cookie scraping - @ai-sdk/anthropic upgraded to v3 (adaptive thinking support) with LanguageModelV1 cast for ai v4 compat - Claude Opus 4.7 uses adaptive thinking (type: adaptive); all other models keep explicit budget tokens - Model selector modal: search filter with space matching dash/underscore - Copy button: all tool results truncated at 300 chars - Sidebar layout fix: Claude Reset panel removed from flex-1 fill to prevent overlap
2026-05-23feat: key fallback using agent models[] hierarchy, background tool modes, ↵Adam Malczewski
copy truncation - Agent rate-limit fallback now iterates through agent's configured models[] in strict order - Frontend sends agentModels with each /chat request; backend uses buildFallbackSequence() - Emits notice event on fallback so chat shows which key failed and what's being tried next - Child agents inherit parent's agentModels for fallback - Added statusCode propagation from AI SDK errors for programmatic 429 detection - Copy button truncates all tool results at 300 chars (was 200 for 4 specific tools) - run_shell, summon, youtube_transcribe: background mode support - summon: blocking mode by default with getResult callback
2026-05-23feat: add is_subagent flag to agents, fix all lint/type/test issuesAdam Malczewski
- Add is_subagent checkbox to agent editor; subagents are hidden from Chat Settings - Add is_subagent field to AgentDefinition type, TOML serialization, and API route - Filter subagents from ModelSelector agent list - Fix all biome lint/format errors across codebase (useLiteralKeys, noNonNullAssertion, noExplicitAny, formatting, import sorting) - Fix svelte-check errors (type narrowing in SkillsBrowser, ToolPermissions, SidebarPanel) - Fix a11y warnings in App.svelte (label-control associations) - Fix test mocks missing BackgroundShellStore, BackgroundTranscriptStore, createWebSearchTool, createYoutubeTranscribeTool - Update stale 409 test to match current message-queuing behavior - Exclude packaging/ and release/ dirs from biome to avoid linting stale build artifacts
2026-05-23feat: web_search + youtube_transcribe tools, shell interrupt backgrounding, ↵Adam Malczewski
fixes - Add web_search tool (Firecrawl POST to /v1/search with query, limit, lang, country, scrapeOptions) - Add youtube_transcribe tool (GET to transcriber service, handles completed/queued/failed statuses) - Both tools registered for parent agents (always) and child agents (permission-gated) - Added to summon enum, TOOL_DESCRIPTIONS, and core exports - Shell interrupt: run_shell now races against user queue interrupt - When interrupted, command continues in background with run_shell_<uuid> job ID - BackgroundShellStore holds running processes, auto-cleans 10min after completion - retrieve tool extended to handle both agent IDs and shell job IDs - Tool error detection: results starting with 'Error:' now marked isError in UI - Fix TS error: cast unavailMatch[1] regex capture group to string - Docker: network_mode host for Tailscale/LAN access to external services - Bun.serve idleTimeout set to 60s (was default 10s) - KeyUsage: clearer message when OpenCode usage data unavailable - Firecrawl: only send scrapeOptions when scrape=true (avoid 400 on instances without scrape support)
2026-05-22feat: message queue/interrupt system, CORS fix, mobile fixes, chat splittingAdam Malczewski
- Add message queue allowing users to send messages while agent is running - Queue messages are injected into tool results as [USER INTERRUPT] - Retrieve tool interrupted via Promise.race when user message arrives - Queued messages show with 'queued' badge and cancel button - Consumed messages repositioned and chat splits at interrupt point - New assistant message block created after interrupt for clean flow - Add POST /chat/cancel endpoint for cancelling queued messages - Fix CORS to allow any origin (Tailscale/LAN access) - Fix crypto.randomUUID fallback for non-secure contexts (HTTP) - Fix frontend API URL derivation from page hostname - Auto-create DB tab if missing on processMessage (foreign key fix) - Add error logging to processMessage catch block - Fix working directory input sync on agent switch - Fix agent mode button to re-apply agent settings
2026-05-22feat: agent builder, CWD support, auto-save, UI polish, unavailable tool ↵Adam Malczewski
handling - Agent Builder: full CRUD with card grid, drag-and-drop model reorder, edit/delete - Auto-save on edit with 600ms debounce, AbortController for concurrency, fieldset disabled until name entered - Agent definitions stored as TOML with cwd field, loaded from global/project dirs - Working directory: per-tab CWD override in Chat Settings, agent default CWD, auto-create on first message - CWD validation: check-dir endpoint with ~ expansion, real-time validity indicator - Subagent CWD validated against parent's effective CWD using path.relative - Unavailable tool calls: caught gracefully, shown as tool call with error badge, model retries - UI: tab bar border radius, sidebar border removed, chat input ghost style, scroll-to-bottom rectangle - Skills dir collapse uses CSS rotation, Model Choice renamed to Chat Settings, System Prompt view removed - Reusable SkillsBrowser/ToolPermissions with external mode for Agent Builder - ModelSelector: Agent/Manual toggle, agent list, Agent Settings link - Page router, skills recursive scanning, bin/up gopass removed, docker volume mounts
2026-05-20feat: claude max oauth support with multi-account switching, reasoning ↵Adam Malczewski
effort, and dynamic model listing
2026-05-19feat: Phase 2 — shell permissions, tree-sitter analysis, permission UIAdam Malczewski
Permission engine: - Rule-based engine: wildcard matching, last-match-wins, reject cascade - PermissionService with pending/approved state, PermissionChecker interface - dispatch.yaml config loader with per-permission pattern rules Shell tool: - run_shell tool with child_process spawn, timeout, streaming output - Tree-sitter static analysis (web-tree-sitter + tree-sitter-bash WASM) - BashArity command normalization for 'always allow' patterns - FILE_COMMANDS set: rm, cp, mv, mkdir, ls, find, grep, cat, etc. Agent loop refactored: - Removed maxSteps, manual step loop with tool execution - Permission checks on shell commands (external_directory only) - Permission checks on file tools outside workspace boundary - Symlink bypass fix (realpathSync), .. false positive fix - Shell output streaming via Promise.race + setImmediate polling API layer: - PermissionManager wraps PermissionService, broadcasts via WebSocket - WebSocket handles permission-reply messages from frontend - Config loaded from dispatch.yaml, converted to ruleset Frontend: - Permission prompt modal (native dialog, focus trap, ARIA) - Always-allow confirmation flow with pattern preview - Shell output display (live streaming + final parsed result) - Permission log panel (fixed bottom-right overlay) - Exit code badge (green 0, red non-zero) 134 tests, typecheck clean on all 3 packages
2026-05-19feat: inline tool display and thinking/reasoning supportAdam Malczewski
- Tool calls now appear at their stream position within messages (ContentSegment model) - Added reasoning/thinking display: collapsible <details> block above content - Set DeepSeek V4 Flash reasoningEffort to max via providerOptions - ChatMessage.content changed from string to ContentSegment[] (text | tool-call) - Agent handles AI SDK reasoning events, yields reasoning-delta - Fixed duplicate key in ChatMessage.svelte each block
2026-05-19fix: review findings — multi-turn history, race condition, collapse UIAdam Malczewski
- Core: toCoreMessages now includes tool calls and tool results in history - Core: isError read from step tool results instead of hardcoded false - API: status set synchronously before async generator to prevent race - Frontend: DaisyUI collapse-open class applied dynamically on expanded state - Frontend: removed duplicate isConnected update in wsClient.onEvent
2026-05-19Phase 1: single agent + basic UIAdam Malczewski
- Bun monorepo with @dispatch/core, @dispatch/api, @dispatch/frontend - Agent runtime with Vercel AI SDK, streaming via WebSocket - Tools: read_file, write_file, list_files (scoped to working directory) - Hono API server with POST /chat, GET /status, GET /health, WS /ws - Svelte 5 + DaisyUI frontend with chat UI, theme switcher, copy button - OpenCode Go (Zen) as LLM provider, deepseek-v4-flash-free model - Docker setup (dev + prod) with bin/ scripts and gopass secrets - Biome v2 linting/formatting, Vitest tests (44 passing) - Debug info attached to error messages for diagnostics