path: root/packages/api/src/app.ts
Age | Commit message | Author
2026-06-03 | fix: warm the SAME Anthropic message-cache bucket as real turns | Adam Malczewski
Root cause of the 'first warmup misses' + 'switch to chat misses' bugs: Anthropic keys the MESSAGE-level prompt cache on `tool_choice` AND the extended-thinking parameters (both rows in their cache-invalidation table mark the messages cache as invalidated on change). The original warmCache() sent toolChoice:'none' and NO thinking providerOptions, while real turns send toolChoice:'auto' + thinking config for the effort. So warming and chat wrote TWO different message-cache buckets:
- warmup #1 missed (no warm-only bucket existed yet), every later warmup hit its own bucket;
- the next real chat message read the OTHER bucket → miss.
Fix: extract a shared buildStreamOptions() that produces the cache-affecting params (toolChoice + thinking providerOptions + maxOutputTokens). Both run() and warmCache() now call it with the SAME resolved reasoning effort, so the warming replay refreshes the exact cache the next real message reads. The trivial probe turn is still appended AFTER the last cache breakpoint, so it never disturbs the cached prefix.
Threaded the per-tab reasoning effort (per-model -> per-tab selector -> default, mirroring processMessage) from the frontend resolver through POST /chat/warm to warmCacheForTab to warmCache.
Tests: updated the warmCache toolChoice test to assert it MATCHES a real turn, added an invariant test driving run() and warmCache() and asserting identical cache-affecting params, and assert effort forwarding in the frontend store.
check / test (780) / frontend build / typecheck all green.
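The invariant this fix relies on can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `StreamOptions` and `LlmRequest` shapes, the `realTurnRequest`, `warmRequest` and `cacheAffectingParams` helpers, the `+ 4096` output headroom, and every thinking budget except `xhigh = 24000` (which appears elsewhere in this log) are assumptions made for the example.

```typescript
// Hypothetical shapes; the real option types are not shown in this log.
type ReasoningEffort = "none" | "low" | "medium" | "high" | "xhigh";
type Thinking = { type: "enabled"; budgetTokens: number } | { type: "disabled" };

interface StreamOptions {
  toolChoice: "auto" | "none";
  maxOutputTokens: number;
  providerOptions: { anthropic: { thinking: Thinking } };
}

interface LlmRequest extends StreamOptions {
  messages: string[];
}

// Assumed budgets for illustration; only xhigh = 24000 comes from the log.
const THINKING_BUDGET: Record<ReasoningEffort, number> = {
  none: 0,
  low: 2000,
  medium: 8000,
  high: 16000,
  xhigh: 24000,
};

// Single source of the cache-affecting parameters, shared by both paths.
function buildStreamOptions(effort: ReasoningEffort): StreamOptions {
  const budget = THINKING_BUDGET[effort];
  const thinking: Thinking =
    budget > 0 ? { type: "enabled", budgetTokens: budget } : { type: "disabled" };
  return {
    toolChoice: "auto",
    maxOutputTokens: budget + 4096, // assumed headroom for the visible answer
    providerOptions: { anthropic: { thinking } },
  };
}

// A real turn appends the user's message to the history.
function realTurnRequest(history: string[], userMessage: string, effort: ReasoningEffort): LlmRequest {
  return { ...buildStreamOptions(effort), messages: [...history, userMessage] };
}

// A warming request replays the same history plus a throwaway probe turn.
function warmRequest(history: string[], effort: ReasoningEffort): LlmRequest {
  return { ...buildStreamOptions(effort), messages: [...history, "reply with just a ."] };
}

// Everything except the messages; both paths must agree on this.
function cacheAffectingParams(req: LlmRequest): string {
  return JSON.stringify({
    toolChoice: req.toolChoice,
    maxOutputTokens: req.maxOutputTokens,
    providerOptions: req.providerOptions,
  });
}
```

Because both request builders go through `buildStreamOptions`, comparing `cacheAffectingParams` of a real turn and a warming request at the same effort is one way to express the invariant test the commit describes.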
2026-06-03 | feat: prompt cache warming for idle tabs | Adam Malczewski
Keep a tab's provider prompt-cache warm while idle by periodically replaying the exact cached conversation prefix plus a single trivial throwaway turn, resetting the provider's ~5-min cache TTL so the user's next real message hits a warm cache.
Backend:
- Agent.warmCache(history): extracts buildLlmContext() shared with run(), then re-sends the identical system+tools+history prefix (same Anthropic cache_control breakpoints) plus a 'reply with just a .' probe turn via toolChoice:none. Returns the request usage; mutates no history, emits/persists nothing.
- AgentManager.warmCacheForTab(): resolves the same agent the next real turn would use, replays the FULL genuine history, refuses while a turn is running.
- POST /chat/warm: returns ONLY the warming request's usage (never persisted, never folded into the real usage aggregate).
Frontend:
- cache-warming.svelte.ts store: per-tab 4-min repeating idle timer with countdown, warming-specific last-request cache %, and error capture. Arms on turn end, pauses during a turn, disables+resets on a real user message.
- cache-warm-storage.ts: per-tab localStorage persistence of the toggle.
- Lifecycle hooks wired into tabs.svelte.ts (status/statuses/sendMessage/hydrate/create/open/close).
- ModelSelector: bottom-of-panel checkbox + debug strip (last-% / countdown / error), shown only when enabled. Warming cache data never touches the real Cache Rate view.
Tests: core warmCache (5), api warm route (3) + warmCacheForTab (3), frontend store (12) + storage (10).
check / test (779) / frontend build / typecheck all green.
2026-06-02 | feat(chat): paste-to-attach images/PDFs with model capability check | Adam Malczewski
Add multimodal image/PDF input to the chat box via clipboard paste, gated by a graceful per-model capability check.
UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the draft, so attachments have ORDER relative to typed text and can be referenced positionally. The token is the only handle: deleting it (atomic Backspace/Delete, or selection overlap) detaches the file; an input-reconciliation safety net detaches any attachment whose token is no longer intact. No preview strip.
Capability check: resolveModelCapabilities reads models.dev modalities.input (new GET /models/capabilities, mirrors /context-limit). The input blocks Send (no tokens spent) only on a definitive 'no'; unknown capability (catalog offline / unmapped provider) stays permissive. Attachments require a fresh turn: Send is blocked while generating and /chat rejects content mid-turn (409).
Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK ImagePart/FilePart content, but never persisted (history keeps the text with [image]/[pdf] markers). Text-only turns serialize byte-identically to before.
Limits (Anthropic-aligned, enforced at paste + re-validated server-side): PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total.
core: UserContentPart types, models/attachments validator, capability resolver, agent.run+toModelMessages thread ordered content.
api: /chat content validation + passthrough.
frontend: attachment-tokens helper, ChatInput paste/token/gating, per-tab staged attachments, App.svelte capability fetch.
+44 tests.
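The limits listed above could be checked with a validator along these lines. This is a sketch under stated assumptions, not the repository's models/attachments validator: the `Attachment` shape, the `validateAttachments` name, the error strings, and the choice of 1 MB = 1024 × 1024 bytes are all hypothetical.

```typescript
// Hypothetical attachment shape for illustration.
interface Attachment {
  mediaType: string;
  sizeBytes: number;
}

const MB = 1024 * 1024; // assumption: the log does not say whether MB is binary or decimal
const MAX_IMAGE_BYTES = 5 * MB;
const MAX_PDF_BYTES = 32 * MB;
const MAX_TOTAL_BYTES = 32 * MB;
const MAX_ATTACHMENTS = 20;

const IMAGE_TYPES: ReadonlySet<string> = new Set([
  "image/png",
  "image/jpeg",
  "image/webp",
  "image/gif",
]);

// Returns null when the attachments pass every limit, otherwise a reason string.
function validateAttachments(items: Attachment[]): string | null {
  if (items.length > MAX_ATTACHMENTS) return "too many attachments";
  let total = 0;
  for (const item of items) {
    const isImage = IMAGE_TYPES.has(item.mediaType);
    const isPdf = item.mediaType === "application/pdf";
    if (!isImage && !isPdf) return `unsupported type: ${item.mediaType}`;
    const limit = isImage ? MAX_IMAGE_BYTES : MAX_PDF_BYTES;
    if (item.sizeBytes > limit) return `file too large: ${item.mediaType}`;
    total += item.sizeBytes;
  }
  if (total > MAX_TOTAL_BYTES) return "attachments exceed total size limit";
  return null;
}
```

Running the same function at paste time and again on the server would match the "enforced at paste + re-validated server-side" note, assuming both sides can share the module.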
2026-06-02 | feat(agents): per-model reasoning effort level | Adam Malczewski
Add a per-model/key reasoning effort setting to agent definitions, surfaced and editable in the Agent Settings page and displayed at a glance in the model selector views.
- core: single source of truth for effort levels (REASONING_EFFORTS, DEFAULT_REASONING_EFFORT='high', labels, isReasoningEffort guard); add 'xhigh' level; AgentModelEntry.effort; xhigh budget=24000 for classic-thinking Claude; default floor 'high'. Persist/parse effort in the agent TOML loader.
- api: thread effort through the fallback chain with per-model -> per-tab -> default precedence; validate /chat + agentModels effort from the canonical list.
- frontend: effort <select> per model row in AgentBuilder; effort badges in ModelSelector (agent + subagent chains); Thinking dropdown sourced from canonical list; per-tab default raised to 'high'.
- tests: +15 (loader round-trip, agent xhigh budget, canonical list + guard, api precedence, route validation).
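The per-model -> per-tab -> default precedence can be sketched as below. The names `REASONING_EFFORTS`, `DEFAULT_REASONING_EFFORT` and `isReasoningEffort` come from the commit message; the exact list members other than 'none', 'high' and 'xhigh' are assumed, and `resolveEffort` is a hypothetical helper written for this example.

```typescript
// Canonical list; 'low' and 'medium' are assumed members for illustration.
const REASONING_EFFORTS = ["none", "low", "medium", "high", "xhigh"] as const;
type ReasoningEffort = (typeof REASONING_EFFORTS)[number];

const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

// Type guard used to validate untrusted input (e.g. a request body field).
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && (REASONING_EFFORTS as readonly string[]).includes(value);
}

// Hypothetical resolver: the per-model setting wins, then the per-tab
// selector, then the default. Invalid values are skipped, not trusted.
function resolveEffort(perModel: unknown, perTab: unknown): ReasoningEffort {
  if (isReasoningEffort(perModel)) return perModel;
  if (isReasoningEffort(perTab)) return perTab;
  return DEFAULT_REASONING_EFFORT;
}
```

Skipping invalid values instead of throwing is a design choice of this sketch; the commit says the API validates effort on /chat, so the real code may reject bad input earlier.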
2026-06-01 | feat(notifications): add notifySubagents toggle to suppress subagent turn pings | Adam Malczewski
A parent agent that spawns 8 subagents was producing 9 "Turn complete" notifications per round, almost always noise. New `notifySubagents` config flag (defaults to false) gates `turn-completed` and `turn-error` from any tab with a `parentTabId`.
The flag is intentionally NOT applied to `permission-required`: a subagent's permission prompt still needs a human tap to proceed, so suppressing it would silently hang the subagent. `agent-spawned` is already top-level-only by construction.
Wiring:
- core/notifications/types.ts: NtfyConfig.notifySubagents: boolean
- core/notifications/config.ts: defaults to false; normalize() tolerates missing / wrong-typed values and falls back to false
- core/notifications/dispatcher.ts: new optional TabParentLookup option (getTabParentId). When notifySubagents=false AND the lookup returns a non-empty parent id string, turn-completed/turn-error are dropped. Lookup failures (no lookup configured, throws, returns undefined) fall back to "treat as top-level" so legitimate top-level events are never silently dropped when the DB is briefly unreadable.
- api/app.ts: wires getTabParentId via core's getTab(id)?.parentTabId
- frontend SettingsPanel.svelte: "Include subagent tabs" checkbox with an explanatory hint that permission prompts still fire
Tests (+9):
- 3 in config.test.ts: default-false, explicit-true, wrong-typed fallback
- 6 in dispatcher.test.ts: suppression of turn-completed/turn-error from subagents, no suppression when flag is true, permission-required not gated, graceful fallback when lookup is missing/throws/returns undefined
Live ntfy.sh round-trip re-verified (status: 200).
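The gating rule described above fits in one small predicate. This is a sketch, not the dispatcher's actual code: `shouldDispatch`, the `NotifyEvent` union and the exact `TabParentLookup` signature are assumptions, while the event names and fallback behaviour follow the commit message.

```typescript
type NotifyEvent = "turn-completed" | "turn-error" | "permission-required" | "agent-spawned";

// Assumed signature for the optional lookup named in the commit message.
type TabParentLookup = (tabId: string) => string | undefined;

// Only these events are suppressed for subagent tabs.
const SUBAGENT_GATED: ReadonlySet<NotifyEvent> = new Set<NotifyEvent>([
  "turn-completed",
  "turn-error",
]);

// Hypothetical predicate: true means "send the notification".
function shouldDispatch(
  event: NotifyEvent,
  tabId: string,
  notifySubagents: boolean,
  getTabParentId?: TabParentLookup,
): boolean {
  if (notifySubagents || !SUBAGENT_GATED.has(event)) return true;
  let parentId: string | undefined;
  try {
    parentId = getTabParentId?.(tabId);
  } catch {
    // Lookup failure: treat the tab as top-level so the event is not lost.
    parentId = undefined;
  }
  const isSubagent = typeof parentId === "string" && parentId.length > 0;
  return !isSubagent;
}
```

The predicate fails open: any doubt about whether a tab is a subagent results in the notification being sent, which matches the stated goal of never silently dropping top-level events.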
2026-06-01 | feat(api): wire notification dispatcher into app + /notifications routes | Adam Malczewski
PermissionManager: add onPromptAdded(listener) callback. Fires exactly once per unique pending prompt id, even when broadcastPending is called repeatedly for unrelated mutations (e.g. another prompt resolving while this one is still pending).
app.ts: instantiate NotificationDispatcher, attach to both AgentManager and PermissionManager. Tab-title lookup via core's getTab so the notifications carry human-readable context instead of raw UUIDs.
routes/notifications.ts:
- GET /notifications: current config (auth token redacted) plus the event-type catalog and defaults
- PUT /notifications: partial update; auth token semantics are undefined=keep, ''=clear, otherwise replace
- POST /notifications/test: sends a test notification with the current config (rejects if disabled or topic invalid)
Tests:
- new permission-manager.test.ts covers the onPromptAdded contract (one-fire-per-prompt, dedup across rebroadcasts, unsubscribe, listener throws don't break siblings)
- existing routes.test.ts gets stubs for the new core notification exports so the @dispatch/core mock stays complete
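The onPromptAdded contract (one fire per prompt id, dedup across rebroadcasts, unsubscribe, isolated listener failures) might look roughly like this. `PromptAnnouncer` is a hypothetical stand-in for the relevant part of PermissionManager, and passing the pending ids directly to `broadcastPending` is an assumption made to keep the example self-contained.

```typescript
type PromptListener = (promptId: string) => void;

// Hypothetical stand-in for the onPromptAdded part of PermissionManager.
class PromptAnnouncer {
  private readonly announced = new Set<string>();
  private readonly listeners = new Set<PromptListener>();

  // Registers a listener and returns an unsubscribe function.
  onPromptAdded(listener: PromptListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // May be called any number of times with the current pending set;
  // each prompt id is announced to listeners only the first time it is seen.
  broadcastPending(pendingIds: string[]): void {
    for (const id of pendingIds) {
      if (this.announced.has(id)) continue;
      this.announced.add(id);
      for (const listener of Array.from(this.listeners)) {
        try {
          listener(id);
        } catch {
          // A throwing listener must not prevent its siblings from running.
        }
      }
    }
  }
}
```

This sketch never forgets an announced id; a long-running process would probably want to prune ids once their prompts resolve, which is left out here.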
2026-06-01 | feat(tabs): tab-to-tab agent communication via short handles | Adam Malczewski
Add send_to_tab / read_tab tools so an agent can message or read another tab by a git-style short handle (shortest unique prefix of the tab UUID, min 4 chars), shown in the tab bar.
- core/db/tabs: resolveTabPrefix + shortestUniquePrefix (open tabs only, LIKE-sanitized prefix matching)
- new tools read-tab.ts / send-to-tab.ts (+ tests) decoupled from the DB TabRow via a minimal ResolvedTabRef projection
- agent-manager: unified deliverMessage routing (busy -> queue, idle -> new turn) shared by POST /chat and send_to_tab; agent->agent auto-wake budget (MAX_AGENT_AUTO_WAKES) to bound ping-pong loops
- summon/loader: send_to_tab + read_tab as grantable tools
- frontend: shortHandleFor + handle badge in TabBar; perm toggles
- notes: tab-comm / user-agents / todo-redesign plans
- chore: biome format fixes (debug-logger, summon.test)
Refs notes/plan-tab-comm.md
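A git-style short handle can be computed and resolved along these lines. The function names match the commit message, but the signatures are assumptions: this in-memory version takes a plain list of open tab ids, whereas the real code queries the database with LIKE-sanitized prefixes. Rejecting prefixes shorter than the minimum is also an assumption.

```typescript
const MIN_HANDLE_LENGTH = 4;

// Shortest prefix of `id` (at least MIN_HANDLE_LENGTH chars) that no other
// open tab id starts with. Falls back to the full id if none is unique.
function shortestUniquePrefix(id: string, openTabIds: string[]): string {
  const others = openTabIds.filter((other) => other !== id);
  for (let len = MIN_HANDLE_LENGTH; len < id.length; len++) {
    const prefix = id.slice(0, len);
    if (!others.some((other) => other.startsWith(prefix))) return prefix;
  }
  return id;
}

// Resolves a handle back to a tab id. Returns null when the prefix is too
// short, matches nothing, or is ambiguous (matches more than one open tab).
function resolveTabPrefix(prefix: string, openTabIds: string[]): string | null {
  if (prefix.length < MIN_HANDLE_LENGTH) return null;
  const matches = openTabIds.filter((id) => id.startsWith(prefix));
  return matches.length === 1 ? matches[0] : null;
}
```

Treating an ambiguous prefix as "not found" mirrors how git refuses ambiguous short hashes; the real tools may instead report the candidates.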
2026-05-29 | feat: stop generation button with abort signal plumbing | Adam Malczewski
- Add POST /chat/stop endpoint on API
- Thread abortSignal from agent-manager through Agent.run() to streamText
- Thread abortSignal option through the Agent.run() signature
- Emit status:idle on stopTab() so frontend WS gets the update
- Add stopGeneration() store method on frontend tabStore
- Add stop button in ChatInput (btn-sm lg:btn-xs for mobile tap target)
- Add tests for /chat/stop endpoint
- Refactor processMessage to pass abortSignal to agent.run
2026-05-24 | fix: prompt caching, OpenCode Go MiniMax/Qwen support, Opus 4.7 thinking, SDK compat | Adam Malczewski
- Implement Anthropic prompt caching: first system message + last 2 non-system messages get cache_control: ephemeral, mirroring OpenCode's applyCaching strategy. Move system prompt inline into messages array so providerOptions can attach.
- Add opencode-anthropic provider variant routing MiniMax/Qwen models through the /messages endpoint with x-api-key auth, distinct from the Claude OAuth flow's Bearer auth and Claude Code mimicry.
- Split isAnthropic into isClaudeOAuth (billing header, mcp_ tool prefix, thinking config) and usesAnthropicSDK (cache markers) so non-OAuth Anthropic-format gateways get the right treatment.
- Pin @ai-sdk/anthropic to ^1.2.12: v3 returns LanguageModelV3-spec models that ai v4's streamText rejects at runtime ('AI SDK 4 only supports models that implement specification version v1'). Drop unnecessary V1 casts.
- Restore Opus 4.7 extended thinking by rewriting the outgoing /messages body in the Claude OAuth fetch interceptor: inject thinking: { type: 'adaptive' } (v1 SDK can't emit it), strip temperature/top_p/top_k (Anthropic rejects them with thinking enabled). Gated on max_tokens > 4096 so effort=none still works.
- Bump MAX_STEPS from 10 to 50 to align with AI SDK's stepCountIs(20) default and reduce mid-task halts.
- Fix pre-existing typecheck errors in agent-manager.ts (entry/nextEntry narrowing), app.ts (agentModels body field), KeyUsage.svelte (m guards), and a TS2742 in provider.ts via explicit ModelFactory return type.
- buildFallbackSequence now always returns at least one entry so processMessage runs the agent loop even without keyId/modelId (fixes 4 broken agent-manager tests).
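The breakpoint placement in the first bullet (first system message plus the last two non-system messages) can be sketched as a pure function over a message list. The `Msg` type here is a local, simplified stand-in and not the AI SDK's message type; the `providerOptions.anthropic.cacheControl` shape is modeled on how the commit describes attaching markers and should be treated as an assumption, as should this `applyCaching` implementation.

```typescript
// Simplified local message type for illustration only.
interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  providerOptions?: { anthropic: { cacheControl: { type: "ephemeral" } } };
}

// Marks the first system message and the last two non-system messages
// with an ephemeral cache-control marker. Returns a new array.
function applyCaching(messages: Msg[]): Msg[] {
  const firstSystem = messages.findIndex((m) => m.role === "system");
  const nonSystemIndexes = messages
    .map((m, i) => (m.role !== "system" ? i : -1))
    .filter((i) => i >= 0);
  const marked = new Set<number>(nonSystemIndexes.slice(-2));
  if (firstSystem >= 0) marked.add(firstSystem);

  return messages.map((m, i): Msg =>
    marked.has(i)
      ? { ...m, providerOptions: { anthropic: { cacheControl: { type: "ephemeral" } } } }
      : m,
  );
}
```

Keeping the system prompt inside the message array, as the commit notes, is what lets a per-message marker be attached to it in this style.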
2026-05-23 | feat: key fallback using agent models[] hierarchy, background tool modes, copy truncation | Adam Malczewski
- Agent rate-limit fallback now iterates through agent's configured models[] in strict order
- Frontend sends agentModels with each /chat request; backend uses buildFallbackSequence()
- Emits notice event on fallback so chat shows which key failed and what's being tried next
- Child agents inherit parent's agentModels for fallback
- Added statusCode propagation from AI SDK errors for programmatic 429 detection
- Copy button truncates all tool results at 300 chars (was 200 for 4 specific tools)
- run_shell, summon, youtube_transcribe: background mode support
- summon: blocking mode by default with getResult callback
2026-05-23 | feat: add is_subagent flag to agents, fix all lint/type/test issues | Adam Malczewski
- Add is_subagent checkbox to agent editor; subagents are hidden from Chat Settings
- Add is_subagent field to AgentDefinition type, TOML serialization, and API route
- Filter subagents from ModelSelector agent list
- Fix all biome lint/format errors across codebase (useLiteralKeys, noNonNullAssertion, noExplicitAny, formatting, import sorting)
- Fix svelte-check errors (type narrowing in SkillsBrowser, ToolPermissions, SidebarPanel)
- Fix a11y warnings in App.svelte (label-control associations)
- Fix test mocks missing BackgroundShellStore, BackgroundTranscriptStore, createWebSearchTool, createYoutubeTranscribeTool
- Update stale 409 test to match current message-queuing behavior
- Exclude packaging/ and release/ dirs from biome to avoid linting stale build artifacts
2026-05-22 | feat: message queue/interrupt system, CORS fix, mobile fixes, chat splitting | Adam Malczewski
- Add message queue allowing users to send messages while agent is running
- Queue messages are injected into tool results as [USER INTERRUPT]
- Retrieve tool interrupted via Promise.race when user message arrives
- Queued messages show with 'queued' badge and cancel button
- Consumed messages repositioned and chat splits at interrupt point
- New assistant message block created after interrupt for clean flow
- Add POST /chat/cancel endpoint for cancelling queued messages
- Fix CORS to allow any origin (Tailscale/LAN access)
- Fix crypto.randomUUID fallback for non-secure contexts (HTTP)
- Fix frontend API URL derivation from page hostname
- Auto-create DB tab if missing on processMessage (foreign key fix)
- Add error logging to processMessage catch block
- Fix working directory input sync on agent switch
- Fix agent mode button to re-apply agent settings
2026-05-22 | feat: agent builder, CWD support, auto-save, UI polish, unavailable tool handling | Adam Malczewski
- Agent Builder: full CRUD with card grid, drag-and-drop model reorder, edit/delete
- Auto-save on edit with 600ms debounce, AbortController for concurrency, fieldset disabled until name entered
- Agent definitions stored as TOML with cwd field, loaded from global/project dirs
- Working directory: per-tab CWD override in Chat Settings, agent default CWD, auto-create on first message
- CWD validation: check-dir endpoint with ~ expansion, real-time validity indicator
- Subagent CWD validated against parent's effective CWD using path.relative
- Unavailable tool calls: caught gracefully, shown as tool call with error badge, model retries
- UI: tab bar border radius, sidebar border removed, chat input ghost style, scroll-to-bottom rectangle
- Skills dir collapse uses CSS rotation, Model Choice renamed to Chat Settings, System Prompt view removed
- Reusable SkillsBrowser/ToolPermissions with external mode for Agent Builder
- ModelSelector: Agent/Manual toggle, agent list, Agent Settings link
- Page router, skills recursive scanning, bin/up gopass removed, docker volume mounts
2026-05-21 | feat: tab system with per-tab agents, DB persistence, and DaisyUI tabs-lift UI | Adam Malczewski
- Add tabs, messages, and settings tables to SQLite database
- Backend: refactor AgentManager to manage per-tab Agent instances via Map<tabId, TabAgent>
- Backend: WebSocket events tagged with tabId for multiplexing
- Backend: tab CRUD routes (create, list, update, archive, messages)
- Backend: persist user and assistant messages to DB during chat
- Frontend: new tabStore replaces single chatStore with multi-tab reactive state
- Frontend: TabBar component using DaisyUI tabs-lift style with status dots
- Frontend: Settings sidebar panel for title generation model selection
- Frontend: wire ChatPanel, ChatInput, Header to use tabStore
- Fix HMR listener accumulation via wsClient.clearCallbacks()
- Delete old single-chat chatStore (chat.svelte.ts)
2026-05-21 | fix: wake scheduler persistence/retry, credential filtering, usage cache and display names | Adam Malczewski
- Wake scheduler: fix Bun timer leak, make recurring daily, persist to disk, retry failed wakes every 5min for 30min, start at boot
- Key usage: localStorage cache survives page refresh, spinner during all refreshes, show cached data immediately
- Credential filtering: key-usage and wake only use configured credentials_file, exclude unconfigured accounts
- Display: remove counter suffix from Claude labels, format opencode/copilot key names
2026-05-20 | feat: claude max oauth support with multi-account switching, reasoning effort, and dynamic model listing | Adam Malczewski
2026-05-20 | feat: phase 3 - config, skills, model groups, task list, and sidebar UI | Adam Malczewski
- Config system: TOML-based dispatch.toml with hot-reload via chokidar
- Model/key resolution: tag-based model selection, key fallback chains
- Skills system: directory loader with TOML frontmatter, agent mappings
- Task list tool: add/update/list/get operations with WebSocket events
- API routes: GET /config, /skills, /skills/:name, /models, /models/resolve
- Frontend: sidebar with model status, task list, config viewer, skills browser, permission log
- Sliding sidebar animation using CSS transitions (not Svelte transitions)
2026-05-19 | feat: Phase 2 - shell permissions, tree-sitter analysis, permission UI | Adam Malczewski
Permission engine:
- Rule-based engine: wildcard matching, last-match-wins, reject cascade
- PermissionService with pending/approved state, PermissionChecker interface
- dispatch.yaml config loader with per-permission pattern rules
Shell tool:
- run_shell tool with child_process spawn, timeout, streaming output
- Tree-sitter static analysis (web-tree-sitter + tree-sitter-bash WASM)
- BashArity command normalization for 'always allow' patterns
- FILE_COMMANDS set: rm, cp, mv, mkdir, ls, find, grep, cat, etc.
Agent loop refactored:
- Removed maxSteps, manual step loop with tool execution
- Permission checks on shell commands (external_directory only)
- Permission checks on file tools outside workspace boundary
- Symlink bypass fix (realpathSync), .. false positive fix
- Shell output streaming via Promise.race + setImmediate polling
API layer:
- PermissionManager wraps PermissionService, broadcasts via WebSocket
- WebSocket handles permission-reply messages from frontend
- Config loaded from dispatch.yaml, converted to ruleset
Frontend:
- Permission prompt modal (native dialog, focus trap, ARIA)
- Always-allow confirmation flow with pattern preview
- Shell output display (live streaming + final parsed result)
- Permission log panel (fixed bottom-right overlay)
- Exit code badge (green 0, red non-zero)
134 tests, typecheck clean on all 3 packages
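Wildcard matching with last-match-wins, as named in the permission engine bullet, can be illustrated with a small evaluator. Everything here is an assumption for the sake of the example: the `Rule` shape, the `allow` / `ask` / `deny` action names, the `ask` fallback, and the choice to treat `*` as "any run of characters". The reject cascade mentioned in the commit is not modeled.

```typescript
type Action = "allow" | "ask" | "deny";

// Hypothetical rule shape for illustration.
interface Rule {
  pattern: string;
  action: Action;
}

// Converts a wildcard pattern to an anchored RegExp: '*' matches any run
// of characters, every other character is matched literally.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Walks the rules in order; the LAST matching rule decides the outcome.
function evaluate(rules: Rule[], command: string, fallback: Action = "ask"): Action {
  let result = fallback;
  for (const rule of rules) {
    if (wildcardToRegExp(rule.pattern).test(command)) result = rule.action;
  }
  return result;
}
```

With last-match-wins, broad defaults go first and specific overrides later, so the ordering of rules in the config is significant.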
2026-05-19 | Phase 1: single agent + basic UI | Adam Malczewski
- Bun monorepo with @dispatch/core, @dispatch/api, @dispatch/frontend
- Agent runtime with Vercel AI SDK, streaming via WebSocket
- Tools: read_file, write_file, list_files (scoped to working directory)
- Hono API server with POST /chat, GET /status, GET /health, WS /ws
- Svelte 5 + DaisyUI frontend with chat UI, theme switcher, copy button
- OpenCode Go (Zen) as LLM provider, deepseek-v4-flash-free model
- Docker setup (dev + prod) with bin/ scripts and gopass secrets
- Biome v2 linting/formatting, Vitest tests (44 passing)
- Debug info attached to error messages for diagnostics