| Age | Commit message | Author |
|
- Add watchDirConfig() for per-directory config watching
- Register watchers for subdirectories with their own dispatch.toml
- Fix permission ordering: move "*" wildcard to front so findLast
reaches specific rules first (was silently breaking all specific
bash permission rules)
- Add comprehensive tests for watcher functionality
- Update mocks in test files
|
|
Gemini review caught a precedence-inversion bug in mergePermissions: when a
nested permission group exists in BOTH global and local configs, the previous
`{ ...existing, ...value }` spread updated an overridden pattern IN PLACE,
leaving its original (global) insertion slot. Since configToRuleset flattens
patterns in iteration order and evaluate() uses findLast (last match wins), a
more-general global pattern declared lower (e.g. "*") would sit AFTER the
local override and silently shadow it.
Example: global bash { "npm test"=allow, "*"=ask } + local bash
{ "npm test"=deny } resolved "npm test" to "ask" instead of "deny".
Fix: drop global patterns the local block also defines, keep remaining global
patterns in order, then append ALL local patterns last — reproducing a clean
"global rules then local rules" concatenation so local always wins. Adds a
regression test asserting order and evaluation outcome.
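
A minimal sketch of the ordering fix described above, assuming a permission group is a plain pattern-to-action object and that evaluation is "last match wins". The names mergeGroup, matches, and evaluate are illustrative and not taken from the repository, and the matcher is deliberately simplified to exact match or "*".

```typescript
type Action = "allow" | "ask" | "deny";
type PermissionGroup = Record<string, Action>;

// Hypothetical helper: global patterns first (minus any the local block
// redefines), then ALL local patterns, so local entries are always last.
function mergeGroup(globalGroup: PermissionGroup, localGroup: PermissionGroup): PermissionGroup {
  const merged: PermissionGroup = {};
  for (const [pattern, action] of Object.entries(globalGroup)) {
    const overridden = Object.prototype.hasOwnProperty.call(localGroup, pattern);
    if (!overridden) merged[pattern] = action;
  }
  for (const [pattern, action] of Object.entries(localGroup)) {
    merged[pattern] = action;
  }
  return merged;
}

// Simplified matcher (assumption): exact match or the "*" wildcard only.
function matches(pattern: string, command: string): boolean {
  return pattern === "*" || pattern === command;
}

// Last match wins: walk the flattened patterns from the end.
function evaluate(group: PermissionGroup, command: string): Action | undefined {
  for (const [pattern, action] of Object.entries(group).reverse()) {
    if (matches(pattern, command)) return action;
  }
  return undefined;
}
```

With global bash { "npm test"=allow, "*"=ask } and local bash { "npm test"=deny }, a plain spread leaves "npm test" in its original slot ahead of "*", whereas mergeGroup moves it behind "*" so the local deny is the last match.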
|
|
Load an optional global config at ~/.config/dispatch/dispatch.toml
(override via DISPATCH_GLOBAL_CONFIG) and deep-merge it underneath every
project/working-directory dispatch.toml, so machine-wide settings — most
notably globally available LSP servers — work in any repo without per-repo
config. Local always wins on conflicts.
- loader: add getGlobalConfigPath(), loadGlobalConfig(), mergeConfigs();
loadConfig(dir) now loads+merges global. [lsp] and [[keys]] merge by id;
[permissions] merge per-group with global patterns emitted first so local
rules win at evaluation time (findLast). A malformed global config is
downgraded to empty rather than breaking every repo.
- watcher: watch BOTH global and local dispatch.toml so hot-reload re-merges
on either change (dedupes when paths coincide).
- export new loader fns from config/index and core index.
- types/agent-manager: doc updates reflecting merged LSP resolution.
- dispatch.toml: document global-default merge behavior; activate biome and
typescript-language-server LSP entries.
- tests: merge precedence, lsp/keys merge-by-id, permissions merge,
filesystem integration, malformed-global resilience; isolate global path
in existing loader tests.
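
A rough sketch of the merge-by-id idea, assuming entries are held as arrays of objects with an id field and that a local entry replaces the whole global entry with the same id. mergeById is a hypothetical name; the real mergeConfigs() may deep-merge fields or use a keyed table instead.

```typescript
interface Identified {
  id: string;
}

// Hypothetical helper: keep global order, let a local entry with the same id
// replace the global one, and append local-only entries at the end.
function mergeById<T extends Identified>(globalItems: T[], localItems: T[]): T[] {
  const localById = new Map<string, T>();
  for (const item of localItems) localById.set(item.id, item);

  const seen = new Set<string>();
  const merged: T[] = [];
  for (const item of globalItems) {
    merged.push(localById.get(item.id) ?? item);
    seen.add(item.id);
  }
  for (const item of localItems) {
    if (!seen.has(item.id)) merged.push(item);
  }
  return merged;
}
```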
|
|
# Conflicts:
# packages/api/src/agent-manager.ts
# packages/api/tests/agent-manager.test.ts
# packages/frontend/src/lib/tabs.svelte.ts
|
|
Root cause of the 'first warmup misses' + 'switch to chat misses' bugs:
Anthropic keys the MESSAGE-level prompt cache on `tool_choice` AND the
extended-thinking parameters (both rows in their cache-invalidation table mark
the messages cache as invalidated on change). The original warmCache() sent
toolChoice:'none' and NO thinking providerOptions, while real turns send
toolChoice:'auto' + thinking config for the effort. So warming and chat wrote
TWO different message-cache buckets:
- warmup #1 missed (no warm-only bucket existed yet), every later warmup hit
its own bucket;
- the next real chat message read the OTHER bucket → miss.
Fix: extract a shared buildStreamOptions() that produces the cache-affecting
params (toolChoice + thinking providerOptions + maxOutputTokens). Both run()
and warmCache() now call it with the SAME resolved reasoning effort, so the
warming replay refreshes the exact cache the next real message reads. The
trivial probe turn is still appended AFTER the last cache breakpoint, so it
never disturbs the cached prefix.
Threaded the per-tab reasoning effort (per-model -> per-tab selector -> default,
mirroring processMessage) from the frontend resolver through POST /chat/warm to
warmCacheForTab to warmCache.
Tests: updated the warmCache toolChoice test to assert it MATCHES a real turn,
added an invariant test driving run() and warmCache() and asserting identical
cache-affecting params, and assert effort forwarding in the frontend store.
check / test (780) / frontend build / typecheck all green.
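
The shared-builder invariant can be sketched as below. This is an illustration under assumptions, not the actual buildStreamOptions(): the effort list, the budget table (only the xhigh value of 24000 appears elsewhere in this log), the output-token headroom, and the request shapes are all hypothetical.

```typescript
type Effort = "low" | "medium" | "high" | "xhigh";

interface StreamOptions {
  toolChoice: "auto" | "none";
  maxOutputTokens: number;
  providerOptions: { anthropic: { thinking: { type: "enabled"; budgetTokens: number } } };
}

// Hypothetical budgets; only xhigh = 24000 is taken from the log.
const THINKING_BUDGET: Record<Effort, number> = { low: 2000, medium: 8000, high: 16000, xhigh: 24000 };

// One place that produces every cache-affecting parameter.
function buildStreamOptions(effort: Effort): StreamOptions {
  const budgetTokens = THINKING_BUDGET[effort];
  return {
    toolChoice: "auto",
    maxOutputTokens: budgetTokens + 8192, // assumed headroom above the thinking budget
    providerOptions: { anthropic: { thinking: { type: "enabled", budgetTokens } } },
  };
}

// A real turn and a warming replay both go through the same builder.
function runRequest(effort: Effort, messages: string[]) {
  return { ...buildStreamOptions(effort), messages };
}

function warmRequest(effort: Effort, history: string[]) {
  // The probe turn is appended after the cached prefix.
  return { ...buildStreamOptions(effort), messages: [...history, "reply with just a ."] };
}

// Helper for an invariant test: compare only the cache-affecting fields.
function cacheKey(req: StreamOptions): string {
  return JSON.stringify([req.toolChoice, req.maxOutputTokens, req.providerOptions]);
}
```

Because neither caller sets toolChoice or thinking options on its own, the two requests cannot drift apart for a given effort.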
|
|
Keep a tab's provider prompt-cache warm while idle by periodically replaying
the exact cached conversation prefix plus a single trivial throwaway turn,
resetting the provider's ~5-min cache TTL so the user's next real message hits
a warm cache.
Backend:
- Agent.warmCache(history): extracts buildLlmContext() shared with run(), then
re-sends the identical system+tools+history prefix (same Anthropic
cache_control breakpoints) plus a 'reply with just a .' probe turn via
toolChoice:none. Returns the request usage; mutates no history, emits/persists
nothing.
- AgentManager.warmCacheForTab(): resolves the same agent the next real turn
would use, replays the FULL genuine history, refuses while a turn is running.
- POST /chat/warm: returns ONLY the warming request's usage (never persisted,
never folded into the real usage aggregate).
Frontend:
- cache-warming.svelte.ts store: per-tab 4-min repeating idle timer with
countdown, warming-specific last-request cache %, and error capture. Arms on
turn end, pauses during a turn, disables+resets on a real user message.
- cache-warm-storage.ts: per-tab localStorage persistence of the toggle.
- Lifecycle hooks wired into tabs.svelte.ts (status/statuses/sendMessage/
hydrate/create/open/close).
- ModelSelector: bottom-of-panel checkbox + debug strip (last-% / countdown /
error), shown only when enabled. Warming cache data never touches the real
Cache Rate view.
Tests: core warmCache (5), api warm route (3) + warmCacheForTab (3), frontend
store (12) + storage (10). check / test (779) / frontend build / typecheck all
green.
|
|
# Conflicts:
# packages/frontend/src/lib/components/ChatInput.svelte
|
|
Summarize a conversation's older "head" into a structured anchored Markdown
summary while preserving the most recent turns verbatim, shrinking context size
while keeping the information needed to continue coherently. Triggered by a
"Compact conversation" button in Chat Settings (not an agent tool).
Approach informed by OpenCode's session/compaction.ts:
- Ported SUMMARY_TEMPLATE (Goal / Constraints / Progress / Key Decisions /
Next Steps / Critical Context / Relevant Files) and the anchored-summary
buildPrompt (re-summarizes a prior summary when present).
- Ported the TOOL_OUTPUT_MAX_CHARS (2000) cap on tool results in the summary
request.
- Simplified tail selection to a fixed recent-turn count (DEFAULT_TAIL_TURNS=2)
instead of OpenCode's token-budget splitTurn.
core:
- New src/compaction/ module (pure, DB-free): template, prompt builder,
head/tail selection, transcript renderer with tool-output capping, prior
summary extraction. Generic over ChatMessage so callers keep turnId/seq.
- db/chunks.ts: rekeyChunks(from,to) relocates a tab's full history to a
backup tab (reversible — nothing is deleted).
- AgentEvent: compaction-started / -complete / -error variants.
api:
- AgentManager.compactTab(tempTabId, sourceTabId): side-effect-free
resolveConnection() for the compactor model (configured compaction_model_*,
else the source tab's own key+model), one-shot tool-less summary generation
via a transient Agent, then relocate full history to a fresh backup tab and
re-seed the canonical source id with [summary turn + preserved tail]. Source
tab is locked (messages queue) during the run; queue drains afterward.
- Routes: POST /tabs/:id/compact, GET/PUT /tabs/settings/compaction-model.
frontend:
- "Compact conversation" button in ModelSelector (Chat Settings), between
Working Directory and the agent toggle; idle-gated.
- Compaction-model key+model selector in Settings, beside the title model.
- Transient placeholder tab shows a large, non-faded "Please wait, compacting
conversation…" screen; closing it cancels. Source input locked while running.
- Handle compaction-* events: reload compacted source, insert backup tab,
refocus source, discard placeholder.
tests: core compaction unit tests, rekeyChunks DB test, AgentManager.compactTab
orchestration tests, and compaction route tests. All green (713 tests), biome
clean, all typechecks pass, frontend builds.
|
|
Add multimodal image/PDF input to the chat box via clipboard paste, gated by a
graceful per-model capability check.
UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the
draft, so attachments have ORDER relative to typed text and can be referenced
positionally. The token is the only handle — deleting it (atomic Backspace/
Delete, or selection overlap) detaches the file; an input-reconciliation safety
net detaches any attachment whose token is no longer intact. No preview strip.
Capability check: resolveModelCapabilities reads models.dev modalities.input
(new GET /models/capabilities, mirrors /context-limit). The input blocks Send
(no tokens spent) only on a definitive 'no'; unknown capability (catalog offline
/ unmapped provider) stays permissive. Attachments require a fresh turn — Send is
blocked while generating and /chat rejects content mid-turn (409).
Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK
ImagePart/FilePart content, but never persisted (history keeps the text with
[image]/[pdf] markers). Text-only turns serialize byte-identically to before.
Limits (Anthropic-aligned, enforced at paste + re-validated server-side):
PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total.
core: UserContentPart types, models/attachments validator, capability resolver,
agent.run+toModelMessages thread ordered content. api: /chat content validation +
passthrough. frontend: attachment-tokens helper, ChatInput paste/token/gating,
per-tab staged attachments, App.svelte capability fetch. +44 tests.
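
The limits listed above could be checked roughly as follows. This is a sketch only: the Attachment shape, the error strings, and the choice of 1 MB = 1024 * 1024 bytes are assumptions, and the real models/attachments validator may differ.

```typescript
interface Attachment {
  mimeType: string;
  sizeBytes: number;
}

const MB = 1024 * 1024; // assumption: binary megabytes
const IMAGE_TYPES = new Set(["image/png", "image/jpeg", "image/webp", "image/gif"]);
const PDF_TYPE = "application/pdf";

// Returns an error message, or null when the batch is acceptable.
function validateAttachments(items: Attachment[]): string | null {
  if (items.length > 20) return "too many attachments (max 20)";
  let total = 0;
  for (const item of items) {
    if (IMAGE_TYPES.has(item.mimeType)) {
      if (item.sizeBytes > 5 * MB) return "image exceeds 5MB";
    } else if (item.mimeType === PDF_TYPE) {
      if (item.sizeBytes > 32 * MB) return "PDF exceeds 32MB";
    } else {
      return `unsupported type: ${item.mimeType}`;
    }
    total += item.sizeBytes;
  }
  return total > 32 * MB ? "attachments exceed 32MB total" : null;
}
```

Running the same function at paste time and again on the server keeps the two checks from disagreeing.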
|
|
Adds an agent-callable `key_usage` tool that reports current usage for
configured API keys so the agent can pick a key with headroom, warn before
hitting a rate limit, and diagnose exhausted-key failures.
Per key it reports: provider, active/exhausted status (with last error +
when it was exhausted), remaining rate-limit headroom and reset timestamp per
window (5-hour, weekly, and monthly where the provider exposes it), and
whether the figures are live or served from cache (with the cache's
last-fetched-from-source time). Supports anthropic and opencode-go keys
(live with cache fallback for anthropic; live scrape for opencode-go).
Optional `key_id` reports one key; omitted reports all.
Hard permission gate `perm_key_usage` (default off): when disabled the tool
is completely removed from the toolset/context. Registered in both the
parent permission-gated path and the child whitelist path, advertised in the
system prompt (TOOL_DESCRIPTIONS), grantable to subagents via the summon
enum, and exposed as a frontend tool-permission checkbox.
To report data freshness, claude.ts gains `getAccountUsageWithSource` +
`ClaudeUsageResult` (live vs cache + cachedAt from usage_cache.cached_at);
the existing `getAccountUsage` now delegates to it, preserving behavior.
Tests: core key-usage tool suite (windows, %-conversion, freshness, exhausted
status, unsupported/unavailable, filtering) + agent-manager perm-gate test.
|
|
# Conflicts:
# packages/api/src/agent-manager.ts
# packages/api/tests/agent-manager.test.ts
# packages/frontend/src/lib/components/ToolPermissions.svelte
# packages/frontend/src/lib/settings.svelte.ts
|
|
The wake probe was hardcoded to claude-3-5-haiku-20241022, which the
endpoint no longer serves (HTTP 404), exhausting the retry loop. Now the
probe fetches the live model list via fetchAnthropicModels (falling back
to ANTHROPIC_MODELS_FALLBACK if empty) and selects the current Haiku via
a new pure selectHaikuModel() helper (first case-insensitive 'haiku'
substring match; newest-first ordering). No-match surfaces a clear
per-account error instead of crashing.
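
A minimal sketch of the selection rule, assuming the model list is an array of objects with an id field, already ordered newest first. The ModelInfo shape and the undefined return on no match are assumptions; the real selectHaikuModel() signature is not shown in this log.

```typescript
interface ModelInfo {
  id: string;
}

// First case-insensitive "haiku" substring match in a newest-first list;
// undefined when nothing matches, so the caller can report a clear error.
function selectHaikuModel(models: ModelInfo[]): string | undefined {
  return models.find((model) => model.id.toLowerCase().includes("haiku"))?.id;
}
```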
|
|
Address findings from a second independent (Gemini) review covering the tool
and the packaging:
- Robustness (was: crash): non-string params from a model hallucination (e.g.
include_ext: ["ts","go"]) threw 'x.trim is not a function' and killed the
tool call. Add an asString() coercion for all string params (query, path,
include_ext, exclude_pattern, only); non-strings now no-op or return the
graceful 'query is required' error.
- Output bound: cap each rendered snippet line at 500 chars (MAX_LINE_CHARS,
mirrors read-file.ts) so a matched minified/generated line can't bloat the
payload. (Total output is already bounded by the universal truncator.)
- packaging/PKGBUILD: make the cs clone rerun-safe (rm -rf before clone) so
makepkg -e / repeat runs don't abort on 'destination path already exists';
add conflicts=('cs') to the code-search package for a clean pacman error vs.
the unrelated AUR 'cs' that also owns /usr/bin/cs (no provides — different
program).
Not changed (verified): path containment, the -- flag-injection guard, and the
deterministic pinned Docker build were all confirmed solid by the review.
Tests: +2 (wrong-type params don't crash; long-line truncation). Full suite
605 pass, biome + tsc green.
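
The two hardening steps can be sketched as below. asString and MAX_LINE_CHARS are named in the message above; readQuery, capLine, and the "..." truncation suffix are hypothetical.

```typescript
const MAX_LINE_CHARS = 500;

// Coerce a model-supplied param: anything that is not a string becomes undefined.
function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Hypothetical wrapper showing the graceful error path for a bad query.
function readQuery(params: Record<string, unknown>): { query: string } | { error: string } {
  const query = asString(params.query)?.trim();
  return query ? { query } : { error: "query is required" };
}

// Cap one rendered snippet line; the suffix is an assumption.
function capLine(line: string): string {
  return line.length > MAX_LINE_CHARS ? `${line.slice(0, MAX_LINE_CHARS)}...` : line;
}
```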
|
|
Add Language Server Protocol integration modeled on opencode's, wired for
this codebase's plain-TypeScript tool/agent architecture.
Core (@dispatch/core):
- lsp/client.ts: LSP/JSON-RPC client over stdio (vscode-jsonrpc) with the
initialize handshake, didOpen/didChange sync, push + pull diagnostics
(textDocument/diagnostic, workspace/diagnostic), and a generic request()
passthrough for hover/definition/references/documentSymbol.
- lsp/server.ts: resolves dispatch.toml [lsp] entries into spawn specs.
Config-driven only — no builtin registry, no auto-download.
- lsp/manager.ts: process-wide LspManager owning client lifecycles, keyed
by root+serverID, lazy spawn + reuse + graceful shutdown.
- lsp/language.ts: extension->languageId map incl. .luau -> "luau".
- lsp/diagnostic.ts: error-only <diagnostics> block formatting (1-based).
- tools/lsp.ts: on-demand 'lsp' tool (1-based coords -> 0-based wire).
- write-file.ts: optional onAfterWrite hook for diagnostics-on-write.
- config schema: validate [lsp] block; DispatchConfig.lsp + LspServerConfig.
API (@dispatch/api):
- AgentManager owns one LspManager; per-working-directory server cache
cleared on config reload; diagnostics appended to write_file results;
'lsp' tool gated by new perm_lsp setting; shutdownAll on destroy().
Config:
- dispatch.toml: documented, commented [lsp.luau-lsp] Roblox example.
Tests: fake-lsp-server fixture + client/manager/server/diagnostic/schema/
tool/write-hook suites, plus an opt-in real-binary luau-lsp smoke test
(auto-skipped when luau-lsp is absent). 652 pass; biome + 3 typechecks green.
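
The coordinate conversion mentioned for tools/lsp.ts is small enough to sketch. LSP positions are zero-based { line, character } pairs; the ToolPosition shape and the clamping at zero are assumptions.

```typescript
// 1-based coordinates as shown to the agent (hypothetical shape).
interface ToolPosition {
  line: number;
  column: number;
}

// 0-based position as sent on the LSP wire.
interface WirePosition {
  line: number;
  character: number;
}

function toWirePosition(pos: ToolPosition): WirePosition {
  // Clamp so a stray 0 from the model never becomes a negative wire value.
  return { line: Math.max(0, pos.line - 1), character: Math.max(0, pos.column - 1) };
}

function toToolPosition(pos: WirePosition): ToolPosition {
  return { line: pos.line + 1, column: pos.character + 1 };
}
```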
|
|
Address bugs found by an end-to-end test of the tool:
- HIGH: prose/text files (.md/.html/etc.) came back as bare headers with no
snippet. cs's default 'auto' snippet mode emits a single 'content' string
(no 'lines[]') for prose, which the renderer skipped. Force
--snippet-mode=lines by default so every file type returns a lines[] window
that renders. Also add a defensive 'content'-shape fallback in formatResults
(+ widen the CsResult type) so a content result is never shown blank.
- HIGH: the 'context' parameter was a no-op — cs ignores -C except in grep
snippet mode. When context is supplied, switch to --snippet-mode=grep so -C
actually widens the per-match window (verified 2 -> 26 lines); default
(no context) keeps the richer lines window for code.
- LOW: a 'path' pointing at a file (not a dir) silently returned 'No matches
found' (cs --dir <file> => null). Now stat the path and return an
explanatory error (file vs nonexistent), pointing at read_file for a file.
- MEDIUM/doc: clarify snippet_length (prose-mostly) and context descriptions.
Tests: +5 (prose rendering live + stubbed content-shape; context widening;
path-is-file; path-nonexistent). Full suite 603 pass, biome + tsc green.
Note: the EACCES spill failure seen in testing is pre-existing platform
infra (truncate.ts SPILL_ROOT, shared by all tools), not part of this tool.
|
|
Address findings from an independent code review of the search_code tool:
- Critical: cs failures (non-zero exit, or SIGTERM from the spawn timeout)
were swallowed and reported to the model as 'No matches found', discarding
stderr. Now capture exit code + signal from 'close' and return a real
'Error:' message (a timeout message for SIGTERM, exit code + stderr otherwise). cs
exits 0 on a genuine no-match, so that path still reports correctly.
- High: a query beginning with '-' (e.g. '-foo') was parsed by cs as a
(usually invalid) flag. Insert a '--' separator before the query so it is
always treated as the positional search term.
- Low: relative-path display fallback now matches the workdir only at a path
boundary, so a sibling dir sharing the prefix (e.g. /app vs /app-secrets)
isn't rendered as a '../app-secrets/...' path.
Adds tests for the non-zero-exit (stderr surfaced, not 'No matches') and
dash-leading-query cases. All tests (598), biome, and tsc pass.
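
A sketch of the separator and path-boundary fixes, under assumptions: buildCsArgs and isWithinWorkdir are hypothetical names, the argument order is illustrative, and paths are treated as POSIX-style strings that are already normalized.

```typescript
// Put "--" before the query so a leading "-" is never parsed as a flag.
function buildCsArgs(query: string, dir: string): string[] {
  return ["-f", "json", "--dir", dir, "--", query];
}

// Match the workdir only at a path boundary, so "/app" does not
// claim "/app-secrets".
function isWithinWorkdir(workdir: string, target: string): boolean {
  const base = workdir.endsWith("/") ? workdir.slice(0, -1) : workdir;
  return target === base || target.startsWith(`${base}/`);
}
```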
|
|
Add a dedicated, permission-gated search_code tool that wraps boyter/cs
(code spelunker) — a fast, relevance-ranked, structure-aware code search
engine — giving agents a better default than grep/find for exploratory
'where is X / how does Y work' searches (ranked results, snippets, ~5x
smaller payloads).
- packages/core/src/tools/search-code.ts: createSearchCodeTool factory;
-f json invocation, workdir path containment, graceful missing-binary
handling (DISPATCH_CS_BIN override), readable per-file formatted output.
- Wire-up: export from core; register in agent-manager (both child-whitelist
and parent perm paths) behind new perm_search_code; add to summon catalog
+ tools enum; frontend ToolPermissions + settings.
- Docker: build a patched, statically-linked cs (pinned v3.1.0 commit) in a
golang builder stage and bundle at /usr/local/bin/cs.
- docker/cs/luau-declarations.patch: additive Luau declaration table so
--only-declarations / definition ranking works for Roblox .luau files
(upstream has Lua but not Luau). Applied during the Docker build.
- Tests: new search-code.test.ts (stubbed JSON formatting + live-cs
integration, skipped when cs absent); agent-manager/routes mocks +
perm-gating assertions; loader pass-through.
All tests (596), biome, and tsc (core/api/frontend) pass. cs-builder Docker
stage verified to build and produce a working patched binary.
|
|
# Conflicts:
# packages/api/tests/agent-manager.test.ts
|
|
'arrives on its own')
Matches actual behavior: a peer's reply wakes this tab with a new message in a
later turn. Updated the send_to_tab description (both canReadTab branches), the
delivery-result text (both branches), and the system-prompt one-liner; updated
the test assertion accordingly.
|
|
Granting only the user-agent (top-level) permission without the
subagent-summon permission left the agent unable to summon user agents:
the whole summon tool was gated behind perm_summon, so perm_user_agent
alone produced no summon tool.
Register summon when EITHER perm_summon OR perm_user_agent is granted.
createSummonTool now takes an independent subagentEnabled flag (mirrors
perm_summon) alongside userAgentEnabled (mirrors perm_user_agent):
- subagent-only -> ordinary subagents, no top_level
- user-agent-only -> spawns ONLY top-level user agents (top_level
forced, background/top_level params dropped, user-agent catalog only)
- both -> unchanged full behavior
retrieve stays bundled with perm_summon (user agents are fire-and-forget).
Adds core summon tests (user-agent-only mode + legacy-default regression)
and an agent-manager summon/user_agent permission-split suite.
|
|
The send_to_tab guidance previously told the agent it could call read_tab to
check for a reply, but the tab-messaging permissions are split — a tab can
hold send_to_tab WITHOUT read_tab (the exact case in testing). Advertising a
tool the agent wasn't granted is wrong.
Thread a canReadTab flag from AgentManager.buildTabCommToolEntries into
createSendToTabTool (true iff this tab is also granted read_tab). The tool
description and the delivery-result text now only reference read_tab when
canReadTab is true; otherwise they say a reply arrives on its own and to end
the turn. Drop the read_tab phrasing from the static TOOL_DESCRIPTIONS
one-liner (can't be conditional per-tab there).
Also uppercase the word ONLY in the recipient reply-contract footer for emphasis.
Tests: cover both canReadTab branches for description + result text; assert
ONLY is uppercased.
|
|
replies
Two behavioral problems observed once the tools were usable:
1. The SENDER busy-waited for a reply (ran 'sleep 20' / polled) instead of
ending its turn. Tool description, the delivery result text, and the
system-prompt one-liner now say plainly: do not sleep/poll/run commands
to wait; a reply arrives on its own in a later turn (or via read_tab in a
future turn); keep working if there's other work, else end your turn.
2. The RECIPIENT replied to its OWN user in plain text instead of routing the
answer back through send_to_tab. The provenance wrapper now states the
message is from another AGENT (not your user), and that to reply you must
use send_to_tab addressed to the sender's handle — and only if asked, since
it may just be context. A plain text answer reaches only your own user.
Tests updated for the new wording.
|
|
Replace the imperative id-based CRUD todo tool (add/update/list/get/remove)
with opencode's declarative whole-list design: a single `todos` param that
replaces the entire list each call. No model-visible ids, no delta reasoning,
no "task not found" spirals.
- core: TaskItem { id, content, status }; statuses pending|in_progress|
completed|cancelled. TaskList.setTasks/getTasks/onChange. New rich
TODO_DESCRIPTION adapted from opencode's todowrite.txt.
- api: TASK_MANAGEMENT_GUIDANCE system-prompt section (from anthropic.txt);
updated TOOL_DESCRIPTIONS.todo. Reload fix: TabStatusSnapshot now carries
per-tab tasks so getAllStatuses rehydrates the panel on reconnect.
- frontend: mirror types; hydrate tasks from snapshot in both restore paths;
upgrade sidebar Tasks panel to render content + all four statuses + progress.
- tests: new core task-list.test.ts (15); updated api TaskList mocks +
getAllStatuses task-snapshot coverage.
bun run check clean; 569 tests pass; all packages typecheck.
|
|
The wake probe POSTed a bare { model, messages } body with no system[]
identity. Anthropic validates system[] on OAuth (Pro/Max) subscription
requests and rejects any that lack the verbatim Claude Code identity, so
every scheduled wake (and the manual Wake-now button) failed silently —
surfacing as a blank '— failed' status that then burned the retry budget.
- Add pure buildWakeProbeBody(model) in @dispatch/core mirroring a genuine
Claude Code request (billing header block + identity block + 'hi'), with
a unit test for its shape.
- wakeAllClaudeAccounts now sends that body plus the CLI session/request-id
headers, and records 'HTTP <status>: <message>' on failure so the panel
never shows a bare 'failed' and breakage stays debuggable.
|
|
- TabBar: HTML5 drag-and-drop to reorder user tabs (subagent tabs untouched);
double-click a tab title to rename (Enter/blur confirm, Escape cancel).
- Store: add reorderTabs/renameTab/setDraft; per-tab in-memory `draft` and
`manualTitle` fields. Manual rename suppresses first-message auto-title.
- ChatInput: bind to the active tab's draft so switching tabs saves/restores
unsent text instead of clobbering it.
- Backend: updateTabPositions() + PATCH /tabs/reorder persist tab order to the
existing `position` column; tabs without a stored position fall to the end
then get explicit positions on first reorder.
- Tests: store reorder/rename/auto-title-guard/draft coverage; core
updateTabPositions coverage (FakeDatabase extended with transaction support).
|
|
# Conflicts:
# packages/api/tests/agent-manager.test.ts
|
|
|
Addresses the live-accumulator overshoot a Gemini review surfaced: the
frontend adds every streamed usage event to cacheStats, but a rate-limited
fallback attempt's usage is discarded server-side (never persisted). Live
numbers overshot until a reload re-seeded from the DB aggregate.
Fix: turn-sealed (emitted AFTER the atomic usage-row write) now carries the
authoritative getUsageStatsForTab aggregate. The store REPLACES (not adds)
cacheStats with it every turn — landing the just-sealed turn's usage AND
self-healing any live drift, including the discarded-fallback overshoot. No
extra round-trip (piggybacks turn-sealed); idempotent in the happy path.
- core: add UsageStats type; getUsageStatsForTab returns it; turn-sealed gains
optional usageStats field.
- api: agent-manager reads getUsageStatsForTab post-flush and attaches it to
the turn-sealed emit (try/catch: omit on DB error).
- frontend: turn-sealed handler replaces cacheStats (undefined ⇒ untouched
back-compat; null ⇒ clear).
Tests: frontend reconcile/self-heal/back-compat/null-clear; api turn-sealed
carries aggregate. 509 -> 514 passing; typecheck + biome green.
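
The replace-not-add rule in the turn-sealed handler can be sketched as follows. The UsageStats fields and both function names are hypothetical; only the undefined / null / value semantics come from the message above.

```typescript
// Hypothetical fields; the real UsageStats type is not shown in this log.
interface UsageStats {
  inputTokens: number;
  cachedTokens: number;
}

// Live streaming path: usage events are ADDED to the running totals.
function addLiveUsage(current: UsageStats | null, event: UsageStats): UsageStats {
  return {
    inputTokens: (current?.inputTokens ?? 0) + event.inputTokens,
    cachedTokens: (current?.cachedTokens ?? 0) + event.cachedTokens,
  };
}

// turn-sealed path: the authoritative aggregate REPLACES the live totals.
// undefined means an older backend sent no field, so leave stats untouched;
// null means clear them.
function reconcileOnSeal(
  current: UsageStats | null,
  aggregate: UsageStats | null | undefined,
): UsageStats | null {
  return aggregate === undefined ? current : aggregate;
}
```

If a discarded fallback attempt inflated the live totals, the next seal overwrites them with the persisted aggregate, which is why the drift heals without a reload.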
|
|
Add a 'Context Window' sidebar view showing the live context occupancy
(latest request's input+output) against the model's maximum context
window, resolved dynamically from the models.dev catalog.
- core: models.dev catalog module (resolveContextLimit) with disk cache,
TTL, stale-fallback + offline penalty memo; null for unknown models.
- api: GET /models/context-limit?provider=&modelId=.
- frontend: ContextWindowPanel + computeContextUsage helper; App resolves
+ caches the active model's max (anthropic/opencode-anthropic only);
percent shown to 2 decimals; degrades to bare token count when max
unknown.
- tests: core catalog (13), api route (3), frontend helper (6).
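
A rough sketch of the occupancy calculation, assuming the helper returns the token count plus a percent string that is null when the model maximum is unknown. The ContextUsage shape is hypothetical; the real computeContextUsage helper may return something different.

```typescript
interface ContextUsage {
  tokens: number;
  percent: string | null; // null when the model's maximum is unknown
}

// Occupancy = latest request's input + output, against the model maximum.
function computeContextUsage(
  inputTokens: number,
  outputTokens: number,
  maxTokens: number | null,
): ContextUsage {
  const tokens = inputTokens + outputTokens;
  if (maxTokens === null || maxTokens <= 0) return { tokens, percent: null };
  return { tokens, percent: ((tokens / maxTokens) * 100).toFixed(2) };
}
```

For example, 1500 input plus 500 output tokens against a 200000-token window is 2000 / 200000 = 1.00 percent; with an unknown maximum only the count 2000 is available to display.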
|
|
Persist usage as invisible type:"usage" chunk rows (side channel):
- core: add "usage" ChunkType + UsageData; exclude usage rows from
getChunksForTab/getTotalChunkCount; add getUsageStatsForTab aggregate
(exported from barrel); defensive skip in groupRowsToMessages.
- api: agent-manager accumulates per-attempt usageRows and flushes them in
the same atomic appendChunks call as the turn's content (discarded on a
superseded fallback attempt). GET /tabs enriches rows with usageStats.
- frontend: hydrateFromBackend seeds cacheStats from usageStats (reload only;
no re-seed on statuses reconnect, so no double-count with live events).
Tests: core DB-backed usage persistence/aggregate; api usage-row-per-event +
fallback discard; routes GET /tabs usageStats; frontend hydrate seed +
no-double-count + live-accumulation-after-seed. 495 -> 509 passing.
|
|
Add a per-model/key reasoning effort setting to agent definitions,
surfaced and editable in the Agent Settings page and displayed at a
glance in the model selector views.
- core: single source of truth for effort levels (REASONING_EFFORTS,
DEFAULT_REASONING_EFFORT='high', labels, isReasoningEffort guard);
add 'xhigh' level; AgentModelEntry.effort; xhigh budget=24000 for
classic-thinking Claude; default floor 'high'. Persist/parse effort
in the agent TOML loader.
- api: thread effort through the fallback chain with per-model -> per-tab
-> default precedence; validate /chat + agentModels effort from the
canonical list.
- frontend: effort <select> per model row in AgentBuilder; effort badges
in ModelSelector (agent + subagent chains); Thinking dropdown sourced
from canonical list; per-tab default raised to 'high'.
- tests: +15 (loader round-trip, agent xhigh budget, canonical list +
guard, api precedence, route validation).
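
The precedence chain can be sketched like this. The contents of REASONING_EFFORTS are an assumption (only 'high' and 'xhigh' are named in the message above), and resolveEffort is a hypothetical name for the per-model -> per-tab -> default lookup.

```typescript
// Assumed list; only "high" and "xhigh" are confirmed by the log.
const REASONING_EFFORTS = ["low", "medium", "high", "xhigh"] as const;
type ReasoningEffort = (typeof REASONING_EFFORTS)[number];
const DEFAULT_REASONING_EFFORT: ReasoningEffort = "high";

// Guard used to validate untrusted input (request bodies, TOML values).
function isReasoningEffort(value: unknown): value is ReasoningEffort {
  return typeof value === "string" && (REASONING_EFFORTS as readonly string[]).includes(value);
}

// per-model wins over per-tab, which wins over the default.
function resolveEffort(perModel: unknown, perTab: unknown): ReasoningEffort {
  if (isReasoningEffort(perModel)) return perModel;
  if (isReasoningEffort(perTab)) return perTab;
  return DEFAULT_REASONING_EFFORT;
}
```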
|
|
Brings in the n2/ntfy-notifications feature (ntfy.sh push notifications
with per-event toggles, subagent-suppression flag, topic-only input,
Settings UI, dispatcher + transport + config modules, 12+ new tests),
the header declutter (theme picker + Debug panel moved into Settings /
sidebar), the shared theme boot-apply module, and an a11y label for the
remove-panel button.
No code changes from this branch were touched by the merge — the
overlap was purely textual.
Conflict resolution:
1. HANDOFF.md (add/add conflict). Both branches independently put a
single-purpose HANDOFF.md at the repo root for their respective
in-flight feature, matching the existing convention (c351719 did
the same for this branch; 29bdd00 did the same for ntfy). After
this merge both features ship, so neither is in-flight anymore.
Archive both into notes/:
- notes/wake-schedule-handoff.md (this branch — git tracks as a
rename from HANDOFF.md)
- notes/ntfy-notifications-handoff.md (dev — recovered from
MERGE_HEAD before deletion)
The root HANDOFF.md is intentionally absent post-merge; the next
in-flight branch will create its own.
2. packages/api/tests/routes.test.ts (auto-merged). dev appended ntfy
stubs to the vi.mock('@dispatch/core', ...) factory; this branch
appended a 'Wake schedule routes' describe block at the bottom.
The two regions don't overlap and the textual auto-merge is correct
(verified: 6 describe blocks, both mock-stub regions and the new
describe present, no conflict markers).
Verification on the merge commit:
bun run test → 31 files, 495 / 495 passing
(was 431 on the branch + 64 from dev)
bun run check → biome clean, 156 files
bun run --cwd packages/frontend typecheck
→ svelte-check 0 errors, 0 warnings
dev can now fast-forward to this commit:
git checkout dev && git merge --ff-only r1/claude-reset-fix
|
|
The Settings field is now a plain topic name (e.g. `my-secret-topic`)
instead of a full URL. The transport always posts to
`https://ntfy.sh/<topic>` (URL-encoded), and the only server-side check
is "non-empty when enabled". Removes the user-visible
"string does not match the expected pattern" error people hit when
typing a bare topic.
- packages/core/src/notifications/ntfy.ts: drop validateTopicUrl;
add buildNtfyUrl(topic) + exported NTFY_BASE_URL.
- packages/core/src/notifications/types.ts, config.ts: rename
topicUrl -> topic; update docs.
- packages/api/src/routes/notifications.ts: only validates non-empty
topic when enabled. Also fixes a latent bug where notifySubagents
was dropped on every PUT (was not passed to normalizeNtfyConfig).
- packages/frontend/src/lib/components/SettingsPanel.svelte: relabel
field "Topic URL" -> "Topic"; placeholder "your-secret-topic";
updated helper copy.
- Tests updated: rewrote validateTopicUrl coverage as buildNtfyUrl
coverage + proof that previously-rejected topics (dots, spaces,
unicode, "Any Topic Whatsoever") now POST cleanly.
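
A minimal sketch of the topic-only transport URL. buildNtfyUrl and NTFY_BASE_URL are named in the message above; the use of encodeURIComponent and the validateTopic helper are assumptions about how the encoding and the "non-empty when enabled" check might look.

```typescript
const NTFY_BASE_URL = "https://ntfy.sh";

// Always post to https://ntfy.sh/<topic>, URL-encoding the topic.
function buildNtfyUrl(topic: string): string {
  return `${NTFY_BASE_URL}/${encodeURIComponent(topic)}`;
}

// Hypothetical server-side check: the topic only has to be non-empty
// when notifications are enabled.
function validateTopic(enabled: boolean, topic: string): string | null {
  return enabled && topic.length === 0 ? "topic is required when enabled" : null;
}
```

Under this sketch a topic such as "Any Topic Whatsoever" becomes https://ntfy.sh/Any%20Topic%20Whatsoever instead of being rejected by a URL pattern.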
- HANDOFF.md: added a short "topic-only input" section.
|
|
A parent agent that spawns 8 subagents was producing 9 "Turn complete"
notifications per round — almost always noise. New `notifySubagents`
config flag (defaults to false) gates `turn-completed` and `turn-error`
from any tab with a `parentTabId`. The flag is intentionally NOT applied
to `permission-required` — a subagent's permission prompt still needs a
human tap to proceed, so suppressing it would silently hang the
subagent. `agent-spawned` is already top-level-only by construction.
Wiring:
- core/notifications/types.ts: NtfyConfig.notifySubagents: boolean
- core/notifications/config.ts: defaults to false; normalize() tolerates
missing / wrong-typed values and falls back to false
- core/notifications/dispatcher.ts: new optional TabParentLookup option
(getTabParentId). When notifySubagents=false AND the lookup returns a
non-empty parent id string, turn-completed/turn-error are dropped.
Lookup failures (no lookup configured, throws, returns undefined) fall
back to "treat as top-level" so legitimate top-level events are never
silently dropped when the DB is briefly unreadable.
- api/app.ts: wires getTabParentId via core's getTab(id)?.parentTabId
- frontend SettingsPanel.svelte: "Include subagent tabs" checkbox with
an explanatory hint that permission prompts still fire
Tests (+9):
- 3 in config.test.ts: default-false, explicit-true, wrong-typed fallback
- 6 in dispatcher.test.ts: suppression of turn-completed/turn-error from
subagents, no suppression when flag is true, permission-required not
gated, graceful fallback when lookup is missing/throws/returns undefined
Live ntfy.sh round-trip re-verified (status: 200).
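A minimal sketch of the gating rule described above. The function name shouldDropForSubagent and the option shape are assumptions made for illustration; only notifySubagents, getTabParentId and the event names come from this commit.

```typescript
type NotificationEventType =
  | "turn-completed"
  | "turn-error"
  | "permission-required"
  | "agent-spawned";

interface GateOptions {
  notifySubagents: boolean;
  // Optional lookup: returns the parent tab id, or undefined for top-level tabs.
  getTabParentId?: (tabId: string) => string | undefined;
}

// Only these two event types are gated; permission prompts always pass.
const SUBAGENT_GATED: ReadonlySet<NotificationEventType> =
  new Set<NotificationEventType>(["turn-completed", "turn-error"]);

// Hypothetical helper name. Returns true when the event should be dropped.
function shouldDropForSubagent(
  type: NotificationEventType,
  tabId: string,
  opts: GateOptions,
): boolean {
  if (opts.notifySubagents) return false;
  if (!SUBAGENT_GATED.has(type)) return false;
  if (!opts.getTabParentId) return false; // no lookup: treat as top-level
  let parentId: string | undefined;
  try {
    parentId = opts.getTabParentId(tabId);
  } catch {
    return false; // lookup threw: treat as top-level, never drop silently
  }
  return typeof parentId === "string" && parentId.length > 0;
}
```

Every failure path returns false, which matches the stated intent that a briefly unreadable DB must not swallow top-level notifications.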
|
|
same-tick fires
Marking an hour on the Claude Wake Schedule panel now schedules FOUR probes
within that hour instead of one. Rate-window edges are unforgiving: a
single probe at :15 can miss the actual reset moment by up to 44 minutes
(a reset at :59), while hitting :00 / :15 / :30 / :45 puts a probe within
~7 minutes of any reset that happens during that hour.
When multiple slots come due in the same 30s scheduler tick (or recover
together at boot), they coalesce into a SINGLE upstream wake call — no
point hitting Anthropic 4× in the same window.
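The coalescing rule can be sketched like this. WakeSlot, takeDueSlots and runTick are hypothetical names; the real scheduler and its upstream wake call are not shown in this log.

```typescript
interface WakeSlot {
  hour: number;
  slotMinute: number;
  nextWakeAt: number; // Unix ms
}

// Every slot whose time has arrived by `now`, including ones missed while down.
function takeDueSlots(slots: readonly WakeSlot[], now: number): WakeSlot[] {
  return slots.filter((slot) => slot.nextWakeAt <= now);
}

// One scheduler tick: however many slots are due, `wake` runs at most once.
function runTick(
  slots: readonly WakeSlot[],
  now: number,
  wake: (due: WakeSlot[]) => void,
): number {
  const due = takeDueSlots(slots, now);
  if (due.length > 0) {
    wake(due);
  }
  return due.length;
}
```

Passing the whole batch to a single callback is what keeps a boot-time recovery of several overdue slots down to one upstream call.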
DB schema
- wake_schedule is now (hour, slot_minute, next_wake_at) with primary key
(hour, slot_minute). Destructive migration: detect the old
single-row-per-hour schema by the absence of the slot_minute column and
DROP TABLE. No other table is touched. Per user direction: no back-compat
for old rows.
API
- POST /models/wake-schedule/toggle add: { hour, timestamps: { '0': ms,
'15': ms, '30': ms, '45': ms } } — all 4 slots required, all must be
future Unix ms. Delete shape unchanged ({ hour }).
- GET /models/wake-schedule shape:
schedule: { '9': { '0': ts, '15': ts, '30': ts, '45': ts }, ... }
probeSlotMinutes: [0, 15, 30, 45]
resetOffsetHours, lastWake, pendingRetry (unchanged from prior commit)
Frontend
- Computes 4 timestamps client-side (next occurrence of HH:MM in local TZ)
and sends them in one request.
- markedHours summary now says 'Probes :00 :15 :30 :45 → reset by ~Xh later'.
- Same in-flight tracking / current-hour ring / status row as before.
Tests
- wake-scheduler.test.ts unchanged (pure helpers still correct);
PROBE_SLOT_MINUTES + isProbeSlotMinute exports added.
- routes.test.ts rewritten for the new payload shape: 12 wake-schedule
tests covering snapshot shape, add/remove (full 4-slot round-trip),
validation (range, integer, past-slot, missing slot, non-object,
missing timestamps), independent multi-hour scheduling, and
re-toggle replacement. 417 tests total (was 414).
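A rough sketch of the client-side timestamp computation for the add payload. nextOccurrence and buildTogglePayload are hypothetical names, and the naive "add one day" step is an assumption that ignores DST edge cases.

```typescript
const PROBE_SLOT_MINUTES = [0, 15, 30, 45] as const;

// Next local-time occurrence of hour:minute strictly after `now`, as Unix ms.
function nextOccurrence(hour: number, minute: number, now: Date): number {
  const candidate = new Date(
    now.getFullYear(), now.getMonth(), now.getDate(), hour, minute, 0, 0,
  );
  if (candidate.getTime() <= now.getTime()) {
    // Already passed today, so roll over to tomorrow (assumption: no DST care).
    candidate.setDate(candidate.getDate() + 1);
  }
  return candidate.getTime();
}

// Builds the { hour, timestamps } body with all four slots present.
function buildTogglePayload(hour: number, now: Date) {
  const timestamps: Record<string, number> = {};
  for (const minute of PROBE_SLOT_MINUTES) {
    timestamps[String(minute)] = nextOccurrence(hour, minute, now);
  }
  return { hour, timestamps };
}
```

Marking the current hour part-way through therefore yields a mix: slots still ahead land today, slots already passed land tomorrow, and every value is in the future as the API requires.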
|
|
Click, support Basic auth, non-optimistic UI clear
Acted on 4 of 6 findings from the gemini-3-flash-preview second-opinion
review (the other 2 were verified-wrong or judged not worth the
complexity — see HANDOFF.md).
core/src/notifications/ntfy.ts:
- validateTopicUrl now enforces ntfy's actual topic-name constraints:
exactly one path segment, 1–64 chars, charset [A-Za-z0-9_-]. Prevents
users from saving topic URLs that look fine but silently 404 at
publish time (cf. binwiederhier/ntfy#1451 for the 64-char limit and
binwiederhier/ntfy's topic-name regex for the charset).
- Click header now passes through sanitizeHeader, closing the same
CRLF-injection vector that Title/Tags already had.
- Authorization header construction now factors through a small
buildAuthHeaderValue helper: a value that already starts with a scheme
token ("Bearer xyz", "Basic dXNlcjpwYXNz") is used verbatim, so users
of private ntfy servers that want Basic auth can paste the full header
value. Bare tokens still get the "Bearer " prefix automatically.
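The three rules above, sketched as pure helpers. The helper names follow the commit, but the bodies are illustrative assumptions (isValidTopicPath in particular is a hypothetical stand-in for the path check inside validateTopicUrl).

```typescript
// Topic-name rule quoted above: one segment, 1-64 chars, [A-Za-z0-9_-].
const TOPIC_NAME = /^[A-Za-z0-9_-]{1,64}$/;

// Hypothetical helper: checks only the URL pathname, e.g. "/my-topic".
function isValidTopicPath(pathname: string): boolean {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const [only] = segments;
  return segments.length === 1 && only !== undefined && TOPIC_NAME.test(only);
}

// Collapse CR/LF so a value cannot smuggle extra header lines.
function sanitizeHeader(value: string): string {
  return value.replace(/[\r\n]+/g, " ").trim();
}

// A scheme token, whitespace, then credentials ("Bearer xyz", "Basic abc").
const HAS_SCHEME = /^[A-Za-z][A-Za-z0-9._~+-]*\s+\S/;

function buildAuthHeaderValue(tokenOrHeader: string): string {
  const value = tokenOrHeader.trim();
  // Full header values pass through verbatim; bare tokens become Bearer.
  return HAS_SCHEME.test(value) ? value : `Bearer ${value}`;
}
```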
frontend/SettingsPanel.svelte:
- clearNtfyAuthToken() was optimistic: it flipped hasAuthToken=false
locally before awaiting the network call. If the request failed the
UI lied about server state, and worse — a subsequent Save() with
authToken:undefined would silently re-arm the original token. Now
awaits the response, surfaces failures via the existing ntfySaveError
banner, and only mutates local state on success. Adds a
ntfyClearingToken loading flag so the button disables + spins during
the request.
Tests: +6 in ntfy.test.ts (multi-segment rejection, charset rejection,
length boundary, 64-char acceptance, Basic auth pass-through, Click
sanitization). All 442 tests pass; biome clean; svelte-check clean;
manual ntfy.sh end-to-end re-verified.
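The non-optimistic clear flow, sketched outside Svelte. TokenUiState and the injected request function are assumptions for illustration; the real component state (hasAuthToken, ntfyClearingToken, ntfySaveError) lives in SettingsPanel.svelte.

```typescript
interface TokenUiState {
  hasAuthToken: boolean;
  clearing: boolean; // drives the disabled + spinner state
  saveError: string | null;
}

// `request` stands in for the network call; injected so the flow is testable.
async function clearAuthToken(
  state: TokenUiState,
  request: () => Promise<{ ok: boolean; status: number }>,
): Promise<void> {
  state.clearing = true;
  state.saveError = null;
  try {
    const res = await request();
    if (!res.ok) {
      throw new Error(`clear failed with status ${res.status}`);
    }
    // Local state changes only after the server confirmed the clear.
    state.hasAuthToken = false;
  } catch (err) {
    state.saveError = err instanceof Error ? err.message : String(err);
  } finally {
    state.clearing = false;
  }
}
```

On failure hasAuthToken stays true, so the UI keeps reflecting what the server actually holds.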
|
|
Adds a transport-agnostic NotificationDispatcher and a fire-and-forget
ntfy.sh transport (no SDK; just fetch). Configuration is persisted as a
single global JSON blob under the 'ntfy_config' settings key.
Event taxonomy (per-event toggles):
- turn-completed — assistant turn finished cleanly
- turn-error — final turn error (after all fallbacks)
- permission-required — new permission prompt was created
- agent-spawned — top-level user-agent tab spawned via 'summon'
Design:
- Single internal notify(event) interface so a future transport (email,
webhook) plugs in without changing call sites.
- attachToAgentManager + attachToPermissionManager subscribe to the
existing event streams via narrow listener interfaces (no @dispatch/api
dependency back into core).
- 5s in-memory dedupe window on dedupeKey suppresses permission re-emits.
- 10s per-request abort timeout so a hung ntfy server can't pin a worker.
- All sends are fire-and-forget: void Promise.resolve(...).catch(warn).
Tests (39 new):
- ntfy transport: URL/headers/body/auth/click, header sanitization,
per-event-type defaults, error paths.
- config: defaults, normalization tolerance, round-trip, redaction.
- dispatcher: master switch, per-event toggle, dedupe, agent/permission
hookups, top-level-only filtering for agent-spawned, dispose.
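The dedupe window and the fire-and-forget send can be sketched as below. DedupeWindow and fireAndForget are hypothetical names, and the injectable clock is an assumption added to make the sketch testable; the 5s window itself is from the commit.

```typescript
class DedupeWindow {
  private readonly lastSent = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // True if the key may be sent now; repeats inside the window are suppressed.
  shouldSend(key: string): boolean {
    const t = this.now();
    const prev = this.lastSent.get(key);
    if (prev !== undefined && t - prev < this.windowMs) {
      return false;
    }
    this.lastSent.set(key, t);
    return true;
  }
}

// Never awaited by the caller; failures only reach `warn`.
function fireAndForget(
  send: () => Promise<void>,
  warn: (err: unknown) => void,
): void {
  // Starting from Promise.resolve() also turns a synchronous throw in `send`
  // into a rejection that the catch handles.
  void Promise.resolve().then(send).catch(warn);
}
```

Usage would look like `if (dedupe.shouldSend(event.dedupeKey)) fireAndForget(() => transport.send(event), console.warn)`, where transport is whatever sender is configured.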
|
|
A message queued while the agent was mid-turn was only handled if it
arrived DURING a tool batch (injected as a [USER INTERRUPT]). If it
landed after the last tool call — or the turn had no tools — the agent
silently appended it to history and ended the turn with no response, so
it sat there unanswered. This affected both user-queued messages and
agent-queued ones (send_to_tab).
- agent.ts: stop the end-of-turn drain that swallowed trailing queued
messages into history. They now stay on the queue.
- agent-manager: after a CLEAN turn settles, continueFromQueue() drains
the queue and starts a fresh turn to answer it. Skipped on a
user-stopped or errored turn (queue preserved for the next send).
- Loop safety: continuation draws from the existing autoWakeBudget, so a
runaway agent<->agent chain is bounded; human sends refill it, so human
conversations are never throttled.
- dequeueMessages now tags message-consumed with reason
"interrupt" | "continuation"; the frontend collapses continuation-
consumed queued bubbles into the next turn's initiator row (avoids the
linger/dup traps documented in queue-interrupt-reconcile-edge-cases.md).
- Tests: agent (no-swallow + interrupt regression), agent-manager
(continuation, no-op when empty, user-stop preserves queue, bounded
loop), frontend (continuation bubble becomes next initiator).
- wishlist: remove the now-fixed item.
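A simplified sketch of the continuation rule. TabRuntime, the string-typed queue, and the choice to spend one budget unit per continuation are assumptions; MAX_AGENT_AUTO_WAKES is named elsewhere in this log but its value here is made up.

```typescript
type TurnOutcome = "clean" | "user-stopped" | "errored";

interface TabRuntime {
  queue: string[];
  autoWakeBudget: number;
}

const MAX_AGENT_AUTO_WAKES = 3; // hypothetical value, not taken from the repo

// Returns the messages that should start a fresh turn, or null for no-op.
function continueFromQueue(tab: TabRuntime, outcome: TurnOutcome): string[] | null {
  if (outcome !== "clean") return null; // queue preserved for the next send
  if (tab.queue.length === 0) return null;
  if (tab.autoWakeBudget <= 0) return null; // bounds agent<->agent loops
  tab.autoWakeBudget -= 1;
  // Drain everything queued so far into one continuation turn.
  return tab.queue.splice(0, tab.queue.length);
}

// A human send refills the budget, so human conversations are never throttled.
function refillOnHumanSend(tab: TabRuntime): void {
  tab.autoWakeBudget = MAX_AGENT_AUTO_WAKES;
}
```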
|
|
Add send_to_tab / read_tab tools so an agent can message or read another
tab by a git-style short handle (shortest unique prefix of the tab UUID,
min 4 chars), shown in the tab bar.
- core/db/tabs: resolveTabPrefix + shortestUniquePrefix (open tabs only,
LIKE-sanitized prefix matching)
- new tools read-tab.ts / send-to-tab.ts (+ tests) decoupled from the DB
TabRow via a minimal ResolvedTabRef projection
- agent-manager: unified deliverMessage routing (busy -> queue, idle ->
new turn) shared by POST /chat and send_to_tab; agent->agent auto-wake
budget (MAX_AGENT_AUTO_WAKES) to bound ping-pong loops
- summon/loader: send_to_tab + read_tab as grantable tools
- frontend: shortHandleFor + handle badge in TabBar; perm toggles
- notes: tab-comm / user-agents / todo-redesign plans
- chore: biome format fixes (debug-logger, summon.test)
Refs notes/plan-tab-comm.md
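The short-handle idea, sketched over an in-memory list of open tab ids. The real helpers use LIKE-based prefix matching in the DB, so this swaps in plain string matching; the Resolution type is a hypothetical shape.

```typescript
const MIN_HANDLE_LENGTH = 4;

// Shortest prefix of `id` (at least 4 chars) that no other open tab shares.
function shortestUniquePrefix(id: string, openTabIds: readonly string[]): string {
  const others = openTabIds.filter((other) => other !== id);
  for (let len = MIN_HANDLE_LENGTH; len < id.length; len++) {
    const prefix = id.slice(0, len);
    if (!others.some((other) => other.startsWith(prefix))) {
      return prefix;
    }
  }
  return id; // no shorter unique prefix exists
}

type Resolution =
  | { kind: "found"; id: string }
  | { kind: "ambiguous"; matches: string[] }
  | { kind: "not-found" };

// Resolve a handle back to a tab, reporting ambiguity instead of guessing.
function resolveTabPrefix(prefix: string, openTabIds: readonly string[]): Resolution {
  const matches = openTabIds.filter((id) => id.startsWith(prefix));
  const [first] = matches;
  if (first === undefined) return { kind: "not-found" };
  if (matches.length > 1) return { kind: "ambiguous", matches };
  return { kind: "found", id: first };
}
```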
|
|
The debug-logger.ts module existed but was completely orphaned — none of
its functions had any callsites, so DISPATCH_DEBUG_LLM=1 did nothing.
Wires it in across the stack:
- llm/debug-logger.ts: add wrapFetchWithLogging() that tees SSE bodies via
TransformStream + response.clone() so we capture every chunk without
draining the body the AI SDK consumes. Redacts authorization / x-api-key
/ cookie headers in logs. Also exports nextDebugSeq() so requests and
log files share an id.
- llm/provider.ts: all 3 factories (Claude OAuth, plain-API-key Anthropic,
OpenAI-compatible) now pass fetch: wrapFetchWithLogging(globalThis.fetch).
For Claude OAuth the wrap goes on the inner base fetch so logged bodies
reflect the post-transform shape + Claude-Code session headers. Added
tabId to ProviderConfig for log labelling.
- agent/agent.ts: threads tabId through createProvider and emits
logAgentLoop / logStepLifecycle / logStreamEvent at every meaningful
point in the run loop — step start/end, tool count, every fullStream
event. All are no-ops when DISPATCH_DEBUG_LLM is unset.
- core/index.ts: re-exports the debug helpers.
- tests/llm/provider.test.ts: switch one full-object equality assertion
to property assertions so the test survives the new fetch: wrapper.
Plumbing the env var into the container required three more fixes:
- bin/up: re-export DISPATCH_DEBUG_LLM* so docker compose forwards them
(compose only forwards vars referenced in the environment: block).
Also pre-creates /tmp/dispatch/llm-debug and chowns it on first run so
the container's UID-1000 bun process can write into it without EACCES.
- docker-compose.yml: declare the debug vars on api.environment and
bind-mount /tmp/dispatch/llm-debug:/tmp/dispatch/llm-debug so logs are
inspectable from the host without docker exec.
- docker/entrypoint.dev.sh: explicitly forward DISPATCH_DEBUG_* through
the 'su -' login-shell barrier — su - resets the environment to TERM/
PATH/HOME/SHELL/USER/LOGNAME only, silently stripping everything else.
This is why the vars appeared via 'docker exec env' (which spawns a
new process inheriting the container env) but were absent from the
actual bun process's /proc/<pid>/environ.
bin/build: drop stray sudo for consistency with bin/up and bin/down.
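The header redaction step mentioned for llm/debug-logger.ts can be sketched as a pure function. redactHeaders and the "[REDACTED]" placeholder are assumptions; only the three header names come from the commit.

```typescript
// Header names whose values must never reach a debug log.
const REDACTED_HEADERS: ReadonlySet<string> = new Set([
  "authorization",
  "x-api-key",
  "cookie",
]);

// Returns a copy that is safe to log; the input object is left untouched so
// the real request still carries its credentials.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    // Header names are case-insensitive, so compare in lower case.
    safe[name] = REDACTED_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return safe;
}
```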
|
|
- agent parameter is now required on summon tool
- new top_level param spawns independent fire-and-forget user agent tabs
- gated by perm_user_agent permission (UI checkbox added)
- agent definition type validation (subagent vs user-agent slug mismatch)
- context-aware error messages when agent slug not found
- read_file_slice added to summon tool's allowed tools enum
- updated and expanded summon tests
|
|
Extended thinking was gated on a hardcoded `model === "claude-opus-4-7"` check,
so newer/other adaptive models (Opus 4.8, Opus/Sonnet 4.6) fell into the classic
`thinking: { type: "enabled" }` branch. Adaptive models default thinking display
to "omitted", so no thinking was streamed — the UI showed nothing for Claude while
DeepSeek (a separate openai-compatible path) worked.
Replace the string check with a pure helper `anthropicThinkingProviderOptions`
that mirrors opencode's transform.ts detection:
- adaptive (`type: "adaptive"`) for Opus 4.7+ (version-parsed) and Opus/Sonnet
4.6 (id substring; handles dash and dot forms);
- `display: "summarized"` ONLY for Opus 4.7+ (they default to omitted and must
be forced); Opus/Sonnet 4.6 stream thinking without it;
- all other Claude models keep classic `enabled` + budgetTokens.
Pure function (no provider/streamText/network), unit-tested directly: Opus 4.8
(the reported bug), Opus 4.7, Sonnet/Opus 4.6, Opus 4.5 + dated Sonnet (enabled),
a future Opus 4.9 (proves version-parse), and effort->budget mapping.
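The detection rules above, sketched as a pure function. This is not the repo's anthropicThinkingProviderOptions: the function name, the returned shape, and the effort-to-budget numbers are all assumptions made for illustration.

```typescript
type Effort = "low" | "medium" | "high";

type ThinkingOptions =
  | { type: "adaptive"; display?: "summarized" }
  | { type: "enabled"; budgetTokens: number };

// Hypothetical budgets; the real effort->budget mapping is not in this log.
const BUDGET_BY_EFFORT: Record<Effort, number> = {
  low: 2048,
  medium: 8192,
  high: 16384,
};

// Parses "opus-4-8" / "opus-4.8". The minor is capped at two digits so a date
// suffix such as "opus-4-20250514" is not mistaken for a version.
function parseOpusVersion(id: string): { major: number; minor: number } | null {
  const m = /opus-(\d+)[-.](\d{1,2})(?!\d)/.exec(id);
  if (!m) return null;
  return { major: Number(m[1]), minor: Number(m[2]) };
}

function sketchThinkingOptions(modelId: string, effort: Effort): ThinkingOptions {
  const id = modelId.toLowerCase();
  const opus = parseOpusVersion(id);
  const isOpus47Plus =
    opus !== null && (opus.major > 4 || (opus.major === 4 && opus.minor >= 7));
  if (isOpus47Plus) {
    // These default to omitted display, so summaries must be forced.
    return { type: "adaptive", display: "summarized" };
  }
  if (/(opus|sonnet)-4[-.]6(?!\d)/.test(id)) {
    return { type: "adaptive" }; // 4.6 streams thinking without display
  }
  return { type: "enabled", budgetTokens: BUDGET_BY_EFFORT[effort] };
}
```

Parsing the version instead of matching one literal id is what lets a future Opus 4.9 take the adaptive branch without a code change.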
|
|
Move all loose root-level .md files (plans, reports, gemini reviews, incident
notes) into a single notes/ directory, and update the doc-reference breadcrumbs in
code comments/test labels to the notes/ path.
Add notes/queue-interrupt-reconcile-edge-cases.md: documents why the
queue/interrupt/turn-sealed reconcile path keeps surfacing edge cases (a catalog of
the four review-pass bugs, the no-loss/no-duplicate invariants, the recommended
membership-based reconcile refactor, and interleaving-test guidance).
|
|
per-chunk eviction
Replace the stored ChatMessage[] with a chunk-native model: tab.chunks (sealed
ChunkRow[]) + tab.live (transient in-flight turn buffer) + derived tab.renderGroups.
This enables per-chunk eviction (trimming WITHIN a large turn) and raw-chunk
pagination (loadOlderChunks), removing the whole-message eviction limitation.
Backend:
- Emit turn-start/turn-sealed around each turn; expose currentTurnId in the status
snapshot. turn-sealed fires after the durable write (status:idle fires before it).
- New GET /tabs/:id/chunks raw paginated endpoint (limit/before).
- Wrap appendChunks in a single SQLite transaction.
Frontend:
- turn-sealed drives a turn-aware reconcile that folds the sealed turn into chunks
while preserving a concurrent newer in-flight turn and pending queued messages;
deferred while the user is scrolled up.
- Stable turn-scoped render keys (${turnId}:${role}:${n}) avoid remount/flash.
Reconcile correctness (three review passes):
- preserve a concurrent newer turn when an earlier deferred reconcile flushes;
- keep optimistic queued user messages (no loss);
- turn-start backfill skips pending queued rows and tags only the turn initiator;
- bind consumed interrupt messages to the in-flight turn so they collapse on seal
(no lingering/duplicated bubble).
Tests: chat-store reconcile/eviction/pagination suite; api chunks endpoint + events.
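Per-chunk eviction and raw-chunk pagination, sketched over a plain array. The trimmed ChunkRow shape, evictOldest, pageBefore, and the reading of `before` as a chunk seq are assumptions; the store's real reconcile logic is considerably more involved.

```typescript
interface ChunkRow {
  seq: number;
  turnId: string;
}

// Keep only the newest `maxChunks` rows. The cut may land inside a turn,
// which whole-message eviction could not do.
function evictOldest(chunks: readonly ChunkRow[], maxChunks: number): ChunkRow[] {
  if (chunks.length <= maxChunks) return [...chunks];
  return chunks.slice(chunks.length - maxChunks);
}

// Mirrors a limit/before page: up to `limit` rows with seq < before,
// returned oldest-first. `before` undefined means "from the newest row".
function pageBefore(
  rows: readonly ChunkRow[],
  before: number | undefined,
  limit: number,
): ChunkRow[] {
  const older = rows
    .filter((row) => before === undefined || row.seq < before)
    .sort((a, b) => a.seq - b.seq);
  return older.slice(Math.max(0, older.length - limit));
}
```

After an eviction, asking for pageBefore(allRows, oldestLoadedSeq, n) returns exactly the rows sitting just above the trimmed window.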
|
|
Replace the message-as-container model with a flat, append-only chunk log.
- chunks table (id, tab_id, seq, turn_id, step, role, type, data_json): one
row per chunk; tool_call (assistant) and tool_result (tool) are SEPARATE
rows linked by callId. Message/turn are derived groupings, not stored.
- chunks/transform.ts: DB-free explode (Chunk[] -> rows) / group (rows ->
messages), shared by backend and the browser frontend.
- Cache fix: toModelMessages segments each turn at tool-batch boundaries into
stable [assistant, tool] pairs per step, so earlier steps serialize
byte-identically across requests (kills the prompt-cache churn).
- agent-manager persists a turn's chunks on seal (once), discarding a failed
fallback attempt's partial chunks; rebuilds agent history from the log.
- GET /messages windows the log by chunk seq then groups; loadMoreMessages
merges a turn split across the window boundary by turnId.
- One-shot migration drops the legacy messages table and clears tabs;
settings/credentials/keys/usage preserved.
Full suite green (317 tests); biome, tsc, and svelte-check clean.
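The "derived grouping" idea can be sketched as follows: rows are stored flat and messages are rebuilt on read. The trimmed ChunkRow shape and groupRows are assumptions; the real chunks/transform.ts is not shown here.

```typescript
interface ChunkRow {
  seq: number;
  turnId: string;
  step: number;
  role: "user" | "assistant" | "tool";
  type: "text" | "tool_call" | "tool_result";
  callId?: string; // links a tool_call row to its tool_result row
}

interface GroupedMessage {
  turnId: string;
  step: number;
  role: ChunkRow["role"];
  chunks: ChunkRow[];
}

// Consecutive rows sharing (turnId, step, role) fold into one message. The
// grouping depends only on the rows, so the same log always yields the same
// messages, which is what keeps earlier steps stable across requests.
function groupRows(rows: readonly ChunkRow[]): GroupedMessage[] {
  const ordered = [...rows].sort((a, b) => a.seq - b.seq);
  const messages: GroupedMessage[] = [];
  for (const row of ordered) {
    const last = messages[messages.length - 1];
    if (
      last !== undefined &&
      last.turnId === row.turnId &&
      last.step === row.step &&
      last.role === row.role
    ) {
      last.chunks.push(row);
    } else {
      messages.push({ turnId: row.turnId, step: row.step, role: row.role, chunks: [row] });
    }
  }
  return messages;
}
```

A step with one text chunk, one tool_call and its tool_result therefore comes back as an [assistant, tool] pair, matching the per-step pairing described above.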
|
|
- send prompt-caching + oauth anthropic-beta headers on the Claude OAuth provider
- restructure the OAuth request body (billing header, identity split, relocate
third-party system prompt to the first user message) to match Claude Code
- apply rolling cache_control breakpoints and group a turn's tool results into a
single role:tool message for correct breakpoint placement
- emit per-step usage events (cache read/write split) and add the Cache Rate
sidebar panel
- dedup byte-identical tool calls within a single batch
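The last bullet can be illustrated with a small sketch. ToolCall, dedupeBatch and the alias map are hypothetical; the commit only states that byte-identical calls within one batch are deduplicated.

```typescript
interface ToolCall {
  callId: string;
  toolName: string;
  argsJson: string; // serialized arguments exactly as the model emitted them
}

interface DedupedBatch {
  unique: ToolCall[];
  // duplicate callId -> callId of the kept call whose result it can reuse
  aliases: Map<string, string>;
}

// Two calls are duplicates only when tool name AND raw argument bytes match.
function dedupeBatch(calls: readonly ToolCall[]): DedupedBatch {
  const firstByKey = new Map<string, ToolCall>();
  const aliases = new Map<string, string>();
  const unique: ToolCall[] = [];
  for (const call of calls) {
    // NUL separator so the name/args boundary cannot be spoofed.
    const key = `${call.toolName}\u0000${call.argsJson}`;
    const kept = firstByKey.get(key);
    if (kept !== undefined) {
      aliases.set(call.callId, kept.callId);
    } else {
      firstByKey.set(key, call);
      unique.push(call);
    }
  }
  return { unique, aliases };
}
```

Comparing the raw serialized arguments rather than parsed values keeps the check strict: the same arguments with different key order or whitespace still run separately.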
|