| Age | Commit message (Collapse) | Author |
|
verbatim before/after -> LogRecord.body (273 tests)
contracts/logging.ts reduced to pure types; createLogger (+ helpers) moved to kernel/src/logging/ — @dispatch/kernel still exports it (host-bin/tool-read-file unaffected).
Span body channel (Option A): Logger.span / Span.child / Span.end accept an optional body string -> SpanOpenRecord.body / SpanCloseRecord.body. Large verbatim payloads now use body, not stringified attributes (store-fat-serve-thin; attributes stay thin/queryable for D9).
before: run-turn emits a 'prompt' span with the verbatim messages+tools in body (small scalars in attrs). after: provider.request span carries the verbatim request in body; attrs thin, auth self-redacted.
Verified: tsc -b clean, 273 tests, biome 0 warnings/0 infos. Live boot: prompt + provider.request bodies present and correlated (shared turnId); request.body no longer in attributes; auth-key leak count = 0.
|
|
+ self-redaction (267 tests)
Threads the step span's correlated logger into provider.stream (new optional ProviderStreamOptions.logger) so provider-openai-compat opens a child provider.request span at the fetch edge, capturing the verbatim post-transform request + response status/cache-tokens/raw-error. Auth header self-redacted in the provider's OWN code (graduated mask tiers; no shared helper). Capture is fail-safe (never throws into the turn). Adds the first hermetic provider HTTP test (stream.test.ts: fetch mocked, 15 cases). Large payloads use attributes for now; the LogRecord.body channel is a deferred ABI design (notes §10).
Verified: tsc -b clean, 267 tests (250->+17), biome 0 warnings/0 infos. Live boot: provider.request shares turnId with prompt:before (before<->after diffable); auth-key leak count = 0 (self-redaction proven on a real request).
|
|
sink (250 tests)
Structured, agent-first logging captured durably to an append-only journal file.
Kernel (contracts/logging.ts): leveled/attributed Logger + Span, auto-scoped per extension (host stamps manifest.id, unspoofable), incremental span records (open/close) for crash-reconstructable traces, injected LogSink (pure record-builder). ctx.log on ToolContract; runTurn opens turn/step/tool-call spans and captures the verbatim pre-mutation prompt (the 'before') on the step span.
journal-sink (new package, bootstrap dep — not an extension): LogSink appending NDJSON to a rotating journal; pure serialize + thin fs edge; fail-safe drop, never blocks a turn. host-bin injects it via HostDeps; session-orchestrator threads host.logger (childed per turn) into runTurn.
Redaction is per-extension self-redaction (no shared helper — isolation over DRY). The out-of-process collector + SQLite store + the verbatim 'after' provider.request capture are Phase B / next (notes/observability-design.md §10/§11).
Verified: tsc -b clean, 250 tests (218→+32), biome clean. Live boot: a turn's journal holds host logs + turn/step spans (open+close) + the prompt:before record with the verbatim messages array.
Harness: ORCHESTRATOR §3 rule-scoping map; .dispatch/rules/isolation-over-dry.md; notes/observability-design.md (design D1–D10 + Phase A/B plan).
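The journal-sink split described above (a pure serializer plus a thin, fail-safe fs edge) could be sketched roughly as follows. The LogRecord fields, createJournalSink, and droppedCount are illustrative assumptions, not the package's actual contract, and rotation is left out.

```typescript
import { appendFileSync } from "node:fs";

// Hypothetical record shape; the real LogRecord contract is not shown in this log.
interface LogRecord {
  ts: string;
  level: "debug" | "info" | "warn" | "error";
  extensionId: string;
  msg: string;
  attrs?: Record<string, string | number | boolean>;
  body?: string;
}

// Pure half: one record becomes exactly one newline-terminated JSON line.
// JSON.stringify escapes embedded newlines, so a record never spans lines.
function serializeRecord(record: LogRecord): string {
  return `${JSON.stringify(record)}\n`;
}

// Thin fs edge: append the line and drop the record on any failure.
// `append` is injectable so the sink can be tested without touching disk.
function createJournalSink(
  path: string,
  append: (path: string, line: string) => void = appendFileSync,
) {
  let dropped = 0;
  return {
    emit(record: LogRecord): void {
      try {
        append(path, serializeRecord(record));
      } catch {
        dropped += 1; // fail-safe drop: never throw into the turn
      }
    },
    droppedCount: (): number => dropped,
  };
}
```

Keeping the serializer pure means the NDJSON format can be unit-tested on its own, while the only code that can fail at runtime is the single append call.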
|
|
consumers (218 tests)
Step 4 of the post-MVP backlog: resolve the last vocab drift. The canonical
term for a thread of turns is `conversationId` (GLOSSARY), but `AgentEvent`
variants and `RunTurnInput` still used the legacy `tabId` from the old frontend
"tab" concept, with session-orchestrator bridging `conversationId → tabId`.
Atomic, type-driven rename across the full 10-file consumer set:
- contracts/events.ts: all 11 AgentEvent variants tabId → conversationId
- contracts/runtime.ts: RunTurnInput.tabId → conversationId
- runtime/{events,run-turn,dispatch}.ts: factory params, ctx field, locals
- session-orchestrator: drop the redundant `tabId: conversationId` bridge line
- transport-http: emit wiring; external /chat field + X-Conversation-Id header
unchanged (already canonical) — only the emitted NDJSON event field flips
- tests (run-turn, app, logic): inputs + assertions now use conversationId
Pure rename, zero behavior change: typecheck clean, 218 tests pass (unchanged
count), biome clean, `grep tabId packages/` → zero matches. Verified live:
multi-turn curl emits conversationId-keyed NDJSON and threads history correctly.
GLOSSARY drift note removed. Closes the post-MVP backlog (Steps 1–4).
|
|
storage-sqlite manifest honesty
host CR-1: createHost.getHostAPI() returns the canonical post-activation HostAPI
(registration closed) via a single builder — host-bin deletes its
buildPostActivationHostAPI duplicate and calls host.getHostAPI().
storage-sqlite CR-2: remove false contributes.services:["storage"] (backend is a
kernel bootstrap dep injected as HostDeps.storageFactory, not a bus service);
document the intentional no-op activate.
typecheck clean, 218 tests pass, biome clean; live boot + curl verified.
|
|
First TOOL extension (standard tier, fs capability). Pure-core/shell split with
workdir containment (realpath symlink guard). host-bin registers it in
CORE_EXTENSIONS; flows into runTurn via session-orchestrator's resolveTools.
Verified: typecheck clean, 214 tests pass (was 185), biome clean. Live curl
against flash produced a real tool-call + tool-result round-trip with correct
final answer. Proves the kernel tool-dispatch loop end-to-end (plan §3.3).
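A minimal sketch of the pure-core/shell split with a realpath symlink guard, assuming POSIX-style paths. isContained and resolveInsideWorkdir are hypothetical names, not the extension's actual exports.

```typescript
import { realpathSync } from "node:fs";
import { isAbsolute, relative, resolve, sep } from "node:path";

// Pure core: is `target` equal to or nested under `root`?
// Both arguments must already be absolute, symlink-free paths.
function isContained(root: string, target: string): boolean {
  const rel = relative(root, target);
  if (rel === "") return true; // target is the root itself
  // Escapes show up as ".." or "../..."; a file named "..foo" is still inside.
  return rel !== ".." && !rel.startsWith(`..${sep}`) && !isAbsolute(rel);
}

// Shell: resolve the request against the workdir, follow symlinks on both
// sides, then apply the pure check. realpathSync throws if the path is missing.
function resolveInsideWorkdir(workdir: string, requested: string): string {
  const realRoot = realpathSync(workdir);
  const realTarget = realpathSync(resolve(realRoot, requested));
  if (!isContained(realRoot, realTarget)) {
    throw new Error(`path escapes workdir: ${requested}`);
  }
  return realTarget;
}
```

Resolving symlinks before the comparison is what stops a link inside the workdir from pointing at a file outside it.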
|
|
- kernel HostAPI: add getAuthProviders()/getAuthProvider(id) read-views (mirrors getProviders)
- provider-openai-compat: activate() resolves creds via host.getAuthProvider("apikey").resolve(); dependsOn auth-apikey; model stays config-driven
- host-bin: mirror the new getters in post-activation HostAPI stub
- auth-apikey is no longer vestigial; auth seam exercised end-to-end
- 185 tests pass; typecheck + biome clean; verified live (curl returns real response)
|
|
ORCHESTRATOR↔plan §5/§3.6, add HANDOFF.md, host-bin reads BACKEND_PORT (24203)
|
|
Bun.serve; full-fidelity wiring (178 tests)
|
|
input tabId/turnId (CR-3); simplify orchestrator wiring (167 tests)
|
|
build graph (164 tests)
|
|
StorageNamespace + pure reconcile (16 tests)
|
|
activate, HostAPI (wraps bus); 50 tests
|
|
- storage-sqlite: bun:sqlite StorageNamespace backend + migrations (21 bun tests)
- auth-apikey: pure resolver from env → ApiKeyCredentials (4 tests)
- provider-openai-compat: OpenAI-compatible SSE stream → ProviderEvents
- orchestrator fixes: provider imports (@dispatch/kernel), missing dep,
exactOptionalPropertyTypes (omit-when-undefined), root tsconfig refs
- vitest excludes storage-sqlite (bun:sqlite); test:bun runs it under bun
|
|
union (resolves runtime CR-1/2/3)
|
|
(eager/semaphore/dedup/concurrencySafe/abort), 16 tests
|
|
stateful shell
|
|
dispatch, hooks, extension/HostAPI, runtime, events)
|
|
stub)
|
|
- Add watchDirConfig() for per-directory config watching
- Register watchers for subdirectories with their own dispatch.toml
- Fix permission ordering: move the "*" wildcard to the front so findLast, which
scans from the end, reaches specific rules first (the old order silently broke
all specific bash permission rules)
- Add comprehensive tests for watcher functionality
- Update mocks in test files
|
|
Gemini review caught a precedence-inversion bug in mergePermissions: when a
nested permission group exists in BOTH global and local configs, the previous
`{ ...existing, ...value }` spread updated an overridden pattern IN PLACE,
leaving its original (global) insertion slot. Since configToRuleset flattens
patterns in iteration order and evaluate() uses findLast (last match wins), a
more-general global pattern declared lower (e.g. "*") would sit AFTER the
local override and silently shadow it.
Example: global bash { "npm test"=allow, "*"=ask } + local bash
{ "npm test"=deny } resolved "npm test" to "ask" instead of "deny".
Fix: drop global patterns the local block also defines, keep remaining global
patterns in order, then append ALL local patterns last — reproducing a clean
"global rules then local rules" concatenation so local always wins. Adds a
regression test asserting order and evaluation outcome.
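The inversion and its fix can be reproduced in a few lines. This is a sketch under assumptions: mergeGroup, matches, and evaluate are simplified stand-ins for mergePermissions, the real pattern matcher, and evaluate(), and the matcher here only understands "*" and exact strings.

```typescript
type Action = "allow" | "ask" | "deny";
type PermissionGroup = Record<string, Action>;

// Fixed merge for ONE permission group: drop global patterns that local also
// defines, keep the remaining global patterns in order, append all local last.
function mergeGroup(globalGroup: PermissionGroup, localGroup: PermissionGroup): PermissionGroup {
  const merged: PermissionGroup = {};
  for (const [pattern, action] of Object.entries(globalGroup)) {
    const overridden = Object.prototype.hasOwnProperty.call(localGroup, pattern);
    if (!overridden) merged[pattern] = action;
  }
  for (const [pattern, action] of Object.entries(localGroup)) {
    merged[pattern] = action;
  }
  return merged;
}

// Toy matcher (assumption): "*" matches everything, anything else must match exactly.
function matches(pattern: string, command: string): boolean {
  return pattern === "*" || pattern === command;
}

// Last matching rule wins, written as a reverse scan (equivalent to findLast).
function evaluate(group: PermissionGroup, command: string): Action | undefined {
  for (const [pattern, action] of Object.entries(group).reverse()) {
    if (matches(pattern, command)) return action;
  }
  return undefined;
}

const globalBash: PermissionGroup = { "npm test": "allow", "*": "ask" };
const localBash: PermissionGroup = { "npm test": "deny" };

// Old behaviour: the spread keeps "npm test" in its global slot, before "*".
const buggy = evaluate({ ...globalBash, ...localBash }, "npm test"); // "ask"
// Fixed behaviour: local patterns come last, so the override wins.
const fixed = evaluate(mergeGroup(globalBash, localBash), "npm test"); // "deny"
```

The sketch relies on JavaScript preserving insertion order for non-integer string keys, which is the same property the original bug depended on.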
|
|
Load an optional global config at ~/.config/dispatch/dispatch.toml
(override via DISPATCH_GLOBAL_CONFIG) and deep-merge it underneath every
project/working-directory dispatch.toml, so machine-wide settings — most
notably globally available LSP servers — work in any repo without per-repo
config. Local always wins on conflicts.
- loader: add getGlobalConfigPath(), loadGlobalConfig(), mergeConfigs();
loadConfig(dir) now loads+merges global. [lsp] and [[keys]] merge by id;
[permissions] merge per-group with global patterns emitted first so local
rules win at evaluation time (findLast). A malformed global config is
downgraded to empty rather than breaking every repo.
- watcher: watch BOTH global and local dispatch.toml so hot-reload re-merges
on either change (dedupes when paths coincide).
- export new loader fns from config/index and core index.
- types/agent-manager: doc updates reflecting merged LSP resolution.
- dispatch.toml: document global-default merge behavior; activate biome and
typescript-language-server LSP entries.
- tests: merge precedence, lsp/keys merge-by-id, permissions merge,
filesystem integration, malformed-global resilience; isolate global path
in existing loader tests.
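For the merge-by-id behaviour, a minimal sketch might look like this. It assumes array-shaped entries such as [[keys]], and it assumes a field-level merge where local fields override global ones; the log only states that merging is by id and that local wins, so mergeById and the entry shape are hypothetical.

```typescript
interface WithId {
  id: string;
}

// Global entries go in first; a local entry with the same id is merged over
// the global one field by field (assumption), otherwise it is appended.
function mergeById<T extends WithId>(globalItems: readonly T[], localItems: readonly T[]): T[] {
  const byId = new Map<string, T>();
  for (const item of globalItems) byId.set(item.id, item);
  for (const item of localItems) {
    const previous = byId.get(item.id);
    byId.set(item.id, previous ? { ...previous, ...item } : item);
  }
  return [...byId.values()];
}
```

For example, a global entry { id: "a", provider: "x" } combined with a local { id: "a", provider: "y" } yields a single entry whose provider is "y", while ids present on only one side pass through unchanged.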
|
|
# Conflicts:
# packages/api/src/agent-manager.ts
# packages/api/tests/agent-manager.test.ts
# packages/frontend/src/lib/tabs.svelte.ts
|
|
Root cause of the 'first warmup misses' + 'switch to chat misses' bugs:
Anthropic keys the MESSAGE-level prompt cache on `tool_choice` AND the
extended-thinking parameters (both rows in their cache-invalidation table mark
the messages cache as invalidated on change). The original warmCache() sent
toolChoice:'none' and NO thinking providerOptions, while real turns send
toolChoice:'auto' + thinking config for the effort. So warming and chat wrote
TWO different message-cache buckets:
- warmup #1 missed (no warm-only bucket existed yet), every later warmup hit
its own bucket;
- the next real chat message read the OTHER bucket → miss.
Fix: extract a shared buildStreamOptions() that produces the cache-affecting
params (toolChoice + thinking providerOptions + maxOutputTokens). Both run()
and warmCache() now call it with the SAME resolved reasoning effort, so the
warming replay refreshes the exact cache the next real message reads. The
trivial probe turn is still appended AFTER the last cache breakpoint, so it
never disturbs the cached prefix.
Threaded the per-tab reasoning effort (per-model -> per-tab selector -> default,
mirroring processMessage) from the frontend resolver through POST /chat/warm to
warmCacheForTab to warmCache.
Tests: updated the warmCache toolChoice test to assert it MATCHES a real turn,
added an invariant test driving run() and warmCache() and asserting identical
cache-affecting params, and assert effort forwarding in the frontend store.
check / test (780) / frontend build / typecheck all green.
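The invariant behind the fix (both paths derive their cache-affecting params from one builder) can be sketched as below. The option shape, the effort-to-budget table, and the buildRunRequest/buildWarmRequest wrappers are assumptions for illustration; only the buildStreamOptions name and the three cache-affecting fields come from the commit.

```typescript
type Effort = "low" | "medium" | "high";

interface CacheAffectingOptions {
  toolChoice: "auto";
  maxOutputTokens: number;
  providerOptions: { anthropic: { thinking: { type: "enabled"; budgetTokens: number } } };
}

// Hypothetical effort -> thinking-budget table; the real values are not in the log.
const THINKING_BUDGET: Record<Effort, number> = { low: 1024, medium: 8192, high: 24576 };

// Single source of truth for every parameter that keys the message cache.
function buildStreamOptions(effort: Effort): CacheAffectingOptions {
  return {
    toolChoice: "auto",
    maxOutputTokens: 32000, // hypothetical value
    providerOptions: {
      anthropic: { thinking: { type: "enabled", budgetTokens: THINKING_BUDGET[effort] } },
    },
  };
}

interface Message {
  role: "user" | "assistant";
  content: string;
}

// Real turn: history plus the shared options.
function buildRunRequest(history: Message[], effort: Effort) {
  return { messages: history, ...buildStreamOptions(effort) };
}

// Warming replay: same options, with a throwaway probe appended after the prefix.
function buildWarmRequest(history: Message[], effort: Effort) {
  const probe: Message = { role: "user", content: "reply with just a ." };
  return { messages: [...history, probe], ...buildStreamOptions(effort) };
}
```

Because neither request builder sets toolChoice or thinking options itself, the two paths cannot drift into separate cache buckets again without a change to the shared builder.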
|
|
Keep a tab's provider prompt-cache warm while idle by periodically replaying
the exact cached conversation prefix plus a single trivial throwaway turn,
resetting the provider's ~5-min cache TTL so the user's next real message hits
a warm cache.
Backend:
- Agent.warmCache(history): builds its context via buildLlmContext() (extracted so run() shares it), then
re-sends the identical system+tools+history prefix (same Anthropic
cache_control breakpoints) plus a 'reply with just a .' probe turn via
toolChoice:none. Returns the request usage; mutates no history, emits/persists
nothing.
- AgentManager.warmCacheForTab(): resolves the same agent the next real turn
would use, replays the FULL genuine history, refuses while a turn is running.
- POST /chat/warm: returns ONLY the warming request's usage (never persisted,
never folded into the real usage aggregate).
Frontend:
- cache-warming.svelte.ts store: per-tab 4-min repeating idle timer with
countdown, warming-specific last-request cache %, and error capture. Arms on
turn end, pauses during a turn, disables+resets on a real user message.
- cache-warm-storage.ts: per-tab localStorage persistence of the toggle.
- Lifecycle hooks wired into tabs.svelte.ts (status/statuses/sendMessage/
hydrate/create/open/close).
- ModelSelector: bottom-of-panel checkbox + debug strip (last-% / countdown /
error), shown only when enabled. Warming cache data never touches the real
Cache Rate view.
Tests: core warmCache (5), api warm route (3) + warmCacheForTab (3), frontend
store (12) + storage (10). check / test (779) / frontend build / typecheck all
green.
|
|
# Conflicts:
# packages/frontend/src/lib/components/ChatInput.svelte
|
|
Summarize a conversation's older "head" into a structured anchored Markdown
summary while preserving the most recent turns verbatim, shrinking context size
while keeping the information needed to continue coherently. Triggered by a
"Compact conversation" button in Chat Settings (not an agent tool).
Approach informed by OpenCode's session/compaction.ts:
- Ported SUMMARY_TEMPLATE (Goal / Constraints / Progress / Key Decisions /
Next Steps / Critical Context / Relevant Files) and the anchored-summary
buildPrompt (re-summarizes a prior summary when present).
- Ported the TOOL_OUTPUT_MAX_CHARS (2000) cap on tool results in the summary
request.
- Simplified tail selection to a fixed recent-turn count (DEFAULT_TAIL_TURNS=2)
instead of OpenCode's token-budget splitTurn.
core:
- New src/compaction/ module (pure, DB-free): template, prompt builder,
head/tail selection, transcript renderer with tool-output capping, prior
summary extraction. Generic over ChatMessage so callers keep turnId/seq.
- db/chunks.ts: rekeyChunks(from,to) relocates a tab's full history to a
backup tab (reversible — nothing is deleted).
- AgentEvent: compaction-started / -complete / -error variants.
api:
- AgentManager.compactTab(tempTabId, sourceTabId): side-effect-free
resolveConnection() for the compactor model (configured compaction_model_*,
else the source tab's own key+model), one-shot tool-less summary generation
via a transient Agent, then relocate full history to a fresh backup tab and
re-seed the canonical source id with [summary turn + preserved tail]. Source
tab is locked (messages queue) during the run; queue drains afterward.
- Routes: POST /tabs/:id/compact, GET/PUT /tabs/settings/compaction-model.
frontend:
- "Compact conversation" button in ModelSelector (Chat Settings), between
Working Directory and the agent toggle; idle-gated.
- Compaction-model key+model selector in Settings, beside the title model.
- Transient placeholder tab shows a large, non-faded "Please wait, compacting
conversation…" screen; closing it cancels. Source input locked while running.
- Handle compaction-* events: reload compacted source, insert backup tab,
refocus source, discard placeholder.
tests: core compaction unit tests, rekeyChunks DB test, AgentManager.compactTab
orchestration tests, and compaction route tests. All green (713 tests), biome
clean, all typechecks pass, frontend builds.
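The fixed-count tail selection and the tool-output cap could be sketched as follows. The turn boundary (a turn starts at each user message), the truncation marker, and the function names are assumptions; DEFAULT_TAIL_TURNS and TOOL_OUTPUT_MAX_CHARS use the values quoted in the commit.

```typescript
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
}

const DEFAULT_TAIL_TURNS = 2;
const TOOL_OUTPUT_MAX_CHARS = 2000;

// Split into an older "head" (to be summarized) and a verbatim "tail" made of
// the last `tailTurns` turns. Generic so callers keep extra fields on messages.
function splitHeadTail<T extends ChatMessage>(
  messages: readonly T[],
  tailTurns: number = DEFAULT_TAIL_TURNS,
): { head: T[]; tail: T[] } {
  if (tailTurns <= 0) return { head: [...messages], tail: [] };
  const turnStarts: number[] = [];
  messages.forEach((message, index) => {
    if (message.role === "user") turnStarts.push(index);
  });
  // Not enough turns to compact: everything stays verbatim.
  if (turnStarts.length <= tailTurns) return { head: [], tail: [...messages] };
  const cut = turnStarts[turnStarts.length - tailTurns] ?? 0;
  return { head: messages.slice(0, cut), tail: messages.slice(cut) };
}

// Cap a tool result before it goes into the summary request.
function capToolOutput(text: string, max: number = TOOL_OUTPUT_MAX_CHARS): string {
  return text.length <= max ? text : `${text.slice(0, max)}\n[truncated]`;
}
```

A fixed turn count is simpler to reason about than a token budget, at the cost of occasionally keeping a very large recent turn verbatim.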
|
|
Use DaisyUI's status-with-ping pattern on the tab status dot so it pings
when the agent has stopped and is likely waiting on the user:
- idle with incomplete (pending/in_progress) tasks remaining, or
- stopped due to an error.
Implements wishlist item #21.
|
|
Add multimodal image/PDF input to the chat box via clipboard paste, gated by a
graceful per-model capability check.
UX: a pasted image/PDF inserts an inline token (【image:…】 / 【pdf:…】) into the
draft, so attachments have ORDER relative to typed text and can be referenced
positionally. The token is the only handle — deleting it (atomic Backspace/
Delete, or selection overlap) detaches the file; an input-reconciliation safety
net detaches any attachment whose token is no longer intact. No preview strip.
Capability check: resolveModelCapabilities reads models.dev modalities.input
(new GET /models/capabilities, mirrors /context-limit). The input blocks Send
(no tokens spent) only on a definitive 'no'; unknown capability (catalog offline
/ unmapped provider) stays permissive. Attachments require a fresh turn — Send is
blocked while generating and /chat rejects content mid-turn (409).
Attachments are EPHEMERAL: forwarded to the model for the turn via ordered AI SDK
ImagePart/FilePart content, but never persisted (history keeps the text with
[image]/[pdf] markers). Text-only turns serialize byte-identically to before.
Limits (Anthropic-aligned, enforced at paste + re-validated server-side):
PNG/JPEG/WebP/GIF/PDF; image ≤5MB, PDF ≤32MB, ≤20 attachments, ≤32MB total.
core: UserContentPart types, models/attachments validator, capability resolver,
agent.run+toModelMessages thread ordered content. api: /chat content validation +
passthrough. frontend: attachment-tokens helper, ChatInput paste/token/gating,
per-tab staged attachments, App.svelte capability fetch. +44 tests.
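The limits listed above lend themselves to one pure validator that both the paste handler and the server can call. This is a sketch: validateAttachments, the Attachment shape, the error strings, and the choice of binary megabytes are assumptions; the type list and numeric limits are taken from the commit.

```typescript
const MB = 1024 * 1024; // assumption: limits are in binary megabytes
const IMAGE_TYPES = new Set(["image/png", "image/jpeg", "image/webp", "image/gif"]);
const PDF_TYPE = "application/pdf";
const LIMITS = { imageBytes: 5 * MB, pdfBytes: 32 * MB, maxCount: 20, totalBytes: 32 * MB };

interface Attachment {
  mediaType: string;
  sizeBytes: number;
}

// Returns the first violation as a message, or null when the set is acceptable.
function validateAttachments(items: readonly Attachment[]): string | null {
  if (items.length > LIMITS.maxCount) {
    return `too many attachments (max ${LIMITS.maxCount})`;
  }
  let total = 0;
  for (const item of items) {
    const isImage = IMAGE_TYPES.has(item.mediaType);
    if (!isImage && item.mediaType !== PDF_TYPE) {
      return `unsupported type: ${item.mediaType}`;
    }
    const cap = isImage ? LIMITS.imageBytes : LIMITS.pdfBytes;
    if (item.sizeBytes > cap) {
      return `${item.mediaType} exceeds ${cap / MB}MB`;
    }
    total += item.sizeBytes;
  }
  return total > LIMITS.totalBytes ? `attachments exceed ${LIMITS.totalBytes / MB}MB total` : null;
}
```

Sharing one validator keeps the paste-time check and the server-side re-validation from disagreeing about what is allowed.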
|
|
Adds an agent-callable `key_usage` tool that reports current usage for
configured API keys so the agent can pick a key with headroom, warn before
hitting a rate limit, and diagnose exhausted-key failures.
Per key it reports: provider, active/exhausted status (with last error +
when it was exhausted), remaining rate-limit headroom and reset timestamp per
window (5-hour, weekly, and monthly where the provider exposes it), and
whether the figures are live or served from cache (with the cache's
last-fetched-from-source time). Supports anthropic and opencode-go keys
(live with cache fallback for anthropic; live scrape for opencode-go).
Optional `key_id` reports one key; omitted reports all.
Hard permission gate `perm_key_usage` (default off): when disabled the tool
is completely removed from the toolset/context. Registered in both the
parent permission-gated path and the child whitelist path, advertised in the
system prompt (TOOL_DESCRIPTIONS), grantable to subagents via the summon
enum, and exposed as a frontend tool-permission checkbox.
To report data freshness, claude.ts gains `getAccountUsageWithSource` +
`ClaudeUsageResult` (live vs cache + cachedAt from usage_cache.cached_at);
the existing `getAccountUsage` now delegates to it, preserving behavior.
Tests: core key-usage tool suite (windows, %-conversion, freshness, exhausted
status, unsupported/unavailable, filtering) + agent-manager perm-gate test.
|
|
# Conflicts:
# packages/api/src/agent-manager.ts
# packages/api/tests/agent-manager.test.ts
# packages/frontend/src/lib/components/ToolPermissions.svelte
# packages/frontend/src/lib/settings.svelte.ts
|
|
Address the remaining real defects from the Luau/cs test reports. The wrapper-
level findings (dash-leading queries, context flag, no-match message, empty
query, path-is-file) were already fixed in earlier commits and verified through
the tool; the two genuinely-open items were engine-level, plus a test-coverage
gap (the patch-dependent behaviors were only exercised by live tests that
skip without a cs binary).
- Engine fix (docker/cs/fuzzy-distance.patch): cs's fuzzy `term~N` only scanned
same-length windows, so it matched substitutions but never mid-word
insertions/deletions — e.g. `computSlipAngle~1` (a dropped 'e') failed to find
`computeSlipAngle`, contradicting cs's own "within 1 or 2 distance" docs.
Now scan windows of length termLen±maxDist (true Levenshtein) and keep the
best per offset. Updates one pre-existing cs test that encoded the buggy
substitution-only behaviour and adds mid-word insert/delete cases. Passes
cs's pkg/search + pkg/ranker suites; builds clean against the pinned commit.
- Provisioning: apply the new patch everywhere the Luau patch is applied —
Dockerfile, Dockerfile.dev, packaging/PKGBUILD build() — so every install
path (Docker bin/up, native code-search package via bin/service install)
ships both patches.
- Tests: add skip-gated live tests for Luau declaration detection (function /
local function / type / export type), only=usages exclusion, the Luau
language tag, and fuzzy mid-word matching. New capability probes
(findLuauCapableCs / findFuzzyCapableCs) run these only on a cs that actually
has each patch and skip (never fail) on an unpatched/absent binary. Default
suite: 600 pass / 12 skip; with a both-patched cs: 612 pass / 0 skip.
- Docs: UPSTREAM_CS_FUZZY_BUG.md documents the unreported upstream defect for a
potential boyter/cs PR; CS_ARTIX_DEPLOY.md updated to reflect that
sync-dispatch.sh now ships the code-search package (carrying both patches).
biome + tsc (core/api/frontend) + svelte-check all green.
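The engine fix itself lives in a Go patch; the idea can be illustrated in TypeScript as below. This is not the patch's code: levenshtein and fuzzyMatches are illustrative names, and the sketch only shows why scanning windows of length termLen±maxDist catches mid-word insertions and deletions that a same-length scan misses.

```typescript
// Classic two-row Levenshtein distance (insert, delete, substitute all cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

interface FuzzyMatch {
  offset: number;
  length: number;
  distance: number;
}

// For each offset, try every window length in [termLen - maxDist, termLen + maxDist]
// and keep the best (lowest-distance) window at that offset.
function fuzzyMatches(text: string, term: string, maxDist: number): FuzzyMatch[] {
  const matches: FuzzyMatch[] = [];
  const minLen = Math.max(1, term.length - maxDist);
  const maxLen = term.length + maxDist;
  for (let offset = 0; offset < text.length; offset++) {
    let best: FuzzyMatch | null = null;
    for (let len = minLen; len <= maxLen && offset + len <= text.length; len++) {
      const distance = levenshtein(term, text.slice(offset, offset + len));
      if (distance <= maxDist && (best === null || distance < best.distance)) {
        best = { offset, length: len, distance };
      }
    }
    if (best !== null) matches.push(best);
  }
  return matches;
}
```

With the term `computSlipAngle` and maxDist 1, the only window within distance 1 of `computeSlipAngle` is the 16-character one, which a same-length (15-character) scan never examines.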
|
|
The wake probe was hardcoded to claude-3-5-haiku-20241022, which the
endpoint no longer serves (HTTP 404), exhausting the retry loop. Now the
probe fetches the live model list via fetchAnthropicModels (falling back
to ANTHROPIC_MODELS_FALLBACK if empty) and selects the current Haiku via
a new pure selectHaikuModel() helper (first case-insensitive 'haiku'
substring match; newest-first ordering). No-match surfaces a clear
per-account error instead of crashing.
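The selection rule is small enough to show directly. A sketch, assuming the helper takes a plain list of model ids that is already ordered newest-first; the real signature may differ.

```typescript
// First case-insensitive "haiku" substring match wins. Because the list is
// assumed to be newest-first, that match is the current Haiku model.
function selectHaikuModel(modelIds: readonly string[]): string | undefined {
  return modelIds.find((id) => id.toLowerCase().includes("haiku"));
}
```

Returning undefined on no match leaves it to the caller to surface the per-account error instead of throwing from the pure helper.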
|
|
Add 'LSP queries' to the ToolPermissions UI (disabled by default, matching
the backend's perm_lsp gating). The existing generic settings route
(/tabs/settings/perm_lsp) and agent-manager invalidation handle persistence
and agent rebuild automatically — only the frontend store default and the
permission entry needed wiring.
|
|
The API server bound a fixed port (3000) and died with EADDRINUSE when it
was taken — painful when running multiple dispatch instances (e.g. testing
several feature branches at once). Replace the static default export with an
explicit Bun.serve retry loop that increments the port by one on EADDRINUSE,
from START_PORT (PORT env or 3000) up to MAX_PORT (3010), logging the chosen
port and a hint to repoint the frontend's API URL on a bump.
Guarded by import.meta.main so importing the module (for `app`) never binds a
port. Frontend unchanged — set its API URL manually when a bump occurs.
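The retry loop can be sketched with the bind step injected, which keeps the example runnable without opening a socket. serveWithRetry and isAddrInUse are hypothetical names; START_PORT and MAX_PORT mirror the values in the commit.

```typescript
const START_PORT = 3000;
const MAX_PORT = 3010;

// True when the error carries the EADDRINUSE code.
function isAddrInUse(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { code?: unknown }).code === "EADDRINUSE";
}

// Try each port in [start, max]; move to the next one only on EADDRINUSE.
// Any other error is rethrown so real failures are not masked.
function serveWithRetry<T>(
  bind: (port: number) => T,
  start: number = START_PORT,
  max: number = MAX_PORT,
): { server: T; port: number } {
  for (let port = start; port <= max; port++) {
    try {
      return { server: bind(port), port };
    } catch (err) {
      if (!isAddrInUse(err)) throw err;
    }
  }
  throw new Error(`no free port in ${start}-${max}`);
}
```

In the server entry point, bind would presumably wrap Bun.serve with the app's fetch handler and run only under the import.meta.main guard, so importing the module for `app` still binds nothing.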
|
|
Address findings from a second independent (Gemini) review covering the tool
and the packaging:
- Robustness (was: crash): non-string params from a model hallucination (e.g.
include_ext: ["ts","go"]) threw 'x.trim is not a function' and killed the
tool call. Add an asString() coercion for all string params (query, path,
include_ext, exclude_pattern, only); non-string values are now ignored or
yield the graceful 'query is required' error.
- Output bound: cap each rendered snippet line at 500 chars (MAX_LINE_CHARS,
mirrors read-file.ts) so a matched minified/generated line can't bloat the
payload. (Total output is already bounded by the universal truncator.)
- packaging/PKGBUILD: make the cs clone rerun-safe (rm -rf before clone) so
makepkg -e / repeat runs don't abort on 'destination path already exists';
add conflicts=('cs') to the code-search package for a clean pacman error vs.
the unrelated AUR 'cs' that also owns /usr/bin/cs (no provides — different
program).
Not changed (verified): path containment, the -- flag-injection guard, and the
deterministic pinned Docker build were all confirmed solid by the review.
Tests: +2 (wrong-type params don't crash; long-line truncation). Full suite
605 pass, biome + tsc green.
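The two hardening steps can be sketched as small pure helpers. asString and MAX_LINE_CHARS are named in the commit; capLine, the "..." marker, and the exact return shapes are assumptions.

```typescript
const MAX_LINE_CHARS = 500;

// Coerce an untrusted tool parameter: anything that is not a string becomes
// undefined, so later .trim() calls can never throw on a hallucinated array.
function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Cap a rendered snippet line so one minified line cannot bloat the payload.
function capLine(line: string, max: number = MAX_LINE_CHARS): string {
  return line.length > max ? `${line.slice(0, max)}...` : line;
}

// Example: a hallucinated array for include_ext is ignored instead of crashing.
const includeExt = (asString(["ts", "go"]) ?? "").trim(); // ""
```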
|
|
Add Language Server Protocol integration modeled on opencode's, wired for
this codebase's plain-TypeScript tool/agent architecture.
Core (@dispatch/core):
- lsp/client.ts: LSP/JSON-RPC client over stdio (vscode-jsonrpc) with the
initialize handshake, didOpen/didChange sync, push + pull diagnostics
(textDocument/diagnostic, workspace/diagnostic), and a generic request()
passthrough for hover/definition/references/documentSymbol.
- lsp/server.ts: resolves dispatch.toml [lsp] entries into spawn specs.
Config-driven only — no builtin registry, no auto-download.
- lsp/manager.ts: process-wide LspManager owning client lifecycles, keyed
by root+serverID, lazy spawn + reuse + graceful shutdown.
- lsp/language.ts: extension->languageId map incl. .luau -> "luau".
- lsp/diagnostic.ts: error-only <diagnostics> block formatting (1-based).
- tools/lsp.ts: on-demand 'lsp' tool (1-based coords -> 0-based wire).
- write-file.ts: optional onAfterWrite hook for diagnostics-on-write.
- config schema: validate [lsp] block; DispatchConfig.lsp + LspServerConfig.
API (@dispatch/api):
- AgentManager owns one LspManager; per-working-directory server cache
cleared on config reload; diagnostics appended to write_file results;
'lsp' tool gated by new perm_lsp setting; shutdownAll on destroy().
Config:
- dispatch.toml: documented, commented [lsp.luau-lsp] Roblox example.
Tests: fake-lsp-server fixture + client/manager/server/diagnostic/schema/
tool/write-hook suites, plus an opt-in real-binary luau-lsp smoke test
(auto-skipped when luau-lsp is absent). 652 pass; biome + 3 typechecks green.
|
|
Address bugs found by an end-to-end test of the tool:
- HIGH: prose/text files (.md/.html/etc.) came back as bare headers with no
snippet. cs's default 'auto' snippet mode emits a single 'content' string
(no 'lines[]') for prose, which the renderer skipped. Force
--snippet-mode=lines by default so every file type returns a lines[] window
that renders. Also add a defensive 'content'-shape fallback in formatResults
(+ widen the CsResult type) so a content result is never shown blank.
- HIGH: the 'context' parameter was a no-op — cs ignores -C except in grep
snippet mode. When context is supplied, switch to --snippet-mode=grep so -C
actually widens the per-match window (verified 2 -> 26 lines); default
(no context) keeps the richer lines window for code.
- LOW: a 'path' pointing at a file (not a dir) silently returned 'No matches
found' (cs --dir <file> => null). Now stat the path and return an
explanatory error (file vs nonexistent), pointing at read_file for a file.
- MEDIUM/doc: clarify snippet_length (prose-mostly) and context descriptions.
Tests: +5 (prose rendering live + stubbed content-shape; context widening;
path-is-file; path-nonexistent). Full suite 603 pass, biome + tsc green.
Note: the EACCES spill failure seen in testing is pre-existing platform
infra (truncate.ts SPILL_ROOT, shared by all tools), not part of this tool.
|
|
Address findings from an independent code review of the search_code tool:
- Critical: cs failures (non-zero exit, or SIGTERM from the spawn timeout)
were swallowed and reported to the model as 'No matches found', discarding
stderr. Now capture exit code + signal from the 'close' event and return a real
"Error: ..." result (timeout message for SIGTERM, exit code + stderr otherwise). cs
exits 0 on a genuine no-match, so that path still reports correctly.
- High: a query beginning with '-' (e.g. '-foo') was parsed by cs as a
(usually invalid) flag. Insert a '--' separator before the query so it is
always treated as the positional search term.
- Low: relative-path display fallback now matches the workdir only at a path
boundary, so a sibling dir sharing the prefix (e.g. /app vs /app-secrets)
isn't rendered as a '../app-secrets/...' path.
Adds tests for the non-zero-exit (stderr surfaced, not 'No matches') and
dash-leading-query cases. All tests (598), biome, and tsc pass.
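A sketch of the two main fixes: the `--` separator and the exit classification. buildCsArgs and describeFailure are hypothetical helpers and the error wording is invented; the `-f json` and `--dir` flags are the ones this log mentions.

```typescript
// Everything after "--" is positional, so a query such as "-foo" is never
// parsed as a flag.
function buildCsArgs(query: string, dir: string): string[] {
  return ["-f", "json", "--dir", dir, "--", query];
}

// Map the child's exit state to an error message, or null on success.
// Exit code 0 covers a genuine no-match, so only that path may report
// "No matches found".
function describeFailure(code: number | null, signal: string | null, stderr: string): string | null {
  if (signal === "SIGTERM") return "Error: search timed out";
  if (code !== 0) {
    const detail = stderr.trim() || "(no stderr)";
    return `Error: cs exited with code ${code}: ${detail}`;
  }
  return null;
}
```

The code and signal would come from the child process's 'close' event; keeping the classification pure makes the failure paths testable without spawning cs.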
|
|
Add a dedicated, permission-gated search_code tool that wraps boyter/cs
(code spelunker) — a fast, relevance-ranked, structure-aware code search
engine — giving agents a better default than grep/find for exploratory
'where is X / how does Y work' searches (ranked results, snippets, ~5x
smaller payloads).
- packages/core/src/tools/search-code.ts: createSearchCodeTool factory;
-f json invocation, workdir path containment, graceful missing-binary
handling (DISPATCH_CS_BIN override), readable per-file formatted output.
- Wire-up: export from core; register in agent-manager (both child-whitelist
and parent perm paths) behind new perm_search_code; add to summon catalog
+ tools enum; frontend ToolPermissions + settings.
- Docker: build a patched, statically-linked cs (pinned v3.1.0 commit) in a
golang builder stage and bundle at /usr/local/bin/cs.
- docker/cs/luau-declarations.patch: additive Luau declaration table so
--only-declarations / definition ranking works for Roblox .luau files
(upstream has Lua but not Luau). Applied during the Docker build.
- Tests: new search-code.test.ts (stubbed JSON formatting + live-cs
integration, skipped when cs absent); agent-manager/routes mocks +
perm-gating assertions; loader pass-through.
All tests (596), biome, and tsc (core/api/frontend) pass. cs-builder Docker
stage verified to build and produce a working patched binary.
|
|
# Conflicts:
# packages/api/tests/agent-manager.test.ts
|
|
Apply tab-active so the pinned + button uses the same raised base-100
fill, border and top-corner treatment as a selected tab, and drop its
left border (!border-l-0) so it sits flush against the bar's edge.
Keep !rounded-ss-none for the square top-left corner.
|
|
'arrives on its own')
Matches actual behavior: a peer's reply wakes this tab with a new message in a
later turn. Updated the send_to_tab description (both canReadTab branches), the
delivery-result text (both branches), and the system-prompt one-liner; updated
the test assertion accordingly.
|