| -rw-r--r-- | cc/01-tool-schema-analysis.md | 62 |
| -rw-r--r-- | cc/02-anthropic-tool-format.md | 90 |
| -rw-r--r-- | cc/03-recommendations.md | 116 |
| -rw-r--r-- | cc/04-schema-debug.md | 40 |
| -rw-r--r-- | cc/README.md | 38 |
| -rw-r--r-- | packages/api/src/agent-manager.ts | 141 |
| -rw-r--r-- | packages/core/src/agent/agent.ts | 33 |
| -rw-r--r-- | packages/core/src/agents/index.ts | 12 |
| -rw-r--r-- | packages/core/src/agents/loader.ts | 75 |
| -rw-r--r-- | packages/core/src/index.ts | 19 |
| -rw-r--r-- | packages/core/src/tools/registry.ts | 39 |
| -rw-r--r-- | packages/core/src/tools/summon.ts | 127 |
| -rw-r--r-- | packages/core/tests/agents/loader.test.ts | 132 |
| -rw-r--r-- | packages/core/tests/tools/summon.test.ts | 137 |
| -rw-r--r-- | packaging/PKGBUILD | 2 |
| -rw-r--r-- | wishlist.md | 5 |
16 files changed, 1020 insertions, 48 deletions
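Reviewer's sketch, not part of the commit: the change to `packages/core/src/agents/loader.ts` and `packages/api/src/agent-manager.ts` expands an agent definition's short permission groups (`read`, `edit`, `bash`) into concrete tool names, always adds `todo`, and then intersects the result with the parent's allowed tools so a summoned child cannot gain capabilities. The block below is a minimal, self-contained approximation of that flow. The names `GROUP_ALIASES`, `expandToolNames`, and `resolveChildTools` are hypothetical and chosen for illustration; the committed functions are `expandAgentToolNames` and the logic inside `spawnChildAgent`.

```typescript
// Illustrative sketch only; helper names are assumptions, not the commit's API.

// Permission-group aliases and the concrete tools they stand for,
// as described in the loader.ts hunk of the diff.
const GROUP_ALIASES = new Map<string, string[]>([
  ["read", ["read_file", "read_file_slice", "list_files"]],
  ["edit", ["write_file"]],
  ["bash", ["run_shell"]],
]);

// Expand group aliases; pass any other tool name through unchanged.
// `todo` is always included, matching the behaviour the diff documents.
function expandToolNames(tools: string[]): string[] {
  const expanded = new Set<string>();
  for (const t of tools) {
    const group = GROUP_ALIASES.get(t);
    if (group) {
      for (const name of group) expanded.add(name);
    } else {
      expanded.add(t);
    }
  }
  expanded.add("todo");
  return Array.from(expanded);
}

// Intersect the expanded list with the parent's allowed tools (if known),
// so a child never ends up with a tool its parent lacks.
function resolveChildTools(requested: string[], parentAllowed?: Set<string>): string[] {
  const base = expandToolNames(requested);
  return parentAllowed ? base.filter((t) => parentAllowed.has(t)) : base;
}

// A parent without shell access summons a definition that asks for it.
const parentAllowed = new Set(["read_file", "list_files", "todo"]);
console.log(resolveChildTools(["read", "bash"], parentAllowed));
// prints [ 'read_file', 'list_files', 'todo' ]
```

In this sketch `run_shell` and `read_file_slice` are dropped because the parent does not hold them, which is the privilege-escalation guard the diff's comments describe.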
diff --git a/cc/01-tool-schema-analysis.md b/cc/01-tool-schema-analysis.md new file mode 100644 index 0000000..d803b2c --- /dev/null +++ b/cc/01-tool-schema-analysis.md @@ -0,0 +1,62 @@ +# Tool Schema Analysis — Dispatch / AI SDK v6 + +## How Dispatch Defines Tools + +Dispatch uses the **AI SDK v6** (`ai@^6.0.191`) with provider adapters (`@ai-sdk/anthropic@^3.0.79`, `@ai-sdk/openai-compatible@^2.0.48`). + +### The conversion pipeline + +``` +Zod schema (z.object({...})) + → zodToJsonSchema() → JSON Schema Draft 7 + → jsonSchema() → AI SDK v6 inputSchema wrapper + → tool() → AI SDK v6 Tool object (no execute fn) + → streamText({tools}) → SDK → provider adapter → wire format +``` + +**Critical file:** `packages/core/src/tools/registry.ts` — the `toAISDKTool()` function. + +### What each tool's schema looks like after conversion + +For a typical tool like `read_file`: +```json +{ + "type": "object", + "properties": { + "path": { "type": "string", "description": "Path to the file..." }, + "offset": { "type": "integer", "minimum": 1, "description": "..." }, + "limit": { "type": "integer", "minimum": 1, "description": "..." } + }, + "required": ["path"], + "additionalProperties": false, + "$schema": "http://json-schema.org/draft-07/schema#" +} +``` + +### The Problem Surface + +The `zodToJsonSchema()` library (v3.x) generates full Draft 7 JSON Schema including: + +1. **`$schema` field** — `"http://json-schema.org/draft-07/schema#"` — Anthropic's API rejects this or silently ignores it depending on version +2. **`additionalProperties: false`** — Anthropic may not handle this correctly; Claude's `tool_use` blocks sometimes include extra fields +3. **`default` values** — Zod schemas with `.default()` produce JSON Schema `default` fields that Anthropic doesn't support +4. 
**`minimum`/`maximum`** — Numeric constraints may cause issues if Claude passes values slightly outside bounds + +### What the oh-my-pi reference does differently + +The `references/oh-my-pi/packages/ai/src/utils/schema/normalize.ts` shows a `normalizeSchemaForCCA()` function that strips all of the above. Dispatch does **no normalization** before passing schemas to the AI SDK. + +### What the AI SDK's @ai-sdk/anthropic adapter does + +The adapter should convert the AI SDK's internal format to Anthropic's wire format. However, the adapter may: +- Pass through JSON Schema fields that Anthropic doesn't support (like `$schema`, `additionalProperties`) +- Not add parameter `description` fields from the schema to the Anthropic format +- Not handle `default` values correctly + +### Files to inspect + +| File | What to check | +|------|---------------| +| `node_modules/@ai-sdk/anthropic/dist/index.mjs` | How tools are serialized for the wire | +| `node_modules/ai/dist/index.mjs` | How `tool()` and `jsonSchema()` work | +| `node_modules/zod-to-json-schema/dist/` | What `zodToJsonSchema` outputs for our schemas | diff --git a/cc/02-anthropic-tool-format.md b/cc/02-anthropic-tool-format.md new file mode 100644 index 0000000..3b2888c --- /dev/null +++ b/cc/02-anthropic-tool-format.md @@ -0,0 +1,90 @@ +# Anthropic Tool Format — What Claude Actually Expects + +## The Correct Anthropic API Tool Format + +```json +{ + "tools": [ + { + "name": "tool_name", + "description": "What this tool does", + "input_schema": { + "type": "object", + "properties": { + "param1": { + "type": "string", + "description": "What param1 is" + } + }, + "required": ["param1"] + } + } + ], + "tool_choice": { "type": "auto" } +} +``` + +### Key requirements + +| Field | Type | Required | Notes | +|-------|------|----------|-------| +| `name` | string | **Yes** | Must match `^[a-zA-Z0-9_-]{1,64}$` | +| `description` | string | Strongly recommended | Claude uses this to decide when to call | +| `input_schema` 
| object | **Yes** | JSON Schema; root must have `type: "object"` | +| `input_schema.type` | string | **Yes** | MUST be `"object"` | +| `input_schema.properties` | object | Recommended | Each property needs `type` and `description` | +| `input_schema.required` | string[] | Optional | Lists required property names | + +### Anthropic does NOT support in input_schema + +1. **`$schema`** — JSON Schema meta-keyword; Anthropic rejects it +2. **`additionalProperties`** — Not supported; can cause silent schema rejection +3. **`default`** — Not part of Anthropic's JSON Schema subset +4. **`type` as array** (e.g. `["string", "null"]`) — Rejected +5. **`type: "null"`** — Rejected +6. **`nullable`** keyword — Not supported +7. **`anyOf` / `oneOf` / `allOf`** — Not supported (combiners) +8. **`$ref` / `$defs` / `$dynamicRef`** — Not supported +9. **`propertyNames`** — Not supported + +## What Claude Code (the CLI) Uses Internally + +Claude Code has its OWN internal tool definitions. They use a specific format that Claude has been trained on extensively. When you use Claude outside of Claude Code, the model loses: + +1. **The exact tool descriptions** it was trained on in Claude Code +2. **The precise schema format** Claude Code uses +3. **The system prompt** that tells Claude to use tools proactively +4. **The billing/identity headers** that tell Anthropic this is a "real" CLI session + +### The billing header trick + +From `packages/core/src/credentials/claude.ts`: + +``` +x-anthropic-billing-header: cc_version=2.1.112.xxx; cc_entrypoint=sdk-cli; cch=xxxxx; +``` + +And the identity preamble: +``` +"You are Claude Code, Anthropic's official CLI for Claude." +``` + +These are mirrored from opencode, but **they go beyond what the raw API needs**. They're hacks to get Anthropic's backend to treat the request as coming from Claude Code. + +## Why Claude Opus "Thinks Forever" + +Common causes when tools don't get called: + +1. 
**Tool schema has unsupported fields** → Anthropic silently ignores the tool → Claude never sees it → Claude just talks +2. **Missing description on parameters** → Claude doesn't know what values to provide → stalls +3. **`input_schema` missing `type: "object"`** → Anthropic rejects the tool entirely +4. **`tool_choice` is wrong** → Set to `"none"` or Claude decides not to use tools +5. **System prompt doesn't instruct tool use** → Claude doesn't realize it should call tools +6. **Anthropic beta headers missing** → Extended thinking or new features might not work +7. **Thinking budget too high** → Claude uses all tokens thinking and never gets to tool calls + +## Sources + +- https://docs.anthropic.com/en/docs/build-with-claude/tool-use +- https://docs.aimlapi.com/capabilities/anthropic +- https://sdk.vercel.ai/providers/ai-sdk-providers/anthropic diff --git a/cc/03-recommendations.md b/cc/03-recommendations.md new file mode 100644 index 0000000..47ffaf8 --- /dev/null +++ b/cc/03-recommendations.md @@ -0,0 +1,116 @@ +# Recommendations — Fixing Claude Opus Tool Calling + +## 1. Add Anthropic Schema Normalization + +**Problem:** `zodToJsonSchema()` generates Draft 7 JSON Schema with `$schema`, `additionalProperties`, and potentially `default` fields that Anthropic's API doesn't support. 
+ +**Fix:** Add a normalization step between `zodToJsonSchema()` and `jsonSchema()` in `packages/core/src/tools/registry.ts` that strips unsupported fields: + +```typescript +function normalizeForAnthropic(schema: Record<string, unknown>): Record<string, unknown> { + // Remove fields Anthropic doesn't support + delete schema.$schema; + delete schema.additionalProperties; + delete schema.default; + // Strip from nested properties too + if (schema.properties && typeof schema.properties === 'object') { + for (const key of Object.keys(schema.properties as Record<string, unknown>)) { + const prop = (schema.properties as Record<string, unknown>)[key] as Record<string, unknown>; + delete prop.$schema; + delete prop.additionalProperties; + delete prop.default; + } + } + return schema; +} +``` + +**File to modify:** `packages/core/src/tools/registry.ts` — the `toAISDKTool()` function. + +## 2. Verify @ai-sdk/anthropic Adapter Version + +**Check:** `node_modules/@ai-sdk/anthropic/dist/index.mjs` — verify the adapter properly converts `input_schema` to Anthropic's `input_schema` format on the wire. + +The AI SDK v6 adapter should handle this, but verify by looking at how it serializes tools. The key serialization happens in the adapter's `convertToolsToAnthropic()` or similar function. + +## 3. Add `tool_choice: "any"` for Opus + +**Problem:** Opus may decide not to call tools even when it should. + +**Fix:** For Claude Opus sessions, consider setting `tool_choice: { type: "any" }` (or `"auto"`) to encourage tool use. Currently dispatch doesn't set any explicit `tool_choice` — the AI SDK default may be suboptimal. + +In `packages/core/src/agent/agent.ts`, the `streamText` options don't include `toolChoice`. Consider adding it conditionally for anthropic provider: + +```typescript +const streamOptions = { + model, + messages: coreMessages, + tools, + ...(isClaudeOAuth ? { toolChoice: "auto" } : {}), +}; +``` + +## 4. 
Check Parameter Descriptions + +**Problem:** All tools have `.describe()` on their parameters, but verify the AI SDK's Anthropic adapter is forwarding these descriptions to the Anthropic `input_schema.properties.*.description` field. + +## 5. System Prompt Tool Instructions + +The current system prompt in `agent-manager.ts` lists tools generically: +``` +"You have access to the following tools:\n\n{tool_list}\n\nWhen asked to work with files, use these tools." +``` + +For Claude Opus, add more explicit instructions about WHEN to use each tool and that it SHOULD use tools rather than just talking about solutions. + +## 6. Debugging: Log the Actual API Request + +Add logging to see what's actually sent to Anthropic for the `tools` parameter. The most reliable way is to add a `fetch` wrapper or check the `@ai-sdk/anthropic` adapter's serialization. + +Quick check: +```bash +node -e " +import { zodToJsonSchema } from 'zod-to-json-schema'; +import { z } from 'zod'; +const schema = z.object({ + path: z.string().describe('Path to the file'), + offset: z.number().int().optional(), +}); +console.log(JSON.stringify(zodToJsonSchema(schema), null, 2)); +" +``` + +This will show exactly what JSON Schema is produced and whether it has `$schema`, `additionalProperties`, etc. + +## 7. 
Test with Raw Anthropic API + +Bypass the AI SDK entirely and test with a direct API call to isolate whether the issue is in the AI SDK adapter: + +```bash +curl -X POST https://api.anthropic.com/v1/messages \ + -H "anthropic-version: 2023-06-01" \ + -H "x-api-key: $ANTHROPIC_API_KEY" \ + -H "content-type: application/json" \ + -d '{ + "model": "claude-opus-4-20250514", + "max_tokens": 1024, + "tools": [ + { + "name": "read_file", + "description": "Read the contents of a file", + "input_schema": { + "type": "object", + "properties": { + "path": {"type": "string", "description": "Path to file"} + }, + "required": ["path"] + } + } + ], + "messages": [ + {"role": "user", "content": "Read /etc/hostname"} + ] + }' +``` + +If this works but dispatch doesn't, the issue is in the AI SDK adapter or the schema conversion. diff --git a/cc/04-schema-debug.md b/cc/04-schema-debug.md new file mode 100644 index 0000000..5889862 --- /dev/null +++ b/cc/04-schema-debug.md @@ -0,0 +1,40 @@ +# Schema Debug — What zodToJsonSchema Actually Produces + +Run this to see what the AI SDK actually sends to Anthropic: + +```bash +node --experimental-strip-types -e " +import { z } from 'zod'; +import { zodToJsonSchema } from 'zod-to-json-schema'; + +const schema = z.object({ + path: z.string().describe('Path to the file, relative to the working directory'), + offset: z.number().int().min(1).optional().describe('1-indexed start line. Default: 1.'), + limit: z.number().int().min(1).optional().describe('Max lines to return. Default: 500. 
Hard cap: 5000.'), +}); + +console.log(JSON.stringify(zodToJsonSchema(schema), null, 2)); +" +``` + +## Check the @ai-sdk/anthropic adapter's tool serialization + +```bash +grep -n 'input_schema\|tools\|jsonSchema\|convertTools' node_modules/@ai-sdk/anthropic/dist/index.mjs | head -30 +``` + +## Check if streamText is receiving the tools correctly + +Look at how streamText processes tool options in the AI SDK: + +```bash +grep -n 'tools\|toolChoice\|toolCall' node_modules/ai/dist/index.mjs | head -40 +``` + +## Key questions to answer + +1. Does `zodToJsonSchema` output `$schema`? If yes, Anthropic may silently reject the tool definition. +2. Does it output `additionalProperties`? Same concern. +3. Does the `@ai-sdk/anthropic` adapter strip these before sending to the API? +4. Does the adapter forward parameter `description` fields from JSON Schema to Anthropic's wire format? +5. Is `tool_choice` being set or defaulting to something suboptimal? diff --git a/cc/README.md b/cc/README.md new file mode 100644 index 0000000..a9c813d --- /dev/null +++ b/cc/README.md @@ -0,0 +1,38 @@ +# Claude Opus Tool Calling Investigation + +## Problem +Claude Opus "thinks forever" and doesn't call tools when used outside Claude Code's harness (in Dispatch's own agent harness). + +## Summary of Findings + +### 1. Tool Schema Format +Dispatch uses the AI SDK v6 (`@ai-sdk/anthropic@^3.x`) which should handle format conversion automatically. However, `zodToJsonSchema()` produces Draft 7 JSON Schema with fields (`$schema`, `additionalProperties`, `default`) that Anthropic's API doesn't support. Dispatch does **no schema normalization**. + +### 2. Anthropic's requirements +Anthropic's `input_schema` must be clean JSON Schema: +- Root `type: "object"` is mandatory +- No `$schema`, `additionalProperties`, `default`, `nullable` +- No combiners (`anyOf`, `oneOf`, etc.) +- Parameter `description` fields are strongly recommended + +### 3. 
Missing tool_choice +Dispatch doesn't set `tool_choice` in `streamText()` options. The default may cause Opus to not call tools. + +### 4. System prompt +The system prompt tells Opus what tools exist but may not be forceful enough about actually USING them instead of just talking about solutions. + +### 5. No Anthropic-specific schema normalization +Unlike opencode's `normalizeSchemaForCCA()`, dispatch passes raw JSON Schema to the AI SDK without stripping unsupported fields. + +## Key Files + +| File | Purpose | +|------|---------| +| `packages/core/src/tools/registry.ts` | Tool → AI SDK conversion | +| `packages/core/src/agent/agent.ts` | Agent loop, `streamText({tools})` | +| `packages/api/src/agent-manager.ts` | Provides tool config to agent | +| `packages/core/src/llm/provider.ts` | Provider creation | + +## Next Steps + +See `03-recommendations.md` for specific fixes to try. diff --git a/packages/api/src/agent-manager.ts b/packages/api/src/agent-manager.ts index 92caf81..88503f3 100644 --- a/packages/api/src/agent-manager.ts +++ b/packages/api/src/agent-manager.ts @@ -25,9 +25,14 @@ import { createWriteFileTool, createYoutubeTranscribeTool, type DispatchConfig, + expandAgentToolNames, + GLOBAL_AGENTS_DIR, + getAgentDirPaths, getClaudeAccountsFromDB, getMessagesForTab, getSetting, + loadAgent, + loadAgents, loadConfig, loadSkills, ModelRegistry, @@ -39,6 +44,7 @@ import { type SystemChunkKind, type TabStatusSnapshot, TaskList, + toAvailableAgents, updateMessage, validateConfig, } from "@dispatch/core"; @@ -429,19 +435,30 @@ export class AgentManager { } if (allowed.has("summon")) { const childParentAllowedTools = new Set(toolEntries.map((e) => e.name)); + const availableAgents = toAvailableAgents( + loadAgents(workingDirectory), + GLOBAL_AGENTS_DIR, + workingDirectory, + ); + const agentDirPaths = getAgentDirPaths(workingDirectory); toolEntries.push({ name: "summon", - tool: createSummonTool(workingDirectory, { - spawn: (opts) => - this.spawnChildAgent({ - 
...opts, - parentKeyId: tabAgent.keyId, - parentModelId: tabAgent.modelId, - parentAllowedTools: childParentAllowedTools, - parentTabId: tabId, - }), - getResult: (id) => this.getChildResult(id), - }), + tool: createSummonTool( + workingDirectory, + { + spawn: (opts) => + this.spawnChildAgent({ + ...opts, + parentKeyId: tabAgent.keyId, + parentModelId: tabAgent.modelId, + parentAllowedTools: childParentAllowedTools, + parentTabId: tabId, + }), + getResult: (id) => this.getChildResult(id), + }, + availableAgents, + agentDirPaths, + ), }); } if (allowed.has("retrieve")) { @@ -489,19 +506,30 @@ export class AgentManager { if (permSummon) { // Capture parent's allowed tool names for child permission enforcement const parentAllowedTools = new Set(toolEntries.map((e) => e.name)); + const availableAgents = toAvailableAgents( + loadAgents(workingDirectory), + GLOBAL_AGENTS_DIR, + workingDirectory, + ); + const agentDirPaths = getAgentDirPaths(workingDirectory); toolEntries.push({ name: "summon", - tool: createSummonTool(workingDirectory, { - spawn: (opts) => - this.spawnChildAgent({ - ...opts, - parentKeyId: tabAgent.keyId, - parentModelId: tabAgent.modelId, - parentAllowedTools, - parentTabId: tabId, - }), - getResult: (id) => this.getChildResult(id), - }), + tool: createSummonTool( + workingDirectory, + { + spawn: (opts) => + this.spawnChildAgent({ + ...opts, + parentKeyId: tabAgent.keyId, + parentModelId: tabAgent.modelId, + parentAllowedTools, + parentTabId: tabId, + }), + getResult: (id) => this.getChildResult(id), + }, + availableAgents, + agentDirPaths, + ), }); toolEntries.push({ name: "retrieve", @@ -874,6 +902,14 @@ export class AgentManager { task: string; tools: string[]; workingDirectory?: string; + /** + * Optional slug of an `AgentDefinition` to apply. When set, the + * definition's `tools`, `models`, and `cwd` take precedence over + * the `tools`/`workingDirectory` passed in `options`. 
Tools are + * still intersected with `parentAllowedTools` to prevent a + * subagent from gaining capabilities its parent doesn't have. + */ + agentSlug?: string; parentKeyId?: string | null; parentModelId?: string | null; parentAllowedTools?: Set<string>; @@ -895,12 +931,28 @@ export class AgentManager { parentEffectiveDir = join(homedir(), parentEffectiveDir.slice(1)); } + // Resolve the agent definition (if a slug was supplied) BEFORE + // computing the effective working directory and tool whitelist. + // The definition's cwd/tools take precedence over the caller's + // `workingDirectory`/`tools` parameters, mirroring how a top-level + // tab picking the same definition would behave. + let agentDef: ReturnType<typeof loadAgent> = null; + if (options.agentSlug) { + agentDef = loadAgent(options.agentSlug, parentEffectiveDir); + if (!agentDef) { + throw new Error( + `Agent definition not found: "${options.agentSlug}". Inspect the agents directories to see available slugs.`, + ); + } + } + // Resolve and validate child working directory against parent's effective dir - let resolvedWorkingDirectory = options.workingDirectory; - if (options.workingDirectory) { + const requestedDir = agentDef?.cwd ?? 
options.workingDirectory; + let resolvedWorkingDirectory = requestedDir; + if (requestedDir) { const { isAbsolute, relative, resolve, join } = await import("node:path"); // Expand ~ in child working directory - let childDir = options.workingDirectory; + let childDir = requestedDir; if (childDir === "~" || childDir.startsWith("~/")) { const { homedir } = await import("node:os"); childDir = join(homedir(), childDir.slice(1)); @@ -911,7 +963,7 @@ export class AgentManager { const isOutside = rel.startsWith("..") || isAbsolute(rel); if (isOutside) { throw new Error( - `Working directory "${options.workingDirectory}" is outside the parent's working directory "${parentDir}".`, + `Working directory "${requestedDir}" is outside the parent's working directory "${parentDir}".`, ); } // Store the resolved absolute path so downstream code doesn't @@ -919,25 +971,42 @@ export class AgentManager { resolvedWorkingDirectory = resolved; } - // Intersect requested tools with parent's allowed tools to prevent privilege escalation - let childTools = options.tools; + // Determine the child's tool whitelist. When an agent definition + // was supplied, expand its short permission-group names + // (read/edit/bash) into concrete tool names. Otherwise use the + // `tools` parameter verbatim. Either way, intersect with + // parentAllowedTools so a subagent can't gain capabilities the + // parent doesn't have — even an agent definition can't escalate. + const baseTools = agentDef ? expandAgentToolNames(agentDef.tools) : options.tools; + let childTools = baseTools; if (options.parentAllowedTools) { - childTools = options.tools.filter((t) => options.parentAllowedTools?.has(t)); + childTools = baseTools.filter((t) => options.parentAllowedTools?.has(t)); } // Create the tab agent entry with overrides const tabAgent = this._getOrCreateTabAgent(tabId); tabAgent.toolsOverride = childTools; tabAgent.workingDirectoryOverride = resolvedWorkingDirectory; - tabAgent.keyId = options.parentKeyId ?? 
null; - tabAgent.modelId = options.parentModelId ?? null; tabAgent.finalOutput = ""; - // Inherit parent's agent fallback models - if (options.parentTabId) { - const parentAgent = this.tabAgents.get(options.parentTabId); - if (parentAgent?.agentModels) { - tabAgent.agentModels = parentAgent.agentModels; + if (agentDef && agentDef.models.length > 0) { + // The agent definition specifies its own model fallback chain. + // Clear keyId/modelId so the fallback sequence uses the + // definition's models (matches how a top-level tab using this + // definition would be configured). + tabAgent.keyId = null; + tabAgent.modelId = null; + tabAgent.agentModels = agentDef.models; + } else { + // No definition (or definition has no models) → inherit from + // the parent like before. + tabAgent.keyId = options.parentKeyId ?? null; + tabAgent.modelId = options.parentModelId ?? null; + if (options.parentTabId) { + const parentAgent = this.tabAgents.get(options.parentTabId); + if (parentAgent?.agentModels) { + tabAgent.agentModels = parentAgent.agentModels; + } } } diff --git a/packages/core/src/agent/agent.ts b/packages/core/src/agent/agent.ts index bb4ee7d..6139dec 100644 --- a/packages/core/src/agent/agent.ts +++ b/packages/core/src/agent/agent.ts @@ -2,6 +2,7 @@ import { dirname } from "node:path"; import type { ProviderOptions } from "@ai-sdk/provider-utils"; import type { ModelMessage, SystemModelMessage } from "ai"; import { streamText } from "ai"; +import { getAgentDirPaths } from "../agents/loader.js"; import { appendEventToChunks } from "../chunks/append.js"; import { buildBillingHeaderValue, SYSTEM_IDENTITY } from "../credentials/claude.js"; import { createProvider, prefixToolName, unprefixToolName } from "../llm/provider.js"; @@ -531,7 +532,27 @@ export class Agent { const isSpillPath = resolvedPath === resolvedSpillRoot || resolvedPath.startsWith(`${resolvedSpillRoot}/`); - if (!isUnderWorkdir && !isSpillPath) { + // Agent definitions live in well-known directories + // 
(`~/.config/dispatch/agents/` and + // `<workdir>/.dispatch/agents/`). Reading those is a + // prerequisite for the summon tool's "specify which subagent" + // flow — the LLM needs to inspect the TOML to know what each + // agent does. We auto-allow READ-ONLY tools under those paths + // without prompting the user. Writes (`write_file`) still go + // through the normal external_directory gate so an agent can't + // quietly overwrite another agent's definition. + const isReadOnlyTool = + tc.name === "read_file" || tc.name === "read_file_slice" || tc.name === "list_files"; + let isAgentsDirReadOnly = false; + if (isReadOnlyTool) { + const agentDirs = getAgentDirPaths(this.config.workingDirectory); + const canonicalAgentDirs = await Promise.all(agentDirs.map((d) => canonicalize(d))); + isAgentsDirReadOnly = canonicalAgentDirs.some( + (d) => resolvedPath === d || resolvedPath.startsWith(`${d}/`), + ); + } + + if (!isUnderWorkdir && !isSpillPath && !isAgentsDirReadOnly) { const permissionType = tc.name === "read_file" ? "read" : tc.name === "write_file" ? "edit" : "list"; @@ -715,6 +736,16 @@ export class Agent { tools, }; + // Encourage tool use on Anthropic. Without an explicit + // `toolChoice`, Claude (especially Opus 4.7 with adaptive + // thinking) can decide to "think forever" instead of calling + // the tools it has been given. `"auto"` keeps Claude free to + // answer with text when no tool is needed, while making the + // availability of tools an explicit signal in the request. + if (isClaudeOAuth) { + streamOptions.toolChoice = "auto"; + } + if (isClaudeOAuth && effort !== "none") { // v6 native support for Opus 4.7 adaptive thinking via // providerOptions. 
No more rewriteBodyForOpus47 body- diff --git a/packages/core/src/agents/index.ts b/packages/core/src/agents/index.ts index 13f6244..4931162 100644 --- a/packages/core/src/agents/index.ts +++ b/packages/core/src/agents/index.ts @@ -1 +1,11 @@ -export { deleteAgent, getAgentDirs, loadAgents, saveAgent } from "./loader.js"; +export { + deleteAgent, + expandAgentToolNames, + GLOBAL_AGENTS_DIR, + getAgentDirPaths, + getAgentDirs, + getProjectAgentsDir, + loadAgent, + loadAgents, + saveAgent, +} from "./loader.js"; diff --git a/packages/core/src/agents/loader.ts b/packages/core/src/agents/loader.ts index cf84381..333716e 100644 --- a/packages/core/src/agents/loader.ts +++ b/packages/core/src/agents/loader.ts @@ -20,9 +20,9 @@ function sanitizeSlug(slug: string): string { // ─── Constants ─────────────────────────────────────────────────── -const GLOBAL_AGENTS_DIR = path.join(os.homedir(), ".config", "dispatch", "agents"); +export const GLOBAL_AGENTS_DIR = path.join(os.homedir(), ".config", "dispatch", "agents"); -function getProjectAgentsDir(projectDir: string): string { +export function getProjectAgentsDir(projectDir: string): string { return path.join(projectDir, ".dispatch", "agents"); } @@ -49,6 +49,77 @@ export function getAgentDirs( } /** + * Return just the absolute filesystem paths of the agent directories. + * Used by the agent's permission gate to grant read-only access to + * these locations by default (so any agent can list/read agent + * definitions without prompting the user). + */ +export function getAgentDirPaths(projectDir?: string): string[] { + const paths = [GLOBAL_AGENTS_DIR]; + if (projectDir) paths.push(getProjectAgentsDir(projectDir)); + return paths; +} + +/** + * Load a single agent definition by slug. Searches the project-scoped + * directory first (if `projectDir` is provided), then falls back to + * the global directory. Returns `null` if no match is found. 
+ * + * Slug matching is exact and case-sensitive; sanitization mirrors + * `saveAgent` to keep loader and writer symmetric. + */ +export function loadAgent(slug: string, projectDir?: string): AgentDefinition | null { + const safeSlug = sanitizeSlug(slug); + const agents = loadAgents(projectDir); + return agents.find((a) => a.slug === safeSlug) ?? null; +} + +/** + * Translate the short permission-group names used by `AgentDefinition.tools` + * (e.g. `"read"`, `"edit"`, `"bash"`) into the concrete tool-implementation + * names registered with the agent runtime (e.g. `"read_file"`, + * `"list_files"`, `"write_file"`, `"run_shell"`). + * + * The mapping mirrors the per-permission tool-creation paths in + * `AgentManager.getOrCreateAgentForTab` so a subagent summoned with a + * given agent definition ends up with the exact same set of registered + * tools as a top-level tab using that definition. Tool names that aren't + * group aliases (`summon`, `retrieve`, `web_search`, `youtube_transcribe`, + * `todo`) are passed through unchanged. + * + * `"todo"` is auto-included so the summoned agent always has its task list + * available, matching the parent-agent path which always registers `todo`. + */ +export function expandAgentToolNames(tools: string[]): string[] { + const expanded = new Set<string>(); + for (const t of tools) { + switch (t) { + case "read": + expanded.add("read_file"); + expanded.add("read_file_slice"); + expanded.add("list_files"); + break; + case "edit": + expanded.add("write_file"); + break; + case "bash": + expanded.add("run_shell"); + break; + default: + // Pass through tool names that aren't permission-group + // aliases (summon, retrieve, web_search, youtube_transcribe, + // todo, and the granular file tools themselves if a user + // hand-wrote them in a TOML). + expanded.add(t); + } + } + // Always include `todo` — every agent should be able to track its work, + // and the parent-agent path adds it unconditionally. 
+ expanded.add("todo"); + return Array.from(expanded); +} + +/** * Ensure the default global agent exists. Creates it if missing. */ function ensureDefaultAgent(): void { diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts index 1453a01..74cb159 100644 --- a/packages/core/src/index.ts +++ b/packages/core/src/index.ts @@ -2,7 +2,17 @@ // Agent & LLM export { Agent } from "./agent/agent.js"; -export { deleteAgent, getAgentDirs, loadAgents, saveAgent } from "./agents/index.js"; +export { + deleteAgent, + expandAgentToolNames, + GLOBAL_AGENTS_DIR, + getAgentDirPaths, + getAgentDirs, + getProjectAgentsDir, + loadAgent, + loadAgents, + saveAgent, +} from "./agents/index.js"; // Chunk helpers export { appendEventToChunks, @@ -61,7 +71,12 @@ export { createToolRegistry } from "./tools/registry.js"; export { createRetrieveTool, type RetrieveCallbacks } from "./tools/retrieve.js"; export { BackgroundShellStore, createRunShellTool } from "./tools/run-shell.js"; export { analyzeCommand } from "./tools/shell-analyze.js"; -export { createSummonTool, type SummonCallbacks } from "./tools/summon.js"; +export { + type AvailableAgent, + createSummonTool, + type SummonCallbacks, + toAvailableAgents, +} from "./tools/summon.js"; export { createTaskListTool, TaskList } from "./tools/task-list.js"; export { clearSpillForTab } from "./tools/truncate.js"; export { createWebSearchTool } from "./tools/web-search.js"; diff --git a/packages/core/src/tools/registry.ts b/packages/core/src/tools/registry.ts index a09535e..ff6f4d1 100644 --- a/packages/core/src/tools/registry.ts +++ b/packages/core/src/tools/registry.ts @@ -4,6 +4,41 @@ import { zodToJsonSchema } from "zod-to-json-schema"; import type { ToolDefinition } from "../types/index.js"; /** + * Strip JSON Schema fields that Anthropic's API does not accept from a + * `zodToJsonSchema()` output. 
The Anthropic `/messages` API rejects (or + * silently ignores) tools whose `input_schema` contains `$schema`, + * `additionalProperties`, `default`, or `nullable` — when this happens + * Claude never sees the tool and the model "thinks forever" instead of + * calling it. + * + * The stripped fields are also harmless to remove for OpenAI-compatible + * endpoints, so we apply this unconditionally. + */ +function normalizeForAnthropic(schema: Record<string, unknown>): Record<string, unknown> { + delete schema.$schema; + delete schema.additionalProperties; + delete schema.default; + delete schema.nullable; + + const properties = schema.properties; + if (properties && typeof properties === "object") { + for (const key of Object.keys(properties as Record<string, unknown>)) { + const prop = (properties as Record<string, unknown>)[key]; + if (prop && typeof prop === "object") { + normalizeForAnthropic(prop as Record<string, unknown>); + } + } + } + + const items = schema.items; + if (items && typeof items === "object") { + normalizeForAnthropic(items as Record<string, unknown>); + } + + return schema; +} + +/** * Convert an internal `ToolDefinition` (Zod-parameterised) to an AI SDK v6 * `Tool` object. * @@ -14,9 +49,11 @@ import type { ToolDefinition } from "../types/index.js"; * `fullStream` that agent.ts collects and dispatches. 
*/ function toAISDKTool(def: ToolDefinition): Tool { + const raw = zodToJsonSchema(def.parameters) as Record<string, unknown>; + const normalized = normalizeForAnthropic(raw); return tool({ description: def.description, - inputSchema: jsonSchema(zodToJsonSchema(def.parameters)), + inputSchema: jsonSchema(normalized), }); } diff --git a/packages/core/src/tools/summon.ts b/packages/core/src/tools/summon.ts index 22ab35b..ed4b080 100644 --- a/packages/core/src/tools/summon.ts +++ b/packages/core/src/tools/summon.ts @@ -1,17 +1,90 @@ import { z } from "zod"; -import type { ToolDefinition } from "../types/index.js"; +import type { AgentDefinition, ToolDefinition } from "../types/index.js"; export interface SummonCallbacks { - spawn(options: { task: string; tools: string[]; workingDirectory?: string }): Promise<string>; + spawn(options: { + task: string; + tools: string[]; + workingDirectory?: string; + /** + * Optional slug of an `AgentDefinition` (loaded from + * `~/.config/dispatch/agents/` or `<projectDir>/.dispatch/agents/`) + * to use as the basis for the spawned child. When provided, + * the definition's tools, models, and cwd override the + * `tools` and `workingDirectory` parameters passed alongside. + */ + agentSlug?: string; + }): Promise<string>; getResult( agentId: string, ): Promise<{ status: "done"; result: string } | { status: "error"; error: string }>; } +/** + * Summary of an agent definition surfaced to the calling LLM in the + * summon tool's description. The shape is intentionally minimal — full + * TOML inspection is done by reading the definition file directly, + * which all agents are allowed to do by default. + */ +export interface AvailableAgent { + slug: string; + name: string; + description: string; + /** Filesystem path of the TOML the agent can read for full details. 
*/ + path: string; +} + +/** + * Build the prose paragraph that lists available agent definitions plus + * the disk locations where they live, injected into the summon tool's + * description. + * + * Returns the empty string when no agents are visible — keeps the + * description compact for environments where no definitions exist yet. + */ +function buildAgentsCatalog(agents: AvailableAgent[], agentDirs: string[]): string { + const lines: string[] = []; + lines.push(""); + lines.push("Agent definitions live on disk and can be inspected with read_file/list_files:"); + for (const d of agentDirs) { + lines.push(` - ${d}`); + } + if (agents.length === 0) { + lines.push(""); + lines.push("No agent definitions are currently defined."); + return lines.join("\n"); + } + lines.push(""); + lines.push("To summon a specific agent, pass its slug as the 'agent' parameter."); + lines.push("When 'agent' is set, the child inherits that definition's tools, models,"); + lines.push("and working directory; the 'tools' parameter is ignored."); + lines.push(""); + lines.push("Available agents:"); + for (const a of agents) { + const desc = a.description ? ` — ${a.description}` : ""; + lines.push(` - ${a.slug}: ${a.name}${desc}`); + } + return lines.join("\n"); +} + +/** + * Factory for the `summon` tool. Accepts a snapshot of agent definitions + * available at the time the tool is registered so the LLM's view of + * which agents exist matches what `spawnChildAgent` can actually load. + * + * `agentDirs` is the list of filesystem paths the catalog references in + * its description; this is information-only — the runtime resolves + * slugs through `loadAgent` independently. 
+ */ export function createSummonTool( defaultWorkingDirectory: string, callbacks: SummonCallbacks, + availableAgents: AvailableAgent[] = [], + agentDirs: string[] = [], ): ToolDefinition { + const catalog = buildAgentsCatalog(availableAgents, agentDirs); + const agentSlugs = availableAgents.map((a) => a.slug); + return { name: "summon", description: [ @@ -38,7 +111,8 @@ export function createSummonTool( " - web_search: Search the web", " - youtube_transcribe: Fetch YouTube video transcripts", "", - "If tools is omitted, the child gets read_file, list_files, and todo only (read-only by default).", + "If tools is omitted (and no 'agent' is specified), the child gets read_file, list_files, and todo only (read-only by default).", + catalog, ].join("\n"), parameters: z.object({ task: z @@ -46,6 +120,21 @@ export function createSummonTool( .describe( "Detailed instructions for the child agent. Be specific about what it should do and what it should return.", ), + agent: z + .string() + .optional() + .describe( + [ + "Slug of an agent definition to use as the basis for the child agent.", + "When provided, the child inherits the definition's tools, models, and", + "working directory; the 'tools' parameter is ignored. Inspect the agent", + "directories listed above to discover which slugs are available and what", + "each one does.", + agentSlugs.length > 0 ? `Available slugs: ${agentSlugs.join(", ")}.` : "", + ] + .filter(Boolean) + .join(" "), + ), tools: z .array( z.enum([ @@ -62,13 +151,13 @@ export function createSummonTool( ) .optional() .describe( - 'Tool names to give the child. Defaults to ["read_file", "list_files", "todo"]. Include "summon" and "retrieve" to allow nesting.', + 'Tool names to give the child. Defaults to ["read_file", "list_files", "todo"]. Include "summon" and "retrieve" to allow nesting. Ignored when "agent" is set.', ), working_directory: z .string() .optional() .describe( - "Absolute path for the child to work in. 
Defaults to the current working directory.", + "Absolute path for the child to work in. Defaults to the current working directory. When 'agent' is set and its definition has a cwd, that takes precedence.", ), background: z .boolean() @@ -79,6 +168,7 @@ export function createSummonTool( }), execute: async (args: Record<string, unknown>): Promise<string> => { const task = args.task as string; + const agentSlug = args.agent as string | undefined; const tools = (args.tools as string[] | undefined) ?? ["read_file", "list_files", "todo"]; const workingDirectory = (args.working_directory as string | undefined) ?? defaultWorkingDirectory; @@ -89,6 +179,7 @@ export function createSummonTool( task, tools, workingDirectory, + ...(agentSlug ? { agentSlug } : {}), }); if (!background) { @@ -113,3 +204,29 @@ export function createSummonTool( }, }; } + +/** + * Build the `AvailableAgent[]` projection from a list of full + * `AgentDefinition` records. Each entry's `path` is derived from the + * scope+slug so the agent can `read_file(path)` directly. + */ +export function toAvailableAgents( + defs: AgentDefinition[], + globalDir: string, + projectDir: string | null, +): AvailableAgent[] { + return defs.map((d) => { + const baseDir = + d.scope === "global" + ? globalDir + : projectDir + ? 
`${projectDir.replace(/\/$/, "")}/.dispatch/agents` + : globalDir; + return { + slug: d.slug, + name: d.name, + description: d.description, + path: `${baseDir}/${d.slug}.toml`, + }; + }); +} diff --git a/packages/core/tests/agents/loader.test.ts b/packages/core/tests/agents/loader.test.ts new file mode 100644 index 0000000..88173ea --- /dev/null +++ b/packages/core/tests/agents/loader.test.ts @@ -0,0 +1,132 @@ +import * as fs from "node:fs"; +import * as os from "node:os"; +import * as path from "node:path"; +import { afterEach, beforeEach, describe, expect, it } from "vitest"; +import { expandAgentToolNames, getAgentDirPaths, loadAgent } from "../../src/agents/loader.js"; + +describe("expandAgentToolNames", () => { + it("expands 'read' into the granular read tools", () => { + const out = expandAgentToolNames(["read"]); + expect(out).toContain("read_file"); + expect(out).toContain("read_file_slice"); + expect(out).toContain("list_files"); + }); + + it("expands 'edit' into write_file", () => { + const out = expandAgentToolNames(["edit"]); + expect(out).toContain("write_file"); + }); + + it("expands 'bash' into run_shell", () => { + const out = expandAgentToolNames(["bash"]); + expect(out).toContain("run_shell"); + }); + + it("passes through non-group tool names unchanged", () => { + const out = expandAgentToolNames(["summon", "retrieve", "web_search", "youtube_transcribe"]); + expect(out).toEqual( + expect.arrayContaining(["summon", "retrieve", "web_search", "youtube_transcribe"]), + ); + }); + + it("always includes 'todo' even when not requested", () => { + expect(expandAgentToolNames([])).toContain("todo"); + expect(expandAgentToolNames(["read"])).toContain("todo"); + expect(expandAgentToolNames(["summon"])).toContain("todo"); + }); + + it("deduplicates when groups overlap with explicit names", () => { + const out = expandAgentToolNames(["read", "read_file"]); + // Each name should appear at most once + const counts = new Map<string, number>(); + for (const t of 
out) counts.set(t, (counts.get(t) ?? 0) + 1); + for (const [, c] of counts) expect(c).toBe(1); + }); +}); + +describe("getAgentDirPaths", () => { + it("returns just the global dir when no projectDir is supplied", () => { + const paths = getAgentDirPaths(); + expect(paths).toHaveLength(1); + expect(paths[0]).toContain(".config/dispatch/agents"); + }); + + it("appends the project-scoped dir when projectDir is supplied", () => { + const paths = getAgentDirPaths("/some/project"); + expect(paths).toHaveLength(2); + expect(paths[1]).toBe("/some/project/.dispatch/agents"); + }); +}); + +describe("loadAgent — project-scoped sandbox", () => { + // `GLOBAL_AGENTS_DIR` is captured at module load via `os.homedir()` + // and can't be redirected at runtime. The project-scoped path, + // however, is computed per-call from the `projectDir` argument, so + // we exercise that branch instead. This is also the more common + // real-world case (per-project agent definitions). + let tmpProject: string; + + beforeEach(() => { + tmpProject = fs.mkdtempSync(path.join(os.tmpdir(), "dispatch-loader-test-")); + }); + + afterEach(() => { + fs.rmSync(tmpProject, { recursive: true, force: true }); + }); + + function writeAgentToml(slug: string, body: string): void { + const agentsDir = path.join(tmpProject, ".dispatch", "agents"); + fs.mkdirSync(agentsDir, { recursive: true }); + fs.writeFileSync(path.join(agentsDir, `${slug}.toml`), body, "utf-8"); + } + + // Uses a slug unlikely to collide with anything the user might + // already have in ~/.config/dispatch/agents. `loadAgent` returns + // the FIRST match it finds across all scanned directories, and + // the global scope is scanned before the project scope — a slug + // that exists in both would resolve to the global one (which is + // real, not under our control). The "z-dispatch-test-*" prefix + // gives this fixture exclusive ownership of the slug. 
+ const TEST_SLUG = "z-dispatch-test-fixture"; + + it("returns null for an unknown slug within the project scope", () => { + const agent = loadAgent("z-dispatch-test-does-not-exist", tmpProject); + expect(agent).toBeNull(); + }); + + it("loads a TOML definition written to the project's .dispatch/agents", () => { + writeAgentToml( + TEST_SLUG, + [ + 'name = "Fixture"', + 'description = "Sandbox fixture for loadAgent test."', + "skills = []", + 'tools = ["read", "bash"]', + "is_subagent = true", + "", + "[[models]]", + 'key_id = "opencode-1"', + 'model_id = "deepseek-v4-flash"', + "", + ].join("\n"), + ); + + const agent = loadAgent(TEST_SLUG, tmpProject); + expect(agent).not.toBeNull(); + expect(agent?.slug).toBe(TEST_SLUG); + expect(agent?.name).toBe("Fixture"); + expect(agent?.tools).toEqual(["read", "bash"]); + expect(agent?.is_subagent).toBe(true); + expect(agent?.models).toEqual([{ key_id: "opencode-1", model_id: "deepseek-v4-flash" }]); + expect(agent?.scope).toBe(tmpProject); + }); + + it("sanitizes the slug so path traversal can't reach outside the agents dir", () => { + // Even if a caller passes something gnarly, the lookup is by + // sanitized slug — no file outside the configured dirs should + // ever be opened. The sanitized form ("etc-passwd") obviously + // doesn't exist in the temp project, so the result is null. 
+ const agent = loadAgent("../../../etc/passwd", tmpProject); + expect(agent).toBeNull(); + }); +}); diff --git a/packages/core/tests/tools/summon.test.ts b/packages/core/tests/tools/summon.test.ts new file mode 100644 index 0000000..3909e48 --- /dev/null +++ b/packages/core/tests/tools/summon.test.ts @@ -0,0 +1,137 @@ +import { describe, expect, it, vi } from "vitest"; +import { + type AvailableAgent, + createSummonTool, + type SummonCallbacks, +} from "../../src/tools/summon.js"; + +const noopCallbacks: SummonCallbacks = { + spawn: async () => "agent-id-stub", + getResult: async () => ({ status: "done", result: "" }), +}; + +describe("createSummonTool — description content", () => { + it("lists the agent directories so the LLM knows where to look", () => { + const tool = createSummonTool( + "/tmp/work", + noopCallbacks, + [], + ["/home/u/.config/dispatch/agents", "/tmp/work/.dispatch/agents"], + ); + expect(tool.description).toContain("/home/u/.config/dispatch/agents"); + expect(tool.description).toContain("/tmp/work/.dispatch/agents"); + expect(tool.description).toContain("read_file"); + }); + + it("includes available agent slugs+names in the description", () => { + const agents: AvailableAgent[] = [ + { + slug: "programmer", + name: "Programmer", + description: "Implements code from a plan.", + path: "/home/u/.config/dispatch/agents/programmer.toml", + }, + { + slug: "researcher", + name: "Researcher", + description: "Investigates topics.", + path: "/home/u/.config/dispatch/agents/researcher.toml", + }, + ]; + const tool = createSummonTool("/tmp/work", noopCallbacks, agents, [ + "/home/u/.config/dispatch/agents", + ]); + expect(tool.description).toContain("programmer"); + expect(tool.description).toContain("Programmer"); + expect(tool.description).toContain("Implements code from a plan"); + expect(tool.description).toContain("researcher"); + expect(tool.description).toContain("Investigates topics"); + }); + + it("emits a 'no agents defined' notice when the 
catalog is empty", () => { + const tool = createSummonTool( + "/tmp/work", + noopCallbacks, + [], + ["/home/u/.config/dispatch/agents"], + ); + expect(tool.description).toContain("No agent definitions are currently defined"); + }); +}); + +describe("createSummonTool — execute() argument forwarding", () => { + it("forwards agent slug through to callbacks.spawn", async () => { + const spawn = vi.fn(async () => "tab-xyz"); + const tool = createSummonTool( + "/tmp/work", + { spawn, getResult: async () => ({ status: "done", result: "ok" }) }, + [], + [], + ); + await tool.execute({ + task: "do thing", + agent: "programmer", + background: true, + }); + expect(spawn).toHaveBeenCalledTimes(1); + const callArg = spawn.mock.calls[0]?.[0]; + expect(callArg).toMatchObject({ + task: "do thing", + agentSlug: "programmer", + }); + }); + + it("omits agentSlug from the spawn payload when no agent param is given", async () => { + const spawn = vi.fn(async () => "tab-xyz"); + const tool = createSummonTool( + "/tmp/work", + { spawn, getResult: async () => ({ status: "done", result: "ok" }) }, + [], + [], + ); + await tool.execute({ + task: "do thing", + background: true, + }); + expect(spawn).toHaveBeenCalledTimes(1); + const callArg = spawn.mock.calls[0]?.[0]; + expect(callArg).not.toHaveProperty("agentSlug"); + }); + + it("returns spawned agent_id when background=true (no blocking on result)", async () => { + const getResult = vi.fn(async () => ({ status: "done" as const, result: "should-not-see" })); + const tool = createSummonTool("/tmp/work", { spawn: async () => "id-42", getResult }, [], []); + const out = await tool.execute({ task: "x", background: true }); + expect(out).toContain("id-42"); + // Background mode must not block on getResult + expect(getResult).not.toHaveBeenCalled(); + }); + + it("blocks on result and returns it when background=false (default)", async () => { + const tool = createSummonTool( + "/tmp/work", + { + spawn: async () => "id-1", + getResult: async () => 
({ status: "done", result: "child-output" }), + }, + [], + [], + ); + const out = await tool.execute({ task: "x" }); + expect(out).toBe("child-output"); + }); + + it("surfaces child errors when blocking", async () => { + const tool = createSummonTool( + "/tmp/work", + { + spawn: async () => "id-1", + getResult: async () => ({ status: "error", error: "boom" }), + }, + [], + [], + ); + const out = await tool.execute({ task: "x" }); + expect(out).toContain("boom"); + }); +}); diff --git a/packaging/PKGBUILD b/packaging/PKGBUILD index 4ac87a8..a514eb5 100644 --- a/packaging/PKGBUILD +++ b/packaging/PKGBUILD @@ -91,8 +91,10 @@ package_dispatch() { install -Dm644 packages/frontend/package.json "${optdir}/packages/frontend/package.json" # Runtime node_modules (preserve symlinks for bun workspaces) + # Exclude prebuilt ARM64 .node binaries — strip (x86_64 host) can't process them. install -dm755 "${optdir}/node_modules" cp -a node_modules/. "${optdir}/node_modules/" + find "${optdir}/node_modules" -path '*/prebuilds/linux-arm64/*.node' -delete # Root manifest + lockfile install -Dm644 package.json "${optdir}/package.json" diff --git a/wishlist.md b/wishlist.md index 1528593..1c1b1be 100644 --- a/wishlist.md +++ b/wishlist.md @@ -12,3 +12,8 @@ - **AI can summon subagents using pre-configured agent types.** When the AI needs to delegate work, it can spawn subagents by selecting from a list of agent types that were defined in the agent editor page, rather than simply cloning itself with the same model. - **Status indicator next to the chat input box.** A small icon sits to the left of the input area: a spinner while the AI is generating a response, a checkmark when it finishes successfully, and an X when the last generation errored out. + +- **Failed tool calls should produce proper tool results, not get silently dropped.** + - Repro: agent calls `read_file` but the tool is unavailable → SDK emits `tried to call unavailable tool`. 
User grants permission and asks agent to retry. Agent tries again, but the SDK now errors with `Tool results are missing for tool calls call_00_..., call_01_...` — the previous step's tool calls were recorded (IDs registered in the assistant message) but their results were never written (because the stream aborted when the unavailable-tool error was caught). The synthetic error path in agent.ts (lines 856-882) only covers the ONE tool that triggered the error; sibling tool calls in the same batch that the LLM sent alongside the unavailable one have their IDs in the history but no matching tool-result, which trips the v6 SDK validation. + - Fix: when the unavailable-tool error is caught, every tool call in `stepToolCalls` that doesn't already have a result should receive a synthetic tool-result explaining the failure (e.g. "This step was aborted because tool X is unavailable."). The existing per-tool synthetic result only covers the offending tool; the rest of the batch becomes orphaned. + - Beyond the unavailable-tool case more generally: any path that yields tool-call events without a matching tool-result leaves the history in an un-roundtrippable state. Consider adding a `failed` or `aborted` state to tool results so the model can distinguish "the tool ran and returned this output" from "the tool never ran because something went wrong upstream."
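The fix proposed in the last wishlist item could be sketched roughly as follows. This is a minimal illustration, not the code in agent.ts: only `stepToolCalls` comes from the wishlist text, while `ToolCallRecord`, `ToolResultRecord`, `backfillOrphanedToolResults`, and the `isError` flag are hypothetical names chosen for the example, and the real AI SDK message types may be shaped differently.

```typescript
// Hypothetical record shapes for illustration only; the real types live in
// agent.ts and the AI SDK and may differ.
interface ToolCallRecord {
  toolCallId: string;
  toolName: string;
}

interface ToolResultRecord {
  toolCallId: string;
  toolName: string;
  output: string;
  isError: boolean;
}

/**
 * Return the existing results plus one synthetic error result for every
 * tool call in the step that has no result yet, so each recorded call ID
 * ends up with a matching tool-result in the history.
 */
function backfillOrphanedToolResults(
  stepToolCalls: ToolCallRecord[],
  existingResults: ToolResultRecord[],
  unavailableTool: string,
): ToolResultRecord[] {
  // IDs that already have a result (e.g. the offending tool's own synthetic error).
  const answered = new Set(existingResults.map((r) => r.toolCallId));

  const synthetic: ToolResultRecord[] = stepToolCalls
    .filter((call) => !answered.has(call.toolCallId))
    .map((call) => ({
      toolCallId: call.toolCallId,
      toolName: call.toolName,
      output: `This step was aborted because tool ${unavailableTool} is unavailable.`,
      isError: true,
    }));

  // Existing results are kept untouched; orphans are appended after them.
  return [...existingResults, ...synthetic];
}
```

Under these assumptions the helper would run once in the unavailable-tool error path, after the per-tool synthetic result has been recorded, so that sibling calls from the same batch are never left without a result. A dedicated `failed` or `aborted` state, as the wishlist suggests, could later replace the boolean flag used here.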
