author    Adam Malczewski <[email protected]> 2026-06-01 01:46:13 +0900
committer Adam Malczewski <[email protected]> 2026-06-01 01:46:13 +0900
commit    8b9533c22a47bbf6f916667e2c25d8e8e419da37
tree      715a6a3d6f43781395e7dc7c8cdb519cef46a870
parent    1853dd1d40308deb829bc621beb79c5d39b9c57f
feat(tabs): tab-to-tab agent communication via short handles

Add send_to_tab / read_tab tools so an agent can message or read another tab by
a git-style short handle (shortest unique prefix of the tab UUID, min 4 chars),
shown in the tab bar.

- core/db/tabs: resolveTabPrefix + shortestUniquePrefix (open tabs only,
  LIKE-sanitized prefix matching)
- new tools read-tab.ts / send-to-tab.ts (+ tests) decoupled from the DB TabRow
  via a minimal ResolvedTabRef projection
- agent-manager: unified deliverMessage routing (busy -> queue, idle -> new
  turn) shared by POST /chat and send_to_tab; agent->agent auto-wake budget
  (MAX_AGENT_AUTO_WAKES) to bound ping-pong loops
- summon/loader: send_to_tab + read_tab as grantable tools
- frontend: shortHandleFor + handle badge in TabBar; perm toggles
- notes: tab-comm / user-agents / todo-redesign plans
- chore: biome format fixes (debug-logger, summon.test)

Refs notes/plan-tab-comm.md

-rw-r--r-- notes/plan-tab-comm.md 358
-rw-r--r-- notes/plan-user-agents.md 206
-rw-r--r-- notes/todo-tool-redesign-plan.md 86
-rw-r--r-- notes/wishlist.md 2
-rw-r--r-- packages/api/src/agent-manager.ts 263
-rw-r--r-- packages/api/src/app.ts 25
-rw-r--r-- packages/api/tests/agent-manager.test.ts 365
-rw-r--r-- packages/api/tests/routes.test.ts 28
-rw-r--r-- packages/core/src/agents/loader.ts 4
-rw-r--r-- packages/core/src/db/tabs.ts 75
-rw-r--r-- packages/core/src/index.ts 11
-rw-r--r-- packages/core/src/llm/debug-logger.ts 21
-rw-r--r-- packages/core/src/tools/read-tab.ts 95
-rw-r--r-- packages/core/src/tools/send-to-tab.ts 147
-rw-r--r-- packages/core/src/tools/summon.ts 8
-rw-r--r-- packages/core/tests/agents/loader.test.ts 28
-rw-r--r-- packages/core/tests/db/tabs.test.ts 119
-rw-r--r-- packages/core/tests/tools/read-tab.test.ts 101
-rw-r--r-- packages/core/tests/tools/send-to-tab.test.ts 142
-rw-r--r-- packages/core/tests/tools/summon.test.ts 10
-rw-r--r-- packages/frontend/src/lib/components/TabBar.svelte 2
-rw-r--r-- packages/frontend/src/lib/components/ToolPermissions.svelte 10
-rw-r--r-- packages/frontend/src/lib/settings.svelte.ts 4
-rw-r--r-- packages/frontend/src/lib/tabs.svelte.ts 24
24 files changed, 2092 insertions, 42 deletions
diff --git a/notes/plan-tab-comm.md b/notes/plan-tab-comm.md
new file mode 100644
index 0000000..7acd804
--- /dev/null
+++ b/notes/plan-tab-comm.md
@@ -0,0 +1,358 @@
+# Implementation Plan: Tab-to-Tab Agent Communication
+
+## Summary
+
+Give every tab a **short, human-readable handle** visible in the UI, and give
+agents **two new tools** to talk to each other by that handle:
+
+1. **`send_to_tab`** — deliver a user message to another tab by its short ID.
+ - Target **mid-turn** → message is **queued** (identical path to a user message).
+ - Target **idle** → message **wakes** the tab and starts a new turn.
+ - Fire-and-forget: returns immediately, does not block on a response.
+2. **`read_tab`** — return the target tab's **last completed assistant turn** plus
+ its current status (`idle` / `running` / `error`). Non-blocking snapshot.
+
+The tools are gated behind two independent permissions (both default off):
+**`perm_send_to_tab`** (message other tabs) and **`perm_read_tab`** (read other
+tabs), mirroring how `perm_user_agent` gates user-agent spawning. Splitting them
+lets an operator grant read-only visibility without also granting the ability to
+wake/steer other tabs.
+
+This enables: handing an agent a tab handle and asking it to steer another AI,
+chaining agents, and one agent delegating subtasks to a peer tab.
+
+### The short ID is git-style: derived, not stored
+
+We do **not** add a `short_id` column. The handle is the **shortest unique
+prefix of the tab's existing UUID**, exactly like git short commit hashes:
+
+- A prefix is valid as long as it uniquely identifies one open tab.
+- Minimum display length is 4 hex chars; if two open tabs collide on those 4,
+ the displayed handle grows one char at a time until unambiguous.
+- Resolution accepts **any length** ≥ a small floor and matches by prefix.
+
+This means the short form is a **pure projection** of data we already store —
+no new column, no migration, no backfill, no unique-ID generation/retry, no
+collision bookkeeping at insert time. It also can't drift: a prefix is always
+computed against the current set of open tabs.
+
+### The good news: the wake/queue machinery already exists
+
+`POST /chat` (`packages/api/src/app.ts`) **already** implements exactly the
+routing the wishlist describes:
+
+```ts
+if (agentManager.getTabStatus(tabId) === "running") {
+  agentManager.queueMessage(tabId, message, queueId); // mid-turn → queue
+} else {
+  agentManager.processMessage(tabId, message, ...); // idle → new turn
+}
+```
+
+The new `send_to_tab` tool reuses this same decision via a small shared
+`AgentManager` method. We are **not** inventing a new delivery mechanism — we're
+exposing the existing one to agents and adding addressing by a derived handle.
+
+---
+
+## Current architecture (for context)
+
+- **Tab identity**: tabs are keyed by `crypto.randomUUID()` — a canonical
+ **lowercase** hex UUID (verified: both `crypto.randomUUID()` and the frontend's
+ non-secure fallback emit lowercase). Created via `createTab()` in
+ `packages/core/src/db/tabs.ts`, reached from:
+ - `POST /tabs` (frontend "+" button → `createNewTab`)
+ - `AgentManager.spawnChildAgent` (summon, sub/user agents)
+- **Message delivery**: `AgentManager` owns per-tab state in `tabAgents`
+ (`Map<tabId, TabAgent>`). Key methods:
+ - `processMessage(tabId, msg, keyId?, modelId?, ...)` — runs a full turn,
+ persists chunks, resolves completion promises.
+ - `queueMessage(tabId, msg, clientId?)` — pushes to `tabAgent.messageQueue`
+ and wakes blocking tools via `queueListeners`.
+ - `getTabStatus(tabId)` — `"idle" | "running" | "error"`.
+- **Tools**: built per-tab in `getOrCreateAgentForTab`. Two paths — a
+ permission-gated path (top-level tabs) and a `toolsOverride` whitelist path
+ (child agents). Each tool is a `ToolDefinition` whose `execute` closes over
+ **callbacks** supplied by `AgentManager` (see `createSummonTool`,
+ `createRetrieveTool`). This is the seam the new tools plug into.
+- **History**: a tab's conversation is a flat append-only `chunks` log. The last
+ assistant turn is recoverable via `getChunksForTab(tabId)` →
+ `groupRowsToMessages(...)` → last `role === "assistant"` message.
+- **Permissions**: read in `getOrCreateAgentForTab` via `getSetting("perm_*")`,
+ folded into `permKey` (cache-invalidation), surfaced as checkboxes in
+ `ToolPermissions.svelte` with defaults in `settings.svelte.ts`.
+- **SQLite `LIKE`**: case-insensitive for ASCII by default (no
+ `case_sensitive_like` PRAGMA set), so `6FE8` resolves `6fe8…` for free. `%`
+ and `_` are wildcards → incoming prefixes MUST be sanitized (see below).
+
+---
+
+## Part A — Git-style short tab handles
+
+### The two halves
+
+A derived-prefix scheme has a **display** side (compute the shortest unique
+prefix to show) and a **resolution** side (match an arbitrary-length prefix back
+to one tab). They live in different layers:
+
+| Concern | Where | Why |
+|---|---|---|
+| **Display** — each open tab's shortest unique prefix | Frontend (`tabs.svelte.ts`) | The frontend already holds every open tab's full UUID; computing prefixes client-side needs zero new wire data and updates reactively as tabs open/close |
+| **Resolution** — prefix string → real `tabId` | Backend DB (`db/tabs.ts`) | The tools run server-side and must resolve against the authoritative open-tab set |
+
+Because the handle is derived, **nothing new is stored or sent**. No change to
+the `tabs` schema, no `tab-created` payload change.
+
+### Display: shortest-unique-prefix (frontend)
+
+`packages/frontend/src/lib/tabs.svelte.ts`
+- Add a derived helper, e.g. `shortHandleFor(tabId)`, computed against the
+ current open `tabs`:
+ - Start at length 4. If any **other** open tab shares that prefix, increment
+ until unique (cap at full UUID as the degenerate fallback).
+ - Expose as a `$derived` map `{ tabId → handle }` so the tab bar and any
+ "Agents" view stay in sync as tabs open/close.
+- This naturally yields 4-char handles in the common case and only grows on a
+ genuine leading-hex collision among open tabs.
+
+`packages/frontend/src/lib/components/TabBar.svelte`
+- Render the handle as a small mono badge next to the title (both the user-tab
+ row and the subagent row). This is the human/LLM-visible addressing token.
+
+> Note: the existing debug `shortId` helper in `tabs.svelte.ts` (first-8-char
+> slice) is unrelated and stays as-is; the new handle is collision-aware.
+
+### Resolution: prefix → tab (backend)
+
+`packages/core/src/db/tabs.ts`
+- New `resolveTabPrefix(prefix: string): { status: "ok"; tab: TabRow } |
+ { status: "none" } | { status: "ambiguous"; matches: TabRow[] }`:
+ 1. **Sanitize**: lowercase; reject/strip anything outside `[0-9a-f-]` so the
+ SQLite `LIKE` wildcards `%`/`_` can't be injected. Enforce a min length
+ (e.g. 4) to avoid absurdly broad matches.
+ 2. Query open tabs: `SELECT * FROM tabs WHERE is_open = 1 AND id LIKE $p`
+ with `$p = prefix + '%'`.
+ 3. 0 rows → `none`; 1 row → `ok`; >1 → `ambiguous` (return the matches so the
+ caller can list them).
+- Scope to `is_open = 1`: closed tabs are not addressable and shouldn't cause
+ phantom ambiguity.
+- (Exact full-UUID still works — it's just a maximal prefix.)
+
+`packages/core/src/index.ts`
+- Re-export `resolveTabPrefix`.
+
+### Why derive instead of store
+
+- **No migration / backfill / column.** Zero schema churn.
+- **No insert-time unique-ID generation** (the race-prone part) — UUIDs are
+ already unique by construction.
+- **Self-correcting.** If two tabs share a 4-char prefix, both just display 5
+ chars; when one closes, the other can shrink back to 4. A stored handle would
+ go stale here.
+- **Resolution is one `LIKE` scan** over open tabs (a handful of rows), so the
+ cost is negligible even where SQLite cannot use an index for it.
+
+---
+
+## Part B — The two tools
+
+Both live in core as `ToolDefinition` factories taking a callbacks object, and
+are wired in `AgentManager` exactly like `summon`/`retrieve`.
+
+### Tool shapes
+
+```
+send_to_tab({
+  tab_id: string,  // required — the short handle (any unique-length prefix) of the target
+  message: string, // required — the user message to deliver
+})
+-> "Delivered to tab 6fe8 (status: running → queued)" // or "(idle → started new turn)"
+
+read_tab({
+  tab_id: string,  // required — the short handle of the target tab
+})
+-> "<tab_response tab=6fe8 status=idle>...last assistant turn text...</tab_response>"
+```
+
+The `tab_id` parameter accepts any length prefix; the tool resolves it via
+`resolveTabPrefix`. On `ambiguous`, the tool returns the competing handles and
+asks the agent to add a character — same UX as `git checkout <ambiguous sha>`.
+
+### `send_to_tab` semantics (mirrors `POST /chat`)
+
+1. `resolveTabPrefix(tab_id)`:
+ - `none` → error listing currently-open handles (mirrors how `summon` lists
+ valid agent slugs on a bad slug).
+ - `ambiguous` → error listing the matching handles; ask for one more char.
+ - `ok` → proceed with `tab.id`.
+2. Reject sending to **self** (no-op footgun) with a clear message.
+3. Prefix the delivered text with provenance so the target (and the user) know
+ who sent it and can reply back:
+ `[message from tab <senderHandle>]\n\n<message>`
+4. Route via a new shared `AgentManager.deliverMessage(tabId, message)`:
+ - target `running` → `queueMessage(...)` → report `queued`.
+ - target `idle`/`error` → hydrate key/model from the live `TabAgent` (warm)
+ or the DB tab row (cold), then `processMessage(...)` → report `started`.
+5. Return immediately (fire-and-forget). The sender uses `read_tab` later.
+
+### `read_tab` semantics (non-blocking snapshot)
+
+1. `resolveTabPrefix(tab_id)` (same `none`/`ambiguous` handling).
+2. Read `getChunksForTab(tab.id)` → `groupRowsToMessages` → last
+ `role === "assistant"` message → flatten its text chunks.
+3. Return that text plus `status` from `getTabStatus(tab.id)`:
+ - `running` → note the turn is still in progress; returns the **previous**
+ completed turn (or "no completed turn yet").
+ - `idle` → the just-finished turn.
+ - empty history → "Tab has no assistant responses yet."
+
+**Why non-blocking:** two agents that block on each other's results would
+deadlock. `retrieve` can block because child agents can't summon their parent;
+peer tabs have no such guarantee. A snapshot read + explicit re-read is safe.
+(An optional `wait: boolean` flag is possible later but deliberately omitted v1.)
+
+### Why a shared `deliverMessage` method
+
+`POST /chat` and `send_to_tab` make the **same** running/idle decision. Factor it
+into one `AgentManager.deliverMessage(tabId, message, opts?)` returning
+`{ status: "queued" | "started" | "suppressed" }` (`suppressed` applies only to agent-originated sends once the auto-wake budget is spent), and call it from both.
+This avoids drift between the HTTP path and the tool path. (Refactor `POST /chat` to use it.)
+
+The tools take a **resolver callback** (`resolveShortId`) wired to
+`resolveTabPrefix`, plus `deliver`, `getLastResponse`, `getStatus`,
+`listOpenHandles`, and `selfTabId` — all closed over by `AgentManager`, same
+pattern as `createSummonTool`.
+
+---
+
+## Changes by file
+
+### Core
+
+| File | Change |
+|---|---|
+| `packages/core/src/db/tabs.ts` | New `resolveTabPrefix(prefix)` (sanitize → `LIKE` over open tabs → ok/none/ambiguous). **No schema/column change.** |
+| `packages/core/src/tools/send-to-tab.ts` | **new** — `createSendToTabTool(callbacks)`; `SendToTabCallbacks { resolveShortId, deliver, listOpenHandles, selfTabId }` |
+| `packages/core/src/tools/read-tab.ts` | **new** — `createReadTabTool(callbacks)`; `ReadTabCallbacks { resolveShortId, getLastResponse, getStatus, listOpenHandles }` |
+| `packages/core/src/tools/summon.ts` | Add `send_to_tab`, `read_tab` to the `tools` enum + description list so agents can grant them to children |
+| `packages/core/src/index.ts` | Re-export `resolveTabPrefix` + the two new tool factories |
+
+> Removed from the earlier draft: `short_id` column, migration, backfill,
+> `generateShortId()`, `getTabByShortId()`, and the `shortId` field on
+> `TabRow`/`tab-created`. All obsolete under the derived-prefix design.
+
+### API
+
+| File | Change |
+|---|---|
+| `packages/api/src/agent-manager.ts` | Read `perm_send_to_tab` + `perm_read_tab` + add both to `permKey`; gate each tool independently; build `send_to_tab`/`read_tab` in **both** tool paths (permission path always; child path when `toolsOverride` includes them); add `deliverMessage()`; cold-tab key/model hydrate from DB on wake. Wire tool callbacks to `resolveTabPrefix`/`deliverMessage`/`getChunksForTab`/`getTabStatus`, passing `selfTabId = tabId` |
+| `packages/api/src/app.ts` | Refactor `POST /chat` to call `deliverMessage()` |
+
+> `tab-created` payload is **unchanged** (no `shortId`), and `routes/tabs.ts`
+> needs no change — the frontend derives handles from UUIDs it already receives.
+
+### Frontend
+
+| File | Change |
+|---|---|
+| `packages/frontend/src/lib/tabs.svelte.ts` | Add derived `shortHandleFor` / `{tabId→handle}` map (shortest-unique-prefix over open tabs). **No `Tab` field added** — it's derived, not stored |
+| `packages/frontend/src/lib/components/TabBar.svelte` | Render the derived handle badge on each tab |
+| `packages/frontend/src/lib/components/ToolPermissions.svelte` | Add two entries: `{ id: "send_to_tab", label: "Message other tabs" }` and `{ id: "read_tab", label: "Read other tabs" }` |
+| `packages/frontend/src/lib/settings.svelte.ts` | Add `send_to_tab: false` + `read_tab: false` to `toolPerms` + `savedToolPerms` defaults |
+| `packages/frontend/src/lib/components/AgentBuilder.svelte` *(if it lists tools)* | Include the two new tool names so agent definitions can grant them |
+
+---
+
+## Files NOT changing
+
+| File | Why |
+|---|---|
+| `packages/core/src/db/index.ts` | **No schema change** — handle is derived, not stored |
+| `packages/core/src/types/index.ts` | `tab-created` unchanged; no `shortId` on the wire |
+| `packages/core/src/tools/retrieve.ts` | Unchanged — peer reads use the new `read_tab`, not `retrieve` |
+| `packages/core/src/agent/agent.ts` | The agent loop already passes `queueCallbacks`/context to tools; no change needed |
+| DB `settings` table | Key-value; `perm_send_to_tab` / `perm_read_tab` need no migration |
+| Queue internals (`queueMessage`/`dequeueMessages`/`waitForQueuedMessage`) | Reused as-is |
+
+---
+
+## Testing
+
+- **`packages/core/tests/db/`** — `resolveTabPrefix`: exact UUID → ok; 4-char
+ unique → ok; colliding prefix → ambiguous with both matches; unknown → none;
+ case-insensitive (`6FE8` ↔ `6fe8`); wildcard injection (`%`, `_`) sanitized;
+ closed tabs excluded from matches.
+- **`packages/core/tests/tools/`** — `send_to_tab`: none → lists open handles;
+ ambiguous → asks for more chars; self-send rejected; provenance prefix applied;
+ calls `deliver`. `read_tab`: returns last assistant turn; empty history
+ message; status surfaced.
+- **Frontend** — shortest-unique-prefix: two tabs sharing 4 hex chars both render
+ 5; closing one lets the other shrink back to 4; single tab renders 4.
+- **`packages/api/tests/agent-manager.test.ts`** — `deliverMessage` routes
+ running→queue / idle→processMessage; cold-tab wake hydrates key/model from DB;
+ `perm_send_to_tab` / `perm_read_tab` each gate their tool independently + invalidate cache.
+- **`packages/api/tests/routes.test.ts`** — `POST /chat` still behaves after the
+ `deliverMessage` refactor.
+- Manual: tab A messages idle tab B (B wakes); A messages running tab B (queued,
+ consumed on B's next turn — see dependency below); A `read_tab` B; ambiguous
+ prefix prompts for one more char.
+
+---
+
+## Risks / edge cases / dependencies
+
+- **DEPENDENCY — "queue not consumed after turn" bug.** When the target is
+ **running**, `send_to_tab` queues the message. Per the separate wishlist item,
+ a queued message currently attaches at end-of-turn but does **not** kick off a
+ new turn (`agent.ts` end-of-loop pushes it to history then yields `done`). So
+ peer messages to a *busy* tab won't get a reply until that fix lands. The
+ **idle-wake** path is fully functional today. → Recommend fixing the queue-
+ consumption bug alongside, or shipping idle-wake first and calling out the
+ busy-tab limitation.
+- **Ambiguous prefix is a first-class outcome**, not an error to hide. Surface
+ the competing handles and ask for one more char (git's exact UX). Tests must
+ cover it.
+- **`LIKE` wildcard injection.** `%`/`_` in a raw prefix would broaden the match.
+ Sanitize to `[0-9a-f-]` + min length before querying. Covered by a test.
+- **Display vs resolution drift window.** The frontend computes a handle from its
+ known open tabs; the backend resolves against the DB. If they momentarily
+ disagree (a tab opened elsewhere a beat ago), the worst case is a `none`/
+ `ambiguous` the agent retries — self-correcting, no corruption.
+- **Cold-tab wake loses fallback chain.** An idle tab not in `tabAgents` (e.g.
+ after server restart) only has `key_id`/`model_id` in the DB — the agent
+ definition's multi-model `agentModels` fallback chain isn't persisted. Waking
+ it uses the single stored model (no fallback). Acceptable degradation; note it.
+ (Overlaps with the "key switching not migrating context" wishlist item.)
+- **Deadlock avoidance.** `read_tab` is intentionally non-blocking so two agents
+ can't wait on each other forever.
+- **Runaway agent ping-pong (livelock) — MITIGATED.** Two agents that each reply
+ to incoming messages (A wakes B wakes A ...) would spend tokens unbounded with
+ no human in the loop. Mitigation: an **origin-aware auto-wake budget** in
+ `AgentManager.deliverMessage`. Each tab carries `autoWakeBudget` (max
+ `MAX_AGENT_AUTO_WAKES = 6`). A `send_to_tab` call delivers with `origin:
+ "agent"`; waking an idle tab consumes one unit. At 0, further agent messages
+ are **queued but do NOT wake** the tab (`status: "suppressed"`) and a `notice`
+ system chunk is emitted; the `send_to_tab` tool returns a "HELD — do not keep
+ resending" message so the sender stops. Any human-originated delivery
+ (`POST /chat`, `origin: "human"`, the default) **refills the budget to full**,
+ so human-driven and bounded multi-hop delegation are unrestricted; only
+ unattended machine-to-machine cascades are capped. Messages are never dropped.
+- **Footguns behind a permission.** An agent could spam or wake the user's
+ personal tabs. Mitigations: `perm_send_to_tab` / `perm_read_tab` default **off**; self-send
+ blocked; provenance prefix makes the source visible to the user.
+- **Stale handle in tool description.** Unlike `summon`'s agent catalog, we do
+ NOT bake the live tab list into the tool description (it changes constantly).
+ Discovery is via the UI badge + the open-handle list returned on none/ambiguous.
+
+---
+
+## Suggested phasing
+
+1. **Phase 1 — Derived handles (Part A).** `resolveTabPrefix` (backend) +
+ shortest-unique-prefix display + TabBar badge. No schema change; shippable on
+ its own (useful even before the tools).
+2. **Phase 2 — Tools (Part B).** `send_to_tab` + `read_tab`, `deliverMessage`
+ refactor, `perm_send_to_tab` + `perm_read_tab`, permission UI + defaults, summon whitelist.
+3. **Phase 3 — Polish.** Optional "Agents" sidebar view; fix the
+ queue-consumption bug so busy-tab delivery replies without a nudge; optional
+ `wait` flag on `read_tab`.
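The display and resolution halves described in Part A of notes/plan-tab-comm.md above can be sketched as follows. This is a minimal in-memory illustration, not the committed code: the real `resolveTabPrefix` queries SQLite with `LIKE` over open tabs, whereas this sketch filters an array of open tab IDs. The names `resolvePrefix`, `ResolveResult`, and `MIN_HANDLE_LENGTH` are hypothetical, and the sketch chooses to reject (rather than strip) characters outside `[0-9a-f-]`, which the plan leaves open.

```typescript
// Hypothetical in-memory sketch; the plan's backend version runs a LIKE query instead.
const MIN_HANDLE_LENGTH = 4;

type ResolveResult =
  | { status: "ok"; id: string }
  | { status: "none" }
  | { status: "ambiguous"; matches: string[] };

/** Display side: shortest prefix of `id` (at least 4 chars) that no other open tab shares. */
function shortestUniquePrefix(id: string, openIds: readonly string[]): string {
  const others = openIds.filter((other) => other !== id);
  for (let len = MIN_HANDLE_LENGTH; len < id.length; len++) {
    const prefix = id.slice(0, len);
    if (!others.some((other) => other.startsWith(prefix))) return prefix;
  }
  return id; // degenerate fallback: the full UUID
}

/** Resolution side: sanitize the input, then match by prefix over open tabs only. */
function resolvePrefix(raw: string, openIds: readonly string[]): ResolveResult {
  const prefix = raw.trim().toLowerCase();
  // Assumption: reject anything outside [0-9a-f-], so `%` and `_` never reach a LIKE pattern.
  if (prefix.length < MIN_HANDLE_LENGTH || !/^[0-9a-f-]+$/.test(prefix)) {
    return { status: "none" };
  }
  const matches = openIds.filter((id) => id.startsWith(prefix));
  const [first, ...rest] = matches;
  if (first === undefined) return { status: "none" };
  if (rest.length === 0) return { status: "ok", id: first };
  return { status: "ambiguous", matches };
}
```

On `ambiguous`, a caller would list `matches` (rendered through `shortestUniquePrefix`) and ask for one more character, as the plan describes.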
diff --git a/notes/plan-user-agents.md b/notes/plan-user-agents.md
new file mode 100644
index 0000000..012dbfb
--- /dev/null
+++ b/notes/plan-user-agents.md
@@ -0,0 +1,206 @@
+# Implementation Plan: User Agents
+
+## Summary
+
+Two changes rolled into one:
+
+1. **`agent` becomes required** on `summon` — all spawned agents must use a definition
+2. **New `top_level` mode** — spawns independent "user agent" tabs, gated by a new `perm_user_agent` permission
+
+A **user agent** is a top-level, independent tab spawned by an AI agent via the `summon` tool. Unlike **subagents** (child tabs owned by a parent), user agents appear as first-class tabs — persistent, independent lifecycle, no parent. They are fire-and-forget: the spawning agent gets an `agent_id` back but cannot `retrieve` the result.
+
+User agents **must** be spawned from a non-subagent agent definition (e.g. `default`). The definition controls their tools, models, working directory, and system prompt. Tools are still intersected with the spawning agent's own tools (can't escalate). A new `perm_user_agent` permission gates access to this capability, surfaced as a separate checkbox in the Tool Permissions UI.
+
+---
+
+## Current Architecture (for context)
+
+Today, the `summon` tool creates **subagent tabs**:
+
+- They have a `parentTabId` linking them to the spawning tab
+- They show in a **bottom row** under the parent tab in the tab bar
+- They're **non-persistent** by default (italic, faded) until promoted
+- Their **tools are restricted** — intersected with the parent's tool set (can't escalate)
+- Their **working directory** must be within the parent's working directory
+- They have **completion tracking** (`completionPromise`) — the parent blocks on `retrieve` until the child finishes
+- The `agent` parameter is currently optional — agents can be spawned ad-hoc with just a `tools` list, inheriting the parent's model
+
+### Comparison table
+
+| Property | Subagent (current & updated) | User Agent (new) |
+|---|---|---|
+| `parentTabId` | set to parent | `null` |
+| `persistent` | `false` (promoted on click) | `true` |
+| Tab bar position | Bottom row under parent | Top row with user tabs |
+| Tab lifecycle | Closed when parent closes | Independent |
+| Retrievable | Yes, via `retrieve` tool | No — fire-and-forget |
+| Working directory | Must be within parent's dir | Any dir (from definition or default) |
+| Completion tracking | Yes (`completionPromise`) | No |
+| Agent definition | Required | Required (`is_subagent !== true`) |
+| Models | From agent definition | From agent definition |
+
+---
+
+## Summon tool — new parameter shape
+
+```
+summon({
+  task: string,               // required — what to do
+  agent: string,              // required — agent definition slug (was optional)
+  top_level?: boolean,        // optional — spawn as user agent (only in schema if perm_user_agent enabled)
+  tools?: string[],           // optional — override tools (intersected with spawning agent's tools)
+  background?: boolean,       // optional — for subagents only (user agents are always fire-and-forget)
+  working_directory?: string  // optional — override the definition's cwd
+})
+```
+
+### Tools resolution
+
+- **`tools` omitted**: agent definition's tools ∩ spawning agent's tools
+- **`tools` provided**: provided tools ∩ spawning agent's tools
+
+Models always come from the agent definition.
+
+---
+
+## Changes by file
+
+### 1. `packages/core/src/tools/summon.ts`
+
+**Schema changes:**
+
+- `agent` — change from `.optional()` to required
+- `top_level` — new `z.boolean().optional()`, **only included in the schema when `userAgentEnabled` is `true`**
+- `tools` — stays optional. Description updated: "Defaults to the agent definition's tools. Intersected with the spawning agent's tools."
+- `background` — stays optional, ignored when `top_level: true`
+- `working_directory` — stays optional
+
+**Factory signature change:**
+
+```ts
+createSummonTool(
+  defaultWorkingDirectory: string,
+  callbacks: SummonCallbacks,
+  availableSubagents: AvailableAgent[],   // is_subagent === true
+  availableUserAgents: AvailableAgent[],  // is_subagent !== true
+  agentDirs: string[],
+  userAgentEnabled: boolean,              // new — controls whether top_level param + user agent catalog exist
+)
+```
+
+**`SummonCallbacks.spawn` interface:**
+
+- Add `topLevel?: boolean` to the spawn options object
+
+**`buildAgentsCatalog` update:**
+
+The catalog in the tool description is built conditionally:
+
+- **When `userAgentEnabled` is `false`**: only show the subagents group:
+ ```
+ Available agents:
+ - programmer: Programmer — Implements code from a given plan
+ - flash: Flash — A cheap subagent
+ ```
+
+- **When `userAgentEnabled` is `true`**: show two labeled groups:
+ ```
+ Subagents (spawned as child tabs):
+ - programmer: Programmer — Implements code from a given plan
+ - flash: Flash — A cheap subagent
+
+ User agents (spawned as independent top-level tabs, requires top_level=true):
+ - default: Default — Default agent with all tools enabled
+ ```
+
+**`toAvailableAgents` → split into two functions:**
+
+- Rename existing to `toAvailableSubagents()` — keeps `is_subagent === true` filter
+- New `toAvailableUserAgents()` — filters `is_subagent !== true`
+
+**Execute logic when `top_level: true`:**
+
+- Always return immediately with `agent_id` (fire-and-forget, ignore `background`)
+- Call `callbacks.spawn(...)` with `topLevel: true`
+
+---
+
+### 2. `packages/core/src/index.ts`
+
+- Re-export renamed `toAvailableSubagents` and new `toAvailableUserAgents`
+
+---
+
+### 3. `packages/api/src/agent-manager.ts`
+
+**`getOrCreateAgentForTab` — permission reading & tool construction:**
+
+- Read new setting: `const permUserAgent = getSetting("perm_user_agent") === "allow"`
+- Include `permUserAgent` in the `permKey` cache-invalidation string
+- Load user agent definitions via `toAvailableUserAgents(...)`
+- Pass `userAgentEnabled: permUserAgent` and `availableUserAgents` to `createSummonTool`
+
+**`spawnChildAgent` — when `topLevel: true`:**
+
+- **`parentTabId`**: pass `null` to `createTab()` and the `tab-created` event
+- **Working directory**: use the agent definition's `cwd` if set, otherwise the global default (`DISPATCH_WORKING_DIR` / `process.cwd()`). **No containment check** against parent's directory.
+- **Tools**: from definition (or `tools` param if provided), intersected with spawning agent's tools. Same intersection logic as subagents — can't escalate.
+- **Models**: from the agent definition's `models` array
+- **No completion tracking**: skip `completionPromise` / `completionResolve` setup. Leave them `undefined`.
+
+**`getChildResult` guard:**
+
+- If the tab has no `completionPromise` and status is `running`, return error: `"This is a user agent (top-level tab) and cannot be retrieved. User agents are fire-and-forget."`
+
+---
+
+### 4. `packages/frontend/src/lib/components/ToolPermissions.svelte`
+
+Add a new entry to the `toolPermissions` array:
+
+```ts
+{
+  id: "user_agent",
+  label: "Spawn user agents",
+  description: "Allow the AI to open new independent top-level tabs"
+}
+```
+
+---
+
+### 5. `packages/frontend/src/lib/settings.svelte.ts`
+
+Add `user_agent: false` to the default `toolPerms` and `savedToolPerms` objects.
+
+---
+
+## Files NOT changing
+
+| File | Why |
+|---|---|
+| `packages/frontend/src/lib/components/TabBar.svelte` | Already renders `parentTabId === null` tabs in top row as persistent |
+| `packages/frontend/src/lib/tabs.svelte.ts` (`tab-created` handler) | Already sets `persistent: parentTabId == null` |
+| `packages/core/src/tools/retrieve.ts` | Unchanged — the retrieve guard lives in AgentManager |
+| `packages/core/src/agents/loader.ts` | `is_subagent` already exists and distinguishes the two types |
+| DB schema | The `settings` table is key-value, no migration needed |
+| Agent definition TOML format | `is_subagent` already exists |
+
+---
+
+## Complete file change list
+
+| File | Change |
+|---|---|
+| `packages/core/src/tools/summon.ts` | `agent` required, conditional `top_level` param, two-group catalog (subagents-only when no perm), `toAvailableSubagents()` + `toAvailableUserAgents()`, spawn interface update |
+| `packages/core/src/index.ts` | Re-export new/renamed functions |
+| `packages/api/src/agent-manager.ts` | Read `perm_user_agent`, pass to factory, handle `topLevel` in spawn, retrieve guard |
+| `packages/frontend/src/lib/components/ToolPermissions.svelte` | Add "Spawn user agents" checkbox |
+| `packages/frontend/src/lib/settings.svelte.ts` | Add `user_agent` default |
+
+---
+
+## Risks / edge cases
+
+- **Nested user agents**: A user agent could itself have `perm_user_agent` and spawn more user agents. This is allowed and works naturally since user agents are independent tabs with no parent chain.
+- **Agent definition with no models**: Should not happen in practice — the Agent Builder UI requires at least one model entry. But if it does, the spawn will fail at the model-resolution step with a clear error.
+- **Retrieve on user agent**: Guarded in `getChildResult` — returns an error message explaining user agents are fire-and-forget.
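The "Tools resolution" rule in notes/plan-user-agents.md above (definition tools or the `tools` override, always intersected with the spawning agent's tools) could look roughly like this. It is a sketch under assumptions: `resolveSpawnedTools` is a hypothetical helper name, and tools are modeled as plain string names rather than the project's `ToolDefinition` objects.

```typescript
/**
 * Hypothetical helper: compute the tool names a spawned agent may use.
 * - `requested` omitted  -> definition tools ∩ spawner tools
 * - `requested` provided -> requested tools ∩ spawner tools
 * Either way the result can never exceed what the spawner already has.
 */
function resolveSpawnedTools(
  spawnerTools: readonly string[],
  definitionTools: readonly string[],
  requested?: readonly string[],
): string[] {
  const allowed = new Set(spawnerTools);
  const base = requested ?? definitionTools;
  // De-duplicate while preserving order, then drop anything the spawner lacks.
  return [...new Set(base)].filter((tool) => allowed.has(tool));
}
```

Because the intersection is applied in both branches, passing an explicit `tools` list cannot escalate beyond the spawning agent's own set.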
diff --git a/notes/todo-tool-redesign-plan.md b/notes/todo-tool-redesign-plan.md
new file mode 100644
index 0000000..7e3af48
--- /dev/null
+++ b/notes/todo-tool-redesign-plan.md
@@ -0,0 +1,86 @@
+# Todo/Task tool redesign — port opencode's effective design into Dispatch
+
+## 1. What each implementation does today
+
+### Dispatch (current — disabled because it confused agents)
+- Tool name: `todo`. **Imperative CRUD** with 5 actions: `add`, `update`, `list`, `get`, `remove`.
+- Each task gets a **server-assigned opaque id** (`task-1`, `task-2`, …) returned by `add`.
+- To mutate state the model must call `update` / `remove` **with the right `task_id`**.
+- `TaskItem = { id, title, description, status }`, status `pending | in_progress | done | blocked`.
+- Wiring: `packages/core/src/tools/task-list.ts` (class `TaskList` + `createTaskListTool`),
+ instantiated per-tab in `packages/api/src/agent-manager.ts`, broadcast via the
+ `task-list-update` agent event, rendered by `packages/frontend/.../TaskListPanel.svelte`.
+
+### opencode (effective)
+- Tool name: `todowrite`. **One declarative action**: a single param `todos` containing the
+ **entire list**. Every call **replaces** the whole list.
+- Todo shape: `{ content, status, priority }`. **No ids exposed to the model.**
+- Statuses: `pending | in_progress | completed | cancelled`. Priority: `high | medium | low`.
+- Rich tool description (`todowrite.txt`) + heavy **system-prompt reinforcement**
+ (`prompt/anthropic.txt` "Task Management" section) with worked examples of the *flow*.
+- Persisted per session, broadcast via a `todo.updated` bus event to the UI.
+
+## 2. Why opencode's is effective and Dispatch's spirals
+
+The single biggest problem with Dispatch's version is that it is an **imperative, id-based,
+multi-action API**. That creates several failure modes that make weaker models spiral:
+
+1. **Id bookkeeping.** The model must remember each `task-N` id returned by `add` and reference
+ it later. LLMs lose track, **guess an id**, and hit `Error: Task with ID 'task-3' not found`,
+ then thrash trying to re-sync with `list`/`get`.
+2. **Many round-trips.** Setting up a 5-item plan and marking one in progress is **6+ tool calls**
+ (5×`add` + 1×`update`). Re-planning means `remove`+`add` churn.
+3. **Delta reasoning.** The model has to reason about *current server state* vs *desired state* and
+ emit the diff. LLMs are far better at emitting a **full desired state**.
+4. **Inconsistent surface.** `blocked` exists in the type but is never explained in the prompt and
+ isn't rendered — dead state that invites confusion.
+
+opencode's `todowrite` removes this entire class of problems: it is **declarative / idempotent**.
+The model emits the full desired list each time; no ids, no deltas, no "not found" errors, and a
+5-item plan + one in-progress is **one call**. The strong description + system-prompt examples teach
+the *cadence* (write list → mark in_progress → work → mark completed → next).
+
+## 3. Plan for Dispatch
+
+Goal: keep all existing plumbing (tool **name `todo`**, the `task-list-update` event, the
+`TaskList` per-tab store, the sidebar panel) but swap the **imperative CRUD interface for a
+declarative whole-list write**, matching opencode's model. Keeping the name `todo` means zero churn
+in the allowlist/summon/loader/permission wiring and existing agent TOMLs.
+
+### Core (`packages/core`)
+1. `types/index.ts`
+ - `TaskStatus = "pending" | "in_progress" | "completed" | "cancelled"`.
+ - `TaskPriority = "high" | "medium" | "low"`.
+ - `TaskItem = { id, content, status, priority }` (`id` is kept as a positional index for UI keying
+ and the event contract; **never shown to the model**).
+2. `tools/task-list.ts`
+ - Replace the CRUD `TaskList` (add/update/get/remove) with a declarative store:
+ `setTasks(items)` rebuilds the list (positional ids), `getTasks()`, `onChange()`.
+ - `createTaskListTool` exposes ONE param `todos: Array<{content, status, priority}>`; `execute`
+ calls `setTasks` and echoes the canonical stored list back (without ids). Robust defaults:
+ missing `priority`→`medium`, missing/invalid `status`→`pending`, empty array clears the list.
+ - New rich `TODO_DESCRIPTION` ported/adapted from opencode's `todowrite.txt` (When to use / When
+ NOT / States / Rules / Examples), emphasising "send the whole list every time, no ids".
+
+### API (`packages/api/src/agent-manager.ts`)
+3. Replace `TODO_GUIDANCE` with an opencode-style **"Task Management"** system-prompt section:
+ declarative whole-list semantics, "use it VERY frequently", one `in_progress` at a time, mark
+ completed immediately, plus the two worked examples. Update `TOOL_DESCRIPTIONS.todo`.
+ (Wiring via `createTaskListTool(tabAgent.taskList)` and `onChange → task-list-update` is unchanged.)
+
+### Frontend (`packages/frontend`)
+4. `lib/types.ts`: mirror the new `TaskItem` (`{ id, content, status, priority }` + new status union).
+5. `lib/components/TaskListPanel.svelte`: render `content`; map statuses (completed→checked/struck,
+ in_progress→indeterminate/bold, cancelled→dim/struck, pending→empty); subtle `high` priority hint;
+ update the counter label (`completed`/`in progress`).
+
+### Tests + verification
+6. `packages/core/tests/tools/task-list.test.ts`: empty list, whole-list replace, status/priority
+ preservation, default priority, `onChange` fires, tool `execute` updates store + echoes no ids.
+7. `bun run typecheck` (core), `bun run test` (vitest), `bun run check` (biome).
+
+### Out of scope / deliberately unchanged
+- Tool name stays `todo`; `expandAgentToolNames`, summon defaults, permission keys, agent TOMLs
+ untouched.
+- Persistence to DB (opencode stores todos in SQLite) is **not** added — Dispatch keeps the existing
+ in-memory per-tab `TaskList`; the visible/UX behaviour is what was failing, and that's what we fix.
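
The declarative store outlined in the plan above could look roughly like the following. This is a minimal sketch, not the code from this commit: the `task-N` id format, the `TodoInput` shape, and the choice to map an invalid `priority` to `medium` (the plan only covers a missing one) are assumptions made for illustration.

```typescript
// Hypothetical sketch of a whole-list ("declarative") task store.
type TaskStatus = "pending" | "in_progress" | "completed" | "cancelled";
type TaskPriority = "high" | "medium" | "low";

interface TaskItem {
  id: string; // positional, for UI keying only; never shown to the model
  content: string;
  status: TaskStatus;
  priority: TaskPriority;
}

// Loose input shape (assumed): the model may omit or mangle fields.
interface TodoInput {
  content: string;
  status?: string;
  priority?: string;
}

const STATUSES: readonly string[] = ["pending", "in_progress", "completed", "cancelled"];
const PRIORITIES: readonly string[] = ["high", "medium", "low"];

class TaskList {
  private tasks: TaskItem[] = [];
  private listeners: Array<(tasks: TaskItem[]) => void> = [];

  // Replace the whole list; ids are rebuilt from position on every call.
  setTasks(items: TodoInput[]): void {
    this.tasks = items.map((item, index) => ({
      id: `task-${index + 1}`, // assumed id format
      content: item.content,
      // Missing or invalid status falls back to "pending".
      status: (STATUSES.includes(item.status ?? "") ? item.status : "pending") as TaskStatus,
      // Missing (and, by assumption, invalid) priority falls back to "medium".
      priority: (PRIORITIES.includes(item.priority ?? "") ? item.priority : "medium") as TaskPriority,
    }));
    for (const listener of this.listeners) listener(this.getTasks());
  }

  // Return copies so callers cannot mutate the stored list.
  getTasks(): TaskItem[] {
    return this.tasks.map((t) => ({ ...t }));
  }

  onChange(listener: (tasks: TaskItem[]) => void): void {
    this.listeners.push(listener);
  }
}
```

Because every call rebuilds the list, the model never has to track ids or compute a diff, and an empty array clears the list with no special case.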
diff --git a/notes/wishlist.md b/notes/wishlist.md
index aa806b5..8c3f5ef 100644
--- a/notes/wishlist.md
+++ b/notes/wishlist.md
@@ -7,8 +7,6 @@
- Start a chat on one device (e.g. desktop) and seamlessly pick it up later on another (e.g. phone).
- Sidebar remembers which views were open and in what order, restoring them exactly as they were.
-- **Edit chat history.** Click on any existing message in the chat history and choose to edit it — this applies to user messages, AI responses, and tool results.
-
- **Update the way tools appear in the chat UI.** Improve the visual presentation of tool calls and their results — make them more readable, compact, and scannable.
- **Show git diffs for edited files.** When the AI edits a file (write_file tool call), display a git diff in the UI rather than just the raw file content.
diff --git a/packages/api/src/agent-manager.ts b/packages/api/src/agent-manager.ts
index 111237c..517c661 100644
--- a/packages/api/src/agent-manager.ts
+++ b/packages/api/src/agent-manager.ts
@@ -15,8 +15,10 @@ import {
createListFilesTool,
createReadFileSliceTool,
createReadFileTool,
+ createReadTabTool,
createRetrieveTool,
createRunShellTool,
+ createSendToTabTool,
createSkillsWatcher,
createSummonTool,
createTaskListTool,
@@ -32,6 +34,8 @@ import {
getClaudeAccountsFromDB,
getMessagesForTab,
getSetting,
+ getTab,
+ listOpenTabs,
loadAgent,
loadAgents,
loadConfig,
@@ -41,8 +45,11 @@ import {
refreshAccountCredentials,
refreshAccountCredentialsAsync,
resolveApiKey,
+ resolveTabPrefix,
type SkillDefinition,
type SystemChunkKind,
+ shortestUniquePrefix,
+ type TabResolution,
type TabStatusSnapshot,
TaskList,
toAvailableSubagents,
@@ -73,6 +80,16 @@ const TOOL_DESCRIPTIONS: Record<string, string> = {
"Fetch the transcript/subtitles for a YouTube video. Set background=true to start in the background and get a job_id for later retrieval.",
};
+/**
+ * Maximum number of CONSECUTIVE agent-to-agent auto-wakes a tab will accept
+ * before it stops auto-responding and waits for a human. Each `send_to_tab`
+ * that would wake an idle tab consumes one unit; any human-originated message
+ * (e.g. via `POST /chat`) refills the budget to full. This bounds runaway
+ * agent ping-pong loops (A wakes B wakes A ...) that would otherwise consume
+ * tokens without bound, with no human in the loop. See notes/plan-tab-comm.md.
+ */
+const MAX_AGENT_AUTO_WAKES = 6;
+
const DEFAULT_SYSTEM_PROMPT =
"You are Dispatch, an agent designed to help with any task that the user asks for. Be helpful and concise.";
@@ -197,6 +214,14 @@ interface TabAgent {
* rows. Set at the start of `processMessage`, cleared when the turn ends.
*/
currentTurnId: string | null;
+ /**
+ * Remaining consecutive agent-to-agent auto-wakes this tab will accept
+ * before requiring human intervention (see `MAX_AGENT_AUTO_WAKES`).
+ * Refilled to the max by any human-originated `deliverMessage`; decremented
+ * each time an agent-originated `send_to_tab` wakes this tab from idle. When
+ * it hits 0, further agent messages are queued but do NOT start a turn.
+ */
+ autoWakeBudget: number;
}
export class AgentManager {
@@ -343,6 +368,7 @@ export class AgentManager {
currentChunks: null,
currentAssistantId: null,
currentTurnId: null,
+ autoWakeBudget: MAX_AGENT_AUTO_WAKES,
};
this.tabAgents.set(tabId, tabAgent);
}
@@ -366,10 +392,12 @@ export class AgentManager {
const permBash = getSetting("perm_bash") === "allow";
const permSummon = getSetting("perm_summon") === "allow";
const permUserAgent = getSetting("perm_user_agent") === "allow";
+ const permSendToTab = getSetting("perm_send_to_tab") === "allow";
+ const permReadTab = getSetting("perm_read_tab") === "allow";
const permWebSearch = getSetting("perm_web_search") === "allow";
const permYoutubeTranscribe = getSetting("perm_youtube_transcribe") === "allow";
const sysPrompt = getSetting("system_prompt") ?? "";
- const permKey = `${permRead}:${permEdit}:${permBash}:${permSummon}:${permUserAgent}:${permWebSearch}:${permYoutubeTranscribe}:${sysPrompt}`;
+ const permKey = `${permRead}:${permEdit}:${permBash}:${permSummon}:${permUserAgent}:${permSendToTab}:${permReadTab}:${permWebSearch}:${permYoutubeTranscribe}:${sysPrompt}`;
// If the override differs or permissions changed, invalidate the cached agent
if (
@@ -504,6 +532,12 @@ export class AgentManager {
}),
});
}
+ // Tab-to-tab communication — gated on the child whitelist.
+ if (allowed.has("send_to_tab") || allowed.has("read_tab")) {
+ for (const entry of this.buildTabCommToolEntries(tabId)) {
+ if (allowed.has(entry.name)) toolEntries.push(entry);
+ }
+ }
} else {
// Parent agent: use permission settings from DB
if (permRead) {
@@ -581,6 +615,14 @@ export class AgentManager {
}),
});
}
+ if (permSendToTab || permReadTab) {
+ const tabCommAllowed = new Set<string>();
+ if (permSendToTab) tabCommAllowed.add("send_to_tab");
+ if (permReadTab) tabCommAllowed.add("read_tab");
+ for (const entry of this.buildTabCommToolEntries(tabId)) {
+ if (tabCommAllowed.has(entry.name)) toolEntries.push(entry);
+ }
+ }
}
const tools = toolEntries.map((e) => e.tool);
@@ -971,14 +1013,18 @@ export class AgentManager {
if (!agentDef) {
const allDefs = loadAgents(parentEffectiveDir);
if (options.topLevel) {
- const userAgents = allDefs.filter((d) => !d.is_subagent).map((d) => `${d.slug} (${d.name})`);
+ const userAgents = allDefs
+ .filter((d) => !d.is_subagent)
+ .map((d) => `${d.slug} (${d.name})`);
const hint =
userAgents.length > 0
? ` Available user agents: ${userAgents.join(", ")}.`
: " No user agent definitions exist yet.";
throw new Error(`Agent definition not found: "${options.agentSlug}".${hint}`);
} else {
- const subagents = allDefs.filter((d) => d.is_subagent).map((d) => `${d.slug} (${d.name})`);
+ const subagents = allDefs
+ .filter((d) => d.is_subagent)
+ .map((d) => `${d.slug} (${d.name})`);
const hint =
subagents.length > 0
? ` Available subagents: ${subagents.join(", ")}.`
@@ -1147,7 +1193,8 @@ export class AgentManager {
if (tabAgent.status === "running") {
return {
status: "error",
- error: "This is a user agent (top-level tab) and cannot be retrieved. User agents are fire-and-forget.",
+ error:
+ "This is a user agent (top-level tab) and cannot be retrieved. User agents are fire-and-forget.",
};
}
return {
@@ -1159,6 +1206,214 @@ export class AgentManager {
return tabAgent.completionPromise;
}
+ // ─── Tab-to-tab communication ───────────────────────────────────
+ //
+ // `send_to_tab` / `read_tab` let an agent message a peer tab by its short
+ // handle (a git-style prefix of the tab UUID). Delivery reuses the exact
+ // running→queue / idle→new-turn routing that `POST /chat` uses (see
+ // `deliverMessage`), so an agent message behaves identically to a user one.
+
+ /**
+ * Build the `send_to_tab` + `read_tab` tool entries for `tabId`. Shared by
+ * both tool-construction paths (child whitelist + permission-gated parent).
+ * `selfHandle` is computed once so the calling tab can stamp provenance and
+ * reject self-sends.
+ */
+ private buildTabCommToolEntries(
+ tabId: string,
+ ): Array<{ name: string; tool: ReturnType<typeof createSendToTabTool> }> {
+ const selfHandle = shortestUniquePrefix(tabId);
+ return [
+ {
+ name: "send_to_tab",
+ tool: createSendToTabTool({
+ resolveShortId: (prefix) => this.resolveTabHandle(prefix),
+ // origin: "agent" subjects this to the receiver's auto-wake
+ // budget so agent↔agent loops are bounded (see deliverMessage).
+ deliver: (targetId, message) =>
+ this.deliverMessage(targetId, message, { origin: "agent" }),
+ listOpenHandles: () => this.listOpenHandles(tabId),
+ self: { id: tabId, handle: selfHandle },
+ }),
+ },
+ {
+ name: "read_tab",
+ tool: createReadTabTool({
+ resolveShortId: (prefix) => this.resolveTabHandle(prefix),
+ getLastResponse: (targetId) => this.getLastTabResponse(targetId),
+ listOpenHandles: () => this.listOpenHandles(tabId),
+ }),
+ },
+ ];
+ }
+
+ /**
+ * Project a core `ResolveTabPrefixResult` down to the tool-facing
+ * `TabResolution` (minimal `{ id, title, handle }` refs). Each match's
+ * `handle` is recomputed via `shortestUniquePrefix` so the value the tool
+ * echoes back always matches what the UI currently shows.
+ */
+ private resolveTabHandle(prefix: string): TabResolution {
+ const res = resolveTabPrefix(prefix);
+ if (res.status === "none") return { status: "none" };
+ if (res.status === "ok") {
+ return {
+ status: "ok",
+ tab: {
+ id: res.tab.id,
+ title: res.tab.title,
+ handle: shortestUniquePrefix(res.tab.id),
+ },
+ };
+ }
+ return {
+ status: "ambiguous",
+ matches: res.matches.map((t) => ({
+ id: t.id,
+ title: t.title,
+ handle: shortestUniquePrefix(t.id),
+ })),
+ };
+ }
+
+ /** Snapshot of open tabs as `{ handle, title }`, excluding `exceptId`
+ * (typically the caller's own tab). Drives the "available tabs" hints. */
+ private listOpenHandles(exceptId?: string): Array<{ handle: string; title: string }> {
+ return listOpenTabs()
+ .filter((t) => t.id !== exceptId)
+ .map((t) => ({ handle: shortestUniquePrefix(t.id), title: t.title }));
+ }
+
+ /**
+ * Return the text of a tab's most recent COMPLETED assistant turn, plus
+ * its current status. Reads the persisted chunk log (source of truth),
+ * scanning backwards for the last `role === "assistant"` message with
+ * non-empty text (textless turns are skipped). `text` is null if none exists.
+ */
+ getLastTabResponse(tabId: string): { text: string | null; status: AgentStatus } {
+ const status = this.getTabStatus(tabId);
+ try {
+ const messages = getMessagesForTab(tabId);
+ for (let i = messages.length - 1; i >= 0; i--) {
+ const msg = messages[i];
+ if (!msg || msg.role !== "assistant") continue;
+ const text = msg.chunks
+ .filter((c): c is { type: "text"; text: string } => c.type === "text")
+ .map((c) => c.text)
+ .join("")
+ .trim();
+ if (text.length > 0) return { text, status };
+ }
+ } catch {
+ // DB unavailable / tab unknown — fall through to null.
+ }
+ return { text: null, status };
+ }
+
+ /**
+ * Deliver `message` to `tabId`, choosing the SAME routing as `POST /chat`:
+ * - target running → queue it (consumed like a user interrupt).
+ * - target idle/errored → wake it and start a new turn.
+ *
+ * Returns quickly; does NOT block on the turn. Both the HTTP `/chat` path
+ * and the `send_to_tab` tool call through here so the running/idle decision
+ * lives in exactly one place.
+ *
+ * `opts` carries the per-request knobs `/chat` forwards (key/model, agent
+ * fallback chain, reasoning effort, working dir, an explicit queue id). The
+ * `send_to_tab` tool passes none of these, so the key/model are hydrated
+ * from the live `TabAgent` when it has them, else (on a cold wake: a tab not
+ * in `tabAgents`, e.g. after a server restart) from the persisted tab row.
+ * (A cold tab keeps its stored key/model but not its full agent-definition
+ * fallback chain; see plan notes.)
+ */
+ deliverMessage(
+ tabId: string,
+ message: string,
+ opts: {
+ keyId?: string;
+ modelId?: string;
+ agentModels?: Array<{ key_id: string; model_id: string }>;
+ reasoningEffort?: "none" | "low" | "medium" | "high" | "max";
+ workingDirectory?: string;
+ queueId?: string;
+ /**
+ * Who is sending this message. `"human"` (default) is unrestricted
+ * and REFILLS the target's agent-to-agent auto-wake budget. `"agent"`
+ * (from the `send_to_tab` tool) is governed by that budget: an
+ * agent-originated wake of an idle tab consumes one unit, and once the
+ * budget is exhausted the message is queued WITHOUT starting a turn
+ * (returned as `suppressed`) so a runaway A↔B loop can't spend tokens
+ * forever with no human in the loop.
+ */
+ origin?: "human" | "agent";
+ } = {},
+ ): { status: "queued"; messageId: string } | { status: "started" } | { status: "suppressed" } {
+ const origin = opts.origin ?? "human";
+
+ // A human touching the tab clears any accumulated agent-wake throttle:
+ // the conversation is back under human supervision, so peers get a fresh
+ // budget of auto-wakes again.
+ if (origin === "human") {
+ this._getOrCreateTabAgent(tabId).autoWakeBudget = MAX_AGENT_AUTO_WAKES;
+ }
+
+ if (this.getTabStatus(tabId) === "running") {
+ // Busy target → always queue (consumed like a user interrupt),
+ // regardless of origin. Queuing does not itself start a turn, so it
+ // can't drive a runaway loop; we don't spend budget here.
+ const { messageId } = this.queueMessage(tabId, message, opts.queueId);
+ return { status: "queued", messageId };
+ }
+
+ // Idle/errored target → this delivery would WAKE the tab (start a turn).
+ // For agent-originated wakes, enforce the auto-wake budget first.
+ if (origin === "agent") {
+ const target = this._getOrCreateTabAgent(tabId);
+ if (target.autoWakeBudget <= 0) {
+ // Budget exhausted: preserve the message (queue it, never drop)
+ // but do NOT wake the tab. A human message will refill the budget
+ // and the queued message will be seen on the next human turn.
+ this.queueMessage(tabId, message, opts.queueId);
+ const notice =
+ `Automatic agent-to-agent message limit reached for this tab ` +
+ `(${MAX_AGENT_AUTO_WAKES} consecutive). Further messages from other tabs ` +
+ `are held until you send a message here.`;
+ this.emit({ type: "notice", message: notice }, tabId);
+ this.routeSystemEventToTab(tabId, "notice", notice);
+ return { status: "suppressed" };
+ }
+ target.autoWakeBudget -= 1;
+ }
+
+ // Resolve key/model: explicit opts win, then the live tab agent's, then
+ // the persisted row's.
+ const tabAgent = this.tabAgents.get(tabId);
+ let keyId = opts.keyId ?? tabAgent?.keyId ?? undefined;
+ let modelId = opts.modelId ?? tabAgent?.modelId ?? undefined;
+ const agentModels = opts.agentModels ?? tabAgent?.agentModels;
+ if (!keyId || !modelId) {
+ const row = getTab(tabId);
+ if (row) {
+ keyId = keyId ?? row.keyId ?? undefined;
+ modelId = modelId ?? row.modelId ?? undefined;
+ }
+ }
+
+ this.processMessage(
+ tabId,
+ message,
+ keyId,
+ modelId,
+ opts.reasoningEffort,
+ opts.workingDirectory,
+ agentModels,
+ ).catch((err) => {
+ console.error(`[dispatch] deliverMessage processMessage error for tab ${tabId}:`, err);
+ });
+ return { status: "started" };
+ }
+
async processMessage(
tabId: string,
message: string,
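
The short-handle scheme used by `resolveTabHandle` and `listOpenHandles` above (a git-style shortest unique prefix of the tab UUID, minimum 4 characters) can be illustrated with an in-memory sketch. The real `shortestUniquePrefix` and `resolveTabPrefix` live in `core/db/tabs` and are not shown in this part of the diff, so the signatures below, which take the open ids as an explicit argument, are hypothetical; the sanitisation mirrors the test mock later in this commit.

```typescript
// Hypothetical in-memory sketch; the real helpers query the tabs DB.
const MIN_HANDLE_LENGTH = 4; // minimum handle length, per the commit message

// Shortest prefix of `id` (at least MIN_HANDLE_LENGTH chars) that no other
// id in `openIds` shares. Falls back to the full id if none is unique.
function shortestUniquePrefix(id: string, openIds: string[]): string {
  const others = openIds.filter((other) => other !== id);
  for (let len = MIN_HANDLE_LENGTH; len < id.length; len++) {
    const prefix = id.slice(0, len);
    if (!others.some((other) => other.startsWith(prefix))) return prefix;
  }
  return id;
}

type Resolution =
  | { status: "none" }
  | { status: "ok"; id: string }
  | { status: "ambiguous"; matches: string[] };

// Resolve a user- or agent-supplied prefix to zero, one, or many open tabs.
function resolvePrefix(prefix: string, openIds: string[]): Resolution {
  // Keep only hex digits and dashes, as the test mock in this commit does.
  const sanitized = prefix.toLowerCase().replace(/[^0-9a-f-]/g, "");
  if (sanitized.length < MIN_HANDLE_LENGTH) return { status: "none" };
  const matches = openIds.filter((id) => id.toLowerCase().startsWith(sanitized));
  if (matches.length === 0) return { status: "none" };
  if (matches.length === 1) return { status: "ok", id: matches[0] as string };
  return { status: "ambiguous", matches };
}
```

Returning an explicit `ambiguous` result, rather than picking the first match, lets the tool echo the candidate handles back to the calling agent so it can retry with a longer prefix.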
diff --git a/packages/api/src/app.ts b/packages/api/src/app.ts
index 73d3de5..19cc193 100644
--- a/packages/api/src/app.ts
+++ b/packages/api/src/app.ts
@@ -56,28 +56,33 @@ app.post("/chat", async (c) => {
return c.json({ error: "message must be a non-empty string" }, 400);
}
- if (agentManager.getTabStatus(tabId) === "running") {
- const queueId = typeof body.queueId === "string" ? body.queueId : undefined;
- const { messageId } = agentManager.queueMessage(tabId, message, queueId);
- return c.json({ status: "queued", messageId });
- }
-
const keyId = typeof body.keyId === "string" ? body.keyId : undefined;
const modelId = typeof body.modelId === "string" ? body.modelId : undefined;
const agentModels = Array.isArray(body.agentModels) ? body.agentModels : undefined;
const workingDirectory =
typeof body.workingDirectory === "string" ? body.workingDirectory : undefined;
+ const queueId = typeof body.queueId === "string" ? body.queueId : undefined;
const validEfforts = ["none", "low", "medium", "high", "max"];
const reasoningEffort =
typeof body.reasoningEffort === "string" && validEfforts.includes(body.reasoningEffort)
? (body.reasoningEffort as "none" | "low" | "medium" | "high" | "max")
: undefined;
- // Non-blocking — let the agent run in the background
- agentManager
- .processMessage(tabId, message, keyId, modelId, reasoningEffort, workingDirectory, agentModels)
- .catch(console.error);
+ // Single routing decision (queue if busy, new turn if idle) shared with the
+ // `send_to_tab` tool via `AgentManager.deliverMessage`. Non-blocking — a
+ // started turn runs in the background.
+ const outcome = agentManager.deliverMessage(tabId, message, {
+ ...(keyId ? { keyId } : {}),
+ ...(modelId ? { modelId } : {}),
+ ...(agentModels ? { agentModels } : {}),
+ ...(reasoningEffort ? { reasoningEffort } : {}),
+ ...(workingDirectory !== undefined ? { workingDirectory } : {}),
+ ...(queueId ? { queueId } : {}),
+ });
+ if (outcome.status === "queued") {
+ return c.json({ status: "queued", messageId: outcome.messageId });
+ }
return c.json({ status: "ok" });
});
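
The routing decision that `/chat` now delegates to `deliverMessage` (queue if busy, start a turn if idle, with agent-originated wakes limited by a budget) can be reduced to a small state machine. The sketch below is a simplification under stated assumptions: `TabState` and `deliver` are illustrative names, the queue holds bare strings, and starting a turn is represented only by the return value.

```typescript
// Simplified, hypothetical model of the deliverMessage routing.
const MAX_AGENT_AUTO_WAKES = 6;

type Origin = "human" | "agent";
type Outcome = "queued" | "started" | "suppressed";

interface TabState {
  running: boolean;
  queue: string[];
  autoWakeBudget: number;
}

function deliver(tab: TabState, message: string, origin: Origin = "human"): Outcome {
  // Any human-originated message restores the full auto-wake budget.
  if (origin === "human") tab.autoWakeBudget = MAX_AGENT_AUTO_WAKES;

  // Busy target: always queue. Queuing starts no turn, so it costs no budget.
  if (tab.running) {
    tab.queue.push(message);
    return "queued";
  }

  // Idle target: an agent-originated wake must pay one unit of budget.
  if (origin === "agent") {
    if (tab.autoWakeBudget <= 0) {
      tab.queue.push(message); // preserved, but the tab is not woken
      return "suppressed";
    }
    tab.autoWakeBudget -= 1;
  }

  // The real code would start a background turn here.
  return "started";
}
```

Keeping the busy/idle check and the budget check in one function is what lets the HTTP route and the `send_to_tab` tool share identical behaviour; only the `origin` argument differs.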
diff --git a/packages/api/tests/agent-manager.test.ts b/packages/api/tests/agent-manager.test.ts
index 4415bbb..1358eb1 100644
--- a/packages/api/tests/agent-manager.test.ts
+++ b/packages/api/tests/agent-manager.test.ts
@@ -24,6 +24,39 @@ function resetFakeMessages(): void {
function setFakeMessages(tabId: string, rows: FakeMessageRow[]): void {
fakeMessagesByTab.set(tabId, rows);
}
+
+// Configurable stub for the tabs DB (getTab / listOpenTabs). Tests can seed
+// rows to exercise deliverMessage cold-hydration and handle resolution.
+interface FakeTabRow {
+ id: string;
+ title: string;
+ keyId: string | null;
+ modelId: string | null;
+ parentTabId: string | null;
+ status: string;
+ isOpen: boolean;
+ position: number;
+ createdAt: number;
+ updatedAt: number;
+}
+const fakeTabs = new Map<string, FakeTabRow>();
+function resetFakeTabs(): void {
+ fakeTabs.clear();
+}
+function setFakeTab(row: Partial<FakeTabRow> & { id: string }): void {
+ fakeTabs.set(row.id, {
+ title: "Tab",
+ keyId: null,
+ modelId: null,
+ parentTabId: null,
+ status: "idle",
+ isOpen: true,
+ position: 0,
+ createdAt: 0,
+ updatedAt: 0,
+ ...row,
+ });
+}
function makeRow(
tabId: string,
seq: number,
@@ -42,11 +75,22 @@ function makeRow(
// because the production code reassigns `agent.messages =
// rows.slice(...)` AFTER `new Agent()` returns — capturing a
// reference at construction would yield a stale empty array.
-const constructedAgents: Array<{ initialMessages: unknown[] }> = [];
+const constructedAgents: Array<{ initialMessages: unknown[]; toolNames: string[] }> = [];
function resetConstructedAgents(): void {
constructedAgents.length = 0;
}
+// Configurable settings store so tests can toggle tool permissions
+// (perm_send_to_tab / perm_read_tab / ...) and assert which tools the
+// constructed Agent receives. Defaults to empty (getSetting → null).
+const fakeSettings = new Map<string, string>();
+function resetFakeSettings(): void {
+ fakeSettings.clear();
+}
+function setFakeSetting(key: string, value: string): void {
+ fakeSettings.set(key, value);
+}
+
// Allow tests to swap in a custom `run` generator (e.g. to simulate
// a fallback failure mid-stream). Returning to undefined restores
// the default.
@@ -87,12 +131,19 @@ vi.mock("@dispatch/core", () => ({
Agent: class MockAgent {
status = "idle";
messages: unknown[] = [];
+ toolNames: string[] = [];
+ constructor(config: { tools?: Array<{ name: string }> }) {
+ this.toolNames = (config?.tools ?? []).map((t) => t.name);
+ }
async *run(message: string): AsyncGenerator<unknown> {
// Snapshot the post-construction pre-populated message list
// the first thing `run()` does, before the real `Agent.run`
// would push the current user message at line 546. Tests
// inspect this to verify history was loaded correctly.
- constructedAgents.push({ initialMessages: [...this.messages] });
+ constructedAgents.push({
+ initialMessages: [...this.messages],
+ toolNames: [...this.toolNames],
+ });
if (runImpl) {
for await (const ev of runImpl(message)) yield ev;
return;
@@ -244,6 +295,41 @@ vi.mock("@dispatch/core", () => ({
};
},
createTab() {},
+ getTab(id: string) {
+ return fakeTabs.get(id) ?? null;
+ },
+ listOpenTabs() {
+ return [...fakeTabs.values()].filter((t) => t.isOpen);
+ },
+ resolveTabPrefix(prefix: string) {
+ const sanitized = (prefix ?? "").toLowerCase().replace(/[^0-9a-f-]/g, "");
+ if (sanitized.length < 4) return { status: "none" };
+ const matches = [...fakeTabs.values()].filter(
+ (t) => t.isOpen && t.id.toLowerCase().startsWith(sanitized),
+ );
+ if (matches.length === 0) return { status: "none" };
+ if (matches.length === 1) return { status: "ok", tab: matches[0] };
+ return { status: "ambiguous", matches };
+ },
+ shortestUniquePrefix(id: string) {
+ return (id ?? "").slice(0, 4);
+ },
+ createSendToTabTool(_callbacks: unknown): ToolDefinition {
+ return {
+ name: "send_to_tab",
+ description: "send to tab",
+ parameters: { _type: "z.ZodObject", shape: {} } as unknown as ToolDefinition["parameters"],
+ execute: async () => "mock",
+ };
+ },
+ createReadTabTool(_callbacks: unknown): ToolDefinition {
+ return {
+ name: "read_tab",
+ description: "read tab",
+ parameters: { _type: "z.ZodObject", shape: {} } as unknown as ToolDefinition["parameters"],
+ execute: async () => "mock",
+ };
+ },
getClaudeAccountsFromDB() {
return [];
},
@@ -256,8 +342,8 @@ vi.mock("@dispatch/core", () => ({
resolveApiKey() {
return null;
},
- getSetting(_key: string) {
- return null;
+ getSetting(key: string) {
+ return fakeSettings.get(key) ?? null;
},
appendChunks() {
return [];
@@ -316,6 +402,8 @@ describe("AgentManager", () => {
beforeEach(() => {
resetFakeMessages();
resetConstructedAgents();
+ resetFakeTabs();
+ resetFakeSettings();
setRunImpl(null);
appendEventToChunksSpy.mockClear();
});
@@ -849,4 +937,273 @@ describe("AgentManager", () => {
expect(snap["tab-early"]).not.toHaveProperty("currentChunks");
expect(snap["tab-early"]).not.toHaveProperty("currentAssistantId");
});
+
+ // ─── Tab-to-tab communication ─────────────────────────────────
+
+ describe("deliverMessage", () => {
+ it("starts a new turn when the target tab is idle", async () => {
+ const manager = new AgentManager();
+ const events: AgentEvent[] = [];
+ manager.onEvent((e) => events.push(e));
+
+ const outcome = manager.deliverMessage("tab-idle", "wake up");
+ expect(outcome.status).toBe("started");
+
+ // Let the background turn run to completion.
+ await new Promise<void>((r) => setTimeout(r, 60));
+ expect(events.some((e) => e.type === "text-delta")).toBe(true);
+ expect(manager.getTabStatus("tab-idle")).toBe("idle");
+ });
+
+ it("queues the message when the target tab is running", () => {
+ const manager = new AgentManager();
+ const inner = manager as unknown as {
+ tabAgents: Map<string, Record<string, unknown>>;
+ };
+ // Seed a running tab agent directly.
+ inner.tabAgents.set("tab-busy", {
+ agent: null,
+ status: "running",
+ keyId: null,
+ modelId: null,
+ taskList: { onChange: () => {} },
+ messageQueue: [],
+ queueListeners: [],
+ shellStore: {},
+ transcriptStore: {},
+ currentChunks: null,
+ currentAssistantId: null,
+ currentTurnId: null,
+ });
+
+ const outcome = manager.deliverMessage("tab-busy", "queued msg");
+ expect(outcome.status).toBe("queued");
+ if (outcome.status === "queued") {
+ expect(typeof outcome.messageId).toBe("string");
+ }
+ // The message landed on the running tab's queue.
+ const agent = inner.tabAgents.get("tab-busy") as { messageQueue: unknown[] };
+ expect(agent.messageQueue).toHaveLength(1);
+ });
+
+ it("hydrates key/model from the persisted tab row for a cold wake", () => {
+ const manager = new AgentManager();
+ setFakeTab({ id: "tab-cold", keyId: "persisted-key", modelId: "persisted-model" });
+
+ // Spy on processMessage to capture the key/model deliverMessage
+ // forwarded — asserting the hydration decision directly rather than
+ // downstream tabAgent state (which the mocked ModelRegistry rewrites).
+ const spy = vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+
+ const outcome = manager.deliverMessage("tab-cold", "hello");
+ expect(outcome.status).toBe("started");
+
+ expect(spy).toHaveBeenCalledTimes(1);
+ const args = spy.mock.calls[0] ?? [];
+ expect(args[0]).toBe("tab-cold"); // tabId
+ expect(args[1]).toBe("hello"); // message
+ expect(args[2]).toBe("persisted-key"); // keyId hydrated from row
+ expect(args[3]).toBe("persisted-model"); // modelId hydrated from row
+ });
+
+ it("prefers explicit opts over the persisted row on a cold wake", () => {
+ const manager = new AgentManager();
+ setFakeTab({ id: "tab-cold2", keyId: "row-key", modelId: "row-model" });
+ const spy = vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+
+ manager.deliverMessage("tab-cold2", "hello", {
+ keyId: "explicit-key",
+ modelId: "explicit-model",
+ });
+
+ const args = spy.mock.calls[0] ?? [];
+ expect(args[2]).toBe("explicit-key");
+ expect(args[3]).toBe("explicit-model");
+ });
+ });
+
+ describe("deliverMessage — agent auto-wake budget", () => {
+ it("allows up to 6 consecutive agent wakes, then suppresses further ones", () => {
+ const manager = new AgentManager();
+ const spy = vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+
+ // 6 agent-originated wakes of an idle tab should all start turns.
+ for (let i = 0; i < 6; i++) {
+ const outcome = manager.deliverMessage("tab-pp", `msg ${i}`, { origin: "agent" });
+ expect(outcome.status).toBe("started");
+ }
+ expect(spy).toHaveBeenCalledTimes(6);
+
+ // The 7th is suppressed: no new turn, message preserved on the queue.
+ const seventh = manager.deliverMessage("tab-pp", "msg 7", { origin: "agent" });
+ expect(seventh.status).toBe("suppressed");
+ expect(spy).toHaveBeenCalledTimes(6); // unchanged — no wake
+
+ const inner = manager as unknown as {
+ tabAgents: Map<string, { messageQueue: unknown[]; autoWakeBudget: number }>;
+ };
+ const agent = inner.tabAgents.get("tab-pp");
+ expect(agent?.autoWakeBudget).toBe(0);
+ // Suppressed message is queued, not dropped.
+ expect(agent?.messageQueue).toHaveLength(1);
+ });
+
+ it("a human message refills the budget and re-enables agent wakes", () => {
+ const manager = new AgentManager();
+ vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+
+ // Exhaust the budget with agent wakes.
+ for (let i = 0; i < 6; i++) {
+ manager.deliverMessage("tab-refill", `a${i}`, { origin: "agent" });
+ }
+ expect(manager.deliverMessage("tab-refill", "blocked", { origin: "agent" }).status).toBe(
+ "suppressed",
+ );
+
+ // A human message refills the budget...
+ const humanOutcome = manager.deliverMessage("tab-refill", "human here", {
+ origin: "human",
+ });
+ expect(humanOutcome.status).toBe("started");
+
+ const inner = manager as unknown as {
+ tabAgents: Map<string, { autoWakeBudget: number }>;
+ };
+ expect(inner.tabAgents.get("tab-refill")?.autoWakeBudget).toBe(6);
+
+ // ...so an agent can wake it again.
+ expect(manager.deliverMessage("tab-refill", "again", { origin: "agent" }).status).toBe(
+ "started",
+ );
+ });
+
+ it("does not consume budget when the message is merely queued (busy target)", () => {
+ const manager = new AgentManager();
+ const inner = manager as unknown as {
+ tabAgents: Map<string, Record<string, unknown>>;
+ };
+ inner.tabAgents.set("tab-busy-budget", {
+ agent: null,
+ status: "running",
+ keyId: null,
+ modelId: null,
+ taskList: { onChange: () => {} },
+ messageQueue: [],
+ queueListeners: [],
+ shellStore: {},
+ transcriptStore: {},
+ currentChunks: null,
+ currentAssistantId: null,
+ currentTurnId: null,
+ autoWakeBudget: 6,
+ });
+
+ const outcome = manager.deliverMessage("tab-busy-budget", "queued one", {
+ origin: "agent",
+ });
+ expect(outcome.status).toBe("queued");
+ // Budget untouched — queuing can't drive a runaway loop.
+ const agent = inner.tabAgents.get("tab-busy-budget") as { autoWakeBudget: number };
+ expect(agent.autoWakeBudget).toBe(6);
+ });
+
+ it("human-originated wakes are never throttled", () => {
+ const manager = new AgentManager();
+ const spy = vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+
+ // Far more than the budget, all human-originated → all start turns.
+ for (let i = 0; i < 10; i++) {
+ const outcome = manager.deliverMessage("tab-human", `h${i}`, { origin: "human" });
+ expect(outcome.status).toBe("started");
+ }
+ expect(spy).toHaveBeenCalledTimes(10);
+ });
+
+ it("defaults origin to human when unspecified (POST /chat path)", () => {
+ const manager = new AgentManager();
+ const spy = vi.spyOn(manager, "processMessage").mockResolvedValue(undefined);
+ for (let i = 0; i < 8; i++) {
+ expect(manager.deliverMessage("tab-default", `d${i}`).status).toBe("started");
+ }
+ expect(spy).toHaveBeenCalledTimes(8);
+ });
+ });
+
+ describe("getLastTabResponse", () => {
+ it("returns the most recent assistant turn's text and current status", () => {
+ const manager = new AgentManager();
+ setFakeMessages("tab-hist", [
+ makeRow("tab-hist", 1, "user", [{ type: "text", text: "hi" }]),
+ makeRow("tab-hist", 2, "assistant", [{ type: "text", text: "first answer" }]),
+ makeRow("tab-hist", 3, "user", [{ type: "text", text: "again" }]),
+ makeRow("tab-hist", 4, "assistant", [
+ { type: "text", text: "second " },
+ { type: "text", text: "answer" },
+ ]),
+ ]);
+
+ const res = manager.getLastTabResponse("tab-hist");
+ expect(res.text).toBe("second answer");
+ expect(res.status).toBe("idle");
+ });
+
+ it("returns null text when the tab has no assistant turn yet", () => {
+ const manager = new AgentManager();
+ setFakeMessages("tab-empty", [
+ makeRow("tab-empty", 1, "user", [{ type: "text", text: "hi" }]),
+ ]);
+ const res = manager.getLastTabResponse("tab-empty");
+ expect(res.text).toBeNull();
+ });
+
+ it("skips assistant turns that contain no text chunks", () => {
+ const manager = new AgentManager();
+ setFakeMessages("tab-toolonly", [
+ makeRow("tab-toolonly", 1, "assistant", [{ type: "text", text: "real answer" }]),
+ // A later assistant turn with only non-text chunks should be skipped.
+ makeRow("tab-toolonly", 2, "assistant", [{ type: "thinking", text: "hmm" }]),
+ ]);
+ const res = manager.getLastTabResponse("tab-toolonly");
+ expect(res.text).toBe("real answer");
+ });
+ });
+
+ describe("send_to_tab / read_tab permission split", () => {
+ // Drives the real parent-path tool construction in getOrCreateAgentForTab
+ // by toggling the new split permissions and inspecting which tools the
+ // constructed Agent received.
+ async function toolsForPerms(tabId: string, perms: Record<string, string>): Promise<string[]> {
+ for (const [k, v] of Object.entries(perms)) setFakeSetting(k, v);
+ const manager = new AgentManager();
+ await manager.processMessage(tabId, "go");
+ return constructedAgents.at(-1)?.toolNames ?? [];
+ }
+
+ it("grants only send_to_tab when only perm_send_to_tab is allowed", async () => {
+ const tools = await toolsForPerms("tab-send-only", { perm_send_to_tab: "allow" });
+ expect(tools).toContain("send_to_tab");
+ expect(tools).not.toContain("read_tab");
+ });
+
+ it("grants only read_tab when only perm_read_tab is allowed", async () => {
+ const tools = await toolsForPerms("tab-read-only", { perm_read_tab: "allow" });
+ expect(tools).toContain("read_tab");
+ expect(tools).not.toContain("send_to_tab");
+ });
+
+ it("grants both when both permissions are allowed", async () => {
+ const tools = await toolsForPerms("tab-both", {
+ perm_send_to_tab: "allow",
+ perm_read_tab: "allow",
+ });
+ expect(tools).toContain("send_to_tab");
+ expect(tools).toContain("read_tab");
+ });
+
+ it("grants neither when both permissions are off", async () => {
+ const tools = await toolsForPerms("tab-neither", {});
+ expect(tools).not.toContain("send_to_tab");
+ expect(tools).not.toContain("read_tab");
+ });
+ });
});
diff --git a/packages/api/tests/routes.test.ts b/packages/api/tests/routes.test.ts
index 4b8dd40..9ab2afe 100644
--- a/packages/api/tests/routes.test.ts
+++ b/packages/api/tests/routes.test.ts
@@ -166,6 +166,34 @@ vi.mock("@dispatch/core", () => ({
};
},
createTab() {},
+ getTab() {
+ return null;
+ },
+ listOpenTabs() {
+ return [];
+ },
+ resolveTabPrefix() {
+ return { status: "none" };
+ },
+ shortestUniquePrefix(id: string) {
+ return (id ?? "").slice(0, 4);
+ },
+ createSendToTabTool(_callbacks: unknown) {
+ return {
+ name: "send_to_tab",
+ description: "send to tab",
+ parameters: { _type: "z.ZodObject", shape: {} },
+ execute: async () => "mock",
+ };
+ },
+ createReadTabTool(_callbacks: unknown) {
+ return {
+ name: "read_tab",
+ description: "read tab",
+ parameters: { _type: "z.ZodObject", shape: {} },
+ execute: async () => "mock",
+ };
+ },
getClaudeAccountsFromDB() {
return [];
},
diff --git a/packages/core/src/agents/loader.ts b/packages/core/src/agents/loader.ts
index 333716e..f4a6c5a 100644
--- a/packages/core/src/agents/loader.ts
+++ b/packages/core/src/agents/loader.ts
@@ -108,8 +108,8 @@ export function expandAgentToolNames(tools: string[]): string[] {
default:
// Pass through tool names that aren't permission-group
// aliases (summon, retrieve, web_search, youtube_transcribe,
- // todo, and the granular file tools themselves if a user
- // hand-wrote them in a TOML).
+ // send_to_tab, read_tab, todo, and the granular file tools
+ // themselves if a user hand-wrote them in a TOML).
expanded.add(t);
}
}
diff --git a/packages/core/src/db/tabs.ts b/packages/core/src/db/tabs.ts
index 928f434..8b290d2 100644
--- a/packages/core/src/db/tabs.ts
+++ b/packages/core/src/db/tabs.ts
@@ -159,3 +159,78 @@ export function getDescendantIds(rootId: string): string[] {
}
return order.reverse();
}
+
+/**
+ * Minimum length of a tab-handle prefix accepted by `resolveTabPrefix`.
+ * Mirrors the frontend's minimum DISPLAY length (4 hex chars). Anything
+ * shorter is rejected as too broad — an agent must echo at least the 4-char
+ * handle shown in the UI.
+ */
+export const MIN_TAB_PREFIX_LENGTH = 4;
+
+/**
+ * Outcome of resolving a short tab handle (a git-style prefix of a tab's
+ * UUID) back to a concrete open tab.
+ *
+ * - `ok` — exactly one open tab matched; `tab` is it.
+ * - `none` — no open tab matched (bad/stale handle, or too-short prefix).
+ * - `ambiguous` — more than one open tab shares the prefix; `matches` lists
+ * them so the caller can ask for one more character (the same
+ * UX as `git checkout <ambiguous-sha>`).
+ */
+export type ResolveTabPrefixResult =
+ | { status: "ok"; tab: TabRow }
+ | { status: "none" }
+ | { status: "ambiguous"; matches: TabRow[] };
+
+/**
+ * Resolve a short tab handle to a single OPEN tab by prefix match — the
+ * git-short-hash model. The handle is NEVER stored: it is always derived from
+ * (and matched against) the canonical lowercase UUID in `tabs.id`.
+ *
+ * Sanitization is mandatory because the SQLite `LIKE` operator treats `%` and
+ * `_` as wildcards: an unsanitized prefix like `a%` would match broadly. We
+ * lowercase the input (UUIDs are canonical lowercase; SQLite `LIKE` is also
+ * ASCII-case-insensitive by default) and strip everything outside the UUID
+ * alphabet `[0-9a-f-]` so no wildcard can survive into the query.
+ *
+ * A prefix shorter than `MIN_TAB_PREFIX_LENGTH` after sanitization returns
+ * `none` rather than matching a large swath of tabs.
+ *
+ * Only OPEN tabs (`is_open = 1`) are addressable — a closed tab's UUID prefix
+ * must not cause phantom ambiguity or resolve to a dead conversation.
+ */
+export function resolveTabPrefix(prefix: string): ResolveTabPrefixResult {
+ const sanitized = (prefix ?? "").toLowerCase().replace(/[^0-9a-f-]/g, "");
+ if (sanitized.length < MIN_TAB_PREFIX_LENGTH) {
+ return { status: "none" };
+ }
+ const db = getDatabase();
+ const rows = db
+ .query("SELECT * FROM tabs WHERE is_open = 1 AND id LIKE $prefix ORDER BY position ASC")
+ .all({ $prefix: `${sanitized}%` }) as Array<Record<string, unknown>>;
+ if (rows.length === 0) return { status: "none" };
+ if (rows.length === 1) return { status: "ok", tab: rowToTab(rows[0] as Record<string, unknown>) };
+ return { status: "ambiguous", matches: rows.map(rowToTab) };
+}
+
+/**
+ * Compute the shortest unique prefix (minimum `MIN_TAB_PREFIX_LENGTH` chars)
+ * that identifies `tabId` among the currently OPEN tabs — the backend twin of
+ * the frontend's display helper. Used when a tool needs to echo a tab's own
+ * handle (e.g. provenance prefixes, "available tabs" hints) without trusting a
+ * value from the wire.
+ *
+ * Returns the full id if no shorter prefix is unique (degenerate: another open
+ * id would have to match all but the last character, or the id is too short).
+ */
+export function shortestUniquePrefix(tabId: string): string {
+ const db = getDatabase();
+ const rows = db.query("SELECT id FROM tabs WHERE is_open = 1").all() as Array<{ id: string }>;
+ const others = rows.map((r) => r.id).filter((id) => id !== tabId);
+ for (let len = MIN_TAB_PREFIX_LENGTH; len < tabId.length; len++) {
+ const candidate = tabId.slice(0, len);
+ if (!others.some((id) => id.startsWith(candidate))) return candidate;
+ }
+ return tabId;
+}
diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts
index a3816ea..b1b17cc 100644
--- a/packages/core/src/index.ts
+++ b/packages/core/src/index.ts
@@ -49,6 +49,10 @@ export {
createTab,
getTab,
listOpenTabs,
+ MIN_TAB_PREFIX_LENGTH,
+ type ResolveTabPrefixResult,
+ resolveTabPrefix,
+ shortestUniquePrefix,
type TabRow,
updateTabModel,
updateTabStatus,
@@ -78,9 +82,16 @@ export { prefix as bashArityPrefix } from "./tools/bash-arity.js";
export { createListFilesTool } from "./tools/list-files.js";
export { createReadFileTool } from "./tools/read-file.js";
export { createReadFileSliceTool } from "./tools/read-file-slice.js";
+export { createReadTabTool, type ReadTabCallbacks } from "./tools/read-tab.js";
export { createToolRegistry } from "./tools/registry.js";
export { createRetrieveTool, type RetrieveCallbacks } from "./tools/retrieve.js";
export { BackgroundShellStore, createRunShellTool } from "./tools/run-shell.js";
+export {
+ createSendToTabTool,
+ type ResolvedTabRef,
+ type SendToTabCallbacks,
+ type TabResolution,
+} from "./tools/send-to-tab.js";
export { analyzeCommand } from "./tools/shell-analyze.js";
export {
type AvailableAgent,
diff --git a/packages/core/src/llm/debug-logger.ts b/packages/core/src/llm/debug-logger.ts
index 2b7420c..072a7a1 100644
--- a/packages/core/src/llm/debug-logger.ts
+++ b/packages/core/src/llm/debug-logger.ts
@@ -281,11 +281,7 @@ export function logStepLifecycle(data: {
* Log agent loop-level events (loop start, break conditions, etc.).
* Only logs at verbosity >= 3.
*/
-export function logAgentLoop(data: {
- tabId?: string;
- event: string;
- detail?: unknown;
-}): void {
+export function logAgentLoop(data: { tabId?: string; event: string; detail?: unknown }): void {
if (!ENABLED || VERBOSITY < 3) return;
const detail = data.detail !== undefined ? ` ${JSON.stringify(data.detail)}` : "";
console.error(`[dispatch-debug] AGENT tab=${data.tabId ?? "?"} ${data.event}${detail}`);
@@ -308,16 +304,14 @@ export function logAgentLoop(data: {
* we just clone and read once via `.text()` — simpler and safe because
* non-streaming bodies are bounded.
*/
-export function wrapFetchWithLogging<
- F extends (...args: never[]) => Promise<Response> | Response,
->(baseFetch: F, opts: { tabId?: string; modelHint?: string }): F {
+export function wrapFetchWithLogging<F extends (...args: never[]) => Promise<Response> | Response>(
+ baseFetch: F,
+ opts: { tabId?: string; modelHint?: string },
+): F {
if (!ENABLED) return baseFetch;
const wrapped = async (...args: Parameters<F>) => {
const requestId = ++seq;
- const [input, init] = args as unknown as [
- RequestInfo | URL,
- RequestInit | undefined,
- ];
+ const [input, init] = args as unknown as [RequestInfo | URL, RequestInit | undefined];
const url =
typeof input === "string"
? input
@@ -325,7 +319,8 @@ export function wrapFetchWithLogging<
? input.toString()
: (input as Request).url;
const method =
- init?.method ?? (typeof input === "object" && "method" in input ? (input as Request).method : "POST");
+ init?.method ??
+ (typeof input === "object" && "method" in input ? (input as Request).method : "POST");
// Snapshot headers as a plain object for logging.
const headerObj: Record<string, string> = {};
diff --git a/packages/core/src/tools/read-tab.ts b/packages/core/src/tools/read-tab.ts
new file mode 100644
index 0000000..e80dbd0
--- /dev/null
+++ b/packages/core/src/tools/read-tab.ts
@@ -0,0 +1,95 @@
+import { z } from "zod";
+import type { AgentStatus, ToolDefinition } from "../types/index.js";
+import type { TabResolution } from "./send-to-tab.js";
+
+export interface ReadTabCallbacks {
+ /** Resolve a (possibly short) handle to one open tab. */
+ resolveShortId(prefix: string): TabResolution;
+ /**
+ * Return the target tab's most recent COMPLETED assistant turn as plain
+ * text, plus its current status. `text` is null when the tab has no
+ * completed assistant turn yet.
+ */
+ getLastResponse(tabId: string): { text: string | null; status: AgentStatus };
+ /** Snapshot of currently-open tabs, for "available tabs" error hints. */
+ listOpenHandles(): Array<{ handle: string; title: string }>;
+}
+
+/** Render the "available tabs" hint shared by the none/ambiguous branches. */
+function renderOpenHandles(handles: Array<{ handle: string; title: string }>): string {
+ if (handles.length === 0) return "No other tabs are currently open.";
+ const lines = handles.map((h) => ` - ${h.handle}: ${h.title}`);
+ return ["Currently open tabs:", ...lines].join("\n");
+}
+
+export function createReadTabTool(callbacks: ReadTabCallbacks): ToolDefinition {
+ return {
+ name: "read_tab",
+ description: [
+ "Read the most recent completed response from another tab (agent) by its short ID.",
+ "",
+ "Returns a SNAPSHOT — it does NOT block or wait for the target to finish.",
+ " - If the target is idle, you get its just-finished turn.",
+ " - If the target is still running, you get its PREVIOUS completed turn (if any);",
+ " call read_tab again later to get the newest one.",
+ "",
+ "Use this after send_to_tab to collect another agent's reply. IDs are git-style",
+ "prefixes: pass any length that uniquely identifies the target (min 4 chars).",
+ ].join("\n"),
+ parameters: z.object({
+ tab_id: z
+ .string()
+ .describe(
+ "The short ID (handle) of the tab to read, as shown in the tab bar. Any unique-length prefix works (min 4 chars).",
+ ),
+ }),
+ execute: async (args: Record<string, unknown>): Promise<string> => {
+ const rawId = (args.tab_id as string | undefined)?.trim() ?? "";
+
+ if (!rawId) {
+ return `Error: tab_id is required.\n\n${renderOpenHandles(callbacks.listOpenHandles())}`;
+ }
+
+ const resolution = callbacks.resolveShortId(rawId);
+
+ if (resolution.status === "none") {
+ return [
+ `Error: no open tab matches the ID "${rawId}".`,
+ "",
+ renderOpenHandles(callbacks.listOpenHandles()),
+ ].join("\n");
+ }
+ if (resolution.status === "ambiguous") {
+ const matches = resolution.matches.map((m) => ` - ${m.handle}: ${m.title}`).join("\n");
+ return [
+ `Error: the ID "${rawId}" is ambiguous — it matches multiple open tabs:`,
+ matches,
+ "",
+ "Add one or more characters to disambiguate.",
+ ].join("\n");
+ }
+
+ const target = resolution.tab;
+ const { text, status } = callbacks.getLastResponse(target.id);
+
+ const runningNote =
+ status === "running"
+ ? " (this tab is still running; the response below is its previous completed turn — read again later for the newest)"
+ : "";
+
+ if (text === null) {
+ const reason =
+ status === "running"
+ ? "it is still working on its first turn"
+ : "it has no assistant responses yet";
+ return `Tab ${target.handle} (${target.title}) has no completed response — ${reason}.`;
+ }
+
+ return [
+ `<tab_response tab="${target.handle}" status="${status}"${runningNote}>`,
+ text,
+ "</tab_response>",
+ ].join("\n");
+ },
+ };
+}
diff --git a/packages/core/src/tools/send-to-tab.ts b/packages/core/src/tools/send-to-tab.ts
new file mode 100644
index 0000000..eb86b7e
--- /dev/null
+++ b/packages/core/src/tools/send-to-tab.ts
@@ -0,0 +1,147 @@
+import { z } from "zod";
+import type { ToolDefinition } from "../types/index.js";
+
+/**
+ * A tab reference surfaced to the `send_to_tab` / `read_tab` tools. The tools
+ * are intentionally decoupled from the DB `TabRow` shape — the AgentManager
+ * maps `resolveTabPrefix(...)` results down to this minimal projection so the
+ * tools (and their unit tests) never depend on the persistence layer.
+ */
+export interface ResolvedTabRef {
+ /** The tab's canonical full UUID. */
+ id: string;
+ /** The tab's display title (for disambiguation hints). */
+ title: string;
+ /** The tab's current short handle (shortest unique prefix). */
+ handle: string;
+}
+
+/**
+ * Outcome of resolving a short tab handle. Mirrors core's
+ * `ResolveTabPrefixResult` but over the minimal `ResolvedTabRef` projection.
+ */
+export type TabResolution =
+ | { status: "ok"; tab: ResolvedTabRef }
+ | { status: "none" }
+ | { status: "ambiguous"; matches: ResolvedTabRef[] };
+
+export interface SendToTabCallbacks {
+ /** Resolve a (possibly short) handle to one open tab. */
+ resolveShortId(prefix: string): TabResolution;
+ /**
+ * Deliver `message` to `tabId`. If the target is mid-turn the message is
+ * queued (same path as a user message); if idle/errored it wakes the tab
+ * and starts a new turn. Returns quickly — does NOT block on the turn.
+ */
+ deliver(
+ tabId: string,
+ message: string,
+ ):
+ | Promise<{ status: "queued" | "started" | "suppressed" }>
+ | { status: "queued" | "started" | "suppressed" };
+ /** Snapshot of currently-open tabs, for "available tabs" error hints. */
+ listOpenHandles(): Array<{ handle: string; title: string }>;
+ /** The calling tab's own id + handle — used to block self-sends and to
+ * stamp provenance onto the delivered message. */
+ self: { id: string; handle: string };
+}
+
+/** Render the "available tabs" hint shared by the none/ambiguous branches. */
+function renderOpenHandles(handles: Array<{ handle: string; title: string }>): string {
+ if (handles.length === 0) return "No other tabs are currently open.";
+ const lines = handles.map((h) => ` - ${h.handle}: ${h.title}`);
+ return ["Currently open tabs:", ...lines].join("\n");
+}
+
+export function createSendToTabTool(callbacks: SendToTabCallbacks): ToolDefinition {
+ return {
+ name: "send_to_tab",
+ description: [
+ "Send a message to another tab (agent) by its short ID — the handle shown in the tab bar.",
+ "",
+ "Behaviour mirrors a user sending a message:",
+ " - If the target tab is mid-turn (busy), your message is QUEUED and picked up next.",
+ " - If the target tab is idle, your message WAKES it and starts a new turn.",
+ "",
+ "This is fire-and-forget: it returns immediately and does NOT wait for a reply.",
+ "Use the 'read_tab' tool with the same ID later to read the target's latest response.",
+ "",
+ "Your tab ID is auto-added to the top of the message so the recipient can reply to you.",
+ "IDs are git-style prefixes: pass any length that uniquely identifies the target (min 4 chars).",
+ "If the ID is ambiguous you'll be asked to add a character.",
+ ].join("\n"),
+ parameters: z.object({
+ tab_id: z
+ .string()
+ .describe(
+ "The short ID (handle) of the target tab, as shown in the tab bar. Any unique-length prefix of the tab's id works (min 4 chars).",
+ ),
+ message: z
+ .string()
+ .describe("The message to deliver to the target tab, exactly as a user would type it."),
+ }),
+ execute: async (args: Record<string, unknown>): Promise<string> => {
+ const rawId = (args.tab_id as string | undefined)?.trim() ?? "";
+ const message = (args.message as string | undefined) ?? "";
+
+ if (!rawId) {
+ return `Error: tab_id is required.\n\n${renderOpenHandles(callbacks.listOpenHandles())}`;
+ }
+ if (!message.trim()) {
+ return "Error: message must not be empty.";
+ }
+
+ const resolution = callbacks.resolveShortId(rawId);
+
+ if (resolution.status === "none") {
+ return [
+ `Error: no open tab matches the ID "${rawId}".`,
+ "",
+ renderOpenHandles(callbacks.listOpenHandles()),
+ ].join("\n");
+ }
+ if (resolution.status === "ambiguous") {
+ const matches = resolution.matches.map((m) => ` - ${m.handle}: ${m.title}`).join("\n");
+ return [
+ `Error: the ID "${rawId}" is ambiguous — it matches multiple open tabs:`,
+ matches,
+ "",
+ "Add one or more characters to disambiguate.",
+ ].join("\n");
+ }
+
+ const target = resolution.tab;
+
+ if (target.id === callbacks.self.id) {
+ return "Error: cannot send a message to your own tab.";
+ }
+
+ // Stamp provenance so the recipient (and the watching user) can see
+ // which tab the message came from and reply back via its handle.
+ const delivered = `[message from tab ${callbacks.self.handle}]\n\n${message}`;
+
+ try {
+ const result = await callbacks.deliver(target.id, delivered);
+ if (result.status === "suppressed") {
+ // The target hit its automatic agent-to-agent wake limit. The
+ // message was preserved (queued) but did NOT start a turn — a
+ // human must step in. Tell the sender plainly so it stops
+ // hammering the target and creating a runaway loop.
+ return [
+ `Message HELD for tab ${target.handle} (${target.title}) — it was NOT delivered as a wake.`,
+ "That tab has reached its automatic agent-to-agent message limit, so it will not",
+ "auto-respond again until a human sends it a message. Do not keep resending:",
+ "your message is already queued and will be seen when a human resumes that tab.",
+ ].join("\n");
+ }
+ const verb =
+ result.status === "queued"
+ ? "queued (target is busy; it will be picked up next turn)"
+ : "delivered (target was idle; a new turn has started)";
+ return `Message ${verb}. Target tab: ${target.handle} (${target.title}). Use read_tab with "${target.handle}" to read its reply later.`;
+ } catch (err) {
+ return `Error delivering message: ${err instanceof Error ? err.message : String(err)}`;
+ }
+ },
+ };
+}
diff --git a/packages/core/src/tools/summon.ts b/packages/core/src/tools/summon.ts
index d40f0ca..4820e89 100644
--- a/packages/core/src/tools/summon.ts
+++ b/packages/core/src/tools/summon.ts
@@ -177,6 +177,8 @@ export function createSummonTool(
" - retrieve: Collect results from its children (required if summon is given)",
" - web_search: Search the web",
" - youtube_transcribe: Fetch YouTube video transcripts",
+ " - send_to_tab: Send a message to another tab/agent by its ID",
+ " - read_tab: Read another tab/agent's latest response by its ID",
"",
"The 'agent' parameter is required — every spawned agent must use a definition.",
"Tools default to the agent definition's tools, intersected with your own tools (you can't grant capabilities you don't have).",
@@ -232,6 +234,8 @@ export function createSummonTool(
"retrieve",
"web_search",
"youtube_transcribe",
+ "send_to_tab",
+ "read_tab",
]),
)
.optional()
@@ -262,7 +266,9 @@ export function createSummonTool(
const tools = args.tools as string[] | undefined;
const workingDirectory = args.working_directory as string | undefined;
const background = (args.background as boolean | undefined) ?? false;
- const topLevel = userAgentEnabled ? ((args.top_level as boolean | undefined) ?? false) : false;
+ const topLevel = userAgentEnabled
+ ? ((args.top_level as boolean | undefined) ?? false)
+ : false;
try {
const agentId = await callbacks.spawn({
diff --git a/packages/core/tests/agents/loader.test.ts b/packages/core/tests/agents/loader.test.ts
index 88173ea..35ab0cf 100644
--- a/packages/core/tests/agents/loader.test.ts
+++ b/packages/core/tests/agents/loader.test.ts
@@ -22,10 +22,34 @@ describe("expandAgentToolNames", () => {
expect(out).toContain("run_shell");
});
+ it("passes through the tab tools as independent names (no tab_comm group)", () => {
+ const out = expandAgentToolNames(["send_to_tab", "read_tab"]);
+ expect(out).toContain("send_to_tab");
+ expect(out).toContain("read_tab");
+ // Granting only one must not pull in the other.
+ const onlySend = expandAgentToolNames(["send_to_tab"]);
+ expect(onlySend).toContain("send_to_tab");
+ expect(onlySend).not.toContain("read_tab");
+ });
+
it("passes through non-group tool names unchanged", () => {
- const out = expandAgentToolNames(["summon", "retrieve", "web_search", "youtube_transcribe"]);
+ const out = expandAgentToolNames([
+ "summon",
+ "retrieve",
+ "web_search",
+ "youtube_transcribe",
+ "send_to_tab",
+ "read_tab",
+ ]);
expect(out).toEqual(
- expect.arrayContaining(["summon", "retrieve", "web_search", "youtube_transcribe"]),
+ expect.arrayContaining([
+ "summon",
+ "retrieve",
+ "web_search",
+ "youtube_transcribe",
+ "send_to_tab",
+ "read_tab",
+ ]),
);
});
diff --git a/packages/core/tests/db/tabs.test.ts b/packages/core/tests/db/tabs.test.ts
index e1c9bf8..67533dc 100644
--- a/packages/core/tests/db/tabs.test.ts
+++ b/packages/core/tests/db/tabs.test.ts
@@ -73,6 +73,22 @@ class FakeDatabase {
return [{ max_pos: maxPos }];
}
+ // resolveTabPrefix: open tabs whose id starts with a sanitized prefix.
+ // The production query binds `$prefix` as `<sanitized>%`; emulate SQLite
+ // LIKE prefix semantics here (case-insensitive, `%` = "rest of string").
+ if (norm === "SELECT * FROM tabs WHERE is_open = 1 AND id LIKE $prefix ORDER BY position ASC") {
+ const raw = String(params?.$prefix ?? "");
+ const needle = raw.endsWith("%") ? raw.slice(0, -1) : raw;
+ return this.rows
+ .filter((r) => r.is_open === 1 && r.id.toLowerCase().startsWith(needle.toLowerCase()))
+ .sort((a, b) => a.position - b.position);
+ }
+
+ // shortestUniquePrefix: all open tab ids.
+ if (norm === "SELECT id FROM tabs WHERE is_open = 1") {
+ return this.rows.filter((r) => r.is_open === 1).map((r) => ({ id: r.id }));
+ }
+
throw new Error(`FakeDatabase: unsupported SELECT: ${norm}`);
}
@@ -134,7 +150,8 @@ vi.mock("../../src/db/index.js", () => ({
// Dynamic import AFTER `vi.mock` registers (vitest hoists `vi.mock` to
// the very top of the file, so by the time this line runs the mock is
// active for `./index.js` resolution inside `tabs.ts`).
-const { archiveTab, createTab, getDescendantIds, getTab } = await import("../../src/db/tabs.js");
+const { archiveTab, createTab, getDescendantIds, getTab, resolveTabPrefix, shortestUniquePrefix } =
+ await import("../../src/db/tabs.js");
beforeAll(() => {
fakeDb = new FakeDatabase();
@@ -234,3 +251,103 @@ describe("getDescendantIds", () => {
expect(ids2).toEqual(["b1", "a1"]);
});
});
+
+// ---------------------------------------------------------------------------
+// resolveTabPrefix — git-style short-handle resolution
+// ---------------------------------------------------------------------------
+describe("resolveTabPrefix", () => {
+ it("returns none when the prefix is shorter than the minimum length", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "A");
+ // 3 chars < MIN_TAB_PREFIX_LENGTH (4)
+ expect(resolveTabPrefix("abc").status).toBe("none");
+ });
+
+ it("returns none when no open tab matches", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "A");
+ expect(resolveTabPrefix("ffff").status).toBe("none");
+ });
+
+ it("resolves a unique 4-char prefix to the single matching tab", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ createTab("9999aaaa-0000-4000-8000-000000000000", "Beta");
+ const res = resolveTabPrefix("abcd");
+ expect(res.status).toBe("ok");
+ if (res.status === "ok") {
+ expect(res.tab.id).toBe("abcd1234-0000-4000-8000-000000000000");
+ expect(res.tab.title).toBe("Alpha");
+ }
+ });
+
+ it("resolves the full UUID (a maximal prefix)", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ const res = resolveTabPrefix("abcd1234-0000-4000-8000-000000000000");
+ expect(res.status).toBe("ok");
+ });
+
+ it("reports ambiguity when multiple open tabs share the prefix", () => {
+ createTab("abcd1111-0000-4000-8000-000000000000", "One");
+ createTab("abcd2222-0000-4000-8000-000000000000", "Two");
+ const res = resolveTabPrefix("abcd");
+ expect(res.status).toBe("ambiguous");
+ if (res.status === "ambiguous") {
+ expect(res.matches).toHaveLength(2);
+ expect(res.matches.map((m) => m.title).sort()).toEqual(["One", "Two"]);
+ }
+ });
+
+ it("disambiguates when one more character is supplied", () => {
+ createTab("abcd1111-0000-4000-8000-000000000000", "One");
+ createTab("abcd2222-0000-4000-8000-000000000000", "Two");
+ const res = resolveTabPrefix("abcd1");
+ expect(res.status).toBe("ok");
+ if (res.status === "ok") expect(res.tab.title).toBe("One");
+ });
+
+ it("matches case-insensitively (UUIDs are lowercase; LIKE is ASCII-CI)", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ const res = resolveTabPrefix("ABCD");
+ expect(res.status).toBe("ok");
+ });
+
+ it("sanitizes LIKE wildcards so they cannot broaden the match", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ createTab("9999aaaa-0000-4000-8000-000000000000", "Beta");
+ // `%` is a LIKE wildcard, so it must be stripped before the query runs;
+ // a surviving `ab%d%` pattern would match Alpha (`abcd1234-...`).
+ const res = resolveTabPrefix("ab%d");
+ // "ab%d" -> sanitized "abd" (3 chars) -> below min length -> none.
+ expect(res.status).toBe("none");
+ });
+
+ it("excludes archived (closed) tabs from matches", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ archiveTab("abcd1234-0000-4000-8000-000000000000");
+ expect(resolveTabPrefix("abcd").status).toBe("none");
+ });
+});
+
+// ---------------------------------------------------------------------------
+// shortestUniquePrefix — display-handle derivation
+// ---------------------------------------------------------------------------
+describe("shortestUniquePrefix", () => {
+ it("returns a 4-char prefix when no other open tab collides", () => {
+ createTab("abcd1234-0000-4000-8000-000000000000", "Alpha");
+ expect(shortestUniquePrefix("abcd1234-0000-4000-8000-000000000000")).toBe("abcd");
+ });
+
+ it("grows the prefix one char at a time on a collision", () => {
+ createTab("abcd1111-0000-4000-8000-000000000000", "One");
+ createTab("abcd2222-0000-4000-8000-000000000000", "Two");
+ // First differing char is at index 4, so a 5-char prefix is unique.
+ expect(shortestUniquePrefix("abcd1111-0000-4000-8000-000000000000")).toBe("abcd1");
+ expect(shortestUniquePrefix("abcd2222-0000-4000-8000-000000000000")).toBe("abcd2");
+ });
+
+ it("ignores closed tabs when computing uniqueness", () => {
+ createTab("abcd1111-0000-4000-8000-000000000000", "One");
+ createTab("abcd2222-0000-4000-8000-000000000000", "Two");
+ archiveTab("abcd2222-0000-4000-8000-000000000000");
+ // With Two closed, One no longer collides → back to 4 chars.
+ expect(shortestUniquePrefix("abcd1111-0000-4000-8000-000000000000")).toBe("abcd");
+ });
+});
diff --git a/packages/core/tests/tools/read-tab.test.ts b/packages/core/tests/tools/read-tab.test.ts
new file mode 100644
index 0000000..71e419c
--- /dev/null
+++ b/packages/core/tests/tools/read-tab.test.ts
@@ -0,0 +1,101 @@
+import { describe, expect, it } from "vitest";
+import { createReadTabTool, type ReadTabCallbacks } from "../../src/tools/read-tab.js";
+import type { TabResolution } from "../../src/tools/send-to-tab.js";
+
+function makeCallbacks(overrides: Partial<ReadTabCallbacks> = {}): ReadTabCallbacks {
+ return {
+ resolveShortId: (): TabResolution => ({
+ status: "ok",
+ tab: { id: "target-id", title: "Target", handle: "targ" },
+ }),
+ getLastResponse: () => ({ text: "the answer is 42", status: "idle" }),
+ listOpenHandles: () => [{ handle: "targ", title: "Target" }],
+ ...overrides,
+ };
+}
+
+describe("createReadTabTool — schema & description", () => {
+ it("is a non-blocking snapshot read", () => {
+ const tool = createReadTabTool(makeCallbacks());
+ expect(tool.name).toBe("read_tab");
+ expect(tool.description).toContain("SNAPSHOT");
+ expect(tool.description.toLowerCase()).toContain("does not block");
+ });
+});
+
+describe("createReadTabTool — execute()", () => {
+ it("returns the last assistant response wrapped in a tab_response tag", async () => {
+ const tool = createReadTabTool(makeCallbacks());
+ const out = await tool.execute({ tab_id: "targ" });
+ expect(out).toContain("<tab_response");
+ expect(out).toContain('tab="targ"');
+ expect(out).toContain('status="idle"');
+ expect(out).toContain("the answer is 42");
+ expect(out).toContain("</tab_response>");
+ });
+
+ it("notes that a running tab's response is its previous completed turn", async () => {
+ const tool = createReadTabTool(
+ makeCallbacks({
+ getLastResponse: () => ({ text: "older turn", status: "running" }),
+ }),
+ );
+ const out = await tool.execute({ tab_id: "targ" });
+ expect(out).toContain("still running");
+ expect(out).toContain("older turn");
+ });
+
+ it("explains when a tab has no completed response yet (idle)", async () => {
+ const tool = createReadTabTool(
+ makeCallbacks({
+ getLastResponse: () => ({ text: null, status: "idle" }),
+ }),
+ );
+ const out = await tool.execute({ tab_id: "targ" });
+ expect(out).toContain("no completed response");
+ expect(out).toContain("no assistant responses yet");
+ });
+
+ it("explains when a tab is still on its first turn (running, no prior text)", async () => {
+ const tool = createReadTabTool(
+ makeCallbacks({
+ getLastResponse: () => ({ text: null, status: "running" }),
+ }),
+ );
+ const out = await tool.execute({ tab_id: "targ" });
+ expect(out).toContain("no completed response");
+ expect(out).toContain("still working on its first turn");
+ });
+
+ it("rejects an empty tab_id and lists open handles", async () => {
+ const tool = createReadTabTool(makeCallbacks());
+ const out = await tool.execute({ tab_id: "" });
+ expect(out).toContain("Error");
+ expect(out).toContain("targ");
+ });
+
+ it("returns a helpful error when the id is unknown", async () => {
+ const tool = createReadTabTool(makeCallbacks({ resolveShortId: () => ({ status: "none" }) }));
+ const out = await tool.execute({ tab_id: "zzzz" });
+ expect(out).toContain("no open tab matches");
+ expect(out).toContain("Currently open tabs:");
+ });
+
+ it("asks for more characters when the id is ambiguous", async () => {
+ const tool = createReadTabTool(
+ makeCallbacks({
+ resolveShortId: () => ({
+ status: "ambiguous",
+ matches: [
+ { id: "a1", title: "One", handle: "abcd1" },
+ { id: "a2", title: "Two", handle: "abcd2" },
+ ],
+ }),
+ }),
+ );
+ const out = await tool.execute({ tab_id: "abcd" });
+ expect(out).toContain("ambiguous");
+ expect(out).toContain("abcd1");
+ expect(out).toContain("abcd2");
+ });
+});
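Note on the fixtures above: both test files mock `resolveShortId` with three outcomes (`ok`, `none`, `ambiguous`). The sketch below shows one way a prefix could map onto those outcomes. It is an in-memory illustration only: `resolvePrefix`, `shortestUnique`, `OpenTab` and `MIN_PREFIX` are hypothetical names, the `TabResolution` shape is copied from the fixtures, and treating a too-short prefix as `none` is an assumption. The real `resolveTabPrefix` in `core/db/tabs.ts` runs against the database and is not reproduced here.

```typescript
// Hypothetical in-memory sketch; not the implementation from this commit.
interface TabRef { id: string; title: string; handle: string }
interface OpenTab { id: string; title: string }

// Shape taken from the test fixtures above.
type TabResolution =
  | { status: "ok"; tab: TabRef }
  | { status: "none" }
  | { status: "ambiguous"; matches: TabRef[] };

const MIN_PREFIX = 4; // minimum handle length named in the commit message

// Shortest prefix of `id` (at least MIN_PREFIX chars) that no other open id shares.
function shortestUnique(id: string, openIds: string[]): string {
  for (let len = MIN_PREFIX; len < id.length; len++) {
    const candidate = id.slice(0, len);
    const collides = openIds.some((other) => other !== id && other.startsWith(candidate));
    if (!collides) return candidate;
  }
  return id;
}

function resolvePrefix(prefix: string, open: OpenTab[]): TabResolution {
  // Assumption: prefixes shorter than the minimum never resolve.
  if (prefix.length < MIN_PREFIX) return { status: "none" };
  const ids = open.map((t) => t.id);
  const matches: TabRef[] = open
    .filter((t) => t.id.startsWith(prefix))
    .map((t) => ({ id: t.id, title: t.title, handle: shortestUnique(t.id, ids) }));
  const [first] = matches;
  if (first === undefined) return { status: "none" };
  if (matches.length === 1) return { status: "ok", tab: first };
  return { status: "ambiguous", matches };
}
```

With two open tabs whose ids start with `abcd1111` and `abcd2222`, `resolvePrefix("abcd", ...)` yields the ambiguous case with handles `abcd1` and `abcd2`, which is the fixture the "asks for more characters" tests feed to the tools.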
diff --git a/packages/core/tests/tools/send-to-tab.test.ts b/packages/core/tests/tools/send-to-tab.test.ts
new file mode 100644
index 0000000..4450fc5
--- /dev/null
+++ b/packages/core/tests/tools/send-to-tab.test.ts
@@ -0,0 +1,142 @@
+import { describe, expect, it, vi } from "vitest";
+import {
+ createSendToTabTool,
+ type SendToTabCallbacks,
+ type TabResolution,
+} from "../../src/tools/send-to-tab.js";
+
+function makeCallbacks(overrides: Partial<SendToTabCallbacks> = {}): SendToTabCallbacks {
+ return {
+ resolveShortId: (): TabResolution => ({
+ status: "ok",
+ tab: { id: "target-id", title: "Target", handle: "targ" },
+ }),
+ deliver: () => ({ status: "started" }),
+ listOpenHandles: () => [{ handle: "targ", title: "Target" }],
+ self: { id: "self-id", handle: "self" },
+ ...overrides,
+ };
+}
+
+describe("createSendToTabTool — schema & description", () => {
+ it("is named send_to_tab and has a fire-and-forget description that mentions queueing", () => {
+ const tool = createSendToTabTool(makeCallbacks());
+ expect(tool.name).toBe("send_to_tab");
+ expect(tool.description).toContain("fire-and-forget");
+ expect(tool.description.toLowerCase()).toContain("queued");
+ });
+});
+
+describe("createSendToTabTool — execute()", () => {
+ it("delivers to a resolved target and reports the started status", async () => {
+ const deliver = vi.fn(() => ({ status: "started" as const }));
+ const tool = createSendToTabTool(makeCallbacks({ deliver }));
+ const out = await tool.execute({ tab_id: "targ", message: "hello there" });
+ expect(deliver).toHaveBeenCalledTimes(1);
+ const [targetId, delivered] = deliver.mock.calls[0] ?? [];
+ expect(targetId).toBe("target-id");
+ // Provenance prefix names the sending tab's handle.
+ expect(delivered).toContain("[message from tab self]");
+ expect(delivered).toContain("hello there");
+ expect(out).toContain("idle"); // "started" means the target was idle and began a new turn
+ expect(out).toContain("targ");
+ });
+
+ it("reports the queued status when the target is busy", async () => {
+ const deliver = vi.fn(() => ({ status: "queued" as const }));
+ const tool = createSendToTabTool(makeCallbacks({ deliver }));
+ const out = await tool.execute({ tab_id: "targ", message: "ping" });
+ expect(out.toLowerCase()).toContain("queued");
+ expect(out.toLowerCase()).toContain("busy");
+ });
+
+ it("reports a HELD message when delivery is suppressed (auto-wake limit hit)", async () => {
+ const deliver = vi.fn(() => ({ status: "suppressed" as const }));
+ const tool = createSendToTabTool(makeCallbacks({ deliver }));
+ const out = await tool.execute({ tab_id: "targ", message: "ping again" });
+ expect(out).toContain("HELD");
+ expect(out.toLowerCase()).toContain("limit");
+ // It must steer the sender away from retrying in a loop.
+ expect(out.toLowerCase()).toContain("do not keep resending");
+ expect(out.toLowerCase()).toContain("human");
+ });
+
+ it("rejects an empty tab_id and lists open handles", async () => {
+ const tool = createSendToTabTool(makeCallbacks());
+ const out = await tool.execute({ tab_id: " ", message: "hi" });
+ expect(out).toContain("Error");
+ expect(out).toContain("targ");
+ });
+
+ it("rejects an empty message", async () => {
+ const deliver = vi.fn(() => ({ status: "started" as const }));
+ const tool = createSendToTabTool(makeCallbacks({ deliver }));
+ const out = await tool.execute({ tab_id: "targ", message: " " });
+ expect(out).toContain("Error");
+ expect(deliver).not.toHaveBeenCalled();
+ });
+
+ it("returns a helpful error and open-tab list when the id is unknown", async () => {
+ const deliver = vi.fn(() => ({ status: "started" as const }));
+ const tool = createSendToTabTool(
+ makeCallbacks({
+ resolveShortId: () => ({ status: "none" }),
+ deliver,
+ }),
+ );
+ const out = await tool.execute({ tab_id: "zzzz", message: "hi" });
+ expect(out).toContain("no open tab matches");
+ expect(out).toContain("Currently open tabs:");
+ expect(deliver).not.toHaveBeenCalled();
+ });
+
+ it("asks for more characters when the id is ambiguous", async () => {
+ const deliver = vi.fn(() => ({ status: "started" as const }));
+ const tool = createSendToTabTool(
+ makeCallbacks({
+ resolveShortId: () => ({
+ status: "ambiguous",
+ matches: [
+ { id: "a1", title: "One", handle: "abcd1" },
+ { id: "a2", title: "Two", handle: "abcd2" },
+ ],
+ }),
+ deliver,
+ }),
+ );
+ const out = await tool.execute({ tab_id: "abcd", message: "hi" });
+ expect(out).toContain("ambiguous");
+ expect(out).toContain("abcd1");
+ expect(out).toContain("abcd2");
+ expect(deliver).not.toHaveBeenCalled();
+ });
+
+ it("refuses to send to its own tab", async () => {
+ const deliver = vi.fn(() => ({ status: "started" as const }));
+ const tool = createSendToTabTool(
+ makeCallbacks({
+ resolveShortId: () => ({
+ status: "ok",
+ tab: { id: "self-id", title: "Me", handle: "self" },
+ }),
+ deliver,
+ }),
+ );
+ const out = await tool.execute({ tab_id: "self", message: "hi" });
+ expect(out).toContain("cannot send a message to your own tab");
+ expect(deliver).not.toHaveBeenCalled();
+ });
+
+ it("surfaces a thrown delivery error instead of crashing", async () => {
+ const tool = createSendToTabTool(
+ makeCallbacks({
+ deliver: () => {
+ throw new Error("boom");
+ },
+ }),
+ );
+ const out = await tool.execute({ tab_id: "targ", message: "hi" });
+ expect(out).toContain("Error delivering message");
+ expect(out).toContain("boom");
+ });
+});
diff --git a/packages/core/tests/tools/summon.test.ts b/packages/core/tests/tools/summon.test.ts
index 6597c21..f59f345 100644
--- a/packages/core/tests/tools/summon.test.ts
+++ b/packages/core/tests/tools/summon.test.ts
@@ -39,9 +39,13 @@ describe("createSummonTool — description content", () => {
path: "/home/u/.config/dispatch/agents/researcher.toml",
},
];
- const tool = createSummonTool("/tmp/work", noopCallbacks, agents, [], [
- "/home/u/.config/dispatch/agents",
- ]);
+ const tool = createSummonTool(
+ "/tmp/work",
+ noopCallbacks,
+ agents,
+ [],
+ ["/home/u/.config/dispatch/agents"],
+ );
expect(tool.description).toContain("programmer");
expect(tool.description).toContain("Programmer");
expect(tool.description).toContain("Implements code from a plan");
diff --git a/packages/frontend/src/lib/components/TabBar.svelte b/packages/frontend/src/lib/components/TabBar.svelte
index feb1e5b..3cbd849 100644
--- a/packages/frontend/src/lib/components/TabBar.svelte
+++ b/packages/frontend/src/lib/components/TabBar.svelte
@@ -56,6 +56,7 @@ const activeUserTabId = $derived(
>
<span class="flex items-center gap-1.5">
<span class="w-1.5 h-1.5 rounded-full shrink-0 {statusColor(tab.agentStatus)}"></span>
+ <span class="font-mono text-[10px] px-1 py-0.5 rounded bg-base-300 text-base-content/60 shrink-0" title="Tab ID — agents address this tab by this handle">{tabStore.shortHandleFor(tab.id)}</span>
<span class="max-w-32 truncate text-xs">{tab.title}</span>
</span>
<button
@@ -89,6 +90,7 @@ const activeUserTabId = $derived(
>
<span class="flex items-center gap-1">
<span class="w-1 h-1 rounded-full shrink-0 {statusColor(tab.agentStatus)}"></span>
+ <span class="font-mono text-[10px] px-1 rounded bg-base-300 text-base-content/60 shrink-0" title="Tab ID — agents address this tab by this handle">{tabStore.shortHandleFor(tab.id)}</span>
<span class="max-w-28 truncate text-xs">{tab.title}</span>
</span>
{#if tab.persistent}
diff --git a/packages/frontend/src/lib/components/ToolPermissions.svelte b/packages/frontend/src/lib/components/ToolPermissions.svelte
index d64eaf9..0caf0fa 100644
--- a/packages/frontend/src/lib/components/ToolPermissions.svelte
+++ b/packages/frontend/src/lib/components/ToolPermissions.svelte
@@ -28,6 +28,16 @@ const toolPermissions: ToolPermission[] = [
description: "Allow the AI to open new independent top-level tabs",
},
{
+ id: "send_to_tab",
+ label: "Message other tabs",
+ description: "Allow the AI to send messages to other tabs by their ID",
+ },
+ {
+ id: "read_tab",
+ label: "Read other tabs",
+ description: "Allow the AI to read other tabs' latest responses by their ID",
+ },
+ {
id: "web_search",
label: "Web search",
description: "Allow the AI to search the web via Firecrawl",
diff --git a/packages/frontend/src/lib/settings.svelte.ts b/packages/frontend/src/lib/settings.svelte.ts
index 11eb79c..123008d 100644
--- a/packages/frontend/src/lib/settings.svelte.ts
+++ b/packages/frontend/src/lib/settings.svelte.ts
@@ -9,6 +9,8 @@ let toolPerms = $state<Record<string, boolean>>({
bash: false,
summon: false,
user_agent: false,
+ send_to_tab: false,
+ read_tab: false,
external_directory: false,
web_search: false,
youtube_transcribe: false,
@@ -19,6 +21,8 @@ let savedToolPerms = $state<Record<string, boolean>>({
bash: false,
summon: false,
user_agent: false,
+ send_to_tab: false,
+ read_tab: false,
external_directory: false,
web_search: false,
youtube_transcribe: false,
diff --git a/packages/frontend/src/lib/tabs.svelte.ts b/packages/frontend/src/lib/tabs.svelte.ts
index e303c80..317de8d 100644
--- a/packages/frontend/src/lib/tabs.svelte.ts
+++ b/packages/frontend/src/lib/tabs.svelte.ts
@@ -232,6 +232,29 @@ export function createTabStore() {
return tabs.find((t) => t.id === id);
}
+ /**
+ * Minimum display length of a tab handle (git-style short id). Mirrors
+ * `MIN_TAB_PREFIX_LENGTH` in core's `db/tabs.ts` so the handle the user sees
+ * is always resolvable by the backend's `resolveTabPrefix`.
+ */
+ const MIN_HANDLE_LENGTH = 4;
+
+ /**
+ * Compute the shortest unique prefix (≥ MIN_HANDLE_LENGTH chars) of `tabId`
+ * among all currently-open tabs — the displayed "handle" agents use to
+ * address each other. Purely DERIVED from the UUIDs already in `tabs`; never
+ * stored. Grows by one char only when another open tab shares the prefix, and
+ * shrinks back when that sibling closes.
+ */
+ function shortHandleFor(tabId: string): string {
+ const others = tabs.map((t) => t.id).filter((id) => id !== tabId);
+ for (let len = MIN_HANDLE_LENGTH; len < tabId.length; len++) {
+ const candidate = tabId.slice(0, len);
+ if (!others.some((id) => id.startsWith(candidate))) return candidate;
+ }
+ return tabId;
+ }
+
async function createNewTab(): Promise<Tab> {
const id = generateId();
const title = "New Tab";
@@ -1926,6 +1949,7 @@ export function createTabStore() {
get configReloaded() {
return configReloaded;
},
+ shortHandleFor,
createNewTab,
switchTab,
closeTab,