Merge branch 'dev' into perm/fix-user-agent-summon-permission

# Conflicts:
#	packages/api/tests/agent-manager.test.ts
author: Adam Malczewski <[email protected]> 2026-06-02 16:04:20 +0900
committer: Adam Malczewski <[email protected]> 2026-06-02 16:04:20 +0900
commit: a24397636de35f4b92c7cd85154ddc03b98d47cd (patch)
tree: 99c5de728457d816d9baf0bfffe3c4fc2eb34af5
parent: 3ff2db698c2633023934d8477a9e995f78fa011e (diff)
parent: e0b63c0c03880bf77a07d47b28bbabf84649fcc3 (diff)
5 files changed, 265 insertions, 66 deletions
diff --git a/HANDOFF.md b/HANDOFF.md
index dbabee1..81f9472 100644
--- a/HANDOFF.md
+++ b/HANDOFF.md
@@ -1,68 +1,95 @@
-# Handoff — td/todo-fix: declarative todo/task system
+# Handoff — tab/fix-tab-messaging-tool: cross-tab messaging tools usable when granted
 
 ## Summary
-Replaced Dispatch's imperative, id-based `todo` tool (actions `add`/`update`/`list`/`get`/`remove`)
-with opencode's **declarative whole-list** design, and fixed the panel blanking on reload. The tool
-name (`todo`), the `task-list-update` event, the per-tab `TaskList` store, and the sidebar **Tasks**
-panel are all preserved — only the interface, status model, and UI rendering changed.
+Agents could be granted the cross-tab messaging tools (`send_to_tab` / `read_tab`) yet
+behaved as if they didn't have them — claiming they were "incapable" and refusing to call
+them. **Root cause:** the tools were correctly registered, permission-gated, resolved
+per-tab, and executable, and their JSON schemas WERE sent to the model — but the agent's
+**system prompt** enumerates "You have access to the following tools" by filtering tool
+names through a static `TOOL_DESCRIPTIONS` map, and that map had **no entries** for
+`send_to_tab` / `read_tab`. So the prompt explicitly told the model it lacked them.
 
-## What changed (and why it's better)
-- **Declarative whole-list write** (from opencode's `todowrite`): the model sends the *entire*
-  desired list in one `todos` param each call; the store replaces its list. No model-visible ids,
-  no delta reasoning, no "task not found" spirals, no multi-call churn — the failure modes that made
-  the old CRUD tool confuse weaker models.
-- **Status lifecycle:** `pending | in_progress | completed | cancelled` (was `pending | in_progress |
-  done | blocked`; `blocked` was dead/unrendered state).
-- **No `priority`** (deliberately dropped per product decision; opencode has it, we don't).
-- **Reload reliability:** todos used to blank on page reload (broadcast only on change, absent from
-  the reconnect snapshot). Now `TabStatusSnapshot` carries per-tab `tasks`, so the panel rehydrates
-  from the backend on reload/reconnect. Still **in-memory per-tab** (no DB; does not survive a server
-  restart).
+After fixing the core bug, two follow-up behavioral/prompting issues surfaced in live
+testing and were also fixed in the tool context:
+1. The **sender busy-waited** (ran `sleep`/polled) for a reply instead of ending its turn.
+2. The **recipient replied to its own user in plain text** instead of routing the answer
+   back through `send_to_tab` to the sender.
+A third refinement made every `read_tab` mention **conditional** on the tab actually
+holding `read_tab` (the permissions are split, so a tab can have `send_to_tab` without
+`read_tab` — advertising a tool it wasn't granted is wrong).
+
+## What changed (and why)
+- **Advertise the tools (the actual bug):** added `send_to_tab` + `read_tab` entries to
+  `TOOL_DESCRIPTIONS` so the system prompt's capability list matches the granted toolset.
+- **Stop sender busy-wait:** the `send_to_tab` tool description, its delivery-result text,
+  and the system-prompt one-liner now say plainly: do NOT sleep/poll/run commands to wait;
+  if the target replies it will **WAKE you with a new message** in a later turn; keep
+  working if you have other tasks, else **end your turn**.
+- **Fix recipient reply routing:** the delivered-message wrapper now states the message is
+  from **another agent, NOT your user**, and that to reply you must use `send_to_tab`
+  addressed back to the sender's handle — and **ONLY** if asked (it may just be context).
+  A plain text response reaches only the recipient's own user.
+- **Conditional `read_tab` guidance:** `createSendToTabTool` takes a new `canReadTab`
+  callback flag. `AgentManager.buildTabCommToolEntries(tabId, canReadTab)` passes it
+  (`allowed.has("read_tab")` on the child path; `permReadTab` on the parent path). The
+  description + result text only reference `read_tab` when the tab actually has it. The
+  static `TOOL_DESCRIPTIONS.send_to_tab` one-liner dropped its `read_tab` phrasing (it
+  can't be per-tab conditional there).
 
 ## Files changed
-- `packages/core/src/types/index.ts` — `TaskStatus` union; `TaskItem = { id, content, status }`
-  (`id` internal/positional, never shown to the model); `TabStatusSnapshot.tasks?`.
-- `packages/core/src/tools/task-list.ts` — rewrote `TaskList` (declarative `setTasks`/`getTasks`/
-  `onChange`); `createTaskListTool` with a single `todos` param that echoes the stored list without
-  ids; new exported `TODO_DESCRIPTION` (adapted from opencode `todowrite.txt`).
-- `packages/core/src/index.ts` — export `TODO_DESCRIPTION`.
-- `packages/api/src/agent-manager.ts` — `TODO_GUIDANCE` → `TASK_MANAGEMENT_GUIDANCE` (system-prompt
-  section adapted from opencode `anthropic.txt`); updated `TOOL_DESCRIPTIONS.todo`; `getAllStatuses()`
-  now includes each tab's `tasks` (all tabs, omitted when empty).
-- `packages/frontend/src/lib/types.ts` — mirror `TaskItem` + `TabStatusSnapshot.tasks`.
-- `packages/frontend/src/lib/tabs.svelte.ts` — hydrate `tasks` from the snapshot in both restore
-  paths (initial `GET /status` map + `statuses` WS handler); updated debug-dump label.
-- `packages/frontend/src/lib/components/TaskListPanel.svelte` — render `content`; all four statuses
-  (completed→checked+strikethrough, in_progress→indeterminate+bold, cancelled→dim+strikethrough,
-  pending→empty); `completed/active` progress counter. Sidebar panel only — nothing relocated.
-- `packages/core/tests/tools/task-list.test.ts` — new (15 tests).
-- `packages/api/tests/agent-manager.test.ts`, `packages/api/tests/routes.test.ts` — updated
-  `TaskList` mocks to the declarative shape; added `getAllStatuses` task-snapshot coverage.
-- `notes/todo-tool-redesign-plan.md` — appended an "As-built" section.
+- `packages/api/src/agent-manager.ts`
+  - `TOOL_DESCRIPTIONS`: added `send_to_tab` + `read_tab`; `send_to_tab` one-liner carries
+    the no-busy-wait / wake-you-with-a-new-message guidance (no `read_tab` reference).
+  - `buildTabCommToolEntries(tabId, canReadTab)`: new param, forwarded into
+    `createSendToTabTool` as `canReadTab`. Both call sites updated
+    (`allowed.has("read_tab")` / `permReadTab`).
+- `packages/core/src/tools/send-to-tab.ts`
+  - `SendToTabCallbacks` gained `canReadTab: boolean`.
+  - Description built conditionally (the `read_tab` follow-up line only appears when
+    `canReadTab`); "WAKE you with a new message" phrasing; recipient reply-contract footer
+    with **ONLY** uppercased; header marks sender as another agent (not your user).
+  - Delivery-result text built conditionally (mentions `read_tab` only when `canReadTab`).
+- `packages/api/tests/agent-manager.test.ts`
+  - Agent mock now captures `config.systemPrompt`; new describe block
+    "send_to_tab / read_tab system-prompt advertisement" (5 tests) asserts the prompt lists
+    the granted tab tools (and omits ungranted ones), locking the prompt list to the schema.
+- `packages/core/tests/tools/send-to-tab.test.ts`
+  - `makeCallbacks` default `canReadTab: true`; assertions for provenance header/footer,
+    **ONLY** uppercase, no-busy-wait/end-your-turn, "wake you with a new message", and both
+    `canReadTab` branches (description + result text) for `read_tab` presence/absence.
 
 ## Public surface changed
-- **Tool `todo`**: parameters changed from `{ action, title, description, task_id, status }` to a
-  single `{ todos: Array<{ content, status }> }`. Statuses `pending|in_progress|completed|cancelled`.
-- **`@dispatch/core` exports**: added `TODO_DESCRIPTION`. `TaskItem` shape changed (`title`+
-  `description` → `content`; status union changed). `TaskList` methods changed (`addTask`/`updateTask`/
-  `removeTask`/`getTask` removed; `setTasks` added).
-- **`TabStatusSnapshot`** (wire format, core + frontend mirror) gained optional `tasks`.
-- Tool name, allowlist/loader/summon/permission wiring, agent TOMLs: **unchanged**.
+- **`@dispatch/core` — `SendToTabCallbacks`**: added required field `canReadTab: boolean`.
+  Any external caller of `createSendToTabTool` must now supply it. (In-repo, the only caller
+  is `AgentManager.buildTabCommToolEntries`, updated here.)
+- No changes to tool NAMES, permission keys, registry, execution path, wire formats, DB, or
+  the frontend. Tool behavior (delivery routing, auto-wake budget, resolution) is unchanged
+  — only the advertised/contextual text and the new `canReadTab` plumbing.
 
 ## Verification status
-- `bun run check` (biome): clean.
-- `bun run test`: **585 passing** (37 files).
-- `tsc --noEmit` (core, api) + `svelte-check` (frontend): 0 errors.
-- Verified post-merge of `dev`.
+- `bun run check` (biome): **clean** (165 files, no fixes).
+- `bun run test`: **594 passing** (37 files). (Baseline was 585; +9 new tests.)
+- `tsc --noEmit` core + api: **0 errors**.
+- `svelte-check` (frontend): **0 errors, 0 warnings**.
+- Re-verified after `git merge --no-edit dev` (already up to date) immediately before push.
 
 ## Published
-Yes. Merged `dev` down (no conflicts), re-verified all-green, fast-forwarded
-`dev` → `9d6b7a9`. User confirmed the task system works before merge.
+**Yes.** `dev` was already an ancestor of this branch (clean fast-forward, no merge commit
+needed). Fast-forwarded `dev`: `c0c0872..e4379da`. User confirmed the fix before merge.
+
+Commits (oldest→newest):
+- `9c89ec9` advertise send_to_tab/read_tab in the agent system prompt (+ regression tests)
+- `e475e52` clearer send_to_tab context to stop busy-wait + wrong-recipient replies
+- `aa295e8` only mention read_tab when the sender actually has it; CAPS on ONLY
+- `e4379da` say a reply will WAKE you with a new message
 
 ## Assumptions / known gaps
-- No DB persistence: todos are in-memory per-tab and do not survive a server restart (matches scope;
-  opencode persists to SQLite — intentionally not ported).
-- No `priority` field (dropped per decision).
-- No new UI surfaces — the existing sidebar Tasks panel only.
-- An unrelated untracked `bookmark-manager/` directory exists in the worktree root; it is not part of
-  this feature and was left untouched (never staged/committed).
+- The static `TOOL_DESCRIPTIONS.send_to_tab` system-prompt one-liner can't be per-tab
+  conditional, so it deliberately omits any `read_tab` reference. The precise, conditional
+  `read_tab` guidance lives in the tool's own description/result (which ARE per-tab).
+- `read_tab` itself was already truthful (it's only present when granted); no description
+  changes were needed there.
+- These are prompting/UX nudges — model adherence isn't guaranteed, but the wording now
+  matches actual runtime behavior (split perms, wake-on-reply, reply-via-tool).
+- Pre-existing untracked dirs in the worktree root (e.g. `bookmark-manager/` noted in a
+  prior handoff) were left untouched; not part of this feature.
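
The root cause described in the handoff note can be reproduced in isolation. The sketch below is illustrative only: `renderToolList` and the trimmed-down description strings are hypothetical stand-ins, and the real prompt builder in `agent-manager.ts` may be structured differently. It assumes, as the note says, that the capability list is produced by filtering granted tool names through a static description map.

```typescript
// Hypothetical stand-in for the static map in agent-manager.ts (entries shortened).
const TOOL_DESCRIPTIONS: Record<string, string> = {
	web_search: "Search the web.",
};

// Assumed shape of the prompt's capability list: only tools that have a
// description entry are listed, so a granted tool without one silently vanishes.
function renderToolList(granted: string[], descriptions: Record<string, string>): string {
	const lines = granted
		.filter((toolName) => descriptions[toolName] !== undefined)
		.map((toolName) => `- ${toolName}: ${descriptions[toolName]}`);
	return ["You have access to the following tools:", ...lines].join("\n");
}

const granted = ["web_search", "send_to_tab"];

// Before the fix: send_to_tab is granted but has no entry, so it is not advertised.
const before = renderToolList(granted, TOOL_DESCRIPTIONS);

// After the fix: adding an entry makes the prompt list match the granted toolset.
const after = renderToolList(granted, {
	...TOOL_DESCRIPTIONS,
	send_to_tab: "Send a message to another tab.",
});
```

Under these assumptions `before` lists only `web_search`, while `after` also contains a `- send_to_tab:` line, which is the substring the new regression tests assert on.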
diff --git a/packages/api/src/agent-manager.ts b/packages/api/src/agent-manager.ts
index 9499ce5..2795a6c 100644
--- a/packages/api/src/agent-manager.ts
+++ b/packages/api/src/agent-manager.ts
@@ -83,6 +83,10 @@ const TOOL_DESCRIPTIONS: Record<string, string> = {
 	web_search: "Search the web and optionally scrape full page content from results.",
 	youtube_transcribe:
 		"Fetch the transcript/subtitles for a YouTube video. Set background=true to start in the background and get a job_id for later retrieval.",
+	send_to_tab:
+		"Send a message to another tab (agent) by its short ID, as shown in the tab bar. Fire-and-forget: it queues/wakes the target and returns immediately without waiting for a reply. Do NOT sleep, poll, or run commands to wait — if the target replies it will wake you with a new message in a later turn; if you are only waiting, end your turn.",
+	read_tab:
+		"Read another tab (agent)'s most recent completed response by its short ID. Returns a non-blocking snapshot; if the target is still running you get its previous completed turn. Use after send_to_tab to collect a reply.",
 };
 
 /**
@@ -542,7 +546,7 @@ export class AgentManager {
 				}
 				// Tab-to-tab communication — gated on the child whitelist.
 				if (allowed.has("send_to_tab") || allowed.has("read_tab")) {
-					for (const entry of this.buildTabCommToolEntries(tabId)) {
+					for (const entry of this.buildTabCommToolEntries(tabId, allowed.has("read_tab"))) {
 						if (allowed.has(entry.name)) toolEntries.push(entry);
 					}
 				}
@@ -639,7 +643,7 @@ export class AgentManager {
 					const tabCommAllowed = new Set<string>();
 					if (permSendToTab) tabCommAllowed.add("send_to_tab");
 					if (permReadTab) tabCommAllowed.add("read_tab");
-					for (const entry of this.buildTabCommToolEntries(tabId)) {
+					for (const entry of this.buildTabCommToolEntries(tabId, permReadTab)) {
 						if (tabCommAllowed.has(entry.name)) toolEntries.push(entry);
 					}
 				}
@@ -1249,9 +1253,15 @@ export class AgentManager {
 	 * both tool-construction paths (child whitelist + permission-gated parent).
 	 * `selfHandle` is computed once so the calling tab can stamp provenance and
 	 * reject self-sends.
+	 *
+	 * `canReadTab` reflects whether THIS tab will also be granted `read_tab`
+	 * (the permissions are split). It is forwarded into `send_to_tab` so the
+	 * tool only points the agent at `read_tab` when it actually has it — never
+	 * advertising a tool the agent wasn't granted.
 	 */
 	private buildTabCommToolEntries(
 		tabId: string,
+		canReadTab: boolean,
 	): Array<{ name: string; tool: ReturnType<typeof createSendToTabTool> }> {
 		const selfHandle = shortestUniquePrefix(tabId);
 		return [
@@ -1265,6 +1275,7 @@ export class AgentManager {
 						this.deliverMessage(targetId, message, { origin: "agent" }),
 					listOpenHandles: () => this.listOpenHandles(tabId),
 					self: { id: tabId, handle: selfHandle },
+					canReadTab,
 				}),
 			},
 			{
diff --git a/packages/api/tests/agent-manager.test.ts b/packages/api/tests/agent-manager.test.ts
index f3ea207..3353aff 100644
--- a/packages/api/tests/agent-manager.test.ts
+++ b/packages/api/tests/agent-manager.test.ts
@@ -75,7 +75,11 @@ function makeRow(
 // because the production code reassigns `agent.messages =
 // rows.slice(...)` AFTER `new Agent()` returns — capturing a
 // reference at construction would yield a stale empty array.
-const constructedAgents: Array<{ initialMessages: unknown[]; toolNames: string[] }> = [];
+const constructedAgents: Array<{
+	initialMessages: unknown[];
+	toolNames: string[];
+	systemPrompt: string;
+}> = [];
 function resetConstructedAgents(): void {
 	constructedAgents.length = 0;
 }
@@ -159,8 +163,10 @@ vi.mock("@dispatch/core", () => ({
 		status = "idle";
 		messages: unknown[] = [];
 		toolNames: string[] = [];
-		constructor(config: { tools?: Array<{ name: string }> }) {
+		systemPrompt = "";
+		constructor(config: { tools?: Array<{ name: string }>; systemPrompt?: string }) {
 			this.toolNames = (config?.tools ?? []).map((t) => t.name);
+			this.systemPrompt = config?.systemPrompt ?? "";
 		}
 		async *run(message: string, options?: { reasoningEffort?: string }): AsyncGenerator<unknown> {
 			// Snapshot the post-construction pre-populated message list
@@ -170,6 +176,7 @@ vi.mock("@dispatch/core", () => ({
 			constructedAgents.push({
 				initialMessages: [...this.messages],
 				toolNames: [...this.toolNames],
+				systemPrompt: this.systemPrompt,
 			});
 			capturedRunOptions.push(options);
 			if (runImpl) {
@@ -1502,6 +1509,66 @@ describe("AgentManager", () => {
 		});
 	});
 
+	// Regression: granted tab-messaging tools must also be ADVERTISED in the
+	// agent's system prompt. The tools were registered in the API tool payload
+	// but `buildSystemPrompt` filtered its "You have access to the following
+	// tools" list through TOOL_DESCRIPTIONS, which lacked send_to_tab/read_tab
+	// — so the model was told it didn't have them and refused to use them. This
+	// locks the prompt's capability list to the granted toolset.
+	describe("send_to_tab / read_tab system-prompt advertisement", () => {
+		async function promptForPerms(tabId: string, perms: Record<string, string>): Promise<string> {
+			for (const [k, v] of Object.entries(perms)) setFakeSetting(k, v);
+			const manager = new AgentManager();
+			await manager.processMessage(tabId, "go");
+			return constructedAgents.at(-1)?.systemPrompt ?? "";
+		}
+
+		it("lists send_to_tab in the system prompt when granted", async () => {
+			const prompt = await promptForPerms("tab-prompt-send", { perm_send_to_tab: "allow" });
+			expect(prompt).toContain("- send_to_tab:");
+			expect(prompt).not.toContain("- read_tab:");
+		});
+
+		it("lists read_tab in the system prompt when granted", async () => {
+			const prompt = await promptForPerms("tab-prompt-read", { perm_read_tab: "allow" });
+			expect(prompt).toContain("- read_tab:");
+			expect(prompt).not.toContain("- send_to_tab:");
+		});
+
+		it("lists both tab-messaging tools when both are granted", async () => {
+			const prompt = await promptForPerms("tab-prompt-both", {
+				perm_send_to_tab: "allow",
+				perm_read_tab: "allow",
+			});
+			expect(prompt).toContain("- send_to_tab:");
+			expect(prompt).toContain("- read_tab:");
+		});
+
+		it("omits both from the system prompt when neither is granted", async () => {
+			const prompt = await promptForPerms("tab-prompt-neither", {});
+			expect(prompt).not.toContain("- send_to_tab:");
+			expect(prompt).not.toContain("- read_tab:");
+		});
+
+		it("advertises exactly the granted tab tools (prompt list matches schema)", async () => {
+			for (const [k, v] of Object.entries({
+				perm_send_to_tab: "allow",
+				perm_read_tab: "allow",
+			})) {
+				setFakeSetting(k, v);
+			}
+			const manager = new AgentManager();
+			await manager.processMessage("tab-prompt-match", "go");
+			const inst = constructedAgents.at(-1);
+			// Every granted tab-messaging tool surfaced in the schema must also be
+			// advertised in the prompt, so the model never believes it lacks one.
+			for (const name of ["send_to_tab", "read_tab"]) {
+				expect(inst?.toolNames).toContain(name);
+				expect(inst?.systemPrompt).toContain(`- ${name}:`);
+			}
+		});
+	});
+
 	// ─── Usage side-channel persistence ──────────────────────────────
 	//
 	// `usage` AgentEvents (one per LLM round-trip) are persisted as invisible
diff --git a/packages/core/src/tools/send-to-tab.ts b/packages/core/src/tools/send-to-tab.ts
index eb86b7e..eae6bfa 100644
--- a/packages/core/src/tools/send-to-tab.ts
+++ b/packages/core/src/tools/send-to-tab.ts
@@ -44,6 +44,13 @@ export interface SendToTabCallbacks {
 	/** The calling tab's own id + handle — used to block self-sends and to
 	 *  stamp provenance onto the delivered message. */
 	self: { id: string; handle: string };
+	/**
+	 * Whether THIS calling tab also has the `read_tab` tool granted. The
+	 * tab-messaging permissions are split, so a tab can hold `send_to_tab`
+	 * without `read_tab`. When false, the tool must NOT tell the agent to use
+	 * `read_tab` (it doesn't have it) — replies only arrive on their own.
+	 */
+	canReadTab: boolean;
 }
 
 /** Render the "available tabs" hint shared by the none/ambiguous branches. */
@@ -54,6 +61,19 @@ function renderOpenHandles(handles: Array<{ handle: string; title: string }>): s
 }
 
 export function createSendToTabTool(callbacks: SendToTabCallbacks): ToolDefinition {
+	// The `read_tab` follow-up hint is only truthful when this tab actually
+	// holds the `read_tab` tool (the permissions are split). When it doesn't,
+	// the only honest guidance is that a reply will wake it as a new message — never tell
+	// the agent to call a tool it wasn't granted.
+	const waitLine = callbacks.canReadTab
+		? "money. If the target replies it will WAKE you with a new message in a later turn; you"
+		: "money. If the target replies it will WAKE you with a new message in a later turn.";
+	const readTabLine = callbacks.canReadTab
+		? ["can also call 'read_tab' with the same ID in a FUTURE turn to check. If you have other"]
+		: [];
+	const keepGoingLine = callbacks.canReadTab
+		? "work to do, keep going; if you are ONLY waiting for the reply, end your turn now."
+		: "If you have other work to do, keep going; if you are ONLY waiting for the reply, end your turn now.";
 	return {
 		name: "send_to_tab",
 		description: [
@@ -64,9 +84,14 @@ export function createSendToTabTool(callbacks: SendToTabCallbacks): ToolDefiniti
 			"  - If the target tab is idle, your message WAKES it and starts a new turn.",
 			"",
 			"This is fire-and-forget: it returns immediately and does NOT wait for a reply.",
-			"Use the 'read_tab' tool with the same ID later to read the target's latest response.",
+			"Do NOT sleep, poll, or run shell commands to wait for a reply — that wastes turns and",
+			waitLine,
+			...readTabLine,
+			keepGoingLine,
 			"",
-			"Your tab ID is auto-added to the top of the message so the recipient can reply to you.",
+			"Your tab ID is auto-added to the top of the message so the recipient knows who to reply",
+			"to. The recipient must use this same 'send_to_tab' tool (addressed to your ID) to answer;",
+			"a plain text response reaches only their own user, not you.",
 			"IDs are git-style prefixes: pass any length that uniquely identifies the target (min 4 chars).",
 			"If the ID is ambiguous you'll be asked to add a character.",
 		].join("\n"),
@@ -117,8 +142,18 @@ export function createSendToTabTool(callbacks: SendToTabCallbacks): ToolDefiniti
 			}
 
 			// Stamp provenance so the recipient (and the watching user) can see
-			// which tab the message came from and reply back via its handle.
-			const delivered = `[message from tab ${callbacks.self.handle}]\n\n${message}`;
+			// which tab the message came from and how to reply. The header makes
+			// clear this is a PEER AGENT, not the recipient's own user, and the
+			// footer states the reply contract: a reply (only if warranted) must
+			// go back through `send_to_tab`, since a plain text answer reaches
+			// only the recipient's own user — not this sender.
+			const delivered = [
+				`[message from tab ${callbacks.self.handle} — this is another agent, NOT your user]`,
+				"",
+				message,
+				"",
+				`[To reply to tab ${callbacks.self.handle}, use the send_to_tab tool with tab_id "${callbacks.self.handle}". ONLY reply if this message asks you to, or your user tells you to — it may just be context or instructions. A plain text response goes to your own user, not to this agent.]`,
+			].join("\n");
 
 			try {
 				const result = await callbacks.deliver(target.id, delivered);
@@ -138,7 +173,23 @@ export function createSendToTabTool(callbacks: SendToTabCallbacks): ToolDefiniti
 					result.status === "queued"
 						? "queued (target is busy; it will be picked up next turn)"
 						: "delivered (target was idle; a new turn has started)";
-				return `Message ${verb}. Target tab: ${target.handle} (${target.title}). Use read_tab with "${target.handle}" to read its reply later.`;
+				const tail = callbacks.canReadTab
+					? [
+							"Do NOT sleep, poll, or run commands to wait for a reply. If the target replies it",
+							`will WAKE you with a new message later; you can also call read_tab with "${target.handle}"`,
+							"in a FUTURE turn to check. Keep working if you have other tasks; if you are ONLY",
+							"waiting for this reply, end your turn now.",
+						]
+					: [
+							"Do NOT sleep, poll, or run commands to wait for a reply. If the target replies it",
+							"will WAKE you with a new message later. Keep working if you have other tasks; if",
+							"you are ONLY waiting for this reply, end your turn now.",
+						];
+				return [
+					`Message ${verb}. Target tab: ${target.handle} (${target.title}).`,
+					"",
+					...tail,
+				].join("\n");
 			} catch (err) {
 				return `Error delivering message: ${err instanceof Error ? err.message : String(err)}`;
 			}
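
The conditional `read_tab` wording in the hunks above can be summarised with a smaller sketch. This is not the code from `send-to-tab.ts`: `waitGuidance` is a hypothetical helper and the sentences are simplified. It only illustrates the idea of appending the `read_tab` hint when the calling tab actually holds that tool.

```typescript
// Hypothetical helper: builds the "do not busy-wait" guidance for the sender.
// The read_tab hint is included only when the calling tab was granted read_tab.
function waitGuidance(canReadTab: boolean, targetHandle: string): string {
	const lines = [
		"Do NOT sleep, poll, or run commands to wait for a reply.",
		"If the target replies it will WAKE you with a new message later.",
	];
	if (canReadTab) {
		lines.push(`You can also call read_tab with "${targetHandle}" in a FUTURE turn to check.`);
	}
	lines.push("If you are ONLY waiting for this reply, end your turn now.");
	return lines.join("\n");
}

// Both permission branches, mirroring the two test cases in the commit.
const withRead = waitGuidance(true, "targ");
const withoutRead = waitGuidance(false, "targ");
```

Building the text from a list and pushing the optional line keeps the two branches from drifting apart, at the cost of slightly different line wrapping than the two hand-written arrays in the actual change.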
diff --git a/packages/core/tests/tools/send-to-tab.test.ts b/packages/core/tests/tools/send-to-tab.test.ts
index 4450fc5..21d8032 100644
--- a/packages/core/tests/tools/send-to-tab.test.ts
+++ b/packages/core/tests/tools/send-to-tab.test.ts
@@ -14,6 +14,7 @@ function makeCallbacks(overrides: Partial<SendToTabCallbacks> = {}): SendToTabCa
 		deliver: () => ({ status: "started" }),
 		listOpenHandles: () => [{ handle: "targ", title: "Target" }],
 		self: { id: "self-id", handle: "self" },
+		canReadTab: true,
 		...overrides,
 	};
 }
@@ -24,6 +25,22 @@ describe("createSendToTabTool — schema & description", () => {
 		expect(tool.name).toBe("send_to_tab");
 		expect(tool.description).toContain("fire-and-forget");
 		expect(tool.description.toLowerCase()).toContain("queued");
+		// Description must steer the model away from busy-waiting for a reply.
+		expect(tool.description.toLowerCase()).toContain("do not sleep");
+		expect(tool.description.toLowerCase()).toContain("end your turn");
+	});
+
+	it("mentions read_tab in the description only when canReadTab is true", () => {
+		const tool = createSendToTabTool(makeCallbacks({ canReadTab: true }));
+		expect(tool.description).toContain("read_tab");
+	});
+
+	it("never mentions read_tab in the description when canReadTab is false", () => {
+		const tool = createSendToTabTool(makeCallbacks({ canReadTab: false }));
+		expect(tool.description).not.toContain("read_tab");
+		// Still tells the agent a reply will wake it + to end its turn.
+		expect(tool.description.toLowerCase()).toContain("wake you with a new message");
+		expect(tool.description.toLowerCase()).toContain("end your turn");
 	});
 });
 
@@ -35,11 +52,37 @@ describe("createSendToTabTool — execute()", () => {
 		expect(deliver).toHaveBeenCalledTimes(1);
 		const [targetId, delivered] = deliver.mock.calls[0] ?? [];
 		expect(targetId).toBe("target-id");
-		// Provenance prefix names the sending tab's handle.
-		expect(delivered).toContain("[message from tab self]");
+		// Provenance header names the sending tab's handle and marks it as a
+		// peer agent (not the recipient's own user).
+		expect(delivered).toContain("[message from tab self");
+		expect(delivered).toContain("another agent");
 		expect(delivered).toContain("hello there");
+		// Reply contract: the recipient must answer via send_to_tab back to the
+		// sender's handle, not as a plain text reply to its own user.
+		expect(delivered).toContain('send_to_tab tool with tab_id "self"');
+		expect(delivered).toContain("ONLY reply if");
 		expect(out).toContain("idle");
 		expect(out).toContain("targ");
+		// Sender is steered away from busy-waiting and told to end its turn.
+		expect(out.toLowerCase()).toContain("do not sleep");
+		expect(out.toLowerCase()).toContain("end your turn");
+	});
+
+	it("points the sender at read_tab in the result only when canReadTab is true", async () => {
+		const deliver = vi.fn(() => ({ status: "started" as const }));
+		const tool = createSendToTabTool(makeCallbacks({ deliver, canReadTab: true }));
+		const out = await tool.execute({ tab_id: "targ", message: "hi" });
+		expect(out).toContain("read_tab");
+	});
+
+	it("omits read_tab from the result when canReadTab is false", async () => {
+		const deliver = vi.fn(() => ({ status: "started" as const }));
+		const tool = createSendToTabTool(makeCallbacks({ deliver, canReadTab: false }));
+		const out = await tool.execute({ tab_id: "targ", message: "hi" });
+		expect(out).not.toContain("read_tab");
+		// Still steers away from busy-waiting and toward ending the turn.
+		expect(out.toLowerCase()).toContain("do not sleep");
+		expect(out.toLowerCase()).toContain("end your turn");
 	});
 
 	it("reports the queued status when the target is busy", async () => {