-rw-r--r-- PLAN-mcp.md 128
-rw-r--r-- notes/mcp-design.md 323
2 files changed, 451 insertions, 0 deletions
diff --git a/PLAN-mcp.md b/PLAN-mcp.md
new file mode 100644
index 0000000..560da8e
--- /dev/null
+++ b/PLAN-mcp.md
@@ -0,0 +1,128 @@
+# Plan — MCP (Model Context Protocol) Integration
+
+> **Status:** PROPOSED — awaiting user approval of design decisions (§7 of
+> `notes/mcp-design.md`).
+> Design: `notes/mcp-design.md`.
+
+## Decisions (to confirm with user)
+
+1. **One `mcp` extension** managing multiple servers (like `lsp`).
+2. **Tool name format:** `<serverId>__<toolName>` (double-underscore separator).
+3. **Phase 1: stdio transport only** (covers freecad-mcp + chrome-devtools-mcp).
+4. **Phase 1: Tools only** (no Resources/Prompts).
+5. **Phase 1: no enable/disable surface** (per-cwd config is sufficient).
+6. **Hand-rolled JSON-RPC** (adapt LSP's rpc.ts + framing.ts; no MCP SDK dep).
+
+## Implementation waves
+
+### Wave 0: Orchestrator (contracts + wiring)
+
+| What | File | Change |
+|---|---|---|
+| No kernel contract change needed | — | The existing `ToolContract` + `host.defineTool()` + `host.getTools()` + `toolsFilter` + `ToolAssembly` are sufficient. MCP tools are just `ToolContract`s registered at runtime. |
+| Glossary | `GLOSSARY.md` | Add `MCP`, `MCP server`, `MCP host` (see design §6). |
+| Root tsconfig | `tsconfig.json` | Add `@dispatch/mcp` project reference (after Wave 1). |
+| host-bin registration | `packages/host-bin/src/main.ts` | Register `mcpExt` in `CORE_EXTENSIONS` (same pattern as `lspExt`). |
+| `bun install` | `bun.lock` | Link the new workspace package. |
+
+> **No `@dispatch/transport-contract` or `@dispatch/wire` version bump** in Phase 1.
+> MCP tools are transparent to the wire (they're just tools the model calls).
+> A future surface (enable/disable, status endpoint) would bump versions.
+
+### Wave 1: `packages/mcp/` (single unit — the extension)
+
+This is the main implementation. One owner-agent builds the entire `packages/mcp/`
+directory. It depends only on `@dispatch/kernel` (contracts) and
+`@dispatch/session-orchestrator` (for the `toolsFilter` handle).
+
+| File | Responsibility |
+|---|---|
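+| `src/framing.ts` | Message framing for stdio (encode/decode). The MCP spec's stdio transport uses newline-delimited JSON-RPC messages rather than LSP's `Content-Length` headers, so LSP's framing.ts is a structural precedent, not a drop-in. PURE. |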
+| `src/framing.test.ts` | Unit tests for encode/decode. |
+| `src/rpc.ts` | JSON-RPC 2.0 client: `request(method, params) → result`, `notify(method, params)`, `onNotification(method, handler)`. Adapts LSP's rpc.ts. PURE (injected `writeFn`). |
+| `src/rpc.test.ts` | Unit tests for request/response/notification handling. |
+| `src/transport.ts` | Transport abstraction: `StdioTransport` (spawn child, pipe stdin/stdout through framing + rpc) + the interface for a future `HttpTransport`. Injected `spawn` (like LSP). |
+| `src/transport.test.ts` | Tests against an in-memory pipe pair (no real spawn). |
+| `src/client.ts` | MCP client: `initialize()` (send proto version + caps, receive server caps), `listTools()` → `tools/list`, `callTool(name, args, signal)` → `tools/call`, listen for `notifications/tools/list_changed`. Tracks connection state. |
+| `src/client.test.ts` | Tests with a mock JSON-RPC connection (injected transport). |
+| `src/config.ts` | PURE config resolution: `.dispatch/mcp.json` → `opencode.json` `mcp` key. Returns `ResolvedMcpServer[]` + `shadowed` flag. Mirrors LSP config.ts. |
+| `src/config.test.ts` | Config resolution tests (precedence, shadow, empty). |
+| `src/registry.ts` | Tool name namespacing (`<serverId>__<toolName>`) + `adaptTool(serverId, mcpTool, client)` → `ToolContract`. The `execute()` proxies to `client.callTool()` and flattens MCP content to a string. PURE (injected client). |
+| `src/registry.test.ts` | Tests for namespacing, content flattening, error handling. |
+| `src/manager.ts` | `McpManager`: one client per server config; lazy-spawn on first access; `status(cwd)`; `getClient(serverId)`; `shutdownAll()`. Mirrors LSP manager.ts. Injected spawn + logger. |
+| `src/manager.test.ts` | Manager lifecycle tests (lazy spawn, shutdown, broken server). |
+| `src/types.ts` | `McpServerConfig`, `McpServerStatus`, `McpService`, `McpToolInfo`, `McpContentItem`. |
+| `src/extension.ts` | manifest + `activate(host)`: real spawn adapter, config resolution per-cwd, manager, register tools via `host.defineTool` (on connect + on `list_changed`), register `toolsFilter` (drop tools from disconnected servers), `mcpServiceHandle`, `deactivate()`. |
+| `src/index.ts` | Public surface exports. |
+
+**Scoping rules for the summon:**
+- `.dispatch/package-agent.md` + `.dispatch/extension-agent.md`
+- `.dispatch/rules/`: `one-owner.md`, `isolation-over-dry.md`, `biome-clean.md`,
+ `pure-core.md`, `no-internal-mocks.md`, `typed-handles.md`,
+ `extension-logging.md`.
+
+**Key guidance for the agent:**
+- Read `packages/lsp/src/` (framing.ts, rpc.ts, config.ts, manager.ts,
+ extension.ts) as the architectural precedent — same pattern, simpler protocol.
+- Read `packages/kernel/src/contracts/tool.ts` for `ToolContract`.
+- Read `packages/kernel/src/contracts/extension.ts` for `HostAPI`,
+ `defineTool`, `addFilter`, `provideService`, `defineService`.
+- Read `packages/session-orchestrator/src/tools-filter.ts` for `ToolAssembly`
+ + `toolsFilter`.
+- The MCP `initialize` flow: send `{ method: "initialize", params: {
+ protocolVersion: "2025-11-25", capabilities: {}, clientInfo: { name:
+ "dispatch", version: "0.0.0" } } }`, receive server capabilities, then send
+ `notifications/initialized`.
+- `tools/list` returns `{ tools: [{ name, description, inputSchema }] }`.
+- `tools/call` takes `{ name, arguments }` and returns `{ content: [...],
+ isError?: boolean }`.
+- Tool names must be namespaced `<serverId>__<toolName>`.
+- `concurrencySafe: false` on all MCP-adapted tools (conservative — MCP servers
+ are generally stateful single-client processes).
+- Newline-delimited framing for stdio: per the MCP spec, each JSON-RPC message
+ is a single line with no embedded newlines (unlike LSP's `Content-Length` headers).
+- No external dependencies — hand-roll the JSON-RPC + framing (adapt LSP's).
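The `src/rpc.ts` responsibilities listed above (`request`, `notify`, `onNotification`, injected `writeFn`) could be sketched roughly as follows. This is a minimal illustration, not the LSP extension's actual code: `createRpcClient`, `RpcMessage`, and `handleMessage` are hypothetical names, and timeouts and cancellation are left out.

```typescript
// Hypothetical sketch of a JSON-RPC 2.0 client with an injected write function.
// All names and shapes here are assumptions made for illustration.
type RpcMessage = {
  jsonrpc: "2.0";
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string };
};

type NotificationHandler = (params: unknown) => void;
type PendingEntry = { resolve: (value: unknown) => void; reject: (reason: Error) => void };

function createRpcClient(writeFn: (message: RpcMessage) => void) {
  let nextId = 1;
  const pending = new Map<number, PendingEntry>();
  const handlers = new Map<string, NotificationHandler>();

  return {
    // Send a request and resolve when the response with the same id arrives.
    request(method: string, params?: unknown): Promise<unknown> {
      const id = nextId++;
      return new Promise<unknown>((resolve, reject) => {
        pending.set(id, { resolve, reject });
        writeFn({ jsonrpc: "2.0", id, method, params });
      });
    },
    // Send a notification (no id, so no response is expected).
    notify(method: string, params?: unknown): void {
      writeFn({ jsonrpc: "2.0", method, params });
    },
    onNotification(method: string, handler: NotificationHandler): void {
      handlers.set(method, handler);
    },
    // Feed one decoded incoming message (the framing layer calls this).
    handleMessage(message: RpcMessage): void {
      if (message.method !== undefined) {
        // Server-initiated message; only notifications (no id) are handled here.
        if (message.id === undefined) handlers.get(message.method)?.(message.params);
        return;
      }
      if (message.id === undefined) return;
      const entry = pending.get(message.id);
      if (entry === undefined) return; // unknown or already-settled id
      pending.delete(message.id);
      if (message.error !== undefined) entry.reject(new Error(message.error.message));
      else entry.resolve(message.result);
    },
  };
}
```

Because the write side is injected, tests can collect outgoing messages in an array and feed responses back through `handleMessage` without any child process.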
+
+### Wave 2: host-bin registration (orchestrator)
+
+After Wave 1 is verified in isolation:
+- Add `@dispatch/mcp` to root `tsconfig.json` project references.
+- `bun install` to link the workspace package.
+- Register `mcpExt` in `CORE_EXTENSIONS` in `packages/host-bin/src/main.ts`.
+- Verify: `tsc -b` EXIT 0, biome clean, full vitest pass.
+
+### Wave 3: Live verification (orchestrator)
+
+- Boot the dev stack (`bin/up`).
+- Create a `.dispatch/mcp.json` in a test cwd with a simple MCP server
+ (e.g. a trivial stdio server that exposes one tool).
+- Verify by sending a chat that triggers the model to call the MCP tool
+ (Phase 1 adds no `GET /conversations/:id/lsp`-style status endpoint for MCP).
+- Or: test with chrome-devtools-mcp (`npx chrome-devtools-mcp`) if available.
+- Confirm: the model sees the MCP tool, calls it, gets a result.
+- Clean up test config.
+
+## Test strategy (per the asymmetric testing rule)
+
+- **Pure core** (framing, rpc, config, registry, types): zero internal mocks,
+ high coverage. The RPC + framing tests use in-memory pipe pairs (injected
+ transport, not mocked `@dispatch/*`). Config tests use string fixtures.
+- **Shell** (transport, manager, extension): integration tests against
+ in-memory/real child processes. A few tests, not exhaustive unit coverage.
+ Do NOT mock sibling extensions.
+
+## Estimated size
+
+- ~10 source files + ~7 test files (per the Wave 1 table).
+- Closest precedent: `packages/lsp/` (~20 files). MCP is simpler (no
+ diagnostics, no incremental sync, no file watching, no sidecars).
+- Expected test count: ~60-80 new tests.
+
+## What is explicitly OUT of scope for Phase 1
+
+- Streamable HTTP transport (Phase 2).
+- MCP Resources and Prompts primitives (Phase 2).
+- Client → Server capabilities (sampling, roots, elicitation) (Phase 2+).
+- Per-conversation enable/disable surface + transport endpoints (Phase 2).
+- Tool poisoning / rug-pull hash validation (security hardening, Phase 2).
+- `mcp-scan`-style static analysis (Phase 2+).
diff --git a/notes/mcp-design.md b/notes/mcp-design.md
new file mode 100644
index 0000000..c38e3b4
--- /dev/null
+++ b/notes/mcp-design.md
@@ -0,0 +1,323 @@
+# MCP (Model Context Protocol) Integration — Design
+
+> **Status:** DESIGN — pending user approval before implementation.
+> Spec: https://modelcontextprotocol.io/specification/2025-11-25
+> SDK (TS): https://github.com/modelcontextprotocol/typescript-sdk
+
+## 0. What MCP is
+
+MCP is an open standard (Anthropic, Nov 2024) for connecting AI applications
+to external tools, data sources, and services — "USB-C for AI." An AI host
+(like Dispatch) connects to MCP servers, which expose capabilities as three
+primitives: **Tools** (executable actions), **Resources** (read-only data),
+and **Prompts** (reusable templates). The protocol is **JSON-RPC 2.0** over
+**stdio** (local child process) or **Streamable HTTP** (remote, POST + SSE).
+
+The architecture has three roles:
+- **Host** — the AI application (Dispatch). Manages multiple MCP clients.
+- **Client** — one per server. Handles the connection, capability discovery,
+ and primitive invocation.
+- **Server** — a process/service exposing Tools/Resources/Prompts.
+
+Dispatch will act as an **MCP host**. Each configured MCP server is a child
+process (stdio) or remote endpoint (HTTP) that Dispatch spawns/connects to,
+discovers tools from, and proxies tool calls to.
+
+## 1. Why this fits Dispatch's architecture
+
+MCP integration is a **standard extension** — not kernel, not core. It is
+architecturally a sibling of the existing `lsp` extension:
+
+| Aspect | LSP extension | MCP extension (proposed) |
+|---|---|---|
+| Protocol | JSON-RPC 2.0 over stdio | JSON-RPC 2.0 over stdio + HTTP |
+| Child processes | One per (serverID, root) | One per configured server |
+| Config source | `.dispatch/lsp.json` + `opencode.json` `lsp` key | `.dispatch/mcp.json` + `opencode.json` `mcp` key |
+| Config resolution | Per-cwd | Per-cwd (same pattern) |
+| What it registers | `lsp` tool + `lspServiceHandle` | N tools (one per MCP tool discovered) + `mcpServiceHandle` |
+| Lifecycle | lazy-spawn, `deactivate` kills all | lazy-spawn, `deactivate` kills all |
+| Capability | `spawn: true, fs: true` | `spawn: true` (stdio) / network (HTTP) |
+
+**Key difference:** LSP registers ONE tool (`lsp`) that the model calls to
+query diagnostics. MCP registers MANY tools, one per tool discovered from each
+connected MCP server. The model calls them directly by their namespaced name
+(e.g. `freecad__create_object`, `chrome-devtools__navigate_page`). This is the
+whole point: the model sees MCP server tools as first-class Dispatch tools.
+
+**How tools reach the model:** the `session-orchestrator`'s `resolveTools()`
+calls `host.getTools()` → the MCP extension has called `host.defineTool()` for
+each discovered MCP tool → they flow through the `toolsFilter` chain → into
+`runTurn`. No new contract surface needed for the basic tool path — the existing
+`ToolContract` + `host.defineTool` + `host.getTools()` is sufficient.
+
+## 2. The per-task loading problem
+
+The user wants to "load up MCPs for specific tasks." This means different MCP
+servers should be available in different contexts — not all MCP servers all the
+time. Three mechanisms address this, in increasing sophistication:
+
+### 2a. Per-cwd config (baseline — mirrors LSP)
+Config is resolved per-cwd: `.dispatch/mcp.json` in the working directory
+declares which MCP servers are available. A conversation pointed at a FreeCAD
+project dir has `freecad` configured; one pointed at a web project has
+`chrome-devtools` configured. This is the simplest mechanism and mirrors LSP
+exactly. No new contract surface.
+
+### 2b. Tools filter (per-turn scoping)
+The MCP extension registers a `toolsFilter` (same mechanism as `skills`) that
+can REMOVE tools from the assembly based on per-turn context. For example:
+- Only include MCP tools from servers that have successfully connected (drop
+ tools from a server that's `error`/disconnected).
+- Scope by a per-conversation "enabled MCP servers" preference (the user
+ toggles which MCP servers are active for this conversation).
+
+This requires NO new contract — the `toolsFilter` + `ToolAssembly` already
+carry `cwd` + `conversationId`, which is enough to scope.
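As an illustration of the first bullet (dropping tools whose server is not connected), here is a minimal sketch. It does not use the real `ToolAssembly` or `toolsFilter` signatures, which are not shown in this document: `ToolLike`, `ServerState`, and `filterMcpTools` are hypothetical, and the sketch assumes the `<serverId>__<toolName>` naming from the plan.

```typescript
// Hypothetical sketch; the real filter would receive a ToolAssembly.
type ServerState = "connected" | "connecting" | "error" | "disconnected";

interface ToolLike {
  name: string;
}

// Keep every non-MCP tool; keep an MCP tool only if its server is connected.
function filterMcpTools<T extends ToolLike>(
  tools: readonly T[],
  states: ReadonlyMap<string, ServerState>,
): T[] {
  return tools.filter((tool) => {
    const separatorIndex = tool.name.indexOf("__");
    if (separatorIndex <= 0) return true; // not a namespaced name
    const serverId = tool.name.slice(0, separatorIndex);
    const state = states.get(serverId);
    if (state === undefined) return true; // prefix is not a configured MCP server
    return state === "connected";
  });
}
```

Tools whose prefix does not match a configured server pass through untouched, so the filter never removes tools owned by other extensions.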
+
+### 2c. Dynamic enable/disable surface (later)
+A per-conversation surface (like cache-warming's) where the user toggles MCP
+servers on/off from the frontend. This needs a surface + transport endpoints
+but reuses the existing surface framework. Deferred to a later phase.
+
+## 3. Config format
+
+Mirror the `mcpServers` format that the MCP ecosystem uses (Claude Desktop,
+VS Code, Cursor all use this shape), adapted to Dispatch's per-cwd resolution:
+
+### `.dispatch/mcp.json`
+```json
+{
+  "servers": {
+    "freecad": {
+      "command": "uvx",
+      "args": ["freecad-mcp"],
+      "env": { "FREECAD_RPC_HOST": "localhost" }
+    },
+    "chrome-devtools": {
+      "command": "npx",
+      "args": ["chrome-devtools-mcp@latest"]
+    },
+    "remote-freecad": {
+      "transport": "http",
+      "url": "http://192.168.1.100:9876/mcp"
+    }
+  }
+}
+```
+
+### `opencode.json` (fallback)
+```json
+{
+  "mcp": {
+    "freecad": { "command": "uvx", "args": ["freecad-mcp"] }
+  }
+}
+```
+
+**Resolution** (same precedence as LSP):
+1. `<cwd>/.dispatch/mcp.json` — if present, its `servers` win (shadow warning
+ if `opencode.json` also declares `mcp`).
+2. `<cwd>/opencode.json` `mcp` key — fallback.
+3. No built-in servers (MCP has no built-in registry; everything is configured).
+
+Each server entry:
+- `command` + `args` + optional `env` → stdio transport (spawn child process).
+- `transport: "http"` + `url` + optional `headers` → Streamable HTTP transport.
+- Optional `disabled: true` → present in config but not started (for the
+ enable/disable surface later).
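The resolution order above could be expressed as a pure function over the two already-parsed files. This is a sketch under assumptions: `resolveMcpConfig`, the `source` field, and the exact field layout of `ResolvedMcpServer` are hypothetical (the plan only names `ResolvedMcpServer[]` plus a `shadowed` flag), and file reading and JSON parsing are assumed to happen in the caller.

```typescript
// Hypothetical shapes, modeled on the server entry fields listed above.
interface McpServerConfig {
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  transport?: "http";
  url?: string;
  headers?: Record<string, string>;
  disabled?: boolean;
}

type ConfigSource = ".dispatch/mcp.json" | "opencode.json";

interface ResolvedMcpServer {
  id: string;
  config: McpServerConfig;
  source: ConfigSource;
}

interface McpConfigResolution {
  servers: ResolvedMcpServer[];
  shadowed: boolean; // true when opencode.json's `mcp` key was ignored
}

function toList(entries: Record<string, McpServerConfig>, source: ConfigSource): ResolvedMcpServer[] {
  return Object.entries(entries).map(([id, config]) => ({ id, config, source }));
}

// Pass `undefined` for a file that does not exist in the cwd.
function resolveMcpConfig(
  dispatchFile: { servers?: Record<string, McpServerConfig> } | undefined,
  opencodeFile: { mcp?: Record<string, McpServerConfig> } | undefined,
): McpConfigResolution {
  const opencodeServers = opencodeFile?.mcp;
  if (dispatchFile !== undefined) {
    // Rule 1: .dispatch/mcp.json wins whenever it is present.
    return {
      servers: toList(dispatchFile.servers ?? {}, ".dispatch/mcp.json"),
      shadowed: opencodeServers !== undefined,
    };
  }
  // Rule 2: fall back to opencode.json; rule 3: otherwise nothing is configured.
  return { servers: toList(opencodeServers ?? {}, "opencode.json"), shadowed: false };
}
```

Keeping the function free of filesystem access is what lets the config tests run on plain fixtures.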
+
+## 4. Architecture — the `mcp` extension (`packages/mcp/`)
+
+```
+packages/mcp/src/
+  config.ts          PURE config resolution (mirrors lsp/config.ts)
+  config.test.ts
+  transport.ts       Transport abstraction: stdio + Streamable HTTP
+  transport.test.ts
+  framing.ts         Newline-delimited message framing for stdio (structure mirrors lsp/framing.ts)
+  framing.test.ts
+  rpc.ts             JSON-RPC 2.0 client (request/response/notification, mirrors lsp/rpc.ts)
+  rpc.test.ts
+  client.ts          MCP client: initialize → tools/list → tools/call; handles
+                     list_changed notifications; capability negotiation
+  client.test.ts
+  manager.ts         McpManager: one client per configured server; lazy-spawn;
+                     status(); getClient(); shutdownAll()
+  manager.test.ts
+  registry.ts        Tool name namespacing + ToolContract adapter:
+                     wraps an MCP tool (name/description/inputSchema) into a
+                     Dispatch ToolContract whose execute() proxies to tools/call
+  registry.test.ts
+  types.ts           McpServerConfig, McpServerStatus, McpService, McpToolInfo
+  extension.ts       manifest + activate(host): real spawn/HTTP adapters, register
+                     tools via host.defineTool, register toolsFilter, mcpServiceHandle
+  index.ts           public surface (exports)
+```
+
+### 4.1. The MCP client lifecycle
+
+```
+1. resolve config (per-cwd) → list of server configs
+2. on first tool access (lazy):
+   a. stdio: spawn child process (command + args + env)
+   b. http: open HTTP/SSE connection
+3. send `initialize` { protocolVersion, capabilities, clientInfo }
+4. receive server { protocolVersion, capabilities, serverInfo }
+5. send `notifications/initialized`
+6. call `tools/list` → discover tools
+7. for each tool: register a namespaced ToolContract via host.defineTool
+8. if server declared `tools.listChanged: true`:
+   listen for `notifications/tools/list_changed` → re-list → re-register
+9. on deactivate: close the child's stdin, then kill the process if it does not
+   exit / close HTTP (MCP defines no `shutdown` request; the transport signals it)
+```
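Steps 3 to 6 of the lifecycle above can be sketched against any object that offers `request` and `notify`. This is an illustration only: `RpcLike`, `McpToolInfo`'s fields, and `connectAndListTools` are assumed names, the server's `initialize` response is not inspected, and `tools/list` pagination via `nextCursor` is ignored.

```typescript
// Hypothetical minimal interfaces; the real client would sit on src/rpc.ts.
interface RpcLike {
  request(method: string, params?: unknown): Promise<unknown>;
  notify(method: string, params?: unknown): void;
}

interface McpToolInfo {
  name: string;
  description?: string;
  inputSchema: unknown;
}

// initialize → notifications/initialized → tools/list
async function connectAndListTools(rpc: RpcLike): Promise<McpToolInfo[]> {
  await rpc.request("initialize", {
    protocolVersion: "2025-11-25",
    capabilities: {}, // Phase 1 declares no client capabilities
    clientInfo: { name: "dispatch", version: "0.0.0" },
  });
  rpc.notify("notifications/initialized");
  const listed = (await rpc.request("tools/list")) as { tools?: McpToolInfo[] };
  return listed.tools ?? [];
}
```

A real client would also record the server's declared capabilities from the `initialize` result so it knows whether to subscribe to `notifications/tools/list_changed`.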
+
+### 4.2. Tool name namespacing
+
+MCP tools from different servers may have name collisions (e.g. both freecad
+and chrome-devtools might have a `screenshot` tool). Solution: namespace as
+`<serverId>__<toolName>` (double underscore; see §7 item 2):
+
+- `freecad__create_object`
+- `chrome-devtools__navigate_page`
+- `chrome-devtools__take_screenshot`
+
+The ToolContract's `description` is prefixed with `[<serverId>]` for clarity:
+`"[chrome-devtools] Take a screenshot of the current page"`.
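A small sketch of the naming helpers, assuming the double-underscore separator recommended in §7 item 2. The function names are hypothetical. Note one limitation: splitting on the first `__` is ambiguous if a `serverId` itself contains a double underscore, so a real implementation might reject such config keys.

```typescript
// Hypothetical helpers for registry.ts; names are assumptions.
const SEPARATOR = "__";

function namespaceToolName(serverId: string, toolName: string): string {
  return `${serverId}${SEPARATOR}${toolName}`;
}

// Returns undefined for names that are not of the form <serverId>__<toolName>.
function splitToolName(namespaced: string): { serverId: string; toolName: string } | undefined {
  const index = namespaced.indexOf(SEPARATOR);
  if (index <= 0) return undefined;
  const toolName = namespaced.slice(index + SEPARATOR.length);
  if (toolName.length === 0) return undefined;
  return { serverId: namespaced.slice(0, index), toolName };
}

function prefixDescription(serverId: string, description: string): string {
  return `[${serverId}] ${description}`;
}
```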
+
+### 4.3. The ToolContract adapter (registry.ts)
+
+Each MCP tool discovered via `tools/list` becomes a `ToolContract`:
+
+```typescript
+// MCP tool (from tools/list):
+{ name: "create_object", description: "...", inputSchema: { type: "object", ... } }
+
+// → adapted to Dispatch ToolContract:
+{
+  name: "freecad__create_object", // <serverId>__<toolName>
+  description: "[freecad] Create a new object in FreeCAD.",
+  parameters: <mapped from inputSchema>,
+  execute: async (args, ctx) => {
+    // proxy to MCP server: tools/call { name: "create_object", arguments: args }
+    const result = await client.callTool("create_object", args, ctx.signal);
+    // MCP returns content array (text/image/resource) → flatten to string
+    return { content: flattenContent(result.content), isError: result.isError };
+  },
+  concurrencySafe: false, // MCP tools are generally not concurrency-safe
+}
+```
+
+The MCP `inputSchema` is already JSON Schema, which maps directly to
+Dispatch's `ToolParameterSchema` (same structural type — see tool.ts contract).
+No transformation needed beyond passthrough.
+
+### 4.4. Content flattening
+
+MCP tool results return a `content` array of typed items:
+```json
+{ "content": [
+ { "type": "text", "text": "..." },
+ { "type": "image", "data": "<base64>", "mimeType": "image/png" },
+ { "type": "resource", "resource": { "uri": "...", "text": "..." } }
+] }
+```
+Dispatch's `ToolResult.content` is a string. Flattening:
+- `text` → the text.
+- `image` → `"[image: <mimeType>, <n> bytes]"` (data not inlined; a future
+ multimodal ToolResult could carry it).
+- `resource` → the resource text or `"[resource: <uri>]"`.
+- Multiple items → joined with `\n`.
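The flattening rules above might look like this in `registry.ts`. It is a sketch: `McpContentItem` is modeled only on the three item shapes shown here, and `base64ByteLength` is a hypothetical helper that derives the decoded size from the base64 string length without decoding it.

```typescript
// Hypothetical content item type covering the three shapes shown above.
type McpContentItem =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string }
  | { type: "resource"; resource: { uri: string; text?: string } };

// Decoded size of padded base64 data: 3 bytes per 4 characters, minus padding.
function base64ByteLength(data: string): number {
  const padding = data.endsWith("==") ? 2 : data.endsWith("=") ? 1 : 0;
  return Math.floor((data.length * 3) / 4) - padding;
}

function flattenContent(items: readonly McpContentItem[]): string {
  return items
    .map((item) => {
      switch (item.type) {
        case "text":
          return item.text;
        case "image":
          // Image data is summarized, not inlined.
          return `[image: ${item.mimeType}, ${base64ByteLength(item.data)} bytes]`;
        case "resource":
          return item.resource.text ?? `[resource: ${item.resource.uri}]`;
      }
    })
    .join("\n");
}
```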
+
+### 4.5. Resources and Prompts (deferred)
+
+MCP servers also expose **Resources** (read-only data) and **Prompts**
+(templated messages). These are lower priority:
+- **Resources** could be exposed as a `mcp` tool op (`list_resources`,
+ `read_resource`) or injected into context — deferred.
+- **Prompts** could be surfaced as skills — deferred.
+
+Phase 1 implements **Tools** only (the highest-value primitive). Resources
+and Prompts can be added later without breaking the Tools path.
+
+### 4.6. Client → Server capabilities (deferred)
+
+MCP servers can request:
+- **Sampling** (`sampling/createMessage`) — the server asks the host to run an
+ LLM completion. This enables recursive agent workflows. Deferred (complex;
+ requires a provider round-trip from within a tool call).
+- **Roots** — the server asks about filesystem boundaries. We can support this
+ by returning the conversation's cwd. Low effort but deferred.
+- **Elicitation** — the server requests structured input from the user. Needs
+ a UI round-trip. Deferred.
+
+Phase 1 declares `capabilities: {}` (no client capabilities) — pure consumer.
+
+## 5. Security considerations
+
+MCP servers amount to **arbitrary code execution**: they run as child
+processes that can make network calls and access the filesystem. Key security measures:
+
+1. **Config-gated, not auto-discovered.** MCP servers are only loaded from
+ `.dispatch/mcp.json` or `opencode.json` in the cwd — never auto-discovered
+ or downloaded. The user must explicitly configure them.
+2. **Trust level.** The `mcp` extension is `trust: "bundled"` (like `lsp`),
+ meaning it's only loaded from the bundled set, not from untrusted
+ external extensions. The MCP *servers* it spawns are user-configured and
+ run with the server process's privileges — same as `run_shell`.
+3. **`capabilities: { spawn: true, network: true }`** — the extension needs
+ both spawn (stdio) and network (HTTP). The host gates these.
+4. **No shared secrets.** The `env` in the config is passed to the child
+ process directly; the extension never logs env values (self-redaction per
+ `.dispatch/rules/extension-logging.md`).
+5. **Tool descriptions are untrusted** (per MCP spec). They are passed through
+ to the model but never executed as code.
+
+## 6. Glossary additions (proposed)
+
+| Term | Meaning | Aliases to avoid |
+|---|---|---|
+| **MCP** | Model Context Protocol: the JSON-RPC 2.0-over-stdio/HTTP protocol an MCP server speaks. Used as the adjective for the feature (the `mcp` extension, MCP tools). | — |
+| **MCP server** | A process/service speaking MCP that exposes Tools, Resources, and/or Prompts. Spawned (stdio) or connected (HTTP) by Dispatch acting as MCP host. | MCP provider (that's a Dispatch provider) |
+| **MCP host** | The application (Dispatch) that manages MCP clients, discovers server capabilities, and proxies tool calls. Dispatch is always the host. | — |
+
+("MCP client" is an internal implementation detail of the `mcp` extension, not
+a user-facing term — no glossary entry needed.)
+
+## 7. Open design decisions (for the user)
+
+1. **Boundary: one `mcp` extension or per-server?**
+ - **Recommendation: ONE `mcp` extension** managing multiple servers (like
+ `lsp` manages multiple language servers). A per-server extension would
+ require dynamic extension loading at runtime (not currently supported) and
+ violates the "config drives everything" principle.
+ - This is the user's decision per ORCHESTRATOR §1 step 3.
+
+2. **Tool name format: `<serverId>_<toolName>` vs `<serverId>.<toolName>` vs `<serverId>/<toolName>`?**
+ - **Recommendation: `<serverId>__<toolName>`** (double underscore as
+ separator — single underscore is common in tool names themselves; double
+ is visually distinct and unlikely to collide). The `serverId` comes from
+ the config key (e.g. `"freecad"`).
+
+3. **Stdio only in Phase 1, or stdio + HTTP?**
+ - **Recommendation: stdio only in Phase 1.** HTTP transport adds SSE
+ handling, reconnection, and auth. Stdio covers the two examples (freecad-mcp
+ via `uvx`, chrome-devtools-mcp via `npx`). HTTP can be Phase 2.
+
+4. **Resources/Prompts in Phase 1?**
+ - **Recommendation: Tools only in Phase 1.** Resources and Prompts are
+ lower value and can be added later without breaking anything.
+
+5. **Per-conversation enable/disable surface in Phase 1?**
+ - **Recommendation: No.** Per-cwd config (§2a) + the toolsFilter dropping
+ disconnected servers (§2b) is sufficient for Phase 1. The surface (§2c)
+ is Phase 2.
+
+6. **Should we use the official `@modelcontextprotocol/sdk` or hand-roll?**
+ - **Recommendation: Hand-roll the JSON-RPC client (like LSP).** The
+ protocol is simple JSON-RPC 2.0; for stdio the MCP spec frames messages as
+ newline-delimited JSON rather than LSP's `Content-Length` headers. The LSP
+ extension already has a battle-tested `rpc.ts` that can be adapted, and its
+ `framing.ts` is a structural precedent for a simpler line-based codec. A
+ dependency on the MCP SDK would pull in its transport abstractions, its own
+ JSON-RPC layer, and Zod, adding weight for little gain (the protocol
+ surface we need is tiny: initialize, tools/list, tools/call, list_changed
+ notification). Hand-rolling also keeps the "zero external deps" precedent
+ (LSP has none).
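To give a sense of how small the hand-rolled stdio framing layer from item 6 can be, here is a sketch assuming the newline-delimited format the MCP specification defines for stdio (one JSON-RPC message per line, no embedded newlines). `encodeMessage` and `createDecoder` are hypothetical names, and the decoder assumes chunks arrive as already-decoded strings; a real transport would need to handle multi-byte characters split across raw byte chunks.

```typescript
// Hypothetical framing helpers; names are assumptions.

// JSON.stringify escapes newlines inside strings, so the output is one line.
function encodeMessage(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Returns a push function that buffers partial input and emits whole messages.
function createDecoder(onMessage: (message: unknown) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    buffer += chunk;
    let newlineIndex = buffer.indexOf("\n");
    while (newlineIndex !== -1) {
      const line = buffer.slice(0, newlineIndex).trim();
      buffer = buffer.slice(newlineIndex + 1);
      if (line.length > 0) onMessage(JSON.parse(line)); // skip blank lines
      newlineIndex = buffer.indexOf("\n");
    }
  };
}
```

A malformed line makes `JSON.parse` throw here; whether to drop the line, log it, or mark the server as broken is a policy choice this sketch leaves open.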