summaryrefslogtreecommitdiffhomepage
path: root/packages/core/src/llm/provider.ts
AgeCommit message (Collapse)Author
2026-06-04chore: genesis — remove all files to rebuild from scratch (arch rewrite)Adam Malczewski
2026-05-31feat(debug): wire LLM debug logger end-to-endAdam Malczewski
The debug-logger.ts module existed but was completely orphaned — none of its functions had any callsites, so DISPATCH_DEBUG_LLM=1 did nothing. Wires it in across the stack: - llm/debug-logger.ts: add wrapFetchWithLogging() that tees SSE bodies via TransformStream + response.clone() so we capture every chunk without draining the body the AI SDK consumes. Redacts authorization / x-api-key / cookie headers in logs. Also exports nextDebugSeq() so requests and log files share an id. - llm/provider.ts: all 3 factories (Claude OAuth, plain-API-key Anthropic, OpenAI-compatible) now pass fetch: wrapFetchWithLogging(globalThis.fetch). For Claude OAuth the wrap goes on the inner base fetch so logged bodies reflect the post-transform shape + Claude-Code session headers. Added tabId to ProviderConfig for log labelling. - agent/agent.ts: threads tabId through createProvider and emits logAgentLoop / logStepLifecycle / logStreamEvent at every meaningful point in the run loop — step start/end, tool count, every fullStream event. All are no-ops when DISPATCH_DEBUG_LLM is unset. - core/index.ts: re-exports the debug helpers. - tests/llm/provider.test.ts: switch one full-object equality assertion to property assertions so the test survives the new fetch: wrapper. Plumbing the env var into the container required three more fixes: - bin/up: re-export DISPATCH_DEBUG_LLM* so docker compose forwards them (compose only forwards vars referenced in the environment: block). Also pre-creates /tmp/dispatch/llm-debug and chowns it on first run so the container's UID-1000 bun process can write into it without EACCES. - docker-compose.yml: declare the debug vars on api.environment and bind-mount /tmp/dispatch/llm-debug:/tmp/dispatch/llm-debug so logs are inspectable from the host without docker exec. - docker/entrypoint.dev.sh: explicitly forward DISPATCH_DEBUG_* through the 'su -' login-shell barrier — su - resets the environment to TERM/ PATH/HOME/SHELL/USER/LOGNAME only, silently stripping everything else. This is why the vars appeared via 'docker exec env' (which spawns a new process inheriting the container env) but were absent from the actual bun process's /proc/<pid>/environ. bin/build: drop stray sudo for consistency with bin/up and bin/down.
2026-05-30chore(notes): collect loose root docs into notes/; add reconcile edge-cases noteAdam Malczewski
Move all loose root-level .md files (plans, reports, gemini reviews, incident notes) into a single notes/ directory, and update the doc-reference breadcrumbs in code comments/test labels to the notes/ path. Add notes/queue-interrupt-reconcile-edge-cases.md: documents why the queue/interrupt/turn-sealed reconcile path keeps surfacing edge cases (a catalog of the four review-pass bugs, the no-loss/no-duplicate invariants, the recommended membership-based reconcile refactor, and interleaving-test guidance).
2026-05-30feat(cache): Anthropic prompt caching, usage telemetry, and Cache Rate viewAdam Malczewski
- send prompt-caching + oauth anthropic-beta headers on the Claude OAuth provider - restructure the OAuth request body (billing header, identity split, relocate third-party system prompt to the first user message) to match Claude Code - apply rolling cache_control breakpoints and group a turn's tool results into a single role:tool message for correct breakpoint placement - emit per-step usage events (cache read/write split) and add the Cache Rate sidebar panel - dedup byte-identical tool calls within a single batch
2026-05-28refactor(core): upgrade ai-sdk v4 → v6 + Anthropic/openai-compatible ↵Adam Malczewski
reasoning round-trip + max-thinking budget audit Migrates the LLM stack from [email protected] + @ai-sdk/[email protected] + @ai-sdk/[email protected] to [email protected] + @ai-sdk/[email protected] + @ai-sdk/[email protected]. Full design in plan-v6-upgrade.md; two rounds of Gemini code review captured in report.md. Motivation: the recurring 'reasoning-signature without reasoning' error on Claude Opus 4.7 was a v4 SDK artefact — @ai-sdk/[email protected] emitted Anthropic signature_delta as a separate stream chunk that orphaned when the model produced a signed-but-empty thinking block, and our chunk store had no signature field so the round-trip back to Anthropic was rejected on the next turn. In v6, signatures arrive inside providerMetadata on the reasoning-end event, and the orphan-signature class of bug is gone at the SDK level. Core changes: • ThinkingChunk gains optional metadata?: Record<string, unknown> (the v6 providerMetadata blob). A non-undefined metadata 'seals' the chunk: subsequent reasoning-delta opens a new chunk rather than extending the sealed one. • AgentEvent gains { type: 'reasoning-end'; metadata? } (replaces the v4 reasoning-signature variant). • toModelMessages (replaces toCoreMessages): - returns ModelMessage[] (was CoreMessage[]) - thinking → { type: 'reasoning', text, providerOptions: metadata } - tool-batch entries → { type: 'tool-call', input } (was 'args') - tool results → { output: { type: 'text', value } } ToolResultOutput • Claude OAuth uses createAnthropic({ authToken }) natively — no more custom-fetch x-api-key → Bearer swap. • rewriteBodyForOpus47 deleted — Opus 4.7 adaptive thinking is native via providerOptions.anthropic.thinking = { type: 'adaptive' }. • V1 middleware → V3 (specificationVersion: 'v3'). • v4-era normalizeMessages openai-compatible middleware deleted; the v6 openai-compatible provider extracts reasoning_content natively from { type: 'reasoning' } content parts. • applyAnthropicStructuralNormalisations (mirrors opencode provider/transform.ts:53-148): drops empty text/reasoning parts, scrubs non-[a-zA-Z0-9_-] toolCallIds, splits [tool-call, non-tool] assistant turns (Anthropic rejects tool_use followed by text). • applyOpenAICompatibleReasoningNormalisation (mirrors opencode transform.ts:217-249): lifts reasoning text into providerOptions.openaiCompatible.reasoning_content (always, even empty). Solves DeepSeek 'The reasoning_content in the thinking mode must be passed back' — the v6 SDK skips emitting reasoning_content when text is empty (dist/index.mjs:245), but DeepSeek requires the field present once thinking was used. • Tools: tool({ inputSchema: jsonSchema(zodToJsonSchema(...)) }) (was parameters: ZodSchema). AI SDK tools have no execute callback — the agent runs tools manually for permission prompts and shell-output streaming. New dep: zod-to-json-schema@^3.25.2. • fullStream event loop rewritten for v6 event shape: text-delta (text not textDelta), reasoning-start/delta/end, tool-input-*, tool-call (input not args), tool-result, tool-error (new), abort (new), start-step/finish-step, finish. Max-thinking audit (matches opencode transform.ts:642-671 budgets): • Claude enabled-thinking max budget 16000 → 31999 (Anthropic ceiling) • Claude enabled-thinking high budget 10000 → 16000 • maxOutputTokens 'budget + 8000' → fixed 32000 (matches opencode's OUTPUT_TOKEN_MAX; model self-allocates thinking vs response within) • Opus 4.7 adaptive thinking gains display: 'summarized' and sibling effort field (without these, thinking content is hidden by Anthropic and the model barely thinks). Frontend mirrors: • types.ts — ThinkingChunk.metadata?, AgentEvent reasoning-end • tabs.svelte.ts — routes reasoning-end through applyChunkEvent • ChatMessage.svelte — hides empty thinking chunks; hides the entire assistant bubble when no chunk has renderable content Gemini-review-driven fixes: • tool-error and abort stream events now surface as error chunks (were silently ignored) • toolCallId scrubbing pass (opencode transform.ts:96-122 parity) • Empty-reasoning-cull explicit test coverage for both Anthropic structural normalisation and DeepSeek path Test counts (223 tests across 3 packages, all green): • tests/chunks/append.test.ts: 44 (was 38) — reasoning-end sealing, orphan walk-back, multi-block interleaving • tests/agent/agent.test.ts: 24 (was 5) — exhaustive v6 event mappings, structural normalisations, signature/reasoning_content round-trip, tool-error/abort branches, DeepSeek scenario, empty reasoning edge case • tests/llm/provider.test.ts: 9 (was 22) — dropped 13 obsolete v4 middleware tests; new minimal tests confirm no middleware wrapping on default openai-compat path and that createAnthropic gets authToken vs apiKey correctly for OAuth vs api-key flows • tests/tools/registry.test.ts: 10 (was 4) — v6 tool() contract (inputSchema, no execute, JSON Schema for nested zod) • packages/api/tests/agent-manager.test.ts: 12 (was 7) — mock Agent emits v6 reasoning events; reasoning-end broadcast + ordering • packages/frontend/tests/chat-store.test.ts: 35 (was 32) — reasoning-end flow through Svelte $state store typecheck clean (tsc --noEmit on core + api, svelte-check on frontend), biome clean across 124 files.
2026-05-24fix: prompt caching, OpenCode Go MiniMax/Qwen support, Opus 4.7 thinking, ↵Adam Malczewski
SDK compat - Implement Anthropic prompt caching: first system message + last 2 non-system messages get cache_control: ephemeral, mirroring OpenCode's applyCaching strategy. Move system prompt inline into messages array so providerOptions can attach. - Add opencode-anthropic provider variant routing MiniMax/Qwen models through the /messages endpoint with x-api-key auth, distinct from the Claude OAuth flow's Bearer auth and Claude Code mimicry. - Split isAnthropic into isClaudeOAuth (billing header, mcp_ tool prefix, thinking config) and usesAnthropicSDK (cache markers) so non-OAuth Anthropic-format gateways get the right treatment. - Pin @ai-sdk/anthropic to ^1.2.12: v3 returns LanguageModelV3-spec models that ai v4's streamText rejects at runtime ('AI SDK 4 only supports models that implement specification version v1'). Drop unnecessary V1 casts. - Restore Opus 4.7 extended thinking by rewriting the outgoing /messages body in the Claude OAuth fetch interceptor: inject thinking: { type: 'adaptive' } (v1 SDK can't emit it), strip temperature/top_p/top_k (Anthropic rejects them with thinking enabled). Gated on max_tokens > 4096 so effort=none still works. - Bump MAX_STEPS from 10 to 50 to align with AI SDK's stepCountIs(20) default and reduce mid-task halts. - Fix pre-existing typecheck errors in agent-manager.ts (entry/nextEntry narrowing), app.ts (agentModels body field), KeyUsage.svelte (m guards), and a TS2742 in provider.ts via explicit ModelFactory return type. - buildFallbackSequence now always returns at least one entry so processMessage runs the agent loop even without keyId/modelId (fixes 4 broken agent-manager tests).
2026-05-22feat: agent builder, CWD support, auto-save, UI polish, unavailable tool ↵Adam Malczewski
handling - Agent Builder: full CRUD with card grid, drag-and-drop model reorder, edit/delete - Auto-save on edit with 600ms debounce, AbortController for concurrency, fieldset disabled until name entered - Agent definitions stored as TOML with cwd field, loaded from global/project dirs - Working directory: per-tab CWD override in Chat Settings, agent default CWD, auto-create on first message - CWD validation: check-dir endpoint with ~ expansion, real-time validity indicator - Subagent CWD validated against parent's effective CWD using path.relative - Unavailable tool calls: caught gracefully, shown as tool call with error badge, model retries - UI: tab bar border radius, sidebar border removed, chat input ghost style, scroll-to-bottom rectangle - Skills dir collapse uses CSS rotation, Model Choice renamed to Chat Settings, System Prompt view removed - Reusable SkillsBrowser/ToolPermissions with external mode for Agent Builder - ModelSelector: Agent/Manual toggle, agent list, Agent Settings link - Page router, skills recursive scanning, bin/up gopass removed, docker volume mounts
2026-05-20feat: claude max oauth support with multi-account switching, reasoning ↵Adam Malczewski
effort, and dynamic model listing
2026-05-19fix: DeepSeek reasoning_content dropped on multi-step tool callsAdam Malczewski
- Base URL corrected: zen/v1 -> zen/go/v1 (opencode-go provider) - Model changed: deepseek-v4-flash-free -> deepseek-v4-flash - Added wrapLanguageModel middleware to inject reasoning_content via providerMetadata.openaiCompatible before each stream call - Fixed test mocks: removed vi.importActual (unsupported in Bun), added tool factory mocks, preserved real tool export in ai mock - Added 11 tests for the normalizeMessages middleware
2026-05-19Phase 1: single agent + basic UIAdam Malczewski
- Bun monorepo with @dispatch/core, @dispatch/api, @dispatch/frontend - Agent runtime with Vercel AI SDK, streaming via WebSocket - Tools: read_file, write_file, list_files (scoped to working directory) - Hono API server with POST /chat, GET /status, GET /health, WS /ws - Svelte 5 + DaisyUI frontend with chat UI, theme switcher, copy button - OpenCode Go (Zen) as LLM provider, deepseek-v4-flash-free model - Docker setup (dev + prod) with bin/ scripts and gopass secrets - Biome v2 linting/formatting, Vitest tests (44 passing) - Debug info attached to error messages for diagnostics