update to correctly use new api for newer models. (HEAD, main)

author: Adam Malczewski <[email protected]> 2026-04-30 18:06:07 +0900
committer: Adam Malczewski <[email protected]> 2026-04-30 18:06:07 +0900
commit: 9be8821368deff024eafedeea55a614f9a9468cf (patch)
tree: 43d70e2e8d6ac31e288f8f99b71555c051db0b19
parent: 5c9b8f5142198bdf230d500b5101322a22235670 (diff)
7 files changed, 1062 insertions, 22 deletions
diff --git a/.rubocop.yml b/.rubocop.yml
index ff78dd3..cc23375 100644
--- a/.rubocop.yml
+++ b/.rubocop.yml
@@ -47,3 +47,7 @@ Style/Documentation:
 
 Style/RedundantStructKeywordInit:
   Enabled: false
+
+Lint/EmptyBlock:
+  Exclude:
+    - "spec/**/*"
diff --git a/ENDPOINT_ROUTING.md b/ENDPOINT_ROUTING.md
new file mode 100644
index 0000000..e4ab523
--- /dev/null
+++ b/ENDPOINT_ROUTING.md
@@ -0,0 +1,189 @@
+# Endpoint Routing — How the adapter picks `/v1/chat/completions` vs `/v1/responses`
+
+> **TL;DR** The decision is made by a single regex against the model id string.
+> No capability discovery, no flag, no per-request override.
+
+## The decision
+
+The full routing logic lives in **one method**:
+
+`lib/dispatch/adapter/copilot.rb`
+
+```ruby
+# Returns true when the selected model requires the /v1/responses endpoint.
+# This applies to GPT-5 reasoning models. These models reject tool calls on
+# /v1/chat/completions and return a 400 RequestError directing callers to
+# use /v1/responses instead.
+def uses_responses_api?
+  @model.match?(/\Agpt-5/)
+end
+```
+
+`\A` anchors at the start of the string, so any model id whose name begins
+with the literal `gpt-5` (case-sensitive) is routed to the Responses API.
+Everything else goes to Chat Completions.
+
+The check is invoked once per `#chat` call:
+
+```ruby
+# lib/dispatch/adapter/copilot.rb (inside #chat)
+if uses_responses_api?
+  if stream
+    chat_streaming_responses(...)        # POST /v1/responses (SSE)
+  else
+    chat_non_streaming_responses(...)    # POST /v1/responses
+  end
+else
+  # build chat-completions body
+  if stream
+    chat_streaming(...)                  # POST /v1/chat/completions (SSE)
+  else
+    chat_non_streaming(...)              # POST /v1/chat/completions
+  end
+end
+```
+
+The four code paths are:
+
+| Path | Method | Endpoint | Streamed? |
+|---|---|---|---|
+| Responses, streaming | `chat_streaming_responses` | `POST /v1/responses` | yes |
+| Responses, blocking | `chat_non_streaming_responses` | `POST /v1/responses` | no |
+| Chat, streaming | `chat_streaming` | `POST /v1/chat/completions` | yes |
+| Chat, blocking | `chat_non_streaming` | `POST /v1/chat/completions` | no |
+
+All four live in `lib/dispatch/adapter/copilot.rb`.
+
+## Body-shape differences (what the adapter rewrites silently)
+
+| Concept | `/v1/chat/completions` body | `/v1/responses` body |
+|---|---|---|
+| Conversation | `messages: [...]` | `input: [...]` |
+| Token cap | `max_tokens` (or `max_completion_tokens` on o*/gpt-5/gemini) | `max_output_tokens` |
+| Reasoning effort | `reasoning_effort: "high"` | `reasoning: { effort: "high" }` |
+| Tool definition | `{ type: "function", function: { name, description, parameters } }` | `{ type: "function", name, description, parameters }` (no `function:` wrapper) |
+
+These transforms are handled inside the adapter — callers always pass the
+same `Dispatch::Adapter::ToolDefinition` / `Dispatch::Adapter::Message`
+structs and the same `thinking:` keyword.
+
+## Current model list and routing
+
+Source: `reference/models.txt` (lives one level up from this gem, in the
+parent `update-adapters/` workspace; format is `model_id,premium_multiplier`).
+
+| Model id | Premium multiplier | `\Agpt-5` match? | Endpoint |
+|---|---|---|---|
+| gpt-4.1 | 0.0 | ❌ | `/v1/chat/completions` |
+| gpt-4o | 0.0 | ❌ | `/v1/chat/completions` |
+| gpt-5-mini | 0.0 | ✅ | `/v1/responses` |
+| oswe-vscode-prime | 0.0 | ❌ | `/v1/chat/completions` |
+| grok-code-fast-1 | 0.25 | ❌ | `/v1/chat/completions` |
+| claude-haiku-4.5 | 0.33 | ❌ | `/v1/chat/completions` |
+| gemini-3-flash-preview | 0.33 | ❌ | `/v1/chat/completions` |
+| gpt-5.4-mini | 0.33 | ✅ | `/v1/responses` |
+| claude-sonnet-4 | 1.0 | ❌ | `/v1/chat/completions` |
+| claude-sonnet-4.5 | 1.0 | ❌ | `/v1/chat/completions` |
+| claude-sonnet-4.6 | 1.0 | ❌ | `/v1/chat/completions` |
+| gemini-2.5-pro | 1.0 | ❌ | `/v1/chat/completions` |
+| gemini-3.1-pro-preview | 1.0 | ❌ | `/v1/chat/completions` |
+| gpt-5.2 | 1.0 | ✅ | `/v1/responses` |
+| gpt-5.2-codex | 1.0 | ✅ | `/v1/responses` |
+| gpt-5.3-codex | 1.0 | ✅ | `/v1/responses` |
+| gpt-5.4 | 1.0 | ✅ | `/v1/responses` |
+| claude-opus-4.7 | 7.5 | ❌ | `/v1/chat/completions` |
+| gpt-5.5 | 7.5 | ✅ | `/v1/responses` |
+
+## Why a regex and not capability discovery?
+
+`GET https://api.githubcopilot.com/models` does NOT return a field that
+indicates which endpoint a given model accepts. A typical entry looks like:
+
+```json
+{
+  "id": "claude-3.7-sonnet",
+  "vendor": "Anthropic",
+  "model_picker_enabled": true,
+  "policy": { "state": "enabled" },
+  "capabilities": {
+    "family": "claude-3.7-sonnet",
+    "type": "chat",
+    "tokenizer": "o200k_base",
+    "limits": { "max_context_window_tokens": 200000, "max_output_tokens": 8192, "max_prompt_tokens": 90000 },
+    "supports": { "streaming": true, "tool_calls": true, "parallel_tool_calls": true, "vision": true }
+  }
+}
+```
+
+There is no `endpoints`, `api`, `responses_api`, or `chat_completions`
+flag. The signal that a model needs `/v1/responses` is the **400 error
+string** Copilot returns when you send tools + reasoning_effort to
+`/v1/chat/completions` for a GPT-5 family model:
+
+```
+Function tools with reasoning_effort are not supported for gpt-5.4 in
+/v1/chat/completions. Please use /v1/responses instead.
+```
+
+Hence the hardcoded `/\Agpt-5/` heuristic. See
+`GPT5_RESPONSES_API.md` for the original problem statement.
+
+## How to update this when GitHub adds new models
+
+When GitHub Copilot adds a new model that requires `/v1/responses`:
+
+1. **Edit the regex** in `lib/dispatch/adapter/copilot.rb` at the
+   `uses_responses_api?` method. Turn the pattern into an alternation that
+   includes the new family (the ids below are placeholders), e.g.:
+
+   ```ruby
+   def uses_responses_api?
+     @model.match?(/\A(?:gpt-5|gpt-6|codex-6|o5)/)
+   end
+   ```
+
+2. **Update the test expectations** in
+   `spec/dispatch/adapter/copilot_spec.rb`. Search for `uses_responses_api`
+   and `/\Agpt-5/` to find the relevant examples; both positive (a model
+   that should match) and negative (a model that shouldn't) cases need
+   updating.
+
+3. **Update the table above** in this file
+   (`ENDPOINT_ROUTING.md`) so the documented routing matches the code.
+
+4. **Update `reference/models.txt`** in the parent workspace if you also
+   want the new model listed for build/test scripts.
+
+5. **Bump the gem version** in
+   `lib/dispatch/adapter/version.rb` (minor bump for new model support,
+   patch for a regex tweak that just fixes routing for an existing
+   misclassified model).
+
+6. **Run the test gate** from inside this gem:
+   ```bash
+   bundle exec rubocop --autocorrect-all
+   bundle exec rspec
+   ```
+   Both must exit 0.
+
+## Alternative: probe-and-fallback (not currently implemented)
+
+A more durable design would catch the specific 400 error string from
+`/v1/chat/completions`, cache the offending model id, and retransmit on
+`/v1/responses`. Pros: zero hardcoded list. Cons: adds latency on the
+first request per new model per process and depends on the upstream
+error wording staying stable. The probe must include a tool definition
+to be reliable — sending a tool-less request to `/v1/chat/completions`
+will succeed for some GPT-5 variants and only the tools+reasoning combo
+triggers the rejection.
+
+## File reference (everything routing-related)
+
+| Path | What it contains |
+|---|---|
+| `lib/dispatch/adapter/copilot.rb` | `uses_responses_api?` (the regex), the `chat` dispatcher, all four code paths, body builders for both endpoints |
+| `lib/dispatch/adapter/version.rb` | Gem version constant |
+| `spec/dispatch/adapter/copilot_spec.rb` | Tests for both endpoint paths and the routing predicate |
+| `GPT5_RESPONSES_API.md` | Original problem statement — the 400 error from Copilot |
+| `ENDPOINT_ROUTING.md` | This file |
+| `../reference/models.txt` | Workspace-level list of model ids and premium multipliers |
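The probe-and-fallback alternative described in ENDPOINT_ROUTING.md is not implemented in this commit. The following is a minimal sketch of the idea, assuming injected transport callables in place of real HTTP. `EndpointProber`, `post_chat`, `post_responses`, and `FALLBACK_HINT` are hypothetical names that do not exist in the adapter; the matched substring is taken from the 400 message quoted in that file.

```ruby
# Hypothetical sketch: none of these names exist in dispatch-adapter-copilot.
class EndpointProber
  # Substring of the 400 message quoted in ENDPOINT_ROUTING.md.
  FALLBACK_HINT = "Please use /v1/responses instead"

  # post_chat / post_responses are caller-supplied callables. Each takes a
  # model id and returns [status_code, body_string].
  def initialize(post_chat:, post_responses:)
    @post_chat = post_chat
    @post_responses = post_responses
    @needs_responses = {} # model id => true, cached for the process lifetime
  end

  def call(model)
    return @post_responses.call(model) if @needs_responses[model]

    status, body = @post_chat.call(model)
    return [status, body] unless status == 400 && body.include?(FALLBACK_HINT)

    # Remember the rejection so later requests skip the probe.
    @needs_responses[model] = true
    @post_responses.call(model)
  end
end

# Stubbed transports (no network involved).
chat_calls = 0
post_chat = lambda do |model|
  chat_calls += 1
  if model.start_with?("gpt-5")
    message = "Function tools with reasoning_effort are not supported for #{model} " \
              "in /v1/chat/completions. Please use /v1/responses instead."
    [400, message]
  else
    [200, "chat ok"]
  end
end
post_responses = ->(_model) { [200, "responses ok"] }

prober = EndpointProber.new(post_chat: post_chat, post_responses: post_responses)
first = prober.call("gpt-5.4")            # probes chat, then retries on responses
second = prober.call("gpt-5.4")           # cached: goes straight to responses
other = prober.call("claude-sonnet-4.6")  # chat succeeds, no fallback
```

With these stubs the first `gpt-5.4` call costs one rejected chat request and later calls skip the probe. The sketch keeps the drawback the document names: it stops working if the upstream error wording changes.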
diff --git a/GPT5_RESPONSES_API.md b/GPT5_RESPONSES_API.md
new file mode 100644
index 0000000..8dec8cf
--- /dev/null
+++ b/GPT5_RESPONSES_API.md
@@ -0,0 +1,68 @@
+# GPT-5.4 + Tool Calls — Requires `/v1/responses` API
+
+## Problem
+
+When `build.rb` selects the `gpt-5.4` model and sends a request with tool
+definitions, the Copilot API responds with:
+
+```
+Dispatch::Adapter::RequestError: Function tools with reasoning_effort are
+not supported for gpt-5.4 in /v1/chat/completions. Please use /v1/responses instead.
+```
+
+`dispatch-adapter-copilot` currently targets `/v1/chat/completions` for all
+models. GPT-5.4 is a reasoning model that requires the newer `/v1/responses`
+endpoint when tool calls are involved.
+
+---
+
+## Background
+
+### `/v1/chat/completions`
+OpenAI's original chat API. Stateless: you send the full `messages` history
+and get back `choices`. Tool calls via `tools` + `tool_calls` are supported.
+Works for the pre-GPT-5 OpenAI models (up to GPT-4o) and for Claude Sonnet 4.6.
+
+### `/v1/responses`
+Introduced for reasoning models (o1, o3, GPT-5+). Key differences:
+
+- Uses `input` instead of `messages` for the conversation history.
+- Takes reasoning effort as `reasoning: { effort: ... }` (`low` / `medium` / `high`).
+- Optionally stateful via `previous_response_id` (server keeps history).
+- **Required** on Copilot when a GPT-5 model receives function tools together
+  with `reasoning_effort`: Chat Completions rejects that combination (see Problem).
+
+GPT-5.4 was added to the GitHub Copilot model catalog but brings the
+Responses API requirement with it. The adapter was written before this model
+existed, so it has no Responses API support.
+
+---
+
+## What Needs to Be Done
+
+To support GPT-5.4 (and future reasoning models) with tool calls:
+
+1. **Detect reasoning models** — identify which model IDs require the
+   Responses API (e.g. anything matching `gpt-5.*` or carrying a
+   `reasoning` capability flag in the `/models` response).
+
+2. **Implement a Responses API code path** in `dispatch-adapter-copilot`:
+   - Endpoint: `POST /v1/responses` (not `/v1/chat/completions`).
+   - Request shape: `input` array instead of `messages`.
+   - Response shape: different structure — parse accordingly.
+   - Map `Dispatch::Adapter` tool definitions and result blocks to the
+     Responses API format.
+   - Handle `reasoning_effort` (expose as an adapter option or auto-set
+     to `medium`).
+
+3. **Route per model** — the adapter should check the model ID and choose
+   the correct endpoint at request time, keeping Chat Completions for all
+   non-reasoning models.
+
+---
+
+## Workaround (until implemented)
+
+Use `sonnet-4.6` instead of `gpt-5.4` in `build.rb`'s interactive menu.
+Claude Sonnet 4.6 (routed via Copilot's `/v1/chat/completions`) fully
+supports tool calls and has no Responses API requirement.
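Step 2 of "What Needs to Be Done" boils down to a handful of key renames between the two request bodies. The following is a rough sketch of that mapping, assuming plain symbol-keyed hashes and only simple role/content messages. `to_responses_body` is a hypothetical helper, not a method of the adapter, and it does not translate assistant tool-call history into `function_call` items.

```ruby
# Hypothetical helper: rewrites a Chat Completions style body into the
# /v1/responses shape (messages -> input, token cap -> max_output_tokens,
# reasoning_effort -> reasoning.effort, tools without the function: wrapper).
def to_responses_body(chat_body)
  body = {
    model: chat_body[:model],
    input: chat_body[:messages],
    stream: chat_body.fetch(:stream, false)
  }

  cap = chat_body[:max_completion_tokens] || chat_body[:max_tokens]
  body[:max_output_tokens] = cap if cap

  effort = chat_body[:reasoning_effort]
  body[:reasoning] = { effort: effort } if effort

  tools = chat_body.fetch(:tools, []).map do |tool|
    # Hoist name/description/parameters out of the `function:` wrapper.
    { type: "function" }.merge(tool.fetch(:function))
  end
  body[:tools] = tools unless tools.empty?

  body
end

chat_body = {
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Weather?" }],
  max_completion_tokens: 4096,
  reasoning_effort: "medium",
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Look up weather",
        parameters: { type: "object", properties: {} }
      }
    }
  ]
}

responses_body = to_responses_body(chat_body)
```

The adapter in this commit builds the Responses body directly from canonical messages rather than converting a finished chat body, so this sketch only illustrates the field mapping.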
diff --git a/Gemfile.lock b/Gemfile.lock
index f7595b0..f3aa5a6 100644
--- a/Gemfile.lock
+++ b/Gemfile.lock
@@ -1,7 +1,7 @@
 PATH
   remote: ../dispatch-adapter-interface
   specs:
-    dispatch-adapter-interface (0.2.0)
+    dispatch-adapter-interface (0.3.0)
 
 PATH
   remote: .
@@ -114,7 +114,7 @@ CHECKSUMS
   date (3.5.1) sha256=750d06384d7b9c15d562c76291407d89e368dda4d4fff957eb94962d325a0dc0
   diff-lcs (1.6.2) sha256=9ae0d2cba7d4df3075fe8cd8602a8604993efc0dfa934cff568969efb1909962
   dispatch-adapter-copilot (0.4.0)
-  dispatch-adapter-interface (0.2.0)
+  dispatch-adapter-interface (0.3.0)
   erb (6.0.2) sha256=9fe6264d44f79422c87490a1558479bd0e7dad4dd0e317656e67ea3077b5242b
   hashdiff (1.2.1) sha256=9c079dbc513dfc8833ab59c0c2d8f230fa28499cc5efb4b8dd276cf931457cd1
   io-console (0.8.2) sha256=d6e3ae7a7cc7574f4b8893b4fca2162e57a825b223a177b7afa236c5ef9814cc
diff --git a/dispatch-adapter-copilot.gemspec b/dispatch-adapter-copilot.gemspec
index 4dbdb1e..2ecf345 100644
--- a/dispatch-adapter-copilot.gemspec
+++ b/dispatch-adapter-copilot.gemspec
@@ -25,7 +25,8 @@ Gem::Specification.new do |spec|
       (f == gemspec) ||
         f.start_with?(*%w[bin/ Gemfile .gitignore .rspec spec/ .rubocop.yml])
     end
-  end.select { |f| File.exist?(File.join(__dir__, f)) }
+  end
+  spec.files.select! { |f| File.exist?(File.join(__dir__, f)) }
   spec.bindir = "exe"
   spec.executables = spec.files.grep(%r{\Aexe/}) { |f| File.basename(f) }
   spec.require_paths = ["lib"]
diff --git a/lib/dispatch/adapter/copilot.rb b/lib/dispatch/adapter/copilot.rb
index 7355df8..445adff 100644
--- a/lib/dispatch/adapter/copilot.rb
+++ b/lib/dispatch/adapter/copilot.rb
@@ -82,29 +82,38 @@ module Dispatch
 
       def chat(messages, system: nil, tools: [], stream: false, max_tokens: nil, thinking: :default, &)
         ensure_authenticated!
-        wire_messages = build_wire_messages(messages, system)
-        wire_tools = build_wire_tools(tools)
         effective_max_tokens = max_tokens || @default_max_tokens
         effective_thinking = thinking == :default ? @default_thinking : thinking
         validate_thinking_level!(effective_thinking)
 
-        body = {
-          model: @model,
-          messages: wire_messages,
-          stream: stream
-        }
-        if uses_max_completion_tokens?
-          body[:max_completion_tokens] = effective_max_tokens
+        if uses_responses_api?
+          if stream
+            chat_streaming_responses(messages, system, tools, effective_max_tokens, effective_thinking, &)
+          else
+            chat_non_streaming_responses(messages, system, tools, effective_max_tokens, effective_thinking)
+          end
         else
-          body[:max_tokens] = effective_max_tokens
-        end
-        body[:tools] = wire_tools unless wire_tools.empty?
-        body[:reasoning_effort] = effective_thinking if effective_thinking
+          wire_messages = build_wire_messages(messages, system)
+          wire_tools = build_wire_tools(tools)
 
-        if stream
-          chat_streaming(body, &)
-        else
-          chat_non_streaming(body)
+          body = {
+            model: @model,
+            messages: wire_messages,
+            stream: stream
+          }
+          if uses_max_completion_tokens?
+            body[:max_completion_tokens] = effective_max_tokens
+          else
+            body[:max_tokens] = effective_max_tokens
+          end
+          body[:tools] = wire_tools unless wire_tools.empty?
+          body[:reasoning_effort] = effective_thinking if effective_thinking
+
+          if stream
+            chat_streaming(body, &)
+          else
+            chat_non_streaming(body)
+          end
         end
       end
 
@@ -189,6 +198,14 @@ module Dispatch
         @model.match?(/o[1-9]|gpt-5|gemini/)
       end
 
+      # Returns true when the selected model requires the /v1/responses endpoint.
+      # This applies to GPT-5 reasoning models. These models reject tool calls on
+      # /v1/chat/completions and return a 400 RequestError directing callers to
+      # use /v1/responses instead.
+      def uses_responses_api?
+        @model.match?(/\Agpt-5/)
+      end
+
       def default_token_path
         File.join(Dir.home, ".config", "dispatch", "copilot_github_token")
       end
@@ -440,6 +457,92 @@ module Dispatch
         merge_consecutive_roles(wire)
       end
 
+      # Converts canonical messages to the flat `input` array required by
+      # POST /v1/responses. System prompt is prepended as a system-role item.
+      # The Responses API does not support a top-level `system` parameter —
+      # the system message must be the first element of `input`.
+      def build_responses_api_input(messages, system)
+        input = []
+        input << { role: "system", content: system } if system
+
+        messages.each do |msg|
+          input.concat(convert_message_to_responses_input(msg))
+        end
+
+        input
+      end
+
+      # Converts a single canonical Message to one or more Responses API input
+      # items. Returns an Array (always) so results can be flat-concatenated.
+      def convert_message_to_responses_input(msg)
+        case msg.content
+        when String
+          [{ role: msg.role, content: msg.content }]
+        when Array
+          convert_content_blocks_to_responses_input(msg)
+        else
+          [{ role: msg.role, content: msg.content.to_s }]
+        end
+      end
+
+      # Converts an array of content blocks (TextBlock, ToolUseBlock,
+      # ToolResultBlock) from a single Message into Responses API input items.
+      #
+      # Key differences from the Chat Completions conversion:
+      # - ToolUseBlock  → top-level {type: "function_call", ...} item (not nested
+      #   under an assistant message role)
+      # - ToolResultBlock → top-level {type: "function_call_output", ...} item
+      # - TextBlock in assistant message → {role: "assistant", content: [{type:
+      #   "output_text", text: "..."}]}
+      def convert_content_blocks_to_responses_input(msg)
+        items = []
+        text_parts = []
+
+        msg.content.each do |block|
+          case block
+          when TextBlock
+            text_parts << block.text
+          when ImageBlock
+            raise NotImplementedError, "ImageBlock is not yet supported by the Copilot adapter"
+          when ToolUseBlock
+            # Flush any accumulated text first as an assistant message
+            unless text_parts.empty?
+              items << {
+                role: "assistant",
+                content: [{ type: "output_text", text: text_parts.join("\n") }]
+              }
+              text_parts = []
+            end
+            items << {
+              type: "function_call",
+              call_id: block.id,
+              name: block.name,
+              arguments: JSON.generate(block.arguments)
+            }
+          when ToolResultBlock
+            items << {
+              type: "function_call_output",
+              call_id: block.tool_use_id,
+              output: tool_result_content(block)
+            }
+          end
+        end
+
+        # Flush any remaining text
+        unless text_parts.empty?
+          items << if msg.role == "assistant"
+                     {
+                       role: "assistant",
+                       content: [{ type: "output_text", text: text_parts.join("\n") }]
+                     }
+                   else
+                     { role: msg.role, content: text_parts.join("\n") }
+                   end
+        end
+
+        items
+      end
+
       def convert_message(msg)
         case msg.content
         when String
@@ -544,6 +647,45 @@ module Dispatch
         end
       end
 
+      # Assembles the full request body for POST /v1/responses.
+      #
+      # Key differences from the Chat Completions body:
+      # - Uses `input` instead of `messages`.
+      # - Uses `max_output_tokens` instead of `max_tokens`/`max_completion_tokens`.
+      # - Uses `reasoning: {effort:}` instead of `reasoning_effort`.
+      # - Tool definitions omit the `function` wrapper — name/description/parameters
+      #   are top-level inside the tool object.
+      def build_responses_api_body(messages, system, tools, stream, max_tokens, thinking)
+        input = build_responses_api_input(messages, system)
+        wire_tools = build_responses_api_tools(tools)
+
+        body = {
+          model: @model,
+          input: input,
+          stream: stream,
+          max_output_tokens: max_tokens
+        }
+
+        body[:tools] = wire_tools unless wire_tools.empty?
+        body[:reasoning] = { effort: thinking } if thinking
+
+        body
+      end
+
+      # Converts ToolDefinition structs (or plain hashes) to the Responses API
+      # tool format. Unlike Chat Completions, there is no `function` wrapper —
+      # name, description, and parameters are direct keys on the tool object.
+      def build_responses_api_tools(tools)
+        tools.map do |td|
+          {
+            type: "function",
+            name: tool_attr(td, :name),
+            description: tool_attr(td, :description),
+            parameters: tool_attr(td, :parameters)
+          }
+        end
+      end
+
       # --- Chat (non-streaming) ---
 
       def chat_non_streaming(body)
@@ -603,6 +745,78 @@ module Dispatch
         )
       end
 
+      # Non-streaming chat via POST /v1/responses.
+      # Called when uses_responses_api? is true and stream is false.
+      def chat_non_streaming_responses(messages, system, tools, max_tokens, thinking)
+        @rate_limiter.wait!
+        body = build_responses_api_body(messages, system, tools, false, max_tokens, thinking)
+        wire_input = build_responses_api_input(messages, system)
+
+        uri = URI("#{API_BASE}/responses")
+        request = Net::HTTP::Post.new(uri)
+        apply_headers!(request, initiator: x_initiator_for_responses(wire_input))
+        request.body = JSON.generate(deep_utf8(body))
+
+        response = execute_request(uri, request)
+        data = parse_response!(response)
+        build_response_from_responses_api(data)
+      end
+
+      # Builds a canonical Response from a /v1/responses non-streaming body.
+      def build_response_from_responses_api(data)
+        output = data["output"] || []
+        text_parts = []
+        tool_calls = []
+
+        output.each do |item|
+          case item["type"]
+          when "message"
+            (item["content"] || []).each do |part|
+              text_parts << part["text"] if part["type"] == "output_text" && part["text"]
+            end
+          when "function_call"
+            tool_calls << ToolUseBlock.new(
+              id: item["call_id"] || item["id"],
+              name: item["name"],
+              arguments: parse_tool_arguments(item["arguments"])
+            )
+          end
+        end
+
+        stop_reason = tool_calls.any? ? :tool_use : :end_turn
+        content = text_parts.empty? ? nil : text_parts.join
+
+        usage_data = data["usage"] || {}
+        usage = Usage.new(
+          input_tokens: usage_data["input_tokens"] || 0,
+          output_tokens: usage_data["output_tokens"] || 0
+        )
+
+        Response.new(
+          content: content,
+          tool_calls: tool_calls,
+          model: data["model"] || @model,
+          stop_reason: stop_reason,
+          usage: usage
+        )
+      end
+
+      # Determines X-Initiator for a Responses API call.
+      # Same logic as x_initiator_for but operates on the already-built `input`
+      # array where items use `type: "function_call"` / `type: "function_call_output"`
+      # instead of role-based items.
+      def x_initiator_for_responses(input_items)
+        if input_items.any? do |item|
+          item[:role].to_s == "assistant" ||
+          item[:type].to_s == "function_call" ||
+          item[:type].to_s == "function_call_output"
+        end
+          "agent"
+        else
+          "user"
+        end
+      end
+
       # Recursively coerces every String inside a wire-body to valid UTF-8.
       #
       # Tool results (grep output, file reads, shell stdout) frequently arrive
@@ -625,8 +839,6 @@ module Dispatch
           obj.map { |v| deep_utf8(v) }
         when Hash
           obj.each_with_object({}) { |(k, v), h| h[k] = deep_utf8(v) }
-        when Symbol
-          obj
         else
           obj
         end
@@ -772,6 +984,142 @@ module Dispatch
           )
         )
       end
+
+      # Streaming chat via POST /v1/responses.
+      # Called when uses_responses_api? is true and stream is true.
+      def chat_streaming_responses(messages, system, tools, max_tokens, thinking, &block)
+        @rate_limiter.wait!
+        body = build_responses_api_body(messages, system, tools, true, max_tokens, thinking)
+        wire_input = build_responses_api_input(messages, system)
+
+        uri = URI("#{API_BASE}/responses")
+        request = Net::HTTP::Post.new(uri)
+        apply_headers!(request, initiator: x_initiator_for_responses(wire_input))
+        request.body = JSON.generate(deep_utf8(body))
+
+        collector = new_responses_stream_collector
+
+        execute_streaming_request(uri, request) do |response|
+          buffer = +""
+          response.read_body do |chunk|
+            buffer << chunk
+            process_responses_sse_buffer(buffer, collector, &block)
+          end
+        end
+
+        build_streaming_response_from_responses(collector)
+      end
+
+      def new_responses_stream_collector
+        {
+          # text_parts: Hash<output_index => String> — accumulated text fragments
+          text_parts: Hash.new { |h, k| h[k] = +"" },
+          # tool_calls: Hash<item_id => {call_id:, name:, arguments:}>
+          tool_calls: {},
+          # order: Array of [:text, output_index] or [:tool, item_id] in appearance order
+          order: [],
+          model: @model,
+          input_tokens: 0,
+          output_tokens: 0
+        }
+      end
+
+      def process_responses_sse_buffer(buffer, collector, &)
+        while (line_end = buffer.index("\n"))
+          line = buffer.slice!(0..line_end).strip
+          next unless line.start_with?("data: ")
+
+          data_str = line.delete_prefix("data: ")
+          next if data_str == "[DONE]"
+
+          begin
+            process_responses_stream_event(JSON.parse(data_str), collector, &)
+          rescue JSON::ParserError
+            next # skip only the malformed line; keep draining the buffer
+          end
+        end
+      end
+
+      def process_responses_stream_event(data, collector, &block)
+        case data["type"]
+        when "response.output_item.added"
+          handle_responses_output_item_added(data, collector, &block)
+        when "response.output_text.delta"
+          output_index = data["output_index"] || 0
+          fragment = data["delta"].to_s
+          collector[:text_parts][output_index] << fragment
+          block.call(StreamDelta.new(type: :text_delta, text: fragment))
+        when "response.function_call_arguments.delta"
+          handle_responses_arguments_delta(data, collector, &block)
+        when "response.completed"
+          usage = data.dig("response", "usage") || {}
+          collector[:input_tokens] = usage["input_tokens"] || collector[:input_tokens]
+          collector[:output_tokens] = usage["output_tokens"] || collector[:output_tokens]
+          model = data.dig("response", "model")
+          collector[:model] = model if model
+        end
+      end
+
+      def handle_responses_output_item_added(data, collector, &block)
+        item = data["item"] || {}
+        case item["type"]
+        when "function_call"
+          item_id = item["id"]
+          collector[:tool_calls][item_id] = {
+            call_id: item["call_id"] || item_id,
+            name: item["name"] || "",
+            arguments: +""
+          }
+          collector[:order] << [:tool, item_id]
+          block.call(StreamDelta.new(
+                       type: :tool_use_start,
+                       tool_call_id: item["call_id"] || item_id,
+                       tool_name: item["name"] || ""
+                     ))
+        when "message"
+          output_index = data["output_index"] || 0
+          collector[:order] << [:text, output_index] unless collector[:order].any? { |t, i| t == :text && i == output_index }
+        end
+      end
+
+      def handle_responses_arguments_delta(data, collector, &block)
+        item_id = data["item_id"]
+        fragment = data["delta"].to_s
+        tc = collector[:tool_calls][item_id]
+        return unless tc
+
+        tc[:arguments] << fragment
+        block.call(StreamDelta.new(
+                     type: :tool_use_delta,
+                     tool_call_id: tc[:call_id],
+                     argument_delta: fragment
+                   ))
+      end
+
+      def build_streaming_response_from_responses(collector)
+        tool_calls = collector[:tool_calls].values.map do |tc|
+          ToolUseBlock.new(
+            id: tc[:call_id],
+            name: tc[:name],
+            arguments: parse_tool_arguments(tc[:arguments])
+          )
+        end
+
+        all_text = collector[:text_parts].keys.sort.map { |idx| collector[:text_parts][idx] }.join
+        content = all_text.empty? ? nil : all_text
+        stop_reason = tool_calls.any? ? :tool_use : :end_turn
+
+        Response.new(
+          content: content,
+          tool_calls: tool_calls,
+          model: collector[:model],
+          stop_reason: stop_reason,
+          usage: Usage.new(
+            input_tokens: collector[:input_tokens],
+            output_tokens: collector[:output_tokens]
+          )
+        )
+      end
     end
   end
 end
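The streaming path above relies on a line buffer so that an SSE event split across two network chunks is parsed only once it is complete. The following is a standalone sketch of that buffering technique, using the event types handled in `process_responses_stream_event`. `drain_sse_events` and the sample chunks are made up for illustration, and this variant rescues parse errors per line, which is a design choice of the sketch.

```ruby
require "json"

# Drains complete "data: ..." lines from a mutable buffer and yields each
# decoded event. A partial trailing line stays in the buffer until a later
# chunk completes it.
def drain_sse_events(buffer)
  while (line_end = buffer.index("\n"))
    line = buffer.slice!(0..line_end).strip
    next unless line.start_with?("data: ")

    payload = line.delete_prefix("data: ")
    next if payload == "[DONE]"

    event = begin
      JSON.parse(payload)
    rescue JSON::ParserError
      nil # skip only the malformed line; later lines are still processed
    end
    yield event if event
  end
end

# Sample chunks: the second event is split mid-line across two chunks, and
# one line carries invalid JSON.
chunks = [
  %(data: {"type":"response.output_text.delta","output_index":0,"delta":"Hel"}\n),
  %(data: {"type":"response.output_text.del),
  %(ta","output_index":0,"delta":"lo"}\n\ndata: not-json\n),
  %(data: {"type":"response.completed","response":{"usage":{"input_tokens":10,"output_tokens":5}}}\n),
  "data: [DONE]\n"
]

buffer = +""
text = +""
usage = {}

chunks.each do |chunk|
  buffer << chunk
  drain_sse_events(buffer) do |event|
    case event["type"]
    when "response.output_text.delta" then text << event["delta"].to_s
    when "response.completed" then usage = event.dig("response", "usage") || {}
    end
  end
end
```

Because each line is removed from the buffer before it is parsed, a bad line is dropped rather than retried, and the text fragments still join in arrival order.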
diff --git a/spec/dispatch/adapter/copilot_spec.rb b/spec/dispatch/adapter/copilot_spec.rb
index 15bdcb0..6f54fef 100644
--- a/spec/dispatch/adapter/copilot_spec.rb
+++ b/spec/dispatch/adapter/copilot_spec.rb
@@ -1588,4 +1588,434 @@ RSpec.describe Dispatch::Adapter::Copilot do
       expect { adapter.chat(messages) }.to raise_error(Dispatch::Adapter::ConnectionError)
     end
   end
+
+  describe "#chat via /v1/responses (reasoning models)" do
+    let(:responses_adapter) do
+      described_class.new(
+        model: "gpt-5.4",
+        github_token: github_token,
+        max_tokens: 4096,
+        thinking: "medium",
+        min_request_interval: 0
+      )
+    end
+
+    # Helper: build a well-formed /v1/responses non-streaming response body.
+    def responses_body(text: nil, tool_calls: [], model: "gpt-5.4",
+                       input_tokens: 10, output_tokens: 5)
+      output = []
+      unless text.nil?
+        output << {
+          "type" => "message",
+          "id" => "msg_001",
+          "role" => "assistant",
+          "content" => [{ "type" => "output_text", "text" => text }]
+        }
+      end
+      tool_calls.each do |tc|
+        output << {
+          "type" => "function_call",
+          "id" => tc[:id],
+          "call_id" => tc[:id],
+          "name" => tc[:name],
+          "arguments" => JSON.generate(tc[:arguments])
+        }
+      end
+      {
+        "id" => "resp_001",
+        "object" => "response",
+        "model" => model,
+        "output" => output,
+        "usage" => {
+          "input_tokens" => input_tokens,
+          "output_tokens" => output_tokens,
+          "total_tokens" => input_tokens + output_tokens
+        }
+      }
+    end
+
+    context "with a text-only response" do
+      before do
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(
+            status: 200,
+            body: JSON.generate(responses_body(text: "Hello from GPT-5!")),
+            headers: { "Content-Type" => "application/json" }
+          )
+      end
+
+      it "returns a Response with content" do
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        response = responses_adapter.chat(messages)
+
+        expect(response).to be_a(Dispatch::Adapter::Response)
+        expect(response.content).to eq("Hello from GPT-5!")
+        expect(response.tool_calls).to be_empty
+        expect(response.model).to eq("gpt-5.4")
+        expect(response.stop_reason).to eq(:end_turn)
+        expect(response.usage.input_tokens).to eq(10)
+        expect(response.usage.output_tokens).to eq(5)
+      end
+    end
+
+    context "with a tool call response" do
+      before do
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(
+            status: 200,
+            body: JSON.generate(responses_body(
+                                  tool_calls: [{ id: "call_abc", name: "get_weather",
+                                                 arguments: { "city" => "New York" } }]
+                                )),
+            headers: { "Content-Type" => "application/json" }
+          )
+      end
+
+      it "returns a Response with tool_calls as ToolUseBlock array" do
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Weather?")]
+        response = responses_adapter.chat(messages)
+
+        expect(response.content).to be_nil
+        expect(response.stop_reason).to eq(:tool_use)
+        expect(response.tool_calls.size).to eq(1)
+
+        tc = response.tool_calls.first
+        expect(tc).to be_a(Dispatch::Adapter::ToolUseBlock)
+        expect(tc.id).to eq("call_abc")
+        expect(tc.name).to eq("get_weather")
+        expect(tc.arguments).to eq({ "city" => "New York" })
+      end
+    end
+
+    context "with mixed text + tool call response" do
+      before do
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(
+            status: 200,
+            body: JSON.generate(responses_body(
+                                  text: "Let me check.",
+                                  tool_calls: [{ id: "call_def", name: "search", arguments: { "q" => "test" } }]
+                                )),
+            headers: { "Content-Type" => "application/json" }
+          )
+      end
+
+      it "returns both content and tool_calls" do
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Search")]
+        response = responses_adapter.chat(messages)
+
+        expect(response.content).to eq("Let me check.")
+        expect(response.tool_calls.size).to eq(1)
+        expect(response.stop_reason).to eq(:tool_use)
+      end
+    end
+
+    context "request body shape" do
+      it "sends `input` not `messages`" do
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 body.key?("input") && !body.key?("messages")
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "ok")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+
+      it "sends `max_output_tokens` not `max_tokens`" do
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 body["max_output_tokens"] == 4096 &&
+                   !body.key?("max_tokens") &&
+                   !body.key?("max_completion_tokens")
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "ok")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+
+      it "sends `reasoning: {effort:}` not `reasoning_effort`" do
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 body["reasoning"] == { "effort" => "medium" } && !body.key?("reasoning_effort")
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "ok")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+
+      it "sends tools without the `function` wrapper" do
+        tool = Dispatch::Adapter::ToolDefinition.new(
+          name: "get_weather",
+          description: "Get weather",
+          parameters: { "type" => "object", "properties" => { "city" => { "type" => "string" } } }
+        )
+
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 t = body["tools"]&.first
+                 t && t["type"] == "function" &&
+                   t["name"] == "get_weather" &&
+                   t["description"] == "Get weather" &&
+                   !t.key?("function")
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "ok")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "weather?")]
+        responses_adapter.chat(messages, tools: [tool])
+
+        expect(stub).to have_been_requested
+      end
+
+      it "converts tool results to function_call_output items in input" do
+        tool_use = Dispatch::Adapter::ToolUseBlock.new(
+          id: "call_1", name: "search", arguments: { "q" => "ruby" }
+        )
+        tool_result = Dispatch::Adapter::ToolResultBlock.new(
+          tool_use_id: "call_1", content: "some results"
+        )
+
+        messages = [
+          Dispatch::Adapter::Message.new(role: "user", content: "search ruby"),
+          Dispatch::Adapter::Message.new(role: "assistant", content: [tool_use]),
+          Dispatch::Adapter::Message.new(role: "user", content: [tool_result])
+        ]
+
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 input = body["input"]
+                 fc = input.find { |i| i["type"] == "function_call" }
+                 fco = input.find { |i| i["type"] == "function_call_output" }
+                 fc && fc["call_id"] == "call_1" && fc["name"] == "search" &&
+                   fco && fco["call_id"] == "call_1" && fco["output"] == "some results"
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "done")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+    end
+
+    context "with system: parameter" do
+      it "prepends system item at start of input array" do
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with do |req|
+                 body = JSON.parse(req.body)
+                 body["input"].first == { "role" => "system", "content" => "Be concise." }
+               end
+               .to_return(
+                 status: 200,
+                 body: JSON.generate(responses_body(text: "ok")),
+                 headers: { "Content-Type" => "application/json" }
+               )
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages, system: "Be concise.")
+
+        expect(stub).to have_been_requested
+      end
+    end
+
+    context "X-Initiator header" do
+      let(:ok_resp) do
+        {
+          status: 200,
+          body: JSON.generate(responses_body(text: "ok")),
+          headers: { "Content-Type" => "application/json" }
+        }
+      end
+
+      it "sends X-Initiator: user for a fresh user message" do
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with(headers: { "X-Initiator" => "user" })
+               .to_return(**ok_resp)
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+
+      it "sends X-Initiator: agent when tool results are present" do
+        tool_use = Dispatch::Adapter::ToolUseBlock.new(id: "c1", name: "fn", arguments: {})
+        tool_result = Dispatch::Adapter::ToolResultBlock.new(tool_use_id: "c1", content: "res")
+
+        messages = [
+          Dispatch::Adapter::Message.new(role: "user", content: "go"),
+          Dispatch::Adapter::Message.new(role: "assistant", content: [tool_use]),
+          Dispatch::Adapter::Message.new(role: "user", content: [tool_result])
+        ]
+
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with(headers: { "X-Initiator" => "agent" })
+               .to_return(**ok_resp)
+
+        responses_adapter.chat(messages)
+
+        expect(stub).to have_been_requested
+      end
+    end
+
+    context "error mapping" do
+      it "maps 400 to RequestError (the error gpt-5.4 would give on wrong endpoint)" do
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(status: 400, body: JSON.generate({ "error" => { "message" => "bad request" } }))
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        expect { responses_adapter.chat(messages) }.to raise_error(Dispatch::Adapter::RequestError)
+      end
+    end
+
+    context "streaming" do
+      def sse_events(*events)
+        all = events.map { |e| "data: #{JSON.generate(e)}\n\n" }
+        all << "data: [DONE]\n\n"
+        all.join
+      end
+
+      it "yields text StreamDeltas and returns Response" do
+        body = sse_events(
+          { "type" => "response.output_item.added", "output_index" => 0,
+            "item" => { "type" => "message", "id" => "msg_001", "role" => "assistant", "content" => [] } },
+          { "type" => "response.output_text.delta", "item_id" => "msg_001",
+            "output_index" => 0, "content_index" => 0, "delta" => "Hello" },
+          { "type" => "response.output_text.delta", "item_id" => "msg_001",
+            "output_index" => 0, "content_index" => 0, "delta" => " world" },
+          { "type" => "response.completed",
+            "response" => { "model" => "gpt-5.4",
+                            "usage" => { "input_tokens" => 10, "output_tokens" => 2, "total_tokens" => 12 } } }
+        )
+
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .with { |req| JSON.parse(req.body)["stream"] == true }
+          .to_return(status: 200, body: body, headers: { "Content-Type" => "text/event-stream" })
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        deltas = []
+        response = responses_adapter.chat(messages, stream: true) { |d| deltas << d }
+
+        text_deltas = deltas.select { |d| d.type == :text_delta }
+        expect(text_deltas.size).to eq(2)
+        expect(text_deltas[0].text).to eq("Hello")
+        expect(text_deltas[1].text).to eq(" world")
+
+        expect(response).to be_a(Dispatch::Adapter::Response)
+        expect(response.content).to eq("Hello world")
+        expect(response.stop_reason).to eq(:end_turn)
+        expect(response.model).to eq("gpt-5.4")
+        expect(response.usage.input_tokens).to eq(10)
+        expect(response.usage.output_tokens).to eq(2)
+      end
+
+      it "yields tool_use_start and tool_use_delta for function call streams" do
+        body = sse_events(
+          { "type" => "response.output_item.added", "output_index" => 0,
+            "item" => { "type" => "function_call", "id" => "fc_001", "call_id" => "call_001",
+                        "name" => "get_weather" } },
+          { "type" => "response.function_call_arguments.delta",
+            "item_id" => "fc_001", "output_index" => 0, "delta" => "{\"city\":" },
+          { "type" => "response.function_call_arguments.delta",
+            "item_id" => "fc_001", "output_index" => 0, "delta" => "\"NYC\"}" },
+          { "type" => "response.completed",
+            "response" => { "model" => "gpt-5.4",
+                            "usage" => { "input_tokens" => 15, "output_tokens" => 8, "total_tokens" => 23 } } }
+        )
+
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(status: 200, body: body, headers: { "Content-Type" => "text/event-stream" })
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "weather?")]
+        deltas = []
+        response = responses_adapter.chat(messages, stream: true) { |d| deltas << d }
+
+        starts = deltas.select { |d| d.type == :tool_use_start }
+        arg_deltas = deltas.select { |d| d.type == :tool_use_delta }
+
+        expect(starts.size).to eq(1)
+        expect(starts.first.tool_call_id).to eq("call_001")
+        expect(starts.first.tool_name).to eq("get_weather")
+
+        expect(arg_deltas.size).to eq(2)
+        expect(arg_deltas[0].argument_delta).to eq("{\"city\":")
+        expect(arg_deltas[1].argument_delta).to eq("\"NYC\"}")
+
+        expect(response.stop_reason).to eq(:tool_use)
+        expect(response.tool_calls.size).to eq(1)
+        expect(response.tool_calls.first.name).to eq("get_weather")
+        expect(response.tool_calls.first.arguments).to eq({ "city" => "NYC" })
+        expect(response.usage.input_tokens).to eq(15)
+        expect(response.usage.output_tokens).to eq(8)
+      end
+
+      it "sends stream: true in the request body" do
+        body = sse_events(
+          { "type" => "response.completed",
+            "response" => { "model" => "gpt-5.4", "usage" => { "input_tokens" => 5, "output_tokens" => 1 } } }
+        )
+
+        stub = stub_request(:post, "https://api.githubcopilot.com/responses")
+               .with { |req| JSON.parse(req.body)["stream"] == true }
+               .to_return(status: 200, body: body, headers: { "Content-Type" => "text/event-stream" })
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "Hi")]
+        responses_adapter.chat(messages, stream: true) { |_d| }
+
+        expect(stub).to have_been_requested
+      end
+
+      it "returns nil content when there are no text deltas" do
+        body = sse_events(
+          { "type" => "response.output_item.added", "output_index" => 0,
+            "item" => { "type" => "function_call", "id" => "fc_001", "call_id" => "call_001", "name" => "fn" } },
+          { "type" => "response.function_call_arguments.delta",
+            "item_id" => "fc_001", "output_index" => 0, "delta" => "{}" },
+          { "type" => "response.completed",
+            "response" => { "model" => "gpt-5.4", "usage" => { "input_tokens" => 5, "output_tokens" => 1 } } }
+        )
+
+        stub_request(:post, "https://api.githubcopilot.com/responses")
+          .to_return(status: 200, body: body, headers: { "Content-Type" => "text/event-stream" })
+
+        messages = [Dispatch::Adapter::Message.new(role: "user", content: "do it")]
+        response = responses_adapter.chat(messages, stream: true) { |_d| }
+
+        expect(response.content).to be_nil
+        expect(response.tool_calls.size).to eq(1)
+      end
+    end
+  end
 end