# dispatch-adapter-claude

A [Dispatch](https://github.com/realtradam/dispatch-adapter-interface) adapter
that connects to Anthropic's Claude API using a personal **Pro / Max
subscription** via the same OAuth flow that the Claude Code CLI uses.

---
## ⚠ Status / disclaimer
This gem impersonates the Claude Code CLI HTTP signature
(`User-Agent: claude-cli/2.1.63 (external, cli)` and the matching Stainless
header set). That is the mechanism Anthropic uses to route Pro / Max traffic
through the Claude Code entitlement.

Use this under your **own** Claude Pro / Max subscription and at your own risk
with respect to Anthropic's Terms of Service. It is your responsibility to
ensure your usage complies with those terms.

---
## Installation
Add to your `Gemfile`:
```ruby
gem "dispatch-adapter-claude"
```
Or install directly:
```bash
gem install dispatch-adapter-claude
```
The gem requires Ruby ≥ 3.2 and depends on
`dispatch-adapter-interface ~> 0.2`.

---
## Quick start
```ruby
require "dispatch/adapter/claude"

# Build the adapter (defaults to claude-sonnet-4-5-20250929).
claude = Dispatch::Adapter::Claude.new(
  model: "claude-sonnet-4-5-20250929"
)

# First run: opens a browser for the OAuth PKCE flow and caches the token
# at ~/.config/dispatch/claude_oauth.json (mode 0600).
# Subsequent calls: validates / auto-refreshes the stored token.
claude.authenticate!

# Send a message.
msgs = [
  Dispatch::Adapter::Message.new(
    role: "user",
    content: [Dispatch::Adapter::TextBlock.new(text: "Say hi")]
  )
]
resp = claude.chat(msgs)
puts resp.content           # => "Hi there! ..."
puts resp.stop_reason       # => :end_turn
puts resp.usage.cost.total  # USD-equivalent (computed)

# Check subscription quota (OAuth only).
report = claude.usage_report
entry = report.limits.find { |e| e.id == "anthropic:5h" }
puts entry.amount.used_fraction  # 0.124 (12.4% of 5-hour window)
puts entry.window.resets_at      # 2026-04-28 19:00:00 UTC
```
---
## Pricing semantics
Claude Pro / Max returns **no dollar-cost line item** per request — it is a
flat-rate plan. The `usage.cost` field on every `Response` is a
**locally-computed USD-equivalent** derived from a bundled price table that
mirrors what API customers would pay:
```
cost.input = (price_per_mtok.input / 1_000_000) × input_tokens
cost.output = (price_per_mtok.output / 1_000_000) × output_tokens
cost.cache_read = (price_per_mtok.cache_read / 1_000_000) × cache_read_tokens
cost.cache_write = (price_per_mtok.cache_write / 1_000_000) × cache_creation_tokens
cost.total = sum of the above
```
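As a rough illustration of the formula above, the sketch below computes the USD-equivalent for one response. The price values and the `usd_equivalent` helper are hypothetical stand-ins, not the gem's bundled table or API; substitute the current list prices for your model.

```ruby
# Hypothetical USD prices per million tokens (illustrative values only,
# not the table bundled with the gem).
PRICE_PER_MTOK = {
  input: 3.00,
  output: 15.00,
  cache_read: 0.30,
  cache_write: 3.75
}.freeze

# Applies the per-token formula to each usage bucket, then sums the buckets.
def usd_equivalent(input_tokens:, output_tokens:, cache_read_tokens: 0,
                   cache_creation_tokens: 0, prices: PRICE_PER_MTOK)
  cost = {
    input: prices[:input] / 1_000_000 * input_tokens,
    output: prices[:output] / 1_000_000 * output_tokens,
    cache_read: prices[:cache_read] / 1_000_000 * cache_read_tokens,
    cache_write: prices[:cache_write] / 1_000_000 * cache_creation_tokens
  }
  cost.merge(total: cost.values.sum)
end

cost = usd_equivalent(input_tokens: 1_000, output_tokens: 500)
puts format("%.4f", cost[:total])
```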
The only **authoritative** consumption signal for Pro / Max is
`usage_report`, which reports what fraction of each rolling window
(5-hour, 7-day, 7-day Opus, 7-day Sonnet) has been used.

---
## Configuration
### Constructor keyword arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| `model` | `String` | `"claude-sonnet-4-5-20250929"` | Anthropic model ID |
| `api_key` | `String, nil` | `nil` | Raw `sk-ant-api…` key; bypasses OAuth when set |
| `token_path` | `String, nil` | `nil` | Override path for the OAuth token file |
| `base_url` | `String` | `"https://api.anthropic.com"` | API base URL |
| `max_tokens` | `Integer, nil` | `nil` | Instance-level default for `max_tokens` |
| `thinking` | `String, Hash, nil` | `nil` | Instance-level thinking config (see below) |
| `cache_retention` | `Symbol, nil` | `nil` | Default cache TTL: `:short` (5 min), `:long` (1 h), `:none` |
| `min_request_interval` | `Float` | `1.0` | Minimum seconds between outbound requests |
| `extra_betas` | `Array<String>` | `[]` | Additional `Anthropic-Beta` header values |
| `is_oauth` | `Boolean, nil` | `nil` | Override OAuth auto-detection |
| `token_store` | `TokenStore, nil` | `nil` | Inject a custom credential store (testing) |
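
To make the `min_request_interval` option concrete, here is a minimal sketch of a minimum-interval throttle. The `MinIntervalThrottle` class, its injectable `clock:` and `sleeper:` arguments, and the `wait` method are illustrative assumptions; the adapter's internal implementation may differ.

```ruby
# Hypothetical throttle: ensures at least `min_interval` seconds elapse
# between consecutive calls to #wait. Clock and sleeper are injectable
# so the behaviour can be exercised without real sleeping.
class MinIntervalThrottle
  def initialize(min_interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = min_interval
    @clock = clock
    @sleeper = sleeper
    @last_at = nil
  end

  # Blocks until the minimum interval since the previous call has passed.
  def wait
    if @last_at
      remaining = @min_interval - (@clock.call - @last_at)
      @sleeper.call(remaining) if remaining.positive?
    end
    @last_at = @clock.call
  end
end

throttle = MinIntervalThrottle.new(1.0)
throttle.wait # first call returns immediately
```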
### Environment variables
No environment variables are required. By default the adapter loads
credentials from the token store at `~/.config/dispatch/claude_oauth.json`;
pass `token_path:` to use a different file.
### Thinking / extended-output
Pass `thinking:` to the constructor (instance default) or to `chat` (per-call):
```ruby
# String shorthand (maps to budget_tokens heuristics internally)
claude = Dispatch::Adapter::Claude.new(thinking: "high")

# Hash: full control
claude = Dispatch::Adapter::Claude.new(
  thinking: { type: :adaptive, display: :summarized }
)

# Per-call override
resp = claude.chat(msgs, thinking: { type: :enabled, budget_tokens: 4096 })
```
For **adaptive thinking** (Opus 4.7+), use `{ type: :adaptive }`.
For models that don't support thinking, the parameter is silently ignored.
### Prompt caching
Pass `cache_retention:` to enable Anthropic prompt caching:
```ruby
# Short-lived (5 min TTL) — works everywhere including API keys
resp = claude.chat(msgs, cache_retention: :short)
# Long-lived (1 h TTL) — api.anthropic.com only
resp = claude.chat(msgs, cache_retention: :long)
# Disable caching even if the instance default is set
resp = claude.chat(msgs, cache_retention: :none)
```
The adapter places up to 4 cache breakpoints in the order that Anthropic
requires: last tool → last system block → penultimate user message → last
user message. It also enforces the 4-breakpoint cap and the TTL ordering rule
(a 1-h block may not follow a 5-min block), so callers do not need to manage either.
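
The placement order described above can be sketched as follows. This is an illustration only: the `cache_breakpoints` helper, the plain-hash message shape, and the `[kind, index]` return format are assumptions made for the example, not the adapter's internals.

```ruby
MAX_BREAKPOINTS = 4

# Returns [kind, index] pairs naming where a cache breakpoint would go,
# in the order: last tool, last system block, penultimate user message,
# last user message. Candidates that do not exist are skipped.
def cache_breakpoints(tools:, system:, messages:)
  user_indexes = messages.each_index.select { |i| messages[i][:role] == "user" }

  candidates = []
  candidates << [:tool, tools.length - 1] unless tools.empty?
  candidates << [:system, system.length - 1] unless system.empty?
  candidates << [:message, user_indexes[-2]] if user_indexes.length >= 2
  candidates << [:message, user_indexes[-1]] unless user_indexes.empty?
  candidates.first(MAX_BREAKPOINTS)
end

conversation = [
  { role: "user", content: "Hello" },
  { role: "assistant", content: "Hi" },
  { role: "user", content: "Add 2 and 3" }
]
p cache_breakpoints(tools: %w[add sub], system: ["Be brief"], messages: conversation)
```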
### Tool use
```ruby
add_tool = Dispatch::Adapter::ToolDefinition.new(
  name: "add",
  description: "Return the sum of two integers",
  parameters: {
    type: "object",
    properties: {
      a: { type: "integer" },
      b: { type: "integer" }
    },
    required: %w[a b]
  }
)

resp = claude.chat(msgs, tools: [add_tool])
if resp.stop_reason == :tool_use
  tc = resp.tool_calls.first
  puts "#{tc.name}(#{tc.arguments})" # prints e.g. add({"a"=>2, "b"=>3})
end
```
When using OAuth (Pro / Max), tool names are automatically prefixed with
`proxy_` on the wire and stripped from the response. Built-in tool names
(`web_search`, `code_execution`, `text_editor`, `computer`) are passed
through unchanged.
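
The prefixing rule above amounts to a pair of small name transforms. The sketch below shows one way to express it; the `wire_tool_name` and `local_tool_name` helpers are hypothetical names chosen for this example, not methods exposed by the gem.

```ruby
PROXY_PREFIX = "proxy_"
BUILTIN_TOOLS = %w[web_search code_execution text_editor computer].freeze

# Name sent on the wire: built-in tools pass through, others get the prefix.
def wire_tool_name(name)
  BUILTIN_TOOLS.include?(name) ? name : "#{PROXY_PREFIX}#{name}"
end

# Name handed back to the caller: the prefix is stripped if present.
def local_tool_name(wire_name)
  wire_name.delete_prefix(PROXY_PREFIX)
end

puts wire_tool_name("add")
puts local_tool_name("proxy_add")
```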
### Streaming
```ruby
full_resp = claude.chat(msgs, stream: true) do |delta|
  case delta.type
  when :text_delta     then print delta.text
  when :thinking_delta then print "[thinking] #{delta.text}"
  when :tool_use_start then puts "\n[tool] #{delta.tool_name}"
  end
end
puts full_resp.usage.cost.total
```
---
## Auth lifecycle
```ruby
claude = Dispatch::Adapter::Claude.new
# Interactive OAuth PKCE login (opens browser on first call)
result = claude.authenticate!
# => :logged_in | :cached | :refreshed | :api_key
# Check whether credentials are present
claude.authenticated? # => true / false
# Remove stored OAuth credentials
claude.logout!
```
Tokens are stored at `~/.config/dispatch/claude_oauth.json` (file mode 0600)
and refreshed automatically 5 minutes before expiry.
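
The storage and refresh behaviour described above can be approximated as follows. This is a sketch under assumptions: the helper names, the JSON payload shape, and the use of `Time` for expiry are invented for illustration and do not reflect the gem's actual token file format.

```ruby
require "json"

# Refresh this many seconds before the token actually expires.
REFRESH_SKEW_SECONDS = 5 * 60

# Writes the payload as JSON, creating the file with owner-only permissions.
# The explicit chmod also tightens a file that already existed.
def write_token_file(path, payload)
  File.open(path, File::WRONLY | File::CREAT | File::TRUNC, 0o600) do |file|
    file.write(JSON.generate(payload))
  end
  File.chmod(0o600, path)
end

# True once the current time is within the skew window before expiry.
def needs_refresh?(expires_at, now: Time.now)
  now >= expires_at - REFRESH_SKEW_SECONDS
end

puts needs_refresh?(Time.now + 60)   # inside the 5-minute window
puts needs_refresh?(Time.now + 3600) # still an hour away
```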
---
## Limitations
| Area | Constraint |
|---|---|
| Image modality | JPG, PNG, WEBP, GIF only; no audio or video |
| Tool-name length | Anthropic enforces a maximum on compiled grammar size; a 400 "compiled grammar too large" triggers an automatic retry with `strict: false` |
| Cache breakpoints | At most 4 per request; the adapter enforces the cap and TTL ordering automatically |
| OAuth callback port | Port 54545 is hard-coded by Anthropic's redirect URI; the port must be free during the initial login |
| Thinking on forced tool_choice | When `tool_choice: :any` or `tool_choice: { type: :tool, … }`, thinking and `output_config` are removed (the API rejects them otherwise) |
| Opus 4.7+ sampling params | `top_p` / `top_k` are silently dropped for Opus 4.7+ models |
| `usage_report` | Requires OAuth (Pro / Max); returns `nil` with a raw API key |
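
As an illustration of the forced-`tool_choice` row above, the sketch below strips `thinking` and `output_config` from a request when the tool choice is forced. The `strip_thinking_for_forced_tool` helper and the plain-hash request shape are assumptions for this example, not the gem's API.

```ruby
# Returns a copy of the request parameters with thinking-related keys
# removed when tool_choice forces a tool call (:any or { type: :tool, ... }).
def strip_thinking_for_forced_tool(params)
  choice = params[:tool_choice]
  forced = choice == :any || (choice.is_a?(Hash) && choice[:type] == :tool)
  return params.dup unless forced

  params.reject { |key, _| %i[thinking output_config].include?(key) }
end

request = {
  tool_choice: { type: :tool, name: "add" },
  thinking: { type: :enabled, budget_tokens: 4096 },
  max_tokens: 1024
}
p strip_thinking_for_forced_tool(request)
```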
---
## Tracking upstream changes
The two constants most likely to drift when Anthropic releases a new
version of Claude Code are:
| Constant | Location | Current value |
|---|---|---|
| `CLAUDE_CODE_VERSION` | `lib/dispatch/adapter/claude/headers.rb` | `"2.1.63"` |
| `STAINLESS_PACKAGE_VERSION` | `lib/dispatch/adapter/claude/headers.rb` | `"0.74.0"` |
When the real Claude Code CLI updates, bump these to match. Anthropic also
rotates the `Anthropic-Beta` header set a few times per year; `DEFAULT_BETAS`
in the same file lists the current set.
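
To show why these two constants matter, the sketch below derives the version-dependent headers from them. The `User-Agent` format comes from the disclaimer at the top of this README; the `X-Stainless-Package-Version` header name and the `versioned_headers` helper are assumptions for illustration, so check `headers.rb` for the real header set.

```ruby
CLAUDE_CODE_VERSION = "2.1.63"
STAINLESS_PACKAGE_VERSION = "0.74.0"

# Builds only the headers that change when the constants are bumped.
# The Stainless header name here is assumed, not taken from the gem.
def versioned_headers
  {
    "User-Agent" => "claude-cli/#{CLAUDE_CODE_VERSION} (external, cli)",
    "X-Stainless-Package-Version" => STAINLESS_PACKAGE_VERSION
  }
end

versioned_headers.each { |name, value| puts "#{name}: #{value}" }
```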
Background research, including the original reverse-engineering notes and the
full gap analysis against `dispatch-adapter-interface`, is in
[`.rules/research/research.md`](./.rules/research/research.md).