2 files changed, 206 insertions, 0 deletions
diff --git a/.rules/changelog/2026-04/01/01.md b/.rules/changelog/2026-04/01/01.md
new file mode 100644
index 0000000..5134518
--- /dev/null
+++ b/.rules/changelog/2026-04/01/01.md
@@ -0,0 +1,26 @@
+# Changelog — 2026-04-01 #01
+
+## Changes
+
+### New: Rate Limiter (`lib/dispatch/adapter/rate_limiter.rb`)
+- Implemented `Dispatch::Adapter::RateLimiter` class with cross-process rate limiting via filesystem locks (`flock`).
+- Supports per-request cooldown (`min_request_interval`) and sliding window limiting (`rate_limit`).
+- State stored as JSON in `{token_path_dir}/copilot_rate_limit` with `0600` permissions.
+- Handles missing, empty, or corrupt state files gracefully.
+- Validates constructor arguments with descriptive `ArgumentError` messages.
+
+### Modified: Copilot Adapter (`lib/dispatch/adapter/copilot.rb`)
+- Added `min_request_interval:` (default `3.0`) and `rate_limit:` (default `nil`) constructor parameters.
+- Instantiates a `RateLimiter` in `initialize`.
+- Calls `@rate_limiter.wait!` before HTTP requests in `chat_non_streaming`, `chat_streaming`, and `list_models`.
+- Added `require_relative "rate_limiter"`.
+
+### Version Bump (`lib/dispatch/adapter/version.rb`)
+- Bumped `VERSION` from `"0.1.0"` to `"0.2.0"`.
+
+### Updated Test (`spec/dispatch/adapter/copilot_spec.rb`)
+- Updated VERSION expectation from `"0.1.0"` to `"0.2.0"`.
+
+### RuboCop Config (`.rubocop.yml`)
+- Disabled `Layout/LineLength` cop.
+- Disabled `Style/Documentation` cop.
diff --git a/.rules/plan/rate-limiting.md b/.rules/plan/rate-limiting.md
new file mode 100644
index 0000000..56dfb2d
--- /dev/null
+++ b/.rules/plan/rate-limiting.md
@@ -0,0 +1,180 @@
+# Rate Limiting — Implementation Plan
+
+Cross-process, per-account rate limiting for the Copilot adapter. All processes sharing the same GitHub account (same `token_path` directory) share a single rate limit state via the filesystem.
+
+---
+
+## Overview
+
+Two rate limiting mechanisms, both enforced transparently (the adapter sleeps until allowed, never raises):
+
+1. **Per-request cooldown** — Minimum interval between consecutive requests. Default: 3 seconds.
+2. **Sliding window limit** — Maximum N requests within a time period. Default: disabled (`nil`).
+
+Both are configured via constructor parameters. Rate limit state is stored in a file next to the persisted GitHub token, using `flock` for cross-process atomic access.
+
+---
+
+## Configuration
+
+### Constructor Parameters
+
+```ruby
+Copilot.new(
+  model: "gpt-4.1",
+  github_token: nil,
+  token_path: nil,
+  max_tokens: 8192,
+  thinking: nil,
+  min_request_interval: 3.0,          # seconds between requests (Float/Integer, nil to disable)
+  rate_limit: nil                      # sliding window config (Hash or nil to disable)
+)
+```
+
+#### `min_request_interval:` (default: `3.0`)
+
+- Minimum number of seconds that must elapse between the start of one request and the start of the next.
+- Set to `nil` or `0` to disable.
+- Applies system-wide across all processes sharing the same rate limit file.
+
+#### `rate_limit:` (default: `nil` — disabled)
+
+- A Hash with two keys, `requests` and `period`.
+  - `requests`: maximum number of requests allowed within the window.
+  - `period`: window size in seconds.
+- Example: `{ requests: 10, period: 60 }` means at most 10 requests per 60-second sliding window.
+- Set to `nil` to disable sliding window limiting (only per-request cooldown applies).
+- Validation: `requests` must be a positive `Integer` and `period` a positive `Integer` or `Float` (see Validation Rules). Raises `ArgumentError` otherwise.
+
+---
+
+## Behaviour
+
+When `chat` or `list_models` is called (any method that hits the Copilot API):
+
+1. **Acquire the rate limit file lock** (`flock(File::LOCK_EX)`).
+2. **Read the rate limit state** from the file.
+3. **Check per-request cooldown**: If less than `min_request_interval` seconds have elapsed since the last request timestamp, calculate the remaining wait time.
+4. **Check sliding window** (if configured): Count how many timestamps in the log fall within `[now - period, now]`. If the count >= `requests`, calculate the wait time until the oldest entry in the window expires.
+5. **Take the maximum** of both wait times (they can overlap).
+6. **Release the lock**, then **sleep** for the calculated wait time (if any).
+7. **Re-acquire the lock**, re-read the state, and re-check, because another process may have made a request while this one slept. If a wait is still required, go back to step 6.
+8. **Record the current timestamp** in the state file and release the lock.
+9. **Proceed** with the API request.
+
+The re-check after sleeping is necessary because another process could slip in a request while this one slept. The loop normally converges within a few iterations, because each process sleeps for the wait computed from the state it last read.
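+
+The lock, check, sleep, and re-check cycle can be sketched as below. This is a minimal illustration under stated assumptions, not the planned implementation: it covers only the per-request cooldown (no sliding window), and the method name `wait_for_cooldown!` and the `sleeper:` parameter are hypothetical, introduced here so the sleep can be observed.
+
+```ruby
+require "json"
+
+# Sketch only: cooldown check without the sliding window.
+# `wait_for_cooldown!` and `sleeper:` are hypothetical names.
+def wait_for_cooldown!(path, min_interval, sleeper: ->(seconds) { sleep(seconds) })
+  loop do
+    wait = File.open(path, File::RDWR | File::CREAT, 0o600) do |file|
+      file.flock(File::LOCK_EX) # steps 1 and 7: exclusive lock
+      last = begin
+        JSON.parse(file.read)["last_request_at"].to_f # step 2
+      rescue StandardError
+        0.0 # missing, empty, or corrupt state counts as "no previous request"
+      end
+      now = Time.now.to_f
+      remaining = last + min_interval - now # step 3
+      if remaining <= 0
+        # Step 8: record this request while still holding the lock.
+        file.rewind
+        file.truncate(0)
+        file.write(JSON.generate("last_request_at" => now, "request_log" => []))
+        0.0
+      else
+        remaining
+      end
+    end # closing the file releases the lock
+    return if wait <= 0
+
+    sleeper.call(wait) # step 6: sleep outside the lock, then loop to re-check
+  end
+end
+```
+
+Sleeping outside the lock keeps other processes from queueing behind a sleeper; the cost is the re-check on wake-up.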
+
+### Thread Safety
+
+The existing `@mutex` protects the Copilot token refresh. Rate limiting uses a separate concern:
+
+- **Cross-process**: `flock` on the rate limit file.
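+- **In-process threads**: The same `flock` also serializes threads, provided each `wait!` call opens the rate limit file itself. `flock` locks belong to the open file description, so two threads that each open the file contend with each other, whereas threads sharing a single `File` object would not. Ruby's `File#flock` releases the GVL while it blocks, so a waiting thread does not stall the rest of the process.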
+
+---
+
+## File Format
+
+### Path
+
+```
+{token_path_directory}/copilot_rate_limit
+```
+
+Where `token_path_directory` is `File.dirname(@token_path)`. Since `@token_path` defaults to `~/.config/dispatch/copilot_github_token`, the rate limit file defaults to `~/.config/dispatch/copilot_rate_limit`.
+
+### Contents
+
+JSON with two fields:
+
+```json
+{
+  "last_request_at": 1743465600.123,
+  "request_log": [1743465590.0, 1743465595.0, 1743465600.123]
+}
+```
+
+- `last_request_at` — Unix timestamp (Float) of the most recent request. Used for per-request cooldown.
+- `request_log` — Array of Unix timestamps (Float) for recent requests. Used for sliding window. Entries older than the window `period` are pruned on every write to keep the file small.
+
+If the sliding window is disabled, the `request_log` key is still written (as an empty array) and `last_request_at` is still updated, so the file format stays stable and the window can be enabled later without losing the last-request timestamp.
+
+When the file does not exist or is empty/corrupt, treat it as fresh state (no previous requests).
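+
+One way to get this fallback behaviour is to parse defensively and return a fresh state on any malformed input. The sketch below is an assumption about how `read_state` could work; `parse_state` and `default_state` are hypothetical names, and the sketch takes the raw file contents rather than a locked file handle.
+
+```ruby
+require "json"
+
+# Hypothetical helper: a fresh state meaning "no previous requests".
+def default_state
+  { "last_request_at" => nil, "request_log" => [] }
+end
+
+# Hypothetical standalone variant of read_state that works on raw contents.
+def parse_state(raw)
+  data = JSON.parse(raw.to_s) # "" and nil both raise JSON::ParserError
+  return default_state unless data.is_a?(Hash)
+
+  last = data["last_request_at"]
+  log = data["request_log"]
+  {
+    "last_request_at" => last.is_a?(Numeric) ? last.to_f : nil,
+    # Keep only numeric entries so one bad value does not poison the log.
+    "request_log" => log.is_a?(Array) ? log.grep(Numeric).map(&:to_f) : []
+  }
+rescue JSON::ParserError
+  default_state
+end
+```
+
+For example, `parse_state("{not json")` and `parse_state("")` both return the default state, so the next write overwrites the bad contents.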
+
+### File Permissions
+
+Created with `0600` (same as the token file) to prevent other users from reading/tampering.
+
+---
+
+## Implementation Structure
+
+### New File: `lib/dispatch/adapter/rate_limiter.rb`
+
+A standalone class `Dispatch::Adapter::RateLimiter` that encapsulates all rate limiting logic. The Copilot adapter delegates to it.
+
+```ruby
+class RateLimiter
+  def initialize(rate_limit_path:, min_request_interval:, rate_limit:)
+    # ...
+  end
+
+  def wait!
+    # Acquire lock, read state, compute wait, sleep, record, release.
+  end
+end
+```
+
+#### Public API
+
+- `#wait!` — Blocks until the rate limit allows a request, then records the request timestamp. Called by the adapter before every API call.
+
+#### Private Methods
+
+- `#read_state(file)` — Parse JSON from the locked file. Returns default state on missing/corrupt file.
+- `#write_state(file, state)` — Write JSON state back to the file.
+- `#compute_wait(state, now)` — Returns the number of seconds to sleep (Float, 0.0 if no wait needed).
+- `#prune_log(log, now, period)` — Remove timestamps older than `now - period`.
+- `#record_request(state, now)` — Append `now` to log, update `last_request_at`, prune old entries.
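+
+The three pure helpers could look roughly like this. It is a sketch under assumptions: the keyword arguments `min_interval:`, `limit:`, and `period:` are hypothetical stand-ins for the instance's configuration, and entries at exactly `now - period` are treated as expired.
+
+```ruby
+# Keep only timestamps still inside the window.
+def prune_log(log, now, period)
+  log.select { |t| t > now - period }
+end
+
+# Seconds to sleep before the next request is allowed (0.0 if none).
+def compute_wait(state, now, min_interval: nil, limit: nil)
+  waits = [0.0]
+  last = state["last_request_at"]
+  waits << (last + min_interval - now) if last && min_interval && min_interval.positive?
+  if limit
+    recent = prune_log(state["request_log"], now, limit[:period]).sort
+    if recent.size >= limit[:requests]
+      # The entry that must leave the window before one more request fits.
+      # With exactly `requests` entries this is the oldest one.
+      blocking = recent[recent.size - limit[:requests]]
+      waits << (blocking + limit[:period] - now)
+    end
+  end
+  waits.max # the two waits overlap, so take the larger
+end
+
+# New state after a request at `now`; the log stays empty when no window is set.
+def record_request(state, now, period: nil)
+  log = period ? prune_log(state["request_log"], now, period) + [now] : []
+  { "last_request_at" => now, "request_log" => log }
+end
+```
+
+Keeping these free of file I/O and of `Time.now` means they can be unit-tested with fixed timestamps.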
+
+### Changes to `Dispatch::Adapter::Copilot`
+
+1. Add constructor parameters `min_request_interval:` and `rate_limit:`.
+2. In `initialize`, create a `RateLimiter` instance.
+3. Call `@rate_limiter.wait!` in `chat_non_streaming`, `chat_streaming`, and `list_models`, after `ensure_authenticated!` (authentication should not be rate-limited) and immediately before the HTTP request.
+4. Validate `rate_limit:` hash structure in the constructor.
+
+### Changes to `Dispatch::Adapter::Base`
+
+No changes. Rate limiting is an implementation concern of the Copilot adapter, not part of the abstract interface. Other adapters may have different rate limiting strategies or none at all.
+
+---
+
+## Edge Cases
+
+| Scenario | Behaviour |
+|---|---|
+| Rate limit file does not exist | Treat as no previous requests. Create on first write. |
+| Rate limit file contains invalid JSON | Treat as no previous requests. Overwrite on next write. |
+| Rate limit file directory does not exist | Create it (same as `persist_token` does for the token file). |
+| `min_request_interval: nil` or `0` | Per-request cooldown disabled. |
+| `rate_limit: nil` | Sliding window disabled. Only cooldown applies. |
+| Both disabled | `wait!` is a no-op (returns immediately). |
+| `rate_limit:` missing `requests` or `period` key | Raises `ArgumentError` in constructor. |
+| `rate_limit: { requests: 0, ... }` or negative | Raises `ArgumentError` in constructor. |
+| Clock skew or wall-clock adjustment | `Time.now.to_f` is wall-clock time, not monotonic. Processes on the same host share the system clock, so sub-second differences between their reads are acceptable. A clock step (for example an NTP jump) could cause one extra wait or one early request, which is acceptable. |
+| Process killed while holding lock | `flock` is automatically released by the OS when the file descriptor is closed (including process termination). No stale locks. |
+| Very long `request_log` after sustained use | Pruned on every write. Maximum size = `rate_limit[:requests]` entries. |
+
+---
+
+## Validation Rules
+
+In the constructor:
+
+- `min_request_interval` must be `nil`, or a `Numeric` >= 0. Raise `ArgumentError` otherwise.
+- `rate_limit` must be `nil` or a `Hash` with:
+  - `:requests` — positive `Integer`
+  - `:period` — positive `Integer` or `Float`
+  - No extra keys required; extra keys are ignored.
+- Raise `ArgumentError` with a descriptive message on invalid config.
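+
+A possible shape for these checks is sketched below. The method name `validate_rate_config!` and the exact error messages are hypothetical; only the rules themselves come from the list above.
+
+```ruby
+# Hypothetical helper; the messages are illustrative only.
+def validate_rate_config!(min_request_interval:, rate_limit:)
+  unless min_request_interval.nil? ||
+         (min_request_interval.is_a?(Numeric) && min_request_interval >= 0)
+    raise ArgumentError,
+          "min_request_interval must be nil or a number >= 0, got #{min_request_interval.inspect}"
+  end
+
+  return if rate_limit.nil?
+
+  unless rate_limit.is_a?(Hash)
+    raise ArgumentError, "rate_limit must be nil or a Hash, got #{rate_limit.inspect}"
+  end
+
+  requests = rate_limit[:requests]
+  period = rate_limit[:period]
+  unless requests.is_a?(Integer) && requests.positive?
+    raise ArgumentError,
+          "rate_limit[:requests] must be a positive Integer, got #{requests.inspect}"
+  end
+  unless (period.is_a?(Integer) || period.is_a?(Float)) && period.positive?
+    raise ArgumentError,
+          "rate_limit[:period] must be a positive Integer or Float, got #{period.inspect}"
+  end
+  # Extra keys are deliberately not checked.
+end
+```
+
+A missing key reads as `nil`, which fails the type check, so the "missing `requests` or `period`" edge case needs no separate branch.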