path: root/packages/openai-stream
Age  Commit message  Author
5 days  fix(broken-chat): read-time self-repair of unrecoverable chats  Adam Malczewski
reconcile() only repaired orphaned tool-calls. Two other broken states made chats uncontinuable, and load() had no parse-error guard:

- A trailing assistant message whose only chunk is 'error' (a failed-generation marker) serializes to empty content -> provider rejects/empty -> chat never continues. 6 of 140 production conversations were stuck.
- A tool-call whose input is a raw malformed-JSON string (model emitted broken JSON) re-sent as OpenAI arguments -> provider 400s on every continuation (the 77574596 break).
- load() JSON.parse had no try/catch -> one corrupt row bricked the chat.

Fix = read-time repair (no DB surgery; append-only preserved). reconcile runs on every load() BEFORE any provider sees messages, so Layer 1 protects ALL providers.

Layer 1 (conversation-store reconcile): strip error chunks from assistant messages + drop the now-empty error-only messages (safe: never followed by a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport +2 additive counts. loadSince (FE reads) intentionally unreconciled so the user still SEES the error. load() wraps JSON.parse in try/catch (skip corrupt rows).

Layer 2 (openai-stream): serializeToolArguments ensures tool-call arguments is always valid JSON (malformed string -> fallback object), neutralizing already-stored malformed args.

Layer 2 equiv (../claude provider-anthropic): safeJson returns a valid object fallback on parse failure, not the raw string. (Separate repo.)

Live-verified: reproduced 77574596's real broken tail in the dev DB; POST /chat continued it cleanly (no 400, model replied); the provider accepted the reconciled history. tsc -b EXIT 0, biome clean, 1453 vitest pass.
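The two guards described above (always-valid tool-call arguments, and skipping corrupt rows on load) could look roughly like the sketch below. This is an illustration only, not the committed code: the fallback key `_malformed`, the helper name `parseRows`, and both signatures are assumptions.

```typescript
// Hypothetical sketch: the real serializeToolArguments may use a different
// fallback shape. The `_malformed` key is an assumption for illustration.
function serializeToolArguments(input: unknown): string {
  if (typeof input === "string") {
    try {
      JSON.parse(input); // throws on malformed JSON
      return input; // already valid JSON text, pass through unchanged
    } catch {
      // The model emitted broken JSON: wrap it so the provider still
      // receives syntactically valid arguments.
      return JSON.stringify({ _malformed: input });
    }
  }
  // Non-string input: serialize it, defaulting to an empty object.
  const serialized: string | undefined = JSON.stringify(input ?? {});
  return serialized ?? "{}";
}

// Hypothetical helper mirroring "load() wraps JSON.parse in try/catch":
// corrupt rows are skipped instead of failing the whole load.
function parseRows<T>(rows: string[]): T[] {
  const parsed: T[] = [];
  for (const row of rows) {
    try {
      parsed.push(JSON.parse(row) as T);
    } catch {
      // Skip the corrupt row; the remaining history stays usable.
    }
  }
  return parsed;
}
```

Wrapping the raw string rather than discarding it keeps the malformed payload visible for debugging while still satisfying the provider's requirement for parseable arguments.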
7 days  feat: context window from model endpoints + percentage-based auto-compact  Adam Malczewski
ModelInfo (kernel contract):
- Add contextWindow?: number field

OpenAI-stream listModels:
- Parse contextWindow from common field names (context_length, context_window, max_context_length, max_tokens)

Transport-contract:
- ModelsResponse: add optional modelInfo map (model name → { contextWindow? })
- Add ModelMetadata type
- Rename CompactThresholdResponse → CompactPercentResponse
- Rename SetCompactThresholdRequest → SetCompactPercentRequest

Credential store:
- Add getModelInfo(modelName) method: resolves full ModelInfo (including contextWindow) for a <credential>/<model> string

Transport-http:
- GET /models now includes modelInfo with contextWindow per model
- Rename compact-threshold endpoints → compact-percent

Session-orchestrator:
- Auto-compact now uses contextSize (not overcounted usage.inputTokens) compared against contextWindow * (percent / 100)
- Default percent: 85 (was flat 350000)
- resolveModelInfo dep added to look up contextWindow
- Passes modelName from the settled turn to the compaction service

Conversation store:
- Rename getCompactThreshold/setCompactThreshold → getCompactPercent/setCompactPercent
- compactThresholdKey → compact-percent key
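As a rough illustration of the two mechanisms in this commit, the sketch below reads a context window from the listed field names and applies the percentage check. The field precedence, the `>=` comparison, the behaviour when no context window is known, and the function names are all assumptions; only the field names, the formula contextWindow * (percent / 100), and the default of 85 come from the message.

```typescript
// Field names taken from the commit message; the lookup order is an assumption.
const CONTEXT_WINDOW_FIELDS = [
  "context_length",
  "context_window",
  "max_context_length",
  "max_tokens",
] as const;

// Hypothetical helper: returns the first positive numeric field found.
function parseContextWindow(raw: Record<string, unknown>): number | undefined {
  for (const field of CONTEXT_WINDOW_FIELDS) {
    const value = raw[field];
    if (typeof value === "number" && Number.isFinite(value) && value > 0) {
      return value;
    }
  }
  return undefined;
}

const DEFAULT_COMPACT_PERCENT = 85; // default stated in the commit message

// Hypothetical helper: compares contextSize against the percentage threshold.
function shouldAutoCompact(
  contextSize: number,
  contextWindow: number | undefined,
  percent: number = DEFAULT_COMPACT_PERCENT,
): boolean {
  // Assumption: with no known context window, never auto-compact.
  if (contextWindow === undefined) return false;
  return contextSize >= contextWindow * (percent / 100);
}
```

For a model reporting a 200000-token window, the default of 85 puts the threshold at 170000 tokens, so the trigger scales with each model instead of relying on one flat number.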
8 days  fix(openai-stream): omit usage attrs from spans when provider doesn't report  Adam Malczewski
When a provider doesn't include a usage field in the SSE stream, the span attributes (usage.inputTokens, usage.outputTokens) are now absent instead of defaulting to 0. This makes it clear in the journal that the provider didn't report usage, rather than looking like 0 tokens were used.
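A minimal sketch of that behaviour, assuming span attributes are a flat key/value record. The `Usage` and `SpanAttributes` types and the function name are hypothetical; only the attribute keys come from the message.

```typescript
// Hypothetical types for illustration.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}
type SpanAttributes = Record<string, string | number>;

// Returns usage attributes only when the provider actually reported usage.
function usageAttributes(usage: Usage | undefined): SpanAttributes {
  if (usage === undefined) {
    return {}; // no keys at all, instead of defaulting to 0
  }
  return {
    "usage.inputTokens": usage.inputTokens,
    "usage.outputTokens": usage.outputTokens,
  };
}
```

A provider that genuinely reports zero tokens still gets explicit 0 values, so "absent" and "zero" remain distinguishable in the journal.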
8 days  fix(openai-stream): add stream_options.include_usage to all requests  Adam Malczewski
Without stream_options.include_usage, OpenAI-compatible providers omit the usage field from the SSE stream entirely. Umans returned 0 tokens for everything; OpenCode's proxy happened to include usage without it. Now both providers return proper prompt_tokens + completion_tokens. Note: Umans does not report cache_read_tokens or prompt_tokens_details.cached_tokens — cache hit rate will be 0% for Umans regardless. This is a provider limitation, not a parsing issue.
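The request-side change can be pictured as in the sketch below, together with a reader for the usage payload of a streamed chunk. The function and type names are hypothetical; `stream_options.include_usage`, `prompt_tokens`, and `completion_tokens` are the OpenAI-compatible field names mentioned above.

```typescript
// Hypothetical request shape for an OpenAI-compatible chat completion call.
interface StreamingChatBody {
  model: string;
  messages: unknown[];
  stream: true;
  stream_options: { include_usage: true };
}

function buildStreamingBody(model: string, messages: unknown[]): StreamingChatBody {
  return {
    model,
    messages,
    stream: true,
    // Ask the provider to include a usage payload in the SSE stream.
    stream_options: { include_usage: true },
  };
}

// Hypothetical chunk shape: usage may be missing or null on most chunks.
interface StreamChunk {
  choices: unknown[];
  usage?: { prompt_tokens: number; completion_tokens: number } | null;
}

function extractUsage(
  chunk: StreamChunk,
): { inputTokens: number; outputTokens: number } | undefined {
  const usage = chunk.usage;
  if (usage == null) return undefined; // covers both null and undefined
  return { inputTokens: usage.prompt_tokens, outputTokens: usage.completion_tokens };
}
```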
8 days  feat(provider-umans): Umans AI Coding Plan provider + openai-stream lib  Adam Malczewski
Extract a generic @dispatch/openai-stream library from provider-openai-compat (convert-messages, convert-tools, parse-sse, listModels, stream, provider), parameterizing createOpenAICompatProvider with id + hook. Refactor provider-openai-compat to import from the lib (byte-identical behavior).

New @dispatch/provider-umans extension wraps the Umans OpenAI-compatible backend (https://api.code.umans.ai/v1). Self-contained: reads UMANS_API_KEY from env directly (no auth-apikey dep). transformBody maps reasoningEffort → reasoning_effort (capping xhigh/max → high). Dynamic listModels via GET /v1/models.

host-bin: registered provider-umans in CORE_EXTENSIONS + umans credential (gated on UMANS_API_KEY: the credential is the model-catalog index).

Verified: tsc EXIT 0, 1059 vitest, biome clean (293 files). Boot smoke: umans models appear in GET /models (7 models live).
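The transformBody mapping described above might be sketched as follows. The signature (a plain record in, a plain record out) is an assumption; only the rename reasoningEffort → reasoning_effort and the capping of xhigh/max to high come from the message.

```typescript
// Hypothetical signature: the real hook may receive a typed request body.
function transformBody(body: Record<string, unknown>): Record<string, unknown> {
  // Pull the camelCase field out so it is not forwarded to the provider.
  const { reasoningEffort, ...rest } = body;
  if (typeof reasoningEffort !== "string") {
    return rest; // nothing to map
  }
  // Cap the levels above "high" at "high"; other values pass through.
  const capped =
    reasoningEffort === "xhigh" || reasoningEffort === "max" ? "high" : reasoningEffort;
  return { ...rest, reasoning_effort: capped };
}
```

The sketch returns a new object rather than mutating its argument, which keeps the caller's request body reusable across providers.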