| Age | Commit message (Collapse) | Author |
|
reconcile() only repaired orphaned tool-calls. Two other broken states made
chats uncontinuable, and load() had no parse-error guard:
- A trailing assistant message whose only chunk is 'error' (a failed-
generation marker) serializes to empty content -> provider rejects/empty
-> chat never continues. 6 of 140 production conversations were stuck.
- A tool-call whose input is a raw malformed-JSON string (model emitted
broken JSON) re-sent as OpenAI arguments -> provider 400s on every
continuation (the 77574596 break).
- load() JSON.parse had no try/catch -> one corrupt row bricked the chat.
Fix = read-time repair (no DB surgery; append-only preserved). reconcile
runs on every load() BEFORE any provider sees messages, so Layer 1
protects ALL providers.
Layer 1 (conversation-store reconcile): strip error chunks from assistant
messages + drop the now-empty error-only messages (safe: never followed by
a tool message); orphaned-tool-call synthesis unchanged; ReconcileReport
+2 additive counts. loadSince (FE reads) intentionally unreconciled so the
user still SEES the error. load() wraps JSON.parse in try/catch (skip
corrupt rows).
Layer 2 (openai-stream): serializeToolArguments ensures tool-call
arguments is always valid JSON (malformed string -> fallback object),
neutralizing already-stored malformed args.
Layer 2 equiv (../claude provider-anthropic): safeJson returns a valid
object fallback on parse failure, not the raw string. (Separate repo.)
Live-verified: reproduced 77574596's real broken tail in the dev DB;
POST /chat continued it cleanly (no 400, model replied) — the provider
accepted the reconciled history.
tsc -b EXIT 0, biome clean, 1453 vitest pass.
|
|
ModelInfo (kernel contract):
- Add contextWindow?: number field
OpenAI-stream listModels:
- Parse contextWindow from common field names (context_length,
context_window, max_context_length, max_tokens)
Transport-contract:
- ModelsResponse: add optional modelInfo map (model name → { contextWindow? })
- Add ModelMetadata type
- Rename CompactThresholdResponse → CompactPercentResponse
- Rename SetCompactThresholdRequest → SetCompactPercentRequest
Credential store:
- Add getModelInfo(modelName) method — resolves full ModelInfo
(including contextWindow) for a <credential>/<model> string
Transport-http:
- GET /models now includes modelInfo with contextWindow per model
- Rename compact-threshold endpoints → compact-percent
Session-orchestrator:
- Auto-compact now uses contextSize (not overcounted usage.inputTokens)
compared against contextWindow * (percent / 100)
- Default percent: 85 (was flat 350000)
- resolveModelInfo dep added to look up contextWindow
- Passes modelName from the settled turn to the compaction service
Conversation store:
- Rename getCompactThreshold/setCompactThreshold → getCompactPercent/setCompactPercent
- compactThresholdKey → compact-percent key
|
|
When a provider doesn't include a usage field in the SSE stream, the
span attributes (usage.inputTokens, usage.outputTokens) are now absent
instead of defaulting to 0. This makes it clear in the journal that the
provider didn't report usage, rather than looking like 0 tokens were
used.
|
|
Without stream_options.include_usage, OpenAI-compatible providers omit
the usage field from the SSE stream entirely. Umans returned 0 tokens
for everything; OpenCode's proxy happened to include usage without it.
Now both providers return proper prompt_tokens + completion_tokens.
Note: Umans does not report cache_read_tokens or
prompt_tokens_details.cached_tokens — cache hit rate will be 0% for
Umans regardless. This is a provider limitation, not a parsing issue.
|
|
Extract a generic @dispatch/openai-stream library from provider-openai-compat
(convert-messages, convert-tools, parse-sse, listModels, stream, provider),
parameterizing createOpenAICompatProvider with uid=1000(tradam) gid=1000(tradam) groups=1000(tradam),966(docker),968(ollama),998(wheel) + hook.
Refactor provider-openai-compat to import from the lib (byte-identical behavior).
New @dispatch/provider-umans extension wraps the Umans OpenAI-compatible backend
(https://api.code.umans.ai/v1). Self-contained: reads UMANS_API_KEY from env
directly (no auth-apikey dep). transformBody maps reasoningEffort →
reasoning_effort (capping xhigh/max → high). Dynamic listModels via GET /v1/models.
host-bin: registered provider-umans in CORE_EXTENSIONS + umans credential
(gated on UMANS_API_KEY — the credential is the model-catalog index).
Verified: tsc EXIT 0, 1059 vitest, biome clean (293 files). Boot smoke:
umans models appear in GET /models (7 models live).
|