feat(observability): map DeepSeek nested cache tokens (prompt_tokens_details.cached_tokens) -> Usage.cacheReadTokens - dispatch


author	Adam Malczewski <[email protected]>	2026-06-05 17:53:00 +0900
committer	Adam Malczewski <[email protected]>	2026-06-05 17:53:00 +0900
commit	8c417472e7801369c3dfd004c9c85d7d69372f7c (patch)
tree	3da8eac532855a14013b502afaf30d669010c6be /packages/kernel/src
parent	74986a54093f6492cc4420f5917e5215f42a8f89 (diff)

feat(observability): map DeepSeek nested cache tokens (prompt_tokens_details.cached_tokens) -> Usage.cacheReadTokens

The real flash fixture showed that flash reports cache usage in the nested prompt_tokens_details.cached_tokens form (384 cached of 665 prompt tokens). The parser only mapped the flat cache_read_tokens form, so cache tokens never surfaced.

Now: cacheReadTokens = usage.cache_read_tokens ?? usage.prompt_tokens_details?.cached_tokens. The flat field wins when both are present; cacheWriteTokens stays flat-only and is never fabricated; a partial or null *_details object is handled safely. No kernel contract change (Usage already has the fields).

Adds 5 parser tests plus a real-fixture regression (cacheReadTokens === 384). These counts (plus a future prefix.fingerprint) are the cheap signals for body de-duplication.

The broader trace-body storage-growth concern (the verbatim body is stored per request, so long conversations grow ~O(N^2)) is logged as DEFERRED in tasks.md. The mitigation is already designed (D5 volume control + §6 retention/rotation) but not yet built.

339 tests; typecheck + biome 0/0.
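
The fallback rule described in this message can be sketched roughly as follows. This is an illustration only, not the actual parser code: the `RawUsage` and `CacheUsage` types and the `mapCacheTokens` function name are hypothetical, and the flat `cache_write_tokens` field name is assumed by analogy with `cache_read_tokens`.

```typescript
// Hypothetical wire shape; only the fields relevant to cache accounting are shown.
interface RawUsage {
  prompt_tokens?: number;
  cache_read_tokens?: number | null;
  cache_write_tokens?: number | null; // assumed field name, flat form only
  prompt_tokens_details?: { cached_tokens?: number | null } | null;
}

// Hypothetical subset of the kernel's Usage type.
interface CacheUsage {
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

function mapCacheTokens(usage: RawUsage): CacheUsage {
  return {
    // Flat form wins (including an explicit 0); otherwise fall back to the nested form.
    // Optional chaining keeps a missing or null prompt_tokens_details safe.
    cacheReadTokens:
      usage.cache_read_tokens ??
      usage.prompt_tokens_details?.cached_tokens ??
      undefined,
    // Write tokens are flat-only: never derived from the nested details.
    cacheWriteTokens: usage.cache_write_tokens ?? undefined,
  };
}

// Values taken from the fixture numbers quoted above (384 cached of 665 prompt).
const fromFixture = mapCacheTokens({
  prompt_tokens: 665,
  prompt_tokens_details: { cached_tokens: 384 },
});
// fromFixture.cacheReadTokens is 384; fromFixture.cacheWriteTokens is undefined
```

Using `??` rather than `||` matters here: a flat `cache_read_tokens` of 0 is a real report of "nothing read from cache" and should not fall through to the nested value.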

Diffstat (limited to 'packages/kernel/src')

0 files changed, 0 insertions, 0 deletions

