Analysis of Claude Code's Cache TTL Change and Its Impact
By seanGSISG
AI Summary
In early 2026, the default TTL of Claude Code's prompt cache appears to have changed from 1 hour to 5 minutes. The change, which Anthropic seems to have made without announcement, significantly increased cache creation costs and quota consumption for users. Session data from January to April 2026, gathered from two machines, shows the shift and its consequences.
## Data Analysis
The data, extracted from Claude Code's JSONL session files, shows TTL behavior changing across four phases. In January, only 5-minute TTLs were recorded, likely because the 1-hour option was not yet available. From February 1 to March 5, the 1-hour TTL was used consistently, which indicates it was the intended default. Starting March 6, cache creation tokens with a 5-minute TTL began to reappear, and from March 8 onward they dominated, which suggests a server-side configuration change.
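A per-day tally of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the script used in the original analysis. It assumes each JSONL line is an object with a top-level `timestamp` (ISO 8601) and a `message.usage.cache_creation` object carrying `ephemeral_5m_input_tokens` and `ephemeral_1h_input_tokens`; the function names are hypothetical.

```python
import json
from collections import defaultdict


def tally_ttl_by_day(lines):
    """Sum cache-creation tokens per calendar day, split by TTL bucket.

    Assumed record layout (not confirmed by the summary):
    {"timestamp": "...", "message": {"usage": {"cache_creation": {...}}}}
    """
    totals = defaultdict(lambda: {"5m": 0, "1h": 0})
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        creation = usage.get("cache_creation") if isinstance(usage, dict) else None
        timestamp = record.get("timestamp")
        if not isinstance(creation, dict) or not isinstance(timestamp, str):
            continue  # record carries no cache-creation data
        day = timestamp[:10]  # "YYYY-MM-DD" prefix of the ISO timestamp
        totals[day]["5m"] += creation.get("ephemeral_5m_input_tokens") or 0
        totals[day]["1h"] += creation.get("ephemeral_1h_input_tokens") or 0
    return dict(totals)


def dominant_ttl(bucket):
    """Return the TTL bucket holding more tokens, or None if both are empty."""
    if bucket["5m"] == 0 and bucket["1h"] == 0:
        return None
    return "5m" if bucket["5m"] > bucket["1h"] else "1h"
```

Feeding the lines of every session file through `tally_ttl_by_day` and applying `dominant_ttl` to each day would show where the dominant bucket flips from `1h` to `5m`.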
## Cost and Quota Impact
The change has a direct cost. The cost analysis, based on Anthropic's pricing, estimates a 17.1% overpayment caused by more frequent cache re-creation under the 5-minute TTL: any pause longer than 5 minutes lets the cache expire, so the next request must write it again instead of reading it. Users on subscription plans also began hitting their quota limits for the first time, because cache creation tokens count toward those limits at a higher rate than cache reads.
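The mechanism can be illustrated with a toy cost model. This is a sketch under simplifying assumptions, not a reproduction of the 17.1% figure: it treats the cached prefix as a fixed size, assumes each request refreshes the cache, and takes a hypothetical base input price as a parameter. The multipliers follow Anthropic's published prompt-caching pricing (5-minute writes at 1.25x the base input price, 1-hour writes at 2x, reads at 0.1x).

```python
# Price multipliers relative to the base input-token price.
WRITE_MULTIPLIER = {"5m": 1.25, "1h": 2.0}
READ_MULTIPLIER = 0.1
TTL_MINUTES = {"5m": 5, "1h": 60}


def session_cost(gaps_minutes, prefix_tokens, ttl, base_price_per_mtok):
    """Estimate the cache cost of one session for a fixed-size cached prefix.

    gaps_minutes: idle time before each request after the first one.
    base_price_per_mtok: hypothetical base input price per million tokens.
    """
    base = prefix_tokens * base_price_per_mtok / 1_000_000
    cost = base * WRITE_MULTIPLIER[ttl]  # the first request always writes
    for gap in gaps_minutes:
        if gap < TTL_MINUTES[ttl]:
            cost += base * READ_MULTIPLIER  # cache still warm: cheap read
        else:
            cost += base * WRITE_MULTIPLIER[ttl]  # expired: pay for a new write
    return cost


if __name__ == "__main__":
    gaps = [2, 10, 3, 20]  # minutes of idle time between requests
    for ttl in ("5m", "1h"):
        print(ttl, round(session_cost(gaps, 100_000, ttl, 3.0), 4))
```

In this model a session of rapid back-to-back requests is cheaper with the 5-minute TTL, since its write multiplier is lower, while a session with pauses longer than 5 minutes is cheaper with the 1-hour TTL, since it avoids repeated writes. Which case dominates depends on the user's actual gaps between requests.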
## Hypothesis and Request
The data suggests that the 1-hour TTL was the intended default, with the March reversion being either a cost-saving measure or an accidental regression. The analysis calls for Anthropic to confirm whether this change was intentional and to consider reverting to the 1-hour TTL or making it user-configurable. Clarification on quota counting behavior for cache reads is also requested to help users manage their usage effectively.
## Methodology
The analysis used session logs written directly by Claude Code, with no third-party tooling between the client and the recorded data. Pricing was taken from Anthropic's official rates.
## Key Concepts
Cache TTL refers to the duration for which cached data remains valid before it needs to be refreshed or recreated. It determines how long data can be reused without fetching or computing it again.
Quota consumption refers to the usage of allocated resources or limits within a given period, often applied to subscription plans where users have a set amount of resources they can use.
Category: Technology
Original source: https://github.com/anthropics/claude-code/issues/46829
Summarized by Mente