Currently, the SlidingWindowConversationManager only offers a "hard truncation" approach - when the conversation exceeds the window size, older messages are simply discarded. While this effectively manages memory and costs, it creates a significant limitation: all context from the beginning of the conversation is permanently lost.
This means that in longer conversations, users may need to re-explain background information, repeat previous decisions, or lose the benefit of earlier troubleshooting steps.
The sliding window approach treats all messages as equally disposable based solely on their age, regardless of their semantic importance to the ongoing conversation.
Proposed Solution
Implement a SummarizingConversationManager that preserves conversation context by summarizing older messages instead of discarding them when the window limit is exceeded.
- Flexible configuration: configurable ratio of messages to summarize vs. preserve;
- Fallback: falls back to sliding window behavior when summarization fails;
- Cost awareness: can use a separate (and potentially faster/cheaper) model for summarization.
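As a rough illustration of the ratio-based split, the interaction between `summary_ratio` and `preserve_recent_messages` could work like this (a minimal sketch; the helper function itself is hypothetical, only the two parameter names come from the proposal):

```python
def split_for_summarization(messages, summary_ratio=0.5, preserve_recent_messages=3):
    """Hypothetical helper: pick which of the oldest messages to summarize.

    Summarize `summary_ratio` of the history, but never touch the
    `preserve_recent_messages` newest messages.
    """
    n_summarize = int(len(messages) * summary_ratio)
    # Clamp so the protected recent window is never summarized.
    n_summarize = min(n_summarize, max(0, len(messages) - preserve_recent_messages))
    return messages[:n_summarize], messages[n_summarize:]

to_summarize, to_keep = split_for_summarization(list(range(10)))
# With 10 messages and a 0.5 ratio: the oldest 5 are summarized, 5 are kept.
```

The clamp is what makes `preserve_recent_messages` a hard guarantee even when `summary_ratio` is set aggressively high.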
```python
summarization_model = AnthropicModel(
    model_id="claude-opus-4-20250514",
    max_tokens=500,  # Keep summaries concise
    params={
        "temperature": 0.3,  # Lower temperature for more consistent summaries
    },
)

manager = SummarizingConversationManager(
    window_size=6,                            # Window threshold
    enable_summarization=True,
    summarization_model=summarization_model,  # Use Anthropic model for summaries
    summary_ratio=0.5,                        # Summarize 50% of oldest messages
    preserve_recent_messages=3,               # Always keep 3 most recent messages
)
```
```mermaid
graph TD
    A[New Message Added] --> B{Messages > Window Size?}
    B -->|No| C[Continue Normal Flow]
    B -->|Yes| D{Summarization Enabled?}
    D -->|No| E[SlidingWindowConversationManager]
    E --> F[Trim Oldest Messages]
    F --> G[Preserve Tool Use Pairs]
    G --> H[Return Trimmed Messages]
    D -->|Yes| I[SummarizingConversationManager]
    I --> J{Enough Messages to Summarize?}
    J -->|No| K[Fallback to Sliding Window]
    K --> F
    J -->|Yes| L[Calculate Messages to Summarize]
    L --> M[Extract Oldest Messages]
    M --> N[Generate AI Summary]
    N --> O{Summary Generation Success?}
    O -->|No| P[Log Warning & Fallback]
    P --> F
    O -->|Yes| Q[Create Summary Message]
    Q --> R[Replace Old Messages with Summary]
    R --> S[Preserve Recent Messages]
    S --> T[Return: Summary + Recent Messages]
    style A fill:#e1f5fe
    style D fill:#fff3e0
    style I fill:#e8f5e8
    style E fill:#fce4ec
    style N fill:#f3e5f5
    style T fill:#e8f5e8
    style H fill:#fce4ec
```
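In plain Python, the decision flow above could look roughly like this (a self-contained sketch of the flowchart's logic; the function name and parameters mirror the proposal, while `summarize` is a stand-in for the model call, not an actual SDK API):

```python
def reduce_context(messages, window_size=6, enable_summarization=True,
                   summary_ratio=0.5, preserve_recent_messages=3,
                   summarize=None):
    """Sketch of the flowchart: summarize old messages, else slide the window."""
    if len(messages) <= window_size:
        return messages  # Continue normal flow

    def sliding_window(msgs):
        # Hard truncation: keep only the newest `window_size` messages.
        return msgs[-window_size:]

    if not enable_summarization or summarize is None:
        return sliding_window(messages)

    n_summarize = min(int(len(messages) * summary_ratio),
                      len(messages) - preserve_recent_messages)
    if n_summarize < 1:
        return sliding_window(messages)  # Not enough messages to summarize

    try:
        summary = summarize(messages[:n_summarize])  # Generate AI summary
    except Exception:
        return sliding_window(messages)  # Log warning & fall back in real code

    # Replace the oldest messages with a single summary message.
    return [{"role": "assistant", "content": summary}] + messages[n_summarize:]
```

For example, with 8 messages and the defaults above, the oldest 4 collapse into one summary message followed by the 4 most recent messages; if the summarization call raises, the result is identical to the plain sliding window.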
Use Case
When agents deal with long conversations, the conversation eventually hits the model's context limit. At that point, we can either cut off the older messages and lose that context forever, or summarize the earlier parts so the agent can still reference what happened before. A summarization technique prevents the "goldfish memory" problem where the agent suddenly forgets everything that was discussed at the beginning of the conversation.
Alternative Solutions
No response
Additional Context
No response
Hi @stefanoamorelli, thanks for making this feature request and the associated pull requests! We already have a feature request open for this type of feature, with a slightly different approach: #90
One main difference between the approaches is using an Agent instead of a ModelProvider. Using an Agent allows for toolUse as a part of the summarization process.
We can also simplify the initialization of the new ConversationManager by passing in the current agent when calling reduce_context as well, instead of requiring a ModelProvider at initialization time.
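For illustration, that alternative shape might look something like this (purely hypothetical signatures sketching the idea, not the actual SDK API; the agent is supplied at `reduce_context` time rather than at construction):

```python
class SummarizingConversationManager:
    """Hypothetical shape: no model or agent is required at construction time."""

    def __init__(self, window_size=6, summary_ratio=0.5):
        self.window_size = window_size
        self.summary_ratio = summary_ratio

    def reduce_context(self, agent):
        # The current agent is passed in here, so it (including its tools,
        # i.e. toolUse) can drive summarization instead of a ModelProvider.
        messages = agent.messages
        if len(messages) <= self.window_size:
            return messages
        n = max(1, int(len(messages) * self.summary_ratio))
        summary = agent.summarize(messages[:n])  # stand-in for an agent call
        return [summary] + messages[n:]
```

This keeps the manager's constructor free of model configuration and lets the same agent instance (with its tools and model) handle summarization.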
Please take a look at the original Feature Request. We can continue the discussion about the different implementation strategies there.