[FEATURE] Implement a summarizing conversation manager · Issue #111 · strands-agents/sdk-python · GitHub

[FEATURE] Implement a summarizing conversation manager #111


Closed
stefanoamorelli opened this issue May 25, 2025 · 2 comments
Labels
duplicate This issue or pull request already exists enhancement New feature or request

Comments

@stefanoamorelli
Contributor

Problem Statement

Currently, the SlidingWindowConversationManager only offers a "hard truncation" approach - when the conversation exceeds the window size, older messages are simply discarded. While this effectively manages memory and costs, it creates a significant limitation: all context from the beginning of the conversation is permanently lost.

This means that in longer conversations, users may need to re-explain background information, repeat previous decisions, or lose the benefit of earlier troubleshooting steps.

The sliding window approach treats all messages as equally disposable based solely on their age, regardless of their semantic importance to the ongoing conversation.
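To make the limitation concrete, here is a minimal sketch of the "hard truncation" behavior described above. The helper function is hypothetical and not the SDK's actual implementation; it only illustrates that everything outside the window is lost outright.

```python
def sliding_window_truncate(messages: list[dict], window_size: int) -> list[dict]:
    """Keep only the most recent `window_size` messages; older context is lost."""
    if len(messages) <= window_size:
        return messages
    return messages[-window_size:]

history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
trimmed = sliding_window_truncate(history, window_size=4)
# Only the 4 newest messages survive; earlier context is discarded permanently.
```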

Proposed Solution

Implement a SummarizingConversationManager that preserves conversation context by summarizing older messages instead of discarding them when the window limit is exceeded.

  • Flexible configuration: configurable ratio of messages to summarize vs preserve;
  • Fallback: falls back to sliding window behavior when summarization fails;
  • Cost awareness: can use a separate (and potentially faster/cheaper) model for summarization.
# Imports shown for completeness; SummarizingConversationManager is the
# class proposed by this issue and does not exist in the SDK yet.
from strands.models.anthropic import AnthropicModel

summarization_model = AnthropicModel(
    model_id="claude-opus-4-20250514",
    max_tokens=500,  # Keep summaries concise
    params={
        "temperature": 0.3  # Lower temperature for more consistent summaries
    },
)

manager = SummarizingConversationManager(
    window_size=6,  # Window threshold
    enable_summarization=True,
    summarization_model=summarization_model,  # Use Anthropic model for summaries
    summary_ratio=0.5,  # Summarize 50% of oldest messages
    preserve_recent_messages=3,  # Always keep the 3 most recent messages
)
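One way the `summary_ratio` and `preserve_recent_messages` settings could interact is sketched below. The helper name and the exact arithmetic are assumptions for illustration, not the proposed class's actual logic; the key point is that the preserved tail acts as a floor that the ratio can never cut into.

```python
def split_for_summarization(
    messages: list, summary_ratio: float, preserve_recent_messages: int
) -> tuple[list, list]:
    """Return (to_summarize, to_keep), honoring the preserve-recent floor."""
    n_summarize = int(len(messages) * summary_ratio)
    # Never summarize messages inside the protected recent tail.
    n_summarize = min(n_summarize, len(messages) - preserve_recent_messages)
    n_summarize = max(n_summarize, 0)
    return messages[:n_summarize], messages[n_summarize:]

# With 8 messages and a 0.5 ratio, the 4 oldest are summarized
# and the 4 most recent are kept verbatim.
to_summarize, kept = split_for_summarization(list(range(8)), 0.5, 3)
```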
graph TD
      A[New Message Added] --> B{Messages > Window Size?}
      B -->|No| C[Continue Normal Flow]
      B -->|Yes| D{Summarization Enabled?}

      D -->|No| E[SlidingWindowConversationManager]
      E --> F[Trim Oldest Messages]
      F --> G[Preserve Tool Use Pairs]
      G --> H[Return Trimmed Messages]

      D -->|Yes| I[SummarizingConversationManager]
      I --> J{Enough Messages to Summarize?}
      J -->|No| K[Fallback to Sliding Window]
      K --> F

      J -->|Yes| L[Calculate Messages to Summarize]
      L --> M[Extract Oldest Messages]
      M --> N[Generate AI Summary]

      N --> O{Summary Generation Success?}
      O -->|No| P[Log Warning & Fallback]
      P --> F

      O -->|Yes| Q[Create Summary Message]
      Q --> R[Replace Old Messages with Summary]
      R --> S[Preserve Recent Messages]
      S --> T[Return: Summary + Recent Messages]

      style A fill:#e1f5fe
      style D fill:#fff3e0
      style I fill:#e8f5e8
      style E fill:#fce4ec
      style N fill:#f3e5f5
      style T fill:#e8f5e8
      style H fill:#fce4ec
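The decision flow in the diagram can be condensed into a short sketch: attempt summarization of the older messages, and fall back to plain sliding-window trimming on failure. The function signature, the `summarize` callable, and the message shapes are hypothetical placeholders, not the proposed API.

```python
def reduce_context(messages, window_size, enable_summarization, summarize):
    """Sketch of the flow above: summarize older messages or trim them."""
    if len(messages) <= window_size:
        return messages  # continue normal flow
    older = messages[:-window_size]
    recent = messages[-window_size:]
    if not enable_summarization or not older:
        return recent  # plain sliding-window trim
    try:
        summary_text = summarize(older)  # may call a cheaper/faster model
    except Exception:
        # Summary generation failed: log a warning and fall back to trimming.
        return recent
    summary_message = {"role": "assistant", "content": f"[Summary] {summary_text}"}
    return [summary_message] + recent
```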

Use Case

When agents deal with long conversations, the conversation eventually hits the model's context limit. At that point, we can either cut off the older messages and lose that context forever, or summarize the earlier parts so the agent can still reference what happened before. A summarization technique prevents the "goldfish memory" problem, where the agent suddenly forgets everything that was discussed at the beginning of the conversation.

Alternative Solutions

No response

Additional Context

No response

@stefanoamorelli stefanoamorelli added the enhancement New feature or request label May 25, 2025
@Unshure Unshure added the duplicate This issue or pull request already exists label May 26, 2025
@Unshure
Member
Unshure commented May 26, 2025

Hi @stefanoamorelli, thanks for making this feature request and the associated pull requests! We already have a feature request open for this type of feature, with a slightly different approach: #90

One main difference between the approaches is using an Agent instead of a ModelProvider. Using an Agent allows for toolUse as a part of the summarization process.

We can also simplify the initialization of the new ConversationManager by passing in the current agent when calling reduce_context as well, instead of requiring a ModelProvider at initialization time.

Please take a look at the original Feature Request. We can continue the discussion about the different implementation strategies there.
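The alternative initialization described in this comment could look roughly like the sketch below: the manager takes no model at construction time and instead receives the current agent when `reduce_context` is called, so summarization can go through the agent (including toolUse). All class, method, and parameter names here are assumptions for illustration.

```python
class SummarizingConversationManager:
    """Sketch: no model provider is bound at initialization time."""

    def __init__(self, window_size: int = 6):
        self.window_size = window_size

    def reduce_context(self, agent, messages):
        """Receive the current agent at reduction time and let it summarize."""
        if len(messages) <= self.window_size:
            return messages
        older = messages[:-self.window_size]
        recent = messages[-self.window_size:]
        # The agent itself produces the summary, and may use tools to do so.
        summary = agent.summarize(older)
        return [{"role": "assistant", "content": summary}] + recent
```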

@Unshure Unshure closed this as completed May 26, 2025
@stefanoamorelli
Contributor Author

Apologies, indeed, I did not see the open issue (did not search exhaustively). Sure, let's continue the conversation there!
