8000 fix: add thinking blocks by sangorrin · Pull Request #202 · usestrix/strix · GitHub

Conversation

@sangorrin sangorrin commented Dec 18, 2025

This PR fixes critical issue #156, which makes it impossible to use important reasoning models such as claude-4.5 or haiku, by returning thinking_blocks when the previous assistant response had them.

Recently a related PR (BerriAI/litellm#17106) was merged in litellm. That PR removes thinking when thinking_blocks are not returned, as a workaround to avoid errors. However, this limits the capabilities of reasoning LLMs. For that reason, litellm clients such as Strix need to handle returning the thinking_blocks themselves, at least for reasoning models.

Ref: https://docs.litellm.ai/docs/reasoning_content
Ref: BerriAI/litellm#14194
Ref: BerriAI/litellm#9020
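A minimal sketch of the round-trip this PR implements, based on the litellm reasoning_content docs linked above. The helper name `assistant_turn` and the message shape are this sketch's assumptions, not Strix's actual code; litellm response messages expose `content` and, for reasoning models, `thinking_blocks`:

```python
from typing import Any


def assistant_turn(message: Any) -> dict[str, Any]:
    """Build the assistant history entry, preserving thinking_blocks.

    `message` is a litellm response message (anything exposing
    .content and, optionally, .thinking_blocks). Echoing the blocks
    back on the assistant turn is what keeps Anthropic reasoning
    models from rejecting the follow-up request.
    """
    turn: dict[str, Any] = {"role": "assistant", "content": message.content}
    blocks = getattr(message, "thinking_blocks", None)
    if blocks:  # only reasoning models return these
        turn["thinking_blocks"] = blocks
    return turn
```

The key point is the conditional: for non-reasoning models the history entry is unchanged, so the fix is a no-op there.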

NOTE: I believe one of the messages being sent as 'assistant' should actually have been sent as 'user', so I changed that as well. Please correct me if I'm wrong.

Tested with

  • "anthropic/claude-haiku-4-5"
  • on a Mac air M2.
  • python 3.12.9
  • latest strix commit
  • see errors in the issues

Copilot AI review requested due to automatic review settings December 18, 2025 14:19
Copilot AI left a comment


Pull request overview

This PR adds support for thinking_blocks from reasoning models (such as Claude Sonnet 4.5 and Claude Haiku 4.5) by properly extracting and preserving them in conversation history. This fixes a critical issue where reasoning models would fail due to missing thinking blocks in subsequent requests. Additionally, the PR corrects the message role for timeout notifications from "assistant" to "user".

  • Adds thinking_blocks field to LLMResponse dataclass and extracts it from LLM responses for reasoning models
  • Updates add_message method to accept and store thinking_blocks in conversation history
  • Fixes message role from "assistant" to "user" for waiting timeout messages

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
strix/llm/llm.py Adds thinking_blocks field to LLMResponse and extracts thinking blocks from reasoning model responses when applicable
strix/agents/state.py Extends add_message to accept optional thinking_blocks parameter and conditionally includes them in message storage
strix/agents/base_agent.py Passes thinking_blocks when storing assistant messages and corrects timeout message role from "assistant" to "user"


