This PR fixes critical issue #156, which made it impossible to use important reasoning models such as claude-4.5 or haiku, by returning `thinking_blocks` when the previous assistant response contained them. A related PR (BerriAI/litellm#17106) was recently merged in litellm; as a workaround to avoid errors, it basically drops thinking when `thinking_blocks` are not returned. However, that limits the capabilities of reasoning LLMs. For that reason, litellm clients such as Strix need to handle returning the `thinking_blocks` themselves, at least for reasoning models.
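As a rough sketch of the idea (not the actual implementation in this PR), the client can copy `thinking_blocks` from a previous assistant response into the message it appends to the conversation history. The message shape follows the litellm reasoning-content docs; `build_assistant_message` is a hypothetical helper name:

```python
def build_assistant_message(response_message: dict) -> dict:
    """Turn a previous assistant response into a history message,
    preserving thinking_blocks so reasoning models accept the turn.

    Hypothetical helper for illustration; field names follow
    https://docs.litellm.ai/docs/reasoning_content.
    """
    message = {
        "role": "assistant",
        "content": response_message.get("content"),
    }
    # Only reasoning models return thinking_blocks; pass them back verbatim
    # instead of stripping them, so the model keeps its reasoning context.
    thinking_blocks = response_message.get("thinking_blocks")
    if thinking_blocks:
        message["thinking_blocks"] = thinking_blocks
    return message


# Example: echoing a prior reasoning-model response into the history.
prev = {
    "role": "assistant",
    "content": "The answer is 42.",
    "thinking_blocks": [
        {"type": "thinking", "thinking": "Step-by-step reasoning...",
         "signature": "sig-abc"}
    ],
}
history_msg = build_assistant_message(prev)
```

For non-reasoning responses (no `thinking_blocks` key), the helper simply omits the field, matching the workaround behavior in BerriAI/litellm#17106.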
Ref: https://docs.litellm.ai/docs/reasoning_content
Ref: BerriAI/litellm#14194
Ref: BerriAI/litellm#9020
NOTE: I believe one of the messages being sent with role 'assistant' was incorrect and should be sent as 'user', so I changed that as well. Please correct me if I'm wrong.
Tested with