fix(file_read): remove redundant base64 encoding of file content by awalcutt · Pull Request #53 · strands-agents/tools · GitHub

fix(file_read): remove redundant base64 encoding of file content #53

Merged · 1 commit merged into strands-agents:main on Jun 2, 2025

Conversation

awalcutt (Contributor)

Description

The AWS SDK automatically base64 encodes binary content when sending requests, but the file_read tool was also encoding it, resulting in double-encoding. According to the Bedrock API documentation: "If you use an AWS SDK, you don't need to encode the bytes in base64."

This change fixes the issue by removing the extra base64 encoding step and letting the Bedrock client handle it.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse_stream.html
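
For illustration, a minimal sketch of the change. The block shape follows the Bedrock Converse API document block; the file path and names are hypothetical, and the actual diff lives in the file_read tool.

```python
import base64
from pathlib import Path

file_bytes = Path("report.pdf").read_bytes()  # hypothetical input file

# Before the fix (double-encoding): the tool base64-encoded the bytes, and
# the SDK encoded them again when sending the request.
document_block_before = {
    "document": {
        "format": "pdf",
        "name": "report",
        "source": {"bytes": base64.b64encode(file_bytes)},
    }
}

# After the fix: pass the raw bytes and let the Bedrock client handle any
# wire-level encoding ("If you use an AWS SDK, you don't need to encode the
# bytes in base64").
document_block_after = {
    "document": {
        "format": "pdf",
        "name": "report",
        "source": {"bytes": file_bytes},
    }
}
```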

Related Issues

Fixes #26

Documentation PR

N/A

Type of Change

  • Bug fix
  • New Tool
  • Breaking change
  • Other (please describe):

Testing

  • hatch fmt --linter
  • hatch fmt --formatter
  • hatch test --all
  • installing in my private application

Checklist

  • I have read the CONTRIBUTING document

  • I have added tests that prove my fix is effective or my feature works

  • I have updated the documentation accordingly

  • I have added an appropriate example to the documentation to outline the feature

  • My changes generate no new warnings

  • Any dependent changes have been merged and published

  • By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@awalcutt awalcutt requested a review from a team as a code owner May 26, 2025 21:54
@awsarron (Member)

Hi @awalcutt, thank you for opening this PR.

How does this work with other model providers? Strands supports various model providers like Bedrock, Anthropic, LiteLLM, Llama API, Ollama, OpenAI, and custom providers.

@awsarron awsarron self-assigned this May 26, 2025
@awalcutt (Contributor, Author)

@awsarron Good question. I didn't look too deeply into that because the docstring for this function says "Create a Bedrock document block from a file."

The Anthropic provider is also separately encoding the content, so this change should fix that case as well.

As far as I can tell, the other providers do not have specific handling for document content blocks but fall back on json.dumps(content), which will throw an error if bytes are passed in.

The DocumentContent type explicitly calls for bytes rather than str (via DocumentSource).
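
For example, a minimal sketch of that failure mode (the block shape here is illustrative, not the exact tool output):

```python
import json

# A document content block carrying raw bytes, as DocumentContent specifies
# via DocumentSource.
content = {"document": {"name": "report", "format": "pdf", "source": {"bytes": b"%PDF-1.4"}}}

json.dumps(content)
# TypeError: Object of type bytes is not JSON serializable
```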

So... should DocumentContent contain bytes or an encoded string?

If bytes, model providers must explicitly handle document blocks by throwing an error, filtering them out, or explicitly encoding them. Documentation should clarify that this is required for any model provider implementation. The SDK can provide helper functions. I think this option is the most flexible for emerging/custom model provider implementations and is a cleaner interface between the responsibilities of tools and model providers.

If an encoded string, model providers that need bytes (e.g. Bedrock) are responsible for decoding document blocks or configuring the underlying client to prevent additional encoding. Model providers that want a different encoding (encryption? compression?) may need to decode first. Personally I find this interface a bit awkward, but I recognize that, practically speaking, most providers are going to expect base64 encoding. There is also the potential that this is wasteful (of tokens, and therefore money) if a provider does not support documents and gets sent a bunch of base64-encoded content that it can't use.

Thoughts?

@mehtarac mehtarac self-assigned this May 27, 2025
@awsarron awsarron assigned pgrayy and unassigned awsarron and mehtarac May 27, 2025
@pgrayy (Member) commented May 28, 2025

Hello @awalcutt, thank you for putting together this PR. After discussing with the team, we decided on the following:

  1. Add document content handling in the OpenAI and LiteLLM model providers (models - content - documents sdk-python#138)
    • As with Bedrock and Anthropic, we are going to handle the base64 encoding on behalf of customers and so the change you make here will still be valid.
  2. Throw an exception in the LlamaAPI and Ollama model providers when document blocks are passed in (PR pending).
    • These providers do not support document blocks and so having an explicit exception will be better than letting json.dumps fail.

Once these 2 changes are in, I would be okay merging your fix. Thank you for your patience.
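
For reference, a hypothetical sketch of the two behaviors described in points 1 and 2 above; the function names are illustrative, and the real implementations live in the model provider classes in sdk-python.

```python
import base64

def encode_document_block(block: dict) -> dict:
    """Providers that support documents (OpenAI/LiteLLM style): base64-encode
    the raw bytes on behalf of the customer when building the request."""
    document = dict(block["document"])
    document["source"] = {
        "bytes": base64.b64encode(document["source"]["bytes"]).decode("ascii")
    }
    return {"document": document}

def reject_document_block(block: dict) -> None:
    """Providers without document support (LlamaAPI/Ollama style): raise an
    explicit error instead of letting json.dumps fail later."""
    raise TypeError("document content blocks are not supported by this model provider")
```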

@pgrayy pgrayy merged commit 467858b into strands-agents:main Jun 2, 2025
30 checks passed