implement Content-MD5 check for PutObject #9064
Merged
Motivation
As requested and reported in #8929, we were missing Content-MD5 validation for PutObject, which was implemented in the legacy provider.
ChecksumAlgorithm or Content-MD5 lets the server verify that the payload it received is exactly the payload that was sent. https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html#checking-object-integrity-md5
This only concerns PutObject for now, but many operations have this parameter (around 21, I think, PutBucketAcl for example).
When @alexrashed and I looked into this a long, long time ago, we saw that the specs provide a way to tell which operations require a checksum.
The end goal would be a handler that manages checksum verification for all operations, except the ones with a streaming body like PutObject, as those require special handling (reading the body only once). This is in the backlog and will be tackled in the future.
Changes
Implemented different strategies depending on the provider.
For the default and streaming providers, I used the same kind of logic as the checksum validation in the streaming provider, but based on the ETag of the object, which is the hexdigest of the MD5 hash. The Content-MD5 header is the base64-encoded digest of the same hash, so a bit of encoding/decoding is needed to compare them.
I did not reuse the logic of the regular checksum in the default provider: if the request is encoded using aws-chunk, the request data is not equal to the object body and would have to be decoded twice (once in LocalStack and once in moto), so I run the check after moto has already calculated the ETag.
Same logic for the v3 provider: we first set the object so that the body is read only once, and we then verify the base64-encoded ETag against the provided Content-MD5 value.
Removed the xfail for the related test and improved it a bit.
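The ETag vs. Content-MD5 comparison described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper name etag_matches_content_md5 is hypothetical, and it assumes a single-part upload where the ETag is the plain MD5 hexdigest, optionally wrapped in double quotes.

```python
import base64
import hashlib


def etag_matches_content_md5(etag: str, content_md5: str) -> bool:
    """Hypothetical helper: compare a hex ETag with a base64 Content-MD5 header."""
    # ETags are commonly returned wrapped in double quotes; strip them first.
    hex_digest = etag.strip('"')
    try:
        # Content-MD5 carries the raw 16-byte MD5 digest, base64-encoded.
        expected = base64.b64decode(content_md5, validate=True)
        # The ETag carries the same digest, hex-encoded.
        actual = bytes.fromhex(hex_digest)
    except ValueError:
        # Malformed base64 or hex input (binascii.Error subclasses ValueError).
        return False
    return len(expected) == 16 and expected == actual


# Build a matching header/ETag pair from a sample body.
body = b"hello world"
digest = hashlib.md5(body).digest()
content_md5_header = base64.b64encode(digest).decode()
etag_value = f'"{digest.hex()}"'

print(etag_matches_content_md5(etag_value, content_md5_header))
```

Both values are decoded to raw bytes before comparing, which avoids depending on the letter case of the hex string. A real provider would raise the appropriate S3 error on a mismatch rather than return a boolean.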