8000 fix CBOR timestamp parsing by alexrashed · Pull Request #11133 · localstack/localstack · GitHub
[go: up one dir, main page]

Skip to content

fix CBOR timestamp parsing #11133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 4, 2024
Merged

fix CBOR timestamp parsing #11133

merged 2 commits into from
Jul 4, 2024

Conversation

alexrashed
Copy link
Member
@alexrashed alexrashed commented Jul 3, 2024

Motivation

Kinesis supports CBOR (Concise Binary Object Representation) encoding as Content-Type in addition to JSON.
CBOR support was added for LocalStack with #6494 by extending our ASF parsers and serializers, and patching CBOR support into botocore (when being used as a proxy to kinesis-mock).
Unfortunately, AWS does not properly implement the CBOR spec when it comes to timestamp handling:

  • RFC8949 defines CBOR timestamps as floating-point epoch seconds:

    The tag content MUST be an unsigned or negative integer (major types 0 and 1) or a floating-point number (major type 7 with additional information 25, 26, or 27).
    ...
    To indicate fractional seconds, floating-point values can be used within tag number 1 instead of integer values.

  • AWS however uses milliseconds as nicely described in this issue: Kinesis: In GetShardIterator request with AT_TIMESTAMP, timestamp is passed incorrectly based on CBOR Specification aws/aws-sdk-java-v2#4661
    • This means that when using a standard-compliant parser, the timestamps multiplied by 1000 (resulting in an error - "ValueError: year 56474 is out of range"):
      2024-07-03T13:30:33.095 ERROR --- [et.reactor-1] l.aws.handlers.logging     : exception during call chain
      ValueError: year 56474 is out of range
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/protocol/parser.py", line 871, in _parse_body_as_json
          return cbor2.loads(body_contents)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
      _cbor2.CBORDecodeValueError: error decoding datetime from epoch
      
      The above exception was the direct cause of the following exception:
      
      Traceback (most recent call last):
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/rolo/gateway/chain.py", line 166, in handle
          handler(self, self.context, response)
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/handlers/service.py", line 63, in __call__
          return self.parse_and_enrich(context)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/handlers/service.py", line 67, in parse_and_enrich
          operation, instance = parser.parse(context.request)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/protocol/parser.py", line 172, in wrapper
          return func(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/protocol/parser.py", line 921, in parse
          final_parsed = self._do_parse(request, shape, uri_params)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/protocol/parser.py", line 933, in _do_parse
          parsed = self._handle_json_body(request, shape, uri_params)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localstack/aws/protocol/parser.py", line 945, in _handle_json_body
          parsed_json = self._parse_body_as_json(request)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/code/localstack/.venv/lib/python3.11/site-packages/localst
      8000
      ack/aws/protocol/parser.py", line 873, in _parse_body_as_json
          raise ProtocolParserError("HTTP body could not be parsed as CBOR.") from e
      localstack.aws.protocol.parser.ProtocolParserError: HTTP body could not be parsed as CBOR.
      2024-07-03T13:30:33.097  INFO --- [et.reactor-1] localstack.request.http    : POST / => 500
      2024-07-03T13:30:33.151 ERROR --- [t.reactor-11] l.aws.handlers.logging     : exception during call chain
      

This already caused a few issues in the past, which were addressed with #6791 and #11071.
But these fixes are only working if the datetime fields are serialized as numbers instead of native timestamps.
The AWS SDK does perform a native timestamp serialization, which again leads to parsing errors (mentioned above, see reproducer linked in "Testing").

This PR addresses the CBOR timestamp parsing/serialization issues directly when using cbor2 to en-/decode the binary data.

Unfortunately, cbor2 does not easily allow patching standard en-/decoder functions, which is why the fix only works if the Python-native cbor loading code is explicitly imported from the respective private modules (happy for any input on how to make this more stable / better): agronholm/cbor2#192

Changes

  • Patches cbor2 datetime encoding and decoding.
  • Adjusts all imports of cbor2 to explicitly use the Python implementation instead of the C / native library.
  • Fixes the parser (removes an invalid assumption that binary data is not also base64 encoded).
  • Fixes the serializer (removed timestamp handling in the serializer - done natively by cbor2 now).

Testing

  • I created a reproducer using the AWS Java SDK to showcase this issue: https://github.com/alexrashed/kinesis-cbor-reproducer and verified the changes here with this reproducer / the AWS Java SDK.
  • I added an AWS-verified plain-request CBOR integration test, testing CBOR serialization and deserialization.

Fixes #9699

@alexrashed alexrashed added aws:kinesis Amazon Kinesis area: asf semver: patch Non-breaking changes which can be included in patch releases labels Jul 3, 2024
@alexrashed alexrashed added this to the 3.6 milestone Jul 3, 2024
@alexrashed alexrashed self-assigned this Jul 3, 2024
Copy link
github-actions bot commented Jul 3, 2024

S3 Image Test Results (AMD64 / ARM64)

  2 files  ±0    2 suites  ±0   3m 24s ⏱️ -3s
404 tests ±0  352 ✅ ±0   52 💤 ±0  0 ❌ ±0 
808 runs  ±0  704 ✅ ±0  104 💤 ±0  0 ❌ ±0 

Results for commit c0db4c5. ± Comparison against base commit 58313a7.

♻️ This comment has been updated with latest results.

Copy link
github-actions bot commented Jul 3, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 34m 35s ⏱️ - 8m 34s
3 157 tests +1  2 762 ✅ +2  395 💤 ±0  0 ❌  - 1 
3 159 runs  +1  2 762 ✅ +2  397 💤 ±0  0 ❌  - 1 

Results for commit c0db4c5. ± Comparison against base commit 58313a7.

♻️ This comment has been updated with latest results.

@alexrashed alexrashed force-pushed the fix-cbor-timestmps branch from 7a92481 to d91991e Compare July 4, 2024 08:40
@alexrashed alexrashed force-pushed the fix-cbor-timestmps branch 2 times, most recently from 32de83f to c0db4c5 Compare July 4, 2024 10:02
@alexrashed alexrashed requested a review from bentsku July 4, 2024 10:10
@alexrashed alexrashed marked this pull request as ready for review July 4, 2024 10:10
@alexrashed alexrashed requested a review from thrau as a code owner July 4, 2024 10:10
Copy link
Contributor
@bentsku bentsku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, very nice and complete write up! Thanks a lot, it was very easy to understand, follow the issue and the different linked reports.

It introduces a tight coupling to the internal Python implementation, where we won't follow if anything change from the library without manually checking their code. I don't see an easy way out without forking and tracking upstream.
But as you've pointed out, the issue is so severe that, to me, this is a good trade off! It's well documented, so it should not be too hard to maintain in the future.

LGTM, maybe we can have a second pair of eyes just to be sure on the patching side 😄

@alexrashed alexrashed force-pushed the fix-cbor-timestmps branch from c0db4c5 to 2880f8f Compare July 4, 2024 12:27
@alexrashed alexrashed merged commit b612563 into master Jul 4, 2024
13 of 14 checks passed
@alexrashed alexrashed deleted the fix-cbor-timestmps branch July 4, 2024 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: asf aws:kinesis Amazon Kinesis semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP body could not be parsed as CBOR
2 participants
0