Fix flaky lambda test event retry reserved concurrency by joe4dev · Pull Request #12441 · localstack/localstack

Fix flaky lambda test event retry reserved concurrency #12441

Merged

Conversation

joe4dev
Member
@joe4dev joe4dev commented Mar 26, 2025

Motivation

The test test_reserved_concurrency_async_queue is currently skipped due to flakiness.

Changes

  • Add aws_client_no_retry fixture and use it in the testcase
  • Re-design test case to use SQS-based notifications instead of unpredictable sleep
  • Add requests_id-based assertions
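The PR's test code is not shown in this conversation, so the following is only a rough sketch of the idea behind the second bullet: poll a queue until the expected notifications arrive instead of sleeping for a fixed time. The names `wait_for_messages` and `sqs_receiver` are hypothetical; the only SQS calls assumed are boto3's `receive_message` and `delete_message`.

```python
import time
from typing import Callable


def wait_for_messages(
    receive: Callable[[], list[dict]],
    expected_count: int,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> list[dict]:
    """Poll `receive` until `expected_count` messages arrived or `timeout` elapsed."""
    deadline = time.monotonic() + timeout
    messages: list[dict] = []
    while len(messages) < expected_count:
        messages.extend(receive())
        if len(messages) >= expected_count:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"received {len(messages)} of {expected_count} messages"
            )
        time.sleep(interval)
    return messages


def sqs_receiver(sqs_client, queue_url: str) -> Callable[[], list[dict]]:
    """Wrap an SQS client so that each call polls the queue once (hypothetical helper)."""

    def _receive() -> list[dict]:
        response = sqs_client.receive_message(
            QueueUrl=queue_url, WaitTimeSeconds=1, MaxNumberOfMessages=10
        )
        # The "Messages" key is absent when the queue returned nothing.
        messages = response.get("Messages", [])
        for message in messages:
            # Delete each message so it is not delivered again on a later poll.
            sqs_client.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
        return messages

    return _receive
```

In a test, the function under test would send a message to `queue_url` when it runs, and the test would call `wait_for_messages(sqs_receiver(sqs_client, queue_url), expected_count=...)` before asserting. The test then waits only as long as needed and fails with a clear timeout otherwise.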

Testing

Temporarily using a botocore configuration with increased retries makes the retry problem reproducible and causes the test to fail. This shows that we need an aws_client_no_retry client to mitigate potential flakiness and avoid unnecessary retries.

To reproduce:

  1. Ensure that retries are not disabled by configuration: TEST_DISABLE_RETRIES_AND_TIMEOUTS=0
  2. Use the following config:
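import botocore.config
import pytest

# Inside the test: aws_client, aws_client_factory, and fn_arn come from the test's fixtures and setup.
# Raise the retry limit so the client keeps retrying the throttled invocation.
retry_delay_config = botocore.config.Config(retries={"max_attempts": 10})
retry_delay_client = aws_client_factory(config=retry_delay_config)
with pytest.raises(aws_client.lambda_.exceptions.TooManyRequestsException) as e:
    retry_delay_client.lambda_.invoke(
        FunctionName=fn_arn, InvocationType="RequestResponse"
    )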
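
To illustrate why client-side retries can interfere with a test that expects a throttling error, here is a toy model. It is not botocore's retry implementation (real handlers also back off between attempts), and all names in it are made up for illustration. It only shows that each retry is one more request to the service before the exception finally surfaces.

```python
class TooManyRequests(Exception):
    """Stand-in for a throttling error such as Lambda's TooManyRequestsException."""


def call_with_retries(operation, max_retries):
    """Call `operation`, retrying up to `max_retries` times on throttling.

    Simplified on purpose: no backoff delay between attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return operation()
        except TooManyRequests:
            # Give up once the initial attempt plus all retries are used.
            if attempts > max_retries:
                raise


class ThrottledFunction:
    """Simulates a function that rejects every invocation with a throttling error."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        raise TooManyRequests("Rate Exceeded.")


def count_calls(max_retries):
    """Return how many requests reach the function before the error surfaces."""
    fn = ThrottledFunction()
    try:
        call_with_retries(fn, max_retries)
    except TooManyRequests:
        pass
    return fn.calls
```

In this model, `count_calls(0)` results in a single request, while `count_calls(10)` results in eleven. A test that asserts on the number or order of invocations would see the extra requests and wait through the retries.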

joe4dev added 3 commits March 26, 2025 11:47
This simplifies the test case such that no transformer is required and makes the notify more explicit through a traceable `queue_url` reference.
@joe4dev joe4dev added the semver: patch Non-breaking changes which can be included in patch releases label Mar 26, 2025
@joe4dev joe4dev added this to the 4.4 milestone Mar 26, 2025
@joe4dev joe4dev self-assigned this Mar 26, 2025
github-actions bot commented Mar 26, 2025

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 50m 41s ⏱️ - 1m 33s
4 306 tests +1  3 986 ✅ +2  320 💤  - 1  0 ❌ ±0 
4 308 runs  +1  3 986 ✅ +2  322 💤  - 1  0 ❌ ±0 

Results for commit ff17ef5. ± Comparison against base commit 8489276.

♻️ This comment has been updated with latest results.

@joe4dev joe4dev marked this pull request as ready for review March 27, 2025 10:11
Member
@dfangl dfangl left a comment

LGTM, nice fix of the test!

This fixture can be used to obtain Boto clients with disabled retries for testing.
botocore docs: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#configuring-a-retry-mode

Use this client when testing exceptions (i.e., with pytest.raises(...)) or expected errors (e.g., status code 500)
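
The fixture's implementation is not visible in this conversation. As a minimal sketch of how retries can be disabled, assuming a client factory that accepts a `config` argument like the `aws_client_factory(config=...)` call in the PR description: botocore's `Config` accepts `retries={"total_max_attempts": 1}`, where the total includes the initial request, so a value of 1 means no retries. The helper name `make_no_retry_client` is hypothetical.

```python
from typing import Any, Callable

# Retry settings for botocore.config.Config(retries=...).
# "total_max_attempts" counts the initial request, so 1 means "never retry".
NO_RETRY_SETTINGS: dict[str, Any] = {"total_max_attempts": 1}


def make_no_retry_client(client_factory: Callable[..., Any], **factory_kwargs: Any) -> Any:
    """Build a client through `client_factory` with client-side retries disabled.

    Hypothetical helper; `client_factory` is assumed to accept a `config` keyword.
    """
    # Imported lazily so this sketch can be loaded without botocore installed.
    from botocore.config import Config

    no_retry_config = Config(retries=dict(NO_RETRY_SETTINGS))
    return client_factory(config=no_retry_config, **factory_kwargs)
```

A pytest fixture could wrap this helper and return its result, so that tests which expect a retried error request the no-retry client by name.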
Member

Maybe we can clarify here that this is mostly needed when testing exceptions that are retried? That is, all those listed here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#legacy-retry-mode (or those matching the HTTP status codes also listed there).
For a "ResourceNotFound" exception, for example, this is not necessary.

Member Author

Yeah, we are indeed using the legacy retry mode, which makes things even worse 😬.

I wanted to keep the advice simple and durable, given that there are no adverse effects of using the no_retry client with a non-retrying exception. Nevertheless, I added the clarification so that the advice is as accurate and actionable as possible for the current state.

@joe4dev joe4dev merged commit ccddefd into master Mar 28, 2025
25 of 27 checks passed
@joe4dev joe4dev deleted the fix-flaky-lambda-test-event-retry-reserved-concurrency branch March 28, 2025 15:48