10000 Step Functions: Improve Nested Map Run Stability by MEPalma · Pull Request #12343 · localstack/localstack · GitHub
[go: up one dir, main page]

Skip to content

Step Functions: Improve Nested Map Run Stability #12343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 5, 2025

Conversation

MEPalma
Copy link
Contributor
@MEPalma MEPalma commented Mar 5, 2025

Motivation

Currently, the Step Functions v2 interpreter encounters issues when evaluating nested MapRun calls distributed across multiple MapRuns #12335. This arises from two main issues: certain map run components inappropriately use member variables to share state, and the worker number management is shared across multiple map runs, causing no workers to start for jobs beyond the first job in nested scenarios.

Changes

This pull request addresses the anti-pattern of stateful objects in the map run logic by making them stateless. Additionally, it corrects the creation of workers so they are appropriately initiated for all nested map runs.

Testing

Added a relevant snapshot test to verify the correct functioning of distributed nested map runs without maximum concurrency settings.

@MEPalma MEPalma added the semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases label Mar 5, 2025
@MEPalma MEPalma added this to the 4.3 milestone Mar 5, 2025
@MEPalma MEPalma self-assigned this Mar 5, 2025
Copy link
github-actions bot commented Mar 5, 2025

LocalStack Community integration with Pro

    2 files      2 suites   31m 3s ⏱️
1 437 tests 1 364 ✅ 73 💤 0 ❌
1 439 runs  1 364 ✅ 75 💤 0 ❌

Results for commit 232a967.

♻️ This comment has been updated with latest results.

Copy link
Member
@joe4dev joe4dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM functionally, I just added a clarification question.

Feel free to merge :)

def _map_run(self, env: Environment) -> None:

def _map_run(
self, env: Environment, eval_input: DistributedIterationComponentEvalInput
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice conversion into statelessness using eval_input as parameter 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs: Would it make sense to add a comment about the statelessness here for our future selves?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll leave a comment in the top level class for iterations

elif isinstance(self.iteration_component, InlineItemProcessor):
eval_input = InlineItemProcessorEvalInput(
env.map_run_record_pool_manager.add(map_run_record)
if isinstance(self.iteration_component, DistributedIterator):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs: It might be helpful to explain the difference of DistributedIteratorEvalInput and DistributedItemProcessorEvalInput in a clarifying comment here

}
}
},
"MaxConcurrency": 9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me clarifying why the test case aims to cover no_max_concurrency, but we have two MaxConcurrency configurations in the state machine definition here?
This might be worth clarifying in a comment as well.

Copy link
< 8000 /details>
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of the test is to not limit concurrency, which we always do to 1 in the test suite (MaxConcurrency: 1). We achieve this by still bounding the max concurrency to an upper limit (just not the default maximum of removing the MaxConcurrency setting).

@MEPalma MEPalma merged commit 976046d into master Mar 5, 2025
31 checks passed
@MEPalma MEPalma deleted the MEP-SFN-fix_evaluation_nested_map_runs branch March 5, 2025 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: minor Non-breaking changes which can be included in minor releases, but not in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0