Migrate dependencies dependents sets from TaskState to Key #9042
This is a follow-up to #9036 and builds directly on top of it.
This replaces the elements in `TaskState.dependencies` with the keys instead of the `TaskState` objects themselves. This has a few benefits:

- It makes the `TaskState` overhead lighter. A few months ago we were investigating GC overhead on the scheduler, and the `TaskState` objects add to it in a non-trivial way. The fewer references we keep around, the better. Besides, these dependencies/dependents links are reference cycles that the GC has to follow. I don't expect this change to have a measurable impact, but I believe this direction is healthy for the scheduler as a whole.
- [...] `Task` class (`TaskState.run_spec`) and reduce `update_graph` runtime and memory overhead a little (cf. a simple, empty set already needs 216 B on py3.10; see also "Reduce memory usage of scheduler process - Optimize scheduler.py::TaskState class" #8331).
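
To illustrate the idea, here is a minimal sketch of key-based dependency links. It is not the scheduler code: `ToyTaskState`, `add_dependency`, and `resolve_dependencies` are hypothetical names made up for this example, and keys are simplified to plain strings. The point is only that each task holds keys and the objects are looked up through a central `tasks` mapping when needed, so the task objects no longer reference each other directly.

```python
from __future__ import annotations

import sys
from dataclasses import dataclass, field

Key = str  # assumption: keys are simplified to plain strings in this sketch


@dataclass(eq=False)
class ToyTaskState:
    """Hypothetical stand-in for the scheduler's TaskState."""

    key: Key
    # Store keys rather than ToyTaskState objects, so two tasks never
    # hold direct references to each other.
    dependencies: set[Key] = field(default_factory=set)
    dependents: set[Key] = field(default_factory=set)


def add_dependency(tasks: dict[Key, ToyTaskState], key: Key, dep: Key) -> None:
    """Record that `key` depends on `dep`, using keys on both sides."""
    tasks[key].dependencies.add(dep)
    tasks[dep].dependents.add(key)


def resolve_dependencies(tasks: dict[Key, ToyTaskState], key: Key) -> list[ToyTaskState]:
    """Look up the task objects behind the stored keys, in sorted key order."""
    return [tasks[k] for k in sorted(tasks[key].dependencies)]


tasks = {k: ToyTaskState(k) for k in ("a", "b", "c")}
add_dependency(tasks, "c", "a")
add_dependency(tasks, "c", "b")

deps_of_c = [ts.key for ts in resolve_dependencies(tasks, "c")]
# Size of an empty set on the running interpreter (varies by version/platform)
empty_set_bytes = sys.getsizeof(set())
print(deps_of_c, empty_set_bytes)
```

The trade-off in this sketch is an extra dictionary lookup whenever the object behind a key is needed, in exchange for fewer object-to-object references.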