[tp] improve parallelize_module API to support more cases by wanchaol · Pull Request #157182 · pytorch/pytorch · GitHub

Conversation

@wanchaol
Collaborator
@wanchaol wanchaol commented Jun 28, 2025

This PR improves the parallelize_module API to support more corner cases:

  1. If a plan entry is specified as "", the style is applied to the current module.
  2. If a plan entry has no corresponding submodule to apply to, a warning is raised and the entry is ignored.
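
The two cases above can be modeled with a small pure-Python sketch. This is not the PyTorch implementation: `Node` and `apply_plan` are hypothetical stand-ins for `nn.Module` and `parallelize_module`, the style is just a string, and matching child names with `fnmatch` is an assumption of this sketch.

```python
import warnings
from fnmatch import fnmatch


class Node:
    """Hypothetical stand-in for nn.Module: named children plus an applied style."""

    def __init__(self, **children):
        self.children = children
        self.style = None


def apply_plan(module, plan):
    """Apply each {path: style} entry of `plan` to the tree rooted at `module`."""
    for path, style in plan.items():
        if path == "":
            # Case 1: an empty path targets the current module itself.
            module.style = style
            continue
        head, _, rest = path.partition(".")
        matched = [c for name, c in module.children.items() if fnmatch(name, head)]
        if not matched:
            # Case 2: no submodule matches, so warn and skip this entry.
            warnings.warn(f"plan entry {path!r} matches no submodule; skipping")
            continue
        for child in matched:
            # When `rest` is "", the recursive call hits case 1 on the child.
            apply_plan(child, {rest: style})
    return module
```

In this model the "" entry is also what makes the recursion terminate cleanly: once a dotted path is fully consumed, the remaining path is "", which applies the style to the module that was reached.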

While working on this PR, I also found that the while-loop inside is not actually necessary and could produce some nasty behavior by modifying on the fly while iterating, so I removed the while loop.
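
As a generic illustration of the hazard mentioned here (this is not the loop that was removed from `parallelize_module`; both function names are hypothetical), mutating a Python list while iterating over it silently skips elements:

```python
def drain_while_iterating(tokens):
    """Buggy pattern: removing items from the list that is being iterated."""
    seen = []
    for tok in tokens:
        seen.append(tok)
        tokens.remove(tok)  # shifts later items left, so the next one is skipped
    return seen


def walk_snapshot(tokens):
    """Safer pattern: iterate over a snapshot and leave the input untouched."""
    return [tok for tok in tuple(tokens)]
```

Calling `drain_while_iterating(["a", "b", "c", "d"])` visits only `"a"` and `"c"`, while `walk_snapshot` visits all four items.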

cc @H-Huang @awgu @fegin @fduwjj @wz337 @wconstab @d4l3k

@pytorch-bot
pytorch-bot bot commented Jun 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/157182

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 6577ce0 with merge base 81759af (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/inductor oncall: distributed Add this issue/PR to distributed oncall triage queue labels Jun 28, 2025
@wanchaol wanchaol added release notes: distributed (dtensor) release notes category ciflow/trunk Trigger trunk jobs on your pull request and removed oncall: distributed Add this issue/PR to distributed oncall triage queue ciflow/inductor labels Jun 28, 2025
Contributor
@tianyu-l tianyu-l left a comment


makes sense to me, may need minor modifications before merging

@pytorch-bot pytorch-bot bot added ciflow/inductor oncall: distributed Add this issue/PR to distributed oncall triage queue labels Jun 30, 2025
@wanchaol
Collaborator Author

FYI, while working on this I also found that the while-loop inside is not actually necessary and could produce some nasty conflicts by modifying the list on the fly while iterating over it, so I removed the while loop.

@wanchaol
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).



Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged oncall: distributed Add this issue/PR to distributed oncall triage queue open source release notes: distributed (dtensor) release notes category


6 participants