[FSDP][StateDict] Allow FULL_STATE_DICT option for 2D by wz337 · Pull Request #120837 · pytorch/pytorch · GitHub

[FSDP][StateDict] Allow FULL_STATE_DICT option for 2D #120837

Closed
wz337 wants to merge 1 commit from allow_2D_full_state_dict

wz337 (Contributor) commented Feb 28, 2024

Fixes #120722

TL;DR for the issue:
Since users are expected to retrieve the state dict through `get_model_state_dict`, I think it's fine to remove the warning and the RuntimeError.
More context in #120722.

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @tianyu-l @wconstab @yf225 @chauhang

pytorch-bot bot commented Feb 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120837

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 2ab6d61 with merge base 5a0a964:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following jobs failed, likely due to flakiness present on trunk, and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the `release notes: distributed (fsdp)` label Feb 28, 2024
github-actions bot added the `oncall: distributed` and `ciflow/inductor` labels Feb 28, 2024
wz337 added the `ciflow/periodic` label and removed the `oncall: distributed` and `ciflow/inductor` labels Feb 28, 2024
wz337 marked this pull request as ready for review February 28, 2024 23:13
wz337 force-pushed the allow_2D_full_state_dict branch from e70e223 to 2ab6d61 February 29, 2024 01:29
github-actions bot added the `oncall: distributed` and `ciflow/inductor` labels Feb 29, 2024
wz337 requested a review from fegin February 29, 2024 18:29
wz337 (Contributor, Author) commented Mar 5, 2024

@pytorchmergebot merge

pytorch-bot bot added the `ciflow/trunk` label Mar 5, 2024
pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

mvpatel2000 pushed a commit to mvpatel2000/pytorch that referenced this pull request Mar 5, 2024
Fixes pytorch#120722

TL;DR for the issue:
As users are expected to use get_model_state_dict to do state_dict retrieval, I think it's fine to remove the warning and RuntimeError.
More context in pytorch#120722.

Pull Request resolved: pytorch#120837
Approved by: https://github.com/Skylion007
atalman pushed a commit that referenced this pull request Mar 8, 2024

Fixes #120722

TL;DR for the issue:
As users are expected to use get_model_state_dict to do state_dict retrieval, I think it's fine to remove the warning and RuntimeError.
More context in #120722.

Pull Request resolved: #120837
Approved by: https://github.com/Skylion007

Co-authored-by: wz337 <wz337@cornell.edu>
Labels
ciflow/inductor
ciflow/periodic - Trigger jobs ran periodically on master (periodic.yml) on the PR
ciflow/trunk - Trigger trunk jobs on your pull request
Merged
oncall: distributed - Add this issue/PR to distributed oncall triage queue
release notes: distributed (fsdp) - release notes category
Development

Successfully merging this pull request may close these issues.

Allow Full State Dict with 2D FSDP + TP
3 participants