8000 [dynamo] use proxies to nn.Module in dynamo generated GraphModules by williamwen42 · Pull Request #120756 · pytorch/pytorch · GitHub
[go: up one dir, main page]

Skip to content

[dynamo] use proxies to nn.Module in dynamo generated GraphModules #120756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

williamwen42
Copy link
Member
@williamwen42 williamwen42 commented Feb 28, 2024

Stack from ghstack (oldest at bottom):

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @aakhundov

[ghstack-poisoned]
Copy link
pytorch-bot bot commented Feb 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120756

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 7bb33b1 with merge base 61ff41f (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Feb 28, 2024
[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Feb 29, 2024
[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 4, 2024
…hModules"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 6, 2024
…hModules"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 8, 2024
…hModules"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 12, 2024

# Super hacky - returns a proxy object for a nn.Module that references
# to the original nn.Module, but is a distinct object.
def nn_module_proxy(mod):
Copy link
Contributor
@mlazos mlazos Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you name this wrapper instead, due to the other uses of proxy?

return mod
try:
# best effort shallow copy
proxy = torch.nn.Module.__new__(torch.nn.Module)
Copy link
Contributor
@mlazos mlazos Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use a WeakRef here instead? and do some routing to the underlying weakref'd object?

Then we could use the finalizer to cleanup the graph/invalidate anything else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted using weakrefs but discovered that a big difficulty in using weakrefs is that some downstream code (HOPs, export, backends) may require the actual module itself.

# best effort shallow copy
proxy = torch.nn.Module.__new__(torch.nn.Module)
proxy.__class__ = mod.__class__
proxy.__dict__ = mod.__dict__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't the parameters will still be held alive by this?

Copy link
Member Author
@williamwen42 williamwen42 Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameters are lifted as inputs to GraphModules, so AFAIK a GraphModule should never have a Parameter as an attribute.

The point of this nn.Module wrapper/proxy is so that when the user deletes all their references to the original module, dynamo cache invalidation will work because the compiled GraphModule doesn't hold a reference to the original module.

The parameters will be freed when cache invalidation runs.

…hModules"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 14, 2024
@williamwen42
Copy link
Member Author

@jansel do you think it would be better if we explicitly write out which attributes of nn.Modules (e.g. params, modules, hooks, etc.) we want to copy over to the "proxy"/wrapper? I'm thinking it would be better for backend writers.

@jansel
Copy link
Contributor
jansel commented Mar 19, 2024

Yeah we could prune things down.

…hModules"

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 19, 2024
@williamwen42
Copy link
Member Author

@anijain2305 and I talked offline - we can keep the __dict__ approach here since we have more control over the kinds of modules we're expecting from output_graph.py (i.e. inbuilt torch.nn.Modules, HOP-generated GraphModules)

@williamwen42 williamwen42 added the topic: not user facing topic category label Mar 19, 2024
…hModules"


Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes #120631 (not entirely sure why)

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang aakhundov

[ghstack-poisoned]
williamwen42 added a commit that referenced this pull request Mar 20, 2024
@williamwen42
Copy link
Member Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 21, 2024
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

facebook-github-bot pushed a commit to pytorch/benchmark that referenced this pull request Mar 22, 2024
Summary:
Fixes remaining refleaks found when debugging pytorch/pytorch#119607, tests added in pytorch/pytorch#120657.

Also fixes some tests that xfail: pytorch/pytorch#120631 (not entirely sure why), but introduced tests now fail.

X-link: pytorch/pytorch#120756
Approved by: https://github.com/jansel

Reviewed By: huydhn

Differential Revision: D55225127

Pulled By: williamwen42

fbshipit-source-id: 0d402deb4246d981304985761b94f637970fabce
pytorch-bot bot pushed a commit that referenced this pull request Apr 22, 2024
…120756)

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

Pull Request resolved: #120756
Approved by: https://github.com/jansel
@github-actions github-actions bot deleted the gh/williamwen42/24/head branch April 25, 2024 01:54
williamwen42 added a commit that referenced this pull request May 15, 2024
…120756)

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

Pull Request resolved: #120756
Approved by: https://github.com/jansel
williamwen42 added a commit that referenced this pull request May 15, 2024
…120756)

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

Pull Request resolved: #120756
Approved by: https://github.com/jansel
atalman pushed a commit that referenced this pull request May 22, 2024
…126332)

* [dynamo] use proxies to nn.Module in dynamo generated GraphModules (#120756)

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

Pull Request resolved: #120756
Approved by: https://github.com/jansel

* [dynamo] use proxies to nn.Module in dynamo generated GraphModules (#120756)

Fixes remaining refleaks found when debugging #119607, tests added in #120657.

Also fixes some tests that xfail: #120631 (not entirely sure why), but introduced tests now fail.

Pull Request resolved: #120756
Approved by: https://github.com/jansel
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0