[hop] Support more output types for `flat_apply` #146714


Closed
StrongerXi wants to merge 11 commits from the gh/StrongerXi/83/head branch

Conversation

@StrongerXi (Contributor) commented Feb 7, 2025

Stack from ghstack (oldest at bottom):

This patch enables `flat_apply` to support certain non-Tensor output
types, such as containers and graphable types. This will in turn enable
the upcoming `mark_traceable` to support more output types.

The patch also exposes a `func_to_graphable` helper rather than having
users call the lower-level `pytree.flatten(ConstantFunction(...))`.
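For intuition, here is a minimal, self-contained sketch of the flatten/unflatten round trip that `flat_apply` performs, now with a container output. It uses only `torch.utils._pytree`; `fn` and its dict output are illustrative stand-ins, not code from this PR.

```python
import torch
import torch.utils._pytree as pytree

def fn(x, scale):
    # A dict output: the kind of container output this PR enables.
    return {"scaled": x * scale, "dim": x.dim()}

args = (torch.randn(3), 2.0)
# flat_apply conceptually receives flat leaves plus a TreeSpec constant...
flat_args, in_spec = pytree.tree_flatten(args)
# ...unflattens them and invokes the underlying function...
out = fn(*pytree.tree_unflatten(flat_args, in_spec))
# ...and flattens the container output so only graphable leaves cross
# the graph boundary; the caller reassembles it via the output spec.
flat_out, out_spec = pytree.tree_flatten(out)
result = pytree.tree_unflatten(flat_out, out_spec)
assert result["dim"] == 1 and result["scaled"].shape == (3,)
```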

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames

pytorch-bot bot commented Feb 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146714

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 86a8231 with merge base 6061664:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

StrongerXi added 3 commits February 13, 2025 23:37
@pytorchmergebot (Collaborator) commented:

Starting merge as part of PR stack under #147572

pytorchmergebot pushed a commit that referenced this pull request Feb 26, 2025
## Context
> **Note:** `mark_traceable` was renamed to `nonstrict_trace` after
> offline discussion. The reasons are (1) it aligns with `torch.export`'s
> `nonstrict` notion, and (2) it more clearly conveys the intended behavior.

1. [Overall Design](https://docs.google.com/document/d/1O-dR2ZQaJQVt_v67AVcDCw2yJLtqgkZFwoXK0buEWRg/edit?tab=t.0)
2. [Dynamo graph representation with `torch._higher_order_ops.flat_apply`](https://docs.google.com/document/d/1YHl5nPTJvYeCPE5TO9uA18DPWNgUYGE4gCn6bFvXcBM/edit?tab=t.0#heading=h.xtw3hhbro4gn)

## Summary
This patch adds a `torch._dynamo.nonstrict_trace` decorator, which
currently is an enhanced version of `torch._dynamo.allow_in_graph` (see
docstring for their differences). Specifically, this patch focuses on
the UI and functionality prototyping/plumbing.

The main enhancement is supporting more input types; the implementation
challenge lies in reconstructing the input objects from Dynamo
`VariableTracker`s (while accounting for buffered side effects and
guards). This patch takes a middle ground (a simple implementation with
a bit of user labor, sketched in code after this list), by
1. asking the user to provide pytree registration for non-proxy-able
   input types,
2. letting Dynamo trace through `pytree_flatten` (which accounts for
   buffered side-effects and guards automatically),
3. and passing in the TreeSpec as a graph attribute constant into
   `torch._higher_order_ops.flat_apply` (which unflattens the inputs and
   invokes the underlying function).
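As a hedged illustration of steps 1-3 (not code from this PR), the sketch below registers a toy `Point` class with pytree and applies the decorator. `Point` is hypothetical, and the `torch._dynamo.nonstrict_trace` spelling follows this summary rather than a documented API.

```python
import torch
import torch.utils._pytree as pytree

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

# Step 1: user-provided pytree registration for the non-proxy-able type.
pytree.register_pytree_node(
    Point,
    lambda p: ((p.x, p.y), None),                 # flatten -> (children, context)
    lambda children, _context: Point(*children),  # unflatten
)

@torch._dynamo.nonstrict_trace
def trig(p):
    return torch.sin(p.x) + torch.cos(p.y)

@torch.compile(fullgraph=True)
def f(p):
    # Steps 2 and 3 happen inside Dynamo: it traces the pytree flattening
    # and passes the TreeSpec as a constant into flat_apply.
    return trig(p) * 2

print(f(Point(torch.randn(3), torch.randn(3))))
```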

## Next Steps
In subsequent patches, we will try to support the following:
- annotating class methods
- reads of global tensors
- inputs that contain `pytree.register_constant`-ed instances
- functions as inputs
- more output types (e.g., any pytree-registered type)
- `torch.nn.Module` instances as inputs

Pull Request resolved: #146367
Approved by: https://github.com/zou3519
ghstack dependencies: #146714
pytorchmergebot pushed a commit that referenced this pull request Feb 26, 2025
…46950)

This patch removes some duplicated name generation logic in Dynamo.

Pull Request resolved: #146950
Approved by: https://github.com/zou3519
ghstack dependencies: #146714, #146367
pytorchmergebot pushed a commit that referenced this pull request Feb 26, 2025
As title; also see
1. the new test `test_nonstrict_trace_on_method` for an example, and
2. the newly added comments for why methods need special treatment.
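A hedged guess at what the method case looks like, inferred from the test name above; the `Model` class is illustrative and the decorator spelling assumes `torch._dynamo.nonstrict_trace` as in the earlier summary.

```python
import torch

class Model(torch.nn.Module):
    @torch._dynamo.nonstrict_trace
    def helper(self, x):
        # The bound `self` argument is what requires the special
        # treatment on methods mentioned above.
        return torch.sin(x)

    def forward(self, x):
        return self.helper(x) + 1

out = torch.compile(Model(), fullgraph=True)(torch.randn(3))
```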

Pull Request resolved: #147571
Approved by: https://github.com/zou3519
ghstack dependencies: #146714, #146367, #146950
pytorchmergebot pushed a commit that referenced this pull request Feb 26, 2025
…`-ed function (#147572)

As title. Without this patch we get the following error:

Tweaking the `allow_non_fake_inputs` flag on tensor mode doesn't quite
work for AOTAutograd, which also needs to fake-tensor-propagate the
`nonstrict_trace`-ed function, but that's _after_ Dynamo has handled the
`nonstrict_trace` processing and put the `flat_apply(...)` node into the graph.

So we can't easily and temporarily enable the `allow_non_fake_inputs`
flag on the current fake mode when AOTAutograd processes a `flat_apply`
node from Dynamo's `nonstrict_trace` handling. After discussing with
zou3519, I decided to add a global `FakeTensorTLS` that contains an
`allow_non_fake_inputs_override` flag, and to patch the
`nonstrict_trace`-ed function to temporarily tweak this flag during its
execution (sketched below).
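An illustrative sketch of the TLS-override pattern described here, not the actual PyTorch implementation; the `FakeTensorTLS` and `allow_non_fake_inputs_override` names come from the text.

```python
import threading
from contextlib import contextmanager

class FakeTensorTLS(threading.local):
    # None means "defer to the fake mode's own allow_non_fake_inputs".
    def __init__(self):
        self.allow_non_fake_inputs_override = None

fake_tensor_tls = FakeTensorTLS()

@contextmanager
def override_allow_non_fake_inputs(value=True):
    # Temporarily tweak the global flag while the nonstrict_trace-ed
    # function runs, then restore the previous value.
    prev = fake_tensor_tls.allow_non_fake_inputs_override
    fake_tensor_tls.allow_non_fake_inputs_override = value
    try:
        yield
    finally:
        fake_tensor_tls.allow_non_fake_inputs_override = prev
```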

Pull Request resolved: #147572
Approved by: https://github.com/zou3519
ghstack dependencies: #146714, #146367, #146950, #147571
github-actions bot deleted the gh/StrongerXi/83/head branch March 30, 2025 02:17