[Trace PyDispatcher] Capture Vmapped autograd function as graph #146288
base: gh/yanboliang/62/base
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146288
Note: Links to docs will display an error until the docs builds have been completed.
❌ 5 New Failures as of commit 96cc5cc with merge base fa48757. NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
self.install_guards(GuardBuilder.FUNCTION_MATCH)
self.install_guards(GuardBuilder.TYPE_MATCH)
func_source = AttrSource(self.source, "__func__")
install_guard(func_source.make_guard(GuardBuilder.ID_MATCH))
Not sure if this is a bug in guarding the apply method of an autograd function. The original FUNCTION_MATCH triggers an ID check failure (below is the error stack), but it works correctly when changed to TYPE_MATCH on apply and ID_MATCH on apply.__func__.
Traceback (most recent call last):
File "/data/users/ybliang/debug/debug2.py", line 34, in <module>
print(fn(x))
File "/home/ybliang/local/pytorch/torch/_dynamo/eval_frame.py", line 570, in _fn
return fn(*args, **kwargs)
File "/home/ybliang/local/pytorch/torch/_dynamo/convert_frame.py", line 1400, in __call__
return self._torchdynamo_orig_callable(
File "/home/ybliang/local/pytorch/torch/_dynamo/convert_frame.py", line 565, in __call__
return _compile(
File "/home/ybliang/local/pytorch/torch/_dynamo/convert_frame.py", line 997, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/home/ybliang/local/pytorch/torch/_utils_internal.py", line 95, in wrapper_function
return function(*args, **kwargs)
File "/home/ybliang/local/pytorch/torch/_dynamo/convert_frame.py", line 726, in compile_inner
return _compile_inner(code, one_graph, hooks, transform)
File "/home/ybliang/local/pytorch/torch/_dynamo/convert_frame.py", line 862, in _compile_inner
check_fn = CheckFunctionManager(
File "/home/ybliang/local/pytorch/torch/_dynamo/guards.py", line 2466, in __init__
raise AssertionError(f"Guard check failed: {reasons}")
AssertionError: Guard check failed: 0/0: ___check_obj_id(G['Foo'].apply, 139997975477056)
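For context, here is a plain-Python sketch of what is likely happening; this is an assumption about the failure, not something verified against the guard internals. Since apply is exposed as a classmethod, every attribute access creates a fresh bound-method object, so an id() recorded at compile time no longer matches at guard-evaluation time, while apply.__func__ and type(apply) stay stable.

class Foo:
    @classmethod
    def apply(cls, x):  # stand-in for autograd.Function.apply, assumed to be a classmethod
        return x

a, b = Foo.apply, Foo.apply
print(a is b)                    # False: each access builds a new bound-method object
print(a.__func__ is b.__func__)  # True: ID_MATCH on apply.__func__ is stable
print(type(a) is type(b))        # True: TYPE_MATCH on apply is stable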
@@ -341,6 +341,17 @@ def call_function(
]:
    with torch._dynamo.side_effects.allow_side_effects_under_checkpoint(tx):
        return super().call_function(tx, args, kwargs)
elif self.fn is torch._functorch.autograd_function.vmapify_autograd_function:
just wondering, why didn't you do something like:
UserDefinedFunction(torch._functorch.autograd_function.vmapify_autograd_function).call_function(tx, args)
The current solution in this PR looks fine to me though
Yeah, the default inlining approach is better, but I ran into a few unsupported Dynamo features while handling the following case:
pytorch/torch/_functorch/autograd_function.py
Lines 502 to 511 in e68f508
Generated = type(
    name,
    (torch.autograd.Function,),
    {
        "forward": staticmethod(forward),
        "backward": staticmethod(backward),
        "jvp": staticmethod(jvp),
        "setup_context": staticmethod(setup_context),
        "generate_vmap_rule": True,
    },
This includes issues like constructing NestedUserFunctionVariable without a source, among others. Since I’d like to keep this PR focused on tracing the vmapped autograd function rather than addressing broader issues, I decided to go with this approach for now.
That said, I’m happy to revisit this later and migrate it to the inlining approach as a follow-up. I’ll add a TODO comment here to track it.
TODO sounds fine
def as_proxy(self):
    return self.fn_cls
Why does this have an as_proxy? We shouldn't be putting autograd.Functions into the graph
Yes, this is a typo; we don't actually use it, I will remove it.
I added suggestions for more testing. Code seems reasonable to me
# though this constraint could be relaxed in the future.
if (
    name == "apply"
    and self.fn_cls.__name__.startswith("Vmapped")
We should have some more robust way of identifying a vmapped autograd function. Since these are generated in Dynamo now, could we set a flag when constructing the AutogradFunctionVariable?
Good point! Setting a flag during construction is definitely a more robust approach. I’ll update it.
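A tiny, framework-free sketch of the flag-based check being discussed; AutogradFunctionVariable and is_vmapped here only mirror the conversation and are not the real Dynamo classes or fields.

from dataclasses import dataclass

@dataclass
class AutogradFunctionVariable:
    fn_cls: type
    is_vmapped: bool = False  # set by whichever code path vmapifies the function

def should_capture_as_hop(var: AutogradFunctionVariable, name: str) -> bool:
    # Robust: rely on the construction-time flag ...
    return name == "apply" and var.is_vmapped
    # ... instead of sniffing the generated class name:
    #   name == "apply" and var.fn_cls.__name__.startswith("Vmapped")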
# 1. If the autograd function is not vmapified:
#    - We can directly handle it by either treating it as allow_in_graph or
#      wrapping it as an AutogradFunctionApplyVariable HOP.
I don't think we ever allow_in_graph an autograd.Function unless the user has explicitly used allow_in_graph?
Yes, here it refers to users explicitly using the allow_in_graph decorator. I'll update the comment to clarify this.
@torch.compile(backend=eager, fullgraph=True)
def fn(x):
    return torch.vmap(Foo.apply)(x)
Can you do a more complicated (non-pointwise) test case with double vmap? Maybe use double vmap on the LinearFunction?
Will do!
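For reference, here is one way such a test might look. This is only a sketch of the reviewer's request (the non-pointwise LinearFunction example from the autograd docs, driven through double vmap and compiled with the eager backend); the shapes and in_dims are illustrative, not the test that was actually added.

import torch

class LinearFunction(torch.autograd.Function):
    generate_vmap_rule = True

    @staticmethod
    def forward(input, weight, bias):
        output = input.mm(weight.t())
        if bias is not None:
            output = output + bias.unsqueeze(0).expand_as(output)
        return output

    @staticmethod
    def setup_context(ctx, inputs, output):
        input, weight, bias = inputs
        ctx.save_for_backward(input, weight, bias)

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_output.mm(weight)
        grad_weight = grad_output.t().mm(input)
        grad_bias = grad_output.sum(0) if bias is not None else None
        return grad_input, grad_weight, grad_bias

def fn(input, weight, bias):
    # vmap over two leading batch dims of `input`; weight and bias are broadcast.
    inner = torch.vmap(LinearFunction.apply, in_dims=(0, None, None))
    return torch.vmap(inner, in_dims=(0, None, None))(input, weight, bias)

input = torch.randn(4, 3, 5, 6)   # (B1, B2, N, D)
weight = torch.randn(7, 6)        # (M, D)
bias = torch.randn(7)             # (M,)

expected = fn(input, weight, bias)
actual = torch.compile(fn, backend="eager", fullgraph=True)(input, weight, bias)
torch.testing.assert_close(actual, expected)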
# - The original autograd function (called when functorch transforms are active):
#   - Since we already wrap the vmapped autograd function as an AutogradFunctionApplyVariable HOP,
#     and the vmapped autograd function calls the original autograd function, we simply inline them.
if name == "apply" and not torch._C._are_functorch_transforms_active():
Er, what happens if functorch transforms are active but you have a regular autograd.Function? Something like:

def f(x, y):
    z = Foo.apply(y)
    return x * z

x = torch.randn(3)
y = torch.randn([])
vmap(f, (0, None))(x, y)
It goes into the else branch down below, which inlines the apply method. This is because we only want to call_apply on the vmapped autograd function and capture the fwd/bwd graphs there. While tracing the vmapped autograd function, it triggers a call to the original regular autograd function's apply method, which we should just inline.
@@ -130,6 +135,184 @@ def fn(x, y):
        # No recompile
        self.assertEqual(counter.frame_count, 1)

    def test_vmapped_autograd_function(self):
For testing, we have a lot of autograd.Functions we can test. A good, comprehensive way to do this is to copy-paste the following and add torch.compile(backend="eager") testing to it:
pytorch/test/functorch/test_vmap.py
Line 4341 in 01554c7
"test_vmap_exhaustive",
The vmap tests are able to generate inputs with various in_dims, so that better exercises the logic. They're also able to generate vmap(vmap(...)) tests.
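A rough sketch, under stated assumptions, of what "add torch.compile(backend='eager') testing" could look like: run the same vmapped callable eagerly and under compile, then compare. The helper name and call pattern are illustrative; the real test_vmap.py plumbing is more involved.

import torch

def check_compiled_vmap(fn, in_dims, *args):
    vmapped = torch.vmap(fn, in_dims=in_dims)
    expected = vmapped(*args)
    actual = torch.compile(vmapped, backend="eager", fullgraph=True)(*args)
    torch.testing.assert_close(actual, expected)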
# 3. `AutogradFunctionApplyVariable` requires `parent_source` to be non-None,
#    though this constraint could be relaxed in the future.
How difficult is it to relax this constraint? Otherwise the source we're generating here is incorrect, right?
I don’t think it’s very difficult, though I haven’t looked into it too deeply. I just want to keep this PR focused on its intended scope, but I’ll address it as a follow-up.
The source we generate here is correct; the issue is that we can’t generate a guard from it. The problem arises because it tries to generate guards on torch._functorch.autograd_function.vmapped_xxx.apply, which leads to errors during guard evaluation. This happens because the vmapped autograd function is created on the fly during compilation, and we don’t materialize it.
One possible solution could be adding a new guard specifically for objects generated dynamically.
sure
cool I think I am just looking for the double vmap tests
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames