[dynamic shapes] guard_or_false for computeStorageNbytes #150483
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150483
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 unrelated failure.) As of commit 5a9460e with merge base cbcb57d. UNSTABLE - the following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_export/serde/serialize.py
Outdated
@@ -1701,6 +1702,9 @@ def _process_sym_expr(sym: sympy.Expr, hint: Optional[Union[int, bool, float]] =
         compiler_min=vr.lower,  # type: ignore[arg-type]
         compiler_max=vr.upper,  # type: ignore[arg-type]
     )
+    # ShapeEnv meta
+    if isinstance(sym, sympy.Symbol):
+        self.shape_env.var_to_stack[sym] = CapturedTraceback.extract(skip=1)
fixes one-off deserialization issue
@@ -160,7 +160,7 @@ SymInt computeStorageNbytes(
   // of the last element according to stride
   SymInt size = 1;
   for (const auto i : c10::irange(sizes.size())) {
-    if (TORCH_GUARD_SIZE_OBLIVIOUS(sizes[i].sym_eq(0))) {
+    if (TORCH_GUARD_OR_FALSE(sizes[i].sym_eq(0))) {
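As a toy illustration (plain Python, not PyTorch internals), guard_or_false folds a data-dependent boolean to False instead of raising a data-dependent error, so the zero-size fast path is simply skipped when the size is unbacked. The sketch below models the computeStorageNbytes loop under that assumption; all names are hypothetical stand-ins:

```python
def storage_nbytes(sizes, strides, itemsize, storage_offset=0):
    # Compute one past the last element reachable according to strides,
    # then scale by the item size, mirroring aten's computeStorageNbytes.
    size = 1
    for sz, st in zip(sizes, strides):
        # Stand-in for guard_or_false(sizes[i].sym_eq(0)): with an unbacked
        # size this test would fold to False and the loop would continue.
        if sz == 0:
            return 0
        size += st * (sz - 1)
    return itemsize * (storage_offset + size)
```

For a contiguous 2x3 float32 tensor this gives 4 * (1 + 3*1 + 1*2) = 24 bytes, and any statically-zero size short-circuits to 0.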
Would it make sense to add a runtime assert in this case? Something like:
torch._check(sizes[i].sym_ne(0), lambda: f"We assume that unbacked size {sizes[i]} is not 0, but it turned out to be zero at runtime")
Alternatively, if it's OK for this function to return something greater than the actual storage number of bytes, at line 169 we can make sure we do not return a negative:
return Max(itemsize_bytes * (storage_offset + size), 0);
Isn't it ok in this case? Like if the tensor numel is 0 at runtime, aren't we literally storing 0 bytes for the tensor?
The output if we skip this won't be zero though? Like if we return false, but the numel is actually 0, then at line 167
size += strides[i] * (sizes[i] - 1);
size would be 1 + strides[i] * (-1). Is the stride in that case guaranteed to be 1? If not, we will get 1 - strides[0], which is not always 0, no?
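The arithmetic concern above can be checked concretely. A standalone sketch (hypothetical helper name) of the loop body with the sym_eq(0) early return skipped entirely:

```python
def last_index_plus_one(sizes, strides):
    # One past the last reachable element, as in the loop at line 167,
    # but without the zero-size fast path.
    size = 1
    for sz, st in zip(sizes, strides):
        size += st * (sz - 1)
    return size

last_index_plus_one([0], [1])  # 1 + 1*(0-1) == 0: accidentally correct
last_index_plus_one([0], [4])  # 1 + 4*(0-1) == -3: negative when stride != 1
```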
I guess my question is: did you check what the actual output would be at runtime if we skip this one when the size is 0, and whether that breaks things?
One way to see what could fail is this:
#151172
Now I understand that we are already doing this right now, and that we are just extending it slightly in this case by making it work for u0-u2, etc., instead of just u0, u0+u2, ...
But let's wait for the test above; I think if things fail, it's probably better to fix the soundness and insert a runtime assert on a DDE that it's not actually zero.
If nothing fails, we can debug it to understand what happens and decide.
it caused a lot of issues, see CI before I removed the check: https://hud.pytorch.org/pytorch/pytorch/pull/150483?sha=ccc45c38c60db864098da4ce31647bad4f7eee45
Can you add a comment like the following:
# This used to be TORCH_GUARD_SIZE_OBLIVIOUS, but since any size is always >= 0, assuming that TORCH_GUARD_SIZE_OBLIVIOUS was safe, we extended the assumption to all other unbacked expressions.
Also, can we change
size += strides[i] * (sizes[i] - 1);
to
size += strides[i] * max(0, (sizes[i] - 1));
I just do not want this to ever possibly return a negative.
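A quick standalone sketch (hypothetical helper name, not the aten code) of how the proposed clamp behaves:

```python
def last_index_plus_one_clamped(sizes, strides):
    # Same loop, but max(0, sz - 1) keeps each term non-negative,
    # so the accumulated size can never drop below 1.
    size = 1
    for sz, st in zip(sizes, strides):
        size += st * max(0, sz - 1)
    return size
```

With size 0 and stride 4 this returns 1 rather than -3: an over-approximation of the true storage (0 bytes), which is acceptable under the earlier suggestion that the function may return more than the actual number of bytes.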
Maybe this is why this is safe:
https://www.internalfb.com/diff/D56851139?whitespace=SHOW_ALL
See the comment below.
```cpp
c10::SymInt new_size_bytes = result.is_contiguous()
    ? at::detail::computeStorageNbytesContiguous(
          size, itemsize, std::move(storage_offset))
    : at::detail::computeStorageNbytes(
          size, stride, itemsize, std::move(storage_offset));
// TODO: When there are unbacked SymInts, we unconditionally skip the
// setter. This is technically wrong, but we cannot conveniently test
// the real condition in many cases, because a lot of people are using
// set_ just to swizzle metadata on a tensor, they didn't actually want
// to see if they need to resize the storage.
//
// The old behavior was to unconditionally set_nbytes, but I think not
// setting it is more safe.
if (new_size_bytes.has_hint() && storage.sym_nbytes().has_hint() &&
    TORCH_GUARD_SIZE_OBLIVIOUS(
        new_size_bytes.sym_gt(storage.sym_nbytes()))) {
  storage.set_nbytes(std::move(new_size_bytes));
}
```
Or at least it looks like this was not safe, and the diff above tried to make it safer but did not want to remove the guard size oblivious?
Nice find, sounds like this was never really safe, we just avoided it here. I'm worried the Max(0, *)
will cause problems for our weak min/max reasoning, but let's see what CI says
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased and force-pushed the …k/oblivious_storagenbytes branch from 12a6643 to 173d186.
test/dynamo/test_recompiles.py
Outdated
@@ -393,6 +394,9 @@ def f(x):

         self.assertEqual(counter.frame_count, 2)  # not three or four!

+    # TODO(laithsakka): guard_or_false fallback should occur before oblivious/unbacked hints
+    # maybe we can deprecate this option with backed_size_oblivious?
+    @unittest.expectedFailure
Happens because falling back to the oblivious hint can add a guard (e.g. u0 != 0) that triggers an additional recompilation, avoidable if we had used the False fallback earlier.
I have a fix for this
@@ -160,7 +160,7 @@ SymInt computeStorageNbytes(
   // of the last element according to stride
   SymInt size = 1;
   for (const auto i : c10::irange(sizes.size())) {
-    if (TORCH_GUARD_SIZE_OBLIVIOUS(sizes[i].sym_eq(0))) {
+    if (TORCH_GUARD_OR_FALSE(sizes[i].sym_eq(0))) {
# this is kind of closer to the current guard_size_oblivious(sz.sym_eq(0))
def func(sz):
    if guard_or_false(sz >= 0):
        return guard_or_false(sz.sym_eq(0))
    else:
        return sz == 0
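A toy model of that sketch (all helper names hypothetical, not PyTorch APIs; unbacked values are represented as None rather than real SymInts):

```python
def guard_or_false(known):
    # A statically known boolean passes through; an unknown
    # (data-dependent) one is treated as False.
    return False if known is None else known

def statically(pred, val):
    # Evaluate pred(val) when val is backed (a concrete int);
    # return None (unknown) when val is unbacked.
    return None if val is None else pred(val)

def size_is_zero(sz):
    if guard_or_false(statically(lambda v: v >= 0, sz)):
        # Known non-negative: folding "unknown == 0" to False here is
        # exactly the size-oblivious assumption.
        return guard_or_false(statically(lambda v: v == 0, sz))
    # Otherwise the real system would install a guard on sz == 0.
    raise NotImplementedError("would guard on sz == 0")
```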
We know the output shape, and we know this always produces a clone. Avoids data-dependent errors from the decomposition. Along with #150483, this should fix #123855. Pull Request resolved: #152129. Approved by: https://github.com/laithsakka
…k/oblivious_storagenbytes
FYI: This change is blocking #152662
After rebasing I needed this patch to make this work: P1803805396
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
removes fast path for computing storage, fixes some adjacent tests
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames