Introduce guard_or_true, guard_or_false #148430
…d_size_oblivious in decompositions.py

Some context in this document: https://docs.google.com/document/d/18nJsj-F2C_QXO7ClwzPcAUENQ-B440B43W7DdDnlDt4/edit?tab=t.0#heading=h.pgebnyi7pocj

But TL;DR: guard_else_true and guard_else_false are better than guard_size_oblivious because: 1) it is easier to reason about what assumptions we are making while reading the code; 2) they avoid guard_size_oblivious complexity that is not needed; 3) they avoid unsoundness that could make guard_size_oblivious(a==1) be True when it is not True for some a at runtime; 4) fewer data-dependent errors in some cases: for example, guard_size_oblivious(a==1) assumes a is a tensor size, but if it is traced with a=u1-u2, guard_size_oblivious(a==1) will throw a data-dependent error while guard_else_false will just return False.

cc ezyang SherlockNoMad EikanWang jgong5 wenzhe-nrv voznesenskym penguinwu Guobing-Chen XiaobingSuper zhuhaozhe blzheng jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 amjames chauhang aakhundov
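The fallback behavior in point 4 can be sketched as a toy model. Everything here is illustrative (names like `toy_guard_or_false` and `GuardOnDataDependentSymNode` stand in for PyTorch internals, not the real implementations): on a data-dependent condition the helpers return their fallback instead of raising, while an evaluable condition keeps its real answer.

```python
# Toy sketch (NOT PyTorch internals) of the fallback behavior described in
# point 4: guard_or_false returns False instead of raising when a condition
# is data dependent and cannot be evaluated at trace time.

class GuardOnDataDependentSymNode(Exception):
    """Stands in for the error raised on unbacked symbolic conditions."""

def toy_guard_or_false(cond_fn):
    try:
        return bool(cond_fn())
    except GuardOnDataDependentSymNode:
        return False  # fall back instead of erroring out

def toy_guard_or_true(cond_fn):
    try:
        return bool(cond_fn())
    except GuardOnDataDependentSymNode:
        return True

def data_dependent():
    # e.g. (u1 - u2) == 1 where u1, u2 are unbacked sizes
    raise GuardOnDataDependentSymNode

print(toy_guard_or_false(data_dependent))  # False (fallback)
print(toy_guard_or_true(data_dependent))   # True (fallback)
print(toy_guard_or_false(lambda: 3 == 3))  # True: evaluable, real answer wins
```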
I instinctively don't like the name. I do think we want buy-in from our partners in dynamo/inductor before landing a change like this. As we discussed yesterday, there is nothing more permanent than a temporary commit, and this is doubly true for public API changes. I'm adding @jansel, @eellison, @zou3519 and @anijain2305 as reviewers, but feel free to add others who you think might have strong opinions about this change.
It's not clear to me what the overlap is between this and statically_known_true; even in this PR we have some wrong usages. I think we should be careful about adding more APIs.
I also don't like the names guard_else_false and guard_else_true.
torch/_inductor/decomposition.py (outdated)

```python
and guard_else_true(self.size(0) > 0)
and guard_else_false(input2.size(0) == 1)
```
Shouldn't we just be using statically_known_true?
statically_known_true will try to evaluate this statically, but it will fail in unbacked situations, for example.
For the check guard_else_false(input2.size(0) == 1), when input2.size(0) is u0 with range [0, infinity]:
(1) we cannot tell statically whether u0 == 1 or not;
(2) we also cannot guard on u0 being 1, because we do not have a hint.
So what we do is say we are willing to deviate here from the normal PyTorch semantics for runtime inputs where input2.size(0) == 1, and always take the false branch.
We used to do that indirectly using guard_size_oblivious; this is more explicit.
See this document for more information about the intuition behind guard_size_oblivious:
https://docs.google.com/document/d/18nJsj-F2C_QXO7ClwzPcAUENQ-B440B43W7DdDnlDt4/edit?tab=t.0#heading=h.pgebnyi7pocj
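The two failure modes above (cannot decide statically, cannot guard without a hint) can be sketched with value ranges. This is a toy model, not shape-env code; `toy_statically_known_true` and `toy_guard_or_false_eq` are hypothetical names:

```python
# Toy model: decide "size == target" from a known value range [lo, hi].

def toy_statically_known_true(lo, hi, target):
    # True only when the range alone proves size == target; never guards.
    return lo == hi == target

def toy_guard_or_false_eq(lo, hi, target):
    # Return the real answer when the range decides it either way,
    # otherwise fall back to False instead of raising a data-dependent error.
    if lo == hi == target:
        return True          # provably equal
    if target < lo or hi < target:
        return False         # provably not equal
    return False             # undecidable without a hint: take the false branch

INF = float("inf")

# u0 in [0, inf]: cannot tell statically whether u0 == 1, and there is no
# hint to guard on, so the false branch is taken deliberately.
print(toy_statically_known_true(0, INF, 1))  # False ("don't know")
print(toy_guard_or_false_eq(0, INF, 1))      # False (deliberate fallback)

# u0 in [2, inf]: the range alone refutes u0 == 1.
print(toy_guard_or_false_eq(2, INF, 1))      # False (genuinely known false)
```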
The decompositions here are not PyTorch semantics; they are optimizations applied as decompositions.
The guard_else_false(input2.size(0) == 1) check here can just be statically_known_true(input2.size(0) == 1).
I see. Do you want decompositions to fail if input2.size(0) is unbacked?
If that's the case, we can address it in a different PR, since this PR does not change current semantics.
But what you are proposing would remove the avoidance of 0/1 specialization in this case.
I think it would be more compelling to show use cases of this API where we should actually be using it; all of the usages here are not what we want.
torch/_inductor/decomposition.py (outdated)

```diff
-if guard_size_oblivious(self.shape[1] == 1) or guard_size_oblivious(
+if guard_else_false(self.shape[1] == 1) or guard_else_false(
     batch2.shape[2] == 1
 ):
     out = (self.unsqueeze(-1) * batch2.unsqueeze(1)).sum(dim=2)
     return out
 if self.device.type == "cpu":
-    if guard_size_oblivious(self.size(1) == 1) and guard_size_oblivious(
+    if guard_else_false(self.size(1) == 1) and guard_else_false(
         batch2.size(-1) == 1
     ):
```
Since these are all optimizations, we should just be doing statically_known_true, or potentially use an API that is only willing to add guards in non-export use cases.
Mhmm, this mostly keeps the semantics of guard_size_oblivious;
moving to statically_known_true would change the behaviour:
it would start failing with unbacked symbols.
This API does what you explained, just split on unbacked vs. backed instead of export vs. non-export.
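The unbacked-vs-backed split can be sketched roughly as follows. This is a toy model with illustrative names (not PyTorch internals): a backed symbol carries a hint (the size seen on the example input) that a guard can specialize on, while an unbacked symbol (from a data-dependent op like `.item()` or `.nonzero()`) carries none, so the helper must fall back.

```python
# Toy model of guard_or_false on "size == 1" for backed vs. unbacked symbols.

guards = []  # guards recorded during tracing (restrict the input space)

def toy_guard_or_false_is_one(sym):
    hint = sym["hint"]
    if hint is not None:
        # Backed: we can guard, committing future inputs to this answer.
        guards.append((sym["name"], hint == 1))
        return hint == 1
    # Unbacked: no hint to specialize on, so take the fallback.
    return False

backed = {"name": "s0", "hint": 1}       # size seen on the example input
unbacked = {"name": "u0", "hint": None}  # data-dependent size, no example

print(toy_guard_or_false_is_one(backed))    # True, and a guard is recorded
print(toy_guard_or_false_is_one(unbacked))  # False, and no guard is recorded
print(guards)                               # [('s0', True)]
```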
@laithsakka I think what @eellison is saying is that in these cases we should change the behavior and move to statically_known_true, even for the case of backed symints. Because these are optimizations, we don't want to guard, even if we can.
If that's the case, we can address it in a different PR, since this one does not change existing behavior.
That said, from his statement "or potentially an api that is only willing to add guards on non-export use cases", it might not be what he wanted.
I initially did not have a strong opinion about the naming, but as I implemented it I found guard_or_false does not quite make sense: logically, x or False is just x.
Guard only has side effects in the case where we guard and return True. The main point I'm making is that when naming APIs you generally want to conform to common naming patterns rather than invent your own. If you decide to take the road less traveled, you should have a very good reason to, and so far I'm unconvinced we have one.
I don't think we should have both true and false versions of this. We only have statically_known_true, not also statically_known_false.
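One way to see why a single variant could suffice: under a simple model of these helpers, the two are duals under negation, i.e. `guard_or_false(cond)` equals `not guard_or_true(not cond)`. A hypothetical sketch (not PyTorch code) checking that identity:

```python
# Toy duality check: the true/false variants mirror each other under
# negation, the same symmetry that lets statically_known_true stand alone.

class DataDependentError(Exception):
    pass

def guard_or_true(cond_fn):
    try:
        return bool(cond_fn())
    except DataDependentError:
        return True   # unknowable: assume true

def guard_or_false(cond_fn):
    try:
        return bool(cond_fn())
    except DataDependentError:
        return False  # unknowable: assume false

def data_dependent():
    raise DataDependentError

for cond in (lambda: True, lambda: False, data_dependent):
    negated = lambda c=cond: not c()  # raises whenever cond raises
    assert guard_or_false(cond) == (not guard_or_true(negated))
print("duality holds")
```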
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 1 job has failed; the first few of them are: inductor / unit-test / cuda12.6-py3.12-gcc9-sm86 / build
@pytorchbot commit -i

❌ 🤖 pytorchbot command failed.
@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 4 checks: pull / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, lf.linux.2xlarge), pull / cuda12.4-py3.10-gcc9-sm75 / test (pr_time_benchmarks, 1, 1, linux.g4dn.metal.nvidia.gpu), inductor / unit-test / cuda12.6-py3.12-gcc9-sm86 / build, inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (cpu_inductor_torchbench, 1, 2, linux.8xlarge.amx). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
cancel

Can't merge closed PR #148430
@pytorchbot commit

❌ 🤖 pytorchbot command failed.
already landed
Some context in this document: https://docs.google.com/document/d/18nJsj-F2C_QXO7ClwzPcAUENQ-B440B43W7DdDnlDt4/edit?tab=t.0#heading=h.pgebnyi7pocj

But TL;DR: `guard_or_true` and `guard_or_false` are better than `guard_size_oblivious` because:
- It is easier to reason about what assumptions we are making while reading the code.
- They avoid size-oblivious complexity that is not needed.
- They avoid unsoundness that could make `guard_size_oblivious(a==1)` be true when it is not true for some value `a` at runtime.
- Fewer data-dependent errors in some cases: e.g., when doing `guard_size_oblivious(a==1)` and we know `a` is a tensor size, if it is traced with `a=u1-u2`, `guard_size_oblivious(a==1)` will throw a data-dependent error but `guard_or_false` will just return `False`.

### How is it different from statically_known_true?

**`if(cond)`** (normal guarding): will try to evaluate the condition statically and guard on it, willing to restrict the input space to evaluate cond. If it fails to evaluate due to a data-dependent error, it will throw an exception (which can be converted to a graph break in some situations).

**`statically_known_true(cond)`**: used when you never want to add a guard (restrict your input space), but just want a best-effort check of whether something can be inferred to be true or false based ONLY on existing constraints.

**`guard_or_true(cond)` / `guard_or_false(cond)`**: used in situations where you prefer to guard and know the result of the expression over not guarding, but if you hit a data-dependent error you are OK with just returning true or false. Some reasons you might be OK with that:
1. It is an optimization; I do not want to fail for not performing an optimization.
2. I am willing to deviate from the normal semantics when I have unbacked symbols, for the benefit of not failing (see the doc above for more details).

**`definitely_true(cond)`**: same as `guard_or_false(cond)`, except it does not try to do static evaluation for unbacked symbols (planning to deprecate it and replace its uses with `guard_or_false`, or make it an alias of `guard_or_false`).

Pull Request resolved: pytorch#148430
Approved by: https://github.com/bobrenjc93
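The three behaviors contrasted above can be summarized in a small sketch. This toy model (illustrative only; the real helpers live in `torch.fx.experimental.symbolic_shapes`) captures only the returned values, not the input-space restriction that a real guard adds:

```python
# Toy comparison of if(cond), statically_known_true, and guard_or_false,
# where cond is True, False, or None (None stands for "data dependent").

class DataDependentError(Exception):
    pass

def evaluate(cond):
    if cond is None:
        raise DataDependentError
    return cond

def plain_if(cond):
    # if(cond): guard on the answer, or raise on data-dependent conditions
    return evaluate(cond)

def statically_known_true(cond):
    # best effort: never guards, never raises; "don't know" counts as False
    return cond is True

def guard_or_false(cond):
    # prefer the guarded answer, but fall back to False instead of raising
    try:
        return evaluate(cond)
    except DataDependentError:
        return False

for cond in (True, False, None):
    try:
        guarded = plain_if(cond)
    except DataDependentError:
        guarded = "raises"
    print(f"cond={cond}: if/guard={guarded}, "
          f"statically_known_true={statically_known_true(cond)}, "
          f"guard_or_false={guard_or_false(cond)}")
```

Note how the three only diverge on the data-dependent (`None`) row: plain guarding raises, while the other two degrade to `False` in different senses ("cannot prove" vs. "deliberate fallback").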
This reverts commit d5593ea. Reverted pytorch#148430 on behalf of https://github.com/laithsakka due to need to fix stuff ([comment](pytorch#148430 (comment)))