Accuracy issue in torch inductor #153299
@naveenkumarmarri can you also try with

@masnesral adding the flag still introduces the accuracy gap

@naveenkumarmarri does the repro script actually work for you? For example: https://gist.github.com/naveenkumarmarri/0f66d89695e56840a06c7a37dccca83f#file-repro-py-L67
This script was autogenerated by enabling the repro minifier. I followed the torch.compile docs to get the script. Let me know if there is a better way to generate the repro script that might help in reproducing this.
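For reference, a common way to autogenerate such a repro (per the torch.compile troubleshooting docs) is the accuracy minifier, enabled via environment variables. A sketch, where `train.py` is a stand-in for the actual training entry point:

```shell
# TORCHDYNAMO_REPRO_AFTER="aot" runs the minifier after AOTAutograd;
# TORCHDYNAMO_REPRO_LEVEL=4 targets accuracy (rather than crash) issues.
TORCHDYNAMO_REPRO_AFTER="aot" TORCHDYNAMO_REPRO_LEVEL=4 python train.py
```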
@navmarri14, I dunno; I'm kind of a noob with minifier support, but the script looks invalid to me. Is this use case using either of
@masnesral I don't have

```python
# compiled model leads to mismatch in accuracy
def fn(x):
    return torch.logsumexp(x, dim=-1).pow(2).mean()
```

If I disable compilation on the specific operation, the results match the uncompiled model:

```python
@torch._dynamo.disable(recursive=True)
def fn(x):
    return torch.logsumexp(x, dim=-1).pow(2).mean()
```

I tried to reproduce this in a simple model definition, but the results seem to match between the compiled and uncompiled models:

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super(ToyModel, self).__init__()

    def forward(self, x):
        return torch.logsumexp(x, dim=-1).pow(2).mean()

device = torch.device("cuda")
model = ToyModel().to(device)
x = torch.load("shift_logits.pt", map_location=device)

uncompiled_loss = model(x)
print(f"uncompiled_loss: {uncompiled_loss.item()}")

model = torch.compile(model, fullgraph=False, backend="inductor")
compiled_loss = model(x)
print(f"compiled_loss: {compiled_loss.item()}")
print(f"compiled equal to uncompiled: {torch.equal(uncompiled_loss, compiled_loss)}")
```

prints
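As an aside, `torch.equal` demands bitwise equality, which compiled kernels often won't satisfy because Inductor may reorder floating-point reductions; a tolerance-based comparison against a float64 reference is usually more informative. A minimal sketch, using a random tensor as a stand-in for the original `shift_logits.pt`:

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 128)  # stand-in for the original shift_logits.pt tensor

def fn(t):
    return torch.logsumexp(t, dim=-1).pow(2).mean()

out32 = fn(x)            # float32 result
ref64 = fn(x.double())   # float64 reference

# relative error against the higher-precision reference
rel_err = ((out32.double() - ref64).abs() / ref64.abs()).item()
print(f"relative error vs fp64: {rel_err:.2e}")

# tolerance-based check instead of bitwise torch.equal
assert torch.allclose(out32.double(), ref64, rtol=1e-4)
```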
cc: @ezyang in case you have seen this issue before, or if you could share a better approach to sharing a repro
Ok, I was able to get the repro working just by commenting out the bad lines: https://gist.github.com/masnesral/f5c9afeb24247838e7fb7812b1f47bc7 I also ran the compiler bisector, but it didn't find a culprit:
cc @eellison I think we'd consider this ubn.
@masnesral are there any references on how to fix these kinds of issues? I can make a PR if possible.
I started looking at this; I can take it.
🐛 Describe the bug

I am noticing an accuracy difference when training with torch.compile. To narrow down the issue, I ran an ablation with the `eager`, `aot_eager`, and `inductor` backends and observed that the numerics diverge when using `backend="inductor"`. It looks like this is happening inside Inductor; to reproduce, I added `repro.py` and `minified_launcher.py`. Let me know if you need any information that can help better reproduce this.

Error logs
No response
Versions

I ran the repro minifier; the scripts are available here:
repro.py - https://gist.github.com/naveenkumarmarri/0f66d89695e56840a06c7a37dccca83f
minifier.py - https://gist.github.com/naveenkumarmarri/8fa35e72e3210a6c6b13548d9ab73df6
cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @aakhundov