Remove set_default_dtype call in jit_metaprogramming_utils.py #69624

Conversation
CI Flow Status ⚛️

You can add a comment to the PR and tag @pytorchbot with the following commands:

# ciflow rerun ("ciflow/default" will always be added automatically)
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and triggering the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.
🔗 Helpful links

❌ 7 New Failures as of commit 5ffb8d3b0e (more details on the Dr. CI page):

🕵️ 7 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
Force-pushed from 80ec3be to 3c40e37.
Hey @mruberry, I've made progress so far getting a lot of the tests to work with float32 being the default dtype. However, some of the tests fail due to precision when run in float32. One option is to try to modify the reference functions to decrease the error, but that isn't guaranteed to be possible in every case, and it would require a good understanding of each function. The other option is to just go back to using float64 for the tests that fail due to precision. This is a much easier option. Would that be alright with you?

EDIT: I suppose a third option is to increase the error tolerance of some of these, but in cases where the error is large, that's probably not a good idea.

Another detail is that gradcheck needs to run in double precision.
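For the float64 fallback, here is a minimal sketch of what opting individual tests back into double precision could look like. The default_dtype helper is hypothetical, not code from this PR; PyTorch's test utilities may already provide an equivalent:

import contextlib

import torch


@contextlib.contextmanager
def default_dtype(dtype):
    # Hypothetical helper: temporarily override the global default dtype,
    # restoring the previous one on exit.
    saved = torch.get_default_dtype()
    torch.set_default_dtype(dtype)
    try:
        yield
    finally:
        torch.set_default_dtype(saved)


# A precision-sensitive test body can run in double precision while the
# rest of the suite keeps the stock float32 default:
with default_dtype(torch.float64):
    x = torch.randn(100)
    assert x.dtype == torch.float64
assert torch.get_default_dtype() == torch.float32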
Force-pushed from 3c40e37 to 117e312.
Yes! We should just add a comment when we do it.
Yes, for gradcheck we should always be in double, you're right. cc @jbschlosser, who's updating the nn tests. Keeping them working is the priority, as long as we call out where they're using double more explicitly.
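As a quick illustration of why (a sketch, not code from this PR): gradcheck compares analytical gradients against finite-difference estimates, and the numerical estimate is too noisy in float32, so inputs are built in double regardless of the global default dtype:

import torch

# gradcheck numerically estimates gradients with finite differences; that
# estimate is only trustworthy in double precision, so the input is created
# as torch.double even when the default dtype is float32.
x = torch.randn(4, 5, dtype=torch.double, requires_grad=True)
assert torch.autograd.gradcheck(torch.sigmoid, (x,))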
Hey @kurtamohler, I'm in the process of migrating the old nn tests.
So, it turns out that removing the set_default_dtype call affects test_nn.py as well.
This PR is already great -- what if we just explicitly set the default dtype to double in test_nn.py and file a follow-up issue (with a comment in the code linking the issue) for a "better engineering" TODO? That way we can get this PR merged, and we'll still have an issue to track further test_nn.py changes (which @jbschlosser is already working on). How does that sound, @kurtamohler? At least that way test_ops.py and test_jit.py won't have the default double dtype, and with asserts to protect them they won't regress again. Plus, test_nn.py will have a clear TODO.
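A minimal sketch of the kind of guard being described here; the exact wording and placement are illustrative, not taken from the PR:

import torch

# Illustrative guard for test_ops.py / test_jit.py: fail fast if something
# has leaked a non-default dtype into the process before these tests run,
# instead of letting precision-dependent tests silently change behavior.
assert torch.get_default_dtype() == torch.float32, (
    "expected the stock float32 default dtype; "
    "did something call torch.set_default_dtype()?"
)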
Force-pushed from 72517cc to aba1eef.
I'm good with that plan as well. My hope is that eventually the old stuff will just be destroyed and we won't have to make sweeping changes to it.
Force-pushed from e046bed to 9b8ca2d.
@mruberry, this is ready for another review.
Force-pushed from 9b8ca2d to ce5d813.
Wow, nice!! LGTM
Thanks for the update, @kurtamohler! Sorry this slipped through the cracks in early January!
/easycla

As part of the transition to the PyTorch Foundation, this project now requires contributions to be covered under the new CLA. See #85559 for additional details. This comment will trigger a new check of this PR. If you are already covered, you will simply see a new "EasyCLA" check that passes. If you are not covered, a bot will leave a new comment with a link to sign.
/easycla
@kurtamohler Do you still want to merge this? If yes, please resolve the merge conflicts and I can take over Meta-internal verification.
@kit1980, thanks for the ping. Yes, I would like to get this merged. I will resolve the conflicts.
Force-pushed from 5ffb8d3 to 839021c.
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/69624

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 Failures as of commit 29eb0ef. NEW FAILURES: the following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 839021c to 678fbc5.
@kit1980 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from b10ff10 to 5f982ef.
@kit1980, I think this is ready for Meta-internal verification. The remaining CI failure is an upstream issue.
@kit1980 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 5f982ef to 29eb0ef.
@kit1980 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@pytorchbot successfully started a rebase job. Check the current status here.

Tried to rebase and push PR #69624, but it was already up to date.
@kurtamohler I see one relevant Meta-internal test failure:

This is a generated test. Not sure why we don't have a similar one on GitHub, or if it's even a good test.
Thanks for the update, @kit1980. I'm guessing that's something that will have to be fixed internally. It seems like the arguments to bilinear are not all the same type. For instance, I get the same error with this:

>>> torch.nn.functional.bilinear(torch.randn(3).double(), torch.randn(5), torch.randn(1, 3, 5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: expected scalar type Double but found Float
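For reference, the error disappears once all three arguments share a dtype. A hedged sketch of the fix; which dtype the failing test should actually standardize on depends on what it is checking:

import torch

in1 = torch.randn(3).double()
in2 = torch.randn(5)
weight = torch.randn(1, 3, 5)

# bilinear expects input1, input2, and weight to share one scalar type;
# promoting the float arguments to double (or demoting in1 to float)
# avoids the "expected scalar type Double but found Float" error.
out = torch.nn.functional.bilinear(in1, in2.double(), weight.double())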
@kurtamohler Actually, lots of tests are failing on GitHub with what looks like the same cause:
True, it's possible that fixing those errors will fix the internal ones. I've figured out how to reproduce some of the errors in my environment; now I'll work on fixing them.
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as Stale.
Fixes #68972
cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel