-
Notifications
You must be signed in to change notification settings - Fork 24.2k
[inductor] Add Python type annotations to torch/_inductor
#146167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@rec Hello, I would like try to help with |
@zeshengzong: Sure, that'd be great, thanks for asking! I updated the list of files with the files you and I are working on, I should have done that before. CC me on any review. |
@zeshengzong - so as I'm going on, I'm finding that nearly all the real typing issues that aren't just "declare a parameter" are what I call chameleon variables - a variable that has a lot of possible types on input, but quickly becomes one single type before it gets used. For example, a parameter variable might have type The type checker might be able to deduce this type, except the code uses the same input variable, which is known to accept all these types. In many cases this can be fixed by using a new variable:
In other cases, the ambiguity is baked into the API, so I use I have still been using On the other hand, there is an extra call to an identity function. It isn't free, and if it's being traced through Dynamo, will give different results. Asserts are potentially 100% free if you run Python in optimize mode ( Now I don't even know what to do! 😀 I'm going to look around and see what I can find, and think about the issue... |
So I spent some time thinking about it, and here's what I came up with. First, the CPU cost of an empty function call or an assert is tiny and shouldn't be considered. Second, even though the asserts are covering cases that never occur today (seemingly in every case I've looked into, we'd immediately fail anyway), if they do ever trigger, it will make debugging easier. And assertions can be switched off for maximum speed. So assertions it is! |
@zeshengzong: All right, we have a new commit. I learned a lot from doing this, please hit me up with questions. 🙂 |
I'm almost finished with the The main lesson was that doing the whole file at one time was a bad idea. 😬 Making the changes themselves wasn't that hard - the trouble was getting it to not break unit tests, and while you're working that out, there are huge merges. Worse, I ran into a new issue, where tests fail on CI but pass locally. I started out with something like 80 errors, 25 of which worked locally, the other 55 of which I could reproduce exactly. I cleaned away the 55 errors I could reproduce, but those 25 are left.(*) I'm using the CI and reading code to track this down, but this is disconcerting, first time in almost a year at pytorch. Unfortunately, splitting up the pull request generates so many errors in the first half it's not worth it, so I'm charging to the finish.... (* - it looks at this moment as if there are no significant errors left, but that's because I'm testing a portion of the whole.) |
🚀 The feature, motivation and pitch
Type annotations make new development and maintenance easier, and sometimes find bugs.
And
torch/_inductor
is tricky, and under constant modification by disparate developers.How?
Adding annotations occasionally finds latent bugs, but the real payoff is in faster and more accurate maintenance and new development that using that annotated code.
If we knew which files, classes and functions were going to be used in future development, we could prioritize annotating those.
What we can measure is what gets imported in existing code.
This little script gives the following sorted counts of imports from
_inductor
over all oftorch/
:.pattern_matcher
: 486.utils
: 324.ir
: 137.codegen.common
: 131.virtualized
: 111.codecache
: 59.lowering
: 54.scheduler
: 52So there is an import for either
torch._inductor.pattern_maker
, or a symbol contained within it, 486 times withintorch/
.Deliverable (per file)
# mypy: allow-untyped-defs
andignore-errors
statements:# mypy: allow-untyped-decorators
(possibly keep, typing decorators correctly is arduous)That script above also drills down into individual symbols, for example:
Tracking
utils.py
: [inductor] Add type annotations to _inductor/utils.py #144108pattern_matcher.py
: [inductor] Improve type annotations in _inductor/pattern_matcher.py #146626ir.py
: [inductor] Improve type annotations in _inductor/ir.py #148358ir.py
: [inductor] Add typing to _inductor/ir.py #149958ir.py
: [inductor] Add more typing to _inductor/ir.py #149959codegen/common.py
: [inductor] Clean typing in codegen/common.py and codecache.py #150767virtualized.py
: (in progress @zeshengzong)code_cache.py
: also [inductor] Clean typing in codegen/common.py and codecache.py #150767lowering.py
: the first line disables all type checking: removing that reveals a hefty 395 errors; removingtype: ignores
adds another 25 errorsAlternatives
Fumbling ahead with an ongoing ignorance of type information. 😁
cc @ezyang @malfet @xuzhao9 @gramster @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @desertfire @chauhang @aakhundov
The text was updated successfully, but these errors were encountered: