-
Notifications
You must be signed in to change notification settings - Fork 24.8k
Description
The problem
Accurate Python type annotations are essential to building reliable and reusable software, but there are almost 5000 #type: ignore
statements in Python files below torch/
.
After going through several of hundred such statements, I'd say the majority have a simple type solution that doesn't need an expert in Python typing to come up with, and most of the rest still have a better solution than #type: ignore
but might take some expertise with typing to actually get.
There's nothing to be done about the old annotations except fix them one at a time, but there is something automated that could be added without great effort that would discourage writing new #type: ignore
statements.
The solution: the type_ignore_linter
In the first step, every Python file in torch/
is tokenized
, the number of #type: ignore
statements is counted, and stored in a "type grandfather" fiile mapping filenames to ignore counts.
Then on each commit, the type_ignore_linter
would load the grandfather file, see if the total count of #type: ignore
had increased for any files in the commit, and if so, report an error.
Git approval access over the "type grandfather file" could be given to some Python typing annotation fans, with the hope that people would get stuck, ask for approval for a # type: ignore
add, and instead be helped in writing correct typing.
And as usual there would have to be a way to easily opt a file or line out of this linter, for emergencies or pathological laziness.
How much work would it be?
tools/linters/adaptors
already has two linters that tokenize everything, one of which is a linter that counts "bad things in files" and stores them in a grandfather file. (The grandfather part is under review.)
You'd want to write unit tests, productionize it as a linter, but if one had done such things already for other linters, it'd be one solid day of work.
The issue would be getting buy-in from everyone, as it would potentially affect any pull request. Having the ability to noqa: ignore_linter
"to just shut it up" might be mitigating.
How could it go wrong?
- People with approval on the grandfather file might become a bottleneck
- Alternatively, too many people writing to the grandfather might mean that the system becomes toothless.
- Developers could fix typing elsewhere in the same file to add ignores to their code (could be worse).
Alternatives
The alternative is the current state: allowing any developer to add #type: ignore
statements at any point in the code while a few developers try to add typing. See https://en.wikipedia.org/wiki/Sisyphus
Appendix: what the grandfather list would look like:
The code to count was so short I just wrote it standalone from bits of other things, it's here.
4917 ignores
{
"torch/_inductor/ir.py": 232,
"torch/_inductor/fx_passes/group_batch_fusion.py": 121,
"torch/_inductor/fx_passes/split_cat.py": 100,
"torch/nested/_internal/ops.py": 100,
"torch/_refs/__init__.py": 87,
"torch/fx/experimental/symbolic_shapes.py": 64,
"torch/testing/_internal/common_utils.py": 59,
"torch/distributed/fsdp/_flat_param.py": 50,
"torch/_inductor/graph.py": 50,
"torch/fx/experimental/sym_node.py": 45,
"torch/ao/quantization/fx/convert.py": 42,
"torch/_inductor/codegen/cuda/gemm_template.py": 41,
"torch/_dynamo/symbolic_convert.py": 40,
"torch/nn/modules/conv.py": 35,
"torch/_inductor/fx_passes/efficient_conv_bn_eval.py": 33,
"torch/_inductor/codegen/cpp.py": 33,
... many more...