Ideas for linting trio programs

@wgwz

I haven't given up on catching missing awaits at runtime (see #495), but another strategy that could be useful is to have a linter/static analyzer that catches them for you. In fact, you really want both, for all the reasons runtime + static checks are useful for other kinds of errors: runtime checks can be 100% reliable when the error actually happens, and are especially useful for beginners who can't be expected to set up fancy linting tools; static checks can catch mistakes early, even in untested code-paths.

Examples of this being a real issue:

Update VSTS to Azure DevOps and ignore issue with Heroku packages python/cpython#9168 (comment)
@wgwz gets bitten: https://gitter.im/python-trio/general?at=5b9dde717dca30650309ca3e
Yury says that specifically when a function is alone on a line, he accidentally writes foo() instead of await foo() about once a month.

I know absolutely nothing about static analysis, so I have no idea how to actually implement a lint for this, but let's have an issue to discuss and maybe we'll figure it out together.

In theory, this should be pretty straightforward, because in Trio, the property to enforce is: any function call which returns an awaitable, should have an await to immediately consume that awaitable. This makes things way simpler than asyncio, where it's common to create coroutine objects and then pass them somewhere else, and sometimes this is what you want (passing them into asyncio.run or asyncio.create_task or asyncio.gather) and other times its an accident because you forgot an await. So a lint for asyncio would require some pretty sophisticated data flow inference and knowledge of different APIs; for trio, once we've figured out which calls are to async functions/methods, the check itself requires only superficial knowledge of the surface AST. OTOH, this may make it more difficult to convince tools like mypy or pylint to include our check, because it won't work for asyncio users; and, if they do include it there will need to be some kind of configuration to turn it on or off. Maybe it would need to be a plugin, at least to start out.

I got a bit nerd-sniped by this last night. Findings so far:

pylint

Pylint has docs on how to add a new check, and an official plugin API, so we could distribute a checker as a third-party package. And, (through astroid), it has some reasonably sophisticated APIs to go from a call back to the definition of the function/method being called. So playing around, I came up with this: https://gist.github.com/njsmith/0ce8c661684ba4514b9ea4aac6182f3c

But unfortunately, it turns out that figuring out which functions return awaitables is a bit more complicated than it seems. In particular, the code in that gist can (I think) catch many cases where there's a call to a function, there's no await at the call site, and the function is defined with async def. However, this is still wrong in at least two cases:

It ignores @ decorators on the function. These can be pulled out of the AsyncFunctionDef by accessing its .decorators attribute, but... then what. (Maybe if we special-case @{en,dis}able_ki_protection and then otherwise ignore async defs with decorators, that's still enough to be useful?)
It doesn't distinguish between async functions and async generators. This is important, because both are defined using async def syntax, but async functions must be called with await, and async generators must not be called with await. We could do this by hand, though it's a bit tiresome: we'd need to walk the AST for the function body, looking for yield statements, except being careful not to look inside any nested functions.

So this is probably doable, but it was looking complicated enough that I paused to look at mypy instead.

mypy

In theory, mypy should solve this problem nicely, because it already has a stupendously sophisticated type inference engine that knows which calls return awaitables and which ones don't. So, we should be able to skip all the complications that we ran into with pylint, and just go straight to implementing our check. And in theory, this can even be more accurate, because it can take into account explicit type hints in cases where static analysis alone isn't enough.

Mypy also has a plugin API, but it's completely undocumented. That said, it looks like it is possible to ship a third-party plugin and then import it through the mypy config file by using the undocumented plugins = ... option in the config file, which contains a comma-separated list of setuptools-style entry points pointing to plugin objects.

AFAICT, though, currently the plugin API only allows you to define custom type inference rules (e.g., inferring the return value from open by checking whether the mode string has a b in it). I don't see any way to define a custom type checking rule.

It looks like the overall flow is mypy.main:main → mypy.build:build, whose BuildManager object together with dispatch runs 3 semantic analyzer passes (semanal_pass1.py, semanal.py, semanal_pass3.py) and then 2 passes of type checking which are performed using a mypy.checker:TypeChecker object. So I guess we'd want to hook in somewhere in those two passes?

Others?

Someone in the python/typing gitter claimed that pyre will have a check for this soon. Given that pyre is written by facebook and facebook uses asyncio, I suspect anything they come up with will be much cleverer than we need; and given that it's written in ocaml, I doubt we can easily write a plugin in python :-). Also they currently only support running checks on macOS and Linux, and I've never heard of it before now. So probably not what most of our users will be using?

Anything else?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

pylint

mypy

Others?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Description

pylint

mypy

Others?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions