bpo-46771: Implement asyncio context managers for handling timeouts #31394
Conversation
@Tinche this is the first sketch.
It has no tests yet, sorry.
The idea is: let's use a CancelScope class. The class has a state. If a timeout occurs, the state is switched to cancelled and task.cancel() is called. The fact that a timeout occurred is detected by the state check (self._state == _State.CANCELLED). The explicit state machine is very useful; it simplifies both design and debugging.
The move_on_* family is renamed to cancel_scope + cancel_scope_at.
The fail_* family is renamed to timeout + timeout_at.
Sync context managers are used to visually guarantee no suspension points.
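For illustration only, here is a hedged sketch of how call sites might read under the naming above and this synchronous-context-manager design. The cancel_scope / cancel_scope_at spellings are from this draft only and are not part of released asyncio, and asyncio.timeout() eventually shipped in Python 3.11 as an async context manager, so none of this is the final API:

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()

    # fail_* family: raise TimeoutError once the deadline passes.
    try:
        with asyncio.timeout(0.1):                  # relative timeout
            await asyncio.sleep(10)
    except TimeoutError:
        print("timed out")

    # timeout_at() takes an absolute deadline on the loop's clock.
    with asyncio.timeout_at(loop.time() + 1.0):
        await asyncio.sleep(0.1)                    # finishes in time, no error

    # move_on_* family: silently cancel the block instead of raising.
    with asyncio.cancel_scope(0.1):
        await asyncio.sleep(10)                     # cancelled; execution continues here
```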
I didn't implement CancelScope.cancel() because it is essentially equivalent to task.cancel(). It can be added if we decide to support this.
Ironically, the cancellation message is not needed for the implementation, because the cancellation flag is set by the call_at() callback. If something swallows the raised CancelledError, the flag is still set. We could actually add two flags: "cancellation was raised" and "cancellation was caught"; I'm not sure that is really needed. If we decide it is, a .cancelling() flag can be added along with .cancelled().
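To make the state-machine idea concrete, here is a minimal, hedged sketch of the mechanism described above; it is not the PR's actual code, and the _State and _Timeout names are invented for illustration. The deadline is armed with loop.call_at(), the callback flips the state and calls task.cancel(), and on exit the state check (not the exception alone) decides whether to raise TimeoutError:

```python
import asyncio
import enum


class _State(enum.Enum):
    CREATED = "created"
    ENTERED = "entered"
    CANCELLED = "cancelled"   # deadline fired and task.cancel() was called
    EXITED = "exited"


class _Timeout:
    """Illustrative sketch: a sync context manager driven by an explicit state machine."""

    def __init__(self, deadline):
        self._deadline = deadline          # absolute time on the event loop's clock
        self._state = _State.CREATED
        self._task = None
        self._handle = None

    def __enter__(self):
        self._task = asyncio.current_task()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_at(self._deadline, self._on_timeout)
        self._state = _State.ENTERED
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self._handle is not None:
            self._handle.cancel()
        if self._state == _State.CANCELLED:
            # The flag was set by the call_at() callback, so even if somebody
            # swallowed the CancelledError we still know the timeout fired.
            self._state = _State.EXITED
            if exc_type is asyncio.CancelledError:
                raise TimeoutError from exc_val
        else:
            self._state = _State.EXITED
        return False

    def _on_timeout(self):
        self._state = _State.CANCELLED
        self._task.cancel()
```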
I'll try to add tests later.
Sorry, tomorrow I'll be traveling and inactive most likely.
@gvanrossum you may be interested in this preliminary design. It is a mix of trio-style cancellation scopes and asyncio-specific behavior.
The code is pretty simple and understandable, at least from my point of view.
Sure, I need to write a comprehensive test suite.
I'll gather tests from both async-timeout and quattro, plus add tests for the mentioned edge cases.
@asvetlov Thanks for starting this, I will dedicate some time over the weekend. How can I contribute tests to this, though? You invited me to asvetlov/cpython, but this PR is from python/cpython. Sorry, I have not done this before :)
Clone my repo locally. BTW, I use the GitHub CLI (https://cli.github.com/); it makes checking out forks very easy. Click the 'copy-to-clipboard' button to the right of the branch name. The tool sets up the remote, checks out the branch, and switches to it.
Sorry, I was over-optimistic :(
Thanks for the explanation. The asvetlov repo has no Can
@asvetlov Ok, regarding the discussion about more sophisticated nonce handling: would you like me to do a writeup here, or open an issue on the asvetlov repo? I will try to explain the issue and a proposed solution. You were right saying on the mailing list that asyncio doesn't have priorities, but I think with a little logic we can do better and maybe save our users from a potential footgun.
Can you add a test that can only be fixed with a nonce? I'd like to take a look, because I'm still -1 on the nonce idea. We should figure out another way.
Looking at the API only.
I'm not sure if the discussion about cancel edge cases should move here, but my comments on that so far are:
@1st1 Here's a test that fails without nonces:

```python
async def test_nested_timeouts_concurrent(self):
    with self.assertRaises(TimeoutError):
        with asyncio.timeout(0.002):
            try:
                with asyncio.timeout(0.003):
                    # Pretend we crunch some numbers.
                    time.sleep(0.005)
                    await asyncio.sleep(1)
            except asyncio.TimeoutError:
                pass
```

Both timeouts are marked as cancelled, but the inner timeout swallows the cancellation for the outer timeout. So the outer timeout never triggers.
@Tinche up to you.
Thinking about what happens here: when we reach the await asyncio.sleep(1), both deadlines have already expired, so both scopes mark themselves cancelled; the inner timeout converts the resulting CancelledError into TimeoutError, which the user's try/except then swallows.

The expectation is that the outer cancel scope should also raise TimeoutError, because its deadline is also exceeded. So perhaps the outer scope's exit should check whether its own deadline has expired and raise TimeoutError in that case.

But it looks like there's a variant of the example that isn't fixed by such a check -- for example, we could add await asyncio.sleep(2) after the try/except block (or even in the except clause), inside the outer cancel scope. Since the CancelledError exception has been wholly swallowed by the inner timeout, that later await would simply run to completion even though the outer deadline has long passed.

I'm guessing the fix for that would be to use a nonce (either using an explicit nonce API or via the cancel message). However, that still depends on the order in which the callbacks run. With the new cancel semantics (where extra cancels are ignored) whoever runs first wins, while with the old cancel semantics (where the most recent cancel gets to set the cancel message / nonce) whoever runs last wins -- but we don't want to rely on the order in which the callbacks run (since I have no idea in which order the current implementation runs them, given that they both become "ready" simultaneously, and we shouldn't depend on that). So that's why Tin is proposing to use timestamps (or a cancel stack, or some other mechanism that lets us arbitrate between the two cancel calls).

Note that this case is slightly different from the "web server cancels" problem, which was solvable without resorting to timestamps (the cancel scope callback would have to check whether the task was already being cancelled, using t.cancelling()). I'm not sure yet what the right solution would be, but I'm glad that we're thinking about this scenario carefully.
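A hedged sketch of that variant, building on the test above (the test name and the trailing await asyncio.sleep(2) are added purely for illustration); an exit-time deadline check on the outer scope does not help here, because the block keeps running after the inner timeout has absorbed the only CancelledError:

```python
async def test_nested_timeouts_variant(self):
    with self.assertRaises(TimeoutError):
        with asyncio.timeout(0.002):              # outer deadline
            try:
                with asyncio.timeout(0.003):      # inner deadline
                    time.sleep(0.005)             # both deadlines pass while blocked
                    await asyncio.sleep(1)
            except asyncio.TimeoutError:
                pass
            # The only CancelledError was swallowed above, so nothing
            # interrupts this await even though the outer deadline has
            # long expired.
            await asyncio.sleep(2)
```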
@gvanrossum thanks for the great write up, I believe you hit the nail on the head. I think to get this 100% correct, we need a way to throw the highest-priority cancellation into the task on its iteration. The highest priority is the cancel scope that was entered first. It doesn't have to be a timestamp; we could have a global (or thread-local, but that's overthinking it) counter that the cancel scope takes from on entry.

The cancellation data could also be in a context var instead of on the CancelledError, I suppose? Would that be cleaner?
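A minimal sketch of that entry-order idea, with invented names (_scope_counter, _PrioritizedScope); this only illustrates "whoever entered first wins" and is not the mechanism the PR ultimately adopted:

```python
import asyncio
import itertools

# Global entry counter: scopes entered earlier draw smaller numbers, and a
# smaller number means higher priority when two cancellations race.
_scope_counter = itertools.count()


class _PrioritizedScope:
    def __enter__(self):
        self.priority = next(_scope_counter)   # taken on entry, as proposed above
        self.task = asyncio.current_task()
        return self

    def _on_timeout(self):
        # The priority has to travel with the cancellation -- here it rides on
        # the cancel message, but a context variable would work as well -- so
        # that the scope with the smallest number wins the arbitration on exit.
        self.task.cancel(msg=str(self.priority))

    def __exit__(self, *exc_info):
        return False
```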
I have a new proposal. Thinking about @asvetlov's diagrams some more, I think we can make everything work if the cancel state of a task kept a count of cancellations. This would replace the
Now any context manager cleanup handler, like
The only thing that's painful with this scheme is how to handle cancel messages. However (this is for @cjerdonek to disagree with :-), I think cancel messages were a bad idea to begin with -- if I read the bpo where it started, it came from a user who misunderstood how to print exceptions. It seems they are not always preserved regardless. I believe that any scheme that allows for multiple pending cancellations will have a hard time keeping track of which cancel message should be allowed to bubble out. I'm guessing it could be solved by making
@gvanrossum what happens to the count when the exception is thrown? Does it reset to 0 (I guess this would have to happen after the entire iteration is done), or does it stay, and only
If it stays, I think you essentially have level-triggered cancellation then. Which I like, but it would be a compatibility break, right?
It stays, but this is not the flag that causes the exception to be thrown -- for that we have
So I don't think it's quite the same as level-triggered, IIUC (that would keep interrupting every await governed by the cancel scope until its
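For concreteness, a hedged sketch of how a timeout's cleanup could use such a count; the _CountingTimeout name is invented, and the counting calls assume the Task.cancel()/uncancel()/cancelling() API that this discussion eventually produced in Python 3.11, not the code in this PR:

```python
import asyncio


class _CountingTimeout:
    """Illustrative sketch only; assumes a task that counts cancel requests."""

    def __init__(self, when):
        self._when = when            # absolute deadline on the loop's clock
        self._fired = False
        self._task = None
        self._handle = None

    def __enter__(self):
        self._task = asyncio.current_task()
        loop = asyncio.get_running_loop()
        self._handle = loop.call_at(self._when, self._on_timeout)
        return self

    def _on_timeout(self):
        self._fired = True
        self._task.cancel()          # bumps the task's cancel-request count

    def __exit__(self, exc_type, exc, tb):
        self._handle.cancel()
        if self._fired and exc_type is asyncio.CancelledError:
            # Withdraw only our own cancel request.  If other requests remain
            # pending, let the CancelledError keep propagating instead of
            # converting it to TimeoutError.
            if self._task.uncancel() == 0:
                raise TimeoutError from exc
        return False
```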
Cool, I'm on board with this proposal then. (I got a little confused on
Your understanding of level-based cancellation is correct. In practice it helps remove a few footguns, especially around
Sure, go ahead and send a draft PR!
Got a working version here: #31434.
LGTM. Once we're confident that the buildbot tests will pass, let's land this!
🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit e8c67ce 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.
The buildbot failures are unrelated: 11 tests failed:
The same on other boxes; test_asyncio is never affected.
Looks like it's all the refleaks buildbots, leaking in unicode. Let's ignore that; I'll merge this. Thanks for this significant improvement to asyncio usability!
I've been quietly following along, thanks a ton @asvetlov!
(And thanks to @gvanrossum for managing this of course).
Thanks, guys!
https://bugs.python.org/issue46771