-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
bpo-32734: Fix asyncio.Lock multiple acquire safety issue #5466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, the change is correct.
Please fix minor issues for test.
Lib/test/test_asyncio/test_locks.py
Outdated
t1.cancel() | ||
lock.release() | ||
|
||
asyncio.Task(lockandtrigger(), loop=self.loop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use loop.create_task()
for a task creation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Keep in mind asyncio.Task was used in the rest of the tests so I used it in order to be consistent.
Lib/test/test_asyncio/test_locks.py
Outdated
|
||
# Make sure only one lock was taken | ||
self.assertEqual(lock_count, 1) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explicitly make sure that all four tasks are finished at the end of test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
GH-5477 is a backport of this pull request to the 3.6 branch. |
GH-5479 is a backport of this pull request to the 3.7 branch. |
Alright. Done with changes and backports. Anything else I can help with? :-) I have made the requested changes; please review again |
Thanks for making the requested changes! @asvetlov: please review the changes made to this pull request. |
Lib/asyncio/locks.py
Outdated
@@ -181,18 +181,22 @@ def locked(self): | |||
self._locked = True | |||
return True | |||
|
|||
_waiters = self._waiters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: just use self._waiters
. No need for a new variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New variable was created in order to avoid the double attribute lookup, I'll remove it if you think it's more readable though.
await fut | ||
self._locked = True | ||
return True | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very subtle. Can you add a comment why we use two nested 'try' statements here? I.e. that it's important that 'finally' block is called before 'except CancelledError'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing :-)
if not fut.done(): | ||
fut.set_result(True) | ||
break | ||
"""Wake up the first waiter if it isn't done.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it is done? Can there be a situation where you have [fut, fut] in the _waiters list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be indeed, when 2 or more locks are waiting on acquire.
If the first future is done, it means that later on it will either get the lock, or it was cancelled in which case it will wake up the next waiter upon loop's next cycle.
The patch fixes the bug by avoiding multiple done waiters at once and cleaning up the waiters list asap.
The only way multiple done waiters can happen is if 2 futures get cancelled in which case only 1 waiter will wake up, and the second cancelled future will not wake up another one as lock would have already been taken (if not self._locked: self._wake_up_first()
) or it wasn't taken yet it which case it's still sitting .done() in the waiter's list.
In summary:
.done necessarily means either the lock is immediately taken next loop or if it isn't, (.done occurred because of cancellation) next one wakes up. So if we have a done future not including the one currently running we shouldn't wake up another one as it will cause a double wakeup, hence, the almighty bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, makes sense. Could you please add this all as a comment to _wake_up_first
? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Phew, now that's a long comment, lemme see how can I shorten it while still managing to fully explain it. Challenge accepted :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, if you have time to write a short comment :)
Sorry for being anal about this. Every once in a while we find a new bug in asyncio.Lock
or asyncio.Queue
and it's always a challenge to wrap your head around how they are actually working. Every line of code is significant, sometimes it's not safe to do a simple refactoring because things break in very non-obvious ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, all good mate. It took some time to wrap my head around it as well.
Luckily I didn't actually encounter the bug at production but rather found it by reading the implementation which took a couple of hours ;-)
How's this comment? Attempting to be short and to the point as much as you can with synchronization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few review comments
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair! |
Lib/asyncio/locks.py
Outdated
@@ -181,18 +181,22 @@ def locked(self): | |||
self._locked = True | |||
return True | |||
|
|||
_waiters = self._waiters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New variable was created in order to avoid the double attribute lookup, I'll remove it if you think it's more readable though.
if not fut.done(): | ||
fut.set_result(True) | ||
break | ||
"""Wake up the first waiter if it isn't done.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be indeed, when 2 or more locks are waiting on acquire.
If the first future is done, it means that later on it will either get the lock, or it was cancelled in which case it will wake up the next waiter upon loop's next cycle.
The patch fixes the bug by avoiding multiple done waiters at once and cleaning up the waiters list asap.
The only way multiple done waiters can happen is if 2 futures get cancelled in which case only 1 waiter will wake up, and the second cancelled future will not wake up another one as lock would have already been taken (if not self._locked: self._wake_up_first()
) or it wasn't taken yet it which case it's still sitting .done() in the waiter's list.
In summary:
.done necessarily means either the lock is immediately taken next loop or if it isn't, (.done occurred because of cancellation) next one wakes up. So if we have a done future not including the one currently running we shouldn't wake up another one as it will cause a double wakeup, hence, the almighty bug.
Lib/asyncio/locks.py
Outdated
try: | ||
await fut | ||
finally: | ||
_waiters.remove(fut) | ||
except futures.CancelledError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@1st1 @asvetlov I'll be honest, the only problem I have (which originates from the original code anyway) is what will happen if the exception isn't a CancelledError (such as task.__step(self, exc=OutOfCoconutsException)
). Apart from Monty Python having no horse, the code will enter a deadlock as no one will wake up the next waiter. I'm not entirely sure what exceptions can pass into the task's __step as task doesn't allow set_exception but still better be safe than sorry no?
await fut | ||
self._locked = True | ||
return True | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing :-)
I have made the requested changes; please review again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@1st1 please review again
Alright, I guess that's a green light for my backports? |
Let's see if our bots can do them automagically first ;) |
) (cherry picked from commit 2f79c01) Co-authored-by: Bar Harel <bzvi7919@gmail.com>
GH-5501 is a backport of this pull request to the 3.7 branch. |
Sorry, @bharel and @1st1, I could not cleanly backport this to |
GH-5502 is a backport of this pull request to the 3.6 branch. |
Thanks @1st1 :-) |
3.5 is in security-only fixes mode IIRC. |
As promised, found and fixed in the same day due to the high importance.
This will cause almost non-existing overhead (one try statement) and eliminate the bug.
https://bugs.python.org/issue32734