8000 bpo-32734: Fix asyncio.Lock multiple acquire safety issue by bharel · Pull Request #5466 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-32734: Fix asyncio.Lock multiple acquire safety issue #5466

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 2, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 22 additions & 10 deletions Lib/asyncio/locks.py
Original file line number Diff line number Diff line change
Expand Up @@ -183,16 +183,22 @@ async def acquire(self):

fut = self._loop.create_future()
self._waiters.append(fut)

# Finally block should be called before the CancelledError
# handling as we don't want CancelledError to call
# _wake_up_first() and attempt to wake up itself.
try:
await fut
self._locked = True
return True
try:
8000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very subtle. Can you add a comment why we use two nested 'try' statements here? I.e. that it's important that 'finally' block is called before 'except CancelledError'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing :-)

await fut
finally:
self._waiters.remove(fut)
except futures.CancelledError:
if not self._locked:
self._wake_up_first()
raise
finally:
self._waiters.remove(fut)

self._locked = True
return True

def release(self):
"""Release a lock.
Expand All @@ -212,11 +218,17 @@ def release(self):
raise RuntimeError('Lock is not acquired.')

def _wake_up_first(self):
"""Wake up the first waiter who isn't cancelled."""
for fut in self._waiters:
if not fut.done():
fut.set_result(True)
break
"""Wake up the first waiter if it isn't done."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it is done? Can there be a situation where you have [fut, fut] in the _waiters list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be indeed, when 2 or more locks are waiting on acquire.
If the first future is done, it means that later on it will either get the lock, or it was cancelled in which case it will wake up the next waiter upon loop's next cycle.
The patch fixes the bug by avoiding multiple done waiters at once and cleaning up the waiters list asap.
The only way multiple done waiters can happen is if 2 futures get cancelled in which case only 1 waiter will wake up, and the second cancelled future will not wake up another one as lock would have already been taken (if not self._locked: self._wake_up_first()) or it wasn't taken yet it which case it's still sitting .done() in the waiter's list.

In summary:
.done necessarily means either the lock is immediately taken next loop or if it isn't, (.done occurred because of cancellation) next one wakes up. So if we have a done future not including the one currently running we shouldn't wake up another one as it will cause a double wakeup, hence, the almighty bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, makes sense. Could you please add this all as a comment to _wake_up_first? :)

Copy link
Contributor Author
@bharel bharel Feb 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phew, now that's a long comment, lemme see how can I shorten it while still managing to fully explain it. Challenge accepted :P

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, if you have time to write a short comment :)

Sorry for being anal about this. Every once in a while we find a new bug in asyncio.Lock or asyncio.Queue and it's always a challenge to wrap your head around how they are actually working. Every line of code is significant, sometimes it's not safe to do a simple refactoring because things break in very non-obvious ways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, all good mate. It took some time to wrap my head around it as well.
Luckily I didn't actually encounter the bug at production but rather found it by reading the implementation which took a couple of hours ;-)

How's this comment? Attempting to be short and to the point as much as you can with synchronization.

try:
fut = next(iter(self._waiters))
except StopIteration:
return

# .done() necessarily means that a waiter will wake up later on and
# either take the lock, or, if it was cancelled and lock wasn't
# taken already, will hit this again and wake up a new waiter.
if not fut.done():
fut.set_result(True)


class Event:
Expand Down
50 changes: 50 additions & 0 deletions Lib/test/test_asyncio/test_locks.py
Original file line number Diff line number Diff line change
Expand Up @@ -200,6 +200,56 @@ async def lockit(name, blocker):
self.assertTrue(tb.cancelled())
self.assertTrue(tc.done())

def test_cancel_release_race(self):
# Issue 32734
# Acquire 4 locks, cancel second, release first
# and 2 locks are taken at once.
lock = asyncio.Lock(loop=self.loop)
lock_count = 0
call_count = 0

async def lockit():
nonlocal lock_count
nonlocal call_count
call_count += 1
await lock.acquire()
lock_count += 1

async def lockandtrigger():
await lock.acquire()
self.loop.call_soon(trigger)

def trigger():
t1.cancel()
lock.release()

t0 = self.loop.create_task(lockandtrigger())
t1 = self.loop.create_task(lockit())
t2 = self.loop.create_task(lockit())
t3 = self.loop.create_task(lockit())

# First loop acquires all
test_utils.run_briefly(self.loop)
self.assertTrue(t0.done())

# Second loop calls trigger
test_utils.run_briefly(self.loop)
# Third loop calls cancellation
test_utils.run_briefly(self.loop)

# Make sure only one lock was taken
self.assertEqual(lock_count, 1)
# While 3 calls were made to lockit()
self.assertEqual(call_count, 3)
self.assertTrue(t1.cancelled() and t2.done())

# Cleanup the task that is stuck on acquire.
t3.cancel()
test_utils.run_briefly(self.loop)
self.assertTrue(t3.cancelled())



def test_finished_waiter_cancelled(self):
lock = asyncio.Lock(loop=self.loop)

Expand Down
1 change: 1 addition & 0 deletions Misc/ACKS
Original file line number Diff line number Diff line change
Expand Up @@ -602,6 +602,7 @@ Milton L. Hankins
Stephen Hansen
Barry Hantman
Lynda Hardman
Bar Harel
Derek Harland
Jason Harper
David Harrigan
Expand Down
5992
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Fixed ``asyncio.Lock()`` safety issue which allowed acquiring and locking
the same lock multiple times, without it being free. Patch by Bar Harel.
0