-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
bpo-39847: win32: don't over-wait for mutex after tickcount overflow #18780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-39847: win32: don't over-wait for mutex after tickcount overflow #18780
Conversation
The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx. This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout. Replace with GetTickCount64 - this is OK now we no longer support XP which lacks it, and is in use for time.monotonic. Approaches not taken: - forcing signed arithmetic for the now/target comparison (this would only partially solve the problem with GetTickCount as you could still break it by hibernating for 24 days; it's unnecessary for 64-bit which takes however many hundred million years to roll over)
181d6de
to
1026d8d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@zooba: Do you want to have a look?
cc @pitrou who loves lock if I recall correctly :-D |
Sorry @bobince and @vstinner, I had trouble checking out the |
…8780) The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx. This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout. Replace with GetTickCount64 - this is OK now Python no longer supports XP which lacks it, and is in use for time.monotonic(). Co-authored-by: And Clover <and.clover@bromium.com> (cherry picked from commit 64838ce) Co-authored-by: bobince <and+github@doxdesk.com>
GH-18945 is a backport of this pull request to the 3.8 branch. |
The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx. This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout. Replace with GetTickCount64 - this is OK now Python no longer supports XP which lacks it, and is in use for time.monotonic(). Co-authored-by: And Clover <and.clover@bromium.com> (cherry picked from commit 64838ce) Co-authored-by: bobince <and+github@doxdesk.com>
…onGH-18780) The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx. This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout. Replace with GetTickCount64 - this is OK now Python no longer supports XP which lacks it, and is in use for time.monotonic(). Co-authored-by: And Clover <and.clover@bromium.com> (cherry picked from commit 64838ce) Co-authored-by: bobince <and+github@doxdesk.com>
GH-18959 is a backport of this pull request to the 3.7 branch. |
Yep, ta. I couldn't see any conflict with my human eyes but the cherry-pick doesn't care. |
…8780) (GH-18959) The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx. This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout. Replace with GetTickCount64 - this is OK now Python no longer supports XP which lacks it, and is in use for time.monotonic(). Co-authored-by: And Clover <and.clover@bromium.com> (cherry picked from commit 64838ce) Co-authored-by: bobince <and+github@doxdesk.com>
Thanks. I merged your 3.7 backport. |
The 32-bit (49-day) TickCount relied on in EnterNonRecursiveMutex can overflow in the gap between the 'target' time and the 'now' time WaitForSingleObjectEx returns, causing the loop to think it needs to wait another 49 days. This is most likely to happen when the machine is hibernated during WaitForSingleObjectEx.
This makes acquiring a lock/event/etc from the _thread or threading module appear to never timeout.
Replace with GetTickCount64. This is OK now we no longer support WinXP which lacks it; it is already in use for time.monotonic.
Approaches not taken:
forcing signed arithmetic for the now/target comparison (this would only partially solve the problem with GetTickCount as you could still break it by hibernating for 24 days; it's unnecessary for 64-bit which takes however many hundred million years to roll over);
doing anything to cope with the possibility of the new milliseconds being out of range for DWORD; this shouldn't be possible given the target-now time should never be more than the original milliseconds DWORD passed in, and PyCOND_TIMEDWAIT can't accept a longer time anyway;
I haven't been able to add a representative test, as it would mean waiting 49 days or some pretty nasty syscall hooking which seems like overkill. Hopefully change is fairly transparently harmless.
https://bugs.python.org/issue39847