-
-
Notifications
You must be signed in to change notification settings - Fork 32k
GH-100573: Fix server hang caused by os.stat() on named pipe (Windows) #100959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@kumaraditya303 Any chance you have time to review this? And maybe @eryksun too? The only possibly controversial thing is that I added |
async def stat_it(): | ||
os.stat(ADDRESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case the implementation of os.stat()
changes, what basically needs to be tested is whether a pipe connection that's closed immediately causes the server to die, such as os.close(os.open(ADDRESS, os.O_RDONLY))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it and it does not reproduces the error. Note that I am using Windows 11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I found the same problem. I also had to add two await asyncio.sleep(0)
calls after server.close()
to make the final close(open())
call fail as expected.
So I think I'd rather merge this as is, using os.stat()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All that os.stat()
is doing that's even remotely relevant here is calling h = CreateFileW(ADDRESS, ...)
, doing a bit of work that takes a little time, and then calling CloseHandle(h)
.
os.stat()
won't be relevant to the error at all if Microsoft provides a better way to implement os.stat()
that doesn't require opening a handle to the pipe, or if we decide that we should just inspect a "\\.\PIPE\" path as a string in os.stat()
instead of opening a handle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then how would you fix the test? os.close(os.open()) doesn’t fail without the fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe os.open()
actually isn't fast enough to trigger the race condition reliably. Try _winapi.CloseHandle(_overlapped.ConnectPipe(ADDRESS))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot that os.stat()
also calls GetFileAttributesW()
on the pipe while the handle is open. This tries to open a second pipe connection before the first one has been closed. The following reproduces the scenario:
h = _overlapped.ConnectPipe(ADDRESS)
try:
_winapi.CloseHandle(_overlapped.ConnectPipe(ADDRESS))
except OSError as e:
if e.winerror != _overlapped.ERROR_PIPE_BUSY:
raise
finally:
_winapi.CloseHandle(h)
In the test, 2/5 attempts at opening a second pipe connection fail with ERROR_PIPE_BUSY
, so this error is intentionally ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we don’t need a test. All this is irrelevant for the fix. I can either delete the test, keep the PR as is, or close it unresolved. (Someone else can try again.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code in my last comment reliably triggers the problem for me because it ensures that the client connects and disconnects from the server's next pipe instance, leaving the pipe in a broken state before the server calls ConnectNamedPipe()
.
If it comes to closing the PR or deleting the test, I vote for deleting the test. It should have been designed from the beginning to handle the broken pipe error. Not every aspect of a correct design has to be tested.
server.close() | ||
|
||
with self.assertRaises(FileNotFoundError): | ||
os.stat(ADDRESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here, os.close(os.open(ADDRESS, os.O_RDONLY))
will raise FileNotFoundError
after the server has closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For that we need to wait for closing the server so asyncio.sleep(0)
is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look like that's needed with the latest probe code.
Misc/NEWS.d/next/Library/2023-01-12-01-18-13.gh-issue-100573.KDskqo.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please re-review, I've added Eryk's code. I've confirmed that it reproduces the bug and refactored the test so the probe code only needs to be written once. |
Thanks @gvanrossum for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11. |
GH-101019 is a backport of this pull request to the 3.11 branch. |
GH-101020 is a backport of this pull request to the 3.10 branch. |
…indows) (pythonGH-100959) (cherry picked from commit 1bc7a73) Co-authored-by: Guido van Rossum <guido@python.org>
…indows) (pythonGH-100959) (cherry picked from commit 1bc7a73) Co-authored-by: Guido van Rossum <guido@python.org>
Uh oh!
There was an error while loading. Please reload this page.