8000 GH-100573: Fix server hang caused by os.stat() on named pipe (Windows) by gvanrossum · Pull Request #100959 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

GH-100573: Fix server hang caused by os.stat() on named pipe (Windows) #100959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 13, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
8000 Loading
Diff view
Diff view
Next Next commit
Fix server hang caused by os.stat() on named pipe (Windows)
  • Loading branch information
gvanrossum committed Jan 11, 2023
commit d20c40ca8a8f92b9f515ccbfbf25d3cb98c9b8f9
5 changes: 5 additions & 0 deletions Lib/asyncio/windows_events.py
Original file line number Diff line number Diff line change
Expand Up @@ -366,6 +366,10 @@ def loop_accept_pipe(f=None):
return

f = self._proactor.accept_pipe(pipe)
except BrokenPipeError:
if pipe and pipe.fileno() != -1:
pipe.close()
self.call_soon(loop_accept_pipe)
except OSError as exc:
if pipe and pipe.fileno() != -1:
self.call_exception_handler({
Expand All @@ -377,6 +381,7 @@ def loop_accept_pipe(f=None):
elif self._debug:
logger.warning("Accept pipe failed on pipe %r",
pipe, exc_info=True)
self.call_soon(loop_accept_pipe)
except exceptions.CancelledError:
if pipe:
pipe.close()
Expand Down
32 changes: 32 additions & 0 deletions Lib/test/test_asyncio/test_windows_events.py
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,38 @@ def test_address_argument_type_error(self):
proactor.sendto(sock, b'abc', addr=bad_address)
sock.close()

def test_client_pipe_stat(self):
res = self.loop.run_until_complete(self._test_client_pipe_stat())
self.assertEqual(res, 'done')

async def _test_client_pipe_stat(self):
ADDRESS = r'\\.\pipe\test_client_pipe_stat-%s' % os.getpid()

with self.assertRaises(FileNotFoundError):
os.stat(ADDRESS)

[server] = await self.loop.start_serving_pipe(
asyncio.Protocol, ADDRESS)
self.assertIsInstance(server, windows_events.PipeServer)

errors = []
self.loop.set_exception_handler(lambda *args: errors.append(args))

async def stat_it():
os.stat(ADDRESS)
Copy link
Contributor
@eryksun eryksun Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the implementation of os.stat() changes, what basically needs to be tested is whether a pipe connection that's closed immediately causes the server to die, such as os.close(os.open(ADDRESS, os.O_RDONLY)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it and it does not reproduces the error. Note that I am using Windows 11.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I found the same problem. I also had to add two await asyncio.sleep(0) calls after server.close() to make the final close(open()) call fail as expected.

So I think I'd rather merge this as is, using os.stat().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that os.stat() is doing that's even remotely relevant here is calling h = CreateFileW(ADDRESS, ...), doing a bit of work that takes a little time, and then calling CloseHandle(h).

os.stat() won't be relevant to the error at all if Microsoft provides a better way to implement os.stat() that doesn't require opening a handle to the pipe, or if we decide that we should just inspect a "\\.\PIPE\" path as a string in os.stat() instead of opening a handle.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how would you fix the test? os.close(os.open()) doesn’t fail without the fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe os.open() actually isn't fast enough to trigger the race condition reliably. Try _winapi.CloseHandle(_overlapped.ConnectPipe(ADDRESS)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that os.stat() also calls GetFileAttributesW() on the pipe while the handle is open. This tries to open a second pipe connection before the first one has been closed. The following reproduces the scenario:

            h = _overlapped.ConnectPipe(ADDRESS)
            try:
                _winapi.CloseHandle(_overlapped.ConnectPipe(ADDRESS))
            except OSError as e:
                if e.winerror != _overlapped.ERROR_PIPE_BUSY:
                    raise
            finally:
                _winapi.CloseHandle(h)

In the test, 2/5 attempts at opening a second pipe connection fail with ERROR_PIPE_BUSY, so this error is intentionally ignored.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we don’t need a test. All this is irrelevant for the fix. I can either delete the test, keep the PR as is, or close it unresolved. (Someone else can try again.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in my last c 8000 omment reliably triggers the problem for me because it ensures that the client connects and disconnects from the server's next pipe instance, leaving the pipe in a broken state before the server calls ConnectNamedPipe().

If it comes to closing the PR or deleting the test, I vote for deleting the test. It should have been designed from the beginning to handle the broken pipe error. Not every aspect of a correct design has to be tested.


for i in range(5):
await self.loop.create_task(stat_it())

self.assertEqual(len(errors), 0, errors)

server.close()

with self.assertRaises(FileNotFoundError):
os.stat(ADDRESS)
Copy link
Contributor
@eryksun eryksun Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here, os.close(os.open(ADDRESS, os.O_RDONLY)) will raise FileNotFoundError after the server has closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For that we need to wait for closing the server so asyncio.sleep(0) is required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like that's needed with the latest probe code.


return "done"


class WinPolicyTests(test_utils.TestCase):

Expand Down
0