-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
gh-93821: Handle connection resets o 8000 n Windows #124779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Refactor OSError re-throwing and wrap all instances of 'ov.getresult()' with the decorator so that client-side disconnects can be gracefully handled. Update the Proactor server to handle the connection and continue looping and serving (client disconnects are not a fatal error).
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Similar to this one #124032, however this is a little bit more complete by capturing more potential problems that were observed in the original patch. |
@@ -0,0 +1 @@ | |||
Fix error handling in windows events where clients terminating connections could result in an asyncio server using Proactor event loops to hang indefinitely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About NEWS format please to reading this: https://devguide.python.org/documentation/markup/
for example:
:mod:`asyncio`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something I need to change, or is the comment informational?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry. I meant that your NEWS format seems incorrect. Maybe you should check out the NEWS written by others in the Misc section, that might be better.
Edit: For the module, you should use the example I provided.
@@ -0,0 +1 @@ | |||
Fix error handling in windows events where clients terminating connections could result in an asyncio server using Proactor event loops to hang indefinitely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think this is what they meant:
Fix error handling in windows events where clients terminating connections could result in an asyncio server using Proactor event loops to hang indefinitely. | |
Fix error handling in windows events where clients terminating connections could result in an :mod:`asyncio` server using Proactor event loops to hang indefinitely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! Thanks for clarification.
Keep docs better organized by linking the module the change was destined for.
except exceptions.CancelledError: | ||
# Effectively ignore connections that throw a cancelled error | ||
# during setup, loop back around and continue serving. | ||
sock.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppressing CancelledError
is a huge red flag and should only be done in desperate times. Can you write an elaborate comment what's going on here? I don't understand why CancelledError
needs suppressing neither from this comment nor from the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is supposed to be:
except ConnectionResetError:
self.call_soon(loop)
We have reproduced this: the regular uvicorn+fastapi server receives a connection from a process that is forcefully terminated afterward.
What _loop
needs to do: determine if an error = OSError: [WinError 64] The specified network name is no longer available
; in that case, it should just run the next iteration of the loop.
Our working fix looks like that: https://gist.github.com/cybergrind/9cbbdc94503548d74dc4d5d3ed99248c
The following commit authors need to sign the Contributor License Agreement: |
Refactor OSError re-throwing and wrap all instances of 'ov.getresult()' with the decorator so that client-side disconnects can be gracefully handled. Update the Proactor server to handle the connection and continue looping and serving (client disconnects are not a fatal error).
I discovered that, especially under heavy server load, the asycio Windows server could slow down and could end up throwing
[WinError 64] The specified network name is no longer available
errors. This occurs when clients connect and disconnect before the asyncio code has time to service the connection, especially if the client side disconnects erroneously (for example, if it's process was immediately killed before it could gracefully close the connection). These errors would bubble up into the proactor server loop and cause the socket there close - and never get re-opened. The result of this is that my servers would appear to hang, requiring a force-restart to restore their functionality.The solution here catches all potential connection reset errors in the
IocpProactor
class by clients. It appears that the sending-side was previously capturing the errors, but likely the tight timing of the client connection sequence meant errors in these steps weren't noticed previously.