SSL session content bleeds into stdout with lots of threads #118138
Comments
I built Python 3.12.3 --with-pydebug, and now I get a crash before any corrupt output shows. Backtrace from gdb (happy to provide more if it would be helpful):
We are seeing a similar issue. We have no repro from our side: it happens intermittently in a fairly complicated Ray program, so we haven't tried to minimize it. Just wanted to note that users unaffiliated with @sterwill are seeing the issue as well.
https://github.com/python/cpython/blob/3.10/Modules/_io/textio.c#L1584 calls Any idea what the canonical representation is, and/or how to check it? Applying the following patch to python 3.10.12
Produces this output for a failing case:
The length 0 entries seem suspicious to me. While anecdotal, the issue seems to happen more often for me when I enable TQDM-style progress bars. I'm not sure if that's A) true, B) related to the code paths used like https://github.com/python/cpython/blob/3.10/Modules/_io/textio.c#L1663 , or C) just due to there being a lot more output when I do so.
FYI I am able to reproduce the problem reliably with the following reduced script:
I've added slightly more debugging output to the above patch, and am now getting output like:
142 entries / 2 writes per print * (99 'x' + 1 '\n') == 7100, so I think that the zero-length entry I mentioned above probably isn't relevant. Why it's expecting 8100 bytes and not 100k isn't clear to me. Interestingly, it always comes up as 7100 vs. 8100. Without the barrier it varied more, but the few times I ran it the result was always one of: no error, 800 vs. 8100, or 7100 vs. 8100. My guess at this point is that there's a threading issue with appending to and clearing the list, but I'm getting out of my depth.
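The arithmetic in that comment can be checked with a few lines of Python. This is only a sketch of the calculation: the variable names are made up for illustration, and the figures (142 list entries, 2 writes per print, 100-byte lines, 8100 expected bytes) are taken from the comment, not measured here.

```python
# Figures quoted in the comment above; names are illustrative only.
line_bytes = 99 + 1          # 99 'x' characters plus the trailing '\n'
writes_per_print = 2         # the text and the line ending are written separately
entries = 142                # list entries seen in the debug output
expected_bytes = 8100        # byte count the flush expected

copied_bytes = entries // writes_per_print * line_bytes
missing_prints = (expected_bytes - copied_bytes) // line_bytes

print(copied_bytes)    # 7100
print(missing_prints)  # 10
```

If those figures are right, the gap between the two counts corresponds to exactly 10 print calls whose bytes were counted but not present in the list.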
I have increased confidence that it's a threading issue now. With this patch:
The above script produces this on stderr:
Almost certainly some other thread is coming in and adding some list entries which would then get overwritten by that I have also confirmed that the stdout from the script is missing some expected output (I changed the "xxxx" to be the i, j of the thread and inner loop, and some of those go missing). |
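The kind of interleaving suspected here can be modeled without real threads. The sketch below is a toy model, not CPython's actual code: `PendingBuffer` and its methods are hypothetical names, and the two "threads" are simulated by calling the steps in a fixed order so the outcome is deterministic.

```python
class PendingBuffer:
    """Toy model of a writer that queues chunks before flushing them.

    Hypothetical illustration only; this is not how textio.c is written.
    """

    def __init__(self):
        self.pending = []   # chunks queued but not yet flushed
        self.flushed = []   # chunks that reached the underlying "file"

    def write(self, chunk):
        self.pending.append(chunk)

    def flush_begin(self):
        # First half of a flush: snapshot what is queued right now.
        return list(self.pending)

    def flush_end(self, snapshot):
        # Second half of a flush: emit the snapshot, then reset the queue.
        self.flushed.extend(snapshot)
        self.pending = []   # anything queued after the snapshot is discarded


buf = PendingBuffer()
buf.write(b"A1\n")
snapshot = buf.flush_begin()   # "thread A" starts flushing
buf.write(b"B1\n")             # "thread B" queues a chunk in between
buf.flush_end(snapshot)        # "thread A" finishes and resets the queue

# The chunk from thread B is in neither list: it has been lost.
lost = b"B1\n" not in buf.flushed and b"B1\n" not in buf.pending
print(buf.flushed, buf.pending, lost)
```

In this model the chunk written between the two halves of the flush disappears, which is consistent with the missing output described above, though it does not show where in the C code the interleaving actually occurs.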
This patch fixes the issue for me on python 3.10.12, but not on
This is probably an awful way to fix the issue.
Multithreaded writes to files (including standard calls to print() from threads) can result in assertion failures and potentially missed output due to race conditions in _io_TextIOWrapper_write_impl and _textiowrapper_writeflush. See: python#118138
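Until a fix is available in the interpreter, one application-level way to avoid concurrent calls into the text layer is to serialize writes with a lock. The following is a minimal sketch under that assumption, not the patch discussed in this thread: `LockedWriter` and `demo` are hypothetical names, and an in-memory `io.BytesIO` stands in for the real output file.

```python
import io
import threading


class LockedWriter:
    """Hypothetical wrapper that serializes write() and flush() calls."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def write(self, text):
        with self._lock:
            return self._stream.write(text)

    def flush(self):
        with self._lock:
            self._stream.flush()


def demo(n_threads=8, lines_per_thread=200):
    raw = io.BytesIO()  # stands in for the real file or pipe
    out = LockedWriter(io.TextIOWrapper(raw, encoding="utf-8", newline="\n"))

    def worker(tid):
        for j in range(lines_per_thread):
            # One write per line, so a line cannot be split across threads.
            out.write(f"{tid},{j}\n")

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    out.flush()
    return raw.getvalue().decode("utf-8").splitlines()


lines = demo()
print(len(lines))
```

Each line is written with a single `write()` call on purpose: `print()` writes the text and the line ending separately, so locking individual writes would still let another thread's output land between them.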
Bug report
Bug description:
I've been trying to track down a rare segfault I'm seeing with Python 3 (several versions) in AWS Lambda and AWS CodeBuild. It's been very hard to reproduce, but I have a few stack traces that show the crash happens in OpenSSL's certificate verification code below Python's `ssl` module. While trying to isolate the problem in OpenSSL, I stumbled on a different issue that might be related, which I'm reporting here.
The program below uses the standard library to execute a lot of HTTPS requests concurrently with threads. When run on my Linux desktop in gnome-terminal, the program doesn't exhibit any weird behavior. It prints a bunch of `X` characters for each request it completes. However, if I pipe its output through a program like `less` or `tee`, I see additional output--raw bytes from the SSL network session bleeding into stdout. It feels like OpenSSL is writing into memory that Python's using to prepare strings or print to stdout due to a lack of synchronization or refcounting, but I don't know OpenSSL or Python internals well, so this is just a guess.
`python3 test-program.py | tee /dev/null` makes it happen every time on my workstation. It never crashes, it just bleeds SSL contents into the output.
It happens on Ubuntu 22.04 x86_64 with Python 3.12 and OpenSSL 3.0.2, and on AWS CloudShell with Python 3.9.16 and OpenSSL 3.0.8.
Feel free to run this program as often as you want with the URL I put in there (it's a small static web page on my personal site that won't mind a few thousand hits).
CPython versions tested on:
3.9, 3.12
Operating systems tested on:
Linux
Linked PRs