-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-118138: Adds test for multithreaded writes to files. #119107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-118138: Adds test for multithreaded writes to files. #119107
Conversation
Multithreaded writes to files (including standard calls to print() from threads) can result in assertion failures and potentially missed output due to race conditions in _io_TextIOWrapper_write_impl and _textiowrapper_writeflush. See: python#118138
The following commit authors need to sign the Contributor License Agreement: |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Result: ACCESS |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
I believe that the issue with my patch was due to closing the file accepting output before calling Edit: I also feel like there's probably a cleaner way to handle |
After deploying this patch at scale internally, I can confirm that there are still race condition issues; the test fails about 14% of the time in our CI cluster (interestingly it always passed while running locally, so it might be a load issue; I haven't investigated the exact nature of the failures yet). 14% is better than the 100% failure rate we were seeing prior, but it's still not fully fixed. |
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
I think that the test provided here might be useful (after being adapted to meet cpython standards), but the actual code change likely is not. On #119507 a much simpler patch achieves the same practical "it doesn't assert any longer" effect for my use case. The issue is also being discussed here: https://discuss.python.org/t/bug-with-multi-threaded-print-write-causing-dropped-output-output-of-uninitialized-memory-assertion-failures/54538 where @pitrou suggests using a mutex to protect the internal state. Should we close this PR? |
@elistevens, thanks for taking the time you did on this PR. While there wasn't much feedback here, that DPO thread definitely served as a proxy, which is probably okay. Regardless, you clearly put a lot of thought into this and demonstrated a healthy dose of determination and skill. I hope there will be more opportunities for you to participate in Python core development. If you have any feedback, please let us know. |
Thanks, Eric! I'm not worried about the relative quiet on this PR; #119507 had a lively discussion and ultimately resulted in a much cleaner fix than the one I proposed here. I also recognize that maintainers are quite busy and I think should focus on the PRs that are likely to merge. Thank you for all the work that you all do! |
Multithreaded writes to files (including standard calls to print() from threads) can result in assertion failures, missed dropped output, and (in non-debug builds) writing uninitialized memory out due to race conditions in
_io_TextIOWrapper_write_impl
and_textiowrapper_writeflush
.See: #118138
I've provided a change that fixes the original issue for me on python 3.10.12,
but my attempt to turn the repro script into a self-contained test case fails with that patch on both 3.10.12 and 3.14.0a0 (That patch makesmain
as of today).self->pending_bytes
always be a list, and callsPyList_SetSlice(pending, 0, pending_size_at_start, NULL)
to remove entries after they've been processed. I strongly suspect that the patch only narrows the time windows for race conditions, not actually eliminate them. I also think that it might introduce the possibility of dropped or repeated output should those race conditions trigger, but I am uncertain if that's actually the case (and they don't happen in practice, according to the test script).I am out of my depth and would appreciate a maintainer taking a look.
Thanks to @sterwill for the original repro script, which the test case here is a descendent of.