10000 `test_zstd` failed on ubuntu with free-threading · Issue #133885 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

test_zstd failed on ubuntu with free-threading #133885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sobolevn opened this issue May 11, 2025 · 12 comments
Closed

test_zstd failed on ubuntu with free-threading #133885

sobolevn opened this issue May 11, 2025 · 12 comments
Labels
3.14 bugs and security fixes 3.15 new features, bugs and security fixes extension-modules C modules in the Modules dir tests Tests in the Lib/test dir topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@sobolevn
Copy link
Member
sobolevn commented May 11, 2025

Crash report

Link: https://github.com/python/cpython/actions/runs/14954410814/job/42008158034?pr=133876

0:00:11 load avg: 57.89 [ 12/491/1] test_zstd worker non-zero exit code (Exit code -11 (SIGSEGV))
test_compress_empty (test.test_zstd.CompressorTestCase.test_compress_empty) ... ok
test_compress_flushblock (test.test_zstd.CompressorTestCase.test_compress_flushblock) ... ok
test_compress_flushframe (test.test_zstd.CompressorTestCase.test_compress_flushframe) ... ok
test_compress_parameters (test.test_zstd.CompressorTestCase.test_compress_parameters) ... ok
test_simple_compress_bad_args (test.test_zstd.CompressorTestCase.test_simple_compress_bad_args) ... ok
test_unknown_compression_parameter (test.test_zstd.CompressorTestCase.test_unknown_compression_parameter) ... ok
test_zstd_multithread_compress (test.test_zstd.CompressorTestCase.test_zstd_multithread_compress) ... skipped "zstd build doesn't support multi-threaded compression"
test_decompressor_1 (test.test_zstd.DecompressorFlagsTestCase.test_decompressor_1) ... ok
test_decompressor_skippable (test.test_zstd.DecompressorFlagsTestCase.test_decompressor_skippable) ... ok
test_function_decompress (test.test_zstd.DecompressorFlagsTestCase.test_function_decompress) ... ok
test_function_skippable (test.test_zstd.DecompressorFlagsTestCase.test_function_skippable) ... ok
test_decompress_empty (test.test_zstd.DecompressorTestCase.test_decompress_empty) ... ok
test_decompress_empty_content_frame (test.test_zstd.DecompressorTestCase.test_decompress_empty_content_frame) ... ok
test_decompress_epilogue_flags (test.test_zstd.DecompressorTestCase.test_decompress_epilogue_flags) ... ok
test_decompress_parameters (test.test_zstd.DecompressorTestCase.test_decompress_parameters) ... ok
test_decompressor_1 (test.test_zstd.DecompressorTestCase.test_decompressor_1) ... ok
test_decompressor_arg (test.test_zstd.DecompressorTestCase.test_decompressor_arg) ... ok
test_decompressor_chunks_read_3 (test.test_zstd.DecompressorTestCase.test_decompressor_chunks_read_3) ... ok
test_decompressor_chunks_read_300 (test.test_zstd.DecompressorTestCase.test_decompressor_chunks_read_300) ... ok
test_simple_decompress_bad_args (test.test_zstd.DecompressorTestCase.test_simple_decompress_bad_args) ... ok
test_unknown_decompression_parameter (test.test_zstd.DecompressorTestCase.test_unknown_decompression_parameter) ... ok
test_UnsupportedOperation (test.test_zstd.FileTestCase.test_UnsupportedOperation) ... ok
test_append_new_file (test.test_zstd.FileTestCase.test_append_new_file) ... ok
test_close (test.test_zstd.FileTestCase.test_close) ... ok
test_closed (test.test_zstd.FileTestCase.test_closed) ... ok
test_decompress_limited (test.test_zstd.FileTestCase.test_decompress_limited) ... ok
test_file_dict (test.test_zstd.FileTestCase.test_file_dict) ... ok
test_file_prefix (test.test_zstd.FileTestCase.test_file_prefix) ... ok
test_fileno (test.test_zstd.FileTestCase.test_fileno) ... ok
test_init (test.test_zstd.FileTestCase.test_init) ... ok
test_init_bad_check (test.test_zstd.FileTestCase.test_init_bad_check) ... ok
test_init_bad_mode (test.test_zstd.FileTestCase.test_init_bad_mode) ... ok
test_init_close_fp (test.test_zstd.FileTestCase.test_init_close_fp) ... ok
test_init_mode (test.test_zstd.FileTestCase.test_init_mode) ... ok
test_init_with_PathLike_filename (test.test_zstd.FileTestCase.test_init_with_PathLike_filename) ... ok
test_init_with_filename (test.test_zstd.FileTestCase.test_init_with_filename) ... ok
test_init_with_x_mode (test.test_zstd.FileTestCase.test_init_with_x_mode) ... ok
test_iterator (test.test_zstd.FileTestCase.test_iterator) ... ok
test_name (test.test_zstd.FileTestCase.test_name) ... ok
test_peek (test.test_zstd.FileTestCase.test_peek) ... ok
test_peek_bad_args (test.test_zstd.FileTestCase.test_peek_bad_args) ... ok
test_read1 (test.test_zstd.FileTestCase.test_read1) ... ok
test_read1_0 (test.test_zstd.FileTestCase.test_read1_0) ... ok
test_read1_10 (test.test_zstd.FileTestCase.test_read1_10) ... ok
test_read1_bad_args (test.test_zstd.FileTestCase.test_read1_bad_args) ... ok
test_read1_multistream (test.test_zstd.FileTestCase.test_read1_multistream) ... ok
test_read_0 (test.test_zstd.FileTestCase.test_read_0) ... ok
test_read_10 (test.test_zstd.FileTestCase.test_read_10) ... ok
test_read_bad_args (test.test_zstd.FileTestCase.test_read_bad_args) ... ok
test_read_bad_data (test.test_zstd.FileTestCase.test_read_bad_data) ... ok
test_read_exception (test.test_zstd.FileTestCase.test_read_exception) ... ok
test_read_incomplete (test.test_zstd.FileTestCase.test_read_incomplete) ... ok
test_read_multistream (test.test_zstd.FileTestCase.test_read_multistream) ... ok
test_read_readinto_readinto1 (test.test_zstd.FileTestCase.test_read_readinto_readinto1) ... ok
test_read_truncated (test.test_zstd.FileTestCase.test_read_truncated) ... ok
test_readable (test.test_zstd.FileTestCase.test_readable) ... ok
test_readinto (test.test_zstd.FileTestCase.test_readinto) ... ok
test_seek_backward (test.test_zstd.FileTestCase.test_seek_backward) ... ok
test_seek_backward_across_streams (test.test_zstd.FileTestCase.test_seek_backward_across_streams) ... ok
test_seek_backward_relative_to_end (test.test_zstd.FileTestCase.test_seek_backward_relative_to_end) ... ok
test_seek_bad_args (test.test_zstd.FileTestCase.test_seek_bad_args) ... ok
test_seek_forward (test.test_zstd.FileTestCase.test_seek_forward) ... ok
test_seek_forward_across_streams (test.test_zstd.FileTestCase.test_seek_forward_across_streams) ... ok
test_seek_forward_relative_to_current (test.test_zstd.FileTestCase.test_seek_forward_relative_to_current) ... ok
test_seek_forward_relative_to_end (test.test_zstd.FileTestCase.test_seek_forward_relative_to_end) ... ok
test_seek_not_seekable (test.test_zstd.FileTestCase.test_seek_not_seekable) ... ok
test_seek_past_end (test.test_zstd.FileTestCase.test_seek_past_end) ... ok
test_seek_past_start (test.test_zstd.FileTestCase.test_seek_past_start) ... ok
test_seekable (test.test_zstd.FileTestCase.test_seekable) ... ok
test_tell (test.test_zstd.FileTestCase.test_tell) ... ok
test_tell_bad_args (test.test_zstd.FileTestCase.test_tell_bad_args) ... ok
test_writable (test.test_zstd.FileTestCase.test_writable) ... ok
test_write (test.test_zstd.FileTestCase.test_write) ... ok
test_write_101 (test.test_zstd.FileTestCase.test_write_101) ... ok
test_write_append (test.test_zstd.FileTestCase.test_write_append) ... ok
test_write_bad_args (test.test_zstd.FileTestCase.test_write_bad_args) ... ok
test_write_empty_block (test.test_zstd.FileTestCase.test_write_empty_block) ... ok
test_write_empty_frame (test.test_zstd.FileTestCase.test_write_empty_frame) ... ok
test_writelines (test.test_zstd.FileTestCase.test_writelines) ... ok
test_zstdfile_flush (test.test_zstd.FileTestCase.test_zstdfile_flush) ... ok
test_zstdfile_flush_mode (test.test_zstd.FileTestCase.test_zstdfile_flush_mode) ... ok
test_zstdfile_iter_issue45475 (test.test_zstd.FileTestCase.test_zstdfile_iter_issue45475) ... ok
test_zstdfile_truncate (test.test_zstd.FileTestCase.test_zstdfile_truncate) ... ok
test_compress_locking (test.test_zstd.FreeThreadingMethodTests.test_compress_locking) ... Fatal Python error: Segmentation fault

<Cannot show all threads while the GIL is disabled>
Stack (most recent call first):
  File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/test/test_zstd.py", line 2450 in run_method
  File "/home/runner/work/cpython/cpython-ro-srcdir/Lib/threading.py"

Linked PRs

@sobolevn sobolevn added tests Tests in the Lib/test dir type-crash A hard crash of the interpreter, possibly with a core dump labels May 11, 2025
@StanFromIreland
Copy link
Contributor

This is a duplicate I believe, IIRC it is part of the meta issue.

cc @emmatyping

@Rogdham
Copy link
Contributor
Rogdham commented May 11, 2025

Good idea to create a dedicated issue, it will be clearer to do the analysis here without polluting the other one.

Below is my last comment on the topic pasted for visibility.


Could it be the following?

  • By sharing the ZstdCompressor instance between the threads, we share the compression context of type ZSTD_CCtx
  • From a comment in zstd.h:
    • “since v1.3.0, ZSTD_CStream and ZSTD_CCtx are the same thing”
    • “For parallel execution, use one separate ZSTD_CStream per thread.”
Details

From below, it seems that ZSTD_compressStream_generic line 6165 is writing data at the location of zcs->inBuff (zcs being the compression context), so if another threads does the same or set it to null, we have the issue.

ThreadSanitizer:DEADLYSIGNAL
    #0 <null> <null> (libc.so.6+0x16c5c7) (BuildId: d056ce83eebe65ce7e52ecfa5af5363e4863d283)
    #1 memcpy <null> (python+0xe8c98) (BuildId: dea998379ba126e7933ccc59c8ddb0390562739e)
    #2 ZSTD_limitCopy /usr/src/debug/zstd/zstd-1.5.7/lib/compress/../common/zstd_internal.h:252:9 (libzstd.so.1+0x26eae) (BuildId: 9723b93a8052010d25908aaa6174df6de760859a)
    #3 ZSTD_compressStream_generic /usr/src/debug/zstd/zstd-1.5.7/lib/compress/zstd_compress.c:6165:39 (libzstd.so.1+0x26eae)
    #4 ZSTD_compressStream2 /usr/src/debug/zstd/zstd-1.5.7/lib/compress/zstd_compress.c:6540:5 (libzstd.so.1+0x26eae)
    #5 compress_impl /redacted/./Modules/_zstd/compressor.c:453:20 (_zstd.cpython-315t-x86_64-linux-gnu.so+0x77e0) (BuildId: 74b337c2def9ea3fa795cafd5a7f41ce493b69f9)
    #6 _zstd_ZstdCompressor_compress_impl /redacted/./Modules/_zstd/compressor.c:591:15 (_zstd.cpython-315t-x86_64-linux-gnu.so+0x8cbf) (BuildId: 74b337c2def9ea3fa795cafd5a7f41ce493b69f9)
    #7 _zstd_ZstdCompressor_compress /redacted/./Modules/_zstd/clinic/compressor.c.h:171:20 (_zstd.cpython-315t-x86_64-linux-gnu.so+0x8cbf)
<redacted>

@picnixz picnixz added topic-free-threading 3.14 bugs and security fixes 3.15 new features, bugs and security fixes extension-modules C modules in the Modules dir labels May 11, 2025
@emmatyping
Copy link
Member
emmatyping commented May 12, 2025

I believe @Rogdham's analysis is correct. I think unfortunately we will have to require that Zstd(De)compressor instances be only used in one thread at a time and raise an exception if someone tries to share them across threads. That's currently what python-zstandard documents as allowed, and raising an exception is what indygreg/python-zstandard#243 is planning on doing.

@emmatyping
Copy link
Member

cc @ngoldbaum as you had thoughts on free threading and zstd.

@colesbury
Copy link
Contributor

I'm not sure I understand all the context, but if the test is crashing because it's not thread-safe and the fix isn't immediately obvious, I think the test should be disabled for now. It's no fun to have the CI fail on PRs because of unrelated tests.

sobolevn added a commit to sobolevn/cpython that referenced this issue May 12, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 12, 2025
…H-133943)

(cherry picked from commit e8665d4)

Co-authored-by: sobolevn <mail@sobolevn.me>
@encukou
Copy link
Member
encukou commented May 13, 2025

Should it be only disabled for free-threading?

@StanFromIreland
Copy link
Contributor

I believe so, it passed on all other CIs. The merged pr unnecessarily skips it without that consideration.

@sobolevn
Copy link
Member Author
sobolevn commented May 13, 2025

@StanFromIreland it is not "unnecessarily skips", it skips tests that were failing :)

Previously these tests were only run on --free-threading builds:

cpython/Lib/test/test_zstd.py

Lines 2433 to 2439 in 8cf4947

@unittest.skip("it fails for now, see gh-133885")
class FreeThreadingMethodTests(unittest.TestCase):
@unittest.skipUnless(Py_GIL_DISABLED, 'this test can only possibly fail with GIL disabled')
@threading_helper.reap_threads
@threading_helper.requires_working_threading()
def test_compress_locking(self):

cpython/Lib/test/test_zstd.py

Lines 2472 to 2476 in 8cf4947

@unittest.skipUnless(Py_GIL_DISABLED, 'this test can only possibly fail with GIL disabled')
@threading_helper.reap_threads
@threading_helper.requires_working_threading()
def test_decompress_locking(self):

Now - they are not executed at all, untill we fix them / code :)

sobolevn added a commit that referenced this issue May 13, 2025
) (#133949)

gh-133885: skip `test_compress_locking` in `test_zstd` (GH-133943)
(cherry picked from commit e8665d4)

Co-authored-by: sobolevn <mail@sobolevn.me>
@emmatyping
Copy link
Member

So the (de)compressors need to be kept to the same thread, so perhaps we can have them hold a lock and if another thread tries to acquire it then it raises an exception? I think something like this could be implemented in a lightweight way using the object's internal ob_mutex most likely.

@dpdani
Copy link
Contributor
dpdani commented May 14, 2025

Hi @emmatyping 👋

I wanted to add a bit more context on the PR you linked.
We are not planning to set an exception when a de/compressor is shared between threads in python-zstandard because the maintainer asked us to revert that, so you won't find it in the PR.
But there's a commit that partly implemented that behavior, in case you're interested in reading it: indygreg/python-zstandard@3e0ec59#diff-3db8262822d3b43cb97f0d1fa8151dd4f4f8b121fb35d473133a8552f2f7ae72

@colesbury
Copy link
Contributor

This test crashes in the (default) GIL enabled build too (even more often than in the FT build!), but it was always skipped by:

@unittest.skipUnless(Py_GIL_DISABLED, 'this test can only possibly fail with GIL disabled')

It's a good idea to include threading tests like you're doing here, but we should be testing both the FT and default builds, especially when interacting with C libraries.

colesbury pushed a commit that referenced this issue May 23, 2025
Move from using critical sections to locks for the (de)compression methods.
Since the methods allow other threads to run, we should use a lock rather
than a critical section.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 23, 2025
…thongh-134289)

Move from using critical sections to locks for the (de)compression methods.
Since the methods allow other threads to run, we should use a lock rather
than a critical section.
(cherry picked from commit 8dbc119)

Co-authored-by: Emma Smith <emma@emmatyping.dev>
colesbury pushed a commit that referenced this issue May 23, 2025
…h-134289) (gh-134560)

Move from using critical sections to locks for the (de)compression methods.
Since the methods allow other threads to run, we should use a lock rather
than a critical section.
(cherry picked from commit 8dbc119)

Co-authored-by: Emma Smith <emma@emmatyping.dev>
@emmatyping
Copy link
Member

Going to go ahead and close this, as we haven't seen any test failures since the PR adding locks around (de)compression contexts and ZstdDict landed. Thanks all for the help in resolving this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.14 bugs and security fixes 3.15 new features, bugs and security fixes extension-modules C modules in the Modules dir tests Tests in the Lib/test dir topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump
Projects
Status: Done
Development

No branches or pull requests

8 participants
0