Python 3.9 segfault in minimal tracing snippet · Issue #3801 · open-telemetry/opentelemetry-python · GitHub

Python 3.9 segfault in minimal tracing snippet #3801

Closed
NullHypothesis opened this issue Mar 20, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@NullHypothesis

The following code reproducibly results in a segmentation fault:

#!/usr/bin/env python3.9

# pip install opentelemetry-sdk
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    BatchSpanProcessor,
    ConsoleSpanExporter,
)

fd = open("foo.txt", "w")
tracer_provider = TracerProvider()
processor = BatchSpanProcessor(ConsoleSpanExporter(out=fd))
tracer_provider.add_span_processor(processor)

If you don't have a copy of Python 3.9 handy, you can use this Dockerfile:

FROM python:3.9.18-slim
RUN pip install opentelemetry-sdk
COPY file.py .
CMD ./file.py

Describe your environment
This happens on both macOS (Sonoma 14.3.1) and Linux (Ubuntu 23.10). As far as I can tell, Python <=3.9 is affected but not Python >=3.10.

Steps to reproduce
The segfault occurs under the following conditions:

  • BatchSpanProcessor must be used (does not happen with SimpleSpanProcessor)
  • out=fd must be passed to ConsoleSpanExporter (does not happen without)
  • fd must not be closed at the end of the file (does not happen with fd.close())
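The third condition suggests a workaround: make sure the file is closed before the interpreter tears down. Below is a minimal sketch using only the standard library. The helper name `open_closed_at_exit` is hypothetical, and whether this avoids the crash on 3.9 is an assumption based on that condition, not something tested here.

```python
import atexit


def open_closed_at_exit(path, mode="w"):
    """Open a file and register its close() to run at interpreter exit.

    Hypothetical helper, not part of opentelemetry-sdk.
    """
    fd = open(path, mode)
    # atexit handlers run at the start of interpreter shutdown,
    # before module globals are cleared.
    atexit.register(fd.close)
    return fd


# Intended use in the snippet above, replacing the bare open():
#   fd = open_closed_at_exit("foo.txt")
#   processor = BatchSpanProcessor(ConsoleSpanExporter(out=fd))
```

Closing an already closed file is a no-op, so an explicit `fd.close()` elsewhere does not conflict with the registered handler.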

What is the expected behavior?
No segfault.

What is the actual behavior?
Segfault.

Additional context
I realize that Python 3.9 is close to its end-of-life but I figured that there's merit in reporting this issue regardless.

NullHypothesis added the bug (Something isn't working) label Mar 20, 2024
@methane
Contributor
methane commented Mar 21, 2024

This is a bug in Python. When the file object is deallocated, Python tries to emit an "unclosed file" ResourceWarning, but the globals dict is already gone (note dp=0x0 in frame #0 below).

I'm not sure this will be fixed, because 3.9 is in security-fix-only mode.

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
_PyDict_GetItemIdWithError (dp=0x0, key=<optimized out>) at Objects/dictobject.c:1491
1491	Objects/dictobject.c: No such file or directory.
(gdb) bt
#0  _PyDict_GetItemIdWithError (dp=0x0, key=<optimized out>) at Objects/dictobject.c:1491
#1  setup_context (stack_level=<optimized out>, filename=<optimized out>, lineno=<optimized out>, module=<optimized out>, registry=<optimized out>)
    at Python/_warnings.c:876
#2  do_warn (message=0x7ffff65378b0, category=0x7ffff7e83010 <_PyExc_ResourceWarning>, stack_level=<optimized out>, source=0x7ffff684dba0)
    at Python/_warnings.c:953
#3  0x00007ffff6bb67da in warn_unicode (category=0x7ffff7e83010 <_PyExc_ResourceWarning>, message=0x7ffff65378b0, stack_level=1,
    source=0x7ffff684dba0) at Python/_warnings.c:1108
#4  _PyErr_WarnFormatV (source=0x7ffff684dba0, category=<optimized out>, stack_level=1, format=<optimized out>, vargs=<optimized out>)
    at Python/_warnings.c:1128
#5  0x00007ffff6bb678b in PyErr_ResourceWarning (source=0x7ffff68a99e0, stack_level=0, format=0x7ffff68a99e0 "\001") at Python/_warnings.c:1179
#6  0x00007ffff6eb1e4e in fileio_dealloc_warn (self=0x7ffff659a940, source=0x7ffff7ece330 <PyId_mode.16579>) at ./Modules/_io/fileio.c:96
#7  0x00007ffff6ca6841 in method_vectorcall_O (func=0x7ffff686b6d0, args=0x7fffffffd9d0, nargsf=<optimized out>, kwnames=<optimized out>)
    at Objects/descrobject.c:464
#8  0x00007ffff6b8b576 in _PyObject_VectorcallTstate (tstate=0x55555555cf60, callable=0x7ffff686b6d0, args=0x7fffffffd9d0, nargsf=2, kwnames=0x0)
    at ./Include/cpython/abstract.h:118
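Frames #5 and #6 are the "unclosed file" ResourceWarning being raised from the file's deallocator. As a rough illustration, the same warning can be observed safely in a healthy interpreter, where the warnings machinery still has its globals. This is a sketch: the function name is made up for the example, and it does not reproduce the crash itself.

```python
import gc
import os
import tempfile
import warnings


def unclosed_file_warnings(path):
    """Drop the last reference to an open file and return the warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default, so enable it explicitly.
        warnings.simplefilter("always", ResourceWarning)
        f = open(path, "w")
        del f         # last reference gone; CPython deallocates the file here
        gc.collect()  # for implementations without reference counting
    return [w for w in caught if issubclass(w.category, ResourceWarning)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for w in unclosed_file_warnings(os.path.join(tmp, "foo.txt")):
            print(w.category.__name__, w.message)
```

In the crashing case, the same deallocation happens so late in shutdown that the lookup in setup_context receives a NULL dict.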

@methane
Contributor
methane commented Mar 21, 2024

I investigated whether this issue was actually fixed in later versions or just hidden for some unrelated reason.
I found that this change already fixed it:

python/cpython#21605

@methane
Contributor
methane commented Mar 21, 2024

FYI, a minimal reproducer without otel:

import os, time

f = open("foo.txt", "w")

class C:
    def __init__(self):
        self.f = f
        os.register_at_fork(after_in_child=self.atfork)

    def atfork(self):
        print("atfork")
c=C()
del c, f
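To check which interpreters are affected, a reproducer like the one above can be run in a child process and its exit status inspected. The following is a sketch for POSIX systems; the function name and the `python3.9` interpreter path in the usage note are assumptions for illustration.

```python
import os
import signal
import subprocess
import sys
import tempfile


def died_by_segfault(source, interpreter=sys.executable):
    """Run `source` as a script in a fresh interpreter.

    Returns True if the child process was killed by SIGSEGV.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "repro.py")
        with open(script, "w") as fh:
            fh.write(source)
        result = subprocess.run(
            [interpreter, script],
            cwd=workdir,  # keeps foo.txt out of the current directory
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    # On POSIX, a negative return code is the negated signal number.
    return result.returncode == -signal.SIGSEGV
```

For example, `died_by_segfault(repro_source, interpreter="python3.9")` would be expected to return True on an affected build, where `repro_source` holds the reproducer text.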

@NullHypothesis
Author

Great work, thanks @methane! I also reported this in Python's issue tracker (python/cpython#117090) but closed the issue because I didn't have a reproducible snippet without third-party code.

As you said, this is unlikely to get fixed by Python and cannot be fixed by OpenTelemetry, so we might as well close this issue.

@methane
Contributor
methane commented Mar 21, 2024

> As you said, this is unlikely to get fixed by Python and cannot be fixed by OpenTelemetry, so we might as well close this issue.

The root cause is the unclosed file. You can fix it by subclassing ConsoleSpanExporter and implementing shutdown:

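from opentelemetry.sdk.trace.export import ConsoleSpanExporter

class FileSpanExporter(ConsoleSpanExporter):
    def shutdown(self):
        # Close the stream passed as out= so no unclosed file remains at exit.
        self.out.close()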

@NullHypothesis
Author

Thanks again for your careful investigation of this issue, @methane.
