SIGSEV in `datetime.timedelta` (possibly from datetime's C `delta_new`) · Issue #132413 · python/cpython · GitHub

SIGSEV in datetime.timedelta (possibly from datetime's C delta_new) #132413

Open
Jacoblightning opened this issue Apr 11, 2025 · 30 comments
Labels
extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@Jacoblightning
Jacoblightning commented Apr 11, 2025

Crash report

What happened?

I was testing some AI code with ollama and stumbled across a really weird crash.
The fact that it happens during an IndexError and a specific function has to be there leads me to believe that this is a CPython bug and not an ollama bug.

from ollama import chat

# Has to be here to segfault???
def colorSwitch(color): 
    print(color, end="", flush=True)


stream = chat(
    model="llama3.2", # I think it works with any but this is what I used.
    messages=[{"role": "user", "content": ""}],
    options={"seed":0}, # Does not need this but I figured it would be helpful
    stream=True,
)
# Any iteration works. I just simplified it down to this.
part = next(iter(stream))['message']['content']
temp = part.split("</think>", 1)

# Crash Here
temp[1]

PythonCore.zip

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.2 (main, Feb 5 2025, 08:05:21) [GCC 14.2.1 20250128]

Linked PRs

@Jacoblightning Jacoblightning added the type-crash A hard crash of the interpreter, possibly with a core dump label Apr 11, 2025
@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Apr 11, 2025
@picnixz
Member
picnixz commented Apr 11, 2025

I was testing some AI code with ollama and stumbled across a really weird crash.

What's the crash? please provide the traceback (shown on the terminal if possible) (not just the core dump).

@Jacoblightning
Author

Sorry. Here is the Python traceback. If you wanted a gdb or other traceback, let me know.

  File "crasher1.py", line 19, in <module>
    temp[1]
    ~~~~^^^
IndexError: list index out of range
Segmentation fault (core dumped)

@picnixz
Member
picnixz commented Apr 11, 2025

If you wanted a gdb or other traceback, let me know

If possible, yes, so that we know exactly where the crash happens. You can also try python -X faulthandler crasher.py, though I'm not sure it will tell us more.
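
For reference, the -X faulthandler option has a programmatic counterpart in the standard library's faulthandler module. The following is a minimal sketch, assuming you would rather switch it on from inside the script; the helper name enable_crash_tracebacks and the log path are hypothetical and not part of the original report:

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Dump Python tracebacks to log_path on SIGSEGV and similar fatal signals."""
    # faulthandler writes through a raw file descriptor, so it needs a real
    # file that stays open for as long as the handler is enabled.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling enable_crash_tracebacks("crash.log") at the top of the reproducer should give roughly the same information as the command-line switch, written to a file instead of stderr.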

Also, it might be that ollama is using the Python C API behind the scenes (AFAIU, ollama is written in Go and C but there are Python bindings, which may be where the issue arises).

@picnixz picnixz changed the title Segfault on IndexError under specific conditions SIGSEV following an IndexError when using ollama Apr 11, 2025
@Jacoblightning
Author

Ah. It appears that it is an ollama issue after all. I'll raise it over there.

Fatal Python error: Segmentation fault

Current thread 0x00007d584222bbc0 (most recent call first):
  Garbage-collecting
  File "/home/jacoblightning3/PycharmProjects/AiFight/.venv/lib/python3.13/site-packages/httpx/_client.py", line 158 in close
  File "/home/jacoblightning3/PycharmProjects/AiFight/.venv/lib/python3.13/site-packages/httpx/_models.py", line 972 in close
  File "/home/jacoblightning3/PycharmProjects/AiFight/.venv/lib/python3.13/site-packages/httpx/_client.py", line 877 in stream
  File "/usr/lib/python3.13/contextlib.py", line 162 in __exit__
  File "/home/jacoblightning3/PycharmProjects/AiFight/.venv/lib/python3.13/site-packages/ollama/_client.py", line 163 in inner
Segmentation fault (core dumped)

@Jacoblightning
Author

@picnixz So, I just checked and it appears that ollama-python is pure Python. (Of course, I realized this after I made the issue.) Would that bring the issue back here? They don't appear to be using ctypes, etc.

@JelleZijlstra
Member

That does look like it could be a CPython issue, though ollama has a few dependencies that include compiled code. I tried to reproduce it (3.13.1, macOS, latest ollama) but I got httpx.ConnectError: [Errno 61] Connection refused instead (does ollama require a local server or something? probably not something I'm interested in setting up).

Two useful ways forward could be:

  • Get the full C stack trace in gdb or lldb or a similar tool and explore what's happening when we hit the segfault. For example, maybe that will tell you what type it's looking at when the crash happens.
  • Reduce the reproducer to something simpler. For example, you can start by removing more and more parts of ollama that aren't relevant to the crash and then see if it still reproduces.

@Jacoblightning
Author

I tried to reproduce it (3.13.1, macOS, latest ollama) but I got httpx.ConnectError: [Errno 61] Connection refused instead (does ollama require a local server or something? probably not something I'm interested in setting up).

Yes. The ollama package requires ollama to be installed and running a local server.

@Jacoblightning
Author

Running python with the debug build I just compiled produces:

Modules/_datetimemodule.c:2745:13: runtime error: member access within null pointer of type 'struct datetime_state'
AddressSanitizer:DEADLYSIGNAL
=================================================================
==9445==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7c8cc6b93737 bp 0x7ffe5c2d11f0 sp 0x7ffe5c2d1040 T0)
==9445==The signal is caused by a READ memory access.
==9445==Hint: address points to the zero page.
    #0 0x7c8cc6b93737 in delta_new Modules/_datetimemodule.c:2745
    #1 0x5790c86108b1 in type_call Objects/typeobject.c:1987
    #2 0x5790c8410b28 in _PyObject_MakeTpCall Objects/call.c:242
    #3 0x5790c8410fc7 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:166
    #4 0x5790c8411018 in PyObject_Vectorcall Objects/call.c:327
    #5 0x5790c883c457 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1502
    #6 0x5790c8472fda in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
    #7 0x5790c8473ce6 in gen_send_ex2 Objects/genobject.c:229
    #8 0x5790c84786a3 in gen_send_ex Objects/genobject.c:270
    #9 0x5790c847adb5 in _gen_throw Objects/genobject.c:543
    #10 0x5790c847b17c in gen_throw Objects/genobject.c:580
    #11 0x5790c883ebd2 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1640
    #12 0x5790c8883c6b in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
    #13 0x5790c8883f2f in _PyEval_Vector Python/ceval.c:1812
    #14 0x5790c84105c3 in _PyFunction_Vectorcall Objects/call.c:413
    #15 0x5790c8419a0e in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
    #16 0x5790c841b924 in method_vectorcall Objects/classobject.c:62
    #17 0x5790c8410ed7 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
    #18 0x5790c8411018 in PyObject_Vectorcall Objects/call.c:327
    #19 0x5790c888107e in _PyEval_EvalFrameDefault Python/generated_cases.c.h:6205
    #20 0x5790c8472fda in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
    #21 0x5790c8473ce6 in gen_send_ex2 Objects/genobject.c:229
    #22 0x5790c84786a3 in gen_send_ex Objects/genobject.c:270
    #23 0x5790c8479898 in gen_close Objects/genobject.c:392
    #24 0x5790c8479dcd in _PyGen_Finalize Objects/genobject.c:106
    #25 0x5790c894557a in finalize_garbage Python/gc.c:980
    #26 0x5790c8947b80 in gc_collect_main Python/gc.c:1408
    #27 0x5790c8949657 in _PyGC_CollectNoFail Python/gc.c:1657
    #28 0x5790c89fc3c1 in finalize_modules Python/pylifecycle.c:1757
    #29 0x5790c8a08ab6 in _Py_Finalize Python/pylifecycle.c:2125
    #30 0x5790c8a08fbc in Py_FinalizeEx Python/pylifecycle.c:2252
    #31 0x5790c8ab41fa in Py_RunMain Modules/main.c:778
    #32 0x5790c8ab440a in pymain_main Modules/main.c:806
    #33 0x5790c8ab4787 in Py_BytesMain Modules/main.c:830
    #34 0x5790c813ab41 in main Programs/python.c:15
    #35 0x7c8cc7c35487  (/usr/lib/libc.so.6+0x27487) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
    #36 0x7c8cc7c3554b in __libc_start_main (/usr/lib/libc.so.6+0x2754b) (BuildId: 0b707b217b15b106c25fe51df3724b25848310c0)
    #37 0x5790c813aa64 in _start (/home/jacoblightning3/Documents/python/3.13debug/bin/python3.13+0x11a5a64) (BuildId: c2486f8b3c246fe979c4f6575406636585088b8a)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV Modules/_datetimemodule.c:2745 in delta_new
==9445==ABORTING

@JelleZijlstra
Member

On current 3.13 tip that line in datetime is

y = accum("seconds", x, second, CONST_US_PER_SECOND(st), &leftover_us);
which doesn't have an obvious bug. This sort of crash could be the result of memory corruption earlier in the program.

@Jacoblightning
Author
Jacoblightning commented Apr 11, 2025

So, using GDB, I figured out that st on that line is a null pointer and the macro attempts to dereference it.
st is created in the function _get_current_state, which GDB says is returning NULL from line 168.
The weird thing is that GDB won't let me step into or break on lines 166 or 167.

So, assuming that GDB is right on where _get_current_state is returning, get_module_state must have been called.

static datetime_state *
get_module_state(PyObject *module)
{
    void *state = _PyModule_GetState(module);
    assert(state != NULL);
    return (datetime_state *)state;
}

The other weird thing is that the assert on line 106 is not failing as it should (I have assertions turned on in my build).
So I have to assume that GDB is wrong about where _get_current_state is returning (so it must be either line 157 or 163).

Since neither of those return paths in _get_current_state sets *p_mod, wouldn't this be a bug, given that current_mod is not checked in delta_new and st != NULL is simply assumed?
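
The concern can be modelled in Python as a rough analogy. This is only a sketch under stated assumptions: every name below (get_current_state, delta_new_unchecked, delta_new_checked, lookup_module, us_per_second) is hypothetical and stands in for the C code rather than reproducing it, and None plays the role of NULL:

```python
def get_current_state(lookup_module):
    """Return the module state, or None when the module cannot be found.

    Hypothetical stand-in for the C helper: lookup_module plays the role
    of the module lookup / re-import and returns None on failure.
    """
    module = lookup_module()
    if module is None:
        return None  # analogous to returning NULL in C
    return module.state


def delta_new_unchecked(lookup_module, seconds):
    st = get_current_state(lookup_module)
    # No check: if st is None this raises AttributeError, the Python
    # analogue of dereferencing a NULL pointer in C.
    return seconds * st.us_per_second


def delta_new_checked(lookup_module, seconds):
    st = get_current_state(lookup_module)
    if st is None:
        # Fail with an ordinary exception instead of crashing.
        raise RuntimeError("module state unavailable")
    return seconds * st.us_per_second
```

In the unchecked variant a failed lookup surfaces only at the point of use, which is the pattern the question above is asking about; whether a check like the one in delta_new_checked is the right fix for the C code is a separate matter.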

(I could be totally wrong on all this. :)

@Jacoblightning
Author

Just realized that line numbers are different. Let me fix that

@Jacoblightning
Author

Why _get_current_state returns NULL in the first place in this situation, I have no idea.
(Actually, I do. Both get_current_module and possibly PyImport_ImportModule("_datetime") failed. But I don't know why this happens.)

@Jacoblightning
Author
Jacoblightning commented Apr 11, 2025

@JelleZijlstra
New Minimal Reproducible example:

import httpx

client = httpx.Client(
    # Any URL works
    base_url="https://duckduckgo.com"
)


def req():
    with client.stream("GET", "/") as r:
        yield

# Cannot be inlined into the iter. If inlined, segfault does not occur
stream = req()

# Any iteration works. I just simplified it down to this.
next(iter(stream)) 

Also, I found the line where python crashes:

https://github.com/encode/httpx/blob/9e8ab40369bd3ec2cc8bff37ab79bf5769c8b00f/httpx/_client.py#L158

@picnixz
Member
picnixz commented Apr 11, 2025

Ok, so I think it's an issue with datetime (and/or maybe with iterators?). Thank you very much for your investigation! I'll try to see if I can find something tomorrow or on Sunday

@picnixz picnixz changed the title SIGSEV following an IndexError when using ollama SIGSEV in datetime.timedelta (possibly from datetime's C delta_new) Apr 11, 2025
@picnixz picnixz added extension-modules C modules in the Modules dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Apr 11, 2025
@Jacoblightning Jacoblightning changed the title SIGSEV in datetime.timedelta (possibly from datetime's C delta_new) Bug in datetime/iterators causes Sigsegv Apr 11, 2025
@Jacoblightning
Author

Oh. Whoops. Didn't see that you changed the name

@Jacoblightning Jacoblightning changed the title Bug in datetime/iterators causes Sigsegv SIGSEV in datetime.timedelta (possibly from datetime's C delta_new)[/+] Apr 11, 2025
@Jacoblightning Jacoblightning changed the title SIGSEV in datetime.timedelta (possibly from datetime's C delta_new)[/+] SIGSEV in datetime.timedelta (possibly from datetime's C delta_new) Apr 11, 2025
@StanFromIreland
Contributor
StanFromIreland commented Apr 12, 2025

Has this been tested with _pydatetime to verify the issue is definitely in datetime.c?

@picnixz
Member
picnixz commented Apr 12, 2025

There seems to be a subtle issue as well:

Python 3.14.0a7+ (heads/main:d4e2cdc15bd, Apr 12 2025, 10:58:46) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.modules['_datetime'] = None
>>> import httpx
...
... client = httpx.Client(
...     # Any URL works
...     base_url="https://duckduckgo.com"
... )
...
...
... def req():
...     with client.stream("GET", "/") as r:
...         yield
...
... # Cannot be inlined into the iter. If inlined, segfault does not occur
... stream = req()
...
... # Any iteration works. I just simplified it down to this.
... next(iter(stream))
...
>>>
>>> ^D
Exception ignored while closing generator <generator object req at 0x7f5fdcffd150>:
Traceback (most recent call last):
  File "<python-input-3>", line 10, in req
  File "/$HOME/lib/python/cpython/Lib/contextlib.py", line 162, in __exit__
  File "/$HOME/Applications/python3.12/local/lib/python3.12/site-packages/httpx/_client.py", line 877, in stream
  File "/$HOME/Applications/python3.12/local/lib/python3.12/site-packages/httpx/_models.py", line 971, in close
  File "/$HOME/lib/python/cpython/Lib/contextlib.py", line 305, in helper
TypeError: 'NoneType' object is not callable

Note that the exception ignored while closing the generator is only raised when exiting the interpreter, and the interpreter does not SIGSEGV. I'm using my 3.12 system-wide installation for the httpx package, but other than that it shouldn't change anything. What's surprising is that I cannot reproduce the crash itself with the latest main: I can reproduce the issue above, but not the SIGSEGV.
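
The sys.modules['_datetime'] = None line in the session above blocks the C accelerator, so that importing datetime falls back to the pure-Python implementation. A minimal sketch of doing the same thing in a reversible way; the helper name import_pure_python_datetime is hypothetical, and the final check relies on the assumption that pure-Python methods are plain functions while the C type exposes method descriptors:

```python
import importlib
import sys
import types

_MISSING = object()


def import_pure_python_datetime():
    """Import datetime with the C accelerator _datetime blocked."""
    names = ("datetime", "_datetime")
    saved = {name: sys.modules.get(name, _MISSING) for name in names}
    # A None entry in sys.modules makes "import _datetime" raise ImportError,
    # so datetime falls back to its pure-Python implementation.
    sys.modules["_datetime"] = None
    sys.modules.pop("datetime", None)
    try:
        return importlib.import_module("datetime")
    finally:
        # Restore the previous entries so the rest of the program is unaffected.
        for name, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module


pure_datetime = import_pure_python_datetime()
# Check which implementation we got: a plain function means pure Python.
print(isinstance(pure_datetime.timedelta.total_seconds, types.FunctionType))
```

Because the helper restores sys.modules afterwards, later import datetime statements elsewhere in the program still get whatever was loaded before.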

@picnixz
Member
picnixz commented Apr 12, 2025

@Jacoblightning Can you try building the latest released version please?

@Jacoblightning
Author
Jacoblightning commented Apr 12, 2025

Still crashing for me on Python 3.14.0a7+ (heads/main:891465fc7a6, Apr 12 2025, 08:18:07) [GCC 14.2.1 20250207] (both debug and release). I believe that it's the same stack trace (just with different line numbers).

@StanFromIreland
Contributor

What os are you on?

@picnixz
Member
picnixz commented Apr 12, 2025

Oh btw, mine is openSUSE 15.5 and I was using gcc 7.5. So it could also be a GCC issue (or me not knowing how to check...)

@Jacoblightning
Author

What os are you on?

Arch Linux

@Jacoblightning
Author

I can check on my Windows VM too, but I can't do that until noon.

@picnixz
Member
picnixz commented Apr 12, 2025

@ZeroIntensity Since you're on AL, can you check if the error also persists on your side? TiA

@ZeroIntensity
Member

I can reproduce this using a fresh build off main, but I don't think this is an issue with datetime. The crash seems to be happening in the eval loop, and anything that happens in datetime is probably just a side-effect of memory corruption.

My theory is that this has to do with reference counting problems on stackrefs + generator locals.

@StanFromIreland
Contributor

@ZeroIntensity Can you also try with the Python implementation? If it also crashes, then it would be a side effect; otherwise datetime may be the culprit. I can test in a few hours on Linux.

@ZeroIntensity
Member

The crash doesn't occur with the Python implementation enabled, but Valgrind still explodes with errors. I'm pretty sure _datetime just acts as a trigger for the segfault.

@neonene
Contributor
neonene commented Apr 13, 2025

If httpx tries to import datetime lazily at BoundSyncStream.close(), an ImportError occurs even on 3.11:

if meta_path is None:
    raise ImportError("sys.meta_path is None, Python is likely "
                      "shutting down")

The same error can happen in the current _datetimemodule.c even before module_clear() is invoked, which is why we fail to get a valid (live) pointer to the module state, as you already discussed above:

        /* The static types can outlive the module,
         * so we must re-import the module. */
        mod = PyImport_ImportModule("_datetime");
        if (mod == NULL) {
            return NULL;
        }
    }
    datetime_state *st = get_module_state(mod);
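
The shutdown-time ImportError quoted above can be observed without actually shutting the interpreter down. A minimal sketch, assuming that temporarily setting sys.meta_path to None is a fair approximation of what the import system sees late in finalization; the helper name and the module name used in the usage note are hypothetical:

```python
import importlib
import sys


def import_with_meta_path_cleared(name):
    """Try to import name while sys.meta_path is None; return the error text."""
    saved_meta_path = sys.meta_path
    sys.meta_path = None  # approximates the state late in interpreter shutdown
    try:
        importlib.import_module(name)
    except ImportError as exc:
        return str(exc)
    finally:
        sys.meta_path = saved_meta_path
    return None  # no error: the module was already cached in sys.modules
```

Calling import_with_meta_path_cleared("some_module_not_yet_imported") returns the error text instead of raising, while a module that is already cached in sys.modules imports without touching sys.meta_path at all. That difference is relevant here because the re-import only fails once the cached module entry is gone.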

Triggered by d82a7ba. I'll also check as much as I can.

cc @ericsnowcurrently

@neonene
Contributor
neonene commented Apr 14, 2025

I've posted my questions, setting _datetime aside. I'm not sure yet what needs to be fixed.

https://discuss.python.org/t/strange-side-effect-of-the-generator-when-finally-clause-is-contained/88353

@neonene
Contributor
neonene commented Apr 21, 2025

The generator's differing behaviors are still surprising to me:

def gen():
    try:
        print(1)
        yield 2
    finally:
        print(3)
  • print(next(it := gen()))
1
2
3
  • print(next(gen()))
1
3
2
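
The two orderings above come down to when the last reference to the generator disappears. A minimal sketch that records the order in a list instead of printing, assuming CPython's reference counting (other implementations may finalize the generator later); the function names are hypothetical, and del is used in the first case to make the finalization point explicit rather than leaving it to interpreter exit:

```python
def gen(log):
    try:
        log.append(1)
        yield 2
    finally:
        log.append(3)


def kept_reference():
    log = []
    it = gen(log)
    log.append(next(it))
    # The generator is still referenced by "it", so its finally block has
    # not run yet. Dropping the last reference finalizes it on CPython.
    del it
    return log


def temporary_generator():
    log = []
    # The generator is a temporary: it is finalized as soon as next()
    # returns, before the yielded value is appended.
    log.append(next(gen(log)))
    return log


print(kept_reference())       # [1, 2, 3] on CPython
print(temporary_generator())  # [1, 3, 2] on CPython
```

In the crash scenario the generator is kept alive by a module-level name, so its finally block (the with statement's exit) only runs during interpreter finalization.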
