bpo-42923: Dump extension modules on fatal error #24207
cc @methane @corona10 @pablogsal @serhiy-storchaka: This enhancement should help to explain to users that their crash may come from a third-party C extension module rather than Python (internals or stdlib). |
I updated the PR to change the formatting. I tested with a more realistic list of extensions: a crash after loading pip. The extension modules list was too long when rendered with one item per line, so I wrote it on a single line, separated by commas:
|
I don't have time for a detailed review or local testing, but it looks good at a quick glance. |
Will this also dump the extension modules in the standard library?
We could ignore stdlib extension modules using a hardcoded list of all stdlib extension names. But I don't want to maintain such a list in the long term. Apart from the name, I don't know any programmatic way to detect if a module comes from the stdlib or not. My expectation is that users will copy/paste the whole output into a bug report, so core developers can look for unusual extensions. If we get more and more bug reports containing crashes, maybe it would be nice to have a tool to process the output:
Concrete example: when I looked at https://bugs.python.org/issue42891 I didn't know anything about unicorn or lsm-db. I don't think that the reporter knows the exhaustive list of all extension modules used by his code. He got a crash and considers that it must be a bug in Python. Then I discovered that lsm-db is implemented in C. And the bug comes from this extension. A Linux kernel "oops" dump contains a flag saying if the kernel is "tainted": https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html |
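A tool to process the output, as suggested above, could start as small as the sketch below. It is only an illustration: the line format it parses (an `Extension modules:` prefix, comma-separated names, and an optional `(total: N)` suffix) is an assumption, and `parse_extension_modules` is a hypothetical helper name, not something this PR adds.

```python
import re

# Assumed dump format (not confirmed by this conversation):
#   Extension modules: name1, name2, name3 (total: 3)
_LINE_RE = re.compile(
    r"^Extension modules: (?P<names>.*?)(?: \(total: \d+\))?$",
    re.MULTILINE,
)


def parse_extension_modules(dump_text):
    """Return the module names listed in a pasted crash dump, in order.

    Returns an empty list when no matching line is found.
    """
    match = _LINE_RE.search(dump_text)
    if match is None:
        return []
    # Names are comma-separated; drop surrounding whitespace and empty items.
    return [name.strip() for name in match.group("names").split(",") if name.strip()]


if __name__ == "__main__":
    sample = (
        "Fatal Python error: Segmentation fault\n"
        "\n"
        "Extension modules: _ctypes, lsm (total: 2)\n"
    )
    for name in parse_extension_modules(sample):
        print(name)
```

Because the list is kept in the order it was written, the last names returned are the most recently imported modules, which is often where triage should begin.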
I see. Well, one of the reasons I was asking is that this list is probably going to be gigantic, and a lot of stdlib extension modules will almost always be in it. Given that nothing tells the user that extension modules are the culprit, it is not always clear what the user can do with that information. On the other hand, as a core dev who has to debug crashes, I find the information quite useful, but I would prefer not to have the built-in extension modules, to reduce the noise of a list that can be veeeery long. |
Since Python dicts keep the insertion order, the interesting thing is that the list is written in import order! The last items are the most recently imported modules. I locally backported the function to Python 3.9 to test other applications. At Python startup, sys.modules contains 38 modules, 17 extensions:
After loading numpy and jupyter_client, sys.modules contains 360 modules (+322), 76 extensions (+59):
Ratio of extensions / all modules:
|
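The counts above can be approximated from pure Python with a sketch like the following. It is an assumption-laden stand-in, not the backported function: it classifies a module by its import loader (`BuiltinImporter` or `ExtensionFileLoader`) and does not reproduce the internal C check, so the numbers may differ from the ones quoted here. `is_extension_module` and `extension_module_names` are hypothetical helper names.

```python
import sys
from importlib.machinery import BuiltinImporter, ExtensionFileLoader


def is_extension_module(module):
    """Heuristic: built-in or loaded from a shared library (.so / .pyd)."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    if loader is BuiltinImporter:
        return True
    return isinstance(loader, (BuiltinImporter, ExtensionFileLoader))


def extension_module_names():
    """Names of loaded extension modules, in sys.modules (import) order."""
    # Copy the items first: imports on other threads could mutate the dict.
    return [
        name
        for name, module in list(sys.modules.items())
        if module is not None and is_extension_module(module)
    ]


if __name__ == "__main__":
    names = extension_module_names()
    total = len(sys.modules)
    print(f"{total} modules, {len(names)} extensions")
    print(f"ratio: {len(names) / total:.0%}")
    print(", ".join(names))
```

Running it before and after importing a large package gives a rough idea of how much the list grows.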
If I have to debug a crash and I don't know anything about the application, for me it's relevant to know if the application imported stdlib extensions. For example, if _ctypes is loaded, maybe _ctypes was misused or was used to load "unsafe" code.
I tried numpy+jupyter_client: I get 76 extensions. The list is long, is it "veeeery long" for you?
Or do you prefer to get the list rendered with one item per line?
|
Another example. I tested "import cinder" (OpenStack Cinder application): 221 modules (43 extensions):
|
Note that almost all of these come from the stdlib, which I would say is going to be noisy to the user. |
It is veeery long (3 'e's) 😉 |
I created https://bugs.python.org/issue42955 to add sys.modules_names tuple: names of stdlib modules. Once it is merged, I will update this PR to filter the list of modules (ignore stdlib modules). |
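The filtering step described above might look like the sketch below. It assumes Python 3.10 or later, where `sys.stdlib_module_names` is available; that attribute name differs from the one proposed in this comment, and on older versions the sketch falls back to an empty set and filters nothing. `third_party_only` is a hypothetical helper name.

```python
import sys


def third_party_only(names, stdlib_names=None):
    """Drop names whose top-level package belongs to the standard library."""
    if stdlib_names is None:
        # Assumption: available on Python 3.10+; empty fallback otherwise.
        stdlib_names = getattr(sys, "stdlib_module_names", frozenset())
    return [
        name
        for name in names
        # Compare the top-level package, so "xml.etree" is matched via "xml".
        if name.partition(".")[0] not in stdlib_names
    ]


if __name__ == "__main__":
    loaded = ["_ctypes", "_socket", "numpy.core._multiarray_umath", "lsm"]
    print(", ".join(third_party_only(loaded)))
```

Passing `stdlib_names` explicitly keeps the helper usable for dumps produced by a different Python version than the one running the tool.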
The Py_FatalError() function and the faulthandler module now dump the list of extension modules on a fatal error. Add _Py_DumpExtensionModules() and _PyModule_IsExtension() internal functions.
I merged non-controversial changes to make this PR shorter. I rebased my PR on master. |
@pablogsal: I merged a first implementation which doesn't exclude stdlib modules. I will update it once https://bugs.python.org/issue42955 is implemented. |
hardcoding stderr.
https://bugs.python.org/issue42923