-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-97556: Fix crash on null characters in string literal parser #97577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch, and the patch is almost perfect (I have one minor suggestion).
Could you also add a unittest that demonstrates the bug?
Parser/string_parser.c
Outdated
if (len > INT_MAX) { | ||
PyErr_SetString(PyExc_OverflowError, "string to parse is too long"); | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move this check to the top? I really don't care about the case where the string including quotes is just over INT_MAX but after stripping quotes it fits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do. Also, I just realized I copied the wrong issue number in the title (I meant #97556): is it enough to change the title, or should I create a whole new pull request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just edit the title.
I'm having trouble including a unit test due to a similar issue, #96670. Null characters raise errors when they are included in source code that is parsed using ast.parse, or which is imported from another file. So I have a test case for this which can be run directly, but it cannot be imported as part of the standard test suite. Any suggestions on how to work around that? |
You can write the test program to a file with a .py extension, manipulate sys.path so you can import that file, and attempt to import it. I think there are a few other tests that do that. E.g. look in test_bdb.py for import_helper. |
An easier way is to write the test code as a string and use the cpython/Lib/test/test_cmd_line_script.py Lines 220 to 226 in 68c46ae
|
The _PyPegen_parsestr function was crashing on null characters inside of string literals due to a call to strlen, which treated it as a null terminator.
This change removes that call, which was actually redundant: the length was already calculated by the caller function and can be retrieved using PyBytes_AsStringAndSize. This has the added benefit of reducing the number of times the length gets calculated.