8000 gh-97556: Fix crash on null characters in string literal parser by apccurtiss · Pull Request #97577 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

gh-97556: Fix crash on null characters in string literal parser #97577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

apccurtiss
Copy link
@apccurtiss apccurtiss commented Sep 26, 2022

The _PyPegen_parsestr function was crashing on null characters inside of string literals due to a call to strlen, which treated it as a null terminator.

This change removes that call, which was actually redundant: the length was already calculated by the caller function and can be retrieved using PyBytes_AsStringAndSize. This has the added benefit of reducing the number of times the length gets calculated.

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ghost
Copy link
ghost commented Sep 26, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@gvanrossum gvanrossum changed the title gh-96670: String literal parser no longer crashes on null characters gh-96670: Fix crash on null characters in string literal parser Sep 26, 2022
Copy link
Member
@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch, and the patch is almost perfect (I have one minor suggestion).

Could you also add a unittest that demonstrates the bug?

Comment on lines 229 to 231
if (len > INT_MAX) {
PyErr_SetString(PyExc_OverflowError, "string to parse is too long");
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this check to the top? I really don't care about the case where the string including quotes is just over INT_MAX but after stripping quotes it fits.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do. Also, I just realized I copied the wrong issue number in the title (I meant #97556): is it enough to change the title, or should I create a whole new pull request?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just edit the title.

@apccurtiss apccurtiss changed the title gh-96670: Fix crash on null characters in string literal parser gh-97556: Fix crash on null characters in string literal parser Sep 26, 2022
@apccurtiss
Copy link
Author

I'm having trouble including a unit test due to a similar issue, #96670. Null characters raise errors when they are included in source code that is parsed using ast.parse, or which is imported from another file. So I have a test case for this which can be run directly, but it cannot be imported as part of the standard test suite. Any suggestions on how to work around that?

@gvanrossum
Copy link
Member

I'm having trouble including a unit test due to a similar issue, #96670. Null characters raise errors when they are included in source code that is parsed using ast.parse, or which is imported from another file. So I have a test case for this which can be run directly, but it cannot be imported as part of the standard test suite. Any suggestions on how to work around that?

You can write the test program to a file with a .py extension, manipulate sys.path so you can import that file, and attempt to import it. I think there are a few other tests that do that. E.g. look in test_bdb.py for import_helper.

@pablogsal
Copy link
Member

You can write the test program to a file with a .py extension, manipulate sys.path so you can import that file, and attempt to import it. I think there are a few other tests that do that. E.g. look in test_bdb.py for import_helper.

An easier way is to write the test code as a string and use the support.assert_python_ok helper to execute the test code. Check this as an example:

def test_basic_script(self):
with os_helper.temp_dir() as script_dir:
script_name = _make_test_script(script_dir, 'script')
self._check_script(script_name, script_name, script_name,
script_dir, None,
importlib.machinery.SourceFileLoader,
expected_cwd=script_dir)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0