[3.9] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) #29071

ambv · 2021-10-19T20:19:41Z

There are two errors that this commit fixes:

* The parser was not correctly computing the offset and the string
  source for E_LINECONT errors due to the incorrect usage of strtok().
* The parser was not correctly unwinding the call stack when a tokenizer
  exception happened in rules involving optionals ('?', [...]) as we
  always make them return valid results by using the comma operator. We
  need to check first if we don't have an error before continuing.

(cherry picked from commit a106343)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>

https://bugs.python.org/issue45494
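To illustrate the second point in the description, here is a minimal Python sketch of how an optional rule that always reports success can hide a pending tokenizer error, and how checking the error first lets the parser unwind. This is not CPython's generated C parser: `ToyParser`, `expect`, `optional`, `rule`, `parse`, the `check_errors` flag and the `"<bad>"` token are all hypothetical names invented for this example.

```python
class ToyParser:
    """Toy recursive-descent parser (hypothetical, not CPython's pegen code)."""

    def __init__(self, tokens, check_errors):
        self.tokens = list(tokens)
        self.pos = 0
        self.error = None  # stands in for a pending tokenizer exception
        self.check_errors = check_errors

    def expect(self, value):
        """Consume `value` if it is next; record an error on a bad token."""
        if self.error is not None or self.pos >= len(self.tokens):
            return None
        tok = self.tokens[self.pos]
        if tok == "<bad>":
            self.error = SyntaxError("tokenizer failure")
            return None
        if tok != value:
            return None
        self.pos += 1
        return tok

    def optional(self, value):
        """Match `value` zero or one times; return (ok, node)."""
        node = self.expect(value)
        if self.check_errors and self.error is not None:
            return False, None  # unwind instead of reporting success
        return True, node  # an optional "succeeds" even if nothing matched

    def rule(self):
        """rule: 'a' ['b']"""
        if self.expect("a") is None:
            return None
        ok, b = self.optional("b")
        if not ok:
            return None
        return ("a", b)


def parse(tokens, check_errors):
    """Return (result, pending_error) for the toy rule."""
    parser = ToyParser(tokens, check_errors)
    return parser.rule(), parser.error
```

With `check_errors=False`, `parse(["a", "<bad>"], False)` returns a valid-looking result while an error is still pending, which is the kind of inconsistent state the description refers to. With `check_errors=True`, the rule returns `None` and the caller only sees the error.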

ambv · 2021-10-19T22:01:25Z

On Python 3.9, the actual position reported is (3, 22) instead of the expected (2, 2). That's both before and after this PR. The difference is that the PR solves an assertion failure (see my comment on the issue) so it's worth landing it anyway. I'll be changing the test here to point at (3, 22) then.

pablogsal · 2021-10-19T23:14:11Z

Lib/test/test_exceptions.py

    def test_error_offset_continuation_characters(self):
        check = self.check
-        check('"\\\n"(1 for c in I,\\\n\\', 2, 2)
+        check('"\\\n"(1 for c in I,\\\n\\', 3, 22)


This is unfortunately a regression:

>>> try:
...   compile('"\\\n"(1 for c in I,\\\n\\', "", "exec")
... except SyntaxError as e:
...   f = e
...
>>> f.lineno
2
>>> f.offset
3

pablogsal

We need to figure out why this points to some random place :(

bedevere-bot · 2021-10-19T23:14:29Z

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

And if you don't make the requested changes, you will be put in the comfy chair!

pablogsal · 2021-10-19T23:16:28Z

> That's both before and after this PR.

Maybe I am missing something but I am not getting that:

Python 3.9.7 (default, Oct 10 2021, 15:13:22)
[GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> try:
...   compile('"\\\n"(1 for c in I,\\\n\\', "", "exec")
... except SyntaxError as e:
...   f = e
...
>>> f
SyntaxError('Generator expression must be parenthesized', ('', 2, 3, '"\\\n"(1 for c in I,\\\n\\\n'))
>>> f.offset
3
>>> f.lineno
2
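For anyone who wants to repeat this comparison on several interpreters, the session above can be wrapped in a small helper. This is only a sketch: `syntax_error_position` is a hypothetical name, not the `check` method from `Lib/test/test_exceptions.py`, and the tuple it returns depends on the Python version it runs on.

```python
def syntax_error_position(source, filename="<fragment>"):
    """Return (lineno, offset) of the SyntaxError raised when compiling
    `source`, or None if the source compiles cleanly.

    Hypothetical helper for comparing interpreters; not part of CPython.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.offset
    return None


# The source string used by the test under discussion.
SOURCE = '"\\\n"(1 for c in I,\\\n\\'

if __name__ == "__main__":
    # The printed position varies between Python versions.
    print(syntax_error_position(SOURCE))
```

Running the same file under each build gives directly comparable `(lineno, offset)` pairs without retyping the REPL session.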

pablogsal

On the other hand the old error is wrong:

Python 3.9.7 (default, Oct 10 2021, 15:13:22)
[GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> compile('"\\\n"(1 for c in I,\\\n\\', "","exec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 2
    "\
"(1 for c in I,\
\
      ^
SyntaxError: Generator expression must be parenthesized

while the one in this PR is correct but the offsets are badly computed:

Python 3.10.0 (default, Oct  4 2021, 22:36:16) [GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> compile('"\\\n"(1 for c in I,\\\n\\', "","exec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 2
    "(1 for c in I,\
     ^
SyntaxError: '(' was never closed
>>>

ambv · 2021-10-20T13:45:50Z

> Maybe I am missing something but I am not getting that

Right, I read (2, 3) as (3, 2) in the output. So yes, definitely there's a regression here.

pablogsal · 2021-10-20T16:01:34Z

I have been investigating and this is actually a bug in the tokenizer that was solved by the big tokenizer refactor I did for 3.10. Backporting that is going to be a pain and quite invasive unfortunately :(

pablogsal

Given the complication, I am ok landing this as is.

bedevere-bot · 2021-10-20T16:51:16Z

@ambv: Please replace # with GH- in the commit message next time. Thanks!

ambv requested review from lysnikolaou and pablogsal as code owners October 19, 2021 20:19

the-knights-who-say-ni added the CLA signed label Oct 19, 2021

bedevere-bot mentioned this pull request Oct 19, 2021

bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters #28993

Merged

bedevere-bot added the awaiting core review label Oct 19, 2021

yeah well okay, let's point at a crazy location

aa41b1b

pablogsal reviewed Oct 19, 2021

pablogsal requested changes Oct 19, 2021

bedevere-bot added awaiting changes and removed awaiting core review labels Oct 19, 2021

pablogsal reviewed Oct 19, 2021

pablogsal approved these changes Oct 20, 2021

bedevere-bot added awaiting merge and removed awaiting changes labels Oct 20, 2021

ambv merged commit 88f4ec8 into python:3.9 Oct 20, 2021

bedevere-bot removed the awaiting merge label Oct 20, 2021

ambv deleted the backport-a106343-3.9 branch October 20, 2021 16:51

ammaraskar mentioned this pull request Feb 13, 2023

[fuzzer] Parser null deref with continuation characters and generator parenthesis error #89657

Closed
