bpo-31592: Fix an assertion failure in Python/ast.c in case of a bad unicodedata.normalize() by orenmn · Pull Request #3767 · python/cpython · GitHub
Merged 7 commits into python:master on Sep 30, 2017

@orenmn (Contributor) commented Sep 26, 2017
  • In ast.c: add a check that unicodedata.normalize() returned a string.
  • In test_ast.py: add tests to verify that the assertion failure no longer occurs.
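
A rough sketch of the kind of check such a test performs, assuming a CPython build that already includes this fix. The helper names `bad_normalize` and `parse_with_bad_normalize` are illustrative and are not taken from the patch. The sketch swaps in a `normalize()` that returns a non-string and then parses a non-ASCII identifier, since only non-ASCII identifiers go through normalization.

```python
import ast
import unicodedata
from unittest import mock


def bad_normalize(*args):
    # Hypothetical stand-in: returns None where a str is expected.
    return None


def parse_with_bad_normalize(source):
    """Parse `source` with unicodedata.normalize replaced.

    Returns the exception that was raised, or None if parsing succeeded.
    """
    with mock.patch.object(unicodedata, "normalize", bad_normalize):
        try:
            ast.parse(source)
        except Exception as exc:
            return exc
    return None


if __name__ == "__main__":
    # U+03D5 is a non-ASCII identifier character, so the parser
    # asks unicodedata.normalize() to normalize it.
    raised = parse_with_bad_normalize("\u03d5")
    print(type(raised).__name__)
```

With the fix in place the interpreter is expected to report an ordinary Python exception here instead of hitting a C-level assertion in a debug build.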

https://bugs.python.org/issue31592

@orenmn (Contributor Author) commented Sep 27, 2017

Added another change to the patch, to fix the bug that Serhiy mentioned in https://bugs.python.org/issue31592#msg303043.

Python/ast.c Outdated
id2 = PyObject_Call(c->c_normalize, c->c_normalize_args, NULL);
/* Use _PyObject_FastCall() this way to conceal c->c_normalize_args
from the user. */
id2 = _PyObject_FastCall(c->c_normalize,
Member:

Don't use c->c_normalize_args. Use just a 2-element C array.

Contributor Author:

All right.
Would you mind explaining why this is better?

Member:

The problem is with reusing the same tuple for passing arguments. If the fake normalize() saves a reference to the tuple, it will see an immutable tuple being mutated. The fake would have to be implemented in C; I can't reproduce the problem with Python code.

Just allocate a 2-element array on the stack, fill it with arguments, and pass it to the function. This will significantly simplify the code too.
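
A Python-level analogy of the hazard described above, not the C code from the patch. Pure Python cannot mutate a tuple, so a list stands in for the reused argument tuple here; all function names are hypothetical.

```python
def make_recorder():
    """Return (saved, fake_normalize); the fake keeps every args object it receives."""
    saved = []

    def fake_normalize(args):
        saved.append(args)  # hold on to the caller's container
        return args[1]

    return saved, fake_normalize


def call_with_shared_buffer(func, identifiers):
    # One container reused for every call, like a cached args tuple in C.
    shared = ["NFKC", None]
    for ident in identifiers:
        shared[1] = ident  # mutated in place before each call
        func(shared)


def call_with_fresh_args(func, identifiers):
    # A new argument container per call, so nothing saved earlier changes.
    for ident in identifiers:
        func(("NFKC", ident))


if __name__ == "__main__":
    saved, fake = make_recorder()
    call_with_shared_buffer(fake, ["a", "b"])
    print(saved[0] is saved[1], saved[0])

    saved, fake = make_recorder()
    call_with_fresh_args(fake, ["a", "b"])
    print(saved)
```

In the shared-buffer variant the callee's first saved reference changes under it once the second call is prepared; with fresh arguments per call each saved value stays as it was passed.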

Member:

Or just revert your last change; I'll create a separate PR. These bugs are related to the same function, but can be fixed separately.

Contributor Author:

I understand that we must conceal the tuple c->c_normalize_args from the user.
Doesn't my patch conceal it? I passed only the tuple's item array to _PyObject_FastCall(), so even that function doesn't have access to the tuple.
For example, _PyObject_FastCall() might eventually call function_code_fastcall(), which would copy the args C array into f_localsplus, or it might call _PyStack_AsTuple(), which would copy the args C array into a new tuple.

Contributor Author:

Or is my fix bad because it uses the internal structure of tuple?

Member:

Yes, your patch conceals it. But using c->c_normalize_args as a buffer is suboptimal. You don't need a heap allocation, and the code would be simpler if you used a stack variable.

Contributor Author:

cool, thanks for the explanation :)

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I didn't expect the Spanish Inquisition!". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

@orenmn (Contributor Author) commented Sep 28, 2017

I didn't expect the Spanish Inquisition!

@bedevere-bot

Nobody expects the Spanish Inquisition!

@serhiy-storchaka: please review the changes made to this pull request.

Python/ast.c Outdated
id2 = _PyObject_FastCall(c->c_normalize,
((PyTupleObject *)c->c_normalize_args)->ob_item,
2);
PyObject *form = PyUnicode_FromString("NFKC");
Member:

Use _Py_IDENTIFIER().

@@ -1,2 +1,2 @@
-Fix an assertion failure in case of a bad `unicodedata.normalize()`. Patch
-by Oren Milman.
+Fixed an assertion failure in Python parser in case of a bad `unicodedata.normalize()`.
Contributor Author:

Thanks for adding the 'Python parser' part :)

BTW, in https://devguide.python.org/committing/#what-s-new-and-news-entries, the example uses 'Fix ...' (as opposed to 'Fixed ...').
Which phrasing should be used? Or is it unimportant?

Member:

I prefer 'Fixed ...'. But my English is really bad; I just use old patterns.

For a discussion of good and bad phrasing, see https://mail.python.org/pipermail/python-dev/2011-May/111303.html.

@serhiy-storchaka added the "needs backport to 3.6" and "type-bug" (An unexpected behavior, bug, or error) labels Sep 30, 2017
@serhiy-storchaka merged commit 7dc46d8 into python:master Sep 30, 2017
@miss-islington (Contributor)

Thanks @orenmn for the PR, and @serhiy-storchaka for merging it 🌮🎉. I'm working now to backport this PR to: 3.6.
🐍🍒⛏🤖

@bedevere-bot

GH-3836 is a backport of this pull request to the 3.6 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 30, 2017
… a bad unicodedata.normalize(). (pythonGH-3767)

(cherry picked from commit 7dc46d8)
serhiy-storchaka pushed a commit that referenced this pull request Sep 30, 2017
… a bad unicodedata.normalize(). (GH-3767) (#3836)

(cherry picked from commit 7dc46d8)
6 participants