8000 Simplify the grammar of the `assignment` rule · Issue #122951 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

Simplify the grammar of the assignment rule #122951

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tomasr8 opened this issue Aug 12, 2024 · 8 comments
Closed

Simplify the grammar of the assignment rule #122951

tomasr8 opened this issue Aug 12, 2024 · 8 comments
Labels
easy topic-parser type-feature A feature request or enhancement

Comments

@tomasr8
Copy link
Member
tomasr8 commented Aug 12, 2024

Feature or enhancement

Proposal:

I don't know all the nuances of the new grammar but I think that the assignment rule can be simplified in the following way:

assignment[stmt_ty]:
    | a=NAME ':' b=expression c=['=' d=annotated_rhs { d }] {
        CHECK_VERSION(
            stmt_ty,
            6,
            "Variable annotation syntax is",
            _PyAST_AnnAssign(CHECK(expr_ty, _PyPegen_set_expr_context(p, a, Store)), b, c, 1, EXTRA)
        ) }
    | a=('(' b=single_target ')' { b }
         | single_subscript_attribute_target) ':' b=expression c=['=' d=annotated_rhs { d }] {
        CHECK_VERSION(stmt_ty, 6, "Variable annotations syntax is", _PyAST_AnnAssign(a, b, c, 0, EXTRA)) }
-     | a[asdl_expr_seq*]=(z=star_targets '=' { z })+ b=(yield_expr | star_expressions) !'=' tc=[TYPE_COMMENT] {
+     | a[asdl_expr_seq*]=(z=star_targets '=' { z })+ b=annotated_rhs !'=' tc=[TYPE_COMMENT] {
         _PyAST_Assign(a, b, NEW_TYPE_COMMENT(p, tc), EXTRA) }
-     | a=single_target b=augassign ~ c=(yield_expr | star_expressions) {
+     | a=single_target b=augassign ~ c=annotated_rhs {
         _PyAST_AugAssign(a, b->kind, c, EXTRA) }
    | invalid_assignment

annotated_rhs[expr_ty]: yield_expr | star_expressions

That is, just reusing annotated_rhs instead of writing out (yield_expr | star_expressions).

There was a similar issue before that fixed some cases, but this one remained. I can submit a PR if this is considered worthwhile but anyone feel free to pick this up :)

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

#116988

Linked PRs

@tomasr8 tomasr8 added the type-feature A feature request or enhancement label Aug 12, 2024
@dongrixinyu
Copy link

Excuse me.
I am reading the source code of parse.c ane notice the function assignment_rule.
I am confused about the annotation of

// assignment:
//     | NAME ':' expression ['=' annotated_rhs]
//     | ('(' single_target ')' | single_subscript_attribute_target) ':' expression ['=' annotated_rhs]
//     | ((star_targets '='))+ (yield_expr | star_expressions) !'=' TYPE_COMMENT?
//     | single_target augassign ~ (yield_expr | star_expressions)
//     | invalid_assignment

What's the meaning of these patterns, for example, ((star_targets '='))+ (yield_expr | star_expressions) !'=' TYPE_COMMENT?

What is star_targets and star_expressions? Could you provide some help to understand it?

@picnixz
Copy link
Member
picnixz commented Sep 5, 2024

See https://docs.python.org/3/reference/grammar.html#full-grammar-specification to understand the grammar. The star_targets and star_expressions are defined later in the document. CPython mixes EBNF and PEG syntax just for their own convenience.

As for the issue, I think you can replace it. On the other hand, I wonder whether this could cause a slight performance change where the parser needs to make another pass to subsitute the rule or not.

EDIT: nvm the performance shouldn't really be the bottleneck here (in the relevant PR, the occurrences were just outright replaced)

@dongrixinyu
Copy link

I reviewed several functions in parse.c, like atom_rule. I find it recursively checks the type of atom tokens, such as NAME, True, False, None, Number, etc. It seem like really time-consuming.

Is it really nessesary to be designed like this?

@picnixz
Copy link
Member
picnixz commented Sep 6, 2024

Is it really nessesary to be designed like this?

If you want a more in-depth rationale, read PEP 617. If you can find a way to make an equivalent parser but simpler, we'll gladly accept it. Long story short, we moved from an LL(1) parser to a PEG one because left recursion is not supported by LL(k) parsers (actually, it can cause an infinite recursion on PEG parsers but LL(k) parsers do not allow them in their grammar). Another reason why we use a PEG parser is that a CFG parser would have ambiguous cases which a PEG does not by essence.

@dongrixinyu
Copy link

Thx for your explanation. I will take a deeper look at the parser.

I used to assume the parser works when executing python program. But when I checked CPython source codes, I recognized parsing Python codes only occurs before generating .pyc. Based on this, the speed of parsing Python codes does not make a big deal. Right?

@picnixz
Copy link
Member
picnixz commented Sep 6, 2024

Depends on the use I'd say but here it's probably negligible, though I'm not very familiar with how the pieces are generated. Nonetheless, simplfying how the grammar looks for the user is a good thing IMO.

@tomasr8 tomasr8 added the easy label Oct 3, 2024
@tomasr8
Copy link
Member Author
tomasr8 commented Oct 3, 2024

This issue seems suitable for a new contributor so I'm adding the easy label :)
Here's what you need to fix this:

  • You'll need to apply the changes mentioned in the first post to the Grammar/python.gram file.
  • Once you've made the changes to the grammar, you need to regenerate the parser. You can do that with make regen-pegen. More details here: https://devguide.python.org/developer-workflow/grammar/
  • After that you need to rebuild python again (for example with make -s -j2).

@zedr
Copy link
Contributor
zedr commented Oct 5, 2024

Opened #124998

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
easy topic-parser type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants
0