gh-127833: Reword and expand the Notation section by encukou · Pull Request #134443 · python/cpython · GitHub
Merged
Consolidate with the Full Grammar intro
Co-authored-by: Blaise Pabon <blaise@gmail.com>
encukou and blaisep committed May 21, 2025
commit ec90d4066987534c7dbeed7e91aadbac1ff8670b
16 changes: 7 additions & 9 deletions Doc/reference/grammar.rst
@@ -8,15 +8,13 @@ used to generate the CPython parser (see :source:`Grammar/python.gram`).
The version here omits details related to code generation and
error recovery.

-The notation is a mixture of `EBNF
-<https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form>`_
-and `PEG <https://en.wikipedia.org/wiki/Parsing_expression_grammar>`_.
-In particular, ``&`` followed by a symbol, token or parenthesized
-group indicates a positive lookahead (i.e., is required to match but
-not consumed), while ``!`` indicates a negative lookahead (i.e., is
-required *not* to match). We use the ``|`` separator to mean PEG's
-"ordered choice" (written as ``/`` in traditional PEG grammars). See
-:pep:`617` for more details on the grammar's syntax.
+The notation used here is the same as in the preceding docs,
+and is described in the :ref:`notation <notation>` section,
+except for a few extra complications:
+
+* ``&e``: a positive lookahead (that is, ``e`` is required to match but
+  not consumed)
+* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match)

.. literalinclude:: ../../Grammar/python.gram
:language: peg
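
As an aside to the two lookahead bullets added in this hunk, the following is a minimal Python sketch of what "required to match but not consumed" means in practice. It is an illustration only: the helper names (`lit`, `seq`, `pos_lookahead`, `neg_lookahead`) are hypothetical and are not how CPython's generated parser is implemented.

```python
# Illustrative sketch only: tiny PEG-style matchers, not CPython's parser.
# A matcher takes (text, pos) and returns the end position, or None on failure.

def lit(s):
    """Match the literal string s and consume it."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*items):
    """e1 e2 ...: each item must match where the previous one ended."""
    def match(text, pos):
        for item in items:
            pos = item(text, pos)
            if pos is None:
                return None
        return pos
    return match

def pos_lookahead(e):
    """&e: succeed only if e matches here, but consume nothing."""
    def match(text, pos):
        return pos if e(text, pos) is not None else None
    return match

def neg_lookahead(e):
    """!e: succeed only if e does NOT match here; consume nothing."""
    def match(text, pos):
        return pos if e(text, pos) is None else None
    return match

# 'a' &'b': matches "a" only when followed by "b"; the "b" is not consumed.
a_before_b = seq(lit("a"), pos_lookahead(lit("b")))
# 'a' !'b': matches "a" only when NOT followed by "b".
a_not_before_b = seq(lit("a"), neg_lookahead(lit("b")))

print(a_before_b("ab", 0))      # 1 (position after "a"; "b" is left unconsumed)
print(a_before_b("ac", 0))      # None
print(a_not_before_b("ac", 0))  # 1
print(a_not_before_b("ab", 0))  # None
```

In both successful cases the returned position is 1, not 2: the lookahead constrains what may follow without advancing the input.
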
16 changes: 11 additions & 5 deletions Doc/reference/introduction.rst
@@ -90,9 +90,10 @@

.. index:: BNF, grammar, syntax, notation

-The descriptions of lexical analysis and syntax use a modified
-`Backus–Naur form (BNF) <https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form>`_ grammar
-notation. This uses the following style of definition:
+The descriptions of lexical analysis use a grammar notation that is a mixture
+of `EBNF <https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form>`_
+and `PEG <https://en.wikipedia.org/wiki/Parsing_expression_grammar>`_.
+For example:

.. grammar-snippet::
:group: notation
@@ -136,7 +137,11 @@
* ``e1 e2``: Items separated only by whitespace denote a sequence.
Here, ``e1`` must be followed by ``e2``.
* ``e1 | e2``: A vertical bar is used to separate alternatives.
-  It is the least tightly binding operator in this notation.
+  It denotes PEG's "ordered choice": if ``e1`` matches, ``e2`` is
+  not considered.
+  In traditional PEG grammars, this is written as a slash, ``/``, rather than
+  a vertical bar.
+  See :pep:`617` for more background and details.
* ``e*``: A star means zero or more repetitions of the preceding item.
* ``e+``: Likewise, a plus means one or more repetitions.
* ``[e]``: A phrase enclosed in square brackets means zero or
@@ -145,14 +150,15 @@
the preceding item is optional.
* ``(e)``: Parentheses are used for grouping.

-The unary operators (``*``, ``+``, ``?``) bind as tightly as possible.
+The unary operators (``*``, ``+``, ``?``) bind as tightly as possible;
+the vertical bar (``|``) binds most loosely.
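
To make the "ordered choice" wording and the binding rule above concrete, here is a small Python sketch. It assumes a toy matcher model with hypothetical helpers (`lit`, `seq`, `choice`); it is not taken from CPython and only illustrates the semantics described in the changed text.

```python
# Illustrative sketch only: hypothetical helpers, not CPython's parser.
# A matcher takes (text, pos) and returns the end position, or None on failure.

def lit(s):
    """Match the literal string s and consume it."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match

def seq(*items):
    """e1 e2 ...: each item must match where the previous one ended."""
    def match(text, pos):
        for item in items:
            pos = item(text, pos)
            if pos is None:
                return None
        return pos
    return match

def choice(*alts):
    """e1 | e2 ...: ordered choice; the first alternative that matches wins."""
    def match(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end  # later alternatives are never considered
        return None
    return match

# ('a' | 'ab') 'c': on "abc" the choice commits to 'a', so 'c' fails at "b".
short_first = seq(choice(lit("a"), lit("ab")), lit("c"))
# ('ab' | 'a') 'c': putting the longer alternative first lets "abc" match.
long_first = seq(choice(lit("ab"), lit("a")), lit("c"))

print(short_first("abc", 0))  # None
print(long_first("abc", 0))   # 3
print(long_first("ac", 0))    # 2

# Because | binds most loosely, 'a' 'b' | 'c' groups as ('a' 'b') | 'c'.
loose_bar = choice(seq(lit("a"), lit("b")), lit("c"))
print(loose_bar("ab", 0))  # 2
print(loose_bar("c", 0))   # 1
print(loose_bar("ac", 0))  # None; it would match under 'a' ('b' | 'c')
```

The `short_first` case is the practical consequence of ordered choice: alternative order matters, and a shorter alternative listed first can hide a longer one.
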

White space is only meaningful to separate tokens.

Rules are normally contained on a single line, but rules that are too long
may be wrapped:

.. grammar-snippet::

Check warnings on line 161 in Doc/reference/introduction.rst (GitHub Actions / Docs / Docs):

* 'token' reference target not found: notation:stringliteral [ref.token]
* 'token' reference target not found: notation:bytesliteral [ref.token]
* 'token' reference target not found: notation:integer [ref.token]
* 'token' reference target not found: notation:floatnumber [ref.token]
* 'token' reference target not found: notation:imagnumber [ref.token]

:group: notation

literal: `stringliteral` | `bytesliteral`
@@ -163,7 +169,7 @@
For example:


.. grammar-snippet::

Check warnings on line 172 in Doc/reference/introduction.rst (GitHub Actions / Docs / Docs):

* 'token' reference target not found: notation-alt:stringliteral [ref.token]
* 'token' reference target not found: notation-alt:bytesliteral [ref.token]
* 'token' reference target not found: notation-alt:integer [ref.token]
* 'token' reference target not found: notation-alt:floatnumber [ref.token]
* 'token' reference target not found: notation-alt:imagnumber [ref.token]

:group: notation-alt

literal: