Add PEP 536: f-literals (#155) · python/peps@64eebdf · GitHub

Commit 64eebdf

flying-sheep authored and brettcannon committed
Add PEP 536: f-literals (#155)
1 parent 3799435 commit 64eebdf

1 file changed

+183
-0
lines changed

pep-0536.txt

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
1+
PEP: 536
2+
Title: Final Grammar for Literal String Interpolation
3+
Version: $Revision$
4+
Last-Modified: $Date$
5+
Author: Philipp Angerer <phil.angerer@gmail.com>
6+
Status: Draft
7+
Type: Standards Track
8+
Content-Type: text/x-rst
9+
Created: 11-Dec-2016
10+
Python-Version: 3.7
11+
Post-History: 12-Dec-2016
12+
13+
Abstract
14+
========
15+
16+
PEP 498 introduced Literal String Interpolation (or “f-strings”).
17+
The expression portions of those literals however are subject to
18+
certain restrictions. This PEP proposes a formal grammar lifting
19+
those restrictions, promoting “f-strings” to “f expressions” or f-literals.
20+
21+
This PEP expands upon the f-strings introduced by PEP 498,
22+
so this text requires familiarity with PEP 498.
23+
24+
Terminology
25+
===========
26+
27+
This text will refer to the existing grammar as “f-strings”,
28+
and the proposed one as “f-literals”.
29+
30+
Furthermore, it will refer to the ``{}``-delimited expressions in
31+
f-literals/f-strings as “expression portions” and the static string content
32+
around them as “string portions”.
33+
34+
Motivation
35+
==========
36+
37+
The current implementation of f-strings in CPython relies on the existing
38+
string parsing machinery and a post-processing of its tokens. This results in
39+
several restrictions to the possible expressions usable within f-strings:
40+
41+
#. It is impossible to use the quote character delimiting the f-string
42+
within the expression portion::
43+
44+
>>> f'Magic wand: { bag['wand'] }'
45+
^
46+
SyntaxError: invalid syntax
47+
48+
#. A previously considered way around it would lead to escape sequences
49+
in executed code and is prohibited in f-strings::
50+
51+
>>> f'Magic wand { bag[\'wand\'] } string'
52+
SyntaxError: f-string expression portion cannot include a backslash
53+
54+
#. Comments are forbidden even in multi-line f-strings::
55+
56+
>>> f'''A complex trick: {
57+
... bag['bag'] # recursive bags!
58+
... }'''
59+
SyntaxError: f-string expression part cannot include '#'
60+
61+
#. Expression portions need to wrap ``':'`` and ``'!'`` in parentheses::
62+
63+
>>> f'Useless use of lambdas: { lambda x: x*2 }'
64+
SyntaxError: unexpected EOF while parsing
65+
66+
These limitations serve no purpose from a language user's perspective and
67+
can be lifted by giving f-literals a regular grammar without exceptions
68+
and implementing it using dedicated parse code.
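
For comparison, the following is a minimal sketch of the workarounds that the restrictions listed above force on users of the existing f-string grammar. The ``bag`` dictionary and its contents are hypothetical and only serve as illustration; each line should be accepted by any interpreter that supports PEP 498 f-strings.

```python
# Hypothetical data, invented for this illustration.
bag = {"wand": "elder", "bag": {"wand": "oak"}}

# 1. Same quote character: switch the outer delimiter to the other quote.
same_quote = f"Magic wand: { bag['wand'] }"

# 2. Escaped quotes: move the string out of the expression portion instead.
key = "wand"
via_variable = f'Magic wand: { bag[key] }'

# 3. Comments: put the comment on a separate statement outside the literal.
inner = bag["bag"]  # recursive bags!
commented = f'''A complex trick: { inner['wand'] }'''

# 4. ':' in the expression: wrap the lambda in parentheses.
doubled = f'Useless use of lambdas: { (lambda x: x*2)(21) }'
```

Each workaround changes the code surrounding the literal rather than the expression itself, which is the kind of exception the proposed grammar aims to remove.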
69+
70+
Rationale
71+
=========
72+
73+
.. https://mail.python.org/pipermail/python-ideas/2016-August/041727.html
74+
75+
The restrictions mentioned in Motivation_ are non-obvious and counter-intuitive
76+
unless the user is familiar with the f-strings’ implementation details.
77+
78+
As mentioned, a previous version of PEP 498 allowed escape sequences
79+
anywhere in f-strings, including as ways to encode the braces delimiting
80+
the expression portions and in their code. They would be expanded before
81+
the code is parsed, which would have had several important ramifications:
82+
83+
#. It would not be clear to human readers which portions are expressions
84+
and which are strings. Great material for an “obfuscated/underhanded
85+
Python challenge”.
86+
#. Syntax highlighters are good at parsing nested grammars, but not
87+
in recognizing escape sequences. ECMAScript 2016 (JavaScript) allows
88+
escape sequences in its identifiers [1]_ and the author knows of no
89+
syntax highlighter able to correctly highlight code making use of this.
90+
91+
As a consequence, the expression portions would be harder to recognize
92+
with and without the aid of syntax highlighting. With the new grammar,
93+
it is easy to extend syntax highlighters to correctly parse
94+
and display f-literals:
95+
96+
.. raw:: html
97+
98+
<pre><span style=color:#ff5500>f'Magic wand: </span><span style=color:#3daee9>{</span>bag[<span style=color:#bf0303>'wand'</span>]<span style=color:#3daee9>:^10}</span><span style=color:#ff5500>'</span></pre>
99+
100+
.. This is the output of kate-syntax-highlighter when given that code
101+
(with some quotes stripped)
102+
103+
Highlighting expression portions with possible escape sequences would
104+
mean creating a modified copy of all rules of the complete expression
105+
grammar, accounting for the possibility of escape sequences in keywords,
106+
delimiters, and all other language syntax. One such duplication would
107+
yield one level of escaping depth and have to be repeated for a deeper
108+
escaping in a recursive f-literal. This is the case since no highlighting
109+
engine known to the author supports expanding escape sequences before
110+
applying rules to a certain context. Nesting contexts however is a
111+
standard feature of all highlighting engines.
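
The nesting approach can be sketched in a few lines. The scanner below is a simplified illustration, not a real highlighting engine: ``split_portions`` is a hypothetical helper that takes the body of an f-literal (without the ``f`` prefix and outer quotes) and separates string portions from expression portions purely by tracking nested brackets and nested string literals, without expanding any escape sequences. It assumes well-formed input, treats all bracket kinds alike, leaves any conversion or format specifier inside the expression portion, and does not handle triple-quoted nested strings.

```python
def split_portions(body):
    """Split an f-literal body into ("string", ...) and ("expression", ...) parts."""
    portions = []
    text = []
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if body[i:i + 2] in ("{{", "}}"):
            # Doubled braces stand for a literal brace in a string portion.
            text.append(ch)
            i += 2
        elif ch == "{":
            if text:
                portions.append(("string", "".join(text)))
                text = []
            depth = 1      # nesting depth of brackets inside the expression
            quote = None   # delimiter of the nested string we are in, if any
            j = i + 1
            while j < n and depth:
                c = body[j]
                if quote:
                    if c == "\\":
                        j += 1          # skip the escaped character unexpanded
                    elif c == quote:
                        quote = None    # nested string context ends
                elif c in "'\"":
                    quote = c           # nested string context begins
                elif c in "{[(":
                    depth += 1
                elif c in "}])":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated expression portion")
            portions.append(("expression", body[i + 1:j - 1]))
            i = j
        else:
            text.append(ch)
            i += 1
    if text:
        portions.append(("string", "".join(text)))
    return portions
```

Because the scanner only pushes and pops contexts, a quote inside an expression portion never ends the surrounding literal, whichever quote character the literal itself uses.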
112+
113+
Familiarity also plays a role: Arbitrary nesting of expressions
114+
without expansion of escape sequences is available in every single
115+
other language employing a string interpolation method that uses
116+
expressions instead of just variable names. [2]_
117+
118+
Specification
119+
=============
120+
121+
PEP 498 specifies f-strings as follows, but places restrictions on them::
122+
123+
f ' <text> { <expression> <optional !s, !r, or !a> <optional : format specifier> } <text> ... '
124+
125+
All restrictions mentioned in the PEP are lifted from f-literals,
126+
as explained below:
127+
128+
#. Expression portions may now contain strings delimited with the same
129+
kind of quote that is used to delimit the f-literal.
130+
#. Backslashes may now appear within expressions just like anywhere else
131+
in Python code. In case of strings nested within f-literals,
132+
escape sequences are expanded when the innermost string is evaluated.
133+
#. Comments, using the ``'#'`` character, are possible only in multi-line
134+
f-literals, since comments are terminated by the end of the line
135+
(which makes closing a single-line f-literal impossible).
136+
#. Expression portions may contain ``':'`` or ``'!'`` wherever
137+
syntactically valid. The first ``':'`` or ``'!'`` that is not part
138+
of an expression has to be followed by a valid coercion or format specifier.
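
Whether a given interpreter already accepts the forms listed above can be probed without hard-coding an answer. The sketch below is only an illustration: ``accepts`` and ``PROBES`` are hypothetical names, the probe sources are adapted from the examples in this text, and the outcome of each probe depends on the interpreter running it.

```python
def accepts(source):
    """Return True if this interpreter compiles `source` as an expression."""
    try:
        compile(source, "<f-literal probe>", "eval")
    except SyntaxError:
        return False
    return True


# Each value is Python source text containing one f-literal.
PROBES = {
    "same quote in expression":
        "f'Magic wand: { bag['wand'] }'",
    # The probed source contains a backslash inside the expression portion.
    "backslash in expression":
        "f'Lines: { \"\\n\".join(lines) }'",
    "comment in multi-line literal":
        "f'''A complex trick: {\n    bag['bag']  # recursive bags!\n}'''",
    "unparenthesized lambda":
        "f'Useless use of lambdas: { lambda x: x*2 }'",
}

# Only compilation is tested; the names used in the probes are never evaluated.
results = {name: accepts(source) for name, source in PROBES.items()}
```

Printing ``results`` shows which of the restrictions the running interpreter still enforces.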
139+
140+
A remaining restriction not explicitly mentioned by PEP 498 is line breaks
141+
in expression portions. Since strings delimited by single ``'`` or ``"``
142+
characters are expected to be single-line, line breaks remain illegal
143+
in expression portions of single-line strings.
144+
145+
.. note:: Is lifting of the restrictions sufficient,
146+
or should we specify a more complete grammar?
147+
148+
Backwards Compatibility
149+
=======================
150+
151+
f-literals are fully backwards compatible with f-strings,
152+
and expand the syntax considered legal.
153+
154+
Reference Implementation
155+
========================
156+
157+
TBD
158+
159+
References
160+
==========
161+
162+
.. [1] ECMAScript ``IdentifierName`` specification
163+
( http://ecma-international.org/ecma-262/6.0/#sec-names-and-keywords )
164+
165+
Yes, ``const cthulhu = { H̹̙̦̮͉̩̗̗ͧ̇̏̊̾Eͨ͆͒̆ͮ̃͏̷̮̣̫̤̣Cͯ̂͐͏̨̛͔̦̟͈̻O̜͎͍͙͚̬̝̣̽ͮ͐͗̀ͤ̍̀͢M̴̡̲̭͍͇̼̟̯̦̉̒͠Ḛ̛̙̞̪̗ͥͤͩ̾͑̔͐ͅṮ̴̷̷̗̼͍̿̿̓̽͐H̙̙̔̄͜\u0042: 42 }`` is valid ECMAScript 2016
166+
167+
.. [2] Wikipedia article on string interpolation
168+
( https://en.wikipedia.org/wiki/String_interpolation )
169+
170+
Copyright
171+
=========
172+
173+
This document has been placed in the public domain.
174+
175+
176+
..
177+
Local Variables:
178+
mode: indented-text
179+
indent-tabs-mode: nil
180+
sentence-end-double-space: t
181+
fill-column: 70
182+
coding: utf-8
183+
End:

0 commit comments
