-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Implement partial PEP-498 (f-string) support #2054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I've also submitted this upstream as micropython#4998 - modulo the lack of translation support upstream (at least in the way CircuitPython does it?), the PRs are identical. I'm happy for both projects to benefit from this, but since I mostly work in CircuitPython (and I'm also not sure how often CircuitPython rebases off upstream), I wanted to make sure the PR here landed as my top priority 🤷♂️ |
Looks like one failure here was due to flash size for one locale of M0 builds - perhaps this needs stashed behind a Another failure I don't understand at all - this seems to have not even built the feature, as the very first f-string causes a |
The way the tests work, they are mostly comparing the circuitpython result with the python3 result. It looks like we are using python 3.5 as the baseline, due to travis being based on ubuntu xenial. That may mean that there's no way for us to validate the functionality right now, which is a bummer. |
And, yes, many of the ports are almost full (in terms of flash space usage). As a first step you could try making the f-string feature depend on the preprocessor macro CIRCUITPY_FULL_BUILD. I think this gets you "almost everything except boards based on SAMD21". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to have this on in all builds. I've never actually seen the Itsy Bitsy M0 be the one hold out on space. Looks like we're 160 bytes over which we could probably find a way to squeeze.
I agree with Damien's suggestion on the other PR about tests too.
Based on the I'm wondering if finding a better alternative to the dual-buffer chr0-chr2 + chr3-chr5 workaround for switching between reader/postfix buffers would help code size by enough. I'm not sure if |
Ok, drop this to 1 from 3 and you should have space: https://github.com/adafruit/circuitpython/blob/master/py/circuitpy_mpconfig.h#L317 I'll go back in and bump it up for M4+ later. |
Ok, we'll just wait for fixes from the upstream PR and then merge. |
@klardotsh Looks like upstream's PR has stalled. Want to update this and we'll merge? |
@tannewt Let me bump that thread, it's possible Damien didn't see a notification/etc. of my questions/clarifications. If I don't hear back by the weekend (likely the next availability I have to work on this) I'll focus on patching things up here! |
This implements (most of) the PEP-498 spec for f-strings, with two exceptions: - raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError` - one special corner case does not function as specified in the PEP (more on that in a moment) This is implemented in the core as a syntax translation, brute-forcing all f-strings to run through `String.format`. For example, the statement `x='world'; print(f'hello {x}')` gets translated *at a syntax level* (injected into the lexer) to `x='world'; print('hello {}'.format(x))`. While this may lead to weird column results in tracebacks, it seemed like the fastest, most efficient, and *likely* most RAM-friendly option, despite being implemented under the hood with a completely separate `vstr_t`. Since [string concatenation of adjacent literals is implemented in the lexer](micropython@534b7c3), two side effects emerge: - All strings with at least one f-string portion are concatenated into a single literal which *must* be run through `String.format()` wholesale, and: - Concatenation of a raw string with interpolation characters with an f-string will cause `IndexError`/`KeyError`, which is both different from CPython *and* different from the corner case mentioned in the PEP (which gave an example of the following:) ```python x = 10 y = 'hi' assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de' ``` The above-linked commit detailed a pretty solid case for leaving string concatenation in the lexer rather than putting it in the parser, and undoing that decision would likely be disproportionately costly on resources for the sake of a probably-low-impact corner case. An alternative to become complaint with this corner case of the PEP would be to revert to string concatenation in the parser *only when an f-string is part of concatenation*, though I've done no investigation on the difficulty or costs of doing this. A decent set of tests is included. I've manually tested this on the `unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and things seem sane.
170b269
to
1ae77fb
Compare
Closing in favor of #2690 |
This implements (most of) the PEP-498 spec for f-strings, with two
exceptions:
fr
orrf
prefixes) raiseNotImplementedError
(more on that in a moment)
This is implemented in the core as a syntax translation, brute-forcing
all f-strings to run through
String.format
. For example, the statementx='world'; print(f'hello {x}')
gets translated at a syntax level(injected into the lexer) to
x='world'; print('hello {}'.format(x))
.While this may lead to weird column results in tracebacks, it seemed
like the fastest, most efficient, and likely most RAM-friendly option,
despite being implemented under the hood with a completely separate
vstr_t
.Since string concatenation of adjacent literals is implemented in the
lexer,
two side effects emerge:
single literal which must be run through
String.format()
wholesale,and:
f-string will cause
IndexError
/KeyError
, which is both differentfrom CPython and different from the corner case mentioned in the PEP
(which gave an example of the following:)
The above-linked commit detailed a pretty solid case for leaving string
concatenation in the lexer rather than putting it in the parser, and
undoing that decision would likely be disproportionately costly on
resources for the sake of a probably-low-impact corner case. An
alternative to become compliant with this corner case of the PEP would
be to revert to string concatenation in the parser only when an
f-string is part of concatenation, though I've done no investigation on
the difficulty or costs of doing this.
A decent set of tests is included. I've manually tested this on the
unix
port on Linux and on a Feather M4 Express (atmel-samd
) andthings seem sane.