8000 Implement partial PEP-498 (f-string) support by klardotsh · Pull Request #2054 · adafruit/circuitpython · GitHub
[go: up one dir, main page]

Skip to content

Implement partial PEP-498 (f-string) support #2054

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

klardotsh
Copy link
@klardotsh klardotsh commented Aug 11, 2019

This implements (most of) the PEP-498 spec for f-strings, with two
exceptions:

  • raw f-strings (fr or rf prefixes) raise NotImplementedError
  • one special corner case does not function as specified in the PEP
    (more on that in a moment)

This is implemented in the core as a syntax translation, brute-forcing
all f-strings to run through String.format. For example, the statement
x='world'; print(f'hello {x}') gets translated at a syntax level
(injected into the lexer) to x='world'; print('hello {}'.format(x)).
While this may lead to weird column results in tracebacks, it seemed
like the fastest, most efficient, and likely most RAM-friendly option,
despite being implemented under the hood with a completely separate
vstr_t.

Since string concatenation of adjacent literals is implemented in the
lexer
,
two side effects emerge:

  • All strings with at least one f-string portion are concatenated into a
    single literal which must be run through String.format() wholesale,
    and:
  • Concatenation of a raw string with interpolation characters with an
    f-string will cause IndexError/KeyError, which is both different
    from CPython and different from the corner case mentioned in the PEP
    (which gave an example of the following:)
x = 10
y = 'hi'
assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de'

The above-linked commit detailed a pretty solid case for leaving string
concatenation in the lexer rather than putting it in the parser, and
undoing that decision would likely be disproportionately costly on
resources for the sake of a probably-low-impact corner case. An
alternative to become compliant with this corner case of the PEP would
be to revert to string concatenation in the parser only when an
f-string is part of concatenation
, though I've done no investigation on
the difficulty or costs of doing this.

A decent set of tests is included. I've manually tested this on the
unix port on Linux and on a Feather M4 Express (atmel-samd) and
things seem sane.

@klardotsh
Copy link
Author
klardotsh commented Aug 11, 2019

I've also submitted this upstream as micropython#4998 - modulo the lack of translation support upstream (at least in the way CircuitPython does it?), the PRs are identical. I'm happy for both projects to benefit from this, but since I mostly work in CircuitPython (and I'm also not sure how often CircuitPython rebases off upstream), I wanted to make sure the PR here landed as my top priority 🤷‍♂️

@klardotsh
Copy link
Author

Looks like one failure here was due to flash size for one locale of M0 builds - perhaps this needs stashed behind a mpconfig flag and disabled on M0s? Seems like an unideal UX but might have to do what we have to do (or I just made a really flash-size-bloated feature?)

Another failure I don't understand at all - this seems to have not even built the feature, as the very first f-string causes a SyntaxError. I'm not familiar enough with the test infra here to know what's up.

@jepler
Copy link
jepler commented Aug 11, 2019

The way the tests work, they are mostly comparing the circuitpython result with the python3 result. It looks like we are using python 3.5 as the baseline, due to travis being based on ubuntu xenial. That may mean that there's no way for us to validate the functionality right now, which is a bummer.

@jepler
Copy link
jepler commented Aug 11, 2019

And, yes, many of the ports are almost full (in terms of flash space usage). As a first step you could try making the f-string feature depend on the preprocessor macro CIRCUITPY_FULL_BUILD. I think this gets you "almost everything except boards based on SAMD21".

@tannewt tannewt added cpython api modules from cpython enhancement labels Aug 13, 2019
@tannewt tannewt added this to the 5.x.0 - Features milestone Aug 13, 2019
Copy link
Member
@tannewt tannewt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have this on in all builds. I've never actually seen the Itsy Bitsy M0 be the one hold out on space. Looks like we're 160 bytes over which we could probably find a way to squeeze.

I agree with Damien's suggestion on the other PR about tests too.

@klardotsh
Copy link
Author

Based on the bare-arm test failure I got in upstream PR, this feature added about 700 bytes to the bare-arm build. I can probably find a way to trim that down (though I have very few ideas on that just yet - at work at the moment so can't toy with things much right now).

I'm wondering if finding a better alternative to the dual-buffer chr0-chr2 + chr3-chr5 workaround for switching between reader/postfix buffers would help code size by enough. I'm not sure if mp_readers are seekable, but if they are, a more sane alternative may be to rewind that reader rather than backing up its already-read values when switching to the postfix sou 8000 rce. Hmm...

@tannewt
Copy link
Member
tannewt commented Aug 13, 2019

Ok, drop this to 1 from 3 and you should have space: https://github.com/adafruit/circuitpython/blob/master/py/circuitpy_mpconfig.h#L317

I'll go back in and bump it up for M4+ later.

@tannewt
Copy link
Member
tannewt commented Aug 16, 2019

Ok, we'll just wait for fixes from the upstream PR and then merge.

@tannewt
Copy link
Member
tannewt commented Aug 27, 2019

@klardotsh Looks like upstream's PR has stalled. Want to update this and we'll merge?

@klardotsh
Copy link
Author

@tannewt Let me bump that thread, it's possible Damien didn't see a notification/etc. of my questions/clarifications.

If I don't hear back by the weekend (likely the next availability I have to work on this) I'll focus on patching things up here!

This implements (most of) the PEP-498 spec for f-strings, with two
exceptions:

- raw f-strings (`fr` or `rf` prefixes) raise `NotImplementedError`
- one special corner case does not function as specified in the PEP
(more on that in a moment)

This is implemented in the core as a syntax translation, brute-forcing
all f-strings to run through `String.format`. For example, the statement
`x='world'; print(f'hello {x}')` gets translated *at a syntax level*
(injected into the lexer) to `x='world'; print('hello {}'.format(x))`.
While this may lead to weird column results in tracebacks, it seemed
like the fastest, most efficient, and *likely* most RAM-friendly option,
despite being implemented under the hood with a completely separate
`vstr_t`.

Since [string concatenation of adjacent literals is implemented in the
lexer](micropython@534b7c3),
two side effects emerge:

- All strings with at least one f-string portion are concatenated into a
single literal which *must* be run through `String.format()` wholesale,
and:
- Concatenation of a raw string with interpolation characters with an
f-string will cause `IndexError`/`KeyError`, which is both different
from CPython *and* different from the corner case mentioned in the PEP
(which gave an example of the following:)

```python
x = 10
y = 'hi'
assert ('a' 'b' f'{x}' '{c}' f'str<{y:^4}>' 'd' 'e') == 'ab10{c}str< hi >de'
```

The above-linked commit detailed a pretty solid case for leaving string
concatenation in the lexer rather than putting it in the parser, and
undoing that decision would likely be disproportionately costly on
resources for the sake of a probably-low-impact corner case. An
alternative to become complaint with this corner case of the PEP would
be to revert to string concatenation in the parser *only when an
f-string is part of concatenation*, though I've done no investigation on
the difficulty or costs of doing this.

A decent set of tests is included. I've manually tested this on the
`unix` port on Linux and on a Feather M4 Express (`atmel-samd`) and
things seem sane.
@tannewt
Copy link
Member
tannewt commented Mar 9, 2020

Closing in favor of #2690

@tannewt tannewt closed this Mar 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpython api modules from cpython enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0