8000 Use named groups in mathtext parser. by anntzer · Pull Request #21448 · matplotlib/matplotlib · GitHub
[go: up one dir, main page]

Skip to content

Use named groups in mathtext parser. #21448

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 21, 2022
Merged

Use named groups in mathtext parser. #21448

merged 1 commit into from
Apr 21, 2022

Conversation

anntzer
Copy link
Contributor
@anntzer anntzer commented Oct 24, 2021

This makes the definitions much easier to read, by removing the need to
use Group() and Suppress(): previously, these would be used to nest the
parsed tokens and remove the unnecessary ones (e.g, braces), but such
tokens can also easily be ignored by giving names to the tokens we use
(instead of accessing them by position). (See e.g. the implementation
of subsuper, which is also somewhat simplified.)

Also remove a few other pyparsing combinators that are not
needed: Combine (just concatenate the strings ourselves), FollowedBy
(use lookahead regexes), Literal (use plain strings, in particular
braces which are very common).

Also remove right_delim_safe, as self._right_delim already includes the
backslash-escaped closing brace rather than the unescaped one
(right_delim_safe dates back to when it didn't).

The error message for a^2_2^2 now changed to "double superscript" as
we now simply check for subscripts and superscripts one at a time, and
fail when we see either two subscript or two superscripts, emitting the
corresponding error.

(I played a bit with the parser to try and see what's wrong with pyparsing 3
and got there; unfortunately this patch doesn't fix the incompatibility :-()

Edit: This seems to fix support with pyparsing 3. It's not totally clear why, though...
A much more minimal fix would be

diff --git i/lib/matplotlib/_mathtext.py w/lib/matplotlib/_mathtext.py
index 73e6163944..fbf58b5df1 100644
--- i/lib/matplotlib/_mathtext.py
+++ w/lib/matplotlib/_mathtext.py
@@ -2046,7 +2046,7 @@ class Parser:
         p.accentprefixed <<= Suppress(p.bslash) + oneOf(self._accentprefixed)
         p.symbol_name   <<= (
             Combine(p.bslash + oneOf(list(tex2uni)))
-            + FollowedBy(Regex("[^A-Za-z]").leaveWhitespace() | StringEnd())
+            + Suppress(Regex("(?=[^A-Za-z]|$)").leaveWhitespace())
         )
         p.symbol        <<= (p.single_symbol | p.symbol_name).leaveWhitespace()

I still don't know whether that's a bug or an intended API change on pyparsing's side.

PR Summary

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

@anntzer
Copy link
Contributor Author
anntzer commented Oct 24, 2021

I'll milestone this as RC for mpl3.5, although for that release most likely we could (should) just extract the minimal patch (above, in the Edit: ...) to specifically fix compat with pyparsing 3.

@anntzer anntzer added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Oct 24, 2021
timhoffm added a commit to timhoffm/matplotlib that referenced this pull request Oct 24, 2021
- Code suggestion taken from matplotlib#21448
- Removed all pyparsing pinning. If this runs through CI, the fix
  is proven to be working.
@timhoffm timhoffm modified the milestones: v3.5.0, v3.6.0 Oct 24, 2021
@timhoffm timhoffm removed the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Oct 24, 2021
@timhoffm
Copy link
Member

We should only backport the minimal fix as suggested. I've created #21454 for this. So we can remilestone to 3.6 and remove the "release critical" tag here.

timhoffm added a commit to timhoffm/matplotlib that referenced this pull request Oct 24, 2021
- Code suggestion taken from matplotlib#21448
- Removed all pyparsing pinning. If this runs through CI, the fix
  is proven to be working.
@anntzer
Copy link
Contributor Author
anntzer commented Oct 24, 2021

I know, I was just letting someone else do the work :-)

timhoffm added a commit to timhoffm/matplotlib that referenced this pull request Oct 24, 2021
- Code suggestion taken from matplotlib#21448
- Removed all pyparsing pinning. If this runs through CI, the fix
  is proven to be working.
@timhoffm
Copy link
Member

I suggest you remove the pyparsing pinning in this PR as well to make sure it solves the issue.

@anntzer
Copy link
Contributor Author
anntzer commented Oct 24, 2021

This is already done in the second commit.

@anntzer
Copy link
Contributor Author
anntzer commented Oct 26, 2021

Seems also a bit faster, see #20821 (comment).

Edit: self-contained perf test:

MPLBACKEND=agg python -c 'import time; from pylab import *; from matplotlib.tests.test_mathtext import math_tests; fig = figure(figsize=(3, 10)); fig.text(0, 0, "\n".join(filter(None, math_tests)), size=6); start = time.perf_counter(); [fig.canvas.draw() for _ in range(10)]; print((time.perf_counter() - start) / 10)'

before: ~1.02s; after: ~0.81s.

This makes the definitions much easier to read, by removing the need to
use Group() and Suppress(): previously, these would be used to nest the
parsed tokens and remove the unnecessary ones (e.g, braces), but such
tokens can also easily be ignored by giving names to the tokens we use
(instead of accessing them by position).  (See e.g. the implementation
of subsuper, which is also somewhat simplified.)

Also remove a few other pyparsing combinators that are not
needed: Combine (just concatenate the strings ourselves), FollowedBy
(use lookahead regexes), Literal (use plain strings, in particular
braces which are very common).

Also remove right_delim_safe, as self._right_delim already includes the
backslash-escaped closing brace rather than the unescaped one
(right_delim_safe dates back to when it didn't).

The error message for `a^2_2^2` now changed to "double superscript" as
we now simply check for subscripts and superscripts one at a time, and
fail when we see either two subscript or two superscripts, emitting the
corresponding error.
@oscargus oscargus merged commit c6d4a29 into matplotlib:main Apr 21, 2022
@anntzer anntzer deleted the mtp branch May 1, 2022 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0