-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
WIP: ENH: provide doctest runner which is less sensitive to number formatting #9674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
On the list a few people were against including a runner as part of numpy, suggesting it should be a separate project or part of the doctest module. Two reasons I see to keep it in numpy are: 1. we could update it in lockstep with numpy releases. This way if numpy updates printing in a way that breaks the runner, we can update the runner at the same time. Less headache for downstream projects. 2. The runner is potentially numpy-dependent, eg we might use |
You can compare to the Astropy (link) and Sympy (link) implementations of a similar doctest runner. I had forgotten about them when I wrote this, but the implementations turned out extremely similar. Features of the Astropy runner not in this PR: 1. Support for mathematical expressions involving Features of this PR not in the Astropy runner: 1. greedy grabs all surrounding whitespace 2. Plays more nicely with other doctest options/runners by replacing all numbers by a space and then super'ing the resulting string to the doctest module. 3. allows all Astropy does some regex stuff with |
208c2de
to
f7204ea
Compare
New doctest option `NPY_FLEX_NUM` is defined.
f7204ea
to
08d58f7
Compare
Just pointing out (no more than that) that SageMath also has something like this. SageMath has a very customized doctest framework and this is just one modification: https://github.com/sagemath/sage/blob/master/src/sage/doctest/parsing.py#L878 |
@ahaldane - looking at your implementation, and especially at various links, I cannot help to think that really we should do this upstream. It turns out that following up on an astropy issue [1], @kdavies4 wrote to python-ideas [2], with the final conclusion by Coghlan that this was a good idea [3]. What do you think? Shall we try to push it upstream? One possible difficulty is the lack of tuning possible, but I think that a simple scheme of an absolute offset which is the larger of ~1e-15 * number, or value of last digit would capture almost anything that we care about. (Independently, I would suggest to make that the default, rather than an arbitrary 1e-3; the default should be to look for rounding errors only.) [1] astropy/astropy#2662 |
Certainly for SageMath, the customizability is important. We do this with a comment like CC @embray |
That can still be done if it's a separate package right? Could be completely standalone, but also under the numpy Github organisation is possible - similar to
That's not really relevant for the decision to ship as part of numpy. Numpy itself won't even use this test runner, so including it within numpy would be weird. |
Pytest plugin? |
Re: tuning I agree that tunable tolerance is very desirable, since a lot of operations have much less precision than 1e-15, eg if they involve sums. Even in astropy doctests, I remember some of the doctests only want |
The sage implementation looks very good to me! It does a lot of things this PR doesn't do (eg support for random numbers), and I like the addition of the 'abs tol' per-test directive. We would need to remove or generalize some sage specific stuff (for parsing the sage command line style), and perhaps a few tweaks to the regexes. I think I now agree we should move it upstream or make it a separate package... I think the suggested location options are:
Then I think @pv is suggesting an implementation distinction between
Actually, I think that making it a pytest plugin requires making the doctest subclasses first, in any case. So we can leave the pytest decision for later. By the way, I see that pytest already has its own custom doctest subclasses too, which removes the |
The reason that I mentioned fixed defaults for upstreaming is that in the linked discussion it was suggested it was not easy to provide a precision together with a doctest option. I have not actually investigated this myself, but if true, I still think this should not stop us from upstreaming. For astropy at least, I'm fairly sure the "rounding error" level is almost always applicable (and one could certainly raise the minimum to 1e-10 instead of 1e-15). And if one also added a |
# These patterns detect spaces/punctuation before/after a number | ||
_punct = '[\[\]\(\),]' | ||
_before = '(?:\s+|^|(?<={}))'.format(_punct) | ||
_after = '(?:\s+|$|(?={}))'.format(_punct) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it should be _after = '(?=\s+|$|(?={}))'.format(_punct)
in order to capture numbers separated by white spaces: https://regex101.com/r/O9S9Z6/1
We now have a different solution (refcheck) merged in and running on CI. Can we close this? |
I think The discussion here about upstreaming our doc checking framework for the benefit of the ecosystem is relevant but also covered in #12548. |
Yes, OK to close this; I added my comment about upstreaming to doctest itself in #12548 - it would seem the ultimate solution. |
As discussed in #9139 and on the mailing list (link), it might be nice to provide downstream projects a doctest runner which ignores little details in how numbers are printed. This PR is a try at that, to get an idea of how it might play out for initial feedback before I work on it any more.
Essentially, this runner searches for all "number-like" string patterns in the want/got strings, extracts them, compares them numerically, and passes on the non-numeric parts to the usual doctest comparison function.
See the comments for more details.