-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
fix markevery plot option with nans in data #22519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening your first PR into Matplotlib!
If you have not heard from us in a while, please feel free to ping @matplotlib/developers
or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.
You can also join us on gitter for real-time discussion.
For details on testing, writing docs, and our review process, please see the developer guide
We strive to be a welcoming and open project. Please follow our Code of Conduct.
(I cannot try out the patch right now. So if you can attach a plot that would help at least me.) Thanks for this! It is clearly an issue that should be fixed. Isn't there a risk that markers end up where the value is NaN with this fix? (Probably clearer is '-o' is used, and maybe other nan-locations/markevery-values?) Not sure how one really should deal with it though. |
Thanks for the figures! I've manage to try out the patch a bit now and I agree with your conclusion that it doesn't look like there will markers on NaN. However, it is possible to provoke it to do some "strange" things, but as I mentioned earlier, it may in these cases not be obvious what one really would like to obtain. Still, based on the documentation: " markers will be spaced at approximately equal visual distances along the line; the distance along the line between markers is determined by multiplying the display-coordinate distance of the axes bounding-box diagonal by the value of every." or the demo: "Float arguments allow markers to be spaced at approximately equal distances along the line.", I would not expect it to become like: or Maybe an alternative approach would be to compute the "theoretical position" and then remove them if the closest actual data point is NaN? (On the other hand, the gallery shows some irregular spacing behavior when zoomed in: https://matplotlib.org/stable/gallery/lines_bars_and_markers/markevery_demo.html ) |
I have tried to polish it a bit, see the current version. |
You are right. Maybe one can assume some sort of maximum derivative based on the existing data? But it will probably be possible to find edge cases independent of how it is done. To some extent the most important thing is that it doesn't give an error... (And the behavior I was considering is obtained for an integer
This is indeed a tricky one, especially since any type of line disappears. I'm a bit too junior when it comes to approving stuff that actually defines a behavior, but I think the new version is probably as good as I can think of. Hopefully someone more experienced can chip in. |
Thanks for the feedback. I have now tried adding a test so that the code coverage passes. |
4557b26
to
08a0eec
Compare
Ping @matplotlib/developers - the tests pass, not sure if this counts as a new feature for the docs checklist - previously the code just failed. |
Thanks and sorry for forgetting about this. I think the fact that it is fixed is great and I think that at the end of the day one can probably always find NaN-patterns that gives non-ideal behavior (especially since one cannot really tell what the ideal behavior is). So for me it is a go. |
Thanks! |
lib/matplotlib/lines.py
Outdated
@@ -159,7 +159,24 @@ def _slice_or_none(in_v, slc): | |||
"markevery is specified relative to the axes size, but " | |||
"the line does not have a Axes as parent") | |||
# calc cumulative distance along path (in display coords): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't seem to be in the right place to me.
lib/matplotlib/lines.py
Outdated
aux = np.empty_like(tpath.vertices) | ||
for ic, vals in enumerate(tpath.vertices.T): | ||
ok = ~mask[:, ic] | ||
ii = np.where(ok, np.arange(ok.shape[0]), -1) | ||
i0, i1 = np.nonzero(ok)[0][[0, -1]] | ||
ii[:i0] = i0 | ||
ilo = np.maximum.accumulate(ii) | ||
ii[mask[:, ic]] = ok.shape[0] - 1 | ||
ii[i1:] = i1 | ||
ihi = np.minimum.accumulate(ii[::-1])[::-1] | ||
aux[:, ic] = 0.5 * (vals[ilo] + vals[ihi]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part really needs a comment for explanation; I'm not sure it's the simplest way to achieve what you want since I'm not sure what it's trying to do.
I have tried to explain the code. |
Sorry for being dense, but can you explain this algorithm to me? Naively, I would just do markevery on x, and if a value of y is NaN, then it wouldn't get marked. If I recall correctly there is some visual version of this - so perhaps that is the problem here? Sorry, I need more context without spending a lot of time. |
The patch does nothing to the marking algorithm (which I did not study), it just allows computing the "cumulative distance along path (in display coordinates)" when there are nans in the coordinates. Originally I tried just to remove the nans, but that did not work (data length-related errors). Hmmm. But it should be possible, let me retry. |
.. OK, so it is the "in display co-ordinates" that is the problem, caused by passing a float to markevery. Thanks for the reminder. I guess filling in with nonsense y values and then choosing based on that is a good algorithm as any. |
I have just pushed another implementation that just removes the nans. It gives slightly different results, subjectively sometimes not as "good", but the code is trivial and much easier to maintain. Let me know which version you prefer. |
disp_coords = affine.transform(tpath.vertices) | ||
fin = np.isfinite(verts).all(axis=1) | ||
fverts = verts[fin] | ||
disp_coords = affine.transform(fverts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems the easiest, and has the feature of only testing shown markers for visual distance. Not always what folks want, but at least doesn't error, and has the benefit of simplicity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still OK! I think we have similar views: no error thrown and some output that makes sense (although it is and always will be possible to find cases that gives "sub-optimal" results).
And nicer with a shorter/simpler solution as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is OK!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rc can you squash to one commit, and then I will merge... If you have not done a rebase before, make sure to make a backup branch so you can get back to this point. But basically
git fetch upstream
git rebase -i upstream/master
and choose squash for all your commits except the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
- TST: new test_markevery_linear_scales_nans() test + baseline images
842ad60
to
99d9475
Compare
Re-hi, just checking if the current state is OK. |
Sorry to let this languish! |
No problem, thanks for the review! |
PR Summary
Hi matplotlib developers,
This PR addresses the following code failure (with matplotlib '3.6.0.dev1701+gb80bc91827', Python 3.8.10):
where the error is
I have fixed the issue by replacing the non-finite values by means for each axis - not sure if it is right, but it is simple and works well for my data. Let me know if anything fancier is needed.
Best regards,
r.
PR Checklist
Tests and Styling
pytest
passes).flake8-docstrings
and runflake8 --docstring-convention=all
).Documentation
doc/users/next_whats_new/
(follow instructions in README.rst there).doc/api/next_api_changes/
(follow instructions in README.rst there).