Add new example for plotting a confidence_ellipse #13570

CarstenSchelp · 2019-03-02T22:40:45Z

PR Summary

New statistics-example with a robust and exact way to plot a confidence ellipse of a two-dimensional dataset.

PR Checklist

Has Pytest style unit tests
Code is Flake 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

anntzer · 2019-03-02T22:54:49Z

examples/statistics/confidence_ellipse.py

+    kwargs : `~matplotlib.patches.Patch` properties
+
+    author : Carsten Schelp
+    license: GNU General Public License v3.0 (https://github.com/CarstenSchelp/CarstenSchelp.github.io/blob/master/LICENSE)


Not a lawyer so pinging @tacaswell on this, but I think GPL won't be acceptable. Anyone can dismiss once resolved.

And most other examples don’t have an author block....

Not a lawyer so pinging @tacaswell on this, but I think GPL won't be acceptable. Anyone can dismiss once resolved.

When no other example bothers with licenses I won't, either. I just thought that the python license was GPLv3.0 compatible?

Oh - matplotlib has a license of its own. Either way - I removed the license block. I hope that it is ok now if I resolve this conversation?

@anntzer: Hi, thanks to the community this pull request is approaching a state that might be called 'mature'. Do you agree that your change request has been addressed appropriately? It was about the probably problematic license tag in the docstring.

anntzer · 2019-03-02T22:56:15Z

Thanks for the PR!

There are a few styling issues; we don't apply PEP8 extremely strictly on the examples but I'd suggest you run flake8 on the file and fix as many of the style points it raises as you can.


CarstenSchelp · 2019-03-03T09:02:56Z

Thank you for the reviews! I have applied the suggested changes.

timhoffm · 2019-03-03T11:52:11Z

This is technically ok, but it's a lot of code (see current doc build for the rendered result).

Putting all the code into a single code block might be a bit scary to the reader. Have you considered splitting the example into multiple sections? See e.g. https://matplotlib.org/devdocs/gallery/lines_bars_and_markers/joinstyle.html (source is in examples/lines_bars_and_markers/joinstyle.py).

CarstenSchelp · 2019-03-03T13:48:14Z

@timhoffm Hi Tim, you are right - it even scared me! Thanks for pointing me to that example. The one I picked to get started did not have to use more than one code block. I will be back with a more friendly layout shortly.

CarstenSchelp · 2019-03-03T14:51:24Z

I hope that I managed to make the example more user-friendly, now.

jklymak · 2019-03-03T15:02:51Z

Ellipse takes x,y width height and angle, so why do you use transforms to supply those parameters?

CarstenSchelp · 2019-03-03T19:11:10Z

Ellipse takes x,y width height and angle, so why do you use transforms to supply those parameters?

Hi @jklymak,
I appreciate your question, thank you!
I could move the rotation out of sight by specifying the angle when creating the ellipse, like so, after line 65:

    ellipse = Ellipse((0, 0),
                      width=ell_radius_x * 2,
                      height=ell_radius_y * 2,
                      angle=45,
                      **kwargs)

The effect is that the same transform would be applied a code-layer lower.

As for the translation transform - it cannot be worked away by specifying (mean_x, mean_y) instead of the origin (0, 0) as the center of the ellipse. The scale transform is relative to the origin and the ellipse shifts to the wrong place when it is scaled while it is already in its final place. Some implementations of scale() allow the user to specify a center of the scaling transform other than the origin. But this one does not. (Also this would mean nothing else than shifting the ellipse back and forth once more.)
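To make the order-of-operations point concrete, here is a minimal sketch in plain Python. The helpers `scale` and `translate` are hypothetical stand-ins for the transform stack, not the matplotlib API, and the mean and scale values are made up. It only shows that a scale about the origin leaves a centre at (0, 0) in place but drags a centre that already sits at the mean.

```python
# Plain-Python stand-ins for two affine steps (hypothetical helpers,
# not matplotlib's Affine2D).

def scale(point, sx, sy):
    """Scale a point relative to the origin (0, 0)."""
    x, y = point
    return (x * sx, y * sy)


def translate(point, tx, ty):
    """Shift a point by (tx, ty)."""
    x, y = point
    return (x + tx, y + ty)


mean = (10.0, 4.0)    # assumed data mean
scales = (3.0, 0.5)   # assumed n_std * standard deviation per axis

# Ellipse created at the origin: scale first, translate last.
center_ok = translate(scale((0.0, 0.0), *scales), *mean)

# Ellipse created at the mean and scaled afterwards: the centre moves too.
center_bad = scale(mean, *scales)

print(center_ok)   # (10.0, 4.0)
print(center_bad)  # (30.0, 2.0)
```

With the scale applied last, the centre ends up at (30, 2) instead of the mean, which is why the translation stays a separate, final step.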

Executing the transforms explicitly does not degrade efficiency as far as I can tell. Also I like to see all transforms in the same code-layer rather than having some executed here and some there.
The way I coded this also corresponds nicely with the "recipe" that I give in the article that explains this whole approach.
I hope this explanation is helpful. Do let me know when I am not entirely clear!

CarstenSchelp · 2019-03-03T19:16:06Z

Aah! One wrong click! Sorry for this closing-reopening hubbub. Went by accident.

timhoffm

Looks much nicer now. I've proposed some additional improvements 😄

examples/statistics/confidence_ellipse.py


jklymak · 2019-03-03T21:15:51Z

The effect is that the same transform would be applied a code-layer lower.

Yes, but the naive user's tendency would be to not apply a transform. If you need one then great, but explain why. I'm not convinced you do, but I could certainly be wrong...

As for the translation transform - it cannot be worked away by specifying (mean_x, mean_y) instead of the origin (0, 0) as the center of the ellipse.

I don't understand why you need a scale transform at all.

The way I coded this also corresponds nicely with the "recipe" that I give in the article that explains this whole approach.

Do you link the article? Because I'm not following why you don't know the width and height a-priori. If you can't do this any other way than by specifying transforms it would be good to explain it clearly to the user, because otherwise they will wonder why you aren't just rotating at the start.

CarstenSchelp · 2019-03-03T21:31:58Z

Do you link the article?

The link to the article is in the docstring of the method. But in fact, the approach does need some explanation and maybe the link should be in a more prominent place. Like the first heading that goes "The plotting function itself ...".
If that is not enough then I could place some comments in the code like "This is a normalized covariance, the slope of the ascending axis of its ellipse is always 1 (=45deg).", "Now scale the normalized ellipse back to size - that way, the angle takes care of itself." … and so forth.
I assume that users who want to understand what's going on will read the article if the link is somehow findable. I'll let this rest for the night and come up with something some time tomorrow (guess what timezone I'm in :-)
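As a rough check of the "angle takes care of itself" claim, the sketch below uses plain Python with made-up variances; the function names are hypothetical and not part of the example file. It builds the 2x2 matrix M for "radii sqrt(1 + p) and sqrt(1 - p), rotate by 45 degrees, scale by the standard deviations" and confirms that M times its transpose gives back the covariance matrix. The image of the unit circle under M is the set of points x with xᵀ(MMᵀ)⁻¹x = 1, so this is the one-standard-deviation ellipse of that covariance.

```python
import math


def transform_matrix(pearson, scale_x, scale_y):
    """Linear part of: normalized radii -> rotate 45 degrees -> scale."""
    rx = math.sqrt(1 + pearson)   # radius along the ascending axis
    ry = math.sqrt(1 - pearson)   # radius along the descending axis
    c = s = math.sqrt(0.5)        # cos(45 deg) == sin(45 deg)
    return [[scale_x * c * rx, -scale_x * s * ry],
            [scale_y * s * rx,  scale_y * c * ry]]


def times_own_transpose(m):
    """Return m @ m.T for a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return [[a * a + b * b, a * c + b * d],
            [a * c + b * d, c * c + d * d]]


# Assumed covariance entries, chosen only for illustration.
var_x, var_y, cov_xy = 4.0, 9.0, 3.0
pearson = cov_xy / math.sqrt(var_x * var_y)

m = transform_matrix(pearson, math.sqrt(var_x), math.sqrt(var_y))
recovered = times_own_transpose(m)
print(recovered)  # close to [[4.0, 3.0], [3.0, 9.0]] up to rounding
```

No `atan2` appears anywhere: the final orientation comes out of the unequal scaling of the 45-degree ellipse.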

timhoffm

This is already quite good. Some additional recommendations to make the code even more compact and clear.

examples/statistics/confidence_ellipse.py

jklymak · 2019-03-04T23:28:53Z

First, I agree that an example like this is useful.

Second, I understand what you are doing, and I understand why you want to do it this way, but I feel it is a misuse of the transform stack. Instead I think you should consider getting the eigenvectors and eigenvalues and using those to calculate the angle, width, and height as outlined in the first answer to this stack overflow question:

https://stackoverflow.com/questions/12301071/multidimensional-confidence-intervals/12321306#12321306
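For comparison, a minimal sketch of the eigenvalue route in plain Python, using the closed form for a symmetric 2x2 matrix rather than a library eigen-solver. The function name `ellipse_params_from_cov` and the sample inputs are assumptions made for illustration; this is not the code from the linked answer.

```python
import math


def ellipse_params_from_cov(var_x, cov_xy, var_y, n_std=1.0):
    """Return (width, height, angle_deg) for the n_std ellipse.

    Closed-form eigen-decomposition of [[var_x, cov_xy], [cov_xy, var_y]].
    """
    half_trace = (var_x + var_y) / 2
    radius = math.hypot((var_x - var_y) / 2, cov_xy)
    lam_major = half_trace + radius             # larger eigenvalue
    lam_minor = max(half_trace - radius, 0.0)   # guard against rounding
    # Direction of the major axis, measured from the x axis.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    width = 2 * n_std * math.sqrt(lam_major)
    height = 2 * n_std * math.sqrt(lam_minor)
    return width, height, math.degrees(angle)


print(ellipse_params_from_cov(4.0, 0.0, 1.0))  # axis-aligned case
print(ellipse_params_from_cov(1.0, 0.5, 1.0))  # equal variances
```

The three returned values are the ones `Ellipse(xy, width, height, angle=...)` expects, with the data mean as `xy`, so no transform beyond `ax.transData` would be needed.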

CarstenSchelp · 2019-03-09T10:46:35Z

@jklymak: Thank you for really diving into this, Jody.
To me transforms are a computationally efficient and intuitive way to get things done. They are not particularly obscure, low level or heavy. The transforms API is great, too - resulting in very readable code.
If I typed this out in numpy it would actually look cluttered and scary.
So applying transforms here does not look like misuse to me.
Also, half of the beauty of this approach to get the ellipse right is that by simply scaling the normalized ellipse, the angle takes care of itself. There is really no need to do "atan2(...)".
Using Ellipse(angle=...) alone requires me to provide an angle that will implicitly emerge from other operations. So, within this approach, Ellipse(angle=...) is not the right tool for the job.
I hope that I am making enough sense here. Can you find rhyme and reason in my point of view?


comments handled

anntzer · 2019-03-10T19:28:44Z

I haven't looked too much into it, but after a cursory look I think I do like the approach of using transforms to achieve the desired result.

ImportanceOfBeingErnest · 2019-03-10T23:25:36Z

If there is a controversy about the method being used, maybe one can add another function showing the eigenvector decomposition at the bottom of the example and show how both methods would yield the same result?

timhoffm

Only minor style issues left.

I don't engage in the discussion on eigenvector vs. transform. Both have their pros and cons. Either is fine by me.

examples/statistics/confidence_ellipse.py

timhoffm · 2019-03-12T22:22:48Z

examples/statistics/confidence_ellipse.py

+# Different number of standard deviations
+# """""""""""""""""""""""""""""""""""""""
+#
+# A plot with n_std = 3 (gray), 2 (blue) and 1 (red)


Could be something more descriptive like "A plot with multiple ellipses describing different standard deviations". The exact numbers and colors are explained in the legend and don't need to be described here.

Oh, I didn't see this one. I will take a look shortly.

examples/statistics/confidence_ellipse.py

timhoffm · 2019-03-12T22:32:02Z

examples/statistics/confidence_ellipse.py

+    [-0.2, 0.35]
+])
+mu = np.array([0, 0]).T
+scale = np.array([8, 5]).T


Be consistent in the way you define mu and scale (see above).

If there is a controversy about the method being used, maybe one can add another function showing the eigenvector decomposition at the bottom of the example and show how both methods would yield the same result?

So far all comments and suggestions were aiming for an example as concise and readable as possible. This is a good thing and I would not add anything extra.
The core of this example is to show a particular way to get that ellipse and to explain why this works (via a link, that is). There are other ways to get that ellipse and the linked article links to one, in its turn, too. If somebody wants to show another way that should happen in another example, I think. Should be a quick job with some - perfectly legitimate - copy and paste.

examples/statistics/confidence_ellipse.py

timhoffm · 2019-03-12T22:39:23Z

For easier review: Current doc build


timhoffm

IMO this is good to go. We can still bring it in 3.1.

Latest doc build

@CarstenSchelp thanks a lot!

CarstenSchelp · 2019-03-16T09:17:36Z

Likewise, Tim, and everyone else! Nice to see that people find it worthwhile!

anntzer · 2019-04-02T09:10:44Z

Small markup improvements remain possible, but this is already better than 99% of the examples, so I'm not going to nitpick :) merging.
Thanks for the great PR!


Add new example for plotting a confidence_ellipse

9661375

anntzer previously requested changes Mar 2, 2019


anntzer added the Documentation label Mar 2, 2019

Carsten added 2 commits March 3, 2019 09:57

Merge https://github.com/matplotlib/matplotlib into confidence_ellipse

b48f97d

Applying suggested changes: remove author and license block, make fla…

e74b886

…ke8 checks pass.

Put demos in multiple code boxes.

701e171

CarstenSchelp closed this Mar 3, 2019

CarstenSchelp reopened this Mar 3, 2019

timhoffm reviewed Mar 3, 2019


Carsten added 3 commits March 3, 2019 22:09

Apply some improvements concerning style and conventions. First three…

c845c88

… plots are closely related and thus go into one figure.

run flake8 checks

2df575a

Merge https://github.com/matplotlib/matplotlib into confidence_ellipse

d413596

Make link to explaining article more prominent.

713a985

timhoffm reviewed Mar 4, 2019


timhoffm and others added 4 commits March 9, 2019 15:06

Have keyword argument 'facecolor' default to 'none'.

ef22b47

TODO: pass facecolor parameter to Ellipse() explicitly. Co-Authored-By: CarstenSchelp <carstenschelp@mp.nl>

In demo-code change mu from nparray to tuple. Also removes superfluou…

49a2869

…s '.T' Co-Authored-By: CarstenSchelp <carstenschelp@mp.nl>

Pass single values to scatter() instead of making artificial lists.

856b27d

Co-Authored-By: CarstenSchelp <carstenschelp@mp.nl>

Pass facecolor to Ellipse() explicitly because we introduced a defaul…

ea287e0

…t value for it in confidence_ellipse().

Carsten added 4 commits March 9, 2019 15:33

Merge branch 'confidence_ellipse' of https://github.com/CarstenSchelp…

71b2b05

…/matplotlib into confidence_ellipse

Add legend to example with multiple standard deviations.

f0e1f91

Have ellipse plotted "over" scattered dataset using zorder=0

47a87f8

Merge https://github.com/matplotlib/matplotlib into confidence_ellipse

55e0f51

timhoffm reviewed Mar 12, 2019


timhoffm and others added 5 commits March 14, 2019 09:09

Update examples/statistics/confidence_ellipse.py

b7df47b

Remove another facecolot='none' that apparently slipped through. Co-Authored-By: CarstenSchelp <carstenschelp@mp.nl>

Correct stale color specifications in heading of "different n_std" ex…

31d35d3

…ample

Update examples/statistics/confidence_ellipse.py

dbd50b3

Remove superfluous string interpolation-'f' that slipped through. Co-Authored-By: CarstenSchelp <carstenschelp@mp.nl>

Merge branch 'confidence_ellipse' of https://github.com/CarstenSchelp…

2d1fca7

…/matplotlib into confidence_ellipse

Consistency define variables mu and scale as tuples.

566b6c8

timhoffm approved these changes Mar 16, 2019


timhoffm added the status: needs review label Apr 2, 2019

anntzer added this to the v3.1.0 milestone Apr 2, 2019

anntzer merged commit 4eb452a into matplotlib:master Apr 2, 2019

meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Apr 2, 2019

Backport PR matplotlib#13570: Add new example for plotting a confiden…

ce0d591

…ce_ellipse

meeseeksmachine mentioned this pull request Apr 2, 2019

Backport PR #13570 on branch v3.1.x (Add new example for plotting a confidence_ellipse) #13838

Merged

QuLogic removed the status: needs review label Apr 2, 2019

tacaswell added a commit that referenced this pull request Apr 2, 2019

Merge pull request #13838 from meeseeksmachine/auto-backport-of-pr-13…

5143c55

…570-on-v3.1.x Backport PR #13570 on branch v3.1.x (Add new example for plotting a confidence_ellipse)
