8000 Improve contour performance in WCSAxes by 10-1000x by astrofrog · Pull Request #7568 · astropy/astropy · GitHub
[go: up one dir, main page]

Skip to content

Improve contour performance in WCSAxes by 10-1000x #7568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 20, 2018

Conversation

astrofrog
Copy link
Member
@astrofrog astrofrog commented Jun 17, 2018

Background

WCSAxes has a mechanism for overplotting contours from an image in a different coordinate system from the original image. This essentially boils down to:

ax.contour(contour_data, transform=ax.get_transform(contour_wcs))

However, the way Matplotlib deals with this internally is to compute the contours on contour_data, then to represent each line in the contour map as a Matplotlib Path (note - each line, not each level - there might be multiple lines per level), then to transform each path individually using the specified transform. In the case where coordinate conversions are needed, the transform includes calls to SkyCoord, which we know is a little slow, especially if called once for each line in the contour map. For some contour maps I've made, there are sometimes thousands of individual lines.

New approach

The approach implemented here is to overload contour and contourf, and internally call them without transform, but to then modify the resulting ContourSet manually. We do this by extracting all the vertices of the lines, stacking them into a single array, transforming them, and splitting them again. This is far more efficient because we then only effectively call SkyCoord once.

Performance

The following functions test the new approach, the old approach, and also test the performance without contours:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.axes import Axes

from astropy.io import fits
from astropy.wcs import WCS
from astropy.utils.data import get_pkg_data_filename

hdu_2mass = fits.open(get_pkg_data_filename('galactic_center/gc_2mass_h.fits'))[0]
hdu_msx = fits.open(get_pkg_data_filename('galactic_center/gc_msx_e.fits'))[0]


def new():
    ax = plt.subplot(1, 1, 1, projection=WCS(hdu_2mass.header))
    ax.imshow(hdu_2mass.data)
    ax.contour(hdu_msx.data, levels=np.logspace(-6, -3, 10),
               transform=ax.get_transform(WCS(hdu_msx.header)), colors='white')
    plt.savefig('new.png')


def old():
    ax = plt.subplot(1, 1, 1, projection=WCS(hdu_2mass.header))
    ax.imshow(hdu_2mass.data)
    Axes.contour(ax, hdu_msx.data, levels=np.logspace(-6, -3, 10),
                 transform=ax.get_transform(WCS(hdu_msx.header)), colors='white')
    plt.savefig('old.png')


def nocontour():
    ax = plt.subplot(1, 1, 1, projection=WCS(hdu_2mass.header))
    ax.imshow(hdu_2mass.data)
    plt.savefig('nocontour.png')

The results are as follows:

In [10]: %timeit old()
14.5 s ± 710 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [11]: %timeit new()
198 ms ± 8.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [13]: %timeit nocontour()
125 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

If we subtract the base time for the plot excluding the contour, the old approach took 14+ seconds, whereas now it takes ~70ms, a speedup of around 200x.

The exact speedup will depend a lot on the number of lines in the contour map. The worst case scenario (for the new implementation) would be if there was just a single contour line. If I take the example above and set levels=[5.8e-4], I get a single contour line. In that case, the timings look like:

In [5]: %timeit old()
156 ms ± 3.65 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit new()
145 ms ± 1.85 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [6]: %timeit nocontour()
153 ms ± 7.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Basically the differences are in the noise. So there should not really be any cases where this results in worse performance.

I've added a couple of benchmark to astropy-benchmarks: astropy/astropy-benchmarks#58

The default implementation of contour in Matplotlib calls the transform (when a transform is given) for each individual contour line. Since our transforms rely on WCS and SkyCoord, each transform can have non-negligible overhead. Therefore, we now instead deal with the transformation ourselves by stacking all the vertices of the paths and transforming them all in one go.
@astropy-bot
Copy link
astropy-bot bot commented Jun 17, 2018

Hi there @astrofrog 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

Everything looks good from my point of view! 👍

If there are any issues with this message, please report them here.

Copy link
Member
@larrybradley larrybradley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @astrofrog. I only had a few minor comments (e.g. typos).

@@ -196,6 +196,46 @@ def imshow(self, X, *args, **kwargs):

return super().imshow(X, *args, **kwargs)

def contour(self, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the matplotlib docstrings for contour and contourf cause any sphinx warnings? I know it's not desirable to copy the docstrings, but I recently noticed that when building the astropy docs locally that the WCSAxes set_xlabel set_ylabel docstrings (inherited from matplotlib) cause sphinx warnings (~matplotlib.text.Text and "text" links are broken), e.g. http://docs.astropy.org/en/latest/api/astropy.visualization.wcsaxes.WCSAxes.html#astropy.visualization.wcsaxes.WCSAxes.set_xlabel. I'm not sure why the ~matplotlib.text.Text target isn't found.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth sticking functools.wraps on this so you have the function signature and stuff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I've written some custom docstrings that explicitly refer to contour and contourf via intersphinx for information about valid arguments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cadair - Actually if we didn't set any docstring I think it just inherits it? But I'd rather go with the approach done here now so we don't get random doc failures when Matplotlib change their docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@astrofrog I wasn't actually thinking for the docstring, I was thinking for the function signature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cadair ah, that won't be very interesting though:

Signature: ax.contour(*args, data=None, **kwargs)

(that's the matplotlib one!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♀️

# In Matplotlib, when calling contour() with a transform, each
# individual path in the contour map is transformed separately. However,
# this is much too slow for us since each call to the transforms results
# in an Astropy coordinate transformation, which has a non-negligeable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

negligible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


Using transforms with the native Matplotlib contour/contourf can be slow if
the transforms have a non-negligible overhead (which is the case for
WCS/Skycoord transforms) since the transform is called for each individual
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SkyCoord

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

pos_segments = []

for collection in cset.collections:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary blank line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

# Now re-populate the segments in the line collections
for ilevel, vert in enumerate(vertices):
vert = np.split(vert, pos_segments[ilevel])
for iseg in range(len(vert)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using enumerate here would be slightly faster, e.g. something like

for iseg, ivert in vert:
    all_paths[ilevel][iseg].vertices = ivert

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@larrybradley
Copy link
Member

@astrofrog When will the new contour benchmarks be run? I don't see them yet. I assume this PR shouldn't be merged until the benchmarks have been run at least once with the old code?

@astrofrog
Copy link
Member Author

@larrybradley - thanks for the review! I'll check today why the benchmarks haven't run yet.

@larrybradley
Copy link
Member

@astrofrog Let me know when the benchmarks have been run and I'll merge this.

@astrofrog
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0