Remove unnecessary calls to float() before division. #10331

anntzer · 2018-01-27T08:03:03Z

Most of these calls probably date back to the time before
__future__-division existed...

Note that this will have the side effect that some functions that were
accidentally accepting strings instead of floats before will no longer
accept strings -- which seems fine to me.

PR Summary

PR Checklist

Has Pytest style unit tests
Code is PEP 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

afvincent

Looks reasonable to me as far as I can tell.

One typo (I think) and a couple very minor comments.

afvincent · 2018-01-27T18:17:27Z

lib/matplotlib/testing/compare.py

-    sum_of_squares = np.sum(histogram * np.arange(len(histogram)) ** 2)
-    rms = np.sqrt(float(sum_of_squares) / num_values)
-    return rms
+    return ((expectedImage - actualImage) ** 2).mean()


Huh, I may have missed something but this is "Mean Square" rather than "Root Mean Square", is it not? Is it not missing a call to np.sqrt?

The previous method (going through the histogram) seems extra complex, why was it done that way? Is this just a weird historical artifact of supporting numarray/numeric or an attempt to get performance?

the missing sqrt is just a typo, fixed.
pretty sure performance of this specific part of the code is irrelevant compared to rendering the png or shelling out to inkscape and ghostscript...

afvincent · 2018-01-27T18:23:57Z

lib/mpl_toolkits/mplot3d/proj3d.py

@@ -61,7 +61,7 @@ def line2d_seg_dist(p1, p2, p0):
    x01 = np.asarray(p0[0]) - p1[0]
    y01 = np.asarray(p0[1]) - p1[1]

-    u = (x01*x21 + y01*y21)/float(abs(x21**2 + y21**2))
+    u = (x01*x21 + y01*y21) / abs(x21**2 + y21**2)


Very genuine comment: are x21 and y21 complex or real numbers? Because in the latter case, what is the point of abs(...) then?

Pretty sure they're intended to be reals (it's computing a 3D distance...). Removed the call to abs, good catch.

afvincent · 2018-01-27T18:27:55Z

examples/color/color_cycle_default.py

@@ -13,7 +13,7 @@
 colors = prop_cycle.by_key()['color']

 lwbase = plt.rcParams['lines.linewidth']
-thin = float('%.1f' % (lwbase / 2))
+thin = lwbase / 2


By curiosity, was there a purpose to the former "rounding to 0.1" scheme?

Probably to make the title label look better(?), but I think 0.75 looks just fine too.

Fair enough, and I think the same

afvincent · 2018-01-27T18:33:33Z

I have the feeling that Travis agrees with me about the "typo" in the image comparison test 😈.

tacaswell · 2018-01-27T18:55:16Z

lib/mpl_toolkits/axisartist/angle_helper.py

@@ -56,10 +52,10 @@ def select_step_hour(dv):
    minsec_limits_ = [1.5, 2.5, 3.5, 4.5, 5.5, 8, 11, 14, 18, 25, 45]
    minsec_steps_  = [1,   2,   3,   4,   5,   6, 10, 12, 15, 20, 30]

-    minute_limits_ = A(minsec_limits_)*(1./60.)
+    minute_limits_ = np.array(minsec_limits_) / 60


This looks like another case of old-school micro-optomizations like (x*x vs x**2), but this one still work!

In [3]: d = np.array(range(52)) In [4]: %timeit d / 60 936 ns ± 0.538 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) In [5]: %timeit d * (1/ 60) 683 ns ± 1.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each) In [6]: d = np.array(range(50000)) In [7]: %timeit d / 60 102 µs ± 95.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) In [8]: %timeit d * (1/ 60) 37.4 µs ± 554 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

I can repro, but I think axisartist/angle_helper is far enough from being on a critical code path to make it worth obfuscating...

tacaswell · 2018-01-27T18:57:49Z

👍 in principle, but some concern about the RMS changes.

anntzer · 2018-01-27T19:20:05Z

hopefully fixed the rms changes

efiring · 2018-01-27T19:54:25Z

examples/api/mathtext_asarray.py

@@ -22,7 +22,7 @@
    r'some other string', color='red', fontsize=20, dpi=200)

 fig = plt.figure()
-fig.figimage(rgba1.astype(float)/255., 100, 100)
-fig.figimage(rgba2.astype(float)/255., 100, 300)
+fig.figimage(rgba1, 100, 100)


Why did you remove the division by 255?

because we can display uint8 images just as well as float images, and the real point of the example is to show how to get the mathtext as an image array...

efiring · 2018-01-27T20:01:04Z

lib/matplotlib/backends/backend_mixed.py

-                    float(l) / self.dpi * self._figdpi,
-                    (float(height)-b-h) / self.dpi * self._figdpi,
+                    l / self.dpi * self._figdpi,
+                    (height-b-h) / self.dpi * self._figdpi,


To me, since you are in here, these would be much clearer as

self._figdpi / self.dpi, self._figdpi * (height-b-h) / self.dpi,

Seeing a/b*c always slows me down in a way that c*a/b or (a/b)*c doesn't.

agreed, fixed

efiring

Mostly good, but I have a couple more questions.

efiring · 2018-01-28T02:54:39Z

lib/mpl_toolkits/axes_grid1/axes_size.py

@@ -268,7 +268,7 @@ def from_any(size, fraction_ref=None):
        return Fixed(size)
    elif isinstance(size, six.string_types):
        if size[-1] == "%":
-            return Fraction(float(size[:-1])/100., fraction_ref)
+            return Fraction(float(size[:-1]) / 100, fraction_ref)


Is there a reason this float remains?

because the purpose here is to convert "3%' to 0.03? :)

Right. Sorry, I didn't pay enough attention to the context.

efiring · 2018-01-28T03:02:23Z

lib/mpl_toolkits/axisartist/angle_helper.py

@@ -250,7 +245,7 @@ def _get_number_fraction(self, factor):
                break

            d = factor // threshold
-            int_log_d = int(floor(math.log10(d)))
+            int_log_d = int(math.floor(math.log10(d)))


Earlier it looked like you were eliminating the use of the math module in favor of numpy functions--but not here. Why? (The math module is faster; I assumed you were simplifying by eliminating it in contexts where the speed difference doesn't matter.)

because the rest of the expression used math and I was too lazy to change both. Obviously this is not on a critical path, changed.

efiring · 2018-01-28T03:16:37Z

lib/matplotlib/testing/compare.py

-    rms = np.sqrt(float(sum_of_squares) / num_values)
-    return rms
+    # Convert to float to avoid overflowing finite integer types.
+    return np.sqrt(((expectedImage - actualImage).astype(float) ** 2).mean())


This is seems to be a substantive change to a genuine RMS, from something I don't understand. There must have been a reason for the convoluted original.

I don't understand the original either, but the relevant tests (which check for exact return values of this function) are still passing...

Ours are, but I suspect this might cause problems for packagers who put in larger tolerances than ours, as I believe some do. Therefore I recommend sticking to the original algorithm for the purposes of this PR. Perhaps add a comment that this is not a simple rms. @mdboom can probably tell us the rationale for the algorithm.

AFAICT the results should be the same: the old method counted, for each delta, the number of pixels that have this delta value, then computed a RMS based on that histogram.
Again, https://github.com/matplotlib/matplotlib/pull/10335/files shows tests where we check that the reported RMS values are stable to 1e-4.

If I am correctly expressing the two algorithms below, then their results are not the same:

In [26]: x = np.zeros(1000, dtype=int); x[0] = 1 In [27]: h = np.bincount(x, minlength=256) In [28]: np.sqrt(((h * np.arange(len(h)))**2).mean()) 0.0625 In [29]: np.sqrt((x**2).mean()) 0.031622776601683791 In [30]: x[:10] array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0]) In [31]: (h * np.arange(len(h)))[:10] array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0]) In [32]: len(x) 1000 In [33]: len(h) 256

the previous algorithm was dividing by num_values (so sum() / len(x) in your case, not mean()). this fix shows that the values are indentical.

Oops... sorry I missed that.

Well, I can run the Cartopy test suite without any issue using this PR.

efiring · 2018-01-28T03:23:54Z

lib/matplotlib/ticker.py

@@ -1574,7 +1574,7 @@ def tick_values(self, vmin, vmax):
        """
        if self.nbins is None:
            return self.locs
-        step = max(int(0.99 + len(self.locs) / float(self.nbins)), 1)
+        step = max(int(np.ceil(len(self.locs) / self.nbins)), 1)


Won't this change the results in rare cases? It does look nicer, though.

I think the code really, really looks like it was written by someone who meant ceil but didn't remember that function existed.

For the change to actually matter you need a) to use bins with FixedLocator, which seems to be quite rare in my experience, and b) at least 99 ticks and 100 bins (to fall in the "edge" region), which is also quite rare...

I'll go along with this one.

Most of these calls probably date back to the time before `__future__`-division existed... Note that this will have the side effect that some functions that were accidentally accepting strings instead of floats before will no longer accept strings -- which seems fine to me.

QuLogic · 2018-01-28T22:34:39Z

lib/matplotlib/testing/compare.py

-    rms = np.sqrt(float(sum_of_squares) / num_values)
-    return rms
+    # Convert to float to avoid overflowing finite integer types.
+    return np.sqrt(((expectedImage - actualImage).astype(float) ** 2).mean())


Well, I can run the Cartopy test suite without any issue using this PR.

anntzer force-pushed the future-division branch 3 times, most recently from ea8439a to e19e48d Compare January 27, 2018 10:22

afvincent reviewed Jan 27, 2018

View reviewed changes

tacaswell reviewed Jan 27, 2018

View reviewed changes

tacaswell added this to the v2.2 milestone Jan 27, 2018

anntzer force-pushed the future-division branch from e19e48d to c0b773e Compare January 27, 2018 19:17

anntzer force-pushed the future-division branch 2 times, most recently from 6aa7aba to 6664dd5 Compare January 27, 2018 19:46

efiring reviewed Jan 27, 2018

View reviewed changes

anntzer force-pushed the future-division branch 2 times, most recently from c13813d to 7c30859 Compare January 27, 2018 23:24

efiring requested changes Jan 28, 2018

View reviewed changes

anntzer force-pushed the future-division branch from 7c30859 to 3ea6cfa Compare January 28, 2018 06:29

efiring approved these changes Jan 28, 2018

View reviewed changes

QuLogic approved these changes Jan 28, 2018

View reviewed changes

jklymak merged commit 8650c2d into matplotlib:master Jan 29, 2018

anntzer deleted the future-division branch January 29, 2018 19:17

QuLogic modified the milestones: needs sorting, v2.2.0 Feb 12, 2018

Uh oh!

Remove unnecessary calls to float() before division. #10331

Remove unnecessary calls to float() before division. #10331

Uh oh!

Conversation

PR Summary

PR Checklist

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!