Remove unnecessary calls to float() before division. #10331
Conversation
Force-pushed ea8439a to e19e48d
Looks reasonable to me as far as I can tell.
One typo (I think) and a couple very minor comments.
lib/matplotlib/testing/compare.py
Outdated
-        sum_of_squares = np.sum(histogram * np.arange(len(histogram)) ** 2)
-        rms = np.sqrt(float(sum_of_squares) / num_values)
-        return rms
+        return ((expectedImage - actualImage) ** 2).mean()
Huh, I may have missed something, but this is "Mean Square" rather than "Root Mean Square", is it not? Is it not missing a call to np.sqrt?
The previous method (going through the histogram) seems extra complex, why was it done that way? Is this just a weird historical artifact of supporting numarray/numeric or an attempt to get performance?
the missing sqrt is just a typo, fixed.
pretty sure performance of this specific part of the code is irrelevant compared to rendering the png or shelling out to inkscape and ghostscript...
lib/mpl_toolkits/mplot3d/proj3d.py
Outdated
@@ -61,7 +61,7 @@ def line2d_seg_dist(p1, p2, p0):
     x01 = np.asarray(p0[0]) - p1[0]
     y01 = np.asarray(p0[1]) - p1[1]

-    u = (x01*x21 + y01*y21)/float(abs(x21**2 + y21**2))
+    u = (x01*x21 + y01*y21) / abs(x21**2 + y21**2)
Very genuine comment: are x21 and y21 complex or real numbers? Because in the latter case, what is the point of abs(...)?
Pretty sure they're intended to be reals (it's computing a 3D distance...). Removed the call to abs, good catch.
@@ -13,7 +13,7 @@
 colors = prop_cycle.by_key()['color']

 lwbase = plt.rcParams['lines.linewidth']
-thin = float('%.1f' % (lwbase / 2))
+thin = lwbase / 2
Out of curiosity, was there a purpose to the former "rounding to 0.1" scheme?
Probably to make the title label look better(?), but I think 0.75 looks just fine too.
Fair enough, and I think the same
I have the feeling that Travis agrees with me about the "typo" in the image comparison test 😈.
@@ -56,10 +52,10 @@ def select_step_hour(dv):
     minsec_limits_ = [1.5, 2.5, 3.5, 4.5, 5.5, 8, 11, 14, 18, 25, 45]
     minsec_steps_ = [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30]

-    minute_limits_ = A(minsec_limits_)*(1./60.)
+    minute_limits_ = np.array(minsec_limits_) / 60
This looks like another case of old-school micro-optimizations (like x*x vs x**2), but this one still works!
In [3]: d = np.array(range(52))
In [4]: %timeit d / 60
936 ns ± 0.538 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: %timeit d * (1/ 60)
683 ns ± 1.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [6]: d = np.array(range(50000))
In [7]: %timeit d / 60
102 µs ± 95.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [8]: %timeit d * (1/ 60)
37.4 µs ± 554 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I can repro, but I think axisartist/angle_helper is far enough from any critical code path that it's not worth the obfuscation...
👍
👍 in principle, but some concern about the RMS changes.
Force-pushed e19e48d to c0b773e
hopefully fixed the RMS changes
Force-pushed 6aa7aba to 6664dd5
@@ -22,7 +22,7 @@
     r'some other string', color='red', fontsize=20, dpi=200)

 fig = plt.figure()
-fig.figimage(rgba1.astype(float)/255., 100, 100)
-fig.figimage(rgba2.astype(float)/255., 100, 300)
+fig.figimage(rgba1, 100, 100)
Why did you remove the division by 255?
because we can display uint8 images just as well as float images, and the real point of the example is to show how to get the mathtext as an image array...
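To illustrate that point, here is a minimal sketch (hypothetical toy array, Agg backend) showing that figimage accepts a uint8 RGBA image directly as well as the equivalent float image normalized to [0, 1]:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen for this sketch
import matplotlib.pyplot as plt

# A hypothetical 50x50 RGBA array in uint8 (values 0-255).
rgba_u8 = np.zeros((50, 50, 4), dtype=np.uint8)
rgba_u8[..., 0] = 255   # red channel
rgba_u8[..., 3] = 255   # fully opaque

fig = plt.figure()
# uint8 passed directly, and the equivalent float image in [0, 1]:
fig.figimage(rgba_u8, 100, 100)
fig.figimage(rgba_u8.astype(float) / 255., 100, 300)
fig.canvas.draw()  # both draw the same red square
```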
-            float(l) / self.dpi * self._figdpi,
-            (float(height)-b-h) / self.dpi * self._figdpi,
+            l / self.dpi * self._figdpi,
+            (height-b-h) / self.dpi * self._figdpi,
To me, since you are in here, these would be much clearer as
    self._figdpi * l / self.dpi,
    self._figdpi * (height-b-h) / self.dpi,
Seeing a/b*c always slows me down in a way that c*a/b or (a/b)*c doesn't.
agreed, fixed
Force-pushed c13813d to 7c30859
Mostly good, but I have a couple more questions.
@@ -268,7 +268,7 @@ def from_any(size, fraction_ref=None):
         return Fixed(size)
     elif isinstance(size, six.string_types):
         if size[-1] == "%":
-            return Fraction(float(size[:-1])/100., fraction_ref)
+            return Fraction(float(size[:-1]) / 100, fraction_ref)
Is there a reason this float remains?
because the purpose here is to convert "3%" to 0.03? :)
Right. Sorry, I didn't pay enough attention to the context.
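For reference, the conversion under discussion can be sketched as a standalone helper (hypothetical name, mirroring the "%" branch of from_any):

```python
def percent_to_fraction(size):
    """Hypothetical helper: convert a percentage string like '3%' to 0.03."""
    if size[-1] == "%":
        # float() is still needed here: size[:-1] is a string, not a number.
        return float(size[:-1]) / 100
    raise ValueError("not a percentage string: %r" % size)

print(percent_to_fraction("3%"))    # 0.03
print(percent_to_fraction("100%"))  # 1.0
```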
@@ -250,7 +245,7 @@ def _get_number_fraction(self, factor):
         break

     d = factor // threshold
-    int_log_d = int(floor(math.log10(d)))
+    int_log_d = int(math.floor(math.log10(d)))
Earlier it looked like you were eliminating the use of the math module in favor of numpy functions--but not here. Why? (The math module is faster; I assumed you were simplifying by eliminating it in contexts where the speed difference doesn't matter.)
because the rest of the expression used math and I was too lazy to change both. Obviously this is not on a critical path, changed.
-        rms = np.sqrt(float(sum_of_squares) / num_values)
-        return rms
+        # Convert to float to avoid overflowing finite integer types.
+        return np.sqrt(((expectedImage - actualImage).astype(float) ** 2).mean())
This seems to be a substantive change to a genuine RMS, from something I don't understand. There must have been a reason for the convoluted original.
I don't understand the original either, but the relevant tests (which check for exact return values of this function) are still passing...
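The .astype(float) comment in the new code is about integer wrap-around; a minimal sketch of the failure mode it avoids (toy values chosen for illustration, not the real test images):

```python
import numpy as np

a = np.array([250], dtype=np.uint8)  # "expected" pixel value
b = np.array([10], dtype=np.uint8)   # "actual" pixel value

# uint8 arithmetic wraps silently: 240**2 = 57600, and 57600 % 256 == 0,
# so the squared difference collapses to garbage.
wrapped = (a - b) ** 2

# Converting to float first gives the true squared difference.
correct = (a.astype(float) - b.astype(float)) ** 2

print(wrapped[0], correct[0])
```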
Ours are, but I suspect this might cause problems for packagers who put in larger tolerances than ours, as I believe some do. Therefore I recommend sticking to the original algorithm for the purposes of this PR. Perhaps add a comment that this is not a simple rms. @mdboom can probably tell us the rationale for the algorithm.
AFAICT the results should be the same: the old method counted, for each delta, the number of pixels that have this delta value, then computed an RMS based on that histogram.
Again, https://github.com/matplotlib/matplotlib/pull/10335/files shows tests where we check that the reported RMS values are stable to 1e-4.
If I am correctly expressing the two algorithms below, then their results are not the same:
In [26]: x = np.zeros(1000, dtype=int); x[0] = 1
In [27]: h = np.bincount(x, minlength=256)
In [28]: np.sqrt(((h * np.arange(len(h)))**2).mean())
0.0625
In [29]: np.sqrt((x**2).mean())
0.031622776601683791
In [30]: x[:10]
array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [31]: (h * np.arange(len(h)))[:10]
array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
In [32]: len(x)
1000
In [33]: len(h)
256
the previous algorithm was dividing by num_values (so sum() / len(x) in your case, not mean()). This fix shows that the values are identical.
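That equivalence can be restated as a minimal sketch with toy data (one nonzero delta out of 1000 pixels):

```python
import numpy as np

# Toy delta array: one pixel differs by 1 out of 1000.
x = np.zeros(1000, dtype=int)
x[0] = 1
num_values = len(x)

# Old approach: histogram of delta values, then sum count[v] * v**2,
# divided by the number of pixels (NOT by the histogram length).
h = np.bincount(x, minlength=256)
sum_of_squares = np.sum(h * np.arange(len(h)) ** 2)
rms_old = np.sqrt(sum_of_squares / num_values)

# New approach: direct root-mean-square of the deltas.
rms_new = np.sqrt((x.astype(float) ** 2).mean())

print(rms_old, rms_new)  # both sqrt(1/1000) ~= 0.0316
```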
Oops... sorry I missed that.
Well, I can run the Cartopy test suite without any issue using this PR.
@@ -1574,7 +1574,7 @@ def tick_values(self, vmin, vmax):
         """
         if self.nbins is None:
             return self.locs
-        step = max(int(0.99 + len(self.locs) / float(self.nbins)), 1)
+        step = max(int(np.ceil(len(self.locs) / self.nbins)), 1)
Won't this change the results in rare cases? It does look nicer, though.
- I think the code really, really looks like it was written by someone who meant ceil but didn't remember that function existed.
- For the change to actually matter you need a) to use bins with FixedLocator, which seems to be quite rare in my experience, and b) at least 99 ticks and 100 bins (to fall in the "edge" region), which is also quite rare...
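The edge region can be pinned down concretely. The old int(0.99 + x) trick under-counts exactly when the fractional part of x lies in (0, 0.01), which requires nbins > 100; a small sketch (hypothetical helper names) of where the two formulas disagree:

```python
import math

def step_old(nlocs, nbins):
    # Old formula: round up via int(0.99 + x).
    return max(int(0.99 + nlocs / nbins), 1)

def step_new(nlocs, nbins):
    # New formula: an explicit ceiling.
    return max(int(math.ceil(nlocs / nbins)), 1)

# Agreement in the common case: 99/100 = 0.99 rounds up to 1 either way.
print(step_old(99, 100), step_new(99, 100))    # 1 1

# Disagreement: 102/101 ~= 1.0099, so 0.99 + x falls just short of 2
# while ceil correctly gives 2.
print(step_old(102, 101), step_new(102, 101))  # 1 2
```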
I'll go along with this one.
Most of these calls probably date back to the time before `__future__`-division existed... Note that this will have the side effect that some functions that were accidentally accepting strings instead of floats before will no longer accept strings -- which seems fine to me.
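The division semantics in question, as a quick sketch:

```python
from __future__ import division  # a no-op on Python 3

# With true division, int / int already yields a float,
# so the float() wrappers in the old code were redundant:
assert 7 / 2 == 3.5
assert float(7) / 2 == 7 / 2

# The side effect mentioned above: float() also coerced strings,
# which bare division does not.
assert float("7") / 2 == 3.5
try:
    "7" / 2
except TypeError:
    print("plain division rejects strings")
```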
Force-pushed 7c30859 to 3ea6cfa