BUG: Fix crashes when using float32 values in uniform histograms by eric-wieser · Pull Request #10324 · numpy/numpy

BUG: Fix crashes when using float32 values in uniform histograms #10324


Merged: 1 commit merged into numpy:master from eric-wieser:histogram-range-comparison on Feb 9, 2018

Conversation

eric-wieser (Member):

Fixes #8123, closes #9189, fixes #10319

This is a workaround for #10322, not a fix for it.
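For context, the crash is a precision issue: a float64 bound can sit strictly below a value in float64 arithmetic yet coincide with it once everything is computed in float32. A minimal illustration of that collapse (not a repro taken from the linked issues):

import numpy as np

# 2**-30 is smaller than the float32 spacing near 1.0 (2**-24), so a
# float64 value just below 1.0 rounds exactly onto 1.0 in float32.
x = 1.0 - 2.0 ** -30
print(x < 1.0)                          # True in float64
print(np.float32(x) < np.float32(1.0))  # False: the gap vanishes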


# gh-10322 means that the float64 decays to float32.
assert_equal(x_loc.dtype, np.float32)
assert_equal(count, [1])
eric-wieser (Member Author):

In #9189, the results here were np.float64 and [0], which are self-consistent, but at odds with #10322.
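To illustrate the gh-10322 behaviour being referred to (value-based promotion as it worked at the time, later revisited by NEP 50): a float64 scalar next to a float32 array does not win the promotion, so it "decays" to the array's dtype.

import numpy as np

# Under value-based promotion, the float64 *scalar* decays to the
# dtype of the float32 *array* instead of upgrading it to float64.
np.result_type(np.float64(1.0), np.array([0.5], dtype=np.float32))
# -> dtype('float32')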

Reviewer (Member):

I am a bit at odds with this code... I have modified np.result_type to only special-case Python float, int (and complex). Even adding a warning about the switch, there are only a handful of places in NumPy affected by such a change, and all of them seem to explicitly test corner cases (except possibly the default value in np.select, which arguably should be changed to None, which could then use result.dtype.type() as the default).

So if you look at the code chunk with comments below, result_type currently returns the "imprecise" version. If it opted into the more precise version, this would change to [0] as commented. However, warning about such a change seems pretty drastic/weird and potentially impossible to get right: we have to pull the plug on gh-10322 eventually!

In longer words: I am in a conundrum; we need to fix the behaviour here (potentially with semi-public functions to opt in to the "future" behaviour early).

Reviewer (Member):

Maybe it's not actually a big issue; we would just get a lot of warnings, and users can only avoid them by casting their inputs. The problem may just be that I pulled the plug on np.result_type for testing, but did not modify the ufunc code yet (and that is too big a change to do at once).

Then, in the future, this will switch to [0], at least once bins are typed.

@charris (Member) commented Jan 4, 2018:

Just to be clear, this does all the computations in float32 for equally spaced bins and float32 data?

# gh-10322 forces us to request this now to avoid inconsistency
bin_type = np.result_type(first_edge, last_edge, a)
if np.issubdtype(bin_type, np.integer):
    bin_type = np.result_type(bin_type, float)
Reviewer (Member):

What is the result type here with Python float? Might add a note.

eric-wieser (Member Author):

It seems to be float no matter what.
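Indeed, np.result_type treats the Python float type as the float64 dtype, so this branch lands on float64 for every integer type; a quick check (a sketch, assuming NumPy's standard promotion rules):

import numpy as np

# The branch only runs for integer bin types; with `float` (i.e.
# np.float64) mixed in, the result is float64 regardless of width.
for dt in [np.int8, np.int32, np.int64, np.uint64]:
    assert np.result_type(dt, float) == np.float64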

Reviewer (Contributor):

So, we can just write bin_type = float? Seems reasonable. (Or None, since linspace will take care.)
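On the None option: when dtype=None, np.linspace infers its dtype from the endpoints via the same promotion machinery, so float32 edges stay float32. A small check of current behaviour (not code from this PR):

import numpy as np

assert np.linspace(0.0, 1.0, 5).dtype == np.float64               # Python floats
assert np.linspace(np.float32(0), np.float32(1), 5).dtype == np.float32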

eric-wieser (Member Author):

This seems more flexible in case we add an int128 that only safely casts into a quad double or something in the distant future.

Reviewer (Contributor):

I would just let linspace deal with that case, i.e., set bin_type = None here (or add float32 to the one above). Obviously, not a big deal though.

@eric-wieser (Member Author):

> this does all the computations in float32

Yes. This is fine, because we already have special handling of loss of precision in the uniform search.
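For reference, the special handling means the fast path does not trust the arithmetic bin index alone: it compares against the stored edges and nudges the index where rounding disagrees. A simplified sketch of that fixup (not the verbatim NumPy source; assumes all values already lie within [first_edge, last_edge]):

import numpy as np

def uniform_bin_indices(a, first_edge, last_edge, n_bins, edges):
    # Arithmetic index into equal-width bins.
    norm = n_bins / (last_edge - first_edge)
    indices = ((a - first_edge) * norm).astype(np.intp)
    indices[indices == n_bins] -= 1  # the right edge belongs to the last bin
    # The arithmetic can disagree with the stored edges by ~1 ULP,
    # so compare against the edges directly and nudge by one.
    indices[a < edges[indices]] -= 1
    indices[(a >= edges[indices + 1]) & (indices != n_bins - 1)] += 1
    return indices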

@charris (Member) commented Jan 4, 2018:

What about float16? Is it supported?

@eric-wieser (Member Author):

I'll update the test to loop over all the floating-point types.
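Something like the following shape of loop, with a value on an interior edge where rounding bites hardest (a hypothetical sketch, not the exact test that was added):

import numpy as np

for dt in [np.float16, np.float32, np.float64]:
    # 1.0 sits exactly on an interior bin edge for range (0, 2), bins=10.
    counts, edges = np.histogram(np.array([1.0], dtype=dt), bins=10, range=(0, 2))
    assert counts.sum() == 1  # counted exactly once: no crash, no drop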

@@ -318,9 +318,15 @@ def _get_bin_edges(a, bins, range, weights):
            raise ValueError('`bins` must be 1d, when an array')

    if n_equal_bins is not None:
        # gh-10322 forces us to request this now to avoid inconsistency
Reviewer (Contributor):

Here and in the other comment below, ideally explain in a sentence why we are doing this, and then add "see gh-10322" - future developers will thank you!

Fixes numpy#8123, closes numpy#9189, fixes numpy#10319

This is a workaround for numpy#10322, not a fix for it.

Adds tests for cases where bounds are more precise than the data, which led to inconsistencies in the optimized path.
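As an illustration of that kind of case (a hypothetical example, not one of the tests added here): a float64 lower bound that is not representable in float32 should still bucket float32 data consistently.

import numpy as np

# The float64 bound 1 - 2**-30 has no exact float32 representation;
# computed in float32 it rounds onto 1.0, and the sample sitting on
# that (rounded) lower edge must still land in the first bin.
data = np.array([1.0], dtype=np.float32)
counts, edges = np.histogram(data, bins=4, range=(1.0 - 2.0 ** -30, 2.0))
assert counts[0] == 1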
@eric-wieser force-pushed the histogram-range-comparison branch from 7b79175 to 1122303 on February 2, 2018 09:04
@eric-wieser (Member Author):

Updated with better comments and more thorough tests. Dropped the ball a little on this one.

@charris merged commit 7c4c213 into numpy:master on Feb 9, 2018
@charris (Member) commented Feb 9, 2018:

Thanks Eric.

eric-wieser added a commit to eric-wieser/numpy that referenced this pull request May 14, 2018
Successfully merging this pull request may close these issues.

Bug in histogram: incorrectly handles float32 inputs just below lower bin limit
histogram failure due to loss of precision