-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: np.histogramdd loses precision on its inputs, leading to incorrect results #11023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously a fuzzy rounded comparison was used for the rightmost bin of histogramdd. It's not clear why this was done, and it resulted in surprising behavior. This also removes the restriction that bin edges must be floats, and allows integer arrays to be passed (and returned) Fixes numpygh-10864
This is due to numpygh-11022.
hist, edges = histogramdd((x, y), bins=(x_edges, y_edges)) | ||
|
||
assert_equal(edges[0].dtype, x_edges.dtype) | ||
assert_equal(edges[1].dtype, y_edges.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fail on master, as edges[i].dtype
is float64
y = big + x | ||
y_edges = big + x_edges | ||
|
||
hist, edges = histogramdd((x, y), bins=(x_edges, y_edges)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also fails on master with
ValueError: `bins[0]` must be strictly increasing, when an array
There exist more malicious test cases that give the wrong result rather than crash, but I figured this was sufficient.
y = np.array([0, 1, 2]) | ||
x_edges = np.array([0, 2, 2]) | ||
y_edges = 1 | ||
hist, edges = histogramdd((x, y), bins=(x_edges, y_edges)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On master, this test crashes with
ValueError: `bins[0]` must be strictly increasing, when an array
@@ -646,7 +644,7 @@ def test_rightmost_binedge(self): | |||
bins = [[0., 0.5, 1.0]] | |||
hist, _ = histogramdd(x, bins=bins) | |||
assert_(hist[0] == 0.0) | |||
assert_(hist[1] == 1.) | |||
assert_(hist[1] == 0.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was weird. 1.0000000001 <= 1.0
should be false.
Hmm, this does change behavior and might cause problems downstream, so it probably needs a compatibility note in the release notes. |
I'll add one. I don't think the change is substantially different from the rounding fixes added to |
This covers the changes made in numpygh-10324 and numpygh-11023
@charris: Added, along with |
Thanks Eric. |
Previously:
Note that this is different to gh-7845, which is about precision on the outputs.