histogram2d and histogramdd always return a float array · Issue #7845 · numpy/numpy · GitHub
Open
mnmelo opened this issue Jul 18, 2016 · 5 comments
@mnmelo
mnmelo commented Jul 18, 2016

When no normalization is performed, the histogram*d functions should return an int array, just as histogram does. A float array is returned instead.

The docs of histogram2d don't mention the return type, and they don't follow the style of the histogram docs, which state that the semantics of the returned values depend on normalization (I guess this could be written more explicitly). The histogramdd docs do follow the style of histogram, yet the function still returns a float array, just like histogram2d.

Running NumPy 1.11.0 on Ubuntu 14.04
Sample code:

import numpy as np

arr = np.random.random(20)

## 1D
np.histogram(arr)[0]
Out: array([5, 1, 1, 1, 2, 1, 2, 1, 2, 4])

## 2D
np.histogram2d(arr,arr)[0]
Out: 
array([[ 5.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  2.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  2.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  4.]])

## N-D
np.histogramdd(arr[:,None])[0]
Out: array([ 5.,  1.,  1.,  1.,  2.,  1.,  2.,  1.,  2.,  4.])
#
np.histogramdd(np.column_stack((arr, arr)))[0]
Out: 
array([[ 5.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  2.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  2.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  2.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  4.]])
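
Until the return types are consolidated, one possible workaround is to cast the result back to integers on the caller's side. The following is only a sketch under stated assumptions: `counts_as_int` is a hypothetical helper name (not part of NumPy), and it assumes the histogram was computed without weights or normalization, so every entry is a whole number.

```python
import numpy as np


def counts_as_int(hist):
    """Hypothetical helper (not a NumPy function): cast raw bin counts to integers."""
    as_int = hist.astype(np.intp)
    # Refuse to cast if any entry is not a whole number, which would
    # indicate that weights or normalization were applied.
    if not np.array_equal(as_int, hist):
        raise ValueError("histogram contains non-integer values")
    return as_int


# Deterministic input so the example is reproducible.
arr = np.linspace(0.0, 1.0, 20)

hist1d = np.histogram(arr)[0]
hist2d = np.histogram2d(arr, arr)[0]
histdd = np.histogramdd(np.column_stack((arr, arr)))[0]

# On versions showing the behavior reported here, only the 1D result is integer.
print(hist1d.dtype, hist2d.dtype, histdd.dtype)

counts2d = counts_as_int(hist2d)
print(counts2d.dtype)
```

The guard makes the cast fail loudly rather than silently truncating a weighted or normalized histogram.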
@mnmelo
Author
mnmelo commented Jul 18, 2016

This would be one more aspect to consolidate across histogram, histogram2d, and histogramdd, as per #4521.

@tom-bird
Contributor
tom-bird commented Aug 1, 2016

Yes, agreed. I will bring histogram2d and histogramdd in line with histogram.

@eric-wieser
Member

Channeling @charris:

I think this should be discussed on the mailing list as it changes the behavior of the affected functions.

If we decide to go ahead with the change, the fix is simply to remove:

# This preserves the (bad) behavior observed in gh-7845, for now.
hist = hist.astype(float, casting='safe')
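
To illustrate what that cast does in isolation, here is a small sketch that does not touch NumPy's internals. The array names are made up for the example. It shows that an integer-to-float cast is accepted under `casting='safe'`, and that very large counts (above 2**53) would not survive the round trip exactly.

```python
import numpy as np

# Integer bin counts, standing in for the internal result before the cast.
counts = np.array([5, 1, 1, 1, 2], dtype=np.int64)

# The same kind of cast as in the quoted snippet: int64 -> float64 is
# considered a 'safe' cast by NumPy, so this does not raise.
as_float = counts.astype(float, casting='safe')
print(counts.dtype, '->', as_float.dtype)

# float64 has a 53-bit significand, so integers above 2**53 may be rounded.
big = np.array([2**53 + 1], dtype=np.int64)
round_tripped = big.astype(float, casting='safe').astype(np.int64)
lossless = bool(round_tripped[0] == big[0])
print('large count preserved exactly:', lossless)
```

For typical bin counts the values are identical either way, so the visible difference is the dtype of the returned array.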

@eric-wieser
Member

@WarrenWeckesser
Member

The mailing list link given above is dead. I think this is the correct thread: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/NIHE7UT4SDPU6KCDSMUIU2UA52PDSEIJ/
