BUG: Histogram of float32 data · Issue #28823 · numpy/numpy · GitHub

BUG: Histogram of float32 data #28823

Closed
rad83 opened this issue Apr 25, 2025 · 2 comments

rad83 commented Apr 25, 2025

Describe the issue:

Computing a weighted histogram of float32 data drops small values, but only at particular positions within the dataset.
In the following code, the output of np.histogram changes if either DTYPE or ORDER is changed. For ORDER at least, this looks like rather inconsistent behavior.

Reproduce the code example:

import numpy as np

DTYPE = 'float32' # ['float32' | 'float64']
ORDER = -1        # [-1 | +1]

dat = np.zeros(10, dtype=DTYPE)
dat[2] = 1
dat[7] = 1e-10
dat = dat[::ORDER]

axs = np.linspace(0, 1, dat.shape[0]+1)
print(np.histogram(axs[:-1], axs, weights=dat)[0])

Error message:

Python and NumPy Versions:

2.2.5
3.13.3 (main, Apr 9 2025, 07:44:25) [GCC 14.2.1 20250207]

Runtime Environment:

[{'numpy_version': '2.2.5',
'python': '3.13.3 (main, Apr 9 2025, 07:44:25) [GCC 14.2.1 20250207]',
'uname': uname_result(system='Linux', node='book', release='6.13.8-arch1-1', version='#1 SMP PREEMPT_DYNAMIC Sun, 23 Mar 2025 17:17:30 +0000', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL',
'AVX512_SPR']}},
{'filepath': '/usr/lib/libgomp.so.1.0.0',
'internal_api': 'openmp',
'num_threads': 4,
'prefix': 'libgomp',
'user_api': 'openmp',
'version': None}]

Context for the issue:

No response

@rad83 rad83 added the 00 - Bug label Apr 25, 2025
Contributor
jakevdp commented Apr 25, 2025

This is a result of floating point precision within the cumulative histogram implementation. When you add 1E-10 to 1.0 at float32 precision, the smaller value is below the resolution of float32 near 1.0 (approximately 1E-7), so it is lost to rounding and the sum is still exactly 1.0:

x = np.float32(1.0)
y = np.float32(1E-10)
print(x + y == x)  # True

So depending on the order in which your weights appear, you'll get different behavior.
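
A minimal sketch of why the order matters, assuming the weighted histogram amounts to a running sum of the weights followed by per-bin differences. The helper `cumulative_then_diff` is hypothetical and only mimics that idea; it is not NumPy's actual code:

```python
import numpy as np

def cumulative_then_diff(weights):
    """Hypothetical helper: running sum in the weights' own dtype, then per-bin differences."""
    zero = np.zeros(1, dtype=weights.dtype)
    running = np.concatenate((zero, np.cumsum(weights)))
    return np.diff(running)

weights = np.zeros(10, dtype=np.float32)
weights[2] = 1
weights[7] = 1e-10

# Small weight comes after the large one: it is added to a running total of 1.0.
small_after_large = cumulative_then_diff(weights)
# Reversed order: the small weight is added to a running total of 0.
small_before_large = cumulative_then_diff(weights[::-1])

print(small_after_large)
print(small_before_large)
```

In the first case the small weight is absorbed into the running total and the difference for its bin comes out as zero. In the second case it is accumulated before the large weight and is recovered intact.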

This type of behavior is entirely expected when working with floating point arithmetic. The solution in this case would be to use a higher-precision floating point representation, such as float64, which has a resolution of approximately 1E-16.
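
As a hedged illustration of that suggestion, the reporter's example can be rerun with the weights cast to float64 before calling `np.histogram`. The names `h32` and `h64` are introduced here for illustration, and the exact float32 result may depend on the NumPy version, so no expected output is shown:

```python
import numpy as np

# Machine epsilon of each type: the relative spacing of values near 1.0.
print(np.finfo(np.float32).eps)
print(np.finfo(np.float64).eps)

dat = np.zeros(10, dtype=np.float32)
dat[2] = 1
dat[7] = 1e-10  # small weight placed after the large one

axs = np.linspace(0, 1, dat.shape[0] + 1)

# Same data and bins; only the dtype of the weights differs.
h32 = np.histogram(axs[:-1], axs, weights=dat)[0]
h64 = np.histogram(axs[:-1], axs, weights=dat.astype(np.float64))[0]

print(h32[7], h64[7])
```

With float64 weights the small value in bin 7 is preserved up to a tiny rounding error, because 1E-10 is well above the float64 resolution near 1.0.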

Author
rad83 commented Apr 25, 2025

... cumulative histogram implementation ...

Oh, I see. Fair enough, thank you for the explanation.

@rad83 rad83 closed this as completed Apr 25, 2025