numpy.sum not stable enough sometimes (Kahan, math.fsum) #8786
Comments
I don't think that … These seem like an improvement that could be built directly into the …
sum uses pairwise summation, which is reasonably accurate without a performance impact. How exactly are you summing? Pairwise summation unfortunately is not used when you are summing along a strided axis, again for performance reasons. Implementing an even more accurate sum is interesting, maybe via a new function or some precision context manager, though the use cases need to be clear. Kahan is easy to implement and acceptable in performance, but not as good as the algorithm math.fsum uses, which is significantly slower and more complicated to implement.
And more precise.
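To illustrate the pairwise-vs-strided point, here is a minimal sketch; the array shape, seed, and sizes are arbitrary and not from this thread:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((1_000_000, 4)).astype(np.float32)

# Double-precision reference for the first column
exact = a[:, 0].astype(np.float64).sum()

# Summing a contiguous 1-D copy of the column uses pairwise summation ...
contig = np.ascontiguousarray(a[:, 0])
err_pairwise = abs(float(np.sum(contig)) - exact)

# ... while reducing along the strided axis 0 of the 2-D array falls back
# to a simpler accumulation loop, so its error is typically larger.
err_strided = abs(float(np.sum(a, axis=0)[0]) - exact)

print(err_pairwise, err_strided)
```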
Interesting. I'm using …
That might be the problem; the fast axis in numpy is the last one by default (C order).
When …
Yes, that's because you're doing exactly the same operation over memory, and just calling the axes different things.
You would have to sprinkle in a …
Thanks for the clarification. Unfortunately, … has the same lack of precision as the original statement.
Well, you can't expect much difference here, it's just a few numbers after all. And you can always get catastrophic annihilation, even in pairwise summation; pairwise summation is just much better because in typical scenarios it makes sure that the two numbers being added are always of similar size. E.g. if you restructured your numbers by always having the first, last, second, second-to-last, etc., you would get a good result of course (and maybe even better for pairwise).
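A quick illustration of that restructuring, assuming a sorted array that is symmetric about zero (the data here are made up):

```python
import numpy as np

a = np.linspace(-100, 100, 1_000_001, dtype=np.float32)  # sorted, symmetric about 0

# Interleave from both ends: first, last, second, second-to-last, ...
idx = np.empty(a.size, dtype=np.intp)
idx[0::2] = np.arange((a.size + 1) // 2)
idx[1::2] = np.arange(a.size - 1, (a.size - 1) // 2, -1)
reordered = a[idx]

# Adjacent pairs now cancel almost exactly, so the partial sums stay small;
# the result is typically much closer to the exact value 0.0.
print(np.sum(a), np.sum(reordered))
```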
Another quick hack to get better accuracy is to sort your array before summing. It'd certainly be nice to have better summation algorithms available as an option, but …
Right, silly of me. The pairwise stuff could maybe actually have helped, since it might sum all the negative and then all the positive ones one after another (the example actually was sorted, but basically symmetric about zero, which invites annihilation)?
Definitely not, I agree. It'd be nice to have more options, though, for example …
The same algorithm used to build a Huffman code could also be adapted: always sum the two numbers of smallest absolute value. It could be implemented with a priority queue. I'm not recommending that, I've just wondered now and then how well it would work...
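A minimal sketch of that idea, for illustration only (the function name and the use of Python's heapq are my own, not from the thread):

```python
import heapq

def smallest_first_sum(values):
    # Greedily add the two remaining values of smallest absolute value,
    # Huffman-style, using a heap keyed on absolute value.
    heap = [(abs(v), v) for v in values]
    heapq.heapify(heap)
    if not heap:
        return 0.0
    while len(heap) > 1:
        _, x = heapq.heappop(heap)
        _, y = heapq.heappop(heap)
        s = x + y
        heapq.heappush(heap, (abs(s), s))
    return heap[0][1]
```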
Sorry, yeah, you would want to sort by absolute value. (I've only used this trick for summing probabilities, where this doesn't come up, so I forgot :-).)
You can do this when summing …
@nschloe: you have to sort each array separately, yes. It's awkward but certainly possible, as an option.
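A hedged sketch of the sort-by-absolute-value trick applied to each subarray of a 2-D array (data and shapes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 100_000)).astype(np.float32)

# Sort each row by absolute value so small terms are accumulated first,
# then sum along that axis.
order = np.argsort(np.abs(a), axis=1)
a_sorted = np.take_along_axis(a, order, axis=1)
row_sums = a_sorted.sum(axis=1)
print(row_sums)
```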
Would either a kahan_sum or an fsum be a feature others think is worth implementing?
A stable numpy sum would be awesome. Sorting is a PITA since you have to do it by a key (abs), and if you have an array with more than one dimension you have to sort each subarray independently.
For reference, I implemented some stable sums and dot products in accupy. Pretty slow in comparison to …
We would also like to have a more precise sum, for a likelihood that's the sum of ~ a billion numbers. @nschloe, very nice comparison in … So your conclusion is that the precision of Kahan sum isn't much better, so it's not worth exposing in NumPy? (Kahan is actually already here for a different application, not for numpy.sum.) fsum accuracy seems amazing. How slow is it?
@cdeil Python uses Shewchuk summation; my guess is you can just look at their code or google a C implementation at the very least. Adapting it into short Cython code is probably less work than trying to dig very long. Now that we support the axis argument for gufuncs, it could actually be a nice little exercise to implement a gufunc for stable summation. Not sure it will be ideal performance-wise, but Cython solutions are unlikely to be any better (unless they bloat up a lot).
Not my conclusion but a known property of Kahan. It's just not robust when it comes to ill-conditioned sums. The best thing would be to have fsum (aka Shewchuk) indeed, but there are faster solutions too which work in k-fold machine precision. (See accupy.) You can actually try accupy and see if it fits your needs. If yes, perhaps we can tweak it for speed.
It does indeed. If you can get it done in C++, super!
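For reference, a pure-Python sketch of the exact-partials approach that math.fsum is based on (Shewchuk-style); this is the well-known "msum" recipe, shown only to illustrate the algorithm, not as a proposed NumPy implementation:

```python
def msum(iterable):
    # Maintain a list of non-overlapping partial sums; each addition is
    # split exactly into a high part (hi) and the rounding error (lo).
    partials = []
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return sum(partials, 0.0)
```

math.fsum implements essentially this in C, with extra care over the final rounding and over special values.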
To add a simple example:

```python
import numpy as np


def naive_sum(a):
    result = np.zeros_like(a, shape=(1,))
    for i in range(a.shape[0]):
        result[0] += a[i]
    return result[0]


a = np.linspace(-100, 100, 1_000_000, dtype=np.float32)
np.random.default_rng(42).shuffle(a)

sum(a), naive_sum(a), np.sum(a)
# (0.0, -0.5259266, -0.008300781)
```

This shows that … And for reference (not implying numpy should do the same!), Python's …
For those interested, here is an example with a badly conditioned sum:

```python
import math
import sys

import numpy as np


def naive_sum(lst):
    s = 0.0
    for val in lst:
        s += val
    return s


# A badly conditioned sum, condition number ~2.188e+14
p = [
    -0.41253261766461263,
    41287272281118.43,
    -1.4727977348624173e-14,
    5670.3302557520055,
    2.119245229045646e-11,
    -0.003679264134906428,
    -6.892634568678797e-14,
    -0.0006984744181630712,
    -4054136.048352595,
    -1003.101760720037,
    -1.4436349910427172e-17,
    -41287268231649.57,
]
exact = -0.377392919181026

print(f"Python {sys.version}")
print(f"NumPy {np.__version__}\n")

for sumfun in [math.fsum, sum, np.sum, naive_sum]:
    s = sumfun(p)
    err = abs(s - exact)
    name = f"{sumfun.__module__}.{sumfun.__name__}"
    print(f"{name:<18}: {s:10.4e}, error {err:.2e} ({err / abs(s) * 100:.1f} %)")
```
Using Python 3.12.3 on macOS 14.1.1 with an Apple M3 Pro, I get the following results from the script by @nschloe: …
The reason why …
I am not sure whether I should open a new issue or not. Please let me know.

Issue: …

Example: …
It's not an issue, it is a basic property of numerical floating point math that re-ordering even the simplest operations can lead to different results. More stable summation may do that less, but I don't think there is any reasonable version that is completely exact. |
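A tiny illustration of that non-associativity (plain double precision, nothing NumPy-specific):

```python
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
print((0.1 + 0.2) + 0.3, 0.1 + (0.2 + 0.3))    # 0.6000000000000001 0.6
```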
@seberg What do numpy maintainers intend to do about this issue? I see several options: …
Well, I think there is nothing to do about mean/sum directly. We could think about a new function or maybe about a new option to sum/mean (but that is harder). |
Background

To approximate an integral over the reference interval [-1, 1], one arranges points x_i and weights w_i such that the number

∑_i w_i f(x_i)

approximates the integral of some functions f (typically a set of polynomials) reasonably well. Sophisticated schemes have many points and weights, and naturally the more weights there are, the smaller each individual weight must be.

Bottom line: You'll have to sum up many small values.
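For concreteness, a small sketch of this kind of computation; the Gauss-Legendre points/weights and the test function exp are arbitrary choices for illustration, not taken from the issue:

```python
import numpy as np

# Gauss-Legendre points and weights on the reference interval [-1, 1]
x, w = np.polynomial.legendre.leggauss(50)

approx = np.sum(w * np.exp(x))   # sum_i w_i f(x_i) with f = exp
exact = np.exp(1) - np.exp(-1)   # integral of exp over [-1, 1]
print(approx, exact)
```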
When numpy.sum doesn't cut it

I noticed that numpy.sum produces large enough round-off errors for the tests to fail with as few as 7 summands (being -44.34121805, -15.72356145, -0.04563372, 0., 0.04563372, 15.72356145, 44.34121805, not shown in full precision). math.fsum always worked fine, but cannot be used anymore since I'd like to compute the sum for many intervals at once.

Work so far
Kahan's summation does the trick. Note, however, the use of a Python loop here, which leads to a performance degradation if the number of points and weights is large. math.fsum uses a different algorithm which may or may not be better suited than Kahan.

See here for a previous discussion of the topic.
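A minimal sketch of Kahan (compensated) summation vectorized over trailing axes, so many intervals can be handled at once; the function name and the layout (summands along axis 0) are assumptions for illustration, not the code referenced above:

```python
import numpy as np

def kahan_sum_axis0(a):
    # Compensated (Kahan) summation along axis 0. The Python loop runs only
    # over the number of summands; each step is vectorized over the
    # remaining axes, e.g. over many quadrature intervals at once.
    s = np.zeros(a.shape[1:], dtype=a.dtype)
    c = np.zeros_like(s)  # running compensation for lost low-order bits
    for x in a:
        y = x - c
        t = s + y
        c = (t - s) - y
        s = t
    return s

# Example: weights (n_points,) against values at the points, (n_points, n_intervals)
w = np.array([0.1, 0.4, 0.4, 0.1], dtype=np.float32)
vals = np.ones((4, 3), dtype=np.float32)
print(kahan_sum_axis0(w[:, None] * vals))
```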