numpy.sum not stable enough sometimes (Kahan, math.fsum) #8786


Open · nschloe opened this issue Mar 15, 2017 · 29 comments

@nschloe (Contributor) commented Mar 15, 2017

Background

To approximate an integral over the reference interval [-1, 1], one arranges points x_i and weights w_i such that the number

sum(f(x_i) * w_i)

approximates the integral of a given function f reasonably well (typically tested on a set of polynomials). Sophisticated schemes have many points and weights, and naturally the more weights there are, the smaller each individual weight must be.

Bottom line: You'll have to sum up many small values.
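For concreteness, a minimal example of such a scheme, using NumPy's built-in Gauss-Legendre rule (np.polynomial.legendre.leggauss):

import numpy as np

# 20-point Gauss-Legendre rule on the reference interval [-1, 1]
x, w = np.polynomial.legendre.leggauss(20)

# approximate the integral of exp over [-1, 1]; the exact value is e - 1/e
approx = np.sum(np.exp(x) * w)
print(abs(approx - (np.e - 1 / np.e)))  # error near machine precision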

When numpy.sum doesn't cut it

I noticed that numpy.sum produces round-off errors large enough for the tests to fail with as few as 7 summands (-44.34121805, -15.72356145, -0.04563372, 0.0, 0.04563372, 15.72356145, 44.34121805; not shown in full precision). math.fsum always worked fine, but I cannot use it anymore since I'd like to compute the sums for many intervals at once.

Work so far

Kahan's summation

import numpy

def kahan_sum(a, axis=0):
    """Kahan (compensated) summation along the given axis."""
    s = numpy.zeros(a.shape[:axis] + a.shape[axis+1:])
    c = numpy.zeros(s.shape)  # running compensation for lost low-order bits
    for i in range(a.shape[axis]):
        # index the i-th slice along `axis`, cf. http://stackoverflow.com/a/42817610/353337
        y = a[(slice(None),) * axis + (i,)] - c
        t = s + y
        c = (t - s) - y  # recovers the part of y that was lost in s + y
        s = t.copy()
    return s

does the trick. Note, however, the use of a Python loop here, which leads to a performance degradation if the number of points and weights is large.

math.fsum uses a different algorithm, which may or may not be better suited here than Kahan's.

See here for a previous discussion of the topic.

@eric-wieser (Member)

I don't think that copy is needed there.

This seems like an improvement that could be built directly into the sum function if #8773 were implemented.

@juliantaylor (Contributor)

sum uses pairwise summation, which is reasonably accurate without a performance impact.
A Kahan sum could already be implemented now but is significantly slower.

How exactly are you summing? Pairwise summation is unfortunately not used when you are summing along a strided axis, again for performance reasons.

Implementing an even more accurate sum is interesting, maybe via a new function or some precision context manager, though the use cases need to be clear. Kahan is easy to implement and acceptable in performance, but not as good as the algorithm math.fsum uses, which is significantly slower and more complicated to implement.
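(For reference, a minimal Python sketch of pairwise summation on a 1-D float array; NumPy's C implementation additionally falls back to an unrolled naive loop below a fixed block size for speed:)

def pairwise_sum(a):
    # recursively halve the array and add the two partial sums; the
    # worst-case round-off grows like O(log n) instead of O(n)
    n = len(a)
    if n <= 8:  # small base case; NumPy uses a larger unrolled block here
        s = 0.0
        for x in a:
            s += x
        return s
    return pairwise_sum(a[:n // 2]) + pairwise_sum(a[n // 2:])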

@nschloe (Contributor, Author) commented Mar 15, 2017

A Kahan sum could already be implemented now but is significantly slower.

And more precise.

How exactly are you summing? Pairwise summation is unfortunately not used when you are summing along a strided axis, again for performance reasons.

Interesting. I'm using numpy.sum(a, axis=0), so that shouldn't be a problem.

@juliantaylor (Contributor)

That might be the problem: the fast axis in numpy is by default the last one (C order).
How does it fare when you have your arrays in Fortran order?

@nschloe (Contributor, Author) commented Mar 15, 2017

When transpose()ing and summing over axis=-1, I see the same deficiencies.

@eric-wieser (Member)

I see the same deficiencies.

Yes, that's because you're doing exactly the same operation over memory, and just calling the axes different things

np.copy(c, order='f').sum(axis=0) is the relevant test.

@seberg (Member) commented Mar 15, 2017

You would have to sprinkle in a copy there after the transpose, or do copy("F") and not transpose.

@nschloe (Contributor, Author) commented Mar 15, 2017

Thanks for the clarification. Unfortunately,

numpy.copy(gamma, order='f').sum(axis=0)

has the same lack of precision as the original statement.

@seberg (Member) commented Mar 15, 2017

Well, you can't expect much difference here, it's just a few numbers after all. And you can always get catastrophic annihilation, even in pairwise summation; pairwise summation is just much better because in typical scenarios it makes sure that the two numbers being added are of similar size.

E.g. if you restructured your numbers to always have the first, last, second, second-to-last, etc., you would of course get a good result (and maybe even better for pairwise).
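(A sketch of that reordering, using the seven summands from above, which are already sorted:)

import numpy as np

a = np.array([-44.34121805, -15.72356145, -0.04563372, 0.0,
              0.04563372, 15.72356145, 44.34121805])

mix = np.empty_like(a)
half = (a.size + 1) // 2
mix[0::2] = a[:half]        # first, second, ...
mix[1::2] = a[half:][::-1]  # last, second-to-last, ...
# consecutive entries of `mix` now largely cancel pairwise
print(mix.sum())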

@njsmith (Member) commented Mar 15, 2017

Another quick hack to get better accuracy is to sort your array before summing.

It'd certainly be nice to have better summation algorithms available as an option, but np.sum is unlikely to ever use them by default given the performance cost.

@seberg (Member) commented Mar 15, 2017

Right, silly of me. The pairwise stuff could maybe actually have helped, since it might sum all the negative and then all the positive ones one after another (the example actually was sorted, but basically symmetric about zero, which invites annihilation)?

@nschloe (Contributor, Author) commented Mar 15, 2017

np.sum is unlikely to ever use them by default given the performance cost.

Definitely not, I agree.

It'd be nice to have more options, though, for example numpy.kahan_sum with the same signature as numpy.sum.

@charris (Member) commented Mar 15, 2017

The same algorithm used to generate a Huffman code could also be adapted: always summing the two numbers of smallest absolute value. It could be implemented as a priority queue. I'm not recommending that, I have just wondered now and then how well it would work...
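(A quick illustration of that idea with Python's heapq; smallest_first_sum is a hypothetical helper, and the O(n log n) cost alone rules it out for numpy:)

import heapq

def smallest_first_sum(values):
    # always combine the two entries of smallest absolute value,
    # like building a Huffman tree over the summands
    heap = [(abs(v), v) for v in values]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, a = heapq.heappop(heap)
        _, b = heapq.heappop(heap)
        heapq.heappush(heap, (abs(a + b), a + b))
    return heap[0][1] if heap else 0.0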

@njsmith (Member) commented Mar 15, 2017

Sorry, yeah, you would want to sort by absolute value. (I've only used this trick for summing probabilities, where this doesn't come up, so I forgot :-).)

@nschloe (Contributor, Author) commented Mar 15, 2017

always summing the two numbers of smallest absolute value.

You can do this when summing floats, but not when summing arrays.

@njsmith (Member) commented Mar 15, 2017

@nschloe: you have to sort each array separately, yes. It's awkward but certainly possible, as an option.

@wuisawesome commented Mar 20, 2017

Would a kahan_sum/fsum be a feature others think is worth implementing?

@asmeurer (Member)

A stable numpy sum would be awesome. Sorting is a PITA since you have to do it by a key (abs), and if you have an array with more than one dimension you have to sort each subarray independently.
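(A sketch of what such an option could look like in modern NumPy; abs_sorted_sum is a hypothetical name, not a NumPy API, and np.take_along_axis does the per-slice reordering:)

import numpy as np

def abs_sorted_sum(a, axis=0):
    # sort each 1-D slice along `axis` by increasing magnitude, then sum
    order = np.argsort(np.abs(a), axis=axis)
    return np.take_along_axis(a, order, axis=axis).sum(axis=axis)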

@nschloe (Contributor, Author) commented May 12, 2018

For reference, I implemented some stable sums and dot products in accupy. Pretty slow in comparison to np.sum though.

@cdeil commented Jun 20, 2019

We would also like to have a more precise sum, for a likelihood that's the sum of ~ a billion numbers.

@nschloe - Very nice comparison in accupy, thanks!

So your conclusion is that the precision of kahan sum isn't much better, so not worth exposing in Numpy?

(kahan is actually already here for a different application, not for numpy.sum)

fsum accuracy seems amazing. How slow is it?
The speed comparison in accupy isn't fair since it's using a Python for loop, no?
(See here).
Is a fsum available (in C or Cython or Numba or ...) somewhere that works on Numpy arrays?

@seberg (Member) commented Jun 20, 2019

@cdeil Python uses Shewchuk summation; my guess is you can just look at their code or google a C implementation at the very least. Adapting it into short Cython code is probably less work than digging for very long.

Now that we support the axis argument for gufuncs, it could actually be a nice little exercise to implement a gufunc for stable summation. Not sure it will be ideal performance-wise, but Cython solutions are unlikely to be any better (unless they bloat up a lot).
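(For reference, the well-known pure-Python cookbook version of Shewchuk's algorithm, as referenced in the math.fsum docs; exact barring intermediate overflow:)

def shewchuk_sum(values):
    # maintain a list of non-overlapping partial sums; each addition is
    # split exactly into a high and a low part (an error-free transform)
    partials = []
    for x in values:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return sum(partials, 0.0)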

@nschloe (Contributor, Author) commented Jun 20, 2019

So your conclusion is that the precision of kahan sum isn't much better, so not worth exposing in Numpy?

Not my conclusion but a known property of Kahan: it's just not robust when it comes to ill-conditioned sums. The best thing would indeed be to have fsum (aka Shewchuk), but there are faster solutions too which work in k-fold machine precision (see accupy).

You can actually try accupy, though, and see if it fits your needs. If yes, perhaps we can tweak it for speed.

The speed comparison in accupy isn't fair since it's using a Python for loop, no?

It does indeed. If you can get it done in C++, super!

@lorentzenchr (Contributor) commented Feb 18, 2024

To add a simple example:

import numpy as np

def naive_sum(a):
    # naive left-to-right summation, accumulating in the array's own dtype
    result = np.zeros_like(a, shape=(1,))
    for i in range(a.shape[0]):
        result[0] += a[i]
    return result[0]

a = np.linspace(-100, 100, 1_000_000, dtype=np.float32)
np.random.default_rng(42).shuffle(a)

sum(a), naive_sum(a), np.sum(a)
# (0.0, -0.5259266, -0.008300781)

This shows that numpy.sum does better than the naive summation.

And for reference (not implying numpy should do the same!), Python's sum meanwhile uses (Kahan-) Neumaier summation, see python/cpython#100425.
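(For illustration, a minimal Python sketch of that Neumaier variant: like Kahan, but the compensation also handles the case where the incoming term is larger in magnitude than the running sum:)

def neumaier_sum(values):
    s = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in values:
        t = s + x
        if abs(s) >= abs(x):
            c += (s - t) + x  # low-order bits of x were lost
        else:
            c += (x - t) + s  # low-order bits of s were lost
        s = t
    return s + c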

@nschloe (Contributor, Author) commented Feb 22, 2024

For those interested, here is an example with a badly conditioned sum:

import math
import sys

import numpy as np

def naive_sum(lst):
    s = 0.0
    for val in lst:
        s += val
    # NB: the original post returned `val` (the last list element) here,
    # which is what the naive_sum row in the output below reflects
    return s

# A badly conditioned sum, condition number ~2.188e+14
p = [
    -0.41253261766461263,
    41287272281118.43,
    -1.4727977348624173e-14,
    5670.3302557520055,
    2.119245229045646e-11,
    -0.003679264134906428,
    -6.892634568678797e-14,
    -0.0006984744181630712,
    -4054136.048352595,
    -1003.101760720037,
    -1.4436349910427172e-17,
    -41287268231649.57,
]

exact = -0.377392919181026


print(f"Python {sys.version}")
print(f"NumPy {np.__version__}\n")
for sumfun in [math.fsum, sum, np.sum, naive_sum]:
    s = sumfun(p)
    err = abs(s - exact)
    name = f"{sumfun.__module__}.{sumfun.__name__}"
    print(f"{name:<18}: {s:10.4e}, error {err:.2e} ({err / abs(s)*100:.1f} %)")
Output:

Python 3.12.4 (main, Jun  7 2024, 06:33:07) [GCC 14.1.1 20240522]
NumPy 2.1.0

math.fsum         : -3.7739e-01, error 0.00e+00 (0.0 %)
builtins.sum      : -3.7739e-01, error 0.00e+00 (0.0 %)
numpy.sum         : -3.8281e-01, error 5.42e-03 (1.4 %)
__main__.naive_sum: -4.1287e+13, error 4.13e+13 (100.0 %)

fsum() computes the sum without error, and since Python 3.12 (Neumaier summation) so does the built-in sum(); np.sum() has an error on the order of magnitude 10^{-3}. We don't speak of naive_sum().

@asmaier commented May 3, 2024

Using Python 3.12.3 on Mac OS 14.1.1 with Apple M3 Pro I get the following results from the script by @nschloe:

math.fsum():  -0.377392919181026   0.0
sum():        -0.377392919181026   0.0
np.sum():     -0.3828125           0.005419580818973979
naive_sum():  -41287268231649.57   41287268231649.195

The reason why math.fsum() and sum() now yield the same result might be that since Python 3.12 the algorithm for sum() was improved: "sum() now uses Neumaier summation to improve accuracy and commutativity when summing floats or mixed ints and floats. ", see docs.python.org/3/whatsnew/3.12.html

@NimaSarajpoor

I am not sure whether I should open a new issue or not. Please let me know.

Issue:
In the following example, I've created two 1D numpy arrays. Both have the same non-zero elements, but at different positions. They have different sums!

Example:

import numpy as np

non_zero_elems = """
0.024861743494183196  0.024861743494183196  0.024861743494183196
 0.021792470064670937  0.021792470064670937  0.018949939845733486
 0.018949939845733486  0.0015757541842979532 0.0015757541842979532
 0.0015757541842979532 0.0015757541842979532 0.0015757541842979532
 0.0015757541842979532 0.0015757541842979532 0.0015757541842979532
 0.04033997886552635   0.04033997886552635   0.04033997886552635
 0.04033997886552635  
 """
non_zero_elems = np.array(non_zero_elems.split()).astype(np.float64)
IDX1 = np.array([28, 29, 30, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
       53, 54])

IDX2 = np.array([ 8,  9, 10, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
       53, 54])

# `x` and `y` are two arrays with the same non-zero elements but at different positions
x = np.zeros(55, dtype=np.float64)
y = np.zeros(55, dtype=np.float64)

x[IDX1] = non_zero_elems.copy()
y[IDX2] = non_zero_elems.copy()

print(np.sum(x) == np.sum(y))  # False!! 
print(np.sum(x[x!=0]) == np.sum(y[y!=0]))  # True

@seberg (Member) commented Apr 22, 2025

In the following example, I've created two 1D numpy arrays. Both have the same non-zero elements but at different positions. They have different sum !

It's not an issue, it is a basic property of numerical floating point math that re-ordering even the simplest operations can lead to different results. More stable summation may do that less, but I don't think there is any reasonable version that is completely exact.
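(The classic demonstration that floating-point addition is not associative:)

print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False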

@lorentzenchr (Contributor)
@seberg What do numpy maintainers intend to do about this issue? I see several options:

  1. Close the issue as a "non issue" or "won't be implemented".
  2. Add an option to the existing sum (what about mean and average?)
  3. Add a new function numpy.fsum that simply promotes python's fsum to a ufunc.
  4. Do nothing.
  5. Yet another option.

@seberg (Member) commented Apr 22, 2025

Well, I think there is nothing to do about mean/sum directly. We could think about a new function, or maybe about a new option to sum/mean (but that is harder).
So I think it's a reasonable request, but a hard one. A new function would probably be the simplest thing.
(I don't see anything to do about documentation.)
