ENH: Stable summation method for cumsum (add.reduce) #14909

robparrishqc · 2019-11-15T00:07:53Z

numpy (superb library - no slight intended here!) does not seem to support Kahan or butterfly summation in cumsum, which is critically important for accuracy in large arrays of numbers of similar magnitude.

Reproducing code example:

import numpy as np
print(np.cumsum(np.ones((2**28,), dtype=np.float32))[-1] - 2**28)
# Should be zero

Note that sum obviously uses butterfly summation:

print(np.sum(np.ones((2**28,), dtype=np.float32)) - 2**28)
# Is zero on any machine I can find
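For readers unfamiliar with the Kahan summation mentioned above, a compensated cumulative sum might be sketched as follows. This is only an illustration of the technique, not something NumPy provides: `kahan_cumsum` is a hypothetical helper, and the explicit Python loop is far too slow for real use. The test array starts at 2**24 so that the float32 saturation appears without allocating 2**28 elements.

```python
import numpy as np

def kahan_cumsum(x):
    """Hypothetical helper: compensated (Kahan) running sum kept in float32."""
    x = np.asarray(x, dtype=np.float32)
    out = np.empty_like(x)
    total = np.float32(0.0)  # running sum
    comp = np.float32(0.0)   # running estimate of the rounding error in total
    for i, value in enumerate(x):
        y = value - comp           # apply the correction carried from earlier steps
        t = total + y              # low-order bits of y may be lost here
        comp = (t - total) - y     # recover what was lost (negated)
        total = t
        out[i] = total
    return out

# 2**24 followed by 1000 ones: adding 1.0 to 2**24 is not representable in float32.
x = np.ones(1001, dtype=np.float32)
x[0] = 2**24

naive_last = np.cumsum(x)[-1]
kahan_last = kahan_cumsum(x)[-1]
print(naive_last - 2**24, kahan_last - 2**24)
```

With IEEE 754 round-to-nearest-even arithmetic, the plain float32 cumsum stays stuck at 2**24, while the compensated version reaches 2**24 + 1000 in its last element.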


seberg · 2019-11-17T18:52:58Z

Sum uses (partial) pairwise summation (in many cases). What is butterfly summation? There should be a few open issues about including stable summation methods in numpy, could you cross reference these here and comment there?

robparrishqc · 2019-11-17T19:47:40Z

I think butterfly summation == pairwise summation == parallel sum scan (just different names for the same thing). If there is any remaining doubt, the technique I am referring to is described extensively for CUDA here: https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch39.html - this same approach should work quite well and be reasonably efficient for CPU, but some care will have to be taken to make sure the algorithm uses the cache structure efficiently (or you'll end up with something that scales like log2 memcpy ops). np.sum is clearly using something like this, but np.cumsum is clearly not (see example above). This is particularly problematic when summing large arrays of similar magnitude numbers in low precision (e.g., float32).
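As a rough sketch of the idea described in this comment, a recursive pairwise prefix sum could look like the code below. It is an assumption-laden illustration rather than NumPy's implementation or the GPU Gems algorithm itself: `pairwise_cumsum` and its `cutoff` parameter are hypothetical names, it handles 1-D arrays only, and it ignores the cache-efficiency concerns raised above.

```python
import numpy as np

def pairwise_cumsum(x, cutoff=8):
    """Hypothetical helper: prefix sum built from a tree of pairwise additions."""
    x = np.asarray(x)
    n = x.shape[0]
    if n <= cutoff:
        # Short inputs: fall back to the ordinary sequential scan.
        return np.cumsum(x)

    half = n // 2
    # Up-sweep: add neighbouring elements, halving the problem size.
    pair_sums = x[0:2 * half:2] + x[1:2 * half:2]
    # scanned[i] holds x[0] + ... + x[2*i + 1].
    scanned = pairwise_cumsum(pair_sums, cutoff)

    out = np.empty_like(x)
    out[0] = x[0]
    out[1:2 * half:2] = scanned            # odd positions come straight from the scan
    rest = x[2::2]
    out[2::2] = scanned[:rest.shape[0]] + rest  # even positions need one extra addition
    return out

result = pairwise_cumsum(np.ones(1001, dtype=np.float32), cutoff=1)
print(result[-1])
```

Each output element is formed from roughly log2(n) additions of partial sums rather than n sequential additions, which is why the error growth is expected to be much slower than for a plain running sum.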

seberg · 2019-11-17T19:56:06Z

Yes, the issue is that np.cumsum uses np.add.accumulate which is a bit more limited in implementing special logic (at least right now).

If you can help out, I would prefer to move this discussion to an existing issue and close this one, although it may be that the existing stable summation issues are not specific about cumsum.

robparrishqc · 2019-11-17T20:04:17Z

Certainly - which open issue should I look at?

charris · 2019-11-17T23:20:27Z

I suppose one easy fix for float32 would be to accumulate the running sum in double precision.
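Users can already get this behaviour by hand through the `dtype` argument of `np.cumsum`, which sets the accumulator type. A small sketch, using an array that starts at 2**24 so the float32 problem shows up without a 2**28-element allocation:

```python
import numpy as np

# 2**24 followed by 1000 ones, stored as float32.
x = np.ones(1001, dtype=np.float32)
x[0] = 2**24

last_f32 = np.cumsum(x)[-1]                    # accumulates in float32
last_f64 = np.cumsum(x, dtype=np.float64)[-1]  # accumulates in float64

print(last_f32 - 2**24, last_f64 - 2**24)
```

The trade-off is that the result is a float64 array, so it uses twice the memory of the input; casting it back with `.astype(np.float32)` rounds each element once instead of accumulating rounding error along the way.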

seberg · 2019-11-18T19:48:06Z

True, accumulating in float64 would make the float32 issue less bad. xref gh-8786. I do not mind moving this there as well, but it can also stay its own issue.

nschloe · 2019-12-04T23:53:32Z

Just an FYI: I once created accupy for this purpose. It's slow though.

seberg changed the title ~~Butterfly summation needed in cumsum~~ ENH: Stable summation method for cumsum (add.reduce) Nov 18, 2019

seberg added 01 - Enhancement 15 - Discussion component: numpy.ufunc labels Nov 18, 2019
