GH-102670: Use sumprod() to simplify, speed up, and improve accuracy of statistics functions by rhettinger · Pull Request #102649 · python/cpython

GH-102670: Use sumprod() to simplify, speed up, and improve accuracy of statistics functions #102649

Merged on Mar 14, 2023 (11 commits)
Use sumprod() to speed-up and improve accuracy of correlation()
rhettinger committed Mar 13, 2023
commit 3f2fd8d53da13a0278a7540edc6ffa68baab74be
8 changes: 4 additions & 4 deletions Lib/statistics.py
@@ -134,7 +134,7 @@

 from fractions import Fraction
 from decimal import Decimal
-from itertools import count, groupby, repeat
+from itertools import count, groupby, repeat, tee
 from bisect import bisect_left, bisect_right
 from math import hypot, sqrt, fabs, exp, erf, tau, log, fsum, sumprod
 from functools import reduce
@@ -1076,9 +1076,9 @@ def correlation(x, y, /, *, method='linear'):
         y = _rank(y, start=start)
     xbar = fsum(x) / n
     ybar = fsum(y) / n
-    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
-    sxx = fsum((d := xi - xbar) * d for xi in x)
-    syy = fsum((d := yi - ybar) * d for yi in y)
+    sxy = sumprod((xi - xbar for xi in x), (yi - ybar for yi in y))
+    sxx = sumprod(*tee(xi - xbar for xi in x))
+    syy = sumprod(*tee(yi - ybar for yi in y))
     try:
         return sxy / sqrt(sxx * syy)
     except ZeroDivisionError:
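
For readers who want to try the change outside of Lib/statistics.py, here is a minimal standalone sketch of the two formulations in this hunk. The function names pearson_old and pearson_new are hypothetical and not part of the patch, the sketch omits the input validation, ranking, and ZeroDivisionError handling that the real correlation() performs, and the fallback sumprod is an assumption added only so the snippet also runs on interpreters older than Python 3.12 (it is not the C implementation and does not share its accuracy guarantees).

```python
from itertools import tee
from math import fsum, sqrt

try:
    from math import sumprod  # available in Python 3.12 and later
except ImportError:
    # Hypothetical stand-in for older interpreters; unlike math.sumprod,
    # it does not check that both inputs have the same length.
    def sumprod(p, q):
        return fsum(a * b for a, b in zip(p, q))


def pearson_old(x, y):
    """Formulation removed by this commit: fsum over explicit products."""
    n = len(x)
    xbar = fsum(x) / n
    ybar = fsum(y) / n
    sxy = fsum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = fsum((d := xi - xbar) * d for xi in x)
    syy = fsum((d := yi - ybar) * d for yi in y)
    return sxy / sqrt(sxx * syy)


def pearson_new(x, y):
    """Formulation added by this commit: sumprod over centered values."""
    n = len(x)
    xbar = fsum(x) / n
    ybar = fsum(y) / n
    sxy = sumprod((xi - xbar for xi in x), (yi - ybar for yi in y))
    # tee() yields two iterators over the same centered values, so
    # sumprod() multiplies each deviation by itself (a sum of squares).
    sxx = sumprod(*tee(xi - xbar for xi in x))
    syy = sumprod(*tee(yi - ybar for yi in y))
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(pearson_old(xs, ys), pearson_new(xs, ys))
```

Because sumprod() consumes both tee() iterators in lockstep, the shared buffer inside tee() should stay small, which is presumably why the patch can square the deviations without first building a list.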