variance is inaccurate for arrays of identical, large values (Trac #1098) #1696
Attachment added by @thouis on 2009-04-29: test_var.py
@bsouthey wrote on 2011-01-26:
At least on Linux, this can be addressed by the dtype argument:
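The comment's original example was not captured in this scrape; a minimal sketch of the dtype workaround (the array values below are chosen only for illustration) might look like:

```python
import numpy as np

# Accumulating in a wider type (longdouble is 80-bit extended precision on
# most x86 Linux builds) shrinks the rounding error in the mean, and with it
# the spurious variance.
x = np.full(10000, 1e9 + 1.0 / 3.0)
print(x.var())                      # float64: may come out slightly nonzero
print(x.var(dtype=np.longdouble))   # usually much closer to the exact answer, 0.0
```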
Of course, it just delays the problem until even larger numbers are used.
This is much improved after #3685. Note that the relative error in each term is ~1ulp.
The error arises from the determination of the mean, combined with the fact that all the errors are the same, hence there is no cancellation when adding up the variance. Using Python's "exact" fsum doesn't do any better.
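A short sketch of that explanation (the values are illustrative, not taken from the thread):

```python
import math
import numpy as np

v = 1e9 + 1.0 / 3.0
x = np.full(10000, v)

# The computed mean can be off from v by roughly one ulp at this magnitude.
mean = x.mean()
err = mean - v
print(err, np.spacing(v))

# Every deviation x[i] - mean equals the same small err, so squaring and
# averaging gives ~err**2 with no cancellation; the exact variance is 0.0.
print(x.var(), err ** 2)

# An exactly rounded sum (math.fsum) still has to be divided by n, and the
# shared per-term error remains, so it does not rescue the result either.
print(math.fsum(x) / x.size == v)   # may still be False
```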
Original ticket http://projects.scipy.org/numpy/ticket/1098 on 2009-04-29 by @thouis, assigned to @charris.
Variance calculation is inaccurate for arrays of large, identical values:
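The ticket's original snippet is not preserved in this scrape; a hypothetical reproduction along the same lines:

```python
import numpy as np

# 10000 copies of one large value: the true variance is exactly 0.0, but the
# computed result can come out as a small nonzero number because the computed
# mean differs from the common value by a rounding error shared by every term.
x = np.full(10000, 1e9 + 1.0 / 3.0)
print(x.var())   # expected 0.0; may print a tiny but nonzero value instead
```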
There are more accurate algorithms for computing variance. One example from Welford (1962) is in the attached file.
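The attachment itself is not reproduced above; as a sketch (not the attached test_var.py), Welford's one-pass update keeps a running mean and a running sum of squared deviations:

```python
def welford_variance(values):
    """One-pass variance in the style of Welford (1962); population variance (ddof=0)."""
    mean = 0.0
    m2 = 0.0     # running sum of squared deviations from the current mean
    n = 0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)   # second factor uses the *updated* mean
    return m2 / n if n else float("nan")

# For identical inputs, mean is set exactly to the common value on the first
# step and every later delta is 0, so the result is exactly 0.0.
print(welford_variance([1e9 + 1.0 / 3.0] * 10000))   # 0.0
```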