No Bessel correction in PCA · Issue #7699 · scikit-learn/scikit-learn · GitHub

No Bessel correction in PCA #7699

Closed
budavari opened this issue Oct 18, 2016 · 8 comments · Fixed by #7843 or #9105
Labels
Documentation, Easy (well-defined and straightforward way to resolve)

Comments

@budavari

Description

No big deal, just missing Bessel's correction.

Steps/Code to Reproduce

Expected Results

Actual Results

Versions

@amueller
Member

I just checked Murphy, Hastie/Tibshirani/Friedman, Bishop and Shalev-Shwartz/Ben-David (selected by being within arm's reach) and none of them include Bessel's correction in their PCA formulation.
So it seems rather non-standard. Which definition of PCA are you using?

@jakevdp
Member
jakevdp commented Oct 19, 2016

This is interesting. Thinking about it, I could see where a Bessel correction might be applied or not applied based on the particular application. The Bessel correction serves to decrease the bias in inferred parameters at the expense of increasing the mean squared error. In a statistical modeling context (where you're attempting to, say, find an unbiased estimator of some latent parameter) this makes sense. In a machine learning context (where you're attempting to create a model with the best predictive power, where "best" often involves a mean-squared-error metric) a Bessel correction is probably not warranted.
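
The bias versus mean-squared-error trade-off described above can be checked with a small simulation. The following is a minimal sketch using only the Python standard library; the function name `variance_estimator_stats`, the sample size, and the Gaussian data are illustrative assumptions, not anything taken from scikit-learn.

```python
import random


def variance_estimator_stats(n=5, trials=20000, true_var=1.0, seed=0):
    """Monte Carlo bias and MSE of the 1/n and 1/(n-1) variance estimators.

    Hypothetical helper for illustration; assumes zero-mean Gaussian data.
    """
    rng = random.Random(seed)
    sigma = true_var ** 0.5
    biased, unbiased = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        biased.append(ss / n)          # no Bessel correction
        unbiased.append(ss / (n - 1))  # with Bessel correction

    def bias(estimates):
        return sum(estimates) / trials - true_var

    def mse(estimates):
        return sum((e - true_var) ** 2 for e in estimates) / trials

    return {
        "bias_no_correction": bias(biased),
        "bias_bessel": bias(unbiased),
        "mse_no_correction": mse(biased),
        "mse_bessel": mse(unbiased),
    }


stats = variance_estimator_stats()
for name, value in stats.items():
    print(f"{name}: {value:+.3f}")
```

For Gaussian data with n = 5 and unit variance, theory gives a bias of -0.2 and an MSE of 0.36 for the 1/n estimator, against a bias of 0 and an MSE of 0.5 for the 1/(n-1) estimator, so the simulated values should land close to those.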

It goes back to the distinction raised in Breiman's "Statistical Modeling: The Two Cultures".

FWIW, in our astro textbook we define PCA using the Bessel correction for the covariance matrix. I'd never thought about it deeply before, but I imagine this is more of a standard in the astro community due to the fact that we generally use PCA in a statistical modeling sense (i.e. looking for meaning in the eigenvectors).
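
To make the difference between the two conventions concrete, here is a rough NumPy sketch (not scikit-learn's implementation) that eigendecomposes the covariance matrix computed with and without Bessel's correction. The synthetic data and the mixing matrix are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 50, 3

# Synthetic correlated data; the mixing matrix is arbitrary.
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(n_samples, n_features)) @ mixing

# Covariance normalized by n (no correction) and by n - 1 (Bessel).
cov_n = np.cov(X, rowvar=False, ddof=0)
cov_n1 = np.cov(X, rowvar=False, ddof=1)

# eigh returns eigenvalues in ascending order for symmetric matrices.
evals_n, evecs_n = np.linalg.eigh(cov_n)
evals_n1, evecs_n1 = np.linalg.eigh(cov_n1)

ratio = evals_n1 / evals_n
print("eigenvalue ratio:", ratio)
print("n / (n - 1):     ", n_samples / (n_samples - 1))
```

Under these assumptions every eigenvalue is scaled by the same factor n / (n - 1), while the eigenvectors (up to sign) and the explained-variance ratios are unchanged, so the choice only affects the absolute variance values.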

@amueller
Member

@jakevdp do you think it warrants a parameter in PCA? We don't even have one in StandardScaler...

@jakevdp
Member
jakevdp commented Oct 19, 2016

> do you think it warrants a parameter in PCA?

Perhaps, but in the vast majority of cases it's not going to make much difference at all, so I'd be tempted to leave it as-is, perhaps with a note in the docstring.

@amueller added the Easy, Documentation, and Need Contributor labels on Oct 19, 2016
@amueller
Member

Yeah, OK, let's add a note to the docstring.

@lesteve changed the title from "Wrong eigenvalues in PCA" to "No Bessel correction in PCA" on Oct 20, 2016
@dalmia
Contributor
dalmia commented Nov 9, 2016

@amueller @jakevdp Please have a look whether this solves the need.

@amanp10
Contributor
amanp10 commented Dec 5, 2016

Hello,
I am new to scikit-learn. Can I try and add the Bessel Correction in PCA?

@amueller
Member
amueller commented Dec 5, 2016

@amanp10 there is a fix in #7843
