[MRG+1] Feature or Bug? Whitened PCA does not inverse_transform properly and is even documented that way #3107
Conversation
@@ -161,8 +161,7 @@ def test_pca_inverse():
     pca.fit(X)
     Y = pca.transform(X)
     Y_inverse = pca.inverse_transform(Y)
-    relative_max_delta = (np.abs(X - Y_inverse) / np.abs(X).mean()).max()
-    assert_almost_equal(relative_max_delta, 0.11, decimal=2)
+    assert_almost_equal(X, Y_inverse, decimal=3)
would this one fail on master?
yes, I wrote it before coding the fix, and it failed
Or rather "fix", because I am still not sure whether the current behaviour can be useful or not
I'm surprised about the 3 decimals of precision
> Or rather "fix", because I am still not sure whether the current behaviour can be useful or not
Suppose that you have computed OLS weights in a reduced space, as in a PCR (principal component regression), and you want to visualize them in the full space. That would be a use case.
I must admit that I would find it fishy to use whiten=True in such a scenario.
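For concreteness, here is a minimal sketch of that PCR use case, assuming the current scikit-learn API and whiten=False; the data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 20)
y = X[:, 0] - 2 * X[:, 3] + 0.1 * rng.randn(100)

# Principal component regression: fit OLS on the leading PCA scores.
pca = PCA(n_components=5, whiten=False).fit(X)
ols = LinearRegression().fit(pca.transform(X), y)

# Map the weights learned in the reduced space back into the full
# feature space for visualization (one weight per original feature).
w_full = pca.components_.T.dot(ols.coef_)
print(w_full.shape)  # (20,)
```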
I just copied the same assertion statement from above, where it is tested with whiten=False. It should work at higher precision.
But wouldn't transforming back into the full space imply an un-whitening as well?
> But wouldn't transforming back into the full space imply an un-whitening as well?
Hard to say. I would always write a PCR without whitening, so things are better posed.
I find the current behavior unintuitive, to say the least, and agree with @eickenberg. Ugly proposed solution: introduce a parameter that changes the behavior and keep the old behavior as the default. Or we could just break it ;)
Notes
-----
If whitening is enabled, inverse_transform does not compute the
exact inverse operation as transform.
This comment dates back to 2011, so it seems this behavior has always been there.
@GaelVaroquaux do you remember what was the rationale for this?
> @GaelVaroquaux do you remember what was the rationale for this?
No, and honestly if we cannot find one, I do not mind changing this.
The question is: do we consider this a bug and just fix it and document it in what's new?
I guess what makes everybody hesitant is that it was explicitly documented. Are there general notes on what inverse_transform should do? The rules I have been able to identify are something like: inverse_transform(transform(point)) should be as close as possible to point, for a metric appropriate to the situation and within a reasonable tradeoff of effort/closeness (in case this actually involves solving another optimization problem).
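As a rough illustration of that round-trip rule with the behaviour proposed here (a sketch with arbitrary data; not part of the actual test suite):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(50, 8)

# With full-rank PCA no information is lost, so the round trip should
# recover the input (up to numerical precision), whitened or not.
for whiten in (False, True):
    pca = PCA(n_components=8, whiten=whiten).fit(X)
    X_back = pca.inverse_transform(pca.transform(X))
    print(whiten, np.allclose(X, X_back))
```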
Can you ping the mailing list? If everybody agrees it's a bug I would treat it this way.
I think it's a bug as well.
Michael, could you do the same fix for randomized PCA? Sean
Yes, I can take care of changing this in all relevant modules if it is decided that this behaviour is wanted. In the meantime, I posted a fix on Stack Overflow which can be used without changing the scikit-learn code: http://stackoverflow.com/questions/23254700/inversing-pca-transform-with-sklearn-with-whiten-true/23263923#23263923
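For reference, the idea behind that kind of workaround, spelled out in terms of the post-fix attribute conventions (unit-norm components_); this is a sketch of the un-whitening step, not the exact code from the linked answer, and pre-fix versions scaled the attributes differently:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(50, 8)

pca = PCA(whiten=True).fit(X)
Y = pca.transform(X)  # whitened scores

# Undo the whitening by hand: rescale each score by the per-component
# standard deviation, rotate back, and re-add the mean.
X_back = np.dot(Y * np.sqrt(pca.explained_variance_), pca.components_) + pca.mean_
print(np.allclose(X, X_back))
```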
In my Incremental PCA PR (#3285) there is an introduction of a _BasePCA module. Though the other PCA modules have not yet been converted to use this as the base class, that is the direction we discussed in the PR comments. This is a great chance to a) unify common functions and b) provide deprecation support for the whole whiten/unwhiten transform issue in one place.
Totally agree. Also my question to the mailing list whether there was …
So does this wait for #3285? Or do we want to merge this first?
One last question/request: currently whiten scales the components_ variable (in order to rescale the transformed data). Should this be done, or should the components_ vectors always have norm 1 (whether we whiten or not)?
I think that the components should always have norm 1. We expect them to …
+1
If we merge my PR (#3285) then there will be some work to get more of the …
+1 as well. This is another bug that should be tackled by this PR and properly documented. As this is quite an important semantic change, I would be in favor of not including this fix in 0.15.1 but rather scheduling it for 0.16.0 instead. Will change the milestone.
oops, wrong comment button again. Sorry.
Components are now always of unit length.
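A quick way to check that property after the change (a sketch with made-up data, assuming the current PCA API):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(50, 8)

# With the scaling removed from components_, every component row
# should have unit norm regardless of the whiten setting.
for whiten in (False, True):
    pca = PCA(whiten=whiten).fit(X)
    print(whiten, np.allclose(np.linalg.norm(pca.components_, axis=1), 1.0))
```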
-        if self.whiten:
-            components_ = components_ * np.sqrt(exp_var[:, np.newaxis])
+        # if self.whiten:
+        #     components_ = components_ * np.sqrt(exp_var[:, np.newaxis])
Please delete the commented out lines.
This looks good to me, please add an entry in the what's new file. This change needs to be clearly documented.
Do I need to squash these?
Yes please. It helps to have a single commit per change that has a meaning on its own.
Sorry for being slow: all of them into one, or, as I just pushed, into semantically coherent subunits? (I'm getting better at rebase -i :))
This is good, although you could prefix the commit messages with …
You need to rebase on top of current master though. GitHub says that you are based on an old master.
I was rebased on a master commit of earlier today. Now it should be the latest one. Also added the descriptors to the commits. Thanks for the patience!
We usually don't put square brackets in the commit messages (just in the PR headers for some reason) but this is fine. +1 for merge on my side.
Besides the rebase problem, LGTM. The Travis failure is unrelated.
@eickenberg do you want me to cherry-pick your commits and push them to master or do you want to try to rebase one more time to get rid of my "FIX #3485: class_weight='auto' on SGDClassifier" spurious commit in the middle?
Now there are only the 4 commits that I wrote. Unless I messed something up badly, this should already be rebased on the latest master and correct in content.
The Travis failure should go away if this is rebased on master. There was a bug fix.
This looks pretty good to me - I am a definite +1. It will help for #3285 and future unification of the common code in PCA.
Reviewers are happy. I am going to merge by rebase.
Merged by rebase.
Thanks @eickenberg. Is there an issue ticket for this?
I think I just used the PR as the issue ticket ... Thanks everybody for all the helpful comments!
ZCA is a useful transformation defined as PCA + whitening + inverse rotation. This change breaks it.
ZCA is mathematically different though (bias term, etc.). So if ZCA is …
But this is PCA, not ZCA, and its inverse transform should bring the signal …
Agreed. I am not really sure how this change would break (broke?) a ZCA …
Well, the way it was before was that the … Also, it scaled the lengths of the components, although orthonormality is expected. Exploiting this, you could do … To make a ZCA using PCA, you can do …
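One way to build a ZCA-style whitening on top of PCA with the fixed, unit-norm components (my own sketch, not the code that was elided from the comment above):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(200, 10)

# ZCA whitening: whiten in the PCA basis, then rotate back with the
# orthonormal components so the result stays aligned with the
# original features.
pca = PCA(whiten=True).fit(X)
X_zca = pca.transform(X).dot(pca.components_)

# The ZCA-whitened data should have (approximately) identity covariance.
print(np.allclose(np.cov(X_zca, rowvar=False), np.eye(X.shape[1]), atol=1e-2))
```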
The PCA.inverse_transform docstring even explicitly states that the inverse transform after whitening is inexact, because the necessary rescaling to fall back into the right space is not done. Is there a specific reason for this? One would think that the expected behaviour of an inverse_transform should be that it maps to the closest possible point in input space given the incurred information loss. Since with (full rank) PCA one can keep all the information, the inverse transform should be the true inverse. Any opinions on this?
My main concern for changing this is that this behaviour is documented and thus possibly expected by some users.
Making the PCA object do a true inverse is as easy as adding 2 lines to the inverse_transform, as visible in the diff.
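For readers who do not want to open the diff, the added un-whitening amounts to something like the following (a paraphrase in terms of the fitted attributes, not the literal scikit-learn source):

```python
import numpy as np

def inverse_transform_whitened(pca, Y):
    # Paraphrase of the fix: when whitening was applied, rescale each
    # whitened score by the per-component standard deviation before
    # rotating back into the input space and re-adding the mean.
    if pca.whiten:
        Y = Y * np.sqrt(pca.explained_variance_)
    return np.dot(Y, pca.components_) + pca.mean_
```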