LinearDiscriminantAnalysis predict_proba should use softmax · Issue #5149 · scikit-learn/scikit-learn · GitHub


Closed

amueller opened this issue Aug 24, 2015 · 6 comments · Fixed by #11796

Comments

@amueller
Member

It uses an OVR normalization for multi-class for some unknown (to me) reason.
See #5134.
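For context, here is a rough sketch (not the exact library code) of the two normalizations under discussion: the OvR-style per-class squashing currently used, versus a joint softmax over the decision values.

```python
import numpy as np

def ovr_normalize(scores):
    # Per-class sigmoid, then renormalize each row -- the OvR-style
    # scheme questioned in this issue.
    prob = 1.0 / (1.0 + np.exp(-scores))
    return prob / prob.sum(axis=1, keepdims=True)

def softmax(scores):
    # Joint softmax over the decision values of all classes.
    shifted = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

scores = np.array([[2.0, 0.5, -1.0]])
print(ovr_normalize(scores))  # the two generally differ for 3+ classes
print(softmax(scores))
```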

@amueller amueller added the Bug label Aug 24, 2015
@amueller amueller added this to the 0.16.2 milestone Aug 24, 2015
@mblondel
Member

AFAIK there are several variants of LDA in the multiclass case.

@cle1109 @kazemakase Which one do we implement?

@cbrnr
Contributor
cbrnr commented Aug 26, 2015

We actually didn't change anything in predict_proba, but looking at the code I see that the implementation is a bit strange (or at least I don't understand why it's done like this). For example, why are there two arguments for np.exp and np.reciprocal?

Anyway, the code is identical to the one in logistic regression, so I guess we're doing the same thing here. Since @MechCoder is working on a fix, we could use it in LDA as well.

@mblondel
Member

For example, why are there two arguments for np.exp and np.reciprocal?

np.exp(a, a) is equivalent to np.exp(a, out=a). The named argument is not supported in older versions of NumPy.
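A minimal illustration of the in-place call:

```python
import numpy as np

a = np.array([0.0, 1.0, 2.0])
np.exp(a, a)   # second positional argument is the output array,
               # so the result overwrites `a` without a new allocation
print(a)       # [1.         2.71828183 7.3890561 ]
```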

Anyway, the code is identical to the one in logistic regression, so I guess we're doing the same thing here

The way to compute predict_proba may depend on which variant of multiclass LDA we're using. Since there are several variants, the current code may or may not be correct.

@amueller What makes you think softmax is the right way to compute predict_proba here? Any reference?

@cbrnr
Contributor
cbrnr commented Aug 26, 2015

np.exp(a, a) is equal to np.exp(a, out=a). The named argument is not supported with older versions of NumPy.

I see. I would have used a = np.exp(a), which is unambiguous because I don't have to look up the arguments.

Regarding the LDA variant, we're computing discriminant functions for each class. The class with the highest value will be selected for a particular sample (this is inherited from LinearClassifierMixin). Is that what you mean by variant of LDA?
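Roughly, as a toy sketch of that prediction rule (the array names here are made up, not the estimator's attributes):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(5, 3)                       # 5 samples, 3 features
coef = rng.randn(4, 3)                    # one linear discriminant per class
intercept = rng.randn(4)
classes = np.array(["a", "b", "c", "d"])

scores = X @ coef.T + intercept           # decision_function-style values
y_pred = classes[scores.argmax(axis=1)]   # class with the highest value wins
print(y_pred)
```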

@mblondel
Member

I would have used a = np.exp(a)

The point is to save a memory allocation by doing the operation in place, but I agree this is not a big deal.

Is that what you mean by variant of LDA?

From C. Bishop's book: "There are now many possible choices of criterion (Fukunaga, 1990)"

we're computing discriminant functions for each class

This sounds to me like the current code (compute each proba then normalize) is correct then.

@amueller
Member Author

I don't have my Bishop here; we should check the reference.
It just seems like a very odd formula.
I was assuming that this is just p(y|x) under the probabilistic model where p(x|y) is Gaussian with a shared covariance.
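A toy numerical check of that claim (a sketch under the stated Gaussian assumption, not library code): with class-conditional Gaussians sharing one covariance matrix, Bayes' rule gives the same posterior as a softmax of linear scores in x.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.RandomState(0)
means = rng.randn(3, 2)                   # one mean per class
cov = np.array([[1.0, 0.3], [0.3, 2.0]])  # shared covariance
priors = np.array([0.2, 0.3, 0.5])
x = rng.randn(2)

# Posterior via Bayes' rule with Gaussian class conditionals.
lik = np.array([multivariate_normal.pdf(x, mean=m, cov=cov) for m in means])
posterior = priors * lik
posterior /= posterior.sum()

# Same posterior via a softmax of linear scores
# w_k = Sigma^{-1} mu_k,  b_k = -mu_k^T Sigma^{-1} mu_k / 2 + log pi_k.
cov_inv = np.linalg.inv(cov)
w = means @ cov_inv
b = -0.5 * np.einsum('ij,jk,ik->i', means, cov_inv, means) + np.log(priors)
scores = w @ x + b
p = np.exp(scores - scores.max())
p /= p.sum()

print(np.allclose(posterior, p))          # True
```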

@amueller amueller modified the milestones: 0.16.2, 0.17 Sep 8, 2015
@amueller amueller modified the milestones: 0.18, 0.17 Sep 20, 2015
@amueller amueller modified the milestones: 0.18, 0.19 Sep 22, 2016
@jnothman jnothman changed the title LDA predict_proba should use softmax LinearDiscriminantAnalysis predict_proba should use softmax Jun 14, 2017
@jnothman jnothman modified the milestones: 0.20, 0.19 Jun 14, 2017
@glemaitre glemaitre modified the milestones: 0.20, 0.21 Jun 13, 2018