LogisticRegression assumes OvR for probability estimates when using cross-entropy error in multiclass problems · Issue #5176 · scikit-learn/scikit-learn · GitHub

LogisticRegression assumes OvR for probability estimates when using cross-entropy error in multiclass problems #5176

Closed
akxlr opened this issue Aug 28, 2015 · 1 comment

Comments

akxlr commented Aug 28, 2015

When using `LogisticRegression` for multi-class classification with `multi_class='multinomial'` (i.e. cross-entropy error rather than OvR), I am finding that `predict_proba` gives incorrect probability estimates.

The weights are learned correctly, but I think `predict_proba` assumes an OvR scheme: it calculates the 2-class probability for each boundary using a sigmoid function and then normalises. I would have expected it to return `exp(w_c'x) / [sum_k exp(w_k'x)]` for each class c instead, which is the expression for p(C_c|x) used to derive the cross-entropy loss function. The results are quite different, at least in my application.
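
To illustrate how far the two estimates can diverge, here is a minimal sketch in plain Python. It is not scikit-learn's code: the helper names `softmax_proba` and `ovr_proba` and the decision values in `scores` are hypothetical, chosen only to contrast the softmax expression above with the "sigmoid per class, then normalise" scheme.

```python
import math

def softmax_proba(scores):
    """Multinomial estimate: exp(s_c) / sum_k exp(s_k)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def ovr_proba(scores):
    """OvR-style estimate: sigmoid of each score, then normalise to sum to 1."""
    sig = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(sig)
    return [p / total for p in sig]

# Hypothetical decision values w_c'x for one sample and three classes.
scores = [2.0, 0.0, -2.0]

p_softmax = softmax_proba(scores)
p_ovr = ovr_proba(scores)

print("softmax:", [round(p, 3) for p in p_softmax])
print("ovr:    ", [round(p, 3) for p in p_ovr])
```

Both lists sum to 1 and rank the classes identically, so `predict` would be unaffected, but the OvR-style values are noticeably flatter than the softmax values for the same scores.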

agramfort (Member) commented Aug 28, 2015 via email
