LogisticRegression assumes OvR for probability estimates when using cross-entropy error in multiclass problems · Issue #5176 · scikit-learn/scikit-learn · GitHub
When using LogisticRegression for multi-class classification with multi_class='multinomial' (i.e. cross-entropy error rather than OvR), I am finding that predict_proba gives incorrect probability estimates.
The weights are learned correctly but I think predict_proba assumes an OvR scheme and calculates the 2-class probability for each boundary using a sigmoid function and then normalises. I would have expected it instead to return exp(w_c'x)/[sum_k exp(w_k'x)] for each class c, which is the expression for p(C_c|x) used to derive the cross-entropy loss function. The results are quite different, at least in my application.
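To make the difference concrete, here is a minimal NumPy sketch of the two calculations described above, applied to the same decision scores w_k'x. The function names `softmax_proba` and `ovr_proba` and the example scores are made up for illustration; they are not scikit-learn code. The OvR version only mirrors what this report suspects `predict_proba` is doing.

```python
import numpy as np


def softmax_proba(scores):
    """Multinomial probabilities: exp(w_c'x) / sum_k exp(w_k'x)."""
    scores = np.asarray(scores, dtype=float)
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


def ovr_proba(scores):
    """OvR-style probabilities: per-class sigmoid, then normalise each row."""
    scores = np.asarray(scores, dtype=float)
    sigmoid = 1.0 / (1.0 + np.exp(-scores))
    return sigmoid / sigmoid.sum(axis=1, keepdims=True)


# Hypothetical decision scores w_k'x for one sample and three classes.
scores = np.array([[2.0, 0.5, -1.0]])

p_softmax = softmax_proba(scores)
p_ovr = ovr_proba(scores)

print("softmax:", p_softmax)
print("OvR    :", p_ovr)
```

Both rows sum to 1 and pick the same most likely class, but the probabilities themselves differ, and in this example the softmax estimate is much more peaked than the normalised-sigmoid one.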