[MRG + 2] predict_proba should use the softmax function in the multinomial case #5182
Conversation
@@ -238,16 +238,32 @@ def _predict_proba_lr(self, X):
         1. / (1. + np.exp(-self.decision_function(X)));
         multiclass is handled by normalizing that over all classes.
         """
+        from sklearn.linear_model.logistic import (
+            LogisticRegression, LogisticRegressionCV)
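For context, the two ways of turning decision values into class probabilities that are being compared in this thread look roughly like this (a sketch with illustrative function names, not the actual scikit-learn internals):

```python
import numpy as np

def ovr_proba(scores):
    # Old behaviour (_predict_proba_lr): squash each class score through a
    # sigmoid, then renormalize each row so it sums to one.
    prob = 1. / (1. + np.exp(-scores))
    return prob / prob.sum(axis=1, keepdims=True)

def softmax_proba(scores):
    # Behaviour this PR introduces for multi_class="multinomial":
    # exponentiate the scores and normalize them jointly.
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

scores = np.array([[2.0, 1.0, -1.0]])  # decision_function output for one sample
print(ovr_proba(scores))      # roughly [[0.47, 0.39, 0.14]]
print(softmax_proba(scores))  # roughly [[0.71, 0.26, 0.04]]
```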
That's pretty ugly. I would rather override `predict_proba` in `LogisticRegression`.
def predict_proba(self, X):
    if self.multi_class == "multinomial":
        [...]
    else:
        return super(LogisticRegression, self).predict_proba(X)
That should get rid of the `isinstance` on `self` as well.
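Made concrete, the suggested override might look roughly like this (a sketch only, assuming a problem with three or more classes; `scipy.special.softmax` is used here for brevity and is not necessarily what the PR itself does):

```python
from scipy.special import softmax
from sklearn.linear_model import LogisticRegression


class MultinomialLogisticRegression(LogisticRegression):
    """Illustration of overriding predict_proba only for the multinomial case."""

    def predict_proba(self, X):
        if self.multi_class == "multinomial":
            # Softmax over the raw decision values instead of the
            # normalized per-class sigmoids used by the OvR path.
            return softmax(self.decision_function(X), axis=1)
        # Anything else: defer to the sigmoid-based parent implementation,
        # so no isinstance check on self is needed.
        return super().predict_proba(X)
```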
@mblondel Sorry for the hasty hack. I've addressed your comment.
I guess that `predict_proba`'s argmax equals `predict` is already tested in the common tests? LGTM.
Yes, it is.
Does the test you added fail without the patch?
I was thinking about that. Would it be sufficient to check that the predicted probability values differ between the two cases?
Is it always true that `clf.predict_proba(X).max(axis=0)` is greater for the multinomial case?
You could try to compute the multinomial log loss: hopefully, the loss should be smaller with the right probabilities (although this might not be true due to the l2 regularization term on the coefficients...).
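One way such a test could be written (a hedged sketch; the test actually added in this PR may differ):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

X, y = make_classification(n_samples=200, n_classes=3, n_informative=4,
                           random_state=0)
clf = LogisticRegression(multi_class="multinomial", solver="lbfgs").fit(X, y)
scores = clf.decision_function(X)

# Probabilities via softmax (what predict_proba returns after this PR).
exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
proba_softmax = exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Old behaviour: per-class sigmoids renormalized to sum to one.
sigm = 1. / (1. + np.exp(-scores))
proba_sigmoid = sigm / sigm.sum(axis=1, keepdims=True)

# The model was fit by minimizing the multinomial log loss, so the softmax
# probabilities should usually score better (regularization can blur this).
assert log_loss(y, proba_softmax) < log_loss(y, proba_sigmoid)
```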
Thanks for the tip. I've added the test.
+1 for merge when Travis is happy. Thanks @MechCoder
Ah, I see. I added that, but I kept the previous test as well because I thought it might be interesting.
Thanks. LGTM now :)
Also fixes #5134.
Two +1s. We're only waiting for Appveyor (which is currently very slow).
Minor comment: would it be more stable to calculate the log probability (using ...)?
This will have overflow issues for large decision_function values (e.g. [750, 749, 748]). This can be fixed by subtracting the max from the output of decision_function (in this case 750), since shifting all scores by the same constant c leaves the softmax unchanged: exp(x_i - c) / sum_j exp(x_j - c) = exp(x_i) / sum_j exp(x_j).
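A quick numeric illustration of the overflow and of the shift-by-max fix (illustrative only):

```python
import numpy as np

scores = np.array([750., 749., 748.])

# Naive softmax: np.exp(750.) overflows to inf, so the ratio becomes nan.
naive = np.exp(scores) / np.exp(scores).sum()

# Subtracting the max first leaves the result mathematically unchanged,
# since the common factor exp(-750) cancels between numerator and denominator.
shifted = scores - scores.max()
stable = np.exp(shifted) / np.exp(shifted).sum()

print(naive)   # [nan nan nan] (with overflow warnings)
print(stable)  # roughly [0.665 0.245 0.090], finite and summing to one
```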
Thanks for the tip! See #5225
Fixes #5176