added multiclass_log_loss metric by ephes · Pull Request #1125 · scikit-learn/scikit-learn · GitHub

added multiclass_log_loss metric #1125

Closed · wants to merge 4 commits

Conversation

@ephes (Contributor) commented Sep 6, 2012

Don't know whether this is helpful, just practicing :)...

@mblondel (Member) commented Sep 7, 2012

Thanks, I do think that's useful. I will try to review the code later.

@kyleabeauchamp (Contributor)

So one (somewhat) related issue is that one cannot optimize this type of metric using GridSearchCV. The problem is that the grid search always assumes that you want to score using model.predict() rather than predict_proba(). It's obviously easy to temporarily hack the code to allow this, but I was wondering if people had any desire for a better implementation of such a feature. Thoughts?
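
For illustration, one possible shape of such a workaround, sketched against the callable-scoring interface that scikit-learn exposes in later versions (GridSearchCV accepts any callable with the signature (estimator, X, y) as its scoring argument); log_loss is used here as a stand-in for the multiclass_log_loss proposed in this PR:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import log_loss  # stand-in for multiclass_log_loss

def neg_log_loss_scorer(estimator, X, y):
    # score on predict_proba() output instead of predict();
    # negated because GridSearchCV treats larger scores as better
    return -log_loss(y, estimator.predict_proba(X))

X, y = load_iris(return_X_y=True)
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      param_grid={"C": [0.1, 1.0, 10.0]},
                      scoring=neg_log_loss_scorer, cv=3)
search.fit(X, y)

Recent scikit-learn releases also expose this directly as the built-in scoring string "neg_log_loss".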

@kyleabeauchamp (Contributor)

I have two recommendations:

  1. Add multiclass_log_loss to metrics/__init__.py
  2. Re-normalize the probabilities after clipping. Right now, when values are clipped outside the window, the resulting probability vectors are no longer normalized; their sums may not be 1.0, which can subtly change the final result (see the sketch below).
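
A rough sketch of the second point, assuming y_pred is an (n_samples, n_classes) array of predicted probabilities (toy values below):

import numpy as np

# toy predicted probabilities: 2 samples, 3 classes
y_pred = np.array([[1.0, 0.0, 0.0],
                   [0.4, 0.5, 0.1]])
eps = 1e-15
clipped = np.clip(y_pred, eps, 1 - eps)           # avoids log(0), but rows may no longer sum to 1
clipped /= clipped.sum(axis=1, keepdims=True)     # re-normalize so each row sums to 1 again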

@amueller (Member)

@kyleabeauchamp The issue with grid search was discussed here: #1014.

@ephes (Contributor, Author) commented Sep 11, 2012

@kyleabeauchamp thanks for your recommendations, changed the files accordingly...


def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Multi class version of Logarithmic Loss metric.
    https://www.kaggle.com/wiki/MultiClassLogLoss
Member (inline review comment):

I would rather use more standard references such as The Elements of Statistical Learning and Wikipedia.
The link to the thread seems pretty out of place. Also, I guess alternative names should be mentioned. This is the multinomial logistic regression loss, right? AKA softmax loss, AKA max entropy?
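
For readers following the diff, a self-contained sketch of the metric under discussion (clip, re-normalize, then average the negative log of the probability assigned to the true class); the exact signature and defaults in the PR's branch may differ:

import numpy as np

def multiclass_log_loss(y_true, y_pred, eps=1e-15):
    """Mean negative log-probability of the true class.

    y_true : integer class labels, shape (n_samples,)
    y_pred : predicted probabilities, shape (n_samples, n_classes)
    """
    y_true = np.asarray(y_true)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    y_pred /= y_pred.sum(axis=1, keepdims=True)   # re-normalize after clipping
    rows = np.arange(y_true.shape[0])
    return -np.mean(np.log(y_pred[rows, y_true]))

For example, multiclass_log_loss([0, 2], [[0.8, 0.1, 0.1], [0.2, 0.2, 0.6]]) is roughly 0.37.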

@amueller (Member) commented Oct 1, 2012

Hey @ephes, are you still working on this? Or are you too busy with the competition ;)
I thought that was merged already and kind of forgot about it.

@ephes (Contributor, Author) commented Oct 1, 2012

Yes, I'm too busy. The competition is eating up all of my spare time atm :). But I do plan to work on this again next week, when the competition is over.

@amueller (Member) commented Oct 1, 2012

Ok, no worries! I'll just use your branch until then. Good luck!

@weilinear (Member)

How is this PR going, @ephes? I can try to help if needed :)

@larsmans (Member)

Ping myself: this should get merged.

@amueller (Member)

IIRC the documentation and testing need some work.
I agree, though, that it shouldn't be much work and we should merge this soon.

@arjoly (Member) commented May 21, 2013

Could it be called log_loss and also support binary classification?

@amueller (Member)

no, it should ;)
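
As it turned out, the log_loss that scikit-learn eventually shipped handles both binary and multiclass input; a usage sketch (values are illustrative):

from sklearn.metrics import log_loss

# binary: a 1-d array of positive-class probabilities is enough
log_loss([0, 1, 1], [0.1, 0.9, 0.8])

# multiclass: one column of probabilities per class
log_loss([0, 2, 1], [[0.8, 0.1, 0.1],
                     [0.2, 0.2, 0.6],
                     [0.3, 0.5, 0.2]])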

@larsmans (Member)

I have some time tomorrow, I hope I can finish it then.

@mblondel (Member)

I agree that this PR needs some work.

Binary log loss can also be used for OvR: you just sum up the losses of each class. So we might prefer two different functions, one for binary log loss and one for multiclass log loss.
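
A rough sketch of that one-vs-rest idea (illustrative code, not what this PR implements): treat each class as its own binary problem and sum the binary log losses.

import numpy as np

def binary_log_loss(y_true, p, eps=1e-15):
    # y_true in {0, 1}; p = predicted probability of the positive class
    y_true = np.asarray(y_true)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def ovr_log_loss(y_true, y_pred):
    # y_true: integer labels (n_samples,); y_pred: probabilities (n_samples, n_classes)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred, dtype=float)
    # one binary problem per class: "does this sample belong to class k?"
    return sum(binary_log_loss((y_true == k).astype(int), y_pred[:, k])
               for k in range(y_pred.shape[1]))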

@arjoly (Member) commented Jul 25, 2013

Closing this one in favour of #2013.
Re-open it if you still want to contribute.

@arjoly closed this Jul 25, 2013
Labels: None yet · Projects: None yet · 7 participants