Add a list_scorers function to sklearn.metrics #10712
I'd like to try this :) |
Go for it.
|
@jnothman It seems that we already have a simple way to list all the scorers (which is used to generate the above error message): `from sklearn.metrics import SCORERS` then `print(SCORERS.keys())`. How about just improving the error message: "Use sklearn.metrics.SCORERS.keys() to get valid strings."? |
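For reference, a minimal runnable version of that suggestion (assuming an sklearn release where the `SCORERS` dict is still public; newer releases expose `sklearn.metrics.get_scorer_names()` instead):

```python
# List the registered scorer names, as suggested above.
# Assumes an sklearn release that still exposes the SCORERS dict.
from sklearn.metrics import SCORERS

print(sorted(SCORERS.keys()))
```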
you're right, we could do that. I had once thought list_scorers would be better for excluding deprecated names. We could consider list_scorers if we want to list by task type, etc.
|
`sorted(SCORERS.keys())` might be better advice. But I still think there's a usability problem in the length of the list returned and its heterogeneity. Let's make these scorers usable (I think @amueller will agree). Let's define:

An alternative to this is to provide some kind of structured data that can be interpreted as a dataframe, and let the user do their own filtering:

Personally, I think… Either way, we would not just be storing… Notes:
|
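A rough, non-authoritative sketch of the "structured data that can be interpreted as a dataframe" idea above; the metadata fields and their values here are illustrative placeholders, not actual sklearn scorer metadata:

```python
# Sketch: expose scorer metadata as plain records that pandas can turn into
# a DataFrame, so users can filter by task themselves.
# The "task" and "needs_proba" fields are hypothetical placeholders.
import pandas as pd

scorer_info = [
    {"name": "accuracy", "task": "classification", "needs_proba": False},
    {"name": "neg_log_loss", "task": "classification", "needs_proba": True},
    {"name": "r2", "task": "regression", "needs_proba": False},
]

df = pd.DataFrame(scorer_info)
print(df[df["task"] == "classification"]["name"].tolist())
```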
I'm +0 for such a function. |
You might be right, but:
Opinion @amueller? |
@jnothman, if there is a consensus, and nobody is working on this, I'd like to take it. |
at the moment there is no consensus. you can always take the risk of trying to implement it, show whether it is useful, champion it, etc., and see if others come on board.
|
Also, you should check whether @danielleshwed hopes to work on it. |
I think this can be added at least to the testing functions, just like sklearn.utils.testing.all_estimators. |
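For comparison, this is roughly how `all_estimators` is used (it lived under `sklearn.utils.testing` at the time of this thread; newer releases expose it as `sklearn.utils.all_estimators`):

```python
# Discover estimator classes by type, analogous to what a scorer-listing
# helper could do for scorers.
# Import path assumes an older sklearn; use sklearn.utils.all_estimators
# on newer releases.
from sklearn.utils.testing import all_estimators

classifiers = all_estimators(type_filter="classifier")  # list of (name, class) pairs
print([name for name, _ in classifiers][:5])
```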
not sold on this one. Seems harder to maintain. And in the end, even for other interfaces, the user needs to select, right? So it's more a matter of pruning down which are appropriate for which task to select from... Automatic selection of metrics is not really a thing, right? My notebooks (and book?) have… |
"recall" is a good choice if you're optimising logistic loss... And as one
of a few diagnostic measures. I'm happy to close this, though.
|
Only monitoring recall and not precision means you could just change the `class_weight` and get a better score, right? |
maybe the docs should do… |
yes to both
|
+1 to simplify the error message with `sorted(SCORERS.keys())`. |
Can I try this if it's okay? |
@princejha95 Please go ahead :) |
`raise ValueError('%r is not a valid scoring value. ' ...)`
By doing this, the new error message is: `ValueError: 'rubbish' is not a valid scoring value. For valid options use sorted(SCORERS.keys())` |
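A sketch of what the edited raise could look like; the helper name and the exact surrounding code in `sklearn/metrics/scorer.py` are assumptions, not the actual upstream diff:

```python
# Hypothetical sketch of the improved error message, not the exact upstream change.
def _check_scoring_name(scoring, scorers):
    # scorers is expected to be the SCORERS dict of registered scorer names.
    if scoring not in scorers:
        raise ValueError(
            "%r is not a valid scoring value. "
            "For valid options use sorted(SCORERS.keys())." % scoring
        )
```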
looks fine. please submit a pull request
|
I have changed the error message as suggested by @qinhanmin2014 above. Now, if a user gives a scoring value that is not present in the SCORERS dictionary, the error message will suggest that valid scoring values can be found via `sorted(SCORERS.keys())`. |
The `scoring` parameter allows users to specify a scoring method by name. Currently a list of names is available only by getting it wrong. I think this error message, and maintaining it, is getting a bit absurd. Instead we should have a function `sklearn.metrics.list_scorers`, implemented in `sklearn/metrics/scorer.py`, and the error message should say "Use sklearn.metrics.list_scorers to get valid strings." Perhaps we would eventually have `list_scorers` allow users to filter scorers by task type (binary classification, multiclass classification, multilabel classification, regression, etc.), or even provide metadata about each scorer (a description, for instance), but initially we should just be able to list them.
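A minimal sketch of the `list_scorers` proposed here, assuming it simply wraps the existing `SCORERS` registry; the `task_type` parameter only documents the intended future filtering and is not an agreed design:

```python
# Hypothetical sketch of sklearn.metrics.list_scorers, built on the
# existing SCORERS dict; not part of scikit-learn.
from sklearn.metrics import SCORERS


def list_scorers(task_type=None):
    """Return the sorted names of the registered scorers.

    task_type is a placeholder for the filtering discussed in this issue
    (e.g. binary classification, regression); it is not implemented here.
    """
    if task_type is not None:
        raise NotImplementedError("task-type filtering is only sketched here")
    return sorted(SCORERS.keys())


print(list_scorers())
```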