cross_val_score can't take scoring parameters without a custom scoring function · Issue #5308 · scikit-learn/scikit-learn · GitHub
cross_val_score can't take scoring parameters without a custom scoring function #5308


Closed
bjlange opened this issue Sep 24, 2015 · 5 comments
@bjlange
bjlange commented Sep 24, 2015

When using cross_val_score, you can easily choose an averaging method via the various scoring presets. But if you want to set a pos_label like the one taken by f1, precision, and recall (to specify scoring with respect to a specific class), you have to define a custom scoring function like this:

def precision_class_2(estimator, X, y):
    y_pred = estimator.predict(X)
    return metrics.precision_score(y, y_pred, pos_label=2)

cross_val_score(estimator, X, y, scoring=precision_class_2)

This obviously isn't that big of a deal, but it would be nice to be able to pass scoring parameters into the function the same way you can pass fit parameters. Like:

cross_val_score(estimator, X, y, scoring='precision', scoring_params={'pos_label': 2})

I'm happy to contribute code for this but wanted to open an issue first and see what others think/if I'm barking up the wrong tree.

@bjlange bjlange changed the title cross_val_score can't take different pos_labels without a custom scoring function cross_val_score can't take scoring parameters without a custom scoring function Sep 24, 2015
@amueller
Member

That is what make_scorer is for:
http://scikit-learn.org/dev/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer
@jnothman
Member

The recommended approach currently might be scoring=make_scorer(partial(precision_score, pos_label=2)).

I agree that in some ways your proposal would be an improvement: the user need not understand the potentially confusing parameters of make_scorer. It would also be consistent with other places where we allow a string parameter to select among methods.

Its downsides are:


@jnothman
Member

@amueller, you're right. But a user is easily tripped up by not realising they need both to curry the scoring function and to provide make_scorer parameters like needs_proba. So I can see from a usability perspective that this sort of thing could benefit.

On 25 September 2015 at 06:56, Andreas Mueller notifications@github.com wrote:

> that is what make_scorer is for:
> http://scikit-learn.org/dev/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer
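The pitfall being described is that make_scorer only does the right thing if the caller already knows the metric's properties. A minimal sketch using mean_squared_error (my choice of metric for illustration; the thread's other example property is needs_proba, which newer scikit-learn releases spell response_method="predict_proba"):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error

# An error metric: smaller is better, so the caller must remember to pass
# greater_is_better=False. make_scorer then negates the value so that model
# selection can still maximise the score.
mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)

# Perfectly linear toy data: the fitted model's MSE is ~0.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1
est = LinearRegression().fit(X, y)
score = mse_scorer(est, X, y)  # negated MSE, hence <= 0
```

Wrapping the same metric without greater_is_better=False would make grid search pick the worst model, which is exactly the kind of property a user currently has to look up.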

@bjlange
Author
bjlange commented Sep 25, 2015

Ah, I'm just now seeing all the stuff about make_scorer in the model eval docs (http://scikit-learn.org/stable/modules/model_evaluation.html#defining-your-scoring-strategy-from-metric-functions). The docs suggest that you don't have to curry the scoring function; you just pass keyword arguments and they're passed along. That's not too bad usability-wise, and it seems like storing all the relevant info in the scorer makes it easier to pass multiple ones a la #2759.
Thanks for your prompt and helpful responses. If you think this is still worth pursuing I'm happy to help but make_scorer seems to scratch my itch.
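The kwargs pass-through mentioned above looks like this (a minimal sketch; the toy dataset and estimator are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import cross_val_score

# Toy binary problem relabelled to classes {1, 2}.
X, y = make_classification(n_samples=200, random_state=0)
y = y + 1

# Extra keyword arguments to make_scorer are stored on the scorer and
# forwarded to precision_score on every call -- no functools.partial needed.
scorer = make_scorer(precision_score, pos_label=2)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         scoring=scorer, cv=3)
```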

@jnothman
Member

Forgot about the **kwargs. Yes, no currying needed. Still, I think it's a bit mean to require a user to look up the properties of the metric function in order to use make_scorer correctly.

I wonder if one option is to allow make_scorer (or a new altered_scorer function) to take a pre-existing scorer's string as the base definition:

scoring=altered_scorer('precision', pos_label=2)

or even

scoring=make_scorer('precision', pos_label=2)

(The need for this could also be avoided by having the greater_is_better and needs_proba features as annotations on the metric functions, but such an approach has been rejected as too frameworkish.)
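As a sketch of what such a helper might look like (entirely hypothetical: the altered_scorer function, its signature, and the name registry are illustrative, not scikit-learn API):

```python
from sklearn.metrics import f1_score, make_scorer, precision_score, recall_score

# Hypothetical registry mapping preset names to (metric, make_scorer options).
# A real implementation would cover every built-in scoring string.
_BASE_DEFINITIONS = {
    "precision": (precision_score, {}),
    "recall": (recall_score, {}),
    "f1": (f1_score, {}),
}

def altered_scorer(name, **metric_kwargs):
    """Build a scorer from a preset name plus metric keyword overrides."""
    score_func, scorer_options = _BASE_DEFINITIONS[name]
    return make_scorer(score_func, **scorer_options, **metric_kwargs)

# The spelling proposed in the thread:
scoring = altered_scorer("precision", pos_label=2)
```

The user supplies only the familiar preset string; the registry carries the greater_is_better/needs_proba knowledge so it never has to be looked up.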

