cross_val_score can't take scoring parameters without a custom scoring function · Issue #5308 · scikit-learn/scikit-learn · GitHub
cross_val_score can't take scoring parameters without a custom scoring function #5308


Closed
bjlange opened this issue Sep 24, 2015 · 5 comments
@bjlange
bjlange commented Sep 24, 2015

When using cross_val_score, you can easily choose an averaging method via the various scoring presets. But if you want to set a pos_label like the one taken by f1, precision, and recall (to specify scoring with respect to a specific class), you have to define a custom scoring function like this:

def precision_class_2(estimator, X, y):
    y_pred = estimator.predict(X)
    return metrics.precision_score(y, y_pred, pos_label=2)

cross_val_score(estimator, X, y, scoring=precision_class_2)

This obviously isn't that big of a deal, but it would be nice to be able to pass scoring parameters into the function the same way you can pass fit parameters. Like:

cross_val_score(estimator, X, y, scoring='precision', scoring_params={'pos_label': 2})

I'm happy to contribute code for this but wanted to open an issue first and see what others think/if I'm barking up the wrong tree.

@bjlange bjlange changed the title cross_val_score can't take different pos_labels without a custom scoring function cross_val_score can't take scoring parameters without a custom scoring function Sep 24, 2015
@amueller
Member

That is what make_scorer is for:
http://scikit-learn.org/dev/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer
@jnothman
Member

The recommended approach currently might be scoring=make_scorer(partial(precision_score, pos_label=2)).

I agree that in some ways your proposal would be an improvement: the user need not understand the potentially confusing parameters of make_scorer. It would also be consistent with other places where we allow a string parameter to select among methods.

Its downsides are:


@jnothman
Member

@amueller, you're right. But a user is easily tripped up by not realising they need both to curry the scoring function and to provide make_scorer parameters like needs_proba. So I can see from a usability perspective that this sort of thing could benefit.

On 25 September 2015 at 06:56, Andreas Mueller notifications@github.com wrote:

> that is what make_scorer is for:
> http://scikit-learn.org/dev/modules/generated/sklearn.metrics.make_scorer.html#sklearn.metrics.make_scorer
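The pitfall being described is that make_scorer only does the right thing if the caller already knows the metric's properties. A minimal sketch using mean_squared_error (my choice of metric for illustration; the thread's other example property is needs_proba, which newer scikit-learn releases spell response_method="predict_proba"):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error

# An error metric: smaller is better, so the caller must remember to pass
# greater_is_better=False. make_scorer then negates the value so that model
# selection can still maximise the score.
mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)

# Perfectly linear toy data: the fitted model's MSE is ~0.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1
est = LinearRegression().fit(X, y)
score = mse_scorer(est, X, y)  # negated MSE, hence <= 0
```

Wrapping the same metric without greater_is_better=False would make grid search pick the worst model, which is exactly the kind of property a user currently has to look up.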

@bjlange
Author
bjlange commented Sep 25, 2015

Ah, I'm just now seeing all the stuff about make_scorer in the model eval docs (http://scikit-learn.org/stable/modules/model_evaluation.html#defining-your-scoring-strategy-from-metric-functions). The docs suggest that you don't have to curry the scoring function; you just pass keyword arguments and they're passed along. That's not too bad usability-wise, and it seems like storing all the relevant info in the scorer makes it easier to pass multiple ones a la #2759.
Thanks for your prompt and helpful responses. If you think this is still worth pursuing I'm happy to help but make_scorer seems to scratch my itch.
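The kwargs pass-through mentioned above looks like this (a minimal sketch; the toy dataset and estimator are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import cross_val_score

# Toy binary problem relabelled to classes {1, 2}.
X, y = make_classification(n_samples=200, random_state=0)
y = y + 1

# Extra keyword arguments to make_scorer are stored on the scorer and
# forwarded to precision_score on every call -- no functools.partial needed.
scorer = make_scorer(precision_score, pos_label=2)

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         scoring=scorer, cv=3)
```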

@jnothman
Member

Forgot about the **kwargs. Yes, no currying needed. Still, I think it's a bit mean to require a user to look up the properties of the metric function in order to use make_scorer correctly.

I wonder if one option is to allow make_scorer (or a new altered_scorer function) to take a pre-existing scorer's string as the base definition:

scoring=altered_scorer('precision', pos_label=2)

or even

scoring=make_scorer('precision', pos_label=2)

(The need for this could also be avoided by having the greater_is_better and needs_proba features as annotations on the metric functions, but such an approach has been rejected as too frameworkish.)
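As a sketch of what such a helper might look like (entirely hypothetical: the altered_scorer function, its signature, and the name registry are illustrative, not scikit-learn API):

```python
from sklearn.metrics import f1_score, make_scorer, precision_score, recall_score

# Hypothetical registry mapping preset names to (metric, make_scorer options).
# A real implementation would cover every built-in scoring string.
_BASE_DEFINITIONS = {
    "precision": (precision_score, {}),
    "recall": (recall_score, {}),
    "f1": (f1_score, {}),
}

def altered_scorer(name, **metric_kwargs):
    """Build a scorer from a preset name plus metric keyword overrides."""
    score_func, scorer_options = _BASE_DEFINITIONS[name]
    return make_scorer(score_func, **scorer_options, **metric_kwargs)

# The spelling proposed in the thread:
scoring = altered_scorer("precision", pos_label=2)
```

The user supplies only the familiar preset string; the registry carries the greater_is_better/needs_proba knowledge so it never has to be looked up.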

