RFC Metric `pos_label` handling for multiclass (and multilabel) data · Issue #33143 · scikit-learn/scikit-learn · GitHub

@lucyleeow

Warning

This is a request for comment (RFC); please do not open a pull request for this issue. If you wish to contribute to scikit-learn, please have a look at our contributing doc, in particular the section New contributors.

Noticed in #32755 (comment)

Here are the metrics that have a pos_label:

METRICS_WITH_POS_LABEL = {
    "confusion_matrix_at_thresholds",
    "roc_curve",
    "precision_recall_curve",
    "det_curve",
    "brier_score_loss",
    "d2_brier_score",
    "precision_score",
    "recall_score",
    "f1_score",
    "f2_score",
    "f0.5_score",
    "jaccard_score",
    "average_precision_score",
    "weighted_average_precision_score",
    "micro_average_precision_score",
    "samples_average_precision_score",
}

Of these, some only support binary input (e.g., roc_curve, confusion_matrix_at_thresholds).

Of the metrics that also support multiclass input, most silently ignore pos_label when the data is multiclass (or multilabel), even if pos_label is explicitly set. average_precision_score is the only one that raises an error:

    elif y_type == "multilabel-indicator" and pos_label != 1:
        raise ValueError(
            "Parameter pos_label is fixed to 1 for multilabel-indicator y_true. "
            "Do not set pos_label or set pos_label to 1."
        )
    elif y_type == "multiclass":
        if pos_label != 1:
            raise ValueError(
                "Parameter pos_label is fixed to 1 for multiclass y_true. "
                "Do not set pos_label or set pos_label to 1."
            )

I am not sure whether it is better to raise an error, silently ignore, or (as a third option) warn, but some consistency would be ideal.
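To make the three options concrete, here is a minimal sketch of a shared check that every metric could call. `check_pos_label`, its `policy` argument, and the message text are all hypothetical; nothing like this exists in scikit-learn under that name, and the y_type strings simply mirror those in the excerpt above.

```python
import warnings


def check_pos_label(y_type, pos_label, default=1, policy="error"):
    """Hypothetical helper: handle a non-default pos_label consistently.

    Returns the pos_label the metric should actually use.
    policy is one of "error", "warn" or "ignore".
    """
    if policy not in {"error", "warn", "ignore"}:
        raise ValueError(f"Unknown policy: {policy!r}")

    # pos_label is meaningful for binary targets, and the default is always fine.
    if y_type == "binary" or pos_label == default:
        return pos_label

    msg = (
        f"pos_label={pos_label!r} has no effect for {y_type} y_true. "
        f"Do not set pos_label or set pos_label to {default!r}."
    )
    if policy == "error":
        raise ValueError(msg)
    if policy == "warn":
        warnings.warn(msg, UserWarning, stacklevel=2)
    # "warn" and "ignore" both fall back to the default value.
    return default


# The same call under each policy:
print(check_pos_label("multiclass", 2, policy="ignore"))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(check_pos_label("multiclass", 2, policy="warn"), len(caught))
```

Centralizing the decision in one place would make the chosen behavior a one-line change, whichever option is preferred.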

How should pos_label be handled?

cc @StefanieSenger @AnneBeyer @ogrisel ?

Edit:

Side note: brier_score_loss and d2_brier_score do not explicitly state that pos_label is ignored, but they probably should. Maybe you'd be interested in tackling this, @AnneBeyer?
