Allow for multiple scoring metrics in RFECV #28937
@ArturoSbr

Description

Workflow

In its current state, RFECV only accepts a single scoring metric. In my opinion, computing multiple scores for each model fitted on a candidate feature subset would be extremely valuable.

For example, if I wanted to study how the precision and recall of a binary classifier evolve as the model is fed fewer and fewer features, I would have to run RFECV twice: once with scoring='precision' and again with scoring='recall'.
This is inefficient, since both runs fit essentially the same sequence of models.

The cv_results_ attribute of GridSearchCV returns one rank per metric used to evaluate each combination of hyperparameters. Replicating this behavior in RFECV would be extremely helpful.
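For reference, GridSearchCV already supports this pattern today: passing a list of scorers populates cv_results_ with one set of split/mean/std/rank columns per metric, and refit selects which metric drives the final model choice. A minimal sketch of that existing behavior:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Multi-metric scoring is already supported here: cv_results_ gains
# 'mean_test_precision', 'rank_test_recall', etc., one family per metric.
grid = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring=["precision", "recall"],
    refit="precision",  # metric used to pick best_estimator_
    cv=5,
)
grid.fit(X, y)
print(sorted(k for k in grid.cv_results_ if k.startswith("rank_test_")))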

Proposed solution

Notation

  • K is the number of folds used for cross-validation.
  • P is the total number of features available.
  • p is the number of features tried at each step. That is, an integer such that min_features_to_select <= p <= P.
  • m is one of M performance metrics passed by the user (e.g., 'precision').

Solution

The user can pass a list of strings naming M predefined scoring metrics, and at each step the algorithm stores every metric for each of the K models trained with p <= P features.

The cv_results_ attribute of the resulting RFECV would then include the following keys for each metric m and fold k (a sketch of the layout follows the list):

  • 'split{k}_test_{m}'
  • 'mean_test_{m}'
  • 'std_test_{m}'
  • 'rank_test_{m}'
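
As a concrete illustration, here is a small hypothetical snippet that enumerates the keys the proposed cv_results_ would expose for scoring=['precision', 'recall'] and cv=3; an 'n_features' entry recording the candidate subset sizes is assumed here as well.

# Hypothetical sketch: keys the proposed cv_results_ would expose for
# scoring=['precision', 'recall'] and cv=3 (so K = 3 folds).
metrics = ["precision", "recall"]
n_splits = 3

expected_keys = ["n_features"]  # assumed: one row per candidate value of p
for m in metrics:
    expected_keys += [f"split{k}_test_{m}" for k in range(n_splits)]
    expected_keys += [f"mean_test_{m}", f"std_test_{m}", f"rank_test_{m}"]

print(expected_keys)
# ['n_features',
#  'split0_test_precision', 'split1_test_precision', 'split2_test_precision',
#  'mean_test_precision', 'std_test_precision', 'rank_test_precision',
#  'split0_test_recall', ..., 'rank_test_recall']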

Example

from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression()  # any estimator exposing coef_ or feature_importances_

rfecv = RFECV(
    estimator=clf,
    step=1,
    min_features_to_select=1,
    cv=10,
    scoring=['precision', 'recall', 'f1', 'roc_auc', 'accuracy']  # proposed: a list of metrics
)

Considerations

It is likely that rank_test_{m1} will differ from rank_test_{m2} for a given pair of performance metrics m1 and m2. Hence, with this feature RFECV could no longer automatically pick the best number of features; that part of the workflow would be left to the user.
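
In that case, picking the subset size becomes a short post-processing step on the user's side. A minimal sketch, assuming the rfecv instance from the example above has been fitted and exposes the hypothetical multi-metric keys, with F1 arbitrarily chosen as the deciding metric:

import numpy as np

# Assumes rfecv has been fitted and cv_results_ has the proposed layout.
# The user decides which metric drives the final choice, e.g. mean F1.
best_idx = int(np.argmax(rfecv.cv_results_["mean_test_f1"]))
best_n_features = rfecv.cv_results_["n_features"][best_idx]  # 'n_features' assumed, see above
print(f"Best number of features according to F1: {best_n_features}")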

Describe alternatives you've considered, if relevant

Running RFECV as many times as there are metrics.
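
For reference, that workaround looks roughly like the loop below with the current API; every metric triggers a full refit of the entire elimination path, which is exactly the inefficiency described above.

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Current workaround: one complete RFECV run per metric.
results = {}
for metric in ["precision", "recall"]:
    selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=5, scoring=metric)
    selector.fit(X, y)
    results[metric] = selector.cv_results_["mean_test_score"]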

Additional context

I asked this question on StackOverflow and the community agrees that the most viable way to do this is to run one RFECV per performance metric I need to evaluate.
