BayesSearchCV multimetric scoring inconsistency: KeyError: 'mean_test_score' · Issue #1105 · scikit-optimize/scikit-optimize · GitHub

This repository was archived by the owner on Feb 28, 2024. It is now read-only.

BayesSearchCV multimetric scoring inconsistency: KeyError: 'mean_test_score' #1105

Open
leweex95 opened this issue Feb 6, 2022 · 3 comments

Comments

@leweex95
leweex95 commented Feb 6, 2022

I am working on a multilabel classification problem and have so far relied primarily on RandomizedSearchCV from scikit-learn for hyperparameter optimization. I have now started experimenting with BayesSearchCV and ran into a potential bug when combining multi-metric scoring with the refit argument.

I created a full reproducible toy example below.

Imports, data generation, pipeline:

import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.naive_bayes import MultinomialNB
from sklearn.multioutput import MultiOutputClassifier
from sklearn.model_selection import RandomizedSearchCV
from skopt.searchcv import BayesSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

X, Y = make_multilabel_classification(
    n_samples=10000,
    n_features=20,
    n_classes=10,
    n_labels=3
)

pipe = Pipeline(
    steps = [
        ('scaler', MinMaxScaler()),
        ('model', MultiOutputClassifier(MultinomialNB()))
    ]
)

Step 1: Single metric with refit = True: works!

search = BayesSearchCV(
    estimator = pipe, 
    search_spaces = {'model__estimator__alpha': np.linspace(0.01,1,50)},
    scoring = 'precision_macro',
    refit = True,
    cv = 5
).fit(X, Y)

Step 2: Single metric with refit = 'precision_macro': works!

search = BayesSearchCV(
    estimator = pipe, 
    search_spaces = {'model__estimator__alpha': np.linspace(0.01,1,50)},
    scoring = 'precision_macro',
    refit = 'precision_macro',
    cv = 5
).fit(X, Y)

Step 3: Multiple metrics with refit = 'precision_macro': fails!

search = BayesSearchCV(
    estimator = pipe, 
    search_spaces = {'model__estimator__alpha': np.linspace(0.01,1,50)},
    scoring = ['precision_macro', 'recall_macro', 'accuracy'],
    refit = 'precision_macro',
    cv = 5
).fit(X, Y)

(Note: adding return_train_score=True to BayesSearchCV() didn't make a difference.)

The error message is:

  File "multioutput.py", line 46, in <module>
    ).fit(X, Y)
  File ".venv\lib\site-packages\skopt\searchcv.py", line 466, in fit
    super().fit(X=X, y=y, groups=groups, **fit_params)
  File ".venv\lib\site-packages\sklearn\model_selection\_search.py", line 891, in fit  
    self._run_search(evaluate_candidates)
  File ".venv\lib\site-packages\skopt\searchcv.py", line 514, in _run_search
    evaluate_candidates, n_points=n_points_adjusted
  File ".venv\lib\site-packages\skopt\searchcv.py", line 411, in _step
    local_results = all_results["mean_test_score"][-len(params):]
KeyError: 'mean_test_score'
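
The traceback points at a lookup of "mean_test_score" in the results dictionary. With multi-metric scoring, scikit-learn names the result columns per scorer (for example "mean_test_precision_macro"), so no plain "mean_test_score" entry exists. Below is a minimal sketch of a key lookup that tolerates both layouts. The helper name `score_key` and the mock dictionaries are hypothetical and for illustration only; this is not the scikit-optimize code or the fix from any fork.

```python
def score_key(results, refit):
    """Pick the key that holds mean test scores in a cv_results_-style dict.

    Hypothetical helper: assumes scikit-learn's naming, where single-metric
    runs store "mean_test_score" and multi-metric runs store one
    "mean_test_<scorer name>" entry per scorer.
    """
    if "mean_test_score" in results:
        # Single-metric layout.
        return "mean_test_score"
    if isinstance(refit, str) and f"mean_test_{refit}" in results:
        # Multi-metric layout: use the metric named by refit.
        return f"mean_test_{refit}"
    raise KeyError(f"no mean test score found for refit={refit!r}")


# Mock results (made-up numbers) mimicking the two layouts.
single = {"mean_test_score": [0.71, 0.74]}
multi = {
    "mean_test_precision_macro": [0.71, 0.74],
    "mean_test_recall_macro": [0.55, 0.58],
    "mean_test_accuracy": [0.20, 0.22],
}

print(score_key(single, True))
print(score_key(multi, "precision_macro"))

# The direct lookup from the traceback fails on the multi-metric layout.
try:
    multi["mean_test_score"]
except KeyError as exc:
    print("KeyError:", exc)
```

Keying the lookup on the refit metric is one plausible design choice, since refit already names the metric used to select the best candidate.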

For comparison, I ran the same setup through RandomizedSearchCV from scikit-learn:

search = RandomizedSearchCV(
    estimator = pipe,
    param_distributions = {'model__estimator__alpha': np.linspace(0.01,1,50)},
    scoring = ['precision_macro', 'recall_macro', 'accuracy'],
    refit = 'precision_macro',
    cv = 5
).fit(X, Y)

and it evaluated correctly.
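
To see which result keys a multi-metric search produces, the RandomizedSearchCV run can be inspected directly. The sketch below uses a smaller dataset and fewer iterations than the example above so that it runs quickly; the dataset sizes, n_iter, and random_state values are assumptions chosen for illustration. It also drops the scaler, because the generated features are already non-negative counts.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB

# Small toy dataset (sizes are illustrative, not those from the report).
X, Y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=5, n_labels=2, random_state=0
)

search = RandomizedSearchCV(
    estimator=MultiOutputClassifier(MultinomialNB()),
    param_distributions={"estimator__alpha": np.linspace(0.01, 1, 50)},
    scoring=["precision_macro", "recall_macro", "accuracy"],
    refit="precision_macro",
    n_iter=3,
    cv=3,
    random_state=0,
).fit(X, Y)

# Collect the mean test score columns; with multi-metric scoring
# there is one per scorer and no plain "mean_test_score".
mean_test_keys = sorted(k for k in search.cv_results_ if k.startswith("mean_test_"))
print(mean_test_keys)
print("mean_test_score" in search.cv_results_)
```

The absence of a "mean_test_score" entry in this output is consistent with the KeyError raised inside BayesSearchCV.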

My version of scikit-optimize: 0.9.0.
OS: Windows 10

@RNarayan73

@leweex95
This may be addressed by an unofficial fork; see #1071.
I use it too. Hope this helps.
Narayan

@epenney1

Hi, this is still an issue for me as well. Could you point me to where exactly I should download a version without this fault? Thanks.

@RNarayan73


@epenney1, see #1071, which has a pip install command for the unofficial version.

Hope this helps.
