Closed
Description
Describe the bug
I used SequentialFeatureSelector for feature selection with direction="backward". The tolerance is negative, so the selection stops once removing another feature would lower the metric (AUC in this case) by more than the magnitude of the tolerance. Adding features generally raises the AUC, but dropping some of them, especially correlated ones that contribute little, can yield a more parsimonious model with a lower AUC. The code worked as expected in sklearn 1.1.1, but after updating to sklearn 1.2.1 I get the error shown under Actual Results.
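To make the intended behaviour concrete, here is a minimal sketch of the stopping rule as I understand it. It is not scikit-learn's implementation: the function name `backward_steps_until_stop` and the AUC values are made up for illustration, and the rule assumed is "stop when new score minus old score falls below tol".

```python
def backward_steps_until_stop(scores, tol):
    """Count how many feature removals are accepted before stopping.

    scores[0] is the score with all features; scores[i] is the best score
    after removing i features (hypothetical values supplied by the caller).
    """
    accepted = 0
    for old_score, new_score in zip(scores, scores[1:]):
        # Assumed rule: stop as soon as the signed change drops below tol.
        # With a negative tol, a small decrease (change >= tol) is tolerated.
        if new_score - old_score < tol:
            break
        accepted += 1
    return accepted


# Made-up AUC values, for illustration only.
auc_by_step = [0.9950, 0.9948, 0.9945, 0.9920]
n_removed = backward_steps_until_stop(auc_by_step, tol=-0.001)
print(n_removed)
```

With these made-up scores, the first two removals cost less than 0.001 AUC each and are accepted, while the third costs 0.0025 and triggers the stop. Under the same assumed rule, a positive tol would reject even the first removal, which is why a negative value is needed for backward selection.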
Steps/Code to Reproduce
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
X, y = load_breast_cancer(return_X_y=True)

TOL = -0.001
feature_selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select="auto",
    direction="backward",
    scoring="roc_auc",
    tol=TOL
)

pipe = Pipeline(
    [('scaler', StandardScaler()),
     ('feature_selector', feature_selector),
     ('log_reg', LogisticRegression(max_iter=1000))]
)

if __name__ == "__main__":
    pipe.fit(X, y)
    print(pipe['log_reg'].coef_[0])
Expected Results
$ python sfs_tol.py
[-2.0429818 0.5364346 -1.35765488 -2.85009904 -2.84603016]
Actual Results
$ python sfs_tol.py
Traceback (most recent call last):
  File "/home/modelling/users-workspace/nsofinij/lab/open-source/sfs_tol.py", line 28, in <module>
    pipe.fit(X, y)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/pipeline.py", line 401, in fit
    Xt = self._fit(X, y, **fit_params_steps)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/pipeline.py", line 359, in _fit
    X, fitted_transformer = fit_transform_one_cached(
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/joblib/memory.py", line 349, in __call__
    return self.func(*args, **kwargs)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/pipeline.py", line 893, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/base.py", line 862, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/feature_selection/_sequential.py", line 201, in fit
    self._validate_params()
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/base.py", line 581, in _validate_params
    validate_parameter_constraints(
  File "/home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/sklearn/utils/_param_validation.py", line 97, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'tol' parameter of SequentialFeatureSelector must be None or a float in the range (0, inf). Got -0.001 instead.
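For reference, the constraint quoted in the error message can be mirrored with a small stand-alone check. This is only a sketch based on the wording of the message ("None or a float in the range (0, inf)"); the helper name `tol_satisfies_constraint` is hypothetical and this is not scikit-learn's validation code.

```python
def tol_satisfies_constraint(tol):
    """Hypothetical check mirroring the constraint quoted in the error message."""
    if tol is None:
        return True
    # The interval (0, inf) is open at 0, so zero and negative values fail.
    return isinstance(tol, float) and tol > 0


for candidate in (None, 0.001, 0.0, -0.001):
    print(candidate, tol_satisfies_constraint(candidate))
```

Under this reading, the value used in the reproduction script (-0.001) is rejected before any feature selection runs, which matches the traceback ending in `_validate_params`.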
Versions
System:
    python: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0]
    executable: /home/modelling/opt/anaconda3/envs/py310/bin/python
    machine: Linux-4.14.301-224.520.amzn2.x86_64-x86_64-with-glibc2.26

Python dependencies:
    sklearn: 1.2.1
    pip: 23.0
    setuptools: 66.1.1
    numpy: 1.24.1
    scipy: 1.10.0
    Cython: None
    pandas: 1.5.3
    matplotlib: 3.6.3
    joblib: 1.2.0
    threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
    user_api: openmp
    internal_api: openmp
    prefix: libgomp
    filepath: /home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
    version: None
    num_threads: 64

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so
    version: 0.3.21
    threading_layer: pthreads
    architecture: SkylakeX
    num_threads: 64

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /home/modelling/opt/anaconda3/envs/py310/lib/python3.10/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
    version: 0.3.18
    threading_layer: pthreads
    architecture: SkylakeX
    num_threads: 64