diff --git a/doc/whats_new/v1.0.rst b/doc/whats_new/v1.0.rst index 8608688f715ee..15f00515d3414 100644 --- a/doc/whats_new/v1.0.rst +++ b/doc/whats_new/v1.0.rst @@ -24,7 +24,14 @@ Changelog - |Fix| Fixed an infinite loop in :func:`cluster.SpectralClustering` by moving an iteration counter from try to except. - :pr:`21271` by :user:`Tyler Martin ` + :pr:`21271` by :user:`Tyler Martin `. + +:mod:`sklearn.datasets` +....................... + +- |Fix| :func:`datasets.fetch_openml` is now thread safe. Data is first + downloaded to a temporary subfolder and then renamed. + :pr:`21833` by :user:`Siavash Rezazadeh `. :mod:`sklearn.decomposition` ............................ @@ -35,6 +42,21 @@ Changelog and :class:`decomposition.MiniBatchSparsePCA` to be convex and match the referenced article. :pr:`19210` by :user:`Jérémie du Boisberranger `. +:mod:`sklearn.ensemble` +....................... + +- |Fix| :class:`ensemble.RandomForestClassifier`, + :class:`ensemble.RandomForestRegressor`, + :class:`ensemble.ExtraTreesClassifier`, :class:`ensemble.ExtraTreesRegressor`, + and :class:`ensemble.RandomTreesEmbedding` now raise a ``ValueError`` when + ``bootstrap=False`` and ``max_samples`` is not ``None``. + :pr:`21295` by :user:`Haoyin Xu `. + +- |Fix| Solve a bug in :class:`ensemble.GradientBoostingClassifier` where the + exponential loss was computing the positive gradient instead of the + negative one. + :pr:`22050` by :user:`Guillaume Lemaitre `. + :mod:`sklearn.feature_selection` ................................ @@ -42,6 +64,23 @@ Changelog for base estimators that do not set `feature_names_in_`. :pr:`21991` by `Thomas Fan`_. +:mod:`sklearn.impute` +..................... + +- |Fix| Fix a bug in :class:`linear_model.RidgeClassifierCV` where the method + `predict` was performing an `argmax` on the scores obtained from + `decision_function` instead of returning the multilabel indicator matrix. + :pr:`19869` by :user:`Guillaume Lemaitre `. 
+ +:mod:`sklearn.linear_model` +........................... + +- |Fix| :class:`linear_model.LassoLarsIC` now correctly computes AIC + and BIC. An error is now raised when `n_features > n_samples` and + when the noise variance is not provided. + :pr:`21481` by :user:`Guillaume Lemaitre ` and + :user:`Andrés Babino `. + :mod:`sklearn.manifold` ....................... @@ -83,12 +122,25 @@ Changelog - |Fix| Fixes compatibility bug with NumPy 1.22 in :class:`preprocessing.OneHotEncoder`. :pr:`21517` by `Thomas Fan`_. +:mod:`sklearn.svm` +.................. + +- |Fix| :class:`svm.NuSVC`, :class:`svm.NuSVR`, :class:`svm.SVC`, + :class:`svm.SVR`, :class:`svm.OneClassSVM` now validate input + parameters in `fit` instead of `__init__`. + :pr:`21436` by :user:`Haidar Almubarak `. + :mod:`sklearn.tree` ................... - |Fix| Prevents :func:`tree.plot_tree` from drawing out of the boundary of the figure. :pr:`21917` by `Thomas Fan`_. +- |Fix| Support loading pickles of decision tree models when the pickle has + been generated on a platform with a different bitness. A typical example is + to train and pickle the model on a 64 bit machine and load the model on a 32 + bit machine for prediction. :pr:`21552` by :user:`Loïc Estève `. + :mod:`sklearn.utils` .................... diff --git a/doc/whats_new/v1.1.rst b/doc/whats_new/v1.1.rst index 36859342ef6bf..b22c333ba6a65 100644 --- a/doc/whats_new/v1.1.rst +++ b/doc/whats_new/v1.1.rst @@ -117,10 +117,6 @@ Changelog hole; when set to True, it returns the swiss-hole dataset. :pr:`21482` by :user:`Sebastian Pujalte `. -- |Fix| :func:`datasets.fetch_openml` is now thread safe. Data is first downloaded - to a temporary subfolder and then renamed. - :pr:`21833` by :user:`Siavash Rezazadeh `. - - |Enhancement| :func:`datasets.load_diabetes` now accepts the parameter ``scaled``, to allow loading unscaled data. 
The scaled version of this dataset is now computed from the unscaled data, and can produce slightly @@ -141,18 +137,6 @@ Changelog get accurate results when the number of features is large. :pr:`21109` by :user:`Smile `. -- |Fix| :class:`decomposition.FastICA` now validates input parameters in `fit` instead of `__init__`. - :pr:`21432` by :user:`Hannah Bohle ` and :user:`Maren Westermann `. - -- |Fix| :class:`decomposition.FactorAnalysis` now validates input parameters - in `fit` instead of `__init__`. - :pr:`21713` by :user:`Haya ` and - :user:`Krum Arnaudov `. - -- |Fix| :class:`decomposition.KernelPCA` now validates input parameters in - `fit` instead of `__init__`. - :pr:`21567` by :user:`Maggie Chege `. - - |API| Adds :term:`get_feature_names_out` to all transformers in the :mod:`~sklearn.decomposition` module: :class:`~sklearn.decomposition.DictionaryLearning`, @@ -169,39 +153,23 @@ Changelog and :class:`~sklearn.decomposition.TruncatedSVD`. :pr:`21334` by `Thomas Fan`_. -:mod:`sklearn.feature_extraction` -................................. - -- |Fix| :class:`feature_extraction.FeatureHasher` now validates input parameters - in `transform` instead of `__init__`. :pr:`21573` by - :user:`Hannah Bohle ` and :user:`Maren Westermann `. - -- |API| :func:`decomposition.FastICA` now supports unit variance for whitening. - The default value of its `whiten` argument will change from `True` - (which behaves like `'arbitrary-variance'`) to `'unit-variance'` in version 1.3. - :pr:`19490` by :user:`Facundo Ferrin ` and :user:`Julien Jerphanion ` +- |Fix| :class:`decomposition.FastICA` now validates input parameters in `fit` + instead of `__init__`. + :pr:`21432` by :user:`Hannah Bohle ` and + :user:`Maren Westermann `. -:mod:`sklearn.feature_selection` -................................ +- |Fix| :class:`decomposition.FactorAnalysis` now validates input parameters + in `fit` instead of `__init__`. + :pr:`21713` by :user:`Haya ` and + :user:`Krum Arnaudov `. 
-- |Enhancement| Add a parameter `force_finite` to - :func:`feature_selection.f_regression` and - :func:`feature_selection.r_regression`. This parameter allows to force the - output to be finite in the case where a feature or a the target is constant - or that the feature and target are perfectly correlated (only for the - F-statistic). - :pr:`17819` by :user:`Juan Carlos Alfaro Jiménez `. +- |Fix| :class:`decomposition.KernelPCA` now validates input parameters in + `fit` instead of `__init__`. + :pr:`21567` by :user:`Maggie Chege `. :mod:`sklearn.ensemble` ....................... -- |Fix| :class:`ensemble.RandomForestClassifier`, - :class:`ensemble.RandomForestRegressor`, - :class:`ensemble.ExtraTreesClassifier`, :class:`ensemble.ExtraTreesRegressor`, - and :class:`ensemble.RandomTreesEmbedding` now raise a ``ValueError`` when - ``bootstrap=False`` and ``max_samples`` is not ``None``. - :pr:`21295` :user:`Haoyin Xu `. - - |API| Changed the default of :func:`max_features` to 1.0 for :class:`ensemble.RandomForestRegressor` and to `"sqrt"` for :class:`ensemble.RandomForestClassifier`. Note that these give the same fit @@ -211,10 +179,18 @@ Changelog :class:`ensemble.ExtraTreesClassifier`. :pr:`20803` by :user:`Brian Sun `. -- |Fix| Solve a bug in :class:`ensemble.GradientBoostingClassifier` where the - exponential loss was computing the positive gradient instead of the - negative one. - :pr:`22050` by :user:`Guillaume Lemaitre `. +:mod:`sklearn.feature_extraction` +................................. + +- |API| :func:`decomposition.FastICA` now supports unit variance for whitening. + The default value of its `whiten` argument will change from `True` + (which behaves like `'arbitrary-variance'`) to `'unit-variance'` in version 1.3. + :pr:`19490` by :user:`Facundo Ferrin ` and + :user:`Julien Jerphanion `. + +- |Fix| :class:`feature_extraction.FeatureHasher` now validates input parameters + in `transform` instead of `__init__`. 
:pr:`21573` by + :user:`Hannah Bohle ` and :user:`Maren Westermann `. :mod:`sklearn.feature_extraction.text` ...................................... @@ -224,6 +200,17 @@ Changelog by our API. :pr:`21832` by :user:`Guillaume Lemaitre `. +:mod:`sklearn.feature_selection` +................................ + +- |Enhancement| Add a parameter `force_finite` to + :func:`feature_selection.f_regression` and + :func:`feature_selection.r_regression`. This parameter forces the + output to be finite in the case where a feature or the target is constant + or where the feature and target are perfectly correlated (only for the + F-statistic). + :pr:`17819` by :user:`Juan Carlos Alfaro Jiménez `. + :mod:`sklearn.impute` ..................... @@ -244,11 +231,6 @@ Changelog values in the training set. :pr:`21617` by :user:`Christian Ritter `. -- |Fix| Fix a bug in :class:`linear_model.RidgeClassifierCV` where the method - `predict` was performing an `argmax` on the scores obtained from - `decision_function` instead of returning the multilabel indicator matrix. - :pr:`19869` by :user:`Guillaume Lemaitre `. - - |Enhancement| :class:`linear_model.RidgeClassifier` is now supporting multilabel classification. :pr:`19689` by :user:`Guillaume Lemaitre `. @@ -276,12 +258,6 @@ Changelog for the highs based solvers. :pr:`21086` by :user:`Venkatachalam Natchiappan `. -- |Fix| :class:`linear_model.LassoLarsIC` now correctly computes AIC - and BIC. An error is now raised when `n_features > n_samples` and - when the noise variance is not provided. - :pr:`21481` by :user:`Guillaume Lemaitre ` and - :user:`Andrés Babino `. - :mod:`sklearn.metrics` ...................... @@ -293,7 +269,7 @@ Changelog - |API| Parameters ``sample_weight`` and ``multioutput`` of :func:`metrics. mean_absolute_percentage_error` are now keyword-only, in accordance with `SLEP009 - `. + `_. A deprecation cycle was introduced. :pr:`21576` by :user:`Paul-Emile Dugnat `. @@ -317,9 +293,10 @@ Changelog splits failed. 
Similarly raise an error during grid-search when the fits for all the models and all the splits failed. :pr:`21026` by :user:`Loïc Estève `. -- |Fix| :class:`model_selection.GridSearchCV`, :class:`model_selection.HalvingGridSearchCV` - now validate input parameters in `fit` instead of `__init__`. - :pr:`21880` by :user:`Mrinal Tyagi `. +- |Fix| :class:`model_selection.GridSearchCV`, + :class:`model_selection.HalvingGridSearchCV` + now validate input parameters in `fit` instead of `__init__`. + :pr:`21880` by :user:`Mrinal Tyagi `. :mod:`sklearn.mixture` ...................... @@ -329,6 +306,23 @@ Changelog its square root. :pr:`22058` by :user:`Guillaume Lemaitre `. +:mod:`sklearn.neighbors` +........................ + +- |Enhancement| `utils.validation.check_array` and `utils.validation.type_of_target` + now accept an `input_name` parameter to make the error message more + informative when passed invalid input data (e.g. with NaN or infinite + values). + :pr:`21219` by :user:`Olivier Grisel `. + +- |Enhancement| :func:`utils.validation.check_array` returns a float + ndarray with `np.nan` when passed a `Float32` or `Float64` pandas extension + array with `pd.NA`. :pr:`21278` by `Thomas Fan`_. + +- |Fix| :class:`neighbors.KernelDensity` now validates input parameters in `fit` + instead of `__init__`. :pr:`21430` by :user:`Desislava Vasileva ` and + :user:`Lucy Jimenez `. + :mod:`sklearn.pipeline` ....................... @@ -336,16 +330,9 @@ Changelog Setting a transformer to "passthrough" will pass the features unchanged. :pr:`20860` by :user:`Shubhraneel Pal `. -:mod:`sklearn.svm` -................... - -- |Enhancement| :class:`svm.OneClassSVM`, :class:`svm.NuSVC`, - :class:`svm.NuSVR`, :class:`svm.SVC` and :class:`svm.SVR` now expose - `n_iter_`, the number of iterations of the libsvm optimization routine. - :pr:`21408` by :user:`Juan Martín Loyola `. 
-- |Fix| :class: `pipeline.Pipeline` now does not validate hyper-parameters in +- |Fix| :class:`pipeline.Pipeline` now does not validate hyper-parameters in `__init__` but in `.fit()`. - :pr:`21888` by :user:`iofall ` and :user: `Arisa Y. `. + :pr:`21888` by :user:`iofall ` and :user:`Arisa Y. `. :mod:`sklearn.preprocessing` ............................ @@ -355,10 +342,6 @@ Changelog the model. The option is only available when `strategy` is set to `quantile`. :pr:`21445` by :user:`Felipe Bidu ` and :user:`Amanda Dsouza `. -- |Fix| :class:`preprocessing.LabelBinarizer` now validates input parameters in `fit` - instead of `__init__`. - :pr:`21434` by :user:`Krum Arnaudov `. - - |Enhancement| Added the `get_feature_names_out` method and a new parameter `feature_names_out` to :class:`preprocessing.FunctionTransformer`. You can set `feature_names_out` to 'one-to-one' to use the input features names as the @@ -368,13 +351,26 @@ Changelog then `get_output_feature_names` is not defined. :pr:`21569` by :user:`Aurélien Geron `. +- |Fix| :class:`preprocessing.LabelBinarizer` now validates input parameters in + `fit` instead of `__init__`. + :pr:`21434` by :user:`Krum Arnaudov `. + +:mod:`sklearn.random_projection` +................................ + +- |API| Adds :term:`get_feature_names_out` to all transformers in the + :mod:`~sklearn.random_projection` module: + :class:`~sklearn.random_projection.GaussianRandomProjection` and + :class:`~sklearn.random_projection.SparseRandomProjection`. :pr:`21330` by + :user:`Loïc Estève `. + :mod:`sklearn.svm` .................. -- |Fix| :class:`smv.NuSVC`, :class:`svm.NuSVR`, :class:`svm.SVC`, - :class:`svm.SVR`, :class:`svm.OneClassSVM` now validate input - parameters in `fit` instead of `__init__`. - :pr:`21436` by :user:`Haidar Almubarak `. 
+- |Enhancement| :class:`svm.OneClassSVM`, :class:`svm.NuSVC`, + :class:`svm.NuSVR`, :class:`svm.SVC` and :class:`svm.SVR` now expose + `n_iter_`, the number of iterations of the libsvm optimization routine. + :pr:`21408` by :user:`Juan Martín Loyola `. :mod:`sklearn.utils` .................... @@ -387,40 +383,6 @@ Changelog left corner of the HTML representation to show how the elements are clickable. :pr:`21298` by `Thomas Fan`_. -:mod:`sklearn.neighbors` -........................ - -- |Fix| :class:`neighbors.KernelDensity` now validates input parameters in `fit` - instead of `__init__`. :pr:`21430` by :user:`Desislava Vasileva ` and - :user:`Lucy Jimenez `. - -- |Enhancement| `utils.validation.check_array` and `utils.validation.type_of_target` - now accept an `input_name` parameter to make the error message more - informative when passed invalid input data (e.g. with NaN or infinite - values). - :pr:`21219` by :user:`Olivier Grisel `. - -- |Enhancement| :func:`utils.validation.check_array` returns a float - ndarray with `np.nan` when passed a `Float32` or `Float64` pandas extension - array with `pd.NA`. :pr:`21278` by `Thomas Fan`_. - -:mod:`sklearn.random_projection` -................................ - -- |API| Adds :term:`get_feature_names_out` to all transformers in the - :mod:`~sklearn.random_projection` module: - :class:`~sklearn.random_projection.GaussianRandomProjection` and - :class:`~sklearn.random_projection.SparseRandomProjection`. :pr:`21330` by - :user:`Loïc Estève `. - -:mod:`sklearn.tree` -................... - -- |Fix| Support loading pickles of decision tree models when the pickle has - been generated on a platform with a different bitness. A typical example is - to train and pickle the model on 64 bit machine and load the model on a 32 - bit machine for prediction. :pr:`21552` by :user:`Loïc Estève `. - Code and Documentation Contributors -----------------------------------