From d8f67b4c59bc2cb169619acdb69434ba1c9f4bd9 Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Tue, 29 Oct 2019 16:40:15 -0400
Subject: [PATCH 1/7] Add missing entries to whatsnew

---
 doc/whats_new/v0.22.rst | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index 39235625093bc..137c06675b3de 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -123,6 +123,12 @@ Changelog
   pass `**fit_params` to the underlying regressor.
   :pr:`14890` by :user:`Miguel Cabrera `.
 
+- |Fix| The :class:`compose.ColumnTransformer` now requires the number of
+  features to be consistent between `fit` and `transform`. A `FutureWarning`
+  is raised now, and this will raise an error in 0.24. If the number of
+  features isn't consistent and negative indexing is used, an error is
+  raised. :pr:`14544` by `Adrin Jalali`_.
+
 :mod:`sklearn.cross_decomposition`
 ..................................
 
@@ -330,6 +336,14 @@ Changelog
   estimator's constructor but not stored as attributes on the instance.
   :pr:`14464` by `Joel Nothman`_.
 
+:mod:`sklearn.hierarchical`
+...........................
+
+- |Fix| :class:`hierarchical.AgglomerativeClustering` and
+  :class:`hierarchical.FeatureAgglomeration` now raise an error if
+  `affinity='cosine'` and `X` has samples that are all-zeros. :pr:`7943` by
+  :user:`mthorrell`.
+
 :mod:`sklearn.impute`
 .....................
 
@@ -365,6 +379,10 @@ Changelog
   :class:`ensemble.HistGradientBoostingRegressor`.
   :pr:`13769` by `Nicolas Hug`_.
 
+- |Feature| :func:`inspection.plot_partial_dependence` has been extended to
+  now support the new visualization API described in the :ref:`User Guide
+  `. :pr:`14646` by `Thomas Fan`_.
+
 :mod:`sklearn.kernel_approximation`
 ...................................
 
@@ -506,6 +524,10 @@ Changelog
   ``multioutput`` parameter.
   :pr:`14732` by :user:`Agamemnon Krasoulis `.
 
+- |Enhancement| 'roc_auc_ovr_weighted' and 'roc_auc_ovo_weighted' can now be
+  used as the `scoring` parameter of model-selection tools. :pr:`14417` by
+  `Thomas Fan`_.
+
 :mod:`sklearn.model_selection`
 ..............................
 
@@ -531,6 +553,11 @@ Changelog
   `random_state` is set but `shuffle` is False. This will raise an error in
   0.24.
 
+- |Fix| The `cv_results_` attribute of :class:`model_selection.GridSearchCV`
+  and :class:`model_selection.RandomizedSearchCV` now only contains unfitted
+  estimators. This potentially saves a lot of memory since the state of the
+  estimators isn't stored. :pr:`#15096` by :user:`Andreas Müller `.
+
 :mod:`sklearn.multioutput`
 ..........................
 
@@ -603,6 +630,9 @@ Changelog
 :mod:`sklearn.preprocessing`
 ............................
 
+- |Enhancement| :class:`preprocessing.PolynomialFeatures` is now faster when
+  the input data is dense. :pr:`13290` by :user:`Xavier Dupré `.
+
 - |Enhancement| Avoid unnecessary data copy when fitting preprocessors
   :class:`preprocessing.StandardScaler`, :class:`preprocessing.MinMaxScaler`,
   :class:`preprocessing.MaxAbsScaler`, :class:`preprocessing.RobustScaler`
@@ -760,7 +790,7 @@ Miscellaneous
 
 - |Fix| Port `lobpcg` from SciPy which implement some bug fixes but only
   available in 1.3+.
-  :pr:`13609` by :user:`Guillaume Lemaitre `.
+  :pr:`13609` and :pr:`14971` by :user:`Guillaume Lemaitre `.
 
 Changes to estimator checks
 ---------------------------
@@ -796,3 +826,7 @@ These changes mostly affect library developers.
   :pr:`13392` by :user:`Rok Mihevc `.
 
 - |Fix| Added ``check_transformer_data_not_an_array`` to checks where missing
+
+- |Fix| The estimators tags resolution now follows the regular MRO. They used
+  to be overridable only once. :pr:`14884` by :user:`Andreas Müller
+  `.
\ No newline at end of file

From c60fd75198f0b884bb9ce594e04a4f015221eed0 Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Wed, 30 Oct 2019 10:53:35 -0400
Subject: [PATCH 2/7] expand on FutureWarning notes

---
 doc/whats_new/v0.22.rst | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index 137c06675b3de..6fe20dd697d36 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -20,10 +20,15 @@ Deprecations: using ``FutureWarning`` from now on
 When deprecating a feature, previous versions of scikit-learn used to raise
 a ``DeprecationWarning``. Since the ``DeprecationWarnings`` aren't shown by
 default by Python, scikit-learn needed to resort to a custom warning filter
-that would always show the warnings.
-
-This filter is now removed, and starting from 0.22 scikit-learn will show
-``FutureWarnings`` for deprecations. :pr:`15080` by `Nicolas Hug`_.
+to always show the warnings. That filter would sometimes interfere
+with users custom warning filters.
+
+Starting from version 0.22, scikit-learn will show ``FutureWarnings`` for
+deprecations, `as recommended by the Python documentation
+`_.
+``FutureWarnings`` are always shown by default by Python so the custom
+filter has been removed and scikit-learn no longer hinders with user
+filters. :pr:`15080` by `Nicolas Hug`_.
 
 
 Changed models

From a970bcfec9d7c20b28d624579c9a7a89f8a4e6b0 Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Wed, 30 Oct 2019 11:39:33 -0400
Subject: [PATCH 3/7] documented public API

---
 doc/whats_new/v0.22.rst | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index 6fe20dd697d36..523f0ae86ec9a 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -14,6 +14,38 @@ refer to
 :ref:`sphx_glr_auto_examples_release_highlights_plot_release_highlights_0_22_0.py`.
 
 
+Cleaning of the public API
+--------------------------
+
+Scikit-learn has a public API, and a private API.
+
+Any change to the public API is subject to a deprecation cycle of two minor
+versions. The private API isn't publicly documented and isn't subject to any
+deprecation cycle, so users should not rely on it.
+
+Whether a tool is private or public depends on whether you can import it
+without a leading underscore in the import path. For example
+``sklearn.pipeline.make_pipeline`` is public, while
+`sklearn.pipeline._name_estimators` is private.
+``sklearn.ensemble._gb.BaseEnsemble`` is private too because the whole `_gb`
+module is private.
+
+Up to 0.22, some tools were de-facto public (no leading underscore), while
+they should have been private in the first place. In version 0.22, these
+tools have been made properly private, and the public API space has been
+cleaned. In addition, importing from files in sub-packages is deprecated:
+you should use ``from sklearn.cluster import Birch`` instead of ``from
+sklearn.cluster.birch import Birch`` (in practice, ``birch.py`` has been
+moved to ``_birch.py``).
+
+.. note::
+
+    All the tools in the public API should be documented in the `API
+    Reference `_. If you
+    find a public tool that isn't in the API reference, that means it should
+    either be private or documented. Please let us know by opening an issue!
+
+
 Deprecations: using ``FutureWarning`` from now on
 -------------------------------------------------
 
@@ -26,11 +58,10 @@ with users custom warning filters.
 Starting from version 0.22, scikit-learn will show ``FutureWarnings`` for
 deprecations, `as recommended by the Python documentation
 `_.
-``FutureWarnings`` are always shown by default by Python so the custom
+``FutureWarnings`` are always shown by default by Python, so the custom
 filter has been removed and scikit-learn no longer hinders with user
 filters. :pr:`15080` by `Nicolas Hug`_.
 
-
 Changed models
 --------------
 

From e49624ffd6fd8b8d02ff0f3887959dbef4766d99 Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Wed, 30 Oct 2019 11:42:59 -0400
Subject: [PATCH 4/7] Addressed comments

---
 doc/whats_new/v0.22.rst | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index 523f0ae86ec9a..3bcbaee124874 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -147,6 +147,12 @@ Changelog
   exposes an ``n_iter_`` indicating the maximum number of iterations performed
   on each seed. :pr:`15120` by `Adrin Jalali`_.
 
+- |Fix| :class:`cluster.AgglomerativeClustering` and
+  :class:`cluster.FeatureAgglomeration` now raise an error if
+  `affinity='cosine'` and `X` has samples that are all-zeros. :pr:`7943` by
+  :user:`mthorrell`.
+
+
 :mod:`sklearn.compose`
 ......................
 
@@ -372,14 +378,6 @@ Changelog
   estimator's constructor but not stored as attributes on the instance.
   :pr:`14464` by `Joel Nothman`_.
 
-:mod:`sklearn.hierarchical`
-...........................
-
-- |Fix| :class:`hierarchical.AgglomerativeClustering` and
-  :class:`hierarchical.FeatureAgglomeration` now raise an error if
-  `affinity='cosine'` and `X` has samples that are all-zeros. :pr:`7943` by
-  :user:`mthorrell`.
-
 :mod:`sklearn.impute`
 .....................
 
@@ -561,8 +559,8 @@ Changelog
   :pr:`14732` by :user:`Agamemnon Krasoulis `.
 
 - |Enhancement| 'roc_auc_ovr_weighted' and 'roc_auc_ovo_weighted' can now be
-  used as the `scoring` parameter of model-selection tools. :pr:`14417` by
-  `Thomas Fan`_.
+  used as the :term:`scoring` parameter of model-selection tools.
+  :pr:`14417` by `Thomas Fan`_.
 
 :mod:`sklearn.model_selection`
 ..............................

From 49fb0cd52faf0a0369eaebc6c0ee050827978be7 Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Wed, 30 Oct 2019 12:13:49 -0400
Subject: [PATCH 5/7] Addressed comments

---
 doc/whats_new/v0.22.rst | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index 3bcbaee124874..a634645b5b9d5 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -14,14 +14,17 @@ refer to
 :ref:`sphx_glr_auto_examples_release_highlights_plot_release_highlights_0_22_0.py`.
 
 
-Cleaning of the public API
---------------------------
+Clear definition of the public API
+----------------------------------
 
 Scikit-learn has a public API, and a private API.
 
-Any change to the public API is subject to a deprecation cycle of two minor
-versions. The private API isn't publicly documented and isn't subject to any
-deprecation cycle, so users should not rely on it.
+We do our best not to break the public API, and to only introduce
+backward-compatible changes that do not require any user action. However, in
+cases where that's not possible, any change to the public API is subject to
+a deprecation cycle of two minor versions. The private API isn't publicly
+documented and isn't subject to any deprecation cycle, so users should not
+rely on its stability.
 
 Whether a tool is private or public depends on whether you can import it
 without a leading underscore in the import path. For example

From 2f1b73201535949d0cbac54a12276f361e8095bc Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Thu, 31 Oct 2019 12:40:25 -0400
Subject: [PATCH 6/7] Addressed comments

---
 doc/whats_new/v0.22.rst | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index a634645b5b9d5..f8b3d807d598d 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -26,9 +26,9 @@ a deprecation cycle of two minor versions. The private API isn't publicly
 documented and isn't subject to any deprecation cycle, so users should not
 rely on its stability.
 
-Whether a tool is private or public depends on whether you can import it
-without a leading underscore in the import path. For example
-``sklearn.pipeline.make_pipeline`` is public, while
+A function or object is public if it is documented in the reference API and
+if it can be imported with an import path without leading underscores. For
+example ``sklearn.pipeline.make_pipeline`` is public, while
 `sklearn.pipeline._name_estimators` is private.
 ``sklearn.ensemble._gb.BaseEnsemble`` is private too because the whole `_gb`
 module is private.
@@ -36,10 +36,10 @@ module is private.
 Up to 0.22, some tools were de-facto public (no leading underscore), while
 they should have been private in the first place. In version 0.22, these
 tools have been made properly private, and the public API space has been
-cleaned. In addition, importing from files in sub-packages is deprecated:
-you should use ``from sklearn.cluster import Birch`` instead of ``from
-sklearn.cluster.birch import Birch`` (in practice, ``birch.py`` has been
-moved to ``_birch.py``).
+cleaned. In addition, some sub-modules were deprecated: you should for
+example you should use ``from sklearn.cluster import Birch`` instead of
+``from sklearn.cluster.birch import Birch`` (in practice, ``birch.py`` has
+been moved to ``_birch.py``).
 
 .. note::
 
@@ -416,7 +416,7 @@ Changelog
   :class:`ensemble.HistGradientBoostingRegressor`.
   :pr:`13769` by `Nicolas Hug`_.
 
-- |Feature| :func:`inspection.plot_partial_dependence` has been extended to
+- |Enhancement| :func:`inspection.plot_partial_dependence` has been extended to
   now support the new visualization API described in the :ref:`User Guide
   `. :pr:`14646` by `Thomas Fan`_.
 
@@ -667,7 +667,7 @@ Changelog
 :mod:`sklearn.preprocessing`
 ............................
 
-- |Enhancement| :class:`preprocessing.PolynomialFeatures` is now faster when
+- |Efficiency| :class:`preprocessing.PolynomialFeatures` is now faster when
   the input data is dense. :pr:`13290` by :user:`Xavier Dupré `.
 
 - |Enhancement| Avoid unnecessary data copy when fitting preprocessors

From 63966ba4d764e51408714e166389e10b9c63d51b Mon Sep 17 00:00:00 2001
From: Nicolas Hug
Date: Thu, 31 Oct 2019 12:42:33 -0400
Subject: [PATCH 7/7] some cleaning

---
 doc/whats_new/v0.22.rst | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/doc/whats_new/v0.22.rst b/doc/whats_new/v0.22.rst
index f8b3d807d598d..181662562f3c1 100644
--- a/doc/whats_new/v0.22.rst
+++ b/doc/whats_new/v0.22.rst
@@ -26,9 +26,10 @@ a deprecation cycle of two minor versions. The private API isn't publicly
 documented and isn't subject to any deprecation cycle, so users should not
 rely on its stability.
 
-A function or object is public if it is documented in the reference API and
-if it can be imported with an import path without leading underscores. For
-example ``sklearn.pipeline.make_pipeline`` is public, while
+A function or object is public if it is documented in the `API Reference
+`_ and if it can be
+imported with an import path without leading underscores. For example
+``sklearn.pipeline.make_pipeline`` is public, while
 `sklearn.pipeline._name_estimators` is private.
 ``sklearn.ensemble._gb.BaseEnsemble`` is private too because the whole `_gb`
 module is private.
@@ -36,8 +37,8 @@ module is private.
 Up to 0.22, some tools were de-facto public (no leading underscore), while
 they should have been private in the first place. In version 0.22, these
 tools have been made properly private, and the public API space has been
-cleaned. In addition, some sub-modules were deprecated: you should for
-example you should use ``from sklearn.cluster import Birch`` instead of
+cleaned. In addition, importing from most sub-modules is now deprecated: you
+should for example use ``from sklearn.cluster import Birch`` instead of
 ``from sklearn.cluster.birch import Birch`` (in practice, ``birch.py`` has
 been moved to ``_birch.py``).
 
@@ -45,8 +46,9 @@ been moved to ``_birch.py``).
 
     All the tools in the public API should be documented in the `API
     Reference `_. If you
-    find a public tool that isn't in the API reference, that means it should
-    either be private or documented. Please let us know by opening an issue!
+    find a public tool (without leading underscore) that isn't in the API
+    reference, that means it should either be private or documented. Please
+    let us know by opening an issue!
 
 
 Deprecations: using ``FutureWarning`` from now on