REDVM
diff --git a/doc/conftest.py b/doc/conftest.py
Lines changed: 7 additions & 0 deletions
diff --git a/doc/metadata_routing.rst b/doc/metadata_routing.rst
Lines changed: 231 additions & 0 deletions
diff --git a/doc/modules/classes.rst b/doc/modules/classes.rst
Lines changed: 6 additions & 0 deletions
diff --git a/doc/modules/model_evaluation.rst b/doc/modules/model_evaluation.rst
Lines changed: 8 additions & 0 deletions
diff --git a/doc/user_guide.rst b/doc/user_guide.rst
Lines changed: 9 additions & 0 deletions
diff --git a/doc/whats_new/v1.3.rst b/doc/whats_new/v1.3.rst
Lines changed: 13 additions & 0 deletions
@@ -144,6 +144,13 @@ def pytest_runtest_setup(item):
         setup_preprocessing()
     elif fname.endswith("statistical_inference/unsupervised_learning.rst"):
         setup_unsupervised_learning()
+    elif fname.endswith("metadata_routing.rst"):
+        # TODO: remove this once implemented
+        # Skip metadata routing because it is not fully implemented yet
+        raise SkipTest(
+            "Skipping doctest for metadata_routing.rst because it "
+            "is not fully implemented yet"
+        )
 
     rst_files_requiring_matplotlib = [
         "modules/partial_dependence.rst",
 
@@ -0,0 +1,231 @@
+
+.. _metadata_routing:
+
+.. currentmodule:: sklearn
+
+.. TODO: update doc/conftest.py once document is updated and examples run.
+
+Metadata Routing
+================
+
+.. note::
+  The Metadata Routing API is experimental and is not yet implemented for many
+  estimators. It may change without the usual deprecation cycle. By default
+  this feature is not enabled. You can enable it by setting the
+  ``enable_metadata_routing`` flag to ``True``:
+
+    >>> import sklearn
+    >>> sklearn.set_config(enable_metadata_routing=True)
+
+This guide demonstrates how metadata such as ``sample_weight`` can be routed
+and passed along to estimators, scorers, and CV splitters through
+meta-estimators such as :class:`~pipeline.Pipeline` and
+:class:`~model_selection.GridSearchCV`. To pass metadata to a method such as
+``fit`` or ``score``, the object consuming the metadata must *request* it. For
+estimators and splitters, this is done via ``set_*_request`` methods, e.g.
+``set_fit_request(...)``, and for scorers via the ``set_score_request``
+method. For grouped splitters such as :class:`~model_selection.GroupKFold`,
+the ``groups`` parameter is requested by default. The following examples
+demonstrate this.
+
+If you are developing a scikit-learn compatible estimator or meta-estimator,
+you can check our related developer guide:
+:ref:`sphx_glr_auto_examples_miscellaneous_plot_metadata_routing.py`.
+
+.. note::
+  The methods and requirements introduced in this document are only relevant
+  if you want to pass metadata (e.g. ``sample_weight``) to a method. If you
+  only pass ``X`` and ``y`` and no other parameter or metadata to methods such
+  as ``fit``, ``transform``, etc., then you don't need to set anything.
+
+Usage Examples
+**************
+Here we present a few examples to show different common use-cases. The examples
+in this section require the following imports and data::
+
+  >>> import numpy as np
+  >>> from sklearn.metrics import make_scorer, accuracy_score
+  >>> from sklearn.linear_model import LogisticRegressionCV, LogisticRegression
+  >>> from sklearn.model_selection import cross_validate, GridSearchCV, GroupKFold
+  >>> from sklearn.feature_selection import SelectKBest
+  >>> from sklearn.pipeline import make_pipeline
+  >>> n_samples, n_features = 100, 4
+  >>> rng = np.random.RandomState(42)
+  >>> X = rng.rand(n_samples, n_features)
+  >>> y = rng.randint(0, 2, size=n_samples)
+  >>> my_groups = rng.randint(0, 10, size=n_samples)
+  >>> my_weights = rng.rand(n_samples)
+  >>> my_other_weights = rng.rand(n_samples)
+
+Weighted scoring and fitting
+----------------------------
+
+Here :class:`~model_selection.GroupKFold` requests ``groups`` by default. However, we
+need to explicitly request weights for our scorer and the internal cross validation of
+:class:`~linear_model.LogisticRegressionCV`. Both of these *consumers* know how to use
+metadata called ``sample_weight``::
+
+  >>> weighted_acc = make_scorer(accuracy_score).set_score_request(
+  ...     sample_weight=True
+  ... )
+  >>> lr = LogisticRegressionCV(
+  ...     cv=GroupKFold(), scoring=weighted_acc,
+  ... ).set_fit_request(sample_weight=True)
+  >>> cv_results = cross_validate(
+  ...     lr,
+  ...     X,
+  ...     y,
+  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     cv=GroupKFold(),
+  ...     scoring=weighted_acc,
+  ... )
+
+Note that in this example, ``my_weights`` is passed to both the scorer and
+:class:`~linear_model.LogisticRegressionCV`.
+
+Error handling: if ``props={"sample_weigh": my_weights, ...}`` were passed
+(note the typo), :func:`~model_selection.cross_validate` would raise an error,
+since ``sample_weigh`` was not requested by any of its underlying objects.
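
The typo check described above can be pictured with a small pure-Python sketch. It is not scikit-learn's implementation: ``check_requested_keys`` and the ``requests`` dictionary are hypothetical names, and the choice of ``TypeError`` is an assumption made only for illustration. The sketch shows a router comparing the keys it receives against the names its consumers requested.

```python
def check_requested_keys(passed, requests):
    """Raise if a passed metadata key was not requested by any consumer.

    ``requests`` maps a (hypothetical) consumer name to the set of metadata
    names that consumer asked the router for.
    """
    # Union of every name requested by at least one consumer.
    requested = set().union(*requests.values())
    unknown = sorted(set(passed) - requested)
    if unknown:
        # The exception type is an assumption, chosen for the sketch only.
        raise TypeError(f"Metadata passed but not requested: {unknown}")
    return True


requests = {
    "scorer": {"sample_weight"},
    "estimator": {"sample_weight"},
    "splitter": {"groups"},
}

# Correctly spelled keys pass the check.
ok = check_requested_keys({"sample_weight": [1.0], "groups": [0]}, requests)

# A misspelled key is rejected because no consumer requested it.
try:
    check_requested_keys({"sample_weigh": [1.0], "groups": [0]}, requests)
    typo_error = None
except TypeError as exc:
    typo_error = str(exc)
```

Collecting the requested names in one place is what lets a misspelled key fail loudly instead of being silently dropped.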
+
+Weighted scoring and unweighted fitting
+---------------------------------------
+
+When passing metadata such as ``sample_weight`` around, all scikit-learn
+estimators require weights to be either explicitly requested or explicitly
+not requested (i.e. ``True`` or ``False``) when used inside a router such as
+a :class:`~pipeline.Pipeline` or a ``*GridSearchCV``. To perform an unweighted
+fit, we need to configure :class:`~linear_model.LogisticRegressionCV` to not
+request sample weights, so that :func:`~model_selection.cross_validate` does
+not pass the weights along::
+
+  >>> weighted_acc = make_scorer(accuracy_score).set_score_request(
+  ...     sample_weight=True
+  ... )
+  >>> lr = LogisticRegressionCV(
+  ...     cv=GroupKFold(), scoring=weighted_acc,
+  ... ).set_fit_request(sample_weight=False)
+  >>> cv_results = cross_validate(
+  ...     lr,
+  ...     X,
+  ...     y,
+  ...     cv=GroupKFold(),
+  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     scoring=weighted_acc,
+  ... )
+
+If :meth:`linear_model.LogisticRegressionCV.set_fit_request` is not
+called, :func:`~model_selection.cross_validate` raises an
+error because ``sample_weight`` is passed in but
+:class:`~linear_model.LogisticRegressionCV` has not been explicitly configured
+to recognize the weights.
+
+Unweighted feature selection
+----------------------------
+
+Setting request values for metadata is only required if the object, e.g. estimator,
+scorer, etc., is a consumer of that metadata. Unlike
+:class:`~linear_model.LogisticRegressionCV`, :class:`~feature_selection.SelectKBest`
+doesn't consume weights, so no request value for ``sample_weight`` is set on its
+instance and ``sample_weight`` is not routed to it::
+
+  >>> weighted_acc = make_scorer(accuracy_score).set_score_request(
+  ...     sample_weight=True
+  ... )
+  >>> lr = LogisticRegressionCV(
+  ...     cv=GroupKFold(), scoring=weighted_acc,
+  ... ).set_fit_request(sample_weight=True)
+  >>> sel = SelectKBest(k=2)
+  >>> pipe = make_pipeline(sel, lr)
+  >>> cv_results = cross_validate(
+  ...     pipe,
+  ...     X,
+  ...     y,
+  ...     cv=GroupKFold(),
+  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     scoring=weighted_acc,
+  ... )
+
+Advanced: Different scoring and fitting weights
+-----------------------------------------------
+
+Despite :func:`~metrics.make_scorer` and
+:class:`~linear_model.LogisticRegressionCV` both expecting the key
+``sample_weight``, we can use aliases to pass different weights to different
+consumers. In this example, we pass ``scoring_weight`` to the scorer, and
+``fitting_weight`` to :class:`~linear_model.LogisticRegressionCV`::
+
+  >>> weighted_acc = make_scorer(accuracy_score).set_score_request(
+  ...     sample_weight="scoring_weight"
+  ... )
+  >>> lr = LogisticRegressionCV(
+  ...     cv=GroupKFold(), scoring=weighted_acc,
+  ... ).set_fit_request(sample_weight="fitting_weight")
+  >>> cv_results = cross_validate(
+  ...     lr,
+  ...     X,
+  ...     y,
+  ...     cv=GroupKFold(),
+  ...     props={
+  ...         "scoring_weight": my_weights,
+  ...         "fitting_weight": my_other_weights,
+  ...         "groups": my_groups,
+  ...     },
+  ...     scoring=weighted_acc,
+  ... )
+
+API Interface
+*************
+
+A *consumer* is an object (estimator, meta-estimator, scorer, splitter) which
+accepts and uses some metadata in at least one of its methods (``fit``,
+``predict``, ``inverse_transform``, ``transform``, ``score``, ``split``).
+Meta-estimators which only forward the metadata to other objects (the child
+estimator, scorers, or splitters) and don't use the metadata themselves are not
+consumers. (Meta-)Estimators which route metadata to other objects are
+*routers*. A (meta-)estimator can be a consumer and a router at the same time.
+(Meta-)Estimators and splitters expose a ``set_*_request`` method for each
+method which accepts at least one piece of metadata. For instance, if an
+estimator supports ``sample_weight`` in ``fit`` and ``score``, it exposes
+``estimator.set_fit_request(sample_weight=value)`` and
+``estimator.set_score_request(sample_weight=value)``. Here ``value`` can be:
+
+- ``True``: method requests a ``sample_weight``. This means if the metadata is
+  provided, it will be used; otherwise no error is raised.
+- ``False``: method does not request a ``sample_weight``.
+- ``None``: router will raise an error if ``sample_weight`` is passed. This is
+  in almost all cases the default value when an object is instantiated and
+  ensures the user sets the metadata requests explicitly when metadata is
+  passed. The only exceptions are ``Group*Fold`` splitters.
+- ``"param_name"``: if this estimator is used in a meta-estimator, the
+  meta-estimator should forward ``"param_name"`` as ``sample_weight`` to this
+  estimator. This means the mapping between the metadata required by the
+  object, e.g. ``sample_weight``, and what is provided by the user, e.g.
+  ``my_weights``, is done at the router level, and not by the object, e.g.
+  estimator, itself.
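
The four request values listed above can be summarized in a minimal pure-Python sketch. This is an illustration under stated assumptions, not the library's code: ``resolve_request`` is a hypothetical helper, and the error message is invented for the example. It shows what a router might forward to a single consumer for each kind of request value.

```python
def resolve_request(name, request, provided):
    """Return the (name, value) pair to forward to a consumer, or None.

    ``request`` is the value given to a ``set_*_request`` method and
    ``provided`` is the dict of metadata the user passed to the router.
    """
    if request is True:
        # Requested: forward it if the user provided it, otherwise skip.
        return (name, provided[name]) if name in provided else None
    if request is False:
        # Explicitly not requested: never forward.
        return None
    if request is None:
        # Unset: passing the metadata anyway is an error.
        if name in provided:
            raise ValueError(f"{name} is passed but its request is not set")
        return None
    # A string is an alias: look the metadata up under the alias and
    # forward it under the name the consumer expects.
    return (name, provided[request]) if request in provided else None


provided = {"sample_weight": [1, 2], "fitting_weight": [3, 4]}

forwarded = resolve_request("sample_weight", True, provided)
skipped = resolve_request("sample_weight", False, provided)
aliased = resolve_request("sample_weight", "fitting_weight", provided)
try:
    resolve_request("sample_weight", None, provided)
    unset_error = None
except ValueError as exc:
    unset_error = str(exc)
```

In this sketch the alias case is the only one where the key the user supplies differs from the name the consumer receives, which matches the statement that the mapping happens at the router level.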
+
+Metadata are requested in the same way for scorers using ``set_score_request``.
+
+If metadata, e.g. ``sample_weight``, is passed by the user, the metadata
+request for all objects which can potentially consume ``sample_weight`` should
+be set by the user; otherwise the router object raises an error. For
+example, the following code raises an error, since it hasn't been explicitly
+specified whether ``sample_weight`` should be passed to the estimator's scorer
+or not::
+
+    >>> param_grid = {"C": [0.1, 1]}
+    >>> lr = LogisticRegression().set_fit_request(sample_weight=True)
+    >>> try:
+    ...     GridSearchCV(
+    ...         estimator=lr, param_grid=param_grid
+    ...     ).fit(X, y, sample_weight=my_weights)
+    ... except ValueError as e:
+    ...     print(e)
+    [sample_weight] are passed but are not explicitly set as requested or not for
+    LogisticRegression.score
+
+The issue can be fixed by explicitly setting the request value::
+
+    >>> lr = LogisticRegression().set_fit_request(
+    ...     sample_weight=True
+    ... ).set_score_request(sample_weight=False)
@@ -34,6 +34,7 @@ Base classes
    base.DensityMixin
    base.RegressorMixin
    base.TransformerMixin
+   base.MetaEstimatorMixin
    base.OneToOneFeatureMixin
    base.ClassNamePrefixFeaturesOutMixin
    feature_selection.SelectorMixin
@@ -1652,6 +1653,11 @@ Plotting
    utils.validation.check_symmetric
    utils.validation.column_or_1d
    utils.validation.has_fit_parameter
+   utils.metadata_routing.get_routing_for_object
+   utils.metadata_routing.MetadataRouter
+   utils.metadata_routing.MetadataRequest
+   utils.metadata_routing.MethodMapping
+   utils.metadata_routing.process_routing
 
 Specific utilities to list scikit-learn components:
 
 
@@ -222,6 +222,14 @@ the following two rules:
   Again, by convention higher numbers are better, so if your scorer
   returns loss, that value should be negated.
 
+- Advanced: If it requires extra metadata to be passed to it, it should expose
+  a ``get_metadata_routing`` method returning the requested metadata. The user
+  should be able to set the requested metadata via a ``set_score_request``
+  method. Please see :ref:`User Guide <metadata_routing>` and :ref:`Developer
+  Guide <sphx_glr_auto_examples_miscellaneous_plot_metadata_routing.py>` for
+  more details.
+
+
 .. note:: **Using custom scorers in functions where n_jobs > 1**
 
     While defining the custom scoring function alongside the calling function
 
@@ -31,3 +31,12 @@ User Guide
    model_persistence.rst
    common_pitfalls.rst
    dispatching.rst
+
+Under Development
+-----------------
+
+.. toctree::
+   :numbered:
+   :maxdepth: 1
+
+   metadata_routing.rst
@@ -145,6 +145,19 @@ Changes impacting all modules
   :pr:`26082` by :user:`Jérémie du Boisberranger <jeremiedbb>` and
   :user:`Olivier Grisel <ogrisel>`.
 
+Experimental / Under Development
+--------------------------------
+
+- |MajorFeature| :ref:`Metadata routing <metadata_routing>`'s related base
+  methods are included in this release. This feature is only available via the
+  `enable_metadata_routing` feature flag which can be enabled using
+  :func:`sklearn.set_config` and :func:`sklearn.config_context`. For now this
+  feature is mostly useful for third party developers to prepare their code
+  base for metadata routing, and we strongly recommend that they also hide it
+  behind the same feature flag, rather than having it enabled by default.
+  :pr:`24027` by `Adrin Jalali`_, :user:`Benjamin Bossan <BenjaminBossan>`, and
+  :user:`Omar Salman <OmarManzoor>`.
+
 Changelog
 ---------