FEA Add metadata routing through predict methods of BaggingClassifier and BaggingRegressor #30833
Conversation
-        y_proba = np.empty(shape=(len(X), 2))
-        y_proba[: len(X) // 2, :] = np.asarray([1.0, 0.0])
-        y_proba[len(X) // 2 :, :] = np.asarray([0.0, 1.0])
+        y_proba = np.empty(shape=(len(X), len(self.classes_)), dtype=np.float32)
+        # each row sums up to 1.0:
+        y_proba[:] = np.random.dirichlet(alpha=np.ones(len(self.classes_)), size=len(X))
         return y_proba
It is necessary to make sure `predict_proba` and `predict_log_proba` return a column per class for all the test classes (`ConsumingClassifier` and `NonConsumingClassifier`) to avoid shape mismatch while testing.
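For illustration, the kind of invariant the tests rely on (with `clf` standing in for any of the test classifiers, not the PR's actual assertion):

```python
# each test classifier's predict_proba must return one column per fitted class
proba = clf.predict_proba(X)
assert proba.shape == (len(X), len(clf.classes_))
```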
A few comments, otherwise looks good.
Thank you for your review, @OmarManzoor!
I have added a comment on the conditions for the router.
LGTM. Thanks @StefanieSenger
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Overall looks good, but a few things need fixing. This is not a complete review. Please ping me once you're ready for another round.
doc/whats_new/upcoming_changes/sklearn.ensemble/30833.feature.rst
Thank you for reviewing, @adrinjalali!
I have changed the naming of the routed kwargs into `**params`, added documentation on the routing depending on which methods are available in the sub-estimators, moved the changelog entry into the correct section, re-defined the router depending on whether `predict_log_proba` is available, and added corresponding test cases.
This PR is ready for another round of reviewing.
# `sample_weight` is passed to the respective methods dynamically at
# runtime:
if hasattr(self._get_estimator(), "predict_proba"):
    method_mapping.add(caller="predict_log_proba", callee="predict_proba")
seems like there's no test case for this one
That is because we need to test it with a classifier that does not have `predict_log_proba`, but has `predict_proba`. I have added an "outsourced" test just as I did for the case when the sub-classifier doesn't have `predict_proba`.
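A rough sketch of such a helper (the class name and recorded attribute are illustrative, not the PR's actual test classes): a classifier that consumes metadata in `predict_proba` but deliberately has no `predict_log_proba`, so the `predict_log_proba` → `predict_proba` mapping is the one exercised.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class ProbaOnlyConsumingClassifier(ClassifierMixin, BaseEstimator):
    """Has predict_proba (consuming metadata) but no predict_log_proba."""

    def fit(self, X, y, sample_weight=None):
        self.classes_ = np.unique(y)
        return self

    def predict_proba(self, X, sample_weight=None):
        # record the routed metadata so a test can assert that it arrived
        self.predict_proba_sample_weight_ = sample_weight
        n_classes = len(self.classes_)
        # uniform probabilities: one column per class, each row sums to 1.0
        return np.full((len(X), n_classes), 1.0 / n_classes)

    def predict(self, X, sample_weight=None):
        proba = self.predict_proba(X, sample_weight=sample_weight)
        return self.classes_[np.argmax(proba, axis=1)]
```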
sklearn/ensemble/_bagging.py
    routed_params = process_routing(self, "predict_proba", **params)
else:
    routed_params = Bunch()
    routed_params.estimator = Bunch(predict_proba=params)
We don't really support routing metadata around if metadata routing is disabled anyway (the `_raise_for_params` above). So here the `Bunch` can/should be empty.
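A sketch of what that simplification could look like (assuming the branch sits inside the usual `if _routing_enabled():` guard, which the diff hunk does not show):

```python
if _routing_enabled():
    routed_params = process_routing(self, "predict_proba", **params)
else:
    # Routing is disabled and `_raise_for_params` has already rejected any
    # user-passed params, so nothing needs to be forwarded to the sub-estimators.
    routed_params = Bunch(estimator=Bunch(predict_proba={}))
```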
sklearn/ensemble/_bagging.py
**params : dict
    Parameters routed to the `predict_log_proba`, the `predict_proba` or the
    `proba` method of the sub-estimators via the metadata routing API. The
    routing is tried in the before mentioned order depending on whether this
Suggested change:
-    routing is tried in the before mentioned order depending on whether this
+    routing is tried in the mentioned order depending on whether this
sklearn/ensemble/_bagging.py
    routed_params = process_routing(self, "predict_log_proba", **params)
else:
    routed_params = Bunch()
    routed_params.estimator = Bunch(predict_log_proba=params)
same here with not routing params
sklearn/ensemble/_bagging.py
    routed_params = process_routing(self, "decision_function", **params)
else:
    routed_params = Bunch()
    routed_params.estimator = Bunch(decision_function=params)
and here
sklearn/ensemble/_bagging.py
    routed_params = process_routing(self, "predict", **params)
else:
    routed_params = Bunch()
    routed_params.estimator = Bunch(predict=params)
and here
Thanks for your review, @adrinjalali.
I have removed all the passed params from the bunches without metadata routing and added test cases and new test classes for dynamic method selection.
@@ -0,0 +1,4 @@
- :class:`ensemble.BaggingClassifier` and :class:`ensemble.BaggingRegressor` now support
  metadata routing through their `predict`, `predict_proba`, `predict_log_proba` and
  `decision_function` methods and pass `**predict_params` to the underlying estimators.
Suggested change:
-  `decision_function` methods and pass `**predict_params` to the underlying estimators.
+  `decision_function` methods and pass `**params` to the underlying estimators.
sklearn/ensemble/_bagging.py
@@ -1279,6 +1403,14 @@ def predict(self, X):
            reset=False,
        )

        _raise_for_params(params, self, "predict")
I'd put all these `_raise_for_params` calls as the first thing in these methods, so that the user knows as soon as possible that they need to change their script, instead of fitting and fixing their data and then suddenly getting this error.
Yes, that makes sense, especially if fitting takes a while. I have put all the `_raise_for_params()` calls first.
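A minimal sketch of the resulting ordering (signature and body abbreviated; only the placement of the check is the point here):

```python
def predict(self, X, **params):
    # Fail fast: reject routed params while metadata routing is disabled,
    # before any potentially expensive validation or prediction work runs.
    _raise_for_params(params, self, "predict")
    ...  # input validation, routing and the actual prediction follow
```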
)
bagging = BaggingClassifier(estimator=estimator)
bagging.fit(X, y)
bagging.predict(X=np.array([[1, 1], [1, 3], [0, 2]]))
I think the reason we still have a codecov warning is that here you're not calling `predict_log_proba` and/or `predict_proba`. I think we need to test all 4 methods, and we also need to pass metadata here to actually record something and activate the routing mechanism.
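Something along these lines could cover all four methods while exercising the routing (names like `sample_weight` and the surrounding fixtures are illustrative, not the PR's actual test):

```python
for method in ("predict", "predict_proba", "predict_log_proba", "decision_function"):
    # passing metadata makes the sub-estimators record it, so the test can
    # later assert that routing actually happened
    getattr(bagging, method)(X, sample_weight=sample_weight)
```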
Fixed in commit a3acb0. Thanks for your help with it!
Outstanding!
Reference Issues/PRs
closes #30808
towards #22893
What does this implement/fix? Explain your changes.
Add metadata routing functionality to `BaggingClassifier` and `BaggingRegressor`'s `predict`, `predict_proba`, `predict_log_proba` and `decision_function` methods.

CC @adrinjalali @OmarManzoor
Would you like to have a look?
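A self-contained usage sketch of the feature this PR adds (the toy sub-estimator, data, and attribute names are illustrative; routing has to be enabled explicitly):

```python
import numpy as np

from sklearn import set_config
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.ensemble import BaggingClassifier

set_config(enable_metadata_routing=True)


class WeightConsumingClassifier(ClassifierMixin, BaseEstimator):
    """Toy sub-estimator whose predict accepts `sample_weight` as metadata."""

    def fit(self, X, y, sample_weight=None):
        self.classes_ = np.unique(y)
        return self

    def predict(self, X, sample_weight=None):
        # a real estimator would use the metadata; here each fitted clone
        # simply records what it received
        self.received_sample_weight_ = sample_weight
        return np.full(len(X), self.classes_[0])


X = np.array([[1, 1], [1, 3], [0, 2], [2, 0]])
y = np.array([0, 1, 0, 1])
sample_weight = np.array([1.0, 2.0, 1.0, 2.0])

# the sub-estimator must explicitly request the metadata for its predict method
estimator = WeightConsumingClassifier().set_predict_request(sample_weight=True)

bagging = BaggingClassifier(estimator=estimator, n_estimators=3).fit(X, y)
# with this PR, `sample_weight` is routed on to each sub-estimator's predict
bagging.predict(X, sample_weight=sample_weight)
```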