8000 TST catch UserWarning in test_predictions for HGBT by lorentzenchr · Pull Request #26312 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

TST catch UserWarning in test_predictions for HGBT #26312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 23, 2023

Conversation

lorentzenchr
Copy link
Member

Reference Issues/PRs

None

What does this implement/fix? Explain your changes.

This PR catches UserWarning: X does not have valid feature names for a test for HGBT.

Any other comments?

With this PR pytest -x -Werror sklearn/ensemble/_hist_gradient_boosting succeeds.

@lorentzenchr lorentzenchr changed the title TST catch UserWarning in test_predictions TST catch UserWarning in test_predictions for HGBT May 1, 2023
@lorentzenchr lorentzenchr added No Changelog Needed module:test-suite everything related to our tests labels May 1, 2023
Copy link
Member
@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! In this case, I think it's better to fix the warnings themselves by using a dataframe during predict:

diff --git a/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py b/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
index 9456b9d993..f11bec3bd7 100644
--- a/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
+++ b/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
@@ -14,6 +14,7 @@ from sklearn.ensemble._hist_gradient_boosting.histogram import HistogramBuilder
 from sklearn.ensemble import HistGradientBoostingRegressor
 from sklearn.ensemble import HistGradientBoostingClassifier
 from sklearn.utils._openmp_helpers import _openmp_effective_n_threads
+from sklearn.utils._testing import _convert_container
 
 n_threads = _openmp_effective_n_threads()
 
@@ -212,9 +213,9 @@ def test_predictions(global_random_seed, use_feature_names):
     f_0 = rng.rand(n_samples)  # positive correlation with y
     f_1 = rng.rand(n_samples)  # negative correslation with y
     X = np.c_[f_0, f_1]
-    if use_feature_names:
-        pd = pytest.importorskip("pandas")
-        X = pd.DataFrame(X, columns=["f_0", "f_1"])
+    columns_name = ["f_0", "f_1"]
+    constructor_name = "dataframe" if use_feature_names else "array"
+    X = _convert_container(X, constructor_name, columns_name=columns_name)
 
     noise = rng.normal(loc=0.0, scale=0.01, size=n_samples)
     y = 5 * f_0 + np.sin(10 * np.pi * f_0) - 5 * f_1 - np.cos(10 * np.pi * f_1) + noise
@@ -244,20 +245,24 @@ def test_predictions(global_random_seed, use_feature_names):
     # First feature (POS)
     # assert pred is all increasing when f_0 is all increasing
     X = np.c_[linspace, constant]
+    X = _convert_container(X, constructor_name, columns_name=columns_name)
     pred = gbdt.predict(X)
     assert is_increasing(pred)
     # assert pred actually follows the variations of f_0
     X = np.c_[sin, constant]
+    X = _convert_container(X, constructor_name, columns_name=columns_name)
     pred = gbdt.predict(X)
     assert np.all((np.diff(pred) >= 0) == (np.diff(sin) >= 0))
 
     # Second feature (NEG)
     # assert pred is all decreasing when f_1 is all increasing
     X = np.c_[constant, linspace]
+    X = _convert_container(X, constructor_name, columns_name=columns_name)
     pred = gbdt.predict(X)
     assert is_decreasing(pred)
     # assert pred actually follows the inverse variations of f_1
     X = np.c_[constant, sin]
+    X = _convert_container(X, constructor_name, columns_name=columns_name)
     pred = gbdt.predict(X)
     assert ((np.diff(pred) <= 0) == (np.diff(sin) >= 0)).all()

Copy link
Member
@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan added the Quick Review For PRs that are quick to review label May 4, 2023
Copy link
Member
@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@jeremiedbb jeremiedbb merged commit c0bac2b into scikit-learn:main May 23, 2023
@lorentzenchr lorentzenchr deleted the fix_hgbt_test_predictions branch June 2, 2023 07:59
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:ensemble module:test-suite everything related to our tests No Changelog Needed Quick Review For PRs that are quick to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0