TST catch UserWarning in test_predictions for HGBT #26312
Conversation
Thank you for the PR! In this case, I think it's better to fix the warnings themselves by using a dataframe during predict:
diff --git a/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py b/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
index 9456b9d993..f11bec3bd7 100644
--- a/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
+++ b/sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py
@@ -14,6 +14,7 @@ from sklearn.ensemble._hist_gradient_boosting.histogram import HistogramBuilder
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.utils._openmp_helpers import _openmp_effective_n_threads
+from sklearn.utils._testing import _convert_container
n_threads = _openmp_effective_n_threads()
@@ -212,9 +213,9 @@ def test_predictions(global_random_seed, use_feature_names):
f_0 = rng.rand(n_samples) # positive correlation with y
f_1 = rng.rand(n_samples) # negative correslation with y
X = np.c_[f_0, f_1]
- if use_feature_names:
- pd = pytest.importorskip("pandas")
- X = pd.DataFrame(X, columns=["f_0", "f_1"])
+ columns_name = ["f_0", "f_1"]
+ constructor_name = "dataframe" if use_feature_names else "array"
+ X = _convert_container(X, constructor_name, columns_name=columns_name)
noise = rng.normal(loc=0.0, scale=0.01, size=n_samples)
y = 5 * f_0 + np.sin(10 * np.pi * f_0) - 5 * f_1 - np.cos(10 * np.pi * f_1) + noise
@@ -244,20 +245,24 @@ def test_predictions(global_random_seed, use_feature_names):
# First feature (POS)
# assert pred is all increasing when f_0 is all increasing
X = np.c_[linspace, constant]
+ X = _convert_container(X, constructor_name, columns_name=columns_name)
pred = gbdt.predict(X)
assert is_increasing(pred)
# assert pred actually follows the variations of f_0
X = np.c_[sin, constant]
+ X = _convert_container(X, constructor_name, columns_name=columns_name)
pred = gbdt.predict(X)
assert np.all((np.diff(pred) >= 0) == (np.diff(sin) >= 0))
# Second feature (NEG)
# assert pred is all decreasing when f_1 is all increasing
X = np.c_[constant, linspace]
+ X = _convert_container(X, constructor_name, columns_name=columns_name)
pred = gbdt.predict(X)
assert is_decreasing(pred)
# assert pred actually follows the inverse variations of f_1
X = np.c_[constant, sin]
+ X = _convert_container(X, constructor_name, columns_name=columns_name)
pred = gbdt.predict(X)
assert ((np.diff(pred) <= 0) == (np.diff(sin) >= 0)).all()
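The `_convert_container` helper suggested above is a private testing utility in `sklearn.utils._testing` that builds the requested container type from array-like input. A minimal sketch of how the diff uses it (assuming scikit-learn and pandas are installed; the helper is private, so its signature may change between versions):

```python
import numpy as np
from sklearn.utils._testing import _convert_container  # private test helper

X = np.c_[[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]

# constructor_name selects the output container, as in the diff above:
# "dataframe" yields a pandas DataFrame with the given column names,
# "array" yields a plain NumPy ndarray (column names are ignored).
X_df = _convert_container(X, "dataframe", columns_name=["f_0", "f_1"])
X_arr = _convert_container(X, "array", columns_name=["f_0", "f_1"])

print(type(X_df).__name__)
print(type(X_arr).__name__)
```

Because both branches go through the same call, the test body only has to switch `constructor_name` once instead of repeating the `pytest.importorskip("pandas")` dance at every prediction site.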
LGTM
LGTM, thanks.
Reference Issues/PRs
None
What does this implement/fix? Explain your changes.
This PR catches
UserWarning: X does not have valid feature names
in a test for HGBT.
Any other comments?
With this PR,
pytest -x -Werror sklearn/ensemble/_hist_gradient_boosting
succeeds.