MNT remove boston from the common test / estimator checks #17356
Towards #16155
sklearn/utils/estimator_checks.py
X, y = load_boston(return_X_y=True)
X, y = shuffle(X, y, random_state=0)
X, y = X[:n_samples], y[:n_samples]
def _regression_dataset(n_samples=200):
Not a blocker. The first call to _regression_dataset will create a dataset with n_samples, and for all subsequent calls n_samples will do nothing.
At this point, I would just define REGRESSION_DATASET at the top:
REGRESSION_DATASET = make_regression(...)
And then have the checks do:
X, y = REGRESSION_DATASET
True
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Thank you @glemaitre
@adrinjalali, @amueller would you mind having a look? Thanks!
But keep the lazy generation code.
I went ahead and addressed #17356 (comment) by hardcoding the dataset size, while keeping the lazy dataset generation in a private helper function to avoid executing overly complex code at module import time.
closes #17182
Using make_regression instead of load_boston in the common test.