-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
TST Extend tests for scipy.sparse.*array
in sklearn/ensemble/tests/test_weight_boosting.py
#27148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST Extend tests for scipy.sparse.*array
in sklearn/ensemble/tests/test_weight_boosting.py
#27148
Conversation
…st_weight_boosting.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are suggestions to improve variable names to make the intentions of the tests easier to grasp.
Otherwise, LGTM.
sparse_results = sparse_classifier.staged_decision_function(X_test_sparse) | ||
dense_results = dense_classifier.staged_decision_function(X_test) | ||
for sprase_res, dense_res in zip(sparse_results, dense_results): | ||
assert_array_almost_equal(sprase_res, dense_res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are at it, let's fix the typo: sprase
=> sparse
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, the names "sparse_results" and "sparse_res" are confusing. Those are not sparse out datastructures but results of a classifier that fits and predicts on sparse inputs datastructures.
I think we should rename those to dense_clf_results
/ sparse_clf_results
instead (and similarly for the "_res" variables).
|
||
|
||
@pytest.mark.parametrize( | ||
"sparse_container, sparse_type", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for sparse_type
.
@@ -308,7 +314,20 @@ def test_sample_weights_infinite(): | |||
clf.fit(iris.data, iris.target) | |||
|
|||
|
|||
def test_sparse_classification(): | |||
@pytest.mark.parametrize( | |||
"sparse_container, sparse_type", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename sparse_type
to expected_internal_type
.
As per your suggestions, I changed the variable names to be more clear. @ogrisel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm!
# Verify sparsity of data is maintained during training | ||
types = [i.data_type_ for i in sparse_classifier.estimators_] | ||
|
||
assert all([t == expected_internal_type for t in types]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @yuanx749! I just have a question regarding fixing the expected type for each parametrized case. Previously we were checking whether we have either csc_matrix or csr_matrix, now we only have csc for csc containers and csr matrix otherwise. I haven't checked the code so just want to confirm that do we expect csr array in all the other cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, according to the doc
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier.fit
Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…/test_weight_boosting.py` (scikit-learn#27148)
…/test_weight_boosting.py` (scikit-learn#27148)
…/test_weight_boosting.py` (scikit-learn#27148)
Towards #27090.
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Any other comments?