TST Extend tests for `scipy.sparse.*array` in `sklearn/ensemble/tests/test_weight_boosting.py` by yuanx749 · Pull Request #27148 · scikit-learn/scikit-learn · GitHub

TST Extend tests for scipy.sparse.*array in sklearn/ensemble/tests/test_weight_boosting.py #27148

Merged
merged 3 commits into from
Aug 24, 2023

Conversation

yuanx749
Contributor

Towards #27090.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

@github-actions
github-actions bot commented Aug 24, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: bc7839b. Link to the linter CI: here

Member
@ogrisel ogrisel left a comment

Here are suggestions to improve variable names to make the intentions of the tests easier to grasp.

Otherwise, LGTM.

sparse_results = sparse_classifier.staged_decision_function(X_test_sparse)
dense_results = dense_classifier.staged_decision_function(X_test)
for sprase_res, dense_res in zip(sparse_results, dense_results):
    assert_array_almost_equal(sprase_res, dense_res)
Member

While we are at it, let's fix the typo: sprase => sparse.

Member

Furthermore, the names "sparse_results" and "sparse_res" are confusing. These are not sparse output data structures but the results of a classifier that fits and predicts on sparse input data structures.

I think we should rename those to dense_clf_results / sparse_clf_results instead (and similarly for the "_res" variables).
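To illustrate the suggested naming, here is a minimal sketch of the dense-versus-sparse comparison using the `dense_clf_results` / `sparse_clf_results` names. The dataset, the default tree-based estimator, and the `n_stages_compared` counter are assumptions made for this illustration; they are not taken from the test file under review.

```python
import numpy as np
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Illustrative data (an assumption, not the data used in the actual test).
X, y = make_classification(n_samples=60, n_features=5, random_state=0)
X_sparse = sparse.csr_matrix(X)

# Same model, same seed: one fitted on dense input, one on sparse input.
dense_classifier = AdaBoostClassifier(n_estimators=5, random_state=0).fit(X, y)
sparse_classifier = AdaBoostClassifier(n_estimators=5, random_state=0).fit(X_sparse, y)

# The "_clf_" infix signals that these are classifier outputs,
# not sparse data structures.
dense_clf_results = dense_classifier.staged_decision_function(X)
sparse_clf_results = sparse_classifier.staged_decision_function(X_sparse)

n_stages_compared = 0
for sparse_clf_res, dense_clf_res in zip(sparse_clf_results, dense_clf_results):
    np.testing.assert_array_almost_equal(sparse_clf_res, dense_clf_res)
    n_stages_compared += 1
```

Both result arrays are ordinary dense NumPy arrays, which is why a name like `sparse_res` can mislead a reader.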



@pytest.mark.parametrize(
    "sparse_container, sparse_type",
Member

Same comment for sparse_type.

@@ -308,7 +314,20 @@ def test_sample_weights_infinite():
    clf.fit(iris.data, iris.target)


def test_sparse_classification():
@pytest.mark.parametrize(
    "sparse_container, sparse_type",
Member

Please rename sparse_type to expected_internal_type.

@yuanx749
Contributor Author

As per your suggestions, I changed the variable names to be more clear. @ogrisel

Member
@ogrisel ogrisel left a comment

Lgtm!

# Verify sparsity of data is maintained during training
types = [i.data_type_ for i in sparse_classifier.estimators_]

assert all([t == expected_internal_type for t in types])
Contributor
@OmarManzoor OmarManzoor Aug 24, 2023

Thanks for the PR @yuanx749! I just have a question about fixing the expected type for each parametrized case. Previously we checked whether we had either csc_matrix or csr_matrix; now we expect CSC only for CSC containers and CSR otherwise. I haven't checked the code, so I just want to confirm: do we expect a CSR array in all the other cases?

Contributor Author

Yes, according to the doc
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.AdaBoostClassifier.html#sklearn.ensemble.AdaBoostClassifier.fit

Sparse matrix can be CSC, CSR, COO, DOK, or LIL. COO, DOK, and LIL are converted to CSR.
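The documented conversion can be checked with a small sketch. `FormatRecordingTree`, the `cases` list, and the `observed` dictionary are hypothetical names introduced only for this illustration, and the tree-based estimator is a stand-in, not the estimator used in the actual test. The sketch records the sparse format that each fitted sub-estimator receives.

```python
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier


class FormatRecordingTree(DecisionTreeClassifier):
    """Hypothetical helper: remembers the sparse format it was fitted on."""

    def fit(self, X, y, sample_weight=None):
        self.data_format_ = X.format  # e.g. "csr" or "csc"
        return super().fit(X, y, sample_weight=sample_weight)


# Illustrative data (an assumption, not the data used in the actual test).
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

# (input container, format the sub-estimators are expected to see)
cases = [
    (sparse.csc_matrix, "csc"),
    (sparse.csr_matrix, "csr"),
    (sparse.coo_matrix, "csr"),
    (sparse.dok_matrix, "csr"),
    (sparse.lil_matrix, "csr"),
]

observed = {}
for sparse_container, expected_internal_format in cases:
    clf = AdaBoostClassifier(
        FormatRecordingTree(max_depth=1), n_estimators=3, random_state=0
    )
    clf.fit(sparse_container(X), y)
    formats = {est.data_format_ for est in clf.estimators_}
    observed[sparse_container.__name__] = formats
    assert formats == {expected_internal_format}
```

CSC and CSR inputs pass through unchanged, while COO, DOK, and LIL inputs should reach the sub-estimators as CSR, which is why each parametrized case can pin a single expected type.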

Contributor

Thanks for clarifying

Contributor
@OmarManzoor OmarManzoor left a comment

LGTM

@OmarManzoor OmarManzoor merged commit a9611d0 into scikit-learn:main Aug 24, 2023
@yuanx749 yuanx749 deleted the sparse-weight-boosting branch August 25, 2023 03:15
akaashpatelmns pushed a commit to akaashp2000/scikit-learn that referenced this pull request Aug 25, 2023
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Aug 29, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023