8000 TST Extend tests for `scipy.sparse.*array` in `test_polynomial.py` by work-mohit · Pull Request #27166 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

TST Extend tests for scipy.sparse.*array in test_polynomial.py #27166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

work-mohit
Copy link
Contributor

Towards #27090

What does this implement/fix? Explain your changes.

Extended the test cases for the test_polynomial.py file. All tests run smoothly except two:

FAILED sklearn/preprocessing/tests/test_polynomial.py::test_polynomial_features_csc_X[csc_array-4-False-False-float64] - NotImplementedError: We have not yet implemented 1D sparse slices; please index using explicit indices, e.g. x[:, [0]]

FAILED sklearn/preprocessing/tests/test_polynomial.py::test_polynomial_features_csc_X[csc_array-4-False-True-float64] - NotImplementedError: We have not yet implemented 1D sparse slices; please index using explicit indices, e.g. x[:, [0]]

Any other comments?

I've already tried .A to .toarray() for csr_arrays but got nothing, the same errors keep coming. Maybe there is another source of error, unable to find it. Feel free to drop your review.

     test_polynomial_features_csc_X[csc_array-4-False-True-float64] 

deg = 4, include_bias = False, interaction_only = True, dtype = <class 'numpy.float64'>, csc_container = <class 'scipy.sparse._csc.csc_array'>

    @pytest.mark.parametrize(
        ["deg", "include_bias", "interaction_only", "dtype"],
        [
            (1, True, False, int),
            (2, True, False, int),
            (2, True, False, np.float32),
            (2, True, False, np.float64),
            (3, False, False, np.float64),
            (3, False, True, np.float64),
            (4, False, False, np.float64),
            (4, False, True, np.float64),
        ],
    )
    @pytest.mark.parametrize("csc_container", CSC_CONTAINERS)
    def test_polynomial_features_csc_X(
        deg, include_bias, interaction_only, dtype, csc_container
    ):
        rng = np.random.RandomState(0)
        X = rng.randint(0, 2, (100, 2))
        X_csc = csc_container(X)

        est = PolynomialFeatures(
            deg, include_bias=include_bias, interaction_only=interaction_only
        )
>       Xt_csc = est.fit_transform(X_csc.astype(dtype))    <<<<<<< At this line 

@github-actions
Copy link
github-actions bot commented Aug 25, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9cb2f99. Link to the linter CI: here

@ogrisel
Copy link
Member
ogrisel commented Aug 25, 2023

It's important to look at the full traceback in the CI report to understand what causes the exception raised at the bottom of the traceback.

Here if you walk up the traceback, you will see that the code in scikit-learn causing this error is:

https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=58339&view=logs&j=7d13af21-9cb9-5d40-483b-ea0f074409c6&t=97bcaf17-ac30-535a-f717-a6f6663c5884&l=1378

../1/s/sklearn/preprocessing/_polynomial.py:498: in transform
    out_col = X[:, col_idx].multiply(out_col)

and the message of the exception is:

NotImplementedError: We have not yet implemented 1D sparse slices; please index using explicit indices, e.g. `x[:, [0]]`

which suggest a potential fix to make the code in scikit-learn compatible with the new sparse arrays:

    out_col = X[:, [col_idx]].multiply(out_col)

@ogrisel
Copy link
Member
ogrisel commented Aug 25, 2023

By the way that means that this PR will not just be about updating the tests but actually about fixing something in scikit-learn to add compatibility to the new scipy sparse arrays.

Therefore, this PR will need an entry in the changelog (doc/whats_new/v1.4.rst). Take example on the existing entry for #27100 that changed the code used by the NMF estimators.

@jjerphan jjerphan changed the title Extended the test cases of test_polynomial.py TST Extend tests for scipy.sparse.*array in test_polynomial.py Aug 25, 2023
@work-mohit
Copy link
Contributor Author

@ogrisel Have a look on this PR. Let me know if any updates required. Double check the change log file. Thanks !

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix @work-mohit!

@ogrisel ogrisel added the Waiting for Second Reviewer First reviewer is done, need a second one! label Sep 11, 2023
@work-mohit work-mohit requested a review from ogrisel September 12, 2023 11:15
@glemaitre glemaitre self-requested a review September 29, 2023 20:49
Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well. Merging if the CIs turns green.

B581

@glemaitre glemaitre enabled auto-merge (squash) September 29, 2023 20:59
@glemaitre glemaitre merged commit 9f6592f into scikit-learn:main Sep 29, 2023
@work-mohit work-mohit deleted the fix/preprocessing/tests/test_polynomial.py branch September 30, 2023 06:19
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…cikit-learn#27166)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:preprocessing Waiting for Second Reviewer First reviewer is done, need a second one!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0