8000 ENH add narwhals as dependency by lorentzenchr · Pull Request #31127 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

ENH add narwhals as dependency #31127

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

lorentzenchr
Copy link
Member
@lorentzenchr lorentzenchr commented Apr 1, 2025

Reference Issues/PRs

Closes #31049.

What does this implement/fix? Explain your changes.

This PR adds narwhals as runtime dependency like joblib and uses narwhals in:

  • _safe_indexing

Any other comments?

Not yet.

@lorentzenchr lorentzenchr marked this pull request as draft April 1, 2025 21:32
Copy link
github-actions bot commented Apr 1, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 63e1bcd. Link to the linter CI: here

Comment on lines 269 to +277
if hasattr(X, "iloc"):
# TODO: we should probably use _is_pandas_df_or_series(X) instead but this
# would require updating some tests such as test_train_test_split_mock_pandas.
# TODO: Should also work with _narwhals_indexing, but
# test_safe_indexing_pandas_no_settingwithcopy_warning
# does not pass.
return _pandas_indexing(X, indices, indices_dtype, axis=axis)
elif _is_polars_df_or_series(X):
return _polars_indexing(X, indices, indices_dtype, axis=axis)
elif nw.dependencies.is_into_dataframe(X) or nw.dependencies.is_into_series(X):
return _narwhals_indexing(X, indices, indices_dtype, axis=axis)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole section should read

    if nw.dependencies.is_into_dataframe(X) or nw.dependencies.is_into_series(X):
        return _narwhals_indexing(X, indices, indices_dtype, axis=axis)

and the pandas/iloc path should be removed. But currently, it does not pass the test test_safe_indexing_pandas_no_settingwithcopy_warning.

@MarcoGorelli You might have additional insights.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the ping, i'll take a look next week!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I don't understand this test. It checks that pandas makes a copy, but only when indexing with [0, 1]. On the main branch, changing that to slice(0, 2) is enough to make the test fail

I'll open a separate issue to discuss

Narwhals objects (DataFrame, Series) are immutable so this isn't something you should need to worry about with Narwhals anyway

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you go: #31290

Anyone in particular we should tag in it?

@lorentzenchr lorentzenchr marked this pull request as ready for review April 25, 2025 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC adopt narwhals for dataframe support
2 participants
0