ENH add narwhals as dependency #31127
if hasattr(X, "iloc"):
    # TODO: we should probably use _is_pandas_df_or_series(X) instead but this
    # would require updating some tests such as test_train_test_split_mock_pandas.
    # TODO: Should also work with _narwhals_indexing, but
    # test_safe_indexing_pandas_no_settingwithcopy_warning
    # does not pass.
    return _pandas_indexing(X, indices, indices_dtype, axis=axis)
elif _is_polars_df_or_series(X):
    return _polars_indexing(X, indices, indices_dtype, axis=axis)
elif nw.dependencies.is_into_dataframe(X) or nw.dependencies.is_into_series(X):
    return _narwhals_indexing(X, indices, indices_dtype, axis=axis)
This whole section should read
if nw.dependencies.is_into_dataframe(X) or nw.dependencies.is_into_series(X):
    return _narwhals_indexing(X, indices, indices_dtype, axis=axis)
and the pandas/iloc path should be removed. But currently, it does not pass the test test_safe_indexing_pandas_no_settingwithcopy_warning.
@MarcoGorelli You might have additional insights.
thanks for the ping, i'll take a look next week!
To be honest, I don't understand this test. It checks that pandas makes a copy, but only when indexing with [0, 1]. On the main branch, changing that to slice(0, 2) is enough to make the test fail.
I'll open a separate issue to discuss.
Narwhals objects (DataFrame, Series) are immutable, so this isn't something you should need to worry about with Narwhals anyway.
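For readers following along, here is a rough NumPy analogue of the list-versus-slice difference described above. It is only an illustration, under the assumption that the pandas behaviour the test relies on is related to copy-versus-view semantics; it does not reproduce test_safe_indexing_pandas_no_settingwithcopy_warning, and `data` is a made-up array rather than anything from scikit-learn.

```python
import numpy as np

# Hypothetical stand-in data; not taken from the scikit-learn test.
data = np.arange(6).reshape(3, 2)

by_list = data[[0, 1]]        # integer-list ("fancy") indexing
by_slice = data[slice(0, 2)]  # basic slicing over the same rows

# Both selections hold the same values ...
same_values = bool(np.array_equal(by_list, by_slice))

# ... but only the slice still shares memory with `data`.
list_shares = bool(np.shares_memory(data, by_list))
slice_shares = bool(np.shares_memory(data, by_slice))

print(same_values, list_shares, slice_shares)
```

In NumPy, the integer-list selection is an independent copy while the slice is a view, so a check that only exercises [0, 1] would not notice what happens with slice(0, 2).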
Here you go: #31290
Anyone in particular we should tag in it?
I'd be happy to see us adopting narwhals, using
@adrinjalali do you have thoughts on #31290? I think we'd need to align on what the desired behaviour is there.
Yes, I can, but that's not the main problem: it needs community support (from scikit-learn and narwhals), and we (scikit-learn) need to assess whether it is worth it. I first thought, yeah, let's do it. But the more failures I fixed, the more I thought about pros and cons. The better place for those is in the original issue #31049.
63e1bcd to bcb8447
ok i see what's happening, will make a PR to your branch
Alright here we go: lorentzenchr#9 Issues I spotted:
If you're happy to keep a separate pandas path, then I agree that #31290 isn't actually a blocker 👍
Fix PyArrow handling
Reference Issues/PRs
Closes #31049.
What does this implement/fix? Explain your changes.
This PR adds narwhals as runtime dependency like joblib and uses narwhals in:
_safe_indexing
Any other comments?
Not yet.