8000 [REVIEW] Add dropna argument to groupby by rjzamora · Pull Request #5579 · dask/dask · GitHub
[go: up one dir, main page]

Skip to content

[REVIEW] Add dropna argument to groupby #5579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 13, 2019
Merged

Conversation

rjzamora
Copy link
Member
@rjzamora rjzamora commented Nov 11, 2019

This PR allows cudf-backed groupby operations to include (or exclude) null values in the index. The changes are directly motivated by cudf#3302.

Since pandas does not yet support the dropna argument (see: pandas#3729 and pandas#21669), the new test_groupby_dropna_pandas test only checks for an unknown-argument TypeError. A more-thorough test_groupby_dropna_cudf test was also added here, but this was mostly to illustrate the value of these changes. The cudf-based test should most-likely be removed before merging, because it will be skipped by CI anyway.

Closes #5565 (Although not a permanent solution -- Cleanup will be necessary once pandas supports the dropna argument).

  • Tests added / passed
  • Passes black dask / flake8 dask

@quasiben
Copy link
Member

Is this the first case of building something in dask.dataframe which is not supported by pandas ? My guess is yes, though could be wrong. It's interesting that dask is now supporting dataframe operations which pandas does not yet have

@rjzamora
Copy link
Member Author

It's interesting that dask is now supporting dataframe operations which pandas does not yet have

Right - #5565 does seem to specify an explicit dropna argument. However, maybe we want to pass around something a bit more backend-agnostic here - something like groupby_kwargs?

@mrocklin
Copy link
Member

@TomAugspurger if you're comfortable I would prefer to leave reviewing this to you.

Copy link
Member
@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good, thanks. Just one requested change on the test.

@rjzamora rjzamora changed the title [WIP] Add dropna argument to groupby Add dropna argument to groupby Nov 12, 2019
@rjzamora rjzamora changed the title Add dropna argument to groupby [REVIEW] Add dropna argument to groupby Nov 12, 2019
@mrocklin
Copy link
Member

Merging in. Thanks @rjzamora for the fix and @TomAugspurger for the review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add dropna keyword to groupby aggregations
4 participants
0