ENH: Add dropna in groupby to allow NaN in keys #30584
@@ -635,7 +635,7 @@ def factorize(
 uniques, codes, na_sentinel=na_sentinel, assume_unique=True, verify=False
 )
 if not dropna and (codes == na_sentinel).any():
-uniques = np.append(uniques, [np.nan])
+uniques = np.append(uniques, [None])
Comment: These are typically

Comment: I think if you have any idea of a better way of doing this, you are most welcome! 😃

Comment: This is likely changing the dtype; you need the NA value for this type, use pandas.core.dtypes.cast.na_value_for_dtype.

Comment: I do not think so, first of all, if this changed using

Comment: Is leveraging the Int64 type an option here? This just seems a little strange in its current state.

 codes = np.where(codes == na_sentinel, len(uniques) - 1, codes)

 uniques = _reconstruct_data(uniques, dtype, original)
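The dtype question raised in this review thread can be checked with NumPy alone. The following is a minimal sketch, not code from the PR; the `uniques` array is made up for illustration. It shows that appending `np.nan` to an integer array upcasts the result to float, while appending `None` produces an object array.

```python
import numpy as np

# Hypothetical integer uniques, standing in for the array built in factorize.
uniques = np.array([1, 2, 3])

# Appending NaN forces a float result, because NaN has no integer representation.
with_nan = np.append(uniques, [np.nan])

# Appending None forces an object result, which can hold arbitrary Python values.
with_none = np.append(uniques, [None])

print(with_nan.dtype)   # float64
print(with_none.dtype)  # object
```

Either way the original integer dtype is not preserved, which is why a dtype-aware NA value is suggested in the comments above.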
@@ -5649,7 +5649,7 @@ def update(
 Captive 210.0
 Wild 185.0

-We can also choose to include NaN in group keys or not by defining
+We can also choose to include NaN in group keys or not by setting
 `dropna` parameter:
Comment: mention the default

 >>> l = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]]
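To make the effect of `dropna` concrete, here is a minimal sketch built on the list from the docstring example above. The column names `a`, `b`, `c` and the choice of `sum()` are assumptions for illustration, and the sketch assumes a pandas release in which `groupby` accepts `dropna`. The default is `dropna=True`, which discards rows whose key is NaN.

```python
import pandas as pd

# Data from the docstring example; the column names are assumed here.
l = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]]
df = pd.DataFrame(l, columns=["a", "b", "c"])

# Default behaviour (dropna=True): the row with a NaN key in "b" is dropped.
dropped = df.groupby("b").sum()

# dropna=False: NaN is kept as its own group key.
kept = df.groupby("b", dropna=False).sum()

print(dropped)  # groups for keys 1.0 and 2.0 only
print(kept)     # groups for keys 1.0, 2.0 and NaN
```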
@@ -7347,9 +7347,9 @@ def clip(

 .. versionadded:: 0.23.0
 dropna : bool, default True
-If True, and if group keys contain NaN values, NaN values together
+If True, and if group keys contain NA values, NA values together
 with row/column will be dropped.
-If False, NaN values will also be treated as the key in groups
+If False, NA values will also be treated as the key in groups

 .. versionadded:: 1.0.0
Comment: Can you assign `codes == na_sentinel` to a mask variable prior to this? Slightly preferable to re-evaluating this in the branch.

Comment: done, thanks!
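The mask refactor requested above could look roughly like the sketch below. This is not the code that was merged: `remap_na_codes` is a hypothetical standalone helper, and it uses `np.nan` as the appended value purely for illustration. The point is that `codes == na_sentinel` is computed once and reused for both the check and the remapping.

```python
import numpy as np

def remap_na_codes(codes, uniques, na_sentinel=-1, dropna=True):
    """Hypothetical helper: give NA entries their own code when dropna is False."""
    # Evaluate the comparison once and reuse it below.
    mask = codes == na_sentinel
    if not dropna and mask.any():
        # Append one NA slot and point every sentinel code at it.
        uniques = np.append(uniques, [np.nan])
        codes = np.where(mask, len(uniques) - 1, codes)
    return codes, uniques

codes = np.array([0, -1, 1, 0])
uniques = np.array(["a", "b"], dtype=object)

new_codes, new_uniques = remap_na_codes(codes, uniques, dropna=False)
print(new_codes)         # [0 2 1 0]
print(len(new_uniques))  # 3
```

With `dropna=True` the helper returns its inputs unchanged, so the sentinel codes stay at `-1`.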