REF: Compute complete result_index upfront in groupby by rhshadrach · Pull Request #55738 · pandas-dev/pandas · GitHub

REF: Compute complete result_index upfront in groupby #55738

Merged
merged 45 commits into from
Feb 7, 2024
e32b789
REF: Compute correct result_index upfront in groupby
rhshadrach Aug 16, 2023
31a7c92
Refinements
rhshadrach Oct 30, 2023
5ecfbeb
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Oct 30, 2023
8ce08d1
Refinements
rhshadrach Nov 1, 2023
6296f4a
Refinements
rhshadrach Nov 1, 2023
68f2aeb
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Nov 5, 2023
7141425
Restore inferring index dtype
rhshadrach Nov 5, 2023
7f74812
Merge branch 'gb_observed_pre' of https://github.com/rhshadrach/panda…
rhshadrach Nov 5, 2023
e39cbc8
Test fixups
rhshadrach Nov 5, 2023
c82bd65
Refinements
rhshadrach Nov 5, 2023
3a9892d
Refinements
rhshadrach Nov 5, 2023
25770be
fixup
rhshadrach Nov 5, 2023
a338efc
fixup
rhshadrach Nov 5, 2023
dbdec9f
fixup
rhshadrach Nov 5, 2023
0ae70b7
Fix sorting and non-sorting
rhshadrach Nov 12, 2023
99d2beb
Cleanup
rhshadrach Nov 12, 2023
a477dc0
Call ensure_platform_int last
rhshadrach Nov 13, 2023
7fb7ca6
fixup
rhshadrach Nov 14, 2023
b79cc85
fixup
rhshadrach Nov 14, 2023
da9169d
REF: Compute correct result_index upfront in groupby
rhshadrach Aug 16, 2023
d2eee13
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Nov 17, 2023
b247544
Merge branch 'gb_observed_pre' of https://github.com/rhshadrach/panda…
rhshadrach Nov 17, 2023
700f40f
Add test
rhshadrach Nov 17, 2023
efd20c7
Remove test
rhshadrach Nov 18, 2023
08d02b0
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Nov 18, 2023
9dac297
Move unobserved to the end
rhshadrach Nov 18, 2023
2c30d63
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Nov 18, 2023
26da0b8
cleanup
rhshadrach Nov 18, 2023
0a6b63a
cleanup
rhshadrach Nov 18, 2023
001881f
cleanup
rhshadrach Nov 18, 2023
e285742
Merge branch 'main' of https://github.com/pandas-dev/pandas into gb_o…
rhshadrach Nov 18, 2023
4aaa1d2
Merge branch 'gb_observed_pre' of https://github.com/rhshadrach/panda…
rhshadrach Nov 22, 2023
e5d5c92
Merge remote-tracking branch 'upstream/main' into gb_observed_pre
rhshadrach Dec 8, 2023
d5b37a4
Merge fixup
rhshadrach Dec 8, 2023
c2c3859
Merge remote-tracking branch 'upstream/main' into gb_observed_pre
rhshadrach Feb 1, 2024
4f284ce
fixup
rhshadrach Feb 2, 2024
4374cdb
Merge remote-tracking branch 'upstream/main' into gb_observed_pre
rhshadrach Feb 2, 2024
c7e6a89
Merge remote-tracking branch 'upstream/main' into gb_observed_pre
rhshadrach Feb 2, 2024
dce05da
fixup
rhshadrach Feb 2, 2024
72209a8
Fixup and test
rhshadrach Feb 3, 2024
b58b69d
whatsnew
rhshadrach Feb 3, 2024
fe99dc5
type ignore
rhshadrach Feb 3, 2024
766c229
Refactor & type annotations
rhshadrach Feb 4, 2024
8f592ad
Merge remote-tracking branch 'upstream/main' into gb_observed_pre
rhshadrach Feb 4, 2024
a05ff18
Better bikeshed
rhshadrach Feb 4, 2024
Call ensure_platform_int last
rhshadrach committed Nov 13, 2023
commit a477dc06bcb151c1aeac0a28e2688e3d7f4ff17c
7 changes: 4 additions & 3 deletions pandas/core/groupby/ops.py
@@ -762,7 +762,7 @@ def ids(self) -> np.ndarray:
     @cache_readonly
     def result_index_and_ids(self) -> tuple[Index, np.ndarray]:
         names = self.names
-        codes = [ensure_platform_int(ping.codes) for ping in self.groupings]
+        codes = [ping.codes for ping in self.groupings]
         levels = [Index._with_infer(ping.uniques) for ping in self.groupings]
         obs = [
             ping._observed or not ping._passed_categorical for ping in self.groupings
@@ -807,10 +807,10 @@ def result_index_and_ids(self) -> tuple[Index, np.ndarray]:

         if all(obs):
             result_index = ob_index
-            ids = ob_ids
+            ids = ensure_platform_int(ob_ids)
         elif not any(obs):
             result_index = unob_index
-            ids = unob_ids
+            ids = ensure_platform_int(unob_ids)
         else:
             # Combine unobserved and observed parts of result_index
             unob_indices = [k for k, e in enumerate(obs) if not e]
@@ -841,6 +841,7 @@ def result_index_and_ids(self) -> tuple[Index, np.ndarray]:
                     [uniques, np.delete(np.arange(len(result_index)), uniques)]
                 )
                 result_index = result_index.take(taker)
+            ids = ensure_platform_int(ids)

         return result_index, ids
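
The diff above moves the platform-integer cast from the per-grouping codes to the final ids. As a rough illustration of that pattern, the sketch below uses only NumPy. `to_platform_int` is a hypothetical stand-in for the internal `ensure_platform_int` helper, and the row-major encoding of ids is a simplification, not the code in `ops.py`.

```python
import numpy as np


def to_platform_int(arr: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for pandas' internal ensure_platform_int."""
    # np.intp is the platform's index-sized integer (int64 on most 64-bit systems).
    return np.asarray(arr).astype(np.intp, copy=False)


# Per-grouping codes may arrive in narrower integer dtypes.
codes = [
    np.array([0, 1, 0, 2], dtype=np.int8),
    np.array([1, 0, 1, 1], dtype=np.int16),
]
shape = (3, 2)  # assumed number of levels for each grouping

# Combine the codes into one flat group id per row (simplified row-major encoding).
ids = codes[0].astype(np.int64) * shape[1] + codes[1]

# Cast once, at the end, instead of casting every codes array up front.
ids = to_platform_int(ids)
print(ids.dtype == np.intp, ids.tolist())
```

Casting only the final array avoids converting each intermediate codes array, which appears to be the intent behind the commit title.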

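The `obs` list in the diff separates groupings whose unobserved categories are dropped from those whose unobserved categories are kept. The sketch below shows the user-facing effect through the public `groupby` API only. The frame and column names are made up for illustration, and nothing here reproduces the internal `result_index_and_ids` logic.

```python
import pandas as pd

# "b" is a declared category that never occurs in the data.
df = pd.DataFrame(
    {
        "key": pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"]),
        "val": [1, 2, 3],
    }
)

# observed=False keeps every category in the result index, including "b".
full = df.groupby("key", observed=False)["val"].sum()

# observed=True keeps only the categories that appear in the data.
seen = df.groupby("key", observed=True)["val"].sum()

print(list(full.index))
print(list(seen.index))
```

With `observed=False` the result index has one entry per category, so the unobserved group `"b"` appears with a sum of 0.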