In [44]: a = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 3]})

In [45]: a.groupby('a').agg('sum')  # 'a' is returned as index
Out[45]:
   b
a
1  3
2  3

In [46]: a.groupby('a').agg('nunique')  # 'a' is returned both as index and column
Out[46]:
   a  b
a
1  1  2
2  1  1
The groupby documentation notes:
Aggregation functions will not return the groups that you are aggregating over if they are named columns, when as_index=True, the default. The grouped columns will be the indices of the returned object.
However, in the snippet above, it looks like the nunique aggregation behaves differently in this respect from the sum aggregation.
Is it possible to predict when an aggregation will return the groups as the index, as regular columns ("SQL-style output"), or as a combination of the two (as above)?
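
For comparison, here is a minimal sketch of the two unambiguous shapes I would expect, assuming a reasonably recent pandas. Selecting the value column explicitly should keep the grouping key out of the aggregated columns, and as_index=False is the documented way to ask for the key as a regular column. The names by_index and sql_style are only illustrative, and I have not checked that explicit selection behaves the same on every pandas version.

```python
import pandas as pd

a = pd.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 3]})

# Select the value column explicitly so that only 'b' is aggregated;
# the grouping key 'a' should then appear only as the index.
by_index = a.groupby('a')[['b']].agg('nunique')

# as_index=False requests "SQL-style" output: the grouping key 'a'
# comes back as a regular column next to the aggregated 'b'.
sql_style = a.groupby('a', as_index=False).agg('sum')

print(by_index)
print(sql_style)
```

If this holds, the remaining question is only about what happens when no columns are selected, as in In [46].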