-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
TYP: Add annotation for df.pivot #32197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 33 commits
7e461a1
1314059
8bcb313
24c3ede
dea38f2
cd9e7ac
e5e912b
f337468
1cc1225
046cdc9
2807b63
2fdd875
5453237
18e85bb
f61dcac
5e3ac3f
ffda679
6b81abb
c01f221
1552f4b
29b0605
4d2d6d3
f7fb25a
80e9710
7618b3b
d44b3df
3c3c7a4
cb6473d
6586410
76d319a
45637fc
62edda6
f063a24
1ac9e0f
c17dd80
16798b1
63643aa
7b10761
9870d0f
4e05ec0
65b45a0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,18 @@ | ||
from typing import TYPE_CHECKING, Callable, Dict, List, Tuple, Union | ||
from typing import ( | ||
TYPE_CHECKING, | ||
Callable, | ||
Dict, | ||
List, | ||
Optional, | ||
Sequence, | ||
Tuple, | ||
Union, | ||
cast, | ||
) | ||
|
||
import numpy as np | ||
|
||
from pandas._typing import Label | ||
from pandas.util._decorators import Appender, Substitution | ||
|
||
from pandas.core.dtypes.cast import maybe_downcast_to_dtype | ||
|
@@ -424,16 +435,27 @@ def _convert_by(by): | |
|
||
@Substitution("\ndata : DataFrame") | ||
@Appender(_shared_docs["pivot"], indents=1) | ||
def pivot(data: "DataFrame", index=None, columns=None, values=None) -> "DataFrame": | ||
def pivot( | ||
data: "DataFrame", | ||
index: Optional[Union[Label, Sequence[Label]]] = None, | ||
columns: Optional[Union[Label, Sequence[Label]]] = None, | ||
values: Optional[Union[Label, Sequence[Label]]] = None, | ||
) -> "DataFrame": | ||
if columns is None: | ||
raise TypeError("pivot() missing 1 required argument: 'columns'") | ||
columns = columns if is_list_like(columns) else [columns] | ||
|
||
if is_list_like(columns): | ||
simonjayhawkins marked this conversation as resolved.
Show resolved
Hide resolved
|
||
columns = cast(Sequence[Label], columns) | ||
columns = columns | ||
else: | ||
columns = [columns] | ||
|
||
if values is None: | ||
cols: List[str] = [] | ||
cols: List[Label] = [] | ||
if index is None: | ||
pass | ||
elif is_list_like(index): | ||
index = cast(Sequence[Label], index) | ||
simonjayhawkins marked this conversation as resolved.
Show resolved
Hide resolved
|
||
cols = list(index) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One of the mypy failures is that @jbrockmendel or @simonjayhawkins I forget, but do we have a preferred way of handling that scoping? Would it be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this something we expect mypy to get smarter about? im increasingly leaning towards "type: ignore" on the grounds that it a) has zero runtime overhead and b) is easy to grep for in periodic see-if-we-can-remove-this sweeps There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe but its not implemented. The mypy issue number is 5206 so I think OK to also do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think asserts are OK when narrowing to exclude a type, e.g. assert is not None, but cast is better for narrowing to ensure a type. ( unless the assert is beneficial in a run-time context) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @WillAyd @jbrockmendel #32785 was opened for this discussion. the #type: ignore option could be discussed there also. cc @jorisvandenbossche |
||
else: | ||
cols = [index] | ||
|
@@ -443,23 +465,25 @@ def pivot(data: "DataFrame", index=None, columns=None, values=None) -> "DataFram | |
indexed = data.set_index(cols, append=append) | ||
else: | ||
if index is None: | ||
index = [Series(data.index, name=data.index.name)] | ||
idx_list = [Series(data.index, name=data.index.name)] | ||
elif is_list_like(index): | ||
index = [data[idx] for idx in index] | ||
index = cast(Sequence[Label], index) | ||
simonjayhawkins marked this conversation as resolved.
Show resolved
Hide resolved
|
||
idx_list = [data[idx] for idx in index] | ||
else: | ||
index = [data[index]] | ||
idx_list = [data[index]] | ||
|
||
data_columns = [data[col] for col in columns] | ||
index.extend(data_columns) | ||
index = MultiIndex.from_arrays(index) | ||
idx_list.extend(data_columns) | ||
mi_index = MultiIndex.from_arrays(idx_list) | ||
|
||
if is_list_like(values) and not isinstance(values, tuple): | ||
# Exclude tuple because it is seen as a single column name | ||
values = cast(Sequence[Label], values) | ||
indexed = data._constructor( | ||
data[values]._values, index=index, columns=values | ||
data[values]._values, index=mi_index, columns=values | ||
) | ||
else: | ||
indexed = data._constructor_sliced(data[values]._values, index=index) | ||
indexed = data._constructor_sliced(data[values]._values, index=mi_index) | ||
return indexed.unstack(columns) | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this is a required kwarg, should the Optional be removed from the signature? cc @simonjayhawkins not sure how to handle this given that the default value is None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few considerations here.
Optional
is normally included to represent a keyword argument that defaults to None. Sinceindex
is a keyword argument,columns
must be a keyword argument (non-default argument follows default argument). but because Label already includes None (due to an issue with Hashable) the Optional isn't actually required in this case (This is the reason to exclude Optional from aliases in general).If Optional was omitted, this could add to the confusion.
Since columns should allow None, if a column is named None.
so the default and check here should probably be _lib.no_default.
_lib.no_default currently resolves to Any (defined in cython with no stubs) so is compatible with anything.
If we typed _lib.no_default which is currently object() we wouldn't be able to type parameters precisely since object() wouldn't be compatible. In the future we may wish to define using Enums but this would add to the noise of the function signature from an end user perspective.
from https://www.python.org/dev/peps/pep-0484/#support-for-singleton-types-in-unions