Add allow_sets-kwarg to is_list_like by h-vetinari · Pull Request #23065 · pandas-dev/pandas · GitHub

Add allow_sets-kwarg to is_list_like #23065


Merged
merged 21 commits into from
Oct 18, 2018
Review (jorisvandenbossche)
h-vetinari committed Oct 10, 2018
commit 8efee57c4503ba42cb3e0ae1f70e6b3958964deb
8 changes: 4 additions & 4 deletions pandas/core/dtypes/common.py
@@ -17,10 +17,10 @@
     ABCSparseArray, ABCSparseSeries, ABCCategoricalIndex, ABCIndexClass,
     ABCDateOffset)
 from pandas.core.dtypes.inference import ( # noqa:F401
-    is_bool, is_integer, is_float, is_number, is_decimal, is_complex, is_re,
-    is_re_compilable, is_dict_like, is_string_like, is_file_like, is_list_like,
-    is_ordered_list_like, is_nested_list_like, is_sequence, is_named_tuple,
-    is_array_like, is_hashable, is_iterator, is_scalar, is_interval)
+    is_bool, is_integer, is_float, is_number, is_decimal, is_complex,
+    is_re, is_re_compilable, is_dict_like, is_string_like, is_file_like,
+    is_list_like, is_nested_list_like, is_sequence, is_named_tuple,
+    is_hashable, is_iterator, is_array_like, is_scalar, is_interval)
Contributor Author

This is somewhat of an artefact of the version with is_ordered_list_like, where I tried to group these methods by similarity (i.e. scalar dtypes, regexes, containers), but I decided to keep it because I think it helps. Can revert that part of course

Contributor

yes, on any change, pls try to do the minimal changeset. This will lessen reviewer burden and make things go faster.

Contributor Author

"yes, please try to do minimal changeset [next time]" or "yes please revert"?

Contributor

Fine as is for now.

Contributor

ok for now, but generally pls don't change unrelated things.




_POSSIBLY_CAST_DTYPES = {np.dtype(t).name
37 changes: 8 additions & 29 deletions pandas/core/dtypes/inference.py
@@ -248,7 +248,7 @@ def is_re_compilable(obj):
         return True


-def is_list_like(obj):
+def is_list_like(obj, strict=False):
     """
     Check if the object is list-like.

@@ -260,6 +260,8 @@ def is_list_like(obj):
     Parameters
     ----------
     obj : The object to check.
+    strict : boolean, default False
+        If this parameter is True, sets will not be considered list-like

Contributor

add a versionadded tag

     Returns
     -------
@@ -284,37 +286,14 @@ def is_list_like(obj):
     False
     """

-    return (isinstance(obj, compat.Iterable) and
+    return (isinstance(obj, compat.Iterable)
             # we do not count strings/unicode/bytes as list-like
-            not isinstance(obj, string_and_binary_types) and
+            and not isinstance(obj, string_and_binary_types)
Contributor

this is not correct, leave the and where it was

Contributor Author

PEP8 is clear about this (https://www.python.org/dev/peps/pep-0008/#should-a-line-break-before-or-after-a-binary-operator)

Binary operators (like and is one) should come after a line-break. It's also more readable.

Member

Yes, changing this is in principle fine; we have been following that PEP8 rule recently (typically we only want such changes on lines that are already touched by the PR, but since you are already touching the function some lines below, I would say it is fine).

Note that this is a recent change in PEP8, so you will see many places in the code that do it differently.
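
To illustrate the line-break style under discussion, here is a minimal sketch in which each continuation line starts with its binary operator. The function `in_range_and_even` is hypothetical and not part of pandas; it only mirrors the layout of the reformatted `return` expression.

```python
def in_range_and_even(n, low, high):
    """Hypothetical example: every continuation line leads with its operator."""
    return (low <= n
            # the operator opens the line, so each condition reads on its own
            and n <= high
            and n % 2 == 0)


print(in_range_and_even(4, 0, 10))   # True
print(in_range_and_even(5, 0, 10))   # False
```

With this layout, adding or removing one condition changes exactly one line of the diff.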

             # exclude zero-dimensional numpy arrays, effectively scalars
-            not (isinstance(obj, np.ndarray) and obj.ndim == 0))
Contributor Author

Aside from adding the kwarg everywhere, this is the only substantial change of this PR.
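
The effect of that change can be sketched with a stdlib-only stand-in. This is not the pandas implementation: it uses `collections.abc` instead of `pandas.compat`, leaves out the zero-dimensional numpy array check, and uses the keyword name `strict` from this commit (the PR title refers to it as `allow_sets`).

```python
from collections.abc import Iterable, Set


def is_list_like(obj, strict=False):
    """Simplified stand-in for the check in the diff (no numpy handling)."""
    return (isinstance(obj, Iterable)
            # strings and bytes are iterable but are treated as scalars
            and not isinstance(obj, (str, bytes))
            # with strict=True, set-likes are rejected as well
            and not (strict and isinstance(obj, Set)))


print(is_list_like([1, 2]))                     # True
print(is_list_like({1, 2}))                     # True
print(is_list_like({1, 2}, strict=True))        # False
print(is_list_like(frozenset(), strict=True))   # False
print(is_list_like({'a': 1}, strict=True))      # True (a dict is not a Set)
print(is_list_like('abc'))                      # False
```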



-def is_ordered_list_like(obj):
-    """
-    Check if the object is list-like and has a defined order
+            and not (isinstance(obj, np.ndarray) and obj.ndim == 0)
+            # exclude sets if ordered_only
+            and not (strict and isinstance(obj, Set)))

-    Works like :meth:`is_list_like` but excludes sets (as well as unordered
-    `dict` before Python 3.6)
-
-    Note that iterators can not be inspected for order - this check will return
-    True but it is up to the user to make sure that their iterators are
-    generated in an ordered way.
-
-    Parameters
-    ----------
-    obj : The object to check.
-
-    Returns
-    -------
-    is_ordered_list_like : bool
-        Whether `obj` is an ordered list-like
-    """
-    list_like = is_list_like(obj)
-    unordered_dict = not PY36 and (isinstance(obj, dict)
-                                   and not isinstance(obj, OrderedDict))
-    return list_like and not unordered_dict and not isinstance(obj, Set)

 def is_array_like(obj):
     """
9 changes: 4 additions & 5 deletions pandas/core/strings.py
@@ -10,7 +10,6 @@
     is_object_dtype,
     is_string_like,
     is_list_like,
-    is_ordered_list_like,
     is_scalar,
     is_integer,
     is_re)
@@ -2084,12 +2083,12 @@ def _get_series_list(self, others, ignore_index=False):
         elif isinstance(others, np.ndarray) and others.ndim == 2:
             others = DataFrame(others, index=idx)
             return ([others[x] for x in others], False)
-        elif is_ordered_list_like(others):
+        elif is_list_like(others, strict=True):
             others = list(others) # ensure iterators do not get read twice etc

             # in case of list-like `others`, all elements must be
             # either one-dimensional list-likes or scalars
-            if all(is_ordered_list_like(x) for x in others):
+            if all(is_list_like(x, strict=True) for x in others):
                 los = []
                 join_warn = False
                 depr_warn = False
@@ -2117,7 +2116,7 @@ def _get_series_list(self, others, ignore_index=False):
                     # nested list-likes are forbidden:
                     # -> elements of nxt must not be list-like
                     is_legal = ((no_deep and nxt.dtype == object)
-                                or all(not is_ordered_list_like(x)
+                                or all(not is_list_like(x, strict=True)
                                        for x in nxt))

                     # DataFrame is false positive of is_legal
@@ -2136,7 +2135,7 @@ def _get_series_list(self, others, ignore_index=False):
                                   'deprecated and will be removed in a future '
                                   'version.', FutureWarning, stacklevel=3)
                 return (los, join_warn)
-            elif all(not is_ordered_list_like(x) for x in others):
+            elif all(not is_list_like(x, strict=True) for x in others):
                 return ([Series(others, index=idx)], False)
         raise TypeError(err_msg)
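
The branch structure in `_get_series_list` can be sketched roughly as follows. Everything here is an assumption made for illustration: `classify_others` is a hypothetical helper, the error messages are invented, `is_list_like` is a stdlib-only stand-in, and the Series, Index, DataFrame and ndarray handling of the real method is left out.

```python
from collections.abc import Iterable, Set


def is_list_like(obj, strict=False):
    # stdlib-only stand-in for the pandas helper (numpy 0-dim arrays not handled)
    return (isinstance(obj, Iterable)
            and not isinstance(obj, (str, bytes))
            and not (strict and isinstance(obj, Set)))


def classify_others(others):
    """Hypothetical helper mirroring the list-like branches shown in the diff."""
    if not is_list_like(others, strict=True):
        raise TypeError("others must be an ordered list-like")
    others = list(others)  # materialize once so iterators are not read twice
    if all(is_list_like(x, strict=True) for x in others):
        return "list of list-likes"
    if all(not is_list_like(x, strict=True) for x in others):
        return "flat list of scalars"
    raise TypeError("mixed scalars and list-likes")


print(classify_others(iter(['a', 'b'])))   # flat list of scalars
print(classify_others([['a'], ('b',)]))    # list of list-likes
```

A set passed as `others` is rejected at the first check because its iteration order carries no positional meaning.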

15 changes: 6 additions & 9 deletions pandas/tests/dtypes/test_inference.py
@@ -83,23 +83,20 @@ def test_is_list_like_fails(ll):
 @pytest.mark.parametrize(
Contributor

I would prefer to have 2 tests total to avoid the duplication of the args here (IOW 1 for allow_sets=True and 1 for False).

Contributor Author

not sure if my solution is what you had in mind, but I gave it a shot

Contributor

I didn't see the earlier version, but I don't think this is what Jeff had in mind. If we want to de-duplicate the arguments, you would need a fixture giving them

@pytest.fixture(params=...)
def maybe_list_like(request):
    return request.param

Each of the params would be a tuple like ([], True), ('2', False), and I guess something like ({}, None) or ({}, 'maybe') for set-likes.

Then we would have two tests. In the first we do

obj, expected = ...
if expected:
    expected = True

assert is_list_like(obj) is expected

and in the second

if expected is None:
    expected = False

assert is_list_like(obj, include_sets=False) is expected

Contributor

yeah, @TomAugspurger's suggestion is good here. The issue is that we can't list the args twice.
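
A runnable sketch of the fixture-based layout suggested above could look like this. It is an assumption about how the pieces fit together, not the test code that was merged: `CASES` and the two test names are hypothetical, `is_list_like` is a stdlib-only stand-in for the pandas function, and the keyword is spelled `strict` as in this commit rather than `include_sets`.

```python
from collections.abc import Iterable, Set

import pytest


def is_list_like(obj, strict=False):
    # stdlib-only stand-in for the pandas helper (no numpy handling)
    return (isinstance(obj, Iterable)
            and not isinstance(obj, (str, bytes))
            and not (strict and isinstance(obj, Set)))


# (object, expected): True/False where both modes agree, None for set-likes
CASES = [([], True), ((1, 2), True), ({'a': 1}, True), ('2', False),
         (1, False), ({1, 'a'}, None), (frozenset({1, 'a'}), None)]


@pytest.fixture(params=CASES)
def maybe_list_like(request):
    # the arguments are listed once and shared by both tests
    return request.param


def test_is_list_like(maybe_list_like):
    obj, expected = maybe_list_like
    # set-likes count as list-like by default
    expected = True if expected is None else expected
    assert is_list_like(obj) is expected


def test_is_list_like_strict(maybe_list_like):
    obj, expected = maybe_list_like
    # set-likes are rejected in strict mode
    expected = False if expected is None else expected
    assert is_list_like(obj, strict=True) is expected
```

Running the file with pytest executes both tests once per entry in `CASES`, so adding a new object to check only requires one new tuple.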

     "ll",
     [
-        [], [1], tuple(), (1, ), (1, 2), np.array([2]), OrderedDict({'a': 1}),
+        [], [1], tuple(), (1, ), (1, 2), {'a': 1}, np.array([2]),
         Series([1]), Series([]), Series(['a']).str, Index([]), Index([1]),
         DataFrame(), DataFrame([[1]]), iter([1, 2]), (x for x in [1, 2]),
         np.ndarray((2,) * 2), np.ndarray((2,) * 3), np.ndarray((2,) * 4)
     ])
-def test_is_ordered_list_like_passes(ll):
-    assert inference.is_list_like(ll)
+def test_is_list_like_strict_passes(ll):
+    assert inference.is_list_like(ll, strict=True)


 @pytest.mark.parametrize("ll", [1, '2', object(), str, np.array(2),
-                                {1, 'a'}, frozenset({1, 'a'}), {1, 'a'}])
-def test_is_ordered_like_fails(ll):
+                                {1, 'a'}, frozenset({1, 'a'})])
+def test_is_list_like_strict_fails(ll):
     # GH 23061
-    if PY36 and isinstance(ll, dict):
-        assert inference.is_ordered_list_like(ll)
-    else:
-        assert not inference.is_ordered_list_like(ll)
+    assert not inference.is_list_like(ll, strict=True)


 def test_is_array_like():