BUG: CategoricalIndex.searchsorted doesn't return a scalar if input was scalar by fjetter · Pull Request #21019 · pandas-dev/pandas · GitHub



Closed: fjetter wants to merge 6 commits
Address PR comments
fjetter committed Jun 6, 2018
commit 04ca52f01c51fc009ddee75ee79f3d87b9435b95
3 changes: 0 additions & 3 deletions doc/source/whatsnew/v0.23.0.txt
@@ -1289,9 +1289,6 @@ Indexing
- Bug in performing in-place operations on a ``DataFrame`` with a duplicate ``Index`` (:issue:`17105`)
- Bug in :meth:`IntervalIndex.get_loc` and :meth:`IntervalIndex.get_indexer` when used with an :class:`IntervalIndex` containing a single interval (:issue:`17284`, :issue:`20921`)
- Bug in ``.loc`` with a ``uint64`` indexer (:issue:`20722`)
- Bug in :func:`CategoricalIndex.searchsorted` where the method did not return a scalar when the input values was scalar (:issue:`21019`)
- Bug in :class:`CategoricalIndex` where slicing beyond the range of the data raised a KeyError (:issue:`21019`)


MultiIndex
^^^^^^^^^^
3 changes: 1 addition & 2 deletions doc/source/whatsnew/v0.23.1.txt
@@ -89,8 +89,7 @@ Indexing
- Bug in :class:`IntervalIndex` constructors where creating an ``IntervalIndex`` from categorical data was not fully supported (:issue:`21243`, issue:`21253`)
- Bug in :meth:`MultiIndex.sort_index` which was not guaranteed to sort correctly with ``level=1``; this was also causing data misalignment in particular :meth:`DataFrame.stack` operations (:issue:`20994`, :issue:`20945`, :issue:`21052`)
- Bug in :func:`CategoricalIndex.searchsorted` where the method did not return a scalar when the input values was scalar (:issue:`21019`)
Review comment (Contributor): can you move this to 0.23.2?

- Bug in :class:`CategoricalIndex` where slicing beyond the range of the data raised a KeyError (:issue:`21019`)
-
- Bug in :class:`CategoricalIndex` where slicing beyond the range of the data raised a ``KeyError`` (:issue:`21019`)

I/O
^^^
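
The searchsorted entry above describes a scalar-in, scalar-out contract. A minimal sketch of that contract follows, using the standard library's `bisect` on plain integer category codes instead of pandas; the helper name `searchsorted_sketch` and the sample codes are assumptions made for illustration and are not part of this PR.

```python
from bisect import bisect_left, bisect_right


def searchsorted_sketch(sorted_codes, value, side="left"):
    """Hypothetical helper: int for scalar input, list for list-like input."""
    find = bisect_left if side == "left" else bisect_right
    if isinstance(value, (list, tuple)):
        # List-like input keeps its shape: one insertion point per element.
        return [find(sorted_codes, v) for v in value]
    # Scalar input yields a scalar, not a one-element container.
    return find(sorted_codes, value)


# Assumed data: codes for the labels a, a, b, c, d, e with categories "abcde".
codes = [0, 0, 1, 2, 3, 4]
scalar_result = searchsorted_sketch(codes, 1)
list_result = searchsorted_sketch(codes, [1, 3])
print(scalar_result)  # 2
print(list_result)    # [2, 4]
```

Returning a bare integer for scalar input lets the result be used directly as a slice bound, which is what label-based slicing needs.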
53 changes: 27 additions & 26 deletions pandas/tests/indexing/test_categorical.py
@@ -627,38 +627,29 @@ def test_reindexing(self):
lambda: self.df2.reindex(['a'], limit=2))

def test_loc_slice(self):
# Raises KeyError since the left slice 'a' is not unique
pytest.raises(KeyError, lambda: self.df.loc["a":"b"])
result = self.df.loc["b":"c"]

expected = DataFrame(
{"A": [2, 3, 4]},
index=CategoricalIndex(
["b", "b", "c"], name="B", categories=list("cab")
),
df = DataFrame(
{"A": range(0, 6)},
index=CategoricalIndex(list("aabcde"), name="B"),
)

# slice on an unordered categorical using in-sample, connected edges
result = df.loc["b":"d"]
expected = df.iloc[2:5]
assert_frame_equal(result, expected)

ordered_df = DataFrame(
{"A": range(0, 6)},
index=CategoricalIndex(list("aabcde"), name="B", ordered=True),
)

# This should select the entire dataframe
result = ordered_df.loc["a":"e"]
assert_frame_equal(result, ordered_df)
result_iloc = ordered_df.iloc[0:6]
# Slice the entire dataframe
result = df.loc["a":"e"]
assert_frame_equal(result, df)
result_iloc = df.iloc[0:6]
assert_frame_equal(result_iloc, result)

result = ordered_df.loc["a":"b"]
expected = DataFrame(
{"A": range(0, 3)},
index=CategoricalIndex(
list("aab"), categories=list("abcde"), name="B", ordered=True
),
)
assert_frame_equal(result, expected)
# check if the result is identical to an ordinary index
df_non_cat_index = df.copy()
df_non_cat_index.index = df_non_cat_index.index.astype(str)
result = df.loc["a":"e"]
result_non_cat = df_non_cat_index.loc["a": "e"]
result.index = result.index.astype(str)
assert_frame_equal(result_non_cat, result)

@pytest.mark.parametrize(
"content",
@@ -669,6 +660,8 @@ def test_loc_beyond_edge_slicing(self, content):
"""
This test ensures that no `KeyError` is raised if trying to slice
beyond the edges of known, ordered categories.

see GH21019
"""
# This dataframe might be a slice of a larger categorical
# (i.e. more categories are known than there are in the column)
@@ -701,6 +694,14 @@ def test_loc_beyond_edge_slicing(self, content):
# If the category is not known, there is nothing we can do
ordered_df.loc["a":"z"]

unordered_df = ordered_df.copy()
unordered_df.index = unordered_df.index.as_unordered()
with pytest.raises(KeyError):
# This operation previously succeeded for an ordered index. Since
# this index is no longer ordered, we cannot perform out of range
# slicing / searchsorted
unordered_df.loc["a": "d"]

def test_boolean_selection(self):

df3 = self.df3
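
The behaviour exercised by test_loc_beyond_edge_slicing can be illustrated without pandas. The sketch below is an assumption-laden model, not the pandas implementation: the helper `slice_bounds_sketch`, its parameters, and the sample data are all hypothetical. It shows why a known category that is absent from the data can still bound a slice when the categories are ordered, while an unknown category, or an absent label on an unordered index, raises KeyError.

```python
from bisect import bisect_left, bisect_right


def slice_bounds_sketch(values, categories, start, stop, ordered=True):
    """Hypothetical helper: positional bounds for an inclusive label slice."""
    rank = {label: pos for pos, label in enumerate(categories)}
    for label in (start, stop):
        if label not in rank:
            # Unknown category: nothing to compare against.
            raise KeyError(label)
        if not ordered and label not in values:
            # Without an ordering, an absent label cannot be located.
            raise KeyError(label)
    # Assumes `values` is already sorted by category rank.
    codes = [rank[v] for v in values]
    return bisect_left(codes, rank[start]), bisect_right(codes, rank[stop])


# Assumed data: more categories are known than appear in the values.
categories = list("abcde")
values = list("bbcd")

full = slice_bounds_sketch(values, categories, "a", "e")
tail = slice_bounds_sketch(values, categories, "c", "e")
print(values[full[0]:full[1]])  # ['b', 'b', 'c', 'd']
print(values[tail[0]:tail[1]])  # ['c', 'd']
```

Calling the helper with stop="z", or with ordered=False and start="a", raises KeyError in this model, mirroring the two failure cases in the test.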