DataFrameGroupby.boxplot fails when subplots=False by charlesdong1991 · Pull Request #28102 · pandas-dev/pandas · GitHub

DataFrameGroupby.boxplot fails when subplots=False #28102

Merged
merged 30 commits into from
Nov 8, 2020
Commits
7e461a1
remove \n from docstring
charlesdong1991 Dec 3, 2018
1314059
fix conflicts
charlesdong1991 Jan 19, 2019
8bcb313
Merge remote-tracking branch 'upstream/master'
charlesdong1991 Jul 30, 2019
a30fd5c
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Aug 22, 2019
1d0ac65
Fix issue 16748
charlesdong1991 Aug 22, 2019
af41084
Code change based on review
charlesdong1991 Aug 23, 2019
193eb2c
Fix import sort linting
charlesdong1991 Aug 23, 2019
db214b6
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Aug 23, 2019
dfc72b2
Skip the failing test
charlesdong1991 Aug 24, 2019
6cd2d28
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Aug 27, 2019
5c69d10
Remove skip
charlesdong1991 Aug 30, 2019
c08c278
remove imports
charlesdong1991 Aug 30, 2019
1df91da
More careful change
charlesdong1991 Aug 31, 2019
24c5d93
fix conflict
charlesdong1991 Sep 5, 2019
9cce7f7
keep the change
charlesdong1991 Sep 9, 2019
0df3670
fix conflict and merge master
charlesdong1991 Oct 19, 2019
2f99b67
update solution
charlesdong1991 Oct 19, 2019
3819f85
update test
charlesdong1991 Oct 19, 2019
3cbba0f
fix test
charlesdong1991 Oct 20, 2019
51a326f
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Dec 28, 2019
011218b
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Dec 30, 2019
c786a55
much better solution
charlesdong1991 Oct 3, 2020
a1d84b9
format
charlesdong1991 Oct 3, 2020
4eecae8
typo
charlesdong1991 Oct 3, 2020
b6d1b4c
whatsnew
charlesdong1991 Oct 3, 2020
8ab2db3
commit one more
charlesdong1991 Oct 3, 2020
9f5caaa
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Oct 13, 2020
6bf3914
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Oct 26, 2020
a2884e7
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Oct 29, 2020
067323d
Merge remote-tracking branch 'upstream/master' into fix_issue_16748
charlesdong1991 Nov 4, 2020
much better solution
charlesdong1991 committed Oct 3, 2020
commit c786a5501e73bb3abc4dc43f0458b3c3633de01d
3 changes: 1 addition & 2 deletions doc/source/whatsnew/v1.0.0.rst
@@ -1195,8 +1195,7 @@ Plotting
 - Bug in :meth:`DataFrame.plot` producing incorrect legend markers when plotting multiple series on the same axis (:issue:`18222`)
 - Bug in :meth:`DataFrame.plot` when ``kind='box'`` and data contains datetime or timedelta data. These types are now automatically dropped (:issue:`22799`)
 - Bug in :meth:`DataFrame.plot.line` and :meth:`DataFrame.plot.area` produce wrong xlim in x-axis (:issue:`27686`, :issue:`25160`, :issue:`24784`)
-- Bug in :meth:`DataFrame.groupby.boxplot` when ``subplots=False``, a KeyError would raise (:issue:`16748`)
-- Bug where :meth:`DataFrame.boxplot` would not accept a `color` parameter like `DataFrame.plot.box` (:issue:`26214`)
+- Bug where :meth:`DataFrame.boxplot` would not accept a ``color`` parameter like :meth:`DataFrame.plot.box` (:issue:`26214`)
 - Bug in the ``xticks`` argument being ignored for :meth:`DataFrame.plot.bar` (:issue:`14119`)
 - :func:`set_option` now validates that the plot backend provided to ``'plotting.backend'`` implements the backend when the option is set, rather than when a plot is created (:issue:`28163`)
 - :meth:`DataFrame.plot` now allow a ``backend`` keyword argument to allow changing between backends in one session (:issue:`28619`).
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -393,6 +393,8 @@ Plotting
 - Bug in :meth:`DataFrame.plot` was rotating xticklabels when ``subplots=True``, even if the x-axis wasn't an irregular time series (:issue:`29460`)
 - Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes causes a ``ValueError`` (:issue:`21003`)
 - Twinned axes were losing their tick labels which should only happen to all but the last row or column of 'externally' shared axes (:issue:`33819`)
+- Bug in :meth:`DataFrame.groupby.boxplot` when ``subplots=False``, a KeyError would raise (:issue:`16748`)
+

 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
29 changes: 12 additions & 17 deletions pandas/plotting/_matplotlib/boxplot.py
@@ -9,6 +9,7 @@
 from pandas.core.dtypes.missing import remove_na_arraylike

 import pandas as pd
+import pandas.core.common as com

 from pandas.io.formats.printing import pprint_thing
 from pandas.plotting._matplotlib.core import LinePlot, MPLPlot
@@ -356,24 +357,10 @@ def plot_group(keys, values, ax: "Axes"):
                 ax = plt.gca()
         data = data._get_numeric_data()

-        if columns:
-
-            # this is to align the current behavior of boxplot on columns, so if users
-            # explicitly specify column names and it is in df columns, then take the
-            # subset; Or if it is columns is in the first level, this is set mainly to
-            # avoid API change
-            if set(columns).issubset(data.columns) or set(columns).issubset(
-                data.columns.levels[0]
-            ):
-                data = data[columns]
-            else:
-
-                # this loop is set for groupby situation, because user specified column
-                # will go to the second level of column index
-                data = data.loc[:, pd.IndexSlice[:, columns]]
-                columns = data.columns
-        else:
+        if column is None:
             columns = data.columns
+        else:
+            data = data[columns]

         result = plot_group(columns, data.values.T, ax)
         ax.grid(grid)
@@ -458,6 +445,14 @@ def boxplot_frame_groupby(
                 df = frames[0].join(frames[1::])
             else:
                 df = frames[0]
+
+        # GH 16748, DataFrameGroupby fails when subplots=False, and in this case, since
+        # `df` here becomes MI after groupby, so we need to couple the keys (grouped
+        # values) and column (original df column) together to search for subset to plot
+        if column is not None:
+            column = com.convert_to_list_like(column)
+            multi_key = pd.MultiIndex.from_product([keys, column])
+            column = list(multi_key.values)
         ret = df.boxplot(
             column=column,
             fontsize=fontsize,
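
The key-coupling step described in the GH 16748 comment can be tried in isolation. The following is a minimal sketch, not the pandas implementation: the toy frame and the names `cat`, `x`, `y`, `wide`, `selected` and `subset` are hypothetical, and `pd.concat(..., keys=keys, axis=1)` is assumed here as one way to obtain the MultiIndex columns that the comment refers to. It only shows how pairing each group key with each requested column produces tuples that select the matching columns.

```python
import pandas as pd

# Hypothetical toy data; "cat", "x" and "y" are illustrative names, not from the PR.
df = pd.DataFrame(
    {
        "cat": ["a", "a", "b", "b"],
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [5.0, 6.0, 7.0, 8.0],
    }
)

# Split into per-group frames and drop the grouping column for clarity.
keys, frames = zip(*df.groupby("cat"))
frames = [frame.drop(columns="cat") for frame in frames]

# Side-by-side concatenation yields MultiIndex columns of (group key, column).
wide = pd.concat(frames, keys=keys, axis=1)

# Couple every group key with every requested column, mirroring the hunk above.
column = ["x"]
multi_key = pd.MultiIndex.from_product([keys, column])
selected = list(multi_key.values)

# The resulting tuples can be used directly to take the subset to plot.
subset = wide[selected]
print(selected)
print(subset.columns.tolist())
```

Passing the plain label `"x"` would not match any top-level key of `wide`, whose first level holds the group keys; the tuples built by `from_product` address the second level explicitly.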
18 changes: 4 additions & 14 deletions pandas/tests/plotting/test_boxplot_method.py
@@ -490,25 +490,15 @@ def test_groupby_boxplot_subplots_false(self, col, expected_xticklabel):
         )
         grouped = df.groupby("cat")

-        # check is boxplot with subplots=False works
         axes = _check_plot_works(
             grouped.boxplot, subplots=False, column=col, return_type="axes"
         )

-        # check if xticks labels are plotted correctly
         result_xticklabel = [x.get_text() for x in axes.get_xticklabels()]
         assert expected_xticklabel == result_xticklabel

-    @pytest.mark.parametrize(
-        "col, expected_xticklabel",
-        [
-            ([("bar", "one"), ("bar", "two")], ["(bar, one)", "(bar, two)"]),
-            ("bar", ["bar", ""]),
-            (["two"], ["(bar, two)", "(baz, two)", "(foo, two)", "(qux, two)"]),
-        ],
-    )
-    def test_boxplot_multiindex_column(self, col, expected_xticklabel):
-        # this is test the boxplot on multi-index column cases
+    def test_boxplot_multiindex_column(self):
+        # GH 16748
         arrays = [
             ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
             ["one", "two", "one", "two", "one", "two", "one", "two"],
@@ -517,9 +507,9 @@ def test_boxplot_multiindex_column(self, col, expected_xticklabel):
         index = MultiIndex.from_tuples(tuples, names=["first", "second"])
         df = DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index)

-        # check if df.boxplot works
+        col = [("bar", "one"), ("bar", "two")]
         axes = _check_plot_works(df.boxplot, column=col, return_type="axes")

-        # check if xticks labels are plotted correctly
+        expected_xticklabel = ["(bar, one)", "(bar, two)"]
         result_xticklabel = [x.get_text() for x in axes.get_xticklabels()]
         assert expected_xticklabel == result_xticklabel