BUG: Inconsistent return type for downsampling on resample of empty DataFrame by discort · Pull Request #15093 · pandas-dev/pandas · GitHub

BUG: Inconsistent return type for downsampling on resample of empty DataFrame #15093


Merged
merged 6 commits into from
Jun 13, 2017
Changes from 1 commit
added explicit 'size' method and defined logic there
discort committed Jun 13, 2017
commit 37ba820ab2b59e258a9412cb7c0cf8d9723bd088
1 change: 0 additions & 1 deletion doc/source/whatsnew/v0.19.0.txt
@@ -1563,4 +1563,3 @@ Bug Fixes
 - ``PeriodIndex`` can now accept ``list`` and ``array`` which contains ``pd.NaT`` (:issue:`13430`)
 - Bug in ``df.groupby`` where ``.median()`` returns arbitrary values if grouped dataframe contains empty bins (:issue:`13629`)
 - Bug in ``Index.copy()`` where ``name`` parameter was ignored (:issue:`14302`)
Review comment (Contributor):

Move this to 0.20.0.

Give a user-friendly function here; _downsample is internal.

Review comment (Contributor):

Revert this file.

-- Bug in ``_downsample()``. Inconsistent return type on resample of empty DataFrame (:issue:`14962`)
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.20.0.txt
@@ -1685,6 +1685,7 @@ Groupby/Resample/Rolling
 - Bug in ``.rolling()`` where ``pd.Timedelta`` or ``datetime.timedelta`` was not accepted as a ``window`` argument (:issue:`15440`)
 - Bug in ``Rolling.quantile`` function that caused a segmentation fault when called with a quantile value outside of the range [0, 1] (:issue:`15463`)
 - Bug in ``DataFrame.resample().median()`` if duplicate column names are present (:issue:`14233`)
+- Bug in ``resample().size()``. Inconsistent return type on resample of empty DataFrame (:issue:`14962`)

 Sparse
 ^^^^^^
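To illustrate the behaviour the 0.20.0 entry describes, here is a minimal sketch. It assumes an installed pandas that already contains this fix, and the `'D'` frequency and variable names are illustrative choices, not taken from the PR. On an empty DataFrame, `resample().size()` is expected to give back an empty int64 Series rather than a DataFrame:

```python
import pandas as pd

# An empty DataFrame on an empty DatetimeIndex (illustrative setup).
empty_df = pd.DataFrame(index=pd.DatetimeIndex([]))

# Assumption: the fix for GH14962 is present in the installed pandas.
sizes = empty_df.resample("D").size()

# Report the container type, dtype and length of the result.
print(type(sizes).__name__, sizes.dtype, len(sizes))
```

Other downsampling methods on the same empty frame keep returning a DataFrame, which is why `size` needs its own handling.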
20 changes: 9 additions & 11 deletions pandas/core/resample.py
@@ -17,6 +17,7 @@
 from pandas.core.indexes.period import PeriodIndex, period_range
 import pandas.core.common as com
 import pandas.core.algorithms as algos
+from pandas.types.generic import ABCDataFrame

 import pandas.compat as compat
 from pandas.compat.numpy import function as nv
@@ -549,7 +550,13 @@ def var(self, ddof=1, *args, **kwargs):
         nv.validate_resampler_func('var', args, kwargs)
         return self._downsample('var', ddof=ddof)


+    @Appender(GroupBy.size.__doc__)
+    def size(self):
+        # It 'seems' special and needs extra handling. GH14962
Review comment (Contributor):

Write a better comment here. Say it's a special case, as the higher level returns a copy of 0-len objects.

+        result = self._downsample('size')
+        if not len(self.ax) and isinstance(self._selected_obj, ABCDataFrame):
+            result = pd.Series([], index=result.index, dtype='int64')
+        return result
 Resampler._deprecated_valids += dir(Resampler)

 # downsample methods
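The special case in the new `size` method can be sketched outside the class as follows. This is only an illustration under stated assumptions: `normalize_size_result` and its `axis_length` parameter are hypothetical names that do not exist in pandas, and the public `pd.DataFrame` check stands in for the internal `ABCDataFrame`.

```python
import pandas as pd


def normalize_size_result(result, axis_length):
    """Hypothetical helper, not part of pandas: mirrors the special case above."""
    # The generic downsample path hands back a copy of a 0-length object,
    # so an empty DataFrame would stay a DataFrame without this branch.
    if axis_length == 0 and isinstance(result, pd.DataFrame):
        return pd.Series([], index=result.index, dtype="int64")
    return result


# Empty frame: converted to an empty int64 Series on the same index.
empty_frame = pd.DataFrame(index=pd.DatetimeIndex([]))
normalized = normalize_size_result(empty_frame, axis_length=0)

# Non-empty Series: passed through unchanged.
counts = pd.Series([2, 1], index=pd.date_range("2017-01-01", periods=2))
untouched = normalize_size_result(counts, axis_length=2)
```

Keeping the branch in `size` itself, rather than in `_wrap_result`, limits the conversion to the one method whose result should always be a Series.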
@@ -563,8 +570,7 @@ def f(self, _method=method, *args, **kwargs):
     setattr(Resampler, method, f)

 # groupby & aggregate methods
-for method in ['count', 'size']:

+for method in ['count']:
     def f(self, _method=method):
         return self._downsample(_method)
     f.__doc__ = getattr(GroupBy, method).__doc__
@@ -770,14 +776,6 @@ def _wrap_result(self, result):
         if self.kind == 'period' and not isinstance(result.index, PeriodIndex):
             result.index = result.index.to_period(self.freq)

-        # Make consistent type of result. GH14962
-        if not len(self.ax):
-            grouper = BinGrouper([], result.index)
-            grouped = self._selected_obj.groupby(grouper)
-            result = pd.Series([],
-                               index=result.index,
-                               name=grouped.name,
-                               dtype='int64')
         return result


21 changes: 9 additions & 12 deletions pandas/tests/test_resample.py
@@ -796,15 +796,22 @@ def test_resample_empty_dataframe(self):
             methods = downsample_methods + upsample_methods
             for method in methods:
                 result = getattr(f.resample(freq), method)()

-                expected = pd.Series([])
+                if method != 'size':
+                    expected = f.copy()
+                    assert_equal = assert_frame_equal
+                else:
+                    # GH14962
+                    expected = Series([])
+                    assert_equal = assert_series_equal

                 expected.index = f.index._shallow_copy(freq=freq)
                 assert_index_equal(result.index, expected.index)
                 assert result.index.freq == expected.index.freq
                 assert_frame_equal(result, expected, check_dtype=False)

             # test size for GH13212 (currently stays as df)


     def test_resample_empty_dtypes(self):

         # Empty series were sometimes causing a segfault (for the functions
@@ -858,16 +865,6 @@ def test_resample_loffset_arg_type(self):
                 assert_frame_equal(result_agg, expected)
                 assert_frame_equal(result_how, expected)

-    def test_resample_empty_dataframe_with_size(self):
-        # GH 14962
-        index = pd.DatetimeIndex([], freq='M')
-        df = pd.DataFrame([], index=index)
-
-        for freq in ['M', 'D', 'H']:
-            result = df.resample(freq).size()
-            expected = pd.Series([], index=index, dtype='int64')
-            assert_series_equal(result, expected)


 class TestDatetimeIndex(Base):
     _index_factory = lambda x: date_range
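The test change above picks a comparison function per method. A standalone sketch of that pattern, assuming a recent pandas that includes the fix, might look like the following. The `pandas.testing` import path, the choice of `sum` as the representative non-`size` method, the `'D'` frequency, and the use of `check_freq=False` (instead of the private `_shallow_copy` call in the diff) are all assumptions made for illustration.

```python
import pandas as pd
from pandas.testing import assert_frame_equal, assert_series_equal

index = pd.DatetimeIndex([])
frame = pd.DataFrame(index=index)

checked = []
for method in ["sum", "size"]:
    result = getattr(frame.resample("D"), method)()

    if method != "size":
        # Most methods hand back an empty DataFrame.
        expected = frame.copy()
        assert_equal = assert_frame_equal
    else:
        # GH14962: size is expected to be an empty int64 Series.
        expected = pd.Series([], index=index, dtype="int64")
        assert_equal = assert_series_equal

    # Use the comparison chosen above; the index frequency is ignored here.
    assert_equal(result, expected, check_dtype=False, check_freq=False)
    checked.append(method)
```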