API: provide Rolling/Expanding/EWM objects for deferred rolling type calculations #10702 by jreback · Pull Request #11603 · pandas-dev/pandas · GitHub


Merged 8 commits on Dec 19, 2015
DEPR: deprecate freq/how arguments to window functions
jreback committed Dec 19, 2015
commit 05eb20f3657c72bbf3caa61409e1c243ec76412f
11 changes: 6 additions & 5 deletions doc/source/computation.rst
@@ -207,7 +207,6 @@ Window Functions
functions and are now deprecated and replaced by the corresponding method call.

The deprecation warning will show the new syntax, see an example :ref:`here <whatsnew_0180.window_deprecations>`

You can view the previous documentation
`here <http://pandas.pydata.org/pandas-docs/version/0.17.1/computation.html#moving-rolling-statistics-moments>`__

@@ -244,8 +243,12 @@ accept the following arguments:
- ``window``: size of moving window
- ``min_periods``: threshold of non-null data points to require (otherwise
result is NA)
- ``freq``: optionally specify a :ref:`frequency string <timeseries.alias>`
or :ref:`DateOffset <timeseries.offsets>` to pre-conform the data to.

.. warning::

The ``freq`` and ``how`` arguments were in the API prior to 0.18.0 changes. These are deprecated in the new API. You can simply resample the input prior to creating a window function.

For example, instead of ``s.rolling(window=5,freq='D').max()`` to get the max value on a rolling 5 Day window, one could use ``s.resample('D',how='max').rolling(window=5).max()``, which first resamples the data to daily data, then provides a rolling 5 day window.
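The resample-then-roll pattern described above can be sketched as follows. This is a minimal illustration, assuming a recent pandas release in which ``.resample('D')`` returns a deferred object and the aggregation is a method call (``.max()``) rather than the ``how='max'`` keyword used in the text. The series ``s`` and its values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one observation every 12 hours for 10 days.
idx = pd.date_range("2015-01-01", periods=20, freq=pd.Timedelta(hours=12))
s = pd.Series(np.arange(20, dtype="float64"), index=idx)

# Step 1: conform the data to daily frequency, keeping each day's maximum.
daily = s.resample("D").max()

# Step 2: take a rolling maximum over a window of 5 daily observations.
result = daily.rolling(window=5).max()
```

Because the window is a count of observations, it only corresponds to "5 days" once the data has been conformed to daily frequency in step 1.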

We can then call methods on these ``rolling`` objects. These return like-indexed objects:

@@ -604,8 +607,6 @@ all accept are:
- ``min_periods``: threshold of non-null data points to require. Defaults to
minimum needed to compute statistic. No ``NaNs`` will be output once
``min_periods`` non-null data points have been seen.
- ``freq``: optionally specify a :ref:`frequency string <timeseries.alias>`
or :ref:`DateOffset <timeseries.offsets>` to pre-conform the data to.

.. note::

6 changes: 5 additions & 1 deletion doc/source/whatsnew/v0.18.0.txt
@@ -13,6 +13,8 @@ users upgrade to this version.

Highlights include:

- Window functions are now methods on ``.groupby`` like objects, see :ref:`here <whatsnew_0180.moments>`.

Check the :ref:`API Changes <whatsnew_0180.api>` and :ref:`deprecations <whatsnew_0180.deprecations>` before updating.

.. contents:: What's new in v0.18.0
@@ -212,7 +214,7 @@ Deprecations

.. _whatsnew_0180.window_deprecations:

- Function ``pd.rolling_*``, ``pd.expanding_*``, and ``pd.ewm*`` are deprecated and replaced by the corresponding method call. Note that
- The functions ``pd.rolling_*``, ``pd.expanding_*``, and ``pd.ewm*`` are deprecated and replaced by the corresponding method call. Note that
the new suggested syntax includes all of the arguments (even if default) (:issue:`11603`)

.. code-block:: python
@@ -237,6 +239,8 @@ Deprecations
2 0.5
dtype: float64

- The ``freq`` and ``how`` arguments to the ``.rolling``, ``.expanding``, and ``.ewm`` (new) functions are deprecated, and will be removed in a future version. (:issue:`11603`)
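The method-call syntax that replaces ``pd.rolling_*``, ``pd.expanding_*``, and ``pd.ewm*`` can be sketched as below. The data and the parameter values (``window=2``, ``span=3``) are illustrative assumptions, and the deprecated function forms are deliberately not called, since they may not exist in the installed pandas version.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0])  # illustrative data

# Each accessor returns a deferred window object; the statistic is then
# requested as a method on that object.
rolling_mean = s.rolling(window=2, min_periods=None, center=False).mean()
expanding_sum = s.expanding(min_periods=1).sum()
ewm_mean = s.ewm(span=3, adjust=True).mean()
```

Each result is indexed like the input series.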

.. _whatsnew_0180.prior_deprecations:

Removal of prior version deprecations/changes
9 changes: 2 additions & 7 deletions pandas/core/base.py
@@ -462,13 +462,8 @@ def _aggregate_multiple_funcs(self, arg, _level):
colg = self._gotitem(obj.name, ndim=1, subset=obj)
results.append(colg.aggregate(a))

# find a good name, this could be a function that we don't recognize
name = self._is_cython_func(a) or a
if not isinstance(name, compat.string_types):
name = getattr(a,'name',a)
if not isinstance(name, compat.string_types):
name = getattr(a,'__name__',a)

# make sure we find a good name
name = com._get_callable_name(a) or a
keys.append(name)
except (TypeError, DataError):
pass
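For readers following the ``base.py`` hunk, the name lookup that the removed lines performed inline can be sketched as a standalone function. This is a hypothetical helper (``resolve_name`` is not a pandas function) that mirrors only the ``name`` / ``__name__`` fallback visible in the removed lines. It is not the implementation of ``com._get_callable_name``, which the diff does not show, and it omits the ``_is_cython_func`` step.

```python
def resolve_name(func):
    """Hypothetical helper: return a string label for an aggregation argument."""
    # A string such as 'sum' is already a usable label.
    if isinstance(func, str):
        return func
    # Otherwise try the attributes the removed code checked, in the same order.
    for attr in ("name", "__name__"):
        candidate = getattr(func, attr, None)
        if isinstance(candidate, str):
            return candidate
    # Fall back to the object itself, as the `or a` in the diff does.
    return func


labels = [resolve_name(f) for f in ("sum", len, max)]
```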
76 changes: 49 additions & 27 deletions pandas/core/window.py
@@ -7,6 +7,7 @@
"""
from __future__ import division

import warnings
import numpy as np
from functools import wraps
from collections import defaultdict
@@ -39,6 +40,12 @@ class _Window(PandasObject, SelectionMixin):

def __init__(self, obj, window=None, min_periods=None, freq=None, center=False,
win_type=None, axis=0):

if freq is not None:
warnings.warn("The freq kw is deprecated and will be removed in a future version. You can resample prior "
"to passing to a window function",
FutureWarning, stacklevel=3)

self.blocks = []
self.obj = obj
self.window = window
@@ -298,7 +305,7 @@ def _apply_window(self, mean=True, how=None, **kwargs):
----------
mean : boolean, default True
If True computes weighted mean, else weighted sum
how : string, default to None
how : string, default to None (DEPRECATED)
how to resample

Returns
@@ -378,7 +385,7 @@ def _apply(self, func, window=None, center=None, check_minp=None, how=None, **kw
window : int/array, default to _get_window()
center : boolean, default to self.center
check_minp : function, default to _use_window
how : string, default to None
how : string, default to None (DEPRECATED)
how to resample

Returns
@@ -486,21 +493,31 @@ def sum(self):

Parameters
----------
how : string, default max
how : string, default 'max' (DEPRECATED)
Method for down- or re-sampling""")

def max(self, how='max'):
def max(self, how=None):
if how is not None:
warnings.warn("The how kw argument is deprecated and will be removed in a future version. You can resample prior "
"to passing to a window function",
FutureWarning, stacklevel=3)
else:
how = 'max'
return self._apply('roll_max', how=how)

_shared_docs['min'] = dedent("""
%(name)s minimum

Parameters
----------
how : string, default min
how : string, default 'min' (DEPRECATED)
Method for down- or re-sampling""")

def min(self, how='min'):
def min(self, how=None):
if how is not None:
warnings.warn("The how kw argument is deprecated and will be removed in a future version. You can resample prior "
"to passing to a window function",
FutureWarning, stacklevel=3)
else:
how = 'min'
return self._apply('roll_min', how=how)

_shared_docs['mean'] = """%(name)s mean"""
@@ -512,10 +529,15 @@ def mean(self):

Parameters
----------
how : string, default median
how : string, default 'median' (DEPRECATED)
Method for down- or re-sampling""")

def median(self, how='median'):
def median(self, how=None):
if how is not None:
warnings.warn("The how kw argument is deprecated and will be removed in a future version. You can resample prior "
"to passing to a window function",
FutureWarning, stacklevel=3)
else:
how = 'median'
return self._apply('roll_median_c', how=how)

_shared_docs['std'] = dedent("""
@@ -654,7 +676,7 @@ class Rolling(_Rolling_and_Expanding):
min_periods : int, default None
Minimum number of observations in window required to have a value
(otherwise result is NA).
freq : string or DateOffset object, optional (default None)
freq : string or DateOffset object, optional (default None) (DEPRECATED)
Frequency to conform the data to before computing the statistic. Specified
as a frequency string or DateOffset object.
center : boolean, default False
@@ -704,14 +726,14 @@ def sum(self):
@Substitution(name='rolling')
@Appender(_doc_template)
@Appender(_shared_docs['max'])
def max(self, how='max'):
return super(Rolling, self).max(how=how)
def max(self, **kwargs):
return super(Rolling, self).max(**kwargs)

@Substitution(name='rolling')
@Appender(_doc_template)
@Appender(_shared_docs['min'])
def min(self, how='min'):
return super(Rolling, self).min(how=how)
def min(self, **kwargs):
return super(Rolling, self).min(**kwargs)

@Substitution(name='rolling')
@Appender(_doc_template)
@@ -722,8 +744,8 @@ def mean(self):
@Substitution(name='rolling')
@Appender(_doc_template)
@Appender(_shared_docs['median'])
def median(self, how='median'):
return super(Rolling, self).median(how=how)
def median(self, **kwargs):
return super(Rolling, self).median(**kwargs)

@Substitution(name='rolling')
@Appender(_doc_template)
@@ -778,7 +800,7 @@ class Expanding(_Rolling_and_Expanding):
min_periods : int, default None
Minimum number of observations in window required to have a value
(otherwise result is NA).
freq : string or DateOffset object, optional (default None)
freq : string or DateOffset object, optional (default None) (DEPRECATED)
Frequency to conform the data to before computing the statistic. Specified
as a frequency string or DateOffset object.
center : boolean, default False
@@ -843,14 +865,14 @@ def sum(self):
@Substitution(name='expanding')
@Appender(_doc_template)
@Appender(_shared_docs['max'])
def max(self, how='max'):
return super(Expanding, self).max(how=how)
def max(self, **kwargs):
return super(Expanding, self).max(**kwargs)

@Substitution(name='expanding')
@Appender(_doc_template)
@Appender(_shared_docs['min'])
def min(self, how='min'):
return super(Expanding, self).min(how=how)
def min(self, **kwargs):
return super(Expanding, self).min(**kwargs)

@Substitution(name='expanding')
@Appender(_doc_template)
@@ -861,8 +883,8 @@ def mean(self):
@Substitution(name='expanding')
@Appender(_doc_template)
@Appender(_shared_docs['median'])
def median(self, how='median'):
return super(Expanding, self).median(how=how)
def median(self, **kwargs):
return super(Expanding, self).median(**kwargs)

@Substitution(name='expanding')
@Appender(_doc_template)
@@ -923,7 +945,7 @@ class EWM(_Rolling):
min_periods : int, default 0
Minimum number of observations in window required to have a value
(otherwise result is NA).
freq : None or string alias / date offset object, default=None
freq : None or string alias / date offset object, default=None (DEPRECATED)
Frequency to conform to before computing statistic
adjust : boolean, default True
Divide by decaying adjustment factor in beginning periods to account for
@@ -1004,7 +1026,7 @@ def _apply(self, func, how=None, **kwargs):
Parameters
----------
func : string/callable to apply
how : string, default to None
how : string, default to None (DEPRECATED)
how to resample

Returns
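The ``window.py`` hunks above change the default of ``how`` from a concrete value to ``None`` so that only an explicit argument triggers the warning. A minimal sketch of that sentinel pattern follows, using a hypothetical ``WindowSketch`` class rather than the pandas classes. The ``stacklevel=2`` value is an assumption suited to this flat example (the diff uses ``stacklevel=3`` because of the extra ``super()`` call layer).

```python
import warnings


class WindowSketch:
    """Hypothetical stand-in for a window object; not the pandas class."""

    def max(self, how=None):
        # None means "the caller did not pass how", so only explicit use warns.
        if how is not None:
            warnings.warn(
                "The how kw argument is deprecated and will be removed in a "
                "future version. You can resample prior to passing to a "
                "window function",
                FutureWarning,
                stacklevel=2,
            )
        else:
            how = "max"  # restore the previous default silently
        return how  # the real method would dispatch to the computation here


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_how = WindowSketch().max()
    explicit_how = WindowSketch().max(how="median")
```

Only the second call adds an entry to ``caught``; the default call keeps its old behaviour without warning.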
55 changes: 33 additions & 22 deletions pandas/tests/test_window.py
@@ -191,6 +191,15 @@ def b(x):
result = r.aggregate([a,b])
assert_frame_equal(result, expected)

def test_preserve_metadata(self):
# GH 10565
s = Series(np.arange(100), name='foo')

s2 = s.rolling(30).sum()
s3 = s.rolling(20).sum()
self.assertEqual(s2.name, 'foo')
self.assertEqual(s3.name, 'foo')

class TestDeprecations(Base):
""" test that we are catching deprecation warnings """

@@ -815,10 +824,15 @@ def get_result(obj, window, min_periods=None, freq=None, center=False):

# check via the API calls if name is provided
if name is not None:
return getattr(obj.rolling(window=window,
min_periods=min_periods,
freq=freq,
center=center),name)(**kwargs)

# catch a freq deprecation warning if freq is provided and not None
w = FutureWarning if freq is not None else None
with tm.assert_produces_warning(w, check_stacklevel=False):
r = obj.rolling(window=window,
min_periods=min_periods,
freq=freq,
center=center)
return getattr(r,name)(**kwargs)

# check via the moments API
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
@@ -1002,15 +1016,6 @@ def test_ewma_halflife_arg(self):
self.assertRaises(Exception, mom.ewma, self.arr, com=9.5, span=20, halflife=50)
self.assertRaises(Exception, mom.ewma, self.arr)

def test_moment_preserve_series_name(self):
# GH 10565
s = Series(np.arange(100), name='foo')

s2 = s.rolling(30).sum()
s3 = s.rolling(20).sum()
self.assertEqual(s2.name, 'foo')
self.assertEqual(s3.name, 'foo')

def test_ew_empty_arrays(self):
arr = np.array([], dtype=np.float64)

@@ -2133,7 +2138,8 @@ def test_rolling_max_gh6297(self):
expected = Series([1.0, 2.0, 6.0, 4.0, 5.0],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').max()
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
x = series.rolling(window=1, freq='D').max()
assert_series_equal(expected, x)

def test_rolling_max_how_resample(self):
@@ -2152,22 +2158,25 @@ def test_rolling_max_how_resample(self):
expected = Series([0.0, 1.0, 2.0, 3.0, 20.0],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').max()
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
x = series.rolling(window=1, freq='D').max()
assert_series_equal(expected, x)

# Now specify median (10.0)
expected = Series([0.0, 1.0, 2.0, 3.0, 10.0],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').max(how='median')
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
x = series.rolling(window=1, freq='D').max(how='median')
assert_series_equal(expected, x)

# Now specify mean (4+10+20)/3
v = (4.0+10.0+20.0)/3.0
expected = Series([0.0, 1.0, 2.0, 3.0, v],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').max(how='mean')
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
x = series.rolling(window=1, freq='D').max(how='mean')
assert_series_equal(expected, x)


@@ -2187,8 +2196,9 @@ def test_rolling_min_how_resample(self):
expected = Series([0.0, 1.0, 2.0, 3.0, 4.0],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').min()
assert_series_equal(expected, x)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
r = series.rolling(window=1, freq='D')
assert_series_equal(expected, r.min())

def test_rolling_median_how_resample(self):

@@ -2206,14 +2216,15 @@ def test_rolling_median_how_resample(self):
expected = Series([0.0, 1.0, 2.0, 3.0, 10],
index=[datetime(1975, 1, i, 0)
for i in range(1, 6)])
x = series.rolling(window=1, freq='D').median()
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
x = series.rolling(window=1, freq='D').median()
assert_series_equal(expected, x)

def test_rolling_median_memory_error(self):
# GH11722
n = 20000
mom.rolling_median(Series(np.random.randn(n)), window=2, center=False)
mom.rolling_median(Series(np.random.randn(n)), window=2, center=False)
Series(np.random.randn(n)).rolling(window=2, center=False).median()
Series(np.random.randn(n)).rolling(window=2, center=False).median()

if __name__ == '__main__':
import nose
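The test changes above expect a ``FutureWarning`` only when ``freq`` is passed (``w = FutureWarning if freq is not None else None``). The same conditional check can be sketched with the standard library alone. ``expect_warning`` and ``make_window`` are hypothetical names introduced for this illustration; they are not the pandas ``tm.assert_produces_warning`` helper or the ``rolling`` constructor.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def expect_warning(category):
    """Hypothetical helper: require `category`, or no warning at all if None."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        yield
    seen = [w.category for w in caught]
    if category is None:
        assert not seen, "unexpected warnings: %r" % seen
    else:
        assert any(issubclass(c, category) for c in seen), (
            "no %s raised" % category.__name__
        )


def make_window(freq=None):
    # Hypothetical stand-in for obj.rolling(..., freq=freq).
    if freq is not None:
        warnings.warn("The freq kw is deprecated", FutureWarning, stacklevel=2)
    return freq


# Mirror the test logic: the expected warning depends on whether freq is given.
for freq in (None, "D"):
    expected = FutureWarning if freq is not None else None
    with expect_warning(expected):
        make_window(freq=freq)
```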