-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
DOC: Fix pandas.Series.resample docstring #23197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
2c23296
7940cbe
ff8009e
f71d48f
49fbd6b
b8f8b0e
ad1cada
ceff696
8cd9a89
ce7886a
abb2b58
bbddcd2
e0aef30
9b3c7d7
fab0ad0
7567e12
109e414
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
…issues... (#22894)
- Loading branch information
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7375,39 +7375,40 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
|
||
Parameters | ||
---------- | ||
rule : string | ||
rule : str | ||
The offset string or object representing target conversion. | ||
how : string | ||
how : str | ||
Method for down-/re-sampling, default to ‘mean’ for downsampling. | ||
|
||
.. deprecated:: 0.18.0 | ||
The new syntax is ``.resample(...).mean()``, or | ||
``.resample(...).apply(<func>)`` | ||
axis : int, optional, default 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you check other docstrings to see how the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another example is:
The problem is that for Maybe a more verbose but informative doc would be
I have added this tentatively to the PR but let me know if something less verbose would be better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry I wasn't clear. There is not much standardization of the descriptions, use whatever makes more sense to you. But the used type is somehow standard, that's what I think we should copy from another docstring. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jmrr can you make this change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @datapythonista not your fault at all. I somewhat skipped the word type in your request and I thought you meant how
If we extrapolate this to
I will make this change now, hope it sounds good. |
||
Which index to use for up- or down-sampling. Must be | ||
Which axis to use for up- or down-sampling. For ``Series`` this | ||
will default to 0, i.e. `along the rows`. Must be | ||
``DatetimeIndex``, ``TimedeltaIndex`` or ``PeriodIndex``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand the must be. I don't think axis must be DatetimeIndex... Also, don't use backticks around the "along the rows". Backticks are to tell that it's a reference (to a variable, a function, a class...). We don't use them for text. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I took it from the main function description. As In fact, following this logic, For instance:
I hope I'm not missing anything here...
Slip of keyboard, I meant normal textual quotes ("). Changing. |
||
fill_method : string, default None | ||
fill_method : str, default None | ||
Filling method for upsampling. | ||
|
||
.. deprecated:: 0.18.0 | ||
The new syntax is ``.resample(...).<func>()``, | ||
e.g. ``.resample(...).pad()`` | ||
closed : {'right', 'left'} | ||
closed : {'right', 'left'}, default None | ||
Which side of bin interval is closed. The default is 'left' | ||
for all frequency offsets except for 'M', 'A', 'Q', 'BM', | ||
'BA', 'BQ', and 'W' which all have a default of 'right'. | ||
label : {'right', 'left'} | ||
label : {'right', 'left'}, default None | ||
Which bin edge label to label bucket with. The default is 'left' | ||
for all frequency offsets except for 'M', 'A', 'Q', 'BM', | ||
'BA', 'BQ', and 'W' which all have a default of 'right'. | ||
convention : {'start', 'end', 's', 'e'} | ||
convention : {'start', 'end', 's', 'e'}, default 'start' | ||
For ``PeriodIndex`` only, controls whether to use the start or | ||
end of ``rule``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. single backticks, in both cases, double backticks are for code (including possible values of variables like NaN). But for classes, variables, parameters... we use single backticks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it, wasn't clear from the docstring docs. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would None be with double backtick in examples like "default There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good question. I think it probably should, but we don't have it anywhere, so let's leave it without backticks in the default for now |
||
kind : {'timestamp', 'period'} optional | ||
kind : {'timestamp', 'period'}, optional, default None | ||
Pass 'timestamp' to convert the resulting index to a | ||
``DateTimeIndex`` or 'period' to convert it to a ``PeriodIndex``. | ||
By default the input representation is retained. | ||
loffset : timedelta | ||
loffset : timedelta, default None | ||
Adjust the resampled time labels. | ||
limit : int, default None | ||
Maximum size gap when reindexing with ``fill_method``. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same |
||
|
@@ -7417,13 +7418,13 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
For frequencies that evenly subdivide 1 day, the "origin" of the | ||
aggregated intervals. For example, for '5min' frequency, base could | ||
range from 0 through 4. Defaults to 0. | ||
on : string, optional | ||
on : str, optional | ||
For a DataFrame, column to use instead of index for resampling. | ||
Column must be datetime-like. | ||
|
||
.. versionadded:: 0.19.0 | ||
|
||
level : string or int, optional | ||
level : str or int, optional | ||
For a MultiIndex, level (name or number) to use for | ||
resampling. ``level`` must be datetime-like. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be dis F438 played to describe this comment to others. Learn more. same |
||
|
||
|
@@ -7531,7 +7532,7 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
Pass a custom function via ``apply`` | ||
|
||
>>> def custom_resampler(array_like): | ||
... return np.sum(array_like)+5 | ||
... return np.sum(array_like) + 5 | ||
|
||
>>> series.resample('3T').apply(custom_resampler) | ||
2000-01-01 00:00:00 8 | ||
|
@@ -7542,8 +7543,7 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
For a Series with a PeriodIndex, the keyword `convention` can be | ||
used to control whether to use the start or end of `rule`. | ||
|
||
>>> s = pd.Series([1, 2], index=pd.period_range( | ||
... '2012-01-01', | ||
>>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01', | ||
... freq='A', | ||
... periods=2)) | ||
>>> s | ||
|
@@ -7584,7 +7584,8 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
For DataFrame objects, the keyword ``on`` can be used to specify the | ||
column instead of the index for resampling. | ||
|
||
>>> df = pd.DataFrame(data=9*[range(4)], columns=['a', 'b', 'c', 'd']) | ||
>>> df = pd.DataFrame(data=9 * [range(4)], | ||
... columns=['a', 'b', 'c', 'd']) | ||
>>> df['time'] = pd.date_range('1/1/2000', periods=9, freq='T') | ||
>>> df.resample('3T', on='time').sum() | ||
a b c d | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should show the data before resampling, otherwise it's difficult to see what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, I've revamped this example. Pushing now altogether with the rest of the changes. Agree more meaningful examples make sure everyone understands. Just a bit conscious of being too verbose. But looking forward to your feedback. |
||
|
@@ -7597,7 +7598,7 @@ def resample(self, rule, how=None, axis=0, fill_method=None, closed=None, | |
specify on level the resampling needs to take place. | ||
|
||
>>> time = pd.date_range('1/1/2000', periods=5, freq='T') | ||
>>> df2 = pd.DataFrame(data=10*[range(4)], | ||
>>> df2 = pd.DataFrame(data=10 * [range(4)], | ||
... columns=['a', 'b', 'c', 'd'], | ||
... index=pd.MultiIndex.from_product([time, [1, 2]]) | ||
... ) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the possible values? If they are a fixed set, can we have
{'mean',...}
? If it's a problem, leave it like it is, as this will go away soon.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible values are any valid string function that does some form of aggregation. Interestingly, this would also work with the example dataframe I shared above:
Happy to change the description of the parameter to something like:
I got inspiration from pandas.DataFrame.aggregate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f the name needs to be a string with the function name, I assume it can't be any arbitrary value. But as it's deprecated, let's not spend time on this.