Merge branch 'master' into bugfix_groupby_multiindex_levels_equals_rows · pandas-dev/pandas@50c7979 · GitHub

Commit 50c7979

Merge branch 'master' into bugfix_groupby_multiindex_levels_equals_rows
2 parents 9af8a47 + 58d8729 commit 50c7979

160 files changed, +4554 -2806 lines changed

Some content is hidden: large commits have some content hidden by default.

.github/ISSUE_TEMPLATE.md

Lines changed: 6 additions & 1 deletion
@@ -8,11 +8,16 @@
 
 [this should explain **why** the current behaviour is a problem and why the expected output is a better solution.]
 
+**Note**: We receive a lot of issues on our GitHub tracker, so it is very possible that your issue has been posted before. Please check first before submitting so that we do not have to handle and close duplicates!
+
+**Note**: Many problems can be resolved by simply upgrading `pandas` to the latest version. Before submitting, please check if that solution works for you. If possible, you may want to check if `master` addresses this issue, but that is not necessary.
+
 #### Expected Output
 
 #### Output of ``pd.show_versions()``
 
 <details>
-# Paste the output here pd.show_versions() here
+
+[paste the output of ``pd.show_versions()`` here below this line]
 
 </details>

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-- [ ] closes #xxxx
-- [ ] tests added / passed
-- [ ] passes ``git diff upstream/master -u -- "*.py" | flake8 --diff``
-- [ ] whatsnew entry
+- [ ] closes #xxxx
+- [ ] tests added / passed
+- [ ] passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
+- [ ] whatsnew entry

README.md

Lines changed: 29 additions & 29 deletions
@@ -53,15 +53,15 @@
 <tr>
 <td>Conda</td>
 <td>
-<a href="http://pandas.pydata.org">
+<a href="https://pandas.pydata.org">
 <img src="http://pubbadges.s3-website-us-east-1.amazonaws.com/pkgs-downloads-pandas.png" alt="conda default downloads" />
 </a>
 </td>
 </tr>
 <tr>
 <td>Conda-forge</td>
 <td>
-<a href="http://pandas.pydata.org">
+<a href="https://pandas.pydata.org">
 <img src="https://anaconda.org/conda-forge/pandas/badges/downloads.svg" alt="conda-forge downloads" />
 </a>
 </td>
@@ -123,31 +123,31 @@ Here are just a few of the things that pandas does well:
 moving window linear regressions, date shifting and lagging, etc.
 
 
-[missing-data]: http://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data
-[insertion-deletion]: http://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
-[alignment]: http://pandas.pydata.org/pandas-docs/stable/dsintro.html?highlight=alignment#intro-to-data-structures
-[groupby]: http://pandas.pydata.org/pandas-docs/stable/groupby.html#group-by-split-apply-combine
-[conversion]: http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe
-[slicing]: http://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
-[fancy-indexing]: http://pandas.pydata.org/pandas-docs/stable/indexing.html#advanced-indexing-with-ix
-[subsetting]: http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
-[merging]: http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
-[joining]: http://pandas.pydata.org/pandas-docs/stable/merging.html#joining-on-index
-[reshape]: http://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables
-[pivot-table]: http://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
-[mi]: http://pandas.pydata.org/pandas-docs/stable/indexing.html#hierarchical-indexing-multiindex
-[flat-files]: http://pandas.pydata.org/pandas-docs/stable/io.html#csv-text-files
-[excel]: http://pandas.pydata.org/pandas-docs/stable/io.html#excel-files
-[db]: http://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
-[hdfstore]: http://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables
-[timeseries]: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-series-date-functionality
+[missing-data]: https://pandas.pydata.org/pandas-docs/stable/missing_data.html#working-with-missing-data
+[insertion-deletion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
+[alignment]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html?highlight=alignment#intro-to-data-structures
+[groupby]: https://pandas.pydata.org/pandas-docs/stable/groupby.html#group-by-split-apply-combine
+[conversion]: https://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe
+[slicing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#slicing-ranges
+[fancy-indexing]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#advanced-indexing-with-ix
+[subsetting]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing
+[merging]: https://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging
+[joining]: https://pandas.pydata.org/pandas-docs/stable/merging.html#joining-on-index
+[reshape]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#reshaping-and-pivot-tables
+[pivot-table]: https://pandas.pydata.org/pandas-docs/stable/reshaping.html#pivot-tables-and-cross-tabulations
+[mi]: https://pandas.pydata.org/pandas-docs/stable/indexing.html#hierarchical-indexing-multiindex
+[flat-files]: https://pandas.pydata.org/pandas-docs/stable/io.html#csv-text-files
+[excel]: https://pandas.pydata.org/pandas-docs/stable/io.html#excel-files
+[db]: https://pandas.pydata.org/pandas-docs/stable/io.html#sql-queries
+[hdfstore]: https://pandas.pydata.org/pandas-docs/stable/io.html#hdf5-pytables
+[timeseries]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-series-date-functionality
 
 ## Where to get it
 The source code is currently hosted on GitHub at:
-http://github.com/pandas-dev/pandas
+https://github.com/pandas-dev/pandas
 
 Binary installers for the latest released version are available at the [Python
-package index](http://pypi.python.org/pypi/pandas/) and on conda.
+package index](https://pypi.python.org/pypi/pandas) and on conda.
 
 ```sh
 # conda
@@ -161,11 +161,11 @@ pip install pandas
 
 ## Dependencies
 - [NumPy](http://www.numpy.org): 1.7.0 or higher
-- [python-dateutil](http://labix.org/python-dateutil): 1.5 or higher
-- [pytz](http://pytz.sourceforge.net)
+- [python-dateutil](https://labix.org/python-dateutil): 1.5 or higher
+- [pytz](https://pythonhosted.org/pytz)
 - Needed for time zone support with ``pandas.date_range``
 
-See the [full installation instructions](http://pandas.pydata.org/pandas-docs/stable/install.html#dependencies)
+See the [full installation instructions](https://pandas.pydata.org/pandas-docs/stable/install.html#dependencies)
 for recommended and optional dependencies.
 
 ## Installation from sources
@@ -197,13 +197,13 @@ mode](https://pip.pypa.io/en/latest/reference/pip_install.html#editable-installs
 pip install -e .
 ```
 
-See the full instructions for [installing from source](http://pandas.pydata.org/pandas-docs/stable/install.html#installing-from-source).
+See the full instructions for [installing from source](https://pandas.pydata.org/pandas-docs/stable/install.html#installing-from-source).
 
 ## License
-BSD
+[BSD 3](LICENSE)
 
 ## Documentation
-The official documentation is hosted on PyData.org: http://pandas.pydata.org/pandas-docs/stable/
+The official documentation is hosted on PyData.org: https://pandas.pydata.org/pandas-docs/stable
 
 The Sphinx documentation should provide a good starting point for learning how
 to use the library. Expect the docs to continue to expand as time goes on.
@@ -223,7 +223,7 @@ Most development discussion is taking place on github in this repo. Further, the
 ## Contributing to pandas
 All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome.
 
-A detailed overview on how to contribute can be found in the **[contributing guide.](http://pandas.pydata.org/pandas-docs/stable/contributing.html)**
+A detailed overview on how to contribute can be found in the **[contributing guide.](https://pandas.pydata.org/pandas-docs/stable/contributing.html)**
 
 If you are simply looking to start working with the pandas codebase, navigate to the [GitHub “issues” tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [Difficulty Novice](https://github.com/pandas-dev/pandas/issues?q=is%3Aopen+is%3Aissue+label%3A%22Difficulty+Novice%22) where you could start out.
 

appveyor.yml

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ install:
 - cmd: conda info -a
 
 # create our env
-- cmd: conda create -n pandas python=%PYTHON_VERSION% cython pytest pytest-xdist
+- cmd: conda create -n pandas python=%PYTHON_VERSION% cython pytest>=3.1.0 pytest-xdist
 - cmd: activate pandas
 - SET REQ=ci\requirements-%PYTHON_VERSION%_WIN.run
 - cmd: echo "installing requirements from %REQ%"

asv_bench/asv.conf.json

Lines changed: 6 additions & 4 deletions
@@ -117,8 +117,10 @@
 // with results. If the commit is `null`, regression detection is
 // skipped for the matching benchmark.
 //
-// "regressions_first_commits": {
-// "some_benchmark": "352cdf", // Consider regressions only after this commit
-// "another_benchmark": null, // Skip regression detection altogether
-// }
+"regressions_first_commits": {
+"*": "v0.20.0"
+},
+"regression_thresholds": {
+"*": 0.05
+}
 }

asv_bench/benchmarks/stat_ops.py

Lines changed: 22 additions & 78 deletions
@@ -1,92 +1,36 @@
 from .pandas_vb_common import *
 
 
-class stat_ops_frame_mean_float_axis_0(object):
-    goal_time = 0.2
-
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_mean_float_axis_0(self):
-        self.df.mean()
-
-
-class stat_ops_frame_mean_float_axis_1(object):
-    goal_time = 0.2
-
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_mean_float_axis_1(self):
-        self.df.mean(1)
-
-
-class stat_ops_frame_mean_int_axis_0(object):
-    goal_time = 0.2
-
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_mean_int_axis_0(self):
-        self.dfi.mean()
-
-
-class stat_ops_frame_mean_int_axis_1(object):
-    goal_time = 0.2
+def _set_use_bottleneck_False():
+    try:
+        pd.options.compute.use_bottleneck = False
+    except:
+        from pandas.core import nanops
+        nanops._USE_BOTTLENECK = False
 
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_mean_int_axis_1(self):
-        self.dfi.mean(1)
-
-
-class stat_ops_frame_sum_float_axis_0(object):
-    goal_time = 0.2
 
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_sum_float_axis_0(self):
-        self.df.sum()
-
-
-class stat_ops_frame_sum_float_axis_1(object):
+class FrameOps(object):
     goal_time = 0.2
 
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
+    param_names = ['op', 'use_bottleneck', 'dtype', 'axis']
+    params = [['mean', 'sum', 'median'],
+              [True, False],
+              ['float', 'int'],
+              [0, 1]]
 
-    def time_stat_ops_frame_sum_float_axis_1(self):
-        self.df.sum(1)
+    def setup(self, op, use_bottleneck, dtype, axis):
+        if dtype == 'float':
+            self.df = DataFrame(np.random.randn(100000, 4))
+        elif dtype == 'int':
+            self.df = DataFrame(np.random.randint(1000, size=(100000, 4)))
 
+        if not use_bottleneck:
+            _set_use_bottleneck_False()
 
-class stat_ops_frame_sum_int_axis_0(object):
-    goal_time = 0.2
-
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
-
-    def time_stat_ops_frame_sum_int_axis_0(self):
-        self.dfi.sum()
-
-
-class stat_ops_frame_sum_int_axis_1(object):
-    goal_time = 0.2
-
-    def setup(self):
-        self.df = DataFrame(np.random.randn(100000, 4))
-        self.dfi = DataFrame(np.random.randint(1000, size=self.df.shape))
+        self.func = getattr(self.df, op)
 
-    def time_stat_ops_frame_sum_int_axis_1(self):
-        self.dfi.sum(1)
+    def time_op(self, op, use_bottleneck, dtype, axis):
+        self.func(axis=axis)
 
 
 class stat_ops_level_frame_sum(object):
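
Note on the `FrameOps` rewrite: asv calls `setup` and each `time_*` method once per combination of the lists in `params`, so the one parametrized class stands in for the eight removed single-purpose classes and additionally covers `median` and the bottleneck toggle. The sketch below is only an illustration of that expansion using the standard library; the `combos` name is hypothetical and is not part of asv or pandas.

```python
from itertools import product

# Parameter grid copied from the FrameOps benchmark in the diff above.
param_names = ['op', 'use_bottleneck', 'dtype', 'axis']
params = [['mean', 'sum', 'median'],
          [True, False],
          ['float', 'int'],
          [0, 1]]

# One dict per benchmark run: the cartesian product of the four lists.
combos = [dict(zip(param_names, values)) for values in product(*params)]

print(len(combos))  # 3 * 2 * 2 * 2 = 24 combinations
print(combos[0])
```

Each of these combinations is passed positionally to `setup(self, op, use_bottleneck, dtype, axis)` and `time_op(...)`, which is why both methods share the same four extra arguments.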

asv_bench/benchmarks/timeseries.py

Lines changed: 14 additions & 0 deletions
@@ -510,3 +510,17 @@ def time_begin_incr_rng(self):
 
     def time_begin_decr_rng(self):
         self.rng - self.semi_month_begin
+
+
+class DatetimeAccessor(object):
+    def setup(self):
+        self.N = 100000
+        self.series = pd.Series(
+            pd.date_range(start='1/1/2000', periods=self.N, freq='T')
+        )
+
+    def time_dt_accessor(self):
+        self.series.dt
+
+    def time_dt_accessor_normalize(self):
+        self.series.dt.normalize()

ci/install_circle.sh

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ fi
 # create envbuild deps
 echo "[create env: ${REQ_BUILD}]"
 time conda create -n pandas -q --file=${REQ_BUILD} || exit 1
-time conda install -n pandas pytest || exit 1
+time conda install -n pandas pytest>=3.1.0 || exit 1
 
 source activate pandas
 
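
A caveat on the `pytest>=3.1.0` argument in the shell scripts: in a POSIX shell an unquoted `>` starts an output redirection, so the added line most likely installs an unpinned `pytest` and sends the command's output to a file named `=3.1.0`. The sketch below uses Python's `shlex` module, which only approximates shell tokenizing and is not a real shell, to show how the unquoted and quoted forms split. `shell_tokens` is a hypothetical helper written for this illustration.

```python
import shlex


def shell_tokens(command):
    """Split a command line, keeping shell operators such as '>' as separate tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    return list(lexer)


# Unquoted, as written in the diff: '>' is split off as a redirection operator.
unquoted = shell_tokens("conda install -n pandas pytest>=3.1.0")

# Quoted: the whole version spec stays together as a single argument.
quoted = shell_tokens("conda install -n pandas 'pytest>=3.1.0'")

print(unquoted)
print(quoted)
```

Quoting the requirement, for example `'pytest>=3.1.0'`, is the usual way to pass a version constraint through a shell unchanged.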

ci/install_travis.sh

Lines changed: 2 additions & 4 deletions
@@ -52,9 +52,6 @@ conda update -q conda
 
 echo
 echo "[add channels]"
-# add the pandas channel to take priority
-# to add extra packages
-conda config --add channels pandas || exit 1
 conda config --remove channels defaults || exit 1
 conda config --add channels defaults || exit 1
 
@@ -106,7 +103,7 @@ if [ -e ${REQ} ]; then
 time bash $REQ || exit 1
 fi
 
-time conda install -n pandas pytest
+time conda install -n pandas pytest>=3.1.0
 time pip install pytest-xdist
 
 if [ "$LINT" ]; then
@@ -156,6 +153,7 @@ fi
 echo
 echo "[removing installed pandas]"
 conda remove pandas -y --force
+pip uninstall -y pandas
 
 if [ "$BUILD_TEST" ]; then
 

ci/requirements-2.7.pip

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 blosc
 pandas-gbq
+html5lib
+beautifulsoup4
 pathlib
 backports.lzma
 py

ci/requirements-2.7.run

Lines changed: 2 additions & 4 deletions
@@ -10,13 +10,11 @@ xlrd=0.9.2
 sqlalchemy=0.9.6
 lxml=3.2.1
 scipy
-xlsxwriter=0.4.6
+xlsxwriter=0.5.2
 s3fs
 bottleneck
-psycopg2=2.5.2
+psycopg2
 patsy
 pymysql=0.6.3
-html5lib=1.0b2
-beautiful-soup=4.2.1
 jinja2=2.8
 xarray=0.8.0

ci/requirements-2.7.sh

Lines changed: 1 addition & 1 deletion
@@ -4,4 +4,4 @@ source activate pandas
 
 echo "install 27"
 
-conda install -n pandas -c conda-forge feather-format pyarrow=0.4.1
+conda install -n pandas -c conda-forge feather-format pyarrow=0.4.1 fastparquet

ci/requirements-2.7_COMPAT.pip

Lines changed: 2 additions & 0 deletions
@@ -1,2 +1,4 @@
+html5lib==1.0b2
+beautifulsoup4==4.2.0
 openpyxl
 argparse

ci/requirements-2.7_COMPAT.run

Lines changed: 2 additions & 5 deletions
@@ -4,13 +4,10 @@ pytz=2013b
 scipy=0.11.0
 xlwt=0.7.5
 xlrd=0.9.2
-bottleneck=0.8.0
 numexpr=2.2.2
 pytables=3.0.0
-html5lib=1.0b2
-beautiful-soup=4.2.0
-psycopg2=2.5.1
+psycopg2
 pymysql=0.6.0
 sqlalchemy=0.7.8
-xlsxwriter=0.4.6
+xlsxwriter=0.5.2
 jinja2=2.8

ci/requirements-2.7_LOCALE.pip

Lines changed: 2 additions & 0 deletions
@@ -1 +1,3 @@
+html5lib==1.0b2
+beautifulsoup4==4.2.1
 blosc

ci/requirements-2.7_LOCALE.run

Lines changed: 1 addition & 4 deletions
@@ -3,12 +3,9 @@ pytz=2013b
 numpy=1.8.2
 xlwt=0.7.5
 openpyxl=1.6.2
-xlsxwriter=0.4.6
+xlsxwriter=0.5.2
 xlrd=0.9.2
-bottleneck=0.8.0
 matplotlib=1.3.1
 sqlalchemy=0.8.1
-html5lib=1.0b2
 lxml=3.2.1
 scipy
-beautiful-soup=4.2.1
