Merge branch 'master' into mldata · scikit-learn/scikit-learn@32de9ff · GitHub
Commit 32de9ff

Merge branch 'master' into mldata
2 parents c1f5abf + dd700f4 commit 32de9ff

File tree

407 files changed

+22358
-6459
lines changed

.circleci/config.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -65,6 +65,7 @@ jobs:
6565
path: ~/log.txt
6666
destination: log.txt
6767

68+
6869
deploy:
6970
docker:
7071
- image: circleci/python:3.6.1
@@ -91,4 +92,3 @@ workflows:
9192
- deploy:
9293
requires:
9394
- python3
94-
- python2

.travis.yml

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -38,13 +38,15 @@ matrix:
3838
NUMPY_VERSION="1.10.4" SCIPY_VERSION="0.16.1" CYTHON_VERSION="0.25.2"
3939
PILLOW_VERSION="4.0.0" COVERAGE=true
4040
if: type != cron
41-
# This environment tests the newest supported Anaconda release (5.0.0)
42-
# It also runs tests requiring Pandas and PyAMG
41+
# This environment tests the newest supported Anaconda release.
42+
# It runs tests requiring pandas and PyAMG.
43+
# It also runs with the site joblib instead of the vendored copy of joblib.
4344
- env: DISTRIB="conda" PYTHON_VERSION="3.6.2" INSTALL_MKL="true"
4445
NUMPY_VERSION="1.14.2" SCIPY_VERSION="1.0.0" PANDAS_VERSION="0.20.3"
4546
CYTHON_VERSION="0.26.1" PYAMG_VERSION="3.3.2" PILLOW_VERSION="4.3.0"
46-
COVERAGE=true
47+
JOBLIB_VERSION="0.12" COVERAGE=true
4748
CHECK_PYTEST_SOFT_DEPENDENCY="true" TEST_DOCSTRINGS="true"
49+
SKLEARN_SITE_JOBLIB=1
4850
if: type != cron
4951
# flake8 linting on diff wrt common ancestor with upstream/master
5052
- env: RUN_FLAKE8="true" SKIP_TESTS="true"

AUTHORS.rst

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,7 @@ The following people have been core contributors to scikit-learn's development a
4747
* `Kyle Kastner <http://kastnerkyle.github.io>`_
4848
* `Manoj Kumar <https://manojbits.wordpress.com>`_
4949
* Robert Layton
50+
* `Guillaume Lemaitre <https://github.com/glemaitre>`_
5051
* `Wei Li <http://kuantkid.github.io/>`_
5152
* Paolo Losi
5253
* `Gilles Louppe <http://glouppe.github.io/>`_
@@ -59,11 +60,14 @@ The following people have been core contributors to scikit-learn's development a
5960
* `Alexandre Passos <http://atpassos.posterous.com>`_
6061
* `Fabian Pedregosa <http://fa.bianp.net/blog/>`_
6162
* `Peter Prettenhofer <https://sites.google.com/site/peterprettenhofer/>`_
63+
* `Hanmin Qin <https://github.com/qinhanmin2014>`_
6264
* Bertrand Thirion
65+
* `Joris Van den Bossche <https://github.com/jorisvandenbossche>`_
6366
* `Jake VanderPlas <http://staff.washington.edu/jakevdp/>`_
6467
* Nelle Varoquaux
6568
* `Gael Varoquaux <http://gael-varoquaux.info/>`_
6669
* Ron Weiss
70+
* `Roman Yurchak <https://github.com/rth>`_
6771

6872
Please do not email the authors directly to ask for assistance or report issues.
6973
Instead, please see `What's the best way to ask questions about scikit-learn

README.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,9 @@ scikit-learn requires:
5353
- NumPy (>= 1.8.2)
5454
- SciPy (>= 0.13.3)
5555

56+
**Scikit-learn 0.20 is the last version to support Python2.7.**
57+
Scikit-learn 0.21 and later will require Python 3.5 or newer.
58+
5659
For running the examples Matplotlib >= 1.3.1 is required. A few examples
5760
require scikit-image >= 0.9.3 and a few examples require pandas >= 0.13.1.
5861

appveyor.yml

Lines changed: 5 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -17,22 +17,14 @@ environment:
1717
SKLEARN_SKIP_NETWORK_TESTS: 1
1818

1919
matrix:
20-
- PYTHON: "C:\\Python27"
21-
PYTHON_VERSION: "2.7.8"
22-
PYTHON_ARCH: "32"
23-
24-
- PYTHON: "C:\\Python27-x64"
25-
PYTHON_VERSION: "2.7.8"
20+
- PYTHON: "C:\\Python37-x64"
21+
PYTHON_VERSION: "3.7.0"
2622
PYTHON_ARCH: "64"
2723

28-
- PYTHON: "C:\\Python36"
29-
PYTHON_VERSION: "3.6.1"
24+
- PYTHON: "C:\\Python27"
25+
PYTHON_VERSION: "2.7.8"
3026
PYTHON_ARCH: "32"
3127

32-
- PYTHON: "C:\\Python36-x64"
33-
PYTHON_VERSION: "3.6.1"
34-
PYTHON_ARCH: "64"
35-
3628

3729
# Because we only have a single worker, we don't want to waste precious
3830
# appveyor CI time and make other PRs wait for repeated failures in a failing
@@ -49,7 +41,7 @@ install:
4941
# directly to master instead of just PR builds.
5042
# credits: JuliaLang developers.
5143
- ps: if ($env:APPVEYOR_PULL_REQUEST_NUMBER -and $env:APPVEYOR_BUILD_NUMBER -ne ((Invoke-RestMethod `
52-
https://ci.appveyor.com/api/projects/$env:APPVEYOR_ACCOUNT_NAME/$env:APPVEYOR_PROJECT_SLUG/history?recordsNumber=50).builds | `
44+
https://ci.appveyor.com/api/projects/$env:APPVEYOR_ACCOUNT_NAME/$env:APPVEYOR_PROJECT_SLUG/history?recordsNumber=500).builds | `
5345
Where-Object pullRequestId -eq $env:APPVEYOR_PULL_REQUEST_NUMBER)[0].buildNumber) { `
5446
throw "There are newer queued builds for this pull request, failing early." }
5547

benchmarks/bench_covertype.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@
5959
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
6060
from sklearn.ensemble import GradientBoostingClassifier
6161
from sklearn.metrics import zero_one_loss
62-
from sklearn.externals.joblib import Memory
62+
from sklearn.utils import Memory
6363
from sklearn.utils import check_array
6464

6565
# Memoize the data extraction and memory map the resulting

benchmarks/bench_isolation_forest.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -119,7 +119,8 @@ def print_outlier_ratio(y):
119119
y_test = y[n_samples_train:]
120120

121121
print('--- Fitting the IsolationForest estimator...')
122-
model = IsolationForest(n_jobs=-1, random_state=random_state)
122+
model = IsolationForest(behaviour='new', n_jobs=-1,
123+
random_state=random_state)
123124
tstart = time()
124125
model.fit(X_train)
125126
fit_time = time() - tstart

benchmarks/bench_mnist.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@
4141
from sklearn.ensemble import ExtraTreesClassifier
4242
from sklearn.ensemble import RandomForestClassifier
4343
from sklearn.dummy import DummyClassifier
44-
from sklearn.externals.joblib import Memory
44+
from sklearn.utils import Memory
4545
from sklearn.kernel_approximation import Nystroem
4646
from sklearn.kernel_approximation import RBFSampler
4747
from sklearn.metrics import zero_one_loss

benchmarks/bench_plot_nmf.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,7 +22,7 @@
2222
from sklearn.decomposition.nmf import _initialize_nmf
2323
from sklearn.decomposition.nmf import _beta_divergence
2424
from sklearn.decomposition.nmf import INTEGER_TYPES, _check_init
25-
from sklearn.externals.joblib import Memory
25+
from sklearn.utils import Memory
2626
from sklearn.exceptions import ConvergenceWarning
2727
from sklearn.utils.extmath import safe_sparse_dot, squared_norm
2828
from sklearn.utils import check_array

benchmarks/bench_rcv1_logreg_convergence.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
import gc
99
import time
1010

11-
from sklearn.externals.joblib import Memory
11+
from sklearn.utils import Memory
1212
from sklearn.linear_model import (LogisticRegression, SGDClassifier)
1313
from sklearn.datasets import fetch_rcv1
1414
from sklearn.linear_model.sag import get_auto_step_size

benchmarks/bench_saga.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@
1212

1313
from sklearn.datasets import fetch_rcv1, load_iris, load_digits, \
1414
fetch_20newsgroups_vectorized
15-
from sklearn.externals.joblib import delayed, Parallel, Memory
15+
from sklearn.utils import delayed, Parallel, Memory
1616
from sklearn.linear_model import LogisticRegression
1717
from sklearn.metrics import log_loss
1818
from sklearn.model_selection import train_test_split

benchmarks/bench_tsne_mnist.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@
1515
import json
1616
import argparse
1717

18-
from sklearn.externals.joblib import Memory
18+
from sklearn.utils import Memory
1919
from sklearn.datasets import fetch_mldata
2020
from sklearn.manifold import TSNE
2121
from sklearn.neighbors import NearestNeighbors
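
Note on the benchmark hunks above: each one replaces an import from `sklearn.externals.joblib` with the same name imported from `sklearn.utils`. A script that has to run on both sides of this change could try the import locations in order. The following is only a minimal sketch; `import_first` and `MEMORY_CANDIDATES` are hypothetical names introduced for illustration and are not part of scikit-learn or of this commit.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that can be imported from `candidates`.

    `candidates` is a sequence of (module_name, attribute_name) pairs,
    tried in order. Hypothetical helper, for illustration only.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            # Module missing, or module present without the attribute:
            # fall through to the next candidate.
            continue
    raise ImportError("none of the candidates could be imported: %r"
                      % (list(candidates),))


# Import locations taken from the diff above, newest first.
MEMORY_CANDIDATES = [
    ("sklearn.utils", "Memory"),
    ("sklearn.externals.joblib", "Memory"),
]
```

A benchmark would then call `Memory = import_first(MEMORY_CANDIDATES)` once at import time. The call is left out of the sketch so that it can be loaded without scikit-learn installed.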

build_tools/appveyor/requirements.txt

Lines changed: 4 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,7 @@
1-
# Fetch numpy and scipy wheels from the sklearn rackspace wheelhouse.
2-
# Those wheels were collected from https://www.lfd.uci.edu/~gohlke/pythonlibs/
3-
# This is a temporary solution. As soon as numpy and scipy provide official
4-
# wheel for windows we ca delete this --find-links line.
5-
--find-links http://28daf2247a33ed269873-7b1aad3fab3cc330e1fd9d109892382a.r6.cf2.rackcdn.com/
6-
7-
# fix the versions of numpy to force the use of numpy and scipy to use the whl
8-
# of the rackspace folder instead of trying to install from more recent
9-
# source tarball published on PyPI
10-
numpy==1.13.0
11-
scipy==0.19.0
12-
cython
1+
numpy
2+
scipy
3+
# Pin Cython to avoid bug with 0.28.x on Python 3.7
4+
cython==0.27.3
135
pytest
146
wheel
157
wheelhouse_uploader

build_tools/circle/build_doc.sh

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,8 @@ else
9292
make_args=html
9393
fi
9494

95+
make_args="SPHINXOPTS=-T $make_args" # show full traceback on exception
96+
9597
# Installing required system packages to support the rendering of math
9698
# notation in the HTML documentation
9799
sudo -E apt-get -yq update

build_tools/circle/build_test_pypy.sh

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
#!/usr/bin/env bash
2+
set -x
3+
set -e
4+
5+
apt-get -yq update
6+
apt-get -yq install libatlas-dev libatlas-base-dev liblapack-dev gfortran ccache
7+
8+
pip install virtualenv
9+
10+
if command -v pypy3; then
11+
virtualenv -p $(command -v pypy3) pypy-env
12+
elif command -v pypy; then
13+
virtualenv -p $(command -v pypy) pypy-env
14+
fi
15+
16+
source pypy-env/bin/activate
17+
18+
python --version
19+
which python
20+
21+
pip install --extra-index https://antocuni.github.io/pypy-wheels/ubuntu numpy==1.14.4 Cython pytest
22+
pip install "scipy>=1.1.0" sphinx numpydoc docutils
23+
24+
ccache -M 512M
25+
export CCACHE_COMPRESS=1
26+
export PATH=/usr/lib/ccache:$PATH
27+
28+
pip install -e .
29+
30+
make test

build_tools/travis/install.sh

Lines changed: 12 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,8 @@ export CXX=/usr/lib/ccache/g++
2424
# ~60M is used by .ccache when compiling from scratch at the time of writing
2525
ccache --max-size 100M --show-stats
2626

27-
if [[ "$DISTRIB" == "conda" ]]; then
27+
make_conda() {
28+
TO_INSTALL="$@"
2829
# Deactivate the travis-provided virtual environment and setup a
2930
# conda-based environment instead
3031
deactivate
@@ -37,6 +38,11 @@ if [[ "$DISTRIB" == "conda" ]]; then
3738
export PATH=$MINICONDA_PATH/bin:$PATH
3839
conda update --yes conda
3940

41+
conda create -n testenv --yes $TO_INSTALL
42+
source activate testenv
43+
}
44+
45+
if [[ "$DISTRIB" == "conda" ]]; then
4046
TO_INSTALL="python=$PYTHON_VERSION pip pytest pytest-cov \
4147
numpy=$NUMPY_VERSION scipy=$SCIPY_VERSION \
4248
cython=$CYTHON_VERSION"
@@ -59,8 +65,10 @@ if [[ "$DISTRIB" == "conda" ]]; then
5965
TO_INSTALL="$TO_INSTALL pillow=$PILLOW_VERSION"
6066
fi
6167

62-
conda create -n testenv --yes $TO_INSTALL
63-
source activate testenv
68+
if [[ -n "$JOBLIB_VERSION" ]]; then
69+
TO_INSTALL="$TO_INSTALL joblib=$JOBLIB_VERSION"
70+
fi
71+
make_conda $TO_INSTALL
6472

6573
# for python 3.4, conda does not have recent pytest packages
6674
if [[ "$PYTHON_VERSION" == "3.4" ]]; then
@@ -79,11 +87,7 @@ elif [[ "$DISTRIB" == "ubuntu" ]]; then
7987
pip install pytest pytest-cov cython==$CYTHON_VERSION
8088

8189
elif [[ "$DISTRIB" == "scipy-dev" ]]; then
82-
# Set up our own virtualenv environment to avoid travis' numpy.
83-
# This venv points to the python interpreter of the travis build
84-
# matrix.
85-
virtualenv --python=python ~/testvenv
86-
source ~/testvenv/bin/activate
90+
make_conda python=3.7
8791
pip install --upgrade pip setuptools
8892

8993
echo "Installing numpy and scipy master wheels"

conftest.py

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,13 +5,23 @@
55
# doc/modules/clustering.rst and use sklearn from the local folder rather than
66
# the one from site-packages.
77

8+
import platform
89
from distutils.version import LooseVersion
910

1011
import pytest
1112
from _pytest.doctest import DoctestItem
1213

1314

1415
def pytest_collection_modifyitems(config, items):
16+
17+
# FeatureHasher is not compatible with PyPy
18+
if platform.python_implementation() == 'PyPy':
19+
skip_marker = pytest.mark.skip(
20+
reason='FeatureHasher is not compatible with PyPy')
21+
for item in items:
22+
if item.name == 'sklearn.feature_extraction.hashing.FeatureHasher':
23+
item.add_marker(skip_marker)
24+
1525
# numpy changed the str/repr formatting of numpy arrays in 1.14. We want to
1626
# run doctests only for numpy >= 1.14.
1727
skip_doctests = True
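
Note on the hunk above: the new block in `pytest_collection_modifyitems` marks one doctest item as skipped when the interpreter is PyPy. The selection logic can be separated from pytest and checked on its own. This is a rough sketch under that assumption; `names_to_skip` and its `implementation` parameter are hypothetical and exist only so the logic can be exercised without running on PyPy.

```python
import platform

# Item name checked in the conftest.py hunk above.
FEATURE_HASHER_DOCTEST = 'sklearn.feature_extraction.hashing.FeatureHasher'


def names_to_skip(item_names, implementation=None):
    """Return the collected item names that would be skipped.

    `implementation` defaults to the running interpreter; passing it
    explicitly is a testing convenience, not something the commit does.
    """
    if implementation is None:
        implementation = platform.python_implementation()
    if implementation != 'PyPy':
        return []
    return [name for name in item_names if name == FEATURE_HASHER_DOCTEST]
```

In the actual hook, each matching item receives `item.add_marker(skip_marker)` instead of being returned in a list.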

doc/Makefile

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -98,7 +98,7 @@ doctest:
9898
"results in $(BUILDDIR)/doctest/output.txt."
9999

100100
download-data:
101-
python -c "from sklearn.datasets.lfw import check_fetch_lfw; check_fetch_lfw()"
101+
python -c "from sklearn.datasets.lfw import _check_fetch_lfw; _check_fetch_lfw()"
102102

103103
# Optimize PNG files. Needs OptiPNG. Change the -P argument to the number of
104104
# cores you have available, so -P 64 if you have a real computer ;)

doc/about.rst

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -125,6 +125,15 @@ Andreas Müller also received a grant to improve scikit-learn from the `Alfred P
125125
:align: center
126126
:target: http://www.sydney.edu.au/
127127

128+
`The Labex DigiCosme <https://digicosme.lri.fr>`_ funded Nicolas Goix (2015-2016),
129+
Tom Dupré la Tour (2015-2016 and 2017-2018), Mathurin Massias (2018-2019) to work part time
130+
on scikit-learn during their PhDs. It also funded a scikit-learn coding sprint in 2015.
131+
132+
.. image:: themes/scikit-learn/static/img/digicosme.png
133+
:width: 200pt
134+
:align: center
135+
:target: https://digicosme.lri.fr
136+
128137
The following students were sponsored by `Google <https://developers.google.com/open-source/>`_
129138
to work on scikit-learn through the
130139
`Google Summer of Code <https://en.wikipedia.org/wiki/Google_Summer_of_Code>`_

doc/conftest.py

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,11 @@
1+
import os
12
from os.path import exists
23
from os.path import join
4+
import warnings
35

46
import numpy as np
57

8+
from sklearn.utils import IS_PYPY
69
from sklearn.utils.testing import SkipTest
710
from sklearn.utils.testing import check_skip_network
811
from sklearn.datasets import get_data_home
@@ -55,6 +58,8 @@ def setup_twenty_newsgroups():
5558

5659

5760
def setup_working_with_text_data():
61+
if IS_PYPY and os.environ.get('CI', None):
62+
raise SkipTest('Skipping too slow test with PyPy on CI')
5863
check_skip_network()
5964
cache_path = _pkl_filepath(get_data_home(), CACHE_NAME)
6065
if not exists(cache_path):
@@ -75,6 +80,12 @@ def setup_impute():
7580
raise SkipTest("Skipping impute.rst, pandas not installed")
7681

7782

83+
def setup_unsupervised_learning():
84+
# ignore deprecation warnings from scipy.misc.face
85+
warnings.filterwarnings('ignore', 'The binary mode of fromstring',
86+
DeprecationWarning)
87+
88+
7889
def pytest_runtest_setup(item):
7990
fname = item.fspath.strpath
8091
is_index = fname.endswith('datasets/index.rst')
@@ -91,8 +102,12 @@ def pytest_runtest_setup(item):
91102
setup_working_with_text_data()
92103
elif fname.endswith('modules/compose.rst') or is_index:
93104
setup_compose()
105+
elif IS_PYPY and fname.endswith('modules/feature_extraction.rst'):
106+
raise SkipTest('FeatureHasher is not compatible with PyPy')
94107
elif fname.endswith('modules/impute.rst'):
95108
setup_impute()
109+
elif fname.endswith('statistical_inference/unsupervised_learning.rst'):
110+
setup_unsupervised_learning()
96111

97112

98113
def pytest_runtest_teardown(item):
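
Note on `setup_unsupervised_learning` in the hunk above: the second argument to `warnings.filterwarnings` is a regular expression matched against the start of the warning message, so only warnings beginning with that text are ignored. The sketch below illustrates this with the same filter arguments; `count_visible_warnings` is a hypothetical helper, and the `catch_warnings` scoping is an addition for the demonstration (the commit installs the filter without it).

```python
import warnings


def count_visible_warnings(messages):
    """Emit each message as a DeprecationWarning; count those not filtered."""
    with warnings.catch_warnings(record=True) as caught:
        # Record everything by default, including repeated warnings.
        warnings.simplefilter('always')
        # Same filter arguments as in setup_unsupervised_learning above.
        # filterwarnings inserts at the front, so it takes precedence.
        warnings.filterwarnings('ignore', 'The binary mode of fromstring',
                                DeprecationWarning)
        for message in messages:
            warnings.warn(message, DeprecationWarning)
    return len(caught)
```

Because the filter is installed inside `catch_warnings`, the global filter list is restored when the block exits.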
