Merge tag '0.12' into releases · seckcoder/scikit-learn@9b642e5 · GitHub
Commit 9b642e5

Merge tag '0.12' into releases
* tag '0.12': (949 commits) ENH more robust transformer testing.... don't ask why that came up Use pinvh wherever it helps in the codebase. Cloned @jakevdp's pinvh tests @jakevdp's version of pinvh speed up symmetric_pinv Add comments on optimized precision computations. Vectorize singular value inversion Compute pseudoinverse using eigendecomposition We already have the inverse at that step DOC added link to 0.11 docs to support page. DOC add people and commits do whatsnew MISC changed version number for release, change maintainer to myself COSMIT typo, thanks @ogrisel COSMIT pep8 small tweaks typos and alex`s review changes docstring changes not the problem afterall - switch back changed includes back - change broke JENKINS build docstring fixes docstring change ...
2 parents 5b136ab + 0fede44 commit 9b642e5

336 files changed: 45913 additions and 32784 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -37,3 +37,5 @@ nips2010_pdf/
 *.nt.bz2
 *.tar.gz
 *.tgz
+
+examples/cluster/joblib

.mailmap

Lines changed: 30 additions & 2 deletions
@@ -1,8 +1,12 @@
 Gael Varoquaux <gael.varoquaux@normalesup.org> gvaroquaux <gael.varoquaux@normalesup.org>
 Gael Varoquaux <gael.varoquaux@normalesup.org> Gael varoquaux <gael.varoquaux@normalesup.org>
 Gael Varoquaux <gael.varoquaux@normalesup.org> GaelVaroquaux <gael.varoquaux@normalesup.org>
+Gael Varoquaux <gael.varoquaux@normalesup.org> Varoquaux <varoquau@normalesup.org>
 Olivier Grisel <olivier.grisel@ensta.org> ogrisel <olivier.grisel@ensta.org>
+Olivier Grisel <olivier.grisel@ensta.org> Olivier Grisel <ogrisel@turingcarpet.(none)>
 Alexandre Gramfort <alexandre.gramfort@inria.fr> Alexandre Gramfort <alexandre.gramfort@gmail.com>
+Alexandre Gramfort <alexandre.gramfort@inria.fr> Alexandre Gramfort <alexandre.gramfort@m4x.org>
+Alexandre Gramfort <alexandre.gramfort@inria.fr> Alexandre Gramfort <gramfort@localhost.(none)>
 Matthieu Perrot <matthieu.perrot@cea.fr> Matthieu Perrot <revilyo@earth.(none)>
 Matthieu Perrot <matthieu.perrot@cea.fr> revilyo <revilyo@earth.(none)>
 Vincent Michel <vincent.michel@inria.fr> vincent <vincent@vincent.org>
@@ -12,6 +16,7 @@ Vincent Michel <vincent.michel@inria.fr> Vincent M <vm.michel@gmail.com>
 Vincent Michel <vincent.michel@inria.fr> Vincent Michel <vincent.michel@logilab.fr>
 Vincent Michel <vincent.michel@inria.fr> Vincent M <vincent.michel@logilab.fr>
 Vincent Michel <vincent.michel@inria.fr> Vincent michel <vmic@crater2.logilab.fr>
+Vincent Michel <vincent.michel@inria.fr> Vincent Michel <vm.michel@gmail.com>
 Ariel Rokem <arokem@berkeley.edu> arokem <arokem@berkeley.edu>
 Bertrand Thirion <bertrand.thirion@inria.fr> bthirion <bertrand.thirion@inria.fr>
 Peter Prettenhofer <peter.prettenhofer@gmail.com> pprett <peter.prettenhofer@gmail.com>
@@ -23,19 +28,42 @@ James Bergstra <james.bergstra@gmail.com> james.bergstra <james.bergstra@gmail.c
 Xinfan Meng <mxf3306@gmail.com> mxf <mxf@chomsky.localdomain>
 Jan Schlüter <scikit-learn@jan-schlueter.de> f0k <scikit-learn@jan-schlueter.de>
 Vlad Niculae <vlad@vene.ro> vene <vlad@vene.ro>
-Andreas Müller <amueller@ais.uni-bonn.de> amueller <amueller@ais.uni-bonn.de>
 Virgile Fritsch <virgile.fritsch@gmail.com> VirgileFritsch <virgile.fritsch@gmail.com>
 Virgile Fritsch <virgile.fritsch@gmail.com> Virgile <virgile.fritsch@gmail.com>
+Virgile Fritsch <virgile.fritsch@gmail.com> Virgile <virgile@virgile-Precision-M4400.(none)>
 Jean Kossaifi <jean.kossaifi@gmail.com> Jean KOSSAIFI <jkossaifi@is208616.intra.cea.fr>
 Jean Kossaifi <jean.kossaifi@gmail.com> JeanKossaifi <jean.kossaifi@gmail.com>
-Jake Vanderplas <vanderplas@astro.washington.edu> Jacob Vanderplas <jakevdp@yahoo.com>
+Jean Kossaifi <jean.kossaifi@gmail.com> Jean Kossaifi <kossaifi@is208616.intra.cea.fr>
+Jake VanderPlas <vanderplas@astro.washington.edu> Jacob Vanderplas <jakevdp@yahoo.com>
+Jake VanderPlas <vanderplas@astro.washington.edu> Jake Vanderplas <jakevdp@yahoo.com>
+Jake VanderPlas <vanderplas@astro.washington.edu> Jake Vanderplas <vanderplas@astro.washington.edu>
 Andreas Mueller <amueller@ais.uni-bonn.de> Andy <amueller@ais.uni-bonn.de>
+Andreas Mueller <amueller@ais.uni-bonn.de> unknown <Andreas Mueller@MSRC-3645211.europe.corp.microsoft.com>
 Andreas Mueller <amueller@ais.uni-bonn.de> andy <andy@marvin>
 Andreas Mueller <amueller@ais.uni-bonn.de> Andreas Mueller <amueller@templateimage.ista.local>
+Andreas Mueller <amueller@ais.uni-bonn.de> Andreas Müller <amueller@ais.uni-bonn.de>
 Brian Holt <bh00038@cvplws63.eps.surrey.ac.uk> bdholt1 <bdholt1@gmail.com>
+Brian Holt <bh00038@cvplws63.eps.surrey.ac.uk> Brian Holt <bdholt1@gmail.com>
 Robert Layton <robertlayton@gmail.com> robertlayton <robertlayton@gmail.com>
+Robert Layton <robertlayton@gmail.com> = <robertlayton@gmail.com>
 Fabian Pedregosa <fabian@fseoane.net> Fabian Pedregosa <fabian.pedregosa@inria.fr>
 Lars Buitinck <L.J.Buitinck@uva.nl> Lars Buitinck <larsmans@gmail.com>
 Lars Buitinck <L.J.Buitinck@uva.nl> unknown <Lars@.(none)>
 Lars Buitinck <L.J.Buitinck@uva.nl> Lars Buitinck <l.j.buitinck@uva.nl>
 DraXus <draxus@gmail.com> draxus <draxus@hammer.ugr>
+Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> Edouard Duchesnay <duchesnay@is143433.(none)>
+Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> Edouard Duchesnay <edouard.duchesnay@gmail.com>
+Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> duchesnay <edouard.duchesnay@gmail.com>
+Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> duchesnay <edouard@is2206219.(none)>
+Emmanuelle Gouillart <emmanuelle.gouillart@nsup.org> Emmanuelle Gouillart <emma@aleph.(none)>
+Emmanuelle Gouillart <emmanuelle.gouillart@nsup.org> emmanuelle <emmanuelle.gouillart@nsup.org>
+Gilles Louppe <g.louppe@gmail.com> Gilles Louppe <g.louppe@ulg.ac.be>
+Nelle Varoquaux <nelle.varoquaux@gmail.com> Nelle Varoquaux <nelle@phgroup.com>
+Nicolas Pinto <pinto@alum.mit.edu> Nicolas Pinto <pinto@mit.edu>
+Olivier Hervieu <olivier.hervieu@gmail.com> Olivier Hervieu <olivier.hervieu@tinyclues.com>
+Satrajit Ghosh <satra@mit.edu> Satrajit Ghosh <satrajit.ghosh@gmail.com>
+Shiqiao Du <lucidfrontier.45@gmail.com> Shiqiao Du <s.du@freebit.net>
+Shiqiao Du <lucidfrontier.45@gmail.com> Shiqiao <lucidfrontier.45@gmail.com>
+Tim Sheerman-Chase <t.sheerman-chase@surrey.ac.uk> Tim Sheerman-Chase <ts00051@ts00051-desktop.(none)>
+Vincent Schut <schut@sarvision.nl> Vincent Schut <vincent@TIMO.(none)>
+iBayer <mane.desk@gmail.com> ibayer <mane.desk@gmail.com>
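
Each `.mailmap` entry in this diff pairs a canonical identity with an alias that appears in commit history, in the form `Proper Name <proper@email> Commit Name <commit@email>`. The lookup this sets up could be sketched as follows. This is a simplified illustration and not Git's own parser: `ENTRY` and `parse_mailmap` are hypothetical names, only the two-identity entry form is handled, and matching here is exact rather than case-insensitive.

```python
import re

# One .mailmap entry of the form used in this diff:
#   Proper Name <proper@email> Commit Name <commit@email>
ENTRY = re.compile(
    r"^(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*"
    r"(?P<commit_name>.*?)\s*<(?P<commit_email>[^>]*)>\s*$"
)


def parse_mailmap(lines):
    """Map (commit name, commit email) to the canonical (name, email)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = ENTRY.match(line)
        if match:
            key = (match["commit_name"], match["commit_email"])
            mapping[key] = (match["name"], match["email"])
    return mapping


entries = [
    "Jake VanderPlas <vanderplas@astro.washington.edu> Jacob Vanderplas <jakevdp@yahoo.com>",
    "Jake VanderPlas <vanderplas@astro.washington.edu> Jake Vanderplas <jakevdp@yahoo.com>",
    "Robert Layton <robertlayton@gmail.com> = <robertlayton@gmail.com>",
]
mailmap = parse_mailmap(entries)
print(mailmap[("Jacob Vanderplas", "jakevdp@yahoo.com")])
```

Both Vanderplas aliases resolve to the single canonical spelling, which is what lets per-author commit counts collapse onto one name.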

MANIFEST.in

Lines changed: 1 addition & 3 deletions
@@ -1,7 +1,5 @@
 include *.rst
-include test.py
-include scikits/__init__.py
 recursive-include doc *
 recursive-include examples *
 recursive-include sklearn *.c *.h *.pyx
-recursive-include sklearn/datasets *.csv *.csv.gz *.TXT *.rst *.jpg *.txt
+recursive-include sklearn/datasets *.csv *.csv.gz *.rst *.jpg *.txt

Makefile

Lines changed: 5 additions & 4 deletions
@@ -10,11 +10,11 @@ CTAGS ?= ctags
 all: clean inplace test
 
 clean-pyc:
-	find . -name "*.pyc" | xargs rm -f
+	find sklearn -name "*.pyc" | xargs rm -f
 
 clean-so:
-	find . -name "*.so" | xargs rm -f
-	find . -name "*.pyd" | xargs rm -f
+	find sklearn -name "*.so" | xargs rm -f
+	find sklearn -name "*.pyd" | xargs rm -f
 
 clean-build:
 	rm -rf build
@@ -36,13 +36,14 @@ test-doc:
 	doc/developers doc/tutorial/basic doc/tutorial/statistical_inference
 
 test-coverage:
+	rm -rf coverage .coverage
 	$(NOSETESTS) -s --with-coverage --cover-html --cover-html-dir=coverage \
 	--cover-package=sklearn sklearn
 
 test: test-code test-doc
 
 trailing-spaces:
-	find . -name "*.py" | xargs perl -pi -e 's/[ \t]*$$//'
+	find sklearn -name "*.py" | xargs perl -pi -e 's/[ \t]*$$//'
 
 cython:
 	find sklearn -name "*.pyx" | xargs $(CYTHON)
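
The Makefile hunks above narrow the clean-up targets from the whole checkout (`find .`) to the `sklearn` package directory. The effect of that narrowing can be sketched in Python as below. This is only an illustration of the scoping, not part of the commit; `remove_compiled` and the temporary directory layout are made up for the example.

```python
import tempfile
from pathlib import Path


def remove_compiled(root, patterns=("*.pyc", "*.so", "*.pyd")):
    """Delete build artifacts under root only, and return what was removed."""
    removed = []
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            if path.is_file():
                path.unlink()
                removed.append(path)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one artifact inside the package, one outside it.
    inside = Path(tmp, "sklearn", "utils")
    inside.mkdir(parents=True)
    (inside / "a.pyc").touch()
    outside = Path(tmp, "examples")
    outside.mkdir()
    (outside / "b.pyc").touch()

    removed = remove_compiled(Path(tmp, "sklearn"))
    removed_names = sorted(p.name for p in removed)
    outside_survives = (outside / "b.pyc").exists()

print(removed_names, outside_survives)
```

Files outside the package directory are left alone, which is the behavioural difference between the old and new `find` invocations.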

README-py3k.rst

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ of these is:
 To generate python3 compatible sources for selected modules, run the
 2to3 tool on the module::
 
-    2to3 -wn --no-diffs scikits/learn/$module
+    2to3 -wn --no-diffs sklearn/$module
 
 If you would like to help with porting to python3, please propose
 yourself in the scikit-learn mailing list:

README.rst

Lines changed: 4 additions & 1 deletion
@@ -76,5 +76,8 @@ source directory (you will need to have nosetest installed)::
 
     python -c "import sklearn; sklearn.test()"
 
-See web page http://scikit-learn.sourceforge.net/install.html#testing
+See web page http://scikit-learn.org/stable/install.html#testing
 for more information.
+
+Random number generation can be controled during testing by setting
+the SKLEARN_SEED environment variable
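
The added README lines say that `SKLEARN_SEED` controls random number generation during testing, but this diff does not show how the test suite reads it. A minimal sketch of the general pattern, assuming the variable holds an integer and that a random seed is drawn when it is unset, might look like this. `seed_from_env` is a hypothetical helper, not a scikit-learn function.

```python
import os
import random


def seed_from_env(var="SKLEARN_SEED"):
    """Return the integer seed stored in the environment, or a random one."""
    raw = os.environ.get(var)
    if raw is None:
        # No seed requested: draw one so that runs differ by default.
        return random.randrange(2 ** 31)
    return int(raw)


# Simulate a user exporting the variable before running the tests.
os.environ["SKLEARN_SEED"] = "42"
seed = seed_from_env()

# Two generators built from the same seed produce the same stream.
first_run = random.Random(seed).random()
second_run = random.Random(seed_from_env()).random()
print(seed, first_run == second_run)
```

Fixing the seed this way makes a failing randomized test reproducible from one run to the next.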

benchmarks/bench_covertype.py

Lines changed: 3 additions & 2 deletions
@@ -52,6 +52,7 @@
 
 from time import time
 import os
+import sys
 import numpy as np
 from optparse import OptionParser
 
@@ -182,7 +183,7 @@ def benchmark(clf):
     'alpha': 0.001,
     'n_iter': 2,
 }
-classifiers['SGD'] = SGDClassifier( **sgd_parameters)
+classifiers['SGD'] = SGDClassifier(**sgd_parameters)
 
 ######################################################################
 ## Train CART model
@@ -207,7 +208,7 @@ def benchmark(clf):
 selected_classifiers = opts.classifiers.split(',')
 for name in selected_classifiers:
     if name not in classifiers:
-        op.error('classifier %r unknwon')
+        op.error('classifier %r unknown' % name)
         sys.exit(1)
 
 print("")
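
The last hunk fixes two problems in one line: the misspelled word and a `%r` placeholder that was never filled in, so the old message would have printed a literal `%r` instead of the classifier name. The sketch below reproduces the corrected check in isolation. `check_classifiers` and the sample inputs are made up for the example; `OptionParser.error` itself is standard library behaviour, printing the message to stderr and exiting with status 2.

```python
from optparse import OptionParser


def check_classifiers(requested, available, parser):
    """Abort through the parser if any requested classifier is unknown."""
    for name in requested.split(','):
        if name not in available:
            # The name must be interpolated; a bare '%r' is printed verbatim.
            parser.error('classifier %r unknown' % name)


old_message = 'classifier %r unknwon'        # message before the fix
new_message = 'classifier %r unknown' % 'Foo'  # message after the fix

op = OptionParser()
try:
    check_classifiers("SGD,Foo", {"SGD": None}, op)
    exit_code = None
except SystemExit as exc:
    # OptionParser.error() exits by itself.
    exit_code = exc.code

print(new_message, exit_code)
```

Because `op.error` already terminates the program, the `sys.exit(1)` that follows it in the benchmark is never reached in practice.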

benchmarks/bench_plot_fastkmeans.py

Lines changed: 2 additions & 2 deletions
@@ -42,7 +42,7 @@ def compute_bench(samples_range, features_range):
 # let's prepare the data in small chunks
 mbkmeans = MiniBatchKMeans(init='k-means++',
                            k=10,
-                           chunk_size=chunk)
+                           batch_size=chunk)
 tstart = time()
 mbkmeans.fit(data)
 delta = time() - tstart
@@ -78,7 +78,7 @@ def compute_bench_2(chunks):
 tstart = time()
 mbkmeans = MiniBatchKMeans(init='k-means++',
                            k=8,
-                           chunk_size=chunk)
+                           batch_size=chunk)
 
 mbkmeans.fit(X)
 delta = time() - tstart

doc/Makefile

Lines changed: 1 addition & 1 deletion
@@ -106,4 +106,4 @@ doctest:
 	"results in $(BUILDDIR)/doctest/output.txt."
 
 download-data:
-	python -c "from scikits.learn.datasets.lfw import check_fetch_lfw; check_fetch_lfw()"
+	python -c "from sklearn.datasets.lfw import check_fetch_lfw; check_fetch_lfw()"

doc/conf.py

Lines changed: 2 additions & 1 deletion
@@ -73,7 +73,7 @@
 # built documents.
 #
 # The short X.Y version.
-version = '0.11'
+version = '0.12'
 # The full version, including alpha/beta/rc tags.
 import sklearn
 release = sklearn.__version__
@@ -220,6 +220,7 @@
 # Additional stuff for the LaTeX preamble.
 latex_preamble = """
 \usepackage{amsmath}\usepackage{amsfonts}\usepackage{bm}\usepackage{morefloats}
+\usepackage{enumitem} \setlistdepth{10}
 """
 
 # Documents to append as an appendix to all manuals.

doc/datasets/twenty_newsgroups_fixture.py

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
 from os.path import exists
 from os.path import join
 from nose import SkipTest
-from scikits.learn.datasets import get_data_home
+from sklearn.datasets import get_data_home
 
 
 def setup_module(module):

doc/developers/index.rst

Lines changed: 82 additions & 23 deletions
@@ -75,29 +75,24 @@ repository <http://github.com/scikit-learn/scikit-learn/>`__ on GitHub:
 
         $ git clone git@github.com:YourLogin/scikit-learn.git
 
- 4. Work on this copy, on your computer, using Git to do the version
-    control::
+ 4. Create a branch to hold your changes::
 
-        $ git add modified_files
-        $ git commit
-        $ git push origin master
-
-    and so on.
+        $ git checkout -b my-feature
 
-If your changes are not just trivial fixes, it is better to directly
-work in a branch with the name of the feature you are working on. In
-this case, replace step 4 with step 5:
+    and start making changes. Never work in the ``master`` branch!
 
- 5. Create a branch to host your changes and publish it on your public
-    repo::
+ 5. Work on this copy, on your computer, using Git to do the version
+    control. When you're done editing, do::
 
-        $ git checkout -b my-feature
         $ git add modified_files
         $ git commit
-        $ git push origin my-feature
 
-When you are ready, and you have pushed your changes to your GitHub repo, go
-the web page of the repo, and click on 'Pull request' to send us a pull
+    to record your changes in Git, then push them to GitHub with::
+
+        $ git push -u origin my-feature
+
+Finally, go to the web page of the your fork of the scikit-learn repo,
+and click 'Pull request' to send your changes to the maintainers for review.
 request. This will send an email to the committers, but might also send an
 email to the mailing list in order to get more visibility.
 
@@ -109,8 +104,7 @@ email to the mailing list in order to get more visibility.
 to use instead of ``origin``. If we choose the name ``upstream`` for it, the
 command will be::
 
-    $ git remote add upstream git@github.com:scikit-learn/scikit-learn.git
-
+    $ git remote add upstream https://github.com/scikit-learn/scikit-learn.git
 
 (If any of the above seems like magic to you, then look up the
 `Git documentation <http://git-scm.com/documentation>`_ on the web.)
@@ -156,6 +150,8 @@ You can also check for common programming errors with the following tools:
     $ pip install nose coverage
     $ nosetests --with-coverage path/to/tests_for_package
 
+  see also :ref:`testing_coverage`
+
 * No pyflakes warnings, check with::
 
     $ pip install pyflakes
@@ -185,13 +181,13 @@ and Cython optimizations.
 on all new contributions will get the overall code base quality in the
 right direction.
 
-EasyFix Issues
---------------
+Easy Issues
+-----------
 
 A great way to start contributing to scikit-learn is to pick an item from the
-list of `EasyFix issues
-<https://github.com/scikit-learn/scikit-learn/issues?labels=EasyFix>`_
-in the issue tracker. Resolving these issues allow you to start contributing
+list of `Easy issues
+<https://github.com/scikit-learn/scikit-learn/issues?labels=Easy>`_
+in the issue tracker. Resolving these issues allow you to start contributing
 to the project without much prior knowledge. Your assistance in this area will
 be greatly appreciated by the more experienced developers as it helps free up
 their time to concentrate on other issues.
@@ -230,13 +226,76 @@ it.
 slightly differently. To get the best results, you should use version
 1.0.
 
+.. _testing_coverage:
+
+Testing and improving test coverage
+------------------------------------
+
+High-quality `unit testing <http://en.wikipedia.org/wiki/Unit_testing>`_
+is a corner-stone of the sciki-learn development process. For this
+purpose, we use the `nose <http://nose.readthedocs.org/en/latest/>`_
+package. The tests are functions appropriately names, located in `tests`
+subdirectories, that check the validity of the algorithms and the
+different options of the code.
+
+The full scikit-learn tests can be run using 'make' in the root folder.
+Alternatively, running 'nosetests' in a folder will run all the tests of
+the corresponding subpackages.
+
+We expect code coverage of new features to be at least around 90%.
+
+.. note:: **Workflow to improve test coverage**
+
+    To test code coverage, you need to install the `coverage
+    <http://pypi.python.org/pypi/coverage>`_ package in addition to nose.
+
+    1. Run 'make test-coverage'. The output lists for each file the line
+       numbers that are not tested.
+
+    2. Find a low hanging fruit, looking at which lines are not tested,
+       write or adapt a test specifically for these lines.
+
+    3. Loop.
+
+
+
 Developers web site
 -------------------
 
 More information can be found on the `developer's wiki
 <https://github.com/scikit-learn/scikit-learn/wiki>`_.
 
 
+Issue Tracker Tags
+------------------
+All issues and pull requests on the
+`Github issue tracker <https://github.com/scikit-learn/scikit-learn/issues>`_
+should have (at least) one of the following tags:
+
+:Bug / Crash:
+    Something is happening that clearly shouldn't happen.
+    Wrong results as well as unexpected errors from estimators go here.
+
+:Cleanup / Enhancement:
+    Improving performance, usability, consistency.
+
+:Documentation:
+    Missing, incorrect or sub-standard documentations and examples.
+
+:New Feature:
+    Feature requests and pull requests implementing a new feature.
+
+There are two other tags to help new contributors:
+
+:Easy:
+    This issue can be tackled by anyone, no experience needed.
+    Ask for help if the formulation is unclear.
+
+:Moderate:
+    Might need some knowledge of machine learning or the package,
+    but is still approachable for someone new to the project.
+
+
 Other ways to contribute
 ========================
 