Merge branch 'releases' into dfsg · yarikoptic/scikit-learn@8f9558b · GitHub

Commit 8f9558b

Merge branch 'releases' into dfsg
* releases: (99 commits)
  DOC one more version issue in doc
  skip docstring tests because not useful to users and has some issues
  deprecation of n_components happened in 0.19 not 0.18 (scikit-learn#9527)
  sync whatsnew with master so I'm less confused
  DOC more navigation links
  DOC a note on data leakage and pipeline (scikit-learn#9510)
  DOC set release date to Friday
  DOC Update news and menu for 0.19 release
  DOC list of contributors to 0.19
  DOC Change release date to Thursday
  DOC Remove some whitespace from what's new
  Update what's new for recent changes
  Use base.is_classifier instead instead of isinstance (scikit-learn#9482)
  Fix safe_indexing with read-only indices (scikit-learn#9507)
  [MRG+1] add scorer based on explained_variance_score (scikit-learn#9259)
  fix wrong assert in test_validation (scikit-learn#9480)
  [MRG+1] FIX Add missing mixins to ClassifierChain (scikit-learn#9473)
  Bring last code block in line with the image. (scikit-learn#9488)
  FIX Pass sample_weight as kwargs in VotingClassifier (scikit-learn#9493)
  FIX Incorrent implementation of noise_variance_ in PCA._fit_truncated (scikit-learn#9108)
  ...
2 parents 4615112 + 445af38 commit 8f9558b

124 files changed: +5325 -4316 lines changed

doc/about.rst

Lines changed: 20 additions & 6 deletions
@@ -67,7 +67,7 @@ Funding
6767

6868
`INRIA <https://www.inria.fr>`_ actively supports this project. It has
6969
provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
70-
(2012-2013) and Olivier Grisel (2013-2015) to work on this project
70+
(2012-2013) and Olivier Grisel (2013-2017) to work on this project
7171
full-time. It also hosts coding sprints and other events.
7272

7373
.. image:: images/inria-logo.jpg
@@ -77,7 +77,7 @@ full-time. It also hosts coding sprints and other events.
7777

7878
`Paris-Saclay Center for Data Science <http://www.datascience-paris-saclay.fr>`_
7979
funded one year for a developer to work on the project full-time
80-
(2014-2015).
80+
(2014-2015) and 50% of the time of Guillaume Lemaitre (2016-2017).
8181

8282
.. image:: images/cds-logo.png
8383
:width: 200pt
@@ -94,23 +94,37 @@ Environment also funds several students to work on the project part-time.
9494
:target: http://cds.nyu.edu/mooresloan/
9595

9696

97-
`Télécom Paristech <http://www.telecom-paristech.com>`_ funds Manoj Kumar (2014),
98-
Tom Dupré la Tour (2015), Raghav RV (2015-2016) and Thierry Guillemot (2016) to
99-
work on scikit-learn.
97+
`Télécom Paristech <http://www.telecom-paristech.com>`_ funded Manoj Kumar (2014),
98+
Tom Dupré la Tour (2015), Raghav RV (2015-2017), Thierry Guillemot (2016-2017)
99+
and Albert Thomas (2017) to work on scikit-learn.
100100

101101
.. image:: themes/scikit-learn/static/img/telecom.png
102102
:width: 100pt
103103
:align: center
104104
:target: http://www.telecom-paristech.fr/
105105

106106

107-
`Columbia University <http://columbia.edu>`_ funds Andreas Mueller since 2016.
107+
`Columbia University <http://columbia.edu>`_ funds Andreas Müller since 2016.
108108

109109
.. image:: themes/scikit-learn/static/img/columbia.png
110110
:width: 100pt
111111
:align: center
112112
:target: http://www.columbia.edu/
113113

114+
Andreas Müller also received a grant to improve scikit-learn from the `Alfred P. Sloan Foundation <https://sloan.org>`_ in 2017.
115+
116+
.. image:: images/sloan_banner.png
117+
:width: 200pt
118+
:align: center
119+
:target: https://sloan.org/
120+
121+
`The University of Sydney <http://sydney.edu.au>`_ funds Joel Nothman since July 2017.
122+
123+
.. image:: themes/scikit-learn/static/img/sydney-primary.jpeg
124+
:width: 200pt
125+
:align: center
126+
:target: http://www.sydney.edu.au/
127+
114128
The following students were sponsored by `Google <https://developers.google.com/open-source/>`_
115129
to work on scikit-learn through the
116130
`Google Summer of Code <https://en.wikipedia.org/wiki/Google_Summer_of_Code>`_

doc/datasets/index.rst

Lines changed: 7 additions & 7 deletions
@@ -252,7 +252,7 @@ features::
252252

253253
.. topic:: Related links:
254254

255-
_`Public datasets in svmlight / libsvm format`: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
255+
_`Public datasets in svmlight / libsvm format`: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets
256256

257257
_`Faster API-compatible implementation`: https://github.com/mblondel/svmlight-loader
258258

@@ -268,15 +268,15 @@ DataFrame are also acceptable.
268268
Here are some recommended ways to load standard columnar data into a
269269
format usable by scikit-learn:
270270

271-
* `pandas.io <http://pandas.pydata.org/pandas-docs/stable/io.html>`_
271+
* `pandas.io <https://pandas.pydata.org/pandas-docs/stable/io.html>`_
272272
provides tools to read data from common formats including CSV, Excel, JSON
273273
and SQL. DataFrames may also be constructed from lists of tuples or dicts.
274274
Pandas handles heterogeneous data smoothly and provides tools for
275275
manipulation and conversion into a numeric array suitable for scikit-learn.
276-
* `scipy.io <http://docs.scipy.org/doc/scipy/reference/io.html>`_
276+
* `scipy.io <https://docs.scipy.org/doc/scipy/reference/io.html>`_
277277
specializes in binary formats often used in scientific computing
278278
context such as .mat and .arff
279-
* `numpy/routines.io <http://docs.scipy.org/doc/numpy/reference/routines.io.html>`_
279+
* `numpy/routines.io <https://docs.scipy.org/doc/numpy/reference/routines.io.html>`_
280280
for standard loading of columnar data into numpy arrays
281281
* scikit-learn's :func:`datasets.load_svmlight_file` for the svmlight or libSVM
282282
sparse format
@@ -288,14 +288,14 @@ For some miscellaneous data such as images, videos, and audio, you may wish to
288288
refer to:
289289

290290
* `skimage.io <http://scikit-image.org/docs/dev/api/skimage.io.html>`_ or
291-
`Imageio <http://imageio.readthedocs.io/en/latest/userapi.html>`_
291+
`Imageio <https://imageio.readthedocs.io/en/latest/userapi.html>`_
292292
for loading images and videos to numpy arrays
293-
* `scipy.misc.imread <http://docs.scipy.org/doc/scipy/reference/generated/scipy.
293+
* `scipy.misc.imread <https://docs.scipy.org/doc/scipy/reference/generated/scipy.
294294
misc.imread.html#scipy.misc.imread>`_ (requires the `Pillow
295295
<https://pypi.python.org/pypi/Pillow>`_ package) to load pixel intensities
296296
data from various image file formats
297297
* `scipy.io.wavfile.read
298-
<http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html>`_
298+
<https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.io.wavfile.read.html>`_
299299
for reading WAV files into a numpy array
300300

301301
Categorical (or nominal) features stored as strings (common in pandas DataFrames)

doc/developers/debugging.rst

Lines changed: 0 additions & 51 deletions
This file was deleted.

doc/developers/index.rst

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Developer's Guide
1010
.. toctree::
1111

1212
contributing
13-
debugging
13+
tips
1414
utilities
1515
performance
1616
advanced_installation

doc/developers/tips.rst

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
1+
.. _developers-tips:
2+
3+
===========================
4+
Developers' Tips and Tricks
5+
===========================
6+
7+
Productivity and sanity-preserving tips
8+
=======================================
9+
10+
In this section we gather some useful advice and tools that may increase your
11+
quality-of-life when reviewing pull requests, running unit tests, and so forth.
12+
Some of these tricks consist of userscripts that require a browser extension
13+
such as `TamperMonkey`_ or `GreaseMonkey`_; to set up userscripts you must have
14+
one of these extensions installed, enabled and running. We provide userscripts
15+
as GitHub gists; to install them, click on the "Raw" button on the gist page.
16+
17+
.. _TamperMonkey: https://tampermonkey.net
18+
.. _GreaseMonkey: http://www.greasespot.net
19+
20+
Viewing the rendered HTML documentation for a pull request
21+
----------------------------------------------------------
22+
23+
We use CircleCI to build the HTML documentation for every pull request. To
24+
access that documentation, we provide a redirect as described in the
25+
:ref:`documentation section of the contributor guide
26+
<contribute_documentation>`. Instead of typing the address by hand, we provide a
27+
`userscript <https://gist.github.com/lesteve/470170f288884ec052bcf4bc4ffe958a>`_
28+
that adds a button to every PR. After installing the userscript, navigate to any
29+
GitHub PR; a new button labeled "See CircleCI doc for this PR" should appear in
30+
the top-right area.
31+
32+
Folding and unfolding outdated diffs on pull requests
33+
-----------------------------------------------------
34+
35+
GitHub hides discussions on PRs when the corresponding lines of code have been
36+
changed in the meantime. This `userscript
37+
<https://gist.github.com/lesteve/b4ef29bccd42b354a834>`_ provides a button to
38+
unfold all such hidden discussions at once, so you can catch up.
39+
40+
Checking out pull requests as remote-tracking branches
41+
------------------------------------------------------
42+
43+
In your local fork, add to your ``.git/config``, under the ``[remote
44+
"upstream"]`` heading, the line::
45+
46+
fetch = +refs/pull/*/head:refs/remotes/upstream/pr/*
47+
48+
You may then use ``git checkout pr/PR_NUMBER`` to navigate to the code of the
49+
pull-request with the given number. (`Read more in this gist.
50+
<https://gist.github.com/piscisaureus/3342247>`_)
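
To see which local ref each pull request ends up on, the wildcard mapping in that fetch line can be traced by hand. The following is a minimal Python sketch, not git's actual implementation: ``map_ref`` is a hypothetical helper introduced only for illustration, and it assumes the single-``*`` pattern form used above.

```python
def map_ref(refspec, remote_ref):
    """Map a remote ref to its local name under a single-'*' refspec.

    Hypothetical helper for illustration; git does this internally.
    Returns None when the remote ref does not match the source pattern.
    """
    # A leading '+' only means "update even if not a fast-forward".
    src, dst = refspec.lstrip("+").split(":", 1)
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # The text matched by '*' is substituted into the destination pattern.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)


REFSPEC = "+refs/pull/*/head:refs/remotes/upstream/pr/*"
print(map_ref(REFSPEC, "refs/pull/9527/head"))
print(map_ref(REFSPEC, "refs/heads/master"))
```

Under this mapping, after ``git fetch upstream`` the head of pull request 9527 would be stored as the remote-tracking branch ``upstream/pr/9527``, while ordinary branches are left to the remote's other fetch lines.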
51+
52+
Display code coverage in pull requests
53+
--------------------------------------
54+
55+
To overlay the code coverage reports generated by the CodeCov continuous
56+
integration, consider `this browser extension
57+
<https://github.com/codecov/browser-extension>`_. The coverage of each line
58+
will be displayed as a color background behind the line number.
59+
60+
Useful pytest aliases and flags
61+
-------------------------------
62+
63+
We recommend using pytest to run unit tests. When a unit test fails, the
64+
following tricks can make debugging easier:
65+
66+
1. The command line argument ``pytest -l`` instructs pytest to print the local
67+
variables when a failure occurs.
68+
69+
2. The argument ``pytest --pdb`` drops into the Python debugger on failure. To
70+
instead drop into the rich IPython debugger ``ipdb``, you may set up a
71+
shell alias to::
72+
73+
pytest --pdbcls=IPython.terminal.debugger:TerminalPdb --capture no
74+
75+
Debugging memory errors in Cython with valgrind
76+
===============================================
77+
78+
While python/numpy's built-in memory management is relatively robust, it can
79+
lead to performance penalties for some routines. For this reason, much of
80+
the high-performance code in scikit-learn is written in cython. This
81+
performance gain comes with a tradeoff, however: it is very easy for memory
82+
bugs to crop up in cython code, especially in situations where that code
83+
relies heavily on pointer arithmetic.
84+
85+
Memory errors can manifest themselves in a number of ways. The easiest ones to
86+
debug are often segmentation faults and related glibc errors. Uninitialized
87+
variables can lead to unexpected behavior that is difficult to track down.
88+
A very useful tool when debugging these sorts of errors is
89+
valgrind_.
90+
91+
92+
Valgrind is a command-line tool that can trace memory errors in a variety of
93+
code. Follow these steps:
94+
95+
1. Install `valgrind`_ on your system.
96+
97+
2. Download the python valgrind suppression file: `valgrind-python.supp`_.
98+
99+
3. Follow the directions in the `README.valgrind`_ file to customize your
100+
python suppressions. If you don't, you will have spurious output
101+
related to the python interpreter instead of your own code.
102+
103+
4. Run valgrind as follows::
104+
105+
$> valgrind -v --suppressions=valgrind-python.supp python my_test_script.py
106+
107+
.. _valgrind: http://valgrind.org
108+
.. _`README.valgrind`: http://svn.python.org/projects/python/trunk/Misc/README.valgrind
109+
.. _`valgrind-python.supp`: http://svn.python.org/projects/python/trunk/Misc/valgrind-python.supp
110+
111+
112+
The result will be a list of all the memory-related errors, which reference
113+
lines in the C-code generated by cython from your .pyx file. If you examine
114+
the referenced lines in the .c file, you will see comments which indicate the
115+
corresponding location in your .pyx source file. Hopefully the output will
116+
give you clues as to the source of your memory error.
117+
118+
For more information on valgrind and the array of options it has, see the
119+
tutorials and documentation on the `valgrind web site <http://valgrind.org>`_.
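
The lookup from a line in the generated C file back to the ``.pyx`` source, as described in the valgrind section above, can be sketched in Python. This assumes that cython annotates the generated C with comments of the form ``/* "my_module.pyx":12``, which may differ between cython versions; ``pyx_location`` and the sample lines are hypothetical and for illustration only.

```python
import re

# Assumed shape of cython's source-location comments: /* "path/file.pyx":LINE
_PYX_COMMENT = re.compile(r'/\*\s*"(?P<path>[^"]+\.pyx)":(?P<line>\d+)')


def pyx_location(c_lines, c_lineno):
    """Return (pyx_path, pyx_line) for a 1-based line of the generated C file.

    Scans upwards from c_lineno to the nearest location comment and
    returns None if no such comment is found.
    """
    start = min(c_lineno, len(c_lines)) - 1
    for idx in range(start, -1, -1):
        match = _PYX_COMMENT.search(c_lines[idx])
        if match:
            return match.group("path"), int(match.group("line"))
    return None


# Hypothetical excerpt of generated C, for illustration only.
sample = [
    '  /* "my_module.pyx":12',
    ' *     buf[i] = 0',
    ' */',
    '  (__pyx_v_buf[__pyx_v_i]) = 0;',
]
print(pyx_location(sample, 4))
```

Given the C line number reported by valgrind, a helper of this kind points to the ``.pyx`` line to inspect first.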

doc/documentation.rst

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
22

33
<div class="container-index">
44

5-
Documentation of scikit-learn 0.19.dev0
5+
Documentation of scikit-learn |release|
66
=======================================
77

88
.. raw:: html
@@ -28,8 +28,8 @@ Documentation of scikit-learn 0.19.dev0
2828
<!-- doc versions -->
2929
<h2>Other Versions</h2>
3030
<ul>
31-
<li>scikit-learn 0.19 (development)</li>
32-
<li><a href="http://scikit-learn.org/stable/documentation.html">scikit-learn 0.18 (stable)</a></li>
31+
<li><a href="http://scikit-learn.org/stable/documentation.html">scikit-learn 0.19 (stable)</a></li>
32+
<li><a href="http://scikit-learn.org/0.18/documentation.html">scikit-learn 0.18</a></li>
3333
<li><a href="http://scikit-learn.org/0.17/documentation.html">scikit-learn 0.17</a></li>
3434
<li><a href="http://scikit-learn.org/0.16/documentation.html">scikit-learn 0.16</a></li>
3535
</ul>

doc/faq.rst

Lines changed: 3 additions & 3 deletions
@@ -24,9 +24,9 @@ Apart from scikit-learn, another popular one is `scikit-image <http://scikit-ima
2424
How can I contribute to scikit-learn?
2525
-----------------------------------------
2626
See :ref:`contributing`. Before wanting to add a new algorithm, which is
27-
usually a major and lengthy undertaking, it is recommended to start with :ref:`known
28-
issues <easy_issues>`_. Please do not contact the contributors of scikit-learn directly
29-
regarding contributing to scikit-learn.
27+
usually a major and lengthy undertaking, it is recommended to start with
28+
:ref:`known issues <new_contributors>`. Please do not contact the contributors
29+
of scikit-learn directly regarding contributing to scikit-learn.
3030

3131
What's the best way to get help on scikit-learn usage?
3232
--------------------------------------------------------------

doc/images/sloan_banner.png

22.2 KB

doc/index.rst

Lines changed: 7 additions & 4 deletions
@@ -207,14 +207,16 @@
207207
<li><em>On-going development:</em>
208208
<a href="/dev/whats_new.html"><em>What's new</em> (Changelog)</a>
209209
</li>
210+
<li><em>July 2017.</em> scikit-learn 0.19.0 is available for download (<a href="whats_new.html#version-0-19">Changelog</a>).
211+
</li>
212+
<li><em>June 2017.</em> scikit-learn 0.18.2 is available for download (<a href="whats_new.html#version-0-18-2">Changelog</a>).
213+
</li>
210214
<li><em>September 2016.</em> scikit-learn 0.18.0 is available for download (<a href="whats_new.html#version-0-18">Changelog</a>).
211215
</li>
212216
<li><em>November 2015.</em> scikit-learn 0.17.0 is available for download (<a href="whats_new.html#version-0-17">Changelog</a>).
213217
</li>
214218
<li><em>March 2015.</em> scikit-learn 0.16.0 is available for download (<a href="whats_new.html#version-0-16">Changelog</a>).
215219
</li>
216-
<li><em>July 2014.</em> scikit-learn 0.15.0 is available for download (<a href="whats_new.html#version-0-15">Changelog</a>).
217-
</li>
218220
<li><em>July 14-20th, 2014: international sprint.</em>
219221
During this week-long sprint, we gathered 18 of the core
220222
contributors in Paris.
@@ -323,14 +325,15 @@
323325
Funding provided by INRIA and others.
324326
</div>
325327
<div class="span6">
326-
<a class="reference internal" href="about.html#funding" style="text-decoration: none" >
328+
<a class="reference internal" href="about.html#funding" style="text-decoration: none; white-space: nowrap" >
327329
<img id="index-funding-logo-big" src="_static/img/inria-small.png" title="INRIA">
328330
<img id="index-funding-logo-small" src="_static/img/google.png" title="Google">
329331
<!--Due to Télécom ParisTech's logo text being smaller, a style has been added to improve readability-->
330332
<img id="index-funding-logo-small" src="_static/img/telecom.png" title="Télécom ParisTech" style="max-height: 36px">
331333
<img id="index-funding-logo-small" src="_static/img/FNRS-logo.png" title="FNRS">
332-
<img id="index-funding-logo-small" src="_static/img/nyu_short_color.png" title="NYU CDS">
334+
<img id="index-funding-logo-small" src="_static/img/sloan_logo.jpg" title="Alfred P. Sloan Foundation" style="max-height: 36px">
333335
<img id="index-funding-logo-small" src="_static/img/columbia.png" title="Columbia University" style="max-height: 36px;">
336+
<img id="index-funding-logo-small" src="_static/img/sydney-stacked.jpeg" title="The University of Sydney" style="max-height: 36px;">
334337
</a>
335338
</div>
336339
<div class="span3">
