Merge branch 'dfsg' into debian · scikit-learn/scikit-learn@631b7c8 · GitHub

Commit 631b7c8

Merge branch 'dfsg' into debian
* dfsg: (46 commits)
  REL: 0.14 release: update whats_new and version
  MAINT Update mailmap
  DOC: fix CSS bug
  DOC: update documentation for release
  MISC: deprection is in 2 releases
  MAINT: randn on float is deprecated
  TST: avoid nose running sklearn.test as a test
  MISC: fix wrong timing in example
  FIX integer types in Ward clustering
  TST: avoid a crash in Windows + Anaconda Py3.3
  FIX search and replace misstake
  DOC put the narrative documentation of roc_curve and roc_auc_score in one place
  ENH more explicit name for auc + consistency for scorer, fix #2096
  DOC: button layout tweak
  Website: bottom buttons
  DOC: bigger menu fonts
  DOC: layout tweaks
  Polishing on "Who's using scikit-learn"
  FIX typo in testimonials
  ENH testimonials img are now centered.
  ...
2 parents 7c6e398 + 608cd63 commit 631b7c8

32 files changed, +2114 -1158 lines changed

.mailmap

Lines changed: 16 additions & 3 deletions
@@ -18,17 +18,25 @@ Brian Cheung <bcheung5@gmail.com> <cow@rusty.(none)>
 Brian Holt <bh00038@cvplws63.eps.surrey.ac.uk> <bdholt1@gmail.com>
 Christian Osendorfer <osendorf@gmail.com>
 Clay Woolam <clay@woolam.org>
+Denis Engemann <d.engemann@fz-juelich.de>
+Denis Engemann <d.engemann@fz-juelich.de> <denis.engemann@gmail.com>
+Denis Engemann <d.engemann@fz-juelich.de> <dengemann@Deniss-MacBook-Pro.local>
+Denis Engemann <d.engemann@fz-juelich.de> <dengemann <denis.engemann@gmail.com>
 Diego Molla <dmollaaliod@gmail.com> <diego@diego-desktop.(none)>
 DraXus <draxus@gmail.com> draxus <draxus@hammer.ugr>
 Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> <duchesnay@is143433.(none)>
 Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> <edouard.duchesnay@gmail.com>
 Edouard DUCHESNAY <ed203246@is206877.intra.cea.fr> <edouard@is2206219.(none)>
 Emmanuelle Gouillart <emmanuelle.gouillart@nsup.org>
 Emmanuelle Gouillart <emmanuelle.gouillart@nsup.org> <emma@aleph.(none)>
-Fabian Pedregosa <fabian@fseoane.net> <fabian.pedregosa@inria.fr>
+Fabian Pedregosa <fabian.pedregosa@inria.fr>
+Fabian Pedregosa <fabian.pedregosa@inria.fr> <fabian@fseoane.net>
+Fabian Pedregosa <fabian.pedregosa@inria.fr> <f@bianp.net>
 Federico Vaggi <vaggi.federico@gmail.com>
-Gael Varoquaux <gael.varoquaux@normalesup.org>
-Gael Varoquaux <gael.varoquaux@normalesup.org> <varoquau@normalesup.org>
+Federico Vaggi <vaggi.federico@gmail.com> <vaggi.federico@GMAIL.COM>
+Gael Varoquaux <gael.varoquaux@inria.fr>
+Gael Varoquaux <gael.varoquaux@inria.fr> <gael.varoquaux@normalesup.org>
+Gael Varoquaux <gael.varoquaux@inria.fr> <varoquau@normalesup.org>
 Gilles Louppe <g.louppe@gmail.com> <g.louppe@ulg.ac.be>
 Harikrishnan S <hihari777@gmail.com>
 Hrishikesh Huilgolkar <hrishikesh911@gmail.com> <hrishikesh@QE-IND-WKS007.(none)>
@@ -37,6 +45,7 @@ Jake VanderPlas <vanderplas@astro.washington.edu> <jakevdp@yahoo.com>
 Jake VanderPlas <vanderplas@astro.washington.edu> <jakevdp@gmail.com>
 Jake VanderPlas <vanderplas@astro.washington.edu> <vanderplas@astro.washington.edu>
 James Bergstra <james.bergstra@gmail.com>
+Jaques Grobler <jaques.grobler@inria.fr> <jaquesgrobler@gmail.com>
 Jan Schlüter <scikit-learn@jan-schlueter.de>
 Jean Kossaifi <jean.kossaifi@gmail.com>
 Jean Kossaifi <jean.kossaifi@gmail.com> <jkossaifi@is208616.intra.cea.fr>
@@ -52,6 +61,7 @@ Nelle Varoquaux <nelle.varoquaux@gmail.com>
 Nelle Varoquaux <nelle.varoquaux@gmail.com> <nelle@phgroup.com> <nelle@varoquaux@gmail.com>
 Nicolas Pinto <pinto@alum.mit.edu> <pinto@mit.edu>
 Noel Dawe <Noel.Dawe@cern.ch> <noel.dawe@gmail.com>
+Noel Dawe <Noel.Dawe@cern.ch> <noel.dAwe@cern.ch>
 Olivier Grisel <olivier.grisel@ensta.org> <ogrisel@turingcarpet.(none)>
 Olivier Grisel <olivier.grisel@ensta.org> <olivier.grisel@ensta.org>
 Olivier Hervieu <olivier.hervieu@gmail.com> <olivier.hervieu@tinyclues.com>
@@ -78,3 +88,6 @@ Wei Li <kuantkid@gmail.com>
 Wei Li <kuantkid@gmail.com> <kuantkid+github@gmail.com>
 X006 <x006@x006-icsl.(none)> <x006@x006laptop.(none)>
 Xinfan Meng <mxf3306@gmail.com> <mxf@chomsky.localdomain>
+Yannick Schwartz <yannick.schwartz@inria.fr> <yannick.schwartz@cea.fr>
+Yannick Schwartz <yannick.schwartz@inria.fr> <ys218403@is220245.(none)>
+

README.rst

Lines changed: 1 addition & 2 deletions
@@ -35,8 +35,7 @@ Dependencies
 ============

 scikit-learn is tested to work under Python 2.6+ and Python 3.3+
-(using the same codebase thanks to an embedded copy of [six](
-http://pythonhosted.org/six/)).
+(using the same codebase thanks to an embedded copy of `six <http://pythonhosted.org/six/>`_).

 The required dependencies to build the software Numpy >= 1.3, SciPy >= 0.7
 and a working C/C++ compiler.

doc/conf.py

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@
 # built documents.
 #
 # The short X.Y version.
-version = '0.14-git'
+version = '0.14'
 # The full version, including alpha/beta/rc tags.
 import sklearn
 release = sklearn.__version__
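
Note on this hunk: `version` (the short X.Y string) is hard-coded, while `release` is read from `sklearn.__version__`. As a minimal sketch only, and not what `doc/conf.py` does, the short form could instead be derived from the full release string. The helper name `short_version` is hypothetical.

```python
import re


def short_version(release):
    """Return the leading 'X.Y' of a release string such as '0.14.1' or '0.14-git'.

    Hypothetical helper for illustration; it is not part of scikit-learn or Sphinx.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % (release,))
    return "%s.%s" % match.groups()


# Example: both the old and the new hard-coded values reduce to the same X.Y.
print(short_version("0.14-git"))
print(short_version("0.14"))
```

Deriving the value would keep the two settings from drifting apart, at the cost of assuming that the release string always starts with two dot-separated integers.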

doc/datasets/covtype.rst

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ Each sample has 54 features, described on the
 Some of the features are boolean indicators,
 while others are discrete or continuous measurements.

-``sklearn.datasets.fetch_covtype`` will load the covertype dataset;
+:func:`sklearn.datasets.fetch_covtype` will load the covertype dataset;
 it returns a dictionary-like object
 with the feature matrix in the ``data`` member
 and the target values in ``target``.

doc/documentation.rst

Lines changed: 4 additions & 3 deletions
@@ -2,7 +2,7 @@

 <div class="container-index">

-Documentation of scikit-learn 0.13
+Documentation of scikit-learn 0.14
 ==================================

 .. raw:: html
@@ -78,9 +78,10 @@ Documentation of scikit-learn 0.13
 <div class="span4 box">
 <h2>Other Versions</h2>
 <ul>
-<li><a href="http://scikit-learn.org/0.13/user_guide.html">scikit-learn 0.13 (stable)</a></li>
-<li>scikit-learn 0.14 (development)</li>
+<li>scikit-learn 0.14 (stable)</li>
+<li><a href="http://scikit-learn.org/0.15/user_guide.html">scikit-learn 0.15 (development)</a></li>

+<li><a href="http://scikit-learn.org/0.13/user_guide.html">scikit-learn 0.13</a></li>
 <li><a href="http://scikit-learn.org/0.12/user_guide.html">scikit-learn 0.12</a></li>
 <li><a href="http://scikit-learn.org/0.11/user_guide.html">scikit-learn 0.11</a></li>
 <li><a href="http://scikit-learn.org/0.10/user_guide.html">scikit-learn 0.10</a></li>

doc/index.rst

Lines changed: 55 additions & 32 deletions
@@ -246,69 +246,92 @@
 <div class="container index-lower">
 <div class="row-fluid">
 <!-- News -->
-<div class="span6">
+<div class="span4">
 <h4>News</h4>
 <ul>
 <li><em>On-going development:</em>
 <a href="whats_new.html"><em>What's new</em> (changelog)</a>
 </li>
-<li><em>July 22th - 28th, 2013: internal sprint</em>
+<li><em>August 2013.</em> scikit-learn 0.14 is available for download (<a href="whats_new.html">Changelog</a>).
+</li>
+<li><em>July 22-28th, 2013: international sprint.</em>
 During this week-long sprint, we gathered most of the core
 developers in Paris.
-<!--
-Here are some of the biggest changes in the upcoming version:
-<ul>
-<li>Python 3 support</li>
-<li>Ensembles of Randomized Trees speed improvements</li>
-<li>Restricted Boltzman Machines</li>
-<li>Missing data imputation</li>
-<li>Bi-clustering</li>
-</ul>
--->
-
 We want to thank our sponsors, our
 hosts <a href="http://www.telecom-paristech.fr/">Télécom ParisTech</a>
 and <a href="http://www.tinyclues.com/">tinyclues</a>, and
 donations that helped fund this event.

-<li><em>February 2013.</em> scikit-learn 0.13.1 is available for download (<a href="whats_new.html">Changelog</a>).
-</li>
 </ul>
 </div>

 <!-- Community -->
-<div class="span6">
+<div class="span4">
 <h4>Community</h4>
 <ul>
-<li><em>Questions?</em> See <a href="http://stackoverflow.com/questions/tagged/scikit-learn">stackoverflow</a> # scikit-learn for usage questions</li>
-<li><em>Mailing list:</em> scikit-learn-general@lists.sourceforge.net</li>
+<li><em>Questions?</em> See <a href="http://stackoverflow.com/questions/tagged/scikit-learn">stackoverflow</a> # scikit-learn</li>
+<li><em>Mailing list:</em> <a href="https://lists.sourceforge.net/lists/listinfo/scikit-learn-general">scikit-learn-general@lists.sourceforge.net</a></li>
 <li><em>IRC:</em> #scikit-learn @ <a href="http://webchat.freenode.net/">freenode</a></li>
-<li><em>Help us:</em>
-<button class="btn btn-warning btn-big" onclick="document.getElementById('paypal-form').submit(); return false;"><b>Donate!</b></button> (<a href="about.html#funding">read more</a>)</li>
-<form target="_top" id="paypal-form" method="post" action="https://www.paypal.com/cgi-bin/webscr">
+</ul>
+
+<form target="_top" id="paypal-form" method="post" action="https://www.paypal.com/cgi-bin/webscr">
 <input type="hidden" value="_s-xclick" name="cmd">
 <input type="hidden" value="74EYUMF3FTSW8" name="hosted_button_id">
-</form>
-</ul>
+</form>
+
+<a class="btn btn-warning btn-big" onclick="document.getElementById('paypal-form').submit(); return false;">Help us, <strong>donate!</strong></a>
+<a class="btn btn-warning btn-big cite-us" href="./about.html#citing-scikit-learn"><strong>Cite us!</strong></a>
+
+<small style="display: block; margin-top: 10px"><a href="about.html#funding">Read more about donations</a></small>
 </div>

 <!-- who using -->
-<!--
 <div class="span4">
-<h4>Who is using scikit-learn?</h4>
+<h4>Who uses scikit-learn?</h4>

-</h4>
-<div id="myCarousel" class="carousel slide">
+<div id="testimonials_carousel" class="carousel slide">
 <div class="carousel-inner">
-<div class="active item"><img src="_images/inria.jpg" class="thumbnail" /><br /> <em>-- Great stuff!</em></div>
-<div class="item"><img src="_static/img/google.png" class="thumbnail" /><br /> <em>-- So good!</em></div>
+<div class="active item">
+<img src="_images/inria.jpg" class="thumbnail" />
+<p>
+<em>"We use scikit-learn to support leading-edge basic research [...]"</em>
+</p>
+</div>
+<div class="item">
+<img src="_images/evernote.png" class="thumbnail" />
+<p>
+<em>"For these tasks, we relied on the excellent scikit-learn package for Python."</em>
+</p>
+</div>
+<div class="item">
+<img src="_images/telecomparistech.jpg"
+class="thumbnail" />
+<p>
+<em>"The great benefit of scikit-learn is its fast learning curve [...]"</em>
+</p>
+</div>
+<div class="item">
+<img src="_images/aweber.png" class="thumbnail" />
+<p>
+<em>"It allows us to do AWesome stuff we would not otherwise accomplish"</em>
+</p>
+</div>
 </div>
-<div style="margin-top: 5px"><a href="#">More testimonials</a></div>
 </div>
-<script>$('#myCarousel').carousel()</script>
+<p align="right">
+<small class="example-link">
+<a href="testimonials/testimonials.html">More testimonials</a>
+</small>
+</p>
 </div>
--->

 </div>
 </div>
 </div>
+
+
+<script>
+$('#testimonials_carousel').carousel()
+</script>
+
+

doc/modules/classes.rst

Lines changed: 1 addition & 1 deletion
@@ -715,7 +715,6 @@ details.

 metrics.accuracy_score
 metrics.auc
-metrics.auc_score
 metrics.average_precision_score
 metrics.classification_report
 metrics.confusion_matrix
@@ -730,6 +729,7 @@ details.
 metrics.precision_recall_fscore_support
 metrics.precision_score
 metrics.recall_score
+metrics.roc_auc_score
 metrics.roc_curve
 metrics.zero_one_loss
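
Note on this hunk: the API listing drops `metrics.auc_score` and adds `metrics.roc_auc_score`. Only the documentation side of the rename is visible here, so how the old name is handled in the library is not shown in this diff. The following is a generic, standard-library-only sketch of one common way to keep an old name working while warning callers; `deprecated_alias`, `new_metric` and `old_metric` are hypothetical names, not scikit-learn code.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that forwards to new_func and emits a DeprecationWarning."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_func.__name__),
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to the wrapper
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


def new_metric(y_true, y_pred):
    """Toy metric (fraction of matching labels), standing in for a renamed function."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / float(len(y_true))


# The old name stays importable but warns on every call.
old_metric = deprecated_alias(new_metric, "old_metric")
```

Calling `old_metric([1, 0, 1], [1, 1, 1])` returns the same value as `new_metric` and raises a `DeprecationWarning` pointing at the caller's line.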

doc/modules/model_evaluation.rst

Lines changed: 26 additions & 27 deletions
@@ -56,7 +56,7 @@ Scoring Function
 'f1' :func:`sklearn.metrics.f1_score`
 'precision' :func:`sklearn.metrics.precision_score`
 'recall' :func:`sklearn.metrics.recall_score`
-'roc_auc' :func:`sklearn.metrics.auc_score`
+'roc_auc' :func:`sklearn.metrics.roc_auc_score`

 **Clustering**
 'adjusted_rand_score' :func:`sklearn.metrics.adjusted_rand_score`
@@ -182,11 +182,11 @@ Some of these are restricted to the binary classification case:
 .. autosummary::
 :template: function.rst

-auc_score
 average_precision_score
 hinge_loss
 matthews_corrcoef
 precision_recall_curve
+roc_auc_score
 roc_curve


@@ -268,27 +268,6 @@ and with a list of labels format:
 for an example of accuracy score usage using permutations of
 the dataset.

-Area under the curve (AUC)
-...........................
-
-The :func:`auc_score` function computes the 'area under the curve' (AUC) which
-is the area under the receiver operating characteristic (ROC) curve.
-
-This function requires the true binary value and the target scores, which can
-either be probability estimates of the positive class, confidence values, or
-binary decisions.
-
->>> import numpy as np
->>> from sklearn.metrics import auc_score
->>> y_true = np.array([0, 0, 1, 1])
->>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
->>> auc_score(y_true, y_scores)
-0.75
-
-For more information see the
-`Wikipedia article on AUC
-<http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve>`_
-and the :ref:`roc_metrics` section.

 .. _average_precision_metrics:

@@ -713,7 +692,7 @@ with a svm classifier::


 Log loss
---------
+........
 The log loss, also called logistic regression loss or cross-entropy loss,
 is a loss function defined on probability estimates.
 It is commonly used in (multinomial) logistic regression and neural networks,
@@ -795,7 +774,7 @@ function:
 .. _roc_metrics:

 Receiver operating characteristic (ROC)
-........................................
+.......................................

 The function :func:`roc_curve` computes the `receiver operating characteristic
 curve, or ROC curve (quoting
@@ -809,16 +788,36 @@ Wikipedia) <http://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_:
 positive rate), at various threshold settings. TPR is also known as
 sensitivity, and FPR is one minus the specificity or true negative rate."

+This function requires the true binary
+value and the target scores, which can either be probability estimates of the
+positive class, confidence values, or binary decisions.
 Here a small example of how to use the :func:`roc_curve` function::

 >>> import numpy as np
->>> from sklearn import metrics
+>>> from sklearn.metrics import roc_curve
 >>> y = np.array([1, 1, 2, 2])
 >>> scores = np.array([0.1, 0.4, 0.35, 0.8])
->>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
+>>> fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)
 >>> fpr
 array([ 0. , 0.5, 0.5, 1. ])
+>>> tpr
+array([ 0.5, 0.5, 1. , 1. ])
+>>> thresholds
+array([ 0.8 , 0.4 , 0.35, 0.1 ])
+
+The :func:`roc_auc_score` function computes the area under the receiver
+operating characteristic (ROC) curve, which is also denoted by
+AUC or AUROC. By computing the
+area under the roc curve, the curve information is summarized in one number.
+For more information see the `Wikipedia article on AUC
+<http://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_curve>`_.

+>>> import numpy as np
+>>> from sklearn.metrics import roc_auc_score
+>>> y_true = np.array([0, 0, 1, 1])
+>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
+>>> roc_auc_score(y_true, y_scores)
+0.75

 The following figure shows an example of such ROC curve.
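
Note on this hunk: the `0.75` in the `roc_auc_score` doctest can be checked by hand. For binary labels, the area under the ROC curve equals the fraction of (positive, negative) sample pairs in which the positive sample gets the higher score, with ties counted as one half. The sketch below is a pure-Python illustration of that equivalence, assuming labels coded as 0 and 1. `pairwise_auc` is a hypothetical helper, not the scikit-learn implementation.

```python
def pairwise_auc(y_true, y_scores):
    """AUC as the fraction of correctly ordered (positive, negative) pairs.

    Illustrative O(n_pos * n_neg) sketch; labels are assumed to be 0 or 1.
    """
    pos = [s for y, s in zip(y_true, y_scores) if y == 1]
    neg = [s for y, s in zip(y_true, y_scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Same data as the doctest: 3 of the 4 pairs are ordered correctly.
print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The only misordered pair is the positive scored 0.35 against the negative scored 0.4, which gives 3/4.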

doc/testimonials/images/aweber.png

41.3 KB

doc/testimonials/images/evernote.png

4.6 KB
11.2 KB
