8000 Fix few typos on links and doi · scikit-learn/scikit-learn@e103c31 · GitHub
Commit e103c31 (1 parent: ba3caf4)

Fix few typos on links and doi

- Add https for Wikipedia pages
- Add http://dx.doi.org/ links for doi entries

File tree: 1 file changed (+24, −22 lines)


doc/modules/clustering.rst

Lines changed: 24 additions & 22 deletions
@@ -4,7 +4,7 @@
 Clustering
 ==========
 
-`Clustering <http://en.wikipedia.org/wiki/Cluster_analysis>`__ of
+`Clustering <https://en.wikipedia.org/wiki/Cluster_analysis>`_ of
 unlabeled data can be performed with the module :mod:`sklearn.cluster`.
 
 Each clustering algorithm comes in two variants: a class, that implements
@@ -152,7 +152,7 @@ It suffers from various drawbacks:
   better and zero is optimal. But in very high-dimensional spaces, Euclidean
   distances tend to become inflated
   (this is an instance of the so-called "curse of dimensionality").
-  Running a dimensionality reduction algorithm such as `PCA <PCA>`
+  Running a dimensionality reduction algorithm such as `PCA <PCA>`_
   prior to k-means clustering can alleviate this problem
   and speed up the computations.
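The hunk above recommends running PCA before k-means when Euclidean distances are inflated by high dimensionality. A minimal sketch of that pipeline — the synthetic data and parameter choices below are illustrative, not taken from the diff:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Illustrative high-dimensional data: 3 well-separated blobs in 50 dimensions.
X, _ = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)

# Reduce dimensionality first, then run k-means in the projected space.
X_reduced = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_reduced)

print(len(set(labels)))  # 3
```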

@@ -208,8 +208,8 @@ each job).
 
 .. warning::
 
-    The parallel version of K-Means is broken on OS X when numpy uses the
-    Accelerate Framework. This is expected behavior: Accelerate can be called
+    The parallel version of K-Means is broken on OS X when `numpy` uses the
+    `Accelerate` Framework. This is expected behavior: `Accelerate` can be called
     after a fork but you need to execv the subprocess with the Python binary
     (which multiprocessing does not do under posix).

@@ -323,6 +323,7 @@ appropriate for small to medium sized datasets.
 * :ref:`example_applications_plot_stock_market.py` Affinity Propagation on
   Financial time series to find groups of companies
 
+
 **Algorithm description:**
 The messages sent between points belong to one of two categories. The first is
 the responsibility :math:`r(i, k)`,
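The responsibility update mentioned here, :math:`r(i, k) \leftarrow s(i, k) - \max_{k' \neq k} [a(i, k') + s(i, k')]`, can be sketched in a few lines of NumPy. The toy similarity matrix and the argmax/second-max trick are illustrative, not part of the diff:

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(5, 2)

# Similarities s(i, k): negative squared distances, a common choice.
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
A = np.zeros_like(S)  # availabilities a(i, k), initially zero

# r(i, k) = s(i, k) - max_{k' != k} [a(i, k') + s(i, k')]
tmp = A + S
idx = np.arange(len(S))
best = tmp.argmax(axis=1)
first = tmp[idx, best]          # row-wise maximum
tmp[idx, best] = -np.inf
second = tmp.max(axis=1)        # row-wise second maximum

R = S - first[:, None]          # subtract the max, excluding k itself...
R[idx, best] = S[idx, best] - second  # ...which needs the runner-up at k = argmax
```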
@@ -361,9 +362,8 @@ Mean Shift
 :class:`MeanShift` clustering aims to discover *blobs* in a smooth density of
 samples. It is a centroid based algorithm, which works by updating candidates
 for centroids to be the mean of the points within a given region. These
-candidates are then filtered in a
-post-processing stage to eliminate near-duplicates to form the final set of
-centroids.
+candidates are then filtered in a post-processing stage to eliminate
+near-duplicates to form the final set of centroids.
 
 Given a candidate centroid :math:`x_i` for iteration :math:`t`, the candidate
 is updated according to the following equation:
@@ -373,11 +373,10 @@ is updated according to the following equation:
     x_i^{t+1} = x_i^t + m(x_i^t)
 
 Where :math:`N(x_i)` is the neighborhood of samples within a given distance
-around :math:`x_i` and :math:`m` is the *mean shift* vector that is computed
-for each centroid that
-points towards a region of the maximum increase in the density of points. This
-is computed using the following equation, effectively updating a centroid to be
-the mean of the samples within its neighborhood:
+around :math:`x_i` and :math:`m` is the *mean shift* vector that is computed for each
+centroid that points towards a region of the maximum increase in the density of points.
+This is computed using the following equation, effectively updating a centroid
+to be the mean of the samples within its neighborhood:
 
 .. math::
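The update rule this hunk rewords — move each candidate centroid to the mean of the samples in its neighborhood — can be sketched directly with a flat kernel. The toy data, bandwidth, and iteration count are illustrative:

```python
import numpy as np

def mean_shift_step(x, X, bandwidth):
    # x_{t+1} = x_t + m(x_t): with a flat kernel this is simply the mean
    # of the samples within `bandwidth` of the current candidate.
    neighbors = X[np.linalg.norm(X - x, axis=1) < bandwidth]
    return neighbors.mean(axis=0)

rng = np.random.RandomState(0)
X = np.r_[rng.randn(50, 2), rng.randn(50, 2) + 5]  # two toy blobs

x = X[0].copy()
for _ in range(20):  # iterate the update until it settles on a mode
    x = mean_shift_step(x, X, bandwidth=2.0)
```

Starting from a point in the first blob, the candidate drifts toward that blob's dense center.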
@@ -412,7 +411,7 @@ given sample.
 
 * `"Mean shift: A robust approach toward feature space analysis."
   <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.8968&rep=rep1&type=pdf>`_
-  D. Comaniciu, & P. Meer *IEEE Transactions on Pattern Analysis and Machine Intelligence* (2002)
+  D. Comaniciu and P. Meer, *IEEE Transactions on Pattern Analysis and Machine Intelligence* (2002)
 
 
 .. _spectral_clustering:
@@ -524,7 +523,7 @@ build nested clusters by merging or splitting them successively. This
 hierarchy of clusters is represented as a tree (or dendrogram). The root of the
 tree is the unique cluster that gathers all the samples, the leaves being the
 clusters with only one sample. See the `Wikipedia page
-<http://en.wikipedia.org/wiki/Hierarchical_clustering>`_ for more details.
+<https://en.wikipedia.org/wiki/Hierarchical_clustering>`_ for more details.
 
 The :class:`AgglomerativeClustering` object performs a hierarchical clustering
 using a bottom up approach: each observation starts in its own cluster, and
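The bottom-up approach described here can be exercised in a few lines; the toy data and parameters are illustrative:

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, cluster_std=0.5, random_state=0)

# Each observation starts as its own cluster; pairs of clusters are
# merged successively until 3 clusters remain.
model = AgglomerativeClustering(n_clusters=3).fit(X)
print(len(set(model.labels_)))  # 3
```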
@@ -1003,7 +1002,7 @@ random labelings by defining the adjusted Rand index as follows:
   L. Hubert and P. Arabie, Journal of Classification 1985
 
 * `Wikipedia entry for the adjusted Rand index
-  <http://en.wikipedia.org/wiki/Rand_index#Adjusted_Rand_index>`_
+  <https://en.wikipedia.org/wiki/Rand_index#Adjusted_Rand_index>`_
 
 .. _mutual_info_score:
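A quick illustration of the adjusted Rand index referenced above — the label vectors are made up for the example:

```python
from sklearn.metrics import adjusted_rand_score

labels_true = [0, 0, 0, 1, 1, 1]
labels_pred = [1, 1, 1, 0, 0, 0]  # same partition, cluster names permuted

# The score ignores label permutations: identical partitions score 1.0.
print(adjusted_rand_score(labels_true, labels_pred))  # 1.0
```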

@@ -1153,23 +1152,25 @@ calculated using a similar form to that of the adjusted Rand index:
 
 * Strehl, Alexander, and Joydeep Ghosh (2002). "Cluster ensembles – a
   knowledge reuse framework for combining multiple partitions". Journal of
-  Machine Learning Research 3: 583–617. doi:10.1162/153244303321897735
+  Machine Learning Research 3: 583–617.
+  `doi:10.1162/153244303321897735 <http://strehl.com/download/strehl-jmlr02.pdf>`_.
 
 * Vinh, Epps, and Bailey, (2009). "Information theoretic measures
   for clusterings comparison". Proceedings of the 26th Annual International
   Conference on Machine Learning - ICML '09.
-  doi:10.1145/1553374.1553511. ISBN 9781605585161.
+  `doi:10.1145/1553374.1553511 <http://dx.doi.org/10.1145/1553374.1553511>`_.
+  ISBN 9781605585161.
 
 * Vinh, Epps, and Bailey, (2010). Information Theoretic Measures for
   Clusterings Comparison: Variants, Properties, Normalization and
-  Correction for Chance}, JMLR
+  Correction for Chance, JMLR
   http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf
 
 * `Wikipedia entry for the (normalized) Mutual Information
-  <http://en.wikipedia.org/wiki/Mutual_Information>`_
+  <https://en.wikipedia.org/wiki/Mutual_Information>`_
 
 * `Wikipedia entry for the Adjusted Mutual Information
-  <http://en.wikipedia.org/wiki/Adjusted_Mutual_Information>`_
+  <https://en.wikipedia.org/wiki/Adjusted_Mutual_Information>`_
 
 .. _homogeneity_completeness:
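The adjusted mutual information discussed in this section behaves like the adjusted Rand index with respect to relabeling; a tiny illustration with made-up label vectors:

```python
from sklearn.metrics import adjusted_mutual_info_score

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [2, 2, 0, 0, 1, 1]  # identical grouping under a relabeling

# AMI is permutation-invariant and adjusted for chance:
# a perfect match scores 1.0.
print(adjusted_mutual_info_score(labels_true, labels_pred))  # 1.0
```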

@@ -1240,7 +1241,7 @@ homogeneous but not complete::
 Advantages
 ~~~~~~~~~~
 
-- **Bounded scores**: 0.0 is as bad as it can be, 1.0 is a perfect score
+- **Bounded scores**: 0.0 is as bad as it can be, 1.0 is a perfect score.
 
 - Intuitive interpretation: clustering with bad V-measure can be
   **qualitatively analyzed in terms of homogeneity and completeness**
@@ -1375,7 +1376,8 @@ cluster analysis.
 
 * Peter J. Rousseeuw (1987). "Silhouettes: a Graphical Aid to the
   Interpretation and Validation of Cluster Analysis". Computational
-  and Applied Mathematics 20: 53–65. doi:10.1016/0377-0427(87)90125-7.
+  and Applied Mathematics 20: 53–65.
+  `doi:10.1016/0377-0427(87)90125-7 <http://dx.doi.org/10.1016/0377-0427(87)90125-7>`_.
 
 
 Advantages
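The silhouette coefficient from Rousseeuw's paper cited above can be computed on a clustering result in one call; the synthetic data and parameters below are illustrative:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.5, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Scores lie in [-1, 1]; tight, well-separated blobs score close to 1.
score = silhouette_score(X, labels)
```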
