Clustering
==========

- `Clustering <http://en.wikipedia.org/wiki/Cluster_analysis>`__ of
+ `Clustering <https://en.wikipedia.org/wiki/Cluster_analysis>`_ of
  unlabeled data can be performed with the module :mod:`sklearn.cluster`.

Each clustering algorithm comes in two variants: a class, that implements
@@ -152,7 +152,7 @@ It suffers from various drawbacks:
better and zero is optimal. But in very high-dimensional spaces, Euclidean
distances tend to become inflated
(this is an instance of the so-called "curse of dimensionality").
- Running a dimensionality reduction algorithm such as `PCA <PCA>`
+ Running a dimensionality reduction algorithm such as `PCA <PCA>`_
prior to k-means clustering can alleviate this problem
and speed up the computations.
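A minimal sketch of that preprocessing step (the synthetic dataset and the
``n_components`` value below are illustrative choices, not recommendations
from this guide)::

    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.decomposition import PCA

    # Synthetic high-dimensional data
    X, _ = make_blobs(n_samples=1000, n_features=50, centers=4, random_state=0)

    # Reduce dimensionality before clustering to mitigate inflated
    # Euclidean distances
    X_reduced = PCA(n_components=10).fit_transform(X)
    labels = KMeans(n_clusters=4, random_state=0).fit_predict(X_reduced)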
@@ -208,8 +208,8 @@ each job).

.. warning::

-    The parallel version of K-Means is broken on OS X when numpy uses the
-    Accelerate Framework. This is expected behavior: Accelerate can be called
+    The parallel version of K-Means is broken on OS X when `numpy` uses the
+    `Accelerate` Framework. This is expected behavior: `Accelerate` can be called
     after a fork but you need to execv the subprocess with the Python binary
     (which multiprocessing does not do under posix).
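A sketch of the single-process workaround, assuming the ``n_jobs``
parameter exposed by :class:`KMeans` in this version::

    from sklearn.cluster import KMeans

    # n_jobs=1 keeps the computation in a single process, avoiding the
    # fork-without-execv issue described in the warning above
    km = KMeans(n_clusters=8, n_jobs=1)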
@@ -323,6 +323,7 @@ appropriate for small to medium sized datasets.

* :ref:`example_applications_plot_stock_market.py` Affinity Propagation on
  Financial time series to find groups of companies

+
**Algorithm description:**
The messages sent between points belong to one of two categories. The first is
the responsibility :math:`r(i, k)`,
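As a usage-level complement to the algorithm description, a minimal sketch
of fitting :class:`AffinityPropagation` (the synthetic data is illustrative
only)::

    from sklearn.cluster import AffinityPropagation
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

    af = AffinityPropagation().fit(X)
    print(af.cluster_centers_indices_)  # indices of the chosen exemplars
    print(af.labels_)                   # cluster assignment of each sample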
@@ -361,9 +362,8 @@ Mean Shift

:class:`MeanShift` clustering aims to discover *blobs* in a smooth density of
samples. It is a centroid based algorithm, which works by updating candidates
for centroids to be the mean of the points within a given region. These
- candidates are then filtered in a
- post-processing stage to eliminate near-duplicates to form the final set of
- centroids.
+ candidates are then filtered in a post-processing stage to eliminate
+ near-duplicates to form the final set of centroids.

Given a candidate centroid :math:`x_i` for iteration :math:`t`, the candidate
is updated according to the following equation:
@@ -373,11 +373,10 @@ is updated according to the following equation:

    x_i^{t+1} = x_i^t + m(x_i^t)

Where :math:`N(x_i)` is the neighborhood of samples within a given distance
- around :math:`x_i` and :math:`m` is the *mean shift* vector that is computed
- for each centroid that
- points towards a region of the maximum increase in the density of points. This
- is computed using the following equation, effectively updating a centroid to be
- the mean of the samples within its neighborhood:
+ around :math:`x_i` and :math:`m` is the *mean shift* vector that is computed for each
+ centroid that points towards a region of the maximum increase in the density of points.
+ This is computed using the following equation, effectively updating a centroid
+ to be the mean of the samples within its neighborhood:

.. math::
@@ -412,7 +411,7 @@ given sample.

* `"Mean shift: A robust approach toward feature space analysis."
  <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.8968&rep=rep1&type=pdf>`_
- D. Comaniciu, & P. Meer *IEEE Transactions on Pattern Analysis and Machine Intelligence* (2002)
+ D. Comaniciu and P. Meer, *IEEE Transactions on Pattern Analysis and Machine Intelligence* (2002)
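A minimal sketch of the estimator described above, with the bandwidth (the
radius of the neighborhood :math:`N(x_i)`) estimated from the data; the
``quantile`` value and the dataset are illustrative choices::

    from sklearn.cluster import MeanShift, estimate_bandwidth
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

    # The bandwidth controls the size of the region used in the
    # mean shift update
    bandwidth = estimate_bandwidth(X, quantile=0.2)
    ms = MeanShift(bandwidth=bandwidth).fit(X)
    print(ms.cluster_centers_)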
.. _spectral_clustering:

@@ -524,7 +523,7 @@ build nested clusters by merging or splitting them successively. This
hierarchy of clusters is represented as a tree (or dendrogram). The root of the
tree is the unique cluster that gathers all the samples, the leaves being the
clusters with only one sample. See the `Wikipedia page
- <http://en.wikipedia.org/wiki/Hierarchical_clustering>`_ for more details.
+ <https://en.wikipedia.org/wiki/Hierarchical_clustering>`_ for more details.

The :class:`AgglomerativeClustering` object performs a hierarchical clustering
using a bottom up approach: each observation starts in its own cluster, and
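A minimal sketch of this bottom-up merging (the data and parameter values
are illustrative)::

    from sklearn.cluster import AgglomerativeClustering
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

    # Each sample starts in its own cluster; pairs of clusters are
    # merged until only n_clusters remain
    agg = AgglomerativeClustering(n_clusters=3, linkage='ward').fit(X)
    print(agg.labels_)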
@@ -1003,7 +1002,7 @@ random labelings by defining the adjusted Rand index as follows:

  L. Hubert and P. Arabie, Journal of Classification 1985

* `Wikipedia entry for the adjusted Rand index
-  <http://en.wikipedia.org/wiki/Rand_index#Adjusted_Rand_index>`_
+  <https://en.wikipedia.org/wiki/Rand_index#Adjusted_Rand_index>`_
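The index discussed in this section is available as
:func:`metrics.adjusted_rand_score`; a quick sketch (the label lists are
illustrative)::

    from sklearn import metrics

    labels_true = [0, 0, 0, 1, 1, 1]
    labels_pred = [0, 0, 1, 1, 2, 2]

    # Symmetric in its arguments and corrected for chance
    print(metrics.adjusted_rand_score(labels_true, labels_pred))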
.. _mutual_info_score:
@@ -1153,23 +1152,25 @@ calculated using a similar form to that of the adjusted Rand index:

* Strehl, Alexander, and Joydeep Ghosh (2002). "Cluster ensembles – a
  knowledge reuse framework for combining multiple partitions". Journal of
-  Machine Learning Research 3: 583–617. doi:10.1162/153244303321897735
+  Machine Learning Research 3: 583–617.
+  `doi:10.1162/153244303321897735 <http://strehl.com/download/strehl-jmlr02.pdf>`_.

* Vinh, Epps, and Bailey, (2009). "Information theoretic measures
  for clusterings comparison". Proceedings of the 26th Annual International
  Conference on Machine Learning - ICML '09.
-  doi:10.1145/1553374.1553511. ISBN 9781605585161.
+  `doi:10.1145/1553374.1553511 <http://dx.doi.org/10.1145/1553374.1553511>`_.
+  ISBN 9781605585161.

* Vinh, Epps, and Bailey, (2010). Information Theoretic Measures for
  Clusterings Comparison: Variants, Properties, Normalization and
-  Correction for Chance}, JMLR
+  Correction for Chance, JMLR
  http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf

* `Wikipedia entry for the (normalized) Mutual Information
-  <http://en.wikipedia.org/wiki/Mutual_Information>`_
+  <https://en.wikipedia.org/wiki/Mutual_Information>`_

* `Wikipedia entry for the Adjusted Mutual Information
-  <http://en.wikipedia.org/wiki/Adjusted_Mutual_Information>`_
+  <https://en.wikipedia.org/wiki/Adjusted_Mutual_Information>`_
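The chance-corrected measure referenced above is available as
:func:`metrics.adjusted_mutual_info_score`; a quick sketch (the label
lists are illustrative)::

    from sklearn import metrics

    labels_true = [0, 0, 0, 1, 1, 1]
    labels_pred = [0, 0, 1, 1, 2, 2]

    # Mutual information corrected for chance agreement
    print(metrics.adjusted_mutual_info_score(labels_true, labels_pred))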
.. _homogeneity_completeness:
@@ -1240,7 +1241,7 @@ homogeneous but not complete::

Advantages
~~~~~~~~~~

- - **Bounded scores**: 0.0 is as bad as it can be, 1.0 is a perfect score
+ - **Bounded scores**: 0.0 is as bad as it can be, 1.0 is a perfect score.

- Intuitive interpretation: clustering with bad V-measure can be
  **qualitatively analyzed in terms of homogeneity and completeness**
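Both axes of that analysis can be obtained in a single call; a quick
sketch using :func:`metrics.homogeneity_completeness_v_measure` (the label
lists are illustrative)::

    from sklearn import metrics

    labels_true = [0, 0, 0, 1, 1, 1]
    labels_pred = [0, 0, 1, 1, 2, 2]

    # Returns the (homogeneity, completeness, v_measure) triple
    print(metrics.homogeneity_completeness_v_measure(labels_true, labels_pred))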
@@ -1375,7 +1376,8 @@ cluster analysis.

* Peter J. Rousseeuw (1987). "Silhouettes: a Graphical Aid to the
  Interpretation and Validation of Cluster Analysis". Computational
-  and Applied Mathematics 20: 53–65. doi:10.1016/0377-0427(87)90125-7.
+  and Applied Mathematics 20: 53–65.
+  `doi:10.1016/0377-0427(87)90125-7 <http://dx.doi.org/10.1016/0377-0427(87)90125-7>`_.
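A quick sketch of computing the coefficient with
:func:`metrics.silhouette_score` (the data and the number of clusters are
illustrative)::

    from sklearn import metrics
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=500, centers=4, random_state=0)
    labels = KMeans(n_clusters=4, random_state=0).fit_predict(X)

    # Mean silhouette coefficient over all samples; higher is better
    print(metrics.silhouette_score(X, labels, metric='euclidean'))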
Advantages