Add example on t-sne perplexity · CoderPat/scikit-learn@ae82233 · GitHub

Commit ae82233

Narine Kokhlikyan authored and ogrisel committed
Add example on t-sne perplexity
1 parent c10c886 commit ae82233


1 file changed, 94 additions and 0 deletions

@@ -0,0 +1,94 @@
+"""
+=============================================================================
+t-SNE: The effect of various perplexity values on the shape
+=============================================================================
+
+An illustration of t-SNE on the two concentric circles and the S-curve
+datasets for different perplexity values.
+
+We observe a tendency towards clearer shapes as the perplexity value increases.
+
+The size, the distance and the shape of clusters may vary with initialization
+and perplexity values, and do not always convey a meaning.
+
+As shown below, t-SNE for higher perplexities finds meaningful topology of
+two concentric circles; however, the size and the distance of the circles vary
+slightly from the original. Contrary to the two circles dataset, the shapes
+visually diverge from S-curve topology on the S-curve dataset even for
+larger perplexity values.
+
+For further details, "How to Use t-SNE Effectively"
+http://distill.pub/2016/misread-tsne/ provides a good discussion of the
+effects of various parameters, as well as interactive plots to explore
+those effects.
+"""
+
+# Author: Narine Kokhlikyan <narine@slice.com>
+# License: BSD
+
+print(__doc__)
+
+import matplotlib.pyplot as plt
+
+from matplotlib.ticker import NullFormatter
+from sklearn import manifold, datasets
+from time import time
+
+n_samples = 500
+n_components = 2
+(fig, subplots) = plt.subplots(2, 5, figsize=(15, 8))
+perplexities = [5, 50, 100, 150]
+
+X, y = datasets.make_circles(n_samples=n_samples, factor=.5, noise=.05)
+
+red = y == 0
+green = y == 1
+
+ax = subplots[0][0]
+ax.scatter(X[red, 0], X[red, 1], c="r")
+ax.scatter(X[green, 0], X[green, 1], c="g")
+ax.xaxis.set_major_formatter(NullFormatter())
+ax.yaxis.set_major_formatter(NullFormatter())
+ax.axis('tight')
+
+for i, perplexity in enumerate(perplexities):
+    ax = subplots[0][i + 1]
+
+    t0 = time()
+    tsne = manifold.TSNE(n_components=n_components, init='random',
+                         random_state=0, perplexity=perplexity)
+    Y = tsne.fit_transform(X)
+    t1 = time()
+    print("circles, perplexity=%d in %.2g sec" % (perplexity, t1 - t0))
+    ax.set_title("Perplexity=%d" % perplexity)
+    ax.scatter(Y[red, 0], Y[red, 1], c="r")
+    ax.scatter(Y[green, 0], Y[green, 1], c="g")
+    ax.xaxis.set_major_formatter(NullFormatter())
+    ax.yaxis.set_major_formatter(NullFormatter())
+    ax.axis('tight')
+
+# Another example using s-curve
+X, color = datasets.make_s_curve(n_samples, random_state=0)
+
+ax = subplots[1][0]
+ax.scatter(X[:, 0], X[:, 2], c=color, cmap=plt.cm.Spectral)
+ax.xaxis.set_major_formatter(NullFormatter())
+ax.yaxis.set_major_formatter(NullFormatter())
+
+for i, perplexity in enumerate(perplexities):
+    ax = subplots[1][i + 1]
+
+    t0 = time()
+    tsne = manifold.TSNE(n_components=n_components, init='random',
+                         random_state=0, perplexity=perplexity)
+    Y = tsne.fit_transform(X)
+    t1 = time()
+    print("S-curve, perplexity=%d in %.2g sec" % (perplexity, t1 - t0))
+
+    ax.set_title("Perplexity=%d" % perplexity)
+    ax.scatter(Y[:, 0], Y[:, 1], c=color, cmap=plt.cm.Spectral)
+    ax.xaxis.set_major_formatter(NullFormatter())
+    ax.yaxis.set_major_formatter(NullFormatter())
+    ax.axis('tight')
+
+plt.show()
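
For readers wondering what the `perplexity` values swept in this example measure: in the t-SNE formulation, perplexity is 2 raised to the Shannon entropy of a point's neighbor distribution, so it can be read as a rough "effective number of neighbors". The following is a minimal sketch of that definition only, assuming a plain list of probabilities as input. The helper name `perplexity` is hypothetical and is not part of the committed example or of scikit-learn.

```python
import math


def perplexity(probs):
    """Return 2**H(P) for a discrete distribution P (hypothetical helper).

    H(P) is the Shannon entropy in bits; zero-probability entries are
    skipped because they contribute nothing to the entropy.
    """
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy


# A uniform distribution over k neighbors has perplexity of about k.
uniform_over_5 = [0.2] * 5
# A distribution dominated by one neighbor has perplexity close to 1.
peaked = [0.9, 0.05, 0.05]

print(perplexity(uniform_over_5))
print(perplexity(peaked))
```

Under this reading, the small perplexity of 5 used above asks each point to attend to only a handful of neighbors, while 150 asks it to balance a much larger neighborhood, which may be one reason the larger values recover the global ring structure more clearly.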
