@@ -538,7 +538,14 @@ entropy of the conditional probability distribution. The perplexity of a
:math:`k`-sided die is :math:`k`, so that :math:`k` is effectively the number of
nearest neighbors t-SNE considers when generating the conditional probabilities.
Larger perplexities lead to more nearest neighbors and less sensitivity to small
- structure. Larger datasets tend to require larger perplexities.
+ structure. Conversely, a lower perplexity considers a smaller number of
+ neighbors, and thus ignores more global information in favour of the
+ local neighborhood. As datasets get larger, more points are
+ required to get a reasonable sample of the local neighborhood, and hence
+ larger perplexities may be required. Similarly, noisier datasets require
+ larger perplexity values to encompass enough local neighbors to see beyond
+ the background noise.
+
The maximum number of iterations is usually high enough and does not need
any tuning. The optimization consists of two phases: the early exaggeration
phase and the final optimization. During early exaggeration the joint
@@ -554,6 +561,10 @@ is a tradeoff between performance and accuracy. Larger angles imply that we
can approximate larger regions by a single point, leading to better speed
but less accurate results.
+ `"How to Use t-SNE Effectively" <http://distill.pub/2016/misread-tsne/>`_
+ provides a good discussion of the effects of the various parameters, as well
+ as interactive plots to explore those effects.
+
Barnes-Hut t-SNE
----------------
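A quick way to sanity-check the "k-sided die" statement in the first hunk is to compute the perplexity directly. The sketch below assumes perplexity is taken as 2 raised to the Shannon entropy (in bits) of a discrete distribution, as the surrounding text describes. The names `perplexity`, `fair_die`, and `loaded_die` are illustrative only and are not part of scikit-learn.

```python
import math


def perplexity(probs):
    """Return 2**H(probs), where H is the Shannon entropy in bits.

    Zero-probability outcomes are skipped, since they contribute
    nothing to the entropy.
    """
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy


# A fair k-sided die: every face has probability 1/k, so the
# perplexity should come out (up to rounding) as k itself.
k = 6
fair_die = [1.0 / k] * k
fair = perplexity(fair_die)

# A loaded die puts most of its mass on one face, so fewer faces
# effectively "count" and the perplexity drops below k.
loaded_die = [0.75, 0.05, 0.05, 0.05, 0.05, 0.05]
loaded = perplexity(loaded_die)

print(f"fair die:   {fair:.3f}")
print(f"loaded die: {loaded:.3f}")
```

Read this way, setting a perplexity for t-SNE amounts to choosing how many neighbors each point's conditional distribution effectively spreads over, which is why a lower value emphasises the local neighborhood and a higher one takes in more global structure.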