@@ -319,6 +319,11 @@ is eigendecomposed in the Kernel PCA fitting process has an effective rank that
is much smaller than its size. This is a situation where approximate
eigensolvers can provide speedup with very low precision loss.

+
+|details-start|
+**Eigensolvers**
+|details-split|
+
The optional parameter ``eigen_solver='randomized'`` can be used to
*significantly* reduce the computation time when the number of requested
``n_components`` is small compared with the number of samples. It relies on
@@ -343,6 +348,7 @@ is extremely small. It is enabled by default when the desired number of
components is less than 10 (strict) and the number of samples is more than 200
(strict). See :class:`KernelPCA` for details.

+
.. topic:: References:

* *dense* solver:
@@ -365,6 +371,8 @@ components is less than 10 (strict) and the number of samples is more than 200
<https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.eigsh.html>`_
R. B. Lehoucq, D. C. Sorensen, and C. Yang, (1998)

+|details-end|
+

.. _LSA:

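As a quick illustration of the ``eigen_solver='randomized'`` option discussed in this hunk, a minimal sketch (the dataset and parameter values below are invented for illustration, not taken from the docs)::

    from sklearn.datasets import make_classification
    from sklearn.decomposition import KernelPCA

    # Illustrative data: many samples, few requested components.
    X, _ = make_classification(n_samples=2000, n_features=50, random_state=0)

    # Default dense eigensolver.
    kpca_dense = KernelPCA(n_components=5, kernel="rbf", eigen_solver="dense")
    X_dense = kpca_dense.fit_transform(X)

    # Randomized eigensolver: typically much faster when n_components is small
    # compared with the number of samples, at a small cost in precision.
    kpca_rand = KernelPCA(
        n_components=5, kernel="rbf", eigen_solver="randomized", random_state=0
    )
    X_rand = kpca_rand.fit_transform(X)

    print(X_dense.shape, X_rand.shape)  # (2000, 5) (2000, 5)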
@@ -375,6 +383,16 @@ Truncated singular value decomposition and latent semantic analysis
(SVD) that only computes the :math:`k` largest singular values,
where :math:`k` is a user-specified parameter.

+:class:`TruncatedSVD` is very similar to :class:`PCA`, but differs
+in that the matrix :math:`X` does not need to be centered.
+When the columnwise (per-feature) means of :math:`X`
+are subtracted from the feature values,
+truncated SVD on the resulting matrix is equivalent to PCA.
+
+|details-start|
+**About truncated SVD and latent semantic analysis (LSA)**
+|details-split|
+
When truncated SVD is applied to term-document matrices
(as returned by :class:`~sklearn.feature_extraction.text.CountVectorizer` or
:class:`~sklearn.feature_extraction.text.TfidfVectorizer`),
@@ -415,11 +433,6 @@ To also transform a test set :math:`X`, we multiply it with :math:`V_k`:
We present LSA in a different way that matches the scikit-learn API better,
but the singular values found are the same.

-:class:`TruncatedSVD` is very similar to :class:`PCA`, but differs
-in that the matrix :math:`X` does not need to be centered.
-When the columnwise (per-feature) means of :math:`X`
-are subtracted from the feature values,
-truncated SVD on the resulting matrix is equivalent to PCA.

While the :class:`TruncatedSVD` transformer
works with any feature matrix,
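The equivalence stated in the paragraph moved by this change can be checked with a small sketch (the random data and tolerance are illustrative only): subtracting the column means before :class:`TruncatedSVD` recovers, up to sign, the same components as :class:`PCA`::

    import numpy as np
    from sklearn.decomposition import PCA, TruncatedSVD

    rng = np.random.RandomState(0)
    X = rng.rand(20, 5)  # toy data, invented for the sketch

    # PCA centers X internally.
    pca = PCA(n_components=2).fit(X)

    # TruncatedSVD on the explicitly centered matrix.
    X_centered = X - X.mean(axis=0)
    svd = TruncatedSVD(n_components=2, random_state=0).fit(X_centered)

    # Components match up to a per-component sign flip (and tiny numerical error).
    print(np.allclose(np.abs(pca.components_), np.abs(svd.components_), atol=1e-6))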
@@ -430,6 +443,8 @@ should be turned on (``sublinear_tf=True, use_idf=True``)
to bring the feature values closer to a Gaussian distribution,
compensating for LSA's erroneous assumptions about textual data.

+|details-end|
+
.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_text_plot_document_clustering.py`
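For reference, the tf-idf settings recommended above are typically combined with :class:`TruncatedSVD` in an LSA pipeline along these lines (the toy corpus and ``n_components`` value are invented for the sketch, not taken from the referenced example)::

    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import Normalizer

    corpus = [
        "the cat sat on the mat",
        "the dog chased the cat",
        "dogs and cats are pets",
        "the stock market fell today",
        "investors sold shares on the market",
    ]

    # Sublinear tf scaling and idf weighting, as recommended for LSA.
    lsa = make_pipeline(
        TfidfVectorizer(sublinear_tf=True, use_idf=True),
        TruncatedSVD(n_components=2, random_state=0),
        Normalizer(copy=False),
    )
    X_lsa = lsa.fit_transform(corpus)
    print(X_lsa.shape)  # (5, 2)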
@@ -442,6 +457,7 @@ compensating for LSA's erroneous assumptions about textual data.
<https://nlp.stanford.edu/IR-book/pdf/18lsi.pdf>`_


+
.. _DictionaryLearning:

Dictionary Learning
@@ -883,6 +899,10 @@ Note that this definition is not valid if :math:`\beta \in (0; 1)`, yet it can
be continuously extended to the definitions of :math:`d_{KL}` and :math:`d_{IS}`
respectively.

+|details-start|
+**NMF implemented solvers**
+|details-split|
+
:class:`NMF` implements two solvers, using Coordinate Descent ('cd') [5]_, and
Multiplicative Update ('mu') [6]_. The 'mu' solver can optimize every
beta-divergence, including of course the Frobenius norm (:math:`\beta=2`), the
@@ -896,6 +916,8 @@ The 'cd' solver can only optimize the Frobenius norm. Due to the
underlying non-convexity of NMF, the different solvers may converge to
different minima, even when optimizing the same distance function.

+|details-end|
+
NMF is best used with the ``fit_transform`` method, which returns the matrix W.
The matrix H is stored into the fitted model in the ``components_`` attribute;
the method ``transform`` will decompose a new matrix X_new based on these
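To make the solver discussion in this hunk concrete, a minimal sketch fitting :class:`NMF` with each solver (the parameter values such as ``init`` and ``max_iter`` are illustrative choices, not prescribed by the docs)::

    import numpy as np
    from sklearn.decomposition import NMF

    X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])

    # 'cd' can only minimize the Frobenius norm (beta = 2).
    nmf_cd = NMF(n_components=2, solver="cd", init="nndsvda", random_state=0)
    W_cd = nmf_cd.fit_transform(X)

    # 'mu' can minimize any beta-divergence, e.g. Kullback-Leibler (beta = 1).
    nmf_mu = NMF(
        n_components=2, solver="mu", beta_loss="kullback-leibler",
        init="nndsvda", random_state=0, max_iter=500,
    )
    W_mu = nmf_mu.fit_transform(X)
    H_mu = nmf_mu.components_
    print(W_cd.shape, W_mu.shape, H_mu.shape)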
@@ -910,6 +932,8 @@ stored components::
>>> X_new = np.array([[1, 0], [1, 6.1], [1, 0], [1, 4], [3.2, 1], [0, 4]])
>>> W_new = model.transform(X_new)

+
+
.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_decomposition_plot_faces_decomposition.py`
@@ -996,6 +1020,10 @@ of topics in the corpus and the distribution of words in the documents.
The goal of LDA is to use the observed words to infer the hidden topic
structure.

+|details-start|
+**Details on modeling text corpora**
+|details-split|
+
When modeling text corpora, the model assumes the following generative process
for a corpus with :math:`D` documents and :math:`K` topics, with :math:`K`
corresponding to `n_components` in the API:
@@ -1036,6 +1064,8 @@ Maximizing ELBO is equivalent to minimizing the Kullback-Leibler(KL) divergence
between :math:`q(z,\theta,\beta)` and the true posterior
:math:`p(z, \theta, \beta|w, \alpha, \eta)`.

+|details-end|
+
:class:`LatentDirichletAllocation` implements the online variational Bayes
algorithm and supports both online and batch update methods.
While the batch method updates variational variables after each full pass through
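As an illustration of the two update methods mentioned here, a small sketch (the toy count matrix and hyperparameter values are invented for the example)::

    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    rng = np.random.RandomState(0)
    # Toy document-term count matrix: 100 documents, 20 terms.
    X = rng.poisson(lam=1.0, size=(100, 20))

    # Batch variational Bayes: each iteration uses the full dataset.
    lda_batch = LatentDirichletAllocation(
        n_components=5, learning_method="batch", max_iter=10, random_state=0
    )
    lda_batch.fit(X)

    # Online variational Bayes: incremental updates from mini-batches,
    # useful when the corpus does not fit in memory at once.
    lda_online = LatentDirichletAllocation(
        n_components=5, learning_method="online", batch_size=32, max_iter=10,
        random_state=0,
    )
    lda_online.fit(X)

    print(lda_batch.components_.shape, lda_online.components_.shape)  # (5, 20) twice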