DOC add dropdown menu for Section 2.5 Decomposing signals in components · punndcoder28/scikit-learn@281523c · GitHub
Commit 281523c

DOC add dropdown menu for Section 2.5 Decomposing signals in components (scikit-learn#27551)
1 parent fb35756 commit 281523c

File tree

1 file changed: +35 additions, −5 deletions

doc/modules/decomposition.rst

Lines changed: 35 additions & 5 deletions
@@ -319,6 +319,11 @@ is eigendecomposed in the Kernel PCA fitting process has an effective rank that
 is much smaller than its size. This is a situation where approximate
 eigensolvers can provide speedup with very low precision loss.
 
+
+|details-start|
+**Eigensolvers**
+|details-split|
+
 The optional parameter ``eigen_solver='randomized'`` can be used to
 *significantly* reduce the computation time when the number of requested
 ``n_components`` is small compared with the number of samples. It relies on
@@ -343,6 +348,7 @@ is extremely small. It is enabled by default when the desired number of
 components is less than 10 (strict) and the number of samples is more than 200
 (strict). See :class:`KernelPCA` for details.
 
+
 .. topic:: References:
 
 * *dense* solver:
@@ -365,6 +371,8 @@ components is less than 10 (strict) and the number of samples is more than 200
 <https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.eigsh.html>`_
 R. B. Lehoucq, D. C. Sorensen, and C. Yang, (1998)
 
+|details-end|
+
 
 .. _LSA:
 
@@ -375,6 +383,16 @@ Truncated singular value decomposition and latent semantic analysis
 (SVD) that only computes the :math:`k` largest singular values,
 where :math:`k` is a user-specified parameter.
 
+:class:`TruncatedSVD` is very similar to :class:`PCA`, but differs
+in that the matrix :math:`X` does not need to be centered.
+When the columnwise (per-feature) means of :math:`X`
+are subtracted from the feature values,
+truncated SVD on the resulting matrix is equivalent to PCA.
+
+|details-start|
+**About truncated SVD and latent semantic analysis (LSA)**
+|details-split|
+
 When truncated SVD is applied to term-document matrices
 (as returned by :class:`~sklearn.feature_extraction.text.CountVectorizer` or
 :class:`~sklearn.feature_extraction.text.TfidfVectorizer`),
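The PCA-equivalence claim the hunk moves up can be checked directly. This is an illustrative sketch on random data, not part of the patch:

```python
# Illustrative sketch, not part of the patch: TruncatedSVD applied to a
# column-centered matrix recovers the same singular values as PCA,
# which performs the centering internally.
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD

rng = np.random.RandomState(0)
X = rng.rand(100, 20)

pca = PCA(n_components=3).fit(X)

# Center X by hand, then run an exact (ARPACK) truncated SVD.
X_centered = X - X.mean(axis=0)
svd = TruncatedSVD(n_components=3, algorithm="arpack").fit(X_centered)

print(np.allclose(pca.singular_values_, svd.singular_values_))  # True
```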
@@ -415,11 +433,6 @@ To also transform a test set :math:`X`, we multiply it with :math:`V_k`:
 We present LSA in a different way that matches the scikit-learn API better,
 but the singular values found are the same.
 
-:class:`TruncatedSVD` is very similar to :class:`PCA`, but differs
-in that the matrix :math:`X` does not need to be centered.
-When the columnwise (per-feature) means of :math:`X`
-are subtracted from the feature values,
-truncated SVD on the resulting matrix is equivalent to PCA.
 
 While the :class:`TruncatedSVD` transformer
 works with any feature matrix,
@@ -430,6 +443,8 @@ should be turned on (``sublinear_tf=True, use_idf=True``)
 to bring the feature values closer to a Gaussian distribution,
 compensating for LSA's erroneous assumptions about textual data.
 
+|details-end|
+
 .. topic:: Examples:
 
 * :ref:`sphx_glr_auto_examples_text_plot_document_clustering.py`
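The LSA recipe described in the hunk (tf-idf with ``sublinear_tf=True, use_idf=True``, then truncated SVD, then re-normalization) can be sketched as a pipeline. The toy documents below are illustrative assumptions, not from the patch:

```python
# Illustrative sketch, not part of the patch: a minimal LSA pipeline with
# the tf-idf settings recommended above (sublinear_tf=True, use_idf=True).
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are common pets",
    "mats and logs are household objects",
]

lsa = make_pipeline(
    TfidfVectorizer(sublinear_tf=True, use_idf=True),
    TruncatedSVD(n_components=2, random_state=0),
    Normalizer(copy=False),  # re-normalize rows after the SVD
)
X_lsa = lsa.fit_transform(docs)
print(X_lsa.shape)  # (4, 2)
```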
@@ -442,6 +457,7 @@ compensating for LSA's erroneous assumptions about textual data.
 <https://nlp.stanford.edu/IR-book/pdf/18lsi.pdf>`_
 
 
+
 .. _DictionaryLearning:
 
 Dictionary Learning
@@ -883,6 +899,10 @@ Note that this definition is not valid if :math:`\beta \in (0; 1)`, yet it can
 be continuously extended to the definitions of :math:`d_{KL}` and :math:`d_{IS}`
 respectively.
 
+|details-start|
+**NMF implemented solvers**
+|details-split|
+
 :class:`NMF` implements two solvers, using Coordinate Descent ('cd') [5]_, and
 Multiplicative Update ('mu') [6]_. The 'mu' solver can optimize every
 beta-divergence, including of course the Frobenius norm (:math:`\beta=2`), the
@@ -896,6 +916,8 @@ The 'cd' solver can only optimize the Frobenius norm. Due to the
 underlying non-convexity of NMF, the different solvers may converge to
 different minima, even when optimizing the same distance function.
 
+|details-end|
+
 NMF is best used with the ``fit_transform`` method, which returns the matrix W.
 The matrix H is stored into the fitted model in the ``components_`` attribute;
 the method ``transform`` will decompose a new matrix X_new based on these
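The two solvers the hunk folds into a dropdown can be contrasted in a few lines. Data, component count, and iteration budget below are illustrative assumptions:

```python
# Illustrative sketch, not part of the patch: the two NMF solvers on the
# same non-negative data.  'cd' handles only the Frobenius norm, while
# 'mu' also optimizes other beta-divergences such as generalized KL.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X = rng.rand(50, 10)  # non-negative by construction

# Coordinate Descent, Frobenius norm only.
W_cd = NMF(n_components=4, solver="cd", init="nndsvda",
           max_iter=500, random_state=0).fit_transform(X)

# Multiplicative Update with the generalized Kullback-Leibler divergence.
W_mu = NMF(n_components=4, solver="mu", beta_loss="kullback-leibler",
           init="nndsvda", max_iter=500, random_state=0).fit_transform(X)

print(W_cd.shape, W_mu.shape)  # (50, 4) (50, 4)
```

As the hunk notes, the two runs generally land in different local minima even when given the same loss, so the factorizations are not expected to match.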
@@ -910,6 +932,8 @@ stored components::
 >>> X_new = np.array([[1, 0], [1, 6.1], [1, 0], [1, 4], [3.2, 1], [0, 4]])
 >>> W_new = model.transform(X_new)
 
+
+
 .. topic:: Examples:
 
 * :ref:`sphx_glr_auto_examples_decomposition_plot_faces_decomposition.py`
@@ -996,6 +1020,10 @@ of topics in the corpus and the distribution of words in the documents.
 The goal of LDA is to use the observed words to infer the hidden topic
 structure.
 
+|details-start|
+**Details on modeling text corpora**
+|details-split|
+
 When modeling text corpora, the model assumes the following generative process
 for a corpus with :math:`D` documents and :math:`K` topics, with :math:`K`
 corresponding to `n_components` in the API:
@@ -1036,6 +1064,8 @@ Maximizing ELBO is equivalent to minimizing the Kullback-Leibler(KL) divergence
 between :math:`q(z,\theta,\beta)` and the true posterior
 :math:`p(z, \theta, \beta |w, \alpha, \eta)`.
 
+|details-end|
+
 :class:`LatentDirichletAllocation` implements the online variational Bayes
 algorithm and supports both online and batch update methods.
 While the batch method updates variational variables after each full pass through
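The batch and online update methods mentioned in the hunk can be sketched side by side. The synthetic count matrix and hyperparameters below are illustrative assumptions, not part of the patch:

```python
# Illustrative sketch, not part of the patch: batch vs. online variational
# Bayes on a synthetic word-count matrix (100 documents, 20-term vocabulary).
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.RandomState(0)
X = rng.poisson(1.0, size=(100, 20))

# Batch: variational variables updated after each full pass over the data.
lda_batch = LatentDirichletAllocation(n_components=5, learning_method="batch",
                                      max_iter=10, random_state=0)
doc_topics = lda_batch.fit_transform(X)

# Online: incremental updates from mini-batches, cheaper on large corpora.
lda_online = LatentDirichletAllocation(n_components=5, learning_method="online",
                                       batch_size=25, max_iter=10, random_state=0)
lda_online.fit(X)

# Each row of doc_topics is a normalized per-document topic distribution.
print(np.allclose(doc_topics.sum(axis=1), 1.0))  # True
```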

0 commit comments