DOC fix links in truncated SVD docs (#17194) · InterferencePattern/scikit-learn@201060f · GitHub

Commit 201060f

NicolasHug authored and adrinjalali committed
DOC fix links in truncated SVD docs (scikit-learn#17194)
1 parent 3f86695 commit 201060f

2 files changed: +11 -11 lines changed

doc/modules/decomposition.rst

Lines changed: 4 additions & 4 deletions
@@ -288,7 +288,8 @@ Truncated singular value decomposition and latent semantic analysis
 where :math:`k` is a user-specified parameter.

 When truncated SVD is applied to term-document matrices
-(as returned by ``CountVectorizer`` or ``TfidfVectorizer``),
+(as returned by :class:`~sklearn.feature_extraction.text.CountVectorizer` or
+:class:`~sklearn.feature_extraction.text.TfidfVectorizer`),
 this transformation is known as
 `latent semantic analysis <https://nlp.stanford.edu/IR-book/pdf/18lsi.pdf>`_
 (LSA), because it transforms such matrices
@@ -327,8 +328,7 @@ To also transform a test set :math:`X`, we multiply it with :math:`V_k`:
 but the singular values found are the same.

 :class:`TruncatedSVD` is very similar to :class:`PCA`, but differs
-in that it works on sample matrices :math:`X` directly
-instead of their covariance matrices.
+in that the matrix :math:`X` does not need to be centered.
 When the columnwise (per-feature) means of :math:`X`
 are subtracted from the feature values,
 truncated SVD on the resulting matrix is equivalent to PCA.
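
The equivalence described in this hunk can be checked numerically. The following is a minimal sketch, not part of the commit; the random data, its shape, and all variable names are assumptions chosen for illustration. It centers a dense matrix by hand, fits `TruncatedSVD` on the centered copy and `PCA` on the original, and compares the results.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD

# Illustrative data (an assumption): 50 samples, 8 features, non-zero means.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 8)) + 3.0

# Subtract the columnwise (per-feature) means by hand.
X_centered = X - X.mean(axis=0)

# TruncatedSVD does not center its input, so it gets the centered copy.
svd = TruncatedSVD(n_components=3, algorithm="arpack").fit(X_centered)

# PCA centers internally, so it gets the original matrix.
pca = PCA(n_components=3, svd_solver="full").fit(X)

# The singular values should agree; the components should agree up to sign.
same_singular_values = np.allclose(svd.singular_values_, pca.singular_values_)
same_components = np.allclose(
    np.abs(svd.components_), np.abs(pca.components_), atol=1e-5
)
print(same_singular_values, same_components)
```

Fitting `TruncatedSVD` on the uncentered `X` instead would generally give different components, because the leading direction is then dominated by the feature means.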
@@ -338,7 +338,7 @@ matrices without the need to densify them,
 as densifying may fill up memory even for medium-sized document collections.

 While the :class:`TruncatedSVD` transformer
-works with any (sparse) feature matrix,
+works with any feature matrix,
 using it on tf–idf matrices is recommended over raw frequency counts
 in an LSA/document processing setting.
 In particular, sublinear scaling and inverse document frequency
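
The LSA setting these documentation changes refer to can be sketched as follows. This is an illustrative example only, not part of the commit: the toy corpus and the choice of two components are assumptions. It builds a sparse tf-idf matrix with sublinear scaling and inverse document frequency enabled, then reduces it with `TruncatedSVD` without densifying it.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (an assumption made for this sketch).
docs = [
    "the cat sat on the mat",
    "a cat chased a mouse",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold stock as markets dropped",
    "the market rallied after the report",
]

# Sublinear tf scaling and idf weighting, as the user guide recommends for LSA.
vectorizer = TfidfVectorizer(sublinear_tf=True, use_idf=True)
X_tfidf = vectorizer.fit_transform(docs)  # scipy sparse matrix

# TruncatedSVD accepts the sparse matrix directly.
lsa = TruncatedSVD(n_components=2, random_state=0)
X_lsa = lsa.fit_transform(X_tfidf)

print(sparse.issparse(X_tfidf), X_lsa.shape)
```

The output of `fit_transform` is a dense array with one row per document and one column per latent component.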

sklearn/decomposition/_truncated_svd.py

Lines changed: 7 additions & 7 deletions
@@ -27,16 +27,16 @@ class TruncatedSVD(TransformerMixin, BaseEstimator):
 This transformer performs linear dimensionality reduction by means of
 truncated singular value decomposition (SVD). Contrary to PCA, this
 estimator does not center the data before computing the singular value
-decomposition. This means it can work with scipy.sparse matrices
+decomposition. This means it can work with sparse matrices
 efficiently.

 In particular, truncated SVD works on term count/tf-idf matrices as
-returned by the vectorizers in sklearn.feature_extraction.text. In that
-context, it is known as latent semantic analysis (LSA).
+returned by the vectorizers in :mod:`sklearn.feature_extraction.text`. In
+that context, it is known as latent semantic analysis (LSA).

 This estimator supports two algorithms: a fast randomized SVD solver, and
-a "naive" algorithm that uses ARPACK as an eigensolver on (X * X.T) or
-(X.T * X), whichever is more efficient.
+a "naive" algorithm that uses ARPACK as an eigensolver on `X * X.T` or
+`X.T * X`, whichever is more efficient.

 Read more in the :ref:`User Guide <LSA>`.

@@ -56,8 +56,8 @@ class TruncatedSVD(TransformerMixin, BaseEstimator):
 n_iter : int, optional (default 5)
 Number of iterations for randomized SVD solver. Not used by ARPACK. The
 default is larger than the default in
-`~sklearn.utils.extmath.randomized_svd` to handle sparse matrices that
-may have large slowly decaying spectrum.
+:func:`~sklearn.utils.extmath.randomized_svd` to handle sparse
+matrices that may have large slowly decaying spectrum.

 random_state : int, RandomState instance, default=None
 Used during randomized svd. Pass an int for reproducible results across
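
The two solvers and the `n_iter` and `random_state` parameters described in this docstring can be tried side by side. The sketch below is illustrative and not part of the commit; the random sparse matrix, its density, and the component count are assumptions. It fits both algorithms on the same sparse input and reports how far their singular values differ.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Illustrative sparse input (an assumption): 200 x 50, about 5% non-zero.
X = sparse.random(200, 50, density=0.05, format="csr", random_state=0)

# Randomized solver: n_iter sets the number of power iterations,
# and random_state makes the result reproducible.
svd_rand = TruncatedSVD(
    n_components=5, algorithm="randomized", n_iter=5, random_state=0
)

# ARPACK solver: n_iter is not used here.
svd_arpack = TruncatedSVD(n_components=5, algorithm="arpack")

Z_rand = svd_rand.fit_transform(X)
Z_arpack = svd_arpack.fit_transform(X)

# Largest relative gap between the two sets of singular values.
max_rel_diff = np.max(
    np.abs(svd_rand.singular_values_ - svd_arpack.singular_values_)
    / svd_arpack.singular_values_
)
print(Z_rand.shape, Z_arpack.shape, max_rel_diff)
```

On a matrix like this one, whose spectrum decays slowly, raising `n_iter` would be expected to bring the randomized singular values closer to the ARPACK ones, at the cost of extra passes over the data.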

0 commit comments
