DOC fix typos in the user guide linear model documentation (#19554) · rth/scikit-learn@4797222 · GitHub

Commit 4797222

DOC fix typos in the user guide linear model documentation (scikit-learn#19554)
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
1 parent 772ffd9 commit 4797222

File tree

2 files changed: +164 -146 lines changed

doc/modules/linear_model.rst

Lines changed: 10 additions & 10 deletions
@@ -50,7 +50,7 @@ and will store the coefficients :math:`w` of the linear model in its
 
 The coefficient estimates for Ordinary Least Squares rely on the
 independence of the features. When features are correlated and the
-columns of the design matrix :math:`X` have an approximate linear
+columns of the design matrix :math:`X` have an approximately linear
 dependence, the design matrix becomes close to singular
 and as a result, the least-squares estimate becomes highly sensitive
 to random errors in the observed target, producing a large
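
As an illustrative aside (not part of the patch itself): a minimal sketch of the sensitivity this paragraph describes, using made-up data with two near-identical columns; the perturbation size and seed are assumptions chosen for illustration::

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.RandomState(0)
    x = rng.normal(size=(100, 1))
    # The second column is a near-exact copy of the first, so the
    # design matrix is close to singular.
    X = np.hstack([x, x + 1e-8 * rng.normal(size=(100, 1))])
    y = X @ np.array([1.0, 2.0]) + 0.01 * rng.normal(size=100)

    reg = LinearRegression().fit(X, y)
    # The individual coefficients can swing wildly with the noise draw,
    # even though their sum stays close to 3.
    print(reg.coef_)
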
@@ -68,7 +68,7 @@ It is possible to constrain all the coefficients to be non-negative, which may
 be useful when they represent some physical or naturally non-negative
 quantities (e.g., frequency counts or prices of goods).
 :class:`LinearRegression` accepts a boolean ``positive``
-parameter: when set to `True` `Non Negative Least Squares
+parameter: when set to `True` `Non-Negative Least Squares
 <https://en.wikipedia.org/wiki/Non-negative_least_squares>`_ are then applied.
 
 .. topic:: Examples:
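
A quick sketch of the ``positive`` parameter mentioned in this hunk (editor's example, not commit content; assumes scikit-learn >= 0.24, where the parameter was introduced)::

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression

    X, y = make_regression(n_samples=50, n_features=3, random_state=0)
    # positive=True solves a Non-Negative Least Squares problem,
    # constraining every coefficient to be >= 0.
    reg = LinearRegression(positive=True).fit(X, y)
    print(reg.coef_.min() >= 0)  # True
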
@@ -140,15 +140,15 @@ the output with the highest value.
 
 It might seem questionable to use a (penalized) Least Squares loss to fit a
 classification model instead of the more traditional logistic or hinge
-losses. However in practice all those models can lead to similar
+losses. However, in practice, all those models can lead to similar
 cross-validation scores in terms of accuracy or precision/recall, while the
 penalized least squares loss used by the :class:`RidgeClassifier` allows for
 a very different choice of the numerical solvers with distinct computational
 performance profiles.
 
 The :class:`RidgeClassifier` can be significantly faster than e.g.
-:class:`LogisticRegression` with a high number of classes, because it is
-able to compute the projection matrix :math:`(X^T X)^{-1} X^T` only once.
+:class:`LogisticRegression` with a high number of classes because it can
+compute the projection matrix :math:`(X^T X)^{-1} X^T` only once.
 
 This classifier is sometimes referred to as a `Least Squares Support Vector
 Machines
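
A hedged sketch of the speed comparison described above (editor's example; the dataset choice is an assumption, and relative timings will vary by machine)::

    from time import perf_counter

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression, RidgeClassifier

    X, y = load_digits(return_X_y=True)  # 10 classes
    for clf in (RidgeClassifier(), LogisticRegression(max_iter=2000)):
        start = perf_counter()
        clf.fit(X, y)
        # RidgeClassifier reuses a single least-squares solve across all
        # classes, so it is typically the faster of the two here.
        print(type(clf).__name__, perf_counter() - start)
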
@@ -210,7 +210,7 @@ Lasso
 The :class:`Lasso` is a linear model that estimates sparse coefficients.
 It is useful in some contexts due to its tendency to prefer solutions
 with fewer non-zero coefficients, effectively reducing the number of
-features upon which the given solution is dependent. For this reason
+features upon which the given solution is dependent. For this reason,
 Lasso and its variants are fundamental to the field of compressed sensing.
 Under certain conditions, it can recover the exact set of non-zero
 coefficients (see
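
To make the sparsity claim concrete (an editor's sketch, not commit content; data setup is an assumption)::

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    # Only 5 of the 50 features actually drive the target.
    X, y = make_regression(n_samples=100, n_features=50, n_informative=5,
                           noise=1.0, random_state=0)
    lasso = Lasso(alpha=1.0).fit(X, y)
    # The L1 penalty zeroes out most of the coefficients.
    print(np.sum(lasso.coef_ != 0))
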
@@ -309,7 +309,7 @@ as the regularization path is computed only once instead of k+1 times
 when using k-fold cross-validation. However, such criteria needs a
 proper estimation of the degrees of freedom of the solution, are
 derived for large samples (asymptotic results) and assume the model
-is correct, i.e. that the data are actually generated by this model.
+is correct, i.e. that the data are generated by this model.
 They also tend to break when the problem is badly conditioned
 (more features than samples).
 
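This hunk sits in the user guide's section on information-criteria based model selection; a minimal sketch of the estimator it documents, :class:`LassoLarsIC` (editor's example, not commit content)::

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import LassoLarsIC

    X, y = load_diabetes(return_X_y=True)
    # BIC-based selection: the regularization path is fit once,
    # instead of the k+1 fits needed for k-fold cross-validation.
    model = LassoLarsIC(criterion='bic').fit(X, y)
    print(model.alpha_)  # regularization strength selected by BIC
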
@@ -393,7 +393,7 @@ the regularization properties of :class:`Ridge`. We control the convex
 combination of :math:`\ell_1` and :math:`\ell_2` using the ``l1_ratio``
 parameter.
 
-Elastic-net is useful when there are multiple features which are
+Elastic-net is useful when there are multiple features that are
 correlated with one another. Lasso is likely to pick one of these
 at random, while elastic-net is likely to pick both.
 
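An illustrative sketch of the pick-one-versus-pick-both behaviour (editor's example; the duplicated-feature setup and hyperparameters are assumptions)::

    import numpy as np
    from sklearn.linear_model import ElasticNet, Lasso

    rng = np.random.RandomState(0)
    x = rng.normal(size=(200, 1))
    X = np.hstack([x, x])  # two perfectly correlated features
    y = x.ravel() + 0.1 * rng.normal(size=200)

    # Lasso tends to concentrate the weight on a single column, while
    # elastic-net tends to split it between the two.
    print(Lasso(alpha=0.1).fit(X, y).coef_)
    print(ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_)
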
@@ -500,7 +500,7 @@ The disadvantages of the LARS method include:
 in the discussion section of the Efron et al. (2004) Annals of
 Statistics article.
 
-The LARS model can be used using estimator :class:`Lars`, or its
+The LARS model can be used via the estimator :class:`Lars`, or its
 low-level implementation :func:`lars_path` or :func:`lars_path_gram`.
 
 
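Both entry points named in the corrected sentence, in a minimal sketch (editor's example, not commit content)::

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import Lars, lars_path

    X, y = load_diabetes(return_X_y=True)

    reg = Lars().fit(X, y)                    # high-level estimator
    alphas, active, coefs = lars_path(X, y)   # low-level path computation
    print(coefs.shape)
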
@@ -546,7 +546,7 @@ the residual.
 Instead of giving a vector result, the LARS solution consists of a
 curve denoting the solution for each value of the :math:`\ell_1` norm of the
 parameter vector. The full coefficients path is stored in the array
-``coef_path_``, which has size (n_features, max_features+1). The first
+``coef_path_`` of shape `(n_features, max_features + 1)`. The first
 column is always zero.
 
 .. topic:: References:
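
A quick check of the shape stated in the new wording (editor's sketch)::

    from sklearn.datasets import load_diabetes
    from sklearn.linear_model import Lars

    X, y = load_diabetes(return_X_y=True)
    reg = Lars().fit(X, y)
    # Shape is (n_features, max_features + 1); the first column holds
    # the all-zero starting point of the path.
    print(reg.coef_path_.shape)
    print(reg.coef_path_[:, 0])
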
