DOC fix typos in the user guide linear model documentation by mohamed-khoualed · Pull Request #19554 · scikit-learn/scikit-learn
Merged · 3 commits · Jul 29, 2021
20 changes: 10 additions & 10 deletions doc/modules/linear_model.rst
@@ -50,7 +50,7 @@ and will store the coefficients :math:`w` of the linear model in its

The coefficient estimates for Ordinary Least Squares rely on the
independence of the features. When features are correlated and the
-columns of the design matrix :math:`X` have an approximate linear
+columns of the design matrix :math:`X` have an approximately linear
dependence, the design matrix becomes close to singular
and as a result, the least-squares estimate becomes highly sensitive
to random errors in the observed target, producing a large
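
A minimal sketch of the sensitivity described in this passage, using a made-up near-collinear design (the data and noise magnitudes are illustrative assumptions, not taken from the docs)::

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.RandomState(0)
    x = rng.rand(50)
    # The second column is a nearly exact copy of the first, so the
    # design matrix X is close to singular.
    X = np.c_[x, x + 1e-8 * rng.randn(50)]
    y = 3 * x

    coef_clean = LinearRegression().fit(X, y).coef_
    # A tiny perturbation of the target can move the individual
    # coefficients by a large amount, even though their sum stays stable.
    coef_noisy = LinearRegression().fit(X, y + 1e-6 * rng.randn(50)).coef_
    print(coef_clean)
    print(coef_noisy)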
@@ -68,7 +68,7 @@ It is possible to constrain all the coefficients to be non-negative, which may
be useful when they represent some physical or naturally non-negative
quantities (e.g., frequency counts or prices of goods).
:class:`LinearRegression` accepts a boolean ``positive``
-parameter: when set to `True` `Non Negative Least Squares
+parameter: when set to `True` `Non-Negative Least Squares
<https://en.wikipedia.org/wiki/Non-negative_least_squares>`_ are then applied.
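
A short sketch of the ``positive`` parameter mentioned above, on a toy problem (the data are made up)::

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0]])
    y = np.array([3.0, 2.0, 7.0])

    # With positive=True the fit solves a Non-Negative Least Squares
    # problem, so every coefficient is constrained to be >= 0.
    reg = LinearRegression(positive=True).fit(X, y)
    print(reg.coef_)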

.. topic:: Examples:
@@ -140,15 +140,15 @@ the output with the highest value.

It might seem questionable to use a (penalized) Least Squares loss to fit a
classification model instead of the more traditional logistic or hinge
-losses. However in practice all those models can lead to similar
+losses. However, in practice, all those models can lead to similar
cross-validation scores in terms of accuracy or precision/recall, while the
penalized least squares loss used by the :class:`RidgeClassifier` allows for
a very different choice of the numerical solvers with distinct computational
performance profiles.

The :class:`RidgeClassifier` can be significantly faster than e.g.
-:class:`LogisticRegression` with a high number of classes, because it is
-able to compute the projection matrix :math:`(X^T X)^{-1} X^T` only once.
+:class:`LogisticRegression` with a high number of classes because it can
+compute the projection matrix :math:`(X^T X)^{-1} X^T` only once.
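
A rough sketch of that comparison on synthetic data (the dataset sizes and any resulting scores are illustrative assumptions)::

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression, RidgeClassifier

    X, y = make_classification(n_samples=2000, n_features=50,
                               n_informative=30, n_classes=10,
                               random_state=0)

    # RidgeClassifier can share a single least-squares solve across all
    # classes, while LogisticRegression runs an iterative solver.
    ridge = RidgeClassifier().fit(X, y)
    logreg = LogisticRegression(max_iter=1000).fit(X, y)
    print(ridge.score(X, y), logreg.score(X, y))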

This classifier is sometimes referred to as a `Least Squares Support Vector
Machines
@@ -210,7 +210,7 @@ Lasso
The :class:`Lasso` is a linear model that estimates sparse coefficients.
It is useful in some contexts due to its tendency to prefer solutions
with fewer non-zero coefficients, effectively reducing the number of
-features upon which the given solution is dependent. For this reason
+features upon which the given solution is dependent. For this reason,
Lasso and its variants are fundamental to the field of compressed sensing.
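
A minimal sketch of that sparsity, assuming a toy design where only two of twenty features carry signal::

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.RandomState(0)
    X = rng.randn(100, 20)
    y = 3 * X[:, 0] - 2 * X[:, 1] + 0.01 * rng.randn(100)

    lasso = Lasso(alpha=0.1).fit(X, y)
    # Most coefficients are driven exactly to zero.
    print(np.sum(lasso.coef_ != 0))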
Under certain conditions, it can recover the exact set of non-zero
coefficients (see
@@ -309,7 +309,7 @@ as the regularization path is computed only once instead of k+1 times
when using k-fold cross-validation. However, such criteria need a
proper estimation of the degrees of freedom of the solution, are
derived for large samples (asymptotic results) and assume the model
-is correct, i.e. that the data are actually generated by this model.
+is correct, i.e. that the data are generated by this model.
They also tend to break when the problem is badly conditioned
(more features than samples).
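
A hedged sketch of criterion-based selection with :class:`LassoLarsIC`, on synthetic data (parameter values are illustrative)::

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LassoLarsIC

    X, y = make_regression(n_samples=200, n_features=10, noise=4.0,
                           random_state=0)

    # The criterion ("aic" or "bic") picks alpha from a single
    # regularization path, instead of the k+1 fits that k-fold
    # cross-validation would require.
    model = LassoLarsIC(criterion="bic").fit(X, y)
    print(model.alpha_)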

@@ -393,7 +393,7 @@ the regularization properties of :class:`Ridge`. We control the convex
combination of :math:`\ell_1` and :math:`\ell_2` using the ``l1_ratio``
parameter.

-Elastic-net is useful when there are multiple features which are
+Elastic-net is useful when there are multiple features that are
correlated with one another. Lasso is likely to pick one of these
at random, while elastic-net is likely to pick both.
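
A small sketch of that behaviour with two perfectly correlated features (toy data; the exact coefficient values will vary)::

    import numpy as np
    from sklearn.linear_model import ElasticNet, Lasso

    rng = np.random.RandomState(0)
    x = rng.randn(100)
    X = np.c_[x, x]                   # two identical, fully correlated columns
    y = 2 * x + 0.01 * rng.randn(100)

    # Lasso tends to put all the weight on a single column ...
    print(Lasso(alpha=0.1).fit(X, y).coef_)
    # ... while elastic-net tends to spread it across both.
    print(ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_)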

@@ -500,7 +500,7 @@ The disadvantages of the LARS method include:
in the discussion section of the Efron et al. (2004) Annals of
Statistics article.

-The LARS model can be used using estimator :class:`Lars`, or its
+The LARS model can be used via the estimator :class:`Lars`, or its
low-level implementation :func:`lars_path` or :func:`lars_path_gram`.
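
A brief sketch of both interfaces on toy data (shapes and values are illustrative)::

    import numpy as np
    from sklearn.linear_model import Lars, lars_path

    rng = np.random.RandomState(0)
    X = rng.randn(50, 5)
    y = X[:, 0] - 2 * X[:, 2] + 0.1 * rng.randn(50)

    reg = Lars().fit(X, y)                   # estimator interface
    alphas, active, coefs = lars_path(X, y)  # low-level path interface
    print(reg.coef_)
    print(coefs.shape)                       # (n_features, n_path_steps)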


@@ -546,7 +546,7 @@ the residual.
Instead of giving a vector result, the LARS solution consists of a
curve denoting the solution for each value of the :math:`\ell_1` norm of the
parameter vector. The full coefficients path is stored in the array
-``coef_path_``, which has size (n_features, max_features+1). The first
+``coef_path_`` of shape `(n_features, max_features + 1)`. The first
column is always zero.
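
A short sketch inspecting the stored path, assuming a :class:`LassoLars` fit on toy data::

    import numpy as np
    from sklearn.linear_model import LassoLars

    rng = np.random.RandomState(0)
    X = rng.randn(30, 4)
    y = X[:, 1] + 0.1 * rng.randn(30)

    reg = LassoLars(alpha=0.01).fit(X, y)
    print(reg.coef_path_.shape)  # (n_features, number of steps on the path)
    print(reg.coef_path_[:, 0])  # the first column is all zeros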

.. topic:: References: