DOC Fix typos in math in Target Encoder user guide (#26584) · scikit-learn/scikit-learn@0800fb3
Commit 0800fb3

DOC Fix typos in math in Target Encoder user guide (#26584)
1 parent 76a927b commit 0800fb3

1 file changed: +4 -4 lines changed


doc/modules/preprocessing.rst

Lines changed: 4 additions & 4 deletions
@@ -883,11 +883,11 @@ cardinality categories are location based such as zip code or region. For the
 binary classification target, the target encoding is given by:

 .. math::
-    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_y}{n}
+    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_Y}{n}

 where :math:`S_i` is the encoding for category :math:`i`, :math:`n_{iY}` is the
 number of observations with :math:`Y=1` with category :math:`i`, :math:`n_i` is
-the number of observations with category :math:`i`, :math:`n_y` is the number of
+the number of observations with category :math:`i`, :math:`n_Y` is the number of
 observations with :math:`Y=1`, :math:`n` is the number of observations, and
 :math:`\lambda_i` is a shrinkage factor. The shrinkage factor is given by:

@@ -897,14 +897,14 @@ observations with :math:`Y=1`, :math:`n` is the number of observations, and
 where :math:`m` is a smoothing factor, which is controlled with the `smooth`
 parameter in :class:`TargetEncoder`. Large smoothing factors will put more
 weight on the global mean. When `smooth="auto"`, the smoothing factor is
-computed as an empirical Bayes estimate: :math:`m=\sigma_c^2/\tau^2`, where
+computed as an empirical Bayes estimate: :math:`m=\sigma_i^2/\tau^2`, where
 :math:`\sigma_i^2` is the variance of `y` with category :math:`i` and
 :math:`\tau^2` is the global variance of `y`.

 For continuous targets, the formulation is similar to binary classification:

 .. math::
-    S_i = \lambda_i\frac{\sum_{k\in L_i}y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}y_k}{n}
+    S_i = \lambda_i\frac{\sum_{k\in L_i}Y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}Y_k}{n}

 where :math:`L_i` is the set of observations for which :math:`X=X_i` and
 :math:`n_i` is the cardinality of :math:`L_i`.
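
For readers who want to check the corrected formulas numerically, here is a minimal Python sketch of the encoding. It is not scikit-learn's implementation: the function name `target_encode` is hypothetical, the smoothing factor `m` is passed in as a fixed number rather than estimated as with `smooth="auto"`, and the shrinkage factor `n_i / (m + n_i)` is an assumption taken from the user guide, since that line falls between the two hunks and is not shown in this diff.

```python
from collections import defaultdict


def target_encode(categories, y, m):
    """Return {category: S_i} for a binary (0/1) or continuous target y.

    Hypothetical helper for illustration only; m is a fixed smoothing factor.
    """
    n = len(y)
    global_mean = sum(y) / n  # n_Y / n for a binary target

    sums = defaultdict(float)  # per-category sum of y (n_iY for a binary target)
    counts = defaultdict(int)  # per-category count n_i
    for category, target in zip(categories, y):
        sums[category] += target
        counts[category] += 1

    encodings = {}
    for category, n_i in counts.items():
        # Assumed shrinkage factor (not shown in the hunks above).
        lam = n_i / (m + n_i)
        category_mean = sums[category] / n_i
        encodings[category] = lam * category_mean + (1 - lam) * global_mean
    return encodings


categories = ["a", "a", "a", "b"]
y = [1, 1, 0, 0]
enc = target_encode(categories, y, m=1.0)
print(enc)  # "a" is about 0.625, "b" is about 0.25
```

With `m=1.0`, category `b` has a single observation, so its encoding is pulled halfway from its own mean (0) toward the global mean (0.5). Setting `m=0` returns the plain per-category means, and a larger `m` moves every encoding closer to the global mean, which matches the statement in the diff that large smoothing factors put more weight on the global mean.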
