DOC Fix typos in math in Target Encoder user guide (#26584) · scikit-learn/scikit-learn@72f4def · GitHub

Commit 72f4def

lucyleeow authored and jeremiedbb committed
DOC Fix typos in math in Target Encoder user guide (#26584)
1 parent f26fbe6 commit 72f4def

File tree

1 file changed, +4 −4 lines changed

doc/modules/preprocessing.rst

Lines changed: 4 additions & 4 deletions
@@ -883,11 +883,11 @@ cardinality categories are location based such as zip code or region. For the
 binary classification target, the target encoding is given by:
 
 .. math::
-    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_y}{n}
+    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_Y}{n}
 
 where :math:`S_i` is the encoding for category :math:`i`, :math:`n_{iY}` is the
 number of observations with :math:`Y=1` with category :math:`i`, :math:`n_i` is
-the number of observations with category :math:`i`, :math:`n_y` is the number of
+the number of observations with category :math:`i`, :math:`n_Y` is the number of
 observations with :math:`Y=1`, :math:`n` is the number of observations, and
 :math:`\lambda_i` is a shrinkage factor. The shrinkage factor is given by:
 
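The corrected binary-target formula is easy to check by hand. The following is a minimal NumPy sketch, not the scikit-learn implementation (which additionally uses cross fitting during `fit_transform`); it assumes a toy categorical column `X_cat`, a binary target `y`, a fixed smoothing factor `m`, and the shrinkage factor :math:`\lambda_i = n_i / (n_i + m)` from the surrounding guide text:

    # Hand computation of the binary-target encoding S_i (illustrative only).
    import numpy as np

    X_cat = np.array(["a", "a", "a", "b", "b", "c"])   # toy categorical feature
    y = np.array([1, 0, 1, 0, 0, 1])                   # binary target

    m = 2.0                   # smoothing factor (the `smooth` parameter)
    global_mean = y.mean()    # n_Y / n

    encodings = {}
    for cat in np.unique(X_cat):
        mask = X_cat == cat
        n_i = mask.sum()               # observations with category i
        cat_mean = y[mask].mean()      # n_iY / n_i
        lam = n_i / (n_i + m)          # shrinkage factor lambda_i
        encodings[cat] = lam * cat_mean + (1 - lam) * global_mean

    print(encodings)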
@@ -897,14 +897,14 @@ observations with :math:`Y=1`, :math:`n` is the number of observations, and
 where :math:`m` is a smoothing factor, which is controlled with the `smooth`
 parameter in :class:`TargetEncoder`. Large smoothing factors will put more
 weight on the global mean. When `smooth="auto"`, the smoothing factor is
-computed as an empirical Bayes estimate: :math:`m=\sigma_c^2/\tau^2`, where
+computed as an empirical Bayes estimate: :math:`m=\sigma_i^2/\tau^2`, where
 :math:`\sigma_i^2` is the variance of `y` with category :math:`i` and
 :math:`\tau^2` is the global variance of `y`.
 
 For continuous targets, the formulation is similar to binary classification:
 
 .. math::
-    S_i = \lambda_i\frac{\sum_{k\in L_i}y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}y_k}{n}
+    S_i = \lambda_i\frac{\sum_{k\in L_i}Y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}Y_k}{n}
 
 where :math:`L_i` is the set of observations for which :math:`X=X_i` and
 :math:`n_i` is the cardinality of :math:`L_i`.
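For reference, a short usage sketch of :class:`TargetEncoder` with a continuous target; the data and category values here are made up, `smooth="auto"` requests the empirical Bayes estimate of :math:`m` described above, and `fit_transform` cross fits, so the returned encodings differ from the single closed-form value computed on the full training set:

    # Usage sketch: encoding a toy high-cardinality feature against a continuous target.
    import numpy as np
    from sklearn.preprocessing import TargetEncoder

    rng = np.random.RandomState(0)
    X = rng.choice(["94110", "10001", "60601"], size=(100, 1))  # toy zip codes
    y = rng.normal(size=100)                                    # continuous target

    enc = TargetEncoder(smooth="auto", target_type="continuous")
    X_trans = enc.fit_transform(X, y)   # cross-fitted encodings for the training data
    print(enc.encodings_)               # per-category encodings learned on the full data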

0 commit comments
