diff --git a/doc/modules/preprocessing.rst b/doc/modules/preprocessing.rst
index 69045147d8af9..c2b3aa1accf63 100644
--- a/doc/modules/preprocessing.rst
+++ b/doc/modules/preprocessing.rst
@@ -883,11 +883,11 @@ cardinality categories are location based such as zip code or region.
 For the binary classification target, the target encoding is given by:

 .. math::
-    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_y}{n}
+    S_i = \lambda_i\frac{n_{iY}}{n_i} + (1 - \lambda_i)\frac{n_Y}{n}

 where :math:`S_i` is the encoding for category :math:`i`, :math:`n_{iY}` is the
 number of observations with :math:`Y=1` with category :math:`i`, :math:`n_i` is
-the number of observations with category :math:`i`, :math:`n_y` is the number of
+the number of observations with category :math:`i`, :math:`n_Y` is the number of
 observations with :math:`Y=1`, :math:`n` is the number of observations, and
 :math:`\lambda_i` is a shrinkage factor. The shrinkage factor is given by:

@@ -897,14 +897,14 @@ observations with :math:`Y=1`, :math:`n` is the number of observations, and
 where :math:`m` is a smoothing factor, which is controlled with the `smooth`
 parameter in :class:`TargetEncoder`. Large smoothing factors will put more
 weight on the global mean. When `smooth="auto"`, the smoothing factor is
-computed as an empirical Bayes estimate: :math:`m=\sigma_c^2/\tau^2`, where
+computed as an empirical Bayes estimate: :math:`m=\sigma_i^2/\tau^2`, where
 :math:`\sigma_i^2` is the variance of `y` with category :math:`i` and
 :math:`\tau^2` is the global variance of `y`.

 For continuous targets, the formulation is similar to binary classification:

 .. math::
-    S_i = \lambda_i\frac{\sum_{k\in L_i}y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}y_k}{n}
+    S_i = \lambda_i\frac{\sum_{k\in L_i}Y_k}{n_i} + (1 - \lambda_i)\frac{\sum_{k=1}^{n}Y_k}{n}

 where :math:`L_i` is the set of observations for which :math:`X=X_i` and
 :math:`n_i` is the cardinality of :math:`L_i`.
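
Reviewer note: as a sanity check on the corrected notation, the encoding formula can be sketched in a few lines of plain Python. This is only an illustration, not how `TargetEncoder` is implemented. The helper name `target_encode` is hypothetical, the smoothing factor `m` is passed in as a fixed number, and the shrinkage factor is assumed to be `lambda_i = n_i / (m + n_i)`, since that formula sits between the two hunks and is not shown in this diff.

```python
from collections import defaultdict


def target_encode(categories, y, m):
    """Hypothetical helper: smoothed per-category target mean.

    Computes S_i = lambda_i * mean(y in category i) + (1 - lambda_i) * mean(y),
    assuming lambda_i = n_i / (m + n_i) (not shown in the diff).
    """
    n = len(y)
    global_mean = sum(y) / n  # n_Y / n when y is a 0/1 target

    sums = defaultdict(float)  # sum of y per category (n_iY for a 0/1 target)
    counts = defaultdict(int)  # n_i per category
    for cat, target in zip(categories, y):
        sums[cat] += target
        counts[cat] += 1

    encodings = {}
    for cat, n_i in counts.items():
        lam = n_i / (m + n_i)  # assumed shrinkage factor
        encodings[cat] = lam * sums[cat] / n_i + (1 - lam) * global_mean
    return encodings


# Toy data: category "a" has three observations, "b" has one.
categories = ["a", "a", "a", "b"]
y = [1, 1, 0, 0]
enc = target_encode(categories, y, m=2.0)
```

Because the same expression covers both 0/1 and continuous `y`, the sketch also matches the continuous-target formula in the second hunk. With `m=0` each encoding reduces to the plain category mean, and larger `m` pulls every category toward the global mean.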