8000 [MRG+1] Ledoit-Wolf behavior explanation (#9500) · scikit-learn/scikit-learn@dacf9e3 · GitHub

Commit dacf9e3

GKjohns authored and amueller committed
[MRG+1] Ledoit-Wolf behavior explanation (#9500)
* DOC add explanation of unexpected behavior to ledoit-wolf functions and class
* DOC add explanation of unexpected ledoit-wolf behavior to module documentation
* fix line that's longer than 80 chars, pep8 issue
* fix documentation changes to Ledoit-Wolf behavior explanation
* change behavior explanation to a note in documentation
* remove unexpected behavior explanation from docstrings
* fix broken links in docs
1 parent 3c1e23a commit dacf9e3

File tree

2 files changed: +22, -6 lines


doc/modules/covariance.rst

Lines changed: 18 additions & 2 deletions
@@ -38,7 +38,7 @@ The empirical covariance matrix of a sample can be computed using the
 whether the data are centered or not, the result will be different, so
 one may want to use the ``assume_centered`` parameter accurately. More precisely
 if one uses ``assume_centered=False``, then the test set is supposed to have the
-same mean vector as the training set. If not so, both should be centered by the
+same mean vector as the training set. If not so, both should be centered by the
 user, and ``assume_centered=True`` should be used.
 
 .. topic:: Examples:
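
A minimal sketch of the centering advice in this hunk (an editor's illustration, not part of the commit; the shapes, seed, and mean shifts are made up) using scikit-learn's EmpiricalCovariance:

    import numpy as np
    from sklearn.covariance import EmpiricalCovariance

    rng = np.random.RandomState(0)
    X_train = rng.randn(500, 5) + 10.0  # training data with a nonzero mean
    X_test = rng.randn(500, 5) - 10.0   # test data with a *different* mean

    # With assume_centered=False, scoring reuses the training mean, which is
    # wrong here because the test set has a different mean vector.
    naive = EmpiricalCovariance(assume_centered=False).fit(X_train)
    print("naive log-likelihood:", naive.score(X_test))

    # Center each set by its own mean, then declare the data centered.
    X_train_c = X_train - X_train.mean(axis=0)
    X_test_c = X_test - X_test.mean(axis=0)
    centered = EmpiricalCovariance(assume_centered=True).fit(X_train_c)
    print("centered log-likelihood:", centered.score(X_test_c))

The second score should come out markedly higher, since the mismatch in mean vectors no longer inflates the apparent covariance of the test set.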
@@ -105,6 +105,23 @@ a sample with the :meth:`ledoit_wolf` function of the
 `sklearn.covariance` package, or it can be otherwise obtained by
 fitting a :class:`LedoitWolf` object to the same sample.
 
+.. note:: **Case when population covariance matrix is isotropic**
+
+   It is important to note that when the number of samples is much larger than
+   the number of features, one would expect that no shrinkage would be
+   necessary. The intuition behind this is that if the population covariance
+   is full rank, when the number of samples grows, the sample covariance will
+   also become positive definite. As a result, no shrinkage would be necessary
+   and the method should automatically do this.
+
+   This, however, is not the case in the Ledoit-Wolf procedure when the
+   population covariance happens to be a multiple of the identity matrix. In
+   this case, the Ledoit-Wolf shrinkage estimate approaches 1 as the number of
+   samples increases. This indicates that the optimal estimate of the
+   covariance matrix in the Ledoit-Wolf sense is a multiple of the identity.
+   Since the population covariance is already a multiple of the identity
+   matrix, the Ledoit-Wolf solution is indeed a reasonable estimate.
+
 .. topic:: Examples:
 
 * See :ref:`sphx_glr_auto_examples_covariance_plot_covariance_estimation.py` for
@@ -334,4 +351,3 @@ ____
 
 * - |robust_vs_emp|
   - |mahalanobis|
-
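
The isotropic behavior added in the note above is easy to reproduce. A short sketch (an editor's illustration, not part of the commit; sizes and seed are arbitrary) draws samples whose population covariance is a multiple of the identity and inspects the fitted shrinkage_ coefficient:

    import numpy as np
    from sklearn.covariance import LedoitWolf

    rng = np.random.RandomState(42)
    n_features = 20

    # Population covariance is 9 * I (isotropic). Rather than vanishing as
    # n_samples grows, the estimated shrinkage approaches 1: the identity
    # target already matches the true covariance up to scale.
    for n_samples in (100, 1000, 10000, 100000):
        X = 3.0 * rng.randn(n_samples, n_features)
        lw = LedoitWolf().fit(X)
        print(n_samples, lw.shrinkage_)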

sklearn/covariance/shrunk_covariance_.py

Lines changed: 4 additions & 4 deletions
@@ -486,10 +486,10 @@ class OAS(EmpiricalCovariance):
     The formula used here does not correspond to the one given in the
     article. It has been taken from the Matlab program available from the
     authors' webpage (http://tbayes.eecs.umich.edu/yilun/covestimation).
-    In the original article, formula (23) states that 2/p is multiplied by
-    Trace(cov*cov) in both the numerator and denominator, this operation is omitted
-    in the author's MATLAB program because for a large p, the value of 2/p is so
-    small that it doesn't affect the value of the estimator.
+    In the original article, formula (23) states that 2/p is multiplied by
+    Trace(cov*cov) in both the numerator and denominator; this operation is
+    omitted in the author's MATLAB program because for a large p, the value
+    of 2/p is so small that it doesn't affect the value of the estimator.
 
     Parameters
     ----------
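
To make the 2/p remark concrete, here is a hedged sketch (an editor's illustration; oas_shrinkage is a hypothetical helper, not scikit-learn API, and formula (23) is reconstructed from the description above) with a switch that drops the 2/p terms the way the authors' MATLAB program does:

    import numpy as np

    def oas_shrinkage(X, include_2_over_p=True):
        # Hypothetical helper illustrating formula (23) of the OAS article,
        # assuming the rows of X are already centered. With
        # include_2_over_p=False the 2/p terms are dropped, mirroring the
        # authors' MATLAB program.
        n, p = X.shape
        S = X.T @ X / n              # empirical covariance
        tr_S = np.trace(S)
        tr_S2 = np.trace(S @ S)      # Trace(cov * cov)
        c = 2.0 / p if include_2_over_p else 0.0
        num = (1.0 - c) * tr_S2 + tr_S ** 2
        den = (n + 1.0 - c) * (tr_S2 - tr_S ** 2 / p)
        return min(1.0, num / den)

    rng = np.random.RandomState(0)
    X = rng.randn(50, 200)           # p = 200, so 2/p is only 0.01
    print(oas_shrinkage(X, include_2_over_p=True))   # with the 2/p terms
    print(oas_shrinkage(X, include_2_over_p=False))  # MATLAB-style, 2/p omitted

For p this large the two printed shrinkage values are nearly identical, which is why the simplification is harmless.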

0 commit comments