.. _covariance:

===================================================
Covariance estimation
===================================================

Many statistical problems require at some point the estimation of a
population's covariance matrix, which can be seen as an estimation of
the shape of the data set's scatter plot. Most of the time, such an
estimation has to be done on a sample whose properties (size,
structure, homogeneity) have a large influence on the estimation's
quality. The `scikits.learn.covariance` package aims at providing
tools for an accurate estimation of a population's covariance matrix
under various settings.

The package does not include robust tools yet, so we assume that the
data sets do not contain any outlying data. We also assume that the
observations are independent and identically distributed.

Empirical covariance
====================

.. currentmodule:: scikits.learn.covariance

The covariance matrix of a data set is known to be well approximated
by the classical `Maximum Likelihood Estimator` (or `empirical
covariance`), provided the number of observations is large enough
compared to the number of features (the variables describing the
observations). More precisely, the Maximum Likelihood Estimator of a
sample is an asymptotically unbiased estimator of the corresponding
population's covariance matrix.

The empirical covariance matrix of a sample can be computed using the
:meth:`empirical_covariance` function of the package, or by fitting an
:class:`EmpiricalCovariance` object to the data sample with the
:meth:`EmpiricalCovariance.fit` method. Be careful that the result
depends on whether the data are centered or not, so one may want to
use the `assume_centered` parameter accurately.
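
Here is a minimal sketch of both usages. The toy sample `X` is
hypothetical, and details such as whether `assume_centered` is passed
to the constructor or to `fit` may vary between versions of the
package::

    import numpy as np
    from scikits.learn.covariance import (EmpiricalCovariance,
                                          empirical_covariance)

    # hypothetical toy sample: 50 observations of 3 features
    rng = np.random.RandomState(0)
    X = rng.randn(50, 3)

    # function-based usage: returns the empirical covariance matrix
    emp_cov = empirical_covariance(X, assume_centered=False)

    # object-based usage: fit the estimator, then read its attribute
    estimator = EmpiricalCovariance(assume_centered=False).fit(X)
    print(estimator.covariance_)  # should match emp_cov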

.. topic:: Examples:

   * See :ref:`example_covariance_plot_covariance_estimation.py` for
     an example on how to fit an :class:`EmpiricalCovariance` object
     to data.

Shrunk Covariance
=================

.. currentmodule:: scikits.learn.covariance

Basic shrinkage
---------------

Despite being an asymptotically unbiased estimator of the covariance
matrix, the Maximum Likelihood Estimator is not a good estimator of
the eigenvalues of the covariance matrix, so the precision matrix
obtained from its inversion is not accurate. Sometimes, the empirical
covariance matrix cannot even be inverted for numerical reasons. To
avoid such an inversion problem, a transformation of the empirical
covariance matrix has been introduced: the `shrinkage`. It consists in
reducing the ratio between the largest and the smallest eigenvalues of
the empirical covariance matrix. This can be done either by shifting
every eigenvalue by a given offset, which is equivalent to finding the
l2-penalized Maximum Likelihood Estimator of the covariance matrix, or
by reducing the largest eigenvalues while increasing the smallest ones
with the help of a convex transformation:
:math:`\Sigma_{\rm shrunk} = (1-\alpha)\hat{\Sigma} +
\alpha\frac{{\rm Tr}(\hat{\Sigma})}{p}{\rm Id}`. The latter approach
has been implemented in scikit-learn.
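
As a small illustration, this convex transformation is
straightforward to write down with NumPy. The shrinkage coefficient
`alpha` and the toy matrix below are arbitrary values chosen for the
sketch::

    import numpy as np

    # a hypothetical 2x2 empirical covariance matrix
    emp_cov = np.array([[2.0, 0.8],
                        [0.8, 0.5]])

    alpha = 0.1                 # shrinkage coefficient in [0, 1]
    p = emp_cov.shape[0]        # number of features
    mu = np.trace(emp_cov) / p  # mean of the eigenvalues of emp_cov

    # convex combination of the empirical covariance and a scaled
    # identity: every eigenvalue is pulled toward the mean mu
    shrunk_cov = (1 - alpha) * emp_cov + alpha * mu * np.eye(p)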

A convex transformation (with a user-defined shrinkage coefficient)
can be directly applied to a pre-computed covariance with the
:meth:`shrunk_covariance` method. Also, a shrunk estimator of the
covariance can be fitted to data with a :class:`ShrunkCovariance`
object and its :meth:`ShrunkCovariance.fit` method. Again, the result
depends on whether the data are centered or not, so one may want to
use the `assume_centered` parameter accurately.
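
A minimal sketch of both usages, under the same assumptions as above
(the toy sample `X` is hypothetical, and exact signatures may vary
between versions of the package)::

    import numpy as np
    from scikits.learn.covariance import (ShrunkCovariance,
                                          empirical_covariance,
                                          shrunk_covariance)

    rng = np.random.RandomState(0)
    X = rng.randn(50, 3)

    # apply a convex shrinkage to an already-computed covariance
    emp_cov = empirical_covariance(X)
    cov_shrunk = shrunk_covariance(emp_cov, shrinkage=0.1)

    # or fit a ShrunkCovariance estimator directly to the data
    estimator = ShrunkCovariance(shrinkage=0.1).fit(X)
    print(estimator.covariance_)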

.. topic:: Examples:

   * See :ref:`example_covariance_plot_covariance_estimation.py` for
     an example on how to fit a :class:`ShrunkCovariance` object
     to data.


Ledoit-Wolf shrinkage
---------------------

In their 2004 paper [1], O. Ledoit and M. Wolf propose a formula to
compute the optimal shrinkage coefficient :math:`\alpha` that
minimizes the Mean Squared Error, in terms of the Frobenius norm,
between the estimated and the real covariance matrix.

The Ledoit-Wolf estimator of the covariance matrix can be computed on
a sample with the :meth:`ledoit_wolf` function of the
`scikits.learn.covariance` package, or it can be otherwise obtained by
fitting a :class:`LedoitWolf` object to the same sample.
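
A minimal sketch of both usages, with the same hypothetical toy
sample `X` as above; the function is assumed to return the covariance
estimate together with the shrinkage coefficient it selected::

    import numpy as np
    from scikits.learn.covariance import LedoitWolf, ledoit_wolf

    rng = np.random.RandomState(0)
    X = rng.randn(50, 3)

    # function-based usage: covariance estimate and the shrinkage
    # coefficient chosen by the Ledoit-Wolf formula
    lw_cov, shrinkage = ledoit_wolf(X)

    # object-based usage
    estimator = LedoitWolf().fit(X)
    print(estimator.covariance_)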

[1] "A Well-Conditioned Estimator for Large-Dimensional Covariance
    Matrices", Ledoit and Wolf, Journal of Multivariate Analysis,
    Volume 88, Issue 2, February 2004, pages 365-411.

.. topic:: Examples:

   * See :ref:`example_covariance_plot_covariance_estimation.py` for
     an example on how to fit a :class:`LedoitWolf` object to data and
     for visualizing the performance of the Ledoit-Wolf estimator in
     terms of likelihood.

.. figure:: ../auto_examples/covariance/images/plot_covariance_estimation_-1.png
   :target: ../auto_examples/covariance/plot_covariance_estimation.html
   :align: center
   :scale: 75%

Oracle Approximating Shrinkage
------------------------------

Under the assumption that the data are Gaussian distributed, Chen et
al. [2] derived a formula aimed at choosing a shrinkage coefficient that
yields a smaller Mean Squared Error than the one given by Ledoit and
Wolf's formula. The resulting estimator is known as the Oracle
Approximating Shrinkage (OAS) estimator of the covariance.

The OAS estimator of the covariance matrix can be computed on a sample
with the :meth:`oas` function of the `scikits.learn.covariance`
package, or it can be otherwise obtained by fitting an :class:`OAS`
object to the same sample. The formula used to implement the OAS does
not correspond to the one given in the article: it has been taken from
the MATLAB program available on the authors' webpage
(https://tbayes.eecs.umich.edu/yilun/covestimation).
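
A minimal sketch of both usages, with the same hypothetical toy
sample `X` as above; the `oas` function is assumed to mirror
`ledoit_wolf` and return the covariance estimate together with the
shrinkage coefficient::

    import numpy as np
    from scikits.learn.covariance import OAS, oas

    rng = np.random.RandomState(0)
    X = rng.randn(50, 3)

    # function-based usage: covariance estimate and the shrinkage
    # coefficient chosen by the OAS formula
    oas_cov, shrinkage = oas(X)

    # object-based usage
    estimator = OAS().fit(X)
    print(estimator.covariance_)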


[2] "Shrinkage Algorithms for MMSE Covariance Estimation", Chen et al.,
    IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010.

.. topic:: Examples:

   * See :ref:`example_covariance_plot_covariance_estimation.py` for
     an example on how to fit an :class:`OAS` object
     to data.

   * See :ref:`example_covariance_plot_lw_vs_oas.py` to visualize the
     Mean Squared Error difference between a :class:`LedoitWolf` and
     an :class:`OAS` estimator of the covariance.


.. figure:: ../auto_examples/covariance/images/plot_lw_vs_oas_1.png
   :target: ../auto_examples/covariance/plot_lw_vs_oas.html
   :align: center
   :scale: 75%