|
13 | 13 | :ref:`shrunk_covariance` estimators. In particular, it focuses on how to |
14 | 14 | set the amount of regularization, i.e. how to choose the bias-variance |
15 | 15 | trade-off. |
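(Editor's aside, not part of the patch: a minimal sketch of what the shrinkage amount controls, assuming the scaled-identity target described in the :ref:`shrunk_covariance` guide, i.e. a convex combination of the empirical covariance and mu * identity with mu = trace(emp_cov) / n_features.)

import numpy as np
from sklearn.covariance import ShrunkCovariance, empirical_covariance

rng = np.random.RandomState(0)
X = rng.normal(size=(30, 5))

# More shrinkage means more bias towards the scaled identity and less variance.
emp_cov = empirical_covariance(X)
shrinkage = 0.1
mu = np.trace(emp_cov) / emp_cov.shape[0]
manual = (1 - shrinkage) * emp_cov + shrinkage * mu * np.eye(emp_cov.shape[0])
print(np.allclose(manual, ShrunkCovariance(shrinkage=shrinkage).fit(X).covariance_))  # True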
16 | | -
|
17 | | -Here we compare 3 approaches: |
18 | | -
|
19 | | -* Setting the parameter by cross-validating the likelihood on three folds |
20 | | - according to a grid of potential shrinkage parameters. |
21 | | -
|
22 | | -* A close formula proposed by Ledoit and Wolf to compute |
23 | | - the asymptotically optimal regularization parameter (minimizing a MSE |
24 | | - criterion), yielding the :class:`~sklearn.covariance.LedoitWolf` |
25 | | - covariance estimate. |
26 | | -
|
27 | | -* An improvement of the Ledoit-Wolf shrinkage, the |
28 | | - :class:`~sklearn.covariance.OAS`, proposed by Chen et al. Its |
29 | | - convergence is significantly better under the assumption that the data |
30 | | - are Gaussian, in particular for small samples. |
31 | | -
|
32 | | -To quantify estimation error, we plot the likelihood of unseen data for |
33 | | -different values of the shrinkage parameter. We also show the choices by |
34 | | -cross-validation, or with the LedoitWolf and OAS estimates. |
35 | | -
|
36 | | -Note that the maximum likelihood estimate corresponds to no shrinkage, |
37 | | -and thus performs poorly. The Ledoit-Wolf estimate performs really well, |
38 | | -as it is close to the optimal and is computational not costly. In this |
39 | | -example, the OAS estimate is a bit further away. Interestingly, both |
40 | | -approaches outperform cross-validation, which is significantly most |
41 | | -computationally costly. |
42 | | -
|
43 | 16 | """ |
44 | 17 |
|
45 | | -import numpy as np |
46 | | -import matplotlib.pyplot as plt |
47 | | -from scipy import linalg |
48 | 18 |
|
49 | | -from sklearn.covariance import ( |
50 | | - LedoitWolf, |
51 | | - OAS, |
52 | | - ShrunkCovariance, |
53 | | - log_likelihood, |
54 | | - empirical_covariance, |
55 | | -) |
56 | | -from sklearn.model_selection import GridSearchCV |
| 19 | +# %% |
| 20 | +# Generate sample data |
| 21 | +# -------------------- |
57 | 22 |
|
| 23 | +import numpy as np |
58 | 24 |
|
59 | | -# ############################################################################# |
60 | | -# Generate sample data |
61 | 25 | n_features, n_samples = 40, 20 |
62 | 26 | np.random.seed(42) |
63 | 27 | base_X_train = np.random.normal(size=(n_samples, n_features)) |
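The hunk that builds `coloring_matrix` (and the true covariance `real_cov` used further down) is outside the diff context. A plausible reconstruction of that elided step, offered only as a hedged sketch:

# Assumed sketch of the elided lines: "coloring" i.i.d. Gaussian rows with a
# fixed random matrix gives samples whose true covariance is M.T @ M.
base_X_test = np.random.normal(size=(n_samples, n_features))
coloring_matrix = np.random.normal(size=(n_features, n_features))
real_cov = np.dot(coloring_matrix.T, coloring_matrix)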
|
68 | 32 | X_train = np.dot(base_X_train, coloring_matrix) |
69 | 33 | X_test = np.dot(base_X_test, coloring_matrix) |
70 | 34 |
|
71 | | -# ############################################################################# |
| 35 | + |
| 36 | +# %% |
72 | 37 | # Compute the likelihood on test data |
| 38 | +# ----------------------------------- |
| 39 | + |
| 40 | +from sklearn.covariance import ShrunkCovariance, empirical_covariance, log_likelihood |
| 41 | +from scipy import linalg |
73 | 42 |
|
74 | 43 | # spanning a range of possible shrinkage coefficient values |
75 | 44 | shrinkages = np.logspace(-2, 0, 30) |
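The loop that evaluates each candidate value is also outside the shown context; assuming it fits a `ShrunkCovariance` per value on the training set and scores the held-out set, it could look like this sketch:

# Assumed sketch, not the elided hunk: negative test log-likelihood for each
# candidate shrinkage coefficient.
negative_logliks = [
    -ShrunkCovariance(shrinkage=s).fit(X_train).score(X_test) for s in shrinkages
]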
|
83 | 52 | emp_cov = empirical_covariance(X_train) |
84 | 53 | loglik_real = -log_likelihood(emp_cov, linalg.inv(real_cov)) |
85 | 54 |
|
86 | | -# ############################################################################# |
87 | | -# Compare different approaches to setting the parameter |
| 55 | + |
| 56 | +# %% |
| 57 | +# Compare different approaches to setting the regularization parameter |
| 58 | +# -------------------------------------------------------------------- |
| 59 | +# |
| 60 | +# Here we compare 3 approaches: |
| 61 | +# |
| 62 | +# * Setting the parameter by cross-validating the likelihood on three folds |
| 63 | +# according to a grid of potential shrinkage parameters. |
| 64 | +# |
| 65 | +# * A closed formula proposed by Ledoit and Wolf to compute |
| 66 | +#   the asymptotically optimal regularization parameter (minimizing an MSE |
| 67 | +#   criterion), yielding the :class:`~sklearn.covariance.LedoitWolf` |
| 68 | +#   covariance estimate. |
| 69 | +# |
| 70 | +# * An improvement of the Ledoit-Wolf shrinkage, the |
| 71 | +# :class:`~sklearn.covariance.OAS`, proposed by Chen et al. Its |
| 72 | +# convergence is significantly better under the assumption that the data |
| 73 | +# are Gaussian, in particular for small samples. |
| 74 | + |
| 75 | + |
| 76 | +from sklearn.model_selection import GridSearchCV |
| 77 | +from sklearn.covariance import LedoitWolf, OAS |
88 | 78 |
|
89 | 79 | # GridSearch for an optimal shrinkage coefficient |
90 | 80 | tuned_parameters = [{"shrinkage": shrinkages}] |
|
99 | 89 | oa = OAS() |
100 | 90 | loglik_oa = oa.fit(X_train).score(X_test) |
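Once the estimators are fitted, the amount of regularization each approach selects can be read off directly. A short sketch, where `cv` and `lw` are assumed names for the fitted GridSearchCV and LedoitWolf objects from the elided lines above:

# cv and lw are assumed names for the fitted GridSearchCV / LedoitWolf objects.
print("cross-validated shrinkage:", cv.best_estimator_.shrinkage)
print("Ledoit-Wolf shrinkage:", lw.shrinkage_)
print("OAS shrinkage:", oa.shrinkage_)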
101 | 91 |
|
102 | | -# ############################################################################# |
| 92 | +# %% |
103 | 93 | # Plot results |
| 94 | +# ------------ |
| 95 | +# |
| 96 | +# |
| 97 | +# To quantify estimation error, we plot the likelihood of unseen data for |
| 98 | +# different values of the shrinkage parameter. We also show the choices made |
| 99 | +# by cross-validation and by the LedoitWolf and OAS estimates. |
| 100 | + |
| 101 | +import matplotlib.pyplot as plt |
| 102 | + |
104 | 103 | fig = plt.figure() |
105 | 104 | plt.title("Regularized covariance: likelihood and shrinkage coefficient") |
106 | 105 | plt.xlabel("Regularization parameter: shrinkage coefficient") |
|
145 | 144 | plt.legend() |
146 | 145 |
|
147 | 146 | plt.show() |
| 147 | + |
| 148 | +# %% |
| 149 | +# .. note:: |
| 150 | +# |
| 151 | +#    The maximum likelihood estimate corresponds to no shrinkage, |
| 152 | +#    and thus performs poorly. The Ledoit-Wolf estimate performs really well, |
| 153 | +#    as it is close to the optimal value and is not computationally costly. In |
| 154 | +#    this example, the OAS estimate is a bit further away. Interestingly, both |
| 155 | +#    approaches outperform cross-validation, which is significantly more |
| 156 | +#    computationally costly. |
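As a quick check of the first sentence in the note, zero shrinkage reproduces the empirical (maximum likelihood) covariance; a minimal sketch reusing the training data from above:

# shrinkage=0.0 leaves the empirical covariance untouched, i.e. the MLE.
no_shrink = ShrunkCovariance(shrinkage=0.0).fit(X_train)
print(np.allclose(no_shrink.covariance_, empirical_covariance(X_train)))  # True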