 13 |  13 | :ref:`shrunk_covariance` estimators. In particular, it focuses on how to
 14 |  14 | set the amount of regularization, i.e. how to choose the bias-variance
 15 |  15 | trade-off.
 16 |     | -
 17 |     | -Here we compare 3 approaches:
 18 |     | -
 19 |     | -* Setting the parameter by cross-validating the likelihood on three folds
 20 |     | -  according to a grid of potential shrinkage parameters.
 21 |     | -
 22 |     | -* A close formula proposed by Ledoit and Wolf to compute
 23 |     | -  the asymptotically optimal regularization parameter (minimizing a MSE
 24 |     | -  criterion), yielding the :class:`~sklearn.covariance.LedoitWolf`
 25 |     | -  covariance estimate.
 26 |     | -
 27 |     | -* An improvement of the Ledoit-Wolf shrinkage, the
 28 |     | -  :class:`~sklearn.covariance.OAS`, proposed by Chen et al. Its
 29 |     | -  convergence is significantly better under the assumption that the data
 30 |     | -  are Gaussian, in particular for small samples.
 31 |     | -
 32 |     | -To quantify estimation error, we plot the likelihood of unseen data for
 33 |     | -different values of the shrinkage parameter. We also show the choices by
 34 |     | -cross-validation, or with the LedoitWolf and OAS estimates.
 35 |     | -
 36 |     | -Note that the maximum likelihood estimate corresponds to no shrinkage,
 37 |     | -and thus performs poorly. The Ledoit-Wolf estimate performs really well,
 38 |     | -as it is close to the optimal and is computational not costly. In this
 39 |     | -example, the OAS estimate is a bit further away. Interestingly, both
 40 |     | -approaches outperform cross-validation, which is significantly most
 41 |     | -computationally costly.
 42 |     | -
 43 |  16 | """
 44 |  17 |
 45 |     | -import numpy as np
 46 |     | -import matplotlib.pyplot as plt
 47 |     | -from scipy import linalg
 48 |  18 |
 49 |     | -from sklearn.covariance import (
 50 |     | -    LedoitWolf,
 51 |     | -    OAS,
 52 |     | -    ShrunkCovariance,
 53 |     | -    log_likelihood,
 54 |     | -    empirical_covariance,
 55 |     | -)
 56 |     | -from sklearn.model_selection import GridSearchCV
    |  19 | +# %%
    |  20 | +# Generate sample data
    |  21 | +# --------------------
 57 |  22 |
    |  23 | +import numpy as np
 58 |  24 |
 59 |     | -# #############################################################################
 60 |     | -# Generate sample data
 61 |  25 | n_features, n_samples = 40, 20
 62 |  26 | np.random.seed(42)
 63 |  27 | base_X_train = np.random.normal(size=(n_samples, n_features))

 68 |  32 | X_train = np.dot(base_X_train, coloring_matrix)
 69 |  33 | X_test = np.dot(base_X_test, coloring_matrix)
 70 |  34 |
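The rows for old lines 64-67 (new 28-31), which build `base_X_test` and the `coloring_matrix`, are elided above. A minimal sketch of how such correlated ("colored") train and test sets could be produced; the exact elided code may differ:

    # Hypothetical reconstruction of the elided data-coloring step: both sets of
    # i.i.d. Gaussian samples are multiplied by the same random matrix so that
    # their features become correlated, giving a non-trivial covariance to estimate.
    base_X_test = np.random.normal(size=(n_samples, n_features))
    coloring_matrix = np.random.normal(size=(n_features, n_features))
    X_train = np.dot(base_X_train, coloring_matrix)
    X_test = np.dot(base_X_test, coloring_matrix)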
 71 |     | -# #############################################################################
    |  35 | +
    |  36 | +# %%
 72 |  37 | # Compute the likelihood on test data
    |  38 | +# -----------------------------------
    |  39 | +
    |  40 | +from sklearn.covariance import ShrunkCovariance, empirical_covariance, log_likelihood
    |  41 | +from scipy import linalg
 73 |  42 |
 74 |  43 | # spanning a range of possible shrinkage coefficient values
 75 |  44 | shrinkages = np.logspace(-2, 0, 30)

 83 |  52 | emp_cov = empirical_covariance(X_train)
 84 |  53 | loglik_real = -log_likelihood(emp_cov, linalg.inv(real_cov))
 85 |  54 |
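Old lines 76-82 (new 45-51), which score each candidate shrinkage on the held-out data and define `real_cov`, are elided above. A sketch of that step, assuming a `ShrunkCovariance` model is fit on `X_train` and scored on `X_test` for every value in `shrinkages`, and that `real_cov` is recovered from the coloring matrix:

    # Hypothetical sketch of the elided scoring loop: one ShrunkCovariance fit
    # per candidate shrinkage, scored by the log-likelihood of the test set.
    negative_logliks = [
        -ShrunkCovariance(shrinkage=s).fit(X_train).score(X_test) for s in shrinkages
    ]

    # Reference value: the population covariance implied by the coloring matrix.
    real_cov = np.dot(coloring_matrix.T, coloring_matrix)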
 86 |     | -# #############################################################################
 87 |     | -# Compare different approaches to setting the parameter
    |  55 | +
    |  56 | +# %%
    |  57 | +# Compare different approaches to setting the regularization parameter
    |  58 | +# --------------------------------------------------------------------
    |  59 | +#
    |  60 | +# Here we compare 3 approaches:
    |  61 | +#
    |  62 | +# * Setting the parameter by cross-validating the likelihood on three folds
    |  63 | +#   according to a grid of potential shrinkage parameters.
    |  64 | +#
    |  65 | +# * A closed formula proposed by Ledoit and Wolf to compute
    |  66 | +#   the asymptotically optimal regularization parameter (minimizing an MSE
    |  67 | +#   criterion), yielding the :class:`~sklearn.covariance.LedoitWolf`
    |  68 | +#   covariance estimate.
    |  69 | +#
    |  70 | +# * An improvement of the Ledoit-Wolf shrinkage, the
    |  71 | +#   :class:`~sklearn.covariance.OAS`, proposed by Chen et al. Its
    |  72 | +#   convergence is significantly better under the assumption that the data
    |  73 | +#   are Gaussian, in particular for small samples.
    |  74 | +
    |  75 | +
    |  76 | +from sklearn.model_selection import GridSearchCV
    |  77 | +from sklearn.covariance import LedoitWolf, OAS
 88 |  78 |
 89 |  79 | # GridSearch for an optimal shrinkage coefficient
 90 |  80 | tuned_parameters = [{"shrinkage": shrinkages}]

 99 |  89 | oa = OAS()
100 |  90 | loglik_oa = oa.fit(X_train).score(X_test)
101 |  91 |
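The elided rows (old 91-98, new 81-88) hold the cross-validation and Ledoit-Wolf fits that sit between the grid definition and the OAS fit shown above. A minimal sketch of the three competing fits, assuming the grid defined in `tuned_parameters`; the number of folds is an assumption taken from the "three folds" wording:

    # Hypothetical sketch of the three approaches being fit on the same data.
    # 1) Cross-validated shrinkage: grid-search the held-out likelihood.
    #    cv=3 mirrors the "three folds" mentioned above (an assumption here).
    cv = GridSearchCV(ShrunkCovariance(), tuned_parameters, cv=3)
    cv.fit(X_train)

    # 2) Ledoit-Wolf closed-form shrinkage.
    lw = LedoitWolf()
    loglik_lw = lw.fit(X_train).score(X_test)

    # 3) OAS shrinkage (these two lines appear as context in the diff above).
    oa = OAS()
    loglik_oa = oa.fit(X_train).score(X_test)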
102 |     | -# #############################################################################
    |  92 | +# %%
103 |  93 | # Plot results
    |  94 | +# ------------
    |  95 | +#
    |  96 | +#
    |  97 | +# To quantify estimation error, we plot the likelihood of unseen data for
    |  98 | +# different values of the shrinkage parameter. We also show the choices made
    |  99 | +# by cross-validation and by the LedoitWolf and OAS estimates.
    | 100 | +
    | 101 | +import matplotlib.pyplot as plt
    | 102 | +
104 | 103 | fig = plt.figure()
105 | 104 | plt.title("Regularized covariance: likelihood and shrinkage coefficient")
106 | 105 | plt.xlabel("Regularization parameter: shrinkage coefficient")

145 | 144 | plt.legend()
146 | 145 |
147 | 146 | plt.show()
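The plotting body itself (old 107-144, new 106-143) is elided. A condensed, hypothetical sketch of what such a plot could draw, given the quantities computed above; colors and labels are illustrative, not taken from the file:

    # Likelihood of unseen data as a function of the shrinkage coefficient,
    # plus vertical lines at the shrinkage chosen by each method.
    plt.loglog(shrinkages, negative_logliks, label="Negative log-likelihood")
    plt.plot(plt.xlim(), 2 * [loglik_real], "--r", label="Real covariance likelihood")

    ymin, ymax = plt.ylim()
    plt.vlines(lw.shrinkage_, ymin, ymax, color="magenta", label="Ledoit-Wolf estimate")
    plt.vlines(oa.shrinkage_, ymin, ymax, color="purple", label="OAS estimate")
    plt.vlines(
        cv.best_estimator_.shrinkage,
        ymin,
        ymax,
        color="cyan",
        label="Cross-validation best estimate",
    )
    plt.legend()
    plt.show()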
    | 147 | +
    | 148 | +# %%
    | 149 | +# .. note::
    | 150 | +#
    | 151 | +#     The maximum likelihood estimate corresponds to no shrinkage,
    | 152 | +#     and thus performs poorly. The Ledoit-Wolf estimate performs really well,
    | 153 | +#     as it is close to the optimal and is not computationally costly. In this
    | 154 | +#     example, the OAS estimate is a bit further away. Interestingly, both
    | 155 | +#     approaches outperform cross-validation, which is significantly more
    | 156 | +#     computationally costly.