clarification on OAS estimator formula (sklearn.covariance.OAS) (mean instead of trace is used) · Issue #23280 · scikit-learn/scikit-learn · GitHub

clarification on OAS estimator formula (sklearn.covariance.OAS) (mean instead of trace is used) #23280
Closed
@assuntaciarlo

Dear sklearn experts,

I was comparing different shrinkage algorithms, and while looking at sklearn's implementation of the OAS estimator I found something in the definition of the shrinkage factor that looks strange, or at least is not clear to me. The original formula from Chen et al. 2010 (which is also wrong in the paper, anyway) always uses the trace of the covariance matrix. Instead, this is the code from the sklearn.covariance.OAS module:

mu = np.trace(emp_cov) / n_features
# formula from Chen et al.'s **implementation**
alpha = np.mean(emp_cov ** 2)
num = alpha + mu ** 2
den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features)

shrinkage = 1. if den == 0 else min(num / den, 1.)
shrunk_cov = (1. - shrinkage) * emp_cov
shrunk_cov.flat[::n_features + 1] += shrinkage * mu

Here alpha is the mean of the squared covariance matrix rather than its trace, and mu is normalized by the number of features in the numerator as well, unlike the versions of this formula that I found in the literature.
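One way to read the sklearn snippet in trace notation is sketched below. It assumes that emp_cov is a symmetric sample covariance matrix S, that p = n_features and n = n_samples, and that `emp_cov ** 2` is NumPy's elementwise square, so that the sum of squared entries equals tr(S²). The symbols S, p and n are introduced here for illustration only.

```latex
\mu = \frac{\operatorname{tr}(S)}{p},
\qquad
\alpha = \frac{1}{p^{2}} \sum_{i,j} S_{ij}^{2} = \frac{\operatorname{tr}(S^{2})}{p^{2}}

\frac{\alpha + \mu^{2}}{(n+1)\left(\alpha - \mu^{2}/p\right)}
= \frac{\operatorname{tr}(S^{2}) + \operatorname{tr}^{2}(S)}
       {(n+1)\left(\operatorname{tr}(S^{2}) - \operatorname{tr}^{2}(S)/p\right)}
```

Under these assumptions the common factor 1/p² cancels between numerator and denominator, so normalizing by the number of features would not change the ratio.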
A transcription of what I found in the papers would be (dropping the factor 2/p, as sklearn does):

mu = np.trace(emp_cov)
alpha = np.trace(emp_cov ** 2)
num = alpha + mu ** 2
den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features)

shrinkage = 1. if den == 0 else min(num / den, 1.)
shrunk_cov = (1. - shrinkage) * emp_cov + shrinkage * np.diag(np.diag(emp_cov))

Is this right? Are the two forms equivalent in a way that I couldn't understand?
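A quick numerical check may help with this question. The sketch below compares the shrinkage coefficient from the sklearn snippet with a trace-based version on synthetic data. The function names and the test data are hypothetical, and the trace-based version assumes that the papers' tr(S²) means the trace of the matrix product, written `emp_cov @ emp_cov`. Note that `np.trace(emp_cov ** 2)` squares elementwise and therefore sums only the squared diagonal entries, which is a different quantity.

```python
import numpy as np


def oas_shrinkage_mean_form(emp_cov, n_samples):
    """Shrinkage coefficient as written in the quoted sklearn snippet."""
    n_features = emp_cov.shape[0]
    mu = np.trace(emp_cov) / n_features
    alpha = np.mean(emp_cov ** 2)  # elementwise square, averaged over p*p entries
    num = alpha + mu ** 2
    den = (n_samples + 1.0) * (alpha - (mu ** 2) / n_features)
    return 1.0 if den == 0 else min(num / den, 1.0)


def oas_shrinkage_trace_form(emp_cov, n_samples):
    """Trace-based version, assuming tr(S^2) is the trace of the matrix product."""
    n_features = emp_cov.shape[0]
    tr_s = np.trace(emp_cov)
    tr_s2 = np.trace(emp_cov @ emp_cov)  # matrix product, not elementwise square
    num = tr_s2 + tr_s ** 2
    den = (n_samples + 1.0) * (tr_s2 - tr_s ** 2 / n_features)
    return 1.0 if den == 0 else min(num / den, 1.0)


# Synthetic data with unequal variances, so the coefficient is not clipped to 1.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 5
X = rng.normal(size=(n_samples, n_features)) * np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X -= X.mean(axis=0)
emp_cov = X.T @ X / n_samples

rho_mean = oas_shrinkage_mean_form(emp_cov, n_samples)
rho_trace = oas_shrinkage_trace_form(emp_cov, n_samples)
print(rho_mean, rho_trace)
```

This check covers only the shrinkage coefficient. The two snippets above also shrink toward different targets (mu times the identity versus the diagonal of emp_cov), which is a separate point from the mean-versus-trace question.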

Thank you in advance,
Assunta
