Clarification on OAS estimator formula (sklearn.covariance.OAS): mean is used instead of trace

Dear sklearn experts,

I was comparing different shrinkage algorithms, and while looking at the sklearn implementation of the OAS estimator I found something in the definition of the shrinkage factor that looks strange, or at least is not clear to me. The original formula from Chen et al. 2010 (the formula as printed in the paper is also wrong, in any case) always uses the trace of the covariance matrix. Instead, this is the formula from the sklearn.covariance.OAS module:

    mu = np.trace(emp_cov) / n_features
    # formula from Chen et al.'s **implementation**
    alpha = np.mean(emp_cov ** 2)
    num = alpha + mu ** 2
    den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features)

    shrinkage = 1. if den == 0 else min(num / den, 1.)
    shrunk_cov = (1. - shrinkage) * emp_cov
    shrunk_cov.flat[::n_features + 1] += shrinkage * mu

where alpha is the mean of the element-wise squared covariance matrix instead of the trace, and mu is also normalized by the number of features in the numerator, unlike what I found in the literature for the same formula.
A transcription of what I found in the papers would be (discarding the factor 2/p, as sklearn does):

    mu = np.trace(emp_cov)
    alpha = np.trace(emp_cov ** 2)
    num = alpha + mu ** 2
    den = (n_samples + 1.) * (alpha - (mu ** 2) / n_features)

    shrinkage = 1. if den == 0 else min(num / den, 1.)
    shrunk_cov = (1. - shrinkage) * emp_cov + shrinkage * np.diag(np.diag(emp_cov))
    

Is this right? Or are the two forms equivalent in some way that I am missing?
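For reference, the two shrinkage factors can be compared numerically with a sketch like the one below. It assumes that the squared-matrix trace in the papers means the trace of the matrix product, `np.trace(emp_cov @ emp_cov)`, rather than the element-wise `np.trace(emp_cov ** 2)`. The helper names `shrinkage_mean_form` and `shrinkage_trace_form` and the synthetic data are made up for illustration and are not part of sklearn.

```python
import numpy as np


def shrinkage_mean_form(emp_cov, n_samples):
    """Shrinkage factor written with means, as in the sklearn snippet above."""
    n_features = emp_cov.shape[0]
    mu = np.trace(emp_cov) / n_features
    alpha = np.mean(emp_cov ** 2)  # mean of the element-wise squares
    num = alpha + mu ** 2
    den = (n_samples + 1.0) * (alpha - (mu ** 2) / n_features)
    return 1.0 if den == 0 else min(num / den, 1.0)


def shrinkage_trace_form(emp_cov, n_samples):
    """Shrinkage factor written with traces (factor 2/p discarded).

    Assumption: tr(S^2) is the trace of the matrix product S @ S.
    """
    n_features = emp_cov.shape[0]
    tr_s = np.trace(emp_cov)
    tr_s2 = np.trace(emp_cov @ emp_cov)
    num = tr_s2 + tr_s ** 2
    den = (n_samples + 1.0) * (tr_s2 - (tr_s ** 2) / n_features)
    return 1.0 if den == 0 else min(num / den, 1.0)


# Synthetic data with unequal variances, so the shrinkage is not clipped to 1.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 5
X = rng.standard_normal((n_samples, n_features)) * np.arange(1, n_features + 1)
emp_cov = np.cov(X, rowvar=False, bias=True)

s_mean = shrinkage_mean_form(emp_cov, n_samples)
s_trace = shrinkage_trace_form(emp_cov, n_samples)
print("mean form: ", s_mean)
print("trace form:", s_trace)

# The element-wise version only sums the squared diagonal entries.
print("trace(S @ S): ", np.trace(emp_cov @ emp_cov))
print("trace(S ** 2):", np.trace(emp_cov ** 2))
```

Under that assumption, dividing the numerator and denominator of the trace form by `n_features ** 2` gives the mean form, because for a symmetric matrix `np.trace(emp_cov @ emp_cov)` equals `np.sum(emp_cov ** 2)`. The sketch only compares the shrinkage factor; it does not say anything about the shrinkage target.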

Thank you in advance,
Assunta

 

