Covariance
The covariance between two jointly distributed real-valued random variables X and Y with finite second moments is defined as

\operatorname{cov}(X, Y) = \operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big],

where E[X] is the expected value of X, also known as the mean of X. By using the linearity property of expectations, this can be simplified to

\operatorname{cov}(X, Y) = \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y].

However, when \operatorname{E}[XY] \approx \operatorname{E}[X]\operatorname{E}[Y], this last equation is prone to catastrophic cancellation when computed with floating point arithmetic and thus should be avoided in computer programs when the data have not been centered before.[3]
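A small numerical sketch of this pitfall (the data, the offset, and the use of NumPy are illustrative choices, not from the original): both formulas are evaluated in double precision on data far from zero; the centered definition stays accurate while the E[XY] − E[X]E[Y] shortcut loses most of its significant digits.

import numpy as np

rng = np.random.default_rng(0)

n = 100_000
a = rng.normal(size=n)
b = 0.5 * a + rng.normal(size=n)               # cov(a, b) = 0.5 by construction

offset = 1e9                                   # large common offset: data are not centered
x = offset + a
y = offset + b

# Centered definition E[(X - E[X])(Y - E[Y])]: numerically well behaved.
cov_centered = np.mean((x - x.mean()) * (y - y.mean()))

# Shortcut E[XY] - E[X]E[Y]: both terms are about 1e18 while their true
# difference is about 0.5, so double precision (~16 significant digits)
# leaves essentially no accuracy in the result.
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()

print(cov_centered)    # close to 0.5
print(cov_shortcut)    # typically far from 0.5 (catastrophic cancellation)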
For random vectors X and Y (both of dimension m), the m × m cross-covariance matrix (also known as the dispersion matrix or variance–covariance matrix,[4] or simply called the covariance matrix) is equal to

\operatorname{cov}(X, Y) = \operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])^{\mathsf T}\big] = \operatorname{E}[X Y^{\mathsf T}] - \operatorname{E}[X]\operatorname{E}[Y]^{\mathsf T},

where the superscript T denotes the transpose.
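A minimal sketch of estimating this matrix from samples (the simulated data and variable names are assumptions for illustration): the form E[XY^T] − E[X]E[Y]^T is computed directly, with expectations replaced by sample averages.

import numpy as np

rng = np.random.default_rng(4)

# n samples (rows) of two 3-dimensional random vectors X and Y.
n, m = 50_000, 3
X = rng.normal(size=(n, m))
Y = X @ rng.normal(size=(m, m)) + rng.normal(size=(n, m))   # Y correlated with X

# cov(X, Y) = E[X Y^T] - E[X] E[Y]^T, estimated by sample means.
cross_cov = X.T @ Y / n - np.outer(X.mean(axis=0), Y.mean(axis=0))
print(cross_cov.shape)   # (3, 3): the m x m cross-covariance matrix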
Properties
Variance is a special case of the covariance when the two variables are identical:

\operatorname{cov}(X, X) = \operatorname{var}(X) \equiv \sigma^2(X).
For a sequence X1, ..., Xn of random variables, and constants a1, ..., an, we have

\sigma^2\!\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i^2 \sigma^2(X_i) + 2 \sum_{i < j} a_i a_j \operatorname{cov}(X_i, X_j).

Let X be a random vector with covariance matrix \Sigma(X), and let A be a matrix that can act on X. The covariance matrix of the vector AX is:

\Sigma(AX) = A \, \Sigma(X) \, A^{\mathsf T}.
This is a direct result of the linearity of expectation and is useful when applying
a linear transformation, such as a whitening transformation, to a vector.
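A small numerical check of this property (the matrices, the simulated data, and the use of NumPy are illustrative assumptions): the covariance of AX is compared with A Σ(X) A^T on simulated data, and the same identity is then used to build a simple whitening transform.

import numpy as np

rng = np.random.default_rng(0)

# Simulate a 3-dimensional random vector X with a known covariance matrix.
n = 200_000
true_cov = np.array([[2.0, 0.6, 0.2],
                     [0.6, 1.0, 0.3],
                     [0.2, 0.3, 0.5]])
X = rng.multivariate_normal(mean=np.zeros(3), cov=true_cov, size=n)   # rows are samples

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])                     # a 2 x 3 linear map

sigma_X = np.cov(X, rowvar=False)                    # estimate of cov(X)
sigma_AX = np.cov(X @ A.T, rowvar=False)             # estimate of cov(AX)

# cov(AX) agrees with A cov(X) A^T up to sampling noise.
print(np.abs(sigma_AX - A @ sigma_X @ A.T).max())    # small

# Whitening: pick W with W cov(X) W^T = I, here via a Cholesky factor of cov(X).
L = np.linalg.cholesky(sigma_X)                      # sigma_X = L L^T
W = np.linalg.inv(L)
X_white = X @ W.T
print(np.round(np.cov(X_white, rowvar=False), 2))    # approximately the identity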
Uncorrelatedness and independence
If X and Y are independent, then their covariance is zero. This follows because under independence,

\operatorname{E}[XY] = \operatorname{E}[X]\operatorname{E}[Y],

so that \operatorname{cov}(X, Y) = \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y] = 0.
The converse, however, is not generally true. For example, let X be uniformly distributed in [−1, 1] and let Y = X^2. Clearly, X and Y are dependent, but

\operatorname{cov}(X, Y) = \operatorname{cov}(X, X^2) = \operatorname{E}[X \cdot X^2] - \operatorname{E}[X]\operatorname{E}[X^2] = \operatorname{E}[X^3] - \operatorname{E}[X]\operatorname{E}[X^2] = 0 - 0 = 0.
In this case, the relationship between Y and X is non-linear, while correlation and
covariance are measures of linear dependence between two variables. This example
shows that if two variables are uncorrelated, that does not in general imply that
they are independent. However, if two variables are jointly normally distributed (but
not if they are merely individually normally distributed),
uncorrelatedness does imply independence.
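A quick numerical illustration of this example (the simulation itself is an added sketch, not part of the original): the sample covariance of X and Y = X^2 is essentially zero even though Y is a deterministic function of X.

import numpy as np

rng = np.random.default_rng(1)

x = rng.uniform(-1.0, 1.0, size=1_000_000)   # X ~ Uniform[-1, 1]
y = x ** 2                                   # Y = X^2, fully determined by X

# The sample covariance is near zero (the population covariance is exactly 0) ...
print(np.cov(x, y)[0, 1])                    # roughly on the order of 1e-4 or smaller

# ... yet Y is perfectly predictable from X: a non-linear dependence that
# covariance, being a linear measure, does not detect.
print(np.corrcoef(np.abs(x), y)[0, 1])       # close to 1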
Relationship to inner products
Many of the properties of covariance can be extracted elegantly by observing that it
satisfies similar properties to those of an inner product:
1. bilinear: for constants a and b and random variables X, Y, Z, cov(aX + bY, Z) = a cov(X, Z) + b cov(Y, Z);
2. symmetric: cov(X, Y) = cov(Y, X);
3. positive semi-definite: σ^2(X) = cov(X, X) ≥ 0 for all random variables X, and cov(X, X) = 0 implies that X is a constant random variable (K).
In fact these properties imply that the covariance defines an inner product over
the quotient vector space obtained by taking the subspace of random variables with
finite second moment and identifying any two that differ by a constant. (This
identification turns the positive semi-definiteness above into positive definiteness.)
That quotient vector space is isomorphic to the subspace of random variables with
finite second moment and mean zero; on that subspace, the covariance is exactly
the L2 inner product of real-valued functions on the sample space.
As a result, for random variables with finite variance, the inequality

\left|\operatorname{cov}(X, Y)\right| \le \sqrt{\sigma^2(X)\, \sigma^2(Y)}

holds via the Cauchy–Schwarz inequality.
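A short sketch of why this follows from the inner-product view above (the intermediate step is spelled out here as an addition):

\left|\operatorname{cov}(X, Y)\right|
  = \left|\operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big]\right|
  \le \sqrt{\operatorname{E}\big[(X - \operatorname{E}[X])^2\big]\, \operatorname{E}\big[(Y - \operatorname{E}[Y])^2\big]}
  = \sqrt{\sigma^2(X)\, \sigma^2(Y)},

which is the Cauchy–Schwarz inequality applied to the L2 inner product of the centered variables.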
Calculating the sample covariance

Suppose there are N observations of K variables; let x_{ij} denote the ith observation of variable j, and let \bar{x}_j denote the sample mean of variable j. Then we have

q_{jk} = \frac{1}{N-1} \sum_{i=1}^{N} \left(x_{ij} - \bar{x}_j\right)\left(x_{ik} - \bar{x}_k\right),

which is an estimate of the covariance between variable j and variable k.
The sample mean and the sample covariance matrix are unbiased estimates of
the mean and the covariance matrix of the random vector X, a row vector
whose jth element (j = 1, ..., K) is one of the random variables. The reason the
sample covariance matrix has N − 1 in the denominator rather than N is
essentially that the population mean E(X) is not known and is replaced by the
sample mean \bar{x}. If the population mean E(X) is known, the analogous unbiased
estimate is given by

q_{jk} = \frac{1}{N} \sum_{i=1}^{N} \left(x_{ij} - \operatorname{E}(X_j)\right)\left(x_{ik} - \operatorname{E}(X_k)\right).
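A minimal sketch of the N − 1 estimator above (the data and variable names are illustrative): q_{jk} is computed for all pairs (j, k) at once and checked against numpy.cov, which uses the same convention by default.

import numpy as np

rng = np.random.default_rng(2)

# N observations (rows) of K variables (columns); purely illustrative data.
N, K = 500, 3
data = rng.normal(size=(N, K)) @ np.array([[1.0, 0.4, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 1.0]])

col_means = data.mean(axis=0)                       # sample mean of each variable
centered = data - col_means

# q_jk = 1/(N-1) * sum_i (x_ij - xbar_j)(x_ik - xbar_k), for all j, k at once.
Q = centered.T @ centered / (N - 1)

print(np.allclose(Q, np.cov(data, rowvar=False)))   # True: same N - 1 convention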
Comments
The covariance is sometimes called a measure of "linear dependence" between the
two random variables. That does not mean the same thing as in the context
of linear algebra (see linear dependence). When the covariance is normalized by the
product of the two standard deviations, one obtains the Pearson correlation coefficient,
which gives the goodness of the fit for the best possible linear function describing
the relation between the variables. In this sense, covariance is a linear gauge of
dependence.
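A brief illustration of this normalization (the simulated data are an added assumption): dividing the covariance by the product of the two sample standard deviations reproduces the Pearson correlation coefficient.

import numpy as np

rng = np.random.default_rng(3)

x = rng.normal(size=100_000)
y = 0.7 * x + rng.normal(size=100_000)                  # linearly related plus noise

cov_xy = np.cov(x, y)[0, 1]
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))    # normalized covariance

print(round(r, 3), round(np.corrcoef(x, y)[0, 1], 3))   # the two values agree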
Applications