Descriptive Multivariate Statistics
By
Dr. Richard Tuyiragize
School of Statistics and Planning
Makerere University
February 22, 2022
1 Introduction
Before embarking on any statistical analysis of an $n \times p$ multivariate data set, a preliminary
data analysis should be done. This includes computing multivariate descriptive statistics
such as measures of location, measures of spread, sample covariances, and sample correlation
coefficients. Consider the following data set:
\[
\begin{array}{c|cccccc}
 & \text{Variable } 1 & \text{Variable } 2 & \cdots & \text{Variable } j & \cdots & \text{Variable } p \\
\hline
\text{Item } 1 & x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1p} \\
\text{Item } 2 & x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2p} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
\text{Item } n & x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{np}
\end{array}
\]
2 Measures of central tendency
Measures of central tendency help you find the middle, or the average, of a data set.
For any variable $x_j$, we compute the sample mean as
\[
\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}
\]
The measure of central tendency for the multivariate data set is defined as the vector of
sample means for all the $p$ variables,
\[
\bar{x} = \begin{pmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \vdots \\ \bar{x}_p \end{pmatrix}
\]
called the sample mean vector.
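As a quick numerical illustration (the data matrix below is hypothetical, chosen only for this sketch), the sample mean vector is obtained in Python/numpy by averaging each column:

```python
import numpy as np

# Hypothetical n x p data matrix: n = 4 items, p = 3 variables.
X = np.array([[2.0, 1.0, 5.0],
              [4.0, 3.0, 7.0],
              [6.0, 2.0, 4.0],
              [8.0, 6.0, 8.0]])

# Sample mean vector: the mean of each variable (column) over the n items.
x_bar = X.mean(axis=0)
print(x_bar)  # [5. 3. 6.]
```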
3 Measures of dispersion (variation)
Taking any one of the variables $x_j$, a usual measure of dispersion on this variable is the
sample variance, denoted $S_{jj}$, computed from the squared deviations of the $n$ observations
from their mean:
\[
S_{jj} = \frac{1}{n-1}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2,
\qquad \text{where } \bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}
\]
Equivalently,
\[
S_{jj} = \frac{1}{n-1}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)(x_{ij} - \bar{x}_j) = \frac{SS(x_j)}{n-1},
\]
where $SS$ = sum of squares of deviations.
As an extension, take any two variables in the multivariate data set, say $x_j$ and $x_k$. A
measure of their joint dispersion/variance is the sample covariance, denoted $S_{jk}$, such that
\[
S_{jk} = \frac{1}{n-1}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k) = \frac{SCP(x_j, x_k)}{n-1},
\]
where $SCP$ = sum of cross products of deviations.
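Both formulas can be checked against numpy's built-in estimators, which also use the $n-1$ divisor; a minimal sketch with made-up data:

```python
import numpy as np

xj = np.array([2.0, 4.0, 6.0, 8.0])
xk = np.array([1.0, 3.0, 2.0, 6.0])
n = len(xj)

# Sample variance: SS(xj) / (n - 1).
S_jj = np.sum((xj - xj.mean()) ** 2) / (n - 1)

# Sample covariance: SCP(xj, xk) / (n - 1).
S_jk = np.sum((xj - xj.mean()) * (xk - xk.mean())) / (n - 1)

print(S_jj, np.var(xj, ddof=1))    # 6.666... from both
print(S_jk, np.cov(xj, xk)[0, 1])  # 4.666... from both
```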
The measure of dispersion for a multivariate data set is a square matrix of order $p$:
\[
S = \begin{pmatrix}
\mathrm{Var}(x_1) & \mathrm{Cov}(x_1, x_2) & \cdots & \mathrm{Cov}(x_1, x_p) \\
\mathrm{Cov}(x_2, x_1) & \mathrm{Var}(x_2) & \cdots & \mathrm{Cov}(x_2, x_p) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(x_p, x_1) & \mathrm{Cov}(x_p, x_2) & \cdots & \mathrm{Var}(x_p)
\end{pmatrix}
\]
Generally,
\[
S = \begin{pmatrix}
S_{11} & S_{12} & \cdots & S_{1p} \\
S_{21} & S_{22} & \cdots & S_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
S_{p1} & S_{p2} & \cdots & S_{pp}
\end{pmatrix}
\]
The sample variance-covariance matrix can be expressed in vector terms:
\[
S = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})' = \frac{1}{n-1}A,
\]
where
$x_i$ is the $i$-th observation, i.e. the $i$-th row of the data matrix written as a $p \times 1$ column vector;
$\bar{x}$ is the sample mean vector;
$A$ is the sample sums of squares and cross products matrix (SSCP matrix).
The determinant of the sample variance-covariance matrix summarizes the dispersion and is
called the generalized sample variance of the multivariate data.
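In code, this amounts to forming the SSCP matrix $A$ from the centred data and dividing by $n-1$; a minimal sketch with hypothetical data, with the determinant giving the generalized sample variance:

```python
import numpy as np

X = np.array([[2.0, 1.0],   # hypothetical n x p data matrix
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 6.0]])
n, p = X.shape
x_bar = X.mean(axis=0)

# SSCP matrix: A = sum over i of (x_i - x_bar)(x_i - x_bar)'.
A = (X - x_bar).T @ (X - x_bar)
S = A / (n - 1)

print(np.allclose(S, np.cov(X, rowvar=False)))  # True
print(np.linalg.det(S))                         # generalized sample variance
```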
Properties of matrix S
1. Its diagonal entries are variances and its off-diagonal entries are covariances.
2. If the $p$ variables are all pairwise uncorrelated, the off-diagonal entries will be zero,
i.e. $S = \mathrm{diag}_p(S_{jj})$, for $j = 1, 2, \ldots, p$.
3. For any two variables $x_j$ and $x_k$, $\mathrm{Cov}(x_j, x_k) = \mathrm{Cov}(x_k, x_j)$, so the matrix is symmetric.
4. Given a sample of size $N$ with $N > p$, the matrix $S$ is positive definite.
Properties 5–10 below are stated for random samples from a multivariate normal population $N_p(\mu, \Sigma)$; here $A = (N-1)S$ denotes the SSCP matrix.
5. $\bar{x}$ and $S$ are independently distributed.
6. $\bar{x}$ and $S$ are jointly sufficient statistics for $(\mu, \Sigma)$.
7. $\bar{x}$ is an unbiased estimator of $\mu$, and $S = \frac{A}{N-1}$ is an unbiased estimator of $\Sigma$.
8. $\bar{x}$ and $\frac{A}{N}$ are the maximum likelihood estimates (MLEs) of $\mu$ and $\Sigma$, respectively.
9. $\bar{x}$ is distributed as $N_p\!\left(\mu, \frac{\Sigma}{N}\right)$.
10. The distribution of $A$ is known as the Wishart distribution, denoted $W_p(n, \Sigma)$,
where $n = N - 1$; hence $A$ is called the Wishart matrix. The Wishart distribution
is a generalization of the $\chi^2$ distribution: in the univariate case, $A \sim \sigma^2\chi^2_{N-1}$.
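Properties 3 and 4 can be checked numerically. A small sketch, reusing the $S$ from the previous sketch (the eigenvalues of a positive definite matrix are all strictly positive):

```python
import numpy as np

S = np.array([[20/3, 14/3],   # the S computed in the previous sketch
              [14/3, 14/3]])

print(np.allclose(S, S.T))        # True: S is symmetric
print(np.linalg.eigvalsh(S) > 0)  # [ True  True ]: S is positive definite
```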
4 Measures of correlation
The correlation coefficient, denoted by $r$, is a measure of the strength of the straight-line or
linear relationship between two continuous variables. For any two variables $x_j$ and $x_k$ in
the multivariate data set, the sample Pearson correlation coefficient (PCC) is given by
\[
r = \frac{\text{sample } \mathrm{Cov}(x_j, x_k)}{\text{sample } \mathrm{SD}(x_j)\,\text{sample } \mathrm{SD}(x_k)} = \frac{S_{jk}}{\sqrt{S_{jj}}\sqrt{S_{kk}}}
\]
Note that for $j = k$, $r = 1$.
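A quick numeric check of this formula against numpy's built-in correlation routine (data made up for illustration):

```python
import numpy as np

xj = np.array([2.0, 4.0, 6.0, 8.0])
xk = np.array([1.0, 3.0, 2.0, 6.0])

S = np.cov(xj, xk)  # 2 x 2 sample covariance matrix
r = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])

print(r)                          # 0.8366...
print(np.corrcoef(xj, xk)[0, 1])  # same value
```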
As a measure of correlation for the $n \times p$ multivariate data set, we summarize the sample PCCs
into a square matrix of order $p \times p$, called the sample correlation matrix, denoted $R$:
\[
R = \begin{pmatrix}
1 & r_{12} & \cdots & r_{1p} \\
r_{21} & 1 & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{pmatrix}
\]
Properties of R
1. If the $p$ variables in the data are pairwise uncorrelated, then the off-diagonal
entries will be zero; hence $R$ takes on the form $R = I_p$, the identity matrix.
2. The matrix $R$ is symmetric and always positive semi-definite,
i.e. $x'Rx \geq 0$ for all $x \neq 0$.
3. Given some sample covariance matrix $S$, the corresponding sample correlation
matrix $R$ is computed as
\[
R = D^{-1} S D^{-1}, \qquad \text{where } D = \mathrm{diag}_p\!\left(\sqrt{S_{jj}}\right)
\]
Example For some bivariate data set, the sample covariance matrix has been found to be
\[
S = \begin{pmatrix} 16 & 3 \\ 3 & 25 \end{pmatrix}
\]
Compute the sample correlation matrix, $R$.
Solution
\[
S = \begin{pmatrix} 16 & 3 \\ 3 & 25 \end{pmatrix}; \qquad
R = D^{-1} S D^{-1}; \qquad
D^{-1} = \begin{pmatrix} \frac{1}{4} & 0 \\ 0 & \frac{1}{5} \end{pmatrix}
\]
\[
R = D^{-1} S D^{-1}
= \begin{pmatrix} \frac{1}{4} & 0 \\ 0 & \frac{1}{5} \end{pmatrix}
\begin{pmatrix} 16 & 3 \\ 3 & 25 \end{pmatrix}
\begin{pmatrix} \frac{1}{4} & 0 \\ 0 & \frac{1}{5} \end{pmatrix}
= \begin{pmatrix} 1 & \frac{3}{20} \\ \frac{3}{20} & 1 \end{pmatrix}
\]
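The same computation in numpy, verifying the worked example:

```python
import numpy as np

S = np.array([[16.0, 3.0],
              [3.0, 25.0]])

# D^{-1} has 1/sqrt(S_jj) on its diagonal.
D_inv = np.diag(1.0 / np.sqrt(np.diag(S)))
R = D_inv @ S @ D_inv

print(R)  # [[1.   0.15]
          #  [0.15 1.  ]]   (3/20 = 0.15)
```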
Question Find the sample mean vector, covariance and correlation matrices for the following
data matrix:
\[
X = \begin{pmatrix} 4 & 1 \\ -1 & 3 \\ 3 & 5 \end{pmatrix}
\]
5 Random Vectors and Matrices
A random vector is a vector whose components are random variables, and a random matrix
is a matrix whose elements are random variables.
A linear array of $p \geq 2$ random variables $x_1, x_2, x_3, \ldots, x_p$ in the form of a column or
row, i.e.
\[
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix}
\quad \text{or} \quad
x' = (x_1, x_2, \ldots, x_p),
\]
is called a $p$-dimensional random vector of $p$ components.
If we have a column vector of $p$ random components and a row vector of $q$ random components,
then their product is a $p \times q$ rectangular random matrix $U = (u_{ij})$, for $i = 1, 2, \ldots, p$ and
$j = 1, 2, \ldots, q$, whose $pq$ elements $u_{ij}$ are random variables.
For this course, we shall deal with a column or row vector X with p variables.
We shall take the $p$ variables in the multivariate data set as realizations of some $p$ random
variables $x_1, x_2, x_3, \ldots, x_p$, whose simultaneous probabilistic or stochastic behaviour we
need to investigate; and we take the $p$ random variables as forming the row vector
\[
x' = (x_1, x_2, \ldots, x_p)
\]
5.1 Expectation of a random vector or matrix
Let $x$ be a $p \times 1$ random vector, i.e. $x' = (x_1, x_2, \ldots, x_p)$. Then the mean vector is
\[
E(x) = E\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_p \end{pmatrix}
= \begin{pmatrix} E(x_1) \\ E(x_2) \\ \vdots \\ E(x_p) \end{pmatrix}
= \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_p \end{pmatrix} = \mu
\]
$E(x)$ is the vector of expectations of the random variables.
5.2 Variance-covariance matrix of a random vector
For a univariate random variable $x$, a common measure of dispersion is the population
variance
\[
\sigma_x^2 = E(x - \mu)^2, \qquad \text{for } \mu = E(x)
\]
For two random variables $x$ and $y$, a common measure of joint dispersion is the population
covariance
\[
\sigma_{xy} = E\big[(x - \mu_x)(y - \mu_y)\big]
\]
The measure of dispersion for a $p$-variate random vector $x$ is the population $p$-variate
variance-covariance matrix, denoted $\Sigma$:
\[
\Sigma = E\big[(x - \mu)(x - \mu)'\big], \qquad \text{where } \mu = E(x)
\]
\[
\Sigma = E\left[\begin{pmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \\ \vdots \\ x_p - \mu_p \end{pmatrix}
\begin{pmatrix} x_1 - \mu_1 & x_2 - \mu_2 & \cdots & x_p - \mu_p \end{pmatrix}\right]
\]
\[
= E\begin{pmatrix}
(x_1 - \mu_1)^2 & (x_1 - \mu_1)(x_2 - \mu_2) & \cdots & (x_1 - \mu_1)(x_p - \mu_p) \\
(x_2 - \mu_2)(x_1 - \mu_1) & (x_2 - \mu_2)^2 & \cdots & (x_2 - \mu_2)(x_p - \mu_p) \\
\vdots & \vdots & \ddots & \vdots \\
(x_p - \mu_p)(x_1 - \mu_1) & (x_p - \mu_p)(x_2 - \mu_2) & \cdots & (x_p - \mu_p)^2
\end{pmatrix}
\]
\[
= \begin{pmatrix}
\mathrm{Var}(x_1) & \mathrm{Cov}(x_1, x_2) & \cdots & \mathrm{Cov}(x_1, x_p) \\
\mathrm{Cov}(x_2, x_1) & \mathrm{Var}(x_2) & \cdots & \mathrm{Cov}(x_2, x_p) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(x_p, x_1) & \mathrm{Cov}(x_p, x_2) & \cdots & \mathrm{Var}(x_p)
\end{pmatrix}
\]
\[
\Sigma = \begin{pmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{pmatrix}
\]
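A short simulation sketch ($\mu$ and $\Sigma$ chosen arbitrarily) illustrating these population quantities: with many draws, the sample mean vector and sample covariance matrix settle near $\mu$ and $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])

# 100,000 draws from N_2(mu, Sigma).
X = rng.multivariate_normal(mu, Sigma, size=100_000)

print(X.mean(axis=0))           # close to mu
print(np.cov(X, rowvar=False))  # close to Sigma
```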
6 Linear Combinations of Random Vectors
1. Univariate case:
\[
E(a_1 x_1) = a_1 E(x_1) = a_1 \mu_1
\]
\[
\mathrm{Var}(a_1 x_1) = a_1^2\,\mathrm{Var}(x_1) = a_1^2 \sigma_{11}
\]
2. Bivariate case:
\[
\mathrm{Cov}(a_1 x_1, a_2 x_2) = a_1 a_2\,\mathrm{Cov}(x_1, x_2) = a_1 a_2 \sigma_{12}
\]
Given $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$, we have $a_1 x_1 + a_2 x_2 = (a_1, a_2)\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = a'X$.
\[
E(a'X) = E(a_1 x_1 + a_2 x_2) = a_1 E(x_1) + a_2 E(x_2) = a_1 \mu_1 + a_2 \mu_2 = (a_1, a_2)\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}
\]
\[
\implies E(a'X) = a'\mu
\]
\[
\mathrm{Var}(a'X) = \mathrm{Var}(a_1 x_1 + a_2 x_2) = \mathrm{Var}(a_1 x_1) + \mathrm{Var}(a_2 x_2) + 2\,\mathrm{Cov}(a_1 x_1, a_2 x_2)
\]
\[
= a_1^2 \sigma_{11} + a_2^2 \sigma_{22} + 2 a_1 a_2 \sigma_{12}
= (a_1, a_2)\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
\]
\[
\implies \mathrm{Var}(a'X) = a'\Sigma a
\]
(A numerical check of these two results appears after this list.)
3. Multivariate case:
If $X$ is a $p$-dimensional random vector and $a \in \mathbb{R}^p$, then the linear combination $a'X$ is a
one-dimensional random variable. That is, for
\[
X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix},
\]
the linear combination is
\[
a_1 X_1 + a_2 X_2 + \cdots + a_p X_p = (a_1, a_2, \ldots, a_p)\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix} = a'X,
\]
with
\[
E(a'X) = a'E(X) = a'\mu, \qquad \mathrm{Var}(a'X) = a'\Sigma a
\]
Here
\[
\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{pmatrix}
\]
4. Consider $q$ linear combinations of the $p$ random variables:
\[
Z_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p = \sum_{j=1}^{p} a_{1j}X_j = a_1'X
\]
\[
Z_2 = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p = \sum_{j=1}^{p} a_{2j}X_j = a_2'X
\]
\[
\vdots
\]
\[
Z_q = a_{q1}X_1 + a_{q2}X_2 + \cdots + a_{qp}X_p = \sum_{j=1}^{p} a_{qj}X_j = a_q'X
\]
In matrix form:
\[
Z = \begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_q \end{pmatrix}
= \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1p} \\
a_{21} & a_{22} & \cdots & a_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
a_{q1} & a_{q2} & \cdots & a_{qp}
\end{pmatrix}
\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}
\iff Z = AX
\]
\[
E(Z) = E(AX) = AE(X) = A\mu
\]
\[
\mathrm{Cov}(Z) = \mathrm{Cov}(AX) = A\Sigma A'
\]
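Returning to the bivariate case (item 2 above), here is a minimal simulation check of $E(a'X) = a'\mu$ and $\mathrm{Var}(a'X) = a'\Sigma a$; the vector $a$ and the parameters $\mu$ and $\Sigma$ below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([2.0, -1.0])
mu = np.array([0.0, 3.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 9.0]])

X = rng.multivariate_normal(mu, Sigma, size=100_000)
z = X @ a  # realizations of a'X = a1*x1 + a2*x2

print(z.mean(), a @ mu)              # both near -3.0
print(z.var(ddof=1), a @ Sigma @ a)  # both near 21.0
```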
Example
Find the mean vector and covariance matrix for the linear combinations $Z_1 = X_1 - X_2$ and
$Z_2 = X_1 + X_2$.
\[
Z = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = AX
\]
\[
E(Z) = AE(X) = A\mu = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}
= \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_1 + \mu_2 \end{pmatrix}
\]
\[
\mathrm{Cov}(Z) = A\,\mathrm{Cov}(X)\,A'
= \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} \sigma_{11} - 2\sigma_{12} + \sigma_{22} & \sigma_{11} - \sigma_{22} \\ \sigma_{11} - \sigma_{22} & \sigma_{11} + 2\sigma_{12} + \sigma_{22} \end{pmatrix}
\]
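The example can be confirmed numerically for any particular $\mu$ and $\Sigma$ (the values below are hypothetical):

```python
import numpy as np

A = np.array([[1.0, -1.0],   # Z1 = X1 - X2
              [1.0,  1.0]])  # Z2 = X1 + X2
mu = np.array([2.0, 5.0])        # hypothetical mu1, mu2
Sigma = np.array([[4.0, 1.0],    # hypothetical sigma11, sigma12
                  [1.0, 9.0]])   # sigma21, sigma22

print(A @ mu)           # [-3.  7.] = [mu1 - mu2, mu1 + mu2]
print(A @ Sigma @ A.T)  # [[11. -5.]  = [[s11 - 2*s12 + s22, s11 - s22],
                        #  [-5. 15.]]    [s11 - s22, s11 + 2*s12 + s22]]
```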