MULTIVARIATE ANALYSIS
Introduction
Most of the observable phenomena in the empirical sciences are of a
multivariate nature.
In financial studies, assets in stock markets are observed simultaneously
and their joint development is analyzed to better understand general
tendencies and to track indices.
In medicine, recorded observations of subjects in different locations are
the basis of reliable diagnoses and medication.
In quantitative marketing, consumer preferences are collected in order to
construct models of consumer behavior.
The underlying theoretical structure of these and many other quantitative
studies in the applied sciences is multivariate.
Objectives of scientific investigations based on
multivariate data may include:
Data reduction. The phenomenon being studied is represented as simply as
possible without sacrificing valuable information, with the intention of making
interpretation easier.
Sorting and grouping. Groups of similar objects or variables are created based
upon measured characteristics.
Investigation of the dependence among variables. Are all the variables mutually
independent, or do one or more variables depend on the others? If so, how?
Prediction. If the variables are related, we may be able to predict the values of
one or more variables on the basis of the observations on the other variables.
Hypothesis testing. Specific statistical hypotheses, formulated in terms of the
parameters of multivariate populations, are tested in order to validate assumptions
or reinforce prior convictions.
Arrays and summary statistics
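Let $x_{ij}$ be the observed value of the $j$th variable on the $i$th item or trial.
For $n$ observations on $p$ variables we can display these data as a
rectangular array $X$ of $n$ rows and $p$ columns:
$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
$$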
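As a minimal sketch of this layout, the array can be held as a two-dimensional NumPy array with one row per item and one column per variable. The values below are hypothetical, chosen only to show the indexing, and the helper `x` is an illustrative name, not part of any library. NumPy indexes from 0, so the entry for item i and variable j sits at `X[i - 1, j - 1]`.

```python
import numpy as np

# Hypothetical data: n = 4 items (rows) measured on p = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
])

n, p = X.shape  # number of items, number of variables


def x(i, j):
    """Return the entry for item i and variable j, using 1-based indices as in the text."""
    return X[i - 1, j - 1]


print(n, p)      # 4 3
print(x(2, 3))   # 6.0 (second item, third variable)
print(X[:, 0])   # all n measurements on the first variable
```

Slicing a column, as in the last line, gives the n measurements on a single variable, which is the starting point for the per-variable summary statistics.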
Example:
Suppose a nutrition survey is conducted among campus students. For a
sample of 30 students, we may have:
Operations carried out on scalars in univariate statistics have
analogous operations on vectors and matrices in
multivariate statistics.
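Let $x_{11}, x_{21}, \ldots, x_{n1}$ be the $n$ measurements on the first variable. Then the arithmetic
average of these measurements is
$$\bar{x}_1 = \frac{1}{n} \sum_{i=1}^{n} x_{i1},$$
and in general
$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}, \qquad j = 1, 2, \ldots, p,$$
is the arithmetic average of the $j$th variable. If the $n$ measurements
represent a subset of the full set of measurements that might have
been observed, then $\bar{x}_j$ is the sample mean of the $j$th variable. The
corresponding sample variance is
$$s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2,$$
where some texts use the divisor $n$ in place of $n - 1$.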
When these descriptive statistics are organized into arrays we get:
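A minimal NumPy sketch of such arrays follows, assuming they are the sample mean vector and the sample covariance (dispersion) matrix, whose diagonal holds the sample variances. The data matrix is hypothetical, and the divisor n - 1 is an assumption; pass `ddof=0` to divide by n instead.

```python
import numpy as np

# Hypothetical data matrix: n = 5 items (rows), p = 2 variables (columns).
X = np.array([
    [2.0, 1.0],
    [4.0, 3.0],
    [6.0, 2.0],
    [8.0, 5.0],
    [10.0, 4.0],
])
n, p = X.shape

# Sample mean vector: one arithmetic average per variable (column).
x_bar = X.mean(axis=0)

# Sample variances, one per variable, with divisor n - 1 (assumed here).
s2 = X.var(axis=0, ddof=1)

# Sample covariance matrix: variances on the diagonal,
# covariances between pairs of variables off the diagonal.
S = np.cov(X, rowvar=False, ddof=1)

# The same matrix from the definition: centre the columns, then form D'D / (n - 1).
D = X - x_bar
S_manual = D.T @ D / (n - 1)

print(x_bar)  # [6. 3.]
print(s2)     # [10.   2.5]
print(S)      # [[10.   4. ] [ 4.   2.5]]
```

The matrix form `D.T @ D / (n - 1)` is the multivariate counterpart of the scalar sum of squared deviations, which is why the diagonal of `S` reproduces `s2`.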
Example: The "classic blue" pullovers data given below is a data set
consisting of 10 measurements of 4 variables. The story: a textile shop
manager is studying the sales of "classic blue" pullovers over 10
periods. He uses three different marketing methods and hopes to
understand his sales as a function of these variables using statistics. The
variables measured are:
X1: Number of pullovers sold,
X2: Price (in EUR),
X3: Advertisement costs in local newspapers (in EUR),
X4: Presence of a sales assistant (in hours per period).
MULTIVARIATE NORMAL DISTRIBUTION
Estimation of mean vector and dispersion matrix
Mean vector
Hotelling's T² statistic
Marginal and conditional distributions
Example
Solution
Conditional normal distribution
Hypothesis Testing
Union-intersection test
Test for Independence
Tests for covariance matrices: Wishart distribution
Multiple Regression and Correlation coefficients
Discriminant Analysis
Canonical Correlations
Cluster analysis