Covariance and Correlation Matrix
Understanding Concepts with Examples
What are Covariance and Correlation?
• Covariance:
• - Measures how two variables vary together.
• - Positive covariance indicates that the two variables tend to
increase or decrease together.
• - Negative covariance indicates that as one variable
increases, the other tends to decrease.
• Correlation:
• - Measures the strength and direction of the relationship
between two variables.
• - Values range from -1 to 1.
• - Correlation is covariance normalized by the product of the
two standard deviations, so it does not depend on units.
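A minimal sketch of why normalization matters, using only the Python standard library. The helper names `covariance` and `correlation`, the height and weight values, and the n - 1 (sample) denominator are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

def covariance(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(a, b) / sqrt(covariance(a, a) * covariance(b, b))

# Hypothetical measurements: the same heights in metres and in centimetres.
height_m = [1.5, 1.6, 1.7, 1.8]
height_cm = [h * 100 for h in height_m]
weight_kg = [52, 58, 69, 77]

cov_m = covariance(height_m, weight_kg)
cov_cm = covariance(height_cm, weight_kg)  # scales with the unit of height
r_m = correlation(height_m, weight_kg)
r_cm = correlation(height_cm, weight_kg)   # unaffected by the unit change

print(cov_m, cov_cm)
print(r_m, r_cm)
```

Changing the unit multiplies the covariance by 100 but leaves the correlation unchanged, which is what makes correlation comparable across variable pairs.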
Covariance Matrix
• - A covariance matrix is a square matrix that
shows the covariance between pairs of
variables.
• - Diagonal elements represent the variance of
individual variables.
• - Example:
• For variables X and Y:
• Covariance matrix = [[Var(X), Cov(X, Y)],
[Cov(Y, X), Var(Y)]]
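As a rough sketch of this layout, the following builds the 2 x 2 matrix for two short lists. The helper names `sample_cov` and `cov_matrix` and the data values are hypothetical, and the n - 1 (sample) denominator is one common convention rather than the only one.

```python
from statistics import mean

def sample_cov(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def cov_matrix(columns):
    """Square matrix whose (i, j) entry is the covariance of columns i and j."""
    return [[sample_cov(ci, cj) for cj in columns] for ci in columns]

# Hypothetical data, chosen only to illustrate the layout.
temperature = [20, 22, 25, 27]
sales = [30, 35, 45, 50]

m = cov_matrix([temperature, sales])
# m[0][0] is Var(temperature), m[1][1] is Var(sales),
# and m[0][1] equals m[1][0] because Cov(X, Y) = Cov(Y, X).
print(m)
```

The same function works for any number of variables; the matrix is always square and symmetric.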
Correlation Matrix
• - A correlation matrix shows the pairwise
correlation coefficients between variables.
• - Diagonal elements are always 1 (correlation
of a variable with itself).
• - Example:
• For variables X and Y:
• Correlation matrix = [[1, Corr(X, Y)],
[Corr(Y, X), 1]]
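One way to see why the diagonal is 1 is to build the correlation matrix from covariances. This is a sketch under assumptions: `sample_cov` and `corr_matrix` are illustrative helper names, and the data values are made up.

```python
from math import sqrt
from statistics import mean

def sample_cov(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def corr_matrix(columns):
    """Pairwise Pearson correlations: each covariance divided by both standard deviations."""
    k = len(columns)
    cov = [[sample_cov(ci, cj) for cj in columns] for ci in columns]
    return [[cov[i][j] / sqrt(cov[i][i] * cov[j][j]) for j in range(k)]
            for i in range(k)]

# Hypothetical data for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]

r = corr_matrix([hours, scores])
# On the diagonal, Cov(X, X) / (sd(X) * sd(X)) = Var(X) / Var(X) = 1.
print(r)
```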
Example Dataset
• Consider a dataset with two variables X and Y:
• X = [1, 2, 3, 4, 5]
• Y = [5, 4, 6, 8, 7]
• We'll calculate:
• - Covariance matrix
• - Correlation matrix
Covariance Matrix Calculation
• Steps:
• 1. Compute the mean of X and Y.
• 2. Calculate the deviations (X - mean(X)) and
(Y - mean(Y)).
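• 3. Multiply the deviations pairwise, sum the products, and
divide by n - 1 (sample covariance).
• Result (means are 3 and 6; squared deviations sum to 10 for both
X and Y; deviation products sum to 8):
• Covariance matrix = [[2.5, 2.0], [2.0, 2.5]]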
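The steps above can be sketched in plain Python for the example dataset. The n - 1 (sample) denominator is an assumption here; dividing by n instead would give the population version.

```python
X = [1, 2, 3, 4, 5]
Y = [5, 4, 6, 8, 7]
n = len(X)

# Step 1: means of X and Y.
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Step 2: deviations from the mean.
dx = [x - mean_x for x in X]
dy = [y - mean_y for y in Y]

# Step 3: sum the products of deviations and divide by n - 1.
var_x = sum(d * d for d in dx) / (n - 1)
var_y = sum(d * d for d in dy) / (n - 1)
cov_xy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)

cov_matrix = [[var_x, cov_xy],
              [cov_xy, var_y]]
print(cov_matrix)
```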
Correlation Matrix Calculation
• Steps:
• 1. Compute the covariance between variables.
• 2. Divide by the product of standard
deviations of the variables.
• Result:
• Correlation matrix = [[1, 0.8], [0.8, 1]]
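The two steps can be checked for the example dataset with a short sketch using the standard library. `statistics.stdev` returns the sample standard deviation (n - 1 denominator), which is assumed to match the covariance convention used here.

```python
from statistics import mean, stdev

X = [1, 2, 3, 4, 5]
Y = [5, 4, 6, 8, 7]

# Step 1: sample covariance between X and Y.
mx, my = mean(X), mean(Y)
cov_xy = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / (len(X) - 1)

# Step 2: divide by the product of the standard deviations.
r = cov_xy / (stdev(X) * stdev(Y))

corr_matrix = [[1.0, r],
               [r, 1.0]]
print(corr_matrix)
```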
Formula for Covariance Matrix
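For reference, the standard sample covariance and the resulting 2 x 2 matrix can be written as below. The symbols are assumptions of this sketch: n is the number of observations, and x-bar and y-bar are the sample means. A population version would divide by n instead of n - 1.

```latex
\operatorname{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{Var}(X) = \operatorname{Cov}(X, X)

\Sigma =
\begin{bmatrix}
\operatorname{Var}(X) & \operatorname{Cov}(X, Y) \\
\operatorname{Cov}(Y, X) & \operatorname{Var}(Y)
\end{bmatrix}
```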
Formula for correlation coefficients
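In the usual notation, and with s_X and s_Y assumed to denote the sample standard deviations, the correlation coefficient can be written as:

```latex
r_{XY} = \frac{\operatorname{Cov}(X, Y)}{s_X \, s_Y}
       = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The n - 1 factors cancel between numerator and denominator, so the value is the same for the sample and population conventions.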
The coefficient in a correlation matrix is usually Pearson's
correlation (named after Karl Pearson). It measures the strength and
direction of the linear relationship between two variables.
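Because the coefficient captures only linear association, a perfect but non-linear dependence can still score zero. A small sketch, with `pearson` as an illustrative helper name and made-up data:

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # exact linear relationship
squared = [x * x for x in xs]      # exact, but non-linear and symmetric

r_linear = pearson(xs, linear)
r_squared = pearson(xs, squared)
print(r_linear, r_squared)
```

Here the linear case gives a coefficient of 1 (up to rounding) and the squared case gives 0, even though `squared` is fully determined by `xs`.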
Rule of thumb for interpreting correlation coefficients
Applications of Covariance and Correlation
• - Finance: Understanding relationships
between stock prices.
• - Data Analysis: Identifying relationships
between variables.
• - Machine Learning: Feature selection and
multicollinearity detection.
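As one possible illustration of multicollinearity detection, the sketch below flags feature pairs whose absolute correlation exceeds a threshold. The feature names, the values, the helper `pearson`, and the 0.9 cutoff are all hypothetical choices, not a recommended standard.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

# Hypothetical features; "rooms" is an exact linear function of "sqft".
features = {
    "sqft": [50, 70, 90, 110, 130],
    "rooms": [2, 3, 4, 5, 6],
    "age": [30, 5, 22, 12, 40],
}

THRESHOLD = 0.9  # assumed cutoff for "highly correlated"

flagged = []
for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
    r = pearson(col_a, col_b)
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b, r))

print(flagged)
```

With this data only the sqft and rooms pair is flagged, which suggests one of the two could be dropped before fitting a linear model.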
Thank You!
• Questions and discussions are welcome.