Solution Question Bank (Unit-3)

The document is a question bank for a Mathematics IV course at Raj Kumar Goel Institute of Technology, covering statistical techniques including measures of central tendency, skewness, kurtosis, and regression analysis. It includes various problems with solutions related to calculating mean, median, mode, and correlation coefficients. The course aims to help students understand fundamental statistical concepts and their applications.

RAJ KUMAR GOEL INSTITUTE OF TECHNOLOGY,

GHAZIABAD Session (2024-25): Odd Sem MATHEMATICS – IV


(BAS-303)
Question bank (Unit-3): Statistical Technique-I

December 26, 2024

Contents: Measures of central tendency, Skewness, Kurtosis, Curve Fitting, Method of least squares,
fitting of straight lines, fitting of second-degree parabola, Exponential curves, Correlation and Rank
correlation, Regression Analysis: Regression lines of y on x and x on y, regression coefficients, properties
of regression coefficients and nonlinear regression.
Course Outcome (CO3): Understand basic statistical concepts such as moments, skewness, kurtosis, curve
fitting, correlation and regression.

Problem 1:
A cooperative bank has two branches employing 50 and 70 workers respectively. The average salaries paid
by two respective branches are Rs. 360 and Rs. 390 per month. Calculate the mean of the salaries of all the
employees.

Solution:
To calculate the mean salary of all employees, we use the weighted average formula:
\[ \text{Weighted Mean} = \frac{\sum (\text{Group Size} \times \text{Group Average})}{\text{Total Group Size}} \]
Given:
• Branch 1: 50 workers, average salary = Rs. 360/month
• Branch 2: 70 workers, average salary = Rs. 390/month
Total Workers:
50 + 70 = 120
Total Salary:
Total Salary = (50 × 360) + (70 × 390)
Calculations:
50 × 360 = 18,000
70 × 390 = 27,300
Total Salary = 18,000 + 27,300 = 45,300

Mean Salary:
\[ \text{Mean Salary} = \frac{\text{Total Salary}}{\text{Total Workers}} = \frac{45{,}300}{120} = 377.5 \text{ Rs/month} \]


Final Answer: The mean salary of all the employees is Rs. 377.5 per month.
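
The weighted-mean calculation above can be sketched in a few lines of Python. This is only an illustration; the function name `weighted_mean` and the variable names are assumptions made here and are not part of the question bank.

```python
def weighted_mean(sizes, averages):
    """Combined mean of several groups, given each group's size and mean."""
    total = sum(n * m for n, m in zip(sizes, averages))
    return total / sum(sizes)


branch_sizes = [50, 70]      # workers in each branch
branch_means = [360, 390]    # average monthly salary (Rs.) in each branch

combined = weighted_mean(branch_sizes, branch_means)
print(combined)  # 377.5
```

The same function works for any number of groups, since it only relies on the sizes and the group means.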

Problem:2
Find the median of the dataset: 6, 8, 9, 10, 11, 12, 13

Solution:
1. Arrange the numbers in ascending order:
6, 8, 9, 10, 11, 12, 13

2. Count the total number of data points: The total number of data points (n) is 7, which is odd.
3. Find the position of the median: For an odd number of data points, the median is the value at the
position:
\[ \text{Median Position} = \frac{n+1}{2} \]
Substituting n = 7:
\[ \text{Median Position} = \frac{7+1}{2} = 4 \]
4. The 4th number in the ordered dataset is 10.
Final Answer: The median of the dataset is: 10

Problem:3
Find the mode of the following marks obtained by 15 students:
4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9

Solution:
1. Arrange the data and count the frequency of each number:
4 : 2 times
5 : 2 times
6 : 2 times
7 : 4 times
8 : 2 times
9 : 2 times
10 : 1 time

2. Identify the mode: The mode is the number that appears the most frequently. Here, 7 appears 4 times,
which is more than any other number.
Final Answer: The mode of the given data is: 7
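
As a quick cross-check of Problems 2 and 3, Python's standard `statistics` module computes the same quantities. This is a minimal sketch; the variable names are chosen here for illustration.

```python
import statistics

data = [6, 8, 9, 10, 11, 12, 13]                          # Problem 2
marks = [4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9]    # Problem 3

median_value = statistics.median(data)   # middle value of the sorted data
mode_value = statistics.mode(marks)      # most frequent value

print(median_value, mode_value)  # 10 7
```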

Problem:4
Find the arithmetic mean of the following distribution.

x 1 2 3 4 5 6 7
f 5 9 12 17 14 10 6

Solution:
The formula for the arithmetic mean is:
\[ \text{Arithmetic Mean} = \frac{\sum f_i x_i}{\sum f_i} \]
where:
• f_i is the frequency of each observation.
• x_i is the value of each observation.
Step 1: Calculate f_i x_i for each x_i

x_i    f_i    f_i x_i
1      5      5
2      9      18
3      12     36
4      17     68
5      14     70
6      10     60
7      6      42

\[ \sum f_i = 5 + 9 + 12 + 17 + 14 + 10 + 6 = 73 \]
\[ \sum f_i x_i = 5 + 18 + 36 + 68 + 70 + 60 + 42 = 299 \]
Step 2: Compute the Arithmetic Mean
\[ \text{Arithmetic Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{299}{73} \approx 4.10 \]
Final Answer: The arithmetic mean is approximately: 4.10

Problem:5
In an asymmetrical distribution, the mean is 16 and the median is 20. Calculate the mode of the distribution.

Solution:
The empirical relationship between the mean, median, and mode is given by:
Mode = 3 × Median − 2 × Mean
Given:
Mean = 16, Median = 20
Substitute the values:
Mode = 3 × 20 − 2 × 16
Mode = 60 − 32 = 28
Final Answer: The mode of the distribution is: 28

Problem:6
The first three central moments of a distribution are given as µ1 = 0, µ2 = 15 and µ3 = −31. Find the moment
coefficient of skewness.

Solution:
The formula for the moment coefficient of skewness is:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]
Given:
µ1 = 0, µ2 = 15, µ3 = −31
Compute γ1:
\[ \gamma_1 = \frac{-31}{\sqrt{15^3}} = \frac{-31}{58.09} \approx -0.53 \]
Final Answer: The moment coefficient of skewness is approximately:
γ1 ≈ −0.53 (the distribution is negatively skewed)

Problem:7
The first two moments of a distribution about the value ‘2’ of the variable are 1, 16. Show that mean is 3,
and variance is 15.

Solution:
We use the relationships between the moments about a point a (µ′1 and µ′2) and the mean and variance:
1. The mean (x̄) is given by:
\[ \bar{x} = \mu_1' + a \]
2. The variance (σ²), which equals the second central moment µ2, is given by:
\[ \sigma^2 = \mu_2 = \mu_2' - (\mu_1')^2 \]
Here a = 2, µ′1 = 1 and µ′2 = 16.
Step 1: Find the Mean (x̄)
\[ \bar{x} = \mu_1' + a = 1 + 2 = 3 \]
Step 2: Find the Variance (σ²)
\[ \sigma^2 = \mu_2' - (\mu_1')^2 = 16 - (1)^2 = 16 - 1 = 15 \]
Final Answer:
• Mean (x̄) = 3
• Variance (σ²) = 15

Problem:8
The fourth central moment is µ4 = 48. What must be its standard deviation (σ) in order for the distribution
to be mesokurtic?

Solution:
The kurtosis (β2) is given as:
\[ \beta_2 = \frac{\mu_4}{\sigma^4} \]
where:
• µ4 is the fourth central moment.
• σ is the standard deviation.
Step 1: Substitute the known values: For a mesokurtic distribution, β2 = 3, and µ4 = 48. Substituting
these into the formula:
\[ 3 = \frac{48}{\sigma^4} \]
Step 2: Solve for σ⁴:
\[ \sigma^4 = \frac{48}{3} = 16 \]
Step 3: Solve for σ: Taking the fourth root (or square root twice) of both sides:
\[ \sigma = \sqrt[4]{16} = \sqrt{4} = 2 \]
Final Answer: The standard deviation (σ) must be: 2

Problem:9
Write the normal equations to fit the curve y = ax² + b by the method of least squares.

Solution:
To fit the curve y = ax² + b using the least squares method, we minimize the sum of squared residuals.
Step 1: Define the Error The error for each data point (x_i, y_i) is:
\[ e_i = y_i - (a x_i^2 + b) \]
The sum of squared errors S is:
\[ S = \sum_{i=1}^{n} \left[ y_i - (a x_i^2 + b) \right]^2 \]
Step 2: Minimize S To find the values of a and b, we set the partial derivatives of S with respect to a
and b to zero:
\[ \frac{\partial S}{\partial a} = 0 \quad \text{and} \quad \frac{\partial S}{\partial b} = 0 \]
Step 3: Derive the normal equations After differentiation, the normal equations are:
1. \[ \sum y_i = a \sum x_i^2 + b\,n \]
2. \[ \sum x_i^2 y_i = a \sum x_i^4 + b \sum x_i^2 \]

Problem:10
Write the formula for Karl Pearson’s correlation coefficient and state the range of the correlation coefficient.

Solution:
Karl Pearson Correlation Coefficient Formula:
The Karl Pearson correlation coefficient (r) is given by the formula:
\[ r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}} \]
where:
• x_i and y_i are the individual data points of the two variables X and Y,
• x̄ and ȳ are the means of the variables X and Y, respectively.
Range of the Correlation Coefficient:
The value of the correlation coefficient r lies between −1 and +1, inclusive:
\[ -1 \le r \le 1 \]
• r = 1: Perfect positive correlation
• r = −1: Perfect negative correlation
• r = 0: No linear correlation

Problem:11
If the covariance between variables x and y is 10, and the variances of x and y are 16 and 9 respectively,
find the coefficient of correlation.

Solution:
The formula for the coefficient of correlation (r) is:
\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
where:
• cov(x, y) is the covariance between x and y,
• σx and σy are the standard deviations of x and y, respectively.
Given:
• cov(x, y) = 10,
• σx² = 16, so σx = √16 = 4,
• σy² = 9, so σy = √9 = 3.
Step 1: Apply the formula for correlation:
\[ r = \frac{10}{4 \times 3} = \frac{10}{12} = \frac{5}{6} \approx 0.83 \]
Final Answer: The coefficient of correlation r is 5/6.

Problem 12:
The lines of regression of y on x and x on y are respectively:

y = 5 + x and 16x − 9y = 94,

find the correlation coefficient.

Solution:
To calculate the correlation coefficient (r) between x and y, we use the slopes of the two lines of
regression, that is, the regression coefficients.

Given:
• Line of regression of y on x: y = x + 5
Slope of this line (byx) = 1.
• Line of regression of x on y: 16x − 9y = 94
Rewrite it as x = (9/16)y + 94/16, so the slope (bxy) = 9/16.

Formula for Correlation Coefficient:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
Substitute the values:
\[ r = \pm\sqrt{1 \cdot \frac{9}{16}} = \pm\sqrt{\frac{9}{16}} = \pm\frac{3}{4} \]

Determining the Sign of r:
r always has the same sign as the regression coefficients. Since both byx and bxy are positive,
r is positive.

Final Answer:
\[ r = \frac{3}{4} \]

Problem:13
If the regression coefficients are byx = 0.8 and bxy = 0.2, find the value of the coefficient of correlation (r).

Solution:
The formula for the coefficient of correlation is:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
Given:
byx = 0.8, bxy = 0.2
Substitute these values into the formula:
\[ r = \pm\sqrt{0.8 \cdot 0.2} = \pm\sqrt{0.16} = \pm 0.4 \]

Determining the Sign of r:
r has the same sign as the regression coefficients. Since both byx > 0 and bxy > 0, the correlation
coefficient is positive.

Final Answer:
r = 0.4

Problem:14
If the regression coefficients are byx = 0.8 and bxy = 0.8, find the value of the coefficient of correlation (r).

Solution:
The formula for the coefficient of correlation is:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
Given:
byx = 0.8, bxy = 0.8
Substitute these values into the formula:
\[ r = \pm\sqrt{0.8 \cdot 0.8} = \pm\sqrt{0.64} = \pm 0.8 \]

Determining the Sign of r:
r has the same sign as the regression coefficients. Since both byx > 0 and bxy > 0, the correlation
coefficient is positive.

Final Answer:
r = 0.8

Problem:15
What is the relation between the regression coefficients and the coefficient of Correlation?

Solution:
The relationship between the regression coefficients (byx and bxy) and the coefficient of correlation (r) is:
\[ r = \pm\sqrt{b_{yx} \cdot b_{xy}} \]
That is, r is the geometric mean of the two regression coefficients. The sign of r is the common sign of
byx and bxy (both coefficients always have the same sign).

Problem:16
Write the normal equations to fit a curve y = c0/x + c1√x.

Solution:
We aim to fit a curve of the form:
\[ y = \frac{c_0}{x} + c_1\sqrt{x} \]
Using the method of least squares, we minimize the sum of squared residuals:
\[ S = \sum \left( y_i - \frac{c_0}{x_i} - c_1\sqrt{x_i} \right)^2 \]
By differentiating this expression with respect to c0 and c1, and setting the derivatives to zero, we obtain
the following normal equations.
First Normal Equation (with respect to c0):
\[ \sum \frac{1}{x_i}\left( y_i - \frac{c_0}{x_i} - c_1\sqrt{x_i} \right) = 0 \]
Simplifying this (using √x_i / x_i = 1/√x_i):
\[ \sum \frac{y_i}{x_i} = c_0 \sum \frac{1}{x_i^2} + c_1 \sum \frac{1}{\sqrt{x_i}} \]
Second Normal Equation (with respect to c1):
\[ \sum \sqrt{x_i}\left( y_i - \frac{c_0}{x_i} - c_1\sqrt{x_i} \right) = 0 \]
Simplifying this:
\[ \sum y_i\sqrt{x_i} = c_0 \sum \frac{1}{\sqrt{x_i}} + c_1 \sum x_i \]
Final System of Equations:
\[ \sum \frac{y_i}{x_i} = c_0 \sum \frac{1}{x_i^2} + c_1 \sum \frac{1}{\sqrt{x_i}} \]
\[ \sum y_i\sqrt{x_i} = c_0 \sum \frac{1}{\sqrt{x_i}} + c_1 \sum x_i \]
These two equations can be solved simultaneously to determine c0 and c1.

Problem:17
Write the formula for rank correlation in the case of tied rank

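Solution:
In the case of tied ranks, the formula for Spearman's rank correlation coefficient (r_s) is:
\[ r_s = 1 - \frac{6\left\{ \sum d_i^2 + \sum \frac{m_i (m_i^2 - 1)}{12} \right\}}{n(n^2 - 1)} \]
where:
• d_i is the difference between the two ranks of the i-th item,
• m_i is the number of times a tied value is repeated (one correction term for each group of ties, in either series),
• n is the number of pairs of observations.
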
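The tied-rank formula above can be sketched in Python as follows. This is a hedged illustration, not part of the question bank: the function names and the small data set are assumptions, ranks are assigned in descending order (rank 1 to the largest value), and tied values share the average of the positions they occupy.

```python
from collections import Counter


def average_ranks(values):
    """Descending ranks; tied values share the mean of the positions they occupy."""
    order = sorted(values, reverse=True)
    ranks = {}
    for v in set(values):
        first = order.index(v) + 1     # first position occupied by v
        m = order.count(v)             # number of repetitions of v
        ranks[v] = first + (m - 1) / 2
    return [ranks[v] for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of tied values."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_tied(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    cf = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (d2 + cf) / (n * (n ** 2 - 1))


# Hypothetical data: the value 20 is repeated twice in the first series.
r_s = spearman_tied([10, 20, 20, 30], [1, 2, 3, 4])
print(round(r_s, 3))  # 0.9
```

In this example Σd² = 0.5 and the single tie contributes 2(2² − 1)/12 = 0.5, so r_s = 1 − 6(0.5 + 0.5)/(4 × 15) = 0.9.
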
Question Description (7 Marks)

Problem:1
Calculate the first four central moments and also comment upon Skewness and Kurtosis from the following
data:

Class Interval    Frequency
0 − 10            1
10 − 20           4
20 − 30           3
30 − 40           2

Solution:
Calculation of Central Moments, Skewness, and Kurtosis
Step 1: Calculate the Midpoint of Each Class Interval
The midpoint of a class is the average of its limits, for example (0 + 10)/2 = 5 for the class 0 − 10.

Class Interval    Frequency (f)    Midpoint (x)    f · x
0 − 10            1                5               5
10 − 20           4                15              60
20 − 30           3                25              75
30 − 40           2                35              70
Total             10                               210

Step 2: Calculate the Mean (x̄)
\[ \bar{x} = \frac{\sum f x}{\sum f} = \frac{210}{10} = 21 \]

Step 3: Calculate the Central Moments
First Central Moment:
µ1 = 0 (the first moment about the mean is always zero)
Second Central Moment (Variance):
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} \]

x      f     x − x̄    (x − x̄)²    f(x − x̄)²
5      1     −16       256          256
15     4     −6        36           144
25     3     4         16           48
35     2     14        196          392
Total  10                           840

Thus, the second central moment (variance) is:
\[ \mu_2 = \frac{840}{10} = 84 \]
Third Central Moment (Skewness-related Moment):
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} \]

x      f     x − x̄    (x − x̄)³    f(x − x̄)³
5      1     −16       −4096        −4096
15     4     −6        −216         −864
25     3     4         64           192
35     2     14        2744         5488
Total  10                           720

Thus, the third central moment is:
\[ \mu_3 = \frac{720}{10} = 72 \]
Fourth Central Moment (Kurtosis-related Moment):
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} \]

x      f     x − x̄    (x − x̄)⁴    f(x − x̄)⁴
5      1     −16       65536        65536
15     4     −6        1296         5184
25     3     4         256          768
35     2     14        38416        76832
Total  10                           148320

Thus, the fourth central moment is:
\[ \mu_4 = \frac{148320}{10} = 14832 \]
Step 4: Skewness and Kurtosis
- Skewness: β1 = µ3²/µ2³ = 72²/84³ = 5184/592704 ≈ 0.0087, so γ1 = µ3/µ2^(3/2) ≈ 0.094. Since µ3 = 72 is
positive, the distribution is slightly positively skewed (the right tail is a little longer than the left).
- Kurtosis: β2 = µ4/µ2² = 14832/84² = 14832/7056 ≈ 2.10. Since β2 < 3, the distribution is platykurtic
(flatter than the normal distribution).
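
The moment calculations for grouped data can be checked with a short Python sketch. The helper name `central_moments` is an assumption made here for illustration; it works directly from class midpoints and frequencies.

```python
def central_moments(midpoints, freqs, orders=(1, 2, 3, 4)):
    """Mean and central moments of a frequency distribution given by midpoints."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    moments = {
        r: sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n
        for r in orders
    }
    return mean, moments


midpoints = [5, 15, 25, 35]   # midpoints of 0-10, 10-20, 20-30, 30-40
freqs = [1, 4, 3, 2]

mean, mu = central_moments(midpoints, freqs)
beta1 = mu[3] ** 2 / mu[2] ** 3   # skewness measure
beta2 = mu[4] / mu[2] ** 2        # kurtosis measure (3 for a normal curve)

print(mean, mu[2], mu[3], mu[4])
print(round(beta1, 4), round(beta2, 2))
```

Comparing `beta2` with 3 then classifies the distribution as platykurtic (below 3), mesokurtic (equal to 3) or leptokurtic (above 3).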

Problem 2:
Calculate the first four central moments about the mean, Skewness, and Kurtosis for the following data
(2021-22):

x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1

Solution:
Step 1: Calculate the Mean (x̄)
The mean (x̄) is calculated by:
\[ \bar{x} = \frac{\sum f x}{\sum f} \]

x      f      f · x
0      1      0
1      8      8
2      28     56
3      56     168
4      70     280
5      56     280
6      28     168
7      8      56
8      1      8
Total  256    1024

\[ \bar{x} = \frac{1024}{256} = 4 \]
Step 2: Calculate the Central Moments
The central moments are calculated using the formula:
\[ \mu_r = \frac{\sum f (x - \bar{x})^r}{\sum f} \]
Writing d = x − x̄ = x − 4:

x      f      d     d²    f·d²    d³     f·d³    d⁴     f·d⁴
0      1      −4    16    16      −64    −64     256    256
1      8      −3    9     72      −27    −216    81     648
2      28     −2    4     112     −8     −224    16     448
3      56     −1    1     56      −1     −56     1      56
4      70     0     0     0       0      0       0      0
5      56     1     1     56      1      56      1      56
6      28     2     4     112     8      224     16     448
7      8      3     9     72      27     216     81     648
8      1      4     16    16      64     64      256    256
Total  256                512            0              2816

First Central Moment: µ1 = 0.
Second Central Moment (Variance):
\[ \mu_2 = \frac{512}{256} = 2 \]
Third Central Moment (µ3) and Skewness:
\[ \mu_3 = \frac{0}{256} = 0 \]
Since µ3 = 0, the distribution is symmetric, and the skewness is:
\[ \text{Skewness} = \frac{\mu_3}{\mu_2^{3/2}} = 0 \]
Fourth Central Moment (µ4) and Kurtosis:
\[ \mu_4 = \frac{2816}{256} = 11 \]
\[ \text{Kurtosis} = \frac{\mu_4}{\mu_2^2} = \frac{11}{2^2} = 2.75 \]
Since 2.75 < 3, the distribution is slightly platykurtic.

Problem 3:
Compute Skewness and Kurtosis, if the first four moments of a frequency distribution about the value 4 of
the variable are 1, 4, 10, and 45.

Solution:
We are given the first four moments about the value A = 4:
µ′1 = 1, µ′2 = 4, µ′3 = 10, µ′4 = 45

The mean is:
\[ \bar{x} = \mu_1' + A = 1 + 4 = 5 \]
The central moments (µr) can be expressed in terms of the moments about A (µ′r) as follows:
\[ \mu_2 = \mu_2' - (\mu_1')^2 \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 \]
\[ \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 \]

Step 1: Calculate Central Moments
1. Second Central Moment (µ2):
\[ \mu_2 = 4 - (1)^2 = 4 - 1 = 3 \]
2. Third Central Moment (µ3):
\[ \mu_3 = 10 - 3(4)(1) + 2(1)^3 = 10 - 12 + 2 = 0 \]
3. Fourth Central Moment (µ4):
\[ \mu_4 = 45 - 4(10)(1) + 6(4)(1)^2 - 3(1)^4 = 45 - 40 + 24 - 3 = 26 \]

Step 2: Compute Skewness and Kurtosis
Skewness (γ1):
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(3)^{3/2}} = 0 \]
The skewness is γ1 = 0, indicating a symmetric distribution.
Kurtosis (β2):
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{26}{(3)^2} = \frac{26}{9} \approx 2.89 \]
The kurtosis is β2 ≈ 2.89, which is slightly lower than the normal value of 3, so the distribution is
slightly platykurtic and close to normal.


Final Results
• Second Central Moment (µ2): 3
• Third Central Moment (µ3): 0
• Fourth Central Moment (µ4): 26
• Skewness (γ1): 0 (symmetric distribution)
• Kurtosis (µ4/µ2²): 2.89 (slightly platykurtic, since it is below 3)
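
The conversion from moments about an arbitrary value A to central moments, used in this and the following problems, can be sketched as a small Python helper. The function name `central_from_raw` is an assumption made here for illustration.

```python
def central_from_raw(m1, m2, m3, m4):
    """Central moments (mu2, mu3, mu4) from the first four moments about a point A."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Moments about A = 4 from this problem.
mu2, mu3, mu4 = central_from_raw(1, 4, 10, 45)
skewness = mu3 / mu2 ** 1.5    # gamma_1
kurtosis = mu4 / mu2 ** 2      # mu4 / mu2^2, equal to 3 for a normal curve

print(mu2, mu3, mu4)                      # 3 0 26
print(skewness, round(kurtosis, 2))       # 0.0 2.89
```

The point A itself only enters through the mean (x̄ = A + µ′1); the central moments depend on the µ′ values alone.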

Problem 4:
The first four moments of a frequency distribution about the value 4 of the variable are −1.5, 17, −30 and
80. Find µ1, µ2, µ3, µ4 about the mean. Also find β1 and β2.

Solution:
We are given the first four moments about the value A = 4:
µ′1 = −1.5, µ′2 = 17, µ′3 = −30, µ′4 = 80

The formulae for central moments (µr) in terms of moments about A (µ′r) are:
\[ \mu_1 = 0, \quad \mu_2 = \mu_2' - (\mu_1')^2, \quad \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \]
\[ \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 \]

Step 1: Calculate Central Moments
1. Second Central Moment (µ2):
\[ \mu_2 = 17 - (-1.5)^2 = 17 - 2.25 = 14.75 \]
2. Third Central Moment (µ3):
\[ \mu_3 = -30 - 3(17)(-1.5) + 2(-1.5)^3 = -30 + 76.5 + 2(-3.375) = -30 + 76.5 - 6.75 = 39.75 \]
3. Fourth Central Moment (µ4):
\[ \mu_4 = 80 - 4(-30)(-1.5) + 6(17)(-1.5)^2 - 3(-1.5)^4 \]
\[ \mu_4 = 80 - 180 + 6(17)(2.25) - 3(5.0625) = 80 - 180 + 229.5 - 15.1875 = 114.3125 \]

Step 2: Skewness and Kurtosis
Skewness (β1):
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(39.75)^2}{(14.75)^3} = \frac{1580.0625}{3209.0469} \approx 0.4924 \]
Kurtosis (β2):
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{114.3125}{(14.75)^2} = \frac{114.3125}{217.5625} \approx 0.53 \]

Final Results
• First Central Moment (µ1): 0
• Second Central Moment (µ2): 14.75
• Third Central Moment (µ3): 39.75
• Fourth Central Moment (µ4): 114.3125
• Skewness (β1): 0.4924 (positively skewed, since µ3 > 0)
• Kurtosis (β2): 0.53 (platykurtic distribution, since β2 < 3)

Problem 5:
The first four moments of a frequency distribution about the value 2 of the variable are 2, 20, 40 and 50
respectively. Comment upon the skewness and kurtosis of the distribution.

Solution:
Analysis of Skewness and Kurtosis
The first four moments about A = 2 are given as:
µ′1 = 2, µ′2 = 20, µ′3 = 40, µ′4 = 50.

Mean (x̄)
The mean of the distribution is:
\[ \bar{x} = A + \mu_1' = 2 + 2 = 4 \]

Central Moments
The central moments are obtained from the moments about A using:
\[ \mu_2 = \mu_2' - (\mu_1')^2, \quad \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \quad \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 \]
Second Central Moment (µ2):
\[ \mu_2 = 20 - 2^2 = 20 - 4 = 16 \]
Third Central Moment (µ3):
\[ \mu_3 = 40 - 3(20)(2) + 2(2)^3 = 40 - 120 + 16 = -64 \]
Fourth Central Moment (µ4):
\[ \mu_4 = 50 - 4(40)(2) + 6(20)(2)^2 - 3(2)^4 = 50 - 320 + 480 - 48 = 162 \]

Skewness (γ1)
Skewness is calculated as:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-64}{(16)^{3/2}} = \frac{-64}{64} = -1 \]
Equivalently, β1 = µ3²/µ2³ = 4096/4096 = 1.
Interpretation: Since γ1 is negative, the distribution is negatively skewed.

Kurtosis (β2)
Kurtosis is calculated as:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{16^2} = \frac{162}{256} \approx 0.633 \]
Excess kurtosis is:
\[ \beta_2 - 3 = 0.633 - 3 = -2.367 \]
Interpretation: The negative excess kurtosis indicates that the distribution is platykurtic (flatter than a
normal distribution).

Conclusion
• The distribution is negatively skewed (γ1 = −1).
• The distribution is platykurtic (β2 ≈ 0.633 < 3), meaning it has lighter tails and a flatter peak compared
to a normal distribution.

Problem 6:
The first four moments of a frequency distribution about the value 5 of the variable are 1, 2.5, 5.5 and 16
respectively. Find the four central moments, moments about origin and coefficient of skewness.

Solution:
Given:
µ′1 = 1, µ′2 = 2.5, µ′3 = 5.5, µ′4 = 16.

The value of A = 5. The mean is given by:
\[ M_1 = \bar{x} = A + \mu_1' = 5 + 1 = 6 \]

Step 1: Central Moments
The central moments µn are related to the moments about A (µ′n) as follows:
\[ \mu_1 = 0, \]
\[ \mu_2 = \mu_2' - (\mu_1')^2, \]
\[ \mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3, \]
\[ \mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4. \]
Substitute the values:
1. µ2 = 2.5 − 1² = 2.5 − 1 = 1.5
2. µ3 = 5.5 − 3(2.5)(1) + 2(1)³ = 5.5 − 7.5 + 2 = 0
3. µ4 = 16 − 4(5.5)(1) + 6(2.5)(1)² − 3(1)⁴ = 16 − 22 + 15 − 3 = 6
Thus, the central moments are:
µ1 = 0, µ2 = 1.5, µ3 = 0, µ4 = 6.

Step 2: Moments About the Origin
The moments about the origin Mn are related to the central moments µn and the mean x̄ = 6 as follows:
\[ M_1 = \bar{x} = 6, \]
\[ M_2 = \mu_2 + \bar{x}^2 = 1.5 + 36 = 37.5, \]
\[ M_3 = \mu_3 + 3\mu_2\bar{x} + \bar{x}^3 = 0 + 3(1.5)(6) + 216 = 243, \]
\[ M_4 = \mu_4 + 4\mu_3\bar{x} + 6\mu_2\bar{x}^2 + \bar{x}^4 = 6 + 0 + 6(1.5)(36) + 1296 = 1626. \]
Thus, the moments about the origin are:
M1 = 6, M2 = 37.5, M3 = 243, M4 = 1626.

Step 3: Coefficient of Skewness
The coefficient of skewness γ1 is given by:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(1.5)^{3/2}} = 0 \]

Final Results
• Central Moments: µ1 = 0, µ2 = 1.5, µ3 = 0, µ4 = 6
• Moments About the Origin: M1 = 6, M2 = 37.5, M3 = 243, M4 = 1626
• Coefficient of Skewness: γ1 = 0

Thus, the distribution is symmetric.

Problem:7
Determine the Skewness and Kurtosis for the following data:

Marks No. of students


10 − 20 18
20 − 30 20
30 − 40 30
40 − 50 22
50 − 60 10

Solution:
Step 1: Find the Midpoints The midpoint x_i of each class interval is the average of the lower and upper
boundaries of the class. For example, for the first class 10 − 20, the midpoint is (10 + 20)/2 = 15.

Step 2: Calculate the Mean The mean x̄ is calculated as:
\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} \]
We first calculate f_i x_i:

Class Interval    f_i    x_i    f_i x_i
10 − 20           18     15     270
20 − 30           20     25     500
30 − 40           30     35     1050
40 − 50           22     45     990
50 − 60           10     55     550
Total             100           3360

Thus, the mean is:
\[ \bar{x} = \frac{3360}{100} = 33.6 \]
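Step 3: Calculate the Second, Third and Fourth Central Moments
Second Central Moment (Variance) We calculate (x_i − x̄)²:

Class Interval    x_i    f_i    (x_i − x̄)    (x_i − x̄)²    f_i (x_i − x̄)²
10 − 20           15     18     −18.6         345.96         6227.28
20 − 30           25     20     −8.6          73.96          1479.20
30 − 40           35     30     1.4           1.96           58.80
40 − 50           45     22     11.4          129.96         2859.12
50 − 60           55     10     21.4          457.96         4579.60
Total                    100                                 15204.00

The variance is:
\[ \mu_2 = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{15204}{100} = 152.04 \]
Thus, the standard deviation σ is:
\[ \sigma = \sqrt{152.04} \approx 12.33 \]
Third Central Moment We calculate (x_i − x̄)³:

Class Interval    x_i    f_i    (x_i − x̄)    (x_i − x̄)³    f_i (x_i − x̄)³
10 − 20           15     18     −18.6         −6434.856      −115827.408
20 − 30           25     20     −8.6          −636.056       −12721.120
30 − 40           35     30     1.4           2.744          82.320
40 − 50           45     22     11.4          1481.544       32593.968
50 − 60           55     10     21.4          9800.344       98003.440
Total                    100                                 2131.200

\[ \mu_3 = \frac{\sum f_i (x_i - \bar{x})^3}{\sum f_i} = \frac{2131.2}{100} = 21.312 \]
Fourth Central Moment We calculate (x_i − x̄)⁴:

Class Interval    x_i    f_i    (x_i − x̄)    (x_i − x̄)⁴     f_i (x_i − x̄)⁴
10 − 20           15     18     −18.6         119688.3216     2154389.7888
20 − 30           25     20     −8.6          5470.0816       109401.6320
30 − 40           35     30     1.4           3.8416          115.2480
40 − 50           45     22     11.4          16889.6016      371571.2352
50 − 60           55     10     21.4          209727.3616     2097273.6160
Total                    100                                  4732751.5200

\[ \mu_4 = \frac{\sum f_i (x_i - \bar{x})^4}{\sum f_i} = \frac{4732751.52}{100} \approx 47327.52 \]
Step 4: Calculate Skewness and Kurtosis
Skewness is calculated using the formula:
\[ \text{Skewness} = \frac{\mu_3}{\sigma^3} = \frac{21.312}{(152.04)^{3/2}} \approx \frac{21.312}{1874.7} \approx 0.011 \]
Kurtosis is calculated using the formula:
\[ \beta_2 = \frac{\mu_4}{\sigma^4} = \frac{47327.52}{(152.04)^2} = \frac{47327.52}{23116.16} \approx 2.05 \]
so the excess kurtosis is β2 − 3 ≈ −0.95.
Final Results:
- Skewness ≈ 0.011 (the distribution is almost symmetric, with a very slight positive skew)
- Kurtosis β2 ≈ 2.05 (less than 3, so the distribution is platykurtic)
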
Problem 8:
Find the coefficient of correlation from the following points of observation (1,3),(2,2),(3,5),(4,4),(5,6).

Solution:
To find the coefficient of correlation r for the given points of observation, we use the Pearson correlation
coefficient formula:
\[ r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} \]

Given Points of Observation:
(1, 3), (2, 2), (3, 5), (4, 4), (5, 6)
Step 1: Organize the Data

x_i    y_i    x_i y_i    x_i²    y_i²
1      3      3          1       9
2      2      4          4       4
3      5      15         9       25
4      4      16         16      16
5      6      30         25      36

Step 2: Calculate the Required Sums
\[ \sum x_i = 1 + 2 + 3 + 4 + 5 = 15 \]
\[ \sum y_i = 3 + 2 + 5 + 4 + 6 = 20 \]
\[ \sum x_i y_i = 3 + 4 + 15 + 16 + 30 = 68 \]
\[ \sum x_i^2 = 1 + 4 + 9 + 16 + 25 = 55 \]
\[ \sum y_i^2 = 9 + 4 + 25 + 16 + 36 = 90 \]
n = 5 (number of points)
Step 3: Apply the Formula Now, substitute the values into the formula for the Pearson correlation coefficient:
\[ r = \frac{5 \times 68 - (15 \times 20)}{\sqrt{[5 \times 55 - (15)^2][5 \times 90 - (20)^2]}} \]
Simplifying each part:
\[ r = \frac{340 - 300}{\sqrt{[275 - 225][450 - 400]}} = \frac{40}{\sqrt{[50][50]}} = \frac{40}{\sqrt{2500}} = \frac{40}{50} = 0.8 \]
Final Answer: The coefficient of correlation r is 0.8. This indicates a strong positive correlation between X
and Y.
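
The sums-based formula used above translates directly into Python. The following is a minimal sketch under the assumption that both lists have the same length and are not constant; the function name `pearson_r` is chosen here for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient from the raw sums, as in the formula above."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


xs = [1, 2, 3, 4, 5]
ys = [3, 2, 5, 4, 6]
r = pearson_r(xs, ys)
print(round(r, 3))  # 0.8
```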

Problem:9
A random sample of 5 college students is selected and their grades in Mathematics and Statistics are found
to be:

Student    Mathematics    Statistics
1          85             93
2          60             75
3          73             65
4          40             50
5          90             80
Calculate the rank correlation coefficient.

Solution:
Spearman's Rank Correlation Coefficient
The formula for the rank correlation coefficient r_s is:
\[ r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} \]
where:
- n is the number of data points (in this case, n = 5),
- d_i is the difference between the ranks of corresponding values of Mathematics and Statistics for each
student.

Step 1: Rank the Values for Each Subject (rank 1 is given to the highest grade)
Ranks for Mathematics (X): 90 → Rank 1, 85 → Rank 2, 73 → Rank 3, 60 → Rank 4, 40 → Rank 5.
Thus, in student order, the ranks for Mathematics are:
RankX = [2, 4, 3, 5, 1]
Ranks for Statistics (Y): 93 → Rank 1, 80 → Rank 2, 75 → Rank 3, 65 → Rank 4, 50 → Rank 5.
Thus, in student order, the ranks for Statistics are:
RankY = [1, 3, 4, 5, 2]
Step 2: Calculate the Differences in Ranks and Square Them Now, we calculate d_i = RankX − RankY and d_i²:

Student    RankX    RankY    d_i = RankX − RankY    d_i²
1          2        1        1                       1
2          4        3        1                       1
3          3        4        −1                      1
4          5        5        0                       0
5          1        2        −1                      1

Step 3: Apply the Spearman Rank Correlation Formula
\[ \sum d_i^2 = 1 + 1 + 1 + 0 + 1 = 4, \quad n = 5 \]
Now, substitute into the formula:
\[ r_s = 1 - \frac{6 \times 4}{5(5^2 - 1)} = 1 - \frac{24}{5 \times 24} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8 \]

Final Answer: The rank correlation coefficient is r_s = 0.8.
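
The ranking and the formula can be reproduced with a short Python sketch. It assumes there are no tied values (as in this problem) and gives rank 1 to the highest grade; the function names are chosen here for illustration.

```python
def ranks_descending(values):
    """Rank 1 for the largest value; assumes all values are distinct."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    rx, ry = ranks_descending(xs), ranks_descending(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


maths = [85, 60, 73, 40, 90]
stats = [93, 75, 65, 50, 80]

print(ranks_descending(maths))          # [2, 4, 3, 5, 1]
print(ranks_descending(stats))          # [1, 3, 4, 5, 2]
rank_corr = spearman(maths, stats)
print(round(rank_corr, 3))              # 0.8
```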

Problem:10
Calculate the coefficient of correlation for the following heights (in inches) of fathers (X) and their sons (Y ):

Father’s Height (X) Son’s Height (Y)


65 67
66 68
67 65
67 68
68 72
69 72
70 69
72 71

Solution:
The Pearson correlation coefficient r is given by the formula:
\[ r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} \]
where:
- n is the number of data points,
- x_i and y_i are the individual data points for fathers' and sons' heights, respectively.
We are given the data for 8 father-son pairs, so n = 8.
Step 1: Calculate the Required Sums
1. Sum of x_i and sum of y_i:
\[ \sum x_i = 65 + 66 + 67 + 67 + 68 + 69 + 70 + 72 = 544 \]
\[ \sum y_i = 67 + 68 + 65 + 68 + 72 + 72 + 69 + 71 = 552 \]
2. Sum of x_i² and sum of y_i²:
\[ \sum x_i^2 = 65^2 + 66^2 + 67^2 + 67^2 + 68^2 + 69^2 + 70^2 + 72^2 = 37028 \]
\[ \sum y_i^2 = 67^2 + 68^2 + 65^2 + 68^2 + 72^2 + 72^2 + 69^2 + 71^2 = 38132 \]
3. Sum of x_i y_i:
\[ \sum x_i y_i = (65 \times 67) + (66 \times 68) + (67 \times 65) + (67 \times 68) + (68 \times 72) + (69 \times 72) + (70 \times 69) + (72 \times 71) = 37560 \]

Step 2: Apply the Formula for Pearson's Correlation Coefficient
Now, substitute the values into the formula:
\[ r = \frac{8 \times 37560 - 544 \times 552}{\sqrt{[8 \times 37028 - (544)^2][8 \times 38132 - (552)^2]}} \]
Step 3: Simplify the Expression
First, calculate the numerator:
8 × 37560 = 300480, 544 × 552 = 300288
Numerator = 300480 − 300288 = 192
Now, calculate the two factors in the denominator:
8 × 37028 = 296224, (544)² = 295936, 296224 − 295936 = 288
8 × 38132 = 305056, (552)² = 304704, 305056 − 304704 = 352
Now, calculate the entire denominator:
\[ \sqrt{288 \times 352} = \sqrt{101376} \approx 318.40 \]
Step 4: Final Calculation
Now, we can compute r:
\[ r = \frac{192}{318.40} \approx 0.603 \]
Final Answer:
The Pearson correlation coefficient r is approximately 0.603, which indicates a moderate
positive correlation between the heights of fathers and sons.

Question 11:
Fit a parabolic curve of second degree to the following data:

x y
0 1
1 1.8
2 1.3
3 2.5
4 6.3

Solution:
Fitting a Parabolic Curve
Let the parabola be y = a + bx + cx².
Step 1: Calculate the Necessary Summations

x      y       x²     x³      x⁴      x·y     x²·y
0      1       0      0       0       0       0
1      1.8     1      1       1       1.8     1.8
2      1.3     4      8       16      2.6     5.2
3      2.5     9      27      81      7.5     22.5
4      6.3     16     64      256     25.2    100.8
Σ: 10  12.9    30     100     354     37.1    130.3

Step 2: Form the Normal Equations
The normal equations are:
\[ \sum y = na + b\sum x + c\sum x^2, \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3, \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4. \]
Substituting the values (n = 5):
12.9 = 5a + 10b + 30c,
37.1 = 10a + 30b + 100c,
130.3 = 30a + 100b + 354c.
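Step 3: Solve for a, b, and c
Subtracting 2 × (first equation) from the second, and 6 × (first equation) from the third, eliminates a:
11.3 = 10b + 40c,
52.9 = 40b + 174c.
Subtracting 4 × (the first of these) from the second gives 14c = 7.7, so c = 0.55. Back-substituting gives
b = (11.3 − 22)/10 = −1.07 and a = (12.9 + 10.7 − 16.5)/5 = 1.42:
a = 1.42, b = −1.07, c = 0.55.

Step 4: Fitted Parabolic Curve
The fitted curve is:
y = 1.42 − 1.07x + 0.55x²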
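
The normal equations for a second-degree parabola y = a + bx + cx² can also be built and solved numerically. The sketch below uses plain Python with a small Gauss-Jordan solver; the helper names `solve_linear` and `fit_parabola` are assumptions made here for illustration.

```python
def solve_linear(a, b):
    """Solve a small square system a·x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Least-squares coefficients (a, b, c) of y = a + b*x + c*x**2."""
    def sx(p):                      # sum of x^p
        return sum(x ** p for x in xs)

    def sxy(p):                     # sum of x^p * y
        return sum((x ** p) * y for x, y in zip(xs, ys))

    normal = [[len(xs), sx(1), sx(2)],
              [sx(1), sx(2), sx(3)],
              [sx(2), sx(3), sx(4)]]
    return solve_linear(normal, [sxy(0), sxy(1), sxy(2)])


a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 1.8, 1.3, 2.5, 6.3])
print(round(a, 2), round(b, 2), round(c, 2))
```

For larger data sets a library routine for polynomial least squares would normally be preferred; the hand-written solver is used here only to mirror the normal-equation method of the text.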

Problem 12:
Use the method of least squares to find the curve y = ab^x that best fits the following data:

x y
2 8.3
3 15.4
4 33.1
5 65.2
6 127.4

Solution:
We assume the equation is of the form y = ab^x. Taking the natural logarithm of both sides:
\[ \ln y = \ln(ab^x) = \ln a + x \ln b \]
Let Y = ln y, A = ln a, and B = ln b, so the equation becomes:
\[ Y = A + Bx \]
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B\sum x \]
\[ \sum xY = A\sum x + B\sum x^2 \]
We compute the required sums (n = 5):
\[ \sum x = 2 + 3 + 4 + 5 + 6 = 20, \quad \sum x^2 = 4 + 9 + 16 + 25 + 36 = 90 \]
\[ \sum Y = 2.1163 + 2.7344 + 3.4995 + 4.1775 + 4.8473 = 17.3750 \]
\[ \sum xY = 4.2326 + 8.2032 + 13.9980 + 20.8875 + 29.0838 = 76.4051 \]
We solve the system of equations:
1. 17.3750 = 5A + 20B
2. 76.4051 = 20A + 90B
Subtracting 4 × (equation 1) from equation 2 gives 10B = 6.9051, so B ≈ 0.6905 and A ≈ 0.7130.
Thus, a = e^A ≈ 2.040 and b = e^B ≈ 1.995.
Therefore, the best-fitting curve is:

y = 2.040 × (1.995)^x
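
The log-transform procedure for y = ab^x can be sketched in Python as follows. The helper names `fit_line` and `fit_exponential` are assumptions made here; the fit is the ordinary least-squares line through the points (x, ln y), exactly as in the normal equations above.

```python
from math import exp, log


def fit_line(xs, ys):
    """Least-squares intercept and slope of y = A + B*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_exponential(xs, ys):
    """Fit y = a * b**x by fitting a straight line to (x, ln y)."""
    A, B = fit_line(xs, [log(y) for y in ys])
    return exp(A), exp(B)


a, b = fit_exponential([2, 3, 4, 5, 6], [8.3, 15.4, 33.1, 65.2, 127.4])
print(f"y = {a:.3f} * {b:.3f}^x")
```

Note that this minimizes the squared errors in ln y rather than in y itself, which is the usual textbook convention for this type of curve.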

Problem 13:
Use the method of least squares to find the curve y = ab^x that best fits the following data:

x y
2 144
3 172.8
4 207.4
5 248.8
6 298.5

Solution:
We assume the equation is of the form y = ab^x. Taking the natural logarithm of both sides:
\[ \ln y = \ln(ab^x) = \ln a + x \ln b \]
Let Y = ln y, A = ln a, and B = ln b, so the equation becomes:
\[ Y = A + Bx \]
Now, we apply the method of least squares to the transformed equation.
The normal equations are:
\[ \sum Y = nA + B\sum x \]
\[ \sum xY = A\sum x + B\sum x^2 \]
We compute the required sums (n = 5):
\[ \sum x = 2 + 3 + 4 + 5 + 6 = 20, \quad \sum x^2 = 4 + 9 + 16 + 25 + 36 = 90 \]
\[ \sum Y = 4.9698 + 5.1521 + 5.3346 + 5.5166 + 5.6988 = 26.6719 \]
\[ \sum xY = 9.9396 + 15.4563 + 21.3384 + 27.5830 + 34.1928 = 108.5101 \]
We solve the system of equations:
1. 26.6719 = 5A + 20B
2. 108.5101 = 20A + 90B
Subtracting 4 × (equation 1) from equation 2 gives 10B = 1.8225, so B ≈ 0.1822 and A ≈ 4.6054.
Thus, a = e^A ≈ 100.02 and b = e^B ≈ 1.200.
Therefore, the best-fitting curve is:

y = 100.02 × (1.200)^x

Problem 14:
Use the method of least squares to fit the curve $y = ax + bx^2$ to the following data:

x y
1 1
2 1.2
3 1.8
4 2.5
5 3.6
6 4.7
7 6.6
8 9.1

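Solution:
We assume the curve is of the form $y = ax + bx^2$. The model has no constant term, so minimising $\sum (y - ax - bx^2)^2$ with respect to a and b gives two normal equations:
1. $\sum xy = a\sum x^2 + b\sum x^3$ 2. $\sum x^2 y = a\sum x^3 + b\sum x^4$
We compute the required sums:
\[ \sum x^2 = 204, \quad \sum x^3 = 1296, \quad \sum x^4 = 8772, \quad \sum xy = 184, \quad \sum x^2 y = 1227 \]
The system of equations is:
1. 184 = 204a + 1296b 2. 1227 = 1296a + 8772b
We solve this system and find a ≈ 0.2171 and b ≈ 0.1078.
Thus, the best-fitting curve is:

\[ y = 0.2171x + 0.1078x^2 \]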
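
For a model without a constant term, the normal equations come from differentiating $\sum (y - ax - bx^2)^2$ with respect to a and b only. The sketch below builds those two equations from the tabulated data and solves the 2 × 2 system directly; the function name `fit_ax_bx2` is hypothetical and the code is an illustration, not the original author's working.

```python
def fit_ax_bx2(xs, ys):
    """Least-squares fit of y = a*x + b*x**2 (no intercept)."""
    s_x2 = sum(x ** 2 for x in xs)
    s_x3 = sum(x ** 3 for x in xs)
    s_x4 = sum(x ** 4 for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    s_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    # Normal equations:
    #   s_xy  = a*s_x2 + b*s_x3
    #   s_x2y = a*s_x3 + b*s_x4
    det = s_x2 * s_x4 - s_x3 ** 2
    a = (s_xy * s_x4 - s_x3 * s_x2y) / det
    b = (s_x2 * s_x2y - s_x3 * s_xy) / det
    return a, b


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1.2, 1.8, 2.5, 3.6, 4.7, 6.6, 9.1]
a, b = fit_ax_bx2(xs, ys)
print(f"y = {a:.4f}*x + {b:.4f}*x**2")
```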

Problem 15:
Find the exponential curve of the form $P = kV^{\gamma}$ using the method of least squares for the following data:

V P
50 135
100 48
150 26
200 17

Solution:
We assume the curve is of the form $P = kV^{\gamma}$. Taking the natural logarithm of both sides:

\[ \ln P = \ln(kV^{\gamma}) = \ln k + \gamma \ln V \]

Let $Y = \ln P$, $X = \ln V$, $A = \ln k$, and $B = \gamma$. The equation becomes:

\[ Y = A + BX \]
We apply the method of least squares to this linear equation. Because the model is linear in $X = \ln V$ (not in V), the normal equations are:
1. $\sum Y = nA + B\sum X$ 2. $\sum XY = A\sum X + B\sum X^2$
After computing the necessary sums (n = 4):
\[ \sum X = 18.82615, \quad \sum Y = 14.86779, \quad \sum X^2 = 89.69015, \quad \sum XY = 68.35348 \]
We substitute these values into the normal equations:
1. 14.86779 = 4A + 18.82615B 2. 68.35348 = 18.82615A + 89.69015B
Solving this system gives A ≈ 10.76 and B ≈ −1.496. The system is poorly conditioned, so the sums should be carried to at least five decimals.
Hence $k = e^A \approx 4.7 \times 10^4$ and $\gamma = B \approx -1.496$.
Thus, the fitted curve is approximately $P = 4.7 \times 10^4 \, V^{-1.496}$.
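
Since both variables are log-transformed here, the arithmetic is easy to get wrong by hand. The sketch below is a minimal illustration of the same procedure, regressing ln P on ln V; the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(vs, ps):
    """Fit P = k * V**gamma by least squares on ln(P) = ln(k) + gamma*ln(V)."""
    n = len(vs)
    xs = [math.log(v) for v in vs]  # X = ln V
    ys = [math.log(p) for p in ps]  # Y = ln P
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    gamma = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    log_k = (sum_y - gamma * sum_x) / n
    return math.exp(log_k), gamma


k, gamma = fit_power_law([50, 100, 150, 200], [135, 48, 26, 17])
print(f"P = {k:.0f} * V**({gamma:.3f})")
```

Note that the cross term in the second normal equation is the sum of ln V × ln P, not of V × ln P.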

Problem 16:
Using the method of least squares, fit the curve $y = \frac{c_1}{\sqrt{x}} + c_0 x$ to the following data:

x y
0.2 16
0.3 14
0.5 11
1 6
2 3

Solution:
Fitting the Curve $y = \frac{c_1}{\sqrt{x}} + c_0 x$
Given Data:
x y
0.2 16
0.3 14
0.5 11
1 6
2 3
Step 1: Transform the Model
We rewrite the model as:
\[ y = c_1 z_1 + c_0 z_0, \]
where:
\[ z_1 = \frac{1}{\sqrt{x}}, \quad z_0 = x. \]
Step 2: Compute Necessary Summations

x     y    z1 = 1/√x   z0 = x   y·z1     y·z0   z1²      z1·z0   z0²
0.2   16   2.236       0.2      35.776   3.2    5.000    0.447   0.040
0.3   14   1.826       0.3      25.564   4.2    3.333    0.548   0.090
0.5   11   1.414       0.5      15.554   5.5    2.000    0.707   0.250
1     6    1.000       1.0      6.000    6.0    1.000    1.000   1.000
2     3    0.707       2.0      2.121    6.0    0.500    1.414   4.000
Σ                               85.015   24.9   11.833   4.116   5.38
Step 3: Form the Normal Equations
The normal equations are:
\[ \sum y z_1 = c_1 \sum z_1^2 + c_0 \sum z_1 z_0, \]
\[ \sum y z_0 = c_1 \sum z_1 z_0 + c_0 \sum z_0^2. \]
Substituting the values:
85.015 = c1 (11.833) + c0 (4.116),
24.9 = c1 (4.116) + c0 (5.38).
Step 4: Solve for c1 and c0
Solving the equations gives:
c1 = 7.60, c0 = −1.18.
Step 5: Fitted Curve
The fitted curve is:
\[ y = \frac{7.60}{\sqrt{x}} - 1.18x \]
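
The same steps work for any model that is linear in its coefficients, whatever the basis functions are. The sketch below is a hedged illustration: `fit_two_basis` is a hypothetical helper that takes the two basis functions as arguments and solves the 2 × 2 normal equations from Step 3.

```python
import math


def fit_two_basis(xs, ys, basis1, basis0):
    """Least-squares fit of y = c1*basis1(x) + c0*basis0(x)."""
    z1 = [basis1(x) for x in xs]
    z0 = [basis0(x) for x in xs]
    s11 = sum(u * u for u in z1)
    s10 = sum(u * v for u, v in zip(z1, z0))
    s00 = sum(v * v for v in z0)
    sy1 = sum(y * u for y, u in zip(ys, z1))
    sy0 = sum(y * v for y, v in zip(ys, z0))
    # Normal equations:
    #   sy1 = c1*s11 + c0*s10
    #   sy0 = c1*s10 + c0*s00
    det = s11 * s00 - s10 ** 2
    c1 = (sy1 * s00 - s10 * sy0) / det
    c0 = (s11 * sy0 - s10 * sy1) / det
    return c1, c0


xs = [0.2, 0.3, 0.5, 1, 2]
ys = [16, 14, 11, 6, 3]
c1, c0 = fit_two_basis(xs, ys, lambda x: 1 / math.sqrt(x), lambda x: x)
print(f"y = {c1:.2f}/sqrt(x) + ({c0:.2f})*x")
```

Passing the basis functions as parameters keeps the solver independent of the particular curve being fitted.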

Problem 17:
Using the method of least squares, fit the curve $f(x) = a + bx + cx^2$ to the following data:

x f (x)
0 1
1 4
2 10
3 17
4 30

Solution:
The equation of the curve is $f(x) = a + bx + cx^2$. The normal equations are:
1. $\sum f(x) = na + b\sum x + c\sum x^2$
2. $\sum x f(x) = a\sum x + b\sum x^2 + c\sum x^3$
3. $\sum x^2 f(x) = a\sum x^2 + b\sum x^3 + c\sum x^4$
After calculating the necessary sums:
\[ \sum x = 10, \quad \sum x^2 = 30, \quad \sum x^3 = 100, \quad \sum x^4 = 354 \]
\[ \sum f(x) = 62, \quad \sum x f(x) = 195, \quad \sum x^2 f(x) = 677 \]
We substitute these sums into the normal equations:
1. 62 = 5a + 10b + 30c 2. 195 = 10a + 30b + 100c 3. 677 = 30a + 100b + 354c
Eliminating a (equation 2 minus 2 × equation 1, and equation 3 minus 6 × equation 1) gives 10b + 40c = 71 and 40b + 174c = 305, so c = 1.5, b = 1.1, and a = 1.2.
Thus, the fitted quadratic curve is $f(x) = 1.2 + 1.1x + 1.5x^2$.

Problem 18:
Using the method of least squares, fit a curve of the form:

\[ y = ae^{bx} \]

to the following data:

x y
1 1.0
2 1.2
3 1.8
4 2.5
5 3.6

Solution
Fitting the Curve $y = ae^{bx}$ Using Least Squares
Step 1: Transform the Model
Taking the natural logarithm of both sides:

\[ y = ae^{bx} \implies \ln y = \ln a + bx. \]

Let Y = ln y and A = ln a, so the model becomes:

Y = A + bx.

Step 2: Compute Transformed Data

x y Y = ln y
1 1 0.000000
2 1.2 0.182321
3 1.8 0.587787
4 2.5 0.916291
5 3.6 1.280934
Step 3: Compute Summations

     x    Y          x²   xY
     1    0.000000   1    0.000000
     2    0.182321   4    0.364642
     3    0.587787   9    1.763361
     4    0.916291   16   3.665164
     5    1.280934   25   6.404670
Σ    15   2.967333   55   12.197837
Step 4: Form the Normal Equations
The normal equations are:
5A + 15b = 2.967333,
15A + 55b = 12.197837.
Step 5: Solve for A and b
Solving the equations, we get:
A = −0.3953, b = 0.3296.
Converting A to a:
a = eA = 0.6735.
Step 6: Fitted Curve
The fitted curve is:
\[ y = 0.6735\,e^{0.3296x}. \]

Problem 19:
If the following two lines are the regression equations:
1. 4x − 5y + 33 = 0 (regression of x on y), 2. 20x − 9y = 107 (regression of y on x),
Find the mean values of x and y, the correlation coefficient, and the standard deviation of y, given that
the variance of x is 9.

Solution:
Step 1: Find the Mean Values of x and y
The given regression equations are:
1. $4x - 5y + 33 = 0$, or $x = \frac{5}{4}y - \frac{33}{4}$
2. $20x - 9y = 107$, or $y = \frac{20}{9}x - \frac{107}{9}$
Let $\bar{x}$ and $\bar{y}$ be the mean values of x and y, respectively. Both regression lines pass through $(\bar{x}, \bar{y})$, so we substitute $x = \bar{x}$ and $y = \bar{y}$ into the regression equations:
1. $\bar{x} = \frac{5}{4}\bar{y} - \frac{33}{4}$
2. $\bar{y} = \frac{20}{9}\bar{x} - \frac{107}{9}$
We solve this system of linear equations. Substituting equation (1) into equation (2):
\[ \bar{y} = \frac{20}{9}\left(\frac{5}{4}\bar{y} - \frac{33}{4}\right) - \frac{107}{9} \]
Simplifying:
\[ \bar{y} = \frac{100}{36}\bar{y} - \frac{660}{36} - \frac{107}{9} = \frac{25}{9}\bar{y} - \frac{1088}{36} \]
Multiplying through by 36 to eliminate the denominator:

\[ 36\bar{y} = 100\bar{y} - 1088 \]

\[ -64\bar{y} = -1088 \]

\[ \bar{y} = \frac{1088}{64} = 17 \]
Substitute $\bar{y} = 17$ into equation (1) to find $\bar{x}$:
\[ \bar{x} = \frac{5}{4} \times 17 - \frac{33}{4} = \frac{85}{4} - \frac{33}{4} = \frac{52}{4} = 13 \]
Thus, the mean values are:
\[ \bar{x} = 13, \quad \bar{y} = 17 \]
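Step 2: Find the Correlation Coefficient (r)
The correlation coefficient satisfies $r^2 = b_{xy} \cdot b_{yx}$, where $b_{xy}$ is the regression coefficient of x on y and $b_{yx}$ is the regression coefficient of y on x; r takes the common sign of the two coefficients.
If the lines are read with the labels as printed, $b_{xy} = \frac{5}{4}$ and $b_{yx} = \frac{20}{9}$, whose product $\frac{25}{9}$ exceeds 1. Since $r^2 \le 1$, that assignment is impossible, so the roles of the two lines must be the other way round:
• $4x - 5y + 33 = 0$ is the regression of y on x: $y = \frac{4}{5}x + \frac{33}{5}$, so $b_{yx} = \frac{4}{5}$,
• $20x - 9y = 107$ is the regression of x on y: $x = \frac{9}{20}y + \frac{107}{20}$, so $b_{xy} = \frac{9}{20}$.
The means found in Step 1 are unaffected, because they depend only on the intersection of the two lines.
Thus, the correlation coefficient is:
\[ r = \sqrt{\frac{4}{5} \times \frac{9}{20}} = \sqrt{\frac{36}{100}} = \frac{3}{5} = 0.6 \]
The positive root is taken because both regression coefficients are positive.
Step 3: Find the Standard Deviation of y
We are given that the variance of x is 9. The standard deviation of x is:

\[ SD(x) = \sqrt{9} = 3 \]

To find the standard deviation of y, we use $b_{xy} = r \cdot SD(x)/SD(y)$, that is:

\[ SD(y) = SD(x) \times |r| \times \frac{1}{|b_{xy}|} \]
Substituting the values:
\[ SD(y) = 3 \times 0.6 \times \frac{20}{9} = 4 \]

Thus, the standard deviation of y is 4.

Final Answers:
• Mean of x: $\bar{x} = 13$
• Mean of y: $\bar{y} = 17$
• Correlation coefficient: r = 0.6
• Standard deviation of y: SD(y) = 4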

Problem 20:
Problem Statement
In a partially destroyed laboratory record of an analysis of correlation data, the following results are legible:
Variance of x: $\sigma_x^2 = 9$.
The regression equations are:

8x − 10y + 66 = 0 and 40x − 18y = 214.

Calculate the following:


(a) The mean values of x and y.
(b) The standard deviation of y.
(c) The coefficient of correlation between x and y.

Solution
Given:
Variance of x = $\sigma_x^2 = 9 \implies \sigma_x = 3$,
and the regression equations:

8x − 10y + 66 = 0 and 40x − 18y = 214.

(a) Mean Values of x and y


The regression equations pass through the means (x̄, ȳ). Substituting into the equations:

8x̄ − 10ȳ = −66,

40x̄ − 18ȳ = 214.


Solving these equations:
x̄ = 13, ȳ = 17.

(b) Standard Deviation of y


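The product of the two regression coefficients cannot exceed 1, which fixes the roles of the lines: $8x - 10y + 66 = 0$ is the regression of y on x, and $40x - 18y = 214$ is the regression of x on y. The regression coefficients are:
\[ b_{yx} = -\frac{\text{Coefficient of } x}{\text{Coefficient of } y} = -\frac{8}{-10} = 0.8 \quad (\text{from } 8x - 10y + 66 = 0), \]
\[ b_{xy} = -\frac{\text{Coefficient of } y}{\text{Coefficient of } x} = -\frac{-18}{40} = 0.45 \quad (\text{from } 40x - 18y = 214). \]
The relation between regression coefficients and the correlation coefficient is:

\[ r^2 = b_{xy} \cdot b_{yx} = 0.45 \cdot 0.8 = 0.36 \implies r = 0.6, \]

taking the positive root because both regression coefficients are positive.
Using the formula for the regression coefficient of y on x:

\[ b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x}, \]
substitute:
\[ 0.8 = 0.6 \cdot \frac{\sigma_y}{3} \implies \sigma_y = \frac{0.8 \cdot 3}{0.6} = 4. \]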

(c) Coefficient of Correlation
The coefficient of correlation is:
r = 0.6.

Final Answers
• Mean values: x̄ = 13, ȳ = 17.
• Standard deviation of y: σy = 4.

• Coefficient of correlation: r = 0.6.
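
The whole calculation (means from the intersection of the lines, r from the product of the slopes, then the standard deviation of y) can be sketched as one small function. This is an illustrative sketch: the name `analyse_regression_lines` and the `(p, q, s)` encoding of a line p·x + q·y = s are assumptions made for this example, and the caller must already know which line is y on x and which is x on y.

```python
import math


def analyse_regression_lines(y_on_x, x_on_y, var_x):
    """Each line is a tuple (p, q, s) meaning p*x + q*y = s."""
    p1, q1, s1 = y_on_x
    p2, q2, s2 = x_on_y
    # Both lines pass through (mean_x, mean_y): solve the 2x2 system.
    det = p1 * q2 - p2 * q1
    mean_x = (s1 * q2 - s2 * q1) / det
    mean_y = (p1 * s2 - p2 * s1) / det
    b_yx = -p1 / q1  # slope dy/dx of the y-on-x line
    b_xy = -q2 / p2  # slope dx/dy of the x-on-y line
    r_squared = b_yx * b_xy
    if not 0 < r_squared <= 1:
        raise ValueError("lines are mislabelled or inconsistent")
    # r carries the common sign of the two regression coefficients.
    r = math.copysign(math.sqrt(r_squared), b_yx)
    sd_y = b_yx * math.sqrt(var_x) / r  # from b_yx = r * sd_y / sd_x
    return mean_x, mean_y, r, sd_y


# 8x - 10y = -66 (y on x) and 40x - 18y = 214 (x on y), variance of x = 9.
mean_x, mean_y, r, sd_y = analyse_regression_lines(
    (8, -10, -66), (40, -18, 214), 9
)
print(mean_x, mean_y, round(r, 3), round(sd_y, 3))
```

The range check on r² is what flags a swapped pair of lines: swapping them makes the product of the slopes exceed 1.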

Problem 21:
Two lines of regression are given by:

5x − 2y = 52 (regression of x on y)

and
3x − 8y = 12 (regression of y on x),
and the variance of x is given by $\sigma_x^2 = 12$.
Calculate:

1. The mean value of x and y,


2. The variance of y,
3. The coefficient of correlation between x and y.

Solution:
Step 1: Rearrange the Regression Equations
The regression equations are:
\[ 5x - 2y = 52 \;\Rightarrow\; x = \frac{2}{5}y + \frac{52}{5} \]
and
\[ 3x - 8y = 12 \;\Rightarrow\; y = \frac{3}{8}x - \frac{12}{8} = \frac{3}{8}x - \frac{3}{2}. \]
Step 2: Calculate Mean Values of x and y
Both regression lines pass through $(\bar{x}, \bar{y})$, so to find the mean values we substitute $x = \bar{x}$ and $y = \bar{y}$ in the regression equations.
From the equation for x in terms of y:
\[ \bar{x} = \frac{2}{5}\bar{y} + \frac{52}{5}. \]
From the equation for y in terms of x:
\[ \bar{y} = \frac{3}{8}\bar{x} - \frac{3}{2}. \]
Now substitute the expression for $\bar{x}$ into the equation for $\bar{y}$:
\[ \bar{y} = \frac{3}{8}\left(\frac{2}{5}\bar{y} + \frac{52}{5}\right) - \frac{3}{2}. \]
Simplifying:
\[ \bar{y} = \frac{6}{40}\bar{y} + \frac{156}{40} - \frac{60}{40} = \frac{3}{20}\bar{y} + \frac{96}{40} = \frac{3}{20}\bar{y} + \frac{24}{10}. \]
Multiply through by 20 to eliminate the denominator:

\[ 20\bar{y} = 3\bar{y} + 48, \]

\[ 17\bar{y} = 48, \]
\[ \bar{y} = \frac{48}{17} \approx 2.82. \]
Now substitute $\bar{y} = \frac{48}{17}$ into the equation for $\bar{x}$:
\[ \bar{x} = \frac{2}{5} \times \frac{48}{17} + \frac{52}{5} = \frac{96}{85} + \frac{884}{85} = \frac{196}{17} \approx 11.53. \]
Thus, the mean values are:
\[ \bar{x} \approx 11.53, \quad \bar{y} \approx 2.82. \]
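Step 3: Calculate the Coefficient of Correlation
The formula for the correlation coefficient r is given by:
\[ r = \pm\sqrt{b_{xy} \times b_{yx}}, \]

where $b_{xy}$ is the regression coefficient of x on y, $b_{yx}$ is the regression coefficient of y on x, and r takes the common sign of the two coefficients.
From the regression equations:
• $b_{xy} = \frac{2}{5}$,
• $b_{yx} = \frac{3}{8}$.
Thus:
\[ r = \sqrt{\frac{2}{5} \times \frac{3}{8}} = \sqrt{\frac{6}{40}} = \sqrt{\frac{3}{20}} \approx 0.387. \]
So, the coefficient of correlation is approximately r ≈ 0.387 (positive, since both coefficients are positive).
Step 4: Calculate the Variance of y
We are given that the variance of x is $\sigma_x^2 = 12$. Since $b_{xy} = r\,\sigma_x/\sigma_y$, the variance of y can be calculated using the formula:
\[ \sigma_y^2 = \sigma_x^2 \times r^2 \times \frac{1}{b_{xy}^2}. \]

Substitute the known values, using the exact value $r^2 = \frac{3}{20} = 0.15$ to avoid rounding error:

\[ \sigma_y^2 = 12 \times 0.15 \times \frac{1}{\left(\frac{2}{5}\right)^2} = 12 \times 0.15 \times \frac{25}{4} = 12 \times 0.9375 = 11.25. \]
Thus, the variance of y is $\sigma_y^2 = 11.25$.
Final Answers:

• Mean of x: $\bar{x} \approx 11.53$,
• Mean of y: $\bar{y} \approx 2.82$,
• Coefficient of correlation: r ≈ 0.387,

• Variance of y: $\sigma_y^2 = 11.25$.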
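
As a cross-check on the variance of y that avoids rounding r, the two definitions $b_{yx} = r\,\sigma_y/\sigma_x$ and $b_{xy} = r\,\sigma_x/\sigma_y$ can be divided so that r cancels. The short derivation below uses only the coefficients and the variance already given in this problem:

```latex
\[
\frac{b_{yx}}{b_{xy}} = \frac{r\,\sigma_y/\sigma_x}{r\,\sigma_x/\sigma_y}
                      = \frac{\sigma_y^2}{\sigma_x^2}
\quad\Longrightarrow\quad
\sigma_y^2 = \sigma_x^2 \cdot \frac{b_{yx}}{b_{xy}}
           = 12 \times \frac{3/8}{2/5}
           = 12 \times \frac{15}{16}
           = 11.25 .
\]
```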

Problem 22:
The following table gives the age (x) in years of cars and annual maintenance cost (y) in hundred rupees.

x y
1 15
3 18
5 21
7 23
9 22
Calculate the maintenance cost for a 4-year-old car after finding the regression equation.

Solution
The regression equation is of the form:
y = a + bx
where:
\[ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \quad a = \frac{\sum y - b\sum x}{n} \]

Step 1: Calculate Required Sums


From the given data:
x = {1, 3, 5, 7, 9}, y = {15, 18, 21, 23, 22}
\[ \sum x = 25, \quad \sum y = 99, \quad \sum x^2 = 165, \quad \sum xy = 533 \]

Step 2: Calculate b and a


\[ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{5(533) - (25)(99)}{5(165) - (25)^2} = \frac{190}{200} = 0.95 \]
\[ a = \frac{\sum y - b\sum x}{n} = \frac{99 - 0.95(25)}{5} = 15.05 \]

Step 3: Regression Equation


y = 15.05 + 0.95x

Step 4: Predict Maintenance Cost for a 4-Year-Old Car (x = 4)
y = 15.05 + 0.95(4) = 18.85 (hundred rupees).

Final Answer
The estimated maintenance cost for a 4-year-old car is:

1885 rupees
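
The fit and the prediction can be reproduced with a short script. This is a minimal sketch using the formulas for a and b stated in the solution; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x from the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


ages = [1, 3, 5, 7, 9]            # age of car in years
costs = [15, 18, 21, 23, 22]      # annual maintenance cost, hundred rupees
a, b = fit_line(ages, costs)
cost_hundreds = a + b * 4         # predicted cost for a 4-year-old car
print(f"y = {a:.2f} + {b:.2f}x; cost at x = 4: {cost_hundreds * 100:.0f} rupees")
```

An age of 4 years lies inside the observed range of 1 to 9 years, so this is an interpolation; the same line should be used with more caution outside that range.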

Problem 23:
From the following data, determine the equations of the line of regression of y on x and x on y:

x y
6 9
2 11
10 5
4 8
8 7

Solution
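The regression equation of y on x is:
\[ y - \bar{y} = b_{yx}(x - \bar{x}), \]
where:
\[ b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{\left(\sum x\right)^2}{n}} \]
The regression equation of x on y is:

\[ x - \bar{x} = b_{xy}(y - \bar{y}), \]

where:
\[ b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{\left(\sum y\right)^2}{n}} \]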

Step 1: Calculate Required Sums


Given data:
x = {6, 2, 10, 4, 8}, y = {9, 11, 5, 8, 7}

\[ \sum x = 30, \quad \sum y = 40, \quad \sum x^2 = 220, \quad \sum y^2 = 340, \quad \sum xy = 214 \]

Step 2: Calculate Means


\[ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \quad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0 \]

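Step 3: Calculate Regression Coefficients
\[ b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{\left(\sum x\right)^2}{n}} = \frac{214 - \frac{30 \times 40}{5}}{220 - \frac{30^2}{5}} = \frac{-26}{40} = -0.65 \]
\[ b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{\left(\sum y\right)^2}{n}} = \frac{214 - \frac{30 \times 40}{5}}{340 - \frac{40^2}{5}} = \frac{-26}{20} = -1.30 \]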

Step 4: Write Regression Equations


1. Regression of y on x:
\[ y - 8.00 = -0.65(x - 6.00) \]
2. Regression of x on y:
\[ x - 6.00 = -1.30(y - 8.00) \]

Problem 24:
Fit a parabolic curve of regression of y on x to the following data:

x y
1.0 1.1
1.5 1.3
2.0 1.6
2.5 2.0
3.0 2.7
3.5 3.4
4.0 4.1

Solution
The parabolic regression curve is of the form:

\[ y = a + bx + cx^2 \]

The normal equations for fitting a parabola are:


\[ \sum y = na + b\sum x + c\sum x^2 \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \]

Step 1: Calculate Required Sums


From the given data (n = 7), the required sums are:
\[ \sum x = 17.5, \quad \sum y = 16.2, \quad \sum x^2 = 50.75, \quad \sum x^3 = 161.875, \quad \sum x^4 = 548.1875, \]
\[ \sum xy = 47.65, \quad \sum x^2 y = 154.475 \]

Step 2: Solve Normal Equations
Substituting the calculated sums into the normal equations:
\[ 16.2 = 7a + 17.5b + 50.75c \]
\[ 47.65 = 17.5a + 50.75b + 161.875c \]
\[ 154.475 = 50.75a + 161.875b + 548.1875c \]
Solving these simultaneous equations gives a ≈ 1.036, b ≈ −0.193, and c ≈ 0.243.

Step 3: Final Parabolic Equation

Substituting a, b, and c into the equation:
\[ y \approx 1.036 - 0.193x + 0.243x^2 \]
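
With non-integer data the sums and the 3 × 3 solve are tedious by hand, so a numerical check is useful. The sketch below is illustrative only: `solve_linear` and `fit_parabola` are hypothetical helper names, and the solver is plain Gaussian elimination with partial pivoting applied to the normal equations given above.

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the top.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_parabola(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # power_sums[k] = sum of x**k; moments[k] = sum of x**k * y.
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    moments = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    matrix = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    return solve_linear(matrix, moments)


xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.1, 1.3, 1.6, 2.0, 2.7, 3.4, 4.1]
a, b, c = fit_parabola(xs, ys)
print(f"y = {a:.3f} + ({b:.3f})x + {c:.3f}x^2")
```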

Problem 25:
Find the multiple regression equation of X3 on X1 and X2 from the data given below:

X1 X2 X3
3 10 20
5 10 25
6 5 15
8 7 16
12 5 15
10 2 2

Solution
The multiple regression equation is given by:

\[ X_3 = a + b_1 X_1 + b_2 X_2 \]
To determine $a$, $b_1$, and $b_2$, we use the normal equations:
1. $\sum X_3 = na + b_1 \sum X_1 + b_2 \sum X_2$
2. $\sum X_3 X_1 = a\sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$
3. $\sum X_3 X_2 = a\sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$

Step 1: Calculate Required Sums


From the given data, calculate the following:
\[ \sum X_1, \quad \sum X_2, \quad \sum X_3, \quad \sum X_1^2, \quad \sum X_2^2, \quad \sum X_3^2, \]
\[ \sum X_1 X_2, \quad \sum X_1 X_3, \quad \sum X_2 X_3 \]

Step 2: Solve Normal Equations


Substitute the calculated sums into the normal equations and solve for a, b1 , and b2 .

Step 3: Write Final Equation


Substitute the values of a, b1 , and b2 into the regression equation:

\[ X_3 = a + b_1 X_1 + b_2 X_2 \]
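
The three steps can be carried out numerically as follows. This sketch is an illustration and swaps in an equivalent formulation: eliminating a from the three normal equations leaves a 2 × 2 system in sums of deviations from the means, which is solved directly, and a is then recovered from the first normal equation. The function name `fit_two_predictors` is hypothetical.

```python
def fit_two_predictors(x1, x2, x3):
    """Least-squares fit of x3 = a + b1*x1 + b2*x2."""
    n = len(x3)
    m1, m2, m3 = sum(x1) / n, sum(x2) / n, sum(x3) / n
    # Sums of squares and cross-products of deviations from the means.
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s13 = sum((u - m1) * (w - m3) for u, w in zip(x1, x3))
    s23 = sum((v - m2) * (w - m3) for v, w in zip(x2, x3))
    # Reduced normal equations:
    #   s13 = b1*s11 + b2*s12
    #   s23 = b1*s12 + b2*s22
    det = s11 * s22 - s12 ** 2
    b1 = (s13 * s22 - s12 * s23) / det
    b2 = (s11 * s23 - s12 * s13) / det
    a = m3 - b1 * m1 - b2 * m2  # the first normal equation divided by n
    return a, b1, b2


x1 = [3, 5, 6, 8, 12, 10]
x2 = [10, 10, 5, 7, 5, 2]
x3 = [20, 25, 15, 16, 15, 2]
a, b1, b2 = fit_two_predictors(x1, x2, x3)
print(f"X3 = {a:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```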

Problem 26:
For the data given below, determine the lines of regression of y on x and x on y:

x y
2 5
4 7
6 9
8 8
10 11

Solution
Regression of y on x:
The regression equation of y on x is:
\[ y - \bar{y} = b_{yx}(x - \bar{x}) \]
where:
\[ b_{yx} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)} \]

Regression of x on y:
The regression equation of x on y is:
\[ x - \bar{x} = b_{xy}(y - \bar{y}) \]
where:
\[ b_{xy} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(y)} \]

Step 1: Calculate Required Sums


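Given data:
x = {2, 4, 6, 8, 10}, y = {5, 7, 9, 8, 11}
The required sums (n = 5) are:
\[ \sum x = 30, \quad \sum y = 40, \quad \sum x^2 = 220, \quad \sum y^2 = 340, \quad \sum xy = 266 \]
\[ \bar{x} = \frac{\sum x}{n} = 6, \quad \bar{y} = \frac{\sum y}{n} = 8 \]
\[ \mathrm{Var}(x) = \frac{\sum x^2}{n} - \bar{x}^2 = 44 - 36 = 8, \quad \mathrm{Var}(y) = \frac{\sum y^2}{n} - \bar{y}^2 = 68 - 64 = 4 \]
\[ \mathrm{Cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\bar{y} = 53.2 - 48 = 5.2 \]

Step 2: Solve for Regression Coefficients

Using the values obtained:

\[ b_{yx} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)} = \frac{5.2}{8} = 0.65, \quad b_{xy} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(y)} = \frac{5.2}{4} = 1.30 \]

Step 3: Write the Regression Equations
1. Regression of y on x:
\[ y - 8 = 0.65(x - 6), \quad \text{that is,} \quad y = 0.65x + 4.1 \]
2. Regression of x on y:
\[ x - 6 = 1.30(y - 8), \quad \text{that is,} \quad x = 1.30y - 4.4 \]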
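
The covariance and variance formulas from Step 1 translate directly into code. The following is a minimal sketch with a hypothetical function name `regression_lines`; it uses the population (divide by n) forms of the variance and covariance, as the solution does.

```python
def regression_lines(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    cov_xy = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    b_yx = cov_xy / var_x  # slope of the regression of y on x
    b_xy = cov_xy / var_y  # slope of the regression of x on y
    return mean_x, mean_y, b_yx, b_xy


mean_x, mean_y, b_yx, b_xy = regression_lines([2, 4, 6, 8, 10], [5, 7, 9, 8, 11])
print(f"y - {mean_y:g} = {b_yx:.2f}(x - {mean_x:g})")
print(f"x - {mean_x:g} = {b_xy:.2f}(y - {mean_y:g})")
```

A quick sanity check on any such result is that the product of the two slopes, which equals r², must lie between 0 and 1.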
