Statistical Treatment in Theory and Proofs

The document provides solutions to various statistical theory problems, including proofs of the mean and variance of standardized variables, derivation of the OLS estimator, and calculations of probabilities related to shared birthdays and expected values. It also discusses variance, covariance, and conditional probabilities using joint distributions. Key results include the unbiasedness of the OLS estimator under certain assumptions and the minimum number of participants needed for a specific probability of shared birthdays.

Uploaded by

Fridah Chepkoech


Statistical Theory Solutions and Proofs

Solution to Theory Part

Question T.1)
Prove that the mean of Z is zero and the variance of Z is 1.
We are given that $Z = \frac{X - \mu_X}{\sigma_X}$, where $\mu_X$ is the mean of X and $\sigma_X$ is the standard deviation of X.
By definition, the mean of Z is given by;

\[ E[Z] = E\left[\frac{X - \mu_X}{\sigma_X}\right] = \frac{1}{\sigma_X}\left(E[X] - \mu_X\right) = 0 \]

The variance of Z will be given by;

\[ \mathrm{Var}(Z) = E[Z^2] - (E[Z])^2 = E\left[\left(\frac{X - \mu_X}{\sigma_X}\right)^2\right] - 0 = \frac{1}{\sigma_X^2} E\left[(X - \mu_X)^2\right] = \frac{\sigma_X^2}{\sigma_X^2} = 1 \]
Question T.2)
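We are to prove that $\mathrm{Var}(Y) = a^2\sigma_X^2 + b^2\sigma_Z^2 + 2ab\,\sigma_{XZ}$, where $Y = aX + bZ$.
Given that;
\[ \mathrm{Var}(Y) = \mathrm{Var}(aX + bZ) \]
Using the property of variance for linear combinations of random variables, we have that;
\[ \mathrm{Var}(aX + bZ) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Z) + 2ab\,\mathrm{Cov}(X, Z) \]
Writing $\mathrm{Var}(X) = \sigma_X^2$, $\mathrm{Var}(Z) = \sigma_Z^2$ and $\mathrm{Cov}(X, Z) = \sigma_{XZ}$;
\[ \mathrm{Var}(aX + bZ) = a^2\sigma_X^2 + b^2\sigma_Z^2 + 2ab\,\sigma_{XZ} \]
This justifies that $\mathrm{Var}(Y) = a^2\sigma_X^2 + b^2\sigma_Z^2 + 2ab\,\sigma_{XZ}$.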
Question T.3)
We are to derive the coefficient $\beta$ in the linear model $y_i = \beta x_i + \varepsilon_i$ using OLS.
The OLS objective is to minimize the sum of squared residuals;
\[ \min_{\beta} \sum_i \varepsilon_i^2 = \min_{\beta} \sum_i (y_i - \beta x_i)^2 \]
For the first-order condition, we take the derivative with respect to $\beta$ and set it to zero.
\[ \frac{d}{d\beta} \sum_i (y_i - \beta x_i)^2 = -2 \sum_i x_i (y_i - \beta x_i) = 0 \]
We now solve for $\beta$;
\[ \sum_i x_i y_i = \beta \sum_i x_i^2 \]
\[ \hat{\beta} = \frac{\sum_i x_i y_i}{\sum_i x_i^2} \]
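The closed form for the no-intercept model can be checked numerically. The following is a minimal sketch, not part of the original solution; the function name `ols_through_origin` and the data values are hypothetical and chosen only for illustration.

```python
def ols_through_origin(x, y):
    """Return the beta that minimises sum((y_i - beta * x_i) ** 2)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all x values are zero; beta is not identified")
    return sxy / sxx

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

beta_hat = ols_through_origin(x, y)

# First-order condition: sum of x_i * residual_i should be zero at beta_hat
foc = sum(xi * (yi - beta_hat * xi) for xi, yi in zip(x, y))
print(beta_hat, foc)
```

The second printed value is the first-order condition evaluated at the estimate, which should vanish up to floating-point error.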
Question T.4)
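We are to show that the OLS estimator $\hat{\beta}$ is unbiased and state the required assumptions.
We are given the population model $Y_i = \alpha + \beta X_i + \varepsilon_i$. The OLS estimator is;
\[ \hat{\beta} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2} \]
The first step of the unbiasedness proof is to substitute $Y_i = \alpha + \beta X_i + \varepsilon_i$ into the estimator. Since $Y_i - \bar{Y} = \beta (X_i - \bar{X}) + (\varepsilon_i - \bar{\varepsilon})$ and $\sum_i \bar{\varepsilon}(X_i - \bar{X}) = 0$, we get;
\[ \hat{\beta} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2} = \beta + \frac{\sum_i \varepsilon_i (X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2} \]
We take the expectation to get the following expression;
\[ E[\hat{\beta}] = \beta + E\left[\frac{\sum_i \varepsilon_i (X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2}\right] = \beta \quad \text{if } E[\varepsilon_i \mid X_i] = 0 \]

The required assumptions are as follows;

• Linearity, that is, $Y_i = \alpha + \beta X_i + \varepsilon_i$
• Exogeneity, that is, $E[\varepsilon_i \mid X_i] = 0$
• Random sampling, that is, $\{(X_i, Y_i)\}$ is i.i.d.
• No perfect multicollinearity (for multiple regression).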
Application Part
Solution to Question A.1
Part 1: Probability that at least 2 people share the same birthday
To find the probability that at least 2 people in a room of n share the same birthday, it is easier to calculate the complement (that is, the probability that all n people have unique birthdays) and subtract it from 1.
Total possible birthday combinations: Each person can have any of the 365 days as their birthday. For n people, the total number of possible birthday combinations is given by $365^n$.
Number of unique birthday combinations: The first person has 365 choices, the second has 364 (to avoid matching the first), the third has 363, and so on. This is given by the permutation;
\[ P(365, n) = 365 \times 364 \times 363 \times \cdots \times (365 - n + 1) \]
The probability of all unique birthdays will be given by;
\[ P(\text{Unique}) = \frac{P(365, n)}{365^n} \]
Probability of at least one shared birthday;
\[ P(\text{at least two share}) = 1 - P(\text{Unique}) = 1 - \frac{P(365, n)}{365^n} \]
Part 2: Minimum number of participants for at least 40% probability
In this case, we need to determine the smallest n such that;
\[ 1 - \frac{P(365, n)}{365^n} \ge 0.40 \]
We can solve this by testing successive values of n.
For n=1, $P(\text{at least one shared}) = 0$
For n=10, $P(\text{at least one shared}) = 1 - \frac{P(365, 10)}{365^{10}} = 11.695\%$
For n=19, $P(\text{at least one shared}) = 1 - \frac{P(365, 19)}{365^{19}} = 37.91\%$
For n=20, $P(\text{at least one shared}) = 1 - \frac{P(365, 20)}{365^{20}} = 41.14\%$

We have that the smallest n where the probability first exceeds 40% is n=20.
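The search over successive values of n can be sketched in a few lines. This is an illustrative sketch that assumes 365 equally likely birthdays, as in the solution; the function names `p_shared_birthday` and `min_group_size` are hypothetical.

```python
def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), equally likely days assumed."""
    p_unique = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken
        p_unique *= (days - k) / days
    return 1.0 - p_unique

def min_group_size(target, days=365):
    """Smallest n whose shared-birthday probability reaches the target."""
    n = 1
    while p_shared_birthday(n, days) < target:
        n += 1
    return n

print(round(p_shared_birthday(19), 4), round(p_shared_birthday(20), 4))
print(min_group_size(0.40))
```

Multiplying the ratios one at a time avoids forming the very large numbers $P(365, n)$ and $365^n$ explicitly.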
Solution to Question A.2
i. Expected Values of X and Y
The first step is to calculate the marginal probabilities for X and Y by summing the joint
probabilities over the other variable.
Marginal probabilities for X
𝑃(𝑋 = −3) = 0.05 + 0 + 0 + 0 + 0.05 = 0.10
𝑃(𝑋 = 5) = 0 + 0.15 + 0 + 0.15 + 0 = 0.30
𝑃(𝑋 = 10) = 0 + 0 + 0.2 + 0 + 0 = 0.20
𝑃(𝑋 = 15) = 0 + 0.15 + 0 + 0.15 + 0 = 0.30
𝑃(𝑋 = 23) = 0.05 + 0 + 0 + 0 + 0.05 = 0.10
Marginal probabilities for Y
𝑃(𝑌 = 7) = 0.05 + 0 + 0 + 0 + 0.05 = 0.10
𝑃(𝑌 = 9) = 0 + 0.15 + 0 + 0.15 + 0 = 0.30
𝑃(𝑌 = 12) = 0 + 0 + 0.2 + 0 + 0 = 0.20
𝑃(𝑌 = 15) = 0 + 0.15 + 0 + 0.15 + 0 = 0.30
𝑃(𝑌 = 17) = 0.05 + 0 + 0 + 0 + 0.05 = 0.10
The second step is to compute the expected values using the marginal probabilities.
𝐸 [𝑋] = (−3)(0.10) + 5(0.30) + 10(0.20) + 15(0.30) + 23(0.10)=10
𝐸 [𝑌] = (7)(0.10) + 9(0.30) + 12(0.20) + 15(0.30) + 17(0.10)=12
Hence E[X]=10 and E[Y]=12
ii. The Variance of X and Y
The first step is to compute $E[X^2]$ and $E[Y^2]$ using the marginal probabilities.
\[ E[X^2] = (-3)^2(0.10) + 5^2(0.30) + 10^2(0.20) + 15^2(0.30) + 23^2(0.10) = 148.8 \]
\[ E[Y^2] = 7^2(0.10) + 9^2(0.30) + 12^2(0.20) + 15^2(0.30) + 17^2(0.10) = 154.4 \]
The next step is to calculate the variances using $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ and $\mathrm{Var}(Y) = E[Y^2] - (E[Y])^2$.
\[ \mathrm{Var}(X) = 148.8 - 10^2 = 148.8 - 100 = 48.8 \]
\[ \mathrm{Var}(Y) = 154.4 - 12^2 = 154.4 - 144 = 10.4 \]
iii. Covariance Between X and Y.
The first step in this case is to compute $E[XY]$ by summing the products $xy$ over all the possible pairs $(x, y)$, weighted by their joint probabilities, as follows;
\[ E[XY] = (-3)(7)(0.05) + (-3)(17)(0.05) + 5(9)(0.15) + 5(15)(0.15) + 10(12)(0.2) + 15(9)(0.15) + 15(15)(0.15) + 23(7)(0.05) + 23(17)(0.05) \]
\[ E[XY] = -1.05 - 2.55 + 6.75 + 11.25 + 24 + 20.25 + 33.75 + 8.05 + 19.55 = 120 \]
We now calculate the covariance using the formula;
\[ \mathrm{Cov}(X, Y) = E[XY] - E[X]E[Y] \]
\[ \mathrm{Cov}(X, Y) = 120 - (10)(12) \]
\[ \mathrm{Cov}(X, Y) = 0 \]
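The moments in parts i to iii can be recomputed directly from the joint distribution. The sketch below is illustrative: the dictionary `joint` is an assumption, read off the nonzero (x, y) terms listed in the $E[XY]$ sum, since the original joint table is not reproduced in this document. The helper name `expect` is hypothetical.

```python
# Joint pmf {(x, y): P(X=x, Y=y)}, assumed from the nonzero terms in the E[XY] sum
joint = {
    (-3, 7): 0.05, (-3, 17): 0.05,
    (5, 9): 0.15, (5, 15): 0.15,
    (10, 12): 0.20,
    (15, 9): 0.15, (15, 15): 0.15,
    (23, 7): 0.05, (23, 17): 0.05,
}

def expect(f):
    """Expected value of f(X, Y) under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
e_xy = expect(lambda x, y: x * y)
cov_xy = e_xy - e_x * e_y

print(e_x, e_y, var_x, var_y, e_xy, cov_xy)
```

Working from the joint pmf rather than the marginals keeps every moment tied to the same table, so a sign slip in one term is easy to spot.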
iv. Conditional Probability Distribution Y Given XY≤120
We start by identifying all the possible pairs (x,y) where XY≤120 and their joint probabilities;
• (-3,7): (-3)(7)= -21≤120, P=0.05
• (-3,17): (-3)(17)= -51≤120, P=0.05
• (5,9): (5)(9)= 45≤120, P=0.15
• (5,15): (5)(15)= 75≤120, P=0.15
• (10,12): (10)(12)= 120≤120, P=0.2
The remaining pairs (15,9), (15,15), (23,7) and (23,17) have products above 120 and are excluded.
We now calculate the probability P(XY≤120) as follows;
P(XY≤120) = 0.05+0.05+0.15+0.15+0.2 = 0.6
We now calculate the conditional probability of each Y value, using only the pairs listed above, as follows;
\[ P(Y = 7 \mid XY \le 120) = \frac{0.05}{0.6} = \frac{1}{12} \]
\[ P(Y = 9 \mid XY \le 120) = \frac{0.15}{0.6} = \frac{1}{4} \]
\[ P(Y = 12 \mid XY \le 120) = \frac{0.2}{0.6} = \frac{1}{3} \]
\[ P(Y = 15 \mid XY \le 120) = \frac{0.15}{0.6} = \frac{1}{4} \]
\[ P(Y = 17 \mid XY \le 120) = \frac{0.05}{0.6} = \frac{1}{12} \]
Hence the conditional probability distribution of Y given XY≤120 is
\[ P(Y = y \mid XY \le 120) = \begin{cases} \frac{1}{12}, & y = 7 \\ \frac{1}{4}, & y = 9 \\ \frac{1}{3}, & y = 12 \\ \frac{1}{4}, & y = 15 \\ \frac{1}{12}, & y = 17 \end{cases} \]
These probabilities sum to 1, as required.
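Conditioning on the event XY≤120 amounts to filtering the joint pmf and renormalising. The sketch below uses exact fractions; the dictionary `joint` is an assumption, taken from the (x, y) pairs and probabilities listed in this solution rather than from the original table.

```python
from fractions import Fraction as F

# Joint pmf {(x, y): P(X=x, Y=y)}, assumed from the pairs listed in the solution
joint = {
    (-3, 7): F(5, 100), (-3, 17): F(5, 100),
    (5, 9): F(15, 100), (5, 15): F(15, 100),
    (10, 12): F(20, 100),
    (15, 9): F(15, 100), (15, 15): F(15, 100),
    (23, 7): F(5, 100), (23, 17): F(5, 100),
}

# Keep only the pairs inside the conditioning event XY <= 120
event = {xy: p for xy, p in joint.items() if xy[0] * xy[1] <= 120}
p_event = sum(event.values())

# Conditional pmf of Y: joint mass inside the event divided by P(event)
cond_y = {}
for (x, y), p in event.items():
    cond_y[y] = cond_y.get(y, F(0)) + p / p_event

for y in sorted(cond_y):
    print(y, cond_y[y])
```

Only the mass that lies inside the event enters each numerator, so a pair such as (23, 7) contributes nothing to the probability of Y = 7.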
Solution to Question A.3
i. Calculate $P(9.05 < X \le 73.27)$
We are given that $X \sim N(\mu_X = 29, \sigma_X^2 = 361)$, so $\sigma_X = 19$.
We standardize the bounds as follows;
\[ Z_1 = \frac{9.05 - 29}{19} = -1.05 \]
\[ Z_2 = \frac{73.27 - 29}{19} = 2.33 \]
We use the standard normal tables to obtain the following;
\[ P(Z \le -1.05) = 0.1469 \]
\[ P(Z \le 2.33) = 0.9901 \]
We now calculate the probability as follows;
\[ P(9.05 < X \le 73.27) = 0.9901 - 0.1469 = 0.8432 \]
ii. Distribution of $\bar{X}$ for $n_X = 17$
The sample mean $\bar{X}$ follows a normal distribution;
\[ \bar{X} \sim N\left(\mu_X = 29, \; \frac{\sigma_X^2}{n_X} = \frac{361}{17}\right) \]

iii. Covariance and Correlation between X and Y

Given that $\mu_X = 29$, $\mu_Y = 25$, $\sigma_X = 19$, $\sigma_Y = 23$ and $E[XY] = 441$
The covariance; $\mathrm{Cov}(X, Y) = E[XY] - \mu_X \mu_Y = 441 - (29)(25) = -284$
The correlation formula is given by;
\[ \rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-284}{(19)(23)} = -0.65 \]
iv. Mean and Variance of W=2X−3Y
Mean: $E[W] = 2\mu_X - 3\mu_Y = 2(29) - 3(25) = -17$
Variance: $\mathrm{Var}(W) = 4\sigma_X^2 + 9\sigma_Y^2 - 12\,\mathrm{Cov}(X, Y)$
= 4(361) + 9(529) − 12(−284)
= 1444 + 4761 + 3408
= 9613.
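The table lookups and the linear-combination formulas in this question can be cross-checked with the standard library. This is a minimal sketch using `statistics.NormalDist`; the parameter values are the ones stated in the solution (mean 29, standard deviation 19 for X; mean 25, standard deviation 23 for Y), and the variable names are hypothetical.

```python
from statistics import NormalDist

mu_x, sigma_x = 29, 19
mu_y, sigma_y = 25, 23
e_xy = 441

# Part i: P(9.05 < X <= 73.27) from the normal CDF instead of printed tables
dist_x = NormalDist(mu_x, sigma_x)
p_interval = dist_x.cdf(73.27) - dist_x.cdf(9.05)

# Part iii: covariance and correlation
cov_xy = e_xy - mu_x * mu_y
rho = cov_xy / (sigma_x * sigma_y)

# Part iv: W = 2X - 3Y
mean_w = 2 * mu_x - 3 * mu_y
var_w = 4 * sigma_x ** 2 + 9 * sigma_y ** 2 - 12 * cov_xy

print(round(p_interval, 4), cov_xy, round(rho, 2), mean_w, var_w)
```

The CDF value may differ from the table-based answer in the last digit because the tables round z to two decimals.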
Solution to Question A.4
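a. We are to calculate $P\left(-\frac{39}{7} < \bar{X} \le -\frac{3}{7}\right)$

We are given that $\bar{X} \sim N\left(\mu_X = -3, \; \frac{\sigma^2}{n} = \frac{81}{49}\right)$, implying that $\sigma_{\bar{X}} = \frac{9}{7}$

We standardize the bounds as follows;

\[ Z_1 = \frac{-\frac{39}{7} - (-3)}{\frac{9}{7}} = -2 \]
\[ Z_2 = \frac{-\frac{3}{7} - (-3)}{\frac{9}{7}} = 2 \]
We use the standard normal tables to obtain the following;
\[ P(Z \le -2) = 0.0228 \]
\[ P(Z \le 2) = 0.9772 \]
We now calculate the probability as follows;
\[ P\left(-\frac{39}{7} < \bar{X} \le -\frac{3}{7}\right) = 0.9772 - 0.0228 = 0.9544 \]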
b. We are to find a such that $P(-3 - a < \bar{X} \le -3 + a) = 98\%$
The interval is symmetric about the mean −3, so each tail holds 1%, which implies that;
\[ P(\bar{X} \le -3 + a) = 0.99 \]
The Z value with cumulative probability 0.99 is Z ≈ 2.33
We now solve for a:
\[ a = Z \times \sigma_{\bar{X}} = 2.33 \times \frac{9}{7} = 3.00 \]
c. 98% Confidence Interval for μ (Unknown Variance)
Given that n=41, $\bar{X} = -5$, $S = 10$ and $\alpha = 0.02$
We use the t distribution with df=n-1=40; $t_{0.01,40} = 2.42$
The margin of error (MOE):
\[ E = t_{\alpha/2} \times \frac{S}{\sqrt{n}} = 2.42 \times \frac{10}{\sqrt{41}} = 3.78 \]
The confidence interval will be given by;
\[ -5 \pm 3.78 = (-8.78, -1.22) \]
d. One-Tailed Test for H0: μ≥−3 at α=1%
The test statistic;
\[ t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} = \frac{-5 - (-3)}{10 / \sqrt{41}} = -1.28 \]
The critical value (left-tailed): $t_{0.01,40} \approx -2.42$
The decision is that we fail to reject the null hypothesis, as the test statistic (-1.28) is greater than the critical value (-2.42) at the 1% level of significance.
Solution to Question A.5
Part i) Two-tailed test for $H_0: \mu_X = \mu_Y$ at α=1%
We are given that;
Young Patients (X): $\bar{X} = 1$, $S_X^2 = 0.5$, $n_X = 61$
Old Patients (Y): $\bar{Y} = 0.7$, $S_Y^2 = 0.45$, $n_Y = 41$
And that $\sigma_X^2 = \sigma_Y^2$ (equal variances)
The pooled variance will be given by the following;
\[ S_p^2 = \frac{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}{n_X + n_Y - 2} = \frac{(60)(0.5) + (40)(0.45)}{100} = 0.48 \]
The test statistic will be given by;
\[ t = \frac{\bar{X} - \bar{Y}}{\sqrt{S_p^2\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}} = \frac{1 - 0.7}{\sqrt{0.48\left(\frac{1}{61} + \frac{1}{41}\right)}} = \frac{0.3}{0.1399} \approx 2.14 \]
The critical value;
Degrees of freedom will be given by;
\[ n_X + n_Y - 2 = 100 \]

For α=1%, two tailed, t critical=2.626

Since the absolute value of t, that is |t|, is less than the critical value (2.14<2.626), we fail to reject the null hypothesis.
Part ii) Conclusion for part i)
There is not sufficient evidence at the 1% level to conclude that the mean plasma concentration differs between young and old patients.
Part iii) Two-tailed test for $H_0: p_X = p_Y$ at α=1%.
We are given that;
Young patients: 10/61 had side effects ($\hat{p}_X = \frac{10}{61} = 0.1639$)

Old patients: 17/41 had side effects ($\hat{p}_Y = \frac{17}{41} = 0.4146$)

The pooled proportion will be given by;

\[ \hat{p} = \frac{10 + 17}{61 + 41} = \frac{27}{102} = 0.2647 \]
The test statistic (Z test) will be given by;
\[ Z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}} = \frac{0.1639 - 0.4146}{\sqrt{(0.2647)(0.7353)\left(\frac{1}{61} + \frac{1}{41}\right)}} \approx -2.81 \]

The critical value for α=1%, two tailed, Zcritical= ±2.576.

The decision is that since |z|=2.81>2.576, we reject the null hypothesis.

Part iv) Conclusion for part iii)
There is significant evidence at the 1% level to conclude that the proportion of patients with
side effects differs between young and old patients.
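Both test statistics in this question can be evaluated directly from the stated summary statistics. The following is an illustrative sketch; the function names `pooled_t` and `two_proportion_z` are hypothetical, and the critical values 2.626 and 2.576 are the ones quoted in the solution rather than computed here.

```python
from math import sqrt

def pooled_t(mean_x, var_x, n_x, mean_y, var_y, n_y):
    """Equal-variance two-sample t statistic: returns (pooled variance, t, df)."""
    df = n_x + n_y - 2
    sp2 = ((n_x - 1) * var_x + (n_y - 1) * var_y) / df
    t = (mean_x - mean_y) / sqrt(sp2 * (1 / n_x + 1 / n_y))
    return sp2, t, df

def two_proportion_z(k_x, n_x, k_y, n_y):
    """Two-proportion z statistic with a pooled proportion: returns (pooled p, z)."""
    p_pool = (k_x + k_y) / (n_x + n_y)
    z = (k_x / n_x - k_y / n_y) / sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
    return p_pool, z

# Summary statistics as stated in the question
sp2, t_stat, df = pooled_t(1.0, 0.5, 61, 0.7, 0.45, 41)
p_pool, z_stat = two_proportion_z(10, 61, 17, 41)

# Critical values quoted in the solution (two-tailed, alpha = 1%)
print(round(sp2, 2), round(t_stat, 2), df, abs(t_stat) > 2.626)
print(round(p_pool, 4), round(z_stat, 2), abs(z_stat) > 2.576)
```

Carrying the proportions unrounded into the z statistic avoids the small drift that comes from using three-decimal values.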
Solution to Question A.6
We are to find the probability that the patient actually has the disease given a positive test result.
Given that P(D)=0.01 and that P(Test+|D)=0.95;
Sensitivity, P(Test+|D): 95% (0.95)
False Positive Rate, P(Test+|¬D): Not given. In this case, we will assume 5% (0.05) because it is not specified.
We need to find P(D|Test+), that is, the probability that the patient has the disease given a positive test result.
We first list the probabilities as follows;

• P(D) = 0.01
• P(¬D) = 1 - P(D) = 0.99
• P(Test+ | D) = 0.95
• P(Test+ | ¬D) = 0.05 (assumed)

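We apply Bayes’ Theorem as shown;

\[ P(D \mid \text{Test}+) = \frac{P(\text{Test}+ \mid D) \times P(D)}{P(\text{Test}+)} \]

Where $P(\text{Test}+) = P(\text{Test}+ \mid D) \times P(D) + P(\text{Test}+ \mid \neg D) \times P(\neg D)$

We compute $P(\text{Test}+)$;

\[ P(\text{Test}+) = (0.95 \times 0.01) + (0.05 \times 0.99) = 0.0095 + 0.0495 = 0.059 \]

We compute $P(D \mid \text{Test}+)$;

\[ P(D \mid \text{Test}+) = \frac{0.95 \times 0.01}{0.059} = \frac{0.0095}{0.059} \approx 0.161 \text{ or } 16.1\% \]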

The probability that the patient actually has the disease given a positive test result is
approximately 16.1%.
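Because the false positive rate is an assumption here, it can help to see how the answer moves with it. The sketch below is illustrative; the function name `posterior` and the alternative rates 0.01 and 0.10 are hypothetical choices, not values from the question.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(D | Test+) by Bayes' theorem."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# 0.05 is the rate assumed in the solution; the other two are for comparison only
for fpr in (0.01, 0.05, 0.10):
    print(fpr, round(posterior(0.01, 0.95, fpr), 3))
```

With a 1% prevalence, the posterior is driven mainly by the false positive rate, so the 16.1% figure should be read as conditional on the assumed 5%.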

Solution to Question A.7

Binomial distribution: $X \sim \mathrm{Binomial}(n = 100, p = 0.05)$

i) Probability of at least one success in 100 trials:

Calculate the probability of no successes:

\[ P(X = 0) = (1 - p)^n = 0.95^{100} \]

Subtract from 1 to get P(X≥1)

\[ P(X \ge 1) = 1 - 0.95^{100} = 1 - 0.00592 \approx 0.9941 \text{ or } 99.41\% \]

The probability of at least one success in 100 trials is approximately 99.41%.

ii) Probability of at least one success over one time unit:

Assuming that the 100 trials are uniformly distributed over one time unit (e.g., one hour), the number of successes in one time unit can be modeled by a Poisson distribution with rate;

\[ \lambda = n \times p = 100 \times 0.05 = 5 \]

The probability of at least one success in one time unit is:

\[ P(\text{at least one success}) = 1 - P(\text{no success}) = 1 - e^{-\lambda} = 1 - e^{-5} = 1 - 0.0067 \]

The probability of at least one success over one time unit is approximately 0.9933

iii) Probability that the time between two successes is less than one time unit:

The time T between successes in a Poisson process follows an exponential distribution with rate λ=5. The probability that the time between two successes is less than one time unit is:

\[ P(T < 1) = 1 - e^{-\lambda \times 1} = 1 - e^{-5 \times 1} = 0.9933 \]

The probability that the time between two successes is less than one time unit is approximately 0.9933
iv) Comparison of probabilities from part ii and part iii.

The probabilities in part ii and part iii are the same because both scenarios are described by
the same Poisson process, where the probability of an event occurring within a time interval
is equal to the probability that the waiting time between events is less than that interval.
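The three probabilities in this question can be evaluated side by side. This is a minimal sketch using only the standard library; the variable names are hypothetical.

```python
from math import exp

n, p = 100, 0.05

# i) Binomial: at least one success in n independent trials
p_binomial = 1 - (1 - p) ** n

# ii) Poisson with rate lambda = n * p: at least one event in one time unit
lam = n * p
p_poisson = 1 - exp(-lam)

# iii) Exponential waiting time with rate lambda: P(T < 1) is the CDF at t = 1
t = 1.0
p_waiting = 1 - exp(-lam * t)

print(round(p_binomial, 4), round(p_poisson, 4), round(p_waiting, 4))
```

Parts ii and iii evaluate the same expression, which is why they agree exactly, while the binomial value differs slightly because the Poisson model is an approximation to it.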

Solution to Question A.8

Given that;

• Old Train (X): 22, 23, 22, 24, 21, 20, 20, 23, 21, 21
• New Train (Y): 20, 20, 21, 22, 17, 18, 19, 20, 22, 19

The summary statistics are given by;

Sample size: $n_X = 10$ and $n_Y = 10$

The sample means are;

\[ \bar{X} = \frac{22 + 23 + 22 + 24 + 21 + 20 + 20 + 23 + 21 + 21}{10} = 21.7 \]

\[ \bar{Y} = \frac{20 + 20 + 21 + 22 + 17 + 18 + 19 + 20 + 22 + 19}{10} = 19.8 \]

Train             Old Train (X)   New Train (Y)
1                 22              20
2                 23              20
3                 22              21
4                 24              22
5                 21              17
6                 20              18
7                 20              19
8                 23              20
9                 21              22
10                21              19
Sample Mean       21.7            19.8
Sample Variance   1.78889         2.62222
i) One-Tailed Test with Known Variances

Null Hypothesis: 𝐻𝑜: µ𝑋 ≤ µ𝑌

Alternative Hypothesis: 𝐻1: µ𝑋 > µ𝑌

Significance level α=0.01

The test statistic;

\[ Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}} = \frac{21.7 - 19.8}{\sqrt{\frac{1.96}{10} + \frac{1.96}{10}}} = 3.035 \]

The critical value $Z_{0.01} = 2.326$ (for a right-tailed test)

The decision is to reject the null hypothesis since 3.035>2.326, and we conclude that the new trains run faster than the old trains at the 1% significance level.

ii) One-Tailed Test with Unknown but Equal Variances

Null Hypothesis: 𝐻𝑜: µ𝑋 ≤ µ𝑌

Alternative Hypothesis: 𝐻1: µ𝑋 > µ𝑌

Significance level α=0.01

The pooled variance will be given by the expression;

\[ S_p^2 = \frac{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}{n_X + n_Y - 2} = \frac{9 \times 1.789 + 9 \times 2.622}{10 + 10 - 2} = 2.206 \]

The test statistic;

\[ t = \frac{\bar{X} - \bar{Y}}{S_p \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} = \frac{1.9}{\sqrt{2.206 \times 0.2}} = 2.860 \]

Degrees of freedom: $df = n_X + n_Y - 2 = 18$

Critical value $t_{0.01,18} = 2.552$

The decision is to reject the null hypothesis because the test statistic (2.860) is greater than the critical value (2.552). The conclusion is that the new trains run faster than the old trains at the 1% significance level.

iii) Two-Tailed Test with Unknown but Equal Variances

Null Hypothesis: 𝐻𝑜: µ𝑋 = µ𝑌

Alternative Hypothesis: 𝐻1: µ𝑋 ≠ µ𝑌

Significance level α=0.01

The test statistic is the same as the one in part (ii), which is 2.860

The critical values are $\pm t_{0.005,18} = \pm 2.878$

Since the absolute value of the test statistic (2.860) is less than the critical value (2.878), we fail to reject the null hypothesis and conclude that there is no significant difference in speeds between the old and new trains at the 1% significance level.
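The pooled t statistic used in parts ii and iii can be recomputed from the raw data. This is an illustrative sketch using the standard `statistics` module; the critical values 2.552 and 2.878 are the table values quoted in the solution, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

old = [22, 23, 22, 24, 21, 20, 20, 23, 21, 21]   # Old Train (X)
new = [20, 20, 21, 22, 17, 18, 19, 20, 22, 19]   # New Train (Y)

n_x, n_y = len(old), len(new)
s2_x, s2_y = variance(old), variance(new)        # sample variances (n - 1 divisor)

# Pooled variance and equal-variance t statistic
df = n_x + n_y - 2
sp2 = ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / df
t_stat = (mean(old) - mean(new)) / sqrt(sp2 * (1 / n_x + 1 / n_y))

# Critical values quoted in the solution for alpha = 1% and df = 18
t_one_tailed, t_two_tailed = 2.552, 2.878
print(round(sp2, 3), round(t_stat, 3), df)
print("one-tailed reject:", t_stat > t_one_tailed)
print("two-tailed reject:", abs(t_stat) > t_two_tailed)
```

The same statistic is compared against two different cut-offs, which is the source of the differing decisions in parts ii and iii.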

iv) Conclusions from Part ii and Part iii

Part ii (One-Tailed): We reject H0 and conclude that the new trains are faster than the old trains.

Part iii (Two-Tailed): We fail to reject H0, so there is no significant evidence that the speeds of the old and new trains differ. This does not show that the speeds are equal.

These conclusions appear inconsistent because the same test statistic (2.860) leads to rejection in the one-tailed test but not in the two-tailed test.

v) Dilemma Intuition

The dilemma arises because the two tests place the 1% significance level differently. The one-tailed test puts all of α=0.01 in the right tail (critical value 2.552), while the two-tailed test puts 0.005 in each tail (critical value 2.878). The test statistic 2.860 falls between the two critical values, so the decision depends on whether the direction of the effect was specified in advance.

Solution to Question A.9)

i) Probability of gaining a unit of invertebrate from one die:
The die has 6 faces: invertebrate, seed, fish, fruit, rodent, and "invertebrate or seed".
The probability of rolling "invertebrate" is $\frac{1}{6}$
The probability of rolling "invertebrate or seed" is $\frac{1}{6}$, and the player can choose invertebrate.
The probability will be $\frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$
The answer = $\frac{1}{3}$
ii) Probability of at least two uppermost faces being rodents
The probability of rolling "rodent" on one die is $\frac{1}{6}$
With five dice, we use the complement rule such that;
\[ P(\text{at least 2 rodents}) = 1 - P(0 \text{ rodents}) - P(1 \text{ rodent}) \]

\[ P(0 \text{ rodents}) = \left(\frac{5}{6}\right)^5 = 0.4019 \]
\[ P(1 \text{ rodent}) = \binom{5}{1}\left(\frac{1}{6}\right)^1\left(\frac{5}{6}\right)^4 = 0.4019 \]
Therefore, $P(\text{at least 2 rodents}) = 1 - 0.4019 - 0.4019 = 0.1962$
iii) Expected value of X (maximal units of rodent)
X can only be 0, 1, or 2 (since the player picks 2 dice).
P(X = 0): no rodent on any of the five dice, $\left(\frac{5}{6}\right)^5 = 0.4019$

P(X = 1): exactly one rodent among the five dice, so only one of the two picked dice can show a rodent, = 0.4019
P(X = 2): at least two rodents among the five dice, = 1 − 0.4019 − 0.4019 = 0.1962
The formula for the expected value will be given by;
\[ E[X] = 0 \times P(X = 0) + 1 \times P(X = 1) + 2 \times P(X = 2) \]
\[ E[X] = 0 \times 0.4019 + 1 \times 0.4019 + 2 \times 0.1962 = 0.7943 \]
E[X]= 0.7943
iv) Expected value of Y (maximal units of seed)

Y is the maximal number of seed units the player can gain by selecting any two dice from the five. The player can gain seed from the "seed" face and the "invertebrate or seed" face, so each die contributes seed with probability $\frac{2}{6} = \frac{1}{3}$.

Possible values of Y:
• 0: No seed-contributing faces.
• 1: Exactly one seed-contributing face.
• 2: At least two seed-contributing faces.

We calculate the probabilities:

\[ P(Y = 0) = \left(\frac{2}{3}\right)^5 = 0.1317 \]
\[ P(Y = 1) = \binom{5}{1}\left(\frac{1}{3}\right)^1\left(\frac{2}{3}\right)^4 = 0.3292 \]

\[ P(Y = 2) = 1 - P(Y = 0) - P(Y = 1) = 1 - 0.1317 - 0.3292 = 0.5391 \]

\[ E[Y] = 0 \times 0.1317 + 1 \times 0.3292 + 2 \times 0.5391 \]

E[Y]= 1.4074
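Both expected values follow the same pattern: count the useful faces among five dice, then cap the count at the two dice the player may pick. The sketch below is illustrative and assumes five independent fair dice, a rodent probability of 1/6 per die and a seed-contributing probability of 1/3 per die; the function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def expected_capped(n, p, cap=2):
    """E[min(K, cap)]: the player can use at most `cap` of the n dice."""
    return sum(min(k, cap) * binom_pmf(k, n, p) for k in range(n + 1))

p_at_least_two_rodents = 1 - binom_pmf(0, 5, 1 / 6) - binom_pmf(1, 5, 1 / 6)
e_rodent = expected_capped(5, 1 / 6)   # part iii
e_seed = expected_capped(5, 1 / 3)     # part iv

print(round(p_at_least_two_rodents, 4), round(e_rodent, 4), round(e_seed, 4))
```

Values computed this way can differ from the hand calculation in the fourth decimal place, because the hand calculation rounds each probability before summing.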

Solution to Question A.10)

i) Name of the Analysis Technique

The analysis technique used is Multiple Linear Regression.

ii) Interpretation of Coefficients

The coefficients represent the estimated effect of each independent variable on the housing price, holding other variables constant:

• sqr_feet: For each additional square foot, the housing price increases by approximately $40.051 thousand.
• age: For each additional year of age, the housing price decreases by approximately $1.218 thousand.
• dtown_dum: Being close to downtown Boston increases the housing price by approximately $250.628 thousand.
• crime_rate: A one-unit increase in the crime rate decreases the housing price by approximately $75.699 thousand.
• educ_qual: A one-unit increase in school quality increases the housing price by approximately $4.565 thousand.
The intercept (_cons) is $150.742 thousand, representing the baseline price when all other variables are zero.

iii) Defending the Estimation Results

To support the validity of the results:

• Statistical Significance: All coefficients have p-values of 0.000, indicating they are statistically significant at conventional levels.
• Model Fit: The high R-squared (0.756) suggests that 75.6% of the variation in housing prices is explained by the model.
• Assumptions: The model assumes linearity, independence of errors, homoscedasticity, and normality of residuals. Diagnostic tests (e.g., residual plots) would be needed to verify these.

iv) Range of Possible Effects

The 95% confidence intervals for each coefficient provide a range of plausible effects:

• sqr_feet: [35.202, 44.90] thousand dollars per square foot.
• age: [-1.2894, -1.1473] thousand dollars per year.
• dtown_dum: [209.468, 291.788] thousand dollars.
• crime_rate: [-81.7312, -69.6666] thousand dollars per unit.
• educ_qual: [3.872, 5.257] thousand dollars per unit.

These intervals reflect uncertainty in the estimates.

v) Defending Model Validity

To address concerns about the model:

• Explanatory Power: The high R-squared and adjusted R-squared indicate strong explanatory power.
• F-statistic: The significant F-statistic (Prob > F = 0.0000) confirms the overall model fit.
• Robustness: Sensitivity analyses (e.g., adding or removing variables) could be performed to ensure results are robust.
vi) Predicting Housing Price

For an apartment with:

• sqr_feet = 50,
• age = 10,
• dtown_dum = 1,
• crime_rate = 0.2,
• educ_qual = 9,

The predicted price is calculated as:

\[ \hat{y} = 150.742 + 40.051(50) - 1.218(10) + 250.628(1) - 75.699(0.2) + 4.565(9) \]

\[ \hat{y} = 150.742 + 2002.55 - 12.18 + 250.628 - 15.1398 + 41.085 \]

\[ \hat{y} = 2417.6852 \text{ thousand dollars} \]

Thus, the estimated price is approximately $2,417,685.2
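The prediction is a dot product of the estimated coefficients with the apartment's characteristics. This is a minimal sketch using the coefficient values quoted in this solution; the dictionary layout and the function name `predict` are hypothetical.

```python
# Estimated coefficients (thousand dollars), as quoted in the solution
coefficients = {
    "_cons": 150.742,
    "sqr_feet": 40.051,
    "age": -1.218,
    "dtown_dum": 250.628,
    "crime_rate": -75.699,
    "educ_qual": 4.565,
}

def predict(features):
    """Fitted price in thousand dollars: intercept plus coefficient * value."""
    return coefficients["_cons"] + sum(
        coefficients[name] * value for name, value in features.items()
    )

apartment = {"sqr_feet": 50, "age": 10, "dtown_dum": 1, "crime_rate": 0.2, "educ_qual": 9}
price = predict(apartment)
print(round(price, 4))
```

Keeping the coefficients keyed by variable name makes it harder to pair a value with the wrong coefficient when the feature list changes.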

Solution to Bonus Question [15 Points]

a. Z if BVB does not go bankrupt ($X = 0$)

Then Z=Y

b. Z if BVB does go bankrupt ($X = 1$)

Then Z=R*Y

Part (ii) on the conditional probability $P(Z = z \mid X = 0)$

Since Z=Y, the distribution is the same as that of Y;

\[ P(Z = 0.5I \mid X = 0) = 0.1 \]

\[ P(Z = I \mid X = 0) = 0.7 \]

\[ P(Z = 1.5I \mid X = 0) = 0.2 \]

Part (iii) on the conditional probability $P(Z = z \mid X = 1)$

Z=R*Y, where R is 1 with probability 0.4 and 0 otherwise.

Possible outcomes:

Z = 0 (if R = 0): P = 0.6

Z = 0.5I (if R = 1 and Y = 0.5I): P = 0.4 × 0.1 = 0.04

Z = I (if R = 1 and Y = I): P = 0.4 × 0.7 = 0.28

Z = 1.5I (if R = 1 and Y = 1.5I): P = 0.4 × 0.2 = 0.08

Part (iv) on the unconditional probability $P(Z = z)$

We use the law of total probability as follows;

\[ P(Z = z) = P(Z = z \mid X = 0)P(X = 0) + P(Z = z \mid X = 1)P(X = 1) \]

\[ P(X = 0) = 0.7, \quad P(X = 1) = 0.3 \]

We then calculate for each z;

\[ P(Z = 0) = 0 + 0.6 \times 0.3 = 0.18 \]

\[ P(Z = 0.5I) = 0.1 \times 0.7 + 0.04 \times 0.3 = 0.082 \]

\[ P(Z = I) = 0.7 \times 0.7 + 0.28 \times 0.3 = 0.574 \]

\[ P(Z = 1.5I) = 0.2 \times 0.7 + 0.08 \times 0.3 = 0.164 \]

These four probabilities sum to 1.
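The law of total probability can be applied mechanically by enumerating the cases for X, R and Y. The sketch below is illustrative and uses exact fractions; it assumes, as the products in part (iii) do, that R is independent of Y, and it records Z in multiples of I. The variable names are hypothetical.

```python
from fractions import Fraction as F

p_x = {0: F(7, 10), 1: F(3, 10)}                            # X = 1 means bankruptcy
p_y = {F(1, 2): F(1, 10), F(1): F(7, 10), F(3, 2): F(2, 10)}  # Y in multiples of I
p_r1 = F(4, 10)                                             # P(R = 1), assumed independent of Y

p_z = {}

def add_mass(z, p):
    """Accumulate probability mass p on the outcome Z = z."""
    p_z[z] = p_z.get(z, F(0)) + p

for y, py in p_y.items():
    add_mass(y, p_x[0] * py)           # X = 0: Z = Y
    add_mass(y, p_x[1] * p_r1 * py)    # X = 1 and R = 1: Z = Y
add_mass(F(0), p_x[1] * (1 - p_r1))    # X = 1 and R = 0: Z = 0

for z in sorted(p_z):
    print(f"P(Z = {z} I) = {float(p_z[z])}")
```

Checking that the accumulated masses sum to 1 is a quick way to catch an arithmetic slip in any single term.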
