
BAHRIA UNIVERSITY (KARACHI CAMPUS)

FINAL EXAMINATION – Spring – 2023


COURSE TITLE: STATISTICAL INFERENCE (QTM-204)
Solution Key
Session: II
Computer Lab Base
Class: BS-4 (Accounts & Finance) Section: [A, B, C]
Course Instructor: Mr. ISHTIAQ AHMED, Mr. SABAH QAISER Time Allowed: 2 Hours
Date: Thursday, July 6, 2022 Max Marks: 40 marks
Student’s Name: ____________________________ Reg. No: ___________
Note: 1) The question paper must be returned with the answer book. The invigilator also requires each student's soft-copy solution data on CD.
2) No choice is given; attempt ALL questions in the given sequence and in a proper manner.
3) All answers are required in the prescribed answer book only. Don’t write any answer on the question paper.
4) Don’t use pencil for the solution of any question, except for graphs.
5) A scientific calculator and Excel are allowed for computation. Use a separate computation sheet for each question in Excel.
6) The necessary Formula Sheet is attached to the question paper.
7) Data for this final examination is given on CD. Students are advised to ask the examination invigilator for the examination data.

Q.1 Attempt any Three (03) of the following questions (not more than 10 lines each). [3 x 3 = 09 marks]
[30 minutes]
a) Suppose a social researcher has drawn four point estimates, x̅1 = 235, x̅2 = 230, x̅3 = 232, and x̅4 = 231, from the
same unknown population, whose population mean μ is 240.
Required
Help the social researcher identify the good point estimator among the four given point estimators, and justify your
recommendation.

Solution:
Margin of Error or Sampling Error = e1 = x̅1 − μ = 235 – 240 = – 5
Margin of Error or Sampling Error = e2 = x̅2 − μ = 230 – 240 = – 10
Margin of Error or Sampling Error = e3 = x̅3 − μ = 232 – 240 = – 8
Margin of Error or Sampling Error = e4 = x̅4 − μ = 231 – 240 = – 9

The computed sampling errors show that e1 = – 5 has the smallest magnitude (|e1| = 5), i.e. x̅1 lies closest to μ.
Therefore we conclude that x̅1 = 235 is the good and efficient point estimator.
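The comparison above can be sketched in a few lines of Python. This is only an illustration of the rule "smallest absolute sampling error"; the dictionary keys `x1` to `x4` are labels chosen here, not part of the exam.

```python
# Sample means from part (a) and the stated population mean.
estimates = {"x1": 235, "x2": 230, "x3": 232, "x4": 231}
mu = 240

# Sampling error e_i = x̄_i − μ for each estimate.
errors = {name: value - mu for name, value in estimates.items()}

# The preferred estimate is the one whose error is smallest in magnitude.
best = min(errors, key=lambda name: abs(errors[name]))

print(errors)
print(best, estimates[best])
```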
b) Data for four finite populations A, B, C and D are given in the following table:
Population        A            B            C            D
Population size   NA = 2,500   NB = 3,000   NC = 2,000   ND = 4,000
A total of n = 50 samples is to be collected from the populations A, B, C and D collectively.
Required:
Use the proportional allocation method of stratified random sampling to determine nA = ?, nB = ?, nC = ?, nD = ? respectively.

Page 1 of 10 Statistical Inference (QTM-204) BS-4A (A & F) Spring - 2023


Solution
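Using the proportional allocation method (stratified random sampling procedure):
Given
N = NA + NB + NC + ND = 11,500
n = 50
NA = 2,500, NB = 3,000, NC = 2,000, ND = 4,000
Required: nA = ?, nB = ?, nC = ?, nD = ?
We know that
$\frac{N}{n} = \frac{N_A}{n_A} \;\Rightarrow\; n_A = \frac{n\,N_A}{N}$
nA = 50(2,500)/11,500 = 10.87 ≅ 11
nB = 50(3,000)/11,500 = 13.04 ≅ 13
nC = 50(2,000)/11,500 = 8.70 ≅ 09
nD = 50(4,000)/11,500 = 17.39 ≅ 17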
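A minimal Python sketch of the same proportional allocation, using the stratum sizes from the table. Simple rounding happens to sum to n = 50 here; in general the rounded shares may need a manual adjustment, which this sketch does not attempt.

```python
# Stratum sizes from the question and the total sample size.
strata = {"A": 2500, "B": 3000, "C": 2000, "D": 4000}
n = 50

N = sum(strata.values())  # total population size

# Proportional allocation: n_h = n * N_h / N for each stratum h.
raw = {name: n * size / N for name, size in strata.items()}
allocation = {name: round(share) for name, share in raw.items()}

print(N)
print({name: round(share, 2) for name, share in raw.items()})
print(allocation, sum(allocation.values()))
```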

c) Discuss the effect of the sampling error (margin of error) on the interval estimate at several confidence levels, with a suitable real-life
example.

Solution:
An interval estimate of a population parameter is a range of values within which the parameter is expected to lie at a given
confidence level β. The width of the interval estimate increases with the confidence level β: when the confidence level is raised,
the margin of error, and hence the width of the interval, increases; when the confidence level is lowered, the width decreases.
To illustrate, consider three confidence levels β and observe their impact on the width of the interval estimate of the
population mean (in hours) with the same point estimator.
Case No. 1: β1 = 90%
90% Confidence Interval of population mean = Prob.[340 hr ≤ μ ≤ 655 hrs], width = 315 hrs
Case No. 2: β2 = 85%
85% Confidence Interval of population mean = Prob.[345 hr ≤ μ ≤ 644 hrs], width = 299 hrs
Case No. 3: β3 = 95%
95% Confidence Interval of population mean = Prob.[330 hr ≤ μ ≤ 665 hrs], width = 335 hrs
In the above examples, Case No. 3 shows that at the 95% confidence level the interval for the population mean μ is wider than in
the other two cases 1 and 2.
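The effect of the confidence level on the interval width can be checked numerically. The sketch below uses the interval x̄ ± Z(α/2)·σ/√n from the formula sheet; the values of `mean`, `sigma` and `n` are hypothetical and are not the data behind the three cases above.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, sigma, n, level):
    """Two-sided Z interval for a mean at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / sqrt(n)  # margin of error
    return mean - half_width, mean + half_width


# Hypothetical inputs, chosen only for illustration.
mean, sigma, n = 500.0, 400.0, 16

widths = {}
for level in (0.85, 0.90, 0.95):
    lower, upper = z_interval(mean, sigma, n, level)
    widths[level] = upper - lower
    print(f"{level:.0%}: [{lower:.1f}, {upper:.1f}] width = {widths[level]:.1f}")
```

With the point estimate held fixed, the width grows as the confidence level rises.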
d) Explain the significant relationship between required sample size (n) and sampling error (margin of error). Also discuss effect on
standard error of inferential statistics.

Solution:
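For a given confidence level and σ, the required sample size (n) is inversely proportional to the square of the margin of error (e), other factors being held constant.
We know that
the Z-statistic is the ratio of the margin of error to the standard error; mathematically it can be written as
$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$, where $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ if the population is large and unknown, and $e = \bar{x} - \mu_{\bar{x}}$
$\pm Z_{\alpha/2} = \frac{e}{\sigma_{\bar{x}}}$
$\frac{\sigma}{\sqrt{n}} = \frac{e}{Z_{\alpha/2}}$
$\frac{\sqrt{n}}{\sigma} = \frac{Z_{\alpha/2}}{e}$
$\sqrt{n} = \frac{Z_{\alpha/2}\,\sigma}{e}$
$n = \frac{(Z_{\alpha/2}\,\sigma)^2}{e^2}$
$n = \frac{K^2}{e^2}$, where $K = Z_{\alpha/2}\,\sigma$ is a constant
$n \propto \frac{1}{e^2}$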
The obtained relationship indicates that if the margin of error increases, the required sample size decreases, and if the margin of
error decreases, the required sample size (n) increases. A larger sample also reduces the standard error σ/√n, so the statistical
analysis is more efficient than with a smaller sample, in line with the Central Limit Theorem. Suppose the sample mean is x̅ and the
population mean is μ; then the sampling error or margin of error ‘e’ can be written mathematically as e = x̅ − μ. Our inferences about
the population parameter depend on the margin of error ‘e’, because a smaller ‘e’ gives a more accurate estimate of the population
parameter than a larger ‘e’. For example, of e = 0.2 and e = 0.3, e = 0.2 gives better inferences about the population parameter.
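The inverse-square relationship can be seen by evaluating n = (Z(α/2)·σ/e)² at two margins of error. In this sketch `sigma`, the confidence level and the two values of e are hypothetical; halving e should multiply the required n by four and halve the standard error σ/√n.

```python
from math import sqrt
from statistics import NormalDist


def required_n(sigma, e, level):
    """Sample size from n = (Z * sigma / e)^2, left unrounded."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (z * sigma / e) ** 2


# Hypothetical inputs, for illustration only.
sigma, level = 10.0, 0.95

n_at_e = required_n(sigma, 2.0, level)       # margin of error e = 2
n_at_half_e = required_n(sigma, 1.0, level)  # margin of error e = 1

print(round(n_at_e, 2), round(n_at_half_e, 2))
print(round(sigma / sqrt(n_at_e), 4), round(sigma / sqrt(n_at_half_e), 4))
```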
f) A study is to be made to estimate the percentage of citizens in a town who favor having their water fluoridated.

Required
How large must the samples (n1 and n2) be if one wishes to be at least 90% and 96% confident that the estimate is within 1.5% of
the true percentage?

Solution:
For 90% Confidence Interval
Marginal error (sampling error) e = p̂ – p = 1.5% = 0.015
Sample proportion p̂ = 0.5 (no prior estimate is given, so the most conservative value is used)
q̂ = 0.5
Level of significance α = 10%
α/2 = 5% = 0.05
Confidence Level β = 90%
Z0.05 = ±1.64485
Now required sample size
$n = \left[\frac{Z_{\alpha/2}}{e}\right]^2 (\hat{p}\hat{q}) \;\Rightarrow\; n_1 = \left[\frac{\pm 1.64485}{0.015}\right]^2 (0.5)(0.5)$
⟹ n1 = 3,006.16 ≅ 3,006 samples
For 96% Confidence Interval
Marginal error (sampling error) e = p̂ – p = 1.5% = 0.015
Sample proportion p̂ = 0.5
q̂ = 0.5
Level of significance α = 4%
α/2 = 2% = 0.02
Confidence Level β = 96%
Z0.02 = ±2.05375
Now required sample size
$n = \left[\frac{Z_{\alpha/2}}{e}\right]^2 (\hat{p}\hat{q}) \;\Rightarrow\; n_2 = \left[\frac{\pm 2.05375}{0.015}\right]^2 (0.5)(0.5)$
⟹ n2 = 4,686.54 ≅ 4,687 samples
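The two sample sizes can be reproduced with the standard library. The function name `sample_size_for_proportion` is introduced here for illustration; it implements the formula n = (Z(α/2)/e)²·p̂q̂ used in the solution.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(e, level, p_hat=0.5):
    """Unrounded n = (Z / e)^2 * p_hat * q_hat for a two-sided interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return (z / e) ** 2 * p_hat * (1 - p_hat)


n1 = sample_size_for_proportion(0.015, 0.90)
n2 = sample_size_for_proportion(0.015, 0.96)

print(round(n1, 2), ceil(n1))
print(round(n2, 2), ceil(n2))
```

The key rounds n1 to the nearest whole number; rounding up instead, which is the usual convention when the margin of error must not be exceeded, gives one more observation for n1.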



Q.2a Define the Central Limit Theorem and its purpose and utility in statistical inference. [02 marks]
[10 Minutes]
Solution:
The Central Limit Theorem is a core theoretical concept of statistical inference. It states the relation between the
population distribution and the sampling distribution of the sample mean: as the sample size gradually increases, the shape of the
sampling distribution approaches a normal distribution, whatever the shape of the population distribution (provided its
variance is finite). At a large enough sample size the sampling distribution of the mean is very close to normal, with mean μ
and standard error σ/√n, which is what allows Z-based interval estimates and hypothesis tests about population means.
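A small simulation can illustrate the theorem. This sketch draws repeated samples from a skewed Exponential(1) population (mean 1, standard deviation 1), an assumption made here purely for illustration, and compares the spread of the sample means with σ/√n. The seed and the number of trials are arbitrary.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_means(n, trials=5000):
    """Means of `trials` samples of size n from an Exponential(1) population."""
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]


for n in (2, 10, 50):
    means = sample_means(n)
    # Columns: n, mean of sample means, their spread, theoretical sigma/sqrt(n).
    print(n, round(mean(means), 3), round(pstdev(means), 3), round(1 / n ** 0.5, 3))
```

The mean of the sample means stays near the population mean while their spread shrinks roughly like 1/√n; a histogram of `means` would show the shape becoming more symmetric as n grows.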

Q.2b Write the significant differences between the Simple Random Sampling Procedure and the Systematic Random Sampling Procedure;
also write their advantages: [02 marks]
[10 Minutes]
Solution:
Simple Random Sampling Procedure
• In simple random sampling, each data point has an equal probability of being chosen.
• Simple random sampling is best used for smaller data sets and can produce more representative results.

Systematic Random Sampling Procedure
• Systematic sampling chooses a data point at each predetermined interval.
• Systematic sampling is easier to execute than simple random sampling, but it can produce skewed results if the data set
exhibits patterns.
• It is also more easily manipulated.
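The two procedures can be contrasted on a small numbered frame. In this sketch the population of 100 units, the sample size and the seed are hypothetical; the sampling interval k = N/n follows the formula sheet.

```python
import random

population = list(range(1, 101))  # hypothetical sampling frame of 100 units
n = 10
rng = random.Random(42)  # fixed seed so the run is repeatable

# Simple random sampling: every unit has the same chance of selection.
srs = rng.sample(population, n)

# Systematic sampling: random start, then every k-th unit, with k = N / n.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k][:n]

print(sorted(srs))
print(systematic)
```

If the frame has a pattern with period k, the systematic sample picks up the same phase of that pattern every time, which is the source of the skew mentioned above.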

Q.3 Consider sample heights (in cm) of men from two different populations A and B with the same demographics: [09 marks]
[20 Minutes]
Population – A
54 64 56 51 45 65 73
65 72 88 56 59 93 73
45 54 72 48 45 67 46
43 69 60 81
Population – B
45 56 72 90 55 49 59
56 78 78 68 80 61 65
57 72 56 59 50 84 69
65 54 48 77 66 91 86
73 58 63 86
Required:
a) Construct an interval estimate of the difference between the population means (μ1 − μ2) at the 94% confidence level.
b) Test the hypothesis of a difference between the means of populations A and B at the 3% level of significance. Compute the
p-value and state the type of statistical error.
Solution:
a) Given (population B is taken as sample 1 and population A as sample 2):
x̅B = 66.44 = x̅1, x̅A = 61.76 = x̅2
σB = 12.76 = σ1, σA = 13.49 = σ2
nB = 32 = n1, nA = 25 = n2
(x̅1 − x̅2) = 4.68
β = 94%
α = 6% = 0.06
α/2 = 0.03
$Z_{\alpha/2} = Z_{0.03} = \pm 1.88079$
$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{5.09 + 7.28} = 3.52$
Hence
94% Confidence Interval for the difference (μ1 − μ2) = $Prob.\left[(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\,\sigma_{\bar{x}_1-\bar{x}_2}\right]$
= Prob.[(4.68) ± (1.88079)(3.52)]
= Prob.[4.68 ± 6.62]
= Prob.[−1.94 ≤ μ1 − μ2 ≤ 11.30]
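The interval can be recomputed directly from the data. This sketch assumes the table splits into 25 values for population A and 32 for population B, which matches the stated sample sizes and means, and it uses the population standard deviation (`pstdev`), which reproduces the key's σ values.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

pop_a = [54, 64, 56, 51, 45, 65, 73, 65, 72, 88, 56, 59, 93, 73,
         45, 54, 72, 48, 45, 67, 46, 43, 69, 60, 81]
pop_b = [45, 56, 72, 90, 55, 49, 59, 56, 78, 78, 68, 80, 61, 65,
         57, 72, 56, 59, 50, 84, 69, 65, 54, 48, 77, 66, 91, 86,
         73, 58, 63, 86]

# Sample 1 is population B and sample 2 is population A, as in the solution.
diff = mean(pop_b) - mean(pop_a)
se = sqrt(pstdev(pop_b) ** 2 / len(pop_b) + pstdev(pop_a) ** 2 / len(pop_a))

level = 0.94
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
lower, upper = diff - z * se, diff + z * se

print(round(diff, 2), round(se, 2), round(z, 5))
print(round(lower, 2), round(upper, 2))
```

Small differences from the hand calculation in the last decimal place come from rounding the intermediate values.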
b) Testing of Hypothesis

Step-1: Null Hypothesis: H0 : μ1 = μ2, i.e. μ1 − μ2 = 0

Step-2: Alternative Hypothesis: HA : μ1 ≠ μ2

Step-3: Non-directional or two-tail hypothesis test

Step-4: Level of Significance α = 3% = 0.03
α/2 = 1.5% = 0.015

Step-5: Given Data n1 = 32, n2 = 25
x̅1 = 66.44, x̅2 = 61.76
σ1 = 12.76, σ2 = 13.49
$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = 3.52$

Step-6: Critical Value (Tabulated Value) $Z_{\alpha/2} = Z_{Tab} = Z_{0.015} = \pm 2.17008$
∵ both-tail (non-directional) test

Step-7: Pre-Test Normal Curve

Step-8: Test Statistic (Calculated Value) $Z_{Cal} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1-\bar{x}_2}}$
$Z_{Cal} = \frac{(4.68) - 0}{3.52}$
$Z_{Cal} = \pm 1.3301$

Step-9: P-value (Actual Rejection Region) The p-value is obtained at Z_Cal = −1.3301.
The probability below −1.3301 is 0.091646; for a non-directional test the p-value is twice this
probability, p-value = 0.183291 = 18.3291%

Step-10: Complement of the p-value 1 – (p-value) = 1 – 0.183291 = 0.816709 or 81.6709%.
Note that the power of a test is 1 – P(Type-II error), which depends on the true difference
between the means and cannot be obtained from the p-value alone.

Step-11: Types of Statistical Error The calculated value falls in the acceptance region, because
|Z_Cal| < Z_Tab, i.e. 1.3301 < 2.17008,
so the only error possible with this decision is a Type-II statistical error (not rejecting a false null hypothesis).

Step-12: Post Test Normal Curve

Step-13: Decision: We do not reject the null hypothesis at the given level of significance (3%),
since the p-value (18.3291%) is greater than α (3%). There is no significant evidence of a difference
between the two population means. If the null hypothesis is in fact false, a Type-II statistical error has occurred.
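The test decision can be checked with a short script. It starts from the rounded difference and standard error reported in the solution (4.68 and 3.52), so the statistic and p-value may differ from the hand calculation in the later decimal places.

```python
from statistics import NormalDist

# Rounded values taken from the solution above.
diff, se, alpha = 4.68, 3.52, 0.03

z_cal = (diff - 0) / se                           # test statistic under H0: mu1 - mu2 = 0
z_tab = NormalDist().inv_cdf(1 - alpha / 2)       # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))  # two-tailed p-value
reject = abs(z_cal) > z_tab

print(round(z_cal, 4), round(z_tab, 5), round(p_value, 4), reject)
```

Both comparisons lead to the same decision: the statistic is inside the critical values and the p-value exceeds α, so the null hypothesis is not rejected.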
Q.4a What is the purpose of using Analysis of Variance (ANOVA) in statistical data analysis? [02 marks]
[10 minutes]
Solution:
Analysis of Variance (ANOVA) is a statistical tool for testing hypotheses about more than two unknown population means. The Z-test
is restricted to testing at most two population means, whereas ANOVA lets us test whether there is a significant difference among
more than two means. ANOVA is known as the F-ratio test of two variances. The type of ANOVA test used depends on a number of
factors. It is applied to experimental data. It is simple to use, best suited for small samples, and can be computed by hand when
there is no access to statistical software. With many experimental designs, the sample sizes have to be the same for the various
factor-level combinations.

There are two main classifications of ANOVA, as follows:

• One-Way Classification ANOVA test, where column effects can be observed, and
• Two-Way Classification ANOVA test, where row and column effects can be observed together.

Q.4b The following data represent the final grades obtained by 5 students in Statistics, Economics, Chemistry, and Physics:
[08 marks]
[20 minutes]
Table
Subject
Students Statistics Economics Chemistry Physics
A 75 65 73 80
B 45 67 Absent 56
C 67 Absent 62 67
D 67 54 83 Absent
E Absent 73 54 69

Required:
Use a 0.05 level of significance to test the hypothesis that
i) The courses are of equal difficulty;
ii) The students have equal ability.

Solution:
The following steps of hypothesis testing lead to the required conclusions from the given information.

Step-1: Null Hypothesis: H′0 : μS = μE = μC = μP (column effects are zero)
H″0 : μA = μB = μC = μD = μE (row effects are zero)
Step-2: Alternative Hypothesis: H′A : At least two subject means are not equal
H″A : At least two student means are not equal
Step-3: Directional or Right tail test
Step-4: Level of Significance α = 5% = 0.05
Step-5: Degrees of Freedom vR = R − 1 = 5 − 1 = 4
vC = C − 1 = 4 − 1 = 3
vE = (R − 1)(C − 1) = (4)(3) = 12
Step-6: Critical Value (Tabulated Value) f_TabR = f0.05,(vR,vE) = f0.05,(4,12) = 3.259167
f_TabC = f0.05,(vC,vE) = f0.05,(3,12) = 3.490295
Step-7: Pre-Test ANOVA
Step-8: Test Statistic (Calculated Value) $f_{CalR} = \frac{S_R^2}{S_E^2}$, $f_{CalC} = \frac{S_C^2}{S_E^2}$
Step-9: Computation:
Table
Subject
Students Statistics Economics Chemistry Physics Row Total
A 75 65 73 80 293
B 45 67 0 56 168
C 67 0 62 67 196
D 67 54 83 0 204
E 0 73 54 69 196
Column Total 254 259 272 272 1057

Now we have to compute the following values.

SST: Total Sum of Squares
SSR: Sum of Squares of Rows
SSC: Sum of Squares of Columns
SSE: Sum of Squares of Error
where SST = SSR + SSC + SSE
Here N = 16 is the number of observed grades, and each row or column total is divided by the number of grades present in it.
$SST = \sum x^2 - \frac{(\sum x)^2}{N} = 71{,}351 - \frac{(1{,}057)^2}{16} = 71{,}351 - 69{,}828.0625 = 1{,}522.938$
$SSR = \frac{R_1^2}{C_1} + \frac{R_2^2}{C_2} + \frac{R_3^2}{C_3} + \cdots - \frac{(\sum x)^2}{N} = \frac{(293)^2}{4} + \frac{(168)^2}{3} + \frac{(196)^2}{3} + \frac{(204)^2}{3} + \frac{(196)^2}{3} - \frac{(1{,}057)^2}{16} = 524.8542$
$SSC = \frac{C_1^2}{r_1} + \frac{C_2^2}{r_2} + \frac{C_3^2}{r_3} + \cdots - \frac{(\sum x)^2}{N} = \frac{(254)^2}{4} + \frac{(259)^2}{4} + \frac{(272)^2}{4} + \frac{(272)^2}{4} - \frac{(1{,}057)^2}{16} = 63.1875$
SSE = SST – SSR – SSC = 1,522.938 – 524.8542 – 63.1875 = 934.8958
Step-10: ANOVA Table for f_Cal:
Source | Sum of Squares (SS) | DF | Mean Square | f_Cal
Row Means | SSR = 524.8542 | vR = 4 | S_R² = SSR/vR = 524.8542/4 = 131.2135 | f_CalR = S_R²/S_E² = 131.2135/77.908 = 1.684
Column Means | SSC = 63.1875 | vC = 3 | S_C² = SSC/vC = 63.1875/3 = 21.0625 | f_CalC = S_C²/S_E² = 21.0625/77.908 = 0.270351
Error | SSE = 934.8958 | vE = 12 | S_E² = SSE/vE = 934.8958/12 = 77.908 |
Total | SST = 1,522.938 | n-1 = 19 | |

Step-11: P-value (Actual Rejection Region): 1) The p-value is obtained at f_CalR = 1.684:
from the F table, the probability beyond 1.684 at d.f. (4, 12) is approximately 0.217884 or 21.7884%.
Hence the actual probability of the rejection region is 21.7884% instead of 5%.
2) The p-value is obtained at f_CalC = 0.270351: from the F table, the probability beyond 0.270351
at d.f. (3, 12) is approximately 0.845575 or 84.5575%. Hence the actual
probability of the rejection region is 84.5575% instead of 5%.
Step-12: Complement of the p-value For row effects, 1 – (p-value) = 1 – 0.217884 ≈ 0.7821163 or 78.21163%.
For column effects, 1 – (p-value) = 1 – 0.845575 ≈ 0.1544245 or 15.44245%.
Note that the power of a test is 1 – P(Type-II error), which depends on the true effects
and cannot be obtained from the p-value alone.
Step-13: Types of Statistical Error For row effects the calculated value falls in the acceptance region,
because f_CalR < f_TabR, i.e. 1.684 < 3.259167.
For column effects the calculated value also falls in the acceptance region,
because f_CalC < f_TabC, i.e. 0.270351 < 3.490295.
In both cases the only error possible is a Type-II statistical error (not rejecting a false null hypothesis).
Step-14: Post Test ANOVA Curve

Step-15: Decisions: a) Do not reject H′0 and conclude that there is no significant difference in the
difficulty of the courses at the 5% level of significance, since the
p-value (84.5575%) is greater than 5%.
b) Do not reject H″0 and conclude that there is no significant difference in the students'
ability across the offered courses at the 5% level of significance, since the
p-value (21.7884%) is greater than 5%.
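The sums of squares can be recomputed from the grade table. This sketch follows the key's method: absent grades are skipped, each row or column total is divided by the number of grades present, and the degrees of freedom (4, 3, 12) are taken from the key as given rather than derived from the unbalanced data.

```python
# Grades by student (rows) and subject (columns); None marks an absent grade.
grades = {
    "A": [75, 65, 73, 80],
    "B": [45, 67, None, 56],
    "C": [67, None, 62, 67],
    "D": [67, 54, 83, None],
    "E": [None, 73, 54, 69],
}

values = [x for row in grades.values() for x in row if x is not None]
N = len(values)
correction = sum(values) ** 2 / N  # (sum of x)^2 / N

sst = sum(x * x for x in values) - correction


def group_ss(groups):
    """Sum over groups of total^2 / count, minus the correction term."""
    total = 0.0
    for group in groups:
        present = [x for x in group if x is not None]
        total += sum(present) ** 2 / len(present)
    return total - correction


ssr = group_ss(grades.values())        # rows (students)
ssc = group_ss(zip(*grades.values()))  # columns (subjects)
sse = sst - ssr - ssc

v_r, v_c, v_e = 4, 3, 12  # degrees of freedom as used in the key
f_row = (ssr / v_r) / (sse / v_e)
f_col = (ssc / v_c) / (sse / v_e)

print(N, round(sst, 4), round(ssr, 4), round(ssc, 4), round(sse, 4))
print(round(f_row, 4), round(f_col, 6))
```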

Q.5 The following data set gives the gender and grades of 60 business students: [08 marks]
[20 minutes]
Gender Grades Gender Grades Gender Grades Gender Grades
Female A Female B Female A Female B
Female C Male A Female C Male A
Male D Male A Male D Male C
Male B Male B Female A Female B
Female C Female D Male A Female B
Male D Female C Male B Female C
Female D Male D Female C Male B
Male C Male C Male D Male A
Male C Female D Female D Male C
Female C Male D Female C Female B
Male D Female D Male B Female D
Female A Male B Female D Male D
Male C Male A Female C Female C
Female B Female B Male A Male D
Female A Female A Male A Female D

Required:
a) Construct a contingency (Cross table) and fill-in.
A B C D Total
Female

Male

Total
b) Fill in above table of observed frequencies for gender and grade.
c) Calculate the sample Chi-Square value.
d) State the null and alternative hypothesis.
e) Perform all steps of testing of hypothesis

Solution:
The following steps of hypothesis testing lead to the required conclusion from the given information.
Step-1: Null Hypothesis: H0 : pA = pB = pC = pD (grade is independent of gender)
Step-2: Alternative Hypothesis: HA : At least two proportions are not equal
Step-3: Directional or Right tail test
Step-4: Level of Significance α = 5% = 0.05
Step-5: Degrees of Freedom v = (C − 1)(R − 1) = (4 − 1)(2 − 1)
v = 3
Step-6: Critical Value (Tabulated Value) $X^2_{(\alpha,v)} = X^2_{(0.05,3)} = 7.815$
Step-7: Pre-Test Chi-Square Curve
Step-8: Test Statistic (Calculated Value) $X^2_{Cal} = \sum\left[\frac{(o_i - e_i)^2}{e_i}\right]$
Step-9: Computation:
A B C D Total
Female 6 7 10 8 31

Male 8 6 6 9 29

Total 14 13 16 17 60
$e_1 = \frac{(31)(14)}{60} = 7.233$    $e_5 = \frac{(29)(14)}{60} = 6.767$
$e_2 = \frac{(31)(13)}{60} = 6.717$    $e_6 = \frac{(29)(13)}{60} = 6.283$
$e_3 = \frac{(31)(16)}{60} = 8.267$    $e_7 = \frac{(29)(16)}{60} = 7.733$
$e_4 = \frac{(31)(17)}{60} = 8.783$    $e_8 = \frac{(29)(17)}{60} = 8.217$
o_i    e_i      (o_i − e_i)    (o_i − e_i)²    (o_i − e_i)²/e_i
6      7.233    -1.233         1.5211          0.21029
7      6.717    0.283          0.0803          0.01195
10     8.267    1.733          3.0044          0.36344
8      8.783    -0.783         0.6136          0.06986
8      6.767    1.233          1.5211          0.22479
6      6.283    -0.283         0.0803          0.01278
6      7.733    -1.733         3.0044          0.38851
9      8.217    0.783          0.6136          0.07468
X²_Cal                                         1.35630
Step-10: P-value (Actual Rejection Region) The p-value is obtained at X²_Cal = 1.3563:
from the X² table at 3 d.f., the probability beyond
1.3563 is approximately 0.716 or 71.6%.
Hence the actual probability of the rejection region is about 71.6%.
Step-11: Complement of the p-value 1 – (p-value) = 1 – 0.716 = 0.284 or 28.4%.
Note that the power of a test is 1 – P(Type-II error), which cannot be
obtained from the p-value alone.
Step-12: Types of Statistical Error The calculated value falls in the acceptance region, because
X²_Cal < X²_Tab, i.e. 1.3563 < 7.815,
so the only error possible is a Type-II statistical error.
Step-13: Post Test Chi-Square Curve
Step-14: Decision: We do not reject the null hypothesis at the 5% level of
significance, since the p-value (about 71.6%) is greater than 5%. There is no
significant evidence that grade depends on gender. If the null hypothesis
is in fact false, a Type-II statistical error has occurred.
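The chi-square statistic can be recomputed from the observed frequencies alone. In this sketch the expected counts come from row total × column total / n; the p-value line uses a closed form for the chi-square tail that holds only for 3 degrees of freedom, chosen here to avoid a dependency outside the standard library. Recomputed this way, the statistic is about 1.356.

```python
from math import erfc, exp, pi, sqrt

# Observed frequencies: rows are Female, Male; columns are grades A, B, C, D.
observed = [[6, 7, 10, 8],
            [8, 6, 6, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
dof = (len(observed) - 1) * (len(col_totals) - 1)
critical = 7.815  # tabulated value at alpha = 0.05 and 3 d.f., from the key

# Upper-tail probability of chi-square; this closed form is valid for 3 d.f. only.
p_value = erfc(sqrt(chi2 / 2)) + sqrt(2 * chi2 / pi) * exp(-chi2 / 2)

print(round(chi2, 4), dof, round(p_value, 3), chi2 < critical)
```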
- : * * * * * Good Luck * * * * * : -



Formula Sheet
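$\bar{x} = \frac{\sum x}{n}$
$s^2 = \frac{\sum (x - \bar{x})^2}{n-1}$
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
$\sigma_{\hat{p}} = \sqrt{\frac{\hat{p}\hat{q}}{n}}$
$\sigma_{\hat{p}_1-\hat{p}_2} = \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$
$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Z-Statistic: $\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
Z-Statistic: $\frac{\hat{p} - p}{\sigma_{\hat{p}}}$
t-Statistic: $\frac{\bar{x} - \mu}{s/\sqrt{n}}$
Z-score: $\frac{x - \mu}{\sigma}$

For Possible Sample Size: $N^2$
For Possible Sample Size: $\frac{N!}{(N-n)!}$
Finite Population Multiplier: $\sqrt{\frac{N-n}{N-1}}$
$\hat{p} = \frac{x}{n}$
$K = \frac{N}{n}$
$e = \bar{x} - \mu$

(1-α)% C.I. for μ = $Prob.\left(\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$
(1-α)% C.I. for μ = $Prob.\left(\bar{x} \pm t_{(\alpha/2,\,v)} \frac{s}{\sqrt{n}}\right)$
(1-α)% C.I. for p = $Prob.\left(\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}\right)$
(1-α)% C.I. for (p1 − p2) = $Prob.\left[(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\,\sigma_{\hat{p}_1-\hat{p}_2}\right]$
(1-α)% C.I. for (μ1 − μ2) = $Prob.\left[(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\,\sigma_{\bar{x}_1-\bar{x}_2}\right]$
$n = \left[\frac{Z_{\alpha/2}\,\sigma}{e}\right]^2$
$n = \frac{Z_{\alpha/2}^2}{e^2}\,\hat{p}\hat{q}$

One-Way Classification ANOVA
$F_{cal} = \frac{S_1^2}{S_2^2}$
$SST = \sum x^2 - \frac{(\sum x)^2}{N}$
$SSC = \frac{T_1^2}{r_1} + \frac{T_2^2}{r_2} + \frac{T_3^2}{r_3} + \frac{T_4^2}{r_4} - \frac{(\sum x)^2}{N}$
SSE = SST – SSC
$v_1 = C - 1$, $v_2 = N - C$

Two-Way Classification ANOVA
$f_{CalR} = \frac{S_R^2}{S_E^2}$
$f_{CalC} = \frac{S_C^2}{S_E^2}$
$SST = \sum x^2 - \frac{(\sum x)^2}{N}$
$SSR = \frac{R_1^2}{C_1} + \frac{R_2^2}{C_2} + \frac{R_3^2}{C_3} - \frac{(\sum x)^2}{N}$
$SSC = \frac{C_1^2}{r_1} + \frac{C_2^2}{r_2} + \frac{C_3^2}{r_3} - \frac{(\sum x)^2}{N}$
SSE = SST – SSR – SSC
$v_R = R - 1$, $v_C = C - 1$, $v_E = (R - 1)(C - 1)$

Chi-Square
$X^2_{Cal} = \sum\left[\frac{(o_i - e_i)^2}{e_i}\right]$

$\hat{b}$ – Slope = $\frac{n\sum xy - \sum x \cdot \sum y}{n\sum x^2 - (\sum x)^2}$
$\hat{a} = \frac{\sum y - \hat{b}\sum x}{n} = \frac{\sum y}{n} - \frac{\hat{b}\sum x}{n} = \bar{y} - \hat{b}\bar{x}$
Standard Deviation = $\sqrt{\frac{(y - \hat{y})^2}{n-2}}$, Standard Deviation = $\sqrt{\frac{(y - \hat{y})^2}{n-5}}$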
