PROBABILITY AND STATISTICS SUMMARY
m1 + m2 + … + mn
EX: There are 5 types of flowers in the carton: 2 red flowers, 2 yellow flowers, 1 blue
flower, 1 white flower and 1 pink flower. Pick 1 flower from the carton for an
arrangement. How many ways are there to choose?
- Solution:
Ways to choose a red flower: 2
Ways to choose a yellow flower: 2
Ways to choose a blue flower: 1
Ways to choose a white flower: 1
Ways to choose a pink flower: 1
2 + 2 + 1 + 1 + 1 = 7 ways to choose a flower for the arrangement
1.3. Multiplication rule
Identifying signs: the task splits into stages, and every stage must be completed to get a result
m1 · m2 · … · mn
EX: A store sells shirts in 2 sizes: size 39 in 5 different colors and size 40 in 4
different colors. How many ways are there to choose 2 shirts, one of each size?
Solution:
Size 39 shirt: 5 options
Size 40 shirt: 4 options
5 · 4 = 20 ways to choose 2 shirts, one of each size
1.4. Permutation
Identifying signs: take all the elements and arrange all of them
Note: all permutations contain the same elements and differ only in the order in
which the elements are arranged.
Pm = m! = 1 · 2 · 3 · … · m
EX: Arrange 6 teddy bears in a cabinet with 6 drawers, one bear per drawer. How many
ways are there to arrange them?
Solution: P6 = 6! = 1 · 2 · 3 · 4 · 5 · 6
= 720 ways to arrange the bears in the cabinet
EX: Class 12A1 has 5 candidates for the class executive committee. Elect 3 of the 5
as class president, class vice president and secretary, in that order. How many ways are
there to choose?
Solution:
$A_5^3 = \frac{5!}{(5-3)!} = 60$ ways to choose
1.6. Combination
Identifying signs: an unordered group of k different elements chosen from n elements.
$C_n^k = \frac{n!}{k!(n-k)!}$
EX: A basketball match organized by the school requires each class to enter 7 male
players. Class 12C5 has 25 boys. How many ways are there to choose 7 boys from
class 12C5 to compete in basketball?
Solution:
$C_{25}^7 = \frac{25!}{7!(25-7)!} = 480700$ ways to choose
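The three counting examples above can be checked with Python's standard library. This is only a sketch; the variable names are illustrative and not part of the notes.

```python
import math

# Permutation of 6 items (teddy-bear example): P6 = 6!
p6 = math.factorial(6)

# Ordered choice of 3 out of 5 (president, vice president, secretary)
a_5_3 = math.perm(5, 3)

# Unordered choice of 7 out of 25 (basketball example)
c_25_7 = math.comb(25, 7)

print(p6, a_5_3, c_25_7)  # 720 60 480700
```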
CHAPTER 2: BASIC CONCEPTS AND PROBABILITY FORMULAS
2.1. RANDOM EXPERIMENT AND EVENT
2.1.1. Random phenomenon:
2.1.2. Random experiment and event:
Notation for the sample space: Ω.
EX: Roll one fair die: Ω = {1;2;3;4;5;6}
Solution:
Let A be the event of “an even number appearing” A={2;4;6}
Let B be the event of “an odd number appearing” B={1;3;5}
EX: A flower basket holds 2 kinds of flowers for sale. Each kind comes as type 1
products, type 2 products and waste products. What are the simple events?
Solution: Let A be the set of simple events, A = {2 type 1 flowers; 2 type 2 flowers;
2 waste products}
EX: Take 5 cans from a box containing 5 cans of beer and 3 cans of soft drink.
Which event is the certain (sure) event and which is the impossible (empty) event?
Solution:
Let A be the certain event. A = {get at least 2 cans of beer}
Let B be the impossible event. B = {get 5 cans of soft drink}
2.1.3. Relationships between events:
a.) Implication and equivalence
- A implies B when A ⊂ B
- A and B are equivalent when A ⊂ B and B ⊂ A. Notation: A = B
EX: Roll a die. Let A be the event "the die shows 5" and B the event "the die shows an
odd number"; then A ⊂ B
EX: Inspect 2 T-shirts. Let A be the event "at least 1 shirt is defective" and B the
event "1 shirt is defective or 2 shirts are defective"; then A = B
b.) Sum and product of 2 events
Notation for the sum: S ∪ T or S + T (Venn diagram: S and T inside Ω)
EX: Consider the experiment of incubating 2 chicken eggs
Let Ni: "the i-th egg hatches" (i = 1; 2)
Ki: "the i-th egg does not hatch" (i = 1; 2)
A: "exactly 1 egg hatches"
Then the sample space of the experiment is:
Ω = {K1K2; N1K2; K1N2; N1N2}
The following events are simple events:
ω1 = K1K2; ω2 = N1K2; ω3 = K1N2; ω4 = N1N2
Event A is not a simple event because A = N1K2 ∪ K1N2
c.) Complementary events:
Ā = Ω \ A
EX: From a batch containing 10 genuine products and 3 waste products, 11 products are
randomly selected
Let Ai be the event "exactly i genuine products are chosen" (i = 8, 9, 10); at least 8 of
the 11 must be genuine because the batch holds only 3 waste products
Ω = A8 + A9 + A10 and Ā10 = Ω \ A10 = A8 + A9
d.) Two mutually exclusive events:
A and B cannot occur in the same trial
EX: Inspect 2 boxes of mackerel. Let A be the event "at least one box of mackerel is a
waste product" and B the event "no box of mackerel is a waste product"
=> A and B are two mutually exclusive events
Venn diagrams (figure): A∪B; A∩B; A and B mutually exclusive; A and Ā complementary
2.1.4. Exhaustive events
Identifying signs: the events are pairwise mutually exclusive and their sum equals Ω
EX: There are 4 coat racks, and one coat is taken from one of them. Let Ai be the event
"the coat is taken from the i-th rack", i = 1, …, 4
Then {A1; A2; A3; A4} is an exhaustive system of events
2.2. PROBABILITY OF EVENT
Notation: p(A)
2.2.1. Classical definition
$p(A) = \frac{n(A)}{n(\Omega)} = \frac{k}{n}$
EX: An art team needs to recruit 2 members. 4 girls and 2 boys apply (all 6 applicants
are equally likely to be accepted). Calculate the probability that:
1. Both admitted candidates are female
2. At least 1 admitted candidate is female
Solution:
1.) Let A be the event that both admitted candidates are female
$p(A) = \frac{n(A)}{n(\Omega)} = \frac{C_4^2}{C_6^2} = \frac{2}{5}$
2.) Let B be the event that at least one admitted candidate is female
$p(B) = \frac{n(B)}{n(\Omega)} = \frac{C_4^1 C_2^1 + C_4^2}{C_6^2} = \frac{14}{15}$
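A short sketch of the classical formula p(A) = n(A)/n(Ω) applied to the art-team example, using exact fractions. The variable names are illustrative; the complement check at the end is an extra cross-check, not a step from the notes.

```python
from fractions import Fraction
from math import comb

total = comb(6, 2)  # ways to admit 2 of the 6 applicants

# Both admitted candidates are female
p_both_female = Fraction(comb(4, 2), total)

# At least one female: (1 girl and 1 boy) or (2 girls)
p_at_least_one = Fraction(comb(4, 1) * comb(2, 1) + comb(4, 2), total)

# Cross-check through the complement "both admitted candidates are male"
p_via_complement = 1 - Fraction(comb(2, 2), total)

print(p_both_female, p_at_least_one, p_via_complement)  # 2/5 14/15 14/15
```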
EX: Pearson tossed a balanced coin 12,000 times and saw tails 6,019 times (frequency
0.5016); tossing 24,000 times, he saw tails 12,012 times (frequency
0.5005).
2.2.3. Properties of probability
1.) If A is an arbitrary event then 0 ≤ p(A) ≤ 1
2.) p(∅) = 0; p(Ω) = 1
3.) If A ⊂ B then p(A) ≤ p(B)
2.3. PROBABILITY FORMULA
2.3.1. Probability addition formula
• If A and B are two arbitrary events: p(A ∪ B) = p(A) + p(B) − p(A ∩ B)
EX: A group has 30 investors, including 13 securities investors, 17 equipment
investors and 10 who invest in both securities and equipment. A partner meets one investor
chosen at random from the group. Find the probability that this investor invests in
securities or equipment.
Solution 1
Let A be "the partner meets a securities or equipment investor"
B be "the partner meets a securities investor"
C be "the partner meets an equipment investor"
$P(A) = P(B) + P(C) - P(B \cap C) = \frac{13}{30} + \frac{17}{30} - \frac{10}{30} = \frac{2}{3}$
Solution 2
Counting directly: 3 (securities only) + 10 (both) + 7 (equipment only) = 20 => $P(A) = \frac{20}{30} = \frac{2}{3}$
EX: A gift basket has 10 candies, of which 3 are red. Randomly take 3 candies from the
basket. Calculate the probability of getting at least 1 red candy.
Solution 1: Let A be the event "at least 1 red candy is taken"
Ai is the event "exactly i red candies are taken" (i = 0, 1, 2, 3)
{A1; A2; A3} are pairwise mutually exclusive
=> $P(A) = P(A_1) + P(A_2) + P(A_3) = \frac{C_3^1 C_7^2}{C_{10}^3} + \frac{C_3^2 C_7^1}{C_{10}^3} + \frac{C_3^3 C_7^0}{C_{10}^3} = \frac{17}{24}$
Solution 2: $P(A) = 1 - P(A_0) = 1 - \frac{C_3^0 C_7^3}{C_{10}^3} = \frac{17}{24}$
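Both solutions of the candy example can be reproduced in a few lines. This is a sketch with exact fractions; `p_exact[i]` is an illustrative name for P(Ai).

```python
from fractions import Fraction
from math import comb

total = comb(10, 3)  # ways to take 3 of the 10 candies

# p_exact[i] = probability of taking exactly i red candies (3 red, 7 other)
p_exact = [Fraction(comb(3, i) * comb(7, 3 - i), total) for i in range(4)]

p_by_sum = sum(p_exact[1:])        # Solution 1: P(A1) + P(A2) + P(A3)
p_by_complement = 1 - p_exact[0]   # Solution 2: 1 - P(A0)

print(p_by_sum, p_by_complement)   # 17/24 17/24
```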
Note: $\overline{A \cap B} = \bar{A} \cup \bar{B}$ ; $\overline{A \cup B} = \bar{A} \cap \bar{B}$
2.3.2. Conditional probability
$P(A|B) = \frac{P(A \cap B)}{P(B)}$, with P(B) > 0
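EX: A group of 10 employees has 3 men and 7 women; 2 of the men and 3 of the women
are 30 years old. Randomly select 1 employee from the group. Let A be "the
selected employee is female" and B be "the selected employee is 30 years old".
Calculate P(A|B) and P(B|A).
Solution:
$P(A) = \frac{C_7^1}{C_{10}^1} = \frac{7}{10} = 0.7$; $P(B) = \frac{C_5^1}{C_{10}^1} = \frac{5}{10} = 0.5$; $P(A \cap B) = \frac{C_3^1}{C_{10}^1} = \frac{3}{10} = 0.3$
$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.3}{0.5} = 0.6$
$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{0.3}{0.7} = \frac{3}{7} \approx 0.4286$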
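The conditional probabilities in the employee example can also be found by listing the 10 employees and counting. The encoding below (pairs of sex and an "is 30 years old" flag) is an assumption made for this sketch.

```python
from fractions import Fraction

# (sex, is_30_years_old): 2 men aged 30, 1 other man, 3 women aged 30, 4 other women
employees = ([("M", True)] * 2 + [("M", False)] * 1 +
             [("F", True)] * 3 + [("F", False)] * 4)

def prob(event):
    """Classical probability: favourable employees / all employees."""
    favourable = sum(1 for e in employees if event(e))
    return Fraction(favourable, len(employees))

p_a = prob(lambda e: e[0] == "F")            # A: female
p_b = prob(lambda e: e[1])                   # B: 30 years old
p_ab = prob(lambda e: e[0] == "F" and e[1])  # A and B together

p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a
print(p_a_given_b, p_b_given_a)  # 3/5 3/7
```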
Properties:
1.) 0 ≤ p(A|B) ≤ 1, ∀A ⊂ Ω
2.) If A ⊂ C then p(A|B) ≤ p(C|B)
3.) p(A|B) = 1 − p(Ā|B)
2.3.3. Probability multiplication formula
• If A and B are two dependent events
p(A ∩ B) = p(B)·p(A|B) = p(A)·p(B|A)
• If A and B are two independent events
p(A ∩ B) = p(A)·p(B)
• If the n events Ai, i = 1, …, n are not independent then
p(A1A2…An) = p(A1)·p(A2|A1)·…·p(An|A1…An−1)
EX: At Christmas, Mr. A offers 1 large pine tree and 1 small pine tree for sale. The
probability of selling the large tree is 0.9. If the large tree is sold, the probability of
selling the small tree is 0.7. If the large tree is not sold, the probability of selling the
small tree is 0.2. Given that Mr. A sold at least 1 tree, what is the probability that he
sold both trees?
Solution:
Let A be the event "both trees are sold" and C the event "at least 1 tree is sold".
Since A ⊂ C, P(A ∩ C) = P(A).
$P(A|C) = \frac{P(A \cap C)}{P(C)} = \frac{0.9 \cdot 0.7}{1 - 0.1 \cdot 0.8} \approx 0.6848$
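The pine-tree computation combines the multiplication formula with a complement. A minimal sketch, with illustrative variable names:

```python
p_large = 0.9                  # P(large tree sold)
p_small_given_large = 0.7      # P(small sold | large sold)
p_small_given_not_large = 0.2  # P(small sold | large not sold)

# Multiplication formula: P(both sold) = P(large) * P(small | large)
p_both = p_large * p_small_given_large

# P(at least one sold) = 1 - P(neither sold)
p_neither = (1 - p_large) * (1 - p_small_given_not_large)
p_at_least_one = 1 - p_neither

# "Both sold" is contained in "at least one sold", so the intersection is p_both
p_both_given_at_least_one = p_both / p_at_least_one
print(round(p_both_given_at_least_one, 4))  # 0.6848
```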
EX: Aquarium I has 3 goldfish and 4 brown fish; aquarium II has 5 goldfish and 3 brown
fish. One fish is seen jumping from aquarium I into aquarium II, and then one fish is
taken at random from aquarium II. Calculate the probability that the fish taken from
aquarium II is a goldfish.
Solution:
$p(\text{fish 1 gold and fish 2 gold}) = \frac{3}{7} \cdot \frac{6}{9}$
$p(\text{fish 1 brown and fish 2 gold}) = \frac{4}{7} \cdot \frac{5}{9}$
=> $P = \frac{3}{7} \cdot \frac{6}{9} + \frac{4}{7} \cdot \frac{5}{9} = \frac{38}{63}$
EX: Trucks, cars and motorbikes pass a gas station on road X in the ratio 5:2:13.
The probabilities that a passing truck, car or motorbike enters the gas station are
0.1, 0.2 and 0.15 respectively. Given that a vehicle on road X has entered the gas
station, calculate the probability that it is a car.
Solution:
$p(\text{truck passes X and enters the gas station}) = \frac{5}{20} \cdot 0.1 = 0.025$
$p(\text{car passes X and enters the gas station}) = \frac{2}{20} \cdot 0.2 = 0.02$
$p(\text{motorbike passes X and enters the gas station}) = \frac{13}{20} \cdot 0.15 = 0.0975$
=> $p = \frac{0.02}{0.025 + 0.02 + 0.0975} = \frac{8}{57}$
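The vehicle example is an instance of the total probability formula followed by Bayes' formula. The sketch below uses exact fractions; the dictionary names are illustrative.

```python
from fractions import Fraction

ratio = {"truck": 5, "car": 2, "motorbike": 13}
p_enter = {"truck": Fraction(1, 10),      # 0.1
           "car": Fraction(1, 5),         # 0.2
           "motorbike": Fraction(3, 20)}  # 0.15

total = sum(ratio.values())
prior = {v: Fraction(r, total) for v, r in ratio.items()}

# P(vehicle is of type v and enters the station)
joint = {v: prior[v] * p_enter[v] for v in prior}

# Total probability that a passing vehicle enters the station
p_station = sum(joint.values())

# Bayes: P(car | entered the station)
p_car_given_station = joint["car"] / p_station
print(p_station, p_car_given_station)  # 57/400 8/57
```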
$EX = \sum_i x_i p_i$
The expectation of the random variable $X^2$ is: $E(X^2) = \sum_i x_i^2 p_i$
$\sigma = \sqrt{VarX}$
$EX = np$; $VarX = npq \cdot \frac{N-n}{N-1}$
where $p = \frac{N_A}{N}$, $q = 1 - p$
EX: A shipment has N = 40 light bulbs, 10 of which are broken. 5 bulbs are taken
at random for inspection. Let X be the random variable giving the number of
broken bulbs among the 5 bulbs taken.
a) Make the probability distribution table of X.
b) Calculate the average number of broken bulbs among the bulbs taken and find
the variance of X.
Solution:
a) We have X ∈ {0; 1; 2; 3; 4; 5} and N = 40, N_A = 10, n = 5 ⇒ X ∈ H(40, 10, 5)
So the probability distribution table of X is:
X   0   1   2   3   4   5
P   $\frac{C_{10}^0 C_{30}^5}{C_{40}^5}$   $\frac{C_{10}^1 C_{30}^4}{C_{40}^5}$   $\frac{C_{10}^2 C_{30}^3}{C_{40}^5}$   $\frac{C_{10}^3 C_{30}^2}{C_{40}^5}$   $\frac{C_{10}^4 C_{30}^1}{C_{40}^5}$   $\frac{C_{10}^5 C_{30}^0}{C_{40}^5}$
$p_k = P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$ (k = 0, 1, …, n, …)
EX: At gas station H, on average every 10 minutes, there are 15 motorbikes coming to
fill up gas. Knowing that the number of motorbikes coming to refuel at this gas station in
a period of t minutes is a random variable with a Poisson distribution.
a) Find the probability that in a period of 7 minutes, at least 4 motorbikes will come
to fill up gas at this gas station.
b) Find the probability that in a period of 15 minutes, from 20 to 25 motorbikes will
come to refuel at gas station H.
Solution:
Let X and Y be the number of motorbikes that come to refuel at gas station H in 7
minutes and in 15 minutes, respectively.
Under the assumption: $X \sim P(\lambda)$; $Y \sim P(\lambda_1)$; $\lambda = \frac{7 \cdot 15}{10} = 10.5$; $\lambda_1 = \frac{15 \cdot 15}{10} = 22.5$
a) The probability that at least 4 motorbikes come to fill up with gas in a 7-minute
period is:
$P(X \ge 4) = 1 - P(X < 4) = 1 - [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)]$
$= 1 - \left( \frac{\lambda^0}{0!} + \frac{\lambda^1}{1!} + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} \right) e^{-\lambda} \approx 0.99$
b) The probability that from 20 to 25 motorbikes come to fill up with gas in
15 minutes is:
$P(20 \le Y \le 25) = \sum_{k=20}^{25} P(Y = k) = \sum_{k=20}^{25} \frac{\lambda_1^k}{k!} e^{-\lambda_1} = \sum_{k=20}^{25} \frac{(22.5)^k}{k!} e^{-22.5} \approx 0.473$
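Both Poisson probabilities can be evaluated numerically. A minimal sketch, where `poisson_pmf` is a helper written for this illustration:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam ** k / factorial(k)

lam_7 = 15 * 7 / 10    # expected arrivals in 7 minutes  -> 10.5
lam_15 = 15 * 15 / 10  # expected arrivals in 15 minutes -> 22.5

# a) at least 4 motorbikes in 7 minutes: 1 - P(X <= 3)
p_a = 1 - sum(poisson_pmf(k, lam_7) for k in range(4))

# b) from 20 to 25 motorbikes in 15 minutes
p_b = sum(poisson_pmf(k, lam_15) for k in range(20, 26))

print(round(p_a, 2), round(p_b, 3))
```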
$EX = \int_{-\infty}^{+\infty} x f(x) dx$
$VarX = \int_{-\infty}^{+\infty} x^2 f(x) dx - (EX)^2$
$\sigma = \sqrt{VarX}$
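EX: Find the expectation, variance and standard deviation of the random variable X with
the following density function:
$f(x) = \begin{cases} \frac{2}{3}(x^2 + 2x), & x \in [0; 1] \\ 0, & x \notin [0; 1] \end{cases}$
We have:
$EX = \int_0^1 \frac{2}{3}(x^2 + 2x) x \, dx = \frac{11}{18}$;
$VarX = \int_0^1 x^2 \cdot \frac{2}{3}(x^2 + 2x) dx - \left[ \int_0^1 x \cdot \frac{2}{3}(x^2 + 2x) dx \right]^2 = \frac{7}{15} - \frac{121}{324} = \frac{151}{1620}$
$\sigma_X = \sqrt{\frac{151}{1620}} = \frac{\sqrt{755}}{90}$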
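The integrals in this example are polynomial, so they can be reproduced exactly term by term. The sketch below takes f exactly as printed and only reproduces the three integrals; it does not check that f integrates to 1. The dictionary encoding of the polynomial is an assumption of this illustration.

```python
from fractions import Fraction

# f(x) = (2/3)(x^2 + 2x) on [0, 1], stored as {power: coefficient}
f = {2: Fraction(2, 3), 1: Fraction(4, 3)}

def moment(order):
    """Integral over [0, 1] of x**order * f(x), using
    the rule: integral of c * x**m over [0, 1] equals c / (m + 1)."""
    return sum(c / (power + order + 1) for power, c in f.items())

ex = moment(1)         # E(X)
ex2 = moment(2)        # E(X^2)
var = ex2 - ex ** 2    # Var(X) = E(X^2) - (E(X))^2
sigma = float(var) ** 0.5

print(ex, ex2, var, round(sigma, 4))  # 11/18 7/15 151/1620 0.3053
```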
V. NORMAL DISTRIBUTION
5.1. Standard normal distribution: T ~ N(0;1)
a. Definition:
$f(t) = \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}, \ t \in R$
c. Probability: $p(a \le T \le b) = \varphi(b) - \varphi(a)$
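$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \ x \in R$
c. Probability:
$p(a \le X \le b) = \varphi\left(\frac{b-\mu}{\sigma}\right) - \varphi\left(\frac{a-\mu}{\sigma}\right)$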
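The interval probability above can be evaluated without a table. The sketch assumes that φ is the Laplace function (the integral of the standard normal density from 0 to t), which matches the table value φ(1.96) = 0.475 used elsewhere in these notes. The values μ = 10, σ = 2 in the usage line are hypothetical.

```python
from math import erf, sqrt

def laplace_phi(t):
    """Laplace function: integral of the N(0;1) density from 0 to t."""
    return 0.5 * erf(t / sqrt(2))

def normal_interval_prob(a, b, mu, sigma):
    """p(a <= X <= b) for X ~ N(mu; sigma^2)."""
    return laplace_phi((b - mu) / sigma) - laplace_phi((a - mu) / sigma)

print(round(laplace_phi(1.96), 3))  # 0.475

# Hypothetical usage: X ~ N(10; 2^2), probability of falling in [8; 12]
print(round(normal_interval_prob(8, 12, 10, 2), 4))
```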
So for the percentage of equipment under warranty to be 1%, we stipulate the warranty
time to be 1150.5 hours.
X    x1    x2    ...    xm
P    p1*   p2*   ...    pm*
where $p_{i*} = p_{i1} + p_{i2} + \dots + p_{in}$
X's expectation is: $EX = x_1 p_{1*} + x_2 p_{2*} + \dots + x_m p_{m*}$
• Probability distribution table of Y
Y    y1    y2    ...    yn
P    p*1   p*2   ...    p*n
where $p_{*j} = p_{1j} + p_{2j} + \dots + p_{mj}$
Y's expectation is: $EY = y_1 p_{*1} + y_2 p_{*2} + \dots + y_n p_{*n}$
EX: The joint probability distribution of the random vector (X, Y) is given by the table:
X \ Y    1       2       3
3        0.02    0.10    0.15
4        0.30    0.05    0.20
5        0.05    0.03    0.10
$P(X = x_i | Y = y_j) = \frac{P(X = x_i; Y = y_j)}{P(Y = y_j)} = \frac{p_{ij}}{p_{*j}}, \ i = 1, \dots, m$
$P(Y = y_j | X = x_i) = \frac{P(X = x_i; Y = y_j)}{P(X = x_i)} = \frac{p_{ij}}{p_{i*}}, \ j = 1, \dots, n$
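Applied to the joint table of the example, the marginal and conditional formulas reduce to row sums, column sums and one division. A sketch with exact fractions; the nested-dictionary layout `joint[x][y]` is an assumption of this illustration.

```python
from fractions import Fraction as F

# joint[x][y] = P(X = x; Y = y), copied from the table of the example
joint = {
    3: {1: F("0.02"), 2: F("0.10"), 3: F("0.15")},
    4: {1: F("0.30"), 2: F("0.05"), 3: F("0.20")},
    5: {1: F("0.05"), 2: F("0.03"), 3: F("0.10")},
}

# Marginal distributions: row sums for X, column sums for Y
p_x = {x: sum(row.values()) for x, row in joint.items()}
p_y = {y: sum(joint[x][y] for x in joint) for y in (1, 2, 3)}

# Expectations from the marginal tables
ex = sum(x * p for x, p in p_x.items())
ey = sum(y * p for y, p in p_y.items())

# Conditional distribution of X given Y = 1: p_ij / p_*j
x_given_y1 = {x: joint[x][1] / p_y[1] for x in joint}

print({x: float(p) for x, p in p_x.items()}, float(ex), float(ey))
```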
• The density function of Y is: $f_Y(y) = \int_{-\infty}^{+\infty} f(x, y) dx$
$C = \frac{4}{3}$
b) Marginal density function of X:
$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y) dy = \int_0^1 \frac{4}{3}(x + xy) dy = 2x, \ x \in [0; 1]$.
Marginal density function of Y:
$f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y) dx = \int_0^1 \frac{4}{3}(x + xy) dx = \frac{2}{3}(1 + y), \ y \in [0; 1]$
c) The conditional density function $f_X(x | y)$:
$f_X(x | y) = \frac{f(x, y)}{f_Y(y)} = \frac{\frac{4}{3}(x + xy)}{\frac{2}{3}(1 + y)} = 2x, \ x, y \in [0; 1]$
The conditional density function $f_Y(y | x)$:
$f_Y(y | x) = \frac{f(x, y)}{f_X(x)} = \frac{\frac{4}{3}(x + xy)}{2x} = \frac{2}{3}(y + 1), \ x \in (0; 1], \ y \in [0; 1]$
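The marginal densities of f(x, y) = (4/3)(x + xy) on the unit square can be sanity-checked by numerical integration. This is a sketch using a simple midpoint rule; the evaluation points x0 = 0.25 and y0 = 0.5 are arbitrary choices for illustration.

```python
def f(x, y):
    """Joint density (4/3)(x + xy) on the unit square."""
    return 4.0 / 3.0 * (x + x * y)

def integrate(g, a=0.0, b=1.0, steps=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

x0, y0 = 0.25, 0.5
f_x_at_x0 = integrate(lambda y: f(x0, y))  # integrate out y
f_y_at_y0 = integrate(lambda x: f(x, y0))  # integrate out x
total = integrate(lambda x: integrate(lambda y: f(x, y)))

print(round(f_x_at_x0, 6), round(f_y_at_y0, 6), round(total, 6))
```

With these points the integrals come out as 2·x0 = 0.5 for the marginal of X, (2/3)(1 + y0) = 1.0 for the marginal of Y, and 1 for the total mass.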
EX: The rubber industry has 500,000 workers. To study their living standards, people surveyed
the indicator X*: "Real income of rubber industry workers" and assumed the data given in the
following table:
2. Sample
A sample of size n is chosen randomly and objectively from the population.
Characteristics of the sample include:
a) The sample mean:
For a random sample: $\bar{X} = \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$
For a specific sample: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
*Property: If the original random variable X has expectation E(X) = μ and variance
Var(X) = σ², then
$E(\bar{X}) = \mu$ and $Var(\bar{X}) = \frac{\sigma^2}{n}$
$S^2 = S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
For a random sample: $F = F_n = \frac{X_1 + X_2 + \dots + X_n}{n}$
For a specific sample: $f = \frac{n_A}{n}$
e) Correspondence between sample and population characteristics
Sample    Population
F         p
X̄         μ
S²        σ²
f) Method for calculating sample characteristics
*When the sample data is given as n observed values:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$; $s^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - n(\bar{x})^2 \right]$
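• Because $E\bar{X} = \mu$, $Var\bar{X} = \frac{\sigma^2}{n}$, so:
$\bar{X} \in N\left(\mu; \frac{\sigma^2}{n}\right) \Rightarrow \frac{\bar{X} - \mu}{\sigma} \sqrt{n} \in N(0,1)$
$\bar{X} \in N\left(\mu; \frac{S^2}{n}\right) \Rightarrow \frac{\bar{X} - \mu}{S} \sqrt{n} \in N(0,1)$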
• When n < 30 and σ² is unknown (use the Student distribution with n−1 degrees of
freedom):
$\frac{\bar{X} - \mu}{S} \sqrt{n} \in St(n - 1)$
b) In case X does not have a normal distribution
• From the central limit theorem, we deduce that for large n, approximately:
$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \in N(0; 1)$, $\frac{\bar{X} - \mu}{S / \sqrt{n}} \in N(0; 1)$
EX: A survey of rice productivity on 100 hectares in area A gives the following
data table:
Productivity (ton/ha)   3-3.5   3.5-4   4-4.5   4.5-5   5-5.5   5.5-6   6-6.5   6.5-7
Area (ha)               7       12      18      27      20      8       5       3
Fields with a yield of less than 4.5 tons/hectare have low productivity. Calculate:
a) The proportion of rice area with low productivity
$f = \frac{m}{n} = \frac{7 + 12 + 18}{100} = 0.37$
b) The average rice yield, the sample variance and the sample standard deviation
On a pocket calculator: Shift, Mode, 6, then 1, then enter the data from the table
(class midpoints, with the areas as frequencies)
The average rice yield: x̄ = 4.75
Sample variance with divisor n: 0.685; with divisor n − 1: s² = 68.5/99 ≈ 0.692
Sample standard deviation: s ≈ 0.832
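The calculator steps can be mirrored in code by using class midpoints weighted by area. This is a sketch; it reports the variance with both divisors (n and n − 1) so the two conventions can be compared.

```python
bins = [(3.0, 3.5), (3.5, 4.0), (4.0, 4.5), (4.5, 5.0),
        (5.0, 5.5), (5.5, 6.0), (6.0, 6.5), (6.5, 7.0)]
areas = [7, 12, 18, 27, 20, 8, 5, 3]  # hectares in each class

n = sum(areas)
mids = [(lo + hi) / 2 for lo, hi in bins]  # class midpoints

# Proportion of area in classes lying entirely below 4.5 ton/ha
low_share = sum(a for (lo, hi), a in zip(bins, areas) if hi <= 4.5) / n

mean = sum(m * a for m, a in zip(mids, areas)) / n
sum_sq = sum(a * (m - mean) ** 2 for m, a in zip(mids, areas))

var_divisor_n = sum_sq / n              # uncorrected variance
var_divisor_n_1 = sum_sq / (n - 1)      # corrected sample variance s^2
std_corrected = var_divisor_n_1 ** 0.5

print(low_share, mean, round(var_divisor_n, 3),
      round(var_divisor_n_1, 3), round(std_corrected, 3))
```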
CHAPTER 6: PARAMETER ESTIMATION
1. Interval estimate for the population mean μ
a) Case 1: n ≥ 30 and σ² is known
-Step 1: from the sample, calculate x̄ (the sample mean)
-Step 2: from 1−α => $\frac{1-\alpha}{2} = \varphi(t_{\alpha/2})$; look up the Laplace table to find $t_{\alpha/2}$
-Step 3: the estimated interval is (x̄ − ε; x̄ + ε), $\varepsilon = t_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
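The three steps of case 1 can be sketched as a small function. Instead of the Laplace table it uses `statistics.NormalDist().inv_cdf`, which returns the same value t with φ(t) = (1−α)/2. The sample figures in the usage line (x̄ = 20, σ = 2, n = 100) are hypothetical and are not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist

def mean_interval_known_sigma(x_bar, sigma, n, confidence):
    """Interval (x_bar - eps; x_bar + eps) with eps = t * sigma / sqrt(n)."""
    alpha = 1 - confidence
    # t with P(T <= t) = 1 - alpha/2, i.e. phi(t) = (1 - alpha)/2
    t = NormalDist().inv_cdf(1 - alpha / 2)
    eps = t * sigma / sqrt(n)
    return x_bar - eps, x_bar + eps

# Hypothetical usage: x_bar = 20, sigma = 2, n = 100, 95% confidence
low, high = mean_interval_known_sigma(20.0, 2.0, 100, 0.95)
print(round(low, 3), round(high, 3))
```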
c) Case 3: n < 30, σ² is known and X has a normal distribution: proceed as in
case 1.
d) Case 4: n < 30, σ² is unknown and X has a normal distribution
-Step 1: from the sample, calculate x̄ and s
-Step 2: from 1−α => α; look up the Student distribution table to find $t_{\alpha/2}^{n-1}$
(Remember to use n−1 degrees of freedom when looking up the table)
-Step 3: the estimated interval is (x̄ − ε; x̄ + ε), $\varepsilon = t_{\alpha/2}^{n-1} \cdot \frac{s}{\sqrt{n}}$
So μ ∈ (x̄ − ε; x̄ + ε)
μ ∈ (19.5066; 20.4934)
3. Find the sample size (consider only case 1 and case 2)
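We fix s (or σ) to find the sample size N.
a) If ε > ε′ then we solve the inequality:
$t_\alpha \cdot \frac{s}{\sqrt{N}} > \varepsilon' \Rightarrow N < \left( t_\alpha \cdot \frac{s}{\varepsilon'} \right)^2 \Rightarrow N_{max}$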
the number of elements we are interested in, the estimated interval for p is:
$(f - \varepsilon; f + \varepsilon), \ \varepsilon = t_{\alpha/2} \cdot \sqrt{\frac{f(1-f)}{n}}$
where $t_{\alpha/2}$ is found from $\varphi(t_{\alpha/2}) = \frac{1-\alpha}{2}$ (look up the Laplace table)
EX: Province X has 1,000,000 young people. A random survey of 20,000 of them
about their educational level found that 12,575 had graduated from high school.
Estimate, with 95% confidence, the proportion of young people in province X who
have graduated from high school. How many young people in province X have
graduated from high school?
Solution:
Let p be the proportion of young people who have graduated from high school in
province X.
$f = \frac{m}{n} = \frac{12575}{20000} = 0.62875$
1−α = 95% => $\varphi(t_{\alpha/2}) = 0.475$ => $t_{\alpha/2} = 1.96$
$\varepsilon = t_{\alpha/2} \cdot \sqrt{\frac{f(1-f)}{n}} = 1.96 \cdot \sqrt{\frac{0.62875 \cdot (1 - 0.62875)}{20000}} \approx 0.0067$
So p ∈ (f − ε; f + ε)
p ∈ (0.62205; 0.63545)
The number of young people who have graduated from high school lies between
1,000,000 · 0.62205 = 622,050
and 1,000,000 · 0.63545 = 635,450
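The interval for the proportion can be recomputed as follows. This sketch takes t = 1.96 from the Laplace table as in the solution; the unrounded ε gives bounds that agree with the hand-rounded ones to about four decimal places.

```python
from math import sqrt

m, n = 12575, 20000    # graduates in the sample, sample size
population = 1_000_000
t = 1.96               # from phi(t) = 0.475 (95% confidence)

f = m / n
eps = t * sqrt(f * (1 - f) / n)
low, high = f - eps, f + eps

# Bounds for the number of graduates in the whole province
count_low, count_high = population * low, population * high

print(f, round(eps, 4), round(low, 4), round(high, 4))
```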
H0: μ = μ0     H0: μ = μ0     H0: μ = μ0
H1: μ ≠ μ0     H1: μ > μ0     H1: μ < μ0
EX: Electricity Department A reports that a household pays 250,000 VND per month for
electricity on average, with a standard deviation of 20,000 VND. A random survey of
500 households found that each household pays 252,000 VND per month on average.
Test the hypothesis H: "The average monthly payment per household is 250 thousand
VND" at significance level α = 1%. Give the value of the test statistic t and the
conclusion.
Solution
Because n = 500 ≥ 30 and σ = 20 (thousand VND) is known, the test statistic follows the
standard normal distribution.
We have: x̄ = 252 and α = 0.01 → $\frac{1-\alpha}{2} = 0.495$ → $t_\alpha = 2.58$
Test statistic: $t = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{252 - 250}{20 / \sqrt{500}} = 2.2361$
Since |t| = 2.2361 < 2.58, we do not reject H at the 1% significance level.
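The test statistic and the two-sided decision for the electricity example can be sketched as below. The critical value 2.58 is taken from the Laplace table as in the solution; the function name is illustrative.

```python
from math import sqrt

def mean_test_statistic(x_bar, mu0, sigma, n):
    """t = (x_bar - mu0) / (sigma / sqrt(n)) for known sigma."""
    return (x_bar - mu0) / (sigma / sqrt(n))

# Amounts in thousand VND
t_mean = mean_test_statistic(252, 250, 20, 500)
t_critical_1pct = 2.58  # phi(t) = 0.495 at alpha = 1%

# Two-sided test: reject H only if |t| exceeds the critical value
reject_h_mean = abs(t_mean) > t_critical_1pct
print(round(t_mean, 4), reject_h_mean)  # 2.2361 False
```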
EX: A report says that 25% of Vietnamese consumers are interested in Vietnamese
products. A random survey of 1,000 Vietnamese people found 385 respondents
interested in Vietnamese goods. Test the report's claim at the 5% significance level.
Solution
$f = \frac{m}{n} = \frac{385}{1000} = 0.385$
α = 5% = 0.05 → $\frac{1-\alpha}{2} = 0.475 = \varphi(1.96)$ → $t_\alpha = 1.96$
Test statistic:
$t = \frac{|f - p_0|}{\sqrt{p_0 q_0}} \sqrt{n} = \frac{|0.385 - 0.25|}{\sqrt{0.25 \cdot 0.75}} \sqrt{1000} = 9.8590$
Since t = 9.8590 > 1.96, we reject the claim that p = 25% at the 5% significance level.
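The proportion test follows the same pattern as the test for a mean. A minimal sketch, with an illustrative function name and the critical value 1.96 taken from the Laplace table:

```python
from math import sqrt

def proportion_test_statistic(m, n, p0):
    """t = |f - p0| * sqrt(n) / sqrt(p0 * q0), with f = m / n and q0 = 1 - p0."""
    f = m / n
    return abs(f - p0) * sqrt(n) / sqrt(p0 * (1 - p0))

t_prop = proportion_test_statistic(385, 1000, 0.25)
t_critical_5pct = 1.96  # phi(t) = 0.475 at alpha = 5%

reject_claim = t_prop > t_critical_5pct
print(round(t_prop, 3), reject_claim)
```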
- Depending on n and whether the variance is known or unknown, we consider cases like
comparing the average with a number.
2. Properties.
1) −1 ≤ r ≤ 1
2) If r = 0 then X, Y do not have a linear relationship;
If |r| = 1 then X, Y have an exact linear relationship
3) If r < 0 then the relationship between X, Y is decreasing (inverse).
4) If r > 0 then the relationship between X, Y is increasing (direct).
II. Empirical linear regression line
- From the experimental sample of the random vector (X, Y), we plot the points
(xi; yi) on the Oxy plane. The curve through these points is the curve describing how Y
depends on X that we need to find (see pictures a), b))
- The straight line is the empirical regression line that best approximates the given sample
points, and it also approximates the curve to be found.
+ In figure a) the approximation is good (strong linear dependence)
+ In figure b) the approximation is not good.
1. Least squares method
- When there is a relatively strong linear dependence between two random variables X and
Y, we look for the expression a + bX that best approximates Y in the sense of minimizing
the mean squared error E(Y − a − bX)²; this method is called least squares.
- For each pair of points (xi; yi) the approximation error is:
$\varepsilon_i = y_i - (a + b x_i)$ (see picture c))
Tasks:
- Calculate the sample correlation coefficient.
- Build a sample regression equation.
- Estimate regression error.
- Forecast the value of the interest rate if the inflation rate is 22.5.
Solution
- From the sample data we calculate
$\bar{x} = 9.0625$; $\bar{y} = 12.3$; $\overline{xy} = 130.9813$;
$s_x^2 = 18.59$; $s_x = 4.312$
$s_y^2 = 20.76$; $s_y = 4.56$.
- So the sample correlation coefficient is
r = (130.9813 − 9.0625 · 12.3) / (4.56 · 4.312) = 0.99
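- We have (here â is the slope and b̂ the intercept):
$\hat{a} = r \cdot \frac{s_y}{s_x} = 0.99 \cdot \frac{4.56}{4.312} \approx 1.047$
$\hat{b} = \bar{y} - \hat{a}\bar{x} = 12.3 - 1.047 \cdot 9.0625 \approx 2.81$
- So we have the sample regression equation
$\hat{y} = 1.047x + 2.81$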
- The estimate of the regression error is:
$\varepsilon_{y/x}^2 = s_y^2 (1 - r^2) = 20.76 \cdot (1 - 0.99^2) = 0.413$
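The regression quantities can be recomputed from the summary statistics alone, since the raw data table is not reproduced here. This sketch mirrors the hand calculation: it uses the rounded s_x and s_y given above and rounds r to two decimals before using it, so results computed from unrounded data could differ slightly. The variable names are illustrative.

```python
# Summary statistics quoted in the solution
x_bar, y_bar, xy_bar = 9.0625, 12.3, 130.9813
s_x, s_y = 4.312, 4.56
var_y = 20.76

# Sample correlation coefficient
r_exact = (xy_bar - x_bar * y_bar) / (s_x * s_y)
r = round(r_exact, 2)  # the hand calculation continues with r = 0.99

# Regression line y = slope * x + intercept
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

# Estimate of the regression error
residual_variance = var_y * (1 - r ** 2)

# Forecast of the interest rate at an inflation rate of 22.5
forecast = slope * 22.5 + intercept

print(r, round(slope, 3), round(intercept, 2), round(residual_variance, 3))
```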