Stat 2020-2021 Module 1 Lecture
Statement of Vision
Northeastern Mindanao Academy has greatly envisioned to transform the students into
exemplary citizens and leaders by providing them with physical, mental, social, and spiritual training.
Statement of Mission
By making Christ the Bedrock of Education, Northeastern Mindanao Academy is committed to
prepare the students for higher academic pursuits by consistently providing enhanced learning
experiences that will promote the maximum development of the mind, body, and soul. And to inspire
them to gain the highest possible capacity for usefulness and service in the life that now is and in the
life of the better world.
Statement of Philosophy
Northeastern Mindanao Academy conforms to the Seventh-day Adventist belief that the
students are God’s heritage and their teachers are His servants. The school, at all levels, adheres to the
commission to educate the young for a true knowledge of God and to experience His companionship in
study and service, putting into effect the Divine Plan “to restore in man the image of God.”
Secondly, the school seeks to train and prepare students for life-work and to become better
citizens to meet the challenges of nation building.
Thirdly, students are expected to share with others what has been learned and experienced
during the entire stay in school.
Welcome to our Statistics & Probability class! Enjoy as we learn together. Have fun!
Introduction:
Life is full of uncertainties and surprises. Decision making is a very important skill that we should
possess nowadays. Every day, we face many situations that need an immediate and sensible
decision. If you are deciding whether or not to take a certain college degree, you should consider many
factors. Or you may want to decide whether or not to buy a certain product after learning the history of
the company that manufactured it. Events and situations like these can be decided better
with careful choices, choices that are made with a reasoned anticipation of chances.
Probability is the body of knowledge that focuses on activities that involve predicting chances
and quantifying the randomness of events. In this unit, we will provide you with modules that will
allow you to appreciate the importance of probability.
Random Variables
A random event, like drawing a face card in a standard deck, always gives a random
outcome. Most of the time, when a random event is taken under consideration, the focus is
not on the outcome itself but on several factors that may affect the outcome. For example,
in the same event of drawing a face card in a standard deck of cards, you are more
interested in the number of times a face card appears after 10 draws instead of the actual
outcome of getting the card. This is because you know that the outcome in each trial
changes, considering you have 52 different cards. This situation illustrates the concept of a
random variable.
A random variable, also called a stochastic variable, is a rule that assigns a numerical
value or characteristic to each outcome of an experiment. It is essentially a variable, usually
denoted as X or any other capital letter of the alphabet, because its value is not constant: it
assumes different values due to chance. For example, a die is rolled five times and a random
variable X is defined as the number of times a “6” appears. The random variable X can take
on the values 0, 1, 2, 3, 4, and 5, as the outcome may vary from trial to trial.
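As a rough illustration of the die example, the sketch below simulates the experiment in Python. The function name `count_sixes` and the seed value are assumptions made for this illustration and are not part of the module.

```python
import random


def count_sixes(num_rolls=5, rng=random):
    """Roll a fair die num_rolls times and return X, the number of sixes."""
    rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
    return sum(1 for roll in rolls if roll == 6)


# Repeat the five-roll experiment ten times with an arbitrary fixed seed.
rng = random.Random(2021)
observed = [count_sixes(5, rng) for _ in range(10)]
print(observed)  # each entry is a value of X between 0 and 5
```

Each run of the experiment gives one value of X, and the value changes from trial to trial, which is what makes X a random variable.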
Generally, there are two categories of random variables: discrete and continuous random
variables. A discrete random variable takes on a countable number of distinct values, typically
whole numbers such as 0, 1, 2, 3, 4, 5, …, while a continuous random variable can assume
any value in an interval, including the decimals between two counting
numbers. For a quick comparison, the values of a discrete random variable are basically
“counts” and those of a continuous random variable are “measurements.”
Definition of Terms
A random variable, which is usually written using a capital letter, is a variable that
assigns a numerical value to each outcome of a random event. It can be classified as discrete
or continuous depending on whether the value is obtained by counting or by measurement.
Range Space is the set of all values possible for a given random variable.
a) A fair coin is tossed thirty times and the number of times X that a tail appears is a
discrete random variable since its possible values may be determined by counting,
i.e., 0, 1, 2, 3, 4, …, 30.
Note: A coin has two sides, the head and the tail. When you toss a coin, the result
may be a head or a tail. If every toss shows a head, then no tail appears and X = 0. The values
X = 0, 1, 2, 3, …, 30 are the possible numbers of tails.
b) A machine is run, and the time Y at which it starts to experience a glitch is recorded. Y is a
continuous random variable since its value is obtained by measurement.
As mentioned earlier, a random variable may take on different values depending on the outcome of
each trial. Now, the set of all values possible for a given random variable is called the range space.
Example 2: Two fair coins are tossed and the random variable X is defined as the number of heads that
appear. Find the range space.
Solution:
Let us review our tree diagram. The four possible outcomes are (H,H), (H,T), (T,H), and (T,T), which
give 2, 1, 1, and 0 heads, respectively.
In the range space, 2 indicates that there are two heads, 1 means one head, and 0 means no head.
There should be no repetition of values in the range space: two outcomes give 1, but we write 1
only once.
Hence, the range space in tossing two fair coins is {0, 1, 2}. Do not forget to write the braces { }.
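The range space in Example 2 can also be checked by listing the sample space in Python. This is only a sketch; the names `sample_space` and `number_of_heads` are chosen for this illustration.

```python
from itertools import product

# Sample space for tossing two fair coins: every (first coin, second coin) pair.
sample_space = list(product("HT", repeat=2))


def number_of_heads(outcome):
    """Random variable X: the number of heads in one outcome."""
    return outcome.count("H")


values = [number_of_heads(outcome) for outcome in sample_space]
range_space = sorted(set(values))  # a set keeps each value only once

print(values)       # [2, 1, 1, 0]
print(range_space)  # [0, 1, 2]
```

The value 1 occurs twice in `values` but only once in `range_space`, which matches the rule that the range space has no repetition.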
Example 3: Two fair coins are tossed and the random variable X is defined as the number of tails that
appear. Find the range space.
Solution:
Based on the tree diagram, the range space = {0, 1, 2}. Here 0 means no tail, or (H,H); 1 means one tail,
or (H,T) and (T,H) (no repetition is allowed in the range space); and 2 means two tails, or (T,T).
Identifying the range space allows you to find the possible values of a random variable. The value of a
random variable X at a specific outcome $x_i$ is usually denoted by $X(x_i)$.
Example 4: A pair of dice is thrown and the random variable Y is defined such that Y gives the sum of
the two numbers that appear. Determine the following:
Solution:
                            Second Die
First Die     1       2       3       4       5       6
    1       (1,1)   (1,2)   (1,3)   (1,4)   (1,5)   (1,6)
    2       (2,1)   (2,2)   (2,3)   (2,4)   (2,5)   (2,6)
    3       (3,1)   (3,2)   (3,3)   (3,4)   (3,5)   (3,6)
    4       (4,1)   (4,2)   (4,3)   (4,4)   (4,5)   (4,6)
    5       (5,1)   (5,2)   (5,3)   (5,4)   (5,5)   (5,6)
    6       (6,1)   (6,2)   (6,3)   (6,4)   (6,5)   (6,6)
The outcome (3,2) means 3 is the outcome of the first die and 2 is the outcome of the second die.
a) Range space.
Take note: Y is defined such that Y gives the sum of the two numbers that appear. This means
we add the two numbers: (1,1) gives 2, (1,2) gives 3, and so on. Hence,
the range space is the set of all possible sums. Range space = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.
b) Y(3,2)
Y(3,2) = 3 + 2 = 5, since Y is defined as the sum of the two numbers.
c) Y(4,6)
Y(4,6) = 4 + 6 = 10, since Y is defined as the sum of the two numbers.
d) Y(1,3)
Y(1,3) = 1 + 3 = 4. As defined, add the two numbers.
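The answers in Example 4 can be verified with a short Python sketch. Writing the random variable as a function `Y` that takes an outcome pair is a choice made for this illustration.

```python
from itertools import product

# All 36 ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))


def Y(outcome):
    """Random variable Y: the sum of the two numbers that appear."""
    first, second = outcome
    return first + second


range_space = sorted({Y(outcome) for outcome in sample_space})

print(len(sample_space))                # 36
print(range_space)                      # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(Y((3, 2)), Y((4, 6)), Y((1, 3)))  # 5 10 4
```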
Example 5: Consider the random event of tossing four coins, where the variable X gives the number of
heads that appear. Construct a probability distribution table.
Solution:
There are 16 possible outcomes. You may use a tree diagram or a table as shown below.
Step 1: Let X be the variable that gives the number of heads that appear. Our range space is
{0, 1, 2, 3, 4}. (Note: 0 means no head, or (TTTT); 1 means one head, such as (HTTT); and so on.)
For each value of X, the probability is $P(E) = \frac{n(E)}{n(S)}$, where n(E) is the number of outcomes
in the event and n(S) = 16 is the number of outcomes in the sample space.
For X = 0: No head appears, which is (TTTT). There is only 1 outcome of this kind. Hence, n(E) = 1
and n(S) = 16.
$P(0) = \frac{1}{16} = 0.0625$
This means there is a 6.25% probability that no head will appear. (To change a decimal to a percent,
move the decimal point 2 places to the right and affix the percent sign.)
For X = 1: Only 1 head appears. In this case: (HTTT), (THTT), (TTHT), and (TTTH). So, n(E) = 4.
$P(1) = \frac{4}{16} = \frac{1}{4} = 0.25$
This means there is a 25% probability that 1 head will appear.
For X = 2: Two heads appear. In this case: (HHTT), (HTHT), (HTTH), (THHT), (THTH), and (TTHH).
So, n(E) = 6.
$P(2) = \frac{6}{16} = 0.375$
This means there is a 37.5% probability that 2 heads will appear.
For X = 3: Three heads appear. In this case: (HHHT), (HHTH), (HTHH), and (THHH). So, n(E) = 4.
$P(3) = \frac{4}{16} = 0.25$
This means there is a 25% probability that 3 heads will appear.
For X = 4: Four heads appear. In this case: (HHHH). Hence, n(E) = 1.
$P(4) = \frac{1}{16} = 0.0625$
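The counting in Example 5 can be reproduced by enumerating all 16 outcomes. The sketch below uses exact fractions; the variable names are chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 16 equally likely outcomes of tossing four coins.
outcomes = list(product("HT", repeat=4))

# n(E) for each value of X: how many outcomes have that number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = n(E) / n(S)
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in pmf.items():
    print(x, p, float(p))
# 0 1/16 0.0625
# 1 1/4 0.25
# 2 3/8 0.375
# 3 1/4 0.25
# 4 1/16 0.0625
```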
After finding the probabilities, construct a probability distribution table as shown below.
Number of Heads (X) 0 1 2 3 4
Probability (P(X)) 0.0625 0.25 0.375 0.25 0.0625
This table is called a probability distribution, which is also known as a probability mass function.
A probability distribution, also known as a probability mass function, is a table that gives a list
of probability values along with their associated values in the range of a discrete random variable.
Note from the previous probability mass function that the following properties are
observed, given that $p_i$ is the individual probability for each value $x_i$ of the random
variable:
1. Each probability value ranges from 0 to 1; in symbols, $0 \le p_i \le 1$.
2. The sum of all the individual probabilities in the distribution is equal to 1; thus,
$p_1 + p_2 + p_3 + \dots + p_n = \sum_{i=1}^{n} p_i = 1$.
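The two properties above can be turned into a quick check. This is a minimal sketch; the function name `is_valid_pmf` and the small tolerance used for rounding error are assumptions of this illustration.

```python
import math


def is_valid_pmf(probabilities, tol=1e-9):
    """Return True if the list satisfies both properties of a probability mass function."""
    # Property 1: every probability lies between 0 and 1.
    in_range = all(0 <= p <= 1 for p in probabilities)
    # Property 2: the probabilities add up to 1 (allowing tiny rounding error).
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return in_range and sums_to_one


print(is_valid_pmf([0.0625, 0.25, 0.375, 0.25, 0.0625]))  # True
print(is_valid_pmf([0.5, 0.6]))                           # False: the sum is 1.1
print(is_valid_pmf([1.2, -0.2]))                          # False: values outside 0 to 1
```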
Also, note that like any other statistical distribution, a probability mass function may be
graphed using a histogram in which the horizontal axis represents the values of the random
variable X and the vertical axis gives the corresponding probabilities, P(X). For example,
below is the related histogram of the random variable X (showing a head after tossing four
coins).
Hence, in example 5 the sum of all the individual probabilities in the distribution is equal to 1 as shown
below.
0.0625 + 0.25 + 0.375 + 0.25 + 0.0625 = 1
Below is the histogram of the random variable X (the number of heads after tossing four coins).
[Histogram: horizontal axis, Number of Heads (0 to 4); vertical axis, Probability]
Example 6: A card is drawn from a deck of 20 cards (i.e., all the cards numbered 1 to 5 in a standard
deck) and the random variable W gives the number on the card. Construct the probability
mass function and its corresponding histogram.
Solution:
The range space is the set {1, 2, 3, 4, 5}. The sample space has 20 outcomes. To solve for the
individual probabilities, use the formula
$P(E) = \frac{n(E)}{n(S)}$, where n(S) = 20.
$P(W = 1) = \frac{4}{20} = \frac{1}{5} = 0.2$   Count how many 1’s are in the deck of cards. There are four. See the table below.
$P(W = 2) = \frac{4}{20} = \frac{1}{5} = 0.2$   Count how many 2’s there are. There are four 2’s.
$P(W = 3) = \frac{4}{20} = \frac{1}{5} = 0.2$
$P(W = 4) = \frac{4}{20} = \frac{1}{5} = 0.2$
$P(W = 5) = \frac{4}{20} = \frac{1}{5} = 0.2$
The deck of cards:
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
The probability distribution table or probability mass function is shown below, where the range space is on the
first row and the probability P(W) is on the second row.
Number on the Card (W) 1 2 3 4 5
Probability (P(W)) 0.2 0.2 0.2 0.2 0.2
The histogram of the random variable W (the number on the card) is shown below.
[Histogram: horizontal axis, Number on the Card (1 to 5); vertical axis, Probability; each bar has height 0.2]
Assessment 1.1:
Example 7: Find the mean of the probability distribution involving the random variable X that gives the
number of heads that appear after tossing four coins.
Number of Heads (X) 0 1 2 3 4
Probability (P(X)) 0.0625 0.25 0.375 0.25 0.0625
Solution: Multiply each value by its corresponding probability, and then add the products.
To solve for the mean, we use the formula
$\mu = \sum_{i=1}^{n} x_i \, p(x_i)$
$\mu = (0 \times 0.0625) + (1 \times 0.25) + (2 \times 0.375) + (3 \times 0.25) + (4 \times 0.0625)$
$\mu = 0 + 0.25 + 0.75 + 0.75 + 0.25 = 2$
The mean of the distribution is 2. This means that, over many tosses of four coins, the average number
of heads that appear per toss is 2.
Example 8: The probabilities that a printer produces 0, 1, 2, and 3 misprints are 42%, 28%, 18%, and 12%,
respectively. Construct a probability mass function and then compute the mean value of the
random variable.
Solution:
The probability distribution table or probability mass function is shown below. The number of
misprints (X) is written on the first row and the probability P(X) is on the second
row. To change a percent to a decimal, move the decimal point two places to the left.
Number of Misprints (X) 0 1 2 3
Probability (P(X)) 0.42 0.28 0.18 0.12
$\mu = (0 \times 0.42) + (1 \times 0.28) + (2 \times 0.18) + (3 \times 0.12) = 0 + 0.28 + 0.36 + 0.36 = 1$
The mean value of the variable is 1. Thus, the average number of misprints the printer produces
is 1 (or one misprinted sheet if you are printing on bond paper).
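The mean formula used in Examples 7 and 8 is a sum of products, which is short to write in Python. The sketch below is one possible way to do it; the function name `mean` is an assumption of this illustration.

```python
def mean(values, probabilities):
    """Mean of a discrete random variable: the sum of x_i * p(x_i)."""
    return sum(x * p for x, p in zip(values, probabilities))


# Example 7: number of heads in four coin tosses.
heads_mean = mean([0, 1, 2, 3, 4], [0.0625, 0.25, 0.375, 0.25, 0.0625])

# Example 8: number of misprints, with percents written as decimals.
misprint_mean = mean([0, 1, 2, 3], [0.42, 0.28, 0.18, 0.12])

print(heads_mean)                # 2.0
print(round(misprint_mean, 10))  # 1.0
```

The rounding in the last line only hides tiny floating-point error from decimals such as 0.28; it does not change the method.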
Example 9: A coin is tossed and a die is rolled. The outcome of the coin is recorded as “1” when it shows a head
and “0” when it shows a tail. The random variable R gives the sum of the outcomes of the coin
and the die. Compute the average value of the random variable.
Solution:
Review your tree diagram. The outcome of the coin is recorded as “1” when it shows a head and “0”
when it shows a tail.
Coin   Die   Outcome   Recorded as
H      1     (H,1)     (1,1)
       2     (H,2)     (1,2)
       3     (H,3)     (1,3)
       4     (H,4)     (1,4)
       5     (H,5)     (1,5)
       6     (H,6)     (1,6)
T      1     (T,1)     (0,1)
       2     (T,2)     (0,2)
       3     (T,3)     (0,3)
       4     (T,4)     (0,4)
       5     (T,5)     (0,5)
       6     (T,6)     (0,6)
The recorded outcomes are (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), and (1, 6).
So n(S) = 12, the number of outcomes.
Since the random variable R is the sum, add the two numbers in each outcome:
0+1, 0+2, 0+3, 0+4, 0+5, 0+6, 1+1, 1+2, 1+3, 1+4, 1+5, and 1+6.
The sums are 1, 2, 3, 4, 5, 6, 2, 3, 4, 5, 6, and 7. Remember, there is no repetition of
elements in your range space.
Hence, the range space = {1, 2, 3, 4, 5, 6, 7} and n(S) = 12.
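Range space (R) 1 2 3 4 5 6 7
Number of occurrences 1 2 2 2 2 2 1
To find each probability, divide the number of occurrences of each sum by n(S) = 12.
$P(R = 1) = \frac{1}{12}$
$P(R = 2) = \frac{2}{12} = \frac{1}{6}$
$P(R = 3) = \frac{2}{12} = \frac{1}{6}$
$P(R = 4) = \frac{2}{12} = \frac{1}{6}$
$P(R = 5) = \frac{2}{12} = \frac{1}{6}$
$P(R = 6) = \frac{2}{12} = \frac{1}{6}$
$P(R = 7) = \frac{1}{12}$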
Below is the probability distribution table with our range space on the first row and its probability of
occurrence on the second row.
R      1     2     3     4     5     6     7
P(R)   1/12  1/6   1/6   1/6   1/6   1/6   1/12
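To solve for the average value or the mean ($\mu$), we use the formula
$\mu = \sum_{i=1}^{n} x_i \, p(x_i)$
$\mu = (1 \times \frac{1}{12}) + (2 \times \frac{1}{6}) + (3 \times \frac{1}{6}) + (4 \times \frac{1}{6}) + (5 \times \frac{1}{6}) + (6 \times \frac{1}{6}) + (7 \times \frac{1}{12})$
$\mu = \frac{1}{12} + \frac{1}{3} + \frac{1}{2} + \frac{2}{3} + \frac{5}{6} + 1 + \frac{7}{12}$   The fractions are not similar. Change them to equivalent fractions.
The LCD is 12: $\frac{1}{3} = \frac{4}{12}$; $\frac{1}{2} = \frac{6}{12}$; $\frac{2}{3} = \frac{8}{12}$; $\frac{5}{6} = \frac{10}{12}$; $1 = \frac{12}{12}$.
$\mu = \frac{1}{12} + \frac{4}{12} + \frac{6}{12} + \frac{8}{12} + \frac{10}{12} + \frac{12}{12} + \frac{7}{12}$   Find the sum.
$\mu = \frac{1 + 4 + 6 + 8 + 10 + 12 + 7}{12} = \frac{48}{12} = 4$   Simplify.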
The average value of the variable is 4, which means that if the coin is tossed and the die is rolled
many times, the sums will average 4 in the long run.
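The fraction arithmetic in Example 9 can be double-checked with Python's exact `Fraction` type. This is a sketch only; the dictionary `coin_values` is a name chosen here to hold the recording rule (head as 1, tail as 0).

```python
from collections import Counter
from fractions import Fraction

# Recording rule from the example: head -> 1, tail -> 0.
coin_values = {"H": 1, "T": 0}

# R = recorded coin value + die value, for all 12 outcomes.
sums = [coin_values[coin] + die for coin in "HT" for die in range(1, 7)]

counts = Counter(sums)
pmf = {r: Fraction(n, len(sums)) for r, n in sorted(counts.items())}

# Mean: the sum of r * P(R = r), computed without rounding.
mu = sum(r * p for r, p in pmf.items())

print(pmf[1], pmf[4], pmf[7])  # 1/12 1/6 1/12
print(mu)                      # 4
```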
Example 10: Find the mean (𝜇) of the following probability distribution table.
(X) 2 4 6 8
P(X) 0.25 0.25 0.25 0.25
Solution:
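To solve for the mean ($\mu$), we use the formula
$\mu = \sum_{i=1}^{n} x_i \, p(x_i)$
$\mu = (2 \times 0.25) + (4 \times 0.25) + (6 \times 0.25) + (8 \times 0.25)$   Here $x_1 = 2$ and $p(x_1) = 0.25$, and so on.
$\mu = 0.5 + 1 + 1.5 + 2$   Find the products and simplify. Hence,
$\mu = 5$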
Example 11: Find the mean (𝜇) of the following probability distribution table
(X) 2 4 6 8
P(X) 0.4 0.1 0.1 0.4
Solution:
To solve for the mean ($\mu$), we use the formula
$\mu = \sum_{i=1}^{n} x_i \, p(x_i)$
$\mu = (2 \times 0.4) + (4 \times 0.1) + (6 \times 0.1) + (8 \times 0.4)$   Here $x_1 = 2$ and $p(x_1) = 0.4$; $x_2 = 4$ and $p(x_2) = 0.1$; and so on.
$\mu = 0.8 + 0.4 + 0.6 + 3.2$   Find the products and simplify. Hence,
$\mu = 5$
The two distributions have the same mean with the same values for the variable. However, the
distribution of the probabilities is completely different. The variance and the standard deviation can help you
describe how different these two distributions are. The variance of a probability distribution depends on the
squared deviations from the mean of each value of the random variable multiplied by their corresponding
probabilities. The standard deviation, then, is determined by getting the square root of the variance.
The following steps will simplify the process of computing the variance and standard
deviation.
1. Compute the mean value of the random variable.
2. Subtract the mean from each value and square the differences.
3. Multiply the squared differences by the corresponding probabilities.
4. Add all the products. This gives the variance of the probability distribution.
5. Take the square root of the variance. This gives the standard deviation.
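The steps above can be sketched as a small Python function. The function name `variance_and_sd` is an assumption of this illustration, and the sample call uses the distribution from Example 10.

```python
import math


def variance_and_sd(values, probabilities):
    """Follow the listed steps and return (mean, variance, standard deviation)."""
    # Compute the mean value of the random variable.
    mu = sum(x * p for x, p in zip(values, probabilities))
    # Subtract the mean from each value and square the differences.
    squared_differences = [(x - mu) ** 2 for x in values]
    # Multiply the squared differences by the corresponding probabilities.
    products = [d * p for d, p in zip(squared_differences, probabilities)]
    # Add all the products to get the variance.
    variance = sum(products)
    # The standard deviation is the square root of the variance.
    return mu, variance, math.sqrt(variance)


mu, variance, sd = variance_and_sd([2, 4, 6, 8], [0.25, 0.25, 0.25, 0.25])
print(mu, variance, round(sd, 2))  # 5.0 5.0 2.24
```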
Example 12: Compute the variance and standard deviation of the two probability distributions (example 10 &
11) with the same 𝜇 of 5 presented earlier. Interpret the values.
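For the first probability distribution (Example 10):
x 2 4 6 8
p(x) 0.25 0.25 0.25 0.25
(x − 𝜇) −3 −1 1 3
(x − 𝜇)² 9 1 1 9
Solve for the variance using the formula $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 \, p(x_i)$.
$\sigma^2 = (9 \times 0.25) + (1 \times 0.25) + (1 \times 0.25) + (9 \times 0.25) = 2.25 + 0.25 + 0.25 + 2.25 = 5$   The variance is 5.
$\sigma = \sqrt{5} \approx 2.24$   The standard deviation is 2.24.
For the second probability distribution (Example 11):
x 2 4 6 8
p(x) 0.4 0.1 0.1 0.4
(x − 𝜇) −3 −1 1 3
(x − 𝜇)² 9 1 1 9
$\sigma^2 = (9 \times 0.4) + (1 \times 0.1) + (1 \times 0.1) + (9 \times 0.4) = 3.6 + 0.1 + 0.1 + 3.6 = 7.4$   The variance is 7.4.
$\sigma = \sqrt{7.4} \approx 2.72$   The standard deviation is 2.72.
The second probability distribution is more spread out from the mean than the first probability
distribution, since its variance and standard deviation are larger.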
An alternative method of solving for the variance and standard deviation involves the use of the following
formula:
$\sigma^2 = \sum_{i=1}^{n} x_i^2 \, p(x_i) - \mu^2$
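Example 13: Use the alternative formula to solve for the variance and standard deviation of the second
probability distribution (Example 11).
x 2 4 6 8
$x_i^2$ 4 16 36 64
p(x) 0.4 0.1 0.1 0.4
The row $x_i^2$ means square each value of x. So, $2^2 = 4$; $4^2 = 16$; $6^2 = 36$; $8^2 = 64$.
$\sigma^2 = (4 \times 0.4) + (16 \times 0.1) + (36 \times 0.1) + (64 \times 0.4) - 5^2 = 1.6 + 1.6 + 3.6 + 25.6 - 25 = 7.4$
$\sigma = \sqrt{7.4} \approx 2.72$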
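As a check that the alternative formula agrees with the definition, the sketch below computes the variance both ways for the distribution with probabilities 0.4, 0.1, 0.1, and 0.4. The two function names are chosen for this illustration.

```python
def variance_by_definition(values, probabilities):
    """Sum of (x_i - mu)^2 * p(x_i)."""
    mu = sum(x * p for x, p in zip(values, probabilities))
    return sum((x - mu) ** 2 * p for x, p in zip(values, probabilities))


def variance_by_shortcut(values, probabilities):
    """Sum of x_i^2 * p(x_i), minus mu^2."""
    mu = sum(x * p for x, p in zip(values, probabilities))
    mean_of_squares = sum(x ** 2 * p for x, p in zip(values, probabilities))
    return mean_of_squares - mu ** 2


values = [2, 4, 6, 8]
probabilities = [0.4, 0.1, 0.1, 0.4]

by_definition = variance_by_definition(values, probabilities)
by_shortcut = variance_by_shortcut(values, probabilities)
print(round(by_definition, 6), round(by_shortcut, 6))  # 7.4 7.4
```

The rounding only removes floating-point noise; both formulas give the same variance.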
Assessment 1.2:
Sometimes, you want to find out how choosing a random event will benefit you in the
long run. Suppose you join a raffle event that will cost you a ticket worth Php 200 for a chance
to win a grand prize of Php 10 000. You know that there are 500 tickets sold for the event and
you want to find out the amount of money that will pay you off for buying one of the tickets.
This situation will be best understood and decided if you know the concept of expected value.
Expected Value
Suppose that a random variable X assumes n possible values $x_1, x_2, x_3, \dots, x_n$
and the corresponding probability of each value is given by $p(x_i)$. Then the expected value of X,
denoted as E(X), is equal to the mean value of the probability distribution. Thus,
$E(X) = \sum_{i=1}^{n} x_i \, p(x_i)$
To illustrate this concept, assume that the random variable X gives the payoff from buying the ticket.
There can only be two values for the payoff: if you win, the payoff will be Php 9 800 (it is not Php 10 000 because
of the Php 200 cost of the ticket), and if you lose, the payoff will be –Php 200. The probability of winning
is $P(E) = \frac{n(E)}{n(S)}$, where n(E) = 1, which means only 1 ticket holder wins, and n(S) = 500, since
there were 500 tickets sold. Hence, the probability of winning is
$P(E) = \frac{n(E)}{n(S)} = \frac{1}{500}$
and the probability of losing is $\frac{499}{500}$.
To find the expected payoff from joining the raffle, multiply each possible payoff value by its
corresponding probability and add all the products. We use the formula
$E(X) = \sum_{i=1}^{n} x_i \, p(x_i)$
$E(X) = (9\,800)\left(\frac{1}{500}\right) + (-200)\left(\frac{499}{500}\right)$
$E(X) = 19.6 - 199.6 = -180$
The expected payoff for each ticket is –180, a negative number, which means that each of the 500
raffle tickets is expected to have an average loss of Php 180 in the long run.
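The phrase "in the long run" can be illustrated by simulating many raffles and averaging the payoff of one ticket. This is a rough sketch: the function name `simulate_raffle`, the seed, and the number of repetitions are assumptions of this illustration, and the simulated average only approximates the exact value of –180.

```python
import random


def simulate_raffle(num_raffles, tickets_sold=500, prize=10_000,
                    ticket_price=200, seed=0):
    """Average payoff of holding one ticket (ticket 0) over many raffles."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_raffles):
        winning_ticket = rng.randrange(tickets_sold)  # each ticket equally likely
        if winning_ticket == 0:
            total += prize - ticket_price  # win: Php 9 800 net
        else:
            total -= ticket_price          # lose: -Php 200
    return total / num_raffles


average = simulate_raffle(200_000)
print(average)  # close to -180, but not exactly
```

A single raffle never pays –180; the payoff is either 9 800 or –200. The expected value describes the average over many repetitions.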
The expected value of a random variable, also known as expectation or payoff value, is the mean of
the probability distribution of the given random variable.
Example 14: A card is drawn at random from a deck of cards consisting of cards numbered 1 through 5. A
player wins Php 100 if the number on the card is even and loses Php 100 if the number on the
card is odd. What is the expected value of his winnings?
Solution:
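The cards are numbered
1 2 3 4 5
The probability of winning is $P(E) = \frac{n(E)}{n(S)}$ with n(E) = 2 (the even cards, 2 and 4) and n(S) = 5, the number of cards in the deck.
Hence, the probability of winning is
$P(E) = \frac{n(E)}{n(S)} = \frac{2}{5}$
The probability of losing is $P(E) = \frac{n(E)}{n(S)}$ with n(E) = 3 (the odd cards, 1, 3, and 5) and n(S) = 5.
Hence, the probability of losing is
$P(E) = \frac{n(E)}{n(S)} = \frac{3}{5}$
$E(X) = (100)\left(\frac{2}{5}\right) + (-100)\left(\frac{3}{5}\right)$   Substitute $x_1 = 100$ and $p(x_1) = \frac{2}{5}$; $x_2 = -100$ and $p(x_2) = \frac{3}{5}$.
$E(X) = 40 - 60$   Simplify.
$E(X) = -20$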
The expected value is –20. Thus, the player is expected to lose an average of Php 20 in the game.
Example 15: A lottery that pays off Php 300 000 000 is made available for 10 000 000 tickets. Each ticket costs
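Php 50. Suppose the variable Z gives the net winnings from playing the lottery. What is the
expected gain for joining the lottery with only one ticket?
Solution:
The probability of winning is $P(E) = \frac{n(E)}{n(S)}$, where n(E) = 1 (only 1 ticket wins) and
n(S) = 10 000 000 (the number of tickets).
$P(E) = \frac{n(E)}{n(S)} = \frac{1}{10\,000\,000}$
The net winnings are Php 299 999 950 for the winning ticket (observe that 300 000 000 – 50 = 299 999 950)
and –Php 50 for each of the other 9 999 999 tickets.
$E(Z) = (299\,999\,950)\left(\frac{1}{10\,000\,000}\right) + (-50)\left(\frac{9\,999\,999}{10\,000\,000}\right)$
$E(Z) = 29.999995 - 49.999995$
$E(Z) = -20$
The expected gain is –20. Thus, a player with one ticket is expected to lose an average of Php 20.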
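Both the raffle example and Example 15 have exactly one winning ticket, so the expected value can be simplified in general. In the sketch below, the symbols A (the prize), c (the ticket price), and N (the number of tickets sold) are notation introduced here for illustration and do not appear in the module.

```latex
\begin{align*}
E(X) &= (A - c)\cdot\frac{1}{N} + (-c)\cdot\frac{N - 1}{N} \\
     &= \frac{A - c - cN + c}{N} \\
     &= \frac{A}{N} - c
\end{align*}
```

For the raffle, 10 000/500 – 200 = –180. For the lottery, 300 000 000/10 000 000 – 50 = –20. Both agree with the values computed term by term.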
Assessment 1.3:
“Random events, however random they may be, can still be understood by
assigning a random variable. Probability distributions can be used to represent and solve
problems concerning the randomness of an event in the real world.”
BOOKS
Canlapan, R. & Campena, F. (2016). Statistics and Probability. Makati City, Philippines: Diwa Learning Center.