Probability
Basic concepts
Distributions
Probability
• Inferential statistics is built on the foundation of probability theory
• Random experiment: a process that leads to one of two or more possible
outcomes, where it is not known in advance which outcome will occur
• A coin is tossed and the outcome is either a head or a tail
• A Globo order receives a score from 1 to 5
• A customer enters a store and either purchases a shirt or does not
• A die is rolled and shows one of six equally likely outcomes
Random variables and their probability
distributions
• A random variable is one that takes on numerical values and has an
outcome that is determined by an experiment.
• The number of heads appearing in 10 flips of a coin
• The average trip rating a Yandex taxi driver receives today
• Random variables: X, Y, Z (uppercase letters)
• Outcomes of random variables: x, y, z (lowercase letters)
• X is the number of heads appearing in 10 flips of a coin and takes a value in
the set {0, 1, 2, 3, …, 10}
• x is some particular outcome, e.g., x = 6
Continuous Random Variable
• A random variable is a continuous random variable if it can take any
value in an interval
• The yearly income for a family
• The amount of oil imported to Tajikistan in a particular month
• The change in the exchange rate between KGS and USD in a month
• The level of CO2 emissions into Bishkek's air on a given day
Discrete Random Variables
• A discrete random variable is one that takes on only a finite or
countably infinite number of values.
• E.g., a Bernoulli (binary) random variable, the simplest discrete random
variable
• X can take values of 1 or 0
• In coin-flipping, P(X = 1) = ½ and P(X = 0) = ½
• More generally, P(X = 1) = θ and P(X = 0) = 1 - θ
• For discrete random variables, if X takes on the k possible values {x1, x2, …, xk}, then
the probabilities p1, p2, …, pk are defined by
• pj = P(X = xj), j = 1, 2, …, k, where each pj is between 0 and 1 and ∑pj = 1
Discrete Random Variables: pdf
• The probability density function (pdf) of a random variable is a
representation of the probabilities of all the possible outcomes (it may
be algebraic, graphical, or tabular)
• The probability density function of X is
f(xj) = pj, j = 1, 2, …, k
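• For a discrete random variable, the cumulative distribution function (cdf)
is the sum of the pdf over all values xj such that xj ≤ x:
F(x) = P(X ≤ x) = ∑ f(xj), summed over all xj ≤ x
• The cdf then satisfies
0 ≤ F(x0) ≤ 1 for every number x0
If x0 < x1, then F(x0) ≤ F(x1)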
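As a minimal sketch of how a cdf is built from a pdf, the example below uses the fair six-sided die from the earlier list of random experiments. The names `values`, `pdf`, `cdf`, and `F` are illustrative choices, not notation from the slides.

```python
from fractions import Fraction
from itertools import accumulate

# Outcomes and pdf of a fair six-sided die: each face has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
pdf = [Fraction(1, 6)] * 6

# The cdf at each x_j is the running sum of the pdf up to and including x_j.
cdf = list(accumulate(pdf))

def F(x):
    """Return P(X <= x) by summing f(x_j) over all x_j <= x."""
    return sum((p for v, p in zip(values, pdf) if v <= x), Fraction(0))

for v, c in zip(values, cdf):
    print(f"F({v}) = {c}")
```

The printed values rise from 1/6 to 1, which illustrates both properties listed above: F stays between 0 and 1 and never decreases.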
cdf: illustration
• Fit Motors is a car dealer in Kochkor. Based on an analysis of its sales
history, the managers know that on any single day the number of
Honda Fit cars sold can vary from 0 to 5. How can the probability
distribution function shown in the table be used for inventory
planning?
Expected value of a Discrete Random
Variable
• The expected value of a random variable is also called its mean and is
denoted μ.
• The expected value E[X] of a discrete random variable X is defined as
E[X] = μ = ∑xP(x)
• Example: the probability distribution for the number of errors (X) in a
lecture is
P(0) = 0.81, P(1) = 0.17, P(2) = 0.02
Find the expected number of errors per lecture.
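The expected number of errors can be worked out directly from the definition E[X] = ∑xP(x). The short sketch below does that arithmetic; the dictionary name `pdf` is just an illustrative choice.

```python
# Probability distribution of the number of errors X per lecture (from the example).
pdf = {0: 0.81, 1: 0.17, 2: 0.02}

# E[X] is the sum of x * P(x) over all possible values x.
mean = sum(x * p for x, p in pdf.items())

print(f"E[X] = {mean:.2f}")  # E[X] = 0.21
```

That is, 0(0.81) + 1(0.17) + 2(0.02) = 0.21 errors per lecture on average.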
Variance of a Discrete Random Variable
• The expectation of the squared deviations about the mean, (X – μ)²,
is called the variance, denoted σ² and given by
σ² = E[(X – μ)²] = ∑ (x – μ)² P(x)
• The variance of a discrete random variable X can also be expressed as
σ² = E[X²] – μ² = ∑ x² P(x) – μ²
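To check that the two variance formulas agree, the sketch below applies both to the lecture-errors distribution from the expected-value example (P(0) = 0.81, P(1) = 0.17, P(2) = 0.02). The variable names are illustrative.

```python
# Distribution of the number of errors X per lecture (from the earlier example).
pdf = {0: 0.81, 1: 0.17, 2: 0.02}

# Mean: mu = sum of x * P(x).
mu = sum(x * p for x, p in pdf.items())

# Definition: sigma^2 = sum of (x - mu)^2 * P(x).
var_definition = sum((x - mu) ** 2 * p for x, p in pdf.items())

# Shortcut: sigma^2 = sum of x^2 * P(x), minus mu^2.
var_shortcut = sum(x ** 2 * p for x, p in pdf.items()) - mu ** 2

print(f"mu = {mu:.2f}")
print(f"variance (definition) = {var_definition:.4f}")
print(f"variance (shortcut)   = {var_shortcut:.4f}")
```

Both routes give σ² ≈ 0.2059. The shortcut is usually quicker by hand because it avoids computing each deviation.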
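Poisson distribution
• The Poisson probability distribution is
P(x) = e^(–λ) λ^x / x!, x = 0, 1, 2, …
• where P(x) is the probability of x successes over a given time or space, given λ
• λ = the expected number of successes per time or space unit, λ > 0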
• e is the base of the natural logarithm (≈ 2.71828)
• The mean and variance of the Poisson distribution are
• μx = E[X] = λ
• σx² = E[(X – μx)²] = λ
Poisson distribution: illustration
• Asel, a computer center manager, reports that her computer system
experienced three component failures during the past 100 days. Based on
this record, the expected number of failures per day is λ = 3/100 = 0.03
• a. What is the probability of no failures in a given day?
• b. What is the probability of one or more component failures in a
given day?
• c. What is the probability of at least two failures in a 3-day period?
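One way to work through parts a to c is sketched below. It assumes that the number of failures follows a Poisson distribution with the standard probability function P(x) = e^(–λ) λ^x / x!, and that the rate scales with the length of the period, so a 3-day period has λ = 3 × 0.03 = 0.09. The helper `poisson_pmf` is a hypothetical name introduced for this sketch.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability of exactly x successes when the expected count is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam_day = 3 / 100  # expected failures per day

# a. Probability of no failures in a given day: P(0).
p_none = poisson_pmf(0, lam_day)

# b. Probability of one or more failures in a given day: 1 - P(0).
p_one_or_more = 1 - p_none

# c. At least two failures in a 3-day period, assuming the rate scales to 3 * lam_day.
lam_3days = 3 * lam_day
p_at_least_two = 1 - poisson_pmf(0, lam_3days) - poisson_pmf(1, lam_3days)

print(f"a. P(X = 0)  = {p_none:.4f}")          # 0.9704
print(f"b. P(X >= 1) = {p_one_or_more:.4f}")   # 0.0296
print(f"c. P(X >= 2) = {p_at_least_two:.4f}")  # 0.0038
```

Parts b and c use the complement rule, which avoids summing over infinitely many outcomes.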
Poisson approximation to the Binomial
distribution
• Suppose the number of successes X follows a binomial distribution with n
trials and probability of success P
• If n is large and the mean of X, nP, is of only moderate size (preferably nP ≤ 7),
• then this distribution can be approximated by the Poisson distribution with
λ = nP. The probability distribution of the approximating distribution
is then
P(x) = e^(–nP) (nP)^x / x!, x = 0, 1, 2, …
Poisson approximation to the Binomial
distribution: illustration
• An analyst predicted that 3.5% of all small corporations would file for
bankruptcy in the coming year. For a random sample of 100 small
corporations, estimate the probability that at least 3 will file for
bankruptcy in the next year, assuming that the analyst’s prediction is
correct.
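A sketch of the calculation is given below. It takes λ = nP = 100 × 0.035 = 3.5, applies the Poisson probability function P(x) = e^(–λ) λ^x / x!, and compares the result with the exact binomial probability as a check. The helper names `poisson_pmf` and `binom_pmf` are hypothetical and introduced only for this sketch.

```python
import math

n, p = 100, 0.035   # sample size and predicted bankruptcy rate (from the example)
lam = n * p         # Poisson parameter: lambda = nP = 3.5

def poisson_pmf(x, lam):
    """Poisson probability of exactly x successes with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def binom_pmf(x, n, p):
    """Exact binomial probability of exactly x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# P(X >= 3) = 1 - [P(0) + P(1) + P(2)] under each distribution.
approx = 1 - sum(poisson_pmf(x, lam) for x in range(3))
exact = 1 - sum(binom_pmf(x, n, p) for x in range(3))

print(f"Poisson approximation: {approx:.4f}")
print(f"Exact binomial:        {exact:.4f}")
```

The Poisson estimate is about 0.68, and the two results should differ only slightly because n is large and nP is well below 7.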
Joint distributions, conditional distributions,
and independence
• Let X and Y be discrete random variables. Then, (X,Y) have a joint
distribution, which is fully described by the joint probability density
function of (X,Y):
fX,Y(x,y) = P(X = x, Y = y)
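• Random variables X and Y are independent if and only if, for all x and y,
fX,Y(x,y) = fX(x) fY(y)
where fX and fY are the marginal probability density functions of X and Y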
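The independence condition can be checked numerically for a small joint table. The sketch below uses two made-up joint distributions for illustration: two separate fair coin flips (independent), and one flip recorded twice (dependent). The function names `marginals` and `is_independent` are hypothetical.

```python
import math
from collections import defaultdict

def marginals(joint):
    """Sum the joint pdf over y to get fX, and over x to get fY."""
    fx, fy = defaultdict(float), defaultdict(float)
    for (x, y), prob in joint.items():
        fx[x] += prob
        fy[y] += prob
    return fx, fy

def is_independent(joint, tol=1e-9):
    """Check fX,Y(x, y) == fX(x) * fY(y) for every pair (x, y)."""
    fx, fy = marginals(joint)
    return all(
        math.isclose(joint.get((x, y), 0.0), fx[x] * fy[y], abs_tol=tol)
        for x in fx
        for y in fy
    )

# Illustrative data: X and Y are two separate fair coin flips (1 = head).
two_flips = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

# Illustrative data: Y is the same flip as X, so Y always equals X.
same_flip = {(0, 0): 0.5, (1, 1): 0.5}

print(is_independent(two_flips))  # True
print(is_independent(same_flip))  # False
```

In the second table, P(X = 0, Y = 1) = 0 while fX(0) fY(1) = 0.25, so the product rule fails and X and Y are not independent.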