Math
Population = the complete collection of target data. Statistics computed from the whole population are accurate, but
gathering the whole population is time-consuming.
Random sample = a sample in which every data point in the population has an equal chance of being selected.
Frequency distribution = a table that shows how often each category or value occurs.
Interval width = how wide an interval is. For 10 ≤ x < 20, the interval width is 10.
Mid-interval value = the value halfway along the interval. The mid-interval value for
10 ≤ x < 20 is 15 because it is halfway between 10 and 20.
Upper and lower interval boundaries = the boundaries stated. Be careful with < and ≤.
For mean, variance and standard deviation you must be able to use the formula. Standard
deviation is essentially showing us “on average, how far away from the mean are the
individual values?” Variance is just st.dev squared.
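To see the formulas in action, here is a small sketch in Python. The grouped table is made up for illustration, each interval is represented by its mid-interval value, and the variance divides by n (the population version), which is an assumption about which formula you are asked to use.

```python
import math

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
intervals = [(10, 20, 3), (20, 30, 5), (30, 40, 2)]

def grouped_stats(table):
    """Estimate mean, variance and standard deviation from a grouped table."""
    n = sum(f for _, _, f in table)
    # Represent every interval by its mid-interval value
    mids = [((lo + hi) / 2, f) for lo, hi, f in table]
    mean = sum(m * f for m, f in mids) / n
    # Population variance: average squared distance from the mean
    variance = sum(f * (m - mean) ** 2 for m, f in mids) / n
    return mean, variance, math.sqrt(variance)

mean, variance, st_dev = grouped_stats(intervals)
print(mean, variance, st_dev)
```

For this table the mean is 24, the variance is 49 and the standard deviation is 7, so the variance really is the standard deviation squared.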
2. Concepts of trial, outcome, equally likely outcomes, sample space (U) and event. The
probability of an event A as $P(A) = \frac{n(A)}{n(U)}$. The complementary events A and A′ (not A). Use
of Venn diagrams, tree diagrams, counting principles and tables of outcomes to solve
problems.
- Be able to use basic probability concepts:
Get familiar with how to use Venn diagrams, tree diagrams, means and tables. Practice,
practice, practice.
[Figure: Venn diagrams of events A and B]
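A table of outcomes can also be built by brute force. The sketch below lists the sample space for rolling two dice (a made-up example, not from the syllabus) and applies P(A) = n(A) / n(U) directly. The helper names `probability` and `total_is_7` are only illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space U: all 36 equally likely outcomes of rolling two dice
U = list(product(range(1, 7), repeat=2))

def probability(event):
    """P(A) = n(A) / n(U), where event is a true/false test on one outcome."""
    n_A = sum(1 for outcome in U if event(outcome))
    return Fraction(n_A, len(U))

def total_is_7(outcome):
    return sum(outcome) == 7

p_A = probability(total_is_7)   # event A: the total is 7
p_not_A = 1 - p_A               # complementary event A'
print(p_A, p_not_A)
```

Here P(A) = 1/6 and P(A′) = 5/6, which matches counting the outcomes that do not add up to 7.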
4. Conditional probability; the definition. Independent events; the definition. Use of Bayes’
theorem for a maximum of three events.
- Be able to understand conditional probability:
Conditional probability is when you are given additional information, so that the question becomes “given
B, calculate the probability of A”. In other words, find the probability of having A out of B.
[Figure: Venn diagram of events A and B]
Mathematically, $P(A|B) = \frac{P(A \cap B)}{P(B)}$. You can see now that the denominator, the total number
of events, is now the events for B. So our “total” has become B, and we try to find the events
A within B.
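A quick numerical sketch of "B becomes the total". The class numbers are invented: 30 students, 12 take art (event B), and 5 take both music and art (A ∩ B).

```python
from fractions import Fraction

# Hypothetical counts for illustration only
total = 30        # n(U): all students
n_B = 12          # students who take art
n_A_and_B = 5     # students who take both music and art

p_B = Fraction(n_B, total)
p_A_and_B = Fraction(n_A_and_B, total)

# P(A|B) = P(A n B) / P(B)
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)
```

The answer is 5/12, which is the same as counting directly: 5 music students out of the 12 art students.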
You should also be able to use Bayes’ theorem, but I have never ever seen a question about
it. Plus, it is in the formula booklet so it is not so difficult to use. Essentially, it is about
combining events (maximum 3) into expressions that you can solve. The key is to identify
that 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐵 ∩ 𝐴).
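For two events, the booklet form of Bayes' theorem is P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B′)P(B′)]. The sketch below just plugs numbers into it. The factory numbers and the function name `bayes` are assumptions made up for the example.

```python
from fractions import Fraction

def bayes(p_B, p_A_given_B, p_A_given_not_B):
    """Return P(B|A) from P(B), P(A|B) and P(A|B')."""
    numerator = p_A_given_B * p_B                    # P(A n B)
    p_A = numerator + p_A_given_not_B * (1 - p_B)    # total probability of A
    return numerator / p_A

# Hypothetical: machine B makes 30% of the items, 2% of its items are faulty (A),
# and 1% of the items from the other machines are faulty.
p_B_given_A = bayes(Fraction(3, 10), Fraction(1, 50), Fraction(1, 100))
print(p_B_given_A)
```

With these numbers, a faulty item came from machine B with probability 6/13.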
Independent events: the probability of one event does not change when the other event occurs. To identify them,
check whether P(A ∩ B) = P(A) × P(B), or equivalently whether P(A|B) = P(A).
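One standard test for independence is to check whether P(A ∩ B) = P(A) × P(B). The sketch below does this by counting outcomes for two dice. The events and helper names are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice
U = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for o in U if event(o)), len(U))

def independent(event_1, event_2):
    """True when P(event_1 n event_2) equals P(event_1) * P(event_2)."""
    both = prob(lambda o: event_1(o) and event_2(o))
    return both == prob(event_1) * prob(event_2)

def first_is_even(o):
    return o[0] % 2 == 0

def total_is_7(o):
    return sum(o) == 7

def total_is_8(o):
    return sum(o) == 8

seven_case = independent(first_is_even, total_is_7)
eight_case = independent(first_is_even, total_is_8)
print(seven_case, eight_case)
```

"First die is even" is independent of "total is 7" (1/12 = 1/2 × 1/6), but not of "total is 8" (1/12 ≠ 1/2 × 5/36).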
5. Concept of discrete and continuous random variables and their probability distributions.
Definition and use of probability density functions. Expected value (mean), mode, median,
variance and standard deviation.
- Be able to understand probability distribution and probability density function:
Probability distribution = a table that shows the probabilities of events. The particular
function that is used to find the probability of a specific event is our probability density
function.
There are two types of probability distributions: discrete and continuous. Let’s start with
discrete.
©Ibling
For a discrete variable, the function is f(x) = P(X = x), where the lowercase x is the target value.
Be able to find expected value, mode, median, variance and standard deviation in terms of
P(X=x), discrete variable:
$E(X) = \text{mean} = \sum_{i=1}^{n} x_i P(X = x_i)$, where $P(X = x)$ can be substituted as $f(x)$. The basic
idea here is that the probability is fundamentally $P(X = x) = \frac{\text{frequency}}{\text{total}}$. This is then the
same as the previous mean formula we learnt, which is:
$$\text{mean} = \frac{\sum_{i=1}^{k} x_i f_i}{n}, \qquad \frac{f_i}{n} = \text{probability}$$
Mode = the most frequently occurring x. This has the highest f(x) = P(X = x). You simply
have to plug the numbers into the function and see which one has the highest probability.
Median = the x value that gives a cumulative probability of exactly 0.5. This is most
often possible with a continuous distribution.
Variance = $\sigma^2 = E(X^2) - [E(X)]^2$, and the standard deviation is $\sigma$.
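Here is a sketch of all five quantities for a small made-up distribution table. Since a cumulative value of exactly 0.5 often does not exist for a discrete variable, the `median` helper assumes one common convention: take the smallest x whose cumulative probability reaches 0.5.

```python
from fractions import Fraction
import math

# Hypothetical distribution table: x -> P(X = x)
pmf = {0: Fraction(1, 10), 1: Fraction(3, 10), 2: Fraction(4, 10), 3: Fraction(2, 10)}
assert sum(pmf.values()) == 1  # probabilities must add up to 1

expected = sum(x * p for x, p in pmf.items())                    # E(X)
variance = sum(x**2 * p for x, p in pmf.items()) - expected**2   # E(X^2) - [E(X)]^2
st_dev = math.sqrt(variance)
mode = max(pmf, key=pmf.get)                                     # x with the highest P(X = x)

def median(table):
    """Smallest x whose cumulative probability reaches 0.5 (assumed convention)."""
    cumulative = 0
    for x in sorted(table):
        cumulative += table[x]
        if cumulative >= Fraction(1, 2):
            return x

print(expected, variance, st_dev, mode, median(pmf))
```

For this table E(X) = 17/10, Var(X) = 81/100, the standard deviation is 0.9, and both the mode and the median are 2.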
For a continuous variable, we work with P(a ≤ X ≤ b) instead of P(X = x). Why? For continuous values, we use
integrals because that is the only way to calculate the area under continuous functions (area
under = the probability in this case), and an integral cannot be taken at a single value. Having an
integral at one value would mean an infinitesimally thin line, which gives a value of 0. So we
write the probability from a continuous probability density function f(x) as:
$$P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a)$$
Be able to find expected value, mode, median, variance and standard deviation in terms of
P(a≤x≤b), continuous variable:
The idea is exactly the same as discrete, only that we now replace the sigma with an integral.
$$E(X) = \text{mean} = \int_{x_{\min}}^{x_{\max}} x f(x)\,dx$$
where $x_{\min}$ and $x_{\max}$ are the lowest and highest possible values of x.
Mode = the most frequently occurring x. In continuous, we find the mode by simply finding
the maximum point. In other words, the mode is where f′(x) = 0 and f′′(x) < 0.
Median = the x value that gives a cumulative probability of exactly 0.5. Thus:
$\int_{x_{\min}}^{m} f(x)\,dx = 0.5$, where $x_{\min}$ is the lowest possible value of x, and you solve for “m”, which is the median.
Also, we use the same idea to find the first quartile and third quartile. For the first quartile we set
the integral equal to 0.25, and for the third quartile to 0.75.
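The same steps can be checked numerically. This sketch uses a made-up density f(x) = 12x²(1 − x) on 0 ≤ x ≤ 1, a simple midpoint rule instead of exact integration, and a bisection search for the median and quartiles. All function names are illustrative, and in an exam you would integrate by hand or use the GDC.

```python
def f(x):
    """Hypothetical density on 0 <= x <= 1: f(x) = 12x^2(1 - x)."""
    return 12 * x**2 * (1 - x)

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def quantile(p, lo=0.0, hi=1.0):
    """Bisection search for m such that the integral of f from lo to m is p."""
    a, b = lo, hi
    for _ in range(40):
        m = (a + b) / 2
        if integrate(f, lo, m, steps=2_000) < p:
            a = m   # not enough area yet, move right
        else:
            b = m   # too much area, move left
    return (a + b) / 2

total_area = integrate(f, 0, 1)                      # should be close to 1
mean = integrate(lambda x: x * f(x), 0, 1)           # E(X)
median = quantile(0.5)
q1, q3 = quantile(0.25), quantile(0.75)
mode = max((i / 1000 for i in range(1001)), key=f)   # grid search for the maximum

print(total_area, mean, median, q1, q3, mode)
```

By hand, this density gives E(X) = 0.6 and a mode at x = 2/3 (from f′(x) = 24x − 36x² = 0), and the median solves 4m³ − 3m⁴ = 0.5, which is about 0.614. The numerical values should agree with these to a few decimal places.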
Continuous uniform distribution = a distribution that has an equal probability density across the
interval.
Since $\int_a^b f(x)\,dx = 1$, the area under that rectangle is 1. This is obvious since the maximum
probability is 1. Therefore the height of the rectangle must be $\frac{1}{b-a}$ because $\frac{1}{b-a} \times (b - a) = 1$.
And as we know from earlier, the height here is the probability density, so there is a uniform
density of $\frac{1}{b-a}$ across all possible events (x values within the given boundaries). This makes
the integral easier since the density is constant and hence can be brought out from the integral
sign.
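As a short worked sketch of bringing the constant out of the integral, here is the expected value of a uniform distribution on the interval from a to b, using the same E(X) integral as for any continuous variable:

```latex
E(X) = \int_a^b x \cdot \frac{1}{b-a}\,dx
     = \frac{1}{b-a} \int_a^b x\,dx
     = \frac{1}{b-a} \cdot \frac{b^2 - a^2}{2}
     = \frac{(b-a)(b+a)}{2(b-a)}
     = \frac{a+b}{2}
```

So the mean is simply the midpoint of the interval, which is what you would expect from a flat distribution.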
6. Binomial distribution, its mean and variance. Poisson distribution, its mean and variance.
- Be able to use binomial distribution:
Binomial distribution = a model for the number of successes in repeated trials with a True/False result, hence the
name bi-nomial. A very simple example is the distribution of flipping a coin.
$$P(X = r) = \binom{n}{r} P_{\text{success}}^{\,r} \times P_{\text{failure}}^{\,n-r}$$
Where $P_{\text{success}} + P_{\text{failure}} = 1$, r = number of successes, n = number of trials,
P = probability
Then you should be able to find E(X) and Var(X) of binomial distributions. The
formula is written in the booklet!
Its cumulative would simply be the sum of the distribution just like all cumulative results. We
would therefore write it as P(X ≤ r), where “r” is the specific number of successes.
$$P(X \le r) = \sum_{k=0}^{r} \binom{n}{k} P_{\text{success}}^{\,k} \times P_{\text{failure}}^{\,n-k}$$
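A minimal sketch of the binomial formula and its cumulative version, using 10 flips of a fair coin as a made-up example. The function names are illustrative, and the mean and variance use the standard results E(X) = np and Var(X) = np(1 − p).

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r): exactly r successes in n trials with success probability p."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(r, n, p):
    """P(X <= r): add up P(X = k) for k = 0, 1, ..., r."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))

n, p = 10, 0.5                         # hypothetical: 10 flips of a fair coin
p_exactly_3 = binomial_pmf(3, n, p)    # 120/1024
p_at_most_3 = binomial_cdf(3, n, p)    # (1 + 10 + 45 + 120)/1024

mean = n * p                           # E(X) = np
variance = n * p * (1 - p)             # Var(X) = np(1 - p)

# Cross-check E(X) against the definition: sum of r * P(X = r)
mean_from_pmf = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))

print(p_exactly_3, p_at_most_3, mean, variance, mean_from_pmf)
```

The cross-check at the end shows that the booklet formula np gives the same mean as summing r × P(X = r) over every possible r.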
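Poisson distribution = a model for the number of occurrences given an expected rate. It is extremely important to note the
word “rate” here. Think of the Poisson rate as a fraction.
$$P(X = x) = \frac{m^x e^{-m}}{x!} \quad \text{or} \quad \frac{\lambda^x e^{-\lambda}}{x!}$$
$$\lambda = \frac{\text{average occurrences}}{\text{space or time}}$$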
Thus, when solving a Poisson distribution, make sure that the units of λ are the same as in the
question. E.g., a car breaks down 8 times per week. The λ here would be 8. However, if a question
instead asks for the probability of the car breaking down 4 times in a day, you adjust the rate. λ would
now be $\frac{8}{7}$, simply because you have divided the per-week rate by 7 to make it per day. Then
you plug into the formula and solve!
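The car example above can be sketched in a few lines. The only thing that changes between the two questions is the rate that goes into the formula. The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with rate lam."""
    return lam**x * exp(-lam) / factorial(x)

rate_per_week = 8                  # the car breaks down 8 times per week
rate_per_day = rate_per_week / 7   # same rate, rescaled to "per day"

p_4_in_a_day = poisson_pmf(4, rate_per_day)     # uses lambda = 8/7
p_8_in_a_week = poisson_pmf(8, rate_per_week)   # uses lambda = 8

# Cross-check: the mean computed from the distribution should come out as lambda
mean_per_day = sum(x * poisson_pmf(x, rate_per_day) for x in range(60))

print(p_4_in_a_day, p_8_in_a_week, mean_per_day)
```

With λ = 8/7, the probability of 4 breakdowns in one day comes out at roughly 0.023, and the cross-check returns a mean of about 8/7.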
This feels like “wth” in the beginning but after about 5 questions, it will be the easiest
distribution you have done.
An important characteristic of Poisson is that E(X) and Var(X) are both equal to λ.
Please realize that different distributions have different characteristics, so use them wisely. A
question may ask whether a certain distribution can be expressed as another distribution.
In those cases, E(X) and Var(X) will come in handy.
1. Within one standard deviation of the mean, we find about 68% of the values. Within two standard
deviations, we find about 95% of the values. Within three standard deviations, almost all
values are within that boundary.
2. It is symmetrical about 𝜇. The graph above is standardized so 𝜇 is 0. You will learn more
about standardization below.
3. It is bell-shaped.
4. Maximum value is $\frac{1}{\sigma\sqrt{2\pi}}$. This means that if there is a high standard deviation, the values are
spread out so the peak is low. It is the opposite for a low standard deviation.
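These properties can be checked with Python's built-in `statistics.NormalDist`. The helper names `within_k_sd` and `peak_height` are made up for this sketch, and the default μ = 0, σ = 1 are just convenient values.

```python
from math import pi, sqrt
from statistics import NormalDist

def within_k_sd(k, mu=0.0, sigma=1.0):
    """Probability that a value lies within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

def peak_height(sigma, mu=0.0):
    """Height of the curve at the mean, which should equal 1 / (sigma * sqrt(2*pi))."""
    return NormalDist(mu, sigma).pdf(mu)

one_sd = within_k_sd(1)     # about 0.68
two_sd = within_k_sd(2)     # about 0.95
three_sd = within_k_sd(3)   # about 0.997

narrow_peak = peak_height(1)   # low standard deviation: tall curve
wide_peak = peak_height(2)     # high standard deviation: flatter curve

print(one_sd, two_sd, three_sd, narrow_peak, wide_peak)
```

Doubling the standard deviation halves the peak height, which is the "spread out, so lower" idea in point 4.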
The whole idea is to convert $X \sim N(\mu, \sigma^2)$ to $Z \sim N(0, 1)$. This gets rid of the units and makes
all normal distributions comparable, hence the term “standardizing” the distribution.
$$Z = \frac{x - \mu}{\sigma}$$
This also feels like alien language but after a couple of questions, it will be simple.
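A small sketch of standardizing, with invented numbers: heights with μ = 170 and σ = 10, and the question "what is P(X < 185)?". It uses Python's built-in `statistics.NormalDist` in place of the GDC.

```python
from statistics import NormalDist

# Hypothetical question: X ~ N(170, 10^2), find P(X < 185)
mu, sigma = 170, 10
x = 185

z = (x - mu) / sigma                      # standardize: Z = (x - mu) / sigma
p_via_z = NormalDist(0, 1).cdf(z)         # look up P(Z < z) on the standard curve
p_direct = NormalDist(mu, sigma).cdf(x)   # same probability without standardizing

print(z, p_via_z, p_direct)
```

Here z = 1.5, and both routes give the same probability of about 0.933, which is the whole point of standardizing: the units disappear but the probability does not change.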
8. Use of calculators!!!
- I cannot emphasize this enough. If you know how to use your calculator, not only in
statistics but in all topics, your life will be so much easier.
SO PLEASE GRAB YOUR TEACHER AND MAKE HIM TEACH YOU HOW TO USE THE GDC.