Probability - Notes

The document provides an overview of key concepts in probability and statistics, including:

- Definitions of experiments, outcomes, sample spaces, and events.
- Calculating probabilities of events using formulas for unions, intersections, and complements of events.
- Conditional probability, independence, and Bayes' rule.
- Random variables, probability distributions, expected value, variance, and standard deviation.
- Formulas for combinations, permutations, indicator variables, linearity of expectation, and Markov's inequality.

Examples are provided to illustrate each concept.

Uploaded by ionut

*** I think some examples would help at each lesson; it's too abstract

01. Experiments, Outcomes, Sample Spaces, Events

1’00” Terminology

 Experiment: a process that results in exactly one of various possible outcomes


 Outcome (a.k.a. simple event, or sample point): the things that can happen in an experiment
 Sample space (a.k.a. probability space): the set of all possible outcomes. Sometimes denoted S.
 Event: a subset of the sample space, that is, a set of some of the outcomes. Sometimes denoted
as A.

2’30” The probability of event A is

P(A) = (# of outcomes in the event A) / (# of outcomes in the sample space)
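The counting formula can be sketched in Python (a toy die-roll example of my own, not from the lecture):

```python
from fractions import Fraction

# P(A) = (# of outcomes in A) / (# of outcomes in the sample space),
# for one roll of a fair six-sided die
S = {1, 2, 3, 4, 5, 6}                # sample space
A = {s for s in S if s % 2 == 0}      # event A: the roll is even

P_A = Fraction(len(A), len(S))
print(P_A)  # 1/2
```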

02. Combining Events: Multiplication, Addition

0’10” Union of events

A ∪ B is the set of outcomes in A or B, meaning at least one of A or B is true.

P (A ∪ B) = P(A) + P(B) – P(A∩B)

If A and B are disjoint events (no overlap), then P(A∩B) = 0, so we get

P(A ∪ B) = P(A) + P(B)

4’20” Intersection of events

A∩B is the set of outcomes in A and B, meaning both of A and B are true.
To find P(A∩B), the multiplication rule is often useful: if you have m possible outcomes for one
experiment and n possible outcomes for a second, independent experiment, then there are mn possible
combined outcomes.
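The multiplication rule can be checked by brute force (my own coin-and-die example):

```python
from itertools import product

# m = 2 outcomes for a coin, n = 6 for a die: mn combined outcomes
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

combined = list(product(coin, die))
print(len(combined))  # 12 = 2 * 6
```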

5’50” Conditional probability

The conditional probability P(A|B), P(A given B), is the probability of event A, given that event B is true.

P(A|B) = P(A ∩ B) / P(B)
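The formula reduces to a ratio of counts when all outcomes are equally likely; a sketch with two dice (my own example, not the lecture's):

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))      # roll two dice
B = [s for s in S if s[0] == 3]               # B: first die shows 3
A_and_B = [s for s in B if s[0] + s[1] == 8]  # A ∩ B: ...and the sum is 8

# P(A|B) = P(A ∩ B) / P(B), computed by counting outcomes
P_A_given_B = Fraction(len(A_and_B), len(B))
print(P_A_given_B)  # 1/6
```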

8’20” Independence

Events A and B are independent if knowing that B is true would not change the probability that A is true:

P(A) = P(A|B)

[the probability that A is true = the probability that A is true given that B is true → knowing something about B adds no extra information about A]

Independent does not mean disjoint (disjoint would mean P(A∩B) = 0).

If A and B are independent, then

P(A) = P(A ∩ B) / P(B)

[the right-hand side is the same as for P(A|B), because we said that if they are independent then P(A) = P(A|B)]
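Independence can also be verified by checking P(A∩B) = P(A)P(B); a two-coin sketch of mine:

```python
from fractions import Fraction
from itertools import product

S = list(product(["H", "T"], repeat=2))  # two fair coin flips
P = lambda event: Fraction(sum(1 for s in S if event(s)), len(S))

P_A = P(lambda s: s[0] == "H")                   # first flip heads
P_B = P(lambda s: s[1] == "H")                   # second flip heads
P_AB = P(lambda s: s[0] == "H" and s[1] == "H")  # both heads

print(P_AB == P_A * P_B)  # True: the flips are independent
```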

03. Choices, Combinations, Permutations

9’25” Combinations (he writes them the other way around from how I learned! – and on a Romanian site, too, the n appears at the bottom, not the top)

We use combinations to count the number of ways to choose a group of r unordered objects from n possibilities without replacement:

C(n, r) = n! / (r! (n−r)!)
13’55” Permutations

We use permutations to count the number of ways to choose a group of r ordered objects from n possibilities without replacement:

P(n, r) = n! / (n−r)!
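Python's math module exposes both counts directly, which is handy for checking exercises (my own numbers):

```python
from math import comb, perm

# Choose r = 3 objects from n = 5 possibilities, without replacement
print(comb(5, 3))  # 10 unordered choices (combinations)
print(perm(5, 3))  # 60 ordered choices (permutations)
```

Note that perm(n, r) = comb(n, r) · r!, since each unordered choice can be arranged in r! orders.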

17’30” Key formulas

04. Inclusion, Exclusion

2’30” Inclusion / Exclusion for three events:

P(A∪B∪C) = P(A) + P(B) + P(C) − P(A∩B) − P(A∩C) − P(B∩C) + P(A∩B∩C)

19’50” Just a notation from the exercise

A := have at least one 3 means “A is defined to be” whatever appears on the right (it seems to be more of a personal notation of the presenter's). Example 3 is really neat.
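The three-event inclusion/exclusion identity can be checked by counting in a toy sample space (my own example):

```python
from fractions import Fraction

S = set(range(1, 101))              # sample space: the integers 1..100
A = {s for s in S if s % 2 == 0}    # divisible by 2
B = {s for s in S if s % 3 == 0}    # divisible by 3
C = {s for s in S if s % 5 == 0}    # divisible by 5

P = lambda E: Fraction(len(E), len(S))
lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))
print(lhs == rhs)  # True
```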

05. Independence

0’20” Definition: events A and B are independent if the following formula holds (see also lesson 2):

P(A) = P(A|B) → in some exercises it may seem you can just tell that they are independent, but it is best to always check with this formula.

06. Bayes' Rule

01’00” When to use Bayes’ rule:

- your sample space must be a disjoint union of events: S = B1∪B2∪…∪Bn

- then you have one event A that overlaps the others

- “Given that A occurred, what is the probability that one of the B’s is true?”

2’50” Bayes’ rule for two choices

If S = B1∪B2 is a disjoint union, then

P(B1|A) = P(A|B1)P(B1) / [P(A|B1)P(B1) + P(A|B2)P(B2)]


5’05” Bayes’ rule for multiple choices

If S = B1∪…∪Bn is a disjoint union, then

P(Bk|A) = P(A|Bk)P(Bk) / [P(A|B1)P(B1) + … + P(A|Bn)P(Bn)]

The exercises are cool and intelligible; I had trouble with the last one (5) – possibly revisit.
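Bayes' rule for two choices can be sketched numerically (the probabilities below are made up, not from the exercises):

```python
# S = B1 ∪ B2 is a disjoint union; A overlaps both parts
P_B1, P_B2 = 0.3, 0.7      # priors: they must sum to 1
P_A_given_B1 = 0.9
P_A_given_B2 = 0.2

# P(B1|A) = P(A|B1)P(B1) / [P(A|B1)P(B1) + P(A|B2)P(B2)]
P_A = P_A_given_B1 * P_B1 + P_A_given_B2 * P_B2
P_B1_given_A = P_A_given_B1 * P_B1 / P_A
print(round(P_B1_given_A, 4))  # 0.6585
```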

07. Random Variables, Probability Distribution

0’30” Intuition: A random variable Y is a quantity you keep track of during an experiment.

Definition: a random variable is a function from a sample space to R, the set of real numbers.

Y: S → R

7’25” Probability distributions

p(y) = P(Y=y) is the sum of all the probabilities of the outcomes for which Y=y:

Y – the random variable; y – a number, one possible value of the random variable Y; P(Y=y) – the event that the variable Y takes the particular value y

What is the probability that the random variable has a particular value? We try to find that probability
and we add all those probabilities over all the outcomes that lead to that value and we say that’s the
probability of that value.

The sigma formula, p(y) = Σ over outcomes E in S with Y(E)=y of P(E) (he tries to explain it in words, but says it is easier to understand from examples): we look at all the outcomes (E) in the sample space (S) for which the random variable has that particular value (Y(E)=y), and then we add up the probabilities (P(E)) of all those outcomes.

The function p(y) is the probability distribution of the random variable Y.


9’30” Examples – I chose examples 1 and 2 because each seemed to add something

Exemplul 1. You draw a card from a standard 52-card deck. If it’s ace through nine, I pay you that
amount. If it’s a ten or a face card, you pay me $10. What is the probability distribution for this random
variable?

So the question is – what is the probability of getting each of the different possible values of the random variable? That is, what is the probability of collecting exactly:

$0 → p(0) = P(Y=0) = 0. The probability is zero because either I pay him or he pays me.

$1 → p(1) = P(Y=1) = 4/52 = 1/13. There are four aces in a deck of cards.

$2 → p(2) = P(Y=2) = 4/52 = 1/13. Same principle as with the aces. Likewise for the values 3 through 9.

$10 → p(10) = P(Y=10) = 0. Because there is no way for me to win $10.

−$10 → p(-10) = P(Y=-10) = 16/52 = 4/13

That is our full probability distribution. Everything written above is the answer. Remember, the probability distribution means you are thinking of all the possible values of Y and calculating the probability of each one of those values.
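A quick sanity check of Example 1 in code: the probabilities over all values of Y must add to 1.

```python
from fractions import Fraction

p = {y: Fraction(4, 52) for y in range(1, 10)}  # ace ($1) through nine ($9)
p[0] = Fraction(0)            # impossible: someone always pays
p[10] = Fraction(0)           # impossible: I can never win $10
p[-10] = Fraction(16, 52)     # ten or a face card: 16 of the 52 cards

print(sum(p.values()))  # 1
```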

Exemplul 2

Flip a fair coin 10 times. Let Y be the number of heads. What is the probability distribution for this
random variable?

So we will compute p(y), where y runs over the possible values of Y. The possible values of y – that's the number of heads we can see here. The fewest is 0, the most is 10. So 0 ≤ y ≤ 10.

To get y heads, we must fill in 10 blanks with y H’s and 10−y T’s. We must choose y blanks to be H. There are (10 choose y) ways to do that (unordered, without replacement).

Each one has probability (1/2)^10. For example, TTHTTHHTTT has a (1/2)^10 chance of appearing.

So the total probability of getting all these sequences is:

p(y) = (10 choose y) / 2^10, for 0 ≤ y ≤ 10 → that is our probability distribution.
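Example 2 in code, to check the distribution:

```python
from fractions import Fraction
from math import comb

# p(y) = C(10, y) / 2^10, for y heads in 10 fair coin flips
p = {y: Fraction(comb(10, y), 2**10) for y in range(11)}

print(p[5])             # 63/256, the most likely count of heads
print(sum(p.values()))  # 1
```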
08. Expected Value (Mean)

0’20” Definition of expected value:

The expected value of a (discrete) random variable Y, also known as the mean, is

E(Y) = μ = Σ y·p(y), summing over all possible values y

Explanation of the formula: you find all the possible values of the random variable [y], you multiply each one by the probability that it will come up [p(y)], and then you add them up.

E(Y) – expected value

μ – mean

Intuition: think of it as the average value of Y over the long run if an experiment is repeated many times. If Y is a payoff for a fair game, then E(Y) is the amount you should pay to play the game once. I like the idea of a fair game – you pay what it is worth, and in the end, over the long run, you effectively play for free: the winnings equal what you paid to play.
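The expected payoff of the card game from Example 1 can be computed directly; its value is what a fair entry price would be:

```python
from fractions import Fraction

# E(Y) = sum over y of y * p(y), for the card-game payoff
p = {y: Fraction(4, 52) for y in range(1, 10)}  # $1..$9 payoffs
p[-10] = Fraction(16, 52)                       # you pay $10

E = sum(y * py for y, py in p.items())
print(E)  # 5/13, roughly $0.38 per game
```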

3’10” Indicator variables

If A is an event, then we sometimes define the indicator variable Y_A = 1 if A occurs, and Y_A = 0 otherwise.

The expected value of the indicator variable is the probability that it is true:

E(YA) = P(A)

4’40” Linearity of expectation

For random variables Y1 and Y2 and constants a and b, we have

E(aY1 + bY2) = aE(Y1) + bE(Y2)

This is often useful for breaking a complicated variable down into simpler variables, like indicator
variables.

6’05” Expected value of a function:

If g(Y) is a function of a random variable Y, then

E[g(Y)] = Σ g(y)·p(y), summing over all possible values y

It looks like the formula at the start of the chapter, except we replace y with g(Y) on the left and on the right.

09. Variance, Standard Deviation

0’12” Definition of variance

The variance of a random variable Y is

V(Y) = σ² = E[(Y − μ)²]

where μ := E(Y) is the mean (expected value) of the variable.

Variance is a measure of variability, or volatility, in Y.

The most useful way to calculate variance is as follows:

V(Y) = E(Y²) − μ²

The formula above seemed a bit confusing to me. It is actually the first one, just with different notation. He uses it in ex. 4, and it has started to feel friendlier. Wikipedia shows the derivation from the first formula to the second:

A mnemonic for the above expression is "mean of square minus square of mean".
3’45” Standard deviation

The standard deviation of a random variable Y is

σ = √V(Y)

Because it is defined directly from variance, standard deviation also measures volatility in Y.
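A sketch computing variance via E(Y²) − μ² for a fair die roll (my own example):

```python
from fractions import Fraction
from math import sqrt

p = {y: Fraction(1, 6) for y in range(1, 7)}   # fair six-sided die

mu = sum(y * py for y, py in p.items())        # E(Y) = 7/2
EY2 = sum(y * y * py for y, py in p.items())   # E(Y^2) = 91/6
var = EY2 - mu**2                              # "mean of square minus square of mean"

print(var)                  # 35/12
print(round(sqrt(var), 4))  # standard deviation ≈ 1.7078
```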

10. Markov's Inequality

0’25” Markov’s inequality is a quick way of estimating probabilities based on the mean of a random variable. Suppose Y is a random variable taking only positive values, and a is a constant:

P(Y ≥ a) ≤ E(Y)/a

We have some constant number (a), and we are estimating the probability that the variable will be bigger than that value a. Markov’s inequality gives an answer, saying that probability is at most the expected value of Y (that is, the mean) divided by a.

It is really a one-sided estimate. It gives you an upper bound; it does not tell you that the probability is equal to that.

We can reverse it:

P(Y < a) ≥ 1 − E(Y)/a

Careful with the reversed formula, because it is still somewhat abstract for now and I suspect it will trip me up (I don't yet know what it is useful for).
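A small check of Markov's bound against an exact probability (fair die, my own numbers):

```python
from fractions import Fraction

p = {y: Fraction(1, 6) for y in range(1, 7)}   # fair die: positive values only
E = sum(y * py for y, py in p.items())         # E(Y) = 7/2

a = 5
tail = sum(py for y, py in p.items() if y >= a)  # exact P(Y >= 5) = 1/3
print(tail <= E / a)  # True: 1/3 <= 7/10, the Markov bound holds
```

As expected, the bound (7/10) is loose compared with the exact value (1/3): Markov only promises an upper limit.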

11. Tchebysheff's Inequality

0’55” This inequality is a quick way of estimating probabilities based on the mean μ and standard
deviation σ of a random variable.

Suppose Y is a random variable, and k is a constant:

P(|Y − μ| ≥ kσ) ≤ 1/k²
Intuition: it is unlikely that the variable will be far (many standard deviations away) from its mean.

4’10” The inequality in the reverse direction:

P(|Y − μ| < kσ) ≥ 1 − 1/k²

Intuition: it is likely that the variable will be close to (within a few standard deviations of) its mean.

5’50” I'll write out an example, because if I come back here later I definitely won't understand anything otherwise.

Surveys show that students on a particular campus carry an average of $20 in cash, with a standard
deviation of $10. If you meet a student at random, estimate the chance that she is carrying more than
$100. Also estimate the chance that she is carrying less than $80.

Solution: clearly μ is 20 and σ is 10. Let's figure out k. For a student to carry more than $100, she would have to be eight standard deviations away from the mean (100−20=80; σ=10). So k=8. Then we have:

P(|Y−20| > 8·10) ≤ 1/8²

For the second part of the problem we use the reversed version of the inequality. Now k=6, so:

P(|Y−20| ≤ 6·10) ≥ 1 − 1/6² → the probability is ≥ 35/36

- I am at 13’20”
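The two bounds from the example, as arithmetic:

```python
# Campus-cash example: mu = 20, sigma = 10
mu, sigma = 20, 10

k1 = (100 - mu) / sigma          # 8 standard deviations
print(1 / k1**2)                 # 0.015625 = 1/64, bound for > $100

k2 = (80 - mu) / sigma           # 6 standard deviations
print(round(1 - 1 / k2**2, 4))   # 0.9722 ≈ 35/36, bound for ≤ $80
```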

12. Binomial Distribution (Bernoulli Trials)

0’40” The binomial distribution describes a sequence of n independent trials, each of which can have two outcomes.

The prototypical example is flipping a coin n times. But it can also describe any process with two
outcomes (games between teams, rolling a die to get a 6, etc.)
3’45” Formula for the binomial distribution

n = number of trials (fixed – in the sense that how many times you run the experiment is specified; e.g., rolling a die)

p = probability of success on any given trial

q = probability of failure = 1−p

p(y) = probability of exactly y successes [not to be confused with p!]

p(y) = (n choose y) p^y q^(n−y), where (n choose y) = C(n, y) = n! / (y! (n−y)!)

9’55” Key properties of the binomial distribution

Mean: μ = E(Y) = np

Variance: σ² = V(Y) = npq

Standard deviation: σ = √(npq)
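The binomial pmf as a small function (my own helper, with a die-rolling example):

```python
from math import comb

def binomial_pmf(y, n, p):
    """Probability of exactly y successes in n independent trials."""
    q = 1 - p
    return comb(n, y) * p**y * q**(n - y)

# e.g. exactly 3 sixes in 10 die rolls
print(round(binomial_pmf(3, 10, 1/6), 3))  # 0.155
```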

13. Geometric Distribution

0’25” The geometric distribution describes a sequence of trials, each of which can have two outcomes
(success or failure). We continue the trials indefinitely until we get the first success. Unlike the binomial
distribution, we don’t know the number of trials in advance.

2’10” Formula

Fixed parameters:
- p = probability of success on each trial
- q = probability of failure = 1- p
Random variable: Y = number of trials (for one success)

Probability distribution:

p(y) = q^(y−1) p, for y = 1, 2, 3, …

[the notation problem again; p(y) = probability that it takes y trials overall; p = probability of success on each trial]

6’50” Key properties of the geometric distribution:

Mean: μ = E(Y) = 1/p

Variance: σ² = V(Y) = q/p²

Standard deviation: σ = √q / p

7’50” Geometric series

Here he talks about the sum of an infinite geometric series. I wrote about this in my Precalculus notes, lesson 15.04. As long as the common ratio is >1, the sum diverges. But when it is <1, it converges. For this latter case, the formula is:

S = a1 / (1 − q), where a1 is the first term

This helps us answer questions of the type: What is the probability that it will take at least a certain number of trials to achieve success?

So we want to add up all the values that are greater than or equal to y. On the second row he used the geometric distribution formula p(y) = q^(y−1) p, substituting y+1 for y, then y+2, and so on. This second row is a geometric series, because consecutive terms differ by a factor of q, so q is the common ratio. If we apply the formula for the sum of a geometric series above, we get the first grouping on the third row, which simplifies to q^(y−1). In other words, P(Y ≥ y) = q^(y−1): if you want it to take at least 5 tries before you succeed, you must fail the first four times. Intuitive, but with that formula it looked very complicated.
23’30” An example, because it is fairly abstract, as I wrote at the top of the page

You and a friend take turns rolling a die. You roll first, and the first person to roll a six wins.

A. What is the chance that you will win on your third roll? Since I roll first, my third roll is in fact the fifth trial of the distribution. So we have p(5) = (5/6)^4 · (1/6) = 5^4/6^5.

B. What is the chance that your friend will get to roll three times or more?

For him to get to roll three or more times, the game must reach at least the sixth trial of the distribution, by the reasoning from the previous part. So we need p(y≥6). Here we use what we wrote about the geometric distribution, q^(y−1), so (5/6)^5.

C. What is the chance that you will win? For me to win, the six has to come up on an odd-numbered trial: the first, or the third, or the fifth, etc. So I need to compute this sum:

p(1) + p(3) + p(5) + … = p + q²p + q⁴p + q⁶p + … → this is a geometric series with ratio q². We apply the geometric series formula:

p · 1/(1−q²) = (1/6) · 1/(1 − 25/36) = (1/6) · (36/11) = 6/11
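Part C can be double-checked by summing the series numerically:

```python
from fractions import Fraction

# First player wins with probability p + q^2 p + q^4 p + ... = p / (1 - q^2)
p, q = Fraction(1, 6), Fraction(5, 6)

closed_form = p / (1 - q**2)
partial = sum(q**(2 * k) * p for k in range(100))  # truncated series

print(closed_form)                                 # 6/11
print(abs(float(closed_form - partial)) < 1e-12)   # True: the series agrees
```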

47’23” In example 5 he uses a property of variance that he had not mentioned until then, and which of course I did not know. In the example it was V(aY+b) = a²V(Y). I looked on Wikipedia and found this property, along with others: Var(aX) = a²Var(X).

In the same vein, see also the linearity of expectation mentioned in lesson 8, also under properties.

14. Negative Binomial Distribution

0’15” The negative binomial distribution describes a sequence of trials, each of which can have two
outcomes (success or failure). We continue the trials indefinitely until we get r successes. Unlike the
binomial distribution, we don’t know the number of trials in advance.

The geometric distribution is the case r =1.


3’40” Formula for negative binomial distribution

Fixed parameters:
p = probability of success on each trial
q = probability of failure (1-p)
r = number of successes desired

Random variable:
Y= number of trials (for r successes)

Probability distribution:

p(y) = (y−1 choose r−1) p^r q^(y−r), for y = r, r+1, r+2, …; the first factor, in parentheses, is a combination.

Again the problem that we have two p's in the formula above, each representing different things:

- p(y) = probability that y trials overall are needed in order to get the desired number of successes

- p = probability of success on each trial

7’45” Key properties of negative binomial

Mean: μ = E(Y) = r/p

Variance: σ² = V(Y) = rq/p²

Standard deviation: σ = √V(Y) = √(rq)/p

24’00” I'll write down ex. 3 and half of ex. 4, which I find very ingenious.

Ex. III: You roll a die until you get four sixes (not necessarily consecutive). What is the mean and standard deviation of the number of rolls you will make? So we have p=1/6, q=5/6, and r=4. Substituting into the formulas: mean = r/p = 4/(1/6) = 24 rolls. Variance = rq/p² = 120. Std. dev. = √120 ≈ 10.95 rolls.

Ex. IV: 10% of applicants for a job possess the right skills. A company has three positions to fill, and they
interview applicants one at a time until they fill three positions. What is the probability that they will
interview at least ten applicants?
What does "at least 10 applicants" mean? It means you did not find 3 suitable people among the first 9 applicants. What is the chance that among those 9 there are fewer than 3 suitable people? We need p(0) or p(1) or p(2). But that is no longer a negative binomial distribution, because we are looking at a fixed number of people (9). So now we use the binomial distribution: how many successes out of a fixed number of trials:

p(0) + p(1) + p(2) = (9 choose 0)(1/10)^0(9/10)^9 + (9 choose 1)(1/10)^1(9/10)^8 + (9 choose 2)(1/10)^2(9/10)^7 ≈ 94.7%

So that is the chance of interviewing at least 10 applicants.
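The ex. IV arithmetic in code:

```python
from math import comb

# P(fewer than 3 suitable people among the first 9 applicants), p = 0.1
p = 0.1
prob = sum(comb(9, y) * p**y * (1 - p)**(9 - y) for y in range(3))
print(round(prob, 3))  # 0.947
```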

15. Hypergeometric Distribution

0’15” The hypergeometric distribution describes choosing a committee of n men and women from a
larger group of r women and N-r men. This is an unordered choice, without replacement. The question is
- What are the chances of getting exactly y women on our committee?

1’50” Formula for the hypergeometric distribution:

N = total number of people

r = number of women

N-r = number of men

n = number on our committee

Y =number of women on our committee

Probability distribution:

p(y) = (r choose y)(N−r choose n−y) / (N choose n), for 0 ≤ y ≤ min{r, n}

The formula piece by piece:

- the denominator, (N choose n), is the total number of ways to choose your committee

- (r choose y) – out of all the women available, you choose just y of them for the committee

- (N−r choose n−y) – similar to the above, you choose the men

- min{r, n} – the maximum number of women on the committee. It can be n, because that is the size of the committee, or r, because that is the number of women available. Whichever of the two is smaller is the maximum number of women possible.

6’15” Properties of Hypergeometric

nr
Mean: μ = E(Y) =
N

Variance: σ² = V(Y) = n (r/N)(1 − r/N)((N−n)/(N−1))

Standard deviation: σ = √V(Y)

13’45” Exercise 3.

Your shoe closet contains 10 pairs of shoes. Packing for a move, you begin throwing shoes into a box at
random. The box fills up at 13 shoes. What is the probability that there are 5 left shoes and 8 right shoes
in the box?

First, the data of the problem: N=20, r=10, N−r=10, n=13. Now y can be 5 or 8, depending on which we care about. It is really the same thing: if we have 5 left shoes, we can only have 8 right shoes. The formula gives the same result whether we use y=5 or y=8. So:

P(5) = (10 choose 5)(10 choose 8) / (20 choose 13)

- see also ex. 5 – a proof of the formula for the mean
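Exercise 3 in code, which also confirms that y=5 and y=8 give the same answer:

```python
from fractions import Fraction
from math import comb

# N = 20 shoes, r = 10 left shoes, the box holds n = 13
P5 = Fraction(comb(10, 5) * comb(10, 8), comb(20, 13))  # y = 5 left shoes
P8 = Fraction(comb(10, 8) * comb(10, 5), comb(20, 13))  # y = 8 right shoes

print(P5 == P8)             # True
print(round(float(P5), 4))  # ≈ 0.1463
```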

16. Poisson Distribution


0’20” Describes events that occur randomly and independently such as calls coming in to a tech support
center. The random variable we are keeping track of is Y (number of calls in an hour)

2’15” Formula

λ = average number of calls per hour (it doesn't have to be an integer)

Probability distribution:

p(y) = λ^y e^(−λ) / y!, for y = 0, 1, 2, …

5’30” Properties

Mean: μ = E(Y) = λ

Variance: σ2 = V(Y) = λ

Standard deviation: √ λ

6’30” Example

California averages 6 major forest fires per year. What is the chance that there will be exactly 4 fires this year? What is the chance that there will be at least 4 fires?

P(4) = e^(−6) · 6^4/4! = 54/e^6 ≈ 13.4%

For the second part, we care about at least 4 fires:

p(y≥4) = 1 − [p(0) + p(1) + p(2) + p(3)]

= 1 − e^(−6) [6^0/0! + 6^1/1! + 6^2/2! + 6^3/3!] = 1 − 61/e^6 ≈ 84.9%
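The forest-fire arithmetic in code:

```python
from math import exp, factorial

lam = 6  # average number of major fires per year

def p(y):
    """Poisson pmf: p(y) = lambda^y * e^(-lambda) / y!"""
    return lam**y * exp(-lam) / factorial(y)

print(round(p(4), 3))                             # 0.134
print(round(1 - sum(p(y) for y in range(4)), 3))  # 0.849
```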

I only half understood ex. 5. I did not understand the first part, with linearity of expectation. Why didn't he just substitute 2 for y directly instead of also doing linearity of expectation?
17. Density, Cumulative Distribution Functions

0’18” Any time you have a continuous probability distribution you’ll have a density function and a
cumulative distribution function.

0’44” Density functions

Let Y be a continuous random variable. It has a density function f(y) that satisfies:

1. f(y) ≥ 0 (you can't have a negative probability)

2. ∫ from −∞ to ∞ of f(y) dy = 1 (the total area under the graph from −∞ to ∞ is always equal to 1. That's because it's a probability function: the total probability of something happening has to be equal to 1. Something has to happen.)

The way you use the density function to calculate probabilities is you always talk in terms of ranges. You never talk about the probability of Y being equal to a specific value. Instead we'll ask what's the probability that Y is between one number and another.

For example, we'll find the probability that Y is between a and b. The way you find that probability is that you calculate the area under the density function. In order to calculate that area, you take an integral. So:

P(a ≤ Y ≤ b) = ∫ from a to b of f(y) dy

4’30” Cumulative distribution functions

If Y has a density function f, then it has a cumulative distribution function F:

F(y) = P(Y ≤ y) = ∫ from −∞ to y of f(t) dt

It is the probability that Y is less than or equal to a particular cut-off value y.

If you want to find the probability below a value y, you calculate the area to the left, so you can do that as an integral. (Now comes something I don't understand!!) We've already used y as the cut-off [the upper limit of the integral], so I can't use y as my variable of integration [what's in parentheses under the integral: f(t)dt]. So I'm using t: f(t)dt. By the way, a very common mistake I see students make when doing probability problems is that they mix up the variables – they'll have y as the upper limit but also as the variable of integration. That's very bad practice. Don't do that. When calculating the cumulative distribution function, use t as your variable of integration and y as your limit.

We can also use F to calculate probabilities:

P(a ≤ Y ≤ b) = F(b) − F(a)

The probability that you are between a and b – one way of calculating that is to take all the area less than b and then subtract off all the area that's less than a. What you're left with is the area that you want, which is the area between a and b. F itself is calculated with the formula above.

7’30” Properties of CDF

1. F(-∞ ) = 0. There is no area to the left to calculate so it always has to start at zero.

2. F(∞) = 1. It increases until it reaches 1. This function always has the same general shape: it always starts at zero at −∞, always increases, and finishes up at 1 at ∞.

3. F is increasing.

4. F’(y) = f(y). Apparently this is the fundamental theorem of calculus.

9’45” I am at --- it seems I need to know how to solve integrals – I'll go back to Calculus for Dummies, review what I wrote there, and do a bunch of the exercises from it.
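A numeric sketch of a density and its CDF (my own toy density, not from the lecture): f(y) = 2y on [0, 1], which is a valid density (f ≥ 0, total area 1) with exact CDF F(y) = y². Note that the integration variable t inside F is kept separate from the limit y, as the lecturer insists.

```python
def f(y):
    """Density: f(y) = 2y on [0, 1], zero elsewhere."""
    return 2 * y if 0 <= y <= 1 else 0.0

def F(y, steps=100_000):
    """CDF F(y) = integral of f(t) dt from -infinity to y (midpoint sum)."""
    if y <= 0:
        return 0.0          # f is zero below 0, so there is no area to the left
    h = y / steps
    return sum(f((k + 0.5) * h) * h for k in range(steps))

print(round(F(1.0), 4))  # 1.0  -- total probability
print(round(F(0.5), 4))  # 0.25 -- matches the exact CDF 0.5^2
```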
