Probability in AI
1. Probability Axioms: These are the basic rules that all probabilities must follow.
Rules:
1. The probability of any event is between 0 and 1.
2. The probability of the whole sample space (everything happening) is 1.
3. If two events can't happen at the same time (they are mutually exclusive), the probability of either happening is the sum of their probabilities.
Example:
• Tossing a coin:
o P(Heads) = 0.5
o P(Tails) = 0.5
o P(Heads or Tails) = 0.5 + 0.5 = 1
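The three axioms can be checked mechanically. Here is a minimal sketch in Python, assuming the fair-coin distribution from the example above:

```python
# A fair coin's distribution (assumed for illustration).
coin = {"Heads": 0.5, "Tails": 0.5}

# Axiom 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in coin.values())

# Axiom 2: the probabilities over the whole sample space sum to 1.
assert sum(coin.values()) == 1.0

# Axiom 3: Heads and Tails are mutually exclusive, so
# P(Heads or Tails) = P(Heads) + P(Tails).
p_heads_or_tails = coin["Heads"] + coin["Tails"]
print(p_heads_or_tails)  # 1.0
```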
2. Prior Probability: The initial belief or guess about how likely something is, before we get
any new information.
Example:
You think there is a 30% chance of rain today based on past weather data.
That 30% is your prior probability.
3. Conditional Probability: It’s the probability of Event A happening, given that Event B has
already happened.
It’s written as:
𝑃(𝐴 ∣ 𝐵)
Which means: "Probability of A given B".
Formula:
𝑃(𝐴 ∣ 𝐵) = 𝑃(𝐴 ∩ 𝐵) / 𝑃(𝐵)
Where:
• 𝑃(𝐴 ∩ 𝐵)= Probability of both A and B happening
• 𝑃(𝐵)= Probability that B happens
Example 1: Students and Math
In a class of 100 students:
• 60 students passed Math
• 30 students passed both Math and English
Q: What is the probability a student passed English given they passed Math?
𝑃(English ∣ Math) = 𝑃(Math and English) / 𝑃(Math) = 30/60 = 0.5
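The calculation in Example 1 can be written directly from the conditional probability formula. A short sketch, using the counts given above:

```python
# Counts from Example 1.
total_students = 100
passed_math = 60
passed_both = 30  # passed both Math and English

p_math = passed_math / total_students  # P(Math) = 0.6
p_both = passed_both / total_students  # P(Math and English) = 0.3

# P(English | Math) = P(Math and English) / P(Math)
p_english_given_math = p_both / p_math
print(p_english_given_math)  # 0.5
```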
Example 2: Colored Balls in a Bag
A bag has:
• 3 red balls
• 2 blue balls
• 5 green balls
(total = 10 balls)
Q: What is the probability of choosing a blue ball if we know the ball picked is not green?
• Not green = red or blue = 3 + 2 = 5 balls
• Blue = 2 balls
𝑃(Blue ∣ Not Green) = 2/5 = 0.4
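Conditioning here just restricts the sample space to the non-green balls. A sketch of that counting argument:

```python
# Ball counts from Example 2.
balls = {"red": 3, "blue": 2, "green": 5}

# Conditioning on "not green" shrinks the sample space to red + blue.
not_green = balls["red"] + balls["blue"]  # 5 balls remain

p_blue_given_not_green = balls["blue"] / not_green
print(p_blue_given_not_green)  # 0.4
```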
Example 3: Tossing a Coin Twice
All possible outcomes:
{HH, HT, TH, TT}
Q: What is the probability that both are heads given that the first toss is heads?
• Given: First toss is H → possible outcomes = {HH, HT}
• Of these, only HH has both heads.
𝑃(Both Heads ∣ First is Head) = 1/2
Example 4: Dice Roll
A die is rolled.
Q: What is the probability that the number is even, given that it is greater than 2?
• Numbers greater than 2 = {3, 4, 5, 6} → 4 numbers
• Even numbers from that set = {4, 6} → 2 numbers
𝑃(Even ∣ > 2) = 2/4 = 0.5
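Example 4 follows the same counting pattern: restrict to the condition, then count the favourable outcomes within it. A sketch:

```python
# Condition: the roll is greater than 2 -> {3, 4, 5, 6}.
greater_than_2 = [n for n in range(1, 7) if n > 2]

# Favourable within the condition: even numbers -> {4, 6}.
even = [n for n in greater_than_2 if n % 2 == 0]

print(len(even) / len(greater_than_2))  # 0.5
```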
4. Joint Probability Distributions
A joint probability distribution gives the probability of two or more random variables taking particular values at the same time. It's a fundamental concept used to understand the relationships and dependence between variables.
Example:
• P(Rain = Yes AND Clouds = Yes)
• = Probability that it's raining and it's cloudy.
• For example, suppose P(Rain = Yes AND Clouds = Yes) = 0.6, i.e. a 60% chance that it is raining and cloudy at the same time.
Example 1: Tossing Two Coins
Let’s toss 2 coins. What are the possible outcomes?
• HH (Head, Head)
• HT (Head, Tail)
• TH (Tail, Head)
• TT (Tail, Tail)
Since each outcome is equally likely:
𝑃(𝐻𝐻) = 𝑃(𝐻𝑇) = 𝑃(𝑇𝐻) = 𝑃(𝑇𝑇) = 1/4
Joint Probability Table:
Coin 1 \ Coin 2 | Head (H) | Tail (T)
Head (H)        | 1/4      | 1/4
Tail (T)        | 1/4      | 1/4
This is a joint probability distribution showing all combinations of coin tosses.
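The table above can be built programmatically. A minimal sketch, assuming the two tosses are independent so each cell is 0.5 × 0.5:

```python
from itertools import product

# Joint distribution for two fair, independent coin tosses.
joint = {(c1, c2): 0.5 * 0.5 for c1, c2 in product("HT", repeat=2)}

for (c1, c2), p in joint.items():
    print(f"P(Coin1={c1}, Coin2={c2}) = {p}")
# Every entry is 0.25, and the four entries sum to 1.
```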
Example 2: Rolling a Die and Flipping a Coin
You roll a die (1–6) and flip a coin (H or T).
Total outcomes = 6 (from die) × 2 (from coin) = 12 combinations.
If all outcomes are equally likely:
𝑃(1𝐻) = 𝑃(1𝑇) = 𝑃(2𝐻) = . . . = 𝑃(6𝑇) = 1/12
You can build a table showing each pair (Die number, Coin flip) and its probability (1/12
each).
For Example:
• P(Die = 3 AND Coin = H) = 1/12
• P(Die = 5 AND Coin = T) = 1/12
That full set of combinations with probabilities is the joint probability distribution.
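The full die-and-coin distribution can be enumerated the same way. A sketch, assuming the roll and the flip are independent so each pair gets 1/6 × 1/2 = 1/12:

```python
from itertools import product

# Joint distribution over (die face, coin side) pairs.
joint = {(die, coin): (1 / 6) * (1 / 2)
         for die, coin in product(range(1, 7), "HT")}

print(len(joint))       # 12 combinations
print(joint[(3, "H")])  # P(Die = 3 AND Coin = H) = 1/12
# The twelve probabilities sum to 1, as required by the axioms.
```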