IS328 Data Mining
Naïve Bayes Classification
Semester 2, 2019
Tutorial 8 Exercises
Q1. Consider the following loan database with 15 entries:
Using Naïve Bayes Classification, determine the likely outcome of the following two
applicants:
ID   Age      Has Job   Own House   Credit Rating   Class
16   middle   true      false       good            ??
17   young    false     true        fair            ??
SOLUTION
AGE        YES   NO          HAS_JOB         YES   NO
Young      2/9   3/6         True            5/9   0/6
Middle     3/9   2/6         False           4/9   6/6
Old        4/9   1/6

OWN_HOUSE  YES   NO          CREDIT_RATING   YES   NO
True       6/9   0/6         Fair            1/9   4/6
False      3/9   6/6         Good            4/9   2/6
                             Excellent       4/9   0/6
P(Yes) = 9/15, P(No) = 6/15
E = (age = middle, has_job = true, own_house = false, credit_rating = good)
E1 is age = middle, E2 is has_job = true, E3 is own_house = false, E4 is credit_rating = good.
We need to compute P(Yes | E) and P(No | E) and compare them.
P(Yes | E) = [P(Yes) * P(E1 | Yes) * P(E2 | Yes) * P(E3 | Yes) * P(E4 | Yes)] / P(E)
P(No | E) = [P(No) * P(E1 | No) * P(E2 | No) * P(E3 | No) * P(E4 | No)] / P(E)
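P(E1 | Yes) = 3/9        P(E1 | No) = 2/6
P(E2 | Yes) = 5/9        P(E2 | No) = 0/6
Because P(E2 | No) = 0 would force the whole product for No to zero, apply Laplace (add-one) smoothing to E2. has_job has 2 possible values, so:
P(E2 | Yes) = (5+1)/(9+2) = 6/11        P(E2 | No) = (0+1)/(6+2) = 1/8
P(E3 | Yes) = 3/9        P(E3 | No) = 6/6
P(E4 | Yes) = 4/9        P(E4 | No) = 2/6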
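The add-one correction used for E2 can be sketched in a few lines of Python. This is only an illustration: the helper name laplace_estimate and its parameters are assumptions introduced here, not part of the tutorial, and the counts are the has_job = True counts from the solution table.

```python
from fractions import Fraction

def laplace_estimate(count, class_total, n_values):
    """Add-one (Laplace) estimate of P(attribute = value | class).

    count       -- rows of the class that have the attribute value
    class_total -- rows of the class
    n_values    -- number of distinct values the attribute can take
    """
    return Fraction(count + 1, class_total + n_values)

# has_job can take 2 values (True, False).
p_e2_yes = laplace_estimate(5, 9, 2)  # (5+1)/(9+2)
p_e2_no = laplace_estimate(0, 6, 2)   # (0+1)/(6+2)
print(p_e2_yes, p_e2_no)              # prints: 6/11 1/8
```

Exact fractions are used so the results can be compared directly with the hand calculation.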
P(Yes)= 9/15 P(No) = 6/15
P(E) is the same for both classes, so it is enough to compare the numerators:
P(Yes | E) ∝ 9/15 * 3/9 * 6/11 * 3/9 * 4/9 = 0.0162
P(No | E) ∝ 6/15 * 2/6 * 1/8 * 6/6 * 2/6 = 0.0056
Since 0.0162 > 0.0056, the Naïve Bayes classifier predicts Yes for applicant 16.
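As a check on the arithmetic, the comparison for applicant 16 can be reproduced with a short Python sketch. The variable names (priors, likelihoods, score) are illustrative assumptions; the fractions are copied from the solution, with the smoothed values used for has_job. Dividing each score by their sum is an extra step, not in the tutorial, that turns the two scores into posteriors that add up to 1.

```python
from fractions import Fraction as F

# Values copied from the worked solution (E2 uses the smoothed estimates).
priors = {"Yes": F(9, 15), "No": F(6, 15)}
likelihoods = {
    "Yes": [F(3, 9), F(6, 11), F(3, 9), F(4, 9)],  # E1..E4 given Yes
    "No": [F(2, 6), F(1, 8), F(6, 6), F(2, 6)],    # E1..E4 given No
}

def score(label):
    """Prior times the product of the conditional probabilities (numerator only)."""
    result = priors[label]
    for p in likelihoods[label]:
        result *= p
    return result

scores = {label: score(label) for label in priors}

# P(E) is the same for both classes, so it equals the sum of the scores here.
total = sum(scores.values())
posteriors = {label: s / total for label, s in scores.items()}
prediction = max(scores, key=scores.get)

for label in scores:
    print(label, round(float(scores[label]), 4), round(float(posteriors[label]), 3))
print("Predicted:", prediction)
```

The printed scores round to 0.0162 and 0.0056, and the normalised posteriors are roughly 0.744 for Yes and 0.256 for No.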
Q2. Given the training data below (travel time between USP and Nausori Airport during the
morning peak), predict the class of the following example using Naïve Bayes Classification:
hour = 10 AM, weather = rainy, accident = no, stall = no
          Attributes                            Target
Example   Hour    Weather   Accident   Stall    Commute
D1        8 AM    Sunny     No         No       Long
D2        8 AM    Cloudy    No         Yes      Long
D3        10 AM   Sunny     No         No       Short
D4        9 AM    Rainy     Yes        No       Long
D5        9 AM    Sunny     Yes        Yes      Long
D6        10 AM   Sunny     No         Yes      Medium
D7        10 AM   Cloudy    No         No       Short
D8        9 AM    Rainy     No         No       Medium
D9        9 AM    Sunny     Yes        No       Long
D10       10 AM   Cloudy    Yes        Yes      Long
D11       10 AM   Rainy     No         No       Short
D12       8 AM    Cloudy    Yes        No       Long
D13       9 AM    Sunny     No         No       Medium
SOLUTION
E = (hour = 10 AM, weather = rainy, accident = no, stall = no)
E1 is hour = 10 AM, E2 is weather = rainy, E3 is accident = no, E4 is stall = no.
We need to compute P(Long | E), P(Medium | E) and P(Short | E) and compare them.
P(Long | E) = [P(Long) * P(E1 | Long) * P(E2 | Long) * P(E3 | Long) * P(E4 | Long)] / P(E)
and similarly for Medium and Short.
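P(Long) = 7/13           P(Medium) = 3/13           P(Short) = 3/13
P(E1 | Long) = 1/7       P(E1 | Medium) = 1/3       P(E1 | Short) = 3/3
P(E2 | Long) = 1/7       P(E2 | Medium) = 1/3       P(E2 | Short) = 1/3
P(E3 | Long) = 2/7       P(E3 | Medium) = 3/3       P(E3 | Short) = 3/3
P(E4 | Long) = 4/7       P(E4 | Medium) = 2/3       P(E4 | Short) = 3/3
(All three Medium rows, D6, D8 and D13, have accident = No, so P(E3 | Medium) = 3/3.)
P(E) is the same for every class, so it is enough to compare the numerators:
P(Long | E) ∝ 7/13 * 1/7 * 1/7 * 2/7 * 4/7 = 8/4459 = 0.0018
P(Medium | E) ∝ 3/13 * 1/3 * 1/3 * 3/3 * 2/3 = 2/117 = 0.0171
P(Short | E) ∝ 3/13 * 3/3 * 1/3 * 3/3 * 3/3 = 1/13 = 0.0769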
Hence, the Naïve Bayes classifier predicts the commute time as Short for the example.
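The counting behind this answer can also be done mechanically from the training table. The Python sketch below is a minimal illustration, not a full classifier: the names rows, example and score are assumptions made here, no smoothing is applied (none of the counts needed for this example is zero for the winning class), and the scores are the unnormalised numerators.

```python
from collections import Counter
from fractions import Fraction

# Training rows D1..D13: (hour, weather, accident, stall, commute).
rows = [
    ("8 AM", "Sunny", "No", "No", "Long"),
    ("8 AM", "Cloudy", "No", "Yes", "Long"),
    ("10 AM", "Sunny", "No", "No", "Short"),
    ("9 AM", "Rainy", "Yes", "No", "Long"),
    ("9 AM", "Sunny", "Yes", "Yes", "Long"),
    ("10 AM", "Sunny", "No", "Yes", "Medium"),
    ("10 AM", "Cloudy", "No", "No", "Short"),
    ("9 AM", "Rainy", "No", "No", "Medium"),
    ("9 AM", "Sunny", "Yes", "No", "Long"),
    ("10 AM", "Cloudy", "Yes", "Yes", "Long"),
    ("10 AM", "Rainy", "No", "No", "Short"),
    ("8 AM", "Cloudy", "Yes", "No", "Long"),
    ("9 AM", "Sunny", "No", "No", "Medium"),
]
example = ("10 AM", "Rainy", "No", "No")

class_counts = Counter(row[-1] for row in rows)

def score(label):
    """P(label) times the product of P(Ei | label), counted from the rows."""
    members = [row for row in rows if row[-1] == label]
    result = Fraction(len(members), len(rows))
    for i, value in enumerate(example):
        matches = sum(1 for row in members if row[i] == value)
        result *= Fraction(matches, len(members))
    return result

scores = {label: score(label) for label in class_counts}
prediction = max(scores, key=scores.get)

for label, s in scores.items():
    print(label, s, round(float(s), 4))
print("Predicted:", prediction)
```

Counting directly from the rows is a useful cross-check on hand tallies; Short receives the highest score.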