NaiveBayes Algorithm

The document describes the Naive Bayes algorithm and provides an example of its use. It includes:
- An example training dataset with the attributes Education, Age, Gender and Class Label
- Steps to calculate the probability of a new data point belonging to each class
- Consideration of how to handle zero probabilities
- A second example with a numerical Age attribute instead of a categorical one
- Calculation of the class probabilities for a new data point and prediction of the class

Uploaded by

Aysun Güran

NAIVE BAYES ALGORITHM

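The MAP (maximum a posteriori) hypothesis is the hypothesis h with the highest posterior probability given the data D:

$$h_{MAP} = \arg\max_h P(h \mid D) = \arg\max_h \frac{P(D \mid h)\,P(h)}{P(D)}$$

In the rest of this document, P(A / B) denotes the conditional probability of A given B.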


NB is a supervised classification algorithm.
The training dataset:

ID  Education         Age          Gender  Class_Label
1   Secondary School  Aged         Male    YES
2   Primary School    Young        Male    NO
3   College           Middle_Aged  Female  NO
4   Secondary School  Middle_Aged  Male    YES
5   Primary School    Middle_Aged  Male    YES
6   College           Aged         Female  YES
7   Primary School    Young        Female  NO
8   Secondary School  Middle_Aged  Female  YES

Using the NB algorithm, determine the class label of the following test instance:
Xtest(College, Middle_Aged, Female) ---> ?
Step 1: Count how often each attribute value occurs within each class:

                          Class_Labels
Attribute   Value         YES (5)   NO (3)
Education   Primary       1/5       2/3
            Secondary     3/5       0
            College       1/5       1/3
Age         Young         0         2/3
            Middle_Aged   3/5       1/3
            Aged          2/5       0
Gender      Female        2/5       2/3
            Male          3/5       1/3
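
The counting in Step 1 can be reproduced with a few lines of Python. This is only a minimal sketch: the names `ROWS`, `ATTRIBUTES` and `conditional_table` are illustrative and not part of the original document, and the Education values are shortened to Primary / Secondary / College as in the Step 1 table.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Training rows copied from the table above: (Education, Age, Gender, Class_Label).
ROWS = [
    ("Secondary", "Aged", "Male", "YES"),
    ("Primary", "Young", "Male", "NO"),
    ("College", "Middle_Aged", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Male", "YES"),
    ("Primary", "Middle_Aged", "Male", "YES"),
    ("College", "Aged", "Female", "YES"),
    ("Primary", "Young", "Female", "NO"),
    ("Secondary", "Middle_Aged", "Female", "YES"),
]
ATTRIBUTES = ("Education", "Age", "Gender")


def conditional_table(rows):
    """Return (class_counts, cond) where cond[(attribute, value, label)] = P(value / label)."""
    class_counts = Counter(row[-1] for row in rows)
    joint = Counter()
    for *values, label in rows:
        for attribute, value in zip(ATTRIBUTES, values):
            joint[(attribute, value, label)] += 1
    cond = defaultdict(Fraction)  # combinations never seen in training default to 0
    for (attribute, value, label), count in joint.items():
        cond[(attribute, value, label)] = Fraction(count, class_counts[label])
    return class_counts, cond


class_counts, cond = conditional_table(ROWS)
print(cond[("Education", "College", "YES")])  # 1/5
print(cond[("Age", "Young", "YES")])          # 0
```

Exact fractions are used so that the output can be compared directly with the entries of the table.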

C1 = YES   P(C1=YES) = 5/8
C2 = NO    P(C2=NO)  = 3/8

Xtest(College, Middle_Aged, Female) ---> ?
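P(X) is the same for every class, so it can be dropped, and the naive (conditional independence) assumption factorizes the likelihood over the attributes:

$$P(C_i \mid X) \propto P(X \mid C_i)\,P(C_i), \qquad P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

The predicted class is the one with the largest score:

$$\text{predicted class} = \arg\max_{C_i} P(X \mid C_i)\,P(C_i) = \arg\max_{C_i} P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i) \qquad (i = 1, 2;\ x_k = k\text{-th attribute})$$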

C1: YES

Score for C1:

$$P(C_1) \prod_{k=1}^{n} P(x_k \mid C_1)$$

= P(x1 = College / C1=Yes) * P(x2 = Middle_Aged / C1=Yes) * P(x3 = Female / C1=Yes) * P(C1=Yes)

= (1/5) * (3/5) * (2/5) * (5/8)


= 0.03

C2: NO

Score for C2:

$$P(C_2) \prod_{k=1}^{n} P(x_k \mid C_2)$$

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= (1/3) * (1/3) * (2/3) * (3/8)


= 0.028

The larger score, max{0.03, 0.028} = 0.03, comes from the first class (YES), so the predicted class label of the test instance is YES.
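
As a check on the arithmetic, the two scores can be recomputed with exact fractions. This is a minimal sketch: the dictionaries `LIKELIHOODS` and `PRIORS` and the function `score` are illustrative names, and the numbers are read off the Step 1 table for Xtest(College, Middle_Aged, Female).

```python
from fractions import Fraction

# Conditional probabilities for (College, Middle_Aged, Female) in each class,
# followed by the class priors, all taken from the worked example.
LIKELIHOODS = {
    "YES": [Fraction(1, 5), Fraction(3, 5), Fraction(2, 5)],
    "NO": [Fraction(1, 3), Fraction(1, 3), Fraction(2, 3)],
}
PRIORS = {"YES": Fraction(5, 8), "NO": Fraction(3, 8)}


def score(label):
    """Unnormalized posterior: the prior P(Ci) times the product of P(xk / Ci)."""
    result = PRIORS[label]
    for p in LIKELIHOODS[label]:
        result *= p
    return result


scores = {label: score(label) for label in PRIORS}
predicted = max(scores, key=scores.get)

# Dividing by the sum of the scores turns them into posterior probabilities.
total = sum(scores.values())
posterior = {label: value / total for label, value in scores.items()}

for label in scores:
    print(label, scores[label], round(float(scores[label]), 3), posterior[label])
print("predicted:", predicted)
```

The scores are 3/100 and 1/36, so the normalized posterior for YES is 27/52 (about 0.52): the decision is YES, but only by a narrow margin.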

ZERO VALUE PROBLEM IN NAIVEBAYES ALGORITHM:

Assume that P(x1 = College / C2=NO) = 0, i.e. that no training row of class NO has Education = College.

Without applying Laplace smoothing, the whole product becomes zero:

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= (0/3) * (1/3)* (2/3)*(3/8) = 0

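If we apply Laplace smoothing (add-one smoothing, also called the Laplacian correction), 1 is added to every count in the numerator and the number of distinct values of the attribute is added to the denominator (3 for Education and Age, 2 for Gender):

= P(x1 = College / C2=NO) * P(x2 = Middle_Aged / C2=NO) * P(x3 = Female / C2=NO) * P(C2=NO)

= ((0+1)/(3+3)) * ((1+1)/(3+3)) * ((2+1)/(3+2)) * (3/8)

= 0.0125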
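
The smoothed estimate can be written as a small helper. This is a sketch under one stated assumption: it follows the usual add-one convention in which the denominator grows by the number of distinct values of the attribute in question (3 for Education and Age, 2 for Gender), so that the smoothed probabilities of each attribute still sum to 1. The function name `laplace` is illustrative.

```python
from fractions import Fraction


def laplace(count, class_total, n_values):
    """Add-one estimate: (count + 1) / (class_total + number of distinct attribute values)."""
    return Fraction(count + 1, class_total + n_values)


# Class NO has 3 training rows; the College count is assumed to be 0, as in the text.
p_college = laplace(0, 3, 3)      # Education has 3 distinct values
p_middle_aged = laplace(1, 3, 3)  # Age has 3 distinct values
p_female = laplace(2, 3, 2)       # Gender has 2 distinct values (assumption noted above)
prior_no = Fraction(3, 8)

score_no = p_college * p_middle_aged * p_female * prior_no
print(score_no, float(score_no))
```

With these inputs the score for NO is small but no longer zero, so a single unseen attribute value cannot veto the class on its own.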


Ex2: NUMERICAL ATTRIBUTES
ID  Education         Age  Gender  Class_Label
1   Secondary School  60   Male    YES
2   Primary School    22   Male    NO
3   College           38   Female  NO
4   Secondary School  40   Male    YES
5   Primary School    40   Male    YES
6   College           60   Female  YES
7   Primary School    20   Female  NO
8   Secondary School  42   Female  YES

Using the NB algorithm, determine the class label of the following test instance:
Xtest(x1 = College, x2 = 44, x3 = Female) ---> YES or NO?
C1 = YES P(C1=YES)= 5/8
C2 = NO P(C2=NO) = 3/8
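P(C1/X) ∝ P(X/C1)*P(C1) (the denominator P(X) is the same for both classes, so it is dropped and the products below are unnormalized scores)
P(C1/X) ∝ P(x1,x2,x3 /C1)*P(C1)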
P(C1=YES/Xtest) = P(x1= College /C1=YES)* P(x2= 44/C1=YES)* P(x3=Female /C1=YES)*P(C1=YES)

= (1/5) * (??????) * (2/5) * (5/8)


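Age is now numerical, so P(x2 = 44 / C1=YES) is estimated with a Gaussian (normal) density g that uses the mean and standard deviation of Age within the YES class:
P(x2 = 44 / C1=YES) = g(44, mean_of_age_attribute(YES), stdev_of_age_attribute(YES))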

YES (Age): the ages of the people who were accepted by the company:
60, 40, 40, 60, 42
Mean = 48.4
Stdev = 10.62 (sample standard deviation)
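
The mean and standard deviation can be checked with Python's standard `statistics` module. A minimal sketch with illustrative variable names; note that the value 10.62 corresponds to the sample standard deviation (`statistics.stdev`, which divides by n - 1), not the population version (`statistics.pstdev`).

```python
import statistics

# Ages of the five training rows with Class_Label = YES.
ages_yes = [60, 40, 40, 60, 42]

mean_yes = statistics.mean(ages_yes)
stdev_yes = statistics.stdev(ages_yes)     # sample standard deviation (divides by n - 1)
pstdev_yes = statistics.pstdev(ages_yes)   # population standard deviation, for comparison

print(round(mean_yes, 2), round(stdev_yes, 2), round(pstdev_yes, 2))
```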

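$$P(x_2 = 44 \mid C_1 = \text{YES}) = \frac{1}{\sqrt{2\pi\,(10.62)^2}}\; e^{-\frac{1}{2}\left(\frac{44 - 48.4}{10.62}\right)^2} \approx 0.0344$$

So the missing factor (??????) is 0.0344.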
P(C1=YES/Xtest) = P(x1= College /C1=YES)* P(x2= 44/C1=YES)* P(x3=Female /C1=YES)*P(C1=YES)
= (1/5) * (0.0344) * (2/5) * (5/8)
= 0.00172
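
The density g and the resulting YES score can be evaluated directly. This is a minimal sketch: the function name `gaussian` is illustrative, and the mean 48.4 and standard deviation 10.62 are the values given above.

```python
import math


def gaussian(x, mean, stdev):
    """Normal density g(x, mean, stdev) = exp(-0.5 * ((x - mean) / stdev)^2) / (sqrt(2*pi) * stdev)."""
    exponent = -0.5 * ((x - mean) / stdev) ** 2
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * stdev)


p_age_yes = gaussian(44, 48.4, 10.62)

# P(College / YES) * P(44 / YES) * P(Female / YES) * P(YES)
score_yes = (1 / 5) * p_age_yes * (2 / 5) * (5 / 8)

print(round(p_age_yes, 4), round(score_yes, 5))
```

Evaluated in floating point, the density comes out marginally above the 0.0344 used in the text (which appears to be truncated rather than rounded); the score still rounds to 0.00172.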

Now let's calculate the same quantities for the NO class:

P(C2/X) ∝ P(X/C2)*P(C2)
P(C2/X) ∝ P(x1,x2,x3 /C2)*P(C2)
P(C2=No/Xtest) = P(x1= College /C2=No)* P(x2= 44/ C2=No)* P(x3=Female / C2=No)*P(C2=No)

= (1/3) * (??????) * (2/3) * (3/8)

NO (Age): the ages of the people who were not accepted by the company:
22, 38, 20
Mean = 26.66
Stdev = 9.86 (sample standard deviation)
$$P(x_2 = 44 \mid C_2 = \text{NO}) = \frac{1}{\sqrt{2\pi\,(9.86)^2}}\; e^{-\frac{1}{2}\left(\frac{44 - 26.66}{9.86}\right)^2} \approx 0.0086$$

P(C2=No/Xtest) = P(x1= College /C2=No)* P(x2= 44/ C2=No)* P(x3=Female / C2=No)*P(C2=No)

= (1/3) * (0.0086) * (2/3) * (3/8)


= 0.00071667

FINAL DECISION:
max{YES = 0.00172, NO = 0.00071667} = 0.00172, which is the score of the YES class.
The test instance is therefore assigned to the YES class.
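
The whole of Ex2 can be sketched end to end: categorical attributes are handled by counting and the numerical Age attribute by a Gaussian density. This is only an illustrative sketch, not a reference implementation; `ROWS`, `gaussian` and `score` are hypothetical names, no smoothing is applied, and unrounded means and standard deviations are used, so the last digits differ slightly from the hand calculation.

```python
import math
import statistics

# Rows from the Ex2 table: (Education, Age, Gender, Class_Label).
ROWS = [
    ("Secondary", 60, "Male", "YES"),
    ("Primary", 22, "Male", "NO"),
    ("College", 38, "Female", "NO"),
    ("Secondary", 40, "Male", "YES"),
    ("Primary", 40, "Male", "YES"),
    ("College", 60, "Female", "YES"),
    ("Primary", 20, "Female", "NO"),
    ("Secondary", 42, "Female", "YES"),
]


def gaussian(x, mean, stdev):
    """Normal density used for the numerical Age attribute."""
    return math.exp(-0.5 * ((x - mean) / stdev) ** 2) / (math.sqrt(2 * math.pi) * stdev)


def score(test, label, rows=ROWS):
    """Unnormalized posterior P(X / label) * P(label) for a (Education, Age, Gender) test tuple."""
    education, age, gender = test
    in_class = [row for row in rows if row[3] == label]
    n = len(in_class)
    p_education = sum(row[0] == education for row in in_class) / n
    p_gender = sum(row[2] == gender for row in in_class) / n
    ages = [row[1] for row in in_class]
    p_age = gaussian(age, statistics.mean(ages), statistics.stdev(ages))
    prior = n / len(rows)
    return p_education * p_age * p_gender * prior


test_instance = ("College", 44, "Female")
scores = {label: score(test_instance, label) for label in ("YES", "NO")}
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

The two scores agree with the hand-computed 0.00172 and 0.00072 up to rounding, and the predicted class is YES.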
