Vietnam National University of HCMC
International University
School of Computer Science and Engineering
INTRODUCTION TO ARTIFICIAL INTELLIGENCE
(IT097IU)
LECTURE 06: SUPERVISED LEARNING –
NEURAL NETWORKS
Instructor: Nguyen Trung Ky
Machine Learning
Up until now: model-based classification with Naive Bayes
Machine learning: how to acquire a model from data / experience
Learning parameters (e.g. probabilities)
Learning structure (e.g. neural networks)
Learning hidden concepts (e.g. clustering)
Perceptrons
Error-Driven Classification
Errors, and What to Do
Examples of errors
Dear GlobalSCAPE Customer,
GlobalSCAPE has partnered with ScanSoft to offer you the
latest version of OmniPage Pro, for just $99.99* - the regular
list price is $499! The most common question we've received
about this offer is - Is this genuine? We would like to assure
you that this offer is authorized by ScanSoft, is genuine and
valid. You can get the . . .
. . . To receive your $30 Amazon.com promotional certificate,
click through to
http://www.amazon.com/apparel
and see the prominent link for the $30 offer. All details are
there. We hope you enjoyed receiving this message. However, if
you'd rather not receive future e-mails announcing new store
launches, please click . . .
What to Do About Errors
Problem: there’s still spam in your inbox
Need more features – words aren’t enough!
Have you emailed the sender before?
Have 1M other people just gotten the same email?
Is the sending information consistent?
Is the email in ALL CAPS?
Do inline URLs point where they say they point?
Does the email address you by (your) name?
Naïve Bayes models can incorporate a variety of features, but tend to do
best in homogeneous cases (e.g. all features are word occurrences)
Linear Classifiers
Feature Vectors
Example 1 (spam filtering). The email text
  "Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just ..."
maps to the feature vector
  # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
with label SPAM (+).
Example 2 (digit recognition). An image of a handwritten "2" maps to the feature vector
  PIXEL-7,12 : 1, PIXEL-7,13 : 0, ..., NUM_LOOPS : 1, ...
with label "2".
Some (Simplified) Biology
Very loose inspiration: human neurons
Linear Classifiers
Inputs are feature values
Each feature has a weight
Sum is the activation: activation_w(x) = Σ_i w_i · f_i(x) = w · f(x)
If the activation is:
  Positive, output +1
  Negative, output -1
[Diagram: inputs f1, f2, f3 scaled by weights w1, w2, w3, summed, then tested against zero (>0?)]
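A minimal sketch of this decision rule in Python, with feature vectors and weights as dictionaries from feature names to values (an illustrative encoding, not one prescribed by the lecture):

```python
# Binary linear classifier: the activation is the dot product of the
# weight vector with the feature vector; its sign gives the class.

def activation(weights, features):
    """Compute w . f(x) over the features present in the example."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def classify(weights, features):
    """Output +1 if the activation is positive, otherwise -1."""
    return +1 if activation(weights, features) > 0 else -1
```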
Weights
Binary case: compare features to a weight vector
Learning: figure out the weight vector from examples
Weight vector w:
  # free : 4, YOUR_NAME : -1, MISSPELLED : 1, FROM_FRIEND : -3, ...
Example feature vector f(x1):
  # free : 2, YOUR_NAME : 0, MISSPELLED : 2, FROM_FRIEND : 0, ...
Example feature vector f(x2):
  # free : 0, YOUR_NAME : 1, MISSPELLED : 1, FROM_FRIEND : 1, ...
Dot product positive means the positive class
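Working out the two dot products above (using only the four weights shown; the remaining features are ignored):

```latex
\[
\begin{aligned}
w \cdot f(x_1) &= 4\cdot 2 + (-1)\cdot 0 + 1\cdot 2 + (-3)\cdot 0 = 10 > 0 &&\Rightarrow \text{ positive class (spam)}\\
w \cdot f(x_2) &= 4\cdot 0 + (-1)\cdot 1 + 1\cdot 1 + (-3)\cdot 1 = -3 < 0 &&\Rightarrow \text{ negative class (ham)}
\end{aligned}
\]
```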
Decision Rules
Binary Decision Rule
In the space of feature vectors
Examples are points
Any weight vector defines a hyper-plane
One side corresponds to Y=+1
Other corresponds to Y=-1
Example weight vector w:
  BIAS : -3, free : 4, money : 2, ...
[Plot: feature space with axes "free" and "money"; the weight vector defines a line separating the +1 = SPAM region from the -1 = HAM region]
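With the weights above (and a constant BIAS feature equal to 1, an assumption consistent with the plot), the decision boundary is the set of points where the activation is zero:

```latex
\[
w \cdot f(x) = -3 + 4 \cdot \mathrm{free} + 2 \cdot \mathrm{money} = 0,
\qquad
\text{predict } +1 \text{ (SPAM) if } w \cdot f(x) > 0, \text{ else } -1 \text{ (HAM)}.
\]
```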
Weight Updates
Learning: Binary Perceptron
Start with weights = 0
For each training instance:
Classify with current weights
If correct (i.e., y=y*), no change!
If wrong: adjust the weight vector by adding or subtracting the feature vector; subtract if y* is -1 (equivalently, w ← w + y* · f(x); see the sketch below)
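A runnable sketch of this training loop, reusing the dictionary-based feature encoding from earlier (the number of passes over the data is an illustrative choice, not specified in the lecture):

```python
# Binary perceptron training: start at zero weights and, on every mistake,
# add y* . f(x) to the weight vector (add if y* = +1, subtract if y* = -1).
from collections import defaultdict

def train_binary_perceptron(data, passes=10):
    """data: list of (features, label) pairs, with label in {+1, -1}."""
    w = defaultdict(float)                            # start with weights = 0
    for _ in range(passes):
        for features, y_star in data:
            activation = sum(w[name] * value for name, value in features.items())
            y = +1 if activation > 0 else -1          # classify with current weights
            if y != y_star:                           # if wrong, adjust the weights
                for name, value in features.items():
                    w[name] += y_star * value
    return w
```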
Examples: Perceptron
Separable Case
Multiclass Decision Rule
If we have multiple classes:
A weight vector w_y for each class y
Score (activation) of a class y: the dot product w_y · f(x)
Prediction: the class with the highest score wins (formulas below)
Binary = multiclass where the negative class has weight zero
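In symbols, the multiclass scoring and prediction rule described above:

```latex
\[
\mathrm{score}(x, y) = w_y \cdot f(x),
\qquad
\hat{y} = \arg\max_{y} \; w_y \cdot f(x).
\]
```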
Learning: Multiclass Perceptron
Start with all weights = 0
Pick up training examples one by one
Predict with current weights
If correct, no change!
If wrong: lower the score of the wrong answer and raise the score of the right answer:
  w_y ← w_y − f(x) for the predicted (wrong) class, w_y* ← w_y* + f(x) for the true class; see the sketch below
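A sketch of the multiclass perceptron in the same dictionary style (class labels and the number of passes are illustrative assumptions):

```python
# Multiclass perceptron: one weight vector per class; predict the class with
# the highest score, and on a mistake shift weight from the predicted (wrong)
# class to the true class.
from collections import defaultdict

def score(class_weights, features):
    return sum(class_weights[name] * value for name, value in features.items())

def train_multiclass_perceptron(data, classes, passes=10):
    """data: list of (features, true_class) pairs; classes: list of class labels."""
    w = {y: defaultdict(float) for y in classes}      # all weights start at 0
    for _ in range(passes):
        for features, y_star in data:
            y_hat = max(classes, key=lambda y: score(w[y], features))
            if y_hat != y_star:                       # wrong: adjust two classes
                for name, value in features.items():
                    w[y_hat][name] -= value           # lower score of wrong answer
                    w[y_star][name] += value          # raise score of right answer
    return w
```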
Example: Multiclass Perceptron
“win the vote”
“win the election”
“win the game”
Initial weight vectors, one per class (columns of the original slide's table):
  Class 1: BIAS : 1, win : 0, game : 0, vote : 0, the : 0, ...
  Class 2: BIAS : 0, win : 0, game : 0, vote : 0, the : 0, ...
  Class 3: BIAS : 0, win : 0, game : 0, vote : 0, the : 0, ...
Properties of Perceptrons
Separability: true if some parameters get the training set perfectly correct
Convergence: if the training set is separable, the perceptron will eventually converge (binary case)
Mistake Bound: the maximum number of mistakes (binary case) is related to the margin, or degree of separability
[Figures: a linearly separable dataset and a non-separable dataset in feature space]
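A standard form of this bound (the lecture does not spell out the exact statement; this is the classical result for data separated with margin δ by some unit weight vector, with feature vectors of norm at most R):

```latex
\[
\#\text{mistakes} \;\le\; \left(\frac{R}{\delta}\right)^{2},
\qquad \|f(x)\| \le R \text{ for all training examples.}
\]
```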
Examples: Perceptron
Non-Separable Case
Improving the Perceptron
Problems with the Perceptron
Noise: if the data isn’t separable, weights might thrash
  Averaging weight vectors over time can help (averaged perceptron; a sketch follows below)
Mediocre generalization: finds a “barely” separating solution
Over-training: test / validation accuracy usually rises, then falls
  Over-training is a kind of over-fitting
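A sketch of the averaged perceptron mentioned above: train as usual but keep a running sum of the weight vector after every example, and return the average (this uniform-averaging variant is one common choice, not necessarily the exact one intended in the lecture):

```python
# Averaged perceptron (binary case): averaging the weight vectors seen over
# training smooths out the thrashing caused by non-separable (noisy) data.
from collections import defaultdict

def train_averaged_perceptron(data, passes=10):
    """data: list of (features, label) pairs, with label in {+1, -1}."""
    w = defaultdict(float)         # current weight vector
    w_sum = defaultdict(float)     # running sum of weight vectors
    steps = 0
    for _ in range(passes):
        for features, y_star in data:
            activation = sum(w[name] * value for name, value in features.items())
            if (+1 if activation > 0 else -1) != y_star:
                for name, value in features.items():
                    w[name] += y_star * value
            for name, value in w.items():             # accumulate after every example
                w_sum[name] += value
            steps += 1
    return {name: total / steps for name, total in w_sum.items()}
```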
Fixing the Perceptron
Idea: adjust the weight update to mitigate these effects
MIRA*: choose an update size τ that fixes the current mistake…
… but minimizes the change to w:
  minimize (1/2) Σ_y ||w_y − w'_y||²  subject to  w_y* · f(x) ≥ w_y · f(x) + 1
  (w' = weights before the update, y = wrongly predicted class, y* = true class)
The +1 in the constraint helps to generalize
* Margin Infused Relaxed Algorithm
Minimum Correcting Update
Only w_y and w_y* change: w_y* gains τ f(x), w_y loses τ f(x)
The minimizing τ is not 0 (or we would not have made an error), so the minimum is where the constraint holds with equality (worked out below)
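Setting the constraint to equality and solving for τ gives the closed-form update (a reconstruction of the formula the slide refers to, in the standard MIRA form):

```latex
\[
w_{y^*} = w'_{y^*} + \tau\, f(x), \qquad
w_{y} = w'_{y} - \tau\, f(x), \qquad
\tau = \frac{\bigl(w'_{y} - w'_{y^*}\bigr) \cdot f(x) + 1}{2\, f(x) \cdot f(x)}.
\]
```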
Maximum Step Size
In practice, it’s also bad to make updates that are too large
Example may be labeled incorrectly
You may not have enough features
Solution: cap the maximum possible value of τ with some constant C, i.e. use min(τ, C); a code sketch follows below
Corresponds to an optimization that assumes non-separable data
Usually converges faster than perceptron
Usually better, especially on noisy data
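A sketch of the capped MIRA update in the same multiclass setting (the dictionary representation and the default value of C are illustrative assumptions):

```python
# MIRA update with capped step size: on a mistake, move the true class's
# weights toward f(x) and the predicted class's weights away from it,
# with step size tau = ((w_y - w_y*) . f + 1) / (2 f . f), capped at C.
from collections import defaultdict

def dot(u, v):
    return sum(u.get(name, 0.0) * value for name, value in v.items())

def mira_update(w, features, y_hat, y_star, C=0.01):
    """w: dict mapping class -> defaultdict(float); updated in place on a mistake."""
    if y_hat == y_star:
        return                                         # correct: no change
    tau = (dot(w[y_hat], features) - dot(w[y_star], features) + 1.0) \
          / (2.0 * dot(features, features))
    tau = min(tau, C)                                  # cap the step size at C
    for name, value in features.items():
        w[y_star][name] += tau * value                 # raise score of right answer
        w[y_hat][name] -= tau * value                  # lower score of wrong answer
```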
Linear Separators
Which of these linear separators is optimal?