
Statistical pattern recognition
(Bayesian Classification)

Classification
• The final stage is the classification of the objects on the
basis of the set of features we have just computed, i.e. on
the basis of the feature vector.
• If one views the feature values as “co-ordinates” of a point
in n-dimensional space (one feature value implies a one-
dimensional space, two features imply a two-dimensional
space, and so on), then one may view the objective of
classification as the determination of the sub-space of the
feature space to which the feature vector belongs.
• Since each sub-space corresponds to a distinct class of object,
the classification essentially accomplishes the object
identification.

Common model for classification
• Classes
• An ideal class is a set of objects having some
important common properties.
• Classification is a process that assigns a label
to an object according to some representation
of the object’s properties.
• A classifier is a device or algorithm that inputs
an object representation and outputs a class
label.

• Sensor/transducer
• Some device to sense the actual physical
object and output a (usually) digital
representation of it for processing by machine.
• E.g., a colour camera, a stylus
• Feature extractor (FE)
• FE extracts information relevant to
classification from the data input by the
sensor.

• Classifier
• The classifier uses the features extracted
from the sensed object data to assign
the object to one of the m designated classes
C1, C2, …, Cm.

• For example, consider a pattern recognition application
which requires us to discriminate between nuts, bolts,
and washers.
• Assuming that we can segment these objects adequately,
we might choose to use two features on which to base
the classification: washers and nuts are almost circular
in shape, while bolts are quite long in comparison, so we
decide to use a circularity measure as one feature.
• Furthermore, washers have a larger diameter than nuts,
and bolts have an even larger maximum dimension. Thus,
we decide to use the maximum length of the object as
the second feature.
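As an illustration, a minimal sketch of how these two features might be computed from a segmented binary object mask is given below. The function name, the boundary-pixel perimeter estimate and the pairwise-distance maximum dimension are assumptions made for this sketch only, not part of the lecture.

```python
import numpy as np

def extract_features(mask):
    """Compute (maximum dimension, circularity A/P^2) for one binary object mask.

    mask: 2-D boolean array, True where the object is present (assumed non-empty).
    The perimeter is approximated by counting object pixels that touch the
    background; this is a rough estimate, sufficient for the sketch.
    """
    coords = np.argwhere(mask)                      # (row, col) of object pixels
    area = float(len(coords))                       # A = number of object pixels

    # Boundary pixels: object pixels with at least one 4-connected background neighbour.
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = float(np.sum(mask & ~interior))     # P = boundary pixel count

    # Maximum dimension: largest pairwise distance between object pixels.
    diffs = coords[:, None, :] - coords[None, :, :]
    max_dim = float(np.sqrt((diffs ** 2).sum(-1)).max())

    circularity = area / perimeter ** 2             # A / P^2, as used on the slides
    return np.array([max_dim, circularity])
```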

• If we then proceed to measure these feature values for a
fairly large set of these objects, called the training set, and
plot the results on a piece of graph paper (a two-dimensional
feature space), we will probably observe a clustering
pattern in which nuts, bolts, and washers are grouped in
distinct sub-spaces.
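A quick way to visualise such a training set is sketched below. The feature values are synthetic, generated only to illustrate the clustering; they are not measurements from the lecture.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed training-set feature values (max dimension, circularity) per class.
rng = np.random.default_rng(0)
training = {
    "washer": rng.normal([20.0, 0.060], [2.0, 0.004], size=(50, 2)),
    "nut":    rng.normal([10.0, 0.055], [1.5, 0.004], size=(50, 2)),
    "bolt":   rng.normal([35.0, 0.020], [3.0, 0.003], size=(50, 2)),
}

# Scatter plot of the two-dimensional feature space; each class forms its own cluster.
for name, feats in training.items():
    plt.scatter(feats[:, 0], feats[:, 1], label=name)
plt.xlabel("Maximum dimension")
plt.ylabel("Circularity A/P$^2$")
plt.legend()
plt.show()
```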

[Figure: scatter plot of the training set in the two-dimensional feature
space, with circularity (A/P², where A = area and P = perimeter length) on
the vertical axis and maximum dimension on the horizontal axis; washers,
nuts, and bolts form three distinct clusters.]

• We generate the feature vector for this unknown object
(i.e. compute its maximum dimension and its circularity
measure) and see where this takes us in the feature
space.
• The question is now: to which sub-space does the vector
belong, i.e. to which class does the object belong?
• One of the most popular and simplest techniques is the
nearest neighbour classification (NNC) technique.
• The NNC technique classifies the object on the basis of the
distance of the unknown object's vector position from the
centres of the three clusters, choosing the closest cluster
as the one to which it belongs.
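A minimal sketch of this nearest cluster centre rule follows; the cluster centres and the example feature vector are made-up values for illustration only.

```python
import numpy as np

# Cluster centres (mean feature vector per class) estimated from a training set.
# Feature order: [maximum dimension, circularity A/P^2]; the numbers are assumed.
cluster_centres = {
    "washer": np.array([20.0, 0.060]),
    "nut":    np.array([10.0, 0.055]),
    "bolt":   np.array([35.0, 0.020]),
}

def classify_nnc(feature_vector):
    """Assign the class whose cluster centre is closest (Euclidean distance)."""
    return min(cluster_centres,
               key=lambda c: np.linalg.norm(feature_vector - cluster_centres[c]))

# Example: an unknown object with a small maximum dimension and high circularity.
print(classify_nnc(np.array([11.0, 0.052])))   # -> 'nut'
```

In practice the two features would usually be rescaled (e.g. to unit variance) before computing distances, since maximum dimension and circularity have very different numeric ranges.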

Bayesian Maximum Likelihood classification

• This approach utilises Bayes’ theorem from statistical decision
theory and is called the maximum likelihood classifier.
• We will develop the discussion using an example which
requires only one feature to discriminate between two
objects.
• Suppose that in a situation similar to that described in
the preceding example, we wish to distinguish between
nuts and bolts.
• We now have just one feature (circularity) and a one-
dimensional feature space with two classes of object: nuts
and bolts (Cn and Cb).

• Let us refer to the circularity feature value as x.
• The first thing that is required is the probability density
function (PDF) for each of these two classes, i.e. a
measure of the probability that an object from a
particular class will have a given feature value.

• The PDFs for nuts and bolts can be estimated in a relatively
simple manner by measuring the value of x for a large
number of nuts (and, separately, bolts), plotting the histogram
of these values, and normalising the values so that the total
area under the histogram equals one.
• We may know, for instance, that the class of nuts is, in
general, likely to occur twice as often as the class of
bolts. In this case we say that the a priori probabilities of the
two classes are:
P(Cn) = 0.667 and P(Cb) = 0.333
In fact, in this case it is more likely that they will have the same a priori
probabilities (0.5), since we usually have a nut for each bolt.
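The histogram-based PDF estimate and the a priori probabilities might be computed as in the following sketch. The circularity measurements are simulated, and the counts of 200 nuts and 100 bolts are chosen only to reproduce the 2:1 ratio above.

```python
import numpy as np

# Assumed training measurements of the circularity feature x for each class.
x_nuts  = np.random.normal(loc=0.058, scale=0.004, size=200)   # nuts: high circularity
x_bolts = np.random.normal(loc=0.025, scale=0.005, size=100)   # bolts: low circularity

bins = np.linspace(0.0, 0.08, 41)                # shared bin edges for both histograms

# density=True normalises each histogram so the area under it equals one,
# giving an estimate of the class-conditional PDF P(x|Ci).
pdf_nuts,  _ = np.histogram(x_nuts,  bins=bins, density=True)
pdf_bolts, _ = np.histogram(x_bolts, bins=bins, density=True)

# A priori probabilities estimated from how often each class occurs.
n_total = len(x_nuts) + len(x_bolts)
prior_nut, prior_bolt = len(x_nuts) / n_total, len(x_bolts) / n_total
```

With 200 nuts and 100 bolts the estimated priors come out as roughly 2/3 and 1/3, matching the example above.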


• The “conditional probability” of an object having a
certain feature value, P(x|Cb), is the probability
that a circularity x will occur, given that the object is a
bolt. We get this value from the histogram of each
particular class.
• Of course, this is not what we are interested in at all. We
want to determine the probability that an object belongs
to a particular class, given that a particular value of x has
occurred, allowing us to establish its identity.
• This is called the a posteriori probability P(Ci|x) that the
object belongs to a particular class Ci, and it is given by Bayes’
theorem:

P(Ci|x) = P(x|Ci) P(Ci) / P(x)

where P(x) is a normalisation factor which ensures that the
a posteriori probabilities sum to one:

P(x) = P(x|Cn) P(Cn) + P(x|Cb) P(Cb)
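Putting the pieces together, the a posteriori probabilities and the resulting decision might be computed as in this sketch; the likelihood values P(x|Ci), read off the normalised histograms at the observed circularity x, are assumed numbers.

```python
# Class-conditional likelihoods P(x|Ci) at the measured circularity x of an
# unknown object (assumed values for illustration).
likelihood = {"nut": 18.0, "bolt": 2.5}     # P(x|Cn), P(x|Cb)
prior      = {"nut": 0.667, "bolt": 0.333}  # a priori probabilities from the slides

# Normalisation factor P(x) = sum over i of P(x|Ci) P(Ci).
evidence = sum(likelihood[c] * prior[c] for c in likelihood)

# A posteriori probabilities P(Ci|x) from Bayes' theorem.
posterior = {c: likelihood[c] * prior[c] / evidence for c in likelihood}

# Decision: pick the class with the highest a posteriori probability.
decision = max(posterior, key=posterior.get)
print(posterior, "->", decision)            # posteriors sum to one; 'nut' wins here
```

Note that P(x) is the same for every class, so when only the decision is required one can simply choose the class that maximises P(x|Ci) P(Ci).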
