Lecture 9
Model Evaluation
Classification vs Regression
Classification:
predicts categorical class labels
constructs a model from the training set and the values (class labels) of a
classifying attribute, and uses the model to classify new data.
Regression:
models continuous-valued functions, i.e., predicts unknown or missing
values.
Typical Applications
credit approval
target marketing
medical diagnosis
treatment effectiveness analysis
Classification – A Motivating
Application
Credit approval
A bank wants to classify its customers based on whether they
are expected to pay back their approved loans
The history of past customers is used to train the classifier
The classifier provides rules, which identify potentially reliable
future customers
Classification rule:
If age = “31...40” and income = high then credit_rating = excellent
Future customers
Paul: age = 35, income = high → excellent credit rating
John: age = 20, income = medium → fair credit rating
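The rule above can be sketched as a tiny classifier; the function name and the default "fair" rating for non-matching customers are illustrative assumptions, not part of the lecture.

```python
# Toy version of the slide's classification rule:
# if age in 31..40 and income = high then credit_rating = excellent.
# The "fair" fallback for everyone else is an assumption for the sketch.
def credit_rating(age, income):
    if 31 <= age <= 40 and income == "high":
        return "excellent"
    return "fair"

print(credit_rating(35, "high"))    # Paul  → excellent
print(credit_rating(20, "medium"))  # John  → fair
```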
Classification—A Two-Step Process
Step 2 (Testing): apply the model to test data, then to unseen data.

Test data:
NAME     RANK            YEARS   TENURED
Tom      Assistant Prof  2       no
Mellisa  Associate Prof  7       no
George   Professor       5       yes
Joseph   Assistant Prof  7       yes

Unseen data: (Jeff, Professor, 4) → Tenured?
Classification (Training Phase)
The individual tuples making up the training set are referred to as training
tuples and are randomly sampled from the database under analysis. In the
context of classification, data tuples can be referred to as samples, examples,
instances, data points, or objects.
This first step of the classification process can also be viewed as the learning of a
mapping or function, y = f (X), that can predict the associated class label y of a
given tuple X.
In this view, we wish to learn a mapping or function that separates the data
classes.
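One minimal, illustrative way to realise such a learned mapping y = f(X) is a 1-nearest-neighbour rule that memorises the training tuples; the attribute encoding and toy data below are assumptions for the sketch, not from the lecture.

```python
# Illustrative sketch of learning a mapping y = f(X): here f is a
# 1-nearest-neighbour rule that simply memorises the training tuples.
# Attribute encoding assumed: [age, income code 0=low/1=medium/2=high].
def fit_1nn(X_train, y_train):
    def f(x):
        # Predict the class label of the closest training tuple
        # (squared Euclidean distance).
        dists = [sum((a - b) ** 2 for a, b in zip(x, xt)) for xt in X_train]
        return y_train[dists.index(min(dists))]
    return f

X_train = [[35, 2], [20, 1], [45, 0]]     # training tuples X
y_train = ["excellent", "fair", "fair"]   # associated class labels y
f = fit_1nn(X_train, y_train)
print(f([36, 2]))  # → excellent (nearest training tuple is [35, 2])
```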
Evaluating Classification Methods
Predictive Accuracy
Speed
time to construct the model
time to use the model
Robustness
handling noise and missing values
Scalability
efficiency in disk-resident databases
Interpretability:
understanding and insight provided by the model
Goodness of rules (quality)
True Positives, True Negatives, False Negatives, False Positives
compactness of classification rules
Supervised Learning Algorithms
Perceptron
Developed by Frank Rosenblatt from the McCulloch–Pitts neuron model, the
perceptron is the basic operational unit of artificial neural networks. It
employs a supervised learning rule and is able to classify data into two
classes.
Operational characteristics of the perceptron: it consists of a single
neuron with an arbitrary number of inputs along with adjustable weights,
and the output of the neuron is 1 or 0 depending on the threshold. It also
has a bias, whose input is always 1 and whose weight is adjustable.
[Figure omitted: schematic representation of the perceptron.]
Perceptron thus has the following three basic elements −
Links − a set of connection links, each carrying a weight; this includes a
bias link whose input is fixed at 1.
Adder − sums the inputs after they are multiplied by their respective
weights.
Activation function − limits the output of the neuron. The most basic
activation function is the Heaviside step function, which has two possible
outputs: it returns 1 if the input is positive and 0 otherwise.
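The three elements above (weighted links plus a bias, an adder, and a Heaviside step activation) can be sketched directly; the weights and inputs below are illustrative, not learned.

```python
# Sketch of the perceptron described above: a weighted sum of the inputs
# plus a bias term, passed through a Heaviside step activation.
# The weights and bias below are illustrative, not learned.
def perceptron(inputs, weights, bias_weight):
    """Adder + Heaviside step: return 1 if the weighted sum (including the
    bias, whose input is fixed at 1) is positive, else 0."""
    s = sum(x * w for x, w in zip(inputs, weights)) + 1 * bias_weight
    return 1 if s > 0 else 0

print(perceptron([1.0, 0.5], [0.4, 0.6], bias_weight=-0.5))  # 0.7 - 0.5 > 0 → 1
print(perceptron([0.2, 0.1], [0.4, 0.6], bias_weight=-0.5))  # 0.14 - 0.5 < 0 → 0
```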
Linear Regression
Linear regression is a statistical model that analyzes the linear
relationship between a dependent variable and a given set of independent
variables. A linear relationship between variables means that when the
value of one or more independent variables changes (increases or
decreases), the value of the dependent variable changes accordingly
(increases or decreases).
Mathematically the relationship can be represented with the help of
following equation −
Y = mX + b
Here, Y is the dependent variable we are trying to predict,
X is the independent variable we are using to make predictions,
m is the slope of the regression line, which represents the effect X
has on Y, and
b is a constant, known as the Y-intercept: if X = 0, Y equals b.
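A minimal sketch of estimating m and b by ordinary least squares on toy data; the dataset and helper name are illustrative assumptions.

```python
# Fit Y = mX + b by ordinary least squares (toy data, illustrative helper).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope m: covariance of X and Y divided by variance of X.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    # Intercept b: the line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 2x + 1
m, b = fit_line(xs, ys)
print(m, b)                # → 2.0 1.0
```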
Positive Linear Relationship
A linear relationship is called positive if the dependent variable
increases as the independent variable increases (an upward-sloping
regression line).
Negative Linear relationship
A linear relationship is called negative if the dependent variable
decreases as the independent variable increases (a downward-sloping
regression line).
Support Vector Machines
Metrics for Performance Evaluation

                     PREDICTED CLASS
                     Class=Yes   Class=No
ACTUAL   Class=Yes   a           b
CLASS    Class=No    c           d

a: TP (true positive)     b: FN (false negative)
c: FP (false positive)    d: TN (true negative)
Metrics for Performance Evaluation…

                     PREDICTED CLASS
                     Class=Yes   Class=No
ACTUAL   Class=Yes   a (TP)      b (FN)
CLASS    Class=No    c (FP)      d (TN)

Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)
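The accuracy formula can be checked with a small sketch; the counts below are made up for illustration.

```python
# Accuracy from the confusion-matrix counts: a = TP, b = FN, c = FP, d = TN.
def accuracy(tp, fn, fp, tn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + fn + fp + tn)

print(accuracy(tp=50, fn=10, fp=5, tn=35))  # (50 + 35) / 100 → 0.85
```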
Methods of Estimation
Holdout
Reserve 2/3 for training and 1/3 for testing
Random subsampling
One sample may be biased -- Repeated holdout
Cross validation
Partition data into k disjoint subsets
k-fold: train on k-1 partitions, test on the
remaining one
Leave-one-out: k=n
Guarantees that each record is used the same number of times for training
and testing
Bootstrap
Sampling with replacement
~63.2% of records used for training, ~36.8% for testing
(each record is left out with probability (1 − 1/n)^n ≈ e⁻¹ ≈ 0.368)
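The k-fold cross-validation partitioning described above can be sketched as index splitting; `kfold_indices` is an illustrative helper (no shuffling, for clarity), not a library API.

```python
# Sketch of k-fold cross-validation splitting: partition indices 0..n-1
# into k disjoint test folds; each record is used exactly once for testing
# and k-1 times for training (leave-one-out is the special case k = n).
def kfold_indices(n, k):
    folds = []
    for i in range(k):
        test = list(range(i * n // k, (i + 1) * n // k))
        train = [j for j in range(n) if j not in test]
        folds.append((train, test))
    return folds

for train, test in kfold_indices(6, 3):
    print(train, test)
# → [2, 3, 4, 5] [0, 1]
#   [0, 1, 4, 5] [2, 3]
#   [0, 1, 2, 3] [4, 5]
```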
ROC (Receiver Operating Characteristic)
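The slide title refers to the ROC curve, which plots the true-positive rate TPR = TP/(TP+FN) against the false-positive rate FPR = FP/(FP+TN) as the classifier's decision threshold varies. A minimal sketch, with illustrative scores and labels:

```python
# Compute (FPR, TPR) points of an ROC curve for each distinct score
# threshold. Scores and 0/1 labels below are illustrative.
def roc_points(scores, labels):
    pos = sum(labels)              # number of actual positives
    neg = len(labels) - pos        # number of actual negatives
    points = []
    for t in sorted(set(scores), reverse=True):
        # Predict positive whenever score >= threshold t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

print(roc_points([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
# → [(0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```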