
MACHINE LEARNING

Lecture 02

Dr. Samana Batool


INTRODUCTION TO ML
[Figure: Labeled Training Set for Spam Classification (an example of supervised learning); each sample in the training set carries a label, and the trained model then classifies a new sample.]

[Figure: the scikit-learn algorithm cheat-sheet, a decision flowchart that branches on whether the data is labeled and whether the number of categories is known, leading to choices such as Ensemble Classifiers, kernel approximation, SVR (kernel='linear'), and SVR (kernel='rbf').]
[Figure: a taxonomy of unsupervised learning; time-series modelling (AR, Markov chain; latent Markov chain: HMM for discrete states, LDS and SLDS for continuous states; CRF for undirected models), anomaly detection and time-series prediction, latent variable models (mixed-membership models, Latent Dirichlet Allocation), dimension reduction (under-complete, over-complete, clique decomposition), and visualisation (linear and non-linear: PCA, factor analysis, sparse basis, NMF, PLSA).]
MACHINE LEARNING ALGORITHMS

Unsupervised
 Clustering & Dimensionality Reduction (continuous)
    SVD
    PCA
    K-means
 Association Analysis (categorical)
    Apriori
    FP-Growth

Supervised
 Regression (continuous): Linear, Polynomial
    Decision Trees
    Random Forests
 Classification (categorical)
    KNN
    Trees
    Logistic Regression
    Naïve-Bayes
    SVM
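As a minimal illustration of the two families, the sketch below (not from the slides; it assumes scikit-learn is installed) clusters unlabeled points with K-means, then fits a supervised classifier on the same points once labels are available:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Toy data: 2-D points in three groups
X, y = make_blobs(n_samples=150, centers=3, random_state=0)

# Unsupervised: K-means sees only X and discovers clusters on its own
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Cluster labels:", kmeans.labels_[:10])

# Supervised: logistic regression learns from labeled pairs (X, y)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("Predicted class of a sample:", clf.predict(X[:1]))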
Data Mining Problems

Prediction: making a model from given data, then applying the model to new cases to make predictions. Example: predicting quality specifications from ingredients and environment (pressure, temperature, humidity, etc.) in the fishing industry.

Clustering: comparing the properties of the data and forming clusters based on similar characteristics. Example: clustering processes with similar characteristics out of various processes.

Classification: determining where a particular case belongs in a given series of classified categories. Example: quality ratings of good/normal/bad, used to determine the quality of new products.

Association Rule: identifying the attributes or relationships between items for which the appearance of one pattern implies the appearance of another. Example: predicting what will happen to the entire process when there is an abnormal pattern in one process.
Data Source Collection

• Internal data: collected manually, or automatically via a log collector
• External data: collected automatically via web crawling, sensing, and media
Data Transformation

Standardization
 • Z-transform
 • Normalization (feature scaling)
Normal Distribution
 • Log Transformation
 • Square Root Transformation
Categorization
 • Discretization
 • Binarization
Sampling
 • Random Sampling
 • Stratified Sampling
 • Cluster Sampling
Dimensionality Reduction
 • Factor Analysis
 • PCA
Signal Data Compression
 • Fourier Transform
 • Wavelet Transform
[Figure: model-selection workflow; split the entire data, choose a model, train the model on the training portion, then test the model.]
Confusion matrix (rows: predicted class; columns: true/actual class):

            True/Actual
Predicted   Cat   Fish   Hen
Cat          4     6      3
Fish         1     2      0
Hen          1     2      6
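From this matrix, overall accuracy is the sum of the diagonal (correct predictions) divided by the total number of samples; a quick numpy check:

import numpy as np

# Rows: predicted (Cat, Fish, Hen); columns: true/actual (Cat, Fish, Hen)
cm = np.array([[4, 6, 3],
               [1, 2, 0],
               [1, 2, 6]])

accuracy = np.trace(cm) / cm.sum()  # (4 + 2 + 6) / 25
print("Accuracy:", accuracy)        # 0.48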
Traditional Approach

  Problem Research -> Making Rules -> Evaluation -> Launch!
  (Error Analysis feeds back into Problem Research)

Machine Learning Approach

  Problem Research -> Machine Learning Algorithm + Data -> Training -> Evaluation -> Launch!
  (Error Analysis feeds back into Problem Research)
UNIT 01

MNIST contains 60,000 training images of 28 x 28 = 784 pixels each. Even restricted to black-and-white pixels, there are 2^784 possible images, so the dataset is a rare database in a vast space.


Sampling Noise and Sampling Bias

[Figure: Graph of an Inaccurate Prediction Due to Overfitting; a fitted curve in the x-y plane that mispredicts at a new point x0.]
Examples of equations:

y = 5x + 3
3x + 5y = 7
(x - 2)^3 = x^3 - 6x^2 + 12x - 8
x^2 + 3x + 2 = 0
x^2 + xyz + 2xy + 3y^2 = 9

A polynomial with 4 terms:

             Term 1   Term 2   Term 3   Term 4
Coefficient    4        -3       4        6
Degree         1        1        3        0
Graph of a First-Degree Equation

A first-degree equation in x has the form y = ax + b (a != 0), where a is the slope and b is the y-intercept.

[Figure: when a > 0 the line rises, e.g. y = 5x + 3 with slope 5 and intercept 3; when a < 0 the line falls, e.g. y = -7x + 5 with slope -7 and intercept 5.]

‣ a and b are treated as constants.
‣ The degree of the term ax is 1, and the degree of the term b is 0 because it contains no x. Thus this equation is a first-degree equation.
Graph of a Second-Degree Equation

A second-degree equation in x has the form y = ax^2 + bx + c (a != 0).

[Figure: when a > 0 the parabola is convex downward (opens upward); when a < 0 it is convex upward (opens downward); c is the y-intercept.]

‣ a, b, and c are treated as constants. The degree of ax^2 is 2, the degree of bx is 1, and the degree of c is 0 because it contains no x. Thus this equation is a second-degree equation.
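A short matplotlib sketch of both shapes; the line y = 5x + 3 comes from the slides, while the parabola's coefficients are chosen here only for illustration:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 200)

plt.subplot(1, 2, 1)
plt.plot(x, 5 * x + 3)         # first-degree: slope 5, intercept 3
plt.title("y = 5x + 3")

plt.subplot(1, 2, 2)
plt.plot(x, x**2 - 2 * x - 1)  # second-degree with a > 0: convex downward
plt.title("y = x^2 - 2x - 1")
plt.show()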
[Figure: a number line from -3 to 4; both -3 and 3 are at distance 3 from 0.]

‖p - q‖ = √((p - q) ∙ (p - q)) = √(‖p‖^2 + ‖q‖^2 - 2 p ∙ q)

The distance of segment AB is the distance between point A(a1, a2) and point B(b1, b2). The horizontal difference is (a1 - b1) and the vertical difference is (a2 - b2), so AB = √((a1 - b1)^2 + (a2 - b2)^2).
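A quick numpy check of the identity above, using made-up points p and q:

import numpy as np

p = np.array([3.0, 1.0])
q = np.array([2.0, 3.0])

d1 = np.linalg.norm(p - q)                 # ||p - q||
d2 = np.sqrt(p @ p + q @ q - 2 * (p @ q))  # sqrt(||p||^2 + ||q||^2 - 2 p.q)
print(d1, d2)  # both print the same distance, sqrt(5) ~ 2.236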
2^3 = 2 ∙ 2 ∙ 2 = 8

Exponentials          Logarithms
8 = 2^3               3 = log_2 8
100 = 10^2            2 = log_10 100
81 = 3^4              4 = log_3 81
0.01 = 10^-2          -2 = log_10 0.01

(The exponent becomes the logarithm; the base is shared; the antilog recovers the original number.)
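Each pair can be checked directly in Python; math.log(x, base) is the standard-library logarithm:

import math

print(math.log(8, 2))      # 3.0, since 2**3 == 8
print(math.log(100, 10))   # 2.0, since 10**2 == 100
print(math.log(81, 3))     # ~4.0, since 3**4 == 81
print(math.log(0.01, 10))  # -2.0, since 10**-2 == 0.01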
f(x) = a^x

For a real number a satisfying a > 0, a != 1, the graph of the exponential function y = a^x with base a can be drawn as shown below.

[Figure: y = a^x always passes through (0, 1); (a) when a > 1 the curve increases, (b) when 0 < a < 1 it decreases.]

The graph of the exponential function y = a^x with base a


[Figure: example exponential graphs; left: y = 3^x and y = 3^(x-1) + 2; right: y = 2^(-x) and y = -2^(-x-2).]
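A sketch reproducing the four curves named in the figure (the axis range is taken from the residue of the original plot):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1.0, 2.0, 200)

plt.subplot(1, 2, 1)
plt.plot(x, 3.0**x, label="y = 3^x")
plt.plot(x, 3.0**(x - 1) + 2, label="y = 3^(x-1) + 2")
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(x, 2.0**(-x), label="y = 2^(-x)")
plt.plot(x, -(2.0**(-x - 2)), label="y = -2^(-x-2)")
plt.legend()
plt.show()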
f(x) = log_a x

For a real number a satisfying a > 0, a != 1, the graph of the logarithmic function y = log_a x with base a can be drawn as shown below.

[Figure: y = log_a x is the reflection of y = a^x across the line y = x and passes through (1, 0); (a) when a > 1 the curve increases, (b) when 0 < a < 1 it decreases.]

The Graph of the Logarithmic Function y = log_a x with base a


[Figure: example logarithmic graphs over -1 <= x <= 4; y = log_2(x) and a shifted variant, read here as y = log_2(x + 1) - 1.]
import numpy as np
import matplotlib.pyplot as plt

# Skewed sample data drawn from an exponential distribution
data = np.random.exponential(scale=2.0, size=1000)

# Apply log transform (shift by 1 to avoid log(0))
log_data = np.log(data + 1)

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.hist(data, bins=30)
plt.title("Original Data (Exponential Distribution)")

plt.subplot(1, 2, 2)
plt.hist(log_data, bins=30)
plt.title("Log Transformed Data")
plt.show()
For binary classification (i.e., predicting between two classes), the formula for cross-entropy loss is:

L(y, ŷ) = -[ y log(ŷ) + (1 - y) log(1 - ŷ) ]


import numpy as np

y_true = np.array([1, 0, 1, 1])          # True labels
y_pred = np.array([0.9, 0.1, 0.8, 0.4])  # Predicted probabilities

cross_entropy_loss = -np.mean(y_true * np.log(y_pred)
                              + (1 - y_true) * np.log(1 - y_pred))
print("Binary Cross-Entropy Loss:", cross_entropy_loss)
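One practical caveat not on the slide: if any predicted probability is exactly 0 or 1, np.log produces -inf. A common safeguard (an addition here, not part of the lecture) is to clip predictions away from the endpoints by a small epsilon:

import numpy as np

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([1.0, 0.1, 0.8, 0.4])  # note the exact 1.0

eps = 1e-12
y_pred = np.clip(y_pred, eps, 1 - eps)   # keep probabilities strictly inside (0, 1)
loss = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print("Binary Cross-Entropy Loss (clipped):", loss)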
Comparing Logarithmic and Linear Loss

Create two functions, one computing linear loss and one computing logarithmic loss, and compare the results when applied to the same data.

Task:
• Implement a linear_loss(y_true, y_pred) function that calculates the mean absolute error (MAE).
• Implement a logarithmic_loss(y_true, y_pred) function that calculates the logarithmic (cross-entropy) loss between predictions and true values.
• Compare both loss functions using the same test data.
import numpy as np

def linear_loss(y_true, y_pred):
    """Mean absolute error (MAE)."""
    return np.mean(np.abs(y_true - y_pred))

def logarithmic_loss(y_true, y_pred):
    """Binary cross-entropy (log loss)."""
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Test the functions
y_true = np.array([1, 0, 1])
y_pred = np.array([0.9, 0.2, 0.8])

print("Linear Loss:", linear_loss(y_true, y_pred))
print("Logarithmic Loss:", logarithmic_loss(y_true, y_pred))
As x approaches 0 from either side, (1+x)^(1/x) approaches e ≈ 2.71828:

x        (1+x)^(1/x)     x          (1+x)^(1/x)
0.01 2.70481383 0.0001 2.71814593
0.0099 2.7049473 0.00009 2.7181595
0.0098 2.70508079 0.00008 2.71817311
0.0097 2.70521431 0.00007 2.71818669
0.0096 2.70534785 0.00006 2.71820028
0.0095 2.70548142 0.00005 2.71821387
0.0094 2.70561501 0.00004 2.71822746
0.0093 2.70574863 0.00003 2.71824106
0.0092 2.70588227 0.00002 2.71825465
0.0091 2.70601593 0.00001 2.71826824
0.009 2.70614962 0 (undefined)
0.0089 2.70628333 -1.E-05 2.71829542
0.0088 2.70641707 -2.E-05 2.71830901
0.0087 2.70655083 -3.E-05 2.71832260
0.0086 2.70668461 -4.E-05 2.71833620
0.0085 2.70681842 -5.E-05 2.71834979
0.0084 2.70695225 -6.E-05 2.71836338
0.0083 2.70708611 -7.E-05 2.71837697
0.0082 2.70722 -8.E-05 2.71839057
0.0081 2.70735390 -9.E-05 2.71840416
0.008 2.70748783 -0.0001 2.71841776
0.0079 2.70762179 -0.0001 2.71843135
0.0078 2.70775577 -0.0001 2.71844494
0.0077 2.70788977 -0.0001 2.71845854
0.0076 2.70802380 -0.0001 2.71847213
0.0075 2.70815785 -0.0002 2.71848573
0.0074 2.70829193 -0.0002 2.71849932
0.0073 2.70842603 -0.0002 2.71851292
0.0072 2.70856016 -0.0002 2.71852651
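The table can be reproduced in a few lines; as x shrinks toward 0 from either side, (1 + x)^(1/x) converges to e ≈ 2.718281828:

import math

for x in [0.01, 0.0001, 1e-5, -1e-5, -0.0001]:
    print(x, (1 + x) ** (1 / x))
print("e =", math.e)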
e^∞ = ∞
e^(-∞) = 1 / e^∞ = 1 / ∞ = 0
[Figure: the logistic curve over -6 <= x <= 6; it rises from 0, passes through 0.5 at x = 0, and saturates at 1.]
import numpy as np
import matplotlib.pyplot as plt

# Example of plotting the sigmoid function


x = np.linspace(-10, 10, 100)
y = 1 / (1 + np.exp(-x))

plt.plot(x, y)
plt.title("Sigmoid Function")
plt.xlabel("Input")
plt.ylabel("Output")
plt.grid(True)
plt.show()
Sigmoid Function Graph

y = 1 / (1 + exp(-ax))

[Figure: with a large slope parameter a, the sigmoid jumps sharply from 0 to 1 around x = 0.]

x       y
-1      3.72008E-44
-0.9    8.19401E-40
-0.8    1.80485E-35
-0.7    3.97545E-31
-0.6    8.75651E-27
-0.5    1.92875E-22
-0.4    4.24835E-18
-0.3    9.35762E-14
-0.2    2.06115E-09
-0.1    4.53979E-05
0       0.5
0.1     0.999954602
0.2     0.999999998
0.3     1
0.4     1
0.5     1
0.6     1
0.7     1
0.8     1
0.9     1
1       1
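The tabulated values can be reproduced with a large slope parameter; a = 100 matches them (inferred from the numbers, since the slide does not state a explicitly):

import numpy as np

a = 100  # slope parameter inferred from the table values
x = np.array([-1.0, -0.1, 0.0, 0.1, 1.0])
y = 1 / (1 + np.exp(-a * x))
print(y)  # ~[3.72e-44, 4.54e-05, 0.5, 0.99995, 1.0]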
Linear Relationship between Time and Height of Water

[Figure: a line through the origin O in the x-v plane passing through points P and Q; the slope is v = 5/3.]
b = (2, 3)
a = (3, 1)

‣ The result of a vector inner product is a real number (a magnitude) rather than a vector.
‣ This real number is referred to as a scalar, and so the inner product is often called the scalar product.
‣ The operator symbol of the inner product is not ×, but ∙, and it is read as a "dot."

[Figure: vectors a = (3, 1) and b = (2, 3) with angle θ between them.]
Normal Vector: a vector p perpendicular (at 90°) to a given line or plane.

[Figure: three vector pairs;
a(-4, -2) and b(6, 3): cosine similarity is -1 (opposite directions);
a(6, 3) and b(1, -2): cosine similarity is 0 (perpendicular);
a(4, 2) and b(6, 3): cosine similarity is 1 (same direction).]
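A numpy check of the three cases, using cosine similarity = a ∙ b / (‖a‖ ‖b‖):

import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(np.array([-4, -2]), np.array([6, 3])))  # -1.0 (opposite)
print(cosine_similarity(np.array([6, 3]), np.array([1, -2])))   #  0.0 (perpendicular)
print(cosine_similarity(np.array([4, 2]), np.array([6, 3])))    #  1.0 (same direction)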

You might also like