3ML.05.NeuralNetworks DeepLearning

This document provides an overview of neural networks (NN), detailing their structure, learning processes, and various architectures including single-layer and multi-layer networks. It discusses supervised and unsupervised learning, the backpropagation algorithm, and the significance of deep learning in modern applications such as image classification and natural language processing. Additionally, it highlights the limitations of traditional machine learning approaches compared to deep learning methods that automatically learn hierarchical representations.

Uploaded by

Alessandro Manera

Overview of Machine Learning and Pattern Recognition

Neural Networks
A NN is a machine learning approach inspired by the way in which the brain
performs a particular learning task
Network of simple computational units (neurons) connected
by links (synapses)

Knowledge about the learning task is given in the form of training examples (training vectors).

Inter-neuron connection strengths (weights) are used to store the acquired information (the
training examples).

During the learning process the weights are modified in order to
model the particular learning task correctly on the training examples.
NN - Learning

Supervised Learning
Pattern recognition, regression, etc.
Labeled training examples (input + desired output)
Neural Network models: perceptron, feed-forward, etc.

Unsupervised Learning
Clustering
Unlabeled training examples (different realizations of the input alone)
Neural Network models: self-organizing maps (SOMs), Hopfield networks, etc.

NN - architectures
Three different classes of network architectures
– single-layer feed-forward
– multi-layer feed-forward
  (in both feed-forward classes, neurons are organized in acyclic layers: links have only one direction)
– recurrent

A standard architecture consists of

Input units
represent the input as a fixed-length vector of numbers (user-defined)
Hidden units (optional)
thresholded weighted sums of the input
represent intermediate calculations that the network learns
Output units
represent the output as a fixed-length vector of numbers

The architecture of a neural network is linked with the learning algorithm used to train it.

Feed-forward single layer (perceptron)

[Figure: an input layer of source nodes connected directly to an output layer of neurons]

Feed-forward multi layer

[Figure: 3-4-2 network with an input layer, a hidden layer, and an output layer]

The neuron
The neuron is the basic information processing unit of a NN. It consists of:

A set of synapses or connecting links, each link characterized by a numeric
weight: $w_1, w_2, \dots, w_m$
An adder function (linear combiner) which computes the weighted sum of the inputs:
$u = \sum_{j=1}^{m} w_j x_j$

Activation function (squashing function) for limiting the amplitude of the
output of the neuron:
$y = \varphi(u + b)$
Bias is an external parameter of the neuron. It can be modeled as an extra input.
Typical activation functions:
Step function
Sign function
Sigmoid function

Emulates the typical response of a biological neuron: the neuron “fires”
only if the input signal is above the threshold potential
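
The neuron model above (weighted sum, bias, activation) can be sketched in a few lines of Python. This is only an illustration: the weights, inputs, and bias are made-up values, and the threshold of the step and sign functions is assumed to sit at zero.

```python
import math

def step(v):
    """Step function: 1 when v reaches the (assumed) threshold 0, else 0."""
    return 1.0 if v >= 0 else 0.0

def sign(v):
    """Sign function: +1 for non-negative v, -1 otherwise."""
    return 1.0 if v >= 0 else -1.0

def sigmoid(v):
    """Sigmoid function: smooth squashing of v into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(weights, inputs, bias, activation):
    """y = phi(u + b), where u is the weighted sum of the inputs."""
    u = sum(w * x for w, x in zip(weights, inputs))
    return activation(u + bias)

# Illustrative values only (not taken from the slides).
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]
bias = 0.25

for phi in (step, sign, sigmoid):
    print(phi.__name__, neuron_output(weights, inputs, bias, phi))
```

Here u + b = -0.25, so the step and sign neurons stay "off" while the sigmoid neuron returns a value slightly below 0.5.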
Designing a Neural Network
Various types of neurons

Various network architectures

Various learning algorithms

Various applications
Feed-forward single layer NN
• The simplest architecture is the perceptron network
– No hidden units: synonym for a single-layer, feed-forward network
– It can be used for multi-class problems: each neuron learns to recognize one
particular class (i.e., output 1 if the input is in that class, and 0 otherwise)
– It is a linear classifier: it can only classify linearly separable instances
Perceptron: learning rule
The teacher (or target) specifies the desired output for a given input.
The network calculates the output based on its current weights (random
initialization). Then it iteratively changes its weights in proportion to the error
between the desired output and the calculated output.
For a neuron j:
$w_{j,i}(t+1) = w_{j,i}(t) + \Delta w_{j,i}$
$\Delta w_{j,i} = \eta \, (t_j - y_j) \, \varphi'(net_j) \, x_i$ (delta rule), where:
– $\eta$ is the learning rate;
– $t_j - y_j$ is the error term (here $t_j$ is the target of neuron j, while $t$ in $w_{j,i}(t)$ is the iteration index);
– $net_j$ is the weighted sum of the inputs;
– $\varphi$ is the activation function;
– $x_i$ is the i-th input component.
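
A minimal sketch of the delta rule for one neuron, assuming a sigmoid activation (so that the derivative is y(1 - y)) and modeling the bias as an extra input fixed at 1. The function names, the learning rate, the epoch count, and the OR data set are illustrative choices, not part of the slides.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_delta_rule(samples, eta=0.5, epochs=5000, seed=0):
    """Train one sigmoid neuron with the delta rule; returns [bias, w1, ..., wn]."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Random initialization; index 0 is the bias, treated as a weight on x0 = 1.
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for x, t in samples:
            xb = [1.0] + list(x)
            net = sum(wi * xi for wi, xi in zip(w, xb))
            y = sigmoid(net)
            dphi = y * (1.0 - y)  # derivative of the sigmoid at net
            for i, xi in enumerate(xb):
                # delta rule: w_i <- w_i + eta * (t - y) * phi'(net) * x_i
                w[i] += eta * (t - y) * dphi * xi
    return w

def predict(w, x):
    xb = [1.0] + list(x)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, xb)))

# Logical OR is linearly separable, so a single neuron can learn it.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_delta_rule(or_samples)
for x, t in or_samples:
    print(x, t, round(predict(w, x), 3))
```

Rounding each prediction to 0 or 1 recovers the OR targets; a non-separable problem such as XOR would not be learned by this single neuron.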
Gradient descent
The delta rule is derived from a gradient descent optimization
algorithm, attempting to minimize the total error E in the output.

• The aim is to adjust the weights in order to minimize E.

The fastest procedure is to compute, for each neuron, the gradient of E with respect to its weights:
$\nabla E = [\partial E/\partial w_1, \partial E/\partial w_2, \dots, \partial E/\partial w_n]$
• Change the i-th weight by
• $\Delta w_i = -\eta \, \partial E/\partial w_i$
[Figure: error as a function of the weights in multidimensional space]
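
As a worked step, the delta rule can be recovered from this gradient descent update. The slide's own definition of E did not survive extraction, so the squared-error form below is an assumption (it is the usual choice, and it reproduces the delta rule exactly):

```latex
\begin{align}
E &= \frac{1}{2}\sum_j (t_j - y_j)^2,
  \qquad y_j = \varphi(net_j),
  \qquad net_j = \sum_i w_{j,i}\, x_i \\
\frac{\partial E}{\partial w_{j,i}}
  &= -(t_j - y_j)\,\varphi'(net_j)\,x_i \\
\Delta w_{j,i}
  &= -\eta\,\frac{\partial E}{\partial w_{j,i}}
   = \eta\,(t_j - y_j)\,\varphi'(net_j)\,x_i
\end{align}
```

The last line is the delta rule: the minus sign of the descent step cancels the minus sign produced by differentiating the error term.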
Perceptron convergence theorem

• Provided that the classes are linearly separable, the final
weights of a perceptron network can be obtained in a
finite number of steps, independently of the initialization.

• However, a single-layer perceptron can only learn linearly
separable classes
Feed-forward multi layer NN
[Figure: 3-4-2 network with an input layer, a hidden layer, and an output layer]
• Layer 0 is input nodes, Layers 1 to N-1 are hidden nodes, Layer N is output nodes
• All nodes at any layer k are connected to all nodes at layer k+1
• There are no cycles
• Can compute functions with convex regions
• Each hidden node acts like a perceptron, learning a separating line
• Combination of multiple linear classifiers
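
The forward computation of such a layered network can be sketched as follows. This assumes sigmoid units, fully connected layers, and random weights; the 3-4-2 sizes match the figure, while the helper names and input values are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_layer(n_in, n_out, rng):
    """One fully connected layer: n_out neurons, each with n_in weights and a bias."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def layer_forward(weights, biases, inputs):
    """Every neuron of the layer sees every output of the previous layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(layers, x):
    """Signals flow from layer 0 to layer N only: there are no cycles."""
    activation = x
    for weights, biases in layers:
        activation = layer_forward(weights, biases, activation)
    return activation

rng = random.Random(0)
layers = [make_layer(3, 4, rng), make_layer(4, 2, rng)]  # 3-4-2 network
output = forward(layers, [0.2, -0.4, 0.9])
print(output)  # two values, one per output node
```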

Backpropagation

• Can’t use Delta Learning Rule
– Target values for hidden units are not available! How to compute error term?
• Use backpropagation algorithm:
1. Compute the error term for the output units, as with perceptron
2. From output layer, repeat
- propagating the error term back to the previous layer and
- updating the weights between the two layers
until the earliest hidden layer is reached
• Backpropagation is a derivation of gradient descent (delta rule is a special case).


Backpropagation algorithm
• Initialize weights (typically random!)
• Keep doing epochs:
– For each example e in training set do
• forward pass to compute
– O = neural-net-output
– miss = (T-O) at each output unit
• backward pass to calculate deltas to weights
• update all weights
– end
• Until error stops improving or max number of epochs reached
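
The loop above can be sketched in plain Python. This is one possible reading of the algorithm, not the lecturer's implementation: it assumes sigmoid units, a squared-error measure, per-example (online) updates, and a fixed number of epochs instead of an early-stopping test. The 2-4-1 network, the XOR data, and all function names are illustrative.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_net(sizes, rng):
    """Initialize weights randomly; one (weights, biases) pair per layer."""
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [rng.uniform(-1, 1) for _ in range(n_out)])
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(net, x):
    """Forward pass; returns the activations of every layer, input included."""
    activations = [list(x)]
    for weights, biases in net:
        prev = activations[-1]
        activations.append([
            sigmoid(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(weights, biases)
        ])
    return activations

def train_example(net, x, target, eta):
    acts = forward(net, x)
    out = acts[-1]
    # Output error term: miss = (T - O), scaled by the sigmoid derivative.
    deltas = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
    for layer in range(len(net) - 1, -1, -1):
        weights, biases = net[layer]
        prev = acts[layer]
        if layer > 0:
            # Backward pass: propagate the error term through the current
            # (not yet updated) weights to the previous layer.
            prev_deltas = [
                a * (1.0 - a) * sum(weights[j][i] * deltas[j]
                                    for j in range(len(deltas)))
                for i, a in enumerate(prev)
            ]
        # Update the weights between the two layers.
        for j, d in enumerate(deltas):
            for i, a in enumerate(prev):
                weights[j][i] += eta * d * a
            biases[j] += eta * d
        if layer > 0:
            deltas = prev_deltas

def total_error(net, data):
    return sum(
        0.5 * (t - o) ** 2
        for x, target in data
        for t, o in zip(target, forward(net, x)[-1])
    )

xor = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
net = make_net([2, 4, 1], random.Random(0))
error_before = total_error(net, xor)
for epoch in range(5000):          # "keep doing epochs"
    for x, target in xor:          # "for each example e in training set"
        train_example(net, x, target, eta=0.5)
error_after = total_error(net, xor)
print(error_before, error_after)
```

XOR is used because it is not linearly separable, so the hidden layer is genuinely needed; the printed error after training should be lower than the error before training.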
Backpropagation algorithm
[Figures not reproduced: signal propagation through the input layer, the hidden layer, and the output layer]
The figures illustrate how the signal propagates through the network. Symbols w(xm)n represent
weights of connections between network input xm and neuron n in the input layer. Symbols yn
represent the output signal of neuron n.
Propagation of signals through the hidden layer: symbols wmn represent weights of
connections between the output of neuron m and the input of neuron n in the next layer.
Propagation of signals through the output layer.

In the next algorithm step the output signal of the network y is compared
with the desired output value (the target), which is found in the training data
set. The difference is called the error signal δ of the output layer neuron.

The idea is to propagate the error signal δ back to all neurons whose
output signals were inputs to the neuron under discussion.

The weight coefficients wmn used to propagate errors back are equal to those used
when computing the output value. Only the direction of data flow is changed (signals
are propagated from outputs to inputs, one after the other). This technique is used for
all network layers.

When the error signal for each neuron has been computed, the weight coefficients of each
neuron's input nodes may be modified. In the update formulas (not reproduced), df(e)/de
represents the derivative of the activation function of the neuron whose weights are modified.
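
The formulas themselves were lost with the figures. A common way to write the three steps described above, using the slide's symbols (δ, wmn, yn, df(e)/de) plus the target t and learning rate η from the perceptron slides, is sketched below. The exact form on the original slides is an assumption.

```latex
\begin{align}
\delta &= t - y
  && \text{error signal of an output neuron} \\
\delta_m &= \sum_{n} w_{mn}\,\delta_n
  && \text{error signal of neuron } m \text{, collected from the neurons } n \text{ it feeds} \\
w'_{mn} &= w_{mn} + \eta\,\delta_n\,\frac{df_n(e)}{de}\,y_m
  && \text{update of the weight from neuron } m \text{ to neuron } n
\end{align}
```

For a weight attached directly to a network input, y_m is replaced by the input x_m. With a single layer, the last line reduces to the delta rule.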
The decision boundary perspective
[Figure sequence, not reproduced]
Initial random weights
Present a training instance / adjust the weights (repeated for each training instance)
Eventually …
What learning rate?

1. Tuning set, or

2. Cross validation, or

3. Small for slow, conservative learning

How many layers?
Structure       Types of decision regions
Single-layer    Half plane bounded by hyperplane
Two-layer       Convex open or closed regions
Three-layer     Arbitrary (complexity limited by number of nodes)

[Figure columns, not reproduced: the Exclusive-OR problem, classes with meshed regions, and the most general region shapes, each drawn for classes A and B]

Source: Neural Networks – An Introduction, Dr. Andrew Hunter

How big a training set?
• Mostly, empirical rules…
• Determine your target error rate, e
(success rate is 1 - e)
• Typical training set size is approx. n/e, where n is the number of
weights in the net
• Example:
– e = 0.1, n = 80 weights
– training set size 800; a net trained until it classifies 95% of the training set
correctly should produce 90% correct classification on the testing
set (typical)
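
The rule of thumb is simple arithmetic; a small helper (the function name is hypothetical) makes the worked example reproducible:

```python
def training_set_size(n_weights, target_error_rate):
    """Empirical rule from the slide: roughly n / e training examples."""
    return round(n_weights / target_error_rate)

print(training_set_size(80, 0.1))  # the slide's example: 80 weights, e = 0.1
```

This is a heuristic starting point, not a guarantee: the size actually needed depends on the data and the task.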

Limitations of Neural Networks

Randomly initialized, densely connected networks lead to:
• High training cost
– Each neuron in the neural network can be considered as a regression algorithm.
Training the entire neural network means training all the interconnected regressions.
• Difficult to train as the number of hidden layers increases
– In backpropagation, the gradient becomes progressively more diluted. That is, below
the top layers, the correction signal is minimal.
• Stuck in local optima
– The random initialization does not guarantee starting from the
proximity of global optima.
Solution:
– Deep Learning/Learning multiple levels of representation
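
The dilution of the gradient can be illustrated numerically. The sketch below assumes a deliberately simplified network: a chain of single sigmoid units with a fixed weight of 1.0, so the gradient of the output with respect to the input is just a product of sigmoid derivatives, each at most 0.25. The function name and parameter values are illustrative.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gradient_through_chain(depth, weight=1.0, x=0.5):
    """d(output)/d(input) for `depth` sigmoid units connected in a chain."""
    a = x
    grad = 1.0
    for _ in range(depth):
        a = sigmoid(weight * a)
        grad *= weight * a * (1.0 - a)  # chain rule: w * sigmoid'(net)
    return grad

for depth in (1, 2, 5, 10):
    print(depth, gradient_through_chain(depth))
```

The printed values shrink roughly geometrically with depth, which is why the correction signal reaching the earliest layers of a deep sigmoid network is so small.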

Deep learning
First conceived for image classification tasks, as a
replication of the mammalian visual cortex

Cascade of many hidden layers of locally connected units, for feature
extraction and transformation

The hidden layers of a deep network learn multiple levels of
representations that correspond to different levels of abstraction;
the levels form a hierarchy of concepts.
Deep learning: the current big thing?

In “Nature”, 27 January 2016:

• “DeepMind’s program AlphaGo
beat Fan Hui, the European Go
champion, five times out of five...”
• “AlphaGo was not preprogrammed
to play Go: it used a general-purpose
algorithm to interpret the game’s
patterns.”
• “…AlphaGo program applied deep
learning to neural networks
(convolutional NN).”

Deep Learning Today

• Computer Vision and Image Processing
– Feature engineering is the bread-and-butter of a large portion of the CV community,
which creates some resistance to feature learning
– But the record holders on ImageNet and Semantic Segmentation are convolutional
nets
• Speech recognition
– A few long-standing performance records were broken with deep learning methods
– Microsoft and Google have both deployed DL-based speech recognition systems in
their products
• Advancement in Natural Language Processing
– Fine-grained sentiment analysis, syntactic parsing
– Language model, machine translation, question answering
• … potentially any field, including bioinformatics
Motivations for Deep Architectures
• Insufficient depth can hurt
– With shallow architecture (SVM, NB, KNN, etc.), the required number of nodes in the
graph (i.e. computations, and also number of parameters, when we try to learn the
function) may grow very large.
– Many functions that can be represented efficiently with a deep architecture cannot be
represented efficiently with a shallow one.
• The brain has a deep architecture
– The visual cortex shows a sequence of areas each of which contains a
representation of the input, and signals flow from one to the next.
– Note that representations in the brain are in between dense distributed and purely local:
they are sparse: about 1% of neurons are active simultaneously in the brain.
• Cognitive processes seem deep
– Humans organize their ideas and concepts hierarchically.
– Humans first learn simpler concepts and then compose them to represent more abstract
ones.
– Engineers break up solutions into multiple levels of abstraction and processing
Deep Learning vs Traditional ML
Traditional pattern recognition models use hand-crafted features
and a relatively simple trainable classifier.

[Figure: hand-crafted feature extractor → “simple” trainable classifier → output]

This approach has the following limitations:

It is very tedious and costly to develop hand-crafted features.
The hand-crafted features are usually highly dependent on one application, and
cannot be transferred easily to other applications.

Deep Learning vs Traditional ML

Deep learning (a.k.a. representation learning) seeks to learn rich
hierarchical representations (i.e. features) automatically through
multiple stages of a feature learning process.

[Figure: low-level features → mid-level features → high-level features → trainable classifier → output]
Learning Hierarchical Representations
[Figure: low-level features → mid-level features → high-level features → trainable classifier → output, with increasing level of abstraction]

• Hierarchy of representations with increasing level of abstraction.
Each stage is a kind of trainable nonlinear feature transform
• Image recognition
– Pixel → edge → texton → motif → part → object

• Text
– Character → word → word group → clause → sentence → story

• Virtually any kind of application requiring PR (e.g. DNA sequence classification)

Convolutional Neural Network (CNN)

• Convolutional Neural Networks are inspired by the mammalian visual cortex.
– The visual cortex contains a complex arrangement of cells, which are sensitive to
small sub-regions of the visual field, called a receptive field. These cells act as local
filters over the input space and are well-suited to exploit the strong spatially local
correlation present in natural images.
– Two basic cell types:
• Simple cells respond maximally to specific edge-like patterns within their
receptive field.
• Complex cells have larger receptive fields and are locally invariant to the exact
position of the pattern.
Convolutional Neural Networks (CNN)
• Inspired by the neurophysiological experiments conducted by [Hubel & Wiesel 1962],
CNNs are a special type of neural network whose hidden units are only connected to
a local receptive field. The number of parameters needed by CNNs is much smaller.
• Input can have very high dimension. Using a fully-connected neural network would
need a large number of parameters.

Example: 200x200 image

a) fully connected: 40,000 hidden units
=> 1.6 billion parameters
b) CNN: 5x5 kernel, 100 feature maps
=> 2,500 parameters
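
The two parameter counts in the example can be checked with a few lines of Python. Bias terms are ignored here, which is an assumption that matches the round figures on the slide; the function names are hypothetical.

```python
def fully_connected_params(height, width, hidden_units):
    """Every hidden unit has one weight per input pixel."""
    return height * width * hidden_units

def conv_params(kernel_height, kernel_width, feature_maps):
    """Each feature map shares a single kernel across the whole image."""
    return kernel_height * kernel_width * feature_maps

print(fully_connected_params(200, 200, 40_000))  # 1600000000
print(conv_params(5, 5, 100))                    # 2500
```

Note that the convolutional count does not depend on the image size at all, only on the kernel size and the number of feature maps.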
CNN Architecture
• Intuition: Neural network with specialized connectivity structure
– Stacking multiple layers of feature extractors
– Low-level layers extract local features.
– High-level layers learn global patterns.
• A CNN is a list of layers that transform the input data into an output class/prediction.
• There are a few distinct types of layers:
– Convolutional layer
– Non-linear layer
– Pooling layer
Building-blocks for CNN

[Figure labels, top to bottom]
Feature maps of a larger region are combined.
Feature maps are trained with neurons.
Shared weights
Each sub-region yields a feature map, representing its feature.
Images are segmented into sub-regions.
Convolution operation

Input: an image (2-D array)

Convolution kernel/operator (2-D array of learnable parameters): w

Feature map (2-D array of processed data)

Convolution operation in 2-D domains: [formula not reproduced]

Multiple Convolutions

Usually there are multiple feature maps, one for each convolution operator.
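
A minimal sketch of the 2-D operation, assuming no padding and stride 1 ("valid" output). Most deep learning libraries actually compute cross-correlation (the kernel is not flipped) and still call it convolution; since the slide's formula is missing, both variants are shown. The image and kernel values are illustrative.

```python
def correlate2d_valid(image, kernel):
    """Slide the kernel over the image; each output is a dot product."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def flip(kernel):
    """Reverse the kernel along both axes."""
    return [row[::-1] for row in kernel[::-1]]

def convolve2d_valid(image, kernel):
    """Mathematical convolution = cross-correlation with a flipped kernel."""
    return correlate2d_valid(image, flip(kernel))

# A dark-to-bright vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]  # responds where brightness increases left to right

feature_map = correlate2d_valid(image, edge_kernel)
for row in feature_map:
    print(row)  # [0, 1, 0] on every row: the edge is detected in the middle
```

Because the kernel weights are learned, the flip makes no practical difference during training; it only matters when comparing against a hand-written formula.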

CNN Architecture: Convolutional Layer

• The core layer of CNNs
• The convolutional layer consists of a set of filters.
– Each filter covers a spatially small portion of the input data.
• Each filter is convolved across the dimensions of the input data, producing a
multidimensional feature map.
– As we convolve the filter, we are computing the dot product between the
parameters of the filter and the input.
• Intuition: the network will learn filters that activate when they see some
specific type of feature at some spatial position in the input.
• The key architectural characteristics of the convolutional layer are
– local connectivity
– and shared weights.
CNN Convolutional Layer: Local Connectivity
• Neurons in layer m are only connected to 3
adjacent neurons in the m-1 layer.
• Neurons in layer m+1 have a similar
connectivity with the layer below.
• Each neuron is unresponsive to variations
outside of its receptive field with respect to the
input.
– Receptive field: small neuron collections which
process portions of the input data
• The architecture thus ensures that the learnt
feature extractors produce the strongest
response to a spatially local input pattern.
CNN Convolutional Layer: Shared Weights
We show 3 hidden neurons belonging to the same feature map (the
layer right above the input layer).

Weights of the same color are shared—constrained to be identical.

Gradient descent can still be used to learn such shared parameters,
with only a small change to the original algorithm.
Replicating neurons in this way allows for features to be detected
regardless of their position in the input (spatial invariance).
Additionally, weight sharing increases learning efficiency by
greatly reducing the number of free parameters being learnt.
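
Local connectivity and weight sharing can be illustrated in one dimension: every hidden neuron sees 3 adjacent inputs and all neurons reuse the same 3 weights. The weight values and signals below are made up for the illustration.

```python
def feature_map_1d(inputs, shared_weights):
    """Each output neuron sees len(shared_weights) adjacent inputs
    and uses the same weights as every other neuron in the map."""
    k = len(shared_weights)
    return [
        sum(w * inputs[i + j] for j, w in enumerate(shared_weights))
        for i in range(len(inputs) - k + 1)
    ]

shared_weights = [1.0, -2.0, 1.0]    # 3 free parameters for the whole map
signal = [0, 0, 1, 0, 0, 0, 0]       # a feature at position 2
shifted = [0, 0, 0, 0, 1, 0, 0]      # the same feature moved to position 4

print(feature_map_1d(signal, shared_weights))   # [1.0, -2.0, 1.0, 0.0, 0.0]
print(feature_map_1d(shifted, shared_weights))  # [0.0, 0.0, 1.0, -2.0, 1.0]
```

The response pattern is identical and simply moves with the feature, which is what lets the same detector fire wherever the feature appears. Without sharing, the 5 output neurons would need 5 × 3 = 15 separate weights instead of 3.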
CNN Architecture: Non-linear Layer

Intuition: Increase the nonlinearity of the entire architecture without affecting the
receptive fields of the convolution layer.
A layer of neurons that applies a non-linear activation function, such as:

• Tanh: $\tanh x = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
• ReLU: $f(x) = \max(0, x)$
CNN Architecture: Pooling Layer
Intuition: to progressively reduce the spatial size of the representation,
so as to reduce the number of parameters and the computation in the network, and
hence also to control overfitting.
Pooling partitions the input image into a set of non-overlapping
rectangles and, for each such sub-region, outputs the maximum value
of the features in that region.
Pooling
• Common pooling operations:
– Max pooling: reports the maximum output within a
rectangular neighborhood.
– Average pooling: reports the average output of a rectangular
neighborhood (possibly weighted by the distance from
the central pixel).
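
Both operations can be sketched with one helper, assuming non-overlapping square windows and an unweighted average. The feature map values and the function names are illustrative.

```python
def pool2d(feature_map, size, reduce_fn):
    """Partition the map into non-overlapping size x size windows
    and reduce each window to a single value."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            reduce_fn([
                feature_map[r * size + a][c * size + b]
                for a in range(size) for b in range(size)
            ])
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def mean(values):
    return sum(values) / len(values)

feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 8],
]

max_pooled = pool2d(feature_map, 2, max)
avg_pooled = pool2d(feature_map, 2, mean)
print(max_pooled)  # [[6, 2], [7, 9]]
print(avg_pooled)  # [[3.75, 1.25], [2.5, 6.0]]
```

A 4x4 map becomes 2x2 in both cases, so the following layers have a quarter as many values to process.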
Full CNN
SVM: hand-crafted kernel function, then apply a simple classifier
Deep Learning: learnable kernel, then a simple classifier
[Figure: network mapping inputs x1, x2, …, xN to outputs y1, …, yM]

Deep CNN: ImageNet 2012 winner

Multiple feature maps per convolutional layer.

Multiple convolutional layers for extracting features at different levels.

Higher-level layers take the feature maps in lower-level layers as input.
Tools for Deep Learning
Deep Learning
… moving beyond shallow machine learning since 2006!

http://deeplearning.net/software_links
Caffe (Python, Matlab)
Theano (Python)
Torch (Lua)
TensorFlow (Python, C++)
… many others
