What is a neural network?
• Neural network
– Network of biological neurons
– Biological neural networks are made up of real biological neurons that are connected or functionally related in the peripheral nervous system or the central nervous system
• Artificial neurons
– Simple mathematical approximations of biological neurons
#1
What is a neural network?
• Artificial neural networks
– Networks of artificial neurons
– Very crude approximations of small parts of the biological brain
– Implemented as software or hardware
– By “Neural Networks” we usually mean Artificial Neural Networks
#2
Neural network definitions
• Zurada (1992)
– Artificial neural systems, or neural networks, are physical cellular systems which
can acquire, store, and utilize experiential knowledge.
• Haykin (1999)
– A neural network is a massively parallel distributed processor that has a natural
propensity for storing experiential knowledge and making it available for use. It
resembles the brain in two respects:
– Knowledge is acquired by the network through a learning process.
– Interneuron connection strengths known as synaptic weights are used to store
the knowledge.
#3
Biological neural networks
Cortical neurons (nerve cells) growing in culture [figure]
Neurons have a large cell body with several long processes extending from it, usually one thick axon and several thinner dendrites
Dendrites receive information from other neurons
The axon carries nerve impulses away from the neuron; its branching ends make contact with other neurons and with muscles or glands
Human neurons derived from iPS cells [figure]
This complex network forms the nervous system, which relays information through the body
#4
Biological neuron
#5
Interaction of neurons
• Action potentials arriving at the synapses stimulate currents in the neuron's dendrites
• These currents depolarize the membrane at its axon, provoking an action potential
• The action potential propagates down the axon to its synaptic knobs, releasing neurotransmitters and stimulating the post-synaptic neuron
#6
Synapses
• Elementary structural and functional units that mediate the interaction between neurons
• Chemical synapse:
pre-synaptic electrical signal → chemical neurotransmitter → post-synaptic electrical signal
#7
Action potential
• Spikes or action potentials
– Neurons encode their outputs as a series of voltage pulses
– The axon is very long, with high resistance & high capacitance
– Frequency modulation → improved signal/noise ratio
#8
Human nervous system
• Human nervous system can be represented by three stages:
Stimulus → Receptors → Neural net (brain) → Effectors → Response
• Receptors
– collect information from the environment (photons on the retina, tactile info, ...)
• Effectors
– generate interactions with the environment (muscle activation, ...)
• Flow of information
– feedforward & feedback
#9
Human brain
Human activity is regulated by the nervous system:
• Central nervous system
– Brain
– Spinal cord
• Peripheral nervous system
≈ 10^10 neurons in the brain
≈ 10^4 synapses per neuron
≈ 1 ms processing speed of a neuron
→ Slow rate of operation
→ Extremely large number of processing units & interconnections
→ Massive parallelism
#10
Structural organization of brain
Molecules & Ions ................ transmitters
Synapses ............................ fundamental organization level
Neural microcircuits .......... assembly of synapses organized into patterns of connectivity to produce desired functions
Dendritic trees .................... subunits of individual neurons
Neurons ............................... basic processing unit, size: 100 μm
Local circuits ....................... localized regions in the brain, size: 1 mm
Interregional circuits .......... pathways, topographic maps
Central nervous system ..... final level of complexity
#11
MODELS OF A NEURON
#12
NEURAL NETWORKS VIEWED AS DIRECTED GRAPHS
❖ We may simplify the appearance of the model by using the idea of signal-flow graphs
without sacrificing any of the functional details of the model.
❖ Signal-flow graphs, with a well-defined set of rules, were originally developed by Mason
(1953, 1956) for linear networks.
❖ A signal-flow graph is a network of directed links (branches) that are interconnected at
certain points called nodes.
❖ A typical node j has an associated node signal x_j. A typical directed link originates at node j and terminates on node k; it has an associated transfer function, or transmittance, that specifies the manner in which the signal y_k at node k depends on the signal x_j at node j.
The flow of signals in the various parts of the graph is dictated by three basic rules:
#22
FIGURE 9 Illustrating basic rules for the construction of signal-flow graphs.
Rule 1. A signal flows along a link only in the direction defined by the arrow on the link.
Two different types of links may be distinguished:
• Synaptic links, whose behavior is governed by a linear input–output relation. Specifically, the node signal x_j is multiplied by the synaptic weight w_kj to produce the node signal y_k, as illustrated in Fig. 9a.
• Activation links, whose behavior is governed in general by a nonlinear input–output relation. This form of relationship is illustrated in Fig. 9b, where φ(·) is the nonlinear activation function.
#23
Rule 2. A node signal equals the algebraic sum of all signals entering the pertinent node via
the incoming links.
This second rule is illustrated in Fig. 9c for the case of synaptic convergence, or fan-in.
Rule 3. The signal at a node is transmitted to each outgoing link originating from that node,
with the transmission being entirely independent of the transfer functions of the outgoing
links.
This third rule is illustrated in Fig. 9d for the case of synaptic divergence, or fan-out.
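To make the three rules concrete, here is a minimal Python sketch (not from the slides; the node names, weights, and tanh activation are illustrative assumptions): synaptic links multiply a node signal by a weight (Rule 1), incoming signals are summed at a node (Rule 2), and a node's signal feeds every outgoing link unchanged (Rule 3).

```python
import math

# A link (j, k, w) carries w * x_j from node j to node k (Rule 1, synaptic link).

def evaluate(node_signals, links):
    """Propagate signals one step through linear (synaptic) links."""
    out = {}
    for j, k, w in links:                  # Rule 1: flow only along the arrow j -> k
        out[k] = out.get(k, 0.0) + w * node_signals[j]  # Rule 2: fan-in sum at node k
    return out

x = {"x1": 0.5, "x2": -1.0}                # source-node signals (assumed values)
links = [("x1", "v", 0.8), ("x2", "v", 0.3),  # two synaptic links converge on v
         ("x1", "u", -0.4)]                # x1 also fans out to u unchanged (Rule 3)
v = evaluate(x, links)
y = {k: math.tanh(s) for k, s in v.items()}   # activation link: y_k = phi(v_k)
print(v, y)
```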
❖ Indeed, based on the signal-flow graph of Fig. 10 as the model of a neuron, we may now offer the following mathematical definition of a neural network:
A neural network is a directed graph consisting of nodes with interconnecting synaptic and
activation links and is characterized by four properties:
1. Each neuron is represented by a set of linear synaptic links, an externally applied bias, and
a possibly nonlinear activation link. The bias is represented by a synaptic link connected to
an input fixed at +1.
2. The synaptic links of a neuron weight their respective input signals.
#24
3. The weighted sum of the input signals defines the induced local field of the neuron in
question.
4. The activation link squashes the induced local field of the neuron to produce an output.
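A minimal Python sketch of these four properties (illustrative only; the weights, inputs, and the choice of tanh are assumptions, not taken from the slides):

```python
import numpy as np

def neuron(x, w, b, phi=np.tanh):
    """Single neuron per the four properties above.

    x   : input signals (weighted by the synaptic links, property 2)
    w   : synaptic weights (property 1)
    b   : bias, i.e. the weight on an input fixed at +1 (property 1)
    phi : activation function (property 4)
    """
    v = np.dot(w, x) + b * 1.0   # property 3: induced local field
    return phi(v)                # property 4: output y = phi(v)

x = np.array([0.5, -1.0, 2.0])   # example inputs (assumed values)
w = np.array([0.4, 0.1, -0.2])
print(neuron(x, w, b=0.3))       # tanh(0.2 - 0.1 - 0.4 + 0.3) = tanh(0.0) = 0.0
```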
#25
NETWORK ARCHITECTURES
❖ The manner in which the neurons of a neural network are structured is intimately linked
with the learning algorithm used to train the network.
❖ We may therefore speak of learning algorithms (rules) used in the design of neural
networks as being structured.
❖ In general, we may identify three fundamentally different classes of network architectures:
(i) Single-Layer Feedforward Networks:
✓ In a layered neural network, the neurons are organized in the form of layers.
✓ In the simplest form of a layered network, we have an input layer of source nodes that
projects directly onto an output layer of neurons (computation nodes), but not vice versa.
✓ In other words, this network is strictly of a feedforward type.
✓ It is illustrated in Fig. 15 for the case of four nodes in both the input and output layers.
✓ Such a network is called a single-layer network, with the designation “single layer”
referring to the output layer of computation nodes (neurons).
✓ We do not count the input layer of source nodes because no computation is performed
there.
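As a hedged illustration (sizes and weights are arbitrary; nothing here is prescribed by the slides), the single-layer network of Fig. 15 reduces to one weight matrix and one activation applied to the source-node signals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-layer feedforward network: 4 source nodes -> 4 output neurons,
# as in Fig. 15. The input layer performs no computation, so only the
# output layer of computation nodes has weights and biases.
W = rng.standard_normal((4, 4))   # W[k, j]: weight from source node j to neuron k
b = rng.standard_normal(4)        # biases (weights on fixed +1 inputs)

def forward(x):
    return np.tanh(W @ x + b)     # the single layer of computation nodes

print(forward(np.array([1.0, 0.0, -1.0, 0.5])))
```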
#26
FIGURE 15 Feedforward network with a single layer of neurons.
#27
(ii) Introduction to Multilayer Feedforward Networks
❖ The second class of a feedforward neural network distinguishes itself by the presence of
one or more hidden layers, whose computation nodes are correspondingly called hidden
neurons or hidden units; the term “hidden” refers to the fact that this part of the neural
network is not seen directly from either the input or output of the network.
❖ The function of hidden neurons is to intervene between the external input and the network
output in some useful manner.
❖ By adding one or more hidden layers, the network is enabled to extract higher-order
statistics from its input.
❖ In a rather loose sense, the network acquires a global perspective despite its local
connectivity, due to the extra set of synaptic connections and the extra dimension of
neural interactions (Churchland and Sejnowski, 1992).
#28
FIGURE 16 Fully connected feedforward network with one hidden layer and one
output layer.
#29
❖ The source nodes in the input layer of the network supply respective elements of the
activation pattern (input vector), which constitute the input signals applied to the
neurons (computation nodes) in the second layer (i.e., the first hidden layer).
❖ The output signals of the second layer are used as inputs to the third layer, and so on for
the rest of the network.
❖ Typically, the neurons in each layer of the network have as their inputs the output
signals of the preceding layer only.
❖ The set of output signals of the neurons in the output (final) layer of the network
constitutes the overall response of the network to the activation pattern supplied by the
source nodes in the input (first) layer.
❖ The architectural graph in Fig. 16 illustrates the layout of a multilayer feedforward neural network for the case of a single hidden layer.
❖ For the sake of brevity, the network in Fig. 16 is referred to as a 10–4–2 network because it has 10 source nodes, 4 hidden neurons, and 2 output neurons.
#30
iii) Recurrent Networks
❖ A recurrent neural network distinguishes itself from a feedforward neural
network in that it has at least one feedback loop.
❖ For example, a recurrent network may consist of a single layer of neurons with each neuron feeding its output signal back to the inputs of all the other neurons, as illustrated in the architectural graph in Fig. 17.
❖ In the structure depicted in this figure, there are no self-feedback loops in the
network; self-feedback refers to a situation where the output of a neuron is fed
back into its own input.
❖ The recurrent network illustrated in Fig. 17 also has no hidden neurons.
#31
FIGURE 17 Recurrent network with no self-feedback loops and no hidden neurons.
#32
❖ In Fig. 18 we illustrate another class of recurrent networks with hidden neurons.
❖ The feedback connections shown in Fig. 18 originate from the hidden neurons as
well as from the output neurons.
❖ The presence of feedback loops, be it in the recurrent structure of Fig. 17 or in that of Fig. 18, has a profound impact on the learning capability of the network and on its performance.
❖ Moreover, the feedback loops involve the use of particular branches composed of unit-time delay elements (denoted by z^-1), which result in a nonlinear dynamic behavior, assuming that the neural network contains nonlinear units.
#33
FIGURE 18 Recurrent network with hidden neurons.
#34
What can NNs do?
• In principle
– NNs can compute any computable function (everything a normal digital computer can do)
• In practice
– NNs are especially useful for classification and function-approximation problems that are tolerant of some imprecision
– Almost any finite-dimensional vector function on a compact set can be approximated to arbitrary precision by a feedforward NN (see the sketch below)
– They need a lot of training data
– It is difficult to encode hard rules (such as those used in an expert system)
• Problems difficult for NNs
– Predicting random or pseudo-random numbers
– Factoring large integers
– Determining whether a large integer is prime or composite
– Decrypting anything encrypted by a good algorithm
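As a toy illustration of the approximation claim (a random-feature least-squares fit chosen for brevity, not a claim about how such networks are usually trained; all sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Fit sin(x) on the compact set [0, pi] with a one-hidden-layer network:
# random hidden weights, output weights solved by linear least squares.
x = np.linspace(0.0, np.pi, 200)
H = np.tanh(np.outer(x, rng.standard_normal(50)) + rng.standard_normal(50))
w, *_ = np.linalg.lstsq(H, np.sin(x), rcond=None)  # output-layer weights
print(np.max(np.abs(H @ w - np.sin(x))))           # small maximum error
```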
#35
Benefits of neural networks
1. Ability to learn from examples
• Train neural network on training data
• Neural network will generalize on new data
• Noise tolerant
• Many learning paradigms
• Supervised (with a teacher)
• Unsupervised (no teacher, self-organized)
• Reinforcement learning
2. Adaptivity
• Neural networks have a natural capability to adapt to the changing environment
• Train neural network, then retrain
• Continuous adaptation in a nonstationary environment
#36
Benefits of neural networks
3. Nonlinearity
• Artificial neurons can be linear or nonlinear
• Network of nonlinear neurons has nonlinearity distributed throughout the network
• Important for modeling inherently nonlinear signals
4. Fault tolerance
• Capable of robust computation
• Graceful degradation rather than catastrophic failure
#37
Benefits of neural networks
5. Massively parallel distributed structure
• Well suited for VLSI implementation
• Very fast hardware operation
6. Neurobiological analogy
• NN design is motivated by analogy with brain
• NN are a research tool for neurobiologists
• Neurobiology inspires further development of artificial NN
7. Uniformity of analysis & design
• Neurons represent the building blocks of all neural networks
• Similar NN architecture for various tasks: pattern recognition, regression,
time series forecasting, control applications, ...
#38
Brief history of neural networks
Pre-1940 von Helmholtz, Mach, Pavlov, etc.
– General theories of learning, vision, conditioning
– No specific mathematical models of neuron operation
1943 McCulloch and Pitts
– Proposed the neuron model
1949 Hebb
– Published his book The Organization of Behavior
– Introduced the Hebbian learning rule
1958–1960 Rosenblatt, Widrow and Hoff
– Perceptron, ADALINE
– First practical networks and learning rules
1969 Minsky and Papert
– Published the book Perceptrons; demonstrated the limitations of single-layer perceptrons and argued they extend to multilayered systems
– Neural network field went into hibernation
#39
Brief history of neural networks
1974 Werbos
– Developed back-propagation learning method in his PhD thesis
– Several years passed before this approach was popularized
1982 Hopfield
– Published a series of papers on Hopfield networks
1982 Kohonen
– Developed the Self-Organising Maps
1980s Rumelhart and McClelland
– Backpropagation rediscovered, re-emergence of neural networks field
– Books, conferences, courses, funding in USA, Europe, Japan
1990s Radial Basis Function Networks were developed
2000s The power of Ensembles of Neural Networks and
Support Vector Machines becomes apparent
#40
Brief history of neural networks
2000s Hardware-based designs
– CMOS computational devices: biophysical simulation & neuromorphic computing
– Nanodevices for very large-scale principal components analyses and convolution
– GPU-based backpropagation for multi-layered feedforward neural networks
2010s Convolutional neural networks and deep learning
– Rapid evolution of deep learning methods
– Convolutional layers and max-pooling layers become the state of the art
– Deep learning methods achieve human-competitive performance on certain
practical applications
2017 AlphaZero
– Superhuman performance in three perfect-information games (go, chess, shogi) using the rules of the game and a single algorithm for all games
2020 MuZero
– Superhuman performance in go, chess, shogi, and Atari games without knowing the rules of the games (mastering unknown dynamics)
#41
Applications of neural networks
• Aerospace
– High-performance aircraft autopilots, flight path simulations, aircraft control
systems, autopilot enhancements, aircraft component simulations, aircraft
component fault detectors
• Automotive
– Automobile automatic guidance systems, warranty activity analyzers
• Banking
– Check and other document readers, credit application evaluators
• Defense
– Weapon steering, target tracking, object discrimination, facial recognition, new
kinds of sensors, sonar, radar and image signal processing including data
compression, feature extraction and noise suppression, signal/image
identification
• Electronics
– Code sequence prediction, integrated circuit chip layout, process control, chip
failure analysis, machine vision, voice synthesis, nonlinear modeling
#43
Applications of neural networks
• Financial
– Real estate appraisal, loan advisor, corporate bond rating, credit line use
analysis, portfolio trading program, corporate financial analysis, currency price
prediction
• Manufacturing
– Manufacturing process control, product design and analysis, process and
machine diagnosis, real-time particle identification, visual quality inspection
systems, welding quality analysis, paper quality prediction, computer chip quality
analysis, analysis of grinding operations, chemical product design analysis,
machine maintenance analysis, project planning and management, dynamic
modeling of chemical process systems
• Medical
– Breast cancer cell analysis, EEG and ECG analysis, prosthesis design,
optimization of transplant times, hospital expense reduction, hospital quality
improvement, emergency room test advisement
#44
Applications of neural networks
• Robotics
– Trajectory control, forklift robot, manipulator controllers, vision systems
• Speech
– Speech recognition, speech compression, vowel classification, text-to-speech
synthesis
• Securities
– Market analysis, automatic bond rating, stock trading advisory systems
• Telecommunications
– Image and data compression, automated information services, real-time
translation of spoken language, customer payment processing systems
• Transportation
– Truck brake diagnosis systems, vehicle scheduling, routing systems
#45