


AI FOR CIVIL ENGINEERS
Module 3
ARTIFICIAL NEURAL NETWORKS

Dr. M. A. Jayaram
Professor

RASTA
Centre for Road Technology
VOLVO Construction Equipment Campus
Bengaluru
CONTENTS
Definitions of ANN
Biological Neuron
Perceptron
Capabilities of Perceptron
OR, AND Gates, Linearly Separable Points
XOR Gate, Linearly Non-Separable Points
Activation Functions
Perceptron Learning Rule
Multilayer ANNs
Training Multilayer ANNs
Backpropagation Algorithm
Types of ANNs, their functions, and their specific uses
Appropriate Problems for ANN Learning
Applications of ANNs in Civil Engineering
• Infrastructure Construction & Management
• Geotechnical Engineering
• Hydraulics and Water Resources Engineering
• Transportation Engineering and Highway Technology
• Structural Engineering
Module 3

Introduction to Artificial Neural Networks (ANNs)

1. Introduction
ANNs are computational models inspired by the human brain's structure and function. They
consist of layers of interconnected nodes (neurons) that process information and learn from
data. ANNs can identify patterns, make decisions, and predict outcomes by adjusting the
connections (weights) between neurons based on experience. Table 3.1 provides the definitions
of ANNs as perceived by some doyens of AI.
Table 3.1. Definitions of ANNs
Definition: "An ANN is a machine learning model designed to simulate the way a human brain processes and learns from information, where each neuron works together to form a cohesive prediction or decision."
Stated by: Andrew Ng, computer scientist and a leading authority in ML, who founded the Google Brain project in 2011. Ng's work has had a broad impact on AI research, industry applications, and education.

Definition: "ANNs are a simplified mathematical model of the brain's structure that is capable of learning patterns and making decisions based on data input."
Stated by: Geoffrey Hinton, computer scientist widely regarded as one of the founding fathers of deep learning, who popularized the backpropagation algorithm in 1986 and introduced deep belief networks in 2006.

Definition: "ANNs are a set of algorithms, modelled loosely after the human brain, that are designed to recognize patterns and are a cornerstone of deep learning."
Stated by: Jürgen Schmidhuber, computer scientist known for his groundbreaking work in ANNs, particularly in the development of deep learning architectures. His most notable contribution is the Long Short-Term Memory (LSTM) network (1997), a type of recurrent neural network (RNN).

Definition: "ANNs are learning systems made of a large number of simple, interconnected processors (neurons) that can learn complex patterns through experience."
Stated by: Yann LeCun, French computer scientist known for his pioneering work in machine learning, computer vision, and artificial intelligence, and the key figure behind the development of Convolutional Neural Networks (CNNs) in the late 1980s and early 1990s, which have become essential for image and video recognition tasks.

It is not hard to filter out the implicit meaning of each of the definitions listed in the table:
• Andrew Ng's definition focuses on the analogy between ANNs and the human brain,
emphasizing that each neuron in an ANN contributes to a collective output, just like
neurons in the brain work together to form thoughts or decisions. It implicitly conveys
the collaborative nature of neural networks, where the overall intelligence of the system
arises from the interaction of its parts.
• Hinton emphasizes that ANNs mimic the brain's structure in a simplified way. The focus
is on the network's ability to learn patterns from data and make decisions, highlighting
the core idea that ANNs are powerful tools for pattern recognition and decision-making,
much in the way the human brain processes information.
• Schmidhuber underscores that ANNs are foundational to deep learning, a subset of
machine learning. His definition implicitly points to the importance of pattern
recognition as the central function of ANNs, and how this capability is crucial for the
success of deep learning in various complex tasks, such as image recognition and
language processing.
• LeCun highlights the simplicity of individual neurons but stresses the power that
emerges when many of them work together. His definition implicitly suggests that the
strength of ANNs lies in their interconnectedness and their ability to learn and adapt
over time through experience, reflecting the iterative nature of learning in both
machines and humans.
Many other definitions offered by AI experts are omitted here. But from a survey of the
literature on ANNs, the understanding of ANNs as one of the branches of AI may be
summarized as follows.
• Brain-inspired: All definitions draw a parallel between ANNs and the human brain,
implicitly suggesting that ANNs are powerful because they simulate the brain's ability
to learn and process information.
• Learning and Adaptation: The ability of ANNs to learn from data and improve over
time is a common thread, indicating that these networks are not static models but
dynamic systems that evolve with experience.
• Pattern Recognition: The focus on pattern recognition reflects the core strength of
ANNs in identifying and making sense of complex data patterns, which is central to
many AI applications.
• Collaboration Among Neurons: The idea that individual neurons (simple processors)
work together to achieve a collective goal implies that the intelligence of ANNs is an
emergent property arising from the interactions among its parts.
In a nutshell, ANNs, as a set of algorithms, provide cognitive capability to a machine or an
AI agent.

2. Historical Evolution of ANNs


A quick recording of the chronological evolution of ANNs is worth considering:
• McCulloch and Pitts Model (1943): Warren McCulloch and Walter Pitts created the
first artificial neuron, a binary threshold unit, marking the birth of ANNs.
• Perceptron (1958): Frank Rosenblatt developed the Perceptron, an algorithm for
supervised learning of binary classifiers. This was a single-layer neural network.
• Minsky and Papert's Critique (1969): Marvin Minsky and Seymour Papert published
"Perceptrons," showing the limitations of single-layer networks, leading to a temporary
decline in neural network research.
• Backpropagation and the Revival (1986): The backpropagation algorithm was
rediscovered and popularized by David Rumelhart, Geoffrey Hinton, and Ronald
Williams, which allowed multi-layer networks to learn effectively, reigniting interest in
ANNs.
• Deep Learning Foundations (1990s): The concept of deep learning began to emerge,
with ANNs evolving into more complex structures, including recurrent neural networks
(RNNs) and convolutional neural networks (CNNs).
• Deep Belief Networks (2006): Geoffrey Hinton and Ruslan Salakhutdinov introduced
Deep Belief Networks, which enabled the training of deep neural networks and
contributed significantly to the modern success of deep learning.
• Explosion of Deep Learning (2010s): Advances in computational power, data
availability, and algorithm optimization led to significant breakthroughs in applications
like image and speech recognition, natural language processing, and more.
• Present - Widespread Applications: Currently, ANNs are integral to various fields,
including autonomous vehicles, healthcare, finance, and entertainment, driving
innovations in AI.

3. Biological Neuron
Being clear on the motivation for present-day ANNs, it is worth taking a brief look at the
biological neurons that fill the human brain, in terms of their structure, functions, and
mechanisms. A typical biological neuron is shown in figure 3.1. A section of several such
neurons forming a network within the human brain is shown in figure 3.2. Figure 3.3 shows
the neuronal networks pervading the entire brain. The brain has about 10^11 neurons.

Fig. 3.1. A typical biological neuron
Fig. 3.2. A section of connected neurons in the brain
Fig. 3.3. Neuronal network in the brain
The biological neuron is the fundamental building block of the nervous system, responsible for
transmitting and processing information throughout the body. Neurons are specialized cells that
communicate via electrical and chemical signals, playing a critical role in everything from
basic reflexes to complex thoughts and behaviours.
3.1 Structure of a Biological Neuron
A typical biological neuron consists of three main parts: the cell body (soma), dendrites, and
the axon.
Cell Body (Soma)
• Function: The cell body is the core of the neuron, containing the nucleus, which
houses the cell's genetic material. It also contains other organelles, such as
mitochondria, which provide energy, and the endoplasmic reticulum and Golgi
apparatus, which are involved in synthesizing and processing proteins.
• Role: The soma integrates incoming signals from the dendrites and generates the
necessary action potential if the signal is strong enough. It acts as the control center
of the neuron.
Dendrites
• Structure: Dendrites are tree-like extensions that branch out from the cell body. A
neuron can have many dendrites, which increase the surface area available for
receiving signals.
• Function: Dendrites receive chemical signals (neurotransmitters) from other neurons
at specialized junctions called synapses. These signals are converted into electrical
impulses that travel toward the cell body.
• Role: Dendrites play a crucial role in collecting and processing information from other
neurons, determining whether the neuron will fire an action potential.
Axon
• Structure: The axon is a long, thin extension that projects from the cell body and can
branch significantly at its end. It is often insulated by a fatty layer known as the myelin
sheath, which enhances the speed of signal transmission.
• Function: The axon transmits electrical signals away from the cell body towards other
neurons, muscles, or glands. These signals travel along the axon to reach the axon
terminals.
• Role: The axon is responsible for delivering the neuron's output to other cells. When
an action potential arrives at the axon terminal, it prompts the release of
neurotransmitters into the synaptic cleft, which then bind to receptors on the adjacent
neuron.
Axon Terminals (Synaptic Boutons)
• Structure: These are the small swellings at the end of the axon branches.
• Function: Axon terminals contain synaptic vesicles filled with neurotransmitters.
When an electrical impulse reaches the terminal, these vesicles release their contents
into the synapse.
• Role: Axon terminals are responsible for transmitting the signal to the next neuron in
the chain, effectively passing on the information.
3.2 Functions of a Neuron
Neurons are involved in a variety of critical functions, including:
• Signal Reception: Dendrites receive signals from other neurons. These signals can be
excitatory (increasing the likelihood of firing an action potential) or inhibitory
(decreasing the likelihood of firing).
• Signal Integration: The cell body integrates the incoming signals, and if the
cumulative signal exceeds a certain threshold, the neuron generates an action
potential.
• Signal Transmission: The action potential moves along the axon, aided by the myelin
sheath, which accelerates the conduction process.
• Signal Output: When the electrical signal reaches the axon terminals, it initiates the
release of neurotransmitters. These neurotransmitters cross the synapse and attach to
receptors on the next neuron, allowing the transmission of information to continue.
Neurons receive inputs primarily at the synapses located on the dendrites and the cell body.
These inputs come from the axon terminals of other neurons. The synapse is a crucial site where
the chemical communication between neurons occurs, involving the release and reception of
neurotransmitters.
Neurons process and transmit information. They form complex networks where they send and
receive signals that control every function in the body, from voluntary movements and sensory
perceptions to involuntary actions like breathing and heartbeat. Additionally, neurons are
responsible for higher functions such as thinking, memory, and emotions, allowing us to
interact with and interpret the world around us.
In essence, biological neurons are highly specialized cells designed to carry out the complex
task of communication within the nervous system. Through their unique structure and function,
neurons enable organisms to sense, respond, and adapt to their environment, driving all aspects
of behavior and cognition.
Looking at the enormity of biological neural nets, it is found that if the axons of all the neurons
in the brain were stretched out end to end, they would cover a distance of about 500,000
kilometres. This incredible length is due to the vast number of neurons and the extensive
branching of their axons.

4. Biological Neuron vs. Perceptron

The biological neuron is the smallest element in the biological network: it takes electrochemical
signals, adds them up, and sends the output to adjoining neurons through synaptic junctions. By
the same token, the smallest unit of an ANN is the perceptron. It serves as a fundamental building
block in the field of machine learning and artificial intelligence. A typical perceptron is shown
in figure 3.4.

Fig. 3.4. A perceptron

The Perceptron is a type of binary classifier designed to categorize input data into one of two
classes. It accomplishes this by establishing a linear decision boundary based on the input
features. If the weighted sum of the inputs surpasses a predefined threshold, the Perceptron
produces one class (commonly represented as 1); otherwise, it generates the other class
(commonly represented as 0). The main components of a Perceptron include:
• Inputs: The Perceptron receives multiple inputs, each representing a feature or attribute
of the data.
• Weights: Each input is assigned a weight that reflects its significance in the
classification process.
• Summation Unit: The Perceptron calculates the weighted sum of the inputs.
• Activation Function: A step or threshold function is applied to the weighted sum to
determine the output. If the sum exceeds the threshold, one class is output; if it falls
below, the other class is output.
It is important to know that the perceptron is to be trained for doing a classification task. This
is done using a learning algorithm that adjusts the weights based on the classification error. The
goal is to minimize this error over multiple iterations, allowing the Perceptron to find the best
decision boundary.
More specifically, the output of a perceptron, given the inputs x1, x2, x3, …, xn, is:
O = 1 if w1·x1 + w2·x2 + … + wn·xn > 0
O = 0 otherwise
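
A minimal Python (NumPy) sketch of this output rule; the input and weight values are illustrative assumptions, not values from the text:

```python
import numpy as np

def perceptron_output(x, w):
    """Step-activation perceptron: fires (1) when the weighted sum is positive."""
    return 1 if np.dot(w, x) > 0 else 0

x = np.array([1.0, 0.0])   # assumed inputs
w = np.array([0.5, 0.5])   # assumed weights
print(perceptron_output(x, w))   # prints 1, since 0.5 > 0
```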
4.1 Historical Perspective
The Perceptron was one of the earliest neural network models and played a crucial role in the
development of artificial intelligence and machine learning. It laid the groundwork for more
advanced neural networks and deep learning models.
The Perceptron can only solve problems that are linearly separable (i.e., problems where a
straight line can separate the two classes). This limitation led to a period of reduced interest in
neural networks, especially after Marvin Minsky and Seymour Papert published their book
Perceptrons in 1969, highlighting these limitations.
Interest in neural networks was revived in the 1980s with the development of multi-layer
Perceptrons and backpropagation, which allowed neural networks to solve more complex, non-
linear problems.

The Perceptron was invented by Frank Rosenblatt in 1957. Rosenblatt was a psychologist
and computer scientist at the Cornell Aeronautical Laboratory in New York. He presented
the Perceptron as a model to understand how the human brain processes visual information,
viewing it as an initial step toward developing machines capable of "learning" in a way
similar to human learning.

The Perceptron is now recognized as a key milestone in the history of machine learning, despite
its simplicity and limitations. It serves as a foundational concept for understanding more
complex neural network architectures.

5. The Capability of a Perceptron


The capacity of a perceptron, in terms of what it can represent, is crucial to the understanding
of large ANNs with many interconnected perceptrons. Simply put, a perceptron can draw a
partition line separating the data into two groups. We can take two examples from Boolean
logic. Precisely, we will just see whether a perceptron can represent the OR and AND operations.
We have truth tables for OR and AND in Table 3.2. The perceptrons in action for implementing
this logic are shown in figures 3.5a and 3.5c.
Table 3.2: Truth tables for OR and AND

OR gate:                      AND gate:
x1  x2  Output                x1  x2  Output
0   0   0                     0   0   0
1   0   1                     1   0   0
0   1   1                     0   1   0
1   1   1                     1   1   1
Fig. 3.5a. The perceptron for OR: inputs x1 and x2, weights 0.5 and 0.5, bias −0.3.
Sample calculation for input (1, 0):
O = 0.5×1 + 0.5×0 − 1×0.3 = 0.2 > 0, so O = 1
Equation of the separating plane: 0.5 x1 + 0.5 x2 − 0.3 = 0
Fig. 3.5b. Developed separating plane: the line separates (0,0) from (1,0), (0,1), and (1,1).

Fig. 3.5c. The perceptron for AND: inputs x1 and x2, weights 0.5 and 0.5, bias −0.8.
Sample calculation for input (1, 0):
O = 0.5×1 + 0.5×0 − 1×0.8 = −0.3 < 0, so O = 0
Equation of the separating plane: 0.5 x1 + 0.5 x2 − 0.8 = 0
Fig. 3.5d. Developed separating plane: the line separates (1,1) from the other three points.
The examples above illustrate that a perceptron can establish a plane to separate two types of
outputs or classes, but this is only possible if the data is linearly separable. Let's consider
another input-output pattern, as shown in Table 3.3. This is a classic example of data that is not
linearly separable. In this case, the output is 1 only when x1≠x2. The corresponding data plot
is presented in Figure 3.6
Table 3.3 XOR data

x1   x2   Output (O)
1    1    0
0    0    0
1    0    1
0    1    1

Fig. 3.6. Linearly non-separable data: (0,0) and (1,1) belong to one class, (1,0) and (0,1)
to the other; no single straight line separates them.


Now the problem becomes a little more complex: we need three perceptrons, two in the hidden
layer and one in the output layer. The weights and biases are shown in Figure 3.7. The two
hidden-layer neurons are capable of drawing two planes separating the non-separable points.
These two planes are shown in figure 3.8.
5.1 Perceptron Learning
The learning of a perceptron is complete when appropriate weights on the connections are
found. That is to say, the weights (the relative importance of the inputs) are not magical: they
must be "learnt" by the network. This happens over several cycles, or epochs. A learnt
network stores its knowledge in the form of stabilized weights (for a given problem).

The weights and biases (Figure 3.7): x1→H1 = 1, x2→H1 = 1, bias of H1 = −1.5 (H1 acts as an
AND unit); x1→H2 = 1, x2→H2 = 1, bias of H2 = −0.5 (H2 acts as an OR unit); H1→O = −2,
H2→O = 1, bias of O = −0.5.

Sample calculation for inputs (1, 1):
H1: (1×1 + 1×1) − 1.5 = 0.5 > 0, so output of H1 = 1
H2: (1×1 + 1×1) − 0.5 = 1.5 > 0, so output of H2 = 1
Net input to the output neuron: (−2)×1 + 1×1 − 0.5 = −1.5 < 0, therefore output = 0

Fig. 3.7. Two hidden perceptrons and one output perceptron for solving the XOR problem

Fig. 3.8. Two planes separating the inseparable inputs: the H1 and H2 boundaries isolate the
region where exactly one of the inputs is 1.
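
The following short Python sketch verifies that the weights reconstructed in Fig. 3.7 reproduce XOR on all four input pairs:

```python
def step(z):
    """Threshold activation: 1 if the net input is positive, else 0."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    """Two hidden perceptrons (AND, OR) feeding one output perceptron, as in Fig. 3.7."""
    h1 = step(1 * x1 + 1 * x2 - 1.5)      # AND unit
    h2 = step(1 * x1 + 1 * x2 - 0.5)      # OR unit
    return step(-2 * h1 + 1 * h2 - 0.5)   # fires when OR is true but AND is not

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))   # prints 0, 1, 1, 0
```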


The linear boundaries in fig. 3.8 are the result of the step activation functions. If a problem
involves separating data points with nonlinear boundaries, a neural network with the
architecture shown in figure 3.9 may be necessitated. In that case, the activation functions in
the processing elements will be non-linear. Therefore, it is pertinent to know the most widely
used activation functions.

Fig. 3.9 a. Multilayer Network Fig 3.9 b. The non-linear separation

5.2 Activation Functions


There are several kinds of activation functions. The role of an activation function is crucial in
introducing non-linearity into the model. Without activation functions, a neural network would
only be capable of performing linear transformations, regardless of the number of layers. This
limitation would restrict the network's ability to learn and model complex patterns within the
data.
5.2.1 Sigmoid Function
Sigmoid functions are often used in binary classification problems, particularly in the output
layer of a binary classifier, where the output needs to be a probability (between 0 and 1). It
maps any input value to a value between 0 and 1. Figure 3.10 carries the graphical version of
the function.

The function is σ(z) = 1 / (1 + e^(−z)), where z is the linear sum of the weighted inputs.

Fig. 3.10. Sigmoid function


5.2.2 Tanh (Hyperbolic) Function
The tanh function, tanh(z) = (e^z − e^(−z)) / (e^z + e^(−z)), maps input values to a range
between −1 and 1. Tanh is used in the hidden layers of neural networks, especially when the
input data is centred around zero, as it helps in making the data zero-centred. Figure 3.11
shows this function.
Fig. 3.11. Tanh function

5.2.3 ReLU Function

The ReLU (Rectified Linear Unit) activation function, f(z) = max(0, z), maps the input to
either zero or the input itself, allowing only positive values to pass through. It is widely
applied in the hidden layers of deep neural networks because it helps the model tackle complex
problems and mitigates the vanishing gradient problem.

Fig. 3.12. ReLU function

5.2.4 Softmax Function

The softmax function, softmax(z_i) = e^(z_i) / Σ_j e^(z_j), converts a vector of raw scores
(logits) into probabilities by normalizing them, ensuring that the total sum of all probabilities
equals 1. It is commonly used in the output layer of neural networks for multi-class
classification, where the model needs to assign probabilities to different classes.

Fig. 3.13a. Softmax function. Fig. 3.13b. Conversion of outputs to probability values
5.2.5 Swish Function
A function that combines the linearity of the ReLU and the smoothness of the sigmoid. Swish
is used in deep learning models as it tends to outperform ReLU in certain tasks, particularly in
deep networks where gradient flow is critical. Figure 3.14 shows the configuration and the
formula of this function.
f(x) = x / (1 + e^(−x)) = x · σ(x)

Fig. 3.14. Swish function
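
For quick reference, here is a minimal NumPy sketch of the five activation functions above; the sample inputs are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # maps any input to (0, 1)

def tanh(z):
    return np.tanh(z)                      # maps any input to (-1, 1)

def relu(z):
    return np.maximum(0.0, z)              # passes only positive values

def softmax(z):
    e = np.exp(z - np.max(z))              # shifted for numerical stability
    return e / e.sum()                     # probabilities summing to 1

def swish(z):
    return z * sigmoid(z)                  # smooth blend of linear and sigmoid

z = np.array([-2.0, 0.0, 3.0])
for f in (sigmoid, tanh, relu, softmax, swish):
    print(f.__name__, f(z))
```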

6. The Perceptron Training Rule


As mentioned in the preceding sections, the simplest unit of an ANN is a perceptron (also
called a processing element). Figure 3.15 gives a graphical presentation of a perceptron in
learning mode.
• It has one or more inputs and a single output.
• Each input has a weight, which is a number that represents the input's importance.
• The perceptron sums up the weighted inputs and then applies an activation function
(usually a step function) to decide the output, typically 0 or 1.
The goal is to adjust the weights so that the perceptron correctly classifies the inputs (e.g.,
deciding whether a given water sample is palatable or not).

Fig. 3.15. Perceptron training rule

The perceptron training rule is presented in Table 3.4


Table 3.4 . Perceptron Learning Rule
Step 1: Initialize Weights:
• Start by assigning small random numbers to the weights. This is like guessing before knowing
anything.
Step 2: Input and Prediction:
• Feed an input example to the perceptron and calculate the output using the current weights. This
output is the perceptron's prediction.
Step 3: Check the Prediction:
• Compare the perceptron's prediction to the actual, correct output (known as the label or target).
Step 4: Update the Weights:
• If the prediction is correct, do nothing.
• If the prediction is wrong, adjust the weights. The adjustment is done to reduce the error:
o Formula: new weight = old weight + learning rate × (target − prediction) × input
The learning rate is a small positive number that controls how big the changes are. It's like taking small
steps towards the correct solution.
Step 5: Repeat:
• Go through each input example in the dataset and repeat the process, updating weights whenever
the perceptron makes a mistake.
• This process is repeated many times (multiple epochs) until the perceptron makes very few or
no mistakes.

Table 3.5 shows an example of developing a perceptron model for classification of given water
sample as palatable or not palatable by taking two inputs pH and NaCl contents.
Table 3.5: Implementation of Perceptron rule for classification of water as palatable or not palatable
We want to train a perceptron to classify water based on:
• pH (whether it's within the acceptable range for drinking water).
• Sodium chloride (NaCl) content (whether it's within the safe limit for drinking water).
For simplicity, let's assume the following:
• Input 1: pH value (Say x1).
• Input 2: NaCl content (Say x2).
• Target Output (O):
o Palatable (safe to drink): Target = 1
o Not palatable (not safe to drink): Target = 0
Training Data:

pH (x1)   NaCl (x2)   Target Output (y)
7         0.5         1
4         1.5         0
9         0.3         0
7         1.0         1

Initial Conditions:
• Initial weights: w1 = 0.2 and w2 = −0.3
• Initial bias: b = 0.1
• Learning rate η: 0.1

Fig. 3.16. The perceptron model for classification of a water sample
Training rule steps
Step 1: Calculate the Output for the First Input (pH = 7, NaCl = 0.5)
• Input: (x1=7, x2=0.5)
• Target Output: y=1
• Weights: w1=0.2, w2=−0.3
• Bias: b=0.1
Net Input Calculation:
Net Input=(w1×x1)+(w2×x2)+b
Net Input=(0.2×7)+(−0.3×0.5)+0.1
Net Input=1.4−0.15+0.1=1.35

Output (using Step Activation Function): Output=1 if Net Input ≥ 0, else 0


Comparison with Target:
• Predicted Output: 1
• Target Output: 1
• Result: Correct prediction, so no weight update is needed.

Step 2: Calculate the Output for the Second Input (pH = 4, NaCl = 1.5)
• Input: (x1=4, x2=1.5)
• Target Output: y=0
• Weights: w1=0.2, w2=−0.3
• Bias: b=0.1
• Net Input Calculation:
Net Input = (w1×x1) + (w2×x2) + b
Net Input = (0.2×4) + (−0.3×1.5) + 0.1
Net Input = 0.8 − 0.45 + 0.1 = 0.45
Since 0.45 ≥ 0, the predicted output is 1.
Comparison with Target:
• Predicted Output: 1
• Target Output: 0
• Result: Incorrect prediction, so we update the weights.

Step 3: Update the Weights and Bias


Weight Update Rule: wi(new) = wi(old) + η × (Target − Output) × xi
For w1 (pH): w1(new) = 0.2 + 0.1 × (0 − 1) × 4 = −0.2
For w2 (NaCl): w2(new) = −0.3 + 0.1 × (0 − 1) × 1.5 = −0.45
For b: b(new) = 0.1 + 0.1 × (0 − 1) = 0

Updated Weights and Bias:


• w1=−0.2
• w2=−0.45
• b=0
Step 4: Calculate the Output for the Third Input (pH = 9, NaCl = 0.3)

• Input: (x1=9,x2=0.3)
• Target Output: y=0
• Weights: w1=−0.2 , w2=−0.45
• Bias: b=0
Net Input Calculation:
Net Input=(w1×x1)+(w2×x2)+b
=(−0.2×9)+(−0.45×0.3)+0
Net Input=−1.8−0.135=−1.935
Comparison with Target:
• Predicted Output: 0
• Target Output: 0
• Result: Correct prediction, so no weight update is needed.
Step 5: Calculate the Output for the Fourth Input (pH = 7, NaCl = 1.0)
• Input: (x1=7, x2=1.0)
• Target Output: y=1
• Weights: w1=−0.2, w2=−0.45
• Bias: b=0
Net Input Calculation: Net Input=(w1×x1)+ (w2×x2)+b
Net Input=(−0.2×7)+(−0.45×1.0)+0
Net Input=−1.4−0.45=−1.85
Comparison with Target:
• Predicted Output: 0
• Target Output: 1
• Result: Incorrect prediction, so we update the weights.

Step 6: Update the Weights and Bias


Weight Update Rule: wi(new) = wi(old) + η × (Target − Output) × xi
For w1 (pH): w1(new) = −0.2 + 0.1 × (1 − 0) × 7 = 0.5
For w2 (NaCl): w2(new) = −0.45 + 0.1 × (1 − 0) × 1.0 = −0.35
For b: b(new) = 0 + 0.1 × (1 − 0) = 0.1
Updated Weights and Bias:
w1 = 0.5, w2 = −0.35, b = 0.1

Fig. 3.18. The trained perceptron
Final Model
The perceptron is now trained with updated weights:
• w1=0.5
• w2=−0.35
• b=0.1

These weights can now be used to classify new water samples based on their pH and NaCl content.
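
The training rule of Table 3.4 can be written compactly in Python; this sketch reproduces the single pass over the four water samples of Table 3.5:

```python
import numpy as np

# Training data from Table 3.5: (pH, NaCl) -> palatable (1) / not palatable (0)
X = np.array([[7, 0.5], [4, 1.5], [9, 0.3], [7, 1.0]])
y = np.array([1, 0, 0, 1])

w = np.array([0.2, -0.3])   # initial weights
b, eta = 0.1, 0.1           # initial bias and learning rate

for xi, target in zip(X, y):                      # one epoch over the dataset
    pred = 1 if np.dot(w, xi) + b >= 0 else 0     # step activation
    w = w + eta * (target - pred) * xi            # no change when prediction is correct
    b = b + eta * (target - pred)

print(w, b)   # approx. [0.5, -0.35] and 0.1, matching the worked example
```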
7. Multilayer ANNs
A Multilayer ANN is a more sophisticated version of a neural network, consisting of multiple
layers of neurons. These layers are generally divided into three categories:
• Input Layer: This is the first layer that receives the input data. Each neuron in this
layer corresponds to a specific feature from the input data.
• Hidden Layer(s): These are the layers situated between the input and output layers.
Neurons in hidden layers apply weights and biases to the input data and use non-linear
activation functions to perform computations and extract features. The presence of
multiple hidden layers makes the network "deep," which is why the term "deep
learning" is used for such models.
• Output Layer: This layer generates the final output, which can be a classification label
or a continuous value, depending on whether the task is classification or regression.
The strength of a multilayer ANN lies in its capacity to learn complex patterns and relationships
within data through successive layers of transformations. Each neuron's connections to every
neuron in the subsequent layer enable the network to capture detailed features of the input data.
This structure allows the network to adapt to varying inputs and optimize results without
needing to redesign the output criteria. An example of such a network is illustrated in Figure
3.19.

Fig. 3.19. Multilayer ANN
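
A minimal sketch of the layered forward computation described above; the layer sizes, random weights, and tanh activation are illustrative assumptions:

```python
import numpy as np

def forward(x, weights, biases, act=np.tanh):
    """Propagate an input vector through successive fully connected layers."""
    a = x
    for W, b in zip(weights, biases):
        a = act(W @ a + b)   # weighted sum plus bias, then non-linearity
    return a

rng = np.random.default_rng(0)
sizes = [3, 5, 4, 1]   # input layer, two hidden layers, output layer
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]
print(forward(rng.normal(size=3), Ws, bs))
```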


7.1 Training Multilayer ANNs
Training multilayer ANNs involves employing various algorithms to adjust the network's
weights and biases with the goal of minimizing prediction error. These algorithms vary in their
methods for handling the learning rate, computing gradients, and determining convergence,
each offering different trade-offs in terms of speed, stability, and computational efficiency. The
choice of algorithm is typically influenced by the specific problem, dataset size, and network
architecture. Below is a brief overview of some popular training algorithms:
7.1.1 Gradient Descent
• Batch Gradient Descent (BGD): Updates weights after processing the entire training
dataset.
• Stochastic Gradient Descent (SGD): Updates weights after processing each individual
training example.
• Mini-Batch Gradient Descent: A compromise between BGD and SGD, where weights
are updated after processing a small batch of training examples.
7.1.2 Momentum
Enhances gradient descent by adding a fraction of the previous weight update to the current
one, helping to accelerate convergence, especially in cases of high curvature or noisy gradients.
7.1.3 Nesterov Accelerated Gradient (NAG)
A variant of momentum that anticipates the future position of the weights based on the current
momentum, leading to more accurate updates.
7.1.4 Adagrad (Adaptive Gradient Algorithm)
Adjusts the learning rate individually for each parameter by taking into account the cumulative
sum of squared gradients for that parameter, which enables adaptive learning rates.
7.1.5 RMSprop (Root Mean Square Propagation)
An improvement on Adagrad, RMSprop maintains a moving average of the squared gradients
and divides the gradient by the square root of this average, ensuring that the learning rate adapts
based on the recent gradient history.
7.1.6. Adam (Adaptive Moment Estimation)
Merges the benefits of both momentum and RMSprop by keeping an exponentially decaying
average of previous gradients (similar to momentum) and an exponentially decaying average
of past squared gradients (like RMSprop).
7.1.7 AdaDelta
An improvement on Adagrad that aims to address its overly aggressive, continuously
decreasing learning rate. It does this by applying a moving window of recent gradient updates
rather than accumulating all previous gradients.
7.1.8 Adamax
A variant of Adam based on the infinity norm, providing better performance in certain cases.
7.1.9 Nadam (Nesterov-accelerated Adaptive Moment Estimation)
A variant of Adam that incorporates Nesterov momentum, leading to potentially faster
convergence.
7.1.10 LBFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno)
An optimization algorithm that approximates the Newton-Raphson method, often used when
second-order information is necessary or for smaller datasets.
7.1.11 Conjugate Gradient
A second-order optimization method that finds the optimal solution by considering the
curvature of the loss surface, often used in smaller networks or in cases where the cost of
computing second-order derivatives is not prohibitive.
7.1.12 Quasi-Newton Methods
Algorithms like BFGS (Broyden-Fletcher-Goldfarb-Shanno) that approximate the inverse
Hessian matrix to update weights, providing faster convergence than simple gradient descent.
7.1.13 Genetic Algorithms
An evolutionary algorithm that applies selection, crossover, and mutation to optimize network
weights by evolving them across multiple generations.
7.1.14 Simulated Annealing
A probabilistic technique that explores the weight space by allowing for occasional increases
in error, enabling escape from local minima in the error landscape.
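
To make a few of these update rules concrete, here is a minimal NumPy sketch of one step of plain gradient descent, momentum, and Adam on a toy quadratic loss; the hyperparameters are illustrative:

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss J(w) = ||w||^2 / 2."""
    return w

w = np.array([1.0, -2.0]); eta = 0.1

# Plain gradient descent: w <- w - eta * g
w_gd = w - eta * grad(w)

# Momentum: a velocity term accumulates past gradients
v, beta = np.zeros_like(w), 0.9
v = beta * v + grad(w)
w_mom = w - eta * v

# Adam: bias-corrected first and second moments of the gradient
m, s = np.zeros_like(w), np.zeros_like(w)
b1, b2, eps, t = 0.9, 0.999, 1e-8, 1
g = grad(w)
m = b1 * m + (1 - b1) * g
s = b2 * s + (1 - b2) * g ** 2
m_hat, s_hat = m / (1 - b1 ** t), s / (1 - b2 ** t)
w_adam = w - eta * m_hat / (np.sqrt(s_hat) + eps)

print(w_gd, w_mom, w_adam, sep="\n")
```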

8. Backpropagation Algorithm
The Backpropagation algorithm is a key technique for training multilayer artificial neural
networks (ANNs). Its objective is to reduce the error between the network’s predictions and
the actual target values. Multilayer ANNs, with their various hidden layers, are proficient at
learning complex data patterns, making them well-suited for tasks such as image recognition
and natural language processing. Backpropagation plays a critical role in training these
networks, enabling them to learn from data by iteratively refining weights and biases to
minimize prediction errors. This process involves adjusting the weights and biases through the
following steps:
Step 1: Forward Pass
• The network is fed with input data.
• Each layer will get the data in forward direction, with neurons applying weights, biases,
and activation functions.
• The network generates an output.
Step 2: Error Calculation
• The error is calculated by comparing the network's output with the actual target values.
• A loss function (such as Mean Squared Error for regression or Cross-Entropy Loss for
classification) quantifies the error.
Step 3: Backward Pass (Backpropagation)
• The error is propagated backward from the output layer toward the input layer.
• The algorithm calculates the gradient of the loss function with respect to each weight
and bias in the network using the chain rule of calculus. This step helps determine the
contribution of each weight and bias to the overall error.
• These gradients will guide the algorithm to modify the weights and biases to reduce the
error.
Step 4. Weight Update
• The weights and biases are updated using the gradients calculated in the backward pass.
This update is often performed using an optimization algorithm like Gradient Descent.
• The process involves adjusting the weights in the direction that reduces the error,
typically scaled by a learning rate (a small constant that controls the step size).
Step 5.Iteration
• The forward and backward passes are repeated for multiple iterations (epochs) over the
entire dataset until the network's weights converge to values that minimize the error.
Table 3.6 shows the procedure of training a ML network in algorithmic form.
Table 3.6 . Backpropagation algorithm
Step 1: Initialize Parameters:
o Randomly initialize the weights W[l] and biases b[l] for each layer l=1,2,…,L.
Step 2: Repeat until convergence:
(a) For each training example i from 1 to m:
Forward Pass:
o Set the input layer activations: a[0] = x(i).
o For each layer l = 1, 2, …, L:
▪ Compute the weighted sum: z[l] = W[l] a[l−1] + b[l].
▪ Apply the activation function: a[l] = f(z[l]).
o Obtain the output: a[L] = ŷ(i).
Compute Error:
o Compute the loss/error: E(i) = J(ŷ(i), y(i)).
Step 3: Backward Pass (Backpropagation):
o Output Layer:
▪ Compute the output error (delta): δ[L] = ∂J(ŷ(i), y(i)) / ∂z[L] = ŷ(i) − y(i) (for
Mean Squared Error).
o Hidden Layers:
▪ For each layer l = L−1, L−2, …, 1:
▪ Compute the error (delta) propagated to the current layer:
δ[l] = ((W[l+1])ᵀ δ[l+1]) ⊙ f′(z[l]),
where ⊙ denotes element-wise multiplication and f′(z[l]) is the derivative of the activation function.
Step 4: Update Weights and Biases:
o For each layer l = 1, 2, …, L:
▪ Update the weights: W[l] = W[l] − η · δ[l] · (a[l−1])ᵀ
▪ Update the biases: b[l] = b[l] − η · δ[l]
End Repeat
Output Final Weights and Biases
After the training converges, output the final weights W[l] and biases b[l] for each layer l.

8.1 Illustrative Example


To explain the training of a 3-layer neural network using the Backpropagation (BP) algorithm
in a civil engineering context, let's consider a simple example where the network takes two
inputs—water-cement ratio (w/c ratio) and the quantity of cement—and outputs the strength of
concrete. This example is provided to get a clear grasp of how a multilayer network performs
in learning. To avoid the voluminous calculations involved, only one epoch is explained. After
one epoch, the weights and biases of the network will be adjusted based on the error in the
prediction. The Backpropagation algorithm allows the network to "learn" from the errors,
gradually improving its predictions. Repeating this process over many epochs with the entire
dataset will lead to a trained model that can accurately predict concrete strength based on the
given inputs. The network architecture is shown in figure 3.20.

Fig. 3.20. ANN topology for strength prediction: inputs x1 (w/c ratio) and x2 (quantity of
cement) connect to two hidden neurons through weights w11, w12, w21, and w22; the hidden
neurons connect to the output neuron O (strength) through weights w1O and w2O, with a bias
on each hidden and output neuron.

Network Architecture:
• Input Layer: 2 neurons (representing w/c ratio and quantity of cement)
• Hidden Layer: 2 neurons (arbitrarily chosen for simplicity)
• Output Layer: 1 neuron (representing the strength of concrete)
Initialization:
• Randomly initialize weights and biases for simplicity (e.g., small random values).
Example Input and Output:
• Input: w/c ratio = 0.5, Quantity of Cement = 400 kg
• Target Output: Strength of Concrete = 30 MPa
Step-by-Step Training (Just one iteration) Using Backpropagation:
The single iteration proceeds through four stages, following the procedure of Table 3.6:
initialization, forward pass, backward pass, and update of weights and biases. A programmatic
sketch of these stages is given below.
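
Since the numeric details of the original hand calculation are not reproduced here, the following minimal NumPy sketch runs one backpropagation iteration for the 2-2-1 network of Fig. 3.20; the initial weights, the input and target scaling, and the sigmoid hidden layer are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed, illustrative values (not taken from the original figure)
x = np.array([0.5, 0.4])       # inputs: w/c ratio 0.5, cement 400 kg scaled to 0.4
t = 0.30                       # target: strength 30 MPa scaled to 0.30
W1 = np.array([[0.1, 0.2],
               [0.3, 0.1]])    # hidden-layer weights
b1 = np.array([0.1, 0.2])      # hidden-layer biases
W2 = np.array([0.2, 0.3])      # output-layer weights
b2, eta = 0.1, 0.5             # output bias and learning rate

# Forward pass: sigmoid hidden layer, linear output
h = sigmoid(W1 @ x + b1)
y = W2 @ h + b2

# Backward pass for squared error J = (y - t)^2 / 2
delta_out = y - t                            # error at the linear output
delta_hid = W2 * delta_out * h * (1 - h)     # chain rule through the sigmoid

# Update weights and biases (one gradient-descent step)
W2 = W2 - eta * delta_out * h
b2 = b2 - eta * delta_out
W1 = W1 - eta * np.outer(delta_hid, x)
b1 = b1 - eta * delta_hid

print("prediction before update:", y)
```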


9. Types of ANN
Several types of neural networks exist, and each has unique applications in civil engineering.
Understanding the strengths and appropriate uses of each type allows for more effective
problem-solving and innovation in the field. They are briefly explained in the following
paragraphs. For an elaborate treatment of each type, readers are advised to go through the
listed references.
9.1 Conventional ANNs
Artificial Neural Networks (ANNs) are modeled after the human brain and consist of
interconnected nodes, or neurons, arranged in layers. These layers include the input layer,
hidden layers, and the output layer. Each connection between neurons is assigned a weight, and
neurons are activated by applying an activation function to the weighted sum of inputs. ANNs
are well-suited for tasks such as:
• Classification: Such as image recognition, measurement anomaly detection, and voice
recognition.
• Prediction: They are also effective for regression tasks, predicting continuous values
like house prices or demand forecasting.
Following are the uses of such networks in civil engineering
• Structural Health Monitoring: ANNs can predict the condition of structures based on
sensor data.
• Traffic Flow Prediction: Used to predict traffic congestion and optimize traffic signal
timings.
• Material Property Estimation: ANNs can estimate the properties of construction
materials based on their composition.
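
As a small illustration of the prediction use above, here is a hedged sketch using scikit-learn's MLPRegressor on made-up mix-design data; the feature names and values are hypothetical:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: [w/c ratio, cement content kg/m^3] -> 28-day strength (MPa)
X = np.array([[0.40, 420], [0.45, 400], [0.50, 380], [0.55, 350], [0.60, 320]])
y = np.array([48.0, 43.5, 38.0, 33.0, 28.5])

# Scale the features, then fit a small one-hidden-layer network
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                                   random_state=0))
model.fit(X, y)
print(model.predict([[0.48, 390]]))   # predicted strength for a new mix
```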
9.2 Convolutional Neural Networks (CNNs)
CNNs are a specialized form of ANNs tailored for processing grid-like data structures, such as
images. CNNs include convolutional layers that use filters to extract features from input data,
pooling layers that reduce the dimensionality and computational load, and fully connected
layers that handle tasks like classification or regression. A standard CNN architecture is
illustrated in Figure 3.21.
They are suitable for ML tasks like:
• Image Classification: CNNs are the go-to architecture for tasks involving image
recognition, such as identifying defects in infrastructure.
• Object Detection: They are also used for detecting objects within images, useful for
tasks like identifying vehicles on roads.
Applications in Civil Engineering:
• Crack Detection in Pavements: CNNs can automatically detect cracks in road surfaces
from images.
• Satellite Image Analysis: Used for land use and urban planning by analyzing satellite
images.

Fig. 3.21. A typical CNN
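
To give a feel for the convolution operation at the heart of a CNN, here is a minimal NumPy sketch of a single filter sliding over a toy single-channel image; all values are made up:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one filter."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is the filter applied to one image patch
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1.0, -1.0]])                    # crude horizontal-edge filter
print(conv2d(image, kernel))
```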


9.3. Recurrent Neural Networks (RNNs)
RNNs are designed for processing sequential data. They have a loop mechanism in their
architecture that allows information to persist across time steps, making them suitable for time-
series analysis. Sequence learning problems are those in which we do not have a fixed-size
input and the inputs are no longer independent: the output at any timestamp depends on the
current and previous inputs, and there is an input at each timestamp. Examples include the
autocomplete feature, which predicts the next letter based on the previous output, and
sentiment analysis. A typical RNN is shown in figure 3.22.
• Suitable ML Tasks:
• Time Series Prediction: RNNs are effective for tasks that involve predicting future
values based on past sequences, such as temperature forecasting.
• Sequence Classification: Useful for classifying sequences of data, such as classifying
phases in construction projects.
• Applications in Civil Engineering:
• Weather Prediction: RNNs can predict weather patterns that affect construction
schedules.
• Traffic Flow Forecasting: RNNs are used to predict traffic flow based on historical
traffic data.
Fig. 3.22. Recurrent networks
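
A minimal sketch of the recurrence described above, where a hidden state carries information across time steps; the sizes and random values are illustrative:

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One recurrent step: the new state depends on the current input and the previous state."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

rng = np.random.default_rng(1)
Wx = rng.normal(size=(4, 3))   # input-to-hidden weights
Wh = rng.normal(size=(4, 4))   # hidden-to-hidden (recurrent) weights
b = np.zeros(4)

h = np.zeros(4)                       # initial hidden state
for x_t in rng.normal(size=(5, 3)):   # a sequence of five 3-dimensional inputs
    h = rnn_step(x_t, h, Wx, Wh, b)
print(h)                              # final state summarizes the sequence
```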
9.4. Long Short-Term Memory Networks (LSTMs)
LSTMs are a variant of RNNs specifically designed to address the vanishing gradient problem
commonly found in traditional RNNs. LSTMs utilize memory cells to retain information over
extended time intervals and employ gates to control the flow and retention of information.
Figure 3.23 shows a typical LSTM model.
• Suitable ML Tasks:
• Long-Term Time Series Prediction: LSTMs excel at tasks requiring long-term
dependencies in the data, such as predicting the long-term behaviour of a structural
element.
• Sequential Data Analysis: Suitable for analysing sequences where earlier data points
are critical for making decisions.
• Applications in Civil Engineering:
• Seismic Data Analysis: LSTMs can be used to analyse seismic waves and predict
earthquake patterns.
• Structural Health Monitoring: Used to model the long-term health of structures based
on historical sensor data.

Fig. 3.23. A typical LSTM model


9.5 Generative Adversarial Networks (GANs)
GANs are a powerful class of neural networks utilized for unsupervised learning. GANs consist
of two neural networks: a generator and a discriminator. These networks are trained
simultaneously through adversarial training, where the generator tries to deceive the
discriminator by producing artificial data that resembles real data.
• The generator's goal is to generate realistic samples from random noise in an attempt to
fool the discriminator, which is responsible for distinguishing between genuine and
generated data.
• This competitive dynamic between the two networks encourages the production of
high-quality, realistic samples as both networks improve over time.
• GANs have become highly versatile tools in artificial intelligence, with widespread
applications in image synthesis, style transfer, and text-to-image generation.
• They have also significantly advanced the field of generative modeling.
A typical GAN model is shown in figure 3.24.
• Suitable ML Tasks:
• Data Generation: GANs are primarily used for generating synthetic data, such as
creating realistic images or simulations.
• Image-to-Image Translation: To convert images from one domain to another, such as
converting satellite images into high-resolution maps.
• Applications in Civil Engineering:
• Urban Planning Simulations: GANs can generate realistic urban layouts for planning
purposes.
• Structural Simulation: Used to create synthetic data for simulating different loading
conditions on structures.

Fig. 3.24. Conceptual block diagram of a GAN

9.6. Autoencoders
Autoencoders are a type of neural network designed for unsupervised learning. They consist of
two main components: an encoder, which compresses the input data into a smaller, latent space
representation, and a decoder, which reconstructs the original input from this compressed
representation. Figure 3.15 shows typical auto encoder, that is capable of encoding an image
of a building and decoding the input to original image back.
• Suitable ML Tasks:
• Dimensionality Reduction: Autoencoders are used to reduce the dimensionality of data
while preserving important information.
• Anomaly Detection: They can be used to detect anomalies by comparing the original
input to its reconstruction.
• Applications in Civil Engineering:
• Sensor Data Compression: Autoencoders can compress large amounts of sensor data
from structural health monitoring systems.
• Fault Detection: Used to detect faults in construction materials by analyzing sensor
data.

Fig. 3.25. A typical autoencoder network (input → latent space representation → output)

9.7. Graph Neural Networks (GNNs)


GNNs are specifically developed to operate on data structured as graphs, where individual data
points (nodes) are linked by edges. These models can understand the interactions between
nodes and generate representations for either individual nodes or the entire graph. A typical
GNN is shown in figure 3.26.
• Suitable ML Tasks:
• Node Classification: GNNs can classify nodes in a graph, such as identifying the critical
points in a structural network.
• Link Prediction: They can predict the existence of connections between nodes, useful
in network analysis.
• Applications in Civil Engineering:
• Structural Analysis: GNNs can model and analyze complex structural networks, such
as trusses and frames.
• Infrastructure Network Analysis: Used for analyzing the resilience of infrastructure
networks, such as transportation systems.
• Transportation Network Planning: Used for planning a transportation network among
several cities. Cities are represented as nodes, while the connecting roads are treated as
edges

Fig. 3.26. A typical GNN

10. Appropriate Problems for ANN Learning


ANNs excel in addressing challenges that involve noisy and complex sensor inputs, such as
those received from cameras or microphones. Additionally, ANNs can be applied to problems
that often utilize symbolic representations. The most widely adopted learning method for ANNs
is the backpropagation (BP) algorithm, which is well-suited for problems that exhibit the
following characteristics:
• Instances are described by multiple pairs of attributes and values. The target function,
which is to be learned, operates over instances represented as a vector of predefined
features, such as pixel values from an image. These input attributes may either have
strong correlations or be entirely independent, and their values can range across any
real number.
• The target function's output can be either discrete or continuous, or even a vector made
up of multiple discrete or continuous values. For example, the output could consist of
a vector with m attributes, each reflecting a significant parameter of a problem. Each
output value is a real number between 0 and 1, signifying the confidence level for that
prediction.
• Training data may contain inaccuracies, but ANNs remain resilient to noisy data,
making them robust in such environments.
• ANNs are appropriate for situations where longer training durations are acceptable.
Training algorithms for neural networks often take more time compared to other
machine learning algorithms. Training times can range from a few seconds to several
hours, depending on factors such as the number of weights, the size of the training set,
and various algorithm parameters.
• Despite the longer training times, ANNs typically allow for very fast evaluation of the
learned network when applied to new data. In many scenarios, quick evaluation of the
learned target function is crucial.
• ANNs are suitable when human interpretability of the learned target function is not a
priority. The weights learned by ANNs are often complex and difficult to interpret,
making it challenging for humans to understand or explain the model in contrast to
more transparent rule-based systems.

11. Applications in Civil Engineering


To provide a feel for the kinds of applications of ANNs in the various allied fields of civil
engineering, a summary of an intensive literature survey is provided in Table 3.7. Interested
readers are suggested to go through the related references. The bar chart (Figure 3.17)
indicates the number of ANN-related publications in reputable journals during 1990-2022.
Various kinds of concurrent applications of ANNs are shown in Figure 3.18.

Fig. 3.17. ANN application related publications in Scopus-indexed journals (1990-2022)


Fig.3.18. Kinds of ANN applications concurrent with different domains of AI
11.1 Infrastructure Construction
Artificial Neural Networks (ANNs) have been increasingly applied in the construction and infrastructure
engineering sectors from 2022 to 2024, reflecting their growing importance in tackling
complex problems characterized by high uncertainty and non-linearity. This period has seen
notable advancements in the application of ANNs across various domains within construction
and infrastructure, ranging from project management to safety and quality control.
A. Project Cost and Duration Estimation
ANNs have proven to be particularly effective in the estimation of construction project costs
and durations. Traditional methods of cost estimation and scheduling often struggle with the
complexity and uncertainty inherent in construction projects. ANNs, with their ability to learn
from historical data and adapt to new patterns, have been successfully applied to predict costs
and durations more accurately. These models account for a wide range of variables, including
material prices, labour costs, and project-specific factors, providing more reliable estimates
that can enhance decision-making and project planning.
B. Safety and Risk Management
In the area of construction safety, ANNs have been used to assess worker safety behaviors and
predict potential accidents. By analyzing data from past incidents and near-misses, ANN
models can identify patterns and predict high-risk scenarios, allowing for proactive measures
to be implemented. Additionally, ANNs have been integrated with other techniques like Fuzzy
Logic and Genetic Algorithms to manage uncertainties and optimize risk allocation in multi-
project environments, leading to better resource management and risk mitigation strategies.
C. Quality Control and Structural Health Monitoring
Quality control and structural health monitoring have also benefited from ANN applications.
These models are used to monitor the health of construction structures in real-time, predicting
potential failures or the need for maintenance before issues become critical. This predictive
capability is particularly valuable in managing the lifecycle of infrastructure projects, ensuring
that structures remain safe and functional over time. ANNs are also employed in construction
stability testing and the management of mechanical equipment, further contributing to the
overall quality and efficiency of construction processes.
D. Resource and Performance Management
ANNs have been extensively applied to resource management in construction, including the
optimization of material and human resources. By predicting resource needs and managing
equipment usage more efficiently, ANN models help in reducing wastage and improving
overall project performance. Performance evaluation at both the corporate and project levels
has also seen advancements through the application of ANNs, providing insights that can drive
better management practices and enhance project outcomes. As research and development in
this area continue to evolve, it is likely that ANNs will become even more integral to the
future of construction and infrastructure engineering.
11.2 Geotechnical Engineering
From 2022 to 2024, ANNs have seen significant applications in Geotechnical
Engineering, addressing challenges like soil classification, predicting soil and rock properties,
and analyzing stability issues in complex geological settings. Some of the key applications are:
Soil and Rock Classification: ANNs have been extensively used for classifying different soil
and rock types. These models can handle the nonlinear relationships between various
geotechnical properties better than traditional methods. For instance, ANNs are employed in
predicting the compressive strength of rocks and the settlement of soils, with backpropagation
and Hopfield networks showing promising results.
Predicting Geotechnical Properties: One of the significant applications of ANNs in this
period has been in predicting the properties of geotechnical materials. This includes modeling
the shear strength of clay, which is critical for foundation stability. The ANN models have
outperformed traditional regression models in these tasks due to their ability to capture complex
interactions among variables.
Landslide Susceptibility and Stability Analysis: ANNs have also been applied in assessing
landslide susceptibility, where they analyze the factors leading to slope failures and predict
potential landslide zones. These models help in proactive disaster management by identifying
high-risk areas based on input variables like soil moisture, slope gradient, and rainfall patterns.
Optimization and Parameter Inversion: In underground engineering, particularly in tunnel
and metro construction, ANNs combined with optimization algorithms have been used to back-
analyze and optimize the parameters of surrounding rocks. This ensures the stability and safety
of underground structures by providing accurate predictions of rock behavior under various
loading conditions.
Notable Studies
• A study focused on the Nile River's sensitive alluvial clay used ANNs to predict
undrained shear strength, offering an alternative to costly and time-consuming
laboratory tests.
• In metro station construction, ANNs were integrated with differential evolution
algorithms to optimize rock parameters, resulting in more accurate stability
assessments.
• Comprehensive reviews during this period have highlighted the comparative
performance of ANNs against other machine learning models like Support Vector
Machines (SVMs) and Decision Trees, emphasizing the suitability of ANNs for
complex, nonlinear geotechnical problems.
11.3 Hydraulics and Water Resources Engineering
The application of ANNs in Hydraulics and Water Resources Management has seen
significant growth between 2022 and 2024, particularly in areas like hydrological forecasting,
water quality prediction, and groundwater management. This period has been marked by the
integration of ANNs with other machine learning models to enhance the accuracy and
efficiency of predicting complex hydrological phenomena.
A. Hydrological Forecasting: One of the most prominent applications of ANNs is in
hydrological forecasting. ANNs have been used extensively to predict streamflow, runoff, and
other hydrological parameters. The non-linear nature of hydrological processes makes
traditional modeling approaches, such as physically-based and conceptual models, less
effective. ANNs offer a more flexible alternative as they can model complex relationships
between inputs and outputs without needing explicit physical equations. Recent research has
focused on hybrid models that combine ANNs with other machine learning techniques, like
Long Short-Term Memory (LSTM) networks, to improve the accuracy of short-term and long-
term hydrological forecasts.
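A minimal sketch of such a sequence model is given below, assuming a single daily discharge series and a 30-day lookback window; the data are synthetic and the one-LSTM-layer architecture is just one plausible configuration, not a specific hybrid design from the literature.

```python
# Minimal sketch: LSTM one-step-ahead streamflow forecaster on a synthetic series.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(42)
# Synthetic "discharge" signal: seasonal cycle plus noise
series = np.sin(np.linspace(0, 20 * np.pi, 2000)) + rng.normal(0, 0.1, 2000)

lookback = 30  # assumed 30-day input window
X = np.array([series[i:i + lookback] for i in range(len(series) - lookback)])
y = series[lookback:]
X = X[..., np.newaxis]  # shape (samples, timesteps, features)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(lookback, 1)),
    tf.keras.layers.LSTM(32),   # recurrent layer captures temporal memory
    tf.keras.layers.Dense(1),   # next-step discharge estimate
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
print("Next-step forecast:", model.predict(X[-1:], verbose=0)[0, 0])
```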
B. Water Quality Prediction: ANNs have also been employed to predict water quality
parameters such as pH, dissolved oxygen, and pollutant levels in surface and groundwater.
These predictions are crucial for managing water resources, especially in regions facing
pollution challenges. The adaptability of ANNs allows them to learn from historical water
quality data and predict future trends, which is valuable for both regulatory purposes and
environmental conservation.
C. Groundwater Management: Groundwater modelling has benefited from the application of
ANNs, particularly in predicting groundwater levels and the effects of various management
strategies. ANNs have been utilized to assess groundwater recharge rates, predict the impacts
of pumping on groundwater levels, and manage aquifer sustainability. Their ability to handle
large datasets and complex variable interactions makes them an invaluable tool in this field.
D. Flood and Drought Prediction: Another critical application of ANNs in water resources
management is the prediction of extreme hydrological events like floods and droughts. By
analyzing historical climate and hydrological data, ANNs can predict the likelihood and
severity of such events, aiding in disaster preparedness and mitigation efforts. These
predictions are increasingly important as climate change exacerbates the frequency and
intensity of extreme weather events.
11.4 Construction Material Modelling and Characterization
ANNs have seen widespread applications in the construction industry, particularly in
the modeling, characterization, and prediction of mechanical properties of materials. These
methods leverage the ability of ANNs to model complex, non-linear relationships that
traditional statistical methods struggle to capture. From predicting the behavior of concrete
under various conditions to estimating the mechanical properties of novel composite materials,
ANNs have proven to be invaluable tools in advancing the understanding and utilization of
construction materials.
A. Predicting Concrete Properties: One of the most common applications of ANNs in
construction material modeling is in predicting the properties of concrete. Between 2022 and
2024, significant research has been conducted using ANNs to predict the compressive strength,
tensile strength, and durability of concrete. These predictions often rely on input parameters
such as the mix proportions, curing conditions, and the type of additives used.
For example, a study conducted in 2023 utilized an ANN model to predict the compressive
strength of high-performance concrete with recycled aggregates. The model was trained using
data that included the proportions of recycled aggregates, the type of cement used, and the
curing time. The results demonstrated that the ANN model could predict compressive strength
with a higher degree of accuracy than traditional regression models.
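A sketch of this kind of strength model is shown below. The four inputs (cement content, water-cement ratio, recycled-aggregate fraction, curing age) mirror the variables mentioned above, but the data generator and network settings are hypothetical stand-ins, not the dataset of the cited study.

```python
# Minimal sketch: MLP predicting concrete compressive strength from mix parameters.
# Inputs and data are synthetic stand-ins for demonstration only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 300
# Hypothetical inputs: cement (kg/m3), w/c ratio, recycled-aggregate fraction, curing days
X = rng.uniform([300, 0.30, 0.0, 3], [500, 0.60, 0.5, 28], size=(n, 4))
# Synthetic strength: rises with cement and curing age, falls with w/c and RA fraction
y = (0.12 * X[:, 0] - 60 * X[:, 1] - 15 * X[:, 2]
     + 8 * np.log(X[:, 3]) + rng.normal(0, 2, n))

model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(20, 10),
                                   max_iter=3000, random_state=1))
model.fit(X, y)
print("Predicted strength (MPa):", model.predict([[400, 0.45, 0.3, 28]])[0])
```

The pipeline wrapper keeps input standardization and the network together, so a new mix design can be scored with a single predict call.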
B. Material Characterization: ANNs have also been applied in the characterization of new
construction materials, particularly in the development of sustainable and eco-friendly
materials. In 2024, researchers developed an ANN model to characterize the thermal
conductivity of novel phase change materials (PCMs) used in energy-efficient buildings. The
model used inputs such as the PCM composition, density, and temperature to predict thermal
conductivity, providing a reliable tool for material scientists and engineers to optimize PCM
formulations without the need for extensive experimental testing.
C. Mechanical Property Prediction of Composite Materials: Composite materials, often used in
modern construction due to their enhanced mechanical properties, have also been a focus of
ANN applications. In 2022, a study successfully employed ANNs to predict the mechanical
properties, such as tensile strength and modulus of elasticity, of fiber-reinforced polymer
composites. By using a dataset that included fiber type, matrix material, and fabrication
methods as input features, the ANN was able to predict these properties with a high degree of
accuracy, facilitating the design and optimization of composite materials for specific
construction applications.
D. Enhancing Sustainability in Material Design: Sustainability has become a critical focus in
construction, and ANNs have been instrumental in designing and characterizing materials with
lower environmental impact. For example, in 2023, researchers used an ANN model to predict
the mechanical properties of geopolymer concrete made with industrial waste materials. The
model helped identify optimal mix designs that reduced the carbon footprint while maintaining
the necessary mechanical properties for construction applications.
11.5 Transportation Engineering and Highway Technology
The application of ANNs in transportation engineering and highway technology from 2022 to
2024 has revolutionized several aspects of the field. From improving the design and
performance of pavements to enhancing traffic management and road safety, ANNs have
become a critical tool in addressing the complex challenges of modern transportation systems.
As these technologies continue to evolve, their integration into transportation infrastructure
will likely expand, offering even more sophisticated solutions to the pressing issues faced by
road authorities.
A. Pavement Performance and Design: One of the critical areas where ANNs have been applied
is in the performance and design of pavements, particularly flexible pavements. The design of
flexible pavements involves complex interactions between materials, vehicle loads,
environmental factors, and structural components. ANNs have been used to model these
interactions effectively, offering a robust tool for predicting pavement performance under
different conditions. For instance, ANNs have been utilized to predict the viscoelastic
properties of asphalt mixtures, optimizing the design by evaluating the effects of various
additives like crumb rubber and styrene-butadiene-styrene (SBS). These models help in fine-
tuning the material compositions to enhance pavement durability and performance.
B. Traffic Flow Prediction: ANNs have also been extensively used in traffic flow prediction,
where they help in modeling and forecasting traffic patterns based on historical data. These
predictions are crucial for urban traffic management, allowing for the optimization of signal
timings and traffic routing to minimize congestion. The ability of ANNs to learn from vast
datasets and predict traffic conditions under various scenarios has made them indispensable in
developing intelligent transportation systems (ITS). These systems contribute to smoother
traffic flow and reduced travel times.
C. Pavement Maintenance and Rehabilitation: In the domain of pavement maintenance, ANNs
are employed to predict the deterioration of pavement surfaces, enabling transportation
agencies to prioritize maintenance activities more effectively. By analyzing data from various
sensors and historical maintenance records, ANNs can predict the optimal timing for
maintenance, thereby extending the lifespan of roadways and reducing costs associated with
premature repairs. This predictive capability is particularly valuable in managing large
networks of roadways where resource allocation must be carefully balanced.
D. Safety Analysis and Accident Prediction: Another critical application of ANNs is in road
safety analysis and accident prediction. By analysing patterns in traffic accidents and the
associated factors like road geometry, traffic density, and weather conditions, ANNs can predict
potential accident hotspots. These predictions allow for proactive measures such as redesigning
road layouts, implementing better traffic controls, or increasing enforcement in high-risk areas.
This application is crucial for reducing traffic fatalities and enhancing road safety.
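The sketch below shows one possible formulation of hotspot identification as binary classification; the three features (horizontal curvature, traffic volume, fraction of wet days) and the labels are synthetic assumptions chosen only to illustrate the workflow.

```python
# Minimal sketch: MLP classifier flagging potential accident hotspots.
# Features and labels are synthetic, for demonstration only.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n = 400
# Hypothetical inputs: curvature (1/km), traffic volume (veh/day), wet-day fraction
X = rng.uniform([0, 1000, 0.0], [10, 50000, 0.6], size=(n, 3))
risk = 0.3 * X[:, 0] + 5e-5 * X[:, 1] + 4 * X[:, 2]
y = (risk + rng.normal(0, 0.5, n) > np.median(risk)).astype(int)  # 1 = hotspot

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                  random_state=7))
clf.fit(X, y)
print("Hotspot probability:", clf.predict_proba([[8.0, 40000, 0.5]])[0, 1])
```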
E. Geotechnical Applications: In highway technology, ANNs have been applied to geotechnical
engineering problems, such as predicting the mechanical properties of soils used in road
construction. This includes the prediction of soil compaction levels, bearing capacity, and
settlement behavior under load, which are vital for the stability and durability of road
structures. By providing accurate predictions, ANNs assist engineers in making informed
decisions during the design and construction phases, leading to more reliable and cost-effective
infrastructure.
11.6 Structural Engineering
ANNs have found wide applications in structural engineering in recent years. Their ability to
model complex relationships makes them suitable for addressing various challenges in the
field. Below are several recent applications of ANNs in structural engineering:
A. Structural Health Monitoring (SHM): ANNs are extensively used for detecting damage and
assessing the health of structures by analyzing sensor data from various structural components
(e.g., bridges, buildings). The neural network learns from the historical vibration and strain
data to predict potential failures or damage localization. ANNs can classify damaged vs.
undamaged states using sensor data such as strain and acceleration. This helps in early
identification of structural damage.
B. Predicting Structural Response under Dynamic Loads: ANNs are used to predict the
structural response of buildings and bridges subjected to dynamic loads such as wind,
earthquake, and traffic loads. By training on past data, the networks can predict how structures
will behave under various loading conditions. ANN models can predict the nonlinear seismic
response of buildings, aiding in safer design and evaluation under earthquake conditions.
C. Optimization in Structural Design: ANNs are also employed in the optimization of structural
design, focusing on material usage, cost, and overall structural efficiency. ANNs can learn from
previous designs and assist in providing optimized designs with minimum weight, material
cost, or carbon footprint while maintaining structural integrity. ANNs can be used to optimize
the design of steel trusses in bridge construction, reducing weight while maintaining safety and
performance.
D. Modelling of Material Behaviour: ANNs are increasingly used to model the behavior of
various construction materials like concrete, steel, and composites. They can predict the stress-
strain relationship, crack propagation, or fatigue life of these materials under different loading
scenarios. ANNs can predict the compressive strength of concrete based on input variables such
as water-cement ratio, aggregate size, and curing time.
E. Retrofitting and Rehabilitation of Structures: ANNs are applied to predict the performance
of retrofitted structures. They assist engineers in selecting the most effective retrofitting
techniques for buildings and bridges based on historical data; for example, ANN models can be
used to determine the most effective rehabilitation technique for aging or damaged bridges.
F. Structural Load Capacity Prediction: ANNs are used to predict the load-bearing capacity of
structural components and entire systems. This application is critical for ensuring that designs
meet safety requirements without overdesigning, which would increase costs. ANNs can predict
the ultimate load capacity of composite beams accurately, taking material properties and
geometric variables into account.
G. Fire Resistance and Thermal Behaviour of Structures: ANNs help in predicting the thermal
and fire resistance performance of building components, aiding in the design of fire-safe
structures. This is particularly useful for high-rise buildings and critical infrastructure. ANNs
predict the fire resistance time of steel beams subjected to varying fire temperatures, enabling
engineers to design safer buildings.
H. Fatigue Life Prediction in Bridges and Other Infrastructure: ANNs are applied to predict
the fatigue life of structural elements, particularly bridges, by analysing data on cyclic loading,
material properties, and environmental conditions. ANNs help in predicting the fatigue life of
steel and composite components in bridge structures subjected to repeated loading, reducing
maintenance costs.
J. Wind Load Estimation: ANN models are used to predict wind pressures on tall buildings and
structures. They help in reducing the complexity of wind load calculations and improving
structural safety under wind forces. ANNs predict wind-induced vibrations and loads on
skyscrapers, improving safety and reducing the need for extensive wind tunnel testing.
Module End Questions
1. How does a biological neuron function, and how is it analogous to an artificial neuron
in an ANN?
2. What is a perceptron, and how does it differ from a biological neuron?
3. Compare the key differences between biological neural networks and ANNs in terms
of learning, structure, and functionality.
4. What are the limitations of a single-layer perceptron? Can it solve non-linearly
separable problems?
5. Design a perceptron that models an OR gate with two inputs. What are the weights and
bias values?
6. How can a perceptron be used to model an AND gate? What weights and bias should
be assigned?
7. Why can’t a single-layer perceptron solve the XOR problem? What is the significance
of this limitation?
8. Explain the perceptron learning rule. How does a perceptron adjust its weights during
the training process?
9. What are the different types of activation functions used in ANNs? How do functions
like sigmoid, ReLU, and tanh affect the performance of the network?
10. Given the following input values for the water-cement ratio (w/c) and compressive
strength of concrete:
w/c ratio    Strength (MPa)    Output
0.35         65                1
0.55         30                0
0.25         70                1
0.65         22                0
Classify the compressive strength as "high strength" if it is greater than 60 MPa and
"not high strength" if it is less than or equal to 60MPa. Design a perceptron with
appropriate weights and bias to perform this classification.
11. How does a multi-layer neural network (MLP) differ from a single-layer perceptron?
What problems does it address?
12. Explain the backpropagation algorithm in detail. What are the key steps involved in
training a neural network using backpropagation?
13. Solve a simple backpropagation problem step by step. Assume a small network with
two input neurons, one hidden layer, and one output neuron.
14. What are the different types of ANNs, such as Feedforward Neural Networks,
Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and
Generative Adversarial Networks (GANs)?
15. Describe how a feedforward neural network (FNN) operates. Provide an example of a
practical application for FNNs in civil engineering.
16. What is the architecture of a CNN? How are CNNs used in applications involving
image recognition or pattern detection?
17. How do RNNs differ from traditional feedforward neural networks? What are their
applications in time-series prediction or sequential data?
18. What are the appropriate types of problems where ANNs excel over traditional
algorithms? Provide examples from different domains.
19. Discuss at least three applications of ANNs in civil engineering. How do ANNs
contribute to areas like structural health monitoring, material behaviour prediction, or
optimizing structural design?
20. Design an ANN-based solution for predicting the optimal material usage in a steel beam
design problem. What are the input parameters, output, and how would you train the
network?
References
1. McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of
Mathematical Biophysics, 5(4), 115-133.
2. Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain.
Psychological Review, 65(6), 386-408.
3. Minsky, M., & Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press.
4. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors.
Nature, 323(6088), 533-536.
5. Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science,
313(5786), 504-507.
6. Nikos D. Lagaros (2023), ANNs Applied in Civil Engineering, Editorial, Applied Sciences, 13, 1113.
7. Al-Gahtani, K. S., Alsugair, A. M., Alsanabani, N. M., Alabduljabbar, A. A., & Almohsen, A. S. (2024). ANN
prediction model of final construction cost at an early stage. Journal of Asian Architecture and Building Engineering,
1–25.
8. Abdullah Abbas, Gafel K. Aswed (2024), Enhancing Sewage Pipeline Project Cost Estimations in Iraq through
Artificial Neural Network Models, IOP Conf. Ser.: Earth Environ. Sci., 1374, 012086.
9. Abbas M. Abd, Yassi A. Kareem, Raquim Zehawi (2024), Prediction and Estimation of Highway Construction Cost
using Machine Learning, Engineering, Technology & Applied Science Research, 14(4), 17222-17231.
10. Sarkar, D., Pandya, J. (2024). Application of Machine Learning and ANNs to Predict Real Estate Sales. In: Patel,
D., Kim, B., Han, D. (eds) Innovation in Smart and Sustainable Infrastructure, Volume 2. ISSI 2022. Lecture Notes
in Civil Engineering, vol 485. Springer, Singapore
11. M. A. Jayaram and B. Gowda, Machine Learning-Based Surrogate Models for Construction Project Duration
Prediction, 2024 14th International Conference on Cloud Computing, Data Science & Engineering (Confluence),
Noida, India, 2024, pp. 544-548, doi: 10.1109/Confluence60223.2024.10463458.
12. Lin Shi; Jian Zhang; Xiaodong Yu; Daoyong Fu; Wenlong Zhao, Artificial neural network-based water distribution
scheme in real-time in long-distance water supply systems, AQUA - Water Infrastructure, Ecosystems and Society
(2024) 73 (8): 1611–1620.
13. Miyuru B. Gunathilake, Chamaka Karunanayake, Anura S. Gunathilake, Niranga Marasingha, Jayanga T.
Samarasinghe, Isuru M. Bandara, Upaka Rathnayake, (2021), Hydrological Models and Artificial Neural Networks
(ANNs) to Simulate Streamflow in a Tropical Catchment of Sri Lanka, Applied Computational Intelligence and Soft
Computing, 2021, 1-9.
14. Norazman, S.H., Aspar, M.A.S.M., Abd. Ghafar, A.N., Karumdin, N., Abidin, A.N.S.Z. (2024). Artificial Neural
Network Analysis in Road Crash Data: A Review on Its Potential Application in Autonomous Vehicles. Intelligent
Manufacturing and Mechatronics, Lecture Notes in Networks and Systems, vol 850. Springer, Singapore.
15. Yang, X., Guan, J., Ding, L., You, Z., Lee, V., Mohd Hasan, M., & Cheng, X. (2021). Research and applications of
artificial neural network in pavement engineering: A state-of-the-art review. Journal of Traffic and Transportation
Engineering, 8(6), 1000-1021.
16. Sangjinda, K., Jitchaijaroen, W., Nguyen, T. S., Keawsawasvong, S., & Jamsawang, P. (2024). Data-driven
modelling of bearing capacity of footings on spatially random anisotropic clays using ANN and Monte Carlo
simulations. International Journal of Geotechnical Engineering, 1–17.
17. Uzer, Ali Ulvi. (2024). "Accurate Prediction of Compression Index of Normally Consolidated Soils Using Artificial
Neural Networks" Buildings 14(9), 2688.
18. Artificial Neural Networks for Construction Project Cost and Duration Estimation. IIETA.
19. Application of Artificial Neural Networks in Construction Management. Encyclopedia MDPI.
20. R.S. Govindaraju, A. Ramachandra Rao, Artificial Neural Networks in Hydrology, Springer, 2000.
21. Fabio D Nuno, Quoc Bao Pham, H.S. Mogadam (2021), Application of Artificial Neural Network Algorithms for
Hydrological Forecasting, Frontiers in Water.
22. Articles in Construction and Building Materials, Vol 360 – 448, 2023-2024.
23. Wang, H., et al. (2022). "Application of Deep Learning Methods in Structural Health Monitoring: A Review."
Sensors, 22(2), 547
24. Liu, X., et al. (2023). "ANN-Based Seismic Performance Prediction of Structures Subjected to Ground Motion."
Journal of Building Engineering, 67, 105882.
25. Hasançebi, O., et al. (2023). "Optimizing Structural Systems Using Artificial Neural Networks: A Review."
Structural and Multidisciplinary Optimization, 67, 23-41.
26. Ahmad, A., et al. (2023). "Predicting Compressive Strength of Concrete Using Neural Networks." Construction and
Building Materials, 338, 127430.
27. Choi, J., et al. (2022). "ANN-Based Rehabilitation Strategies for Aging Structures: Case Studies and Comparative
Analysis." Journal of Infrastructure Systems, 28(1), 04021033.
28. Zhang, W., et al. (2023). "Prediction of Ultimate Load Capacity of Composite Beams Using ANNs." Composite
Structures, 311, 116825.
29. Nguyen, P., et al. (2023). "ANN-Based Fire Resistance Prediction of Steel Structures." Journal of Fire Protection
Engineering, 33(2), 158-174.
30. Fawad, M., et al. (2022). "ANN-Based Fatigue Life Estimation of Bridge Girders under Traffic Load." International
Journal of Fatigue, 161, 106886.
31. Chen, Z., et al. (2023). "ANN-Based Estimation of Wind Loads on High-Rise Buildings." Wind and Structures,
32(5), 443-458.