
A MINI PROJECT REPORT

ON

“Heart Condition Analysis”

Submitted in partial fulfillment of the requirements for

The degree of

BACHELOR OF ENGINEERING IN COMPUTER ENGINEERING

By

1) Navnath Auti.
2) Omkar Harade.
3) Sahil Nazare.
4) Adnan Shaikh.

UNDER THE GUIDANCE OF


Prof. Sujata Bhairnallykar

Department of Computer Engineering


Saraswati College of Engineering, Kharghar, Navi Mumbai
University of Mumbai
2021-22

SCOE’s Computer Engineering 2021-22


Saraswati College of Engineering, Kharghar
Vision:
To be universally accepted as an autonomous centre of learning in Engineering
Education and Research.

Mission:

➢ To educate students to become responsible and quality technocrats who fulfil the
needs of society and industry.
➢ To nurture students' creativity and skills for taking up challenges in all facets of
life.
_______________________________________________
Department of Computer Engineering
Vision:
To be among renowned institutions in Computer Engineering Education and
Research by developing globally competent graduates.

Mission:

➢ To produce quality Engineering graduates by imparting quality training, hands-on
experience and value education.
➢ To pursue research and new technologies in Computer Engineering and across
interdisciplinary areas that extend the scope of Computer Engineering and
benefit humanity.
➢ To provide a stimulating learning ambience to enhance innovative ideas, problem-
solving ability, leadership qualities, team spirit and ethical responsibilities.



DEPARTMENT OF COMPUTER ENGINEERING
PROGRAM EDUCATIONAL OBJECTIVES

1. To embed a strong foundation of Computer Engineering fundamentals to identify, analyze,
solve and design real-time engineering problems as a professional or entrepreneur for the
benefit of society.
2. To motivate and prepare students for lifelong learning and research to manifest global
competitiveness.
3. To equip students with communication, teamwork and leadership skills to accept
challenges in all facets of life ethically.



DEPARTMENT OF COMPUTER ENGINEERING
PROGRAM OUTCOMES

1. Apply the knowledge of Mathematics, Science and Engineering fundamentals to solve
complex Computer Engineering problems.
2. Identify, formulate and analyze Computer Engineering problems and derive conclusions
using first principles of Mathematics, Engineering Science and Computer Science.
3. Investigate complex Computer Engineering problems to find appropriate solutions leading
to valid conclusions.
4. Design software systems, components and processes to meet specified needs with appropriate
attention to health and safety standards, and environmental and societal considerations.
5. Create, select and apply appropriate techniques, resources and advanced Engineering
software tools to analyze and design solutions for Computer Engineering problems.
6. Understand the impact of Computer Engineering solutions on society and the environment
for sustainable development.
7. Understand societal, health, safety, cultural and legal issues and responsibilities relevant to
the Engineering profession.
8. Apply professional ethics, accountability and equity in the Engineering profession.
9. Work effectively as a member and leader in multidisciplinary teams toward a common goal.
10. Communicate effectively within the profession and society at large.
11. Appropriately incorporate principles of Management and Finance in one's own work.
12. Identify educational needs and engage in lifelong learning in a changing world of
technology.



DEPARTMENT OF COMPUTER ENGINEERING
PROGRAMME SPECIFIC OUTCOMES

1. Formulate and analyze complex engineering problems in Computer Engineering
(Networking / Big Data / Intelligent Systems / Cloud Computing / Real-Time Systems).
2. Plan and develop efficient, reliable, secure and customized application software using cost-
effective emerging software tools ethically.



(Approved by AICTE, recognized by Maharashtra Govt. DTE, affiliated to Mumbai University)
PLOT NO. 46/46A, SECTOR NO 5, BEHIND MSEB SUBSTATION, KHARGHAR, NAVI MUMBAI-410210
Tel.: 022-27743706 to 11 * Fax: 022-27743712 * Website: www.sce.edu.in

CERTIFICATE

This is to certify that the requirements for the mini project report entitled “Heart Condition
Analysis” have been successfully completed by the following students:

Roll No.   Name
28         Navnath Auti
43         Omkar Harade
60         Sahil Nazare
76         Adnan Shaikh

in partial fulfillment of Sem-VI of the Bachelor of Engineering in Computer Engineering of
Mumbai University at Saraswati College of Engineering, Kharghar, during the academic year
2021-22.

Internal Guide External Examiner


Prof. Sujata Bhairnallykar

Mini Project Co-Ordinator Head of Department


Dr. Anjali Dadhich. Prof. Sujata Bhairnallykar



DECLARATION

I declare that this written submission represents my ideas in my own words,
and where others' ideas or words have been included, I have adequately cited and
referenced the original sources. I also declare that I have adhered to all principles
of academic honesty and integrity and have not misrepresented or fabricated or
falsified any idea/data/fact/source in my submission. I understand that any
violation of the above will be cause for disciplinary action by the Institute and
can also evoke penal action from the sources which have thus not been properly
cited or from whom proper permission has not been taken when needed.

1. Navnath Auti.
2. Omkar Harade.
3. Sahil Nazare.
4. Adnan Shaikh.



ACKNOWLEDGEMENT

After the completion of this work, words are not enough to express our feelings
about all those who helped us reach our goal.

It is a great pleasure and a moment of immense satisfaction for us to express
our profound gratitude to our Mini Project Guide, Prof. Sujata Bhairnallykar,
whose constant encouragement enabled us to work enthusiastically. Her perpetual
motivation, patience and excellent expertise in discussions during the progress of the
project work have benefited us to an extent which is beyond expression.

We would also like to give our sincere thanks to Prof. Sujata
Bhairnallykar, Head of Department, and Dr. Anjali Dadhich, Mini Project
Coordinator, Department of Computer Engineering, Saraswati College of
Engineering, Kharghar, Navi Mumbai, for their guidance, encouragement and
support during the project.

We are thankful to Dr. Manjusha Deshmukh, Principal, Saraswati College
of Engineering, Kharghar, Navi Mumbai, for providing an outstanding academic
environment and adequate facilities.

Last but not least, we would also like to thank all the staff of Saraswati
College of Engineering (Computer Engineering Department), whose interest and
valuable suggestions guided and encouraged us.

1. Navnath Auti.
2. Omkar Harade.
3. Sahil Nazare.
4. Adnan Shaikh.



ABSTRACT
Heart irregularities are commonly detected by a physician using a stethoscope. Digital
stethoscopes and mobile devices now let anyone record their heart sounds; however, without
medical knowledge it is difficult to know whether a recording contains any irregularities.
This project, named 'Heart Condition Analysis', is based on deep learning. In this project a
deep learning model is created which classifies audio heart recordings into the three most
commonly occurring categories: Murmur, extra heart sound (Extrasystole), and normal
heartbeat sound. Two datasets were used: Dataset A consists of four categories of sound
(normal heartbeat, artifact, extrasystole and murmur), while Dataset B consists of three
(normal heartbeat, extrasystole and murmur). For data preprocessing, each audio signal was
converted into a digital signal at a sampling rate of about 16,000 samples per second and
denoised using band-pass filtering, after which MFCC values were calculated. We then
trained four deep learning models or architectures: RESNet (Residual Neural Network) and
VGGNet16, both based on the Convolutional Neural Network (CNN); LSTM (Long
Short-Term Memory), based on the Recurrent Neural Network; and a Traditional Neural
Network with Data Augmentation. Among all the deep learning models, RESNet gives the
highest accuracy, around 85%.



TABLE OF CONTENTS

List of Figures………………………………………………………………………… 01-02

1. Introduction………………………………………………………………………... 03-04

1.1. General………………………………………………………………………… 03

1.2. Objective and problem statement……………………………………………. 04

2. Methodology………………………………………………………………………. 05-15

2.1. Data Set Details………………………………………………………………... 05

2.2. Data Preprocessing……………………………………………………………. 06-09

2.3. Calculating Values of MFCCs………………………………………………... 10-11

2.4. Algorithmic Details……………………………………………………………. 12-15

3. System requirement………………………………………………………………. 16

3.1. Hardware requirements………………………………………………………. 16

3.2. Software requirements………………………………………………………... 16

4. Implementation and Results……………………………………………………… 17-21

4.1. Performance Details…………………………………………………………… 17-20

4.2. Results…………………………………………………………………………. 21

5. Conclusion and Future Scope……………………………………………………. 22

5.1. Conclusion……………………………………………………………………... 22

5.2. Future Scope…………………………………………………………………… 22

References……………………………………………………………………………. 23



List of Figures
Figure No. Name Page No.

2.1.1. Dataset A 4

2.1.2. Dataset B 4

2.2.1. Flowchart 5

2.2.2. Plot of Amplitude with Time 6

2.2.3. Mel-Spectrogram Before Bandpass Filtering 6

2.2.4. Mel-Spectrogram After Bandpass Filtering 6

2.2.5. Plot of Amplitude with Time 7

2.2.6. Mel-Spectrogram Before Bandpass Filtering 7

2.2.7. Mel-Spectrogram After Bandpass Filtering 7

2.2.8. Plot of Amplitude with Time 7

2.2.9. Mel-Spectrogram Before Bandpass Filtering 8

2.2.10. Mel-Spectrogram After Bandpass Filtering 8

2.3.1. Mel-Frequency Cepstrum Coefficients Spectrogram Before Bandpass Filtering 9

2.3.2. Mel-Frequency Cepstrum Coefficients Spectrogram After Bandpass Filtering 10

2.3.3. Mel-Frequency Cepstrum Coefficients values 10

2.4.1. Structure of VGGNet16 12

2.4.2. Structure of LSTM 13

2.4.3. Structure of Traditional Neural Network 14

4.1.1. Validation Accuracy and Loss during the training of the RESNet model 17

4.1.2. Confusion Matrix of RESNet Model 17

4.1.3. Confusion Matrix of VGGNet16 Model 18

4.1.4. Long Short Term Model 18



4.1.5. Validation Accuracy and Loss during the training of the LSTM model 18
4.1.6. Traditional Neural Network and Data Augmentation Model 19
4.1.7. Confusion Matrix for Augmented Model 19
4.2.1. Frontend1 20
4.2.2. Frontend2 20
4.2.3. Output1 20
4.2.4. Output2 20



CHAPTER 1
INTRODUCTION
1.1. GENERAL

Heart disease is currently among the leading causes of death. Heartbeat sound
analysis is a convenient way to diagnose heart disease. In this mini-project we
therefore created a model which classifies heartbeat sounds into three categories:
Normal, Extrasystole and Murmur. The sound of a normal heart involves two
components, "lub" and "dub": the lub is associated with valve closure at the start of
systole, and the dub with valve closure at the start of diastole. A normal heartbeat
has a clear "lub dub, lub dub" pattern, with the time from dub to lub greater than
the time from lub to dub, at a rate of 60-100 beats per minute. Murmur and
Extrasystole are abnormal heartbeat sounds. A murmur has a whooshing, roaring,
rumbling or turbulent noise pattern between lub and dub or between dub and lub,
and is a symptom of many heart diseases. An extrasystole has an out-of-rhythm
pattern, "lub-lub dub" or "lub dub-dub", that is occasionally found in adults and is
very common in children.

Heartbeat sound classification usually comprises three steps. The first step is
preprocessing, which cleans the heartbeat signal by passing it through a band-pass
filter to eliminate noise and keep the heartbeat sound within a particular frequency
range. In the second step we extract features from the audio signal using Mel
Frequency Cepstrum Coefficients, which are used to train the model. In the third
step we train the model.

In this project we used two datasets, Dataset A and Dataset B. Dataset A contains
four classes of sound (Normal heartbeat, Extrasystole, Murmur and Artifact),
whereas Dataset B contains three (Normal, Extrasystole and Murmur). Initially this
data is in analog audio form, so we sampled it at a rate of 16,000 samples per second
to convert it into digital form. The datasets contain audio noise, so we applied
band-pass filtering to remove it and keep the heartbeat sound within a particular
frequency range. We then applied the Fast Fourier Transform (FFT) algorithm,
which converts a signal from the time domain to the frequency domain. After the
FFT we applied the Short-Time Fourier Transform, which gives a spectrogram, a
visual representation of the audio signal. The spectrogram was then converted into
a Mel-spectrogram (converting frequency to the Mel scale), and features were
extracted from the audio signal by applying Mel Frequency Cepstrum Coefficients.

Fundamental deep learning concepts and algorithms were used for heartbeat sound
classification. We used visual-domain classification approaches: the Convolutional
Neural Network (CNN)-based ResNet architecture and the VGG Very Deep
Convolutional Neural Network (VGGNet16); we also applied a proposed Recurrent
Neural Network (RNN) model based on Long Short-Term Memory (LSTM), and
trained a Traditional Neural Network with Data Augmentation. These four models
(RESNet, VGGNet16, LSTM, and Traditional Neural Network with Data
Augmentation) were trained to classify heartbeat sounds.



1.2. OBJECTIVE AND PROBLEM STATEMENT
The project aims to build a model which can classify heartbeat sound recordings of
humans, helping anyone to check their heartbeat sound and take further
treatment.

FUNCTIONALITY:

• Data preprocessing by applying band-pass filtering (making the data suitable for
training a model).
• Feature extraction using Mel Frequency Cepstrum Coefficients.
• Training the following Deep Learning models:

1. RESNet (Residual Neural Network):
Convolutional Neural Network based architecture.

2. VGGNet16:
Convolutional Neural Network based architecture.

3. LSTM (Long Short-Term Memory):
Recurrent Neural Network based architecture.

4. Traditional Neural Network and Data Augmentation.



CHAPTER 2
METHODOLOGY
2.1 Data Set Details
Two datasets are used to verify the performance of the model. Dataset A has four
categories of sound: Normal, Artifact, Extrasystole and Murmur; this data was
recorded from human beings under normal conditions. Dataset B has three categories
of sound: Normal, Extrasystole and Murmur; it was recorded professionally under
controlled conditions and therefore contains less noise than Dataset A.

The following bar plots show the distribution of sound categories in Dataset A and
Dataset B.

Figure 2.1.1. Dataset A. Figure 2.1.2. Dataset B.

Normal includes normal heartbeat sounds (lub-dub) and may also include other sounds
such as those of the lungs, stomach, etc. Extrasystole, Murmur and Artifact are
commonly occurring abnormal heart sounds.



2.2 Data Preprocessing

Figure 2.2.1. Flowchart.

Before starting data preprocessing we first need to convert the audio signals to digital
signals, so we sampled each audio signal at a rate of 16,000 samples per second.

The datasets contain audio noise. To denoise the data and keep only the heartbeat
sound within a particular frequency range, we applied band-pass filtering, which
allows audio signals within a selected range of frequencies to pass through while
blocking signals at unwanted frequencies.
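As a concrete sketch, a Butterworth band-pass filter can be applied with SciPy. The exact cut-off frequencies used in the project are not stated, so the 20-400 Hz band below is an illustrative assumption, not the project's actual configuration:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(signal, sr=16000, low=20.0, high=400.0, order=4):
    """Keep only frequencies between low and high Hz (hypothetical band)."""
    sos = butter(order, [low, high], btype="bandpass", fs=sr, output="sos")
    return sosfiltfilt(sos, signal)

# Synthetic check: a 100 Hz tone (in band) plus a 3 kHz tone (out of band).
sr = 16000
t = np.arange(sr) / sr
noisy = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 3000 * t)
clean = bandpass(noisy, sr)
```

After filtering, the 3 kHz component is almost entirely removed while the in-band 100 Hz component passes through.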

Along with band-pass filtering we also applied the FFT, or Fast Fourier Transform,
an algorithm that efficiently computes the Fourier transform and is widely used in
signal processing. An audio signal is composed of several single-frequency sound
waves; when taking samples of the signal over time, we capture only the resulting
amplitudes. The FFT is a mathematical procedure that decomposes a signal into its
individual frequencies and each frequency's amplitude. In other words, it converts
the signal from the time domain into the frequency domain. The result obtained
after the FFT is called a spectrum.
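A minimal NumPy illustration of this decomposition (the component frequencies are chosen arbitrarily for the demonstration):

```python
import numpy as np

sr = 16000                          # samples per second
t = np.arange(sr) / sr              # one second of time stamps
# A signal built from two known frequencies.
sig = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

spectrum = np.abs(np.fft.rfft(sig))          # magnitude of each frequency bin
freqs = np.fft.rfftfreq(len(sig), d=1 / sr)  # the frequency each bin represents

# The two strongest bins fall exactly on the component frequencies.
peaks = sorted(freqs[np.argsort(spectrum)[-2:]])
```

With exactly one second of signal the bins are spaced 1 Hz apart, so the two peaks land precisely at 50 Hz and 120 Hz.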

The Fast Fourier Transform is a powerful tool for analyzing the frequency content
of a signal, but it cannot represent non-periodic signals that vary over time. The
Short-Time Fourier Transform (STFT) can: it computes several spectra by
performing the FFT on overlapping windowed segments of the signal, which yields
a spectrogram. A spectrogram is a way to visually represent a signal's loudness, or
amplitude, as it varies over time at different frequencies. In a spectrogram the
X-axis represents time and the Y-axis represents frequency; the Y-axis is converted
to a log scale and the color dimension to decibels.

After all these steps we computed the Mel-spectrogram. A Mel-spectrogram is a
spectrogram whose frequencies are converted to the Mel scale, a logarithmic
transformation of the signal's frequency. The core idea of this transformation is that
sounds an equal distance apart on the Mel scale are perceived as equally far apart
by a listener.
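The standard Hz-to-Mel mapping can be written down directly (this is one common formula; libraries differ slightly in the constants they use):

```python
import numpy as np

def hz_to_mel(f):
    """Common formula: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

# By construction ~1000 Hz maps to ~1000 mel, and equal mel steps
# correspond to progressively wider Hz steps at higher frequencies,
# matching how listeners perceive pitch differences.
```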

The following figures show results before and after applying band-pass filtering on
Normal, Murmur and Extrasystole heartbeat sounds, in the form of Mel-spectrograms.

For Normal Heartbeat Sound

Figure 2.2.2. Plot of Amplitude with Time.

Figure 2.2.3. Mel-Spectrogram Before Bandpass Filtering.

Figure 2.2.4. Mel-Spectrogram After Bandpass Filtering.



For Murmur Heartbeat Sound

Figure 2.2.5. Plot of Amplitude with Time.

Figure 2.2.6. Mel-Spectrogram Before Bandpass Filtering.

Figure 2.2.7. Mel-Spectrogram After Bandpass Filtering.

For Extrasystole Heartbeat Sound

Figure 2.2.8. Plot of Amplitude with Time.



Figure 2.2.9. Mel-Spectrogram Before Bandpass Filtering.

Figure 2.2.10. Mel-Spectrogram After Bandpass Filtering.



2.3 Calculating MFCCs Values:

MFCC stands for Mel Frequency Cepstrum Coefficients, a technique for extracting
features from an audio signal; using these features as input to a base model produces
much better performance than feeding in the raw audio signal directly. MFCC is the
most widely used technique for extracting features from audio, and it is a compact
representation of the spectrum of an audio signal. With MFCC we convert the audio
signal into coefficient values, and those coefficients carry information about the rate
of change in the different spectrum bands.

To obtain the MFCC values we applied the following steps:

1. First we apply the Short-Time Fourier Transform (STFT), which gives us a
spectrogram.
2. Then we convert the spectrogram to the Mel scale and take the logarithm of the
Mel representation of the audio.
3. We take the logarithmic magnitude and apply the Discrete Cosine Transformation
(effectively applying an inverse Fourier transformation).
4. The result is a spectrum over Mel frequencies as opposed to time; these values are
the MFCCs.

If a cepstral coefficient has a positive value, the majority of the spectral energy is
concentrated in the low-frequency regions; if a cepstral coefficient has a negative
value, most of the spectral energy is concentrated at high frequencies.
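The steps above can be sketched end-to-end with SciPy. This is a simplified, hand-rolled filterbank for illustration only; real extractors such as `librosa.feature.mfcc` add windowing, normalization and other details:

```python
import numpy as np
from scipy.signal import stft
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge of the triangle
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):      # falling edge of the triangle
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def simple_mfcc(signal, sr=16000, n_fft=512, n_mels=26, n_mfcc=13):
    _, _, Z = stft(signal, fs=sr, nperseg=n_fft)          # step 1: STFT
    power = np.abs(Z) ** 2                                # power spectrogram
    mel_spec = mel_filterbank(sr, n_fft, n_mels) @ power  # step 2: mel scale
    log_mel = np.log(mel_spec + 1e-10)                    # log magnitude
    # step 3: DCT decorrelates the log-mel energies; keep first n_mfcc rows
    return dct(log_mel, type=2, axis=0, norm="ortho")[:n_mfcc]

rng = np.random.default_rng(0)
mfcc = simple_mfcc(rng.standard_normal(16000))  # one second of test audio
```

The output is a matrix of 13 coefficients per analysis frame, which is the feature representation the models are trained on.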

Following are Mel-Frequency Cepstrum Coefficients Spectrogram.

Figure 2.3.1. Mel-Frequency Cepstrum Coefficients Spectrogram Before Bandpass Filtering.



Figure 2.3.2. Mel-Frequency Cepstrum Coefficients Spectrogram After Bandpass Filtering.

Figure 2.3.3. Mel-Frequency Cepstrum Coefficients values.

This resultant list of numbers, the coefficients, is termed the Mel Frequency
Cepstrum Coefficients, i.e. MFCCs.



2.4 Algorithmic Details:
Fundamental deep learning concepts and algorithms were used for heartbeat sound
classification. We used visual-domain classification approaches: the Convolutional
Neural Network (CNN)-based ResNet architecture and the VGG Very Deep
Convolutional Neural Network (VGGNet16); we also applied a proposed Recurrent
Neural Network (RNN) model based on Long Short-Term Memory (LSTM). We also
trained a Traditional Neural Network with Data Augmentation.

Convolutional Neural Network(CNN):

Among neural networks, the Convolutional Neural Network (ConvNet or CNN) is one
of the main categories used for image recognition and classification; object detection,
face recognition, etc., are some of the areas where CNNs are widely used.
Convolutional neural networks are composed of multiple layers of artificial neurons.
Artificial neurons, a rough imitation of their biological counterparts, are mathematical
functions that calculate the weighted sum of multiple inputs and output an activation
value. When you input an image into a ConvNet, each layer generates several
activation values that are passed on to the next layer.

CNNs are inspired by the architecture of the brain. Just as a neuron in the brain
processes and transmits information throughout the body, artificial neurons or nodes
in CNNs take inputs, process them and send the result as output. The image is fed as
input: the input layer accepts the image pixels in the form of arrays. A CNN can have
multiple hidden layers, which perform feature extraction from the image through
calculations; these can include convolution, pooling, rectified linear units, and fully
connected layers. The first layer usually extracts basic features such as horizontal or
diagonal edges. This output is passed on to the next layer, which detects more complex
features such as corners or combinations of edges. As we move deeper into the
network it can identify still more complex features such as objects, faces, etc.

Recurrent Neural Network (RNN):


A recurrent neural network (RNN) is a type of artificial neural network that works on
sequential or time-series data. These deep learning algorithms are commonly used for
ordinal or temporal problems, such as language translation, natural language processing
(NLP), speech recognition, and image captioning; they are incorporated into popular
applications such as Siri, voice search, and Google Translate. Like feedforward and
convolutional neural networks (CNNs), recurrent neural networks utilize training data to
learn. They are distinguished by their "memory": they take information from prior inputs
to influence the current input and output. While traditional deep neural networks assume
that inputs and outputs are independent of each other, the output of a recurrent neural
network depends on the prior elements within the sequence. While future events would
also be helpful in determining the output of a given sequence, unidirectional recurrent
neural networks cannot account for them in their predictions.
Variants of the Recurrent Neural Network include:
• Bidirectional recurrent neural networks (BRNN)
• Long short-term memory (LSTM), which we used in our project.



Four deep learning models or architectures are used in this project to classify heartbeat
sounds. These include the CNN-based Residual Network (RESNet) and VGGNet, the
RNN-based Long Short-Term Memory (LSTM), and a Traditional Neural Network with
Data Augmentation.

1. RESNet: The Residual learning network (RESNet) is a variation of CNN that
enables effective learning in deeper networks. RESNet avoids the
vanishing/exploding gradient problem of deeper networks with the help of residual
learning. In a residual network, shortcut or skip connections can be inserted when
the input and output dimensions of the stacked convolutional blocks are the same:
a skip connection adds the input of the first convolution block to the output of the
next convolution block in a 2-layer stack. When the input and output sizes differ,
the shortcut can either pad the output with zeros or project to matching dimensions
using a 1x1 convolution. This reformulates the layers explicitly as identity
mappings to address the degradation issue. Shortcut connections allow the
gradient to flow through them directly, which helps to mitigate the vanishing
gradient problem. The identity function of RESNet ensures that subsequent layers
learn at least as well as the previous layers, if not better. The RESNet study also
introduced bottleneck blocks, which stack three convolution blocks with 1x1, 3x3,
and 1x1 kernels respectively instead of 2 layers for deeper networks. Parameter-
free identity shortcuts showed better performance in bottleneck-based RESNet
architectures. The 152-layer RESNet replaced the 2-layer blocks of ResNet-34 with
3-layer bottlenecks, which resulted in better accuracy in the ImageNet challenge.
ResNet152V2 is described by [13] as using a pre-activated rectified linear unit
(ReLU) with the original ResNet152.
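A minimal numeric sketch of the identity shortcut, with dense layers standing in for the convolution blocks (the weights here are illustrative, not taken from the project):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x): the shortcut adds the input to the block output."""
    out = relu(w1 @ x)    # first layer of the 2-layer stack
    out = w2 @ out        # second layer; activation applied after the addition
    return relu(out + x)  # identity skip connection

# If both weight matrices are zero, F(x) = 0 and the block reduces to the
# identity mapping (for non-negative inputs), which is what lets deeper
# layers do no worse than shallower ones.
x = np.array([1.0, 2.0, 3.0])
y = residual_block(x, np.zeros((3, 3)), np.zeros((3, 3)))
```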

2. VGGNet: VGG stands for Visual Geometry Group; it is a standard
deep Convolutional Neural Network (CNN) architecture with multiple layers. The
"deep" refers to the number of layers, with VGG-16 and VGG-19 consisting of 16
and 19 weight layers respectively. The VGG architecture is the basis of ground-
breaking object recognition models. Developed as a deep neural network, VGGNet
also surpasses baselines on many tasks and datasets beyond ImageNet, and it
remains one of the most popular image recognition architectures.

Figure 2.4.1. Structure of VGGNet16.



The VGG model, or VGGNet, that supports 16 layers is referred to as VGG16; it is a
convolutional neural network that can classify images into 1000 object categories. The
concept of the VGG19 model (also VGGNet-19) is the same as VGG16 except that it
supports 19 layers. The "16" and "19" stand for the number of weight layers in the
model (convolutional and fully connected layers); VGG19 has three more
convolutional layers than VGG16.

3. LSTM: Long Short-Term Memory is a kind of recurrent neural network; in an RNN,
the output of the previous step is fed as input to the current step.

Figure 2.4.2. Structure of LSTM.

An LSTM network is made up of different memory blocks called cells (the rectangles
that we see in the figure). Two states are transferred to the next cell: the cell state and
the hidden state. The memory blocks are responsible for remembering things, and
manipulation of this memory is done through three major mechanisms, called gates.
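One LSTM cell update can be sketched in NumPy. The weight shapes below are illustrative; real layers such as Keras's `LSTM` handle all of this internally:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """Single time step: x is the input, (h, c) the hidden and cell states."""
    H = h.size
    z = W @ x + U @ h + b          # stacked pre-activations for all 4 gates
    f = sigmoid(z[0:H])            # forget gate: what to erase from memory
    i = sigmoid(z[H:2 * H])        # input gate: what new info to store
    o = sigmoid(z[2 * H:3 * H])    # output gate: what to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate cell state
    c_new = f * c + i * g          # cell state carries the long-term memory
    h_new = o * np.tanh(c_new)     # hidden state is the cell's output
    return h_new, c_new

rng = np.random.default_rng(0)
H, D = 4, 3  # hidden size and input size (arbitrary for the demo)
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H),
                 W, U, np.zeros(4 * H))
```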

RNNs and LSTMs can be memory-bandwidth-limited; Temporal Convolutional
Networks (TCNs) have been reported to "outperform canonical recurrent networks
such as LSTMs across a diverse range of tasks and datasets, while demonstrating
longer effective memory".

4. Traditional Neural Network and Data Augmentation

With the principle of "learning to behave", traditional neural networks need to be well
trained before being applied. Training data is applied directly as network inputs, and
the network parameters, called "weights", are adjusted iteratively according to the
differences between the desired and actual network behaviors. Traditional neural
networks have various architectures, the most popular being multilayer perceptron
(MLP) networks. In practice, plain MLP networks can be inefficient for solving
problems; traditional neural networks with connections across layers, such as fully
connected cascade (FCC) networks and bridged multilayer perceptron (BMLP)
networks, are much more powerful, but also require more challenging computations.



Figure 2.4.3. Structure of Traditional Neural Network.

Net computation: net = Σ (n = 1 to N) x_n · w_n + w_0

where n is the index of inputs and weights, from 1 to N; w_n is the weight on input
x_n; and w_0 is the bias weight.

Output computation: y_m = f(net) = tanh(net)

where y_m is the output of the neuron and f(net) is the activation function.
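These two formulas map directly to code:

```python
import numpy as np

def neuron(x, w, w0):
    """net = sum_n x_n * w_n + w0 ; output y = tanh(net)."""
    net = np.dot(x, w) + w0
    return np.tanh(net)

# A zero net input gives a zero output, and tanh keeps every
# output strictly inside the interval (-1, 1).
```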



CHAPTER 3
SYSTEM REQUIREMENT
3 HARDWARE AND SOFTWARE REQUIREMENTS

3.1 HARDWARE REQUIREMENTS:

1. RAM: 12 GB+ 2700 MHz DDR4 Minimum


2. Processor: 4 GHz, Dual Core Minimum
3. Hard Drive: SSD Required

3.2 SOFTWARE REQUIREMENTS:

1. Anaconda 3
2. Jupyter Notebook
3. Google Colab
4. Python 3.7
5. Pandas
6. Numpy
7. Seaborn and Matplotlib
8. Scikit-Learn
9. Scipy
10. Librosa
11. Keras



CHAPTER 4
IMPLEMENTATION AND RESULTS

4.1. PERFORMANCE DETAILS:


We used four deep learning models, viz. ResNet, VGG16, LSTM and a Traditional
Neural Network. ResNet and VGG16 are pretrained convolutional neural networks,
LSTM is based on the recurrent neural network, and the Traditional Neural Network is
a multi-layered neural network.

The following table compares the Deep Learning models or architectures:

Sr. No.   Deep Learning Model                                Accuracy   Running Time
1         Residual Neural Network (RESNet)                   85%        1:30 hours
2         VGGNet16                                           80%        3 hours
3         Long Short-Term Memory (LSTM)                      77%        5 mins
4         Traditional Neural Network and Data Augmentation   80%        2 mins

Table 4.1.1. Performance Details.

The RESNet-based model gave the highest accuracy rate for the given audio datasets.

1. RESNet (Residual Neural Network):

The model was trained with the Adam optimizer and a learning rate (LR) of 0.001. It was
configured to train for 20 epochs with the softmax activation function. Softmax is an
effective way to handle multi-class classification problems in which the output is
categorical. The validation accuracy and loss during training are shown in the figure below:



Figure 4.1.1. Validation Accuracy and loss during the training of the RESNet model.

Figure 4.1.2. Confusion Matrix of RESNet Model.

The validation accuracy and loss during training help determine whether the model is
working well. If validation loss increases and validation accuracy decreases, the model is
not learning. If both validation loss and accuracy increase, the model is over-fitting. If
validation loss decreases and validation accuracy increases, the model is learning and
working well.
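The softmax activation mentioned above turns the network's raw output scores into class probabilities; a minimal version:

```python
import numpy as np

def softmax(z):
    """Convert raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical scores for the three classes, e.g. Normal, Murmur, Extrasystole.
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
```

The largest score always receives the largest probability, and the probabilities are strictly positive, which is what makes softmax suitable for multi-class outputs.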

2. VGGNet16:

The model was trained with the Adam optimizer and a learning rate (LR) of 0.001. It was
configured to train for 20 epochs with the softmax activation function.



Figure 4.1.3. Confusion Matrix of VGGNet16 Model.

3. LSTM(Long Short Term Model):

The model was trained with the Adam optimizer. It was configured to train for a maximum
of 100 epochs with the softmax activation function.

Figure 4.1.4. Long Short Term Model.

Figure 4.1.5. Validation Accuracy and loss during the training of the LSTM model.



4. Traditional Neural Network and Data Augmentation Model.

For this model, the preprocessing consists of extracting the MFCC, ZCR (zero-crossing
rate) and bandwidth values from the dataset and splitting the data into chunks of 2
seconds. The model was trained on the MFCC, ZCR and bandwidth values using the
Adam optimizer with an LR of 0.0001 and the softmax activation function for 50 epochs.
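Two of these preprocessing steps are easy to sketch (the project's exact implementation is not shown; libraries such as librosa provide equivalent feature extractors):

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    signs = np.sign(frame)
    return float(np.mean(signs[:-1] != signs[1:]))

def split_chunks(signal, sr=16000, seconds=2):
    """Split a recording into fixed-length chunks, dropping the remainder."""
    size = sr * seconds
    n = len(signal) // size
    return [signal[i * size:(i + 1) * size] for i in range(n)]

# A signal that alternates sign crosses zero at every step -> ZCR of 1.0.
zcr = zero_crossing_rate(np.array([1.0, -1.0, 1.0, -1.0]))
# A 5-second recording at 16 kHz yields two 2-second chunks.
chunks = split_chunks(np.zeros(5 * 16000), sr=16000, seconds=2)
```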

Figure 4.1.6. Traditional Neural Network and Data Augmentation Model.

Figure 4.1.7. Confusion Matrix for Augmented Model.



4.2. RESULTS:

Figure 4.2.1 Frontend1.

Figure 4.2.2. Frontend2.

Figure 4.2.3. Output1.

Figure 4.2.4. Output2.


CHAPTER 5
CONCLUSION AND FUTURE SCOPE
5.1. Conclusion:
This mini project proposed a deep learning model for heartbeat sound classification. The
proposed model can efficiently detect whether a heartbeat sound is a Normal heartbeat,
Murmur or Extrasystole; with its help, anyone can check their heartbeat sound and seek
further necessary treatment.
In this project we preprocessed the audio files by applying a band-pass filtering method
and extracted features from the audio using Mel Frequency Cepstrum Coefficients. We
trained various Deep Learning models: RESNet, VGGNet16, LSTM and a Traditional
Neural Network with Data Augmentation. Among all of these, RESNet gave the highest
accuracy, around 85%.

5.2. Future Scope:


• In the future we will try to improve the accuracy of heartbeat sound classification and
detection.
• Using the current model, we could build an embedded system that detects heartbeat
sounds and gives a direct result.
• Currently this model only detects and classifies heartbeat sounds, but with some
modification it could also detect abnormal sounds of the lungs, stomach, etc.



REFERENCES

[1] Wes McKinney, "Python for Data Analysis: Data Wrangling with Pandas, NumPy and
IPython".
[2] Nikhil Buduma, Nicholas Locascio, "Fundamentals of Deep Learning: Designing
Next-Generation Machine Intelligence Algorithms".
[3] Ali Raza, Arif Mehmood, Saleem Ullah, Maqsood Ahmad, Gyu Sang Choi and Byung-
Won On, "Heartbeat Sound Signal Classification Using Deep Learning".
[4] Uddipan Mukherjee, Sidharth Pancholi, "Heartbeat Sound Classification with Visual
Domain Deep Neural Networks".
[5] Keunwoo Choi, George Fazekas, Mark Sandler, "Automatic Tagging Using Deep
Convolutional Neural Networks".
[6] Alfredo Canziani, Adam Paszke, Eugenio Culurciello, "An Analysis of Deep Neural
Network Models for Practical Applications".
[7] Venkatesh Boddapati, Andrej Petef, Jim Rasmusson, Lars Lundberg, "Classifying
Environmental Sounds Using Image Recognition Networks".

