Proceedings Book
Istanbul, Turkey
ICAETA -2021
International Conference on Advanced Engineering,
Technology and Applications
PROCEEDINGS BOOK
2021
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
Host
Department of Computer Engineering,
Istanbul Aydin University,
Istanbul, Turkey.
https://izu.edu.tr
Sponsors
AIPLUS
308 Daisyfield Centre, Appleby Street
Blackburn Lancashire BB1 3BL,
United Kingdom
https://www.aiplustech.org/
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
BOOK OF PROCEEDINGS
Editorial Board
Dr. Akhtar Jamil
Dr. Alaa Ali Hameed
ISSN: 2752-8340
Istanbul, Turkey
Copyright © 2021
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.
The individual contributions in this publication and any liabilities arising from them remain the responsibility of the authors. The publisher is not responsible for possible damages which could be a result of content derived from this publication. Moreover, the proceedings book is published online at https://icaeta.aiplustech.org.
info@aiplustech.org
https://icaeta.aiplustech.org
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
Keynote Speakers
Publication Committee
Dr. Imran Ahmed Siddiqi, Department of Computer Science, Bahria University, Pakistan.
Maryam Torabi, Alzahra Technical and Vocational College, Iran.
Dourna Kiavar, Sama University of Tabriz, Iran.
Mustafa Takaoğlu, Istanbul Aydin University, Turkey.
Registration Committee
Dr. Amani Yahyaoui, Istanbul Sabahattin Zaim University, Turkey.
Alireza Sakha, Islamic Azad University, Iran.
Ali Khiabanian, Interdisciplinary Design Universe Office, Iran.
Ayse Gul Gemci, Istanbul Technical University, Turkey.
Publicity Committee
Dr. Adem Ozyavas, Istanbul Aydin University, Turkey.
Sama Khattab, Istanbul Aydin University, Turkey.
Program Committee
Dr. Ameer Al-Nemrat, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Chawki Djeddi, Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS), University of Rouen, France.
Dr. Can Balkaya, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Muhammad Abdul Basit, Montana Technological University, Butte, Montana, USA.
Dr. Naghmeh Moradpoor, School of Computing, Edinburgh Napier University, United Kingdom.
Dr. Kamran Dehghan, Department of Architecture, Islamic Azad University, Iran.
Dr. Selda Nazari, Department of Architecture, Islamic Azad University, Iran.
Dr. Syed Attique Shah, Balochistan University of Information Technology, Engineering and Management Sciences, Pakistan.
Dr. Müberra Eser Aydemir, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Muhammad Fahim, Institute of Information Security and Cyberphysical Systems, Innopolis University, Russia.
Dr. Shahab Adam Navasi, Department of Architecture, Islamic Azad University, Iran.
Dr. Muhammad Ilyas, Department of Electrical and Electronics Engineering, Altinbas University, Turkey.
Dr. Rawad Hammad, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Prateek Agrawal, Department of Computer Science, University of Klagenfurt, Austria.
Dr. Atoosa Modiri, Department of Architecture, Islamic Azad University, Iran.
Dr. Hasan Volkan Oral, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Jayapandian N, Department of Computer Science and Engineering, Christ University, Bangalore, India.
Dr. Kaveh Dehghanian, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Nidaa Flaih Hassan, Department of Computer Science, University of Technology, Iraq.
Dr. Amani Yahyaoui, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Turkey.
Dr. Firas Ajlouni, Department of Computer Science, Lancashire College, United Kingdom.
Dr. Bita Bagheri, Department of Architecture, Islamic Azad University, Iran.
Dr. Zeynep Kerem Öztürk, Department of Interior Architecture and Environmental Design, Istanbul Sabahattin Zaim University, Istanbul, Turkey.
Dr. Aliyu Musa, Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.
Dr. S Amutha, Department of Computer Science and Engineering, Saveetha Engineering College (Autonomous), affiliated to Anna University, Chennai, India.
Dr. Sepanta Naimi, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mehmet Fatih Altan, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammad Golmohammadi, Department of Architecture, Islamic Azad University, Iran.
Dr. Lisa Oliver, Department of Computer Science, Lancashire College, United Kingdom.
Dr. Ahmet Gürhanli, Department of Computer Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Daniel White, Department of Mathematics and Computing, Lancashire College, United Kingdom.
Dr. Yasmin Doozdoozani, Department of Architecture, Islamic Azad University, Iran.
Dr. Javad Eiraji, Faculty of Architecture and Design, Eskisehir Technical University, Turkey.
Dr. Kiarash Eftekhari, Department of Architecture, Islamic Azad University, Iran.
Dr. Mina Najafi, Editorial Assistant, Emerald Publishing Ltd, United Kingdom.
Dr. Nahid Khahnamouei, Department of Architecture, University of Nabi Akram, Iran.
Dr. Murtaza Farsadi, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Sadeq Alhamouz, Department of Computer Sciences, WISE University, Jordan.
Dr. Necip Gökhan Kasapoğlu, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Vira V. Shendryk, Department of Computer Sciences, Sumy State University, Ukraine.
Dr. Elnaz Pashaei, Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Shareeful Islam, School of Architecture, Computing and Engineering, University of East London, London, United Kingdom.
Dr. Navid Khaleghimoghaddam, Department of Engineering and Architecture, Konya Food and Agriculture University, Turkey.
Dr. Nima Mirzaei, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Imene Yahyaoui, Applied Mathematics, Materials Science and Engineering, and Electronic Technology, Universidad Rey Juan Carlos, Spain.
Dr. Ilham Huseyinov, Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Azeem Hafeez, Department of Electrical and Computer Engineering, University of Michigan, USA.
Dr. Abdulkader Alwer, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Zeynep Orman, Department of Computer Engineering, Istanbul University, Turkey.
Dr. Xiaodong Liu, School of Computing, Edinburgh Napier University, United Kingdom.
Dr. Saed Moghimi, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Alaa Sheta, Department of Computer Sciences, Southern Connecticut State University, USA.
Dr. Mehmet Güneş Gençyilmaz, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Raheleh Mirzaei, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Vijayakumar Varadarajan, School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia.
Dr. Luca Romeo, Istituto Italiano di Tecnologia, Italy.
Dr. Numan Khurshid, SEECS, National University of Science and Technology, Pakistan.
Dr. Elif Merve Kahraman, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Sibel Kahraman, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mustafa Nafiz Duru, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Saeid Homayouni, Centre for Water, Earth, and Environment, INRS-Quebec, Canada.
Dr. Imran Ahmed Siddiqi, Department of Computer Science, Bahria University, Pakistan.
Dr. Zeynep Dilek Heperkan, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. T. Lalitha, Computer Science & IT, Jain Deemed-to-be University, Bengaluru, Karnataka, India.
Dr. Aysa Jafari Farmand, Istanbul Technical University, Turkey.
Dr. Mehdi Zahed, School of Applied Sciences and Technology, NAIT, Canada.
Dr. Mehmet Güneş Gençyilmaz, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Gülay Baysal, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Fahimeh Jafari, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Mustansar Ali Ghazanfar, Department of Computer Science and Digital Technologies, University of East London, United Kingdom.
Dr. Sibel Senan, Department of Computer Engineering, Istanbul University, Turkey.
Dr. Mehmet Emin Tacer, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammed Alkrunz, Department of Electrical and Electronics Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammed Vadi, Department of Electrical and Electronics Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey.
TABLE OF CONTENTS
Development of a High Precision Temperature Monitoring System for Industrial Cold Storage
Sarosh Ahmad, Arslan Dawood Butt, Usama Umar, Sajal Naz, Sheza Yasin and Amina Batool .......... 38, 208
Abstract—Recognition of human activities is a challenging process, and technology-based systems are used for the successful realization of the process. Recently, artificial intelligence-based technologies have started to be used widely. The hybrid approach designed for this study consists of a convolutional-based bidirectional long short-term memory (C-BiLSTM). In this study, 12 types of human activities were identified using C-BiLSTM on ECG signal data. As a result of the analysis, an overall accuracy success rate of 98.96% was achieved. The result obtained in the experimental analysis is promising for identifying the types of human activity.

Keywords—bidirectional LSTM, deep learning, ECG measurement, physical activities, human behavior

I. Introduction

Today, recognizing human activities has become easier thanks to various technological approaches (wearable sensors, video, accelerometers, signal data, etc.) [1]. Information on human activity actions is used in areas such as monitoring the elderly, performing health actions correctly, criminal tracking systems, etc. [1], [2]. Technological infrastructure systems are supported by software containing innovative aspects [3]. Artificial intelligence technologies form part of these innovative aspects. Artificial intelligence-based systems offer a service where people can perform their transactions more easily today [4], [5].

Many studies have been conducted on artificial intelligence-based recognition of human activities. Shaohua Wan et al. [6] conducted the recognition of human activities using machine learning and deep learning methods. They used convolutional neural network (CNN), support vector machine (SVM), long short-term memory (LSTM), and multilayer perceptron (MLP) approaches in their study. They achieved the best accuracy success rate of 92.71% with the CNN model. Junfang Gong et al. [7] used social media data to recognize human daily activities. They achieved an overall accuracy rate of 89.35% with the LSTM model they designed in their study. Nozha Jlidi et al. [8] used the transfer learning-based PoseNet model to recognize human activities. They successfully achieved accuracy by emphasizing the body joints. Emilio Sansano et al. [9] used gated recurrent unit networks (GRU), deep belief networks (DBN), and LSTM approaches to recognize human activities. They achieved over 90% overall accuracy success in all of the approaches they used. Sakorn Mekruksavanich et al. [10] proposed a biometric user identification-based approach to defining human activities for the health status monitoring of elderly people. They obtained data using a triaxial gyroscope and triaxial accelerometer. They used the CNN model and LSTM model in the analysis of the data. The classification success of the models was 91.77% and 92.43%, respectively. Negar Golestani et al. [11] presented a wireless approach based on magnetic induction to recognize human activities. They categorized the activities by integrating the magnetic induction system with machine learning techniques and deep learning approaches. They achieved the best performance with the LSTM model. Accuracy success for the two datasets they used was 87% and 98.9%, respectively. S. Tsokov et al. [12] used accelerometer sensors to classify human activity types. They designed a 1D-CNN model to describe the data they obtained according to their types. The overall accuracy success they got from the CNN model they designed was 98.86%.

The goal of this study is to successfully recognize human activities with the proposed approach. The proposed approach hybridizes CNN and bidirectional LSTM models designed using Python libraries. The summary of this article by section is as follows: in the second section, information about the dataset is given. The third section contains information about deep learning approaches and the proposed approach. The fourth section contains the results of the experimental analysis. The last section consists of the Discussion and Conclusion.

II. Dataset

The dataset consists of ECG data that includes various physical activities, created with 10 volunteer participants. Using wearable sensors to measure movement data and vital signs, other feature data were also created to detect activities. Measurements were made at an average rate of 50 Hz during the creation of the ECG data. The activity types created are 13 in total [13], [14]. The repetition/duration equivalents of the activity types that make up the dataset are given in Table 1.

Table I. Repetition/duration equivalents of the activity types that make up the dataset

Activity Type | Times (x) / minute (m)
Nothing | -
Standing still | 1m
Sitting and relaxing | 1m
Lying down | 1m
Walking | 1m
Climbing stairs | 1m
Waist bends forward | 20x
Frontal elevation of arms | 20x
Knees bending | 20x
Cycling | 1m
Jogging | 1m
Running | 1m
Jump front & back | 20x

The dataset contains 14 feature columns for each data operation. In addition, 20% of the dataset was separated as test data in the experimental analysis, and 80% was allocated as training data.
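The preparation described above can be reproduced along these lines; a minimal sketch, assuming the Kaggle copy of the MHEALTH data cited in [13], with 14 feature columns and an "Activity" label column (the file and column names are assumptions, not confirmed by the paper):

```python
# Minimal sketch of the 80/20 split described above (assumed file/column names).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("mhealth_raw_data.csv")     # assumed file name
X = df.drop(columns=["Activity"]).values     # 14 feature columns (assumed label name)
y = df["Activity"].values                    # 13 activity types, coded 0-12

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)   # 80% training / 20% test
```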
III. Model Approaches

This section contains brief information about the models used in the proposed approach.

A. Bi-directional Long Short-Term Memory

The Bi-LSTM model is the combined state of bidirectional recurrent networks (Bi-RNN). The most important feature that distinguishes LSTM from RNN is the gates used as memory units. Input data are processed through input gates and transferred to output gates. The difference of the Bi-LSTM model from the LSTM model is that it can return to the previous context and receive data. That is, in Bi-LSTM models, previous layer information and next layer information are kept in memory gates [15]. In Bi-LSTM models, the number of hidden units and their functioning are calculated according to Eq. (1) and Eq. (2). In these equations, L and H give the numbers of input and hidden units. Here, t represents the time value, x^t is the sequence input, \theta_h the activation function of the hidden unit, w the weight values of the hidden unit, and b_h^t represents the activation of hidden unit h at time t [15], [16].

a_h^t = \sum_{l=1}^{L} x_l^t w_{lh} + \sum_{h'=1}^{H} b_{h'}^{t-1} w_{h'h}, \quad t > 0   (1)

b_h^t = \theta_h(a_h^t)   (2)

B. Convolutional Neural Networks

CNN is an artificial intelligence-based model used in classification, recognition, segmentation, etc. processes by processing input data [17]. These networks generally consist of convolutional layers, pooling layers, and fully connected/dense layers [18]. Apart from that, they can contain different layers in line with the target of the model. CNN models contain hidden layers in their architectural structures. Convolutional layers process the input data with filters to extract activation features [19]. The mathematical formula in Eq. (3) is used to extract the activation maps [20]. In this equation, the variable F represents the layer of the activation map. The variable n represents the number of features in the layer. Variables i, j, and k provide the position information of the input data. Matrix values are represented by the variable M.

F_{i,l}^{j} = \sum_{k=1}^{n} M_{i,l}^{j,k} \times F_k^{i-1}   (3)

Pooling layers reduce the dimensions of the activation maps, and the fully connected layers combine the extracted features [21], [22]. The softmax function produces probability values, as many as the number of feature types, from the fully connected layer and tags the input with the dominant probabilistic type [23].

Also, the ReLU activation function is generally preferred between layers in CNN models. ReLU is an activation function that passes positive input values linearly and keeps negative values at zero [24], [25]. Batch-normalization and dropout functions can be used in CNNs to prevent models from over-/underfitting [26].

C. Proposed Approach

The proposed approach is the result of combining the CNN-based Bi-LSTM model, which aims to describe human physical activities. Using the activity action features and ECG signal data obtained from sensors, the C-BiLSTM model aimed to successfully classify 13 activity types. The C-BiLSTM model is designed completely in the Python programming language, and coding was carried out using libraries such as TensorFlow, Keras, Pandas, and NumPy. The Jupyter Notebook interface program was used in the compilation of the model. The general design of the model is given in Table 2. The proposed approach consists of layers of the CNN model and layers of the LSTM model. Since the two models are designed with open source code, data transitions and parameter values between the models in the Python software must be compatible. Therefore, in the proposed approach, the output values from the last layer of the CNN model were made equal to the input values of the BiLSTM model. In other words, while providing the transition, the tensor numbers, parameter values, and input size are hybridized to be the same. The normalization between the feature values obtained from the convolutional layers was carried out with the batch-normalization layer. Also, by using the dropout layer in the last layers of the proposed approach, inefficient features were prevented from being trained by the model. Thus, training speed and time savings were achieved for the model.

Table II. The general design of the proposed model

Layer | Value / Output Shape
Convolutional | (None, 400, 64)
Batch Normalization & ReLU | (None, 400, 64)
Convolutional | (None, 400, 128)
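For illustration, the Table II design can be sketched in Keras, the library the paper names. The Conv1D widths and output shapes follow the preserved rows of Table II; the kernel sizes, BiLSTM width, and dropout rate are assumptions, since the remainder of the table is not preserved in this copy:

```python
# Minimal sketch of the C-BiLSTM design described in Table II (assumptions noted).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_c_bilstm(timesteps=400, n_features=14, n_classes=13):
    model = models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        layers.Conv1D(64, kernel_size=3, padding="same"),   # -> (None, 400, 64)
        layers.BatchNormalization(),
        layers.ReLU(),                                      # -> (None, 400, 64)
        layers.Conv1D(128, kernel_size=3, padding="same"),  # -> (None, 400, 128)
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.Bidirectional(layers.LSTM(64)),              # assumed BiLSTM width
        layers.Dropout(0.5),                                # assumed dropout rate
        layers.Dense(n_classes, activation="softmax"),      # 13 activity types
    ])
    return model

model = build_c_bilstm()
model.summary()
```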
In addition, in the training of the proposed model, the RMSprop optimization method was preferred and the loss function was chosen as categorical cross-entropy.

IV. Experimental Analysis

The experimental analyses were run on the Google Colab server. Training was performed using the GPU, and the preferred epoch value for training was 100. The other preferred parameters in model compilation were: optimizer method RMSprop, loss function "sparse categorical cross-entropy", and metric function "sparse categorical accuracy". The "early stopping" parameter was used to prevent overfitting in the model. A confusion matrix was used to compare analysis results. Eqs. (5)-(8) were used to calculate the matrix metrics. The variables used in these equations are defined as (TP): true positive, (TN): true negative, (FP): false positive, and (FN): false negative [27]–[29].

Precision (Pre) = \frac{TP}{TP + FP}   (5)

Recall (Rec) = \frac{TP}{TP + FN}   (6)

F-score (F-scr) = \frac{2 \times TP}{2 \times TP + FP + FN}   (7)

Accuracy (Acc) = \frac{TP + TN}{TP + TN + FP + FN}   (8)

The experimental analyses took an average of 16 seconds per epoch, and the training-test success plot for the analyses is shown in Fig. 1. The analysis results obtained from the training are given in Table 3. The confusion matrix is given in Table 4. The success rates of the proposed approach for different epochs are given in Table 5, together with information about the total time spent on training the model. The total training time of the model was 1659 seconds.

Table III. Metric results obtained in experimental analysis

Activity Type | Pre | Rec | F-scr
Nothing | 1.00 | 1.00 | 1.00
Standing still | 1.00 | 1.00 | 1.00
Sitting and relaxing | 1.00 | 1.00 | 1.00
Lying down | 0.99 | 1.00 | 1.00
Walking | 0.99 | 0.99 | 0.99
Climbing stairs | 0.99 | 1.00 | 1.00
Waist bends forward | 1.00 | 0.98 | 0.99
Frontal elevation of arms | 0.98 | 1.00 | 0.99
Knees bending | 0.99 | 0.99 | 0.99
Cycling | 1.00 | 0.99 | 1.00
Jogging | 0.97 | 0.99 | 0.98
Running | 0.92 | 0.94 | 0.93
Jump front & back | 1.00 | 0.83 | 0.91
Overall Acc (all classes) | 0.98

Table IV. The confusion matrix obtained from the proposed approach. (#1: nothing, #2: standing still, #3: sitting and relaxing, #4: lying down, #5: walking, #6: climbing stairs, #7: waist bends forward, #8: frontal elevation of arms, #9: knees bending, #10: cycling, #11: jogging, #12: running, #13: jump front & back)
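A minimal sketch of the reported training configuration (RMSprop, sparse categorical cross-entropy, sparse categorical accuracy, early stopping) and of the per-class metrics of Eqs. (5)-(8), reusing the model and split from the sketches above and assuming the samples have already been windowed into (400, 14) segments; the patience value is an assumption:

```python
# Minimal sketch of the training setup and Table III-style metrics.
import tensorflow as tf
from sklearn.metrics import classification_report

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["sparse_categorical_accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                              patience=5,            # assumed
                                              restore_best_weights=True)

# X_train / X_test assumed windowed to shape (n_samples, 400, 14)
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=100, callbacks=[early_stop])

# Per-class precision, recall, and F-score, as in Eqs. (5)-(8)
y_pred = model.predict(X_test).argmax(axis=1)
print(classification_report(y_test, y_pred))
```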
Table V. Time-accuracy success chart obtained in the training process.

Epoch | Training Time (sec.) | Training Acc. (%) | Test Acc. (%)
25 | 442 | 98.73 | 96.71
50 | 846 | 99.21 | 97.93
75 | 1251 | 99.40 | 97.14
100 | 1659 | 99.64 | 98.96

As a result of the analysis, the overall accuracy success of the proposed approach in recognizing human activity types was 98.96%. The overall accuracy success obtained from the training data was 99.64%. The proposed approach produced results close to 100% in the analysis of data from the ECG and three types of sensor devices.

V. Conclusion

Measurement of human activities is carried out using various mechanical/electronic devices. Artificial intelligence-based approaches have become indispensable for a faster and more accurate realization of such systems [30]. In this study, an artificial intelligence-based identification system was proposed using ECG and sensor measurement data of physical activities. The proposed approach has a structure derived from the integration of the CNN model with the Bi-LSTM model. The contribution of the proposed approach is that it performs more efficient analyses than traditional methods and can increase success with a hybrid structure. A 98.96% success rate was achieved in the study analyses. As a result, the contribution of the hybrid approach was observed in the experimental analyses performed in this study. The results obtained showed that the proposed approach is promising.

In future studies, hybrid models will be designed on video data of real-time human activities. In addition, metaheuristic methods that optimize time savings will be used in the hybrid models.

References

[1] C. Jobanputra, J. Bavishi, and N. Doshi, "Human Activity Recognition: A Survey," Procedia Comput. Sci., vol. 155, pp. 698–703, 2019.
[2] M.-S. Dao, T.-A. Nguyen-Gia, and V.-C. Mai, "Daily Human Activities Recognition Using Heterogeneous Sensors from Smartphones," Procedia Comput. Sci., vol. 111, pp. 323–328, 2017.
[3] J. Rose and B. Furneaux, "Innovation Drivers and Outputs for Software Firms: Literature Review and Concept Development," Adv. Softw. Eng., vol. 2016, p. 5126069, 2016.
[4] R. Vinuesa et al., "The role of artificial intelligence in achieving the Sustainable Development Goals," Nat. Commun., vol. 11, no. 1, p. 233, Jan. 2020.
[5] E. Prem, "Artificial Intelligence for Innovation in Austria," Technol. Innov. Manag. Rev., vol. 9, no. 12, 2019.
[6] S. Wan, L. Qi, X. Xu, C. Tong, and Z. Gu, "Deep Learning Models for Real-time Human Activity Recognition with Smartphones," Mob. Networks Appl., vol. 25, no. 2, pp. 743–755, 2020.
[7] J. Gong, R. Li, H. Yao, X. Kang, and S. Li, "Recognizing Human Daily Activity Using Social Media Sensors and Deep Learning," Int. J. Environ. Res. Public Health, vol. 16, no. 20, p. 3955, Oct. 2019.
[8] N. Jlidi, A. Snoun, T. Bouchrika, O. Jemai, and M. Zaied, "PTLHAR: PoseNet and transfer learning for human activities recognition based on body articulations," in Proc. SPIE, 2020, vol. 11433.
[9] E. Sansano, R. Montoliu, and Ó. Belmonte Fernández, "A study of deep neural networks for human activity recognition," Comput. Intell., vol. 36, no. 3, pp. 1113–1139, Aug. 2020.
[10] S. Mekruksavanich and A. Jitpattanakul, "Biometric User Identification Based on Human Activity Recognition Using Wearable Sensors: An Experiment Using Deep Learning Models," Electronics, vol. 10, no. 3, p. 308, Jan. 2021.
[11] N. Golestani and M. Moghaddam, "Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks," Nat. Commun., vol. 11, no. 1, p. 1551, 2020.
[12] S. Tsokov, M. Lazarova, and A. Aleksieva-Petrova, "Accelerometer-based human activity recognition using 1D convolutional neural network," IOP Conf. Ser. Mater. Sci. Eng., vol. 1031, no. 1, p. 12062, 2021.
[13] G. Jain, "Mobile Health Human Behavior Analysis," Feb. 2021. [Online]. Available: https://www.kaggle.com/gaurav2022/mobile-health. [Accessed: 30-Mar-2021].
[14] O. Banos et al., "Design, implementation, and validation of a novel open framework for agile development of mobile health applications," Biomed. Eng. Online, vol. 14, Suppl. 2, p. S6, 2015.
[15] I. N. Yulita, M. I. Fanany, and A. M. Arymuthy, "Bi-directional Long Short-Term Memory using Quantized data of Deep Belief Networks for Sleep Stage Classification," Procedia Comput. Sci., vol. 116, pp. 530–538, 2017.
[16] C. Zhang, D. Biś, X. Liu, and Z. He, "Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks," BMC Bioinformatics, vol. 20, no. 16, p. 502, 2019.
[17] R. Yang and Y. Yu, "Artificial Convolutional Neural Network in Object Detection and Semantic Segmentation for Medical Imaging Analysis," Front. Oncol., vol. 11, p. 638182, Mar. 2021.
[18] S. Asif and K. Amjad, "Automatic COVID-19 Detection from chest radiographic images using Convolutional Neural Network," medRxiv, p. 2020.11.08.20228080, Jan. 2020.
[19] W. S. Ahmed and A. A. A. Karim, "The Impact of Filter Size and Number of Filters on Classification Accuracy in CNN," in 2020 International Conference on Computer Science and Software Engineering (CSASE), 2020, pp. 88–93.
[20] H. J. Jie and P. Wanda, "RunPool: A Dynamic Pooling Layer for Convolution Neural Network," Int. J. Comput. Intell. Syst., vol. 13, no. 1, p. 66, 2020.
[21] R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi, "Convolutional neural networks: an overview and application in radiology," Insights Imaging, vol. 9, no. 4, pp. 611–629, 2018.
[22] S. H. S. Basha, S. R. Dubey, V. Pulabaigari, and S. Mukherjee, "Impact of fully connected layers on performance of convolutional neural networks for image classification," Neurocomputing, vol. 378, pp. 112–119, 2020.
[23] H. A. Almurieb and E. S. Bhaya, "SoftMax Neural Best Approximation," IOP Conf. Ser. Mater. Sci. Eng., vol. 871, p. 12040, 2020.
[24] C. Banerjee, T. Mukherjee, and E. Pasiliao, "An Empirical Study on Generalizations of the ReLU Activation Function," in Proceedings of the 2019 ACM Southeast Conference, 2019, pp. 164–167.
[25] A. Sawant, M. Bhandari, R. Yadav, R. Yele, and S. Bendale, "Brain Cancer Detection From MRI: A Machine Learning Approach (TensorFlow)," Int. Res. J. Eng. Technol., vol. 05, no. 04, p. 2089, 2018.
[26] C. Garbin, X. Zhu, and O. Marques, "Dropout vs. batch normalization: an empirical study of their impact to deep learning," Multimed. Tools Appl., vol. 79, no. 19, pp. 12777–12815, 2020.
[27] F. Demir, A. Şengür, V. Bajaj, and K. Polat, "Towards the classification of heart sounds based on convolutional deep neural network," Heal. Inf. Sci. Syst., vol. 7, no. 1, p. 16, 2019.
[28] M. Hasnain, M. F. Pasha, I. Ghani, M. Imran, M. Y. Alzahrani, and R. Budiarto, "Evaluating Trust Prediction and Confusion Matrix Measures for Web Services Ranking," IEEE Access, vol. 8, pp. 90847–90861, 2020.
[29] D. Chicco and G. Jurman, "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation," BMC Genomics, vol. 21, no. 1, p. 6, 2020.
[30] T. Davenport, A. Guha, D. Grewal, and T. Bressgott, "How artificial intelligence will change the future of marketing," J. Acad. Mark. Sci., vol. 48, no. 1, pp. 24–42, 2020.
Preventing Oscillation of Supply Voltage Due to
Resonance Harmonic Frequency of a Power Factor
Corrected System
Adel Ridha Othman
Electromechanical Engineering Department
University of Technology, Baghdad, Iraq
adel.r.othman@uotechnology.edu.iq
Abstract—The variable speed AC and DC drives that use power electronics produce high levels of harmonic distortion. The simulated distribution system of this paper consists of a 150 kVA generator supplying a DC drive of 75 kW of silicon controlled rectifier (SCR) type. A shunt capacitor bank is connected at the point connecting the supply with the load, which is called the point of common coupling (PCC#2), to correct the power factor; it is therefore called a power factor correction (PFC) capacitor. The parallel circuit of the PFC capacitor and the inductance of the system has a resonance frequency equal to the 11th harmonic of the system, and that causes an oscillation and distortion in the supply voltage. A shunt passive filter, single tuned to the resonance frequency of the 11th harmonic component, is designed to mitigate the voltage total harmonic distortion (VTHD) into compliance with IEEE Std 519:2014 and to prevent the oscillation of the supply voltage.

Keywords—Harmonic, resonance, filter, distortion.

I. Introduction

The aim of studying harmonic distortion is to calculate the harmonic currents, voltages, and the percentages of the distortion indices in an electrical system, then analyze the resonance situation and design filters to mitigate its effect on the power system [1]. For reducing harmonics, the power converter used should operate with 12 pulses or be a higher-pulse static converter. But due to the frequent maintenance required by 12-pulse converters, six-pulse converters are almost always used instead. Converters with a lower number of pulses inject large values of 5th and 7th harmonics, as well as the harmonic orders related to the 12-pulse characteristic harmonics. These are traditionally filtered out by designing tuned filters to mitigate the low order harmonics such as the 5th, 7th, 11th, 13th, 17th, and higher orders [2]. In [3] various passive filters are used to mitigate the harmonic distortion and correct the power factor; in that work single and double tuned filters are designed and investigated for mitigating the harmonics. In [4] the harmonic distortion injected in a 20 kV distribution system is mitigated using passive and active harmonic filters. In [5] a double tuned filter is designed to mitigate the harmonics. In [6] passive harmonic filters are used to mitigate the source side current harmonics of a rectifier and a DC-DC chopper feeding a DC motor. In [7] an LC filter is connected at the input of a DC motor to reduce the high frequency harmonics and torque ripples. In [8] the harmonic distortion of industrial sources is analyzed, and harmonic filters are designed to reduce the effects of these harmonics on the industrial supply. In [9] the harmonic voltage in a large power system containing AC-DC converters with variable large capacity loads is mitigated using a passive filter. In [10] different applications of passive and active harmonic filters are presented for mitigation of harmonic distortion. In [11] an active filter is used to mitigate the harmonics, and an easy control method that does not need transformation between power system frameworks is contributed. In [12] a Matlab/Simulink simulation is used to implement an active power filter that selects the current harmonics to be mitigated. In [13] synchronous reference frame theory is used to generate the reference current signal for a voltage source inverter controlled by a hysteresis control method, simulating a shunt active power filter based on the synchronous reference frame and hysteresis control method to reduce the harmonic distortion in a power system. [14] presents a case study of a hybrid active filter and a passive notch filter to improve the total harmonic distortion in an industrial power system. [15] presents an active filter to mitigate the harmonics generated by the nonlinear loads of a photovoltaic system connected to the grid. In [16] Matlab/Simulink software is used to simulate an active filter that improves the total harmonic distortion in a photovoltaic grid-connected system controlled by a proportional resonance control method; the effectiveness of the filter is investigated using different load types. [17] presents Matlab/Simulink modelling of a photovoltaic (PV) grid-connected system to study the optimal location of the PV system relative to the utility grid for minimum harmonic distortion without implementing harmonic filters. In [18] an active filter is used to mitigate the harmonic distortion in wind turbine power plants, and the optimal location of the filter is studied. In [19] an active filter is simulated with Matlab/Simulink software for an industrial power system using a nonlinear power electronic load that injects harmonics into the power system. In [20] a digital simulation of a shunt active filter is used to compensate for harmonics and reactive power. In [21] a single tuned filter is designed using the Electromagnetic Transient Analysis Program (ETAP) to mitigate the current harmonics injected by variable speed drives (VSD). In this work the simulated distribution system is composed of a 150 kVA generator supplying a DC drive of 75 kW of six-pulse silicon controlled rectifier (SCR) type. A parallel capacitor is added at the common coupling point (PCC#2) to correct the power factor (PF). The frequency of resonance of
the parallel circuit of that capacitor and the inductance of the system happens to be the 11th harmonic, and that causes an oscillation and distortion in the supply voltage. A shunt passive filter single tuned to the 11th order harmonic is designed to mitigate the voltage total harmonic distortion (VTHD) into compliance with IEEE Std 519:2014 and to prevent the oscillation of the supply voltage [22].

h_r = \sqrt{\frac{kVA_{short\ circuit}}{kVAR_{capacitor\ bank}}}   (1)

From equation (1) the resonant harmonic order h_r can be calculated by knowing the short circuit power level of the system (kVA short circuit) and the installed capacitor bank reactive power rating (kVAR capacitor bank). Figure I shows the one-line diagram of the distribution system.

The voltage and current waveforms and their harmonic spectra are shown in figure II.
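As a quick numeric check of Eq. (1), a sketch with placeholder values (not the paper's system data), assuming the conventional square-root form of the resonance expression:

```python
# Resonant harmonic order of a PFC capacitor bank, per Eq. (1).
import math

def resonant_harmonic(kva_short_circuit, kvar_capacitor_bank):
    # Harmonic order at which the bank resonates with the system inductance
    return math.sqrt(kva_short_circuit / kvar_capacitor_bank)

# Placeholder values chosen so the bank resonates at the 11th harmonic
print(resonant_harmonic(7623, 63))  # -> 11.0
```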
The capacitor bank current waveform and its harmonic spectrum are shown in figure III.

ISC / IL | 11.7
S (kVA) | 88.3
Q (kVAR) | 28.9
P (kW) | 83.5

III. Design of a Series Passive Single Tuned Filter (SPSTF)

An R-L-C circuit connected in series and tuned to a specified frequency constitutes a series single tuned filter (STF), and that specified frequency is the resonance frequency of the series R-L-C. The 11th harmonic component is shown in table II to be of the highest value among the other harmonics. At PCC#2 a filter of type SPSTF tuned to the 11th harmonic component is connected. The SPSTF must supply reactive power to compensate the power factor, which is calculated to improve the true power factor (TPF) [24]. In this work the TPF is improved from 0.55 to 0.9. From table I:

TPF = 0.55 (True Power Factor)
Active power = 63.2 kW
The phase angle of the TPF is φ_TPF = arccos(0.55) = 57°, so tan φ_TPF = 1.5.
The PF is intended to be 0.9, so φ_required = 26° and tan φ_required = 0.5.
The compensation reactive power = load (kW) × [tan φ_TPF − tan φ_required] = 63.2 × [1.5 − 0.5] = 63.2 kVAR

X_C = \frac{V_{LL}^2}{VAR} = \frac{400^2}{63.2 \times 10^3} = 2.5\ \Omega   (2)

X_C = \frac{1}{2\pi f C}   (3)

C = \frac{1}{2\pi f X_C} = \frac{1}{2\pi \times 550 \times 2.5} = 116\ \mu F   (4)

X_C = X_L   (5)

L = \frac{2.5}{2\pi \times 550} = 0.72\ mH   (6)

R = \frac{X_L}{Q.F} = \frac{2.5}{40} = 0.06\ \Omega   (7)

where Q.F is the inductor quality factor, assumed equal to 40.

Figure IV shows the voltage and current waveforms with their respective harmonic spectra with the filter connected; in comparison with figure II it is clear that the waveforms and harmonic spectra are greatly improved.

Figure IV: Voltage and Current Waveforms at Resonance and the Respective Spectrum at PCC#2
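The component sizing in Eqs. (2)-(7) can be replayed numerically; a minimal sketch using the paper's figures (400 V line voltage, 63.2 kVAR compensation, the 11th harmonic of a 50 Hz system, quality factor 40), noting that the paper rounds X_C to 2.5 Ω before Eq. (4):

```python
# Series passive single tuned filter (SPSTF) sizing, per Eqs. (2)-(7).
import math

V_LL = 400.0          # line-to-line voltage, V
Q_comp = 63.2e3       # required reactive compensation, VAR
f_tuned = 11 * 50.0   # tuning frequency, Hz (11th harmonic of 50 Hz)
QF = 40               # inductor quality factor

Xc = round(V_LL**2 / Q_comp, 1)       # Eq. (2): 2.5 ohm (rounded, as in the paper)
C = 1 / (2 * math.pi * f_tuned * Xc)  # Eqs. (3)-(4): ~116 uF
Xl = Xc                               # Eq. (5): series-tuned condition
L = Xl / (2 * math.pi * f_tuned)      # Eq. (6): ~0.72 mH
R = Xl / QF                           # Eq. (7): ~0.06 ohm

print(f"C = {C*1e6:.0f} uF, L = {L*1e3:.2f} mH, R = {R:.3f} ohm")
```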
V. Discussion

Table IV shows that the SPSTF with a resonance frequency at the 11th harmonic managed to mitigate the VTHD into compliance with IEEE Std 519:2014, and the filter has eliminated the resonance; therefore the voltage oscillation is also eliminated, as shown in figure IV. The ITHD is still not in compliance with (i.e., is out of the limit of) IEEE Std 519:2014, but it is lowered greatly from 31.5% to 6.3%, and the individual current harmonic component distortions are also reduced, as is clear by comparing the values in table II with table IV. Other SPSTF branches can be designed and installed to mitigate the 7th harmonic. To bring the ITHD into compliance with IEEE Std 519:2014, a high pass filter can be used.

Table IV. Compliance with IEEE Std 519:2014

Calculated Parameter | Harmonic No. | Calculated Value [%] | IEEE Std 519:2014 Limit [%] | Result
VTHD | - | 5.3 | 8.0 | PASS
MVDHC | 7 | 3.5 | 5.0 | PASS
ITHD | - | 6.3 | 5.0 | FAIL
MIDHC | 7 | 5.6 | 4.0 | FAIL
MIDHC 11 to 16 | 11 | 2.1 | 2.0 | FAIL
MIDHC 17 to 22 | 17 | 1.1 | 1.5 | PASS
MIDHC 23 to 34 | 23 | 0.7 | 0.6 | FAIL
MIDHC 35 | 35 | 0.3 | 0.3 | PASS

VI. References

[1] J. C. Das, "Power System Analysis: Short-Circuit Load Flow and Harmonics," Amec, Inc., Atlanta, Georgia.
[2] J. Arrillaga and N. Watson, "Power Systems Harmonics," 2nd ed., Wiley, New York, 2003.
[3] S. N. AL. Yousif, M. Z. C. Wanik, A. Mohamed, "Implementation of Different Passive Filter Designs for Harmonic Mitigation," National Power & Energy Conference (PECon) 2004 Proceedings, Kuala Lumpur, Malaysia.
[4] M. Rusli, M. Ihsan, D. Setiawan, "Single Tuned Harmonic Filter Design as Total Harmonic Distortion Compensator," 23rd International Conference on Electricity Distribution, Lyon, 15-18 June 2015.
[5] Y.-H. He, H. Su, "A New Method of Designing Double-Tuned Filter," Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013).
[6] H. Prasad, M. Chilambarasan, T. D. Sudhakar, "Application of Passive Harmonic Filters to Mitigate Source Side Current Harmonics in an AC-DC-DC System," IJRET: International Journal of Research in Engineering and Technology, vol. 03, no. 01, Jan. 2014, http://www.ijret.org.
[7] A. Albert Rajan, S. Vasantharathna, "Harmonics and Torque Ripple Minimization Using L-C Filter for Brushless DC Motors," International Journal of Recent Trends in Engineering, vol. 2, no. 5, Nov. 2009.
[8] T. M. Aye, S. W. Naing, "Analysis of Harmonic Reduction by Using Passive Harmonic Filters," International Journal of Scientific Engineering and Technology Research, vol. 03, no. 45, Dec. 2014, pp. 9142-9147.
[9] B. Park, J. Lee, H. Yoo, and G. Jang, "Harmonic Mitigation Using Passive Harmonic Filters: Case Study in a Steel Mill Power System," Energies, vol. 14, 2278, 2021, https://doi.org/10.3390/en14082278.
[10] L. Motta, N. Faúndes, "Active/Passive Harmonic Filters: Applications, Challenges & Trends," IEEE, 2016.
[11] A. Medina-Rios and H. A. Ramos-Carranza, "An Active Power Filter in Phase Coordinates for Harmonic Mitigation," IEEE Transactions on Power Delivery, vol. 22, no. 3, July 2007.
[12] L. A. Cleary-Balderas, A. Medina-Rios, "Selective Harmonic Current Mitigation with a Shunt Active Power Filter," IEEE, 2012.
[13] K. Bhattacharjee, "Harmonic Mitigation by SRF Theory Based Active Power Filter using Adaptive Hysteresis Control," 2014 Power and Energy Systems: Towards Sustainable Energy (PESTSE 2014).
[14] H. Tischer, T. Pfeifer, "Hybrid Filter for Dynamic Harmonics Filtering and Reduction of Commutation Notches - A Case Study," IEEE, 2016.
[15] M. J. M. A. Rasul, H. V. Khang, M. Kolhe, "Harmonic Mitigation of a Grid-connected Photovoltaic System using Shunt Active Filter," IEEE, 2017.
[16] J. C. Colque, J. L. Azcue, E. Ruppert, "Photovoltaic system grid-connected with active power filter functions for mitigating current harmonics feeding nonlinear loads," 2018 13th IEEE International Conference on Industry Applications.
Arranging Spaces for the Next Pandemic by Drawing Inspiration from the Qajar Period
Parastoo Pourvahidi
Department of Architecture, Faculty of
Fine Arts, Design and Architecture,
Cyprus International University, Via
Mersin 10, 99258 Nicosia, Turkey
ppourvahidi@ciu.edu.tr
yard. Spaces such as the access room, house pool, and interior yard are the ones used by family members. In these spaces even close relatives and family members meet up for activities. Furthermore, spaces like the basement, the two-door room, and the back room are considered private zones in Iranian traditional houses. These spaces were used for relaxation, study, sleep, and chat; therefore they are positioned at the utmost part of the house [5]. Figure I represents the arrangement of the spaces in traditional houses.

space such as the courtyard; after the courtyard, users enter the doorway, which was again a semi-open space. However, today's residential building plans are composed of completely closed areas with a small percentage of semi-open spaces (the balcony). The balcony in most contemporary buildings is small; hence, residents always change its function to storage rather than a place for sitting and enjoying a comfortable space. The lack of space, the increasing population, and the price of land based on closed spaces are the causes of all these changes over the passing years. But the pandemic period changed this routine and alerted residents to the value of open and semi-open spaces in the house for communicating with neighbors, freshening the air, and spending lockdown in open space without fear of the virus. Furthermore, figure II shows the arrangement of the spaces in a contemporary building plan, which allows the users to meet the necessity of having semi-open spaces such as the balcony.
architectural form and phenomena. Space syntax is defined as one of the methods which can explore space morphology [3]. The space syntax method has the purpose of noticing the social relationships between spaces [6]. From another point of view, space syntax attempts to detect the reason for the independence of each space and tries to state each space based on its position [7]. There are different kinds of software applicable for the space syntax method; this research used depthmapX for the analysis.

depthmapX is analysis software for representing the spatial network of different spaces. Alasdair Turner (Space Syntax group) developed this software. The purpose of this software is to create a map which shows the spatial elements and the relationships between them [4]. Hence, depthmapX is one of the methods used in this research.

Depth map analysis manifests the values of TDn, MDn, i, CV, and RA for each space. Also, Ostwald mentioned that for comparing buildings with each other, the outcomes must be normalized in terms of relative depth, which is called Relative Asymmetry (RA). This result lies between 0 and 1 [11]. Segregation is specified by a high value of Relative Asymmetry, and integration by a low value of Relative Asymmetry [8].

RA is the reflection of the relative isolation of a carrier space. Afterwards, the integration value i can be calculated from the share of that node. The table below represents the formulas for the relationship between the RA and i values (table I) [11].

Table I: RA and i formulas [11].
RA: a measure of how deep a system is (Relative Asymmetry).
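The body of Table I is not fully preserved in this copy; the standard justified plan graph definitions that Ostwald [11] uses, assumed here to match the lost table content, are:

```latex
% Justified plan graph measures (after Ostwald [11]); assumed content of Table I.
% TD_n: total depth of node n; K: number of nodes (spaces) in the graph.
\begin{align}
  MD_n &= \frac{TD_n}{K-1}            && \text{mean depth}\\
  RA_n &= \frac{2\,(MD_n - 1)}{K - 2} && 0 \le RA_n \le 1\\
  i_n  &= \frac{1}{RA_n}              && \text{integration value}
\end{align}
```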
Before starting the analysis with depthmapX, this research first analyzed the spaces in the traditional building plan of the Qajar period in general. The analysis is clarified as follows:

The proportion of open and semi-open spaces (blue color) and closed spaces (yellow color) demonstrated that open spaces were as important as closed spaces during the Qajar period. In addition, during a pandemic, residents could spend more time outside, even invite people over, and there is enough space for social activities (fig. III).

Figure III: Traditional building plan in the Qajar period (a, b), indicating open (blue) and closed (yellow) spaces.

The dash lines demonstrate the combination of the living room with the courtyard. The interesting thing about these spaces is the partition wall, which was flexible.

The pandemic period demonstrates how an open plan creates problems for inhabitants when one of them should work online and the children should attend school online; with no partition wall, concentration would be impossible. However, in this Qajar period plan, having a flexible partition wall makes separating the spaces possible and beneficial during a pandemic (fig. IV).

Figure IV: Dash lines represent the flexible space in the Qajar period.

One of the first spaces after the courtyard is the bathroom.

Figure V: Dash lines represent the bathroom space in the Qajar period.

There is a special space which they call the shoe-removing space. This space is an open space, in order to remove the smell, and is the second space after the connecting corridor to the living room. These kinds of spaces are essential during the Coronavirus disease period, in order to leave all the dirty shoes outside of the living space and to stay in an open space to become clean with sunlight (fig. VI).
Designing the corridor (yellow color) between the spaces is another positive fact in the plan of the Qajar period. Entering a space directly with shoes can cause a lot of problems during a pandemic. However, if each space is separated by a connecting corridor, it can help the inhabitants stay sanitary against Coronavirus disease (fig. VII).

Figure VII: Connecting corridor placed beyond each space.

Result

The depth map analysis in table II illustrates the axial analysis, in which the blue lines (low connectivity) demonstrate the sanitary areas and the red lines (high connectivity) the dirty spaces, in both the Qajar and contemporary buildings. Generally, drawing the graph of the case studies aids this research in manifesting the connectivity of each space; similarly, understanding the hygiene of the spaces becomes conceivable. Hence, this research analyzes the spaces based on the integration value alone.
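These measures can also be computed for any plan graph without depthmapX; a minimal sketch, assuming networkx and a toy room adjacency (illustrative only, not one of the paper's case studies):

```python
# Total depth (TDn), mean depth (MDn), relative asymmetry (RA), and
# integration (i) for each space of a toy plan graph.
import networkx as nx

plan = nx.Graph()
plan.add_edges_from([
    ("entrance", "wc"), ("entrance", "living room"),
    ("living room", "kitchen"), ("living room", "balcony"),
    ("living room", "bedroom"),
])

K = plan.number_of_nodes()
for node in plan.nodes:
    depths = nx.single_source_shortest_path_length(plan, node)
    TD = sum(depths.values())        # total depth from this space
    MD = TD / (K - 1)                # mean depth
    RA = 2 * (MD - 1) / (K - 2)      # relative asymmetry, 0..1
    i = 1 / RA                       # integration value
    print(f"{node:12s} TD={TD} MD={MD:.2f} RA={RA:.2f} i={i:.2f}")
```

Running this prints the highest integration for the living room (the carrier of the toy graph) and the lowest for the WC, mirroring how the paper reads dirty versus clean spaces from i.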
Table II: Analysis of the plan in the Qajar period throughout the pandemic (axial analysis) [12].

Table III: Analysis of contemporary building case No. A with the Agraph method [12] (plan / depth map / graph).
Legend: 0. Entrance, 1. WC, 2. Bathroom, 3. Living room, 4. Bedroom, 5. Bedroom, 6. Kitchen, 7. Balcony.

Result: The highest i represents the dirty space, which is the living room in this case. The lowest i represents the cleanest space, which is the balcony and the entrance space. Therefore, there is something wrong in the organization of space in the contemporary building: why should the balcony be the cleanest space?

Table IV: Analysis of contemporary building case No. B with the Agraph method [12] (plan / depth map / graph).
Legend: 0. Entrance, 1. WC, ..., 4. WC, 5. Bathroom, 6. Bedroom, 7. Bedroom, 8. Balcony.

Result: In this case study, the highest i (dirty space) value is again the living room.

Suggestion plan [12].
Legend: 0. Entrance, 1. Living room, 2. Kitchen, 3. Balcony, 4. WC, 5. Bathroom, 6. Bedroom, 7. Bedroom, 8. Balcony, 9. WC, 10. Balcony.

Result: After rearranging the spaces, the i value of the balcony changes to the same value as the bedrooms. Although the i value of the living room is still high, at least it is connected to the balcony (semi-open space) for refreshing the air in that space.
Moreover, in the traditional house of the Qajar period, the lowest connectivity is detected in the bedrooms and store room, while the highest connectivity is detected in the yards and corridor. Furthermore, the kitchen and bathroom show low connectivity as well. Nevertheless, in the contemporary buildings the lowest connectivity is detected in the bathroom and toilet, and the highest connectivity is detected in the living room.

After rearranging the spaces, the balcony's i value increases to the same as the bedroom's, which is convenient.

In case number one (A) (table III) and case number two (B) (table IV), before renovation, connectivity is highest in the living room (i=21 and i=14) and lowest in the balcony (A: i=1.75; B: i=2) and bedroom (A: i=2.62; B: i=3.11). Also, after renovation, connectivity is highest in the living room (A: i=45; B: i=21) and the balcony (A: i=3; B: i=4.5), and lowest in the two bedrooms (A: i=3; B: i=4.5).

Before rearranging the spaces, the integration value of the balcony (open space) is less than the other spaces, even less than the entrance. However, it is the best space during a pandemic, where people can keep social distance. In some of the buildings the balcony is located after the kitchen area, meaning access to this space must be through the kitchen. The WC is located next to the entrance door, which is good for hygiene prior to entering the main space. The kitchen can be treated like the balcony that is next to the entrance, and the kitchen could be placed partially in the living room. The partition between the living room and balcony can be a flexible wall, in order to extend the living room during winter and likewise extend the balcony during summer. Generally, the results of all the cases based on the Agraph method analysis demonstrated interesting values of TDn and integration for the balconies before and after renovating the plan. The TDn value of the balcony before renovation is the same as the entrance gate's, but after extending the balcony toward the living room the total depth value decreases noticeably. In addition, the integration value of the balcony was the lowest before renovating the plan, but after renovating it increased, which is a satisfying integration value for the balcony. Overall, extending the balcony toward the living room produces the nourishing outcome which this research expected. The pandemic period teaches architects that designing a beautiful closed space is not adequate for pleasing the user, since open spaces are needed as much as closed spaces.

Also based on the analysis, there is a decline of the RA value in both cases after renovation. This means that the segregation between the spaces (nodes) converts to integration due to the diminishing value of RA. Generally, the decreasing RA value in spaces like the balcony (semi-open) after renovation represents more integration of these spaces, which is satisfying. Subsequently, the mean RA value in the Qajar period is 0.22, which is lower than that of the contemporary houses: the RA value of case A is 0.36 and after renovation 0.30, and the RA value of case B is 0.31 and after renovation 0.21. Hence, the decline of the RA values after reorganizing the spaces is a satisfying development.

Therefore, based on all the analysis, these are the recommended rules, which can supplement the municipality's rules for modern buildings so that residents can be prepared for the next pandemic.

Rules and regulations for the next pandemic:
- The integration value should be high for the living room and balcony; the balcony should not have low integration.
- The wall of the balcony should be flexible, in order to be extended toward the living room based on the appeal of the user.
- The living room must be connected to semi-open spaces (balcony).
- Having open spaces such as a courtyard and balcony is a must in each residential building.
- There should be balance in the arrangement of semi-open, open, and closed spaces. (The pandemic period evidenced the fact that people prefer to spend some time during the day on the balcony just to see outside, for peace of mind. Nonetheless, unfortunately, the area of the balcony compared to the closed space is less than half in modern buildings.)
- Living room with partition wall (inhabitants could still keep the open plan, but by adding a partition glass or wall they could have the opportunity to separate the area whenever the children's online class is starting or they must do online work, which needs concentration).
- Roof function (most apartment buildings have an immense area on the roof which goes unused; COVID-19 clarified the fact that people need social interaction, so maybe the roof can be the space where the inhabitants of an apartment building can gather, since it is an open space).
- Positioning the WC first, next to the entrance door (washing hands was the first and most primitive way to stay safe during the pandemic period; thus locating the WC next to the entrance door could make it the first space a resident enters to become sanitized before entering the main space).
- Positioning a corridor before entering private spaces such as rooms (connecting spaces such as corridors create the chance not to enter private spaces directly; therefore, they can stay disinfected).
- Entrance hall (this space can be useful for removing shoes at the entrance section of the house, a space which was unfortunately obliterated from the space organization of the modern building. Nevertheless, this space can be semi-open for better ventilation, or the space can be ventilated mechanically.)

IV. Conclusion

COVID-19 signifies that the plan design of apartment houses should change, since user requirements have changed as well. The pandemic period states other desires of the user, such as the need for socializing in open space without fear of interacting with the virus. Afterwards, people tried to start
socializing with each other from small balcony in case to [7] Hillier, B. (2005). The art of place and the science of space.
nourishing their feeling during lock down. Thus, balcony World architecture 11(185), 96-102.
become the spot in all building for the place which user can [8] Khadiga, O. M., & Mamoun , S. (n.d.). the space syntax
methodology: fits and misfits. Art and comport, 189-204.
have connection with outside and also with their neighbors.
What if the balcony has the potential for becoming extended [9] Kong, M. (2017). Semi-Open Space and Micr Semi-Open
Space and Micro-Envir o-Environmental Contr onmental
from balcony for inviting the friends without fearing of Control for Impr ol for Improving. Dissertations - ALL. 810:
dispersing the virus? Renovation and adding one flexible glass Syracuse University.
wall in living room, which can have an access toward open [10] organization, w. h. (2020, July 29 ). Coronavirus disease
spaces for having, better indoor ventilation and entering the (COVID-19): Ventilation and air conditioning in public spaces
solar radiation, can be the rapid and worthwhile solution for and buildings. Retrieved from Q&A Detail:
enduring another pandemic. Also, this research recommended https://www.who.int/news-room/q-a-detail/coronavirus-
disease-covid-19-ventilation-and-air-conditioning-in-public-
to add the mandatory rule in municipality that doorway and spaces-and-buildings
accessible spaces such as passage should be open or semi- [11] Ostwald , M. J. (2011). The mathematics of spatial
open spaces. Furthermore, the percentage of the open space configuration: revisiting revising and crtiquing justified plan
(balcony) toward the close spaces should have specific graph theory. Nexus Netwrok journal, 445-470.
dimension for averting havoc for the next pandemic period. [12] Safarkhani, M. (2016). N PARTIAL FULFILLMENT OF THE
References

[1] Safarkhani, M. (2016). In Partial Fulfillment of the Requirements for the Degree of Master of Architecture in Architecture. The Graduate School of Natural and Applied Sciences of Middle East Technical University.
[2] Alitajer, S., & Molavi Nojumi, G. (2016). Privacy at home: Analysis of behavioral patterns in the spatial configuration of traditional and modern houses in the city of Hamedan based on the notion of space syntax. Frontiers of Architectural Research, 341-352.
[3] Bahrainy, H., & Taghabon, S. (2015). Deficiency of the space syntax method as an urban design tool, designing traditional urban space and the need for some supplementary methods. Space Ontology International Journal, 1-18.
[4] depthmapX development team. (2017). depthmapX (Version 0.6.0). Retrieved from https://github.com/SpaceGroupUCL/depthmapX/
[5] Zolfagharzadeh, H., Jafariha, R., & Delzendeh, A. (2017). Different Ways of Organizing Space Based on the Architectural Models of Traditional Houses: A New Approach to Designing Modern Houses (Case Study: Qazvin's Traditional Houses). Space Ontology International Journal, 6(4), 17-31.
[6] Hillier, B. (1996). Space is the machine: A configurational theory of architecture. Cambridge: Cambridge University Press.
[7] Hillier, B. (2005). The art of place and the science of space. World Architecture, 11(185), 96-102.
[8] Khadiga, O. M., & Mamoun, S. (n.d.). The space syntax methodology: fits and misfits. Art and Comport, 189-204.
[9] Kong, M. (2017). Semi-Open Space and Micro-Environmental Control for Improving. Dissertations - ALL, 810. Syracuse University.
[10] World Health Organization. (2020, July 29). Coronavirus disease (COVID-19): Ventilation and air conditioning in public spaces and buildings. Retrieved from https://www.who.int/news-room/q-a-detail/coronavirus-disease-covid-19-ventilation-and-air-conditioning-in-public-spaces-and-buildings
[11] Ostwald, M. J. (2011). The mathematics of spatial configuration: revisiting, revising and critiquing justified plan graph theory. Nexus Network Journal, 445-470.
[12] Safarkhani, M. (2016). In Partial Fulfillment of the Requirements for the Degree of Master of Architecture in Architecture. The Graduate School of Natural and Applied Sciences of Middle East Technical University.
[13] Sandford, A. (2020). Coronavirus: Half of humanity now on lockdown as 90 countries call for confinement. Euronews.
[14] Shahid Beheshti University. (1998). Ganjnameh. Tehran: Faculty of Architecture and Urban Planning Documentation and Research Center.
[15] Tamborrino, R. (2020). Here's how locking down Italy's urban spaces has changed daily life. weforum.org.
[16] Ülkeryıldız, E., Vural, D., & Yıldız, D. (2020, May 6-8). Transformation of Public and Private Spaces: Instrumentality of Restrictions on the. 3rd International Conference of Contemporary Affairs in Architecture and Urbanism (ICCAUA-2020).
[17] Zacka, B. (2020, May 11). An ode to the humble balcony, in times of the pandemic. Retrieved from DTNEXT: www.DTNEXT.IN/NEWS
Comparison of Sentiment-Lexicon-based and Topic-based
Sentiment Analysis Approaches on E-Commerce Datasets
Adeola O. Opesade
Department of Data and Information
Science, Faculty of Multidisciplinary
Studies, University of Ibadan, Nigeria
morecrown@gmail.com
Abstract— Discovering the underlying sentiment in a user's textual data is a complex task; nevertheless, human beings have been intuitive enough to interpret the tone of a piece of writing. The hugeness of online reviews, due to advancements in internet-based applications, has however made the need for computer-based models highly imperative for the sentiment analysis of texts and speeches. Many of the existing studies have examined the performance of either of the two main sentiment analysis approaches: symbolic and topic-based approaches. The present study comparatively investigated the performances of the Liu Hu sentiment-lexicon implementation and bag-of-words topic-based approaches. The study revealed, amongst others, that sentiment analysis, like other data mining tasks, is an experimental science. It recommends that analysts compare the performances of symbolic and topic-based approaches in their sentiment classification endeavors when deciding on the most precise technique to adopt.

Keywords— Sentiment analysis, Amazon customer reviews, Konga customer reviews, Liu-Hu sentiment-lexicon, Textual data mining

I. Introduction

Advancements in internet-based applications have fuelled the availability of huge volumes of personalised reviews on the Web [1]. This user-generated data, mostly unstructured, usually carries elements of user opinions and sentiments about goods, services, events and experiences in online or offline environments [2]. These reviews are becoming so increasingly important that many people now consult them as sources of information to aid their understanding, planning and decision-making processes [1]. Businesses have also adopted online reviews as part of their criteria for quality assessments [3]. Little wonder, then, that learning customers' emotional inclinations through online reviews is becoming more crucial in the present Information Age [4].

Discovering underlying sentiments, based on a user's textual data, is not a trivial task. This is especially due to the different intricacies associated with language, such as contextual differences, language implications and the sentiment indistinctness of certain words. Also, some writers could be sarcastic, while some others may not express specific sentiment markers in their writing [3]. Despite these complexities, human beings have been found to be passably intuitive in interpreting the tone of a piece of writing [5]. The massiveness of online reviews has, however, made human beings rely on computer-based models for identifying the polarity of sentiments expressed by writers, through the process known as sentiment analysis [6], [2].

Sentiment analysis, an intellectual process of extracting a user's feelings and emotions contained in a piece of writing or speech, is a language processing task that uses a computational approach to identify opinionated content and categorize it as neutral, positive or negative [7], [2]. It is one of the fields of Natural Language Processing (NLP) and data mining that has gained popularity in recent years [1], [3]. A lot of research work is being carried out in the field of sentiment analysis [1]. These studies employed different data mining and natural language methods, mainly classified as symbolic and machine learning approaches.

The symbolic approach, also known as rule-based classification, entails reliance on a sentiment lexicon to find the polarity of each word in a review; if the number of words tagged positive is greater than that tagged negative, it is concluded that the writer's sentiment is positive; otherwise, it is said to be negative [8]. It is therefore said to be a knowledge-based classification that might lack generality, due to the possibility of its closeness to specific linguistic and operational fields [4]. The machine learning approach, also regarded as a topic-based text classification approach, is a general solution, independent of any special field. In this approach, the reviews are represented by different features, followed by any text classification algorithm [4], [8].

In order to identify better alternatives in sentiment analysis, a number of studies have used rule-based approaches, combined with some text processing procedures and machine learning algorithms, to investigate the performances of sentiment analysis tasks [3], [7], [9]. A number of studies have also investigated the performances of sentiment analysis tasks based on the machine learning approach, using bag-of-words, n-grams and POS-tagged feature selection techniques combined with a series of text processing procedures and machine learning classifiers [6], [10], [11]. It could be observed that most of these previous studies have investigated the sentiment polarity of textual data from either the sentiment-lexicon or the topic-based approach. They have, therefore, reported the relative performances of sentiment classification tasks based on one or the other sentiment analysis approach.
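As a minimal illustration of the rule-based counting described for the symbolic approach above, the following Python sketch tags words against a toy lexicon; the word lists are placeholders and are far smaller than the actual Liu-Hu opinion lexicon.

```python
# Toy sentiment lexicons; the real Liu-Hu lexicon contains thousands of words.
POSITIVE = {"good", "great", "excellent", "love", "nice"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def lexicon_polarity(review: str) -> str:
    """Classify a review by counting lexicon hits: more positive words
    than negative words -> positive, otherwise negative (rule-based)."""
    tokens = review.lower().split()
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return "positive" if pos > neg else "negative"

print(lexicon_polarity("great phone , love the battery"))      # positive
print(lexicon_polarity("terrible delivery and poor support"))  # negative
```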
While the approaches employed might suffice for the researchers' purposes of investigation, there still remains a dearth of information on the relative performances of sentiment-lexicon-based and topic-based polarity approaches on the same dataset. This is particularly important because of the variability in real datasets, and the fact that the universality of learning algorithms has been said to be a mere fantasy [12]. How will sentiment-lexicon and topic-based approaches perform on a dataset? The answer to this question will help to investigate the universality of sentiment analysis approaches further.

The present study, in order to provide an answer to this question, investigated the performances of both approaches on two different e-commerce datasets. Two datasets were examined for the purpose of triangulation. To achieve the objective of the study, the following research questions were specifically examined:

1. How do machine-assigned sentiment-lexicon-based scores compare with human-assigned sentiment labels?
2. How do sentiment-lexicon-based classification schemes compare with topic-based classification schemes in each dataset?
3. Which classification scheme is the best for each dataset?

II. Research Methods and Materials

The method of research adopted by the present study is textual data mining, with a supervised machine learning technique.

A. Data Collection

Two web-based electronic commerce datasets were used in the present study. The datasets are:

1. Dataset on Amazon: This dataset was collected from [13]. The dataset was created and uploaded by [14], authors of 'From Group to Individual Labels using Deep Features'. It comprises one thousand reviews of products labelled with positive or negative sentiment. The authors of the dataset selected review sentences that have a clearly positive or negative connotation, in equal proportion. The dataset therefore contains 500 positive and 500 negative reviews.

2. Dataset on Konga: This dataset of customer reviews received from Konga was collected by the author of the present study. Data collection was carried out on the 17th of April 2020 on Twitter, using #kongacustomer as the search term on the Orange data mining tool's Twitter API. Tweets were first read and labelled based on the sentiment inclination expressed therein. These sentiment polarity labels were positive, negative and neutral. However, all neutral tweets were removed from the dataset in order to make it conform to the sentiment polarity format of the collected Amazon dataset.

B. Methodology

The methodologies employed for the data analytic procedures are the Liu Hu and bag-of-words implementations in the Orange data mining tool. Liu Hu is a lexicon-based sentiment analysis technique that computes a single normalized score of the sentiment in the text (a negative score for negative sentiment, positive for positive; 0 is neutral). The technique was used to carry out sentiment-lexicon-based scoring of each review. The bag-of-words technique was used to extract a number of the most frequent content words in each corpus. With this technique, words (excluding stop words) were successively extracted to contain the 1000, 500, 200, 100, 50, 20 and then 10 most frequent words from each dataset. The Orange data mining tool was also used to carry out the text pre-processing (transformation, tokenization, removal of stop words and extraction of word features) and the machine learning classification experiments.

C. Experimental Setup

An experiment was carried out using six machine learning algorithms in the Orange data mining tool. The machine learning algorithms used were K-Nearest Neighbor (KNN), Tree, Support Vector Machine (SVM), Neural Network (NN), Naïve Bayes (NB) and Logistic Regression (LR). The experiment was carried out to determine the performances of these machine learning classification schemes based on the following feature sets:

a. Sentiment-lexicon-based machine-assigned score.
b. Topic-based bag-of-word vectors (1000, 500, 200, 100, 50, 20 and 10 most frequent content words).

Ten-fold cross validation was used to evaluate the models' performances based on Classification Accuracy (CA) and F-Measure (F1).
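The study ran these experiments in Orange; the sketch below reproduces the same experimental shape with scikit-learn as an assumed stand-in, using a tiny synthetic corpus in place of the Amazon and Konga reviews.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression

# Placeholder corpus standing in for the labelled Amazon/Konga reviews.
reviews = ["great product, works perfectly"] * 10 + ["terrible, broke in a week"] * 10
labels = ["positive"] * 10 + ["negative"] * 10

classifiers = [KNeighborsClassifier(), DecisionTreeClassifier(), SVC(),
               MLPClassifier(max_iter=1000), MultinomialNB(), LogisticRegression()]

# Bag-of-words vectors limited to the N most frequent content words,
# mirroring the 1000/500/.../10 feature sets; ten-fold cross validation
# reports classification accuracy (CA) and a macro F1.
for n_words in (1000, 500, 200, 100, 50, 20, 10):
    X = CountVectorizer(stop_words="english", max_features=n_words).fit_transform(reviews)
    for clf in classifiers:
        scores = cross_validate(clf, X, labels, cv=10, scoring=("accuracy", "f1_macro"))
        print(n_words, type(clf).__name__,
              round(scores["test_accuracy"].mean(), 4),
              round(scores["test_f1_macro"].mean(), 4))
```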
III. Results and Discussion

Research Question 1: How do machine-assigned sentiment-lexicon-based scores compare with human-assigned sentiment labels?

[Bar chart: machine-assigned labels were 265 negative, 520 positive and 215 neutral; human labels were 500 negative, 500 positive and 0 neutral.]
Figure Ia. Sentiment labels of Amazon dataset reviews
[Bar chart: machine-assigned labels were 35 negative, 49 positive and 101 neutral; human labels were 162 negative, 23 positive and 0 neutral.]
Figure Ib. Sentiment labels of Konga dataset reviews

Whereas neither of the two datasets contains neutral reviews, the machine sentiment-lexicon scoring algorithm scored some of the reviews as neutral. For example, as shown in Figure Ia, the Amazon dataset originally contains 500 positive and 500 negative reviews, as labelled by the human identifier. The machine sentiment algorithm, however, scored 520 reviews as positive, 265 as negative and 215 as neutral. Also, as shown in Figure Ib, while the original Konga dataset contains 23 positive and 162 negative reviews as labelled based on human identification, the machine sentiment algorithm identified 49 reviews as positive, 35 as negative and 101 as neutral. This shows that machine scoring, based on the sentiment-lexicon algorithm, finds it more challenging than human agents to determine the sentiment polarity of the reviews. This corroborates the assertion that human beings are fairly more intuitive than machines when it comes to interpreting the tone of a piece of writing [5].

Research Question 2: How do sentiment-based classification schemes compare with topic-based classification schemes in each dataset?

The results of the sentiment-lexicon and bag-of-words machine learning classification of the Amazon data with the six classification algorithms (KNN, Tree, SVM, NN, NB and LR) are presented in Figure IIa and Figure IIb respectively, while the results of the same experimental task on the Konga data are presented in Figure IIIa and Figure IIIb respectively.

In the case of the Amazon dataset (Figures IIa and IIb), the sentiment-lexicon-based approach outperformed the bag-of-words sentiment polarity machine learning classification across the six classification algorithms. In the case of the Konga dataset (Figures IIIa and IIIb), the sentiment-lexicon-based approach outperformed the bag-of-words topic-based sentiment polarity approach in two classifiers (Tree and SVM). The 200-word and 500-word topic-based feature sets outperformed the sentiment-lexicon-based approach for KNN; all examined bag-of-words feature sets other than 50 and 100 words outperformed the sentiment-lexicon-based approach in NN; the fifty (50) word feature set outperformed the sentiment-lexicon-based approach in NB; while the sentiment-lexicon-based approach was outperformed by all bag-of-words feature set options in LR.

It could be observed that the behaviors of the two datasets are not exactly the same. While the Amazon data was better determined by the sentiment-dictionary-based approach, the same did not hold for the Konga data. This divergence might possibly be traced to the assertion of [4] that, since the sentiment-dictionary-based approach is a knowledge-based classification approach, it lacks generality, and that of [15], who reported that infrequently mentioned features are often not detected by this knowledge-based sentiment analysis approach. It could possibly mean that most of the sentiment clues in the Amazon dataset are better represented in the lexicon database than those in the Konga dataset. Nevertheless, the result further buttresses the assertion that real datasets vary and the idea of a universal approach is just a fantasy [12]. Thus, sentiment analysis, like other data mining tasks, is an experimental science.

[Figure IIa. Amazon: CA and F1 performance across the feature sets for KNN, Tree and SVM.]
[Figure IIb. Amazon (continued): CA and F1 performance across the feature sets for NN, NB and LR.]
[Figure IIIa. Konga: CA and F1 performance across the feature sets for KNN, Tree and SVM.]
[Figure IIIb. Konga (continued): CA and F1 performance across the feature sets for NN, NB and LR.]
Research Question 3: Which classification scheme is the best for each dataset?

The best performing feature set in each of the six classifiers for the two datasets are shown in Tables I and II.

Table I. Amazon best performing classification schemes

Algorithm   Feature set   CA       F1
KNN         Sentiment     0.7774   0.7771
Tree        Sentiment     0.8106   0.8104
SVM         Sentiment     0.6544   0.6429
NN          Sentiment     0.8203   0.8203
NB          Sentiment     0.8188   0.8188
LR          Sentiment     0.8135   0.8134

Table II. Konga best performing classification schemes

Algorithm   Feature set   CA       F1
KNN         200 words     0.9297   0.9316
Tree        Sentiment     0.9189   0.9045
SVM         Sentiment     0.9135   0.8925
NN          10 words      0.9676   0.9654
NB          50 words      0.8919   0.8874
LR          500 words     0.9568   0.9559

As shown in Tables I and II, the sentiment-lexicon approach and the ten-word topic-based feature set, based on the Neural Network, provided the highest classification accuracy in the Amazon and Konga datasets respectively. The emergence of NN as the best classifier in the two datasets is at variance with some of the previous studies, which reported SVM [6], NB [7] or Stochastic Gradient Descent (SGD) [3] as the best classifiers. It could, however, be observed that none of the mentioned previous studies included NN as a classifier in their experiments. The difference in the best-performing feature sets in the present study, and the emergence of NN as the best classifier contrary to reported cases, further buttress the fact that the concept of a universal learner is an idealistic fantasy [12].

IV. Conclusion

Based on the findings of this study, it could be concluded that the rule-based sentiment analysis approach, particularly the Liu Hu dictionary implementation used in the present study, has not yet assumed human status in sentiment polarity determination. The study also concludes that sentiment analysis, like other data mining tasks, is an experimental science, and analysts could compare the performances of symbolic and topic-based approaches in their sentiment classification endeavors before deciding on the most precise technique.

The sentiment-lexicon results in the present study were, however, based on the Liu Hu dictionary implementation alone. Also, the topic-based approach was based on bag-of-words lexical features alone. The performances of other methods of each of the two techniques could be investigated in further studies.
References

[11] W. Kasper and M. Vela, "Sentiment Analysis for Hotel Reviews," in Proceedings of the Computational Linguistics-Applications Conference, 2011, vol. 231527, pp. 45-52.
Autism detection from facial images using deep learning
methods
Abdulazeez Mousa
Department of Software Engineering, Firat University, Elazig, Turkey
abdulazizmousa93@gmail.com

Fatih Özyurt
Department of Software Engineering, Firat University, Elazig, Turkey
fatihozyurt@firat.edu.tr

Shivan H. M. Mohammed
Computer Science, Duhok University, Duhok, Iraq
shivan.cs@uod.ac
Abstract— Autism spectrum disorder (ASD) refers to a collection of behavioral and developmental issues and difficulties. The cognitive, communication, and play skills of a child with autism spectrum disorder are all affected. The description "spectrum" in autism spectrum disorder refers to the fact that each child is special and has their own set of characteristics, different from other children. These come together to give the child a unique social bond as well as their own understanding of their own actions. Medical image classification is a significant research field that is gaining traction among researchers and clinicians alike for detecting and diagnosing diseases. It addresses the issues of medical diagnosis, experimental purposes and analysis in the field of medicine. To address and resolve these issues, a number of data-mining-based medical imaging modalities and applications have been proposed and developed, not only to achieve good accuracy in classifying medical images, but also to understand and learn how diseases develop in patients and to help doctors in the early diagnosis of pathology. We used pre-trained Convolutional Neural Networks and transfer learning in this study. These CNN architectures are used to train the network and to classify medical images. The experimental results of this study show that the proposed model can detect Autism Spectrum Disorder (ASD), with the best accuracy of 95.75 percent achieved using the MobileNet model with transfer learning. The architectures tested in this study are ready to be tested with additional data and can be used to prescreen individuals with ASD. The use of deep learning methods for feature selection and classification in this study could greatly support future autism studies.

Keywords— Autism, deep learning, Convolutional Neural Networks, Classification, Transfer Learning

I. INTRODUCTION

Autism, also known as autism spectrum disorder (ASD), is a developmental disorder that affects a person's interaction, speech, and social behavior. (Strange gestures, such as swinging the hands and rotating, are repeated. Since autistic people cannot stay in one place for long periods of time, we find that the patient is always moving, and the autistic person's gestures are disorganized and spontaneous. They cannot connect well with others and speak with a peculiar accent. Autistic people are very sensitive to light and sound [1]. In addition, they are unaware of other people's emotions and thoughts. When compared to other children, they are more violent. They frequently have outbursts of anger. The pain response and sensation are sluggish and mild. Much depends on the disposition of the child, who may be slow to learn or have a high level of intelligence.) Other disorders, such as depression, anxiety, and attention deficit hyperactivity disorder, are also common in these children [2]. Early diagnosis in infancy can help autistic children develop their communication and social skills, as well as their quality of life. Estimating the total ASD requires the expertise of an ASD specialist; however, early diagnosis is crucial for controlling and treating this disease [3]. However, in rural areas and isolated villages, health care and facilities are unavailable. Several methods for determining whether or not a child has autism spectrum disorder have been established. These methods are extremely useful for the early diagnosis of ASD and for evaluating the efficacy of the ASD protocol [4]. In recent years, deep learning (DL) has advanced rapidly. Image processing, computer-aided diagnosis, image recognition, image fusion, image registration, image segmentation, and other fields have benefited from deep learning techniques. Through the detection and analysis of medical images, deep learning techniques extract features from medical images accurately and efficiently reflect the information. Deep learning (DL) helps physicians (medical practitioners) and doctors detect and predict disease risk more accurately and quickly, allowing them to avoid disease before it develops [5]. One of the main dilemmas (challenges) in the image recognition field is the classification of medical images, with the aim of analyzing and categorizing them into several different groups to assist physicians (medical practitioners) and doctors in diagnosing diseases or performing research on them [6]. These techniques enhance the abilities of doctors and researchers to understand how to analyze the generic variations which will lead to disease. Deep learning algorithms such as convolutional neural networks (CNNs) are used to detect and classify images. The classification of medical images is divided into several stages. Firstly, the medical images are acquired and uploaded to the model. Then, the extraction of essential features from the acquired input image is performed. The features are then used to create models that identify the image data set in the second phase of medical image classification. The final stage is the production of a classified image and a report, based on the image analysis, that reflects the outcome [7].

The rest of this article is structured as follows: we present the previous related work on autism detection and image classification using deep learning and transfer learning techniques in Section II; the details of the dataset used in this study and a description of the CNN architectures implemented on our dataset are provided in Section III; the experimental results of the CNN architectures and transfer learning techniques performed on the dataset, showing that the best results were obtained by the MobileNet model, are presented in Section IV; finally, Section V presents the conclusion of the proposed framework.
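The staged flow just described, acquiring the image, extracting features, then classifying and reporting, can be summarized in a toy sketch; the mean-intensity feature and the threshold rule below are illustrative stand-ins for a trained CNN, not the method of this paper.

```python
import numpy as np

def extract_features(image: np.ndarray) -> np.ndarray:
    """Stage 2: reduce the acquired image to a feature vector (here just
    the mean intensity per colour channel; a CNN learns far richer features)."""
    return image.mean(axis=(0, 1))

def classify(features: np.ndarray) -> str:
    """Stage 3: map the feature vector to a label and report it
    (placeholder decision rule for illustration only)."""
    return "Autistic" if features.mean() > 0.5 else "Non_Autistic"

# Stage 1: image acquisition, simulated with a random 224 x 224 x 3 array.
image = np.random.rand(224, 224, 3)
print(classify(extract_features(image)))
```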
II. LITERATURE REVIEW

Beary, M., Hadsell, A., Messersmith, R., & Hosseini, M. P. proposed "Diagnosis of Autism in Children using Facial Analysis and Deep Learning". In this paper, the authors proposed a deep learning model to classify children as autistic or non-autistic (healthy). Autistic patients in general have distinct facial abnormalities that are part of the disease discovery, allowing researchers to know whether a person has autism or not. Therefore, the researchers in this study used the MobileNet deep learning model with two dense layers to extract the features of the face and classify people according to the image. In this study, the model was trained and tested on 3014 images (90% for training, 10% for testing). The result of this study was 94.6% classification accuracy [8].

Mythili, M. S., & Shanavas, A. M. have written "A study on Autism spectrum disorders using classification techniques". In this study, the authors focused on extracting data from methodologies to study the performance of students with autism, by mining data with the tasks it provides that can be used to study the performance of students with autism. In this article the authors use data mining algorithms for task classification at the level of students. The authors used Support Vector Machines (SVM), Artificial Neural Networks (ANN), and fuzzy logic as the machine learning methods. The algorithms are extremely helpful in dealing with the prediction level of autism students [9].

Ali, N.A., Syafeeza, A.R., Jaafar, A., & Alif, M. have written "Autism spectrum disorder classification on electroencephalogram signal using deep learning algorithm". In this paper, the authors use deep learning algorithms via electroencephalogram (EEG) signals to detect the different patterns between normal children and children with autism. The process of classifying normal children and children with autism uses deep learning algorithms and the Multilayer Perceptron (MLP) model, by means of a database that contains brain signals for pattern recognition, using the MLP model to extract the features for the classification process [10].

Rad, N. M., Kia, S. M., Zarbo, C., van Laarhoven, T., Jurman, G., Venuti, P., Marchiori, E., & Furlanello, C. have written "Deep learning for automatic stereotypical motor movement detection using wearable sensors in autism spectrum disorders". In this paper the authors proposed a deep learning application to aid automatic stereotypical motor movement (SMM) detection with multi-axis Inertial Measurement Units (IMUs). In the study, a convolutional neural network (CNN) was used to learn a discriminative feature space from raw data. To model the temporal patterns in a series of multi-axis signals, the long short-term memory (LSTM) was combined with CNN architectures. Furthermore, it was demonstrated how the CNN can be used with the transfer learning technique to improve the detection and analysis rate on longitudinal data. The results of this paper show that handcrafted features are outperformed by feature learning, and that the detection rate improves when using LSTM to learn the temporal dynamics of the signals, particularly when the training data is distorted. Detectors with an ensemble of LSTMs are more accurate and stable, and in longitudinal settings, parameter transfer learning is beneficial. These results represent a significant step forward in detecting SMM in real-time scenarios [11].

III. METHODOLOGY

A. Dataset

Early detection of autism is very important for the development of the affected child, so the development of these classifications, based on a related set of data, is a means of early detection of autism, and this enhances the adaptation of the patient to normal life. The dataset can be used in two different ways, both of which are popular for deep learning tasks. The dataset is divided into Training, Test, and Validation sets, which is a normal procedure. Train is the title of the training set; it contains two subdirectories known as Autistic and Non_Autistic [12, 13]. There are 1667 facial pictures of children with autism in the Autistic subdirectory, in 224 x 224 x 3 jpg format. There are also 1667 photos of children who do not have autism in the Non_Autistic subdirectory, all of which are in the same 224 x 224 x 3 jpg format. The unified directory is the second way the data is made accessible. There are two subdirectories in this directory, Autistic and Non_Autistic; it is a collection of files from the train, test, and validation directories that have been combined into a single set. The combined data can then be partitioned into user-defined train, test, and validation sets. Some images of the autism dataset used in our study are shown below in Figure 1.

Figure 1: Some images of (a) Autistic and (b) Non-Autistic children.
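A sketch of how a directory layout like the one described above is typically consumed, using tf.keras; the root path is a placeholder, and the call shown is a common convention rather than the exact loader used in the study.

```python
import tensorflow as tf

# Assumed layout, per the dataset description:
# AutismDataset/train/Autistic/*.jpg, AutismDataset/train/Non_Autistic/*.jpg, etc.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "AutismDataset/train",
    labels="inferred",
    label_mode="binary",          # Autistic vs Non_Autistic
    image_size=(224, 224),        # matches the 224 x 224 x 3 images
    batch_size=32,                # batch size used in the study
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "AutismDataset/test",
    labels="inferred",
    label_mode="binary",
    image_size=(224, 224),
    batch_size=32,
)
print(train_ds.class_names)       # ['Autistic', 'Non_Autistic']
```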
B. Transfer Learning and CNN architectures

Due to the emergence of modern and high-performing machine learning systems, image classification has become more important in the research field. Artificial neural networks (ANNs) have progressed, and deep learning architectures such as the convolutional neural network (CNN) have emerged. The application of multi-class image classification and the recognition of objects belonging to multiple categories have been triggered by reliance on artificial neural networks (ANNs). In terms of efficiency and complexity, the new machine learning (ML) algorithms have an advantage over older ones [14]. In the deep learning field, transfer learning is a research problem. It extracts the knowledge collected from one problem and applies it to a separate but related issue. For example, knowledge acquired while learning to recognize one disease may apply when trying to identify another disease [15, 16]; for example, we apply knowledge gained during the identification of cancer to malaria. Transfer learning is a deep learning technique that involves training a neural network model on a problem that is close to the one being solved. One of its advantages is that it shortens the training time of the deep learning model; this way, we can build on previous knowledge rather than starting from scratch. Transfer learning is commonly expressed in computer vision by the use of pre-trained models. A pre-trained model is one that has been trained on a broad benchmark dataset to solve a problem that is close to the one we are working on, and it is reused as a result of the high computational cost of training these models [17, 4].

Convolutional neural networks (ConvNets or CNNs) are one of the most common types of neural networks used to recognize and classify images. CNNs are commonly used in areas such as object detection, face recognition, and so on. In the classification of images, a convolutional neural network (CNN) takes an image as input, processes it (extracts features from it) and categorizes it [18] (e.g., Autistic, Non-Autistic). An input image is seen by computers as an array of pixels, with the number of pixels varying depending on the image resolution. Depending on the resolution of the picture, it will be represented as h x w x d (h = height, w = width, d = dimension), for example a 6 x 6 x 3 array for an RGB matrix (3 refers to the RGB values) or a 4 x 4 x 1 array for a grayscale matrix image.

When re-purposing a pre-trained model for our own needs, we should remove the original classifier, then add a new classifier that is appropriate for our needs, and finally fine-tune the model using one of three strategies [19]:

Train the entire model: apply the pre-trained model's architecture to your dataset and train it, starting with the values from the pre-trained model instead of random weights.

Feature extraction (freezing the convolutional neural network (CNN) model base): the pre-trained base forms the model on which a new classifier is trained. In other words, only the fully connected layer is trained, leaving the weights of the convolution layers unchanged.

Fine-tuning: in this strategy, the original convolution layer weights are used as starting points. In addition to a fully connected classifier, one or more convolution layers are retrained; the unfrozen convolution layers are adjusted just to fit the new problem.

There are many pre-trained CNN networks that only require datasets consisting of training and test data at their input layer and have the ability to transfer learning. These networks differ in their structures in terms of the internal layers and the technologies they use [20]. In this study, we have chosen five pre-trained CNN architectures to apply to our autism dataset. The CNN architectures (Densenet121, InceptionV3, MobileNet, Resnet50, and Xception) are employed to classify the autism image dataset into binary classes. In this study, the size of all images in the dataset is 224 x 224 x 3. The Adam optimizer is used to train each network architecture, the number of epochs is 35, and the training and testing batch size is set to 32.
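A minimal tf.keras sketch of the feature-extraction strategy with MobileNet under the settings just listed (224 x 224 x 3 input, Adam optimizer, binary output); the 128-unit dense layer is an assumption, since the paper does not publish its exact classifier head.

```python
import tensorflow as tf

# Pre-trained MobileNet base with its original ImageNet classifier removed;
# freezing it implements the feature-extraction strategy described above.
base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False,
    weights="imagenet", pooling="avg")
base.trainable = False

# New binary classifier head (Autistic vs Non_Autistic); layer sizes assumed.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",               # Adam, as in the study
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=test_ds, epochs=35)  # 35 epochs, batch 32
```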
IV. Results

In this study, ASD diagnosis was presented as a binary classification issue by applying the transfer learning technique to five pre-trained CNN models (Densenet121, InceptionV3, MobileNet, ResNet-50, and Xception). All CNN architectures were trained for the full 35 epochs, with a training and testing batch size of 32. The results showed that the maximum accuracy achieved was 95.75%, obtained by the MobileNet model in 90.5 minutes, which is also the shortest training time of all the models; it was followed by Densenet121, InceptionV3, Xception, and ResNet-50, with 91.96%, 91.46%, 90.60%, and 72.41% respectively. The training times of the models differ, increasing in the order MobileNet, InceptionV3, Densenet121, ResNet-50, and Xception, as shown in Table 1 below.

Table 1. Comparison of pre-trained networks results.

Model         Accuracy   Time (minutes)
MobileNet     95.75%     90.5
Densenet121   91.96%     330.8
InceptionV3   91.46%     239.7
Xception      90.60%     445.4
ResNet-50     72.41%     346.1

Figure 2: Densenet121 training progress, Accuracy
Figure 3: Densenet121 training progress, Loss
Figure 4: MobileNet training progress, Accuracy
Figure 5: MobileNet training progress, Loss
Figure 6: InceptionV3 training progress, Accuracy
Figure 7: InceptionV3 training progress, Loss
Figure 8: Xception training progress, Accuracy
Figure 9: Xception training progress, Loss
Figure 10: Resnet50 training progress, Accuracy
Figure 11: Resnet50 training progress, Loss
V. Conclusion

In this paper, we introduce a CNN-based approach using transfer learning to detect and classify Autism Spectrum Disorder (ASD) patients from facial images as a binary classification issue. In this study, five pre-trained CNN models were implemented with the transfer learning technique on a medical image dataset. The CNN models used to detect and classify the images are Densenet121, InceptionV3, MobileNet, ResNet-50, and Xception. All CNN models were trained using the Adam optimizer for the full 35 epochs, with a batch size of 32. The results demonstrate that our model's accuracy with the test data is 95.75%, which is the best accuracy achieved on this dataset, obtained by the MobileNet model. In the future, we will compare the results of the models used in this study with some other models and refine the data with the proposed models, and try more models on more medical images to increase the efficiency of classification and obtain the best results.

References

[1] Nasser, I. M., Al-Shawwa, M., & Abu-Naser, S. S. (2019). Artificial Neural Network for Diagnose Autism Spectrum Disorder.
[2] Sherkatghanad, Z., Akhondzadeh, M., Salari, S., Zomorodi-Moghadam, M., Abdar, M., Acharya, U. R., ... & Salari, V. (2020). Automated detection of autism spectrum disorder using a convolutional neural network. Frontiers in Neuroscience, 13, 1325.
[3] Tamilarasi, F. C., & Shanmugam, J. (2020, June). Convolutional Neural Network based Autism Classification. In 2020 5th International Conference on Communication and Electronics Systems (ICCES) (pp. 1208-1212). IEEE.
[4] Lai, Z., & Deng, H. (2018). Medical image classification based on deep features extracted by deep model and statistic feature fusion with multilayer perceptron. Computational Intelligence and Neuroscience, 2018.
[5] Liu, W., Li, M., & Yi, L. (2016). Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework. Autism Research, 9(8), 888-898.
[6] Erkan, U., & Thanh, D. N. (2019). Autism spectrum disorder detection with machine learning methods. Current Psychiatry Research and Reviews (Formerly: Current Psychiatry Reviews), 15(4), 297-308.
[7] Karabatak, M., Mustafa, T., & Hamaali, C. (2020, June). Remote Monitoring Real Time Air pollution-IoT (Cloud Based). In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-6). IEEE.
[8] Beary, M., Hadsell, A., Messersmith, R., & Hosseini, M. P. (2020). Diagnosis of Autism in Children using Facial Analysis and Deep Learning. arXiv preprint arXiv:2008.02890.
[9] Mythili, M. S., & Shanavas, A. M. (2014). A study on Autism spectrum disorders using classification techniques. International Journal of Soft Computing and Engineering, 4(5), 88-91.
[10] Ali, N.A., Syafeeza, A.R., Jaafar, A., & Alif, M. (2020). Autism spectrum disorder classification on electroencephalogram signal using deep learning algorithm. IAES International Journal of Artificial Intelligence, 9(1), 91.
[11] Rad, N. M., Kia, S. M., Zarbo, C., van Laarhoven, T., Jurman, G., Venuti, P., Marchiori, E., & Furlanello, C. (2018). Deep learning for automatic stereotypical motor movement detection using wearable sensors in autism spectrum disorders. Signal Processing, 144, 180-191.
[12] Mohammed, S. H., & Çinar, A. (2021). Lung cancer classification with Convolutional Neural Network Architectures. Qubahan Academic Journal, 1(1), 33-39.
[13] Mousa, A., Karabatak, M., & Mustafa, T. (2020, June). Database Security Threats and Challenges. In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-5). IEEE.
[14] Karabatak, M., & Mustafa, T. (2018, March). Performance comparison of classifiers on reduced phishing website dataset. In 2018 6th International Symposium on Digital Forensic and Security (ISDFS) (pp. 1-5). IEEE.
[15] Mustafa, T., & Varol, A. (2020, June). Review of the Internet of Things for Healthcare Monitoring. In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-6). IEEE.
[16] Sharma, N., Jain, V., & Mishra, A. (2018). An analysis of convolutional neural networks for image classification. Procedia Computer Science, 132, 377-384.
[17] Heinsfeld, A. S., Franco, A. R., Craddock, R. C., Buchweitz, A., & Meneguzzi, F. (2018). Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage: Clinical, 17, 16-23.
[18] Hossain, M. D., Kabir, M. A., Anwar, A., & Islam, M. Z. (2021). Detecting autism spectrum disorder using machine learning techniques. Health Information Science and Systems, 9(1), 1-13.
[19] Raj, S., & Masood, S. (2020). Analysis and Detection of Autism Spectrum Disorder Using Machine Learning Techniques. Procedia Computer Science, 167, 994-1004.
[20] Shahamiri, S. R., & Thabtah, F. (2020). Autism AI: a New Autism Screening System Based on Artificial Intelligence. Cognitive Computation, 12(4), 766-777.
Diabetic Retinopathy Classification from Retinal Images using
Machine Learning Approaches
Indronil Bhattacharjee
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
ibprince.2489@gmail.com

Al-Mahmud
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
mahmud@cse.kuet.ac.bd

Tareq Mahmud
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
hridoytareqmahmud@gmail.com
Abstract— Diabetic Retinopathy is one of the most familiar diseases and is a diabetes complication that affects the eyes. Initially, diabetic retinopathy may cause no symptoms or only mild vision problems. Eventually, it can cause blindness, so early detection of the symptoms could help to avoid blindness. In this paper, we present some experiments on features of Diabetic Retinopathy, such as the properties of exudates, the properties of blood vessels and the properties of microaneurysms. Using these features, we can classify the healthy, mild non-proliferative, moderate non-proliferative, severe non-proliferative and proliferative stages of DR. Support Vector Machine, Random Forest and Naive Bayes classifiers are used to classify the stages. Finally, Random Forest is found to be the best, with accuracy, sensitivity and specificity of 76.5%, 77.2% and 93.3% respectively.

I. INTRODUCTION

People suffering from diabetes can have an eye complication called diabetic retinopathy. When blood sugar levels go high, this causes harm and erosion to the blood vessels in the retina. The affected blood vessels can fatten and exude; alternately, the vessels may close up and stop the flow of blood. Sometimes unnecessary and anomalous blood vessels start to grow on the surface of the retina. These abnormal changes can damage one's vision, sometimes destroying it fully. According to the severity of the disease, DR can be classified into two main stages: (a) Non-Proliferative Diabetic Retinopathy (NPDR) and (b) Proliferative Diabetic Retinopathy (PDR).

NPDR is the initial phase of diabetic retinopathy, and many individuals with diabetes suffer from it. With NPDR, tinier blood vessels excrete and fatten the retina. When the macula expands, this is called macular edema. This is the most familiar reason why people with diabetes go blind. In the case of NPDR, the blood vessels in the retina can become clogged too; this situation is named macular ischemia. When macular ischemia happens, the macula cannot get its blood supply. Intermittently, some minute particles called exudates can grow in the retina; these affect one's vision too. If anybody suffers from NPDR, their eyesight will go blurry. Furthermore, NPDR is subclassed into 3 stages: Mild, Moderate and Severe.

Proliferative DR is the most critical phase of diabetic eye disease. It occurs when the retina starts growing excessive blood vessels, which is called neovascularization. These huge numbers of vulnerable vessels often bleed into the vitreous. When they only bleed a little, a few dark floaters are seen; on the other hand, when they bleed a lot, that may block the whole of the vision.

Figure I. Different stages of diabetic retinopathy (from top left): (a) Healthy Eye (b) Mild NPDR (c) Moderate NPDR (d) Severe NPDR (e) PDR

The objectives of the paper are:

• Process color fundus retinal images for Diabetic Retinopathy detection.
• Extract key features from the pre-processed images.
• Detect the presence of Diabetic Retinopathy.
• Classify whether the Diabetic Retinopathy is Proliferative or Non-proliferative.

II. THE PROPOSED SYSTEM

Input: Colour fundus retinal images
Output: Diabetic Retinopathy is not present, Mild, Moderate, Severe or PDR
Process:
Step 1: Input the initial fundus image
Step 2: Preprocess the initial image
Step 3: Optical disk removal
Step 4: Exudates detection
Step 5: Blood Vessels detection
Step 6: Microaneurysm detection
Step 7: Features extraction
Step 8: Apply to classifiers
Step 9: Classify the Diabetic Retinopathy stages
Step 10: Detect whether it is a Healthy, Mild, Moderate, Severe or PDR eye

III. METHODOLOGY

A. Dataset

To evaluate our method, we have used a dataset named Diabetic Retinopathy (Resized) from Kaggle. The dataset has a total of 13402 retinal images and the corresponding level of Diabetic Retinopathy for each image.

B. Data Preprocessing

Data preprocessing has been done in two steps: general preprocessing for all the images, and specific preprocessing for individual feature extraction.

1) General Preprocessing:

Resizing: In this work, the sizes of the actual images in the dataset were 1024x1024 pixels. As the dataset is huge in size, the images have been stored with size 350x350 pixels to reduce the computational time.

Green Channel Extraction: Preprocessing has been done with the aim of improving the contrast level of the fundus images. For contrast enhancement of the retinal images, some components, like the red and blue components of the image, are commonly discarded before processing. The green channel shows the clearest background contrast and the greatest contrast difference between the optic disc and the retinal tissue. The red channel is comparatively lighter, and vascular structures are visible; the retinal vessels are lightly visible but show less contrast than in the green channel. The blue channel contains very little information and is comparatively noisier.

Contrast Limited Adaptive Histogram Equalization: Contrast Limited Adaptive Histogram Equalization (CLAHE) is used for enhancing the contrast level of the images. CLAHE calculates different histograms of the image and uses this information to reallocate the intensity values of the image. Hence, CLAHE is significant for improving the regional contrast and enhancing the edges in all the regions of an image.

2) Specific Preprocessing:

Exudate Detection: Firstly, the optical disc has been removed using the red channel of the image. Then, using a 6x6 ellipse-shaped structuring element, morphological dilation is applied. A non-linear median filter is used for noise removal. Exudates have high intensity values, so they have been extracted using thresholding: after applying this preprocessing, pixels having an intensity value higher than 235 are set to 255 and the rest are set to 0, for the clearest view. Then, traversing the image, the area of the exudates is calculated. The images of the different steps are illustrated in Figure III.

Figure III. Preprocessing for Exudate detection

Blood Vessel Extraction: The blood vessels are one of the most important features for differentiating diabetic retinopathy stages. After obtaining the green channel image and improving the contrast of the image, several steps are performed to extract the blood vessels. Alternate sequential filtering (three rounds of opening and closing) using three different-sized, ellipse-shaped structuring elements (5x5, 11x11 and 23x23) is applied to the image. Then the resultant image is subtracted from the input image. The subtracted image has lots of small noise, which is removed through area-parameter noise removal: the contours of each component, including the noise, are found, the contour area of each is calculated, and the noise components are removed by comparing their area against a reference (200 is used as the reference). Then the resultant image is binarized using a threshold value. Finally, the number of pixels covering the blood vessel area is calculated. The images of the different steps are illustrated in Figure IV.

Figure IV. Preprocessing for Blood Vessel detection
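A condensed OpenCV sketch of the exudate and vessel preprocessing described above; the file path, CLAHE parameters and the final vessel threshold are assumptions, while the 350x350 resize, the 235 exudate threshold and the 5/11/23 ellipse kernels follow the text.

```python
import cv2

img = cv2.resize(cv2.imread("fundus.jpg"), (350, 350))   # path is a placeholder

green = img[:, :, 1]                                      # green channel: best contrast
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # CLAHE params assumed
enhanced = clahe.apply(green)

# Exudate candidates: high-intensity pixels; > 235 -> 255, rest -> 0.
blurred = cv2.medianBlur(enhanced, 5)                     # median filter for noise
_, exudates = cv2.threshold(blurred, 235, 255, cv2.THRESH_BINARY)

# Vessels: alternate sequential filtering (opening then closing) with
# growing 5x5, 11x11 and 23x23 ellipse kernels, then subtraction.
asf = enhanced
for size in (5, 11, 23):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (size, size))
    asf = cv2.morphologyEx(asf, cv2.MORPH_OPEN, kernel)
    asf = cv2.morphologyEx(asf, cv2.MORPH_CLOSE, kernel)
vessels = cv2.subtract(asf, enhanced)                     # dark vessels stand out
_, vessels = cv2.threshold(vessels, 15, 255, cv2.THRESH_BINARY)  # threshold assumed

print(int((exudates > 0).sum()), int((vessels > 0).sum()))  # candidate areas in pixels
```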
Microaneurysm Extraction: The green component is used to extract the microaneurysms. For better contrast, CLAHE is used; then a median filter is used for noise removal. A 7x7 ellipse-shaped structuring element is used for the morphological operation. Morphological erosion is applied, and then the image is inverted. For joining the disjoint segments of the blood vessels, morphological closing is used. Then the image is binarized. As the blood vessels, haemorrhages and microaneurysms have almost the same intensity, all of these components are detected together in the binarized image. Since microaneurysms are smaller in size, they have been extracted using the contour area. The images of the different steps are illustrated in Figure V.

Figure V. Preprocessing for Microaneurysm detection

C. Dataset Splitting

The dataset has been divided into two parts: 75% as training data and 25% as test data. Therefore, 10052 training images have been used to train the model, and it has been tested on 3350 images.

D. Data Scaling

In this system, a standard scaler has been used to scale all the data, to limit the ranges of the variables. Using data scaling, the variables can be compared under the same conditions across all the algorithms.

E. Selection of Features

Feature extraction is done from the preprocessed images shown in Figures III, IV and V. The features which are extracted to detect Diabetic Retinopathy are:

Histogram of Exudates
Zeroth Hu moment of Exudates
Histogram of Blood Vessels
Zeroth Hu moment of Blood Vessels
Histogram of Microaneurysm
Zeroth Hu moment of Microaneurysm

F. Classification

Prediction has been performed using Support Vector Machine (SVM), Random Forest (RF) and Naive Bayes classifiers.

1) Random Forest:

Random Forest (RF) is an ensemble tree-based learning algorithm. The RF classifier is a set of decision trees built from arbitrarily chosen subsets of the training set. It aggregates the votes from the different decision trees to choose the final class of the test object. The fundamental concept behind the RF classifier is a basic but powerful one: the wisdom of crowds. A large number of relatively uncorrelated trees working as a committee will beat any of the individual constituent models. Uncorrelated models can deliver ensemble predictions that are more precise than any of the individual predictions. The reason for this wonderful effect is that the trees protect each other from their individual mistakes, as long as they do not all make mistakes in the same direction. While a few trees may be wrong, many other trees will be right, so as a group the trees are able to move in the correct direction.

2) Support Vector Machine:

SVM classifies the input images into two classes, such as a Diabetic Retinopathy affected eye and a normal eye, using its features. As SVM is a binary classifier, our first task is to classify which eye is affected by Diabetic Retinopathy and which is a healthy one. After the first classification, our next task is to use the Support Vector Machine again; this time it is applied only to the affected ones. It will again classify which Diabetic Retinopathy is non-proliferative, i.e. in the initial stage, and which one is proliferative, i.e. in the severe state.

The Support Vector Machine has been utilized since SVM is based on a convex objective function that never gets stuck in local maxima. The optimal hyperplane is the shape of the separating hyperplane, and the objective function of the optimization problem does not depend explicitly on the dimensionality of the input vector, but only on the inner products of two vectors. This fact permits constructing the separating hyperplanes in high-dimensional spaces.

3) Naïve Bayes:

The Naive Bayes classifier isn't a single algorithm, but a collection of algorithms which all share a common rule: each pair of features being classified is independent of each other. Naive Bayes is mainly a family of algorithms based on Bayes' Theorem.

G. Evaluation Metrics

Accuracy, Sensitivity and Specificity are used as the evaluation metrics of the model. Accuracy, Sensitivity and Specificity are calculated using (1), (2) and (3) respectively:

Accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)   (1)

Sensitivity = True Positive / (True Positive + False Negative)   (2)

Specificity = True Negative / (True Negative + False Positive)   (3)
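A sketch of the training and macro-averaged evaluation described above, using scikit-learn as an assumed equivalent; the feature matrix is simulated, since only the six extracted features (histogram statistics and zeroth Hu moments) feed the real models.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.random((13402, 6))            # simulated six-feature vectors
y = rng.integers(0, 5, 13402)         # 5 classes: healthy ... PDR

# 75/25 split and standard scaling, as in sections C and D.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

for clf in (RandomForestClassifier(), SVC(), GaussianNB()):
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    cm = confusion_matrix(y_te, pred)
    tn = cm.sum() - cm.sum(axis=0) - cm.sum(axis=1) + np.diag(cm)  # per-class TN
    fp = cm.sum(axis=0) - np.diag(cm)                              # per-class FP
    print(type(clf).__name__,
          round(accuracy_score(y_te, pred), 3),                    # equation (1)
          round(recall_score(y_te, pred, average="macro"), 3),     # eq. (2), macro
          round((tn / (tn + fp)).mean(), 3))                       # eq. (3), macro
```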
Figure VII. Flow Diagram of the system

IV. RESULTS

A comparison of the evaluation metrics for Random Forest, Support Vector Machine and Naive Bayes is presented in Table 1.

[Figure IX. Bar diagram of (a) sensitivity and (b) specificity for each class.]

TABLE II. COMPARISON OF DIABETIC RETINOPATHY DETECTION BY VARIOUS RESEARCHERS

Reference            Number of classes   Method                                                                         Accuracy (%)   Sensitivity (%)   Specificity (%)
Acharya et al. [6]   5                   Higher-order spectra                                                           72             82.5              88.9
Lim et al. [7]       5                   Blood vessels, exudates, microaneurysm, haemorrhage                            75.9           80                86
This work            5                   Histograms and zeroth Hu moments of blood vessel, exudates, microaneurysm     76.5           77.2              93.3

Sinthanayothin et al. [1] distinguished Diabetic Retinopathy from a healthy retina using image processing techniques. In this proposed system, fundus images were preprocessed using adaptive local contrast enhancement. This method, established on a multilayer neural network, produced 70.21 percent sensitivity and 70.66 percent specificity.

Kahai et al. [3] developed a system for the initial identification of Diabetic Retinopathy. The identification system is based on a binary-hypothesis testing problem that results only in yes or no. The Bayes optimization criterion was applied to the raw fundus images for the initial identification of DR. This method was able to detect the appearance of microaneurysms correctly with a sensitivity of 80 percent and a specificity of 63 percent.

Wang et al. [4] classified the healthy, moderate, and severe DR stages using morphological image processing approaches and a feedforward deep learning network. In this system, the existence and covering region of the components of the lesions and blood vessels are selected as the main features. The classification efficiency of this work was 74 percent, the sensitivity was 81 percent, and the specificity was 92 percent. Featuring the lesions and blood vessels, and texture parameters, they classified the input images into healthy, Moderate, and Severe DR [4].

An automatic identification system for DR was proposed by Acharya et al. [6]. They classified normal, mild, moderate, severe and PDR stages using the two spectral layered invariant features of higher-order spectra approaches and a Support Vector Machine classifier [8]. This work reported an accuracy of 72%, a sensitivity of 82%, and a specificity of 88%.

In this work, the fundus images are classified into five classes using the histograms and the zeroth Hu moments of the exudates, microaneurysms, and blood vessels present in the eye. Random Forest is used as the classifier. The classifier is able to identify the unknown class accurately with an efficiency of more than 76.5 percent, with a sensitivity of 77.2 percent and a specificity of 93.3 percent.

Since the instances of each class are important in this work, we have calculated the evaluation metrics using the macro-averaging method, because the macro average reveals a better picture of the smaller classes, and it is to the point and more accurate when the performances on each and every class are equally important.

V. CONCLUSION

After studying the existing systems, we conclude that our proposed technique successfully detects Diabetic Retinopathy. Along with this, the proposed method classifies the images into five classes of Diabetic Retinopathy. Classification has been done based on three features: the area of exudates, the area of blood vessels and the area of microaneurysms. Using these features, we have classified the images into five classes: normal eye, mild NPDR, moderate NPDR, severe NPDR and PDR. Using the Random Forest classifier, we have gained accuracy = 76.5%, sensitivity = 77.2% and specificity = 93.3%. The metrics found in this work are compared with the existing works.

In this paper, we have performed the Diabetic Retinopathy classification using the Random Forest classifier with some essential features like exudates, blood vessels and microaneurysms. In the future, we hope to make it work with some more classifiers, like K-Nearest Neighbor classifiers and so on, using some secondary features like haemorrhage as well. Moreover, we can perform this classification method using a larger dataset of infected eyes with a neural network model in the future.

REFERENCES

[1] C. Sinthanayothin, J. F. Boyce, T. H. Williamson, H. L. Cook, E. Mensah, S. Lal, and D. Usher, "Automated detection of diabetic retinopathy on digital fundus images," Diabetic Medicine, vol. 19, no. 2, pp. 105-112, 2002.
[2] Singalavanija, A., Supokavej, J., Bamroongsuk, P., Sinthanayothin, C., Phoojaruenchanachai, S., and Kongbunkiat, V. Feasibility study on computer-aided screening for diabetic retinopathy. Jap. J. Ophthalmology, 2006, 50(4), 361-366.
[3] P. Kahai, K. R. Namuduri, and H. Thompson, "A decision support framework for automated screening of diabetic retinopathy," Hindawi Publishing Corporation, Feb 02, 2006.
[4] H. Wang, W. Hsu, K. Goh, and M. Lee, "An effective approach to detect lesions in color retinal images," vol. 2, 02 2000, pp. 181-186.
[5] Nayak, J., Bhat, P. S., Acharya, U. R., Lim, C. M., and Kagathi, M. Automated identification of different stages of diabetic retinopathy using digital fundus images.
[6] U. R. Acharya, E. Y. K. Ng, J. H. Tan, V. S. Subbhuraam, and N. Kh, "An integrated index for the identification of diabetic retinopathy stages using texture parameters," Journal of Medical Systems, vol. 36, pp. 2011-2020, 02 2011.
[7] C. M. Lim, U. R. Acharya, E. Y. K. Ng, C. Chee and T. Tamura, Proceedings of the Institution of Mechanical Engineers, Part H: Journal of Engineering in Medicine, 2009, 223: 545.
[8] Osareh, A.; Mirmehdi, M.; Thomas, B.T.; Markham, R. In MICCAI'02: Proceedings of the 5th International Conference on Medical Image Computing and Computer-Assisted Intervention-Part II; Springer-Verlag: London, UK, 2002; pp. 413-420.
Methodology for Syngas Energy Assessment
Melda Ozdinc Carpinlioglu
Gaziantep University
Department of Mechanical Engineering
Gaziantep, Turkey
melda@gantep.edu.tr
Abstract— The energy assessment of syngas produced by solid granulated fuel decomposition in a special microwave plasma gasification process is the topic of this presentation. Instantaneous measurements of the syngas volumetric content, covering the total gasification time of the process, are compared with the gas chromatography analysis of the sample syngas collected during approximately 30% of the total gasification time. The instantaneous volumetric content of the syngas and the gasification-time- and location-averaged process temperature, defined as the syngas temperature Tsyn, are the major energy assessment parameters. The gas chromatography analysis verifies the transient nature of syngas production.

Keywords— syngas, volumetric content, gas chromatography, transient, instantaneous measurement

I. Introduction

The criticism of the literature and the state of the art, in the light of research between 2015-2018 regarding microwave plasma gasification in a test system, MCw GASIFIER, for biomass conversion to syngas, is available in [1,2,3,4,5]. A review on the manner [1], together with a methodology for the thermodynamical treatment of the process [2], can be treated as the pre-research publications. The article [3] is an overall treatment of the research, discussing syngas generation with a variety of fuels, as a post-research article. A Ph.D. study [4] was completed with the support of the research [5]. In terms of the current efforts in the relevant field, a proceedings paper was presented in May 2021 [6]. It is denoted in [6] that research on microwave (MCw) plasma gasification is ongoing for the recycling of biomass and waste-to-energy conversion, due to the importance of the manner in terms of the advantages of MCw plasma gasification and the existing gaps in the physical nature of the process [7,8,9,10,11]. The major attention in this paper is given to the methodology for the energy content determination of syngas produced by MCw plasma gasification of solid granulated fuel. The operational results of the MCw GASIFIER are referred to for this purpose. The available data on the instantaneous volumetric content of the generated syngas, and the temperature measurements during the total decomposition period, are used together with the data from the gas chromatography analysis of syngas samples stored before the termination of the decomposition.

The discussion is based on a brief of the operational test cases of the MCw GASIFIER [1-5]. Syngas energy assessment is described with the critical parameters. It is also aimed to

In the test cases, sawdust (SD) and polyethylene pellets (PP) were used as fuel. Syngas was defined as the gasified amount of fuel, excluding the ash that left the process but including the supplied amount of air. Therefore, syngas production is governed by the nature of the fuel and the MCw plasma decomposition process (Figure I).

Monitoring of the gasification was based upon instantaneous measurements (IM) of the local instantaneous temperatures, T, along the reactor, and of the syngas volumetric content. The local instantaneous temperature measurements were taken by B-type (Pt18Rh-Pt) thermocouples located along the reactor at 5 specified locations. The thermocouples had a sensitivity of ±4 °C. The local temperature measurement was at a 1-second sampling frequency. A semi-continuous commercial gas analyzer, the MRU VARIO PLUS, was used to measure the syngas volumetric content. CO, CO2, CH4, H2, N2 and O2 in the syngas were determined with an accuracy of up to 100% and up to 25% respectively. The syngas volumetric content measurement was at a 2-second sampling frequency. Fuel decomposition occurred over a defined process gasification time, tg. Instantaneous measurements were executed from t = 0 to t = tg. The gasification time tg resulted in no solid fuel left in the reactor, with a minute amount of ash filtered from the generated syngas through a cyclone separator located in front of the MRU VARIO PLUS gas analyzer.

[Figure I. Sketch of the process [5]: the reactor control volume with five thermocouple stations T(tg) 1-5; solid fuel charge mfuel = 250 g at t = 0 and mfuel = 0 g at t = tg; plasma input of air and electrical power; syngas and ash routed through the separator to the MRU Vario-Plus gas analyzer.]
described with the critical parameters . It is also aimed to Stored amount of syngas during fuel decomposition before
determine the functional relationships between selected the termination of the process was collected in begs during
parameters expressing the physical process of gasification. the so called storage time, t s .The chromatographic analysis
, GC of stored syngas samples was done in TUBITAK-
II. Brief on process, basic parameters and MAM Laboratory according to TS EN ISO 6974 and TS EN
methodology ISO 6975 standards . Atmospheric pressure and reference
The operation of MCw GASIFIER[5] was such that air standard temperature T R = 20 ° C were valid in the
at standard temperature and pressure condition was used as analysis. C1-C6 hydrocarbons( Ethane, Ethylene, Propylene
plasma carrier. Granulated particles of different types of ,I-Pentane n- Hexane ,Butene , etc ) besides CO, CO2, CH4,
biomass having the particle size range of 0.1 mm – 1 mm H2, N2 and O2 components were determined .
were loaded as a static fuel bed in the reactor. Coal, C
34
The local instantaneous temperatures during the fuel decomposition process, tg, varied as a function of fuel and process. Similarly the volumetric amounts of CO, CO2, CH4, H2, N2 and O2 in syngas also varied during tg. Therefore the process-decomposition-reactor temperature can be defined as the syngas temperature, Tsyn, as an ensemble averaged parameter using the collected local instantaneous temperature data during tg. Similarly the molecular weight of syngas, Msyn, as a secondary ensemble averaged parameter, can be calculated using the collected syngas volumetric content data during tg. The energy content of syngas can be given by the HHV of syngas. HHVsyn can be calculated based upon the Dalton-Amagat model on the instantaneous, ensemble averaged syngas volumetric content, similar to the calculation of Msyn. The calculations are based upon the assumption of perfect gas treatment for syngas and refer to Tsyn [1,2,3,5]. HHVsyn, the relative density, RD, and the Wobbe Index (WI) of syngas [12,13] can be determined in the GC analysis of the stored syngas sample. The GC analysis is based upon the treatment of perfect gas and real gas assumptions for syngas separately. The determination of gross and net quantities of the parameters is also available [5]. Table I lists the defined parameters. HHVsyn is the common parameter of the different methodologies.

Table I. Parameters for syngas energy assessment derived from IM and GC analysis

Parameter | Base | Definition | Explanation
tg | IM | Process duration | No solid fuel left in reactor
ts | GC | Storage time | ts < tg
Tsyn | IM | Process-reactor-syngas temperature | Ensemble averaged, tg based local temperatures in reactor during tg
Msyn | IM | Molecular weight of syngas | Calculated; Dalton-Amagat model
HHVsyn | IM | Higher heating value of syngas | Calculated; perfect gas mixtures at Tsyn
HHVsyn | GC | Higher heating value of syngas | Calculated; perfect gas / real gas
RDsyn | GC | Relative density of syngas | Syngas density / air density; perfect gas / real gas
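The ensemble averaging behind Table I is compact enough to sketch. The following minimal Python illustration assumes the IM time series are available as arrays; the component molecular weights are standard values, while the volumetric heating values are generic handbook figures used only for illustration, not the values of [5].

# Sketch: ensemble-averaged Tsyn, Msyn, HHVsyn and Mn from IM series.
import numpy as np

M = {"CO": 28.01, "CO2": 44.01, "CH4": 16.04,
     "H2": 2.016, "N2": 28.013, "O2": 31.999}   # kg/kmol
HHV = {"CO": 12.6, "CH4": 39.8, "H2": 12.7,
       "CO2": 0.0, "N2": 0.0, "O2": 0.0}        # MJ/m3, illustrative only
M_AIR = 28.97                                   # kg/kmol, standard air

def ensemble_parameters(T_locations, y_series):
    """T_locations: (5, N) local temperatures over 0 <= t <= tg.
    y_series: dict of component volume-fraction series (length N)."""
    T_syn = T_locations.mean()                  # location and time average
    y_bar = {k: np.mean(v) for k, v in y_series.items()}
    # Dalton-Amagat mixture rules on the ensemble-averaged fractions:
    M_syn = sum(y_bar[k] * M[k] for k in y_bar)
    HHV_syn = sum(y_bar[k] * HHV[k] for k in y_bar)
    M_n = M_syn / M_AIR                         # normalized molecular weight
    return T_syn, M_syn, HHV_syn, M_n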
Calculated The reason of the observed deviations is due to the
Higher Heating Value of Perfect Gas incomplete decomposition of fuel with t s < t g. The similar
HHV syn IM
Syngas Mixtures at
Tsyn
magnitudes of CH4 of syngas determined by IM and GC
Calculated data can be due to the completed CH4 formation for the
HHV syn GC
Higher Heating Value of Perfect Gas time period t = t s . The greatest magnitudes of O2 and N2
Syngas Real Gas by GC data also confirm that the gasification procedure is
Syngas
not completed for t s < t g since O2 and N2 are the
density / Air components of air supplied steadily and continuously while
RD syn GC
Relative density of density fuel decomposition sensed,in terms of formation of
syngas Perfect Gas CO2,CO,CH4 ,H2 of syngas is not completed.
Real Gas
35
Therefore the IM magnitudes can be treated as the correct ones resembling the physical nature of the process. Furthermore, the temperature acting in IM is Tsyn. It is under the effect of decomposition, since the ensemble average temperature during the process at different locations is used. The GC analysis is based on the reference temperature, TR, and Tsyn >> TR. Therefore the difference Tsyn - TR is a dominant term in the discussion.

Table II. Syngas energy based upon GC data for polyethylene pellets, PP, gasification at different power use, in comparison with IM data

ts/tg | GC Net RD | GC WI (MJ/m3) | Esyn from WI (kJ)
0.28 | 1.05 | 8.21 | 7563
0.3 | 0.98 | 5.41 | 4395
0.35 | 0.97 | 7.44 | 5582

ts/tg | IM WI (MJ/m3) | IM Esyn (kJ) | IM Mn = Msyn/Mair
0.28 | 2.89 | 3298 | 0.96
0.3 | 2.713 | 3274 | 0.94
0.35 | 2.573 | 3327 | 0.91
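For reference, the Wobbe Index in Table II follows its standard definition; the tabulated syngas energy appears to scale WI by the measured syngas volume, a relation stated here as an assumption for illustration:

\[ \mathrm{WI} = \frac{\mathrm{HHV}_{syn}}{\sqrt{\mathrm{RD}_{syn}}}, \qquad E_{syn} \approx \mathrm{WI} \cdot V_{syn} \]

On this reading, the first GC row (WI = 8.21 MJ/m3, Esyn = 7563 kJ) would correspond to a syngas volume of roughly 0.92 m3.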
Table II lists the syngas energy, Esyn, calculated based upon the WI and RD of the GC data, using the amount of syngas measured from the operational cases of IM. The assumption is based upon the similarity of syngas generation with the decomposition of fuel for t > ts, since the GC data is referred in the calculation. The values calculated with the IM data at different power use are in the same range of Esyn = 3200 kJ, whereas the values of Esyn derived from GC are dependent on power use, varying in the approximate order of 4000-7300 kJ. In fact, the combined use of data from GC and IM is one of the reasons. The second reason is the temperature difference, Tsyn - TR, which induces a sensible energy increase for the syngas in the GC analysis. The magnitudes of Esyn determined by WI utilization are greater than the magnitudes obtained by IM; the Esyn calculated by IM are 0.43, 0.74 and 0.59 of the Esyn of GC.

The net WI and RD values of the syngas generated by gasification of PP at different power use based on the GC analysis do not vary depending on the perfect gas or real gas treatment of syngas (Table II). The maximum magnitude of WI belongs to the 3 kW power use with 8.21 MJ/m3, with the values 5.41 MJ/m3 and 7.44 MJ/m3 corresponding to the 4.8 kW and 6 kW power use respectively. RDsyn is of the order of 1, meaning the density of syngas is similar to that of standard air.

WI, defined as a GC analysis parameter, is also calculated using the data gathered by IM as an alternative method. The results are given in Table II with the approximate magnitudes of 2.89, 2.7 and 2.57 MJ/m3. The WI magnitudes derived from IM are almost 0.35, 0.5 and 0.34 of the corresponding magnitudes of the GC analysis. Therefore, different methods give different values for Esyn. The combined method utilization seems not to be realistic due to the difference between ts and tg, the temperature difference Tsyn - TR, and the difference in the amount of syngas.

The instantaneous measurements are used to determine the molecular weight of syngas. The calculated molecular weight of syngas, Msyn, is given as a ratio to the molecular weight of standard atmospheric air, Mair, in Table II as the normalized molecular weight, Mn = Msyn/Mair. The magnitudes of Mn are of the same order as RD. This means that, almost independent of the method used, the produced syngas is similar to air in terms of density and molecular weight. The presence of air in syngas may be the reason. Therefore, the respective orders of O2 and N2 in syngas can be determined. The ratio O2/N2 of syngas, (O2/N2)syn, relative to that of air is defined as the normalized ratio (O2/N2)n. The variation of (O2/N2)n with (O2/N2)syn is shown in Figure II using all of the collected data with C, SD and PP gasification. The syngas generated by all kinds of fuel gasification almost follows the single line shown. The data behaviour in Figure II means that the oxygen in syngas is mostly less than its amount in standard pure atmospheric air. As can be seen, only 2 data points with C gasification have (O2/N2)n > 1. The presence of the CO and CO2 components in syngas, causing a reduction in pure O2, is the possible reason for this fact.

Figure II. O2/N2 of generated syngas expressed relative to air

As a further comment, Mn is given with Msyn in Figure III, in reference to all of the collected data with C, SD and PP gasification. The magnitudes of Mn are between 0.9 and 1.15. All data, roughly independent of fuel type, follow the single line shown.
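In compact form, the two normalizations used above are (taking (O2/N2)air = 0.21/0.79 for standard air, an assumption stated here for clarity):

\[ M_{n} = \frac{M_{syn}}{M_{air}}, \qquad \left(\mathrm{O_2/N_2}\right)_{n} = \frac{(\mathrm{O_2/N_2})_{syn}}{(\mathrm{O_2/N_2})_{air}} \]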
Figure III. Variation of Mn with Msyn (kg/kgmol)

C. Influence of Fuel Content on Syngas Content: Analysis of Decomposition

In order to describe the influence of fuel on the generated syngas composition and the thermochemical decomposition process, Figure IV and Figure V can be referred to for a final deduction, just as samples. IM data are used.

Figure IV. Influence of fuel on the generated syngas H2/CH4 with % H2 content

Figure V. Influence of fuel on the generated syngas CO/CO2 with % CO2 content

It is seen that an increase in the H2% content of syngas is coupled with an increase in the H2/CH4 ratio of syngas. Meanwhile, an increase in the CO2% of syngas is coupled with a decrease in the CO/CO2 ratio of syngas. The increase of the dependent parameter given on the abscissa and the used ratio definitions are the logical fact for the observed data behaviour. Therefore, Figures IV and V can be interpreted as sample plots of the thermochemical decomposition of fuel, which is a severe function of fuel type.

Acknowledgments

The author expresses her gratitude to TUBITAK for the completed research project with grant number 115M389 (2015-2018), and to the TUBITAK-MAM Gebze Laboratory.
3) Fuel decomposition-gasification is not steady. Therefore, syngas production is not at a constant rate and characteristics; instead it is a transient procedure.

The major parameters are the Tsyn, TR, Esyn, WI, RD and Mn of syngas. The thermochemical decomposition of fuel is absolutely a solid function of the fuel content. The conversion of the C and H in fuel to the CH4, CO, CO2 and H2 components of syngas describes the process. Furthermore, the respective orders of the CO/CO2 ratio and the H2/CH4 ratio of syngas indicate the path of thermochemical decomposition, on which further analysis is vital for a complete understanding.
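As context for the CO/CO2 and H2/CH4 indicators, the decomposition path can be read against the standard global gasification reactions (a generic textbook set, not taken from [1-5]): combustion and Boudouard reactions governing CO/CO2, and water-gas and methanation reactions governing H2/CH4:

\[ C + O_2 \rightarrow CO_2, \qquad C + CO_2 \rightarrow 2\,CO \]
\[ C + H_2O \rightarrow CO + H_2, \qquad C + 2\,H_2 \rightarrow CH_4 \]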
References
[1] Sanlisoy, A., Carpinlioglu Ozdinc, M., "A review on plasma gasification for solid waste disposal", International Journal of Hydrogen Energy, Vol. 42, 2017, pp. 1361-1365.
[2] Carpinlioglu Ozdinc, M., Sanlisoy, A., "Performance assessment of plasma gasification for waste to energy conversion: A methodology for thermodynamic analysis", International Journal of Hydrogen Energy, Vol. 43, 2018, pp. 11493-11504.
[3] Sanlisoy, A., Carpinlioglu Ozdinc, M., "Microwave plasma gasification of a variety of fuel for syngas production", Plasma Chemistry and Plasma Processing, Vol. 39, Issue 5, 2019, pp. 1211-1225.
[4] Sanlisoy, A., "An Experimental Investigation on Design and Performance of Plasma Gasification Systems", Ph.D. thesis, Mechanical Engineering Department, Gaziantep University, Turkey, 2018.
[5] Carpinlioglu Ozdinc, M., "Design, Construction and Performance Assessment of a Test Plant (Microwave Gasifier) 'MCwgasifier' Using Plasma Gasification for Solid Waste-Energy Conversion in Laboratory Scale - An Experimental Case for Knowhow on Plasma Technology", TUBITAK 115M389 Final Project Report, 2018.
[6] Carpinlioglu Ozdinc, M., "Perspectives on waste to energy conversion by microwave plasma gasification", Waste to Resources 9th International Symposium MBT, MRF & Recycling, Online Conference, 18-20 May 2021, Hanover, Germany.
[7] Bishoge, O. K., Huang, X., Zhang, L., et al., "The adaptation of waste-to-energy technologies: towards the conversion of municipal solid waste into a renewable energy resource", Environmental Reviews, Vol. 27, Issue 4, 2019, pp. 435-446.
[8] Munir, M. T., Mardon, I., Al-Zuhair, S., et al., "Plasma gasification of municipal solid waste for waste-to-value processing", Renewable & Sustainable Energy Reviews, Vol. 116, 2019, Article 109461.
[9] Gimzauskaite, D., Tamosiunas, A., Tuckute, S., et al., "Treatment of diesel-contaminated soil using thermal water vapor arc plasma", Environmental Science and Pollution Research, Vol. 27, Issue 1, 2020, pp. 43-54.
[10] Mukherjee, C., Denney, J., Mbonimpa, E. G., et al., "A review on municipal solid waste-to-energy trends in the USA", Renewable & Sustainable Energy Reviews, Vol. 119, 2020, Article 109512.
[11] Gadzhiev, M. Kh., Kulikov, Yu. M., Son, E. E., et al., "Efficient generator of low-temperature argon plasma with an expanding channel of the output", High Temperature, Vol. 58, Issue 1, 2020, pp. 12-20.
[12] Hobre Instruments, "Wobbe Index and Calorimeters: General Information", 2018, https://www.hobre.com/files/products/Wobbe Index General Information.rev.1.pdf
[13] Neutrium, "Wobbe Index", 2018, https://neutrium.net/properties/wobbe-index/ (accessed 08.04.2018)
3D Object Detection using Mobile Stereo R-CNN on Nvidia
Jetson TX2
Mohamed K. Hussein
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
mohamed.khaled@eng.asu.edu.eg

Mahmoud I. Khalil
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
mahmoud.khalil@eng.asu.edu.eg

Bassem A. Abdullah
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
babdullah@eng.asu.edu.eg
Abstract—3D Object Detection is one of the most important perception tasks needed by autonomous vehicles to detect different road agents like other vehicles, cyclists, and pedestrians, which is essential for driving tasks like collision avoidance and path planning. In this paper, our work is focused on 3D Object Detection for the car class from stereo images, without LIDAR supervision during either training or inference, and on the challenging task of running 3D Object Detection on an embedded target, the Nvidia Jetson TX2, by modifying the Stereo R-CNN model and reducing the model size to approximately one third of the size of the original model to be more suitable for embedded targets. Experiments on the KITTI dataset showed that our model's inference time is 1.8 seconds and its average precision for the moderate car class is 17% on the test set. Our model decreases training and inference time by approximately 60% with a 13% drop on the test set, which is an expected trade-off when decreasing the number of parameters inside the model.

Keywords—3D Object Detection, Stereo Vision, Autonomous Driving, Embedded Systems

I. Introduction

3D Object Detection extends 2D object detection by also detecting the pose and orientation as well as the real-world dimensions of detected objects, as shown in Figure I. This task is essential in autonomous driving for tasks like motion prediction and accurate object localization, which is important for safe driving.

Figure I. 2D Object Detection (top) and 3D Object Detection (bottom)

The most used sensors for 3D Object Detection are LIDAR, stereo cameras, and mono cameras. State-of-the-art models heavily rely on LIDARs, but LIDARs have the disadvantage of being very expensive, costing around $70000 compared to cameras, which can cost less than $1000. Also, LIDAR point clouds contain sparse information compared to the dense information in images. The main advantage of LIDAR is accurate depth information, which is essential for 3D Object Detection; cameras cannot provide the same level of accuracy, and this is why there is a huge gap in average precision between LIDAR-based methods and camera-based methods. Camera-based methods are divided into stereo-based methods and methods relying on just one RGB image. Stereo-based methods perform better than mono-based methods because depth can be better estimated from stereo images.

While average precision is usually the metric used for comparing different methods, in this paper our work is focused more on the target platform and on running 3D Object Detection on the embedded target Nvidia Jetson TX2 with reduced memory usage and less inference time. Our contribution is testing our proposed model on the KITTI [45] 3D Object Detection task using an embedded target, while other methods use more powerful GPUs and sometimes use multiple GPUs for training and inference. For this purpose, we modified the Stereo R-CNN [1] model, which is a stereo-based end-to-end deep neural network that extends Faster R-CNN [2] and does not rely on LIDAR supervision during training or inference. The Nvidia Jetson TX2 has a 256-core Nvidia Pascal GPU architecture with 256 Nvidia CUDA cores, a Dual-Core Nvidia Denver 2 64-bit CPU and a Quad-Core ARM Cortex-A57 MPCore, and 8 GB of 128-bit LPDDR4 memory.

In the next section, we briefly review different methods used for 3D Object Detection. In section III, we illustrate the modifications made to the Stereo R-CNN network to make it more suitable for an embedded target. In section IV, we explain the experiment setup, report the results achieved, and compare them to the original model based on the evaluation on the KITTI validation set and test set. In section V, we conclude this paper with a summary of our work and contribution.

II. Related Work

A. LIDAR Based Methods

LIDAR-based methods can be mostly classified into two categories depending on how the point cloud is represented. Grid-based methods depend on point cloud voxelization, like [3]-[8], or on projecting point clouds to 2D grids [9]; these grids are then processed by 2D or 3D CNNs. Point-based methods process raw point clouds directly, like [10]-[15]. Also, some methods use both representations, like [16]-[18] and [46]. There are other categories, like [19], which uses a graph representation for point clouds. At the time of writing this paper (April 2021), the top LIDAR method on the KITTI 3D benchmark is SE-SSD [46], which has an average precision of 82.54% in the car moderate class. This is also the top method regardless of approach.

B. LIDAR + Image Fusion Based Methods

Fusion methods can be mostly classified into three categories. Early Fusion, in which inputs from the camera and LIDAR are fused very early at the input of the deep neural network, like [20]. Late Fusion, in which there is a separate pipeline for the camera and LIDAR and they are fused at a later stage in the model, like [21]-[27]. Deep Fusion, in which inputs are fused multiple times deep inside the model, like the leading work MV3D [28] and [29], [30]. At the time of writing this paper, the top fusion method on the KITTI 3D benchmark is CLOCs [27], which has an average precision of 80.67% in the car moderate class.

C. Stereo Images Based Methods

Stereo-based methods can be mostly classified into two categories. LIDAR-supervised methods need LIDAR input for training and use only stereo images during inference, like [31]-[38]. Pseudo-LIDAR was proposed in [31] to reduce the gap between 3D object detection using LIDAR and a stereo camera. It relies on building point clouds from depth maps and
feeding the generated point cloud to a LIDAR-based 3D Object Detector; this approach achieved the best average precision on the KITTI benchmark when it was published. Pseudo-LIDAR can be used with mono or stereo images by just changing the depth estimation module, which creates the depth map that is later transformed into a pseudo point cloud. In [36], Pseudo-LIDAR++ was proposed, which builds over [31] by enhancing depth estimation, which is the first step in producing the pseudo point cloud. In [34], end-to-end Pseudo-LIDAR was proposed, since previous methods trained two different blocks for depth estimation and 3D object detection separately. The Deep Stereo Geometry Network for 3D object detection was proposed in [32]; it is an end-to-end network relying on constructing a 3D geometric volume and a plane-sweep volume from 2D features, then using the 3D geometric volume for 3D object detection and the plane-sweep volume for depth estimation simultaneously. In [33], confidence-guided 3D object detection is proposed, doing depth estimation separately for foreground and background pixels and giving a confidence estimate for each pixel, which is used with the generated point cloud as input for a 3D Object Detector. In [35], ZoomNet was proposed; in this approach, the 2D regions of interest are resized to have the same resolution, so that near and far objects are analyzed with the same resolution, and an instance point cloud is generated from the depth estimation for each instance bounding box. In [37], object-centric stereo matching is proposed, aiming to enhance the stereo matching problem for 3D object detection: other approaches used depth estimation networks with depth maps as the main output, not point clouds, so in this approach instance segmentation is done and object-centric instance point clouds are generated to enhance the produced point clouds, which are then fed to a 3D object detector. In [38], a Continuous Disparity Network (CDN) is proposed with a Wasserstein distance-based loss function to enhance disparity estimation; this CDN can be used with any stereo 3D Object Detector that relies on disparity estimation, like DSGN [32].

No-LIDAR-supervision methods use only stereo images in both training and inference, like [1] and [39]-[41]. Stereo R-CNN [1] is the baseline model for our work, so it will be discussed later in section III. In [39], the images are used to generate a semantic map with 2D bounding boxes and a disparity image, which is then projected to a grid that is used to estimate the 3D bounding box. In [40], Instance Depth Aware (IDA) 3D Object Detection is proposed, in which a stereo region proposal network is used to get 2D bounding boxes; the IDA module then estimates the center of the 3D bounding box, and the region proposals are used to determine the position and orientation of the 3D bounding box. In [41], Disp R-CNN is proposed, in which the stereo pair is used as input to a Stereo Mask R-CNN network to produce instance masks for objects of interest; instance disparity is then generated, which is used to generate instance point clouds that are fed to a 3D Object Detector.

Stereo methods that rely on the concept of generating point clouds perform better than methods that use images directly for 3D Object Detection, but most of them require the presence of ground truth point clouds during training and need a lot of computation power, as there are two steps in the process: first a pseudo point cloud is generated, then it is fed to a point cloud-based 3D Object Detector.

At the time of writing this paper, the top stereo method on the KITTI 3D benchmark is CDN-DSGN [38], which has an average precision of 54.22% in the car moderate class.

D. Mono Images Based Methods

Monocular 3D Object Detection is the most challenging method due to the complete lack of depth information, and therefore there is a huge performance gap between monocular methods and other methods. At the time of writing this paper, the top monocular method [42] on the KITTI 3D benchmark has an average precision of 12.72% in the car moderate class; it is a video-based method that uses temporal cues and kinematics to improve localization accuracy.

III. Mobile Stereo R-CNN

Mobile Stereo R-CNN, as shown in Figure II, is based on Stereo R-CNN, which is an end-to-end deep neural network for 3D Object Detection from stereo images without the supervision of LIDAR in either training or inference. Our main target was running 3D Object Detection on an embedded target like the NVIDIA Jetson TX2. So first we analyzed the building blocks of Stereo R-CNN to identify bottlenecks that
Figure II. Mobile Stereo R-CNN Network Architecture
can be modified to enhance the training time, inference time, and memory footprint, and we decided to change the backbone network from ResNet101-FPN to MobileNetV2-FPN. The new model size is approximately one third of the size of the original model, and the inference time is reduced by approximately 60%, with a minor drop of 6% in average precision on the moderate car class on the KITTI validation set and a drop of 13% in average precision on the moderate car class on the KITTI test set. First, we review the building blocks of Stereo R-CNN, then discuss the backbone replacement.

A. Stereo R-CNN

Stereo R-CNN is based on Faster R-CNN. It takes as input two stereo images resized so that the shortest side length is 600px. The backbone extracts feature maps from each image; these feature maps are then used with a feature pyramid network to produce the output of the base model with 128 channels, with the following dimension ratios relative to the input dimensions: 1/4, 1/8, 1/16, 1/32, and 1/64. Five anchor scales (32, 64, 128, 256, 512) and three ratios (0.5, 1, 2) are used in the network.

Layer 0 consists of a 2D convolution operation and 2 bottleneck operations. Layer 1 consists of 1 bottleneck operation. Layer 2 consists of 3 bottleneck operations. Layer 3 consists of 7 bottleneck operations. Layer 4 consists of 4 bottleneck operations followed by a 2D convolution to produce the final feature map. The reason for choosing these layers was to match the lateral connections made to the feature pyramid network of the original model.
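The backbone swap can be sketched as follows. This is a minimal illustration, not the authors' code: the 128-channel FPN width and the extra 1/64 level follow the description above, while the torchvision layer indices used as tap points are assumptions of this sketch.

# Sketch of a MobileNetV2-FPN backbone in PyTorch (torchvision >= 0.13).
import torch
from torch import nn
from torchvision.models import mobilenet_v2
from torchvision.ops import FeaturePyramidNetwork

class MobileNetV2FPN(nn.Module):
    def __init__(self, fpn_channels: int = 128):
        super().__init__()
        self.features = mobilenet_v2(weights=None).features
        # Indices where the spatial stride reaches 4, 8, 16 and 32
        # (illustrative choice, not necessarily the authors' split).
        self.taps = {3: "p2", 6: "p3", 13: "p4", 18: "p5"}
        in_channels = [24, 32, 96, 1280]        # channels at the taps
        self.fpn = FeaturePyramidNetwork(in_channels, fpn_channels)
        self.pool = nn.MaxPool2d(1, stride=2)   # extra 1/64 level

    def forward(self, x):
        laterals = {}
        for i, layer in enumerate(self.features):
            x = layer(x)
            if i in self.taps:
                laterals[self.taps[i]] = x
        outs = self.fpn(laterals)               # 1/4 ... 1/32, 128 ch each
        outs["p6"] = self.pool(outs["p5"])      # 1/64 level
        return outs

# Applied independently to each image of the stereo pair:
# feats_left = MobileNetV2FPN()(torch.rand(1, 3, 600, 800))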
Figure III. Precision-Recall curves for Orientation estimation, 3D object detection, and Bird's eye view detection respectively
[16] Y. Chen, S. Liu, X. Shen, and J. Jia, "Fast point R-CNN," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019.
[17] C. He, H. Zeng, J. Huang, X.-S. Hua, and L. Zhang, "Structure aware single-stage 3d object detection from point cloud," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[18] S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, "PV-RCNN: Point-voxel feature set abstraction for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[19] W. Shi and R. Rajkumar, "Point-GNN: Graph neural network for 3d object detection in a point cloud," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[20] M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz, and H. Michael Gross, "Complexer-yolo: Real-time 3d object detection and tracking on semantic point clouds," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[21] X. Du, M. H. Ang Jr., S. Karaman, and D. Rus, "A general pipeline for 3d detection of vehicles," 2018.
[22] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, "Frustum pointnets for 3d object detection from RGB-d data," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[23] M. Liang, B. Yang, Y. Chen, R. Hu, and R. Urtasun, "Multi-task multi-sensor fusion for 3d object detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[24] Z. Wang and K. Jia, "Frustum convnet: Sliding frustums to aggregate local point-wise features for amodal 3d object detection," 2019.
[25] S. Vora, A. H. Lang, B. Helou, and O. Beijbom, "Pointpainting: Sequential fusion for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[26] J. H. Yoo, Y. Kim, J. Kim, and J. W. Choi, "3d-cvf: Generating joint camera and lidar features using cross-view spatial feature fusion for 3d object detection," Lecture Notes in Computer Science, pp. 720–736, 2020.
[27] S. Pang, D. Morris, and H. Radha, "Clocs: Camera-lidar object candidates fusion for 3d object detection," 2020.
[28] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, "Multi-view 3d object detection network for autonomous driving," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[29] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. Waslander, "Joint 3d proposal generation and object detection from view aggregation," 2018.
[30] M. Liang, B. Yang, S. Wang, and R. Urtasun, "Deep continuous fusion for multi-sensor 3d object detection," in Proceedings of the European Conference on Computer Vision (ECCV), September 2018.
[31] Y. Wang, W.-L. Chao, D. Garg, B. Hariharan, M. Campbell, and K. Q. Weinberger, "Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[32] Y. Chen, S. Liu, X. Shen, and J. Jia, "DSGN: Deep stereo geometry network for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[33] C. Li, J. Ku, and S. L. Waslander, "Confidence guided stereo 3d object detection with split depth estimation," 2020.
[34] R. Qian, D. Garg, Y. Wang, Y. You, S. Belongie, B. Hariharan, M. Campbell, K. Q. Weinberger, and W.-L. Chao, "End-to-end pseudo-lidar for image-based 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[35] Z. Xu, W. Zhang, X. Ye, X. Tan, W. Yang, S. Wen, E. Ding, A. Meng, and L. Huang, "Zoomnet: Part-aware adaptive zooming neural network for 3d object detection," 2020.
[36] Y. You, Y. Wang, W.-L. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell, and K. Q. Weinberger, "Pseudo-lidar++: Accurate depth for 3d object detection in autonomous driving," 2020.
[37] A. D. Pon, J. Ku, C. Li, and S. L. Waslander, "Object-centric stereo matching for 3d object detection," in 2020 IEEE International Conference on Robotics and Automation (ICRA), IEEE, May 2020.
[38] D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger, and W.-L. Chao, "Wasserstein distances for stereo disparity estimation," in NeurIPS, 2020.
[39] H. Konigshof, N. O. Salscheider, and C. Stiller, "Realtime 3d object detection for automated driving using stereo vision and semantic information," in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), IEEE, Oct. 2019.
[40] W. Peng, H. Pan, H. Liu, and Y. Sun, "Ida-3d: Instance-depth-aware 3d object detection from stereo vision for autonomous driving," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[41] J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou, and H. Bao, "Disp R-CNN: Stereo 3d object detection via shape prior guided instance disparity estimation," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[42] G. Brazil, G. Pons-Moll, X. Liu, and B. Schiele, "Kinematic 3d object detection in monocular video," in Computer Vision – ECCV 2020, pp. 135–152, Springer International Publishing, 2020.
[43] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "Mobilenetv2: Inverted residuals and linear bottlenecks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[44] X. Chen, K. Kundu, Y. Zhu, H. Ma, S. Fidler, and R. Urtasun, "3d object proposals using stereo imagery for accurate object class detection," 2017.
[45] A. Geiger, P. Lenz, and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite," in Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[46] Z. Li, Y. Yao, Z. Quan, W. Yang, and J. Xie, "Sienet: Spatial information enhancement network for 3d object detection from point cloud," 2021.
Investigation of Algorithmic Architecture Design Method by
Using Digital Technology to Increase Flexibility in Design
Process
Javad Eiraji
Faculty of Architecture and Design, Eskisehir Technical University, Eskisehir, Turkey
javadeiraji@eskisehir.edu.tr

Aida Ghiaseddin
Department of Architecture, Islamic Azad University, Tehran, Iran
st_a_ghiaseddin@azad.ac.ir
Abstract—Architectural projects in metropolitan cities are full of diversity and innovation. Observing these global architectural projects generates new ideas, and in this way we can understand this architectural diversity. In addition, as this architectural variety increases, conventional tools and methods for designing architectural projects become insufficient, and new methods and tools need to be provided. One of these tools and methods is designing by an algorithmic method. The algorithmic architecture method allows the designer to create various designs quickly and provides precise control over the project during the architectural process. This paper is a qualitative study which seeks to evaluate algorithmic design as a powerful tool for efficient digital architecture and for implementing flexible and creative ideas. The aim of this study is to explore an approach that increases designers' creativity and provides a framework for the basic requirements of architectural design.

Keywords—Algorithmic Design Process, Performance-Based Design, Parametric Architecture, Computer-Aided Design, Flexible Design

I. Introduction

The term "algotecture" was derived from algorithmic architecture and represents the use of algorithms for architectural design. It has become widespread in architectural design over the past few decades. Parametric instruments work based on algorithms, which can apply exact control over the geometry of the design throughout the design process. The flexibility and responsiveness of these instruments to design changes have resulted in the usefulness and applicability of parametric models, particularly in designing complex and unique models [2].

II. Literature Review

According to Patrick Schumacher, a successful design employs technologies and instruments which help the designer proceed with his design. Regarding algorithmic and parametric architecture, he states that architecture should move from a single-layer system and an application for design editions toward a multi-layer and yet consistent and continuous design of multisystems such as the envelope, structure, and internal subsections. The application of any design operation on a multisystem must be correlated with the other components of the system and influence them.
The conception of this design originates from production design, whose primary purpose is to help human designers discover the design space through computational instruments [8]. This purpose can be realized by computers through fast sampling. Parametric design systems are discussed as production instruments in architectural design, and the parametric instruments are deployed algorithmically, so they apply more computational control over the design geometry.

The role of parametric modeling is addressed as that of a production design instrument in architecture. Parametric design is a computational method that can act as a productive (generative) method and as an analysis. Besides, it has recently received good acceptability in terms of practicality, research, and education. There are some debates about the limitations of parametric systems as an exploratory instrument, mainly addressing their role in architectural design, flexibility, and design complexity. According to its algorithmic basis and its potential for expanding the design discovery space by changing the algorithm variables, i.e., parameters, parametric design can be classified as the third class of production systems [9].

Algorithmic thinking and algorithmic design correlate with the concept of product design. Terzidis argues that the inductive strategy of algorithms can discover production processes or simulate complex phenomena. Algorithms can be assumed to be an extension of the human brain and might facilitate mutation in some regions of unpredictable potential [10].

A. Parametric Design Systems

The term parametricism was first introduced by P. Schumacher and has later been described more comprehensively as a combination of design concepts which provides a new and complex discipline based on fundamental principles.

Parametric systems work according to algorithmic rules. An algorithm is a limited set of instructions for achieving a specific aim. An algorithm takes a value or a group of values as the input, performs several countable stages which convert or change the input, and finally generates one or multiple values as the output [11].

A parameter is the value or measurement of a variable, which can vary. Every object (thing) in a parametric system might have some specific rules. When a parameter changes, other parameters will be adapted to it automatically [12].
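To make this rule-driven dependency concrete, the following minimal sketch (illustrative only, and not tied to any of the software packages discussed below) defines a small parametric "definition" in which changing one input parameter automatically regenerates all dependent values:

# A toy parametric definition: a facade grid whose panel size and
# glazed area are derived from a few input parameters.  Changing an
# input and re-running the rules regenerates the whole design.
from dataclasses import dataclass

@dataclass
class FacadeParams:
    width: float = 24.0    # overall facade width (m)
    height: float = 12.0   # overall facade height (m)
    columns: int = 8       # number of panel columns
    rows: int = 4          # number of panel rows
    opening: float = 0.4   # fraction of each panel that is glazed

def generate(p: FacadeParams):
    """Rule set: panel dimensions and openings follow the parameters."""
    pw, ph = p.width / p.columns, p.height / p.rows
    return [(i * pw, j * ph, pw, ph, p.opening * pw * ph)
            for i in range(p.columns) for j in range(p.rows)]
    # each tuple: (x, y, panel width, panel height, glazed area)

base = generate(FacadeParams())
variant = generate(FacadeParams(columns=12, opening=0.6))
# one changed input, and every dependent panel dimension and
# glazed area updates automatically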
Parametric design is usually used for complex building molds, energy and structural optimization, other repetitive work, and design sampling. As a novel digital design method, parametric design is entirely different from CAD/CAM because its algorithmic features are rule-based (Fig I).

Figure I. Simplified examples of parametric variations generated in the same system. [1]

In terms of computation, there is no difference between algorithmic and parametric systems. Algorithms presumably work on parameters, and the main component of a parametric system is the algorithm itself, which is called a design or definition. However, unlike algorithmic design, parametric systems emphasize explicit and direct manipulation of the parameter values to change and modify the design artifact. The significant advantage of parametric modeling is that it allows us to change the parameters at any level of the design process [13].

So far, various parametric modeling techniques have been developed for visual purposes (e.g., form finding) and for other functional or performance-related purposes. Figure II shows some of the usual parametric modeling techniques used for form-finding, e.g., repetition (Fig II – left) and division (Fig II – right). Other similar methods may include tiling and the like (Fig II).

Figure II. Examples of parametric modeling techniques for form finding: Repetition (Left) and Subdivision (Right). [1]

Most software supports free-form modeling, and many packages have a scripting plug-in (add-on) enabling the designers to create rule algorithms directly and more freely. Some of these software packages are introduced briefly in this division. Rhino and Grasshopper are among the most commonly known parametric instruments, particularly in architecture. However, Digital Project (DP) and Generative Components (GC) are more suitable for large projects with multiplex and geometric associations [13].

DP is a compelling software package that can effectively shoulder geometric and complicated parameters, making it an ideal choice for sizeable parametric design projects.

Rhino is an independent, NURBS-based instrument developed by Robert McNeel. Since the 1990s, it has been widely used in different subjects, including architecture, industrial design, jewelry design, automotive, and marine design. Furthermore, Grasshopper is a rule algorithm editor with a graphic interface incorporated into Rhino as a scripting plug-in. This structure makes specific definition files that link to the main parametric model in Rhino.

Rhino is commonly used as a production instrument, rather than a correction (modification) instrument, in the parametric design process. Compared to other parametric software, Rhino and Grasshopper are now widely used in practice and education. Such wide use is attributed to the simplicity of their function as a visual programming instrument.

A significant difference between parametric design and other standard computer design methods is the ruleset converted into the main design elements and procedures [14]. In this technique, through the parametric design procedure, the architects can return to any stage of the design
process to change the parameters or revise the rules, to modify the design procedure for different purposes, or to perform different tests and experiments. Such flexibility allows the architects to keep the design open (Fig III).

Figure III. A large number of form variations generated in a parametric design system. [1]

In parametric design, once the rule algorithms are created, many design choices can be generated (Fig III). This design sampling can expand the design abilities considerably and extend the designer's thinking. Additionally, the designers do not need to settle on any solution early [15]. This feature allows keeping the maximum potential in the process. Parametric design is not only a new design instrument; it is a new method of design thinking [16].

B. Describing Design based on Parametric Logic

Parametric modeling as a design integration method diverges the design space to discover many kinds of similar parametric models. Parametric modeling can thus open a broader region for design discovery. A change in a parameter causes a concurrent change in the form: while maintaining the basic coherence of the design, it applies the changes to the form.

In Figure IV, the multiple geometric arrangements of the British Petrol Headquarters in Sunbury, presented by Adams Kara Taylor, show that the creative discovery of the roof structure is based on a parametric approach which takes both aesthetic and structural considerations into account. The parametric model turns into a controlled environment for design discovery which searches for a greater design (Fig IV).

The Aviva Stadium in Dublin and the Kilden Performing Arts Center in Norway are two works that are exemplary of these two approaches (Fig V). During the design process, the architects were ultimately driving the overall form and cladding of the building, and the engineers were driving the structural member sizing and positioning. On the architectural side, certain form explorations were being made in response to certain criteria such as concourse width requirements, floor area ratios, or simply beautifying the shape; on the engineering side were the structure of the roof trusses and the cladding system, designed as a rain screen consisting of interlocking louvers. A single parametric model was shared between the architectural and engineering offices, which acted both as a design tool and as a coordination platform. This allowed the integration of the design processes of the form, structure and façade, allowing a fast response to design changes. Analysis tools were coupled with the parametric model and provided a quick analytical reaction to the geometry. The sharing of the parametric model across the other design members and the full integration of the engineering analysis applications realized the benefits of a parametric approach (Fig VI).
Figure VI. Aviva Stadium in Dublin, Ireland. [6]

The geometry of the façade is a ruled surface that spans between a straight upper edge and a curved lower edge. Here, a parametric system was used not during the form-finding process of the curvilinear roof, but during detail design for the parametric optimization of form and performance (Fig VII).
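As a side note, a ruled surface of this kind admits a simple parametric description: every point lies on a straight line connecting the two edge curves, which is one reason such geometry lends itself to parametric optimization (the notation here is introduced only for illustration):

\[ S(u,v) = (1-v)\,C_{lower}(u) + v\,C_{upper}(u), \qquad u,v \in [0,1] \]

where C_lower(u) traces the curved lower edge and C_upper(u) the straight upper edge.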
Figure VII. Model photo of Kilden Performing Arts Center. [6]

Another similar example is the Sydney Opera House by Jorn Utzon. It was a competition project that won the first prize in 1957. Today, this monument is known as a masterpiece that catches the attention of many engineers and architects. However, the design discovery procedure had resulted in many differences at the time of its construction [18]. The geometry of the roof structure was not primarily defined and was considered unbuildable at first [19]. Over five years from designing the concept, the engineers and architects had to change the roof into an appropriate form that allows the use of a unit mold and, consequently, a unit bending during the construction stages [20]. This pattern by Gaudi and Utzon indicates a type of geometry that can respond to computational approaches and, especially, to performance-driven parametric systems.
What is learned from investigating the parametric architecture design method by using digital technology, aimed at improving flexibility, is that a mutual and dynamic relationship between the components can cause maturity and evolution of the design. The digital design process improves the designer's creativity and mind power and causes his/her mind to work more dynamically and precisely; besides, it enriches the designer's mental archive with the numerousness and diversity of ideas and architectural structures. In short, algorithmic architecture employs the designer's creativity, like a computer that does not merely take the quantities into account, and engages it in the design process. This approach aligns architecture with some imaginations that were far from being brought into action until a short while ago. However, it is obvious that this method will be effective only when, in terms of qualitative criteria, the designer's mind continuously evaluates and controls the design process as an intelligent supervisor.

References
[1] Ning Gu, Rongrong Yu, and Peiman Amini Behbahani, "Parametric Design: Theoretical Development and Algorithmic Foundation for Design Generation in Architecture", Handbook of the Mathematics of the Arts and Sciences, Springer, 2018.
[2] Moghtadanezhad, Mehdi, Pashaei, Sevda, "Investigation of the Impact of Parametric Architectural Design Process Based on Algorithmic Design, A New Method in Digital Architectural Design for Achieving Sustainable Architectural Goals", 3rd International Conference on Modern Research in Civil Engineering, Architecture and Urban Development, Berlin, Germany, 2016.
[3] Patrick Schumacher, "Parametricism: Rethinking Architecture's Agenda for the 21st Century (Architectural Design)", Academy Press, 2016.
[4] Michael Meredith, AGU, Mutsuro Sasaki, P.ART, Designtoproduction, Aranda, "From Control to Design: Parametric/Algorithmic Architecture", Actar, English edition, 2008.
[5] Farshid Moussavi, "Parametric software is no substitute for parametric thinking", The Architectural Review, Actar, FunctionLab, Harvard Graduate School of Design, 2011.
[6] Ipek Gursel Dino, "Creative design exploration by parametric generative systems in architecture", Journal of the Faculty of Architecture, volume 29, issue 1, 2012.
[7] John Frazer, "Parametric computation: history and future", Architectural Design, volume 86, issue 2, 2016.
[8] Christiane M. Herr, Thomas Kvan, "Adapting cellular automata to support the architectural design process", Automation in Construction, volume 16, issue 1, 2007.
[9] Sheng-Fen Chien, "Supporting information navigation in generative design systems", Carnegie Mellon University, 1998.
[10] Achim Menges, Sean Ahlquist, "Computational Design Thinking", Wiley, 1st edition, 2011.
[11] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, "Introduction to Algorithms, Second Edition", The MIT Press, 2001.
[12] Michael J. Ostwald, "Systems and enablers: modeling the impact of contemporary computational methods and technologies on the design process", Computational Design Methods and Technologies, 2012.
[13] Javier Monedero, "Parametric design: a review and some experiences", Automation in Construction, volume 9, issue 4, 2000.
[14] Abdelsalam, Mai, "The Use of the Smart Geometry through Various Design Processes: Using the programming platform (parametric features) and generative components", International Conference Proceedings of the Arab Society for Computer Aided Architectural Design (ASCAAD), 2009.
[15] Carlos Roberto Barrios Hernandez, "Thinking parametric design: introducing parametric Gaudi", Design Studies, volume 27, issue 3, 2006.
[16] David Karle, Brian M. Kelly, "Parametric thinking", Proceedings of the ACADIA Regional Conference, 2011.
[17] Tomlow, Jos, "The Model: Antoni Gaudi's hanging model and its reconstruction", PhD thesis, Universität Stuttgart, 1989.
[18] Ömer Akin, "Three Fundamental Tenets for Architectural Ethics", invited paper for the ACSA Teacher's Conference, Cranbrook Academy of Art, 2004.
[19] John Yeomans, "The Other Taj Mahal: What Happened to the Sydney Opera House", Longman Australia, 1973.
[20] O. Arup, R. S. Jenkins, "The evolution and design of the Concourse at the Sydney Opera House", volume 39, issue 4, 1968.
Calculating the Lower Angular Excited States in Two
Dimensions Using the Finite Difference Time Domain Method
Huwaida Elgweri
Department of Physics, University of Tripoli, Tripoli, Libya
H.Elgweri@uot.edu.ly

Amal Hamed
Department of Physics, University of Tripoli, Tripoli, Libya
Amal.HAMED@uot.edu.ly

Mohamed Mansor
Department of Physics, University of Tripoli, Tripoli, Libya
m.mansor@uot.edu.ly
Abstract - The Finite Difference Time Domain method has term contains the potential, so the dimensionless form of the
been used to find the angular excited states wave functions in two Hamiltonian is given by,
dimensions. These excited states are calculated by applying the
iterative procedure on a specified initial guess wave function that ̂ ⃗
contains the desired excited state as a lowest state, this is simply The exact eigenfunctions and eigenvalues of the
done by introducing lines of zeros in the wave functions and their
second derivatives. This of course depends on the symmetry of the potential. We choose here either square or cylindrical symmetry, so the lowest angular excitations will contain lines of zeros, one or two, passing through the region, namely the first excited state and the second excited state respectively. In our investigation we apply this technique to two simple potentials, the two-dimensional simple harmonic oscillator and the finite cylindrical well potential, in order to illustrate the accuracy and the efficiency of these calculations. These potentials were chosen because their analytical solutions are available, so that they can be compared with our results obtained using a MATLAB program.

Keywords: Finite Difference Time Domain Method, diffusion equation, cylindrical well potential, simple harmonic oscillator, Schrödinger equation

I. Introduction

The Finite Difference Time Domain Method (FDTD) has several applications, such as electromagnetic wave simulations, solving Maxwell's equations [1], solar cells, filters, optical switches, semiconductor-based photonic devices and nonlinear devices [2]; it is also used for solving the Schrödinger equation, which is the main topic of our investigation. This method has many advantages, for instance the accuracy of the numerical modeling, flexibility for arbitrary geometrical shapes, and ease of programming [3,4]. However, because of its diffusion behavior the method is suited to calculating only the ground state of a quantum system, and in this paper we present a modified technique that improves the FDTD method so that it is also valid for calculating the lower angular excited states.

The time-dependent Schrödinger equation provides a description of the quantum system; it is given by

iℏ ∂Ψ(r⃗, t)/∂t = Ĥ Ψ(r⃗, t)        (1)

where Ĥ is the Hamiltonian of the system,

Ĥ = −(ℏ²/2m)∇² + V(r⃗)        (2)

The first term of the Hamiltonian contains the kinetic operator and the second term the potential operator. The solution of the differential equation (1) can be obtained analytically only for a handful of potentials, and in most cases we have to resort to numerical analysis. One of the most useful numerical techniques is the diffusion method, which is based on the equivalence of the time-dependent Schrödinger equation to a diffusion-type equation: performing a transformation from the real time domain to the imaginary time domain, τ = it/ℏ, we get the following equation

∂Ψ(r⃗, τ)/∂τ = −Ĥ Ψ(r⃗, τ)        (3)

There have been various numerical methods to solve this diffusion-type equation, such as the diffusion Monte Carlo method [5], the Grimm and Storer approximation method [6], and the finite difference time domain method (FDTD) [7]. All these methods involve an iterative procedure applied to an arbitrary initial guess wave function that contains a mixture of all possible state wave functions. This iterative process can be viewed physically as cooling the system and lowering its energy [8], so the iterative procedure will always lead to the ground state of the system. Hence, if one is interested only in the ground state of a system, the diffusion method is simple and sufficient.

The advantage of this work is that it extracts the higher angular excited states using lines of zeros in the wave function. This is done by classifying the initial guess wave function into an even-parity wave function or an odd-parity wave function. The procedure will still give the lowest possible excited state. The space used in these calculations is extended into the non-classical region and is kept small by using the end-point formula for the second derivative. The end-point formula allows us to calculate the wave function self-consistently in the given region; however, the region should be extended far enough to calculate accurate energy eigenvalues. Due to the symmetry, only one sign region is kept for the actual numerical calculations.
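To make the diffusion iteration concrete, the following sketch (our illustration in Python rather than the MATLAB program used in the paper; the box size, grid spacing, time step and iteration count are assumed values, not the authors' parameters) relaxes an arbitrary initial guess toward the ground state of the dimensionless two-dimensional oscillator used later in Section III:

import numpy as np

def laplacian(psi, h):
    # 5-point finite-difference Laplacian; the border entries stay zero, since
    # the wave function is negligible at the edge of a large enough box
    lap = np.zeros_like(psi)
    lap[1:-1, 1:-1] = (psi[2:, 1:-1] + psi[:-2, 1:-1] + psi[1:-1, 2:]
                       + psi[1:-1, :-2] - 4.0 * psi[1:-1, 1:-1]) / h**2
    return lap

h, dtau, n_iter = 0.1, 0.001, 20000        # assumed grid spacing, time step, sweeps
x = np.linspace(-6.0, 6.0, 121)            # assumed box; h = x[1] - x[0] = 0.1
X, Y = np.meshgrid(x, x, indexing="ij")
V = X**2 + Y**2                            # oscillator potential, energy in units of hw/2

psi = np.exp(-(X**2 + Y**2))               # arbitrary initial guess (mixture of states)
for _ in range(n_iter):
    psi = psi + dtau * (laplacian(psi, h) - V * psi)   # dpsi/dtau = -H psi, eq. (3)
    psi /= np.sqrt(np.sum(psi**2) * h * h)             # renormalize every sweep

E = np.sum(psi * (-laplacian(psi, h) + V * psi)) * h * h
print(round(E, 4))                         # relaxes toward the ground state, E = 2.0

Because every excited component decays like e^(−(E_n − E_0)τ), the loop always filters out everything but the ground state; this is exactly the limitation that the parity classification described below is designed to overcome.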
In addition to the introduction, this paper is organized into three further sections as follows: the general theory section presents the formulation of the FDTD method that is used to calculate the ground state, followed by the performance improvement that allows this method to be employed for the lower angular excited states; in the application section the improved FDTD method is applied to two familiar examples in order to test the technique; finally, the conclusion section contains the conclusions and the summary of this work.

II. General Theory

By using the separation of variables technique we get the formal solution of the diffusion equation (3) in two dimensions as

Ψ(x, y, τ) = Σ_n c_n φ_n(x, y) e^(−E_n τ)

where the c_n are expansion coefficients, and φ_n and E_n are a complete set of eigenfunctions and their corresponding energy eigenvalues for the time-independent Schrödinger equation, so they satisfy

Ĥ φ_n(x, y) = E_n φ_n(x, y)

The key point of the improved method is to introduce an initial guess wave function that is subjected to certain symmetry properties; applying a suitable iterative procedure will then converge to the lowest angular excited state contained in this initial guess wave function.

Therefore, introducing an odd initial guess wave function that contains one line of zeros lying along either x = 0 or y = 0 will exclude all even state wave functions, so this odd initial guess wave function will be a mixture of only the odd wave functions of the system. Since only the lowest of these states survives the iteration, it can easily be seen that applying an iterative procedure subjected to the antisymmetric property on the zero line will approach the first angular excited state, i.e. Ψ → c_1 φ_1 e^(−E_1 τ) for large τ.

Similarly, introducing an even initial guess wave function that contains two lines of zeros lying along both x = 0 and y = 0 will exclude all dissimilar state wave functions, so the same iteration approaches the second angular excited state.
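Continuing the sketch above (again an illustration under the same assumed parameters, not the authors' code), the parity classification can be imposed by starting from an odd guess and re-projecting onto the odd subspace each sweep, so the iteration can no longer fall into the even ground state:

psi = X * np.exp(-(X**2 + Y**2))           # odd in x: one zero line, along x = 0
for _ in range(n_iter):
    psi = psi + dtau * (laplacian(psi, h) - V * psi)
    psi = 0.5 * (psi - psi[::-1, :])       # re-impose antisymmetry across x = 0
    psi /= np.sqrt(np.sum(psi**2) * h * h)

E = np.sum(psi * (-laplacian(psi, h) + V * psi)) * h * h
print(round(E, 4))                         # relaxes to the first excited state, E = 4.0

An even guess with zero lines on both axes, e.g. psi = X * Y * np.exp(-(X**2 + Y**2)), isolates the second angular excited state (E = 6.0 in these units) in the same way.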
In other words, ∆τ must be chosen carefully because of its association with the grid spacing. If there is no zero line on the axis, then the second derivative across this axis is given by the usual symmetric three-point difference formula. The second-order spatial derivative at the boundaries of the spatial mesh is calculated using the end-point difference formula, which is what allows the mesh to be kept small.

The energy eigenvalues are calculated by means of a numerical evaluation of the expectation value of the Hamiltonian for the corresponding normalized eigenfunctions as

E = Σ_i Σ_j ψ_ij (Ĥψ)_ij ∆x ∆y

III. Applications

A. Two Dimensional Simple Harmonic Oscillator

As a first example, we consider the simple harmonic oscillator in two dimensions, which is a good example to test the validity of the presented method. The distance is measured in units of √(ℏ/mω) and the unit of energy is ℏω/2. In these units the Hamiltonian operator for the relative system is

Ĥ = −(∂²/∂x² + ∂²/∂y²) + (x² + y²)

where x and y are the dimensionless coordinates, so that the potential operator in (2) can be written as V(x, y) = x² + y².
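The same expectation value can also be evaluated with the trapezoidal rule that the paper uses for its integrals; a short continuation of the previous sketches (assuming the same arrays psi, V, x and h):

Hpsi = -laplacian(psi, h) + V * psi
norm = np.trapz(np.trapz(psi * psi, x), x)     # trapezoidal rule in y, then in x
E = np.trapz(np.trapz(psi * Hpsi, x), x) / norm

The trapezoidal weights differ from the plain double sum only at the edge of the box, where the wave function is already negligible.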
Table I. Comparison of the numerical energy eigenvalues of the first three lowest states with their corresponding exact energy eigenvalues for the two-dimensional simple harmonic oscillator

State                  Numerical eigenvalue   Exact eigenvalue
Ground state           1.9999                 2.0
First excited state    3.9977                 4.0
Second excited state   5.9953                 6.0

Figure I. The first three lowest states wave functions of the two-dimensional simple harmonic oscillator calculated numerically.
a. The normalized ground state wave function
b. The normalized first angular excited state wave function
c. The normalized second angular excited state wave function

All the previous calculations are performed with suitably chosen values of ∆x and ∆τ; with these parameters the number of iterations used is sufficient to get acceptable results. Numerically, the integrals used to normalize the wave function and those used to determine the energy eigenvalues are evaluated using the trapezoidal rule.

B. Finite Cylindrical Well Potential

As a second example, we extend the same calculations to the finite cylindrical well potential, which is given in Cartesian coordinates by

V(x, y) = −V₀ for √(x² + y²) ≤ a,  and  V(x, y) = 0 for √(x² + y²) > a,

where V₀ is the depth of the potential and a is the radius of the potential.

In the chosen distance and energy units, the dimensionless form of the time-independent Schrödinger equation is given by

−(∂²ψ/∂x² + ∂²ψ/∂y²) + V(x, y)ψ = Eψ        (31)

The exact energy eigenvalues and their corresponding eigenfunctions are calculated analytically by transforming (31) into the polar coordinate form

−(1/r) ∂/∂r (r ∂ψ/∂r) − (1/r²) ∂²ψ/∂θ² + V(r)ψ = Eψ        (32)

whose bound-state solutions are

ψ_m(r, θ) = e^(imθ) { A J_m(k r) for r ≤ a;  B K_m(κ r) for r > a }        (33)

where k is defined as √(E + V₀) and κ is defined as √(−E). Matching the logarithmic derivatives of the two pieces at r = a gives

k J′_m(k a) K_m(κ a) = κ K′_m(κ a) J_m(k a)        (34)

Therefore, the algebraic equation (34) will be satisfied only at certain discrete energy eigenvalues E_n, which can be obtained by finding the roots of this equation numerically. After the energy eigenvalues have been found, their corresponding eigenfunctions can be calculated very easily by plugging each energy eigenvalue into (33).

Numerically, in the finite cylindrical well potential we have to use a small spatial mesh size because of the circular shape of this potential, and thereby the time step must be chosen carefully depending on the stability condition. In Tables II, III and IV we show the effect of reducing the spatial mesh size and the associated time step on the numerical eigenvalues, by presenting the first three lowest states of the finite cylindrical well potential with depth V₀ and radius a. However, a smaller time step requires a larger number of iterations to get acceptable results.

In Figure II (a, b, c) we show the first three lowest eigenfunctions, respectively, for the finite cylindrical well potential with the same depth and radius; these eigenfunctions are calculated numerically using the best-fitting parameters. To illustrate the accuracy of the numerical results we show the difference between the numerical eigenfunctions and their corresponding exact eigenfunctions alongside the figures. In this case, nearly 5000 iterations are required to get acceptable results. Again, the integrals are evaluated numerically using the trapezoidal rule.
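Finding the roots of the matching condition (34) numerically is straightforward; the sketch below does it with standard Bessel routines, with an assumed depth V₀ = 30, radius a = 2 and angular number m = 0 purely for illustration (the paper's actual well parameters are not reproduced here):

import numpy as np
from scipy.special import jv, kv, jvp, kvp
from scipy.optimize import brentq

V0, a, m = 30.0, 2.0, 0                    # assumed well depth, radius, angular number

def matching(E):
    # continuity of the logarithmic derivative at r = a, i.e. equation (34)
    k, kappa = np.sqrt(E + V0), np.sqrt(-E)
    return k * jvp(m, k * a) * kv(m, kappa * a) - kappa * kvp(m, kappa * a) * jv(m, k * a)

grid = np.linspace(-V0 + 1e-6, -1e-6, 4000)    # bound states lie in -V0 < E < 0
vals = [matching(E) for E in grid]
roots = [brentq(matching, grid[i], grid[i + 1])
         for i in range(len(grid) - 1) if vals[i] * vals[i + 1] < 0]
print(roots)                               # the discrete bound-state eigenvalues

Scanning m = 0, 1, 2, ... yields the angular excited levels that the parity-constrained diffusion iteration should converge to.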
Figure II. The first three lowest states wave functions of the finite cylindrical well calculated numerically.
a. The normalized ground state wave function
b. The normalized first angular excited state wave function
c. The normalized second angular excited state wave function

Table II. The numerical ground state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -28.78743.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -28.79096              0.00353
0.08   0.0005   -28.78791              0.00048
0.07   0.0004   -28.78778              0.00035

Table III. The numerical first angular excited state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -26.92663.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -26.93885              0.01222
0.08   0.0005   -26.92997              0.00334
0.07   0.0004   -26.92901              0.00238
0.05   0.0003   -26.92866              0.00203
0.04   0.0001   -26.92516              0.00147

Table IV. The numerical second angular excited state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -24.48947.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -24.55968              0.07021
0.08   0.0005   -24.52546              0.03599
0.07   0.0004   -24.48951              0.00004

IV. Conclusion

The present investigation has demonstrated the usefulness of the FDTD method with appropriate symmetric boundary conditions on the wave functions for extracting the eigenfunctions and eigenvalues of the lower angular excited states of cylindrically symmetric potentials, where such states exist. The diffusion method is very suitable for ground state calculations, and it was shown in this paper that the method is valid for determining the lower angular excited states as well, by choosing an appropriate initial guess wave function that has to be an odd function for the first excited state and an even function for the second excited state. Choosing such a special initial guess wave function removes all dissimilar excited states and forces the iterative procedure to converge to the lowest state contained in this initial guess function. Numerically removing the ground state by the Gram-Schmidt orthogonalization procedure can also lead to the excited states, but the method introduced in this paper offers the advantage over that method of being less expensive in numerical cost. In addition, since the calculations of the symmetric method are performed in the first quarter of the plane, the numerical cost is greatly reduced, to a quarter of that required to perform the calculations over the entire plane; the end-point derivatives further reduce the number of mesh points considerably. We have provided detailed calculations to illustrate this improved technique in two dimensions, and we applied it to two different potentials as examples, namely the simple harmonic oscillator and the finite cylindrical well. The numerical results illustrated in Figure I and Table I, which correspond to the simple harmonic oscillator potential, show the efficiency and simplicity of this method, while the numerical results illustrated in Figure II and Tables II, III and IV, which correspond to the finite cylindrical well potential, show that extra care should be taken when choosing the parameters used to obtain the eigenfunctions and the energy eigenvalues in this case. Finally, we can generalize this method to calculate more angular excited states by using diagonal zero axes.

References

[1] Antonio Soriano, Enrique Navarro, Jorge Porti, Vicente Such, "Analysis of the finite difference time domain technique to solve the Schrödinger equation for quantum devices", Journal of Applied Physics, 95 (12), (2004).
[2] Charles Reinke, Aliakbar Jafarpour, Babak Momeni, Mohammad Soltani, Sina Khorasani, Ali Adibi, Yong Xu, and Reginald Lee, "Nonlinear Finite-Difference Time-Domain Method for the Simulation of Anisotropic, χ(2), and χ(3) Optical Effects", Journal of Lightwave Technology, 24, no. 1, (2006).
[3] Dennis Sullivan, David Citrin, "Time-domain simulation of two electrons in a quantum dot", Journal of Applied Physics, 89, 3841, (2001).
[4] I Wayan Sudiarta, Lily Maysarr Angraini, "The Finite Difference Time Domain (FDTD) Method to Determine Energies and Wave Functions of Two-Electron Quantum Dot", AIP Conference Proceedings, vol. 2023, p. 020199, 2018.
[5] Thiago N Barbosa, Marcos M Almeida, Frederico V Prudente, "A quantum Monte Carlo study of confined quantum systems: application to harmonic oscillator and hydrogenic-like atoms", Journal of Physics B: Atomic, Molecular and Optical Physics, 48 (5), (2015).
[6] R. Grimm, R. G. Storer, "A new method for the numerical solution of the Schrödinger equation", Journal of Computational Physics, 4, 230-249, (1969).
[7] I Wayan Sudiarta, D. J. Wallace Geldart, "Solving the Schrödinger equation using the finite difference time domain method", Journal of Physics A: Mathematical and Theoretical, 1885-1896, (2007).
[8] Huwaida Elgweri, Mohamed Mansor, "First excited solutions of Schrödinger equation by the diffusion method applied to various one dimension problem", Journal of Academy for Basic and Applied Science, 14 (1), 1-4, (2015).
[9] Mohamed Mansor, Taher Sherif, Saleh Swedan, "Improved simple numerical method using the diffusion equation applied for central force bound quantum systems", Journal of Basic and Applied Science, 14, 72-81, (2004).
[10] George Arfken, Hans Weber, Mathematical Methods for Physicists, Academic Press (an imprint of Elsevier), Waltham, MA, USA.
[11] Huwaida Elgweri, Mohamed Mansor, "Calculation of positive spectrum for the higher excited states using Grimm and Storer diffusion method", The Libyan Journal of Science, 30, 33-42, (2017).
[12] Kailash Kumar, "On expanding the exponential", Journal of Mathematical Physics, 6, 1928-34, (1965).
[13] Ivan Sokolnikoff, Raymond Redheffer, Mathematics of Physics and Modern Engineering, McGraw-Hill, New York, (1966).
[14] Robert Eisberg, Robert Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles, John Wiley and Sons, New York, (1974).
Implementation Framework for a Blockchain-Based
Reputation and Trust System
Abstract— The web revolutionized how people engage with data. Consumers in e-commerce rely heavily on online reputation systems when selecting which items to purchase, and many e-commerce platforms now include built-in review systems for their consumers and sellers. Users may share their evaluations and even propose better items for online shopping on social media networks like Facebook. Despite these advancements, there are still significant flaws: numerous platforms that facilitate online interactions remain centralized and vulnerable to manipulation, which tends to result in a broken marketplace with ineffective verification, where vendors can easily manipulate consumers' perceptions to increase sales. Blockchain has been hailed as a game-changing technology that may bring an extra layer of trust and security to online interactions, as well as providing much-needed reputation in online interaction platforms. In this paper we offer a trustworthy, decentralized reputation model based on the Ethereum Blockchain, with the goal of restoring trust and integrity in the online interaction industry. We go over the implementation framework for such a system and present some preliminary results. Our findings indicate that a decentralized Blockchain-based reputation network is feasible, with impact factor evaluations for each node serving as the primary criterion for assessing the ecosystem's trustworthiness.

Keywords— Reputation, Blockchain, trust, peer-to-peer, consortium networks, smart contracts

I. INTRODUCTION

The internet and social media have been acknowledged as new frontiers for media convergence, a phenomenon increasingly characterized by how information flows and users migrate, linking content, communication, and computation in a complex setup. Users rely on the Internet to send and receive emails, search online for media files or news, and shop for products and services. As of this writing, the number of internet users has surged to 4.4 billion from 4.3 billion in 2018 [1],[2]. At such a scale, one might expect online interactions to occur between known entities; however, this is not the case. Online communication takes place in an environment where entities are anonymous, with no mechanisms for verifying interactions between them. Entities can easily create many accounts on each platform and engage in online interactions. On e-commerce and social media platforms, this has led to the emergence of a thriving economy based on fake reviews. Driven by profits, many merchants and vendors buy positive reviews for their businesses and negative reviews for their rivals in an attempt to influence how users perceive their businesses [3],[4]. In 2017, the U.K. Consumer Advocacy Group reported that sellers on Amazon were listing products and services that carried thousands of positive fake reviews [5]. This problem is prevalent because many of these platforms are centralized and prone to manipulation. As such, the online marketplace is broken and lacks a robust verification system, allowing vendors to easily manipulate reviews to influence consumers' perception of their products. The major contributions of this study are three-fold:

• Contextualize reputation and trust systems with respect to Blockchain technology.
• Propose a model consisting of the actors in the context of reputation systems, the network architecture, and the components.
• Propose an Ethereum-based platform that implements the reputation model.

The rest of the paper is organized as follows: Section II discusses the reputation model and provides an implementation framework, while Section III discusses the initial practical findings.

A. ONLINE INTERACTION MARKETPLACE IN CONTEXT

The online interaction marketplace is currently broken because many reputation systems in use are centralized. As such, there are no mechanisms for guaranteeing that the behavior of entities remains honest during the interaction process. In recent times, calls for a decentralized Internet have been growing. Although the modern Internet is built on top of decentralized protocols such as TCP/IP and HTTP, a large section of the application stack has remained centralized. The desire for more decentralization has largely emanated from the broken marketplace bedeviled by fraudulent activities. Blockchain can help enforce reputation and trust in online interactions, and such a platform can help predict the outcomes of online interactions. At present, there are many online businesses that have successfully implemented decentralized computing systems with disruptive consequences. The unveiling of Bitcoin in 2009 led to the emergence of decentralized alternative platforms like OpenBazaar and Silk Road. The success of these platforms has been appraised by their ability to create a trustless economy that does not require trusted third parties for transaction verification. We believe that Blockchain holds the potential to unlock the problems bedeviling reputation systems. Our model provides a framework for implementing an open and transparent reputation system based on the Ethereum Blockchain. With this system, any user intending to participate in an online interaction with another party can verify the true identities and obtain proof that the parties are who they claim to be.

B. Blockchain Technology

Blockchain was first introduced with the publication of a whitepaper in 2008 by Satoshi Nakamoto, as the break-through
technology for Bitcoin, the first peer-to-peer (P2P) cryptocurrency [6]. The technology paired cryptography, an already established and well-understood concept in computer science at the time, with a novel Merkle tree data structure to facilitate digital transactions. What made the technology grow fast is its ability to solve the decades-old double-spending problem: a scenario where the same money is copied and spent more than once. With a P2P model and the Merkle tree data structure, Blockchain does not require intermediaries, such as banks, to facilitate online transactions. Blockchain became the progenitor of cryptocurrencies, with Bitcoin becoming its first use case. Rather than having the traditional accounting systems of banks and other intermediaries validate online transactions, cryptocurrencies use a public digital ledger or register (the Blockchain) to confirm transactions. When users transact, the transaction is recorded on the Blockchain. With traditional accounting systems, security and validation of the register depend on banks, central banks, card issuers, and lately telecommunication firms for mobile payments. In Blockchain-backed transactions, however, the same is handled by decentralized nodes (also called miners), who compete to verify the transactions by solving a complex mathematical puzzle. In essence, what Blockchain managed to do is replace trusted centralized authorities with a decentralized and trustless system.

Inspired by Bitcoin, other cryptocurrencies (generally referred to as altcoins) emerged, with Ethereum becoming the dominant platform. Ethereum was unveiled in 2013, not just as a digital cash system but rather as a programmable platform with previously inconceivable capabilities. With Ethereum, smart contracts and other decentralized applications (DApps) could be executed in a "complete Turing World Computer." Ideally, smart contracts can be regarded as a special type of account that is recorded on the Blockchain and therefore not controlled by humans. In their most basic form, smart contracts can run all kinds of instructions, like maintaining states, checking conditions, and sending and receiving digital money. Of utmost significance is that a smart contract on the Blockchain cannot be changed, and/or even hacked. These attributes make a Blockchain such as Ethereum a perfect platform for enforcing reputation rules, because it is a permission-less Blockchain providing designated members with the ability to read and write on the ledger [7].

C. Consensus Mechanisms

In Blockchain, consensus algorithms form the basic rules of agreement on how the nodes in the network validate transactions [8]. For example, suppose Alice sends $10 worth of bitcoins to Bob. There has to be a mechanism in place that ensures that Alice's account balance reduces by $10 while Bob's account balance increases by the same amount. Such a mechanism has to be implemented in a manner that does not allow any malicious transactions or alterations to the Blockchain without the full consent of all the nodes participating in the network. Some of the most common consensus algorithms follow.

a) PoW (Proof-of-Work)

It is by far the most used consensus mechanism, and it was first applied in Bitcoin. In PoW, miners compete against each other by solving a complex mathematical puzzle to authenticate transactions and append new blocks to the chain, and in the process they are rewarded with coins. Whereas the computations leading to new blocks are difficult, they can easily be verified by the nodes in the Blockchain ecosystem. As such, when a miner obtains the solution for a new block, it broadcasts the generated block to the network for verification by the other miners [9]. All the other miners need to confirm that the solution is correct for the generated block to be confirmed. The mathematical puzzles to be solved in PoW include [10]:

• Inverting a hash function, which requires the miner to determine an input when the output is known.
• Integer factorization, which involves presenting a number as a product of two other integers (usually large primes).
• Checking to confirm whether a DoS attack has occurred by computing hash functions.

A node that successfully generates a block is rewarded with coins as a form of incentive. PoW helps to protect the Blockchain network against attacks, since an attack can only succeed with a great deal of computational power and time, which would be inefficient: it would cost more than the potential rewards.
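As a toy illustration of the hash-puzzle idea (our sketch, not a production miner; the difficulty and block payload are arbitrary), a PoW search simply scans nonces until the digest meets a target that is trivial for everyone else to re-check:

import hashlib

def mine(block_data: str, difficulty: int = 4) -> int:
    # find a nonce whose SHA-256 digest of (data + nonce) starts with `difficulty` zeros
    target = "0" * difficulty
    nonce = 0
    while not hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest().startswith(target):
        nonce += 1
    return nonce

nonce = mine("Alice pays Bob $10")
print(nonce)        # costly to find, but a single hash call verifies it

Raising the difficulty by one hexadecimal digit multiplies the expected search effort by sixteen, while verification stays a single hash; this asymmetry is what makes attacks more expensive than the potential rewards.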
b) PoS (Proof-of-Stake)

In PoS [11], consensus is achieved by requiring nodes to stake some of their coins or tokens in the process of authenticating blocks. Essentially, staking involves depositing some coins, which are then locked up in the Blockchain ecosystem. Such coins become collateral for vouching for the new block [12]. The more a particular node stakes in the ecosystem, the better its chances of being selected to validate the transactions [13]. PoS is specially conceived to resolve the Byzantine Fault Tolerance issue that is rampant with the PoW algorithm, since all the validators are known in the network and can easily be tracked on the Blockchain [14].

D. Types of Blockchains

Blockchains are broadly grouped into three types: public permissionless Blockchains, public permissioned Blockchains, and private permissioned Blockchains [15]. In a public permissionless Blockchain, there is no centralized entity that authorizes the transactions on the Blockchain. These Blockchains can be regarded as shared public ledgers where any node can view and modify the data, so long as the node is participating in the network. Ethereum and Bitcoin are some of the known examples in this category. In a public permissioned Blockchain, selected nodes are used to authenticate the transactions on the Blockchain; for example, authentication of the transactions can be assigned to a government entity, senior employees, or an institution. Lastly, a private permissioned Blockchain is a ledger ecosystem where data is not available for public view.

E. Smart contracts

Smart contracts are essentially special types of accounts that are recorded on the Blockchain and are therefore not controlled by humans [15]. In their most basic form, smart contracts can run all kinds of instructions, like maintaining
states, checking conditions, and sending and receiving digital money. Of utmost significance is that a smart contract on the Blockchain cannot be changed, and/or even hacked [16]. Ethereum is one example that was developed to help developers program smart contracts besides acting as a cryptocurrency. With Ethereum, smart contracts and other decentralized applications can be executed in a "complete Turing World Computer" [17]. A permission-less Ethereum Blockchain can help enforce reputation rules, since one can control who can read and write on the ledger regarding the data being managed [18], [19].

F. Blockchain and Reputation Systems

A reputation system that is open and transparent should be able to compute the trustworthiness of an entity in an online interaction process. A decentralized reputation system incorporating a smart contract provides standardized mechanisms for accessing aggregated reputation data, where more authentications reinforce the notion of reliability of anyone's reputation score. Meanwhile, the Blockchain's consensus mechanisms in the reputation platform can also help to safeguard against attacks known from centralized systems, such as Sybil attacks, whitewashing, denial-of-service attacks, and slandering [20]. Transacting parties can leverage reputation indicators to decide whom to interact with and whom to avoid. In e-commerce, merchants and vendors are thereby incentivized to sell credible products and services [21].

II. IMPLEMENTATION FRAMEWORK

A. REPUTATION MODEL

A model that captures the reputation of an entity in an online interaction should consist of both endorser and endorsee interacting as shown below:

Figure 1. Reputation Model for online Interaction.

The acquaintance process would work as follows:

• Entities A and B have personally known each other for a long time. They probably have worked together, or went to the same school, and are therefore acquaintances of each other.
• Entity A may have interacted many times with entity B. Perhaps this interaction was in online shopping, where A and B successfully transacted and, in the process, A established that B is a credible seller while B established that A is a credible buyer.

Based on the past experiences that A had with B, A is more likely to endorse B on the reputation system. If A endorses B, this means that A has built trust in B. In this case, the endorsement acts as a transaction message originating from one user account on the Blockchain and destined for another user account (the endorsee). In the network, every user manages two distinct lists of entities that they have so far interacted with: a list of endorsers and a list of endorsees. The lists hold the account addresses that identify each user on the platform. The reputation platform records all the endorsements between the endorsers and endorsees. The system then aggregates this information and computes a total trust score.

The total trust score represents the impact that a particular entity has on the reputation platform. Each entity has two assignment variables:

• Eg: total endorsements given by the entity; and
• Er: total endorsements received by the entity.

To be considered as a node in the network, each entity must have Eg and Er values equal to 1. The methodology used to model the reputation is described as follows:

• Re: the ratio of Eg to Er. It is an indicator of how far the total sent and received endorsements are from one another. Re must be less than or equal to 1 and is computed as follows:

Re = min(Eg, Er) / max(Eg, Er)        (1)

In a reputation system, the ratio between incoming and outgoing endorsement connections should be maintained. This ratio helps to build trustworthy behavior, where a high ratio value is an indicator that a node participating in the network is a high-impact, trusted node.

• CPTs: total consumable points for A. Every entity that joins the reputation platform receives an equal number of CPTs from the platform, and the value keeps depleting with each endorsement; it is computed as:

CPTs = 1 / Er        (2)

The reputation system measures the contribution of each node in the network, and the total consumable points help to provide an indication of this measure. For example, the total consumable points for a particular node can limit that node's ability to convince endorsees that it is a trustworthy node.

• RPTs: total received points for A. It is the aggregated sum of all the consumable points that are received by A from its endorsers. If E = {e1, e2, e3, ..., en} is the set of endorsers of A and the size of E is n, then RPTs is computed as follows:

RPTs = sum(CPTs)        (3)

• IF: Impact Factor. It indicates the reputation score for A and is computed as:

IF = Re * RPTs        (4)
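Equations (1)-(4) translate directly into code; the sketch below uses made-up endorsement counts purely to show the arithmetic:

def impact_factor(Eg, Er, received_cpts):
    # Eg/Er: endorsements given/received; received_cpts: the CPTs spent by each endorser
    Re = min(Eg, Er) / max(Eg, Er)      # equation (1), always <= 1
    RPTs = sum(received_cpts)           # equation (3)
    return Re * RPTs                    # equation (4)

endorser_Er = [2, 4, 5]                 # hypothetical Er values of node A's three endorsers
cpts = [1 / er for er in endorser_Er]   # each endorser contributes CPTs = 1/Er, equation (2)
print(impact_factor(Eg=3, Er=3, received_cpts=cpts))   # 1.0 * 0.95 = 0.95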
The Ethereum Blockchain is an ideal implementation platform because it is public and permissionless; as such, an endorser (any entity in an online interaction marketplace) can endorse an endorsee on a public and permissionless platform. In 2013, Vitalik Buterin, a Russian-Canadian programmer, proposed Ethereum as an evolving platform to rival the existing Bitcoin platform. At the time, the Bitcoin protocol could only validate the ownership and transfer of coins. The Ethereum protocol was revolutionary in the sense that, instead of being applied only to payments, it allows accounts to execute smart contracts so long as they have enough Ether (ETH) in their wallets. On the Ethereum platform, ETH acts as the native cryptocurrency that powers Ethereum-based applications.

When a smart contract executes on the Blockchain, all the nodes participating in Ethereum (the miners) execute the same code and need to agree through a consensus mechanism [22]. At the time, Ethereum was based on a consensus mechanism called Proof-of-Work (PoW). In this mechanism, nodes compete to solve a complex mathematical and cryptographic puzzle related to the Byzantine Generals' problem [23].
PoW served two major purposes: verification of the legitimacy of transactions, to avoid the double-spending problem that previously existed with digital currencies; and the creation of new coins, whereby miners who successfully perform the computations are rewarded with ETH. However, this approach consumed so many hardware resources that Ethereum has now migrated to a new consensus algorithm called Proof-of-Stake (PoS). In the PoS algorithm, validators (instead of miners) lock up some ETH that acts as a stake in the Ethereum ecosystem [24]. Validators that bet their ETH on blocks are rewarded with coins proportional to their stake in the ecosystem [25].

With its smart-contract potential, Ethereum has now become a massive, decentralized platform, often called a "complete Turing machine" or the Ethereum Virtual Machine (EVM). Ethereum is the dominant platform that supports both public and private management of transactions; validators are required to deposit some currency (Ether) to validate transactions in the arrangement known as Proof-of-Stake (PoS) [26]. The EVM is Ethereum's interpreter environment, which converts smart contracts from high-level language statements into machine language; it acts as an interpreter for Ethereum's assembly language. As is the case with programming in assembly languages, writing such code may be challenging for new developers. Therefore, the Ethereum Foundation proposed Solidity, a high-level language, as the basis for coding smart contracts. Essentially, an Ethereum ecosystem has a non-exhaustive set of elements comprising cryptographic tokens, node addresses, consensus algorithms (PoS), validators, the Blockchain/ledger, the EVM, and scripting languages [28], [29].

B. Reputation Network Architecture

The diagram below summarizes the architecture of the reputation system:

Figure 2. Architecture of Reputation System

The reputation system as presented in this figure has the following components:

• DApps;
• Smart contracts; and
• the Ethereum Blockchain.

The participants in this reputation system are the endorser and the endorsee. For example, if Alice has interacted with Bob many times and wishes to recommend him based on those interactions, Alice becomes the endorser while Bob is the endorsee. The endorser uses a DApp (a client web application) running in a browser to endorse the endorsee in the system. Since these recommendations must be stored on the Blockchain, a smart contract is required. Three elements must be generated for the system to work seamlessly: the Application Binary Interface, which is a compiled bytecode representation of the reputation system; the Ethereum client nodes, which manage the nodes joining the network; and file storage, which manages the storage of files on the Blockchain. Finally, the system requires a Blockchain, which in this case is the Ethereum Blockchain.

C. Smart contracts

Smart contracts allow the reputation platform to record all the endorsements between the endorsers and the endorsees, then aggregate the information and compute a total trust score. The total trust score represents the impact factor that a particular entity has on the reputation platform. When smart contracts compile successfully, they generate an ABI (Application Binary Interface), which is a binary representation of the compiled EVM code. The contract is then deployed to the Ethereum network, resulting in it obtaining an address and having its bytecode recorded. The smart contracts are then invoked using Web3.js, a JavaScript API that allows DApps to interact with remote or local Ethereum nodes [30]. The main contract in this system is the Endorse contract, which defines the logic necessary for any endorsement in the Ethereum network. The Endorse contract can be simplified using the flow chart diagram below.

The DApp facilitates the interaction between the users (on their browsers) and the Ethereum Blockchain in the reputation system [19]. Any node joining the reputation platform submits its application via a DApp, which retrieves the public keys from the key store. The keys are used to sign the data, which is then transmitted securely. The DApp then runs the specified smart contract corresponding to the data being transmitted. If the execution is successful, validators pick up the transactions and broadcast them to the entire network.
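For illustration, the same invocation can be written with web3.py, the Python counterpart of the Web3.js API mentioned above; the node URL, contract address, ABI, account addresses and the endorse() signature are all placeholders for whatever the deployed Endorse contract actually exposes, not the paper's code:

from web3 import Web3

NODE_URL = "http://127.0.0.1:8545"                                  # e.g. a local Ganache node
CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"    # placeholder
CONTRACT_ABI = []                                                   # paste the compiled ABI here
endorser_address = "0x0000000000000000000000000000000000000001"    # placeholder accounts
endorsee_address = "0x0000000000000000000000000000000000000002"

w3 = Web3(Web3.HTTPProvider(NODE_URL))
endorse_contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=CONTRACT_ABI)

# an endorsement is just a transaction from the endorser's account to the contract
tx_hash = endorse_contract.functions.endorse(endorsee_address).transact(
    {"from": endorser_address})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print(receipt.status)        # 1 once the validators confirm the block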
III. FINDINGS AND DISCUSSION

This section presents the main findings regarding the performance of the reputation system. Specifically, we discuss the system's ability to provide security and privacy, its performance, and the fulfilment of its requirements.

A. Security and privacy

Reputation systems can form the basis for solutions in the broken online interaction marketplaces. However, as other studies have shown, these systems are vulnerable to different attacks, including Sybil attacks, whitewashing attacks, free-rider attacks, and denial-of-service attacks [8]. We analyzed our system based on how endorsers' messages get stored on the Blockchain and on how the reputation system computes the impact factor. Whereas our system does not address this issue directly, the available data regarding an entity can be used to determine malicious nodes in the Blockchain ecosystem.

Suppose the reputation system has four nodes (y = {a, b, c, d}). Here, the reputation platform maintains two sets of information for each entity (one consisting of a list of endorsers, the other consisting of a list of endorsees). If m is the list of endorsers and n is the list of endorsees for an entity y, then the intersection of m and n offers clues about the common entities in the sets. If the intersection set is the same for both the endorsers' list and the endorsees' list, then this is a likely indicator that these actors are malicious and want to interfere with the computation of the impact factor.

Determining the actors that could be colluding in a system of a certain size is an NP-complete problem that can only be solved using heuristic approaches. Such a problem is likely to become more complex as more nodes join the network [31]. Essentially, the more actors join the reputation system, the more the size of the network expands, and so does the computational complexity. Since the platform is implemented on Ethereum, the high cost of gas for each transaction also disincentivizes participants from acting maliciously on the network [15].

B. Fulfillment of requirements

A front-end web app prototype was developed to help test the various contract functions on the Blockchain. Manual testing was conducted using the Truffle IDE and Ganache on the local network. A smart contract should be implemented in the form of conditions (checks to ensure all the necessary pre-conditions, such as the caller of the function), actions (in the form of events and functions) and interactions, if such code is to eliminate reentrancy errors [32],[33]. Our system used the same approach when writing the smart contracts, as shown by the snippet function below:

Algorithm 1 Verification Function
1. function joinNetwork()
2. if the user is registered on the network
3. {
4.   Record the sender's name and id
5.   Add the new entity to the existing members and update the members list
6. }
7. else
8. {Do not join the network}
9. Return

As shown in the code, the joinNetwork() function first checks whether a node has been registered on the network or not. If the user has not registered, then the function halts; it progresses with the computation only if the user has been registered. In that case, the function proceeds with the actions (changing the status of the registered nodes and incrementing the count of registered nodes).

To ascertain the fulfilment of the requirements, a simulation was performed with the help of an interaction graph that offered clues regarding the performance of the network. The code below illustrates the various graphical interactions between nodes:

Algorithm 2 Graphical Interaction between nodes
1. function getProfile()
2. {
3.   For each node:
4.     Compute outDegree
5.     Compute used_Power
6.     Compute outConns
7.     Compute inDegree
8.     Compute receivedPoints
9.   Return: (outDegree, used_Power, outConns, inDegree, receivedPoints, inConns)
10. }

The reputation model was then applied to the nodes in the interaction graph, and their impact factors were calculated based on incoming and outgoing connections (Eg and Er). Each node on the Ethereum network rated the others on a scale of -2 (representing total distrust) to +2 (representing total trust). To provide more relevant findings, the interaction graph was modelled to incorporate only edges that had a rating of +2, with no negative edges in the simulation model. Out of 5000 nodes, 240 edges were marked as positive edges. The information available for each node included source, rating, target, and timestamp, which formed the basis for the endorsement system. The direction of endorsement was based on the source and target datasets, while the timestamp data gave hints regarding the order of transactions. The graph below summarizes the distributions of both incoming and outgoing graph connections:

Figure 4: Distribution of incoming and outgoing connections
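A Python restatement of the profiling in Algorithm 2, combined with the impact factor of equation (4), looks as follows (the four-edge endorsement list is made-up data in which every node already has Eg, Er >= 1):

edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]   # (endorser, endorsee) pairs

def profile(node):
    Eg = sum(1 for s, _ in edges if s == node)             # outDegree: endorsements given
    Er = sum(1 for _, t in edges if t == node)             # inDegree: endorsements received
    rpts = sum(
        1 / sum(1 for _, r in edges if r == s)             # endorser s spends CPTs = 1/Er(s),
        for s, t in edges if t == node                     # cf. equation (2)
    )
    impact = (min(Eg, Er) / max(Eg, Er)) * rpts            # equations (1) and (4)
    return {"outDegree": Eg, "inDegree": Er, "receivedPoints": rpts, "IF": impact}

for n in ("a", "b", "c"):
    print(n, profile(n))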
The simulation model provided the basis for collecting datasets about the endorsement system and analyzing them in a manner similar to a net flow diagram [17]. This information was useful in detecting anomalies within the endorsement system. As evident in Figure 4, there were no anomalies with the model, which meant the model could be used to compute the impact factor for each node.

The Impact Factor (IF) parameter was computed based on Eg and Er. Out of 5000 nodes, 3800 nodes (76%) had an IF of 0. On further examination of these nodes, we found only one incoming or outgoing link (both Eg and Er had values of 1). This makes sense according to our model, because a node can only make an impact on the ecosystem if it has more than one connection. In our case, an IF of 0 is expected, because it represents an initial node and does not in any way suggest that the node is untrustworthy in the system. Based on these findings, it is appropriate to conclude that 76% of the nodes in our network were new users. The remaining 1,200 nodes had considerable IF scores.

There were no nodes with an IF of more than 1 that had an accumulated RPTs of 0. If this were the case, then their IF would still be 0, because RPTs is a significant contributor on the reputation platform. Our model shows that it is possible for some nodes to have the maximum possible ratio (which in this case is 1) but still have a low IF. This is true because, as mentioned earlier, Eg and Er alone do not determine the overall IF on the network. The table below provides some results for selected nodes that had a low IF:

Table 1: Selected nodes with low IF

Label   Eg   Er   Ratio   RPTs   IF
3       2    2    1       0      0
15      5    3    0.6     1      0.2
41      4    3    0.75    2      0.3

The results in the above table clearly demonstrate that the ratio between Eg and Er is not the key contributor to the IF in the network. Consequently, it is hard to predict whether a particular entity in the network will have a high IF merely from the ratio.

IV. CONCLUSION

Blockchain adds an extra layer of trust and security to online interactions and can provide much-needed reputation in online interaction platforms. In this paper, we have presented a trusted, decentralized reputation model based on the Ethereum Blockchain. Our platform can help enforce reputation and trust in online interactions. The IF score computed for an entity in the reputation model is a mark of its trustworthiness in the ecosystem. Using this score, any user intending to participate in an online interaction with another party can verify the true identities and obtain proof that the parties are who they claim to be.

While our model is one of the first attempts at leveraging Blockchain to infer the trustworthiness of nodes based on peer-to-peer interactions and the computation of IF, it does not in any way challenge existing cryptographic-based reputation systems. Limited time and resources did not allow us to explore the existing cryptographic-based reputation systems that take all the factors into consideration while computing the impact factor. Further research should be conducted to provide an in-depth overview of current issues in reputation, including but not limited to the use of EigenTrust systems in reputation systems, the application of anomaly detection algorithms to thwart malicious behavior in reputation systems, and extending the capabilities of reputation platforms to deal with user accounts and emails.

REFERENCES

[1] 'Digital 2019: Global Digital Overview', DataReportal – Global Digital Insights. [Online]. Available: https://datareportal.com/reports/digital-2019-global-digital-overview. [Accessed: 06-Oct-2019].
[2] A. Gonzales, 'The contemporary US digital divide: from initial access to technology maintenance', Inf. Commun. Soc., vol. 19, no. 2, pp. 234-248, 2016.
[3] Setti, Sunil, and Anjar Wanto, "Analysis of Backpropagation Algorithm in Predicting the Most Number of Internet Users in the World", Jurnal Online Informatika, 3.2 (2019): 110-115.
[4] J. K. Rout, S. Singh, S. K. Jena, and S. Bakshi, 'Deceptive review detection using labeled and unlabeled data', Multimed. Tools Appl., vol. 76, no. 3, 3819.
[5] W. Liu, J. He, S. Han, and N. Zhu, 'A Method for the Detection of Fake Reviews based on Temporal Features of Reviews and Comments', in 3rd International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2019).
[6] Nakamoto, S., Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review, 2008 Oct 31:21260.
[7] A. Bogner, M. Chanson, and A. Meeuw, 'A decentralised sharing app running a smart contract on the Ethereum blockchain', in Proceedings of the 6th International Conference on the Internet of Things, 2016, pp. 177–178.
[8] Shanaev, S., Shuraeva, A., Vasenin, M., Kuznetsov, M., Cryptocurrency value and 51% attacks: evidence from event studies. The Journal of Alternative Investments, 2019 Dec 31;22(3):65-77.
[9] Hernan, S. V., inventor; Microsoft Technology Licensing LLC, assignee. Authentication using proof of work and possession. United States patent application US 14/486,864. 2016 Mar 17.
[10] Bentov, I., Gabizon, A., Mizrahi, A., Cryptocurrencies without proof of work. In International Conference on Financial Cryptography and Data Security, 2016 Feb 22 (pp. 142-157). Springer, Berlin, Heidelberg.
[11] Nguyen, C. T., Hoang, D. T., Nguyen, D. N., Niyato, D., Nguyen, H. T., Dutkiewicz, E., Proof-of-stake consensus mechanisms for future blockchain networks: fundamentals, applications, and opportunities. IEEE Access, 2019 Jun 26;7:85727-45.
[12] Tosh, Deepak, et al., "CloudPoS: A proof-of-stake consensus design for blockchain integrated cloud", 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), IEEE, 2018.
[13] Watanabe, H., Fujimura, S., Nakadaira, A., Miyazaki, Y., Akutsu, A., & Kishigami, J. (2016, January). Blockchain contract: Securing a blockchain applied to smart contracts. In 2016 IEEE International Conference on Consumer Electronics (ICCE) (pp. 467-468). IEEE.
[14] Nugent, Timothy, David Upton, and Mihai Cimpoesu, "Improving data transparency in clinical trials using blockchain smart contracts", F1000Research 5 (2016).
[15] Cong, Lin William, and Zhiguo He, "Blockchain disruption and smart contracts", The Review of Financial Studies 32.5 (2019): 1754-1797.
[16] Dennis, R., Owen, G., Rep on the block: A next generation reputation system based on the blockchain. In 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), 2015 Dec 14 (pp. 131-138). IEEE.
[17] Fichtner, Wolfgang, Qiuting Huang, Bernd Witzigmann, Hubert Kaeslin, Norbert Felber, and Dölf Aemmer, "IIS Research Review 2004", IIS Research Review (2004).
[18] Y. Zhang, C. Xu, J. Ni, H. Li, and X. S. Shen, 'Blockchain-assisted public-key encryption with keyword search against keyword guessing attacks for cloud storage', IEEE Trans. Cloud Comput., 2019.
[19] G. Liang, S. R. Weller, F. Luo, J. Zhao, and Z. Y. Dong, 'Distributed blockchain-based data protection framework for modern power systems against cyber attacks', IEEE Trans. Smart Grid, vol. 10, no. 3, pp. 3162–3173.
[20] S. Eskandari, S. Moosavi, and J. Clark, 'SoK: Transparent Dishonesty: front-running attacks on Blockchain', 2019.
[21] Cachin, C., Byzantine faults. In Concurrency: The Works of Leslie Lamport, 2019 Oct 4 (pp. 67-81).
[22] Gramoli, Vincent, "From blockchain consensus back to Byzantine consensus", Future Generation Computer Systems 107 (2020): 760-769.
[23] L. Luu, V. Narayanan, K. Baweja, C. Zheng, S. Gilbert, and P. Saxena, 'SCP: A computationally-scalable Byzantine consensus protocol for blockchains', 2015.
[24] S. Tamang, Decentralized Reputation Model and Trust Framework: Blockchain and Smart Contracts, 2018.
[25] Fernández Anta, A., Rimba, P., Abeliuk, A., Cebrian, M., Stavrakakis, I., Tran, A. B., & Ojo, O. (2019). Miner Dynamics on the Ethereum Blockchain.
[26] King, Sunny, and Scott Nadal, "PPCoin: Peer-to-peer crypto-currency with proof-of-stake", self-published paper, August 19 (2012).
[27] W. Wang, 'A Vision for Trust, Security and Privacy of Blockchain', in International Conference on Smart Blockchain, 2018, pp. 93–98.
[28] M. Vukolić, 'The quest for scalable blockchain fabric: Proof-of-work vs. BFT replication', in International Workshop on Open Problems in Network Security, 2015, pp. 112–125.
[29] C. Kaligotla and C. M. Macal, 'A generalized agent based framework for modeling a blockchain system', in Proceedings of the 2018 Winter Simulation Conference, 2018, pp. 1001–1012.
[30] Berger, T. P., Gueye, C. T., Klamti, J. B., A NP-complete problem in coding theory with application to code based cryptography. In International Conference on Codes, Cryptology, and Information Security, 2017 Apr 10 (pp. 230-237). Springer, Cham.
[31] M. Prates, P. H. Avelar, H. Lemos, L. C. Lamb, and M. Y. Vardi, 'Learning to Solve NP-Complete Problems: A Graph Neural Network for Decision TSP', in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, vol. 33, pp. 4731–4738.
[32] Y. Hou, X. Zhao, Q. Li, J. Chen, Y. Li, and Z. Zheng, 'Solving large-scale NP-Complete problem with an optical solver driven by a dual-comb "clock"', in CLEO: Science and Innovations, 2019, pp. SF1M–2.
[33] Sultan, K., Ruhi, U., Lakhani, R., Conceptualizing blockchains: characteristics & applications. arXiv preprint arXiv:1806.03693, 2018 Jun 10.
[34] DelGaudio, C. I., Hicks, S. D., Houston, W. M., Kurtz, R. S., Hanrahan, V. A., Martin Jr, J. A., Mummey, D. P., Murray, D. G., Prince, J. E., Pritsky, R. R., Rauch, D. C., inventors. Method and system for network connectivity migration management. United States patent US 9,928,480. 2018 Mar 27.
[35] Efanov, Dmitry, and Pavel Roschin, "The all-pervasiveness of the blockchain technology", Procedia Computer Science 123 (2018): 116-121.
An Observation on Residential Complexes as a New Housing Typology in Post-Socialist Tirana
Abstract— The shift from the socialist to the post-socialist political system caused an urban expansion of Tirana, which brought about the growth of its population. This post-socialist urban context, associated with a high demand for dwellings, the limited construction sites in the inner city and the increasing land prices, pushed developers to construct residential complexes as a new form of housing in Tirana. This research aims to reveal the features of residential complexes as a new housing typology that emerged during the post-socialist period in Tirana by analyzing typological housing features and the common outdoor spaces. To investigate properly, three case studies are selected: the "Halili", "Homeplan" and "Kika" residential complexes. The methodology used in the research includes visual documentation, archival research for the provision of the drawings, and an analysis of the typological features and outdoor spaces of the residential complexes. The study revealed that residential complexes in Tirana are characterized by dense high-rise apartment blocks or towers, which contain many apartments. The spatial features are characterized by a low number of cores, which in the majority of cases are not lit and provide access to a large number of apartments. Apart from the housing functions, commercial functions are found on the ground floors, whereas in some cases business functions are located on upper floors as well. The common outdoor spaces are characterized by limited green areas and are considerably usurped by the commercial units of the ground floors.

Keywords—residential complexes, housing typology, post-socialist period, Tirana

I. Introduction on Multifamily Housing in Post-socialist Tirana

Multifamily housing development is one of the major dwelling forms of the post-socialist period in Tirana. The post-socialist housing developments in Tirana are characterized by informal housing settlements at the city periphery and high-rise dwellings in the inner city [1]. Although after the 1990s, due to technological limitations, the first multifamily housing took the form of middle-rise apartment blocks, by the 2000s the dominant majority had evolved into high-rise apartment blocks or residential complexes.

While the site needed for an apartment block was smaller, in the case of residential complexes there was a need for larger sites. For this reason residential complexes were easier to develop in the peri-urban zones of the city (outside the middle ring), but those zones lacked the needed infrastructure, making them unattractive for clients. Thus, it was more attractive to develop residential complexes in the inner-city zones, which in turn lacked such large sites for development. Taking into consideration the complex issue of land ownership in Albania and the weak urban planning administration, the targets of developers for residential complex sites became the public green and sportive spaces (the case of the "Fusha e zeze" field or the "Partizani" club training grounds) or socialist-period enterprises which had gone bankrupt or stopped production [2] (the case of the ex-mechanical factory "Enver Hoxha").

The number of residential complexes developed in peri-urban areas after 2010, consistent with the expansion of the city, has increased. The characteristic feature of residential complexes is the high-rise apartment blocks (in many cases tower-shaped) that aim to maximize the dwelling area. They are arranged at the perimeter of rectangular sites, and in appearance the spaces in between are dedicated to outdoor social activities. However, in most cases the ground floors are designed for coffee shops, which easily usurp the public spaces [3].

II. Case Study on Three Residential Complexes in Post-Socialist Tirana

This research focuses on the analysis of three case studies of residential complexes in Tirana. The analysis includes a description of the site plans, the housing typological features and the common outdoor spaces of the residential complexes. The analysis is supported with visual material for each of them, including photos, plan drawings and technical materials. Firstly, an introduction to the site plan and urban context of each residential complex, including its position relative to the city center, is given. The housing typological features include architectural composition, spatial features, function distribution and exterior solutions. The common outdoor spaces explain the shared spaces within the sites of the residential complexes, their organization, and the hardscape and green space ratios. Furthermore, the analysis includes observations on the usage of these spaces, the relation between the dwellers and the commercial units located on the ground level, and parking accesses or passages.

A. "Halili" Residential Complex

The "Halili" complex is conceptualized as a residential development and business center. It is located in the inner city of Tirana, on Dibra street, within an old neighborhood approximately 820 meters from the city center. Designed by Vladimir Bregu, it was completed in 2004. Although initially it was planned to cover a larger zone (approximately 13,900 m2), its site is approximately 7,200 m2 [4]. The site plot has a triangular-like shape, excluding the "Partizani" high school, which borders the complex at its rear. The design approach was to create a built perimeter on the main streets while enclosing the inner space, giving it private attributes.

Based on that, the perimeter of the land plot is contoured by high-rise apartment blocks that vary in height from a minimum of 6 floors up to 11-floor residential towers. The common outdoor spaces are located at the back of the built perimeter, providing the necessary intimacy and privacy.
Figure 1. Location of "Halili" residential complex (left), image (middle) and its common outdoor space (right).
Figure 2. Site plan of "Halili" complex with realized parts in green (left), and normal floor plan schemes of the perpendicular (middle) and parallel (right) housing blocks.
The commercial units located on the ground floor that look onto the inner common outdoor space are mostly unoccupied, and the small number that do have a tenant consist of small mini-markets or dry-cleaning businesses. As for the usage of the public space, it is generally a quiet place, although it is not vehicle-free. The people using the space are generally elderly people, or young parents carrying their infants in strollers and spending a bit of quiet time outdoors. The space is populated during the late afternoon hours, when the sun is setting, by children playing unattended by their parents, while most of the residents tend to sit at the bars located nearby.

B. "Homeplan" Residential Complex

"Homeplan" is designed as a residential complex composed of residential and commercial units. It is located in the western part of Tirana, outside the middle ring. It was designed by architect Irina Branko and built by a local company called Kontakt. The site plan scheme of the "Homeplan" residential complex consists of building volumes placed at the perimeter of a quasi-rectangular site with an inner courtyard in the center. It is approximately 2.4 km from the city center and is positioned on a 6,751 m2 plot. Construction started in 2010 and ended in 2012. In verticality it reaches a maximum of 8 floors, with one underground parking floor. Apart from housing, the other facilities present within the "Homeplan" complex consist of service units such as bars, restaurants, or supermarkets, which are placed on the ground floor [5].

Figure 3. "Homeplan" residential complex location (left) and its image (right)

B.1. Housing Typological Features

There are seven apartment blocks in this residential complex. The built volumes rise as extruded volumes, while some of them do not continue to the last floor but reach a maximum of 5 floors. In this way the roof is converted into an open veranda for the users of the 6th floor, adding value to the apartments above as well as relieving the massive impact of one whole volume. Under the lower volumes there are passageways at the ground level of the building. The width of the buildings varies from 14.8 meters to 20 meters. On the outer perimeter the balconies and loggias are cantilevered, while on the inner perimeter they stay within the built perimeter of the building.

The spatial organization of the apartment blocks is planned around cores placed at the center of the respective block. The cores consist of a staircase and two elevators. The number of apartments accessed from the same core varies from five to seven. The cores and the corridors are not lit in the majority of cases. As for the commercial units and their location, the ground floor, being the space with the easiest direct access from the main road, is given over to their function. The residents' entrances are positioned in such a way as not to interfere with or occupy a valuable front-line commercial unit; they are placed in the inner part of the passageways. All the entrances are numbered to ease the directionality and orientation of the dwellers.

The exterior of this residential complex, thanks to the innovative materials and the preciseness of the realization technique, can be clearly distinguished among the vicinity buildings. The façades of this residential complex are coated with raw stucco finishing. The finishing colors are soft: a pinkish pastel, light grey and white.

B.2. Common Outdoor Spaces

The common outdoor space of the "Homeplan" residential complex is a central square plaza arranged in the inner courtyard. Although this space is semi-closed and introverted, it is usable and accessible also for dwellers from outside the complex. The public space has a coverage area of approximately 1,200 square meters, where urban furniture, greenery and water features are integrated.

The materials used in the inner courtyard consist of grey stamped concrete with a stone-like pattern. Minimal green areas are designed in rectangular volumes raised 50 cm from the ground level, which are offset from the inner building perimeter. The inner part of these volumes is designed as a hilly-like terrain, planted with grass, bushes and high trees. Urban furniture is integrated along the perimeter of the volumes; it is partly designed with wooden elements, which create a warmer and more inviting element than concrete.

The services located on the ground floor are bars and cafeterias, which usurp the common outdoor space for their private purposes. This results in a limited area for users who do not want to use the nearby cafeterias. Based on the observations we conducted on site, the large number of coffee
Based on the observations we conducted on site, the large number of coffee shops pushes the young parents to sit there, to keep their children under their supervision.
Figure 4. "Homeplan" site plan (left), normal floor plan scheme (middle) and image of common outdoor spaces (right)
Figure 5. Image of “Kika” residential complex (left) and common outdoor spaces (right)
The number of apartments accessed from one core varies from five to eleven. In some cases, a certain apartment block is not designed with a core, and they are connected via bridges to the core of the other block. Apart from the ground floor being allocated for commercial activities, in the angular part, in difference from the previous block, the first floor is totally dedicated to commercial activities, shifting the apartment units to the second floor up to the ninth.

The façade of this residential complex is covered in grey plaster, giving the blocks a neutral look. Meanwhile the outer and inner façades are fragmented, resulting in a much lighter volume and a game of solids and voids. The usage and interplay of vertical windows on the perimeter of the block serve to break the horizontality of the built environment. The inner façade differs from the outer one, being more porous and offering more transparency and light to the inner environment. The inner façade consists of continuous loggias that are divided by walls between two apartments, providing the necessary privacy between the neighbors.

C.2. Common Outdoor Spaces

The common outdoor spaces of the residential complex are a result of the arrangement of the building blocks on the perimeter of the plots, resulting in an introverted courtyard. The linkage between these inner spaces and the surrounding neighborhood is done through several punctuations in the built volumes, in the form of passages that also serve as controlled entrances to these spaces. Each of these public spaces has two or three of these punctuations that either connect them with the neighborhood or the sites with each other. The built volumes reach up to a maximum of 9 floors; despite that, the distance among them provides the opportunity for the inner environment to receive plenty of natural daylight throughout the day. The materials used in the inner spaces consist of grey stamped concrete with a stone-like pattern [6].

The green spaces are abundant and consist of two orthogonal islands in the inner court and another smaller one at the eastern edge of the site. They are designed as volumes raised from the ground level by approximately 50 cm. The topography of the green spaces is designed as a hilly-like terrain and is planted with grass, bushes, and high trees. The perimeter of the green spaces is designed as urban sitting furniture consisting of concrete bordure and wooden elements, providing a warmer environment.

III. Concluding Remarks

To conclude, it can be said that the residential complexes are a new form of dense mass housing that has emerged in considerable quantity during the post-socialist period in Tirana. The increasing number of this housing form is related to the capitalist economic reality, based on which the developers aim to maximize the housing construction profit while the purchasing power of the citizens is not high. Due to this context, the residential complexes that were the subject of this study are in the majority featured by dense and high-rise, tower-like apartment blocks.

Regarding their spatial organization, their cores consist of stairs which are unlit and provide access to a large quantity (five to twelve) of apartments. In some cases, long linear corridors are also observed. The exterior of this housing typology reflects the quality of contemporary construction technology and in many cases has played an avant-gardist role, using curvilinear shapes and curtain wall façades.

The common outdoor spaces are generally designed in the form of inner courts. Due to the commercial activities on the ground floors of the housing blocks and the lack of the respective legal framework needed to regulate the usage of the common outdoor spaces, they are predominantly subject to usurpation by the owners of these activities. The green spaces are present in the inner courtyards, although they are in the minority compared to the overall space.
Figure 6. Kika residential complex in green within Le Serre master plan (left), "Kika" 3rd division site plan (middle) and normal floor plan (right)
References
[1] Manahasa Edmond, Manahasa Odeta, "Defining urban identity in a post-socialist turbulent context: The role of housing typologies and urban layers in Tirana", Habitat International, Volume 102, 102202, 2020.
[2] Manahasa, Edmond, Ozsoy, Ahsen, "Place attachment to a larger through a smaller scale: Attachment to city through housing typologies in Tirana", Journal of Housing and the Built Environment, 35(1), 265–286, 2020.
[3] Manahasa, Edmond, "Place attachment as a tool in examining place identity: A multilayered evaluation through housing in Tirana", PhD dissertation, Istanbul Technical University, 2017.
[4] Halili construction archive, July 2018.
[5] Kontakt archive, July 2018.
[6] Kika Construction archive, July 2018.
[7] http://www.atenastudio.it/sitoweb/project.php
Active Control Of In-Wheel Motor Electric Vehicle
Suspension Using The Half Car Model
Abstract— In recent years, electric vehicles are becoming more mainstream. Even though in-wheel electric motor vehicles have been developed as prototype vehicles, they have not become common yet. One of the reasons for this is that when an electric motor is placed in the wheel, the wheel becomes heavier, which tends to worsen the road holding properties of the vehicle. Active suspensions are currently used in some high-end vehicles to improve passenger comfort, vehicle handling and road holding. In this paper an active suspension is used to show that it is possible to mitigate the road holding problems caused by in-wheel electric motors. The ground-hook method is used as the control strategy. First, a quarter car model is used to show that adding weight to the wheel does actually decrease the road holding performance of the vehicle. Afterwards, the active suspension is shown to be able to improve the road holding properties of the vehicle to acceptable levels. The simulations are also repeated with a half car model. They also show that increasing the tire mass worsens the road holding performance of the vehicle. However, this bad performance can be reversed by the ground-hook active suspension. Simulations also show that the ground-hook controller worsens the passenger comfort level in the vehicle.

Keywords—active suspension, road holding, ground-hook, electric vehicle, in-wheel motor

I. Introduction

The suspension system is one of the most important parts of an automobile; it isolates the vehicle from road shocks and vibrations and provides comfort to the occupants [1][2]. Automotive suspensions are divided into three forms, namely passive, semi-active, and active suspension systems. Passive suspensions have always been used and continuous improvements have been made by research. It is impossible to meet both ride comfort and road holding demands at the same time with a passive suspension. Passive suspension systems are the most common and are widely used because of their low cost and high reliability. This type of system is considered an open loop system [2]. A passive suspension consists of a conventional spring (the spring is compressed and stretched to absorb the wheel movement) and a damper, which is a shock absorber that works on the vibration motion of the vehicle. The main aim of using a damper is to slow down and minimize the vibration magnitude caused by the road. The damper is connected in parallel with the spring; both are fixed and impossible to change externally by any signal [3][4]. So, one would need a spring which can be stiff and soft simultaneously [3][4][5]. Researchers have made a lot of improvements over the years, and most experts think that passive suspensions are hard to improve further. Ground-hook control is one of the control strategies applied to the automotive suspension. A ground-hook controller is used to improve the road holding for both quarter and half car models. This control method supposes that a damper is virtually connected between the unsprung mass and the ground [6]. The ground-hook's main task is to reduce the vertical displacement of the tire and keep the ground-tire contact force in as narrow a range as possible around its mean value [7]. Ground-hook shows improvement when compared with the passive suspension model. The use of the half car model to show the effects of the ground-hook controller is the main contribution of this paper. The goal is to design a control strategy, namely a ground-hook controller, to improve road holding for the vehicle.

II. Quarter Car Model

A. Quarter Car Model Passive Suspension System

The quarter car model is the most popular model used in the analysis and design of automotive suspensions. The main reason to use this model is that it is simple, it can give reasonably accurate information and it predicts a lot of important properties of the full car. Figure 1 shows a passive suspension system for a quarter car in which the wheel is connected to the body of the car by passive parameters (spring and damper), and the tire is represented as a spring; the damping of the tire is neglected. The quarter car structural model involves the mass of the car ($m_s$) and the mass of the tire ($m_u$). There are three vertical displacements included in the quarter car: the vertical displacement of the mass of the car ($z_s$), the vertical displacement of the mass of the tire ($z_u$) and the vertical displacement of the road ($z_r$).

Figure I. Passive suspension system for a quarter car model

B. Quarter Car Model Forces

$\sum f = ma$ ()
The variables are: ($f_s$) force of spring, ($f_b$) force of damper, ($f_t$) force of tire, ($m_s$) mass of car, ($m_u$) mass of tire, ($\ddot{z}_s$) acceleration of the car mass, ($\dot{z}_s$) velocity of the car mass, ($z_s$) position of the car mass, ($\ddot{z}_u$) acceleration of the tire mass, ($\dot{z}_u$) velocity of the tire mass, ($z_u$) position of the tire mass.

$-f_s - f_b = m_s \ddot{z}_s$ ()

$m_s \ddot{z}_s = -k_s (z_s - z_u) - b_s (\dot{z}_s - \dot{z}_u)$ ()

$f_s + f_b - f_t = m_u \ddot{z}_u$ ()

$m_u \ddot{z}_u = k_s (z_s - z_u) + b_s (\dot{z}_s - \dot{z}_u) - k_t (z_u - z_r)$ ()

$\ddot{z}_u = \frac{k_s}{m_u} x_1 + \frac{b_s}{m_u} x_2 - \frac{k_t}{m_u} x_3 - \frac{b_s}{m_u} x_4$ ()

From eq. (4) and eq. (7) we get the matrix $A$:

$A = \begin{bmatrix} 0 & 1 & 0 & -1 \\ -k_s/m_s & -b_s/m_s & 0 & b_s/m_s \\ 0 & 0 & 0 & 1 \\ k_s/m_u & b_s/m_u & -k_t/m_u & -b_s/m_u \end{bmatrix}$ ()

$L = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}$ ()

C. In-Wheel Motor Electric Vehicle

The In-Wheel Motor Electric Vehicle (IWM EV) is one of the common types of electric vehicle. In the suspension system of an IWM EV the mass of the wheel is almost doubled, because the mass of an electric motor is added to the mass of the wheel. The in-wheel motor (IWM) is placed in the empty space of the wheel [8]. It is known that there is a relationship between the body vibration and the mass of the wheel: when the mass of the wheel is increased, the body vibration also increases, which affects passenger comfort [8]. An IWM EV has extra wheel mass because of the electric motor mass added to the mass of the wheel, which leads to discomfort for the passengers and reduced safety on the road [9]. The increase of unsprung mass causes the vehicle to have worse riding comfort and handling stability [10][11], which affects the tire-road contact [11].

$n_1 = f_s + f_b - f_t$ ()

$n_2 = k_s (z_s - z_u) + b_s (\dot{z}_s - \dot{z}_u)$ ()

$n_3 = -k_t (z_u - z_r)$ ()

$n_1 = n_2 + n_3$ ()

With the total wheel mass $m_t$, the wheel equation becomes

$\ddot{z}_u = \frac{k_s}{m_t} x_1 + \frac{b_s}{m_t} x_2 - \frac{k_t}{m_t} x_3 - \frac{b_s}{m_t} x_4$ ()

From eq. (4) and eq. (16) we get the matrix $A$:

$A = \begin{bmatrix} 0 & 1 & 0 & -1 \\ -k_s/m_s & -b_s/m_s & 0 & b_s/m_s \\ 0 & 0 & 0 & 1 \\ k_s/m_t & b_s/m_t & -k_t/m_t & -b_s/m_t \end{bmatrix}$ ()

D. Ground-hook Control

A ground-hook control introduces a damper connected virtually to the ground, as modelled in Figure 2: ($c_{grd}$) is connected between the mass of the wheel and a fixed imaginary frame on the ground. ($c_{grd}$) is the ground-hook damping coefficient. The damper ($c_{grd}$) is connected to $m_u$ (the mass of the tire) instead of $m_s$ (the mass of the car). The ground-hook controller improves vehicle road holding by minimizing the upper and lower peaks of the wheel displacement and the tire deflection [12].

$f_{grd} = c_{grd} (\dot{z}_u - \dot{z}_r)$ ()

$B = \begin{bmatrix} 0 \\ 1/m_s \\ 0 \\ -1/m_t \end{bmatrix}$ ()

Figure II. In-wheel motor electric quarter vehicle model with ground-hook controller

$y = Cx + Du$ ()

$A$ is the state matrix, $B$ is the input matrix, $C$ is the output matrix and $D$ is the direct transmission matrix; $x$ is the state vector and $u$ is the control vector, where:
$x_1 = z_s - z_u$ ()
$x_2 = \dot{z}_s$ ()
$x_3 = z_u - z_r$ ()
$x_4 = \dot{z}_u$ ()
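To make the quarter car model concrete, a minimal simulation sketch in Python/SciPy follows (the paper's own simulations were carried out in MATLAB/Simulink). The quarter car parameters are assumptions scaled from the half car table given later ($m_s$ taken as a quarter of 1200 kg), and the ground-hook force $f_{grd} = c_{grd}(\dot{z}_u - \dot{z}_r)$ is approximated with $\dot{z}_r \approx 0$, i.e. as extra damping acting on $\dot{z}_u$ alone.

# Minimal sketch (not the authors' Simulink model): quarter-car state-space
# simulation with an ideal ground-hook damper. State x = [zs-zu, zs', zu-zr, zu'].
import numpy as np
from scipy.signal import StateSpace, lsim

ms, mu, mi = 300.0, 40.0, 45.0          # sprung/wheel/motor mass; ms is assumed (1200/4)
ks, kt, bs = 16000.0, 160000.0, 1000.0  # stiffness and damping from the half-car table
c_grd = 10000.0                         # ground-hook damping coefficient

def quarter_car(mt, c=0.0):
    A = np.array([[0.0,    1.0,    0.0,    -1.0],
                  [-ks/ms, -bs/ms, 0.0,    bs/ms],
                  [0.0,    0.0,    0.0,    1.0],
                  [ks/mt,  bs/mt,  -kt/mt, -bs/mt]])
    A[3, 3] -= c / mt                   # f_grd = c*(zu' - zr'), approximated as c*zu'
    L = np.array([[0.0], [0.0], [-1.0], [0.0]])   # road-velocity disturbance enters x3'
    return StateSpace(A, L, np.eye(4), np.zeros((4, 1)))

t = np.linspace(0.0, 10.0, 10001)
zr_dot = 0.05 * np.random.default_rng(0).standard_normal(t.size)  # random road input

for label, sys in [("normal passive", quarter_car(mu)),
                   ("IWM EV passive", quarter_car(mu + mi)),
                   ("IWM EV + ground-hook", quarter_car(mu + mi, c_grd))]:
    _, y, _ = lsim(sys, zr_dot, t)
    print(f"{label:22s} RMS tire deflection: {np.sqrt(np.mean(y[:, 2] ** 2)):.5f} m")

The RMS of $x_3 = z_u - z_r$ printed here serves as a rough proxy for the tire-deflection comparisons plotted in Figures V, VII and VIII.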
Figure VIII. Comparison in tire deflection for quarter vehicle between normal passive, IWM EV passive, IWM EV with ground-hook controller and IWM EV with sky-hook controller

III. Half Car Model

A. Half Car Model Passive Suspension System

It is presented as a linear four-degree-of-freedom (4-DOF) system. The vehicle body has two motions, heave and pitch, and the front and rear tire motions are also involved in the half car. A single car mass is connected to two wheel masses, one at each corner. Vertical and pitch motion are appropriate for the sprung mass, while only vertical motion applies to both unsprung masses. The pitch motion of the half car sprung mass is represented, and the vertical displacements of the front tire ($z_{u1}$) and the rear tire ($z_{u2}$) are also introduced.

Figure IX. Passive suspension system for half car model (4-DOF)

B. Half Car Model Forces

The state vector $x$ is $[z, \dot{z}, \theta, \dot{\theta}, z_{u1}, \dot{z}_{u1}, z_{u2}, \dot{z}_{u2}]$: position of the sprung mass ($z$), velocity of the sprung mass ($\dot{z}$), the pitch angle ($\theta$), the pitch rate ($\dot{\theta}$), $z_{u1}$ (position of the front tire), $\dot{z}_{u1}$ (absolute velocity of the front unsprung mass), $z_{u2}$ (position of the rear tire), $\dot{z}_{u2}$ (absolute velocity of the rear unsprung mass). $u$ is the input vector $[z_{r1}, z_{r2}]$.

$a_1 = m_s \ddot{z}$ ()
$a_2 = k_1 (z_1 - z_{u1}) + b_{s1} (\dot{z}_1 - \dot{z}_{u1})$ ()

$z_1 = z - l_f \theta$ ()
$\dot{z}_1 = \dot{z} - l_f \dot{\theta}$ ()
$z_2 = z + l_r \theta$ ()
$\dot{z}_2 = \dot{z} + l_r \dot{\theta}$ ()

$\ddot{z} = \left(-\frac{k_1}{m_s} - \frac{k_2}{m_s}\right) z + \left(\frac{k_1 l_f}{m_s} - \frac{k_2 l_r}{m_s}\right)\theta - \left(\frac{b_{s1}}{m_s} + \frac{b_{s2}}{m_s}\right)\dot{z} + \left(\frac{b_{s1} l_f}{m_s} - \frac{b_{s2} l_r}{m_s}\right)\dot{\theta} + \frac{k_1}{m_s} z_{u1} + \frac{b_{s1}}{m_s}\dot{z}_{u1} + \frac{k_2}{m_s} z_{u2} + \frac{b_{s2}}{m_s}\dot{z}_{u2}$ ()

$(m_u + m_i)\ddot{z}_{u1} = -F_t + F_s + F_b$ ()

$b_1 = m_t \ddot{z}_{u1}$ ()
$b_2 = -k_{t1}(z_{u1} - z_{r1}) + k_1 (z_1 - z_{u1})$ ()
$b_3 = b_{s1}(\dot{z}_1 - \dot{z}_{u1})$ ()
$b_1 = b_2 + b_3$ ()

$c_1 = \ddot{z}_{u1}$ ()
$c_2 = -\frac{k_{t1}}{m_t} z_{u1} - \frac{k_1}{m_t} z_{u1} - \frac{b_{s1}}{m_t}\dot{z}_{u1}$ ()
$c_3 = \frac{k_1}{m_t} z + \frac{b_{s1}}{m_t}\dot{z}$ ()
$c_4 = -\frac{k_1 l_f}{m_t}\theta - \frac{b_{s1} l_f}{m_t}\dot{\theta}$ ()
$c_5 = \frac{k_{t1}}{m_t} z_{r1}$ ()
$c_1 = c_2 + c_3 + c_4 + c_5$ ()

$d_1 = m_t \ddot{z}_{u2}$ ()
$d_1 = -f_t + f_s + f_b$ ()
$d_2 = -k_{t2}(z_{u2} - z_{r2}) + k_2 (z_2 - z_{u2})$ ()
$d_3 = b_{s2}(\dot{z}_2 - \dot{z}_{u2})$ ()
$d_1 = d_2 + d_3$ ()

$e_1 = \ddot{z}_{u2}$ ()
$e_2 = -\frac{k_{t2}}{m_t} z_{u2} - \frac{k_2}{m_t} z_{u2} - \frac{b_{s2}}{m_t}\dot{z}_{u2}$ ()
$e_3 = \frac{k_2}{m_t} z + \frac{b_{s2}}{m_t}\dot{z}$ ()
$e_4 = \frac{k_2 l_r}{m_t}\theta + \frac{b_{s2} l_r}{m_t}\dot{\theta}$ ()
$e_5 = \frac{k_{t2}}{m_t} z_{r2}$ ()
$g_2 = -b_{s1}(\dot{z}_1 - \dot{z}_{u1}) l_f + b_{s2}(\dot{z}_2 - \dot{z}_{u2}) l_r$ ()
$g_3 = g_1 + g_2$ ()

$r_1 = \left(\frac{k_1 l_f}{I} - \frac{k_2 l_r}{I}\right) z + \left(\frac{b_{s1} l_f}{I} - \frac{b_{s2} l_r}{I}\right)\dot{z}$ ()
$r_2 = -\left(\frac{k_1 l_f^2}{I} + \frac{k_2 l_r^2}{I}\right)\theta - \left(\frac{b_{s1} l_f^2}{I} + \frac{b_{s2} l_r^2}{I}\right)\dot{\theta}$ ()
$r_3 = -\frac{k_1 l_f}{I} z_{u1} - \frac{b_{s1} l_f}{I}\dot{z}_{u1}$ ()
$r_4 = \frac{k_2 l_r}{I} z_{u2} + \frac{b_{s2} l_r}{I}\dot{z}_{u2}$ ()
$\ddot{\theta} = r_1 + r_2 + r_3 + r_4$ ()

Parameters for the half vehicle model passive suspension system: mass of car $m_s = 1200$ kg, mass of tire $m_u = 40$ kg, the right spring stiffness $k_1 = 16000$, the left spring stiffness $k_2 = 16000$, the right tire stiffness $k_{t1} = 160000$, the left tire stiffness $k_{t2} = 160000$, the right damper $b_{s1} = 1000$, the left damper $b_{s2} = 1000$, the distance from the front wheel to the center of gravity ($l_f = 1.1$ m), the distance from the rear wheel to the center of gravity ($l_r = 1.3$ m); $I$ is the mass moment of inertia. The input matrix is

$B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ k_{t1}/m_u & 0 \\ 0 & 0 \\ 0 & k_{t2}/m_u \end{bmatrix}$ ()

Parameters for the in-wheel motor half electric vehicle (IWM EV) with ground-hook controller: mass of car $m_s = 1200$ kg, mass of tire $m_u = 40$ kg, mass of in-wheel motor $m_i = 45$ kg, the right spring stiffness $k_1 = 16000$, the left spring stiffness $k_2 = 16000$, the right tire stiffness $k_{t1} = 160000$, the left tire stiffness $k_{t2} = 160000$, the distance from the front wheel to the center of gravity ($l_f = 1.1$ m), the distance from the rear wheel to the center of gravity ($l_r = 1.3$ m), the ground-hook damping coefficients $c_{grd1} = c_{grd2} = 10000$; $z_{r1}$ (random ground input 1), $z_{r2}$ (random ground input 2), $f_{grd1}$ (ground-hook force 1), $f_{grd2}$ (ground-hook force 2). $u$ is the input vector $[z_{r1}, z_{r2}, f_{grd1}, f_{grd2}]$ and the input matrix is

$B = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1/m_s & 1/m_s \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -l_f/I & l_r/I \\ 0 & 0 & 0 & 0 \\ k_{t1}/m_t & 0 & -1/m_t & 0 \\ 0 & 0 & 0 & 0 \\ 0 & k_{t2}/m_t & 0 & -1/m_t \end{bmatrix}$ ()

The full state matrix for the passive half car is

$A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-\frac{k_1 + k_2}{m_s} & -\frac{b_{s1} + b_{s2}}{m_s} & \frac{k_1 l_f - k_2 l_r}{m_s} & \frac{b_{s1} l_f - b_{s2} l_r}{m_s} & \frac{k_1}{m_s} & \frac{b_{s1}}{m_s} & \frac{k_2}{m_s} & \frac{b_{s2}}{m_s} \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\frac{k_1 l_f - k_2 l_r}{I} & \frac{b_{s1} l_f - b_{s2} l_r}{I} & -\frac{k_1 l_f^2 + k_2 l_r^2}{I} & -\frac{b_{s1} l_f^2 + b_{s2} l_r^2}{I} & -\frac{k_1 l_f}{I} & -\frac{b_{s1} l_f}{I} & \frac{k_2 l_r}{I} & \frac{b_{s2} l_r}{I} \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
\frac{k_1}{m_u} & \frac{b_{s1}}{m_u} & -\frac{k_1 l_f}{m_u} & -\frac{b_{s1} l_f}{m_u} & -\frac{k_{t1} + k_1}{m_u} & -\frac{b_{s1}}{m_u} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\frac{k_2}{m_u} & \frac{b_{s2}}{m_u} & \frac{k_2 l_r}{m_u} & \frac{b_{s2} l_r}{m_u} & 0 & 0 & -\frac{k_{t2} + k_2}{m_u} & -\frac{b_{s2}}{m_u}
\end{bmatrix}$ (67)
For the IWM EV half car the state matrix is identical to (67), with the wheel mass $m_u$ replaced by the total wheel mass $m_t = m_u + m_i$:

$A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-\frac{k_1 + k_2}{m_s} & -\frac{b_{s1} + b_{s2}}{m_s} & \frac{k_1 l_f - k_2 l_r}{m_s} & \frac{b_{s1} l_f - b_{s2} l_r}{m_s} & \frac{k_1}{m_s} & \frac{b_{s1}}{m_s} & \frac{k_2}{m_s} & \frac{b_{s2}}{m_s} \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\frac{k_1 l_f - k_2 l_r}{I} & \frac{b_{s1} l_f - b_{s2} l_r}{I} & -\frac{k_1 l_f^2 + k_2 l_r^2}{I} & -\frac{b_{s1} l_f^2 + b_{s2} l_r^2}{I} & -\frac{k_1 l_f}{I} & -\frac{b_{s1} l_f}{I} & \frac{k_2 l_r}{I} & \frac{b_{s2} l_r}{I} \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
\frac{k_1}{m_t} & \frac{b_{s1}}{m_t} & -\frac{k_1 l_f}{m_t} & -\frac{b_{s1} l_f}{m_t} & -\frac{k_{t1} + k_1}{m_t} & -\frac{b_{s1}}{m_t} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\frac{k_2}{m_t} & \frac{b_{s2}}{m_t} & \frac{k_2 l_r}{m_t} & \frac{b_{s2} l_r}{m_t} & 0 & 0 & -\frac{k_{t2} + k_2}{m_t} & -\frac{b_{s2}}{m_t}
\end{bmatrix}$ (70)
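As a cross-check of the matrices above, the following sketch (a Python transcription, not the authors' MATLAB code) assembles the state matrix of eqs. (67) and (70) from the listed parameters and confirms that both the passive and the heavier IWM EV configurations remain asymptotically stable. The pitch inertia $I$ is not given in the text, so a typical passenger-car value is assumed.

# Minimal sketch: build the 8x8 half-car state matrix of eqs. (67)/(70).
# State order: [z, z', theta, theta', zu1, zu1', zu2, zu2'].
import numpy as np

ms = 1200.0
k1 = k2 = 16000.0
kt1 = kt2 = 160000.0
bs1 = bs2 = 1000.0
lf, lr = 1.1, 1.3
I = 1500.0                              # assumed pitch moment of inertia [kg m^2]

def half_car_A(mt):
    return np.array([
        [0, 1, 0, 0, 0, 0, 0, 0],
        [-(k1 + k2)/ms, -(bs1 + bs2)/ms, (k1*lf - k2*lr)/ms, (bs1*lf - bs2*lr)/ms,
         k1/ms, bs1/ms, k2/ms, bs2/ms],
        [0, 0, 0, 1, 0, 0, 0, 0],
        [(k1*lf - k2*lr)/I, (bs1*lf - bs2*lr)/I, -(k1*lf**2 + k2*lr**2)/I,
         -(bs1*lf**2 + bs2*lr**2)/I, -k1*lf/I, -bs1*lf/I, k2*lr/I, bs2*lr/I],
        [0, 0, 0, 0, 0, 1, 0, 0],
        [k1/mt, bs1/mt, -k1*lf/mt, -bs1*lf/mt, -(kt1 + k1)/mt, -bs1/mt, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1],
        [k2/mt, bs2/mt, k2*lr/mt, bs2*lr/mt, 0, 0, -(kt2 + k2)/mt, -bs2/mt]])

for label, m_wheel in [("passive, mu = 40 kg", 40.0), ("IWM EV, mu + mi = 85 kg", 85.0)]:
    eig = np.linalg.eigvals(half_car_A(m_wheel))
    print(f"{label}: all eigenvalues in the left half-plane: {bool(np.all(eig.real < 0))}")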
Figure XIII. Comparison in tire deflection of the front tire for the half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XIV. Comparison in tire deflection of the rear tire for the half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XI. MATLAB simulation comparison for half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XII. Comparison in acceleration of sprung mass for half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller

C. Simulation and Analysis

Figures IV, V, VI and VII show the simulations for the passive and active quarter car models. Figures XII, XIII and XIV show the simulations for the passive and active half car models. Figures IV, VI and XII compare the accelerations of the vehicle bodies for the normal passive vehicle, the IWM passive vehicle, the IWM ground-hook active suspension vehicle and the IWM sky-hook active suspension vehicle. The acceleration is commonly used to measure passenger comfort. According to Figures IV, VI and XII, the added weight of the in-wheel electric motor does not have much of an impact on the acceleration and vehicle comfort. However, the ground-hook active suspension increases the acceleration and diminishes passenger comfort, while the sky-hook active suspension decreases the acceleration and improves the comfort for the occupants inside the car. Figures V, VII, XIII and XIV show the tire deflections for the normal passive vehicle, the IWM passive vehicle, the IWM ground-hook active suspension vehicle and the IWM sky-hook active suspension vehicle. The added weight of the in-wheel motor
increases the tire deflection dramatically. Therefore, it is observed that the road holding properties of the tire are worse for the in-wheel motor passive suspension system, and also worse for the sky-hook controller active suspension, because the sky-hook controller only improves the comfort of the passengers. However, the ground-hook controller active suspension is able to decrease the tire deflection to values close to the passive suspension. Therefore, the ground-hook controller is able to reach its goal of improving road holding. This simulation shows that the ground-hook controller is able to eliminate the negative effects of an IWM EV in terms of road holding. However, this is done at the cost of decreasing passenger comfort. The sky-hook controller is able to improve the comfort for the passengers of an IWM EV, but fails to improve the road holding. It should be noted that the gain of the ground-hook controller can be adjusted to decrease the negative effects on passenger comfort. However, this will decrease the effectiveness of the ground-hook controller.

IV. Conclusion

A passive suspension system for both a normal passive vehicle and an IWM EV without any controller, an IWM EV with ground-hook controller and an IWM EV with sky-hook controller were analyzed using MATLAB/Simulink. The simulations show that the ground-hook controller improves road holding by reducing the tire deflection for the IWM EV. The sky-hook is shown to improve the comfort for the occupants inside the car. In the case of decreasing or increasing the unsprung mass, there is an opposite relationship between the size of the unsprung mass and the road holding. When the unsprung mass is decreased, a better road holding response is achieved. On the other hand, increasing the unsprung mass for the IWM EV (adding the mass of the electric motor, 45 kg) to the front and rear wheel assemblies results in worse road holding, which directly affects the ground contact of the tire. This is the reason for the worse road holding of the IWM EV. When the ground-hook controller is used for the IWM EV, better road holding and good contact with the road are achieved. This is because the ground-hook controller provides better isolation from road disturbances by reducing the upper and lower peaks of the tire deflection. The sky-hook controller fails to improve the road holding because it is used to provide comfort for the passengers. These results were shown with simulations using both the quarter car model and the half car model.

References
[1] Nouby M. Ghazaly, Ahmad O. Moaaz, "The Future Development and Analysis of Vehicle Active Suspension System", Journal of Mechanical and Civil Engineering (IOSR-JMCE), 2014.
[2] Vivek Kumar Maurya and Narinder Singh Bhangal, "Optimal Control of Vehicle Active Suspension System", Journal of Automation and Control Engineering, 2018.
[3] John Ekoru, "Intelligent Model Predictive/Feedback Linearization Control of Half-Car Vehicle Suspension Systems", thesis, 2012.
[4] Fischer, D. and Isermann, R., "Mechatronic semi-active and active vehicle suspensions", Control Engineering Practice, 2004.
[5] Canale, M., Milanese, M. and Novara, C., "Semi-active suspension control using fast model-predictive techniques", IEEE Transactions on Control Systems Technology, 2006.
[6] M. Farid Aladdin, Jasdeep Singh, "Modelling and simulation of semi-active suspension system for passenger vehicle", Journal of Engineering Science and Technology, 2018.
[7] M. Valášek, M. Novák, Z. Šika and O. Vaculín, "Extended Ground-Hook - New Concept of Semi-Active Control of Truck's Suspension", Vehicle System Dynamics, 2007.
[8] Hossein Salmani, Milad Abbasi, Tondar Fahimi Zand, Mohammad Fard and Reza Nakhaie Jazar, "A new criterion for comfort assessment of in-wheel motor electric vehicles", Journal of Vibration and Control, 2020.
[9] Yechen Qin, Chenchen He, Peng Ding, Mingming Dong, Yanjun Huang, "Suspension hybrid control for in-wheel motor driven electric vehicle with dynamic vibration absorbing structures", IFAC PapersOnLine 51-31, 2018.
[10] Fangwu Ma, Jiawei Wang, Yang Wang, Longfan Yang, "Optimization design of semi-active controller for in-wheel motors suspension", 2018.
[11] Jialing Yao, Shuhui Ding, Zhaochun Li, Shan Ren, Saied Taheri, Zhongnan Zhang, "Composite Control and Co-Simulation in In-Wheel Motor Electric Vehicle Suspension", The Open Automation and Control Systems Journal, 2015.
[12] Patil, I. and Wani, K., "Design and Analysis of Semi-active Suspension Using Skyhook, Ground Hook and Hybrid Control Models for a Four Wheeler", SAE Technical Paper, 2015.
[13] Pipit Wahyu Nugroho, Weihua Li, Haiping Du, Gursel Alici, and Jian Yang, "An Adaptive Neuro Fuzzy Hybrid Control Strategy for a Semiactive Suspension with Magneto Rheological Damper", Advances in Mechanical Engineering, 2014.
[14] Rajesh Rajamani, "Vehicle Dynamics and Control", Department of Mechanical Engineering, University of Minnesota.
Evolutionary Deep Automatic CAD System for Early Detection of Diabetic Retinopathy and Its Severity Classification
Lamiya Salman Dr Nidhal Abdulaziz
Faculty of Engineering & Information Faculty of Engineering & Information
Science, University of Wollongong in Science, University of Wollongong in
Dubai, Dubai, UAE Dubai, Dubai, UAE
lamiyasalman3@gmail.com NidhalAbdulaziz@uowdubai.ac.ae
Abstract— Diabetic retinopathy (DR) is a retinal malady prevalent in individuals aged 25 years and above that leads to irreversible blindness when left unchecked. DR has yet to find a definitive cure. Diagnostic and corrective measures may prevent complete blindness by up to 90%, provided early screening and monitory clinical visits. Although ample research and advancements have been made, none have been successfully integrated into the medical system. This is due to the lack of acceptable classification accompanying screening. Further, diluted efforts have been made towards monopolizing early detection, i.e., detection of non-proliferative DR, a subclass and the earliest discernible form of DR. This study aimed at the early detection of DR through adoption of the medically validated 5-class DR severity grading. The proposed system was a novel hybrid scheme based on past works. Multiple input image modalities and classifier combinations were tested to configure the final DR grading scheme. The system was actualized using fundus images as input, which were pre-processed and augmented using green channel extraction, CLAHE and binary masking. Images were then synthesized into discriminatory features using ResNet-50. The final system tier consisted of severity grading through classification using a novel CNN-based Support Vector Machine (SVM). The utilized ensemble of augmentation and pre-processing module along with the ResNet-50 based SVM classifier is a novel contribution not explored in any past works. The proposed pre-processing increased system accuracy by 13.9% and 16% on IDRiD and Kaggle. Overall F1-score, SN, SP and Acc of 0.978, 0.979, 0.995 and 0.979 were achieved. The incorporation of Artificial Intelligence made the proposed system time, cost, and labor efficient, which is key in DR screening.

Keywords— Diabetic Retinopathy, severity classification, Pre-Processing (PP), Convolutional Neural Network (CNN), Support Vector Machine (SVM).

I. Introduction

Diabetic retinopathy (DR), a progressive sight-threatening ailment, develops in individuals with prolonged diabetes type 1 or 2, due to the presence of high blood sugar levels in their system [1]. DR has a predicted global prevalence of 4.4% by 2030 as per the World Health Organization and presents an asymptomatic inceptive stage, a long latent phase which, when deprived of an early diagnosis, may eventually lead to blindness [2]. DR diagnosis and severity analysis today is provided only by an ophthalmologist or through his evaluation of retinal images/scans. This process is cost, time and labor intensive, especially in the case of large, disproportionate field-expert-to-patient ratios, often leading to no patient screening at all [1][3]. These alarming statistics and the need for a cost-efficient and accessible screening system have motivated copious research and advancement in computer aided diagnostic (CAD) devices, but no such work has been able to swap places with a comprehensive eye test [5]. This has been due to inadequate system performance, lack of training data and classification mostly limited to binary severity grades. Lastly, past works have not placed much importance on the early detection of DR.

This study recognizes the vital importance of early detection and aims at a five-level severity grade output. This need for early detection is reinforced by the fact that progression and a possible cure can be achieved only in the early stages of DR, as per the Early Treatment Diabetic Retinopathy Study (ETDRS) and the Diabetic Retinopathy Study (DRS) [6]. The scope of the project extends strictly to screening, grading and subsequent preventive measures/guidelines.

II. Theoretical Background

A. Data Processing

Input retinal images are captured and sourced in real-time conditions such as poor contrast, blurring and non-uniform illumination. Counteracting these implications is essential, as system performance is dependent on accurate retinal DR feature localization and subsequent classification based on these discriminant traits.

Histogram Equalisation (HE) is a common method of image contrast enhancement, used by Foeady et al. [7]. Although frequently used, HE alters the mean brightness of the input image, introducing artifacts and intensity saturation, and therefore is not a suitable choice. Contrast Limited Adaptive Histogram Equalization (CLAHE) is a variant of HE which increases image quality, incorporates uniform intensity equalization and reduces amplified noise, thereby taking care of the shortcomings of HE. CLAHE outperforms global enhancement methods like contrast stretching in blood vasculature (BV) enhancement, as seen in the work done by M. H. Fadzil et al. [8]; this is further validated through the highest peak signal-to-noise ratio and lowest absolute mean brightness error when compared to other variants such as Adaptive Histogram Equalization [8]. Extraction of a single component of the RGB model can be done as a means of contrast enhancement. Empirical evidence along with entropic comparisons by N. R. Binti et al. [9] support extraction of the green component due to its capacity to provide the best distinction among DR features/lesions such as Micro Aneurysms (MA), Cotton Wool spots, Exudates (EXs), Hemorrhages (HM) and BV, when compared to the red, blue and gray channels. This enables maximum entropy preservation while enhancing contrast. In contradiction to this, optic disk localization is aided by red channel entropy extraction, due to better visual contrast, as done by S. Pathan et al. [10]. Qiao et al. [11] employed Matched Filters (MF) combined with LoG filters to localize transient EXs; MFs are not suitable as they enhance BV along with the EXs, and differences in lesion size will lead to inaccurate localization.
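As an illustration of the enhancement chain favoured above (green channel extraction followed by CLAHE), here is a minimal OpenCV sketch; the clip limit, tile grid and mask threshold are assumed values, not parameters reported in this study.

# Minimal sketch: green-channel extraction + CLAHE + crude field-of-view mask.
import cv2
import numpy as np

def enhance_fundus(path, clip_limit=2.0, tiles=8):
    bgr = cv2.imread(path)                       # OpenCV loads images as BGR
    green = bgr[:, :, 1]                         # green channel: best lesion/vessel contrast
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tiles, tiles))
    enhanced = clahe.apply(green)                # local, clip-limited histogram equalization
    mask = (green > 10).astype(np.uint8)         # suppress the dark background corners
    return cv2.bitwise_and(enhanced, enhanced, mask=mask)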
H. Chen et al. [12] cropped retinal images that had incomplete hemispherical boundaries to form complete spheres. This approach is not ideal, as the removal of peripheral regions would lead to inaccurate severity grading due to the loss of diagnostic lesion areas.

While surveying the appropriate literature, a trend was identified among CNN based architectures where pre-processing was mostly limited to Data Augmentation (DA) such as horizontal and vertical flipping, rotation, mirroring, scaling and grayscale normalization, as seen in the work done by S. Pao et al. [13]. This may have been due to the added pre-processing effect of the max pooling layer by means of de-noising, along with the prevalence of DA, which exposes a variety of variations to the concerned neural networks, aiding feature learning instead of separate enhancement of said features. The direct increase in system performance by means of increased PP does not translate to CNN based systems, due to their ability to iteratively learn features from input images without requiring feature specific enhancement or enrichment. Processing may become disruptive to said feature learning process due to the dampening of distinctive characteristics. Undue PP leads to added computational and time complexity along with added memory constraints on the system. Despite this, contrast enhancement and optimal channel extraction have unequivocally facilitated improved system performance, being the choice of many CNN and non-CNN users [14].

B. Feature Segmentation and Extraction

Features form the basis of classification, supplementing the classifier with the information necessary to sort between different classes. This may be done through feature specific, Region of Interest (ROI) and CNN based feature abstraction. The Optic Disk (OD) and BV present in the retina are a potent source of errors, false positive rates (FPR) and misclassification. The minimum-intensity maximum-solidity (MinIMaS) algorithm was invoked by S. Roychowdhury et al. [15] for means of OD extraction and masking. Results of the overall classification system showed 100% sensitivity (SN) but 53.16% specificity (SP), i.e. a high FPR.

Thresholding is a common approach adopted for segmentation. In the case of BV, a wide range of values is present due to differences in vessel width and edge pixels. M. U. Akram et al. [16] eradicated this limitation by means of multilayered thresholding. Skeletonization of the segmented image by means of a thinning morphological operation (MO) was done. Although an Acc of 94.69% was achieved, high performance metrics on a limited dataset are misguiding due to exposure to limited sample types.

MAs are saccular capillary dilatations; for efficient MA segmentation, Mateen et al. [17] applied vessel extraction using a novel hybrid approach of an enhanced Gaussian Mixture Model (GMM) with Adaptive Learning Rate (ALR) to aid ROI localization for subsequent feature extraction. Incorporation of ALR was the optimal choice as it helps increase learning rate and convergence speed over others such as Expectation Maximization (EM). Post vessel detection, Connected Components Analysis (CCA) along with blob analysis was utilized to differentiate among lesions and healthy retina to enhance BV extraction. CCA is ideal as it handles differentiation of BV based on colour, proximity, shape, and size. An accuracy of 98% was reported. MA and HM are spherical in shape with a diameter larger than the feeding blood vessels. Using this property, H. F. Jaafar et al. [18] segmented red lesions by applying the flood-fill operation to pixels pertaining to the background and subtracting them from the original image. Incorrect discrimination between circular and linear shapes may occur due to inaccurate threshold selection. An SN of 89.7% was reported by the study.

V. K. Sree and P. S. Rao [19] carried out EX segmentation by applying Canny edge detection, and smoothening via a gaussian mask was done. CCA was used to colour code segmented EX regions and subtract them from the original image for exudate localization. The reported accuracy was 72.7%. P. Khojasteh et al. [20] extracted features without added pre-processing or prior segmentation for means of EX feature extraction, using ResNet-50, Discriminative Restricted Boltzmann Machines (DRBMs) and a custom CNN. Feature extraction was enhanced using PCA along with parameter optimization using a grid-search approach. Results of ResNet-50 were encouraging (Acc 98.2%, SN 99%) but those of the custom CNN (Acc 70%) and the DRBMs (Acc 89%) were poor.

It may be concluded that while hand crafted feature extraction as seen in [18],[15] is efficient, it is complex and static, hence it would not fare well in extending the system to different datasets while maintaining a high level of performance. This issue is reinforced by manual requirements such as threshold selection based on training data, seen in [18]. CNN based feature extraction is robust and fully automated, hence dynamic, having an edge over feature specific non-CNN extractions. High accuracies may be attributable to overfitting of training data. To exploit CNNs for feature extraction, ROI localization and balanced and large datasets by means of augmentation, as seen in [17][20], may be employed.

C. Classification

Classification may be seen at lesion level or at image level into DR severity grades. Previous studies showed several binary and multilevel classifications such as DR/No DR, Referral (R)/Non-referral (NR) DR, and the ETDRS 5-level classification adopted in this project.

1) Feature based classification

M. H. Ahmad Fadzil et al. [8] used a non-conventional approach by employing the analysis of the Foveal Avascular Zone (FAZ) for DR detection. Classification was limited to a three-level severity grading with SP (>98%), SN (100%) and Acc of 99%. Although the system fared exceptionally, FAZ area overlaps were seen and deemed as progressive stages. M. U. Akram et al. [16] employed PDR classification based on neovascularization detection. A 15x15 window was slid over the segmented BV to compute density and energy. Disparity in behavior with normal blood vessels led to PDR classification. An Acc of 95.02% was achieved.

A novel two step lesion classification and severity estimation was done by S. Roychowdhury et al. [15]. Various feature-based classifiers such as k-nearest neighbors (KNN), GMM, SVM and their ensembles were tested to find the best suited pick for the separate classification of bright and red
lesions. The system achieved high SN ranging from 100-92% but poor SP of 48-58%, reflecting a high FPR due to bias in the algorithm selected for feature set optimization, the feature set being key due to the usage of feature-based classifiers. Akram et al. [21] monopolized M-Medoid classification to develop a hybrid classification model using probabilistic integration of GMM classifiers, an effective multi-class discriminant aided by intuitive modelling due to their resilience towards class overlap, which is common among DR classification classes. Severity classification was based on the numeric presence of MAs, EXs and HMs. Reported metrics were promising at 99.17% SN, 97.07% SP and 98.52% Acc.

2) CNN

The prevalent issue of class imbalance in CNNs is directly addressed by a novel two-step CNN architecture developed by N. Eftekhari et al. [22], which was claimed to decrease FPR by means of pixel wise classification of MAs. A large FPR was generated, reaching up to 8 FP per image at the highest reported SN of 77.1%. Training and developing CNNs manually, along with pixel wise computation, are labor and time intensive and, in the case of the former, error prone.

Shaban et al. [23], H. Chen et al. [12] and H. Seyedarabi et al. [14] undertook DR severity grading using Transfer Learning (TL) without prior feature extraction/segmentation. The authors in [23] engineered DR severity grading using a TL applied VGG-19 network. No prior PP was utilized. Highest Acc, SN, and SP of 89%, 89% and 95% were reported. The TL enhanced CNN by the authors in [23] fared better than the custom CNN proposed by N. Eftekhari et al. [22]. While opting out of segmentation and feature extraction is acceptable due to the structural gains of CNNs, PP such as DA and contrast enhancement is crucial for system performance, especially due to the ambiguous nature of retinal features. Classification was limited to No/Moderate/Severe DR. Unlike the authors in [23], the authors in [14] applied CLAHE for PP and claimed EfficientNet to be the ideal choice of CNN due to its reduced parameters and flops and increased speed and accuracy. Classification was limited to referral and non-referral DR with a reported 93% SN. H. Chen et al. [12] aimed at 5-level severity classification. To this end, pre-trained Inception V3, chosen for its depth and elevated linear expression, was enhanced using Stochastic Gradient Descent (SGD) and an early stop mechanism to mitigate learning rate issues and overfitting. The stated F1 score of 77% was poor, especially in classes 1 and 3.

3) Ensembles

Ensembles of classifiers are an active area of research due to their ability to boost system performance, but at the cost of increased system complexity and time constraints [24]. J. Sahlsten et al. [24] validated this claim by employing the Inception-V3 architecture and its ensemble of 6 units to compare the classification performance of the two systems. Results showed a clear increase in performance across all planes when comparing the single CNN against the ensemble system. M. Ghazal et al. [25] utilized TL and ensembles and combined them to authenticate the advantages of TL. The ensemble of 7 CNNs had pre-trained AlexNets and randomly initialized custom CNNs. Classification was by means of SVM. This choice was made by sampling other classifiers such as KNN, random forest and random tree. The superiority of TL over custom CNNs was validated by the demarcation in performance metrics. This study established the dominance of SVMs over other classifiers when paired with CNNs, through benefits in terms of training, data requirements and time constraints using TL; despite these crucial assertions, the system limited itself to early PDR detection.

Qummar et al. [26], exploiting an ensemble of CNNs with ResNet-50, Inception V3, Xception, Dense121 and Dense169 as the base models, aimed at encoding rich feature extraction and accurate DR classification into 5-class severities, which is not seen often. The system output was the stacked combination of the individual predicted class labels. Khalifa et al. [27] set up a comparative study to find the best CNN architecture for the same purpose as the authors in [26], but unlike the former was limited to a single CNN. The authors in [26] reported an Acc and F1 score of 80.8% and 53.7%, but an impressive F1-score of 95.82% was reported by the authors in [27], with the presumption of AlexNet being the best choice and VGG16 being second. CNN classification, although robust on its own, may be further elevated by interweaving other supervised or unsupervised classifiers, forming an ensemble of sorts [25].

P. Khojasteh et al. [20] set up a comparative analysis of a custom CNN, an unsupervised DRBM, and a ResNet-50 transfer model with its SoftMax classifier layer switched with SVM, supervised Optimum-Path Forest, and k-NN classifiers, comparing the performance of each in EX classification. ResNet-50+SVM outperformed the others with 0.96 SP and 0.99 SN. Mateen et al. [17] classified 5 stages of DR using feature vectors segmented from fundus images, utilizing the SoftMax algorithm. Being one of the few 5-grade classifications using a CNN structure, the reported Acc was 98.13%. It is noted that machine learning (ML) classifiers such as GMMs and SVMs are high performing and efficient but are feature set dependent, hence more complex and resistant to expansion to new datasets [14]. CNNs directly address this issue and are high performing, but fall victim to overfitting and class imbalance; combining the two approaches, SVM and CNNs, would result in elevated results and resilient classification [17][20].

III. Materials and Method

The proposed system is a novel hybrid approach designed to generate DR severity grading based on input fundus images. The overall system block diagram is shown in Figure I.

Figure I. Overall System block diagram with all modalities tested.

1) Dataset

The KAGGLE benchmark dataset is a compilation of 35,126 images with separate training and testing sets, with class labels ranging from 0-4, in line with the severity scheme adopted by this study.

IDRiD comprises 516 images. Images are available in .jpg format, at 4288x2848 resolution with 800KB size. Images are accompanied with severity grading and ground truths.
Table I. Effect of Pre-processing datasets on performance.
Data set Classifier No of images Accuracy Sensitivity Specificity Precision Recall F1 Score
IDRiD (raw) SoftMax 840 0.91 0.760 0.945 0.771 0.767 0.763
IDRiD (PP) SoftMax 840 0.933 0.824 0.96 0.826 0.824 0.824
Kaggle (raw) SoftMax 1000 0.836 0.584 0.899 0.59 0.584 0.578
Kaggle (PP) SoftMax 1000 0.904 0.776 0.944 0.755 0.756 0.743
IDRiD (raw) SVM 840 0.84 0.64 0.906 0.60 0.64 0.564
IDRiD (PP) SVM 840 0.979 0.979 0.995 0.978 0.979 0.978
Kaggle (raw) SVM 1000 0.824 0.54 0.897 0.56 0.54 0.52
Kaggle (PP) SVM 1000 0.984 0.961 0.99 0.956 0.961 0.96
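The indicators in Table I follow the usual confusion-matrix definitions; a minimal sketch, assuming macro-averaging over the five severity grades (the paper does not state its averaging scheme), is:

# Minimal sketch: Acc, SN, SP, Precision, Recall and F1 from a confusion matrix.
import numpy as np

def metrics(cm):
    """cm[i, j] = number of images of true grade i predicted as grade j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    sn = tp / (tp + fn)                 # sensitivity (= recall)
    sp = tn / (tn + fp)                 # specificity
    pr = tp / (tp + fp)                 # precision
    f1 = 2 * pr * sn / (pr + sn)
    return {"Acc": tp.sum() / cm.sum(), "SN": sn.mean(), "SP": sp.mean(),
            "Precision": pr.mean(), "Recall": sn.mean(), "F1": f1.mean()}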
2) Data Augmentation

Of the two datasets employed, IDRiD was first balanced in terms of images per DR severity class using manual up-sampling. Once balanced, the datasets were further augmented by means of reflection, scaling and translation. This was done to diminish overfitting to a singular class/dataset and to provide a larger dataset for TT purposes from the smaller source dataset. DA was done uniformly across all classes of data, maintaining equal ratios to avoid class imbalance.

3) Pre-processing

Acquired images had their green channel extracted prior to masking, highlighting foreground from background [7] and providing the best distinction between retinal features in the foreground and darker background pixels [9]. Input retinal fundus images were resized to the standard input size of ResNet-50.

4) ROI localization

BV segmentation was done in accordance with the work presented by M. Mateen et al. [17]. Discriminant information present in the BV regions was localized and fed to the CNN for feature extraction. The GMM was optimized using EM instead of ALR, due to comparable performance and the limited works on the fairly novel ALR implementation with GMMs [21].

5) Feature Extraction and Classification

Feature extraction by means of CNNs was opted for due to their ability to automatically learn features instead of requiring a set of hand-crafted features, which are time consuming, complex, and fixed. This project aims at the comparative analysis of different classification approaches. The novelty of the proposed system is the usage of ResNet-50 with SVM for DR severity classification, which has not been explored earlier. Classification has been implemented in accordance with the work done by P. Khojasteh et al. [20]; hence it is an extension of his work from HE classification to DR grading. Multiple classification approaches have been tested: the ResNet-50 architecture with raw fundus images, pre-processed images and segmented images, and ResNet-50 with an SVM classifier and raw/PP/segmented images. SVM and ResNet-50 have been selected due to their consistently high performance. Classification without the SVM classifier was done by means of the SoftMax activation technique; both protocols were based on features extracted by the CNN. 5-fold cross validation was used to reflect the authenticity of the acquired results, certifying the lack of overfitting due to the 5 separate folds of TT data used. SGD, which helps minimize cross entropy loss, increase computational efficiency and accelerate learning, was used [23][3].

IV. Results and Discussion

A comparative analysis between raw, pre-processed and ROI input modalities and the SVM and SoftMax classifiers using the ResNet-50 architecture was done. Multiple mini batch sizes, epochs and learning rates were experimented with to identify the best performing fusion. Through experimentation, the optimum combination of 10 mini batch size, 50 epochs and 1e-3 learning rate was ascertained. System performance markedly increased across all parameters on increasing TT images. This outcome was expected, as CNN performance is directly proportional to the mass of images supplied to the architecture. More images make way for better learning and feature extraction. It may be inferred from Table I that system accuracy was increased by 13.9 to 16% by pre-processing input images prior to classification.

Further, it may be noted that the smaller IDRiD (840) dataset achieved higher SN and F1 score as compared to the larger Kaggle (1000) dataset. This may be accounted for by the unruly nature of the latter. Kaggle comprises 35,000 plus images captured with varying fundus cameras, zooming scales, illumination, and FOV angles, Figure II [17]. Inhomogeneous quality and sub-optimal lighting, as compared to the uniform composition of the IDRiD dataset, Figure III, explain the higher system performance linked with the latter despite its smaller size. This finding reinforces the importance dataset quality and composition have on performance and on the eventual incorporation of a computer aided diagnostic DR grading system into the medical field.

The effect of segmentation was tested using only the Kaggle dataset. The overall system performance of the SoftMax classifier was lower than the system performance achieved using only PP images, at Acc = 0.89 and an F1-score of 0.72. Although performance was expected to improve as per the study executed by M. Mateen et al. [17], the replacement of EM as optimizer in place of the originally proposed ALR may account for the diminished performance.
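For concreteness, the feature extraction and classification tier of Section III could be sketched as follows; this is a Python approximation (the study itself was implemented in MATLAB), and the RBF kernel is an assumption since the SVM configuration is not specified.

# Minimal sketch: ResNet-50 as a fixed feature extractor, SVM grading on top,
# with the 5-fold cross validation mentioned in the text.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()         # drop the classification head, keep 2048-d features
resnet.eval()

preprocess = T.Compose([T.Resize((224, 224)),   # ResNet-50 standard input size
                        T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])])

@torch.no_grad()
def extract_features(pil_images):       # RGB fundus images (already pre-processed)
    batch = torch.stack([preprocess(im) for im in pil_images])
    return resnet(batch).numpy()        # one 2048-d vector per image

def grade_with_svm(features, grades):   # grades: severity labels 0-4
    svm = SVC(kernel="rbf")
    scores = cross_val_score(svm, features, grades, cv=5)
    return svm.fit(features, grades), scores.mean()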
Figure II. Kaggle Dataset (no pre-processing): No DR, Mild NPDR, Mod NPDR, Severe NPDR and PDR (L-R).
Figure III. IDRiD Dataset (no pre-processing): No DR, Mild NPDR, Mod NPDR, Severe NPDR and PDR (L-R).
Table II. Comparison of proposed method with past works.
Author Methodology Data set Acc SN SP Precision Recall F1 Score
M. Shaban, [33] Custom CNN Kaggle 0.88 0.87 0.94 - - -
N. Khalifa, [35] Augmentation with AlexNet Kaggle 0.979 - - 0.962 0.954 0.9582
S. Pao, [19] Bichannel CNN Kaggle 0.878 0.778 0.938 - - -
Chen, [18] Inception V3 Kaggle - - - 0.76 0.80 0.77
P. Khojasteh, [28] ResNet-50 + SVM DIARETDB1 0.982 0.99 0.96 - - -
M. Mateen, [25] GMM ROI extraction, VGG-19 Kaggle 0.983 - - - - -
Wang, [32] Inception V3 Kaggle 0.632 - - - - -
Qummar, [34] Ensemble of ResNet-50, Inception V3, Xception, Dense 121 and Dense 169 Kaggle - - 0.851 0.634 0.65 0.322
Proposed ResNet-50 + SVM IDRiD 0.979 0.979 0.995 0.978 0.979 0.978
Proposed ResNet-50 + SVM Kaggle 0.984 0.961 0.99 0.956 0.961 0.96
enable quasi linear adaptation. Similarly, the SVM classifier comparison of overall performance was not done. Although
using the ROI extracted images fared very poorly. System the study done by P. Khojasteh, et.al. reported higher SN in
Acc peaked at 82% with F1 score of 52.2%. Segmentation of both datasets and higher Acc in one dataset, It was limited to
input images significantly reduced TT time as compared to HE classification and did not do DR classification or severity
raw or PP images. Although DR detection and diagnosis is grading.The proposed system fared better overall against
not time sensitive i.e., does not need to be done in real time, nearly all past works surveyed. Hence, the system has been
training of CNNs using PP images may take up to 8 hrs for successful in improving DR classification and detection
1000 images. TT over larger datasets will take longer and performance by incorporating past works and generating an
may prove troublesome. Hence, segmentation optimization evolutionary deep CAD system.
should be done as part of future work to enhance system
performance using ROI input using ALR. This will make way V. Conclusion
for smoother integration of the system in hospitals and care This study was formulated to tackle the rampant issue of
giving facilities. It was inferred that system performance had screening and grading Diabetic Retinopathy in large
a linear increase with PP and increase of input images. A 13.9 populations due to factors such as cost, scarcity of trained
to 16 % increase in system Acc was achieved using PP. The physicians or accessibility. Based on qualitative and
best performing duo was identified as the ResNet-50 architecture with the SVM classifier using only PP images, as seen in Table I. Classification using ResNet-50 with the SoftMax classifier on PP images peaked at an Acc of 93.3% with IDRiD and 90.4% with Kaggle, as opposed to 97.9% and 98.4% respectively using the SVM classifier. Segmentation led to degradation of system performance. This unanticipated dip in performance using ROI localization was due to the inability of the EM algorithm to segment images dynamically, unlike the originally proposed ALR algorithm.

System evaluation has been done by means of parametric comparison of operation indicators such as Acc, SN, SP, recall, precision, and F1 score. All past research considered for comparison accomplished DR grading using Kaggle, bar one of the literatures followed. In comparison to past works, the proposed system fared better than most (Table II). The precision reported by N. Khalifa et al. [27] was 0.6% higher than that of the proposed system, while their F1 score, recall, and Acc were lower. This study takes a novel hybrid approach by building on the work done by M. Mateen et al. [17] and P. Khojasteh et al. [20]. Although P. Khojasteh et al. [20] employed different datasets and were limited to HE detection, Table II presents the acquired performance metrics for the purpose of comparison. It may be seen that the proposed hybrid method led to enhanced operation compared to M. Mateen et al. [17]. In the case of P. Khojasteh et al. [20], SP metrics obtained by the proposed system were better than both reported specificities by 3 to 4.5%. Acc achieved on the DIARETDB1 dataset is improved by the proposed system on Kaggle but falls short by 0.3% on the IDRiD dataset. Acc on E-Optha is improved on both datasets utilized by the proposed system. Both SNs reported by the authors of [20] are higher than those of the proposed system by 0.1-2.9%. Neither of the studies provided F1 scores, hence direct quantitative analysis of previous works was not possible.
Conclusion

A novel hybrid automated deep learning algorithm capable of providing screening, grading, and preventive measures has been proposed. Due to its end-to-end nature, the system eradicates the need for trained ophthalmologists, thereby alleviating time and workload and seamlessly mitigating human error/bias, restricted access to diagnosis, and its high cost in the process. The adopted solution to the research question employed image processing and was developed using MATLAB. Multiple types of input images and classifiers were tested to find the superior performing system. The final system initiated with a rigorous pre-processing and augmentation module, followed by feature extraction using ResNet-50. Images were then classified using an SVM classifier. System construction was based on research and experimentation which yielded the supremacy of the CNN used. The contribution of this study was the robust pre-processing and augmentation module along with classification of DR using ResNet-50 and SVM, which has not been explored in any previous studies. This choice was attributed to the enhanced overall performance necessary for adequate DR classification despite the increased complexity constraints. The system was trained on both the IDRiD and Kaggle datasets using 1000 and 840 images, respectively. A GUI was designed to simulate incorporation of the system into health facilities. Simulation results inferred an F1 score, SN, SP, and Acc of 0.978, 0.979, 0.995, and 0.979, respectively. A 13.9% and 16% increase in system Acc was achieved on IDRiD and Kaggle, respectively, when the proposed pre-processing and augmentation was used. Acc and SP of the literature followed were improved by 0.1-0.8% and 3-4.5%, respectively. The final system led to increased performance across all parameters in comparison to almost all past works reviewed. Although best results were anticipated using segmented images, this was not the case due to the inadequacy of the employed EM algorithm in comparison to the originally proposed ALR. Further work may be done to test system performance using ALR in ROI segmentation and PCA/SVD algorithms for the enrichment of feature extraction. Robust feature extraction would supplement system generalizability to new datasets and images, this being crucial for successful system integration into a medical center. Editing the input layer size of the CNN structure, as opposed to resizing input images, would be a promising experiment. This may be done as a solution to the degraded image quality attained when image resizing is done to fit the CNN input frame size. Performance may be optimized using Twin SVMs due to their supremacy over traditional SVMs. As established inter-dataset variances are present, using multiple different datasets would help the system adjust to different FOV angles, illumination, etc. Finally, results showed system performance to be dependent upon the quality of images. Hence, better dataset compilation strategies and guidelines are essential to the development of a robust computer-aided DR detection and grading system.

References

[1] American Optometric Association, "Diabetic retinopathy." https://www.aoa.org/healthy-eyes/eye-and-vision-conditions/diabetic-retinopathy?sso=y
[2] S. Wild, G. Roglic, A. Green, R. Sicree, and H. King, "Global prevalence of diabetes: estimates for the year 2000 and projections for 2030," Diabetes Care, vol. 27, no. 5, pp. 1047-1053, May 2004, doi: 10.2337/diacare.27.5.1047.
[3] A. A. Alghadyan, "Diabetic retinopathy - An update," Saudi J. Ophthalmol., vol. 25, no. 2, pp. 99-111, Apr. 2011, doi: 10.1016/j.sjopt.2011.01.009.
[4] M. Mateen, J. Wen, M. Hassan, N. Nasrullah, S. Sun, and S. Hayat, "Automatic Detection of Diabetic Retinopathy: A Review on Datasets, Methods and Evaluation Metrics," IEEE Access, vol. 8, pp. 48784-48811, 2020, doi: 10.1109/ACCESS.2020.2980055.
[5] "Diabetic Retinopathy: A Position Statement by the American Diabetes Association | Diabetes Care." https://care.diabetesjournals.org/content/40/3/412
[6] "Early photocoagulation for diabetic retinopathy. ETDRS report number 9. Early Treatment Diabetic Retinopathy Study Research Group," Ophthalmology, vol. 98, no. 5 Suppl, pp. 766-785, May 1991.
[7] A. Z. Foeady, D. C. R. Novitasari, A. H. Asyhar, and M. Firmansjah, "Automated Diagnosis System of Diabetic Retinopathy Using GLCM Method and SVM Classifier," in 2018 5th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), Oct. 2018, pp. 154-160, doi: 10.1109/EECSI.2018.8752726.
[8] M. H. Ahmad Fadzil, N. F. Ngah, T. M. George, L. I. Izhar, H. Nugroho, and H. A. Nugroho, "Analysis of foveal avascular zone in colour fundus images for grading of diabetic retinopathy severity," in 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, Aug. 2010, pp. 5632-5635, doi: 10.1109/IEMBS.2010.5628041.
[9] N. R. Binti Sabri and H. B. Yazid, "Image Enhancement Methods For Fundus Retina Images," in 2018 IEEE Student Conference on Research and Development (SCOReD), Nov. 2018, pp. 1-6, doi: 10.1109/SCORED.2018.8711106.
[10] S. Pathan, P. Kumar, R. Pai, and S. V. Bhandary, "Automated detection of optic disc contours in fundus images using decision tree classifier," Biocybern. Biomed. Eng., vol. 40, no. 1, pp. 52-64, Jan. 2020, doi: 10.1016/j.bbe.2019.11.003.
[11] L. Qiao, Y. Zhu, and H. Zhou, "Diabetic Retinopathy Detection Using Prognosis of Microaneurysm and Early Diagnosis System for Non-Proliferative Diabetic Retinopathy Based on Deep Learning Algorithms," IEEE Access, vol. 8, pp. 104292-104302, 2020, doi: 10.1109/ACCESS.2020.2993937.
[12] H. Chen, X. Zeng, Y. Luo, and W. Ye, "Detection of Diabetic Retinopathy using Deep Neural Network," in 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Nov. 2018, pp. 1-5, doi: 10.1109/ICDSP.2018.8631882.
[13] S.-I. Pao, H.-Z. Lin, K.-H. Chien, M.-C. Tai, J.-T. Chen, and G.-M. Lin, "Detection of Diabetic Retinopathy Using Bichannel Convolutional Neural Network," Journal of Ophthalmology, Jun. 20, 2020. https://www.hindawi.com/journals/joph/2020/9139713/
[14] H. Seyedarabi, S. H. A. Jahromi, A. Javadzadeh, and A. Momeni Pour, "Automatic Detection and Monitoring of Diabetic Retinopathy Using Efficient Convolutional Neural Networks and Contrast Limited Adaptive Histogram Equalization," IEEE Access, vol. 8, pp. 136668-136673, 2020, doi: 10.1109/ACCESS.2020.3005044.
[15] S. Roychowdhury, D. D. Koozekanani, and K. K. Parhi, "DREAM: Diabetic Retinopathy Analysis Using Machine Learning," IEEE J. Biomed. Health Inform., vol. 18, no. 5, pp. 1717-1728, Sep. 2014, doi: 10.1109/JBHI.2013.2294635.
[16] M. U. Akram, I. Jamal, A. Tariq, and J. Imtiaz, "Automated segmentation of blood vessels for detection of proliferative diabetic retinopathy," in Proceedings of 2012 IEEE-EMBS International Conference on Biomedical and Health Informatics, Jan. 2012, pp. 232-235.
[17] M. Mateen, J. Wen, Nasrullah, S. Song, and Z. Huang, "Fundus Image Classification Using VGG-19 Architecture with PCA and SVD," Symmetry, vol. 11, no. 1, Art. no. 1, Jan. 2019, doi: 10.3390/sym11010001.
[18] H. F. Jaafar, A. K. Nandi, and W. Al-Nuaimy, "Automated detection of red lesions from digital colour fundus photographs," in 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Aug. 2011, pp. 6232-6235, doi: 10.1109/IEMBS.2011.6091539.
[19] V. K. Sree and P. S. Rao, "Diagnosis of ophthalmologic disorders in retinal fundus images," in The Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Feb. 2014, pp. 131-136, doi: 10.1109/ICADIWT.2014.6814696.
[20] P. Khojasteh et al., "Exudate detection in fundus images using deeply-learnable features," Comput. Biol. Med., vol. 104, pp. 62-69, Jan. 2019, doi: 10.1016/j.compbiomed.2018.10.031.
[21] M. U. Akram, S. Khalid, A. Tariq, S. A. Khan, and F. Azam, "Detection and classification of retinal lesions for grading of diabetic retinopathy," Comput. Biol. Med., vol. 45, pp. 161-171, Feb. 2014, doi: 10.1016/j.compbiomed.2013.11.014.
[22] N. Eftekhari, H.-R. Pourreza, M. Masoudi, K. Ghiasi-Shirazi, and E. Saeedi, "Microaneurysm detection in fundus images using a two-step convolutional neural network," Biomed. Eng. OnLine, vol. 18, no. 1, p. 67, May 2019, doi: 10.1186/s12938-019-0675-9.
[23] M. Shaban et al., "A convolutional neural network for the screening and staging of diabetic retinopathy," PLoS ONE, vol. 15, no. 6, Jun. 2020, doi: 10.1371/journal.pone.0233514.
[24] J. Sahlsten et al., "Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading," Sci. Rep., vol. 9, no. 1, Art. no. 1, Jul. 2019, doi: 10.1038/s41598-019-47181-w.
[25] M. Ghazal, S. S. Ali, A. H. Mahmoud, A. M. Shalaby, and A. El-Baz, "Accurate Detection of Non-Proliferative Diabetic Retinopathy in Optical Coherence Tomography Images Using Convolutional Neural Networks," IEEE Access, vol. 8, pp. 34387-34397, 2020, doi: 10.1109/ACCESS.2020.2974158.
[26] S. Qummar et al., "A Deep Learning Ensemble Approach for Diabetic Retinopathy Detection," IEEE Access, vol. 7, pp. 150530-150539, 2019, doi: 10.1109/ACCESS.2019.2947484.
[27] N. E. M. Khalifa, M. Loey, M. H. N. Taha, and H. N. E. T. Mohamed, "Deep Transfer Learning Models for Medical Diabetic Retinopathy Detection," Acta Inform. Medica, vol. 27, no. 5, pp. 327-332, Dec. 2019, doi: 10.5455/aim.2019.27.327-332.
Studies on Increasing Energy Efficiency By
Modernization of a Single-Stage Type Turbo-Compressor
for Ammonia Combustion Air Process
Abstract— Today, due to global warming, security of supply, and rising prices, energy efficiency studies come to the fore. Energy efficiency also seems to be a way for industrial enterprises to be profitable. This study discusses the modernization of a turbo-compressor located within a dilute nitric acid production plant. This single-stage turbo-compressor sends ammonia combustion air into the reactor. As a result of the modernization, the energy efficiency of the system was analyzed. As a result of the modernization studies, the speed of the turbo-compressor and the amount of air have been increased. Thus, the acid production amount was raised in the facility.

Keywords— turbo compressor, modernization, nitric acid production, energy efficiency

I. Introduction

Energy efficiency is a set of interdisciplinary strategic activities that complement and support national strategic objectives. Significantly reducing the burden of energy costs, ensuring supply security in energy, reducing foreign dependency, applying low-carbon technologies, protecting the environment, using domestic energy potential, and ensuring its sustainability are the primary targets. In addition, it is possible to define energy efficiency as providing the same production capacity using less energy. In other words, energy efficiency is the reduction of the amount of energy consumed without affecting economic development and social welfare and without reducing the amount of output and quality in production. Therefore, savings are the most important factors that stand out in terms of energy efficiency. Here, saving intends to minimize energy consumption without hindering economic development and the standard of living.

To make progress in energy efficiency, the industrial sector must be considered a priority. When examining the countries that stand out in energy efficiency worldwide, their improvement in the industrial sector is noticeable. The most energy-intensive sectors in Turkey are industry, electricity generation, transportation, and housing. Therefore, these sectors also stand out in terms of energy efficiency. When the sectors are evaluated separately, it is seen that a large number of regulations related to energy efficiency have been made in Turkey, especially in the industrial field. Many energy conversion systems are used in the production, transmission, and storage of energy. Because compressed air is convenient and safe, it is widely used as a power source in control valves, air motors, air guns for cleaning purposes, and many more applications. Compressed air systems have a low power-to-weight ratio and a high power density. Being resistant to explosions and overloads and not being affected by temperature, humidity, dust, or electromagnetic noise are essential features. In addition, they are preferred by many businesses because they are easy to maintain and the air can be carried over a long distance [1,2].

Turbo compressors are dynamic compressors that are commonly used to compress air and gas. These machines create pressure according to the dynamic principle; this means that the pressure increase is provided (using the air velocity) without any mechanical volume contraction (sliding), as in the operation of positive displacement compressors. In turbo compressors, the element that rotates at high speed to push the air (gas) is called an impeller or turbofan. There is no piston or other type of mechanical driving or compression element between the air inlet and the air outlet of the turbocharger [3,4]. Instead, the turbo compressor sucks the air in from the suction port (at the center), and the impeller blades rotating at high speed create a centrifugal force and blow the air from the inside out (around the circumference). Therefore, turbo compressors are also called centrifugal or even aerodynamic compressors. A radial discharge flow characterizes centrifugal compressors. Air is sucked towards the center of a rotating impeller (turbine) with radial blades and is pressed against the circumference of the impeller by centrifugal forces. This circumferential (radial) movement of air causes both a pressure increase and kinetic energy generation. Before the air is directed to the turbine center of the following compression stage, it passes through a diffuser and spiral, during which kinetic energy is converted into pressure. Efficiency analysis is critical because these systems are energy-intensive [5,6].

Modernization is essential for the continuity of production and energy efficiency today, because modernization enables investments to be made in the production lines of existing facilities, including adding suitable parts to machinery and equipment that have completed their technical and/or economic life, replacing existing machinery and equipment with new ones, completing missing parts in the facility, and directly raising the quality of the final product or changing its model.

In this study, the modernization of the turbo-compressor system in a dilute nitric acid production facility is discussed. To produce acid in this facility, it is necessary to obtain a mixture of ammonia and air as required by the process. Therefore, the plant capacity is directly dependent on the amount of air produced by this turbo-compressor. The turbo air compressor is operated by a steam turbine and an auxiliary gas expansion turbine. With the modernization works, it is aimed to increase the amount of acid production by increasing the speed of the turbo-compressor and the amount of air.
II. The catalytic ammonia oxidation process

In nitric acid production, first ammonia is oxidized to produce nitric oxide (nitrogen monoxide, NO); then NO is oxidized one more step to become NO2 (nitrogen dioxide). In the final stage, NO2 gas is absorbed in water, and nitric acid (HNO3) is obtained. In the basic production technology, known as the Ostwald process, the reactions between ammonia and oxygen proceed in stages, and further nitrogen oxides are formed between the steps, depending on the pressure and temperature conditions. Many unit processes and catalytic processes are applied to realize this simple reaction chain in production facilities. The general flow chart of the nitric acid production process is shown in Figure I.

Nitric acid plants are classified according to whether the same or different pressure levels are used in the reactors in the two separate oxidation stages. Some technologies have the same pressure in both stages; these are single-pressure systems. Others operate at two different pressures; these are dual-pressure systems. Whether the pressure levels used are low, medium, or high also significantly affects the result. The most commonly used are medium- and high-pressure dual systems. High-pressure single systems are also widely used. This classification is essential in terms of emissions. The nitric acid production steps are shown in Figure II.
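For reference, the reaction chain summarized above is the standard Ostwald process chemistry; the balanced equations (our addition, not given explicitly in the original text) are:

$$4\,\mathrm{NH_3} + 5\,\mathrm{O_2} \rightarrow 4\,\mathrm{NO} + 6\,\mathrm{H_2O}$$
$$2\,\mathrm{NO} + \mathrm{O_2} \rightarrow 2\,\mathrm{NO_2}$$
$$3\,\mathrm{NO_2} + \mathrm{H_2O} \rightarrow 2\,\mathrm{HNO_3} + \mathrm{NO}$$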
Figure II. Nitric acid production steps: gas pressurization and cooling; cooling of gases; absorption of the NO2 gas to be converted into nitric acid; waste gas (tail gas filtration) operations; steam generation, energy recovery, and sending to the steam turbine
III. Results and recommendations

The raw materials used in the production of nitric acid are water, air, and ammonia. Ammonia is first gasified with water in the ammonia gasifier and then comes to the ammonia superheaters to be heated with hot air. The heated gaseous ammonia then comes to the ammonia incinerators. The air required for the combustion of ammonia is sucked in from the atmosphere by the turbo-compressor. The turbo-compressor is operated by the steam turbine and by the rest-gas expansion turbine.

Air leaks are one of the most important causes of unnecessary energy consumption in a compressed air system. Necessary preliminary leak studies (leak detection and repair) and continuous system performance reviews ensure that almost all air production in the compressed air system is converted into useful work. As a result, system energy consumption is realized only in line with operational needs. In addition, since the drops that may occur in the system outlet pressure due to leaks are prevented, inefficient operation of the equipment at the end-use points is controlled, and the continuation of the production processes is ensured [7,8]. First of all, to increase the efficiency of the turbo-compressor operating in the facility, modernization of the sealing system of the rotor is envisaged. For this purpose, new fixed sealing parts were manufactured. The before and after view of the modernized sealing system of the turbo-compressor is shown in Figure III. In addition, the manufacturing drawings of the newly designed sealing elements are shown in Figure IV.

Figure III. View of the sealing elements before and after modernization
Figure IV. Manufacturing drawings of the existing-design (old) and new-design sealing elements
When compressor air leaks are reduced, compressor pressure and flow increase; when the flow rate increases, the ammonia/air percentage composition is preserved in the ammonia incinerators, and more airflow is fed for the same ammonia flow rate. Therefore, the ammonia and oxygen reaction conversion rate increases, and more nitrogen monoxide (NO gas) is produced. As more NO gas directly affects the oxidation and absorption reactions in the other steps, nitric acid production also increases. Therefore, the compressor outlet pressure is one of the most critical factors affecting the efficiency of the compressor. Turbo compressors are formed with impellers arranged on a single shaft, and they are efficient when the capacity is high.

For example, in screw compressors, although the pressure chamber and the air suction chamber are not separated by definite boundaries, air leaks are still very low; these are high-speed compressors. In low-capacity compressors, the yield decrease in the final stages is unacceptably large. Producing a higher outlet pressure means extra energy consumption by the compressor, that is, additional operating cost. Where pressure levels have been accidentally set higher than necessary, the compressor control set values (minimum and maximum) can be re-examined and gradually reduced to the required levels, taking care not to damage the sensitive equipment used in the business [9,10].
However, most of the time, the compressor outlet pressure level is increased to compensate for the pressure losses between the compressor and critical end-use points in the system. These losses, which create the need to use compressed air as if it were artificial equipment in the compressed air system, cause low system performance and unnecessary energy consumption by the compressor. In compressed air systems with any airflow-restricting factor, it is imperative to increase the system pressure to achieve the pre-calculated flow capacity.

If the pressure losses from the compressed air outlet to the end-use point in a compressed air system do not exceed 10% of the compressor outlet pressure, it can be said that this system is adequately designed. Since the inlet conditions are known during these calculations, it is easy to design the suction edge. However, in design calculations based on the pressure-edge dimensionless mass flow parameter, the losses must somehow be included in the calculations. The view of the turbo-compressor before and after rotor maintenance, coating, and high-speed balancing is shown in Figure V.
Figure V. Before and after rotor maintenance, coating process and balancing adjustments.
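Written compactly (in our notation, not the paper's), the adequacy criterion stated above is

$$\frac{p_{\mathrm{out}} - p_{\mathrm{end}}}{p_{\mathrm{out}}} \le 0.10,$$

where $p_{\mathrm{out}}$ is the compressor outlet pressure and $p_{\mathrm{end}}$ is the pressure at the critical end-use point.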
As a result of modernization, the compressor pumps air at a higher flow rate than before, and thus the amount of acid produced increases. Likewise, an increase in airflow and acid production was achieved according to the design values. The test results before the modernization of the compressor are given in Table I, and the test results after the modernization are shown in Table II.

Table I. Compressor test results before modernization

Speed (rpm)   Air (m3/h)   Temperature (°C)   Production, 100% acid (ton/day)
5250          89375        22                 590
5250          87690        21                 580
5250          87550        22                 585
5250          87185        22                 580
5250          87760        22                 585
5250          87720        20                 585
5250          88685        18                 590
5250          87995        21                 585

The pressure ratio produced by the impellers is proportional to the square of the operating speed. Therefore, unshrouded (non-bandaged) impellers can achieve much higher pressure ratios than shrouded (bandaged) impellers. However, non-bandaged impellers tend to be less efficient due to the high losses associated with wingtip leakage flow; in a bandaged impeller, there is no tip leakage. Therefore, while the centrifugal compressor is pre-designed, the head, flow, and speed are taken as a basis.

Table II. Compressor test results after modernization

Speed (rpm)   Air (m3/h)   Temperature (°C)   Production, 100% acid (ton/day)
5244          97358        18                 680
5252          97429        19                 670
5246          96830        18                 670
5250          97613        17                 670
5241          97337        18                 674
5247          97010        19                 674
5246          96565        20                 673
5247          97163        18                 673

An unbalanced rotor will vibrate at the frequency of the shaft rotation speed due to the centrifugal force of the unbalanced mass. Therefore, a machine with an unbalance condition is expected to produce a sinusoidal wave and a corresponding dominant peak in the spectrum at the shaft rotation speed. Turbo-compressor balance measurement values are given in Table III.

Table III. Turbo-compressor balance measurement values

Speed (rpm)   Pedestal 1 V (mm/s, rms)   Pedestal 2 V (mm/s, rms)
2200          0.084 / 246                0.055 / 226
2700          0.053 / 112                0.084 / 114
4900          0.060 / 77.9               0.132 / 54.1
5600          0.278 / 101                0.151 / 26.9
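As a rough cross-check of Tables I and II, the tabulated test runs can be averaged directly. The short Python sketch below (our illustration, not part of the original study) computes the mean air flow and acid production before and after modernization, giving roughly a 10% flow increase and a 15% production increase.

# Cross-check of Tables I and II: average air flow (m^3/h) and
# 100% acid production (ton/day) before and after modernization.
flow_before = [89375, 87690, 87550, 87185, 87760, 87720, 88685, 87995]
prod_before = [590, 580, 585, 580, 585, 585, 590, 585]
flow_after = [97358, 97429, 96830, 97613, 97337, 97010, 96565, 97163]
prod_after = [680, 670, 670, 670, 674, 674, 673, 673]

def mean(xs):
    return sum(xs) / len(xs)

flow_gain = (mean(flow_after) - mean(flow_before)) / mean(flow_before)
prod_gain = (mean(prod_after) - mean(prod_before)) / mean(prod_before)
print(f"Air flow:   {mean(flow_before):.0f} -> {mean(flow_after):.0f} m^3/h (+{flow_gain:.1%})")
print(f"Production: {mean(prod_before):.0f} -> {mean(prod_after):.0f} t/day (+{prod_gain:.1%})")
# Output: roughly +10.4% air flow and +15.0% acid production on average.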
Turbo-compressors are continuous-service compressors; with the advantage of having very few moving parts, they are suitable especially for applications where high airflow is required and mainly where oil-free air is required. However, unbalance forces put pressure on bearings and seals, exacerbate looseness problems, and can trigger resonances. The force created by an unbalance weight is related to the square of the velocity, so high-speed machines can generate enormous unbalance forces and therefore cannot be allowed to go out of balance; if they do, inevitable damage will follow. The balance report chart of the turbo-compressor is given in Figure VI.

References

[1] Zhang Chaowei, Dong Xuezhi, Liu Xiyang, Sun Zhigang, Wu Shixun, Gao Qing, and Tan Chunqing, "A method to select loss correlations for centrifugal compressor performance prediction," Aerospace Science and Technology, vol. 93, 2019.
[2] Xiao He and Xinqian Zheng, "Flow instability evolution in high pressure ratio centrifugal compressor with vaned diffuser," Experimental Thermal and Fluid Science, vol. 98, 2018.
Prediction of cover bunch quality after harvesting
period of Tunisian date palm
Wafa Guedri, Laboratory of Textile Engineering, University of Monastir, Monastir, Tunisia, wafa.guedri@gmail.com
Mounir Jaouadi, Higher Institute of Technology Studies ISET (Ksar Hellal), Monastir, Tunisia, jy.mounir@gmail.com
Slah Msahli, Higher Institute of Technology Studies ISET (Ksar Hellal), Monastir, Tunisia, slah.msahli@gmail.com
Abstract— Bagging date fruit is a necessary practice in date palm cultivation. It is used in the date crop to protect fruits from humidity, rain, and insects. At present, many studies focus on improving its performance. In this work, a new method is proposed intending to solve this multi-criterion phenomenon. A practical mathematical tool, named the desirability function, allowing the prediction of the cover bunch quality, is developed. This approach permits defining the global quality of the bag through a global quality index. In this work, five bagging products are selected to be inspected, used for the first time to protect date fruit against the carob moth and rain. The results allowed the identification of the ideal cover for farmers, satisfying their needs with the highest properties.

Keywords— Desirability, quality, date fruit, nonwoven, satisfaction

I. Introduction

Date palm is the most significant fruit crop grown in arid and semi-arid regions of the Middle East and North Africa. In Tunisia, date palm is the major factor of oasis farming. It represents the main financial resource of farmers, since date fruits are used for food or other commercial purposes [1]. For instance, about 10% of the Tunisian population is dependent on date palm and its related crops [2]. However, this important crop is actually endangered by several factors. A majority of the date cultivars are susceptible to pests, mainly the carob moth, or yield poorly because of autumnal rain, sunburn, and wind. These factors have led to reduced date quality, which menaces the physical aspect of the date fruit. Hence, it is very important to elaborate a strategy intended to improve the quality of the Tunisian date fruit. The covering procedure of the date bunch is regarded as the most important practice for the date palm to obtain dates with better quality and an economical yield. It offers numerous advantages and is used in date fruit cultivation in order to protect dates from rain, high humidity, birds, and insects [3]. Different covers exist for date fruit protection [4].

The requirements for the cover date bunch were mentioned by D. E. Bliss in his study [5]. His study concluded that the best cover must be water-resistant during heavy rain and allow maximum aeration, because the vapor transpired by the date surface is imprisoned by the bag and leads to water injury and infection. However, there are few works evaluating the value of the bag used for protecting date fruit [5]. The first work on the measurement of cover bunch features was described by Denis [6]. He planned the first cover in the shape of a cloth bag based on a flexible woven textile permitting protection from wind, insects, and heavy rain. In fact, it is important for Tunisian farmers to be sure that the cover bunch will keep its waterproofing and retain its strength during the maturity period.

Previous works confirmed that using traditional covers like plastic film, mosquito net, and kraft paper had a prominent effect on the quality and yield of date fruit [5]. The ideal bag requires a compromise between several requirements of Tunisian farmers for efficient date protection. In fact, the satisfaction of Tunisian farmers is a phenomenon that requires satisfaction of a set of features during the harvesting period.

In this work, we propose a mathematical approach in order to estimate the Tunisian farmers' satisfaction using an index named the global quality index "QI". The desirability functions are used to develop this index. Modeling this satisfaction allows developing the ideal choice of cover date bunch.

II. Material and methods

A. Selected samples

During the 2016-2017 seasons, three mature date palm trees, which received the same cultural practices at the experimental plot of the Arid Institute, "Atilaat" (Jemna, Kébili), were selected for this experiment. The number of bunches was adjusted to six per tree. The covers were as follows: bunch cover with nonwovens N1 (polypropylene, high weight), N2 (polypropylene, low weight), N3 (polyester, high weight), N4 (polyester, medium weight), and N5 (polyester, low weight), plus no bagging (control). Bunch covering with the different products (Table I) took place three months before the collection and continued during the full maturity period.

Table I. Properties of the selected nonwoven samples

Properties     N1              N2              N3           N4           N5
Supplier       German          German          German       German       German
Composition    Polypropylene   Polypropylene   Polyester    Polyester    Polyester
G (g/m2)       60.3            41.36           75           50.2         34.9
T (mm)         0.44            0.38            0.34         0.31         0.17

G - mass per surface unit; T - thickness of the bags

Five nonwoven samples were selected from the textiles offered by the Freudenberg Group, and plastic film and mosquito net from GIFruit-Tunisia. Figure I shows the five samples of covers chosen to study the protection efficiency of date fruits against rain and the carob moth.
Figure I. Images of the used bagging samples: (a) N1; (b) N2; (c) N3; (d) N4; (e) N5

B. Development of quality index

Development of the nonwoven covers quality index is divided into four steps:
1. Choice of properties
2. Property conversion to a common value
3. Attribution of weightings
4. Aggregation of indices

With a focal point on developing the global quality index QI, we have pursued the procedure presented in Figure II.

Figure II. Design of the quality index QI

B.1. Choice of parameters

The arrangement of the quality of the cover date bunch is hard for all the features that define these products. It is then necessary to select a set of parameters that together reflect the global quality of the bag for a particular use [9]. The selection of the properties is accompanied by an uncertainty and subjectivity which are inherent in any such index. The selection of properties is based on data collected from Tunisian farmers, organizations, and researchers concerned with the protection of dates.

In this work, the quality of the cover date bunch is characterized by multiple parameters expressing the ability of these bags to satisfy. There are many physical and mechanical properties that influence this quality. The number of required parameters is fixed for all bagging products. There is a set of parameters that the covers must have to make them suitable for the protection of the date bunch. The cover date bunch must be flexible and maintain the movement of the date regime, so tearing is one of the most important parameters for any type of covering product.

Since a cover date bunch is mainly used to keep out insects, the bag must have suitable dimensions to surround the entire bunch. Textile strength is a key property related to fabric durability. Since the cover bunch is subjected to multidirectional forces, the tensile and tear strengths are used to determine the strength of the product. The properties of permeability and resistance to water penetration are quality parameters that establish the effectiveness of the bag and the preservation of the quality of the date fruit.

The properties of the cover date bunch are expressed in different units. In fact, levels can differ from one property to another. A classification of cover quality is intended to be formulated; all properties must be converted to a common value, the quality index "QI". The mathematical equations that convert the values of the properties into indices are made according to the desirability function. An index will be on a scale of 0 to 1 [10]. If the cover quality properties meet the set specification values, a value of 1 is assigned [11].

In the equations' development, linear functions are used as a correlation between cover properties and the quality of the cover. In order to convert cover quality properties to a common value, the lower and upper limits of each property must be identified. The upper and lower limits used for the bag quality properties are based on theoretical knowledge, the specifications, and the opinions of the experts who participated in the survey.

In this approach, the quality properties of the cover date bunch are divided into three mathematical functions, named the Derringer and Suich desirability functions.

The first functions are monotonous and can formalize a preference in terms of decrease or increase. Harrington [11] calls them one-sided. A one-sided desirability characteristic has three parameters. We take the example of an increasing function. They are each parameterized by two points, named the lower limit Yl (the upper limit Yu for a decreasing function) and the target value YT. The first, Yl, indicates the lower region at which the values of yj are considered unacceptable; the second, YT, designates the area of satisfaction.
Finally, the parameter s makes it possible to define the curvature of the function; the linear function corresponds to s = 1. Depending on the s value, the curvature can be oriented.

We employ the desirability function to maximize a property, where dj is calculated according to Equation 1 [12]:

$$d_j = \begin{cases} 0 & \text{if } y_j \le Y_L \\ \left(\dfrac{y_j - Y_L}{Y_T - Y_L}\right)^{s_j} & \text{if } Y_L < y_j < Y_T \\ 1 & \text{if } y_j \ge Y_T \end{cases} \quad (1)$$

We use Equation 2 to minimize a property:

$$d_j = \begin{cases} 1 & \text{if } y_j \le Y_T \\ \left(\dfrac{y_j - Y_U}{Y_T - Y_U}\right)^{s_j} & \text{if } Y_T < y_j < Y_U \\ 0 & \text{if } y_j \ge Y_U \end{cases} \quad (2)$$

The last function proposed by Harrington [12] is called two-sided and permits targeting a value for yj, or specifying a satisfaction region. It is situated between two points, the lower Yl and the upper Yu. In addition, the YT value is used to set the incline of the function, where dj is calculated according to Equation 3:

$$d_j = \begin{cases} 0 & \text{if } y_j \le Y_L \\ \left(\dfrac{y_j - Y_L}{Y_T - Y_L}\right)^{s_j} & \text{if } Y_L < y_j \le Y_T \\ \left(\dfrac{y_j - Y_U}{Y_T - Y_U}\right)^{s_j} & \text{if } Y_T < y_j < Y_U \\ 0 & \text{if } y_j \ge Y_U \end{cases} \quad (3)$$

B.3. Attribution of weightings

After the checking of the properties, weightings are assigned to each property specifying its importance in the global quality of the cover date bunch. The range of values for these weights varies from zero to ten. The mean response is employed to compare the importance of each property. The value is calculated according to Equation 4 [13]:

$$M = \frac{O \times V}{N} \quad (4)$$

where M is the mean response, O the occurrence, V the market value, and N the number of responses.

The requirement degree of the Tunisian farmers is noted sj and calculated using Equation 5. We consider that if sj = 1, the requirement degree is average. On the other hand, if sj > 1, we believe that the requirement of the farmer is too high, and if sj < 1, we can consider that the farmer's requirement is too low [15]:

$$s_j = \frac{RV}{SL} \quad (5)$$

where RV is the average recorded value and SL the average scale value.

In Equation 6, wj is the relative importance related to the response j. The relative importance wj is a comparative level for weighting the individual desirability dj in the global desirability of the cover, and it varies from the least important (wj = 1) to the most important (wj = 5) [14]:

$$w_j = \frac{RV}{10} \quad (6)$$

B.4. Aggregation of indices

The presence of different criteria and the absence of a relationship between the solutions guide us to the use of strategies that consider these particularities. The method we study in this work consists in transforming the multi-criteria difficulty into a single-criterion problem. Geometric mean aggregation (Equation 7) and arithmetic mean aggregation (Equation 8) are used to join the individual desirabilities di into an overall desirability Dg calculated for each group of properties. The weight of each property is primordial in the determination of Dg. Finally, the different Dg values are collected together into a degree of global satisfaction defined by QI, using the desirability function of Derringer and Suich [16]:

$$D_g = \left(d_1^{w_1} \cdot d_2^{w_2} \cdot d_3^{w_3} \cdots d_n^{w_n}\right)^{1/\sum_j w_j} \quad (7)$$

$$D_g = \frac{w_1 d_1 + w_2 d_2 + w_3 d_3 + \cdots + w_n d_n}{\sum_j w_j} \quad (8)$$

The quality index values QI vary between 0 and 1. The value 0 designates no satisfaction at all, and the value 1 corresponds to an entire satisfaction with the overall quality of the cover bunch.
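To make the conversion and aggregation steps concrete, the following Python sketch implements the one-sided increasing desirability of Equation 1 and the weighted geometric aggregation of Equation 7. It is our illustration only; the property values, limits, and weights below are hypothetical, not measurements from the study.

# One-sided increasing desirability (Equation 1) and weighted
# geometric-mean aggregation (Equation 7).
def desirability_increasing(y, y_l, y_t, s=1.0):
    """d_j = 0 below Y_L, 1 above Y_T, power-law ramp in between."""
    if y <= y_l:
        return 0.0
    if y >= y_t:
        return 1.0
    return ((y - y_l) / (y_t - y_l)) ** s

def aggregate_geometric(ds, ws):
    """D_g = (prod of d_j^w_j) ** (1 / sum of w_j)."""
    total_w = sum(ws)
    prod = 1.0
    for d, w in zip(ds, ws):
        prod *= d ** w
    return prod ** (1.0 / total_w)

# Hypothetical example: three properties of one cover sample (made-up numbers).
values = [35.0, 0.8, 120.0]                          # measured property values
limits = [(20.0, 50.0), (0.2, 1.0), (80.0, 200.0)]   # (Y_L, Y_T) per property
weights = [0.5, 0.3, 0.9]                            # w_j-style weights

ds = [desirability_increasing(y, yl, yt) for y, (yl, yt) in zip(values, limits)]
QI = aggregate_geometric(ds, weights)
print(f"individual desirabilities: {[round(d, 3) for d in ds]}, QI = {QI:.3f}")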
III. Results and discussions

When the bag is detached from the date palm, some properties must be measured. A full description of the analytical methods employed is given to study the status of the removed covers.

A. Cover bunch features after the bagging period

Experimentation on the different covers used is carried out at the TTS laboratory in Moknines. The requirements of the covers must be related to the application on the date palm. Experimental results presented in the form of pictures indicate a lack of tear resistance of the N1 and N2 samples. The literature indicated that the problem was due to the low UV resistance of polypropylene. Figure III shows that the covering period caused significant destruction of the N1 material, while Figure IV shows partial tears of the N2 material. In contrast to the N1 and N2 materials, the N4 material retained its strength for 3 months.

Figure III. Tears of the N1 material after the bagging period

Figure IV. Tears of the N2 material after the bagging period

The global desirability function of Derringer and Suich is used to calculate the QI, whose values are shown in Table II.

Table II. Quality index values

Samples           N1      N2      N3      N4      N5
Dg (structure)    0.42    0.12    0.25    0.56    0.43
Dg (resistance)   0.1     0       0.11    0.32    0.18
Dg (protection)   0.18    0.27    0.05    0.3     0.09
QI                0.007   0       0.001   0.05    0.007

Evaluation of the covering product qualities by Tunisian farmers has been performed in order to determine the farmers' requirement degree and weight for each cover bunch property. Measured properties have been transformed into individual satisfaction degrees, and the global desirability index was determined for the appreciation of the farmers' satisfaction. The quality indices of all samples were determined to assess the ability of these materials to protect against violent wind and rain and to limit the dust observed for the N1 and N2 materials. In the case of N1, N2, and N3, their poor resistance after the period of bagging deeply depreciated their overall quality.

IV. Conclusion

In this work, we have examined the multi-criterion phenomenon of the cover date bunch, which during the maturity period involves different properties simultaneously. A desirability method for appropriate nonwoven cover selection based on the Derringer and Suich function was presented. This approach is suitable for minimizing, maximizing, and achieving target values of several objective functions at the same time. It is used to convert the different outputs into one output that is the set of the individual outputs, while assigning different weights according to the importance of each property in the studied case. This method has been applied to reduce the different requirements affecting the quality of the cover bunch to one index representing the global quality of the nonwoven bags, varying between 0 and 1. This process expressed the quality properties in terms of the ideal output responses. The considered quality indices have revealed that an enhancement in permeability performance guarantees an increase in product satisfaction.

References

[2] …, "from date palm as efficient anode materials for sodium-ion batteries," Carbon, vol. 146, p. 844, 2018.
[3] S. Sakin, "Analytical methods applied to the chemical characterization and classification of palm dates (Phoenix dactylifera L.) from Elche's Palm Grove," University of Alicante, Alicante, 2013.
[4] S. Amroune et al., "Investigation of the date palm fiber for composites reinforcement: thermo-physical and mechanical properties of the fiber," Journal of Natural Fibers, 2019.
[5] W. Guedri, M. Jaouadi, and S. Msahli, "Evaluating farmer's satisfaction of different agrotextile cover bunch using desirability function," 2021.
[6] S. P. Denis, "Means and method for protecting Deglet Noor Dates," Patent US2001051240, 2001.
[7] M. Selmane, doctoral thesis in animal ecology, Faculty of Sciences, Department of Biology, Annaba, 2015.
[8] D. Zohary, Domestication of Plants in the Old World: The Origin and Spread of Domesticated Plants in Southwest Asia, Europe, and the Mediterranean Basin, Oxford, 2012.
[9] J. Smartt, Evolution of Crop Plants, 2nd edition, 1995.
[10] P. Munier, Le palmier dattier, coll. Techniques agricoles et productions tropicales, France: Ed. G. Maisonneuve et Larose, XXIV, p. 221, 1973.
[11] A. Hadj Taieb, S. Msahli, and F. Sakli, "Optimization of the Knitted Fabric Quality by using Multicriteria Phenomenon tools," International Journal of Fiber and Textile Research, vol. 3, no. 4, pp. 66-77, 2013.
[12] A. M. C. Balasooriya, "Development of a comprehensive fabric quality grading system for selected end uses," National Engineering Conference, 19th ERU Symposium, pp. 33-37, 2013.
[13] A. M. Shravan Kumar Gupta, "Optimization of durability of Persian hand-knotted wool carpets by using desirability functions," Textile Research Journal, pp. 1-10, 2016.
[14] F. Dabbebi and S. B. Abdessalem, "New approach for appreciating the surgeon's satisfaction of braided sutures," Journal of Industrial Textiles, pp. 1-23, 2015.
[15] S. Zehdi, "Analysis of Tunisian date palm germplasm using simple sequence repeat primers," African Journal of Biotechnology, vol. 3, no. 4, pp. 215-219, 2004.
[16] S. Rhouma, "Genetic diversity in ecotypes of Tunisian date palm (Phoenix dactylifera L.) assessed by AFLP markers," The Journal of Horticultural Science and Biotechnology, vol. 82, no. 6, pp. 929-933, 2007.
[17] D. Mostafa, "Effect of Bunch Bagging on Yield and Fruit Quality of Seewy Date Palm under New Valley Conditions (Egypt)," Middle East Journal of Agriculture Research, vol. 3, no. 3, pp. 517-521, 2014.
[18] W. Guedri, M. Jaouadi, and S. Msahli, "New approach for modeling the quality of the bagging date using desirability functions," Textile Research Journal, vol. 86, no. 19, pp. 2106-2116, 2016.
Analysis of the Waterfront Transformation of the ‘Plazh’ Area
of the City of Durres, Albania.
Anna Yunitsyna, Department of Architecture, Epoka University, Tirana, Albania, ayunitsyna@epoka.edu.al
Mirela Hasanbashaj, Department of Architecture, Epoka University, Tirana, Albania, mhasanbashaj10@epoka.edu.al
Abstract— In many waterside cities, coastline urbanization is an important issue for architects and urban planners. Albania has been unsuccessful in developing a balance between the public demand for access to the coastline and the land use. The nature of the coastline has changed because of demographic, hasty urban, and economic growth. This process took place without taking into consideration the social and cultural values of the coastline. Massive construction has damaged the sewage system, causing continuous flooding, and has reduced the coastline greenery. This study investigates the transformation of the seaside of the so-called 'Plazh' area during the last 20 years. The study is divided into 5-year periods, in order to have a more complete overview of the urban changes. The types of buildings included in the study are residential and commercial. This research uses information collected from site surveys, observations, and archives. The study presents a report explaining the site problems and the transformation of the coastal line of Durrës over the years, a comparison between the master plans and proposed regulatory plans, and an evaluation of the actual situation.

Keywords— coastline, urbanization, land use, height limit, urban regeneration, waterfront

I. Introduction

The city of Durrës is a coastal town which lies in the western part of the region. It holds the largest port in Albania, which makes it an important landmark for tourism. It is the second largest city from the politics, economy, administration, education, and culture standpoint. Durrës itself holds good opportunities, lying in a strategic position between the north and south of Albania. It is the trade center of intersections and has won important functions and intensive territory usage, becoming one of the most developed regions of the country. The purpose of this work is to track the transformation of the coastline of the city of Durrës and to highlight the urban problems that emerged during the last 20 years. The demand, inspiration, and decisions of the investors and owners regarding their private property have played an important role in the transformation of the city and its coastline in terms of land use and urban planning. The study is focused on the analysis of the so-called 'Plazh' area using the building location, building height, and land use criteria. The study examines the buildings of the post-socialist period, specifically through 1990-2015. The coastal line has a combination of elements such as residential housing, high-rise buildings, clubs, restaurants, shops, etc. Redeveloping the waterfront means re-utilizing the urban lands that were left behind through the years.

The major problem comes from the reclaimed land, which changes the relationship between the city and the waterfront. In most of the cases, the lands that were reclaimed have been restricted to the general public by the private users. Durrës has evolved with land reclamation, but such development has had a negative result on the urban waterfront development compared with the rest of the region and Europe [1]. Land reclamation in Albania has led to a planning policy intended to provide economical and suitable land for building. The government is seeking to accommodate public projects, whereas the private owners look for land for development. In the past years, people in Durrës have expressed the frustration that they have with the inaccessibility of the sea. The political and economic changes that occurred after the 90s were followed by large flows of migration from rural areas to urban areas. The urban development of the territory was largely the product of free initiative and spontaneous development, reflected in the creation of the first informal neighborhoods. All the economic activity was concentrated in the central and coastal areas of Albania, but local authorities were unable to provide appropriate urban regulatory plans and to control the process of construction [2].

II. Waterfront Development Principles

In ancient times, societies used to live in waterfront areas, such as next to the Tigris, Nile, and Euphrates. In order to sustain life and satisfy the biological dependency on water, humans historically needed to locate near fresh water [3]. Waterfront means 'the part of a town or city adjoining a river, lake, harbor, etc.' according to the Oxford American Dictionary of Current English [4]. 'Waterfront is the urban area in direct contact with water' [5]. According to Moretti, port activities and infrastructures occupy waterfront areas. The waterfront is defined as an interaction area between water and urban development [6]. Although in the vocabulary the meaning of waterfront is clear, in the literature it is referred to using different words instead of the term waterfront, such as: riverside, river edge, water edge, riverfront, city port, and harbor front [4], [7]. The waterfront identifies the water's edge in urban areas [8]. The water body may be 'a creek or canal, river, lake, ocean, bay,' or even artificial [9]. To sum up, the waterfront area is a confluence area of water and land. It is the edge of the land and also the edge of the water. The waterfront is found as a continuous process in most places where settlement and water are juxtaposed, whether or not a commercial port activity is or was present. It is an area that has a high density of elements and activities that affect each other. In the geographical aspect, the urban landscape is a synthesis of climate, soil, biology, and physiognomy. An integrated landscape is a combination of the natural landscape and artificial landscapes, including architecture, streets, and squares [10].

In most European cities, the development process of the urban waterfront area is like the following: prosperity, decline, and re-development. The flourishing time of urban waterfront areas was before the 1920s. Before the industrial revolution, as society developed, people depended on natural water sources, and the water was used for everyday life but also for travel. With the development of water traffic, waterfront areas became very important for their cities, because trading development began and it had a positive impact on the citizens' economy, so waterfront areas became
the centers of many people's lives. However, there was a decline of urban waterfront areas from the 1930s to the 1960s. After the industrial revolution, there was a rapid growth of the population; many industrial and transport companies were located along the water in order to get more benefits. This brought water pollution, because much sewage rushed into the water, so inhabitants did not want to live there anymore [11]. A reconfiguration of the water aesthetics is achieved by urban waterfront regeneration. The way in which the waterfront actually looks depends on the creation of attractive waterside open spaces, besides the provision and support of residential and business space with waterside prospects.

Waterfront cities expand over reclaimed land, and there is a limited number of studies relating to waterfront building expansion. The waterfront area is where the land interacts directly with the sea. It is a dynamic place which is totally open to the action of waves, wind, currents, and tides that can expand the shore with sedimentary deposits and also erode it. It is very important to provide the vision of the future sustainable development of the waterside urban edges and to improve the social, aesthetic, physical, and economic conditions of those areas [12]. Besides the engineering tasks, the expansion of the waterfront should involve systematic and careful planning, a strategy of development and sustainable management, and should also consider the interests of the people living in the area [13]. The common strategy is to provide guidelines for the waterfront development which include detailed land use principles and architectural design approaches, such as building heights and built forms, the presence of public areas, access to the water, human-centered design of spaces, lighting, and signage [14]. Waterfronts are usually at the center of conflicts between different actors and parties; they combine complex environmental problems such as pollution and flooding, social issues reflecting a contradiction between local inhabitants, who can belong to poor groups, and the tourists, and the intention of the developers to densify the constructions versus the demand for public spaces [15].

III. History of the Development of the 'Plazh' Area of Durres

The city of Durrës is positioned in the western part of the county. It has a geographical location with a latitude of 41°19' north and a longitude of 19°27' east. It has a total area of 432 km2. It is bounded by the district of Tirana, which is 35 km away, and in the south by the district of Kavaja. As mentioned above, the western boundary of the city is defined by the Adriatic Sea coastline and is 30 km long. Durrës is mostly composed of large landfills and hills, which stand 89 m above sea level, whereas in the city, the average height is only 2 m above sea level. The establishment of Durrës began in the 7th century B.C. The modern city is built on the ruins of Epidamn, which is known as the old city, and there are actually a number of significant elements of the historical and cultural heritage. Durrës Bay is located between the Selita cape and the cape of south Durrës. On the entrance part of the capes, the bay shores are higher, while on the inside a smooth slope is seen. On the north part of the shoreline the port of Durrës is located, while on the north-east and south-east of the port lies the city of Durrës. The suburb of the city has changed a lot through the centuries, which means that the coastline has also undergone many transformations due to the earthquakes which have occurred in this area [16]. The oldest part of the city is situated on the hill, overlooking the Adriatic Sea in the west, the bay of Durrës in the south, and the plain in the east and north-east. The city is growing in all directions, but mostly on the coastline. During the last 20 years the city has suffered a drastic transformation. Nowadays, on the 4.5 km of coastal line, there are problems with the building heights, density, distances, and land use.

Figure I. The Regulatory Plan of 1942, detailed plan for the 'Plazh' area

In the early 20s the 'Plazh' district was farmland [17]. In the mid-1930s, a regulatory plan was requested for the city of Durrës, the implementation of which turned the Durrës beach before the Second World War into a touristic and recreational area. The royal court, in the mid-1930s, charged Durrës municipality and a group of technicians to draft the regulatory plan for the coast. A 4 km length of the coastline was divided into 300 parcels of 400-500 m2 each. These parcels were sold to different individuals, provided that the building would not be more than 2 stories high and 80 meters away from the sea line, and half of its surface had to be used as greenery. The first regulatory plan of Durrës was done in 1942 by the architect Leone Carmignani (Figure I). The study presented a functional zoning scheme and an administrative map defining the boundaries and the territory junction.

Figure II. Durres Regulatory Plan of 1957

After the establishment of the communist state in 1946, all of the beach villas were nationalized. The buildings were used as holiday homes for workers, and later some of them were converted into guest villas for foreign party leaders or holiday homes for local directors and officers. The same thing happened to buildings that were close to the 'Bllok' in Illyria, while others were reconstructed as small apartments and were rented for two-week vacations to workers' families, mainly from Tirana. From 1950 to 1960, based on the principle of industrial decentralization, the first comprehensive regional and urban strategies were developed and applied [18]. The General Regulatory Plan of Durrës was prepared in 1957 (Figure II). During this period the coastline was considered an important natural element which belonged to the public, and it was still treated with care.
The second Regulatory Plan (Figure III) was done in 1987, covering a territory of 2800 hectares, which in 1999 was extended to 7800 hectares by Durrës Municipality. According to this plan, no construction was allowed in the 'Plazh' area; it was prohibited to use this area for residential or other purposes.
houses, negotiating and compensating owners with apartments and shops.

IV. Analysis of the Coastline Development

This comparative and descriptive analysis of the coastline transformation from 'Ura Dajlanit' to 'Plepa' is based on the study of visual materials such as maps and images, archival research, site survey, and site observation.

Figure VII. 1995, 2001, 2006, 2010, and 2015 maps of the Durrës coastline

The basic map of 1995 (technical drawings, architect Gëzim Hasko, Archive of Durrës, Water Utility), which has been redrawn as a DWG file, the 2001 and 2010 maps (Land Register office), the 2006 map (ALUIZNI), and the 2015 map (self-updated on the terrain by using the 2010 map and Geoportal ASIG as a reference) were provided (Figure VII) and used to analyze the changes of the coastal line for every 5-year period. Each of the maps included the information regarding the building heights. Comparison between the maps allows identifying the number of new constructions added during each period.

A. Number of Construction

After 1990 the number of constructed buildings has risen year by year. This has completely changed the image of the city, its structure, and its socio-economic aspects.

Figure VIII. Total number of buildings per each period (1995: 248; 2001: 298; 2006: 476; 2010: 490; 2015: 497)

From the construction analysis graph (Figure VIII), it is evident that in 1995 the total number of buildings was 248. In 2001 the number of buildings had reached 298. The highest number of constructions and collaterals is seen from 2001 to 2006, when the number of constructions reached a total of 476 buildings. From 2006 up to 2010 the rhythm of construction was lower; at the end of 2010 there were 490 buildings in total. In the last 5 years, the number of constructions decreased. One of the reasons is that the new Regulatory Plan does not allow new constructions in this area, since there is a high building density and not much free land has remained to continue the chaotic construction that has characterized this area. So, at the end of 2015 there are in total 497 buildings. The total number of buildings added is 249. From 1995 to 2001 there were 50 (20% of the total) new constructions; most of the buildings were 2-3 floors high, mostly small apartments or villas, and were used for tourism. The biggest number of new constructions appeared during 2001-2006, when 178 buildings (71% of the total number) were constructed, of which 94 were built on the foundations of existing ones. After 2006, the rhythm of construction decreased; 14 buildings (6% of the total number) were added during this period. From 2010 to 2015, 7 new buildings (3% of the total number) were constructed.
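The per-period additions and shares quoted in this paragraph can be recomputed directly from the totals in Figure VIII; the short Python sketch below (our illustration, not part of the original study) reproduces them.

# Recomputing the per-period additions and their share of the total
# from the building counts in Figure VIII.
totals = {1995: 248, 2001: 298, 2006: 476, 2010: 490, 2015: 497}
years = sorted(totals)
added_total = totals[2015] - totals[1995]   # 249 buildings added overall

for start, end in zip(years, years[1:]):
    added = totals[end] - totals[start]
    share = added / added_total
    print(f"{start}-{end}: +{added} buildings ({share:.0%} of all additions)")
# Output: +50 (20%), +178 (71%), +14 (6%), +7 (3%), matching the text above.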
B. Building Heights

Durrës was perceived as a low-rise city before 1995. However, from this year a large number of large-scale developments was constructed along the coastline. The first wave of construction was mostly up to five or six floors. The boom of construction of apartment blocks began from 2001 until 2006. There is a big contrast between the low-rise villas and the newly constructed residential buildings. On the maps of the building heights (Figure IX), it is seen that the building heights have increased from 1995 to 2015, but the period which makes the contrast is 2001-2006.

Figure IX. Building heights map of the Durrës 'Plazh' area from 1995 (below) to 2015 (above)

The map from 1995 shows that most of the buildings are one to two stories high, so the 'Plazh' area had only villas with a maximum of 3 stories. Those villas were constructed before the 60s, and until 1995 no new buildings were constructed, because constructions were not allowed in the touristic areas. On the map of 2001 it is seen that there are no very big changes in comparison with the 1995 map. The new buildings were not high-rise, and the distances between buildings were respected. From 2001 until 2006 a massive and uncontrolled expansion of apartment blocks is noted. This period of construction corresponds to the revision of the master plan made in 2005 by Durrës municipality. The revision of urban conditions is not based on any study done for this area, because the last studied master plan was the one made in 1987. Despite this, the construction continued,
causing many problems. One of the main reasons for the massive construction was land reclamation. In 2005, the Revision of Urban Conditions divided the area into two parts: the area with building heights up to six storeys and the area with building heights up to eight storeys. The new buildings that were added were from two up to eight storeys. After 2006, the construction continued. Also in this period the buildings were mostly apartment blocks with heights up to nine floors. From 2010 to 2015 not many buildings were added, because not much vacant area remained.

The overall height of new buildings increased from 1995 to 2015. In 1995 there were no buildings with 7-8 or 9+ floors. In 2001 there are 7 buildings of 5-6 storeys. Furthermore, in 2006 and onwards, the number of high-rise buildings drastically increased. Between 2001 and 2006, 24% of the added buildings are 7-8 storeys high and 7% are 9+ storeys high. During 2010-2015 just 7 new buildings were added, which reflects the new master plan and the urban laws which came into power.

C. Building Distances and Parcel Use
According to the map of the building problems (Figure X), it is seen that there are a lot of problems with the building heights, distances between buildings and use of the parcels. Red buildings are outside the construction criteria according to the 2005 Revision of the Urban Conditions and the 2011 Regulatory Plan. The pink color represents buildings added over the years that have not respected the distance rule. Most of these buildings should have a blind facade but, according to the site investigation, windows and balconies are found there. There are also problems with the coefficient of use of the parcels. The cyan color indicates buildings with more than one problem, for example distance plus height. The total number of buildings with multiple problems is 64.

D. Road Network
The road network of the coastline of Durrës has been studied using geographical information systems (GIS). During the last years, the main road which connects the Plazh area with Durrës and Golem has seen a large increase in traffic and transport demand. The solution to the capacity problem was to provide additional road space, which was the first applied strategy. It was not enough because of the uncontrolled building expansion that occurred in these years in the beach area. Secondary roads were not developed. Regarding the existing road system, a lack of access in the suburban area is noticed. The main street outlines the area. In the internal area, narrow roads are constructed with dead ends. In the past years, with the massive construction around this area, these secondary roads have been transformed into pedestrian roads. Parking is another major problem, since the new buildings did not provide any parking places for the residents. The situation is becoming a major concern to the public because it is getting worse day by day. It also causes air pollution, which is one of the biggest problems in the area; it damages the tourism and has a negative effect on the quality of life.

The main road is the one which did not change during the last years. Because of the massive construction a lot of secondary roads were built within the area. They were built
V. Conclusion
There are two main factors which have affected the development of the coastal area. The first factor is the land reclamation and the second is the absence of an approved Master Plan from 1990 until 2011, since the Regulatory Plan of 1987 was not taken into consideration because it belonged to the previous regime.

Durrës is the second largest city in the region and it has touristic and historical values. The transformation of the 'Plazh' area can be seen as a bad example of managing touristic coastal areas. Every city which has an important historical and touristic value should have a development plan where all the social, economic, aesthetic and architectural principles are merged in a harmonic and attractive way. Everyone should enjoy the spaces which are offered by the waterfront areas. This land should not be treated as a personal asset and used for personal interest; the demands of the tourists and residents should also be taken into consideration. The 'concrete barrier' of the 'Plazh' area was built without the basis of a Regulatory Plan or a specific study. Almost 50% of the buildings are built in violation of the law. In this situation it is difficult to rehabilitate all the area, since there is no vacant land left which can be used for public spaces. A rehabilitation project should be suggested, which should include sport areas, recreation areas and more green areas. New construction should be stopped, some of the informal settlements should be demolished, and importance should be given to the secondary roads which orient visitors to the sea.

References
[1] A. Hoti, Durrësi: Epidamni - Dyrrahu: guidë, Tiranë: Cetis Tirana, 2003.
[2] S. Xhafa and B. Hasani, "Urban Planning Challenges in the Peripheral Areas of Durres City (Porto Romano)," Mediterranean Journal of Social Sciences, vol. 4, no. 10, 2013.
[3] R. Leakey and R. Lewin, People of the Lake: Mankind and Its Beginnings, Anchor Press/Doubleday, 1978.
[4] L. Dong, "Waterfront Development: A Case Study of Dalian, China," University of Waterloo, Waterloo, 2004.
[5] M. Moretti, "Cities on Water and Waterfront Regeneration: A Strategic Challenge for the Future," Rivers of Change - River//Cities, Warsaw, 2008.
[6] M. Yassin, B. Azlina, S. Bond and J. McDonagh, "Principles for sustainable riverfront development for Malaysia," Journal of Techno-Social, 2012.
[7] A. Yassin, C. Eves and J. McDonagh, "An Evolution of Waterfront Development in Malaysia," Geography, 2010.
[8] M. Marzia, Morphological, technological and functional characteristics of infrastructures as a vital sector for the competitiveness of a country system, Milano: Politecnica, 2011.
[9] S. Shaziman, I. Usman and M. Tahir, "Waterfront as Public Space Case Study; Klang River between Masjid Jamek and Central Market, Kuala Lumpur," in Selected Topics in Energy, Environment, Sustainable Development and Landscaping: EEESD '10, 3rd WSEAS International Conference on Landscape Architecture LA '10, 2010.
[10] S. Kostof, The City Shaped: Urban Patterns and Meanings Through History, Thames & Hudson, 1991.
[11] N. Erkal, Haliç extra mural zone: a spatio-temporal framework for understanding the architecture of the Istanbul city frontier, Istanbul: METU, 2001.
[12] A. R. Al-Shams, K. Ngah, Z. Zakaria, N. Noordin, M. Z. Hilmie and M. Sawal, "Waterfront Development within the Urban Design and Public Space Framework in Malaysia," Asian Social Science, vol. 9, no. 10, pp. 77-87, 2013.
[13] C.-H. Chen, "The Analysis of Sustainable Waterfront Development Strategy - The Case of Keelung Port City," International Journal of Environmental Protection and Policy, vol. 3, no. 3, pp. 65-78, 2015.
[14] Y. Reyhan, Ş. Şenlie and G. İ. Burcu, "Sustainable Urban Design Guidelines For Waterfront Developments," in 2nd International Sustainable Building Symposium, Ankara, 2015.
[15] R. E. Pramesti, "Sustainable Urban Waterfront Development: Challenge And Key Issues," MEDIA MATRASAIN, vol. 14, no. 2, pp. 41-54, 2017.
[16] V. Koçi, Spatial transformations of the waterfront as an urban frontier, case study: Durres, a port city, Istanbul: METU, 2005.
[17] Dyrrah, "Durrës: Vilat e plazhit, "hirushet" e vetmuara mes katrahurës ndërtimore," DurrësLajm, 15 June 2015.
[18] G. Enyedi, "Urbanization under Socialism," in Cities after Socialism: Urban and Regional Change and Conflict in Post-Socialist Societies, Oxford, Blackwell Publishers, 1996, pp. 100-11.
Deep Learning Using MobileNet for Personal Recognizing
Şafak Kılıç, Department of Computer Engineering, Siirt University, Siirt, Turkey (safakkilic@outlook.com)
İman Askerzade, Department of Computer Engineering, Ankara University, Ankara, Turkey (imasker@eng.ankara.edu.tr)
Yılmaz Kaya, Department of Computer Engineering, Siirt University, Siirt, Turkey (yilmazkaya1977@gmail.com)
Abstract— The usage areas of biometric technologies are increasing day by day. As information security becomes more important for people every day, it has become one of the most common application areas. In recent years, human-computer interactive systems have started to attract academic and commercial interest, and these systems aim to solve problems such as person recognition, gender estimation and age estimation. In our study, person recognition was performed on data collected using wearable sensors. The Daily and Sports Activities data set, which we obtained from the UCI database, has been tested with the developed MobileNet architecture. It has been seen that the data obtained from the sensors are successful in the person recognition problem. The developed system has realized person identification over 19 different physical movements and has also provided the detection of these 19 movements. In addition, success rates were obtained according to the region where the sensors were installed. Thanks to the results obtained in this study, it has been seen that accelerometer, gyroscope and magnetometer sensors are successful in biometric person recognition. In summary, it has been determined that the proposed method is successful in biometric person recognition, thanks to the data obtained from wearable sensors.

Keywords—Transfer Deep Learning Models, MobileNet, Person Identification, Wearable Sensor, Biometric System

I. Introduction
In the past few decades, the problem of identifying people has been one of the hot areas where researchers emphasize the use of various methods. The human body has several unique characteristics, and some systems can detect these characteristics and distinguish one person from another. A system that recognizes people based on their physical or behavioural characteristics is termed a biometric system. Personal biometric authentication consists of distinguishing people based on their physiological and/or behavioural characteristics [1]. Biometric technology checks the physical or behavioural characteristics by which a person can be recognized. A biometric system works in two modes: (1) identification and (2) authentication (also referred to as "identity verification"). In the former, a person's identity is established by finding a match within the database of all enrolled persons (one-to-many comparison). In the latter, a person's biometric information is compared with their template stored in the system database to verify that person's identity.

Physical and behavioral features are the two basic qualities of persons. Physical properties are those that are stable and do not vary over time, whereas behavioral properties are those that change over time and in response to environmental factors. Biometric recognition systems are generally developed on the basis of these two characteristics of people. While biometric systems created with data such as facial recognition, fingerprint recognition, hand geometry, iris and retina data are systems based on physiological features, biometric systems created with data such as walking, signature and speech are based on behavioral features. These strategies, on the other hand, have a significant disadvantage in that they can be duplicated [2; 3; 4].

Mimicked voices, the use of duplicate irises, and disguised glasses can be examples of these scams. As a result, new descriptive systems based on individual behavior or features, known as biometrics and based on signals measured from various parts of the body, have been adopted in recent years [5; 6]. Different medical signals are also employed as biometric data, according to studies. Biometric systems have been developed using EEG and other signals [7; 8; 6; 9], electrocardiogram [10; 11; 12; 13; 14; 15], and accelerometer [16; 17]. Medical indicators are unique to each person, according to studies [8; 18; 10].

In the study of Alyasseri et al. [19], people are recognized using multichannel EEG waves. In addition, active EEG channels were uncovered by the researchers. The process of recognizing persons can be done by using the electrical signals in the brain, according to Sun et al.'s research [9]. They discovered that applying the conventional 1D-LSTM deep learning algorithm to 16-channel EEG measurements resulted in a success rate of 99.56 percent. In their study of identifying people using EEG data, Rodrigues et al. discovered an 87 percent success rate [6].

The person recognition problem was also attacked using sensor signals in the studies by Kılıç et al. [20; 21]. The sensor signals were converted into pictures with various processes and the success of the system was tested with local binary patterns and deep learning networks and models. In both articles, the authors showed a success rate of over 95% in person recognition from sensor data.

Although the strategy of training a CNN from scratch can be successful in many problems, the correct optimization of the hyper-parameters in the architecture to be built is still a difficult process [34]. At the same time, a large amount of data is required for a from-scratch training technique [35]. However, it is possible to reach high success rates faster and more precisely with a transfer learning or fine-tuning strategy, by training deep architectures which are enriched with techniques newly introduced to the literature and which do not need hyperparameter optimization [36]. The MobileNet architecture is chosen as the training model in this study because it can be trained with deep transfer learning, has a low computational cost, and is ideal for mobile applications. At the end of the study, higher success rates were achieved with the right training strategy compared to other studies in the literature.

In the remainder of this paper, the data set is introduced, the models used are described, and the experimental results obtained are presented in the discussion section. In the last part, the general achievements obtained at the end of the study are given.
II. Data Set
As a part of this study, the data set known as Daily and Sports Activities was obtained from the UCI database [22; 23; 24]. This study used Xsens MTx sensor units mounted at assigned locations on the person's body to collect data for the 19 previously indicated behaviors (activities). To collect data, these sensor units were placed on five different areas of the subject's body: the chest, the right wrist, the left wrist, the right leg (above the knee), and the left leg (above the knee) (Figure I). Each Xsens MTx unit contains nine sensors (accelerometer x, y, z; gyroscope x, y, z; and magnetometer x, y, z).

Activity Code | Name of the Activity
A16 | cycling in a vertical position
A17 | rowing exercise
A18 | jumping exercise
A19 | playing basketball
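A minimal sketch of loading this data set into per-segment signal matrices is given below (our illustration, not the authors' code; the directory layout a01..a19/p1..p8/s01..s60.txt and the 125x45 comma-separated segment size are assumptions based on the public UCI distribution of this data set):

```python
# Sketch: reading the UCI "Daily and Sports Activities" segments into
# signal matrices (19 activities x 8 subjects x 60 segments = 9120).
# Each file is assumed to be a 125x45 matrix: 125 time samples,
# 5 sensor units x 9 sensors (accelerometer, gyroscope, magnetometer).
import numpy as np
from pathlib import Path

def load_segments(root="data"):
    segments, activity_ids, subject_ids = [], [], []
    for a in range(1, 20):            # 19 activities: a01..a19
        for p in range(1, 9):         # 8 subjects: p1..p8
            for s in range(1, 61):    # 60 segments: s01..s60
                f = Path(root) / f"a{a:02d}" / f"p{p}" / f"s{s:02d}.txt"
                segments.append(np.loadtxt(f, delimiter=","))
                activity_ids.append(a)
                subject_ids.append(p)
    return np.array(segments), np.array(activity_ids), np.array(subject_ids)

# X.shape == (9120, 125, 45); y_subject is the person-identification label.
# X, y_activity, y_subject = load_segments("data")
```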
III. Methodology
A. Person Identification by MobileNet Deep Transfer Learning Technique
to each input channel using a depthwise convolution, and then creates a linear combination of the depthwise layer outputs using 1x1 (pointwise) convolutions. Batch normalisation (BN) and a rectified linear unit (ReLU) are used after every convolution.
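As a rough sketch of the depthwise separable block just described (our illustration, assuming TensorFlow/Keras; the input and filter sizes are illustrative, not values taken from the paper):

```python
# Sketch of a MobileNet-style depthwise separable convolution block,
# with BN + ReLU after every convolution, as described above.
import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    # Depthwise convolution: one 3x3 filter applied to each input channel.
    x = layers.DepthwiseConv2D(kernel_size=3, strides=stride,
                               padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # Pointwise 1x1 convolution: linear combination of the depthwise outputs.
    x = layers.Conv2D(pointwise_filters, kernel_size=1,
                      padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(224, 224, 3))   # assumed input size
outputs = depthwise_separable_block(inputs, pointwise_filters=64)
model = tf.keras.Model(inputs, outputs)        # one MobileNetV1 building block
```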
An average pooling layer condenses the feature map once all of the convolutional layers have extracted features from the input image.

The Reshape layer, Dropout layer, convolutional layer, Softmax activation function layer, and final Reshape layer, which make up the last five layers of the standard MobileNet, are replaced by a Dropout layer and a fully connected layer with Softmax activation. Our fully connected layer generates more accurate predictions for each class than the last five layers of the standard MobileNet do. In general, increasing the number of convolutional layers in the model helps it extract more features from the input data.

Signals were collected from eight subjects (four men and four women). Each activity is broken down into 60 segments. As a result, there are 19x8x60 = 9120 signal matrices in the data set. The MobileNet deep transfer learning technique is utilized after these signal matrices have been converted into images; as a consequence, 9120 images were extracted to test whether our system works. Two MobileNet architectures are used. The success rate is determined as follows:

Success rate (%) = 100 × (# true classifications) / ((# true classifications) + (# false classifications))   (2)

Table IV shows the success rate of person identification using the MobileNet Version 1 (MobileNetV1) and MobileNet Version 2 (MobileNetV2) architectures.
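To make this transfer learning step concrete, here is a minimal sketch of replacing the MobileNet top with a Dropout layer and a fully connected Softmax layer for the eight subject classes, together with the success rate of (2). The image size, dropout rate and optimizer are our assumptions, not values reported by the authors:

```python
# Sketch: MobileNetV1 transfer learning for 8-subject identification.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

NUM_SUBJECTS = 8  # 19 activities x 8 subjects x 60 segments = 9120 images

base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False,
    weights="imagenet", pooling="avg")   # pretrained features, top removed
base.trainable = False                   # transfer learning: freeze the base

model = tf.keras.Sequential([
    base,
    layers.Dropout(0.5),                               # replaces old top layers
    layers.Dense(NUM_SUBJECTS, activation="softmax"),  # fully connected + Softmax
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

def success_rate(y_true, y_pred):
    """Eq. (2): 100 * true / (true + false) classifications."""
    true = int(np.sum(np.asarray(y_true) == np.asarray(y_pred)))
    return 100.0 * true / len(y_true)
```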
after the person recognition process. In general, it has been determined that the MobileNetV1 network is more successful in the physical activity recognition problem. It has shown 100% success in recognizing some physical movements. Detailed results are presented in Table V.

Table VI. Success rate (%) of MobileNet networks by sensor region
Region    | MobileNetV1 | MobileNetV2
Chest     | 94.5        | 93.8
Right Arm | 92.1        | 92.1
Left Arm  | 91.1        | 89.9
Right Leg | 93.6        | 92.3
Left Leg  | 91.9        | 92.0

Finally, we have tested the success of our model according to the region where the sensors receive signals. Both models showed high success; in addition, the MobileNetV1 networks have a higher success rate. Data from the chest-level sensors was more distinctive than data from the other regions. Details can be seen in Table VI.

VI. Conclusion
Several biometric technologies have been developed in recent years. Face, voice, fingerprints, palm print, ear shape, and gait are all biometric technologies that have been widely used in security systems. However, because they may be imitated, most of these systems have glaring faults. To address these issues, a new biometric system based on medical signals has been developed. In this study, a biometric approach was developed to identify people using wearable sensor inputs. The major goal of this study is to show that signals from portable sensors such as accelerometers, gyroscopes, and magnetometers may be used to identify people. In future studies, due to the increasing use of mobile devices, person recognition will be done with a deep transfer learning approach through data obtained from portable mobile devices.

References
[1] B. Ngugi, A. Kamis, and M. Tremaine, "Intention to use biometric systems," e-Service Journal: A Journal of Electronic Services in the Public and Private Sectors, vol. 7, no. 3, pp. 20-46, 2011.
[2] T. Matsumoto, H. Matsumoto, K. Yamada, and S. Hoshino, "Impact of artificial "gummy" fingers on fingerprint systems," in Optical Security and Counterfeit Deterrence Techniques IV, 2002, vol. 4677: International Society for Optics and Photonics, pp. 275-289.
[3] J. Yu, C. Fang, J. Xu, E.-C. Chang, and Z. Li, "ID repetition in Kad," in 2009 IEEE Ninth International Conference on Peer-to-Peer Computing, 2009: IEEE, pp. 111-120.
[4] G. Gainotti, "Laterality effects in normal subjects' recognition of familiar faces, voices and names. Perceptual and representational components," Neuropsychologia, vol. 51, no. 7, pp. 1151-1160, 2013.
[5] J. Galbally, S. Marcel, and J. Fierrez, "Biometric antispoofing methods: A survey in face recognition," IEEE Access, vol. 2, pp. 1530-1552, 2014.
[6] D. Rodrigues, G. F. Silva, J. P. Papa, A. N. Marana, and X.-S. Yang, "EEG-based person identification through binary flower pollination algorithm," Expert Systems with Applications, vol. 62, pp. 81-90, 2016.
[7] S. Marcel and J. d. R. Millán, "Person authentication using brainwaves (EEG) and maximum a posteriori model adaptation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 743-752, 2007.
[8] Y. Dai, X. Wang, X. Li, and Y. Tan, "Sparse EEG compressive sensing for web-enabled person identification," Measurement, vol. 74, pp. 11-20, 2015.
[9] Y. Sun, F. P.-W. Lo, and B. Lo, "EEG-based user identification system using 1D-convolutional long short-term memory neural networks," Expert Systems with Applications, vol. 125, pp. 259-267, 2019.
[10] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold, and B. K. Wiederhold, "ECG to identify individuals," Pattern Recognition, vol. 38, no. 1, pp. 133-142, 2005.
[11] M. Deng, C. Wang, M. Tang, and T. Zheng, "Extracting cardiac dynamics within ECG signal for human identification and cardiovascular diseases classification," Neural Networks, vol. 100, pp. 70-83, 2018.
[12] A. Goshvarpour and A. Goshvarpour, "Human identification using a new matching pursuit-based feature set of ECG," Computer Methods and Programs in Biomedicine, vol. 172, pp. 87-94, 2019.
[13] K. Su et al., "Human identification using finger vein and ECG signals," Neurocomputing, vol. 332, pp. 111-118, 2019.
[14] F. Sufi and I. Khalil, "Faster person identification using compressed ECG in time critical wireless telecardiology applications," Journal of Network and Computer Applications, vol. 34, no. 1, pp. 282-293, 2011.
[15] W. Chang, H. Wang, G. Yan, and C. Liu, "An EEG based familiar and unfamiliar person identification and classification system using feature extraction and directed functional brain network," Expert Systems with Applications, vol. 158, p. 113448, 2020.
[16] R. San-Segundo, R. Cordoba, J. Ferreiros, and L. F. D'Haro-Enriquez, "Frequency features and GMM-UBM approach for gait-based person identification using smartphone inertial signals," Pattern Recognition Letters, vol. 73, pp. 60-67, 2016.
[17] R. San-Segundo, J. D. Echeverry-Correa, C. Salamea-Palacios, S. L. Lutfi, and J. M. Pardo, "I-vector analysis for gait-based person identification using smartphone inertial signals," Pervasive and Mobile Computing, vol. 38, pp. 140-153, 2017.
[18] N. V. Boulgouris, K. N. Plataniotis, and E. Micheli-Tzanakou, Biometrics: Theory, Methods, and Applications. John Wiley & Sons, 2009.
[19] Z. A. A. Alyasseri, A. T. Khader, M. A. Al-Betar, and O. A. Alomari, "Person identification using EEG channel selection with hybrid flower pollination algorithm," Pattern Recognition, vol. 105, p. 107393, 2020.
[20] Ş. Kılıç, Y. Kaya, and I. Askerbeyli, "A New Approach for Human Recognition Through Wearable Sensor Signals," Arabian Journal for Science and Engineering, vol. 46, no. 4, pp. 4175-4189, 2021.
[21] Ş. Kılıç, İ. Askerzade, and Y. Kaya, "Using ResNet Transfer Deep Learning Methods in Person Identification According to Physical Actions," IEEE Access, vol. 8, pp. 220364-220373, 2020.
[22] K. Altun, B. Barshan, and O. Tunçel, "Comparative study on classifying human activities with miniature inertial and magnetic sensors," Pattern Recognition, vol. 43, no. 10, pp. 3605-3620, 2010.
[23] B. Barshan and M. C. Yüksek, "Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units," The Computer Journal, vol. 57, no. 11, pp. 1649-1667, 2014.
[24] K. Altun and B. Barshan, "Human activity recognition using inertial/magnetic sensor units," in International Workshop on Human Behavior Understanding, 2010: Springer, pp. 38-51.
[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, 2012.
[26] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1-9.
[27] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[28] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[29] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8697-8710.
[30] O. Russakovsky et al., "Imagenet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
[31] A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[32] L. Sifre and S. Mallat, "Rigid-motion scattering for image classification," arXiv preprint arXiv:1403.1687, 2014.
[33] O. Dobrucalı and B. Barshan, "Sensor-activity relevance in human activity recognition with wearable motion sensors and mutual information criterion," in E. Gelenbe and R. Lent (eds.), Information Sciences and Systems 2013, 2013.
[34] M. Feurer and F. Hutter, "Hyperparameter optimization," in Automated Machine Learning: Springer, Cham, 2019, pp. 3-33.
[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221-248, 2017.
[36] D. Hendrycks, K. Lee, and M. Mazeika, "Using pre-training can improve model robustness and uncertainty," in International Conference on Machine Learning, 2019: PMLR, pp. 2712-2721.
Effects of Photon-Shot and Excess Noises on Detectable
Minimum Rotation Rate in I-FOG Design for
Autonomous Vehicles
Kübra Kılınçarslan, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (kubrakilincarslan@uludag.edu.tr)
Emirhan Sağ, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (emirhansag@uludag.edu.tr)
Abdurrahman Günday, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (agunday@uludag.edu.tr)
Abstract—In the last decade, the importance of acquiring location information has gradually increased depending on the developments in autonomous vehicle technologies. The rotation rate variation required for providing high-precision continuous position information is obtained by using a gyroscope, which is an important part of the Inertial Measurement Unit (IMU). However, some effects limit the operation performance of gyroscopes. In this study, the effects of photon shot and excess noises that limit the measurement accuracy of the interferometric fiber optic gyroscope (I-FOG) on the detectable minimum rotation rate (DMRR) have been analyzed for a basic configuration using a superluminescent diode (SLD) and a superfluorescent fiber source (SFS). Furthermore, simulations related to DMRR variations for the system parameters of fiber length, fiber coil diameter, photodetector bandwidth, output power of the optical sources and spectral bandwidth have been obtained in Matlab 2020b. Moreover, for an optimum system design employing SLD and SFS, the DMRR has also been computed as 0.793°/h and 0.910°/h, respectively. Thus, approximately 80% improvement has been achieved over the DMRR values in the system.

Keywords—fiber optic gyroscope, broadband sources, minimum rotation rate, photon-shot noise, excess noise

I. Introduction
Acceleration sensors and gyroscopes with high sensitivity and accuracy are extensively used to detect the position, speed and direction of an object. These kinds of information gathered from inertial navigation systems (INS) are combined with global positioning system (GPS) data, accordingly [1].

In recent years, many research and development studies have been carried out in the field of autonomous driving technologies. In this context, detection systems such as GPS, Visual Simultaneous Localization and Mapping (SLAM), and Light Detection and Ranging (LIDAR) are used to obtain the exact location information of the vehicle [2].

However, these methods and techniques have some limitations in terms of performance due to their weaknesses in practical use: atmospheric effects, multi-path propagation and the inaccessibility of environments such as tunnels for GPS; adverse effects of atmospheric events on image processing for Visual SLAM; and performance that varies with the light-reflection properties of different objects for LIDAR systems.

In cases where other positioning methods cannot be used, the dead reckoning technique is utilized to overcome their weaknesses. Dead reckoning is in principle the process of calculating the current position of a person or moving object by using a fixed or previously specified position and advancing this position utilizing known or estimated information of vehicle speed and route. There are great numbers of sensors used in the computational navigation system with high accuracy and sensitivity. The gyroscope, which is one of them, is exploited for detecting the deviation rate and route information and has become the most important device that determines the accuracy of dead reckoning [3].

There are different types of gyroscopes employed in autonomous vehicles, such as optical-based gyroscopes and micro-electro-mechanical systems (MEMS). MEMS gyroscopes have cost advantages in comparison with optical gyroscopes; on the contrary, optical gyroscope-based navigation systems are preferred due to their superiority in terms of measurement accuracy and reliability [4].

However, optical backscattering and distortions caused by nonlinear electro-optical effects influence the operation performance of I-FOGs [5, 6]. To overcome this problem, low-coherence and broadband optical sources are employed in applications [7]. The simulations and analyses performed in this study have been obtained making use of SLD and SFS, which are low-coherence and broadband optical sources.

The main effect limiting the DMRR at the optical gyroscope output is the photon shot noise induced by the photodetector that converts the incoming optical signal into an electrical signal in the FOG configuration exploited in this study. Furthermore, the excess noise induced by the characteristics of the optical sources [6] is considered as a limiting effect on the DMRR.

In this study, performance values of DMRR for I-FOG designs employing SLD and Erbium-doped SFS have been compared with each other using the theoretical and simulation results derived from the configuration illustrated in Figure I. Moreover, DMRR variations and corresponding simulations have also been performed making use of the parameters of optical fiber coil diameter, fiber length, photodetector bandwidth, and optical output power and spectral bandwidth of both sources.

II. Theory
The working principle of optical gyroscopes is based on the Sagnac effect. This effect gives rise to the Δφ Sagnac phase shift between optical signals propagating clockwise (CW) and counterclockwise (CCW) in a ring interferometer rotating around an axis perpendicular to the ring. Detectable minimum rotation rate information is obtained by using the Sagnac phase shift.

The minimum FOG configuration exploited in this study is shown in Figure I [8]. This configuration consists of an optical source, a 50/50 optical coupler, a fiber polarizer, a piezoelectric phase modulator, a fiber coil and a photodetector.
Figure I. Minimum I-FOG configuration

The fiber polarizer used in the FOG configuration given in Figure I polarizes the light linearly in only one direction. The 50/50 optical coupler placed in the configuration is utilized for splitting the light beam into two equal parts propagating in opposite directions of each other in the fiber coil. An optical phase modulator is utilized to modulate the phase of the light pumped into the optical fiber. The phase difference variation caused by the rotational movement of the system reaches the photodetector and is then converted into an electrical signal [9]. Afterward, rotation rate information is obtained by using the signal processing unit at the photodetector output.

The relation between the rotation rate caused by the rotational motion in I-FOGs and the Sagnac phase shift is given as in (1).

Ω = (λ·c / (2·π·L·D)) · φs   [rad/s]   (1)

where Ω is the rotation rate, λ is the wavelength, c is the speed of light, L is the fiber length, D is the diameter of the coil and φs is the Sagnac phase shift.

In I-FOGs, the fundamental noise limit is described by photon-shot noise, which causes random fluctuations in the detector output current (is = √(q·i0·B)) due to the random scattering of incident photons on the photodetector.

In the noise current equation, the parameters q, i0, B = 1/T and T represent the electron charge, the average current in the detector, the electrical bandwidth of the detection system and the sampling time, respectively [10].

The I-FOG resolution of the gyroscope system is evaluated as the detectable minimum variation rate in rotation angle induced by uncertainties in detector output current and rotation rate. The detectable minimum rotation rate for photon shot noise is expressed by (2) [11].

Ωmin,shot = (λ·c / (2·π·L·D)) · √(B·h·c / (η·λ·Pd))   [rad/s]   (2)

where B is the bandwidth of the photodetector, h is the Planck constant, η is the optical efficiency of the detector, and Pd is the optical power reaching the detector. The photodetector current is stated with (3) [12].

I = q·η·λ·Pd / (h·c)   [A]   (3)

where I is the photodetector current in amperes. Substituting (3) in (2) and reorganizing the equation, the detectable minimum rotation rate as a function of photodetector current can be written as in (4).

Ωmin,shot = (λ·c / (2·π·L·D)) · √(B·q / I)   [rad/s]   (4)

The detectable minimum rotation rate depending on excess noise can be stated as in (5) [13].

Ωmin,excess = (λ² / (2·π·L·D)) · √(B·c / Δλ)   [rad/s]   (5)

where Δλ is the spectral bandwidth of the optical source.

The photodetector current fluctuation generated in the photodetector can be described as a function of the combined effect of photon shot noise and excess noise, as given in (6) [14].

⟨(ΔI)²⟩ = 2·q·⟨I⟩·B + ⟨I⟩²·B / Δλ   [A²]   (6)

Considering the combined effect of photon shot and excess noise, i.e. using (4) and (5), the detectable minimum rotation rate can be expressed as in (7) [14].

Ωmin = (λ·c / (2·π·D·L)) · ((1 + J0) / J1) · (q/⟨I⟩ + λ²/(2·c·Δλ))^(1/2) · √B   [rad/s]   (7)

where J0 and J1 are Bessel functions and take the values of 0.34 and 0.58, respectively, for maximum sensitivity.

III. Simulations
Simulations have been performed using Matlab 2020b on the simulation model built up for analyzing the effects of photon shot and excess noises on the DMRR, utilizing the performance parameters of fiber coil diameter, fiber length, photodetector bandwidth, optical source bandwidths, and optical output power. The photon shot noise, excess noise and combined effect of both noises expressed in (4), (5) and (7), respectively, have been exploited for obtaining the relevant simulations in this research.

Thereafter, the system parameters used in the I-FOG configuration have been specified and their values have also been acquired for an optimum system design. In this way, the effects of the design parameters on the noise characteristics have been analyzed in terms of system performance, as well.

The parameters of the system used for the simulations in this study and their corresponding values are given in Table I.

Table I. Simulation parameters
Parameters                          | Value ranges
Fiber coil diameter (D) [m]         | 0.05 – 0.15
Fiber length (L) [m]                | 1000 – 1500
Photodetector bandwidth (B) [Hz]    | 60 – 500
SLD spectral bandwidth (Δλ) [nm]    | 70 – 130
SFS spectral bandwidth (Δλ) [nm]    | 20 – 60
Output power (P) [mW]               | 2 – 20

For the I-FOG configuration indicated in Figure I, the wavelength of the light launched into the optical fiber (λ), the electron charge (q), the speed of light in vacuum (c) and the photodetector responsivity have been taken as 1550 nm, 1.602x10⁻¹⁹ C, 3x10⁸ m/s and 0.93 A/W, respectively, in the simulations.
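As an illustration of how (4), (5) and (7) can be evaluated over these parameter ranges, the following Python sketch implements the three expressions (the paper's own model was built in Matlab 2020b; this is our re-implementation). The photodetector current is derived here from an assumed detector power equal to the source output power, since coupler and polarizer losses are not specified in this excerpt, so absolute values need not match the reported ones:

```python
# Sketch: DMRR expressions (4), (5) and (7) with the constants above.
import math

Q = 1.602e-19        # electron charge [C]
C = 3e8              # speed of light in vacuum [m/s]
LAM = 1550e-9        # wavelength [m]
RESP = 0.93          # photodetector responsivity [A/W]
J0, J1 = 0.34, 0.58  # Bessel-function values for maximum sensitivity

RAD_S_TO_DEG_H = math.degrees(1) * 3600  # rad/s -> deg/h

def dmrr_shot(D, L, B, P_d):
    """Eq. (4): shot-noise-limited DMRR, with I = RESP * P_d (assumption)."""
    I = RESP * P_d
    return LAM * C / (2 * math.pi * L * D) * math.sqrt(B * Q / I)

def dmrr_excess(D, L, B, dlam):
    """Eq. (5): excess-noise-limited DMRR."""
    return LAM**2 / (2 * math.pi * L * D) * math.sqrt(B * C / dlam)

def dmrr_total(D, L, B, P_d, dlam):
    """Eq. (7): combined photon shot + excess noise DMRR."""
    I = RESP * P_d
    pre = LAM * C / (2 * math.pi * D * L) * (1 + J0) / J1
    return pre * math.sqrt(Q / I + LAM**2 / (2 * C * dlam)) * math.sqrt(B)

# Example: D = 0.15 m, L = 1000 m (D.L = 150 m^2), B = 60 Hz, P = 2 mW,
# SLD with 70 nm spectral bandwidth, reported in deg/h.
print(dmrr_total(0.15, 1000, 60, 2e-3, 70e-9) * RAD_S_TO_DEG_H)
```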
A. Relationship between Minimum Rotation Rate and the D.L Product
The spectral bandwidths of the SLD and SFS vary in the ranges 70 nm - 130 nm and 20 nm - 60 nm, respectively, as mentioned in Table I. Simulations of the effect of the fiber coil diameter and fiber length product (D.L) on the minimum rotation rate have been obtained for B = 60 Hz, P = 2 mW and spectral bandwidths of 70 nm for the SLD and 20 nm for the SFS, as shown in Figure II.

Figure II. Minimum rotation rate vs. D.L product for a) SLD and b) SFS

DMRR variations for both SLD and SFS decrease exponentially with increasing D.L product, as shown in Figure II. For variations of D.L in the range 50 m² - 225 m², the DMRR for the SLD changes between 1.963°/h and 0.436°/h depending on the photon shot noise, while it varies between 0.799°/h and 0.178°/h depending on the excess noise.

For variations of D.L in the same range, the DMRR for the SFS changes between 1.963°/h and 0.436°/h depending on the photon shot noise, whilst it changes between 1.496°/h and 0.333°/h depending on the excess noise.

The total DMRR shows a change in the ranges 4.720°/h - 1.049°/h and 5.152°/h - 1.145°/h for SLD and SFS, respectively. Therefore, it is seen that the DMRR decreases as the values of coil diameter and fiber length increase.

At the point where D.L is 150 m² in Figure II, the DMRR values for SLD and SFS have decreased by approximately 85%. The detectable minimum rotation rate takes the values 1.573°/h and 1.717°/h for SLD and SFS, respectively, at this point.

B. Relationship between Minimum Rotation Rate and Bandwidth
Simulations of the effect of photodetector bandwidth changes on the DMRR for D.L = 150 m² and P = 2 mW have been obtained for SLD and SFS with spectral bandwidths of 70 nm and 20 nm, respectively, as indicated in Figure III.

Figure III. Minimum rotation rate vs. bandwidth for a) SLD and b) SFS

For the variation of bandwidth in the range 60 Hz - 500 Hz, the DMRR values for the SLD vary in the range 0.654°/h - 1.889°/h depending on the photon shot noise and in the range 0.266°/h - 0.769°/h depending on the excess noise. For the variation of bandwidth in the same range, the DMRR for the SFS changes from 0.654°/h to 1.889°/h depending on the photon shot noise and from 0.499°/h to 1.440°/h depending on the excess noise.

The total DMRR changes in the ranges 1.573°/h - 4.541°/h and 1.717°/h - 4.958°/h for SLD and SFS, respectively. In other words, the higher the photodetector bandwidth, the higher the DMRR. The minimum rotation rate reaches its maximum value of 4.958°/h because of the combined effect of both the photon shot noise and the excess noise for the SFS.
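A minimal sketch of how such a parameter sweep can be generated (our illustration, under the same assumptions about constants and detector current as the previous snippet, so the absolute values need not match the figures):

```python
# Sketch: sweeping the D.L product as in Figure II, for B = 60 Hz,
# P = 2 mW and an SLD spectral bandwidth of 70 nm (assumed detector power).
import math

Q, C, LAM, RESP, J0, J1 = 1.602e-19, 3e8, 1550e-9, 0.93, 0.34, 0.58
TO_DEG_H = math.degrees(1) * 3600  # rad/s -> deg/h

def total_dmrr(DL, B, P_d, dlam):
    # Combined shot + excess noise DMRR of (7), with I = RESP * P_d.
    pre = LAM * C / (2 * math.pi * DL) * (1 + J0) / J1
    term = Q / (RESP * P_d) + LAM**2 / (2 * C * dlam)
    return pre * math.sqrt(term) * math.sqrt(B) * TO_DEG_H

for DL in range(50, 226, 25):  # D.L from 50 to 225 m^2
    print(f"D.L = {DL:3d} m^2 -> DMRR = {total_dmrr(DL, 60, 2e-3, 70e-9):.3e} deg/h")
```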
C. Relationship between Minimum Rotation Rate and Spectral Bandwidth
The variations of the DMRR with the spectral bandwidths of the optical sources have been obtained as shown in Figure IV, for D.L = 150 m², P = 2 mW, and B = 60 Hz.

Figure IV. Minimum rotation rate vs. spectral bandwidth for a) SLD and b) SFS

The total minimum rotation rate changes from 1.573°/h to 1.545°/h for the SLD spectral bandwidth variation in the range 70 nm - 130 nm, whilst it varies from 1.717°/h to 1.583°/h for the SFS spectral bandwidth variation in the range 20 nm - 60 nm.

The total DMRR has been obtained as 1.555°/h and 1.618°/h for SLD and SFS with spectral bandwidths of 100 nm and 40 nm, respectively. Hence, improvements with values of 64% and 74% have been achieved at these points, respectively.

As is obvious in Figure IV.b, excess noise is more effective in an SFS with narrow bandwidth in comparison to one with a broad band. Stated in other words, this noise effect causes the detectable total minimum rotation rate to become higher in SFSs. Accordingly, when using a broadband optical source, the effect of excess noise can be reduced and a lower rotation rate can be achieved.

D. Relationship between Minimum Rotation Rate and Optical Output Power
Simulations of the effects of the output powers on the DMRR, for spectral bandwidths of 70 nm and 20 nm for SLD and SFS, respectively, and with D.L = 150 m² and B = 60 Hz, have been obtained as illustrated in Figure V.

Figure V. Minimum rotation rate vs. output power for a) SLD and b) SFS

The total DMRR related to SLD and SFS changes in the ranges 1.573°/h - 0.646°/h and 1.717°/h - 0.945°/h, respectively, when the output power varies in the range 2 mW - 20 mW.

As seen from Figure V, the DMRR shows a falling tendency as the output power of the optical source increases. In designs, sources with higher output power can be selected for reaching a lower minimum rotation rate; on the contrary, increasing the output power causes a large amount of energy consumption and high cost. For this reason, sources with an optimum output power should be preferred, by evaluating the other parameters as well.

At the point where the output power is 9.2 mW, the DMRR decreases by approximately 80% and takes the values 0.829°/h and 1.077°/h for the two optical sources, respectively. This decrease results in approximately 80% improvement in the DMRR at the given value of output power for both sources.
IV. Conclusion
In this study, the effects of photon shot and excess noises, which limit the measurement sensitivity and accuracy of I-FOGs, on the DMRR have been analyzed for a basic configuration employing SLD and SFS. In addition, considering these noise effects, simulations of the relations between the DMRR and the parameters of fiber length, fiber coil diameter, photodetector bandwidth, output power, and spectral bandwidth of the optical sources have also been performed.

In I-FOGs, the measurement sensitivities show an increase when the fiber length (L) and the coil diameter (D) have higher values. However, choosing convenient values for both parameters is important for an optimum design, by reason of the increase in the dimensions and weight of the system and the optical attenuation. Increasing the bandwidth causes a rise in the DMRR of the system. In these kinds of designs, although choosing a low-bandwidth detector enables lower rotation rate measurement, high-bandwidth devices are preferred for sampling at high rates and increasing the performance of closed-loop systems. Spectral bandwidth and DMRR are inversely proportional, for the reason that an increment of spectral bandwidth reduces the effect of excess noise. The negative effects of excess noise in an SFS with narrow spectral bandwidth have also been observed in the other simulations performed in this study. This situation shows that the spectral bandwidth plays a vital role and is an important parameter in I-FOG designs. The use of optical sources with high output power minimizes the photon shot noise and provides lower rotation rate measurement. Thus, the measurement sensitivity can be increased by using a source with high output power but, in this manner, cost-effectiveness diminishes and power consumption shows an increase. Therefore, it is necessary to determine the optimum values in accordance with the application.

In this study, considering the system parameters for the optimum design as D.L = 150 m², B = 60 Hz, P = 9.2 mW and spectral bandwidths of 100 nm and 40 nm for SLD and SFS, respectively, the DMRR has been computed as 0.793°/h and 0.910°/h, respectively.

Consequently, employing optical sources with high spectral bandwidth and high output power is important for I-FOG designs to provide high accuracy and more precise measurements in navigation systems. When viewed from this aspect, this study will guide similar studies in the field and contribute to researchers for future investigations.

References
[1] Johannes Rünz, Folko Flehming, Wolfgang Rosenstiel, Michael Knoop, "Requirements and Evaluation of a Smartphone Based Dead Reckoning Pedestrian Localization for Vehicle Safety Applications", Advanced Microsystems for Automotive Applications, Springer, 2016.
[2] César Debeunne, Damien Vivet, "A review of visual-LiDAR fusion based simultaneous localization and mapping", Sensors, Volume 20, Issue 7, 2020.
[3] Tsunehiko Imamura, Tomohiro Matsui, Masaroni Yachi, Hideo Kumagai, "A low-cost interferometric fiber optic gyro for autonomous driving", in Proceedings of the 32nd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2019), pp. 1685-1695, 2019.
[4] Chris Goodall, Sarah Carmichael, Bob Scannell, "The battle between MEMS and FOGs for precision guidance", Analog Devices Technical Article, MS-2432, 2013.
[5] Hyang Kyun Kim, Michel J. F. Digonnet, Gordon S. Kino, "Air-Core Photonic-Bandgap Fiber-Optic Gyroscope", Journal of Lightwave Technology, Volume 24, Issue 8, 2006.
[6] Oguz Celikel, Ferhat Sametoglu, Huseyin Sozeri, "Optoelectronic design parameters of interferometric fiber optic gyroscope with LiNbO3 having north finder capability and earth rotation rate measurement", Indian Journal of Pure & Applied Physics, Volume 48, Issue 6, 2010.
[7] Ramón José Pérez Menéndez, "IFOG and IORG Gyros: A Study of Comparative Performance", Gyroscopes - Principles and Applications, IntechOpen, 2019.
[8] José Miguel López-Higuera, Handbook of Optical Fibre Sensing Technology, Wiley, England, 2002.
[9] Emirhan Sağ, Oğuzhan Coşkun, Güneş Yılmaz, "Modelling, simulation and balancing of a car direction with fiber optic gyroscope and fuzzy logic algorithms", in 2019 11th International Conference on Electrical and Electronics Engineering (ELECO), pp. 427-431, IEEE, 2019.
[10] Francis T. S. Yu, Shizhuo Yin, Paul B. Ruffin, Fiber Optic Sensors, CRC Press, USA, 2008.
[11] Mario N. Armenise, Caterina Ciminelli, Francesco Dell'Olio, Vittorio M. N. Passaro, Advances in Gyroscope Technologies, Springer Science & Business Media, Germany, 2010.
[12] Gerd Keiser, Optical Fiber Communications, Tata McGraw-Hill Education Private Limited, India, 2008.
[13] Emmanuel Desurvire, Erbium-Doped Fiber Amplifiers: Principles and Applications, John Wiley & Sons, Inc., Canada, 2002.
[14] William K. Burns, Robert P. Moeller, Anthony Dandridge, "Excess noise in fiber gyroscope sources", IEEE Photonics Technology Letters, Volume 2, Issue 8, 1990.
How Does the after-COVID-19 “ABCDEF” effects model affect the development of
Internet of Things and its Applications to improve Customer Experiences?
Prof. Ir Spencer Li
Co-founder & CTO of Smart Business Consultancy Limited
Professor of Hong Kong Adventist College
Guest Lecturer, The University of Hong Kong
Hong Kong, China
spencer@smartbusiness.com.hk
Abstract - Based on the author's 'after-COVID-19 "ABCDEF" effects model'1 defining an "architectural framework for the decision-making process," the paper examines how human factors and emerging technologies affect organizational behaviour in implementing the digital transformation of business processes through the adoption of the Internet of Things and its applications during the COVID-19 pandemic. Recently, 'COVID-19 has radically changed the global economy by accelerating the digital transformation to create New Normal customer experiences (CX).' This paper summarizes "next experience" initiatives by applying six pivotal elements of the 'after-COVID-19 "ABCDEF" effects model - Artificial Intelligence, Blockchain/Big Data, Customer Experience, Digital Transformation, Emotion, and Fintech.'1

Keywords - customer experience, digital transformation, internet of things, IoT, after-COVID-19 "ABCDEF" effects model

management should consider these three layers in the "next normal" era.

A. Customer Experience
A recent report from McKinsey emphasized optimizing customer journeys rather than merely focusing on touchpoints. Customers expect to receive an excellent end-to-end customer journey with clear customer experience (CX) goals. McKinsey's research concludes that "customer journeys are more strongly correlated with business outcomes rather than touchpoints."2,3

Even when each individual touchpoint is highly satisfying, the multiplier effect across touchpoints cumulatively decreases satisfaction with the end-to-end customer journey drastically (Fig. III).
made solutions to deliver organizational goals. "In an omnichannel world, customer care is increasingly becoming a significant factor"2,5 affecting customer satisfaction. Management ought to build up the big data captured in the customer journey, developing customer experience strategies to achieve business or corporate goals.

Customer care is always the core deliverable in the customer journey. Organization structures and functional touchpoints must be interdependent and mutually supportive.

Gartner's research said: "Eighty-one percent of customer experience (CX) leaders would compete mostly or entirely on CX. Just less than half believe CX can help the organization to drive business outcomes. Although CX aims to deliver goods and services exceeding customer expectations, only 48 percent rate their CX efforts as exceeding management expectations and only 22 percent say that their CX efforts exceed customers' expectations."2,5

"To address this challenge, Gartner unveiled the CX Pyramid, a new methodology to test organizations' customer journeys and forge more powerful experiences that deliver higher customer loyalty and brand advocacy."2,6

Recent research said that, due to technological breakthroughs and engineering improvements in wearable sensors, "smartwatches are being widely used in healthcare in the next five years. Data, collected by smartwatches, can be used for early diagnosis and remote patient monitoring."8

For the finance sector, it is very convenient to apply IoT to carry out customer services and Know Your Client (KYC) procedures.

Referring to the paper Bank 2.0 - The big shift9, most financial institutes have deployed the Bank 2.0 channel architecture to eliminate too many isolated banking and finance systems running together. Gradually, Bank 4.0 will come on stage to replace all banking systems with "Banking Everywhere but Never at a Bank."10 The paper predicts that IoT-based appliances can be used for implementing Bank 4.0 initiatives. For the banking sector, IoT plays a vital role in the business transformation from Bank 1.0 to Bank 2.0, and even to Bank 4.0.

On the investment side, more institutional investors, private equity, and venture capital funds are keen to invest in early-stage IoT projects.
Mesh Sensors
The wearables of the future would track the exact movement of the body rather than measuring heart rate, exertion, and sleep quality. They can be applied with new types of "garments which can collect sufficient data points which can be used by an app to determine the body's position in 3D space."11

Network Slicing for IoT Applications
Network slicing is the technique of applying different "latency, reliability, and bandwidth for IoT devices. Network slicing is recommended to be deployed on mission-critical IoT devices for long-term reliability."11

The manufacturers should be aware of ethical issues violating national legislation and international conventions, particularly at the United Nations level (Fig. VI).

The paper studies the in-depth implementation of IoT to enhance automation like RPA and Fintech in advanced engineering, technology, and applications.

The paper has identified the most appropriate applications of IoT to utilize the after-COVID-19 "ABCDEF" Effects Model rationales for New Normal businesses. The matching metrics (Table I) point out the implementation areas for "ABCDEF."
end customer journey, as customers become accustomed to accepting IoT appliances as a daily-life necessity.

Figure V. Eversensor
Customer care is the core element driving a better customer experience. Customer-centricity focuses on the understanding of customers' needs to derive customized strategies and solutions.

Figure VII. "after-COVID-19 "ABCDEF" Effects Model - System Architecture"1

B. Tables

Table I. Metrics of the after-COVID-19 "ABCDEF" Effects Model affecting IoT and Applications development

"ABCDEF" Effects | IoT and its Applications
Artificial Intelligence | AI brings a drastic change in digital transformation for applying IoT and Applications. The new ways of O2O services can be implemented by Robotic Process Automation (RPA) and related IoT appliances and wearables. Smart contracts generate legal contractual terms to eliminate unnecessary contractual and business exceptions in a trusted environment.
Blockchain & Big Data | With the open APIs of common data on the blockchain and big data from different stakeholders like commercial corporates, government, and NGOs, AI and ML tools can foster the development of hybrid blockchains, which deliver sophisticated services for multiple disciplines and cross-industries at a quicker pace. IoT-based alternative data drives businesses to transform and adapt rapidly.
Customer Experience | IoT and its Applications can deliver good customer care and customer-centricity services.

III. Conclusion
Development of the Internet of Things and its applications is fast. Globally, all developed and under-developed countries are building their 5G infrastructure to upgrade themselves into Smart Cities.

With IoT-based sensors it is easier to capture all data for data analytics purposes. As a result of the COVID-19 pandemic, new preventive measures like social distancing have been taken. The author has summarized the trend of human and business behaviors in the "after-COVID-19 "ABCDEF" effects model."1 The paper anticipates that IoT and applications play a vital role in the evolution of the customer journey by deploying customer journey mapping. The tripartite reciprocal driving forces amongst customer care, customer experience, and customer-centricity are shaping customer behaviour with satisfaction.

From the macro side, huge institutional investments and government funds are in the market seeking good investments in early emerging technologies. Great demand for skillful human capital and technology transfer like IP and patents are the hot topics for academics and top corporate management to consider and study.

We believe that IoT and its applications can add value to the six pivotal elements of "the after-COVID-19 "ABCDEF" effects model - AI, Blockchain, Big Data, Customer Experience, Digital Transformation, Emotion, and Fintech."1 The world is so fantastic. By 2030, humans will be connected and served
by 125 billion IoT devices, with new emerging technologies deployed at an unprecedentedly fast pace.

References
[1] Spencer Li. 2021. "How does COVID-19 Speed the digital transformation of Business Processes and Customer Experience?", Special Issue in Fintech of "Review of Business" (St. John's, New York), 41(1), 1-14, 2021. https://www.stjohns.edu/sites/default/files/uploads/Review-of-Business-41%281%29-Jan-2021.pdf
[2] Spencer Li. 2021. "How Does Digital Transformation Improve Customer Experience?", The Palgrave Handbook of Fintech and Blockchain, 487, Jun 2021. DOI: 10.1007/978-3-030-66433-6_21.
[3] Wray, Sarah. 2016. "Optimize Journeys Not Touchpoints – Here's Why and How." McKinsey & Company. https://inform.tmforum.org/customercentricity/2016/09/mckinseyoptimize-journeys-ottouchpoints-heres/. Accessed September 2016.
[4] Glagowski, Elizabeth. "The Race Is On: Rethinking Your Digital Strategy." ttec.com. https://www.ttec.com/articles/digital-customer-experience-strategysix-key-areas-focus-your-efforts.
[5] Lotz, Stephanie, Julian Raabe, and Stefan Roggenhofer. 2018. "The Role of Customer Care in a Customer Experience Transformation." McKinsey & Company. https://assets-prod.mckinsey.com/~/media/McKinsey/Business%20Functions/Operations/Our%20Insights/The%20role%20of%20customer%20care%20in%20a%20customer%20experience%20transformation/The-role-of-customercare-in-a-customer-experience-transformation-vf.ashx.
[6] Kelly Blum. 2018. "Gartner Says Customer Experience Pyramid Drives Loyalty." Gartner Inc. https://www.businesswire.com/news/home/20180730005056/en/Gartner-Customer-Experience-Pyramid-Drives-Loyalty-Satisfaction.
[7] Gartner, Inc. June 2020. "How to Leverage the Top 5 CX Trends in 2020." Gartner Inc. https://www.gartner.com/en/conferences/apac/customer-experience-australia/gartner-insights/gc-rn-top-cx-trends
[8] Dominic Hasler. 2021. "Looking after the health of your ATM fleet in a futuristic way." www.fintechfutures.com, 2021. https://www.fintechfutures.com/2021/03/looking-after-the-health-of-your-atm-fleet-in-a-futuristic-way/
[9] Peter Mũya H. 2012. "Bank 2.0 - The big shift." https://www.slideshare.net/themuyas/bank-20-the-big-shift
[10] Brett King. 2019. "Bank 4.0: Banking Everywhere but Never at a Bank", 2019.
[11] Dylan Martin. 2021. "5 Emerging IoT Technologies You Need To Know In 2021." www.crn.com, 2021. https://www.crn.com/news/internet-of-things/5-emerging-iot-technologies-you-need-to-know-in-2021
[12] World Economic Forum. 2019. "White paper: AI Governance – A holistic approach to implement ethics into AI." January 2019.
[13] Kyoko Tamur, Raghu Gullapalli. 2019. "The Secret to Maximizing the Industrial IoT." Accenture, 2019. https://www.accenture.com/_acnmedia/pdf-108/accenture-apac-insight-dsf-tamura-final-lowres.pdf
[14] United Nations Industrial Development Organization (UNIDO). 2020. "COVID-19 Implications and Response—Digital Transformation and Industrial Recovery." Vienna, Austria: UNIDO. https://tii.unido.org/news/covid-19-digital-transformation-industrial-recovery
[15] Businesswire. 2018. "Gartner Says Customer Experience Pyramid Drives Loyalty, Satisfaction and Advocacy." businesswire.com, 2018. https://www.businesswire.com/news/home/20180730005056/en/Gartner-Customer-Experience-Pyramid-Drives-Loyalty-Satisfaction.
[16] Capgemini. 2020. "Capgemini Customer Experience." capgemini.com, 2020. https://www.capgemini.com/service/digital-services/customer-experience/
Drawing as a Scientific Method. The School of Agricultural
Engineers in Madrid: a case study
Jara Muñoz-Hernández
School of Architecture (ETSAM)
Polytechnic University of Madrid (UPM)
Madrid, Spain
jara.munoz@upm.es
https://orcid.org/0000-0003-2530-2892
Abstract—In the current debate on the application of new media in the documentation of architecture, the need to preserve the values of the architectural drawing tradition stands out. This paper proposes its core role in the concept of graphic reconstitution as a method of integration of new and traditional media for the advancement of knowledge and the dissemination of architecture. A research project applying this method to the School of Agronomists in Madrid is used to exemplify this theory.

Keywords—architectural drawing, graphic reconstitution, School of Agricultural Engineers, Madrid's Ciudad Universitaria

I. Introduction

For architecture, drawing is, inevitably, the instrument of thought, concretion and communication. From the first moment of ideation of an architectural element to the resolution of details on the construction site, the entire project process goes through being drawn [1]. This paper aims to analyze how drawing, in addition to being the language of architecture – and of many other technical disciplines –, is also an absolutely effective scientific instrument when it comes to researching lost or modified architectural heritage, and even architecture that was devised but never built.

This graphic methodology has been applied in my PhD thesis, which deals with the birth and development of Madrid's School of Agricultural Engineers, the first university complex to be built on what is today the main campus of the capital, the Ciudad Universitaria. Based on everything analysed in the thesis, the aim here is to synthetically present the method, take the School of Agronomists as a case study and show the results obtained, in such a way that the method can also be applied to other areas of study.

This research is not an isolated work, but is included within a broader research framework related to the drawing of the city and architecture, from which various scientific articles [2] and several PhD theses have been produced in the old research group of the Technical School of Architecture of Madrid, Drawing and Documentation of Architecture and City. All of them are studies that start from work begun a long time ago, whose first verifiable result is the book La forma de la villa de Madrid [3].

II. A case study: The School of Agricultural Engineers

Nowadays, the northwest corner of Madrid is unfailingly linked to the Ciudad Universitaria. The social, political, urban and symbolic value of the campus has ended up imposing its name over this whole area of the city, relegating to oblivion – or reserving them for specific places – the names bound to the origin of this space: Florida and Moncloa.

In 2019 the School of Agronomical Engineers celebrated its 150th anniversary. It was the first institution established on these territories, coinciding in time with the end of the Royal Estate of La Florida and La Moncloa. Its status as a Crown property had indeed preserved the estate from urban growth, and it would also be so in the following decades, once the State decided to transfer it to the agronomists, thus starting its educational and research trajectory. Today the School occupies a small part of the campus, but in 1869 the entire La Florida estate had been surrendered to the institution. The grounds would later be occupied by various charity, sanitary and recreational centers. La Moncloa was hence configured as a natural park facing Casa de Campo, performing a transition between the city – which had reached its limits with the construction of the Argüelles neighborhood – and the mount of El Pardo. The fusion of the cultivated land and the public garden, together with the distinctive topography of the landscape, gave this area a picturesque character that enjoyed great success among the people of Madrid.

La Moncloa as a Royal Site until 1869 and as a university campus since 1927 has already been studied. However, the time in between those two dates has only been partially researched in several papers. Consequently, its architectural and urban aspects had not been approached in a global way. It is yet to be understood how the territory and pre-existing constructions were occupied after the transfer to the State, how the area developed during the sixty years until the creation of the Ciudad Universitaria, and how this project coexisted with the School of Agronomists and other institutions until the Spanish Civil War (1936-1939). Unfortunately, the excessive and uncontrolled growth since the late sixties almost completely erased the traces of the past.

This paper aims to approach all these questions through drawing, which is understood as a source of information, an instrument of analysis and a means of expressing results. In this way, graphic narration becomes an essential and enriching complement to the written discourse, showing the research area at various key moments of its development. The result is a sequence of drawings of one same place at different moments in time that allows recovering – in a virtual way – the lost memory of this place, as well as contributing to the knowledge of the urban form of Madrid.

III. Method

In order to achieve the objectives described above, systematic work in archives and libraries has been carried out. However, the main methodological instrument of this research is drawing. This work can be framed within lines of research that have shown that drawing is an effective tool in the analysis of the urban form and in the transmission of results and conclusions.

In another sense, drawings are also a source of information. Thus, our work is nourished by the documents that architecture and the city generate and that are created during their development. The set of all these drawings is what has been called the graphic life of buildings [4].
Most of the graphic documentary sources examined in this research come from libraries and public archives. We will not stop to analyze the origin of the sources – [5] can be consulted for more information – but it is convenient to refer to the most important ones. Obviously, the information that each archive can provide is closely linked to the changes in ownership of the area of study. As La Florida was a property of the Crown, the General Archive of the Palace contains written documentation and plans from that historical period. Later, once these lands were transferred, the information is kept in other archives. The General Administration Archive, which preserves the documentation of the different Spanish ministries from the second half of the 19th century and the 20th century, contains information on the project of the School of Agronomists and a large part of the multitude of buildings that were built in the surroundings of the Palacete de La Moncloa. From the 1920s, when the Construction Board of the Ciudad Universitaria was constituted, and especially after the Civil War, the General Archive of the Complutense University of Madrid is also an essential place of consultation.

The systematic collection of information is essential as a starting point, both for written work and for the production of graphic documentation. However, this compilation does not end at a fixed point; rather, research and work feed one another. It is clear in the drawing of plans, for example, that with a good collection of old graphic documentation, writings, etc., it is possible to begin to draw, but it will always be necessary to fill in the gaps, which forces a new search for documentation while those plans are being generated. It is this back-and-forth process that builds the graphical base of which we spoke before, one that must be understood not as a closed and finished product, but rather as a work into which information can continue to be poured in a constant process of expansion and growth.

IV. Drawing as a scientific instrument

Up to now, drawing has been discussed from the point of view of a documentary source. However, in addition to its value as a container of information, it can also be considered as a tool for thought and analysis, as a project towards the past – of what existed or what could have existed – or as a method to illustrate and reflect the results of an investigation. All these meanings will be taken on by the accompanying drawings, which are the research we are talking about.

It is worth stopping at the use of drawing in a scientific and rigorous way, analyzing the original documentation, whether graphic or written, and synthesizing it in a graphical, georeferenced database, which can give rise to complete planimetries, 3D models and images, in what we understand as a graphic reconstitution of the case study: "... the term reconstitution would be reserved for those drawings that attempt to reflect one or more states of the building that no longer exist or that never existed, but that could be part of its biography. Note that the important difference is that in the second case, given the almost always incomplete data, it is usually necessary to introduce a certain dose of interpretation we would like to assimilate to a certain idea of a project" [4].

On this same basis it is possible to take a trip to the past, in which drawing is an essential means of shaping that graphic life of buildings. In this process of graphic reconstitution, the most important thing is to minimize the data subject to speculative interpretation. Obviously, this uncertainty is accentuated the further back one goes on the timeline.

The first step of this graphic reconstitution is to set those elements which remain, that is, that existed then and still exist today. In order to do so, they are identified on the current urban parcel map, which we consider the most reliable georeferenced cartographic source. These elements are the only certainties that are available, and they serve as a reference for graphic reconstitution [3]. These persistences can be buildings or constructions, that is, elements with a clear material dimension, or they can be elements related to the orography, water courses, roads or property limits – which can manifest themselves in a variable way at different historical moments, either through built boundaries, fences, street names... –, that is, elements with an immaterial character [6].

The second step is the location, based on the reference marked by the persistences, of missing or transformed elements for which there is measurable documentation. In the case of La Moncloa, which for a long time was a sparsely built area, it happens that some buildings are built on top of others, or some parts are reused for their development, which also makes it possible to take the current building as a reference to locate the old one (Figure I).

Figure I. Process of graphic reconstitution of the location and layout of the Porcelain Factory (bottom), taking as a reference the current building of the School (top), its layout in 1927 and the situation with respect to this of the Machine Building (center).
For example, in the case of the teaching building of the School of Agronomists, we have nowadays the original building partially enlarged and modified, as can be seen in the upper drawing of Figure I. From it, we can draw the layout of the original building, of which we keep plans. And, once the original building is located, it has been possible to locate one of the auxiliary buildings that existed in the 20th century and that were later demolished. This is what the middle drawing represents, where the footprint of the current building that has served as a guide is kept as a red line. Finally, and following the same graphic codes, we have been able to locate the building prior to the current one of the School of Agronomists, which had been the Crown Porcelain Factory [7]. This has been possible thanks, on the one hand, to knowledge of the situation of the secondary buildings, which coexisted in time with this construction, and, on the other, to information obtained from historical photography: the Porcelain Factory and the new School of Agronomists coincided on their north façade. It was relatively easy to obtain the floor plan of the Porcelain Factory building, since we have some historical plans with measurements. However, it was not so easy to place it in a space with very few built references. This is an example of how drawing, rigorously built, allows us to obtain this information. In the same way, the plans of the complete area that will be shown later have been produced.

Naturally, matching historical cartography directly to one another or to the current city maps is a useless task, since the different projection systems and the distortions and inaccuracies of the survey methods of each period prevent it from being transferred directly onto the current parcel map. This problem is even more accentuated in sparsely built areas, where it was more difficult to establish benchmarks from which to measure. For this reason, in this case, references have been sought in the present, or in the more controlled past, "building by building", in order to later be able to draw with precision those buildings for which there was data. These data have basically been obtained from individual projects or, also, from the measurements of the polygons of some buildings that appear in the field sheets at a scale of 1:500 that were drawn for the subsequent elaboration of the urban plot plan of the Statistical Board (1860-1870), which are preserved in the National Geographic Institute.

The third step of the process consists of defining those elements whose precise shape or location could not be determined from dimensional documentation, and for which there are currently no references that can be taken. It is then that one enters the field of interpretation and speculation, in a process very similar to that of an architecture project, which does not mean that there are no reference elements on which to rely in order to establish sensible hypotheses: historical cartography, written documents, press, photographs, paintings and engravings...

Once this graphic reconstitution has been carried out and reliable drawings have been obtained, one can experiment with the tools at our disposal – the most common ones, newer ones, or a combination – to decide how to display these drawings.

The comparative method is one of the most powerful tools for analyzing and understanding the urban form of cities or, in this case, of the same city at different moments in time. By referring all the drawings to the same cartographic base, with a common scale and the same graphic variables, we can establish a "graphic parallel": "We would thus arrive at the concept of graphic parallel, which we could enunciate as the result of presenting, at the same time and according to the same criteria, a series of different architectural facts in order to classify or compare them. Thus, the graphic parallel becomes a preliminary step in any systematic study with a scientific vocation in which more than one architectural object is involved and which has more or less form as a direct reference" [8].

In this case study, the application of the graphic parallel has consisted in comparing the same spatial area at different moments in time. This area has been drawn at two different scales, which make it necessary to approach the drawing in different ways.

A. Spatial scope of the drawings and temporal sequence

From this case study, three packages of plans of the same historical sequence have been made, each one with a different spatial framing (Figure II): an urban one, at a scale of 1:7,500, in which the territory of La Moncloa occupied by the agronomists is represented, together with its relationship with the city and with structural elements of the landscape, such as the Manzanares River or the Casa de Campo. The second and third frames, both at 1:2,000 scale, reproduce the surroundings of the School of Agronomists and the whole of the Model Farm that was developed in the vicinity of the Palacete de La Moncloa. This second close frame, due to historical circumstances, disappears in the time scenes after the Civil War. Architecture is already represented at this scale, and includes the graphic reconstitution of all those buildings for which a minimum of information has been found to establish hypotheses.

Figure II. Spatial scope of the research over Madrid's satellite view.

Temporally, key moments in the development of the urban form of the study area have been chosen, at which its image has been "frozen" in order to study it:

-1870, as the moment of the transfer of the property and the preliminary state of the place.

-1890, a state prior to the end of the century and the appearance of institutions other than the School, the year in which the Alfonso XII Agricultural Institute was founded, the
first refurbishment works were undertaken and the expansion
process started.
-1910, as a state immediately prior to the new building of
the School and the date in which the development of the
institution is already remarkable. This section has been drawn
from the population plan of the same year.
-1927, the last moment of the School before the
construction of the campus (Figure III).
-1936, date marked by the beginning of the Civil War.
-1955, a state in which the first reconstruction projects of
the School building had already been undertaken.
-2020, as the current image, the result of the processes seen throughout the work.
order to be able to compare them with the original project plans or subsequent renovations. Data collection is essential in the production of close-scale documentation. Through the first sketches, the order of the buildings, the modulation and the existence of repeated elements are observed.

From these plans, 3D surveys have been carried out, either modeling with a high degree of detail, to obtain descriptive drawings, or performing a simple extrusion and afterwards mapping the original plans onto it, for models with a more analytical purpose. For the modeling of the School of Agronomists, the order module – including column, cornice, frieze and brick panel – has been built and the complete building reconstructed from it.

The interest of modeling goes beyond obtaining three-dimensional images of a building that no longer exists today: by studying current and old photographs and placing the same cameras within the model, highly accurate comparisons can be made that allow reconstructing the landscape of the primitive Ciudad Universitaria [11].

These working methods should not be considered as separate elements, but as a set of combinable tools. Working with photography is a round trip, always in conjunction with the rest of the graphic and written methods. The first step in making a photographic comparison of a specific area or building is to know what has been written about it and to study the different planimetries in order to establish the time and place where the image was taken. It is important to check each photograph and identify it well before continuing to work with it, as photographs with totally wrong descriptions are often discovered. One of the consequences of the Civil War was precisely the destruction and disorder of the documentation housed in the faculties. The widespread chaos after the war caused many photographs to be quickly and poorly cataloged.

Once the photographed place has been identified, two- and three-dimensional drawing acquires an important role, since, through the floor plans, we can roughly locate the position of the photographer, and later we can achieve even greater precision by means of the placement of cameras in three-dimensional models. Once located, current photographs are taken on the ground, looking for the studied point and making the last position adjustments that are necessary, provided that the current state of the place allows it.

Finally, after taking two photographs of the same place at different times, they can be compared and analyzed in depth. The meticulous study of these photographs helps to specify details that do not appear in the plans, or to adjust for the modifications that were made during construction and were not reflected in the plans. They are also a valuable source of information to learn about the original finishes of these buildings and see how they have changed. The "skin" of buildings is one of the parts that undergoes most transformations over time, so this will be the most reliable document for its reconstruction.

Once the drawings have been constructed, one must think about how to display them to explain the research, that is, in its narrative dimension. Again, the role of drawing is very interesting because it allows us to establish narrative lines at different levels: absolutely descriptive of reality, as seen in some examples above, or partially descriptive, allowing us certain licenses that let us better show ideas. For example, in the case of the School of Agronomists, the symmetry of the main facade of the building has been used to show, on one side, the original building and, on the other, the current one, so that, in a very synthetic way, it is possible to understand the transformation of the building (Figure V).

Figure V. Main elevation of the School of Agricultural Engineers. On the left, the façade in 1936; on the right, the current façade.

Another narrative level of great interest is that of analytical drawing. In this case, simpler drawings are usually used, with a lower degree of detail, but which nevertheless condense a large amount of information. In this paper two examples are offered: on the one hand, that of the constructive evolution of the School of Agronomists, where, in a schematic way and with a color code, the change of the building is shown since its initial construction (Figure VI).

Figure VI. Evolution of the School of Agronomists from 1869 to the present.

On the other hand, thanks to the fact that all these drawings have been scientifically built, a certain precision can be guaranteed, which has enabled me to quantify the surfaces built in this territory, in addition to determining which surface was destroyed during the Civil War and how much was rebuilt. The conclusion is the following: in the area dedicated to teaching, the percentage of destruction was very similar to the campus average – 42%, although the School of Agriculture building suffered especially – but in the Model Farm and the surroundings of the Palacete the destruction was practically total (86%) and, in addition, it was decided not to rebuild it, contrary to what happened with the teaching environment. With this decision, the School was de facto
incorporated into the campus as a whole, and the hegemony that until that moment it had held, at least in extension, was eliminated.

V. Conclusion

In an investigation where drawing is an inherent part of the research, it also becomes the result and conclusions of the work. Thus, graphic material that did not exist to date has been produced, representing a space in the city of Madrid at different times, in a constant journey between territorial and architectural scales.

The comparative analysis of these historical sequences allows, thanks to the unity of scale, framing and graphics, a fluid reading of the evolution of the field of study in two aspects: the spatial, which orders the elements at a specific moment, and the temporal, which orders the transformations successively (Figure VII).

As indicated above in reference to the location of already known places and constructions, drawing is precisely the tool that has made it possible to position them in specific coordinates and plot them in the plans prepared for the thesis.

On the other hand, drawing has also been considered in its narrative dimension, understanding it as an essential complement to written discourse. This graphic narrative supposes, in itself, a contribution to the knowledge of the city, insofar as it provides an image of it that did not yet exist. In addition, the support of the drawings, which is not only physical on paper but also digital, since they have been made with computerized means, generates a georeferenced database that becomes a piece of a much larger set of previous research on the city of Madrid, available for future studies. In this sense, there is also a possible further development of the research in means of representation towards more applied aspects, since the abundant graphic information that has been produced here could be organized in a Geographic Information System. This path has already been started by building all the drawings on a common digital basis, from which a database could be generated in which the current campus and its previous states could be considered. This would work as a starting document for any urban development intervention, diffusion activity and even heritage recovery project.

Figure VII. Drawings of the research area at each of the key dates.

References
[1] Jorge Sainz, El dibujo de arquitectura: teoría e historia de un lenguaje gráfico, Reverté, Barcelona, 2005.
[2] Jara Muñoz-Hernández, Carlos Villarreal-Colunga, "Las andanzas de la portada de Oñate tras la demolición de la casa-palacio: calle Mayor, Teatro Español, La Moncloa", Arqueología de la Arquitectura, volume 17: e094, 2020. https://doi.org/10.3989/arq.arqt.2020.003
[3] Javier Ortega-Vidal, Francisco José Marín-Perellón, La forma de la villa de Madrid. Soporte gráfico para la información histórica de la ciudad, Dirección General de Patrimonio Histórico, Madrid, 2006.
[4] Javier Ortega-Vidal, Ángel Martínez-Díaz, María José Muñoz-de-Pablo, "El dibujo y las vidas de los edificios", EGA Journal, volume 18, 2011. http://doi.org/10.4995/ega.2011.1335
[5] Jara Muñoz-Hernández, La Escuela de Ingenieros Agrónomos en La Florida-Moncloa [PhD thesis], Universidad Politécnica de Madrid, 2020. https://doi.org/10.20868/UPM.thesis.65305
[6] Luis Sobrón-Martínez, Al Este del Retiro [PhD thesis], Universidad Politécnica de Madrid, 2015.
[7] Jara Muñoz-Hernández, "De la Fábrica de Porcelana a la Escuela de Agrónomos de Madrid", Revista de Humanidades, volume 41, 2020.
[8] María José Muñoz-de-Pablo, Ángel Martínez-Díaz, "El paralelo. Bosquejo de un método gráfico", EGA Journal, volume 23, 2014. https://doi.org/10.4995/ega.2014.2172
[9] José Luis González-Casas, Jara Muñoz-Hernández, "The urban and environmental impact of Madrid's Ciudad Universitaria: A comparison between the first campus and the post-war campus", International Journal of Sustainable Development and Planning, volume 15, issue 6, 2020. http://doi.org/10.18280/ijsdp.150612
[10] Jara Muñoz-Hernández, José-Luis González-Casas, "Traces and scars. The reconstruction of Madrid's Ciudad Universitaria after the Spanish Civil War", WIT Transactions on The Built Environment, volume 191, 2019. http://doi.org/10.2495/STR190181
[11] José Luis González-Casas, Jara Muñoz-Hernández, "Drawing for heritage dissemination. The birth of Madrid's Ciudad Universitaria", International Journal of Heritage Architecture, volume 2, issue 2, 2018. http://doi.org/10.2495/HA-V2-N2-359-371
Urban distribution network proposal: A case study for the 14th
district of the city of Medellín.
Juan P. Vasco-Gallo J. Isaac Pemberthy-R. Eduard A. Gañan-Cardenas
Production Engineering Student Department of Quality and Production Department of Quality and Production
Instituto Tecnológico Metropolitano Instituto Tecnológico Metropolitano Instituto Tecnológico Metropolitano
Medellín, Colombia Medellín, Colombia Medellín, Colombia
juanvasco241321@correo.itm.edu.co jorgepemberthy@itm.edu.co eduardganan@itm.edu.co
service. For this purpose, an integer linear programming optimization model (MILP) was built, obtaining agile results in a first scope focused on a district in the city of Medellín, Colombia. The main theoretical contribution of this work is the development of an efficient MILP that responds to the objective of a problem in a real scenario.

Figure I. Food home-delivery search trends in the Google search engine in Colombia. Source: Google Trends ® [5].
Keywords— City logistics, Urban logistics, Operations research, Optimization.

I. Introduction

The problem of urban distribution of goods, or city logistics, has been widely studied over the years, and today is no exception, since logistics is one of the most important activities in the context of business and the developed cities of the XXI century. Nowadays, city logistics is becoming more noticeable because of the growth of e-commerce [1] and the global urbanization trends that force modern cities to offer opportunities for employment, education, culture, health and sports, among other activities, as well as the development and growth of industries. This leads to the expansion of urban areas and the increase of road traffic, and consequently to increased environmental pollution, vehicular congestion and negative social impacts, generating a poor quality of life for citizens [2], in addition to an inefficient and ineffective service that reduces the level of service of city logistics [3].
Today, given the situation of confinement generated by the COVID-19 pandemic, the demand for groceries through home delivery services or e-commerce has increased, putting more strain on last-mile logistics in the city [4]. As can be seen in Figure I, searches for grocery delivery services in Colombia have increased considerably in the last year because of the pandemic. Figure II and Figure III show that the state (Antioquia) and the city (Medellín) are among the most represented populations in the Google search index. This increase in Google® searches is a good indicator of the increase in the execution of services of the same type. In this way, urban logistics can have an impact on aspects such as traffic congestion in the city and the generation of vehicular pollutants.

Figure II. Choropleth map of the search index of grocery home delivery services by states in Colombia. Source: Google Trends ® [5].

Figure III. Colombian cities with the highest search rates during 2020 (bar chart of the Google search index, 0-100, for La Calera, Sabaneta, Pasto, Envigado, Zipaquirá, Cajicá, Bogotá, Chía, Medellín and Ibagué). Source: Google Trends ® [5].
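The figures above are built from Google Trends data. As a hedged illustration of how such series can be retrieved programmatically, the sketch below uses the community pytrends client; the library choice and the query term are assumptions, since the paper only cites the Google Trends web service [5].

# Illustrative retrieval of Google Trends series like those in Figures I-III.
# Assumptions: the pytrends client and the query term "domicilios mercado";
# the authors may have used the Google Trends web interface directly.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="es-CO")
pytrends.build_payload(["domicilios mercado"], geo="CO",
                       timeframe="2019-01-01 2021-01-31")

interest_over_time = pytrends.interest_over_time()   # time series (cf. Figure I)
interest_by_region = pytrends.interest_by_region()   # by department (cf. Figure II)
print(interest_over_time.tail())
print(interest_by_region.sort_values("domicilios mercado", ascending=False).head())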
A possible strategy proposed by some authors to address city logistics is the use of urban warehouses (WH), also known as urban consolidation centers [6], whose main function is to redirect as much as possible of the flow of goods and provide efficient transportation between the WH and the urban areas of the city, by switching from long-distance cargo vehicles to short-distance vehicles [7]. This is evidenced in works where the decision of whether to install urban WH is made by applying mathematical models; in one case, integer linear programming (MILP) is applied to define the strategic installation of WH in the urban perimeter, concluding that satellites should be installed on the peripheries of the city [7]. Another notable case is based on a descriptive-survey methodology, applying a multicriteria structure for the sustainable implementation of urban WH in cities, where the results recommend not installing any WH in a small city in Brazil [8]. A further case study applies linear programming models to the characterization of the supply chain of high-production bovine products in the province of Sabana Centro (Colombia), where the model results in opening several centers in different strategic cities for the company [9].
In this work, a case study is developed in the city of Medellín, which today ranks among the most congested cities in the world according to the INRIX index [10]. The case seeks to address a WH location problem by applying an integer linear programming model that optimizes and responds to current needs with respect to the distribution of basic products of the family basket. District 14 of Medellín, called El Poblado, is chosen as the case study. This district has special characteristics, such as a concentration of young population with a high socioeconomic level, which makes its population more prone to the use of technological services and therefore to home delivery services.

II. Materials and methods

Given the current situation, a sequence of phases has been constructed which allows proposing an improvement to the problem posed, by means of the consolidation or installation of WH.

A. Characterization and delimitation of the case of study

Medellín is the capital of the state of Antioquia and lies in the Aburrá Valley, in the center of the state. This valley is composed of 10 cities forming the Metropolitan Area. Medellín is the most populated city in the valley, with a population of 2.5 million of the nearly 4 million inhabitants of the Valley. Medellín is located in the middle of the Valley and is divided into 16 districts comprising 249 neighborhoods, as illustrated in Figure IV below.

Figure IV. Medellín districts' location and population. Source: [11], [12].

For this purpose, district 14 El Poblado is selected as the scope of the study, due to the importance of this area for the city, as it is the second district with the greatest logistic influence in the city [13]. This district has 22 neighborhoods, and its population shows decreasing growth, according to the population growth rate and demographic profile [14].

B. Input data

To find the solution to this kind of problem, different information is required depending on the model to be applied in the development of the solution.

▪ Geographical locations

Based on a guide map of the division by neighborhoods of district 14 El Poblado, an overlay of a cartesian plane is made in order to establish the coordinates (x, y) for each neighborhood under study. To do so, we decided to work with their centroid points as the reference point of location. The different candidate locations for WH are defined at points that have premises available for leasing, or spaces for industrial facilities usable as WH. Given that district 14 is mainly of urban character, there are few candidate locations in this territory in which to locate a warehouse. The locations of the centroids of each neighborhood (red dots), together with the candidate locations for WH (blue triangles), are shown in Figure V.
Figure V. El Poblado district's location, neighborhood centroids and WH candidate locations.

▪ Distance between locations

For this purpose, it was decided to work with Euclidean distances on the defined cartesian plane. The distances used for the case are calculated from the centroid points of each neighborhood to the different candidate locations for the WH. The distances are calculated using QGIS® software [15].
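As a minimal sketch of this step, the Euclidean distance matrix d_ij can also be computed directly from the centroid and candidate coordinates; the coordinate values below are hypothetical placeholders, not the study's data (which were processed in QGIS).

# Minimal sketch of computing the Euclidean distance matrix d_ij;
# the (x, y) values are hypothetical placeholders, not the study's data.
import numpy as np

centroids = np.array([[1.2, 3.4], [2.8, 1.9], [4.1, 4.0]])   # neighborhood centroids (km)
candidates = np.array([[2.0, 2.0], [3.5, 3.5]])              # candidate WH locations (km)

# d[i, j] = straight-line distance from neighborhood i to candidate j
d = np.linalg.norm(centroids[:, None, :] - candidates[None, :, :], axis=2)
print(np.round(d, 3))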
▪ Transportation and facility costs

The cost of shipping goods is assumed to be COP $819.5 (USD $0.22) per kg shipped per kilometer traveled; this cost is taken as a reference from the "Rappi® delivery" platform [16]. On the other hand, the installation cost is assumed as the average cost of renting premises such as warehouses or commercial spaces in malls or shopping centers present in the study area, which is COP $19 million on average (USD $5,201.26). This information was obtained from the platform of the local leasing company in the city [17].

C. Mathematical statement of the problem

The model objective is to find a proposal for the opening of urban WH in El Poblado district, seeking to minimize the costs of opening and of transporting basic necessities in the last mile to the various neighborhoods, while complying with the assignment of each neighborhood to only one proposed WH located at a distance of no more than 2 km. For the formulation of the problem, two binary decision variables were used: x_j, which establishes whether a WH is proposed to open in candidate location j, and y_ij, which establishes whether the attention of neighborhood i is assigned to the proposed WH in location j. Table I defines the terms used for the linear modeling of the problem.

Table I. Definition of mathematical model terms.

Sets
N: Set of neighborhoods available in El Poblado district. N = {1, 2, 3, …, n}
U: Set of candidate locations for the location of urban WH. U = {1, 2, 3, …, u}

Indexes
i: Identifies each of the neighborhoods in the district, where i ∈ N.
j: Identifies candidate locations throughout the district, where j ∈ U.

Parameters
n: Number of available neighborhoods in district 14, El Poblado.
u: Number of candidate locations for urban WH in district 14, El Poblado.
C_ij: Estimated cost of shipping one kilogram per kilometer in the city of Medellín from candidate location j to neighborhood i.
P_i: Population of neighborhood i.
d_ij: Euclidean distance in kilometers from the centroid location of neighborhood i to candidate location j.
CI_j: Cost of installing or opening a WH in candidate location j.
a_ij: Binary parameter of compliance with the maximum coverage distance for the assignment of a client i to a WH at candidate location j.

Decision variables
x_j: Binary variable that establishes the opening of a WH in candidate location j.
y_ij: Binary variable that establishes the allocation of attention of neighborhood i to a WH located in candidate location j.
The mathematical model, formulated in binary integer linear programming, is presented below with expressions (1) to (4).

Min Z = ∑(j∈U) x_j ∙ CI_j + ∑(i∈N) ∑(j∈U) y_ij ∙ C_ij ∙ P_i ∙ d_ij    (1)

∑(j∈U) a_ij ∙ y_ij = 1,   ∀ i ∈ N    (2)

y_ij ≤ x_j,   ∀ i ∈ N, j ∈ U    (3)

x_j, y_ij ∈ {0, 1},   ∀ i ∈ N, j ∈ U    (4)

The objective function (1) seeks to minimize the costs of the exercise. The first term represents the assumed cost of each opened facility; the second term sums the transportation cost from the WH at location j to each assigned customer i. The group of equations (2) guarantees the assignment of each neighborhood i to a WH facility at a candidate location j in compliance with the maximum coverage limit. Equation group (3) guarantees that customers i are assigned only to active WH at candidate locations j. Finally, the group of equations (4) ensures that the variables x_j, y_ij are binary.
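The paper implements this model in Python with the CPLEX optimizer (see Section III). As a minimal, hedged sketch of the formulation, the fragment below restates it with the open-source PuLP library and toy data; the explicit forms of constraints (2) and (3) follow the verbal description above rather than the authors' printed formulation, so treat them as a plausible reconstruction.

# Sketch of the binary facility-location model with toy data, using PuLP
# instead of the paper's CPLEX setup; constraints (2)-(3) are reconstructed
# from their verbal description, not copied from the paper.
import pulp

N = range(3)                     # neighborhoods i
U = range(2)                     # candidate WH locations j
CI = [19e6, 19e6]                # opening cost per candidate location (COP)
P = [5000, 8000, 3000]           # population of each neighborhood
C = [[819.5, 819.5] for _ in N]  # shipping cost per kg per km (COP)
d = [[1.1, 2.4], [0.8, 1.7], [2.6, 0.9]]                  # distances d_ij (km)
a = [[1 if d[i][j] <= 2.0 else 0 for j in U] for i in N]  # 2 km coverage a_ij

prob = pulp.LpProblem("urban_wh_location", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", U, cat="Binary")
y = pulp.LpVariable.dicts("y", [(i, j) for i in N for j in U], cat="Binary")

# (1) opening costs plus population-weighted transportation costs
prob += pulp.lpSum(x[j] * CI[j] for j in U) + pulp.lpSum(
    y[i, j] * C[i][j] * P[i] * d[i][j] for i in N for j in U)

for i in N:
    # (2) every neighborhood assigned to exactly one WH within coverage
    prob += pulp.lpSum(a[i][j] * y[i, j] for j in U) == 1
    for j in U:
        # (3) assignments allowed only to opened warehouses
        prob += y[i, j] <= x[j]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("open:", [j for j in U if x[j].value() == 1])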
III. Results

The mathematical model was implemented in Python 3.9 and solved with the IBM ILOG CPLEX Optimization Studio 20.1.0 optimizer. The results were obtained on a 2.3 GHz Ryzen 7 computer with 8 GB of RAM, running the Windows 10 Professional operating system. The proposal seeks to establish a warehouse management scheme from the public sector, for private use, but with the aim of reducing long-distance trips for grocery services in the city. Up to this point, we have evaluated the model defined for the construction of the proposal at district level; we consider the results efficient at this scale, given that an optimal solution was found in an efficient way, with a machine run time of one second.

The results obtained propose the opening of 3 WH. The neighborhoods where the model yields the possible locations of the centers are: La Linde (WH3), Los Naranjos (WH5) and El Diamante No. 2 (WH7). The final assignment of each neighborhood to the various locations proposed for WH yielded a minimum-cost objective function value of COP $105,326,975.076 (USD $28,833.30). The results are plotted in Figure VI; the centroid points of the neighborhoods assigned to each of the WHs are highlighted by connecting lines in a different color for each of them.

Figure VI. Solution graph.

IV. Conclusion

Through this work, a proposal is defined that aims to improve the urban logistics of goods within the city of Medellín, specifically taking district 14, El Poblado, as the object of study. This proposal, as has been seen in the different reviewed works, is likely to generate significant contributions in several areas of the city, such as vehicular congestion and the generation of vehicular pollutants.

The application of mathematical models to the realities of urban logistics can be considered a great tool when looking for improvements in this framework. These models are easily adapted to the representation of real scenarios throughout a city. From our point of view, it is a tool that generates great benefits and which, nowadays, sees little visible use in decision making in the framework of public sector administration.

Finally, as future work, three main complementary ideas can be identified. (i) Assess the capacities needed for the proposed WH, to establish the space requirements to serve the assigned population. This can be done through design and sizing methods for logistics facilities, all in order to optimize and achieve cost savings per facility. (ii) Depending on the demands and transports performed, an optimization of the resources allocated to the functions of picking and transporting orders could be applied, in order to evaluate different work policies; for example, whether the transport personnel perform the picking in the warehouse or, on the contrary, the functions are separated, all with a view to evaluating the hours that need to be allocated to each task and the need for handling and transport equipment. Exact or heuristic optimization techniques can be used for this purpose. (iii) Complement the mathematical modeling to ensure load balancing, so that the neighborhoods are distributed in a more equivalent way among the various WH proposed as a solution.

Acknowledgment

This work was supported by Instituto Tecnológico Metropolitano (ITM) (Project no. P20239), in Medellín, Colombia.

References
[1] Burgos, G. (2021, January 13). Supply chain: Los retos del transporte ligero y la distribución urbana en tiempos del Coronavirus | América Retail. https://www.america-retail.com/supply-chain/supply-chain-los-retos-del-transporte-ligero-y-la-distribucion-urbana-en-tiempos-del-coronavirus/
[2] Muñuzuri, J., Grosso, R., Escudero, A., & Cortés, P. (2017). Distribución de mercancías y desarrollo urbano sostenible. Revista Transporte y Territorio, 0(17), 34–58. https://doi.org/10.34096/rtt.i17.3866
[3] Segura, V., Fuster, A., Antolín, F., Casellas, C., Payno, M., Grandío, A., Cagigós, A., & Muelas, M. (2020). Logística de última milla: retos y soluciones en España. Deloitte. https://www2.deloitte.com/content/dam/Deloitte/es/Documents/operaciones/Deloitte-es-operaciones-last-mile.pdf
[4] Neira Marciales, L. (2020, March 27). "Durante la cuarentena por el virus Covid-19 se cuadruplican en el país los domicilios". https://www.larepublica.co/empresas/domicilios-se-cuadruplican-en-tiempos-de-cuarentena-por-el-covid-19-2983817
[5] Google Trends. Accessed February 2021. (www.google.com/trends).
[6] Sopha, B. M., Sri Asih, A. M., Pradana, F. D., Gunawan, H. E., & Karuniawati, Y. (2016). "Urban distribution center location: Combination of spatial analysis and multi-objective mixed-integer linear programming". International Journal of Engineering Business Management, 8, 1–10. https://doi.org/10.1177/1847979016678371
[7] Campos Magin, J. (2015, May). "Las plataformas logísticas de distribución urbana de mercancías: un elemento de desarrollo y regulación del transporte de mercancías en las ciudades". https://upcommons.upc.edu/bitstream/handle/2117/27229/15572417.pdf
[8] de Carvalho, N. L., Vieira, J. G. V., da Fonseca, P. N., & Dulebenets, M. A. (2020). A multi-criteria structure for sustainable implementation of urban distribution centers in historical cities. Sustainability (Switzerland), 12(14). https://doi.org/10.3390/su12145538
[9] Ariza Nieto, J. A. (2013). "Modelo de programación lineal basado en la caracterización de la cadena de suministro de los productos bovinos con alta producción en la provincia de sabana centro". Journal of Chemical Information and Modeling, 53(9), 1689–1699.
[10] INRIX. (2020). 2020 Global Traffic Scorecard. INRIX Research. https://inrix.com/scorecard/
[11] Alcaldía de Medellín. (2010). Primera Parte: Generalidades. Medellín y su Población. https://www.medellin.gov.co/irj/go/km/docs/wpccontent/Sites/Subportal%20del%20Ciudadano/Plan%20de%20Desarrollo/Secciones/Informaci%C3%B3n%20General/Documentos/POT/medellinPoblacion.pdf
[12] DANE. (2019). Resultados Censo Nacional de Población y Vivienda 2018 (National population census). https://www.dane.gov.co/index.php/servicios-al-ciudadano/60-espanol/demograficas/censos
[13] Alcaldía de Medellín. (2013). Documento de rendición de cuentas a la ciudadanía para la Comuna 14 El Poblado. Periódico Cuentas Claras, 1, 8. https://www.medellin.gov.co/irj/go/km/docs/wpccontent/Sites/Subportal del Ciudadano/Nuestro Gobierno/Secciones/Plantillas Genéricas/Documentos/2013/Cuentas Claras Comuna/1 octubre/comuna 14 baja.pdf
[14] Alcaldía de Medellín. (2016). Perfil Sociodemográfico por barrio Comuna 14 El Poblado 2016-2020. 223. https://www.medellin.gov.co/irj/go/km/docs/pccdesign/SubportaldelCiudadano_2/PlandeDesarrollo_0_17/IndicadoresyEstadsticas/Shared%20Content/Documentos/ProyeccionPoblacion2016-2020/Perfil%20Demogr%C3%A1fico%20Barrios%202016%20%E2%80%93%202020%20Comuna_14_El%20Poblado.pdf
[15] QGIS Development Team, Version 3.16.7-Hannover (2009). QGIS Geographic Information System. Open Source Geospatial Foundation. http://qgis.osgeo.org
[16] Rappi®. Accessed February 2021. www.rappi.com.co/
[17] Finca Raíz. Accessed February 2021. https://www.fincaraiz.com.co/
Close Price Prediction of Day Stock Markets with
Machine Learning and NLP models
Purushoth Ananatharasa Ragu Sivaraman
Informatics Institute of Technology Informatics Institute of Technology
anantharasa.2017152@iit.ac.lk ragu.s@iit.ac.lk
Abstract— The stock market is an important indicator of the development of a country, since the transactions that happen during market hours generate large capital gains for investors and traders trading company shares. The transactions of investors and stock traders are therefore very important to keep the market alive. Stock traders in the short-term market analyze the performance of a company through the past values and performance of the company's indexes. However, analyzing those close prices alone does not help. There are many systems that forecast prices far into the future but, due to the market's highly volatile nature, the values predicted by these systems cannot always be accurate. The limitation of analyzing both past transactions and impacting features to predict the close price of a day is therefore the main research problem addressed in this research. The trained solution used machine learning models such as Random Forest Regression, XGBoost, SVM and Lasso to predict the closing prices of the next two days, together with an additional feature which analyses news sentiment, which affects the pattern of stock performance, and prompts suggestions for traders. The evaluated system accuracy was measured with RMSE, MDA and MSE; the overall accuracy was up to 97%, and the whole system was efficient and satisfactory when benchmarked against existing systems. Furthermore, the system was evaluated by domain experts and end users under well-designed evaluation criteria.

Keywords— Stock Market Prediction, Machine Learning, Random Forest Regression, Sentiment Analysis.

I. Introduction

The stock market can be considered an indicator of a country's economy, since it acts as a platform for buying and selling stocks, which includes many company indexes in sectors like agriculture, health and manufacturing that are monitored by the public in order to invest capital [1]. A capital market therefore acts as a transparent medium for stock traders and investors to evaluate the performance of the companies they trade in. Stocks are considered the most preferred trading medium, and they can be categorized into long-term and short-term stocks. Long-term stocks are considered investments with a long revenue-generating time span, while short-term stocks generate income over a short time span for a daily stock trader [2]. This trade between the public sector and the companies provides many mutual benefits. The companies get benefits such as investment gains, where the capital required to expand the business is obtained from a verified income channel. Diversification of the value of the company into another category possibly brings opportunities for revenue into the company profile [3]. From a stock trader's point of view, a person who holds company shares has an income channel from the profit gained from daily stock trades. Also, a shareholder has control over the decisions to buy or sell stocks, according to the individual's tolerance for those stocks.

A stock trader, before purchasing stocks, considers factors such as price patterns, trading patterns, public opinion and the services offered by that specific company, whereas a stock investor considers other factors like financial summaries, dividends, economic growth, cash flow, etc. Both stock investors and traders are therefore important for the stock market to function [4]. It is both of these groups that have kept stock market transactions alive every day throughout history. However, there is never a solid price for a stock for traders to trade on, because the value fluctuates in the market, making the values of the assets highly unpredictable. These fluctuations are based on capital flow in and out of financial reserves, and also on the competitiveness of the other companies in the same domains [5]. This makes stocks highly volatile and unreliable. Considering huge companies like Apple (AAPL), many factors, such as Apple events, product launches and many other internal company events, impact its stock performance in a day trade [6]. Therefore, as a well-experienced day trader, it is important to analyze all the impacting factors, like company events and past data up to a necessary time period, before beginning to trade. In fact, there are prediction models used by stock market experts to find the direction (up or down) for a specific number of days in the future, but these systems cannot predict the closest stock price, due to governing factors which make stock prices really challenging to predict for daily trading.

A system which analyzes and predicts the closing price of certain companies over a long time span can therefore mislead stock traders into making incorrect decisions and trading on non-ideal days, because predicting for a longer time span (more than 5 days) has a high chance of producing incorrect predictions, due to factors like the market sentiment of customers and unexpected global events which can affect the whole industry. A system that analyzes past data over a time span to predict the close price of the next 2 days (ideal) as its main function, and that indicates the nature of the stock index (ideal to trade or not) based on unexpected news published on the internet as another feature, using machine learning, therefore remains on the wish list of stock traders.

The factors mentioned below have to be considered when predicting the close prices of short-term (daily trade) stocks.
A. Company News

The public opinion of a stock company is mainly portrayed by the news that is published in the public news media (positive/negative) [9]. This published news leads investors and long-term and short-term traders to evaluate whether the decisions they made were wise, profitable or non-profitable. It is the news delivered to the public that attracts people to buy, sell or even invest in that specific company and perform any kind of capital transaction [10].

B. Volume of stocks

The volume of trade is another important factor for stock traders when analyzing the value a stock index has delivered on a given day, and it is also used indirectly to determine the close price of a company [11]. The volume of trade depends on the specific company. Generally, if the volume of the stock is high, it suggests that the close price might see a positive increment, since many customers are present in the trade and contribute to the increase in overall volume [9].

II. Existing Works

Analyzing the existing works gives a deep understanding of the approaches and the limitations present in current systems. After carrying out a comparison of the available systems, the close price prediction of a stock index can be done in three main approaches:

• Predicting close prices using past datasets.
• Predicting close prices using social media and financial news.
• Predicting close prices using technical indicators.

A. Predicting close prices using past datasets

An ensemble approach like Random Forest can be used to predict the stock price for specific days, and the obtained values were compared with a deep learning model in this research [13]. In this approach, the model used input parameters such as Open, High, Low and Close prices for prediction. The model was able to provide results from the Random Forest model, where the model was validated with evaluation metrics such as root mean squared error (RMSE), Mean Absolute Percentage Error (MAPE) and Mean Bias Error (MBE) on the unclassified data for five distinct instances.

However, this novel approach was challenged by [14], where LASSO and Ridge models were used to predict the close price. The obtained results were close enough to the existing works to benchmark against. To evaluate these models, the same metrics were used (MAPE and RMSE). Table I summarizes the critique of the previous researches and systems based on the past-data prediction approach, with the features and limitations that are present.

Table I. Existing work on past datasets.

Research | Features | Limitation
[13] | Predicts the price for one day. | Higher RMSE and MAPE values compared to other models.
[15] | Displays an up or down result by analyzing past datasets. | A reinvention of a traditional model with more beneficial functions.
[16] | Obtained 70% accuracy for the model. | Needs added sentiment data due to the impermanent nature of the close price.
[14] | Used LASSO and Ridge models to predict company stocks. | Did not consider any market-impacting factors or market policies.
[17] | Gave acceptable results compared to ANN and ARIMA models. | Predicted scores were affected by trader sentiment.
[18] | Used multiple ML models (XGBoost, SVC, KNN, ANN). | Deep learning models had a better confusion matrix.
[19] | The prediction was done with a Granger causal factor between input parameters. | Limited performance.

B. Predicting close prices using social media and financial news

Social media impacts the decisions of day traders involved in a trade, especially when analyzing whether it is profitable to perform a trade in the market. This impacts the overall performance of the market at close time. To examine this factor, the author of [12] carried out research analyzing social media feeds and other public news with models like SVM, KNN, NN and Linear Regression. In summary, together with other researches, the positive and negative news considerably affected the performance of the closing price. Therefore, it was decided to analyze both factors to find the direction and value of the close price. Table II summarizes the critique of the previous researches and systems based on the social media approach, with the features and limitations that are present.

Table II. Existing work on social media and news articles.

Research | Features | Limitations
[20] | Used SVM and RF classifiers. | Add more companies to the prediction model and use the same technique for algorithmic trading.
[12] | Used Random Forest, SVM, kernel factory and AdaBoost algorithms. | Models are made based on certain regional markets.
[21] | Accuracy is 56.07%. Can do prediction for large-scale datasets. | Number of topics and sentiment must be specified.
[22] | Textual representation is better than numerical datasets; bag-of-words methods were used. | For the best profit-making qualities, technical indexes such as MA and MACD had to be included.
C. Predicting close prices using Technical Indicators

In the research of [15], the team decided to use the tree-based classifiers approach to predict whether the stock prices of certain companies go up or down (a classification mechanism) for a specific number of days in the future, based on stock market indicators such as MACD and RSI. At the end of prediction, the Random Forest classifier had better accuracy than the Gradient Boost Algorithm for the selected companies. However, the team also noticed that the F1 score of both models increased when they increased the window width of the model. Since this model analyzes all these indicators, the author had to find the correlation between these input parameters to predict the output [15].

III. Methodology

From the above discussions it was decided to go with the historical transactions as well as social media and other financial news, which leverage the prediction of the close price. Therefore this section focuses on utilizing suitable models to find the close price using an ensemble method, and on the nature of the stock index based on the news. The product consists of two main components, the "Price Prediction model" and the "News Analyzer model", to predict the price and nature of a stock index. For this scenario the Apple stock index (AAPL) and Dow Jones (DWJA) were chosen.

A. Price Prediction Model

In this component an efficient model is utilized among other models by inserting a valid dataset that is fed into the regression models. The training datasets were obtained from Yahoo Finance. Before training the model, the imported datasets were pre-processed for missing values and validated so that the imported data was without noise.

Figure II: Survey from end users on most used data

B. News Analyzer Model

This model was based on NLP, which deals with human languages in the form of audio or textual information. In this scenario, which involves textual information, sentiment analysis was used for classification. This model has news datasets that were obtained from trusted news sources, like verified users on Twitter and public news figures that provide financial news on the capital market. This dataset was obtained from Kaggle, an open source data platform for data science and machine learning projects. The imported data went through a pre-processing step which consists of removing English stop words and unwanted characters that could affect the training process. Following that, tokenizing and stemming techniques were carried out to reduce the noise among the datasets. After benchmarking multiple models for better accuracy, the SoftMax logistic model was selected to classify whether the news could affect the stock price positively or negatively.

The overall accuracy of the model was calculated by the confusion matrix, which consists of components like true positive, true negative, false positive and false negative. Other additional metrics such as the Recall, F1 and Precision scores were also calculated using these formulas.
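The formulas themselves did not survive the source layout; for reference, the standard definitions in terms of the confusion-matrix counts are:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$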
Table IV. Time consumed for Model Training

Model | Time Taken (Seconds)
Random Forest Regressor | 31
XG Boost | 48
Decision Tree | 66
LASSO | 55
SVM | 51

In the same criteria, the Random Forest regression model trained in the least amount of time with the most accurate value compared to the others, given the necessary input parameters. Finally, to validate the model, K-Fold validation was carried out to check whether the model is overfitting or underfitting. The table below shows the comparison of actual and predicted values for a randomly taken sample of three days, which was convincing for suggestion and proved that the model is balanced and can be relied on for critical prediction purposes.
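As a minimal sketch of the price-prediction component as described (a Random Forest regressor on OHLC inputs with K-Fold validation), not the authors' actual pipeline; the CSV path, column names and two-day horizon are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

# Hypothetical daily OHLC data, e.g. exported from Yahoo Finance
data = pd.read_csv("aapl_daily.csv")               # placeholder path
X = data[["Open", "High", "Low", "Close"]].values  # input parameters named in the paper
y = data["Close"].shift(-2).dropna().values        # close price two days ahead
X = X[: len(y)]

rmse_scores, mape_scores = [], []
for train_idx, test_idx in KFold(n_splits=5, shuffle=False).split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    rmse_scores.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
    mape_scores.append(mean_absolute_percentage_error(y[test_idx], pred))

# Comparable fold scores suggest the model is neither over- nor underfitting
print(f"RMSE: {np.mean(rmse_scores):.4f}, MAPE: {np.mean(mape_scores):.4f}")
```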
Figure IV. Heatmap of the SoftMax model

Figure V: ROC curve of the SoftMax model

V. Conclusion

A gap for an accurate and effective suggestion system based on past data and public perception still exists in the market. This research paper discusses a novel approach to get the most accurate close price of the next two days based on the historical datasets and the impact of public news on the stock index performance. The Random Forest regression model to predict the price of the stock using machine learning was selected after comparing and benchmarking the models. The SoftMax logistic model was designed to analyze the impact of public news headlines or feeds posted on social media. Both models have been trained with the most ideal input parameters and under the most common scenarios to perform as ideal models.

For the time being, the system provides predictions for only two stock indices, AAPL and DWJA, since those indices have a highly volatile character due to many global and economic uncertainties. For future enhancements, more stock indexes will be added to the prediction system for different market regions based on their locations. Another functionality will be adding more stemming techniques to the News Analyzer Model; stemming techniques may vary based on the found datasets, such as the Lancaster Stemmer and Lemmatization, for data cleaning. Adding vectorization techniques to test exception scenarios in the trading market is another enhancement that will be added. Finally, adding the feature of analyzing technical indicators such as the Moving Average and Fibonacci average to predict the movement of a stock over a far longer time period is to be another functionality, which could ultimately address the issues faced by stock market experts and end users with the misconceptions and unexpected errors in the capital markets.

References

[1] Abraham, A., Krömer, P. and Snášel, V. (2015) 'Afro-European Conference for Industrial Advancement: Proceedings of the First International Afro-European Conference for Industrial Advancement AECIA 2014', Advances in Intelligent Systems and Computing, 334, pp. 371–381. doi: 10.1007/978-3-319-13572-4.
[2] Akita, R. (2016) 'Deep learning for Stock Prediction Using Numerical and Textual Information', 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), pp. 1–6. doi: 10.1109/ICIS.2016.7550882.
[3] Al-Jaifi, H. A. (2017) 'Ownership concentration, earnings management and stock market liquidity: evidence from Malaysia', Corporate Governance (Bingley), 17(3), pp. 490–510. doi: 10.1108/CG-06-2016-0139.
[4] Ballings, M. et al. (2015) 'Evaluating multiple classifiers for stock price direction prediction', Expert Systems with Applications. Elsevier Ltd, (May). doi: 10.1016/j.eswa.2015.05.013.
[5] Basak, S. et al. (2019) 'Predicting the direction of stock market prices using tree-based classifiers', North American Journal of Economics and Finance. Elsevier, 47(December 2017), pp. 552–567. doi: 10.1016/j.najef.2018.06.013.
[6] Cakra, Y. E. and Distiawan Trisedya, B. (2016) 'Stock price prediction using linear regression based on sentiment analysis', ICACSIS 2015 - 2015 International Conference on Advanced Computer Science and Information Systems, Proceedings, pp. 147–154. doi: 10.1109/ICACSIS.2015.7415179.
[7] Cervelló-Royo, R., Guijarro, F. and Michniuk, K. (2015) 'Stock market trading rule based on pattern recognition and technical analysis: Forecasting the DJIA index with intraday data', Expert Systems with Applications, 42(14), pp. 5963–5975. doi: 10.1016/j.eswa.2015.03.017.
[8] Clark, E. and Kassimatis, K. (2017) 'Country financial risk and stock market performance: The case of latin america', Evaluating Country Risks for International Investments: Tools, Techniques and Applications, 56(1), pp. 117–148. doi: 10.1142/9789813224940_0005.
[9] Di, X. (2014) 'Stock Trend Prediction with Technical Indicators using SVM', Stanford University.
[10] Gaillard, P. (2004) 'Rwanda 1994: "...kill as many people as you want, you cannot kill their memory"', International Committee of the Red Cross, pp. 1–24. doi: 10.6084/m9.figshare.5028110.
[11] Greenwald, D., Lettau, M. and Ludvigson, S. (2014) 'Origins of Stock Market Fluctuations', NBER Working Paper Series, 19818. Available at: http://www.nber.org/papers/w19818.pdf.
[12] Jin, Z. et al. (2020) 'The industrial asymmetry of the stock price prediction with investor sentiment: Based on the comparison of predictive effects with SVR', Journal of Forecasting, 39(7), pp. 1166–1178. doi: 10.1002/for.2681.
[13] Joshi, K., H. N, B. and Rao, J. (2016) 'Stock Trend Prediction Using News Sentiment Analysis', International Journal of Computer Science and Information Technology, 8(3), pp. 67–76. doi: 10.5121/ijcsit.2016.8306.
[14] Kim, S. et al. (2020) 'Predicting the Direction of US Stock Prices Using Effective Transfer Entropy and Machine Learning
Techniques’, IEEE Access, 8, pp. 111660–111682. doi:
10.1109/ACCESS.2020.3002174.
[15] Maio, P. and Santa-Clara, P. (2017) Short-Term Interest Rates
and Stock Market Anomalies, Journal of Financial and
Quantitative Analysis. doi: 10.1017/S002210901700028X.
[16] Nabipour, M. et al. (2020) ‘Predicting Stock Market Trends
Using Machine Learning and Deep Learning Algorithms Via
Continuous and Binary Data; A Comparative Analysis’, IEEE
Access, 8, pp. 150199–150212. doi:
10.1109/ACCESS.2020.3015966.
[17] Nguyen, T. H. and Shirai, K. (2015) ‘Topic modeling based
sentiment analysis on social media for stock market
prediction’, ACL-IJCNLP 2015 - 53rd Annual Meeting of the
Association for Computational Linguistics and the 7th
International Joint Conference on Natural Language
Processing of the Asian Federation of Natural Language
Processing, Proceedings of the Conference, 1, pp. 1354–1364.
doi: 10.3115/v1/p15-1131.
[18] Pradhan, R. S. and Dahal, S. (2018) ‘Factors Affecting the
Share Price: Evidence from Nepalese Commercial Banks’,
SSRN Electronic Journal, pp. 1–16. doi:
10.2139/ssrn.2793469.
[19] Shah, D., Isah, H. and Zulkernine, F. (2018) ‘Predicting the
effects of news sentiments on the stock market’, arXiv, (1), pp.
1–4.
[20] Skuza, M. and Romanowski, A. (2015) ‘Sentiment analysis of
Twitter data within big data distributed environment for stock
prediction’, Proceedings of the 2015 Federated Conference on
Computer Science and Information Systems, FedCSIS 2015, 5,
pp. 1349–1354. doi: 10.15439/2015F230.
[21] Victor Chow, K. et al. (1995) ‘Long-term and short-term price
memory in the stock market’, Economics Letters, 49(3), pp.
287–293. doi: 10.1016/0165-1765(95)00690-H.
[22] Vijh, M. et al. (2020) ‘Stock Closing Price Prediction using
Machine Learning Techniques’, Procedia Computer Science.
Elsevier B.V., 167(2019), pp. 599–606. doi:
10.1016/j.procs.2020.03.3
Investigation of Permeability Coefficient in Layered Soils
Kaveh Dehghanian    Mohammad Haroon Saeedi
Department of Civil Engineering Department of Civil Engineering
Istanbul Aydin University Istanbul Aydin University
Istanbul, Turkey    Istanbul, Turkey
kavehdehghanian@iau.edu.tr haroonsaidi64@gmail.com
Abstract— The hydraulic conductivity of permeable media is a critical property that depends upon different properties of the soil mass, such as porosity, size and shape of soil particles, initial water content, and compaction. As the characteristic condition, the soil mass exists in layered strata; hence it is called stratified soil. Soils are permeable materials due to the presence of interconnecting spaces that enable fluids to flow when there is a difference in energy head. The shape and size of particles, in turn, affect the interconnecting voids. Water flow through a soil mass is proportional to the size of the void apertures rather than the overall number of voids, even though void ratios of fine-grained soils are frequently greater. The relative position and thickness of a soil layer in a stratified soil system are two critical variables that determine the permeability of the composite soil layer. In this study, a series of falling head tests were performed to determine the permeability of two-layered soils using two types of bentonite and sand, as well as the Atterberg limit, sieve analysis, specific gravity, and Proctor tests. It is shown that an increase in Atterberg limits results in a decrease of permeability. The higher the specific gravity, the lower the permeability coefficient. The permeability is higher at a higher void ratio. Furthermore, the permeability of stratified soils is affected by the thickness of the end layer.

Keywords—Permeability Coefficient, Layered Soil Profile, Falling Head Test, Atterberg Limit, Void Ratio

I. Introduction

In classical soil mechanics, soil is considered a homogeneous and isotropic material. In most cases, the experiments and numerical analyses are performed for a single layer, while the soil is a layered medium in the field. The permeability coefficient is often obtained by a constant head permeability test for coarse-grained soils and a falling head test for fine-grained soils. The assessment of permeability is significant for erosion control, slope stability control, wastewater management, and structural failure due to foundation settlement problems. For layered soil systems, the layers can be horizontal, vertical, or inclined. Each layer has its own permeability coefficient, k. The equivalent permeability coefficient of the stratified deposit, keq, depends on the direction of flow with respect to the orientation of the bedding planes. The coefficient of permeability (k) of soil masses is calculated using Darcy's law. When the flow is normal to the orientation of the bedding planes, the equivalent coefficient of permeability of a stratified soil deposit is derived by

$$k_{eq} = \frac{\sum_{i=1}^{n} L_i}{\sum_{i=1}^{n} \left( \frac{L_i}{k_i} \right)} \tag{1}$$

where Li is the thickness of the ith layer in the layered profile and ki is the coefficient of permeability of that layer. The permeability coefficient of bentonite clays is quite low and is traditionally measured by the falling head permeability test. This method gives results over a long period of time, as the sample is expected to saturate [1].

Uppot et al. (1989) investigated two clays subjected to organic and inorganic permeants to study the changes in permeability caused by the reaction between clays and permeants [2]. Afterward, Haug et al. (1990) built a prototype liner formed of Ottawa sand and sodium bentonite. This material was mixed, moisture-conditioned, and compacted into reinforced wooden frames. The in situ permeability test results were verified with low gradient, back-pressure saturated triaxial permeameter tests conducted on undisturbed cored and remolded samples [3]. Sridharan and Prakash (2002) researched two-layer soil systems and demonstrated that the mutual interaction among distinct layers of different soil types forming a stratified deposit influences the equivalent permeability of the stratified deposit, which cannot simply be calculated by use of the equation for the equivalent coefficient of permeability of a stratified deposit when the flow is normal to the orientation of the bedding planes based on Darcy's law. The permeability of the exit layer controls whether the measured permeability is greater or lesser than the theoretical values for a stratified deposit [4].

Galvaeo et al. (2004) performed another test in which the coefficient of permeability of saprolitic soil increased about five times when two percent lime was added and then decreased on further addition of lime. This is assigned to the creation of chemical bonds and aggregation. As for lateritic soil, the coefficient of permeability decreased as lime was added. This is also assigned to the same mechanism, except that the bonds are weaker than those developed in the saprolitic soil [5]. Nikraz et al. (2011) carried out a series of laboratory permeability tests to evaluate the fiber effect on the hydraulic conductivity behavior of composite sand. Clayey sand was selected as the soil part of the composite and natural fiber was used as reinforcement [6]. Sridharan and Prakash (2013) conducted a comparative study of the measured equivalent coefficient of permeability of three-layer soil sediments against the theoretically calculated values. The results demonstrate that, by and large, the coefficient of permeability of the bottom layer controls whether the measured value of the equivalent coefficient of permeability is greater or lesser than the theoretically calculated value. Also, when a stratified soil deposit contains more than 3 layers, different combinations of positioning of layers of different k values are possible. Hence, in such cases, it becomes difficult to predict whether the measured value of keq is less than, equal to, or more than that calculated [7]. The consequence of this observation is the realization that the equivalent coefficient of permeability of any layered soil deposit is not just dependent upon the values of k of the individual layers constituting the deposit, and that it also depends upon the relative positioning of the layers in the system. Sridharan and Prakash (2002) studied two-layer soil systems with equal thickness layers, whereas in the current research, sand and bentonite were treated in varied thickness layers with varied sample sizes. The main goal of this paper is to measure and compare various sizes of sand and bentonite in soil layers.
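As a worked illustration of Eq. (1) (not part of the original study), a short Python sketch computing the equivalent coefficient for a hypothetical two-layer sand-bentonite profile:

```python
def equivalent_permeability(layers):
    """Equivalent permeability for flow normal to the bedding planes, Eq. (1).

    layers: list of (L_i, k_i) tuples, with thickness L_i in cm and
    permeability k_i in cm/s.
    """
    total_thickness = sum(L for L, _ in layers)
    resistance = sum(L / k for L, k in layers)  # sum of L_i / k_i terms
    return total_thickness / resistance

# Hypothetical two-layer profile: 5 cm of sand over 5 cm of bentonite
# (illustrative k values only, not measurements from this study)
print(equivalent_permeability([(5.0, 1e-3), (5.0, 5e-8)]))
```

Because the sum of L_i/k_i is dominated by the least permeable layer, the computed keq is close to the bentonite value, which is consistent with the observation above that the exit layer controls the measured permeability.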
II. Experimental study

Within the scope of the article, soil samples with varied proportions were employed. Each soil sample's studies are reported separately and in chronological sequence. Bentonite clay is a natural clay with a delicate, silky texture; when combined with water, it makes a paste. The Elito Bentonite Clay was manufactured from a district in Izmir, southwestern Turkey, and the sand came from concrete plants and an area close to the university campus. The tested samples were selected in the following percentages: 20% bentonite clay + 80% sand, 30% bentonite clay + 70% sand, and 40% bentonite clay + 60% sand. The uniformity coefficient (Cu) indicates the variance in particle sizes in soil and is defined as the ratio of D60 to D10. D60 denotes the grain diameter at which 60% of soil particles are finer and 40% are coarser, whereas D10 denotes the grain diameter at which 10% of particles are finer and 90% are coarser [8]. Table I depicts the values of Cu and Cc for the samples.

Atterberg limits tests were performed according to the methods still being used to determine the Liquid Limit, Plastic Limit and Shrinkage Limit of soils, which are outlined in ASTM D4318 and TS 1900-1 [8]. Table II depicts the values of Plastic Limit (PL), Liquid Limit (LL), Plasticity Index (PI) and Shrinkage Limit for the soil samples. The samples were left in the oven for 24 hours to examine how much they shrink as the temperature rises; the more sand, the lower the shrinkage limit becomes.

Specific gravity tests were also done based on the TS 1900-1 standard. The specific gravities of the combinations are as follows:

60% sand + 40% bentonite = 2.521
70% sand + 30% bentonite = 2.548
80% sand + 20% bentonite = 2.553
100% bentonite = 2.318
100% sand = 2.588

According to the specific gravity test results, the 100% sand has the highest specific gravity of the test, and the 100% bentonite clay sample has the lowest. As the amount of sand in the composite sample increases, the specific gravity of the sample increases too. Before beginning the falling head permeability test, the amount of water added to each test must be considered carefully. The Proctor test defines directly how much water should be added to the bentonite and sandy soils. The results of the Proctor test are shown in Table III. The falling head tests were performed based on the CEN ISO/TS 17892-11 standard, and the results are shown in Table IV.

Table I: Determination of Cc, Cu (columns: Ratios, Cu, Cc)

Table II: Atterberg Limits test result

Table III: Determination of amount of water and specific gravity test (columns: Sample weight (gr), Sample ratios, Water amount, Specific gravity)

Table IV: Permeability result for mixed samples
"Where ain = cross sectional area of the reservoir III. CONCLUSION AND DISCUSSION
containing the influent liquid; out = cross sectional area of the
reservoir containing the effluent liquid; A = cross sectional
area of the specimen; t = elapsed time between the In the laboratory, different types of permeability tests were
determination of h1 and h2; h1 = head loss across the specimen carried out using bentonite and sandy varying proportions soil,
at time t1; h2 = head loss across the specimen at time t2" [7]. and the equivalent permeability coefficient was determined for
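The expression these symbols belong to did not survive extraction; as an assumed reconstruction, the standard falling-head form with separate influent and effluent reservoirs (as used in CEN ISO/TS 17892-11 and ASTM D5084) is

$$k = \frac{a_{in}\, a_{out}\, L}{(a_{in} + a_{out})\, A\, t}\, \ln\!\left(\frac{h_1}{h_2}\right)$$

with L the specimen length, a symbol not among the recovered definitions.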
III. CONCLUSION AND DISCUSSION

In the laboratory, different types of permeability tests were carried out using bentonite and sand in varying proportions, and the equivalent permeability coefficient was determined for two types of layered soils.
Table VII summarizes the results of the measured and theoretical values. In most situations, the observed permeability values exceed the theoretical values, as seen in 45 percent to 60 percent of cases, greater than the previous sample. However, the optimum water content of the sand is 17 percent of its weight, as before.

Take 2200 g of sand and 110 g, 220 g or 330 g of bentonite as representative soil and mix with water if necessary:

For 100% sand: water at 17% of the weight of the sand sample should be added.
For 100% bentonite clay: water at 60% of the weight of the bentonite clay sample should be added.
For 5% bentonite clay + 95% sand: 66 g + 374 g water should be added.
For 10% bentonite clay + 90% sand: 132 g + 374 g water should be added.
For 15% bentonite clay + 85% sand: 198 g + 374 g water should be added.

Figure III: coefficient of permeability and dry density (First Sample).

Table IX: k and γd for different ratios

Sample Ratio | Dry density (kg/cm³) | K (cm/s)
95%S+5%B | 19.7 | 1.1
90%S+10%B | 19.21 | 1.68
85%S+15%B | 18.3 | 1.75

The graph of the old sample shows that the permeability coefficient increases when the dry unit weight increases.
Figure IV: permeability and void ratio for old sample (composite).

Figure VI: permeability coefficient and specific gravity.
Table X: Permeability and void ratio for different samples.

Ratio | K (cm/s) | Void Ratio
80%S+20%B | 1.73E-06 | 0.891
70%S+30%B | 1.91E-07 | 0.887
60%S+40%B | 5.48E-08 | 0.862

Table XII: Gs and k for samples.

Ratio | Permeability coefficient (cm/s) | Specific gravity
80%S+20%B | 1.73E-06 | 2.55
70%S+30%B | 1.91E-07 | 2.56
60%S+40%B | 5.48E-08 | 2.57

The void ratio increases owing to permeability, the larger quantity of voids in the particles, and the flow area in the samples. In bentonite, flow in the already small channels is further hindered because some of the water in the voids is absorbed or adsorbed on the bentonite particles, reducing the flow area and further restricting the flow. Therefore, K_bentonite <<< K_sand.

An increase in specific gravity was observed with increasing bentonite content, due to the high specific gravity of bentonite. As the specific gravity increases, the permeability coefficient decreases.
Table XIII: k and LL of samples.

Ratio | Composite Permeability (cm/s) | Stratified Permeability (cm/s)
95%S+5%B | 1.38E-04 | 0.000175
90%S+10%B | 1.32E-04 | 0.000168

The measured permeability is greater than the theoretical permeability.

For layered soils, the permeability decreases as the bentonite increases, and the permeability of the composite increases as the bentonite increases.

The flow rate increases with the increase in the hydraulic gradient.
An AI-based Embodied Digital Human Assistant for
Information in University
Munia AlKhalifa Kasım Özacar
Department of Computer Engineering Department of Computer Engineering
Karabuk University Karabuk University
Karabuk, Turkey Karabuk, Turkey
munia.ak@outlook.com kasimozacar@karabuk.edu.tr
Abstract—Empowered by artificial intelligence (AI), digital assistants are taking an essential role in our lives as they serve the needs of people within many domains, such as providing customer service and translating. The two well-known types of assistants, text-based and voice-operated, are agents that answer the questions or serve the requests of users depending on the data that is given to them. Techniques used for building these agents differ depending on multiple measures, and the methods these agents follow vary from basic, simple rules to state-of-the-art techniques. However, to achieve a natural and accurate interaction like we do in our everyday life, we contribute a voice-based digital assistant integrated into a virtual human whose aim is to serve at a university, working as a friendly assistant that answers the questions of students, newcomers, or visitors. We built and trained multiple neural network models and combined them to have a human assistant responding to any query related to the university while being expressive and interactive.

Keywords—Human-Computer Interaction, Artificial Intelligence, digital assistant, chatbot, deep learning

I. Introduction

AI-based digital assistants have become widespread as their affordability and efficiency make them a key element and a recognized example in all industries, empowered by simple to advanced artificial intelligence techniques. These agents are also known as digital assistants, which can perform different tasks as well as being capable of mimicking human impressions in conversations, along with providing the requested service in applications like e-commerce, information retrieval, and education [1].

A big challenge is the goal of developing assistants that interact naturally with humans and at the same time generate much more natural or human-like conversations, making them indistinguishable from those of a human during normal open-domain or closed-domain conversations that are used in providing service and help to users. In addition to conversation flow and service provision, another challenge is to make the chatbot act not only as a tool, but also as a friend [2].

During the last two decades, technologies such as speech recognition, natural language understanding, and text-to-speech synthesis have been the interest of much research, resulting in well-known digital assistants like Apple's Siri [3]. Scenarios for digital assistants have been extended to a variety of areas such as tutoring [4], health care [5], forecasting [6], translation [7] and navigation [8]. Digital assistants can answer both simple and complex questions, provide information and recommendations, initiate conversation occasionally and make predictions.

In addition to the importance of question responding, face-to-face communication achieves notable effects on people; thus, it will offer better performance and more convincing interactions. Therefore, in this paper we build a virtual human as a digital assistant, which works as a guide and information provider about our university by answering the questions of users and interacting with them.

The contributions of the paper are as follows: 1) we introduce our idea of a university digital assistant, 2) we go through some previous research in the same field, 3) we explain the methodology of our work, 4) we explain the tests and experiments, 5) we discuss our work and describe the drawbacks, and 6) we conclude our paper.

II. Related Work

Conversations and communication between human and machine have been receiving attention since the early 1960s. ELIZA was the first natural language processing program and chatbot, which worked as a psychotherapist and used simple pattern matching [9]; then came IBM's Shoebox voice-activated calculator, and over the years big companies started developing speech recognition systems and machines.

Another chatbot was launched on platforms like MSN Messenger and was assigned simple tasks such as checking the weather, conversing with users, and looking up facts [10]. Parry [11] is an improvement over ELIZA, having its own personality. In 2000, 2001, and 2004 a chatbot named ALICE won the Loebner prize due to its high similarity to humans [12], though it relies on a simple pattern-matching algorithm
based on the Artificial Intelligence Markup Language (AIML) [13].

Afterwards, many virtual personal assistants like IBM Watson [14], Amazon Alexa [15], Google Assistant [16], Microsoft Cortana [17], and Apple Siri [18] came to life. By now we can see that chatbots can be text-based or vocal, and they are utilized in different industries including marketing, support systems, education, healthcare, cultural heritage, and entertainment. Hence our digital assistant, which works as a guide and answers questions while interacting with humans.

III. Methodology

We aim at a digital human assistant that answers any given question regarding the university. We summarize the architecture in Figure I (A) and (B). In the coming sections, we explain each part in the system.

A. Speech Recognition

Speech recognition, or Speech to Text, can be defined as converting the speech sound signal into instructions or words to give machines the ability to respond to these commands [19, 20]. We deploy this technique in our work because the virtual assistant needs to understand what the user asks when speaking, which leads to the need to convert the user's speech into text in order to apply natural language understanding (NLU) techniques on the text.

Many APIs have been introduced for speech recognition, such as DeepSpeech [21]. However, we use a very simple script to convert speech into text with the help of the Google Speech Recognition API [22]. This script will be combined with the chatbot to understand the speech and then respond as text, which is then converted back to speech.
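A minimal sketch of such a script using the SpeechRecognition package [22]; the authors' actual script is not reproduced in the paper, and the microphone capture and recognize_google call here follow the library's documented interface:

```python
import speech_recognition as sr

recognizer = sr.Recognizer()

# Capture a single utterance from the default microphone
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)  # compensate for background noise
    audio = recognizer.listen(source)

try:
    # Google Web Speech API via the SpeechRecognition wrapper
    text = recognizer.recognize_google(audio)
    print("User said:", text)  # this text is then passed to the chatbot
except sr.UnknownValueError:
    print("Speech was not understood.")
except sr.RequestError as e:
    print("API request failed:", e)
```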
B. Question Answering

Before building a chatbot, its objective must be decided in order to pick the correct type of chatbot that suits the business or task. Classifying the bot depends on varied parameters like the response generation technique, the goal, input processing, the domain of knowledge, the provided service and finally the method chosen for building [23].

We decided what kind of dialog system we will build for our agent, starting with the knowledge domain: as we will be providing information related only to the university, the domain will be closed-domain instead of open-domain.

Then, considering goal classification, we defined what goal the bot should achieve. Chatbots, like FAQ bots, can be informative, built for providing information from a source; or chat-based, talking to the user typically as a human by responding with the correct sentence; or task-based, like bots helping in booking a flight or like a FAQ chatbot. Another measure for deciding the bot type is the input processing and response generation. From the three response generation models (rule-based, retrieval-based, and generative [24]), we picked the retrieval-based model.

After specifying the correct type, we started preparing our own dataset of questions and database of responses. The bot will retrieve the response from the response candidates in the database to answer the user's questions, which are treated as queries. To achieve this purpose, we explain each part included in the bot. The four essential concepts are: intents, entities, Named Entity Recognition (NER), and Intent Classification. We display in Figure II a conversation example held between a user and the bot in a textual interface before combining it with the virtual assistant, in order to check how well the chatbot is answering.

We created our own dataset, a collection of different questions related to Karabuk University that reached 300 questions in total. We made two copies of the dataset, one that suits the first model and the other that suits the second model, both having the same questions. Correspondingly, we created the database of answers to these questions. For our trainings we divide the dataset as 80% for training and 20% for validation.

C. Intent Classification

Table I. Intents and Entities.
Figure II. Conversation example between user and the chatbot.
D. Named Entity Recognition

Named-entity recognition is a Natural Language Understanding (NLU) problem and means information extraction, but for finding categories known as entities within the text [28]. An entity is a word considered as a parameter value that is extracted from context. While intents refer to the user's main goal, the entities work as keywords that refer to meaningful or important things and are used by the users to describe what they want, or in other words, to describe their intents. Entities can be system-defined, like data references [29], but for our problem we define our own entities. General entity examples are person, location, organization, city, date, etc., depending on the industry, and they are called domain entities, which get tagged from an input sentence [30].

In our work we got 14 entities in total; they are displayed in Table I. For entity extraction, many libraries and frameworks have been introduced, like Snips [31] and Rasa [32]. We used an NLP open-source library called SpaCy [33], which is an alternative to the popular NLTK [34] and includes pretrained machine learning models.

SpaCy made NLP easier in Python by providing new pipelines based on transformers, which improved the accuracy, efficiency and adaptability of SpaCy, especially in the third version. The NER model of SpaCy assigns labels to contiguous token groups. The NER model of SpaCy consists of the following:

- a wordly-wise word embedding technique through subword features and Bloom embedding
- a deep convolutional neural network with residual layers
- named entity parsing with a modern transition-based approach

We labeled the entities in the question dataset using the SpaCy NER annotator [35], and an example in Figure V shows how the entities look in a question. The NER model gets the sentence and extracts keywords that belong to proper domain entities; using them along with the intent, a response is retrieved from the database.

Figure V. Entities extracted from a sentence.

For fine tuning with SpaCy's NER, we start by creating the JSON file of our question dataset with defined entities. Afterwards we convert it to a SpaCy file and create the configuration using SpaCy's built-in config files; this is simply provided with one command line. Once done, we begin training with a SpaCy pipeline imported from their library. The models were trained for 100 epochs and evaluated on 20% of the dataset and on new data prediction. We show the evaluation metrics of the model in Table II.

Table II. Evaluation Metrics on neural network models.

Model | F1 | Precision | Recall
Intent Classifier | 99 | 99 | 99
Entity Recognizer | 99 | 99 | 98
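A hedged sketch of the spaCy v3 data-preparation step described above; the file names, the example sentence and the DEPARTMENT label are placeholders, not the authors' actual annotations:

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
doc_bin = DocBin()

# Hypothetical annotated example: (text, list of (start, end, label) character spans)
examples = [("Where is the computer engineering department?", [(13, 44, "DEPARTMENT")])]

for text, spans in examples:
    doc = nlp.make_doc(text)
    doc.ents = [doc.char_span(s, e, label=l) for s, e, l in spans]
    doc_bin.add(doc)
doc_bin.to_disk("train.spacy")

# Training itself is driven from the command line in spaCy v3, e.g.:
#   python -m spacy init config config.cfg --lang en --pipeline ner
#   python -m spacy train config.cfg --output ./model \
#          --paths.train train.spacy --paths.dev dev.spacy
```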
E. Combining with Unity

To combine the models with the virtual human project [36], we connect them to the Unity game engine using UDP communication. In this procedure we send the data from the Python model by sockets to Unity, which has a UDP client that reads the received data from the socket. We summarize this in Figure VI. The Python server sends the response as text to Unity's C# client in real time. In Unity's virtual human we generate the speech, lip motion and a smiling facial expression when interacting with the user.

Figure VI. UDP connection between Python and Unity's C#.
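A minimal sketch of the Python side of this UDP link; the host, port and encoding are assumptions, and the Unity C# client would read datagrams from the same port:

```python
import socket

UDP_HOST, UDP_PORT = "127.0.0.1", 5005  # assumed local endpoint of the Unity client

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # UDP socket

def send_response_to_unity(response_text: str) -> None:
    """Send the chatbot's answer to Unity, which renders speech and lip motion."""
    sock.sendto(response_text.encode("utf-8"), (UDP_HOST, UDP_PORT))

send_response_to_unity("The computer engineering department is in building A.")
```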
“strongly agree.” We display the results of the survey for each
statement separately in the percentages form.
Our goal of using these evaluations is to quantify the range of participants' agreement with the statements built on the interview part, by providing strong evidence rather than simply applying a normal feedback survey.

C. Results

For the qualitative analysis of the interview responses, we built the themes depending on keywords that frequently occurred in the answers, namely fast/quickly/quick, information/data, human/human-like/humane/humanoid, interaction/interactive/interact, intent/intention, recognize/recognition and many more. Depending on the count of words we constructed the themes. These themes are the following:

Theme 1, human-like interaction. (Qualitative based)
Theme 2, quick responses. (Qualitative based)
Theme 3, persuasive in interaction. (Qualitative based)
Theme 4, fast recognition of intent.
Theme 5, highly informative and clear.

We summarize the results in Figure VII through a bar chart that shows the number of participants who chose a feedback on every theme. For instance, 3 of the users strongly agree on theme 4, while 3 others agree and only one feels neutral.

Figure VII. Diagram shows the number of users who strongly agree, agree, disagree, strongly disagree and had neutral feedback on each theme.

V. Discussion

Introducing an informative digital assistant was the key insight in this research; it will serve students, visitors, or newcomers to the university by answering any question related to our university. To achieve this, we integrate several models into a 3D virtual character. These models are a speech-to-text model, a question answering model, and sound synthesizing. We believe that building such an assistant will benefit the school by attracting the attention of users and providing all needed information to them as a normal human would. Additionally, the assistant can help each user individually.

However, we still believe that the assistant should provide more changeable responses and ask the user back in a more flexible chat flow until it understands the intention of the user. Therefore, future work will be enhancing the dataset and the conversation flow, and providing more interactive gestures and advanced service by the agent, such as providing a tour to the school visitors.

VI. Conclusion

Digital assistants are becoming dominant applications in real life that serve the community easily using artificial intelligence techniques. Additionally, virtual humans are playing a very important role in human-computer interaction, allowing people to interact with the system. In this paper our purpose was to build an assistant that helps with university queries. We combined multiple models, each one achieving a task, such as a model for speech recognition, a model for entity recognition and a model for intent classification. Moreover, we deploy a voice synthesizing plugin in Unity with these models. The reason for choosing a virtual human to be a digital assistant is because of the importance that expressions and face-to-face communication play in a conversation. Therefore, this project is a contribution to serving at a university, but built with state-of-the-art techniques.

Our future plan is to add more expressions and gestures and increase the size of the dataset to improve the project.

References

[1] Abu Shawar, B.A., Atwell, E.S.: Chatbots: are they really useful? J. Lang. Technol. Comput. Linguist. 22, 29–49 (2007)
[2] Brandtzaeg, P.B., Følstad, A.: Why people use chatbots. In: Kompatsiaris, I., et al. (eds.) Internet Science, pp. 377–392. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70284-1_30
[3] Siri. https://www.apple.com/siri/
[4] Braun and N. Rummel, "Facilitating Learning From Computer-Supported Collaborative Inquiry: the Challenge of Directing Learners' Interactions To Useful Ends," Research and Practice in Technology Enhanced Learning, vol. 05, no. 03, pp. 205, 2010.
[5] D. Coyle, G. Doherty, M. Matthews, and J. Sharry, "Computers in talk-based mental health interventions," Interacting with Computers, vol. 19, no. 4, pp. 545–562, 2007.
[6] Zue, S. Seneff, J. R. Glass, J. Polifroni, C. Pao, T. J. Hazen, and L. Hetherington, "JUPITER: A Telephone-Based Conversational Interface for Weather Information," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 1, pp. 85–96, 2000.
[7] M. Kolss, D. Bernreuther, M. Paulik, S. Stucker, S. Vogel, and A. Waibel, "Open Domain Speech Recognition & Translation: Lectures and Speeches," in Proceedings of ICASSP, 2006.
[8] R. Belvin, R. Burns, and C. Hein, "Development of the HRL route navigation dialogue system," in Proceedings of ACL-HLT, 2001, pp. 1–5.
[9] Weizenbaum, J.: ELIZA—a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 36–45 (1966). https://doi.org/10.1145/365153.365168
[10] Adamopoulou E, Moussiades L. An Overview of Chatbot Technology. Artificial Intelligence Applications and
Innovations. 2020;584:373-383. Published 2020 May 6. doi:10.1007/978-3-030-49186-4_31
[11] Colby, K.M., Weber, S., Hilf, F.D.: Artificial paranoia. Artif. Intell. 2, 1–25 (1971). https://doi.org/10.1016/0004-3702(71)90002-6
[12] Wallace, R.S.: The anatomy of A.L.I.C.E. In: Epstein, R., Roberts, G., Beber, G. (eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer, pp. 181–210. Springer, Cham (2009). https://doi.org/10.1007/978-1-4020-6710-5_13
[13] Marietto, M., et al.: Artificial intelligence markup language: a brief tutorial. Int. J. Comput. Sci. Eng. Surv. 4 (2013). https://doi.org/10.5121/ijcses.2013.4301
[14] IBM Watson. https://www.ibm.com/watson
[15] What exactly is Alexa? Where does she come from? And how does she work? https://www.digitaltrends.com/home/what-is-amazons-alexa-and-what-can-it-do/
[16] Google Assistant, your own personal Google. https://assistant.google.com/
[17] Personal Digital Assistant - Cortana Home Assistant - Microsoft. https://www.microsoft.com/en-us/Cortana
[18] Siri. https://www.apple.com/siri/
[19] Hui Liu, Chapter 1 - Introduction, Editor(s): Hui Liu, Robot Systems for Rail Transit Applications, Elsevier, 2020, Pages 1-36.
[20] Zwass, Vladimir. "Speech recognition". Encyclopedia Britannica, 10 Feb. 2016, https://www.britannica.com/technology/speech-recognition. Accessed 12 June 2021.
[21] Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates, et al. Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567, 2014.
[22] https://pypi.python.org/pypi/SpeechRecognition/
[23] Adamopoulou, Eleni & Moussiades, Lefteris. (2020). An Overview of Chatbot Technology. 373-383. 10.1007/978-3-030-49186-4_31.
[24] Hien, H.T., Cuong, P.-N., Nam, L.N.H., Nhung, H.L.T.K., Thang, L.D.: Intelligent assistants in higher-education environments: the FIT-EBot, a chatbot for administrative and learning support. In: Proceedings of the Ninth International Symposium on Information and Communication Technology, pp. 69–76. ACM, New York (2018)
[25] Abiodun, Oludare Isaac; Jantan, Aman; Omolara, Abiodun Esther; Dada, Kemi Victoria; Mohamed, Nachaat Abdelatif; Arshad, Humaira (2018-11-01). "State-of-the-art in artificial neural network applications: A survey". Heliyon. 4 (11).
[26] Cortes, C., Vapnik, V. Support-vector networks. Mach Learn 20, 273–297 (1995). https://doi.org/10.1007/BF00994018
[27] Kim, Yoon. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 10.3115/v1/D14-1181.
[28] Perera N, Dehmer M, Emmert-Streib F. Named Entity Recognition and Relation Detection for Biomedical Information Extraction. Front Cell Dev Biol. 2020 Aug 28;8:673. doi: 10.3389/fcell.2020.00673. PMID: 32984300; PMCID: PMC7485218.
[29] Ramesh, K., Ravishankaran, S., Joshi, A., Chandrasekaran, K.: A survey of design techniques for conversational agents. In: Kaushik, S., Gupta, D., Kharb, L., Chahal, D. (eds.) ICICCT 2017. CCIS, vol. 750, pp. 336–350. Springer, Singapore (2017). https://doi.org/10.1007/978-981-10-6544-6_31
[30] Jung, S.: Semantic vector learning for natural language understanding. Comput. Speech Lang. 56, 130–145 (2019). https://doi.org/10.1016/j.csl.2018.12.008
[31] https://github.com/snipsco/snips-nlu
[32] Bocklisch, Tom & Faulkner, Joey & Pawlowski, Nick & Nichol, Alan. (2017). Rasa: Open Source Language Understanding and Dialogue Management.
[33] https://github.com/explosion/spaCy
[34] Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc.
[35] https://github.com/ieriii/spacy-annotator
[36] https://github.com/GeoffreyGorisse/VHProject
Visual Question Answering for Medical Image Analysis based
on Transformers
Hanan Othman Yakoub Bazi Mohamad Alrahhal
Department of Computer Engineering Department of Computer Engineering Department of Applied Computer Science
King Saud University King Saud University King Saud University
Riyadh, Saudi Arabia    Riyadh, Saudi Arabia
439203835@student.ksu.edu.sa ybazi@ksu.edu.sa mmalrahhal@ksu.edu.sa
Abstract— Health care has been revolutionized over the past decades in conjunction with new discoveries and technological advancements. One of those areas that has rapidly evolved is medical imaging, which plays a significant role in screening, early diagnosis, and treatment selection. Artificial Intelligence (AI) has been utilized to support physicians' decisions related to medical imaging. Recently, medical visual question answering (VQA) has been utilized to predict the right answer for a given medical image accompanied by a clinically relevant question, to support the clinical decision. However, the validity of medical VQA is still not proven. In this paper, we propose a full transformer architecture for generating answers given the question and image. We extracted image features using the data-efficient image transformer (DeiT) model and used the bidirectional encoder representations from transformers (BERT) model for extracting textual features. We also applied concatenation to integrate the visual and language features. The fused features were then fed to the decoder to predict the answer. This model established new state-of-the-art results of 61.2 in accuracy and 21.3 in BLEU score on the PathVQA data set.

Keywords—Transformers, visual question answer, medical image, vision transformers

I. Introduction

Health care has been revolutionized over the past decades in conjunction with new discoveries and the advancement of technology. As a result, physicians have started to adopt new modalities to optimize patient care. One of those updated modalities is using medical images. Thus, medical images play a significant role in screening, early diagnosis, and during surgeries such as cardiac catheterization [1]. Over recent years, artificial intelligence (AI) has started to be incorporated into the medical field and has provided various innovation initiatives. Thus, AI has been utilized in health care settings, such as advancements in diagnosis, treatment personalization, and electronic health recording. Many studies have indicated that the integration of AI into medical diagnosis programs has increased the accuracy, speed, and consistency of the diagnosis, enhanced the prediction of patient outcomes and captured additional information missed by doctors [2], [3]. In this regard, several models have been proposed to help patients understand their physical conditions through visual inspection, such as image captioning [4], image retrieval [5], visual question answering (VQA) [6], and visual question generation (VQG) [7].

Visual question answering (VQA) is a task that takes a medical image and a clinical question about the image as input and produces natural language answers as output. This process shows great potential in providing medical assistance, such as helping patients get prompt feedback on their inquiries, making more informed decisions, supporting the suitable utilization of medical resources, providing a second opinion to physicians in diagnosis, and reducing the high cost of training medical professionals.

In the literature, researchers have utilized different methods and models for medical VQA. Yan et al. [8] used BERT [9] and VGG-16 [10] with a global average pooling (GAP) [11] strategy to extract question and image features, respectively. This is then followed by co-attention to combine these features and a decoder to predict the answers. Chen et al. [12] used BioBERT for the questions and ResNet34 [13] to extract image features, in which a bilateral-branch network (BBN) with a cumulative learning strategy [14] was used to fuse these features. Ren et al. [15] propose a model called CGMVQA that uses a multi-modal transformer architecture. Additionally, Khare et al. [16] used a similar architecture to CGMVQA with masked language modeling (MLM) and different datasets. Vu et al. [17] utilize a method denoted Question-Centric Multimodal Low-rank Bilinear (QCMLB) that combines image and question features by applying high involvement to the meaning of the query questions.

However, VQA in the medical domain is still in its embryonic stage, since the accuracy of previous methods has been significantly lower than doctors' assessments, owing to the difficulty of answer evaluation and the variety of answer expression. Thus, there is still a need to develop innovative techniques to overcome some of those limitations.

Recently, there is a growing body of literature that supports the utilization of Transformers in VQA. Transformers were initially used in natural language processing (NLP) tasks [18]. The transformer is based entirely on attention mechanisms and uses the encoder-decoder architecture. This attention focuses on specific parts of the input to get more efficient results. The main motivation of transformer models is the ability to model long-range interactions between different sequence elements, unlike RNNs. This motivation inspired Dosovitskiy et al. [19] to propose a convolution-free transformer, called the vision transformer (ViT), that applies directly to images by splitting the image into patches. These patches are treated like tokens in NLP applications. These models led to very competitive results on large datasets using extensive computing resources [20]. However, when ViT is trained on a small dataset, the model will not discover the properties of the image. Therefore, Touvron et al. [21] propose a new technique, called the data-efficient image transformer (DeiT), that requires less data (e.g., ImageNet1K) and fewer computing resources to produce a high-performance model. DeiT has the same architecture as the ViT model with knowledge distillation [22]. Knowledge distillation is a learning framework that uses student-teacher techniques by applying data augmentation coupled with
Figure I. An overview of our model, which contains an image encoder to encode the visual inputs, a text encoder to encode the language inputs, and a joint decoder to generate the answer.
optimization tricks and regularization. This training strategy has shown excellent results on ViT for smaller datasets, particularly when the knowledge has been distilled from a convolutional neural network (CNN) teacher model [23].

Although previous studies used the transformer as a single method, studies reported strong performance of multi-modal learning such as VQA. Tan et al. [24] utilize a method denoted learning cross-modality encoder representations from Transformers (LXMERT), which used a two-stream model with co-attention and only pre-trained the model with in-domain data. Lu et al. [25] used vision-and-language BERT (ViLBERT) with the same architecture but more complex co-attention, pre-trained with out-of-domain data. Hu et al. [26] proposed a unified transformer (UniT) that encodes each input modality with an encoder and uses a joint decoder that makes predictions for the final outputs.

This paper proposes a full transformer encoder-decoder architecture for the VQA model in the medical domain. The encoder modules encode each input modality, and the decoder generates the answer word by word. The image and question features have been extracted by using the DeiT model and BERT model, respectively. The extracted features for images and questions were fused using a fusion mechanism. Compared to previous work on multi-modal learning with transformers, our work is the first one that trains on medical images.

The remainder of the paper is organized as follows: Section II describes the main methods based on transformers. In Section III, we present the experimental results on the PathVQA dataset. Then we finally conclude and show future directions in Section IV.

II. Proposed method

The proposed medical VQA framework is shown in Figure I, which consists of four parts: question encoding for extracting textual features of the given question, image encoding for capturing visual features of the given medical image, concatenation used to fuse the visual and textual feature vectors to generate a joint representation, and a decoder for answer prediction. Detailed descriptions of the method are provided in the following subsections.

A. Question encoder

We encode the question by using bidirectional encoder representations from Transformers (BERT). BERT is a transformer encoder model pre-trained on large corpora with masked language modeling and next sentence prediction tasks. It contains several blocks of Multi-Head Attention (MHA), each followed by a Feed Forward Neural Network (FFN).

Given the input question $Q_i$, we tokenize it by adding a special classification token [CLS] followed by the WordPiece tokens and the separator token [SEP]. The token sequence is then passed through a word embedding to convert the tokens into vectors of dimension $d_{model}$. A positional embedding is added to each token to indicate its position in the sequence. Then the result is fed to a transformer with several blocks. Each transformer block is composed of a multi-head self-attention (MSA), an FFN, and normalization layers. The MSA block uses the self-attention mechanism to derive long-range dependencies between different words in the given text. Equation (1) shows the details of the computations in one self-attention head (SA). First, the input sequence is transformed into three different matrices, which are the key matrix $K$, the query matrix $Q$, and the value matrix $V$, using three linear layers with weights in $\mathbb{R}^{d_{model} \times d_K}$, $\mathbb{R}^{d_{model} \times d_Q}$, and $\mathbb{R}^{d_{model} \times d_V}$, for $i = 1, 2, \dots, h$, where $h$ is the number of heads. The attention map is computed by matching the query matrix against the key matrix using the scaled dot-product. The output is scaled by the dimension of the key, $d_K$, and then transformed into probabilities by a SoftMax layer. Finally, the result is multiplied with the value $V$ to get a filtered value matrix which assigns high focus to more important elements.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_K}}\right) V \tag{1}$$
144
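To make this tokenization step concrete, the following is a minimal sketch using the Hugging Face transformers library; the checkpoint name and the sample question are illustrative assumptions, not the paper's exact setup.

```python
# A hedged sketch of the question-encoding input pipeline described above,
# using the Hugging Face `transformers` library. The checkpoint name and the
# sample question are illustrative assumptions, not details from the paper.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

question = "what does the biopsy image show?"   # hypothetical PathVQA-style question
inputs = tokenizer(question, return_tensors="pt")  # adds [CLS] ... [SEP] automatically

print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))
# e.g. ['[CLS]', 'what', 'does', 'the', 'biopsy', 'image', 'show', '?', '[SEP]']

outputs = encoder(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, d_model = 768)
```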
The result is then fed to a transformer with several blocks. Each transformer block is composed of a multi-head self-attention (MSA) layer, an FFN, and a normalization layer. The MSA block uses the self-attention mechanism to capture long-range dependencies between different words in the given text. Equation (1) shows the details of the computations in one self-attention head (SA). First, the input sequence is transformed into three different matrices, the key K, the query Q, and the value V, using three linear layers with weights W_K ∈ R^(d_model × d_K), W_Q ∈ R^(d_model × d_Q), and W_V ∈ R^(d_model × d_V), for i = 1, 2, …, h, where h is the number of heads. The attention map is computed by matching the query matrix against the key matrix using the scaled dot-product. The output is scaled by the dimension of the key, d_K, and then transformed into probabilities by a softmax layer. Finally, the result is multiplied with the value V to obtain a filtered value matrix that assigns high focus to the more important elements.

Attention(Q, K, V) = softmax(QK^T / √d_K) · V    (1)

The MSA block is followed by the FFN, which consists of two fully connected layers with a ReLU activation function in between. It can be formulated as:

FFN(x) = max(0, xW1 + b1)W2 + b2    (2)
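As a worked illustration of Eqs. (1) and (2), the following NumPy sketch computes one self-attention head followed by the position-wise FFN; the dimensions and random weights are assumptions for demonstration, not trained parameters.

```python
# Minimal NumPy sketch of Eqs. (1)-(2): one self-attention head followed by
# the FFN. Dimensions and random weights are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Eq. (1): Attention(Q, K, V) = softmax(QK^T / sqrt(d_K)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

def ffn(x, W1, b1, W2, b2):
    # Eq. (2): FFN(x) = max(0, xW1 + b1)W2 + b2
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
n_tokens, d_model, d_k, d_ff = 8, 64, 64, 256
X = rng.normal(size=(n_tokens, d_model))            # embedded token sequence
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
head = attention(X @ Wq, X @ Wk, X @ Wv)            # one SA head
W1, b1 = rng.normal(size=(d_k, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
out = ffn(head, W1, b1, W2, b2)
print(out.shape)  # (8, 64)
```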
B. Evaluation metrics

Accuracy [28] and BiLingual Evaluation Understudy (BLEU) [29] are commonly used as evaluation metrics in the VQA task. Accuracy measures the ratio of correctly predicted observations to total observations, while BLEU measures the similarity between predicted answers and the ground truth by matching n-grams.

C. Results

Our model is trained using the Adam optimizer with an initial learning rate of 0.001. We used a batch size of 50, up to 50 epochs, and categorical cross-entropy as the loss function. Table II shows the results for our proposed model, which achieves 61.2 in terms of the accuracy metric and 21.3 in terms of the BLEU metric. Although few models use the PathVQA dataset for training, our model increases accuracy by 2.2% and BLEU by 2.1%. One of the reasons for this significant gain in performance is the use of the new vision transformer technique.

Table II. Results Comparison

Methods                                        Accuracy   BLEU
CNN + LSTM + stacked attention network [27]    59.4       19.2
Proposed (DeiT + BERT)                         61.2       21.3
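For concreteness, the sketch below shows how the two reported metrics could be computed, using NLTK's sentence_bleu for the n-gram matching; the toy answers are invented for illustration and are not PathVQA samples.

```python
# A hedged sketch of the two reported metrics: exact-match accuracy and
# n-gram BLEU via NLTK. The toy answers are invented for illustration.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

predictions = ["benign tumor", "yes", "in the lung"]
references  = ["benign tumor", "no",  "in the left lung"]

# accuracy: ratio of exactly correct predictions to total observations
accuracy = sum(p == r for p, r in zip(predictions, references)) / len(references)

# BLEU: n-gram overlap between prediction and ground truth, averaged here
smooth = SmoothingFunction().method1
bleu = sum(
    sentence_bleu([r.split()], p.split(), smoothing_function=smooth)
    for p, r in zip(predictions, references)
) / len(references)

print(f"accuracy={accuracy:.3f}  BLEU={bleu:.3f}")
```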
Acknowledgment

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No. RG-1441-502.

IV. Conclusion

Health care has rapidly grown with different tools and techniques to improve patient care, one of which is medical imaging. In this work, we propose full transformers for answering questions about medical images. In particular, we use an encoder for each input modality. The DeiT model, distilled from a CNN teacher, is used to extract the image features. The text features are extracted with the BERT model by adding token, segment, and position embedding layers. The answer sequence is predicted by a decoder. Our model achieves a 61.2 accuracy score and a 21.3 BLEU score. Empirical evaluation on the recently published benchmark dataset PathVQA shows that our approach achieves superior performance compared with the state-of-the-art Med-VQA model. In future work, we plan to explore a better evaluation strategy for the model. We also plan to introduce better individual models to handle each of the leaf node tasks.

References

[1] E. Bercovich and M. Javitt, “Medical Imaging: From Roentgen to the Digital Revolution, and Beyond,” Rambam Maimonides Med. J., vol. 9, p. e0034, Oct. 2018, doi: 10.5041/RMMJ.10355.
[2] J. Ker, L. Wang, J. Rao, and T. Lim, “Deep Learning Applications in Medical Image Analysis,” IEEE Access, vol. 6, pp. 9375–9389, 2018, doi: 10.1109/ACCESS.2017.2788044.
[3] M. M. A. Rahhal, Y. Bazi, H. AlHichri, N. Alajlan, F. Melgani, and R. R. Yager, “Deep learning approach for active classification of electrocardiogram signals,” Inf. Sci., vol. 345, pp. 340–354, Jun. 2016, doi: 10.1016/j.ins.2016.01.082.
[4] C. Eickhoff, I. Schwall, and H. Muller, “Overview of ImageCLEFcaption 2017 – Image Caption Prediction and Concept Detection for Biomedical Images,” p. 10.
[5] A. Qayyum, S. M. Anwar, M. Awais, and M. Majid, “Medical image retrieval using deep convolutional neural network,” Neurocomputing, vol. 266, pp. 8–20, Nov. 2017, doi: 10.1016/j.neucom.2017.05.025.
[6] S. A. Hasan, Y. Ling, O. Farri, J. Liu, H. Muller, and M. Lungren, “Overview of ImageCLEF 2018 Medical Domain Visual Question Answering Task,” p. 8.
[7] M. Sarrouti, A. Ben Abacha, and D. Demner-Fushman, “Visual Question Generation from Radiology Images,” in Proceedings of the First Workshop on Advances in Language and Vision Research, Online, 2020, pp. 12–18, doi: 10.18653/v1/2020.alvr-1.3.
[8] X. Yan, L. Li, C. Xie, J. Xiao, and L. Gu, “Zhejiang University at ImageCLEF 2019 Visual Question Answering in the Medical Domain,” p. 9.
[9] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” arXiv:1810.04805, May 2019.
[10] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556, Apr. 2015. [Online]. Available: http://arxiv.org/abs/1409.1556
[11] M. Lin, Q. Chen, and S. Yan, “Network In Network,” arXiv:1312.4400, Mar. 2014. [Online]. Available: http://arxiv.org/abs/1312.4400
[12] G. Chen, H. Gong, and G. Li, “HCP-MIC at VQA-Med 2020: Effective Visual Representation for Medical Visual Question Answering,” p. 8.
[13] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[14] B. Zhou, Q. Cui, X.-S. Wei, and Z.-M. Chen, “BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, Jun. 2020, pp. 9716–9725, doi: 10.1109/CVPR42600.2020.00974.
[15] F. Ren and Y. Zhou, “CGMVQA: A New Classification and Generative Model for Medical Visual Question Answering,” IEEE Access, vol. 8, pp. 50626–50636, 2020, doi: 10.1109/ACCESS.2020.2980024.
[16] Y. Khare, V. Bagal, M. Mathew, A. Devi, U. D. Priyakumar, and C. V. Jawahar, “MMBERT: Multimodal BERT Pretraining for Improved Medical VQA,” arXiv:2104.01394, Apr. 2021. [Online]. Available: http://arxiv.org/abs/2104.01394
[17] M. H. Vu, T. Lofstedt, T. Nyholm, and R. Sznitman, “A Question-Centric Model for Visual Question Answering in Medical Imaging,” IEEE Trans. Med. Imaging, vol. 39, no. 9, pp. 2856–2868, Sep. 2020, doi: 10.1109/TMI.2020.2978284.
[18] A. Vaswani et al., “Attention Is All You Need,” arXiv:1706.03762, Dec. 2017. [Online]. Available: http://arxiv.org/abs/1706.03762
[19] A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” arXiv:2010.11929, Oct. 2020. [Online]. Available: http://arxiv.org/abs/2010.11929
[20] Y. Bazi, L. Bashmal, M. M. A. Rahhal, R. A. Dayil, and N. A. Ajlan, “Vision Transformers for Remote Sensing Image Classification,” Remote Sens., vol. 13, no. 3, p. 516, Feb. 2021, doi: 10.3390/rs13030516.
[21] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” arXiv:2012.12877, Jan. 2021. [Online]. Available: http://arxiv.org/abs/2012.12877
[22] G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a Neural Network,” arXiv:1503.02531, Mar. 2015. [Online]. Available: http://arxiv.org/abs/1503.02531
[23] L. Bashmal, Y. Bazi, M. M. Al Rahhal, H. Alhichri, and N. Al Ajlan, “UAV Image Multi-Labeling with Data-Efficient Transformers,” Appl. Sci., vol. 11, no. 9, p. 3974, Apr. 2021, doi: 10.3390/app11093974.
[24] H. Tan and M. Bansal, “LXMERT: Learning Cross-Modality Encoder Representations from Transformers,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, Nov. 2019, pp. 5100–5111, doi: 10.18653/v1/D19-1514.
[25] J. Lu, D. Batra, D. Parikh, and S. Lee, “ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks,” arXiv:1908.02265, Aug. 2019. [Online]. Available: http://arxiv.org/abs/1908.02265
[26] R. Hu and A. Singh, “UniT: Multimodal Multitask Learning with a Unified Transformer,” arXiv:2102.10772, Mar. 2021. [Online]. Available: http://arxiv.org/abs/2102.10772
[27] X. He, Y. Zhang, L. Mou, E. Xing, and P. Xie, “PathVQA: 30000+ Questions for Medical Visual Question Answering,” arXiv:2003.10286, Mar. 2020. [Online]. Available: http://arxiv.org/abs/2003.10286
[28] M. Malinowski and M. Fritz, “A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input,” arXiv:1410.0210, May 2015. [Online]. Available: http://arxiv.org/abs/1410.0210
[29] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a Method for Automatic Evaluation of Machine Translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, Jul. 2002, pp. 311–318, doi: 10.3115/1073083.1073135.
Deep Learning for Face Detection and Recognition
Tuba Elmas ALKHAN, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, karabala.masa@std.izu.edu.tr
Alaa Ali Hameed, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, 0000-0002-8514-9255
Akhtar Jamil, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, 0000-0002-2592-1039
Abstract— With the continued development of machine learning and deep learning, face recognition technology based on convolutional neural networks (CNN) has become the most necessary and widely used methodology within the field of face recognition. A face recognition model is a technology capable of identifying a person from an image or a video. Various strategies for face recognition systems are effective; they work by comparing selected facial features from images with faces in a database. This paper creates a system that uses Convolutional Neural Networks (CNN) to recognize students' emotions from their faces. We achieved an accuracy of 74.41% and a validation accuracy of 77.00% on the FER-2013 dataset to classify seven different emotions through facial expressions.

Keywords—Face Detection, Face Recognition, Deep Learning, Emotion recognition, Convolutional neural networks (CNN), Student facial expression.

I. Introduction

Face emotion recognition is an active and vital area of research, especially these days, when the spread of the COVID-19 epidemic has shifted teaching to distance education. These systems play an essential role in our daily life and make it much more manageable. Face emotion recognition has been implemented in medicine, psychology, interactive games, public security, distance education, etc.

Face recognition in videos is challenging due to variations in pose, illumination, or facial expression. But it is an important task that has been widely utilized in several practical applications like security monitoring and surveillance [1].

The face is the most important part of the body. It is vital and expressive, and it can transfer many emotions silently. Facial expression recognition determines an emotion from face images. Generally, six basic emotions (happiness, sadness, surprise, anger, fear, and disgust), plus neutral, are categorized, and these are the same across all cultures [1].

CNNs have been proven to be very effective for various computer vision tasks, such as object or face detection and classification [2]. Applying a facial expression recognition system to the field of education makes it possible to detect, capture, and record the emotional changes of students during the learning process and supply a better reference for academics to teach based on students' abilities [3].

The facial recognition system involves two steps: face detection, which identifies human faces in images, and face recognition, which matches the face from a video or an image against a database of faces to recognize it. The two are similar but have different objectives. Researchers have proposed several facial detection and recognition systems, which will be discussed in detail in the next section.

The purpose of this article is to perform emotion recognition in the education area through a system using a convolutional neural network (CNN) that analyzes the facial expressions of students. CNN is a deep learning algorithm used for image classification. It includes several stages of image processing for extracting feature representations. There are several deep learning methods for extracting more complex features, such as Autoencoders, Recurrent Neural Networks, Gradient Descent, and Convolutional Neural Networks.

This study implements an automated system to realize emotion recognition in the education field. The system analyzes student facial expressions and gives feedback to an educator using a Convolutional Neural Network. Several classification algorithms have been applied to learn the instant emotional state (Random Forest, Artificial Neural Network (ANN), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Classification & Regression Trees). E-learning has several advantages, such as saving time and money. Through this kind of learning, all students can use the contents anytime and anywhere, which leads to good participation, retention, and scalability and offers personalization, but it does not provide enough face-to-face interactivity between an educator and learners.

II. LITERATURE REVIEW

This section highlights some developments made in the field of facial emotion recognition in various areas such as medicine, health, psychology, online education, and biomedical engineering. Today, face emotion recognition is a vital and important area of research, especially these days, when the spread of the COVID-19 epidemic has moved teaching to distance education. The detection of facial emotions is possible in online education; therefore, it can help academics to adjust their performance depending on the students' emotions. In general, artificial intelligence includes deep learning and machine learning, and many machine learning and deep learning algorithms are used in this field. Convolutional neural networks (CNNs) have become the most important and widely used method in the field of face recognition. A CNN is a deep learning algorithm used for image classification that includes several stages of image processing for extracting feature representations.

In [6], the authors propose a CNN architecture called Trunk Branch Ensemble Convolutional Neural network
(TBE-CNN) to overcome problems in facial recognition from a video. They used this system in surveillance. Surveillance applications need to be capable of detecting and recognizing faces quickly, and such a system must be able to withstand changes in blur, zoom, illumination, and pose. This version extracts features effectively by sharing the low- and middle-level convolutional layers.

In [7], a new system for face recognition based on a Stacked Convolutional Autoencoder (SCAE) and sparse representation was presented. The system can extract deeper and more abstract features with high recognition speed. However, the recognition rate is not high, so the system still needs development.

The Viola-Jones framework has been widely used; researchers Padilla and Costa used it for detecting the location of faces, and their work focuses on the appraisal of face detection classifiers, such as those in OpenCV. The system needs images with and without faces (positive and negative pictures) to train the classifier and extract (Haar) features from the images. The authors evaluated the performance of several classifiers and tested their accuracy [8].

The authors in [2] proposed a school system using a convolutional neural network to help professors and instructors adapt their academic performance based on students' emotions. First, they detect students' faces in an image by using Haar Cascade Classifiers, then perform emotion recognition by using a CNN with seven types of expressions. The system achieved an accuracy of 70% on the FER-2013 dataset.

In [9], the authors developed a system that determines students' emotions and provides feedback to improve distance education and update learning content. Head and eye movement can help to assess student attention and concentration levels. The system is suitable and effective for detecting students' negative emotions. The authors discussed some face detection algorithms such as Local Binary Patterns (LBP), neural networks, and AdaBoost.

(Chang et al., 2018) designed a new Convolutional Neural Network based on ResNet to extract features and a Complexity Perception Classification (CPC) algorithm for facial expression recognition using three different classifiers (Softmax, SVM, Random Forest). It improved the recognition accuracy and fixed some misclassified expression categories. CNN and Softmax with the CPC algorithm achieved an accuracy of 71.35% on Fer2013 and 98.78% on CK+ [16].

(Jiang et al., 2020) evaluated a Gabor Convolutional Network on three types of datasets (Real-world Affective Faces, FER+, and Fer2013). The proposed approach includes four Gabor convolutional layers with two fully connected layers. They find the optimal model by changing the number of layers and the number of units in the convolutional layers. They then discussed and compared the proposed GCN model with different models such as AlexNet and ResNet. The GCN achieved the best accuracy on the Fer2013 dataset [15].

In their study, (Ayvaz et al.) discussed a new facial emotion recognition system built with the help of several algorithms (Random Forest, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Classification & Regression Trees); the system can classify the emotions of students. The system detects the facial emotions of the students and gives a response to an instructor according to the facial expression of the learner. SVM provides the best prediction accuracy rate of 98.24% [10].

Recently, many works [12, 2] used CNNs for facial expression recognition. The recognition of human facial expressions is a hard problem for deep learning and machine learning, so the convolutional neural network is used to overcome the problems in facial expression classification.

In their study, Roman RADIL et al. tested the performance of the proposed Convolutional Neural Network against three image recognition methods: Local Binary Patterns Histograms (LBPH), K-Nearest Neighbor (KNN), and Principal Component Analysis (PCA). The result shows that Local Binary Patterns Histograms provide better results than Principal Component Analysis and K-Nearest Neighbor, and an accuracy rate of 98.3% was achieved for the proposed CNN [11].

Saravanan et al. discussed the classification of images of human faces into one of 7 basic emotions (Fear, Disgust, Surprise, Anger, Sadness, and Happiness). The authors proposed a Convolutional Neural Network (CNN) model consisting of six convolutional layers, two max pooling layers, and two fully connected layers. This model achieved a final accuracy of 0.60.

In [14], the authors created a model using a CNN to detect facial expressions in real time using a webcam. The model is used to classify the expression of human faces, and it gave a training accuracy of 79.89% and a test accuracy of 60.12%.

(Wang et al., 2020) Online education has developed because of the spread of the COVID-19 pandemic, which has led to the closure of schools and the transfer of education to distance education, so the authors proposed a system combining a Face Emotion Recognition (FER) algorithm and online course platforms based on a CNN architecture [17].

III. MATERIALS AND METHODOLOGY

A. Dataset

For our deep learning model to be good and smart enough to discover expressions, we need to train it with a facial expression dataset. Here we used the FER-2013 dataset. FER-2013 is an open-source dataset for recognizing facial expressions, which was shared on Kaggle through the ICML 2013 conference. The dataset contains 35,887 grayscale images of 48x48-sized faces, divided into 3,589 test and 28,709 train images. The dataset contains facial expressions belonging to these seven emotions (Happy, Sad, Neutral, Surprise, Fear, Angry, and Disgust). Figure 1 shows some example images from the FER-2013 dataset, and Table I illustrates the description of the dataset. The image dataset consists of grayscale images, and we kept the size the same for our training and testing (300x300).

Table I. FER2013 dataset description

Label   Number of images   Emotion
0       4593               Angry
1       547                Disgust
2       5121               Fear
3       8989               Happy
4       6077               Sad
5       4002               Surprise
6       6198               Neutral

B. Proposed Method

In this part, we describe our proposed system, which uses a Convolutional Neural Network (CNN) model to analyze the students' facial expressions. The first step of our system is to detect the face in images or video (video is a set of images), and these face images are then used as input to the network. Lastly, using the CNN, the system classifies the expression of a student's face into one of these expressions: happy, sad, fear, angry, surprise, neutral, or disgust.

A convolutional neural network is a type of artificial neural network that uses a convolution method for extracting a large number of features from the input data.

A CNN model contains 3 main types of layers: the convolutional layer, the pooling layer, and the fully connected layer. Figure II shows the CNN architecture.
Figure II. Convolutional Neural Network Architecture

Convolutional Layer: We used the convolutional layer to extract the different attributes from an input image. The convolution keeps a spatial association among pixels by learning features; the images are convolved with a group of learnable filters. This creates, in the output picture, a feature map that provides some information about the image. Finally, the feature maps are fed to the next layer for learning more features. In other words, the convolution multiplies two images, which can be represented as matrices, to get an output that is used to extract a large number of features from the image.

The convolution formula is represented in Equation (1), where f is the input image, * denotes the convolution operation, and g is the filter matrix. Figure III shows the convolution operation.

f(x, y) * g(x, y) = Σ_{n1=−∞}^{∞} Σ_{n2=−∞}^{∞} f(n1, n2) · g(x − n1, y − n2)    (1)

Figure III. Convolution operation.
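As an illustration of Equation (1), the following NumPy sketch implements a plain 2-D convolution by flipping the filter and sliding it over the input; the 3x3 edge filter is an assumed example, not the kernel learned by the network.

```python
# A small NumPy sketch of the 2-D convolution in Equation (1): the filter g
# is flipped and slid over the input f, and overlapping products are summed.
# The 3x3 edge filter is an illustrative assumption, not the paper's kernel.
import numpy as np

def conv2d(f, g):
    # 'valid' convolution: output shrinks by (kernel - 1) in each dimension
    g = np.flipud(np.fliplr(g))                  # flip the kernel per the definition
    H, W = f.shape
    kh, kw = g.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(f[y:y + kh, x:x + kw] * g)
    return out

image = np.random.rand(48, 48)                   # grayscale input, FER-2013 size
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)     # example edge-detection filter
feature_map = conv2d(image, kernel)
print(feature_map.shape)                          # (46, 46)
```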
Pooling Layer: Pooling is also known as down-sampling or subsampling. A pooling layer down-samples the feature maps yet holds the significant data. There are three common pooling methods: sum pooling, max pooling, and average pooling. The approach most typically used is max pooling. Max pooling is used to gradually minimize the spatial size of the input; it controls overfitting (regularization) and provides invariance to small translations of the input. The pooling layer provides higher generalization, resistance to distortion, and quicker convergence. It is typically positioned between the convolutional layers. Figure IV shows an example of the max pooling operation.

Fully connected layer: A fully connected layer (FCL) in a neural network is a layer in which all inputs from one layer are connected to every neuron of the next layer. The intention of using the FCL is to take the output of the previous layers, such as the convolutional and pooling layers, and classify the input image into different classes according to the training dataset. The term fully connected means that all filters of the previous layer are linked to all filters of the next layer. Fully connected layers are positioned before the network's classification output and are used to flatten the results before classification. In short, the convolutional and pooling layers work as feature extractors from the inputs, and the fully connected layers work as the classifier.

IV. EXPERIMENTAL RESULTS

We designed our CNN model. Here we used the FER-2013 dataset. It is an open-source dataset shared on Kaggle. The dataset includes seven categories: Happy, Sad, Fear, Angry, Disgust, Neutral, and Surprise. The training set consists
of about 17084 images. The testing set consists of about 4180 images. We deal with 3 classes (Happy, Sad, Neutral) in training our model. Image rows = 48 and image columns = 48 determine the size of the image matrix that we feed to our model. Batch size = 32; the batch size is the number of samples that are processed before the model is updated. The number of complete passes through the training dataset was 25 epochs. We used the ImageDataGenerator class to

[Architecture diagram: the input image (48x48) passes through stacked Conv + ReLU, batch normalization, and max pooling blocks, with dropout (p = 0.5), followed by Dense + ReLU and a final Dense layer with softmax.]
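The sentence above breaks off in the source; as a hedged sketch, this is how Keras's ImageDataGenerator is commonly configured for this kind of setup. The directory layout and augmentation choices are assumptions, not values from the paper.

```python
# A hedged sketch of a typical ImageDataGenerator setup for FER-2013-style
# data. The directory path and augmentation parameters are assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values to [0, 1]
    horizontal_flip=True,     # simple augmentation; assumed, not stated
    validation_split=0.2,
)

train_gen = train_datagen.flow_from_directory(
    "fer2013/train",          # hypothetical directory layout
    target_size=(48, 48),
    color_mode="grayscale",
    batch_size=32,
    class_mode="categorical",
    subset="training",
)
```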
The Adadelta optimizer gave an accuracy of 0.7179, and a validation accuracy of 0.7563 was attained over 50 epochs; the learning rate was 0.1 and the batch size was 32.

Figure IX. Accuracy over test and train data with SGD optimizer in the proposed CNN.
Figure X. Loss over test and train data with SGD optimizer.
Figure XIII. Loss over train and test data with Adadelta optimizer in CNN.
Figure XIV. Confusion Matrix with Adadelta optimizer using the proposed method.
Patience represents the number of epochs after which training will be stopped because there is no improvement and the loss starts to increase; we set patience to 10. Verbose: to discover and print the training epoch at which training was stopped, verbose can be set to 1. restore_best_weights determines whether to retrieve the weights with the best value of the monitored quantity; here we set it to True.

The ModelCheckpoint class allows you to define where to checkpoint the model weights and keep them. The weights can therefore be loaded later to carry on training from the saved state. We monitored the validation loss and minimized it using the mode='min' parameter.
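A minimal sketch of the early stopping and checkpointing configuration described above, using the standard Keras callbacks; the monitored quantity and the checkpoint file name are assumptions consistent with the text, not code from the paper.

```python
# A hedged sketch of the callbacks described above. The checkpoint file name
# is a hypothetical placeholder.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

early_stop = EarlyStopping(
    monitor="val_loss",
    patience=10,               # stop after 10 epochs without improvement
    verbose=1,                 # print the epoch at which training stopped
    restore_best_weights=True, # roll back to the best monitored value
)

checkpoint = ModelCheckpoint(
    "best_model.h5",           # hypothetical path for the saved weights
    monitor="val_loss",
    mode="min",                # keep the weights that minimize the loss
    save_best_only=True,
)

# model.fit(train_gen, epochs=25, callbacks=[early_stop, checkpoint])
```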
Our system detects the faces in the input images of the students by using a Haar cascade detector and then classifies them into one of seven basic expressions. The proposed method achieved an accuracy of 77% using the Adam optimizer on the FER-2013 dataset at 25 epochs.
V. CONCLUSION

In this study, our aim was to detect faces and then classify facial expressions, so we present a convolutional neural network model for the recognition of the facial expressions of students. With the help of deep learning and machine learning technologies, we can classify the emotions of the online learner; this can help the instructor recognize students' understanding during a presentation and gives feedback to the educator. In our future work, we are going to focus on applying the Convolutional Neural Network model to 3D images of students' faces so as to extract their emotions.

REFERENCES

[12] A. Fathallah, L. Abdi, and A. Douik, "Facial expression recognition via deep learning," Proc. IEEE/ACS Int. Conf. Comput. Syst. Appl. (AICCSA), pp. 745–750, 2018.
[13] M. Mohammadpour, H. Khaliliardali, S. M. R. Hashemi, and M. M. Alyannezhadi, "Facial emotion recognition using deep convolutional networks," 2017 IEEE 4th Int. Conf. Knowledge-Based Eng. Innov. (KBEI), pp. 0017–0021, 2018.
[14] I. Talegaonkar, K. Joshi, S. Valunj, R. Kohok, and A. Kulkarni, "Real Time Facial Expression Recognition using Deep Learning," Elsevier-SSRN, 2019.
[15] P. Jiang, B. Wan, Q. Wang, and J. Wu, "Fast and Efficient Facial Expression Recognition Using a Gabor Convolutional Network," IEEE Signal Process. Lett., vol. 27, pp. 1954–1958, 2020.
[16] T. Chang, G. Wen, Y. Hu, and J. J. Ma, "Facial expression recognition based on complexity perception classification algorithm," arXiv, 2018.
[17] W. Wang, K. Xu, H. Niu, and X. Miao, "Emotion Recognition of Students Based on Facial Expressions in Online Education Based on the Perspective of Computer Simulation," Complexity, vol. 2020, 2020.
A Lightweight and Interpretable Deepfakes
Detection Framework
Muhammad Umar Farooq, Department of Software Engineering, University of Engineering and Technology, Taxila, Pakistan, softwareengineerumar@gmail.com
Ali Javed, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan, ali.javed@uettaxila.edu.pk
Khalid Mahmood Malik, Department of CS and Engineering, Oakland University, Rochester, MI, USA, mahmood@oakland.edu
Abstract—The recent realistic creation and dissemination of so-called deepfakes poses a serious threat to social life, civil rest, and law. Celebrity defaming, election manipulation, and deepfakes as evidence in a court of law are a few potential consequences of deepfakes. The availability of open source trained models based on modern frameworks such as PyTorch or TensorFlow, video manipulation apps such as FaceApp and REFACE, and economical computing infrastructure has eased the creation of deepfakes. Most of the existing detectors focus on detecting either face-swap, lip-sync, or puppet-master deepfakes, but a unified framework to detect all three types of deepfakes is hardly explored. This paper presents a unified framework that exploits the power of a proposed feature fusion of hybrid facial landmarks and our novel heart rate features for the detection of all types of deepfakes. We propose novel heart rate features and fuse them with the facial landmark features to better capture the facial artifacts of fake videos and the natural variations present in original videos. We use these features to train a lightweight XGBoost classifier to distinguish between deepfake and bonafide videos. We evaluated the performance of our framework on the world leaders dataset (WLDR), which contains all types of deepfakes. Experimental results illustrate that the proposed framework offers superior detection performance over the comparative deepfakes detection methods. Performance comparison of our framework against LSTM-FCN, a candidate deep learning model, shows that the proposed model achieves similar results while being more interpretable.

Index Terms—Deepfakes, Multimedia Forensics, Random Forest Ensembles, Tree boosting, XGBoost, Faceswap, Lip sync, Puppet Master.

I. INTRODUCTION

Recent advancements in deep learning (DL) have impacted the way we solve complex technical problems in computer vision (CV) and robotics. With the widespread availability of video synthesis repositories and video manipulation apps such as FaceApp [2] and REFACE [1], video manipulation has become easy, even for a layman. Video synthesis is beneficial in some ways, like avatar creation, animated video content creation, etc. Sometimes videos are synthesized just for the sake of fun, like a recent realistic TikTok video of Tom Cruise [3]. However, the case is not always that simple. Depending on the time and context, deepfakes pose a serious threat to society. With deepfakes, celebrities are defamed, and election campaigns could be manipulated. DL based video synthesis tools use generative adversarial networks (GAN) under the hood. The adaptive nature of GANs has made it difficult to develop a robust detection solution. Whenever a deepfakes detection model is developed, we witness some variant of a GAN based generation model that exploits the newly developed detection model by manipulating its cues. Thus, deepfakes creation and detection is a constant battle between ethical and unethical machine learning (ML) experts.

Deepfakes detection got much attention in the last decade after realistic fake videos of politicians and celebrities went viral via social media platforms. Current deepfake videos are categorized as face-swap, lip-sync, and puppet-master [4]. In face-swap deepfakes, the face of a target person is added in place of a source person in the original video to create a fake video of the target person. In lip-sync deepfakes, the lips of a person are synced to an audio track to suggest that the person is speaking the text in that audio. In puppet-master deepfakes, the face of the target person is placed in the original video, but the facial expressions of the source person are retained on the target face to make the fake more realistic. Most of the existing detection solutions target specific types of deepfakes; however, generic solutions capable of countering all types of deepfakes are less explored. For example, Agarwal et al. [5] proposed a detection technique for lip-sync deepfakes. This technique exploited the inconsistencies between the viseme (mouth shape) and the phoneme (spoken word). This work applied manual and CNN based techniques to compute the mapping of visemes to phonemes. The model is good for a specific set of seen data. However, model performance can degrade on unseen data with different patterns of viseme-to-phoneme mapping, with a change of speaking accent, or even with non-alignment of audio to video.

Most of the existing systems are unable to perform well on
all three types of deepfakes. Moreover, deepfakes detection models based on traditional classifiers like SVM work only where the data is linearly separable. CNN based models are computationally more complex and are black-box in terms of prediction. Therefore, this paper addresses the following research questions:

1) Is it possible to improve the detection accuracy of deepfakes using hybrid landmark and heart-rate features on a diverse dataset containing all three types of deepfakes?
2) Is it possible to create a generalized detection model based on the proposed hybrid landmark and heart-rate features and ensemble learning?
3) Is it possible to achieve the same accuracy as deep learning models but improve interpretability by using an ensemble of supervised learners?

Existing deepfake detection techniques are broadly categorized as handcrafted-features based [6]–[9] or DL based [10]–[14]. For example, Yang et al. [9] used 68-D facial landmark features to train an SVM classifier for detection. This work achieved good performance on good quality videos of the UADFV [9] and DARPA MediFor [15] datasets but was unable to perform well on low quality videos. Moreover, the evaluation of this work did not consider all types of deepfakes. Matern et al. [6] used 16-D texture based eye and teeth features to exploit visual artifacts for detecting video forgeries like face-swap and Face2Face. The most important aspect of this work was detecting the difference in the eye color of a POI for face-swap deepfakes detection by exploiting missing details like reflections in eye color. Additionally, this work uses face border and nose tip features along with the eye color features for Face2Face deepfakes detection. This technique [6] has the limitation of working only for faces with clear teeth and open eyes. Lastly, the evaluation of this work was only performed on the FaceForensics++ [10] dataset. Li et al. [7] used the targeted affine warping artifacts introduced during deepfakes generation. Targeting the specific artifacts reduced the training overhead and improved the efficiency. However, this specific artifact selection can compromise the robustness of the technique by making it difficult to detect a deepfake with slightly new transformation artifacts. Agarwal et al. [8] used an open source toolkit, OpenFace2 [16], for facial landmark features extraction. Some features were derived based on the extracted landmark features. These derived features were then used along with action unit (AU) features to train a binary SVM for deepfakes detection. This technique was proposed for five POIs where all POIs were linearly separable in a t-SNE plot. However, for an increased number of POIs in the updated dataset [17], the performance of this technique degraded significantly. In their extended work, Agarwal et al. [17] proposed a framework based on spatial and temporal artifacts in deepfakes. This framework is based on some threshold based rules to classify a video as real or fake. This rule-based approach works on selected datasets; however, the performance of this hard coded, threshold oriented approach is expected to degrade on unseen data. In [18], the authors proposed a new framework, 'FakeCatcher', which uses biological signals from three face regions of real videos to detect fake videos. FakeCatcher applied many transformations to the biological features, like autocorrelation, power spectral density, wavelet transform, etc. The authenticity decision is based on the aggregated probabilities of two probabilistic classifiers (SVM and CNN). Performance was evaluated on their own customized dataset; however, it was not evaluated on all three types of deepfakes.

Besides the handcrafted-features based methods, deep learning based methods are also being employed for deepfakes detection. Guera et al. [10] applied a DL based technique to detect deepfakes. This technique applied a CNN to extract features, followed by a long short-term memory (LSTM) to learn those features. An important contribution of this work was the exploitation of temporal inconsistencies among deepfakes for classification. However, this approach is unable to identify all three types of deepfakes. Afchar et al. [11] designed a neural network (MesoNet) to detect deepfakes and Face2Face video forgeries. This work designed an end-to-end architecture with convolutional and pooling layers for feature extraction followed by dense layers for classification. These methods [10], [11] were evaluated on videos collected from random websites rather than a standard dataset, which casts doubt on the robustness of these approaches for a large-scale and diverse standard dataset. Nguyen et al. [12] designed a capsule network to expose multiple types of tampering in images and videos. This framework aimed at the detection of face swapping, facial re-enactments, and computer generated images. The framework used dynamic routing and expectation-maximization algorithms for performance improvement. The capsule network employed VGG-19 for latent face feature extraction and used these features for the classification of fake and bonafide videos. The framework is good at detecting face-swap forgeries in the FaceForensics dataset; however, it was not evaluated on lip-sync and puppet-master deepfakes and is complex in terms of computations. Sabir et al. [13] proposed a DL based method that feeds cropped and aligned faces to a CNN (ResNet and DenseNet) for feature extraction, followed by an RNN for classification. The most important aspect of this work was the use of features from multiple levels of the CNN to incorporate mesoscopic level feature extraction. This work [13] only used the FaceForensics++ [11] dataset for evaluation and did not consider lip-sync and puppet-master deepfakes. Yu et al. [14] used a CNN to capture the fingerprints of GAN generated images to perform the classification of synthetic and real images. This technique targeted fake images generated with four GAN variants (ProGAN, SNGAN, CramerGAN, MMDGAN), but it might not be able to detect fake images generated with a new GAN variant. In [19], the authors used an ensemble of four CNNs to achieve good results on DFDC. An attention mechanism was added to EfficientNetB4 to get insights into the training process. EfficientNetB4 and EfficientNetB4Att were trained end-to-end, whereas EfficientNetB4ST and EfficientNetB4AttST were trained in Siamese training settings. An important aspect of this method was the data augmentation
(i.e., down sampling, hue saturation, JPEG compression, etc.) during training and validation for model robustness. Moreover, this technique performs well on the large face-swap dataset DFDC but is not evaluated on all three types of deepfakes and is computationally complex. In [20], the authors used EfficientNet (a CNN) and a gated recurrent unit (GRU) (an RNN) to exploit spatiotemporal features in video frames to detect deepfake videos. This work included data augmentation on real videos during training to balance classes, as DFDC is highly class imbalanced. Moreover, this architecture performs well on the large face-swap dataset DFDC but is not evaluated on all three types of deepfakes and is complex in terms of computations.

Most methods based on handcrafted features [6]–[9] fail to generalize well on different types of deepfakes like lip-sync and puppet-master. CNN based techniques [10]–[14] are computationally complex and black-box in terms of generating the output. Moreover, these methods exploit some GAN specific artifacts produced during generation, so they might fail to detect deepfakes generated with a new GAN architecture.

To address the above mentioned problems and limitations of existing works, this paper proposes a lightweight model based on a feature fusion of facial landmarks and heart rate features. For the landmark features, we analyzed the impact of each landmark feature category before the final feature selection. We analyzed the impact of different combinations of feature categories: we started with the two most effective feature categories and then added one category to the feature-set at a time in decreasing order of effectiveness. We disregarded the concept of POIs being linearly separable, because that concept becomes invalid with a higher number of POIs. We used XGBoost [21] for classification. XGBoost uses bagging, as in Random Forests, for variance related errors and a gradient boosting algorithm for bias related errors. XGBoost successfully addresses the classification problem where data points are not linearly separable.

The main contributions of this paper are as follows:
• We propose a lightweight and interpretable deepfakes detection framework capable of accurately detecting all types of deepfakes, namely faceswap, puppet-master, and lipsync.
• We propose novel heart rate features and fuse them with a robust set of selected facial landmark features for deepfakes detection.
• We highlight that an XGBoost based solution is lightweight compared to CNN based solutions and generalizes better compared to other conventional classifiers like SVM, KNN, etc.

The rest of the paper is structured as follows. Section 2 presents the details of feature engineering and model development. In Section 3, we present the details of the performance evaluation and a comparative analysis w.r.t. state of the art methods. Finally, we conclude the paper in Section 4.

II. METHODOLOGY

This section provides an overview of the proposed framework. As shown in Figure I, the input video is processed to extract 850-D facial landmark and 63-D heart rate features. An XGBoost classifier is used for classification. The classifier is trained on each sub-category of landmark and heart rate features. Finally, we reduce the dimensions of our features to select the most reliable features among them all and form the final features-set. The XGBoost classifier is trained on the final features-set to classify a video as fake or bonafide. The process flow of the proposed solution is shown in Figure I.

A. Features Extraction

Effective feature extraction is crucial for any classification task. For this purpose, we propose a fused features-set consisting of our novel heart rate features and the facial landmark features. We extracted the facial landmark features using the OpenFace2 [16] toolkit. For the heart rate features, we selected seven regions of interest (ROIs), as shown in Figure I. The seven ROIs are right cheek (RC), left cheek (LC), chin (C), forehead (F), outer right edge (OR), outer left edge (OL), and center (CE). We calculated the RGB values of all ROIs and then applied some transformations to create the heart rate features. Details of the transformations are as follows:

HRs = {Z_R, Z_G, Z_B}    (1)

HRr = {Z_R/Z_G, Z_R/Z_B, Z_G/Z_B}    (2)

where HRs ∈ {RC, LC, C, F, OR, OL, CE} and R ↔ red, G ↔ green, B ↔ blue.

HR = HRs ∪ HRr    (3)

where HRs represents the simple heart rate features at the ROIs and HRr the ratios of the heart rate features. The union of these features generates our heart rate features.
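A minimal NumPy sketch of Eqs. (1)–(3) follows: per-ROI mean RGB values plus their channel ratios, concatenated into a heart rate feature vector. The ROI pixel extraction is assumed to happen elsewhere (e.g., from OpenFace2 landmarks), and the per-frame vector here is 42-D; the paper's full 63-D set presumably includes further transformations not shown.

```python
# A hedged sketch of Eqs. (1)-(3). Each ROI is assumed to already be an
# (N, 3) array of RGB pixel values; the random inputs are toy stand-ins.
import numpy as np

ROIS = ["RC", "LC", "C", "F", "OR", "OL", "CE"]  # seven regions of interest

def heart_rate_features(roi_pixels):
    """roi_pixels: dict mapping ROI name -> (N, 3) array of RGB values."""
    features = []
    for name in ROIS:
        zr, zg, zb = roi_pixels[name].mean(axis=0)   # Eq. (1): simple features
        features += [zr, zg, zb]
        features += [zr / zg, zr / zb, zg / zb]      # Eq. (2): channel ratios
    return np.array(features)                        # Eq. (3): union of both

rng = np.random.default_rng(0)
rois = {name: rng.uniform(1, 255, size=(100, 3)) for name in ROIS}
print(heart_rate_features(rois).shape)  # (42,) = 7 ROIs x 6 features per frame
```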
B. Features Standardization & Segmentation

Both the landmark and our proposed heart rate features are on different scales. To fuse the features, we standardize them by partially learning the distribution of the features during data loading. We apply standardization as shown in Eq. (4), based on the learned distribution over all features.

z = (x − µ) / σ    (4)

where µ is the mean and σ is the standard deviation of a feature column.

Our solution works at both the frame and the segment level. For segment level operation, we created segments with a length of 30 frames and an overlap of 10 frames. In our case, the video frame rate is 30 frames per second.
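The following sketch combines Eq. (4) with the segmentation scheme just described, reading the 10-frame overlap as a stride of 20 frames; the random matrix is a stand-in for a video's per-frame feature rows.

```python
# A hedged sketch of Eq. (4) plus segmentation: z-score standardization per
# feature column, then 30-frame segments with a 10-frame overlap.
import numpy as np

def standardize(X):
    # Eq. (4): z = (x - mu) / sigma, per feature column
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma

def segment(X, length=30, overlap=10):
    stride = length - overlap
    return [X[s:s + length] for s in range(0, len(X) - length + 1, stride)]

frames = np.random.rand(300, 913)   # 10 s at 30 fps; 850-D landmarks + 63-D HR
segments = segment(standardize(frames))
print(len(segments), segments[0].shape)  # 14 segments of shape (30, 913)
```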
Figure I. Architecture of the Proposed Framework

C. Classification

For the classification task, we need a classifier that is lightweight and can generalize easily to new datasets. The classification process should be interpretable so that we can follow a directed path for further improvements. To incorporate those requirements, we employed extreme gradient boosting
(XGBoost) [21], an approach for gradient boosted decision trees. XGBoost is an algorithm in the class of gradient boosting machines. In boosting algorithms, many weak learners are ensembled sequentially to create a strong learner with low variance and high accuracy. In boosting, the learning of the next predictor is improved to avoid repeating the errors made by any previous predictor. In a Random Forest, a model with deeper trees gives good performance, but in XGBoost, shallow trees perform better because of boosting. There are two boosting approaches, Adaptive Boosting and Gradient Boosting. Adaptive boosting puts more weight on misclassified data samples, while gradient boosting treats misclassified samples as gradients and uses gradient descent to iteratively optimize the loss. XGBoost employs gradient boosting. Using XGBoost will be highly effective for large datasets, as it is highly scalable and computationally efficient. We can use the power of the GPU, as XGBoost can perform out-of-core computations. The objective function of XGBoost is based on a training loss and a regularization function, as shown in Eqs. (5) and (6). The training loss helps in stage wise bagging of trees in the random forest to decrease the variance error. The regularization function helps to reduce the bias related errors using boosting.

O = Σ_{i=1}^{n} l(y_i, ŷ_i^t) + Σ_{i=1}^{t} Ω(f_i)    (5)

where t is the total number of trees, y_i is the actual value, ŷ_i^t is the prediction at time t, and n is the total number of training
samples.

Ω(f_i) = γT + (λ/2) Σ_{j=1}^{T} ω_j²    (6)

where γ is the minimum reduction in loss required for a new split on a leaf node, and λ is the l2 regularization term on the leaf weights, which helps avoid overfitting.
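As a hedged sketch, the classifier could be configured as below with the xgboost package, using the hyperparameters reported later in Section III (learning rate 0.01, 1500 trees, max tree depth 8); the synthetic data is a stand-in for the fused feature vectors.

```python
# A hedged sketch of the XGBoost classifier configuration; the toy data is a
# stand-in for the fused 913-D landmark + heart-rate feature vectors.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 913))      # fused landmark + heart-rate features
y = rng.integers(0, 2, size=1000)     # 0 = bonafide, 1 = deepfake (toy labels)

clf = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1500,
    max_depth=8,
    reg_lambda=1.0,   # l2 term on leaf weights (lambda in Eq. (6))
    gamma=0.0,        # min loss reduction for a split (gamma in Eq. (6))
)
clf.fit(X, y)
print(clf.predict_proba(X[:5])[:, 1])  # per-sample deepfake probability
```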
III. EXPERIMENTS AND RESULTS

A. Dataset

We evaluated our method on the world leaders dataset (WLDR) [8]. WLDR is the only dataset with all three types of deepfakes. WLDR comprises real and fake videos of ten U.S. politicians, and real videos of comedy impersonators of those political figures. The WLDR dataset has all three types of deepfakes, i.e., face-swap, puppet-master, and lip-sync. WLDR has lip-sync deepfakes for only one POI, i.e., Obama. The face-swap videos of WLDR are created by replacing the face of the impersonator with the face of the corresponding politician. The WLDR dataset has 1753 real and 93 fake videos. Other datasets like DFDC, FF++, and DFD have more fake videos than real videos. WLDR has more real videos (95%) than fake videos (5%), which is good: for better detection we have to learn the patterns in the real videos rather than the fake videos, as fake videos are constantly changing with the evolution of GANs. Still, it is not large enough to generalize a model to perform well on in-the-wild deepfakes. We used the area under the curve (AUC) as the evaluation metric. The reason for using AUC is that almost all the available datasets are highly class imbalanced, and AUC gives a fairer performance score for imbalanced classes than accuracy.

B. Performance Evaluation of the Proposed Framework

The objective of this experiment is to evaluate the performance of the proposed framework on a diverse dataset, WLDR, having all three types of deepfakes. For this purpose, we fed the proposed features of the selected landmarks and the heart rate features to train the XGBoost based random forest ensemble to perform the classification of bonafide videos and deepfakes. The heart rate features and the sub-categories of landmark features are on different scales, so we standardized the features before feeding them to the classifier. For standardization, we calculated the mean and standard deviation of the whole training set during data preparation. We scaled the train, test, and validation sets to make sure the mean of the rescaled data is zero and the standard deviation is one. We evaluated our model at the frame and segment levels. In WLDR, the frame rate of the videos is 30 frames per second. For segment level evaluation, we created 30-frame segments with an overlap of 10 frames. Our model is robust to both frame and segment level detection.

We evaluated our model on each of the six categories of facial landmark features and the heart rate features. The list of features effective for the detection task, in descending order, is 2D landmark, 3D landmark, eye landmark, headpose, heart rate, shape, and action unit features. Table I presents the results for the individual feature types. We conducted an evaluation on different combinations of features in descending order of their effectiveness. Table II presents the results for combinations of feature categories. We observed from Table II that the 2D and 3D landmark features are most effective, giving an AUC of 0.9311. We also observed that the eye landmark and headpose features are effective, increasing the AUC from 0.9311 to 0.9326 when combined with the 2D and 3D landmarks. Additionally, we observed that combining the heart rate features with the selected landmark features is very effective and increases the AUC from 0.9326 to 0.9505. Based on our observations, we did not include the shape features in the final features-set, due to the only slight improvement in AUC, from 0.9505 to 0.9510, when the shape features are also included in the fused features-set. Our final features-set includes the eye landmark, headpose, 2D & 3D landmark, and heart rate features. As per our hypothesis, combinations of features that are individually effective also perform better. Finally, we selected five out of the seven feature categories for our model. We evaluated our model over a wide range of parameters. More specifically, we set the learning rate to 0.01, the number of trees to 1500, and the max tree depth to 8.

C. Performance Comparison of the Proposed and Existing Methods

This experiment is designed to measure the performance of our framework against existing state-of-the-art deepfakes detection methods. For this, we compared the performance of the proposed framework against [8] and [17]. Table III presents the results of the comparison of the proposed framework against the existing models. Our model outperforms [8], which is based on action unit features and derived features capturing mouth movements, but our model's performance is lower than that of their extended work [17]. We also compared our model with a deep learning (DL) classifier, LSTM-FCN [22]. The technique of Agarwal et al. [8] works on the assumption of linear separability of bonafide and deepfake videos in a t-SNE plot based on selected features, but this technique failed to generalize to all types of deepfakes. In their extended work, Agarwal et al. [17] evaluated their method on 10-second video clips rather than frames and segments of small length. This model performs better and generalizes well on all existing face-swap datasets. However, in that work only face-swap deepfakes are considered, and lip-sync and puppet-master deepfakes are not addressed. Moreover, the performance of this method [17] is expected to drop if evaluated at the frame and segment level, due to its threshold based approach. We observed from the results (Table III) that a DL based model, LSTM-FCN, can achieve results comparable to those we achieved with the XGBoost based Random Forest ensembles. However, compared to LSTM-FCN, our proposed framework is lightweight and interpretable rather than being the black-box model of a DL classifier.
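A minimal sketch of the segment level evaluation with scikit-learn is shown below; averaging per-frame probabilities into a segment score is one plausible aggregation, assumed here for illustration, and the arrays are toy stand-ins rather than WLDR outputs.

```python
# A hedged sketch of computing segment-level AUC with scikit-learn.
# The toy arrays stand in for classifier outputs on WLDR segments.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
frame_probs = rng.uniform(size=(40, 30))    # 40 segments x 30 frames
segment_scores = frame_probs.mean(axis=1)   # aggregate frames into a segment score
segment_labels = rng.integers(0, 2, size=40)

print("segment-level AUC:", roc_auc_score(segment_labels, segment_scores))
```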
TABLE I. Segment and frame level AUC on individual features categories

Features Used    Eye landmark   Head pose   2D landmark   3D landmark   Shape    Action Unit   Heart Rate
AUC (segment)    0.8851         0.8023      0.8982        0.8978        0.7644   0.5027        0.7956
AUC (frame)      0.8659         0.7774      0.8903        0.8856        0.7357   0.5017        0.7866

TABLE II. Segment and frame level AUC on combinations of features categories

Features Used                                      AUC (segment)   AUC (frame)
2D lmk, 3D lmk                                     0.9297          0.9158
Eye lmk, 2D lmk, 3D lmk                            0.9311          0.9059
Eye lmk, Headpose, 2D lmk, 3D lmk                  0.9326          0.9068
Eye lmk, Headpose, 2D lmk, 3D lmk, HR              0.9505          0.9425
Eye lmk, Headpose, 2D lmk, 3D lmk, Shape, HR       0.9510          0.9285

TABLE III. Comparison of XGBoost with [8], [17], and LSTM-FCN [22] on WLDR (AUC)

Model Name                        WLDR    Evaluation Levels
Protecting World Leaders [8]      0.93    Frame and segment level
LSTM-FCN [22]                     0.95    Segment level
XGBoost (proposed)                0.95    Frame and segment level
Appearance and Behavior [17]      0.99    Video level

IV. CONCLUSION AND FUTURE WORK

This work has presented a unified method based on the fusion of our novel heart rate features and facial landmarks for detecting all three types of deepfakes. Unlike many existing methods, our method is lightweight, interpretable, and effective at the same time. Moreover, compared to existing lightweight techniques, our method is more robust and interpretable. We highlighted that an XGBoost based framework is lightweight compared to CNN based solutions and generalizes better than other conventional classifiers. For this purpose, we compared our proposed method with a time-series DL classification model, LSTM-FCN. However, the proposed framework follows a signature based approach and thus may not be very effective against deepfakes developed in the future. The proposed method also needs to be enhanced for optimized cross-corpus evaluation. For our future work, we will perform cross-dataset evaluation, experimenting on datasets that have multiple forgeries per sample.

V. ACKNOWLEDGMENTS

This work was supported by a grant of the Punjab HEC of Pakistan via Award No. (PHEC/ARA/PIRCA/20527/21).

REFERENCES

[1] Reface App. Available: https://reface.app/ (accessed 14.06.2021).
[2] FaceApp. Available: https://www.faceapp.com/ (accessed 14.06.2021).
[3] Tom Cruise TikTok deepfake video. Available: https://edition.cnn.com/videos/business/2021/03/02/tom-cruise-tiktok-deepfake-orig.cnn-business/video/playlists/business-social-media/ (accessed 14.06.2021).
[4] M. Masood, M. Nawaz, K. M. Malik, A. Javed, and A. Irtaza, "Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward," arXiv preprint arXiv:2103.00484, 2021.
[5] S. Agarwal, H. Farid, O. Fried, and M. Agrawala, "Detecting deep-fake videos from phoneme-viseme mismatches," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 660–661.
[6] F. Matern, C. Riess, and M. Stamminger, "Exploiting visual artifacts to expose deepfakes and face manipulations," in 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), 2019.
[7] Y. Li and S. Lyu, "Exposing deepfake videos by detecting face warping artifacts," arXiv preprint arXiv:1811.00656, 2018.
[8] S. Agarwal et al., "Protecting World Leaders Against Deep Fakes," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[9] X. Yang, Y. Li, and S. Lyu, "Exposing deep fakes using inconsistent head poses," in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 8261–8265.
[10] D. Güera and E. J. Delp, "Deepfake video detection using recurrent neural networks," in 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2018.
[11] D. Afchar et al., "MesoNet: a compact facial video forgery detection network," in 2018 IEEE International Workshop on Information Forensics and Security (WIFS), 2018.
[12] H. H. Nguyen, J. Yamagishi, and I. Echizen, "Capsule-forensics: Using capsule networks to detect forged images and videos," in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
[13] E. Sabir et al., "Recurrent Convolutional Strategies for Face Manipulation Detection in Videos," Interfaces (GUI), vol. 3, p. 1, 2019.
[14] N. Yu, L. Davis, and M. Fritz, "Learning GAN fingerprints towards Image Attribution," arXiv preprint arXiv:1811.08180, 2019.
[15] Media Forensics Challenge 2018. Available: https://www.nist.gov/itl/iad/mig/media-forensics-challenge-2018 (accessed 14.06.2021).
[16] T. Baltrusaitis et al., "OpenFace 2.0: Facial behavior analysis toolkit," in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018.
[17] S. Agarwal, H. Farid, T. El-Gaaly, and S. N. Lim, "Detecting deep-fake videos from appearance and behavior," in 2020 IEEE International Workshop on Information Forensics and Security (WIFS), 2020, pp. 1–6.
[18] U. A. Ciftci, I. Demir, and L. Yin, "FakeCatcher: Detection of synthetic portrait videos using biological signals," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
[19] N. Bonettini, E. D. Cannas, S. Mandelli, L. Bondi, P. Bestagini, and S. Tubaro, "Video face manipulation detection through ensemble of CNNs," in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 5012–5019.
[20] D. M. Montserrat, H. Hao, S. K. Yarlagadda, S. Baireddy, R. Shao, J. Horváth, et al., "Deepfakes detection with automatic face weighting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 668–669.
[21] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[22] F. Karim, S. Majumdar, H. Darabi, and S. Harford, "Multivariate LSTM-FCNs for time series classification," Neural Networks, vol. 116, pp. 237–245, 2019.
Searching for Aesthetical Values in an Upgraded Informal
Neighborhood in Tirana
Edmond Manahasa
Department of Architecture, Epoka University, Tirana, Albania
emanahasa@epoka.edu.al

Arila Rasha
Department of Architecture, Epoka University, Tirana, Albania
arasha14@epoka.edu.al
Abstract— This research aims to explore the possible aesthetical qualities of an informal settlement called Bathore, which evolved in the post-socialist period in Tirana, the capital city of Albania. The informal settlement emerged after the fall of the socialist system, due to migration from northern and eastern Albania driven by unemployment and economic reasons. As a consequence, considerable parts of the agricultural lands at the periphery of Tirana were usurped and transformed into informal neighborhoods, which lacked the needed infrastructure. Almost thirty years after this large socio-urban "tectonic crack," these zones are in a continuous process of urban upgrade and integration with the other parts of the city through legalization and infrastructure provision. The study benefits from the theoretical framework of aesthetics, using its main principles to reveal the aesthetical values of a selected urban fragment within Bathore. On this basis it provides an aesthetical analysis examined on two levels: urban and architectural. The urban aesthetical level of this informal context is analyzed by using pattern analysis, revealing the existence of a proportional relation between building and parcel components. The study found a good ratio between these components, which provides a comfortable built environment. Furthermore, at the urban level a quasi-modular building footprint is found to be used in a considerable amount within the selected fragment. At the architectural level, similar principles are explored by analyzing the buildings' exterior visual quality. The study found the existence of aesthetical values at the architectural level as well. Apart from the unfinished buildings, the aesthetical values at the architectural level are provided by the usage of similar/balanced/repeated volumes in non-street-side buildings, and in street-side landmark buildings by qualitative exterior materials and proportional volumes.

Keywords—aesthetical values, informal settlements, post-socialist period, Tirana

I. Introduction on Informal Settlements in Albania

The fall of the socialist regime in Albania is associated with deep consequences for the life of Albanians. Apart from the freedom of speech, the transition to the new liberal economic system caused the bankruptcy of most state companies. As a matter of fact, a large number of citizens suffered unemployment, which caused migration to neighboring countries like Italy and Greece. Parallel to this process, internal migration also occurred from the country's northern and eastern parts to central Albanian cities like Tirana and Durres. According to Dervishi [1], by the 90s Tirana had 240,000 citizens, whereas now it has quadrupled up to 1 million. The existing housing stock could not provide dwellings for this large number of newcomers, who could not afford to buy a house. This economic reality, merged with inefficiencies in urban development and political capital gain reasons, according to Fuga [2], made illegal settlements flourish, especially on the agricultural lands at the periphery of Tirana.

The housing construction process was based on several steps, which were related to the limited budget the settlers possessed. Once the head of the family or one of the males had secured the land plot through different intermediate means (like buying from a previous owner, obtaining it from the state company which managed it during the socialist period, or simply forcefully seizing it), the first step was to enclose it with a fence or a wall. In the second step, after construction of the foundations, sometimes only a room or a part of the house was constructed by hand power. At a further step, the first floor of the house was finished, and predominantly bared reinforcement irons were left at the housings' terraces, aiming to add further upper floors. This process has continued in this manner, and after approximately thirty years these informal dwellings have ended up as three- to four-floor houses. While some of the informal houses are featured by irregular volumes, in considerable cases, due to the qualitative exterior finishing, they appear like three-floor villas.

Because the informal householders built their homes without construction permission, from 2006 the state legalization institution, the Agency "ALUIZNI", was established. This process, apart from providing a legal status, opened the framework for further infrastructural investments which upgraded the living conditions in these zones. However, the process is still continuing, and the disputable ownership status of the site plots (in which the best case is when the plot belonged to the state and the most complicated one when it belonged to other people) has resulted in a very long process.

This research aims to reveal the possible aesthetical values of informal settlements by focusing on a peripheral neighborhood of Tirana called Bathore. The study selects and analyzes an urban fragment within this neighborhood. Referring to the main principles of aesthetics, it examines the aesthetical values by dividing them into urban and architectural levels. At the urban level, using pattern analysis, it analyzes the relation between the fragment's smaller components, exploring the existence of possible aesthetical principles of scale and proportion. Similarly, at the architectural level, it explores the existence of buildings which possess aesthetical values, by conducting site observation and descriptive analysis of their exterior.

A. Informal Neighbourhood of Bathore

During the socialist period, Bathore was a village under the administration of Kamza, which was an agricultural town. After the fall of the socialist period, as a result of the internal migration explained in the prior part, newcomers from Northern Albania settled there, transforming it
completely into a large informal zone. Most people obtained the land to develop their dwellings from other settlers who had seized it in the 90s, and a minor number had simply occupied public land [3]. The agricultural character of Bathore has disappeared, as it has been transformed into a huge neighborhood filled with informal buildings. Bathore's chaotic silhouette draws the eye of any person who decides to travel along the national road to the northern Albanian cities, because the national road passes directly through this area.

Figure I. Location of Bathore in the northeast part of Tirana (red dot).

After the establishment of the ALUIZNI agency, the process of legalization of the informal settlement provided another impetus in offering the needed infrastructure to this settlement. Although, due to political intentions, the legalization process has been overextended for almost 30 years, important infrastructural facilities like asphalted roads and solid waste management have been provided. Especially the role of one local NGO called CO-PLAN has been essential in producing planning strategies and bridging the gap between the local community and the administration through participatory design processes [4]. Apart from that, people have also continuously upgraded their houses, improving their dwelling comfort and exterior quality.

Figure II. The image of the first informal settlements in 1994 (left) and the same place in 2007 (middle, ©John Driscoll, IIUD), and the new infrastructure of Bathore in 2019 (right).

B. Theoretical Underpinning for Urban Aesthetics

The notion of aesthetics etymologically comes from Greek, meaning "to perceive" or related to senses or sensation. Starting from classical antiquity, it has been a subject of elaboration in different fields of life like art, music, poetry, or architecture. Plato looked for aesthetics in absolute beauty, which relied on proportion, harmony, and unity. Aristotle, among others, emphasized pleasure as an element of beauty, pointing out the importance of perception. Vitruvius provided his vision of buildings by proposing his three famous principles, firmitas, utilitas, and venustas, putting aesthetics last in order. Especially strong is the influence of the perception of Pythagoras in this period, which used the "mathematical theory of musical consonance" to explain the order of the universe. Other Renaissance masters like Alberti built their concept of aesthetics on what is "right", "appropriate", "proper" and "proportionable" [5]. The Enlightenment period architects put their focus on "order" and the "sublime". With the emergence of functionalism as the major creative force of modern architecture, the aesthetical features are indisputably criticized as highly reductionist. Considering the impact of Loos and Mies van der Rohe in the 20th century, apparently rather than decorative features, spatial qualities have taken the dominant role in architectural compositions.

Proportion and scale are in fact the most widely accepted fundamental principles of aesthetics. While the basis of proportion was laid in the works of Pythagoras, in later periods up to the present different proportional systems have been produced, like the Golden ratio, the Classical orders, Fibonacci numbers, the Vitruvian man, or Le Modulor, using nature and the human body as their primary sources. As for scale, it refers to how we perceive the size of something in relation to something else. The human scale tends to provide an evaluation of a building's size taking the human body size as the reference [6].

Based on this theoretical framework, we aim to explore the existence of principles of aesthetics which can be operational in an informal urban context. Apart from the two major principles of aesthetics, scale and proportion, we also aim to see the existence of other principles like symmetry, repetition, composition, balance, linearity, or rhythm. Since the informal settlements are not planned but, on the contrary, developed spontaneously, we will also explore the possible relation between certain social settings, like brotherhood, and aesthetical values.

II. Exploring Aesthetical Values in an Informal Settlement

Having explained the research context and certain basic concepts of aesthetics, we have decided to analyze the aesthetical features of a selected urban fragment within Bathore called neighborhood number 1. To do that, we propose to analyze the aesthetical features on two levels: urban and architectural. To comprehend the physical elements at the urban level, we conduct a city image analysis based on Lynch. Furthermore, to explore the possible aesthetical values at the urban level, we use a pattern analysis, revealing the relation between building and parcel and evaluating the existence of the proportion and scale principles. At the architectural level, we first classify the buildings, using their relation to the street, into two groups: street-side and non-street-side buildings. In addition, we tend to explore the existence of buildings with aesthetical values in this upgraded neighborhood and analyze the possible aesthetical elements and principles in their exterior. We use visual documentation and a descriptive form to reveal these features.

A. Urban Level Aesthetical Analysis

To search for the existence of aesthetical principles at the urban level, we conduct an urban analysis using Lynch's elements to comprehend the physical components of the selected fragment. Furthermore, we make a pattern analysis.
sizes. One of the buildings which is larger in size is the neighborhood market, which is three floors high (Figure IV, right-bottom image). Thus, most of the buildings are at human scale.

Similarly, the ratio of building footprint to parcel varies from 1/2 to more than 1/5, and the majority are 1/3 and over. These ratios between parcel and housing pattern, thanks to the existence of considerable green spaces, provide a good balance, resulting in a comfortable outdoor environment.

The parcel and house pattern of the selected urban fragment is featured by similarity and repetition. Although the logic of parcel division and house scale is directly related to brothers belonging to the same family, the relation between these two elements appears to be balanced. The repetition of similar house patterns in a balanced way creates the possibility of a quasi-harmonious composition.
dimensions; apart from business and residential functions, in some cases apartments are offered for rent, and in considerable cases the upper residential floors are left empty.
disbalance is apparent not only in the changing volumes, but also in the varying colors and the presence of other additions in the form of staircases, aiming to access the first floor for the same commercial activities. In this context, the idea of making the commercial spaces (shops, markets, or other services) more visually appealing has pushed owners to cover them with heavier materials like marble or stone cladding. This composition generates a visual contrast between the different functions, providing another disbalance. Apart from this, those buildings, apparently upgraded with richer exterior quality thanks to the income coming from commercial activities, play the role of landmarks.

Apart from this, interestingly, in this small urban fragment at least four twin houses are observed, which reflect a sense of symmetry. In fact, this approach is based on the relation of brotherhood, in which brothers buy a land plot together and develop identical houses. The similarity of the houses is reflected also in the exterior finishing, where in some cases the houses are left plastered and uncolored, or are whitewashed, or are even in the same colors.
revealed at least four cases of identical twin houses, reflecting a sense of symmetry, which developed based on the will of brothers to have identical houses. This finding demonstrates that although informal settlements are not planned, certain social settings like brotherhood, through this sense of having similar houses, can be a reason to develop aesthetical values in such a built environment.

References
[1] Manahasa, Edmond, Place attachment as a tool in examining place identity: A multilayered evaluation through housing in Tirana. PhD dissertation, Istanbul Technical University, 2017.
[2] Personal Communication with Social Scientist and Philosopher Artan Fuga, 2014.
[3] Pojani, Dorina, From Squatter Settlement to Suburb: The Transformation of Bathore, Albania. Hous. Stud. 2013, 28, 805–821.
[4] CoPlan, Enabling a better urban governance. CoPlan case in the areas influenced by illegal buildings, Conference: Strengthening Public Information and Participation for an Open Governance in Albania, Tirana, February 20–21, 2003.
[5] Scruton, Roger, The aesthetics of architecture. Princeton, NJ: Princeton University Press, 1979.
[6] Rasmussen, Steen Eiler, Eve M. Wendt, Experiencing architecture, 1962.
[7] http://community.dur.ac.uk/geopad/first-impressions-informal-settlement-bathore-albania/
Effect of Oxidation Reactor Structure on Operating
Parameters and System Performance in a Nitric Acid
Production Plant
Oguzhan Erbas
Department of Mechanical Engineering, Kutahya Dumlupinar University, Kutahya, Turkey
oguzhan.erbas@dpu.edu.tr

F. Menekse Ikbal
Department of Mechanical Engineering, Kutahya Dumlupinar University, Kutahya, Turkey
menekseikbal@gmail.com

Ahmet Akbulut
Department of R&D, Istanbul Gubre Sanayii A.S. (IGSAS), Kutahya, Turkey
ahmet.akbulut@igsas.com.tr
Abstract—The factors affecting the process efficiency in a plant producing dilute nitric acid were investigated. It was observed that the parameter that significantly affects the efficiency of the ammonia combustion process is the reactor unit, which is the heart of the system. It was determined that most of the malfunctions and stoppages in the existing plant were caused by problems in the old reactor. Therefore, the aging reactor in the facility was revised, and the reactor unit structure was changed. In this study, the efficiency and performance of the system with the old reactor and with the new reactor after the revision were analyzed.

Keywords—nitric acid production, ammonia oxidation reactor, Ostwald process, waste heat boiler, energy efficiency

I. Introduction

With the industrial revolution, there has been a significant increase in the world population. While the world population was 1 billion before the industrial revolution, it is now about 8 billion. The rise in energy demand with population growth, globalization, and growing income and welfare has brought energy efficiency to the fore. Energy efficiency measures how energy losses can be prevented without reducing production quality and process quantity in industrial enterprises.

The sector that comes to the forefront in energy efficiency studies is the industry-manufacturing field. 2020 was a challenging year for the world, which had to deal with the virus epidemic (Covid-19). The stagnation of economic activities caused by the Covid-19 epidemic has also profoundly affected Turkey. To overcome this crisis, it is necessary to achieve maximum output with minimum input, increase the profit margin without increasing sales, and reduce costs without reducing product quality. Energy is the highest cost for a business.

Through waste resulting from animal production, nitrogen fertilizer used in plant production, diesel fuel in tractors, thermal fuels in housing, greenhouses, and animal shelters, and electricity use, the agricultural sector also contributes to the formation of greenhouse gases in amounts that cannot be ignored. While assessments of energy use in agriculture often focus on direct energy use, 50 % and more of the total energy use should be considered to be related to nitrogen fertilizer production and other indirect energy uses.

Ammonium nitrate is the most commonly used nitrogen fertilizer. Ammonium nitrate is also used as an explosive. It is obtained by neutralizing nitric acid with ammonia. Depending on the operating conditions, the obtained ammonium nitrate solution has a 50-70 % concentration. After drying the resulting concentrated ammonium nitrate solution, a solid fertilizer is formed. Facility capacities can be selected in a very flexible range according to needs. The nitrogen value of ammonium nitrate fertilizer is 35 % N [1].

In fertilizer production facilities, diverse starting materials are used depending on the type of fertilizer produced, in some cases together and sometimes separately. The primary starting materials for chemical fertilizer production are ammonia, nitric acid, sulfuric acid, and phosphoric acid. Some of these substances are brought to the facility from outside, while others are produced in the same facility. In an exothermic reaction, gaseous ammonia and nitric acid combine to form ammonium nitrate and water. Nitric acid is preheated without reacting; especially when dilute acid is used, preheating is essential. For this process, steam and hot condensate from the advanced stages of the plant can be used [2,3].

The nitric acid absorber is the primary emission source, and continuous emission into the air occurs from its outlet. These emissions are NH3, nitric acid vapor, NOx, NO2, and NO. The waste gas flow rate and the pollutants in its content may differ according to the process used, for example, the pressure and temperature of the combustion medium, the type/structure of the catalyst, its age compared to its lifetime, the choice of burners, etc. Depending on many factors, varying proportions of N2O (nitrous oxide) may occur with the combustion gases. Water, typically 0.2 % in liquid ammonia, accumulates in the evaporator as the ammonia is evaporated. Some ammonia is also released during cleaning with intermittent blowdowns.

Although ammonia leaks are not common in nitric acid production, they can be a serious source of danger if they do occur. Leaks that may occur in pipelines, transfer equipment, corrosion punctures, etc., should be monitored by businesses. When preparing air/ammonia mixtures, attention must be paid to explosion risks. An additional threat posed by nitrous gas (N2O) is the possibility that ammonia, which can accumulate in refrigerated zones, may form salt precipitates of nitrite/nitrate composition. These carry the risk of explosion, so the risk is eliminated by periodically washing the places where they can occur [4].

For energy efficiency, measures should be created to increase energy efficiency, taking into account the dynamics of each enterprise. In this study, productivity-enhancing work was carried out in a nitric acid production facility that produces chemical fertilizers. In this nitric acid production facility, processes with high energy consumption were determined, and the aim was to reduce consumption values.

As a solution alternative to minimize consumption, energy efficiency was focused on. As a first step, the problems that cause malfunctions in the plant were discussed, and machine-induced stops were examined. It was seen that the old reactor generated 90% of the failures. The ammonia
oxidation reactor process was analyzed, as the main reason for the downtime at this nitric acid production facility was this old reactor. The reactor was renewed as a result of rehabilitation works at the facility. The operating parameters of the old and new reactors were evaluated, and their efficiency was analyzed [5,6].

II. The Nitric Acid Production Process and the Importance of the Ammonia Oxidation Reactor

Nitric acid (HNO3) is obtained by catalytic oxidation of ammonia (NH3). This process is called the "Ostwald process". The mixture is passed through a platinum-rhodium catalyst gauze. Under normal conditions, the reaction in equation (1) predominates, and only elemental nitrogen is obtained:

4NH3 + 3O2 → 2N2 + 6H2O (1)

For this reason, a catalyst is used to obtain nitric oxide. The only catalyst used industrially is the platinum-rhodium catalyst, which contains 5% to 10% rhodium in platinum and is used in the form of a fine wire mesh. Oxygen atoms are adsorbed on the platinum surface. The reaction takes place between the oxygen atoms on the surface and the ammonia molecules. As a result, the ammonia molecule turns into NO [7]. In the reactor, a reversible and exothermic gas-phase reaction occurs between NH3 and oxygen, and nitrogen oxides (NOx) are released (equation 2):

4NH3 + 5O2 → 4NO + 6H2O + Q (2)
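From reaction (2), one mole of NH3 ideally yields one mole of NO and hence, after the downstream oxidation and absorption, one mole of HNO3. The back-of-envelope yield bound below is our addition rather than a figure from the paper, following from the overall Ostwald stoichiometry NH3 + 2O2 → HNO3 + H2O:

```latex
% Ideal overall Ostwald stoichiometry: NH3 + 2 O2 -> HNO3 + H2O
\[
\frac{m_{\mathrm{HNO_3}}}{m_{\mathrm{NH_3}}}
  = \frac{M_{\mathrm{HNO_3}}}{M_{\mathrm{NH_3}}}
  = \frac{63\ \mathrm{g/mol}}{17\ \mathrm{g/mol}}
  \approx 3.7
\]
```

That is, at most about 3.7 kg of 100 % HNO3 can be produced per kg of NH3; side reactions such as (1) and N2O formation reduce the real figure.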
In the plant, 56 % dilute nitric acid is produced (expressed on a 100 % HNO3 basis). The raw materials used in the production of nitric acid are air, water, and ammonia. The nitric acid production steps are as follows:

1. Liquid ammonia is first gasified with water in the ammonia gasifier and then passes to the ammonia superheaters to be heated with hot air. The heated gaseous ammonia then enters the ammonia combustion reactor.

2. The air required for the combustion of ammonia is drawn from the atmosphere by the turbocharger and cleaned in the filter. Ammonia at a concentration of around 10% is mixed with air, passed through the filter, and enters the ammonia combustion reactor.

3. Nitric oxide gas (NO) is obtained by burning an 11 % ammonia and 89 % air mixture in the ammonia combustion plant on platinum-rhodium-palladium catalysts at approximately 870 °C under 2.5-3.5 bar (kg/cm2) pressure. Superheated steam is obtained from these hot gases in the waste heat boiler.

4. NOx gases coming out of the ammonia combustion reactor at approximately 250 °C are cooled by passing through four coolers. At this stage, the gas passes first to the rest-gas heat exchanger, then to the boiler feedwater exchanger, and then through the horizontal heat exchanger. It then reacts by mixing with the secondary air coming from the bleaching column and is cooled again in the vertical heat exchanger. Finally, it enters the oxidation tower from the bottom. The oxidation reaction of nitric oxide with additional oxygen proceeds faster at low temperatures. Since the reaction is highly exothermic, severe cooling is required to reach the desired oxidation equilibrium quickly. The nitric acid production process flow chart is given in Figure I.
Figure I. Simplified scheme of the Ostwald process for nitric acid production [8]
5. NOx gases entering the tower from the bottom and low-concentration nitric acid (HNO3) coming from the top react and complete the oxidation. The cooling process in the towers is realized by using cooling water with the help of the serpentines.

6. Nitrogen oxide gases coming out of the oxidation tower pass to the absorption tower. The water condensed during cooling absorbs nitrogen dioxide from the gas, and acid is obtained at a concentration of around 40% to 50%. This acid accounts for 30% to 40% of the total production. It is separated from the gas stream under the cooler and fed to the absorber at a suitable point. Meanwhile, the oxidation state reaches around 45%. Additional oxygen from the secondary air helps increase the oxidation state to about 92% to 96%. Nitric oxide gases coming from the bottom of the tower and the 15-16% acid coming from the top (in the absorption column) react and complete the oxidation. Thus, 56% dilute nitric acid is produced [9].

7. Unabsorbed gases leave the tower at a pressure of 2.5 bar and are heated in the rest-gas exchanger. These gases then enter the DeNOx system and are discharged into the atmosphere through the chimney. In the DeNOx system, the NOx gases are reacted with NH3 gas by the selective catalytic reduction method and reduced to N2 gas and H2O vapor, both already present in the air. As a result, the NOx in the flue gas is reduced below the limit value specified in the "Industrial Air Pollution Control Regulation," the flue is analyzed online 24 hours a day, and the data are recorded. In addition, the operating pressure of the system is "medium pressure"; the pressure is selected according to the amount of acid produced (2.5-3.5 bar (kg/cm2)) [10].
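The selective catalytic reduction mentioned in step 7 follows the standard ammonia-SCR chemistry; the reaction below is our addition for clarity, since the paper does not write it out:

```latex
% Standard ammonia-SCR reaction reducing NO in the tail gas:
\[
4\,\mathrm{NO} + 4\,\mathrm{NH_3} + \mathrm{O_2}
  \;\longrightarrow\; 4\,\mathrm{N_2} + 6\,\mathrm{H_2O}
\]
```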
The conversion efficiency is 97 % at an 870 °C catalyst mesh temperature. The most important feature of this process is that it produces its own energy. Once the reaction has started, the pressure and temperature rise should be carefully monitored, because the sub-platinum temperature is an important parameter here.

Since the plant operates at high temperatures, stainless steel is used for material strength. In the serpentine pipes, leaks may occur over time and the system may stop; besides the loss of time, the production capacity decreases. Therefore, material strength is essential.

III. Results and Recommendations

A. The main factors affecting process efficiency

When the process in the nitric acid production facility was examined, it was found that the main parameters affecting production efficiency were the NH3 combustion rate, the sub-platinum temperature, the NO outlet temperature, the amount of steam produced, the final amount of acid produced, the deactivation times, and the stoppages associated with the oxidation reactor.

The percentage of NH3 burned in the mixture is an important parameter. If this NH3 ratio increases, there will be an explosion in the system. The ammonia rate in the ammonia-air mix is kept constant by the automatic regulators (10.5 % NH3 + 89.5 % air). The facility has recently been automated to minimize human intervention in the process control mechanism. When the NH3 rate exceeds 13 %, auto control is activated and the system is taken offline, because 15 % is a critical level for NH3 (a minimal sketch of this interlock logic is given at the end of Section III.B).

Another critical parameter is the catalyst (90 % Pt + 10 % Rh). This catalyst is a perforated wire mesh with a diameter of 3800 mm and a wire thickness of 0.06 mm. It is changed every nine months, as its efficiency decreases over time. The conversion efficiency of the ammonia combustion reactor is also an important factor and refers to ammonia consumption; the conversion efficiency decreases with increasing pressure. The structure of the Pt-Rh catalyst is shown in Figure II.

The material used is "321 (1.4541)" grade stainless steel, which is acid-resistant. To avoid fatiguing the material in the system, the operating temperatures must not exceed the design values.

Another critical parameter is the NO outlet temperature. The high NOx gas temperature coming out of the old reactor before the revision affected the operating parameters negatively, because the gas temperature should have been 250 °C but was found to be 350 °C. In addition, in the old system, the NO gas from the reactor reached 450 °C by the time it entered the "rest-gas" heat exchanger. This situation
reduced the cooling efficiency of the nitric oxide gas entering the heat exchanger at high temperatures. The cooling water in the heat exchangers is also essential, because the NOx gas temperature affects cooling efficiency. The more cooling is done in the system, the higher the efficiency. Productivity values change according to summer and winter conditions; production efficiency in summer is almost 10-20 % lower than in winter (atmospheric air is drawn in as the working fluid in the compressor). In shell-and-tube heat exchangers, cracks may occur in the pipes due to high temperatures.

When gaseous ammonia mixes into the water leaking from the cracks, acid gas is formed in the water that goes to steam, and this damages the serpentines in the reactor; it erodes them. The water in the boiler feed exchanger comes to the waste heat boiler in the reactor. For this reason, leaks are not desired in the heat exchangers. When there is a leak, the system stops. Another critical parameter is the pumps. Pumps between the reactor and the high-pressure drum (steam drum) should not be selected with low capacity. If the pumps are insufficient, the circulation cannot be done thoroughly, and this can damage the serpentines.

B. Structure of the Ammonia Oxidation Reactor and Its Effect on System Performance

The most crucial part of the nitric acid production facility is the ammonia oxidation reactor. The reactor consists of two parts: the combustion part and the steam-generating waste heat boiler part. The catalyst and the ammonia burning rate are essential in the combustion part. In the waste heat boiler part, the serpentines are imperative. The NOx gas temperature at the reactor exit should not exceed 250 °C. However, the design was outdated in the pre-revision system under review, and the design efficiency values were very low; the temperature of the reactor exit NOx gas was 350 °C. The former reactor and its square-shaped serpentines are shown in Figure III.

Also, the heat transfer surface was almost 10 % more efficient than the other. When the reactor was renewed, there was relief in the heat exchangers. Material life was increased and, at the same time, time was saved as there were no downtimes as before.

In the old system before the revision, the cause of these critical downtimes was leaks and part replacements. The main problems in this previous system were waste heat boiler gas leakage, casing leakage, preheater (economizer) and evaporator coil leakage, evaporator wall coil leakage, wall pipe leakage, continuous replacement of casing coils, platinum replacement, and an inappropriate NO outlet temperature. In Figure IV, downtimes by year due to system failures before the revision are shown. While there are no stoppages due to malfunction in the new system, downtime was quite high in the old system.

Figure IV. Annual downtime of the facility before the overhaul

After the revision, the main innovations made in the reactor concern the reactor material structure and design dimensions, the Raschig ring basket design, the process gas cooler, the serpentine structure and surface area, the waterways, and the automation system. The new reactors after the revision are shown in Figure V.
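As referenced in Section III.A, the NH3-ratio protection (10.5 % setpoint, trip above 13 %, 15 % critical) can be illustrated with a minimal control-logic sketch. The thresholds come from the text; the function and signal names are purely illustrative and are not the plant's actual control code:

```python
# Illustrative sketch of the ammonia-ratio interlock described in Sec. III.A.
# Thresholds (10.5 % setpoint, 13 % trip, 15 % critical) are from the paper;
# everything else is hypothetical.
SETPOINT_PCT = 10.5   # NH3 fraction held by the automatic regulators
TRIP_PCT = 13.0       # auto control takes the system offline above this level
CRITICAL_PCT = 15.0   # explosion-risk threshold for the NH3/air mixture

def interlock_action(nh3_pct: float) -> str:
    """Return the protective action for a measured NH3 volume percentage."""
    if nh3_pct >= CRITICAL_PCT:
        return "EMERGENCY SHUTDOWN"           # never reached if the trip works
    if nh3_pct > TRIP_PCT:
        return "TRIP: auto control takes the system offline"
    if nh3_pct > SETPOINT_PCT:
        return "REGULATE: reduce NH3 flow toward 10.5 %"
    return "NORMAL"

if __name__ == "__main__":
    for reading in (10.4, 11.2, 13.5, 15.2):
        print(f"{reading:5.1f} % NH3 -> {interlock_action(reading)}")
```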
Table I. Changes in operating parameters before and after revision

Parameter | Before Revision | Post Revision
Sub-platinum temperature | 865-870 °C | 865-870 °C
NO gas outlet temperature | 350-400 °C | 250-290 °C
Superheated steam generation | 33 t/h | 36 t/h
Superheated steam temperature | 410 °C | 440 °C
Superheated vapor pressure | 40 bar | 41 bar
Feed water temperature (economizer inlet) | 280 °C | 150 °C
Final amount of acid produced | 610 ton/h | 610 ton/h
Deactivation times | 90 days | 0
Coolant inlet temperature | 35 °C | 25 °C
Coolant outlet temperature | 42 °C | 30 °C
Additional water flow | 350-400 m3/h | 50-100 m3/h
Reverse current pumps operated | 2 | 1
H2SO4 consumption | 1 tanker in 10 days | 1 tanker in 2.5-3 months
Ammonia flow rate to DeNOx | 180 kg/h | 95 kg/h
Flue gas temperature | 185 °C | 105 °C
Lamont differential pressure | No difference | No difference
Dome pressure | 40 bar | 38 bar
Total compressor airflow | 86,500 Nm3/h | 90,000 Nm3/h
Turbine steam inlet flow | No difference | No difference
Amount of steam produced | 20 t/h | 26-27 t/h
Ammonia valves | 70 % - 80 % | 50 % - 60 %
Dome level valve | 70 % - 80 % | 50 % - 60 %
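To make the scale of the improvements in Table I concrete, the relative changes of a few headline quantities can be computed; the values are copied from the table, and the percentage arithmetic is our illustration:

```python
# Relative changes for selected Table I parameters (values from the table).
before_after = {
    "Superheated steam generation (t/h)": (33, 36),
    "Amount of steam produced (t/h)": (20, 26.5),   # midpoint of 26-27
    "Ammonia flow to DeNOx (kg/h)": (180, 95),
    "Total compressor airflow (Nm3/h)": (86_500, 90_000),
}
for name, (old, new) in before_after.items():
    change = 100.0 * (new - old) / old
    print(f"{name}: {old} -> {new} ({change:+.1f} %)")
# Steam generation rises by about 9 %, matching the reported 3-4 t/h gain;
# DeNOx ammonia demand drops by roughly 47 %.
```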
IV. Conclusion

In the dilute nitric acid production facility examined, there was an increase of 3-4 tons per hour in the amount of steam produced with the renewal of the ammonia oxidation reactor. Since the new reactor was commissioned, there has been no reactor-related failure at the facility so far. In the past, it was even possible for the plant to go offline twice on the same day. Since there are no downtimes, the production capacity of the plant has increased. In the old reactor, the waste heat boiler part was square, there were problems at the corners of the serpentine turns, and there was more deterioration. In addition, there were serpentine leaks due to the temperature values. In the new reactor, the serpentines are arranged in a spiral. In the old reactor, the casing cooling pipes were insufficient; that is why there were gas leaks and body punctures. In the new reactor, these problems have disappeared. Since the number of operated reverse current pumps has also decreased, approximately 185 kWh has been saved. As a result of the new changes, there was no interruption in the production of the plant, and the system efficiency increased by 10%.

References
[1] Hanyu Ma, William F. Schneider, "Structure- and Temperature-Dependence of Pt-Catalyzed Ammonia Oxidation Rates and Selectivities", ACS Catalysis, Volume 9, 2019.
[2] Jiamin Jin, Ningling Sun, Wende Hu, Haiyang Yuan, Haifeng Wang, Peijun Hu, "Insight into Room-Temperature Catalytic Oxidation of Nitric oxide by Cr2O3: A DFT Study", ACS Catalysis, Volume 8, 2018.
[3] Anshumaan Bajpai, Kurt Frey, and William F. Schneider, "Binary Approach to Ternary Cluster Expansions: NO-O-Vacancy System on Pt (111)", The Journal of Physical Chemistry C, Volume 121, 2017.
[4] Chengxiong Wang, Dezhi Ren, Gavin Harle, Qinggao Qin, Lv Guo, Tingting Zheng, Xuemei Yin, Junchen Du, Yunkun Zhao, "Ammonia removal in selective catalytic oxidation: Influence of catalyst structure on the nitrogen selectivity", Journal of Hazardous Materials, Volume 416, 2021.
[5] Jiamin Jin, Jianfu Chen, Haifeng Wang, Peijun Hu, "Insight into room-temperature catalytic oxidation of NO by CrO2(110): A DFT study", Chinese Chemical Letters, Volume 30, 2019.
[6] Zhe Hong, Zhong Wang, Xuebing Li, "Catalytic oxidation of nitric oxide (NO) over different catalysts: an overview", Catalysis Science & Technology, issue 16, 2017.
[7] Ata ul Rauf Salman, Bjørn Christian Enger, Xavier Auvray, Rune Lødeng, Mohan Menon, David Waller, Magnus Rønning, "Catalytic oxidation of NO to NO2 for nitric acid production over a Pt/Al2O3 catalyst", Applied Catalysis A: General, Volume 564, 2018.
[8] Carlos A. Grande, Kari Anne Andreassen, Jasmina H. Cavka, David Waller, Odd-Arne Lorentsen, Halvor Øien, Hans-Jörg Zander, Stephen Poulston, Sonia García, and Deena Modeshia, "Process Intensification in Nitric Acid Plants by Catalytic Oxidation of Nitric Oxide", Ind. Eng. Chem. Res., Volume 57, 2018.
[9] Yafei Shen, Xinlei Ge, Mindong Chen, "Catalytic oxidation of nitric oxide (NO) with carbonaceous materials", issue 10, 2016.
[10] Ruosi Peng, Shujun Li, Xibo Sun, Quanming Ren, Limin Chen, Mingli Fu, Junliang Wu, Daiqi Ye, "Size effect of Pt nanoparticles on the catalytic oxidation of toluene over Pt/CeO2 catalysts", Volume 220, 2018.
Plant Disease Identification Through Deep Learning
Önsen Toygar
Department of Computer Engineering, Eastern Mediterranean University, Famagusta, North Cyprus, Mersin, Turkey
onsen.toygar@emu.edu.tr

Mehtap Köse Ulukök
Department of Computer Engineering, Bahçeşehir Cyprus University, Nicosia, North Cyprus, Mersin, Turkey
mehtap.kose@baucyprus.edu.tr

Emre Özbilge
Department of Computer Engineering, Cyprus International University, Nicosia, North Cyprus, Mersin, Turkey
eozbilge@ciu.edu.tr
Abstract—Plant leaves show various symptoms on their surfaces. Image processing and computer vision techniques are applied to leaf images to identify plant diseases. Healthy and diseased plant leaves are involved in several studies to identify or classify disease types using hand-crafted features and deep learning architectures. In this study, we applied a Transfer Learning technique with several deep learning architectures to identify plant diseases from their leaf images. The PlantVillage database is used in the experiments with all fourteen plant species and thirty-eight classes corresponding to healthy and infected leaf images. Experimental results are presented using several evaluation metrics, and a comparison is performed among several deep learning architectures based on Inception, VGG, ResNet, MobileNet, Xception and variants of these architectures. Results are demonstrated in terms of five evaluation metrics, namely accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate. The highest accuracy achieved is 99.81% by the ResNet architecture.

Keywords—plant disease classification, deep learning, transfer learning, computer vision, image processing

I. Introduction

Agricultural productivity is very important in countries whose economies are highly dependent on agriculture. It is stated that 30 to 40% of crops are lost each year through the production chain [1]. Losses from diseases also have an important economic impact, causing a drop in income for crop producers and higher prices for consumers and distributors. A lot of studies have been carried out under changed environmental conditions, in different locations, to estimate the losses that occur due to different diseases [1]. In this respect, it is vital to detect and identify plant diseases in their initial stage. There exist several computer vision and image processing techniques to identify plant diseases through their leaf images.

The most common plant diseases are spots (caused either by fungi or bacteria), mildew, and rust [2]. These three defect types affecting several plant leaves are shown in Figures I, II and III. Spot defects are demonstrated on tomato and grape leaves in Figure I. Rust defects are shown on corn and apple in Figure II. Mildew defects are presented on cherry and squash leaves in Figure III.

Every living organism on earth exhibits or reacts in a particular way when in a condition or situation that deviates from its normal state of being. For example, when human skin goes red or develops a rash, it could be due to an allergic reaction or an early indication of an underlying ailment. Plants are not excluded; in many instances the leaves serve as our gateway for diagnosing a lot of diseases in plants. For example, the Early Blight disease of tomato leads to the appearance of small dark spots that expand into circular plaques made up of concentric rings on the leaves [3]. This, in turn, results in premature defoliation of the leaves and heavy losses in yield. Figure I (a) shows a healthy tomato leaf, and a diseased leaf affected by Early Blight is shown in Figure I (b).

Figure I. Spot defects on Tomato and Grape Leaves ((a) Healthy tomato leaf; (b) Early Blight on tomato leaf; (c) Healthy grape leaf; (d) Black Rot on grape leaf)
The fungus Guignardia bidwellii is responsible for the Black Rot disease of grapes. It is usually common in regions of wet, warm and humid climate, as this provides a conducive situation for spore germination and infection. This disease spreads when the spores are carried by wind or splashed by rain onto the surfaces of developing plant tissue. This goes on for as long as the environmental conditions remain suitable. Black Rot can be identified when round, tan plaques with dark purple to brown edges are spotted on the leaves. Critical infections may result in leaf deformity and wilting of the leaves [4]. Figure I (c) shows a healthy grape leaf, and a diseased leaf affected by Black Rot is demonstrated in Figure I (d).

On the other hand, Puccinia sorghi Schwein is the pathogen (fungus) responsible for the popular disease of maize known as Common Rust. Early plaques mostly occur in clusters and are circular, but as the plaques ripen, the fungus protrudes through the foliage surface and the plaques elongate with time. The characteristic symptoms observed on maize leaves are brownish-red oblong pustules; plaques of Common Rust occur on both the upper and lower surfaces of the leaves and are spread sporadically along the leaves. Spores are transported by wind, with new infections ensuing weekly or bi-weekly. One plaque is capable of producing both brownish-red urediniospores and black teliospores, yet lastly only black teliospores will be seen within the plaque [5]. Figure II (a) shows a healthy corn leaf, and a diseased leaf affected by Common Rust is shown in Figure II (b).

Figure II. Rust defects on Corn and Apple Leaves ((a) Healthy corn leaf; (b) Common Rust on corn leaf; (c) Healthy apple leaf; (d) Cedar Apple Rust on apple leaf)

Another sample of rust defects can be seen in Cedar Apple Rust, which is caused by a member of the Pucciniaceae family, a class of fungi with several species that typically need two or more hosts to complete their life cycle. Members of this class are known as rusts, which are seen at some stage in their evolution, and mostly they are orange or reddish in color. The fungus spreads through the leaves and develops aecia beneath the leaves. The aecia produce aeciospores which are wind-blown back to the redcedars. They afterwards germinate and begin gall formation, producing telial horns to restart the process. A heavily infested apple tree can take on a yellowish cast from multiple plaques on the leaves. Figure II (c) shows a healthy apple leaf, and a Cedar Apple Rust infected leaf is presented in Figure II (d).

Moreover, Powdery Mildew is the third type of the most common defects seen on some plant leaves such as cotton, cucumber, grape, squash and cherry. Samples of these defects are demonstrated in Figure III on cherry and squash leaves. Figure III (a) shows a healthy cherry leaf, and a diseased cherry leaf affected by Powdery Mildew is shown in Figure III (b). Additionally, two samples of diseased squash leaves affected by Powdery Mildew are presented in Figure III (c) and (d).

Figure III. Mildew defects on Cherry and Squash Leaves ((a) Healthy cherry leaf; (b) Powdery Mildew on cherry leaf; (c)-(d) Two samples of diseased squash leaves by Powdery Mildew)

The rest of the paper is organized as follows. The literature review is discussed in Section II, and the methodology used in this study is explained in Section III. Section IV presents the experiments and results. Finally, Section V concludes the paper with the findings and the summary of the work done in this study.
II. Literature Review

Plant disease classification has been studied in the literature using two approaches, namely hand-crafted feature descriptors and deep learning approaches. An extensive survey of hand-crafted descriptors for plant disease classification is presented in Kaur et al. [6]. Hand-crafted feature extraction from leaf images requires acquisition, pre-processing, segmentation and feature extraction. Then, these features are used to train classification algorithms such as Support Vector Machines (SVM), Maximum Likelihood Classification (MLC), K-Nearest-Neighbours (KNN), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF) and Artificial Neural Networks (ANN). In contrast, deep learning approaches perform feature extraction and classification automatically.

Deep learning based approaches for plant disease classification have recently been presented in several research studies [7]. One of the well-known deep learning techniques is the Convolutional Neural Network (CNN), and it attracts more and more attention from researchers. The plant disease classification performance of CNN models has been found to be superior compared to other classification techniques. However, CNN models have some problems with the usage of small datasets: this may yield a higher accuracy in classification performance which does not hold in practice. Another drawback is the high execution time that is required by CNN models.

The impact of transfer learning strategies of CNNs on pretrained models is worked out by Lee et al. [8]. The PlantVillage dataset with 38 different classes is studied by using three deep learning architectures, namely VGG16, InceptionV3, GoogLeNet, and a proposed model, GoogLeNetBN. The best accuracy of 99% is achieved with a pretrained CNN on that dataset. It is concluded that there is no significant performance difference between the pretrained and unpretrained models.

Around two decades of studies on plant disease detection and classification from plant leaf images using image processing techniques are summarized in [1, 9]. The importance of digital image quality and its difficulty in real-life applications is highlighted in that survey paper. Among the many classification techniques, the neural network technique gives a higher accuracy rate; for some plants, 100% accuracy is reported.

III. Methodology

Recently, deep learning based approaches have been applied to plant leaf images to classify plant diseases. In this study, we applied a Transfer Learning approach to classify plant diseases from leaf images of several plants available in the PlantVillage database. All classes of the PlantVillage dataset are used in the system; in total there are 38 different classes for 14 plant species. The class numbers and the corresponding disease names are shown in Table I. The details related to the training parameters used in the architecture are given in Table II.

The system architecture is shown in Figure IV using a block diagram of the steps followed for plant disease identification. The system receives input plant leaf images of size 256x256x3. The images are color images of plant leaves, and a CNN pretrained on the ImageNet dataset is employed. Afterwards, on top of the frozen weights, a fully connected Artificial Neural Network (ANN) is trained with backpropagation to obtain the outputs of the system as plant disease types, identifying the healthy or diseased plant leaves.

The system uses well-known deep networks that are pretrained on the ImageNet database using the Inception [10], ResNet [11], VGG [12], MobileNet [13] and Xception [14] architectures. The pretrained ImageNet network's connection weights are frozen and used as a feature extractor without retraining. After the new features have been obtained from the pretrained network, these features are presented to the custom ANN, whose connection weights are learned by using the backpropagation algorithm. A minimal sketch of this setup is given below Figure IV.

IV. Experiments and Results

Experiments are conducted to perform plant disease classification using healthy and diseased leaf images from the PlantVillage dataset. The details related to the experimental setup and the obtained results are presented in the following subsections using several deep learning approaches.

A. Experimental Setup

The PlantVillage database [3] is used in conducting the experiments. All plant species from the database are used in the experiments. In total, 14 different plant species, namely apple, blueberry, cherry, corn, grape, orange, peach, pepper, potato, raspberry, soybean, squash, strawberry and tomato, are used with their healthy and/or diseased leaf images.

Figure IV. Block diagram of the system architecture with Transfer Learning
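Following the block diagram, the sketch below wires up the pipeline with the Table II settings (frozen ImageNet backbone, a single 40-node ReLU head with batch normalisation and 0.5 dropout, Adam at learning rate 0.01, focal cross-entropy, 38 softmax outputs). It is our minimal Keras reconstruction under those stated parameters, not the authors' released code; the dataset pipeline and the particular backbone shown are assumptions:

```python
# Minimal Keras sketch of the transfer-learning setup in Figure IV / Table II.
# The backbone (ResNet50 here) is one of the 13 architectures compared;
# backbone-specific input preprocessing is omitted for brevity.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

backbone = keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(256, 256, 3), pooling="avg")
backbone.trainable = False  # frozen: used purely as a feature extractor

model = keras.Sequential([
    keras.Input(shape=(256, 256, 3)),
    layers.RandomRotation(0.1),   # factor 0.1 of 360 deg = [-36, 36] degrees
    layers.RandomContrast(0.1),   # [-10 %, 10 %] contrast jitter
    backbone,
    layers.Dense(40),             # single hidden head layer, 40 nodes
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.5),
    layers.Dense(38, activation="softmax"),  # 38 PlantVillage classes
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    # CategoricalFocalCrossentropy requires Keras >= 2.13.
    loss=keras.losses.CategoricalFocalCrossentropy(),
    metrics=["accuracy"])

# Hypothetical usage: train_ds / test_ds are tf.data pipelines yielding
# batches of 16 (image, one-hot label) pairs.
# model.fit(train_ds, validation_data=test_ds, epochs=20)
```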
The database consists of 38 different classes that comprise healthy and different types of infected plant leaves of the aforementioned plant species. The list of classes in the PlantVillage database used in the experiments is depicted in Table I. On the other hand, the training parameters used in the experiments are available in Table II, including training and test data sizes, input image size, output classes and the deep learning architecture's specific parameters.

Table I. List of Classes in PlantVillage Database

1 Apple scab | 20 Pepper bell healthy
2 Apple black rot | 21 Potato early blight
3 Apple healthy | 22 Potato healthy
4 Blueberry healthy | 23 Potato late blight
5 Cedar apple rust | 24 Raspberry healthy
6 Cherry healthy | 25 Soybean healthy
7 Cherry powdery mildew | 26 Squash powdery mildew
8 Corn common rust | 27 Strawberry healthy
9 Corn gray leaf spot | 28 Strawberry leaf scorch
10 Corn healthy | 29 Tomato bacterial spot
11 Corn northern leaf blight | 30 Tomato early blight
12 Grape black rot | 31 Tomato healthy
13 Grape esca (Black Measles) | 32 Tomato late blight
14 Grape healthy | 33 Tomato leaf mold
15 Grape leaf blight | 34 Tomato mosaic virus
16 Orange haunglongbing | 35 Tomato septoria leaf spot
17 Peach bacterial spot | 36 Tomato target spot
18 Peach healthy | 37 Tomato two spotted spider mite
19 Pepper bell bacterial spot | 38 Tomato yellow leaf curl virus

Table II. Training parameters

Training data size: 27150
Test data size: 27155
Output classes: 38
Input image size: [256x256x3]
Hidden layers (head): 1
Hidden nodes (head): 40
Learning rate: 0.01
Dropout rate: 0.5
Batch normalisation: enabled
Activation: ReLU
Optimisation algorithm: Adam
Cost function: Focal Cross-Entropy Loss
Maximum epochs: 20
Batch size: 16
Random rotation (data augmentation): [-36°, 36°]
Random contrast (data augmentation): [-10%, 10%]

B. Experimental Results and Discussion

The experiments are performed using leaf images of several plant species, and the experimental results are presented to show the classification of 38 healthy or diseased classes from 14 plant species. Transfer Learning is applied using several network architectures, namely Xception, VGG16, VGG19, ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, InceptionV3, InceptionResNetV2, MobileNet, and MobileNetV2. The evaluation of each network architecture is demonstrated using five evaluation metrics. The accuracy (ACC), F1 score, Matthews correlation coefficient (MCC), true positive rate (TPR) and true negative rate (TNR) are computed for each class and presented in Table III. All accuracies are within the range [99.54% - 99.81%], while F1 scores are computed in the range [91.63% - 96.60%]. On the other hand, MCC values are within the range [91.62% - 96.55%], while TPR and TNR values are in the ranges [91.96% - 96.64%] and [99.74% - 99.90%], respectively.

Table III. Results for several deep learning architectures

Network Architecture | ACC | F1 | MCC | TPR | TNR
Xception | 0.9959 | 0.9252 | 0.9247 | 0.9273 | 0.9979
VGG16 | 0.9969 | 0.9452 | 0.9447 | 0.9463 | 0.9983
VGG19 | 0.9965 | 0.9393 | 0.9388 | 0.9410 | 0.9981
ResNet50 | 0.9981 | 0.9660 | 0.9655 | 0.9664 | 0.9990
ResNet101 | 0.9976 | 0.9582 | 0.9577 | 0.9586 | 0.9988
ResNet152 | 0.9978 | 0.9624 | 0.9618 | 0.9627 | 0.9990
ResNet50V2 | 0.9970 | 0.9458 | 0.9452 | 0.9470 | 0.9984
ResNet101V2 | 0.9970 | 0.9479 | 0.9474 | 0.9491 | 0.9984
ResNet152V2 | 0.9968 | 0.9426 | 0.9418 | 0.9428 | 0.9983
InceptionV3 | 0.9954 | 0.9163 | 0.9162 | 0.9196 | 0.9974
InceptionResNetV2 | 0.9959 | 0.9224 | 0.9221 | 0.9246 | 0.9978
MobileNet | 0.9980 | 0.9637 | 0.9632 | 0.9645 | 0.9989
MobileNetV2 | 0.9968 | 0.9429 | 0.9424 | 0.9443 | 0.9982

The performance evaluation of the 13 different deep learning models presented in Table III indicates that all network architectures are robust for identifying healthy or diseased plant leaf images, since all the results are high in terms of accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate. In general, all results are above 90%, which means that these network architectures are robust and applicable for plant disease classification.

The ranges of the performance values for all evaluation metrics show that the minimum performance among the 13 different network architectures is obtained by the InceptionV3 architecture. Although it has the lowest accuracy among the deep learning architectures, the performance of the InceptionV3 architecture is still 99.54% in terms of accuracy. The highest performance values for all the metrics are achieved by the ResNet50 architecture for plant disease identification. Specifically, the highest accuracy achieved by the ResNet50 architecture for classifying the 38 classes is 99.81%. Therefore, all network architectures employed in the experiments are successful in identifying plant diseases from leaf images.
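The five per-class metrics in Table III follow directly from the one-vs-rest confusion matrix. The snippet below is our illustration with scikit-learn, not the authors' evaluation script:

```python
# Sketch: per-class ACC, F1, MCC, TPR, TNR from predictions (one-vs-rest).
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, matthews_corrcoef

def per_class_metrics(y_true, y_pred, n_classes=38):
    results = {}
    for c in range(n_classes):
        t = (np.asarray(y_true) == c).astype(int)  # binarize: class c vs rest
        p = (np.asarray(y_pred) == c).astype(int)
        tn, fp, fn, tp = confusion_matrix(t, p, labels=[0, 1]).ravel()
        results[c] = {
            "ACC": (tp + tn) / (tp + tn + fp + fn),
            "F1": f1_score(t, p, zero_division=0),
            "MCC": matthews_corrcoef(t, p),
            "TPR": tp / (tp + fn) if (tp + fn) else 0.0,  # sensitivity
            "TNR": tn / (tn + fp) if (tn + fp) else 0.0,  # specificity
        }
    return results

# Averaged over the 38 classes, these quantities correspond to the
# row values reported in Table III.
```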
V. Conclusion

Identification of plant leaf diseases is studied in this paper. Healthy and diseased plant leaf images from the PlantVillage database are used, which constitute 14 plant species and 38 classes. In this study, in order to identify healthy or diseased classes, Transfer Learning is employed using several deep learning architectures, namely Xception, VGG16, VGG19, ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, InceptionV3, InceptionResNetV2, MobileNet, and MobileNetV2. Five evaluation metrics are used, namely accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate, to compute the performance of each architecture. Experimental results show that all network architectures achieve more than 90% on every evaluation metric. The highest performance is achieved by the ResNet50 architecture, with 99.81% accuracy, 96.60% F1 score, 96.55% Matthews correlation coefficient, 96.64% true positive rate and 99.90% true negative rate. Therefore, ResNet50 is the best architecture among those used in this study, with the highest performance values for identifying plant diseases from leaf images.

References
[1] Gittaly Dhingra, Vinay Kumar, Hem Dutt Joshi, "Study of digital image processing techniques for leaf disease detection and classification", Multimedia Tools and Applications, Volume 77, pp. 19951–20000, 2018.
[2] Pujari JD, Yakkundimath R, Byadgi AS, "SVM and ANN based classification of plant diseases using feature reduction technique", Int J Interact Multimed Artif Intell, Volume 3, pp. 6–14, 2016.
[3] David P. Hughes and Marcel Salathe, "An open access repository of images on plant health to enable the development of mobile disease diagnostics", arXiv preprint arXiv:1511.08060, 2015.
[4] Angela Madeiras, "Grape IPM - Black Rot", (2019, September 27). Retrieved June 29, 2020, from https://ag.umass.edu/fruit/fact-sheets/grape-ipm-black-rot
[5] Tamra A. Jackson-Ziems, "Rust Disease of Corn in Nebraska", University of Nebraska-Lincoln Extension, Institute of Agriculture and Natural Resources. Revised January 2014. Retrieved June 29, 2020, from http://extensionpublications.unl.edu/assets/pdf/g1680.pdf
[6] Sukhvir Kaur, Shreelekha Pandey, Shivani Goel, "Plants Disease Identification and Classification Through Leaf Images: A Survey", Archives of Computational Methods in Engineering, 2018.
[7] M. Nagaraju, Priyanka Chawla, "Systematic review of deep learning techniques in plant disease detection", Int J Syst Assur Eng Manag, Volume 11, No 3, pp. 547–560, 2020.
[8] Sue Han Lee, Hervé Goëau, Pierre Bonnet, Alexis Joly, "New perspectives on plant disease characterization based on deep learning", Computers and Electronics in Agriculture, Volume 170, 105220, 2020.
[9] Lawrence C. Ngugi, Moataz Abelwahab, Mohammed Abo-Zahhad, "Recent advances in image processing techniques for automated leaf pest and disease recognition - A Review", Information Processing in Agriculture, Volume 8, pp. 27-51, 2021.
[10] Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alexander A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning", Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep residual learning for image recognition", IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27-30 June 2016.
[12] Karen Simonyan, Andrew Zisserman, "Very deep convolutional networks for large-scale image recognition", The 3rd International Conference on Learning Representations (ICLR 2015), 2015.
[13] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen, "MobileNetV2: Inverted residuals and linear bottlenecks", IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-23 June 2018.
[14] François Chollet, "Xception: Deep Learning with Depthwise Separable Convolutions", IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21-26 July 2017.
Vaccines Perspective in the COVID-19 Era: Analysis of
Twitter Data
Abdulkadir Sahiner (1, 2)
1 Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
2 Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
asahiner@yildiz.edu.tr

Kaan Kemal Polat
Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
kemalp@yildiz.edu.tr

Hayati Ünsal Özer
Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
huozer@yildiz.edu.tr
Abstract— Nowadays, preventive treatments are being developed for the COVID-19 epidemic, which affects all of social life. One of these treatments is vaccination. Since the subject of vaccination has been discussed from past to present, revealing the public approach in this period becomes all the more important. In this context, the aim of the study is to determine people's approaches to COVID-19 vaccines through their Tweets. The sentiment analysis method was used within the scope of the study; with this widely used method, the data obtained from Kaggle were analyzed. As a result of analyzing the open data set of Tweets related to the Pfizer & BioNTech vaccine with the LSTM model, the best accuracy value was reached with the "softmax" activation function, an epoch value of "10" and a batch size of "128", which appear as remarkable results.

Keywords—Vaccine, COVID-19, Twitter, sentiment analysis

I. Introduction
The COVID-19 epidemic, which affects almost every field, continues its impact today. Vaccine studies have accelerated to prevent COVID-19, which we can describe as a dangerous epidemic that causes the death of people. Today, the vaccines produced by some companies have been adopted by countries after the approvals and have started to be applied to their citizens.

Vaccine studies regarding COVID-19, which was declared a global epidemic by the World Health Organization (WHO) in March 2020, have been one of the important topics of discussion until today. It is stated that at least 70% of the population should be vaccinated, with continuity and protection of vaccines, as a precaution [1, 2]. It therefore becomes very important to understand the extent of public support and to direct society to vaccination accordingly. One of the areas where the public expresses its approach to vaccination is social media platforms [3-5].

Although there are studies stating that social media is a very effective tool for vaccination, it is also stated that it may contain negative feelings and false information that may affect individual opinions and lead to vaccine rejection [6]. Such a situation has been named by the World Health Organization (WHO) as one of the ten main factors threatening global health [6, 7, 8]. Due to the differences in the provision of vaccines by countries, people show different approaches to the application of alternative vaccines. Sentiment analyses are attempted through Tweets on Twitter, one of the social media platforms where people express their opinions about vaccines. Studies in this context are gradually increasing; some of them are as follows.

Bonnevie et al. [9] examined the Tweets of those who are against vaccines during the COVID-19 period, measuring the anti-vaccination opposition from the beginning of the COVID-19 epidemic in the United States to the present, in order to determine the change in vaccine-related messages and the effects of COVID-19 on general vaccine opposition. Within the scope of the study, Tweets obtained between 15 February 2020 and 14 June 2020 were used as data. Over the period covered by the examined data set, it was stated that views against vaccination increased by 80%.

Another study aimed to analyze the vaccine discussions on Twitter in the Netherlands. Within the scope of the study, a mixed model was used and a four-stage circular model was followed in which community detection, text mining, perception analysis and network analysis were performed. The study, carried out using Kearney's retweet and igraph packages, aimed to contribute to strategy studies on the vaccine by determining the relational networks of the opposition and the views shared on the subject [10].

Blankenship et al. [11] aimed to determine whether tweets with different emotions and content about vaccination attract different levels of participation (retweets) from Twitter users. The study, which used a regression model, concluded that engaging key opinion leaders on social media, so that their Tweets facilitate health education about vaccination and carry their views to a wider audience, will be the most important step; a positive approach to vaccination can thus be promoted through influential people.

In another study, it was aimed to analyze and determine the data on the international public debate about the pediatric pentavalent vaccine (DTP-HepB-Hib) program by analyzing Twitter messages. In the study, in which Twitter data between July 2006 and May 2015 were analyzed, it was seen that there was little interaction between the tweets, but links containing information about the vaccine were used quite frequently [12].

In the study conducted by Wen-Ying Sylvia Chou & Alexandra Budenz [13], the importance of a data-based communication strategy in controlling the anxiety experienced around vaccine hesitancy was emphasized, and in this context it was aimed to make suggestions regarding the analysis of comments on the vaccine with sentiment analysis on Twitter. Within the scope of the study, it was stated that it is possible to examine the link between the vaccine and the emotional approach and, accordingly, to take precautions with the communication strategy against the highly intense opposition to the vaccine.
Vaccination is one of the most important issues from the past to the present, and today, with widespread communication networks, anti-vaccination is becoming more and more widespread. In today's COVID-19 epidemic, where community immunity has a vital role, good management of the process and the sharing of information that will prevent negative opinions against vaccines are possible with data-based methods. In this context, our study aims to analyze people's posts about vaccines on Twitter during the COVID-19 period, with an open data set, using the sentiment analysis method.

II. Dataset Description
The dataset used in the study includes the latest tweets about the Pfizer & BioNTech vaccine, created by Gabriel Preda on the Kaggle platform [14]. The data is stated to be collected using the tweepy Python package to access the Twitter API.

The data set contains the "id, user_name, user_location, user_description, user_created, user_followers, user_friends, user_favourites, user_verified, date, text, hashtags, source, retweets, favorites, is_retweet" information about the Tweets.

A total of 8631 Tweets were analyzed in the dataset containing the latest tweets about the Pfizer & BioNTech vaccine. The examined Tweets were divided into 3 groups, "positive, negative, and neutral", using the Sentiment Intensity Analyzer.

Figure I. Data Sentiment Analysis (counts by sentiment: Positive 2729, Negative 1223, Neutral 4679)

Figure III. Top 50 Positive Words Used in Tweets
Figure IV. Top 50 Negative Words Used in Tweets

Among the Tweets in the data set, words such as "dose" and "thank" are the most used in tweets with a positive approach, while words such as "death", "arm" and "die" attract attention in tweets with a negative approach.

III. Material and Methods
A. Data Preprocessing
"Word Embedding" was used to process the data, and the texts were converted into word vectors. Using the Keras embedding layer, 32-dimensional word vectors were created.
In addition, various adjustments were made to the data: hashtags, internet links, special characters, single characters and double spaces were removed.

Emotion intensity information was then added to the cleaned tweets using the VADER [16] sentiment analysis tool.
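A minimal sketch of these cleaning and labelling steps, assuming the VADER implementation shipped with NLTK; the regular expressions and the plus/minus 0.05 compound-score thresholds are illustrative assumptions, not the authors' exact settings:

```python
import re
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # VADER [16]
# requires a one-time nltk.download("vader_lexicon")

def clean(text: str) -> str:
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # internet links
    text = re.sub(r"#\w+", " ", text)              # hashtags
    text = re.sub(r"[^A-Za-z ]+", " ", text)       # special characters
    text = re.sub(r"\b[A-Za-z]\b", " ", text)      # single characters
    return re.sub(r"\s+", " ", text).strip()       # collapse double spaces

sia = SentimentIntensityAnalyzer()

def sentiment(text: str) -> str:
    score = sia.polarity_scores(clean(text))["compound"]
    if score >= 0.05:
        return "positive"
    if score <= -0.05:
        return "negative"
    return "neutral"
```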
B. Long Short Term Memory (LSTM)
Long Short Term Memory, "LSTM", is a special type of RNN that can learn long-term dependencies. This model, introduced by Hochreiter & Schmidhuber [15], is widely used today due to its effective operation in complex problems.

Figure V. Architecture of LSTM Model

One of the tricky issues with natural language processing is that the meanings of words can change depending on their context. In sentiment analysis, we cannot ignore the occurrence of a word like "good", because if it appears inside a phrase like "not good" its meaning can change completely. This makes the task difficult, as it requires reading between the lines. LSTM networks are well suited for solving such problems, as they can remember all the words that lead up to the one in question.
The LSTM model used in the study can be summarized as estimated value [17].
follows:
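A minimal Keras sketch consistent with the parameters reported in the text (32-dimensional embedding, 196 LSTM cells, softmax output over the three sentiment classes). The vocabulary size and sequence length are assumptions, and categorical cross-entropy is substituted here for the binary loss listed in Table I, since there are three classes:

```python
import tensorflow as tf

VOCAB_SIZE, SEQ_LEN = 10_000, 50  # assumed tokenizer settings; inputs padded to SEQ_LEN

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 32),       # 32-dimensional word vectors
    tf.keras.layers.LSTM(196),                       # 196 cells in the LSTM layer
    tf.keras.layers.Dense(3, activation="softmax"),  # positive/negative/neutral
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10, batch_size=128)  # with an 80/20 split
```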
The experiments within the scope of the study were carried out using Central Processing Unit (CPU), Graphics Processing Unit (GPU) or Tensor Processing Unit (TPU) hardware and an online cloud service. The parameters used in the experimental process are summarized in Table I.

Table I. Parameter Settings

Parameter | Value
Activation function | ReLU, softmax
Batch size | 128
Loss function | Binary
Optimizer | Adam

The following criteria were used to evaluate the performance of the model proposed in the study:

Accuracy = (TN + TP) / (TN + TP + FN + FP)        (1)
Recall = TP / (TP + FN)        (2)
F1-Score = 2 · (Precision · Recall) / (Precision + Recall)        (3)

TP, FP, TN and FN in Equations (1)-(3) represent the numbers of True Positives, False Positives, True Negatives and False Negatives, respectively.

In addition, one of the evaluation criteria within the scope of the study is the Matthews correlation coefficient, introduced by Brian Matthews in 1975, which can be defined as a tool for model evaluation. It is important in terms of revealing the strength of the statistical relationship between the true values and the estimated values [17].
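For instance, the coefficient can be computed with scikit-learn's matthews_corrcoef [18]; the labels below are made up purely for illustration:

```python
from sklearn.metrics import matthews_corrcoef

y_true = [0, 1, 2, 2, 1, 0, 2, 1]  # illustrative ground-truth classes
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]  # illustrative predictions
print(matthews_corrcoef(y_true, y_pred))  # 1 = perfect, 0 = no better than chance
```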
IV. Results
This section describes the results obtained in this study. As mentioned earlier, a single model, LSTM, was used, and experiments were performed to investigate different parameters with the ADAM optimizer. Different parameter values were tested for the model and the optimum values were determined. 80% of the data set was used as training data and 20% as test data. In the created model, 196 cells were used in the LSTM layer.

In the first stage, the results obtained over the different epoch and activation function values applied in the LSTM model are as follows:
The best performance among the different activation function and epoch values applied in the model was obtained with the "softmax" activation function and an epoch value of 10. In addition, in this case the quality criterion of the multi-class model was quite good (Matthews correlation = 0.66).

In the second stage, the results obtained over the different epoch and batch size values applied in the LSTM model are as follows:

Table III. The Effect of Number of Epochs on Parameter Values*

Number of Epochs | Batch Size | Accuracy (%) | F1-Score | Recall | Matthews correlation
10 | 512 | 80.37 | 0.80 | 0.79 | 0.65
10 | 128 | 78.11 | 0.78 | 0.78 | 0.61
10 | 32 | 76.38 | 0.76 | 0.76 | 0.58
20 | 512 | 79.33 | 0.79 | 0.78 | 0.63
20 | 128 | 76.26 | 0.76 | 0.76 | 0.58
20 | 32 | 75.91 | 0.76 | 0.75 | 0.57

* Activation function = softmax

The best performance among the different batch size and epoch values applied in the model was obtained with a batch size of 512 and an epoch value of 10. In addition, in this case the quality criterion of the multi-class model was quite good (Matthews correlation = 0.65).

The graph of the change in accuracy and loss value by epoch for the LSTM model on the data set is given below.

Figure VII. Variation of Model Accuracy and Loss Value by Epoch Value

As a result of the analysis of the open data set of Tweets related to the Pfizer & BioNTech vaccine with the LSTM model, the best accuracy value was reached with the "softmax" activation function, an epoch value of 10 and a batch size of 128; the Matthews correlation value at these settings was 0.66.

V. Conclusion
The main purpose of this study is to analyze the effect of tweets about the Pfizer & BioNTech vaccines on anti-vaccine sentiment and to examine how people's views on vaccines have changed during the COVID-19 period. With the "ADAM" optimizer, an epoch value of 10, the "softmax" activation function and a batch size of 128 in the "LSTM" model proposed within the scope of the research, the best accuracy on the data set, 80.83%, was obtained.

Although the number of Tweets tagged as positive is higher than the number of negative Tweets, the fact that the number of neutral Tweets is higher than both of these groups shows similar results to recent studies in which anti-vaccination is increasing [19].

Within the scope of the study on how the subject of anti-vaccination, an important issue in the literature, differs in the current COVID-19 pandemic, Twitter data was examined within the framework of sentiment analysis. In this process, it is very important for countries to develop strategies according to the approach to vaccines during the pandemic period, by examining how user Tweets related to COVID-19 vaccines have changed over time.

Examining only the tweets about the vaccine developed by Pfizer & BioNTech can be expressed as a limitation of the study. In this context, future studies can examine and compare Tweets related to different COVID-19 vaccines and, as a result, how anti-vaccination differs between these vaccines.

References
[1] Orenstein, W. A., & Ahmed, R. (2017). Simply put: Vaccination saves lives.
[2] Aguas, R., Corder, R. M., King, J. G., Goncalves, G., Ferreira, M. U., & Gomes, M. G. M. (2020). Herd immunity thresholds for SARS-CoV-2 estimated from unfolding epidemics. medRxiv.
[3] Velasco, E., Agheneza, T., Denecke, K., Kirchner, G., & Eckmanns, T. (2014). Social media and internet-based data in global systems for public health surveillance: a systematic review. The Milbank Quarterly, 92(1), 7-33.
[4] Yousefinaghani, S., Dara, R., Poljak, Z., Bernardo, T. M., & Sharif, S. (2019). The assessment of Twitter's potential for outbreak detection: avian influenza case study. Scientific Reports, 9(1), 1-17.
[5] Guess, A. M., Nyhan, B., O'Keeffe, Z., & Reifler, J. (2020). The sources and correlates of exposure to vaccine-related (mis)information online. Vaccine, 38(49), 7799-7805.
[6] Piedrahita-Valdés, H., Piedrahita-Castillo, D., Bermejo-Higuera, J., Guillem-Saiz, P., Bermejo-Higuera, J. R., Guillem-Saiz, J., ... & Machío-Regidor, F. (2021). Vaccine Hesitancy on Social Media: Sentiment Analysis from June 2011 to April 2019. Vaccines, 9(1), 28.
[7] Puri, N., Coomes, E. A., Haghbayan, H., & Gunaratne, K. (2020). Social media and vaccine hesitancy: new updates for the era of COVID-19 and globalized infectious diseases. Human Vaccines & Immunotherapeutics, 1-8.
[8] Kunneman, F., Lambooij, M., Wong, A., Van Den Bosch, A., & Mollema, L. (2020). Monitoring stance towards vaccination in twitter messages. BMC Medical Informatics and Decision Making, 20(1), 1-14.
[9] Bonnevie, E., Gallegos-Jeffrey, A., Goldbarg, J., Byrd, B., & Smyser, J. (2020). Quantifying the rise of vaccine opposition on Twitter during the COVID-19 pandemic. Journal of Communication in Healthcare, 1-8.
[10] Lutkenhaus, R. O., Jansz, J., & Bouman, M. P. (2019). Mapping the Dutch vaccination debate on Twitter: identifying communities, narratives, and interactions. Vaccine: X, 1, 100019.
[11] Blankenship, E. B., Goff, M. E., Yin, J., Tse, Z. T. H., Fu, K. W., Liang, H., ... & Fung, I. C. H. (2018). Sentiment, contents, and retweets: a study of two vaccine-related twitter datasets. The Permanente Journal, 22.
[12] Becker, B. F., Larson, H. J., Bonhoeffer, J., Van Mulligen, E. M., Kors, J. A., & Sturkenboom, M. C. (2016). Evaluation of a multinational, multilingual vaccine debate on Twitter. Vaccine, 34(50), 6166-6171.
[13] Chou, W. Y. S., & Budenz, A. (2020). Considering Emotion in COVID-19 vaccine communication: addressing vaccine hesitancy and fostering vaccine confidence. Health Communication, 35(14), 1718-1722.
[14] Preda, G. Pfizer Vaccine Tweets. Kaggle Repository: https://www.kaggle.com/gpreda/pfizer-vaccine-tweets, 2021.
[15] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[16] Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
[17] Kaden, M., Hermann, W., & Villmann, T. (2014). Optimization of General Statistical Accuracy Measures for Classification Based on Learning Vector Quantization. In ESANN.
[18] Scikit-Learn, sklearn.metrics.matthews_corrcoef: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html, 2021.
[19] Yousefinaghani, S., Dara, R., Mubareka, S., Papadopoulos, A., & Sharif, S. (2021). An Analysis of COVID-19 Vaccine Sentiments and Opinions on Twitter. International Journal of Infectious Diseases.
Deep Learning-based Healthcare Data Analysis System
Abstract— Healthcare data play an essential role in delivering the proper treatment to the patient at the right time. The amount of biomedical data has tremendously increased, which has also increased the challenges for its analysis. This paper combines supervised and unsupervised learning techniques to obtain insights into healthcare data for its classification. Specifically, we used a deep neural network to perform anomaly analysis to determine the nature of health problems. A deep convolutional neural network (DCNN) and a Deep Autoencoder Convolutional Neural Network (DACNN) were employed to classify the patterns of extracted electrocardiographs (ECG). We have combined the CNN model with an autoencoder to classify heartbeat data. The experimental results showed that high accuracy and lower loss are obtained using our approach.

Keywords—Deep Learning, supervised, unsupervised, ECG classification, autoencoder, Convolutional Neural Network

I. Introduction
Currently, health has become the most important concern for people around the world. Most countries have tried to counter or prevent the pandemic from affecting the major part of the population by restricting contacts in the community and giving the vaccination they prefer to use. The healthcare system has the primary role in handling spreading illness and caring for people. Healthcare data are being analyzed by many scientists to predict which viruses are spreading in the community. In this paper we propose to perform classification in a healthcare data system. We focus on heartbeat data classification using deep learning with supervised and unsupervised learning.

Heartbeat data can be produced using an electrocardiograph (ECG), which records the heart activity over a certain time, so we basically see a wavelet graph as the result. Most of the time, scientists need to preprocess the raw ECG data. In this paper we skip preprocessing raw ECG files and instead take prepared data in CSV format and analyze it using deep learning algorithms such as the Convolutional Neural Network (CNN).

Heartbeat classification in healthcare systems has been analyzed by many studies, which have contributed to finding automated systems to identify normal beats, supraventricular ectopic beats, ventricular ectopic beats, fusion beats and other abnormal beats [1].

This paper is organized as follows. Section I gives a short explanation of the main argument of this paper. Section II discusses some related works by contributors who have performed experiments. Section III states the problem, and Section IV explains the methodology that we used. Section V discusses some experiments and results. Section VI discusses the results and conclusion.

II. Related Works
Class imbalance learning can produce inferior results [2]. Class imbalance can be caused by data samples which come from different distributions. Moreover, most standard classifiers are designed to optimize a loss function based on how accurate the prediction is; however, predictive accuracy yields low performance metrics under class imbalance learning [3]. As we will see in the experiments, the heartbeat classes are not balanced in distribution. One way to deal with imbalanced learning is a preprocessing step in which the training data is modified to produce a more balanced class distribution.

The authors of [9] described a correlation-based sequential forward selection for feature selection. Highly correlated features are removed because they lead to inconsistent models. The sequential forward selection algorithm, a feature-selection method, is an iterative procedure in which subsets are included to obtain the final subset of features that yields the correct accuracy.

The authors of [3] described an autoencoder driving a deep learning architecture that can learn the hidden representations of data even when the data is perturbed by missing values (noise).

Brain tumor classification in healthcare systems was studied in [6] in order to assist radiologists in a better diagnosis analysis. The classification was investigated using convolutional neural network models, performing extensive experiments with transfer learning with and without augmentation.

III. Problem Statement
Many attempts have been made to find the best way to achieve higher accuracy in classifying highly dimensional and complex ECG data. This research paper opens the gate to finding an autoencoder system for classifying heartbeat rate categories. Meanwhile, we keep the convolutional neural network as the main part of training for supervised and unsupervised learning.

IV. Methodology
In this paper we propose a multi-layer or deep neural network (DNN) in order to find the best classification methods and the best solution for unsupervised or supervised learning neural networks. A DNN can be used for feature extraction and classification of raw ECG data extracted from patients into a number of categories.

Our data matrices have high dimensions, and the imbalanced classes make the training process more challenging. By resampling the data to balance the classes, the class distribution becomes even, which is good for training or testing convolutional neural networks. Figure 1 is the plot of the heartbeat classification for our four categories, which come from ECG signals; this sample shows the wavelet visualization.
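A minimal sketch of this balancing step, assuming the prepared CSV keeps the class label in its last column (as in the widely used heartbeat CSV exports); the file name is illustrative:

```python
import pandas as pd
from sklearn.utils import resample

# Load the prepared heartbeat data; the file name is a placeholder
df = pd.read_csv("heartbeat_train.csv", header=None)
label_col = df.columns[-1]

# Upsample every class to the size of the largest class
target = df[label_col].value_counts().max()
balanced = pd.concat([
    resample(group, replace=True, n_samples=target, random_state=0)
    for _, group in df.groupby(label_col)
])
print(balanced[label_col].value_counts())  # even class distribution
```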
E. Auto-Encoder
Classification using an autoencoder with highly dimensional heartbeat data causes a lot of complexity in finding the best-fit convolutional neural network. So we also apply similar preprocessing to the heartbeat data, balancing the classes and fixing errors while designing the models.
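A minimal sketch of such an autoencoder combined with a CNN classifier, assuming 187-sample beats and four categories; the layer sizes are illustrative, not the exact model used here:

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(187, 1))               # one heartbeat segment

# Encoder: convolutional features shared by both objectives
x = tf.keras.layers.Conv1D(16, 5, padding="same", activation="relu")(inp)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)          # 187 -> 94
encoded = tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu")(x)

# Decoder branch reconstructs the input (unsupervised objective)
y = tf.keras.layers.UpSampling1D(2)(encoded)                    # 94 -> 188
y = tf.keras.layers.Cropping1D((0, 1))(y)                       # 188 -> 187
decoded = tf.keras.layers.Conv1D(1, 5, padding="same", name="recon")(y)

# Classifier branch predicts the heartbeat category (supervised objective)
z = tf.keras.layers.GlobalAveragePooling1D()(encoded)
labels = tf.keras.layers.Dense(4, activation="softmax", name="cls")(z)

model = tf.keras.Model(inp, [decoded, labels])
model.compile(optimizer="adam",
              loss={"recon": "mse", "cls": "sparse_categorical_crossentropy"})
```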
Abstract— This research aims to compare the usage of X plate diagonal chevron dampers (Wen plasticity dampers) on multi-story building frames in various locations of the structure, accompanied by the use of treated crumbed rubber concrete in different percentages for the structure's frames. By employing these two systems together as a hybrid damping system, it will be seen that changing those systems' damping properties affects the results of pushover analysis for the structure, such as the transmitted base shear forces, the roof displacement of the performance curve, pseudo acceleration, pseudo displacement, effective period, ductility ratio and, most importantly, the effective damping ratio, which plays a major role in reducing the demand curve of the design response spectrum. 3D models are built and analyzed in ETABS, and the analysis results are compared across multiple study cases, seeking the optimum usage of such damping systems according to the indications mentioned above. The comparison in this research is focused only on the scale factor of the design response spectrum and the damping ratio resulting from the pushover analysis done for the study cases defined here.

Keywords— Damping Systems, Nonlinear Static Analysis (Pushover), Response Spectrum, Earthquakes, Crumbed Rubber Concrete, Chevron Braces

Immediate Occupancy (IO) are the descriptors of damage states, which are performance objectives only when they relate to a selected seismic hazard level. The hazard may be an earthquake or the probability of a ground shaking intensity (10% chance of being exceeded in 50 years). Using the new analysis techniques as a technical tool, it is possible to analyze buildings for multiple performance objectives. Relatively new analysis procedures help to describe the inelastic behavior of the structural components of the building, which allows the particular behavior of the building during a selected ground motion to be estimated. The analysis procedure predicts which part of the building will fail first; as the load and displacement increase, other elements begin to yield and deform inelastically.

The resulting graphical "curve" is a representation of the capacity of the building. Several alternative techniques allow the demand from a selected earthquake or ground shaking intensity to be correlated with the capacity curve to obtain a point on the curve where capacity and demand are equal, which is called the performance point (PP) of the structure and is an estimate of the particular displacement of the building for the desired ground motion.
The effective viscous damping, β_eff, is defined by:

β_eff = κ·β₀ + 5 = 63.6·κ·(a_y·d_pi − d_y·a_pi)/(a_pi·d_pi) + 5        (1)

Where:
β₀ = hysteretic damping represented as equivalent viscous damping;
κ = the k-factor, a measure of the actual hysteresis of the structure;
0.05 = the 5% viscous damping inherent in the structure (assumed to be constant).

The term β₀ can be calculated as:

β₀ = (1/(4π))·(E_D/E_S0)        (2)

Where E_D is the energy dissipated by damping and E_S0 is the maximum strain energy.
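A small worked illustration of Equation (1), using the 63.6 coefficient as it appears above; the spectral coordinates below are invented for the example, with (a_y, d_y) the yield point and (a_pi, d_pi) a trial performance point:

```python
# Effective viscous damping (in percent) at a trial performance point, per Eq. (1)
def effective_damping(a_y, d_y, a_pi, d_pi, kappa=1.0):
    beta_0 = 63.6 * (a_y * d_pi - d_y * a_pi) / (a_pi * d_pi)  # hysteretic term
    return kappa * beta_0 + 5.0  # plus the inherent 5% viscous damping

# Example: yield at (0.25 g, 50 mm), performance point at (0.35 g, 120 mm)
print(effective_damping(0.25, 50.0, 0.35, 120.0))  # ~23.9 percent
```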
energy added to the structure during a transient is absorbed by additional damping elements rather than the structure itself. An idealized additional damper would be such that the force generated by the damper is large enough and occurs at such a time that the damper forces do not increase the overall stress within the structure. Properly implemented, a perfect damper should be able to simultaneously reduce both stress and deflection within the structure [3].

A. X-shaped metallic dampers

Figure IV. Samples of X Plate Dampers with Wen Hysteretic Loops

X-plate dampers consist of one or more X-shaped steel plates, each plate having a double curvature and arranged in parallel; the number of plates depends on the amount of energy required to be dissipated in the given system. The material used to make the X-plate can be any metal that allows a large amount of deformation, such as mild steel, although sometimes lead or more exotic metal alloys are used. To reduce the response of the structure by dissipating the applied seismic energy, such a damper may be used with a suitable support system, where a combination of braces and XPDs may be used in the building structure; such an assembly is known as a device brace. When such a system is subjected to lateral forces such as earthquakes, high winds, etc., the seismic energy introduced is dissipated by their flexural deformation. They can withstand many cycles of stable yielding deformation, resulting in a high degree of energy dissipation or damping.

The goal behind using X-shaped dampers is to have a constant strain variation over their height, to ensure that yielding occurs simultaneously and uniformly over the entire height of the damper. XPDs can also behave nonlinearly, but limit the behavior of the structure to the linear-elastic region. In a series of experimental tests, the behavior of XPDs was studied and the following results were observed: it exhibits smooth nonlinear hysteretic loops under plastic deformation; it can withstand a large number of yield reversals; there is no significant stiffness or strength degradation; and it can be accurately modeled by Wen's hysteretic model or as a bilinear elasto-plastic material [4].

For X Plate Dampers (XPDs), independent uniaxial plasticity properties may be specified for each deformational degree of freedom. The plasticity model is based on the hysteretic behavior proposed by Wen (1976). All internal deformations are independent: yielding at one degree of freedom does not affect the behavior of the other deformations. If nonlinear properties are not specified for a degree of freedom, that degree of freedom is linear, using the effective stiffness, which may be zero.

The nonlinear force-deformation relationship is given by:

f = ratio·k·d + (1 − ratio)·yield·z        (11)

where k is the elastic spring constant, yield is the yield force, ratio is the specified ratio of post-yield stiffness to elastic stiffness (k), and z is an internal hysteretic variable. This variable has a range of |z| ≤ 1, with the yield surface represented by |z| = 1. The initial value of z is zero, and it evolves according to the differential equation:

ż = (k/yield)·ḋ·(1 − |z|^exp)   if ḋ·z > 0
ż = (k/yield)·ḋ                 otherwise        (12)

where exp is an exponent greater than or equal to unity. Larger values of this exponent increase the sharpness of yielding; the practical limit for exp is about 20. The equation for ż is equivalent to Wen's model with A = 1 and α = β = 0.5.

Figure V. Nonlinear Force-Deformation relationship of Wen Plasticity
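A small numerical sketch of Equations (11)-(12), integrating the hysteretic variable with explicit Euler steps; the stiffness, yield force, post-yield ratio and exponent values are invented for illustration:

```python
import numpy as np

def wen_force(displacements, dt, k=1000.0, yield_force=50.0, ratio=0.05, exp=2.0):
    """Return the Wen link force history for a given displacement history."""
    z, d_prev, forces = 0.0, 0.0, []
    for d in displacements:
        d_dot = (d - d_prev) / dt
        # Eq. (12): z evolves more slowly as it approaches the yield surface |z| = 1
        if d_dot * z > 0:
            z_dot = (k / yield_force) * d_dot * (1.0 - abs(z) ** exp)
        else:
            z_dot = (k / yield_force) * d_dot
        z = min(max(z + z_dot * dt, -1.0), 1.0)
        # Eq. (11): post-yield elastic share plus hysteretic share of the force
        forces.append(ratio * k * d + (1.0 - ratio) * yield_force * z)
        d_prev = d
    return np.array(forces)

# One slow displacement cycle traces a hysteresis loop like that of Figure V
t = np.linspace(0.0, 1.0, 1001)
d = 0.2 * np.sin(2.0 * np.pi * t)
f = wen_force(d, dt=t[1] - t[0])
```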
support system, where a combination of braces and XPDs may
be used in the building structure and such an assembly is III. Rubber Concrete
known as a device brace. When such a system is subjected to Crumb Rubber can absorb sudden shocks by controlling
lateral forces such as earthquakes, high winds, etc., the seismic
the motion of waves transmitted by moving loads. Rubber
energy introduced is dissipated by their flexural deformation.
compresses and deforms easily, but the rate of deformation
They can withstand many cycles of stable yielding
decreases as the load increases, making it a good shock
deformation, resulting in a high degree of energy dissipation
absorber. Rubber has very low hysteresis and controls energy
or damping.
dissipation in a system subjected to vibration-induced forces.
The goal behind using X-shaped dampers is to have a Rubber has an inherent damping ability and this property can
constant strain variation over their height to ensure that be exploited to improve the damping and vibration
yielding occurs simultaneously and uniformly over the entire characteristics of concrete. In addition, rubber is lightweight
height of the damper. XPDs can also behave nonlinearly, but and non-corrosive, making it easy to use and apply. It is
limit the behavior of the structure to the linear-elastic region. relatively inexpensive compared to other conventional
In a series of experimental tests, the behavior of XPDs was dampers such as steel springs and tuned mass dampers.
studied and the following results were observed: It exhibits
Material scientists have attempted to form concrete with a
ductile material. However, it appears that due to the brittle
nature of concrete, the most direct and effective approach to
creating damage tolerant concrete structures would be to
embed intrinsic tensile ductility into concrete. If concrete
behaves like steel in tension (highly ductile) while retaining
all other advantages (e.g., high, and extreme compressive
strength), concrete structures with improved serviceability
and safety can be easily realized. In addition, crumb rubber
can absorb sudden impacts by controlling the movement of
waves transmitted by moving loads. Rubber compresses and
deforms easily, but the rate of deformation decreases as the
load increases, which makes it a good damper. Rubber has low
hysteresis and controls energy dissipation in a system that is
highly exposed to vibration-induced forces. Structures located
Figure V. Nonlinear Force- Deformation relationship of Wen near roads are subject to vibrations caused by moving
Plasticity vehicles. These vibrations are harmful to the pavement and
structures adjacent to the pavement. The use of rubber in
pavements as a substitute for natural aggregate can reduce this effect [5].

IV. Optimum usage of mixed damping
Nowadays, technological development and the practical solutions it provides to complex issues have become an urgent necessity that cannot be dispensed with. Accordingly, the development of damping systems used in modern structures and high-rise buildings depends mainly on computer software. Three-dimensional models with study cases were modeled using ETABS to support this research, based on the assumptions and equations mentioned in the previous chapters.

The main objective of mixing damping systems (rubber concrete, and chevron braces with XPDs) is to make a comparison between models with crumbed rubber added to the concrete aggregates in 5, 10, 15 and 20% ratios, as shown in Table I.

Table I. Definitions of 3D models (crumbed rubber ratio in concrete aggregates: 0, 5%, 10%, 15%, 20%)

Model Code | Type of Braces and Dampers Used in the Model
3D_Nor_NDR | NA
3D_Nor_XB_1 | 1 Middle Chevron Brace without XPDs
3D_Nor_XB_2 | 2 Corner Chevron Braces without XPDs
3D_Nor_XD_1 | 1 Middle Chevron Brace with XPDs
3D_Nor_XD_2 | 2 Corner Chevron Braces with XPDs

In pushover analysis, the effect of damping is related to the damping properties of the structural elements' materials and the properties of the added damper elements used in the model. Thus, in nonlinear pushover analysis, β₀, the hysteretic damping represented as equivalent viscous damping, changes according to the materials of the structure and the damper elements added (braces, chevrons and XPDs). It is also related to E_D, the energy dissipated by damping, and E_S0, the maximum strain energy. The β_eff values obtained from analyzing the models are used in estimating spectral reduction factors to decrease the elastic (5% damped) response spectrum, which plays a main role in the comparison, as we will see.

Five types of concrete material were defined in ETABS for the models, with crumbed rubber added to the aggregates according to the percentages mentioned above, including their damping properties as shown in Table II; the table values were taken from the research done by Najib N. Gerges, Camille A. Issa and Samer A. Fawaz [5].

Table II. Damping properties of crumbed rubber concrete (recovered rows only): 4.327 | 3.656 | 3.015 | 2.361; 3.22175 | 2.79 | 2.484 | 2.034; Average | 2.492 | 1.799 | 1.789 | 1.319

Meanwhile, during the analysis the performance curve is created accordingly, as the pushover analysis depends firmly upon the first-mode participation ratio of the structure's modal analysis. The Wen XPDs used in this research were selected due to simplicity of manufacturing and availability in local markets. As seen in Chapter II, XPDs work in both linear and nonlinear behaviors according to the degree-of-freedom direction in which the link or damper acts; thus, the plasticity property of the added links or dampers adds extra energy absorption as a nonlinear action. This research was made to see how all those factors affect the pushover analysis in structures with multiple damping systems.

A. 3D model specifications used in the research
3D models were developed, to be analyzed in the ETABS program, compatible with the following:

• The proposed model consists of a symmetrical 26 stories with 3 bays in the X and 3 bays in the Y directions.
• Wen plasticity dampers (XPDs) are added to the 3D models in the middle bays in the X and Y directions as a first step; in the next step, the added XPDs are removed from the middle bays and added to the corner bays.
• Chevron braces without XPDs are considered in the analysis as a comparison, as shown in Table I.
• The first and second steps are repeated for all types of rubber concrete used in this article, as explained in the previous chapters.
• In the 3D models the Response Spectrum will be amplified for
Sections used in the models:

Section | Height (cm) | Width (cm) | Location (Story) | Number of elements
C40 | 40 | 40 | 1 to 10 | 160
C35 | 35 | 35 | 11 to 19 | 144
C30 | 30 | 30 | 20 to 26 | 112
Comb | 50 | 30 | All Stories | 624
HSS101.6*4.8 (Circular) | 10.2 | 4.8 | Middle/Corner Bays | 208/416
Slab 1 (thin shell) | 20 | | 1 to 26 | 26

Loads:
• Dead load was defined as a linear static load and applied for all models as the self-weight of all structural elements, in addition to 2 kN/m² for each area element (slabs).
• Modal load was defined as Eigen vectors with zero initial conditions and the defined mass sources of the model.
• Live load was defined as a linear static load and applied as 2 kN/m² for all area elements (slabs) of the models, with zero initial conditions.
• Pushover load is a nonlinear static type continuing from the state at the end of another nonlinear case (gravity nonlinear), where all modal loads are applied using modes from the defined modal load with the same mass sources; load application is Displacement Control with multiple states.

Table V. PROPERTIES AND FIELDS OF P-M2-M3 HINGES

Hinge Type: P-M2-M3 Moment/Rotation Data | A | B | C | D | E
Moment/Yield Mom | 0 | 1 | 1.1 | 0 | 0
Rotation/scale factor | 0 | 0 | 0.01 | 0.01 | 0.01

Table VI. ACCEPTANCE CRITERIA OF P-M2-M3 HINGE TYPES
Figures IX and X show that the structures in the models reach their ultimate capacity with the most reduced demand and the highest scale factor of the design response spectrum, which means that the structures are pushed to the maximum limits, in the 15% added crumbed rubber models. Thus, this leads us to conclude that an optimum situation can be obtained from mixing the damping systems, such as adding crumped rubber to the concrete aggregates or installing other types of dampers like the XPDs, viscous liquid dampers or mass dampers, etc.

Figure IX. Pushover plot for model 3D_Nor_XD_2_R15%

Figure X. Detailed pushover plot for model 3D_Nor_XD_2_R15%

V. Conclusion
The locations of the XPDs used in the research played a major role in the mechanism of the model's response and behavior towards the design response spectrum curve, and also played a major role in raising the scale factor in some study cases compared to other models, in addition to the ratios of crumbed rubber added to the aggregates in the concrete used in the models.

Rubber concrete may be a good option to raise the damping ratio at the performance point of the structure; it contributes to absorbing energy by a certain amount, but it is not sufficient individually. Adding extra damping systems can form a hybrid system that is more effective in raising the effective damping ratio of the structure as a whole, thus giving a greater absorption capacity for seismic energy and increasing the scale factor of the design response spectra, which means increasing the structure's damping. The proportions of crumbed rubber added should also be carefully studied to obtain the optimum results.

The results obtained depend on the results of other research, which may not be accurate. Selecting the locations of the XPDs and the ratio of crumped rubber added to the concrete aggregates is critical and depends on the geometrical properties of the structural models and the study case. For future research it is recommended to use optimization applications on such cases to highlight the most proper usage of damper locations in structures, with the best ratio of added crumped rubber, in order to reach practical applications of this research.

References
[1] Kheir Al-Kodmany (2017). Understanding Tall Building: A Theory of Place Making. Routledge, Taylor and Francis Group.
[2] Applied Technology Council (ATC-40 Project) (1996). Seismic Evaluation and Retrofit of Concrete Buildings. Seismic Safety Commission, State of California.
[3] U. D. D. Liyanage, T. N. Perera, H. Maneetes (2018). Seismic Analysis of Low- and High-Rise Building Frames Incorporating Metallic Yielding Dampers. Civil Engineering and Architecture.
[4] Sarika Radhakrishnan, Sanjay Bhadke (2016). Seismic Performance of RC Building with X-plate and Accordion Metallic Dampers. International Research Journal of Engineering and Technology (IRJET), Volume 03, Issue 07, July 2016.
[5] P. Sugapriya, R. Ramkrishnan, G. Keerthana and S. Saravanamurugan (2018). Experimental Investigation on Damping Property of Coarse Aggregate Replaced Rubber Concrete. IOP Conf. Series: Materials Science and Engineering, 310 (2018) 012003. doi:10.1088/1757-899X/310/1/012003.
Bidirectional DC-DC Converter Based on Quasi Z-Source
Converter with Coupled Inductors
Murat Mustafa Savrun
Department of Electrical & Electronics
Engineering, Adana Alparslan Türkeş
Science and Technology University,
Adana, Turkey
msavrun@atu.edu.tr
Abstract—This paper presents an improved bidirectional quasi z-source dc-dc converter that achieves a reduced input current ripple. The proposed converter consists of a quasi z-source converter employing coupled inductors, a shoot-through switch, and an output filter. The proposed converter interface is able to provide bidirectional power flow as well as high voltage gain. Besides, the quasi z-source converter equipped with coupled inductors makes it possible to reduce the input current ripple. In order to demonstrate the improvement of the input current ripple and to verify the power flow functionalities and high voltage capability, a battery-connected proof-of-concept model has been developed in the MATLAB/Simulink environment. The proposed topology is examined under different power flow directions, battery charging algorithms, and various voltage gain values. Besides, the efficiency of the converter is analyzed for all case studies. The results validate the viability and effectiveness of the proposed converter.

Keywords—bidirectional dc-dc converter, quasi z-source converter, high voltage gain, coupled inductors

I. Introduction
Nowadays, the use of systems equipped with renewable energy sources (RESs), which are rapidly replacing fossil fuels, has been increasing. RESs have a non-linear production behavior due to their nature; therefore, they are often equipped with batteries and DC-DC converters in order to regulate their output. Power electronics converters have great importance in interfacing RESs, batteries and loads in application areas such as renewable-energy-based distributed generation, electric vehicles and microgrids. Step-up converters are frequently used due to the relatively low output voltage levels of RESs and batteries as well as their inherent limitations. Step-up dc-dc converters temporarily store low voltage input energy in magnetic field storage components and transfer it to the output at high voltage levels [1]. The traditional boost converter topology has the advantages of low conduction loss and simplicity of design. However, it cannot be used in voltage-sensitive applications that need high voltage gain, because of the restrictions of limited voltage gain and high output voltage ripple [2]. Several dc-dc converter topologies have been proposed in the literature regarding the high voltage gain issues. The high voltage gain topologies are categorized into two groups, isolated and non-isolated dc-dc converters [3]. Isolated dc-dc converters are equipped with high-frequency transformers (HFT) in order to isolate the primary and secondary sides as well as to provide high gain through the turns ratio. However, topologies equipped with high-turns-ratio HFTs provide limited voltage gain and sacrifice efficiency because of relatively high conduction losses [4]. The non-isolated topologies are frequently used in applications where there is no need for isolation, due to their reduced size and cost advantages [5]. In RES-based applications, various high voltage gain dc-dc converters are applied. These converter types are the impedance network [6], switched capacitor [7], capacitor clamped [8], cascaded boost [9], and quadratic boost [10] converters. Many of these topologies have the drawback of discontinuous input current [11]. In addition, coupled-inductor-based boost converter topologies have been developed to enhance the voltage boost capability with low input current ripple. These converters achieve high voltage gain thanks to the turns ratio of the coupled inductors. However, higher turns ratios increase the leakage inductance of the coupled inductors; therefore, instantaneous voltage spikes and their disruptive effects increase [12]. In [13], a quasi z-source inverter equipped with a coupled-inductor topology, with the advantages of high voltage gain and reduced input current ripple, is proposed to drive a motor.

In this paper, the bidirectional quasi z-source dc-dc converter, which excels with its high voltage gain capability, is equipped with coupled inductors in order to reduce the input current ripple. The proposed topology has the superior aspects of each of the two approaches. The high voltage gain is obtained by the quasi z-source converter, while the disadvantage of high leakage inductance due to a high turns ratio of coupled inductors is eliminated. Besides, the relatively high input current ripple of the quasi z-source converter is reduced via the coupled inductors. The performance of the converter has been evaluated under different case studies.

The paper is organized as follows: the high voltage gain dc-dc converter topology and its operation principle are described in Section II, whilst the control scheme is outlined in Section III. Section IV presents the operation waveforms captured under the defined case studies. Finally, conclusions and discussions are presented in Section V.

II. Power Circuit Structure
The power circuit configuration of the proposed converter is illustrated in Figure I. The proposed impedance-network-based converter consists of a quasi z-source converter, coupled inductors, an active switch, and an output filter. The quasi z-source converter allows a low input voltage to be increased to relatively high voltages thanks to its high gain characteristic, and it is equipped with coupled inductors in order to reduce the input current ripple. The active switch is used to control the shoot-through and non-shoot-through states of the quasi z-source converter. An LC filter is used to filter the oscillations at the output of the converter. The quasi z-source converter makes it possible to perform bidirectional power flow. Therefore, the proposed converter can be used not only for voltage regulation of RESs with low output voltage, but also for both charging and discharging of low voltage batteries. In order to test and evaluate the power transfer performance in both directions, a simulation model endowed with a battery has been developed.
Figure I. Power circuit configuration of the proposed converter (labels: Vb, Cq1, Cq2, Sst, Cout, DC-Link)
III. Control Scheme
The controller of the proposed system depends on the duty cycles of the two active switches. During forward power flow, while the freewheeling diode of the Sz switch is biased, the Sst switch is triggered with the determined duty cycle value in order to regulate the shoot-through condition and reach the desired voltage gain value. The duty cycle value varies within the limits of 0-0.5. During reverse power flow, while the Sst switch is in the OFF state, the Sz switch is triggered; the duty cycle value of the Sz switch determines the charging current of the battery. Batteries are charged according to a charging algorithm in order to extend their service life. The most commonly used battery charging algorithm is constant current (CC) / constant voltage (CV) charging. To perform CC/CV charging, the battery current needs to be controllable. The controller of the converter is illustrated in Figure II. The battery is charged up to the threshold voltage value, and the charging operation ends when the charging current reaches a value close to zero (0.005 C).
IV. Performance Results of the Proposed Converter
The proposed converter has been modeled, tested and evaluated in the MATLAB/Simulink environment. To evaluate its performance, a proof-of-concept model has been developed with a 100 V, 50 Ah lithium-ion battery. The parameters of the simulated system are listed in Table I. The switching elements are chosen as IGBTs, considering the switching frequency and power transfer rating. The performance investigation has been conducted under different voltage gain conditions for forward power flow and under CC/CV charging conditions for reverse power flow. The case studies summarized in Table II have been formed considering all possible operation scenarios.
Table II. Operation scenarios

Case 1:
Time Interval | Duty Cycle (Sst) | Output Voltage | Output Current | Battery Voltage | Battery Current | Operation Mode
0-2 s | 0.1 | 240 V | 0.475 A | 108.6 V | 1.1 A | Gain: 2.21
3-6 s | 0.15 | 407 V | 0.81 A | 108.5 V | 3.17 A | Gain: 3.75
7-10 s | 0.2 | 641 V | 1.27 A | 108.3 V | 7.93 A | Gain: 5.91
11-14 s | 0.25 | 939 V | 1.86 A | 108 V | 17.22 A | Gain: 8.69
15-18 s | 0.3 | 1294 V | 2.56 A | 107.4 V | 33.27 A | Gain: 12.05

Case 2:
Time Interval | Duty Cycle (Sz) | Output Voltage | Output Current | Battery Voltage | Battery Current | Operation Mode
0-2 s | 0.31 | 1000 V | -2.9 A | 110 V | -24.8 A | CC Charging
3-6 s | Decrease | 1000 V | Decrease | 110 V | Decrease | CV Charging
The second case represents the reverse power flow operation; the 0-2 s and 2-4 s time intervals correspond to the CC and CV charging operations, respectively. During this case study, the battery is charged from the load side to verify the reverse power transfer capability of the converter. It is assumed that the dc-link voltage of the output side is 1 kV. While the Sst switch is in the OFF state, the Sz switch is triggered according to the enabled charging algorithm. The maximum battery charging current is determined as 25 A, considering 0.5 times the battery capacity (0.5 C). Thus, the battery is charged with constant current up to the threshold voltage value. After the battery achieves the determined voltage value, the charging algorithm switches to CV charging. During this stage, the battery current gradually decreases to 0.005 times the battery capacity. Figure IV illustrates the operation waveforms of case 2. The efficiency values for the related time intervals are computed as 95.1% and 94.8%, respectively. The performance waveforms reveal that the proposed converter provides bidirectional power flow with high efficiency.
Case-1 Operation Waveforms (battery voltage, battery current, load current, battery power, load power and battery SOC versus time)
Figure IV. Case-2 Operation Waveforms (battery and load quantities and battery SOC versus time)
[10] S. Lee and H. Do, "Quadratic Boost DC–DC Converter With High Voltage Gain and Reduced Voltage Stresses," IEEE Transactions on Power Electronics, vol. 34, no. 3, pp. 2397-2404, 2019.
[11] M. Moslehi Bajestan and M. A. Shamsinejad, "Novel switched-coupled-inductor quasi-Z-source network with enhanced boost capability," Journal of Power Electronics, vol. 20, no. 6, pp. 1343-1351, 2020.
[12] Y. Hsieh, J. Chen, T. Liang, and L. Yang, "Novel High Step-Up DC–DC Converter With Coupled-Inductor and Switched-Capacitor Techniques," IEEE Transactions on Industrial Electronics, vol. 59, no. 2, pp. 998-1007, 2012.
[13] A. Battiston, E.-H. Miliani, S. Pierfederici and F. Meibody-Tabar, "A Novel Quasi-Z-Source Inverter Topology With Special Coupled Inductors for Input Current Ripples Cancellation," IEEE Transactions on Power Electronics, vol. 31, no. 3, pp. 2409-2416, 2016, doi: 10.1109/TPEL.2015.2429593.
Influences of urban fabrics on microclimate assessment
within the city of Tirana
Fabio Naselli Enkela Krosi
Department of Architecture Department of Architecture
Epoka University Epoka University
Tirana, Albania Tirana, Albania
fnaselli@epoka.edu.al ekrosi@epoka.edu.al
Abstract— Tirana, the capital city of Albania, could not escape the feature that characterizes all the cities of the past socialist regime: a sudden and loosely governed development process. The study aims to emphasize the problematics and the differences in microclimate levels that coexist across the existing urban fabrics within Tirana, which reduce the comfort of city inhabitants and are mainly induced by the post-socialist urban growth. We want to point out the reduction of open and green areas and the occupation of free soils by high-rise or informal buildings, which have increased the UHI (Urban Heat Island) effect, modifying the local metabolism and reducing the general urban comfort. The UHI effect is escalated further by the substitution of natural materials with asphalt and concrete. An analytical interpretation of land cover ratios, starting from both land use and land cover analyses, has been conducted for 2 contiguous urban sites in the city of Tirana, selected among the diverse typologies of fabrics in the same area of the city. The research shows that uncontrolled edification may generate different urban environments that exhibit different microclimatic levels despite their proximity to one another.

Keywords— microclimatic values, UHI, urban metabolism, land cover, land use, informal urban development

I. Introduction
Approximately half of the population of the world lives in urbanized areas, and there is a tendency for this to increase [1], [2]. This phenomenon has resulted in air, noise and land pollution and has consequently changed the microclimate of urban areas. The change in atmospheric and climatic conditions affects our mood and activities and even our daily productivity [3]. The phenomenon whereby temperatures in urban areas register higher values compared to the periphery is commonly known as the Urban Heat Island (UHI) effect. UHI is directly affected by the change in wind speed, is stronger during the night [4], increases with urbanization and population, and has recently been altered by the change in land use and land cover ratios [5], [6]. Greenery, a crucial element that mitigates the UHI effect, is being rapidly reduced [7], although it has a direct relation to the reduction of health problems in humans [8]. The fast expansion of urban morphology with the invention of new and automated construction methods and techniques has resulted in the alteration of microclimate values, especially the air temperature, across different building blocks. Particularly in Tirana, after the decline of the communist regime, an uncontrolled urban growth spread quickly to almost all of the city area. This uncontrolled urban sprawl has spread in such an unpredictable manner that even adjacent building blocks separated by one main road show a big disparity in terms of land use, land cover, building intensity and so on. The present condition is characterized by the substitution of natural surfaces and greenery with impermeable hardscapes such as asphalt and concrete, which store a large amount of radiant heat and possess low albedo values. The high building height of recent constructions has reduced the sky view factor ratios and reduced the ventilation of outdoor areas and wind passages, which prevents the cooling of urbanized zones [9] and reduces the daylight level, causing gloom within building canyons [10]. The aim and objective of this paper is to accentuate the problematics associated with and induced by the rapid, uncontrolled urban sprawl, which directly influences the microclimate of urbanized zones in the city of Tirana. The study intends to raise the awareness of environmental regulation authorities about the related phenomenon, to contribute to the improvement of thermal comfort and energy saving [11], since buildings consume almost 40% of energy during their whole lifecycle [12], to highlight the need for green areas, to improve the quality of urbanized areas and, overall, to orient strategies towards sustainable urban development [13].

II. Influences of urban fabrics
A. The Study areas
The selected sites, to be analyzed in terms of microclimate assessment with regard to their land cover and land use features, are located in the city of Tirana, Albania (Figure I), adjacent to one of the main roads of the city, which leads the movement fluxes towards the city center; the opposing typologies on either side of the road are of different features and architectural values, constructed over a time period of a hundred years. The very first blocks facing the road are apartments of 4- and 5-floor compositions accompanied by 2- to 3-story villas reflecting the influence of Italian architecture. On the back side of the first facing blocks, the architecture and urban morphology have lost their spatial character through amateurish and profit-oriented interventions, which have lowered the value of the outdoor common spaces and even reduced them in size.

Figure I. Tirana city
The existing open and green areas have been occupied by new buildings that frequently disobey even the urban planning rules and regulations (Figure II).
The city of Tirana is characterized by a Mediterranean climate; it is one of the wettest and sunniest cities in Europe. Tirana has a diverse urban morphology, developed through uncontrolled urban sprawl and the rapid extension of the city border.
The ratios of land covering can be better understood in the charts.

Table II. Total built area for Z1 & Z2

Floors            Zone 1: Area   % tot.   Built area     Zone 2: Area   % tot.   Built area
built area        9825           100%     -              8050           100%     -
10k               0              0.0%     0              675            8.4%     6750
5k                3575           36.4%    17875          4525           56.2%    22625
3k                100            1.0%     300            1000           12.4%    3000
2k                3150           32.1%    6300           1150           14.3%    2300
1k                3000           30.5%    3000           700            8.7%     700
tile cover        4375           44.5%    -              1550           19.3%    -
asphalt cover     5450           55.5%    -              6500           80.7%    -
total built area  -              -        27475          -              -        35375

Table II exhibits the ratios of the buildings by their floor count, which are directly proportional to the building intensity. Table II shows that in Z1 the share of built area in buildings of more than 5 floors is 36.4%, while in Z2 it is 64.6%. The ratio of the inhabited built area over the total land use is drastically higher in Z2 than in Z1; the values are shown in Table IV at the intensity rows, where for Z1 it is 1.48 and for Z2 it is 2.59. Another issue extracted from Table II is that materials that store more heat, like asphalt and concrete, are predominant in Z2. From the values shown in Table IV it can be comprehended that, with coefficients of land usage of Z1 = 53% and Z2 = 59%, Z1 possesses more open and public spaces with green character compared to Z2. The intensity values show that the sky view factor is highly reduced in Z2 due to the large concentration of buildings of more than 5 floors within the zone.

Figure VII. Info of the building regulations for Z2 by municipality, extracted from planifikimi.gov.al

Table III. Intensity and L.U. coefficient: permissibility values and actual situation

                      Zone 1: Permissibility value   Actual situation   Zone 2: Permissibility value   Actual situation
intensity             3.3                            1.48               2.4                            2.59
coef. of land usage   45%                            53%                45%                            59%
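As a quick cross-check of these figures, the intensity values follow directly from the tabulated numbers once the zone areas are back-calculated from the stated coefficients of land usage; a short sketch (a reading of the tables, not part of the original paper; units as in the source) reproduces them:

    #include <cstdio>

    int main() {
        // Figures taken from Tables II and III: building footprint, total
        // built floor area, and the stated coefficients of land usage.
        const double footprint_z1 = 9825.0, built_z1 = 27475.0, lu_z1 = 0.53;
        const double footprint_z2 = 8050.0, built_z2 = 35375.0, lu_z2 = 0.59;

        // Zone area is back-calculated from L.U. = footprint / zone area.
        double zone_z1 = footprint_z1 / lu_z1;   // ~18,538
        double zone_z2 = footprint_z2 / lu_z2;   // ~13,644

        // Building intensity = total built floor area / zone area.
        printf("Z1 intensity: %.2f\n", built_z1 / zone_z1);  // ~1.48
        printf("Z2 intensity: %.2f\n", built_z2 / zone_z2);  // ~2.59
        return 0;
    }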
Figure VI and Table IV show the values of Land Use and Building Intensity provided by the municipality of Tirana in its building regulation code, extracted from the official webpage planifikimi.gov.al. From the figure it can be understood that the Land Use coefficient is exceeded under current conditions for both zones. Regarding the intensity values, Z1 has a permissibility of 3.3 while the actual value is 1.48; on the other hand, in Z2 the permissibility value of the intensity is 2.4 while the actual condition is 2.59. This has occurred due to the demolition of the old Italian-style villas in Z2 and the construction of new high-rises.
Figure VI. Info of the building regulations for Z1 by municipality, extracted from planifikimi.gov.al

III. Discussion
Microclimate assessment of urbanized areas is widely practiced and has been developed through several different strategies and methods for achieving certain results. Past research on microclimate assessment, using various methods and tools, has shown that urban microclimate is directly affected by a number of elements and conditions, such as green ratio, material characteristics, built/unbuilt ratio, sky view factor and so on. Building on these findings, our study was designed to understand the variety of microclimate change within the urban blocks of the city of Tirana. The land use ratios extracted from the two zones in this research were selected intentionally to exhibit the disparities that adjacent urbanized zones possess.

The Z1 analysis has shown that the block is surrounded by 5-floor apartment buildings, while within the block low-rise private houses dominate, offering green space through their private gardens; in Z2, by contrast, the demolished old private houses have been substituted by high-rise apartments that have reduced the open and green areas within the block.

Table IV. Concluding table

Zone 1                        Zone 2
high green areas              low green areas
low-rise building             high-rise building
L.U. = 53%                    L.U. = 59%
I = 1.48                      I = 2.59
large areas of tile covering  large areas of asphalt
high sky view factor          low sky view factor

Referring to the concluding Table IV and considering past research on similar topics, it can be concluded from the measurement of Land Use and Land Cover ratios for the two selected zones that between Z1 and Z2 there should exist a difference in air temperature, wind flow and ventilation level within the zones, and also a difference in UHI values for the two study zones.

IV. Conclusion
Tirana, as a city that has passed through several urban transformations, exhibits different urban characteristics that are evident even in zones located in close proximity within the city. The rapid and uncontrolled urban sprawl has further accentuated the urban morphological variations. This phenomenon has altered the built/unbuilt ratios and reduced the green areas, which has a direct relation to the increase of urban microclimate temperatures. The reduction of the sky view factor and the increasing amount of heat-storing materials in building construction, such as asphalt and concrete, have increased the UHI effect, which reduces the indoor and outdoor comfort of the citizens of Tirana. The two selected study zones show that even in areas adjacent to each other, due to urban morphological variations and land cover differences, there exist variations in temperature and ventilation, which have a direct effect on mood and productivity.

References
[1] Wei Ruihan, Dexuan Song, Nyuk Hien Wong, and Miguel Martin, "Impact of Urban Morphology Parameters on Microclimate," Elsevier, The Netherlands, 2016.
[2] , "Land Use/Land Cover changes dynamics and their effects on Surface Urban Heat Island in Bucharest, Romania," Int. J. Appl. Earth Obs. Geoinformation, 2019.
[3] , Press, New Jersey, 1962.
[4] Ibidem (1).
[5] Ibidem (1).
[6] Ibidem (2).
[7] Erell Evyatar, David Pearlmutter, and Terry Williamson, Urban Microclimate. New York: Taylor and Francis, 2011.
[8] Roberts Hannah, Rosemary McEachan, Tamsin Margary, Mark Conner, and Ian Kellar, "Identifying Effective Behavior Change Techniques in Built Environment Interventions to Increase Use of Green Space: A Systematic Review," Environment and Behaviour, SAGE Publications, 2018.
[9] Tsoka Stella, "Investigating the Relationship Between Urban Spaces Morphology and Local Microclimate: a study for Thessaloniki," Elsevier, 2017.
[10] Oke Timothy Richard, "Street Canopy and Urban Layer Climate," Environment and Behaviour, SAGE Publications, 1988.
[11] Ibidem (1).
[12] Kocagil Idil Erdemir, and Gul Koclar Oral, "The Effect of Building Form and Settlement Texture on Energy Efficiency for Hot Dry Climate Zone in Turkey," Elsevier, 2015.
[13] Mills Gerald, "Progress toward sustainable settlements: a role for urban climatology," Theoretical and Applied Climatology, 84, Springer-Verlag, 2006.
IoT Based Water Management and Monitoring System for
Multi-Resources
Sarosh Ahmad, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sarosh786a@gmail.com
Sheza Yasin, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, shezay42@gmail.com
Sajal Naz, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sajalnaz751@gmail.com
Abstract—This paper aims to manage water distribution in an aligned manner so that everyone gets an equal amount of water without wastage. Without a PLC (programmable logic controller) and sensors, feedback is not obtained quickly. The proposed system is fully automated by connecting it to PLC & SCADA (Supervisory Control & Data Acquisition), which provides an IoT (Internet of Things) solution. This paper proposes a Multi-Resource Control System. Through this project, human effort, time, resource wastage and other complications are reduced. Our idea is to minimize water wastage through a transparent, accountable, and efficient water supply system, which in turn reduces human effort and minimizes the use of other resources such as electricity. The goal, therefore, is to implement an efficient multi-resource water management system at an affordable cost. Thus, we design a setup through which we investigate various parameters such as pH, water level and turbidity of the water, manage them by comparison against a set benchmark, and manage the flow meter reading depending on the population. In this research, we have simulated the salt level testing system, the UV (ultra-violet) testing system, and the pH testing system, and designed a system to manage and distribute water equally without wastage.

Keywords—Programmable Logic Controller, Supervisory Control & Data Acquisition, Internet of Things, Multi-Resource Control System, water level control, wireless sensors

I. Introduction
The water on the world's surface is unevenly distributed. Just 3% of the water at the surface level is usable; the remaining 97% is found in the seas. Of the freshwater, 69% is found in glaciers, 30% underground, and less than 1% in lakes, waterways, and marshlands [1]. This all concludes that just a single percentage of the world's surface water is usable by the population on earth, and 99% of that usable amount resides underground. Consequently, water administration and dispersion must be done expertly. Due to the rapid growth of the population, the water requirement is increasing day by day, which raises several issues such as water scarcity and shortage. These problems have been increasing swiftly, affecting home users, agricultural lands, and industrial sectors. The traditional approach of pumping underground water through pumps, tube wells, Petter engines, etc. involves no proper water management. Therefore, unnecessary use and waste of water gradually reduce the level of underground water, causing severe problems to the environment as well as to individuals. The future need is to utilize water resources efficiently with a water management system. A new method based on IoT with PLC and SCADA is proposed; such a framework is required because it deals with the utilization of water. Control engineering has passed through many innovative changes over the last few years. Previously, human beings were the only means of manipulating and commanding any framework [1]. Troubleshooting helps in analyzing and rectifying errors. Due to their reliability, such systems can be used for years without any malfunction [2-4]. Automation is the theme of our proposed project, as it plays a vital role in controlling human errors. This system works on different parameters of the water, such as level and flow rate. Using these parameters, water wastage and water theft can be avoided. On these grounds, the evolution of technology enhances distinctive methods, keeping the economic aspects in perspective [5-7]. The Internet of Things (IoT) is a system composed of several branches of mechanical, electrical and computing devices and wireless technologies that can be employed to achieve water management system requirements. The water pump and pump station can be regulated through the Multi-Resource Control System (MRCS), which is an innovative technique. The pump controller, the water level in the water storages, and the alarming framework are integral sections of the MRCS. Additionally, a 4-state switch will be designed, which helps in operating the system manually, automatically, via the IoT method, or in the Off state. This hierarchy will be driven by IoT technology, directed by SMS, Wi-Fi, or ringtone, and will be accessible from anywhere and at any time [8-10]. The water management system through MRCS using the IoT technique may be considered among the modern ways of significantly controlling the wastage of water.

A. Programmable Logic Controller (PLC)
A programmable logic controller is a computerized controller that is utilized to control electromechanical processes through automation. PLCs are used to control processes such as amusement rides, machinery in manufacturing plants, water-tank fire extinguishing in aviation, filling machine control in the food industry, closed-loop material shrinkage control, and other processes in our day-to-day life. The PLC was chiefly intended for multi-input and multi-output processes, as shown in Figure I. This was further extended to wide temperature ranges, immunity to electrical noise, and resistance to vibration and other impacts [11].
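A PLC program is typically expressed in ladder logic; as a rough illustration of what one scanned rung computes, the classic start/stop seal-in rung can be written in ordinary C++ (a sketch for illustration only; the paper's actual ladder program is not shown):

    #include <cstdio>

    // One scan of a start/stop seal-in rung:
    //   (Start OR Motor) AND NOT Stop -> Motor
    bool scanRung(bool start, bool stop, bool motor) {
        return (start || motor) && !stop;
    }

    int main() {
        bool motor = false;
        motor = scanRung(true,  false, motor);  // operator presses Start
        printf("after Start: %d\n", motor);     // 1 (latched on)
        motor = scanRung(false, false, motor);  // Start released, seal-in holds
        printf("held:        %d\n", motor);     // 1
        motor = scanRung(false, true,  motor);  // Stop pressed
        printf("after Stop:  %d\n", motor);     // 0
        return 0;
    }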
B. Supervisory Control & Data Acquisition (SCADA)
SCADA is broadly utilized in industry for supervisory control and data acquisition of industrial processes; SCADA frameworks are now also penetrating experimental physics laboratories for the control of ancillary systems such as cooling, ventilation, power distribution, and so forth. More recently, they were also applied to the controls of smaller-size particle detectors, for example the L3 muon detector and the NA48 experiment, to name only two examples at CERN. SCADA frameworks have made substantial progress over recent years in terms of functionality, scalability, performance, and openness, such that they are an alternative to in-house development even for very demanding and complex control systems like those of physics experiments [17], as can be seen in Figure II.

Figure II. Block Diagram of the SCADA system

C. Internet of Things (IoT)
These interconnected objects have data routinely gathered, analyzed, and used to initiate an action, providing a wealth of intelligence for planning, management, and decision-making. IoT is characterized as a network of physical objects. The web is not just a network of computers; it has advanced into a network of devices of all kinds and sizes: vehicles, smartphones, home appliances, toys, cameras, medical instruments and industrial systems, animals, people, buildings, all connected, all communicating and sharing data based on specified protocols so as to accomplish smart reorganization, positioning, tracing, safety and control, and even personal real-time online monitoring, online upgrade, process control and administration [19], as presented in Figure III.

For making our system efficient, we have added an automation system through the PLC. The storage tanks will have sensors and other instruments attached to a certain device in order to control and monitor various parameters. These sensors examine physical parameters and convert them into electrical signals to give input to the PLC. The PLC is the main governing party because it controls the sensors and other devices and provides data to the control room. The control room is a SCADA system that stores data coming from the PLC and other devices in its server. In current designs, a person must be within the premises of the office to switch the water supply on or off, but we are moving towards a PLC which will act like a person in supplying water. The quality parameter that is vital for healthy water is the pH value. Extreme pH numbers cause severe health issues, such as infections of the skin and eyes, and also damage cell membranes. So, in order to avoid these health issues and other issues like corrosion of water pipes and mains, a pH controlling and monitoring system has been installed in our automation system. Proper monitoring of the process is mandatory to obtain results at an optimal level. SCADA systems have been used in most industries for a long time. It is an efficient system because it shows the information on a real-time basis, which helps in sorting out problems and correcting them as identified. This SCADA system consists of a primary control center and field sites as per requirement. Point-to-point connections are used for control-center-to-field-site communication. All field sites are interconnected via networking for communication. This system mainly consists of the PLC, which is the central and most important part of the system. The SCADA system is designed to realize the automatic control of the valve and the transformation of parameters such as pipeline pressure and water quality [20]. The PLC is the heart of our automation system, so it provides all the logic functions, which will be developed through a ladder logic program used to command the PLC. Sensors and actuators will provide their values and observations to the PLC. Then, the PLC will monitor and control them based on logic through the ladder program. This logic will be uploaded to the PLC through PLC software and can be changed accordingly. The PLC is synced with the SCADA system, which results in monitoring and commanding of the water distribution network. In the water supply system, we have one storage tank with a level sensor for level monitoring and a pH sensor for water quality monitoring.
Other water supply system elements include pipelines for water flow and pressure switches for closing and opening the outlet valve and maintaining the volume in the tank. In the case of heavy, high-pH water, the amount of chlorine to be added into the tank is defined at the data acquisition center in ppm (parts per million) to purify the water. We used equations to track the ppm in the tank, the pressure (PSI), and the flow (GPM) in the pipe [8]. From the equations, we conclude that the tank's maximum volume, the solution's ppm, the elevation height of the reservoir, the pipe diameter from the reservoir, and the flow percentage open are the variables and must be specified at the start.
• The pumping and filtering processes work on six inputs to the input module, such as the start push button, the stop push button, and the chlorine tank lower- and higher-level measuring devices.
• An Allen Bradley PLC controls the process, and the Wonderware InTouch SCADA software tool is used for monitoring the process.
• After commanding from the PLC, the pumping and filtering process will output to the chlorine outlet valve and the reservoir solenoid valve. In our system design, we have added control of both motors from the PLC, which can be commanded as required, and the current status and history of the sensors can be seen remotely through the SCADA system.

A. Automation
Our whole system is based on automation in order to nullify human error and to design an efficient and modern system. Automation has been used to operate various control elements automatically for many years now. The foremost gain of an automatic system is that it saves resources, energy, and materials with no compromise, and yields even better quality, accuracy, and precision. The block diagram of the PLC system is presented in Figure IV.
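The tracking equations themselves are only referenced, not reproduced, in the paper; as a hedged stand-in, the static head and dose concentration can be computed from the listed variables using the standard water-column relation (about 0.433 PSI per foot) and the milligram-per-litre definition of ppm; all numeric values below are placeholders:

    #include <cstdio>

    int main() {
        // Variables the authors say must be specified at the start
        // (the numbers are placeholders, not from the paper).
        double tankVolumeL = 10000.0;  // tank's max volume, litres
        double doseMg      = 50000.0;  // chlorine added, milligrams
        double elevationFt = 60.0;     // elevation height of the reservoir

        // Concentration: 1 mg per litre of water is 1 ppm.
        double ppm = doseMg / tankVolumeL;   // 5 ppm
        // Static head: a water column exerts ~0.433 PSI per foot.
        double psi = 0.433 * elevationFt;    // ~26 PSI

        printf("dose: %.1f ppm, static head: %.1f PSI\n", ppm, psi);
        return 0;
    }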
D. Salt Level Testing System
The red color in the first tank indicates the salt testing. Salt must be removed from the water to obtain fresh water for humans, irrigation, and other purposes. Salt is produced as a by-product of the removal of salt from water in desalination. This removal of salt from water is termed desalination, and it yields a potential amount of by-products for different applications. Figure VII shows a salt level test simulation diagram. This can be considered an independent water source. The seas are immensely large, so desalinating them is a very costly process and raises some other big problems for the future. So, alternative methods are mostly used.

Figure VII. Salt Level Testing System

E. Ultra-Violet (UV) Testing System
The first tank in green color indicates the UV testing. UV is found to be more effective for killing harmful bacteria and viruses. Recent studies by different researchers have established that UV rays are very strong in dealing with bacteria, viruses, and other microorganisms. Test results are shown in the Figure VIII simulation diagram. The findings show that UV radiation is a good method for the treatment of water for drinking purposes. Some bacteria and viruses were found to survive during tests at high UV doses, but their amount is greatly reduced.

... it means the process is stopping and then reprocessing again.

Figure IX. pH Level Testing System

G. Distribution Section
In our project design, we have considered distribution a separate department because we want to have a better water supply for customers and to avoid any faulty conditions. We have seen that most problems in the water supply/distribution system are due to the occurrence of a pressure drop, or to a pump being used in a home to suck water directly from the main pipeline passing through the street. We have worked on these main issues and come up with a solution for the water distribution system. In our solution, we will have a control system run by the PLC, a transmission channel, and other elements with connecting pipes. The block diagram for our proposed design is shown in Figure X. It shows that all the processes of the water distribution system will be governed by the PLC, and for transmitting/receiving control by the PLC, a communication channel will be utilized.
In this section, by monitoring the water storage and distribution system, the pressure, and other parameters, we are able to detect any theft happening in streets or elsewhere. Every stage is monitored separately in order to get a precise analysis. If the distribution tank's higher-level sensor is ON, it means that water distribution is going on. Water can be distributed to all places at the same time or at different times, depending upon requirements and resources. If there is any problem under distribution control, then the valve will change automatically to manual control and the problem can be rectified. Figures XI and XII show the distribution valve simulation figures. The screen has a status bulb; when the distribution is turned ON, the bulb turns green.

Figure XI. Distribution of Valve 1

Figure XII. Distribution of Valve 2

III. ADVANTAGES, APPLICATIONS & LIMITATIONS

A. Advantages
Nowadays, the PLC has become a basic controller in industries. It is an integral component in industries because it has replaced hard wiring, providing an efficient way of controlling systems. Other benefits of the PLC are listed below:
1) Flexibility and Reliability
Due to developments in technology, one PLC is capable of controlling multiple systems, whereas in the past multiple systems required multiple controlling devices. These controllers are very reliable, and the chances of error are very low, as there is very little moving mechanism in the device.
2) Changes and easier error correction
If any system needs to be modified, only changes to the program are made. This adds extra efficiency to the circuit and minimizes time. If other devices such as relays were used, alteration of the circuit would be required, which reduces efficiency and increases time.
3) Less Power Consumption
A PLC consumes one tenth of the power of an equivalent relay control. The number of contacts per coil in a PLC is greater than the number of contacts found in a relay.
4) Operating Speed
A PLC operates faster than an equivalent device; the speed of a PLC is measured in milliseconds.
5) Reduced Space
A PLC is a very compact device because it is a solid-state device rather than an electromagnetic, hard-wired device in which electromechanical motion is required.
6) Ease of Maintenance
Troubleshooting is very easy because the PLC provides error diagnostics through a program and software. If an error is found, component replacement is also very easy.
7) Addition of Circuits
It is easy to add multiple circuits to a PLC to provide control. This can be done without much effort and saves the money that would otherwise be spent on other controllers.

B. Applications
The PLC is mostly preferred in certain automation projects. The best market for PLCs is industry. Industries have been using PLCs in their manufacturing processes for a long time. They are installed not only for manufacturing purposes but are also used for monitoring and automation tasks. A PLC requires I/O devices, which are then used with different industrial components according to their compatibility. In some cases, an external circuit design is required to connect those terminals physically, with some programming in ladder logic to make the complete circuit operational. PLCs are mostly found in high-end places where the package of a PLC is cheaper than the cost of custom-built controllers. At the low-end level, different automation techniques have been employed to save money, time and effort. The given project finds most of its applications in industries where there is excessive use of water with no proper maintenance. This water management system will assist in minimizing the useless dissipation of water, because water is a vital element on earth. This project also finds applications in WASA, chemical industries, dyeing industries, and wherever there is usage of any liquid for different processes or as a product. This is a one-time investment through which millions of gallons of water can be saved annually. The whole system can work remotely through SCADA, which can be operated from anywhere, and hence will create a safe environment of operation.

C. Limitations
• The PLC is a costly device.
• The project is suitable for industries or government institutes, whereas it will not be suitable for local and domestic use.
• Most areas are still operating on old controllers, so transforming those areas to PLC systems is a difficult task to achieve.

IV. Conclusion
We have worked on water treatment, water level monitoring, water distribution, and the controlling system through
automation. Our idea was to establish an economical, feasible, and automated water management system to avoid the wastage of water. We have worked on this project and proposed multi-resource water management through the PLC. The flexibility of this project is immense, as it can be controlled remotely, which is also a subsequent benefit for controlling water efficiently. The edge the PLC has over any other controller or computer is that it is not affected by environmental changes like cold, heat, moisture, dust, etc. Due to new advancements in technology, the PLC has also upgraded itself by including different controls, motion controls, networking capability, distributed control, and other respects. Remote handling of the PLC is provided through the SCADA system and IoT, which is a more modern means. These systems have improved the functionality, performance, and scope of the PLC. Due to the attachment of SCADA to the PLC, it has gained immense popularity in industries because of remote access.

V. Future Prospective
Technological trends across the globe are pushing companies and industries towards the next automation-based system, monitoring, and manufacturing era. This is all based on the subsequent work done in previous years to make industries work on automation. Many corporations have come to the forefront and built advanced simulation tools which are being used in many companies. Our project likewise has applications in any automation firm. Moreover, this project has scope in the dyeing, petrochemical, pharmaceutical and other industries. With the addition of SCADA, the project has gained huge popularity in remote areas. Remote operations are becoming preferable to simple automation, and industries are willing to convert automation tasks to remote ones which can be accessed from anywhere.

REFERENCES
[1] V. C. Gungor and F. C. Lambert, "Research on communication networks of electrical automation system," vol. 50, no. 7, pp. 877-897, 2006.
[2] M. Hadipour, F. J. Derakhshandeh, and A. Shiran, "An experimental setup of multi-intelligent control system (MICS) of water management using the Internet of Things (IoT)," ISA Transactions, vol. 96, pp. 309-326, 2020.
[3] A. A.-S. and A. E. A. S. Ali, "PLC water pumping system and frequency control," 2009.
[4] J. Aziz, "National Water Quality Strategy," Asian Development Bank, 2002.
[5] S. Rana, "Poor water management is more costly for some countries," Express Tribune, 2019.
[6] G. Murtaza and M. H. Zia, "Wastewater Production, Treatment and International Use," pp. 16-18, May 2012.
[7] A.-S. and A. Ali, "PLC water pumping system and frequency control," 2009.
[8] Silicon Review, "What is the simple definition of the Internet of Things?", 16 March 2016.
[9] M. Hadipour, F. J. Derakhshandeh, and A. Shiran, "An experimental setup of multi-intelligent control system (MICS) of water management using the Internet of Things (IoT)," ISA Transactions, vol. 96, pp. 309-326, 2020.
[10] R. Gonçalves, J. M. Soares, and R. M. F. Lima, "An IoT-Based Framework for Smart Water Supply Systems Management," Future Internet, vol. 12, no. 7, 2020.
[11] K. Ek and L. Persson, "Priorities and Preferences in Water Quality Management: a Case Study of the Alsterån River Basin," Water Resources Management, Springer, pp. 155-173, 2020.
[12] Swapnil Namekar, Patel Tayyab Jahngir, Shahid K. Hannure, Manasi Jagtap, and Pratiksha Zagade, "Water Level Controller," International Journal of Innovative Research in Technology (IJIRT), vol. 6, no. 11, April 2020.
[13] M. O. Arowolo, A. A. Adekunle, and M. O. Opeyemi, "Design and Implementation of a PLC Trainer Workstation," Advances in Science, Technology and Engineering Systems Journal, vol. 5, no. 4, pp. 755-761, 2020.
[14] A. Aguilar, M. Pérez, J. L. Camas, H. R. Hernández, and C. Ríos, "Efficient Design and Implementation of a Multivariate Takagi-Sugeno Fuzzy Controller on an FPGA," 2014 International Conference on Mechatronics, Electronics and Automotive Engineering, Cuernavaca, 2014, pp. 152-157, doi: 10.1109/ICMEAE.2014.
[15] S. V. and A. Vosough, "PLC and its applications," International Journal of Multidisciplinary Sciences and Engineering, vol. 2, 2011.
[16] "Scheduled Controls Explained (PLC)".
[17] A. Daneels and W. Salter, "What is SCADA?", International Conference on Accelerator and Large Experimental Physics Control Systems, Trieste, Italy, 1999.
[18] P. M. Adhao, Mahavir's, "Internet of Things (IoT): New Age," International Journal of Engineering Development and Research (IJEDR), vol. 5, no. 2, 2017.
[19] K. K. Patel and S. M. Patel, "Internet of Things-IOT: Definition, Characteristics, Architecture, Enabling Technologies, Application & Future Challenges," International Journal of Engineering Science and Computing, vol. 6, no. 5, May 2016.
[20] G. S. Ashok, "Water Anti-Theft and Quality Monitoring System Through PLC and SCADA," International Journal of Electrical and Electronics Engineering Research, pp. 355-364, 2013.
Development of a High Precision Temperature Monitoring
System for Industrial Cold Storage
Sarosh Ahmad, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sarosh786a@gmail.com
Arslan Dawood Butt, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, arslandawood@gcuf.edu.pk
Usama Umar, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, usama.rwp96@outlook.com
Abstract—Cold storages are widely used by a number of industries all over the world, mainly the food industry. Cold storage facilities play an important role in increasing the shelf life as well as retaining the quality of several raw and processed food items. But there are some problems in the old-style, manually controlled cold storage systems mostly in use, which need to be upgraded with modern technology to reduce potential losses. This research relates to the real-time temperature monitoring of cold storage: in order to maintain shelf life, proper monitoring of the temperature is required. In the case of the apple, if the temperature is maintained between 33.8° and 39.9°F, its shelf life will be 3-8 months; if the temperature increases further, the shelf life reduces drastically. In this work, we propose a highly precise and reliable remote temperature monitoring system to be used in cold storage units. This study constitutes the development of an efficient and effective real-time remote temperature monitoring system that will display the accurate and precise temperature on an Android app on a cell phone as well as on the LCD. The device was developed using a Pt-100 sensor, an LT3092, an INA-128p, an Arduino UNO, a Wi-Fi module, an LCD display and a PCB. It will provide high-efficiency monitoring from remote locations and will greatly help to minimize the temperature variations in cold storages. Its economical design with the latest features makes this device attractive for industrial use.

Keywords—Internet of Things (IoT), surface mount devices (SMD), temperature monitoring, cold storage

I. Introduction
Cold storage is an important part of the food industries, as stored food items are sensitive to temperature variations and require persistent monitoring. The lack of up-to-date technologies and ignorance about humidity and temperature effects on fruits result in food safety issues. Slight variations from the optimum temperature can cause great economic losses for industries [1]. The main objective of cold storage is to preserve fruits for a certain period of time. A "cold storage" is a storage place where various foods and vegetables are stored at cold temperatures for a few months or longer. This allows the food item to be available throughout the year. Every fruit or vegetable has its fixed range of temperature for storage, known as the optimal temperature range. The temperature of the cold storage should be kept within the optimal temperature range for proper storage of fruits and vegetables [2-4]. All fruits and vegetables have their specific shelf life. The "shelf life" of a fruit or vegetable is the time duration for which it can be stored without becoming inadequate for usage, sale, or consumption [5].

II. Literature Review
The temperature of a cold storage must be maintained according to the food stored in it. If the temperature of the cold storage is not kept in the optimal range, the shelf life of the stored items reduces significantly. Temperature and humidity, to be more specific, are monitored through IoT, which does not require the presence of any individual. Temperature variations were monitored precisely, and when the temperature was above or below the specified range, an alarm was activated [6-7]. At the same time, the actuator starts or stops to maintain the temperature in the storage room. Temperature changes seriously affect the quality of farm products. In order to solve this issue, the operator has to keep an eye on the present temperature of the cold storage, even when far away. Thus, a remote monitoring system is required by the operator to control the temperature automatically [8]. One study designed a remote monitoring system for the temperature. The introduced system is intended to help the operator's facilities and the management of farm products. The method adopted to overcome the problem is the use of a diode thermal sensor. The output of the controller is connected to a relay, so the control method is on-off control. The detected temperature is transferred to the data collection device using serial communication [9]. Various challenges were encountered in optimizing the control, due to coupling. To decrease the adverse effect of coupling and increase the performance of the refrigeration system of the cold storage, a control strategy with dynamic coupling compensation was considered. On the basis of the requirements of the control system, first the dynamic model of the cold storage was established, and then the coupling between the components was considered [10]. A fuzzy controller with dynamic coupling compensation was designed to address the challenge. A self-tuning fuzzy controller can serve as the primary controller; therefore, an adaptive neural network was adopted to compensate for the dynamic coupling. In the end, the control strategy was applied to the refrigeration system of the cold storage. The simulations were performed under start-up conditions by changing the load and the degree of superheat. The simulation results verified the efficiency of fuzzy control with dynamic coupling compensation [11].
Our research relates to the real-time temperature monitoring of cold storage. To maintain shelf life, proper monitoring of temperature is required. The system is designed keeping in mind the optimal storage temperature ranges for standard food items: for potato 35°-40°F, for garlic 30°-32°F, and for apple 30°-40°F. Most apple varieties are best stored at or near 32°F. As the optimal temperature range for maximum shelf life is only 2-3°F wide in most cases, our system aims to achieve much higher precision with a much longer distance between the temperature probe and the display device. The main features of the research are as follows:
• Efficient monitoring of temperature without any fluctuations in the cold room.
• Remote monitoring from anywhere with great precision.

III. Proposed Methodology
This research was executed in different stages, from planning up to the development of the hardware components and the device. Different methods have been used to monitor and control the temperature of cold storage. However, more efficient, remote systems are the need of the time, to save industries from great losses and to provide fresh, quality products to the end customers. For most products, the optimum storage temperature range is so small that it covers only 2-3 degrees Fahrenheit. A temperature difference of 2 to 3 degrees Fahrenheit requires very precise measurement, and manual monitoring to this extent is quite difficult for anyone. A cold storage is a big storage room in which the temperature can differ from place to place: at the entrance the temperature is high compared to the area near the refrigerator. Therefore, the device requires a large number of Pt-100 sensors at different places to measure the average temperature of the room. A sensor that is placed far from the device has a lengthy wire, and the wire itself has a resistance that is a prominent cause of error in temperature measurement: the resistance the device takes into account in this case is not the resistance of the sensor alone, but also includes the resistance of the wire. Therefore, the resistance of the wire is a major problem for Pt-100 sensors. We have to design a circuit that cancels out the resistance of the wire.

A. Designed Circuit with INA 128
The resistance of the wire is the cause of error in the temperature measurements. To remove this error, the circuit must be designed using the following components: the Pt-100, the wire resistances, the resistance Ro, and two current sources. Two wires are attached to the negative terminal of the Pt-100 sensor, one of which contains a resistor. A circuit with two loops that forms a Wheatstone bridge is designed. R1 is the resistance of the Pt-100; this resistance varies with temperature, from 100 ohm to 101.74 ohm over 32 to 40 degrees Fahrenheit. The resistances R2, R3 and R4 are the resistances of the wire. Ro is a reference resistance, set to 99 ohms. The circuit contains two current sources of equal value, both set to 10 mA. It also contains an instrumentation amplifier to apply gain to the output voltage; the gain of the INA128 is set to 85 by setting the gain resistance to 595 ohms. The designed circuit diagram with the INA 128 is shown in Figure I.

Figure I. Designed circuit connected with INA 128

B. Equation of the circuit
By applying Kirchhoff's Voltage Law (KVL) on the first and the second loop, we have:

Applying KVL on the 1st loop:
V+ = I*R1 + I*R2 + I*R4    (1)
V+ = I(R1 + R2 + R4)    (2)

Applying KVL on the 2nd loop:
V- = I*R3 + I*R5 + I*R4    (3)
V- = I(R3 + R5 + R4)    (4)

The differential voltage then follows:
V = (V+) - (V-)    (5)
V = I(R1 + R2 + R4) - I(R3 + R5 + R4)    (6)

As the lengths of the wires for a single sensor are the same, we can write R = R2 = R3 = R4:
V = I(R1 + R + R) - I(R + R5 + R)    (7)
V = I(R1 - R5)    (8)

This is the final equation of our design. In this equation, the resistance Ro (Ro = R5) is subtracted from the resistance obtained from the Pt-100. The voltages V+ and V- are applied to the instrumentation amplifier (INA-128p): V+ is attached to pin 3 and V- to pin 2 of the INA-128p.

C. Simulation Results
The circuit was designed in the Proteus software to obtain simulated results. The simulation results at the output pin of the INA-128 are shown in Table I, listing the temperature values with the corresponding resistance of the Pt-100, the voltage difference at the input pins of the INA-128, and the voltage at the output pin of the INA-128.
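Equation (8) can be checked numerically against the simulation: with I = 10 mA, R5 = Ro = 99 ohm and a gain of 85, the following sketch (an added verification, not part of the original paper) reproduces the output voltages of Table I below; the 36°F row of the source table differs by roughly 0.2 mV, presumably a rounding artifact:

    #include <cstdio>

    int main() {
        const double I    = 0.010;   // both current sources, 10 mA
        const double Ro   = 99.0;    // reference resistance R5, ohms
        const double gain = 85.0;    // INA-128 gain (Rg = 595 ohm)
        // Pt-100 resistances from Table I, 32 F ... 40 F
        const double r1[] = {100.00, 100.22, 100.43, 100.65, 100.89,
                             101.09, 101.30, 101.52, 101.74};
        for (int t = 0; t < 9; ++t) {
            double v = I * (r1[t] - Ro);           // Eq. (8): V = I(R1 - R5)
            printf("%d F: %5.1f mV -> %.2f V\n",
                   32 + t, v * 1000.0, gain * v);  // e.g. 32 F: 10.0 mV -> 0.85 V
        }
        return 0;
    }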
Table I. Simulated results of the INA-128

Temperature (°F)   Resistance of PT100 (ohm)   Input voltage to INA128 (mV)   Output voltage (V)
32                 100.00                      10                             0.85
33                 100.22                      12.2                           1.03
34                 100.43                      14.3                           1.21
35                 100.65                      16.5                           1.40
36                 100.89                      18.7                           1.59
37                 101.09                      20.9                           1.77
38                 101.30                      23                             1.95
39                 101.52                      25.2                           2.14
40                 101.74                      27.4                           2.33

The simulated results show that the voltage at the output pin of the INA-128 ranges from 0.85 to 2.33 V, which must lie within the range of 0-5 V. This is achieved by setting the gain equal to 85.

D. Arduino Interfacing with the Proposed Design Circuit
The Arduino is programmed in the Arduino IDE software. Once the program is installed on the computer using the Arduino IDE, a USB cable is used to link the board with the computer. We then opened the Arduino IDE and chose the correct board by selecting Tools > Boards > Arduino/Genuino Uno, and chose the correct port by selecting Tools > Port. The Arduino Uno is programmed using the Arduino programming language, which is based on Wiring. We wrote the program for the research and, to get started with the Arduino Uno board and hardware, loaded the written program. When the code is loaded into the IDE, we clicked the 'upload' button on the top bar. Once the upload was finished, we observed the Arduino's built-in LED blinking. This is the design so far with the Arduino in it. The output pin of the INA has been connected to A0, and the Arduino is powered by a 5 V DC source, as shown in Figure II.

Figure III. Arduino connected with LCD

The complete circuit containing the Arduino and the LCD is shown in Figure IV.

Figure IV. Complete circuit connected with Arduino and LCD

When the resistance of the Pt-100 is set to 100 ohm, the temperature displayed on the LCD is 32, as can be seen in Figure V.
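The Arduino listing itself did not survive extraction; a minimal Arduino-style sketch consistent with the described chain (INA output on A0, gain 85, 10 mA excitation, Ro = 99 ohm, and a linear fit interpolated from Table I) might look like the following, where the LCD pin wiring is an assumption:

    // Hypothetical reconstruction -- pin choices and the linear fit are
    // interpolated from Table I; the authors' original listing is not shown.
    #include <LiquidCrystal.h>

    LiquidCrystal lcd(12, 11, 5, 4, 3, 2);  // RS, EN, D4..D7 (assumed wiring)

    const float GAIN = 85.0;   // INA-128 gain
    const float RO   = 99.0;   // reference resistance, ohms
    const float I_MA = 10.0;   // excitation current, mA

    void setup() {
      lcd.begin(16, 2);
    }

    void loop() {
      float vOut = analogRead(A0) * 5.0 / 1023.0;  // INA output, volts
      float mV   = vOut / GAIN * 1000.0;           // bridge voltage, mV
      float r1   = mV / I_MA + RO;                 // Pt-100 resistance, ohms
      // Linear fit from Table I: 100.00 ohm -> 32 F, 101.74 ohm -> 40 F
      float tempF = 32.0 + (r1 - 100.0) * (8.0 / 1.74);
      lcd.setCursor(0, 0);
      lcd.print("Temp: ");
      lcd.print(tempF, 1);
      lcd.print(" F");
      delay(1000);
    }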
When the resistance of the Pt-100 is set to 101.74 ohm, the temperature displayed on the LCD is 40, as shown in Figure VII.

Figure VIII. Initial PCB prototype

B. Prototype using Breadboard
After finding the faults in the old PCB, we started working on a breadboard because, due to the lockdown, we were not able to order a new PCB. The breadboard circuit is shown in Figure IX, and the final prototype for commercialization is presented in Figure X.

References
[1] Karim, A. B., Hasan, M. Z., Akanda, M., and Mallik, A., "Monitoring food storage humidity and temperature data using IoT," MOJ Food Processing & Technology, vol. 6, pp. 400-404, 2018.
[2] Ting, L., and Zeliang, L., "Temperature Control System of Cold Storage," International Conference on Electromechanical Control Technology and Transportation, 2015.
[3] V. C. Chandanashree, U. Prasanna Bhat, Prasad Kanade, K. M. Arjun, J. Gagandeep, and Rajeshwari M. Hegde, "TinyOS based WSN design for monitoring of cold storage warehouses using internet of things," International Conference on Microelectronic Devices, Circuits and Systems (ICMDCS), pp. 1-6, 2017.
[4] Ma, X., and Mao, R., "Fuzzy Control of Cold Storage Refrigeration System with Dynamic Coupling Compensation," Journal of Control Science and Engineering, pp. 1-7, 2018.
[5] Xu Xiaofeng and Zhang Xuelai, "Simulation and experimental investigation of a multi-temperature insulation box with phase change materials for cold storage," Journal of Food Engineering, vol. 292, p. 110286, 2021.
[6] Hamid Ikram, Adeel Javed, Mariam Mehmood, Musannif Shah, Majid Ali, and Adeel Waqas, "Techno-economic evaluation of a solar PV integrated refrigeration system for a cold storage facility," Sustainable Energy Technologies and Assessments, vol. 44, p. 101063, 2021.
[7] Torres-Sánchez, R., Martínez-Zafra, M. T., Castillejo, N., Guillamón-Frutos, A., and Artés-Hernández, F., "Real-Time Monitoring System for Shelf Life Estimation of Fruit and Vegetables," Sensors, vol. 20, p. 1860, 2020.
[8] R. Mishra, S. K. Chaulya, G. M. Prasad, S. K. Mandal, and G. Banerjee, "Design of a low cost, smart and stand-alone PV cold storage system using a domestic split air conditioner," Journal of Stored Products Research, vol. 89, p. 101720, 2020.
[9] H. Feng, W. Wang, B. Chen, and X. Zhang, "Evaluation on Frozen Shellfish Quality by Blockchain Based Multi-Sensors Monitoring and SVM Algorithm During Cold Storage," IEEE Access, vol. 8, pp. 54361-54370, 2020.
[10] Hina Afreen and Imran Sarwar Bajwa, "An IoT-Based Real-Time Intelligent Monitoring and Notification System of Cold Storage," IEEE Access, vol. 9, pp. 38236-38253, 2021.
[11] Yadav, Ravindra, "Remote Monitoring System for Cold Storage Warehouse using IOT," International Journal for Research in Applied Science and Engineering Technology, vol. 8, pp. 2810-2814, 2020.
Modeling and Load Flow Analysis of Electric Vehicle
Charging Stations in Power Distribution Systems
Mustafa NURMUHAMMED, Department of Electric and Energy, Malatya OIZ Vocational School, Inonu University, Malatya, Turkey, mustafa.nurmuhammed@inonu.edu.tr
Ozan AKDAĞ, Turkish Electricity Transmission, Malatya, Turkey, ozan.akdag@live.com
Teoman KARADAĞ, Department of Electric and Electronics Engineering, Inonu University, Malatya, Turkey, teoman.karadag@inonu.edu.tr
Abstract—As electric vehicles become part of our lives all over the world, charging them in an efficient way gains importance, since the energy they draw from the distribution network has increased dramatically. Unplanned and uncontrolled charging could cause problems such as transformer overloading, voltage imbalances and power outages in the power distribution network. This paper proposes a simulation of charging electric vehicles, comparing power distribution parameters under no load, full load, and the charging of the maximum number of vehicles that the system supports. The integration of electric vehicle charging stations into the distribution network is analyzed using an 11-bus test system. Modeling and load flow analysis are performed and a new test system is proposed. Next, the effects of controlled and uncontrolled charging in the 11-bus test system modeled in this study are discussed.

Keywords—Electric Vehicle Charging Systems, Plug-In Electric Vehicles, Distribution Network, Power Analysis, Modeling and Load Flow Analysis

I. Introduction
Electric vehicles are coming to dominate the transportation industry. According to one forecast, electric vehicles will account for more than 25% of all vehicles within the coming ten years [1]. In general, electric vehicles provide better acceleration, more economy per kilometer driven, lower maintenance costs, environmental benefits such as the ability to acquire energy from renewable resources, and other direct or indirect advantages. These advantages have led to a remarkable rise in electric vehicle awareness and sales around the world. Electric vehicle numbers expanded at an average yearly rate of 60% over the 2014-19 period, totaling 7.2 million cars in 2019 [2]. Everyday use is increasing faster than ever.

Electric vehicles can be charged via a relatively slow home AC (Alternating Current) charger, AC charging stations, or fast DC (Direct Current) charging stations. Most home chargers charge at a rate under 3 kW. AC charging stations provide faster charging speeds, but the onboard chargers of most electric vehicles are designed to charge at under 20 kW. DC charging rates fluctuate the most among new vehicles: they commonly start at 50 kW and can go up to 250 kW. The overall charging rate of any kind of charging is limited to the lower of two rates: the maximum rate that the electric vehicle supports and the maximum rate that the charging station supports.

Limited range is becoming less of a concern for electric vehicle owners when planning long-distance journeys, as more ultra-fast DC chargers are installed at shorter intervals at strategic locations and on highways. Nowadays, 250 kW and 350 kW ultra-fast DC chargers are very common in countries where electric vehicles have been adopted at large scale. On the other hand, most electric vehicles have yet to reach those high-speed charging rates due to design constraints of their internal power electronics.

As the number of electric vehicles, and of cars that support high-speed charging, increases, a considerable amount of load will be experienced by the power distribution system. This load is considered a controlled or uncontrolled load depending on operability. Controlled load is a process of charging or discharging within certain limits or a plan. Uncontrolled load is charging or discharging regardless of any predefined plan, preparation or agreement. The charging of electric vehicles should be planned and rolled out in a controlled manner so that power distribution system components are not overloaded, power quality is not compromised, and the system is free of power outages. In addition, voltage and frequency deviation, harmonics and three-phase voltage unbalance are some of the parameters that affect the quality of the energy in the power distribution network.

This topic has been researched in a number of studies. One of them utilizes probabilistic analyses in the power grid to demonstrate charging effects on the system [3]. Another observes the effect of charging station loads on reliability indices [4]. In another study, the impact of charging electric vehicles on the distribution system is studied using a MATLAB/Simulink simulation [5]. In addition, a study group published a report demonstrating the specifics of electric vehicle charging effects with probable scenarios and foresights [6]. One research effort examined the line status and transformer usage, and models were suggested [7].

There are research studies on the reliability of the distribution system [8], the power quality of the high-voltage grid [9][10], voltage deviation [11][12], and voltage unbalance [13]. In another study, power losses in charging and discharging processes are investigated [14]. One research effort examines mitigating instantaneous load increases [15]. Others propose solutions that might reduce the effect of charging by scheduling charge sessions and using charging algorithms [10], [16]. One study proposes a smart charging method that optimizes the chargeable power via short-term load forecasting [17]. In addition, investing in the power distribution system is always a choice; however, the associated cost is high, and one research study shows that the top twenty percent of load-serving capacity is efficiently utilized in less than five percent of the hours of the load duration curve and serves less than one percent of the electricity demand in the system [18]. Figure I shows the reserve capacity, the rarely used peaking capacity and the underutilized baseload capacity. Therefore, investing solely in power network hardware may not be the most efficient solution to protect the power distribution network when many electric vehicles charge simultaneously.
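The lower-of-two-limits rule for the charging rate described earlier in this introduction is easy to state in code; a trivial sketch follows, where the 22 kW station and 120 kW vehicle figures are illustrative assumptions and the remaining numbers come from the text:

    #include <algorithm>
    #include <cstdio>

    // Effective charging rate is the lower of vehicle and station limits.
    double effectiveKw(double vehicleMaxKw, double stationMaxKw) {
        return std::min(vehicleMaxKw, stationMaxKw);
    }

    int main() {
        printf("home AC: %.0f kW\n", effectiveKw(20.0, 3.0));    // 3, home charger caps
        printf("AC fast: %.0f kW\n", effectiveKw(20.0, 22.0));   // 20, onboard charger caps
        printf("DC fast: %.0f kW\n", effectiveKw(120.0, 250.0)); // 120, vehicle caps
        return 0;
    }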
Figure I. Utility load duration graph [18]

Besides the mentioned studies, there are many research studies on the impact of electric vehicles on distribution networks [17][19][11][20][21].

... buses are 400 V. The line and load data of this distribution system are given in Tables I and II, respectively.

Table I. 11-bus distribution system line data

Bus no     Name      R (ohm)   X (ohm)   B (µS)
Bus 2-3    Line 1    0.1051    0.087     373.8
Bus 2-3    Line 2    0.1051    0.087     373.8
Bus 2-4    Line 3    0.052     0.043     186.9
Bus 2-4    Line 4    0.052     0.043     186.9
Bus 3-5    Line 5    0.1051    0.087     373.8
Bus 3-5    Line 6    0.1051    0.087     373.8
Bus 3-6    Line 7    0.063     0.052     224.2
Bus 3-6    Line 8    0.063     0.052     224.2
Bus 3-10   Line 9    0.026     0.021     93.4
Bus 7-10   Line 10   0.1051    0.087     373.8
Bus 7-10   Line 11   0.1051    0.087     373.8
Bus 7-9    Line 12   0.031     0.026     112.1
Bus 7-9    Line 13   0.031     0.026     112.1

Figure II. Single line diagram of the sample 11-bus distribution system
Figure III. Parameter interface of the load model

Table II shows the 11-bus distribution system load data.

Table II. Load data of the 11-bus distribution system

Load No   P (MW)   Q (MVAR)
1         0.245    0.001
2         0.265    0.0021
3         0.245    0.001
4         0.245    0.001
5         0.265    0.0021
6         0.245    0.001
7         0.245    0.001
8         0.265    0.0021
9         0.245    0.001
10        1.02     0.0021
11        0.18     0.0021

Table V. Total status (uncontrolled charging)

              MW     MVAR
Generation    6.32   1.01
Load          5.46   0.02
Grid losses   0.86   0.98

When a total of 80 charging stations were commissioned, the first three of the distribution transformers were overloaded. Likewise, Lines 3 and 4, Lines 7, 8 and 9, and Lines 12 and 13 were also overloaded. When the other transformer and line loads are examined, it is seen that some of them are at the limit and some are below the overload limit. This may cause damage to the equipment in the power grid and unnecessary power outages. In power systems, lines and transformers can be loaded to a maximum of 100-110%. Considering the general condition of the power system, controlled charging can be kept within the limits that the power system will withstand. The results of the load flow analysis for the controlled charging case are shown in Tables VII-VIII.
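As a rough plausibility check, the Table V load total can be approximated by adding an assumed per-station draw to the Table II base loads; about 25 kW per station makes the 80-station case line up with the reported 5.46 MW. The per-station rating is inferred here, not stated in the paper:

    #include <cstdio>

    int main() {
        // Base feeder loads from Table II (MW).
        const double p[] = {0.245, 0.265, 0.245, 0.245, 0.265, 0.245,
                            0.245, 0.265, 0.245, 1.020, 0.180};
        double base = 0.0;
        for (double x : p) base += x;          // ~3.465 MW

        // Assumed per-station draw (inferred so the totals match Table V).
        const double perStationMw = 0.025;
        int n = 80;
        printf("base %.3f MW + %d stations -> %.2f MW\n",
               base, n, base + n * perStationMw);   // ~5.47 MW vs. 5.46 reported
        return 0;
    }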
Charge3 2 [4] D. Güneş, İ. G. Tekdemir, M. Ş. Karaarslan, and B.
Alboyacı, “Elektrikli araç şarj istasyonu yüklerinin
Charge5 5
güvenilirlik indisleri üzerine etkilerinin incelenmesi,” J.
Charge6 5 Fac. Eng. Archit. Gazi Univ., 2018, doi:
https://doi.or./10.17341/gazimmfd.416408.
Charge7 10
[5] B. Yagcitekin, M. Uzunoglu, and A. Karakas, “Elektrikli
Charge8 2 Araçların Şarjı ve Dağıtım Sistemi Üzerine Etkileri,” pp.
316–320.
III. Conclusion

In this study, the primary goal was to observe the effects of charging electric vehicles on the power distribution system. In order to show the effects of electric vehicle charging stations, an 11-bus distribution system was modeled using DigSilent software. The simulation output helps manage the integration of electric vehicles, which are spreading rapidly in various countries, into the existing power distribution networks. The differences between electric vehicles being charged under controlled and uncontrolled conditions have been studied. In addition, the adverse impacts of uncontrolled charging are shown and discussed. Furthermore, according to the simulation results obtained from uncontrolled charging, the number of electric vehicles that can be safely charged under a controlled condition can be determined.

This simulation study can help reduce the effects of electric vehicle charging stations by projecting their possible impact based on a predefined load scenario. In addition, a relatively small 11-bus distribution system is tested and presented with data. Simulation studies allow analyzing the system without taking the risk of applying changes to real power distribution systems.

The possible impact of integrating electric vehicle charging stations that simultaneously charge 80 vehicles into an 11-bus distribution network is shown in detail. According to the simulation results, the number of charging stations that can be safely integrated into the distribution system is calculated to be 34. Therefore, a power distribution network can be protected from technical problems and lack of infrastructure by estimating the maximum number of cars to be allowed at the charging stations or by limiting the energy that can be drawn from the system.

In the near future, it is expected that more electric-powered vehicles, especially electric fleet vehicles, will charge simultaneously through the power distribution network. Modeling and simulation studies on this research topic will play a vital role in planning and designing power distribution grids. Future work can address mitigating the possible effects by scheduling electric vehicles at charging stations, distributing power usage, or other means of keeping the power distribution system safe and stable.
References

[1] G. Giordano, "Electric vehicles," Manuf. Eng., vol. 161, no. 3, pp. 50-58, 2018. [Online]. Available: https://www2.deloitte.com/uk/en/insights/focus/future-of-mobility/electric-vehicle-trends-2030.html
[2] R. Irle, "Global BEV & PHEV Sales for 2019," EV-volumes.com, 2020. https://www.ev-volumes.com/country/total-world-plug-in-vehicle-volumes/ (accessed Jul. 07, 2020).
[3] I. G. Tekdemir, B. Alboyaci, D. Gunes, and M. Sengul, "A probabilistic approach for evaluation of electric vehicles' effects on distribution systems," 2017 4th Int. Conf. Electr. Electron. Eng. ICEEE 2017, pp. 143-147, 2017, doi: 10.1109/ICEEE2.2017.7935809.
[4] D. Güneş, İ. G. Tekdemir, M. Ş. Karaarslan, and B. Alboyacı, "Elektrikli araç şarj istasyonu yüklerinin güvenilirlik indisleri üzerine etkilerinin incelenmesi," J. Fac. Eng. Archit. Gazi Univ., 2018, doi: 10.17341/gazimmfd.416408.
[5] B. Yagcitekin, M. Uzunoglu, and A. Karakas, "Elektrikli Araçların Şarjı ve Dağıtım Sistemi Üzerine Etkileri," pp. 316-320.
[6] D. Saygın, O. Tör, S. Teimourzadeh, M. Koç, J. Hildermeier, and C. Kolokathis, Türkiye Ulaştırma Sektörünün Dönüşümü: Elektrikli Araçların Türkiye Dağıtım Şebekesine Etkileri. 2019.
[7] M. Kiliçarslan Ouach and E. Çam, "Investigation on the electrical vehicles effects on the electrical power grid," El-Cezeri J. Sci. Eng., vol. 8, no. 1, pp. 21-35, 2021, doi: 10.31202/ecjse.753493.
[8] H. R. Galiveeti, A. K. Goswami, and N. B. Dev Choudhury, "Impact of plug-in electric vehicles and distributed generation on reliability of distribution systems," Eng. Sci. Technol. an Int. J., vol. 21, no. 1, pp. 50-59, 2018, doi: 10.1016/j.jestch.2018.01.005.
[9] L. S. Zhao and H. M. Yuan, "The impact of quick charge on power quality of high-voltage grid," IOP Conf. Ser. Mater. Sci. Eng., vol. 366, no. 1, 2018, doi: 10.1088/1757-899X/366/1/012033.
[10] M. Singh, I. Kar, and P. Kumar, "Influence of EV on grid power quality and optimizing the charging schedule to mitigate voltage imbalance and reduce power loss," Proc. EPE-PEMC 2010 - 14th Int. Power Electron. Motion Control Conf., pp. 196-203, 2010, doi: 10.1109/EPEPEMC.2010.5606657.
[11] K. Clement-Nyns, "Impact of plug-in hybrid electric vehicles on electricity systems," 2010.
[12] G. Ma, L. Jiang, Y. Chen, C. Dai, and R. Ju, "Study on the impact of electric vehicle charging load on nodal voltage deviation," Arch. Electr. Eng., vol. 66, no. 3, pp. 495-505, 2017, doi: 10.1515/aee-2017-0037.
[13] S. Panich and J. G. Singh, "Impact of plug-in electric vehicles on voltage unbalance in distribution systems," Int. J. Eng. Sci. Technol., vol. 7, no. 3, p. 76, 2016, doi: 10.4314/ijest.v7i3.10s.
[14] E. Apostolaki-Iosifidou, P. Codani, and W. Kempton, "Measurement of power loss during electric vehicle charging and discharging," Energy, vol. 127, pp. 730-742, 2017, doi: 10.1016/j.energy.2017.03.015.
[15] T. Dragičević, S. Sučić, J. C. Vasquez, and J. M. Guerrero, "Flywheel-based distributed bus signalling strategy for the public fast charging station," IEEE Trans. Smart Grid, vol. 5, no. 6, pp. 2825-2835, 2014, doi: 10.1109/TSG.2014.2325963.
[16] J. De Hoog et al., "Electric vehicle charging and grid constraints: Comparing distributed and centralized approaches," IEEE Power Energy Soc. Gen. Meet., 2013, doi: 10.1109/PESMG.2013.6672222.
[17] M. R. Poursistani, M. Abedi, N. Hajilu, and G. B. Gharehpetian, "Impacts of plug-in electric vehicles smart charging on distribution networks," 2014 Int. Congr. Technol. Commun. Knowledge, ICTCK 2014, pp. 1-5, 2015, doi: 10.1109/ICTCK.2014.7033499.
[18] P. Denholm and W. Short, "An Evaluation of Utility System Impacts and Benefits of Optimally Dispatched Plug-In Hybrid Electric Vehicles," NREL Rep. No. TP-620-40293, Oct. 2006, p. 41. [Online]. Available: http://www.nrel.gov/docs/fy07osti/40293.pdf
[19] R. C. Green, L. Wang, and M. Alam, "The impact of plug-in hybrid electric vehicles on distribution networks: A review and outlook," Renew. Sustain. Energy Rev., vol. 15, no. 1, pp. 544-553, 2011, doi: 10.1016/j.rser.2010.08.015.
[20] K. Clement-Nyns, E. Haesen, and J. Driesen, "Analysis of the impact of plug-in hybrid electric vehicles on residential distribution grids by using quadratic and dynamic programming," World Electr. Veh. J., vol. 3, no. 2, pp. 214-224, 2009, doi: 10.3390/wevj3020214.
[21] J. Balcells and J. García, "Impact of plug-in electric vehicles on the supply grid," 2010 IEEE Veh. Power Propuls. Conf. VPPC 2010, pp. 5-9, 2010, doi: 10.1109/VPPC.2010.5729217.
[22] "PowerFactory." [Online]. Available: https://www.digsilent.de/en/downloads.html
[23] O. Akdag, F. Okumus, A. F. Kocamaz, and C. Yeroglu, "Fractional Order Darwinian PSO with Constraint Threshold for Load Flow Optimization of Energy Transmission System," Gazi Univ. J. Sci., vol. 31, no. 3, pp. 831-844, 2018.
Obstacle Avoiding Capabilities for The Drone by Area
Segmentation and Artificial Neural Network
Mohammed Majid Abdulrazzaq, Department of Computer Engineering, Karabuk University, moh.abdulrazzaq9@gmail.com
Mustafa Mohammed Alhassow, Department of Electrical and Computer Engineering, Altinbas University, Mustafa.alshakhe@gmail.com
Abdullah Ahmed Al-dulaimi, Department of Electrical Electronics Engineering, Karabuk University, Abdalluhahmed1993@gmail.com
Abstract— Obstacle avoidance in unmanned aerial vehicles is an important task to ensure the mobility and safety of the vehicle. It has attracted much attention in recent years, and today most drones are controlled remotely using some wireless technology, such as radio frequency through a remote control, a mobile phone or a tablet, making the drone always depend on a user who gives instructions on what to do and who acts accordingly. The biggest challenge is the development of autonomous air agents that complete missions without any human intervention. In this paper we propose an area segmentation approach that segments the area into smaller areas and classifies those areas as safe or unsafe, which allows the drone to pass safely through the safe areas and avoid the unsafe or obstacle areas. In comparison with other related works, our results show better performance with respect to time, shape, and path length.

Keywords— UAV, obstacle avoidance, segmentation, wireless, ANN, classification

I. Introduction

Automated robots as well as vehicles have already been used to complete missions in risky conditions, for example, tasks in thermal power stations, the investigation of Mars, and the observation of opponent forces in the war zone. Apart from these uses and implementations, there is the advancement of higher-intelligence automated aerial vehicles, shortly termed UAVs, for upcoming or future battle in order to decrease manual setbacks. One of the primary difficulties for intelligent unmanned aerial vehicle advancement is path planning in adversarial conditions. Path planning has been studied extensively in the field of robotics. In these applications, planning a path means discovering a collision-free way in an environment where obstacles, either static or dynamic, occur. Early research focused on both holonomic and non-holonomic systems in kinematic movement problems without considering system dynamics, together with static obstacles. Despite numerous outward contrasts, the majority of those strategies depend on a few conventional methodologies, including roadmaps, cell decomposition, and potential fields.

When moving obstacles are associated with the planning problem, the time dimension is added to the configuration space [1], and the problem is named motion planning or trajectory planning rather than path planning. Research has recently addressed motion planning that considers dynamic requirements, termed kinodynamic planning [3]. All of the previously mentioned path or motion planning strategies center around obstacle avoidance issues [2]. Zong et al. [4] proposed an obstacle avoidance scheme for space robots based on mixed-integer predictive control; Zhao et al. [5] proposed a cooperative scheme with transfer learning in a flocking swarm of UAVs for obstacle avoidance. Trajectory planning basically refers to the planning of an ideal flight path of the airplane or aircraft between the starting point and the end point, considering factors such as fuel utilization, mobility, arrival time, flight region, and danger level. Trajectory planning is considered a significant assurance for the successful completion of a UAV mission and one of the critical steps in mission planning frameworks. Because of technical restrictions, trajectory planning has depended heavily on manual work by experts. With the continuous development and improvement of avoidance and control frameworks and technology, the precision requirements of UAV trajectory planning are becoming increasingly demanding, and manual path planning has become increasingly unable to meet the requirements.

Zhang et al. [6] proposed trajectory tracking for mobile robots in order to avoid static and dynamic obstacles in the robot's path; Padhy et al. [12] proposed a feature extraction scheme using the front-view camera of the UAV in order to detect and avoid obstacles. Mendoza-Soto et al. [13] proposed an obstacle prediction scheme to predict the obstacles in the UAV's trajectory for efficient obstacle avoidance. With the quick advancement of communication technology, various techniques for acquiring flight environment data have emerged continuously, making the data available to trajectory planners increasingly plentiful. To improve flight precision, the planned trajectory should fulfill the requirements of terrain following, terrain avoidance, and threat avoidance while satisfying the performance limitations of the aircraft.

The idea of fully automated systems is the absence of a human factor and thus the prevention of human losses in the event of accidents, or much faster decision-making. Despite the fact that almost fully automated systems already exist and their functionality is very accurate, it will take some time before there will be fully automated and trusted robots or vehicles.

In this paper, we train a deep learning algorithm to classify area segments into safe and unsafe areas, also creating a safe path that leads the drone from the start point to its goal. The rest of the paper is organized as follows: Section 2 explains our proposed scheme, Section 3 shows the simulation and results for our model, and Section 4 concludes our work.
II. Proposed Scheme and Properties

To get a drone to make decisions, first of all, it must be considered that the first problem occurs when it does not have knowledge of the environment surrounding it, implying that the air vehicle does not process any information in this regard. All the information generated by the sensors it carries is sent to the control station or application for interpretation. Depending on this information, the environment is classified in order to create the safe path, as shown in the figure below:

Figure I. Proposed obstacle avoidance scheme.

There is no decision support system independent of the ground control station that would make the drone adopt a proactive control model. To make this possible, visual perception techniques have to be integrated into the control of UAVs in order to increase their navigation and direction skills. We do not simulate these perceptions; we merely assume that in our simulation the sensing device connected to the drone makes it possible for the drone to be fully aware of its surroundings.

A. K-Means Algorithm

K-means clustering is one of the simplest algorithm variations: easy to implement and useful for the segmentation of simple images. When it comes to image segmentation, the underlying concept is image vectorization: the image is first color-quantized, reducing the available color palette to a finite number of colors. During the partitioning, the image becomes multiple sections, which makes it easier to apply various image processing algorithms. The next task of our image segmentation is to locate objects within the original image, as well as to find the boundaries that define the areas of the located objects. The focus of this particular technique is to locate pixels with the same hue parameter value. Since these pixels are grouped into multiple labeled clusters, they are commonly referred to as segments.

The Euclidean distance formula has a somewhat more general representation as follows:

L^2 = \sum_{i=1}^{n} (x_i - y_i)^2    (1)

where L is the Euclidean distance, x and y are the two vectors being considered, and n is the number of dimensions (i.e., vector components; here n = 2). In this instance, only two- or three-dimensional vectors are involved, so the formula simplifies. This variation of the Euclidean distance formula is used to calculate the distance between two pixels with specific coordinates.
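To make Eq. (1) concrete, the following NumPy sketch runs a plain K-means loop over 2-D points (e.g., obstacle coordinates), assigning each point to its nearest centroid by the squared Euclidean distance. It is a generic illustration under our own assumptions, not the authors' implementation:

    import numpy as np

    def kmeans(points, k, iters=50, seed=0):
        # Plain K-means using the squared Euclidean distance of Eq. (1).
        rng = np.random.default_rng(seed)
        centroids = points[rng.choice(len(points), size=k,
                                      replace=False)].astype(float)
        for _ in range(iters):
            # d2[p, c] = sum_i (x_i - y_i)^2 for point p and centroid c
            d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for c in range(k):
                if np.any(labels == c):
                    centroids[c] = points[labels == c].mean(axis=0)
        return labels, centroids

    # Toy example: cluster 2-D coordinates around three hypothetical obstacles.
    rng = np.random.default_rng(1)
    pts = np.vstack([rng.normal(loc=m, size=(40, 2))
                     for m in ((0, 0), (8, 8), (0, 9))])
    labels, cents = kmeans(pts, k=3)
    print(cents.round(2))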
Figure VI. 3D environment set up.

Then we segment the area of the 3D space into smaller areas using the K-means algorithm. The K-means takes into consideration the distance and the similarities between the segments; the obstacle, or the greater concentration of obstacles in an area, becomes the center of the cluster (segment), as shown in the figure above.

Figure VIII. Low MSE rate obtained by increasing the number of iterations.

The velocity planning approach is based on deriving speed limits, limiting longitudinal acceleration, and limiting side acceleration on curved segments to allow the UAV to fly with better control. By implementing robust path-planning
control, we are able to deliver precise path following regardless of uncertainties and variations in speed. On the other hand, our resulting trajectory is 36.9 m with a time of 2.7 s. In our work, we have produced a speed profile that shows the changes in speed while avoiding the obstacles in the environment. Table I below shows the results of our work in comparison with the other related algorithms.
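The velocity planning idea above can be sketched as a curvature-capped speed limit followed by forward/backward passes that bound longitudinal acceleration. This is a generic textbook formulation with assumed parameter values (v_max, a_lon, a_lat and the segment data are our own placeholders), not the exact planner used in this paper:

    import numpy as np

    def speed_profile(seg_len, curvature, v_max=10.0, a_lon=2.0, a_lat=3.0):
        # Lateral-acceleration cap on curves: v <= sqrt(a_lat / |kappa|).
        kappa = np.maximum(np.abs(curvature), 1e-9)
        v = np.minimum(v_max, np.sqrt(a_lat / kappa))
        for i in range(1, len(v)):                  # forward pass: accel limit
            v[i] = min(v[i], np.sqrt(v[i - 1] ** 2 + 2 * a_lon * seg_len[i - 1]))
        for i in range(len(v) - 2, -1, -1):         # backward pass: decel limit
            v[i] = min(v[i], np.sqrt(v[i + 1] ** 2 + 2 * a_lon * seg_len[i]))
        return v

    lengths = np.full(8, 5.0)                            # hypothetical 5 m segments
    kappa = np.array([0, 0, 0.2, 0.3, 0.2, 0, 0, 0.4])   # curvature in 1/m
    print(speed_profile(lengths, kappa).round(2))        # slower on sharp curves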
Table I. Result comparison with respect to trajectory, accuracy, and time

Mode                        Dijkstra   RRT       A*       Proposed
Resulting trajectory [m]    -          37.1015   -        36.9609
Accuracy                    -          95%       -        98.6%
Time [s]                    3.02996    6.9513    3.2781   2.7561

The idea behind combining the K-means and the ANN is to predict a suitable path that is able to avoid the obstacles until the drone reaches its goal. From the results above we can conclude that our work is better in terms of path length and speed. The problem with RRT is that its handling is not smooth and its path is highly random, so the resulting path is not suitable. Comparing our work with the related work shows that we are able to avoid the obstacles as fast as possible, whereas working only with RRT algorithms may cause many problems regarding time, speed and avoidance capability.

Figure IX. Speed profile path planning.

Fig. IX shows the performance of the drone moving from the start point trying to reach its goal. As we can see from the figure, the speed is highest at the beginning of the movement, which means the drone is moving along the path; after a while the speed starts decreasing because it is avoiding an obstacle. The speed keeps increasing and decreasing depending on the obstacles in the environment until the drone reaches its goal as safely as possible.

IV. Conclusions

In this paper, we presented an approach to guide a drone to fly through a 3D space area. The approach uses a K-means algorithm to segment the known area, a convolutional neural network to extract the features of the area segments, and an ANN to classify these segments. Once the segments are defined and labeled, the drone can easily pass through the safe segments and avoid the unsafe segments. We evaluated our scheme using the MSE index and found that our approach has a low MSE which decreases with each iteration of ANN training; also, the speed keeps increasing and decreasing depending on the obstacles in the environment until the drone reaches its goal.

In the future, we propose applying this scheme to a real-life drone with sensing devices that sense the surrounding area in real time to guide the drone in uncharted areas.

References

[1] J. Barraquand and J.-C. Latombe, "Robot motion planning: A distributed representation approach," The International Journal of Robotics Research, vol. 10, no. 6, pp. 628-649, 1991.
[2] M. Mohammed, O. Ata, and D. Cagdas, "Nonholonomic path planning for a mobile robot based on Voronoi and Q-Learning algorithm," 1st International Conference on Computing and Machine Intelligence (ICMI), pp. 236-239, 2021.
[3] M. Erdmann and T. Lozano-Perez, "On multiple moving objects," Algorithmica, vol. 2, no. 1, pp. 477-521, 1987.
[4] S. Karaman and E. Frazzoli, "Sampling-based algorithms for optimal motion planning," The International Journal of Robotics Research, vol. 30, no. 7, pp. 846-894, 2011.
[5] L. Zong, J. Luo, M. Wang, and J. Yuan, "Obstacle avoidance handling and mixed integer predictive control for space robots," Adv. Sp. Res., vol. 61, no. 8, pp. 1997-2009, 2018, doi: 10.1016/j.asr.2018.01.025.
[6] W. Zhao, H. Chu, M. Zhang, T. Sun, and L. Guo, "Flocking Control of Fixed-Wing UAVs with Cooperative Obstacle Avoidance Capability," IEEE Access, vol. 7, pp. 17798-17808, 2019, doi: 10.1109/ACCESS.2019.2895643.
[7] J. Zhang, S. Zhang, and R. Gao, "Discrete-time predictive trajectory tracking control for nonholonomic mobile robots with obstacle avoidance," Int. J. Adv. Robot. Syst., vol. 16, no. 5, pp. 1-11, 2019, doi: 10.1177/1729881419877316.
[8] Y. Yu, X. Chen, Z. Lu, F. Li, and B. Zhang, "Obstacle avoidance behavior of swarm robots based on aggregation and disaggregation method," Simulation, vol. 93, no. 11, pp. 885-898, 2017, doi: 10.1177/0037549717711281.
[9] P. Wu et al., "Autonomous obstacle avoidance of an unmanned surface vehicle based on cooperative manoeuvring," Ind. Rob., vol. 44, no. 1, pp. 64-74, 2017, doi: 10.1108/IR-04-2016-0127.
[10] P. Wang, S. Gao, L. Li, B. Sun, and S. Cheng, "Obstacle avoidance path planning design for autonomous driving vehicles based on an improved artificial potential field algorithm," Energies, vol. 12, no. 12, 2019, doi: 10.3390/en12122342.
[11] L. Song, B. Y. Su, C. Z. Dong, D. W. Shen, E. Z. Xiang, and F. P. Mao, "A two-level dynamic obstacle avoidance algorithm for unmanned surface vehicles," Ocean Eng., vol. 170, pp. 351-360, 2018, doi: 10.1016/j.oceaneng.2018.10.008.
[12] J. Seo, Y. Kim, S. Kim, and A. Tsourdos, "Collision Avoidance Strategies for Unmanned Aerial Vehicles in Formation Flight," IEEE Trans. Aerosp. Electron. Syst., vol. 53, no. 6, pp. 2718-2734, 2017, doi: 10.1109/TAES.2017.2714898.
[13] R. P. Padhy, S. K. Choudhury, P. K. Sa, and S. Bakshi, "Obstacle Avoidance for Unmanned Aerial Vehicles: Using Visual Features in Unknown Environments," IEEE Consum. Electron. Mag., vol. 8, no. 3, pp. 74-80, 2019, doi: 10.1109/MCE.2019.2892280.
[14] J. L. Mendoza-Soto, L. Alvarez-Icaza, and H. Rodríguez-Cortés, "Constrained generalized predictive control for obstacle avoidance in a quadcopter," Robotica, vol. 36, no. 9, pp. 1363-1385, 2018, doi: 10.1017/S026357471800036X.
[15] X. Ma and A. Lee, "Self-adaptive obstacle avoidance fuzzy system of mobile robots," J. Intell. Fuzzy Syst., vol. 35, no. 4, pp. 4399-4409, 2018, doi: 10.3233/JIFS-169759.
[16] Z. Lin, L. Castano, E. Mortimer, and H. Xu, "Fast 3D Collision Avoidance Algorithm for Fixed Wing UAS," J. Intell. Robot. Syst. Theory Appl., vol. 97, no. 3-4, pp. 577-604, 2020, doi: 10.1007/s10846-019-01037-7.
[17] J. Li, J. Sun, and G. Chen, "A multi-switching tracking control scheme for autonomous mobile robot in unknown obstacle environments," Electron., vol. 9, no. 1, 2020, doi: 10.3390/electronics9010042.
[18] L. Hwang, H. M. Wu, and J. Y. Lai, "On-Line Obstacle Detection, Avoidance, and Mapping of an Outdoor Quadrotor Using EKF-Based Fuzzy Tracking Incremental Control," IEEE Access, vol. 7, pp. 160203-160216, 2019, doi: 10.1109/ACCESS.2019.2950324.
[19] F. Belkhouche and B. Bendjilali, "Reactive path planning for 3-D autonomous vehicles," IEEE Trans. Control Syst. Technol., vol. 20, no. 1, pp. 249-256, 2012, doi: 10.1109/TCST.2011.2111372.
[20] D. Zhang, Y. Xu, and X. Yao, "An improved path planning algorithm for unmanned aerial vehicle based on RRT-Connect," 2018 37th Chinese Control Conference (CCC), IEEE, 2018.
[21] Z. He and L. Zhao, "The comparison of four UAV path planning algorithms based on geometry search algorithm," 2017 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), vol. 2, IEEE, 2017.
Joint User Selection and Base Station Assignment
Strategy in Smart Grid
Muhammad Fawad Khan, Dept. of Computer Science and Engineering, Kyungpook National University, Daegu, South Korea, fawadkhan896@gmail.com
Muhammad Azam, School of Energy and Power Engineering, Jiangsu University, Zhenjiang, China, azamlyh@gmail.com
Ashfaq Ahmed, Dept. of Electrical Engg. and CS, Khalifa University, Abu Dhabi, UAE, ashfaq2419@gmail.com
I. INTRODUCTION
Smart grids are the advanced form of conventional electrical grid systems, in which electricity is supplied in a controlled manner [1]. There is a communication link between power generation, transmission and distribution that helps in improving the overall performance of grid systems. It is that communication, or information exchange, that converts the typical grid system into a smarter one. Briefly, a smart grid is an automated electrical grid system that is more reliable, secure and efficient compared to typical electrical grids. It supplies electricity according to the consumer's demand in order to utilize the energy efficiently. Information and communication technology plays a pivotal role among the major stakeholders of a smart grid system, so it can be termed the backbone, or heart, of the smart grid system. As shown in Fig. 1, the scenario would change completely by simply removing the block of information and communication technology, and the rest of the blocks would portray the true picture of conventional electrical grid systems. For a successful smart grid system, smart metering, energy management for both supplier and consumer, as well as advanced communication methods and infrastructure are the core requirements [2], [3]. The emerging smart system relies heavily on the communication network. Usually a wireless communication network is preferred over a wired link due to advantages like flexibility, ease of deployment, cost reduction and greater access [4]. A smart grid network comprises sensors, smart meters, a proper data management and monitoring system, and various alarm, notification, alert and indication mechanisms. Keeping in mind the architecture of the smart grid, the communication network can be categorized into short- and long-range communications. A short-range communication network is just like a personal area network and is termed a Home Area Network (HAN).
It includes interfacing and communication between sensors, smart meters and devices deployed within the home premises. Bluetooth, WiFi, Zigbee, 6LoWPAN and Z-Wave are some of the suitable communication technologies for HANs.

Neighborhood Area Networks (NANs) connect HAN devices to the gateway network. NANs are long-range communication networks and are of great importance in the communication architecture of smart grid systems. There exist numerous options of communication technologies for NANs, i.e., satellite communication, WiMAX and cellular networks. Among these options, cellular networks are preferred for NANs due to the high maturity of the network, dedicated frequency bands, high uplink (UL) and downlink (DL) data rates, lower latency, reliability, secure communication, ease of deployment and ubiquitous coverage [5].

The evolution of cellular networks from 1G to 4G and the exponential growth of devices, sensors and users are true witnesses of the drastic technological advancement in cellular networks. The objectives of 5G target 1000 times higher spectral efficiency, controlled privacy and reliable communication with 10 times lower energy consumption. Heterogeneous networks, mmWave communication, D2D communication, massive MIMO and energy-aware networks are considered state-of-the-art technologies for 5G networks. Still, user association and BS assignment are of great importance for improving the overall performance of cellular technology in next-generation networks. User association, or BS assignment, is the decision for a particular user to connect to a specific BS. This assignment, or connection establishment, is carried out using some specific decision variable targeting the optimization of the network. So, there is a requirement for an appropriate assignment strategy for both efficient utilization of resources and optimized performance of a particular communication network. This paper focuses on the user selection and BS assignment strategy at the same time to get optimal results in order to meet the objectives.

II. EXISTING WORK

Modern cellular networks are the most suitable technology for smart grid communications. For the optimization of this communication network, researchers have discussed the user and BS association problem extensively. In a cellular architecture there are access nodes (BSs), and a user must establish its connection with a specific access node (BS) to get the provided services. However, [6] highlighted Coordinated Multipoint Technology (CoMP), where a single user can even establish connections with more than one BS. New access nodes have been deployed in Het-Nets (Heterogeneous Networks) in order to offload traffic from macro BSs (base stations). This deployment of pico/femto cells for handling more users in crowded areas is termed cell splitting. Small cells (pico/femto) can hardly attract users, as they are low-power cells compared to a macro BS. Transmit power plays an important role in making a connection between a user and the access node. The conventional BS assignment strategy uses the SINR (Signal to Interference plus Noise Ratio) as the decision parameter for connecting users to a particular BS. This conventional scheme is termed the max-SINR rule, and it fails to balance the load between the macro and micro BSs. For this purpose, [7] proposed a heuristic that adds a bias term to the SINR of the low-power micro BS in order to push it to a comparable level to attract users. [8]-[14] tried to minimize the interference by minimizing the total transmit power while considering a minimum SNR constraint for every user. In the existing literature, the max-SINR rule has been discussed most of the time, but this typical approach has some serious issues regarding load balancing and the transmit power of micro BSs. So, different researchers have proposed numerous solutions to this problem. For example, [15] targeted coordinate descent, dual coordinate descent, as well as the subgradient method to solve the association problem. Pricing variables are introduced in the objective function, and their weighting actually helps in finding the optimal solution for the defined objective function. Some of the work of [15] has already been published in [16]. For increasing the coverage area and offloading traffic from the high-power macro BS, [7] added a constant bias term to the SNR, but it is difficult to specify the exact value of that optimal bias term. A greedy algorithm has been discussed in [17]-[19]. The drawback of BS assignment using the greedy approach is that a user connects to a particular BS considering only the maximization of its own utility. Every time, a user switches its BS on the basis of its own throughput in order to increase its own utility, irrespective of the other parameters of the network environment. It is difficult to control this type of mechanism. As shown in Fig. 2, each time users switch their BS to maximize their own data rate, which leads to back-and-forth switching. This falls in the category of selfish algorithms, leading to oscillatory behavior of the network. [20] proposed game theory for the assignment of users and BSs. In [21], channel state information (CSI) is not shared because of selfish reasons.

Fig. 2. Oscillatory behavior of greedy approach.

The rest of the paper is organized as follows: Section III comprises the system model and problem formulation, while the simulation results and a discussion of the obtained results are presented in Section IV. The conclusion and future work are highlighted in Section V.
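Before moving to the system model, the max-SINR rule just discussed can be illustrated in a few lines. The sketch below is our own toy example with made-up channel gains and PSD levels; it shows how each user independently picks the strongest BS, with nothing in the rule balancing the load:

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_bs = 6, 2
    gain = rng.exponential(scale=1.0, size=(n_users, n_bs))  # |h_ij|^2 (toy values)
    p = np.array([10.0, 1.0])                                # hypothetical PSDs: macro, micro
    noise = 0.1

    rx = gain * p                                            # received power from each BS
    sinr = rx / (rx.sum(axis=1, keepdims=True) - rx + noise) # other BSs act as interference

    assoc = sinr.argmax(axis=1)                              # max-SINR rule
    print("association:", assoc)
    print("load per BS:", np.bincount(assoc, minlength=n_bs))  # macro BS tends to win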
TABLE I
NOTATIONS

Symbol     Definition
N          Total number of active users
u_j        Number of users connected to BS j
h_ij       Channel between user i and BS j
b_ij       Binary decision variable for association of the ith user with the jth BS
P_j        PSD (Power Spectral Density) of the jth BS
γ²         PSD of AWGN (Additive White Gaussian Noise)
R_ij       Data rate
Γ          SNR (Signal-to-Noise Ratio) gap
SINR       Signal-to-interference-plus-noise ratio

III. SYSTEM MODEL AND PROBLEM FORMULATION

The main focus of this work is on a BS and user assignment strategy that yields the maximization of our objective. Choosing the cellular network as the communication network for smart grid systems, the target is to improve the overall throughput by selecting the best combinations of users and BSs.

Considering a downlink cellular network, let L be the number of BSs, with a total of U active users within the area covered by the cellular network. i and j are used for indexing a particular user and BS respectively, i.e., i ∈ [1, 2, 3, ..., U] and j ∈ [1, 2, 3, ..., L]. B is the total bandwidth (B.W.) that is distributed among all the BSs. To make the problem simpler, frequency-flat power spectral density (PSD) levels and flat-fading channels are assumed, so that the SINR values are constant over all frequencies. The channel between the ith user and the jth BS is denoted by h_ij ∈ C, and P_j is the PSD level. Then the SINR value for the association of user i and BS j will be

SINR_{ij} = \frac{|h_{ij}|^2 P_j}{\sum_{j' \neq j} |h_{ij'}|^2 P_{j'} + \gamma^2}    (1)

where γ² is the PSD of the AWGN (Additive White Gaussian Noise).

In this paper, we adopt a sum-log utility maximization objective for optimizing the throughput by selecting the best combination of users and BSs. If u_j is the number of users associated with BS j, then each user connected to that specific BS will share 1/u_j of the frequency resource. The data rate of user i connected with BS j is calculated by

R_{ij} = \frac{B}{u_j} \log\left(1 + \frac{SINR_{ij}}{\Gamma}\right)    (2)

where Γ is the SNR gap of user i, determined by the coding and modulation scheme. Γ is assumed to be the same for every user.

The binary (0-1) variable b_ij is a decision variable used to determine whether user i is associated with BS j or not. Then the objective function can be written as

f_o(b, R) = \sum_{i,j} b_{ij} \log(R_{ij})    (3)

Introducing a new parameter, termed the utility parameter a_ij, as

a_{ij} = \log\left(B \log\left(1 + \frac{SINR_{ij}}{\Gamma}\right)\right)    (4)

and substituting the value of a_ij into the objective function, our BS assignment problem becomes

\max_{b, u} \sum_{i,j} a_{ij} b_{ij} - \sum_{j} u_j \log(u_j)

subject to:

C1: \sum_{j} b_{ij} = 1, ∀i
C2: \sum_{i} b_{ij} = u_j, ∀j
C3: \sum_{j} u_j = U
b_{ij} ∈ {0, 1}, ∀i, ∀j

C1 indicates that one user can be connected to only one BS at a time, i.e., a one-to-one connection between user and BS. C2 gives the total number of users connected to each BS, denoted by u_j. C3 states that all users will be served.

IV. RESULTS AND DISCUSSION

In this section, the results obtained from the exhaustive search and the specified algorithm (fmincon) are discussed extensively. Our objective function is a maximization function that depends on the user and BS assignment strategy: b_ij assigns a particular user to a BS, and a_ij is the utility parameter (data rate) introduced to obtain an estimate of the maximum capacity of the network.

In Fig. 3(a), the exhaustive vs. optimal algorithm output shows the value of the objective function, i.e., the throughput in monetary units. The figure compares the exhaustive and the optimal search algorithm. It can be observed from the bar graph that the optimal algorithm cannot give better results than the heuristic approach; the heuristic approach will always be at least as good as any algorithm. As indicated in the graph, the overall throughput of the network increases with the number of users, because the utility increases in this way for a fixed number of BSs. If the number of BSs is fixed and we keep increasing the number of users, then after a specific number of users the utility of the network starts decreasing. This decrement indicates the point of saturation for a specific number of BSs, which can be observed in the case of the optimal algorithm.

Figs. 3(b), (c) and (d) show the results of the optimization algorithm (fmincon) used to improve the overall performance of the network with different combinations of users and BSs. For every set of users, the optimization algorithm calculates the objective function using the best assignment combination of users and BSs. There are three different graphs with distinct throughputs, which indicates that different numbers of BSs are used. As indicated in Fig. 3(b), 2 BSs are fixed for this case.
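The formulation of Eqs. (1)-(4) can be checked on a toy instance by enumerating every assignment and scoring it with the objective Σ a_ij b_ij − Σ u_j log u_j. The sketch below is our own illustration with made-up channels; exhaustive enumeration is only feasible for tiny U and L, which is exactly why an optimization routine such as fmincon is used in practice:

    import itertools
    import numpy as np

    rng = np.random.default_rng(1)
    U, L = 5, 2                      # tiny instance: 5 users, 2 BSs
    B, Gamma, noise = 1.0, 1.0, 0.1
    p = np.array([10.0, 1.0])        # hypothetical BS PSD levels
    g = rng.exponential(size=(U, L)) # |h_ij|^2, toy channel gains

    rx = g * p
    sinr = rx / (rx.sum(axis=1, keepdims=True) - rx + noise)   # Eq. (1)
    a = np.log(B * np.log(1.0 + sinr / Gamma))                 # Eq. (4)

    best, best_assoc = -np.inf, None
    for assoc in itertools.product(range(L), repeat=U):        # all b_ij with C1
        u = np.bincount(assoc, minlength=L).astype(float)      # u_j (C2, C3)
        util = a[np.arange(U), assoc].sum() - np.sum(u[u > 0] * np.log(u[u > 0]))
        if util > best:
            best, best_assoc = util, assoc
    print("best association:", best_assoc, "utility:", round(best, 3))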
(a) Exhaustive vs optimal algorithm
(b) Max achievable throughput with 2 BSs
(c) Max achievable throughput with 3 BSs
(d) Max achievable throughput with 5 BSs

Fig. 3. (a) Throughput performance comparison of the exhaustive vs optimal algorithm in monetary units. (b), (c), (d) indicate the maximum achievable throughput and the saturation point for the accommodation of users with a specific number of base stations.
When we keep increasing the number of users with a fixed number of BSs, the network throughput increases up to 50 users. After 50 users, the throughput decreases, which indicates overloaded traffic on the 2 BSs. So, for best utilization, 2 BSs can accommodate a maximum of 50 users in order to obtain the maximum overall throughput of the network. If we want to accommodate more users, then the number of BSs must be increased for better service. In the same way, if we increase the number of BSs to 3 and 5, as indicated in Figs. 3(c) and (d), we can provide the best service to 70 and even more users by taking the best assignment strategy using fmincon. In a nutshell, our algorithm selects the user and BS assignment jointly, targeting the maximum throughput of the network.
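fmincon is a MATLAB routine; a rough Python analogue of the continuous relaxation it solves can be sketched with scipy.optimize.minimize, relaxing b_ij to [0, 1] under the row-sum constraint C1 and minimizing the negative utility. This is our own hedged reconstruction of the setup, with random a_ij values, not the authors' code:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    U, L = 8, 3
    a = rng.normal(size=(U, L))          # utility parameters a_ij (toy values)

    def neg_utility(x):
        b = x.reshape(U, L)
        u = b.sum(axis=0)                # relaxed u_j = sum_i b_ij
        u_safe = np.clip(u, 1e-12, None) # avoid log(0)
        return -(np.sum(a * b) - np.sum(u * np.log(u_safe)))

    cons = [{"type": "eq", "fun": lambda x, i=i: x.reshape(U, L)[i].sum() - 1.0}
            for i in range(U)]           # C1: each user fully assigned
    x0 = np.full(U * L, 1.0 / L)
    res = minimize(neg_utility, x0, bounds=[(0.0, 1.0)] * (U * L),
                   constraints=cons, method="SLSQP")
    assoc = res.x.reshape(U, L).argmax(axis=1)   # round relaxation to 0-1
    print("association:", assoc)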
V. CONCLUSION AND FUTURE WORK

This paper jointly optimizes the user and BS association by selecting the best combination of users and BSs that maximizes the overall throughput of the network. Implementing the heuristic approach gives the maximum results compared to the optimal algorithm. For any specific number of BSs, a saturation point of user accommodation has been identified. This saturation point is directly linked with the utility of the objective function and identifies the maximum number of users that can be served using that fixed number of BSs. It is clearly observed from the results that increasing the number of users increases the overall throughput up to the saturation point, and after this point any increase in users leads to a decrement in the utility of the network. The optimization algorithm (fmincon) yields the best combination of users and BSs, which is then used to obtain maximum throughput up to a certain saturation level. So, keeping the saturation point in mind, the performance of the network has been optimized using the optimal algorithm.
Jointly optimizing the power as well as the user association could be a promising extension of this work. Beamforming and power control are still two important aspects of a communication network that can make it more suitable and efficient for the emerging smart grid communication network.
REFERENCES
[1] P. Siano, “Demand response and smart grids—a survey,” Renewable and
sustainable energy reviews, vol. 30, pp. 461–478, 2014.
[2] Y. Kabalci, “A survey on smart metering and smart grid communication,”
Renewable and Sustainable Energy Reviews, vol. 57, pp. 302–318, 2016.
[3] Y. Yan, Y. Qian, H. Sharif, and D. Tipper, “A survey on smart grid com-
munication infrastructures: Motivations, requirements and challenges,”
IEEE communications surveys & tutorials, vol. 15, no. 1, pp. 5–20,
2013.
[4] M. Rafiei, S. M. Elmi, and A. Zare, “Wireless communication protocols
for smart metering applications in power distribution networks,” in
Electrical Power Distribution Networks (EPDC), 2012 Proceedings of
17th Conference on. IEEE, 2012, pp. 1–5.
[5] C. Kalalas, L. Thrybom, and J. Alonso-Zarate, “Cellular communi-
cations for smart grid neighborhood area networks: A survey,” IEEE
Access, vol. 4, pp. 1469–1493, 2016.
[6] E. U. T. R. Access, “Further advancements for e-utra physical layer
aspects,” 3GPP Technical Specification TR, vol. 36, p. V2, 2010.
[7] I. Guvenc, M.-R. Jeong, I. Demirdogen, B. Kecicioglu, and F. Watanabe,
“Range expansion and inter-cell interference coordination (icic) for
picocell networks,” in Vehicular Technology Conference (VTC Fall),
2011 IEEE. IEEE, 2011, pp. 1–6.
[8] L. Smolyar, I. Bergel, and H. Messer, “Unified approach to joint
power allocation and base assignment in nonorthogonal networks,” IEEE
Transactions on Vehicular Technology, vol. 58, no. 8, pp. 4576–4586,
2009.
[9] J. T. Wang, “Sinr feedback-based integrated base-station assignment,
diversity, and power control for wireless networks,” IEEE Transactions
on Vehicular Technology, vol. 59, no. 1, pp. 473–484, 2010.
[10] H. Pennanen, A. Tölli, and M. Latva-aho, “Decentralized base station
assignment in combination with downlink beamforming,” in Signal
Processing Advances in Wireless Communications (SPAWC), 2010 IEEE
Eleventh International Workshop on. IEEE, 2010, pp. 1–5.
[11] D. H. Nguyen and T. Le-Ngoc, “Joint beamforming design and base-
station assignment in a coordinated multicell system,” Iet Communica-
tions, vol. 7, no. 10, pp. 942–949, 2013.
[12] V. N. Ha and L. B. Le, “Distributed base station association and power
control for heterogeneous cellular networks,” IEEE Transactions on
Vehicular Technology, vol. 63, no. 1, pp. 282–296, 2014.
[13] R. D. Yates and C.-Y. Huang, “Integrated power control and base station
assignment,” IEEE Transactions on vehicular Technology, vol. 44, no. 3,
pp. 638–644, 1995.
[14] S. V. Hanly, “An algorithm for combined cell-site selection and power
control to maximize cellular spread spectrum capacity,” IEEE Journal
on selected areas in communications, vol. 13, no. 7, pp. 1332–1340,
1995.
[15] K. Shen and W. Yu, “Distributed pricing-based user association for
downlink heterogeneous cellular networks,” IEEE Journal on Selected
Areas in Communications, vol. 32, no. 6, pp. 1100–1113, 2014.
[16] ——, “Downlink cell association optimization for heterogeneous net-
works via dual coordinate descent,” in Acoustics, Speech and Signal
Processing (ICASSP), 2013 IEEE International Conference on. IEEE,
2013, pp. 4779–4783.
[17] R. Madan, J. Borran, A. Sampath, N. Bhushan, A. Khandekar, and T. Ji,
“Cell association and interference coordination in heterogeneous lte-a
cellular networks,” IEEE Journal on selected areas in communications,
vol. 28, no. 9, pp. 1479–1489, 2010.
[18] T. Bu, L. E. Li, and R. Ramjee, "Generalized proportional fair scheduling in
third generation wireless data networks," in IEEE INFOCOM, 2006, pp.
1-12.
[19] K. Son, S. Chong, and G. De Veciana, "Dynamic association for load
balancing and interference avoidance in multi-cell networks," IEEE
Transactions on Wireless Communications, vol. 8, no. 7, 2009.
[20] L. Jiang, S. Parekh, and J. Walrand, "Base station association game in
multi-cell wireless networks (special paper)," in Wireless Communications
and Networking Conference, 2008. WCNC 2008. IEEE. IEEE, 2008, pp. 1616-1621.
[21] M. Hong and A. Garcia, "Mechanism design for base station association
and resource allocation in downlink OFDMA network," IEEE Journal on
Selected Areas in Communications, vol. 30, no. 11, pp. 2238-2250, 2012.
Comparison of flowdrill and conventional drilling methods in
thin-walled materials
terms of bushing heights, clamping force, hardness and crack
analyses.
Table III. Test plan

Material   Thickness (mm)   Hole & tapping dia. (mm)   Spindle speed (rpm)   Method
III. Test Results
As a result of the tests, the bushing heights of the holes drilled with the flowdrill method increased by 4-5 times compared to the conventional drilling method. With this increase comes the possibility of tapping more threads into the test specimens. It is predicted that the connection will be strengthened by the increase in the height of the bushings in these ratios and by the additional threading in the parts.

B. Comparison of flowdrilling and conventional drilling methods in terms of clamping strengths

To measure the strength of thin-walled connections, clamping tests were carried out on materials drilled by the flowdrill and conventional drilling methods. The clamping tests were performed at a constant speed of 1 mm/min. With the flowdrill method, the peel strength increases approximately 2-3 times compared to the conventional drilling method.

C. Comparison of flowdrilling and conventional drilling methods in terms of part hardness

During drilling with the flowdrill method, the friction between the flowdrill tip and the part at high speeds causes a temperature increase of approximately 300-400 °C around the holes. Hardness tests were performed for both materials to assess the hardness changes around the hole caused by the sudden heating and cooling.
Table IV. AISI 304 stainless steel and St 37-2 flowdrill method penetrant test results

AISI 304 stainless steel   St 37-2

welding, rivet nuts, bonding). After the drilling process, the tapping can be opened easily, eliminating the risk of warping in the material.
Classification of Animal Faces Using a Novel DAG-CNN
Architecture
Shahram Taheri, Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey, shahram.taheri@antalya.edu.tr
Zahra Golrizkhatami, Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey, z.golrizkhatami@antalya.edu.tr
Önsen Toygar, Department of Computer Engineering, Eastern Mediterranean University, Famagusta, North Cyprus, Turkey, onsen.toygar@emu.edu.tr
Abstract—Manual classification and recognition of animals in wildlife images and footage is a tiring and extremely challenging process. Therefore, automatic systems developed with computer vision approaches for the classification of animals are suggested. In this research work, we present a new architecture of a non-linear deep learning structure, namely Directed Acyclic Graph Convolutional Neural Networks (DAG-CNNs), for animal classification. This system applies the learned features of several Convolutional Neural Network (CNN) layers and fuses them for the final decision making. For this purpose, a popular and publicly available CNN architecture, namely VGG-16, is selected and considered as the underlying backbone structure of the proposed system, and several new branches are added to it. The proposed system automatically performs multi-stage feature extraction and combines multiple classifiers' decisions in a score-level fusion manner. Experiments on the open-access animal database prove the capability and efficiency of this novel method.

Keywords— convolutional neural network, score-level fusion, directed acyclic graph, animal classification

I. Introduction

Animal classification is a subdomain of visual object categorization which has been reported in a few studies. Detection and classification of animals can be used in various applications of computer vision such as animal-vehicle accident prevention, animal tracing, identification, antitheft of animals in zoos and content-based image retrieval. Animals are considered one of the most difficult objects in object detection applications [1]. There are various reasons, such as the fact that most animals are able to self-mask and usually appear in complicated scenes with varying illumination, viewpoints and scales. Animal pictures captured in the wild may have complex backgrounds, various postures and diverse illuminations. Another difficulty in the task of animal classification is that the available datasets contain a limited number of animal classes. Animals are one of the object classes used in the state-of-the-art for object recognition. The performance of an object recognition system strictly relies on how well the object representation and characterization are done.

Although various human face recognition methods have been proposed in the literature, they are not fully appropriate for animal face classification with its high range of intra-class variations and inter-class similarities. In this respect, several object recognition algorithms have been applied to animal images with the aim of extracting hand-crafted features such as texture and shape. The major drawback of these methods is that they are entirely problem dependent. Consequently, the productivity of these algorithms is problematic. Constructing a complex, high-dimensional feature vector from various hand-crafted features takes extensive time, which may not be efficient. Animal classification and recognition can also be valuable in expert systems to determine wild animals' migration corridors. Object characterization can be obtained by applying visual descriptors, shape descriptors or texture representation. Deep learning, and specifically Convolutional Neural Network (CNN), approaches have been successfully employed in recent studies of object classification and recognition tasks [2-5] and shown to have salient performance compared to the state-of-the-art. A CNN is an end-to-end system which is capable of extracting relevant features and also integrates the feature extraction and classification phases of classical machine learning systems. To train a CNN, a huge set of samples is required. A CNN extracts discriminative features automatically, and in many cases these features are shown to be superior to hand-crafted features such as HOG, LBP, or SURF [6]. Generally, CNN architectures consist of various numbers of basic elements, namely convolutional, max-pooling and ReLU layers, followed by a multi-layer perceptron neural network. In our previous study [7], in order to take advantage of both approaches, two different classifier systems were trained: one employed the CNN features, and the other was trained on hand-crafted features, such as appearance-based and shape-based features. Afterwards the outcomes were fused and the final results were obtained based on this fusion.

In this research, we develop a new CNN architecture, namely Directed Acyclic Graph Convolutional Neural Networks (DAG-CNNs), for animal classification. The proposed method exploits the multi-stage features from different CNN layers. DAG-CNN integrates the feature extraction and classification phases of a CNN into a single automated learning process. Furthermore, the proposed system employs multi-stage features and carries out the score-level fusion of multiple classifiers automatically. Hence, the proposed system revokes the necessity of extracting hand-crafted features.

The contributions of our work are as follows:

• The proposed system combines two main stages of the conventional machine learning procedure, namely the feature extraction and classification stages, into a completely automatic learning scheme for animal classification.

• Unlike the classic CNN, which uses only the features of the last layer, the proposed system employs the multi-stage learned features from mid-layers as well.

• The proposed system aggregates the results of different classifiers and automatically combines them by the score-level fusion approach, and positively it negates the use of feature-level fusion.

The proposed architecture has been tested on the LHI-Animal-Faces database, which contains 20 different categories: 19 classes of various animal types and one class of human faces.
The differentiation of these categories is very challenging due to their evolutionary relationship and shared parts. Additionally, significant within-class variations such as rotation, flip transforms, posture variation and sub-types exist in the face categories. In order to make a comparison and follow the state-of-the-art methods, in our experiments 30 samples of each class were selected for training and the remaining samples for testing. The obtained results confirm that the usage of multi-stage features from different layers of a CNN remarkably improves the classification accuracy.

The rest of this paper is organized as follows. Related works are reviewed in Section 2, and an overview of the DAG-CNN architecture is given in Section 3. The proposed method, the experimental settings and the obtained results are presented in Section 4. Finally, the conclusion is stated in the last section.

II. Related works

One of the earliest studies on animal recognition was performed by Schmid et al. [8]. They built an image retrieval system for 4 classes of animals by applying Gabor filters. Later, Ramanan et al. proposed systems for detecting animals in video frames by utilizing texture and shape-based features [9-10]. In [11], the authors introduced a method for an animal image search engine. An attempt at classifying marine species was made by Cao et al. [12]. For this purpose, they fused CNN and local feature descriptors and achieved better results than either individual system. Peng et al. in [13] introduced a deep learning algorithm to classify animal face images and investigated its performance on the LHI-Animal-Faces dataset. They proposed a deep boosting framework based on layer-by-layer joint feature boosting and dictionary learning. In each layer, they utilized a collection of filters obtained by fusing the early-layer filters.

Afkham et al. [14] collected a new dataset including realistic images of 13 various species against complex wild backgrounds. They introduced a system for visual object classification based on combining the textural features of samples and their background. An approach for classifying 25 categories of animals is reported in [15]. The animal images are segmented and partitioned into small patches, and then some colour-related features are extracted from each of them. These features are fed into a probabilistic neural network for classification.

In [16], the authors suggested two systems for classifying 20 classes of animals by using Gabor features and the K-means algorithm. Matuska et al. [17] developed a system which detects and classifies images of 5 categories: fox, deer, wolf, brown bear and wild boar. Burghardt and Calic [18] created a real-time system which tracks animals' heads in video and collects information about them. They applied the Viola-Jones detection algorithm, which is basically utilized in human face detection.

III. Backgrounds

The theory and the essential components of the DAG-CNN are presented in the following sub-sections. We briefly introduce the VGG-16 architecture, which has been employed as the backbone of the proposed DAG-CNN approach.

A. Directed Acyclic Graph-Convolutional Neural Network (DAG-CNN)

Directed acyclic graph (DAG) networks are capable of constructing more complicated network architectures compared to the classic CNN, which has a linear chain of layers [20]. The DAG structure is inspired by the concept of recurrent neural networks (RNNs), which have feedback connections between forward and backward layers. These feedbacks make the network able to capture dynamic states. The major superiority of the DAG architecture is its possibility of receiving multiple input parameters from several backward layers. As a result, it can gain different scales of image representation. A basic feature of deep neural networks (DNNs) is the skip connection between layers, which is alike the DAG-CNN's main idea, and it has been proven that these skip connections can improve the accuracy of the classification system accordingly.

Yang and Ramanan [21] proposed the DAG-CNN in 2015. Their algorithm was applied to a collection of multi-scale image features and tested for the classification of 3 standard scene benchmarks. The authors proved that the multi-scale model can be implemented as a DAG-structured feed-forward CNN. According to this architecture, end-to-end gradient-based learning can be applied for automatic multi-scale feature extraction by using a generalized back-propagation algorithm over the layers that have multiple inputs. Basically, the network training equations follow the standard CNN equations, except for the ADD and ReLU layers, due to their several inputs and outputs.

Figure I presents the parameter setup for the i-th ReLU layer, considering \alpha_i as its input and \beta_i^{(j)} as the output for its j-th output branch (its j-th child in the DAG); z is the final output of the softmax layer. The gradient of z with respect to the input of the i-th ReLU layer can be calculated as follows:

\frac{\partial z}{\partial \alpha_i} = \sum_{j=1}^{C} \frac{\partial z}{\partial \beta_i^{(j)}} \, \frac{\partial \beta_i^{(j)}}{\partial \alpha_i}    (1)

Figure I. Parameter configuration at the i-th ReLU layer [21]
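As a concrete illustration of Eq. (1) — several read-out branches tapping different depths and merging into one prediction — the following PyTorch sketch is our own simplified stand-in, not the authors' network. Each stage feeds an average-pooled classifier branch, the branch scores are summed, and autograd then performs exactly the multi-input/multi-output backpropagation described above:

    import torch
    import torch.nn as nn

    class TinyDAGCNN(nn.Module):
        # Two conv stages; each stage feeds its own pooled classifier branch,
        # and the branch scores are summed (score-level fusion).
        def __init__(self, n_classes=20):
            super().__init__()
            self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                                        nn.ReLU(), nn.MaxPool2d(2))
            self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1),
                                        nn.ReLU(), nn.MaxPool2d(2))
            self.pool = nn.AdaptiveAvgPool2d(1)      # marginal activations
            self.head1 = nn.Linear(16, n_classes)    # branch off stage 1
            self.head2 = nn.Linear(32, n_classes)    # branch off stage 2

        def forward(self, x):
            f1 = self.stage1(x)
            f2 = self.stage2(f1)
            s1 = self.head1(self.pool(f1).flatten(1))
            s2 = self.head2(self.pool(f2).flatten(1))
            return s1 + s2                            # summed branch scores

    model = TinyDAGCNN()
    scores = model(torch.randn(2, 3, 224, 224))
    print(scores.shape)  # torch.Size([2, 20])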
In the convolutional layers, the convolution operation is computed by Eq. (3) as follows:

y(i,j) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} x(i-k,\, j-l)\, h(k,l)    (3)

where x refers to the input sample, h indicates the used filter, and M, N stand for the width and height of the filter. The convolution layer's output is denoted by y. In order to update the neurons' biases and weights, Eqs. (4)-(5) are applied to the DAG-CNN layers:

\Delta W_l(t+1) = -\frac{x\lambda}{r} W_l - \frac{x}{n} \frac{\partial C}{\partial W_l} + m\, \Delta W_l(t)    (4)

\Delta B_l(t+1) = -\frac{x}{n} \frac{\partial C}{\partial B_l} + m\, \Delta B_l(t)    (5)

In the above equations, W, B, l, \lambda, x, n, m, t, and C signify the weight, bias, layer number, regularization parameter, learning rate, total number of training samples, momentum, updating step, and cost function, respectively.

In DAG-CNNs, considering the fact that the lower layers are directly linked to the output layer through multi-scale connections, it is assured that these layers' neurons receive a strong gradient signal during the learning process and do not suffer from the vanishing gradient issue.

In CNNs, the dimension of the learned features in the mid-layers can be very large. Therefore, concatenating these features may result in the curse of dimensionality. To avoid this issue, marginal activations are applied by operating average pooling on the learned features of the layers which are used for score-level fusion.

Figure IV. LHI-Animal-Faces dataset. Five images are shown for each category.
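Eqs. (3)-(5) can be spelled out directly in code. The NumPy sketch below is our own illustration with placeholder hyperparameter values; it evaluates the convolution of Eq. (3) over the valid output region and applies one momentum update of Eqs. (4)-(5), with the symbols mapped to variables as listed above:

    import numpy as np

    def conv2d(x, h):
        # Eq. (3): y(i,j) = sum_k sum_l x(i-k, j-l) h(k,l), valid region only;
        # the kernel flip makes this a true convolution, not correlation.
        M, N = h.shape
        H, W = x.shape
        y = np.zeros((H - M + 1, W - N + 1))
        for i in range(y.shape[0]):
            for j in range(y.shape[1]):
                y[i, j] = np.sum(x[i:i + M, j:j + N] * h[::-1, ::-1])
        return y

    def sgd_momentum_step(W, B, dC_dW, dC_dB, dW_prev, dB_prev,
                          x=0.01, lam=1e-4, r=100, n=100, m=0.9):
        # Eqs. (4)-(5): x = learning rate, lam = regularization, m = momentum,
        # n and r = sample counts as in the text (placeholder values here).
        dW = -(x * lam / r) * W - (x / n) * dC_dW + m * dW_prev
        dB = -(x / n) * dC_dB + m * dB_prev
        return W + dW, B + dB, dW, dB

    y = conv2d(np.arange(25.0).reshape(5, 5), np.ones((3, 3)))
    print(y.shape)  # (3, 3)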
B. VGG-16 architecture

VGG-16 [22], which was proposed by the Oxford Visual Geometry Group in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC 2014), is deeper and wider compared to the classic CNN architecture. VGG-16 consists of five batches of convolution operations; every batch can have 3 to 4 adjacent Conv-layers that are associated with max-pooling layers. The size of the kernels in all convolutional layers is 3×3. Convolutional layers and the number of kernels have
Figure V. Confusion matrix. (top) Score-level fusion of VGG-16 and KFA [5]; (bottom) DAG-VGG-16.
illumination and image quality. Then two distinct sets of features are generated. Afterwards, the similarity scores between these feature vectors and all the feature vectors available in the training set are computed, and the minimum score is selected for each technique. The distance between the test and training animal images is considered as the score of that sample. At this point, the achieved scores are normalized and the fusion is performed by adding these normalized scores; the result is fed to a Nearest Neighbour (NN) classifier to obtain the final animal class label.
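A minimal sketch of this score-level fusion step is given below; min-max normalization is an assumption here, since the text only states that the scores are normalized before being added:

import numpy as np

def normalize(s):
    # rescale a score vector to [0, 1]
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

def fuse_and_classify(scores_a, scores_b, train_labels):
    # scores_*: distances from one test image to every training image
    fused = normalize(scores_a) + normalize(scores_b)
    return train_labels[np.argmin(fused)]  # nearest neighbour on the fused score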
V. Experiments and Results

The experiments employ the LHI-Animal-Faces dataset [19], which contains 2200 head images of 19 different categories of animals plus one class of human heads. Five random samples from each of the aforementioned classes are shown in Figure IV. Due to large intra-class variations and inter-class similarities in the dataset, precise classification is a challenging task.
In order to achieve comparable results with the state-of-the-art, we follow the experimental setup introduced by Si et al. [25]. In this setting, 30 random images from each class are selected for the training phase and the rest of the images are utilized in the test stage.

The pre-processing step includes image resizing to 224 × 224 pixels, pixel intensity normalization and histogram equalization. VGG-16 is selected as the underlying architecture of the proposed DAG-CNN due to its success in image classification. By following the greedy approach explained in [2], three new links are added to Batch-3, Batch-4 and Batch-5 (Figure III). The proposed system is pre-trained with hundreds of thousands of images from a large public image repository, namely the ImageNet database [28], and has learned powerful discriminative feature sets.

In the next step, by utilizing the target dataset, we fine-tuned the pre-trained proposed system. Data augmentation techniques like translation, rotation and flipping are applied to avoid overfitting, and the training size is expanded five times. The first early layers' weights are frozen so that they remain intact throughout the fine-tuning process. The remaining layers' weights are fine-tuned by applying the Stochastic Gradient Descent (SGD) algorithm to minimize the loss function with a small initial learning rate of 0.001.
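This freeze-and-fine-tune recipe can be sketched in PyTorch as follows; the exact cut-off between frozen and trainable layers is an illustrative assumption, not the paper's configuration:

import torch
import torchvision

model = torchvision.models.vgg16(pretrained=True)   # ImageNet pre-trained backbone
for p in model.features[:10].parameters():          # freeze the early layers (illustrative cut-off)
    p.requires_grad = False
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.001, momentum=0.9)                          # SGD with the small initial learning rate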
In order to investigate the efficiency of the proposed DAG-CNN system, we compare its performance with the state-of-the-art algorithms that utilized the LHI-Animal-Faces dataset. This comparison is summarized in Table 1 and shows that the proposed method outperforms other state-of-the-art algorithms for animal face classification. In [7], we examined several categories of feature extractors, including hand-crafted local and appearance-based descriptors and automatically learned features. We applied well-known local feature descriptors such as Histogram of Oriented Gradients (HOG), Completed Local Binary Patterns (CLBP), LBP Histogram Fourier (LBP-HF), Haralick features and Median Robust Extended LBP (MRELBP). In order to utilize the discrimination power of appearance-based descriptors, Linear Discriminant Analysis (LDA) and Kernel Fisher Analysis (KFA) were examined. Finally, we tested the learned features of pre-trained and fine-tuned publicly available CNN architectures, namely AlexNet and VGG-16. We investigated score-level fusion of various combinations of the aforementioned descriptors and showed that fusion of hand-crafted features and multi-stage CNN-based features gains even higher accuracy compared to the CNN alone. On the other hand, the proposed DAG-CNN system yields a meaningful improvement in accuracy: a 96.4% classification accuracy was achieved by automatically combining multi-stage learned features. This result is superior to all of the state-of-the-art methods mentioned in Table 1, even our previous system which utilized hand-crafted and learned feature fusion.
Table I. Classification Accuracy on the LHI-Animal-Faces dataset

Method                   Accuracy
HOG+SVM [19]             70.8%
SW-RBF [25]              44%
FRAME [26]               79.4%
HIT [19]                 75.6%
LSVM [27]                77.6%
AOT [24]                 79.1%
Deep Boosting [13]       81.5%
Score-level fusion [7]   95.31%
Proposed DAG-CNN         96.29%

The confusion matrix is computed to assess the classification precision for the two best results of Table 1 and is shown in Figure V. For 8 animal categories, the proposed system successfully classified the images with 100% accuracy, and for 10 classes, the classification precisions are higher than 92%. The maximum confusions are generated by the sheep category versus cow heads, and by rabbit and chicken heads versus mouse heads (8%).
VI. Conclusions

In this paper, a new architecture of a nonlinear CNN structure, namely DAG-CNN, is presented for the animal classification task. This system automatically combines different layers' features of a CNN for decision making. For this purpose, several new links are added to the underlying backbone of a popular CNN structure, namely VGG-16. The proposed method is compared with several state-of-the-art systems that use the same dataset. The comparison results show that the proposed DAG-CNN architecture outperforms the state-of-the-art systems for animal face classification.
[5] Golrizkhatami, Z., Acan, A.: ECG classification using three-level fusion of different feature descriptors. Expert Systems with Applications. 114, 54-64 (2018)
[6] Taheri, S., Toygar, Ö.: Multi-stage age estimation using two level fusions of handcrafted and learned features on facial images. IET Biometrics. 8(2), 124-133 (2019)
[7] Taheri, S., Toygar, Ö.: Animal classification using facial images with score-level fusion. IET Computer Vision. 12(5), 679-685 (2018)
[8] Schmid, C.: Constructing models for content-based image retrieval. In: CVPR '01, Kauai, United States, pp. 11-39. CVPR (2001)
[9] Ramanan, D., Forsyth, D.A., Barnard, K.: Detecting, localizing and recovering kinematics of textured animals. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2005, San Diego, USA, pp. 635-642. IEEE (2005)
[10] Ramanan, D., Forsyth, D.A., Barnard, K.: Building models of animals from video. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(8), 1319-1334 (2006)
[11] Berg, T.L., Forsyth, D.A.: Animals on the web. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), NY, USA, pp. 1463-1470. CVPR (2006)
[12] Cao, Z., Principe, J.C., Ouyang, B., Dalgleish, F., Vuorenkoski, A.: Marine animal classification using combined CNN and hand-designed image features. In: OCEANS'15 MTS/IEEE Washington, pp. 1-6. IEEE (2015)
[13] Peng, Z., Li, Y., Cai, Z., et al.: Deep Boosting: Joint feature selection and analysis dictionary learning in hierarchy. Neurocomputing. 178(20), 36-45 (2016)
[14] Afkham, H., Tavakoli, A., Eklundh, J., Pronobis, A.: Joint Visual Vocabulary for Animal Classification. In: ICPR 2008, Tampa, FL, USA, pp. 1-4. ICPR (2008)
[15] Kumar, Y.S., Manohar, N., Chethan, H.K.: Animal classification system: a block based approach. Procedia Computer Science. 45, 336-343 (2015)
[16] Manohar, N., Kumar, Y.S., Kumar, G.H.: Supervised and unsupervised learning in animal classification. In: Advances in Computing, Communications and Informatics (ICACCI), International Conference on, pp. 156-161. IEEE (2016)
[17] Matuska, S., Hudec, R., Benco, M., Kamencay, P., Zachariasova, M.: A novel system for automatic detection and classification of animal. In: ELEKTRO, pp. 76-80. IEEE (2014)
[18] Burghardt, T., Calic, J.: Real-time face detection and tracking of animals. In: Neural Network Applications in Electrical Engineering, NEUREL 2006, 8th Seminar on, pp. 27-32. IEEE (2006)
[19] Si, Z., Zhu, S.-C.: Learning hybrid image templates (HIT) by information projection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 34(7), 1354-1367 (2012)
[20] Golrizkhatami, Z., Taheri, S., Acan, A.: Multi-scale features for heartbeat classification using directed acyclic graph CNN. Applied Artificial Intelligence. 32(7-8), 613-628 (2018)
[21] Yang, S., Ramanan, D.: Multi-scale recognition with DAG-CNNs. In: Computer Vision (ICCV), 2015 IEEE International Conference, pp. 1215-1223. ICCV (2015)
[22] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
[23] Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions. In: CVPR (2015)
[24] Si, Z., Zhu, S.-C.: Learning and-or templates for object recognition and detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 35(9), 2189-2205 (2013)
[25] Kolouri, S., Zou, Y., Rohde, G.K.: Sliced Wasserstein kernels for probability distributions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5258-5267. IEEE (2016)
[26] Xie, J.: Generative Modeling and Unsupervised Learning in Computer Vision. Doctoral dissertation, UCLA (2016)
[27] Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object Detection with Discriminatively Trained Part-Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 32(9), 1627-1645 (2010)
[28] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: Proc. CVPR (2009)
Movie Success Prediction with Statistical Analysis Techniques and Machine Learning Methods

Bugay Sarıkaya, Department of Computer Engineering, Baskent University, Ankara, Turkey
Duygu Dede Sener, Department of Computer Engineering, Baskent University, Ankara, Turkey
Abstract— In the movie industry, huge investments are made to shoot a successful motion picture. However, despite large investments, some movies are not as successful as expected. Therefore, predicting the success of a movie before its release is very important for movie producers. In this study, we aim to develop a classification-based prediction model for providing the producers with foresight about investing in a movie. Different statistical analysis and machine learning approaches were used in the proposed model for predicting the success of a movie. We mainly focus on detecting which movie attribute is most highly correlated with the success of the movie, and which machine learning technique is better at predicting movie success. To do so, firstly a statistical analysis was conducted using chi-square analysis and the analysis of variance test. Then a comparative analysis was performed using different machine learning techniques, including random forest, support vector machine and artificial neural network. The experimental results indicate that the most important predictors of a movie's success are "voteAverage", "voteCount", "revenue" and "budget". In addition, random forest was the most successful, with an accuracy of 96% in predicting movie success among the other machine learning methods.

Keywords— Movie success prediction, Movie success classification, Machine Learning, Model Prediction

I. Introduction

The movie industry is one of the most expansive industries around the world. With the rapid growth of this industry and its economic impact, many researchers have been studying the movie industry. In particular, building predictive models to investigate the factors that affect the success of a movie has become a popular research area over the past decade. It is a well-known fact that the success of a motion picture is based on several features of the movie such as box-office, budget, revenue, and popularity level. Although there are other important factors, such as director and actors, which have an undeniable impact on a movie's success, not every movie achieves the expected box-office or success. Therefore, there is a basic need for producers to predict the success of a movie before its release. Recently, several studies have focused on predicting the success of movies using different machine learning approaches.

Ahmad et al. [1] proposed a mathematical model for predicting movie success, including finding correlations between various features using χ2 analysis. Simulation data was used in the study and only tested on Bollywood movies; they also concluded that actors and film genres affect the success of a film. Ping-Yu Hsu et al. [2] developed a special model to predict user ratings with IMDB (Internet Movie Database) attributes. There are 32968 films in the used dataset, and linear combination, multiple linear regression, neural network methods and χ2 analysis were used. It is stated that some attributes, such as writers, actors, and directors, profoundly affect user ratings. In addition, Eker et al. [3] sought to measure the effect of features on movie classification. Decision trees, K-NN, Random Forest, C4.5, C5.0 and Boosting algorithms were used in this study. A comparative study was performed using different machine learning algorithms on datasets from the IMDB and Facebook websites. It was observed that user votes are the most important factor in the IMDB score, and the country where the film is produced is the least important factor in determining the IMDB score. Saraee et al. [4] studied IMDB data using various data mining techniques. Furthermore, Lash and Zhao [5] proposed a way to predict decisions regarding film investments. This study assisted in making early investment decisions in filmmaking by using historical data. In this study, the profit was calculated mainly from the box-office revenue. However, for many movies there are other sources of income, such as items for sale. Kyuhan Lee et al. [6] examined multiple approaches to improve the performance of the prediction model. They developed and added a new feature derived from the theory of transmedia storytelling and used an ensemble approach, which has rarely been adopted in research on predicting box-office performance. As a result, the proposed model, the Cinema Ensemble Model (CEM), outperformed the prediction models from past studies that use existing machine learning algorithms. Besides these studies, Hemraj Verma and Garima Verma [7] conducted a comparative analysis of prediction models using various machine learning techniques. The models were used to predict whether a movie would be a hit or a flop before it came out. The major predictors used in the models are the ratings of the lead actor, the IMDb ranking of the movie, the music rank of the movie, and the total number of screens planned for the release of the movie.

In this study, statistical analysis techniques and different machine learning methods were applied to predict the success of a movie. Two statistical tests, chi-square analysis and the analysis of variance test (ANOVA), and three different machine learning algorithms, including random forest (RF), support vector machine (SVM) and artificial neural network (ANN), were performed on a collected dataset. A comparative study was performed to investigate which algorithm achieves the highest accuracy. An experimental study was performed on a dataset collected from IMDB. According to the results, the most important predictors of a movie's success are "voteAverage", "voteCount", "revenue" and "budget". Besides this, it has been found that random forest is the most successful technique, with an accuracy
of 0.96 in predicting movie success among the other machine learning methods.

II. Methods

In this section, the statistical analysis techniques and machine learning techniques used in the proposed model are described.

A. Statistical Analysis Methods

a) Correlation Matrix
An attribute correlation matrix was used to obtain the correlation between each attribute and the target attribute, which is movie success in our study. It is a common way to summarize data and to guide the data owner to focus on highly correlated attributes, so that the data analysis can be conducted more efficiently. In our study, the Pearson correlation coefficient was used. The score ranges between 0 and 1; values close to 1 represent a high correlation while values close to 0 represent a low correlation.

b) Chi-Squared Test
A chi-square (χ2) test [8] is a hypothesis testing method. It is used for comparing the observed values with the expected values to detect whether the stated null hypothesis is true or not. The null hypothesis states that there is no difference between the compared data. For this test, a p-value that is less than or equal to the defined significance level (0.05) indicates that there is strong evidence to conclude that the observed distribution is not the same as the expected distribution. Moreover, the data used in calculating a chi-square statistic must be random, raw, mutually exclusive, drawn from independent variables, and drawn from a large enough sample.

c) Analysis of Variance (ANOVA)
Analysis of variance (ANOVA) [9] is a statistical analysis tool that splits the observed aggregate variability found inside a data set into two parts: systematic factors and random factors. The systematic factors have a statistical influence on the given data set, while the random factors do not. Analysts use the ANOVA test to determine the influence that independent variables have on the dependent variable in a regression study. In our study, two-way ANOVA was used because we wanted to make comparisons between the means of three groups of data where two independent variables are considered; the considered variables are movie success and the rest of the attributes. Moreover, Multivariate ANOVA (MANOVA) was used to extend the capabilities of ANOVA by assessing multiple dependent variables simultaneously.
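A minimal sketch of these two tests with SciPy is shown below; the DataFrame and column names are hypothetical, and a one-way ANOVA is used for brevity where the paper applies the two-way and multivariate forms:

import pandas as pd
from scipy import stats

# df is assumed to hold the movie attributes plus a categorical success label
observed = pd.crosstab(df["success"], df["voteCount"])     # observed frequencies
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

groups = [g["budget"].values for _, g in df.groupby("success")]
f_stat, p_anova = stats.f_oneway(*groups)                  # ANOVA across the success classes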
B. Machine Learning Methods

a) Random Forest
Random forests (RF), or random decision trees [10], are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. It is an ensemble learning method used for classification, regression and other tasks, which requires constructing a multitude of decision trees. For the classification task, each generated tree predicts an output class and the final decision is made by selecting the most frequently predicted class among the trees. RFs and their variants are called black-box models, and they have been applied in a variety of research fields such as bioinformatics, finance and healthcare systems.

b) Support Vector Machine (SVM)
The support vector machine (SVM) [11] is a supervised machine learning algorithm used for data classification and regression analysis. SVM's main goal is to find a hyperplane that best divides a dataset into two different classes, applied multiple times as needed to match the number of classes. Support vectors are the data points nearest to the hyperplane; these points help define the hyperplane, so all computations are done through them. The hyperplane creates a margin which divides the two classes apart, and the error function is designed so that the margin becomes larger as the error decreases. If there is no clearly dividing hyperplane, the whole feature space is transformed into a new, higher-dimensional feature space; this is known as kernelling. SVMs produce accurate results on clean datasets with small to medium sample sizes. When dealing with larger datasets, however, the computational costs can be too much to handle, and the method is highly sensitive to the noisy nature of large datasets.

c) Artificial Neural Network (ANN)
An artificial neural network (ANN) [12] is a piece of a computing system designed to simulate the way the human brain analyzes and processes information. It is a foundation of artificial intelligence (AI) and solves problems that would prove impossible or difficult by human or statistical standards. ANNs have self-learning capabilities that enable them to produce better results as more data becomes available. Artificial neural networks are built like the human brain, with neuron nodes interconnected like a web. The human brain has hundreds of billions of cells called neurons, and each neuron is made up of a cell body that is responsible for processing information by carrying information towards (inputs) and away from (outputs) the brain.

III. Results

In this section, the used dataset is explained and the experimental results are given.

A. Dataset

In this study, a dataset of 4899 movies released from 2000 to 2020 was collected from TMDB (The Movie Database) [13] and OMDB (The Open Movie Database) [14]. The dataset consists of various types of attributes: date, genre, language, season and IMDB (Internet Movie Database) rating are categorical, while box-office, budget, IMDB votes, popularity, revenue and run time are numerical. Regarding the sample distribution of the dataset, there are 1215 samples for the successful class, 2663 samples for the average successful class and 1021 samples for the unsuccessful class.
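The three classifiers described above can be compared on such a dataset with scikit-learn; the sketch below assumes preprocessed feature and label arrays X and y, and the train/test split is illustrative:

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
for name, clf in [("RF", RandomForestClassifier()),
                  ("SVM", SVC()),
                  ("ANN", MLPClassifier(max_iter=500))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))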
a) Data Preprocessing Steps
In the dataset, each movie has an IMDB rating representing the success of the movie. Unlike a binary classification problem with two classes, multi-class classification covers the movie success information more broadly. An artificial class construction was performed to transform the problem into a multi-class classification problem. Therefore, each movie is categorized based on its IMDB rating: movies with scores in the range [7-10] were assigned to the "successful" class, scores in the range [5-6.99] to the "average successful" class, and scores in the range [0-4.99] to the "unsuccessful" class. In this way, our problem was converted into a multi-class problem. Furthermore, missing value removal and transformation of categorical features into numeric features were performed before applying the classification algorithms. One-hot encoding was applied to convert categorical values into numerical values; in this way, a generalized form can be provided to the machine learning algorithms. Finally, a feature scaling approach was applied to some attributes to normalize the range of independent attributes such as box-office, revenue and vote counts.
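These preprocessing steps can be sketched with pandas and scikit-learn as follows; the column names are assumptions for illustration, with the class boundaries taken from the text:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# bin the IMDB rating into the three classes described above
df["class"] = pd.cut(df["imdbRating"], bins=[0, 4.99, 6.99, 10],
                     labels=["unsuccessful", "average successful", "successful"])
df = df.dropna()                                       # missing value removal
df = pd.get_dummies(df, columns=["genre", "season"])   # one-hot encoding
num_cols = ["boxOffice", "revenue", "imdbVotes"]
df[num_cols] = MinMaxScaler().fit_transform(df[num_cols])  # feature scaling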
B. Evaluation Metrics

To evaluate the performance of the classification algorithms, one metric was used: accuracy. Accuracy (1) refers to the degree of conformity of a measured or calculated quantity to the actual (true) value. A result is said to be accurate when it matches the particular target. TP, FP, TN and FN represent true positives, false positives, true negatives and false negatives, respectively. Overall accuracy was used to compare the performance of the classification algorithms.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}    (1)
C. Experimental Results

To get insight into the success distribution among the movies based on the attributes, some distribution plots were obtained. Figure 1 gives the movie success distribution based on the genre attribute. The class categories are given in different colors: red, green and blue represent the unsuccessful, average and successful movie classes, respectively. According to the figure, the highest number of successful movies can be seen in "drama" and "comedy" movies, while most of the "horror" and "thriller" movies are in the unsuccessful class.

Then the correlation matrix was obtained, given in Figure 2. According to the matrix, the movie success, given under the name "imdbRating", is highly correlated with the attributes "voteAverage", "voteCount", "revenue" and "budget". After having obtained the attributes highly correlated with success, the statistical analysis was performed by focusing on these attributes to conclude the relationship between them. To do so, the chi-square (χ2), ANOVA and MANOVA tests were performed to investigate which movie attribute has the biggest impact on movie success.

Figure II. Correlation matrix of movie attributes

The chi-square (χ2) test statistics and corresponding p-values are given in Table 1. According to the obtained p-values, which are all less than the significance level of 0.05, we can conclude that the association between each attribute and the movie's success is statistically significant, excluding the attribute "season_summer", since its p-value is not less than or equal to 0.05. In addition, the "Df" value in the table represents the degrees of freedom.

Table I. Chi-Squared test results

Attribute        X-squared   Df     p-value
voteAverage      9645.6      156    < 2.2e-16
voteCount        8865.4      3222   < 2.2e-16
revenue          10081       4310   < 2.2e-16
budget           5728        2154   < 2.2e-16
runtime          3230        372    < 2.2e-16
boxOffice        9119.6      3978   < 2.2e-16
date             125.36      20     < 2.2e-16
popularity       19113       8300   < 2.2e-16
season_winter    14.283      2      0.0007914
season_autumn    14.283      2      0.0007914
season_summer    4.9467      2      0.0843
season_spring    17.391      2      0.0001674
imdbVotes        4825.6      1512   < 2.2e-16
Table II. Two-way ANOVA results of the attributes with the IMDB success categories (columns: Df, Sum Sq, Mean Sq, F Value, Pr(>F))

Table IV. Classification performances of the machine learning algorithms
[8] Plackett, R. L. (1983). Karl Pearson and the chi-squared test. International Statistical Review/Revue Internationale de Statistique, 59-72.
[9] Scheffe, H. (1999). The analysis of variance (Vol. 72). John Wiley & Sons.
[10] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
[11] Noble, W. S. (2006). What is a support vector machine?. Nature Biotechnology, 24(12), 1565-1567.
[12] Wang, S. C. (2003). Artificial neural network. In Interdisciplinary Computing in Java Programming (pp. 81-100). Springer, Boston, MA.
[13] https://www.themoviedb.org/ access date: 20.04.2021
[14] https://www.omdbapi.com/ access date: 20.04.2021
Digital Controllers Performance Analysis for a Robot Arm

Abdullah Ahmed Al-dulaimi, Department of Electrical Electronics Engineering, Karabuk University, Abdalluhahmed1993@gmail.com
Mohammed Majid Abdulrazzaq, Department of Computer Engineering, Karabuk University, moh.abdulrazzaq9@gmail.com
Mustafa Mohammed Alhassow, Department of Electrical and Computer Engineering, Altinbas University, Mustafa.alshakhe@gmail.com
Noor Qasim AL saedi, Department of Electrical Electronics Engineering, Karabuk University, Noorqasimat@gmail.com
Abstract—The design methodology and performance study of various forms of digital compensators for a robot arm joint control system with sensor input are presented in this article. Continuous-time (s-plane or w-plane) and discrete (z-plane) domain parameters are used in the design process. Frequency response design techniques were investigated, and five basic types of controllers were modelled and simulated using MATLAB: phase-lag, phase-lead, proportional-integral (PI), proportional-derivative (PD), and proportional-integral-derivative (PID). All of the controllers have been set up to maintain a 40-degree phase margin. Both closed loop step responses and open loop bode plots have been analyzed. This paper presents a comparison of the controllers based on their step response characteristics.

Keywords: digital controllers, PID controllers, robot arm, robot arm controllers, digital controllers performance.

I. Introduction

Controllers are needed to assess adjustments in system parameters and to meet performance requirements for steady-state precision, transient response, reliability, and disturbance rejection. Analog control systems are stable, with no intrinsic bandwidth limitations or system changes. However, due to the tolerances of practical machines, intricate logic is difficult to synthesize in analog controls, rendering complex interfaces among multiple subsystems is very difficult, and such controls are vulnerable to incorrect designs and limitations. Furthermore, extraneous noise sources can corrupt analog systems significantly. Since no signal loss occurs in analog-to-digital (A/D) and digital-to-analog (D/A) conversions [1], digital systems are reliable. Furthermore, with more sophisticated logic implementation, systems are more flexible and accurate. Digital filters do not encounter external noise, which makes them well-suited for adaptive filtering uses. Fast response and a digital memory interface are possible for digital systems [1]. A physical plant or system is accurately controlled through closed-loop or feedback operation, where an output (system response) is adjusted as required by an error signal [2]. The discrepancy between the response as determined by sensor input and the target response generates the error signal. The error quantity is processed by a controller or compensator in order to satisfy the output requirements [3]. This paper describes five digital controller design methodologies for a real-time robot control system. In these design methods, the compensating parameter is the phase margin specified in the plant bode diagram. The design method employs frequency response strategies based on the cross-over frequency phase margin (Pm). Phase-lag, phase-lead, PI, PD and PID controllers were drawn up in accordance with the principle of compensation and the methodologies defined in [4]. This paper clarifies the statistical and conceptual approaches articulated in the references. The basic and illustrative frames and approaches to digital control systems are mentioned in [6]. Digital control systems have been documented for training, theory, simulation and experimental approaches [7],[8]. A closed loop model has been introduced in [1] and [4] for digital systems control and implementations in the digital drive controller. Regarding the PID controller, we cite the information from [5].

II. Methodology

A sampler, a D/A block that is a zero-order hold (ZOH), a servomotor represented by an s-domain transfer function, a digital controller block, a power amplifier gain, gears represented by a gain value, and a feedback sensor block comprise the example robot control scheme outlined in this article. An s-domain transfer function presents the uncompensated plant. A/D conversion is started by the sampler and D/A conversion is held at zero order. The controllers must offset the plant phase margin, and the desired result shall be 40 deg. For each controller, the steady-state error, percent overshoot, rise time, and settling time are calculated for output assessment. This paper documents, section by section, a literature review of digital compensation, an example uncompensated robot arm joint plant, discrete and continuous time equations with the design method, MATLAB simulation results of lag, lead, PI, PD and PID controls, and a comparative study among these five. In order to evaluate the design requirements, the digital system was adopted, simulated and extended in MATLAB.

III. Literature Review

The compensation theory, plant configuration and the mathematical derivations of the design approaches, loop parameters and open loop of the controllers mentioned in this paper fully follow the literature provided in [1]. The controller transfer function for first-order compensation can be written as

D(z) = \frac{K_d (z - z_0)}{z - z_p}    (1)

Here, $z_0$ and $z_p$ represent the zero and pole positions, respectively. The controller's bilinear or trapezoidal transformation from the discrete z-plane to the continuous w-plane (warped s-plane) implies

D(w) = D(z), \qquad z = \frac{1 + (T/2)w}{1 - (T/2)w}    (2)

and

D(w) = a_0 \, \frac{1 + w/\omega_{w0}}{1 + w/\omega_{wp}}    (3)
Figure I. Robot arm joint control system block diagram
Here $\omega_{w0}$ and $\omega_{wp}$ denote the zero and pole positions in the w-plane, and $a_0$ denotes the compensator dc gain. The bilinear approximation states that

w = \frac{2}{T}\,\frac{z - 1}{z + 1}    (4)

From equations (1)-(4), the controller can be realized in the z-plane as

D(z) = a_0 \, \frac{\omega_{wp}(\omega_{w0} + 2/T)}{\omega_{w0}(\omega_{wp} + 2/T)} \cdot \frac{z - \dfrac{2/T - \omega_{w0}}{2/T + \omega_{w0}}}{z - \dfrac{2/T - \omega_{wp}}{2/T + \omega_{wp}}}    (5)

which, compared with equation (1), yields

K_d = a_0 \, \frac{\omega_{wp}(\omega_{w0} + 2/T)}{\omega_{w0}(\omega_{wp} + 2/T)}, \qquad z_0 = \frac{2/T - \omega_{w0}}{2/T + \omega_{w0}}, \qquad z_p = \frac{2/T - \omega_{wp}}{2/T + \omega_{wp}}    (6)

IV. Block Diagram Explanation

A closed loop model for digital control systems and applications of digital controllers to speed drives is shown in the above diagram. It consists of a sampler, a digital controller block, a D/A block which is a zero-order hold (ZOH), a power amplifier gain, a servomotor represented by an s-domain transfer function, gears represented by a gain value and a feedback sensor block. In the case of a closed-loop feedback system, the D(z) digital controller is implemented. The controller uses algebraic algorithms such as filters and compensatory controls to correct or regulate the controlled system's behavior. The zero-order hold is a practical mathematical model of signal reconstruction using a digital-to-analog converter: it converts each sample to a continuous-time signal by storing the sample value for a set time and not allowing changes. The amplitude or power of a signal from input to output port can be increased by connecting it to an amplifier whose gain is set to a particular level [9]. A servomotor transforms the control signal from the controller into the rotational angular displacement or angular velocity of the motor output shaft. To power the arms of the robot, servo motors are used. Gears are used to transfer motion: by finding the torque and speed of the output gear, one can find the torque and speed of the input gear. The uncompensated plant is presented by an s-domain transfer function. The sampler initiates A/D conversion and the zero-order hold implements D/A conversion. For performance evaluation, steady-state error, percent overshoot, rise time and settling time are measured for each controller. Here the given sensor feedback gain is GT = 0.07; the sensor input is θ_a in degrees and the output is in volts.

V. Plant

The control system of the robot arm is shown in Fig. 1, with sampling time T = 0.1 s, power amplifier gain K = 2.4 and sensor feedback gain H_k = 0.07, and the system phase margin evaluated with D(z) = 1. The ZOH transfer function can be defined as

G_{HO}(s) = \frac{1 - e^{-sT}}{s}    (7)

The plant TF in continuous time is

G_p(s) = \frac{9.6}{s^2 + 2s}    (8)

With the sensor feedback gain, the continuous-time plant becomes

G_c(s) = G_p(s) \times H_k = \frac{0.672}{s^2 + 2s}    (9)

where TF stands for transfer function. The plant that operates in discrete time, known as the discrete-time plant, is

G_d = \frac{0.00028289\,(s + 3.39\mathrm{e}04)}{(s + 1.524)(s + 0.4406)}    (10)

Fig. 2 introduces the bode diagram of the D(z) = 1 system; the Pm for the uncompensated system is 79.6 deg, with a gain margin Gm = 35.8 dB.

A. Design of Phase Lag Controller

The task is to design a phase-lag controller with a dc gain of 10 that yields a 40 deg system phase margin. The high-frequency gain is

G_{hf}(dB) = 20 \log \frac{a_0 \omega_{wp}}{\omega_{w0}}    (11)

The controller in this paper is built for 39.7842 degrees. For this design, the Pm and the cross-over (Pm) frequency have been chosen as ω_wc = 1.1291 rad/s.
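Equations (5)-(6) translate into a few lines of Python; the sketch below simply evaluates the closed-form mapping from the w-plane design parameters to the z-plane gain, zero and pole:

def realize_first_order(a0, w_w0, w_wp, T):
    # Eqs. (5)-(6): z-plane realization of the first-order w-plane compensator
    Kd = a0 * (w_wp * (w_w0 + 2 / T)) / (w_w0 * (w_wp + 2 / T))
    z0 = (2 / T - w_w0) / (2 / T + w_w0)
    zp = (2 / T - w_wp) / (2 / T + w_wp)
    return Kd, z0, zp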
Figure II. Controllers' Bode plot of the open loop
and

\omega_{wp} = \frac{\omega_{w0}}{a_0 \, |G_d(j\omega_{wc})|}    (13)

The TF of the controller is

D_{lag}(z) = \frac{0.01167\,(z - 0.7105)}{z - 0.9919}    (14)

Figure II shows the phase-lag controller. The compensated plant Pm = 39.7842 deg at 0.0925 rad/s can be seen on the bode plot, with Gm = 29.6 dB at 0.575 rad/s. The gain and phase margin values are undefined in the bode plot of the controller alone, and hence these are determined to be infinite.

B. Design of Phase Lead Controller

For the phase-lead controller, a_0 = 10 and the maximum phase shift θ_m occurs at a frequency ω_wm = √(ω_w0 ω_wp). The controller in this paper is designed for 39.8827 degrees; for this design, the Pm and the cross-over (Pm) frequency have been chosen as 2.2950 rad/s, i.e., a phase-lead controller with a dc gain of 10 that yields a 40 deg system phase margin. The design condition is

D(j\omega_{wc})\, G_d(j\omega_{wc}) = 1\angle(180 + \phi_{pm})    (15)

where φ_pm is the desired Pm, and

D(w) = a_0 \, \frac{1 + w/(a_0/a_1)}{1 + w/(b_1)^{-1}}    (16)

and

b_1 = \frac{\cos\theta_r - a_0\,|G_d(j\omega_{wc})|}{\omega_{wc}\,\sin\theta_r}    (20)

Because of the phase-lead characteristic, θ_r > 0, and in the design procedure ω_wc has been constrained by the following requirements:

\angle G_d(j\omega_{wc}) < 180 + \phi_{pm}; \quad |D(j\omega_{wc})| > a_0; \quad |G_d(j\omega_{wc})| < \frac{1}{a_0}; \quad b_1 > 0; \quad \cos\theta_c > a_0\,|G_d(j\omega_{wc})|    (21)

The transfer function of the controller is

D(z) = \frac{10.424\,(z - 0.9832)}{z - 0.9618}    (22)

where Pm is the phase margin. From the bode plot, it can be observed that the compensated plant Pm = 39.8827 deg at 2.29 rad/s and the gain margin Gm = 16.2 dB at 6.57 rad/s. From the bode plot of the controller alone, the gain and Pm values are undefined and thereby found to be infinite.
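For simulation, a first-order compensator D(z) = K_d(z − z_0)/(z − z_p) reduces to the difference equation u[k] = z_p u[k−1] + K_d (e[k] − z_0 e[k−1]); the sketch below runs it sample by sample, using the phase-lag values of Eq. (14) as an illustrative default:

def compensate(e, Kd=0.01167, z0=0.7105, zp=0.9919):
    # run D(z) = Kd (z - z0) / (z - zp) over an error sequence e
    u, u_prev, e_prev = [], 0.0, 0.0
    for ek in e:
        uk = zp * u_prev + Kd * (ek - z0 * e_prev)
        u.append(uk)
        u_prev, e_prev = uk, ek
    return u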
Table I. Controllers' Bode plot characteristics

Characteristics | Pm with D(z) = 1 | Lag | Lead | PI | PD | PID
Gain Margin | [61.5755 1.7929e+04] | [12.8749 6.7538e+03] | [6.4935 1.7014e+03] | [0 9.9702 1.8455e+03] | Inf | [0 9.9702 1.8455e+03]
GM Frequency | [6.2240 31.4159] | [4.6917 31.4159] | [6.5659 31.4159] | [0 6.9890 31.4159] | Inf | [0 6.9890 31.4159]
Phase Margin | 79.6399 | 39.7842 | 39.8827 | 38.1362 | 40.0 | 38.1362
Stable | 1 | 1 | 1 | 1 | 1 | 1
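The uncompensated phase margin in the first column of Table I can be checked numerically; the sketch below evaluates G_c(jω) = 0.672/((jω)² + 2jω) on a frequency grid and reads off the phase margin at the gain cross-over. It ignores the extra phase lag of the ZOH, so it slightly overestimates the reported 79.6 deg:

import numpy as np

w = np.logspace(-2, 2, 200000)
G = 0.672 / ((1j * w) ** 2 + 2 * (1j * w))
i = np.argmin(np.abs(np.abs(G) - 1.0))       # index of the gain cross-over, |G| = 1
pm = 180.0 + np.degrees(np.angle(G[i]))
print(w[i], pm)                               # roughly 0.33 rad/s and about 80 deg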
C. Design of PI Controller

A PI (Proportional-Integral) controller is a composite of proportional and integral controllers in cascade with each other, as shown in the figure. Its transfer function here is

D(z) = \frac{1.4575\,(z - 0.9648)}{z - 1}    (26)

From the bode plot, it is worth noting that the compensated plant Pm = 40.0182 deg at 0.553 rad/s and the gain margin Gm = 30.9 dB at 5.61 rad/s. From the bode plot of the controller alone, the gain and Pm values are undefined and thereby found to be infinite.

D. Design of PD Controller

A PD (Proportional-Derivative) controller has both the proportional controller and the derivative controller in cascade, so we have to add both, as shown in the figure. The gains follow from

k_d = \frac{\sin\theta}{\omega_1 A_1} = \frac{\sin\theta}{\omega_1 |G(j\omega_1)|}    (30)

k_p = \frac{\cos\theta}{A_1} = \frac{\cos\theta}{|G(j\omega_1)|}    (31)

E. Design of PID Controller

A PID (Proportional-Integral-Derivative) controller consists of proportional, integral and derivative controllers all connected in cascade, as shown in the figure.
Figure VI. Controllers' step response of the closed loop
D(w) = K_P + \frac{K_I}{w} + K_D\, w    (32)

We can change the transfer function from the w-plane to the z-domain by using the bilinear transformation, in which $w$ is replaced by $\frac{2}{T}\left(\frac{z-1}{z+1}\right)$, where $T$ is the sampling time. The PID controller's discrete TF can then be expressed as

D(z) = K_P + K_I\,\frac{T}{2}\,\frac{z+1}{z-1} + K_D\,\frac{z-1}{Tz}    (33)

For a controller design that yields a 40 deg system phase margin,

\left[K_P + \frac{K_D\,\omega_{wc}^2\,(2/T)}{(2/T)^2 + \omega_{wc}^2}\right] + j\left[\frac{K_D\,\omega_{wc}\,(2/T)^2}{(2/T)^2 + \omega_{wc}^2} - \frac{K_I}{\omega_{wc}}\right] = K_R + jK_C

The $K_P$ proportional gain, $K_D$ derivative gain and $K_I$ integral gain can then be obtained from

K_P + \frac{K_D\,\omega_{wc}^2\,(2/T)}{(2/T)^2 + \omega_{wc}^2} = \frac{\cos\theta_r}{|G_d(j\omega_{wc})|}    (34)

\frac{K_D\,\omega_{wc}\,(2/T)^2}{(2/T)^2 + \omega_{wc}^2} - \frac{K_I}{\omega_{wc}} = \frac{\sin\theta_r}{|G_d(j\omega_{wc})|}    (35)

With the PID controller, the gain margin is 20 dB at 6.99 rad/s and the Pm is 38.1 degrees at 1.85 rad/s. The gain and phase margin values are undefined in the bode plot of the controller alone, and hence these are determined to be infinite.
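In the time domain, Eq. (33) corresponds to a trapezoidal integral and a backward-difference derivative; a minimal sketch with illustrative (not the paper's) gains:

def pid(e, T=0.1, Kp=1.0, Ki=0.1, Kd=0.05):
    # discrete PID of Eq. (33), run sample by sample over an error sequence e
    u, ui, e_prev = [], 0.0, 0.0
    for ek in e:
        ui += Ki * (T / 2.0) * (ek + e_prev)   # trapezoidal integration
        ud = Kd * (ek - e_prev) / T            # backward-difference derivative
        u.append(Kp * ek + ui + ud)
        e_prev = ek
    return u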
VI. Step Response Characteristics

The design problem explained in this paper has assumed an input of θ_c = 0.07u(t). The scaled step responses of the closed loop system for the designed controllers are presented in Figure VI. For the step response overshoot, ξ ↓ ⇒ M_p% (%OS) ↑. From Figure VI and Table II, the best one is the PID controller, because it has very low steady state error compared to the other controllers. For the PID, steady state error ∝ 1/k_p, OS ∝ 1/k_d, rise time ∝ 1/k_p and 1/k_i, and settling time ∝ 1/k_d. K_p, K_d, and K_i can be described as the proportional, derivative, and integral parameters. The closed loop control system is affected by all three of these parameters; in addition to those factors, slow rising, slow settling and large overshoot as well as the steady state error are also affected. A lag compensator shifts the bode magnitude plot down at mid and high frequencies with its attenuation property; to meet a specification on steady-state error, the low frequency gain is changed. The proportional-integral controller is equivalent to a control system that produces an output which is the result of adding the outputs from the proportional and integral controllers. The PID is used in systems where proportional, integral, and derivative controllers are combined to compute an output; it also serves to reduce the steady state error and improve stability. The proportional-derivative controller generates an output that combines the proportional and derivative actions; if a PD is being used, noise may be suppressed at the higher frequencies.

VII. Conclusion

This paper examines the performance and design assessment of five simple digital controllers, namely lag, lead, PI, PD, and PID controllers, which are used for a physical robot arm joint plant. Each controller is designed with a dc gain of 10 that yields a system phase margin Pm of 40 deg. The design methodologies have been investigated in both discrete z-domain approaches and warped s-domain (w-plane) frames. The controllers have been simulated in MATLAB, and bode plots with open loop and closed loop step response curves have been analyzed for a comparative study. The cross-over frequency is a crucial design specification to compensate the plant. Such designs are important as their specifications are applicable in different practical control systems.

References

[1] Chowdhury, Dhiman. "Design and Performance Analysis of Digital Controllers in Discrete and Continuous Time Domains for a Robot Control System." Global Journal of Research In Engineering (2018).
[2] Unglaub, Ricardo A.G., and D. Chit-Sang Tsang. "Phase tracking by fuzzy control loop." 1999 IEEE Aerospace Conference Proceedings (Cat. No. 99TH8403). Vol. 5. IEEE, 1999.
[3] Mastinu, Gianpiero, and Manfred Plöchl, eds. Road and Off-Road Vehicle System Dynamics Handbook. CRC Press, 2014.
[4] Chowdhury, Dhiman, and Mrinmoy Sarkar. "Digital Controllers in Discrete and Continuous Time Domains for a Robot Arm Manipulator." arXiv preprint arXiv:1912.09020 (2019).
[5] Alassar, Ahmed Z., Iyad M. Abuhadrous, and Hatem A. Elaydi. "Comparison between FLC and PID Controller for 5DOF robot arm." 2010 2nd International Conference on Advanced Computer Control. Vol. 5. IEEE, 2010.
[6] Phillips, Charles L., and H. Troy Nagel. Digital Control System Analysis and Design. Prentice-Hall, Inc., 1989.
[7] Misir, Dave, Heidar A. Malki, and Guanrong Chen. "Design and analysis of a fuzzy proportional-integral-derivative controller." Fuzzy Sets and Systems 79.3 (1996): 297-314.
[8] Klee, Harold, and Joe Dumas. "Theory, simulation, experimentation: an integrated approach to teaching digital control systems." IEEE Transactions on Education 37.1 (1994): 57-62.
[9] Liu, Hui. Robot Systems for Rail Transit Applications. Elsevier, 2020.
[10] Boukas, E.K., AL-Sunni, F.M. (2011). Design Based on Transfer Function. In: Mechatronic Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22324-2_5.
Machine learning model optimization with hyper-parameter tuning approach

Md Riyad Hossain, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, md.hossain01@utrgv.edu
Douglas Timmer, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, douglas.timmer@utrgv.edu
Hiram Moya, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, hiram.moya@utrgv.edu
Abstract— Hyper-parameter tuning is a key step to find the optimal machine learning parameters. Determining the best hyper-parameters takes a good deal of time, especially when the objective functions are costly to evaluate, or myriad parameters are required to be tuned. In contrast to conventional machine learning algorithms, neural networks require more hyper-parameter tuning because they must process a lot of parameters together, and depending on the fine tuning, the accuracy of the model can vary between 25%-90%. A few of the most effective techniques for tuning hyper-parameters in deep learning methods are grid search, random search, Bayesian optimization, etc. Every method has some advantages and disadvantages over others. For example, grid search has proven to be an effective technique to tune hyper-parameters despite some drawbacks, like trying too many combinations and performing poorly when tuning many parameters simultaneously. In our work, we will determine, show and analyze the efficiencies of different parameters and tuning methods on a real-world synthetic polymer dataset.

Keywords— Machine learning, Hyperparameter optimization, Grid Search technique, Random Search, BO-GP

I. Introduction

In the era of machine learning, performance (based on accuracy and computing time) is very important. The growing number of tuning parameters associated with machine learning models is tedious and time-consuming to set by standard optimization techniques. Researchers working with ML models often spend long hours determining the perfect combination of hyper-parameters [1]. If we think of w, x, y, z as the parameters of the model, and if all of these parameters are values ranging from 0.0001 to, say, 5.00, then hyperparameter tuning is finding the best combinations to make the objective function optimal.

One of the major difficulties in working with a machine learning problem is tuning the hyperparameters. These are the design parameters that can directly affect the training outcome. The conversion from a non-tuned machine learning model to a tuned ML model can be like going from predicting nothing correctly to predicting everything accurately [2]. There are two types of parameters in ML models: hyperparameters and model parameters. Hyperparameters are arbitrarily set by the user even before starting to train the model, whereas the model parameters are learned during training.

The quality of a predictive model mostly depends on the configuration of its hyperparameters, but it is often difficult to know how these hyperparameters interact with each other to affect the final results of the model [14]. To determine accuracy and make a comparison between two models, it is always better to compare two models with both of the models' parameters tuned. It is not fair to compare a decision tree model with already-tuned parameters against an ANN model whose hyperparameters have not been optimized yet.

II. Literature Review

Hyperparameter tuning, due to its importance, has become an interesting new topic in the ML community. Hyperparameter tuning algorithms are either model-free or model-based. Model-free algorithms make no use of knowledge about the solution space extracted during the optimization; this category includes manual search [4], random search [2, 6-7], and grid search [5]. In manual search, we assume the values of the parameters from previous experience: the user sets hyperparameter values based on judgment or previous experience, trains the algorithm with them, observes the performance, keeps training the model this way until achieving a standard accuracy, and then selects the set of hyperparameters that gives the maximum accuracy. However, this technique is heavily dependent on judgment and previous expertise, and its reliability depends on the correctness of the previous knowledge [3]. A few of the main parameters used by random forest classifiers are criterion, max_depth, n_estimators, min_samples_split, etc.

In random search, we train and test our model based on some random combinations of the hyperparameters. This method is well suited to identifying new combinations of parameters or discovering new hyperparameters. Although it may take more time to process, it often leads to better performance. Bergstra et al. (2012) mentioned in their work that, over the same domain, random search can find models that are as good as or even better within a reduced computation time; after granting the same budget in terms of computational constraints, it was evident that random search can deliver the best models by exploring a larger, less promising configuration space [16]. Random search, which was developed based on grid search, examines a set of random combinations to develop and train the algorithm; Bergstra et al. (2011) [2].

In grid search, the user sets a matrix of hyperparameters and trains the model on each possible combination. Amirabadi et al. (2020) propose two novel suboptimal grid search techniques on four separate datasets to show the
efficiency of their hyperparameter tuning model, and later compare it with some of the other recently published work.

Figure I. (a) Manual tuning, (b) Random tuning, (c) Grid tuning approach (from left to right)

The main drawback of the grid search method is its high complexity; it is commonly used when there are only a few hyperparameters to be tuned. In other words, grid search works well when the best combinations are already roughly determined. Similar applications of grid search have been reported by Zhang et al. (2014) [17], Ghawi et al. (2019) [18], and Beyramysoltan et al. (2013) [19].

Zhang et al. (2019) [20] reported a few of the drawbacks of the existing hyperparameter tuning methods. In their work, they described grid search as an ad-hoc process, as it traverses all the possible combinations and the entire procedure requires a lot of time. Andradóttir (2014) [13] shows that Random Search (RS) eradicates some of the limitations of the grid search technique to an extent. RS can reduce the overall time consumption, but its main disadvantage is that it cannot be guaranteed to converge to the global optimal value. The combination of randomly selected hyper-parameters can never guarantee a steady and widely acceptable result. That is why, apart from the manual tuning methods, automated tuning methods are becoming more and more popular in recent times; Snoek et al. (2015) [10]. Bayesian optimization is one of the most widely used automated hyperparameter tuning methods to find the global optimum in fewer steps. However, Bayesian optimization's results are sensitive to the parameters of the surrogate model, and the accuracy greatly depends on the quality of the learning model; Amirabadi et al. (2020) [3]. To minimize the error function of hyperparameter values, Bayesian optimization adopts probabilistic surrogate models like Gaussian processes. Through precise exploration and exploitation, a surrogate model of the hyperparameter space is established; Eggensperger et al. (2013) [8]. However, probabilistic surrogates need accurate estimations of sufficient statistics of the error function distribution, so a sizable number of hyperparameter evaluations is required, and this method does not work well when myriad hyperparameters must be processed altogether.

III. Methodology

The purpose of hyperparameter optimization is to find the global optimum $x^*$ of the objective function $f(x)$:

x^* = \arg\min_{x \in X} f(x)

where $f(x)$ can be evaluated for any arbitrary $x \in X$, and $X$ is a hyperparameter space that can contain categorical, discrete, and continuous variables [27]. In constructing different machine learning models, the application of effective hyperparameter optimization techniques can simplify the process of identifying the best hyperparameters. HPO contains four major components: first, an estimator, which could be a regressor or any classifier, with one or more objective functions; second, a search space; third, an optimization method to find the best combinations; and fourth, a function to compare the effectiveness of various hyperparameter configurations [28].

A. Grid Search

Grid search is a process that exhaustively searches a manually specified subset of the hyperparameter space of the target algorithm [30]. A traditional approach to finding the optimum is to do a grid search, i.e., to run experiments or processes over a grid of conditions; for example, if there are three factors, a 15 × 15 × 15 grid would mean performing 3375 experiments under different conditions [32]. Grid search is more practical when [31]: (1) the total number of parameters M in the model is small, say M < 10; the grid is M-dimensional, so the number of test solutions is proportional to L^M, where L is the number of test solutions along each dimension of the grid; (2) the solution is known to be within a specific range of values, which can be used to define the limits of the grid; (3) the direct problem d = g(m) can be computed quickly enough that the time required to compute the L^M solutions is not prohibitive; (4) the error function E(m) is uniform on the scale of the grid spacing, Δm, so that the minimum is not lost because the grid spacing is too coarse.

There are several problems with the grid search method. The first is that the number of experiments can be prohibitive if there are several factors. The second is that there can be significant experimental error, which means that if the experiments are repeated under identical conditions, different responses can be obtained; therefore, choosing the best point on the grid can be misleading, especially if the optimum is fairly flat. The third is that the initial grid may be too coarse for the number of experiments to be feasible, and it could miss features close to the optimum or find a false (local) optimum [32].
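A typical grid search with scikit-learn looks like the sketch below; X and y stand for the features and target of the dataset, and the grid values are illustrative, not the paper's configuration:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [50, 100, 200],
              "max_depth": [5, 10, None],
              "min_samples_split": [2, 5]}
search = GridSearchCV(RandomForestRegressor(), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)
print(search.best_params_, -search.best_score_)   # best combination and its MSE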
B. Random Search usually much cheaper than running the objective function.
Random search [33] is a basic improvement on grid However, because Bayesian optimization models are run
search. It started with a randomized search over hyper- based on previously tested values, it is difficult to belong to
parameters from certain distributions over approximate them with parallel sequential methods; but they are generally
parameter values. These searching process runs as long as the able to detect optimal close hyperparameter combinations in
predetermined budget is exhausted, or at least until achieving a few iterations [36]. Common substitution models for BO
a desired set of accuracy. These methods are the simplest include the Gaussian process (GP) [37], random forest (RF)
stochastic optimization and are very useful for certain [38]. Therefore, there are three main BO algorithms based on
problems, such as small search space and fast-running their substitution models: BO-GP, BO-RF, BO-TPE. GP is an
simulation. RS finds a value for each hyperparameter, prior attractive reduced order model of BO that can be used to
to the probability distribution function. Both the GS and RS quantify forecast uncertainty. This is not a parametric model
estimate the cost measure based on the produced and the number of its parameters depends only on the input
hyperparameter sets. Although RS is simple, it has proven to points. With the right kernel function, your GP can take
be more effective than Grid search in many of the cases [33]. advantage of the data structure. However, the GP also has
Random search has been shown to provide better results due disadvantages. For example, it is conceptually difficult to
to several benefits: first, the budget can be set independently understand with BO theory. In addition, its low scalability
based on the distribution of the search space, therefore, with large dimensions or many data points is another
random search technique can sometime work better important issue [36].
especially if the multiple hyper-parameters are not uniformly
distributed [34]. Second: Because each evaluation is IV. Dataset description & Basics of polymer extrusion
independent, it is easy to parallelize and allocate resources.
Unlike GS, RS samples a few parameter combinations from A. Denier
a defined distribution, which maximizes system efficiency by Denier is a weight measurement usually refers to the
reducing the likelihood of wasting a lot of time in a small, thickness of the threads. It is the weight (grams) of a single
underperforming area. In addition, this method can detect optical fiber for 9 kilometers. If we have a 9 km fiber weighs
global optimum values or close to global if given a sufficient 1 gram, this fiber has a denier of 1, or 1D. A fiber with less
budget. Third, although getting optimal outputs applying than 1-gram weight calls Microfibers [22]. Microfibers
random search is not promising, lengthy processing time may become a new development trend in the synthetic polymer
lead to a greater likelihood of getting the best hyperparameter industry. The higher the denier is, the thicker and stronger the
set, whereas extra search times cannot always guarantee fiber is. Conversely, less denier means that the fiber/fabric
improved results in Grid searches. The use of random search will be softer and more transparent. Fine denier fibers are
is recommended in the initial stages of HPO to narrow the becoming a new standard and are very useful for the
search space quickly, before using guided algorithms to get development of new textiles with excellent performance [21].
better results. The main drawback [28] of RS and GS is that
each evaluation in its iteration does not depend on previous B. Breaking Elongation (%)
evaluations; thus, they waste time evaluating Elongation at break is one of the few main quality
underperforming areas of the search space. parameters of any synthetic fiber [24]. It is the percentage of
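The following minimal Python sketch illustrates the random search procedure described above; the objective function and the sampling distributions are placeholders supplied by the user, and all names are illustrative, not the implementation used in this paper.

import random

def random_search(objective, space, budget=50):
    # Minimal random search: sample each hyperparameter from its
    # distribution, evaluate independently, keep the best set.
    best_params, best_score = None, float("inf")
    for _ in range(budget):                      # stop when budget is exhausted
        params = {name: dist() for name, dist in space.items()}
        score = objective(params)                # e.g. validation MSE
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative search space: each entry is a sampling function.
space = {
    "n_estimators": lambda: random.randint(10, 500),
    "max_depth":    lambda: random.choice([None, 5, 10, 20]),
}

In practice, a library implementation such as scikit-learn's RandomizedSearchCV offers the same idea with cross-validation built in.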
C. Bayesian Optimization

Bayesian optimization (BO) is a commonly used iterative algorithm for HPO problems. Unlike GS and RS, BO determines future evaluation points based on the previous results. To determine the next hyperparameter configuration, BO uses two key components: a surrogate model and an acquisition function. The surrogate model aims to fit all the points of the objective function observed so far. The acquisition function determines the utility of candidate points, balancing exploration and exploitation. The BO model balances this exploration and exploitation process to identify the most promising regions and to avoid missing the best configuration in unexplored regions [35].

The basic BO method works as follows: (i) build a reduced-order probabilistic model (ROM) of the objective function; (ii) find the best hyperparameter values in the ROM; (iii) apply those values to the objective function; (iv) update the ROM with the new set of results; (v) repeat the above steps until the maximum number of iterations is reached.

BO is more efficient than GS and RS because it can detect promising combinations of hyperparameters by analyzing previously tested values, and running the surrogate model is usually much cheaper than running the objective function. However, because Bayesian optimization models are updated based on previously tested values, they are difficult to combine with parallel methods; they are nevertheless generally able to detect near-optimal hyperparameter combinations within a few iterations [36]. Common surrogate models for BO include the Gaussian process (GP) [37] and the random forest (RF) [38]. Accordingly, there are three main BO algorithms based on their surrogate models: BO-GP, BO-RF, and BO-TPE. The GP is an attractive reduced-order model for BO that can be used to quantify prediction uncertainty. It is a non-parametric model whose number of parameters depends only on the input points. With the right kernel function, the GP can exploit the structure of the data. However, the GP also has disadvantages: it is conceptually difficult to reconcile with BO theory, and its poor scalability to high dimensionality or many data points is another important issue [36].
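As an illustration of steps (i)-(v), the sketch below implements a minimal BO-GP loop for a single hyperparameter, using a Gaussian process surrogate from scikit-learn and expected improvement as the acquisition function; it is a simplified sketch under these assumptions, not the exact implementation used in this paper.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_gp(objective, low, high, n_init=5, n_iter=20):
    rng = np.random.default_rng(0)
    X = rng.uniform(low, high, size=(n_init, 1))     # initial design points
    y = np.array([objective(float(x)) for x in X[:, 0]])
    gp = GaussianProcessRegressor(normalize_y=True)  # (i) reduced-order model
    for _ in range(n_iter):
        gp.fit(X, y)                                 # (i)/(iv) refit surrogate
        cand = rng.uniform(low, high, size=(1000, 1))
        mu, sigma = gp.predict(cand, return_std=True)
        best = y.min()
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # acquisition (EI)
        j = int(np.argmax(ei))
        x_next = cand[j:j + 1]                       # (ii) best point in the model
        y_next = objective(float(x_next[0, 0]))      # (iii) run true objective
        X = np.vstack([X, x_next])                   # (iv) update the data
        y = np.append(y, y_next)                     # (v) repeat
    i = int(np.argmin(y))
    return float(X[i, 0]), float(y[i])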
IV. Dataset description & Basics of polymer extrusion

A. Denier

Denier is a weight measurement that usually refers to the thickness of threads. It is the weight, in grams, of 9 kilometers of a single fiber. If a 9 km fiber weighs 1 gram, the fiber has a denier of 1, or 1D. A fiber weighing less than 1 gram per 9 km is called a microfiber [22]. Microfibers have become a new development trend in the synthetic polymer industry. The higher the denier, the thicker and stronger the fiber. Conversely, a lower denier means that the fiber or fabric will be softer and more transparent. Fine denier fibers are becoming a new standard and are very useful for the development of new textiles with excellent performance [21].

B. Breaking Elongation (%)

Elongation at break is one of the few main quality parameters of any synthetic fiber [24]. It is the percentage of elongation at the breaking point. Fiber elongation partly reflects the extent to which a filament stretches under a certain loading condition. Fibers with high elongation at break are easily stretched under a predetermined load; fibers showing this characteristic are known to be flexible. The elongation behavior of any single fiber can be complex because of the multiplicity of structural factors affecting it. Moreover, a cotton fiber comes with a natural crimp, which is important for fibers to stick together while undergoing other production processes [23]. If L0 is the initial length of the fiber and ΔL_Break the extension at break, the percentage breaking elongation is:

E_Breaking elongation = (ΔL_Break / L0) × 100%

Breaking elongation for cotton fiber may vary from 5% to 10%, which is significantly lower than that of wool fibers (25%-45%) and much lower than that of polyester fibers (typically over 50%).
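The two definitions above reduce to one-line formulas; the following Python helpers, with illustrative numbers only, make them concrete.

def denier(mass_g, length_km):
    # Denier: grams of fiber per 9 km; below 1 it is a microfiber [22].
    return mass_g * 9.0 / length_km

def breaking_elongation_pct(delta_l_break, l0):
    # Percentage elongation at break: (extension at break / initial length) * 100.
    return delta_l_break / l0 * 100.0

# Illustrative numbers: 1 g per 9 km -> 1 D; a 10 mm filament that breaks
# after stretching 0.7 mm -> 7 %, inside the 5-10 % range quoted for cotton.
print(denier(1.0, 9.0))                  # 1.0
print(breaking_elongation_pct(0.7, 10))  # 7.0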
C. Breaking force (cN) and Tenacity (cN/tex)

Breaking tenacity is the maximum load that a single fiber can withstand before breaking. For polypropylene and PET staple fibers, sample filaments of 10 mm length are drawn until failure. Breaking tenacity is measured in grams/denier. Very small forces are encountered when evaluating fiber properties, so an instrument with gram-level accuracy is required [25]. The tenacity of virgin PP fibers is about 5-8 g/den, and their elongation at break is about 100%. The tenacity of recycled PET, meanwhile, is about 3.5-5.7 g/den, and its elongation at break usually exceeds 100%.

D. Draw Ratio

The draw ratio is the ratio of the diameter of the initial blank form to the diameter of the drawn part. The limiting draw ratio (capstan speed/nip reel speed) for the extruder section is between 1.6 and 2.2 [26], whereas for the stretching section it is between 3 and 4.

V. Results

A. MSE for default hyperparameters

We ran the code in Google Colaboratory and obtained MSE values of 44.8%, 3653.6%, 3100.7%, and 713.7%, respectively, for Random Forest, Support Vector Machine, K-Nearest Neighbors, and Artificial Neural Network.

B. MSE for Grid Search

Table II. MSE & Cycle time for Grid Search technique

Name | MSE | Cycle time (s)
RF | 1.053 | 5
SVM | 0.927 | 2
KNN | 29.45 | 1
ANN | 0.475 | 5
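A minimal sketch of how such a default-hyperparameter baseline can be obtained with scikit-learn is shown below; the synthetic dataset stands in for the extrusion dataset of Section IV, so the printed numbers will not reproduce the values above, and the ANN baseline would be built analogously in TensorFlow.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the polymer extrusion dataset.
X, y = make_regression(n_samples=500, n_features=8, noise=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "RF": RandomForestRegressor(random_state=0),  # default hyperparameters
    "SVM": SVR(),
    "KNN": KNeighborsRegressor(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: MSE = {mse:.3f}")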
C. MSE for Bayesian Optimization with Gaussian Process (BO-GP)

The MSE values for BO-GP are 723.5%, 648.5%, 2945%, and 30.8%, respectively, for Random Forest, Support Vector Machine, K-Nearest Neighbors, and Artificial Neural Network.

Table IV. MSE & Cycle time for BO-GP Search technique

Name | MSE | Cycle time (s)
RF | 7.235 | 2
SVM | 6.485 | 3
KNN | 29.45 | 2
ANN | 0.308 | 5

[Summary table comparing all models across HPO techniques, partially visible in the source:]

Name | MSE | Cycle time (s)
KNN, RS | 29.45 | 2
KNN, BO-GP | 29.45 | 2
ANN, default HPs | 7.137 | 5
ANN, GS | 0.475 | 5
ANN, RS | 0.304 | 5
ANN, BO-GP | 0.308 | 5

[Figure: "MSE for HPO's" - MSE of RF, SVM, KNN, and ANN plotted against the performance horizon (1-4), with a linear trend line for RF.]

[Figure: "Cycle time for HPO's" - cycle time in seconds for each model and HPO technique.]
References

[1] Cho, H., Kim, Y., Lee, E., Choi, D., Lee, Y., & Rhee, W. (2020). Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks. IEEE Access, 8, 52588-52608. doi:10.1109/access.2020.2981072
[2] J.S. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyperparameter optimization, in: Advances in Neural Information Processing Systems, 2011, pp. 2546-2554.
[3] Amirabadi, M., Kahaei, M., & Nezamalhosseini, S. (2020). Novel suboptimal approaches for hyperparameter tuning of deep neural network [under the shelf of optical communication]. Physical Communication, 41, 101057. doi:10.1016/j.phycom.2020.101057
[4] F. Hutter, J. Lücke, L. Schmidt-Thieme, Beyond manual tuning of hyperparameters, DISKI 29 (4) (2015) 329-337.
[5] F. Friedrichs, C. Igel, Evolutionary tuning of multiple SVM parameters, Neurocomputing 64 (2005) 107-117.
[6] R.G. Mantovani, A.L. Rossi, J. Vanschoren, B. Bischl, A.C. De Carvalho, Effectiveness of random search in SVM hyper-parameter tuning, in: 2015 International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1-8.
[7] L. Li, A. Talwalkar, Random search and reproducibility for neural architecture search, 2019, arXiv preprint arXiv:1902.07638.
[8] K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, K. Leyton-Brown, Towards an empirical foundation for assessing Bayesian optimization of hyperparameters, in: NIPS Workshop on Bayesian Optimization in Theory and Practice (Vol. 10, 3), 2013.
[9] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, Y. Bengio, An empirical evaluation of deep architectures on problems with many factors of variation, in: Proceedings of the 24th International Conference on Machine Learning, ACM, 2007, pp. 473-480.
[10] J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, et al., Scalable Bayesian optimization using deep neural networks, in: International Conference on Machine Learning, 2015, pp. 2171-2180.
[11] Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient global optimization of expensive black-box functions. J. Glob. Optim. 1998, 13, 455-492.
[12] Calandra, R.; Peters, J.; Rasmussen, C.E.; Deisenroth, M.P. Manifold Gaussian processes for regression. In Proceedings of the 2016 International Joint Conference on Neural Networks, Vancouver, BC, Canada, 24-29 July 2016; pp. 3338-3345.
[13] Andradóttir, S.: A review of random search methods. In: Handbook of Simulation Optimization, pp. 277-292. Springer (2015).
[14] Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research 18 (2018) 1-52.
[15] A. Klein, S. Falkner, J.T. Springenberg, and F. Hutter. Learning curve prediction with Bayesian neural networks. In International Conference on Learning Representations (ICLR), 2017.
[16] J.S. Bergstra, Y. Bengio. Random Search for Hyper-Parameter Optimization, Journal of Machine Learning Research 13 (2012) 281-305.
[17] Zhang, H., Chen, L., Qu, Y., Zhao, G., & Guo, Z. (2014). Support Vector Regression Based on Grid-Search Method for Short-Term Wind Power Forecasting. Journal of Applied Mathematics, 2014, 1-11. doi:10.1155/2014/835791
[18] Ghawi, R., & Pfeffer, J. (2019). Efficient Hyperparameter Tuning with Grid Search for Text Categorization using kNN Approach with BM25 Similarity. Open Computer Science, 9(1), 160-180. doi:10.1515/comp-2019-0011
[19] Beyramysoltan, S., Rajkó, R., & Abdollahi, H. (2013). Investigation of the equality constraint effect on the reduction of the rotational ambiguity in three-component system using a novel grid search method. Analytica Chimica Acta, 791, 25-35. doi:10.1016/j.aca.2013.06.043
[20] Zhang, X., Chen, X., Yao, L., Ge, C., & Dong, M. (2019). Deep Neural Network Hyperparameter Optimization with Orthogonal Array Tuning. Communications in Computer and Information Science: Neural Information Processing, 287-295. doi:10.1007/978-3-030-36808-1_31
[21] Zhang, C., Liu, Y., Liu, S. et al. Crystalline behaviors and phase transition during the manufacture of fine denier PA6 fibers. Sci. China Ser. B-Chem. 52, 1835 (2009). https://doi.org/10.1007/s11426-009-0242-5
[22] Joe. (2020, May 5). What Is Denier Rating? Why Does It Matter To You? DigiTravelist. https://www.digitravelist.com/what-is-denier-rating/
[23] Elmogahzy, Y. (2018). Tensile properties of cotton fibers. In Handbook of Properties of Textile and Technical Fibres, 223-273. doi:10.1016/B978-0-08-101272-7.00007-9
[24] Tyagi, G.K. (2010). Yarn structure and properties from different spinning techniques. In Advances in Yarn Spinning Technology, 119-154. doi:10.1533/9780857090218.1.119
[25] Blair, K. (2007). Materials and design for sports apparel. Materials in Sports Equipment, 60-86. doi:10.1533/9781845693664.1.60
[26] Swift, K., & Booker, J. (2013). Forming Processes. Manufacturing Process Selection Handbook, 93-140. doi:10.1016/b978-0-08-099360-7.00004-5
[27] Cho, H., Kim, Y., Lee, E., Choi, D., Lee, Y., & Rhee, W. (2020). Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks. IEEE Access, 8, 52588-52608. doi:10.1109/access.2020.2981072
[28] Yang, L., & Shami, A. (2020). On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 415, 295-316. doi:10.1016/j.neucom.2020.07.061
[29] Chan, S., & Treleaven, P. (2015). Continuous Model Selection for Large-Scale Recommender Systems. Handbook of Statistics: Big Data Analytics, 107-124. doi:10.1016/b978-0-444-63492-4.00005-8
[30] Menke, W. (2012). Nonlinear Inverse Problems. Geophysical Data Analysis: Discrete Inverse Theory, 163-188. doi:10.1016/b978-0-12-397160-9.00009-6
[31] Brereton, R. (2009). Steepest Ascent, Steepest Descent, and Gradient Methods. Comprehensive Chemometrics, 577-590. doi:10.1016/b978-044452701-1.00037-5
[32] Bergstra, J., Bengio, Y., 2012. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281-305. ISSN 1532-4435. http://dl.acm.org/citation.cfm?id=2188385.2188395
[33] Yu, T., & Zhu, H. (2020). Hyper-Parameter Optimization: A Review of Algorithms and Applications. https://arxiv.org/abs/2003.05689
[34] E. Hazan, A. Klivans, and Y. Yuan, Hyperparameter optimization: a spectral approach, arXiv preprint arXiv:1706.00764, (2017). https://arxiv.org/abs/1706.00764
[35] Hutter, F., Kotthoff, L., & Vanschoren, J. (2019). Automated Machine Learning: Methods, Systems, Challenges. Cham: Springer International Publishing.
[36] Seeger, M. (2004). Gaussian Processes for Machine Learning. International Journal of Neural Systems, 14(02), 69-106. doi:10.1142/s0129065704001899
[37] Hutter, F., Hoos, H.H., Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. In: Coello, C.A.C. (ed.) Learning and Intelligent Optimization. LION 2011. Lecture Notes in Computer Science, vol 6683. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25566-3_40
[38] Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B. (2011). Algorithms for hyper-parameter optimization. Adv Neural Inf Process Syst (NIPS) 24:2546-2554.
Polyvinyl Alcohol/Cassava Starch/Nano-CaCO3 based Nano-Biocomposite Films: Mechanical and Optical Properties

Eslem Kavas
Department of Polymer Materials Engineering, Bursa Technical University, Bursa, Turkey
eslemkavas123@gmail.com

Pinar Terzioglu*
Department of Polymer Materials Engineering, Bursa Technical University, Bursa, Turkey
pinar.terzioglu@btu.edu.tr

Abstract— This paper reports the preparation and characterization of nano-CaCO3 incorporated polyvinyl alcohol (PVA)/cassava starch biocomposite films. The films were prepared by the solution casting method using a glass mold. The effect of nano-CaCO3 content (1 and 2%) on the structural, mechanical and optical properties of the biocomposite films was investigated by Fourier transform infrared (FTIR) spectroscopy, universal testing machine, and ultraviolet-visible-near infrared (UV-VIS-NIR) spectroscopy analyses, respectively. The mechanical tests revealed that 1% nano-CaCO3 made no change to the tensile strength of the PVA/cassava starch films, while 2% nano-CaCO3 addition decreased the tensile strength of the films. The transparency of the films slightly increased with nano-filler addition.

Keywords— biocomposite, biobased polymers, calcium carbonate, nano filler

I. Introduction

Materials research in the packaging industry focuses on the use of biodegradable polymers to overcome the environmental problems caused by non-degradable plastics [1]. Among biodegradable polymers, PVA has been preferred in the commercial packaging industry owing to its excellent film-forming capability, oil resistance, and mechanical properties [2],[3]. However, blending PVA with different polymers seems to be a good solution to its relatively high price [2]. Due to its wide availability from natural sources, biodegradability, low cost, and non-toxicity, starch is usually used to blend with PVA [1],[2].

When PVA and starch, which have excellent compatibility with each other, are combined, the final materials present good physicomechanical features as well as cost competitiveness [4],[5]. However, investigations are ongoing to obtain polymer materials with enhanced characteristics. An efficient method of widening the functionality of PVA-starch materials is the incorporation of a small amount of a functional filler into the polymer matrix [6].

Nano-biocomposites are an important class of hybrid materials that can be obtained by incorporating a nano-sized filler (nanofiller) into a bio-based matrix [4]. The combined use of eco-friendly polymers and nano-sized fillers, in order to obtain synergistic effects, is considered one of the most innovative ways to improve the features of polymer materials [4],[7]. Nanofillers can be classified into three different types: nano-fibers, nano-particles, and nano-plates [8]. Among the nano-particles used for such applications, nano-calcium carbonate generally seems to be a good candidate, exhibiting unique properties such as cost-effectiveness, odorlessness, and high thermal stability [9],[10].

Previous studies that used nano-CaCO3 as a filler presented promising results in enhancing the mechanical [11],[12],[13],[14], oxygen barrier [12], and thermal properties [12] of hybrid nano-biocomposite materials. Sudhir et al. prepared rice starch/PVA/CaCO3 composite films by the solution casting method and investigated the effect of the calcium carbonate amount on the fire retardant, tensile strength, and thermal properties of the nanobiocomposite film [12]. It was reported that the thermal and tensile properties of the composites increased with increasing nano-CaCO3 loading, with the highest values reached at a CaCO3 concentration of 10 wt.%. Furthermore, the oxygen permeation of the composite films was reduced as the filler amount increased; therefore, the developed material was suggested for use in packaging applications. In another study, Fukuda et al. used calcium carbonate particles as inorganic fillers to enhance the properties of poly(l-lactide) (PLLA) composite films [13]. Higher Young's modulus values were obtained for the films incorporating 10 wt.% of CaCO3 particles compared with the pure PLLA film. Another work, by Sun et al. [14], characterized corn starch based films impregnated with nano-CaCO3 particles. The results showed that the tensile strength of the films increased with increasing CaCO3 content up to 0.06%; however, 0.1% and 0.5% loadings of CaCO3 led to a decrease in the tensile strength of the films. These studies showed that the properties of the final composites can vary, mostly due to the polymer matrix composition, the filler-matrix interaction, and the distribution of the filler.

To the best of our knowledge, the addition of nano-CaCO3 to PVA/cassava starch biocomposite films has not been reported to date. In the present work, we evaluate the effect of nano-CaCO3 content on the performance of PVA/cassava starch biocomposite films. The biocomposite films were compared in terms of their structural, mechanical, and optical properties.

II. Materials and Methods

A. Materials

Polyvinyl alcohol with an 87.16% degree of hydrolysis and 95.4% purity was obtained from Zag Kimya, Turkey. Cassava starch was purchased from Tito, Turkey. Calcium carbonate (CaCO3) nanopowder was obtained from Adaçal Industrial Minerals Company (Afyon, Turkey). Glycerol was purchased from Merck. Citric acid was obtained from Aksu Company, Turkey. Tween 80 was obtained from Sigma-Aldrich Company. The solutions were prepared using distilled water.

B. Preparation of the biocomposite films

The preparation of three different biocomposite films was carried out using the solution casting method (Figure I), according to the method of Terzioglu and Sıcak (2021) with some modifications [15]. The PVA powder (8% w/v) and
distilled water were mixed using a hotplate magnetic stirrer at 80 °C. Cassava starch (2% w/v) was gelatinized in distilled water. The two mixtures were combined and then stirred at 600 rpm and heated at 70 °C for 60 min. After the mixture cooled to 50 °C, citric acid solution (10% wt. of total polymer weight) was added and stirred for 30 minutes. Then, glycerol (20% wt. of total polymer weight) and Tween 80 were poured into the mixture with additional constant stirring for 15 minutes, respectively. Finally, nano-CaCO3 powder at concentrations of 1 and 2% (wt. of total polymer weight) was added to the PVA/cassava starch film-forming mixture and stirred for 30 minutes. 35 mL of the obtained film-forming mixture was poured into a 12 cm glass petri dish and dried at 40 °C for 24 hours. PVA/cassava starch without calcium carbonate was also prepared as a control film. The films prepared with 0, 1 and 2% nano-CaCO3 were named PSC-0, PSC-1 and PSC-2, respectively.

Mechanical analysis: The mechanical features of the samples were determined following the ASTM D 882 procedure using a Shimadzu AGS-X tester. Mechanical tests were repeated five times for each sample.

Ultraviolet-Visible-Near infrared spectroscopy analysis: The UV-VIS-NIR spectral analysis was performed using a UV-Vis spectrophotometer (Shimadzu UV-3600, Japan) with transmittance spectra in the range of 200-800 nm. The film samples were cut into 2.5 x 3.0 cm rectangles and placed in the spectrophotometer. The transparency of the biocomposite films was calculated according to the following equation:

Transparency = log(%T600) / y    (1)

where y is the thickness of the film (mm) and %T600 is the percent transmittance at 600 nm [16].
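As a quick illustration of equation (1), the following Python snippet computes the transparency for assumed values of %T600 and film thickness; the numbers are illustrative, not measurements from this study.

import math

def transparency(t600_percent, thickness_mm):
    # Equation (1): Transparency = log10(%T600) / y
    return math.log10(t600_percent) / thickness_mm

# Illustrative values: 85 % transmittance at 600 nm and a 0.28 mm thick
# film give ~6.9, the same order as the 6.65-7.47 reported below.
print(round(transparency(85.0, 0.28), 2))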
III. Results and Discussion

A. Structural Properties of the biocomposite films

FTIR spectroscopy was used to determine the interaction between the PVA, cassava starch, and nano-CaCO3. The FTIR spectra of the biocomposite films are presented in Figure II. The spectrum of PSC-0 displayed a characteristic band at 1712 cm-1 attributed to the C=O stretching vibration, which is typical for the ester bonds and carboxyl groups of citric acid [17]. In the infrared spectra of the films, the stretching vibrations of O-H groups appeared at 3285 cm-1 [18]. The peaks located at 1420, 1373 and 1324 cm-1 were related to CH2 bending, CH2 deformation and CH2 stretching vibrations, respectively [2],[19]. The peak at 1240 cm-1 represents C-H wagging vibrations [18], while the peak at 1085 cm-1 can be assigned to C-O stretching in C-O-H [18]. The band at 841 cm-1 occurred due to the rocking vibration of CH2 and the asymmetric stretching of C-O-C [20].

Figure II. FTIR spectra of biocomposite films.

B. Mechanical Properties of the biocomposite films

The mechanical properties of the films are given in Figure III. The calculated values of tensile strength, elongation at break, and Young's modulus of the biocomposite films are summarized in Table I. Tensile strength and elongation at break values varied in the ranges of 22.50±1.7-27.74±1.1 MPa and 186.75±17.3%-278.08±14.4%, respectively. The data showed that 1% nano-CaCO3 incorporation increased the elongation capacity and flexibility of the control films while leaving the tensile strength unchanged. However, the further
increase of nanofiller content from 1% to 2% caused a decrease in the elongation at break and tensile strength values. This is likely due to agglomeration of the nanofiller at higher concentrations [16]. Young's modulus of the films varied in the range of 38.28±2.0-82.30±5.3 MPa. Specifically, nano-CaCO3 at 1% led to a drop in the Young's modulus value, while an increase occurred as its content reached 2%.

The capability to deform is particularly favorable in some industries, including agriculture, cosmetics, and food packaging, for fabricating elastic and flexible products suited to their applications [22]. Therefore, when the elongation at break results are examined, the 1% nano-CaCO3 loaded sample can be suggested as the most flexible film among the three produced samples.

Table I. Mechanical properties of biocomposite films.

Sample | Tensile strength (MPa) | Elongation at break (%) | Young's modulus (MPa)
PSC-0 | 27.74±1.1 | 258.10±1.9 | 61.08±1.6
PSC-1 | 27.54±1.9 | 278.08±14.4 | 38.28±2.0
PSC-2 | 22.50±1.7 | 186.75±17.3 | 82.30±5.3

Figure III. Tensile strength, elongation at break and Young's modulus graphs of biocomposite films.

C. Optical Properties of the biocomposite films

The transparency of the films is a significant quality parameter, because foods tend to oxidize at 200-280 nm, which causes oxidative deterioration, discoloration, and off-flavor or rancidity [16]. Food packaging films are expected not only to have UV-proof properties but also to be transparent enough to visible light [16],[23]. Therefore, determining the transmission percentage and transparency of films is important.

The visual appearance of the biocomposite films is presented in Figure IV. The transparency of the films was found to be 6.65, 7.36 and 7.47 for PSC-0, PSC-1 and PSC-2, respectively. The nanofiller incorporation slightly increased the transparency of the films (Figure V). This result may be related to a reduction of the matrix crystallinity [24]. The transparency of the PVA/cassava starch film (PSC-0) was slightly lower than the transparency of the PVA/corn starch film (6.71) developed by Terzioğlu and Parın [17]. The addition of nano-CaCO3 to the PVA/cassava starch matrix resulted in a small change in the transmittance in the range of 200-300 nm. Additionally, the transmission rate of the biocomposite film notably increased in the visible region, especially with 2% nano-CaCO3 addition. Considerable transparency was still obtained (89~90% at 800 nm) for all developed PVA/starch biocomposite films.

Acknowledgment

This research is financially supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) with the 2209-B Undergraduate Research Projects Grant. The authors acknowledge Adaçal Industrial Minerals Company for supplying the calcium carbonate.
IV. Conclusion

Nano-CaCO3 filler incorporated PVA/cassava starch films were successfully prepared using a green route (the solution casting method). FTIR characterization results demonstrated a similar structural composition for all developed films, with the same functional groups; this fact reflects the physical interactions of the nanofiller and the polymer matrix. Nanofiller incorporation at 1% enhanced the elongation at break of the films. The transparency of the films increased with nanofiller incorporation at both concentrations. Further scientific research is still needed to find the best nano-CaCO3 incorporation content for this polymer matrix, to obtain nano-biocomposite films with better functional properties.

References

[1] Natwat Srikhao, Pornnapa Kasemsiri, Artjima Ounkaew, Narubeth Lorwanishpaisarn, Manunya Okhawilai, Uraiwan Pongsa, Salim Hiziroglu, Prinya Chindaprasirt, "Bioactive Nanocomposite Film Based on Cassava Starch/Polyvinyl Alcohol Containing Green Synthesized Silver Nanoparticles", Journal of Polymers and the Environment, Volume 29, 2021.
[2] Hairul Abral, Angga Hartono, Fadli Hafizulhaq, Dian Handayani, Eni Sugiarti, Obert Pradipta, "Characterization of PVA/cassava starch biocomposites fabricated with and without sonication using bacterial cellulose fiber loadings", Carbohydrate Polymers, Volume 206, 2019.
[3] Farah Fahma, Sugiarto, Titi Candra Sunarti, Sabrina Manora Indriyani, and Nurmalisa Lisdayana, "Thermoplastic Cassava Starch-PVA Composite Films with Cellulose Nanofibers from Oil Palm Empty Fruit Bunches as Reinforcement Agent", International Journal of Polymer Science, Volume 2017, 2017.
[4] Maria-Cristina Popescu, Bianca-Ioana Dogaru, Mirela Goanta, Daniel Timpu, "Structural and morphological evaluation of CNC reinforced PVA/Starch biodegradable films", International Journal of Biological Macromolecules, Volume 116, 2018.
[5] M. Lubis, A. Gana, S. Maysarah, M.H.S. Ginting, M.B. Harahap, "Production of bioplastic from jackfruit seed starch (Artocarpus heterophyllus) reinforced with microcrystalline cellulose from cocoa pod husk (Theobroma cacao L.) using glycerol as plasticizer", IOP Conference Series: Materials Science and Engineering, 309, 2018.
[6] Nataliya E. Kochkina, Olga A. Butikova, "Effect of fibrous TiO2 filler on the structural, mechanical, barrier and optical characteristics of biodegradable maize starch/PVA composite films", International Journal of Biological Macromolecules, Volume 139, 2019.
[7] Frédéric Chivrac, Eric Pollet, Patrice Dole, Luc Avérous, "Starch-based nano-biocomposites: Plasticizer impact on the montmorillonite exfoliation process", Carbohydrate Polymers, Volume 79, Issue 4, 2010.
[8] HS Mahadevaswamy, B Suresha, "Role of nano-CaCO3 on mechanical and thermal characteristics of pineapple fibre reinforced epoxy composites", Materials Today: Proceedings, Volume 22, Issue 3, 2020.
[9] Pınar Terzioğlu, "Electrospun Chitosan/Gelatin/Nano-CaCO3 Hybrid Nanofibers for Potential Tissue Engineering Applications", Journal of Natural Fibers, 2021. DOI: 10.1080/15440478.2020.1870639
[10] Kalyani Prusty, Sarat K Swain, "Nano CaCO3 imprinted starch hybrid polyethylhexylacrylate/polyvinylalcohol nanocomposite thin films", Carbohydrate Polymers, Volume 139, 2016.
[11] Carla Vilela, Carmen S.R. Freire, Paula A.A.P. Marques, Tito Trindade, Carlos Pascoal Neto, Pedro Fardim, "Synthesis and characterization of new CaCO3/cellulose nanocomposites prepared by controlled hydrolysis of dimethylcarbonate", Carbohydrate Polymers, Volume 79, Issue 4, 2010.
[12] Sudhir K. Kisku, Niladri Sarkar, Satyabrata Dash, Sarat K. Swain, "Preparation of Starch/PVA/CaCO3 Nanobiocomposite Films: Study of Fire Retardant, Thermal Resistant, Gas Barrier and Biodegradable Properties", Polymer-Plastics Technology and Engineering, Volume 53, Issue 16, 2014.
[13] Norio Fukuda, Hideto Tsuji, Yasushi Ohnishi, "Physical properties and enzymatic hydrolysis of poly(l-lactide)-CaCO3 composites", Polymer Degradation and Stability, Volume 78, 2002.
[14] Qingjie Sun, Tingting Xi, Ying Li, Liu Xiong, "Characterization of Corn Starch Films Reinforced with CaCO3 Nanoparticles", PLOS ONE, Volume 9, Issue 9, 2014.
[15] Pınar Terzioglu, Yusuf Sıcak, "Citrus Limon L. Peel Powder Incorporated Polyvinyl Alcohol/Corn Starch Antioxidant Active Films", Journal of the Institute of Science and Technology, Volume 11, Issue 2, 2021.
[16] Shaoxiang Lee, Meng Zhang, Guohui Wang, Wenqiao Meng, Xin Zhang, Dong Wang, Yue Zhou, Zhonghua Wang, "Characterization of polyvinyl alcohol/starch composite films incorporated with p-coumaric acid modified chitosan and chitosan nanoparticles: A comparative study", Carbohydrate Polymers, Volume 262, 2021.
[17] Pınar Terzioğlu, Fatma Nur Parın, "Polyvinyl Alcohol-Corn Starch-Lemon Peel Biocomposite Films as Potential Food Packaging", Celal Bayar University Journal of Science, Volume 16, Issue 4, 2021.
[18] Phetdaphat Boonsuk, Apinya Sukolrat, Kaewta Kaewtatip, Sirinya Chantarak, Antonios Kelarakis, Chiraphon Chaibundit, "Modified cassava starch/poly(vinyl alcohol) blend films plasticized by glycerol: Structure and properties", Volume 137, Issue 26, 2020.
[19] Priyanka Rani, M Basheer Ahamed, Kalim Deshmukh, "Dielectric and electromagnetic interference shielding properties of carbon black nanoparticles reinforced PVA/PEG blend nanocomposite films", Materials Research Express, Volume 7, 2020.
[20] Anida M.M. Gomes, Paloma L. da Silva, Carolina de L. e Moura, Claudio E.M. da Silva, Nágila M.P.S. Ricardo, "Study of the Mechanical and Biodegradable Properties of Cassava Starch/Chitosan/PVA Blends", Macromolecular Symposia, Volume 299-300, Issue 1, 2011.
[21] S. El-Sherbiny, S.M. El-Sheikh, A. Barhoum, "Preparation and modification of nano calcium carbonate filler from waste marble dust and commercial limestone for papermaking wet end application", Powder Technology, Volume 279, 2015.
[22] H.P.S. Abdul Khalil, E.W.N. Chong, F.A.T. Owolabi, M. Asniza, Y.Y. Tye, H.A. Tajarudin, M.T. Paridah, S. Rizal, "Microbial-induced CaCO3 filled seaweed-based film for green plasticulture application", Journal of Cleaner Production, Volume 199, 2018.
[23] Azam Akhavan, Farah Khoylou, Ebrahim Ataeivarjovi, "Preparation and characterization of gamma irradiated Starch/PVA/ZnO nanocomposite films", Radiation Physics and Chemistry, Volume 138, 2017.
[24] Vincenzo Titone, Francesco Paolo La Mantia, Maria Chiara Mistretta, "The Effect of Calcium Carbonate on the Photo-Oxidative Behavior of Poly(butylene adipate-co-terephthalate)", Macromolecular Materials and Engineering, Volume 305, Issue 10, 2020.
Movie Reviews Text Sentiment Analysis Based On Hybrid LSTM and GloVe

Nour Ammar
Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey
noor101ammar@gmail.com

Ali Okatan
Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey
aliokatan@aydin.edu.tr

Abstract— Social media has become a useful resource for critiques and ratings. The mining of vital information that allows governments to maintain public safety and organizations to grow their income has gradually advanced. However, the performance of current techniques requires continuous improvement due to the exponential growth of information. In this paper, the proposed structure is the LSTM neural network, an advanced variant of the RNN. Verbal language information is utilized by converting it into numerical information based on the GloVe dictionary. The approach, applied to natural language processing, achieved acceptable results and delivered reliable predictions at high speed.

Keywords— NLP, LSTM, RNN, GloVe

I. Introduction

Recently, blogs, websites, and social media have come to be considered a powerful means of collecting product reviews. The proposed method achieves maximum benefit at minimum cost and in minimum time by analyzing users' opinions in positive and negative reviews using Long Short-Term Memory (LSTM) algorithms. The required data can be extracted by technical methods such as the one proposed here.

Recurrent Neural Networks (RNNs), introduced by [14], provide high performance in speech processing, speech recognition, speech translation, stock prediction, and the semantic analysis used in this study. The LSTM neural network is a modified architecture of the traditional RNN and is more accurate [1]. The change from RNN to LSTM is the constant backpropagated error flow in the gradients processed in the LSTM network [13]. An LSTM layer consists of a set of recurrently connected blocks called memory blocks. LSTM can learn to bridge time lags of more than 1000 discrete time steps by enforcing a constant error flow through Constant Error Carrousels (CECs) [13]. In the presented approach, a model consisting of two LSTM layers, within a total of seven layers, is used. The results were fairly high and satisfying; accuracy is high for both training and testing, as is the speed of testing, as detailed in Section IV.

II. Related Work

In [1], a text sentiment analysis based on the LSTM model was presented to analyse human emotions. The training data is classified into three categories (negative, positive, and neutral) according to emotion, and then fitted into LSTM models trained for each data category, resulting in multiple LSTM models for the corresponding emotional ratings. The accuracy in [1] was higher than that of traditional RNNs. Another approach is followed in [2], where two models are used together: a Convolutional Neural Network (CNN) with an RNN-LSTM. The CNN model is used to extract local features, while the LSTM model captures long-distance dependencies; the extracted features are combined into a single hybrid CNN-LSTM model. Efficient results were obtained in [2].

Each memory block contains at least one recurrently connected memory cell and three gates, namely the input, output, and forget gates. The gates are multiplicative units and enable continuous write, read, and reset operations. The network can only interact with the cells through the gates [2].

In [3], a solution to the weakness of LSTM networks in processing continuous input streams was proposed via the "learn to forget" algorithm, which refers to a novel adaptive "forget gate". The forget gate allows an LSTM cell to learn to reset itself at appropriate times. The approach proposed in that study is very similar to the approach of [2].

All the above approaches refer back to [6], the approach presented in 1997 by Hochreiter & Schmidhuber, who were the first to define the LSTM as a special kind of RNN with the ability to learn long-term dependencies.

III. Overview, Methods and Tools

The RNN-LSTM model was trained on the IMDB movie reviews dataset on top of the GloVe word embedding dictionary.

A. GloVe word embedding dictionary

GloVe [5] is a word vectorization technique that embeds words into a concise vector space where similar words are found close to each other in clusters, while different words are found far away from each other, as shown in Figure I. GloVe embedding is preferable to Word2vec embedding because GloVe relies on local statistics (the local context information of words) while also incorporating global statistics (word co-occurrence) to obtain word vectors. The dimension of the data is 200. Figure I shows the 3D graph of the embedded words after conversion to sequences, projected by PCA using the TensorBoard tool provided by TensorFlow.

Figure I. 3D PCA diagram of GloVe embedding sequences, rendered with TensorBoard.
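A common way to use such a dictionary, sketched below under the assumption of the standard 200-dimensional GloVe text file, is to read the vectors and build an embedding matrix indexed by the tokenizer's word indices; the file name and helper names are illustrative, not taken from this paper.

import numpy as np

GLOVE_PATH = "glove.6B.200d.txt"   # assumed pre-trained 200-d vectors
EMBED_DIM = 200

def load_glove(path):
    # GloVe text format: one token followed by its vector per line.
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

def embedding_matrix(word_index, vectors):
    # Row i holds the GloVe vector of the word with index i (zeros if unknown).
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM))
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = vectors[word]
    return matrix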
B. RNN-LSTM architecture

Recurrent neural networks (RNNs) [14] are suited to sequences, lists, and continuously sampled data. There have been remarkable successes in applying RNNs to a variety of problems: speech recognition, language modeling, translation, image labeling, and more. LSTMs, introduced in [6] by Hochreiter & Schmidhuber, are a special form of RNN that avoids the long-term dependency problem of traditional RNNs; the LSTM utilizes the tanh activation (1). Moreover, the LSTM model contains four neural network layers, as shown in Figure II, which makes the model more robust against problems such as the vanishing gradient and reduces long dependencies by selecting only essential information from previous cells.

T(x) = (e^x - e^{-x}) / (e^x + e^{-x})    (1)

Figure II. LSTM block showing four interacting layers, gates, and operations.

Figure II clarifies the process of the LSTM block's gates (input, forget, output, and cell update) and the activation functions of the gates, as shown in (2):

(i, f, o, C̃_t) = (σ, σ, σ, tanh) W [h_{t-1}, x_t]    (2)

The operation flow in the LSTM block starts with forgetting irrelevant past information using (3). Next, the new information to be stored is identified: the sigmoid layer in (4) decides which values to update, and the tanh layer in (5) generates a new vector of values that could be added to the state, called "candidate values". The forget operation is then applied to the previous internal cell state by (6), and the new candidate values, scaled by how much each value is to be updated, are added by (7). These two operations are summed in (8), which updates the cell state. After updating the state, the last sigmoid layer (9) decides which parts of the state to output. Finally, a filtered cell state (10) is output, using the tanh layer (1) to squash values between -1 and 1. h_t is at the same time an input to the next cell in the continuous LSTM block chain [6, 7, 16, 17]. (A NumPy sketch of these gate equations appears at the end of this subsection.)

f_t = σ(W_f [h_{t-1}, x_t] + b_f)    (3)
i_t = σ(W_i [h_{t-1}, x_t] + b_i)    (4)
C̃_t = tanh(W_c [h_{t-1}, x_t] + b_c)    (5)
f_t × C_{t-1}    (6)
i_t × C̃_t    (7)
C_t = f_t × C_{t-1} + i_t × C̃_t    (8)
o_t = σ(W_o [h_{t-1}, x_t] + b_o)    (9)
h_t = o_t × tanh(C_t)    (10)

The project was implemented in Python using the TensorFlow platform, a high-performance computation platform.

C. Deep Neural Network Architecture

Figure II shows the basic network structure; the final architecture is given in Figure III. The proposed model contains seven layers, including both the input and the output. To avoid overfitting and improve performance, a dropout process is used: the input and recurrent connections to the LSTM units are excluded from activation and weight updates during network training [8]. A dense layer is used as the output layer, with an activation function, to improve the performance of the approach [9].

1) Layers:
1. A sequence of words with dimension 256x1 is used as input to the network.
2. The second layer multiplies the embedding matrix by the sequences. The output dimension is 256x200.
3. The first LSTM layer has 128 features (neurons). The output dimension is 256x128.
4. The first dropout layer; input and output dimensions are the same, and it is used to reduce overfitting.
5. The second LSTM layer has 32 features, with dimension 32x1; the embedded sequence features are reduced from 200 gradations to 32.
6. The second dropout layer; input and output dimensions are the same, and it is used to reduce overfitting.
7. The output layer is a dense layer with dimension 1x1, giving a positive or negative weighting.

Figure III. Architecture of neural network layers showing the number of inputs and outputs in each layer.
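For concreteness, the gate equations (3)-(10) can be written as a single NumPy step; this is a minimal sketch with illustrative weight containers, not the TensorFlow implementation used in the study.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One LSTM step following equations (3)-(10). Each W[k] maps the
    # concatenated [h_{t-1}, x_t] to one gate's pre-activation.
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])         # (3) forget gate
    i = sigmoid(W["i"] @ z + b["i"])         # (4) input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # (5) candidate values
    c = f * c_prev + i * c_tilde             # (6)-(8) cell state update
    o = sigmoid(W["o"] @ z + b["o"])         # (9) output gate
    h = o * np.tanh(c)                       # (10) filtered cell state
    return h, c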
2) Functions:

a) Activation function: Many activation functions could be used with the proposed model; the sigmoid activation function is chosen in this case. Figure IV shows the sigmoid function, represented by (11):

σ(x) = 1 / (1 + e^{-x})    (11)

Figure IV. The sigmoid function.

b) Optimization function: The optimization function used in this study is Adam optimization, introduced by [11]. It is an adaptive learning-rate method: it computes individual learning rates for different epochs. Adam stands for adaptive moment estimation; it uses estimates of the first and second moments of the gradient to adapt the learning rate for each weight of the neural network. The nth moment of a random variable is the expected value of that variable to the power of n.

θ_{t+1} = θ_t - (η / (√v̂_t + ε)) m̂_t    (12)

c) Loss function: The mean squared error (MSE) is used as the fitness function to minimize the error. It calculates the mean of the squared differences between the predicted and actual values, and training minimizes this loss value [12].

MSE = (1/n) Σ_{i=1}^{n} (y_i - ỹ_i)²    (13)

3) Data: IMDb is a popular movie website. It combines movie plot descriptions, Metascore ratings, critic and user ratings and reviews, release dates, and many other aspects. The system is trained using the IMDB dataset, a CSV file containing 50K textual movie reviews with high polarity, suitable for semantic analysis and Natural Language Processing (NLP). The data fields are the review and its label, which is either 0 or 1, where 0 is negative and 1 is positive, as shown in Table I. This dataset is freely available [4]. The data is preprocessed and tokenized using the nltk and textblob libraries.

Table I. Sample of the IMDB dataset.

4) Training and Testing: In this study, the dataset was divided into three parts: 70% for training, 20% for testing, and 10% for validation. The network is trained on top of the GloVe embedding dictionary [5]. The training time is hardware dependent. The training time and performance of the model were improved by applying five basic measures:

- Adding and discarding layers of the model, such as replacing the output layer with a dense layer.
- Using the TensorFlow platform, which is suitable for high-performance computations.
- Use of the sigmoid activation function.
- Use of the Adam optimization function.
- Use of the MSE loss function.

IV. Experiments, Analysis and Performance

The model was implemented using the open-source framework TensorFlow. Training was performed on a PC with an SSD drive, 16 GB of RAM, and an Intel Core i7-10500 CPU.

A. Platform

Training the network took about 20 minutes in the TensorFlow environment. The flow of tensors (the computations of the model) can be viewed using the TensorBoard tool provided by TensorFlow, as shown in Figure V. Each container (tensor) is a constant, vector, or higher-dimensional data, and each arrow (flow) is a directed flow of computations.
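Combining the seven-layer stack of Section III.C with the sigmoid activation, Adam optimizer, and MSE loss described above yields a model along the following lines; this Keras-style sketch makes assumptions (vocabulary size, dropout rate) that the paper does not state.

import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN, VOCAB, EMBED_DIM = 256, 20000, 200   # VOCAB size is an assumption

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),            # 1. word-index sequence, 256x1
    layers.Embedding(VOCAB, EMBED_DIM),        # 2. embedding lookup, 256x200
    layers.LSTM(128, return_sequences=True),   # 3. first LSTM layer, 256x128
    layers.Dropout(0.5),                       # 4. dropout against overfitting
    layers.LSTM(32),                           # 5. second LSTM layer, 32 features
    layers.Dropout(0.5),                       # 6. second dropout layer
    layers.Dense(1, activation="sigmoid"),     # 7. positive/negative output
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])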
Figure V. TensorBoard graph showing the computation flow of the model.

B. Results

Table II shows the values of the confusion matrix of the model training; as can be seen, the correct predictions outnumber the incorrect ones. Table III shows the ratios of the training, testing, and validation results. The accuracy of the model is 87.1%.

Table II. Confusion matrix of the trained model

 | positive | negative
positive | 4647 | 942
negative | 346 | 4065

Figure VI and Figure VII show the plots of accuracy and loss during the training and testing process. The training and testing curves rise in the accuracy plot, while in the loss plot they decrease, which confirms the good performance of the model.

Table III. Values of training, testing, and validation ratios

 | training | testing | validation
Accuracy | 86.24% | 87.12% | 87.19%
Loss | 9.88% | 11.25% | 10.09%
Score | - | 10.30% | 10.21%
F1 score | - | 86.32% | -

Figure VI. Accuracy of the model during training and testing over 14 epochs.
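As a consistency check on Table II, the accuracy can be recomputed from the four cells; mapping the cells to TP/FN/FP/TN as below (an assumption about the table's orientation) reproduces the 87.12% testing accuracy of Table III, while the F1 value depends on the averaging convention used.

# Cell values from Table II; row = actual class, column = predicted class
# is our assumption about the table's orientation.
tp, fn = 4647, 942
fp, tn = 346, 4065

total = tp + fn + fp + tn                 # 10000 test reviews
accuracy = (tp + tn) / total              # 0.8712 -> 87.12 %, as in Table III
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")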
Figure VII. Loss of the model during training and testing over 14 epochs.

Figure VIII. Example of a sample sequence in the test dataset, predicted with its prediction probability.

Figure VIII shows a selected normalized data sample and the probability of its prediction, which is a true prediction. Figure IX shows a practical example performed by the user: the executed program prompts the user to input a sentence, then the system preprocesses the input and predicts within milliseconds.

Figure IX. Practical example of testing the model.

V. Conclusion

The results show that the use of the LSTM improved both speed and accuracy. The GloVe 200-D dictionary was embedded in the TensorFlow environment. TensorFlow has been very successful in handling high-speed computations. It is expected that the model presented in this study can be modified for real-time emotion recognition in speech.

References

[1] Alex Graves, Jürgen Schmidhuber, Framewise phoneme classification with bidirectional LSTM and other neural network architectures, Neural Networks, Volume 18, Issues 5-6, 2005, Pages 602-610, ISSN 0893-6080. https://doi.org/10.1016/j.neunet.2005.06.042
[2] Rehman, A.U., Malik, A.K., Raza, B. et al. A Hybrid CNN-LSTM Model for Improving Accuracy of Movie Reviews Sentiment Analysis. Multimed Tools Appl 78, 26597-26613 (2019). https://doi.org/10.1007/s11042-019-07788-7
[3] Felix A. Gers, Jürgen Schmidhuber, Fred Cummins; Learning to Forget: Continual Prediction with LSTM. Neural Comput 2000; 12 (10): 2451-2471. doi:https://doi.org/10.1162/089976600300015015
[4] Leon, Stefano, 2020, IMDb movies extensive dataset. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
[5] Pennington, Jeffrey & Socher, Richard & Manning, Christopher. (2014). GloVe: Global Vectors for Word Representation. EMNLP. 14. 1532-1543. 10.3115/v1/D14-1162.
[6] Hochreiter, Sepp & Schmidhuber, Jürgen. (1997). Long Short-term Memory. Neural Computation. 9. 1735-80. 10.1162/neco.1997.9.8.1735.
[7] Karpathy, A., Johnson, J., & Fei-Fei, L. (2015). Visualizing and understanding recurrent networks. arXiv preprint. https://arxiv.org/abs/1506.02078
[8] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
[9] Heaton, J. Applications of Deep Neural Networks. Updated regularly; last update: 21 Jan 2021 (v2). https://arxiv.org/abs/2009.05673
[10] Han, J., Moraga, C. (1995). The influence of the sigmoid function parameters on the speed of backpropagation learning. Lecture Notes in Computer Science, vol 930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59497-3_175
[11] Kingma, D. P., Ba, J. (2014). Adam: A method for stochastic optimization. https://arxiv.org/abs/1412.6980
[12] Alejo, R., García, V. & Pacheco-Sánchez, J.H. An Efficient Over-sampling Approach Based on Mean Square Error Back-propagation for Dealing with the Multi-class Imbalance Problem. Neural Process Lett 42, 603-617 (2015). https://doi.org/10.1007/s11063-014-9376-3
[13] Ravi, Vinayakumar & Kp, Soman & Poornachandran, Prabaharan. (2017). Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS). International Journal of Information System Modeling and Design. 8. 43-63. https://doi.org/10.4018/IJISMD.2017070103
[14] Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. Learning internal representations by error propagation. California Univ San Diego La Jolla Inst for Cognitive Science, 1985.
[15] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. https://arxiv.org/abs/1301.3781
[16] Li, F., Johnson, J., Yeung, S. (2017). Lecture 10: Recurrent Neural Networks [PowerPoint slides]. Stanford University School of Engineering, Convolutional Neural Networks for Visual Recognition, CS231n. http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture10.pdf
[17] Jiang, C., Chen, Y., Chen, S., Bo, Y., Li, W., Tian, W., & Guo, J. (2019). A Mixed Deep Recurrent Neural Network for MEMS Gyroscope Noise Suppressing. Electronics, 8(2), 181. MDPI AG. http://dx.doi.org/10.3390/electronics8020181
Simulation Comparative Study to Highlight the Relation Between Building Form and Energy Consumption

Omar ALGBURI
Department of Architecture Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
omar.algburi@izu.edu.tr

Bahar FERAH
Department of Architecture Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
baharak.fareghi@izu.edu.tr

Abstract— Building construction is one of the causes of climate change. Architects who design energy-efficient and sustainable buildings lead the way in establishing solutions to climate change. Decisions concerning building form design and their impact on energy efficiency are the main motive behind this work. This study examines the role of building form in the energy consumption of two proposed fully glazed office buildings, a cubic one (rectangular plan layout) and a cylinder one (circular plan layout). AutoCAD software was used to draw the plan layouts; the two proposed forms were then modeled and thermally analyzed using DesignBuilder software. The results show that the energy consumption of a fully glazed facade building strongly depends on its solar heat gain through exterior windows, building orientation, and the amount of glazing shading. Furthermore, among the studied factors, building form plays the most critical role in determining a building's energy consumption. Finally, suggestions are presented to raise architects' awareness of designing energy-efficient building forms.

I. Introduction

Is there a significant and direct effect of building form on energy consumption? This is the central question behind this study. In the first stage of designing a building (the conception stage), engineers and architects face numerous economic, social, technical, ecological, and aesthetic constraints. Indeed, designing a building is a complicated task, and making it energy efficient even more so. Moreover, the concern for pollution reduction, fighting climate change, and energy savings must remain among the principles designers observe. When designers develop their project's concept in this first stage, they need operative knowledge of energy-efficient design principles.

The form and size of a building have a significant influence on energy consumption [1],[2],[3]. As stated in the study of Kocagil et al. [4], the building form and the envelope complexity directly impact the total heat loss and gain, and consequently the energy consumption. The proportion between the building's thermal envelope area (A) and its volume (V) is known as the A/V ratio. The form factor of a building relies on this proportion, which measures the compactness of the building. The thermal envelope represents the separation between the outdoor (unconditioned) and indoor (conditioned) environments. The value of this form factor characterizes the form of the building for a given volume. The study of Danielski et al. (2012) explained the value of the form factor as illustrated in Figure I. In this figure, the parameter 'a' symbolizes the unit of form length. Although all the buildings (A, B, C, D) have similar volumes, they have different form factors because of their different thermal envelope areas [2]. The form factor also depends on the building size: building (C) is larger but has a lower form factor than building (A), although they have the same form. Irregular forms with open balconies that extend beyond the facade may also increase the form factor, as illustrated by building D in Figure I.

Figure I. Different sizes and forms with the form factor of each building [2].

The concept of the building form factor is linked to heat losses and gains in buildings, and hence to energy consumption [3]. Buildings with a larger envelope surface area in proportion to their volume will have a higher form factor, maximizing heat losses. In contrast, buildings with lower form factors have a lower specific heat demand [5].

Many studies have addressed the building geometry and form factors that affect the energy consumption of buildings [6][7][8]. The study in [8] proposed a methodology to determine building form in relation to the opaque component U-value, represented by the A/V ratio. A survey conducted by Joelsson et al. (2012) showed the vital impact of the form factor on final energy demand in residential buildings [2]. Another study [9] shows that space boundaries and building geometry significantly influence energy consumption factors. In addition, in the study by Al-Anzi et al. [6], different office building forms were developed to study each form's energy demand. Many studies have used the energy simulation method for optimizing the building form in terms of its energy consumption [10][11][12][13][14][15][16]. However, many factors can impact the energy consumption of a building. For example, the ASHRAE 90.1 standard of the American Society of Heating, Refrigerating, and Air-Conditioning
Engineers [17] identifies five main factors that impact the energy consumption of a building: (1) mechanical systems; (2) building envelope; (3) power generation systems; (4) water heating; and (5) lighting systems. The role of the designer here is to manage these factors and design energy-efficient buildings. Other factors such as the window-to-wall ratio [18], glazing type [19][20], solar heat gain coefficient (SHGC) [21][22], thermal insulation of materials [23][24], sun shading [25], and surface coloring [26] also have a significant effect on the energy consumption of buildings.

A. Energy consumption in buildings

To understand energy consumption in buildings, we should first understand how a building uses energy. According to many studies, most of the energy consumed in buildings is essentially used to provide an acceptable level of thermal comfort for users: cooling or heating the indoor air using air conditioning units, or providing fresh air by ventilation. Other energy uses are domestic hot water, artificial light, household appliances, and other electrical equipment (refrigerators, computers, TVs, etc.). Lighting, an essential part of every indoor environment, often consumes less energy than other electric appliances.

Heating or cooling anything (air, water, or food) is an energy-intensive process, and as such, appliances related to heating or cooling are often energy-intensive. Air conditioners and space heating appliances have substantial power requirements, as do cooking ovens, toasters, and microwaves. However, devices like the blender, steam iron, and toaster are only used for short intervals, which reduces their energy use compared to appliances like the air conditioner, which runs for a much longer duration. The largest domestic energy requirement in a building is space heating and cooling. According to [24], wisely choosing the wall and window materials and the architecture of the enclosed space can significantly reduce the energy required for space heating and cooling. Heat energy from the outside air passes through the walls and windows to the inside of the building; the energy is transferred by vibrations of neighboring molecules in the walls and windows, which leads to an increase in the indoor temperature. This mechanism of heat transfer is called conduction, and the power transferred through it, denoted by Q, is given by the following equation (1):

Q = U · A · ΔT    (1)

where
Q: heat transferred, in W
U: overall heat transfer coefficient, in W/m2·°C
A: heat transfer area, in m2
ΔT: temperature difference across the wall surfaces, in °C
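A small illustration of equation (1): the snippet below computes the conducted power for an assumed single-glazed window; the U-value, area, and temperature difference are illustrative, not values from this study.

def conduction_power(u, area, delta_t):
    # Equation (1): Q = U * A * dT, result in watts.
    return u * area * delta_t

# Illustrative only: single glazing at U = 5.8 W/m2.C, 10 m2 of glass,
# and an 8 C indoor/outdoor difference conduct 464 W.
print(conduction_power(5.8, 10.0, 8.0))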
B. Energy modeling and simulation

Energy modeling is the process of computerizing the parameters of a building in order to perform an energy simulation, while energy simulation predicts building energy performance using software analysis [27].

According to [27], building parameters such as orientation, thermal properties, and envelope properties involve computational factors that make energy simulation a complex process. Energy simulation exists mainly to help the designer make the right decisions in the early design phase. According to many recent studies, evaluating and analyzing building energy performance leads to energy-efficient design; design improvements or alternatives can then be established by the design team to enable energy savings. However, building different form models in order to identify energy-saving alternatives is often not carried out correctly. In addition, adapting simulation models (i.e., input data) usually ends in various coding errors, as it involves manual or semi-manual data representation [28]. A gradual simulation method of a confirmed system, with a validation methodology for the building energy analysis, should be applied by an expert team [29].

II. Research Methodology

The energy simulation method, using DesignBuilder software, was applied to test the impact of building form on the reduction rate of energy consumption. Energy simulation analyses were performed to evaluate the two proposed building forms, cylinder (circular plan) and cubic (square plan), in terms of energy consumption. Two-dimensional modeling of the buildings in AutoCAD was the first step; importing the DXF models of the two design alternatives into the DesignBuilder simulation software was the second step.

The simulation method is a sequence of three related phases, as shown in Figure II.

Figure II. A diagram showing the steps of the simulation methodology.

This study used a comparative thermal analysis between two different forms of office building, cubic and cylinder, as shown in Figure III. For comparison purposes, the two forms were proposed with the same floor area and the same volume, to isolate the impact of the building form on its energy consumption. Both buildings have a central open courtyard to enable access to natural light and ventilation. The two proposed buildings are located in Istanbul, Turkey. The height was also the same for both, with four typical floors of 3.4 m each; the total height is considered to be 15 m. The cubic building plan dimensions are 64 m x 64 m, and its total ground floor area is 4096 m2. Approximately the same ground floor area was applied to the cylinder form (circular plan), with a circle radius of 36 m. The area of a circle is pi times
the radius squared (A = π r²). Therefore, the total ground
floor area was calculated as (36*36*3.14) =4069 m2.
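As a quick check of the stated equivalence of the two plans, the following snippet reproduces the two ground-floor areas from the dimensions given above (a 64 m × 64 m square versus a 36 m radius circle, with π approximated as 3.14, as in the text):

import math

side = 64.0       # cubic form plan dimension (m)
radius = 36.0     # cylindrical form plan radius (m)

square_area = side * side             # 4096 m^2
circle_area = 3.14 * radius ** 2      # 4069.44 m^2, as computed in the text
exact_area = math.pi * radius ** 2    # 4071.50 m^2 with full precision

print(square_area, round(circle_area, 2), round(exact_area, 2))
# The two plans differ by well under 1%, so the forms are comparable in floor area.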
Before running the simulation, the parameters of the energy simulation process had to be identified. The comparative simulation focused on cooling energy consumption during a specific period, the peak cooling load from 21 June to 21 September, taken as a typical summer season. Indoor operative temperature and solar heat gain through the exterior windows were selected to determine the total cooling energy consumption during the summer season. These parameters have an essential influence on the average indoor operative temperature, the overall cooling load, and the energy consumption. Internal heat gain through the walls and roof was ignored.

The construction template used for both buildings in this study was "Turkiye - medium weight", one of a range of ready-to-use construction templates; it provided the same data on facade partitions, walls, roofs, and airtightness. The glazing template was also the same for both buildings: the external window glazing type was project external glazing, and the layout was a horizontal strip with a 90% window-to-wall ratio. For the energy simulation, the models were designed to be mechanically ventilated by air conditioning (4-pipe fan coil units), so that the energy-consumption reduction rate could be compared. The air change rate per hour (ac/h) with no fresh air, at the operating pressure (Pa), was left for the DesignBuilder program to calculate automatically, as 1.7 ac/h at 50 Pa; DesignBuilder calculates the air flow rate in accordance with ASHRAE standards.

Figure V. 3D plan of the cylindrical building, modeled by the author using DesignBuilder software.
III. Results and Discussion
The glazing facade is considered responsible for a large share of the energy consumption in buildings, because glass exchanges considerably more heat than the other building envelope elements. As mentioned earlier in the section on energy consumption in buildings, lowering the overall heat transfer coefficient (U, in W/m²·°C) of the building envelope can significantly reduce energy consumption.

The proposed buildings in this study are fully glazed with vertical solar shading elements, as shown in Figure III. Therefore, internal heat gain through the walls and roof was ignored. Solar heat gain through the exterior windows is considered one of the main factors influencing the cooling load and, consequently, the indoor operative temperature.

Figure III. 3D view of both the cubic and cylindrical shapes, modeled by the author using DesignBuilder software.
As shown in Figure VI, the solar heat gain through the cubic form's exterior windows during the peak period in July (43,797.52 kWh) has a significant impact on the zone sensible cooling load, recorded as −60,617.03 kWh. In comparison, as shown in Figure VII, the solar heat gain through the exterior windows of the cylindrical form during the same peak period in July was 38,198.14 kWh, leading to a zone sensible cooling load of −5,288.91 kWh.
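The relative differences implied by the reported peak-July values can be computed directly; the snippet below only restates the numbers given above.

# Peak-July values reported above for the two forms (kWh).
solar_gain_cubic, solar_gain_cyl = 43797.52, 38198.14
cooling_cubic, cooling_cyl = 60617.03, 5288.91  # zone sensible cooling load magnitudes

gain_reduction = (solar_gain_cubic - solar_gain_cyl) / solar_gain_cubic * 100
load_reduction = (cooling_cubic - cooling_cyl) / cooling_cubic * 100
print(f"solar heat gain reduction (cubic -> cylinder): {gain_reduction:.1f}%")  # ~12.8%
print(f"sensible cooling load reduction:               {load_reduction:.1f}%")  # ~91.3%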
References

[1] C. Hachem, A. Athienitis, and P. Fazio, "Parametric investigation of geometric form effects on solar potential of housing units," Sol. Energy, vol. 85, no. 9, pp. 1864–1877, Sep. 2011, doi: 10.1016/j.solener.2011.04.027.
[2] A. Joelsson et al., "The impact of the form factor on final energy demand in residential buildings in Nordic climates," 2012.
[3] T. Catalina, J. Virgone, and V. Iordache, "Study on the impact of the building form on the energy consumption," 2011.
[4] I. E. Kocagil and G. K. Oral, "The effect of building form and settlement texture on energy efficiency for hot dry climate zone in Turkey," in Energy Procedia, Nov. 2015, vol. 78, pp. 1835–1840, doi: 10.1016/j.egypro.2015.11.325.
[5] R. Fallahtafti and M. Mahdavinejad, "Optimisation of building form and orientation for better energy efficient architecture," Int. J. Energy Sect. Manag., vol. 9, no. 4, pp. 593–618, Nov. 2015, doi: 10.1108/IJESM-09-2014-0001.
[6] A. AlAnzi, D. Seo, and M. Krarti, "Impact of building form on thermal performance of office buildings in Kuwait," Energy Convers. Manag., vol. 50, no. 3, pp. 822–828, Mar. 2009, doi: 10.1016/j.enconman.2008.09.033.
[7] B. Bektas Ekici and U. T. Aksoy, "Prediction of building energy needs in early stage of design by using ANFIS," Expert Syst. Appl., vol. 38, no. 5, pp. 5352–5358, May 2011, doi: 10.1016/j.eswa.2010.10.021.
[8] G. K. Oral and Z. Yilmaz, "Building form for cold climatic zones related to building envelope from heating energy conservation point of view," Energy Build., vol. 35, no. 4, pp. 383–388, May 2003, doi: 10.1016/S0378-7788(02)00111-1.
[9] V. Bazjanac and L. Berkeley, "Space boundary requirements for modeling of building geometry for energy and other performance simulation," in Proceedings of the CIB W78 2010: 27th International Conference, Nov. 2010.
[10] B. Raof, "The correlation between building form and building energy performance," Int. J. Adv. Res., vol. 5, no. 5, pp. 552–561, May 2017, doi: 10.21474/ijar01/4145.
[11] S. Pathirana, A. Rodrigo, and R. Halwatura, "Effect of building form, orientation, window to wall ratios and zones on energy efficiency and thermal comfort of naturally ventilated houses in tropical climate," Int. J. Energy Environ. Eng., vol. 10, no. 3, pp. 107–120, 2019, doi: 10.1007/s40095-018-0295-3.
[12] P. McKeen and A. S. Fung, "The effect of building aspect ratio on energy efficiency: A case study for multi-unit residential buildings in Canada," Buildings, vol. 4, no. 3, pp. 336–354, 2014, doi: 10.3390/buildings4030336.
[13] M. Premrov, M. Žigart, and V. Žegarac Leskovar, "Influence of the building form on the energy performance of timber-glass buildings located in warm climatic regions," Energy, vol. 149, pp. 496–504, Apr. 2018, doi: 10.1016/j.energy.2018.02.074.
[14] A. Al-Saggaf, H. Nasir, and M. Taha, "Quantitative approach for evaluating the building design features impact on cooling energy consumption in hot climates," Energy Build., vol. 211, p. 109802, Mar. 2020, doi: 10.1016/j.enbuild.2020.109802.
[15] M. Mokrzecka, "Influence of building form and orientation on heating demand: simulations for student dormitories in temperate climate conditions," doi: 10.1051/e3sconf/20184400117.
[16] A. Zhang, R. Bokel, A. van den Dobbelsteen, Y. Sun, Q. Huang, and Q. Zhang, "The effect of geometry parameters on energy and thermal performance of school buildings in cold climates of China," Sustain., vol. 9, no. 10, p. 1708, Sep. 2017, doi: 10.3390/su9101708.
[17] ASHRAE/IESNA, "ANSI/ASHRAE/IESNA Addenda to ANSI/ASHRAE/IESNA Standard 90.1-2007," 2009.
[18] C. Marino, A. Nucara, and M. Pietrafesa, "Does window-to-wall ratio have a significant effect on the energy consumption of buildings? A parametric analysis in Italian climate conditions," J. Build. Eng., vol. 13, pp. 169–183, Sep. 2017, doi: 10.1016/j.jobe.2017.08.001.
[19] A. R. AbouElhamd, K. A. Al-Sallal, and A. Hassan, "Review of core/shell quantum dots technology integrated into building's glazing," Energies, vol. 12, no. 6, 2019, doi: 10.3390/en12061058.
[20] K. J. Kontoleon, "Energy saving assessment in buildings with varying façade orientations and types of glazing systems when exposed to sun," Int. J. Performability Eng., vol. 9, no. 1, 2013.
[21] A. Bhatia, S. A. R. Sangireddy, and V. Garg, "An approach to calculate the equivalent solar heat gain coefficient of glass windows with fixed and dynamic shading in tropical climates," J. Build. Eng., vol. 22, pp. 90–100, Mar. 2019, doi: 10.1016/j.jobe.2018.11.008.
[22] E. Graiz and W. Al Azhari, "Energy efficient glass: a way to reduce energy consumption in office buildings in Amman," IEEE Access, vol. 7, 2019, doi: 10.1109/ACCESS.2018.2884991.
[23] M. Khoukhi, "The combined effect of heat and moisture transfer dependent thermal conductivity of polystyrene insulation material: Impact on building energy performance," Energy Build., vol. 169, pp. 228–235, Jun. 2018, doi: 10.1016/j.enbuild.2018.03.055.
[24] O. Algburi and F. Beyhan, "Cooling load reduction in a single-family house, an energy-efficient approach," Gazi Univ. J. Sci., vol. 32, no. 2, pp. 385–400, 2019.
[25] O. Algburi and F. Beyhan, "Climate-responsive strategies in vernacular architecture of Erbil city," 2019, doi: 10.1080/00207233.2019.1619324.
[26] F. Fiorito, A. Cannavale, and M. Santamouris, "Development, testing and evaluation of energy savings potentials of photovoltachromic windows in office buildings. A perspective study for Australian climates," Sol. Energy, vol. 205, pp. 358–371, Jul. 2020, doi: 10.1016/j.solener.2020.05.080.
[27] J. Clarke, "Building simulation," in Energy Simulation in Building Design, Elsevier, 2001, pp. 64–98.
[28] W. L. Oberkampf, S. M. Deland, B. M. Rutherford, K. V. Diegert, and K. F. Alvin, "Error and uncertainty in modeling and simulation," Reliab. Eng. Syst. Saf., vol. 75, pp. 333–357, 2002. [Online]. Available: www.elsevier.com/locate/ress (accessed 29 June 2021).
[29] R. D. Judkoff, "Validation of building energy analysis simulation programs at the solar energy research institute," Energy Build., vol. 10, no. 3, pp. 221–239, Jan. 1988, doi: 10.1016/0378-7788(88)90008-4.
Comparison of PID and LQR Controller of Autonomous Underwater Vehicle for Depth Control

Osen Fili Nami, Department of Electrical Engineering, Universitas Indonesia, Depok, Indonesia, osen.fili@ui.ac.id
Abdul Halim, Department of Electrical Engineering, Universitas Indonesia, Depok, Indonesia, a.halim@ui.ac.id
Dewi H. Budiarti, Aerospace Engineer, BPPT, Jakarta, Indonesia, dewi.habsari@bppt.go.id
Abstract—The Autonomous Underwater Vehicle (AUV) is a small unmanned underwater vehicle that is important for Indonesia as an archipelagic country. Apart from military purposes, it is also needed for civilian purposes. For this reason, the development of AUV technology is necessary and has strategic value. One technology that should be developed is AUV dynamic control. In this paper, we have designed an AUV depth control model with proportional integral derivative (PID) and linear quadratic regulator (LQR) controllers. A mathematical model of the AUV focused on the depth model has been developed and its stability analyzed. The AUV depth model produced an unstable step response, so a proportional gain had to be added to strengthen the stability of the system. Furthermore, the PID and LQR controllers were designed and simulated in MATLAB. Finally, the results of the PID and LQR controllers for AUV depth control were analyzed for comparison.

Keywords—AUV, Depth Model, Linear Quadratic Regulator, Proportional Integral Derivative, MATLAB

This paper explains the design of the PID and LQR AUV controllers, and the results are compared.

II. Modeling of AUV

To design a mathematical model of the dynamic motion of a submarine, we must first study the mechanism of motion of the AUV itself. There are two frames of reference for the position and orientation of the AUV, namely the Body Fixed Frame (BFF) and the Earth Fixed Frame (EFF), which can be seen in Figure I.

Figure I. Body-fixed and earth-fixed reference frames of the AUV, showing the thruster, fins, and rudders and the motion components: surge u, sway v, heave w, pitch q, and yaw r.
Surge:
$m[\dot{u} - vr + wq - x_G(q^2 + r^2) + y_G(pq - \dot{r}) + z_G(pr + \dot{q})] = \sum X$ (1)

Sway:
$m[\dot{v} - wp + ur - y_G(r^2 + p^2) + z_G(qr - \dot{p}) + x_G(pq + \dot{r})] = \sum Y$ (2)

Heave:
$m[\dot{w} - uq + vp - z_G(p^2 + q^2) + x_G(rp - \dot{q}) + y_G(rq + \dot{p})] = \sum Z$ (3)

The roll, pitch, and yaw equations, (4) to (6), follow analogously. By separating the acceleration terms and assuming that the off-diagonal inertia tensor terms ($I_{xy}$, $I_{xz}$, $I_{yz}$) are zero, equations (7) to (12) simplify to:

Surge:
$(m - X_{\dot{u}})\dot{u} + m z_G \dot{q} - m y_G \dot{r} = X_{res} + X_{|u|u} u|u| + (X_{wq} - m)wq + (X_{qq} + m x_G)q^2 + (X_{vr} + m)vr + (X_{rr} + m x_G)r^2 - m y_G pq - m z_G pr + X_{prop}$

where:
$\sum X_{ext} = X_{res} + X_{|u|u} u|u| + (X_{wq} - m)wq + (X_{qq} + m x_G)q^2 + (X_{vr} + m)vr + (X_{rr} + m x_G)r^2 - m y_G pq - m z_G pr + X_{prop}$
So that the pitch equation becomes:

$m z_G \dot{u} - (m x_G + M_{\dot{w}})\dot{w} + (I_y - M_{\dot{q}})\dot{q} = \sum M_{ext}$

Yaw:
$-m y_G \dot{u} + (m x_G - N_{\dot{v}})\dot{v} + (I_z - N_{\dot{r}})\dot{r} = N_{res} + N_{|v|v} v|v| + N_{r|r|} r|r| + (N_{ur} - m x_G)ur + (N_{wp} + m x_G)wp + (N_{pq} - (I_y - I_x))pq - m y_G(vr - wq) + N_{uv} uv + N_{uu\delta_r} u^2 \delta_r$

where:
$\sum N_{ext} = N_{res} + N_{|v|v} v|v| + N_{r|r|} r|r| + (N_{ur} - m x_G)ur + (N_{wp} + m x_G)wp + (N_{pq} - (I_y - I_x))pq - m y_G(vr - wq) + N_{uv} uv + N_{uu\delta_r} u^2 \delta_r$

For the depth plane, equations (28) and (29) together with equations (30) and (31) yield the simplified equations in the matrix form below:

$\begin{bmatrix} m - Z_{\dot{w}} & -(m x_G + Z_{\dot{q}}) & 0 & 0 \\ -(m x_G + M_{\dot{w}}) & I_y - M_{\dot{q}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{w} \\ \dot{q} \\ \dot{z} \\ \dot{\theta} \end{bmatrix} - \begin{bmatrix} Z_w & mU + Z_q & 0 & 0 \\ M_w & -m x_G U + M_q & 0 & M_\theta \\ 1 & 0 & 0 & -U \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} w \\ q \\ z \\ \theta \end{bmatrix} = \begin{bmatrix} Z_{\delta_s} \\ M_{\delta_s} \\ 0 \\ 0 \end{bmatrix} \delta_s \quad (32)$

From equation (32), it is assumed that $x_G = 0$; the heave velocity is assumed to be always small, so that both $w$ and $\dot{w}$ can be ignored, and equation (32) becomes:
Table II. REMUS AUV Parameters

Parameter | Value | Units | Description
$I_x$ | +1.77e−01 | kg·m² | Moment of inertia w.r.t. origin at CB
$I_y$ | +3.45e+00 | kg·m² | Moment of inertia w.r.t. origin at CB
$I_z$ | +3.45e+00 | kg·m² | Moment of inertia w.r.t. origin at CB
$Z_q$ | −9.67e+00 | kg·m/s | Combined term
$Z_{\dot{q}}$ | −1.93e+00 | kg·m | Added mass
$Z_w$ | −6.66e+01 | kg/s | Combined term
$Z_{\dot{w}}$ | −3.55e+01 | kg | Added mass
$Z_{\delta_s}$ | −5.06e+01 | kg·m/s² | Fin lift
$M_q$ | −6.87e+00 | kg·m²/s | Combined term
$M_{\dot{q}}$ | −4.88e+00 | kg·m² | Added mass
$M_w$ | +3.07e+01 | kg·m/s | Combined term
$M_{\dot{w}}$ | −1.93e+00 | kg·m | Added mass
$M_{\delta_s}$ | −3.46e+01 | kg·m²/s² | Fin lift
$M_\theta$ | −5.77e+00 | kg·m²/s² | Hydrostatic

So the matrix $A = T_1^{-1} T_2$ is

$A = \begin{bmatrix} -0.8247 & 0 & -0.6927 \\ 0 & 0 & -1.5400 \\ 1 & 0 & 0 \end{bmatrix}$

and $B = T_1^{-1} T_3$ is

$B = \begin{bmatrix} -4.1537 \\ 0 \\ 0 \end{bmatrix}$

while the matrix

$C = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$
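Assuming the numerical matrices above, the open-loop behavior discussed next can be checked directly from the eigenvalues of A. This is a sketch in Python with NumPy, not the authors' MATLAB code:

import numpy as np

# Depth-model matrices as reconstructed above.
A = np.array([[-0.8247, 0.0, -0.6927],
              [ 0.0,    0.0, -1.5400],
              [ 1.0,    0.0,  0.0   ]])
B = np.array([[-4.1537],
              [ 0.0   ],
              [ 0.0   ]])
C = np.array([[0.0, 1.0, 0.0]])

# Any eigenvalue of A with real part >= 0 means the open-loop depth model is
# not asymptotically stable, which motivates the added gain and the PID/LQR
# designs that follow.
eigvals = np.linalg.eigvals(A)
print("open-loop eigenvalues:", eigvals)
print("asymptotically stable:", bool(np.all(eigvals.real < 0)))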
The step response of the depth model with the gain change is shown in Figure VII, with a rise time of 2.4036 s, a settling time of 24.3448 s, an overshoot of 28.4421%, and a steady-state error of 0.027. Although the step response in Figure VII shows that the system is able to stabilize, a controller is still needed to stabilize the system under better conditions.

Figure IV. Bode diagram of the depth model.

From the Bode diagram, the gain margin of the system is 0.0893 and the phase margin is −61.9435°; from the negative phase margin it can also be concluded that the system is unstable.

III. Control Methods and Simulation

The depth model will be designed with a PID controller and an LQR controller to help stabilize the system.

A. PID Control Model

The PID control equation is:

$G_c(s) = K_p \left(1 + \frac{1}{T_i s} + T_d s\right)$ (39)

With the tuned parameters, the controller is:

$G_{PID} = 1.1165\left(1 + \frac{1}{3.7747 s} + 0.9059 s\right)$
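A minimal closed-loop sketch under unity feedback, assuming the reduced state-space model above as the plant; since the paper's loop also includes the added stabilizing gain, the numbers here need not reproduce the reported response exactly:

import numpy as np
from scipy import signal

# PID from equation (39), Gc(s) = Kp*(1 + 1/(Ti*s) + Td*s), with the gains above.
Kp, Ti, Td = 1.1165, 3.7747, 0.9059
pid_num = [Kp * Td * Ti, Kp * Ti, Kp]   # Kp*(Td*Ti*s^2 + Ti*s + 1)
pid_den = [Ti, 0.0]                     # Ti*s

# Plant: transfer function of the reduced depth model above.
A = [[-0.8247, 0.0, -0.6927], [0.0, 0.0, -1.54], [1.0, 0.0, 0.0]]
B = [[-4.1537], [0.0], [0.0]]
C = [[0.0, 1.0, 0.0]]
num, den = signal.ss2tf(A, B, C, [[0.0]])
plant_num = np.trim_zeros(np.round(num[0], 9), "f")  # drop leading ~zero coefficients

# Unity-feedback closed loop T = Gc*Gp / (1 + Gc*Gp).
ol_num = np.polymul(pid_num, plant_num)
ol_den = np.polymul(pid_den, den)
cl_den = np.polyadd(ol_den, ol_num)     # polyadd aligns trailing coefficients

print("closed-loop poles:", np.roots(cl_den))  # negative real parts mean a stable loop
t, y = signal.step(signal.TransferFunction(ol_num, cl_den))
# Rise time, settling time, overshoot, and steady-state error can be read off (t, y).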
B. LQR Control Model

LQR is one of the optimal control methods based on state space; it is essentially intended to determine the control signal in such a way as to minimize the performance index $J$:

$J = \int_0^\infty (x^* Q x + u^* R u)\, dt$ (40)

The simulation results showed that the LQR controller design is very good at stabilizing the depth of the AUV model.

C. Comparison of PID and LQR Controller Results
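For the comparison, the LQR gain can be computed from equation (40); the sketch below uses SciPy's Riccati solver with illustrative identity weights Q and R, since the paper's weighting matrices are not stated here:

import numpy as np
from scipy.linalg import solve_continuous_are

# LQR gain minimizing J in equation (40) for the depth model above.
# Q and R are illustrative identity weights, not the paper's values.
A = np.array([[-0.8247, 0.0, -0.6927], [0.0, 0.0, -1.54], [1.0, 0.0, 0.0]])
B = np.array([[-4.1537], [0.0], [0.0]])
Q = np.eye(3)
R = np.eye(1)

P = solve_continuous_are(A, B, Q, R)   # solve the algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)        # optimal state feedback u = -K x
print("K =", K)
print("closed-loop poles:", np.linalg.eigvals(A - B @ K))
# With Q, R > 0 and (A, B) controllable, all closed-loop poles lie in the
# left half-plane, so the LQR-regulated depth dynamics are stable.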
Convolutional Neural Network Approach to Distinguish and Characterize Tumor Samples Using
Gene Expression Data
Abstract—Cancer threatens millions of people each year, and its early diagnosis is still a challenging task. Early diagnosis is one of the major ways to tackle this disease and lower the mortality rate. Advances in deep learning approaches and the availability of biological data offer applications that can facilitate the diagnosis and characterization of cancer. Here, we aim to provide a new perspective on cancer diagnosis by using a deep learning approach on gene expression data. We turn the gene expression data of cancer and normal tissues into input for a Convolutional Neural Network (CNN) with a method we call RGB mapping, which preserves the gene expression values of the tissues during learning so that the effect of each gene on the diagnosis is retained. In addition, we aim to characterize the disease by identifying genes that are effective in cancer prediction. In this study, The Cancer Genome Atlas (TCGA) dataset, with RNA-seq data of approximately 30 different types of cancer, and GTEx RNA-seq data of normal tissues were used. The input data for training were transformed into RGB format, and the training was carried out with a CNN. The trained model predicts cancer with 97.7% accuracy based on gene expression data. Moreover, we applied a one-pixel attack to the trained model to determine the genes that are effective for predicting the disease; as a result of this attack, 13 genes that influence the decision mechanism of the algorithm were determined. In conclusion, a new data preprocessing method is proposed in this study, and with the one-pixel attack method applied to a model trained with it, genes that may be cancer biomarkers can be identified. The effects of the candidate genes on cancer can then be determined by experimental studies.

Keywords—cancer, CNN, TCGA, GTEx, RNA-seq, RGB Mapping

I. Introduction

The deep learning approach emerged from the design of computer models that can learn through interconnected layers modeled on the neurons of the human brain. As a result of the development of data science, and especially the rapid increase in biological data in the last decade, such neural networks have begun to play important roles in the interpretation of biological data for the diagnosis and treatment of diseases [1]. Cancer, one of the biggest health problems in the world, is one of the diseases to which deep learning approaches are widely applied.

Since cancer is a disease with high genomic heterogeneity and phenotypic plasticity, its diagnosis and treatment involve various difficulties [2]. Thanks to developing technology, abundant medical data on cancer patients are available, and processing these data with deep learning approaches has improved the stages of diagnosis and treatment. The availability of CT and histopathology data enabled improvements in the diagnosis stage through image-based processing of these data with deep learning. Deep learning algorithms have been used in the diagnosis of many types of cancer, including breast cancer [3,4], prostate cancer [5,6], lung cancer [7,8], colon cancer [9], head and neck cancer [10], and skin cancer [11]. These image-based approaches are widely used because they give clinicians an advantage at the early diagnosis stage.

In addition to image-based approaches, biological data have also been used in cancer diagnosis [12] and even treatment [13]. Gene expression signature-based approaches are generally used to eliminate the disadvantages caused by heterogeneity at the diagnosis stage. Gene expression data and deep learning are used together in many methods, such as estimating the survival times of individuals with cancer [14], determining biomarker genes [15], and classification [16,17]. All of these studies show that important information about the mechanism of cancer can be gained by using gene expression data and deep learning approaches together.

In this study, The Cancer Genome Atlas (TCGA) dataset, with RNA-seq data of approximately 30 different types of cancer, and a dataset obtained by curating GTEx RNA-seq data of normal tissues were used. Importance was given to the homogeneity of the tumor and normal tissue distribution of the prepared dataset. The gene expression values of the tissues, which are the input data for training, were converted into 24-bit binary format and then split into the 8-bit Red, Green, and Blue channels. The training was carried out with a CNN. The trained model is able to distinguish cancer and normal samples with 97.7% accuracy based on gene expression data. Afterwards, the one-pixel attack method was applied to the input data created using RGB mapping; in this way, the vulnerability of deep learning models was used to identify genes that may be effective in cancer. As a result of this process, 13 candidate biomarker genes were obtained, and when these genes were investigated in the literature, their relationship with cancer was confirmed.

II. Materials and Methods

A. Dataset Preparation

Data were downloaded from the UCSC Xena platform [18], which includes three different RNA-seq data sources: TARGET, TCGA, and GTEx. The dataset label distribution is shown in Table I.
Table I. Distribution of labels in the whole dataset.

Datasets | Normal | Tumor
TCGA | 727 | 9750
GTEx | 7429 | 0

Figure I. Conversion of a gene expression value to RGB format.

Figure II summarizes the concept with a sample conversion. After the conversion, the 14,308 × 1,024 training data become a [14,308 × 32 × 32 × 3] NumPy array. Accordingly, the test data become a [3,578 × 32 × 32 × 3] NumPy array suitable for batch processing by TensorFlow. The final axis represents the R, G, B values, and the square 32 × 32 shape corresponds to the 1,024 genes.

Figure II. Illustration of the NumPy 4D arrays. (a) For each sample, 1,024 genes are shaped as 32 × 32 pixels; due to RGB mapping, 3 color channels are used per sample. (b) This shape was imposed on the whole dataset.
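A minimal sketch of the RGB mapping as described: each expression value, treated as a 24-bit integer, is split into 8-bit R, G, and B channels (so 196,607 maps to (2, 255, 255), matching the example in the one-pixel attack description below). The function names are illustrative, and rounding fractional expression values to integers is an assumption not specified in the text:

import numpy as np

def rgb_map(value):
    """Split one expression value (treated as a 24-bit integer) into R, G, B bytes."""
    value = int(round(value))          # rounding to an integer is an assumption
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

print(rgb_map(196_607))                # -> (2, 255, 255)

def sample_to_image(expr):
    """Reshape a 1-D array of 1,024 mapped expression values into a 32 x 32 x 3 image."""
    vals = np.asarray(expr).round().astype(np.uint32)
    rgb = np.stack([(vals >> 16) & 0xFF, (vals >> 8) & 0xFF, vals & 0xFF], axis=-1)
    return rgb.reshape(32, 32, 3).astype(np.uint8)

Since the mapping is a bijection on 24-bit integers (value = 65536·R + 256·G + B), the original expression value can be recovered exactly, which is what makes the preprocessing reversible, as emphasized in the Results.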
C. CNN Architecture

The CNN architecture shown in Table II was used for training. The architecture includes eight convolution layers, four dropout layers, and one global average pooling layer. Each convolution layer consists of 3 × 3 kernels. ReLU was used as the activation function, and to overcome overfitting, dropout rates of 0.2 or 0.5 were used. The final layer has a sigmoid activation function.

Table II. CNN Architecture
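The exact layer configuration is given in Table II; as a rough illustration, the following Keras sketch is one plausible layout consistent with the description above (the filter counts and dropout placement are assumptions):

import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(32, 32, 3)):
    """One plausible layout: 8 conv layers, 4 dropouts, GAP, sigmoid output."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for i, filters in enumerate([32, 32, 64, 64, 128, 128, 256, 256]):  # assumed widths
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        if i % 2 == 1:                       # a dropout layer after every second conv
            x = layers.Dropout(0.2 if i < 4 else 0.5)(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)   # tumor vs. normal
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])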
D. One-Pixel Attack

The one-pixel attack algorithm was adopted from an earlier study [20], which utilizes the "differential evolution" algorithm from the SciPy Python library. The attack algorithm picks random locations (x, y), where x < 32 and y < 32, and random RGB colors. Although the blue and green values are picked within the (0, 255) range, the red value is picked only within the (0, 2) range, since gene expression values are mostly below 196,607, which corresponds to the (2, 255, 255) RGB value. The one-pixel attack yields a pixel location and a new color value that causes the label to change in the trained model (from Normal to Tumor, or vice versa). Since the attack is random, we performed many attacks (10, to be exact) on the test dataset. The resulting attacks were filtered, keeping only those whose suggested pixel value lies within the lowest and highest expression range of the corresponding gene.
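Following [20], such an attack can be framed as a differential-evolution search over (x, y, R, G, B) under the constraints stated above; the sketch below is a schematic reimplementation, with the model interface and helper names assumed rather than taken from the paper:

import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(model, image, true_label):
    """Search for one (x, y, R, G, B) change that flips the model's label.

    model is assumed to be a trained Keras-style classifier returning P(tumor);
    image is one 32 x 32 x 3 uint8 sample produced by RGB mapping.
    """
    def perturb(z):
        x, y, r, g, b = (int(round(v)) for v in z)
        adv = image.copy()
        adv[x, y] = (r, g, b)
        return adv

    def objective(z):
        p_tumor = float(model.predict(perturb(z)[None, ...], verbose=0)[0, 0])
        # Minimize confidence in the true class, so the label flips.
        return p_tumor if true_label == 1 else 1.0 - p_tumor

    # x, y < 32; red restricted to (0, 2); green and blue to (0, 255).
    bounds = [(0, 31), (0, 31), (0, 2), (0, 255), (0, 255)]
    result = differential_evolution(objective, bounds, maxiter=30, popsize=15, seed=0)
    return perturb(result.x), result.fun

Candidate perturbations found this way would then be filtered against each gene's observed expression range, as described above.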
III. Results

A. Input Images Obtained by Applying the RGB Mapping Method

Since the gene expression data have been converted into RGB format, visualizing the expression layout of any sample is possible. Figure III presents sample images for normal and tumor samples: four images each from (a) normal tissue data and (b) tumor tissue data, generated by converting the gene expression levels of the 1,024 selected genes using RGB mapping. The images do not reveal any pattern apparent to the naked eye; however, convolutional layers are able to pick up regions or patterns formed by neighboring pixels, so the gene expression data were passed through convolution layers. Note that although the gene expression data were converted into RGB format, they were not saved as images before training.

Table III. Comparison of the model with other studies. SVM: support vector machine; t-SNE: t-distributed stochastic neighbor embedding.

Authors | Expression Preprocessing | Classification | Accuracy | Sensitivity | Specificity | Precision | F-measure
Elbashir et al. [22] | Normalization | CNN | 98.76% | 91.43% | 100.00% | 100.00% | 0.955
Danaee et al. [23] | Normalization | Stacked Denoising | 94.78% | 94.04% | 97.50% | 97.20% |
Elbashir et al. [22] | Normalization | AlexNet | 96.69% | 96.89% | 94.12% | 99.54% | 0.955
Elbashir et al. [22] | t-SNE | SVM | 100.00% | 100.00% | 51.00% | 95.96% | 0.97
Proposed method | RGB mapping | CNN | 97.73% | 97.66% | 97.80% | 98.00% | 0.975
C. Performance Measurement

The training was performed on a 32 × 32 × 3 multidimensional array for each sample. Figure V shows the ROC curve of the model; the AUC value of our model is 0.97. There are several different approaches that use gene expression data to classify tumor and normal samples, ranging from simpler machine learning approaches to complex deep learning networks. These approaches usually start by pre-processing the gene expression data with an irreversible manipulation, and even by mapping the data points to a different domain. Our method involves a minimal and reversible change to the gene expression data: the RGB mapping is reversible and does not require normalization or any dimensionality reduction technique. Table III compares our approach with several different approaches in both the pre-processing and classification steps.

Figure V. The ROC curve of the CNN model for tumor and normal classification.
Table IV. Gene list obtained by the one-pixel attack (cancer-related genes according to the one-pixel attack).

Ensembl ID | Gene Name
ENSG00000163513 | TGFBR2
ENSG00000129250 | KIF1C
ENSG00000215301 | DDX3X
ENSG00000188157 | AGRN
ENSG00000138821 | SLC39A8
ENSG00000124942 | AHNAK
ENSG00000157557 | ETS2
ENSG00000177469 | CAVIN1
ENSG00000123095 | BHLHE41
ENSG00000157514 | TSC22D3
ENSG00000116701 | NCF2
ENSG00000198911 | SREBF2
ENSG00000121691 | CAT

When the genes obtained as a result of the one-pixel attack are examined in the literature, it is seen that their expression changes are associated with the formation of cancer or with patient survival. For example, studies have shown that expression changes of the TGFBR2 gene (transforming growth factor beta receptor 2) affect the prognosis in cervical cancer [24] and gastric cancers [25]. Likewise, there are data showing that KIF1C (kinesin-like protein KIF1C) and SLC39A8 (solute carrier family 39 member 8) are prognostic markers in renal cancers [26] and that changes in their expression levels are related to survival time. When the changes in the predictions for the samples were examined by applying the one-pixel attack method, it was seen that only for the AGRN (Agrin) gene did changes occur through both increases and decreases in the expression level. When the sample images converted to RGB format are examined, the changes in the increased and decreased pixels can be seen in Figure VI.

In Figure VI, the first images show the originals and the second images show those obtained as a result of the attack. Figure VI (a) shows the original and post-attack images of the gene expression data of the sample with ID TCGA-HC-8259-11. The areas marked with a red circle show the changes in the Agrin gene as a result of the attack. The gene expression value was increased for the TCGA-HC-8259-11 sample; as can be seen in the image, brighter pixels were obtained by increasing the expression value. This shows that important clues to the mechanism of cancer can be obtained by making changes that cannot be distinguished with the naked eye, as well as by finding genes effective in cancer through attacks on the model. Figure VI (b) shows the image of the sample with ID TCGA-NJ-A4YI-01. In this example, the gene expression value of the same gene was decreased; while the corresponding pixel in the original image is brighter, it appears darker after the expression value is decreased by the attack.

Figure VI. Sample images obtained as a result of the attack.

IV. Discussion

In this study, training was carried out with deep learning models on the gene expression data of cancer patients, using the RGB mapping data preprocessing method. As a result of this training, it was shown that deep learning methods can distinguish the differences between tumor and normal tissues. The data processing method applied before the model makes it possible to apply a one-pixel attack to the resulting sample images. Identifying genes that are effective in cancer is critical for cancer diagnosis and treatment; for this reason, the one-pixel attack algorithm was applied over the obtained training data to identify genes that may be cancer biomarkers. When the genes determined by this algorithm were investigated in the literature, it was seen that their expression changes are effective in cancer progression and patient survival.

The results of the study demonstrate that, by developing appropriate processing methods for experimentally obtained biological data, meaningful results about a disease can be obtained without loss of information. The gene expression data, which are the inputs of the deep learning model, are converted to RGB and fed to the model, allowing the data to be used without statistical manipulation (such as normalization) and without loss. In this way, a high learning rate and a high prediction rate were observed as a result of the training.

All of these findings show that this method can bring a new approach to diseases such as cancer that are difficult to diagnose and require more biomarker genes. With the application of this method, individual results can be obtained, and the inter- and intra-tumor heterogeneity characteristics of tumor cells can be determined. It can be used as an approach that makes individual cancer analysis possible by making it easier to find genes that differ from person to person. The results obtained can be strengthened with experimental data to identify new biomarkers for cancer and can be used in personalized diagnostic or therapeutic studies.
References

[1] Esteva, A., Robicquet, A., Ramsundar, B., Kuleshov, V., DePristo, M., Chou, K., Cui, C., Corrado, G., Thrun, S. and Dean, J., 2019. A guide to deep learning in healthcare. Nature Medicine, 25(1), pp. 24-29.
[2] Persi, E., Wolf, Y.I., Horn, D., Ruppin, E., Demichelis, F., Gatenby, R.A., Gillies, R.J. and Koonin, E.V., 2020. Mutation–selection balance and compensatory mechanisms in tumour evolution. Nature Reviews Genetics, pp. 1-12.
[3] Zuluaga-Gomez, J., Al Masry, Z., Benaggoune, K., Meraghni, S. and Zerhouni, N., 2020. A CNN-based methodology for breast cancer diagnosis using thermal images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, pp. 1-15.
[4] Gour, M., Jain, S. and Sunil Kumar, T., 2020. Residual learning based CNN for breast cancer histopathological image classification. International Journal of Imaging Systems and Technology.
[5] Swiderska-Chadaj, Z., de Bel, T., Blanchet, L., Baidoshvili, A., Vossen, D., van der Laak, J. and Litjens, G., 2020. Impact of rescanning and normalization on convolutional neural network performance in multi-center, whole-slide classification of prostate cancer. Scientific Reports, 10(1), pp. 1-14.
[6] Hartenstein, A., Lübbe, F., Baur, A.D., Rudolph, M.M., Furth, C., Brenner, W., Amthauer, H., Hamm, B., Makowski, M. and Penzkofer, T., 2020. Prostate cancer nodal staging: using deep learning to predict 68Ga-PSMA-positivity from CT imaging alone. Scientific Reports, 10(1), pp. 1-11.
[7] Kanavati, F., Toyokawa, G., Momosaki, S., Rambeau, M., Kozuma, Y., Shoji, F., Yamazaki, K., Takeo, S., Iizuka, O. and Tsuneki, M., 2020. Weakly-supervised learning for lung carcinoma classification using deep learning. Scientific Reports, 10(1), pp. 1-11.
[8] Lai, Y.H., Chen, W.N., Hsu, T.C., Lin, C., Tsao, Y. and Wu, S., 2020. Overall survival prediction of non-small cell lung cancer by integrating microarray and clinical data with deep learning. Scientific Reports, 10(1), pp. 1-11.
[9] Jiang, D., Liao, J., Duan, H., Wu, Q., Owen, G., Shu, C., Chen, L., He, Y., Wu, Z., He, D. and Zhang, W., 2020. A machine learning-based prognostic predictor for stage III colon cancer. Scientific Reports, 10(1), pp. 1-9.
[10] Fontaine, P., Acosta, O., Castelli, J., De Crevoisier, R., Müller, H. and Depeursinge, A., 2020. The importance of feature aggregation in radiomics: a head and neck cancer study. Scientific Reports, 10(1), pp. 1-11.
[11] Tschandl, P., Rinner, C., Apalla, Z., Argenziano, G., Codella, N., Halpern, A., Janda, M., Lallas, A., Longo, C., Malvehy, J. and Paoli, J., 2020. Human–computer collaboration for skin cancer recognition. Nature Medicine, 26(8), pp. 1229-1234.
[12] Jiao, W., Atwal, G., Polak, P., Karlic, R., Cuppen, E., Danyi, A., De Ridder, J., van Herpen, C., Lolkema, M.P., Steeghs, N. and Getz, G., 2020. A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns. Nature Communications, 11(1), pp. 1-12.
[13] Mencattini, A., Di Giuseppe, D., Comes, M.C., Casti, P., Corsi, F., Bertani, F.R., Ghibelli, L., Businaro, L., Di Natale, C., Parrini, M.C. and Martinelli, E., 2020. Discovering the hidden messages within cell trajectories using a deep learning approach for in vitro evaluation of cancer drug treatments. Scientific Reports, 10(1), pp. 1-11.
[14] Ramirez, R., Chiu, Y.C., Zhang, S., Ramirez, J., Chen, Y., Huang, Y. and Jin, Y.F., 2021. Prediction and interpretation of cancer survival using graph convolution neural networks. Methods.
[15] Xie, Y., Meng, W.Y., Li, R.Z., Wang, Y.W., Qian, X., Chan, C., ... and Leung, E.L.H., 2021. Early lung cancer diagnostic biomarker discovery by machine learning methods. Translational Oncology, 14(1), 100907.
[16] Binder, A., Bockmayr, M., Hägele, M., Wienert, S., Heim, D., Hellweg, K., ... and Klauschen, F., 2021. Morphological and molecular breast cancer profiling through explainable machine learning. Nature Machine Intelligence, pp. 1-12.
[17] Ahn, T., Goo, T., Lee, C.H., Kim, S., Han, K., Park, S. and Park, T., 2018. Deep learning-based identification of cancer or normal tissue using gene expression data. In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1748-1752). IEEE.
[18] Vivian, J., Rao, A.A., Nothaft, F.A., Ketchum, C., Armstrong, J., Novak, A., ... and Paten, B., 2017. Toil enables reproducible, open source, big biomedical data analyses. Nature Biotechnology, 35(4), 314-316.
[19] Rouillard, A.D., Gundersen, G.W., Fernandez, N.F., Wang, Z., Monteiro, C.D., McDermott, M.G. and Ma'ayan, A., 2016. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database, 2016.
[20] Su, J., Vargas, D.V. and Sakurai, K., 2019. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 23(5), 828-841.
[21] O. Dulgerci, 2019. Minimizing with differential evolution. (Visited on 2021-6-18). [Online]. Available: https://mathematica.stackexchange.com/questions/193009/minimizing-with-differential-evolution
[22] Elbashir, M.K., Ezz, M., Mohammed, M. and Saloum, S.S., 2019. Lightweight convolutional neural network for breast cancer classification using RNA-seq gene expression data. IEEE Access, 7, 185338-185348.
[23] Danaee, P., Ghaeini, R. and Hendrix, D.A., 2017. A deep learning approach for cancer detection and relevant gene identification. In Pacific Symposium on Biocomputing 2017 (pp. 219-229).
[24] Yang, H., Zhang, H., Zhong, Y., Wang, Q., Yang, L., Kang, H., ... and Zhou, Y., 2017. Concomitant underexpression of TGFBR2 and overexpression of hTERT are associated with poor prognosis in cervical cancer. Scientific Reports, 7(1), 1-14.
[25] Nadauld, L.D., Garcia, S., Natsoulis, G., Bell, J.M., Miotke, L., Hopmans, E.S., ... and Ji, H.P., 2014. Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer. Genome Biology, 15(8), 1-18.
[26] KIF1C, https://www.proteinatlas.org/ENSG00000129250-KIF1C/pathology/renal+cancer, accessed 29/05/21.