Proceedings Book
Istanbul, Turkey
ICAETA -2021
International Conference on Advanced Engineering,
Technology and Applications
PROCEEDINGS BOOK
2021
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
Host
Department of Computer Engineering,
Istanbul Aydin University,
Istanbul, Turkey.
https://izu.edu.tr
Sponsors
AIPLUS
308 Daisyfield Centre, Appleby Street
Blackburn Lancashire BB1 3BL,
United Kingdom
https://www.aiplustech.org/
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
BOOK OF PROCEEDINGS
Editorial Board
Dr. Akhtar Jamil
Dr. Alaa Ali Hameed
ISSN: 2752-8340
Istanbul, Turkey
Copyright © 2021
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.
The individual contributions in this publication and any liabilities arising from them remain the responsibility of the authors. The publisher is not responsible for possible damages which could be a result of content derived from this publication. Moreover, the proceedings book is published online at https://icaeta.aiplustech.org.
info@aiplustech.org
https://icaeta.aiplustech.org
International Conference on Advanced Engineering, Technology and Applications (ICAETA-2021)
Istanbul, Turkey
Keynote Speakers
Publication Committee
Dr. Imran Ahmed Siddiqi, Department of Computer Science, Bahria University, Pakistan.
Maryam Torabi, Alzahra Technical and Vocational College, Iran.
Dourna Kiavar, Sama University of Tabriz, Iran.
Mustafa Takaoğlu, Istanbul Aydin University, Turkey.
Registration Committee
Dr. Amani Yahyaoui, Istanbul Sabahattin Zaim University, Turkey.
Alireza Sakha, Islamic Azad University, Iran.
Ali Khiabanian, Interdisciplinary Design Universe Office, Iran.
Ayse Gul Gemci, Istanbul Technical University, Turkey.
Publicity Committee
Dr. Adem Ozyavas, Istanbul Aydin University, Turkey.
Sama Khattab, Istanbul Aydin University, Turkey.
Program Committee
Dr. Ameer Al-Nemrat, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Chawki Djeddi, Laboratoire d'Informatique, de Traitement de l'Information et des Systèmes (LITIS), University of Rouen, France.
Dr. Can Balkaya, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Muhammad Abdul Basit, Montana Technological University, Butte, Montana, USA.
Dr. Naghmeh Moradpoor, School of Computing, Edinburgh Napier University, United Kingdom.
Dr. Kamran Dehghan, Department of Architecture, Islamic Azad University, Iran.
Dr. Selda Nazari, Department of Architecture, Islamic Azad University, Iran.
Dr. Syed Attique Shah, Balochistan University of Information Technology, Engineering and Management Sciences, Pakistan.
Dr. Müberra Eser Aydemir, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Muhammad Fahim, Institute of Information Security and Cyberphysical Systems, Innopolis University, Russia.
Dr. Shahab Adam Navasi, Department of Architecture, Islamic Azad University, Iran.
Dr. Muhammad Ilyas, Department of Electrical and Electronics Engineering, Altinbas University, Turkey.
Dr. Rawad Hammad, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Prateek Agrawal, Department of Computer Science, University of Klagenfurt, Austria.
Dr. Atoosa Modiri, Department of Architecture, Islamic Azad University, Iran.
Dr. Hasan Volkan Oral, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Jayapandian N, Department of Computer Science and Engineering, Christ University, Bangalore, India.
Dr. Kaveh Dehghanian, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Nidaa Flaih Hassan, Department of Computer Science, University of Technology, Iraq.
Dr. Amani Yahyaoui, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Turkey.
Dr. Firas Ajlouni, Department of Computer Science, Lancashire College, United Kingdom.
Dr. Bita Bagheri, Department of Architecture, Islamic Azad University, Iran.
Dr. Zeynep Kerem Öztürk, Department of Interior Architecture and Environmental Design, Istanbul Sabahattin Zaim University, Istanbul, Turkey.
Dr. Aliyu Musa, Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.
Dr. S Amutha, Department of Computer Science and Engineering, Saveetha Engineering College (Autonomous), affiliated to Anna University, Chennai, India.
Dr. Sepanta Naimi, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mehmet Fatih Altan, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammad Golmohammadi, Department of Architecture, Islamic Azad University, Iran.
Dr. Lisa Oliver, Department of Computer Science, Lancashire College, United Kingdom.
Dr. Ahmet Gürhanli, Department of Computer Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Daniel White, Department of Mathematics and Computing, Lancashire College, United Kingdom.
Dr. Yasmin Doozdoozani, Department of Architecture, Islamic Azad University, Iran.
Dr. Javad Eiraji, Faculty of Architecture and Design, Eskisehir Technical University, Turkey.
Dr. Kiarash Eftekhari, Department of Architecture, Islamic Azad University, Iran.
Dr. Mina Najafi, Editorial Assistant, Emerald Publishing Ltd, United Kingdom.
Dr. Nahid Khahnamouei, Department of Architecture, University of Nabi Akram, Iran.
Dr. Murtaza Farsadi, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Sadeq Alhamouz, Department of Computer Sciences, WISE University, Jordan.
Dr. Necip Gökhan Kasapoğlu, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Vira V. Shendryk, Department of Computer Sciences, Sumy State University, Ukraine.
Dr. Elnaz Pashaei, Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Shareeful Islam, School of Architecture, Computing and Engineering, University of East London, London, United Kingdom.
Dr. Navid Khaleghimoghaddam, Department of Engineering and Architecture, Konya Food and Agriculture University, Turkey.
Dr. Nima Mirzaei, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Imene Yahyaoui, Applied Mathematics, Materials Science and Engineering, and Electronic Technology, Universidad Rey Juan Carlos, Spain.
Dr. Ilham Huseyinov, Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Azeem Hafeez, Department of Electrical and Computer Engineering, University of Michigan, USA.
Dr. Abdulkader Alwer, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Zeynep Orman, Department of Computer Engineering, Istanbul University, Turkey.
Dr. Xiaodong Liu, School of Computing, Edinburgh Napier University, United Kingdom.
Dr. Saed Moghimi, Department of Civil Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Alaa Sheta, Department of Computer Sciences, Southern Connecticut State University, USA.
Dr. Mehmet Güneş Gençyilmaz, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Raheleh Mirzaei, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Vijayakumar Varadarajan, School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia.
Dr. Luca Romeo, Istituto Italiano di Tecnologia, Italy.
Dr. Numan Khurshid, SEECS, National University of Science and Technology, Pakistan.
Dr. Elif Merve Kahraman, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Sibel Kahraman, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mustafa Nafiz Duru, Department of Industrial Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Saeid Homayouni, Centre for Water, Earth, and Environment, INRS-Quebec, Canada.
Dr. Imran Ahmed Siddiqi, Department of Computer Science, Bahria University, Pakistan.
Dr. Zeynep Dilek Heperkan, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. T. Lalitha, Computer Science & IT, Jain Deemed-to-be University, Bengaluru, Karnataka, India.
Dr. Aysa Jafari Farmand, Istanbul Technical University, Turkey.
Dr. Mehdi Zahed, School of Applied Sciences and Technology, NAIT, Canada.
Dr. Mehmet Güneş Gençyilmaz, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Gülay Baysal, Department of Food Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Fahimeh Jafari, School of Architecture, Computing and Engineering, University of East London, United Kingdom.
Dr. Mustansar Ali Ghazanfar, Department of Computer Science and Digital Technologies, University of East London, United Kingdom.
Dr. Sibel Senan, Department of Computer Engineering, Istanbul University, Turkey.
Dr. Mehmet Emin Tacer, Department of Electric-Electronic Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammed Alkrunz, Department of Electrical and Electronics Engineering, Istanbul Aydin University, Istanbul, Turkey.
Dr. Mohammed Vadi, Department of Electrical and Electronics Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey.
TABLE OF CONTENTS
Development of a High Precision Temperature Monitoring System for Industrial Cold Storage
Sarosh Ahmad, Arslan Dawood Butt, Usama Umar, Sajal Naz, Sheza Yasin and Amina Batool .......... 38, 208
Abstract—Recognition of human activities is a challenging process, and technology-based systems are used for the successful realization of the process. Recently, artificial intelligence-based technologies have started to be used widely. The hybrid approach designed for this study consists of a convolutional-based bidirectional long short-term memory (C-BiLSTM). In this study, 12 types of human activities were identified using C-BiLSTM on ECG signal data. As a result of the analysis, an overall accuracy success rate of 98.96% was achieved. The result obtained in the experimental analysis is promising for identifying the types of human activity.

Keywords—bidirectional LSTM, deep learning, ECG measurement, physical activities, human behavior

I. Introduction

Today, recognizing human activities has become easier thanks to various technological approaches (wearable sensors, video, accelerometers, signal data, etc.) [1]. Information on human activity actions is used in areas such as monitoring the elderly, performing health actions correctly, criminal tracking systems, etc. [1], [2]. Technological infrastructure systems are supported by software containing innovative aspects [3]. Artificial intelligence technologies form part of these innovative aspects. Artificial intelligence-based systems offer a service where people can perform their transactions more easily today [4], [5].

Many studies have been conducted on artificial intelligence-based recognition of human activities. Shaohua Wan et al. [6] conducted the recognition of human activities using machine learning and deep learning methods. They used convolutional neural network (CNN), support vector machine (SVM), long short-term memory (LSTM), and multilayer perceptron (MLP) approaches in their study. They achieved the best accuracy success rate of 92.71% with the CNN model. Junfang Gong et al. [7] used social media data to recognize human daily activities. They achieved an overall accuracy rate of 89.35% with the LSTM model they designed in their study. Nozha Jlidi et al. [8] used the transfer learning-based PoseNet model to recognize human activities. They successfully achieved accuracy by emphasizing the body joints. Emilio Sansano et al. [9] used gated recurrent unit networks (GRU), deep belief networks (DBN), and LSTM approaches to recognize human activities. They achieved over 90% overall accuracy success in all of the approaches they used. Sakorn Mekruksavanich et al. [10] proposed a biometric user identification-based approach to defining human activities for the health status monitoring of elderly people. They obtained data using a triaxial gyroscope and triaxial accelerometer. They used the CNN model and LSTM model in the analysis of the data. The classification success of the models was 91.77% and 92.43%, respectively. Negar Golestani et al. [11] presented a wireless approach based on magnetic induction to recognize human activities. They categorized the activities by integrating the magnetic induction system with machine learning techniques and deep learning approaches. They achieved the best performance with the LSTM model. Accuracy success for the two datasets they used was 87% and 98.9%, respectively. S. Tsokov et al. [12] used accelerometer sensors to classify human activity types. They designed a 1D-CNN model to describe the data they obtained according to their types. The overall accuracy success they got from the CNN model they designed was 98.86%.

The goal of this study is to successfully recognize human activities with the proposed approach. The proposed approach hybridizes CNN and bidirectional LSTM models designed using Python libraries. The summary of this article by section is as follows: in the second section, information about the dataset is given. The third section contains information about deep learning approaches and the proposed approach. The fourth section contains the results of the experimental analysis. The last section consists of the Discussion and Conclusion.

II. Dataset

The dataset consists of ECG data that includes various physical activities, created with 10 volunteer participants. Using wearable sensors to measure movement data and vital signs, other feature data were also created to detect activities. Measurements were made at an average rate of 50 Hz during the creation of the ECG data. The activity types created are 13 in total [13], [14]. The repetition/duration equivalents of the activity types that make up the dataset are given in Table 1.

Table I. Repetition/duration equivalents of the activity types that make up the dataset

Activity Type | Times (x) / minute (m)
Nothing | -
Standing still | 1m
Sitting and relaxing | 1m
Lying down | 1m
Walking | 1m
Climbing stairs | 1m
Waist bends forward | 20x
Frontal elevation of arms | 20x
Knees bending | 20x
Cycling | 1m
Jogging | 1m
Running | 1m
Jump front & back | 20x

The dataset contains 14 feature columns for each data operation. In addition, 20% of the dataset was separated as test data in the experimental analysis, and 80% was allocated as training data.
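The preparation described above can be reproduced along these lines; a minimal sketch, assuming the Kaggle copy of the MHEALTH data cited in [13], with 14 feature columns and an "Activity" label column (the file and column names are assumptions, not confirmed by the paper):

```python
# Minimal sketch of the 80/20 split described above (assumed file/column names).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("mhealth_raw_data.csv")     # assumed file name
X = df.drop(columns=["Activity"]).values     # 14 feature columns (assumed label name)
y = df["Activity"].values                    # 13 activity types, coded 0-12

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)   # 80% training / 20% test
```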
III. Model Approaches

This section contains brief information about the models used in the proposed approach.

A. Bi-directional Long Short-Term Memory

The Bi-LSTM model is the combined state of bidirectional recurrent networks (Bi-RNN). The most important feature that distinguishes LSTM from RNN is the gates used as memory units. Input data are processed through input gates and transferred to output gates. The difference of the Bi-LSTM model from the LSTM model is that it can return to the previous context and receive data. That is, in Bi-LSTM models, previous layer information and next layer information are kept in memory gates [15]. In Bi-LSTM models, the number of hidden units and their functioning are calculated according to Eq. (1) and Eq. (2). In these equations, L and H give the numbers of input and hidden units. Here, t represents the time value, x^t is the sequence input, \theta_h the activation function of the hidden unit, w the weight values of the hidden unit, and b_h^t represents the activation of hidden unit h at time t [15], [16].

a_h^t = \sum_{l=1}^{L} x_l^t w_{lh} + \sum_{h'=1}^{H} b_{h'}^{t-1} w_{h'h}, \quad t > 0   (1)

b_h^t = \theta_h(a_h^t)   (2)

B. Convolutional Neural Networks

CNN is an artificial intelligence-based model used in classification, recognition, segmentation, etc. processes by processing input data [17]. These networks generally consist of convolutional layers, pooling layers, and fully connected/dense layers [18]. Apart from that, they can contain different layers in line with the target of the model. CNN models contain hidden layers in their architectural structures. Convolutional layers process the input data with filters to extract activation features [19]. The mathematical formula in Eq. (3) is used to extract the activation maps [20]. In this equation, the variable F represents the layer of the activation map. The variable n represents the number of features in the layer. Variables i, j, and k provide the position information of the input data. Matrix values are represented by the variable M.

F_{i,l}^{j} = \sum_{k=1}^{n} M_{i,l}^{j,k} \times F_k^{i-1}   (3)

Pooling layers reduce the dimensions of the activation maps, and the fully connected layers combine the extracted features [21], [22]. The softmax function produces probability values, as many as the number of feature types, from the fully connected layer and tags the input with the dominant probabilistic type [23].

Also, the ReLU activation function is generally preferred between layers in CNN models. ReLU is an activation function that passes positive input values linearly and keeps negative values at zero [24], [25]. Batch-normalization and dropout functions can be used in CNNs to prevent models from over-/underfitting [26].

C. Proposed Approach

The proposed approach is the result of combining the CNN-based Bi-LSTM model, which aims to describe human physical activities. Using the activity action features and ECG signal data obtained from sensors, the C-BiLSTM model aimed to successfully classify 13 activity types. The C-BiLSTM model is designed completely in the Python programming language, and coding was carried out using libraries such as TensorFlow, Keras, Pandas, and NumPy. The Jupyter Notebook interface program was used in the compilation of the model. The general design of the model is given in Table 2. The proposed approach consists of layers of the CNN model and layers of the LSTM model. Since the two models are designed with open source code, data transitions and parameter values between the models in the Python software must be compatible. Therefore, in the proposed approach, the output values from the last layer of the CNN model were made equal to the input values of the BiLSTM model. In other words, while providing the transition, the tensor numbers, parameter values, and input size are hybridized to be the same. The normalization between the feature values obtained from the convolutional layers was carried out with the batch-normalization layer. Also, by using the dropout layer in the last layers of the proposed approach, inefficient features were prevented from being trained by the model. Thus, training speed and time savings were achieved for the model.

Table II. The general design of the proposed model

Layer | Value / Output Shape
Convolutional | (None, 400, 64)
Batch Normalization & ReLU | (None, 400, 64)
Convolutional | (None, 400, 128)
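For illustration, the Table II design can be sketched in Keras, the library the paper names. The Conv1D widths and output shapes follow the preserved rows of Table II; the kernel sizes, BiLSTM width, and dropout rate are assumptions, since the remainder of the table is not preserved in this copy:

```python
# Minimal sketch of the C-BiLSTM design described in Table II (assumptions noted).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_c_bilstm(timesteps=400, n_features=14, n_classes=13):
    model = models.Sequential([
        layers.Input(shape=(timesteps, n_features)),
        layers.Conv1D(64, kernel_size=3, padding="same"),   # -> (None, 400, 64)
        layers.BatchNormalization(),
        layers.ReLU(),                                      # -> (None, 400, 64)
        layers.Conv1D(128, kernel_size=3, padding="same"),  # -> (None, 400, 128)
        layers.BatchNormalization(),
        layers.ReLU(),
        layers.Bidirectional(layers.LSTM(64)),              # assumed BiLSTM width
        layers.Dropout(0.5),                                # assumed dropout rate
        layers.Dense(n_classes, activation="softmax"),      # 13 activity types
    ])
    return model

model = build_c_bilstm()
model.summary()
```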
In addition, in the training of the proposed model, the RMSprop optimization method was preferred and the loss function was chosen as categorical cross-entropy.

IV. Experimental Analysis

The experimental analyses were run on the Google Colab server. Training was performed using the GPU, and the preferred epoch value for training was 100. The other preferred parameters in model compilation were: optimizer method RMSprop, loss function "sparse categorical cross-entropy", and metric function "sparse categorical accuracy". The "early stopping" parameter was used to prevent overfitting in the model. A confusion matrix was used to compare analysis results. Eqs. (5)-(8) were used to calculate the matrix metrics. The variables used in these equations are defined as (TP): true positive, (TN): true negative, (FP): false positive, and (FN): false negative [27]–[29].

Precision (Pre) = \frac{TP}{TP + FP}   (5)

Recall (Rec) = \frac{TP}{TP + FN}   (6)

F-score (F-scr) = \frac{2 \times TP}{2 \times TP + FP + FN}   (7)

Accuracy (Acc) = \frac{TP + TN}{TP + TN + FP + FN}   (8)

The experimental analyses took an average of 16 seconds per epoch, and the training-test success plot for the analyses is shown in Fig. 1. The analysis results obtained from the training are given in Table 3. The confusion matrix is given in Table 4. The success rates of the proposed approach for different epochs are given in Table 5, together with information about the total time spent on training the model. The total training time of the model was 1659 seconds.

Table III. Metric results obtained in experimental analysis

Activity Type | Pre | Rec | F-scr
Nothing | 1.00 | 1.00 | 1.00
Standing still | 1.00 | 1.00 | 1.00
Sitting and relaxing | 1.00 | 1.00 | 1.00
Lying down | 0.99 | 1.00 | 1.00
Walking | 0.99 | 0.99 | 0.99
Climbing stairs | 0.99 | 1.00 | 1.00
Waist bends forward | 1.00 | 0.98 | 0.99
Frontal elevation of arms | 0.98 | 1.00 | 0.99
Knees bending | 0.99 | 0.99 | 0.99
Cycling | 1.00 | 0.99 | 1.00
Jogging | 0.97 | 0.99 | 0.98
Running | 0.92 | 0.94 | 0.93
Jump front & back | 1.00 | 0.83 | 0.91
Overall Acc (all classes) | 0.98

Table IV. The confusion matrix obtained from the proposed approach. (#1: nothing, #2: standing still, #3: sitting and relaxing, #4: lying down, #5: walking, #6: climbing stairs, #7: waist bends forward, #8: frontal elevation of arms, #9: knees bending, #10: cycling, #11: jogging, #12: running, #13: jump front & back)
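A minimal sketch of the reported training configuration (RMSprop, sparse categorical cross-entropy, sparse categorical accuracy, early stopping) and of the per-class metrics of Eqs. (5)-(8), reusing the model and split from the sketches above and assuming the samples have already been windowed into (400, 14) segments; the patience value is an assumption:

```python
# Minimal sketch of the training setup and Table III-style metrics.
import tensorflow as tf
from sklearn.metrics import classification_report

model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["sparse_categorical_accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                              patience=5,            # assumed
                                              restore_best_weights=True)

# X_train / X_test assumed windowed to shape (n_samples, 400, 14)
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=100, callbacks=[early_stop])

# Per-class precision, recall, and F-score, as in Eqs. (5)-(8)
y_pred = model.predict(X_test).argmax(axis=1)
print(classification_report(y_test, y_pred))
```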
Table V. Time-accuracy success chart obtained in the training process.

Epoch | Training Time (sec.) | Training Acc. (%) | Test Acc. (%)
25 | 442 | 98.73 | 96.71
50 | 846 | 99.21 | 97.93
75 | 1251 | 99.40 | 97.14
100 | 1659 | 99.64 | 98.96

As a result of the analysis, the overall accuracy success of the proposed approach in recognizing human activity types was 98.96%. The overall accuracy success obtained from the training data was 99.64%. The proposed approach produced results close to 100% in the analysis of data from the ECG and three types of sensor devices.

V. Conclusion

Measurement of human activities is carried out using various mechanical/electronic devices. Artificial intelligence-based approaches have become indispensable for a faster and more accurate realization of such systems [30]. In this study, an artificial intelligence-based identification system was proposed using ECG and sensor measurement data of physical activities. The proposed approach has a structure derived from the integration of the CNN model with the Bi-LSTM model. The contribution of the proposed approach is that it performs more efficient analyses than traditional methods and can increase success with a hybrid structure. A 98.96% success rate was achieved in the study analyses. As a result, the contribution of the hybrid approach was observed in the experimental analyses performed in this study. The results obtained showed that the proposed approach is promising.

In future studies, hybrid models will be designed on video data of real-time human activities. In addition, metaheuristic methods that optimize time savings will be used in the hybrid models.

References

[1] C. Jobanputra, J. Bavishi, and N. Doshi, "Human Activity Recognition: A Survey," Procedia Comput. Sci., vol. 155, pp. 698–703, 2019.
[2] M.-S. Dao, T.-A. Nguyen-Gia, and V.-C. Mai, "Daily Human Activities Recognition Using Heterogeneous Sensors from Smartphones," Procedia Comput. Sci., vol. 111, pp. 323–328, 2017.
[3] J. Rose and B. Furneaux, "Innovation Drivers and Outputs for Software Firms: Literature Review and Concept Development," Adv. Softw. Eng., vol. 2016, p. 5126069, 2016.
[4] R. Vinuesa et al., "The role of artificial intelligence in achieving the Sustainable Development Goals," Nat. Commun., vol. 11, no. 1, p. 233, Jan. 2020.
[5] E. Prem, "Artificial Intelligence for Innovation in Austria," Technol. Innov. Manag. Rev., vol. 9, no. 12, 2019.
[6] S. Wan, L. Qi, X. Xu, C. Tong, and Z. Gu, "Deep Learning Models for Real-time Human Activity Recognition with Smartphones," Mob. Networks Appl., vol. 25, no. 2, pp. 743–755, 2020.
[7] J. Gong, R. Li, H. Yao, X. Kang, and S. Li, "Recognizing Human Daily Activity Using Social Media Sensors and Deep Learning," Int. J. Environ. Res. Public Health, vol. 16, no. 20, p. 3955, Oct. 2019.
[8] N. Jlidi, A. Snoun, T. Bouchrika, O. Jemai, and M. Zaied, "PTLHAR: PoseNet and transfer learning for human activities recognition based on body articulations," in Proc. SPIE, 2020, vol. 11433.
[9] E. Sansano, R. Montoliu, and Ó. Belmonte Fernández, "A study of deep neural networks for human activity recognition," Comput. Intell., vol. 36, no. 3, pp. 1113–1139, Aug. 2020.
[10] S. Mekruksavanich and A. Jitpattanakul, "Biometric User Identification Based on Human Activity Recognition Using Wearable Sensors: An Experiment Using Deep Learning Models," Electronics, vol. 10, no. 3, p. 308, Jan. 2021.
[11] N. Golestani and M. Moghaddam, "Human activity recognition using magnetic induction-based motion signals and deep recurrent neural networks," Nat. Commun., vol. 11, no. 1, p. 1551, 2020.
[12] S. Tsokov, M. Lazarova, and A. Aleksieva-Petrova, "Accelerometer-based human activity recognition using 1D convolutional neural network," IOP Conf. Ser. Mater. Sci. Eng., vol. 1031, no. 1, p. 12062, 2021.
[13] G. Jain, "Mobile Health Human Behavior Analysis," Feb. 2021. [Online]. Available: https://www.kaggle.com/gaurav2022/mobile-health. [Accessed: 30-Mar-2021].
[14] O. Banos et al., "Design, implementation, and validation of a novel open framework for agile development of mobile health applications," Biomed. Eng. Online, vol. 14, Suppl. 2, p. S6, 2015.
[15] I. N. Yulita, M. I. Fanany, and A. M. Arymuthy, "Bi-directional Long Short-Term Memory using Quantized data of Deep Belief Networks for Sleep Stage Classification," Procedia Comput. Sci., vol. 116, pp. 530–538, 2017.
[16] C. Zhang, D. Biś, X. Liu, and Z. He, "Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks," BMC Bioinformatics, vol. 20, no. 16, p. 502, 2019.
[17] R. Yang and Y. Yu, "Artificial Convolutional Neural Network in Object Detection and Semantic Segmentation for Medical Imaging Analysis," Front. Oncol., vol. 11, p. 638182, Mar. 2021.
[18] S. Asif and K. Amjad, "Automatic COVID-19 Detection from chest radiographic images using Convolutional Neural Network," medRxiv, p. 2020.11.08.20228080, Jan. 2020.
[19] W. S. Ahmed and A. A. A. Karim, "The Impact of Filter Size and Number of Filters on Classification Accuracy in CNN," in 2020 International Conference on Computer Science and Software Engineering (CSASE), 2020, pp. 88–93.
[20] H. J. Jie and P. Wanda, "RunPool: A Dynamic Pooling Layer for Convolution Neural Network," Int. J. Comput. Intell. Syst., vol. 13, no. 1, p. 66, 2020.
[21] R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi, "Convolutional neural networks: an overview and application in radiology," Insights Imaging, vol. 9, no. 4, pp. 611–629, 2018.
[22] S. H. S. Basha, S. R. Dubey, V. Pulabaigari, and S. Mukherjee, "Impact of fully connected layers on performance of convolutional neural networks for image classification," Neurocomputing, vol. 378, pp. 112–119, 2020.
[23] H. A. Almurieb and E. S. Bhaya, "SoftMax Neural Best Approximation," IOP Conf. Ser. Mater. Sci. Eng., vol. 871, p. 12040, 2020.
[24] C. Banerjee, T. Mukherjee, and E. Pasiliao, "An Empirical Study on Generalizations of the ReLU Activation Function," in Proceedings of the 2019 ACM Southeast Conference, 2019, pp. 164–167.
[25] A. Sawant, M. Bhandari, R. Yadav, R. Yele, and S. Bendale, "Brain Cancer Detection From MRI: A Machine Learning Approach (TensorFlow)," Int. Res. J. Eng. Technol., vol. 05, no. 04, p. 2089, 2018.
[26] C. Garbin, X. Zhu, and O. Marques, "Dropout vs. batch normalization: an empirical study of their impact to deep learning," Multimed. Tools Appl., vol. 79, no. 19, pp. 12777–12815, 2020.
[27] F. Demir, A. Şengür, V. Bajaj, and K. Polat, "Towards the classification of heart sounds based on convolutional deep neural network," Heal. Inf. Sci. Syst., vol. 7, no. 1, p. 16, 2019.
[28] M. Hasnain, M. F. Pasha, I. Ghani, M. Imran, M. Y. Alzahrani, and R. Budiarto, "Evaluating Trust Prediction and Confusion Matrix Measures for Web Services Ranking," IEEE Access, vol. 8, pp. 90847–90861, 2020.
[29] D. Chicco and G. Jurman, "The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation," BMC Genomics, vol. 21, no. 1, p. 6, 2020.
[30] T. Davenport, A. Guha, D. Grewal, and T. Bressgott, "How artificial intelligence will change the future of marketing," J. Acad. Mark. Sci., vol. 48, no. 1, pp. 24–42, 2020.
Preventing Oscillation of Supply Voltage Due to
Resonance Harmonic Frequency of a Power Factor
Corrected System
Adel Ridha Othman
Electromechanical Engineering Department
University of Technology, Baghdad, Iraq
adel.r.othman@uotechnology.edu.iq
Abstract—The variable speed AC and DC drives that use power electronics produce high levels of harmonic distortion. The simulated distribution system of this paper consists of a 150 kVA generator supplying a DC drive of 75 kW of silicon controlled rectifier (SCR) type. A shunt capacitor bank is connected at the point connecting the supply with the load, which is called the point of common coupling (PCC#2), to correct the power factor; it is therefore called a power factor correction (PFC) capacitor. The parallel circuit of the PFC capacitor and the inductance of the system has a resonance frequency equal to the 11th harmonic of the system, and that causes an oscillation and distortion in the supply voltage. A shunt passive filter, single tuned to the resonance frequency of the 11th harmonic component, is designed to mitigate the voltage total harmonic distortion (VTHD) into compliance with IEEE Std 519:2014 and to prevent the oscillation of the supply voltage.

Keywords—Harmonic, resonance, filter, distortion.

I. Introduction

The aim of studying harmonic distortion is to calculate the harmonic currents, voltages, and the percentages of the distortion indices in an electrical system, then analyze the resonance situation and design filters to mitigate its effect on the power system [1]. For reducing harmonics, the power converter used should operate with 12 pulses or be a higher-pulse static converter. But due to the frequent maintenance required by 12-pulse converters, six-pulse converters are almost always used instead. Converters with a lower number of pulses inject large values of 5th and 7th harmonics, as well as the harmonic orders related to the 12-pulse characteristic harmonics. These are traditionally filtered out by designing tuned filters to mitigate the low order harmonics such as the 5th, 7th, 11th, 13th, 17th, and higher orders [2]. In [3] various passive filters are used to mitigate the harmonic distortion and correct the power factor; in that work single and double tuned filters are designed and investigated for mitigating the harmonics. In [4] the harmonic distortion injected in a 20 kV distribution system is mitigated using passive and active harmonic filters. In [5] a double tuned filter is designed to mitigate the harmonics. In [6] passive harmonic filters are used to mitigate the source side current harmonics of a rectifier and a DC-DC chopper feeding a DC motor. In [7] an LC filter is connected at the input of a DC motor to reduce the high frequency harmonics and torque ripples. In [8] the harmonic distortion of industrial sources is analyzed, and harmonic filters are designed to reduce the effects of these harmonics on the industrial supply. In [9] the harmonic voltage in a large power system containing AC-DC converters with variable large capacity loads is mitigated using a passive filter. In [10] different applications of passive and active harmonic filters are presented for mitigation of harmonic distortion. In [11] an active filter is used to mitigate the harmonics, and an easy control method that does not need transformation between power system frameworks is contributed. In [12] a Matlab/Simulink simulation is used to implement an active power filter that selects the current harmonics to be mitigated. In [13] synchronous reference frame theory is used to generate the reference current signal for a voltage source inverter controlled by a hysteresis control method, simulating a shunt active power filter based on the synchronous reference frame and hysteresis control method to reduce the harmonic distortion in a power system. [14] presents a case study of a hybrid active filter and a passive notch filter to improve the total harmonic distortion in an industrial power system. [15] presents an active filter to mitigate the harmonics generated by the nonlinear loads of a photovoltaic system connected to the grid. In [16] Matlab/Simulink software is used to simulate an active filter that improves the total harmonic distortion in a photovoltaic grid-connected system controlled by a proportional resonance control method; the effectiveness of the filter is investigated using different load types. [17] presents Matlab/Simulink modelling of a photovoltaic (PV) grid-connected system to study the optimal location of the PV system relative to the utility grid for minimum harmonic distortion without implementing harmonic filters. In [18] an active filter is used to mitigate the harmonic distortion in wind turbine power plants, and the optimal location of the filter is studied. In [19] an active filter is simulated with Matlab/Simulink software for an industrial power system using a nonlinear power electronic load that injects harmonics into the power system. In [20] a digital simulation of a shunt active filter is used to compensate for harmonics and reactive power. In [21] a single tuned filter is designed using the Electromagnetic Transient Analysis Program (ETAP) to mitigate the current harmonics injected by variable speed drives (VSD). In this work the simulated distribution system is composed of a 150 kVA generator supplying a DC drive of 75 kW of six-pulse silicon controlled rectifier (SCR) type. A parallel capacitor is added at the common coupling point (PCC#2) to correct the power factor (PF). The frequency of resonance of
the parallel circuit of that capacitor and the inductance of the system happens to be the 11th harmonic, and that causes an oscillation and distortion in the supply voltage. A shunt passive filter single tuned to the 11th order harmonic is designed to mitigate the voltage total harmonic distortion (VTHD) into compliance with IEEE Std 519:2014 and to prevent the oscillation of the supply voltage [22].

h_r = \sqrt{\frac{kVA_{short\ circuit}}{kVAR_{capacitor\ bank}}}   (1)

From equation (1) the resonant harmonic order h_r can be calculated by knowing the short circuit power level of the system (kVA short circuit) and the installed capacitor bank reactive power rating (kVAR capacitor bank). Figure I shows the one-line diagram of the distribution system.

The voltage and current waveforms and their harmonic spectra are shown in figure II.
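As a quick numeric check of Eq. (1), a sketch with placeholder values (not the paper's system data), assuming the conventional square-root form of the resonance expression:

```python
# Resonant harmonic order of a PFC capacitor bank, per Eq. (1).
import math

def resonant_harmonic(kva_short_circuit, kvar_capacitor_bank):
    # Harmonic order at which the bank resonates with the system inductance
    return math.sqrt(kva_short_circuit / kvar_capacitor_bank)

# Placeholder values chosen so the bank resonates at the 11th harmonic
print(resonant_harmonic(7623, 63))  # -> 11.0
```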
The capacitor bank current waveform and its harmonic spectrum are shown in figure III.

ISC / IL | 11.7
S (kVA) | 88.3
Q (kVAR) | 28.9
P (kW) | 83.5

III. Design of a Series Passive Single Tuned Filter (SPSTF)

An R-L-C circuit connected in series and tuned to a specified frequency constitutes a series single tuned filter (STF), and that specified frequency is the resonance frequency of the series R-L-C. The 11th harmonic component is shown in table II to be of the highest value among the other harmonics. At PCC#2 a filter of type SPSTF tuned to the 11th harmonic component is connected. The SPSTF must supply reactive power to compensate the power factor, which is calculated to improve the true power factor (TPF) [24]. In this work the TPF is improved from 0.55 to 0.9. From table I:

TPF = 0.55 (True Power Factor)
Active power = 63.2 kW
The phase angle of the TPF is φ_TPF = arccos(0.55) = 57°, so tan φ_TPF = 1.5.
The PF is intended to be 0.9, so φ_required = 26° and tan φ_required = 0.5.
The compensation reactive power = load (kW) × [tan φ_TPF − tan φ_required] = 63.2 × [1.5 − 0.5] = 63.2 kVAR

X_C = \frac{V_{LL}^2}{VAR} = \frac{400^2}{63.2 \times 10^3} = 2.5\ \Omega   (2)

X_C = \frac{1}{2\pi f C}   (3)

C = \frac{1}{2\pi f X_C} = \frac{1}{2\pi \times 550 \times 2.5} = 116\ \mu F   (4)

X_C = X_L   (5)

L = \frac{2.5}{2\pi \times 550} = 0.72\ mH   (6)

R = \frac{X_L}{Q.F} = \frac{2.5}{40} = 0.06\ \Omega   (7)

where Q.F is the inductor quality factor, assumed equal to 40.

Figure IV shows the voltage and current waveforms with their respective harmonic spectra with the filter connected; in comparison with figure II it is clear that the waveforms and harmonic spectra are greatly improved.

Figure IV: Voltage and Current Waveforms at Resonance and the Respective Spectrum at PCC#2
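The component sizing in Eqs. (2)-(7) can be replayed numerically; a minimal sketch using the paper's figures (400 V line voltage, 63.2 kVAR compensation, the 11th harmonic of a 50 Hz system, quality factor 40), noting that the paper rounds X_C to 2.5 Ω before Eq. (4):

```python
# Series passive single tuned filter (SPSTF) sizing, per Eqs. (2)-(7).
import math

V_LL = 400.0          # line-to-line voltage, V
Q_comp = 63.2e3       # required reactive compensation, VAR
f_tuned = 11 * 50.0   # tuning frequency, Hz (11th harmonic of 50 Hz)
QF = 40               # inductor quality factor

Xc = round(V_LL**2 / Q_comp, 1)       # Eq. (2): 2.5 ohm (rounded, as in the paper)
C = 1 / (2 * math.pi * f_tuned * Xc)  # Eqs. (3)-(4): ~116 uF
Xl = Xc                               # Eq. (5): series-tuned condition
L = Xl / (2 * math.pi * f_tuned)      # Eq. (6): ~0.72 mH
R = Xl / QF                           # Eq. (7): ~0.06 ohm

print(f"C = {C*1e6:.0f} uF, L = {L*1e3:.2f} mH, R = {R:.3f} ohm")
```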
V. Discussion

Table IV shows that the SPSTF with a resonance frequency at the 11th harmonic managed to mitigate the VTHD into compliance with IEEE Std 519:2014, and the filter has eliminated the resonance; therefore the voltage oscillation is also eliminated, as shown in figure IV. The ITHD is still not in compliance with (i.e., is out of the limit of) IEEE Std 519:2014, but it is lowered greatly from 31.5% to 6.3%, and the individual current harmonic component distortions are also reduced, as is clear by comparing the values in table II with table IV. Other SPSTF branches can be designed and installed to mitigate the 7th harmonic. To bring the ITHD into compliance with IEEE Std 519:2014, a high pass filter can be used.

Table IV. Compliance with IEEE Std 519:2014

Calculated Parameter | Harmonic No. | Calculated Value [%] | IEEE Std 519:2014 Limit [%] | Result
VTHD | - | 5.3 | 8.0 | PASS
MVDHC | 7 | 3.5 | 5.0 | PASS
ITHD | - | 6.3 | 5.0 | FAIL
MIDHC | 7 | 5.6 | 4.0 | FAIL
MIDHC 11 to 16 | 11 | 2.1 | 2.0 | FAIL
MIDHC 17 to 22 | 17 | 1.1 | 1.5 | PASS
MIDHC 23 to 34 | 23 | 0.7 | 0.6 | FAIL
MIDHC 35 | 35 | 0.3 | 0.3 | PASS

VI. References

[1] J. C. Das, "Power System Analysis: Short-Circuit Load Flow and Harmonics," Amec, Inc., Atlanta, Georgia.
[2] J. Arrillaga and N. Watson, "Power Systems Harmonics," 2nd ed., Wiley, New York, 2003.
[3] S. N. AL. Yousif, M. Z. C. Wanik, A. Mohamed, "Implementation of Different Passive Filter Designs for Harmonic Mitigation," National Power & Energy Conference (PECon) 2004 Proceedings, Kuala Lumpur, Malaysia.
[4] M. Rusli, M. Ihsan, D. Setiawan, "Single Tuned Harmonic Filter Design as Total Harmonic Distortion Compensator," 23rd International Conference on Electricity Distribution, Lyon, 15-18 June 2015.
[5] Y.-H. He, H. Su, "A New Method of Designing Double-Tuned Filter," Proceedings of the 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE 2013).
[6] H. Prasad, M. Chilambarasan, T. D. Sudhakar, "Application of Passive Harmonic Filters to Mitigate Source Side Current Harmonics in an AC-DC-DC System," IJRET: International Journal of Research in Engineering and Technology, vol. 03, no. 01, Jan. 2014, http://www.ijret.org.
[7] A. Albert Rajan, S. Vasantharathna, "Harmonics and Torque Ripple Minimization Using L-C Filter for Brushless DC Motors," International Journal of Recent Trends in Engineering, vol. 2, no. 5, Nov. 2009.
[8] T. M. Aye, S. W. Naing, "Analysis of Harmonic Reduction by Using Passive Harmonic Filters," International Journal of Scientific Engineering and Technology Research, vol. 03, no. 45, Dec. 2014, pp. 9142-9147.
[9] B. Park, J. Lee, H. Yoo, and G. Jang, "Harmonic Mitigation Using Passive Harmonic Filters: Case Study in a Steel Mill Power System," Energies, vol. 14, 2278, 2021, https://doi.org/10.3390/en14082278.
[10] L. Motta, N. Faúndes, "Active/Passive Harmonic Filters: Applications, Challenges & Trends," IEEE, 2016.
[11] A. Medina-Rios and H. A. Ramos-Carranza, "An Active Power Filter in Phase Coordinates for Harmonic Mitigation," IEEE Transactions on Power Delivery, vol. 22, no. 3, July 2007.
[12] L. A. Cleary-Balderas, A. Medina-Rios, "Selective Harmonic Current Mitigation with a Shunt Active Power Filter," IEEE, 2012.
[13] K. Bhattacharjee, "Harmonic Mitigation by SRF Theory Based Active Power Filter using Adaptive Hysteresis Control," 2014 Power and Energy Systems: Towards Sustainable Energy (PESTSE 2014).
[14] H. Tischer, T. Pfeifer, "Hybrid Filter for Dynamic Harmonics Filtering and Reduction of Commutation Notches - A Case Study," IEEE, 2016.
[15] M. J. M. A. Rasul, H. V. Khang, M. Kolhe, "Harmonic Mitigation of a Grid-connected Photovoltaic System using Shunt Active Filter," IEEE, 2017.
[16] J. C. Colque, J. L. Azcue, E. Ruppert, "Photovoltaic system grid-connected with active power filter functions for mitigating current harmonics feeding nonlinear loads," 2018 13th IEEE International Conference on Industry Applications.
Arranging Spaces for the Next Pandemic by Drawing Inspiration from the Qajar Period
Parastoo Pourvahidi
Department of Architecture, Faculty of
Fine Arts, Design and Architecture,
Cyprus International University, Via
Mersin 10, 99258 Nicosia, Turkey
ppourvahidi@ciu.edu.tr
yard. Spaces such as the access room, house pool, and interior yard are the ones used by family members. In these spaces even close relatives and family members meet up for activities. Furthermore, spaces like the basement, the two-door room, and the back room are considered private zones in Iranian traditional houses. These spaces were used for relaxation, study, sleep, and chat; therefore they are positioned at the utmost part of the house [5]. Figure I represents the arrangement of the spaces in traditional houses.

space such as the courtyard; after the courtyard, users enter the doorway, which was again a semi-open space. However, today's residential building plans are composed of completely closed areas with a small percentage of semi-open spaces (the balcony). The balcony in most contemporary buildings is small; hence, residents always change its function to storage rather than a place for sitting and enjoying a comfortable space. The lack of space, the increasing population, and the price of land based on closed spaces are the causes of all these changes over the passing years. But the pandemic period changed this routine and alerted residents to the value of open and semi-open spaces in the house for communicating with neighbors, freshening the air, and spending lockdown in open space without fear of the virus. Furthermore, figure II shows the arrangement of the spaces in a contemporary building plan, which allows the users to meet the necessity of having semi-open spaces such as the balcony.
architectural form and phenomena. Space syntax is defined as one of the methods which can explore space morphology [3]. The space syntax method has the purpose of noticing the social relationships between spaces [6]. From another point of view, space syntax attempts to detect the reason for the independence of each space and tries to state each space based on its position [7]. There are different kinds of software applicable for the space syntax method; this research used depthmapX for the analysis.

depthmapX is analysis software for representing the spatial network of different spaces. Alasdair Turner (Space Syntax group) developed this software. The purpose of this software is to create a map which shows the spatial elements and the relationships between them [4]. Hence, depthmapX is one of the methods used in this research.

Depth map analysis manifests the values of TDn, MDn, i, CV, and RA for each space. Also, Ostwald mentioned that for comparing buildings with each other, the outcomes must be normalized in terms of relative depth, which is called Relative Asymmetry (RA). This result lies between 0 and 1 [11]. Segregation is specified by a high value of Relative Asymmetry, and integration by a low value of Relative Asymmetry [8].

RA is the reflection of the relative isolation of a carrier space. Afterwards, the integration value i can be calculated from the share of that node. The table below represents the formulas for the relationship between the RA and i values (table I) [11].

Table I: RA and i formulas [11].
RA: a measure of how deep a system is (Relative Asymmetry).
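The body of Table I is not fully preserved in this copy; the standard justified plan graph definitions that Ostwald [11] uses, assumed here to match the lost table content, are:

```latex
% Justified plan graph measures (after Ostwald [11]); assumed content of Table I.
% TD_n: total depth of node n; K: number of nodes (spaces) in the graph.
\begin{align}
  MD_n &= \frac{TD_n}{K-1}            && \text{mean depth}\\
  RA_n &= \frac{2\,(MD_n - 1)}{K - 2} && 0 \le RA_n \le 1\\
  i_n  &= \frac{1}{RA_n}              && \text{integration value}
\end{align}
```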
Before starting the analysis with depthmapX, this research first analyzed the spaces in the traditional building plan of the Qajar period in general. The analysis is clarified as follows:

The proportion of open and semi-open spaces (blue color) and closed spaces (yellow color) demonstrated that open spaces were as important as closed spaces during the Qajar period. In addition, during a pandemic, residents could spend more time outside, even invite people over, and there is enough space for social activities (fig. III).

Figure III: Traditional building plan in the Qajar period (a, b), indicating open (blue) and closed (yellow) spaces.

The dash lines demonstrate the combination of the living room with the courtyard. The interesting thing about these spaces is the partition wall, which was flexible.

The pandemic period demonstrates how an open plan creates problems for inhabitants when one of them should work online and the children should attend school online; with no partition wall, concentration would be impossible. However, in this Qajar period plan, having a flexible partition wall makes separating the spaces possible and beneficial during a pandemic (fig. IV).

Figure IV: Dash lines represent the flexible space in the Qajar period.

One of the first spaces after the courtyard is the bathroom.

Figure V: Dash lines represent the bathroom space in the Qajar period.

There is a special space which they call the shoe-removing space. This space is an open space, in order to remove the smell, and is the second space after the connecting corridor to the living room. These kinds of spaces are essential during the Coronavirus disease period, in order to leave all the dirty shoes outside of the living space and to stay in an open space to become clean with sunlight (fig. VI).
Designing the corridor (yellow color) between the spaces is another positive fact in the plan of the Qajar period. Entering a space directly with shoes can cause a lot of problems during a pandemic. However, if each space is separated by a connecting corridor, it can help the inhabitants stay sanitary against Coronavirus disease (fig. VII).

Figure VII: Connecting corridor placed beyond each space.

Result

The depth map analysis in table II illustrates the axial analysis, in which the blue lines (low connectivity) demonstrate the sanitary areas and the red lines (high connectivity) the dirty spaces, in both the Qajar and contemporary buildings. Generally, drawing the graph of the case studies aids this research in manifesting the connectivity of each space; similarly, understanding the hygiene of the spaces becomes conceivable. Hence, this research analyzes the spaces based on the integration value alone.
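These measures can also be computed for any plan graph without depthmapX; a minimal sketch, assuming networkx and a toy room adjacency (illustrative only, not one of the paper's case studies):

```python
# Total depth (TDn), mean depth (MDn), relative asymmetry (RA), and
# integration (i) for each space of a toy plan graph.
import networkx as nx

plan = nx.Graph()
plan.add_edges_from([
    ("entrance", "wc"), ("entrance", "living room"),
    ("living room", "kitchen"), ("living room", "balcony"),
    ("living room", "bedroom"),
])

K = plan.number_of_nodes()
for node in plan.nodes:
    depths = nx.single_source_shortest_path_length(plan, node)
    TD = sum(depths.values())        # total depth from this space
    MD = TD / (K - 1)                # mean depth
    RA = 2 * (MD - 1) / (K - 2)      # relative asymmetry, 0..1
    i = 1 / RA                       # integration value
    print(f"{node:12s} TD={TD} MD={MD:.2f} RA={RA:.2f} i={i:.2f}")
```

Running this prints the highest integration for the living room (the carrier of the toy graph) and the lowest for the WC, mirroring how the paper reads dirty versus clean spaces from i.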
Table II: Analysis of the plan in the Qajar period throughout the pandemic (axial analysis) [12].

Table III: Analysis of contemporary building case No. A with the Agraph method [12] (plan / depth map / graph).
Legend: 0. Entrance, 1. WC, 2. Bathroom, 3. Living room, 4. Bedroom, 5. Bedroom, 6. Kitchen, 7. Balcony.

Result: The highest i represents the dirty space, which is the living room in this case. The lowest i represents the cleanest space, which is the balcony and the entrance space. Therefore, there is something wrong in the organization of space in the contemporary building: why should the balcony be the cleanest space?

Table IV: Analysis of contemporary building case No. B with the Agraph method [12] (plan / depth map / graph).
Legend: 0. Entrance, 1. WC, ..., 4. WC, 5. Bathroom, 6. Bedroom, 7. Bedroom, 8. Balcony.

Result: In this case study, the highest i (dirty space) value is again the living room.

Suggestion plan [12].
Legend: 0. Entrance, 1. Living room, 2. Kitchen, 3. Balcony, 4. WC, 5. Bathroom, 6. Bedroom, 7. Bedroom, 8. Balcony, 9. WC, 10. Balcony.

Result: After rearranging the spaces, the i value of the balcony changes to the same value as the bedrooms. Although the i value of the living room is still high, at least it is connected to the balcony (semi-open space) for refreshing the air in that space.
Moreover, in the traditional house of the Qajar period, the lowest connectivity is detected in the bedrooms and store room, while the highest connectivity is detected in the yards and corridor. Furthermore, the kitchen and bathroom show low connectivity as well. Nevertheless, in the contemporary buildings the lowest connectivity is detected in the bathroom and toilet, and the highest connectivity is detected in the living room.

After rearranging the spaces, the balcony's i value increases to the same as the bedroom's, which is convenient.

In case number one (A) (table III) and case number two (B) (table IV), before renovation, connectivity is highest in the living room (i=21 and i=14) and lowest in the balcony (A: i=1.75; B: i=2) and bedroom (A: i=2.62; B: i=3.11). Also, after renovation, connectivity is highest in the living room (A: i=45; B: i=21) and the balcony (A: i=3; B: i=4.5), and lowest in the two bedrooms (A: i=3; B: i=4.5).

Before rearranging the spaces, the integration value of the balcony (open space) is less than the other spaces, even less than the entrance. However, it is the best space during a pandemic, where people can keep social distance. In some of the buildings the balcony is located after the kitchen area, meaning access to this space must be through the kitchen. The WC is located next to the entrance door, which is good for hygiene prior to entering the main space. The kitchen can be treated like the balcony that is next to the entrance, and the kitchen could be placed partially in the living room. The partition between the living room and balcony can be a flexible wall, in order to extend the living room during winter and likewise extend the balcony during summer. Generally, the results of all the cases based on the Agraph method analysis demonstrated interesting values of TDn and integration for the balconies before and after renovating the plan. The TDn value of the balcony before renovation is the same as the entrance gate's, but after extending the balcony toward the living room the total depth value decreases noticeably. In addition, the integration value of the balcony was the lowest before renovating the plan, but after renovating it increased, which is a satisfying integration value for the balcony. Overall, extending the balcony toward the living room produces the nourishing outcome which this research expected. The pandemic period teaches architects that designing a beautiful closed space is not adequate for pleasing the user, since open spaces are needed as much as closed spaces.

Also based on the analysis, there is a decline of the RA value in both cases after renovation. This means that the segregation between the spaces (nodes) converts to integration due to the diminishing value of RA. Generally, the decreasing RA value in spaces like the balcony (semi-open) after renovation represents more integration of these spaces, which is satisfying. Subsequently, the mean RA value in the Qajar period is 0.22, which is lower than that of the contemporary houses: the RA value of case A is 0.36 and after renovation 0.30, and the RA value of case B is 0.31 and after renovation 0.21. Hence, the decline of the RA values after reorganizing the spaces is a satisfying development.

Therefore, based on all the analysis, these are the recommended rules, which can supplement the municipality's rules for modern buildings so that residents can be prepared for the next pandemic.

Rules and regulations for the next pandemic:
- The integration value should be high for the living room and balcony; the balcony should not have low integration.
- The wall of the balcony should be flexible, in order to be extended toward the living room based on the appeal of the user.
- The living room must be connected to semi-open spaces (balcony).
- Having open spaces such as a courtyard and balcony is a must in each residential building.
- There should be balance in the arrangement of semi-open, open, and closed spaces. (The pandemic period evidenced the fact that people prefer to spend some time during the day on the balcony just to see outside, for peace of mind. Nonetheless, unfortunately, the area of the balcony compared to the closed space is less than half in modern buildings.)
- Living room with partition wall (inhabitants could still keep the open plan, but by adding a partition glass or wall they could have the opportunity to separate the area whenever the children's online class is starting or they must do online work, which needs concentration).
- Roof function (most apartment buildings have an immense area on the roof which goes unused; COVID-19 clarified the fact that people need social interaction, so maybe the roof can be the space where the inhabitants of an apartment building can gather, since it is an open space).
- Positioning the WC first, next to the entrance door (washing hands was the first and most primitive way to stay safe during the pandemic period; thus locating the WC next to the entrance door could make it the first space a resident enters to become sanitized before entering the main space).
- Positioning a corridor before entering private spaces such as rooms (connecting spaces such as corridors create the chance not to enter private spaces directly; therefore, they can stay disinfected).
- Entrance hall (this space can be useful for removing shoes at the entrance section of the house, a space which was unfortunately obliterated from the space organization of the modern building. Nevertheless, this space can be semi-open for better ventilation, or the space can be ventilated mechanically.)

IV. Conclusion

COVID-19 signifies that the plan design of apartment houses should change, since user requirements have changed as well. The pandemic period states other desires of the user, such as the need for socializing in open space without fear of interacting with the virus. Afterwards, people tried to start
socializing with each other from small balcony in case to [7] Hillier, B. (2005). The art of place and the science of space.
nourishing their feeling during lock down. Thus, balcony World architecture 11(185), 96-102.
become the spot in all building for the place which user can [8] Khadiga, O. M., & Mamoun , S. (n.d.). the space syntax
methodology: fits and misfits. Art and comport, 189-204.
have connection with outside and also with their neighbors.
What if the balcony has the potential for becoming extended [9] Kong, M. (2017). Semi-Open Space and Micr Semi-Open
Space and Micro-Envir o-Environmental Contr onmental
from balcony for inviting the friends without fearing of Control for Impr ol for Improving. Dissertations - ALL. 810:
dispersing the virus? Renovation and adding one flexible glass Syracuse University.
wall in living room, which can have an access toward open [10] organization, w. h. (2020, July 29 ). Coronavirus disease
spaces for having, better indoor ventilation and entering the (COVID-19): Ventilation and air conditioning in public spaces
solar radiation, can be the rapid and worthwhile solution for and buildings. Retrieved from Q&A Detail:
enduring another pandemic. Also, this research recommended https://www.who.int/news-room/q-a-detail/coronavirus-
disease-covid-19-ventilation-and-air-conditioning-in-public-
to add the mandatory rule in municipality that doorway and spaces-and-buildings
accessible spaces such as passage should be open or semi- [11] Ostwald , M. J. (2011). The mathematics of spatial
open spaces. Furthermore, the percentage of the open space configuration: revisiting revising and crtiquing justified plan
(balcony) toward the close spaces should have specific graph theory. Nexus Netwrok journal, 445-470.
dimension for averting havoc for the next pandemic period. [12] Safarkhani, M. (2016). N PARTIAL FULFILLMENT OF THE
References

[1] Safarkhani, M. (2016). In Partial Fulfillment of the Requirements for the Degree of Master of Architecture in Architecture. The Graduate School of Natural and Applied Sciences of Middle East Technical University.
[2] Alitajer, S., & Molavi Nojumi, G. (2016). Privacy at home: Analysis of behavioral patterns in the spatial configuration of traditional and modern houses in the city of Hamedan based on the notion of space syntax. Frontiers of Architectural Research, 341-352.
[3] Bahrainy, H., & Taghabon, S. (2015). Deficiency of the space syntax method as an urban design tool, designing traditional urban space and the need for some supplementary methods. Space Ontology International Journal, 1-18.
[4] depthmapX development team. (2017). depthmapX (Version 0.6.0). Retrieved from https://github.com/SpaceGroupUCL/depthmapX/
[5] Zolfagharzadeh, H., Jafariha, R., & Delzendeh, A. (2017). Different Ways of Organizing Space Based on the Architectural Models of Traditional Houses: A New Approach to Designing Modern Houses (Case Study: Qazvin's Traditional Houses). Space Ontology International Journal, 6(4), 17-31.
[6] Hillier, B. (1996). Space is the machine: A configurational theory of architecture. Cambridge: Cambridge University Press.
[7] Hillier, B. (2005). The art of place and the science of space. World Architecture, 11(185), 96-102.
[8] Khadiga, O. M., & Mamoun, S. (n.d.). The space syntax methodology: fits and misfits. Art and Comport, 189-204.
[9] Kong, M. (2017). Semi-Open Space and Micro-Environmental Control for Improving. Dissertations - ALL, 810. Syracuse University.
[10] World Health Organization. (2020, July 29). Coronavirus disease (COVID-19): Ventilation and air conditioning in public spaces and buildings. Retrieved from https://www.who.int/news-room/q-a-detail/coronavirus-disease-covid-19-ventilation-and-air-conditioning-in-public-spaces-and-buildings
[11] Ostwald, M. J. (2011). The mathematics of spatial configuration: revisiting, revising and critiquing justified plan graph theory. Nexus Network Journal, 445-470.
[12] Safarkhani, M. (2016). In Partial Fulfillment of the Requirements for the Degree of Master of Architecture in Architecture. The Graduate School of Natural and Applied Sciences of Middle East Technical University.
[13] Sandford, A. (2020). Coronavirus: Half of humanity now on lockdown as 90 countries call for confinement. Euronews.
[14] Shahid Beheshti University. (1998). Ganjnameh. Tehran: Faculty of Architecture and Urban Planning Documentation and Research Center.
[15] Tamborrino, R. (2020). Here's how locking down Italy's urban spaces has changed daily life. weforum.org.
[16] Ülkeryıldız, E., Vural, D., & Yıldız, D. (2020, May 6-8). Transformation of Public and Private Spaces: Instrumentality of Restrictions on the. 3rd International Conference of Contemporary Affairs in Architecture and Urbanism (ICCAUA-2020).
[17] Zacka, B. (2020, May 11). An ode to the humble balcony, in times of the pandemic. Retrieved from DTNEXT: www.DTNEXT.IN/NEWS
Comparison of Sentiment-Lexicon-based and Topic-based
Sentiment Analysis Approaches on E-Commerce Datasets
Adeola O. Opesade
Department of Data and Information
Science, Faculty of Multidisciplinary
Studies, University of Ibadan, Nigeria
morecrown@gmail.com
Abstract— Discovering the underlying sentiment in a user's textual data is a complex task; nevertheless, human beings have been intuitive enough to interpret the tone of a piece of writing. The hugeness of online reviews, due to advancements in internet-based applications, has however made the need for computer-based models highly imperative for the sentiment analysis of texts and speeches. Many of the existing studies have examined the performance of either of the two main sentiment analysis approaches: symbolic and topic-based approaches. The present study comparatively investigated the performances of the Liu Hu sentiment-lexicon implementation and bag-of-words topic-based approaches. The study revealed, amongst others, that sentiment analysis, like other data mining tasks, is an experimental science. It recommends that analysts compare the performances of symbolic and topic-based approaches in their sentiment classification endeavors when deciding on the most precise technique to adopt.

Keywords— Sentiment analysis, Amazon customer reviews, Konga customer reviews, Liu-Hu sentiment-lexicon, Textual data mining

I. Introduction

Advancements in internet-based applications have fuelled the availability of huge volumes of personalised reviews on the Web [1]. This user-generated data, mostly unstructured, usually carries elements of user opinions and sentiments about goods, services, events and experiences in online or offline environments [2]. These reviews are becoming so increasingly important that many people now consult them as sources of information to aid their understanding, planning and decision-making processes [1]. Businesses have also adopted online reviews as part of their criteria for quality assessments [3]. Little wonder, then, that learning customers' emotional inclinations through online reviews is becoming more crucial in the present Information Age [4].

Discovering underlying sentiments, based on a user's textual data, is not a trivial task. This is especially due to the different intricacies associated with language, such as contextual differences, language implications and the sentiment indistinctness of certain words. Also, some writers could be sarcastic, while some others may not express specific sentiment markers in their writing [3]. Despite these complexities, human beings have been found to be passably intuitive in interpreting the tone of a piece of writing [5]. The massiveness of online reviews has, however, made human beings rely on computer-based models for identifying the polarity of sentiments expressed by writers, through the process known as sentiment analysis [6], [2].

Sentiment analysis, an intellectual process of extracting a user's feelings and emotions contained in a piece of writing or speech, is a language processing task that uses a computational approach to identify opinionated content and categorize it as neutral, positive or negative [7], [2]. It is one of the fields of Natural Language Processing (NLP) and data mining that has gained popularity in recent years [1], [3]. A lot of research work is being carried out in the field of sentiment analysis [1]. These studies employed different data mining and natural language methods, mainly classified as symbolic and machine learning approaches.

The symbolic approach, also known as rule-based classification, entails reliance on a sentiment lexicon to find the polarity of each word in a review; if the number of words tagged positive is greater than that tagged negative, it is concluded that the writer's sentiment is positive; otherwise, it is said to be negative [8]. It is therefore said to be a knowledge-based classification that might lack generality, due to the possibility of its closeness to specific linguistic and operational fields [4]. The machine learning approach, also regarded as a topic-based text classification approach, is a general solution, independent of any special field. In this approach, the reviews are represented by different features, followed by any text classification algorithm [4], [8].

In order to identify better alternatives in sentiment analysis, a number of studies have used rule-based approaches, combined with some text processing procedures and machine learning algorithms, to investigate the performances of sentiment analysis tasks [3], [7], [9]. A number of studies have also investigated the performances of sentiment analysis tasks based on the machine learning approach, using bag-of-words, n-grams and POS-tagged feature selection techniques combined with a series of text processing procedures and machine learning classifiers [6], [10], [11]. It could be observed that most of these previous studies have investigated the sentiment polarity of textual data from either the sentiment-lexicon or the topic-based approach. They have, therefore, reported the relative performances of sentiment classification tasks based on one or the other sentiment analysis approach.
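As a minimal illustration of the rule-based counting described for the symbolic approach above, the following Python sketch tags words against a toy lexicon; the word lists are placeholders and are far smaller than the actual Liu-Hu opinion lexicon.

```python
# Toy sentiment lexicons; the real Liu-Hu lexicon contains thousands of words.
POSITIVE = {"good", "great", "excellent", "love", "nice"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "awful"}

def lexicon_polarity(review: str) -> str:
    """Classify a review by counting lexicon hits: more positive words
    than negative words -> positive, otherwise negative (rule-based)."""
    tokens = review.lower().split()
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return "positive" if pos > neg else "negative"

print(lexicon_polarity("great phone , love the battery"))      # positive
print(lexicon_polarity("terrible delivery and poor support"))  # negative
```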
While the approaches employed might suffice for the researchers' purposes of investigation, there still remains a dearth of information on the relative performances of sentiment-lexicon-based and topic-based polarity approaches on the same dataset. This is particularly important because of the variability in real datasets, and the fact that the universality of learning algorithms has been said to be a mere fantasy [12]. How will sentiment-lexicon and topic-based approaches perform on a dataset? The answer to this question will help to investigate the universality of sentiment analysis approaches further.

The present study, in order to provide an answer to this question, investigated the performances of both approaches on two different e-commerce datasets. Two datasets were examined for the purpose of triangulation. To achieve the objective of the study, the following research questions were specifically examined:

1. How do machine-assigned sentiment-lexicon-based scores compare with human-assigned sentiment labels?
2. How do sentiment-lexicon-based classification schemes compare with topic-based classification schemes in each dataset?
3. Which classification scheme is the best for each dataset?

II. Research Methods and Materials

The method of research adopted by the present study is textual data mining, with a supervised machine learning technique.

A. Data Collection

Two web-based electronic commerce datasets were used in the present study. The datasets are:

1. Dataset on Amazon: This dataset was collected from [13]. The dataset was created and uploaded by [14], authors of 'From Group to Individual Labels using Deep Features'. It comprises one thousand reviews of products labelled with positive or negative sentiment. The authors of the dataset selected review sentences that have a clearly positive or negative connotation, in equal proportion. The dataset therefore contains 500 positive and 500 negative reviews.

2. Dataset on Konga: This dataset of customer reviews received from Konga was collected by the author of the present study. Data collection was carried out on the 17th of April 2020 on Twitter, using #kongacustomer as the search term on the Orange data mining tool's Twitter API. Tweets were first read and labelled based on the sentiment inclination expressed therein. These sentiment polarity labels were positive, negative and neutral. However, all neutral tweets were removed from the dataset in order to make it conform to the sentiment polarity format of the collected Amazon dataset.

B. Methodology

The methodologies employed for the data analytic procedures are the Liu Hu and bag-of-words implementations in the Orange data mining tool. Liu Hu is a lexicon-based sentiment analysis technique that computes a single normalized score of the sentiment in the text (a negative score for negative sentiment, positive for positive; 0 is neutral). The technique was used to carry out sentiment-lexicon-based scoring of each review. The bag-of-words technique was used to extract a number of the most frequent content words in each corpus. With this technique, words (excluding stop words) were successively extracted to contain the 1000, 500, 200, 100, 50, 20 and then 10 most frequent words from each dataset. The Orange data mining tool was also used to carry out the text pre-processing (transformation, tokenization, removal of stop words and extraction of word features) and the machine learning classification experiments.

C. Experimental Setup

An experiment was carried out using six machine learning algorithms in the Orange data mining tool. The machine learning algorithms used were K-Nearest Neighbor (KNN), Tree, Support Vector Machine (SVM), Neural Network (NN), Naïve Bayes (NB) and Logistic Regression (LR). The experiment was carried out to determine the performances of these machine learning classification schemes based on the following feature sets:

a. Sentiment-lexicon-based machine-assigned score.
b. Topic-based bag-of-word vectors (1000, 500, 200, 100, 50, 20 and 10 most frequent content words).

Ten-fold cross validation was used to evaluate the models' performances based on Classification Accuracy (CA) and F-Measure (F1).
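The study ran these experiments in Orange; the sketch below reproduces the same experimental shape with scikit-learn as an assumed stand-in, using a tiny synthetic corpus in place of the Amazon and Konga reviews.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression

# Placeholder corpus standing in for the labelled Amazon/Konga reviews.
reviews = ["great product, works perfectly"] * 10 + ["terrible, broke in a week"] * 10
labels = ["positive"] * 10 + ["negative"] * 10

classifiers = [KNeighborsClassifier(), DecisionTreeClassifier(), SVC(),
               MLPClassifier(max_iter=1000), MultinomialNB(), LogisticRegression()]

# Bag-of-words vectors limited to the N most frequent content words,
# mirroring the 1000/500/.../10 feature sets; ten-fold cross validation
# reports classification accuracy (CA) and a macro F1.
for n_words in (1000, 500, 200, 100, 50, 20, 10):
    X = CountVectorizer(stop_words="english", max_features=n_words).fit_transform(reviews)
    for clf in classifiers:
        scores = cross_validate(clf, X, labels, cv=10, scoring=("accuracy", "f1_macro"))
        print(n_words, type(clf).__name__,
              round(scores["test_accuracy"].mean(), 4),
              round(scores["test_f1_macro"].mean(), 4))
```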
III. Results and Discussion

Research Question 1: How do machine-assigned sentiment-lexicon-based scores compare with human-assigned sentiment labels?

[Bar chart: machine-assigned labels were 265 negative, 520 positive and 215 neutral; human labels were 500 negative, 500 positive and 0 neutral.]
Figure Ia. Sentiment labels of Amazon dataset reviews
[Bar chart: machine-assigned labels were 35 negative, 49 positive and 101 neutral; human labels were 162 negative, 23 positive and 0 neutral.]
Figure Ib. Sentiment labels of Konga dataset reviews

Whereas neither of the two datasets contains neutral reviews, the machine sentiment-lexicon scoring algorithm scored some of the reviews as neutral. For example, as shown in Figure Ia, the Amazon dataset originally contains 500 positive and 500 negative reviews, as labelled by the human identifier. The machine sentiment algorithm, however, scored 520 reviews as positive, 265 as negative and 215 as neutral. Also, as shown in Figure Ib, while the original Konga dataset contains 23 positive and 162 negative reviews as labelled based on human identification, the machine sentiment algorithm identified 49 reviews as positive, 35 as negative and 101 as neutral. This shows that machine scoring, based on the sentiment-lexicon algorithm, finds it more challenging than human agents to determine the sentiment polarity of the reviews. This corroborates the assertion that human beings are fairly more intuitive than machines when it comes to interpreting the tone of a piece of writing [5].

Research Question 2: How do sentiment-based classification schemes compare with topic-based classification schemes in each dataset?

The results of the sentiment-lexicon and bag-of-words machine learning classification of the Amazon data with the six classification algorithms (KNN, Tree, SVM, NN, NB and LR) are presented in Figure IIa and Figure IIb respectively, while the results of the same experimental task on the Konga data are presented in Figure IIIa and Figure IIIb respectively.

In the case of the Amazon dataset (Figures IIa and IIb), the sentiment-lexicon-based approach outperformed the bag-of-words sentiment polarity machine learning classification across the six classification algorithms. In the case of the Konga dataset (Figures IIIa and IIIb), the sentiment-lexicon-based approach outperformed the bag-of-words topic-based sentiment polarity approach in two classifiers (Tree and SVM). The 200-word and 500-word topic-based feature sets outperformed the sentiment-lexicon-based approach for KNN; all examined bag-of-words feature sets other than 50 and 100 words outperformed the sentiment-lexicon-based approach in NN; the fifty (50) word feature set outperformed the sentiment-lexicon-based approach in NB; while the sentiment-lexicon-based approach was outperformed by all bag-of-words feature set options in LR.

It could be observed that the behaviors of the two datasets are not exactly the same. While the Amazon data was better determined by the sentiment-dictionary-based approach, the same did not hold for the Konga data. This divergence might possibly be traced to the assertion of [4] that, since the sentiment-dictionary-based approach is a knowledge-based classification approach, it lacks generality, and that of [15], who reported that infrequently mentioned features are often not detected by this knowledge-based sentiment analysis approach. It could possibly mean that most of the sentiment clues in the Amazon dataset are better represented in the lexicon database than those in the Konga dataset. Nevertheless, the result further buttresses the assertion that real datasets vary and the idea of a universal approach is just a fantasy [12]. Thus, sentiment analysis, like other data mining tasks, is an experimental science.

[Figure IIa. Amazon: CA and F1 performance across the feature sets for KNN, Tree and SVM.]
[Figure IIb. Amazon (continued): CA and F1 performance across the feature sets for NN, NB and LR.]
[Figure IIIa. Konga: CA and F1 performance across the feature sets for KNN, Tree and SVM.]
[Figure IIIb. Konga (continued): CA and F1 performance across the feature sets for NN, NB and LR.]
Research Question 3: Which classification scheme is the best for each dataset?

The best performing feature set in each of the six classifiers for the two datasets are shown in Tables I and II.

Table I. Amazon best performing classification schemes

Algorithm   Feature set   CA       F1
KNN         Sentiment     0.7774   0.7771
Tree        Sentiment     0.8106   0.8104
SVM         Sentiment     0.6544   0.6429
NN          Sentiment     0.8203   0.8203
NB          Sentiment     0.8188   0.8188
LR          Sentiment     0.8135   0.8134

Table II. Konga best performing classification schemes

Algorithm   Feature set   CA       F1
KNN         200 words     0.9297   0.9316
Tree        Sentiment     0.9189   0.9045
SVM         Sentiment     0.9135   0.8925
NN          10 words      0.9676   0.9654
NB          50 words      0.8919   0.8874
LR          500 words     0.9568   0.9559

As shown in Tables I and II, the sentiment-lexicon approach and the ten-word topic-based feature set, based on the Neural Network, provided the highest classification accuracy in the Amazon and Konga datasets respectively. The emergence of NN as the best classifier in the two datasets is at variance with some of the previous studies, which reported SVM [6], NB [7] or Stochastic Gradient Descent (SGD) [3] as the best classifiers. It could, however, be observed that none of the mentioned previous studies included NN as a classifier in their experiments. The difference in the best-performing feature sets in the present study, and the emergence of NN as the best classifier contrary to reported cases, further buttress the fact that the concept of a universal learner is an idealistic fantasy [12].

IV. Conclusion

Based on the findings of this study, it could be concluded that the rule-based sentiment analysis approach, particularly the Liu Hu dictionary implementation used in the present study, has not yet assumed human status in sentiment polarity determination. The study also concludes that sentiment analysis, like other data mining tasks, is an experimental science, and analysts could compare the performances of symbolic and topic-based approaches in their sentiment classification endeavors before deciding on the most precise technique.

The sentiment-lexicon results in the present study were, however, based on the Liu Hu dictionary implementation alone. Also, the topic-based approach was based on bag-of-words lexical features alone. The performances of other methods of each of the two techniques could be investigated in further studies.
References

[11] W. Kasper and M. Vela, "Sentiment Analysis for Hotel Reviews," in Proceedings of the Computational Linguistics-Applications Conference, 2011, vol. 231527, pp. 45-52.
Autism detection from facial images using deep learning
methods
Abdulazeez Mousa
Department of Software Engineering, Firat University, Elazig, Turkey
abdulazizmousa93@gmail.com

Fatih Özyurt
Department of Software Engineering, Firat University, Elazig, Turkey
fatihozyurt@firat.edu.tr

Shivan H. M. Mohammed
Computer Science, Duhok University, Duhok, Iraq
shivan.cs@uod.ac
Abstract— Autism spectrum disorder (ASD) refers to a collection of behavioral and developmental issues and difficulties. The cognitive, communication, and play skills of a child with autism spectrum disorder are all affected. The description "spectrum" in autism spectrum disorder refers to the fact that each child is special and has their own set of characteristics, different from other children. These come together to give the child a unique social bond as well as their own understanding of their own actions. Medical image classification is a significant research field that is gaining traction among researchers and clinicians alike for detecting and diagnosing diseases. It addresses the issues of medical diagnosis, experimental purposes and analysis in the field of medicine. To address and resolve these issues, a number of data-mining-based medical imaging modalities and applications have been proposed and developed, not only to achieve good accuracy in classifying medical images, but also to understand and learn how diseases develop in patients and to help doctors in the early diagnosis of pathology. We used pre-trained Convolutional Neural Networks and transfer learning in this study. These CNN architectures are used to train the network and to classify medical images. The experimental results of this study show that the proposed model can detect Autism Spectrum Disorder (ASD), with the best accuracy of 95.75 percent achieved using the MobileNet model with transfer learning. The architectures tested in this study are ready to be tested with additional data and can be used to prescreen individuals with ASD. The use of deep learning methods for feature selection and classification in this study could greatly support future autism studies.

Keywords— Autism, deep learning, Convolutional Neural Networks, Classification, Transfer Learning

I. INTRODUCTION

Autism, also known as autism spectrum disorder (ASD), is a developmental disorder that affects a person's interaction, speech, and social behavior. (Strange gestures, such as swinging the hands and rotating, are repeated. Since autistic people cannot stay in one place for long periods of time, we find that the patient is always moving, and the autistic person's gestures are disorganized and spontaneous. They cannot connect well with others and speak with a peculiar accent. Autistic people are very sensitive to light and sound [1]. In addition, they are unaware of other people's emotions and thoughts. When compared to other children, they are more violent. They frequently have outbursts of anger. The pain response and sensation are sluggish and mild. Much depends on the disposition of the child, who may be slow to learn or have a high level of intelligence.) Other disorders, such as depression, anxiety, and attention deficit hyperactivity disorder, are also common in these children [2]. Early diagnosis in infancy can help autistic children develop their communication and social skills, as well as their quality of life. Estimating the total ASD requires the expertise of an ASD specialist; however, early diagnosis is crucial for controlling and treating this disease [3]. However, in rural areas and isolated villages, health care and facilities are unavailable. Several methods for determining whether or not a child has autism spectrum disorder have been established. These methods are extremely useful for the early diagnosis of ASD and for evaluating the efficacy of the ASD protocol [4]. In recent years, deep learning (DL) has advanced rapidly. Image processing, computer-aided diagnosis, image recognition, image fusion, image registration, image segmentation, and other fields have benefited from deep learning techniques. Through the detection and analysis of medical images, deep learning techniques extract features from medical images accurately and efficiently reflect the information. Deep learning (DL) helps physicians (medical practitioners) and doctors detect and predict disease risk more accurately and quickly, allowing them to avoid disease before it develops [5]. One of the main dilemmas (challenges) in the image recognition field is the classification of medical images, with the aim of analyzing and categorizing them into several different groups to assist physicians (medical practitioners) and doctors in diagnosing diseases or performing research on them [6]. These techniques enhance the abilities of doctors and researchers to understand how to analyze the generic variations which will lead to disease. Deep learning algorithms such as convolutional neural networks (CNNs) are used to detect and classify images. The classification of medical images is divided into several stages. Firstly, the medical images are acquired and uploaded to the model. Then, the extraction of essential features from the acquired input image is performed. The features are then used to create models that identify the image data set in the second phase of medical image classification. The final stage is the production of a classified image and a report, based on the image analysis, that reflects the outcome [7].

The rest of this article is structured as follows: we present the previous related work on autism detection and image classification using deep learning and transfer learning techniques in Section II; the details of the dataset used in this study and a description of the CNN architectures implemented on our dataset are provided in Section III; the experimental results of the CNN architectures and transfer learning techniques performed on the dataset, showing that the best results were obtained by the MobileNet model, are presented in Section IV; finally, Section V presents the conclusion of the proposed framework.
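The staged flow just described, acquiring the image, extracting features, then classifying and reporting, can be summarized in a toy sketch; the mean-intensity feature and the threshold rule below are illustrative stand-ins for a trained CNN, not the method of this paper.

```python
import numpy as np

def extract_features(image: np.ndarray) -> np.ndarray:
    """Stage 2: reduce the acquired image to a feature vector (here just
    the mean intensity per colour channel; a CNN learns far richer features)."""
    return image.mean(axis=(0, 1))

def classify(features: np.ndarray) -> str:
    """Stage 3: map the feature vector to a label and report it
    (placeholder decision rule for illustration only)."""
    return "Autistic" if features.mean() > 0.5 else "Non_Autistic"

# Stage 1: image acquisition, simulated with a random 224 x 224 x 3 array.
image = np.random.rand(224, 224, 3)
print(classify(extract_features(image)))
```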
II. LITERATURE REVIEW

Beary, M., Hadsell, A., Messersmith, R., & Hosseini, M. P. proposed "Diagnosis of Autism in Children using Facial Analysis and Deep Learning". In this paper, the authors proposed a deep learning model to classify children as autistic or non-autistic (healthy). Autistic patients in general have distinct facial abnormalities that are part of the disease discovery, allowing researchers to know whether a person has autism or not. Therefore, the researchers in this study used the MobileNet deep learning model with two dense layers to extract the features of the face and classify people according to the image. In this study, the model was trained and tested on 3014 images (90% for training, 10% for testing). The result of this study was 94.6% classification accuracy [8].

Mythili, M. S., & Shanavas, A. M. have written "A study on Autism spectrum disorders using classification techniques". In this study, the authors focused on extracting data from methodologies to study the performance of students with autism, by mining data with the tasks it provides that can be used to study the performance of students with autism. In this article the authors use data mining algorithms for task classification at the level of students. The authors used Support Vector Machines (SVM), Artificial Neural Networks (ANN), and fuzzy logic as the machine learning methods. The algorithms are extremely helpful in dealing with the prediction level of autism students [9].

Ali, N.A., Syafeeza, A.R., Jaafar, A., & Alif, M. have written "Autism spectrum disorder classification on electroencephalogram signal using deep learning algorithm". In this paper, the authors use deep learning algorithms via electroencephalogram (EEG) signals to detect the different patterns between normal children and children with autism. The process of classifying normal children and children with autism uses deep learning algorithms and the Multilayer Perceptron (MLP) model, by means of a database that contains brain signals for pattern recognition, using the MLP model to extract the features for the classification process [10].

Rad, N. M., Kia, S. M., Zarbo, C., van Laarhoven, T., Jurman, G., Venuti, P., Marchiori, E., & Furlanello, C. have written "Deep learning for automatic stereotypical motor movement detection using wearable sensors in autism spectrum disorders". In this paper the authors proposed a deep learning application to aid automatic stereotypical motor movement (SMM) detection with multi-axis Inertial Measurement Units (IMUs). In the study, a convolutional neural network (CNN) was used to learn a discriminative feature space from raw data. To model the temporal patterns in a series of multi-axis signals, the long short-term memory (LSTM) was combined with CNN architectures. Furthermore, it was demonstrated how the CNN can be used with the transfer learning technique to improve the detection and analysis rate on longitudinal data. The results of this paper show that handcrafted features are outperformed by feature learning, and that the detection rate improves when using LSTM to learn the temporal dynamics of the signals, particularly when the training data is distorted. Detectors with an ensemble of LSTMs are more accurate and stable, and in longitudinal settings, parameter transfer learning is beneficial. These results represent a significant step forward in detecting SMM in real-time scenarios [11].

III. METHODOLOGY

A. Dataset

Early detection of autism is very important for the development of the affected child, so the development of these classifications, based on a related set of data, is a means of early detection of autism, and this enhances the adaptation of the patient to normal life. The dataset can be used in two different ways, both of which are popular for deep learning tasks. The dataset is divided into Training, Test, and Validation sets, which is a normal procedure. Train is the title of the training set; it contains two subdirectories known as Autistic and Non_Autistic [12, 13]. There are 1667 facial pictures of children with autism in the Autistic subdirectory, in 224 x 224 x 3 jpg format. There are also 1667 photos of children who do not have autism in the Non_Autistic subdirectory, all of which are in the same 224 x 224 x 3 jpg format. The unified directory is the second way the data is made accessible. There are two subdirectories in this directory, Autistic and Non_Autistic; it is a collection of files from the train, test, and validation directories that have been combined into a single set. The combined data can then be partitioned into user-defined train, test, and validation sets. Some images of the autism dataset used in our study are shown below in Figure 1.

Figure 1: Some images of (a) Autistic and (b) Non-Autistic children.
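A sketch of how a directory layout like the one described above is typically consumed, using tf.keras; the root path is a placeholder, and the call shown is a common convention rather than the exact loader used in the study.

```python
import tensorflow as tf

# Assumed layout, per the dataset description:
# AutismDataset/train/Autistic/*.jpg, AutismDataset/train/Non_Autistic/*.jpg, etc.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "AutismDataset/train",
    labels="inferred",
    label_mode="binary",          # Autistic vs Non_Autistic
    image_size=(224, 224),        # matches the 224 x 224 x 3 images
    batch_size=32,                # batch size used in the study
)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "AutismDataset/test",
    labels="inferred",
    label_mode="binary",
    image_size=(224, 224),
    batch_size=32,
)
print(train_ds.class_names)       # ['Autistic', 'Non_Autistic']
```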
B. Transfer Learning and CNN architectures

Due to the emergence of modern and high-performing machine learning systems, image classification has become more important in the research field. Artificial neural networks (ANNs) have progressed, and deep learning architectures such as the convolutional neural network (CNN) have emerged. The application of multi-class image classification and the recognition of objects belonging to multiple categories have been triggered by reliance on artificial neural networks (ANNs). In terms of efficiency and complexity, the new machine learning (ML) algorithms have an advantage over older ones [14]. In the deep learning field, transfer learning is a research problem. It extracts the knowledge collected from one problem and applies it to a separate but related issue. For example, knowledge acquired while learning to recognize one disease may apply when trying to identify another disease [15, 16]; for example, we apply knowledge gained during the identification of cancer to malaria. Transfer learning is a deep learning technique that involves training a neural network model on a problem that is close to the one being solved. One of its advantages is that it shortens the training time of the deep learning model; this way, we can build on previous knowledge rather than starting from scratch. Transfer learning is commonly expressed in computer vision by the use of pre-trained models. A pre-trained model is one that has been trained on a broad benchmark dataset to solve a problem that is close to the one we are working on, and it is reused as a result of the high computational cost of training these models [17, 4].

Convolutional neural networks (ConvNets or CNNs) are one of the most common types of neural networks used to recognize and classify images. CNNs are commonly used in areas such as object detection, face recognition, and so on. In the classification of images, a convolutional neural network (CNN) takes an image as input, processes it (extracts features from it) and categorizes it [18] (e.g., Autistic, Non-Autistic). An input image is seen by computers as an array of pixels, with the number of pixels varying depending on the image resolution. Depending on the resolution of the picture, it will be represented as h x w x d (h = height, w = width, d = dimension), for example a 6 x 6 x 3 array for an RGB matrix (3 refers to the RGB values) or a 4 x 4 x 1 array for a grayscale matrix image.

When re-purposing a pre-trained model for our own needs, we should remove the original classifier, then add a new classifier that is appropriate for our needs, and finally fine-tune the model using one of three strategies [19]:

Train the entire model: apply the pre-trained model's architecture to your dataset and train it, starting with the values from the pre-trained model instead of random weights.

Feature extraction (freezing the convolutional neural network (CNN) model base): the pre-trained base forms the model on which a new classifier is trained. In other words, only the fully connected layer is trained, leaving the weights of the convolution layers unchanged.

Fine-tuning: in this strategy, the original convolution layer weights are used as starting points. In addition to a fully connected classifier, one or more convolution layers are retrained; the unfrozen convolution layers are adjusted just to fit the new problem.

There are many pre-trained CNN networks that only require datasets consisting of training and test data at their input layer and have the ability to transfer learning. These networks differ in their structures in terms of the internal layers and the technologies they use [20]. In this study, we have chosen five pre-trained CNN architectures to apply to our autism dataset. The CNN architectures (Densenet121, InceptionV3, MobileNet, Resnet50, and Xception) are employed to classify the autism image dataset into binary classes. In this study, the size of all images in the dataset is 224 x 224 x 3. The Adam optimizer is used to train each network architecture, the number of epochs is 35, and the training and testing batch size is set to 32.
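A minimal tf.keras sketch of the feature-extraction strategy with MobileNet under the settings just listed (224 x 224 x 3 input, Adam optimizer, binary output); the 128-unit dense layer is an assumption, since the paper does not publish its exact classifier head.

```python
import tensorflow as tf

# Pre-trained MobileNet base with its original ImageNet classifier removed;
# freezing it implements the feature-extraction strategy described above.
base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False,
    weights="imagenet", pooling="avg")
base.trainable = False

# New binary classifier head (Autistic vs Non_Autistic); layer sizes assumed.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",               # Adam, as in the study
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=test_ds, epochs=35)  # 35 epochs, batch 32
```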
IV. Results

In this study, ASD diagnosis was presented as a binary classification issue by applying the transfer learning technique to five pre-trained CNN models (Densenet121, InceptionV3, MobileNet, ResNet-50, and Xception). All CNN architectures were trained for the full 35 epochs, with a training and testing batch size of 32. The results showed that the maximum accuracy achieved was 95.75%, obtained by the MobileNet model in 90.5 minutes, which is also the shortest training time of all the models; it was followed by Densenet121, InceptionV3, Xception, and ResNet-50, with 91.96%, 91.46%, 90.60%, and 72.41% respectively. The training times of the models differ, increasing in the order MobileNet, InceptionV3, Densenet121, ResNet-50, and Xception, as shown in Table 1 below.

Table 1. Comparison of pre-trained networks results.

Model         Accuracy   Time (minutes)
MobileNet     95.75%     90.5
Densenet121   91.96%     330.8
InceptionV3   91.46%     239.7
Xception      90.60%     445.4
ResNet-50     72.41%     346.1

Figure 2: Densenet121 training progress, Accuracy
Figure 3: Densenet121 training progress, Loss
Figure 4: MobileNet training progress, Accuracy
Figure 5: MobileNet training progress, Loss
Figure 6: InceptionV3 training progress, Accuracy
Figure 7: InceptionV3 training progress, Loss
Figure 8: Xception training progress, Accuracy
Figure 9: Xception training progress, Loss
Figure 10: Resnet50 training progress, Accuracy
Figure 11: Resnet50 training progress, Loss
V. Conclusion

In this paper, we introduce a CNN-based approach using transfer learning to detect and classify Autism Spectrum Disorder (ASD) patients from facial images as a binary classification issue. In this study, five pre-trained CNN models were implemented with the transfer learning technique on a medical image dataset. The CNN models used to detect and classify the images are Densenet121, InceptionV3, MobileNet, ResNet-50, and Xception. All CNN models were trained using the Adam optimizer for the full 35 epochs, with a batch size of 32. The results demonstrate that our model's accuracy with the test data is 95.75%, which is the best accuracy achieved on this dataset, obtained by the MobileNet model. In the future, we will compare the results of the models used in this study with some other models and refine the data with the proposed models, and try more models on more medical images to increase the efficiency of classification and obtain the best results.

References

[1] Nasser, I. M., Al-Shawwa, M., & Abu-Naser, S. S. (2019). Artificial Neural Network for Diagnose Autism Spectrum Disorder.
[2] Sherkatghanad, Z., Akhondzadeh, M., Salari, S., Zomorodi-Moghadam, M., Abdar, M., Acharya, U. R., ... & Salari, V. (2020). Automated detection of autism spectrum disorder using a convolutional neural network. Frontiers in Neuroscience, 13, 1325.
[3] Tamilarasi, F. C., & Shanmugam, J. (2020, June). Convolutional Neural Network based Autism Classification. In 2020 5th International Conference on Communication and Electronics Systems (ICCES) (pp. 1208-1212). IEEE.
[4] Lai, Z., & Deng, H. (2018). Medical image classification based on deep features extracted by deep model and statistic feature fusion with multilayer perceptron. Computational Intelligence and Neuroscience, 2018.
[5] Liu, W., Li, M., & Yi, L. (2016). Identifying children with autism spectrum disorder based on their face processing abnormality: A machine learning framework. Autism Research, 9(8), 888-898.
[6] Erkan, U., & Thanh, D. N. (2019). Autism spectrum disorder detection with machine learning methods. Current Psychiatry Research and Reviews (Formerly: Current Psychiatry Reviews), 15(4), 297-308.
[7] Karabatak, M., Mustafa, T., & Hamaali, C. (2020, June). Remote Monitoring Real Time Air pollution-IoT (Cloud Based). In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-6). IEEE.
[8] Beary, M., Hadsell, A., Messersmith, R., & Hosseini, M. P. (2020). Diagnosis of Autism in Children using Facial Analysis and Deep Learning. arXiv preprint arXiv:2008.02890.
[9] Mythili, M. S., & Shanavas, A. M. (2014). A study on Autism spectrum disorders using classification techniques. International Journal of Soft Computing and Engineering, 4(5), 88-91.
[10] Ali, N.A., Syafeeza, A.R., Jaafar, A., & Alif, M. (2020). Autism spectrum disorder classification on electroencephalogram signal using deep learning algorithm. IAES International Journal of Artificial Intelligence, 9(1), 91.
[11] Rad, N. M., Kia, S. M., Zarbo, C., van Laarhoven, T., Jurman, G., Venuti, P., Marchiori, E., & Furlanello, C. (2018). Deep learning for automatic stereotypical motor movement detection using wearable sensors in autism spectrum disorders. Signal Processing, 144, 180-191.
[12] Mohammed, S. H., & Çinar, A. (2021). Lung cancer classification with Convolutional Neural Network Architectures. Qubahan Academic Journal, 1(1), 33-39.
[13] Mousa, A., Karabatak, M., & Mustafa, T. (2020, June). Database Security Threats and Challenges. In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-5). IEEE.
[14] Karabatak, M., & Mustafa, T. (2018, March). Performance comparison of classifiers on reduced phishing website dataset. In 2018 6th International Symposium on Digital Forensic and Security (ISDFS) (pp. 1-5). IEEE.
[15] Mustafa, T., & Varol, A. (2020, June). Review of the Internet of Things for Healthcare Monitoring. In 2020 8th International Symposium on Digital Forensics and Security (ISDFS) (pp. 1-6). IEEE.
[16] Sharma, N., Jain, V., & Mishra, A. (2018). An analysis of convolutional neural networks for image classification. Procedia Computer Science, 132, 377-384.
[17] Heinsfeld, A. S., Franco, A. R., Craddock, R. C., Buchweitz, A., & Meneguzzi, F. (2018). Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage: Clinical, 17, 16-23.
[18] Hossain, M. D., Kabir, M. A., Anwar, A., & Islam, M. Z. (2021). Detecting autism spectrum disorder using machine learning techniques. Health Information Science and Systems, 9(1), 1-13.
[19] Raj, S., & Masood, S. (2020). Analysis and Detection of Autism Spectrum Disorder Using Machine Learning Techniques. Procedia Computer Science, 167, 994-1004.
[20] Shahamiri, S. R., & Thabtah, F. (2020). Autism AI: a New Autism Screening System Based on Artificial Intelligence. Cognitive Computation, 12(4), 766-777.
Diabetic Retinopathy Classification from Retinal Images using
Machine Learning Approaches
Indronil Bhattacharjee
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
ibprince.2489@gmail.com

Al-Mahmud
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
mahmud@cse.kuet.ac.bd

Tareq Mahmud
Dept. of Computer Science and Engineering, Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh
hridoytareqmahmud@gmail.com
Abstract— Diabetic Retinopathy is one of the most familiar diseases and is a diabetes complication that affects the eyes. Initially, diabetic retinopathy may cause no symptoms or only mild vision problems. Eventually, it can cause blindness, so early detection of the symptoms could help to avoid blindness. In this paper, we present some experiments on features of Diabetic Retinopathy, such as the properties of exudates, the properties of blood vessels and the properties of microaneurysms. Using these features, we can classify the healthy, mild non-proliferative, moderate non-proliferative, severe non-proliferative and proliferative stages of DR. Support Vector Machine, Random Forest and Naive Bayes classifiers are used to classify the stages. Finally, Random Forest is found to be the best, with accuracy, sensitivity and specificity of 76.5%, 77.2% and 93.3% respectively.

I. INTRODUCTION

People suffering from diabetes can have an eye complication called diabetic retinopathy. When blood sugar levels go high, this causes harm and erosion to the blood vessels in the retina. The affected blood vessels can fatten and exude; alternately, the vessels may close up and stop the flow of blood. Sometimes unnecessary and anomalous blood vessels start to grow on the surface of the retina. These abnormal changes can damage one's vision, sometimes destroying it fully. According to the severity of the disease, DR can be classified into two main stages: (a) Non-Proliferative Diabetic Retinopathy (NPDR) and (b) Proliferative Diabetic Retinopathy (PDR).

NPDR is the initial phase of diabetic retinopathy, and many individuals with diabetes suffer from it. With NPDR, tinier blood vessels excrete and fatten the retina. When the macula expands, this is called macular edema. This is the most familiar reason why people with diabetes go blind. In the case of NPDR, the blood vessels in the retina can become clogged too; this situation is named macular ischemia. When macular ischemia happens, the macula cannot get its blood supply. Intermittently, some minute particles called exudates can grow in the retina; these affect one's vision too. If anybody suffers from NPDR, their eyesight will go blurry. Furthermore, NPDR is subclassed into 3 stages: Mild, Moderate and Severe.

Proliferative DR is the most critical phase of diabetic eye disease. It occurs when the retina starts growing excessive blood vessels, which is called neovascularization. These huge numbers of vulnerable vessels often bleed into the vitreous. When they only bleed a little, a few dark floaters are seen; on the other hand, when they bleed a lot, that may block the whole of the vision.

Figure I. Different stages of diabetic retinopathy (from top left): (a) Healthy Eye (b) Mild NPDR (c) Moderate NPDR (d) Severe NPDR (e) PDR

The objectives of the paper are:

• Process color fundus retinal images for Diabetic Retinopathy detection.
• Extract key features from the pre-processed images.
• Detect the presence of Diabetic Retinopathy.
• Classify whether the Diabetic Retinopathy is Proliferative or Non-proliferative.

II. THE PROPOSED SYSTEM

Input: Colour fundus retinal images
Output: Diabetic Retinopathy is not present, Mild, Moderate, Severe or PDR
Process:
Step 1: Input the initial fundus image
Step 2: Preprocess the initial image
Step 3: Optical disk removal
Step 4: Exudates detection
Step 5: Blood Vessels detection
Step 6: Microaneurysm detection
Step 7: Features extraction
Step 8: Apply to classifiers
Step 9: Classify the Diabetic Retinopathy stages
Step 10: Detect whether it is a Healthy, Mild, Moderate, Severe or PDR eye

III. METHODOLOGY

A. Dataset

To evaluate our method, we have used a dataset named Diabetic Retinopathy (Resized) from Kaggle. The dataset has a total of 13402 retinal images and the corresponding level of Diabetic Retinopathy for each image.

B. Data Preprocessing

Data preprocessing has been done in two steps: general preprocessing for all the images, and specific preprocessing for individual feature extraction.

1) General Preprocessing:

Resizing: In this work, the sizes of the actual images in the dataset were 1024x1024 pixels. As the dataset is huge in size, the images have been stored with size 350x350 pixels to reduce the computational time.

Green Channel Extraction: Preprocessing has been done with the aim of improving the contrast level of the fundus images. For contrast enhancement of the retinal images, some components, like the red and blue components of the image, are commonly discarded before processing. The green channel shows the clearest background contrast and the greatest contrast difference between the optic disc and the retinal tissue. The red channel is comparatively lighter, and vascular structures are visible; the retinal vessels are lightly visible but show less contrast than in the green channel. The blue channel contains very little information and is comparatively noisier.

Contrast Limited Adaptive Histogram Equalization: Contrast Limited Adaptive Histogram Equalization (CLAHE) is used for enhancing the contrast level of the images. CLAHE calculates different histograms of the image and uses this information to reallocate the intensity values of the image. Hence, CLAHE is significant for improving the regional contrast and enhancing the edges in all the regions of an image.

2) Specific Preprocessing:

Exudate Detection: Firstly, the optical disc has been removed using the red channel of the image. Then, using a 6x6 ellipse-shaped structuring element, morphological dilation is applied. A non-linear median filter is used for noise removal. Exudates have high intensity values, so they have been extracted using thresholding: after applying this preprocessing, pixels having an intensity value higher than 235 are set to 255 and the rest are set to 0, for the clearest view. Then, traversing the image, the area of the exudates is calculated. The images of the different steps are illustrated in Figure III.

Figure III. Preprocessing for Exudate detection

Blood Vessel Extraction: The blood vessels are one of the most important features for differentiating diabetic retinopathy stages. After obtaining the green channel image and improving the contrast of the image, several steps are performed to extract the blood vessels. Alternate sequential filtering (three rounds of opening and closing) using three different-sized, ellipse-shaped structuring elements (5x5, 11x11 and 23x23) is applied to the image. Then the resultant image is subtracted from the input image. The subtracted image has lots of small noise, which is removed through area-parameter noise removal: the contours of each component, including the noise, are found, the contour area of each is calculated, and the noise components are removed by comparing their area against a reference (200 is used as the reference). Then the resultant image is binarized using a threshold value. Finally, the number of pixels covering the blood vessel area is calculated. The images of the different steps are illustrated in Figure IV.

Figure IV. Preprocessing for Blood Vessel detection
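A condensed OpenCV sketch of the exudate and vessel preprocessing described above; the file path, CLAHE parameters and the final vessel threshold are assumptions, while the 350x350 resize, the 235 exudate threshold and the 5/11/23 ellipse kernels follow the text.

```python
import cv2

img = cv2.resize(cv2.imread("fundus.jpg"), (350, 350))   # path is a placeholder

green = img[:, :, 1]                                      # green channel: best contrast
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # CLAHE params assumed
enhanced = clahe.apply(green)

# Exudate candidates: high-intensity pixels; > 235 -> 255, rest -> 0.
blurred = cv2.medianBlur(enhanced, 5)                     # median filter for noise
_, exudates = cv2.threshold(blurred, 235, 255, cv2.THRESH_BINARY)

# Vessels: alternate sequential filtering (opening then closing) with
# growing 5x5, 11x11 and 23x23 ellipse kernels, then subtraction.
asf = enhanced
for size in (5, 11, 23):
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (size, size))
    asf = cv2.morphologyEx(asf, cv2.MORPH_OPEN, kernel)
    asf = cv2.morphologyEx(asf, cv2.MORPH_CLOSE, kernel)
vessels = cv2.subtract(asf, enhanced)                     # dark vessels stand out
_, vessels = cv2.threshold(vessels, 15, 255, cv2.THRESH_BINARY)  # threshold assumed

print(int((exudates > 0).sum()), int((vessels > 0).sum()))  # candidate areas in pixels
```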
Microaneurysm Extraction: The green component is used to extract the microaneurysms. For better contrast, CLAHE is used; then a median filter is used for noise removal. A 7x7 ellipse-shaped structuring element is used for the morphological operation. Morphological erosion is applied, and then the image is inverted. For joining the disjoint segments of the blood vessels, morphological closing is used. Then the image is binarized. As the blood vessels, haemorrhages and microaneurysms have almost the same intensity, all of these components are detected together in the binarized image. Since microaneurysms are smaller in size, they have been extracted using the contour area. The images of the different steps are illustrated in Figure V.

Figure V. Preprocessing for Microaneurysm detection

C. Dataset Splitting

The dataset has been divided into two parts: 75% as training data and 25% as test data. Therefore, 10052 training images have been used to train the model, and it has been tested on 3350 images.

D. Data Scaling

In this system, a standard scaler has been used to scale all the data, to limit the ranges of the variables. Using data scaling, the variables can be compared under the same conditions across all the algorithms.

E. Selection of Features

Feature extraction is done from the preprocessed images shown in Figures III, IV and V. The features which are extracted to detect Diabetic Retinopathy are:

Histogram of Exudates
Zeroth Hu moment of Exudates
Histogram of Blood Vessels
Zeroth Hu moment of Blood Vessels
Histogram of Microaneurysm
Zeroth Hu moment of Microaneurysm

F. Classification

Prediction has been performed using Support Vector Machine (SVM), Random Forest (RF) and Naive Bayes classifiers.

1) Random Forest:

Random Forest (RF) is an ensemble tree-based learning algorithm. The RF classifier is a set of decision trees built from arbitrarily chosen subsets of the training set. It aggregates the votes from the different decision trees to choose the final class of the test object. The fundamental concept behind the RF classifier is a basic but powerful one: the wisdom of crowds. A large number of relatively uncorrelated trees working as a committee will beat any of the individual constituent models. Uncorrelated models can deliver ensemble predictions that are more precise than any of the individual predictions. The reason for this wonderful effect is that the trees protect each other from their individual mistakes, as long as they do not all make mistakes in the same direction. While a few trees may be wrong, many other trees will be right, so as a group the trees are able to move in the correct direction.

2) Support Vector Machine:

SVM classifies the input images into two classes, such as a Diabetic Retinopathy affected eye and a normal eye, using its features. As SVM is a binary classifier, our first task is to classify which eye is affected by Diabetic Retinopathy and which is a healthy one. After the first classification, our next task is to use the Support Vector Machine again; this time it is applied only to the affected ones. It will again classify which Diabetic Retinopathy is non-proliferative, i.e. in the initial stage, and which one is proliferative, i.e. in the severe state.

The Support Vector Machine has been utilized since SVM is based on a convex objective function that never gets stuck in local maxima. The optimal hyperplane is the shape of the separating hyperplane, and the objective function of the optimization problem does not depend explicitly on the dimensionality of the input vector, but only on the inner products of two vectors. This fact permits constructing the separating hyperplanes in high-dimensional spaces.

3) Naïve Bayes:

The Naive Bayes classifier isn't a single algorithm, but a collection of algorithms which all share a common rule: each pair of features being classified is independent of each other. Naive Bayes is mainly a family of algorithms based on Bayes' Theorem.

G. Evaluation Metrics

Accuracy, Sensitivity and Specificity are used as the evaluation metrics of the model. Accuracy, Sensitivity and Specificity are calculated using (1), (2) and (3) respectively:

Accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)   (1)

Sensitivity = True Positive / (True Positive + False Negative)   (2)

Specificity = True Negative / (True Negative + False Positive)   (3)
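A sketch of the training and macro-averaged evaluation described above, using scikit-learn as an assumed equivalent; the feature matrix is simulated, since only the six extracted features (histogram statistics and zeroth Hu moments) feed the real models.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score, confusion_matrix

rng = np.random.default_rng(0)
X = rng.random((13402, 6))            # simulated six-feature vectors
y = rng.integers(0, 5, 13402)         # 5 classes: healthy ... PDR

# 75/25 split and standard scaling, as in sections C and D.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

for clf in (RandomForestClassifier(), SVC(), GaussianNB()):
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    cm = confusion_matrix(y_te, pred)
    tn = cm.sum() - cm.sum(axis=0) - cm.sum(axis=1) + np.diag(cm)  # per-class TN
    fp = cm.sum(axis=0) - np.diag(cm)                              # per-class FP
    print(type(clf).__name__,
          round(accuracy_score(y_te, pred), 3),                    # equation (1)
          round(recall_score(y_te, pred, average="macro"), 3),     # eq. (2), macro
          round((tn / (tn + fp)).mean(), 3))                       # eq. (3), macro
```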
Figure VII. Flow Diagram of the system

IV. RESULTS

A comparison of the evaluation metrics for Random Forest, Support Vector Machine and Naive Bayes is presented in Table 1.

[Figure IX. Bar diagram of (a) sensitivity and (b) specificity for each class.]

TABLE II. COMPARISON OF DIABETIC RETINOPATHY DETECTION BY VARIOUS RESEARCHERS

Reference            Number of classes   Method                                                                         Accuracy (%)   Sensitivity (%)   Specificity (%)
Acharya et al. [6]   5                   Higher-order spectra                                                           72             82.5              88.9
Lim et al. [7]       5                   Blood vessels, exudates, microaneurysm, haemorrhage                            75.9           80                86
This work            5                   Histograms and zeroth Hu moments of blood vessel, exudates, microaneurysm     76.5           77.2              93.3

Sinthanayothin et al. [1] distinguished Diabetic Retinopathy from a healthy retina using image processing techniques. In this proposed system, fundus images were preprocessed using adaptive local contrast enhancement. This method, established on a multilayer neural network, produced 70.21 percent sensitivity and 70.66 percent specificity.

Kahai et al. [3] developed a system for the initial identification of Diabetic Retinopathy. The identification system is based on a binary-hypothesis testing problem that results only in yes or no. The Bayes optimization criterion was applied to the raw fundus images for the initial identification of DR. This method was able to detect the appearance of microaneurysms correctly with a sensitivity of 80 percent and a specificity of 63 percent.

Wang et al. [4] classified the healthy, moderate, and severe DR stages using morphological image processing approaches and a feedforward deep learning network. In this system, the existence and covering region of the components of the lesions and blood vessels are selected as the main features. The classification efficiency of this work was 74 percent, the sensitivity was 81 percent, and the specificity was 92 percent. Featuring the lesions and blood vessels, and texture parameters, they classified the input images into healthy, Moderate, and Severe DR [4].

An automatic identification system for DR was proposed by Acharya et al. [6]. They classified normal, mild, moderate, severe and PDR stages using the two spectral layered invariant features of higher-order spectra approaches and a Support Vector Machine classifier [8]. This work reported an accuracy of 72%, a sensitivity of 82%, and a specificity of 88%.

In this work, the fundus images are classified into five classes using the histograms and the zeroth Hu moments of the exudates, microaneurysms, and blood vessels present in the eye. Random Forest is used as the classifier. The classifier is able to identify the unknown class accurately with an efficiency of more than 76.5 percent, with a sensitivity of 77.2 percent and a specificity of 93.3 percent.

Since the instances of each class are important in this work, we have calculated the evaluation metrics using the macro-averaging method, because the macro average reveals a better picture of the smaller classes, and it is to the point and more accurate when the performances on each and every class are equally important.

V. CONCLUSION

After studying the existing systems, we conclude that our proposed technique successfully detects Diabetic Retinopathy. Along with this, the proposed method classifies the images into five classes of Diabetic Retinopathy. Classification has been done based on three features: the area of exudates, the area of blood vessels and the area of microaneurysms. Using these features, we have classified the images into five classes: normal eye, mild NPDR, moderate NPDR, severe NPDR and PDR. Using the Random Forest classifier, we have gained accuracy = 76.5%, sensitivity = 77.2% and specificity = 93.3%. The metrics found in this work are compared with the existing works.

In this paper, we have performed the Diabetic Retinopathy classification using the Random Forest classifier with some essential features like exudates, blood vessels and microaneurysms. In the future, we hope to make it work with some more classifiers, like K-Nearest Neighbor classifiers and so on, using some secondary features like haemorrhage as well. Moreover, we can perform this classification method using a larger dataset of infected eyes with a neural network model in the future.

REFERENCES

[1] C. Sinthanayothin, J. F. Boyce, T. H. Williamson, H. L. Cook, E. Mensah, S. Lal, and D. Usher, "Automated detection of diabetic retinopathy on digital fundus images," Diabetic Medicine, vol. 19, no. 2, pp. 105-112, 2002.
[2] Singalavanija, A., Supokavej, J., Bamroongsuk, P., Sinthanayothin, C., Phoojaruenchanachai, S., and Kongbunkiat, V. Feasibility study on computer-aided screening for diabetic retinopathy. Jap. J. Ophthalmology, 2006, 50(4), 361-366.
[3] P. Kahai, K. R. Namuduri, and H. Thompson, "A decision support framework for automated screening of diabetic retinopathy," Hindawi Publishing Corporation, Feb 02, 2006.
[4] H. Wang, W. Hsu, K. Goh, and M. Lee, "An effective approach to detect lesions in color retinal images," vol. 2, 02 2000, pp. 181-186.
[5] Nayak, J., Bhat, P. S., Acharya, U. R., Lim, C. M., and Kagathi, M. Automated identification of different stages of diabetic retinopathy using digital fundus images.
[6] U. R. Acharya, E. Y. K. Ng, J. H. Tan, V. S. Subbhuraam, and N. Kh, "An integrated index for the identification of diabetic retinopathy stages using texture parameters," Journal of Medical Systems, vol. 36, pp. 2011-2020, 02 2011.
[7] C. M. Lim, U. R. Acharya, E. Y. K. Ng, C. Chee and T. Tamura, Proceedings of the Institution of Mechanical Engineers, Part H: Journal of Engineering in Medicine, 2009, 223: 545.
[8] Osareh, A.; Mirmehdi, M.; Thomas, B.T.; Markham, R. In MICCAI'02: Proceedings of the 5th International Conference on Medical Image Computing and Computer-Assisted Intervention-Part II; Springer-Verlag: London, UK, 2002; pp. 413-420.
Methodology for Syngas Energy Assessment
Melda Ozdinc Carpinlioglu
Gaziantep University
Department of Mechanical Engineering
Gaziantep, Turkey
melda@gantep.edu.tr
Abstract— The energy assessment of syngas produced by solid granulated fuel decomposition in a special microwave plasma gasification process is the topic of this presentation. Instantaneous measurements of the syngas volumetric content, covering the total gasification time of the process, are compared with the gas chromatography analysis of the sample syngas collected during approximately 30% of the total gasification time. The instantaneous volumetric content of the syngas and the gasification-time- and location-averaged process temperature, defined as the syngas temperature Tsyn, are the major energy assessment parameters. The gas chromatography analysis verifies the transient nature of syngas production.

Keywords— syngas, volumetric content, gas chromatography, transient, instantaneous measurement

I. Introduction

The criticism of the literature and the state of the art, in the light of research between 2015-2018 regarding microwave plasma gasification in a test system, MCw GASIFIER, for biomass conversion to syngas, is available in [1,2,3,4,5]. A review on the manner [1], together with a methodology for the thermodynamical treatment of the process [2], can be treated as the pre-research publications. The article [3] is an overall treatment of the research, discussing syngas generation with a variety of fuels, as a post-research article. A Ph.D. study [4] was completed with the support of the research [5]. In terms of the current efforts in the relevant field, a proceedings paper was presented in May 2021 [6]. It is denoted in [6] that research on microwave (MCw) plasma gasification is ongoing for the recycling of biomass and waste-to-energy conversion, due to the importance of the manner in terms of the advantages of MCw plasma gasification and the existing gaps in the physical nature of the process [7,8,9,10,11]. The major attention in this paper is given to the methodology for the energy content determination of syngas produced by MCw plasma gasification of solid granulated fuel. The operational results of the MCw GASIFIER are referred to for this purpose. The available data on the instantaneous volumetric content of the generated syngas, and the temperature measurements during the total decomposition period, are used together with the data from the gas chromatography analysis of syngas samples stored before the termination of the decomposition.

The discussion is based on a brief of the operational test cases of the MCw GASIFIER [1-5]. Syngas energy assessment is described with the critical parameters. It is also aimed to

In the test cases, sawdust (SD) and polyethylene pellets (PP) were used as fuel. Syngas was defined as the gasified amount of fuel, excluding the ash that left the process but including the supplied amount of air. Therefore, syngas production is governed by the nature of the fuel and the MCw plasma decomposition process (Figure I).

Monitoring of the gasification was based upon instantaneous measurements (IM) of the local instantaneous temperatures, T, along the reactor, and of the syngas volumetric content. The local instantaneous temperature measurements were taken by B-type (Pt18Rh-Pt) thermocouples located along the reactor at 5 specified locations. The thermocouples had a sensitivity of ±4 °C. The local temperature measurement was at a 1-second sampling frequency. A semi-continuous commercial gas analyzer, the MRU VARIO PLUS, was used to measure the syngas volumetric content. CO, CO2, CH4, H2, N2 and O2 in the syngas were determined with an accuracy of up to 100% and up to 25% respectively. The syngas volumetric content measurement was at a 2-second sampling frequency. Fuel decomposition occurred over a defined process gasification time, tg. Instantaneous measurements were executed from t = 0 to t = tg. The gasification time tg resulted in no solid fuel left in the reactor, with a minute amount of ash filtered from the generated syngas through a cyclone separator located in front of the MRU VARIO PLUS gas analyzer.

[Figure I. Sketch of the process [5]: the reactor control volume with five thermocouple stations T(tg) 1-5; solid fuel charge mfuel = 250 g at t = 0 and mfuel = 0 g at t = tg; plasma input of air and electrical power; syngas and ash routed through the separator to the MRU Vario-Plus gas analyzer.]
described with the critical parameters . It is also aimed to Stored amount of syngas during fuel decomposition before
determine the functional relationships between selected the termination of the process was collected in begs during
parameters expressing the physical process of gasification. the so called storage time, t s .The chromatographic analysis
, GC of stored syngas samples was done in TUBITAK-
II. Brief on process, basic parameters and MAM Laboratory according to TS EN ISO 6974 and TS EN
methodology ISO 6975 standards . Atmospheric pressure and reference
The operation of MCw GASIFIER[5] was such that air standard temperature T R = 20 ° C were valid in the
at standard temperature and pressure condition was used as analysis. C1-C6 hydrocarbons( Ethane, Ethylene, Propylene
plasma carrier. Granulated particles of different types of ,I-Pentane n- Hexane ,Butene , etc ) besides CO, CO2, CH4,
biomass having the particle size range of 0.1 mm – 1 mm H2, N2 and O2 components were determined .
were loaded as a static fuel bed in the reactor. Coal, C
34
The local instantaneous temperatures during the fuel decomposition process, tg, varied as a function of fuel and process. Similarly the volumetric amounts of CO, CO2, CH4, H2, N2 and O2 in syngas also varied during tg. Therefore the process-decomposition-reactor temperature can be defined as the syngas temperature, Tsyn, as an ensemble averaged parameter using the collected local instantaneous temperature data during tg. Similarly the molecular weight of syngas, Msyn, as a secondary ensemble averaged parameter, can be calculated using the collected syngas volumetric content data during tg. The energy content of syngas can be given by the HHV of syngas. HHVsyn can be calculated based upon the Dalton-Amagat model on the instantaneous, ensemble averaged syngas volumetric content, similar to the calculation of Msyn. The calculations are based upon the assumption of perfect gas treatment for syngas and refer to Tsyn [1,2,3,5]. HHVsyn, the relative density, RD, and the Wobbe Index (WI) of syngas [12,13] can be determined in the GC analysis of the stored syngas sample. The GC analysis is based upon the treatment of perfect gas and real gas assumptions for syngas separately. The determination of gross and net quantities of the parameters is also available [5]. Table I lists the defined parameters. HHVsyn is the common parameter of the different methodologies.

Table I. Parameters for syngas energy assessment derived from IM and GC analysis

Parameter | Base | Definition | Explanation
tg | IM | Process duration | No solid fuel left in reactor
ts | GC | Storage time | ts < tg
Tsyn | IM | Process-reactor-syngas temperature | Ensemble averaged, tg based local temperatures in reactor during tg
Msyn | IM | Molecular weight of syngas | Calculated; Dalton-Amagat model
HHVsyn | IM | Higher heating value of syngas | Calculated; perfect gas mixtures at Tsyn
HHVsyn | GC | Higher heating value of syngas | Calculated; perfect gas / real gas
RDsyn | GC | Relative density of syngas | Syngas density / air density; perfect gas / real gas
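The ensemble averaging behind Table I is compact enough to sketch. The following minimal Python illustration assumes the IM time series are available as arrays; the component molecular weights are standard values, while the volumetric heating values are generic handbook figures used only for illustration, not the values of [5].

# Sketch: ensemble-averaged Tsyn, Msyn, HHVsyn and Mn from IM series.
import numpy as np

M = {"CO": 28.01, "CO2": 44.01, "CH4": 16.04,
     "H2": 2.016, "N2": 28.013, "O2": 31.999}   # kg/kmol
HHV = {"CO": 12.6, "CH4": 39.8, "H2": 12.7,
       "CO2": 0.0, "N2": 0.0, "O2": 0.0}        # MJ/m3, illustrative only
M_AIR = 28.97                                   # kg/kmol, standard air

def ensemble_parameters(T_locations, y_series):
    """T_locations: (5, N) local temperatures over 0 <= t <= tg.
    y_series: dict of component volume-fraction series (length N)."""
    T_syn = T_locations.mean()                  # location and time average
    y_bar = {k: np.mean(v) for k, v in y_series.items()}
    # Dalton-Amagat mixture rules on the ensemble-averaged fractions:
    M_syn = sum(y_bar[k] * M[k] for k in y_bar)
    HHV_syn = sum(y_bar[k] * HHV[k] for k in y_bar)
    M_n = M_syn / M_AIR                         # normalized molecular weight
    return T_syn, M_syn, HHV_syn, M_n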
Calculated The reason of the observed deviations is due to the
Higher Heating Value of Perfect Gas incomplete decomposition of fuel with t s < t g. The similar
HHV syn IM
Syngas Mixtures at
Tsyn
magnitudes of CH4 of syngas determined by IM and GC
Calculated data can be due to the completed CH4 formation for the
HHV syn GC
Higher Heating Value of Perfect Gas time period t = t s . The greatest magnitudes of O2 and N2
Syngas Real Gas by GC data also confirm that the gasification procedure is
Syngas
not completed for t s < t g since O2 and N2 are the
density / Air components of air supplied steadily and continuously while
RD syn GC
Relative density of density fuel decomposition sensed,in terms of formation of
syngas Perfect Gas CO2,CO,CH4 ,H2 of syngas is not completed.
Real Gas
35
Therefore the IM magnitudes can be treated as the correct ones resembling the physical nature of the process. Furthermore, the temperature acting in IM is Tsyn. It is under the effect of decomposition, since the ensemble average temperature during the process at different locations is used. The GC analysis is based on the reference temperature, TR, and Tsyn >> TR. Therefore the difference Tsyn - TR is a dominant term in the discussion.

Table II. Syngas energy based upon GC data for polyethylene pellets, PP, gasification at different power use, in comparison with IM data

ts/tg | GC Net RD | GC WI (MJ/m3) | Esyn from WI (kJ)
0.28 | 1.05 | 8.21 | 7563
0.3 | 0.98 | 5.41 | 4395
0.35 | 0.97 | 7.44 | 5582

ts/tg | IM WI (MJ/m3) | IM Esyn (kJ) | IM Mn = Msyn/Mair
0.28 | 2.89 | 3298 | 0.96
0.3 | 2.713 | 3274 | 0.94
0.35 | 2.573 | 3327 | 0.91
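For reference, the Wobbe Index in Table II follows its standard definition; the tabulated syngas energy appears to scale WI by the measured syngas volume, a relation stated here as an assumption for illustration:

\[ \mathrm{WI} = \frac{\mathrm{HHV}_{syn}}{\sqrt{\mathrm{RD}_{syn}}}, \qquad E_{syn} \approx \mathrm{WI} \cdot V_{syn} \]

On this reading, the first GC row (WI = 8.21 MJ/m3, Esyn = 7563 kJ) would correspond to a syngas volume of roughly 0.92 m3.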
Table II lists the syngas energy, Esyn, calculated based upon the WI and RD of the GC data, using the amount of syngas measured from the operational cases of IM. The assumption is based upon the similarity of syngas generation with the decomposition of fuel for t > ts, since the GC data is referred in the calculation. The values calculated with the IM data at different power use are in the same range of Esyn = 3200 kJ, whereas the values of Esyn derived from GC are dependent on power use, varying in the approximate order of 4000-7300 kJ. In fact, the combined use of data from GC and IM is one of the reasons. The second reason is the temperature difference, Tsyn - TR, which induces a sensible energy increase for the syngas in the GC analysis. The magnitudes of Esyn determined by WI utilization are greater than the magnitudes obtained by IM; the Esyn calculated by IM are 0.43, 0.74 and 0.59 of the Esyn of GC.

The net WI and RD values of the syngas generated by gasification of PP at different power use based on the GC analysis do not vary depending on the perfect gas or real gas treatment of syngas (Table II). The maximum magnitude of WI belongs to the 3 kW power use with 8.21 MJ/m3, with the values 5.41 MJ/m3 and 7.44 MJ/m3 corresponding to the 4.8 kW and 6 kW power use respectively. RDsyn is of the order of 1, meaning the density of syngas is similar to that of standard air.

WI, defined as a GC analysis parameter, is also calculated using the data gathered by IM as an alternative method. The results are given in Table II with the approximate magnitudes of 2.89, 2.7 and 2.57 MJ/m3. The WI magnitudes derived from IM are almost 0.35, 0.5 and 0.34 of the corresponding magnitudes of the GC analysis. Therefore, different methods give different values for Esyn. The combined method utilization seems not to be realistic due to the difference between ts and tg, the temperature difference Tsyn - TR, and the difference in the amount of syngas.

The instantaneous measurements are used to determine the molecular weight of syngas. The calculated molecular weight of syngas, Msyn, is given as a ratio to the molecular weight of standard atmospheric air, Mair, in Table II as the normalized molecular weight, Mn = Msyn/Mair. The magnitudes of Mn are of the same order as RD. This means that, almost independent of the method used, the produced syngas is similar to air in terms of density and molecular weight. The presence of air in syngas may be the reason. Therefore, the respective orders of O2 and N2 in syngas can be determined. The ratio O2/N2 of syngas, (O2/N2)syn, relative to that of air is defined as the normalized ratio (O2/N2)n. The variation of (O2/N2)n with (O2/N2)syn is shown in Figure II using all of the collected data with C, SD and PP gasification. The syngas generated by all kinds of fuel gasification almost follows the single line shown. The data behaviour in Figure II means that the oxygen in syngas is mostly less than its amount in standard pure atmospheric air. As can be seen, only 2 data points with C gasification have (O2/N2)n > 1. The presence of the CO and CO2 components in syngas, causing a reduction in pure O2, is the possible reason for this fact.

Figure II. O2/N2 of generated syngas expressed relative to air

As a further comment, Mn is given with Msyn in Figure III, in reference to all of the collected data with C, SD and PP gasification. The magnitudes of Mn are between 0.9 and 1.15. All data, roughly independent of fuel type, follow the single line shown.
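In compact form, the two normalizations used above are (taking (O2/N2)air = 0.21/0.79 for standard air, an assumption stated here for clarity):

\[ M_{n} = \frac{M_{syn}}{M_{air}}, \qquad \left(\mathrm{O_2/N_2}\right)_{n} = \frac{(\mathrm{O_2/N_2})_{syn}}{(\mathrm{O_2/N_2})_{air}} \]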
Figure III. Variation of Mn with Msyn (kg/kgmol)

C. Influence of Fuel Content on Syngas Content: Analysis of Decomposition

In order to describe the influence of fuel on the generated syngas composition and the thermochemical decomposition process, Figure IV and Figure V can be referred to for a final deduction, just as samples. IM data are used.

Figure IV. Influence of fuel on the generated syngas H2/CH4 with % H2 content

Figure V. Influence of fuel on the generated syngas CO/CO2 with % CO2 content

It is seen that an increase in the H2% content of syngas is coupled with an increase in the H2/CH4 ratio of syngas. Meanwhile, an increase in the CO2% of syngas is coupled with a decrease in the CO/CO2 ratio of syngas. The increase of the dependent parameter given on the abscissa and the used ratio definitions are the logical fact for the observed data behaviour. Therefore, Figures IV and V can be interpreted as sample plots of the thermochemical decomposition of fuel, which is a severe function of fuel type.

Acknowledgments

The author expresses her gratitude to TUBITAK for the completed research project with grant number 115M389 (2015-2018), and to the TUBITAK-MAM Gebze Laboratory.
3) Fuel decomposition-gasification is not steady. Therefore, syngas production is not at a constant rate and characteristics; instead it is a transient procedure.

The major parameters are the Tsyn, TR, Esyn, WI, RD and Mn of syngas. The thermochemical decomposition of fuel is absolutely a solid function of the fuel content. The conversion of the C and H in fuel to the CH4, CO, CO2 and H2 components of syngas describes the process. Furthermore, the respective orders of the CO/CO2 ratio and the H2/CH4 ratio of syngas indicate the path of thermochemical decomposition, on which further analysis is vital for a complete understanding.
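As context for the CO/CO2 and H2/CH4 indicators, the decomposition path can be read against the standard global gasification reactions (a generic textbook set, not taken from [1-5]): combustion and Boudouard reactions governing CO/CO2, and water-gas and methanation reactions governing H2/CH4:

\[ C + O_2 \rightarrow CO_2, \qquad C + CO_2 \rightarrow 2\,CO \]
\[ C + H_2O \rightarrow CO + H_2, \qquad C + 2\,H_2 \rightarrow CH_4 \]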
References
[1] Sanlisoy, A., Carpinlioglu Ozdinc, M., "A review on plasma gasification for solid waste disposal", International Journal of Hydrogen Energy, Vol. 42, 2017, pp. 1361-1365.
[2] Carpinlioglu Ozdinc, M., Sanlisoy, A., "Performance assessment of plasma gasification for waste to energy conversion: A methodology for thermodynamic analysis", International Journal of Hydrogen Energy, Vol. 43, 2018, pp. 11493-11504.
[3] Sanlisoy, A., Carpinlioglu Ozdinc, M., "Microwave plasma gasification of a variety of fuel for syngas production", Plasma Chemistry and Plasma Processing, Vol. 39, Issue 5, 2019, pp. 1211-1225.
[4] Sanlisoy, A., "An Experimental Investigation on Design and Performance of Plasma Gasification Systems", Ph.D. thesis, Mechanical Engineering Department, Gaziantep University, Turkey, 2018.
[5] Carpinlioglu Ozdinc, M., "Design, Construction and Performance Assessment of a Test Plant (Microwave Gasifier) 'MCwgasifier' Using Plasma Gasification for Solid Waste-Energy Conversion in Laboratory Scale - An Experimental Case for Knowhow on Plasma Technology", TUBITAK 115M389 Final Project Report, 2018.
[6] Carpinlioglu Ozdinc, M., "Perspectives on waste to energy conversion by microwave plasma gasification", Waste to Resources 9th International Symposium MBT, MRF & Recycling, Online Conference, 18-20 May 2021, Hanover, Germany.
[7] Bishoge, O. K., Huang, X., Zhang, L., et al., "The adaptation of waste-to-energy technologies: towards the conversion of municipal solid waste into a renewable energy resource", Environmental Reviews, Vol. 27, Issue 4, 2019, pp. 435-446.
[8] Munir, M. T., Mardon, I., Al-Zuhair, S., et al., "Plasma gasification of municipal solid waste for waste-to-value processing", Renewable & Sustainable Energy Reviews, Vol. 116, 2019, Article 109461.
[9] Gimzauskaite, D., Tamosiunas, A., Tuckute, S., et al., "Treatment of diesel-contaminated soil using thermal water vapor arc plasma", Environmental Science and Pollution Research, Vol. 27, Issue 1, 2020, pp. 43-54.
[10] Mukherjee, C., Denney, J., Mbonimpa, E. G., et al., "A review on municipal solid waste-to-energy trends in the USA", Renewable & Sustainable Energy Reviews, Vol. 119, 2020, Article 109512.
[11] Gadzhiev, M. Kh., Kulikov, Yu. M., Son, E. E., et al., "Efficient generator of low-temperature argon plasma with an expanding channel of the output", High Temperature, Vol. 58, Issue 1, 2020, pp. 12-20.
[12] Hobre Instruments, "Wobbe Index and Calorimeters: General Information", 2018, https://www.hobre.com/files/products/Wobbe Index General Information.rev.1.pdf
[13] Neutrium, "Wobbe Index", 2018, https://neutrium.net/properties/wobbe-index/ (accessed 08.04.2018)
3D Object Detection using Mobile Stereo R-CNN on Nvidia
Jetson TX2
Mohamed K. Hussein
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
mohamed.khaled@eng.asu.edu.eg

Mahmoud I. Khalil
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
mahmoud.khalil@eng.asu.edu.eg

Bassem A. Abdullah
Computer and Systems Engineering Department, Ain Shams University, Cairo, Egypt
babdullah@eng.asu.edu.eg
Abstract—3D Object Detection is one of the most important perception tasks needed by autonomous vehicles to detect different road agents like other vehicles, cyclists, and pedestrians, which is essential for driving tasks like collision avoidance and path planning. In this paper, our work is focused on 3D Object Detection for the car class from stereo images, without LIDAR supervision during either training or inference, and on the challenging task of running 3D Object Detection on an embedded target, the Nvidia Jetson TX2, by modifying the Stereo R-CNN model and reducing the model size to approximately one third of the size of the original model to be more suitable for embedded targets. Experiments on the KITTI dataset showed that our model's inference time is 1.8 seconds and its average precision for the moderate car class is 17% on the test set. Our model decreases training and inference time by approximately 60% with a 13% drop on the test set, which is an expected trade-off when decreasing the number of parameters inside the model.

Keywords—3D Object Detection, Stereo Vision, Autonomous Driving, Embedded Systems

I. Introduction

3D Object Detection extends 2D object detection by also detecting the pose and orientation as well as the real-world dimensions of detected objects, as shown in Figure I. This task is essential in autonomous driving for tasks like motion prediction and accurate object localization, which is important for safe driving.

Figure I. 2D Object Detection (top) and 3D Object Detection (bottom)

The most used sensors for 3D Object Detection are LIDAR, stereo cameras, and mono cameras. State-of-the-art models heavily rely on LIDARs, but LIDARs have the disadvantage of being very expensive, costing around $70000 compared to cameras, which can cost less than $1000. Also, LIDAR point clouds contain sparse information compared to the dense information in images. The main advantage of LIDAR is accurate depth information, which is essential for 3D Object Detection; cameras cannot provide the same level of accuracy, and this is why there is a huge gap in average precision between LIDAR-based methods and camera-based methods. Camera-based methods are divided into stereo-based methods and methods relying on just one RGB image. Stereo-based methods perform better than mono-based methods because depth can be better estimated from stereo images.

While average precision is usually the metric used for comparing different methods, in this paper our work is focused more on the target platform and on running 3D Object Detection on the embedded target Nvidia Jetson TX2 with reduced memory usage and less inference time. Our contribution is testing our proposed model on the KITTI [45] 3D Object Detection task using an embedded target, while other methods use more powerful GPUs and sometimes use multiple GPUs for training and inference. For this purpose, we modified the Stereo R-CNN [1] model, which is a stereo-based end-to-end deep neural network that extends Faster R-CNN [2] and does not rely on LIDAR supervision during training or inference. The Nvidia Jetson TX2 has a 256-core Nvidia Pascal GPU architecture with 256 Nvidia CUDA cores, a Dual-Core Nvidia Denver 2 64-bit CPU and a Quad-Core ARM Cortex-A57 MPCore, and 8 GB of 128-bit LPDDR4 memory.

In the next section, we briefly review different methods used for 3D Object Detection. In section III, we illustrate the modifications made to the Stereo R-CNN network to make it more suitable for an embedded target. In section IV, we explain the experiment setup, report the results achieved, and compare them to the original model based on the evaluation on the KITTI validation set and test set. In section V, we conclude this paper with a summary of our work and contribution.

II. Related Work

A. LIDAR Based Methods

LIDAR-based methods can be mostly classified into two categories depending on how the point cloud is represented. Grid-based methods depend on point cloud voxelization, like [3]-[8], or on projecting point clouds to 2D grids [9]; these grids are then processed by 2D or 3D CNNs. Point-based methods process raw point clouds directly, like [10]-[15]. Also, some methods use both representations, like [16]-[18] and [46]. There are other categories, like [19], which uses a graph representation for point clouds. At the time of writing this paper (April 2021), the top LIDAR method on the KITTI 3D benchmark is SE-SSD [46], which has an average precision of 82.54% in the car moderate class. This is also the top method regardless of approach.

B. LIDAR + Image Fusion Based Methods

Fusion methods can be mostly classified into three categories. Early Fusion, in which inputs from the camera and LIDAR are fused very early at the input of the deep neural network, like [20]. Late Fusion, in which there is a separate pipeline for the camera and LIDAR and they are fused at a later stage in the model, like [21]-[27]. Deep Fusion, in which inputs are fused multiple times deep inside the model, like the leading work MV3D [28] and [29], [30]. At the time of writing this paper, the top fusion method on the KITTI 3D benchmark is CLOCs [27], which has an average precision of 80.67% in the car moderate class.

C. Stereo Images Based Methods

Stereo-based methods can be mostly classified into two categories. LIDAR-supervised methods need LIDAR input for training and use only stereo images during inference, like [31]-[38]. Pseudo-LIDAR was proposed in [31] to reduce the gap between 3D object detection using LIDAR and a stereo camera. It relies on building point clouds from depth maps and
feeding the generated point cloud to a LIDAR-based 3D Object Detector; this approach achieved the best average precision on the KITTI benchmark when it was published. Pseudo-LIDAR can be used with mono or stereo images by just changing the depth estimation module, which creates the depth map that is later transformed into a pseudo point cloud. In [36], Pseudo-LIDAR++ was proposed, which builds over [31] by enhancing depth estimation, which is the first step in producing the pseudo point cloud. In [34], end-to-end Pseudo-LIDAR was proposed, since previous methods trained two different blocks for depth estimation and 3D object detection separately. The Deep Stereo Geometry Network for 3D object detection was proposed in [32]; it is an end-to-end network relying on constructing a 3D geometric volume and a plane-sweep volume from 2D features, then using the 3D geometric volume for 3D object detection and the plane-sweep volume for depth estimation simultaneously. In [33], confidence-guided 3D object detection is proposed, doing depth estimation separately for foreground and background pixels and giving a confidence estimate for each pixel, which is used with the generated point cloud as input for a 3D Object Detector. In [35], ZoomNet was proposed; in this approach, the 2D regions of interest are resized to have the same resolution, so that near and far objects are analyzed with the same resolution, and an instance point cloud is generated from the depth estimation for each instance bounding box. In [37], object-centric stereo matching is proposed, aiming to enhance the stereo matching problem for 3D object detection: other approaches used depth estimation networks with depth maps as the main output, not point clouds, so in this approach instance segmentation is done and object-centric instance point clouds are generated to enhance the produced point clouds, which are then fed to a 3D object detector. In [38], a Continuous Disparity Network (CDN) is proposed with a Wasserstein distance-based loss function to enhance disparity estimation; this CDN can be used with any stereo 3D Object Detector that relies on disparity estimation, like DSGN [32].

No-LIDAR-supervision methods use only stereo images in both training and inference, like [1] and [39]-[41]. Stereo R-CNN [1] is the baseline model for our work, so it will be discussed later in section III. In [39], the images are used to generate a semantic map with 2D bounding boxes and a disparity image, which is then projected to a grid that is used to estimate the 3D bounding box. In [40], Instance Depth Aware (IDA) 3D Object Detection is proposed, in which a stereo region proposal network is used to get 2D bounding boxes; the IDA module then estimates the center of the 3D bounding box, and the region proposals are used to determine the position and orientation of the 3D bounding box. In [41], Disp R-CNN is proposed, in which the stereo pair is used as input to a Stereo Mask R-CNN network to produce instance masks for objects of interest; instance disparity is then generated, which is used to generate instance point clouds that are fed to a 3D Object Detector.

Stereo methods that rely on the concept of generating point clouds perform better than methods that use images directly for 3D Object Detection, but most of them require the presence of ground truth point clouds during training and need a lot of computation power, as there are two steps in the process: first a pseudo point cloud is generated, then it is fed to a point cloud-based 3D Object Detector.

At the time of writing this paper, the top stereo method on the KITTI 3D benchmark is CDN-DSGN [38], which has an average precision of 54.22% in the car moderate class.

D. Mono Images Based Methods

Monocular 3D Object Detection is the most challenging method due to the complete lack of depth information, and therefore there is a huge performance gap between monocular methods and other methods. At the time of writing this paper, the top monocular method [42] on the KITTI 3D benchmark has an average precision of 12.72% in the car moderate class; it is a video-based method that uses temporal cues and kinematics to improve localization accuracy.

III. Mobile Stereo R-CNN

Mobile Stereo R-CNN, as shown in Figure II, is based on Stereo R-CNN, which is an end-to-end deep neural network for 3D Object Detection from stereo images without the supervision of LIDAR in either training or inference. Our main target was running 3D Object Detection on an embedded target like the NVIDIA Jetson TX2. So first we analyzed the building blocks of Stereo R-CNN to identify bottlenecks that
Figure II. Mobile Stereo R-CNN Network Architecture
can be modified to enhance the training time, inference time, and memory footprint, and we decided to change the backbone network from ResNet101-FPN to MobileNetV2-FPN. The new model size is approximately one third of the size of the original model, and the inference time is reduced by approximately 60%, with a minor drop of 6% in average precision on the moderate car class on the KITTI validation set and a drop of 13% in average precision on the moderate car class on the KITTI test set. First, we review the building blocks of Stereo R-CNN, then discuss the backbone replacement.

A. Stereo R-CNN

Stereo R-CNN is based on Faster R-CNN. It takes as input two stereo images resized so that the shortest side length is 600px. The backbone extracts feature maps from each image; these feature maps are then used with a feature pyramid network to produce the output of the base model with 128 channels, with the following dimension ratios relative to the input dimensions: 1/4, 1/8, 1/16, 1/32, and 1/64. Five anchor scales (32, 64, 128, 256, 512) and three ratios (0.5, 1, 2) are used in the network.

Layer 0 consists of a 2D convolution operation and 2 bottleneck operations. Layer 1 consists of 1 bottleneck operation. Layer 2 consists of 3 bottleneck operations. Layer 3 consists of 7 bottleneck operations. Layer 4 consists of 4 bottleneck operations followed by a 2D convolution to produce the final feature map. The reason for choosing these layers was to match the lateral connections made to the feature pyramid network of the original model.
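The backbone swap can be sketched as follows. This is a minimal illustration, not the authors' code: the 128-channel FPN width and the extra 1/64 level follow the description above, while the torchvision layer indices used as tap points are assumptions of this sketch.

# Sketch of a MobileNetV2-FPN backbone in PyTorch (torchvision >= 0.13).
import torch
from torch import nn
from torchvision.models import mobilenet_v2
from torchvision.ops import FeaturePyramidNetwork

class MobileNetV2FPN(nn.Module):
    def __init__(self, fpn_channels: int = 128):
        super().__init__()
        self.features = mobilenet_v2(weights=None).features
        # Indices where the spatial stride reaches 4, 8, 16 and 32
        # (illustrative choice, not necessarily the authors' split).
        self.taps = {3: "p2", 6: "p3", 13: "p4", 18: "p5"}
        in_channels = [24, 32, 96, 1280]        # channels at the taps
        self.fpn = FeaturePyramidNetwork(in_channels, fpn_channels)
        self.pool = nn.MaxPool2d(1, stride=2)   # extra 1/64 level

    def forward(self, x):
        laterals = {}
        for i, layer in enumerate(self.features):
            x = layer(x)
            if i in self.taps:
                laterals[self.taps[i]] = x
        outs = self.fpn(laterals)               # 1/4 ... 1/32, 128 ch each
        outs["p6"] = self.pool(outs["p5"])      # 1/64 level
        return outs

# Applied independently to each image of the stereo pair:
# feats_left = MobileNetV2FPN()(torch.rand(1, 3, 600, 800))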
Figure III. Precision-Recall curves for Orientation estimation, 3D object detection, and Bird's eye view detection respectively
[16] Y. Chen, S. Liu, X. Shen, and J. Jia, "Fast point R-CNN," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019.
[17] C. He, H. Zeng, J. Huang, X.-S. Hua, and L. Zhang, "Structure aware single-stage 3d object detection from point cloud," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[18] S. Shi, C. Guo, L. Jiang, Z. Wang, J. Shi, X. Wang, and H. Li, "PV-RCNN: Point-voxel feature set abstraction for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[19] W. Shi and R. Rajkumar, "Point-GNN: Graph neural network for 3d object detection in a point cloud," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[20] M. Simon, K. Amende, A. Kraus, J. Honer, T. Samann, H. Kaulbersch, S. Milz, and H. Michael Gross, "Complexer-yolo: Real-time 3d object detection and tracking on semantic point clouds," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[21] X. Du, M. H. Ang Jr., S. Karaman, and D. Rus, "A general pipeline for 3d detection of vehicles," 2018.
[22] C. R. Qi, W. Liu, C. Wu, H. Su, and L. J. Guibas, "Frustum pointnets for 3d object detection from RGB-d data," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[23] M. Liang, B. Yang, Y. Chen, R. Hu, and R. Urtasun, "Multi-task multi-sensor fusion for 3d object detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[24] Z. Wang and K. Jia, "Frustum convnet: Sliding frustums to aggregate local point-wise features for amodal 3d object detection," 2019.
[25] S. Vora, A. H. Lang, B. Helou, and O. Beijbom, "Pointpainting: Sequential fusion for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[26] J. H. Yoo, Y. Kim, J. Kim, and J. W. Choi, "3d-cvf: Generating joint camera and lidar features using cross-view spatial feature fusion for 3d object detection," Lecture Notes in Computer Science, pp. 720–736, 2020.
[27] S. Pang, D. Morris, and H. Radha, "Clocs: Camera-lidar object candidates fusion for 3d object detection," 2020.
[28] X. Chen, H. Ma, J. Wan, B. Li, and T. Xia, "Multi-view 3d object detection network for autonomous driving," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[29] J. Ku, M. Mozifian, J. Lee, A. Harakeh, and S. Waslander, "Joint 3d proposal generation and object detection from view aggregation," 2018.
[30] M. Liang, B. Yang, S. Wang, and R. Urtasun, "Deep continuous fusion for multi-sensor 3d object detection," in Proceedings of the European Conference on Computer Vision (ECCV), September 2018.
[31] Y. Wang, W.-L. Chao, D. Garg, B. Hariharan, M. Campbell, and K. Q. Weinberger, "Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[32] Y. Chen, S. Liu, X. Shen, and J. Jia, "DSGN: Deep stereo geometry network for 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[33] C. Li, J. Ku, and S. L. Waslander, "Confidence guided stereo 3d object detection with split depth estimation," 2020.
[34] R. Qian, D. Garg, Y. Wang, Y. You, S. Belongie, B. Hariharan, M. Campbell, K. Q. Weinberger, and W.-L. Chao, "End-to-end pseudo-lidar for image-based 3d object detection," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[35] Z. Xu, W. Zhang, X. Ye, X. Tan, W. Yang, S. Wen, E. Ding, A. Meng, and L. Huang, "Zoomnet: Part-aware adaptive zooming neural network for 3d object detection," 2020.
[36] Y. You, Y. Wang, W.-L. Chao, D. Garg, G. Pleiss, B. Hariharan, M. Campbell, and K. Q. Weinberger, "Pseudo-lidar++: Accurate depth for 3d object detection in autonomous driving," 2020.
[37] A. D. Pon, J. Ku, C. Li, and S. L. Waslander, "Object-centric stereo matching for 3d object detection," in 2020 IEEE International Conference on Robotics and Automation (ICRA), IEEE, May 2020.
[38] D. Garg, Y. Wang, B. Hariharan, M. Campbell, K. Weinberger, and W.-L. Chao, "Wasserstein distances for stereo disparity estimation," in NeurIPS, 2020.
[39] H. Konigshof, N. O. Salscheider, and C. Stiller, "Realtime 3d object detection for automated driving using stereo vision and semantic information," in 2019 IEEE Intelligent Transportation Systems Conference (ITSC), IEEE, Oct. 2019.
[40] W. Peng, H. Pan, H. Liu, and Y. Sun, "Ida-3d: Instance-depth-aware 3d object detection from stereo vision for autonomous driving," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[41] J. Sun, L. Chen, Y. Xie, S. Zhang, Q. Jiang, X. Zhou, and H. Bao, "Disp R-CNN: Stereo 3d object detection via shape prior guided instance disparity estimation," in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[42] G. Brazil, G. Pons-Moll, X. Liu, and B. Schiele, "Kinematic 3d object detection in monocular video," in Computer Vision – ECCV 2020, pp. 135–152, Springer International Publishing, 2020.
[43] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "Mobilenetv2: Inverted residuals and linear bottlenecks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[44] X. Chen, K. Kundu, Y. Zhu, H. Ma, S. Fidler, and R. Urtasun, "3d object proposals using stereo imagery for accurate object class detection," 2017.
[45] A. Geiger, P. Lenz, and R. Urtasun, "Are we ready for autonomous driving? The KITTI vision benchmark suite," in Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[46] Z. Li, Y. Yao, Z. Quan, W. Yang, and J. Xie, "Sienet: Spatial information enhancement network for 3d object detection from point cloud," 2021.
Investigation of Algorithmic Architecture Design Method by
Using Digital Technology to Increase Flexibility in Design
Process
Javad Eiraji
Faculty of Architecture and Design, Eskisehir Technical University, Eskisehir, Turkey
javadeiraji@eskisehir.edu.tr

Aida Ghiaseddin
Department of Architecture, Islamic Azad University, Tehran, Iran
st_a_ghiaseddin@azad.ac.ir
Abstract—Architectural projects in metropolitan cities are full of diversity and innovation. Observing these global architectural projects generates new ideas, and in this way we can understand this architectural diversity. In addition, as this architectural variety increases, conventional tools and methods for designing architectural projects become insufficient, and new methods and tools need to be provided. One of these tools and methods is designing by an algorithmic method. The algorithmic architecture method allows the designer to create various designs quickly and provides precise control over the project during the architectural process. This paper is a qualitative study which seeks to evaluate algorithmic design as a powerful tool for efficient digital architecture and for implementing flexible and creative ideas. The aim of this study is to explore an approach that increases designers' creativity and provides a framework for the basic requirements of architectural design.

Keywords—Algorithmic Design Process, Performance-Based Design, Parametric Architecture, Computer-Aided Design, Flexible Design

I. Introduction

The term "algotecture" was derived from algorithmic architecture and represents the use of algorithms for architectural design. It has become widespread in architectural design over the past few decades. Parametric instruments work based on algorithms, which can apply exact control over the geometry of the design throughout the design process. The flexibility and responsiveness of these instruments to design changes have resulted in the usefulness and applicability of parametric models, particularly in designing complex and unique models [2].

II. Literature Review

According to Patrick Schumacher, a successful design employs technologies and instruments which help the designer proceed with his design. Regarding algorithmic and parametric architecture, he states that architecture should move from a single-layer system and an application for design editions toward a multi-layer and yet consistent and continuous design of multisystems such as the envelope, structure, and internal subsections. The application of any design operation on a multisystem must be correlated with the other components of the system and influence them.
The conception of this design originates from production design, whose primary purpose is to help human designers discover the design space through computational instruments [8]. This purpose can be realized by computers through fast sampling. Parametric design systems are discussed as production instruments in architectural design, and the parametric instruments are deployed algorithmically, so they apply more computational control over the design geometry.

The role of parametric modeling is addressed as that of a production design instrument in architecture. Parametric design is a computational method that can act as a productive (generative) method and as an analysis. Besides, it has recently received good acceptability in terms of practicality, research, and education. There are some debates about the limitations of parametric systems as an exploratory instrument, mainly addressing their role in architectural design, flexibility, and design complexity. According to its algorithmic basis and its potential for expanding the design discovery space by changing the algorithm variables, i.e., parameters, parametric design can be classified as the third class of production systems [9].

Algorithmic thinking and algorithmic design correlate with the concept of product design. Terzidis argues that the inductive strategy of algorithms can discover production processes or simulate complex phenomena. Algorithms can be assumed to be an extension of the human brain and might facilitate mutation in some regions of unpredictable potential [10].

A. Parametric Design Systems

The term parametricism was first introduced by P. Schumacher and has later been described more comprehensively as a combination of design concepts which provides a new and complex discipline based on fundamental principles.

Parametric systems work according to algorithmic rules. An algorithm is a limited set of instructions for achieving a specific aim. An algorithm takes a value or a group of values as the input, performs several countable stages which convert or change the input, and finally generates one or multiple values as the output [11].

A parameter is the value or measurement of a variable, which can vary. Every object (thing) in a parametric system might have some specific rules. When a parameter changes, other parameters will be adapted to it automatically [12].
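To make this rule-driven dependency concrete, the following minimal sketch (illustrative only, and not tied to any of the software packages discussed below) defines a small parametric "definition" in which changing one input parameter automatically regenerates all dependent values:

# A toy parametric definition: a facade grid whose panel size and
# glazed area are derived from a few input parameters.  Changing an
# input and re-running the rules regenerates the whole design.
from dataclasses import dataclass

@dataclass
class FacadeParams:
    width: float = 24.0    # overall facade width (m)
    height: float = 12.0   # overall facade height (m)
    columns: int = 8       # number of panel columns
    rows: int = 4          # number of panel rows
    opening: float = 0.4   # fraction of each panel that is glazed

def generate(p: FacadeParams):
    """Rule set: panel dimensions and openings follow the parameters."""
    pw, ph = p.width / p.columns, p.height / p.rows
    return [(i * pw, j * ph, pw, ph, p.opening * pw * ph)
            for i in range(p.columns) for j in range(p.rows)]
    # each tuple: (x, y, panel width, panel height, glazed area)

base = generate(FacadeParams())
variant = generate(FacadeParams(columns=12, opening=0.6))
# one changed input, and every dependent panel dimension and
# glazed area updates automatically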
Parametric design is usually used for complex building molds, energy and structural optimization, other repetitive work, and design sampling. As a novel digital design method, parametric design is entirely different from CAD/CAM because its algorithmic features are rule-based (Fig I).

Figure I. Simplified examples of parametric variations generated in the same system. [1]

In terms of computation, there is no difference between algorithmic and parametric systems. Algorithms presumably work on parameters, and the main component of a parametric system is the algorithm itself, which is called a design or definition. However, unlike algorithmic design, parametric systems emphasize explicit and direct manipulation of the parameter values to change and modify the design artifact. The significant advantage of parametric modeling is that it allows us to change the parameters at any level of the design process [13].

So far, various parametric modeling techniques have been developed for visual purposes (e.g., form finding) and for other functional or performance-related purposes. Figure II shows some of the usual parametric modeling techniques used for form-finding, e.g., repetition (Fig II – left) and division (Fig II – right). Other similar methods may include tiling and the like (Fig II).

Figure II. Examples of parametric modeling techniques for form finding: Repetition (Left) and Subdivision (Right). [1]

Most software supports free-form modeling, and many packages have a scripting plug-in (add-on) enabling the designers to create rule algorithms directly and more freely. Some of these software packages are introduced briefly in this division. Rhino and Grasshopper are among the most commonly known parametric instruments, particularly in architecture. However, Digital Project (DP) and Generative Components (GC) are more suitable for large projects with multiplex and geometric associations [13].

DP is a compelling software package that can effectively shoulder geometric and complicated parameters, making it an ideal choice for sizeable parametric design projects.

Rhino is an independent, NURBS-based instrument developed by Robert McNeel. Since the 1990s, it has been widely used in different subjects, including architecture, industrial design, jewelry design, automotive, and marine design. Furthermore, Grasshopper is a rule algorithm editor with a graphic interface incorporated into Rhino as a scripting plug-in. This structure makes specific definition files that link to the main parametric model in Rhino.

Rhino is commonly used as a production instrument, rather than a correction (modification) instrument, in the parametric design process. Compared to other parametric software, Rhino and Grasshopper are now widely used in practice and education. Such wide use is attributed to the simplicity of their function as a visual programming instrument.

A significant difference between parametric design and other standard computer design methods is the ruleset converted into the main design elements and procedures [14]. In this technique, through the parametric design procedure, the architects can return to any stage of the design
process to change the parameters or revise the rules, to modify the design procedure for different purposes, or to perform different tests and experiments. Such flexibility allows the architects to keep the design open (Fig III).

Figure III. A large number of form variations generated in a parametric design system. [1]

In parametric design, once the rule algorithms are created, many design choices can be generated (Fig III). This design sampling can expand the design abilities considerably and extend the designer's thinking. Additionally, the designers do not need to settle on any solution early [15]. This feature allows keeping the maximum potential in the process. Parametric design is not only a new design instrument; it is a new method of design thinking [16].

B. Describing Design based on Parametric Logic

Parametric modeling as a design integration method diverges the design space to discover many kinds of similar parametric models. Parametric modeling can thus open a broader region for design discovery. A change in a parameter causes a concurrent change in the form: while maintaining the basic coherence of the design, it applies the changes to the form.

In Figure IV, the multiple geometric arrangements of the British Petrol Headquarters in Sunbury, presented by Adams Kara Taylor, show that the creative discovery of the roof structure is based on a parametric approach which takes both aesthetic and structural considerations into account. The parametric model turns into a controlled environment for design discovery which searches for a greater design (Fig IV).

The Aviva Stadium in Dublin and the Kilden Performing Arts Center in Norway are two works that are exemplary of these two approaches (Fig V). During the design process, the architects were ultimately driving the overall form and cladding of the building, and the engineers were driving the structural member sizing and positioning. On the architectural side, certain form explorations were being made in response to certain criteria such as concourse width requirements, floor area ratios, or simply beautifying the shape; on the engineering side were the structure of the roof trusses and the cladding system, designed as a rain screen consisting of interlocking louvers. A single parametric model was shared between the architectural and engineering offices, which acted both as a design tool and as a coordination platform. This allowed the integration of the design processes of the form, structure and façade, allowing a fast response to design changes. Analysis tools were coupled with the parametric model and provided a quick analytical reaction to the geometry. The sharing of the parametric model across the other design members and the full integration of the engineering analysis applications realized the benefits of a parametric approach (Fig VI).
Figure VI. Aviva Stadium in Dublin, Ireland. [6]

The geometry of the façade is a ruled surface that spans between a straight upper edge and a curved lower edge. Here, a parametric system was used not during the form-finding process of the curvilinear roof, but during detail design for the parametric optimization of form and performance (Fig VII).
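As a side note, a ruled surface of this kind admits a simple parametric description: every point lies on a straight line connecting the two edge curves, which is one reason such geometry lends itself to parametric optimization (the notation here is introduced only for illustration):

\[ S(u,v) = (1-v)\,C_{lower}(u) + v\,C_{upper}(u), \qquad u,v \in [0,1] \]

where C_lower(u) traces the curved lower edge and C_upper(u) the straight upper edge.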
Figure VII. Model photo of Kilden Performing Arts Center. [6]

Another similar example is the Sydney Opera House by Jorn Utzon. It was a competition project that won the first prize in 1957. Today, this monument is known as a masterpiece that catches the attention of many engineers and architects. However, the design discovery procedure had resulted in many differences at the time of its construction [18]. The geometry of the roof structure was not primarily defined and was considered unbuildable at first [19]. Over five years from designing the concept, the engineers and architects had to change the roof into an appropriate form that allows the use of a unit mold and, consequently, a unit bending during the construction stages [20]. This pattern by Gaudi and Utzon indicates a type of geometry that can respond to computational approaches and, especially, to performance-driven parametric systems.
What is learned from investigating the parametric architecture design method by using digital technology, aimed at improving flexibility, is that a mutual and dynamic relationship between the components can cause maturity and evolution of the design. The digital design process improves the designer's creativity and mind power and causes his/her mind to work more dynamically and precisely; besides, it enriches the designer's mental archive with the numerousness and diversity of ideas and architectural structures. In short, algorithmic architecture employs the designer's creativity, like a computer that does not merely take the quantities into account, and engages it in the design process. This approach aligns architecture with some imaginations that were far from being brought into action until a short while ago. However, it is obvious that this method will be effective only when, in terms of qualitative criteria, the designer's mind continuously evaluates and controls the design process as an intelligent supervisor.

References
[1] Ning Gu, Rongrong Yu, and Peiman Amini Behbahani, "Parametric Design: Theoretical Development and Algorithmic Foundation for Design Generation in Architecture", Handbook of the Mathematics of the Arts and Sciences, Springer, 2018.
[2] Moghtadanezhad, Mehdi, Pashaei, Sevda, "Investigation of the Impact of Parametric Architectural Design Process Based on Algorithmic Design, A New Method in Digital Architectural Design for Achieving Sustainable Architectural Goals", 3rd International Conference on Modern Research in Civil Engineering, Architecture and Urban Development, Berlin, Germany, 2016.
[3] Patrick Schumacher, "Parametricism: Rethinking Architecture's Agenda for the 21st Century (Architectural Design)", Academy Press, 2016.
[4] Michael Meredith, AGU, Mutsuro Sasaki, P.ART, Designtoproduction, Aranda, "From Control to Design: Parametric/Algorithmic Architecture", Actar, English edition, 2008.
[5] Farshid Moussavi, "Parametric software is no substitute for parametric thinking", The Architectural Review, Actar, FunctionLab, Harvard Graduate School of Design, 2011.
[6] Ipek Gursel Dino, "Creative design exploration by parametric generative systems in architecture", Journal of the Faculty of Architecture, volume 29, issue 1, 2012.
[7] John Frazer, "Parametric computation: history and future", Architectural Design, volume 86, issue 2, 2016.
[8] Christiane M. Herr, Thomas Kvan, "Adapting cellular automata to support the architectural design process", Automation in Construction, volume 16, issue 1, 2007.
[9] Sheng-Fen Chien, "Supporting information navigation in generative design systems", Carnegie Mellon University, 1998.
[10] Achim Menges, Sean Ahlquist, "Computational Design Thinking", Wiley, 1st edition, 2011.
[11] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, "Introduction to Algorithms, Second Edition", The MIT Press, 2001.
[12] Michael J. Ostwald, "Systems and enablers: modeling the impact of contemporary computational methods and technologies on the design process", Computational Design Methods and Technologies, 2012.
[13] Javier Monedero, "Parametric design: a review and some experiences", Automation in Construction, volume 9, issue 4, 2000.
[14] Abdelsalam, Mai, "The Use of the Smart Geometry through Various Design Processes: Using the programming platform (parametric features) and generative components", International Conference Proceedings of the Arab Society for Computer Aided Architectural Design (ASCAAD), 2009.
[15] Carlos Roberto Barrios Hernandez, "Thinking parametric design: introducing parametric Gaudi", Design Studies, volume 27, issue 3, 2006.
[16] David Karle, Brian M. Kelly, "Parametric thinking", Proceedings of the ACADIA Regional Conference, 2011.
[17] Tomlow, Jos, "The Model: Antoni Gaudi's hanging model and its reconstruction", PhD thesis, Universität Stuttgart, 1989.
[18] Ömer Akin, "Three Fundamental Tenets for Architectural Ethics", invited paper for the ACSA Teacher's Conference, Cranbrook Academy of Art, 2004.
[19] John Yeomans, "The Other Taj Mahal: What Happened to the Sydney Opera House", Longman Australia, 1973.
[20] O. Arup, R. S. Jenkins, "The evolution and design of the Concourse at the Sydney Opera House", volume 39, issue 4, 1968.
Calculating the Lower Angular Excited States in Two
Dimensions Using the Finite Difference Time Domain Method
Huwaida Elgweri
Department of Physics, University of Tripoli, Tripoli, Libya
H.Elgweri@uot.edu.ly

Amal Hamed
Department of Physics, University of Tripoli, Tripoli, Libya
Amal.HAMED@uot.edu.ly

Mohamed Mansor
Department of Physics, University of Tripoli, Tripoli, Libya
m.mansor@uot.edu.ly
Abstract - The Finite Difference Time Domain method has term contains the potential, so the dimensionless form of the
been used to find the angular excited states wave functions in two Hamiltonian is given by,
dimensions. These excited states are calculated by applying the
iterative procedure on a specified initial guess wave function that ̂ ⃗
contains the desired excited state as a lowest state, this is simply The exact eigenfunctions and eigenvalues of the
done by introducing lines of zeros in the wave functions and their
second derivatives. This of course depends on the symmetry of the potential. We choose here either square or cylindrical symmetry, so the lowest angular excitations will contain lines of zeros, one or two, passing through the region, namely the first excited state and the second excited state respectively. In our investigation we apply this technique to two simple potentials, the two-dimensional simple harmonic oscillator and the finite cylindrical well potential, in order to illustrate the accuracy and the efficiency of these calculations. These potentials were chosen because their analytical solutions are available, so that they can be compared with our results obtained using a MATLAB program.

Keywords: Finite Difference Time Domain Method, diffusion equation, cylindrical well potential, simple harmonic oscillator, Schrödinger equation

I. Introduction

The Finite Difference Time Domain Method (FDTD) has several applications, such as electromagnetic wave simulations, solving Maxwell's equations [1], solar cells, filters, optical switches, semiconductor-based photonic devices and nonlinear devices [2]; it is also used for solving the Schrödinger equation, which is the main topic of our investigation. This method has many advantages, for instance the accuracy of the numerical modeling, flexibility for arbitrary geometrical shapes, and ease of programming [3,4]. However, because of its diffusion behavior the method is suited to calculating only the ground state of a quantum system, and in this paper we present a modified technique that improves the FDTD method so that it is also valid for calculating the lower angular excited states.

The time-dependent Schrödinger equation provides a description of the quantum system; it is given by

iℏ ∂Ψ(r⃗, t)/∂t = Ĥ Ψ(r⃗, t)        (1)

where Ĥ is the Hamiltonian of the system,

Ĥ = −(ℏ²/2m)∇² + V(r⃗)        (2)

The first term of the Hamiltonian contains the kinetic operator and the second term the potential operator. The solution of the differential equation (1) can be obtained analytically only for a handful of potentials, and in most cases we have to resort to numerical analysis. One of the most useful numerical techniques is the diffusion method, which is based on the equivalence of the time-dependent Schrödinger equation to a diffusion-type equation: performing a transformation from the real time domain to the imaginary time domain, τ = it/ℏ, we get the following equation

∂Ψ(r⃗, τ)/∂τ = −Ĥ Ψ(r⃗, τ)        (3)

There have been various numerical methods to solve this diffusion-type equation, such as the diffusion Monte Carlo method [5], the Grimm and Storer approximation method [6], and the finite difference time domain method (FDTD) [7]. All these methods involve an iterative procedure applied to an arbitrary initial guess wave function that contains a mixture of all possible state wave functions. This iterative process can be viewed physically as cooling the system and lowering its energy [8], so the iterative procedure will always lead to the ground state of the system. Hence, if one is interested only in the ground state of a system, the diffusion method is simple and sufficient.

The advantage of this work is that it extracts the higher angular excited states using lines of zeros in the wave function. This is done by classifying the initial guess wave function into an even-parity wave function or an odd-parity wave function. The procedure will still give the lowest possible excited state. The space used in these calculations is extended into the non-classical region and is kept small by using the end-point formula for the second derivative. The end-point formula allows us to calculate the wave function self-consistently in the given region; however, the region should be extended far enough to calculate accurate energy eigenvalues. Due to the symmetry, only one sign region is kept for the actual numerical calculations.
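To make the diffusion iteration concrete, the following sketch (our illustration in Python rather than the MATLAB program used in the paper; the box size, grid spacing, time step and iteration count are assumed values, not the authors' parameters) relaxes an arbitrary initial guess toward the ground state of the dimensionless two-dimensional oscillator used later in Section III:

import numpy as np

def laplacian(psi, h):
    # 5-point finite-difference Laplacian; the border entries stay zero, since
    # the wave function is negligible at the edge of a large enough box
    lap = np.zeros_like(psi)
    lap[1:-1, 1:-1] = (psi[2:, 1:-1] + psi[:-2, 1:-1] + psi[1:-1, 2:]
                       + psi[1:-1, :-2] - 4.0 * psi[1:-1, 1:-1]) / h**2
    return lap

h, dtau, n_iter = 0.1, 0.001, 20000        # assumed grid spacing, time step, sweeps
x = np.linspace(-6.0, 6.0, 121)            # assumed box; h = x[1] - x[0] = 0.1
X, Y = np.meshgrid(x, x, indexing="ij")
V = X**2 + Y**2                            # oscillator potential, energy in units of hw/2

psi = np.exp(-(X**2 + Y**2))               # arbitrary initial guess (mixture of states)
for _ in range(n_iter):
    psi = psi + dtau * (laplacian(psi, h) - V * psi)   # dpsi/dtau = -H psi, eq. (3)
    psi /= np.sqrt(np.sum(psi**2) * h * h)             # renormalize every sweep

E = np.sum(psi * (-laplacian(psi, h) + V * psi)) * h * h
print(round(E, 4))                         # relaxes toward the ground state, E = 2.0

Because every excited component decays like e^(−(E_n − E_0)τ), the loop always filters out everything but the ground state; this is exactly the limitation that the parity classification described below is designed to overcome.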
In addition to the introduction, this paper is organized into three further sections as follows: the general theory section presents the formulation of the FDTD method that is used to calculate the ground state, followed by the performance improvement that allows this method to be employed for the lower angular excited states; in the application section the improved FDTD method is applied to two familiar examples in order to test the technique; finally, the conclusion section contains the conclusions and the summary of this work.

II. General Theory

By using the separation of variables technique we get the formal solution of the diffusion equation (3) in two dimensions as

Ψ(x, y, τ) = Σ_n c_n φ_n(x, y) e^(−E_n τ)

where the c_n are expansion coefficients, and φ_n and E_n are a complete set of eigenfunctions and their corresponding energy eigenvalues for the time-independent Schrödinger equation, so they satisfy

Ĥ φ_n(x, y) = E_n φ_n(x, y)

The key point of the improved method is to introduce an initial guess wave function that is subjected to certain symmetry properties; applying a suitable iterative procedure will then converge to the lowest angular excited state contained in this initial guess wave function.

Therefore, introducing an odd initial guess wave function that contains one line of zeros lying along either x = 0 or y = 0 will exclude all even state wave functions, so this odd initial guess wave function will be a mixture of only the odd wave functions of the system. Since only the lowest of these states survives the iteration, it can easily be seen that applying an iterative procedure subjected to the antisymmetric property on the zero line will approach the first angular excited state, i.e. Ψ → c_1 φ_1 e^(−E_1 τ) for large τ.

Similarly, introducing an even initial guess wave function that contains two lines of zeros lying along both x = 0 and y = 0 will exclude all dissimilar state wave functions, so the same iteration approaches the second angular excited state.
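Continuing the sketch above (again an illustration under the same assumed parameters, not the authors' code), the parity classification can be imposed by starting from an odd guess and re-projecting onto the odd subspace each sweep, so the iteration can no longer fall into the even ground state:

psi = X * np.exp(-(X**2 + Y**2))           # odd in x: one zero line, along x = 0
for _ in range(n_iter):
    psi = psi + dtau * (laplacian(psi, h) - V * psi)
    psi = 0.5 * (psi - psi[::-1, :])       # re-impose antisymmetry across x = 0
    psi /= np.sqrt(np.sum(psi**2) * h * h)

E = np.sum(psi * (-laplacian(psi, h) + V * psi)) * h * h
print(round(E, 4))                         # relaxes to the first excited state, E = 4.0

An even guess with zero lines on both axes, e.g. psi = X * Y * np.exp(-(X**2 + Y**2)), isolates the second angular excited state (E = 6.0 in these units) in the same way.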
In other words, ∆τ must be chosen carefully because of its association with the grid spacing. If there is no zero line on the axis, then the second derivative across this axis is given by the usual symmetric three-point difference formula. The second-order spatial derivative at the boundaries of the spatial mesh is calculated using the end-point difference formula, which is what allows the mesh to be kept small.

The energy eigenvalues are calculated by means of a numerical evaluation of the expectation value of the Hamiltonian for the corresponding normalized eigenfunctions as

E = Σ_i Σ_j ψ_ij (Ĥψ)_ij ∆x ∆y

III. Applications

A. Two Dimensional Simple Harmonic Oscillator

As a first example, we consider the simple harmonic oscillator in two dimensions, which is a good example to test the validity of the presented method. The distance is measured in units of √(ℏ/mω) and the unit of energy is ℏω/2. In these units the Hamiltonian operator for the relative system is

Ĥ = −(∂²/∂x² + ∂²/∂y²) + (x² + y²)

where x and y are the dimensionless coordinates, so that the potential operator in (2) can be written as V(x, y) = x² + y².
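The same expectation value can also be evaluated with the trapezoidal rule that the paper uses for its integrals; a short continuation of the previous sketches (assuming the same arrays psi, V, x and h):

Hpsi = -laplacian(psi, h) + V * psi
norm = np.trapz(np.trapz(psi * psi, x), x)     # trapezoidal rule in y, then in x
E = np.trapz(np.trapz(psi * Hpsi, x), x) / norm

The trapezoidal weights differ from the plain double sum only at the edge of the box, where the wave function is already negligible.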
Table I. Comparison of the numerical energy eigenvalues of the first three lowest states with their corresponding exact energy eigenvalues for the two-dimensional simple harmonic oscillator

State                  Numerical eigenvalue   Exact eigenvalue
Ground state           1.9999                 2.0
First excited state    3.9977                 4.0
Second excited state   5.9953                 6.0

Figure I. The first three lowest states wave functions of the two-dimensional simple harmonic oscillator calculated numerically.
a. The normalized ground state wave function
b. The normalized first angular excited state wave function
c. The normalized second angular excited state wave function

All the previous calculations are performed with suitably chosen values of ∆x and ∆τ; with these parameters the number of iterations used is sufficient to get acceptable results. Numerically, the integrals used to normalize the wave function and those used to determine the energy eigenvalues are evaluated using the trapezoidal rule.

B. Finite Cylindrical Well Potential

As a second example, we extend the same calculations to the finite cylindrical well potential, which is given in Cartesian coordinates by

V(x, y) = −V₀ for √(x² + y²) ≤ a,  and  V(x, y) = 0 for √(x² + y²) > a,

where V₀ is the depth of the potential and a is the radius of the potential.

In the chosen distance and energy units, the dimensionless form of the time-independent Schrödinger equation is given by

−(∂²ψ/∂x² + ∂²ψ/∂y²) + V(x, y)ψ = Eψ        (31)

The exact energy eigenvalues and their corresponding eigenfunctions are calculated analytically by transforming (31) into the polar coordinate form

−(1/r) ∂/∂r (r ∂ψ/∂r) − (1/r²) ∂²ψ/∂θ² + V(r)ψ = Eψ        (32)

whose bound-state solutions are

ψ_m(r, θ) = e^(imθ) { A J_m(k r) for r ≤ a;  B K_m(κ r) for r > a }        (33)

where k is defined as √(E + V₀) and κ is defined as √(−E). Matching the logarithmic derivatives of the two pieces at r = a gives

k J′_m(k a) K_m(κ a) = κ K′_m(κ a) J_m(k a)        (34)

Therefore, the algebraic equation (34) will be satisfied only at certain discrete energy eigenvalues E_n, which can be obtained by finding the roots of this equation numerically. After the energy eigenvalues have been found, their corresponding eigenfunctions can be calculated very easily by plugging each energy eigenvalue into (33).

Numerically, in the finite cylindrical well potential we have to use a small spatial mesh size because of the circular shape of this potential, and thereby the time step must be chosen carefully depending on the stability condition. In Tables II, III and IV we show the effect of reducing the spatial mesh size and the associated time step on the numerical eigenvalues, by presenting the first three lowest states of the finite cylindrical well potential with depth V₀ and radius a. However, a smaller time step requires a larger number of iterations to get acceptable results.

In Figure II (a, b, c) we show the first three lowest eigenfunctions, respectively, for the finite cylindrical well potential with the same depth and radius; these eigenfunctions are calculated numerically using the best-fitting parameters. To illustrate the accuracy of the numerical results we show the difference between the numerical eigenfunctions and their corresponding exact eigenfunctions alongside the figures. In this case, nearly 5000 iterations are required to get acceptable results. Again, the integrals are evaluated numerically using the trapezoidal rule.
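Finding the roots of the matching condition (34) numerically is straightforward; the sketch below does it with standard Bessel routines, with an assumed depth V₀ = 30, radius a = 2 and angular number m = 0 purely for illustration (the paper's actual well parameters are not reproduced here):

import numpy as np
from scipy.special import jv, kv, jvp, kvp
from scipy.optimize import brentq

V0, a, m = 30.0, 2.0, 0                    # assumed well depth, radius, angular number

def matching(E):
    # continuity of the logarithmic derivative at r = a, i.e. equation (34)
    k, kappa = np.sqrt(E + V0), np.sqrt(-E)
    return k * jvp(m, k * a) * kv(m, kappa * a) - kappa * kvp(m, kappa * a) * jv(m, k * a)

grid = np.linspace(-V0 + 1e-6, -1e-6, 4000)    # bound states lie in -V0 < E < 0
vals = [matching(E) for E in grid]
roots = [brentq(matching, grid[i], grid[i + 1])
         for i in range(len(grid) - 1) if vals[i] * vals[i + 1] < 0]
print(roots)                               # the discrete bound-state eigenvalues

Scanning m = 0, 1, 2, ... yields the angular excited levels that the parity-constrained diffusion iteration should converge to.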
Figure II. The first three lowest states wave functions of the finite cylindrical well calculated numerically.
a. The normalized ground state wave function
b. The normalized first angular excited state wave function
c. The normalized second angular excited state wave function

Table II. The numerical ground state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -28.78743.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -28.79096              0.00353
0.08   0.0005   -28.78791              0.00048
0.07   0.0004   -28.78778              0.00035

Table III. The numerical first angular excited state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -26.92663.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -26.93885              0.01222
0.08   0.0005   -26.92997              0.00334
0.07   0.0004   -26.92901              0.00238
0.05   0.0003   -26.92866              0.00203
0.04   0.0001   -26.92516              0.00147

Table IV. The numerical second angular excited state eigenvalue of the finite cylindrical well potential with depth V₀ and radius a, calculated using different values of ∆x and ∆τ. The analytical eigenvalue is -24.48947.

∆x     ∆τ       Numerical eigenvalue   Absolute error
0.1    0.001    -24.55968              0.07021
0.08   0.0005   -24.52546              0.03599
0.07   0.0004   -24.48951              0.00004

IV. Conclusion

The present investigation has demonstrated the usefulness of the FDTD method with appropriate symmetric boundary conditions on the wave functions for extracting the eigenfunctions and eigenvalues of the lower angular excited states of cylindrically symmetric potentials, where such states exist. The diffusion method is very suitable for ground state calculations, and it was shown in this paper that the method is valid for determining the lower angular excited states as well, by choosing an appropriate initial guess wave function that has to be an odd function for the first excited state and an even function for the second excited state. Choosing such a special initial guess wave function removes all dissimilar excited states and forces the iterative procedure to converge to the lowest state contained in this initial guess function. Numerically removing the ground state by the Gram-Schmidt orthogonalization procedure can also lead to the excited states, but the method introduced in this paper offers the advantage over that method of being less expensive in numerical cost. In addition, since the calculations of the symmetric method are performed in the first quarter of the plane, the numerical cost is greatly reduced, to a quarter of that required to perform the calculations over the entire plane; the end-point derivatives further reduce the number of mesh points considerably. We have provided detailed calculations to illustrate this improved technique in two dimensions, and we applied it to two different potentials as examples, namely the simple harmonic oscillator and the finite cylindrical well. The numerical results illustrated in Figure I and Table I, which correspond to the simple harmonic oscillator potential, show the efficiency and simplicity of this method, while the numerical results illustrated in Figure II and Tables II, III and IV, which correspond to the finite cylindrical well potential, show that extra care should be taken when choosing the parameters used to obtain the eigenfunctions and the energy eigenvalues in this case. Finally, we can generalize this method to calculate more angular excited states by using diagonal zero axes.

References

[1] Antonio Soriano, Enrique Navarro, Jorge Porti, Vicente Such, "Analysis of the finite difference time domain technique to solve the Schrödinger equation for quantum devices", Journal of Applied Physics, 95 (12), (2004).
[2] Charles Reinke, Aliakbar Jafarpour, Babak Momeni, Mohammad Soltani, Sina Khorasani, Ali Adibi, Yong Xu, and Reginald Lee, "Nonlinear Finite-Difference Time-Domain Method for the Simulation of Anisotropic, χ(2), and χ(3) Optical Effects", Journal of Lightwave Technology, 24, no. 1, (2006).
[3] Dennis Sullivan, David Citrin, "Time-domain simulation of two electrons in a quantum dot", Journal of Applied Physics, 89, 3841, (2001).
[4] I Wayan Sudiarta, Lily Maysarr Angraini, "The Finite Difference Time Domain (FDTD) Method to Determine Energies and Wave Functions of Two-Electron Quantum Dot", AIP Conference Proceedings, vol. 2023, p. 020199, 2018.
[5] Thiago N Barbosa, Marcos M Almeida, Frederico V Prudente, "A quantum Monte Carlo study of confined quantum systems: application to harmonic oscillator and hydrogenic-like atoms", Journal of Physics B: Atomic, Molecular and Optical Physics, 48 (5), (2015).
[6] R. Grimm, R. G. Storer, "A new method for the numerical solution of the Schrödinger equation", Journal of Computational Physics, 4, 230-249, (1969).
[7] I Wayan Sudiarta, D. J. Wallace Geldart, "Solving the Schrödinger equation using the finite difference time domain method", Journal of Physics A: Mathematical and Theoretical, 1885-1896, (2007).
[8] Huwaida Elgweri, Mohamed Mansor, "First excited solutions of Schrödinger equation by the diffusion method applied to various one dimension problem", Journal of Academy for Basic and Applied Science, 14 (1), 1-4, (2015).
[9] Mohamed Mansor, Taher Sherif, Saleh Swedan, "Improved simple numerical method using the diffusion equation applied for central force bound quantum systems", Journal of Basic and Applied Science, 14, 72-81, (2004).
[10] George Arfken, Hans Weber, Mathematical Methods for Physicists, Academic Press (an imprint of Elsevier), Waltham, MA, USA.
[11] Huwaida Elgweri, Mohamed Mansor, "Calculation of positive spectrum for the higher excited states using Grimm and Storer diffusion method", The Libyan Journal of Science, 30, 33-42, (2017).
[12] Kailash Kumar, "On expanding the exponential", Journal of Mathematical Physics, 6, 1928-34, (1965).
[13] Ivan Sokolnikoff, Raymond Redheffer, Mathematics of Physics and Modern Engineering, McGraw-Hill, New York, (1966).
[14] Robert Eisberg, Robert Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles, John Wiley and Sons, New York, (1974).
Implementation Framework for a Blockchain-Based
Reputation and Trust System
Abstract— The web revolutionized how people engage with data. Consumers in e-commerce rely heavily on online reputation systems when selecting which items to purchase, and many e-commerce platforms now include built-in review systems for their consumers and sellers. Users may share their evaluations and even propose better items for online shopping on social media networks like Facebook. Despite these advancements, there are still significant flaws: numerous platforms that facilitate online interactions remain centralized and vulnerable to manipulation, which tends to result in a broken marketplace with ineffective verification, where vendors can easily manipulate consumers' perceptions to increase sales. Blockchain has been hailed as a game-changing technology that may bring an extra layer of trust and security to online interactions, as well as providing much-needed reputation in online interaction platforms. In this paper we offer a trustworthy, decentralized reputation model based on the Ethereum Blockchain, with the goal of restoring trust and integrity in the online interaction industry. We go over the implementation framework for such a system and present some preliminary results. Our findings indicate that a decentralized Blockchain-based reputation network is feasible, with impact factor evaluations for each node serving as the primary criterion for assessing the ecosystem's trustworthiness.

Keywords— Reputation, Blockchain, trust, peer-to-peer, consortium networks, smart contracts

I. INTRODUCTION

The internet and social media have been acknowledged as new frontiers for media convergence, a phenomenon increasingly characterized by how information flows and users migrate, linking content, communication, and computation in a complex setup. Users rely on the Internet to send and receive emails, search online for media files or news, and shop for products and services. As of this writing, the number of internet users has surged to 4.4 billion from 4.3 billion in 2018 [1],[2]. At such a scale, one might expect online interactions to occur between known entities; however, this is not the case. Online communication takes place in an environment where entities are anonymous, with no mechanisms for verifying interactions between them. Entities can easily create many accounts on each platform and engage in online interactions. On e-commerce and social media platforms, this has led to the emergence of a thriving economy based on fake reviews. Driven by profits, many merchants and vendors buy positive reviews for their businesses and negative reviews for their rivals in an attempt to influence how users perceive their businesses [3],[4]. In 2017, the U.K. Consumer Advocacy Group reported that sellers on Amazon were listing products and services that carried thousands of positive fake reviews [5]. This problem is prevalent because many of these platforms are centralized and prone to manipulation. As such, the online marketplace is broken and lacks a robust verification system, allowing vendors to easily manipulate reviews to influence consumers' perception of their products. The major contributions of this study are three-fold:

• Contextualize reputation and trust systems with respect to Blockchain technology.
• Propose a model consisting of the actors in the context of reputation systems, the network architecture, and the components.
• Propose an Ethereum-based platform that implements the reputation model.

The rest of the paper is organized as follows: Section II discusses the reputation model and provides an implementation framework, while Section III discusses the initial practical findings.

A. ONLINE INTERACTION MARKETPLACE IN CONTEXT

The online interaction marketplace is currently broken because many reputation systems in use are centralized. As such, there are no mechanisms for guaranteeing that the behavior of entities remains honest during the interaction process. In recent times, calls for a decentralized Internet have been growing. Although the modern Internet is built on top of decentralized protocols such as TCP/IP and HTTP, a large section of the application stack has remained centralized. The desire for more decentralization has largely emanated from the broken marketplace bedeviled by fraudulent activities. Blockchain can help enforce reputation and trust in online interactions, and such a platform can help predict the outcomes of online interactions. At present, there are many online businesses that have successfully implemented decentralized computing systems with disruptive consequences. The unveiling of Bitcoin in 2009 led to the emergence of decentralized alternative platforms like OpenBazaar and Silk Road. The success of these platforms has been appraised by their ability to create a trustless economy that does not require trusted third parties for transaction verification. We believe that Blockchain holds the potential to unlock the problems bedeviling reputation systems. Our model provides a framework for implementing an open and transparent reputation system based on the Ethereum Blockchain. With this system, any user intending to participate in an online interaction with another party can verify the true identities and obtain proof that the parties are who they claim to be.

B. Blockchain Technology

Blockchain was first introduced with the publication of a whitepaper in 2008 by Satoshi Nakamoto, as the break-through
technology for Bitcoin, the first peer-to-peer (P2P) cryptocurrency [6]. The technology paired cryptography, an already established and well-understood concept in computer science at the time, with a novel Merkle tree data structure to facilitate digital transactions. What made the technology grow fast is its ability to solve the decades-old double-spending problem: a scenario where the same money is copied and spent more than once. With a P2P model and the Merkle tree data structure, Blockchain does not require intermediaries, such as banks, to facilitate online transactions. Blockchain became the progenitor of cryptocurrencies, with Bitcoin becoming its first use case. Rather than having the traditional accounting systems of banks and other intermediaries validate online transactions, cryptocurrencies use a public digital ledger or register (the Blockchain) to confirm transactions. When users transact, the transaction is recorded on the Blockchain. With traditional accounting systems, security and validation of the register depend on banks, central banks, card issuers, and lately telecommunication firms for mobile payments. In Blockchain-backed transactions, however, the same is handled by decentralized nodes (also called miners), who compete to verify the transactions by solving a complex mathematical puzzle. In essence, what Blockchain managed to do is replace trusted centralized authorities with a decentralized and trustless system.

Inspired by Bitcoin, other cryptocurrencies (generally referred to as altcoins) emerged, with Ethereum becoming the dominant platform. Ethereum was unveiled in 2013, not just as a digital cash system but rather as a programmable platform with previously inconceivable capabilities. With Ethereum, smart contracts and other decentralized applications (DApps) could be executed in a "complete Turing World Computer." Ideally, smart contracts can be regarded as a special type of account that is recorded on the Blockchain and therefore not controlled by humans. In their most basic form, smart contracts can run all kinds of instructions, like maintaining states, checking conditions, and sending and receiving digital money. Of utmost significance is that a smart contract on the Blockchain cannot be changed, and/or even hacked. These attributes make a Blockchain such as Ethereum a perfect platform for enforcing reputation rules, because it is a permission-less Blockchain providing designated members with the ability to read and write on the ledger [7].

C. Consensus Mechanisms

In Blockchain, consensus algorithms form the basic rules of agreement on how the nodes in the network validate transactions [8]. For example, suppose Alice sends $10 worth of bitcoins to Bob. There has to be a mechanism in place that ensures that Alice's account balance reduces by $10 while Bob's account balance increases by the same amount. Such a mechanism has to be implemented in a manner that does not allow any malicious transactions or alterations to the Blockchain without the full consent of all the nodes participating in the network. Some of the most common consensus algorithms follow.

a) PoW (Proof-of-Work)

It is by far the most used consensus mechanism, and it was first applied in Bitcoin. In PoW, miners compete against each other by solving a complex mathematical puzzle to authenticate transactions and append new blocks to the chain, and in the process they are rewarded with coins. Whereas the computations leading to new blocks are difficult, they can easily be verified by the nodes in the Blockchain ecosystem. As such, when a miner obtains the solution for a new block, it broadcasts the generated block to the network for verification by the other miners [9]. All the other miners need to confirm that the solution is correct for the generated block to be confirmed. The mathematical puzzles to be solved in PoW include [10]:

• Inverting a hash function, which requires the miner to determine an input when the output is known.
• Integer factorization, which involves presenting a number as a product of two other integers (usually large primes).
• Checking to confirm whether a DoS attack has occurred by computing hash functions.

A node that successfully generates a block is rewarded with coins as a form of incentive. PoW helps to protect the Blockchain network against attacks, since an attack can only succeed with a great deal of computational power and time, which would be inefficient: it would cost more than the potential rewards.
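As a toy illustration of the hash-puzzle idea (our sketch, not a production miner; the difficulty and block payload are arbitrary), a PoW search simply scans nonces until the digest meets a target that is trivial for everyone else to re-check:

import hashlib

def mine(block_data: str, difficulty: int = 4) -> int:
    # find a nonce whose SHA-256 digest of (data + nonce) starts with `difficulty` zeros
    target = "0" * difficulty
    nonce = 0
    while not hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest().startswith(target):
        nonce += 1
    return nonce

nonce = mine("Alice pays Bob $10")
print(nonce)        # costly to find, but a single hash call verifies it

Raising the difficulty by one hexadecimal digit multiplies the expected search effort by sixteen, while verification stays a single hash; this asymmetry is what makes attacks more expensive than the potential rewards.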
b) PoS (Proof-of-Stake)

In PoS [11], consensus is achieved by requiring nodes to stake some of their coins or tokens in the process of authenticating blocks. Essentially, staking involves depositing some coins, which are then locked up in the Blockchain ecosystem. Such coins become collateral for vouching for the new block [12]. The more a particular node stakes in the ecosystem, the better its chances of being selected to validate the transactions [13]. PoS is specially conceived to resolve the Byzantine Fault Tolerance issue that is rampant with the PoW algorithm, since all the validators are known in the network and can easily be tracked on the Blockchain [14].

D. Types of Blockchains

Blockchains are broadly grouped into three types: public permissionless Blockchains, public permissioned Blockchains, and private permissioned Blockchains [15]. In a public permissionless Blockchain, there is no centralized entity that authorizes the transactions on the Blockchain. These Blockchains can be regarded as shared public ledgers where any node can view and modify the data, so long as the node is participating in the network. Ethereum and Bitcoin are some of the known examples in this category. In a public permissioned Blockchain, selected nodes are used to authenticate the transactions on the Blockchain; for example, authentication of the transactions can be assigned to a government entity, senior employees, or an institution. Lastly, a private permissioned Blockchain is a ledger ecosystem where data is not available for public view.

E. Smart contracts

Smart contracts are essentially special types of accounts that are recorded on the Blockchain and are therefore not controlled by humans [15]. In their most basic form, smart contracts can run all kinds of instructions, like maintaining
states, checking conditions, and sending and receiving digital money. Of utmost significance is that a smart contract on the Blockchain cannot be changed, and/or even hacked [16]. Ethereum is one example that was developed to help developers program smart contracts besides acting as a cryptocurrency. With Ethereum, smart contracts and other decentralized applications can be executed in a "complete Turing World Computer" [17]. A permission-less Ethereum Blockchain can help enforce reputation rules, since one can control who can read and write on the ledger regarding the data being managed [18], [19].

F. Blockchain and Reputation Systems

A reputation system that is open and transparent should be able to compute the trustworthiness of an entity in an online interaction process. A decentralized reputation system incorporating a smart contract provides standardized mechanisms for accessing aggregated reputation data, where more authentications reinforce the notion of reliability of anyone's reputation score. Meanwhile, the Blockchain's consensus mechanisms in the reputation platform can also help to safeguard against attacks known from centralized systems, such as Sybil attacks, whitewashing, denial-of-service attacks, and slandering [20]. Transacting parties can leverage reputation indicators to decide whom to interact with and whom to avoid. In e-commerce, merchants and vendors are thereby incentivized to sell credible products and services [21].

II. IMPLEMENTATION FRAMEWORK

A. REPUTATION MODEL

A model that captures the reputation of an entity in an online interaction should consist of both endorser and endorsee interacting as shown below:

Figure 1. Reputation Model for online Interaction.

The acquaintance process would work as follows:

• Entities A and B have personally known each other for a long time. They probably have worked together, or went to the same school, and are therefore acquaintances of each other.
• Entity A may have interacted many times with entity B. Perhaps this interaction was in online shopping, where A and B successfully transacted and, in the process, A established that B is a credible seller while B established that A is a credible buyer.

Based on the past experiences that A had with B, A is more likely to endorse B on the reputation system. If A endorses B, this means that A has built trust in B. In this case, the endorsement acts as a transaction message originating from one user account on the Blockchain and destined for another user account (the endorsee). In the network, every user manages two distinct lists of entities that they have so far interacted with: a list of endorsers and a list of endorsees. The lists hold the account addresses that identify each user on the platform. The reputation platform records all the endorsements between the endorsers and endorsees. The system then aggregates this information and computes a total trust score.

The total trust score represents the impact that a particular entity has on the reputation platform. Each entity has two assignment variables:

• Eg: total endorsements given by the entity; and
• Er: total endorsements received by the entity.

To be considered as a node in the network, each entity must have Eg and Er values equal to 1. The methodology used to model the reputation is described as follows:

• Re: the ratio of Eg to Er. It is an indicator of how far the total sent and received endorsements are from one another. Re must be less than or equal to 1 and is computed as follows:

Re = min(Eg, Er) / max(Eg, Er)        (1)

In a reputation system, the ratio between incoming and outgoing endorsement connections should be maintained. This ratio helps to build trustworthy behavior, where a high ratio value is an indicator that a node participating in the network is a high-impact, trusted node.

• CPTs: total consumable points for A. Every entity that joins the reputation platform receives an equal number of CPTs from the platform, and the value keeps depleting with each endorsement; it is computed as:

CPTs = 1 / Er        (2)

The reputation system measures the contribution of each node in the network, and the total consumable points help to provide an indication of this measure. For example, the total consumable points for a particular node can limit that node's ability to convince endorsees that it is a trustworthy node.

• RPTs: total received points for A. It is the aggregated sum of all the consumable points that are received by A from its endorsers. If E = {e1, e2, e3, ..., en} is the set of endorsers of A and the size of E is n, then RPTs is computed as follows:

RPTs = sum(CPTs)        (3)

• IF: Impact Factor. It indicates the reputation score for A and is computed as:

IF = Re * RPTs        (4)
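Equations (1)-(4) translate directly into code; the sketch below uses made-up endorsement counts purely to show the arithmetic:

def impact_factor(Eg, Er, received_cpts):
    # Eg/Er: endorsements given/received; received_cpts: the CPTs spent by each endorser
    Re = min(Eg, Er) / max(Eg, Er)      # equation (1), always <= 1
    RPTs = sum(received_cpts)           # equation (3)
    return Re * RPTs                    # equation (4)

endorser_Er = [2, 4, 5]                 # hypothetical Er values of node A's three endorsers
cpts = [1 / er for er in endorser_Er]   # each endorser contributes CPTs = 1/Er, equation (2)
print(impact_factor(Eg=3, Er=3, received_cpts=cpts))   # 1.0 * 0.95 = 0.95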
The Ethereum Blockchain is an ideal implementation platform because it is public and permissionless; as such, an endorser (any entity in an online interaction marketplace) can endorse an endorsee on a public and permissionless platform. In 2013, Vitalik Buterin, a Russian-Canadian programmer, proposed Ethereum as an evolving platform to rival the existing Bitcoin platform. At the time, the Bitcoin protocol could only validate the ownership and transfer of coins. The Ethereum protocol was revolutionary in the sense that, instead of being applied only to payments, it allows accounts to execute smart contracts so long as they have enough Ether (ETH) in their wallets. On the Ethereum platform, ETH acts as the native cryptocurrency that powers Ethereum-based applications.

When a smart contract executes on the Blockchain, all the nodes participating in Ethereum (the miners) execute the same code and need to agree through a consensus mechanism [22]. At the time, Ethereum was based on a consensus mechanism called Proof-of-Work (PoW). In this mechanism, nodes compete to solve a complex mathematical and cryptographic puzzle related to the Byzantine Generals' problem [23].
PoW served two major purposes: verification of the legitimacy of transactions, to avoid the double-spending problem that previously existed with digital currencies; and the creation of new coins, whereby miners who successfully perform the computations are rewarded with ETH. However, this approach consumed so many hardware resources that Ethereum has now migrated to a new consensus algorithm called Proof-of-Stake (PoS). In the PoS algorithm, validators (instead of miners) lock up some ETH that acts as a stake in the Ethereum ecosystem [24]. Validators that bet their ETH on blocks are rewarded with coins proportional to their stake in the ecosystem [25].

With its smart-contract potential, Ethereum has now become a massive, decentralized platform, often called a "complete Turing machine" or the Ethereum Virtual Machine (EVM). Ethereum is the dominant platform that supports both public and private management of transactions; validators are required to deposit some currency (Ether) to validate transactions in the arrangement known as Proof-of-Stake (PoS) [26]. The EVM is Ethereum's interpreter environment, which converts smart contracts from high-level language statements into machine language; it acts as an interpreter for Ethereum's assembly language. As is the case with programming in assembly languages, writing such code may be challenging for new developers. Therefore, the Ethereum Foundation proposed Solidity, a high-level language, as the basis for coding smart contracts. Essentially, an Ethereum ecosystem has a non-exhaustive set of elements comprising cryptographic tokens, node addresses, consensus algorithms (PoS), validators, the Blockchain/ledger, the EVM, and scripting languages [28], [29].

B. Reputation Network Architecture

The diagram below summarizes the architecture of the reputation system:

Figure 2. Architecture of Reputation System

The reputation system as presented in this figure has the following components:

• DApps;
• Smart contracts; and
• the Ethereum Blockchain.

The participants in this reputation system are the endorser and the endorsee. For example, if Alice has interacted with Bob many times and wishes to recommend him based on those interactions, Alice becomes the endorser while Bob is the endorsee. The endorser uses a DApp (a client web application) running in a browser to endorse the endorsee in the system. Since these recommendations must be stored on the Blockchain, a smart contract is required. Three elements must be generated for the system to work seamlessly: the Application Binary Interface, which is a compiled bytecode representation of the reputation system; the Ethereum client nodes, which manage the nodes joining the network; and file storage, which manages the storage of files on the Blockchain. Finally, the system requires a Blockchain, which in this case is the Ethereum Blockchain.

C. Smart contracts

Smart contracts allow the reputation platform to record all the endorsements between the endorsers and the endorsees, then aggregate the information and compute a total trust score. The total trust score represents the impact factor that a particular entity has on the reputation platform. When smart contracts compile successfully, they generate an ABI (Application Binary Interface), which is a binary representation of the compiled EVM code. The contract is then deployed to the Ethereum network, resulting in it obtaining an address and having its bytecode recorded. The smart contracts are then invoked using Web3.js, a JavaScript API that allows DApps to interact with remote or local Ethereum nodes [30]. The main contract in this system is the Endorse contract, which defines the logic necessary for any endorsement in the Ethereum network. The Endorse contract can be simplified using the flow chart diagram below.

The DApp facilitates the interaction between the users (on their browsers) and the Ethereum Blockchain in the reputation system [19]. Any node joining the reputation platform submits its application via a DApp, which retrieves the public keys from the key store. The keys are used to sign the data, which is then transmitted securely. The DApp then runs the specified smart contract corresponding to the data being transmitted. If the execution is successful, validators pick up the transactions and broadcast them to the entire network.
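For illustration, the same invocation can be written with web3.py, the Python counterpart of the Web3.js API mentioned above; the node URL, contract address, ABI, account addresses and the endorse() signature are all placeholders for whatever the deployed Endorse contract actually exposes, not the paper's code:

from web3 import Web3

NODE_URL = "http://127.0.0.1:8545"                                  # e.g. a local Ganache node
CONTRACT_ADDRESS = "0x0000000000000000000000000000000000000000"    # placeholder
CONTRACT_ABI = []                                                   # paste the compiled ABI here
endorser_address = "0x0000000000000000000000000000000000000001"    # placeholder accounts
endorsee_address = "0x0000000000000000000000000000000000000002"

w3 = Web3(Web3.HTTPProvider(NODE_URL))
endorse_contract = w3.eth.contract(address=CONTRACT_ADDRESS, abi=CONTRACT_ABI)

# an endorsement is just a transaction from the endorser's account to the contract
tx_hash = endorse_contract.functions.endorse(endorsee_address).transact(
    {"from": endorser_address})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print(receipt.status)        # 1 once the validators confirm the block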
III. FINDINGS AND DISCUSSION

This section presents the main findings regarding the performance of the reputation system. Specifically, we discuss the system's ability to provide security and privacy, its performance, and the fulfilment of its requirements.

A. Security and privacy

Reputation systems can form the basis for solutions in the broken online interaction marketplaces. However, as other studies have shown, these systems are vulnerable to different attacks, including Sybil attacks, whitewashing attacks, free-rider attacks, and denial-of-service attacks [8]. We analyzed our system based on how endorsers' messages get stored on the Blockchain and on how the reputation system computes the impact factor. Whereas our system does not address this issue directly, the available data regarding an entity can be used to determine malicious nodes in the Blockchain ecosystem.

Suppose the reputation system has four nodes (y = {a, b, c, d}). Here, the reputation platform maintains two sets of information for each entity (one consisting of a list of endorsers, the other consisting of a list of endorsees). If m is the list of endorsers and n is the list of endorsees for an entity y, then the intersection of m and n offers clues about the common entities in the sets. If the intersection set is the same for both the endorsers' list and the endorsees' list, then this is a likely indicator that these actors are malicious and want to interfere with the computation of the impact factor.

Determining the actors that could be colluding in a system of a certain size is an NP-complete problem that can only be solved using heuristic approaches. Such a problem is likely to become more complex as more nodes join the network [31]. Essentially, the more actors join the reputation system, the more the size of the network expands, and so does the computational complexity. Since the platform is implemented on Ethereum, the high cost of gas for each transaction also disincentivizes participants from acting maliciously on the network [15].

B. Fulfillment of requirements

A front-end web app prototype was developed to help test the various contract functions on the Blockchain. Manual testing was conducted using the Truffle IDE and Ganache on the local network. A smart contract should be implemented in the form of conditions (checks to ensure all the necessary pre-conditions, such as the caller of the function), actions (in the form of events and functions) and interactions, if such code is to eliminate reentrancy errors [32],[33]. Our system used the same approach when writing the smart contracts, as shown by the snippet function below:

Algorithm 1 Verification Function
1. function joinNetwork()
2. if the user is registered on the network
3. {
4.   Record the sender's name and id
5.   Add the new entity to the existing members and update the members list
6. }
7. else
8. {Do not join the network}
9. Return

As shown in the code, the joinNetwork() function first checks whether a node has been registered on the network or not. If the user has not registered, then the function halts; it progresses with the computation only if the user has been registered. In that case, the function proceeds with the actions (changing the status of the registered nodes and incrementing the count of registered nodes).

To ascertain the fulfilment of the requirements, a simulation was performed with the help of an interaction graph that offered clues regarding the performance of the network. The code below illustrates the various graphical interactions between nodes:

Algorithm 2 Graphical Interaction between nodes
1. function getProfile()
2. {
3.   For each node:
4.     Compute outDegree
5.     Compute used_Power
6.     Compute outConns
7.     Compute inDegree
8.     Compute receivedPoints
9.   Return: (outDegree, used_Power, outConns, inDegree, receivedPoints, inConns)
10. }

The reputation model was then applied to the nodes in the interaction graph, and their impact factors were calculated based on incoming and outgoing connections (Eg and Er). Each node on the Ethereum network rated the others on a scale of -2 (representing total distrust) to +2 (representing total trust). To provide more relevant findings, the interaction graph was modelled to incorporate only edges that had a rating of +2, with no negative edges in the simulation model. Out of 5000 nodes, 240 edges were marked as positive edges. The information available for each node included source, rating, target, and timestamp, which formed the basis for the endorsement system. The direction of endorsement was based on the source and target datasets, while the timestamp data gave hints regarding the order of transactions. The graph below summarizes the distributions of both incoming and outgoing graph connections:

Figure 4: Distribution of incoming and outgoing connections
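A Python restatement of the profiling in Algorithm 2, combined with the impact factor of equation (4), looks as follows (the four-edge endorsement list is made-up data in which every node already has Eg, Er >= 1):

edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]   # (endorser, endorsee) pairs

def profile(node):
    Eg = sum(1 for s, _ in edges if s == node)             # outDegree: endorsements given
    Er = sum(1 for _, t in edges if t == node)             # inDegree: endorsements received
    rpts = sum(
        1 / sum(1 for _, r in edges if r == s)             # endorser s spends CPTs = 1/Er(s),
        for s, t in edges if t == node                     # cf. equation (2)
    )
    impact = (min(Eg, Er) / max(Eg, Er)) * rpts            # equations (1) and (4)
    return {"outDegree": Eg, "inDegree": Er, "receivedPoints": rpts, "IF": impact}

for n in ("a", "b", "c"):
    print(n, profile(n))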
The simulation model provided the basis for collecting datasets about the endorsement system and analyzing them in a manner similar to a net flow diagram [17]. This information was useful in detecting anomalies within the endorsement system. As evident in Figure 4, there were no anomalies with the model, which meant the model could be used to compute the impact factor for each node.

The Impact Factor (IF) parameter was computed based on Eg and Er. Out of 5000 nodes, 3800 nodes (76%) had an IF of 0. On further examination of these nodes, we found only one incoming or outgoing link (both Eg and Er had values of 1). This makes sense according to our model, because a node can only make an impact on the ecosystem if it has more than one connection. In our case, an IF of 0 is expected, because it represents an initial node and does not in any way suggest that the node is untrustworthy in the system. Based on these findings, it is appropriate to conclude that 76% of the nodes in our network were new users. The remaining 1,200 nodes had considerable IF scores.

There were no nodes with an IF of more than 1 that had an accumulated RPTs of 0. If this were the case, then their IF would still be 0, because RPTs is a significant contributor on the reputation platform. Our model shows that it is possible for some nodes to have the maximum possible ratio (which in this case is 1) but still have a low IF. This is true because, as mentioned earlier, Eg and Er alone do not determine the overall IF on the network. The table below provides some results for selected nodes that had a low IF:

Table 1: Selected nodes with low IF

Label   Eg   Er   Ratio   RPTs   IF
3       2    2    1       0      0
15      5    3    0.6     1      0.2
41      4    3    0.75    2      0.3

The results in the above table clearly demonstrate that the ratio between Eg and Er is not the key contributor to the IF in the network. Consequently, it is hard to predict whether a particular entity in the network will have a high IF merely from the ratio.

IV. CONCLUSION

Blockchain adds an extra layer of trust and security to online interactions and can provide much-needed reputation in online interaction platforms. In this paper, we have presented a trusted, decentralized reputation model based on the Ethereum Blockchain. Our platform can help enforce reputation and trust in online interactions. The IF score computed for an entity in the reputation model is a mark of its trustworthiness in the ecosystem. Using this score, any user intending to participate in an online interaction with another party can verify the true identities and obtain proof that the parties are who they claim to be.

While our model is one of the first attempts at leveraging Blockchain to infer the trustworthiness of nodes based on peer-to-peer interactions and the computation of IF, it does not in any way challenge existing cryptographic-based reputation systems. Limited time and resources did not allow us to explore the existing cryptographic-based reputation systems that take all the factors into consideration while computing the impact factor. Further research should be conducted to provide an in-depth overview of current issues in reputation, including but not limited to the use of EigenTrust systems in reputation systems, the application of anomaly detection algorithms to thwart malicious behavior in reputation systems, and extending the capabilities of reputation platforms to deal with user accounts and emails.

REFERENCES

[1] 'Digital 2019: Global Digital Overview', DataReportal – Global Digital Insights. [Online]. Available: https://datareportal.com/reports/digital-2019-global-digital-overview. [Accessed: 06-Oct-2019].
[2] A. Gonzales, 'The contemporary US digital divide: from initial access to technology maintenance', Inf. Commun. Soc., vol. 19, no. 2, pp. 234-248, 2016.
[3] Setti, Sunil, and Anjar Wanto, "Analysis of Backpropagation Algorithm in Predicting the Most Number of Internet Users in the World", Jurnal Online Informatika, 3.2 (2019): 110-115.
[4] J. K. Rout, S. Singh, S. K. Jena, and S. Bakshi, 'Deceptive review detection using labeled and unlabeled data', Multimed. Tools Appl., vol. 76, no. 3, 3819.
[5] W. Liu, J. He, S. Han, and N. Zhu, 'A Method for the Detection of Fake Reviews based on Temporal Features of Reviews and Comments', in 3rd International Conference on Mechatronics Engineering and Information Technology (ICMEIT 2019).
[6] Nakamoto, S., Bitcoin: A peer-to-peer electronic cash system. Decentralized Business Review, 2008 Oct 31:21260.
[7] A. Bogner, M. Chanson, and A. Meeuw, 'A decentralised sharing app running a smart contract on the Ethereum blockchain', in Proceedings of the 6th International Conference on the Internet of Things, 2016, pp. 177–178.
[8] Shanaev, S., Shuraeva, A., Vasenin, M., Kuznetsov, M., Cryptocurrency value and 51% attacks: evidence from event studies. The Journal of Alternative Investments, 2019 Dec 31;22(3):65-77.
[9] Hernan, S. V., inventor; Microsoft Technology Licensing LLC, assignee. Authentication using proof of work and possession. United States patent application US 14/486,864. 2016 Mar 17.
[10] Bentov, I., Gabizon, A., Mizrahi, A., Cryptocurrencies without proof of work. In International Conference on Financial Cryptography and Data Security, 2016 Feb 22 (pp. 142-157). Springer, Berlin, Heidelberg.
[11] Nguyen, C. T., Hoang, D. T., Nguyen, D. N., Niyato, D., Nguyen, H. T., Dutkiewicz, E., Proof-of-stake consensus mechanisms for future blockchain networks: fundamentals, applications, and opportunities. IEEE Access, 2019 Jun 26;7:85727-45.
[12] Tosh, Deepak, et al., "CloudPoS: A proof-of-stake consensus design for blockchain integrated cloud", 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), IEEE, 2018.
[13] Watanabe, H., Fujimura, S., Nakadaira, A., Miyazaki, Y., Akutsu, A., & Kishigami, J. (2016, January). Blockchain contract: Securing a blockchain applied to smart contracts. In 2016 IEEE International Conference on Consumer Electronics (ICCE) (pp. 467-468). IEEE.
[14] Nugent, Timothy, David Upton, and Mihai Cimpoesu, "Improving data transparency in clinical trials using blockchain smart contracts", F1000Research 5 (2016).
[15] Cong, Lin William, and Zhiguo He, "Blockchain disruption and smart contracts", The Review of Financial Studies 32.5 (2019): 1754-1797.
[16] Dennis, R., Owen, G., Rep on the block: A next generation reputation system based on the blockchain. In 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), 2015 Dec 14 (pp. 131-138). IEEE.
[17] Fichtner, Wolfgang, Qiuting Huang, Bernd Witzigmann, Hubert Kaeslin, Norbert Felber, and Dölf Aemmer, "IIS Research Review 2004", IIS Research Review (2004).
[18] Y. Zhang, C. Xu, J. Ni, H. Li, and X. S. Shen, 'Blockchain-assisted public-key encryption with keyword search against keyword guessing attacks for cloud storage', IEEE Trans. Cloud Comput., 2019.
[19] G. Liang, S. R. Weller, F. Luo, J. Zhao, and Z. Y. Dong, 'Distributed blockchain-based data protection framework for modern power systems against cyber attacks', IEEE Trans. Smart Grid, vol. 10, no. 3, pp. 3162–3173.
[20] S. Eskandari, S. Moosavi, and J. Clark, 'SoK: Transparent Dishonesty: front-running attacks on Blockchain', 2019.
[21] Cachin, C., Byzantine faults. In Concurrency: The Works of Leslie Lamport, 2019 Oct 4 (pp. 67-81).
[22] Gramoli, Vincent, "From blockchain consensus back to Byzantine consensus", Future Generation Computer Systems 107 (2020): 760-769.
[23] L. Luu, V. Narayanan, K. Baweja, C. Zheng, S. Gilbert, and P. Saxena, 'SCP: A computationally-scalable Byzantine consensus protocol for blockchains', 2015.
[24] S. Tamang, Decentralized Reputation Model and Trust Framework: Blockchain and Smart Contracts, 2018.
[25] Fernández Anta, A., Rimba, P., Abeliuk, A., Cebrian, M., Stavrakakis, I., Tran, A. B., & Ojo, O. (2019). Miner Dynamics on the Ethereum Blockchain.
[26] King, Sunny, and Scott Nadal, "PPCoin: Peer-to-peer crypto-currency with proof-of-stake", self-published paper, August 19 (2012).
[27] W. Wang, 'A Vision for Trust, Security and Privacy of Blockchain', in International Conference on Smart Blockchain, 2018, pp. 93–98.
[28] M. Vukolić, 'The quest for scalable blockchain fabric: Proof-of-work vs. BFT replication', in International Workshop on Open Problems in Network Security, 2015, pp. 112–125.
[29] C. Kaligotla and C. M. Macal, 'A generalized agent based framework for modeling a blockchain system', in Proceedings of the 2018 Winter Simulation Conference, 2018, pp. 1001–1012.
[30] Berger, T. P., Gueye, C. T., Klamti, J. B., A NP-complete problem in coding theory with application to code based cryptography. In International Conference on Codes, Cryptology, and Information Security, 2017 Apr 10 (pp. 230-237). Springer, Cham.
[31] M. Prates, P. H. Avelar, H. Lemos, L. C. Lamb, and M. Y. Vardi, 'Learning to Solve NP-Complete Problems: A Graph Neural Network for Decision TSP', in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, vol. 33, pp. 4731–4738.
[32] Y. Hou, X. Zhao, Q. Li, J. Chen, Y. Li, and Z. Zheng, 'Solving large-scale NP-Complete problem with an optical solver driven by a dual-comb "clock"', in CLEO: Science and Innovations, 2019, pp. SF1M–2.
[33] Sultan, K., Ruhi, U., Lakhani, R., Conceptualizing blockchains: characteristics & applications. arXiv preprint arXiv:1806.03693, 2018 Jun 10.
[34] DelGaudio, C. I., Hicks, S. D., Houston, W. M., Kurtz, R. S., Hanrahan, V. A., Martin Jr, J. A., Mummey, D. P., Murray, D. G., Prince, J. E., Pritsky, R. R., Rauch, D. C., inventors. Method and system for network connectivity migration management. United States patent US 9,928,480. 2018 Mar 27.
[35] Efanov, Dmitry, and Pavel Roschin, "The all-pervasiveness of the blockchain technology", Procedia Computer Science 123 (2018): 116-121.
An Observation on Residential Complexes as a New Housing Typology in Post-Socialist Tirana
Abstract— The shift from the socialist to the post-socialist political system caused an urban expansion of Tirana, which brought about the growth of its population. This post-socialist urban context, associated with a high demand for dwellings, the limited construction sites in the inner city and the increasing land prices, pushed developers to construct residential complexes as a new form of housing in Tirana. This research aims to reveal the features of residential complexes as a new housing typology that emerged during the post-socialist period in Tirana by analyzing typological housing features and the common outdoor spaces. To investigate properly, three case studies are selected: the "Halili", "Homeplan" and "Kika" residential complexes. The methodology used in the research includes visual documentation, archival research for the provision of the drawings, and an analysis of the typological features and outdoor spaces of the residential complexes. The study revealed that residential complexes in Tirana are characterized by dense high-rise apartment blocks or towers, which contain many apartments. The spatial features are characterized by a low number of cores, which in the majority of cases are not lit and provide access to a large number of apartments. Apart from the housing functions, commercial functions are found on the ground floors, whereas in some cases business functions are located on upper floors as well. The common outdoor spaces are characterized by limited green areas and are considerably usurped by the commercial units of the ground floors.

Keywords—residential complexes, housing typology, post-socialist period, Tirana

I. Introduction on Multifamily Housing in Post-socialist Tirana

Multifamily housing development is one of the major dwelling forms of the post-socialist period in Tirana. The post-socialist housing developments in Tirana are characterized by informal housing settlements at the city periphery and high-rise dwellings in the inner city [1]. Although after the 1990s, due to technological limitations, the first multifamily housing took the form of middle-rise apartment blocks, by the 2000s the dominant majority had evolved into high-rise apartment blocks or residential complexes.

While the site needed for an apartment block was smaller, in the case of residential complexes there was a need for larger sites. For this reason residential complexes were easier to develop in the peri-urban zones of the city (outside the middle ring), but those zones lacked the needed infrastructure, making them unattractive for clients. Thus, it was more attractive to develop residential complexes in the inner-city zones, which in turn lacked such large sites for development. Taking into consideration the complex issue of land ownership in Albania and the weak urban planning administration, the targets of developers for residential complex sites became the public green and sportive spaces (the case of the "Fusha e zeze" field or the "Partizani" club training grounds) or socialist-period enterprises which had gone bankrupt or stopped production [2] (the case of the ex-mechanical factory "Enver Hoxha").

The number of residential complexes developed in peri-urban areas after 2010, consistent with the expansion of the city, has increased. The characteristic feature of residential complexes is the high-rise apartment blocks (in many cases tower-shaped) that aim to maximize the dwelling area. They are arranged at the perimeter of rectangular sites, and in appearance the spaces in between are dedicated to outdoor social activities. However, in most cases the ground floors are designed for coffee shops, which easily usurp the public spaces [3].

II. Case Study on Three Residential Complexes in Post-Socialist Tirana

This research focuses on the analysis of three case studies of residential complexes in Tirana. The analysis includes a description of the site plans, the housing typological features and the common outdoor spaces of the residential complexes. The analysis is supported with visual material for each of them, including photos, plan drawings and technical materials. Firstly, an introduction to the site plan and urban context of each residential complex, including its position relative to the city center, is given. The housing typological features include architectural composition, spatial features, function distribution and exterior solutions. The common outdoor spaces explain the shared spaces within the sites of the residential complexes, their organization, and the hardscape and green space ratios. Furthermore, the analysis includes observations on the usage of these spaces, the relation between the dwellers and the commercial units located on the ground level, and parking accesses or passages.

A. "Halili" Residential Complex

The "Halili" complex is conceptualized as a residential development and business center. It is located in the inner city of Tirana, on Dibra street, within an old neighborhood approximately 820 meters from the city center. Designed by Vladimir Bregu, it was completed in 2004. Although initially it was planned to cover a larger zone (approximately 13,900 m2), its site is approximately 7,200 m2 [4]. The site plot has a triangular-like shape, excluding the "Partizani" high school, which borders the complex at its rear. The design approach was to create a built perimeter on the main streets while enclosing the inner space, giving it private attributes.

Based on that, the perimeter of the land plot is contoured by high-rise apartment blocks that vary in height from a minimum of 6 floors up to 11-floor residential towers. The common outdoor spaces are located at the back of the built perimeter, providing the necessary intimacy and privacy.
Figure 1. Location of "Halili" residential complex (left), image (middle) and its common outdoor space (right).
Figure 2. Site plan of "Halili" complex with realized parts in green (left), and normal floor plan schemes of the perpendicular (middle) and parallel (right) housing blocks.
The commercial units located on the ground floor that look onto the inner common outdoor space are mostly unoccupied, and the small number that do have a tenant consist of small mini-markets or dry-cleaning businesses. As for the usage of the public space, it is generally a quiet place, although it is not vehicle-free. The people using the space are generally elderly people, or young parents carrying their infants in strollers and spending a bit of quiet time outdoors. The space is populated during the late afternoon hours, when the sun is setting, by children playing unattended by their parents, while most of the residents tend to sit at the bars located nearby.

B. "Homeplan" Residential Complex

"Homeplan" is designed as a residential complex composed of residential and commercial units. It is located in the western part of Tirana, outside the middle ring. It was designed by architect Irina Branko and built by a local company called Kontakt. The site plan scheme of the "Homeplan" residential complex consists of building volumes placed at the perimeter of a quasi-rectangular site with an inner courtyard in the center. It is approximately 2.4 km from the city center and is positioned on a 6,751 m2 plot. Construction started in 2010 and ended in 2012. In verticality it reaches a maximum of 8 floors, with one underground parking floor. Apart from housing, the other facilities present within the "Homeplan" complex consist of service units such as bars, restaurants, or supermarkets, which are placed on the ground floor [5].

Figure 3. "Homeplan" residential complex location (left) and its image (right)

B.1. Housing Typological Features

There are seven apartment blocks in this residential complex. The built volumes rise as extruded volumes, while some of them do not continue to the last floor but reach a maximum of 5 floors. In this way the roof is converted into an open veranda for the users of the 6th floor, adding value to the apartments above as well as relieving the massive impact of one whole volume. Under the lower volumes there are passageways at the ground level of the building. The width of the buildings varies from 14.8 meters to 20 meters. On the outer perimeter the balconies and loggias are cantilevered, while on the inner perimeter they stay within the built perimeter of the building.

The spatial organization of the apartment blocks is planned around cores placed at the center of the respective block. The cores consist of a staircase and two elevators. The number of apartments accessed from the same core varies from five to seven. The cores and the corridors are not lit in the majority of cases. As for the commercial units and their location, the ground floor, being the space with the easiest direct access from the main road, is given over to their function. The residents' entrances are positioned in such a way as not to interfere with or occupy a valuable front-line commercial unit; they are placed in the inner part of the passageways. All the entrances are numbered to ease the directionality and orientation of the dwellers.

The exterior of this residential complex, thanks to the innovative materials and the preciseness of the realization technique, can be clearly distinguished among the vicinity buildings. The façades of this residential complex are coated with raw stucco finishing. The finishing colors are soft: a pinkish pastel, light grey and white.

B.2. Common Outdoor Spaces

The common outdoor space of the "Homeplan" residential complex is a central square plaza arranged in the inner courtyard. Although this space is semi-closed and introverted, it is usable and accessible also for dwellers from outside the complex. The public space has a coverage area of approximately 1,200 square meters, where urban furniture, greenery and water features are integrated.

The materials used in the inner courtyard consist of grey stamped concrete with a stone-like pattern. Minimal green areas are designed in rectangular volumes raised 50 cm from the ground level, which are offset from the inner building perimeter. The inner part of these volumes is designed as a hilly-like terrain, planted with grass, bushes and high trees. Urban furniture is integrated along the perimeter of the volumes; it is partly designed with wooden elements, which create a warmer and more inviting element than concrete.

The services located on the ground floor are bars and cafeterias, which usurp the common outdoor space for their private purposes. This results in a limited area for users who do not want to use the nearby cafeterias. Based on the observations we conducted on site, the large number of coffee
Based on the observations we conducted on site, the large number of coffee shops pushes the young parents to sit there, to keep their children under their supervision.
Figure 4. "Homeplan" site plan (left), normal floor plan scheme (middle) and image of common outdoor spaces (right)
Figure 5. Image of “Kika” residential complex (left) and common outdoor spaces (right)
The number of apartments accessed from one core varies from five to eleven. In some cases, a certain apartment block is not designed with a core, and they are connected via bridges to the core of the other block. Apart from the ground floor being allocated for commercial activities, in the angular part, in difference from the previous block, the first floor is totally dedicated to commercial activities, shifting the apartment units to the second floor up to the ninth.

The façade of this residential complex is covered in grey plaster, giving the blocks a neutral look. Meanwhile the outer and inner façades are fragmented, resulting in a much lighter volume and a game of solids and voids. The usage and interplay of vertical windows on the perimeter of the block serve to break the horizontality of the built environment. The inner façade differs from the outer one, being more porous and offering more transparency and light to the inner environment. The inner façade consists of continuous loggias that are divided by walls between two apartments, providing the necessary privacy between the neighbors.

C.2. Common Outdoor Spaces

The common outdoor spaces of the residential complex are a result of the arrangement of the building blocks on the perimeter of the plots, resulting in an introverted courtyard. The linkage between these inner spaces and the surrounding neighborhood is done through several punctuations in the built volumes, in the form of passages that also serve as controlled entrances to these spaces. Each of these public spaces has two or three of these punctuations that either connect them with the neighborhood or the sites with each other. The built volumes reach up to a maximum of 9 floors; despite that, the distance among them provides the opportunity for the inner environment to receive plenty of natural daylight throughout the day. The materials used in the inner spaces consist of grey stamped concrete with a stone-like pattern [6].

The green spaces are abundant and consist of two orthogonal islands in the inner court and another smaller one at the eastern edge of the site. They are designed as volumes raised from the ground level by approximately 50 cm. The topography of the green spaces is designed as a hilly-like terrain and is planted with grass, bushes, and high trees. The perimeter of the green spaces is designed as urban sitting furniture consisting of concrete bordure and wooden elements, providing a warmer environment.

III. Concluding Remarks

To conclude, it can be said that the residential complexes are a new form of dense mass housing that has emerged in considerable quantity during the post-socialist period in Tirana. The increasing number of this housing form is related to the capitalist economic reality, based on which the developers aim to maximize the housing construction profit while the purchasing power of the citizens is not high. Due to this context, the residential complexes that were the subject of this study are in the majority featured by dense and high-rise, tower-like apartment blocks.

Regarding their spatial organization, their cores consist of stairs which are unlit and provide access to a large quantity (five to twelve) of apartments. In some cases, long linear corridors are also observed. The exterior of this housing typology reflects the quality of contemporary construction technology and in many cases has played an avant-gardist role, using curvilinear shapes and curtain wall façades.

The common outdoor spaces are generally designed in the form of inner courts. Due to the commercial activities on the ground floors of the housing blocks and the lack of the respective legal framework needed to regulate the usage of the common outdoor spaces, they are predominantly subject to usurpation by the owners of these activities. The green spaces are present in the inner courtyards, although they are in the minority compared to the overall space.
Figure 6. Kika residential complex in green within Le Serre master plan (left), "Kika" 3rd division site plan (middle) and normal floor plan (right)
References
[1] Manahasa Edmond, Manahasa Odeta, "Defining urban identity in a post-socialist turbulent context: The role of housing typologies and urban layers in Tirana", Habitat International, Volume 102, 102202, 2020.
[2] Manahasa, Edmond, Ozsoy, Ahsen, "Place attachment to a larger through a smaller scale: Attachment to city through housing typologies in Tirana", Journal of Housing and the Built Environment, 35(1), 265–286, 2020.
[3] Manahasa, Edmond, "Place attachment as a tool in examining place identity: A multilayered evaluation through housing in Tirana", PhD dissertation, Istanbul Technical University, 2017.
[4] Halili construction archive, July 2018.
[5] Kontakt archive, July 2018.
[6] Kika Construction archive, July 2018.
[7] http://www.atenastudio.it/sitoweb/project.php
Active Control Of In-Wheel Motor Electric Vehicle
Suspension Using The Half Car Model
Abstract— In recent years, electric vehicles are becoming more mainstream. Even though in-wheel electric motor vehicles have been developed as prototype vehicles, they have not become common yet. One of the reasons for this is that when an electric motor is placed in the wheel, the wheel becomes heavier, which tends to worsen the road holding properties of the vehicle. Active suspensions are currently used in some high-end vehicles to improve passenger comfort, vehicle handling and road holding. In this paper an active suspension is used to show that it is possible to mitigate the road holding problems caused by in-wheel electric motors. The ground-hook method is used as the control strategy. First, a quarter car model is used to show that adding weight to the wheel does actually decrease the road holding performance of the vehicle. Afterwards, the active suspension is shown to be able to improve the road holding properties of the vehicle to acceptable levels. The simulations are also repeated with a half car model. They also show that increasing the tire mass worsens the road holding performance of the vehicle. However, this bad performance can be reversed by the ground-hook active suspension. Simulations also show that the ground-hook controller worsens the passenger comfort level in the vehicle.

Keywords—active suspension, road holding, ground-hook, electric vehicle, in-wheel motor

I. Introduction

The suspension system is one of the most important parts of an automobile; it isolates the vehicle from road shocks and vibrations and provides comfort to the occupants [1][2]. Automotive suspensions are divided into three forms, namely passive, semi-active, and active suspension systems. Passive suspensions have always been used and continuous improvements have been made by research. It is impossible to meet both ride comfort and road holding demands at the same time with a passive suspension. Passive suspension systems are the most common and are widely used because of their low cost and high reliability. This type of system is considered an open loop system [2]. A passive suspension consists of a conventional spring (the spring is compressed and stretched to absorb the wheel movement) and a damper, which is a shock absorber that works on the vibration motion of the vehicle. The main aim of using a damper is to slow down and minimize the vibration magnitude caused by the road. The damper is connected in parallel with the spring; both are fixed and impossible to change externally by any signal [3][4]. So, one would need a spring which can be stiff and soft simultaneously [3][4][5]. Researchers have made a lot of improvements over the years, and most experts think that passive suspensions are hard to improve further. Ground-hook control is one of the control strategies applied to the automotive suspension. A ground-hook controller is used to improve the road holding for both quarter and half car models. This control method supposes that a damper is virtually connected between the unsprung mass and the ground [6]. The ground-hook's main task is to reduce the vertical displacement of the tire and keep the ground-tire contact force in as narrow a range as possible around its mean value [7]. Ground-hook shows improvement when compared with the passive suspension model. The use of the half car model to show the effects of the ground-hook controller is the main contribution of this paper. The goal is to design a control strategy, namely a ground-hook controller, to improve road holding for the vehicle.

II. Quarter Car Model

A. Quarter Car Model Passive Suspension System

The quarter car model is the most popular model used in the analysis and design of automotive suspensions. The main reason to use this model is that it is simple, it can give reasonably accurate information and it predicts a lot of important properties of the full car. Figure 1 shows a passive suspension system for a quarter car in which the wheel is connected to the body of the car by passive parameters (spring and damper), and the tire is represented as a spring; the damping of the tire is neglected. The quarter car structural model involves the mass of the car ($m_s$) and the mass of the tire ($m_u$). There are three vertical displacements included in the quarter car: the vertical displacement of the mass of the car ($z_s$), the vertical displacement of the mass of the tire ($z_u$) and the vertical displacement of the road ($z_r$).

Figure I. Passive suspension system for a quarter car model

B. Quarter Car Model Forces

$\sum f = ma$ ()
The variables are: ($f_s$) force of spring, ($f_b$) force of damper, ($f_t$) force of tire, ($m_s$) mass of car, ($m_u$) mass of tire, ($\ddot{z}_s$) acceleration of the car mass, ($\dot{z}_s$) velocity of the car mass, ($z_s$) position of the car mass, ($\ddot{z}_u$) acceleration of the tire mass, ($\dot{z}_u$) velocity of the tire mass, ($z_u$) position of the tire mass.

$-f_s - f_b = m_s \ddot{z}_s$ ()

$m_s \ddot{z}_s = -k_s (z_s - z_u) - b_s (\dot{z}_s - \dot{z}_u)$ ()

$f_s + f_b - f_t = m_u \ddot{z}_u$ ()

$m_u \ddot{z}_u = k_s (z_s - z_u) + b_s (\dot{z}_s - \dot{z}_u) - k_t (z_u - z_r)$ ()

$\ddot{z}_u = \frac{k_s}{m_u} x_1 + \frac{b_s}{m_u} x_2 - \frac{k_t}{m_u} x_3 - \frac{b_s}{m_u} x_4$ ()

From eq. (4) and eq. (7) we get the matrix $A$:

$A = \begin{bmatrix} 0 & 1 & 0 & -1 \\ -k_s/m_s & -b_s/m_s & 0 & b_s/m_s \\ 0 & 0 & 0 & 1 \\ k_s/m_u & b_s/m_u & -k_t/m_u & -b_s/m_u \end{bmatrix}$ ()

$L = \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}$ ()

C. In-Wheel Motor Electric Vehicle

The In-Wheel Motor Electric Vehicle (IWM EV) is one of the common types of electric vehicle. In the suspension system of an IWM EV the mass of the wheel is almost doubled, because the mass of an electric motor is added to the mass of the wheel. The in-wheel motor (IWM) is placed in the empty space of the wheel [8]. It is known that there is a relationship between the body vibration and the mass of the wheel: when the mass of the wheel is increased, the body vibration also increases, which affects passenger comfort [8]. An IWM EV has extra wheel mass because of the electric motor mass added to the mass of the wheel, which leads to discomfort for the passengers and reduced safety on the road [9]. The increase of unsprung mass causes the vehicle to have worse riding comfort and handling stability [10][11], which affects the tire-road contact [11].

$n_1 = f_s + f_b - f_t$ ()

$n_2 = k_s (z_s - z_u) + b_s (\dot{z}_s - \dot{z}_u)$ ()

$n_3 = -k_t (z_u - z_r)$ ()

$n_1 = n_2 + n_3$ ()

With the total wheel mass $m_t$, the wheel equation becomes

$\ddot{z}_u = \frac{k_s}{m_t} x_1 + \frac{b_s}{m_t} x_2 - \frac{k_t}{m_t} x_3 - \frac{b_s}{m_t} x_4$ ()

From eq. (4) and eq. (16) we get the matrix $A$:

$A = \begin{bmatrix} 0 & 1 & 0 & -1 \\ -k_s/m_s & -b_s/m_s & 0 & b_s/m_s \\ 0 & 0 & 0 & 1 \\ k_s/m_t & b_s/m_t & -k_t/m_t & -b_s/m_t \end{bmatrix}$ ()

D. Ground-hook Control

A ground-hook control introduces a damper connected virtually to the ground, as modelled in Figure 2: ($c_{grd}$) is connected between the mass of the wheel and a fixed imaginary frame on the ground. ($c_{grd}$) is the ground-hook damping coefficient. The damper ($c_{grd}$) is connected to $m_u$ (the mass of the tire) instead of $m_s$ (the mass of the car). The ground-hook controller improves vehicle road holding by minimizing the upper and lower peaks of the wheel displacement and the tire deflection [12].

$f_{grd} = c_{grd} (\dot{z}_u - \dot{z}_r)$ ()

$B = \begin{bmatrix} 0 \\ 1/m_s \\ 0 \\ -1/m_t \end{bmatrix}$ ()

Figure II. In-wheel motor electric quarter vehicle model with ground-hook controller

$y = Cx + Du$ ()

$A$ is the state matrix, $B$ is the input matrix, $C$ is the output matrix and $D$ is the direct transmission matrix; $x$ is the state vector and $u$ is the control vector, where:
$x_1 = z_s - z_u$ ()
$x_2 = \dot{z}_s$ ()
$x_3 = z_u - z_r$ ()
$x_4 = \dot{z}_u$ ()
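To make the quarter car model concrete, a minimal simulation sketch in Python/SciPy follows (the paper's own simulations were carried out in MATLAB/Simulink). The quarter car parameters are assumptions scaled from the half car table given later ($m_s$ taken as a quarter of 1200 kg), and the ground-hook force $f_{grd} = c_{grd}(\dot{z}_u - \dot{z}_r)$ is approximated with $\dot{z}_r \approx 0$, i.e. as extra damping acting on $\dot{z}_u$ alone.

# Minimal sketch (not the authors' Simulink model): quarter-car state-space
# simulation with an ideal ground-hook damper. State x = [zs-zu, zs', zu-zr, zu'].
import numpy as np
from scipy.signal import StateSpace, lsim

ms, mu, mi = 300.0, 40.0, 45.0          # sprung/wheel/motor mass; ms is assumed (1200/4)
ks, kt, bs = 16000.0, 160000.0, 1000.0  # stiffness and damping from the half-car table
c_grd = 10000.0                         # ground-hook damping coefficient

def quarter_car(mt, c=0.0):
    A = np.array([[0.0,    1.0,    0.0,    -1.0],
                  [-ks/ms, -bs/ms, 0.0,    bs/ms],
                  [0.0,    0.0,    0.0,    1.0],
                  [ks/mt,  bs/mt,  -kt/mt, -bs/mt]])
    A[3, 3] -= c / mt                   # f_grd = c*(zu' - zr'), approximated as c*zu'
    L = np.array([[0.0], [0.0], [-1.0], [0.0]])   # road-velocity disturbance enters x3'
    return StateSpace(A, L, np.eye(4), np.zeros((4, 1)))

t = np.linspace(0.0, 10.0, 10001)
zr_dot = 0.05 * np.random.default_rng(0).standard_normal(t.size)  # random road input

for label, sys in [("normal passive", quarter_car(mu)),
                   ("IWM EV passive", quarter_car(mu + mi)),
                   ("IWM EV + ground-hook", quarter_car(mu + mi, c_grd))]:
    _, y, _ = lsim(sys, zr_dot, t)
    print(f"{label:22s} RMS tire deflection: {np.sqrt(np.mean(y[:, 2] ** 2)):.5f} m")

The RMS of $x_3 = z_u - z_r$ printed here serves as a rough proxy for the tire-deflection comparisons plotted in Figures V, VII and VIII.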
Figure VIII. Comparison in tire deflection for quarter vehicle between normal passive, IWM EV passive, IWM EV with ground-hook controller and IWM EV with sky-hook controller

III. Half Car Model

A. Half Car Model Passive Suspension System

It is presented as a linear four-degree-of-freedom (4-DOF) system. The vehicle body has two motions, heave and pitch, and the front and rear tire motions are also involved in the half car. A single car mass is connected to two wheel masses, one at each corner. Vertical and pitch motion are appropriate for the sprung mass, while only vertical motion applies to both unsprung masses. The pitch motion of the half car sprung mass is represented, and the vertical displacements of the front tire ($z_{u1}$) and the rear tire ($z_{u2}$) are also introduced.

Figure IX. Passive suspension system for half car model (4-DOF)

B. Half Car Model Forces

The state vector $x$ is $[z, \dot{z}, \theta, \dot{\theta}, z_{u1}, \dot{z}_{u1}, z_{u2}, \dot{z}_{u2}]$: position of the sprung mass ($z$), velocity of the sprung mass ($\dot{z}$), the pitch angle ($\theta$), the pitch rate ($\dot{\theta}$), $z_{u1}$ (position of the front tire), $\dot{z}_{u1}$ (absolute velocity of the front unsprung mass), $z_{u2}$ (position of the rear tire), $\dot{z}_{u2}$ (absolute velocity of the rear unsprung mass). $u$ is the input vector $[z_{r1}, z_{r2}]$.

$a_1 = m_s \ddot{z}$ ()
$a_2 = k_1 (z_1 - z_{u1}) + b_{s1} (\dot{z}_1 - \dot{z}_{u1})$ ()

$z_1 = z - l_f \theta$ ()
$\dot{z}_1 = \dot{z} - l_f \dot{\theta}$ ()
$z_2 = z + l_r \theta$ ()
$\dot{z}_2 = \dot{z} + l_r \dot{\theta}$ ()

$\ddot{z} = \left(-\frac{k_1}{m_s} - \frac{k_2}{m_s}\right) z + \left(\frac{k_1 l_f}{m_s} - \frac{k_2 l_r}{m_s}\right)\theta - \left(\frac{b_{s1}}{m_s} + \frac{b_{s2}}{m_s}\right)\dot{z} + \left(\frac{b_{s1} l_f}{m_s} - \frac{b_{s2} l_r}{m_s}\right)\dot{\theta} + \frac{k_1}{m_s} z_{u1} + \frac{b_{s1}}{m_s}\dot{z}_{u1} + \frac{k_2}{m_s} z_{u2} + \frac{b_{s2}}{m_s}\dot{z}_{u2}$ ()

$(m_u + m_i)\ddot{z}_{u1} = -F_t + F_s + F_b$ ()

$b_1 = m_t \ddot{z}_{u1}$ ()
$b_2 = -k_{t1}(z_{u1} - z_{r1}) + k_1 (z_1 - z_{u1})$ ()
$b_3 = b_{s1}(\dot{z}_1 - \dot{z}_{u1})$ ()
$b_1 = b_2 + b_3$ ()

$c_1 = \ddot{z}_{u1}$ ()
$c_2 = -\frac{k_{t1}}{m_t} z_{u1} - \frac{k_1}{m_t} z_{u1} - \frac{b_{s1}}{m_t}\dot{z}_{u1}$ ()
$c_3 = \frac{k_1}{m_t} z + \frac{b_{s1}}{m_t}\dot{z}$ ()
$c_4 = -\frac{k_1 l_f}{m_t}\theta - \frac{b_{s1} l_f}{m_t}\dot{\theta}$ ()
$c_5 = \frac{k_{t1}}{m_t} z_{r1}$ ()
$c_1 = c_2 + c_3 + c_4 + c_5$ ()

$d_1 = m_t \ddot{z}_{u2}$ ()
$d_1 = -f_t + f_s + f_b$ ()
$d_2 = -k_{t2}(z_{u2} - z_{r2}) + k_2 (z_2 - z_{u2})$ ()
$d_3 = b_{s2}(\dot{z}_2 - \dot{z}_{u2})$ ()
$d_1 = d_2 + d_3$ ()

$e_1 = \ddot{z}_{u2}$ ()
$e_2 = -\frac{k_{t2}}{m_t} z_{u2} - \frac{k_2}{m_t} z_{u2} - \frac{b_{s2}}{m_t}\dot{z}_{u2}$ ()
$e_3 = \frac{k_2}{m_t} z + \frac{b_{s2}}{m_t}\dot{z}$ ()
$e_4 = \frac{k_2 l_r}{m_t}\theta + \frac{b_{s2} l_r}{m_t}\dot{\theta}$ ()
$e_5 = \frac{k_{t2}}{m_t} z_{r2}$ ()
$g_2 = -b_{s1}(\dot{z}_1 - \dot{z}_{u1}) l_f + b_{s2}(\dot{z}_2 - \dot{z}_{u2}) l_r$ ()
$g_3 = g_1 + g_2$ ()

$r_1 = \left(\frac{k_1 l_f}{I} - \frac{k_2 l_r}{I}\right) z + \left(\frac{b_{s1} l_f}{I} - \frac{b_{s2} l_r}{I}\right)\dot{z}$ ()
$r_2 = -\left(\frac{k_1 l_f^2}{I} + \frac{k_2 l_r^2}{I}\right)\theta - \left(\frac{b_{s1} l_f^2}{I} + \frac{b_{s2} l_r^2}{I}\right)\dot{\theta}$ ()
$r_3 = -\frac{k_1 l_f}{I} z_{u1} - \frac{b_{s1} l_f}{I}\dot{z}_{u1}$ ()
$r_4 = \frac{k_2 l_r}{I} z_{u2} + \frac{b_{s2} l_r}{I}\dot{z}_{u2}$ ()
$\ddot{\theta} = r_1 + r_2 + r_3 + r_4$ ()

Parameters for the half vehicle model passive suspension system: mass of car $m_s = 1200$ kg, mass of tire $m_u = 40$ kg, the right spring stiffness $k_1 = 16000$, the left spring stiffness $k_2 = 16000$, the right tire stiffness $k_{t1} = 160000$, the left tire stiffness $k_{t2} = 160000$, the right damper $b_{s1} = 1000$, the left damper $b_{s2} = 1000$, the distance from the front wheel to the center of gravity ($l_f = 1.1$ m), the distance from the rear wheel to the center of gravity ($l_r = 1.3$ m); $I$ is the mass moment of inertia. The input matrix is

$B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ k_{t1}/m_u & 0 \\ 0 & 0 \\ 0 & k_{t2}/m_u \end{bmatrix}$ ()

Parameters for the in-wheel motor half electric vehicle (IWM EV) with ground-hook controller: mass of car $m_s = 1200$ kg, mass of tire $m_u = 40$ kg, mass of in-wheel motor $m_i = 45$ kg, the right spring stiffness $k_1 = 16000$, the left spring stiffness $k_2 = 16000$, the right tire stiffness $k_{t1} = 160000$, the left tire stiffness $k_{t2} = 160000$, the distance from the front wheel to the center of gravity ($l_f = 1.1$ m), the distance from the rear wheel to the center of gravity ($l_r = 1.3$ m), the ground-hook damping coefficients $c_{grd1} = c_{grd2} = 10000$; $z_{r1}$ (random ground input 1), $z_{r2}$ (random ground input 2), $f_{grd1}$ (ground-hook force 1), $f_{grd2}$ (ground-hook force 2). $u$ is the input vector $[z_{r1}, z_{r2}, f_{grd1}, f_{grd2}]$ and the input matrix is

$B = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1/m_s & 1/m_s \\ 0 & 0 & 0 & 0 \\ 0 & 0 & -l_f/I & l_r/I \\ 0 & 0 & 0 & 0 \\ k_{t1}/m_t & 0 & -1/m_t & 0 \\ 0 & 0 & 0 & 0 \\ 0 & k_{t2}/m_t & 0 & -1/m_t \end{bmatrix}$ ()

The full state matrix for the passive half car is

$A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-\frac{k_1 + k_2}{m_s} & -\frac{b_{s1} + b_{s2}}{m_s} & \frac{k_1 l_f - k_2 l_r}{m_s} & \frac{b_{s1} l_f - b_{s2} l_r}{m_s} & \frac{k_1}{m_s} & \frac{b_{s1}}{m_s} & \frac{k_2}{m_s} & \frac{b_{s2}}{m_s} \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\frac{k_1 l_f - k_2 l_r}{I} & \frac{b_{s1} l_f - b_{s2} l_r}{I} & -\frac{k_1 l_f^2 + k_2 l_r^2}{I} & -\frac{b_{s1} l_f^2 + b_{s2} l_r^2}{I} & -\frac{k_1 l_f}{I} & -\frac{b_{s1} l_f}{I} & \frac{k_2 l_r}{I} & \frac{b_{s2} l_r}{I} \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
\frac{k_1}{m_u} & \frac{b_{s1}}{m_u} & -\frac{k_1 l_f}{m_u} & -\frac{b_{s1} l_f}{m_u} & -\frac{k_{t1} + k_1}{m_u} & -\frac{b_{s1}}{m_u} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\frac{k_2}{m_u} & \frac{b_{s2}}{m_u} & \frac{k_2 l_r}{m_u} & \frac{b_{s2} l_r}{m_u} & 0 & 0 & -\frac{k_{t2} + k_2}{m_u} & -\frac{b_{s2}}{m_u}
\end{bmatrix}$ (67)
For the IWM EV half car the state matrix is identical to (67), with the wheel mass $m_u$ replaced by the total wheel mass $m_t = m_u + m_i$:

$A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-\frac{k_1 + k_2}{m_s} & -\frac{b_{s1} + b_{s2}}{m_s} & \frac{k_1 l_f - k_2 l_r}{m_s} & \frac{b_{s1} l_f - b_{s2} l_r}{m_s} & \frac{k_1}{m_s} & \frac{b_{s1}}{m_s} & \frac{k_2}{m_s} & \frac{b_{s2}}{m_s} \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
\frac{k_1 l_f - k_2 l_r}{I} & \frac{b_{s1} l_f - b_{s2} l_r}{I} & -\frac{k_1 l_f^2 + k_2 l_r^2}{I} & -\frac{b_{s1} l_f^2 + b_{s2} l_r^2}{I} & -\frac{k_1 l_f}{I} & -\frac{b_{s1} l_f}{I} & \frac{k_2 l_r}{I} & \frac{b_{s2} l_r}{I} \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
\frac{k_1}{m_t} & \frac{b_{s1}}{m_t} & -\frac{k_1 l_f}{m_t} & -\frac{b_{s1} l_f}{m_t} & -\frac{k_{t1} + k_1}{m_t} & -\frac{b_{s1}}{m_t} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\frac{k_2}{m_t} & \frac{b_{s2}}{m_t} & \frac{k_2 l_r}{m_t} & \frac{b_{s2} l_r}{m_t} & 0 & 0 & -\frac{k_{t2} + k_2}{m_t} & -\frac{b_{s2}}{m_t}
\end{bmatrix}$ (70)
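As a cross-check of the matrices above, the following sketch (a Python transcription, not the authors' MATLAB code) assembles the state matrix of eqs. (67) and (70) from the listed parameters and confirms that both the passive and the heavier IWM EV configurations remain asymptotically stable. The pitch inertia $I$ is not given in the text, so a typical passenger-car value is assumed.

# Minimal sketch: build the 8x8 half-car state matrix of eqs. (67)/(70).
# State order: [z, z', theta, theta', zu1, zu1', zu2, zu2'].
import numpy as np

ms = 1200.0
k1 = k2 = 16000.0
kt1 = kt2 = 160000.0
bs1 = bs2 = 1000.0
lf, lr = 1.1, 1.3
I = 1500.0                              # assumed pitch moment of inertia [kg m^2]

def half_car_A(mt):
    return np.array([
        [0, 1, 0, 0, 0, 0, 0, 0],
        [-(k1 + k2)/ms, -(bs1 + bs2)/ms, (k1*lf - k2*lr)/ms, (bs1*lf - bs2*lr)/ms,
         k1/ms, bs1/ms, k2/ms, bs2/ms],
        [0, 0, 0, 1, 0, 0, 0, 0],
        [(k1*lf - k2*lr)/I, (bs1*lf - bs2*lr)/I, -(k1*lf**2 + k2*lr**2)/I,
         -(bs1*lf**2 + bs2*lr**2)/I, -k1*lf/I, -bs1*lf/I, k2*lr/I, bs2*lr/I],
        [0, 0, 0, 0, 0, 1, 0, 0],
        [k1/mt, bs1/mt, -k1*lf/mt, -bs1*lf/mt, -(kt1 + k1)/mt, -bs1/mt, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 1],
        [k2/mt, bs2/mt, k2*lr/mt, bs2*lr/mt, 0, 0, -(kt2 + k2)/mt, -bs2/mt]])

for label, m_wheel in [("passive, mu = 40 kg", 40.0), ("IWM EV, mu + mi = 85 kg", 85.0)]:
    eig = np.linalg.eigvals(half_car_A(m_wheel))
    print(f"{label}: all eigenvalues in the left half-plane: {bool(np.all(eig.real < 0))}")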
Figure XIII. Comparison in tire deflection of the front tire for the half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XIV. Comparison in tire deflection of the rear tire for the half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XI. MATLAB simulation comparison for half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller
Figure XII. Comparison in acceleration of sprung mass for half vehicle between normal passive, in-wheel motor passive and in-wheel motor with ground-hook controller

C. Simulation and Analysis

Figures IV, V, VI and VII show the simulations for the passive and active quarter car models. Figures XII, XIII and XIV show the simulations for the passive and active half car models. Figures IV, VI and XII compare the accelerations of the vehicle bodies for the normal passive vehicle, the IWM passive vehicle, the IWM ground-hook active suspension vehicle and the IWM sky-hook active suspension vehicle. The acceleration is commonly used to measure passenger comfort. According to Figures IV, VI and XII, the added weight of the in-wheel electric motor does not have much of an impact on the acceleration and vehicle comfort. However, the ground-hook active suspension increases the acceleration and diminishes passenger comfort, while the sky-hook active suspension decreases the acceleration and improves the comfort for the occupants inside the car. Figures V, VII, XIII and XIV show the tire deflections for the normal passive vehicle, the IWM passive vehicle, the IWM ground-hook active suspension vehicle and the IWM sky-hook active suspension vehicle. The added weight of the in-wheel motor
increases the tire deflection dramatically. Therefore, it is observed that the road holding properties of the tire are worse for the in-wheel motor passive suspension system, and also worse for the sky-hook controller active suspension, because the sky-hook controller only improves the comfort of the passengers. However, the ground-hook controller active suspension is able to decrease the tire deflection to values close to the passive suspension. Therefore, the ground-hook controller is able to reach its goal of improving road holding. This simulation shows that the ground-hook controller is able to eliminate the negative effects of an IWM EV in terms of road holding. However, this is done at the cost of decreasing passenger comfort. The sky-hook controller is able to improve the comfort for the passengers of an IWM EV, but fails to improve the road holding. It should be noted that the gain of the ground-hook controller can be adjusted to decrease the negative effects on passenger comfort. However, this will decrease the effectiveness of the ground-hook controller.

IV. Conclusion

A passive suspension system for both a normal passive vehicle and an IWM EV without any controller, an IWM EV with ground-hook controller and an IWM EV with sky-hook controller were analyzed using MATLAB/Simulink. The simulations show that the ground-hook controller improves road holding by reducing the tire deflection for the IWM EV. The sky-hook is shown to improve the comfort for the occupants inside the car. In the case of decreasing or increasing the unsprung mass, there is an opposite relationship between the size of the unsprung mass and the road holding. When the unsprung mass is decreased, a better road holding response is achieved. On the other hand, increasing the unsprung mass for the IWM EV (adding the mass of the electric motor, 45 kg) to the front and rear wheel assemblies results in worse road holding, which directly affects the ground contact of the tire. This is the reason for the worse road holding of the IWM EV. When the ground-hook controller is used for the IWM EV, better road holding and good contact with the road are achieved. This is because the ground-hook controller provides better isolation from road disturbances by reducing the upper and lower peaks of the tire deflection. The sky-hook controller fails to improve the road holding because it is used to provide comfort for the passengers. These results were shown with simulations using both the quarter car model and the half car model.

References
[1] Nouby M. Ghazaly, Ahmad O. Moaaz, "The Future Development and Analysis of Vehicle Active Suspension System", Journal of Mechanical and Civil Engineering (IOSR-JMCE), 2014.
[2] Vivek Kumar Maurya and Narinder Singh Bhangal, "Optimal Control of Vehicle Active Suspension System", Journal of Automation and Control Engineering, 2018.
[3] John Ekoru, "Intelligent Model Predictive/Feedback Linearization Control of Half-Car Vehicle Suspension Systems", thesis, 2012.
[4] Fischer, D. and Isermann, R., "Mechatronic semi-active and active vehicle suspensions", Control Engineering Practice, 2004.
[5] Canale, M., Milanese, M. and Novara, C., "Semi-active suspension control using fast model-predictive techniques", IEEE Transactions on Control Systems Technology, 2006.
[6] M. Farid Aladdin, Jasdeep Singh, "Modelling and simulation of semi-active suspension system for passenger vehicle", Journal of Engineering Science and Technology, 2018.
[7] M. Valášek, M. Novák, Z. Šika and O. Vaculín, "Extended Ground-Hook - New Concept of Semi-Active Control of Truck's Suspension", Vehicle System Dynamics, 2007.
[8] Hossein Salmani, Milad Abbasi, Tondar Fahimi Zand, Mohammad Fard and Reza Nakhaie Jazar, "A new criterion for comfort assessment of in-wheel motor electric vehicles", Journal of Vibration and Control, 2020.
[9] Yechen Qin, Chenchen He, Peng Ding, Mingming Dong, Yanjun Huang, "Suspension hybrid control for in-wheel motor driven electric vehicle with dynamic vibration absorbing structures", IFAC PapersOnLine 51-31, 2018.
[10] Fangwu Ma, Jiawei Wang, Yang Wang, Longfan Yang, "Optimization design of semi-active controller for in-wheel motors suspension", 2018.
[11] Jialing Yao, Shuhui Ding, Zhaochun Li, Shan Ren, Saied Taheri, Zhongnan Zhang, "Composite Control and Co-Simulation in In-Wheel Motor Electric Vehicle Suspension", The Open Automation and Control Systems Journal, 2015.
[12] Patil, I. and Wani, K., "Design and Analysis of Semi-active Suspension Using Skyhook, Ground Hook and Hybrid Control Models for a Four Wheeler", SAE Technical Paper, 2015.
[13] Pipit Wahyu Nugroho, Weihua Li, Haiping Du, Gursel Alici, and Jian Yang, "An Adaptive Neuro Fuzzy Hybrid Control Strategy for a Semiactive Suspension with Magneto Rheological Damper", Advances in Mechanical Engineering, 2014.
[14] Rajesh Rajamani, "Vehicle Dynamics and Control", Department of Mechanical Engineering, University of Minnesota.
Evolutionary Deep Automatic CAD System for Early Detection of Diabetic Retinopathy and Its Severity Classification
Lamiya Salman Dr Nidhal Abdulaziz
Faculty of Engineering & Information Faculty of Engineering & Information
Science, University of Wollongong in Science, University of Wollongong in
Dubai, Dubai, UAE Dubai, Dubai, UAE
lamiyasalman3@gmail.com NidhalAbdulaziz@uowdubai.ac.ae
Abstract— Diabetic retinopathy (DR) is a retinal malady prevalent in individuals aged 25 years and above that leads to irreversible blindness when left unchecked. DR has yet to find a definitive cure. Diagnostic and corrective measures may prevent complete blindness by up to 90%, provided early screening and monitory clinical visits. Although ample research and advancements have been made, none have been successfully integrated into the medical system. This is due to the lack of acceptable classification accompanying screening. Further, diluted efforts have been made towards monopolizing early detection, i.e., detection of non-proliferative DR, a subclass and the earliest discernible form of DR. This study aimed at the early detection of DR through adoption of the medically validated 5-class DR severity grading. The proposed system was a novel hybrid scheme based on past works. Multiple input image modalities and classifier combinations were tested to configure the final DR grading scheme. The system was actualized using fundus images as input, which were pre-processed and augmented using green channel extraction, CLAHE and binary masking. Images were then synthesized into discriminatory features using ResNet-50. The final system tier consisted of severity grading through classification using a novel CNN-based Support Vector Machine (SVM). The utilized ensemble of augmentation and pre-processing module along with the ResNet-50 based SVM classifier is a novel contribution not explored in any past works. The proposed pre-processing increased system accuracy by 13.9% and 16% on IDRiD and Kaggle. Overall F1-score, SN, SP and Acc of 0.978, 0.979, 0.995 and 0.979 were achieved. The incorporation of Artificial Intelligence made the proposed system time, cost, and labor efficient, which is key in DR screening.

Keywords— Diabetic Retinopathy, severity classification, Pre-Processing (PP), Convolutional Neural Network (CNN), Support Vector Machine (SVM).

I. Introduction

Diabetic retinopathy (DR), a progressive sight-threatening ailment, develops in individuals with prolonged diabetes type 1 or 2, due to the presence of high blood sugar levels in their system [1]. DR has a predicted global prevalence of 4.4% by 2030 as per the World Health Organization and presents an asymptomatic inceptive stage, a long latent phase which, when deprived of an early diagnosis, may eventually lead to blindness [2]. DR diagnosis and severity analysis today is provided only by an ophthalmologist or through his evaluation of retinal images/scans. This process is cost, time and labor intensive, especially in the case of large, disproportionate field-expert-to-patient ratios, often leading to no patient screening at all [1][3]. These alarming statistics and the need for a cost-efficient and accessible screening system have motivated copious research and advancement in computer aided diagnostic (CAD) devices, but no such work has been able to swap places with a comprehensive eye test [5]. This has been due to inadequate system performance, lack of training data and classification mostly limited to binary severity grades. Lastly, past works have not placed much importance on the early detection of DR.

This study recognizes the vital importance of early detection and aims at a five-level severity grade output. This need for early detection is reinforced by the fact that progression and a possible cure can be achieved only in the early stages of DR, as per the Early Treatment Diabetic Retinopathy Study (ETDRS) and the Diabetic Retinopathy Study (DRS) [6]. The scope of the project extends strictly to screening, grading and subsequent preventive measures/guidelines.

II. Theoretical Background

A. Data Processing

Input retinal images are captured and sourced in real-time conditions such as poor contrast, blurring and non-uniform illumination. Counteracting these implications is essential, as system performance is dependent on accurate retinal DR feature localization and subsequent classification based on these discriminant traits.

Histogram Equalisation (HE) is a common method of image contrast enhancement, used by Foeady et al. [7]. Although frequently used, HE alters the mean brightness of the input image, introducing artifacts and intensity saturation, and therefore is not a suitable choice. Contrast Limited Adaptive Histogram Equalization (CLAHE) is a variant of HE which increases image quality, incorporates uniform intensity equalization and reduces amplified noise, thereby taking care of the shortcomings of HE. CLAHE outperforms global enhancement methods like contrast stretching in blood vasculature (BV) enhancement, as seen in the work done by M. H. Fadzil et al. [8]; this is further validated through the highest peak signal-to-noise ratio and lowest absolute mean brightness error when compared to other variants such as Adaptive Histogram Equalization [8]. Extraction of a single component of the RGB model can be done as a means of contrast enhancement. Empirical evidence along with entropic comparisons by N. R. Binti et al. [9] support extraction of the green component due to its capacity to provide the best distinction among DR features/lesions such as Micro Aneurysms (MA), Cotton Wool spots, Exudates (EXs), Hemorrhages (HM) and BV, when compared to the red, blue and gray channels. This enables maximum entropy preservation while enhancing contrast. In contradiction to this, optic disk localization is aided by red channel entropy extraction, due to better visual contrast, as done by S. Pathan et al. [10]. Qiao et al. [11] employed Matched Filters (MF) combined with LoG filters to localize transient EXs; MFs are not suitable as they enhance BV along with the EXs, and differences in lesion size will lead to inaccurate localization.
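As an illustration of the enhancement chain favoured above (green channel extraction followed by CLAHE), here is a minimal OpenCV sketch; the clip limit, tile grid and mask threshold are assumed values, not parameters reported in this study.

# Minimal sketch: green-channel extraction + CLAHE + crude field-of-view mask.
import cv2
import numpy as np

def enhance_fundus(path, clip_limit=2.0, tiles=8):
    bgr = cv2.imread(path)                       # OpenCV loads images as BGR
    green = bgr[:, :, 1]                         # green channel: best lesion/vessel contrast
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tiles, tiles))
    enhanced = clahe.apply(green)                # local, clip-limited histogram equalization
    mask = (green > 10).astype(np.uint8)         # suppress the dark background corners
    return cv2.bitwise_and(enhanced, enhanced, mask=mask)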
H. Chen et al. [12] cropped retinal images that had incomplete hemispherical boundaries to form complete spheres. This approach is not ideal, as the removal of peripheral regions would lead to inaccurate severity grading due to the loss of diagnostic lesion areas.

While surveying the appropriate literature, a trend was identified among CNN based architectures where pre-processing was mostly limited to Data Augmentation (DA) such as horizontal and vertical flipping, rotation, mirroring, scaling and grayscale normalization, as seen in the work done by S. Pao et al. [13]. This may have been due to the added pre-processing effect of the max pooling layer by means of de-noising, along with the prevalence of DA, which exposes a variety of variations to the concerned neural networks, aiding feature learning instead of separate enhancement of said features. The direct increase in system performance by means of increased PP does not translate to CNN based systems, due to their ability to iteratively learn features from input images without requiring feature specific enhancement or enrichment. Processing may become disruptive to said feature learning process due to the dampening of distinctive characteristics. Undue PP leads to added computational and time complexity along with added memory constraints on the system. Despite this, contrast enhancement and optimal channel extraction have unequivocally facilitated improved system performance, being the choice of many CNN and non-CNN users [14].

B. Feature Segmentation and Extraction

Features form the basis of classification, supplementing the classifier with the information necessary to sort between different classes. This may be done through feature specific, Region of Interest (ROI) and CNN based feature abstraction. The Optic Disk (OD) and BV present in the retina are a potent source of errors, false positive rates (FPR) and misclassification. The minimum-intensity maximum-solidity (MinIMaS) algorithm was invoked by S. Roychowdhury et al. [15] for means of OD extraction and masking. Results of the overall classification system showed 100% sensitivity (SN) but 53.16% specificity (SP), i.e. a high FPR.

Thresholding is a common approach adopted for segmentation. In the case of BV, a wide range of values is present due to differences in vessel width and edge pixels. M. U. Akram et al. [16] eradicated this limitation by means of multilayered thresholding. Skeletonization of the segmented image by means of a thinning morphological operation (MO) was done. Although an Acc of 94.69% was achieved, high performance metrics on a limited dataset are misguiding due to exposure to limited sample types.

MAs are saccular capillary dilatations; for efficient MA segmentation, Mateen et al. [17] applied vessel extraction using a novel hybrid approach of an enhanced Gaussian Mixture Model (GMM) with Adaptive Learning Rate (ALR) to aid ROI localization for subsequent feature extraction. Incorporation of ALR was the optimal choice as it helps increase learning rate and convergence speed over others such as Expectation Maximization (EM). Post vessel detection, Connected Components Analysis (CCA) along with blob analysis was utilized to differentiate among lesions and healthy retina to enhance BV extraction. CCA is ideal as it handles differentiation of BV based on colour, proximity, shape, and size. An accuracy of 98% was reported. MA and HM are spherical in shape with a diameter larger than the feeding blood vessels. Using this property, H. F. Jaafar et al. [18] segmented red lesions by applying the flood-fill operation to pixels pertaining to the background and subtracting them from the original image. Incorrect discrimination between circular and linear shapes may occur due to inaccurate threshold selection. An SN of 89.7% was reported by the study.

V. K. Sree and P. S. Rao [19] carried out EX segmentation by applying Canny edge detection, and smoothening via a gaussian mask was done. CCA was used to colour code segmented EX regions and subtract them from the original image for exudate localization. The reported accuracy was 72.7%. P. Khojasteh et al. [20] extracted features without added pre-processing or prior segmentation for means of EX feature extraction, using ResNet-50, Discriminative Restricted Boltzmann Machines (DRBMs) and a custom CNN. Feature extraction was enhanced using PCA along with parameter optimization using a grid-search approach. Results of ResNet-50 were encouraging (Acc 98.2%, SN 99%) but those of the custom CNN (Acc 70%) and the DRBMs (Acc 89%) were poor.

It may be concluded that while hand crafted feature extraction as seen in [18],[15] is efficient, it is complex and static, hence it would not fare well in extending the system to different datasets while maintaining a high level of performance. This issue is reinforced by manual requirements such as threshold selection based on training data, seen in [18]. CNN based feature extraction is robust and fully automated, hence dynamic, having an edge over feature specific non-CNN extractions. High accuracies may be attributable to overfitting of training data. To exploit CNNs for feature extraction, ROI localization and balanced and large datasets by means of augmentation, as seen in [17][20], may be employed.

C. Classification

Classification may be seen at lesion level or at image level into DR severity grades. Previous studies showed several binary and multilevel classifications such as DR/No DR, Referral (R)/Non-referral (NR) DR, and the ETDRS 5-level classification adopted in this project.

1) Feature based classification

M. H. Ahmad Fadzil et al. [8] used a non-conventional approach by employing the analysis of the Foveal Avascular Zone (FAZ) for DR detection. Classification was limited to a three-level severity grading with SP (>98%), SN (100%) and Acc of 99%. Although the system fared exceptionally, FAZ area overlaps were seen and deemed as progressive stages. M. U. Akram et al. [16] employed PDR classification based on neovascularization detection. A 15x15 window was slid over the segmented BV to compute density and energy. Disparity in behavior with normal blood vessels led to PDR classification. An Acc of 95.02% was achieved.

A novel two step lesion classification and severity estimation was done by S. Roychowdhury et al. [15]. Various feature-based classifiers such as k-nearest neighbors (KNN), GMM, SVM and their ensembles were tested to find the best suited pick for the separate classification of bright and red
lesions. The system achieved high SN ranging from 100-92% but poor SP of 48-58%, reflecting a high FPR due to bias in the algorithm selected for feature set optimization, the feature set being key due to the usage of feature-based classifiers. Akram et al. [21] monopolized M-Medoid classification to develop a hybrid classification model using probabilistic integration of GMM classifiers, an effective multi-class discriminant aided by intuitive modelling due to their resilience towards class overlap, which is common among DR classification classes. Severity classification was based on the numeric presence of MAs, EXs and HMs. Reported metrics were promising at 99.17% SN, 97.07% SP and 98.52% Acc.

2) CNN

The prevalent issue of class imbalance in CNNs is directly addressed by a novel two-step CNN architecture developed by N. Eftekhari et al. [22], which was claimed to decrease FPR by means of pixel wise classification of MAs. A large FPR was generated, reaching up to 8 FP per image at the highest reported SN of 77.1%. Training and developing CNNs manually, along with pixel wise computation, are labor and time intensive and, in the case of the former, error prone.

Shaban et al. [23], H. Chen et al. [12] and H. Seyedarabi et al. [14] undertook DR severity grading using Transfer Learning (TL) without prior feature extraction/segmentation. The authors in [23] engineered DR severity grading using a TL applied VGG-19 network. No prior PP was utilized. Highest Acc, SN, and SP of 89%, 89% and 95% were reported. The TL enhanced CNN by the authors in [23] fared better than the custom CNN proposed by N. Eftekhari et al. [22]. While opting out of segmentation and feature extraction is acceptable due to the structural gains of CNNs, PP such as DA and contrast enhancement is crucial for system performance, especially due to the ambiguous nature of retinal features. Classification was limited to No/Moderate/Severe DR. Unlike the authors in [23], the authors in [14] applied CLAHE for PP and claimed EfficientNet to be the ideal choice of CNN due to its reduced parameters and flops and increased speed and accuracy. Classification was limited to referral and non-referral DR with a reported 93% SN. H. Chen et al. [12] aimed at 5-level severity classification. To this end, pre-trained Inception V3, chosen for its depth and elevated linear expression, was enhanced using Stochastic Gradient Descent (SGD) and an early stop mechanism to mitigate learning rate issues and overfitting. The stated F1 score of 77% was poor, especially in classes 1 and 3.

3) Ensembles

Ensembles of classifiers are an active area of research due to their ability to boost system performance, but at the cost of increased system complexity and time constraints [24]. J. Sahlsten et al. [24] validated this claim by employing the Inception-V3 architecture and its ensemble of 6 units to compare the classification performance of the two systems. Results showed a clear increase in performance across all planes when comparing the single CNN against the ensemble system. M. Ghazal et al. [25] utilized TL and ensembles and combined them to authenticate the advantages of TL. The ensemble of 7 CNNs had pre-trained AlexNets and randomly initialized custom CNNs. Classification was by means of SVM. This choice was made by sampling other classifiers such as KNN, random forest and random tree. The superiority of TL over custom CNNs was validated by the demarcation in performance metrics. This study established the dominance of SVMs over other classifiers when paired with CNNs, through benefits in terms of training, data requirements and time constraints using TL; despite these crucial assertions, the system limited itself to early PDR detection.

Qummar et al. [26], exploiting an ensemble of CNNs with ResNet-50, Inception V3, Xception, Dense121 and Dense169 as the base models, aimed at encoding rich feature extraction and accurate DR classification into 5-class severities, which is not seen often. The system output was the stacked combination of the individual predicted class labels. Khalifa et al. [27] set up a comparative study to find the best CNN architecture for the same purpose as the authors in [26], but unlike the former was limited to a single CNN. The authors in [26] reported an Acc and F1 score of 80.8% and 53.7%, but an impressive F1-score of 95.82% was reported by the authors in [27], with the presumption of AlexNet being the best choice and VGG16 being second. CNN classification, although robust on its own, may be further elevated by interweaving other supervised or unsupervised classifiers, forming an ensemble of sorts [25].

P. Khojasteh et al. [20] set up a comparative analysis of a custom CNN, an unsupervised DRBM, and a ResNet-50 transfer model with its SoftMax classifier layer switched with SVM, supervised Optimum-Path Forest, and k-NN classifiers, comparing the performance of each in EX classification. ResNet-50+SVM outperformed the others with 0.96 SP and 0.99 SN. Mateen et al. [17] classified 5 stages of DR using feature vectors segmented from fundus images, utilizing the SoftMax algorithm. Being one of the few 5-grade classifications using a CNN structure, the reported Acc was 98.13%. It is noted that machine learning (ML) classifiers such as GMMs and SVMs are high performing and efficient but are feature set dependent, hence more complex and resistant to expansion to new datasets [14]. CNNs directly address this issue and are high performing, but fall victim to overfitting and class imbalance; combining the two approaches, SVM and CNNs, would result in elevated results and resilient classification [17][20].

III. Materials and Method

The proposed system is a novel hybrid approach designed to generate DR severity grading based on input fundus images. The overall system block diagram is shown in Figure I.

Figure I. Overall System block diagram with all modalities tested.

1) Dataset

The KAGGLE benchmark dataset is a compilation of 35,126 images with separate training and testing sets, with class labels ranging from 0-4, in line with the severity scheme adopted by this study.

IDRiD comprises 516 images. Images are available in .jpg format, at 4288x2848 resolution with 800KB size. Images are accompanied with severity grading and ground truths.
Table I. Effect of Pre-processing datasets on performance.
Data set Classifier No of images Accuracy Sensitivity Specificity Precision Recall F1 Score
IDRiD (raw) SoftMax 840 0.91 0.760 0.945 0.771 0.767 0.763
IDRiD (PP) SoftMax 840 0.933 0.824 0.96 0.826 0.824 0.824
Kaggle (raw) SoftMax 1000 0.836 0.584 0.899 0.59 0.584 0.578
Kaggle (PP) SoftMax 1000 0.904 0.776 0.944 0.755 0.756 0.743
IDRiD (raw) SVM 840 0.84 0.64 0.906 0.60 0.64 0.564
IDRiD (PP) SVM 840 0.979 0.979 0.995 0.978 0.979 0.978
Kaggle (raw) SVM 1000 0.824 0.54 0.897 0.56 0.54 0.52
Kaggle (PP) SVM 1000 0.984 0.961 0.99 0.956 0.961 0.96
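The indicators in Table I follow the usual confusion-matrix definitions; a minimal sketch, assuming macro-averaging over the five severity grades (the paper does not state its averaging scheme), is:

# Minimal sketch: Acc, SN, SP, Precision, Recall and F1 from a confusion matrix.
import numpy as np

def metrics(cm):
    """cm[i, j] = number of images of true grade i predicted as grade j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    sn = tp / (tp + fn)                 # sensitivity (= recall)
    sp = tn / (tn + fp)                 # specificity
    pr = tp / (tp + fp)                 # precision
    f1 = 2 * pr * sn / (pr + sn)
    return {"Acc": tp.sum() / cm.sum(), "SN": sn.mean(), "SP": sp.mean(),
            "Precision": pr.mean(), "Recall": sn.mean(), "F1": f1.mean()}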
2) Data Augmentation

Of the two datasets employed, IDRiD was first balanced in terms of images per DR severity class using manual up-sampling. Once balanced, the datasets were further augmented by means of reflection, scaling and translation. This was done to diminish overfitting to a singular class/dataset and to provide a larger dataset for TT purposes from the smaller source dataset. DA was done uniformly across all classes of data, maintaining equal ratios to avoid class imbalance.

3) Pre-processing

Acquired images had their green channel extracted prior to masking, highlighting foreground from background [7] and providing the best distinction between retinal features in the foreground and darker background pixels [9]. Input retinal fundus images were resized to the standard input size of ResNet-50.

4) ROI localization

BV segmentation was done in accordance with the work presented by M. Mateen et al. [17]. Discriminant information present in the BV regions was localized and fed to the CNN for feature extraction. The GMM was optimized using EM instead of ALR, due to comparable performance and the limited works on the fairly novel ALR implementation with GMMs [21].

5) Feature Extraction and Classification

Feature extraction by means of CNNs was opted for due to their ability to automatically learn features instead of requiring a set of hand-crafted features, which are time consuming, complex, and fixed. This project aims at the comparative analysis of different classification approaches. The novelty of the proposed system is the usage of ResNet-50 with SVM for DR severity classification, which has not been explored earlier. Classification has been implemented in accordance with the work done by P. Khojasteh et al. [20]; hence it is an extension of his work from HE classification to DR grading. Multiple classification approaches have been tested: the ResNet-50 architecture with raw fundus images, pre-processed images and segmented images, and ResNet-50 with an SVM classifier and raw/PP/segmented images. SVM and ResNet-50 have been selected due to their consistently high performance. Classification without the SVM classifier was done by means of the SoftMax activation technique; both protocols were based on features extracted by the CNN. 5-fold cross validation was used to reflect the authenticity of the acquired results, certifying the lack of overfitting due to the 5 separate folds of TT data used. SGD, which helps minimize cross entropy loss, increase computational efficiency and accelerate learning, was used [23][3].

IV. Results and Discussion

A comparative analysis between raw, pre-processed and ROI input modalities and the SVM and SoftMax classifiers using the ResNet-50 architecture was done. Multiple mini batch sizes, epochs and learning rates were experimented with to identify the best performing fusion. Through experimentation, the optimum combination of 10 mini batch size, 50 epochs and 1e-3 learning rate was ascertained. System performance markedly increased across all parameters on increasing TT images. This outcome was expected, as CNN performance is directly proportional to the mass of images supplied to the architecture. More images make way for better learning and feature extraction. It may be inferred from Table I that system accuracy was increased by 13.9 to 16% by pre-processing input images prior to classification.

Further, it may be noted that the smaller IDRiD (840) dataset achieved higher SN and F1 score as compared to the larger Kaggle (1000) dataset. This may be accounted for by the unruly nature of the latter. Kaggle comprises 35,000 plus images captured with varying fundus cameras, zooming scales, illumination, and FOV angles, Figure II [17]. Inhomogeneous quality and sub-optimal lighting, as compared to the uniform composition of the IDRiD dataset, Figure III, explain the higher system performance linked with the latter despite its smaller size. This finding reinforces the importance dataset quality and composition have on performance and on the eventual incorporation of a computer aided diagnostic DR grading system into the medical field.

The effect of segmentation was tested using only the Kaggle dataset. The overall system performance of the SoftMax classifier was lower than the system performance achieved using only PP images, at Acc = 0.89 and an F1-score of 0.72. Although performance was expected to improve as per the study executed by M. Mateen et al. [17], the replacement of EM as optimizer in place of the originally proposed ALR may account for the diminished performance.
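For concreteness, the feature extraction and classification tier of Section III could be sketched as follows; this is a Python approximation (the study itself was implemented in MATLAB), and the RBF kernel is an assumption since the SVM configuration is not specified.

# Minimal sketch: ResNet-50 as a fixed feature extractor, SVM grading on top,
# with the 5-fold cross validation mentioned in the text.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()         # drop the classification head, keep 2048-d features
resnet.eval()

preprocess = T.Compose([T.Resize((224, 224)),   # ResNet-50 standard input size
                        T.ToTensor(),
                        T.Normalize(mean=[0.485, 0.456, 0.406],
                                    std=[0.229, 0.224, 0.225])])

@torch.no_grad()
def extract_features(pil_images):       # RGB fundus images (already pre-processed)
    batch = torch.stack([preprocess(im) for im in pil_images])
    return resnet(batch).numpy()        # one 2048-d vector per image

def grade_with_svm(features, grades):   # grades: severity labels 0-4
    svm = SVC(kernel="rbf")
    scores = cross_val_score(svm, features, grades, cv=5)
    return svm.fit(features, grades), scores.mean()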
Figure II. Kaggle Dataset (no pre-processing): No DR, Mild NPDR, Mod NPDR, Severe NPDR and PDR (L-R).
Figure III. IDRiD Dataset (no pre-processing): No DR, Mild NPDR, Mod NPDR, Severe NPDR and PDR (L-R).
Table II. Comparison of proposed method with past works.
Author Methodology Data set Acc SN SP Precision Recall F1 Score
M. Shaban, [33] Custom CNN Kaggle 0.88 0.87 0.94 - - -
N. Khalifa, [35] Augmentation with AlexNet Kaggle 0.979 - - 0.962 0.954 0.9582
S. Pao, [19] Bichannel CNN Kaggle 0.878 0.778 0.938 - - -
Chen, [18] Inception V3 Kaggle - - - 0.76 0.80 0.77
P. Khojasteh, [28] ResNet-50 + SVM DIARETDB1 0.982 0.99 0.96 - - -
M. Mateen, [25] GMM ROI extraction, VGG-19 Kaggle 0.983 - - - - -
Wang, [32] Inception V3 Kaggle 0.632 - - - - -
Qummar, [34] Ensemble of ResNet-50, Inception V3, Xception, Dense 121 and Dense 169 Kaggle - - 0.851 0.634 0.65 0.322
Proposed ResNet-50 + SVM IDRiD 0.979 0.979 0.995 0.978 0.979 0.978
Proposed ResNet-50 + SVM Kaggle 0.984 0.961 0.99 0.956 0.961 0.96
enable quasi linear adaptation. Similarly, the SVM classifier comparison of overall performance was not done. Although
using the ROI extracted images fared very poorly. System the study done by P. Khojasteh, et.al. reported higher SN in
Acc peaked at 82% with F1 score of 52.2%. Segmentation of both datasets and higher Acc in one dataset, It was limited to
input images significantly reduced TT time as compared to HE classification and did not do DR classification or severity
raw or PP images. Although DR detection and diagnosis is grading.The proposed system fared better overall against
not time sensitive i.e., does not need to be done in real time, nearly all past works surveyed. Hence, the system has been
training of CNNs using PP images may take up to 8 hrs for successful in improving DR classification and detection
1000 images. TT over larger datasets will take longer and performance by incorporating past works and generating an
may prove troublesome. Hence, segmentation optimization evolutionary deep CAD system.
should be done as part of future work to enhance system
performance using ROI input using ALR. This will make way V. Conclusion
for smoother integration of the system in hospitals and care This study was formulated to tackle the rampant issue of
giving facilities. It was inferred that system performance had screening and grading Diabetic Retinopathy in large
a linear increase with PP and increase of input images. A 13.9 populations due to factors such as cost, scarcity of trained
to 16 % increase in system Acc was achieved using PP. The physicians or accessibility. Based on qualitative and
best performing duo was identified as the ResNet-50 architecture with the SVM classifier using only PP images, as seen in Table I. Classification using ResNet-50 with the SoftMax classifier on PP images peaked at an Acc of 93.3% with IDRiD and 90.4% with Kaggle, as opposed to 97.9% and 98.4% respectively using the SVM classifier. Segmentation led to degradation of system performance. This unanticipated dip in performance using ROI localization was due to the inability of the EM algorithm to segment images dynamically, unlike the originally proposed ALR algorithm.

System evaluation has been done by means of parametric comparison of operation indicators such as Acc, SN, SP, recall, precision, and F1 score. All past research considered for comparison accomplished DR grading using Kaggle, bar one of the literatures followed. In comparison to past works, the proposed system fared better than most (Table II). The precision reported by N. Khalifa et al. [27] was 0.6% higher than that of the proposed system, while their F1 score, recall, and Acc were lower. This study takes a novel hybrid approach by building on the work done by M. Mateen et al. [17] and P. Khojasteh et al. [20]. Although P. Khojasteh et al. [20] employed different datasets and were limited to HE detection, Table II presents the acquired performance metrics for the purpose of comparison. It may be seen that the proposed hybrid method led to enhanced operation compared to M. Mateen et al. [17]. In the case of P. Khojasteh et al. [20], SP metrics obtained by the proposed system were better than both reported specificities by 3 to 4.5%. Acc achieved on the DIARETDB1 dataset is improved by the proposed system on Kaggle but falls short by 0.3% on the IDRiD dataset. Acc on E-Optha is improved on both datasets utilized by the proposed system. Both SNs reported by the authors of [20] are higher than those of the proposed system by 0.1-2.9%. Neither of the studies provided F1 scores, hence direct quantitative analysis of previous works was not possible.
Conclusion

A novel hybrid automated deep learning algorithm capable of providing screening, grading, and preventive measures has been proposed. Due to its end-to-end nature, the system eradicates the need for trained ophthalmologists, thereby alleviating time and workload and seamlessly mitigating human error/bias, restricted access to diagnosis, and its high cost in the process. The adopted solution to the research question employed image processing and was developed using MATLAB. Multiple types of input images and classifiers were tested to find the superior performing system. The final system initiated with a rigorous pre-processing and augmentation module, followed by feature extraction using ResNet-50. Images were then classified using an SVM classifier. System construction was based on research and experimentation which yielded the supremacy of the CNN used. The contribution of this study was the robust pre-processing and augmentation module along with classification of DR using ResNet-50 and SVM, which has not been explored in any previous studies. This choice was attributed to the enhanced overall performance necessary for adequate DR classification despite the increased complexity constraints. The system was trained on both the IDRiD and Kaggle datasets using 1000 and 840 images, respectively. A GUI was designed to simulate incorporation of the system into health facilities. Simulation results inferred an F1 score, SN, SP, and Acc of 0.978, 0.979, 0.995, and 0.979, respectively. A 13.9% and 16% increase in system Acc was achieved on IDRiD and Kaggle, respectively, when the proposed pre-processing and augmentation was used. Acc and SP of the literature followed were improved by 0.1-0.8% and 3-4.5%, respectively. The final system led to increased performance across all parameters in comparison to almost all past works reviewed. Although best results were anticipated using segmented images, this was not the case due to the inadequacy of the employed EM algorithm in comparison to the originally proposed ALR. Further work may be done to test system performance using ALR in ROI segmentation and PCA/SVD algorithms for the enrichment of feature extraction. Robust feature extraction would supplement system generalizability to new datasets and images, this being crucial for successful system integration into a medical center. Editing the input layer size of the CNN structure, as opposed to resizing input images, would be a promising experiment. This may be done as a solution to the degraded image quality attained when image resizing is done to fit the CNN input frame size. Performance may be optimized using Twin SVMs due to their supremacy over traditional SVMs. As established inter-dataset variances are present, using multiple different datasets would help the system adjust to different FOV angles, illumination, etc. Finally, results showed system performance to be dependent upon the quality of images. Hence, better dataset compilation strategies and guidelines are essential to the development of a robust computer-aided DR detection and grading system.

References

[1] American Optometric Association, "Diabetic retinopathy." https://www.aoa.org/healthy-eyes/eye-and-vision-conditions/diabetic-retinopathy?sso=y
[2] S. Wild, G. Roglic, A. Green, R. Sicree, and H. King, "Global prevalence of diabetes: estimates for the year 2000 and projections for 2030," Diabetes Care, vol. 27, no. 5, pp. 1047-1053, May 2004, doi: 10.2337/diacare.27.5.1047.
[3] A. A. Alghadyan, "Diabetic retinopathy - An update," Saudi J. Ophthalmol., vol. 25, no. 2, pp. 99-111, Apr. 2011, doi: 10.1016/j.sjopt.2011.01.009.
[4] M. Mateen, J. Wen, M. Hassan, N. Nasrullah, S. Sun, and S. Hayat, "Automatic Detection of Diabetic Retinopathy: A Review on Datasets, Methods and Evaluation Metrics," IEEE Access, vol. 8, pp. 48784-48811, 2020, doi: 10.1109/ACCESS.2020.2980055.
[5] "Diabetic Retinopathy: A Position Statement by the American Diabetes Association | Diabetes Care." https://care.diabetesjournals.org/content/40/3/412
[6] "Early photocoagulation for diabetic retinopathy. ETDRS report number 9. Early Treatment Diabetic Retinopathy Study Research Group," Ophthalmology, vol. 98, no. 5 Suppl, pp. 766-785, May 1991.
[7] A. Z. Foeady, D. C. R. Novitasari, A. H. Asyhar, and M. Firmansjah, "Automated Diagnosis System of Diabetic Retinopathy Using GLCM Method and SVM Classifier," in 2018 5th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), Oct. 2018, pp. 154-160, doi: 10.1109/EECSI.2018.8752726.
[8] M. H. Ahmad Fadzil, N. F. Ngah, T. M. George, L. I. Izhar, H. Nugroho, and H. A. Nugroho, "Analysis of foveal avascular zone in colour fundus images for grading of diabetic retinopathy severity," in 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, Aug. 2010, pp. 5632-5635, doi: 10.1109/IEMBS.2010.5628041.
[9] N. R. Binti Sabri and H. B. Yazid, "Image Enhancement Methods For Fundus Retina Images," in 2018 IEEE Student Conference on Research and Development (SCOReD), Nov. 2018, pp. 1-6, doi: 10.1109/SCORED.2018.8711106.
[10] S. Pathan, P. Kumar, R. Pai, and S. V. Bhandary, "Automated detection of optic disc contours in fundus images using decision tree classifier," Biocybern. Biomed. Eng., vol. 40, no. 1, pp. 52-64, Jan. 2020, doi: 10.1016/j.bbe.2019.11.003.
[11] L. Qiao, Y. Zhu, and H. Zhou, "Diabetic Retinopathy Detection Using Prognosis of Microaneurysm and Early Diagnosis System for Non-Proliferative Diabetic Retinopathy Based on Deep Learning Algorithms," IEEE Access, vol. 8, pp. 104292-104302, 2020, doi: 10.1109/ACCESS.2020.2993937.
[12] H. Chen, X. Zeng, Y. Luo, and W. Ye, "Detection of Diabetic Retinopathy using Deep Neural Network," in 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), Nov. 2018, pp. 1-5, doi: 10.1109/ICDSP.2018.8631882.
[13] S.-I. Pao, H.-Z. Lin, K.-H. Chien, M.-C. Tai, J.-T. Chen, and G.-M. Lin, "Detection of Diabetic Retinopathy Using Bichannel Convolutional Neural Network," Journal of Ophthalmology, Jun. 20, 2020. https://www.hindawi.com/journals/joph/2020/9139713/
[14] H. Seyedarabi, S. H. A. Jahromi, A. Javadzadeh, and A. Momeni Pour, "Automatic Detection and Monitoring of Diabetic Retinopathy Using Efficient Convolutional Neural Networks and Contrast Limited Adaptive Histogram Equalization," IEEE Access, vol. 8, pp. 136668-136673, 2020, doi: 10.1109/ACCESS.2020.3005044.
[15] S. Roychowdhury, D. D. Koozekanani, and K. K. Parhi, "DREAM: Diabetic Retinopathy Analysis Using Machine Learning," IEEE J. Biomed. Health Inform., vol. 18, no. 5, pp. 1717-1728, Sep. 2014, doi: 10.1109/JBHI.2013.2294635.
[16] M. U. Akram, I. Jamal, A. Tariq, and J. Imtiaz, "Automated segmentation of blood vessels for detection of proliferative diabetic retinopathy," in Proceedings of 2012 IEEE-EMBS International Conference on Biomedical and Health Informatics, Jan. 2012, pp. 232-235.
[17] M. Mateen, J. Wen, Nasrullah, S. Song, and Z. Huang, "Fundus Image Classification Using VGG-19 Architecture with PCA and SVD," Symmetry, vol. 11, no. 1, Art. no. 1, Jan. 2019, doi: 10.3390/sym11010001.
[18] H. F. Jaafar, A. K. Nandi, and W. Al-Nuaimy, "Automated detection of red lesions from digital colour fundus photographs," in 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Aug. 2011, pp. 6232-6235, doi: 10.1109/IEMBS.2011.6091539.
[19] V. K. Sree and P. S. Rao, "Diagnosis of ophthalmologic disorders in retinal fundus images," in The Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Feb. 2014, pp. 131-136, doi: 10.1109/ICADIWT.2014.6814696.
[20] P. Khojasteh et al., "Exudate detection in fundus images using deeply-learnable features," Comput. Biol. Med., vol. 104, pp. 62-69, Jan. 2019, doi: 10.1016/j.compbiomed.2018.10.031.
[21] M. U. Akram, S. Khalid, A. Tariq, S. A. Khan, and F. Azam, "Detection and classification of retinal lesions for grading of diabetic retinopathy," Comput. Biol. Med., vol. 45, pp. 161-171, Feb. 2014, doi: 10.1016/j.compbiomed.2013.11.014.
[22] N. Eftekhari, H.-R. Pourreza, M. Masoudi, K. Ghiasi-Shirazi, and E. Saeedi, "Microaneurysm detection in fundus images using a two-step convolutional neural network," Biomed. Eng. OnLine, vol. 18, no. 1, p. 67, May 2019, doi: 10.1186/s12938-019-0675-9.
[23] M. Shaban et al., "A convolutional neural network for the screening and staging of diabetic retinopathy," PLoS ONE, vol. 15, no. 6, Jun. 2020, doi: 10.1371/journal.pone.0233514.
[24] J. Sahlsten et al., "Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading," Sci. Rep., vol. 9, no. 1, Art. no. 1, Jul. 2019, doi: 10.1038/s41598-019-47181-w.
[25] M. Ghazal, S. S. Ali, A. H. Mahmoud, A. M. Shalaby, and A. El-Baz, "Accurate Detection of Non-Proliferative Diabetic Retinopathy in Optical Coherence Tomography Images Using Convolutional Neural Networks," IEEE Access, vol. 8, pp. 34387-34397, 2020, doi: 10.1109/ACCESS.2020.2974158.
[26] S. Qummar et al., "A Deep Learning Ensemble Approach for Diabetic Retinopathy Detection," IEEE Access, vol. 7, pp. 150530-150539, 2019, doi: 10.1109/ACCESS.2019.2947484.
[27] N. E. M. Khalifa, M. Loey, M. H. N. Taha, and H. N. E. T. Mohamed, "Deep Transfer Learning Models for Medical Diabetic Retinopathy Detection," Acta Inform. Medica, vol. 27, no. 5, pp. 327-332, Dec. 2019, doi: 10.5455/aim.2019.27.327-332.
Studies on Increasing Energy Efficiency By
Modernization of a Single-Stage Type Turbo-Compressor
for Ammonia Combustion Air Process
Abstract— Today, due to global warming, security of supply, and rising prices, energy efficiency studies come to the fore. Energy efficiency also seems to be a way for industrial enterprises to be profitable. This study discusses the modernization of a turbo-compressor located within a dilute nitric acid production plant. This single-stage turbo-compressor sends ammonia combustion air into the reactor. As a result of the modernization, the energy efficiency of the system was analyzed. As a result of the modernization studies, the speed of the turbo-compressor and the amount of air have been increased. Thus, the acid production amount was raised in the facility.

Keywords— turbo compressor, modernization, nitric acid production, energy efficiency

I. Introduction

Energy efficiency is a set of interdisciplinary strategic activities that complement and support national strategic objectives. Significantly reducing the burden of energy costs, ensuring supply security in energy, reducing foreign dependency, applying low-carbon technologies, protecting the environment, using domestic energy potential, and ensuring its sustainability are the primary targets. In addition, it is possible to define energy efficiency as providing the same production capacity using less energy. In other words, energy efficiency is the reduction of the amount of energy consumed without affecting economic development and social welfare and without reducing the amount of output and quality in production. Therefore, savings are the most important factors that stand out in terms of energy efficiency. Here, saving intends to minimize energy consumption without hindering economic development and the standard of living.

To make progress in energy efficiency, the industrial sector must be considered a priority. When examining the countries that stand out in energy efficiency worldwide, their improvement in the industrial sector is noticeable. The most energy-intensive sectors in Turkey are industry, electricity generation, transportation, and housing. Therefore, these sectors also stand out in terms of energy efficiency. When the sectors are evaluated separately, it is seen that a large number of regulations related to energy efficiency have been made in Turkey, especially in the industrial field. Many energy conversion systems are used in the production, transmission, and storage of energy. Because compressed air is convenient and safe, it is widely used as a power source in control valves, air motors, air guns for cleaning purposes, and many more applications. Compressed air systems have a low power-to-weight ratio and a high power density. Being resistant to explosions and overloads and not being affected by temperature, humidity, dust, or electromagnetic noise are essential features. In addition, they are preferred by many businesses because they are easy to maintain and the air can be carried over a long distance [1,2].

Turbo compressors are dynamic compressors that are commonly used to compress air and gas. These machines create pressure according to the dynamic principle; this means that the pressure increase is provided (using the air velocity) without any mechanical volume contraction (sliding), as in the operation of positive displacement compressors. In turbo compressors, the element that rotates at high speed to push the air (gas) is called an impeller or turbofan. There is no piston or other type of mechanical driving or compression element between the air inlet and the air outlet of the turbocharger [3,4]. Instead, the turbo compressor sucks the air in from the suction port (at the center), and the impeller blades rotating at high speed create a centrifugal force and blow the air from the inside out (around the circumference). Therefore, turbo compressors are also called centrifugal or even aerodynamic compressors. A radial discharge flow characterizes centrifugal compressors. Air is sucked towards the center of a rotating impeller (turbine) with radial blades and is pressed against the circumference of the impeller by centrifugal forces. This circumferential (radial) movement of air causes both a pressure increase and kinetic energy generation. Before the air is directed to the turbine center of the following compression stage, it passes through a diffuser and spiral, during which kinetic energy is converted into pressure. Efficiency analysis is critical because these systems are energy-intensive [5,6].

Modernization is essential for the continuity of production and energy efficiency today, because modernization enables investments to be made in the production lines of existing facilities, including adding suitable parts to machinery and equipment that have completed their technical and/or economic life, replacing existing machinery and equipment with new ones, completing missing parts in the facility, and directly raising the quality of the final product or changing its model.

In this study, the modernization of the turbo-compressor system in a dilute nitric acid production facility is discussed. To produce acid in this facility, it is necessary to obtain a mixture of ammonia and air as required by the process. Therefore, the plant capacity is directly dependent on the amount of air produced by this turbo-compressor. The turbo air compressor is operated by a steam turbine and an auxiliary gas expansion turbine. With the modernization works, it is aimed to increase the amount of acid production by increasing the speed of the turbo-compressor and the amount of air.
II. The catalytic ammonia oxidation process

In nitric acid production, first ammonia is oxidized to produce nitric oxide (nitrogen monoxide, NO); then NO is oxidized one more step to become NO2 (nitrogen dioxide). In the final stage, NO2 gas is absorbed in water, and nitric acid (HNO3) is obtained. In the basic production technology, known as the Ostwald process, the reactions between ammonia and oxygen proceed in stages, and further nitrogen oxides are formed between the steps, depending on the pressure and temperature conditions. Many unit processes and catalytic processes are applied to realize this simple reaction chain in production facilities. The general flow chart of the nitric acid production process is shown in Figure I.

Nitric acid plants are classified according to whether the same or different pressure levels are used in the reactors in the two separate oxidation stages. Some technologies have the same pressure in both stages; these are single-pressure systems. Others operate at two different pressures; these are dual-pressure systems. Whether the pressure levels used are low, medium, or high also significantly affects the result. The most commonly used are medium- and high-pressure dual systems. High-pressure single systems are also widely used. This classification is essential in terms of emissions. The nitric acid production steps are shown in Figure II.
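For reference, the reaction chain summarized above is the standard Ostwald process chemistry; the balanced equations (our addition, not given explicitly in the original text) are:

$$4\,\mathrm{NH_3} + 5\,\mathrm{O_2} \rightarrow 4\,\mathrm{NO} + 6\,\mathrm{H_2O}$$
$$2\,\mathrm{NO} + \mathrm{O_2} \rightarrow 2\,\mathrm{NO_2}$$
$$3\,\mathrm{NO_2} + \mathrm{H_2O} \rightarrow 2\,\mathrm{HNO_3} + \mathrm{NO}$$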
Figure II. Nitric acid production steps: gas pressurization and cooling; cooling of gases; absorption of the NO2 gas to be converted into nitric acid; waste gas (tail gas filtration) operations; steam generation, energy recovery, and sending to the steam turbine
III. Results and recommendations

The raw materials used in the production of nitric acid are water, air, and ammonia. Ammonia is first gasified with water in the ammonia gasifier and then comes to the ammonia superheaters to be heated with hot air. The heated gaseous ammonia then comes to the ammonia incinerators. The air required for the combustion of ammonia is sucked in from the atmosphere by the turbo-compressor. The turbo-compressor is operated by the steam turbine and by the rest-gas expansion turbine.

Air leaks are one of the most important causes of unnecessary energy consumption in a compressed air system. Necessary preliminary leak studies (leak detection and repair) and continuous system performance reviews ensure that almost all air production in the compressed air system is converted into useful work. As a result, system energy consumption is realized only in line with operational needs. In addition, since the drops that may occur in the system outlet pressure due to leaks are prevented, inefficient operation of the equipment at the end-use points is controlled, and the continuation of the production processes is ensured [7,8]. First of all, to increase the efficiency of the turbo-compressor operating in the facility, modernization of the sealing system of the rotor is envisaged. For this purpose, new fixed sealing parts were manufactured. The before and after view of the modernized sealing system of the turbo-compressor is shown in Figure III. In addition, the manufacturing drawings of the newly designed sealing elements are shown in Figure IV.

Figure III. View of the sealing elements before and after modernization
Figure IV. Manufacturing drawings of the existing-design (old) and new-design sealing elements
When compressor air leaks are reduced, compressor pressure and flow increase; when the flow rate increases, the ammonia/air percentage composition is preserved in the ammonia incinerators, and more airflow is fed for the same ammonia flow rate. Therefore, the ammonia and oxygen reaction conversion rate increases, and more nitrogen monoxide (NO gas) is produced. As more NO gas directly affects the oxidation and absorption reactions in the other steps, nitric acid production also increases. Therefore, the compressor outlet pressure is one of the most critical factors affecting the efficiency of the compressor. Turbo compressors are formed with impellers arranged on a single shaft, and they are efficient when the capacity is high.

For example, in screw compressors, although the pressure chamber and the air suction chamber are not separated by definite boundaries, air leaks are still very low; these are high-speed compressors. In low-capacity compressors, the yield decrease in the final stages is unacceptably large. Producing a higher outlet pressure means extra energy consumption by the compressor, that is, additional operating cost. Where pressure levels have been accidentally set higher than necessary, the compressor control set values (minimum and maximum) can be re-examined and gradually reduced to the required levels, taking care not to damage the sensitive equipment used in the business [9,10].
However, most of the time, the compressor outlet pressure level is increased to compensate for the pressure losses between the compressor and critical end-use points in the system. These losses, which create the need to use compressed air as if it were artificial equipment in the compressed air system, cause low system performance and unnecessary energy consumption by the compressor. In compressed air systems with any airflow-restricting factor, it is imperative to increase the system pressure to achieve the pre-calculated flow capacity.

If the pressure losses from the compressed air outlet to the end-use point in a compressed air system do not exceed 10% of the compressor outlet pressure, it can be said that this system is adequately designed. Since the inlet conditions are known during these calculations, it is easy to design the suction edge. However, in design calculations based on the pressure-edge dimensionless mass flow parameter, the losses must somehow be included in the calculations. The view of the turbo-compressor before and after rotor maintenance, coating, and high-speed balancing is shown in Figure V.
Figure V. Before and after rotor maintenance, coating process and balancing adjustments.
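Written compactly (in our notation, not the paper's), the adequacy criterion stated above is

$$\frac{p_{\mathrm{out}} - p_{\mathrm{end}}}{p_{\mathrm{out}}} \le 0.10,$$

where $p_{\mathrm{out}}$ is the compressor outlet pressure and $p_{\mathrm{end}}$ is the pressure at the critical end-use point.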
As a result of modernization, the compressor pumps air at a higher flow rate than before, and thus the amount of acid produced increases. Likewise, an increase in airflow and acid production was achieved according to the design values. The test results before the modernization of the compressor are given in Table I, and the test results after the modernization are shown in Table II.

Table I. Compressor test results before modernization

Speed (rpm)   Air (m3/h)   Temperature (°C)   Production, 100% acid (ton/day)
5250          89375        22                 590
5250          87690        21                 580
5250          87550        22                 585
5250          87185        22                 580
5250          87760        22                 585
5250          87720        20                 585
5250          88685        18                 590
5250          87995        21                 585

The pressure ratio produced by the impellers is proportional to the square of the operating speed. Therefore, unshrouded (non-bandaged) impellers can achieve much higher pressure ratios than shrouded (bandaged) impellers. However, non-bandaged impellers tend to be less efficient due to the high losses associated with wingtip leakage flow; in a bandaged impeller, there is no tip leakage. Therefore, while the centrifugal compressor is pre-designed, the head, flow, and speed are taken as a basis.

Table II. Compressor test results after modernization

Speed (rpm)   Air (m3/h)   Temperature (°C)   Production, 100% acid (ton/day)
5244          97358        18                 680
5252          97429        19                 670
5246          96830        18                 670
5250          97613        17                 670
5241          97337        18                 674
5247          97010        19                 674
5246          96565        20                 673
5247          97163        18                 673

An unbalanced rotor will vibrate at the frequency of the shaft rotation speed due to the centrifugal force of the unbalanced mass. Therefore, a machine with an unbalance condition is expected to produce a sinusoidal wave and a corresponding dominant peak in the spectrum at the shaft rotation speed. Turbo-compressor balance measurement values are given in Table III.

Table III. Turbo-compressor balance measurement values

Speed (rpm)   Pedestal 1 V (mm/s, rms)   Pedestal 2 V (mm/s, rms)
2200          0.084 / 246                0.055 / 226
2700          0.053 / 112                0.084 / 114
4900          0.060 / 77.9               0.132 / 54.1
5600          0.278 / 101                0.151 / 26.9
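As a rough cross-check of Tables I and II, the tabulated test runs can be averaged directly. The short Python sketch below (our illustration, not part of the original study) computes the mean air flow and acid production before and after modernization, giving roughly a 10% flow increase and a 15% production increase.

# Cross-check of Tables I and II: average air flow (m^3/h) and
# 100% acid production (ton/day) before and after modernization.
flow_before = [89375, 87690, 87550, 87185, 87760, 87720, 88685, 87995]
prod_before = [590, 580, 585, 580, 585, 585, 590, 585]
flow_after = [97358, 97429, 96830, 97613, 97337, 97010, 96565, 97163]
prod_after = [680, 670, 670, 670, 674, 674, 673, 673]

def mean(xs):
    return sum(xs) / len(xs)

flow_gain = (mean(flow_after) - mean(flow_before)) / mean(flow_before)
prod_gain = (mean(prod_after) - mean(prod_before)) / mean(prod_before)
print(f"Air flow:   {mean(flow_before):.0f} -> {mean(flow_after):.0f} m^3/h (+{flow_gain:.1%})")
print(f"Production: {mean(prod_before):.0f} -> {mean(prod_after):.0f} t/day (+{prod_gain:.1%})")
# Output: roughly +10.4% air flow and +15.0% acid production on average.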
Turbo-compressors are continuous-service compressors; with the advantage of having very few moving parts, they are suitable especially for applications where high airflow is required and mainly where oil-free air is required. However, unbalance forces put pressure on bearings and seals, exacerbate looseness problems, and can trigger resonances. The force created by an unbalance weight is related to the square of the velocity, so high-speed machines can generate enormous unbalance forces and therefore cannot be allowed to go out of balance; if they do, inevitable damage will follow. The balance report chart of the turbo-compressor is given in Figure VI.

References

[1] Zhang Chaowei, Dong Xuezhi, Liu Xiyang, Sun Zhigang, Wu Shixun, Gao Qing, and Tan Chunqing, "A method to select loss correlations for centrifugal compressor performance prediction," Aerospace Science and Technology, vol. 93, 2019.
[2] Xiao He and Xinqian Zheng, "Flow instability evolution in high pressure ratio centrifugal compressor with vaned diffuser," Experimental Thermal and Fluid Science, vol. 98, 2018.
Prediction of cover bunch quality after harvesting
period of Tunisian date palm
Wafa Guedri, Laboratory of Textile Engineering, University of Monastir, Monastir, Tunisia, wafa.guedri@gmail.com
Mounir Jaouadi, Higher Institute of Technology Studies ISET (Ksar Hellal), Monastir, Tunisia, jy.mounir@gmail.com
Slah Msahli, Higher Institute of Technology Studies ISET (Ksar Hellal), Monastir, Tunisia, slah.msahli@gmail.com
Abstract— Bagging date fruit is a necessary practice in date palm cultivation. It is used in the date crop to protect fruits from humidity, rain, and insects. At present, many studies focus on improving its performance. In this work, a new method is proposed intending to solve this multi-criterion phenomenon. A practical mathematical tool, named the desirability function, allowing the prediction of the cover bunch quality, is developed. This approach permits defining the global quality of the bag through a global quality index. In this work, five bagging products are selected to be inspected, used for the first time to protect date fruit against the carob moth and rain. The results allowed the identification of the ideal cover for farmers, satisfying their needs with the highest properties.

Keywords— Desirability, quality, date fruit, nonwoven, satisfaction

I. Introduction

Date palm is the most significant fruit crop grown in arid and semi-arid regions of the Middle East and North Africa. In Tunisia, date palm is the major factor of oasis farming. It represents the main financial resource of farmers, since date fruits are used for food or other commercial purposes [1]. For instance, about 10% of the Tunisian population is dependent on date palm and its related crops [2]. However, this important crop is actually endangered by several factors. A majority of the date cultivars are susceptible to pests, mainly the carob moth, or yield poorly because of autumnal rain, sunburn, and wind. These factors have led to reduced date quality, which menaces the physical aspect of the date fruit. Hence, it is very important to elaborate a strategy intended to improve the quality of the Tunisian date fruit. The covering procedure of the date bunch is regarded as the most important practice for the date palm to obtain dates with better quality and an economical yield. It offers numerous advantages and is used in date fruit cultivation in order to protect dates from rain, high humidity, birds, and insects [3]. Different covers exist for date fruit protection [4].

The requirements for the cover date bunch were mentioned by D. E. Bliss in his study [5]. His study concluded that the best cover must be water-resistant during heavy rain and allow maximum aeration, because the vapor transpired by the date surface is imprisoned by the bag and leads to water injury and infection. However, there are few works evaluating the value of the bag used for protecting date fruit [5]. The first work on the measurement of cover bunch features was described by Denis [6]. He planned the first cover in the shape of a cloth bag based on a flexible woven textile permitting protection from wind, insects, and heavy rain. In fact, it is important for Tunisian farmers to be sure that the cover bunch will keep its waterproofing and retain its strength during the maturity period.

Previous works confirmed that using traditional covers like plastic film, mosquito net, and kraft paper had a prominent effect on the quality and yield of date fruit [5]. The ideal bag requires a compromise between several requirements of Tunisian farmers for efficient date protection. In fact, the satisfaction of Tunisian farmers is a phenomenon that requires satisfaction of a set of features during the harvesting period.

In this work, we propose a mathematical approach in order to estimate the Tunisian farmers' satisfaction using an index named the global quality index "QI". The desirability functions are used to develop this index. Modeling this satisfaction allows developing the ideal choice of cover date bunch.

II. Material and methods

A. Selected samples

During the 2016-2017 seasons, three mature date palm trees, which received the same cultural practices at the experimental plot of the Arid Institute, "Atilaat" (Jemna, Kébili), were selected for this experiment. The number of bunches was adjusted to six per tree. The covers were as follows: bunch cover with nonwovens N1 (polypropylene, high weight), N2 (polypropylene, low weight), N3 (polyester, high weight), N4 (polyester, medium weight), and N5 (polyester, low weight), plus no bagging (control). Bunch covering with the different products (Table I) took place three months before the collection and continued during the full maturity period.

Table I. Properties of the selected nonwoven samples

Properties     N1              N2              N3           N4           N5
Supplier       German          German          German       German       German
Composition    Polypropylene   Polypropylene   Polyester    Polyester    Polyester
G (g/m2)       60.3            41.36           75           50.2         34.9
T (mm)         0.44            0.38            0.34         0.31         0.17

G - mass per surface unit; T - thickness of the bags

Five nonwoven samples were selected from the textiles offered by the Freudenberg Group, and plastic film and mosquito net from GIFruit-Tunisia. Figure I shows the five samples of covers chosen to study the protection efficiency of date fruits against rain and the carob moth.
Figure I. Images of the used bagging samples: (a) N1; (b) N2; (c) N3; (d) N4; (e) N5

B. Development of quality index

Development of the nonwoven covers quality index is divided into four steps:
1. Choice of properties
2. Property conversion to a common value
3. Attribution of weightings
4. Aggregation of indices

With a focal point on developing the global quality index QI, we have pursued the procedure presented in Figure II.

Figure II. Design of the quality index QI

B.1. Choice of parameters

The arrangement of the quality of the cover date bunch is hard for all the features that define these products. It is then necessary to select a set of parameters that together reflect the global quality of the bag for a particular use [9]. The selection of the properties is accompanied by an uncertainty and subjectivity which are inherent in any such index. The selection of properties is based on data collected from Tunisian farmers, organizations, and researchers concerned with the protection of dates.

In this work, the quality of the cover date bunch is characterized by multiple parameters expressing the ability of these bags to satisfy. There are many physical and mechanical properties that influence this quality. The number of required parameters is fixed for all bagging products. There is a set of parameters that the covers must have to make them suitable for the protection of the date bunch. The cover date bunch must be flexible and maintain the movement of the date regime, so tearing is one of the most important parameters for any type of covering product.

Since a cover date bunch is mainly used to keep out insects, the bag must have suitable dimensions to surround the entire bunch. Textile strength is a key property related to fabric durability. Since the cover bunch is subjected to multidirectional forces, the tensile and tear strengths are used to determine the strength of the product. The properties of permeability and resistance to water penetration are quality parameters that establish the effectiveness of the bag and the preservation of the quality of the date fruit.

The properties of the cover date bunch are expressed in different units. In fact, levels can differ from one property to another. A classification of cover quality is intended to be formulated; all properties must be converted to a common value, the quality index "QI". The mathematical equations that convert the values of the properties into indices are made according to the desirability function. An index will be on a scale of 0 to 1 [10]. If the cover quality properties meet the set specification values, a value of 1 is assigned [11].

In the equations' development, linear functions are used as a correlation between cover properties and the quality of the cover. In order to convert cover quality properties to a common value, the lower and upper limits of each property must be identified. The upper and lower limits used for the bag quality properties are based on theoretical knowledge, the specifications, and the opinions of the experts who participated in the survey.

In this approach, the quality properties of the cover date bunch are divided into three mathematical functions, named the Derringer and Suich desirability functions.

The first functions are monotonous and can formalize a preference in terms of decrease or increase. Harrington [11] calls them one-sided. A one-sided desirability characteristic has three parameters. We take the example of an increasing function. They are each parameterized by two points, named the lower limit Yl (the upper limit Yu for a decreasing function) and the target value YT. The first, Yl, indicates the lower region at which the values of yj are considered unacceptable; the second, YT, designates the area of satisfaction.
Finally, the parameter s makes it possible to define the curvature of the function; the linear function corresponds to s = 1. Depending on the s value, the curvature can be oriented.

We employ the desirability function to maximize a property, where dj is calculated according to Equation 1 [12]:

$$d_j = \begin{cases} 0 & \text{if } y_j \le Y_L \\ \left(\dfrac{y_j - Y_L}{Y_T - Y_L}\right)^{s_j} & \text{if } Y_L < y_j < Y_T \\ 1 & \text{if } y_j \ge Y_T \end{cases} \quad (1)$$

We use Equation 2 to minimize a property:

$$d_j = \begin{cases} 1 & \text{if } y_j \le Y_T \\ \left(\dfrac{y_j - Y_U}{Y_T - Y_U}\right)^{s_j} & \text{if } Y_T < y_j < Y_U \\ 0 & \text{if } y_j \ge Y_U \end{cases} \quad (2)$$

The last function proposed by Harrington [12] is called two-sided and permits targeting a value for yj, or specifying a satisfaction region. It is situated between two points, the lower Yl and the upper Yu. In addition, the YT value is used to set the incline of the function, where dj is calculated according to Equation 3:

$$d_j = \begin{cases} 0 & \text{if } y_j \le Y_L \\ \left(\dfrac{y_j - Y_L}{Y_T - Y_L}\right)^{s_j} & \text{if } Y_L < y_j \le Y_T \\ \left(\dfrac{y_j - Y_U}{Y_T - Y_U}\right)^{s_j} & \text{if } Y_T < y_j < Y_U \\ 0 & \text{if } y_j \ge Y_U \end{cases} \quad (3)$$

B.3. Attribution of weightings

After the checking of the properties, weightings are assigned to each property specifying its importance in the global quality of the cover date bunch. The range of values for these weights varies from zero to ten. The mean response is employed to compare the importance of each property. The value is calculated according to Equation 4 [13]:

$$M = \frac{O \times V}{N} \quad (4)$$

where M is the mean response, O the occurrence, V the market value, and N the number of responses.

The requirement degree of the Tunisian farmers is noted sj and calculated using Equation 5. We consider that if sj = 1, the requirement degree is average. On the other hand, if sj > 1, we believe that the requirement of the farmer is too high, and if sj < 1, we can consider that the farmer's requirement is too low [15]:

$$s_j = \frac{RV}{SL} \quad (5)$$

where RV is the average recorded value and SL the average scale value.

In Equation 6, wj is the relative importance related to the response j. The relative importance wj is a comparative level for weighting the individual desirability dj in the global desirability of the cover, and it varies from the least important (wj = 1) to the most important (wj = 5) [14]:

$$w_j = \frac{RV}{10} \quad (6)$$

B.4. Aggregation of indices

The presence of different criteria and the absence of a relationship between the solutions guide us to the use of strategies that consider these particularities. The method we study in this work consists in transforming the multi-criteria difficulty into a single-criterion problem. Geometric mean aggregation (Equation 7) and arithmetic mean aggregation (Equation 8) are used to join the individual desirabilities di into an overall desirability Dg calculated for each group of properties. The weight of each property is primordial in the determination of Dg. Finally, the different Dg values are collected together into a degree of global satisfaction defined by QI, using the desirability function of Derringer and Suich [16]:

$$D_g = \left(d_1^{w_1} \cdot d_2^{w_2} \cdot d_3^{w_3} \cdots d_n^{w_n}\right)^{1/\sum_j w_j} \quad (7)$$

$$D_g = \frac{w_1 d_1 + w_2 d_2 + w_3 d_3 + \cdots + w_n d_n}{\sum_j w_j} \quad (8)$$

The quality index values QI vary between 0 and 1. The value 0 designates no satisfaction at all, and the value 1 corresponds to an entire satisfaction with the overall quality of the cover bunch.
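To make the conversion and aggregation steps concrete, the following Python sketch implements the one-sided increasing desirability of Equation 1 and the weighted geometric aggregation of Equation 7. It is our illustration only; the property values, limits, and weights below are hypothetical, not measurements from the study.

# One-sided increasing desirability (Equation 1) and weighted
# geometric-mean aggregation (Equation 7).
def desirability_increasing(y, y_l, y_t, s=1.0):
    """d_j = 0 below Y_L, 1 above Y_T, power-law ramp in between."""
    if y <= y_l:
        return 0.0
    if y >= y_t:
        return 1.0
    return ((y - y_l) / (y_t - y_l)) ** s

def aggregate_geometric(ds, ws):
    """D_g = (prod of d_j^w_j) ** (1 / sum of w_j)."""
    total_w = sum(ws)
    prod = 1.0
    for d, w in zip(ds, ws):
        prod *= d ** w
    return prod ** (1.0 / total_w)

# Hypothetical example: three properties of one cover sample (made-up numbers).
values = [35.0, 0.8, 120.0]                          # measured property values
limits = [(20.0, 50.0), (0.2, 1.0), (80.0, 200.0)]   # (Y_L, Y_T) per property
weights = [0.5, 0.3, 0.9]                            # w_j-style weights

ds = [desirability_increasing(y, yl, yt) for y, (yl, yt) in zip(values, limits)]
QI = aggregate_geometric(ds, weights)
print(f"individual desirabilities: {[round(d, 3) for d in ds]}, QI = {QI:.3f}")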
III. Results and discussions

When the bag is detached from the date palm, some properties must be measured. A full description of the analytical methods employed is given to study the status of the removed covers.

A. Cover bunch features after the bagging period

Experimentation on the different covers used is carried out at the TTS laboratory in Moknines. The requirements of the covers must be related to the application on the date palm. Experimental results presented in the form of pictures indicate a lack of tear resistance of the N1 and N2 samples. The literature indicated that the problem was due to the low UV resistance of polypropylene. Figure III shows that the covering period caused significant destruction of the N1 material, while Figure IV shows partial tears of the N2 material. In contrast to the N1 and N2 materials, the N4 material retained its strength for 3 months.

Figure III. Tears of the N1 material after the bagging period

Figure IV. Tears of the N2 material after the bagging period

The global desirability function of Derringer and Suich is used to calculate the QI, whose values are shown in Table II.

Table II. Quality index values

Samples           N1      N2      N3      N4      N5
Dg (structure)    0.42    0.12    0.25    0.56    0.43
Dg (resistance)   0.1     0       0.11    0.32    0.18
Dg (protection)   0.18    0.27    0.05    0.3     0.09
QI                0.007   0       0.001   0.05    0.007

Evaluation of the covering product qualities by Tunisian farmers has been performed in order to determine the farmers' requirement degree and weight for each cover bunch property. Measured properties have been transformed into individual satisfaction degrees, and the global desirability index was determined for the appreciation of the farmers' satisfaction. The quality indices of all samples were determined to assess the ability of these materials to protect against violent wind and rain and to limit the dust observed for the N1 and N2 materials. In the case of N1, N2, and N3, their poor resistance after the period of bagging deeply depreciated their overall quality.

IV. Conclusion

In this work, we have examined the multi-criterion phenomenon of the cover date bunch, which during the maturity period involves different properties simultaneously. A desirability method for appropriate nonwoven cover selection based on the Derringer and Suich function was presented. This approach is suitable for minimizing, maximizing, and achieving target values of several objective functions at the same time. It is used to convert the different outputs into one output that is the set of the individual outputs, while assigning different weights according to the importance of each property in the studied case. This method has been applied to reduce the different requirements affecting the quality of the cover bunch to one index representing the global quality of the nonwoven bags, varying between 0 and 1. This process expressed the quality properties in terms of the ideal output responses. The considered quality indices have revealed that an enhancement in permeability performance guarantees an increase in product satisfaction.

References

[2] …, "from date palm as efficient anode materials for sodium-ion batteries," Carbon, vol. 146, p. 844, 2018.
[3] S. Sakin, "Analytical methods applied to the chemical characterization and classification of palm dates (Phoenix dactylifera L.) from Elche's Palm Grove," University of Alicante, Alicante, 2013.
[4] S. Amroune et al., "Investigation of the date palm fiber for composites reinforcement: thermo-physical and mechanical properties of the fiber," Journal of Natural Fibers, 2019.
[5] W. Guedri, M. Jaouadi, and S. Msahli, "Evaluating farmer's satisfaction of different agrotextile cover bunch using desirability function," 2021.
[6] S. P. Denis, "Means and method for protecting Deglet Noor Dates," Patent US2001051240, 2001.
[7] M. Selmane, doctoral thesis in animal ecology, Faculty of Sciences, Department of Biology, Annaba, 2015.
[8] D. Zohary, Domestication of Plants in the Old World: The Origin and Spread of Domesticated Plants in Southwest Asia, Europe, and the Mediterranean Basin, Oxford, 2012.
[9] J. Smartt, Evolution of Crop Plants, 2nd edition, 1995.
[10] P. Munier, Le palmier dattier, coll. Techniques agricoles et productions tropicales, France: Ed. G. Maisonneuve et Larose, XXIV, p. 221, 1973.
[11] A. Hadj Taieb, S. Msahli, and F. Sakli, "Optimization of the Knitted Fabric Quality by using Multicriteria Phenomenon tools," International Journal of Fiber and Textile Research, vol. 3, no. 4, pp. 66-77, 2013.
[12] A. M. C. Balasooriya, "Development of a comprehensive fabric quality grading system for selected end uses," National Engineering Conference, 19th ERU Symposium, pp. 33-37, 2013.
[13] A. M. Shravan Kumar Gupta, "Optimization of durability of Persian hand-knotted wool carpets by using desirability functions," Textile Research Journal, pp. 1-10, 2016.
[14] F. Dabbebi and S. B. Abdessalem, "New approach for appreciating the surgeon's satisfaction of braided sutures," Journal of Industrial Textiles, pp. 1-23, 2015.
[15] S. Zehdi, "Analysis of Tunisian date palm germplasm using simple sequence repeat primers," African Journal of Biotechnology, vol. 3, no. 4, pp. 215-219, 2004.
[16] S. Rhouma, "Genetic diversity in ecotypes of Tunisian date palm (Phoenix dactylifera L.) assessed by AFLP markers," The Journal of Horticultural Science and Biotechnology, vol. 82, no. 6, pp. 929-933, 2007.
[17] D. Mostafa, "Effect of Bunch Bagging on Yield and Fruit Quality of Seewy Date Palm under New Valley Conditions (Egypt)," Middle East Journal of Agriculture Research, vol. 3, no. 3, pp. 517-521, 2014.
[18] W. Guedri, M. Jaouadi, and S. Msahli, "New approach for modeling the quality of the bagging date using desirability functions," Textile Research Journal, vol. 86, no. 19, pp. 2106-2116, 2016.
Analysis of the Waterfront Transformation of the ‘Plazh’ Area
of the City of Durres, Albania.
Anna Yunitsyna, Department of Architecture, Epoka University, Tirana, Albania, ayunitsyna@epoka.edu.al
Mirela Hasanbashaj, Department of Architecture, Epoka University, Tirana, Albania, mhasanbashaj10@epoka.edu.al
Abstract— In many waterside cities, coastline urbanization is an important issue for architects and urban planners. Albania has been unsuccessful in developing a balance between the public demand for access to the coastline and the land use. The nature of the coastline has changed because of demographic, hasty urban, and economic growth. This process took place without taking into consideration the social and cultural values of the coastline. Massive construction has damaged the sewage system, causing continuous flooding, and has reduced the coastline greenery. This study investigates the transformation of the seaside of the so-called 'Plazh' area during the last 20 years. The study is divided into 5-year periods, in order to have a more complete overview of the urban changes. The types of buildings included in the study are residential and commercial. This research uses information collected from site surveys, observations, and archives. The study presents a report explaining the site problems and the transformation of the coastal line of Durrës over the years, a comparison between the master plans and proposed regulatory plans, and an evaluation of the actual situation.

Keywords— coastline, urbanization, land use, height limit, urban regeneration, waterfront

I. Introduction

The city of Durrës is a coastal town which lies in the western part of the region. It holds the largest port in Albania, which makes it an important landmark for tourism. It is the second largest city from the politics, economy, administration, education, and culture standpoint. Durrës itself holds good opportunities, lying in a strategic position between the north and south of Albania. It is the trade center of intersections and has won important functions and intensive territory usage, becoming one of the most developed regions of the country. The purpose of this work is to track the transformation of the coastline of the city of Durrës and to highlight the urban problems that emerged during the last 20 years. The demand, inspiration, and decisions of the investors and owners regarding their private property have played an important role in the transformation of the city and its coastline in terms of land use and urban planning. The study is focused on the analysis of the so-called 'Plazh' area using the building location, building height, and land use criteria. The study examines the buildings of the post-socialist period, specifically through 1990-2015. The coastal line has a combination of elements such as residential housing, high-rise buildings, clubs, restaurants, shops, etc. Redeveloping the waterfront means re-utilizing the urban lands that were left behind through the years.

The major problem comes from the reclaimed land, which changes the relationship between the city and the waterfront. In most of the cases, the lands that were reclaimed have been restricted to the general public by the private users. Durrës has evolved with land reclamation, but such development has had a negative result on the urban waterfront development compared with the rest of the region and Europe [1]. Land reclamation in Albania has led to a planning policy intended to provide economical and suitable land for building. The government is seeking to accommodate public projects, whereas the private owners look for land for development. In the past years, people in Durrës have expressed the frustration that they have with the inaccessibility of the sea. The political and economic changes that occurred after the 90s were followed by large flows of migration from rural areas to urban areas. The urban development of the territory was largely the product of free initiative and spontaneous development, reflected in the creation of the first informal neighborhoods. All the economic activity was concentrated in the central and coastal areas of Albania, but local authorities were unable to provide appropriate urban regulatory plans and to control the process of construction [2].

II. Waterfront Development Principles

In ancient times, societies used to live in waterfront areas, such as next to the Tigris, Nile, and Euphrates. In order to sustain life and satisfy the biological dependency on water, humans historically needed to locate near fresh water [3]. Waterfront means 'the part of a town or city adjoining a river, lake, harbor, etc.' according to the Oxford American Dictionary of Current English [4]. 'Waterfront is the urban area in direct contact with water' [5]. According to Moretti, port activities and infrastructures occupy waterfront areas. The waterfront is defined as an interaction area between water and urban development [6]. Although in the vocabulary the meaning of waterfront is clear, in the literature it is referred to using different words instead of the term waterfront, such as: riverside, river edge, water edge, riverfront, city port, and harbor front [4], [7]. The waterfront identifies the water's edge in urban areas [8]. The water body may be 'a creek or canal, river, lake, ocean, bay,' or even artificial [9]. To sum up, the waterfront area is a confluence area of water and land. It is the edge of the land and also the edge of the water. The waterfront is found as a continuous process in most places where settlement and water are juxtaposed, whether or not a commercial port activity is or was present. It is an area that has a high density of elements and activities that affect each other. In the geographical aspect, the urban landscape is a synthesis of climate, soil, biology, and physiognomy. An integrated landscape is a combination of the natural landscape and artificial landscapes, including architecture, streets, and squares [10].

In most European cities, the development process of the urban waterfront area is like the following: prosperity, decline, and re-development. The flourishing time of urban waterfront areas was before the 1920s. Before the industrial revolution, as society developed, people depended on natural water sources, and the water was used for everyday life but also for travel. With the development of water traffic, waterfront areas became very important for their cities, because trading development began and it had a positive impact on the citizens' economy, so waterfront areas became
the centers of many people's lives. However, there was a decline of urban waterfront areas from the 1930s to the 1960s. After the industrial revolution, there was a rapid growth of the population; many industrial and transport companies were located along the water in order to get more benefits. This brought water pollution, because much sewage rushed into the water, so inhabitants did not want to live there anymore [11]. A reconfiguration of the water aesthetics is achieved by urban waterfront regeneration. The way in which the waterfront actually looks depends on the creation of attractive waterside open spaces, besides the provision and support of residential and business space with waterside prospects.

Waterfront cities expand over reclaimed land, and there is a limited number of studies relating to waterfront building expansion. The waterfront area is where the land interacts directly with the sea. It is a dynamic place which is totally open to the action of waves, wind, currents, and tides that can expand the shore with sedimentary deposits and also erode it. It is very important to provide the vision of the future sustainable development of the waterside urban edges and to improve the social, aesthetic, physical, and economic conditions of those areas [12]. Besides the engineering tasks, the expansion of the waterfront should involve systematic and careful planning, a strategy of development and sustainable management, and should also consider the interests of the people living in the area [13]. The common strategy is to provide guidelines for the waterfront development which include detailed land use principles and architectural design approaches, such as building heights and built forms, the presence of public areas, access to the water, human-centered design of spaces, lighting, and signage [14]. Waterfronts are usually at the center of conflicts between different actors and parties; they combine complex environmental problems such as pollution and flooding, social issues reflecting a contradiction between local inhabitants, who can belong to poor groups, and the tourists, and the intention of the developers to densify the constructions versus the demand for public spaces [15].

III. History of the Development of the 'Plazh' Area of Durres

The city of Durrës is positioned in the western part of the county. It has a geographical location with a latitude of 41°19' north and a longitude of 19°27' east. It has a total area of 432 km2. It is bounded by the district of Tirana, which is 35 km away, and in the south by the district of Kavaja. As mentioned above, the western boundary of the city is defined by the Adriatic Sea coastline and is 30 km long. Durrës is mostly composed of large landfills and hills, which stand 89 m above sea level, whereas in the city, the average height is only 2 m above sea level. The establishment of Durrës began in the 7th century B.C. The modern city is built on the ruins of Epidamn, which is known as the old city, and there are actually a number of significant elements of the historical and cultural heritage. Durrës Bay is located between the Selita cape and the cape of south Durrës. On the entrance part of the capes, the bay shores are higher, while on the inside a smooth slope is seen. On the north part of the shoreline the port of Durrës is located, while on the north-east and south-east of the port lies the city of Durrës. The suburb of the city has changed a lot through the centuries, which means that the coastline has also undergone many transformations due to the earthquakes which have occurred in this area [16]. The oldest part of the city is situated on the hill, overlooking the Adriatic Sea in the west, the bay of Durrës in the south, and the plain in the east and north-east. The city is growing in all directions, but mostly on the coastline. During the last 20 years the city has suffered a drastic transformation. Nowadays, on the 4.5 km of coastal line, there are problems with the building heights, density, distances, and land use.

Figure I. The Regulatory Plan of 1942, detailed plan for the 'Plazh' area

In the early 20s the 'Plazh' district was farmland [17]. In the mid-1930s, a regulatory plan was requested for the city of Durrës, the implementation of which turned the Durrës beach before the Second World War into a touristic and recreational area. The royal court, in the mid-1930s, charged Durrës municipality and a group of technicians to draft the regulatory plan for the coast. A 4 km length of the coastline was divided into 300 parcels of 400-500 m2 each. These parcels were sold to different individuals, provided that the building would not be more than 2 stories high and 80 meters away from the sea line, and half of its surface had to be used as greenery. The first regulatory plan of Durrës was done in 1942 by the architect Leone Carmignani (Figure I). The study presented a functional zoning scheme and an administrative map defining the boundaries and the territory junction.

Figure II. Durres Regulatory Plan of 1957

After the establishment of the communist state in 1946, all of the beach villas were nationalized. The buildings were used as holiday homes for workers, and later some of them were converted into guest villas for foreign party leaders or holiday homes for local directors and officers. The same thing happened to buildings that were close to the 'Bllok' in Illyria, while others were reconstructed as small apartments and were rented for two-week vacations to workers' families, mainly from Tirana. From 1950 to 1960, based on the principle of industrial decentralization, the first comprehensive regional and urban strategies were developed and applied [18]. The General Regulatory Plan of Durrës was prepared in 1957 (Figure II). During this period the coastline was considered an important natural element which belonged to the public, and it was still treated with care.
The second Regulatory Plan (Figure III) was done in 1987, covering a territory of 2800 hectares, which in 1999 was extended to 7800 hectares by Durrës Municipality. According to this plan, no construction was allowed in the 'Plazh' area; it was prohibited to use this area for residential or other purposes.
houses, negotiating and compensating owners with apartments and shops.

IV. Analysis of the Coastline Development

This comparative and descriptive analysis of the coastline transformation from 'Ura Dajlanit' to 'Plepa' is based on the study of visual materials such as maps and images, archival research, site survey, and site observation.

Figure VII. 1995, 2001, 2006, 2010, and 2015 maps of the Durrës coastline

The basic map of 1995 (technical drawings, architect Gëzim Hasko, Archive of Durrës, Water Utility), which has been redrawn as a DWG file, the 2001 and 2010 maps (Land Register office), the 2006 map (ALUIZNI), and the 2015 map (self-updated on the terrain by using the 2010 map and Geoportal ASIG as a reference) were provided (Figure VII) and used to analyze the changes of the coastal line for every 5-year period. Each of the maps included the information regarding the building heights. Comparison between the maps allows identifying the number of new constructions added during each period.

A. Number of Construction

After 1990 the number of constructed buildings has risen year by year. This has completely changed the image of the city, its structure, and its socio-economic aspects.

Figure VIII. Total number of buildings per each period (1995: 248; 2001: 298; 2006: 476; 2010: 490; 2015: 497)

From the construction analysis graph (Figure VIII), it is evident that in 1995 the total number of buildings was 248. In 2001 the number of buildings had reached 298. The highest number of constructions and collaterals is seen from 2001 to 2006, when the number of constructions reached a total of 476 buildings. From 2006 up to 2010 the rhythm of construction was lower; at the end of 2010 there were 490 buildings in total. In the last 5 years, the number of constructions decreased. One of the reasons is that the new Regulatory Plan does not allow new constructions in this area, since there is a high building density and not much free land has remained to continue the chaotic construction that has characterized this area. So, at the end of 2015 there are in total 497 buildings. The total number of buildings added is 249. From 1995 to 2001 there were 50 (20% of the total) new constructions; most of the buildings were 2-3 floors high, mostly small apartments or villas, and were used for tourism. The biggest number of new constructions appeared during 2001-2006, when 178 buildings (71% of the total number) were constructed, of which 94 were built on the foundations of existing ones. After 2006, the rhythm of construction decreased; 14 buildings (6% of the total number) were added during this period. From 2010 to 2015, 7 new buildings (3% of the total number) were constructed.
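The per-period additions and shares quoted in this paragraph can be recomputed directly from the totals in Figure VIII; the short Python sketch below (our illustration, not part of the original study) reproduces them.

# Recomputing the per-period additions and their share of the total
# from the building counts in Figure VIII.
totals = {1995: 248, 2001: 298, 2006: 476, 2010: 490, 2015: 497}
years = sorted(totals)
added_total = totals[2015] - totals[1995]   # 249 buildings added overall

for start, end in zip(years, years[1:]):
    added = totals[end] - totals[start]
    share = added / added_total
    print(f"{start}-{end}: +{added} buildings ({share:.0%} of all additions)")
# Output: +50 (20%), +178 (71%), +14 (6%), +7 (3%), matching the text above.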
B. Building Heights

Durrës was perceived as a low-rise city before 1995. However, from this year a large number of large-scale developments was constructed along the coastline. The first wave of construction was mostly up to five or six floors. The boom of construction of apartment blocks began from 2001 until 2006. There is a big contrast between the low-rise villas and the newly constructed residential buildings. On the maps of the building heights (Figure IX), it is seen that the building heights have increased from 1995 to 2015, but the period which makes the contrast is 2001-2006.

Figure IX. Building heights map of the Durrës 'Plazh' area from 1995 (below) to 2015 (above)

The map from 1995 shows that most of the buildings are one to two stories high, so the 'Plazh' area had only villas with a maximum of 3 stories. Those villas were constructed before the 60s, and until 1995 no new buildings were constructed, because constructions were not allowed in the touristic areas. On the map of 2001 it is seen that there are no very big changes in comparison with the 1995 map. The new buildings were not high-rise, and the distances between buildings were respected. From 2001 until 2006 a massive and uncontrolled expansion of apartment blocks is noted. This period of construction corresponds to the revision of the master plan made in 2005 by Durrës municipality. The revision of urban conditions is not based on any study done for this area, because the last studied master plan was the one made in 1987. Despite this, the construction continued,
causing many problems. One of the main reasons for the massive construction was land reclamation. In 2005, the Revision of Urban Conditions divided the area into two parts: the area with building heights up to six storeys and the area with building heights up to eight storeys. The new buildings that were added were from two up to eight storeys. After 2006, the construction continued. Also in this period the buildings were mostly apartment blocks with heights up to nine floors. From 2010 to 2015 not many buildings were added, because not much vacant area remained.

The overall height of new buildings increased from 1995 to 2015. In 1995 there were no buildings with 7-8 or 9+ floors. In 2001 there are 7 buildings of 5-6 storeys. Furthermore, in 2006 and onwards, the number of high-rise buildings drastically increased. Between 2001 and 2006, 24% of the added buildings are 7-8 storeys high and 7% are 9+ storeys high. During 2010-2015 just 7 new buildings were added, which reflects the new master plan and the urban laws which came into power.

C. Building Distances and Parcel Use
According to the map of the building problems (Figure X), it is seen that there are a lot of problems with the building heights, distances between buildings and use of the parcels. Red buildings are outside the construction criteria according to the 2005 Revision of the Urban Conditions and the 2011 Regulatory Plan. The pink color represents buildings added over the years that have not respected the distance rule. Most of these buildings should have a blind facade but, according to the site investigation, windows and balconies are found there. There are also problems with the coefficient of use of the parcels. The cyan color indicates buildings with more than one problem, for example distance plus height. The total number of buildings with multiple problems is 64.

D. Road Network
The road network of the coastline of Durrës has been studied using geographical information systems (GIS). During the last years, the main road which connects the Plazh area with Durrës and Golem has seen a large increase in traffic and transport demand. The solution to the capacity problem was to provide additional road space, which was the first applied strategy. It was not enough because of the uncontrolled building expansion that occurred in these years in the beach area. Secondary roads were not developed. Regarding the existing road system, a lack of access in the suburban area is noticed. The main street outlines the area. In the internal area, narrow roads are constructed with dead ends. In the past years, with the massive construction around this area, these secondary roads have been transformed into pedestrian roads. Parking is another major problem, since the new buildings did not provide any parking places for the residents. The situation is becoming a major concern to the public because it is getting worse day by day. It also causes air pollution, which is one of the biggest problems in the area; it damages the tourism and has a negative effect on the quality of life.

The main road is the one which did not change during the last years. Because of the massive construction a lot of secondary roads were built within the area. They were built
V. Conclusion
There are two main factors which have affected the development of the coastal area. The first factor is the land reclamation and the second is the absence of an approved Master Plan from 1990 until 2011, since the Regulatory Plan of 1987 was not taken into consideration because it belonged to the previous regime.

Durrës is the second largest city in the region and it has touristic and historical values. The transformation of the 'Plazh' area can be seen as a bad example of managing touristic coastal areas. Every city which has an important historical and touristic value should have a development plan where all the social, economic, aesthetic and architectural principles are merged in a harmonic and attractive way. Everyone should enjoy the spaces which are offered by the waterfront areas. This land should not be treated as a personal asset and used for personal interest; the demands of the tourists and residents should also be taken into consideration. The 'concrete barrier' of the 'Plazh' area was built without the basis of a Regulatory Plan or a specific study. Almost 50% of the buildings are built in violation of the law. In this situation it is difficult to rehabilitate all the area, since there is no vacant land left which can be used for public spaces. A rehabilitation project should be suggested, which should include sport areas, recreation areas and more green areas. New construction should be stopped, some of the informal settlements should be demolished, and importance should be given to the secondary roads which orient visitors to the sea.

References
[1] A. Hoti, Durrësi: Epidamni - Dyrrahu: guidë, Tiranë: Cetis Tirana, 2003.
[2] S. Xhafa and B. Hasani, "Urban Planning Challenges in the Peripheral Areas of Durres City (Porto Romano)," Mediterranean Journal of Social Sciences, vol. 4, no. 10, 2013.
[3] R. Leakey and R. Lewin, People of the Lake: Mankind and Its Beginnings, Anchor Press/Doubleday, 1978.
[4] L. Dong, "Waterfront Development: A Case Study of Dalian, China," University of Waterloo, Waterloo, 2004.
[5] M. Moretti, "Cities on Water and Waterfront Regeneration: A Strategic Challenge for the Future," Rivers of Change - River//Cities, Warsaw, 2008.
[6] M. Yassin, B. Azlina, S. Bond and J. McDonagh, "Principles for sustainable riverfront development for Malaysia," Journal of Techno-Social, 2012.
[7] A. Yassin, C. Eves and J. McDonagh, "An Evolution of Waterfront Development in Malaysia," Geography, 2010.
[8] M. Marzia, Morphological, technological and functional characteristics of infrastructures as a vital sector for the competitiveness of a country system, Milano: Politecnica, 2011.
[9] S. Shaziman, I. Usman and M. Tahir, "Waterfront as Public Space Case Study; Klang River between Masjid Jamek and Central Market, Kuala Lumpur," in Selected Topics in Energy, Environment, Sustainable Development and Landscaping: EEESD '10, 3rd WSEAS International Conference on Landscape Architecture LA '10, 2010.
[10] S. Kostof, The City Shaped: Urban Patterns and Meanings Through History, Thames & Hudson, 1991.
[11] N. Erkal, Haliç extra mural zone: a spatio-temporal framework for understanding the architecture of the Istanbul city frontier, Istanbul: METU, 2001.
[12] A. R. Al-Shams, K. Ngah, Z. Zakaria, N. Noordin, M. Z. Hilmie and M. Sawal, "Waterfront Development within the Urban Design and Public Space Framework in Malaysia," Asian Social Science, vol. 9, no. 10, pp. 77-87, 2013.
[13] C.-H. Chen, "The Analysis of Sustainable Waterfront Development Strategy - The Case of Keelung Port City," International Journal of Environmental Protection and Policy, vol. 3, no. 3, pp. 65-78, 2015.
[14] Y. Reyhan, Ş. Şenlie and G. İ. Burcu, "Sustainable Urban Design Guidelines For Waterfront Developments," in 2nd International Sustainable Building Symposium, Ankara, 2015.
[15] R. E. Pramesti, "Sustainable Urban Waterfront Development: Challenge And Key Issues," MEDIA MATRASAIN, vol. 14, no. 2, pp. 41-54, 2017.
[16] V. Koçi, Spatial transformations of the waterfront as an urban frontier, case study: Durres, a port city, Istanbul: METU, 2005.
[17] Dyrrah, "Durrës: Vilat e plazhit, "hirushet" e vetmuara mes katrahurës ndërtimore," DurrësLajm, 15 June 2015.
[18] G. Enyedi, "Urbanization under Socialism," in Cities after Socialism: Urban and Regional Change and Conflict in Post-Socialist Societies, Oxford, Blackwell Publishers, 1996, pp. 100-11.
Deep Learning Using MobileNet for Personal Recognizing
Şafak Kılıç, Department of Computer Engineering, Siirt University, Siirt, Turkey (safakkilic@outlook.com)
İman Askerzade, Department of Computer Engineering, Ankara University, Ankara, Turkey (imasker@eng.ankara.edu.tr)
Yılmaz Kaya, Department of Computer Engineering, Siirt University, Siirt, Turkey (yilmazkaya1977@gmail.com)
Abstract— The usage areas of biometric technologies are increasing day by day. As information security becomes more important for people every day, it has become one of the most common application areas. In recent years, human-computer interactive systems have started to attract academic and commercial interest, and these systems aim to solve problems such as person recognition, gender estimation and age estimation. In our study, person recognition was performed on data collected using wearable sensors. The Daily and Sports Activities data set, which we obtained from the UCI database, has been tested with the developed MobileNet architecture. It has been seen that the data obtained from the sensors are successful in the person recognition problem. The developed system has realized person identification over 19 different physical movements and has also provided the detection of these 19 movements. In addition, success rates were obtained according to the region where the sensors were installed. Thanks to the results obtained in this study, it has been seen that accelerometer, gyroscope and magnetometer sensors are successful in biometric person recognition. In summary, it has been determined that the proposed method is successful in biometric person recognition, thanks to the data obtained from wearable sensors.

Keywords—Transfer Deep Learning Models, MobileNet, Person Identification, Wearable Sensor, Biometric System

I. Introduction
In the past few decades, the problem of identifying people has been one of the hot areas where researchers emphasize the use of various methods. The human body has several unique characteristics, and some systems can detect these characteristics and distinguish one person from another. A system that recognizes people based on their physical or behavioural characteristics is termed a biometric system. Personal biometric authentication consists of distinguishing people based on their physiological and/or behavioural characteristics [1]. Biometric technology checks the physical or behavioural characteristics by which a person can be recognized. A biometric system works in two modes: (1) identification and (2) authentication (also referred to as "identity verification"). In the former, a person's identity is established by finding a match within the database of all enrolled persons (one-to-many comparison). In the latter, a person's biometric information is compared with their template stored in the system database to verify that person's identity.

Physical and behavioral features are the two basic qualities of persons. Physical properties are those that are stable and do not vary over time, whereas behavioral properties are those that change over time and in response to environmental factors. Biometric recognition systems are generally developed on the basis of these two characteristics of people. While biometric systems created with data such as facial recognition, fingerprint recognition, hand geometry, iris and retina data are systems based on physiological features, biometric systems created with data such as walking, signature and speech are based on behavioral features. These strategies, on the other hand, have a significant disadvantage in that they can be duplicated [2; 3; 4].

Mimicked voices, the use of duplicate irises, and disguised glasses can be examples of these scams. As a result, new descriptive systems based on individual behavior or features, known as biometrics and based on signals measured from various parts of the body, have been adopted in recent years [5; 6]. Different medical signals are also employed as biometric data, according to studies. Biometric systems have been developed using EEG and other signals [7; 8; 6; 9], electrocardiogram [10; 11; 12; 13; 14; 15], and accelerometer [16; 17]. Medical indicators are unique to each person, according to studies [8; 18; 10].

In the study of Alyasseri et al. [19], people are recognized using multichannel EEG waves. In addition, active EEG channels were uncovered by the researchers. The process of recognizing persons can be done by using the electrical signals in the brain, according to Sun et al.'s research [9]. They discovered that applying the conventional 1D-LSTM deep learning algorithm to 16-channel EEG measurements resulted in a success rate of 99.56 percent. In their study of identifying people using EEG data, Rodrigues et al. discovered an 87 percent success rate [6].

The person recognition problem was also attacked using sensor signals in the studies by Kılıç et al. [20; 21]. The sensor signals were converted into pictures with various processes and the success of the system was tested with local binary patterns and deep learning networks and models. In both articles, the authors showed a success rate of over 95% in person recognition from sensor data.

Although the strategy of training a CNN from scratch can be successful in many problems, the correct optimization of the hyper-parameters in the architecture to be built is still a difficult process [34]. At the same time, a large amount of data is required for a from-scratch training technique [35]. However, it is possible to reach high success rates faster and more precisely with a transfer learning or fine-tuning strategy, by training deep architectures which are enriched with techniques newly introduced to the literature and which do not need hyperparameter optimization [36]. The MobileNet architecture is chosen as the training model in this study because it can be trained with deep transfer learning, has a low computational cost, and is ideal for mobile applications. At the end of the study, higher success rates were achieved with the right training strategy compared to other studies in the literature.

In the remainder of this paper, the data set is introduced, the models used are described, and the experimental results obtained are presented in the discussion section. In the last part, the general achievements obtained at the end of the study are given.
II. Data Set
As a part of this study, the data set known as Daily and Sports Activities was obtained from the UCI database [22; 23; 24]. This study used Xsens MTx sensor units mounted at assigned locations on the person's body to collect data for the 19 previously indicated behaviors (activities). To collect data, these sensor units were placed on five different areas of the subject's body: the chest, the right wrist, the left wrist, the right leg (above the knee), and the left leg (above the knee) (Figure I). Each Xsens MTx unit contains nine sensors (accelerometer x, y, z; gyroscope x, y, z; and magnetometer x, y, z).

Activity Code | Name of the Activity
A16 | cycling in a vertical position
A17 | rowing exercise
A18 | jumping exercise
A19 | playing basketball
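A minimal sketch of loading this data set into per-segment signal matrices is given below (our illustration, not the authors' code; the directory layout a01..a19/p1..p8/s01..s60.txt and the 125x45 comma-separated segment size are assumptions based on the public UCI distribution of this data set):

```python
# Sketch: reading the UCI "Daily and Sports Activities" segments into
# signal matrices (19 activities x 8 subjects x 60 segments = 9120).
# Each file is assumed to be a 125x45 matrix: 125 time samples,
# 5 sensor units x 9 sensors (accelerometer, gyroscope, magnetometer).
import numpy as np
from pathlib import Path

def load_segments(root="data"):
    segments, activity_ids, subject_ids = [], [], []
    for a in range(1, 20):            # 19 activities: a01..a19
        for p in range(1, 9):         # 8 subjects: p1..p8
            for s in range(1, 61):    # 60 segments: s01..s60
                f = Path(root) / f"a{a:02d}" / f"p{p}" / f"s{s:02d}.txt"
                segments.append(np.loadtxt(f, delimiter=","))
                activity_ids.append(a)
                subject_ids.append(p)
    return np.array(segments), np.array(activity_ids), np.array(subject_ids)

# X.shape == (9120, 125, 45); y_subject is the person-identification label.
# X, y_activity, y_subject = load_segments("data")
```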
III. Methodology
A. Person Identification by MobileNet Deep Transfer Learning Technique
to each input channel using a depthwise convolution, and then creates a linear combination of the depthwise layer outputs using 1x1 (pointwise) convolutions. Batch normalisation (BN) and a rectified linear unit (ReLU) are used after every convolution.
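As a rough sketch of the depthwise separable block just described (our illustration, assuming TensorFlow/Keras; the input and filter sizes are illustrative, not values taken from the paper):

```python
# Sketch of a MobileNet-style depthwise separable convolution block,
# with BN + ReLU after every convolution, as described above.
import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    # Depthwise convolution: one 3x3 filter applied to each input channel.
    x = layers.DepthwiseConv2D(kernel_size=3, strides=stride,
                               padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # Pointwise 1x1 convolution: linear combination of the depthwise outputs.
    x = layers.Conv2D(pointwise_filters, kernel_size=1,
                      padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(224, 224, 3))   # assumed input size
outputs = depthwise_separable_block(inputs, pointwise_filters=64)
model = tf.keras.Model(inputs, outputs)        # one MobileNetV1 building block
```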
An average pooling layer condenses the feature map once all of the convolutional layers have extracted features from the input image.

The Reshape layer, Dropout layer, convolutional layer, Softmax activation function layer, and final Reshape layer, which make up the last five layers of the standard MobileNet, are replaced by a Dropout layer and a fully connected layer with Softmax activation. Our fully connected layer generates more accurate predictions for each class than the last five layers of the standard MobileNet do. In general, increasing the number of convolutional layers in the model helps it extract more features from the input data.

Signals were collected from eight subjects (four men and four women). Each activity is broken down into 60 segments. As a result, there are 19x8x60 = 9120 signal matrices in the data set. The MobileNet deep transfer learning technique is utilized after these signal matrices have been converted into images; as a consequence, 9120 images were extracted to test whether our system works. Two MobileNet architectures are used. The success rate is determined as follows:

Success rate (%) = 100 × (# true classifications) / ((# true classifications) + (# false classifications))   (2)

Table IV shows the success rate of person identification using the MobileNet Version 1 (MobileNetV1) and MobileNet Version 2 (MobileNetV2) architectures.
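To make this transfer learning step concrete, here is a minimal sketch of replacing the MobileNet top with a Dropout layer and a fully connected Softmax layer for the eight subject classes, together with the success rate of (2). The image size, dropout rate and optimizer are our assumptions, not values reported by the authors:

```python
# Sketch: MobileNetV1 transfer learning for 8-subject identification.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

NUM_SUBJECTS = 8  # 19 activities x 8 subjects x 60 segments = 9120 images

base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False,
    weights="imagenet", pooling="avg")   # pretrained features, top removed
base.trainable = False                   # transfer learning: freeze the base

model = tf.keras.Sequential([
    base,
    layers.Dropout(0.5),                               # replaces old top layers
    layers.Dense(NUM_SUBJECTS, activation="softmax"),  # fully connected + Softmax
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

def success_rate(y_true, y_pred):
    """Eq. (2): 100 * true / (true + false) classifications."""
    true = int(np.sum(np.asarray(y_true) == np.asarray(y_pred)))
    return 100.0 * true / len(y_true)
```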
after the person recognition process. In general, it has been determined that the MobileNetV1 network is more successful in the physical activity recognition problem. It has shown 100% success in recognizing some physical movements. Detailed results are presented in Table V.

Table VI. Success rate (%) of MobileNet networks by sensor region
Region    | MobileNetV1 | MobileNetV2
Chest     | 94.5        | 93.8
Right Arm | 92.1        | 92.1
Left Arm  | 91.1        | 89.9
Right Leg | 93.6        | 92.3
Left Leg  | 91.9        | 92.0

Finally, we have tested the success of our model according to the region where the sensors receive signals. Both models showed high success; in addition, the MobileNetV1 networks have a higher success rate. Data from the chest-level sensors was more distinctive than data from the other regions. Details can be seen in Table VI.

VI. Conclusion
Several biometric technologies have been developed in recent years. Face, voice, fingerprints, palm print, ear shape, and gait are all biometric technologies that have been widely used in security systems. However, because they may be imitated, most of these systems have glaring faults. To address these issues, a new biometric system based on medical signals has been developed. In this study, a biometric approach was developed to identify people using wearable sensor inputs. The major goal of this study is to show that signals from portable sensors such as accelerometers, gyroscopes, and magnetometers may be used to identify people. In future studies, due to the increasing use of mobile devices, person recognition will be done with a deep transfer learning approach through data obtained from portable mobile devices.

References
[1] B. Ngugi, A. Kamis, and M. Tremaine, "Intention to use biometric systems," e-Service Journal: A Journal of Electronic Services in the Public and Private Sectors, vol. 7, no. 3, pp. 20-46, 2011.
[2] T. Matsumoto, H. Matsumoto, K. Yamada, and S. Hoshino, "Impact of artificial "gummy" fingers on fingerprint systems," in Optical Security and Counterfeit Deterrence Techniques IV, 2002, vol. 4677: International Society for Optics and Photonics, pp. 275-289.
[3] J. Yu, C. Fang, J. Xu, E.-C. Chang, and Z. Li, "ID repetition in Kad," in 2009 IEEE Ninth International Conference on Peer-to-Peer Computing, 2009: IEEE, pp. 111-120.
[4] G. Gainotti, "Laterality effects in normal subjects' recognition of familiar faces, voices and names. Perceptual and representational components," Neuropsychologia, vol. 51, no. 7, pp. 1151-1160, 2013.
[5] J. Galbally, S. Marcel, and J. Fierrez, "Biometric antispoofing methods: A survey in face recognition," IEEE Access, vol. 2, pp. 1530-1552, 2014.
[6] D. Rodrigues, G. F. Silva, J. P. Papa, A. N. Marana, and X.-S. Yang, "EEG-based person identification through binary flower pollination algorithm," Expert Systems with Applications, vol. 62, pp. 81-90, 2016.
[7] S. Marcel and J. d. R. Millán, "Person authentication using brainwaves (EEG) and maximum a posteriori model adaptation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 4, pp. 743-752, 2007.
[8] Y. Dai, X. Wang, X. Li, and Y. Tan, "Sparse EEG compressive sensing for web-enabled person identification," Measurement, vol. 74, pp. 11-20, 2015.
[9] Y. Sun, F. P.-W. Lo, and B. Lo, "EEG-based user identification system using 1D-convolutional long short-term memory neural networks," Expert Systems with Applications, vol. 125, pp. 259-267, 2019.
[10] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold, and B. K. Wiederhold, "ECG to identify individuals," Pattern Recognition, vol. 38, no. 1, pp. 133-142, 2005.
[11] M. Deng, C. Wang, M. Tang, and T. Zheng, "Extracting cardiac dynamics within ECG signal for human identification and cardiovascular diseases classification," Neural Networks, vol. 100, pp. 70-83, 2018.
[12] A. Goshvarpour and A. Goshvarpour, "Human identification using a new matching pursuit-based feature set of ECG," Computer Methods and Programs in Biomedicine, vol. 172, pp. 87-94, 2019.
[13] K. Su et al., "Human identification using finger vein and ECG signals," Neurocomputing, vol. 332, pp. 111-118, 2019.
[14] F. Sufi and I. Khalil, "Faster person identification using compressed ECG in time critical wireless telecardiology applications," Journal of Network and Computer Applications, vol. 34, no. 1, pp. 282-293, 2011.
[15] W. Chang, H. Wang, G. Yan, and C. Liu, "An EEG based familiar and unfamiliar person identification and classification system using feature extraction and directed functional brain network," Expert Systems with Applications, vol. 158, p. 113448, 2020.
[16] R. San-Segundo, R. Cordoba, J. Ferreiros, and L. F. D'Haro-Enriquez, "Frequency features and GMM-UBM approach for gait-based person identification using smartphone inertial signals," Pattern Recognition Letters, vol. 73, pp. 60-67, 2016.
[17] R. San-Segundo, J. D. Echeverry-Correa, C. Salamea-Palacios, S. L. Lutfi, and J. M. Pardo, "I-vector analysis for gait-based person identification using smartphone inertial signals," Pervasive and Mobile Computing, vol. 38, pp. 140-153, 2017.
[18] N. V. Boulgouris, K. N. Plataniotis, and E. Micheli-Tzanakou, Biometrics: Theory, Methods, and Applications. John Wiley & Sons, 2009.
[19] Z. A. A. Alyasseri, A. T. Khader, M. A. Al-Betar, and O. A. Alomari, "Person identification using EEG channel selection with hybrid flower pollination algorithm," Pattern Recognition, vol. 105, p. 107393, 2020.
[20] Ş. Kılıç, Y. Kaya, and I. Askerbeyli, "A New Approach for Human Recognition Through Wearable Sensor Signals," Arabian Journal for Science and Engineering, vol. 46, no. 4, pp. 4175-4189, 2021.
[21] Ş. Kılıç, İ. Askerzade, and Y. Kaya, "Using ResNet Transfer Deep Learning Methods in Person Identification According to Physical Actions," IEEE Access, vol. 8, pp. 220364-220373, 2020.
[22] K. Altun, B. Barshan, and O. Tunçel, "Comparative study on classifying human activities with miniature inertial and magnetic sensors," Pattern Recognition, vol. 43, no. 10, pp. 3605-3620, 2010.
[23] B. Barshan and M. C. Yüksek, "Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units," The Computer Journal, vol. 57, no. 11, pp. 1649-1667, 2014.
[24] K. Altun and B. Barshan, "Human activity recognition using inertial/magnetic sensor units," in International Workshop on Human Behavior Understanding, 2010: Springer, pp. 38-51.
[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, pp. 1097-1105, 2012.
[26] C. Szegedy et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 1-9.
[27] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[28] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770-778.
[29] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, "Learning transferable architectures for scalable image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8697-8710.
[30] O. Russakovsky et al., "Imagenet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
[31] A. G. Howard et al., "Mobilenets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
[32] L. Sifre and S. Mallat, "Rigid-motion scattering for image classification," arXiv preprint arXiv:1403.1687, 2014.
[33] O. Dobrucalı and B. Barshan, "Sensor-activity relevance in human activity recognition with wearable motion sensors and mutual information criterion," in E. Gelenbe and R. Lent (eds.), Information Sciences and Systems 2013, 2013.
[34] M. Feurer and F. Hutter, "Hyperparameter optimization," in Automated Machine Learning: Springer, Cham, 2019, pp. 3-33.
[35] D. Shen, G. Wu, and H.-I. Suk, "Deep learning in medical image analysis," Annual Review of Biomedical Engineering, vol. 19, pp. 221-248, 2017.
[36] D. Hendrycks, K. Lee, and M. Mazeika, "Using pre-training can improve model robustness and uncertainty," in International Conference on Machine Learning, 2019: PMLR, pp. 2712-2721.
Effects of Photon-Shot and Excess Noises on Detectable
Minimum Rotation Rate in I-FOG Design for
Autonomous Vehicles
Kübra Kılınçarslan, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (kubrakilincarslan@uludag.edu.tr)
Emirhan Sağ, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (emirhansag@uludag.edu.tr)
Abdurrahman Günday, Department of Electrical and Electronic Engineering, Bursa Uludag University, Bursa, Turkey (agunday@uludag.edu.tr)
Abstract—In the last decade, the importance of acquiring location information has gradually increased depending on the developments in autonomous vehicle technologies. The rotation rate variation required for providing high-precision continuous position information is obtained by using a gyroscope, which is an important part of the Inertial Measurement Unit (IMU). However, some effects limit the operation performance of gyroscopes. In this study, the effects of photon shot and excess noises that limit the measurement accuracy of the interferometric fiber optic gyroscope (I-FOG) on the detectable minimum rotation rate (DMRR) have been analyzed for a basic configuration using a superluminescent diode (SLD) and a superfluorescent fiber source (SFS). Furthermore, simulations related to DMRR variations for the system parameters of fiber length, fiber coil diameter, photodetector bandwidth, output power of the optical sources and spectral bandwidth have been obtained in Matlab 2020b. Moreover, for an optimum system design employing SLD and SFS, the DMRR has also been computed as 0.793°/h and 0.910°/h, respectively. Thus, approximately 80% improvement has been achieved over the DMRR values in the system.

Keywords—fiber optic gyroscope, broadband sources, minimum rotation rate, photon-shot noise, excess noise

I. Introduction
Acceleration sensors and gyroscopes with high sensitivity and accuracy are extensively used to detect the position, speed and direction of an object. These kinds of information gathered from inertial navigation systems (INS) are combined with global positioning system (GPS) data, accordingly [1].

In recent years, many research and development studies have been carried out in the field of autonomous driving technologies. In this context, detection systems such as GPS, Visual Simultaneous Localization and Mapping (SLAM), and Light Detection and Ranging (LIDAR) are used to obtain the exact location information of the vehicle [2].

However, these methods and techniques have some limitations in terms of performance due to their weaknesses in practical use: atmospheric effects, multi-path propagation and the inaccessibility of environments such as tunnels for GPS; adverse effects of atmospheric events on image processing for Visual SLAM; and performance that varies with the light-reflection properties of different objects for LIDAR systems.

In cases where other positioning methods cannot be used, the dead reckoning technique is utilized to overcome their weaknesses. Dead reckoning is in principle the process of calculating the current position of a person or moving object by using a fixed or previously specified position and advancing this position utilizing known or estimated information of vehicle speed and route. There are great numbers of sensors used in the computational navigation system with high accuracy and sensitivity. The gyroscope, which is one of them, is exploited for detecting the deviation rate and route information and has become the most important device that determines the accuracy of dead reckoning [3].

There are different types of gyroscopes employed in autonomous vehicles, such as optical-based gyroscopes and micro-electro-mechanical systems (MEMS). MEMS gyroscopes have cost advantages in comparison with optical gyroscopes; on the contrary, optical gyroscope-based navigation systems are preferred due to their superiority in terms of measurement accuracy and reliability [4].

However, optical backscattering and distortions caused by nonlinear electro-optical effects influence the operation performance of I-FOGs [5, 6]. To overcome this problem, low-coherence and broadband optical sources are employed in applications [7]. The simulations and analyses performed in this study have been obtained making use of SLD and SFS, which are low-coherence and broadband optical sources.

The main effect limiting the DMRR at the optical gyroscope output is the photon shot noise induced by the photodetector that converts the incoming optical signal into an electrical signal in the FOG configuration exploited in this study. Furthermore, the excess noise induced by the characteristics of the optical sources [6] is considered as a limiting effect on the DMRR.

In this study, performance values of DMRR for I-FOG designs employing SLD and Erbium-doped SFS have been compared with each other using the theoretical and simulation results derived from the configuration illustrated in Figure I. Moreover, DMRR variations and corresponding simulations have also been performed making use of the parameters of optical fiber coil diameter, fiber length, photodetector bandwidth, and optical output power and spectral bandwidth of both sources.

II. Theory
The working principle of optical gyroscopes is based on the Sagnac effect. This effect gives rise to the Δφ Sagnac phase shift between optical signals propagating clockwise (CW) and counterclockwise (CCW) in a ring interferometer rotating around an axis perpendicular to the ring. Detectable minimum rotation rate information is obtained by using the Sagnac phase shift.

The minimum FOG configuration exploited in this study is shown in Figure I [8]. This configuration consists of an optical source, a 50/50 optical coupler, a fiber polarizer, a piezoelectric phase modulator, a fiber coil and a photodetector.
Figure I. Minimum I-FOG configuration

The fiber polarizer used in the FOG configuration given in Figure I polarizes the light linearly in only one direction. The 50/50 optical coupler placed in the configuration is utilized for splitting the light beam into two equal parts propagating in opposite directions of each other in the fiber coil. An optical phase modulator is utilized to modulate the phase of the light pumped into the optical fiber. The phase difference variation caused by the rotational movement of the system reaches the photodetector and is then converted into an electrical signal [9]. Afterward, rotation rate information is obtained by using the signal processing unit at the photodetector output.

The relation between the rotation rate caused by the rotational motion in I-FOGs and the Sagnac phase shift is given as in (1).

Ω = (λ·c / (2·π·L·D)) · φs   [rad/s]   (1)

where Ω is the rotation rate, λ is the wavelength, c is the speed of light, L is the fiber length, D is the diameter of the coil and φs is the Sagnac phase shift.

In I-FOGs, the fundamental noise limit is described by photon-shot noise, which causes random fluctuations in the detector output current (is = √(q·i0·B)) due to the random scattering of incident photons on the photodetector.

In the noise current equation, the parameters q, i0, B = 1/T and T represent the electron charge, the average current in the detector, the electrical bandwidth of the detection system and the sampling time, respectively [10].

The I-FOG resolution of the gyroscope system is evaluated as the detectable minimum variation rate in rotation angle induced by uncertainties in detector output current and rotation rate. The detectable minimum rotation rate for photon shot noise is expressed by (2) [11].

Ωmin,shot = (λ·c / (2·π·L·D)) · √(B·h·c / (η·λ·Pd))   [rad/s]   (2)

where B is the bandwidth of the photodetector, h is the Planck constant, η is the optical efficiency of the detector, and Pd is the optical power reaching the detector. The photodetector current is stated with (3) [12].

I = q·η·λ·Pd / (h·c)   [A]   (3)

where I is the photodetector current in amperes. Substituting (3) in (2) and reorganizing the equation, the detectable minimum rotation rate as a function of photodetector current can be written as in (4).

Ωmin,shot = (λ·c / (2·π·L·D)) · √(B·q / I)   [rad/s]   (4)

The detectable minimum rotation rate depending on excess noise can be stated as in (5) [13].

Ωmin,excess = (λ² / (2·π·L·D)) · √(B·c / Δλ)   [rad/s]   (5)

where Δλ is the spectral bandwidth of the optical source.

The photodetector current fluctuation generated in the photodetector can be described as a function of the combined effect of photon shot noise and excess noise, as given in (6) [14].

⟨(ΔI)²⟩ = 2·q·⟨I⟩·B + ⟨I⟩²·B / Δλ   [A²]   (6)

Considering the combined effect of photon shot and excess noise, i.e. using (4) and (5), the detectable minimum rotation rate can be expressed as in (7) [14].

Ωmin = (λ·c / (2·π·D·L)) · ((1 + J0) / J1) · (q/⟨I⟩ + λ²/(2·c·Δλ))^(1/2) · √B   [rad/s]   (7)

where J0 and J1 are Bessel functions and take the values of 0.34 and 0.58, respectively, for maximum sensitivity.

III. Simulations
Simulations have been performed using Matlab 2020b on the simulation model built up for analyzing the effects of photon shot and excess noises on the DMRR, utilizing the performance parameters of fiber coil diameter, fiber length, photodetector bandwidth, optical source bandwidths, and optical output power. The photon shot noise, excess noise and combined effect of both noises expressed in (4), (5) and (7), respectively, have been exploited for obtaining the relevant simulations in this research.

Thereafter, the system parameters used in the I-FOG configuration have been specified and their values have also been acquired for an optimum system design. In this way, the effects of the design parameters on the noise characteristics have been analyzed in terms of system performance, as well.

The parameters of the system used for the simulations in this study and their corresponding values are given in Table I.

Table I. Simulation parameters
Parameters                          | Value ranges
Fiber coil diameter (D) [m]         | 0.05 – 0.15
Fiber length (L) [m]                | 1000 – 1500
Photodetector bandwidth (B) [Hz]    | 60 – 500
SLD spectral bandwidth (Δλ) [nm]    | 70 – 130
SFS spectral bandwidth (Δλ) [nm]    | 20 – 60
Output power (P) [mW]               | 2 – 20

For the I-FOG configuration indicated in Figure I, the wavelength of the light launched into the optical fiber (λ), the electron charge (q), the speed of light in vacuum (c) and the photodetector responsivity have been taken as 1550 nm, 1.602x10⁻¹⁹ C, 3x10⁸ m/s and 0.93 A/W, respectively, in the simulations.
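As an illustration of how (4), (5) and (7) can be evaluated over these parameter ranges, the following Python sketch implements the three expressions (the paper's own model was built in Matlab 2020b; this is our re-implementation). The photodetector current is derived here from an assumed detector power equal to the source output power, since coupler and polarizer losses are not specified in this excerpt, so absolute values need not match the reported ones:

```python
# Sketch: DMRR expressions (4), (5) and (7) with the constants above.
import math

Q = 1.602e-19        # electron charge [C]
C = 3e8              # speed of light in vacuum [m/s]
LAM = 1550e-9        # wavelength [m]
RESP = 0.93          # photodetector responsivity [A/W]
J0, J1 = 0.34, 0.58  # Bessel-function values for maximum sensitivity

RAD_S_TO_DEG_H = math.degrees(1) * 3600  # rad/s -> deg/h

def dmrr_shot(D, L, B, P_d):
    """Eq. (4): shot-noise-limited DMRR, with I = RESP * P_d (assumption)."""
    I = RESP * P_d
    return LAM * C / (2 * math.pi * L * D) * math.sqrt(B * Q / I)

def dmrr_excess(D, L, B, dlam):
    """Eq. (5): excess-noise-limited DMRR."""
    return LAM**2 / (2 * math.pi * L * D) * math.sqrt(B * C / dlam)

def dmrr_total(D, L, B, P_d, dlam):
    """Eq. (7): combined photon shot + excess noise DMRR."""
    I = RESP * P_d
    pre = LAM * C / (2 * math.pi * D * L) * (1 + J0) / J1
    return pre * math.sqrt(Q / I + LAM**2 / (2 * C * dlam)) * math.sqrt(B)

# Example: D = 0.15 m, L = 1000 m (D.L = 150 m^2), B = 60 Hz, P = 2 mW,
# SLD with 70 nm spectral bandwidth, reported in deg/h.
print(dmrr_total(0.15, 1000, 60, 2e-3, 70e-9) * RAD_S_TO_DEG_H)
```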
A. Relationship between Minimum Rotation Rate and the D.L Product
The spectral bandwidths of the SLD and SFS vary in the ranges 70 nm - 130 nm and 20 nm - 60 nm, respectively, as mentioned in Table I. Simulations of the effect of the fiber coil diameter and fiber length product (D.L) on the minimum rotation rate have been obtained for B = 60 Hz, P = 2 mW and spectral bandwidths of 70 nm for the SLD and 20 nm for the SFS, as shown in Figure II.

Figure II. Minimum rotation rate vs. D.L product for a) SLD and b) SFS

DMRR variations for both SLD and SFS decrease exponentially with increasing D.L product, as shown in Figure II. For variations of D.L in the range 50 m² - 225 m², the DMRR for the SLD changes between 1.963°/h and 0.436°/h depending on the photon shot noise, while it varies between 0.799°/h and 0.178°/h depending on the excess noise.

For variations of D.L in the same range, the DMRR for the SFS changes between 1.963°/h and 0.436°/h depending on the photon shot noise, whilst it changes between 1.496°/h and 0.333°/h depending on the excess noise.

The total DMRR shows a change in the ranges 4.720°/h - 1.049°/h and 5.152°/h - 1.145°/h for SLD and SFS, respectively. Therefore, it is seen that the DMRR decreases as the values of coil diameter and fiber length increase.

At the point where D.L is 150 m² in Figure II, the DMRR values for SLD and SFS have decreased by approximately 85%. The detectable minimum rotation rate takes the values 1.573°/h and 1.717°/h for SLD and SFS, respectively, at this point.

B. Relationship between Minimum Rotation Rate and Bandwidth
Simulations of the effect of photodetector bandwidth changes on the DMRR for D.L = 150 m² and P = 2 mW have been obtained for SLD and SFS with spectral bandwidths of 70 nm and 20 nm, respectively, as indicated in Figure III.

Figure III. Minimum rotation rate vs. bandwidth for a) SLD and b) SFS

For the variation of bandwidth in the range 60 Hz - 500 Hz, the DMRR values for the SLD vary in the range 0.654°/h - 1.889°/h depending on the photon shot noise and in the range 0.266°/h - 0.769°/h depending on the excess noise. For the variation of bandwidth in the same range, the DMRR for the SFS changes from 0.654°/h to 1.889°/h depending on the photon shot noise and from 0.499°/h to 1.440°/h depending on the excess noise.

The total DMRR changes in the ranges 1.573°/h - 4.541°/h and 1.717°/h - 4.958°/h for SLD and SFS, respectively. In other words, the higher the photodetector bandwidth, the higher the DMRR. The minimum rotation rate reaches its maximum value of 4.958°/h because of the combined effect of both the photon shot noise and the excess noise for the SFS.
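A minimal sketch of how such a parameter sweep can be generated (our illustration, under the same assumptions about constants and detector current as the previous snippet, so the absolute values need not match the figures):

```python
# Sketch: sweeping the D.L product as in Figure II, for B = 60 Hz,
# P = 2 mW and an SLD spectral bandwidth of 70 nm (assumed detector power).
import math

Q, C, LAM, RESP, J0, J1 = 1.602e-19, 3e8, 1550e-9, 0.93, 0.34, 0.58
TO_DEG_H = math.degrees(1) * 3600  # rad/s -> deg/h

def total_dmrr(DL, B, P_d, dlam):
    # Combined shot + excess noise DMRR of (7), with I = RESP * P_d.
    pre = LAM * C / (2 * math.pi * DL) * (1 + J0) / J1
    term = Q / (RESP * P_d) + LAM**2 / (2 * C * dlam)
    return pre * math.sqrt(term) * math.sqrt(B) * TO_DEG_H

for DL in range(50, 226, 25):  # D.L from 50 to 225 m^2
    print(f"D.L = {DL:3d} m^2 -> DMRR = {total_dmrr(DL, 60, 2e-3, 70e-9):.3e} deg/h")
```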
C. Relationship between Minimum Rotation Rate and Spectral Bandwidth
The variations of the DMRR with the spectral bandwidths of the optical sources have been obtained as shown in Figure IV, for D.L = 150 m², P = 2 mW, and B = 60 Hz.

Figure IV. Minimum rotation rate vs. spectral bandwidth for a) SLD and b) SFS

The total minimum rotation rate changes from 1.573°/h to 1.545°/h for the SLD spectral bandwidth variation in the range 70 nm - 130 nm, whilst it varies from 1.717°/h to 1.583°/h for the SFS spectral bandwidth variation in the range 20 nm - 60 nm.

The total DMRR has been obtained as 1.555°/h and 1.618°/h for SLD and SFS with spectral bandwidths of 100 nm and 40 nm, respectively. Hence, improvements with values of 64% and 74% have been achieved at these points, respectively.

As is obvious in Figure IV.b, excess noise is more effective in an SFS with narrow bandwidth in comparison to one with a broad band. Stated in other words, this noise effect causes the detectable total minimum rotation rate to become higher in SFSs. Accordingly, when using a broadband optical source, the effect of excess noise can be reduced and a lower rotation rate can be achieved.

D. Relationship between Minimum Rotation Rate and Optical Output Power
Simulations of the effects of the output powers on the DMRR, for spectral bandwidths of 70 nm and 20 nm for SLD and SFS, respectively, and with D.L = 150 m² and B = 60 Hz, have been obtained as illustrated in Figure V.

Figure V. Minimum rotation rate vs. output power for a) SLD and b) SFS

The total DMRR related to SLD and SFS changes in the ranges 1.573°/h - 0.646°/h and 1.717°/h - 0.945°/h, respectively, when the output power varies in the range 2 mW - 20 mW.

As seen from Figure V, the DMRR shows a falling tendency as the output power of the optical source increases. In designs, sources with higher output power can be selected for reaching a lower minimum rotation rate; on the contrary, increasing the output power causes a large amount of energy consumption and high cost. For this reason, sources with an optimum output power should be preferred, by evaluating the other parameters as well.

At the point where the output power is 9.2 mW, the DMRR decreases by approximately 80% and takes the values 0.829°/h and 1.077°/h for the two optical sources, respectively. This decrease results in approximately 80% improvement in the DMRR at the given value of output power for both sources.
IV. Conclusion
In this study, the effects of photon shot and excess noises, which limit the measurement sensitivity and accuracy of I-FOGs, on the DMRR have been analyzed for a basic configuration employing SLD and SFS. In addition, considering these noise effects, simulations of the relations between the DMRR and the parameters of fiber length, fiber coil diameter, photodetector bandwidth, output power, and spectral bandwidth of the optical sources have also been performed.

In I-FOGs, the measurement sensitivities show an increase when the fiber length (L) and the coil diameter (D) have higher values. However, choosing convenient values for both parameters is important for an optimum design, by reason of the increase in the dimensions and weight of the system and the optical attenuation. Increasing the bandwidth causes a rise in the DMRR of the system. In these kinds of designs, although choosing a low-bandwidth detector enables lower rotation rate measurement, high-bandwidth devices are preferred for sampling at high rates and increasing the performance of closed-loop systems. Spectral bandwidth and DMRR are inversely proportional, for the reason that an increment of spectral bandwidth reduces the effect of excess noise. The negative effects of excess noise in an SFS with narrow spectral bandwidth have also been observed in the other simulations performed in this study. This situation shows that the spectral bandwidth plays a vital role and is an important parameter in I-FOG designs. The use of optical sources with high output power minimizes the photon shot noise and provides lower rotation rate measurement. Thus, the measurement sensitivity can be increased by using a source with high output power but, in this manner, cost-effectiveness diminishes and power consumption shows an increase. Therefore, it is necessary to determine the optimum values in accordance with the application.

In this study, considering the system parameters for the optimum design as D.L = 150 m², B = 60 Hz, P = 9.2 mW and spectral bandwidths of 100 nm and 40 nm for SLD and SFS, respectively, the DMRR has been computed as 0.793°/h and 0.910°/h, respectively.

Consequently, employing optical sources with high spectral bandwidth and high output power is important for I-FOG designs to provide high accuracy and more precise measurements in navigation systems. When viewed from this aspect, this study will guide similar studies in the field and contribute to researchers for future investigations.

References
[1] Johannes Rünz, Folko Flehming, Wolfgang Rosenstiel, Michael Knoop, "Requirements and Evaluation of a Smartphone Based Dead Reckoning Pedestrian Localization for Vehicle Safety Applications", Advanced Microsystems for Automotive Applications, Springer, 2016.
[2] César Debeunne, Damien Vivet, "A review of visual-LiDAR fusion based simultaneous localization and mapping", Sensors, Volume 20, Issue 7, 2020.
[3] Tsunehiko Imamura, Tomohiro Matsui, Masaroni Yachi, Hideo Kumagai, "A low-cost interferometric fiber optic gyro for autonomous driving", in Proceedings of the 32nd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2019), pp. 1685-1695, 2019.
[4] Chris Goodall, Sarah Carmichael, Bob Scannell, "The battle between MEMS and FOGs for precision guidance", Analog Devices Technical Article, MS-2432, 2013.
[5] Hyang Kyun Kim, Michel J. F. Digonnet, Gordon S. Kino, "Air-Core Photonic-Bandgap Fiber-Optic Gyroscope", Journal of Lightwave Technology, Volume 24, Issue 8, 2006.
[6] Oguz Celikel, Ferhat Sametoglu, Huseyin Sozeri, "Optoelectronic design parameters of interferometric fiber optic gyroscope with LiNbO3 having north finder capability and earth rotation rate measurement", Indian Journal of Pure & Applied Physics, Volume 48, Issue 6, 2010.
[7] Ramón José Pérez Menéndez, "IFOG and IORG Gyros: A Study of Comparative Performance", Gyroscopes - Principles and Applications, IntechOpen, 2019.
[8] José Miguel López-Higuera, Handbook of Optical Fibre Sensing Technology, Wiley, England, 2002.
[9] Emirhan Sağ, Oğuzhan Coşkun, Güneş Yılmaz, "Modelling, simulation and balancing of a car direction with fiber optic gyroscope and fuzzy logic algorithms", in 2019 11th International Conference on Electrical and Electronics Engineering (ELECO), pp. 427-431, IEEE, 2019.
[10] Francis T. S. Yu, Shizhuo Yin, Paul B. Ruffin, Fiber Optic Sensors, CRC Press, USA, 2008.
[11] Mario N. Armenise, Caterina Ciminelli, Francesco Dell'Olio, Vittorio M. N. Passaro, Advances in Gyroscope Technologies, Springer Science & Business Media, Germany, 2010.
[12] Gerd Keiser, Optical Fiber Communications, Tata McGraw-Hill Education Private Limited, India, 2008.
[13] Emmanuel Desurvire, Erbium-Doped Fiber Amplifiers: Principles and Applications, John Wiley & Sons, Inc., Canada, 2002.
[14] William K. Burns, Robert P. Moeller, Anthony Dandridge, "Excess noise in fiber gyroscope sources", IEEE Photonics Technology Letters, Volume 2, Issue 8, 1990.
How Does the after-COVID-19 “ABCDEF” effects model affect the development of
Internet of Things and its Applications to improve Customer Experiences?
Prof. Ir Spencer Li
Co-founder & CTO of Smart Business Consultancy Limited
Professor of Hong Kong Adventist College
Guest Lecturer, The University of Hong Kong
Hong Kong, China
spencer@smartbusiness.com.hk
Abstract - Based on the author's 'after-COVID-19 "ABCDEF" effects model'1 defining an "architectural framework for the decision-making process," the paper examines how human factors and emerging technologies affect organizational behaviour in implementing the digital transformation of business processes through the adoption of the Internet of Things and its applications during the COVID-19 pandemic. Recently, 'COVID-19 has radically changed the global economy by accelerating the digital transformation to create New Normal customer experiences (CX).' This paper summarizes "next experience" initiatives by applying six pivotal elements of the 'after-COVID-19 "ABCDEF" effects model - Artificial Intelligence, Blockchain/Big Data, Customer Experience, Digital Transformation, Emotion, and Fintech.'1

Keywords - customer experience, digital transformation, internet of things, IoT, after-COVID-19 "ABCDEF" effects model

management should consider these three layers in the "next normal" era.

A. Customer Experience
A recent report from McKinsey emphasized optimizing customer journeys rather than merely focusing on touchpoints. Customers expect to receive an excellent end-to-end customer journey with clear customer experience (CX) goals. McKinsey's research concludes that "customer journeys are more strongly correlated with business outcomes rather than touchpoints."2,3

Even when each individual touchpoint is highly satisfying, the multiplier effect across touchpoints cumulatively decreases satisfaction with the end-to-end customer journey drastically (Fig. III).
made solutions to deliver organizational goals. "In an omnichannel world, customer care is increasingly becoming a significant factor"2,5 affecting customer satisfaction. Management ought to build up the big data captured in the customer journey, developing customer experience strategies to achieve business or corporate goals.

Customer care is always the core deliverable in the customer journey. Organization structures and functional touchpoints must be interdependent and mutually supportive.

Gartner's research said: "Eighty-one percent of customer experience (CX) leaders would compete mostly or entirely on CX. Just less than half believe CX can help the organization to drive business outcomes. Although CX aims to deliver goods and services exceeding customer expectations, only 48 percent rate their CX efforts as exceeding management expectations and only 22 percent say that their CX efforts exceed customers' expectations."2,5

"To address this challenge, Gartner unveiled the CX Pyramid, a new methodology to test organizations' customer journeys and forge more powerful experiences that deliver higher customer loyalty and brand advocacy."2,6

Recent research said that, due to technological breakthroughs and engineering improvements in wearable sensors, "smartwatches are being widely used in healthcare in the next five years. Data, collected by smartwatches, can be used for early diagnosis and remote patient monitoring."8

For the finance sector, it is very convenient to apply IoT to carry out customer services and Know Your Client (KYC) procedures.

Referring to the paper Bank 2.0 - The big shift9, most financial institutes have deployed the Bank 2.0 channel architecture to eliminate too many isolated banking and finance systems running together. Gradually, Bank 4.0 will come on stage to replace all banking systems with "Banking Everywhere but Never at a Bank."10 The paper predicts that IoT-based appliances can be used for implementing Bank 4.0 initiatives. For the banking sector, IoT plays a vital role in the business transformation from Bank 1.0 to Bank 2.0, and even to Bank 4.0.

On the investment side, more institutional investors, private equity, and venture capital funds are keen to invest in early-stage IoT projects.
Mesh Sensors
The wearables of the future would track the exact movement of the body rather than measuring heart rate, exertion, and sleep quality. They can be applied with new types of "garments which can collect sufficient data points which can be used by an app to determine the body's position in 3D space."11

Network Slicing for IoT Applications
Network slicing is the technique of applying different "latency, reliability, and bandwidth for IoT devices. Network slicing is recommended to be deployed on mission-critical IoT devices for long-term reliability."11

The manufacturers should be aware of ethical issues violating national legislation and international conventions, particularly at the United Nations level (Fig. VI).

The paper studies the in-depth implementation of IoT to enhance automation like RPA and Fintech in advanced engineering, technology, and applications.

The paper has identified the most appropriate applications of IoT to utilize the after-COVID-19 "ABCDEF" Effects Model rationales for New Normal businesses. The matching metrics (Table I) point out the implementation areas for "ABCDEF."
end customer journey, as customers become accustomed to accepting IoT appliances as a daily-life necessity.

Figure V. Eversensor
Customer care is the core element driving a better customer experience. Customer-centricity focuses on the understanding of customers' needs to derive customized strategies and solutions.

Figure VII. "after-COVID-19 "ABCDEF" Effects Model - System Architecture"1

B. Tables

Table I. Metrics of the after-COVID-19 "ABCDEF" Effects Model affecting IoT and Applications development

"ABCDEF" Effects | IoT and its Applications
Artificial Intelligence | AI brings a drastic change in digital transformation for applying IoT and Applications. The new ways of O2O services can be implemented by Robotic Process Automation (RPA) and related IoT appliances and wearables. Smart contracts generate legal contractual terms to eliminate unnecessary contractual and business exceptions in a trusted environment.
Blockchain & Big Data | With the open APIs of common data on the blockchain and big data from different stakeholders like commercial corporates, government, and NGOs, AI and ML tools can foster the development of hybrid blockchains, which deliver sophisticated services for multiple disciplines and cross-industries at a quicker pace. IoT-based alternative data drives businesses to transform and adapt rapidly.
Customer Experience | IoT and its Applications can deliver good customer care and customer-centricity services.

III. Conclusion
Development of the Internet of Things and its applications is fast. Globally, all developed and under-developed countries are building their 5G infrastructure to upgrade themselves into Smart Cities.

With IoT-based sensors it is easier to capture all data for data analytics purposes. As a result of the COVID-19 pandemic, new preventive measures like social distancing have been taken. The author has summarized the trend of human and business behaviors in the "after-COVID-19 "ABCDEF" effects model."1 The paper anticipates that IoT and applications play a vital role in the evolution of the customer journey by deploying customer journey mapping. The tripartite reciprocal driving forces amongst customer care, customer experience, and customer-centricity are shaping customer behaviour with satisfaction.

From the macro side, huge institutional investments and government funds are in the market seeking good investments in early emerging technologies. Great demand for skillful human capital and technology transfer like IP and patents are the hot topics for academics and top corporate management to consider and study.

We believe that IoT and its applications can add value to the six pivotal elements of "the after-COVID-19 "ABCDEF" effects model - AI, Blockchain, Big Data, Customer Experience, Digital Transformation, Emotion, and Fintech."1 The world is so fantastic. By 2030, humans will be connected and served
by 125 billion IoT devices, with new emerging technologies deployed at an unprecedentedly fast pace.

References
[1] Spencer Li. 2021. "How does COVID-19 Speed the digital transformation of Business Processes and Customer Experience?", Special Issue in Fintech of "Review of Business" (St. John's, New York), 41(1), 1-14, 2021. https://www.stjohns.edu/sites/default/files/uploads/Review-of-Business-41%281%29-Jan-2021.pdf
[2] Spencer Li. 2021. "How Does Digital Transformation Improve Customer Experience?", The Palgrave Handbook of Fintech and Blockchain, 487, Jun 2021. DOI: 10.1007/978-3-030-66433-6_21.
[3] Wray, Sarah. 2016. "Optimize Journeys Not Touchpoints – Here's Why and How." McKinsey & Company. https://inform.tmforum.org/customercentricity/2016/09/mckinseyoptimize-journeys-ottouchpoints-heres/. Accessed September 2016.
[4] Glagowski, Elizabeth. "The Race Is On: Rethinking Your Digital Strategy." ttec.com. https://www.ttec.com/articles/digital-customer-experience-strategysix-key-areas-focus-your-efforts.
[5] Lotz, Stephanie, Julian Raabe, and Stefan Roggenhofer. 2018. "The Role of Customer Care in a Customer Experience Transformation." McKinsey & Company. https://assets-prod.mckinsey.com/~/media/McKinsey/Business%20Functions/Operations/Our%20Insights/The%20role%20of%20customer%20care%20in%20a%20customer%20experience%20transformation/The-role-of-customercare-in-a-customer-experience-transformation-vf.ashx.
[6] Kelly Blum. 2018. "Gartner Says Customer Experience Pyramid Drives Loyalty." Gartner Inc. https://www.businesswire.com/news/home/20180730005056/en/Gartner-Customer-Experience-Pyramid-Drives-Loyalty-Satisfaction.
[7] Gartner, Inc. June 2020. "How to Leverage the Top 5 CX Trends in 2020." Gartner Inc. https://www.gartner.com/en/conferences/apac/customer-experience-australia/gartner-insights/gc-rn-top-cx-trends
[8] Dominic Hasler. 2021. "Looking after the health of your ATM fleet in a futuristic way." www.fintechfutures.com, 2021. https://www.fintechfutures.com/2021/03/looking-after-the-health-of-your-atm-fleet-in-a-futuristic-way/
[9] Peter Mũya H. 2012. "Bank 2.0 - The big shift." https://www.slideshare.net/themuyas/bank-20-the-big-shift
[10] Brett King. 2019. "Bank 4.0: Banking Everywhere but Never at a Bank", 2019.
[11] Dylan Martin. 2021. "5 Emerging IoT Technologies You Need To Know In 2021." www.crn.com, 2021. https://www.crn.com/news/internet-of-things/5-emerging-iot-technologies-you-need-to-know-in-2021
[12] World Economic Forum. 2019. "White paper: AI Governance – A holistic approach to implement ethics into AI." January 2019.
[13] Kyoko Tamur, Raghu Gullapalli. 2019. "The Secret to Maximizing the Industrial IoT." Accenture, 2019. https://www.accenture.com/_acnmedia/pdf-108/accenture-apac-insight-dsf-tamura-final-lowres.pdf
[14] United Nations Industrial Development Organization (UNIDO). 2020. "COVID-19 Implications and Response—Digital Transformation and Industrial Recovery." Vienna, Austria: UNIDO. https://tii.unido.org/news/covid-19-digital-transformation-industrial-recovery
[15] Businesswire. 2018. "Gartner Says Customer Experience Pyramid Drives Loyalty, Satisfaction and Advocacy." businesswire.com, 2018. https://www.businesswire.com/news/home/20180730005056/en/Gartner-Customer-Experience-Pyramid-Drives-Loyalty-Satisfaction.
[16] Capgemini. 2020. "Capgemini Customer Experience." capgemini.com, 2020. https://www.capgemini.com/service/digital-services/customer-experience/
Drawing as a Scientific Method. The School of Agricultural
Engineers in Madrid: a case study
Jara Muñoz-Hernández
School of Architecture (ETSAM)
Polytechnic University of Madrid (UPM)
Madrid, Spain
jara.munoz@upm.es
https://orcid.org/0000-0003-2530-2892
Abstract—In the current debate on the application of new media in the documentation of architecture, the need to preserve the values of the architectural drawing tradition stands out. This paper proposes its core role in the concept of graphic reconstitution as a method of integration of new and traditional media for the advancement of knowledge and the dissemination of architecture. A research project applying this method to the School of Agronomists in Madrid is used to exemplify this theory.

Keywords—architectural drawing, graphic reconstitution, School of Agricultural Engineers, Madrid's Ciudad Universitaria

I. Introduction

For architecture, drawing is, inevitably, the instrument of thought, concretion and communication. From the first moment of ideation of an architectural element to the resolution of details on the construction site, the entire project process goes through being drawn [1]. This paper aims to analyze how drawing, in addition to being the language of architecture – and of many other technical disciplines –, is also an absolutely effective scientific instrument when it comes to researching lost or modified architectural heritage, and even architecture that was devised but never built.

This graphic methodology has been applied in my PhD thesis, which deals with the birth and development of Madrid's School of Agricultural Engineers, the first university complex to be built on what is today the main campus of the capital, the Ciudad Universitaria. Based on everything analysed in the thesis, the aim here is to synthetically present the method, take the School of Agronomists as a case study and show the results obtained, in such a way that the method can also be applied to other areas of study.

This research is not an isolated work, but is included within a broader research framework related to the drawing of the city and architecture, from which various scientific articles [2] and several PhD theses have been produced in the old research group of the Technical School of Architecture of Madrid, Drawing and Documentation of Architecture and City. All of them are studies that start from work begun a long time ago, whose first verifiable result is the book La forma de la villa de Madrid [3].

II. A case study: The School of Agricultural Engineers

Nowadays, the northwest corner of Madrid is unfailingly linked to the Ciudad Universitaria. The social, political, urban and symbolic value of the campus has ended up imposing its name over this whole area of the city, relegating to oblivion – or reserving them for specific places – the names bound to the origin of this space: Florida and Moncloa.

In 2019 the School of Agronomical Engineers celebrated its 150th anniversary. It was the first institution established on these territories, coinciding in time with the end of the Royal Estate of La Florida and La Moncloa. Its status as a Crown property had indeed preserved the estate from urban growth, and it would also be so in the following decades, once the State decided to transfer it to the agronomists, thus starting its educational and research trajectory. Today the School occupies a small part of the campus, but in 1869 the entire La Florida estate had been surrendered to the institution. The grounds would later be occupied by various charity, sanitary and recreational centers. La Moncloa was hence configured as a natural park facing Casa de Campo, performing a transition between the city – which had reached its limits with the construction of the Argüelles neighborhood – and the mount of El Pardo. The fusion of the cultivated land and the public garden, together with the distinctive topography of the landscape, gave this area a picturesque character that enjoyed great success among the people of Madrid.

La Moncloa as a Royal Site until 1869 and as a university campus since 1927 has already been studied. However, the time in between those two dates has only been partially researched in several papers. Consequently, its architectural and urban aspects had not been approached in a global way. It is yet to be understood how the territory and pre-existing constructions were occupied after the transfer to the State, how the area developed during the sixty years until the creation of the Ciudad Universitaria, and how this project coexisted with the School of Agronomists and other institutions until the Spanish Civil War (1936-1939). Unfortunately, the excessive and uncontrolled growth since the late sixties almost completely erased the traces of the past.

This paper aims to approach all these questions through drawing, which is understood as a source of information, an instrument of analysis and a means of expressing results. In this way, graphic narration becomes an essential and enriching complement to the written discourse, showing the research area at various key moments of its development. The result is a sequence of drawings of one same place at different moments in time that allows recovering – in a virtual way – the lost memory of this place, as well as contributing to the knowledge of the urban form of Madrid.

III. Method

In order to achieve the objectives described above, systematic work in archives and libraries has been carried out. However, the main methodological instrument of this research is drawing. This work can be framed within lines of research that have shown that drawing is an effective tool in the analysis of the urban form and in the transmission of results and conclusions.

In another sense, drawings are also a source of information. Thus, our work is nourished by the documents that architecture and the city generate and that are created during their development. The set of all these drawings is what has been called the graphic life of buildings [4].
Most of the graphic documentary sources examined in this research come from libraries and public archives. We will not stop to analyze the origin of the sources – [5] can be consulted for more information – but it is convenient to refer to the most important ones. Obviously, the information that each archive can provide is closely linked to the changes in ownership of the area of study. As La Florida was a property of the Crown, the General Archive of the Palace contains written documentation and plans from that historical period. Later, once these lands were transferred, the information is kept in other archives. The General Administration Archive, which preserves the documentation of the different Spanish ministries from the second half of the 19th century and the 20th century, contains information on the project of the School of Agronomists and a large part of the multitude of buildings that were built in the surroundings of the Palacete de La Moncloa. From the 1920s, when the Construction Board of the Ciudad Universitaria was constituted, and especially after the Civil War, the General Archive of the Complutense University of Madrid is also an essential place of consultation.

The systematic collection of information is essential as a starting point, both for written work and for the production of graphic documentation. However, this compilation does not end at a fixed point; rather, research and work feed one another. It is clear in the drawing of plans, for example, that with a good collection of old graphic documentation, writings, etc., it is possible to begin to draw, but it will always be necessary to fill in the gaps, which forces a new search for documentation while those plans are being generated. It is this back-and-forth process that builds the graphical base of which we spoke before, one that must be understood not as a closed and finished product, but rather as a work into which information can continue to be poured in a constant process of expansion and growth.

IV. Drawing as a scientific instrument

Up to now, drawing has been discussed from the point of view of a documentary source. However, in addition to its value as a container of information, it can also be considered as a tool for thought and analysis, as a project towards the past – of what existed or what could have existed – or as a method to illustrate and reflect the results of an investigation. All these meanings will be taken on by the accompanying drawings, which are the research we are talking about.

It is worth stopping at the use of drawing in a scientific and rigorous way, analyzing the original documentation, whether graphic or written, and synthesizing it in a graphical, georeferenced database, which can give rise to complete planimetries, 3D models and images, in what we understand as a graphic reconstitution of the case study: "... the term reconstitution would be reserved for those drawings that attempt to reflect one or more states of the building that no longer exist or that never existed, but that could be part of its biography. Note that the important difference is that in the second case, given the almost always incomplete data, it is usually necessary to introduce a certain dose of interpretation we would like to assimilate to a certain idea of a project" [4].

On this same basis it is possible to take a trip to the past, in which drawing is an essential means of shaping that graphic life of buildings. In this process of graphic reconstitution, the most important thing is to minimize the data subject to speculative interpretation. Obviously, this uncertainty is accentuated the further back one goes on the timeline.

The first step of this graphic reconstitution is to set those elements which remain, that is, that existed then and still exist today. In order to do so, they are identified on the current urban parcel map, which we consider the most reliable georeferenced cartographic source. These elements are the only certainties that are available, and they serve as a reference for graphic reconstitution [3]. These persistences can be buildings or constructions, that is, elements with a clear material dimension, or they can be elements related to the orography, water courses, roads or property limits – which can manifest themselves in a variable way at different historical moments, either through built boundaries, fences, street names... –, that is, elements with an immaterial character [6].

The second step is the location, based on the reference marked by the persistences, of missing or transformed elements for which there is measurable documentation. In the case of La Moncloa, which for a long time was a sparsely built area, it happens that some buildings are built on top of others, or some parts are reused for their development, which also makes it possible to take the current building as a reference to locate the old one (Figure I).

Figure I. Process of graphic reconstitution of the location and layout of the Porcelain Factory (bottom), taking as a reference the current building of the School (top), its layout in 1927 and the situation with respect to this of the Machine Building (center).
For example, in the case of the teaching building of the School of Agronomists, we have nowadays the original building partially enlarged and modified, as can be seen in the upper drawing of Figure I. From it, we can draw the layout of the original building, of which we keep plans. And, once the original building is located, it has been possible to locate one of the auxiliary buildings that existed in the 20th century and that were later demolished. This is what the middle drawing represents, where the footprint of the current building that has served as a guide is kept as a red line. Finally, and following the same graphic codes, we have been able to locate the building prior to the current one of the School of Agronomists, which had been the Crown Porcelain Factory [7]. This has been possible thanks, on the one hand, to knowledge of the situation of the secondary buildings, which coexisted in time with this construction, and, on the other, to information obtained from historical photography: the Porcelain Factory and the new School of Agronomists coincided on their north façade. It was relatively easy to obtain the floor plan of the Porcelain Factory building, since we have some historical plans with measurements. However, it was not so easy to place it in a space with very few built references. This is an example of how drawing, rigorously built, allows us to obtain this information. In the same way, the plans of the complete area that will be shown later have been produced.

Naturally, matching historical cartography directly to one another or to the current city maps is a useless task, since the different projection systems and the distortions and inaccuracies of the survey methods of each period prevent it from being transferred directly onto the current parcel map. This problem is even more accentuated in sparsely built areas, where it was more difficult to establish benchmarks from which to measure. For this reason, in this case, references have been sought in the present, or in the more controlled past, "building by building", in order to later be able to draw with precision those buildings for which there was data. These data have basically been obtained from individual projects or, also, from the measurements of the polygons of some buildings that appear in the field sheets at a scale of 1:500 that were drawn for the subsequent elaboration of the urban plot plan of the Statistical Board (1860-1870), which are preserved in the National Geographic Institute.

The third step of the process consists of defining those elements whose precise shape or location could not be determined from dimensional documentation, and for which there are currently no references that can be taken. It is then that one enters the field of interpretation and speculation, in a process very similar to that of an architecture project, which does not mean that there are no reference elements on which to rely in order to establish sensible hypotheses: historical cartography, written documents, press, photographs, paintings and engravings...

Once this graphic reconstitution has been carried out and reliable drawings have been obtained, one can experiment with the tools at our disposal – the most common ones, newer ones, or a combination – to decide how to display these drawings.

The comparative method is one of the most powerful tools for analyzing and understanding the urban form of cities or, in this case, of the same city at different moments in time. By referring all the drawings to the same cartographic base, with a common scale and the same graphic variables, we can establish a "graphic parallel": "We would thus arrive at the concept of graphic parallel, which we could enunciate as the result of presenting, at the same time and according to the same criteria, a series of different architectural facts in order to classify or compare them. Thus, the graphic parallel becomes a preliminary step in any systematic study with a scientific vocation in which more than one architectural object is involved and which has more or less form as a direct reference" [8].

In this case study, the application of the graphic parallel has consisted in comparing the same spatial area at different moments in time. This area has been drawn at two different scales, which make it necessary to approach the drawing in different ways.

A. Spatial scope of the drawings and temporal sequence

From this case study, three packages of plans of the same historical sequence have been made, each one with a different spatial framing (Figure II): an urban one, at a scale of 1:7,500, in which the territory of La Moncloa occupied by the agronomists is represented, together with its relationship with the city and with structural elements of the landscape, such as the Manzanares River or the Casa de Campo. The second and third frames, both at 1:2,000 scale, reproduce the surroundings of the School of Agronomists and the whole of the Model Farm that was developed in the vicinity of the Palacete de La Moncloa. This second close frame, due to historical circumstances, disappears in the time scenes after the Civil War. Architecture is already represented at this scale, and includes the graphic reconstitution of all those buildings for which a minimum of information has been found to establish hypotheses.

Figure II. Spatial scope of the research over Madrid's satellite view.

Temporally, key moments in the development of the urban form of the study area have been chosen, at which its image has been "frozen" in order to study it:

-1870, as the moment of the transfer of the property and the preliminary state of the place.

-1890, a state prior to the end of the century and the appearance of institutions other than the School, the year in which the Alfonso XII Agricultural Institute was founded, the
first refurbishment works were undertaken and the expansion
process started.
-1910, as a state immediately prior to the new building of
the School and the date in which the development of the
institution is already remarkable. This section has been drawn
from the population plan of the same year.
-1927, the last moment of the School before the
construction of the campus (Figure III).
-1936, date marked by the beginning of the Civil War.
-1955, a state in which the first reconstruction projects of
the School building had already been undertaken.
-2020, as the current image, the result of the processes seen throughout the work.
order to be able to compare them with the original project plans or subsequent renovations. Data collection is essential in the production of close-scale documentation. Through the first sketches, the order of the buildings, the modulation and the existence of repeated elements are observed.

From these plans, 3D surveys have been carried out, either modeling with a high degree of detail, to obtain descriptive drawings, or performing a simple extrusion and afterwards mapping the original plans onto it, for models with a more analytical purpose. For the modeling of the School of Agronomists, the order module – including column, cornice, frieze and brick panel – has been built and the complete building reconstructed from it.

The interest of modeling goes beyond obtaining three-dimensional images of a building that no longer exists today: by studying current and old photographs and placing the same cameras within the model, highly accurate comparisons can be made that allow reconstructing the landscape of the primitive Ciudad Universitaria [11].

These working methods should not be considered as separate elements, but as a set of combinable tools. Working with photography is a round trip, always in conjunction with the rest of the graphic and written methods. The first step in making a photographic comparison of a specific area or building is to know what has been written about it and to study the different planimetries in order to establish the time and place where the image was taken. It is important to check each photograph and identify it well before continuing to work with it, as photographs with totally wrong descriptions are often discovered. One of the consequences of the Civil War was precisely the destruction and disorder of the documentation housed in the faculties. The widespread chaos after the war caused many photographs to be quickly and poorly cataloged.

Once the photographed place has been identified, two- and three-dimensional drawing acquires an important role, since, through the floor plans, we can roughly locate the position of the photographer, and later we can achieve even greater precision by means of the placement of cameras in three-dimensional models. Once located, current photographs are taken on the ground, looking for the studied point and making the last position adjustments that are necessary, provided that the current state of the place allows it.

Finally, after taking two photographs of the same place at different times, they can be compared and analyzed in depth. The meticulous study of these photographs helps to specify details that do not appear in the plans, or to adjust for the modifications that were made during construction and were not reflected in the plans. They are also a valuable source of information to learn about the original finishes of these buildings and see how they have changed. The "skin" of buildings is one of the parts that undergoes most transformations over time, so this will be the most reliable document for its reconstruction.

Once the drawings have been constructed, one must think about how to display them to explain the research, that is, in its narrative dimension. Again, the role of drawing is very interesting because it allows us to establish narrative lines at different levels: absolutely descriptive of reality, as seen in some examples above, or partially descriptive, allowing us certain licenses that let us better show ideas. For example, in the case of the School of Agronomists, the symmetry of the main facade of the building has been used to show, on one side, the original building and, on the other, the current one, so that, in a very synthetic way, it is possible to understand the transformation of the building (Figure V).

Figure V. Main elevation of the School of Agricultural Engineers. On the left, the façade in 1936; on the right, the current façade.

Another narrative level of great interest is that of analytical drawing. In this case, simpler drawings are usually used, with a lower degree of detail, but which nevertheless condense a large amount of information. In this paper two examples are offered: on the one hand, that of the constructive evolution of the School of Agronomists, where, in a schematic way and with a color code, the change of the building is shown since its initial construction (Figure VI).

Figure VI. Evolution of the School of Agronomists from 1869 to the present.

On the other hand, thanks to the fact that all these drawings have been scientifically built, a certain precision can be guaranteed, which has enabled me to quantify the surfaces built in this territory, in addition to determining which surface was destroyed during the Civil War and how much was rebuilt. The conclusion is the following: in the area dedicated to teaching, the percentage of destruction was very similar to the campus average – 42%, although the School of Agriculture building suffered especially – but in the Model Farm and the surroundings of the Palacete the destruction was practically total (86%) and, in addition, it was decided not to rebuild it, contrary to what happened with the teaching environment. With this decision, the School was de facto
incorporated into the campus as a whole, and the hegemony that until that moment it had held, at least in extension, was eliminated.

V. Conclusion

In an investigation where drawing is an inherent part of the research, it also becomes the result and conclusions of the work. Thus, graphic material that did not exist to date has been produced, representing a space in the city of Madrid at different times, in a constant journey between territorial and architectural scales.

The comparative analysis of these historical sequences allows, thanks to the unity of scale, framing and graphics, a fluid reading of the evolution of the field of study in two aspects: the spatial, which orders the elements at a specific moment, and the temporal, which orders the transformations successively (Figure VII).

As indicated above in reference to the location of already known places and constructions, drawing is precisely the tool that has made it possible to position them in specific coordinates and plot them in the plans prepared for the thesis.

On the other hand, drawing has also been considered in its narrative dimension, understanding it as an essential complement to written discourse. This graphic narrative supposes, in itself, a contribution to the knowledge of the city, insofar as it provides an image of it that did not yet exist. In addition, the support of the drawings, which is not only physical on paper but also digital, since they have been made with computerized means, generates a georeferenced database that becomes a piece of a much larger set of previous research on the city of Madrid, available for future studies. In this sense, there is also a possible further development of the research in means of representation towards more applied aspects, since the abundant graphic information that has been produced here could be organized in a Geographic Information System. This path has already been started by building all the drawings on a common digital basis, from which a database could be generated in which the current campus and its previous states could be considered. This would work as a starting document for any urban development intervention, diffusion activity and even heritage recovery project.

Figure VII. Drawings of the research area at each of the key dates.

References
[1] Jorge Sainz, El dibujo de arquitectura: teoría e historia de un lenguaje gráfico, Reverté, Barcelona, 2005.
[2] Jara Muñoz-Hernández, Carlos Villarreal-Colunga, "Las andanzas de la portada de Oñate tras la demolición de la casa-palacio: calle Mayor, Teatro Español, La Moncloa", Arqueología de la Arquitectura, volume 17: e094, 2020. https://doi.org/10.3989/arq.arqt.2020.003
[3] Javier Ortega-Vidal, Francisco José Marín-Perellón, La forma de la villa de Madrid. Soporte gráfico para la información histórica de la ciudad, Dirección General de Patrimonio Histórico, Madrid, 2006.
[4] Javier Ortega-Vidal, Ángel Martínez-Díaz, María José Muñoz-de-Pablo, "El dibujo y las vidas de los edificios", EGA Journal, volume 18, 2011. http://doi.org/10.4995/ega.2011.1335
[5] Jara Muñoz-Hernández, La Escuela de Ingenieros Agrónomos en La Florida-Moncloa [PhD thesis], Universidad Politécnica de Madrid, 2020. https://doi.org/10.20868/UPM.thesis.65305
[6] Luis Sobrón-Martínez, Al Este del Retiro [PhD thesis], Universidad Politécnica de Madrid, 2015.
[7] Jara Muñoz-Hernández, "De la Fábrica de Porcelana a la Escuela de Agrónomos de Madrid", Revista de Humanidades, volume 41, 2020.
[8] María José Muñoz-de-Pablo, Ángel Martínez-Díaz, "El paralelo. Bosquejo de un método gráfico", EGA Journal, volume 23, 2014. https://doi.org/10.4995/ega.2014.2172
[9] José Luis González-Casas, Jara Muñoz-Hernández, "The urban and environmental impact of Madrid's Ciudad Universitaria: A comparison between the first campus and the post-war campus", International Journal of Sustainable Development and Planning, volume 15, issue 6, 2020. http://doi.org/10.18280/ijsdp.150612
[10] Jara Muñoz-Hernández, José-Luis González-Casas, "Traces and scars. The reconstruction of Madrid's Ciudad Universitaria after the Spanish Civil War", WIT Transactions on The Built Environment, volume 191, 2019. http://doi.org/10.2495/STR190181
[11] José Luis González-Casas, Jara Muñoz-Hernández, "Drawing for heritage dissemination. The birth of Madrid's Ciudad Universitaria", International Journal of Heritage Architecture, volume 2, issue 2, 2018. http://doi.org/10.2495/HA-V2-N2-359-371
Urban distribution network proposal: A case study for the 14th
district of the city of Medellín.
Juan P. Vasco-Gallo J. Isaac Pemberthy-R. Eduard A. Gañan-Cardenas
Production Engineering Student Department of Quality and Production Department of Quality and Production
Instituto Tecnológico Metropolitano Instituto Tecnológico Metropolitano Instituto Tecnológico Metropolitano
Medellín, Colombia Medellín, Colombia Medellín, Colombia
juanvasco241321@correo.itm.edu.co jorgepemberthy@itm.edu.co eduardganan@itm.edu.co
service. For this purpose, an integer linear programming optimization model (MILP) was built, obtaining agile results in a first scope focused on a district in the city of Medellín, Colombia. The main theoretical contribution of this work is the development of an efficient MILP that responds to the objective of a problem in a real scenario.

Figure I. Food home-delivery search trends in the Google search engine in Colombia. Source: Google Trends ® [5].
Keywords— City logistics, Urban logistics, Operations research, Optimization.

I. Introduction

The problem of urban distribution of goods, or city logistics, has been widely studied over the years, and today is no exception, since logistics is one of the most important activities in the context of business and the developed cities of the XXI century. Nowadays, city logistics is becoming more noticeable because of the growth of e-commerce [1] and the global urbanization trends that force modern cities to offer opportunities for employment, education, culture, health and sports, among other activities, as well as the development and growth of industries. This leads to the expansion of urban areas and the increase of road traffic, and consequently to increased environmental pollution, vehicular congestion and negative social impacts, generating a poor quality of life for citizens [2], in addition to an inefficient and ineffective service that reduces the level of service of city logistics [3].
Today, given the situation of confinement generated by the COVID-19 pandemic, the demand for groceries through home delivery services or e-commerce has increased, putting more strain on last-mile logistics in the city [4]. As can be seen in Figure I, searches for grocery delivery services in Colombia have increased considerably in the last year because of the pandemic. Figure II and Figure III show that the state (Antioquia) and the city (Medellín) are among the most represented populations in the Google search index. This increase in Google® searches is a good indicator of the increase in the execution of services of the same type. In this way, urban logistics can have an impact on aspects such as traffic congestion in the city and the generation of vehicular pollutants.

Figure II. Choropleth map of the search index of grocery home delivery services by states in Colombia. Source: Google Trends ® [5].

Figure III. Colombian cities with the highest search rates during 2020 (bar chart of the Google search index, 0-100, for La Calera, Sabaneta, Pasto, Envigado, Zipaquirá, Cajicá, Bogotá, Chía, Medellín and Ibagué). Source: Google Trends ® [5].
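The figures above are built from Google Trends data. As a hedged illustration of how such series can be retrieved programmatically, the sketch below uses the community pytrends client; the library choice and the query term are assumptions, since the paper only cites the Google Trends web service [5].

# Illustrative retrieval of Google Trends series like those in Figures I-III.
# Assumptions: the pytrends client and the query term "domicilios mercado";
# the authors may have used the Google Trends web interface directly.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="es-CO")
pytrends.build_payload(["domicilios mercado"], geo="CO",
                       timeframe="2019-01-01 2021-01-31")

interest_over_time = pytrends.interest_over_time()   # time series (cf. Figure I)
interest_by_region = pytrends.interest_by_region()   # by department (cf. Figure II)
print(interest_over_time.tail())
print(interest_by_region.sort_values("domicilios mercado", ascending=False).head())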
A possible strategy proposed by some authors to address city logistics is the use of urban warehouses (WH), also known as urban consolidation centers [6], whose main function is to redirect as much as possible of the flow of goods and provide efficient transportation between the WH and the urban areas of the city, by switching from long-distance cargo vehicles to short-distance vehicles [7]. This is evidenced in works where the decision of whether to install urban WH is made by applying mathematical models; in one case, integer linear programming (MILP) is applied to define the strategic installation of WH in the urban perimeter, concluding that satellites should be installed on the peripheries of the city [7]. Another notable case is based on a descriptive-survey methodology, applying a multicriteria structure for the sustainable implementation of urban WH in cities, where the results recommend not installing any WH in a small city in Brazil [8]. A further case study applies linear programming models to the characterization of the supply chain of high-production bovine products in the province of Sabana Centro (Colombia), where the model results in opening several centers in different strategic cities for the company [9].
In this work, a case study is developed in the city of Medellín, which today ranks among the most congested cities in the world according to the INRIX index [10]. The case seeks to address a WH location problem by applying an integer linear programming model that optimizes and responds to current needs with respect to the distribution of basic products of the family basket. District 14 of Medellín, called El Poblado, is chosen as the case study. This district has special characteristics, such as a concentration of young population with a high socioeconomic level, which makes its population more prone to the use of technological services and therefore to home delivery services.

II. Materials and methods

Given the current situation, a sequence of phases has been constructed which allows proposing an improvement to the problem posed, by means of the consolidation or installation of WH.

A. Characterization and delimitation of the case of study

Medellín is the capital of the state of Antioquia and lies in the Aburrá Valley, in the center of the state. This valley is composed of 10 cities forming the Metropolitan Area. Medellín is the most populated city in the valley, with a population of 2.5 million of the nearly 4 million inhabitants of the Valley. Medellín is located in the middle of the Valley and is divided into 16 districts comprising 249 neighborhoods, as illustrated in Figure IV below.

Figure IV. Medellín districts' location and population. Source: [11], [12].

For this purpose, district 14 El Poblado is selected as the scope of the study, due to the importance of this area for the city, as it is the second district with the greatest logistic influence in the city [13]. This district has 22 neighborhoods, and its population shows decreasing growth, according to the population growth rate and demographic profile [14].

B. Input data

To find the solution to this kind of problem, different information is required depending on the model to be applied in the development of the solution.

▪ Geographical locations

Based on a guide map of the division by neighborhoods of district 14 El Poblado, an overlay of a cartesian plane is made in order to establish the coordinates (x, y) for each neighborhood under study. To do so, we decided to work with their centroid points as the reference point of location. The different candidate locations for WH are defined at points that have premises available for leasing, or spaces for industrial facilities usable as WH. Given that district 14 is mainly of urban character, there are few candidate locations in this territory in which to locate a warehouse. The locations of the centroids of each neighborhood (red dots), together with the candidate locations for WH (blue triangles), are shown in Figure V.
Figure V. El Poblado district's location, neighborhood centroids and WH candidate locations.

▪ Distance between locations

For this purpose, it was decided to work with Euclidean distances on the defined cartesian plane. The distances used for the case are calculated from the centroid points of each neighborhood to the different candidate locations for the WH. The distances are calculated using QGIS® software [15].
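As a minimal sketch of this step, the Euclidean distance matrix d_ij can also be computed directly from the centroid and candidate coordinates; the coordinate values below are hypothetical placeholders, not the study's data (which were processed in QGIS).

# Minimal sketch of computing the Euclidean distance matrix d_ij;
# the (x, y) values are hypothetical placeholders, not the study's data.
import numpy as np

centroids = np.array([[1.2, 3.4], [2.8, 1.9], [4.1, 4.0]])   # neighborhood centroids (km)
candidates = np.array([[2.0, 2.0], [3.5, 3.5]])              # candidate WH locations (km)

# d[i, j] = straight-line distance from neighborhood i to candidate j
d = np.linalg.norm(centroids[:, None, :] - candidates[None, :, :], axis=2)
print(np.round(d, 3))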
▪ Transportation and facility costs

The cost of shipping goods is assumed to be COP $819.5 (USD $0.22) per kg shipped per kilometer traveled; this cost is taken as a reference from the "Rappi® delivery" platform [16]. On the other hand, the installation cost is assumed as the average cost of renting premises such as warehouses or commercial spaces in malls or shopping centers present in the study area, which is COP $19 million on average (USD $5,201.26). This information was obtained from the platform of the local leasing company in the city [17].

C. Mathematical statement of the problem

The model objective is to find a proposal for the opening of urban WH in El Poblado district, seeking to minimize the costs of opening and of transporting basic necessities in the last mile to the various neighborhoods, while complying with the assignment of each neighborhood to only one proposed WH located at a distance of no more than 2 km. For the formulation of the problem, two binary decision variables were used: x_j, which establishes whether a WH is proposed to open in candidate location j, and y_ij, which establishes whether the attention of neighborhood i is assigned to the proposed WH in location j. Table I defines the terms used for the linear modeling of the problem.

Table I. Definition of mathematical model terms.

Sets
N: Set of neighborhoods available in El Poblado district. N = {1, 2, 3, …, n}
U: Set of candidate locations for the location of urban WH. U = {1, 2, 3, …, u}

Indexes
i: Identifies each of the neighborhoods in the district, where i ∈ N.
j: Identifies candidate locations throughout the district, where j ∈ U.

Parameters
n: Number of available neighborhoods in district 14, El Poblado.
u: Number of candidate locations for urban WH in district 14, El Poblado.
C_ij: Estimated cost of shipping one kilogram per kilometer in the city of Medellín from candidate location j to neighborhood i.
P_i: Population of neighborhood i.
d_ij: Euclidean distance in kilometers from the centroid location of neighborhood i to candidate location j.
CI_j: Cost of installing or opening a WH in candidate location j.
a_ij: Binary parameter of compliance with the maximum coverage distance for the assignment of a client i to a WH at candidate location j.

Decision variables
x_j: Binary variable that establishes the opening of a WH in candidate location j.
y_ij: Binary variable that establishes the allocation of attention of neighborhood i to a WH located in candidate location j.
The mathematical model, formulated in binary integer linear programming, is presented below with expressions (1) to (4).

Min Z = ∑(j∈U) x_j ∙ CI_j + ∑(i∈N) ∑(j∈U) y_ij ∙ C_ij ∙ P_i ∙ d_ij    (1)

∑(j∈U) a_ij ∙ y_ij = 1,   ∀ i ∈ N    (2)

y_ij ≤ x_j,   ∀ i ∈ N, j ∈ U    (3)

x_j, y_ij ∈ {0, 1},   ∀ i ∈ N, j ∈ U    (4)

The objective function (1) seeks to minimize the costs of the exercise. The first term represents the assumed cost of each opened facility; the second term sums the transportation cost from the WH at location j to each assigned customer i. The group of equations (2) guarantees the assignment of each neighborhood i to a WH facility at a candidate location j in compliance with the maximum coverage limit. Equation group (3) guarantees that customers i are assigned only to active WH at candidate locations j. Finally, the group of equations (4) ensures that the variables x_j, y_ij are binary.
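The paper implements this model in Python with the CPLEX optimizer (see Section III). As a minimal, hedged sketch of the formulation, the fragment below restates it with the open-source PuLP library and toy data; the explicit forms of constraints (2) and (3) follow the verbal description above rather than the authors' printed formulation, so treat them as a plausible reconstruction.

# Sketch of the binary facility-location model with toy data, using PuLP
# instead of the paper's CPLEX setup; constraints (2)-(3) are reconstructed
# from their verbal description, not copied from the paper.
import pulp

N = range(3)                     # neighborhoods i
U = range(2)                     # candidate WH locations j
CI = [19e6, 19e6]                # opening cost per candidate location (COP)
P = [5000, 8000, 3000]           # population of each neighborhood
C = [[819.5, 819.5] for _ in N]  # shipping cost per kg per km (COP)
d = [[1.1, 2.4], [0.8, 1.7], [2.6, 0.9]]                  # distances d_ij (km)
a = [[1 if d[i][j] <= 2.0 else 0 for j in U] for i in N]  # 2 km coverage a_ij

prob = pulp.LpProblem("urban_wh_location", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", U, cat="Binary")
y = pulp.LpVariable.dicts("y", [(i, j) for i in N for j in U], cat="Binary")

# (1) opening costs plus population-weighted transportation costs
prob += pulp.lpSum(x[j] * CI[j] for j in U) + pulp.lpSum(
    y[i, j] * C[i][j] * P[i] * d[i][j] for i in N for j in U)

for i in N:
    # (2) every neighborhood assigned to exactly one WH within coverage
    prob += pulp.lpSum(a[i][j] * y[i, j] for j in U) == 1
    for j in U:
        # (3) assignments allowed only to opened warehouses
        prob += y[i, j] <= x[j]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("open:", [j for j in U if x[j].value() == 1])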
III. Results

The mathematical model was implemented in Python 3.9 and solved with the IBM ILOG CPLEX Optimization Studio 20.1.0 optimizer. The results were obtained on a 2.3 GHz Ryzen 7 computer with 8 GB of RAM, running the Windows 10 Professional operating system. The proposal seeks to establish a warehouse management scheme from the public sector, for private use, but with the aim of reducing long-distance trips for grocery services in the city. Up to this point, we have evaluated the model defined for the construction of the proposal at district level; we consider the results efficient at this scale, given that an optimal solution was found in an efficient way, with a machine run time of one second.

The results obtained propose the opening of 3 WH. The neighborhoods where the model yields the possible locations of the centers are: La Linde (WH3), Los Naranjos (WH5) and El Diamante No. 2 (WH7). The final assignment of each neighborhood to the various locations proposed for WH yielded a minimum-cost objective function value of COP $105,326,975.076 (USD $28,833.30). The results are plotted in Figure VI; the centroid points of the neighborhoods assigned to each of the WHs are highlighted by connecting lines in a different color for each of them.

Figure VI. Solution graph.

IV. Conclusion

Through this work, a proposal is defined that aims to improve the urban logistics of goods within the city of Medellín, specifically taking district 14, El Poblado, as the object of study. This proposal, as has been seen in the different reviewed works, is likely to generate significant contributions in several areas of the city, such as vehicular congestion and the generation of vehicular pollutants.

The application of mathematical models to the realities of urban logistics can be considered a great tool when looking for improvements in this framework. These models are easily adapted to the representation of real scenarios throughout a city. From our point of view, it is a tool that generates great benefits and which, nowadays, sees little visible use in decision making in the framework of public sector administration.

Finally, as future work, three main complementary ideas can be identified. (i) Assess the capacities needed for the proposed WH, to establish the space requirements to serve the assigned population. This can be done through design and sizing methods for logistics facilities, all in order to optimize and achieve cost savings per facility. (ii) Depending on the demands and transports performed, an optimization of the resources allocated to the functions of picking and transporting orders could be applied, in order to evaluate different work policies; for example, whether the transport personnel perform the picking in the warehouse or, on the contrary, the functions are separated, all with a view to evaluating the hours that need to be allocated to each task and the need for handling and transport equipment. Exact or heuristic optimization techniques can be used for this purpose. (iii) Complement the mathematical modeling to ensure load balancing, so that the neighborhoods are distributed in a more equivalent way among the various WH proposed as a solution.

Acknowledgment

This work was supported by Instituto Tecnológico Metropolitano (ITM) (Project no. P20239), in Medellín, Colombia.

References
[1] Burgos, G. (2021, January 13). Supply chain: Los retos del transporte ligero y la distribución urbana en tiempos del Coronavirus | América Retail. https://www.america-retail.com/supply-chain/supply-chain-los-retos-del-transporte-ligero-y-la-distribucion-urbana-en-tiempos-del-coronavirus/
[2] Muñuzuri, J., Grosso, R., Escudero, A., & Cortés, P. (2017). Distribución de mercancías y desarrollo urbano sostenible. Revista Transporte y Territorio, 0(17), 34–58. https://doi.org/10.34096/rtt.i17.3866
[3] Segura, V., Fuster, A., Antolín, F., Casellas, C., Payno, M., Grandío, A., Cagigós, A., & Muelas, M. (2020). Logística de última milla: retos y soluciones en España. Deloitte. https://www2.deloitte.com/content/dam/Deloitte/es/Documents/operaciones/Deloitte-es-operaciones-last-mile.pdf
[4] Neira Marciales, L. (2020, March 27). "Durante la cuarentena por el virus Covid-19 se cuadruplican en el país los domicilios". https://www.larepublica.co/empresas/domicilios-se-cuadruplican-en-tiempos-de-cuarentena-por-el-covid-19-2983817
[5] Google Trends. Accessed February 2021. (www.google.com/trends).
[6] Sopha, B. M., Sri Asih, A. M., Pradana, F. D., Gunawan, H. E., & Karuniawati, Y. (2016). "Urban distribution center location: Combination of spatial analysis and multi-objective mixed-integer linear programming". International Journal of Engineering Business Management, 8, 1–10. https://doi.org/10.1177/1847979016678371
[7] Campos Magin, J. (2015, May). "Las plataformas logísticas de distribución urbana de mercancías: un elemento de desarrollo y regulación del transporte de mercancías en las ciudades". https://upcommons.upc.edu/bitstream/handle/2117/27229/15572417.pdf
[8] de Carvalho, N. L., Vieira, J. G. V., da Fonseca, P. N., & Dulebenets, M. A. (2020). A multi-criteria structure for sustainable implementation of urban distribution centers in historical cities. Sustainability (Switzerland), 12(14). https://doi.org/10.3390/su12145538
[9] Ariza Nieto, J. A. (2013). "Modelo de programación lineal basado en la caracterización de la cadena de suministro de los productos bovinos con alta producción en la provincia de sabana centro". Journal of Chemical Information and Modeling, 53(9), 1689–1699.
[10] INRIX. (2020). 2020 Global Traffic Scorecard. INRIX Research. https://inrix.com/scorecard/
[11] Alcaldía de Medellín. (2010). Primera Parte: Generalidades. Medellín y su Población. https://www.medellin.gov.co/irj/go/km/docs/wpccontent/Sites/Subportal%20del%20Ciudadano/Plan%20de%20Desarrollo/Secciones/Informaci%C3%B3n%20General/Documentos/POT/medellinPoblacion.pdf
[12] DANE. (2019). Resultados Censo Nacional de Población y Vivienda 2018 (National population census). https://www.dane.gov.co/index.php/servicios-al-ciudadano/60-espanol/demograficas/censos
[13] Alcaldía de Medellín. (2013). Documento de rendición de cuentas a la ciudadanía para la Comuna 14 El Poblado. Periódico Cuentas Claras, 1, 8. https://www.medellin.gov.co/irj/go/km/docs/wpccontent/Sites/Subportal del Ciudadano/Nuestro Gobierno/Secciones/Plantillas Genéricas/Documentos/2013/Cuentas Claras Comuna/1 octubre/comuna 14 baja.pdf
[14] Alcaldía de Medellín. (2016). Perfil Sociodemográfico por barrio Comuna 14 El Poblado 2016-2020. 223. https://www.medellin.gov.co/irj/go/km/docs/pccdesign/SubportaldelCiudadano_2/PlandeDesarrollo_0_17/IndicadoresyEstadsticas/Shared%20Content/Documentos/ProyeccionPoblacion2016-2020/Perfil%20Demogr%C3%A1fico%20Barrios%202016%20%E2%80%93%202020%20Comuna_14_El%20Poblado.pdf
[15] QGIS Development Team, Version 3.16.7-Hannover (2009). QGIS Geographic Information System. Open Source Geospatial Foundation. http://qgis.osgeo.org
[16] Rappi®. Accessed February 2021. www.rappi.com.co/
[17] Finca Raíz. Accessed February 2021. https://www.fincaraiz.com.co/
Close Price Prediction of Day Stock Markets with
Machine Learning and NLP models
Purushoth Ananatharasa Ragu Sivaraman
Informatics Institute of Technology Informatics Institute of Technology
anantharasa.2017152@iit.ac.lk ragu.s@iit.ac.lk
Abstract— The stock market is an important indicator of the development of a country, since the transactions that happen during market hours generate large capital gains for investors and traders trading company shares. The transactions of investors and stock traders are therefore very important to keep the market alive. Stock traders in the short-term market analyze the performance of a company through the past values and performance of the company's indexes. However, analyzing those close prices alone does not help. There are many systems that forecast prices far into the future but, due to the market's highly volatile nature, the values predicted by these systems cannot always be accurate. The limitation of analyzing both past transactions and impacting features to predict the close price of a day is therefore the main research problem addressed in this research. The trained solution used machine learning models such as Random Forest Regression, XGBoost, SVM and Lasso to predict the closing prices of the next two days, together with an additional feature which analyses news sentiment, which affects the pattern of stock performance, and prompts suggestions for traders. The evaluated system accuracy was measured with RMSE, MDA and MSE; the overall accuracy was up to 97%, and the whole system was efficient and satisfactory when benchmarked against existing systems. Furthermore, the system was evaluated by domain experts and end users under well-designed evaluation criteria.

Keywords— Stock Market Prediction, Machine Learning, Random Forest Regression, Sentiment Analysis.

I. Introduction

The stock market can be considered an indicator of a country's economy, since it acts as a platform for buying and selling stocks, which includes many company indexes in sectors like agriculture, health and manufacturing that are monitored by the public in order to invest capital [1]. A capital market therefore acts as a transparent medium for stock traders and investors to evaluate the performance of the companies they trade in. Stocks are considered the most preferred trading medium, and they can be categorized into long-term and short-term stocks. Long-term stocks are considered investments with a long revenue-generating time span, while short-term stocks generate income over a short time span for a daily stock trader [2]. This trade between the public sector and the companies provides many mutual benefits. The companies get benefits such as investment gains, where the capital required to expand the business is obtained from a verified income channel. Diversification of the value of the company into another category possibly brings opportunities for revenue into the company profile [3]. From a stock trader's point of view, a person who holds company shares has an income channel from the profit gained from daily stock trades. Also, a shareholder has control over the decisions to buy or sell stocks, according to the individual's tolerance for those stocks.

A stock trader, before purchasing stocks, considers factors such as price patterns, trading patterns, public opinion and the services offered by that specific company, whereas a stock investor considers other factors like financial summaries, dividends, economic growth, cash flow, etc. Both stock investors and traders are therefore important for the stock market to function [4]. It is both of these groups that have kept stock market transactions alive every day throughout history. However, there is never a solid price for a stock for traders to trade on, because the value fluctuates in the market, making the values of the assets highly unpredictable. These fluctuations are based on capital flow in and out of financial reserves, and also on the competitiveness of the other companies in the same domains [5]. This makes stocks highly volatile and unreliable. Considering huge companies like Apple (AAPL), many factors, such as Apple events, product launches and many other internal company events, impact its stock performance in a day trade [6]. Therefore, as a well-experienced day trader, it is important to analyze all the impacting factors, like company events and past data up to a necessary time period, before beginning to trade. In fact, there are prediction models used by stock market experts to find the direction (up or down) for a specific number of days in the future, but these systems cannot predict the closest stock price, due to governing factors which make stock prices really challenging to predict for daily trading.

A system which analyzes and predicts the closing price of certain companies over a long time span can therefore mislead stock traders into making incorrect decisions and trading on non-ideal days, because predicting for a longer time span (more than 5 days) has a high chance of producing incorrect predictions, due to factors like the market sentiment of customers and unexpected global events which can affect the whole industry. A system that analyzes past data over a time span to predict the close price of the next 2 days (ideal) as its main function, and that indicates the nature of the stock index (ideal to trade or not) based on unexpected news published on the internet as another feature, using machine learning, therefore remains on the wish list of stock traders.

The factors mentioned below have to be considered when predicting the close prices of short-term (daily trade) stocks.
A. Company News

The public opinion of a stock company is mainly portrayed by the news that is published in the public news media (positive/negative) [9]. This published news leads investors and long-term and short-term traders to evaluate whether the decisions they made were wise, profitable or non-profitable. It is the news delivered to the public that attracts people to buy, sell or even invest in that specific company and perform any kind of capital transaction [10].

B. Volume of stocks

The volume of trade is another important factor for stock traders when analyzing the value a stock index has delivered on a given day, and it is also used indirectly to determine the close price of a company [11]. The volume of trade depends on the specific company. Generally, if the volume of the stock is high, it suggests that the close price might see a positive increment, since many customers are present in the trade and contribute to the increase in overall volume [9].

II. Existing Works

Analyzing the existing works gives a deep understanding of the approaches and the limitations present in current systems. After carrying out a comparison of the available systems, the close price prediction of a stock index can be done in three main approaches:

• Predicting close prices using past datasets.
• Predicting close prices using social media and financial news.
• Predicting close prices using technical indicators.

A. Predicting close prices using past datasets

An ensemble approach like Random Forest can be used to predict the stock price for specific days, and the obtained values were compared with a deep learning model in this research [13]. In this approach, the model used input parameters such as Open, High, Low and Close prices for prediction. The model was able to provide results from the Random Forest model, where the model was validated with evaluation metrics such as root mean squared error (RMSE), Mean Absolute Percentage Error (MAPE) and Mean Bias Error (MBE) on the unclassified data for five distinct instances.

However, this novel approach was challenged by [14], where LASSO and Ridge models were used to predict the close price. The obtained results were close enough to the existing works to benchmark against. To evaluate these models, the same metrics were used (MAPE and RMSE). Table I summarizes the critique of the previous researches and systems based on the past-data prediction approach, with the features and limitations that are present.

Table I. Existing work on past datasets.

Research | Features | Limitation
[13] | Predicts the price for one day. | Higher RMSE and MAPE values compared to other models.
[15] | Displays an up or down result by analyzing past datasets. | A reinvention of a traditional model with more beneficial functions.
[16] | Obtained 70% accuracy for the model. | Needs added sentiment data due to the impermanent nature of the close price.
[14] | Used LASSO and Ridge models to predict company stocks. | Did not consider any market-impacting factors or market policies.
[17] | Gave acceptable results compared to ANN and ARIMA models. | Predicted scores were affected by trader sentiment.
[18] | Used multiple ML models (XGBoost, SVC, KNN, ANN). | Deep learning models had a better confusion matrix.
[19] | The prediction was done with a Granger causal factor between input parameters. | Limited performance.

B. Predicting close prices using social media and financial news

Social media impacts the decisions of day traders involved in a trade, especially when analyzing whether it is profitable to perform a trade in the market. This impacts the overall performance of the market at close time. To examine this factor, the author of [12] carried out research analyzing social media feeds and other public news with models like SVM, KNN, NN and Linear Regression. In summary, together with other researches, the positive and negative news considerably affected the performance of the closing price. Therefore, it was decided to analyze both factors to find the direction and value of the close price. Table II summarizes the critique of the previous researches and systems based on the social media approach, with the features and limitations that are present.

Table II. Existing work on social media and news articles.

Research | Features | Limitations
[20] | Used SVM and RF classifiers. | Add more companies to the prediction model and use the same technique for algorithmic trading.
[12] | Used Random Forest, SVM, kernel factory and AdaBoost algorithms. | Models are made based on certain regional markets.
[21] | Accuracy is 56.07%. Can do prediction for large-scale datasets. | Number of topics and sentiment must be specified.
[22] | Textual representation is better than numerical datasets; bag-of-words methods were used. | For the best profit-making qualities, technical indexes such as MA and MACD had to be included.
C. Predicting close prices using Technical Indicators

In the research of [15], the team decided to use the tree-based classifiers approach to predict whether the stock prices of certain companies go up or down (a classification mechanism) for a specific number of days in the future, based on stock market indicators such as MACD and RSI. At the end of prediction, the Random Forest classifier had better accuracy than the Gradient Boost Algorithm for the selected companies. However, the team also noticed that the F1 score of both models increased when they increased the window width of the model. Since this model analyzes all these indicators, the author had to find the correlation between these input parameters to predict the output [15].

III. Methodology

From the above discussions it was decided to go with the historical transactions as well as social media and other financial news, which leverage the prediction of the close price. Therefore this section focuses on utilizing suitable models to find the close price using an ensemble method, and on the nature of the stock index based on the news. The product consists of two main components, the "Price Prediction model" and the "News Analyzer model", to predict the price and nature of a stock index. For this scenario the Apple stock index (AAPL) and Dow Jones (DWJA) were chosen.

A. Price Prediction Model

In this component an efficient model is utilized among other models by inserting a valid dataset that is fed into the regression models. The training datasets were obtained from Yahoo Finance. Before training the model, the imported datasets were pre-processed for missing values and validated so that the imported data was without noise.

Figure II: Survey from end users on most used data

B. News Analyzer Model

This model was based on NLP, which deals with human languages in the form of audio or textual information. In this scenario, which involves textual information, sentiment analysis was used for classification. This model has news datasets that were obtained from trusted news sources, like verified users on Twitter and public news figures that provide financial news on the capital market. This dataset was obtained from Kaggle, an open source data platform for data science and machine learning projects. The imported data went through a pre-processing step which consists of removing English stop words and unwanted characters that could affect the training process. Following that, tokenizing and stemming techniques were carried out to reduce the noise among the datasets. After benchmarking multiple models for better accuracy, the SoftMax logistic model was selected to classify whether the news could affect the stock price positively or negatively.

The overall accuracy of the model was calculated by the confusion matrix, which consists of components like true positive, true negative, false positive and false negative. Other additional metrics such as the Recall, F1 and Precision scores were also calculated using these formulas.
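The formulas themselves did not survive the source layout; for reference, the standard definitions in terms of the confusion-matrix counts are:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$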
Table IV. Time consumed for Model Training

Model | Time Taken (Seconds)
Random Forest Regressor | 31
XG Boost | 48
Decision Tree | 66
LASSO | 55
SVM | 51

In the same criteria, the Random Forest regression model trained in the least amount of time with the most accurate value compared to the others, given the necessary input parameters. Finally, to validate the model, K-Fold validation was carried out to check whether the model is overfitting or underfitting. The table below shows the comparison of actual and predicted values for a randomly taken sample of three days, which was convincing for suggestion and proved that the model is balanced and can be relied on for critical prediction purposes.
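As a minimal sketch of the price-prediction component as described (a Random Forest regressor on OHLC inputs with K-Fold validation), not the authors' actual pipeline; the CSV path, column names and two-day horizon are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

# Hypothetical daily OHLC data, e.g. exported from Yahoo Finance
data = pd.read_csv("aapl_daily.csv")               # placeholder path
X = data[["Open", "High", "Low", "Close"]].values  # input parameters named in the paper
y = data["Close"].shift(-2).dropna().values        # close price two days ahead
X = X[: len(y)]

rmse_scores, mape_scores = [], []
for train_idx, test_idx in KFold(n_splits=5, shuffle=False).split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    rmse_scores.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
    mape_scores.append(mean_absolute_percentage_error(y[test_idx], pred))

# Comparable fold scores suggest the model is neither over- nor underfitting
print(f"RMSE: {np.mean(rmse_scores):.4f}, MAPE: {np.mean(mape_scores):.4f}")
```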
Figure IV. Heatmap of the SoftMax model

Figure V: ROC curve of the SoftMax model

V. Conclusion

A gap for an accurate and effective suggestion system based on past data and public perception still exists in the market. This research paper discusses a novel approach to get the most accurate close price of the next two days based on the historical datasets and the impact of public news on the stock index performance. The Random Forest regression model to predict the price of the stock using machine learning was selected after comparing and benchmarking the models. The SoftMax logistic model was designed to analyze the impact of public news headlines or feeds posted on social media. Both models have been trained with the most ideal input parameters and under the most common scenarios to perform as ideal models.

For the time being, the system provides predictions for only two stock indices, AAPL and DWJA, since those indices have a highly volatile character due to many global and economic uncertainties. For future enhancements, more stock indexes will be added to the prediction system for different market regions based on their locations. Another functionality will be adding more stemming techniques to the News Analyzer Model; stemming techniques may vary based on the found datasets, such as the Lancaster Stemmer and Lemmatization, for data cleaning. Adding vectorization techniques to test exception scenarios in the trading market is another enhancement that will be added. Finally, adding the feature of analyzing technical indicators such as the Moving Average and Fibonacci average to predict the movement of a stock over a far longer time period is to be another functionality, which could ultimately address the issues faced by stock market experts and end users with the misconceptions and unexpected errors in the capital markets.

References

[1] Abraham, A., Krömer, P. and Snášel, V. (2015) 'Afro-European Conference for Industrial Advancement: Proceedings of the First International Afro-European Conference for Industrial Advancement AECIA 2014', Advances in Intelligent Systems and Computing, 334, pp. 371–381. doi: 10.1007/978-3-319-13572-4.
[2] Akita, R. (2016) 'Deep learning for Stock Prediction Using Numerical and Textual Information', 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), pp. 1–6. doi: 10.1109/ICIS.2016.7550882.
[3] Al-Jaifi, H. A. (2017) 'Ownership concentration, earnings management and stock market liquidity: evidence from Malaysia', Corporate Governance (Bingley), 17(3), pp. 490–510. doi: 10.1108/CG-06-2016-0139.
[4] Ballings, M. et al. (2015) 'Evaluating multiple classifiers for stock price direction prediction', Expert Systems with Applications. Elsevier Ltd, (May). doi: 10.1016/j.eswa.2015.05.013.
[5] Basak, S. et al. (2019) 'Predicting the direction of stock market prices using tree-based classifiers', North American Journal of Economics and Finance. Elsevier, 47(December 2017), pp. 552–567. doi: 10.1016/j.najef.2018.06.013.
[6] Cakra, Y. E. and Distiawan Trisedya, B. (2016) 'Stock price prediction using linear regression based on sentiment analysis', ICACSIS 2015 - 2015 International Conference on Advanced Computer Science and Information Systems, Proceedings, pp. 147–154. doi: 10.1109/ICACSIS.2015.7415179.
[7] Cervelló-Royo, R., Guijarro, F. and Michniuk, K. (2015) 'Stock market trading rule based on pattern recognition and technical analysis: Forecasting the DJIA index with intraday data', Expert Systems with Applications, 42(14), pp. 5963–5975. doi: 10.1016/j.eswa.2015.03.017.
[8] Clark, E. and Kassimatis, K. (2017) 'Country financial risk and stock market performance: The case of latin america', Evaluating Country Risks for International Investments: Tools, Techniques and Applications, 56(1), pp. 117–148. doi: 10.1142/9789813224940_0005.
[9] Di, X. (2014) 'Stock Trend Prediction with Technical Indicators using SVM', Stanford University.
[10] Gaillard, P. (2004) 'Rwanda 1994: "...kill as many people as you want, you cannot kill their memory"', International Committee of the Red Cross, pp. 1–24. doi: 10.6084/m9.figshare.5028110.
[11] Greenwald, D., Lettau, M. and Ludvigson, S. (2014) 'Origins of Stock Market Fluctuations', NBER Working Paper Series, 19818. Available at: http://www.nber.org/papers/w19818.pdf.
[12] Jin, Z. et al. (2020) 'The industrial asymmetry of the stock price prediction with investor sentiment: Based on the comparison of predictive effects with SVR', Journal of Forecasting, 39(7), pp. 1166–1178. doi: 10.1002/for.2681.
[13] Joshi, K., H. N, B. and Rao, J. (2016) 'Stock Trend Prediction Using News Sentiment Analysis', International Journal of Computer Science and Information Technology, 8(3), pp. 67–76. doi: 10.5121/ijcsit.2016.8306.
[14] Kim, S. et al. (2020) 'Predicting the Direction of US Stock Prices Using Effective Transfer Entropy and Machine Learning
Techniques’, IEEE Access, 8, pp. 111660–111682. doi:
10.1109/ACCESS.2020.3002174.
[15] Maio, P. and Santa-Clara, P. (2017) Short-Term Interest Rates
and Stock Market Anomalies, Journal of Financial and
Quantitative Analysis. doi: 10.1017/S002210901700028X.
[16] Nabipour, M. et al. (2020) ‘Predicting Stock Market Trends
Using Machine Learning and Deep Learning Algorithms Via
Continuous and Binary Data; A Comparative Analysis’, IEEE
Access, 8, pp. 150199–150212. doi:
10.1109/ACCESS.2020.3015966.
[17] Nguyen, T. H. and Shirai, K. (2015) ‘Topic modeling based
sentiment analysis on social media for stock market
prediction’, ACL-IJCNLP 2015 - 53rd Annual Meeting of the
Association for Computational Linguistics and the 7th
International Joint Conference on Natural Language
Processing of the Asian Federation of Natural Language
Processing, Proceedings of the Conference, 1, pp. 1354–1364.
doi: 10.3115/v1/p15-1131.
[18] Pradhan, R. S. and Dahal, S. (2018) ‘Factors Affecting the
Share Price: Evidence from Nepalese Commercial Banks’,
SSRN Electronic Journal, pp. 1–16. doi:
10.2139/ssrn.2793469.
[19] Shah, D., Isah, H. and Zulkernine, F. (2018) ‘Predicting the
effects of news sentiments on the stock market’, arXiv, (1), pp.
1–4.
[20] Skuza, M. and Romanowski, A. (2015) ‘Sentiment analysis of
Twitter data within big data distributed environment for stock
prediction’, Proceedings of the 2015 Federated Conference on
Computer Science and Information Systems, FedCSIS 2015, 5,
pp. 1349–1354. doi: 10.15439/2015F230.
[21] Victor Chow, K. et al. (1995) ‘Long-term and short-term price
memory in the stock market’, Economics Letters, 49(3), pp.
287–293. doi: 10.1016/0165-1765(95)00690-H.
[22] Vijh, M. et al. (2020) ‘Stock Closing Price Prediction using
Machine Learning Techniques’, Procedia Computer Science.
Elsevier B.V., 167(2019), pp. 599–606. doi:
10.1016/j.procs.2020.03.3
Investigation of Permeability Coefficient in Layered Soils
Kaveh Dehghanian    Mohammad Haroon Saeedi
Department of Civil Engineering Department of Civil Engineering
Istanbul Aydin University Istanbul Aydin University
Istanbul, Turkey    Istanbul, Turkey
kavehdehghanian@iau.edu.tr haroonsaidi64@gmail.com
Abstract— The hydraulic conductivity of permeable media is a critical property that depends upon different properties of the soil mass, such as porosity, size and shape of soil particles, initial water content, and compaction. As the characteristic condition, the soil mass exists in layered strata; hence it is called stratified soil. Soils are permeable materials due to the presence of interconnecting spaces that enable fluids to flow when there is a difference in energy head. The shape and size of particles, in turn, affect the interconnecting voids. Water flow through a soil mass is proportional to the size of the void apertures rather than the overall number of voids, even though void ratios of fine-grained soils are frequently greater. The relative position and thickness of a soil layer in a stratified soil system are two critical variables that determine the permeability of the composite soil layer. In this study, a series of falling head tests were performed to determine the permeability of two-layered soils using two types of bentonite and sand, as well as the Atterberg limit, sieve analysis, specific gravity, and Proctor tests. It is shown that an increase in Atterberg limits results in a decrease of permeability. The higher the specific gravity, the lower the permeability coefficient. The permeability is higher at a higher void ratio. Furthermore, the permeability of stratified soils is affected by the thickness of the end layer.

Keywords—Permeability Coefficient, Layered Soil Profile, Falling Head Test, Atterberg Limit, Void Ratio

I. Introduction

In classical soil mechanics, soil is considered a homogeneous and isotropic material. In most cases, the experiments and numerical analyses are performed for a single layer, while the soil is a layered medium in the field. The permeability coefficient is often obtained by a constant head permeability test for coarse-grained soils and a falling head test for fine-grained soils. The assessment of permeability is significant for erosion control, slope stability control, wastewater management, and structural failure due to foundation settlement problems. For layered soil systems, the layers can be horizontal, vertical, or inclined. Each layer has its own permeability coefficient, k. The equivalent permeability coefficient of the stratified deposit, keq, depends on the direction of flow with respect to the orientation of the bedding planes. The coefficient of permeability (k) of soil masses is calculated using Darcy's law. When the flow is normal to the orientation of the bedding planes, the equivalent coefficient of permeability of a stratified soil deposit is derived by

$$k_{eq} = \frac{\sum_{i=1}^{n} L_i}{\sum_{i=1}^{n} \left( \frac{L_i}{k_i} \right)} \tag{1}$$

where Li is the thickness of the ith layer in the layered profile and ki is the coefficient of permeability of that layer. The permeability coefficient of bentonite clays is quite low and is traditionally measured by the falling head permeability test. This method gives results over a long period of time, as the sample is expected to saturate [1].

Uppot et al. (1989) investigated two clays subjected to organic and inorganic permeants to study the changes in permeability caused by the reaction between clays and permeants [2]. Afterward, Haug et al. (1990) built a prototype liner formed of Ottawa sand and sodium bentonite. This material was mixed, moisture-conditioned, and compacted into reinforced wooden frames. The in situ permeability test results were verified with low gradient, back-pressure saturated triaxial permeameter tests conducted on undisturbed cored and remolded samples [3]. Sridharan and Prakash (2002) researched two-layer soil systems and demonstrated that the mutual interaction among distinct layers of different soil types forming a stratified deposit influences the equivalent permeability of the stratified deposit, which cannot simply be calculated by use of the equation for the equivalent coefficient of permeability of a stratified deposit when the flow is normal to the orientation of the bedding planes based on Darcy's law. The permeability of the exit layer controls whether the measured permeability is greater or lesser than the theoretical values for a stratified deposit [4].

Galvaeo et al. (2004) performed another test in which the coefficient of permeability of saprolitic soil increased about five times when two percent lime was added and then decreased on further addition of lime. This is assigned to the creation of chemical bonds and aggregation. As for lateritic soil, the coefficient of permeability decreased as lime was added. This is also assigned to the same mechanism, except that the bonds are weaker than those developed in the saprolitic soil [5]. Nikraz et al. (2011) carried out a series of laboratory permeability tests to evaluate the fiber effect on the hydraulic conductivity behavior of composite sand. Clayey sand was selected as the soil part of the composite and natural fiber was used as reinforcement [6]. Sridharan and Prakash (2013) conducted a comparative study of the measured equivalent coefficient of permeability of three-layer soil sediments against the theoretically calculated values. The results demonstrate that, by and large, the coefficient of permeability of the bottom layer controls whether the measured value of the equivalent coefficient of permeability is greater or lesser than the theoretically calculated value. Also, when a stratified soil deposit contains more than 3 layers, different combinations of positioning of layers of different k values are possible. Hence, in such cases, it becomes difficult to predict whether the measured value of keq is less than, equal to, or more than that calculated [7]. The consequence of this observation is the realization that the equivalent coefficient of permeability of any layered soil deposit is not just dependent upon the values of k of the individual layers constituting the deposit, and that it also depends upon the relative positioning of the layers in the system. Sridharan and Prakash (2002) studied two-layer soil systems with equal thickness layers, whereas in the current research, sand and bentonite were treated in varied thickness layers with varied sample sizes. The main goal of this paper is to measure and compare various sizes of sand and bentonite in soil layers.
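As a worked illustration of Eq. (1) (not part of the original study), a short Python sketch computing the equivalent coefficient for a hypothetical two-layer sand-bentonite profile:

```python
def equivalent_permeability(layers):
    """Equivalent permeability for flow normal to the bedding planes, Eq. (1).

    layers: list of (L_i, k_i) tuples, with thickness L_i in cm and
    permeability k_i in cm/s.
    """
    total_thickness = sum(L for L, _ in layers)
    resistance = sum(L / k for L, k in layers)  # sum of L_i / k_i terms
    return total_thickness / resistance

# Hypothetical two-layer profile: 5 cm of sand over 5 cm of bentonite
# (illustrative k values only, not measurements from this study)
print(equivalent_permeability([(5.0, 1e-3), (5.0, 5e-8)]))
```

Because the sum of L_i/k_i is dominated by the least permeable layer, the computed keq is close to the bentonite value, which is consistent with the observation above that the exit layer controls the measured permeability.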
II. Experimental study

Within the scope of the article, soil samples with varied proportions were employed. Each soil sample's studies are reported separately and in chronological sequence. Bentonite clay is a natural clay with a delicate, silky texture; when combined with water, it makes a paste. The Elito Bentonite Clay was manufactured from a district in Izmir, southwestern Turkey, and the sand came from concrete plants and an area close to the university campus. The tested samples were selected in the following percentages: 20% bentonite clay + 80% sand, 30% bentonite clay + 70% sand, and 40% bentonite clay + 60% sand. The uniformity coefficient (Cu) indicates the variance in particle sizes in soil and is defined as the ratio of D60 to D10. D60 denotes the grain diameter at which 60% of soil particles are finer and 40% are coarser, whereas D10 denotes the grain diameter at which 10% of particles are finer and 90% are coarser [8]. Table I depicts the values of Cu and Cc for the samples.

Atterberg limits tests were performed according to the methods still being used to determine the Liquid Limit, Plastic Limit and Shrinkage Limit of soils, which are outlined in ASTM D4318 and TS 1900-1 [8]. Table II depicts the values of Plastic Limit (PL), Liquid Limit (LL), Plasticity Index (PI) and Shrinkage Limit for the soil samples. The samples were left in the oven for 24 hours to examine how much they shrink as the temperature rises; the more sand, the lower the shrinkage limit becomes.

Specific gravity tests were also done based on the TS 1900-1 standard. The specific gravities of the combinations are as follows:

60% sand + 40% bentonite = 2.521
70% sand + 30% bentonite = 2.548
80% sand + 20% bentonite = 2.553
100% bentonite = 2.318
100% sand = 2.588

According to the specific gravity test results, the 100% sand has the highest specific gravity of the test, and the 100% bentonite clay sample has the lowest. As the amount of sand in the composite sample increases, the specific gravity of the sample increases too. Before beginning the falling head permeability test, the amount of water added to each test must be considered carefully. The Proctor test defines directly how much water should be added to the bentonite and sandy soils. The results of the Proctor test are shown in Table III. The falling head tests were performed based on the CEN ISO/TS 17892-11 standard, and the results are shown in Table IV.

Table I: Determination of Cc, Cu (columns: Ratios, Cu, Cc)

Table II: Atterberg Limits test result

Table III: Determination of amount of water and specific gravity test (columns: Sample weight (gr), Sample ratios, Water amount, Specific gravity)

Table IV: Permeability result for mixed samples
"Where ain = cross sectional area of the reservoir III. CONCLUSION AND DISCUSSION
containing the influent liquid; out = cross sectional area of the
reservoir containing the effluent liquid; A = cross sectional
area of the specimen; t = elapsed time between the In the laboratory, different types of permeability tests were
determination of h1 and h2; h1 = head loss across the specimen carried out using bentonite and sandy varying proportions soil,
at time t1; h2 = head loss across the specimen at time t2" [7]. and the equivalent permeability coefficient was determined for
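The expression these symbols belong to did not survive extraction; as an assumed reconstruction, the standard falling-head form with separate influent and effluent reservoirs (as used in CEN ISO/TS 17892-11 and ASTM D5084) is

$$k = \frac{a_{in}\, a_{out}\, L}{(a_{in} + a_{out})\, A\, t}\, \ln\!\left(\frac{h_1}{h_2}\right)$$

with L the specimen length, a symbol not among the recovered definitions.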
III. CONCLUSION AND DISCUSSION

In the laboratory, different types of permeability tests were carried out using bentonite and sand in varying proportions, and the equivalent permeability coefficient was determined for two types of layered soils.
Table VII summarizes the results of the measured and theoretical values. In most situations, the observed permeability values exceed the theoretical values, as seen in 45 percent to 60 percent of cases, greater than the previous sample. However, the optimum water content of the sand is 17 percent of its weight, as before.

Take 2200 g of sand and 110 g, 220 g or 330 g of bentonite as representative soil and mix with water if necessary:

For 100% sand: water at 17% of the weight of the sand sample should be added.
For 100% bentonite clay: water at 60% of the weight of the bentonite clay sample should be added.
For 5% bentonite clay + 95% sand: 66 g + 374 g water should be added.
For 10% bentonite clay + 90% sand: 132 g + 374 g water should be added.
For 15% bentonite clay + 85% sand: 198 g + 374 g water should be added.

Figure III: coefficient of permeability and dry density (First Sample).

Table IX: k and γd for different ratios

Sample Ratio | Dry density (kg/cm³) | K (cm/s)
95%S+5%B | 19.7 | 1.1
90%S+10%B | 19.21 | 1.68
85%S+15%B | 18.3 | 1.75

The graph of the old sample shows that the permeability coefficient increases when the dry unit weight increases.
Figure IV: permeability and void ratio for old sample (composite).

Figure VI: permeability coefficient and specific gravity.
Table X: Permeability and void ratio for different samples.

Ratio | K (cm/s) | Void Ratio
80%S+20%B | 1.73E-06 | 0.891
70%S+30%B | 1.91E-07 | 0.887
60%S+40%B | 5.48E-08 | 0.862

Table XII: Gs and k for samples.

Ratio | Permeability coefficient (cm/s) | Specific gravity
80%S+20%B | 1.73E-06 | 2.55
70%S+30%B | 1.91E-07 | 2.56
60%S+40%B | 5.48E-08 | 2.57

The void ratio increases owing to permeability, the larger quantity of voids in the particles, and the flow area in the samples. In bentonite, flow in the already small channels is further hindered because some of the water in the voids is absorbed or adsorbed on the bentonite particles, reducing the flow area and further restricting the flow. Therefore, K_bentonite <<< K_sand.

An increase in specific gravity was observed with increasing bentonite content, due to the high specific gravity of bentonite. As the specific gravity increases, the permeability coefficient decreases.
Table XIII: k and LL of samples.

Ratio | Composite Permeability (cm/s) | Stratified Permeability (cm/s)
95%S+5%B | 1.38E-04 | 0.000175
90%S+10%B | 1.32E-04 | 0.000168

The measured permeability is greater than the theoretical permeability.

For layered soils, the permeability decreases as the bentonite increases, and the permeability of the composite increases as the bentonite increases.

The flow rate increases with the increase in the hydraulic gradient.
An AI-based Embodied Digital Human Assistant for
Information in University
Munia AlKhalifa Kasım Özacar
Department of Computer Engineering Department of Computer Engineering
Karabuk University Karabuk University
Karabuk, Turkey Karabuk, Turkey
munia.ak@outlook.com kasimozacar@karabuk.edu.tr
Abstract—Empowered by artificial intelligence (AI), digital assistants are taking an essential role in our lives as they serve the needs of people within many domains, such as providing customer service and translating. The two well-known types of assistants, text-based and voice-operated, are agents that answer the questions or serve the requests of users depending on the data that is given to them. Techniques used for building these agents differ depending on multiple measures, and the methods these agents follow vary from basic, simple rules to state-of-the-art techniques. However, to achieve a natural and accurate interaction like we do in our everyday life, we contribute a voice-based digital assistant integrated into a virtual human whose aim is to serve at a university, working as a friendly assistant that answers the questions of students, newcomers, or visitors. We built and trained multiple neural network models and combined them to have a human assistant responding to any query related to the university while being expressive and interactive.

Keywords—Human-Computer Interaction, Artificial Intelligence, digital assistant, chatbot, deep learning

I. Introduction

AI-based digital assistants have become widespread as their affordability and efficiency make them a key element and a recognized example in all industries, empowered by simple to advanced artificial intelligence techniques. These agents are also known as digital assistants, which can perform different tasks as well as being capable of mimicking human impressions in conversations, along with providing the requested service in applications like e-commerce, information retrieval, and education [1].

A big challenge is the goal of developing assistants that interact naturally with humans and at the same time generate much more natural or human-like conversations, making them indistinguishable from those of a human during normal open-domain or closed-domain conversations that are used in providing service and help to users. In addition to conversation flow and service provision, another challenge is to make the chatbot act not only as a tool, but also as a friend [2].

During the last two decades, technologies such as speech recognition, natural language understanding, and text-to-speech synthesis have been the interest of much research, resulting in well-known digital assistants like Apple's Siri [3]. Scenarios for digital assistants have been extended to a variety of areas such as tutoring [4], health care [5], forecasting [6], translation [7] and navigation [8]. Digital assistants can answer both simple and complex questions, provide information and recommendations, initiate conversation occasionally and make predictions.

In addition to the importance of question responding, face-to-face communication achieves notable effects on people; thus, it will offer better performance and more convincing interactions. Therefore, in this paper we build a virtual human as a digital assistant, which works as a guide and information provider about our university by answering the questions of users and interacting with them.

The contributions of the paper are as follows: 1) we introduce our idea of a university digital assistant, 2) we go through some previous research in the same field, 3) we explain the methodology of our work, 4) we explain the tests and experiments, 5) we discuss our work and describe the drawbacks, and 6) we conclude our paper.

II. Related Work

Conversations and communication between human and machine have been receiving attention since the early 1960s. ELIZA was the first natural language processing program and chatbot, which worked as a psychotherapist and used simple pattern matching [9]; then came IBM's Shoebox voice-activated calculator, and over the years big companies started developing speech recognition systems and machines.

Another chatbot was launched on platforms like MSN Messenger and was assigned simple tasks such as checking the weather, conversing with users, and looking up facts [10]. Parry [11] is an improvement over ELIZA, having its own personality. In 2000, 2001, and 2004 a chatbot named ALICE won the Loebner prize due to its high similarity to humans [12], though it relies on a simple pattern-matching algorithm
based on the Artificial Intelligence Markup Language (AIML) [13].

Afterwards, many virtual personal assistants like IBM Watson [14], Amazon Alexa [15], Google Assistant [16], Microsoft Cortana [17], and Apple Siri [18] came to life. By now we can see that chatbots can be text-based or vocal, and they are utilized in different industries including marketing, support systems, education, healthcare, cultural heritage, and entertainment. Hence our digital assistant, which works as a guide and answers questions while interacting with humans.

III. Methodology

We aim at a digital human assistant that answers any given question regarding the university. We summarize the architecture in Figure I (A) and (B). In the coming sections, we explain each part in the system.

A. Speech Recognition

Speech recognition, or Speech to Text, can be defined as converting the speech sound signal into instructions or words to give machines the ability to respond to these commands [19, 20]. We deploy this technique in our work because the virtual assistant needs to understand what the user asks when speaking, which leads to the need to convert the user's speech into text in order to apply natural language understanding (NLU) techniques on the text.

Many APIs have been introduced for speech recognition, such as DeepSpeech [21]. However, we use a very simple script to convert speech into text with the help of the Google Speech Recognition API [22]. This script will be combined with the chatbot to understand the speech and then respond as text, which is then converted back to speech.
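A minimal sketch of such a script using the SpeechRecognition package [22]; the authors' actual script is not reproduced in the paper, and the microphone capture and recognize_google call here follow the library's documented interface:

```python
import speech_recognition as sr

recognizer = sr.Recognizer()

# Capture a single utterance from the default microphone
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)  # compensate for background noise
    audio = recognizer.listen(source)

try:
    # Google Web Speech API via the SpeechRecognition wrapper
    text = recognizer.recognize_google(audio)
    print("User said:", text)  # this text is then passed to the chatbot
except sr.UnknownValueError:
    print("Speech was not understood.")
except sr.RequestError as e:
    print("API request failed:", e)
```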
B. Question Answering

Before building a chatbot, its objective must be decided in order to pick the correct type of chatbot that suits the business or task. Classifying the bot depends on varied parameters like the response generation technique, the goal, input processing, the domain of knowledge, the provided service and finally the method chosen for building [23].

We decided what kind of dialog system we will build for our agent, starting with the knowledge domain: as we will be providing information related only to the university, the domain will be closed-domain instead of open-domain.

Then, considering goal classification, we defined what goal the bot should achieve. Chatbots, like FAQ bots, can be informative, built for providing information from a source; or chat-based, talking to the user typically as a human by responding with the correct sentence; or task-based, like bots helping in booking a flight or like a FAQ chatbot. Another measure for deciding the bot type is the input processing and response generation. From the three response generation models (rule-based, retrieval-based, and generative [24]), we picked the retrieval-based model.

After specifying the correct type, we started preparing our own dataset of questions and database of responses. The bot will retrieve the response from the response candidates in the database to answer the user's questions, which are treated as queries. To achieve this purpose, we explain each part included in the bot. The four essential concepts are: intents, entities, Named Entity Recognition (NER), and Intent Classification. We display in Figure II a conversation example held between a user and the bot in a textual interface before combining it with the virtual assistant, in order to check how well the chatbot is answering.

We created our own dataset, a collection of different questions related to Karabuk University that reached 300 questions in total. We made two copies of the dataset, one that suits the first model and the other that suits the second model, both having the same questions. Correspondingly, we created the database of answers to these questions. For our trainings we divide the dataset as 80% for training and 20% for validation.

C. Intent Classification

Table I. Intents and Entities.
Figure II. Conversation example between user and the chatbot.
D. Named Entity Recognition

Named-entity recognition is a Natural Language Understanding (NLU) problem and means information extraction, but for finding categories known as entities within the text [28]. An entity is a word considered as a parameter value that is extracted from context. While intents refer to the user's main goal, the entities work as keywords that refer to meaningful or important things and are used by the users to describe what they want, or in other words, to describe their intents. Entities can be system-defined, like data references [29], but for our problem we define our own entities. General entity examples are person, location, organization, city, date, etc., depending on the industry, and they are called domain entities, which get tagged from an input sentence [30].

In our work we got 14 entities in total; they are displayed in Table I. For entity extraction, many libraries and frameworks have been introduced, like Snips [31] and Rasa [32]. We used an NLP open-source library called SpaCy [33], which is an alternative to the popular NLTK [34] and includes pretrained machine learning models.

SpaCy made NLP easier in Python by providing new pipelines based on transformers, which improved the accuracy, efficiency and adaptability of SpaCy, especially in the third version. The NER model of SpaCy assigns labels to contiguous token groups. The NER model of SpaCy consists of the following:

- a wordly-wise word embedding technique through subword features and Bloom embedding
- a deep convolutional neural network with residual layers
- named entity parsing with a modern transition-based approach

We labeled the entities in the question dataset using the SpaCy NER annotator [35], and an example in Figure V shows how the entities look in a question. The NER model gets the sentence and extracts keywords that belong to proper domain entities; using them along with the intent, a response is retrieved from the database.

Figure V. Entities extracted from a sentence.

For fine tuning with SpaCy's NER, we start by creating the JSON file of our question dataset with defined entities. Afterwards we convert it to a SpaCy file and create the configuration using SpaCy's built-in config files; this is simply provided with one command line. Once done, we begin training with a SpaCy pipeline imported from their library. The models were trained for 100 epochs and evaluated on 20% of the dataset and on new data prediction. We show the evaluation metrics of the model in Table II.

Table II. Evaluation Metrics on neural network models.

Model | F1 | Precision | Recall
Intent Classifier | 99 | 99 | 99
Entity Recognizer | 99 | 99 | 98
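A hedged sketch of the spaCy v3 data-preparation step described above; the file names, the example sentence and the DEPARTMENT label are placeholders, not the authors' actual annotations:

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
doc_bin = DocBin()

# Hypothetical annotated example: (text, list of (start, end, label) character spans)
examples = [("Where is the computer engineering department?", [(13, 44, "DEPARTMENT")])]

for text, spans in examples:
    doc = nlp.make_doc(text)
    doc.ents = [doc.char_span(s, e, label=l) for s, e, l in spans]
    doc_bin.add(doc)
doc_bin.to_disk("train.spacy")

# Training itself is driven from the command line in spaCy v3, e.g.:
#   python -m spacy init config config.cfg --lang en --pipeline ner
#   python -m spacy train config.cfg --output ./model \
#          --paths.train train.spacy --paths.dev dev.spacy
```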
E. Combining with Unity

To combine the models with the virtual human project [36], we connect them to the Unity game engine using UDP communication. In this procedure we send the data from the Python model by sockets to Unity, which has a UDP client that reads the received data from the socket. We summarize this in Figure VI. The Python server sends the response as text to Unity's C# client in real time. In Unity's virtual human we generate the speech, lip motion and a smiling facial expression when interacting with the user.

Figure VI. UDP connection between Python and Unity's C#.
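A minimal sketch of the Python side of this UDP link; the host, port and encoding are assumptions, and the Unity C# client would read datagrams from the same port:

```python
import socket

UDP_HOST, UDP_PORT = "127.0.0.1", 5005  # assumed local endpoint of the Unity client

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  # UDP socket

def send_response_to_unity(response_text: str) -> None:
    """Send the chatbot's answer to Unity, which renders speech and lip motion."""
    sock.sendto(response_text.encode("utf-8"), (UDP_HOST, UDP_PORT))

send_response_to_unity("The computer engineering department is in building A.")
```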
“strongly agree.” We display the results of the survey for each
statement separately in the percentages form.
Our goal of using these evaluations is to quantify the range of participants' agreement with the statements built on the interview part, by providing strong evidence rather than simply applying a normal feedback survey.

C. Results

For the qualitative analysis of the interview responses, we built the themes depending on keywords that frequently occurred in the answers, namely fast/quickly/quick, information/data, human/human-like/humane/humanoid, interaction/interactive/interact, intent/intention, recognize/recognition and many more. Depending on the count of words we constructed the themes. These themes are the following:

Theme 1, human-like interaction. (Qualitative based)
Theme 2, quick responses. (Qualitative based)
Theme 3, persuasive in interaction. (Qualitative based)
Theme 4, fast recognition of intent.
Theme 5, highly informative and clear.

We summarize the results in Figure VII through a bar chart that shows the number of participants who chose a feedback on every theme. For instance, 3 of the users strongly agree on theme 4, while 3 others agree and only one feels neutral.

Figure VII. Diagram shows the number of users who strongly agree, agree, disagree, strongly disagree and had neutral feedback on each theme.

V. Discussion

Introducing an informative digital assistant was the key insight in this research; it will serve students, visitors, or newcomers to the university by answering any question related to our university. To achieve this, we integrate several models into a 3D virtual character. These models are a speech-to-text model, a question answering model, and sound synthesizing. We believe that building such an assistant will benefit the school by attracting the attention of users and providing all needed information to them as a normal human would. Additionally, the assistant can help each user individually.

However, we still believe that the assistant should provide more changeable responses and ask the user back in a more flexible chat flow until it understands the intention of the user. Therefore, future work will be enhancing the dataset and the conversation flow, and providing more interactive gestures and advanced service by the agent, such as providing a tour to the school visitors.

VI. Conclusion

Digital assistants are becoming dominant applications in real life that serve the community easily using artificial intelligence techniques. Additionally, virtual humans are playing a very important role in human-computer interaction, allowing people to interact with the system. In this paper our purpose was to build an assistant that helps with university queries. We combined multiple models, each one achieving a task, such as a model for speech recognition, a model for entity recognition and a model for intent classification. Moreover, we deploy a voice synthesizing plugin in Unity with these models. The reason for choosing a virtual human to be a digital assistant is because of the importance that expressions and face-to-face communication play in a conversation. Therefore, this project is a contribution to serving at a university, but built with state-of-the-art techniques.

Our future plan is to add more expressions and gestures and increase the size of the dataset to improve the project.

References

[1] Abu Shawar, B.A., Atwell, E.S.: Chatbots: are they really useful? J. Lang. Technol. Comput. Linguist. 22, 29–49 (2007)
[2] Brandtzaeg, P.B., Følstad, A.: Why people use chatbots. In: Kompatsiaris, I., et al. (eds.) Internet Science, pp. 377–392. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70284-1_30
[3] Siri. https://www.apple.com/siri/
[4] Braun and N. Rummel, "Facilitating Learning From Computer-Supported Collaborative Inquiry: the Challenge of Directing Learners' Interactions To Useful Ends," Research and Practice in Technology Enhanced Learning, vol. 05, no. 03, pp. 205, 2010.
[5] D. Coyle, G. Doherty, M. Matthews, and J. Sharry, "Computers in talk-based mental health interventions," Interacting with Computers, vol. 19, no. 4, pp. 545–562, 2007.
[6] Zue, S. Seneff, J. R. Glass, J. Polifroni, C. Pao, T. J. Hazen, and L. Hetherington, "JUPITER: A Telephone-Based Conversational Interface for Weather Information," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 1, pp. 85–96, 2000.
[7] M. Kolss, D. Bernreuther, M. Paulik, S. Stucker, S. Vogel, and A. Waibel, "Open Domain Speech Recognition & Translation: Lectures and Speeches," in Proceedings of ICASSP, 2006.
[8] R. Belvin, R. Burns, and C. Hein, "Development of the HRL route navigation dialogue system," in Proceedings of ACL-HLT, 2001, pp. 1–5.
[9] Weizenbaum, J.: ELIZA—a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 36–45 (1966). https://doi.org/10.1145/365153.365168
[10] Adamopoulou E, Moussiades L. An Overview of Chatbot Technology. Artificial Intelligence Applications and
Innovations. 2020;584:373-383. Published 2020 May 6. doi:10.1007/978-3-030-49186-4_31
[11] Colby, K.M., Weber, S., Hilf, F.D.: Artificial paranoia. Artif. Intell. 2, 1–25 (1971). https://doi.org/10.1016/0004-3702(71)90002-6
[12] Wallace, R.S.: The anatomy of A.L.I.C.E. In: Epstein, R., Roberts, G., Beber, G. (eds.) Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer, pp. 181–210. Springer, Cham (2009). https://doi.org/10.1007/978-1-4020-6710-5_13
[13] Marietto, M., et al.: Artificial intelligence markup language: a brief tutorial. Int. J. Comput. Sci. Eng. Surv. 4 (2013). https://doi.org/10.5121/ijcses.2013.4301
[14] IBM Watson. https://www.ibm.com/watson
[15] What exactly is Alexa? Where does she come from? And how does she work? https://www.digitaltrends.com/home/what-is-amazons-alexa-and-what-can-it-do/
[16] Google Assistant, your own personal Google. https://assistant.google.com/
[17] Personal Digital Assistant - Cortana Home Assistant - Microsoft. https://www.microsoft.com/en-us/Cortana
[18] Siri. https://www.apple.com/siri/
[19] Hui Liu, Chapter 1 - Introduction, Editor(s): Hui Liu, Robot Systems for Rail Transit Applications, Elsevier, 2020, Pages 1-36.
[20] Zwass, Vladimir. "Speech recognition". Encyclopedia Britannica, 10 Feb. 2016, https://www.britannica.com/technology/speech-recognition. Accessed 12 June 2021.
[21] Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates, et al. Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567, 2014.
[22] https://pypi.python.org/pypi/SpeechRecognition/
[23] Adamopoulou, Eleni & Moussiades, Lefteris. (2020). An Overview of Chatbot Technology. 373-383. 10.1007/978-3-030-49186-4_31.
[24] Hien, H.T., Cuong, P.-N., Nam, L.N.H., Nhung, H.L.T.K., Thang, L.D.: Intelligent assistants in higher-education environments: the FIT-EBot, a chatbot for administrative and learning support. In: Proceedings of the Ninth International Symposium on Information and Communication Technology, pp. 69–76. ACM, New York (2018)
[25] Abiodun, Oludare Isaac; Jantan, Aman; Omolara, Abiodun Esther; Dada, Kemi Victoria; Mohamed, Nachaat Abdelatif; Arshad, Humaira (2018-11-01). "State-of-the-art in artificial neural network applications: A survey". Heliyon. 4 (11).
[26] Cortes, C., Vapnik, V. Support-vector networks. Mach Learn 20, 273–297 (1995). https://doi.org/10.1007/BF00994018
[27] Kim, Yoon. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 10.3115/v1/D14-1181.
[28] Perera N, Dehmer M, Emmert-Streib F. Named Entity Recognition and Relation Detection for Biomedical Information Extraction. Front Cell Dev Biol. 2020 Aug 28;8:673. doi: 10.3389/fcell.2020.00673. PMID: 32984300; PMCID: PMC7485218.
[29] Ramesh, K., Ravishankaran, S., Joshi, A., Chandrasekaran, K.: A survey of design techniques for conversational agents. In: Kaushik, S., Gupta, D., Kharb, L., Chahal, D. (eds.) ICICCT 2017. CCIS, vol. 750, pp. 336–350. Springer, Singapore (2017). https://doi.org/10.1007/978-981-10-6544-6_31
[30] Jung, S.: Semantic vector learning for natural language understanding. Comput. Speech Lang. 56, 130–145 (2019). https://doi.org/10.1016/j.csl.2018.12.008
[31] https://github.com/snipsco/snips-nlu
[32] Bocklisch, Tom & Faulkner, Joey & Pawlowski, Nick & Nichol, Alan. (2017). Rasa: Open Source Language Understanding and Dialogue Management.
[33] https://github.com/explosion/spaCy
[34] Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc.
[35] https://github.com/ieriii/spacy-annotator
[36] https://github.com/GeoffreyGorisse/VHProject
Visual Question Answering for Medical Image Analysis based
on Transformers
Hanan Othman Yakoub Bazi Mohamad Alrahhal
Department of Computer Engineering Department of Computer Engineering Department of Applied Computer Science
King Saud University King Saud University King Saud University
Riyadh, Saudi Arabia    Riyadh, Saudi Arabia
439203835@student.ksu.edu.sa ybazi@ksu.edu.sa mmalrahhal@ksu.edu.sa
Abstract— Health care has been revolutionized over the past decades in conjunction with new discoveries and technological advancements. One of those areas that has rapidly evolved is medical imaging, which plays a significant role in screening, early diagnosis, and treatment selection. Artificial Intelligence (AI) has been utilized to support physicians' decisions related to medical imaging. Recently, medical visual question answering (VQA) has been utilized to predict the right answer for a given medical image accompanied by a clinically relevant question, to support the clinical decision. However, the validity of medical VQA is still not proven. In this paper, we propose a full transformer architecture for generating answers given the question and image. We extracted image features using the data-efficient image transformer (DeiT) model and used the bidirectional encoder representations from transformers (BERT) model for extracting textual features. We also applied concatenation to integrate the visual and language features. The fused features were then fed to the decoder to predict the answer. This model established new state-of-the-art results of 61.2 in accuracy and 21.3 in BLEU score on the PathVQA data set.

Keywords—Transformers, visual question answer, medical image, vision transformers

I. Introduction

Health care has been revolutionized over the past decades in conjunction with new discoveries and the advancement of technology. As a result, physicians have started to adopt new modalities to optimize patient care. One of those updated modalities is using medical images. Thus, medical images play a significant role in screening, early diagnosis, and during surgeries such as cardiac catheterization [1]. Over recent years, artificial intelligence (AI) has started to be incorporated into the medical field and has provided various innovation initiatives. Thus, AI has been utilized in health care settings, such as advancements in diagnosis, treatment personalization, and electronic health recording. Many studies have indicated that the integration of AI into medical diagnosis programs has increased the accuracy, speed, and consistency of the diagnosis, enhanced the prediction of patient outcomes and captured additional information missed by doctors [2], [3]. In this regard, several models have been proposed to help patients understand their physical conditions through visual inspection, such as image captioning [4], image retrieval [5], visual question answering (VQA) [6], and visual question generation (VQG) [7].

Visual question answering (VQA) is a task that takes a medical image and a clinical question about the image as input and produces natural language answers as output. This process shows great potential in providing medical assistance, such as helping patients get prompt feedback on their inquiries, making more informed decisions, supporting the suitable utilization of medical resources, providing a second opinion to physicians in diagnosis, and reducing the high cost of training medical professionals.

In the literature, researchers have utilized different methods and models for medical VQA. Yan et al. [8] used BERT [9] and VGG-16 [10] with a global average pooling (GAP) [11] strategy to extract question and image features, respectively. This is then followed by co-attention to combine these features and a decoder to predict the answers. Chen et al. [12] used BioBERT for the questions and ResNet34 [13] to extract image features, in which a bilateral-branch network (BBN) with a cumulative learning strategy [14] was used to fuse these features. Ren et al. [15] propose a model called CGMVQA that uses a multi-modal transformer architecture. Additionally, Khare et al. [16] used a similar architecture to CGMVQA with masked language modeling (MLM) and different datasets. Vu et al. [17] utilize a method denoted Question-Centric Multimodal Low-rank Bilinear (QCMLB) that combines image and question features by applying high involvement to the meaning of the query questions.

However, VQA in the medical domain is still in its embryonic stage, since the accuracy of previous methods has been significantly lower than doctors' assessments, owing to the difficulty of answer evaluation and the variety of answer expression. Thus, there is still a need to develop innovative techniques to overcome some of those limitations.

Recently, there is a growing body of literature that supports the utilization of Transformers in VQA. Transformers were initially used in natural language processing (NLP) tasks [18]. The transformer is based entirely on attention mechanisms and uses the encoder-decoder architecture. This attention focuses on specific parts of the input to get more efficient results. The main motivation of transformer models is the ability to model long-range interactions between different sequence elements, unlike RNNs. This motivation inspired Dosovitskiy et al. [19] to propose a convolution-free transformer, called the vision transformer (ViT), that applies directly to images by splitting the image into patches. These patches are treated like tokens in NLP applications. These models led to very competitive results on large datasets using extensive computing resources [20]. However, when ViT is trained on a small dataset, the model will not discover the properties of the image. Therefore, Touvron et al. [21] propose a new technique, called the data-efficient image transformer (DeiT), that requires less data (e.g., ImageNet1K) and fewer computing resources to produce a high-performance model. DeiT has the same architecture as the ViT model with knowledge distillation [22]. Knowledge distillation is a learning framework that uses student-teacher techniques by applying data augmentation coupled with
Figure I. An overview of our model, which contains an image encoder to encode the visual inputs, a text encoder to encode the language inputs, and a joint decoder to generate the answer.
optimization tricks and regularization. This training strategy has shown excellent results on ViT for smaller datasets, particularly when the knowledge has been distilled from a convolutional neural network (CNN) teacher model [23].

Although previous studies used the transformer as a single method, studies reported strong performance of multi-modal learning such as VQA. Tan et al. [24] utilize a method denoted learning cross-modality encoder representations from Transformers (LXMERT), which used a two-stream model with co-attention and only pre-trained the model with in-domain data. Lu et al. [25] used vision-and-language BERT (ViLBERT) with the same architecture but more complex co-attention, pre-trained with out-of-domain data. Hu et al. [26] proposed a unified transformer (UniT) that encodes each input modality with an encoder and uses a joint decoder that makes predictions for the final outputs.

This paper proposes a full transformer encoder-decoder architecture for the VQA model in the medical domain. The encoder modules encode each input modality, and the decoder generates the answer word by word. The image and question features have been extracted by using the DeiT model and BERT model, respectively. The extracted features for images and questions were fused using a fusion mechanism. Compared to previous work on multi-modal learning with transformers, our work is the first one that trains on medical images.

The remainder of the paper is organized as follows: Section II describes the main methods based on transformers. In Section III, we present the experimental results on the PathVQA dataset. Then we finally conclude and show future directions in Section IV.

II. Proposed method

The proposed medical VQA framework is shown in Figure I, which consists of four parts: question encoding for extracting textual features of the given question, image encoding for capturing visual features of the given medical image, concatenation used to fuse the visual and textual feature vectors to generate a joint representation, and a decoder for answer prediction. Detailed descriptions of the method are provided in the following subsections.

A. Question encoder

We encode the question by using bidirectional encoder representations from Transformers (BERT). BERT is a transformer encoder model pre-trained on large corpora with masked language modeling and next sentence prediction tasks. It contains several blocks of Multi-Head Attention (MHA), each followed by a Feed Forward Neural Network (FFN).

Given the input question $Q_i$, we tokenize it by adding a special classification token [CLS] followed by the WordPiece tokens and the separator token [SEP]. The token sequence is then passed through a word embedding to convert the tokens into vectors of dimension $d_{model}$. A positional embedding is added to each token to indicate its position in the sequence. Then the result is fed to a transformer with several blocks. Each transformer block is composed of a multi-head self-attention (MSA), an FFN, and normalization layers. The MSA block uses the self-attention mechanism to derive long-range dependencies between different words in the given text. Equation (1) shows the details of the computations in one self-attention head (SA). First, the input sequence is transformed into three different matrices, which are the key matrix $K$, the query matrix $Q$, and the value matrix $V$, using three linear layers with weights in $\mathbb{R}^{d_{model} \times d_K}$, $\mathbb{R}^{d_{model} \times d_Q}$, and $\mathbb{R}^{d_{model} \times d_V}$, for $i = 1, 2, \dots, h$, where $h$ is the number of heads. The attention map is computed by matching the query matrix against the key matrix using the scaled dot-product. The output is scaled by the dimension of the key, $d_K$, and then transformed into probabilities by a SoftMax layer. Finally, the result is multiplied with the value $V$ to get a filtered value matrix which assigns high focus to more important elements.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_K}}\right) V \tag{1}$$
144
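To make this tokenization step concrete, the following is a minimal sketch using the Hugging Face transformers library; the checkpoint name and the sample question are illustrative assumptions, not the paper's exact setup.

```python
# A hedged sketch of the question-encoding input pipeline described above,
# using the Hugging Face `transformers` library. The checkpoint name and the
# sample question are illustrative assumptions, not details from the paper.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

question = "what does the biopsy image show?"   # hypothetical PathVQA-style question
inputs = tokenizer(question, return_tensors="pt")  # adds [CLS] ... [SEP] automatically

print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))
# e.g. ['[CLS]', 'what', 'does', 'the', 'biopsy', 'image', 'show', '?', '[SEP]']

outputs = encoder(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, d_model = 768)
```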
The result is then fed to a transformer with several blocks. Each transformer block is composed of a multi-head self-attention (MSA) layer, an FFN, and a normalization layer. The MSA block uses the self-attention mechanism to capture long-range dependencies between different words in the given text. Equation (1) shows the details of the computations in one self-attention head (SA). First, the input sequence is transformed into three different matrices, the key K, the query Q, and the value V, using three linear layers with weights W_K ∈ R^(d_model × d_K), W_Q ∈ R^(d_model × d_Q), and W_V ∈ R^(d_model × d_V), for i = 1, 2, …, h, where h is the number of heads. The attention map is computed by matching the query matrix against the key matrix using the scaled dot-product. The output is scaled by the dimension of the key, d_K, and then transformed into probabilities by a softmax layer. Finally, the result is multiplied with the value V to obtain a filtered value matrix that assigns high focus to the more important elements.

Attention(Q, K, V) = softmax(QK^T / √d_K) · V    (1)

The MSA block is followed by the FFN, which consists of two fully connected layers with a ReLU activation function in between. It can be formulated as:

FFN(x) = max(0, xW1 + b1)W2 + b2    (2)
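As a worked illustration of Eqs. (1) and (2), the following NumPy sketch computes one self-attention head followed by the position-wise FFN; the dimensions and random weights are assumptions for demonstration, not trained parameters.

```python
# Minimal NumPy sketch of Eqs. (1)-(2): one self-attention head followed by
# the FFN. Dimensions and random weights are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Eq. (1): Attention(Q, K, V) = softmax(QK^T / sqrt(d_K)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

def ffn(x, W1, b1, W2, b2):
    # Eq. (2): FFN(x) = max(0, xW1 + b1)W2 + b2
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
n_tokens, d_model, d_k, d_ff = 8, 64, 64, 256
X = rng.normal(size=(n_tokens, d_model))            # embedded token sequence
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
head = attention(X @ Wq, X @ Wk, X @ Wv)            # one SA head
W1, b1 = rng.normal(size=(d_k, d_ff)), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)
out = ffn(head, W1, b1, W2, b2)
print(out.shape)  # (8, 64)
```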
B. Evaluation metrics

Accuracy [28] and BiLingual Evaluation Understudy (BLEU) [29] are commonly used as evaluation metrics in the VQA task. Accuracy measures the ratio of correctly predicted observations to total observations, while BLEU measures the similarity between predicted answers and the ground truth by matching n-grams.

C. Results

Our model is trained using the Adam optimizer with an initial learning rate of 0.001. We used a batch size of 50, up to 50 epochs, and categorical cross-entropy as the loss function. Table II shows the results for our proposed model, which achieves 61.2 in terms of the accuracy metric and 21.3 in terms of the BLEU metric. Although few models use the PathVQA dataset for training, our model increases accuracy by 2.2% and BLEU by 2.1%. One of the reasons for this significant gain in performance is the use of the new vision transformer technique.

Table II. Results Comparison

Methods                                        Accuracy   BLEU
CNN + LSTM + stacked attention network [27]    59.4       19.2
Proposed (DeiT + BERT)                         61.2       21.3
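For concreteness, the sketch below shows how the two reported metrics could be computed, using NLTK's sentence_bleu for the n-gram matching; the toy answers are invented for illustration and are not PathVQA samples.

```python
# A hedged sketch of the two reported metrics: exact-match accuracy and
# n-gram BLEU via NLTK. The toy answers are invented for illustration.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

predictions = ["benign tumor", "yes", "in the lung"]
references  = ["benign tumor", "no",  "in the left lung"]

# accuracy: ratio of exactly correct predictions to total observations
accuracy = sum(p == r for p, r in zip(predictions, references)) / len(references)

# BLEU: n-gram overlap between prediction and ground truth, averaged here
smooth = SmoothingFunction().method1
bleu = sum(
    sentence_bleu([r.split()], p.split(), smoothing_function=smooth)
    for p, r in zip(predictions, references)
) / len(references)

print(f"accuracy={accuracy:.3f}  BLEU={bleu:.3f}")
```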
Acknowledgment

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No. RG-1441-502.

IV. Conclusion

Health care has rapidly grown with different tools and techniques to improve patient care, one of which is medical imaging. In this work, we propose full transformers for answering questions about medical images. In particular, we use an encoder for each input modality. The DeiT model, distilled from a CNN teacher, is used to extract the image features. The text features are extracted with the BERT model by adding token, segment, and position embedding layers. The answer sequence is predicted by a decoder. Our model achieves a 61.2 accuracy score and a 21.3 BLEU score. Empirical evaluation on the recently published benchmark dataset PathVQA shows that our approach achieves superior performance compared with the state-of-the-art Med-VQA model. In future work, we plan to explore a better evaluation strategy for the model. We also plan to introduce better individual models to handle each of the leaf node tasks.

References

[1] E. Bercovich and M. Javitt, “Medical Imaging: From Roentgen to the Digital Revolution, and Beyond,” Rambam Maimonides Med. J., vol. 9, p. e0034, Oct. 2018, doi: 10.5041/RMMJ.10355.
[2] J. Ker, L. Wang, J. Rao, and T. Lim, “Deep Learning Applications in Medical Image Analysis,” IEEE Access, vol. 6, pp. 9375–9389, 2018, doi: 10.1109/ACCESS.2017.2788044.
[3] M. M. A. Rahhal, Y. Bazi, H. AlHichri, N. Alajlan, F. Melgani, and R. R. Yager, “Deep learning approach for active classification of electrocardiogram signals,” Inf. Sci., vol. 345, pp. 340–354, Jun. 2016, doi: 10.1016/j.ins.2016.01.082.
[4] C. Eickhoff, I. Schwall, and H. Muller, “Overview of ImageCLEFcaption 2017 – Image Caption Prediction and Concept Detection for Biomedical Images,” p. 10.
[5] A. Qayyum, S. M. Anwar, M. Awais, and M. Majid, “Medical image retrieval using deep convolutional neural network,” Neurocomputing, vol. 266, pp. 8–20, Nov. 2017, doi: 10.1016/j.neucom.2017.05.025.
[6] S. A. Hasan, Y. Ling, O. Farri, J. Liu, H. Muller, and M. Lungren, “Overview of ImageCLEF 2018 Medical Domain Visual Question Answering Task,” p. 8.
[7] M. Sarrouti, A. Ben Abacha, and D. Demner-Fushman, “Visual Question Generation from Radiology Images,” in Proceedings of the First Workshop on Advances in Language and Vision Research, Online, 2020, pp. 12–18, doi: 10.18653/v1/2020.alvr-1.3.
[8] X. Yan, L. Li, C. Xie, J. Xiao, and L. Gu, “Zhejiang University at ImageCLEF 2019 Visual Question Answering in the Medical Domain,” p. 9.
[9] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” arXiv:1810.04805, May 2019.
[10] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556, Apr. 2015. [Online]. Available: http://arxiv.org/abs/1409.1556
[11] M. Lin, Q. Chen, and S. Yan, “Network In Network,” arXiv:1312.4400, Mar. 2014. [Online]. Available: http://arxiv.org/abs/1312.4400
[12] G. Chen, H. Gong, and G. Li, “HCP-MIC at VQA-Med 2020: Effective Visual Representation for Medical Visual Question Answering,” p. 8.
[13] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[14] B. Zhou, Q. Cui, X.-S. Wei, and Z.-M. Chen, “BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, Jun. 2020, pp. 9716–9725, doi: 10.1109/CVPR42600.2020.00974.
[15] F. Ren and Y. Zhou, “CGMVQA: A New Classification and Generative Model for Medical Visual Question Answering,” IEEE Access, vol. 8, pp. 50626–50636, 2020, doi: 10.1109/ACCESS.2020.2980024.
[16] Y. Khare, V. Bagal, M. Mathew, A. Devi, U. D. Priyakumar, and C. V. Jawahar, “MMBERT: Multimodal BERT Pretraining for Improved Medical VQA,” arXiv:2104.01394, Apr. 2021. [Online]. Available: http://arxiv.org/abs/2104.01394
[17] M. H. Vu, T. Lofstedt, T. Nyholm, and R. Sznitman, “A Question-Centric Model for Visual Question Answering in Medical Imaging,” IEEE Trans. Med. Imaging, vol. 39, no. 9, pp. 2856–2868, Sep. 2020, doi: 10.1109/TMI.2020.2978284.
[18] A. Vaswani et al., “Attention Is All You Need,” arXiv:1706.03762, Dec. 2017. [Online]. Available: http://arxiv.org/abs/1706.03762
[19] A. Dosovitskiy et al., “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale,” arXiv:2010.11929, Oct. 2020. [Online]. Available: http://arxiv.org/abs/2010.11929
[20] Y. Bazi, L. Bashmal, M. M. A. Rahhal, R. A. Dayil, and N. A. Ajlan, “Vision Transformers for Remote Sensing Image Classification,” Remote Sens., vol. 13, no. 3, p. 516, Feb. 2021, doi: 10.3390/rs13030516.
[21] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” arXiv:2012.12877, Jan. 2021. [Online]. Available: http://arxiv.org/abs/2012.12877
[22] G. Hinton, O. Vinyals, and J. Dean, “Distilling the Knowledge in a Neural Network,” arXiv:1503.02531, Mar. 2015. [Online]. Available: http://arxiv.org/abs/1503.02531
[23] L. Bashmal, Y. Bazi, M. M. Al Rahhal, H. Alhichri, and N. Al Ajlan, “UAV Image Multi-Labeling with Data-Efficient Transformers,” Appl. Sci., vol. 11, no. 9, p. 3974, Apr. 2021, doi: 10.3390/app11093974.
[24] H. Tan and M. Bansal, “LXMERT: Learning Cross-Modality Encoder Representations from Transformers,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, Nov. 2019, pp. 5100–5111, doi: 10.18653/v1/D19-1514.
[25] J. Lu, D. Batra, D. Parikh, and S. Lee, “ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks,” arXiv:1908.02265, Aug. 2019. [Online]. Available: http://arxiv.org/abs/1908.02265
[26] R. Hu and A. Singh, “UniT: Multimodal Multitask Learning with a Unified Transformer,” arXiv:2102.10772, Mar. 2021. [Online]. Available: http://arxiv.org/abs/2102.10772
[27] X. He, Y. Zhang, L. Mou, E. Xing, and P. Xie, “PathVQA: 30000+ Questions for Medical Visual Question Answering,” arXiv:2003.10286, Mar. 2020. [Online]. Available: http://arxiv.org/abs/2003.10286
[28] M. Malinowski and M. Fritz, “A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input,” arXiv:1410.0210, May 2015. [Online]. Available: http://arxiv.org/abs/1410.0210
[29] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a Method for Automatic Evaluation of Machine Translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, Jul. 2002, pp. 311–318, doi: 10.3115/1073083.1073135.
Deep Learning for Face Detection and Recognition
Tuba Elmas ALKHAN, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, karabala.masa@std.izu.edu.tr
Alaa Ali Hameed, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, 0000-0002-8514-9255
Akhtar Jamil, Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey, 0000-0002-2592-1039
Abstract— With the continued development of machine learning and deep learning, face recognition technology based on convolutional neural networks (CNN) has become the most necessary and widely used methodology within the field of face recognition. A face recognition model is a technology capable of identifying a person from an image or a video. Various strategies for face recognition systems are effective; they work by comparing selected facial features from images with faces in a database. This paper creates a system that uses Convolutional Neural Networks (CNN) to recognize students' emotions from their faces. We achieved an accuracy of 74.41% and a validation accuracy of 77.00% on the FER-2013 dataset to classify seven different emotions through facial expressions.

Keywords—Face Detection, Face Recognition, Deep Learning, Emotion recognition, Convolutional neural networks (CNN), Student facial expression.

I. Introduction

Face emotion recognition is an active and vital area of research, especially these days, when the spread of the COVID-19 epidemic has shifted teaching to distance education. These systems play an essential role in our daily life and make it much more manageable. Face emotion recognition has been implemented in medicine, psychology, interactive games, public security, distance education, etc.

Face recognition in videos is challenging due to variations in pose, illumination, or facial expression. But it is an important task that has been widely utilized in several practical applications like security monitoring and surveillance [1].

The face is the most important part of the body. It is vital and expressive, and it can transfer many emotions silently. Facial expression recognition determines an emotion from face images. Generally, six basic emotions (happiness, sadness, surprise, anger, fear, and disgust), plus neutral, are categorized, and these are the same across all cultures [1].

CNNs have been proven to be very effective for various computer vision tasks, such as object or face detection and classification [2]. Applying a facial expression recognition system to the field of education makes it possible to detect, capture, and record the emotional changes of students during the learning process and supply a better reference for academics to teach based on students' abilities [3].

The facial recognition system involves two steps: face detection, which identifies human faces in images, and face recognition, which matches the face from a video or an image against a database of faces to recognize it. The two are similar but have different objectives. Researchers have proposed several facial detection and recognition systems, which will be discussed in detail in the next section.

The purpose of this article is to perform emotion recognition in the education area through a system using a convolutional neural network (CNN) that analyzes the facial expressions of students. CNN is a deep learning algorithm used for image classification. It includes several stages of image processing for extracting feature representations. There are several deep learning methods for extracting more complex features, such as Autoencoders, Recurrent Neural Networks, Gradient Descent, and Convolutional Neural Networks.

This study implements an automated system to realize emotion recognition in the education field. The system analyzes student facial expressions and gives feedback to an educator using a Convolutional Neural Network. Several classification algorithms have been applied to learn the instant emotional state (Random Forest, Artificial Neural Network (ANN), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Classification & Regression Trees). E-learning has several advantages, such as saving time and money. Through this kind of learning, all students can use the contents anytime and anywhere, which leads to good participation, retention, and scalability and offers personalization, but it does not provide enough face-to-face interactivity between an educator and learners.

II. LITERATURE REVIEW

This section highlights some developments made in the field of facial emotion recognition in various areas such as medicine, health, psychology, online education, and biomedical engineering. Today, face emotion recognition is a vital and important area of research, especially these days, when the spread of the COVID-19 epidemic has moved teaching to distance education. The detection of facial emotions is possible in online education; therefore, it can help academics to adjust their performance depending on the students' emotions. In general, artificial intelligence includes deep learning and machine learning, and many machine learning and deep learning algorithms are used in this field. Convolutional neural networks (CNNs) have become the most important and widely used method in the field of face recognition. A CNN is a deep learning algorithm used for image classification that includes several stages of image processing for extracting feature representations.

In [6], the authors propose a CNN architecture called Trunk Branch Ensemble Convolutional Neural network
(TBE-CNN) to overcome problems in facial recognition from a video. They used this system in surveillance. Surveillance applications need to be capable of detecting and recognizing faces quickly, and such a system must be able to withstand changes in blur, zoom, illumination, and pose. This version extracts features effectively by sharing the low- and middle-level convolutional layers.

In [7], a new system for face recognition based on a Stacked Convolutional Autoencoder (SCAE) and sparse representation was presented. The system can extract deeper and more abstract features with high recognition speed. However, the recognition rate is not high, so the system still needs development.

The Viola-Jones framework has been widely used; researchers Padilla and Costa used it for detecting the location of faces, and their work focuses on the appraisal of face detection classifiers, such as those in OpenCV. The system needs images with and without faces (positive and negative pictures) to train the classifier and extract (Haar) features from the images. The authors evaluated the performance of several classifiers and tested their accuracy [8].

The authors in [2] proposed a school system using a convolutional neural network to help professors and instructors adapt their academic performance based on students' emotions. First, they detect students' faces in an image by using Haar Cascade Classifiers, then perform emotion recognition by using a CNN with seven types of expressions. The system achieved an accuracy of 70% on the FER-2013 dataset.

In [9], the authors developed a system that determines students' emotions and provides feedback to improve distance education and update learning content. Head and eye movement can help to assess student attention and concentration levels. The system is suitable and effective for detecting students' negative emotions. The authors discussed some face detection algorithms such as Local Binary Patterns (LBP), neural networks, and AdaBoost.

(Chang et al., 2018) designed a new Convolutional Neural Network based on ResNet to extract features and a Complexity Perception Classification (CPC) algorithm for facial expression recognition using three different classifiers (Softmax, SVM, Random Forest). It improved the recognition accuracy and fixed some misclassified expression categories. CNN and Softmax with the CPC algorithm achieved an accuracy of 71.35% on Fer2013 and 98.78% on CK+ [16].

(Jiang et al., 2020) evaluated a Gabor Convolutional Network on three types of datasets (Real-world Affective Faces, FER+, and Fer2013). The proposed approach includes four Gabor convolutional layers with two fully connected layers. They find the optimal model by changing the number of layers and the number of units in the convolutional layers. They then discussed and compared the proposed GCN model with different models such as AlexNet and ResNet. The GCN achieved the best accuracy on the Fer2013 dataset [15].

In their study, (Ayvaz et al.) discussed a new facial emotion recognition system built with the help of several algorithms (Random Forest, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Classification & Regression Trees); the system can classify the emotions of students. The system detects the facial emotions of the students and gives a response to an instructor according to the facial expression of the learner. SVM provides the best prediction accuracy rate of 98.24% [10].

Recently, many works [12, 2] used CNNs for facial expression recognition. The recognition of human facial expressions is a hard problem for deep learning and machine learning, so the convolutional neural network is used to overcome the problems in facial expression classification.

In their study, Roman RADIL et al. tested the performance of the proposed Convolutional Neural Network against three image recognition methods: Local Binary Patterns Histograms (LBPH), K-Nearest Neighbor (KNN), and Principal Component Analysis (PCA). The result shows that Local Binary Patterns Histograms provide better results than Principal Component Analysis and K-Nearest Neighbor, and an accuracy rate of 98.3% was achieved for the proposed CNN [11].

Saravanan et al. discussed the classification of images of human faces into one of 7 basic emotions (Fear, Disgust, Surprise, Anger, Sadness, and Happiness). The authors proposed a Convolutional Neural Network (CNN) model consisting of six convolutional layers, two max pooling layers, and two fully connected layers. This model achieved a final accuracy of 0.60.

In [14], the authors created a model using a CNN to detect facial expressions in real time using a webcam. The model is used to classify the expression of human faces, and it gave a training accuracy of 79.89% and a test accuracy of 60.12%.

(Wang et al., 2020) Online education has developed because of the spread of the COVID-19 pandemic, which has led to the closure of schools and the transfer of education to distance education, so the authors proposed a system combining a Face Emotion Recognition (FER) algorithm and online course platforms based on a CNN architecture [17].

III. MATERIALS AND METHODOLOGY

A. Dataset

For our deep learning model to be good and smart enough to discover expressions, we need to train it with a facial expression dataset. Here we used the FER-2013 dataset. FER-2013 is an open-source dataset for recognizing facial expressions, which was shared on Kaggle through the ICML 2013 conference. The dataset contains 35,887 grayscale images of 48x48-sized faces, divided into 3,589 test and 28,709 train images. The dataset contains facial expressions belonging to these seven emotions (Happy, Sad, Neutral, Surprise, Fear, Angry, and Disgust). Figure 1 shows some example images from the FER-2013 dataset, and Table I illustrates the description of the dataset. The image dataset consists of grayscale images, and we kept the size the same for our training and testing (300x300).

Table I. FER2013 dataset description

Label   Number of images   Emotion
0       4593               Angry
1       547                Disgust
2       5121               Fear
3       8989               Happy
4       6077               Sad
5       4002               Surprise
6       6198               Neutral

B. Proposed Method

In this part, we describe our proposed system, which uses a Convolutional Neural Network (CNN) model to analyze the students' facial expressions. The first step of our system is to detect the face in images or video (video is a set of images), and these face images are then used as input to the network. Lastly, using the CNN, the system classifies the expression of a student's face into one of these expressions: happy, sad, fear, angry, surprise, neutral, or disgust.

A convolutional neural network is a type of artificial neural network that uses a convolution method for extracting a large number of features from the input data.

A CNN model contains 3 main types of layers: the convolutional layer, the pooling layer, and the fully connected layer. Figure II shows the CNN architecture.
Figure II. Convolutional Neural Network Architecture

Convolutional Layer: We used the convolutional layer to extract the different attributes from an input image. The convolution keeps a spatial association among pixels by learning features; the images are convolved with a group of learnable filters. This creates, in the output picture, a feature map that provides some information about the image. Finally, the feature maps are fed to the next layer for learning more features. In other words, the convolution multiplies two images, which can be represented as matrices, to get an output that is used to extract a large number of features from the image.

The convolution formula is represented in Equation (1), where f is the input image, * denotes the convolution operation, and g is the filter matrix. Figure III shows the convolution operation.

f(x, y) * g(x, y) = Σ_{n1=−∞}^{∞} Σ_{n2=−∞}^{∞} f(n1, n2) · g(x − n1, y − n2)    (1)

Figure III. Convolution operation.
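As an illustration of Equation (1), the following NumPy sketch implements a plain 2-D convolution by flipping the filter and sliding it over the input; the 3x3 edge filter is an assumed example, not the kernel learned by the network.

```python
# A small NumPy sketch of the 2-D convolution in Equation (1): the filter g
# is flipped and slid over the input f, and overlapping products are summed.
# The 3x3 edge filter is an illustrative assumption, not the paper's kernel.
import numpy as np

def conv2d(f, g):
    # 'valid' convolution: output shrinks by (kernel - 1) in each dimension
    g = np.flipud(np.fliplr(g))                  # flip the kernel per the definition
    H, W = f.shape
    kh, kw = g.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(f[y:y + kh, x:x + kw] * g)
    return out

image = np.random.rand(48, 48)                   # grayscale input, FER-2013 size
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)     # example edge-detection filter
feature_map = conv2d(image, kernel)
print(feature_map.shape)                          # (46, 46)
```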
Pooling Layer: Pooling is also known as down-sampling or subsampling. A pooling layer down-samples the feature maps yet holds the significant data. There are three common pooling methods: sum pooling, max pooling, and average pooling. The approach most typically used is max pooling. Max pooling is used to gradually minimize the spatial size of the input; it controls overfitting (regularization) and provides invariance to small translations of the input. The pooling layer provides higher generalization, resistance to distortion, and quicker convergence. It is typically positioned between the convolutional layers. Figure IV shows an example of the max pooling operation.

Fully connected layer: A fully connected layer (FCL) in a neural network is a layer in which all inputs from one layer are connected to every neuron of the next layer. The intention of using the FCL is to take the output of the previous layers, such as the convolutional and pooling layers, and classify the input image into different classes according to the training dataset. The term fully connected means that all filters of the previous layer are linked to all filters of the next layer. Fully connected layers are positioned before the network's classification output and are used to flatten the results before classification. In short, the convolutional and pooling layers work as feature extractors from the inputs, and the fully connected layers work as the classifier.

IV. EXPERIMENTAL RESULTS

We designed our CNN model. Here we used the FER-2013 dataset. It is an open-source dataset shared on Kaggle. The dataset includes seven categories: Happy, Sad, Fear, Angry, Disgust, Neutral, and Surprise. The training set consists
of about 17084 images. The testing set consists of about 4180 images. We deal with 3 classes (Happy, Sad, Neutral) in training our model. Image rows = 48 and image columns = 48 determine the size of the image matrix that we feed to our model. Batch size = 32; the batch size is the number of samples that are processed before the model is updated. The number of complete passes through the training dataset was 25 epochs. We used the ImageDataGenerator class to

[Architecture diagram: the input image (48x48) passes through stacked Conv + ReLU, batch normalization, and max pooling blocks, with dropout (p = 0.5), followed by Dense + ReLU and a final Dense layer with softmax.]
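The sentence above breaks off in the source; as a hedged sketch, this is how Keras's ImageDataGenerator is commonly configured for this kind of setup. The directory layout and augmentation choices are assumptions, not values from the paper.

```python
# A hedged sketch of a typical ImageDataGenerator setup for FER-2013-style
# data. The directory path and augmentation parameters are assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalize pixel values to [0, 1]
    horizontal_flip=True,     # simple augmentation; assumed, not stated
    validation_split=0.2,
)

train_gen = train_datagen.flow_from_directory(
    "fer2013/train",          # hypothetical directory layout
    target_size=(48, 48),
    color_mode="grayscale",
    batch_size=32,
    class_mode="categorical",
    subset="training",
)
```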
The Adadelta optimizer gave an accuracy of 0.7179, and a validation accuracy of 0.7563 was attained over 50 epochs; the learning rate was 0.1 and the batch size was 32.

Figure IX. Accuracy over test and train data with SGD optimizer in the proposed CNN.
Figure X. Loss over test and train data with SGD optimizer.
Figure XIII. Loss over train and test data with Adadelta optimizer in CNN.
Figure XIV. Confusion Matrix with Adadelta optimizer using the proposed method.
Patience represents the number of epochs after which training will be stopped because there is no improvement and the loss starts to increase; we set patience to 10. Verbose: to discover and print the training epoch at which training was stopped, verbose can be set to 1. restore_best_weights determines whether to retrieve the weights with the best value of the monitored quantity; here we set it to True.

The ModelCheckpoint class allows you to define where to checkpoint the model weights and keep them. The weights can therefore be loaded later to carry on training from the saved state. We monitored the validation loss and minimized it using the mode='min' parameter.
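A minimal sketch of the early stopping and checkpointing configuration described above, using the standard Keras callbacks; the monitored quantity and the checkpoint file name are assumptions consistent with the text, not code from the paper.

```python
# A hedged sketch of the callbacks described above. The checkpoint file name
# is a hypothetical placeholder.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

early_stop = EarlyStopping(
    monitor="val_loss",
    patience=10,               # stop after 10 epochs without improvement
    verbose=1,                 # print the epoch at which training stopped
    restore_best_weights=True, # roll back to the best monitored value
)

checkpoint = ModelCheckpoint(
    "best_model.h5",           # hypothetical path for the saved weights
    monitor="val_loss",
    mode="min",                # keep the weights that minimize the loss
    save_best_only=True,
)

# model.fit(train_gen, epochs=25, callbacks=[early_stop, checkpoint])
```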
Our system detects the faces in the input images of the students by using a Haar cascade detector and then classifies them into one of seven basic expressions. The proposed method achieved an accuracy of 77% using the Adam optimizer on the FER-2013 dataset at 25 epochs.
V. CONCLUSION

In this study, our aim was to detect faces and then classify facial expressions, so we present a convolutional neural network model for the recognition of the facial expressions of students. With the help of deep learning and machine learning technologies, we can classify the emotions of the online learner; this can help the instructor recognize students' understanding during a presentation and gives feedback to the educator. In our future work, we are going to focus on applying the Convolutional Neural Network model to 3D images of students' faces so as to extract their emotions.

REFERENCES

[12] A. Fathallah, L. Abdi, and A. Douik, "Facial expression recognition via deep learning," Proc. IEEE/ACS Int. Conf. Comput. Syst. Appl. (AICCSA), pp. 745–750, 2018.
[13] M. Mohammadpour, H. Khaliliardali, S. M. R. Hashemi, and M. M. Alyannezhadi, "Facial emotion recognition using deep convolutional networks," 2017 IEEE 4th Int. Conf. Knowledge-Based Eng. Innov. (KBEI), pp. 0017–0021, 2018.
[14] I. Talegaonkar, K. Joshi, S. Valunj, R. Kohok, and A. Kulkarni, "Real Time Facial Expression Recognition using Deep Learning," Elsevier-SSRN, 2019.
[15] P. Jiang, B. Wan, Q. Wang, and J. Wu, "Fast and Efficient Facial Expression Recognition Using a Gabor Convolutional Network," IEEE Signal Process. Lett., vol. 27, pp. 1954–1958, 2020.
[16] T. Chang, G. Wen, Y. Hu, and J. J. Ma, "Facial expression recognition based on complexity perception classification algorithm," arXiv, 2018.
[17] W. Wang, K. Xu, H. Niu, and X. Miao, "Emotion Recognition of Students Based on Facial Expressions in Online Education Based on the Perspective of Computer Simulation," Complexity, vol. 2020, 2020.
A Lightweight and Interpretable Deepfakes
Detection Framework
Muhammad Umar Farooq, Department of Software Engineering, University of Engineering and Technology, Taxila, Pakistan, softwareengineerumar@gmail.com
Ali Javed, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan, ali.javed@uettaxila.edu.pk
Khalid Mahmood Malik, Department of CS and Engineering, Oakland University, Rochester, MI, USA, mahmood@oakland.edu
Abstract—The recent realistic creation and dissemination of so-called deepfakes poses a serious threat to social life, civil rest, and law. Celebrity defaming, election manipulation, and deepfakes as evidence in a court of law are a few potential consequences of deepfakes. The availability of open source trained models based on modern frameworks such as PyTorch or TensorFlow, video manipulation apps such as FaceApp and REFACE, and economical computing infrastructure has eased the creation of deepfakes. Most of the existing detectors focus on detecting either face-swap, lip-sync, or puppet-master deepfakes, but a unified framework to detect all three types of deepfakes is hardly explored. This paper presents a unified framework that exploits the power of a proposed feature fusion of hybrid facial landmarks and our novel heart rate features for the detection of all types of deepfakes. We propose novel heart rate features and fuse them with the facial landmark features to better capture the facial artifacts of fake videos and the natural variations present in original videos. We use these features to train a lightweight XGBoost classifier to distinguish between deepfake and bonafide videos. We evaluated the performance of our framework on the world leaders dataset (WLDR), which contains all types of deepfakes. Experimental results illustrate that the proposed framework offers superior detection performance over the comparative deepfakes detection methods. Performance comparison of our framework against LSTM-FCN, a candidate deep learning model, shows that the proposed model achieves similar results while being more interpretable.

Index Terms—Deepfakes, Multimedia Forensics, Random Forest Ensembles, Tree boosting, XGBoost, Faceswap, Lip sync, Puppet Master.

I. INTRODUCTION

Recent advancements in deep learning (DL) have impacted the way we solve complex technical problems in computer vision (CV) and robotics. With the widespread availability of video synthesis repositories and video manipulation apps such as FaceApp [2] and REFACE [1], video manipulation has become easy, even for a layman. Video synthesis is beneficial in some ways, like avatar creation, animated video content creation, etc. Sometimes videos are synthesized just for the sake of fun, like a recent realistic TikTok video of Tom Cruise [3]. However, the case is not always that simple. Depending on the time and context, deepfakes pose a serious threat to society. With deepfakes, celebrities are defamed, and election campaigns could be manipulated. DL based video synthesis tools use generative adversarial networks (GAN) under the hood. The adaptive nature of GANs has made it difficult to develop a robust detection solution. Whenever a deepfakes detection model is developed, we witness some variant of a GAN based generation model that exploits the newly developed detection model by manipulating its cues. Thus, deepfakes creation and detection is a constant battle between ethical and unethical machine learning (ML) experts.

Deepfakes detection got much attention in the last decade after realistic fake videos of politicians and celebrities went viral via social media platforms. Current deepfake videos are categorized as face-swap, lip-sync, and puppet-master [4]. In face-swap deepfakes, the face of a target person is added in place of a source person in the original video to create a fake video of the target person. In lip-sync deepfakes, the lips of a person are synced to an audio track to suggest that the person is speaking the text in that audio. In puppet-master deepfakes, the face of the target person is placed in the original video, but the facial expressions of the source person are retained on the target face to make the fake more realistic. Most of the existing detection solutions target specific types of deepfakes; however, generic solutions capable of countering all types of deepfakes are less explored. For example, Agarwal et al. [5] proposed a detection technique for lip-sync deepfakes. This technique exploited the inconsistencies between the viseme (mouth shape) and the phoneme (spoken word). This work applied manual and CNN based techniques to compute the mapping of visemes to phonemes. The model is good for a specific set of seen data. However, model performance can degrade on unseen data with different patterns of viseme-to-phoneme mapping, with a change of speaking accent, or even with non-alignment of audio to video.

Most of the existing systems are unable to perform well on
all three types of deepfakes. Moreover, deepfakes detection models based on traditional classifiers like SVM work only where the data is linearly separable. CNN based models are computationally more complex and are black-box in terms of prediction. Therefore, this paper addresses the following research questions:

1) Is it possible to improve the detection accuracy of deepfakes using hybrid landmark and heart-rate features on a diverse dataset containing all three types of deepfakes?
2) Is it possible to create a generalized detection model based on the proposed hybrid landmark and heart-rate features and ensemble learning?
3) Is it possible to achieve the same accuracy as deep learning models but improve interpretability by using an ensemble of supervised learners?

Existing deepfake detection techniques are broadly categorized as handcrafted-features based [6]–[9] or DL based [10]–[14]. For example, Yang et al. [9] used 68-D facial landmark features to train an SVM classifier for detection. This work achieved good performance on good quality videos of the UADFV [9] and DARPA MediFor [15] datasets but was unable to perform well on low quality videos. Moreover, the evaluation of this work did not consider all types of deepfakes. Matern et al. [6] used 16-D texture based eye and teeth features to exploit visual artifacts for detecting video forgeries like face-swap and Face2Face. The most important aspect of this work was detecting the difference in the eye color of a POI for face-swap deepfakes detection by exploiting missing details like reflections in eye color. Additionally, this work uses face border and nose tip features along with the eye color features for Face2Face deepfakes detection. This technique [6] has the limitation of working only for faces with clear teeth and open eyes. Lastly, the evaluation of this work was only performed on the FaceForensics++ [10] dataset. Li et al. [7] used the targeted affine warping artifacts introduced during deepfakes generation. Targeting the specific artifacts reduced the training overhead and improved the efficiency. However, this specific artifact selection can compromise the robustness of the technique by making it difficult to detect a deepfake with slightly new transformation artifacts. Agarwal et al. [8] used an open source toolkit, OpenFace2 [16], for facial landmark features extraction. Some features were derived based on the extracted landmark features. These derived features were then used along with action unit (AU) features to train a binary SVM for deepfakes detection. This technique was proposed for five POIs where all POIs were linearly separable in a t-SNE plot. However, for an increased number of POIs in the updated dataset [17], the performance of this technique degraded significantly. In their extended work, Agarwal et al. [17] proposed a framework based on spatial and temporal artifacts in deepfakes. This framework is based on some threshold based rules to classify a video as real or fake. This rule-based approach works on selected datasets; however, the performance of this hard coded, threshold oriented approach is expected to degrade on unseen data. In [18], the authors proposed a new framework, 'FakeCatcher', which uses biological signals from three face regions of real videos to detect fake videos. FakeCatcher applied many transformations to the biological features, like autocorrelation, power spectral density, wavelet transform, etc. The authenticity decision is based on the aggregated probabilities of two probabilistic classifiers (SVM and CNN). Performance was evaluated on their own customized dataset; however, it was not evaluated on all three types of deepfakes.

Besides the handcrafted-features based methods, deep learning based methods are also being employed for deepfakes detection. Guera et al. [10] applied a DL based technique to detect deepfakes. This technique applied a CNN to extract features, followed by a long short-term memory (LSTM) to learn those features. An important contribution of this work was the exploitation of temporal inconsistencies among deepfakes for classification. However, this approach is unable to identify all three types of deepfakes. Afchar et al. [11] designed a neural network (MesoNet) to detect deepfakes and Face2Face video forgeries. This work designed an end-to-end architecture with convolutional and pooling layers for feature extraction followed by dense layers for classification. These methods [10], [11] were evaluated on videos collected from random websites rather than a standard dataset, which casts doubt on the robustness of these approaches for a large-scale and diverse standard dataset. Nguyen et al. [12] designed a capsule network to expose multiple types of tampering in images and videos. This framework aimed at the detection of face swapping, facial re-enactments, and computer generated images. The framework used dynamic routing and expectation-maximization algorithms for performance improvement. The capsule network employed VGG-19 for latent face feature extraction and used these features for the classification of fake and bonafide videos. The framework is good at detecting face-swap forgeries in the FaceForensics dataset; however, it was not evaluated on lip-sync and puppet-master deepfakes and is complex in terms of computations. Sabir et al. [13] proposed a DL based method that feeds cropped and aligned faces to a CNN (ResNet and DenseNet) for feature extraction, followed by an RNN for classification. The most important aspect of this work was the use of features from multiple levels of the CNN to incorporate mesoscopic level feature extraction. This work [13] only used the FaceForensics++ [11] dataset for evaluation and did not consider lip-sync and puppet-master deepfakes. Yu et al. [14] used a CNN to capture the fingerprints of GAN generated images to perform the classification of synthetic and real images. This technique targeted fake images generated with four GAN variants (ProGAN, SNGAN, CramerGAN, MMDGAN), but it might not be able to detect fake images generated with a new GAN variant. In [19], the authors used an ensemble of four CNNs to achieve good results on DFDC. An attention mechanism was added to EfficientNetB4 to get insights into the training process. EfficientNetB4 and EfficientNetB4Att were trained end-to-end, whereas EfficientNetB4ST and EfficientNetB4AttST were trained in Siamese training settings. An important aspect of this method was the data augmentation
(i.e., down sampling, hue saturation, JPEG compression, etc.) during training and validation for model robustness. Moreover, this technique performs well on the large face-swap dataset DFDC but is not evaluated on all three types of deepfakes and is computationally complex. In [20], the authors used EfficientNet (a CNN) and a gated recurrent unit (GRU) (an RNN) to exploit spatiotemporal features in video frames to detect deepfake videos. This work included data augmentation on real videos during training to balance classes, as DFDC is highly class imbalanced. Moreover, this architecture performs well on the large face-swap dataset DFDC but is not evaluated on all three types of deepfakes and is complex in terms of computations.

Most methods based on handcrafted features [6]–[9] fail to generalize well on different types of deepfakes like lip-sync and puppet-master. CNN based techniques [10]–[14] are computationally complex and black-box in terms of generating the output. Moreover, these methods exploit some GAN specific artifacts produced during generation, so they might fail to detect deepfakes generated with a new GAN architecture.

To address the above mentioned problems and limitations of existing works, this paper proposes a lightweight model based on a feature fusion of facial landmarks and heart rate features. For the landmark features, we analyzed the impact of each landmark feature category before the final feature selection. We analyzed the impact of different combinations of feature categories: we started with the two most effective feature categories and then added one category to the feature-set at a time in decreasing order of effectiveness. We disregarded the concept of POIs being linearly separable, because that concept becomes invalid with a higher number of POIs. We used XGBoost [21] for classification. XGBoost uses bagging, as in Random Forests, for variance related errors and a gradient boosting algorithm for bias related errors. XGBoost successfully addresses the classification problem where data points are not linearly separable.

The main contributions of this paper are as follows:
• We propose a lightweight and interpretable deepfakes detection framework capable of accurately detecting all types of deepfakes, namely faceswap, puppet-master, and lipsync.
• We propose novel heart rate features and fuse them with a robust set of selected facial landmark features for deepfakes detection.
• We highlight that an XGBoost based solution is lightweight compared to CNN based solutions and generalizes better compared to other conventional classifiers like SVM, KNN, etc.

The rest of the paper is structured as follows. Section 2 presents the details of feature engineering and model development. In Section 3, we present the details of the performance evaluation and a comparative analysis w.r.t. state of the art methods. Finally, we conclude the paper in Section 4.

II. METHODOLOGY

This section provides an overview of the proposed framework. As shown in Figure I, the input video is processed to extract 850-D facial landmark and 63-D heart rate features. An XGBoost classifier is used for classification. The classifier is trained on each sub-category of landmark and heart rate features. Finally, we reduce the dimensions of our features to select the most reliable features among them all and form the final features-set. The XGBoost classifier is trained on the final features-set to classify a video as fake or bonafide. The process flow of the proposed solution is shown in Figure I.

A. Features Extraction

Effective feature extraction is crucial for any classification task. For this purpose, we propose a fused features-set consisting of our novel heart rate features and the facial landmark features. We extracted the facial landmark features using the OpenFace2 [16] toolkit. For the heart rate features, we selected seven regions of interest (ROIs), as shown in Figure I. The seven ROIs are right cheek (RC), left cheek (LC), chin (C), forehead (F), outer right edge (OR), outer left edge (OL), and center (CE). We calculated the RGB values of all ROIs and then applied some transformations to create the heart rate features. Details of the transformations are as follows:

HRs = {Z_R, Z_G, Z_B}    (1)

HRr = {Z_R/Z_G, Z_R/Z_B, Z_G/Z_B}    (2)

where HRs ∈ {RC, LC, C, F, OR, OL, CE} and R ↔ red, G ↔ green, B ↔ blue.

HR = HRs ∪ HRr    (3)

where HRs represents the simple heart rate features at the ROIs and HRr the ratios of the heart rate features. The union of these features generates our heart rate features.
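A minimal NumPy sketch of Eqs. (1)–(3) follows: per-ROI mean RGB values plus their channel ratios, concatenated into a heart rate feature vector. The ROI pixel extraction is assumed to happen elsewhere (e.g., from OpenFace2 landmarks), and the per-frame vector here is 42-D; the paper's full 63-D set presumably includes further transformations not shown.

```python
# A hedged sketch of Eqs. (1)-(3). Each ROI is assumed to already be an
# (N, 3) array of RGB pixel values; the random inputs are toy stand-ins.
import numpy as np

ROIS = ["RC", "LC", "C", "F", "OR", "OL", "CE"]  # seven regions of interest

def heart_rate_features(roi_pixels):
    """roi_pixels: dict mapping ROI name -> (N, 3) array of RGB values."""
    features = []
    for name in ROIS:
        zr, zg, zb = roi_pixels[name].mean(axis=0)   # Eq. (1): simple features
        features += [zr, zg, zb]
        features += [zr / zg, zr / zb, zg / zb]      # Eq. (2): channel ratios
    return np.array(features)                        # Eq. (3): union of both

rng = np.random.default_rng(0)
rois = {name: rng.uniform(1, 255, size=(100, 3)) for name in ROIS}
print(heart_rate_features(rois).shape)  # (42,) = 7 ROIs x 6 features per frame
```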
B. Features Standardization & Segmentation

Both the landmark and our proposed heart rate features are on different scales. To fuse the features, we standardize them by partially learning the distribution of the features during data loading. We apply standardization as shown in Eq. (4), based on the learned distribution over all features.

z = (x − µ) / σ    (4)

where µ is the mean and σ is the standard deviation of a feature column.

Our solution works at both the frame and the segment level. For segment level operation, we created segments with a length of 30 frames and an overlap of 10 frames. In our case, the video frame rate is 30 frames per second.
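The following sketch combines Eq. (4) with the segmentation scheme just described, reading the 10-frame overlap as a stride of 20 frames; the random matrix is a stand-in for a video's per-frame feature rows.

```python
# A hedged sketch of Eq. (4) plus segmentation: z-score standardization per
# feature column, then 30-frame segments with a 10-frame overlap.
import numpy as np

def standardize(X):
    # Eq. (4): z = (x - mu) / sigma, per feature column
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma

def segment(X, length=30, overlap=10):
    stride = length - overlap
    return [X[s:s + length] for s in range(0, len(X) - length + 1, stride)]

frames = np.random.rand(300, 913)   # 10 s at 30 fps; 850-D landmarks + 63-D HR
segments = segment(standardize(frames))
print(len(segments), segments[0].shape)  # 14 segments of shape (30, 913)
```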
Figure I. Architecture of the Proposed Framework

C. Classification

For the classification task, we need a classifier that is lightweight and can generalize easily to new datasets. The classification process should be interpretable so that we can follow a directed path for further improvements. To incorporate those requirements, we employed extreme gradient boosting
(XGBoost) [21], an approach for gradient boosted decision trees. XGBoost is an algorithm in the class of gradient boosting machines. In boosting algorithms, many weak learners are ensembled sequentially to create a strong learner with low variance and high accuracy. In boosting, the learning of the next predictor is improved to avoid repeating the errors made by any previous predictor. In a Random Forest, a model with deeper trees gives good performance, but in XGBoost, shallow trees perform better because of boosting. There are two boosting approaches, Adaptive Boosting and Gradient Boosting. Adaptive boosting puts more weight on misclassified data samples, while gradient boosting treats misclassified samples as gradients and uses gradient descent to iteratively optimize the loss. XGBoost employs gradient boosting. Using XGBoost will be highly effective for large datasets, as it is highly scalable and computationally efficient. We can use the power of the GPU, as XGBoost can perform out-of-core computations. The objective function of XGBoost is based on a training loss and a regularization function, as shown in Eqs. (5) and (6). The training loss helps in stage wise bagging of trees in the random forest to decrease the variance error. The regularization function helps to reduce the bias related errors using boosting.

O = Σ_{i=1}^{n} l(y_i, ŷ_i^t) + Σ_{i=1}^{t} Ω(f_i)    (5)

where t is the total number of trees, y_i is the actual value, ŷ_i^t is the prediction at time t, and n is the total number of training
samples.

Ω(f_i) = γT + (λ/2) Σ_{j=1}^{T} ω_j²    (6)

where γ is the minimum reduction in loss required for a new split on a leaf node, and λ is the l2 regularization term on the leaf weights, which helps avoid overfitting.
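As a hedged sketch, the classifier could be configured as below with the xgboost package, using the hyperparameters reported later in Section III (learning rate 0.01, 1500 trees, max tree depth 8); the synthetic data is a stand-in for the fused feature vectors.

```python
# A hedged sketch of the XGBoost classifier configuration; the toy data is a
# stand-in for the fused 913-D landmark + heart-rate feature vectors.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 913))      # fused landmark + heart-rate features
y = rng.integers(0, 2, size=1000)     # 0 = bonafide, 1 = deepfake (toy labels)

clf = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1500,
    max_depth=8,
    reg_lambda=1.0,   # l2 term on leaf weights (lambda in Eq. (6))
    gamma=0.0,        # min loss reduction for a split (gamma in Eq. (6))
)
clf.fit(X, y)
print(clf.predict_proba(X[:5])[:, 1])  # per-sample deepfake probability
```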
III. EXPERIMENTS AND RESULTS

A. Dataset

We evaluated our method on the world leaders dataset (WLDR) [8]. WLDR is the only dataset with all three types of deepfakes. WLDR comprises real and fake videos of ten U.S. politicians, and real videos of comedy impersonators of those political figures. The WLDR dataset has all three types of deepfakes, i.e., face-swap, puppet-master, and lip-sync. WLDR has lip-sync deepfakes for only one POI, i.e., Obama. The face-swap videos of WLDR are created by replacing the face of the impersonator with the face of the corresponding politician. The WLDR dataset has 1753 real and 93 fake videos. Other datasets like DFDC, FF++, and DFD have more fake videos than real videos. WLDR has more real videos (95%) than fake videos (5%), which is good: for better detection we have to learn the patterns in the real videos rather than the fake videos, as fake videos are constantly changing with the evolution of GANs. Still, it is not large enough to generalize a model to perform well on in-the-wild deepfakes. We used the area under the curve (AUC) as the evaluation metric. The reason for using AUC is that almost all the available datasets are highly class imbalanced, and AUC gives a fairer performance score for imbalanced classes than accuracy.

B. Performance Evaluation of the Proposed Framework

The objective of this experiment is to evaluate the performance of the proposed framework on a diverse dataset, WLDR, having all three types of deepfakes. For this purpose, we fed the proposed features of the selected landmarks and the heart rate features to train the XGBoost based random forest ensemble to perform the classification of bonafide videos and deepfakes. The heart rate features and the sub-categories of landmark features are on different scales, so we standardized the features before feeding them to the classifier. For standardization, we calculated the mean and standard deviation of the whole training set during data preparation. We scaled the train, test, and validation sets to make sure the mean of the rescaled data is zero and the standard deviation is one. We evaluated our model at the frame and segment levels. In WLDR, the frame rate of the videos is 30 frames per second. For segment level evaluation, we created 30-frame segments with an overlap of 10 frames. Our model is robust to both frame and segment level detection.

We evaluated our model on each of the six categories of facial landmark features and the heart rate features. The list of features effective for the detection task, in descending order, is 2D landmark, 3D landmark, eye landmark, headpose, heart rate, shape, and action unit features. Table I presents the results for the individual feature types. We conducted an evaluation on different combinations of features in descending order of their effectiveness. Table II presents the results for combinations of feature categories. We observed from Table II that the 2D and 3D landmark features are most effective, giving an AUC of 0.9311. We also observed that the eye landmark and headpose features are effective, increasing the AUC from 0.9311 to 0.9326 when combined with the 2D and 3D landmarks. Additionally, we observed that combining the heart rate features with the selected landmark features is very effective and increases the AUC from 0.9326 to 0.9505. Based on our observations, we did not include the shape features in the final features-set, due to the only slight improvement in AUC, from 0.9505 to 0.9510, when the shape features are also included in the fused features-set. Our final features-set includes the eye landmark, headpose, 2D & 3D landmark, and heart rate features. As per our hypothesis, combinations of features that are individually effective also perform better. Finally, we selected five out of the seven feature categories for our model. We evaluated our model over a wide range of parameters. More specifically, we set the learning rate to 0.01, the number of trees to 1500, and the max tree depth to 8.

C. Performance Comparison of the Proposed and Existing Methods

This experiment is designed to measure the performance of our framework against existing state-of-the-art deepfakes detection methods. For this, we compared the performance of the proposed framework against [8] and [17]. Table III presents the results of the comparison of the proposed framework against the existing models. Our model outperforms [8], which is based on action unit features and derived features capturing mouth movements, but our model's performance is lower than that of their extended work [17]. We also compared our model with a deep learning (DL) classifier, LSTM-FCN [22]. The technique of Agarwal et al. [8] works on the assumption of linear separability of bonafide and deepfake videos in a t-SNE plot based on selected features, but this technique failed to generalize to all types of deepfakes. In their extended work, Agarwal et al. [17] evaluated their method on 10-second video clips rather than frames and segments of small length. This model performs better and generalizes well on all existing face-swap datasets. However, in that work only face-swap deepfakes are considered, and lip-sync and puppet-master deepfakes are not addressed. Moreover, the performance of this method [17] is expected to drop if evaluated at the frame and segment level, due to its threshold based approach. We observed from the results (Table III) that a DL based model, LSTM-FCN, can achieve results comparable to those we achieved with the XGBoost based Random Forest ensembles. However, compared to LSTM-FCN, our proposed framework is lightweight and interpretable rather than being the black-box model of a DL classifier.
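A minimal sketch of the segment level evaluation with scikit-learn is shown below; averaging per-frame probabilities into a segment score is one plausible aggregation, assumed here for illustration, and the arrays are toy stand-ins rather than WLDR outputs.

```python
# A hedged sketch of computing segment-level AUC with scikit-learn.
# The toy arrays stand in for classifier outputs on WLDR segments.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
frame_probs = rng.uniform(size=(40, 30))    # 40 segments x 30 frames
segment_scores = frame_probs.mean(axis=1)   # aggregate frames into a segment score
segment_labels = rng.integers(0, 2, size=40)

print("segment-level AUC:", roc_auc_score(segment_labels, segment_scores))
```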
TABLE I. Segment and frame level AUC on individual features categories

Features Used    Eye landmark   Head pose   2D landmark   3D landmark   Shape    Action Unit   Heart Rate
AUC (segment)    0.8851         0.8023      0.8982        0.8978        0.7644   0.5027        0.7956
AUC (frame)      0.8659         0.7774      0.8903        0.8856        0.7357   0.5017        0.7866

TABLE II. Segment and frame level AUC on combinations of features categories

Features Used                                      AUC (segment)   AUC (frame)
2D lmk, 3D lmk                                     0.9297          0.9158
Eye lmk, 2D lmk, 3D lmk                            0.9311          0.9059
Eye lmk, Headpose, 2D lmk, 3D lmk                  0.9326          0.9068
Eye lmk, Headpose, 2D lmk, 3D lmk, HR              0.9505          0.9425
Eye lmk, Headpose, 2D lmk, 3D lmk, Shape, HR       0.9510          0.9285

TABLE III. Comparison of XGBoost with [8], [17], and LSTM-FCN [22] on WLDR (AUC)

Model Name                        WLDR    Evaluation Levels
Protecting World Leaders [8]      0.93    Frame and segment level
LSTM-FCN [22]                     0.95    Segment level
XGBoost (proposed)                0.95    Frame and segment level
Appearance and Behavior [17]      0.99    Video level

IV. CONCLUSION AND FUTURE WORK

This work has presented a unified method based on the fusion of our novel heart rate features and facial landmarks for detecting all three types of deepfakes. Unlike many existing methods, our method is lightweight, interpretable, and effective at the same time. Moreover, compared to existing lightweight techniques, our method is more robust and interpretable. We highlighted that an XGBoost based framework is lightweight compared to CNN based solutions and generalizes better than other conventional classifiers. For this purpose, we compared our proposed method with a time-series DL classification model, LSTM-FCN. However, the proposed framework follows a signature based approach and thus may not be very effective against deepfakes developed in the future. The proposed method also needs to be enhanced for optimized cross-corpus evaluation. For our future work, we will perform cross-dataset evaluation, experimenting on datasets that have multiple forgeries per sample.

V. ACKNOWLEDGMENTS

This work was supported by a grant of the Punjab HEC of Pakistan via Award No. (PHEC/ARA/PIRCA/20527/21).

REFERENCES

[1] Reface App. Available: https://reface.app/ (accessed 14.06.2021).
[2] FaceApp. Available: https://www.faceapp.com/ (accessed 14.06.2021).
[3] Tom Cruise TikTok deepfake video. Available: https://edition.cnn.com/videos/business/2021/03/02/tom-cruise-tiktok-deepfake-orig.cnn-business/video/playlists/business-social-media/ (accessed 14.06.2021).
[4] M. Masood, M. Nawaz, K. M. Malik, A. Javed, and A. Irtaza, "Deepfakes Generation and Detection: State-of-the-art, open challenges, countermeasures, and way forward," arXiv preprint arXiv:2103.00484, 2021.
[5] S. Agarwal, H. Farid, O. Fried, and M. Agrawala, "Detecting deep-fake videos from phoneme-viseme mismatches," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 660–661.
[6] F. Matern, C. Riess, and M. Stamminger, "Exploiting visual artifacts to expose deepfakes and face manipulations," in 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), 2019.
[7] Y. Li and S. Lyu, "Exposing deepfake videos by detecting face warping artifacts," arXiv preprint arXiv:1811.00656, 2018.
[8] S. Agarwal et al., "Protecting World Leaders Against Deep Fakes," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[9] X. Yang, Y. Li, and S. Lyu, "Exposing deep fakes using inconsistent head poses," in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 8261–8265.
[10] D. Güera and E. J. Delp, "Deepfake video detection using recurrent neural networks," in 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2018.
[11] D. Afchar et al., "MesoNet: a compact facial video forgery detection network," in 2018 IEEE International Workshop on Information Forensics and Security (WIFS), 2018.
[12] H. H. Nguyen, J. Yamagishi, and I. Echizen, "Capsule-forensics: Using capsule networks to detect forged images and videos," in ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
[13] E. Sabir et al., "Recurrent Convolutional Strategies for Face Manipulation Detection in Videos," Interfaces (GUI), vol. 3, p. 1, 2019.
[14] N. Yu, L. Davis, and M. Fritz, "Learning GAN fingerprints towards Image Attribution," arXiv preprint arXiv:1811.08180, 2019.
[15] Media Forensics Challenge 2018. Available: https://www.nist.gov/itl/iad/mig/media-forensics-challenge-2018 (accessed 14.06.2021).
[16] T. Baltrusaitis et al., "OpenFace 2.0: Facial behavior analysis toolkit," in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018.
[17] S. Agarwal, H. Farid, T. El-Gaaly, and S. N. Lim, "Detecting deep-fake videos from appearance and behavior," in 2020 IEEE International Workshop on Information Forensics and Security (WIFS), 2020, pp. 1–6.
[18] U. A. Ciftci, I. Demir, and L. Yin, "FakeCatcher: Detection of synthetic portrait videos using biological signals," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
[19] N. Bonettini, E. D. Cannas, S. Mandelli, L. Bondi, P. Bestagini, and S. Tubaro, "Video face manipulation detection through ensemble of CNNs," in 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 5012–5019.
[20] D. M. Montserrat, H. Hao, S. K. Yarlagadda, S. Baireddy, R. Shao, J. Horváth, et al., "Deepfakes detection with automatic face weighting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 668–669.
[21] T. Chen and C. Guestrin, "XGBoost: A scalable tree boosting system," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 785–794.
[22] F. Karim, S. Majumdar, H. Darabi, and S. Harford, "Multivariate LSTM-FCNs for time series classification," Neural Networks, vol. 116, pp. 237–245, 2019.
Searching for Aesthetical Values in an Upgraded Informal
Neighborhood in Tirana
Edmond Manahasa
Department of Architecture, Epoka University, Tirana, Albania
emanahasa@epoka.edu.al

Arila Rasha
Department of Architecture, Epoka University, Tirana, Albania
arasha14@epoka.edu.al
Abstract— This research aims to explore the possible aesthetical qualities of an informal settlement called Bathore, which evolved in the post-socialist period in Tirana, the capital city of Albania. The informal settlement emerged after the fall of the socialist system, due to migration from northern and eastern Albania driven by unemployment and economic reasons. As a consequence, considerable parts of the agricultural lands at the periphery of Tirana were usurped and transformed into informal neighborhoods, which lacked the needed infrastructure. Almost thirty years after this large socio-urban "tectonic crack," these zones are in a continuous process of urban upgrade and integration with the other parts of the city through legalization and infrastructure provision. The study benefits from the theoretical framework of aesthetics, using its main principles to reveal the aesthetical values of a selected urban fragment within Bathore. On this basis it provides an aesthetical analysis examined on two levels: urban and architectural. The urban aesthetical level of this informal context is analyzed by using pattern analysis, revealing the existence of a proportional relation between building and parcel components. The study found a good ratio between these components, which provides a comfortable built environment. Furthermore, at the urban level a quasi-modular building footprint is found to be used in a considerable amount within the selected fragment. At the architectural level, similar principles are explored by analyzing the buildings' exterior visual quality. The study found the existence of aesthetical values at the architectural level as well. Apart from the unfinished buildings, the aesthetical values at the architectural level are provided by the usage of similar/balanced/repeated volumes in non-street-side buildings, and in street-side landmark buildings by qualitative exterior materials and proportional volumes.

Keywords—aesthetical values, informal settlements, post-socialist period, Tirana

I. Introduction on Informal Settlements in Albania

The fall of the socialist regime in Albania is associated with deep consequences for the life of Albanians. Apart from the freedom of speech, the transition to the new liberal economic system caused the bankruptcy of most state companies. As a matter of fact, a large number of citizens suffered unemployment, which caused migration to neighboring countries like Italy and Greece. Parallel to this process, internal migration also occurred from the country's northern and eastern parts to central Albanian cities like Tirana and Durres. According to Dervishi [1], by the 90s Tirana had 240,000 citizens, whereas now it has quadrupled up to 1 million. The existing housing stock could not provide dwellings for this large number of newcomers, who could not afford to buy a house. This economic reality, merged with inefficiencies in urban development and political capital gain reasons, according to Fuga [2], made illegal settlements flourish, especially on the agricultural lands at the periphery of Tirana.

The housing construction process was based on several steps, which were related to the limited budget the settlers possessed. Once the head of the family or one of the males had secured the land plot through different intermediate means (like buying from a previous owner, obtaining it from the state company which managed it during the socialist period, or simply forcefully seizing it), the first step was to enclose it with a fence or a wall. In the second step, after construction of the foundations, sometimes only a room or a part of the house was constructed by hand power. At a further step, the first floor of the house was finished, and predominantly bared reinforcement irons were left at the housings' terraces, aiming to add further upper floors. This process has continued in this manner, and after approximately thirty years these informal dwellings have ended up as three- to four-floor houses. While some of the informal houses are featured by irregular volumes, in considerable cases, due to the qualitative exterior finishing, they appear like three-floor villas.

Because the informal householders built their homes without construction permission, from 2006 the state legalization institution, the Agency "ALUIZNI", was established. This process, apart from providing a legal status, opened the framework for further infrastructural investments which upgraded the living conditions in these zones. However, the process is still continuing, and the disputable ownership status of the site plots (in which the best case is when the plot belonged to the state and the most complicated one when it belonged to other people) has resulted in a very long process.

This research aims to reveal the possible aesthetical values of informal settlements by focusing on a peripheral neighborhood of Tirana called Bathore. The study selects and analyzes an urban fragment within this neighborhood. Referring to the main principles of aesthetics, it examines the aesthetical values by dividing them into urban and architectural levels. At the urban level, using pattern analysis, it analyzes the relation between the fragment's smaller components, exploring the existence of possible aesthetical principles of scale and proportion. Similarly, at the architectural level, it explores the existence of buildings which possess aesthetical values, by conducting site observation and descriptive analysis of their exterior.

A. Informal Neighbourhood of Bathore

During the socialist period, Bathore was a village under the administration of Kamza, which was an agricultural town. After the fall of the socialist period, as a result of the internal migration explained in the prior part, newcomers from Northern Albania settled there, transforming it
completely into a large informal zone. Most people obtained the land to develop their dwellings from other settlers who had seized it in the 90s, and a minor number had simply occupied public land [3]. The agricultural character of Bathore has disappeared, as it has been transformed into a huge neighborhood filled with informal buildings. Bathore's chaotic silhouette draws the eye of any person who decides to travel along the national road to the northern Albanian cities, because the national road passes directly through this area.

Figure I. Location of Bathore in the northeast part of Tirana (red dot).

After the establishment of the ALUIZNI agency, the process of legalization of the informal settlement provided another impetus in offering the needed infrastructure to this settlement. Although, due to political intentions, the legalization process has been overextended for almost 30 years, important infrastructural facilities like asphalted roads and solid waste management have been provided. Especially the role of one local NGO called CO-PLAN has been essential in producing planning strategies and bridging the gap between the local community and the administration through participatory design processes [4]. Apart from that, people have also continuously upgraded their houses, improving their dwelling comfort and exterior quality.

Figure II. The image of the first informal settlements in 1994 (left) and the same place in 2007 (middle, ©John Driscoll, IIUD), and the new infrastructure of Bathore in 2019 (right).

B. Theoretical Underpinning for Urban Aesthetics

The notion of aesthetics etymologically comes from Greek, meaning "to perceive" or related to senses or sensation. Starting from classical antiquity, it has been a subject of elaboration in different fields of life like art, music, poetry, or architecture. Plato looked for aesthetics in absolute beauty, which relied on proportion, harmony, and unity. Aristotle, among others, emphasized pleasure as an element of beauty, pointing out the importance of perception. Vitruvius provided his vision of buildings by proposing his three famous principles, firmitas, utilitas, and venustas, putting aesthetics last in order. Especially strong is the influence of the perception of Pythagoras in this period, which used the "mathematical theory of musical consonance" to explain the order of the universe. Other Renaissance masters like Alberti built their concept of aesthetics on what is "right", "appropriate", "proper" and "proportionable" [5]. The Enlightenment period architects put their focus on "order" and the "sublime". With the emergence of functionalism as the major creative force of modern architecture, the aesthetical features are indisputably criticized as highly reductionist. Considering the impact of Loos and Mies van der Rohe in the 20th century, apparently rather than decorative features, spatial qualities have taken the dominant role in architectural compositions.

Proportion and scale are in fact the most widely accepted fundamental principles of aesthetics. While the basis of proportion was laid in the works of Pythagoras, in later periods up to the present different proportional systems have been produced, like the Golden ratio, the Classical orders, Fibonacci numbers, the Vitruvian man, or Le Modulor, using nature and the human body as their primary sources. As for scale, it refers to how we perceive the size of something in relation to something else. The human scale tends to provide an evaluation of a building's size taking the human body size as the reference [6].

Based on this theoretical framework, we aim to explore the existence of principles of aesthetics which can be operational in an informal urban context. Apart from the two major principles of aesthetics, scale and proportion, we also aim to see the existence of other principles like symmetry, repetition, composition, balance, linearity, or rhythm. Since the informal settlements are not planned but, on the contrary, developed spontaneously, we will also explore the possible relation between certain social settings, like brotherhood, and aesthetical values.

II. Exploring Aesthetical Values in an Informal Settlement

Having explained the research context and certain basic concepts of aesthetics, we have decided to analyze the aesthetical features of a selected urban fragment within Bathore called neighborhood number 1. To do that, we propose to analyze the aesthetical features on two levels: urban and architectural. To comprehend the physical elements at the urban level, we conduct a city image analysis based on Lynch. Furthermore, to explore the possible aesthetical values at the urban level, we use a pattern analysis, revealing the relation between building and parcel and evaluating the existence of the proportion and scale principles. At the architectural level, we first classify the buildings, using their relation to the street, into two groups: street-side and non-street-side buildings. In addition, we tend to explore the existence of buildings with aesthetical values in this upgraded neighborhood and analyze the possible aesthetical elements and principles in their exterior. We use visual documentation and a descriptive form to reveal these features.

A. Urban Level Aesthetical Analysis

To search for the existence of aesthetical principles at the urban level, we conduct an urban analysis using Lynch's elements to comprehend the physical components of the selected fragment. Furthermore, we make a pattern analysis.
sizes. One of the buildings which is larger in size is the neighborhood market, which is three floors high (Figure IV, right-bottom image). Thus, most of the buildings are at human scale.

Similarly, the ratio of building footprint to parcel varies from 1/2 to more than 1/5, and the majority are 1/3 and over. These ratios between parcel and housing pattern, thanks to the existence of considerable green spaces, provide a good balance, resulting in a comfortable outdoor environment.

The parcel and house pattern of the selected urban fragment is featured by similarity and repetition. Although the logic of parcel division and house scale is directly related to brothers belonging to the same family, the relation between these two elements appears to be balanced. The repetition of similar house patterns in a balanced way creates the possibility of a quasi-harmonious composition.
dimensions; apart from business and residential functions, in some cases apartments are offered for rent, and in considerable cases the upper residential floors are left empty.
disbalance is apparent not only in the changing volumes, but also in the varying colors and the presence of other additions in the form of staircases, aiming to access the first floor for the same commercial activities. In this context, the idea of making the commercial spaces (shops, markets, or other services) more visually appealing has pushed owners to cover them with heavier materials like marble or stone cladding. This composition generates a visual contrast between the different functions, providing another disbalance. Apart from this, those buildings, apparently upgraded with richer exterior quality thanks to the income coming from commercial activities, play the role of landmarks.

Apart from this, interestingly, in this small urban fragment at least four twin houses are observed, which reflect a sense of symmetry. In fact, this approach is based on the relation of brotherhood, in which brothers buy a land plot together and develop identical houses. The similarity of the houses is reflected also in the exterior finishing, where in some cases the houses are left plastered and uncolored, or are whitewashed, or are even in the same colors.
revealed at least four cases of identical twin houses, reflecting a sense of symmetry, which developed based on the will of brothers to have identical houses. This finding demonstrates that although informal settlements are not planned, certain social settings like brotherhood, through this sense of having similar houses, can be a reason to develop aesthetical values in such a built environment.

References
[1] Manahasa, Edmond, Place attachment as a tool in examining place identity: A multilayered evaluation through housing in Tirana. PhD dissertation, Istanbul Technical University, 2017.
[2] Personal Communication with Social Scientist and Philosopher Artan Fuga, 2014.
[3] Pojani, Dorina, From Squatter Settlement to Suburb: The Transformation of Bathore, Albania. Hous. Stud. 2013, 28, 805–821.
[4] CoPlan, Enabling a better urban governance. CoPlan case in the areas influenced by illegal buildings, Conference: Strengthening Public Information and Participation for an Open Governance in Albania, Tirana, February 20–21, 2003.
[5] Scruton, Roger, The aesthetics of architecture. Princeton, NJ: Princeton University Press, 1979.
[6] Rasmussen, Steen Eiler, Eve M. Wendt, Experiencing architecture, 1962.
[7] http://community.dur.ac.uk/geopad/first-impressions-informal-settlement-bathore-albania/
Effect of Oxidation Reactor Structure on Operating
Parameters and System Performance in a Nitric Acid
Production Plant
Oguzhan Erbas
Department of Mechanical Engineering, Kutahya Dumlupinar University, Kutahya, Turkey
oguzhan.erbas@dpu.edu.tr

F. Menekse Ikbal
Department of Mechanical Engineering, Kutahya Dumlupinar University, Kutahya, Turkey
menekseikbal@gmail.com

Ahmet Akbulut
Department of R&D, Istanbul Gubre Sanayii A.S. (IGSAS), Kutahya, Turkey
ahmet.akbulut@igsas.com.tr
Abstract—The factors affecting the process efficiency in a plant producing dilute nitric acid were investigated. It was observed that the parameter that significantly affects the efficiency of the ammonia combustion process is the reactor unit, which is the heart of the system. It was determined that most of the malfunctions and stoppages in the existing plant were caused by problems in the old reactor. Therefore, the aging reactor in the facility was revised, and the reactor unit structure was changed. In this study, the efficiency and performance of the system with the old reactor and with the new reactor after the revision were analyzed.

Keywords—nitric acid production, ammonia oxidation reactor, Ostwald process, waste heat boiler, energy efficiency

I. Introduction

With the industrial revolution, there has been a significant increase in the world population. While the world population was 1 billion before the industrial revolution, it is now about 8 billion. The rise in energy demand with population growth, globalization, and growing income and welfare has brought energy efficiency to the fore. Energy efficiency measures how energy losses can be prevented without reducing production quality and process quantity in industrial enterprises.

The sector that comes to the forefront in energy efficiency studies is the industry-manufacturing field. 2020 was a challenging year for the world, which had to deal with the virus epidemic (Covid-19). The stagnation of economic activities caused by the Covid-19 epidemic has also profoundly affected Turkey. To overcome this crisis, it is necessary to achieve maximum output with minimum input, increase the profit margin without increasing sales, and reduce costs without reducing product quality. Energy is the highest cost for a business.

Through waste resulting from animal production, nitrogen fertilizer used in plant production, diesel fuel in tractors, thermal fuels in housing, greenhouses, and animal shelters, and electricity use, the agricultural sector also contributes to the formation of greenhouse gases in amounts that cannot be ignored. While assessments of energy use in agriculture often focus on direct energy use, 50 % and more of the total energy use should be considered to be related to nitrogen fertilizer production and other indirect energy uses.

Ammonium nitrate is the most commonly used nitrogen fertilizer. Ammonium nitrate is also used as an explosive. It is obtained by neutralizing nitric acid with ammonia. Depending on the operating conditions, the obtained ammonium nitrate solution has a 50-70 % concentration. After drying the resulting concentrated ammonium nitrate solution, a solid fertilizer is formed. Facility capacities can be selected in a very flexible range according to needs. The nitrogen value of ammonium nitrate fertilizer is 35 % N [1].

In fertilizer production facilities, diverse starting materials are used depending on the type of fertilizer produced, in some cases together and sometimes separately. The primary starting materials for chemical fertilizer production are ammonia, nitric acid, sulfuric acid, and phosphoric acid. Some of these substances are brought to the facility from outside, while others are produced in the same facility. In an exothermic reaction, gaseous ammonia and nitric acid combine to form ammonium nitrate and water. Nitric acid is preheated without reacting; especially when dilute acid is used, preheating is essential. For this process, steam and hot condensate from the advanced stages of the plant can be used [2,3].

The nitric acid absorber is the primary emission source, and continuous emission into the air occurs from its outlet. These emissions are NH3, nitric acid vapor, NOx, NO2, and NO. The waste gas flow rate and the pollutants in its content may differ according to the process used, for example, the pressure and temperature of the combustion medium, the type/structure of the catalyst, its age compared to its lifetime, the choice of burners, etc. Depending on many factors, varying proportions of N2O (nitrous oxide) may occur with the combustion gases. Water, typically 0.2 % in liquid ammonia, accumulates in the evaporator as the ammonia is evaporated. Some ammonia is also released during cleaning with intermittent blowdowns.

Although ammonia leaks are not common in nitric acid production, they can be a serious source of danger if they do occur. Leaks that may occur in pipelines, transfer equipment, corrosion punctures, etc., should be monitored by businesses. When preparing air/ammonia mixtures, attention must be paid to explosion risks. An additional threat posed by nitrous gas (N2O) is the possibility that ammonia, which can accumulate in refrigerated zones, may form salt precipitates of nitrite/nitrate composition. These carry the risk of explosion, so the risk is eliminated by periodically washing the places where they can occur [4].

For energy efficiency, measures should be created to increase energy efficiency, taking into account the dynamics of each enterprise. In this study, productivity-enhancing work was carried out in a nitric acid production facility that produces chemical fertilizers. In this nitric acid production facility, processes with high energy consumption were determined, and the aim was to reduce consumption values.

As a solution alternative to minimize consumption, energy efficiency was focused on. As a first step, the problems that cause malfunctions in the plant were discussed, and machine-induced stops were examined. It was seen that the old reactor generated 90% of the failures. The ammonia
oxidation reactor process was analyzed, as the main reason for the downtime at this nitric acid production facility was this old reactor. The reactor was renewed as a result of rehabilitation works at the facility. The operating parameters of the old and new reactors were evaluated, and their efficiency was analyzed [5,6].

II. The Nitric Acid Production Process and the Importance of the Ammonia Oxidation Reactor

Nitric acid (HNO3) is obtained by catalytic oxidation of ammonia (NH3). This process is called the "Ostwald process". The mixture is passed through a platinum-rhodium catalyst gauze. Under normal conditions, the reaction in equation (1) predominates, and only elemental nitrogen is obtained:

4NH3 + 3O2 → 2N2 + 6H2O (1)

For this reason, a catalyst is used to obtain nitric oxide. The only catalyst used industrially is the platinum-rhodium catalyst, which contains 5% to 10% rhodium in platinum and is used in the form of a fine wire mesh. Oxygen atoms are adsorbed on the platinum surface. The reaction takes place between the oxygen atoms on the surface and the ammonia molecules. As a result, the ammonia molecule turns into NO [7]. In the reactor, a reversible and exothermic gas-phase reaction occurs between NH3 and oxygen, and nitrogen oxides (NOx) are released (equation 2):

4NH3 + 5O2 → 4NO + 6H2O + Q (2)
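From reaction (2), one mole of NH3 ideally yields one mole of NO and hence, after the downstream oxidation and absorption, one mole of HNO3. The back-of-envelope yield bound below is our addition rather than a figure from the paper, following from the overall Ostwald stoichiometry NH3 + 2O2 → HNO3 + H2O:

```latex
% Ideal overall Ostwald stoichiometry: NH3 + 2 O2 -> HNO3 + H2O
\[
\frac{m_{\mathrm{HNO_3}}}{m_{\mathrm{NH_3}}}
  = \frac{M_{\mathrm{HNO_3}}}{M_{\mathrm{NH_3}}}
  = \frac{63\ \mathrm{g/mol}}{17\ \mathrm{g/mol}}
  \approx 3.7
\]
```

That is, at most about 3.7 kg of 100 % HNO3 can be produced per kg of NH3; side reactions such as (1) and N2O formation reduce the real figure.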
In the plant, 56 % dilute nitric acid is produced (expressed on a 100 % HNO3 basis). The raw materials used in the production of nitric acid are air, water, and ammonia. The nitric acid production steps are as follows:

1. Liquid ammonia is first gasified with water in the ammonia gasifier and then passes to the ammonia superheaters to be heated with hot air. The heated gaseous ammonia then enters the ammonia combustion reactor.

2. The air required for the combustion of ammonia is drawn from the atmosphere by the turbocharger and cleaned in the filter. Ammonia at a concentration of around 10% is mixed with air, passed through the filter, and enters the ammonia combustion reactor.

3. Nitric oxide gas (NO) is obtained by burning an 11 % ammonia and 89 % air mixture in the ammonia combustion plant on platinum-rhodium-palladium catalysts at approximately 870 °C under 2.5-3.5 bar (kg/cm2) pressure. Superheated steam is obtained from these hot gases in the waste heat boiler.

4. NOx gases coming out of the ammonia combustion reactor at approximately 250 °C are cooled by passing through four coolers. At this stage, the gas passes first to the rest-gas heat exchanger, then to the boiler feedwater exchanger, and then through the horizontal heat exchanger. It then reacts by mixing with the secondary air coming from the bleaching column and is cooled again in the vertical heat exchanger. Finally, it enters the oxidation tower from the bottom. The oxidation reaction of nitric oxide with additional oxygen proceeds faster at low temperatures. Since the reaction is highly exothermic, severe cooling is required to reach the desired oxidation equilibrium quickly. The nitric acid production process flow chart is given in Figure I.
Figure I. Simplified scheme of the Ostwald process for nitric acid production [8]
5. NOx gases entering the tower from the bottom and low-concentration nitric acid (HNO3) coming from the top react and complete the oxidation. The cooling process in the towers is realized by using cooling water with the help of the serpentines.

6. Nitrogen oxide gases coming out of the oxidation tower pass to the absorption tower. The water condensed during cooling absorbs nitrogen dioxide from the gas, and acid is obtained at a concentration of around 40% to 50%. This acid accounts for 30% to 40% of the total production. It is separated from the gas stream under the cooler and fed to the absorber at a suitable point. Meanwhile, the oxidation state reaches around 45%. Additional oxygen from the secondary air helps increase the oxidation state to about 92% to 96%. Nitric oxide gases coming from the bottom of the tower and the 15-16% acid coming from the top (in the absorption column) react and complete the oxidation. Thus, 56% dilute nitric acid is produced [9].

7. Unabsorbed gases leave the tower at a pressure of 2.5 bar and are heated in the rest-gas exchanger. These gases then enter the DeNOx system and are discharged into the atmosphere through the chimney. In the DeNOx system, the NOx gases are reacted with NH3 gas by the selective catalytic reduction method and reduced to N2 gas and H2O vapor, both already present in the air. As a result, the NOx in the flue gas is reduced below the limit value specified in the "Industrial Air Pollution Control Regulation," the flue is analyzed online 24 hours a day, and the data are recorded. In addition, the operating pressure of the system is "medium pressure"; the pressure is selected according to the amount of acid produced (2.5-3.5 bar (kg/cm2)) [10].
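The selective catalytic reduction mentioned in step 7 follows the standard ammonia-SCR chemistry; the reaction below is our addition for clarity, since the paper does not write it out:

```latex
% Standard ammonia-SCR reaction reducing NO in the tail gas:
\[
4\,\mathrm{NO} + 4\,\mathrm{NH_3} + \mathrm{O_2}
  \;\longrightarrow\; 4\,\mathrm{N_2} + 6\,\mathrm{H_2O}
\]
```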
The conversion efficiency is 97 % at an 870 °C catalyst mesh temperature. The most important feature of this process is that it produces its own energy. Once the reaction has started, the pressure and temperature rise should be carefully monitored, because the sub-platinum temperature is an important parameter here.

Since the plant operates at high temperatures, stainless steel is used for material strength. In the serpentine pipes, leaks may occur over time and the system may stop; besides the loss of time, the production capacity decreases. Therefore, material strength is essential.

III. Results and Recommendations

A. The main factors affecting process efficiency

When the process in the nitric acid production facility was examined, it was found that the main parameters affecting production efficiency were the NH3 combustion rate, the sub-platinum temperature, the NO outlet temperature, the amount of steam produced, the final amount of acid produced, the deactivation times, and the stoppages associated with the oxidation reactor.

The percentage of NH3 burned in the mixture is an important parameter. If this NH3 ratio increases, there will be an explosion in the system. The ammonia rate in the ammonia-air mix is kept constant by the automatic regulators (10.5 % NH3 + 89.5 % air). The facility has recently been automated to minimize human intervention in the process control mechanism. When the NH3 rate exceeds 13 %, auto control is activated and the system is taken offline, because 15 % is a critical level for NH3 (a minimal sketch of this interlock logic is given at the end of Section III.B).

Another critical parameter is the catalyst (90 % Pt + 10 % Rh). This catalyst is a perforated wire mesh with a diameter of 3800 mm and a wire thickness of 0.06 mm. It is changed every nine months, as its efficiency decreases over time. The conversion efficiency of the ammonia combustion reactor is also an important factor and refers to ammonia consumption; the conversion efficiency decreases with increasing pressure. The structure of the Pt-Rh catalyst is shown in Figure II.

The material used is "321 (1.4541)" grade stainless steel, which is acid-resistant. To avoid fatiguing the material in the system, the operating temperatures must not exceed the design values.

Another critical parameter is the NO outlet temperature. The high NOx gas temperature coming out of the old reactor before the revision affected the operating parameters negatively, because the gas temperature should have been 250 °C but was found to be 350 °C. In addition, in the old system, the NO gas from the reactor reached 450 °C by the time it entered the "rest-gas" heat exchanger. This situation
reduced the cooling efficiency of the nitric oxide gas entering the heat exchanger at high temperatures. The cooling water in the heat exchangers is also essential, because the NOx gas temperature affects cooling efficiency. The more cooling is done in the system, the higher the efficiency. Productivity values change according to summer and winter conditions; production efficiency in summer is almost 10-20 % lower than in winter (atmospheric air is drawn in as the working fluid in the compressor). In shell-and-tube heat exchangers, cracks may occur in the pipes due to high temperatures.

When gaseous ammonia mixes into the water leaking from the cracks, acid gas is formed in the water that goes to steam, and this damages the serpentines in the reactor; it erodes them. The water in the boiler feed exchanger comes to the waste heat boiler in the reactor. For this reason, leaks are not desired in the heat exchangers. When there is a leak, the system stops. Another critical parameter is the pumps. Pumps between the reactor and the high-pressure drum (steam drum) should not be selected with low capacity. If the pumps are insufficient, the circulation cannot be done thoroughly, and this can damage the serpentines.

B. Structure of the Ammonia Oxidation Reactor and Its Effect on System Performance

The most crucial part of the nitric acid production facility is the ammonia oxidation reactor. The reactor consists of two parts: the combustion part and the steam-generating waste heat boiler part. The catalyst and the ammonia burning rate are essential in the combustion part. In the waste heat boiler part, the serpentines are imperative. The NOx gas temperature at the reactor exit should not exceed 250 °C. However, the design was outdated in the pre-revision system under review, and the design efficiency values were very low; the temperature of the reactor exit NOx gas was 350 °C. The former reactor and its square-shaped serpentines are shown in Figure III.

Also, the heat transfer surface was almost 10 % more efficient than the other. When the reactor was renewed, there was relief in the heat exchangers. Material life was increased and, at the same time, time was saved as there were no downtimes as before.

In the old system before the revision, the cause of these critical downtimes was leaks and part replacements. The main problems in this previous system were waste heat boiler gas leakage, casing leakage, preheater (economizer) and evaporator coil leakage, evaporator wall coil leakage, wall pipe leakage, continuous replacement of casing coils, platinum replacement, and an inappropriate NO outlet temperature. In Figure IV, downtimes by year due to system failures before the revision are shown. While there are no stoppages due to malfunction in the new system, downtime was quite high in the old system.

Figure IV. Annual downtime of the facility before the overhaul

After the revision, the main innovations made in the reactor concern the reactor material structure and design dimensions, the Raschig ring basket design, the process gas cooler, the serpentine structure and surface area, the waterways, and the automation system. The new reactors after the revision are shown in Figure V.
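As referenced in Section III.A, the NH3-ratio protection (10.5 % setpoint, trip above 13 %, 15 % critical) can be illustrated with a minimal control-logic sketch. The thresholds come from the text; the function and signal names are purely illustrative and are not the plant's actual control code:

```python
# Illustrative sketch of the ammonia-ratio interlock described in Sec. III.A.
# Thresholds (10.5 % setpoint, 13 % trip, 15 % critical) are from the paper;
# everything else is hypothetical.
SETPOINT_PCT = 10.5   # NH3 fraction held by the automatic regulators
TRIP_PCT = 13.0       # auto control takes the system offline above this level
CRITICAL_PCT = 15.0   # explosion-risk threshold for the NH3/air mixture

def interlock_action(nh3_pct: float) -> str:
    """Return the protective action for a measured NH3 volume percentage."""
    if nh3_pct >= CRITICAL_PCT:
        return "EMERGENCY SHUTDOWN"           # never reached if the trip works
    if nh3_pct > TRIP_PCT:
        return "TRIP: auto control takes the system offline"
    if nh3_pct > SETPOINT_PCT:
        return "REGULATE: reduce NH3 flow toward 10.5 %"
    return "NORMAL"

if __name__ == "__main__":
    for reading in (10.4, 11.2, 13.5, 15.2):
        print(f"{reading:5.1f} % NH3 -> {interlock_action(reading)}")
```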
Table I. Changes in operating parameters before and after revision

Parameter | Before Revision | Post Revision
Sub-platinum temperature | 865-870 °C | 865-870 °C
NO gas outlet temperature | 350-400 °C | 250-290 °C
Superheated steam generation | 33 t/h | 36 t/h
Superheated steam temperature | 410 °C | 440 °C
Superheated vapor pressure | 40 bar | 41 bar
Feed water temperature (economizer inlet) | 280 °C | 150 °C
Final amount of acid produced | 610 ton/h | 610 ton/h
Deactivation times | 90 days | 0
Coolant inlet temperature | 35 °C | 25 °C
Coolant outlet temperature | 42 °C | 30 °C
Additional water flow | 350-400 m3/h | 50-100 m3/h
Reverse current pumps operated | 2 | 1
H2SO4 consumption | 1 tanker in 10 days | 1 tanker in 2.5-3 months
Ammonia flow rate to DeNOx | 180 kg/h | 95 kg/h
Flue gas temperature | 185 °C | 105 °C
Lamont differential pressure | No difference | No difference
Dome pressure | 40 bar | 38 bar
Total compressor airflow | 86,500 Nm3/h | 90,000 Nm3/h
Turbine steam inlet flow | No difference | No difference
Amount of steam produced | 20 t/h | 26-27 t/h
Ammonia valves | 70 % - 80 % | 50 % - 60 %
Dome level valve | 70 % - 80 % | 50 % - 60 %
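To make the scale of the improvements in Table I concrete, the relative changes of a few headline quantities can be computed; the values are copied from the table, and the percentage arithmetic is our illustration:

```python
# Relative changes for selected Table I parameters (values from the table).
before_after = {
    "Superheated steam generation (t/h)": (33, 36),
    "Amount of steam produced (t/h)": (20, 26.5),   # midpoint of 26-27
    "Ammonia flow to DeNOx (kg/h)": (180, 95),
    "Total compressor airflow (Nm3/h)": (86_500, 90_000),
}
for name, (old, new) in before_after.items():
    change = 100.0 * (new - old) / old
    print(f"{name}: {old} -> {new} ({change:+.1f} %)")
# Steam generation rises by about 9 %, matching the reported 3-4 t/h gain;
# DeNOx ammonia demand drops by roughly 47 %.
```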
IV. Conclusion

In the dilute nitric acid production facility examined, there was an increase of 3-4 tons per hour in the amount of steam produced with the renewal of the ammonia oxidation reactor. Since the new reactor was commissioned, there has been no reactor-related failure at the facility so far. In the past, it was even possible for the plant to go offline twice on the same day. Since there are no downtimes, the production capacity of the plant has increased. In the old reactor, the waste heat boiler part was square, there were problems at the corners of the serpentine turns, and there was more deterioration. In addition, there were serpentine leaks due to the temperature values. In the new reactor, the serpentines are arranged in a spiral. In the old reactor, the casing cooling pipes were insufficient; that is why there were gas leaks and body punctures. In the new reactor, these problems have disappeared. Since the number of operated reverse current pumps has also decreased, approximately 185 kWh has been saved. As a result of the new changes, there was no interruption in the production of the plant, and the system efficiency increased by 10%.

References
[1] Hanyu Ma, William F. Schneider, "Structure- and Temperature-Dependence of Pt-Catalyzed Ammonia Oxidation Rates and Selectivities", ACS Catalysis, Volume 9, 2019.
[2] Jiamin Jin, Ningling Sun, Wende Hu, Haiyang Yuan, Haifeng Wang, Peijun Hu, "Insight into Room-Temperature Catalytic Oxidation of Nitric oxide by Cr2O3: A DFT Study", ACS Catalysis, Volume 8, 2018.
[3] Anshumaan Bajpai, Kurt Frey, and William F. Schneider, "Binary Approach to Ternary Cluster Expansions: NO-O-Vacancy System on Pt (111)", The Journal of Physical Chemistry C, Volume 121, 2017.
[4] Chengxiong Wang, Dezhi Ren, Gavin Harle, Qinggao Qin, Lv Guo, Tingting Zheng, Xuemei Yin, Junchen Du, Yunkun Zhao, "Ammonia removal in selective catalytic oxidation: Influence of catalyst structure on the nitrogen selectivity", Journal of Hazardous Materials, Volume 416, 2021.
[5] Jiamin Jin, Jianfu Chen, Haifeng Wang, Peijun Hu, "Insight into room-temperature catalytic oxidation of NO by CrO2(110): A DFT study", Chinese Chemical Letters, Volume 30, 2019.
[6] Zhe Hong, Zhong Wang, Xuebing Li, "Catalytic oxidation of nitric oxide (NO) over different catalysts: an overview", Catalysis Science & Technology, issue 16, 2017.
[7] Ata ul Rauf Salman, Bjørn Christian Enger, Xavier Auvray, Rune Lødeng, Mohan Menon, David Waller, Magnus Rønning, "Catalytic oxidation of NO to NO2 for nitric acid production over a Pt/Al2O3 catalyst", Applied Catalysis A: General, Volume 564, 2018.
[8] Carlos A. Grande, Kari Anne Andreassen, Jasmina H. Cavka, David Waller, Odd-Arne Lorentsen, Halvor Øien, Hans-Jörg Zander, Stephen Poulston, Sonia García, and Deena Modeshia, "Process Intensification in Nitric Acid Plants by Catalytic Oxidation of Nitric Oxide", Ind. Eng. Chem. Res., Volume 57, 2018.
[9] Yafei Shen, Xinlei Ge, Mindong Chen, "Catalytic oxidation of nitric oxide (NO) with carbonaceous materials", issue 10, 2016.
[10] Ruosi Peng, Shujun Li, Xibo Sun, Quanming Ren, Limin Chen, Mingli Fu, Junliang Wu, Daiqi Ye, "Size effect of Pt nanoparticles on the catalytic oxidation of toluene over Pt/CeO2 catalysts", Volume 220, 2018.
Plant Disease Identification Through Deep Learning
Önsen Toygar
Department of Computer Engineering, Eastern Mediterranean University, Famagusta, North Cyprus, Mersin, Turkey
onsen.toygar@emu.edu.tr

Mehtap Köse Ulukök
Department of Computer Engineering, Bahçeşehir Cyprus University, Nicosia, North Cyprus, Mersin, Turkey
mehtap.kose@baucyprus.edu.tr

Emre Özbilge
Department of Computer Engineering, Cyprus International University, Nicosia, North Cyprus, Mersin, Turkey
eozbilge@ciu.edu.tr
Abstract—Plant leaves show various symptoms on their surfaces. Image processing and computer vision techniques are applied to leaf images to identify plant diseases. Healthy and diseased plant leaves are involved in several studies to identify or classify disease types using hand-crafted features and deep learning architectures. In this study, we applied a Transfer Learning technique with several deep learning architectures to identify plant diseases from their leaf images. The PlantVillage database is used in the experiments with all fourteen plant species and thirty-eight classes corresponding to healthy and infected leaf images. Experimental results are presented using several evaluation metrics, and a comparison is performed among several deep learning architectures based on Inception, VGG, ResNet, MobileNet, Xception and variants of these architectures. Results are demonstrated in terms of five evaluation metrics, namely accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate. The highest accuracy achieved is 99.81% by the ResNet architecture.

Keywords—plant disease classification, deep learning, transfer learning, computer vision, image processing

I. Introduction

Agricultural productivity is very important in countries whose economies are highly dependent on agriculture. It is stated that 30 to 40% of crops are lost each year through the production chain [1]. Losses from diseases also have an important economic impact, causing a drop in income for crop producers and higher prices for consumers and distributors. A lot of studies have been carried out under changed environmental conditions, in different locations, to estimate the losses that occur due to different diseases [1]. In this respect, it is vital to detect and identify plant diseases in their initial stage. There exist several computer vision and image processing techniques to identify plant diseases through their leaf images.

The most common plant diseases are spots (caused either by fungi or bacteria), mildew, and rust [2]. These three defect types affecting several plant leaves are shown in Figures I, II and III. Spot defects are demonstrated on tomato and grape leaves in Figure I. Rust defects are shown on corn and apple in Figure II. Mildew defects are presented on cherry and squash leaves in Figure III.

Every living organism on earth exhibits or reacts in a particular way when in a condition or situation that deviates from its normal state of being. For example, when human skin goes red or develops a rash, it could be due to an allergic reaction or an early indication of an underlying ailment. Plants are not excluded; in many instances the leaves serve as our gateway for diagnosing a lot of diseases in plants. For example, the Early Blight disease of tomato leads to the appearance of small dark spots that expand into circular plaques made up of concentric rings on the leaves [3]. This, in turn, results in premature defoliation of the leaves and heavy losses in yield. Figure I (a) shows a healthy tomato leaf, and a diseased leaf affected by Early Blight is shown in Figure I (b).

Figure I. Spot defects on Tomato and Grape Leaves ((a) Healthy tomato leaf; (b) Early Blight on tomato leaf; (c) Healthy grape leaf; (d) Black Rot on grape leaf)
The fungus Guignardia bidwellii is responsible for the Black Rot disease of grapes. It is usually common in regions of wet, warm and humid climate, as this provides a conducive situation for spore germination and infection. This disease spreads when the spores are carried by wind or splashed by rain onto the surfaces of developing plant tissue. This goes on for as long as the environmental conditions remain suitable. Black Rot can be identified when round, tan plaques with dark purple to brown edges are spotted on the leaves. Critical infections may result in leaf deformity and wilting of the leaves [4]. Figure I (c) shows a healthy grape leaf, and a diseased leaf affected by Black Rot is demonstrated in Figure I (d).

On the other hand, Puccinia sorghi Schwein is the pathogen (fungus) responsible for the popular disease of maize known as Common Rust. Early plaques mostly occur in clusters and are circular, but as the plaques ripen, the fungus protrudes through the foliage surface and the plaques elongate with time. The characteristic symptoms observed on maize leaves are brownish-red oblong pustules; plaques of Common Rust occur on both the upper and lower surfaces of the leaves and are spread sporadically along the leaves. Spores are transported by wind, with new infections ensuing weekly or bi-weekly. One plaque is capable of producing both brownish-red urediniospores and black teliospores, yet lastly only black teliospores will be seen within the plaque [5]. Figure II (a) shows a healthy corn leaf, and a diseased leaf affected by Common Rust is shown in Figure II (b).

Figure II. Rust defects on Corn and Apple Leaves ((a) Healthy corn leaf; (b) Common Rust on corn leaf; (c) Healthy apple leaf; (d) Cedar Apple Rust on apple leaf)

Another sample of rust defects can be seen in Cedar Apple Rust, which is caused by a member of the Pucciniaceae family, a class of fungi with several species that typically need two or more hosts to complete their life cycle. Members of this class are known as rusts, which are seen at some stage in their evolution, and mostly they are orange or reddish in color. The fungus spreads through the leaves and develops aecia beneath the leaves. The aecia produce aeciospores which are wind-blown back to the redcedars. They afterwards germinate and begin gall formation, producing telial horns to restart the process. A heavily infested apple tree can take on a yellowish cast from multiple plaques on the leaves. Figure II (c) shows a healthy apple leaf, and a Cedar Apple Rust infected leaf is presented in Figure II (d).

Moreover, Powdery Mildew is the third type of the most common defects seen on some plant leaves such as cotton, cucumber, grape, squash and cherry. Samples of these defects are demonstrated in Figure III on cherry and squash leaves. Figure III (a) shows a healthy cherry leaf, and a diseased cherry leaf affected by Powdery Mildew is shown in Figure III (b). Additionally, two samples of diseased squash leaves affected by Powdery Mildew are presented in Figure III (c) and (d).

Figure III. Mildew defects on Cherry and Squash Leaves ((a) Healthy cherry leaf; (b) Powdery Mildew on cherry leaf; (c)-(d) Two samples of diseased squash leaves by Powdery Mildew)

The rest of the paper is organized as follows. The literature review is discussed in Section II, and the methodology used in this study is explained in Section III. Section IV presents the experiments and results. Finally, Section V concludes the paper with the findings and the summary of the work done in this study.
II. Literature Review

Plant disease classification has been studied in the literature using two approaches, namely hand-crafted feature descriptors and deep learning approaches. An extensive survey of hand-crafted descriptors for plant disease classification is presented in Kaur et al. [6]. Hand-crafted feature extraction from leaf images requires acquisition, pre-processing, segmentation and feature extraction. Then, these features are used to train classification algorithms such as Support Vector Machines (SVM), Maximum Likelihood Classification (MLC), K-Nearest-Neighbours (KNN), Naive Bayes (NB), Decision Trees (DT), Random Forest (RF) and Artificial Neural Networks (ANN). In contrast, deep learning approaches perform feature extraction and classification automatically.

Deep learning based approaches for plant disease classification have recently been presented in several research studies [7]. One of the well-known deep learning techniques is the Convolutional Neural Network (CNN), and it attracts more and more attention from researchers. The plant disease classification performance of CNN models has been found to be superior compared to other classification techniques. However, CNN models have some problems with the usage of small datasets: this may yield a higher accuracy in classification performance which does not hold in practice. Another drawback is the high execution time that is required by CNN models.

The impact of transfer learning strategies of CNNs on pretrained models is worked out by Lee et al. [8]. The PlantVillage dataset with 38 different classes is studied by using three deep learning architectures, namely VGG16, InceptionV3, GoogLeNet, and a proposed model, GoogLeNetBN. The best accuracy of 99% is achieved with a pretrained CNN on that dataset. It is concluded that there is no significant performance difference between the pretrained and unpretrained models.

Around two decades of studies on plant disease detection and classification from plant leaf images using image processing techniques are summarized in [1, 9]. The importance of digital image quality and its difficulty in real-life applications is highlighted in that survey paper. Among the many classification techniques, the neural network technique gives a higher accuracy rate; for some plants, 100% accuracy is reported.

III. Methodology

Recently, deep learning based approaches have been applied to plant leaf images to classify plant diseases. In this study, we applied a Transfer Learning approach to classify plant diseases from leaf images of several plants available in the PlantVillage database. All classes of the PlantVillage dataset are used in the system; in total there are 38 different classes for 14 plant species. The class numbers and the corresponding disease names are shown in Table I. The details related to the training parameters used in the architecture are given in Table II.

The system architecture is shown in Figure IV using a block diagram of the steps followed for plant disease identification. The system receives input plant leaf images of size 256x256x3. The images are color images of plant leaves, and a CNN pretrained on the ImageNet dataset is employed. Afterwards, on top of the frozen weights, a fully connected Artificial Neural Network (ANN) is trained with backpropagation to obtain the outputs of the system as plant disease types, identifying the healthy or diseased plant leaves.

The system uses well-known deep networks that are pretrained on the ImageNet database using the Inception [10], ResNet [11], VGG [12], MobileNet [13] and Xception [14] architectures. The pretrained ImageNet network's connection weights are frozen and used as a feature extractor without retraining. After the new features have been obtained from the pretrained network, these features are presented to the custom ANN, whose connection weights are learned by using the backpropagation algorithm. A minimal sketch of this setup is given below Figure IV.

IV. Experiments and Results

Experiments are conducted to perform plant disease classification using healthy and diseased leaf images from the PlantVillage dataset. The details related to the experimental setup and the obtained results are presented in the following subsections using several deep learning approaches.

A. Experimental Setup

The PlantVillage database [3] is used in conducting the experiments. All plant species from the database are used in the experiments. In total, 14 different plant species, namely apple, blueberry, cherry, corn, grape, orange, peach, pepper, potato, raspberry, soybean, squash, strawberry and tomato, are used with their healthy and/or diseased leaf images.

Figure IV. Block diagram of the system architecture with Transfer Learning
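Following the block diagram, the sketch below wires up the pipeline with the Table II settings (frozen ImageNet backbone, a single 40-node ReLU head with batch normalisation and 0.5 dropout, Adam at learning rate 0.01, focal cross-entropy, 38 softmax outputs). It is our minimal Keras reconstruction under those stated parameters, not the authors' released code; the dataset pipeline and the particular backbone shown are assumptions:

```python
# Minimal Keras sketch of the transfer-learning setup in Figure IV / Table II.
# The backbone (ResNet50 here) is one of the 13 architectures compared;
# backbone-specific input preprocessing is omitted for brevity.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

backbone = keras.applications.ResNet50(
    include_top=False, weights="imagenet",
    input_shape=(256, 256, 3), pooling="avg")
backbone.trainable = False  # frozen: used purely as a feature extractor

model = keras.Sequential([
    keras.Input(shape=(256, 256, 3)),
    layers.RandomRotation(0.1),   # factor 0.1 of 360 deg = [-36, 36] degrees
    layers.RandomContrast(0.1),   # [-10 %, 10 %] contrast jitter
    backbone,
    layers.Dense(40),             # single hidden head layer, 40 nodes
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.5),
    layers.Dense(38, activation="softmax"),  # 38 PlantVillage classes
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    # CategoricalFocalCrossentropy requires Keras >= 2.13.
    loss=keras.losses.CategoricalFocalCrossentropy(),
    metrics=["accuracy"])

# Hypothetical usage: train_ds / test_ds are tf.data pipelines yielding
# batches of 16 (image, one-hot label) pairs.
# model.fit(train_ds, validation_data=test_ds, epochs=20)
```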
The database consists of 38 different classes that comprise healthy and different types of infected plant leaves of the aforementioned plant species. The list of classes in the PlantVillage database used in the experiments is depicted in Table I. On the other hand, the training parameters used in the experiments are available in Table II, including training and test data sizes, input image size, output classes and the deep learning architecture's specific parameters.

Table I. List of Classes in PlantVillage Database

1 Apple scab | 20 Pepper bell healthy
2 Apple black rot | 21 Potato early blight
3 Apple healthy | 22 Potato healthy
4 Blueberry healthy | 23 Potato late blight
5 Cedar apple rust | 24 Raspberry healthy
6 Cherry healthy | 25 Soybean healthy
7 Cherry powdery mildew | 26 Squash powdery mildew
8 Corn common rust | 27 Strawberry healthy
9 Corn gray leaf spot | 28 Strawberry leaf scorch
10 Corn healthy | 29 Tomato bacterial spot
11 Corn northern leaf blight | 30 Tomato early blight
12 Grape black rot | 31 Tomato healthy
13 Grape esca (Black Measles) | 32 Tomato late blight
14 Grape healthy | 33 Tomato leaf mold
15 Grape leaf blight | 34 Tomato mosaic virus
16 Orange haunglongbing | 35 Tomato septoria leaf spot
17 Peach bacterial spot | 36 Tomato target spot
18 Peach healthy | 37 Tomato two spotted spider mite
19 Pepper bell bacterial spot | 38 Tomato yellow leaf curl virus

Table II. Training parameters

Training data size: 27150
Test data size: 27155
Output classes: 38
Input image size: [256x256x3]
Hidden layers (head): 1
Hidden nodes (head): 40
Learning rate: 0.01
Dropout rate: 0.5
Batch normalisation: enabled
Activation: ReLU
Optimisation algorithm: Adam
Cost function: Focal Cross-Entropy Loss
Maximum epochs: 20
Batch size: 16
Random rotation (data augmentation): [-36°, 36°]
Random contrast (data augmentation): [-10%, 10%]

B. Experimental Results and Discussion

The experiments are performed using leaf images of several plant species, and the experimental results are presented to show the classification of 38 healthy or diseased classes from 14 plant species. Transfer Learning is applied using several network architectures, namely Xception, VGG16, VGG19, ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, InceptionV3, InceptionResNetV2, MobileNet, and MobileNetV2. The evaluation of each network architecture is demonstrated using five evaluation metrics. The accuracy (ACC), F1 score, Matthews correlation coefficient (MCC), true positive rate (TPR) and true negative rate (TNR) are computed for each class and presented in Table III. All accuracies are within the range [99.54% - 99.81%], while F1 scores are computed in the range [91.63% - 96.60%]. On the other hand, MCC values are within the range [91.62% - 96.55%], while TPR and TNR values are in the ranges [91.96% - 96.64%] and [99.74% - 99.90%], respectively.

Table III. Results for several deep learning architectures

Network Architecture | ACC | F1 | MCC | TPR | TNR
Xception | 0.9959 | 0.9252 | 0.9247 | 0.9273 | 0.9979
VGG16 | 0.9969 | 0.9452 | 0.9447 | 0.9463 | 0.9983
VGG19 | 0.9965 | 0.9393 | 0.9388 | 0.9410 | 0.9981
ResNet50 | 0.9981 | 0.9660 | 0.9655 | 0.9664 | 0.9990
ResNet101 | 0.9976 | 0.9582 | 0.9577 | 0.9586 | 0.9988
ResNet152 | 0.9978 | 0.9624 | 0.9618 | 0.9627 | 0.9990
ResNet50V2 | 0.9970 | 0.9458 | 0.9452 | 0.9470 | 0.9984
ResNet101V2 | 0.9970 | 0.9479 | 0.9474 | 0.9491 | 0.9984
ResNet152V2 | 0.9968 | 0.9426 | 0.9418 | 0.9428 | 0.9983
InceptionV3 | 0.9954 | 0.9163 | 0.9162 | 0.9196 | 0.9974
InceptionResNetV2 | 0.9959 | 0.9224 | 0.9221 | 0.9246 | 0.9978
MobileNet | 0.9980 | 0.9637 | 0.9632 | 0.9645 | 0.9989
MobileNetV2 | 0.9968 | 0.9429 | 0.9424 | 0.9443 | 0.9982

The performance evaluation of the 13 different deep learning models presented in Table III indicates that all network architectures are robust for identifying healthy or diseased plant leaf images, since all the results are high in terms of accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate. In general, all results are above 90%, which means that these network architectures are robust and applicable for plant disease classification.

The ranges of the performance values for all evaluation metrics show that the minimum performance among the 13 different network architectures is obtained by the InceptionV3 architecture. Although it has the lowest accuracy among the deep learning architectures, the performance of the InceptionV3 architecture is still 99.54% in terms of accuracy. The highest performance values for all the metrics are achieved by the ResNet50 architecture for plant disease identification. Specifically, the highest accuracy achieved by the ResNet50 architecture for classifying the 38 classes is 99.81%. Therefore, all network architectures employed in the experiments are successful in identifying plant diseases from leaf images.
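The five per-class metrics in Table III follow directly from the one-vs-rest confusion matrix. The snippet below is our illustration with scikit-learn, not the authors' evaluation script:

```python
# Sketch: per-class ACC, F1, MCC, TPR, TNR from predictions (one-vs-rest).
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, matthews_corrcoef

def per_class_metrics(y_true, y_pred, n_classes=38):
    results = {}
    for c in range(n_classes):
        t = (np.asarray(y_true) == c).astype(int)  # binarize: class c vs rest
        p = (np.asarray(y_pred) == c).astype(int)
        tn, fp, fn, tp = confusion_matrix(t, p, labels=[0, 1]).ravel()
        results[c] = {
            "ACC": (tp + tn) / (tp + tn + fp + fn),
            "F1": f1_score(t, p, zero_division=0),
            "MCC": matthews_corrcoef(t, p),
            "TPR": tp / (tp + fn) if (tp + fn) else 0.0,  # sensitivity
            "TNR": tn / (tn + fp) if (tn + fp) else 0.0,  # specificity
        }
    return results

# Averaged over the 38 classes, these quantities correspond to the
# row values reported in Table III.
```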
V. Conclusion

Identification of plant leaf diseases is studied in this paper. Healthy and diseased plant leaf images from the PlantVillage database are used, which constitute 14 plant species and 38 classes. In this study, in order to identify healthy or diseased classes, Transfer Learning is employed using several deep learning architectures, namely Xception, VGG16, VGG19, ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2, InceptionV3, InceptionResNetV2, MobileNet, and MobileNetV2. Five evaluation metrics are used, namely accuracy, F1 score, Matthews correlation coefficient, true positive rate and true negative rate, to compute the performance of each architecture. Experimental results show that all network architectures achieve more than 90% on every evaluation metric. The highest performance is achieved by the ResNet50 architecture, with 99.81% accuracy, 96.60% F1 score, 96.55% Matthews correlation coefficient, 96.64% true positive rate and 99.90% true negative rate. Therefore, ResNet50 is the best architecture among those used in this study, with the highest performance values for identifying plant diseases from leaf images.

References
[1] Gittaly Dhingra, Vinay Kumar, Hem Dutt Joshi, "Study of digital image processing techniques for leaf disease detection and classification", Multimedia Tools and Applications, Volume 77, pp. 19951–20000, 2018.
[2] Pujari JD, Yakkundimath R, Byadgi AS, "SVM and ANN based classification of plant diseases using feature reduction technique", Int J Interact Multimed Artif Intell, Volume 3, pp. 6–14, 2016.
[3] David P. Hughes and Marcel Salathe, "An open access repository of images on plant health to enable the development of mobile disease diagnostics", arXiv preprint arXiv:1511.08060, 2015.
[4] Angela Madeiras, "Grape IPM - Black Rot", (2019, September 27). Retrieved June 29, 2020, from https://ag.umass.edu/fruit/fact-sheets/grape-ipm-black-rot
[5] Tamra A. Jackson-Ziems, "Rust Disease of Corn in Nebraska", University of Nebraska-Lincoln Extension, Institute of Agriculture and Natural Resources. Revised January 2014. Retrieved June 29, 2020, from http://extensionpublications.unl.edu/assets/pdf/g1680.pdf
[6] Sukhvir Kaur, Shreelekha Pandey, Shivani Goel, "Plants Disease Identification and Classification Through Leaf Images: A Survey", Archives of Computational Methods in Engineering, 2018.
[7] M. Nagaraju, Priyanka Chawla, "Systematic review of deep learning techniques in plant disease detection", Int J Syst Assur Eng Manag, Volume 11, No 3, pp. 547–560, 2020.
[8] Sue Han Lee, Hervé Goëau, Pierre Bonnet, Alexis Joly, "New perspectives on plant disease characterization based on deep learning", Computers and Electronics in Agriculture, Volume 170, 105220, 2020.
[9] Lawrence C. Ngugi, Moataz Abelwahab, Mohammed Abo-Zahhad, "Recent advances in image processing techniques for automated leaf pest and disease recognition - A Review", Information Processing in Agriculture, Volume 8, pp. 27-51, 2021.
[10] Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alexander A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning", Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep residual learning for image recognition", IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27-30 June 2016.
[12] Karen Simonyan, Andrew Zisserman, "Very deep convolutional networks for large-scale image recognition", The 3rd International Conference on Learning Representations (ICLR 2015), 2015.
[13] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen, "MobileNetV2: Inverted residuals and linear bottlenecks", IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18-23 June 2018.
[14] François Chollet, "Xception: Deep Learning with Depthwise Separable Convolutions", IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21-26 July 2017.
Vaccines Perspective in the COVID-19 Era: Analysis of
Twitter Data
Abdulkadir Sahiner (1, 2)
1 Department of Computer Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
2 Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
asahiner@yildiz.edu.tr

Kaan Kemal Polat
Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
kemalp@yildiz.edu.tr

Hayati Ünsal Özer
Department of Mathematical Engineering, Yildiz Technical University, Istanbul, Turkey
huozer@yildiz.edu.tr
Abstract— Nowadays, preventive treatments are being developed for the COVID-19 epidemic, which affects all of social life. One of these treatments is vaccination. Since the subject of vaccination has been discussed from past to present, revealing the public approach in this period becomes all the more important. In this context, the aim of the study is to determine people's approaches to COVID-19 vaccines through their Tweets. The sentiment analysis method was used within the scope of the study; with this widely used method, the data obtained from Kaggle were analyzed. As a result of analyzing the open data set of Tweets related to the Pfizer & BioNTech vaccine with the LSTM model, the best accuracy value was reached with the "softmax" activation function, an epoch value of "10" and a batch size of "128", which appear as remarkable results.

Keywords—Vaccine, COVID-19, Twitter, sentiment analysis

I. Introduction
The COVID-19 epidemic, which affects almost every field, continues its impact today. Vaccine studies have accelerated to prevent COVID-19, which we can describe as a dangerous epidemic that causes the death of people. Today, the vaccines produced by some companies have been adopted by countries after the approvals and have started to be applied to their citizens.

Vaccine studies regarding COVID-19, which was declared a global epidemic by the World Health Organization (WHO) in March 2020, have been one of the important topics of discussion until today. It is stated that at least 70% of the population should be vaccinated, with continuity and protection of vaccines, as a precaution [1, 2]. It therefore becomes very important to understand the extent of public support and to direct society to vaccination accordingly. One of the areas where the public expresses its approach to vaccination is social media platforms [3-5].

Although there are studies stating that social media is a very effective tool for vaccination, it is also stated that it may contain negative feelings and false information that may affect individual opinions and lead to vaccine rejection [6]. Such a situation has been named by the World Health Organization (WHO) as one of the ten main factors threatening global health [6, 7, 8]. Due to the differences in the provision of vaccines by countries, people show different approaches to the application of alternative vaccines. Sentiment analyses are attempted through Tweets on Twitter, one of the social media platforms where people express their opinions about vaccines. Studies in this context are gradually increasing; some of them are as follows.

Bonnevie et al. [9] examined the Tweets of those who are against vaccines during the COVID-19 period, measuring the anti-vaccination opposition from the beginning of the COVID-19 epidemic in the United States to the present, in order to determine the change in vaccine-related messages and the effects of COVID-19 on general vaccine opposition. Within the scope of the study, Tweets obtained between 15 February 2020 and 14 June 2020 were used as data. Over the period covered by the examined data set, it was stated that views against vaccination increased by 80%.

Another study aimed to analyze the vaccine discussions on Twitter in the Netherlands. Within the scope of the study, a mixed model was used and a four-stage circular model was followed in which community detection, text mining, perception analysis and network analysis were performed. The study, carried out using Kearney's retweet and igraph packages, aimed to contribute to strategy studies on the vaccine by determining the relational networks of the opposition and the views shared on the subject [10].

Blankenship et al. [11] aimed to determine whether tweets with different emotions and content about vaccination attract different levels of participation (retweets) from Twitter users. The study, which used a regression model, concluded that engaging key opinion leaders on social media, so that their Tweets facilitate health education about vaccination and carry their views to a wider audience, will be the most important step; a positive approach to vaccination can thus be promoted through influential people.

In another study, it was aimed to analyze and determine the data on the international public debate about the pediatric pentavalent vaccine (DTP-HepB-Hib) program by analyzing Twitter messages. In the study, in which Twitter data between July 2006 and May 2015 were analyzed, it was seen that there was little interaction between the tweets, but links containing information about the vaccine were used quite frequently [12].

In the study conducted by Wen-Ying Sylvia Chou & Alexandra Budenz [13], the importance of a data-based communication strategy in controlling the anxiety experienced around vaccine hesitancy was emphasized, and in this context it was aimed to make suggestions regarding the analysis of comments on the vaccine with sentiment analysis on Twitter. Within the scope of the study, it was stated that it is possible to examine the link between the vaccine and the emotional approach and, accordingly, to take precautions with the communication strategy against the highly intense opposition to the vaccine.
Vaccination is one of the most important issues from the past to the present, and today, with widespread communication networks, anti-vaccination is becoming more and more widespread. In today's COVID-19 epidemic, where community immunity has a vital role, good management of the process and the sharing of information that will prevent negative opinions against vaccines are possible with data-based methods. In this context, our study aims to analyze people's posts about vaccines on Twitter during the COVID-19 period, with an open data set, using the sentiment analysis method.

II. Dataset Description
The dataset used in the study includes the latest tweets about the Pfizer & BioNTech vaccine, created by Gabriel Preda on the Kaggle platform [14]. The data is stated to be collected using the tweepy Python package to access the Twitter API.

The data set contains the "id, user_name, user_location, user_description, user_created, user_followers, user_friends, user_favourites, user_verified, date, text, hashtags, source, retweets, favorites, is_retweet" information about the Tweets.

A total of 8631 Tweets were analyzed in the dataset containing the latest tweets about the Pfizer & BioNTech vaccine. The examined Tweets were divided into 3 groups, "positive, negative, and neutral", using the Sentiment Intensity Analyzer.

Figure I. Data Sentiment Analysis (counts by sentiment: Positive 2729, Negative 1223, Neutral 4679)

Figure III. Top 50 Positive Words Used in Tweets
Figure IV. Top 50 Negative Words Used in Tweets

Among the Tweets in the data set, words such as "dose" and "thank" are the most used in tweets with a positive approach, while words such as "death", "arm" and "die" attract attention in tweets with a negative approach.

III. Material and Methods
A. Data Preprocessing
"Word Embedding" was used to process the data, and the texts were converted into word vectors. Using the Keras embedding layer, 32-dimensional word vectors were created.
In addition, various adjustments were made to the data: hashtags, internet links, special characters, single characters and double spaces were removed.

Emotion intensity information was then added to the cleaned tweets using the VADER [16] sentiment analysis tool.
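A minimal sketch of these cleaning and labelling steps, assuming the VADER implementation shipped with NLTK; the regular expressions and the plus/minus 0.05 compound-score thresholds are illustrative assumptions, not the authors' exact settings:

```python
import re
from nltk.sentiment.vader import SentimentIntensityAnalyzer  # VADER [16]
# requires a one-time nltk.download("vader_lexicon")

def clean(text: str) -> str:
    text = re.sub(r"http\S+|www\.\S+", " ", text)  # internet links
    text = re.sub(r"#\w+", " ", text)              # hashtags
    text = re.sub(r"[^A-Za-z ]+", " ", text)       # special characters
    text = re.sub(r"\b[A-Za-z]\b", " ", text)      # single characters
    return re.sub(r"\s+", " ", text).strip()       # collapse double spaces

sia = SentimentIntensityAnalyzer()

def sentiment(text: str) -> str:
    score = sia.polarity_scores(clean(text))["compound"]
    if score >= 0.05:
        return "positive"
    if score <= -0.05:
        return "negative"
    return "neutral"
```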
B. Long Short Term Memory (LSTM)
Long Short Term Memory, "LSTM", is a special type of RNN that can learn long-term dependencies. This model, introduced by Hochreiter & Schmidhuber [15], is widely used today due to its effective operation in complex problems.

Figure V. Architecture of LSTM Model

One of the tricky issues with natural language processing is that the meanings of words can change depending on their context. In sentiment analysis, we cannot ignore the occurrence of a word like "good", because if it appears inside a phrase like "not good" its meaning can change completely. This makes the task difficult, as it requires reading between the lines. LSTM networks are well suited for solving such problems, as they can remember all the words that lead up to the one in question.
The LSTM model used in the study can be summarized as estimated value [17].
follows:
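A minimal Keras sketch consistent with the parameters reported in the text (32-dimensional embedding, 196 LSTM cells, softmax output over the three sentiment classes). The vocabulary size and sequence length are assumptions, and categorical cross-entropy is substituted here for the binary loss listed in Table I, since there are three classes:

```python
import tensorflow as tf

VOCAB_SIZE, SEQ_LEN = 10_000, 50  # assumed tokenizer settings; inputs padded to SEQ_LEN

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 32),       # 32-dimensional word vectors
    tf.keras.layers.LSTM(196),                       # 196 cells in the LSTM layer
    tf.keras.layers.Dense(3, activation="softmax"),  # positive/negative/neutral
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10, batch_size=128)  # with an 80/20 split
```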
The experiments within the scope of the study were carried out using Central Processing Unit (CPU), Graphics Processing Unit (GPU) or Tensor Processing Unit (TPU) hardware and an online cloud service. The parameters used in the experimental process are summarized in Table I.

Table I. Parameter Settings

Parameter | Value
Activation function | ReLU, softmax
Batch size | 128
Loss function | Binary
Optimizer | Adam

The following criteria were used to evaluate the performance of the model proposed in the study:

Accuracy = (TN + TP) / (TN + TP + FN + FP)        (1)
Recall = TP / (TP + FN)        (2)
F1-Score = 2 · (Precision · Recall) / (Precision + Recall)        (3)

TP, FP, TN and FN in Equations (1)-(3) represent the numbers of True Positives, False Positives, True Negatives and False Negatives, respectively.

In addition, one of the evaluation criteria within the scope of the study is the Matthews correlation coefficient, introduced by Brian Matthews in 1975, which can be defined as a tool for model evaluation. It is important in terms of revealing the strength of the statistical relationship between the true values and the estimated values [17].
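For instance, the coefficient can be computed with scikit-learn's matthews_corrcoef [18]; the labels below are made up purely for illustration:

```python
from sklearn.metrics import matthews_corrcoef

y_true = [0, 1, 2, 2, 1, 0, 2, 1]  # illustrative ground-truth classes
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]  # illustrative predictions
print(matthews_corrcoef(y_true, y_pred))  # 1 = perfect, 0 = no better than chance
```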
IV. Results
This section describes the results obtained in this study. As mentioned earlier, a single model, LSTM, was used, and experiments were performed to investigate different parameters with the ADAM optimizer. Different parameter values were tested for the model and the optimum values were determined. 80% of the data set was used as training data and 20% as test data. In the created model, 196 cells were used in the LSTM layer.

In the first stage, the results obtained over the different epoch and activation function values applied in the LSTM model are as follows:
The best performance among the different activation function and epoch values applied in the model was obtained with the "softmax" activation function and an epoch value of 10. In addition, in this case the quality criterion of the multi-class model was quite good (Matthews correlation = 0.66).

In the second stage, the results obtained over the different epoch and batch size values applied in the LSTM model are as follows:

Table III. The Effect of Number of Epochs on Parameter Values*

Number of Epochs | Batch Size | Accuracy (%) | F1-Score | Recall | Matthews correlation
10 | 512 | 80.37 | 0.80 | 0.79 | 0.65
10 | 128 | 78.11 | 0.78 | 0.78 | 0.61
10 | 32 | 76.38 | 0.76 | 0.76 | 0.58
20 | 512 | 79.33 | 0.79 | 0.78 | 0.63
20 | 128 | 76.26 | 0.76 | 0.76 | 0.58
20 | 32 | 75.91 | 0.76 | 0.75 | 0.57

* Activation function = softmax

The best performance among the different batch size and epoch values applied in the model was obtained with a batch size of 512 and an epoch value of 10. In addition, in this case the quality criterion of the multi-class model was quite good (Matthews correlation = 0.65).

The graph of the change in accuracy and loss value by epoch for the LSTM model on the data set is given below.

Figure VII. Variation of Model Accuracy and Loss Value by Epoch Value

As a result of the analysis of the open data set of Tweets related to the Pfizer & BioNTech vaccine with the LSTM model, the best accuracy value was reached with the "softmax" activation function, an epoch value of 10 and a batch size of 128; the Matthews correlation value at these settings was 0.66.

V. Conclusion
The main purpose of this study is to analyze the effect of tweets about the Pfizer & BioNTech vaccines on anti-vaccine sentiment and to examine how people's views on vaccines have changed during the COVID-19 period. With the "ADAM" optimizer, an epoch value of 10, the "softmax" activation function and a batch size of 128 in the "LSTM" model proposed within the scope of the research, the best accuracy on the data set, 80.83%, was obtained.

Although the number of Tweets tagged as positive is higher than the number of negative Tweets, the fact that the number of neutral Tweets is higher than both of these groups shows similar results to recent studies in which anti-vaccination is increasing [19].

Within the scope of the study on how the subject of anti-vaccination, an important issue in the literature, differs in the current COVID-19 pandemic, Twitter data was examined within the framework of sentiment analysis. In this process, it is very important for countries to develop strategies according to the approach to vaccines during the pandemic period, by examining how user Tweets related to COVID-19 vaccines have changed over time.

Examining only the tweets about the vaccine developed by Pfizer & BioNTech can be expressed as a limitation of the study. In this context, future studies can examine and compare Tweets related to different COVID-19 vaccines and, as a result, how anti-vaccination differs between these vaccines.

References
[1] Orenstein, W. A., & Ahmed, R. (2017). Simply put: Vaccination saves lives.
[2] Aguas, R., Corder, R. M., King, J. G., Goncalves, G., Ferreira, M. U., & Gomes, M. G. M. (2020). Herd immunity thresholds for SARS-CoV-2 estimated from unfolding epidemics. medRxiv.
[3] Velasco, E., Agheneza, T., Denecke, K., Kirchner, G., & Eckmanns, T. (2014). Social media and internet-based data in global systems for public health surveillance: a systematic review. The Milbank Quarterly, 92(1), 7-33.
[4] Yousefinaghani, S., Dara, R., Poljak, Z., Bernardo, T. M., & Sharif, S. (2019). The assessment of Twitter's potential for outbreak detection: avian influenza case study. Scientific Reports, 9(1), 1-17.
[5] Guess, A. M., Nyhan, B., O'Keeffe, Z., & Reifler, J. (2020). The sources and correlates of exposure to vaccine-related (mis)information online. Vaccine, 38(49), 7799-7805.
[6] Piedrahita-Valdés, H., Piedrahita-Castillo, D., Bermejo-Higuera, J., Guillem-Saiz, P., Bermejo-Higuera, J. R., Guillem-Saiz, J., ... & Machío-Regidor, F. (2021). Vaccine Hesitancy on Social Media: Sentiment Analysis from June 2011 to April 2019. Vaccines, 9(1), 28.
[7] Puri, N., Coomes, E. A., Haghbayan, H., & Gunaratne, K. (2020). Social media and vaccine hesitancy: new updates for the era of COVID-19 and globalized infectious diseases. Human Vaccines & Immunotherapeutics, 1-8.
[8] Kunneman, F., Lambooij, M., Wong, A., Van Den Bosch, A., & Mollema, L. (2020). Monitoring stance towards vaccination in twitter messages. BMC Medical Informatics and Decision Making, 20(1), 1-14.
[9] Bonnevie, E., Gallegos-Jeffrey, A., Goldbarg, J., Byrd, B., & Smyser, J. (2020). Quantifying the rise of vaccine opposition on Twitter during the COVID-19 pandemic. Journal of Communication in Healthcare, 1-8.
[10] Lutkenhaus, R. O., Jansz, J., & Bouman, M. P. (2019). Mapping the Dutch vaccination debate on Twitter: identifying communities, narratives, and interactions. Vaccine: X, 1, 100019.
[11] Blankenship, E. B., Goff, M. E., Yin, J., Tse, Z. T. H., Fu, K. W., Liang, H., ... & Fung, I. C. H. (2018). Sentiment, contents, and retweets: a study of two vaccine-related twitter datasets. The Permanente Journal, 22.
[12] Becker, B. F., Larson, H. J., Bonhoeffer, J., Van Mulligen, E. M., Kors, J. A., & Sturkenboom, M. C. (2016). Evaluation of a multinational, multilingual vaccine debate on Twitter. Vaccine, 34(50), 6166-6171.
[13] Chou, W. Y. S., & Budenz, A. (2020). Considering Emotion in COVID-19 vaccine communication: addressing vaccine hesitancy and fostering vaccine confidence. Health Communication, 35(14), 1718-1722.
[14] Preda, G. Pfizer Vaccine Tweets. Kaggle Repository: https://www.kaggle.com/gpreda/pfizer-vaccine-tweets, 2021.
[15] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[16] Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
[17] Kaden, M., Hermann, W., & Villmann, T. (2014). Optimization of General Statistical Accuracy Measures for Classification Based on Learning Vector Quantization. In ESANN.
[18] Scikit-Learn, sklearn.metrics.matthews_corrcoef: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.matthews_corrcoef.html, 2021.
[19] Yousefinaghani, S., Dara, R., Mubareka, S., Papadopoulos, A., & Sharif, S. (2021). An Analysis of COVID-19 Vaccine Sentiments and Opinions on Twitter. International Journal of Infectious Diseases.
Deep Learning-based Healthcare Data Analysis System
Abstract— Healthcare data play an essential role in delivering the proper treatment to the patient at the right time. The amount of biomedical data has tremendously increased, which has also increased the challenges for its analysis. This paper combines supervised and unsupervised learning techniques to obtain insights into healthcare data for its classification. Specifically, we used a deep neural network to perform anomaly analysis to determine the nature of health problems. A deep convolutional neural network (DCNN) and a Deep Autoencoder Convolutional Neural Network (DACNN) were employed to classify the patterns of extracted electrocardiographs (ECG). We have combined the CNN model with an autoencoder to classify heartbeat data. The experimental results showed that high accuracy and lower loss are obtained using our approach.

Keywords—Deep Learning, supervised, unsupervised, ECG classification, autoencoder, Convolutional Neural Network

I. Introduction
Currently, health has become the most important concern for people around the world. Most countries have tried to counter or prevent the pandemic from affecting the major part of the population by restricting contacts in the community and giving the vaccination they prefer to use. The healthcare system has the primary role in handling spreading illness and caring for people. Healthcare data are being analyzed by many scientists to predict which viruses are spreading in the community. In this paper we propose to perform classification in a healthcare data system. We focus on heartbeat data classification using deep learning with supervised and unsupervised learning.

Heartbeat data can be produced using an electrocardiograph (ECG), which records the heart activity over a certain time, so we basically see a wavelet graph as the result. Most of the time, scientists need to preprocess the raw ECG data. In this paper we skip preprocessing raw ECG files and instead take prepared data in CSV format and analyze it using deep learning algorithms such as the Convolutional Neural Network (CNN).

Heartbeat classification in healthcare systems has been analyzed by many studies, which have contributed to finding automated systems to identify normal beats, supraventricular ectopic beats, ventricular ectopic beats, fusion beats and other abnormal beats [1].

This paper is organized as follows. Section I gives a short explanation of the main argument of this paper. Section II discusses some related works by contributors who have performed experiments. Section III states the problem, and Section IV explains the methodology that we used. Section V discusses some experiments and results. Section VI discusses the results and conclusion.

II. Related Works
Class imbalance learning can produce inferior results [2]. Class imbalance can be caused by data samples which come from different distributions. Moreover, most standard classifiers are designed to optimize a loss function based on how accurate the prediction is; however, predictive accuracy yields low performance metrics under class imbalance learning [3]. As we will see in the experiments, the heartbeat classes are not balanced in distribution. One way to deal with imbalanced learning is a preprocessing step in which the training data is modified to produce a more balanced class distribution.

The authors of [9] described a correlation-based sequential forward selection for feature selection. Highly correlated features are removed because they lead to inconsistent models. The sequential forward selection algorithm, a feature-selection method, is an iterative procedure in which subsets are included to obtain the final subset of features that yields the correct accuracy.

The authors of [3] described an autoencoder driving a deep learning architecture that can learn the hidden representations of data even when the data is perturbed by missing values (noise).

Brain tumor classification in healthcare systems was studied in [6] in order to assist radiologists in a better diagnosis analysis. The classification was investigated using convolutional neural network models, performing extensive experiments with transfer learning with and without augmentation.

III. Problem Statement
Many attempts have been made to find the best way to achieve higher accuracy in classifying highly dimensional and complex ECG data. This research paper opens the gate to finding an autoencoder system for classifying heartbeat rate categories. Meanwhile, we keep the convolutional neural network as the main part of training for supervised and unsupervised learning.

IV. Methodology
In this paper we propose a multi-layer or deep neural network (DNN) in order to find the best classification methods and the best solution for unsupervised or supervised learning neural networks. A DNN can be used for feature extraction and classification of raw ECG data extracted from patients into a number of categories.

Our data matrices have high dimensions, and the imbalanced classes make the training process more challenging. By resampling the data to balance the classes, the class distribution becomes even, which is good for training or testing convolutional neural networks. Figure 1 is the plot of the heartbeat classification for our four categories, which come from ECG signals; this sample shows the wavelet visualization.
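A minimal sketch of this balancing step, assuming the prepared CSV keeps the class label in its last column (as in the widely used heartbeat CSV exports); the file name is illustrative:

```python
import pandas as pd
from sklearn.utils import resample

# Load the prepared heartbeat data; the file name is a placeholder
df = pd.read_csv("heartbeat_train.csv", header=None)
label_col = df.columns[-1]

# Upsample every class to the size of the largest class
target = df[label_col].value_counts().max()
balanced = pd.concat([
    resample(group, replace=True, n_samples=target, random_state=0)
    for _, group in df.groupby(label_col)
])
print(balanced[label_col].value_counts())  # even class distribution
```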
E. Auto-Encoder
Classification using an autoencoder with highly dimensional heartbeat data causes a lot of complexity in finding the best-fit convolutional neural network. So we also apply similar preprocessing to the heartbeat data, balancing the classes and fixing errors while designing the models.
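A minimal sketch of such an autoencoder combined with a CNN classifier, assuming 187-sample beats and four categories; the layer sizes are illustrative, not the exact model used here:

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(187, 1))               # one heartbeat segment

# Encoder: convolutional features shared by both objectives
x = tf.keras.layers.Conv1D(16, 5, padding="same", activation="relu")(inp)
x = tf.keras.layers.MaxPooling1D(2, padding="same")(x)          # 187 -> 94
encoded = tf.keras.layers.Conv1D(8, 5, padding="same", activation="relu")(x)

# Decoder branch reconstructs the input (unsupervised objective)
y = tf.keras.layers.UpSampling1D(2)(encoded)                    # 94 -> 188
y = tf.keras.layers.Cropping1D((0, 1))(y)                       # 188 -> 187
decoded = tf.keras.layers.Conv1D(1, 5, padding="same", name="recon")(y)

# Classifier branch predicts the heartbeat category (supervised objective)
z = tf.keras.layers.GlobalAveragePooling1D()(encoded)
labels = tf.keras.layers.Dense(4, activation="softmax", name="cls")(z)

model = tf.keras.Model(inp, [decoded, labels])
model.compile(optimizer="adam",
              loss={"recon": "mse", "cls": "sparse_categorical_crossentropy"})
```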
Abstract— This research aims to compare the usage of X plate diagonal chevron dampers (Wen plasticity dampers) on multi-story building frames in various locations of the structure, accompanied by the use of treated crumbed rubber concrete in different percentages for the structure's frames. By employing these two systems together as a hybrid damping system, it will be seen that changing those systems' damping properties affects the results of pushover analysis for the structure, such as the transmitted base shear forces, the roof displacement of the performance curve, pseudo acceleration, pseudo displacement, effective period, ductility ratio and, most importantly, the effective damping ratio, which plays a major role in reducing the demand curve of the design response spectrum. 3D models are built and analyzed in ETABS, and the analysis results are compared across multiple study cases, seeking the optimum usage of such damping systems according to the indications mentioned above. The comparison in this research is focused only on the scale factor of the design response spectrum and the damping ratio resulting from the pushover analysis done for the study cases defined here.

Keywords— Damping Systems, Nonlinear Static Analysis (Pushover), Response Spectrum, Earthquakes, Crumbed Rubber Concrete, Chevron Braces

Immediate Occupancy (IO) are the descriptors of damage states, which are performance objectives only when they relate to a selected seismic hazard level. The hazard may be an earthquake or the probability of a ground shaking intensity (10% chance of being exceeded in 50 years). Using the new analysis techniques as a technical tool, it is possible to analyze buildings for multiple performance objectives. Relatively new analysis procedures help to describe the inelastic behavior of the structural components of the building, which allows the particular behavior of the building during a selected ground motion to be estimated. The analysis procedure predicts which part of the building will fail first; as the load and displacement increase, other elements begin to yield and deform inelastically.

The resulting graphical "curve" is a representation of the capacity of the building. Several alternative techniques allow the demand from a selected earthquake or ground shaking intensity to be correlated with the capacity curve to obtain a point on the curve where capacity and demand are equal, which is called the performance point (PP) of the structure and is an estimate of the particular displacement of the building for the desired ground motion.
The effective viscous damping, β_eff, is defined by:

β_eff = κ·β₀ + 5 = 63.6·κ·(a_y·d_pi − d_y·a_pi)/(a_pi·d_pi) + 5        (1)

Where:
β₀ = hysteretic damping represented as equivalent viscous damping;
κ = the k-factor, a measure of the actual hysteresis of the structure;
0.05 = the 5% viscous damping inherent in the structure (assumed to be constant).

The term β₀ can be calculated as:

β₀ = (1/(4π))·(E_D/E_S0)        (2)

Where E_D is the energy dissipated by damping and E_S0 is the maximum strain energy.
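A small worked illustration of Equation (1), using the 63.6 coefficient as it appears above; the spectral coordinates below are invented for the example, with (a_y, d_y) the yield point and (a_pi, d_pi) a trial performance point:

```python
# Effective viscous damping (in percent) at a trial performance point, per Eq. (1)
def effective_damping(a_y, d_y, a_pi, d_pi, kappa=1.0):
    beta_0 = 63.6 * (a_y * d_pi - d_y * a_pi) / (a_pi * d_pi)  # hysteretic term
    return kappa * beta_0 + 5.0  # plus the inherent 5% viscous damping

# Example: yield at (0.25 g, 50 mm), performance point at (0.35 g, 120 mm)
print(effective_damping(0.25, 50.0, 0.35, 120.0))  # ~23.9 percent
```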
energy added to the structure during a transient is absorbed by additional damping elements rather than the structure itself. An idealized additional damper would be such that the force generated by the damper is large enough and occurs at such a time that the damper forces do not increase the overall stress within the structure. Properly implemented, a perfect damper should be able to simultaneously reduce both stress and deflection within the structure [3].

A. X-shaped metallic dampers

Figure IV. Samples of X Plate Dampers with Wen Hysteretic Loops

X-plate dampers consist of one or more X-shaped steel plates, each plate having a double curvature and arranged in parallel; the number of plates depends on the amount of energy required to be dissipated in the given system. The material used to make the X-plate can be any metal that allows a large amount of deformation, such as mild steel, although sometimes lead or more exotic metal alloys are used. To reduce the response of the structure by dissipating the applied seismic energy, such a damper may be used with a suitable support system, where a combination of braces and XPDs may be used in the building structure; such an assembly is known as a device brace. When such a system is subjected to lateral forces such as earthquakes, high winds, etc., the seismic energy introduced is dissipated by their flexural deformation. They can withstand many cycles of stable yielding deformation, resulting in a high degree of energy dissipation or damping.

The goal behind using X-shaped dampers is to have a constant strain variation over their height, to ensure that yielding occurs simultaneously and uniformly over the entire height of the damper. XPDs can also behave nonlinearly, but limit the behavior of the structure to the linear-elastic region. In a series of experimental tests, the behavior of XPDs was studied and the following results were observed: it exhibits smooth nonlinear hysteretic loops under plastic deformation; it can withstand a large number of yield reversals; there is no significant stiffness or strength degradation; and it can be accurately modeled by Wen's hysteretic model or as a bilinear elasto-plastic material [4].

For X Plate Dampers (XPDs), independent uniaxial plasticity properties may be specified for each deformational degree of freedom. The plasticity model is based on the hysteretic behavior proposed by Wen (1976). All internal deformations are independent: yielding at one degree of freedom does not affect the behavior of the other deformations. If nonlinear properties are not specified for a degree of freedom, that degree of freedom is linear, using the effective stiffness, which may be zero.

The nonlinear force-deformation relationship is given by:

f = ratio·k·d + (1 − ratio)·yield·z        (11)

where k is the elastic spring constant, yield is the yield force, ratio is the specified ratio of post-yield stiffness to elastic stiffness (k), and z is an internal hysteretic variable. This variable has a range of |z| ≤ 1, with the yield surface represented by |z| = 1. The initial value of z is zero, and it evolves according to the differential equation:

ż = (k/yield)·ḋ·(1 − |z|^exp)   if ḋ·z > 0
ż = (k/yield)·ḋ                 otherwise        (12)

where exp is an exponent greater than or equal to unity. Larger values of this exponent increase the sharpness of yielding; the practical limit for exp is about 20. The equation for ż is equivalent to Wen's model with A = 1 and α = β = 0.5.

Figure V. Nonlinear Force-Deformation relationship of Wen Plasticity
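A small numerical sketch of Equations (11)-(12), integrating the hysteretic variable with explicit Euler steps; the stiffness, yield force, post-yield ratio and exponent values are invented for illustration:

```python
import numpy as np

def wen_force(displacements, dt, k=1000.0, yield_force=50.0, ratio=0.05, exp=2.0):
    """Return the Wen link force history for a given displacement history."""
    z, d_prev, forces = 0.0, 0.0, []
    for d in displacements:
        d_dot = (d - d_prev) / dt
        # Eq. (12): z evolves more slowly as it approaches the yield surface |z| = 1
        if d_dot * z > 0:
            z_dot = (k / yield_force) * d_dot * (1.0 - abs(z) ** exp)
        else:
            z_dot = (k / yield_force) * d_dot
        z = min(max(z + z_dot * dt, -1.0), 1.0)
        # Eq. (11): post-yield elastic share plus hysteretic share of the force
        forces.append(ratio * k * d + (1.0 - ratio) * yield_force * z)
        d_prev = d
    return np.array(forces)

# One slow displacement cycle traces a hysteresis loop like that of Figure V
t = np.linspace(0.0, 1.0, 1001)
d = 0.2 * np.sin(2.0 * np.pi * t)
f = wen_force(d, dt=t[1] - t[0])
```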
support system, where a combination of braces and XPDs may
be used in the building structure and such an assembly is III. Rubber Concrete
known as a device brace. When such a system is subjected to Crumb Rubber can absorb sudden shocks by controlling
lateral forces such as earthquakes, high winds, etc., the seismic
the motion of waves transmitted by moving loads. Rubber
energy introduced is dissipated by their flexural deformation.
compresses and deforms easily, but the rate of deformation
They can withstand many cycles of stable yielding
decreases as the load increases, making it a good shock
deformation, resulting in a high degree of energy dissipation
absorber. Rubber has very low hysteresis and controls energy
or damping.
dissipation in a system subjected to vibration-induced forces.
The goal behind using X-shaped dampers is to have a Rubber has an inherent damping ability and this property can
constant strain variation over their height to ensure that be exploited to improve the damping and vibration
yielding occurs simultaneously and uniformly over the entire characteristics of concrete. In addition, rubber is lightweight
height of the damper. XPDs can also behave nonlinearly, but and non-corrosive, making it easy to use and apply. It is
limit the behavior of the structure to the linear-elastic region. relatively inexpensive compared to other conventional
In a series of experimental tests, the behavior of XPDs was dampers such as steel springs and tuned mass dampers.
studied and the following results were observed: It exhibits
Material scientists have attempted to form concrete with a
ductile material. However, it appears that due to the brittle
nature of concrete, the most direct and effective approach to
creating damage tolerant concrete structures would be to
embed intrinsic tensile ductility into concrete. If concrete
behaves like steel in tension (highly ductile) while retaining
all other advantages (e.g., high, and extreme compressive
strength), concrete structures with improved serviceability
and safety can be easily realized. In addition, crumb rubber
can absorb sudden impacts by controlling the movement of
waves transmitted by moving loads. Rubber compresses and
deforms easily, but the rate of deformation decreases as the
load increases, which makes it a good damper. Rubber has low
hysteresis and controls energy dissipation in a system that is
highly exposed to vibration-induced forces. Structures located
Figure V. Nonlinear Force- Deformation relationship of Wen near roads are subject to vibrations caused by moving
Plasticity vehicles. These vibrations are harmful to the pavement and
structures adjacent to the pavement. The use of rubber in
pavements as a substitute for natural aggregate can reduce this effect [5].

IV. Optimum usage of mixed damping
Nowadays, technological development and the practical solutions it provides to complex issues have become an urgent necessity that cannot be dispensed with. Accordingly, the development of damping systems used in modern structures and high-rise buildings depends mainly on computer software. Three-dimensional models with study cases were modeled using ETABS to support this research, based on the assumptions and equations mentioned in the previous chapters.

The main objective of mixing damping systems (rubber concrete, and chevron braces with XPDs) is to make a comparison between models with crumbed rubber added to the concrete aggregates in 5, 10, 15 and 20% ratios, as shown in Table I.

Table I. Definitions of 3D models (crumbed rubber ratio in concrete aggregates: 0, 5%, 10%, 15%, 20%)

Model Code | Type of Braces and Dampers Used in the Model
3D_Nor_NDR | NA
3D_Nor_XB_1 | 1 Middle Chevron Brace without XPDs
3D_Nor_XB_2 | 2 Corner Chevron Braces without XPDs
3D_Nor_XD_1 | 1 Middle Chevron Brace with XPDs
3D_Nor_XD_2 | 2 Corner Chevron Braces with XPDs

In pushover analysis, the effect of damping is related to the damping properties of the structural elements' materials and the properties of the added damper elements used in the model. Thus, in nonlinear pushover analysis, β₀, the hysteretic damping represented as equivalent viscous damping, changes according to the materials of the structure and the damper elements added (braces, chevrons and XPDs). It is also related to E_D, the energy dissipated by damping, and E_S0, the maximum strain energy. The β_eff values obtained from analyzing the models are used in estimating spectral reduction factors to decrease the elastic (5% damped) response spectrum, which plays a main role in the comparison, as we will see.

Five types of concrete material were defined in ETABS for the models, with crumbed rubber added to the aggregates according to the percentages mentioned above, including their damping properties as shown in Table II; the table values were taken from the research done by Najib N. Gerges, Camille A. Issa and Samer A. Fawaz [5].

Table II. Damping properties of crumbed rubber concrete (recovered rows only): 4.327 | 3.656 | 3.015 | 2.361; 3.22175 | 2.79 | 2.484 | 2.034; Average | 2.492 | 1.799 | 1.789 | 1.319

Meanwhile, during the analysis the performance curve is created accordingly, as the pushover analysis depends firmly upon the first-mode participation ratio of the structure's modal analysis. The Wen XPDs used in this research were selected due to simplicity of manufacturing and availability in local markets. As seen in Chapter II, XPDs work in both linear and nonlinear behaviors according to the degree-of-freedom direction in which the link or damper acts; thus, the plasticity property of the added links or dampers adds extra energy absorption as a nonlinear action. This research was made to see how all those factors affect the pushover analysis in structures with multiple damping systems.

A. 3D model specifications used in the research
3D models were developed, to be analyzed in the ETABS program, compatible with the following:

• The proposed model consists of a symmetrical 26 stories with 3 bays in the X and 3 bays in the Y directions.
• Wen plasticity dampers (XPDs) are added to the 3D models in the middle bays in the X and Y directions as a first step; in the next step, the added XPDs are removed from the middle bays and added to the corner bays.
• Chevron braces without XPDs are considered in the analysis as a comparison, as shown in Table I.
• The first and second steps are repeated for all types of rubber concrete used in this article, as explained in the previous chapters.
• In the 3D models the Response Spectrum will be amplified for
Sections used in the models:

Section | Height (cm) | Width (cm) | Location (Story) | Number of elements
C40 | 40 | 40 | 1 to 10 | 160
C35 | 35 | 35 | 11 to 19 | 144
C30 | 30 | 30 | 20 to 26 | 112
Comb | 50 | 30 | All Stories | 624
HSS101.6*4.8 (Circular) | 10.2 | 4.8 | Middle/Corner Bays | 208/416
Slab 1 (thin shell) | 20 | | 1 to 26 | 26

Loads:
• Dead load was defined as a linear static load and applied for all models as the self-weight of all structural elements, in addition to 2 kN/m² for each area element (slabs).
• Modal load was defined as Eigen vectors with zero initial conditions and the defined mass sources of the model.
• Live load was defined as a linear static load and applied as 2 kN/m² for all area elements (slabs) of the models, with zero initial conditions.
• Pushover load is a nonlinear static type continuing from the state at the end of another nonlinear case (gravity nonlinear), where all modal loads are applied using modes from the defined modal load with the same mass sources; load application is Displacement Control with multiple states.

Table V. PROPERTIES AND FIELDS OF P-M2-M3 HINGES

Hinge Type: P-M2-M3 Moment/Rotation Data | A | B | C | D | E
Moment/Yield Mom | 0 | 1 | 1.1 | 0 | 0
Rotation/scale factor | 0 | 0 | 0.01 | 0.01 | 0.01

Table VI. ACCEPTANCE CRITERIA OF P-M2-M3 HINGE TYPES
Figures IX and X show that the structures in the models reach their ultimate capacity with the most reduced demand and the highest scale factor of the design response spectrum, which means that the structures are pushed to the maximum limits, in the 15% added crumbed rubber models. Thus, this leads us to conclude that an optimum situation can be obtained from mixing the damping systems, such as adding crumped rubber to the concrete aggregates or installing other types of dampers like the XPDs, viscous liquid dampers or mass dampers, etc.

Figure IX. Pushover plot for model 3D_Nor_XD_2_R15%

Figure X. Detailed pushover plot for model 3D_Nor_XD_2_R15%

V. Conclusion
The locations of the XPDs used in the research played a major role in the mechanism of the model's response and behavior towards the design response spectrum curve, and also played a major role in raising the scale factor in some study cases compared to other models, in addition to the ratios of crumbed rubber added to the aggregates in the concrete used in the models.

Rubber concrete may be a good option to raise the damping ratio at the performance point of the structure; it contributes to absorbing energy by a certain amount, but it is not sufficient individually. Adding extra damping systems can form a hybrid system that is more effective in raising the effective damping ratio of the structure as a whole, thus giving a greater absorption capacity for seismic energy and increasing the scale factor of the design response spectra, which means increasing the structure's damping. The proportions of crumbed rubber added should also be carefully studied to obtain the optimum results.

The results obtained depend on the results of other research, which may not be accurate. Selecting the locations of the XPDs and the ratio of crumped rubber added to the concrete aggregates is critical and depends on the geometrical properties of the structural models and the study case. For future research it is recommended to use optimization applications on such cases to highlight the most proper usage of damper locations in structures, with the best ratio of added crumped rubber, in order to reach practical applications of this research.

References
[1] Kheir Al-Kodmany (2017). Understanding Tall Building: A Theory of Place Making. Routledge, Taylor and Francis Group.
[2] Applied Technology Council (ATC-40 Project) (1996). Seismic Evaluation and Retrofit of Concrete Buildings. Seismic Safety Commission, State of California.
[3] U. D. D. Liyanage, T. N. Perera, H. Maneetes (2018). Seismic Analysis of Low- and High-Rise Building Frames Incorporating Metallic Yielding Dampers. Civil Engineering and Architecture.
[4] Sarika Radhakrishnan, Sanjay Bhadke (2016). Seismic Performance of RC Building with X-plate and Accordion Metallic Dampers. International Research Journal of Engineering and Technology (IRJET), Volume 03, Issue 07, July 2016.
[5] P. Sugapriya, R. Ramkrishnan, G. Keerthana and S. Saravanamurugan (2018). Experimental Investigation on Damping Property of Coarse Aggregate Replaced Rubber Concrete. IOP Conf. Series: Materials Science and Engineering, 310 (2018) 012003. doi:10.1088/1757-899X/310/1/012003.
Bidirectional DC-DC Converter Based on Quasi Z-Source
Converter with Coupled Inductors
Murat Mustafa Savrun
Department of Electrical & Electronics
Engineering, Adana Alparslan Türkeş
Science and Technology University,
Adana, Turkey
msavrun@atu.edu.tr
Abstract—This paper presents an improved bidirectional quasi z-source dc-dc converter that achieves a reduced input current ripple. The proposed converter consists of a quasi z-source converter employing coupled inductors, a shoot-through switch, and an output filter. The proposed converter interface is able to provide bidirectional power flow as well as high voltage gain. Besides, the quasi z-source converter equipped with coupled inductors makes it possible to reduce the input current ripple. In order to demonstrate the improvement of the input current ripple and to verify the power flow functionalities and high voltage capability, a battery-connected proof-of-concept model has been developed in the MATLAB/Simulink environment. The proposed topology is examined under different power flow directions, battery charging algorithms, and various voltage gain values. Besides, the efficiency of the converter is analyzed for all case studies. The results validate the viability and effectiveness of the proposed converter.

Keywords—bidirectional dc-dc converter, quasi z-source converter, high voltage gain, coupled inductors

I. Introduction
Nowadays, the use of systems equipped with renewable energy sources (RESs), which are rapidly replacing fossil fuels, has been increasing. RESs have a non-linear production behavior due to their nature; therefore, they are often equipped with batteries and DC-DC converters in order to regulate their output. Power electronics converters have great importance in interfacing RESs, batteries and loads in application areas such as renewable-energy-based distributed generation, electric vehicles and microgrids. Step-up converters are frequently used due to the relatively low output voltage levels of RESs and batteries as well as their inherent limitations. Step-up dc-dc converters temporarily store low voltage input energy in magnetic field storage components and transfer it to the output at high voltage levels [1]. The traditional boost converter topology has the advantages of low conduction loss and simplicity of design. However, it cannot be used in voltage-sensitive applications that need high voltage gain, because of the restrictions of limited voltage gain and high output voltage ripple [2]. Several dc-dc converter topologies have been proposed in the literature regarding the high voltage gain issues. The high voltage gain topologies are categorized into two groups, isolated and non-isolated dc-dc converters [3]. Isolated dc-dc converters are equipped with high-frequency transformers (HFT) in order to isolate the primary and secondary sides as well as to provide high gain through the turns ratio. However, topologies equipped with high-turns-ratio HFTs provide limited voltage gain and sacrifice efficiency because of relatively high conduction losses [4]. The non-isolated topologies are frequently used in applications where there is no need for isolation, due to their reduced size and cost advantages [5]. In RES-based applications, various high voltage gain dc-dc converters are applied. These converter types are the impedance network [6], switched capacitor [7], capacitor clamped [8], cascaded boost [9], and quadratic boost [10] converters. Many of these topologies have the drawback of discontinuous input current [11]. In addition, coupled-inductor-based boost converter topologies have been developed to enhance the voltage boost capability with low input current ripple. These converters achieve high voltage gain thanks to the turns ratio of the coupled inductors. However, higher turns ratios increase the leakage inductance of the coupled inductors; therefore, instantaneous voltage spikes and their disruptive effects increase [12]. In [13], a quasi z-source inverter equipped with a coupled-inductor topology, with the advantages of high voltage gain and reduced input current ripple, is proposed to drive a motor.

In this paper, the bidirectional quasi z-source dc-dc converter, which excels with its high voltage gain capability, is equipped with coupled inductors in order to reduce the input current ripple. The proposed topology has the superior aspects of each of the two approaches. The high voltage gain is obtained by the quasi z-source converter, while the disadvantage of high leakage inductance due to a high turns ratio of coupled inductors is eliminated. Besides, the relatively high input current ripple of the quasi z-source converter is reduced via the coupled inductors. The performance of the converter has been evaluated under different case studies.

The paper is organized as follows: the high voltage gain dc-dc converter topology and its operation principle are described in Section II, whilst the control scheme is outlined in Section III. Section IV presents the operation waveforms captured under the defined case studies. Finally, conclusions and discussions are presented in Section V.

II. Power Circuit Structure
The power circuit configuration of the proposed converter is illustrated in Figure I. The proposed impedance-network-based converter consists of a quasi z-source converter, coupled inductors, an active switch, and an output filter. The quasi z-source converter allows a low input voltage to be increased to relatively high voltages thanks to its high gain characteristic, and it is equipped with coupled inductors in order to reduce the input current ripple. The active switch is used to control the shoot-through and non-shoot-through states of the quasi z-source converter. An LC filter is used to filter the oscillations at the output of the converter. The quasi z-source converter makes it possible to perform bidirectional power flow. Therefore, the proposed converter can be used not only for voltage regulation of RESs with low output voltage, but also for both charging and discharging of low voltage batteries. In order to test and evaluate the power transfer performance in both directions, a simulation model endowed with a battery has been developed.
Figure I. Power circuit configuration of the proposed converter (labels: Vb, Cq1, Cq2, Sst, Cout, DC-Link)
III. Control Scheme
The controller of the proposed system depends on the duty cycles of the two active switches. During forward power flow, while the freewheeling diode of the Sz switch is biased, the Sst switch is triggered with the determined duty cycle value in order to regulate the shoot-through condition and reach the desired voltage gain value. The duty cycle value varies within the limits of 0-0.5. During reverse power flow, while the Sst switch is in the OFF state, the Sz switch is triggered; the duty cycle value of the Sz switch determines the charging current of the battery. Batteries are charged according to a charging algorithm in order to extend their service life. The most commonly used battery charging algorithm is constant current (CC) / constant voltage (CV) charging. To perform CC/CV charging, the battery current needs to be controllable. The controller of the converter is illustrated in Figure II. The battery is charged up to the threshold voltage value, and the charging operation ends when the charging current reaches a value close to zero (0.005 C).
IV. Performance Results of the Proposed Converter
The proposed converter has been modeled, tested and evaluated in the MATLAB/Simulink environment. To evaluate its performance, a proof-of-concept model has been developed with a 100 V, 50 Ah lithium-ion battery. The parameters of the simulated system are listed in Table I. The switching elements are chosen as IGBTs, considering the switching frequency and power transfer rating. The performance investigation has been conducted under different voltage gain conditions for forward power flow and under CC/CV charging conditions for reverse power flow. The case studies summarized in Table II have been formed considering all possible operation scenarios.
Table II. Operation scenarios

Case 1:
Time Interval | Duty Cycle (Sst) | Output Voltage | Output Current | Battery Voltage | Battery Current | Operation Mode
0-2 s | 0.1 | 240 V | 0.475 A | 108.6 V | 1.1 A | Gain: 2.21
3-6 s | 0.15 | 407 V | 0.81 A | 108.5 V | 3.17 A | Gain: 3.75
7-10 s | 0.2 | 641 V | 1.27 A | 108.3 V | 7.93 A | Gain: 5.91
11-14 s | 0.25 | 939 V | 1.86 A | 108 V | 17.22 A | Gain: 8.69
15-18 s | 0.3 | 1294 V | 2.56 A | 107.4 V | 33.27 A | Gain: 12.05

Case 2:
Time Interval | Duty Cycle (Sz) | Output Voltage | Output Current | Battery Voltage | Battery Current | Operation Mode
0-2 s | 0.31 | 1000 V | -2.9 A | 110 V | -24.8 A | CC Charging
3-6 s | Decrease | 1000 V | Decrease | 110 V | Decrease | CV Charging
The second case represents the reverse power flow operation; the 0-2 s and 2-4 s time intervals correspond to the CC and CV charging operations, respectively. During this case study, the battery is charged from the load side to verify the reverse power transfer capability of the converter. It is assumed that the dc-link voltage of the output side is 1 kV. While the Sst switch is in the OFF state, the Sz switch is triggered according to the enabled charging algorithm. The maximum battery charging current is determined as 25 A, considering 0.5 times the battery capacity (0.5 C). Thus, the battery is charged with constant current up to the threshold voltage value. After the battery achieves the determined voltage value, the charging algorithm switches to CV charging. During this stage, the battery current gradually decreases to 0.005 times the battery capacity. Figure IV illustrates the operation waveforms of case 2. The efficiency values for the related time intervals are computed as 95.1% and 94.8%, respectively. The performance waveforms reveal that the proposed converter provides bidirectional power flow with high efficiency.
Case-1 Operation Waveforms (battery voltage, battery current, load current, battery power, load power and battery SOC versus time)
Figure IV. Case-2 Operation Waveforms (battery and load quantities and battery SOC versus time)
[10] S. Lee and H. Do, "Quadratic Boost DC–DC Converter With High Voltage Gain and Reduced Voltage Stresses," IEEE Transactions on Power Electronics, vol. 34, no. 3, pp. 2397-2404, 2019.
[11] M. Moslehi Bajestan and M. A. Shamsinejad, "Novel switched-coupled-inductor quasi-Z-source network with enhanced boost capability," Journal of Power Electronics, vol. 20, no. 6, pp. 1343-1351, 2020.
[12] Y. Hsieh, J. Chen, T. Liang, and L. Yang, "Novel High Step-Up DC–DC Converter With Coupled-Inductor and Switched-Capacitor Techniques," IEEE Transactions on Industrial Electronics, vol. 59, no. 2, pp. 998-1007, 2012.
[13] A. Battiston, E.-H. Miliani, S. Pierfederici and F. Meibody-Tabar, "A Novel Quasi-Z-Source Inverter Topology With Special Coupled Inductors for Input Current Ripples Cancellation," IEEE Transactions on Power Electronics, vol. 31, no. 3, pp. 2409-2416, 2016, doi: 10.1109/TPEL.2015.2429593.
Influences of urban fabrics on microclimate assessment
within the city of Tirana
Fabio Naselli Enkela Krosi
Department of Architecture Department of Architecture
Epoka University Epoka University
Tirana, Albania Tirana, Albania
fnaselli@epoka.edu.al ekrosi@epoka.edu.al
Abstract— Tirana, the capital city of Albania, could not escape the feature that characterizes all the cities of the past socialist regime: a sudden and loosely governed development process. The study aims to emphasize the problematics and the differences in microclimate levels that coexist across the existing urban fabrics within Tirana, which reduce the comfort of city inhabitants and are mainly induced by the post-socialist urban growth. We want to point out the reduction of open and green areas and the occupation of free soils by high-rise or informal buildings, which have increased the UHI (Urban Heat Island) effect, modifying the local metabolism and reducing the general urban comfort. The UHI effect is escalated further by the substitution of natural materials with asphalt and concrete. An analytical interpretation of land cover ratios, starting from both land use and land cover analyses, has been conducted for 2 contiguous urban sites in the city of Tirana, selected among the diverse typologies of fabrics in the same area of the city. The research shows that uncontrolled edification may generate different urban environments that exhibit different microclimatic levels despite their proximity to one another.

Keywords— microclimatic values, UHI, urban metabolism, land cover, land use, informal urban development

I. Introduction
Approximately half of the population of the world lives in urbanized areas, and there is a tendency for this to increase [1], [2]. This phenomenon has resulted in air, noise and land pollution and has consequently changed the microclimate of urban areas. The change in atmospheric and climatic conditions affects our mood and activities and even our daily productivity [3]. The phenomenon whereby temperatures in urban areas register higher values compared to the periphery is commonly known as the Urban Heat Island (UHI) effect. UHI is directly affected by the change in wind speed, is stronger during the night [4], increases with urbanization and population, and has recently been altered by the change in land use and land cover ratios [5], [6]. Greenery, a crucial element that mitigates the UHI effect, is being rapidly reduced [7], although it has a direct relation to the reduction of health problems in humans [8]. The fast expansion of urban morphology with the invention of new and automated construction methods and techniques has resulted in the alteration of microclimate values, especially the air temperature, across different building blocks. Particularly in Tirana, after the decline of the communist regime, an uncontrolled urban growth spread quickly to almost all of the city area. This uncontrolled urban sprawl has spread in such an unpredictable manner that even adjacent building blocks separated by one main road show a big disparity in terms of land use, land cover, building intensity and so on. The present condition is characterized by the substitution of natural surfaces and greenery with impermeable hardscapes such as asphalt and concrete, which store a large amount of radiant heat and possess low albedo values. The high building height of recent constructions has reduced the sky view factor ratios and reduced the ventilation of outdoor areas and wind passages, which prevents the cooling of urbanized zones [9] and reduces the daylight level, causing gloom within building canyons [10]. The aim and objective of this paper is to accentuate the problematics associated with and induced by the rapid, uncontrolled urban sprawl, which directly influences the microclimate of urbanized zones in the city of Tirana. The study intends to raise the awareness of environmental regulation authorities about the related phenomenon, to contribute to the improvement of thermal comfort and energy saving [11], since buildings consume almost 40% of energy during their whole lifecycle [12], to highlight the need for green areas, to improve the quality of urbanized areas and, overall, to orient strategies towards sustainable urban development [13].

II. Influences of urban fabrics
A. The Study areas
The selected sites, to be analyzed in terms of microclimate assessment with regard to their land cover and land use features, are located in the city of Tirana, Albania (Figure I), adjacent to one of the main roads of the city, which leads the movement fluxes towards the city center; the opposing typologies on either side of the road are of different features and architectural values, constructed over a time period of a hundred years. The very first blocks facing the road are apartments of 4- and 5-floor compositions accompanied by 2- to 3-story villas reflecting the influence of Italian architecture. On the back side of the first facing blocks, the architecture and urban morphology have lost their spatial character through amateurish and profit-oriented interventions, which have lowered the value of the outdoor common spaces and even reduced them in size.

Figure I. Tirana city
The existing open and green areas have been occupied by new buildings that frequently disobey even the urban planning rules and regulations (Figure II).
The city of Tirana is characterized by a Mediterranean climate; it is one of the wettest and sunniest cities in Europe. Tirana has a diverse urban morphology, developed through uncontrolled urban sprawl and the rapid extension of the city border.
The ratios of land covering can be better understood in the charts.

Table II. Total built area for Z1 & Z2

Floors            Zone 1: Area   % tot.   Built area     Zone 2: Area   % tot.   Built area
built area        9825           100%     -              8050           100%     -
10k               0              0.0%     0              675            8.4%     6750
5k                3575           36.4%    17875          4525           56.2%    22625
3k                100            1.0%     300            1000           12.4%    3000
2k                3150           32.1%    6300           1150           14.3%    2300
1k                3000           30.5%    3000           700            8.7%     700
tile cover        4375           44.5%    -              1550           19.3%    -
asphalt cover     5450           55.5%    -              6500           80.7%    -
total built area  -              -        27475          -              -        35375

Table II exhibits the ratios of the buildings by their floor count, which are directly proportional to the building intensity. Table II shows that in Z1 the share of built area in buildings of more than 5 floors is 36.4%, while in Z2 it is 64.6%. The ratio of the inhabited built area over the total land use is drastically higher in Z2 than in Z1; the values are shown in Table IV at the intensity rows, where for Z1 it is 1.48 and for Z2 it is 2.59. Another issue extracted from Table II is that materials that store more heat, like asphalt and concrete, are predominant in Z2. From the values shown in Table IV it can be comprehended that, with coefficients of land usage of Z1 = 53% and Z2 = 59%, Z1 possesses more open and public spaces with green character compared to Z2. The intensity values show that the sky view factor is highly reduced in Z2 due to the large concentration of buildings of more than 5 floors within the zone.

Figure VII. Info of the building regulations for Z2 by municipality, extracted from planifikimi.gov.al

Table III. Intensity and L.U. coefficient: permissibility values and actual situation

                      Zone 1: Permissibility value   Actual situation   Zone 2: Permissibility value   Actual situation
intensity             3.3                            1.48               2.4                            2.59
coef. of land usage   45%                            53%                45%                            59%
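As a quick cross-check of these figures, the intensity values follow directly from the tabulated numbers once the zone areas are back-calculated from the stated coefficients of land usage; a short sketch (a reading of the tables, not part of the original paper; units as in the source) reproduces them:

    #include <cstdio>

    int main() {
        // Figures taken from Tables II and III: building footprint, total
        // built floor area, and the stated coefficients of land usage.
        const double footprint_z1 = 9825.0, built_z1 = 27475.0, lu_z1 = 0.53;
        const double footprint_z2 = 8050.0, built_z2 = 35375.0, lu_z2 = 0.59;

        // Zone area is back-calculated from L.U. = footprint / zone area.
        double zone_z1 = footprint_z1 / lu_z1;   // ~18,538
        double zone_z2 = footprint_z2 / lu_z2;   // ~13,644

        // Building intensity = total built floor area / zone area.
        printf("Z1 intensity: %.2f\n", built_z1 / zone_z1);  // ~1.48
        printf("Z2 intensity: %.2f\n", built_z2 / zone_z2);  // ~2.59
        return 0;
    }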
Figure VI and Table IV show the values of Land Use and Building Intensity provided by the municipality of Tirana in its building regulation code, extracted from the official webpage planifikimi.gov.al. From the figure it can be understood that the Land Use coefficient is exceeded under current conditions for both zones. Regarding the intensity values, Z1 has a permissibility of 3.3 while the actual value is 1.48; on the other hand, in Z2 the permissibility value of the intensity is 2.4 while the actual condition is 2.59. This has occurred due to the demolition of the old Italian-style villas in Z2 and the construction of new high-rises.
Figure VI. Info of the building regulations for Z1 by municipality, extracted from planifikimi.gov.al

III. Discussion
Microclimate assessment of urbanized areas is widely practiced and has been developed through several different strategies and methods for achieving certain results. Past research on microclimate assessment, using various methods and tools, has shown that urban microclimate is directly affected by a number of elements and conditions, such as green ratio, material characteristics, built/unbuilt ratio, sky view factor and so on. Building on these findings, our study was designed to understand the variety of microclimate change within the urban blocks of the city of Tirana. The land use ratios extracted from the two zones in this research were selected intentionally to exhibit the disparities that adjacent urbanized zones possess.

The Z1 analysis has shown that the block is surrounded by 5-floor apartment buildings, while within the block low-rise private houses dominate, offering green space through their private gardens; in Z2, by contrast, the demolished old private houses have been substituted by high-rise apartments that have reduced the open and green areas within the block.

Table IV. Concluding table

Zone 1                        Zone 2
high green areas              low green areas
low-rise building             high-rise building
L.U. = 53%                    L.U. = 59%
I = 1.48                      I = 2.59
large areas of tile covering  large areas of asphalt
high sky view factor          low sky view factor

Referring to the concluding Table IV and considering past research on similar topics, it can be concluded from the measurement of Land Use and Land Cover ratios for the two selected zones that between Z1 and Z2 there should exist a difference in air temperature, wind flow and ventilation level within the zones, and also a difference in UHI values for the two study zones.

IV. Conclusion
Tirana, as a city that has passed through several urban transformations, exhibits different urban characteristics that are evident even in zones located in close proximity within the city. The rapid and uncontrolled urban sprawl has further accentuated the urban morphological variations. This phenomenon has altered the built/unbuilt ratios and reduced the green areas, which has a direct relation to the increase of urban microclimate temperatures. The reduction of the sky view factor and the increasing amount of heat-storing materials in building construction, such as asphalt and concrete, have increased the UHI effect, which reduces the indoor and outdoor comfort of the citizens of Tirana. The two selected study zones show that even in areas adjacent to each other, due to urban morphological variations and land cover differences, there exist variations in temperature and ventilation, which have a direct effect on mood and productivity.

References
[1] Wei Ruihan, Dexuan Song, Nyuk Hien Wong, and Miguel Martin, "Impact of Urban Morphology Parameters on Microclimate," Elsevier, The Netherlands, 2016.
[2] , "Land Use/Land Cover changes dynamics and their effects on Surface Urban Heat Island in Bucharest, Romania," Int. J. Appl. Earth Obs. Geoinformation, 2019.
[3] , Press, New Jersey, 1962.
[4] Ibidem (1).
[5] Ibidem (1).
[6] Ibidem (2).
[7] Erell Evyatar, David Pearlmutter, and Terry Williamson, Urban Microclimate. New York: Taylor and Francis, 2011.
[8] Roberts Hannah, Rosemary McEachan, Tamsin Margary, Mark Conner, and Ian Kellar, "Identifying Effective Behavior Change Techniques in Built Environment Interventions to Increase Use of Green Space: A Systematic Review," Environment and Behaviour, SAGE Publications, 2018.
[9] Tsoka Stella, "Investigating the Relationship Between Urban Spaces Morphology and Local Microclimate: a study for Thessaloniki," Elsevier, 2017.
[10] Oke Timothy Richard, "Street Canopy and Urban Layer Climate," Environment and Behaviour, SAGE Publications, 1988.
[11] Ibidem (1).
[12] Kocagil Idil Erdemir, and Gul Koclar Oral, "The Effect of Building Form and Settlement Texture on Energy Efficiency for Hot Dry Climate Zone in Turkey," Elsevier, 2015.
[13] Mills Gerald, "Progress toward sustainable settlements: a role for urban climatology," Theoretical and Applied Climatology, 84, Springer-Verlag, 2006.
IoT Based Water Management and Monitoring System for
Multi-Resources
Sarosh Ahmad, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sarosh786a@gmail.com
Sheza Yasin, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, shezay42@gmail.com
Sajal Naz, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sajalnaz751@gmail.com
Abstract—This paper aims to manage water distribution in an aligned manner so that everyone gets an equal amount of water without wastage. Without a PLC (programmable logic controller) and sensors, feedback is not obtained quickly. The proposed system is fully automated by connecting it to PLC & SCADA (Supervisory Control & Data Acquisition), which provides an IoT (Internet of Things) solution. This paper proposes a Multi-Resource Control System. Through this project, human effort, time, resource wastage and other complications are reduced. Our idea is to minimize water wastage through a transparent, accountable, and efficient water supply system, which in turn reduces human effort and minimizes the use of other resources such as electricity. The goal, therefore, is to implement an efficient multi-resource water management system at an affordable cost. Thus, we design a setup through which we investigate various parameters such as pH, water level and turbidity of the water, manage them by comparison against a set benchmark, and manage the flow meter reading depending on the population. In this research, we have simulated the salt level testing system, the UV (ultra-violet) testing system, and the pH testing system, and designed a system to manage and distribute water equally without wastage.

Keywords—Programmable Logic Controller, Supervisory Control & Data Acquisition, Internet of Things, Multi-Resource Control System, water level control, wireless sensors

I. Introduction
The water on the world's surface is unevenly distributed. Just 3% of the water at the surface level is usable; the remaining 97% is found in the seas. Of the freshwater, 69% is found in glaciers, 30% underground, and less than 1% in lakes, waterways, and marshlands [1]. This all concludes that just a single percentage of the world's surface water is usable by the population on earth, and 99% of that usable amount resides underground. Consequently, water administration and dispersion must be done expertly. Due to the rapid growth of the population, the water requirement is increasing day by day, which raises several issues such as water scarcity and shortage. These problems have been increasing swiftly, affecting home users, agricultural lands, and industrial sectors. The traditional approach of pumping underground water through pumps, tube wells, Petter engines, etc. involves no proper water management. Therefore, unnecessary use and waste of water gradually reduce the level of underground water, causing severe problems to the environment as well as to individuals. The future need is to utilize water resources efficiently with a water management system. A new method based on IoT with PLC and SCADA is proposed; such a framework is required because it deals with the utilization of water. Control engineering has passed through many innovative changes over the last few years. Previously, human beings were the only means of manipulating and commanding any framework [1]. Troubleshooting helps in analyzing and rectifying errors. Due to their reliability, such systems can be used for years without any malfunction [2-4]. Automation is the theme of our proposed project, as it plays a vital role in controlling human errors. This system works on different parameters of the water, such as level and flow rate. Using these parameters, water wastage and water theft can be avoided. On these grounds, the evolution of technology enhances distinctive methods, keeping the economic aspects in perspective [5-7]. The Internet of Things (IoT) is a system composed of several branches of mechanical, electrical and computing devices and wireless technologies that can be employed to achieve water management system requirements. The water pump and pump station can be regulated through the Multi-Resource Control System (MRCS), which is an innovative technique. The pump controller, the water level in the water storages, and the alarming framework are integral sections of the MRCS. Additionally, a 4-state switch will be designed, which helps in operating the system manually, automatically, via the IoT method, or in the Off state. This hierarchy will be driven by IoT technology, directed by SMS, Wi-Fi, or ringtone, and will be accessible from anywhere and at any time [8-10]. The water management system through MRCS using the IoT technique may be considered among the modern ways of significantly controlling the wastage of water.

A. Programmable Logic Controller (PLC)
A programmable logic controller is a computerized controller that is utilized to control electromechanical processes through automation. PLCs are used to control processes such as amusement rides, machinery in manufacturing plants, water-tank fire extinguishing in aviation, filling machine control in the food industry, closed-loop material shrinkage control, and other processes in our day-to-day life. The PLC was chiefly intended for multi-input and multi-output processes, as shown in Figure I. This was further extended to wide temperature ranges, immunity to electrical noise, and resistance to vibration and other impacts [11].
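A PLC program is typically expressed in ladder logic; as a rough illustration of what one scanned rung computes, the classic start/stop seal-in rung can be written in ordinary C++ (a sketch for illustration only; the paper's actual ladder program is not shown):

    #include <cstdio>

    // One scan of a start/stop seal-in rung:
    //   (Start OR Motor) AND NOT Stop -> Motor
    bool scanRung(bool start, bool stop, bool motor) {
        return (start || motor) && !stop;
    }

    int main() {
        bool motor = false;
        motor = scanRung(true,  false, motor);  // operator presses Start
        printf("after Start: %d\n", motor);     // 1 (latched on)
        motor = scanRung(false, false, motor);  // Start released, seal-in holds
        printf("held:        %d\n", motor);     // 1
        motor = scanRung(false, true,  motor);  // Stop pressed
        printf("after Stop:  %d\n", motor);     // 0
        return 0;
    }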
B. Supervisory Control & Data Acquisition (SCADA)
SCADA is broadly utilized in industry for supervisory control and data acquisition of industrial processes; SCADA frameworks are now also penetrating experimental physics laboratories for the control of ancillary systems such as cooling, ventilation, power distribution, and so forth. More recently, they were also applied to the controls of smaller-size particle detectors, for example the L3 muon detector and the NA48 experiment, to name only two examples at CERN. SCADA frameworks have made substantial progress over recent years in terms of functionality, scalability, performance, and openness, such that they are an alternative to in-house development even for very demanding and complex control systems like those of physics experiments [17], as can be seen in Figure II.

Figure II. Block Diagram of the SCADA system

C. Internet of Things (IoT)
These interconnected objects have data routinely gathered, analyzed, and used to initiate an action, providing a wealth of intelligence for planning, management, and decision-making. IoT is characterized as a network of physical objects. The web is not just a network of computers; it has advanced into a network of devices of all kinds and sizes: vehicles, smartphones, home appliances, toys, cameras, medical instruments and industrial systems, animals, people, buildings, all connected, all communicating and sharing data based on specified protocols so as to accomplish smart reorganization, positioning, tracing, safety and control, and even personal real-time online monitoring, online upgrade, process control and administration [19], as presented in Figure III.

For making our system efficient, we have added an automation system through the PLC. The storage tanks will have sensors and other instruments attached to a certain device in order to control and monitor various parameters. These sensors examine physical parameters and convert them into electrical signals to give input to the PLC. The PLC is the main governing party because it controls the sensors and other devices and provides data to the control room. The control room is a SCADA system that stores data coming from the PLC and other devices in its server. In current designs, a person must be within the premises of the office to switch the water supply on or off, but we are moving towards a PLC which will act like a person in supplying water. The quality parameter that is vital for healthy water is the pH value. Extreme pH numbers cause severe health issues, such as infections of the skin and eyes, and also damage cell membranes. So, in order to avoid these health issues and other issues like corrosion of water pipes and mains, a pH controlling and monitoring system has been installed in our automation system. Proper monitoring of the process is mandatory to obtain results at an optimal level. SCADA systems have been used in most industries for a long time. It is an efficient system because it shows the information on a real-time basis, which helps in sorting out problems and correcting them as identified. This SCADA system consists of a primary control center and field sites as per requirement. Point-to-point connections are used for control-center-to-field-site communication. All field sites are interconnected via networking for communication. This system mainly consists of the PLC, which is the central and most important part of the system. The SCADA system is designed to realize the automatic control of the valve and the transformation of parameters such as pipeline pressure and water quality [20]. The PLC is the heart of our automation system, so it provides all the logic functions, which will be developed through a ladder logic program used to command the PLC. Sensors and actuators will provide their values and observations to the PLC. Then, the PLC will monitor and control them based on logic through the ladder program. This logic will be uploaded to the PLC through PLC software and can be changed accordingly. The PLC is synced with the SCADA system, which results in monitoring and commanding of the water distribution network. In the water supply system, we have one storage tank with a level sensor for level monitoring and a pH sensor for water quality monitoring.
Other water supply system elements include pipelines for water flow and pressure switches for closing and opening the outlet valve and maintaining the volume in the tank. In the case of heavy, high-pH water, the amount of chlorine to be added into the tank is defined at the data acquisition center in ppm (parts per million) to purify the water. We used equations to track the ppm in the tank, the pressure (PSI), and the flow (GPM) in the pipe [8]. From the equations, we conclude that the tank's maximum volume, the solution's ppm, the elevation height of the reservoir, the pipe diameter from the reservoir, and the flow percentage open are the variables and must be specified at the start.
• The pumping and filtering processes work on six inputs to the input module, such as the start push button, the stop push button, and the chlorine tank lower- and higher-level measuring devices.
• An Allen Bradley PLC controls the process, and the Wonderware InTouch SCADA software tool is used for monitoring the process.
• After commanding from the PLC, the pumping and filtering process will output to the chlorine outlet valve and the reservoir solenoid valve. In our system design, we have added control of both motors from the PLC, which can be commanded as required, and the current status and history of the sensors can be seen remotely through the SCADA system.

A. Automation
Our whole system is based on automation in order to nullify human error and to design an efficient and modern system. Automation has been used to operate various control elements automatically for many years now. The foremost gain of an automatic system is that it saves resources, energy, and materials with no compromise, and yields even better quality, accuracy, and precision. The block diagram of the PLC system is presented in Figure IV.
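The tracking equations themselves are only referenced, not reproduced, in the paper; as a hedged stand-in, the static head and dose concentration can be computed from the listed variables using the standard water-column relation (about 0.433 PSI per foot) and the milligram-per-litre definition of ppm; all numeric values below are placeholders:

    #include <cstdio>

    int main() {
        // Variables the authors say must be specified at the start
        // (the numbers are placeholders, not from the paper).
        double tankVolumeL = 10000.0;  // tank's max volume, litres
        double doseMg      = 50000.0;  // chlorine added, milligrams
        double elevationFt = 60.0;     // elevation height of the reservoir

        // Concentration: 1 mg per litre of water is 1 ppm.
        double ppm = doseMg / tankVolumeL;   // 5 ppm
        // Static head: a water column exerts ~0.433 PSI per foot.
        double psi = 0.433 * elevationFt;    // ~26 PSI

        printf("dose: %.1f ppm, static head: %.1f PSI\n", ppm, psi);
        return 0;
    }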
D. Salt Level Testing System
The red color in the first tank indicates the salt testing. Salt must be removed from the water to obtain fresh water for humans, irrigation, and other purposes. Salt is produced as a by-product of the removal of salt from water in desalination. This removal of salt from water is termed desalination, and it yields a potential amount of by-products for different applications. Figure VII shows a salt level test simulation diagram. This can be considered an independent water source. The seas are immensely large, so desalinating them is a very costly process and raises some other big problems for the future. So, alternative methods are mostly used.

Figure VII. Salt Level Testing System

E. Ultra-Violet (UV) Testing System
The first tank in green color indicates the UV testing. UV is found to be more effective for killing harmful bacteria and viruses. Recent studies by different researchers have established that UV rays are very strong in dealing with bacteria, viruses, and other microorganisms. Test results are shown in the Figure VIII simulation diagram. The findings show that UV radiation is a good method for the treatment of water for drinking purposes. Some bacteria and viruses were found to survive during tests at high UV doses, but their amount is greatly reduced.

... it means the process is stopping and then reprocessing again.

Figure IX. pH Level Testing System

G. Distribution Section
In our project design, we have considered distribution a separate department because we want to have a better water supply for customers and to avoid any faulty conditions. We have seen that most problems in the water supply/distribution system are due to the occurrence of a pressure drop, or to a pump being used in a home to suck water directly from the main pipeline passing through the street. We have worked on these main issues and come up with a solution for the water distribution system. In our solution, we will have a control system run by the PLC, a transmission channel, and other elements with connecting pipes. The block diagram for our proposed design is shown in Figure X. It shows that all the processes of the water distribution system will be governed by the PLC, and for transmitting/receiving control by the PLC, a communication channel will be utilized.
In this section, by monitoring the water storage and distribution system, the pressure, and other parameters, we are able to detect any theft happening in streets or elsewhere. Every stage is monitored separately in order to get a precise analysis. If the distribution tank's higher-level sensor is ON, it means that water distribution is going on. Water can be distributed to all places at the same time or at different times, depending upon requirements and resources. If there is any problem under distribution control, then the valve will change automatically to manual control and the problem can be rectified. Figures XI and XII show the distribution valve simulation figures. The screen has a status bulb; when the distribution is turned ON, the bulb turns green.

Figure XI. Distribution of Valve 1

Figure XII. Distribution of Valve 2

III. ADVANTAGES, APPLICATIONS & LIMITATIONS

A. Advantages
Nowadays, the PLC has become a basic controller in industries. It is an integral component in industries because it has replaced hard wiring, providing an efficient way of controlling systems. Other benefits of the PLC are listed below:
1) Flexibility and Reliability
Due to developments in technology, one PLC is capable of controlling multiple systems, whereas in the past multiple systems required multiple controlling devices. These controllers are very reliable, and the chances of error are very low, as there is very little moving mechanism in the device.
2) Changes and easier error correction
If any system needs to be modified, only changes to the program are made. This adds extra efficiency to the circuit and minimizes time. If other devices such as relays were used, alteration of the circuit would be required, which reduces efficiency and increases time.
3) Less Power Consumption
A PLC consumes one tenth of the power of an equivalent relay control. The number of contacts per coil in a PLC is greater than the number of contacts found in a relay.
4) Operating Speed
A PLC operates faster than an equivalent device; the speed of a PLC is measured in milliseconds.
5) Reduced Space
A PLC is a very compact device because it is a solid-state device rather than an electromagnetic, hard-wired device in which electromechanical motion is required.
6) Ease of Maintenance
Troubleshooting is very easy because the PLC provides error diagnostics through a program and software. If an error is found, component replacement is also very easy.
7) Addition of Circuits
It is easy to add multiple circuits to a PLC to provide control. This can be done without much effort and saves the money that would otherwise be spent on other controllers.

B. Applications
The PLC is mostly preferred in certain automation projects. The best market for PLCs is industry. Industries have been using PLCs in their manufacturing processes for a long time. They are installed not only for manufacturing purposes but are also used for monitoring and automation tasks. A PLC requires I/O devices, which are then used with different industrial components according to their compatibility. In some cases, an external circuit design is required to connect those terminals physically, with some programming in ladder logic to make the complete circuit operational. PLCs are mostly found in high-end places where the package of a PLC is cheaper than the cost of custom-built controllers. At the low-end level, different automation techniques have been employed to save money, time and effort. The given project finds most of its applications in industries where there is excessive use of water with no proper maintenance. This water management system will assist in minimizing the useless dissipation of water, because water is a vital element on earth. This project also finds applications in WASA, chemical industries, dyeing industries, and wherever there is usage of any liquid for different processes or as a product. This is a one-time investment through which millions of gallons of water can be saved annually. The whole system can work remotely through SCADA, which can be operated from anywhere, and hence will create a safe environment of operation.

C. Limitations
• The PLC is a costly device.
• The project is suitable for industries or government institutes, whereas it will not be suitable for local and domestic use.
• Most areas are still operating on old controllers, so transforming those areas to PLC systems is a difficult task to achieve.

IV. Conclusion
We have worked on water treatment, water level monitoring, water distribution, and the controlling system through
automation. Our idea was to establish an economical, feasible, and automated water management system to avoid the wastage of water. We have worked on this project and proposed multi-resource water management through the PLC. The flexibility of this project is immense, as it can be controlled remotely, which is also a subsequent benefit for controlling water efficiently. The edge the PLC has over any other controller or computer is that it is not affected by environmental changes like cold, heat, moisture, dust, etc. Due to new advancements in technology, the PLC has also upgraded itself by including different controls, motion controls, networking capability, distributed control, and other respects. Remote handling of the PLC is provided through the SCADA system and IoT, which is a more modern means. These systems have improved the functionality, performance, and scope of the PLC. Due to the attachment of SCADA to the PLC, it has gained immense popularity in industries because of remote access.

V. Future Prospective
Technological trends across the globe are pushing companies and industries towards the next automation-based system, monitoring, and manufacturing era. This is all based on the subsequent work done in previous years to make industries work on automation. Many corporations have come to the forefront and built advanced simulation tools which are being used in many companies. Our project likewise has applications in any automation firm. Moreover, this project has scope in the dyeing, petrochemical, pharmaceutical and other industries. With the addition of SCADA, the project has gained huge popularity in remote areas. Remote operations are becoming preferable to simple automation, and industries are willing to convert automation tasks to remote ones which can be accessed from anywhere.

REFERENCES
[1] V. C. Gungor and F. C. Lambert, "Research on communication networks of electrical automation system," vol. 50, no. 7, pp. 877-897, 2006.
[2] M. Hadipour, F. J. Derakhshandeh, and A. Shiran, "An experimental setup of multi-intelligent control system (MICS) of water management using the Internet of Things (IoT)," ISA Transactions, vol. 96, pp. 309-326, 2020.
[3] A. A.-S. and A. E. A. S. Ali, "PLC water pumping system and frequency control," 2009.
[4] J. Aziz, "National Water Quality Strategy," Asian Development Bank, 2002.
[5] S. Rana, "Poor water management is more costly for some countries," Express Tribune, 2019.
[6] G. Murtaza and M. H. Zia, "Wastewater Production, Treatment and International Use," pp. 16-18, May 2012.
[7] A.-S. and A. Ali, "PLC water pumping system and frequency control," 2009.
[8] Silicon Review, "What is the simple definition of the Internet of Things?", 16 March 2016.
[9] M. Hadipour, F. J. Derakhshandeh, and A. Shiran, "An experimental setup of multi-intelligent control system (MICS) of water management using the Internet of Things (IoT)," ISA Transactions, vol. 96, pp. 309-326, 2020.
[10] R. Gonçalves, J. M. Soares, and R. M. F. Lima, "An IoT-Based Framework for Smart Water Supply Systems Management," Future Internet, vol. 12, no. 7, 2020.
[11] K. Ek and L. Persson, "Priorities and Preferences in Water Quality Management: a Case Study of the Alsterån River Basin," Water Resources Management, Springer, pp. 155-173, 2020.
[12] Swapnil Namekar, Patel Tayyab Jahngir, Shahid K. Hannure, Manasi Jagtap, and Pratiksha Zagade, "Water Level Controller," International Journal of Innovative Research in Technology (IJIRT), vol. 6, no. 11, April 2020.
[13] M. O. Arowolo, A. A. Adekunle, and M. O. Opeyemi, "Design and Implementation of a PLC Trainer Workstation," Advances in Science, Technology and Engineering Systems Journal, vol. 5, no. 4, pp. 755-761, 2020.
[14] A. Aguilar, M. Pérez, J. L. Camas, H. R. Hernández, and C. Ríos, "Efficient Design and Implementation of a Multivariate Takagi-Sugeno Fuzzy Controller on an FPGA," 2014 International Conference on Mechatronics, Electronics and Automotive Engineering, Cuernavaca, 2014, pp. 152-157, doi: 10.1109/ICMEAE.2014.
[15] S. V. and A. Vosough, "PLC and its applications," International Journal of Multidisciplinary Sciences and Engineering, vol. 2, 2011.
[16] "Scheduled Controls Explained (PLC)".
[17] A. Daneels and W. Salter, "What is SCADA?", International Conference on Accelerator and Large Experimental Physics Control Systems, Trieste, Italy, 1999.
[18] P. M. Adhao, Mahavir's, "Internet of Things (IoT): New Age," International Journal of Engineering Development and Research (IJEDR), vol. 5, no. 2, 2017.
[19] K. K. Patel and S. M. Patel, "Internet of Things-IOT: Definition, Characteristics, Architecture, Enabling Technologies, Application & Future Challenges," International Journal of Engineering Science and Computing, vol. 6, no. 5, May 2016.
[20] G. S. Ashok, "Water Anti-Theft and Quality Monitoring System Through PLC and SCADA," International Journal of Electrical and Electronics Engineering Research, pp. 355-364, 2013.
Development of a High Precision Temperature Monitoring
System for Industrial Cold Storage
Sarosh Ahmad, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, sarosh786a@gmail.com
Arslan Dawood Butt, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, arslandawood@gcuf.edu.pk
Usama Umar, Department of Electrical Engineering and Technology, Government College University Faisalabad, Faisalabad, Pakistan, usama.rwp96@outlook.com
Abstract—Cold storages are widely used by a number of industries all over the world, mainly the food industry. Cold storage facilities play an important role in increasing the shelf life as well as retaining the quality of several raw and processed food items. But there are some problems in the old-style, manually controlled cold storage systems mostly in use, which need to be upgraded with modern technology to reduce potential losses. This research relates to the real-time temperature monitoring of cold storage: in order to maintain shelf life, proper monitoring of the temperature is required. In the case of the apple, if the temperature is maintained between 33.8° and 39.9°F, its shelf life will be 3-8 months; if the temperature increases further, the shelf life reduces drastically. In this work, we propose a highly precise and reliable remote temperature monitoring system to be used in cold storage units. This study constitutes the development of an efficient and effective real-time remote temperature monitoring system that will display the accurate and precise temperature on an Android app on a cell phone as well as on the LCD. The device was developed using a Pt-100 sensor, an LT3092, an INA-128p, an Arduino UNO, a Wi-Fi module, an LCD display and a PCB. It will provide high-efficiency monitoring from remote locations and will greatly help to minimize the temperature variations in cold storages. Its economical design with the latest features makes this device attractive for industrial use.

Keywords—Internet of Things (IoT), surface mount devices (SMD), temperature monitoring, cold storage

I. Introduction
Cold storage is an important part of the food industries, as stored food items are sensitive to temperature variations and require persistent monitoring. The lack of up-to-date technologies and ignorance about humidity and temperature effects on fruits result in food safety issues. Slight variations from the optimum temperature can cause great economic losses for industries [1]. The main objective of cold storage is to preserve fruits for a certain period of time. A "cold storage" is a storage place where various foods and vegetables are stored at cold temperatures for a few months or longer. This allows the food item to be available throughout the year. Every fruit or vegetable has its fixed range of temperature for storage, known as the optimal temperature range. The temperature of the cold storage should be kept within the optimal temperature range for proper storage of fruits and vegetables [2-4]. All fruits and vegetables have their specific shelf life. The "shelf life" of a fruit or vegetable is the time duration for which it can be stored without becoming inadequate for usage, sale, or consumption [5].

II. Literature Review
The temperature of a cold storage must be maintained according to the food stored in it. If the temperature of the cold storage is not kept in the optimal range, the shelf life of the stored items reduces significantly. Temperature and humidity, to be more specific, are monitored through IoT, which does not require the presence of any individual. Temperature variations were monitored precisely, and when the temperature was above or below the specified range, an alarm was activated [6-7]. At the same time, the actuator starts or stops to maintain the temperature in the storage room. Temperature changes seriously affect the quality of farm products. In order to solve this issue, the operator has to keep an eye on the present temperature of the cold storage, even when far away. Thus, a remote monitoring system is required by the operator to control the temperature automatically [8]. One study designed a remote monitoring system for the temperature. The introduced system is intended to help the operator's facilities and the management of farm products. The method adopted to overcome the problem is the use of a diode thermal sensor. The output of the controller is connected to a relay, so the control method is on-off control. The detected temperature is transferred to the data collection device using serial communication [9]. Various challenges were encountered in optimizing the control, due to coupling. To decrease the adverse effect of coupling and increase the performance of the refrigeration system of the cold storage, a control strategy with dynamic coupling compensation was considered. On the basis of the requirements of the control system, first the dynamic model of the cold storage was established, and then the coupling between the components was considered [10]. A fuzzy controller with dynamic coupling compensation was designed to address the challenge. A self-tuning fuzzy controller can serve as the primary controller; therefore, an adaptive neural network was adopted to compensate for the dynamic coupling. In the end, the control strategy was applied to the refrigeration system of the cold storage. The simulations were performed under start-up conditions by changing the load and the degree of superheat. The simulation results verified the efficiency of fuzzy control with dynamic coupling compensation [11].
Our research relates to the real-time temperature monitoring of cold storage. To maintain shelf life, proper monitoring of temperature is required. The system is designed keeping in mind the optimal storage temperature ranges for standard food items: for potato 35°-40°F, for garlic 30°-32°F, and for apple 30°-40°F. Most apple varieties are best stored at or near 32°F. As the optimal temperature range for maximum shelf life is only 2-3°F wide in most cases, our system aims to achieve much higher precision with a much longer distance between the temperature probe and the display device. The main features of the research are as follows:
• Efficient monitoring of temperature without any fluctuations in the cold room.
• Remote monitoring from anywhere with great precision.

III. Proposed Methodology
This research was executed in different stages, from planning up to the development of the hardware components and the device. Different methods have been used to monitor and control the temperature of cold storage. However, more efficient, remote systems are the need of the time, to save industries from great losses and to provide fresh, quality products to the end customers. For most products, the optimum storage temperature range is so small that it covers only 2-3 degrees Fahrenheit. A temperature difference of 2 to 3 degrees Fahrenheit requires very precise measurement, and manual monitoring to this extent is quite difficult for anyone. A cold storage is a big storage room in which the temperature can differ from place to place: at the entrance the temperature is high compared to the area near the refrigerator. Therefore, the device requires a large number of Pt-100 sensors at different places to measure the average temperature of the room. A sensor that is placed far from the device has a lengthy wire, and the wire itself has a resistance that is a prominent cause of error in temperature measurement: the resistance the device takes into account in this case is not the resistance of the sensor alone, but also includes the resistance of the wire. Therefore, the resistance of the wire is a major problem for Pt-100 sensors. We have to design a circuit that cancels out the resistance of the wire.

A. Designed Circuit with INA 128
The resistance of the wire is the cause of error in the temperature measurements. To remove this error, the circuit must be designed using the following components: the Pt-100, the wire resistances, the resistance Ro, and two current sources. Two wires are attached to the negative terminal of the Pt-100 sensor, one of which contains a resistor. A circuit with two loops that forms a Wheatstone bridge is designed. R1 is the resistance of the Pt-100; this resistance varies with temperature, from 100 ohm to 101.74 ohm over 32 to 40 degrees Fahrenheit. The resistances R2, R3 and R4 are the resistances of the wire. Ro is a reference resistance, set to 99 ohms. The circuit contains two current sources of equal value, both set to 10 mA. It also contains an instrumentation amplifier to apply gain to the output voltage; the gain of the INA128 is set to 85 by setting the gain resistance to 595 ohms. The designed circuit diagram with the INA 128 is shown in Figure I.

Figure I. Designed circuit connected with INA 128

B. Equation of the circuit
By applying Kirchhoff's Voltage Law (KVL) on the first and the second loop, we have:

Applying KVL on the 1st loop:
V+ = I*R1 + I*R2 + I*R4    (1)
V+ = I(R1 + R2 + R4)    (2)

Applying KVL on the 2nd loop:
V- = I*R3 + I*R5 + I*R4    (3)
V- = I(R3 + R5 + R4)    (4)

The differential voltage then follows:
V = (V+) - (V-)    (5)
V = I(R1 + R2 + R4) - I(R3 + R5 + R4)    (6)

As the lengths of the wires for a single sensor are the same, we can write R = R2 = R3 = R4:
V = I(R1 + R + R) - I(R + R5 + R)    (7)
V = I(R1 - R5)    (8)

This is the final equation of our design. In this equation, the resistance Ro (Ro = R5) is subtracted from the resistance obtained from the Pt-100. The voltages V+ and V- are applied to the instrumentation amplifier (INA-128p): V+ is attached to pin 3 and V- to pin 2 of the INA-128p.

C. Simulation Results
The circuit was designed in the Proteus software to obtain simulated results. The simulation results at the output pin of the INA-128 are shown in Table I, listing the temperature values with the corresponding resistance of the Pt-100, the voltage difference at the input pins of the INA-128, and the voltage at the output pin of the INA-128.
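Equation (8) can be checked numerically against the simulation: with I = 10 mA, R5 = Ro = 99 ohm and a gain of 85, the following sketch (an added verification, not part of the original paper) reproduces the output voltages of Table I below; the 36°F row of the source table differs by roughly 0.2 mV, presumably a rounding artifact:

    #include <cstdio>

    int main() {
        const double I    = 0.010;   // both current sources, 10 mA
        const double Ro   = 99.0;    // reference resistance R5, ohms
        const double gain = 85.0;    // INA-128 gain (Rg = 595 ohm)
        // Pt-100 resistances from Table I, 32 F ... 40 F
        const double r1[] = {100.00, 100.22, 100.43, 100.65, 100.89,
                             101.09, 101.30, 101.52, 101.74};
        for (int t = 0; t < 9; ++t) {
            double v = I * (r1[t] - Ro);           // Eq. (8): V = I(R1 - R5)
            printf("%d F: %5.1f mV -> %.2f V\n",
                   32 + t, v * 1000.0, gain * v);  // e.g. 32 F: 10.0 mV -> 0.85 V
        }
        return 0;
    }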
Table I. Simulated results of the INA-128

Temperature (°F)   Resistance of PT100 (ohm)   Input voltage to INA128 (mV)   Output voltage (V)
32                 100.00                      10                             0.85
33                 100.22                      12.2                           1.03
34                 100.43                      14.3                           1.21
35                 100.65                      16.5                           1.40
36                 100.89                      18.7                           1.59
37                 101.09                      20.9                           1.77
38                 101.30                      23                             1.95
39                 101.52                      25.2                           2.14
40                 101.74                      27.4                           2.33

The simulated results show that the voltage at the output pin of the INA-128 ranges from 0.85 to 2.33 V, which must lie within the range of 0-5 V. This is achieved by setting the gain equal to 85.

D. Arduino Interfacing with the Proposed Design Circuit
The Arduino is programmed in the Arduino IDE software. Once the program is installed on the computer using the Arduino IDE, a USB cable is used to link the board with the computer. We then opened the Arduino IDE and chose the correct board by selecting Tools > Boards > Arduino/Genuino Uno, and chose the correct port by selecting Tools > Port. The Arduino Uno is programmed using the Arduino programming language, which is based on Wiring. We wrote the program for the research and, to get started with the Arduino Uno board and hardware, loaded the written program. When the code is loaded into the IDE, we clicked the 'upload' button on the top bar. Once the upload was finished, we observed the Arduino's built-in LED blinking. This is the design so far with the Arduino in it. The output pin of the INA has been connected to A0, and the Arduino is powered by a 5 V DC source, as shown in Figure II.

Figure III. Arduino connected with LCD

The complete circuit containing the Arduino and the LCD is shown in Figure IV.

Figure IV. Complete circuit connected with Arduino and LCD

When the resistance of the Pt-100 is set to 100 ohm, the temperature displayed on the LCD is 32, as can be seen in Figure V.
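The Arduino listing itself did not survive extraction; a minimal Arduino-style sketch consistent with the described chain (INA output on A0, gain 85, 10 mA excitation, Ro = 99 ohm, and a linear fit interpolated from Table I) might look like the following, where the LCD pin wiring is an assumption:

    // Hypothetical reconstruction -- pin choices and the linear fit are
    // interpolated from Table I; the authors' original listing is not shown.
    #include <LiquidCrystal.h>

    LiquidCrystal lcd(12, 11, 5, 4, 3, 2);  // RS, EN, D4..D7 (assumed wiring)

    const float GAIN = 85.0;   // INA-128 gain
    const float RO   = 99.0;   // reference resistance, ohms
    const float I_MA = 10.0;   // excitation current, mA

    void setup() {
      lcd.begin(16, 2);
    }

    void loop() {
      float vOut = analogRead(A0) * 5.0 / 1023.0;  // INA output, volts
      float mV   = vOut / GAIN * 1000.0;           // bridge voltage, mV
      float r1   = mV / I_MA + RO;                 // Pt-100 resistance, ohms
      // Linear fit from Table I: 100.00 ohm -> 32 F, 101.74 ohm -> 40 F
      float tempF = 32.0 + (r1 - 100.0) * (8.0 / 1.74);
      lcd.setCursor(0, 0);
      lcd.print("Temp: ");
      lcd.print(tempF, 1);
      lcd.print(" F");
      delay(1000);
    }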
When the resistance of the Pt-100 is set to 101.74 ohm, the temperature displayed on the LCD is 40, as shown in Figure VII.

Figure VIII. Initial PCB prototype

B. Prototype using Breadboard
After finding the faults in the old PCB, we started working on a breadboard because, due to the lockdown, we were not able to order a new PCB. The breadboard circuit is shown in Figure IX, and the final prototype for commercialization is presented in Figure X.

References
[1] Karim, A. B., Hasan, M. Z., Akanda, M., and Mallik, A., "Monitoring food storage humidity and temperature data using IoT," MOJ Food Processing & Technology, vol. 6, pp. 400-404, 2018.
[2] Ting, L., and Zeliang, L., "Temperature Control System of Cold Storage," International Conference on Electromechanical Control Technology and Transportation, 2015.
[3] V. C. Chandanashree, U. Prasanna Bhat, Prasad Kanade, K. M. Arjun, J. Gagandeep, and Rajeshwari M. Hegde, "TinyOS based WSN design for monitoring of cold storage warehouses using internet of things," International Conference on Microelectronic Devices, Circuits and Systems (ICMDCS), pp. 1-6, 2017.
[4] Ma, X., and Mao, R., "Fuzzy Control of Cold Storage Refrigeration System with Dynamic Coupling Compensation," Journal of Control Science and Engineering, pp. 1-7, 2018.
[5] Xu Xiaofeng and Zhang Xuelai, "Simulation and experimental investigation of a multi-temperature insulation box with phase change materials for cold storage," Journal of Food Engineering, vol. 292, p. 110286, 2021.
[6] Hamid Ikram, Adeel Javed, Mariam Mehmood, Musannif Shah, Majid Ali, and Adeel Waqas, "Techno-economic evaluation of a solar PV integrated refrigeration system for a cold storage facility," Sustainable Energy Technologies and Assessments, vol. 44, p. 101063, 2021.
[7] Torres-Sánchez, R., Martínez-Zafra, M. T., Castillejo, N., Guillamón-Frutos, A., and Artés-Hernández, F., "Real-Time Monitoring System for Shelf Life Estimation of Fruit and Vegetables," Sensors, vol. 20, p. 1860, 2020.
[8] R. Mishra, S. K. Chaulya, G. M. Prasad, S. K. Mandal, and G. Banerjee, "Design of a low cost, smart and stand-alone PV cold storage system using a domestic split air conditioner," Journal of Stored Products Research, vol. 89, p. 101720, 2020.
[9] H. Feng, W. Wang, B. Chen, and X. Zhang, "Evaluation on Frozen Shellfish Quality by Blockchain Based Multi-Sensors Monitoring and SVM Algorithm During Cold Storage," IEEE Access, vol. 8, pp. 54361-54370, 2020.
[10] Hina Afreen and Imran Sarwar Bajwa, "An IoT-Based Real-Time Intelligent Monitoring and Notification System of Cold Storage," IEEE Access, vol. 9, pp. 38236-38253, 2021.
[11] Yadav, Ravindra, "Remote Monitoring System for Cold Storage Warehouse using IOT," International Journal for Research in Applied Science and Engineering Technology, vol. 8, pp. 2810-2814, 2020.
Modeling and Load Flow Analysis of Electric Vehicle
Charging Stations in Power Distribution Systems
Mustafa NURMUHAMMED, Department of Electric and Energy, Malatya OIZ Vocational School, Inonu University, Malatya, Turkey, mustafa.nurmuhammed@inonu.edu.tr
Ozan AKDAĞ, Turkish Electricity Transmission, Malatya, Turkey, ozan.akdag@live.com
Teoman KARADAĞ, Department of Electric and Electronics Engineering, Inonu University, Malatya, Turkey, teoman.karadag@inonu.edu.tr
Abstract—As electric vehicles become part of our lives all over the world, charging them in an efficient way gains importance, since the energy they draw from the distribution network has increased dramatically. Unplanned and uncontrolled charging could cause problems such as transformer overloading, voltage imbalances and power outages in the power distribution network. This paper proposes a simulation of charging electric vehicles, comparing power distribution parameters under no load, full load, and the charging of the maximum number of vehicles that the system supports. The integration of electric vehicle charging stations into the distribution network is analyzed using an 11-bus test system. Modeling and load flow analysis are performed and a new test system is proposed. Next, the effects of controlled and uncontrolled charging in the 11-bus test system modeled in this study are discussed.

Keywords—Electric Vehicle Charging Systems, Plug-In Electric Vehicles, Distribution Network, Power Analysis, Modeling and Load Flow Analysis

I. Introduction
Electric vehicles are coming to dominate the transportation industry. According to one forecast, electric vehicles will account for more than 25% of all vehicles within the coming ten years [1]. In general, electric vehicles provide better acceleration, more economy per kilometer driven, lower maintenance costs, environmental benefits such as the ability to acquire energy from renewable resources, and other direct or indirect advantages. These advantages have led to a remarkable rise in electric vehicle awareness and sales around the world. Electric vehicle numbers expanded at an average yearly rate of 60% over the 2014-19 period, totaling 7.2 million cars in 2019 [2]. Everyday use is increasing faster than ever.

Electric vehicles can be charged via a relatively slow home AC (Alternating Current) charger, AC charging stations, or fast DC (Direct Current) charging stations. Most home chargers charge at a rate under 3 kW. AC charging stations provide faster charging speeds, but the onboard chargers of most electric vehicles are designed to charge at under 20 kW. DC charging rates fluctuate the most among new vehicles: they commonly start at 50 kW and can go up to 250 kW. The overall charging rate of any kind of charging is limited to the lower of two rates: the maximum rate that the electric vehicle supports and the maximum rate that the charging station supports.

Limited range is becoming less of a concern for electric vehicle owners when planning long-distance journeys, as more ultra-fast DC chargers are installed at shorter intervals at strategic locations and on highways. Nowadays, 250 kW and 350 kW ultra-fast DC chargers are very common in countries where electric vehicles have been adopted at large scale. On the other hand, most electric vehicles have yet to reach those high-speed charging rates due to design constraints of their internal power electronics.

As the number of electric vehicles, and of cars that support high-speed charging, increases, a considerable amount of load will be experienced by the power distribution system. This load is considered a controlled or uncontrolled load depending on operability. Controlled load is a process of charging or discharging within certain limits or a plan. Uncontrolled load is charging or discharging regardless of any predefined plan, preparation or agreement. The charging of electric vehicles should be planned and rolled out in a controlled manner so that power distribution system components are not overloaded, power quality is not compromised, and the system is free of power outages. In addition, voltage and frequency deviation, harmonics and three-phase voltage unbalance are some of the parameters that affect the quality of the energy in the power distribution network.

This topic has been researched in a number of studies. One of them utilizes probabilistic analyses in the power grid to demonstrate charging effects on the system [3]. Another observes the effect of charging station loads on reliability indices [4]. In another study, the impact of charging electric vehicles on the distribution system is studied using a MATLAB/Simulink simulation [5]. In addition, a study group published a report demonstrating the specifics of electric vehicle charging effects with probable scenarios and foresights [6]. One research effort examined the line status and transformer usage, and models were suggested [7].

There are research studies on the reliability of the distribution system [8], the power quality of the high-voltage grid [9][10], voltage deviation [11][12], and voltage unbalance [13]. In another study, power losses in charging and discharging processes are investigated [14]. One research effort examines mitigating instantaneous load increases [15]. Others propose solutions that might reduce the effect of charging by scheduling charge sessions and using charging algorithms [10], [16]. One study proposes a smart charging method that optimizes the chargeable power via short-term load forecasting [17]. In addition, investing in the power distribution system is always a choice; however, the associated cost is high, and one research study shows that the top twenty percent of load-serving capacity is efficiently utilized in less than five percent of the hours of the load duration curve and serves less than one percent of the electricity demand in the system [18]. Figure I shows the reserve capacity, the rarely used peaking capacity and the underutilized baseload capacity. Therefore, investing solely in power network hardware may not be the most efficient solution to protect the power distribution network when many electric vehicles charge simultaneously.
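The lower-of-two-limits rule for the charging rate described earlier in this introduction is easy to state in code; a trivial sketch follows, where the 22 kW station and 120 kW vehicle figures are illustrative assumptions and the remaining numbers come from the text:

    #include <algorithm>
    #include <cstdio>

    // Effective charging rate is the lower of vehicle and station limits.
    double effectiveKw(double vehicleMaxKw, double stationMaxKw) {
        return std::min(vehicleMaxKw, stationMaxKw);
    }

    int main() {
        printf("home AC: %.0f kW\n", effectiveKw(20.0, 3.0));    // 3, home charger caps
        printf("AC fast: %.0f kW\n", effectiveKw(20.0, 22.0));   // 20, onboard charger caps
        printf("DC fast: %.0f kW\n", effectiveKw(120.0, 250.0)); // 120, vehicle caps
        return 0;
    }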
Figure I. Utility load duration graph [18]

Besides the mentioned studies, there are many research studies on the impact of electric vehicles on distribution networks [17][19][11][20][21].

... buses are 400 V. The line and load data of this distribution system are given in Tables I and II, respectively.

Table I. 11-bus distribution system line data

Bus no     Name      R (ohm)   X (ohm)   B (µS)
Bus 2-3    Line 1    0.1051    0.087     373.8
Bus 2-3    Line 2    0.1051    0.087     373.8
Bus 2-4    Line 3    0.052     0.043     186.9
Bus 2-4    Line 4    0.052     0.043     186.9
Bus 3-5    Line 5    0.1051    0.087     373.8
Bus 3-5    Line 6    0.1051    0.087     373.8
Bus 3-6    Line 7    0.063     0.052     224.2
Bus 3-6    Line 8    0.063     0.052     224.2
Bus 3-10   Line 9    0.026     0.021     93.4
Bus 7-10   Line 10   0.1051    0.087     373.8
Bus 7-10   Line 11   0.1051    0.087     373.8
Bus 7-9    Line 12   0.031     0.026     112.1
Bus 7-9    Line 13   0.031     0.026     112.1

Figure II. Single line diagram of the sample 11-bus distribution system
Figure III. Parameter interface of the load model

Table II shows the 11-bus distribution system load data.

Table II. Load data of the 11-bus distribution system

Load No   P (MW)   Q (MVAR)
1         0.245    0.001
2         0.265    0.0021
3         0.245    0.001
4         0.245    0.001
5         0.265    0.0021
6         0.245    0.001
7         0.245    0.001
8         0.265    0.0021
9         0.245    0.001
10        1.02     0.0021
11        0.18     0.0021

Table V. Total status (uncontrolled charging)

              MW     MVAR
Generation    6.32   1.01
Load          5.46   0.02
Grid losses   0.86   0.98

When a total of 80 charging stations were commissioned, the first three of the distribution transformers were overloaded. Likewise, Lines 3 and 4, Lines 7, 8 and 9, and Lines 12 and 13 were also overloaded. When the other transformer and line loads are examined, it is seen that some of them are at the limit and some are below the overload limit. This may cause damage to the equipment in the power grid and unnecessary power outages. In power systems, lines and transformers can be loaded to a maximum of 100-110%. Considering the general condition of the power system, controlled charging can be kept within the limits that the power system will withstand. The results of the load flow analysis for the controlled charging case are shown in Tables VII-VIII.
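As a rough plausibility check, the Table V load total can be approximated by adding an assumed per-station draw to the Table II base loads; about 25 kW per station makes the 80-station case line up with the reported 5.46 MW. The per-station rating is inferred here, not stated in the paper:

    #include <cstdio>

    int main() {
        // Base feeder loads from Table II (MW).
        const double p[] = {0.245, 0.265, 0.245, 0.245, 0.265, 0.245,
                            0.245, 0.265, 0.245, 1.020, 0.180};
        double base = 0.0;
        for (double x : p) base += x;          // ~3.465 MW

        // Assumed per-station draw (inferred so the totals match Table V).
        const double perStationMw = 0.025;
        int n = 80;
        printf("base %.3f MW + %d stations -> %.2f MW\n",
               base, n, base + n * perStationMw);   // ~5.47 MW vs. 5.46 reported
        return 0;
    }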
Charge3 2 [4] D. Güneş, İ. G. Tekdemir, M. Ş. Karaarslan, and B.
Alboyacı, “Elektrikli araç şarj istasyonu yüklerinin
Charge5 5
güvenilirlik indisleri üzerine etkilerinin incelenmesi,” J.
Charge6 5 Fac. Eng. Archit. Gazi Univ., 2018, doi:
https://doi.or./10.17341/gazimmfd.416408.
Charge7 10
[5] B. Yagcitekin, M. Uzunoglu, and A. Karakas, “Elektrikli
Charge8 2 Araçların Şarjı ve Dağıtım Sistemi Üzerine Etkileri,” pp.
316–320.
III. Conclusion

In this study, the primary goal was to observe the effects of charging electric vehicles on the power distribution system. In order to show the effects of electric vehicle charging stations, an 11-bus distribution system was modeled using DigSilent software. The simulation output helps manage the integration of electric vehicles, which are spreading rapidly in various countries, into the existing power distribution networks. The differences between electric vehicles being charged under controlled and uncontrolled conditions have been studied. In addition, the adverse impacts of uncontrolled charging are shown and discussed. Furthermore, according to the simulation results obtained from uncontrolled charging, the number of electric vehicles that can be safely charged under a controlled condition can be determined.

This simulation study can help reduce the effects of electric vehicle charging stations by projecting their possible impact based on a predefined load scenario. In addition, a relatively small 11-bus distribution system is tested and presented with data. Simulation studies allow analyzing the system without taking the risk of applying changes to real power distribution systems.

The possible impact of integrating electric vehicle charging stations that simultaneously charge 80 vehicles into an 11-bus distribution network is shown in detail. According to the simulation results, the number of charging stations that can be safely integrated into the distribution system is calculated to be 34. Therefore, a power distribution network can be protected from technical problems and lack of infrastructure by estimating the maximum number of cars to be allowed at the charging stations or by limiting the energy that can be drawn from the system.

In the near future, it is expected that more electric-powered vehicles, especially electric fleet vehicles, will charge simultaneously through the power distribution network. Modeling and simulation studies on this research topic will play a vital role in planning and designing power distribution grids. Future work can address mitigating the possible effects by scheduling electric vehicles at charging stations, distributing power usage, or other means of keeping the power distribution system safe and stable.
References

[1] G. Giordano, "Electric vehicles," Manuf. Eng., vol. 161, no. 3, pp. 50-58, 2018. [Online]. Available: https://www2.deloitte.com/uk/en/insights/focus/future-of-mobility/electric-vehicle-trends-2030.html
[2] R. Irle, "Global BEV & PHEV Sales for 2019," EV-volumes.com, 2020. https://www.ev-volumes.com/country/total-world-plug-in-vehicle-volumes/ (accessed Jul. 07, 2020).
[3] I. G. Tekdemir, B. Alboyaci, D. Gunes, and M. Sengul, "A probabilistic approach for evaluation of electric vehicles' effects on distribution systems," 2017 4th Int. Conf. Electr. Electron. Eng. ICEEE 2017, pp. 143-147, 2017, doi: 10.1109/ICEEE2.2017.7935809.
[4] D. Güneş, İ. G. Tekdemir, M. Ş. Karaarslan, and B. Alboyacı, "Elektrikli araç şarj istasyonu yüklerinin güvenilirlik indisleri üzerine etkilerinin incelenmesi," J. Fac. Eng. Archit. Gazi Univ., 2018, doi: 10.17341/gazimmfd.416408.
[5] B. Yagcitekin, M. Uzunoglu, and A. Karakas, "Elektrikli Araçların Şarjı ve Dağıtım Sistemi Üzerine Etkileri," pp. 316-320.
[6] D. Saygın, O. Tör, S. Teimourzadeh, M. Koç, J. Hildermeier, and C. Kolokathis, Türkiye Ulaştırma Sektörünün Dönüşümü: Elektrikli Araçların Türkiye Dağıtım Şebekesine Etkileri. 2019.
[7] M. Kiliçarslan Ouach and E. Çam, "Investigation on the electrical vehicles effects on the electrical power grid," El-Cezeri J. Sci. Eng., vol. 8, no. 1, pp. 21-35, 2021, doi: 10.31202/ecjse.753493.
[8] H. R. Galiveeti, A. K. Goswami, and N. B. Dev Choudhury, "Impact of plug-in electric vehicles and distributed generation on reliability of distribution systems," Eng. Sci. Technol. an Int. J., vol. 21, no. 1, pp. 50-59, 2018, doi: 10.1016/j.jestch.2018.01.005.
[9] L. S. Zhao and H. M. Yuan, "The impact of quick charge on power quality of high-voltage grid," IOP Conf. Ser. Mater. Sci. Eng., vol. 366, no. 1, 2018, doi: 10.1088/1757-899X/366/1/012033.
[10] M. Singh, I. Kar, and P. Kumar, "Influence of EV on grid power quality and optimizing the charging schedule to mitigate voltage imbalance and reduce power loss," Proc. EPE-PEMC 2010 - 14th Int. Power Electron. Motion Control Conf., pp. 196-203, 2010, doi: 10.1109/EPEPEMC.2010.5606657.
[11] K. Clement-Nyns, "Impact of plug-in hybrid electric vehicles on electricity systems," 2010.
[12] G. Ma, L. Jiang, Y. Chen, C. Dai, and R. Ju, "Study on the impact of electric vehicle charging load on nodal voltage deviation," Arch. Electr. Eng., vol. 66, no. 3, pp. 495-505, 2017, doi: 10.1515/aee-2017-0037.
[13] S. Panich and J. G. Singh, "Impact of plug-in electric vehicles on voltage unbalance in distribution systems," Int. J. Eng. Sci. Technol., vol. 7, no. 3, p. 76, 2016, doi: 10.4314/ijest.v7i3.10s.
[14] E. Apostolaki-Iosifidou, P. Codani, and W. Kempton, "Measurement of power loss during electric vehicle charging and discharging," Energy, vol. 127, pp. 730-742, 2017, doi: 10.1016/j.energy.2017.03.015.
[15] T. Dragičević, S. Sučić, J. C. Vasquez, and J. M. Guerrero, "Flywheel-based distributed bus signalling strategy for the public fast charging station," IEEE Trans. Smart Grid, vol. 5, no. 6, pp. 2825-2835, 2014, doi: 10.1109/TSG.2014.2325963.
[16] J. De Hoog et al., "Electric vehicle charging and grid constraints: Comparing distributed and centralized approaches," IEEE Power Energy Soc. Gen. Meet., 2013, doi: 10.1109/PESMG.2013.6672222.
[17] M. R. Poursistani, M. Abedi, N. Hajilu, and G. B. Gharehpetian, "Impacts of plug-in electric vehicles smart charging on distribution networks," 2014 Int. Congr. Technol. Commun. Knowledge, ICTCK 2014, pp. 1-5, 2015, doi: 10.1109/ICTCK.2014.7033499.
[18] P. Denholm and W. Short, "An Evaluation of Utility System Impacts and Benefits of Optimally Dispatched Plug-In Hybrid Electric Vehicles," NREL Rep. No. TP-620-40293, Oct. 2006, p. 41. [Online]. Available: http://www.nrel.gov/docs/fy07osti/40293.pdf
[19] R. C. Green, L. Wang, and M. Alam, "The impact of plug-in hybrid electric vehicles on distribution networks: A review and outlook," Renew. Sustain. Energy Rev., vol. 15, no. 1, pp. 544-553, 2011, doi: 10.1016/j.rser.2010.08.015.
[20] K. Clement-Nyns, E. Haesen, and J. Driesen, "Analysis of the impact of plug-in hybrid electric vehicles on residential distribution grids by using quadratic and dynamic programming," World Electr. Veh. J., vol. 3, no. 2, pp. 214-224, 2009, doi: 10.3390/wevj3020214.
[21] J. Balcells and J. García, "Impact of plug-in electric vehicles on the supply grid," 2010 IEEE Veh. Power Propuls. Conf. VPPC 2010, pp. 5-9, 2010, doi: 10.1109/VPPC.2010.5729217.
[22] "PowerFactory." [Online]. Available: https://www.digsilent.de/en/downloads.html
[23] O. Akdag, F. Okumus, A. F. Kocamaz, and C. Yeroglu, "Fractional Order Darwinian PSO with Constraint Threshold for Load Flow Optimization of Energy Transmission System," Gazi Univ. J. Sci., vol. 31, no. 3, pp. 831-844, 2018.
Obstacle Avoiding Capabilities for The Drone by Area
Segmentation and Artificial Neural Network
Mohammed Majid Abdulrazzaq, Department of Computer Engineering, Karabuk University, moh.abdulrazzaq9@gmail.com
Mustafa Mohammed Alhassow, Department of Electrical and Computer Engineering, Altinbas University, Mustafa.alshakhe@gmail.com
Abdullah Ahmed Al-dulaimi, Department of Electrical Electronics Engineering, Karabuk University, Abdalluhahmed1993@gmail.com
Abstract— Obstacle avoidance in unmanned aerial vehicles is an important task to ensure the mobility and safety of the vehicle. It has attracted much attention in recent years, and today most drones are controlled remotely using some wireless technology, such as radio frequency through a remote control, a mobile phone or a tablet, making the drone always depend on a user who gives instructions on what to do and who acts accordingly. The biggest challenge is the development of autonomous air agents that complete missions without any human intervention. In this paper we propose an area segmentation approach that segments the area into smaller areas and classifies those areas as safe or unsafe, which allows the drone to pass safely through the safe areas and avoid the unsafe or obstacle areas. In comparison with other related works, our results show better performance with respect to time, shape, and path length.

Keywords— UAV, obstacle avoidance, segmentation, wireless, ANN, classification

I. Introduction

Automated robots as well as vehicles have already been used to complete missions in risky conditions, for example, tasks in thermal power stations, the investigation of Mars, and the observation of opponent forces in the war zone. Apart from these uses and implementations, there is the advancement of higher-intelligence automated aerial vehicles, shortly termed UAVs, for upcoming or future battle in order to decrease manual setbacks. One of the primary difficulties for intelligent unmanned aerial vehicle advancement is path planning in adversarial conditions. Path planning has been studied extensively in the field of robotics. In these applications, planning a path means discovering a collision-free way in an environment where obstacles, either static or dynamic, occur. Early research focused on both holonomic and non-holonomic systems in kinematic movement problems without considering system dynamics, together with static obstacles. Despite numerous outward contrasts, the majority of those strategies depend on a few conventional methodologies, including roadmaps, cell decomposition, and potential fields.

When moving obstacles are associated with the planning problem, the time dimension is added to the configuration space [1], and the problem is named motion planning or trajectory planning rather than path planning. Research has recently addressed motion planning that considers dynamic requirements, termed kinodynamic planning [3]. All of the previously mentioned path or motion planning strategies center around obstacle avoidance issues [2]. Zong et al. [4] proposed an obstacle avoidance scheme for space robots based on mixed-integer predictive control; Zhao et al. [5] proposed a cooperative scheme with transfer learning in a flocking swarm of UAVs for obstacle avoidance. Trajectory planning basically refers to the planning of an ideal flight path of the airplane or aircraft between the starting point and the end point, considering factors such as fuel utilization, mobility, arrival time, flight region, and danger level. Trajectory planning is considered a significant assurance for the successful completion of a UAV mission and one of the critical steps in mission planning frameworks. Because of technical restrictions, trajectory planning has depended heavily on manual work by experts. With the continuous development and improvement of avoidance and control frameworks and technology, the precision requirements of UAV trajectory planning are becoming increasingly demanding, and manual path planning has become increasingly unable to meet the requirements.

Zhang et al. [6] proposed trajectory tracking for mobile robots in order to avoid static and dynamic obstacles in the robot's path; Padhy et al. [12] proposed a feature extraction scheme using the front-view camera of the UAV in order to detect and avoid obstacles. Mendoza-Soto et al. [13] proposed an obstacle prediction scheme to predict the obstacles in the UAV's trajectory for efficient obstacle avoidance. With the quick advancement of communication technology, various techniques for acquiring flight environment data have emerged continuously, making the data available to trajectory planners increasingly plentiful. To improve flight precision, the planned trajectory should fulfill the requirements of terrain following, terrain avoidance, and threat avoidance while satisfying the performance limitations of the aircraft.

The idea of fully automated systems is the absence of a human factor and thus the prevention of human losses in the event of accidents, or much faster decision-making. Despite the fact that almost fully automated systems already exist and their functionality is very accurate, it will take some time before there will be fully automated and trusted robots or vehicles.

In this paper, we train a deep learning algorithm to classify area segments into safe and unsafe areas, also creating a safe path that leads the drone from the start point to its goal. The rest of the paper is organized as follows: Section 2 explains our proposed scheme, Section 3 shows the simulation and results for our model, and Section 4 concludes our work.
II. Proposed Scheme and Properties

To get a drone to make decisions, first of all, it must be considered that the first problem occurs when it does not have knowledge of the environment surrounding it, implying that the air vehicle does not process any information in this regard. All the information generated by the sensors it carries is sent to the control station or application for interpretation. Depending on this information, the environment is classified in order to create the safe path, as shown in the figure below:

Figure I. Proposed obstacle avoidance scheme.

There is no decision support system independent of the ground control station that would make the drone adopt a proactive control model. To make this possible, visual perception techniques have to be integrated into the control of UAVs in order to increase their navigation and direction skills. We do not simulate these perceptions; we merely assume that in our simulation the sensing device connected to the drone makes it possible for the drone to be fully aware of its surroundings.

A. K-Means Algorithm

K-means clustering is one of the simplest algorithm variations: easy to implement and useful for the segmentation of simple images. When it comes to image segmentation, the underlying concept is image vectorization: the image is first color-quantized, reducing the available color palette to a finite number of colors. During the partitioning, the image becomes multiple sections, which makes it easier to apply various image processing algorithms. The next task of our image segmentation is to locate objects within the original image, as well as to find the boundaries that define the areas of the located objects. The focus of this particular technique is to locate pixels with the same hue parameter value. Since these pixels are grouped into multiple labeled clusters, they are commonly referred to as segments.

The Euclidean distance formula has a somewhat more general representation as follows:

L^2 = \sum_{i=1}^{n} (x_i - y_i)^2    (1)

where L is the Euclidean distance, x and y are the two vectors being considered, and n is the number of dimensions (i.e., vector components; here n = 2). In this instance, only two- or three-dimensional vectors are involved, so the formula simplifies. This variation of the Euclidean distance formula is used to calculate the distance between two pixels with specific coordinates.
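To make Eq. (1) concrete, the following NumPy sketch runs a plain K-means loop over 2-D points (e.g., obstacle coordinates), assigning each point to its nearest centroid by the squared Euclidean distance. It is a generic illustration under our own assumptions, not the authors' implementation:

    import numpy as np

    def kmeans(points, k, iters=50, seed=0):
        # Plain K-means using the squared Euclidean distance of Eq. (1).
        rng = np.random.default_rng(seed)
        centroids = points[rng.choice(len(points), size=k,
                                      replace=False)].astype(float)
        for _ in range(iters):
            # d2[p, c] = sum_i (x_i - y_i)^2 for point p and centroid c
            d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for c in range(k):
                if np.any(labels == c):
                    centroids[c] = points[labels == c].mean(axis=0)
        return labels, centroids

    # Toy example: cluster 2-D coordinates around three hypothetical obstacles.
    rng = np.random.default_rng(1)
    pts = np.vstack([rng.normal(loc=m, size=(40, 2))
                     for m in ((0, 0), (8, 8), (0, 9))])
    labels, cents = kmeans(pts, k=3)
    print(cents.round(2))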
Figure VI. 3D environment set up.

Then we segment the area of the 3D space into smaller areas using the K-means algorithm. The K-means takes into consideration the distance and the similarities between the segments; the obstacle, or the greater concentration of obstacles in an area, becomes the center of the cluster (segment), as shown in the figure above.

Figure VIII. Low MSE rate obtained by increasing the number of iterations.

The velocity planning approach is based on deriving speed limits, limiting longitudinal acceleration, and limiting side acceleration on curved segments to allow the UAV to fly with better control. By implementing robust path-planning
control, we are able to deliver precise path following regardless of uncertainties and variations in speed. On the other hand, our resulting trajectory is 36.9 m with a time of 2.7 s. In our work, we have produced a speed profile that shows the changes in speed while avoiding the obstacles in the environment. Table I below shows the results of our work in comparison with the other related algorithms.
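The velocity planning idea above can be sketched as a curvature-capped speed limit followed by forward/backward passes that bound longitudinal acceleration. This is a generic textbook formulation with assumed parameter values (v_max, a_lon, a_lat and the segment data are our own placeholders), not the exact planner used in this paper:

    import numpy as np

    def speed_profile(seg_len, curvature, v_max=10.0, a_lon=2.0, a_lat=3.0):
        # Lateral-acceleration cap on curves: v <= sqrt(a_lat / |kappa|).
        kappa = np.maximum(np.abs(curvature), 1e-9)
        v = np.minimum(v_max, np.sqrt(a_lat / kappa))
        for i in range(1, len(v)):                  # forward pass: accel limit
            v[i] = min(v[i], np.sqrt(v[i - 1] ** 2 + 2 * a_lon * seg_len[i - 1]))
        for i in range(len(v) - 2, -1, -1):         # backward pass: decel limit
            v[i] = min(v[i], np.sqrt(v[i + 1] ** 2 + 2 * a_lon * seg_len[i]))
        return v

    lengths = np.full(8, 5.0)                            # hypothetical 5 m segments
    kappa = np.array([0, 0, 0.2, 0.3, 0.2, 0, 0, 0.4])   # curvature in 1/m
    print(speed_profile(lengths, kappa).round(2))        # slower on sharp curves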
Table I. Result comparison with respect to trajectory, accuracy, and time

Mode                        Dijkstra   RRT       A*       Proposed
Resulting trajectory [m]    -          37.1015   -        36.9609
Accuracy                    -          95%       -        98.6%
Time [s]                    3.02996    6.9513    3.2781   2.7561

The idea behind combining the K-means and the ANN is to predict a suitable path that is able to avoid the obstacles until the drone reaches its goal. From the results above we can conclude that our work is better in terms of path length and speed. The problem with RRT is that its handling is not smooth and its path is highly random, so the resulting path is not suitable. Comparing our work with the related work shows that we are able to avoid the obstacles as fast as possible, whereas working only with RRT algorithms may cause many problems regarding time, speed and avoidance capability.

Figure IX. Speed profile path planning.

Fig. IX shows the performance of the drone moving from the start point trying to reach its goal. As we can see from the figure, the speed is highest at the beginning of the movement, which means the drone is moving along the path; after a while the speed starts decreasing because it is avoiding an obstacle. The speed keeps increasing and decreasing depending on the obstacles in the environment until the drone reaches its goal as safely as possible.

IV. Conclusions

In this paper, we presented an approach to guide a drone to fly through a 3D space area. The approach uses a K-means algorithm to segment the known area, a convolutional neural network to extract the features of the area segments, and an ANN to classify these segments. Once the segments are defined and labeled, the drone can easily pass through the safe segments and avoid the unsafe segments. We evaluated our scheme using the MSE index and found that our approach has a low MSE which decreases with each iteration of ANN training; also, the speed keeps increasing and decreasing depending on the obstacles in the environment until the drone reaches its goal.

In the future, we propose applying this scheme to a real-life drone with sensing devices that sense the surrounding area in real time to guide the drone in uncharted areas.

References

[1] J. Barraquand and J.-C. Latombe, "Robot motion planning: A distributed representation approach," The International Journal of Robotics Research, vol. 10, no. 6, pp. 628-649, 1991.
[2] M. Mohammed, O. Ata, and D. Cagdas, "Nonholonomic path planning for a mobile robot based on Voronoi and Q-Learning algorithm," 1st International Conference on Computing and Machine Intelligence (ICMI), pp. 236-239, 2021.
[3] M. Erdmann and T. Lozano-Perez, "On multiple moving objects," Algorithmica, vol. 2, no. 1, pp. 477-521, 1987.
[4] S. Karaman and E. Frazzoli, "Sampling-based algorithms for optimal motion planning," The International Journal of Robotics Research, vol. 30, no. 7, pp. 846-894, 2011.
[5] L. Zong, J. Luo, M. Wang, and J. Yuan, "Obstacle avoidance handling and mixed integer predictive control for space robots," Adv. Sp. Res., vol. 61, no. 8, pp. 1997-2009, 2018, doi: 10.1016/j.asr.2018.01.025.
[6] W. Zhao, H. Chu, M. Zhang, T. Sun, and L. Guo, "Flocking Control of Fixed-Wing UAVs with Cooperative Obstacle Avoidance Capability," IEEE Access, vol. 7, pp. 17798-17808, 2019, doi: 10.1109/ACCESS.2019.2895643.
[7] J. Zhang, S. Zhang, and R. Gao, "Discrete-time predictive trajectory tracking control for nonholonomic mobile robots with obstacle avoidance," Int. J. Adv. Robot. Syst., vol. 16, no. 5, pp. 1-11, 2019, doi: 10.1177/1729881419877316.
[8] Y. Yu, X. Chen, Z. Lu, F. Li, and B. Zhang, "Obstacle avoidance behavior of swarm robots based on aggregation and disaggregation method," Simulation, vol. 93, no. 11, pp. 885-898, 2017, doi: 10.1177/0037549717711281.
[9] P. Wu et al., "Autonomous obstacle avoidance of an unmanned surface vehicle based on cooperative manoeuvring," Ind. Rob., vol. 44, no. 1, pp. 64-74, 2017, doi: 10.1108/IR-04-2016-0127.
[10] P. Wang, S. Gao, L. Li, B. Sun, and S. Cheng, "Obstacle avoidance path planning design for autonomous driving vehicles based on an improved artificial potential field algorithm," Energies, vol. 12, no. 12, 2019, doi: 10.3390/en12122342.
[11] L. Song, B. Y. Su, C. Z. Dong, D. W. Shen, E. Z. Xiang, and F. P. Mao, "A two-level dynamic obstacle avoidance algorithm for unmanned surface vehicles," Ocean Eng., vol. 170, pp. 351-360, 2018, doi: 10.1016/j.oceaneng.2018.10.008.
[12] J. Seo, Y. Kim, S. Kim, and A. Tsourdos, "Collision Avoidance Strategies for Unmanned Aerial Vehicles in Formation Flight," IEEE Trans. Aerosp. Electron. Syst., vol. 53, no. 6, pp. 2718-2734, 2017, doi: 10.1109/TAES.2017.2714898.
[13] R. P. Padhy, S. K. Choudhury, P. K. Sa, and S. Bakshi, "Obstacle Avoidance for Unmanned Aerial Vehicles: Using Visual Features in Unknown Environments," IEEE Consum. Electron. Mag., vol. 8, no. 3, pp. 74-80, 2019, doi: 10.1109/MCE.2019.2892280.
[14] J. L. Mendoza-Soto, L. Alvarez-Icaza, and H. Rodríguez-Cortés, "Constrained generalized predictive control for obstacle avoidance in a quadcopter," Robotica, vol. 36, no. 9, pp. 1363-1385, 2018, doi: 10.1017/S026357471800036X.
[15] X. Ma and A. Lee, "Self-adaptive obstacle avoidance fuzzy system of mobile robots," J. Intell. Fuzzy Syst., vol. 35, no. 4, pp. 4399-4409, 2018, doi: 10.3233/JIFS-169759.
[16] Z. Lin, L. Castano, E. Mortimer, and H. Xu, "Fast 3D Collision Avoidance Algorithm for Fixed Wing UAS," J. Intell. Robot. Syst. Theory Appl., vol. 97, no. 3-4, pp. 577-604, 2020, doi: 10.1007/s10846-019-01037-7.
[17] J. Li, J. Sun, and G. Chen, "A multi-switching tracking control scheme for autonomous mobile robot in unknown obstacle environments," Electron., vol. 9, no. 1, 2020, doi: 10.3390/electronics9010042.
[18] L. Hwang, H. M. Wu, and J. Y. Lai, "On-Line Obstacle Detection, Avoidance, and Mapping of an Outdoor Quadrotor Using EKF-Based Fuzzy Tracking Incremental Control," IEEE Access, vol. 7, pp. 160203-160216, 2019, doi: 10.1109/ACCESS.2019.2950324.
[19] F. Belkhouche and B. Bendjilali, "Reactive path planning for 3-D autonomous vehicles," IEEE Trans. Control Syst. Technol., vol. 20, no. 1, pp. 249-256, 2012, doi: 10.1109/TCST.2011.2111372.
[20] D. Zhang, Y. Xu, and X. Yao, "An improved path planning algorithm for unmanned aerial vehicle based on RRT-Connect," 2018 37th Chinese Control Conference (CCC), IEEE, 2018.
[21] Z. He and L. Zhao, "The comparison of four UAV path planning algorithms based on geometry search algorithm," 2017 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), vol. 2, IEEE, 2017.
Joint User Selection and Base Station Assignment
Strategy in Smart Grid
Muhammad Fawad Khan, Dept. of Computer Science and Engineering, Kyungpook National University, Daegu, South Korea, fawadkhan896@gmail.com
Muhammad Azam, School of Energy and Power Engineering, Jiangsu University, Zhenjiang, China, azamlyh@gmail.com
Ashfaq Ahmed, Dept. of Electrical Engg. and CS, Khalifa University, Abu Dhabi, UAE, ashfaq2419@gmail.com
I. INTRODUCTION
Smart grids are the advanced form of conventional electrical grid systems, in which electricity is supplied in a controlled manner [1]. There is a communication link between power generation, transmission and distribution that helps in improving the overall performance of grid systems. It is that communication, or information exchange, that converts the typical grid system into a smarter one. Briefly, a smart grid is an automated electrical grid system that is more reliable, secure and efficient compared to typical electrical grids. It supplies electricity according to the consumer's demand in order to utilize the energy efficiently. Information and communication technology plays a pivotal role among the major stakeholders of a smart grid system, so it can be termed the backbone, or heart, of the smart grid system. As shown in Fig. 1, the scenario would change completely by simply removing the block of information and communication technology, and the rest of the blocks would portray the true picture of conventional electrical grid systems. For a successful smart grid system, smart metering, energy management for both supplier and consumer, as well as advanced communication methods and infrastructure are the core requirements [2], [3]. The emerging smart system relies heavily on the communication network. Usually a wireless communication network is preferred over a wired link due to advantages like flexibility, ease of deployment, cost reduction and greater access [4]. A smart grid network comprises sensors, smart meters, a proper data management and monitoring system, and various alarm, notification, alert and indication mechanisms. Keeping in mind the architecture of the smart grid, the communication network can be categorized into short- and long-range communications. A short-range communication network is just like a personal area network and is termed a Home Area Network (HAN).
It includes interfacing and communication between sensors, smart meters and devices deployed within the home premises. Bluetooth, WiFi, Zigbee, 6LoWPAN and Z-Wave are some of the suitable communication technologies for HANs.

Neighborhood Area Networks (NANs) connect HAN devices to the gateway network. NANs are long-range communication networks and are of great importance in the communication architecture of smart grid systems. There exist numerous options of communication technologies for NANs, i.e., satellite communication, WiMAX and cellular networks. Among these options, cellular networks are preferred for NANs due to the high maturity of the network, dedicated frequency bands, high uplink (UL) and downlink (DL) data rates, lower latency, reliability, secure communication, ease of deployment and ubiquitous coverage [5].

The evolution of cellular networks from 1G to 4G and the exponential growth of devices, sensors and users are true witnesses of the drastic technological advancement in cellular networks. The objectives of 5G target 1000 times higher spectral efficiency, controlled privacy and reliable communication with 10 times lower energy consumption. Heterogeneous networks, mmWave communication, D2D communication, massive MIMO and energy-aware networks are considered state-of-the-art technologies for 5G networks. Still, user association and BS assignment are of great importance for improving the overall performance of cellular technology in next-generation networks. User association, or BS assignment, is the decision for a particular user to connect to a specific BS. This assignment, or connection establishment, is carried out using some specific decision variable targeting the optimization of the network. So, there is a requirement for an appropriate assignment strategy for both efficient utilization of resources and optimized performance of a particular communication network. This paper focuses on the user selection and BS assignment strategy at the same time to get optimal results in order to meet the objectives.

II. EXISTING WORK

Modern cellular networks are the most suitable technology for smart grid communications. For the optimization of this communication network, researchers have discussed the user and BS association problem extensively. In a cellular architecture there are access nodes (BSs), and a user must establish its connection with a specific access node (BS) to get the provided services. However, [6] highlighted Coordinated Multipoint Technology (CoMP), where a single user can even establish connections with more than one BS. New access nodes have been deployed in Het-Nets (Heterogeneous Networks) in order to offload traffic from macro BSs (base stations). This deployment of pico/femto cells for handling more users in crowded areas is termed cell splitting. Small cells (pico/femto) can hardly attract users, as they are low-power cells compared to a macro BS. Transmit power plays an important role in making a connection between a user and the access node. The conventional BS assignment strategy uses the SINR (Signal to Interference plus Noise Ratio) as the decision parameter for connecting users to a particular BS. This conventional scheme is termed the max-SINR rule, and it fails to balance the load between the macro and micro BSs. For this purpose, [7] proposed a heuristic that adds a bias term to the SINR of the low-power micro BS in order to push it to a comparable level to attract users. [8]-[14] tried to minimize the interference by minimizing the total transmit power while considering a minimum SNR constraint for every user. In the existing literature, the max-SINR rule has been discussed most of the time, but this typical approach has some serious issues regarding load balancing and the transmit power of micro BSs. So, different researchers have proposed numerous solutions to this problem. For example, [15] targeted coordinate descent, dual coordinate descent, as well as the subgradient method to solve the association problem. Pricing variables are introduced in the objective function, and their weighting actually helps in finding the optimal solution for the defined objective function. Some of the work of [15] has already been published in [16]. For increasing the coverage area and offloading traffic from the high-power macro BS, [7] added a constant bias term to the SNR, but it is difficult to specify the exact value of that optimal bias term. A greedy algorithm has been discussed in [17]-[19]. The drawback of BS assignment using the greedy approach is that a user connects to a particular BS considering only the maximization of its own utility. Every time, a user switches its BS on the basis of its own throughput in order to increase its own utility, irrespective of the other parameters of the network environment. It is difficult to control this type of mechanism. As shown in Fig. 2, each time users switch their BS to maximize their own data rate, which leads to back-and-forth switching. This falls in the category of selfish algorithms, leading to oscillatory behavior of the network. [20] proposed game theory for the assignment of users and BSs. In [21], channel state information (CSI) is not shared because of selfish reasons.

Fig. 2. Oscillatory behavior of greedy approach.

The rest of the paper is organized as follows: Section III comprises the system model and problem formulation, while the simulation results and a discussion of the obtained results are presented in Section IV. The conclusion and future work are highlighted in Section V.
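Before moving to the system model, the max-SINR rule just discussed can be illustrated in a few lines. The sketch below is our own toy example with made-up channel gains and PSD levels; it shows how each user independently picks the strongest BS, with nothing in the rule balancing the load:

    import numpy as np

    rng = np.random.default_rng(0)
    n_users, n_bs = 6, 2
    gain = rng.exponential(scale=1.0, size=(n_users, n_bs))  # |h_ij|^2 (toy values)
    p = np.array([10.0, 1.0])                                # hypothetical PSDs: macro, micro
    noise = 0.1

    rx = gain * p                                            # received power from each BS
    sinr = rx / (rx.sum(axis=1, keepdims=True) - rx + noise) # other BSs act as interference

    assoc = sinr.argmax(axis=1)                              # max-SINR rule
    print("association:", assoc)
    print("load per BS:", np.bincount(assoc, minlength=n_bs))  # macro BS tends to win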
TABLE I
NOTATIONS

Symbol     Definition
N          Total number of active users
u_j        Number of users connected to BS j
h_ij       Channel between user i and BS j
b_ij       Binary decision variable for association of the ith user with the jth BS
P_j        PSD (Power Spectral Density) of the jth BS
γ²         PSD of AWGN (Additive White Gaussian Noise)
R_ij       Data rate
Γ          SNR (Signal-to-Noise Ratio) gap
SINR       Signal-to-interference-plus-noise ratio

III. SYSTEM MODEL AND PROBLEM FORMULATION

The main focus of this work is on a BS and user assignment strategy that yields the maximization of our objective. Choosing the cellular network as the communication network for smart grid systems, the target is to improve the overall throughput by selecting the best combinations of users and BSs.

Considering a downlink cellular network, let L be the number of BSs, with a total of U active users within the area covered by the cellular network. i and j are used for indexing a particular user and BS respectively, i.e., i ∈ [1, 2, 3, ..., U] and j ∈ [1, 2, 3, ..., L]. B is the total bandwidth (B.W.) that is distributed among all the BSs. To make the problem simpler, frequency-flat power spectral density (PSD) levels and flat-fading channels are assumed, so that the SINR values are constant over all frequencies. The channel between the ith user and the jth BS is denoted by h_ij ∈ C, and P_j is the PSD level. Then the SINR value for the association of user i and BS j will be

SINR_{ij} = \frac{|h_{ij}|^2 P_j}{\sum_{j' \neq j} |h_{ij'}|^2 P_{j'} + \gamma^2}    (1)

where γ² is the PSD of the AWGN (Additive White Gaussian Noise).

In this paper, we adopt a sum-log utility maximization objective for optimizing the throughput by selecting the best combination of users and BSs. If u_j is the number of users associated with BS j, then each user connected to that specific BS will share 1/u_j of the frequency resource. The data rate of user i connected with BS j is calculated by

R_{ij} = \frac{B}{u_j} \log\left(1 + \frac{SINR_{ij}}{\Gamma}\right)    (2)

where Γ is the SNR gap of user i, determined by the coding and modulation scheme. Γ is assumed to be the same for every user.

The binary (0-1) variable b_ij is a decision variable used to determine whether user i is associated with BS j or not. Then the objective function can be written as

f_o(b, R) = \sum_{i,j} b_{ij} \log(R_{ij})    (3)

Introducing a new parameter, termed the utility parameter a_ij, as

a_{ij} = \log\left(B \log\left(1 + \frac{SINR_{ij}}{\Gamma}\right)\right)    (4)

and substituting the value of a_ij into the objective function, our BS assignment problem becomes

\max_{b, u} \sum_{i,j} a_{ij} b_{ij} - \sum_{j} u_j \log(u_j)

subject to:

C1: \sum_{j} b_{ij} = 1, ∀i
C2: \sum_{i} b_{ij} = u_j, ∀j
C3: \sum_{j} u_j = U
b_{ij} ∈ {0, 1}, ∀i, ∀j

C1 indicates that one user can be connected to only one BS at a time, i.e., a one-to-one connection between user and BS. C2 gives the total number of users connected to each BS, denoted by u_j. C3 states that all users will be served.

IV. RESULTS AND DISCUSSION

In this section, the results obtained from the exhaustive search and the specified algorithm (fmincon) are discussed extensively. Our objective function is a maximization function that depends on the user and BS assignment strategy: b_ij assigns a particular user to a BS, and a_ij is the utility parameter (data rate) introduced to obtain an estimate of the maximum capacity of the network.

In Fig. 3(a), the exhaustive vs. optimal algorithm output shows the value of the objective function, i.e., the throughput in monetary units. The figure compares the exhaustive and the optimal search algorithm. It can be observed from the bar graph that the optimal algorithm cannot give better results than the heuristic approach; the heuristic approach will always be at least as good as any algorithm. As indicated in the graph, the overall throughput of the network increases with the number of users, because the utility increases in this way for a fixed number of BSs. If the number of BSs is fixed and we keep increasing the number of users, then after a specific number of users the utility of the network starts decreasing. This decrement indicates the point of saturation for a specific number of BSs, which can be observed in the case of the optimal algorithm.

Figs. 3(b), (c) and (d) show the results of the optimization algorithm (fmincon) used to improve the overall performance of the network with different combinations of users and BSs. For every set of users, the optimization algorithm calculates the objective function using the best assignment combination of users and BSs. There are three different graphs with distinct throughputs, which indicates that different numbers of BSs are used. As indicated in Fig. 3(b), 2 BSs are fixed for this case.
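The formulation of Eqs. (1)-(4) can be checked on a toy instance by enumerating every assignment and scoring it with the objective Σ a_ij b_ij − Σ u_j log u_j. The sketch below is our own illustration with made-up channels; exhaustive enumeration is only feasible for tiny U and L, which is exactly why an optimization routine such as fmincon is used in practice:

    import itertools
    import numpy as np

    rng = np.random.default_rng(1)
    U, L = 5, 2                      # tiny instance: 5 users, 2 BSs
    B, Gamma, noise = 1.0, 1.0, 0.1
    p = np.array([10.0, 1.0])        # hypothetical BS PSD levels
    g = rng.exponential(size=(U, L)) # |h_ij|^2, toy channel gains

    rx = g * p
    sinr = rx / (rx.sum(axis=1, keepdims=True) - rx + noise)   # Eq. (1)
    a = np.log(B * np.log(1.0 + sinr / Gamma))                 # Eq. (4)

    best, best_assoc = -np.inf, None
    for assoc in itertools.product(range(L), repeat=U):        # all b_ij with C1
        u = np.bincount(assoc, minlength=L).astype(float)      # u_j (C2, C3)
        util = a[np.arange(U), assoc].sum() - np.sum(u[u > 0] * np.log(u[u > 0]))
        if util > best:
            best, best_assoc = util, assoc
    print("best association:", best_assoc, "utility:", round(best, 3))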
(a) Exhaustive vs optimal algorithm
(b) Max achievable throughput with 2 BSs
(c) Max achievable throughput with 3 BSs
(d) Max achievable throughput with 5 BSs

Fig. 3. (a) Throughput performance comparison of the exhaustive vs optimal algorithm in monetary units. (b), (c), (d) indicate the maximum achievable throughput and the saturation point for the accommodation of users with a specific number of base stations.
When we keep increasing the number of users with a fixed number of BSs, the network throughput increases up to 50 users. After 50 users, the throughput decreases, which indicates overloaded traffic on the 2 BSs. So, for best utilization, 2 BSs can accommodate a maximum of 50 users in order to obtain the maximum overall throughput of the network. If we want to accommodate more users, then the number of BSs must be increased for better service. In the same way, if we increase the number of BSs to 3 and 5, as indicated in Figs. 3(c) and (d), we can provide the best service to 70 and even more users by taking the best assignment strategy using fmincon. In a nutshell, our algorithm selects the user and BS assignment jointly, targeting the maximum throughput of the network.
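fmincon is a MATLAB routine; a rough Python analogue of the continuous relaxation it solves can be sketched with scipy.optimize.minimize, relaxing b_ij to [0, 1] under the row-sum constraint C1 and minimizing the negative utility. This is our own hedged reconstruction of the setup, with random a_ij values, not the authors' code:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(2)
    U, L = 8, 3
    a = rng.normal(size=(U, L))          # utility parameters a_ij (toy values)

    def neg_utility(x):
        b = x.reshape(U, L)
        u = b.sum(axis=0)                # relaxed u_j = sum_i b_ij
        u_safe = np.clip(u, 1e-12, None) # avoid log(0)
        return -(np.sum(a * b) - np.sum(u * np.log(u_safe)))

    cons = [{"type": "eq", "fun": lambda x, i=i: x.reshape(U, L)[i].sum() - 1.0}
            for i in range(U)]           # C1: each user fully assigned
    x0 = np.full(U * L, 1.0 / L)
    res = minimize(neg_utility, x0, bounds=[(0.0, 1.0)] * (U * L),
                   constraints=cons, method="SLSQP")
    assoc = res.x.reshape(U, L).argmax(axis=1)   # round relaxation to 0-1
    print("association:", assoc)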
V. CONCLUSION AND FUTURE WORK

This paper jointly optimizes the user and BS association by selecting the best combination of users and BSs that maximizes the overall throughput of the network. Implementing the heuristic approach gives the maximum results compared to the optimal algorithm. For any specific number of BSs, a saturation point of user accommodation has been identified. This saturation point is directly linked with the utility of the objective function and identifies the maximum number of users that can be served using that fixed number of BSs. It is clearly observed from the results that increasing the number of users increases the overall throughput up to the saturation point, and after this point any increase in users leads to a decrement in the utility of the network. The optimization algorithm (fmincon) yields the best combination of users and BSs, which is then used to obtain maximum throughput up to a certain saturation level. So, keeping the saturation point in mind, the performance of the network has been optimized using the optimal algorithm.
Jointly optimizing the power as well as the user association could be a promising extension of this work. Beamforming and power control are still two important aspects of a communication network that can make it more suitable and efficient for the emerging smart grid communication network.
REFERENCES
[1] P. Siano, “Demand response and smart grids—a survey,” Renewable and
sustainable energy reviews, vol. 30, pp. 461–478, 2014.
[2] Y. Kabalci, “A survey on smart metering and smart grid communication,”
Renewable and Sustainable Energy Reviews, vol. 57, pp. 302–318, 2016.
[3] Y. Yan, Y. Qian, H. Sharif, and D. Tipper, “A survey on smart grid com-
munication infrastructures: Motivations, requirements and challenges,”
IEEE communications surveys & tutorials, vol. 15, no. 1, pp. 5–20,
2013.
[4] M. Rafiei, S. M. Elmi, and A. Zare, “Wireless communication protocols
for smart metering applications in power distribution networks,” in
Electrical Power Distribution Networks (EPDC), 2012 Proceedings of
17th Conference on. IEEE, 2012, pp. 1–5.
[5] C. Kalalas, L. Thrybom, and J. Alonso-Zarate, “Cellular communi-
cations for smart grid neighborhood area networks: A survey,” IEEE
Access, vol. 4, pp. 1469–1493, 2016.
[6] E. U. T. R. Access, “Further advancements for e-utra physical layer
aspects,” 3GPP Technical Specification TR, vol. 36, p. V2, 2010.
[7] I. Guvenc, M.-R. Jeong, I. Demirdogen, B. Kecicioglu, and F. Watanabe,
“Range expansion and inter-cell interference coordination (icic) for
picocell networks,” in Vehicular Technology Conference (VTC Fall),
2011 IEEE. IEEE, 2011, pp. 1–6.
[8] L. Smolyar, I. Bergel, and H. Messer, “Unified approach to joint
power allocation and base assignment in nonorthogonal networks,” IEEE
Transactions on Vehicular Technology, vol. 58, no. 8, pp. 4576–4586,
2009.
[9] J. T. Wang, “Sinr feedback-based integrated base-station assignment,
diversity, and power control for wireless networks,” IEEE Transactions
on Vehicular Technology, vol. 59, no. 1, pp. 473–484, 2010.
[10] H. Pennanen, A. Tölli, and M. Latva-aho, “Decentralized base station
assignment in combination with downlink beamforming,” in Signal
Processing Advances in Wireless Communications (SPAWC), 2010 IEEE
Eleventh International Workshop on. IEEE, 2010, pp. 1–5.
[11] D. H. Nguyen and T. Le-Ngoc, “Joint beamforming design and base-
station assignment in a coordinated multicell system,” Iet Communica-
tions, vol. 7, no. 10, pp. 942–949, 2013.
[12] V. N. Ha and L. B. Le, “Distributed base station association and power
control for heterogeneous cellular networks,” IEEE Transactions on
Vehicular Technology, vol. 63, no. 1, pp. 282–296, 2014.
[13] R. D. Yates and C.-Y. Huang, “Integrated power control and base station
assignment,” IEEE Transactions on vehicular Technology, vol. 44, no. 3,
pp. 638–644, 1995.
[14] S. V. Hanly, “An algorithm for combined cell-site selection and power
control to maximize cellular spread spectrum capacity,” IEEE Journal
on selected areas in communications, vol. 13, no. 7, pp. 1332–1340,
1995.
[15] K. Shen and W. Yu, “Distributed pricing-based user association for
downlink heterogeneous cellular networks,” IEEE Journal on Selected
Areas in Communications, vol. 32, no. 6, pp. 1100–1113, 2014.
[16] ——, “Downlink cell association optimization for heterogeneous net-
works via dual coordinate descent,” in Acoustics, Speech and Signal
Processing (ICASSP), 2013 IEEE International Conference on. IEEE,
2013, pp. 4779–4783.
[17] R. Madan, J. Borran, A. Sampath, N. Bhushan, A. Khandekar, and T. Ji,
“Cell association and interference coordination in heterogeneous lte-a
cellular networks,” IEEE Journal on selected areas in communications,
vol. 28, no. 9, pp. 1479–1489, 2010.
[18] T. Bu, L. E. Li, and R. Ramjee, "Generalized proportional fair scheduling in
third generation wireless data networks," in IEEE INFOCOM, 2006, pp.
1-12.
[19] K. Son, S. Chong, and G. De Veciana, "Dynamic association for load
balancing and interference avoidance in multi-cell networks," IEEE
Transactions on Wireless Communications, vol. 8, no. 7, 2009.
[20] L. Jiang, S. Parekh, and J. Walrand, "Base station association game in
multi-cell wireless networks (special paper)," in Wireless Communications
and Networking Conference, 2008. WCNC 2008. IEEE. IEEE, 2008, pp. 1616-1621.
[21] M. Hong and A. Garcia, "Mechanism design for base station association
and resource allocation in downlink OFDMA network," IEEE Journal on
Selected Areas in Communications, vol. 30, no. 11, pp. 2238-2250, 2012.
Comparison of flowdrill and conventional drilling methods in
thin-walled materials
terms of bushing heights, clamping force, hardness and crack
analyses.
Table III. Test plan

Material   Thickness (mm)   Hole & tapping dia. (mm)   Spindle speed (rpm)   Method
III. Test Results
As a result of the tests, the bushing heights of the holes drilled with the flowdrill method increased by 4-5 times compared to the conventional drilling method. With this increase comes the possibility of tapping more threads into the test specimens. It is predicted that the connection will be strengthened by the increase in the height of the bushings in these ratios and by the additional threading in the parts.

B. Comparison of flowdrilling and conventional drilling methods in terms of clamping strengths

To measure the strength of thin-walled connections, clamping tests were carried out on materials drilled by the flowdrill and conventional drilling methods. The clamping tests were performed at a constant speed of 1 mm/min. With the flowdrill method, the peel strength increases approximately 2-3 times compared to the conventional drilling method.

C. Comparison of flowdrilling and conventional drilling methods in terms of part hardness

During drilling with the flowdrill method, the friction between the flowdrill tip and the part at high speeds causes a temperature increase of approximately 300-400 °C around the holes. Hardness tests were performed for both materials to assess the hardness changes around the hole caused by the sudden heating and cooling.
Table IV. AISI 304 stainless steel and St 37-2 flowdrill method penetrant test results

AISI 304 stainless steel   St 37-2

welding, rivet nuts, bonding). After the drilling process, the tapping can be opened easily, eliminating the risk of warping in the material.
Classification of Animal Faces Using a Novel DAG-CNN
Architecture
Shahram Taheri, Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey, shahram.taheri@antalya.edu.tr
Zahra Golrizkhatami, Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey, z.golrizkhatami@antalya.edu.tr
Önsen Toygar, Department of Computer Engineering, Eastern Mediterranean University, Famagusta, North Cyprus, Turkey, onsen.toygar@emu.edu.tr
Abstract—Manual classification and recognition of animals in wildlife images and footage is a tiring and extremely challenging process. Therefore, automatic systems developed with computer vision approaches for the classification of animals are suggested. In this research work, we present a new architecture of a non-linear deep learning structure, namely Directed Acyclic Graph Convolutional Neural Networks (DAG-CNNs), for animal classification. This system applies the learned features of several Convolutional Neural Network (CNN) layers and fuses them for the final decision making. For this purpose, a popular and publicly available CNN architecture, namely VGG-16, is selected and considered as the underlying backbone structure of the proposed system, and several new branches are added to it. The proposed system automatically performs multi-stage feature extraction and combines multiple classifiers' decisions in a score-level fusion manner. Experiments on the open-access animal database prove the capability and efficiency of this novel method.

Keywords— convolutional neural network, score-level fusion, directed acyclic graph, animal classification

I. Introduction

Animal classification is a subdomain of visual object categorization which has been reported in a few studies. Detection and classification of animals can be used in various applications of computer vision such as animal-vehicle accident prevention, animal tracing, identification, antitheft of animals in zoos and content-based image retrieval. Animals are considered one of the most difficult objects in object detection applications [1]. There are various reasons, such as the fact that most animals are able to self-mask and usually appear in complicated scenes with varying illumination, viewpoints and scales. Animal pictures captured in the wild may have complex backgrounds, various postures and diverse illuminations. Another difficulty in the task of animal classification is that the available datasets contain a limited number of animal classes. Animals are one of the object classes used in the state-of-the-art for object recognition. The performance of an object recognition system strictly relies on how well the object representation and characterization are done.

Although various human face recognition methods have been proposed in the literature, they are not fully appropriate for animal face classification with its high range of intra-class variations and inter-class similarities. In this respect, several object recognition algorithms have been applied to animal images with the aim of extracting hand-crafted features such as texture and shape. The major drawback of these methods is that they are entirely problem dependent. Consequently, the productivity of these algorithms is problematic. Constructing a complex, high-dimensional feature vector from various hand-crafted features takes extensive time, which may not be efficient. Animal classification and recognition can also be valuable in expert systems to determine wild animals' migration corridors. Object characterization can be obtained by applying visual descriptors, shape descriptors or texture representation. Deep learning, and specifically Convolutional Neural Network (CNN), approaches have been successfully employed in recent studies of object classification and recognition tasks [2-5] and shown to have salient performance compared to the state-of-the-art. A CNN is an end-to-end system which is capable of extracting relevant features and also integrates the feature extraction and classification phases of classical machine learning systems. To train a CNN, a huge set of samples is required. A CNN extracts discriminative features automatically, and in many cases these features are shown to be superior to hand-crafted features such as HOG, LBP, or SURF [6]. Generally, CNN architectures consist of various numbers of basic elements, namely convolutional, max-pooling and ReLU layers, followed by a multi-layer perceptron neural network. In our previous study [7], in order to take advantage of both approaches, two different classifier systems were trained: one employed the CNN features, and the other was trained on hand-crafted features, such as appearance-based and shape-based features. Afterwards the outcomes were fused and the final results were obtained based on this fusion.

In this research, we develop a new CNN architecture, namely Directed Acyclic Graph Convolutional Neural Networks (DAG-CNNs), for animal classification. The proposed method exploits the multi-stage features from different CNN layers. DAG-CNN integrates the feature extraction and classification phases of a CNN into a single automated learning process. Furthermore, the proposed system employs multi-stage features and carries out the score-level fusion of multiple classifiers automatically. Hence, the proposed system revokes the necessity of extracting hand-crafted features.

The contributions of our work are as follows:

• The proposed system combines two main stages of the conventional machine learning procedure, namely the feature extraction and classification stages, into a completely automatic learning scheme for animal classification.

• Unlike the classic CNN, which uses only the features of the last layer, the proposed system employs the multi-stage learned features from mid-layers as well.

• The proposed system aggregates the results of different classifiers and automatically combines them by the score-level fusion approach, and positively it negates the use of feature-level fusion.

The proposed architecture has been tested on the LHI-Animal-Faces database, which contains 20 different categories: 19 classes of various animal types and one class of human faces.
The differentiation of these categories is very challenging due to their evolutionary relationship and shared parts. Additionally, significant within-class variations such as rotation, flip transforms, posture variation and sub-types exist in the face categories. In order to make a comparison and follow the state-of-the-art methods, in our experiments 30 samples of each class were selected for training and the remaining samples for testing. The obtained results confirm that the usage of multi-stage features from different layers of a CNN remarkably improves the classification accuracy.

The rest of this paper is organized as follows. Related works are reviewed in Section 2, and an overview of the DAG-CNN architecture is given in Section 3. The proposed method, the experimental settings and the obtained results are presented in Section 4. Finally, the conclusion is stated in the last section.

II. Related works

One of the earliest studies on animal recognition was performed by Schmid et al. [8]. They built an image retrieval system for 4 classes of animals by applying Gabor filters. Later, Ramanan et al. proposed systems for detecting animals in video frames by utilizing texture and shape-based features [9-10]. In [11], the authors introduced a method for an animal image search engine. An attempt at classifying marine species was made by Cao et al. [12]. For this purpose, they fused CNN and local feature descriptors and achieved better results than either individual system. Peng et al. in [13] introduced a deep learning algorithm to classify animal face images and investigated its performance on the LHI-Animal-Faces dataset. They proposed a deep boosting framework based on layer-by-layer joint feature boosting and dictionary learning. In each layer, they utilized a collection of filters obtained by fusing the early-layer filters.

Afkham et al. [14] collected a new dataset including realistic images of 13 various species against complex wild backgrounds. They introduced a system for visual object classification based on combining the textural features of samples and their background. An approach for classifying 25 categories of animals is reported in [15]. The animal images are segmented and partitioned into small patches, and then some colour-related features are extracted from each of them. These features are fed into a probabilistic neural network for classification.

In [16], the authors suggested two systems for classifying 20 classes of animals by using Gabor features and the K-means algorithm. Matuska et al. [17] developed a system which detects and classifies images of 5 categories: fox, deer, wolf, brown bear and wild boar. Burghardt and Calic [18] created a real-time system which tracks animals' heads in video and collects information about them. They applied the Viola-Jones detection algorithm, which is basically utilized in human face detection.

III. Backgrounds

The theory and the essential components of the DAG-CNN are presented in the following sub-sections. We briefly introduce the VGG-16 architecture, which has been employed as the backbone of the proposed DAG-CNN approach.

A. Directed Acyclic Graph-Convolutional Neural Network (DAG-CNN)

Directed acyclic graph (DAG) networks are capable of constructing more complicated network architectures compared to the classic CNN, which has a linear chain of layers [20]. The DAG structure is inspired by the concept of recurrent neural networks (RNNs), which have feedback connections between forward and backward layers. These feedbacks make the network able to capture dynamic states. The major superiority of the DAG architecture is its possibility of receiving multiple input parameters from several backward layers. As a result, it can gain different scales of image representation. A basic feature of deep neural networks (DNNs) is the skip connection between layers, which is alike the DAG-CNN's main idea, and it has been proven that these skip connections can improve the accuracy of the classification system accordingly.

Yang and Ramanan [21] proposed the DAG-CNN in 2015. Their algorithm was applied to a collection of multi-scale image features and tested for the classification of 3 standard scene benchmarks. The authors proved that the multi-scale model can be implemented as a DAG-structured feed-forward CNN. According to this architecture, end-to-end gradient-based learning can be applied for automatic multi-scale feature extraction by using a generalized back-propagation algorithm over the layers that have multiple inputs. Basically, the network training equations follow the standard CNN equations, except for the ADD and ReLU layers, due to their several inputs and outputs.

Figure I presents the parameter setup for the i-th ReLU layer, considering \alpha_i as its input and \beta_i^{(j)} as the output for its j-th output branch (its j-th child in the DAG); z is the final output of the softmax layer. The gradient of z with respect to the input of the i-th ReLU layer can be calculated as follows:

\frac{\partial z}{\partial \alpha_i} = \sum_{j=1}^{C} \frac{\partial z}{\partial \beta_i^{(j)}} \, \frac{\partial \beta_i^{(j)}}{\partial \alpha_i}    (1)

Figure I. Parameter configuration at the i-th ReLU layer [21]
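As a concrete illustration of Eq. (1) — several read-out branches tapping different depths and merging into one prediction — the following PyTorch sketch is our own simplified stand-in, not the authors' network. Each stage feeds an average-pooled classifier branch, the branch scores are summed, and autograd then performs exactly the multi-input/multi-output backpropagation described above:

    import torch
    import torch.nn as nn

    class TinyDAGCNN(nn.Module):
        # Two conv stages; each stage feeds its own pooled classifier branch,
        # and the branch scores are summed (score-level fusion).
        def __init__(self, n_classes=20):
            super().__init__()
            self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                                        nn.ReLU(), nn.MaxPool2d(2))
            self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1),
                                        nn.ReLU(), nn.MaxPool2d(2))
            self.pool = nn.AdaptiveAvgPool2d(1)      # marginal activations
            self.head1 = nn.Linear(16, n_classes)    # branch off stage 1
            self.head2 = nn.Linear(32, n_classes)    # branch off stage 2

        def forward(self, x):
            f1 = self.stage1(x)
            f2 = self.stage2(f1)
            s1 = self.head1(self.pool(f1).flatten(1))
            s2 = self.head2(self.pool(f2).flatten(1))
            return s1 + s2                            # summed branch scores

    model = TinyDAGCNN()
    scores = model(torch.randn(2, 3, 224, 224))
    print(scores.shape)  # torch.Size([2, 20])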
In the convolutional layers, the convolution operation is computed by Eq. (3) as follows:

y(i,j) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} x(i-k,\, j-l)\, h(k,l)    (3)

where x refers to the input sample, h indicates the used filter, and M, N stand for the width and height of the filter. The convolution layer's output is denoted by y. In order to update the neurons' biases and weights, Eqs. (4)-(5) are applied to the DAG-CNN layers:

\Delta W_l(t+1) = -\frac{x\lambda}{r} W_l - \frac{x}{n} \frac{\partial C}{\partial W_l} + m\, \Delta W_l(t)    (4)

\Delta B_l(t+1) = -\frac{x}{n} \frac{\partial C}{\partial B_l} + m\, \Delta B_l(t)    (5)

In the above equations, W, B, l, \lambda, x, n, m, t, and C signify the weight, bias, layer number, regularization parameter, learning rate, total number of training samples, momentum, updating step, and cost function, respectively.

In DAG-CNNs, considering the fact that the lower layers are directly linked to the output layer through multi-scale connections, it is assured that these layers' neurons receive a strong gradient signal during the learning process and do not suffer from the vanishing gradient issue.

In CNNs, the dimension of the learned features in the mid-layers can be very large. Therefore, concatenating these features may result in the curse of dimensionality. To avoid this issue, marginal activations are applied by operating average pooling on the learned features of the layers which are used for score-level fusion.

Figure IV. LHI-Animal-Faces dataset. Five images are shown for each category.
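Eqs. (3)-(5) can be spelled out directly in code. The NumPy sketch below is our own illustration with placeholder hyperparameter values; it evaluates the convolution of Eq. (3) over the valid output region and applies one momentum update of Eqs. (4)-(5), with the symbols mapped to variables as listed above:

    import numpy as np

    def conv2d(x, h):
        # Eq. (3): y(i,j) = sum_k sum_l x(i-k, j-l) h(k,l), valid region only;
        # the kernel flip makes this a true convolution, not correlation.
        M, N = h.shape
        H, W = x.shape
        y = np.zeros((H - M + 1, W - N + 1))
        for i in range(y.shape[0]):
            for j in range(y.shape[1]):
                y[i, j] = np.sum(x[i:i + M, j:j + N] * h[::-1, ::-1])
        return y

    def sgd_momentum_step(W, B, dC_dW, dC_dB, dW_prev, dB_prev,
                          x=0.01, lam=1e-4, r=100, n=100, m=0.9):
        # Eqs. (4)-(5): x = learning rate, lam = regularization, m = momentum,
        # n and r = sample counts as in the text (placeholder values here).
        dW = -(x * lam / r) * W - (x / n) * dC_dW + m * dW_prev
        dB = -(x / n) * dC_dB + m * dB_prev
        return W + dW, B + dB, dW, dB

    y = conv2d(np.arange(25.0).reshape(5, 5), np.ones((3, 3)))
    print(y.shape)  # (3, 3)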
B. VGG-16 architecture

VGG-16 [22], which was proposed by the Oxford Visual Geometry Group in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC 2014), is deeper and wider compared to the classic CNN architecture. VGG-16 consists of five batches of convolution operations; every batch can have 3 to 4 adjacent Conv-layers that are associated with max-pooling layers. The size of the kernels in all convolutional layers is 3×3. Convolutional layers and the number of kernels have
Figure V. Confusion matrix. (top) Score-level fusion of VGG-16 and KFA [5]; (bottom) DAG-VGG-16.
illumination and image quality. Then two distinct sets of features are generated. Afterwards, the similarity scores between these feature vectors and all the feature vectors available in the training set are computed, and the minimum score is selected for each technique. The distance between the test and training animal images is considered as the score of that sample. At this point, the achieved scores are normalized and the fusion is performed by adding these normalized scores; the result is fed to a Nearest Neighbour (NN) classifier to obtain the final animal class label.
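A minimal sketch of this score-level fusion step is given below; min-max normalization is an assumption here, since the text only states that the scores are normalized before being added:

import numpy as np

def normalize(s):
    # rescale a score vector to [0, 1]
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

def fuse_and_classify(scores_a, scores_b, train_labels):
    # scores_*: distances from one test image to every training image
    fused = normalize(scores_a) + normalize(scores_b)
    return train_labels[np.argmin(fused)]  # nearest neighbour on the fused score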
V. Experiments and Results

The experiments employ the LHI-Animal-Faces dataset [19], which contains 2200 head images of 19 different categories of animals plus one class of human heads. Five random samples from each of the aforementioned classes are shown in Figure IV. Due to large intra-class variations and inter-class similarities in the dataset, precise classification is a challenging task.
In order to achieve comparable results with the state-of-the-art, we follow the experimental setup introduced by Si et al. [25]. In this setting, 30 random images from each class are selected for the training phase and the rest of the images are utilized in the test stage.

The pre-processing step includes image resizing to 224 × 224 pixels, pixel intensity normalization and histogram equalization. VGG-16 is selected as the underlying architecture of the proposed DAG-CNN due to its success in image classification. By following the greedy approach explained in [2], three new links are added to Batch-3, Batch-4 and Batch-5 (Figure III). The proposed system is pre-trained with hundreds of thousands of images from a large public image repository, namely the ImageNet database [28], and has learned powerful discriminative feature sets.

In the next step, by utilizing the target dataset, we fine-tuned the pre-trained proposed system. Data augmentation techniques like translation, rotation and flipping are applied to avoid overfitting, and the training size is expanded five times. The first early layers' weights are frozen so that they remain intact throughout the fine-tuning process. The remaining layers' weights are fine-tuned by applying the Stochastic Gradient Descent (SGD) algorithm to minimize the loss function with a small initial learning rate of 0.001.
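This freeze-and-fine-tune recipe can be sketched in PyTorch as follows; the exact cut-off between frozen and trainable layers is an illustrative assumption, not the paper's configuration:

import torch
import torchvision

model = torchvision.models.vgg16(pretrained=True)   # ImageNet pre-trained backbone
for p in model.features[:10].parameters():          # freeze the early layers (illustrative cut-off)
    p.requires_grad = False
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.001, momentum=0.9)                          # SGD with the small initial learning rate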
In order to investigate the efficiency of the proposed DAG-CNN system, we compare its performance with the state-of-the-art algorithms that utilized the LHI-Animal-Faces dataset. This comparison is summarized in Table 1 and shows that the proposed method outperforms other state-of-the-art algorithms for animal face classification. In [7], we examined several categories of feature extractors, including hand-crafted local and appearance-based descriptors and automatically learned features. We applied well-known local feature descriptors such as Histogram of Oriented Gradients (HOG), Completed Local Binary Patterns (CLBP), LBP Histogram Fourier (LBP-HF), Haralick features and Median Robust Extended LBP (MRELBP). In order to utilize the discrimination power of appearance-based descriptors, Linear Discriminant Analysis (LDA) and Kernel Fisher Analysis (KFA) were examined. Finally, we tested the learned features of pre-trained and fine-tuned publicly available CNN architectures, namely AlexNet and VGG-16. We investigated score-level fusion of various combinations of the aforementioned descriptors and showed that fusion of hand-crafted features and multi-stage CNN-based features gains even higher accuracy compared to the CNN alone. On the other hand, the proposed DAG-CNN system yields a meaningful improvement in accuracy: a 96.4% classification accuracy was achieved by automatically combining multi-stage learned features. This result is superior to all of the state-of-the-art methods mentioned in Table 1, even our previous system which utilized hand-crafted and learned feature fusion.
Table I. Classification Accuracy on the LHI-Animal-Faces dataset

Method                   Accuracy
HOG+SVM [19]             70.8%
SW-RBF [25]              44%
FRAME [26]               79.4%
HIT [19]                 75.6%
LSVM [27]                77.6%
AOT [24]                 79.1%
Deep Boosting [13]       81.5%
Score-level fusion [7]   95.31%
Proposed DAG-CNN         96.29%

The confusion matrix is computed to assess the classification precision for the two best results of Table 1 and is shown in Figure V. For 8 animal categories, the proposed system successfully classified the images with 100% accuracy, and for 10 classes, the classification precisions are higher than 92%. The maximum confusions are generated by the sheep category versus cow heads, and by rabbit and chicken heads versus mouse heads (8%).
VI. Conclusions

In this paper, a new architecture of a nonlinear CNN structure, namely DAG-CNN, is presented for the animal classification task. This system automatically combines different layers' features of a CNN for decision making. For this purpose, several new links are added to the underlying backbone of a popular CNN structure, namely VGG-16. The proposed method is compared with several state-of-the-art systems that use the same dataset. The comparison results show that the proposed DAG-CNN architecture outperforms the state-of-the-art systems for animal face classification.
[5] Golrizkhatami, Z., Acan, A.: ECG classification using three-level fusion of different feature descriptors. Expert Systems with Applications. 114, 54-64 (2018)
[6] Taheri, S., Toygar, Ö.: Multi-stage age estimation using two level fusions of handcrafted and learned features on facial images. IET Biometrics. 8(2), 124-133 (2019)
[7] Taheri, S., Toygar, Ö.: Animal classification using facial images with score-level fusion. IET Computer Vision. 12(5), 679-685 (2018)
[8] Schmid, C.: Constructing models for content-based image retrieval. In: CVPR '01, Kauai, United States, pp. 11-39. CVPR (2001)
[9] Ramanan, D., Forsyth, D.A., Barnard, K.: Detecting, localizing and recovering kinematics of textured animals. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 2005, San Diego, USA, pp. 635-642. IEEE (2005)
[10] Ramanan, D., Forsyth, D.A., Barnard, K.: Building models of animals from video. IEEE Transactions on Pattern Analysis and Machine Intelligence. 28(8), 1319-1334 (2006)
[11] Berg, T.L., Forsyth, D.A.: Animals on the web. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), NY, USA, pp. 1463-1470. CVPR (2006)
[12] Cao, Z., Principe, J.C., Ouyang, B., Dalgleish, F., Vuorenkoski, A.: Marine animal classification using combined CNN and hand-designed image features. In: OCEANS'15 MTS/IEEE Washington, pp. 1-6. IEEE (2015)
[13] Peng, Z., Li, Y., Cai, Z., et al.: Deep Boosting: Joint feature selection and analysis dictionary learning in hierarchy. Neurocomputing. 178(20), 36-45 (2016)
[14] Afkham, H., Tavakoli, A., Eklundh, J., Pronobis, A.: Joint Visual Vocabulary for Animal Classification. In: ICPR 2008, Tampa, FL, USA, pp. 1-4. ICPR (2008)
[15] Kumar, Y.S., Manohar, N., Chethan, H.K.: Animal classification system: a block based approach. Procedia Computer Science. 45, 336-343 (2015)
[16] Manohar, N., Kumar, Y.S., Kumar, G.H.: Supervised and unsupervised learning in animal classification. In: Advances in Computing, Communications and Informatics (ICACCI), International Conference on, pp. 156-161. IEEE (2016)
[17] Matuska, S., Hudec, R., Benco, M., Kamencay, P., Zachariasova, M.: A novel system for automatic detection and classification of animal. In: ELEKTRO, pp. 76-80. IEEE (2014)
[18] Burghardt, T., Calic, J.: Real-time face detection and tracking of animals. In: Neural Network Applications in Electrical Engineering, NEUREL 2006, 8th Seminar on, pp. 27-32. IEEE (2006)
[19] Si, Z., Zhu, S.-C.: Learning hybrid image templates (HIT) by information projection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 34(7), 1354-1367 (2012)
[20] Golrizkhatami, Z., Taheri, S., Acan, A.: Multi-scale features for heartbeat classification using directed acyclic graph CNN. Applied Artificial Intelligence. 32(7-8), 613-628 (2018)
[21] Yang, S., Ramanan, D.: Multi-scale recognition with DAG-CNNs. In: Computer Vision (ICCV), 2015 IEEE International Conference, pp. 1215-1223. ICCV (2015)
[22] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
[23] Szegedy, C., Liu, W., Jia, Y., et al.: Going deeper with convolutions. In: CVPR (2015)
[24] Si, Z., Zhu, S.-C.: Learning and-or templates for object recognition and detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 35(9), 2189-2205 (2013)
[25] Kolouri, S., Zou, Y., Rohde, G.K.: Sliced Wasserstein kernels for probability distributions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5258-5267. IEEE (2016)
[26] Xie, J.: Generative Modeling and Unsupervised Learning in Computer Vision. Doctoral dissertation, UCLA (2016)
[27] Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object Detection with Discriminatively Trained Part-Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence. 32(9), 1627-1645 (2010)
[28] Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: Proc. CVPR (2009)
Movie Success Prediction with Statistical Analysis Techniques and Machine Learning Methods

Bugay Sarıkaya, Department of Computer Engineering, Baskent University, Ankara, Turkey
Duygu Dede Sener, Department of Computer Engineering, Baskent University, Ankara, Turkey
Abstract— In the movie industry, huge investments are made to shoot a successful motion picture. However, despite large investments, some movies are not as successful as expected. Therefore, predicting the success of a movie before its release is very important for movie producers. In this study, we aim to develop a classification-based prediction model for providing the producers with foresight about investing in a movie. Different statistical analysis and machine learning approaches were used in the proposed model for predicting the success of a movie. We mainly focus on detecting which movie attribute is most highly correlated with the success of the movie, and which machine learning technique is better at predicting movie success. To do so, firstly a statistical analysis was conducted using chi-square analysis and the analysis of variance test. Then a comparative analysis was performed using different machine learning techniques, including random forest, support vector machine and artificial neural network. The experimental results indicate that the most important predictors of a movie's success are "voteAverage", "voteCount", "revenue" and "budget". In addition, random forest was the most successful, with an accuracy of 96% in predicting movie success among the other machine learning methods.

Keywords— Movie success prediction, Movie success classification, Machine Learning, Model Prediction

I. Introduction

The movie industry is one of the most expansive industries around the world. With the rapid growth of this industry and its economic impact, many researchers have been studying the movie industry. In particular, building predictive models to investigate the factors that affect the success of a movie has become a popular research area over the past decade. It is a well-known fact that the success of a motion picture is based on several features of the movie such as box-office, budget, revenue, and popularity level. Although there are other important factors, such as director and actors, which have an undeniable impact on a movie's success, not every movie achieves the expected box-office or success. Therefore, there is a basic need for producers to predict the success of a movie before its release. Recently, several studies have focused on predicting the success of movies using different machine learning approaches.

Ahmad et al. [1] proposed a mathematical model for predicting movie success, including finding correlations between various features using χ2 analysis. Simulation data was used in the study and only tested on Bollywood movies; they also concluded that actors and film genres affect the success of a film. Ping-Yu Hsu et al. [2] developed a special model to predict user ratings with IMDB (Internet Movie Database) attributes. There are 32968 films in the used dataset, and linear combination, multiple linear regression, neural network methods and χ2 analysis were used. It is stated that some attributes, such as writers, actors, and directors, profoundly affect user ratings. In addition, Eker et al. [3] sought to measure the effect of features on movie classification. Decision trees, K-NN, Random Forest, C4.5, C5.0 and Boosting algorithms were used in this study. A comparative study was performed using different machine learning algorithms on datasets from the IMDB and Facebook websites. It was observed that user votes are the most important factor in the IMDB score, and the country where the film is produced is the least important factor in determining the IMDB score. Saraee et al. [4] studied IMDB data using various data mining techniques. Furthermore, Lash and Zhao [5] proposed a way to predict decisions regarding film investments. This study assisted in making early investment decisions in filmmaking by using historical data. In this study, the profit was calculated mainly from the box-office revenue. However, for many movies there are other sources of income, such as items for sale. Kyuhan Lee et al. [6] examined multiple approaches to improve the performance of the prediction model. They developed and added a new feature derived from the theory of transmedia storytelling and used an ensemble approach, which has rarely been adopted in research on predicting box-office performance. As a result, the proposed model, the Cinema Ensemble Model (CEM), outperformed the prediction models from past studies that use existing machine learning algorithms. Besides these studies, Hemraj Verma and Garima Verma [7] conducted a comparative analysis of prediction models using various machine learning techniques. The models were used to predict whether a movie would be a hit or a flop before it came out. The major predictors used in the models are the ratings of the lead actor, the IMDb ranking of the movie, the music rank of the movie, and the total number of screens planned for the release of the movie.

In this study, statistical analysis techniques and different machine learning methods were applied to predict the success of a movie. Two statistical tests, chi-square analysis and the analysis of variance test (ANOVA), and three different machine learning algorithms, including random forest (RF), support vector machine (SVM) and artificial neural network (ANN), were performed on a collected dataset. A comparative study was performed to investigate which algorithm achieves the highest accuracy. An experimental study was performed on a dataset collected from IMDB. According to the results, the most important predictors of a movie's success are "voteAverage", "voteCount", "revenue" and "budget". Besides this, it has been found that random forest is the most successful technique, with an accuracy
of 0.96 in predicting movie success among the other machine learning methods.

II. Methods

In this section, the statistical analysis techniques and machine learning techniques used in the proposed model are described.

A. Statistical Analysis Methods

a) Correlation Matrix
An attribute correlation matrix was used to obtain the correlation between each attribute and the target attribute, which is movie success in our study. It is a common way to summarize data and to guide the data owner to focus on highly correlated attributes, so that the data analysis can be conducted more efficiently. In our study, the Pearson correlation coefficient was used. The score ranges between 0 and 1; values close to 1 represent a high correlation while values close to 0 represent a low correlation.

b) Chi-Squared Test
A chi-square (χ2) test [8] is a hypothesis testing method. It is used for comparing the observed values with the expected values to detect whether the stated null hypothesis is true or not. The null hypothesis states that there is no difference between the compared data. For this test, a p-value that is less than or equal to the defined significance level (0.05) indicates that there is strong evidence to conclude that the observed distribution is not the same as the expected distribution. Moreover, the data used in calculating a chi-square statistic must be random, raw, mutually exclusive, drawn from independent variables, and drawn from a large enough sample.

c) Analysis of Variance (ANOVA)
Analysis of variance (ANOVA) [9] is a statistical analysis tool that splits the observed aggregate variability found inside a data set into two parts: systematic factors and random factors. The systematic factors have a statistical influence on the given data set, while the random factors do not. Analysts use the ANOVA test to determine the influence that independent variables have on the dependent variable in a regression study. In our study, two-way ANOVA was used because we wanted to make comparisons between the means of three groups of data where two independent variables are considered; the considered variables are movie success and the rest of the attributes. Moreover, Multivariate ANOVA (MANOVA) was used to extend the capabilities of ANOVA by assessing multiple dependent variables simultaneously.
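A minimal sketch of these two tests with SciPy is shown below; the DataFrame and column names are hypothetical, and a one-way ANOVA is used for brevity where the paper applies the two-way and multivariate forms:

import pandas as pd
from scipy import stats

# df is assumed to hold the movie attributes plus a categorical success label
observed = pd.crosstab(df["success"], df["voteCount"])     # observed frequencies
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

groups = [g["budget"].values for _, g in df.groupby("success")]
f_stat, p_anova = stats.f_oneway(*groups)                  # ANOVA across the success classes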
B. Machine Learning Methods

a) Random Forest
Random forests (RF), or random decision trees [10], are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. It is an ensemble learning method used for classification, regression and other tasks, which requires constructing a multitude of decision trees. For the classification task, each generated tree predicts an output class and the final decision is made by selecting the most frequently predicted class among the trees. RFs and their variants are called black-box models, and they have been applied in a variety of research fields such as bioinformatics, finance and healthcare systems.

b) Support Vector Machine (SVM)
The support vector machine (SVM) [11] is a supervised machine learning algorithm used for data classification and regression analysis. SVM's main goal is to find a hyperplane that best divides a dataset into two different classes, applied multiple times as needed to match the number of classes. Support vectors are the data points nearest to the hyperplane; these points help define the hyperplane, so all computations are done through them. The hyperplane creates a margin which divides the two classes apart, and the error function is designed so that the margin becomes larger as the error decreases. If there is no clearly dividing hyperplane, the whole feature space is transformed into a new, higher-dimensional feature space; this is known as kernelling. SVMs produce accurate results on clean datasets with small to medium sample sizes. When dealing with larger datasets, however, the computational costs can be too much to handle, and the method is highly sensitive to the noisy nature of large datasets.

c) Artificial Neural Network (ANN)
An artificial neural network (ANN) [12] is a piece of a computing system designed to simulate the way the human brain analyzes and processes information. It is a foundation of artificial intelligence (AI) and solves problems that would prove impossible or difficult by human or statistical standards. ANNs have self-learning capabilities that enable them to produce better results as more data becomes available. Artificial neural networks are built like the human brain, with neuron nodes interconnected like a web. The human brain has hundreds of billions of cells called neurons, and each neuron is made up of a cell body that is responsible for processing information by carrying information towards (inputs) and away from (outputs) the brain.

III. Results

In this section, the used dataset is explained and the experimental results are given.

A. Dataset

In this study, a dataset of 4899 movies released from 2000 to 2020 was collected from TMDB (The Movie Database) [13] and OMDB (The Open Movie Database) [14]. The dataset consists of various types of attributes: date, genre, language, season and IMDB (Internet Movie Database) rating are categorical, while box-office, budget, IMDB votes, popularity, revenue and run time are numerical. Regarding the sample distribution of the dataset, there are 1215 samples for the successful class, 2663 samples for the average successful class and 1021 samples for the unsuccessful class.
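The three classifiers described above can be compared on such a dataset with scikit-learn; the sketch below assumes preprocessed feature and label arrays X and y, and the train/test split is illustrative:

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
for name, clf in [("RF", RandomForestClassifier()),
                  ("SVM", SVC()),
                  ("ANN", MLPClassifier(max_iter=500))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))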
a) Data Preprocessing Steps
In the dataset, each movie has an IMDB rating representing the success of the movie. Unlike a binary classification problem with two classes, multi-class classification covers the movie success information more broadly. An artificial class construction was performed to transform the problem into a multi-class classification problem. Therefore, each movie is categorized based on its IMDB rating: movies with scores in the range [7-10] were assigned to the "successful" class, scores in the range [5-6.99] to the "average successful" class, and scores in the range [0-4.99] to the "unsuccessful" class. In this way, our problem was converted into a multi-class problem. Furthermore, missing value removal and transformation of categorical features into numeric features were performed before applying the classification algorithms. One-hot encoding was applied to convert categorical values into numerical values; in this way, a generalized form can be provided to the machine learning algorithms. Finally, a feature scaling approach was applied to some attributes to normalize the range of independent attributes such as box-office, revenue and vote counts.
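These preprocessing steps can be sketched with pandas and scikit-learn as follows; the column names are assumptions for illustration, with the class boundaries taken from the text:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# bin the IMDB rating into the three classes described above
df["class"] = pd.cut(df["imdbRating"], bins=[0, 4.99, 6.99, 10],
                     labels=["unsuccessful", "average successful", "successful"])
df = df.dropna()                                       # missing value removal
df = pd.get_dummies(df, columns=["genre", "season"])   # one-hot encoding
num_cols = ["boxOffice", "revenue", "imdbVotes"]
df[num_cols] = MinMaxScaler().fit_transform(df[num_cols])  # feature scaling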
B. Evaluation Metrics

To evaluate the performance of the classification algorithms, one metric was used: accuracy. Accuracy (1) refers to the degree of conformity of a measured or calculated quantity to the actual (true) value. A result is said to be accurate when it matches the particular target. TP, FP, TN and FN represent true positives, false positives, true negatives and false negatives, respectively. Overall accuracy was used to compare the performance of the classification algorithms.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}    (1)
C. Experimental Results

To get insight into the success distribution among the movies based on the attributes, some distribution plots were obtained. Figure 1 gives the movie success distribution based on the genre attribute. The class categories are given in different colors: red, green and blue represent the unsuccessful, average and successful movie classes, respectively. According to the figure, the highest number of successful movies can be seen in "drama" and "comedy" movies, while most of the "horror" and "thriller" movies are in the unsuccessful class.

Then the correlation matrix was obtained, given in Figure 2. According to the matrix, the movie success, given under the name "imdbRating", is highly correlated with the attributes "voteAverage", "voteCount", "revenue" and "budget". After having obtained the attributes highly correlated with success, the statistical analysis was performed by focusing on these attributes to conclude the relationship between them. To do so, the chi-square (χ2), ANOVA and MANOVA tests were performed to investigate which movie attribute has the biggest impact on movie success.

Figure II. Correlation matrix of movie attributes

The chi-square (χ2) test statistics and corresponding p-values are given in Table 1. According to the obtained p-values, which are all less than the significance level of 0.05, we can conclude that the association between each attribute and the movie's success is statistically significant, excluding the attribute "season_summer", since its p-value is not less than or equal to 0.05. In addition, the "Df" value in the table represents the degrees of freedom.

Table I. Chi-Squared test results

Attribute        X-squared   Df     p-value
voteAverage      9645.6      156    < 2.2e-16
voteCount        8865.4      3222   < 2.2e-16
revenue          10081       4310   < 2.2e-16
budget           5728        2154   < 2.2e-16
runtime          3230        372    < 2.2e-16
boxOffice        9119.6      3978   < 2.2e-16
date             125.36      20     < 2.2e-16
popularity       19113       8300   < 2.2e-16
season_winter    14.283      2      0.0007914
season_autumn    14.283      2      0.0007914
season_summer    4.9467      2      0.0843
season_spring    17.391      2      0.0001674
imdbVotes        4825.6      1512   < 2.2e-16
Table II. Two-way ANOVA results of the attributes with the IMDB success categories (columns: Df, Sum Sq, Mean Sq, F Value, Pr(>F))

Table IV. Classification performances of the machine learning algorithms
[8] Plackett, R. L. (1983). Karl Pearson and the chi-squared test. International Statistical Review/Revue Internationale de Statistique, 59-72.
[9] Scheffe, H. (1999). The analysis of variance (Vol. 72). John Wiley & Sons.
[10] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
[11] Noble, W. S. (2006). What is a support vector machine?. Nature Biotechnology, 24(12), 1565-1567.
[12] Wang, S. C. (2003). Artificial neural network. In Interdisciplinary Computing in Java Programming (pp. 81-100). Springer, Boston, MA.
[13] https://www.themoviedb.org/ access date: 20.04.2021
[14] https://www.omdbapi.com/ access date: 20.04.2021
Digital Controllers Performance Analysis for a Robot Arm

Abdullah Ahmed Al-dulaimi, Department of Electrical Electronics Engineering, Karabuk University, Abdalluhahmed1993@gmail.com
Mohammed Majid Abdulrazzaq, Department of Computer Engineering, Karabuk University, moh.abdulrazzaq9@gmail.com
Mustafa Mohammed Alhassow, Department of Electrical and Computer Engineering, Altinbas University, Mustafa.alshakhe@gmail.com
Noor Qasim AL saedi, Department of Electrical Electronics Engineering, Karabuk University, Noorqasimat@gmail.com
Abstract—The design methodology and performance study of various forms of digital compensators for a robot arm joint control system with sensor input are presented in this article. Continuous-time (s-plane or w-plane) and discrete (z-plane) domain parameters are used in the design process. Frequency response design techniques were investigated, and five basic types of controllers were modelled and simulated using MATLAB: phase-lag, phase-lead, proportional-integral (PI), proportional-derivative (PD), and proportional-integral-derivative (PID). All of the controllers have been set up to maintain a 40-degree phase margin. Both closed loop step responses and open loop bode plots have been analyzed. This paper presents a comparison of the controllers based on their step response characteristics.

Keywords: digital controllers, PID controllers, robot arm, robot arm controllers, digital controllers performance.

I. Introduction

Controllers are needed to assess adjustments in system parameters and to meet performance requirements for steady-state precision, transient response, reliability, and disturbance rejection. Analog control systems are stable, with no intrinsic bandwidth limitations or system changes. However, due to the tolerances of practical machines, intricate logic is difficult to synthesize in analog controls, rendering complex interfaces among multiple subsystems is very difficult, and such controls are vulnerable to incorrect designs and limitations. Furthermore, extraneous noise sources can corrupt analog systems significantly. Since no signal loss occurs in analog-to-digital (A/D) and digital-to-analog (D/A) conversions [1], digital systems are reliable. Furthermore, with more sophisticated logic implementation, systems are more flexible and accurate. Digital filters do not encounter external noise, which makes them well-suited for adaptive filtering uses. Fast response and a digital memory interface are possible for digital systems [1]. A physical plant or system is accurately controlled through closed-loop or feedback operation, where an output (system response) is adjusted as required by an error signal [2]. The discrepancy between the response as determined by sensor input and the target response generates the error signal. The error quantity is processed by a controller or compensator in order to satisfy the output requirements [3]. This paper describes five digital controller design methodologies for a real-time robot control system. In these design methods, the compensating parameter is the phase margin specified in the plant bode diagram. The design method employs frequency response strategies based on the cross-over frequency phase margin (Pm). Phase-lag, phase-lead, PI, PD and PID controllers were drawn up in accordance with the principle of compensation and the methodologies defined in [4]. This paper clarifies the statistical and conceptual approaches articulated in the references. The basic and illustrative frames and approaches to digital control systems are mentioned in [6]. Digital control systems have been documented for training, theory, simulation and experimental approaches [7],[8]. A closed loop model has been introduced in [1] and [4] for digital systems control and implementations in the digital drive controller. Regarding the PID controller, we cite the information from [5].

II. Methodology

A sampler, a D/A block that is a zero-order hold (ZOH), a servomotor represented by an s-domain transfer function, a digital controller block, a power amplifier gain, gears represented by a gain value, and a feedback sensor block comprise the example robot control scheme outlined in this article. An s-domain transfer function presents the uncompensated plant. A/D conversion is started by the sampler and D/A conversion is held at zero order. The controllers must offset the plant phase margin, and the desired result shall be 40 deg. For each controller, the steady-state error, percent overshoot, rise time, and settling time are calculated for output assessment. This paper documents, section by section, a literature review of digital compensation, an example uncompensated robot arm joint plant, discrete and continuous time equations with the design method, MATLAB simulation results of lag, lead, PI, PD and PID controls, and a comparative study among these five. In order to evaluate the design requirements, the digital system was adopted, simulated and extended in MATLAB.

III. Literature Review

The compensation theory, plant configuration and the mathematical derivations of the design approaches, loop parameters and open loop of the controllers mentioned in this paper fully follow the literature provided in [1]. The controller transfer function for first-order compensation can be written as

D(z) = \frac{K_d (z - z_0)}{z - z_p}    (1)

Here, $z_0$ and $z_p$ represent the zero and pole positions, respectively. The controller's bilinear or trapezoidal transformation from the discrete z-plane to the continuous w-plane (warped s-plane) implies

D(w) = D(z), \qquad z = \frac{1 + (T/2)w}{1 - (T/2)w}    (2)

and

D(w) = a_0 \, \frac{1 + w/\omega_{w0}}{1 + w/\omega_{wp}}    (3)
Figure I. Robot arm joint control system block diagram
Here $\omega_{w0}$ and $\omega_{wp}$ denote the zero and pole positions in the w-plane, and $a_0$ denotes the compensator dc gain. The bilinear approximation states that

w = \frac{2}{T}\,\frac{z - 1}{z + 1}    (4)

From equations (1)-(4), the controller can be realized in the z-plane as

D(z) = a_0 \, \frac{\omega_{wp}(\omega_{w0} + 2/T)}{\omega_{w0}(\omega_{wp} + 2/T)} \cdot \frac{z - \dfrac{2/T - \omega_{w0}}{2/T + \omega_{w0}}}{z - \dfrac{2/T - \omega_{wp}}{2/T + \omega_{wp}}}    (5)

which, compared with equation (1), yields

K_d = a_0 \, \frac{\omega_{wp}(\omega_{w0} + 2/T)}{\omega_{w0}(\omega_{wp} + 2/T)}, \qquad z_0 = \frac{2/T - \omega_{w0}}{2/T + \omega_{w0}}, \qquad z_p = \frac{2/T - \omega_{wp}}{2/T + \omega_{wp}}    (6)

IV. Block Diagram Explanation

A closed loop model for digital control systems and applications of digital controllers to speed drives is shown in the above diagram. It consists of a sampler, a digital controller block, a D/A block which is a zero-order hold (ZOH), a power amplifier gain, a servomotor represented by an s-domain transfer function, gears represented by a gain value and a feedback sensor block. In the case of a closed-loop feedback system, the D(z) digital controller is implemented. The controller uses algebraic algorithms such as filters and compensatory controls to correct or regulate the controlled system's behavior. The zero-order hold is a practical mathematical model of signal reconstruction using a digital-to-analog converter: it converts each sample to a continuous-time signal by storing the sample value for a set time and not allowing changes. The amplitude or power of a signal from input to output port can be increased by connecting it to an amplifier whose gain is set to a particular level [9]. A servomotor transforms the control signal from the controller into the rotational angular displacement or angular velocity of the motor output shaft. To power the arms of the robot, servo motors are used. Gears are used to transfer motion: by finding the torque and speed of the output gear, one can find the torque and speed of the input gear. The uncompensated plant is presented by an s-domain transfer function. The sampler initiates A/D conversion and the zero-order hold implements D/A conversion. For performance evaluation, steady-state error, percent overshoot, rise time and settling time are measured for each controller. Here the given sensor feedback gain is GT = 0.07; the sensor input is θ_a in degrees and the output is in volts.

V. Plant

The control system of the robot arm is shown in Fig. 1, with sampling time T = 0.1 s, power amplifier gain K = 2.4 and sensor feedback gain H_k = 0.07, and the system phase margin evaluated with D(z) = 1. The ZOH transfer function can be defined as

G_{HO}(s) = \frac{1 - e^{-sT}}{s}    (7)

The plant TF in continuous time is

G_p(s) = \frac{9.6}{s^2 + 2s}    (8)

With the sensor feedback gain, the continuous-time plant becomes

G_c(s) = G_p(s) \times H_k = \frac{0.672}{s^2 + 2s}    (9)

where TF stands for transfer function. The plant that operates in discrete time, known as the discrete-time plant, is

G_d = \frac{0.00028289\,(s + 3.39\mathrm{e}04)}{(s + 1.524)(s + 0.4406)}    (10)

Fig. 2 introduces the bode diagram of the D(z) = 1 system; the Pm for the uncompensated system is 79.6 deg, with a gain margin Gm = 35.8 dB.

A. Design of Phase Lag Controller

The task is to design a phase-lag controller with a dc gain of 10 that yields a 40 deg system phase margin. The high-frequency gain is

G_{hf}(dB) = 20 \log \frac{a_0 \omega_{wp}}{\omega_{w0}}    (11)

The controller in this paper is built for 39.7842 degrees. For this design, the Pm and the cross-over (Pm) frequency have been chosen as ω_wc = 1.1291 rad/s.
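Equations (5)-(6) translate into a few lines of Python; the sketch below simply evaluates the closed-form mapping from the w-plane design parameters to the z-plane gain, zero and pole:

def realize_first_order(a0, w_w0, w_wp, T):
    # Eqs. (5)-(6): z-plane realization of the first-order w-plane compensator
    Kd = a0 * (w_wp * (w_w0 + 2 / T)) / (w_w0 * (w_wp + 2 / T))
    z0 = (2 / T - w_w0) / (2 / T + w_w0)
    zp = (2 / T - w_wp) / (2 / T + w_wp)
    return Kd, z0, zp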
Figure II. Controllers' Bode plot of the open loop
and

\omega_{wp} = \frac{\omega_{w0}}{a_0 \, |G_d(j\omega_{wc})|}    (13)

The TF of the controller is

D_{lag}(z) = \frac{0.01167\,(z - 0.7105)}{z - 0.9919}    (14)

Figure II shows the phase-lag controller. The compensated plant Pm = 39.7842 deg at 0.0925 rad/s can be seen on the bode plot, with Gm = 29.6 dB at 0.575 rad/s. The gain and phase margin values are undefined in the bode plot of the controller alone, and hence these are determined to be infinite.

B. Design of Phase Lead Controller

For the phase-lead controller, a_0 = 10 and the maximum phase shift θ_m occurs at a frequency ω_wm = √(ω_w0 ω_wp). The controller in this paper is designed for 39.8827 degrees; for this design, the Pm and the cross-over (Pm) frequency have been chosen as 2.2950 rad/s, i.e., a phase-lead controller with a dc gain of 10 that yields a 40 deg system phase margin. The design condition is

D(j\omega_{wc})\, G_d(j\omega_{wc}) = 1\angle(180 + \phi_{pm})    (15)

where φ_pm is the desired Pm, and

D(w) = a_0 \, \frac{1 + w/(a_0/a_1)}{1 + w/(b_1)^{-1}}    (16)

and

b_1 = \frac{\cos\theta_r - a_0\,|G_d(j\omega_{wc})|}{\omega_{wc}\,\sin\theta_r}    (20)

Because of the phase-lead characteristic, θ_r > 0, and in the design procedure ω_wc has been constrained by the following requirements:

\angle G_d(j\omega_{wc}) < 180 + \phi_{pm}; \quad |D(j\omega_{wc})| > a_0; \quad |G_d(j\omega_{wc})| < \frac{1}{a_0}; \quad b_1 > 0; \quad \cos\theta_c > a_0\,|G_d(j\omega_{wc})|    (21)

The transfer function of the controller is

D(z) = \frac{10.424\,(z - 0.9832)}{z - 0.9618}    (22)

where Pm is the phase margin. From the bode plot, it can be observed that the compensated plant Pm = 39.8827 deg at 2.29 rad/s and the gain margin Gm = 16.2 dB at 6.57 rad/s. From the bode plot of the controller alone, the gain and Pm values are undefined and thereby found to be infinite.
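For simulation, a first-order compensator D(z) = K_d(z − z_0)/(z − z_p) reduces to the difference equation u[k] = z_p u[k−1] + K_d (e[k] − z_0 e[k−1]); the sketch below runs it sample by sample, using the phase-lag values of Eq. (14) as an illustrative default:

def compensate(e, Kd=0.01167, z0=0.7105, zp=0.9919):
    # run D(z) = Kd (z - z0) / (z - zp) over an error sequence e
    u, u_prev, e_prev = [], 0.0, 0.0
    for ek in e:
        uk = zp * u_prev + Kd * (ek - z0 * e_prev)
        u.append(uk)
        u_prev, e_prev = uk, ek
    return u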
Table I. Controllers' Bode plot characteristics

Characteristics | Pm with D(z) = 1 | Lag | Lead | PI | PD | PID
Gain Margin | [61.5755 1.7929e+04] | [12.8749 6.7538e+03] | [6.4935 1.7014e+03] | [0 9.9702 1.8455e+03] | Inf | [0 9.9702 1.8455e+03]
GM Frequency | [6.2240 31.4159] | [4.6917 31.4159] | [6.5659 31.4159] | [0 6.9890 31.4159] | Inf | [0 6.9890 31.4159]
Phase Margin | 79.6399 | 39.7842 | 39.8827 | 38.1362 | 40.0 | 38.1362
Stable | 1 | 1 | 1 | 1 | 1 | 1
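The uncompensated phase margin in the first column of Table I can be checked numerically; the sketch below evaluates G_c(jω) = 0.672/((jω)² + 2jω) on a frequency grid and reads off the phase margin at the gain cross-over. It ignores the extra phase lag of the ZOH, so it slightly overestimates the reported 79.6 deg:

import numpy as np

w = np.logspace(-2, 2, 200000)
G = 0.672 / ((1j * w) ** 2 + 2 * (1j * w))
i = np.argmin(np.abs(np.abs(G) - 1.0))       # index of the gain cross-over, |G| = 1
pm = 180.0 + np.degrees(np.angle(G[i]))
print(w[i], pm)                               # roughly 0.33 rad/s and about 80 deg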
C. Design of PI Controller

A PI (Proportional-Integral) controller is a composite of proportional and integral controllers in cascade with each other, as shown in the figure. Its transfer function here is

D(z) = \frac{1.4575\,(z - 0.9648)}{z - 1}    (26)

From the bode plot, it is worth noting that the compensated plant Pm = 40.0182 deg at 0.553 rad/s and the gain margin Gm = 30.9 dB at 5.61 rad/s. From the bode plot of the controller alone, the gain and Pm values are undefined and thereby found to be infinite.

D. Design of PD Controller

A PD (Proportional-Derivative) controller has both the proportional controller and the derivative controller in cascade, so we have to add both, as shown in the figure. The gains follow from

k_d = \frac{\sin\theta}{\omega_1 A_1} = \frac{\sin\theta}{\omega_1 |G(j\omega_1)|}    (30)

k_p = \frac{\cos\theta}{A_1} = \frac{\cos\theta}{|G(j\omega_1)|}    (31)

E. Design of PID Controller

A PID (Proportional-Integral-Derivative) controller consists of proportional, integral and derivative controllers all connected in cascade, as shown in the figure.
Figure VI. Controllers' step response of the closed loop
D(w) = K_P + \frac{K_I}{w} + K_D\, w    (32)

We can change the transfer function from the w-plane to the z-domain by using the bilinear transformation, in which $w$ is replaced by $\frac{2}{T}\left(\frac{z-1}{z+1}\right)$, where $T$ is the sampling time. The PID controller's discrete TF can then be expressed as

D(z) = K_P + K_I\,\frac{T}{2}\,\frac{z+1}{z-1} + K_D\,\frac{z-1}{Tz}    (33)

For a controller design that yields a 40 deg system phase margin,

\left[K_P + \frac{K_D\,\omega_{wc}^2\,(2/T)}{(2/T)^2 + \omega_{wc}^2}\right] + j\left[\frac{K_D\,\omega_{wc}\,(2/T)^2}{(2/T)^2 + \omega_{wc}^2} - \frac{K_I}{\omega_{wc}}\right] = K_R + jK_C

The $K_P$ proportional gain, $K_D$ derivative gain and $K_I$ integral gain can then be obtained from

K_P + \frac{K_D\,\omega_{wc}^2\,(2/T)}{(2/T)^2 + \omega_{wc}^2} = \frac{\cos\theta_r}{|G_d(j\omega_{wc})|}    (34)

\frac{K_D\,\omega_{wc}\,(2/T)^2}{(2/T)^2 + \omega_{wc}^2} - \frac{K_I}{\omega_{wc}} = \frac{\sin\theta_r}{|G_d(j\omega_{wc})|}    (35)

With the PID controller, the gain margin is 20 dB at 6.99 rad/s and the Pm is 38.1 degrees at 1.85 rad/s. The gain and phase margin values are undefined in the bode plot of the controller alone, and hence these are determined to be infinite.
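In the time domain, Eq. (33) corresponds to a trapezoidal integral and a backward-difference derivative; a minimal sketch with illustrative (not the paper's) gains:

def pid(e, T=0.1, Kp=1.0, Ki=0.1, Kd=0.05):
    # discrete PID of Eq. (33), run sample by sample over an error sequence e
    u, ui, e_prev = [], 0.0, 0.0
    for ek in e:
        ui += Ki * (T / 2.0) * (ek + e_prev)   # trapezoidal integration
        ud = Kd * (ek - e_prev) / T            # backward-difference derivative
        u.append(Kp * ek + ui + ud)
        e_prev = ek
    return u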
VI. Step Response Characteristics

The design problem explained in this paper has assumed an input of θ_c = 0.07u(t). The scaled step responses of the closed loop system for the designed controllers are presented in Figure VI. For the step response overshoot, ξ ↓ ⇒ M_p% (%OS) ↑. From Figure VI and Table II, the best one is the PID controller, because it has very low steady state error compared to the other controllers. For the PID, steady state error ∝ 1/k_p, OS ∝ 1/k_d, rise time ∝ 1/k_p and 1/k_i, and settling time ∝ 1/k_d. K_p, K_d, and K_i can be described as the proportional, derivative, and integral parameters. The closed loop control system is affected by all three of these parameters; in addition to those factors, slow rising, slow settling and large overshoot as well as the steady state error are also affected. A lag compensator shifts the bode magnitude plot down at mid and high frequencies with its attenuation property; to meet a specification on steady-state error, the low frequency gain is changed. The proportional-integral controller is equivalent to a control system that produces an output which is the result of adding the outputs from the proportional and integral controllers. The PID is used in systems where proportional, integral, and derivative controllers are combined to compute an output; it also serves to reduce the steady state error and improve stability. The proportional-derivative controller generates an output that combines the proportional and derivative actions; if a PD is being used, noise may be suppressed at the higher frequencies.

VII. Conclusion

This paper examines the performance and design assessment of five simple digital controllers, namely lag, lead, PI, PD, and PID controllers, which are used for a physical robot arm joint plant. Each controller is designed with a dc gain of 10 that yields a system phase margin Pm of 40 deg. The design methodologies have been investigated in both discrete z-domain approaches and warped s-domain (w-plane) frames. The controllers have been simulated in MATLAB, and bode plots with open loop and closed loop step response curves have been analyzed for a comparative study. The cross-over frequency is a crucial design specification to compensate the plant. Such designs are important as their specifications are applicable in different practical control systems.

References

[1] Chowdhury, Dhiman. "Design and Performance Analysis of Digital Controllers in Discrete and Continuous Time Domains for a Robot Control System." Global Journal of Research In Engineering (2018).
[2] Unglaub, Ricardo A.G., and D. Chit-Sang Tsang. "Phase tracking by fuzzy control loop." 1999 IEEE Aerospace Conference Proceedings (Cat. No. 99TH8403). Vol. 5. IEEE, 1999.
[3] Mastinu, Gianpiero, and Manfred Plöchl, eds. Road and Off-Road Vehicle System Dynamics Handbook. CRC Press, 2014.
[4] Chowdhury, Dhiman, and Mrinmoy Sarkar. "Digital Controllers in Discrete and Continuous Time Domains for a Robot Arm Manipulator." arXiv preprint arXiv:1912.09020 (2019).
[5] Alassar, Ahmed Z., Iyad M. Abuhadrous, and Hatem A. Elaydi. "Comparison between FLC and PID Controller for 5DOF robot arm." 2010 2nd International Conference on Advanced Computer Control. Vol. 5. IEEE, 2010.
[6] Phillips, Charles L., and H. Troy Nagel. Digital Control System Analysis and Design. Prentice-Hall, Inc., 1989.
[7] Misir, Dave, Heidar A. Malki, and Guanrong Chen. "Design and analysis of a fuzzy proportional-integral-derivative controller." Fuzzy Sets and Systems 79.3 (1996): 297-314.
[8] Klee, Harold, and Joe Dumas. "Theory, simulation, experimentation: an integrated approach to teaching digital control systems." IEEE Transactions on Education 37.1 (1994): 57-62.
[9] Liu, Hui. Robot Systems for Rail Transit Applications. Elsevier, 2020.
[10] Boukas, E.K., AL-Sunni, F.M. (2011). Design Based on Transfer Function. In: Mechatronic Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22324-2_5.
Machine learning model optimization with hyper-parameter tuning approach

Md Riyad Hossain, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, md.hossain01@utrgv.edu
Douglas Timmer, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, douglas.timmer@utrgv.edu
Hiram Moya, Department of Manufacturing Engineering, University of Texas Rio Grande Valley, USA, hiram.moya@utrgv.edu
Abstract— Hyper-parameter tuning is a key step to find the optimal machine learning parameters. Determining the best hyper-parameters takes a good deal of time, especially when the objective functions are costly to evaluate, or myriad parameters are required to be tuned. In contrast to conventional machine learning algorithms, neural networks require more hyper-parameter tuning because they must process a lot of parameters together, and depending on the fine tuning, the accuracy of the model can vary between 25%-90%. A few of the most effective techniques for tuning hyper-parameters in deep learning methods are grid search, random search, Bayesian optimization, etc. Every method has some advantages and disadvantages over others. For example, grid search has proven to be an effective technique to tune hyper-parameters despite some drawbacks, like trying too many combinations and performing poorly when tuning many parameters simultaneously. In our work, we will determine, show and analyze the efficiencies of different parameters and tuning methods on a real-world synthetic polymer dataset.

Keywords— Machine learning, Hyperparameter optimization, Grid Search technique, Random Search, BO-GP

I. Introduction

In the era of machine learning, performance (based on accuracy and computing time) is very important. The growing number of tuning parameters associated with machine learning models is tedious and time-consuming to set by standard optimization techniques. Researchers working with ML models often spend long hours determining the perfect combination of hyper-parameters [1]. If we think of w, x, y, z as the parameters of the model, and if all of these parameters are values ranging from 0.0001 to, say, 5.00, then hyperparameter tuning is finding the best combinations to make the objective function optimal.

One of the major difficulties in working with a machine learning problem is tuning the hyperparameters. These are the design parameters that can directly affect the training outcome. The conversion from a non-tuned machine learning model to a tuned ML model can be like going from predicting nothing correctly to predicting everything accurately [2]. There are two types of parameters in ML models: hyperparameters and model parameters. Hyperparameters are arbitrarily set by the user even before starting to train the model, whereas the model parameters are learned during training.

The quality of a predictive model mostly depends on the configuration of its hyperparameters, but it is often difficult to know how these hyperparameters interact with each other to affect the final results of the model [14]. To determine accuracy and make a comparison between two models, it is always better to compare two models with both of the models' parameters tuned. It is not fair to compare a decision tree model with already-tuned parameters against an ANN model whose hyperparameters have not been optimized yet.

II. Literature Review

Hyperparameter tuning, due to its importance, has become an interesting new topic in the ML community. Hyperparameter tuning algorithms are either model-free or model-based. Model-free algorithms make no use of knowledge about the solution space extracted during the optimization; this category includes manual search [4], random search [2, 6-7], and grid search [5]. In manual search, we assume the values of the parameters from previous experience: the user sets hyperparameter values based on judgment or previous experience, trains the algorithm with them, observes the performance, keeps training the model this way until achieving a standard accuracy, and then selects the set of hyperparameters that gives the maximum accuracy. However, this technique is heavily dependent on judgment and previous expertise, and its reliability depends on the correctness of the previous knowledge [3]. A few of the main parameters used by random forest classifiers are criterion, max_depth, n_estimators, min_samples_split, etc.

In random search, we train and test our model based on some random combinations of the hyperparameters. This method is well suited to identifying new combinations of parameters or discovering new hyperparameters. Although it may take more time to process, it often leads to better performance. Bergstra et al. (2012) mentioned in their work that, over the same domain, random search can find models that are as good as or even better within a reduced computation time; after granting the same budget in terms of computational constraints, it was evident that random search can deliver the best models by exploring a larger, less promising configuration space [16]. Random search, which was developed based on grid search, examines a set of random combinations to develop and train the algorithm; Bergstra et al. (2011) [2].

In grid search, the user sets a matrix of hyperparameters and trains the model on each possible combination. Amirabadi et al. (2020) propose two novel suboptimal grid search techniques on four separate datasets to show the
efficiency of their hyperparameter tuning model, and later compare it with some of the other recently published work.

Figure I. (a) Manual tuning, (b) Random tuning, (c) Grid tuning approach (from left to right)

The main drawback of the grid search method is its high complexity; it is commonly used when there are only a few hyperparameters to be tuned. In other words, grid search works well when the best combinations are already roughly determined. Similar applications of grid search have been reported by Zhang et al. (2014) [17], Ghawi et al. (2019) [18], and Beyramysoltan et al. (2013) [19].

Zhang et al. (2019) [20] reported a few of the drawbacks of the existing hyperparameter tuning methods. In their work, they described grid search as an ad-hoc process, as it traverses all the possible combinations and the entire procedure requires a lot of time. Andradóttir (2014) [13] shows that Random Search (RS) eradicates some of the limitations of the grid search technique to an extent. RS can reduce the overall time consumption, but its main disadvantage is that it cannot be guaranteed to converge to the global optimal value. The combination of randomly selected hyper-parameters can never guarantee a steady and widely acceptable result. That is why, apart from the manual tuning methods, automated tuning methods are becoming more and more popular in recent times; Snoek et al. (2015) [10]. Bayesian optimization is one of the most widely used automated hyperparameter tuning methods to find the global optimum in fewer steps. However, Bayesian optimization's results are sensitive to the parameters of the surrogate model, and the accuracy greatly depends on the quality of the learning model; Amirabadi et al. (2020) [3]. To minimize the error function of hyperparameter values, Bayesian optimization adopts probabilistic surrogate models like Gaussian processes. Through precise exploration and exploitation, a surrogate model of the hyperparameter space is established; Eggensperger et al. (2013) [8]. However, probabilistic surrogates need accurate estimations of sufficient statistics of the error function distribution, so a sizable number of hyperparameter evaluations is required, and this method does not work well when myriad hyperparameters must be processed altogether.

III. Methodology

The purpose of hyperparameter optimization is to find the global optimum $x^*$ of the objective function $f(x)$:

x^* = \arg\min_{x \in X} f(x)

where $f(x)$ can be evaluated for any arbitrary $x \in X$, and $X$ is a hyperparameter space that can contain categorical, discrete, and continuous variables [27]. In constructing different machine learning models, the application of effective hyperparameter optimization techniques can simplify the process of identifying the best hyperparameters. HPO contains four major components: first, an estimator, which could be a regressor or any classifier, with one or more objective functions; second, a search space; third, an optimization method to find the best combinations; and fourth, a function to compare the effectiveness of various hyperparameter configurations [28].

A. Grid Search

Grid search is a process that exhaustively searches a manually specified subset of the hyperparameter space of the target algorithm [30]. A traditional approach to finding the optimum is to do a grid search, i.e., to run experiments or processes over a grid of conditions; for example, if there are three factors, a 15 × 15 × 15 grid would mean performing 3375 experiments under different conditions [32]. Grid search is more practical when [31]: (1) the total number of parameters M in the model is small, say M < 10; the grid is M-dimensional, so the number of test solutions is proportional to L^M, where L is the number of test solutions along each dimension of the grid; (2) the solution is known to be within a specific range of values, which can be used to define the limits of the grid; (3) the direct problem d = g(m) can be computed quickly enough that the time required to compute the L^M solutions is not prohibitive; (4) the error function E(m) is uniform on the scale of the grid spacing, Δm, so that the minimum is not lost because the grid spacing is too coarse.

There are several problems with the grid search method. The first is that the number of experiments can be prohibitive if there are several factors. The second is that there can be significant experimental error, which means that if the experiments are repeated under identical conditions, different responses can be obtained; therefore, choosing the best point on the grid can be misleading, especially if the optimum is fairly flat. The third is that the initial grid may be too coarse for the number of experiments to be feasible, and it could miss features close to the optimum or find a false (local) optimum [32].
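A typical grid search with scikit-learn looks like the sketch below; X and y stand for the features and target of the dataset, and the grid values are illustrative, not the paper's configuration:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [50, 100, 200],
              "max_depth": [5, 10, None],
              "min_samples_split": [2, 5]}
search = GridSearchCV(RandomForestRegressor(), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)
print(search.best_params_, -search.best_score_)   # best combination and its MSE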
B. Random Search usually much cheaper than running the objective function.
Random search [33] is a basic improvement on grid However, because Bayesian optimization models are run
search. It started with a randomized search over hyper- based on previously tested values, it is difficult to belong to
parameters from certain distributions over approximate them with parallel sequential methods; but they are generally
parameter values. These searching process runs as long as the able to detect optimal close hyperparameter combinations in
predetermined budget is exhausted, or at least until achieving a few iterations [36]. Common substitution models for BO
a desired set of accuracy. These methods are the simplest include the Gaussian process (GP) [37], random forest (RF)
stochastic optimization and are very useful for certain [38]. Therefore, there are three main BO algorithms based on
problems, such as small search space and fast-running their substitution models: BO-GP, BO-RF, BO-TPE. GP is an
simulation. RS finds a value for each hyperparameter, prior attractive reduced order model of BO that can be used to
to the probability distribution function. Both the GS and RS quantify forecast uncertainty. This is not a parametric model
estimate the cost measure based on the produced and the number of its parameters depends only on the input
hyperparameter sets. Although RS is simple, it has proven to points. With the right kernel function, your GP can take
be more effective than Grid search in many of the cases [33]. advantage of the data structure. However, the GP also has
Random search has been shown to provide better results due disadvantages. For example, it is conceptually difficult to
to several benefits: first, the budget can be set independently understand with BO theory. In addition, its low scalability
based on the distribution of the search space, therefore, with large dimensions or many data points is another
random search technique can sometime work better important issue [36].
especially if the multiple hyper-parameters are not uniformly
distributed [34]. Second: Because each evaluation is IV. Dataset description & Basics of polymer extrusion
independent, it is easy to parallelize and allocate resources.
Unlike GS, RS samples a few parameter combinations from A. Denier
a defined distribution, which maximizes system efficiency by Denier is a weight measurement usually refers to the
reducing the likelihood of wasting a lot of time in a small, thickness of the threads. It is the weight (grams) of a single
underperforming area. In addition, this method can detect optical fiber for 9 kilometers. If we have a 9 km fiber weighs
global optimum values or close to global if given a sufficient 1 gram, this fiber has a denier of 1, or 1D. A fiber with less
budget. Third, although getting optimal outputs applying than 1-gram weight calls Microfibers [22]. Microfibers
random search is not promising, lengthy processing time may become a new development trend in the synthetic polymer
lead to a greater likelihood of getting the best hyperparameter industry. The higher the denier is, the thicker and stronger the
set, whereas extra search times cannot always guarantee fiber is. Conversely, less denier means that the fiber/fabric
improved results in Grid searches. The use of random search will be softer and more transparent. Fine denier fibers are
is recommended in the initial stages of HPO to narrow the becoming a new standard and are very useful for the
search space quickly, before using guided algorithms to get development of new textiles with excellent performance [21].
better results. The main drawback [28] of RS and GS is that
each evaluation in its iteration does not depend on previous B. Breaking Elongation (%)
evaluations; thus, they waste time evaluating Elongation at break is one of the few main quality
underperforming areas of the search space. parameters of any synthetic fiber [24]. It is the percentage of
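The following minimal Python sketch illustrates the random search procedure described above; the objective function and the sampling distributions are placeholders supplied by the user, and all names are illustrative, not the implementation used in this paper.

import random

def random_search(objective, space, budget=50):
    # Minimal random search: sample each hyperparameter from its
    # distribution, evaluate independently, keep the best set.
    best_params, best_score = None, float("inf")
    for _ in range(budget):                      # stop when budget is exhausted
        params = {name: dist() for name, dist in space.items()}
        score = objective(params)                # e.g. validation MSE
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative search space: each entry is a sampling function.
space = {
    "n_estimators": lambda: random.randint(10, 500),
    "max_depth":    lambda: random.choice([None, 5, 10, 20]),
}

In practice, a library implementation such as scikit-learn's RandomizedSearchCV offers the same idea with cross-validation built in.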
C. Bayesian Optimization

Bayesian optimization (BO) is a commonly used iterative algorithm for HPO problems. Unlike GS and RS, BO determines future evaluation points based on the previous results. To determine the next hyperparameter configuration, BO uses two key components: a surrogate model and an acquisition function. The surrogate model aims to fit all the points of the objective function observed so far. The acquisition function determines the utility of candidate points, balancing exploration and exploitation. The BO model balances this exploration and exploitation process to identify the most promising regions and to avoid missing the best configuration in unexplored regions [35].

The basic BO method works as follows: (i) build a reduced-order probabilistic model (ROM) of the objective function; (ii) find the best hyperparameter values in the ROM; (iii) apply those values to the objective function; (iv) update the ROM with the new set of results; (v) repeat the above steps until the maximum number of iterations is reached.

BO is more efficient than GS and RS because it can detect promising combinations of hyperparameters by analyzing previously tested values, and running the surrogate model is usually much cheaper than running the objective function. However, because Bayesian optimization models are updated based on previously tested values, they are difficult to combine with parallel methods; they are nevertheless generally able to detect near-optimal hyperparameter combinations within a few iterations [36]. Common surrogate models for BO include the Gaussian process (GP) [37] and the random forest (RF) [38]. Accordingly, there are three main BO algorithms based on their surrogate models: BO-GP, BO-RF, and BO-TPE. The GP is an attractive reduced-order model for BO that can be used to quantify prediction uncertainty. It is a non-parametric model whose number of parameters depends only on the input points. With the right kernel function, the GP can exploit the structure of the data. However, the GP also has disadvantages: it is conceptually difficult to reconcile with BO theory, and its poor scalability to high dimensionality or many data points is another important issue [36].
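As an illustration of steps (i)-(v), the sketch below implements a minimal BO-GP loop for a single hyperparameter, using a Gaussian process surrogate from scikit-learn and expected improvement as the acquisition function; it is a simplified sketch under these assumptions, not the exact implementation used in this paper.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_gp(objective, low, high, n_init=5, n_iter=20):
    rng = np.random.default_rng(0)
    X = rng.uniform(low, high, size=(n_init, 1))     # initial design points
    y = np.array([objective(float(x)) for x in X[:, 0]])
    gp = GaussianProcessRegressor(normalize_y=True)  # (i) reduced-order model
    for _ in range(n_iter):
        gp.fit(X, y)                                 # (i)/(iv) refit surrogate
        cand = rng.uniform(low, high, size=(1000, 1))
        mu, sigma = gp.predict(cand, return_std=True)
        best = y.min()
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # acquisition (EI)
        j = int(np.argmax(ei))
        x_next = cand[j:j + 1]                       # (ii) best point in the model
        y_next = objective(float(x_next[0, 0]))      # (iii) run true objective
        X = np.vstack([X, x_next])                   # (iv) update the data
        y = np.append(y, y_next)                     # (v) repeat
    i = int(np.argmin(y))
    return float(X[i, 0]), float(y[i])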
IV. Dataset description & Basics of polymer extrusion

A. Denier

Denier is a weight measurement that usually refers to the thickness of threads. It is the weight, in grams, of 9 kilometers of a single fiber. If a 9 km fiber weighs 1 gram, the fiber has a denier of 1, or 1D. A fiber weighing less than 1 gram per 9 km is called a microfiber [22]. Microfibers have become a new development trend in the synthetic polymer industry. The higher the denier, the thicker and stronger the fiber. Conversely, a lower denier means that the fiber or fabric will be softer and more transparent. Fine denier fibers are becoming a new standard and are very useful for the development of new textiles with excellent performance [21].

B. Breaking Elongation (%)

Elongation at break is one of the few main quality parameters of any synthetic fiber [24]. It is the percentage of elongation at the breaking point. Fiber elongation partly reflects the extent to which a filament stretches under a certain loading condition. Fibers with high elongation at break are easily stretched under a predetermined load; fibers showing this characteristic are known to be flexible. The elongation behavior of any single fiber can be complex because of the multiplicity of structural factors affecting it. Moreover, a cotton fiber comes with a natural crimp, which is important for fibers to stick together while undergoing other production processes [23]. If L0 is the initial length of the fiber and ΔL_Break the extension at break, the percentage breaking elongation is:

E_Breaking elongation = (ΔL_Break / L0) × 100%

Breaking elongation for cotton fiber may vary from 5% to 10%, which is significantly lower than that of wool fibers (25%-45%) and much lower than that of polyester fibers (typically over 50%).
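The two definitions above reduce to one-line formulas; the following Python helpers, with illustrative numbers only, make them concrete.

def denier(mass_g, length_km):
    # Denier: grams of fiber per 9 km; below 1 it is a microfiber [22].
    return mass_g * 9.0 / length_km

def breaking_elongation_pct(delta_l_break, l0):
    # Percentage elongation at break: (extension at break / initial length) * 100.
    return delta_l_break / l0 * 100.0

# Illustrative numbers: 1 g per 9 km -> 1 D; a 10 mm filament that breaks
# after stretching 0.7 mm -> 7 %, inside the 5-10 % range quoted for cotton.
print(denier(1.0, 9.0))                  # 1.0
print(breaking_elongation_pct(0.7, 10))  # 7.0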
C. Breaking force (cN) and Tenacity (cN/tex)

Breaking tenacity is the maximum load that a single fiber can withstand before breaking. For polypropylene and PET staple fibers, sample filaments of 10 mm length are drawn until failure. Breaking tenacity is measured in grams/denier. Very small forces are encountered when evaluating fiber properties, so an instrument with gram-level accuracy is required [25]. The tenacity of virgin PP fibers is about 5-8 g/den, and their elongation at break is about 100%. The tenacity of recycled PET, meanwhile, is about 3.5-5.7 g/den, and its elongation at break usually exceeds 100%.

D. Draw Ratio

The draw ratio is the ratio of the diameter of the initial blank form to the diameter of the drawn part. The limiting draw ratio (capstan speed/nip reel speed) for the extruder section is between 1.6 and 2.2 [26], whereas for the stretching section it is between 3 and 4.

V. Results

A. MSE for default hyperparameters

We ran the code in Google Colaboratory and obtained MSE values of 44.8%, 3653.6%, 3100.7%, and 713.7%, respectively, for Random Forest, Support Vector Machine, K-Nearest Neighbors, and Artificial Neural Network.

B. MSE for Grid Search

Table II. MSE & Cycle time for Grid Search technique

Name | MSE | Cycle time (s)
RF | 1.053 | 5
SVM | 0.927 | 2
KNN | 29.45 | 1
ANN | 0.475 | 5
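A minimal sketch of how such a default-hyperparameter baseline can be obtained with scikit-learn is shown below; the synthetic dataset stands in for the extrusion dataset of Section IV, so the printed numbers will not reproduce the values above, and the ANN baseline would be built analogously in TensorFlow.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the polymer extrusion dataset.
X, y = make_regression(n_samples=500, n_features=8, noise=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "RF": RandomForestRegressor(random_state=0),  # default hyperparameters
    "SVM": SVR(),
    "KNN": KNeighborsRegressor(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: MSE = {mse:.3f}")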
C. MSE for Bayesian Optimization with Gaussian Process (BO-GP)

The MSE values for BO-GP are 723.5%, 648.5%, 2945%, and 30.8%, respectively, for Random Forest, Support Vector Machine, K-Nearest Neighbors, and Artificial Neural Network.

Table IV. MSE & Cycle time for BO-GP Search technique

Name | MSE | Cycle time (s)
RF | 7.235 | 2
SVM | 6.485 | 3
KNN | 29.45 | 2
ANN | 0.308 | 5

[Summary table comparing all models across HPO techniques, partially visible in the source:]

Name | MSE | Cycle time (s)
KNN, RS | 29.45 | 2
KNN, BO-GP | 29.45 | 2
ANN, default HPs | 7.137 | 5
ANN, GS | 0.475 | 5
ANN, RS | 0.304 | 5
ANN, BO-GP | 0.308 | 5

[Figure: "MSE for HPO's" - MSE of RF, SVM, KNN, and ANN plotted against the performance horizon (1-4), with a linear trend line for RF.]

[Figure: "Cycle time for HPO's" - cycle time in seconds for each model and HPO technique.]
References

[1] Cho, H., Kim, Y., Lee, E., Choi, D., Lee, Y., & Rhee, W. (2020). Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks. IEEE Access, 8, 52588-52608. doi:10.1109/access.2020.2981072
[2] J.S. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyperparameter optimization, in: Advances in Neural Information Processing Systems, 2011, pp. 2546-2554.
[3] Amirabadi, M., Kahaei, M., & Nezamalhosseini, S. (2020). Novel suboptimal approaches for hyperparameter tuning of deep neural network [under the shelf of optical communication]. Physical Communication, 41, 101057. doi:10.1016/j.phycom.2020.101057
[4] F. Hutter, J. Lücke, L. Schmidt-Thieme, Beyond manual tuning of hyperparameters, DISKI 29 (4) (2015) 329-337.
[5] F. Friedrichs, C. Igel, Evolutionary tuning of multiple SVM parameters, Neurocomputing 64 (2005) 107-117.
[6] R.G. Mantovani, A.L. Rossi, J. Vanschoren, B. Bischl, A.C. De Carvalho, Effectiveness of random search in SVM hyper-parameter tuning, in: 2015 International Joint Conference on Neural Networks (IJCNN), 2015, pp. 1-8.
[7] L. Li, A. Talwalkar, Random search and reproducibility for neural architecture search, 2019, arXiv preprint arXiv:1902.07638.
[8] K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, K. Leyton-Brown, Towards an empirical foundation for assessing Bayesian optimization of hyperparameters, in: NIPS Workshop on Bayesian Optimization in Theory and Practice (Vol. 10, 3), 2013.
[9] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, Y. Bengio, An empirical evaluation of deep architectures on problems with many factors of variation, in: Proceedings of the 24th International Conference on Machine Learning, ACM, 2007, pp. 473-480.
[10] J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, et al., Scalable Bayesian optimization using deep neural networks, in: International Conference on Machine Learning, 2015, pp. 2171-2180.
[11] Jones, D.R.; Schonlau, M.; Welch, W.J. Efficient global optimization of expensive black-box functions. J. Glob. Optim. 1998, 13, 455-492.
[12] Calandra, R.; Peters, J.; Rasmussen, C.E.; Deisenroth, M.P. Manifold Gaussian processes for regression. In Proceedings of the 2016 International Joint Conference on Neural Networks, Vancouver, BC, Canada, 24-29 July 2016; pp. 3338-3345.
[13] Andradóttir, S.: A review of random search methods. In: Handbook of Simulation Optimization, pp. 277-292. Springer (2015).
[14] Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research 18 (2018) 1-52.
[15] A. Klein, S. Falkner, J.T. Springenberg, and F. Hutter. Learning curve prediction with Bayesian neural networks. In International Conference on Learning Representations (ICLR), 2017.
[16] J.S. Bergstra, Y. Bengio. Random Search for Hyper-Parameter Optimization, Journal of Machine Learning Research 13 (2012) 281-305.
[17] Zhang, H., Chen, L., Qu, Y., Zhao, G., & Guo, Z. (2014). Support Vector Regression Based on Grid-Search Method for Short-Term Wind Power Forecasting. Journal of Applied Mathematics, 2014, 1-11. doi:10.1155/2014/835791
[18] Ghawi, R., & Pfeffer, J. (2019). Efficient Hyperparameter Tuning with Grid Search for Text Categorization using kNN Approach with BM25 Similarity. Open Computer Science, 9(1), 160-180. doi:10.1515/comp-2019-0011
[19] Beyramysoltan, S., Rajkó, R., & Abdollahi, H. (2013). Investigation of the equality constraint effect on the reduction of the rotational ambiguity in three-component system using a novel grid search method. Analytica Chimica Acta, 791, 25-35. doi:10.1016/j.aca.2013.06.043
[20] Zhang, X., Chen, X., Yao, L., Ge, C., & Dong, M. (2019). Deep Neural Network Hyperparameter Optimization with Orthogonal Array Tuning. Communications in Computer and Information Science: Neural Information Processing, 287-295. doi:10.1007/978-3-030-36808-1_31
[21] Zhang, C., Liu, Y., Liu, S. et al. Crystalline behaviors and phase transition during the manufacture of fine denier PA6 fibers. Sci. China Ser. B-Chem. 52, 1835 (2009). https://doi.org/10.1007/s11426-009-0242-5
[22] Joe. (2020, May 5). What Is Denier Rating? Why Does It Matter To You? DigiTravelist. https://www.digitravelist.com/what-is-denier-rating/
[23] Elmogahzy, Y. (2018). Tensile properties of cotton fibers. In Handbook of Properties of Textile and Technical Fibres, 223-273. doi:10.1016/B978-0-08-101272-7.00007-9
[24] Tyagi, G.K. (2010). Yarn structure and properties from different spinning techniques. In Advances in Yarn Spinning Technology, 119-154. doi:10.1533/9780857090218.1.119
[25] Blair, K. (2007). Materials and design for sports apparel. Materials in Sports Equipment, 60-86. doi:10.1533/9781845693664.1.60
[26] Swift, K., & Booker, J. (2013). Forming Processes. Manufacturing Process Selection Handbook, 93-140. doi:10.1016/b978-0-08-099360-7.00004-5
[27] Cho, H., Kim, Y., Lee, E., Choi, D., Lee, Y., & Rhee, W. (2020). Basic Enhancement Strategies When Using Bayesian Optimization for Hyperparameter Tuning of Deep Neural Networks. IEEE Access, 8, 52588-52608. doi:10.1109/access.2020.2981072
[28] Yang, L., & Shami, A. (2020). On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing, 415, 295-316. doi:10.1016/j.neucom.2020.07.061
[29] Chan, S., & Treleaven, P. (2015). Continuous Model Selection for Large-Scale Recommender Systems. Handbook of Statistics: Big Data Analytics, 107-124. doi:10.1016/b978-0-444-63492-4.00005-8
[30] Menke, W. (2012). Nonlinear Inverse Problems. Geophysical Data Analysis: Discrete Inverse Theory, 163-188. doi:10.1016/b978-0-12-397160-9.00009-6
[31] Brereton, R. (2009). Steepest Ascent, Steepest Descent, and Gradient Methods. Comprehensive Chemometrics, 577-590. doi:10.1016/b978-044452701-1.00037-5
[32] Bergstra, J., Bengio, Y., 2012. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281-305. ISSN 1532-4435. http://dl.acm.org/citation.cfm?id=2188385.2188395
[33] Yu, T., & Zhu, H. (2020). Hyper-Parameter Optimization: A Review of Algorithms and Applications. https://arxiv.org/abs/2003.05689
[34] E. Hazan, A. Klivans, and Y. Yuan, Hyperparameter optimization: a spectral approach, arXiv preprint arXiv:1706.00764, (2017). https://arxiv.org/abs/1706.00764
[35] Hutter, F., Kotthoff, L., & Vanschoren, J. (2019). Automated Machine Learning: Methods, Systems, Challenges. Cham: Springer International Publishing.
[36] Seeger, M. (2004). Gaussian Processes for Machine Learning. International Journal of Neural Systems, 14(02), 69-106. doi:10.1142/s0129065704001899
[37] Hutter, F., Hoos, H.H., Leyton-Brown, K. (2011). Sequential Model-Based Optimization for General Algorithm Configuration. In: Coello, C.A.C. (ed.) Learning and Intelligent Optimization. LION 2011. Lecture Notes in Computer Science, vol 6683. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25566-3_40
[38] Bergstra, J., Bardenet, R., Bengio, Y., Kégl, B. (2011). Algorithms for hyper-parameter optimization. Adv Neural Inf Process Syst (NIPS) 24:2546-2554.
Polyvinyl Alcohol/Cassava Starch/Nano-CaCO3 based Nano-Biocomposite Films: Mechanical and Optical Properties

Eslem Kavas
Department of Polymer Materials Engineering, Bursa Technical University, Bursa, Turkey
eslemkavas123@gmail.com

Pinar Terzioglu*
Department of Polymer Materials Engineering, Bursa Technical University, Bursa, Turkey
pinar.terzioglu@btu.edu.tr

Abstract— This paper reports the preparation and characterization of nano-CaCO3 incorporated polyvinyl alcohol (PVA)/cassava starch biocomposite films. The films were prepared by the solution casting method using a glass mold. The effect of nano-CaCO3 content (1 and 2%) on the structural, mechanical and optical properties of the biocomposite films was investigated by Fourier transform infrared (FTIR) spectroscopy, universal testing machine, and ultraviolet-visible-near infrared (UV-VIS-NIR) spectroscopy analyses, respectively. The mechanical tests revealed that 1% nano-CaCO3 made no change to the tensile strength of the PVA/cassava starch films, while 2% nano-CaCO3 addition decreased the tensile strength of the films. The transparency of the films slightly increased with nano-filler addition.

Keywords— biocomposite, biobased polymers, calcium carbonate, nano filler

I. Introduction

Materials research in the packaging industry focuses on the use of biodegradable polymers to overcome the environmental problems caused by non-degradable plastics [1]. Among biodegradable polymers, PVA has been preferred in the commercial packaging industry owing to its excellent film-forming capability, oil resistance, and mechanical properties [2],[3]. However, blending PVA with different polymers seems to be a good solution to its relatively high price [2]. Due to its wide availability from natural sources, biodegradability, low cost, and non-toxicity, starch is usually used to blend with PVA [1],[2].

When PVA and starch, which have excellent compatibility with each other, are combined, the final materials present good physicomechanical features as well as cost competitiveness [4],[5]. However, investigations are ongoing to obtain polymer materials with enhanced characteristics. An efficient method of widening the functionality of PVA-starch materials is the incorporation of a small amount of a functional filler into the polymer matrix [6].

Nano-biocomposites are an important class of hybrid materials that can be obtained by incorporating a nano-sized filler (nanofiller) into a bio-based matrix [4]. The combined use of eco-friendly polymers and nano-sized fillers, in order to obtain synergistic effects, is considered one of the most innovative ways to improve the features of polymer materials [4],[7]. Nanofillers can be classified into three different types: nano-fibers, nano-particles, and nano-plates [8]. Among the nano-particles used for such applications, nano-calcium carbonate generally seems to be a good candidate, exhibiting unique properties such as cost-effectiveness, odorlessness, and high thermal stability [9],[10].

Previous studies that used nano-CaCO3 as a filler presented promising results in enhancing the mechanical [11],[12],[13],[14], oxygen barrier [12], and thermal properties [12] of hybrid nano-biocomposite materials. Sudhir et al. prepared rice starch/PVA/CaCO3 composite films by the solution casting method and investigated the effect of the calcium carbonate amount on the fire retardant, tensile strength, and thermal properties of the nanobiocomposite film [12]. It was reported that the thermal and tensile properties of the composites increased with increasing nano-CaCO3 loading, with the highest values reached at a CaCO3 concentration of 10 wt.%. Furthermore, the oxygen permeation of the composite films was reduced as the filler amount increased; therefore, the developed material was suggested for use in packaging applications. In another study, Fukuda et al. used calcium carbonate particles as inorganic fillers to enhance the properties of poly(l-lactide) (PLLA) composite films [13]. Higher Young's modulus values were obtained for the films incorporating 10 wt.% of CaCO3 particles compared with the pure PLLA film. Another work, by Sun et al. [14], characterized corn starch based films impregnated with nano-CaCO3 particles. The results showed that the tensile strength of the films increased with increasing CaCO3 content up to 0.06%; however, 0.1% and 0.5% loadings of CaCO3 led to a decrease in the tensile strength of the films. These studies showed that the properties of the final composites can vary, mostly due to the polymer matrix composition, the filler-matrix interaction, and the distribution of the filler.

To the best of our knowledge, the addition of nano-CaCO3 to PVA/cassava starch biocomposite films has not been reported to date. In the present work, we evaluate the effect of nano-CaCO3 content on the performance of PVA/cassava starch biocomposite films. The biocomposite films were compared in terms of their structural, mechanical, and optical properties.

II. Materials and Methods

A. Materials

Polyvinyl alcohol with an 87.16% degree of hydrolysis and 95.4% purity was obtained from Zag Kimya, Turkey. Cassava starch was purchased from Tito, Turkey. Calcium carbonate (CaCO3) nanopowder was obtained from Adaçal Industrial Minerals Company (Afyon, Turkey). Glycerol was purchased from Merck. Citric acid was obtained from Aksu Company, Turkey. Tween 80 was obtained from Sigma-Aldrich Company. The solutions were prepared using distilled water.

B. Preparation of the biocomposite films

The preparation of three different biocomposite films was carried out using the solution casting method (Figure I), according to the method of Terzioglu and Sıcak (2021) with some modifications [15]. The PVA powder (8% w/v) and
distilled water were mixed using a hotplate magnetic stirrer at 80 °C. Cassava starch (2% w/v) was gelatinized in distilled water. The two mixtures were combined and then stirred at 600 rpm and heated at 70 °C for 60 min. After the mixture cooled to 50 °C, citric acid solution (10% wt. of total polymer weight) was added and stirred for 30 minutes. Then, glycerol (20% wt. of total polymer weight) and Tween 80 were poured into the mixture with additional constant stirring for 15 minutes, respectively. Finally, nano-CaCO3 powder at concentrations of 1 and 2% (wt. of total polymer weight) was added to the PVA/cassava starch film-forming mixture and stirred for 30 minutes. 35 mL of the obtained film-forming mixture was poured into a 12 cm glass petri dish and dried at 40 °C for 24 hours. PVA/cassava starch without calcium carbonate was also prepared as a control film. The films prepared with 0, 1 and 2% nano-CaCO3 were named PSC-0, PSC-1 and PSC-2, respectively.

Mechanical analysis: The mechanical features of the samples were determined following the ASTM D 882 procedure using a Shimadzu AGS-X tester. Mechanical tests were repeated five times for each sample.

Ultraviolet-Visible-Near infrared spectroscopy analysis: The UV-VIS-NIR spectral analysis was performed using a UV-Vis spectrophotometer (Shimadzu UV-3600, Japan) with transmittance spectra in the range of 200-800 nm. The film samples were cut into 2.5 x 3.0 cm rectangles and placed in the spectrophotometer. The transparency of the biocomposite films was calculated according to the following equation:

Transparency = log(%T600) / y    (1)

where y is the thickness of the film (mm) and %T600 is the percent transmittance at 600 nm [16].
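As a quick illustration of equation (1), the following Python snippet computes the transparency for assumed values of %T600 and film thickness; the numbers are illustrative, not measurements from this study.

import math

def transparency(t600_percent, thickness_mm):
    # Equation (1): Transparency = log10(%T600) / y
    return math.log10(t600_percent) / thickness_mm

# Illustrative values: 85 % transmittance at 600 nm and a 0.28 mm thick
# film give ~6.9, the same order as the 6.65-7.47 reported below.
print(round(transparency(85.0, 0.28), 2))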
III. Results and Discussion

A. Structural Properties of the biocomposite films

FTIR spectroscopy was used to determine the interaction between the PVA, cassava starch, and nano-CaCO3. The FTIR spectra of the biocomposite films are presented in Figure II. The spectrum of PSC-0 displayed a characteristic band at 1712 cm-1 attributed to the C=O stretching vibration, which is typical for the ester bonds and carboxyl groups of citric acid [17]. In the infrared spectra of the films, the stretching vibrations of O-H groups appeared at 3285 cm-1 [18]. The peaks located at 1420, 1373 and 1324 cm-1 were related to CH2 bending, CH2 deformation and CH2 stretching vibrations, respectively [2],[19]. The peak at 1240 cm-1 represents C-H wagging vibrations [18], while the peak at 1085 cm-1 can be assigned to C-O stretching in C-O-H [18]. The band at 841 cm-1 occurred due to the rocking vibration of CH2 and the asymmetric stretching of C-O-C [20].

Figure II. FTIR spectra of biocomposite films.

B. Mechanical Properties of the biocomposite films

The mechanical properties of the films are given in Figure III. The calculated values of tensile strength, elongation at break, and Young's modulus of the biocomposite films are summarized in Table I. Tensile strength and elongation at break values varied in the ranges of 22.50±1.7-27.74±1.1 MPa and 186.75±17.3%-278.08±14.4%, respectively. The data showed that 1% nano-CaCO3 incorporation increased the elongation capacity and flexibility of the control films while leaving the tensile strength unchanged. However, the further
increase of nanofiller content from 1% to 2% caused a decrease in the elongation at break and tensile strength values. This is likely due to agglomeration of the nanofiller at higher concentrations [16]. Young's modulus of the films varied in the range of 38.28±2.0-82.30±5.3 MPa. Specifically, nano-CaCO3 at 1% led to a drop in the Young's modulus value, while an increase occurred as its content reached 2%.

The capability to deform is particularly favorable in some industries, including agriculture, cosmetics, and food packaging, for fabricating elastic and flexible products suited to their applications [22]. Therefore, when the elongation at break results are examined, the 1% nano-CaCO3 loaded sample can be suggested as the most flexible film among the three produced samples.

Table I. Mechanical properties of biocomposite films.

Sample | Tensile strength (MPa) | Elongation at break (%) | Young's modulus (MPa)
PSC-0 | 27.74±1.1 | 258.10±1.9 | 61.08±1.6
PSC-1 | 27.54±1.9 | 278.08±14.4 | 38.28±2.0
PSC-2 | 22.50±1.7 | 186.75±17.3 | 82.30±5.3

Figure III. Tensile strength, elongation at break and Young's modulus graphs of biocomposite films.

C. Optical Properties of the biocomposite films

The transparency of the films is a significant quality parameter, because foods tend to oxidize at 200-280 nm, which causes oxidative deterioration, discoloration, and off-flavor or rancidity [16]. Food packaging films are expected not only to have UV-proof properties but also to be transparent enough to visible light [16],[23]. Therefore, determining the transmission percentage and transparency of films is important.

The visual appearance of the biocomposite films is presented in Figure IV. The transparency of the films was found to be 6.65, 7.36 and 7.47 for PSC-0, PSC-1 and PSC-2, respectively. The nanofiller incorporation slightly increased the transparency of the films (Figure V). This result may be related to a reduction of the matrix crystallinity [24]. The transparency of the PVA/cassava starch film (PSC-0) was slightly lower than the transparency of the PVA/corn starch film (6.71) developed by Terzioğlu and Parın [17]. The addition of nano-CaCO3 to the PVA/cassava starch matrix resulted in a small change in the transmittance in the range of 200-300 nm. Additionally, the transmission rate of the biocomposite film notably increased in the visible region, especially with 2% nano-CaCO3 addition. Considerable transparency was still obtained (89~90% at 800 nm) for all developed PVA/starch biocomposite films.

Acknowledgment

This research is financially supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) with the 2209-B Undergraduate Research Projects Grant. The authors acknowledge Adaçal Industrial Minerals Company for supplying the calcium carbonate.
IV. Conclusion

Nano-CaCO3 filler incorporated PVA/cassava starch films were successfully prepared using a green route (the solution casting method). FTIR characterization results demonstrated a similar structural composition for all developed films, with the same functional groups; this fact reflects the physical interactions of the nanofiller and the polymer matrix. Nanofiller incorporation at 1% enhanced the elongation at break of the films. The transparency of the films increased with nanofiller incorporation at both concentrations. Further scientific research is still needed to find the best nano-CaCO3 incorporation content for this polymer matrix, to obtain nano-biocomposite films with better functional properties.

References

[1] Natwat Srikhao, Pornnapa Kasemsiri, Artjima Ounkaew, Narubeth Lorwanishpaisarn, Manunya Okhawilai, Uraiwan Pongsa, Salim Hiziroglu, Prinya Chindaprasirt, "Bioactive Nanocomposite Film Based on Cassava Starch/Polyvinyl Alcohol Containing Green Synthesized Silver Nanoparticles", Journal of Polymers and the Environment, Volume 29, 2021.
[2] Hairul Abral, Angga Hartono, Fadli Hafizulhaq, Dian Handayani, Eni Sugiarti, Obert Pradipta, "Characterization of PVA/cassava starch biocomposites fabricated with and without sonication using bacterial cellulose fiber loadings", Carbohydrate Polymers, Volume 206, 2019.
[3] Farah Fahma, Sugiarto, Titi Candra Sunarti, Sabrina Manora Indriyani, and Nurmalisa Lisdayana, "Thermoplastic Cassava Starch-PVA Composite Films with Cellulose Nanofibers from Oil Palm Empty Fruit Bunches as Reinforcement Agent", International Journal of Polymer Science, Volume 2017, 2017.
[4] Maria-Cristina Popescu, Bianca-Ioana Dogaru, Mirela Goanta, Daniel Timpu, "Structural and morphological evaluation of CNC reinforced PVA/Starch biodegradable films", International Journal of Biological Macromolecules, Volume 116, 2018.
[5] M. Lubis, A. Gana, S. Maysarah, M.H.S. Ginting, M.B. Harahap, "Production of bioplastic from jackfruit seed starch (Artocarpus heterophyllus) reinforced with microcrystalline cellulose from cocoa pod husk (Theobroma cacao L.) using glycerol as plasticizer", IOP Conference Series: Materials Science and Engineering, 309, 2018.
[6] Nataliya E. Kochkina, Olga A. Butikova, "Effect of fibrous TiO2 filler on the structural, mechanical, barrier and optical characteristics of biodegradable maize starch/PVA composite films", International Journal of Biological Macromolecules, Volume 139, 2019.
[7] Frédéric Chivrac, Eric Pollet, Patrice Dole, Luc Avérous, "Starch-based nano-biocomposites: Plasticizer impact on the montmorillonite exfoliation process", Carbohydrate Polymers, Volume 79, Issue 4, 2010.
[8] HS Mahadevaswamy, B Suresha, "Role of nano-CaCO3 on mechanical and thermal characteristics of pineapple fibre reinforced epoxy composites", Materials Today: Proceedings, Volume 22, Issue 3, 2020.
[9] Pınar Terzioğlu, "Electrospun Chitosan/Gelatin/Nano-CaCO3 Hybrid Nanofibers for Potential Tissue Engineering Applications", Journal of Natural Fibers, 2021. DOI: 10.1080/15440478.2020.1870639
[10] Kalyani Prusty, Sarat K Swain, "Nano CaCO3 imprinted starch hybrid polyethylhexylacrylate/polyvinylalcohol nanocomposite thin films", Carbohydrate Polymers, Volume 139, 2016.
[11] Carla Vilela, Carmen S.R. Freire, Paula A.A.P. Marques, Tito Trindade, Carlos Pascoal Neto, Pedro Fardim, "Synthesis and characterization of new CaCO3/cellulose nanocomposites prepared by controlled hydrolysis of dimethylcarbonate", Carbohydrate Polymers, Volume 79, Issue 4, 2010.
[12] Sudhir K. Kisku, Niladri Sarkar, Satyabrata Dash, Sarat K. Swain, "Preparation of Starch/PVA/CaCO3 Nanobiocomposite Films: Study of Fire Retardant, Thermal Resistant, Gas Barrier and Biodegradable Properties", Polymer-Plastics Technology and Engineering, Volume 53, Issue 16, 2014.
[13] Norio Fukuda, Hideto Tsuji, Yasushi Ohnishi, "Physical properties and enzymatic hydrolysis of poly(l-lactide)-CaCO3 composites", Polymer Degradation and Stability, Volume 78, 2002.
[14] Qingjie Sun, Tingting Xi, Ying Li, Liu Xiong, "Characterization of Corn Starch Films Reinforced with CaCO3 Nanoparticles", PLOS ONE, Volume 9, Issue 9, 2014.
[15] Pınar Terzioglu, Yusuf Sıcak, "Citrus Limon L. Peel Powder Incorporated Polyvinyl Alcohol/Corn Starch Antioxidant Active Films", Journal of the Institute of Science and Technology, Volume 11, Issue 2, 2021.
[16] Shaoxiang Lee, Meng Zhang, Guohui Wang, Wenqiao Meng, Xin Zhang, Dong Wang, Yue Zhou, Zhonghua Wang, "Characterization of polyvinyl alcohol/starch composite films incorporated with p-coumaric acid modified chitosan and chitosan nanoparticles: A comparative study", Carbohydrate Polymers, Volume 262, 2021.
[17] Pınar Terzioğlu, Fatma Nur Parın, "Polyvinyl Alcohol-Corn Starch-Lemon Peel Biocomposite Films as Potential Food Packaging", Celal Bayar University Journal of Science, Volume 16, Issue 4, 2021.
[18] Phetdaphat Boonsuk, Apinya Sukolrat, Kaewta Kaewtatip, Sirinya Chantarak, Antonios Kelarakis, Chiraphon Chaibundit, "Modified cassava starch/poly(vinyl alcohol) blend films plasticized by glycerol: Structure and properties", Volume 137, Issue 26, 2020.
[19] Priyanka Rani, M Basheer Ahamed, Kalim Deshmukh, "Dielectric and electromagnetic interference shielding properties of carbon black nanoparticles reinforced PVA/PEG blend nanocomposite films", Materials Research Express, Volume 7, 2020.
[20] Anida M.M. Gomes, Paloma L. da Silva, Carolina de L. e Moura, Claudio E.M. da Silva, Nágila M.P.S. Ricardo, "Study of the Mechanical and Biodegradable Properties of Cassava Starch/Chitosan/PVA Blends", Macromolecular Symposia, Volume 299-300, Issue 1, 2011.
[21] S. El-Sherbiny, S.M. El-Sheikh, A. Barhoum, "Preparation and modification of nano calcium carbonate filler from waste marble dust and commercial limestone for papermaking wet end application", Powder Technology, Volume 279, 2015.
[22] H.P.S. Abdul Khalil, E.W.N. Chong, F.A.T. Owolabi, M. Asniza, Y.Y. Tye, H.A. Tajarudin, M.T. Paridah, S. Rizal, "Microbial-induced CaCO3 filled seaweed-based film for green plasticulture application", Journal of Cleaner Production, Volume 199, 2018.
[23] Azam Akhavan, Farah Khoylou, Ebrahim Ataeivarjovi, "Preparation and characterization of gamma irradiated Starch/PVA/ZnO nanocomposite films", Radiation Physics and Chemistry, Volume 138, 2017.
[24] Vincenzo Titone, Francesco Paolo La Mantia, Maria Chiara Mistretta, "The Effect of Calcium Carbonate on the Photo-Oxidative Behavior of Poly(butylene adipate-co-terephthalate)", Macromolecular Materials and Engineering, Volume 305, Issue 10, 2020.
Movie Reviews Text Sentiment Analysis Based On Hybrid LSTM and GloVe

Nour Ammar
Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey
noor101ammar@gmail.com

Ali Okatan
Department of Software Engineering, Istanbul Aydin University, Istanbul, Turkey
aliokatan@aydin.edu.tr

Abstract— Social media has become a useful resource for critiques and ratings. The mining of vital information that allows governments to maintain public safety and organizations to grow their income has gradually advanced. However, the performance of current techniques requires continuous improvement due to the exponential growth of information. In this paper, the proposed structure is the LSTM neural network, an advanced variant of the RNN. Verbal language information is utilized by converting it into numerical information based on the GloVe dictionary. The approach, applied to natural language processing, achieved acceptable results and delivered reliable predictions at high speed.

Keywords— NLP, LSTM, RNN, GloVe

I. Introduction

Recently, blogs, websites, and social media have come to be considered a powerful means of collecting product reviews. The proposed method achieves maximum benefit at minimum cost and in minimum time by analyzing users' opinions in positive and negative reviews using Long Short-Term Memory (LSTM) algorithms. The required data can be extracted by technical methods such as the one proposed here.

Recurrent Neural Networks (RNNs), introduced by [14], provide high performance in speech processing, speech recognition, speech translation, stock prediction, and the semantic analysis used in this study. The LSTM neural network is a modified architecture of the traditional RNN and is more accurate [1]. The change from RNN to LSTM is the constant backpropagated error flow in the gradients processed in the LSTM network [13]. An LSTM layer consists of a set of recurrently connected blocks called memory blocks. LSTM can learn to bridge time lags of more than 1000 discrete time steps by enforcing a constant error flow through Constant Error Carrousels (CECs) [13]. In the presented approach, a model consisting of two LSTM layers, within a total of seven layers, is used. The results were fairly high and satisfying; accuracy is high for both training and testing, as is the speed of testing, as detailed in Section IV.

II. Related Work

In [1], a text sentiment analysis based on the LSTM model was presented to analyse human emotions. The training data is classified into three categories (negative, positive, and neutral) according to emotion, and then fitted into LSTM models trained for each data category, resulting in multiple LSTM models for the corresponding emotional ratings. The accuracy in [1] was higher than that of traditional RNNs. Another approach is followed in [2], where two models are used together: a Convolutional Neural Network (CNN) with an RNN-LSTM. The CNN model is used to extract local features, while the LSTM model captures long-distance dependencies; the extracted features are combined into a single hybrid CNN-LSTM model. Efficient results were obtained in [2].

Each memory block contains at least one recurrently connected memory cell and three gates, namely the input, output, and forget gates. The gates are multiplicative units and enable continuous write, read, and reset operations. The network can only interact with the cells through the gates [2].

In [3], a solution to the weakness of LSTM networks in processing continuous input streams was proposed via the "learn to forget" algorithm, which refers to a novel adaptive "forget gate". The forget gate allows an LSTM cell to learn to reset itself at appropriate times. The approach proposed in that study is very similar to the approach of [2].

All the above approaches refer back to [6], the approach presented in 1997 by Hochreiter & Schmidhuber, who were the first to define the LSTM as a special kind of RNN with the ability to learn long-term dependencies.

III. Overview, Methods and Tools

The RNN-LSTM model was trained on the IMDB movie reviews dataset on top of the GloVe word embedding dictionary.

A. GloVe word embedding dictionary

GloVe [5] is a word vectorization technique that embeds words into a concise vector space where similar words are found close to each other in clusters, while different words are found far away from each other, as shown in Figure I. GloVe embedding is preferable to Word2vec embedding because GloVe relies on local statistics (the local context information of words) while also incorporating global statistics (word co-occurrence) to obtain word vectors. The dimension of the data is 200. Figure I shows the 3D graph of the embedded words after conversion to sequences, projected by PCA using the TensorBoard tool provided by TensorFlow.

Figure I. 3D PCA diagram of GloVe embedding sequences, rendered with TensorBoard.
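A common way to use such a dictionary, sketched below under the assumption of the standard 200-dimensional GloVe text file, is to read the vectors and build an embedding matrix indexed by the tokenizer's word indices; the file name and helper names are illustrative, not taken from this paper.

import numpy as np

GLOVE_PATH = "glove.6B.200d.txt"   # assumed pre-trained 200-d vectors
EMBED_DIM = 200

def load_glove(path):
    # GloVe text format: one token followed by its vector per line.
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

def embedding_matrix(word_index, vectors):
    # Row i holds the GloVe vector of the word with index i (zeros if unknown).
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM))
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = vectors[word]
    return matrix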
B. RNN-LSTM architecture

Recurrent neural networks (RNNs) [14] are suited to sequences, lists, and continuously sampled data. There have been remarkable successes in applying RNNs to a variety of problems: speech recognition, language modeling, translation, image labeling, and more. LSTMs, introduced in [6] by Hochreiter & Schmidhuber, are a special form of RNN that avoids the long-term dependency problem of traditional RNNs; the LSTM utilizes the tanh activation (1). Moreover, the LSTM model contains four neural network layers, as shown in Figure II, which makes the model more robust against problems such as the vanishing gradient and reduces long dependencies by selecting only essential information from previous cells.

T(x) = (e^x - e^{-x}) / (e^x + e^{-x})    (1)

Figure II. LSTM block showing four interacting layers, gates, and operations.

Figure II clarifies the process of the LSTM block's gates (input, forget, output, and cell update) and the activation functions of the gates, as shown in (2):

(i, f, o, C̃_t) = (σ, σ, σ, tanh) W [h_{t-1}, x_t]    (2)

The operation flow in the LSTM block starts with forgetting irrelevant past information using (3). Next, the new information to be stored is identified: the sigmoid layer in (4) decides which values to update, and the tanh layer in (5) generates a new vector of values that could be added to the state, called "candidate values". The forget operation is then applied to the previous internal cell state by (6), and the new candidate values, scaled by how much each value is to be updated, are added by (7). These two operations are summed in (8), which updates the cell state. After updating the state, the last sigmoid layer (9) decides which parts of the state to output. Finally, a filtered cell state (10) is output, using the tanh layer (1) to squash values between -1 and 1. h_t is at the same time an input to the next cell in the continuous LSTM block chain [6, 7, 16, 17]. (A NumPy sketch of these gate equations appears at the end of this subsection.)

f_t = σ(W_f [h_{t-1}, x_t] + b_f)    (3)
i_t = σ(W_i [h_{t-1}, x_t] + b_i)    (4)
C̃_t = tanh(W_c [h_{t-1}, x_t] + b_c)    (5)
f_t × C_{t-1}    (6)
i_t × C̃_t    (7)
C_t = f_t × C_{t-1} + i_t × C̃_t    (8)
o_t = σ(W_o [h_{t-1}, x_t] + b_o)    (9)
h_t = o_t × tanh(C_t)    (10)

The project was implemented in Python using the TensorFlow platform, a high-performance computation platform.

C. Deep Neural Network Architecture

Figure II shows the basic network structure; the final architecture is given in Figure III. The proposed model contains seven layers, including both the input and the output. To avoid overfitting and improve performance, a dropout process is used: the input and recurrent connections to the LSTM units are excluded from activation and weight updates during network training [8]. A dense layer is used as the output layer, with an activation function, to improve the performance of the approach [9].

1) Layers:
1. A sequence of words with dimension 256x1 is used as input to the network.
2. The second layer multiplies the embedding matrix by the sequences. The output dimension is 256x200.
3. The first LSTM layer has 128 features (neurons). The output dimension is 256x128.
4. The first dropout layer; input and output dimensions are the same, and it is used to reduce overfitting.
5. The second LSTM layer has 32 features, with dimension 32x1; the embedded sequence features are reduced from 200 gradations to 32.
6. The second dropout layer; input and output dimensions are the same, and it is used to reduce overfitting.
7. The output layer is a dense layer with dimension 1x1, giving a positive or negative weighting.

Figure III. Architecture of neural network layers showing the number of inputs and outputs in each layer.
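For concreteness, the gate equations (3)-(10) can be written as a single NumPy step; this is a minimal sketch with illustrative weight containers, not the TensorFlow implementation used in the study.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One LSTM step following equations (3)-(10). Each W[k] maps the
    # concatenated [h_{t-1}, x_t] to one gate's pre-activation.
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ z + b["f"])         # (3) forget gate
    i = sigmoid(W["i"] @ z + b["i"])         # (4) input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # (5) candidate values
    c = f * c_prev + i * c_tilde             # (6)-(8) cell state update
    o = sigmoid(W["o"] @ z + b["o"])         # (9) output gate
    h = o * np.tanh(c)                       # (10) filtered cell state
    return h, c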
2) Functions:

a) Activation function: Many activation functions could be used with the proposed model; the sigmoid activation function is chosen in this case. Figure IV shows the sigmoid function, represented by (11):

σ(x) = 1 / (1 + e^{-x})    (11)

Figure IV. The sigmoid function.

b) Optimization function: The optimization function used in this study is Adam optimization, introduced by [11]. It is an adaptive learning-rate method: it computes individual learning rates for different epochs. Adam stands for adaptive moment estimation; it uses estimates of the first and second moments of the gradient to adapt the learning rate for each weight of the neural network. The nth moment of a random variable is the expected value of that variable to the power of n.

θ_{t+1} = θ_t - (η / (√v̂_t + ε)) m̂_t    (12)

c) Loss function: The mean squared error (MSE) is used as the fitness function to minimize the error. It calculates the mean of the squared differences between the predicted and actual values, and training minimizes this loss value [12].

MSE = (1/n) Σ_{i=1}^{n} (y_i - ỹ_i)²    (13)

3) Data: IMDb is a popular movie website. It combines movie plot descriptions, Metascore ratings, critic and user ratings and reviews, release dates, and many other aspects. The system is trained using the IMDB dataset, a CSV file containing 50K textual movie reviews with high polarity, suitable for semantic analysis and Natural Language Processing (NLP). The data fields are the review and its label, which is either 0 or 1, where 0 is negative and 1 is positive, as shown in Table I. This dataset is freely available [4]. The data is preprocessed and tokenized using the nltk and textblob libraries.

Table I. Sample of the IMDB dataset.

4) Training and Testing: In this study, the dataset was divided into three parts: 70% for training, 20% for testing, and 10% for validation. The network is trained on top of the GloVe embedding dictionary [5]. The training time is hardware dependent. The training time and performance of the model were improved by applying five basic measures:

- Adding and discarding layers of the model, such as replacing the output layer with a dense layer.
- Using the TensorFlow platform, which is suitable for high-performance computations.
- Use of the sigmoid activation function.
- Use of the Adam optimization function.
- Use of the MSE loss function.

IV. Experiments, Analysis and Performance

The model was implemented using the open-source framework TensorFlow. Training was performed on a PC with an SSD drive, 16 GB of RAM, and an Intel Core i7-10500 CPU.

A. Platform

Training the network took about 20 minutes in the TensorFlow environment. The flow of tensors (the computations of the model) can be viewed using the TensorBoard tool provided by TensorFlow, as shown in Figure V. Each container (tensor) is a constant, vector, or higher-dimensional data, and each arrow (flow) is a directed flow of computations.
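Combining the seven-layer stack of Section III.C with the sigmoid activation, Adam optimizer, and MSE loss described above yields a model along the following lines; this Keras-style sketch makes assumptions (vocabulary size, dropout rate) that the paper does not state.

import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN, VOCAB, EMBED_DIM = 256, 20000, 200   # VOCAB size is an assumption

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),            # 1. word-index sequence, 256x1
    layers.Embedding(VOCAB, EMBED_DIM),        # 2. embedding lookup, 256x200
    layers.LSTM(128, return_sequences=True),   # 3. first LSTM layer, 256x128
    layers.Dropout(0.5),                       # 4. dropout against overfitting
    layers.LSTM(32),                           # 5. second LSTM layer, 32 features
    layers.Dropout(0.5),                       # 6. second dropout layer
    layers.Dense(1, activation="sigmoid"),     # 7. positive/negative output
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])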
Figure V. TensorBoard graph showing the computation flow of the model.

B. Results

Table II shows the values of the confusion matrix of the model training; as can be seen, the correct predictions outnumber the incorrect ones. Table III shows the ratios of the training, testing, and validation results. The accuracy of the model is 87.1%.

Table II. Confusion matrix of the trained model

 | positive | negative
positive | 4647 | 942
negative | 346 | 4065

Figure VI and Figure VII show the plots of accuracy and loss during the training and testing process. The training and testing curves rise in the accuracy plot, while in the loss plot they decrease, which confirms the good performance of the model.

Table III. Values of training, testing, and validation ratios

 | training | testing | validation
Accuracy | 86.24% | 87.12% | 87.19%
Loss | 9.88% | 11.25% | 10.09%
Score | - | 10.30% | 10.21%
F1 score | - | 86.32% | -

Figure VI. Accuracy of the model during training and testing over 14 epochs.
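As a consistency check on Table II, the accuracy can be recomputed from the four cells; mapping the cells to TP/FN/FP/TN as below (an assumption about the table's orientation) reproduces the 87.12% testing accuracy of Table III, while the F1 value depends on the averaging convention used.

# Cell values from Table II; row = actual class, column = predicted class
# is our assumption about the table's orientation.
tp, fn = 4647, 942
fp, tn = 346, 4065

total = tp + fn + fp + tn                 # 10000 test reviews
accuracy = (tp + tn) / total              # 0.8712 -> 87.12 %, as in Table III
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")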
Figure VII. Loss of the model during training and testing over 14 epochs.

Figure VIII. Example of a sample sequence in the test dataset, predicted with its prediction probability.

Figure VIII shows a selected normalized data sample and the probability of its prediction, which is a true prediction. Figure IX shows a practical example performed by the user: the executed program prompts the user to input a sentence, then the system preprocesses the input and predicts within milliseconds.

Figure IX. Practical example of testing the model.

V. Conclusion

The results show that the use of the LSTM improved both speed and accuracy. The GloVe 200-D dictionary was embedded in the TensorFlow environment. TensorFlow has been very successful in handling high-speed computations. It is expected that the model presented in this study can be modified for real-time emotion recognition in speech.

References

[1] Alex Graves, Jürgen Schmidhuber, Framewise phoneme classification with bidirectional LSTM and other neural network architectures, Neural Networks, Volume 18, Issues 5-6, 2005, Pages 602-610, ISSN 0893-6080. https://doi.org/10.1016/j.neunet.2005.06.042
[2] Rehman, A.U., Malik, A.K., Raza, B. et al. A Hybrid CNN-LSTM Model for Improving Accuracy of Movie Reviews Sentiment Analysis. Multimed Tools Appl 78, 26597-26613 (2019). https://doi.org/10.1007/s11042-019-07788-7
[3] Felix A. Gers, Jürgen Schmidhuber, Fred Cummins; Learning to Forget: Continual Prediction with LSTM. Neural Comput 2000; 12 (10): 2451-2471. doi:https://doi.org/10.1162/089976600300015015
[4] Leon, Stefano, 2020, IMDb movies extensive dataset. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
[5] Pennington, Jeffrey & Socher, Richard & Manning, Christopher. (2014). GloVe: Global Vectors for Word Representation. EMNLP. 14. 1532-1543. 10.3115/v1/D14-1162.
[6] Hochreiter, Sepp & Schmidhuber, Jürgen. (1997). Long Short-term Memory. Neural Computation. 9. 1735-80. 10.1162/neco.1997.9.8.1735.
[7] Karpathy, A., Johnson, J., & Fei-Fei, L. (2015). Visualizing and understanding recurrent networks. arXiv preprint. https://arxiv.org/abs/1506.02078
[8] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
[9] Heaton, J. Applications of Deep Neural Networks. Updated regularly; last update: 21 Jan 2021 (v2). https://arxiv.org/abs/2009.05673
[10] Han, J., Moraga, C. (1995). The influence of the sigmoid function parameters on the speed of backpropagation learning. Lecture Notes in Computer Science, vol 930. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59497-3_175
[11] Kingma, D. P., Ba, J. (2014). Adam: A method for stochastic optimization. https://arxiv.org/abs/1412.6980
[12] Alejo, R., García, V. & Pacheco-Sánchez, J.H. An Efficient Over-sampling Approach Based on Mean Square Error Back-propagation for Dealing with the Multi-class Imbalance Problem. Neural Process Lett 42, 603-617 (2015). https://doi.org/10.1007/s11063-014-9376-3
[13] Ravi, Vinayakumar & Kp, Soman & Poornachandran, Prabaharan. (2017). Evaluation of Recurrent Neural Network and its Variants for Intrusion Detection System (IDS). International Journal of Information System Modeling and Design. 8. 43-63. https://doi.org/10.4018/IJISMD.2017070103
[14] Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. Learning internal representations by error propagation. California Univ San Diego La Jolla Inst for Cognitive Science, 1985.
[15] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. https://arxiv.org/abs/1301.3781
[16] Li, F., Johnson, J., Yeung, S. (2017). Lecture 10: Recurrent Neural Networks [PowerPoint slides]. Stanford University School of Engineering, Convolutional Neural Networks for Visual Recognition, CS231n. http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture10.pdf
[17] Jiang, C., Chen, Y., Chen, S., Bo, Y., Li, W., Tian, W., & Guo, J. (2019). A Mixed Deep Recurrent Neural Network for MEMS Gyroscope Noise Suppressing. Electronics, 8(2), 181. MDPI AG. http://dx.doi.org/10.3390/electronics8020181
Simulation Comparative Study to Highlight the Relation Between Building Form and Energy Consumption

Omar ALGBURI
Department of Architecture Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
omar.algburi@izu.edu.tr

Bahar FERAH
Department of Architecture Engineering, Istanbul Sabahattin Zaim University, Istanbul, Turkey
baharak.fareghi@izu.edu.tr

Abstract— Building construction is one of the causes of climate change. Architects who design energy-efficient and sustainable buildings lead the way in establishing solutions to climate change. Decisions concerning building form design and their impact on energy efficiency are the main motive behind this work. This study examines the role of building form in the energy consumption of two proposed fully glazed office buildings, a cubic one (rectangular plan layout) and a cylinder one (circular plan layout). AutoCAD software was used to draw the plan layouts; the two proposed forms were then modeled and thermally analyzed using DesignBuilder software. The results show that the energy consumption of a fully glazed facade building strongly depends on its solar heat gain through exterior windows, building orientation, and the amount of glazing shading. Furthermore, among the studied factors, building form plays the most critical role in determining a building's energy consumption. Finally, suggestions are presented to raise architects' awareness of designing energy-efficient building forms.

I. Introduction

Is there a significant and direct effect of building form on energy consumption? This is the central question behind this study. In the first stage of designing a building (the conception stage), engineers and architects face numerous economic, social, technical, ecological, and aesthetic constraints. Indeed, designing a building is a complicated task, and making it energy efficient even more so. Moreover, the concern for pollution reduction, fighting climate change, and energy savings must remain among the principles designers observe. When designers develop their project's concept in this first stage, they need operative knowledge of energy-efficient design principles.

The form and size of a building have a significant influence on energy consumption [1],[2],[3]. As stated in the study of Kocagil et al. [4], the building form and the envelope complexity directly impact the total heat loss and gain, and consequently the energy consumption. The proportion between the building's thermal envelope area (A) and its volume (V) is known as the A/V ratio. The form factor of a building relies on this proportion, which measures the compactness of the building. The thermal envelope represents the separation between the outdoor (unconditioned) and indoor (conditioned) environments. The value of this form factor characterizes the form of the building for a given volume. The study of Danielski et al. (2012) explained the value of the form factor as illustrated in Figure I. In this figure, the parameter 'a' symbolizes the unit of form length. Although all the buildings (A, B, C, D) have similar volumes, they have different form factors because of their different thermal envelope areas [2]. The form factor also depends on the building size: building (C) is larger but has a lower form factor than building (A), although they have the same form. Irregular forms with open balconies that extend beyond the facade may also increase the form factor, as illustrated by building D in Figure I.

Figure I. Different sizes and forms with the form factor of each building [2].

The concept of the building form factor is linked to heat losses and gains in buildings, and hence to energy consumption [3]. Buildings with a larger envelope surface area in proportion to their volume will have a higher form factor, maximizing heat losses. In contrast, buildings with lower form factors have a lower specific heat demand [5].

Many studies have addressed the building geometry and form factors that affect the energy consumption of buildings [6][7][8]. The study in [8] proposed a methodology to determine building form in relation to the opaque component U-value, represented by the A/V ratio. A survey conducted by Joelsson et al. (2012) showed the vital impact of the form factor on final energy demand in residential buildings [2]. Another study [9] shows that space boundaries and building geometry significantly influence energy consumption factors. In addition, in the study by Al-Anzi et al. [6], different office building forms were developed to study each form's energy demand. Many studies have used the energy simulation method for optimizing the building form in terms of its energy consumption [10][11][12][13][14][15][16]. However, many factors can impact the energy consumption of a building. For example, the ASHRAE 90.1 standard of the American Society of Heating, Refrigerating, and Air-Conditioning
Engineers [17] identifies five main factors that impact the energy consumption of a building: (1) mechanical systems; (2) building envelope; (3) power generation systems; (4) water heating; and (5) lighting systems. The role of the designer here is to manage these factors and design energy-efficient buildings. Other factors such as the window-to-wall ratio [18], glazing type [19][20], solar heat gain coefficient (SHGC) [21][22], thermal insulation of materials [23][24], sun shading [25], and surface coloring [26] also have a significant effect on the energy consumption of buildings.

A. Energy consumption in buildings

To understand energy consumption in buildings, we should first understand how a building uses energy. According to many studies, most of the energy consumed in buildings is essentially used to provide an acceptable level of thermal comfort for users: cooling or heating the indoor air using air conditioning units, or providing fresh air by ventilation. Other energy uses are domestic hot water, artificial light, household appliances, and other electrical equipment (refrigerators, computers, TVs, etc.). Lighting, an essential part of every indoor environment, often consumes less energy than other electric appliances.

Heating or cooling anything (air, water, or food) is an energy-intensive process, and as such, appliances related to heating or cooling are often energy-intensive. Air conditioners and space heating appliances have substantial power requirements, as do cooking ovens, toasters, and microwaves. However, devices like the blender, steam iron, and toaster are only used for short intervals, which reduces their energy use compared to appliances like the air conditioner, which runs for a much longer duration. The largest domestic energy requirement in a building is space heating and cooling. According to [24], wisely choosing the wall and window materials and the architecture of the enclosed space can significantly reduce the energy required for space heating and cooling. Heat energy from the outside air passes through the walls and windows to the inside of the building; the energy is transferred by vibrations of neighboring molecules in the walls and windows, which leads to an increase in the indoor temperature. This mechanism of heat transfer is called conduction, and the power transferred through it, denoted by Q, is given by the following equation (1):

Q = U · A · ΔT    (1)

where
Q: heat transferred, in W
U: overall heat transfer coefficient, in W/m2·°C
A: heat transfer area, in m2
ΔT: temperature difference across the wall surfaces, in °C
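A small illustration of equation (1): the snippet below computes the conducted power for an assumed single-glazed window; the U-value, area, and temperature difference are illustrative, not values from this study.

def conduction_power(u, area, delta_t):
    # Equation (1): Q = U * A * dT, result in watts.
    return u * area * delta_t

# Illustrative only: single glazing at U = 5.8 W/m2.C, 10 m2 of glass,
# and an 8 C indoor/outdoor difference conduct 464 W.
print(conduction_power(5.8, 10.0, 8.0))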
B. Energy modeling and simulation

Energy modeling is the process of computerizing the parameters of a building in order to perform an energy simulation, while energy simulation predicts building energy performance using software analysis [27].

According to [27], building parameters such as orientation, thermal properties, and envelope properties involve computational factors that make energy simulation a complex process. Energy simulation exists mainly to help the designer make the right decisions in the early design phase. According to many recent studies, evaluating and analyzing building energy performance leads to energy-efficient design; design improvements or alternatives can then be established by the design team to enable energy savings. However, building different form models in order to identify energy-saving alternatives is often not carried out correctly. In addition, adapting simulation models (i.e., input data) usually ends in various coding errors, as it involves manual or semi-manual data representation [28]. A gradual simulation method of a confirmed system, with a validation methodology for the building energy analysis, should be applied by an expert team [29].

II. Research Methodology

The energy simulation method, using DesignBuilder software, was applied to test the impact of building form on the reduction rate of energy consumption. Energy simulation analyses were performed to evaluate the two proposed building forms, cylinder (circular plan) and cubic (square plan), in terms of energy consumption. Two-dimensional modeling of the buildings in AutoCAD was the first step; importing the DXF models of the two design alternatives into the DesignBuilder simulation software was the second step.

The simulation method is a sequence of three related phases, as shown in Figure II.

Figure II. A diagram showing the steps of the simulation methodology.

This study used a comparative thermal analysis between two different forms of office building, cubic and cylinder, as shown in Figure III. For comparison purposes, the two forms were proposed with the same floor area and the same volume, to isolate the impact of the building form on its energy consumption. Both buildings have a central open courtyard to enable access to natural light and ventilation. The two proposed buildings are located in Istanbul, Turkey. The height was also the same for both, with four typical floors of 3.4 m each; the total height is considered to be 15 m. The cubic building plan dimensions are 64 m x 64 m, and its total ground floor area is 4096 m2. Approximately the same ground floor area was applied to the cylinder form (circular plan), with a circle radius of 36 m. The area of a circle is pi times
the radius squared (A = π r²). Therefore, the total ground
floor area was calculated as (36*36*3.14) =4069 m2.
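As a quick check of the stated equivalence of the two plans, the following snippet reproduces the two ground-floor areas from the dimensions given above (a 64 m × 64 m square versus a 36 m radius circle, with π approximated as 3.14, as in the text):

import math

side = 64.0       # cubic form plan dimension (m)
radius = 36.0     # cylindrical form plan radius (m)

square_area = side * side             # 4096 m^2
circle_area = 3.14 * radius ** 2      # 4069.44 m^2, as computed in the text
exact_area = math.pi * radius ** 2    # 4071.50 m^2 with full precision

print(square_area, round(circle_area, 2), round(exact_area, 2))
# The two plans differ by well under 1%, so the forms are comparable in floor area.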
Before running the simulation, the parameters of the energy simulation process had to be identified. The comparative simulation focused on cooling energy consumption during a specific period, the peak cooling load from 21 June to 21 September, taken as a typical summer season. Indoor operative temperature and solar heat gain through the exterior windows were selected to determine the total cooling energy consumption during the summer season. These parameters have an essential influence on the average indoor operative temperature, the overall cooling load, and the energy consumption. Internal heat gain through the walls and roof was ignored.

The construction template used for both buildings in this study was "Turkiye - medium weight", one of a range of ready-to-use construction templates; it provided the same data on facade partitions, walls, roofs, and airtightness. The glazing template was also the same for both buildings: the external window glazing type was project external glazing, and the layout was a horizontal strip with a 90% window-to-wall ratio. For the energy simulation, the models were designed to be mechanically ventilated by air conditioning (4-pipe fan coil units), so that the energy-consumption reduction rate could be compared. The air change rate per hour (ac/h) with no fresh air, at the operating pressure (Pa), was left for the DesignBuilder program to calculate automatically, as 1.7 ac/h at 50 Pa; DesignBuilder calculates the air flow rate in accordance with ASHRAE standards.

Figure V. 3D plan of the cylindrical building, modeled by the author using DesignBuilder software.
III. Results and Discussion
The glazing facade is considered responsible for a large share of the energy consumption in buildings, because glass exchanges considerably more heat than the other building envelope elements. As mentioned earlier in the section on energy consumption in buildings, lowering the overall heat transfer coefficient (U, in W/m²·°C) of the building envelope can significantly reduce energy consumption.

The proposed buildings in this study are fully glazed with vertical solar shading elements, as shown in Figure III. Therefore, internal heat gain through the walls and roof was ignored. Solar heat gain through the exterior windows is considered one of the main factors influencing the cooling load and, consequently, the indoor operative temperature.

Figure III. 3D view of both the cubic and cylindrical shapes, modeled by the author using DesignBuilder software.
As shown in Figure VI, the solar heat gain through the cubic form's exterior windows during the peak period in July (43,797.52 kWh) has a significant impact on the zone sensible cooling load, recorded as −60,617.03 kWh. In comparison, as shown in Figure VII, the solar heat gain through the exterior windows of the cylindrical form during the same peak period in July was 38,198.14 kWh, leading to a zone sensible cooling load of −5,288.91 kWh.
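The relative differences implied by the reported peak-July values can be computed directly; the snippet below only restates the numbers given above.

# Peak-July values reported above for the two forms (kWh).
solar_gain_cubic, solar_gain_cyl = 43797.52, 38198.14
cooling_cubic, cooling_cyl = 60617.03, 5288.91  # zone sensible cooling load magnitudes

gain_reduction = (solar_gain_cubic - solar_gain_cyl) / solar_gain_cubic * 100
load_reduction = (cooling_cubic - cooling_cyl) / cooling_cubic * 100
print(f"solar heat gain reduction (cubic -> cylinder): {gain_reduction:.1f}%")  # ~12.8%
print(f"sensible cooling load reduction:               {load_reduction:.1f}%")  # ~91.3%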
References

[1] C. Hachem, A. Athienitis, and P. Fazio, "Parametric investigation of geometric form effects on solar potential of housing units," Sol. Energy, vol. 85, no. 9, pp. 1864–1877, Sep. 2011, doi: 10.1016/j.solener.2011.04.027.
[2] A. Joelsson et al., "The impact of the form factor on final energy demand in residential buildings in Nordic climates," 2012.
[3] T. Catalina, J. Virgone, and V. Iordache, "Study on the impact of the building form on the energy consumption," 2011.
[4] I. E. Kocagil and G. K. Oral, "The effect of building form and settlement texture on energy efficiency for hot dry climate zone in Turkey," in Energy Procedia, Nov. 2015, vol. 78, pp. 1835–1840, doi: 10.1016/j.egypro.2015.11.325.
[5] R. Fallahtafti and M. Mahdavinejad, "Optimisation of building form and orientation for better energy efficient architecture," Int. J. Energy Sect. Manag., vol. 9, no. 4, pp. 593–618, Nov. 2015, doi: 10.1108/IJESM-09-2014-0001.
[6] A. AlAnzi, D. Seo, and M. Krarti, "Impact of building form on thermal performance of office buildings in Kuwait," Energy Convers. Manag., vol. 50, no. 3, pp. 822–828, Mar. 2009, doi: 10.1016/j.enconman.2008.09.033.
[7] B. Bektas Ekici and U. T. Aksoy, "Prediction of building energy needs in early stage of design by using ANFIS," Expert Syst. Appl., vol. 38, no. 5, pp. 5352–5358, May 2011, doi: 10.1016/j.eswa.2010.10.021.
[8] G. K. Oral and Z. Yilmaz, "Building form for cold climatic zones related to building envelope from heating energy conservation point of view," Energy Build., vol. 35, no. 4, pp. 383–388, May 2003, doi: 10.1016/S0378-7788(02)00111-1.
[9] V. Bazjanac and L. Berkeley, "Space boundary requirements for modeling of building geometry for energy and other performance simulation," in Proceedings of the CIB W78 2010: 27th International Conference, Nov. 2010.
[10] B. Raof, "The correlation between building form and building energy performance," Int. J. Adv. Res., vol. 5, no. 5, pp. 552–561, May 2017, doi: 10.21474/ijar01/4145.
[11] S. Pathirana, A. Rodrigo, and R. Halwatura, "Effect of building form, orientation, window to wall ratios and zones on energy efficiency and thermal comfort of naturally ventilated houses in tropical climate," Int. J. Energy Environ. Eng., vol. 10, no. 3, pp. 107–120, 2019, doi: 10.1007/s40095-018-0295-3.
[12] P. McKeen and A. S. Fung, "The effect of building aspect ratio on energy efficiency: A case study for multi-unit residential buildings in Canada," Buildings, vol. 4, no. 3, pp. 336–354, 2014, doi: 10.3390/buildings4030336.
[13] M. Premrov, M. Žigart, and V. Žegarac Leskovar, "Influence of the building form on the energy performance of timber-glass buildings located in warm climatic regions," Energy, vol. 149, pp. 496–504, Apr. 2018, doi: 10.1016/j.energy.2018.02.074.
[14] A. Al-Saggaf, H. Nasir, and M. Taha, "Quantitative approach for evaluating the building design features impact on cooling energy consumption in hot climates," Energy Build., vol. 211, p. 109802, Mar. 2020, doi: 10.1016/j.enbuild.2020.109802.
[15] M. Mokrzecka, "Influence of building form and orientation on heating demand: simulations for student dormitories in temperate climate conditions," doi: 10.1051/e3sconf/20184400117.
[16] A. Zhang, R. Bokel, A. van den Dobbelsteen, Y. Sun, Q. Huang, and Q. Zhang, "The effect of geometry parameters on energy and thermal performance of school buildings in cold climates of China," Sustain., vol. 9, no. 10, p. 1708, Sep. 2017, doi: 10.3390/su9101708.
[17] ASHRAE/IESNA, "ANSI/ASHRAE/IESNA Addenda to ANSI/ASHRAE/IESNA Standard 90.1-2007," 2009.
[18] C. Marino, A. Nucara, and M. Pietrafesa, "Does window-to-wall ratio have a significant effect on the energy consumption of buildings? A parametric analysis in Italian climate conditions," J. Build. Eng., vol. 13, pp. 169–183, Sep. 2017, doi: 10.1016/j.jobe.2017.08.001.
[19] A. R. AbouElhamd, K. A. Al-Sallal, and A. Hassan, "Review of core/shell quantum dots technology integrated into building's glazing," Energies, vol. 12, no. 6, 2019, doi: 10.3390/en12061058.
[20] K. J. Kontoleon, "Energy saving assessment in buildings with varying façade orientations and types of glazing systems when exposed to sun," Int. J. Performability Eng., vol. 9, no. 1, 2013.
[21] A. Bhatia, S. A. R. Sangireddy, and V. Garg, "An approach to calculate the equivalent solar heat gain coefficient of glass windows with fixed and dynamic shading in tropical climates," J. Build. Eng., vol. 22, pp. 90–100, Mar. 2019, doi: 10.1016/j.jobe.2018.11.008.
[22] E. Graiz and W. Al Azhari, "Energy efficient glass: a way to reduce energy consumption in office buildings in Amman," IEEE Access, vol. 7, 2019, doi: 10.1109/ACCESS.2018.2884991.
[23] M. Khoukhi, "The combined effect of heat and moisture transfer dependent thermal conductivity of polystyrene insulation material: Impact on building energy performance," Energy Build., vol. 169, pp. 228–235, Jun. 2018, doi: 10.1016/j.enbuild.2018.03.055.
[24] O. Algburi and F. Beyhan, "Cooling load reduction in a single-family house, an energy-efficient approach," Gazi Univ. J. Sci., vol. 32, no. 2, pp. 385–400, 2019.
[25] O. Algburi and F. Beyhan, "Climate-responsive strategies in vernacular architecture of Erbil city," 2019, doi: 10.1080/00207233.2019.1619324.
[26] F. Fiorito, A. Cannavale, and M. Santamouris, "Development, testing and evaluation of energy savings potentials of photovoltachromic windows in office buildings. A perspective study for Australian climates," Sol. Energy, vol. 205, pp. 358–371, Jul. 2020, doi: 10.1016/j.solener.2020.05.080.
[27] J. Clarke, "Building simulation," in Energy Simulation in Building Design, Elsevier, 2001, pp. 64–98.
[28] W. L. Oberkampf, S. M. Deland, B. M. Rutherford, K. V. Diegert, and K. F. Alvin, "Error and uncertainty in modeling and simulation," Reliab. Eng. Syst. Saf., vol. 75, pp. 333–357, 2002. [Online]. Available: www.elsevier.com/locate/ress (accessed 29 June 2021).
[29] R. D. Judkoff, "Validation of building energy analysis simulation programs at the solar energy research institute," Energy Build., vol. 10, no. 3, pp. 221–239, Jan. 1988, doi: 10.1016/0378-7788(88)90008-4.
Comparison of PID and LQR Controller of Autonomous Underwater Vehicle for Depth Control

Osen Fili Nami, Department of Electrical Engineering, Universitas Indonesia, Depok, Indonesia, osen.fili@ui.ac.id
Abdul Halim, Department of Electrical Engineering, Universitas Indonesia, Depok, Indonesia, a.halim@ui.ac.id
Dewi H. Budiarti, Aerospace Engineer, BPPT, Jakarta, Indonesia, dewi.habsari@bppt.go.id
Abstract—The Autonomous Underwater Vehicle (AUV) is a small unmanned underwater vehicle that is important for Indonesia as an archipelagic country. Apart from military purposes, it is also needed for civilian purposes. For this reason, the development of AUV technology is necessary and has strategic value. One technology that should be developed is AUV dynamic control. In this paper, we have designed an AUV depth control model with proportional integral derivative (PID) and linear quadratic regulator (LQR) controllers. A mathematical model of the AUV focused on the depth model has been developed and its stability analyzed. The AUV depth model produced an unstable step response, so a proportional gain had to be added to strengthen the stability of the system. Furthermore, the PID and LQR controllers were designed and simulated in MATLAB. Finally, the results of the PID and LQR controllers for AUV depth control were analyzed for comparison.

Keywords—AUV, Depth Model, Linear Quadratic Regulator, Proportional Integral Derivative, MATLAB

This paper explains the design of the PID and LQR AUV controllers, and the results are compared.

II. Modeling of AUV

To design a mathematical model of the dynamic motion of a submarine, we must first study the mechanism of motion of the AUV itself. There are two frames of reference for the position and orientation of the AUV, namely the Body Fixed Frame (BFF) and the Earth Fixed Frame (EFF), which can be seen in Figure I.

Figure I. Body-fixed and earth-fixed reference frames of the AUV, showing the thruster, fins, and rudders and the motion components: surge u, sway v, heave w, pitch q, and yaw r.
Surge:
$m[\dot{u} - vr + wq - x_G(q^2 + r^2) + y_G(pq - \dot{r}) + z_G(pr + \dot{q})] = \sum X$ (1)

Sway:
$m[\dot{v} - wp + ur - y_G(r^2 + p^2) + z_G(qr - \dot{p}) + x_G(pq + \dot{r})] = \sum Y$ (2)

Heave:
$m[\dot{w} - uq + vp - z_G(p^2 + q^2) + x_G(rp - \dot{q}) + y_G(rq + \dot{p})] = \sum Z$ (3)

The roll, pitch, and yaw equations, (4) to (6), follow analogously. By separating the acceleration terms and assuming that the off-diagonal inertia tensor terms ($I_{xy}$, $I_{xz}$, $I_{yz}$) are zero, equations (7) to (12) simplify to:

Surge:
$(m - X_{\dot{u}})\dot{u} + m z_G \dot{q} - m y_G \dot{r} = X_{res} + X_{|u|u} u|u| + (X_{wq} - m)wq + (X_{qq} + m x_G)q^2 + (X_{vr} + m)vr + (X_{rr} + m x_G)r^2 - m y_G pq - m z_G pr + X_{prop}$

where:
$\sum X_{ext} = X_{res} + X_{|u|u} u|u| + (X_{wq} - m)wq + (X_{qq} + m x_G)q^2 + (X_{vr} + m)vr + (X_{rr} + m x_G)r^2 - m y_G pq - m z_G pr + X_{prop}$
So that the pitch equation becomes:

$m z_G \dot{u} - (m x_G + M_{\dot{w}})\dot{w} + (I_y - M_{\dot{q}})\dot{q} = \sum M_{ext}$

Yaw:
$-m y_G \dot{u} + (m x_G - N_{\dot{v}})\dot{v} + (I_z - N_{\dot{r}})\dot{r} = N_{res} + N_{|v|v} v|v| + N_{r|r|} r|r| + (N_{ur} - m x_G)ur + (N_{wp} + m x_G)wp + (N_{pq} - (I_y - I_x))pq - m y_G(vr - wq) + N_{uv} uv + N_{uu\delta_r} u^2 \delta_r$

where:
$\sum N_{ext} = N_{res} + N_{|v|v} v|v| + N_{r|r|} r|r| + (N_{ur} - m x_G)ur + (N_{wp} + m x_G)wp + (N_{pq} - (I_y - I_x))pq - m y_G(vr - wq) + N_{uv} uv + N_{uu\delta_r} u^2 \delta_r$

For the depth plane, equations (28) and (29) together with equations (30) and (31) yield the simplified equations in the matrix form below:

$\begin{bmatrix} m - Z_{\dot{w}} & -(m x_G + Z_{\dot{q}}) & 0 & 0 \\ -(m x_G + M_{\dot{w}}) & I_y - M_{\dot{q}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{w} \\ \dot{q} \\ \dot{z} \\ \dot{\theta} \end{bmatrix} - \begin{bmatrix} Z_w & mU + Z_q & 0 & 0 \\ M_w & -m x_G U + M_q & 0 & M_\theta \\ 1 & 0 & 0 & -U \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} w \\ q \\ z \\ \theta \end{bmatrix} = \begin{bmatrix} Z_{\delta_s} \\ M_{\delta_s} \\ 0 \\ 0 \end{bmatrix} \delta_s \quad (32)$

From equation (32), it is assumed that $x_G = 0$; the heave velocity is assumed to be always small, so that both $w$ and $\dot{w}$ can be ignored, and equation (32) becomes:
Table II. REMUS AUV Parameters

Parameter | Value | Units | Description
$I_x$ | +1.77e−01 | kg·m² | Moment of inertia w.r.t. origin at CB
$I_y$ | +3.45e+00 | kg·m² | Moment of inertia w.r.t. origin at CB
$I_z$ | +3.45e+00 | kg·m² | Moment of inertia w.r.t. origin at CB
$Z_q$ | −9.67e+00 | kg·m/s | Combined term
$Z_{\dot{q}}$ | −1.93e+00 | kg·m | Added mass
$Z_w$ | −6.66e+01 | kg/s | Combined term
$Z_{\dot{w}}$ | −3.55e+01 | kg | Added mass
$Z_{\delta_s}$ | −5.06e+01 | kg·m/s² | Fin lift
$M_q$ | −6.87e+00 | kg·m²/s | Combined term
$M_{\dot{q}}$ | −4.88e+00 | kg·m² | Added mass
$M_w$ | +3.07e+01 | kg·m/s | Combined term
$M_{\dot{w}}$ | −1.93e+00 | kg·m | Added mass
$M_{\delta_s}$ | −3.46e+01 | kg·m²/s² | Fin lift
$M_\theta$ | −5.77e+00 | kg·m²/s² | Hydrostatic

So the matrix $A = T_1^{-1} T_2$ is

$A = \begin{bmatrix} -0.8247 & 0 & -0.6927 \\ 0 & 0 & -1.5400 \\ 1 & 0 & 0 \end{bmatrix}$

and $B = T_1^{-1} T_3$ is

$B = \begin{bmatrix} -4.1537 \\ 0 \\ 0 \end{bmatrix}$

while the matrix

$C = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$
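Assuming the numerical matrices above, the open-loop behavior discussed next can be checked directly from the eigenvalues of A. This is a sketch in Python with NumPy, not the authors' MATLAB code:

import numpy as np

# Depth-model matrices as reconstructed above.
A = np.array([[-0.8247, 0.0, -0.6927],
              [ 0.0,    0.0, -1.5400],
              [ 1.0,    0.0,  0.0   ]])
B = np.array([[-4.1537],
              [ 0.0   ],
              [ 0.0   ]])
C = np.array([[0.0, 1.0, 0.0]])

# Any eigenvalue of A with real part >= 0 means the open-loop depth model is
# not asymptotically stable, which motivates the added gain and the PID/LQR
# designs that follow.
eigvals = np.linalg.eigvals(A)
print("open-loop eigenvalues:", eigvals)
print("asymptotically stable:", bool(np.all(eigvals.real < 0)))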
The step response of the depth model with the gain change is shown in Figure VII, with a rise time of 2.4036 s, a settling time of 24.3448 s, an overshoot of 28.4421%, and a steady-state error of 0.027. Although the step response in Figure VII shows that the system is able to stabilize, a controller is still needed to stabilize the system under better conditions.

Figure IV. Bode diagram of the depth model.

From the Bode diagram, the gain margin of the system is 0.0893 and the phase margin is −61.9435°; from the negative phase margin it can also be concluded that the system is unstable.

III. Control Methods and Simulation

The depth model will be designed with a PID controller and an LQR controller to help stabilize the system.

A. PID Control Model

The PID control equation is:

$G_c(s) = K_p \left(1 + \frac{1}{T_i s} + T_d s\right)$ (39)

With the tuned parameters, the controller is:

$G_{PID} = 1.1165\left(1 + \frac{1}{3.7747 s} + 0.9059 s\right)$
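A minimal closed-loop sketch under unity feedback, assuming the reduced state-space model above as the plant; since the paper's loop also includes the added stabilizing gain, the numbers here need not reproduce the reported response exactly:

import numpy as np
from scipy import signal

# PID from equation (39), Gc(s) = Kp*(1 + 1/(Ti*s) + Td*s), with the gains above.
Kp, Ti, Td = 1.1165, 3.7747, 0.9059
pid_num = [Kp * Td * Ti, Kp * Ti, Kp]   # Kp*(Td*Ti*s^2 + Ti*s + 1)
pid_den = [Ti, 0.0]                     # Ti*s

# Plant: transfer function of the reduced depth model above.
A = [[-0.8247, 0.0, -0.6927], [0.0, 0.0, -1.54], [1.0, 0.0, 0.0]]
B = [[-4.1537], [0.0], [0.0]]
C = [[0.0, 1.0, 0.0]]
num, den = signal.ss2tf(A, B, C, [[0.0]])
plant_num = np.trim_zeros(np.round(num[0], 9), "f")  # drop leading ~zero coefficients

# Unity-feedback closed loop T = Gc*Gp / (1 + Gc*Gp).
ol_num = np.polymul(pid_num, plant_num)
ol_den = np.polymul(pid_den, den)
cl_den = np.polyadd(ol_den, ol_num)     # polyadd aligns trailing coefficients

print("closed-loop poles:", np.roots(cl_den))  # negative real parts mean a stable loop
t, y = signal.step(signal.TransferFunction(ol_num, cl_den))
# Rise time, settling time, overshoot, and steady-state error can be read off (t, y).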
B. LQR Control Model

LQR is one of the optimal control methods based on state space; it is essentially intended to determine the control signal in such a way as to minimize the performance index $J$:

$J = \int_0^\infty (x^* Q x + u^* R u)\, dt$ (40)

The simulation results showed that the LQR controller design is very good at stabilizing the depth of the AUV model.

C. Comparison of PID and LQR Controller Results
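For the comparison, the LQR gain can be computed from equation (40); the sketch below uses SciPy's Riccati solver with illustrative identity weights Q and R, since the paper's weighting matrices are not stated here:

import numpy as np
from scipy.linalg import solve_continuous_are

# LQR gain minimizing J in equation (40) for the depth model above.
# Q and R are illustrative identity weights, not the paper's values.
A = np.array([[-0.8247, 0.0, -0.6927], [0.0, 0.0, -1.54], [1.0, 0.0, 0.0]])
B = np.array([[-4.1537], [0.0], [0.0]])
Q = np.eye(3)
R = np.eye(1)

P = solve_continuous_are(A, B, Q, R)   # solve the algebraic Riccati equation
K = np.linalg.solve(R, B.T @ P)        # optimal state feedback u = -K x
print("K =", K)
print("closed-loop poles:", np.linalg.eigvals(A - B @ K))
# With Q, R > 0 and (A, B) controllable, all closed-loop poles lie in the
# left half-plane, so the LQR-regulated depth dynamics are stable.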
Convolutional Neural Network Approach to Distinguish and Characterize Tumor Samples Using
Gene Expression Data
Abstract—Cancer threatens millions of people each year, and its early diagnosis is still a challenging task. Early diagnosis is one of the major ways to tackle this disease and lower the mortality rate. Advances in deep learning approaches and the availability of biological data offer applications that can facilitate the diagnosis and characterization of cancer. Here, we aim to provide a new perspective on cancer diagnosis by using a deep learning approach on gene expression data. We turn the gene expression data of cancer and normal tissues into input for a Convolutional Neural Network (CNN) with a method we call RGB mapping, which preserves the gene expression values of the tissues during learning so that the effect of each gene on the diagnosis is retained. In addition, we aim to characterize the disease by identifying genes that are effective in cancer prediction. In this study, The Cancer Genome Atlas (TCGA) dataset, with RNA-seq data of approximately 30 different types of cancer, and GTEx RNA-seq data of normal tissues were used. The input data for training were transformed into RGB format, and the training was carried out with a CNN. The trained model predicts cancer with 97.7% accuracy based on gene expression data. Moreover, we applied a one-pixel attack to the trained model to determine the genes that are effective for predicting the disease; as a result of this attack, 13 genes that influence the decision mechanism of the algorithm were determined. In conclusion, a new data preprocessing method is proposed in this study, and with the one-pixel attack method applied to a model trained with it, genes that may be cancer biomarkers can be identified. The effects of the candidate genes on cancer can then be determined by experimental studies.

Keywords—cancer, CNN, TCGA, GTEx, RNA-seq, RGB Mapping

I. Introduction

The deep learning approach emerged from the design of computer models that can learn through interconnected layers modeled on the neurons of the human brain. As a result of the development of data science, and especially the rapid increase in biological data in the last decade, such neural networks have begun to play important roles in the interpretation of biological data for the diagnosis and treatment of diseases [1]. Cancer, one of the biggest health problems in the world, is one of the diseases to which deep learning approaches are widely applied.

Since cancer is a disease with high genomic heterogeneity and phenotypic plasticity, its diagnosis and treatment involve various difficulties [2]. Thanks to developing technology, abundant medical data on cancer patients are available, and processing these data with deep learning approaches has improved the stages of diagnosis and treatment. The availability of CT and histopathology data enabled improvements in the diagnosis stage through image-based processing of these data with deep learning. Deep learning algorithms have been used in the diagnosis of many types of cancer, including breast cancer [3,4], prostate cancer [5,6], lung cancer [7,8], colon cancer [9], head and neck cancer [10], and skin cancer [11]. These image-based approaches are widely used because they give clinicians an advantage at the early diagnosis stage.

In addition to image-based approaches, biological data have also been used in cancer diagnosis [12] and even treatment [13]. Gene expression signature-based approaches are generally used to eliminate the disadvantages caused by heterogeneity at the diagnosis stage. Gene expression data and deep learning are used together in many methods, such as estimating the survival times of individuals with cancer [14], determining biomarker genes [15], and classification [16,17]. All of these studies show that important information about the mechanism of cancer can be gained by using gene expression data and deep learning approaches together.

In this study, The Cancer Genome Atlas (TCGA) dataset, with RNA-seq data of approximately 30 different types of cancer, and a dataset obtained by curating GTEx RNA-seq data of normal tissues were used. Importance was given to the homogeneity of the tumor and normal tissue distribution of the prepared dataset. The gene expression values of the tissues, which are the input data for training, were converted into 24-bit binary format and then split into the 8-bit Red, Green, and Blue channels. The training was carried out with a CNN. The trained model is able to distinguish cancer and normal samples with 97.7% accuracy based on gene expression data. Afterwards, the one-pixel attack method was applied to the input data created using RGB mapping; in this way, the vulnerability of deep learning models was used to identify genes that may be effective in cancer. As a result of this process, 13 candidate biomarker genes were obtained, and when these genes were investigated in the literature, their relationship with cancer was confirmed.

II. Materials and Methods

A. Dataset Preparation

Data were downloaded from the UCSC Xena platform [18], which includes three different RNA-seq data sources: TARGET, TCGA, and GTEx. The dataset label distribution is shown in Table I.
Table I. Distribution of labels in the whole dataset.

Datasets | Normal | Tumor
TCGA | 727 | 9750
GTEx | 7429 | 0

Figure I. Conversion of a gene expression value to RGB format.

Figure II summarizes the concept with a sample conversion. After the conversion, the 14,308 × 1,024 training data become a [14,308 × 32 × 32 × 3] NumPy array. Accordingly, the test data become a [3,578 × 32 × 32 × 3] NumPy array suitable for batch processing by TensorFlow. The final axis represents the R, G, B values, and the square 32 × 32 shape corresponds to the 1,024 genes.

Figure II. Illustration of the NumPy 4D arrays. (a) For each sample, 1,024 genes are shaped as 32 × 32 pixels; due to RGB mapping, 3 color channels are used per sample. (b) This shape was imposed on the whole dataset.
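A minimal sketch of the RGB mapping as described: each expression value, treated as a 24-bit integer, is split into 8-bit R, G, and B channels (so 196,607 maps to (2, 255, 255), matching the example in the one-pixel attack description below). The function names are illustrative, and rounding fractional expression values to integers is an assumption not specified in the text:

import numpy as np

def rgb_map(value):
    """Split one expression value (treated as a 24-bit integer) into R, G, B bytes."""
    value = int(round(value))          # rounding to an integer is an assumption
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

print(rgb_map(196_607))                # -> (2, 255, 255)

def sample_to_image(expr):
    """Reshape a 1-D array of 1,024 mapped expression values into a 32 x 32 x 3 image."""
    vals = np.asarray(expr).round().astype(np.uint32)
    rgb = np.stack([(vals >> 16) & 0xFF, (vals >> 8) & 0xFF, vals & 0xFF], axis=-1)
    return rgb.reshape(32, 32, 3).astype(np.uint8)

Since the mapping is a bijection on 24-bit integers (value = 65536·R + 256·G + B), the original expression value can be recovered exactly, which is what makes the preprocessing reversible, as emphasized in the Results.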
C. CNN Architecture

The CNN architecture shown in Table II was used for training. The architecture includes eight convolution layers, four dropout layers, and one global average pooling layer. Each convolution layer consists of 3 × 3 kernels. ReLU was used as the activation function, and to overcome overfitting, dropout rates of 0.2 or 0.5 were used. The final layer has a sigmoid activation function.

Table II. CNN Architecture
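The exact layer configuration is given in Table II; as a rough illustration, the following Keras sketch is one plausible layout consistent with the description above (the filter counts and dropout placement are assumptions):

import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape=(32, 32, 3)):
    """One plausible layout: 8 conv layers, 4 dropouts, GAP, sigmoid output."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for i, filters in enumerate([32, 32, 64, 64, 128, 128, 256, 256]):  # assumed widths
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        if i % 2 == 1:                       # a dropout layer after every second conv
            x = layers.Dropout(0.2 if i < 4 else 0.5)(x)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)   # tumor vs. normal
    return tf.keras.Model(inputs, outputs)

model = build_model()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])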
D. One-Pixel Attack

The one-pixel attack algorithm was adopted from an earlier study [20], which utilizes the "differential evolution" algorithm from the SciPy Python library. The attack algorithm picks random locations (x, y), where x < 32 and y < 32, and random RGB colors. Although the blue and green values are picked within the (0, 255) range, the red value is picked only within the (0, 2) range, since gene expression values are mostly below 196,607, which corresponds to the (2, 255, 255) RGB value. The one-pixel attack yields a pixel location and a new color value that causes the label to change in the trained model (from Normal to Tumor, or vice versa). Since the attack is random, we performed many attacks (10, to be exact) on the test dataset. The resulting attacks were filtered, keeping only those whose suggested pixel value lies within the lowest and highest expression range of the corresponding gene.
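Following [20], such an attack can be framed as a differential-evolution search over (x, y, R, G, B) under the constraints stated above; the sketch below is a schematic reimplementation, with the model interface and helper names assumed rather than taken from the paper:

import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(model, image, true_label):
    """Search for one (x, y, R, G, B) change that flips the model's label.

    model is assumed to be a trained Keras-style classifier returning P(tumor);
    image is one 32 x 32 x 3 uint8 sample produced by RGB mapping.
    """
    def perturb(z):
        x, y, r, g, b = (int(round(v)) for v in z)
        adv = image.copy()
        adv[x, y] = (r, g, b)
        return adv

    def objective(z):
        p_tumor = float(model.predict(perturb(z)[None, ...], verbose=0)[0, 0])
        # Minimize confidence in the true class, so the label flips.
        return p_tumor if true_label == 1 else 1.0 - p_tumor

    # x, y < 32; red restricted to (0, 2); green and blue to (0, 255).
    bounds = [(0, 31), (0, 31), (0, 2), (0, 255), (0, 255)]
    result = differential_evolution(objective, bounds, maxiter=30, popsize=15, seed=0)
    return perturb(result.x), result.fun

Candidate perturbations found this way would then be filtered against each gene's observed expression range, as described above.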
III. Results

A. Input Images Obtained by Applying the RGB Mapping Method

Since the gene expression data have been converted into RGB format, visualizing the expression layout of any sample is possible. Figure III presents sample images for normal and tumor samples: four images each from (a) normal tissue data and (b) tumor tissue data, generated by converting the gene expression levels of the 1,024 selected genes using RGB mapping. The images do not reveal any pattern apparent to the naked eye; however, convolutional layers are able to pick up regions or patterns formed by neighboring pixels, so the gene expression data were passed through convolution layers. Note that although the gene expression data were converted into RGB format, they were not saved as images before training.

Table III. Comparison of the model with other studies. SVM: support vector machine; t-SNE: t-distributed stochastic neighbor embedding.

Authors | Expression Preprocessing | Classification | Accuracy | Sensitivity | Specificity | Precision | F-measure
Elbashir et al. [22] | Normalization | CNN | 98.76% | 91.43% | 100.00% | 100.00% | 0.955
Danaee et al. [23] | Normalization | Stacked Denoising | 94.78% | 94.04% | 97.50% | 97.20% |
Elbashir et al. [22] | Normalization | AlexNet | 96.69% | 96.89% | 94.12% | 99.54% | 0.955
Elbashir et al. [22] | t-SNE | SVM | 100.00% | 100.00% | 51.00% | 95.96% | 0.97
Proposed method | RGB mapping | CNN | 97.73% | 97.66% | 97.80% | 98.00% | 0.975
C. Performance Measurement

The training was performed on a 32 × 32 × 3 multidimensional array for each sample. Figure V shows the ROC curve of the model; the AUC value of our model is 0.97. There are several different approaches that use gene expression data to classify tumor and normal samples, ranging from simpler machine learning approaches to complex deep learning networks. These approaches usually start by pre-processing the gene expression data with an irreversible manipulation, and even by mapping the data points to a different domain. Our method involves a minimal and reversible change to the gene expression data: the RGB mapping is reversible and does not require normalization or any dimensionality reduction technique. Table III compares our approach with several different approaches in both the pre-processing and classification steps.

Figure V. The ROC curve of the CNN model for tumor and normal classification.
Table IV. Gene list obtained by the one-pixel attack (cancer-related genes according to the one-pixel attack).

Ensembl ID | Gene Name
ENSG00000163513 | TGFBR2
ENSG00000129250 | KIF1C
ENSG00000215301 | DDX3X
ENSG00000188157 | AGRN
ENSG00000138821 | SLC39A8
ENSG00000124942 | AHNAK
ENSG00000157557 | ETS2
ENSG00000177469 | CAVIN1
ENSG00000123095 | BHLHE41
ENSG00000157514 | TSC22D3
ENSG00000116701 | NCF2
ENSG00000198911 | SREBF2
ENSG00000121691 | CAT

When the genes obtained as a result of the one-pixel attack are examined in the literature, it is seen that their expression changes are associated with the formation of cancer or with patient survival. For example, studies have shown that expression changes of the TGFBR2 gene (transforming growth factor beta receptor 2) affect the prognosis in cervical cancer [24] and gastric cancers [25]. Likewise, there are data showing that KIF1C (kinesin-like protein KIF1C) and SLC39A8 (solute carrier family 39 member 8) are prognostic markers in renal cancers [26] and that changes in their expression levels are related to survival time. When the changes in the predictions for the samples were examined by applying the one-pixel attack method, it was seen that only for the AGRN (Agrin) gene did changes occur through both increases and decreases in the expression level. When the sample images converted to RGB format are examined, the changes in the increased and decreased pixels can be seen in Figure VI.

In Figure VI, the first images show the originals and the second images show those obtained as a result of the attack. Figure VI (a) shows the original and post-attack images of the gene expression data of the sample with ID TCGA-HC-8259-11. The areas marked with a red circle show the changes in the Agrin gene as a result of the attack. The gene expression value was increased for the TCGA-HC-8259-11 sample; as can be seen in the image, brighter pixels were obtained by increasing the expression value. This shows that important clues to the mechanism of cancer can be obtained by making changes that cannot be distinguished with the naked eye, as well as by finding genes effective in cancer through attacks on the model. Figure VI (b) shows the image of the sample with ID TCGA-NJ-A4YI-01. In this example, the gene expression value of the same gene was decreased; while the corresponding pixel in the original image is brighter, it appears darker after the expression value is decreased by the attack.

Figure VI. Sample images obtained as a result of the attack.

IV. Discussion

In this study, training was carried out with deep learning models on the gene expression data of cancer patients, using the RGB mapping data preprocessing method. As a result of this training, it was shown that deep learning methods can distinguish the differences between tumor and normal tissues. The data processing method applied before the model makes it possible to apply a one-pixel attack to the resulting sample images. Identifying genes that are effective in cancer is critical for cancer diagnosis and treatment; for this reason, the one-pixel attack algorithm was applied over the obtained training data to identify genes that may be cancer biomarkers. When the genes determined by this algorithm were investigated in the literature, it was seen that their expression changes are effective in cancer progression and patient survival.

The results of the study demonstrate that, by developing appropriate processing methods for experimentally obtained biological data, meaningful results about a disease can be obtained without loss of information. The gene expression data, which are the inputs of the deep learning model, are converted to RGB and fed to the model, allowing the data to be used without statistical manipulation (such as normalization) and without loss. In this way, a high learning rate and a high prediction rate were observed as a result of the training.

All of these findings show that this method can bring a new approach to diseases such as cancer that are difficult to diagnose and require more biomarker genes. With the application of this method, individual results can be obtained, and the inter- and intra-tumor heterogeneity characteristics of tumor cells can be determined. It can be used as an approach that makes individual cancer analysis possible by making it easier to find genes that differ from person to person. The results obtained can be strengthened with experimental data to identify new biomarkers for cancer and can be used in personalized diagnostic or therapeutic studies.
References

[1] Esteva, A., Robicquet, A., Ramsundar, B., Kuleshov, V., DePristo, M., Chou, K., Cui, C., Corrado, G., Thrun, S. and Dean, J., 2019. A guide to deep learning in healthcare. Nature Medicine, 25(1), pp. 24-29.
[2] Persi, E., Wolf, Y.I., Horn, D., Ruppin, E., Demichelis, F., Gatenby, R.A., Gillies, R.J. and Koonin, E.V., 2020. Mutation–selection balance and compensatory mechanisms in tumour evolution. Nature Reviews Genetics, pp. 1-12.
[3] Zuluaga-Gomez, J., Al Masry, Z., Benaggoune, K., Meraghni, S. and Zerhouni, N., 2020. A CNN-based methodology for breast cancer diagnosis using thermal images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, pp. 1-15.
[4] Gour, M., Jain, S. and Sunil Kumar, T., 2020. Residual learning based CNN for breast cancer histopathological image classification. International Journal of Imaging Systems and Technology.
[5] Swiderska-Chadaj, Z., de Bel, T., Blanchet, L., Baidoshvili, A., Vossen, D., van der Laak, J. and Litjens, G., 2020. Impact of rescanning and normalization on convolutional neural network performance in multi-center, whole-slide classification of prostate cancer. Scientific Reports, 10(1), pp. 1-14.
[6] Hartenstein, A., Lübbe, F., Baur, A.D., Rudolph, M.M., Furth, C., Brenner, W., Amthauer, H., Hamm, B., Makowski, M. and Penzkofer, T., 2020. Prostate cancer nodal staging: using deep learning to predict 68Ga-PSMA-positivity from CT imaging alone. Scientific Reports, 10(1), pp. 1-11.
[7] Kanavati, F., Toyokawa, G., Momosaki, S., Rambeau, M., Kozuma, Y., Shoji, F., Yamazaki, K., Takeo, S., Iizuka, O. and Tsuneki, M., 2020. Weakly-supervised learning for lung carcinoma classification using deep learning. Scientific Reports, 10(1), pp. 1-11.
[8] Lai, Y.H., Chen, W.N., Hsu, T.C., Lin, C., Tsao, Y. and Wu, S., 2020. Overall survival prediction of non-small cell lung cancer by integrating microarray and clinical data with deep learning. Scientific Reports, 10(1), pp. 1-11.
[9] Jiang, D., Liao, J., Duan, H., Wu, Q., Owen, G., Shu, C., Chen, L., He, Y., Wu, Z., He, D. and Zhang, W., 2020. A machine learning-based prognostic predictor for stage III colon cancer. Scientific Reports, 10(1), pp. 1-9.
[10] Fontaine, P., Acosta, O., Castelli, J., De Crevoisier, R., Müller, H. and Depeursinge, A., 2020. The importance of feature aggregation in radiomics: a head and neck cancer study. Scientific Reports, 10(1), pp. 1-11.
[11] Tschandl, P., Rinner, C., Apalla, Z., Argenziano, G., Codella, N., Halpern, A., Janda, M., Lallas, A., Longo, C., Malvehy, J. and Paoli, J., 2020. Human–computer collaboration for skin cancer recognition. Nature Medicine, 26(8), pp. 1229-1234.
[12] Jiao, W., Atwal, G., Polak, P., Karlic, R., Cuppen, E., Danyi, A., De Ridder, J., van Herpen, C., Lolkema, M.P., Steeghs, N. and Getz, G., 2020. A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns. Nature Communications, 11(1), pp. 1-12.
[13] Mencattini, A., Di Giuseppe, D., Comes, M.C., Casti, P., Corsi, F., Bertani, F.R., Ghibelli, L., Businaro, L., Di Natale, C., Parrini, M.C. and Martinelli, E., 2020. Discovering the hidden messages within cell trajectories using a deep learning approach for in vitro evaluation of cancer drug treatments. Scientific Reports, 10(1), pp. 1-11.
[14] Ramirez, R., Chiu, Y.C., Zhang, S., Ramirez, J., Chen, Y., Huang, Y. and Jin, Y.F., 2021. Prediction and interpretation of cancer survival using graph convolution neural networks. Methods.
[15] Xie, Y., Meng, W.Y., Li, R.Z., Wang, Y.W., Qian, X., Chan, C., ... and Leung, E.L.H., 2021. Early lung cancer diagnostic biomarker discovery by machine learning methods. Translational Oncology, 14(1), 100907.
[16] Binder, A., Bockmayr, M., Hägele, M., Wienert, S., Heim, D., Hellweg, K., ... and Klauschen, F., 2021. Morphological and molecular breast cancer profiling through explainable machine learning. Nature Machine Intelligence, pp. 1-12.
[17] Ahn, T., Goo, T., Lee, C.H., Kim, S., Han, K., Park, S. and Park, T., 2018. Deep learning-based identification of cancer or normal tissue using gene expression data. In 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 1748-1752). IEEE.
[18] Vivian, J., Rao, A.A., Nothaft, F.A., Ketchum, C., Armstrong, J., Novak, A., ... and Paten, B., 2017. Toil enables reproducible, open source, big biomedical data analyses. Nature Biotechnology, 35(4), 314-316.
[19] Rouillard, A.D., Gundersen, G.W., Fernandez, N.F., Wang, Z., Monteiro, C.D., McDermott, M.G. and Ma'ayan, A., 2016. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database, 2016.
[20] Su, J., Vargas, D.V. and Sakurai, K., 2019. One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation, 23(5), 828-841.
[21] O. Dulgerci, 2019. Minimizing with differential evolution. (Visited on 2021-6-18). [Online]. Available: https://mathematica.stackexchange.com/questions/193009/minimizing-with-differential-evolution
[22] Elbashir, M.K., Ezz, M., Mohammed, M. and Saloum, S.S., 2019. Lightweight convolutional neural network for breast cancer classification using RNA-seq gene expression data. IEEE Access, 7, 185338-185348.
[23] Danaee, P., Ghaeini, R. and Hendrix, D.A., 2017. A deep learning approach for cancer detection and relevant gene identification. In Pacific Symposium on Biocomputing 2017 (pp. 219-229).
[24] Yang, H., Zhang, H., Zhong, Y., Wang, Q., Yang, L., Kang, H., ... and Zhou, Y., 2017. Concomitant underexpression of TGFBR2 and overexpression of hTERT are associated with poor prognosis in cervical cancer. Scientific Reports, 7(1), 1-14.
[25] Nadauld, L.D., Garcia, S., Natsoulis, G., Bell, J.M., Miotke, L., Hopmans, E.S., ... and Ji, H.P., 2014. Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer. Genome Biology, 15(8), 1-18.
[26] KIF1C, https://www.proteinatlas.org/ENSG00000129250-KIF1C/pathology/renal+cancer, accessed 29/05/21.