Using Artificial Bee Colony Algorithm Fo
Abstract—Computer scientists have recently shown interest in the behaviour of social insects as a way of solving different
combinatorial and statistical problems in the neural networks area. Chief among the resulting methods is the Artificial Bee Colony
(ABC) algorithm, which simulates the intelligent foraging behaviour of a honey bee swarm, and this paper investigates its use. A
Multilayer Perceptron (MLP) trained with the standard backpropagation (BP) algorithm normally relies on computationally intensive
training. A crucial problem with BP is that it can sometimes yield networks with suboptimal weights because of the presence of many
local optima in the solution space. To overcome this, the ABC algorithm is used in this work to train an MLP to learn the complex
behaviour of earthquake time series data, and the performance of MLP-ABC is benchmarked against an MLP trained with the standard
BP. The experimental results show that MLP-ABC performs better than MLP-BP on the time series data.
1 INTRODUCTION
[…] population consisting of feasible solutions to the problem is modified by applying some operators to the solutions depending on information about their fitness. The population is thereby driven towards better areas of the solution space. Population-based optimisation algorithms are categorised into two groups, namely evolutionary algorithms (EA) and SI-based algorithms [21], [22]. In EA, the main idea underlying this combination is to take the weight matrices of the ANNs as individuals, to change the weights by means of operations such as crossover and mutation, and to use the error produced by the ANNs as the fitness measure that guides selection. In […] with colony to train NNs by optimal weights [26]. In this study, the ABC algorithm is used successfully to train an MLP on earthquake time series data for a prediction task. The performance of the algorithm is compared with the standard BP algorithm.

This paper is organised as follows: A brief review of ANN and of the ABC and BP algorithms is given in Section 2 and Section 3, respectively. The proposed ABC algorithm and the training and testing of the network using the ABC algorithm are detailed in Section 4. Section 5 contains the prediction of the earthquake event. Results are discussed in Section 6. Finally, the paper is concluded in Section 7.

2 ARTIFICIAL NEURAL NETWORKS

2.1 Training of MLP Neural Networks

MLP was introduced in 1957 to solve different combinatorial problems [27]. MLP, which is also known as a feedforward neural network, was first introduced for the non-linear XOR problem and was then successfully applied to different combinatorial problems. MLP is mostly used for information processing and pattern recognition in the prediction of seismic activities. In this section, the characteristics of MLP and its interaction with seismic signals are explained. MLP works as a universal approximator in which the input signal propagates in the forward direction. It is widely used and has been tested on different problems such as time series prediction and function approximation [1], [3], [4], [8]. Figure 1 shows the architecture of MLP with two hidden layers, one output layer, and one input layer.

Fig 1: Multi Layer Perceptron Neural Network

[…]

$$E(w(t)) = \frac{1}{n} \sum_{j=1}^{n} \sum_{k=1}^{K} (d_k - O_k)^2 \qquad (2)$$

where $E(w(t))$ is the error at the $t$th iteration; $w(t)$ is the weights in the connections at the $t$th iteration; $d_k$ is the desired value of the $k$th output node; $O_k$ is the actual value of the $k$th output node; $K$ is the number of output nodes; and $n$ is the number of patterns. The optimisation target is to minimise the objective function by optimising the network weights $w(t)$.

3 ARTIFICIAL BEE COLONY

3.1 Swarm Intelligence

Over the last two decades, swarm intelligence (SI) has been the focus of much research because of the unique behaviour it inherits from social insects [13], [14], [22], [25]. Bonabeau has defined SI as “any attempt to design algorithms or distributed problem-solving devices inspired by the collective behaviour of social insect colonies and other animal societies” [28]. He mainly focused on the behaviour of social insects alone, such as termites, bees, wasps, and different ant species. However, a swarm can be considered as any collection of interacting agents or individuals. Ants are the individual agents of ACO [29]. An immune system can be considered a swarm of cells and molecules, just as a crowd is a swarm of people [31]. PSO and ABC are popular population-based stochastic optimisation algorithms adapted for the
optimisation of non-linear functions in multidimensional space [32].

3.2 Artificial Bee Colony algorithm

[…] tasks such as employed bees, onlooker bees, and scout bees. These three bees/tasks determine the objects of problems by sharing information with the other bees. The common duties of these artificial bees are as follows:

Employed bees: Employed bees use multidirectional search space for food source with initialisation of the area. They get information and all possibilities to find food source and solution space. Sharing of information with onlooker bees is performed by employed bees. An employed bee produces a modification on the source position in her memory and discovers a new food source position. Provided that the nectar amount of the new source is higher than that of the previous source, the employed bee memorises the new source position and forgets the old one.

Onlooker bees: Onlooker bees evaluate the nectar amount obtained by employed bees and choose a food source depending on the probability values calculated using the fitness values. For this purpose, a fitness-based selection technique can be used. Onlooker bees watch the dance of hive bees and select the best food source according to the probability proportional to the quality of that food source.

Scout bees: Scout bees select the food source randomly without experience. If the nectar amount of a food source is higher than that of the previous source in their memory, they memorise the new position and forget the previous position. Whenever employed bees get a food source and use the food source very well again, they become scout bees to find a new food source by memorising the best path. The detailed pseudocode of the ABC algorithm is shown as follows:

1: Initialise the population of solutions $x_i$ where $i = 1, \ldots, SN$
2: Evaluate the population
3: cycle = 1
4: Repeat from step 2 to step 13
5: Produce new solutions (food source positions) $v_{i,j}$ in the neighbourhood of $x_{i,j}$ for the employed bees using the formula

[…]

$$p_i = \frac{fit_i}{\sum_{k=1}^{SN} fit_k} \qquad (4)$$

The calculation of fitness values of solutions is defined as

$$fit_i = \begin{cases} \dfrac{1}{1 + f_i} & f_i \geq 0 \\ 1 + \mathrm{abs}(f_i) & f_i < 0 \end{cases} \qquad (5)$$

Normalise $p_i$ values into [0, 1]
8: Produce the new solutions (new positions) $v_i$ for the onlookers from the solutions $x_i$, selected depending on $p_i$, and evaluate them
9: Apply the Greedy Selection process for the onlookers between $x_i$ and $v_i$
10: Determine the abandoned solution (source), if it exists, and replace it with a new randomly produced solution $x_i$ for the scout using the following equation

$$x_i^j = x_{\min}^j + \mathrm{rand}(0,1)\,(x_{\max}^j - x_{\min}^j) \qquad (6)$$

11: Memorise the best food source position (solution) achieved so far
12: cycle = cycle + 1
13: until cycle = Maximum Cycle Number (MCN)

4 THE PROPOSED FRAMEWORK FOR MLP-ABC

The proposed flowchart of the ABC algorithm for earthquake time series data prediction is given in Figure 2. In the figure, each cycle of the search consists of three steps after initialisation of the colony, the foods, and the three control parameters of the MLP-ABC algorithm: the number of food sources, which is equal to the number of employed bees or onlooker bees (SN), the value of limit, and the maximum cycle number (MCN). The initialisation of weights was compared with output and the best weight cycle was selected by the scout bees’ phase. The bees (employed bees, onlooker bees) would continue searching until the last cycle to find the best weights for the networks. The food source of which the nectar was neglected by the bees was replaced with a new food
source by the scout bees. Every bee (employed bees, onlooker bees) would produce a new solution area for the network and the Greedy Selection would decide the best food source position. Suppose that the neglected source is $x_i$ and $j \in \{1, 2, \ldots, D\}$; then the scout bees determine a new food source to replace $x_i$.

The foods area was limited to the range [-10, 10]. It was applied randomly and was initialised for evaluation. This operation can be defined by using equation 6. Every bee (employed bees, onlooker bees) would produce a new evaluated solution area for the network and the Greedy Selection decided the best food source position. If the new food source had equal or better nectar than the old food source, the old one was replaced with the new food source in the memory. Otherwise, the old food source was retained in the memory. The basic idea of the ABC scheme is to use agents of bees to search for the best combination of weights for the network. All steps in finding optimal weights for the network are shown in the proposed ABC algorithm framework in Figure 2. The figure shows how to find and select the best weights and how to replace the previous ones. The Greedy Selection was applied between two sets of values $x_i$ and $v_i$ while the best scout bees were randomly selected.

The proposed frameworks can easily train earthquake time series data for a prediction task by finding optimal network weights for MLP.

5 PREDICTION OF EARTHQUAKE EVENT

In this research, real time series data of seismic earthquake events were selected for training and testing. The data from the Southern California Earthquake Data Center (SCEDC) holdings for 2011 were selected [30]. The data included local, regional, and quarry-blast events with epicentres between latitudes 32.0°N and 37.0°N and between longitudes 122.0°W and 114.0°W. There were four main earthquake parameters, namely depth of earthquake, time of occurrence, geographical area, and magnitude of earthquakes. The significant parameter, earthquake magnitude on the Richter scale, was used for the simulation of earthquake magnitude prediction. Data obtained from the SCEC website were used to define input classes and test the MLP-ABC model proposed in this research. The earthquake record of Southern California between 1st January 2010 and 30th May 2011 was divided into fifty data sets per day. The networks were tested for the prediction of earthquake magnitude from horizon one to horizon five by MLP-ABC and MLP-BP.

Neural networks have been used successfully to solve complicated pattern recognition and classification problems in different domains such as satellite data, GPS data, and financial forecasting [3], [34]. Recently, NNs have also been applied to earthquake prediction by using different models such as Backpropagation Neural Networks (BPNNs), Radial-Basis Function (RBF) NNs, Recurrent NNs, and probabilistic NNs [1], [35], [36], [37], [38], [39], [40]. These models mostly use the seismicity indicators as the parameters. These models are limited to predict the earthquake magnitude of more than 7.5. Therefore, MLP-ABC can predict the magnitude of more than 7.5.
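
Before turning to the results, the training procedure of Sections 3.2 and 4 can be summarised in code. The following is a minimal sketch under stated assumptions, not the implementation used in the experiments. It assumes a single hidden layer with bias terms and sigmoid units. The neighbour rule $v_{ij} = x_{ij} + \phi_{ij}(x_{ij} - x_{kj})$ is taken from the standard ABC formulation, since the corresponding formula is not legible in this text. The names `mlp_forward`, `mse`, `fitness`, `abc_train`, and `search_near` are hypothetical, and a toy sine series stands in for the earthquake data.

```python
import numpy as np

def mlp_forward(w, X, shape):
    """Sigmoid MLP with one hidden layer; w is the flat weight vector (one food source)."""
    n_in, n_hid, n_out = shape
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    a, b = n_in * n_hid, n_in * n_hid + n_hid
    c = b + n_hid * n_out
    W1, b1 = w[:a].reshape(n_in, n_hid), w[a:b]
    W2, b2 = w[b:c].reshape(n_hid, n_out), w[c:c + n_out]
    return sig(sig(X @ W1 + b1) @ W2 + b2)

def mse(w, X, d, shape):
    """Equation (2): squared error summed over outputs, averaged over the n patterns."""
    return float(np.mean(np.sum((d - mlp_forward(w, X, shape)) ** 2, axis=1)))

def fitness(f):
    """Equation (5)."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def abc_train(X, d, shape, sn=10, mcn=50, lo=-10.0, hi=10.0, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out = shape
    D = n_in * n_hid + n_hid + n_hid * n_out + n_out   # weights plus biases (assumed)
    limit = sn * D                                     # "limit" = FoodNumber x D
    foods = rng.uniform(lo, hi, size=(sn, D))          # equation (6) for every source
    cost = np.array([mse(x, X, d, shape) for x in foods])
    trials = np.zeros(sn, dtype=int)
    best = int(np.argmin(cost))
    best_w, best_cost = foods[best].copy(), float(cost[best])

    def search_near(i):
        # Standard ABC neighbour rule (assumed): v_ij = x_ij + phi * (x_ij - x_kj)
        k = int(rng.choice([s for s in range(sn) if s != i]))
        j = int(rng.integers(D))
        v = foods[i].copy()
        v[j] = np.clip(v[j] + rng.uniform(-1.0, 1.0) * (v[j] - foods[k, j]), lo, hi)
        c = mse(v, X, d, shape)
        if c <= cost[i]:            # greedy selection: keep equal or better nectar
            foods[i], cost[i], trials[i] = v, c, 0
        else:
            trials[i] += 1

    for _ in range(mcn):
        for i in range(sn):                              # employed bee phase
            search_near(i)
        fit = np.array([fitness(c) for c in cost])
        p = fit / fit.sum()                              # equation (4)
        for i in rng.choice(sn, size=sn, p=p):           # onlooker bee phase
            search_near(int(i))
        worn = int(np.argmax(trials))                    # scout bee phase
        if trials[worn] > limit:
            foods[worn] = rng.uniform(lo, hi, size=D)    # equation (6)
            cost[worn] = mse(foods[worn], X, d, shape)
            trials[worn] = 0
        i = int(np.argmin(cost))                         # memorise the best source
        if cost[i] < best_cost:
            best_w, best_cost = foods[i].copy(), float(cost[i])
    return best_w, best_cost

if __name__ == "__main__":
    t = np.arange(60)
    series = 0.5 + 0.4 * np.sin(0.3 * t)                 # toy stand-in, not earthquake data
    X = np.stack([series[:-2], series[1:-1]], axis=1)    # two lagged inputs
    d = series[2:].reshape(-1, 1)                        # next value as target
    w, err = abc_train(X, d, shape=(2, 2, 1))
    print(f"best training MSE after 50 cycles: {err:.6f}")
```

Because the best food source is memorised in every cycle, the returned error can never exceed the error of the best randomly initialised source. The sketch keeps all weights inside [-10, 10] by clipping, which is one possible reading of the limited foods area described in Section 4.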
6 SIMULATION RESULTS
[…] [-1, 1] is for MLP-BP and [-10, 10] is for MLP-ABC. The weight values of MLP-ABC were initialised, evaluated, and fitted using the ABC algorithm, while the weight values of MLP-BP were adjusted randomly from the range [-1, 1]. All the simulation parameters were taken as given in Table 1. Besides that, the minimum value of mean square error (MSE) was selected for testing. The stopping criterion of minimum error was set to 0.0001 for MLP-BP while MLP-ABC was stopped on MCN. The MLP was trained with input, hidden, and output nodes varying from 2 to 4, respectively.

During the experiment, 5 trials were performed for training MLP-ABC. Each case and run was started with a different number of parameters and with a random population of foods. The sigmoid function was used as the activation function for the network output. The value of “limit” is equal to FoodNumber × D, where D is the dimension of the problem and FoodNumber is half of the colony size, which is 50.

When the number of input, hidden, and output nodes of the neural network and the running time varied, the performance of ABC was stable, which is important for the design of neural networks in the current state where there are no specific rules for deciding the number of hidden nodes.

Finally, mean square error (MSE) and normalised mean square error (NMSE) were calculated for the MLP-BP and MLP-ABC algorithms. The simulation results showed the effectiveness and efficiency of the ABC algorithm. The comparison of different network structures is presented in Table 2. The network parameters, MCN, objective function evaluation (OFE), runtimes, network shape, and epochs are presented in Table 1.

TABLE 1
NETWORK PARAMETER FOR MLP-BP AND MLP-ABC

Parameters             MLP-BP               MLP-ABC
LR                     0.6                  -
Momentum               0.5                  -
Dimension              From 6 to 28         From 6 to 28
MCN                    -                    From 100 to 1000
Epochs/OFE             1000 to 3000         1000 to 5000
No. of hidden nodes    From 2 to 4 nodes    From 2 to 4 nodes
No. of input nodes     From 2 to 4 nodes    From 2 to 4 nodes
No. of output nodes    From 1 to 4 nodes    From 1 to 4 nodes
Weights range          [-1, 1]              [-10, 10]
Runtime                -                    From 2 to 10

TABLE 2
AVERAGE RESULTS OF MLP-BP AND MLP-ABC FOR PREDICTION.

Network Structure/MSE    MLP-ABC       MLP-BP
2-2-1                    0.00161368    0.0195048
2-3-1                    0.00170239    0.0184944
3-3-1                    0.00161061    0.0174701
4-2-4                    0.00163705    0.0193873
4-4-2                    0.00187162    0.0220181

where OFE = MCN × FoodSource.

The dimension (D) can change the structure of the MLP-ABC network model selected where 6, 9, 13, 16, 22, and 28 showed 2-2-1, 2-3-1, 3-3-1, 4-2-4 and 4-4-4, respectively.

From Table 1, we can see that the maximum cycle numbers are less than the maximum epochs for MLP-ABC training while the OFE increases. The network structure employed in this experimentation started from two input, two hidden, and one output nodes up to four input, four hidden, and two output nodes, and the MSE and NMSE were recorded for training the MLP-ABC and MLP-BP algorithms on earthquake data.

The NMSE can be found by the following formulae:

$$NMSE = \frac{1}{\sigma^2 n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad (7)$$

$$\sigma^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \qquad (8)$$

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \qquad (9)$$

where $n$ is the total number of given data, $Y$ is the actual value of earthquake magnitude, and $\hat{Y}$ represents values of predicted magnitude of earthquake.
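
As an illustration of how the three formulae fit together, the following minimal sketch computes the NMSE with NumPy. The function name `nmse` and the sample values are hypothetical and are not taken from the experiments.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalised mean square error following equations (7) to (9)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    y_bar = y_true.mean()                               # equation (9)
    sigma2 = np.sum((y_true - y_bar) ** 2) / (n - 1)    # equation (8)
    return float(np.sum((y_true - y_pred) ** 2) / (sigma2 * n))  # equation (7)

if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0]           # illustrative magnitudes only
    predicted = [2.5, 2.5, 2.5, 2.5]        # always predicting the mean
    print(nmse(actual, predicted))
```

A predictor that always outputs the mean of the actual values scores (n - 1)/n, so values well below 1 indicate that the model explains most of the variance in the series.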
Fig 3: MSE of MLP-BP for Earthquake Data
Fig 4: MSE by MLP-ABC for Earthquake Data
Fig 7: Testing of Earthquake Data by MLP-BP
Fig 8: Testing of Earthquake Data by MLP-ABC
7 CONCLUSION
[…] of the algorithms may speed up the initialisation and improve the prediction accuracy of the trained NNs. The simulation results show that the proposed ABC algorithm can successfully train real time series data for prediction purposes, which further extends the quality of the given approach. The performance of ABC is compared with the traditional BP algorithm. ABC achieves a markedly lower MSE than backpropagation in the experiments (roughly 0.0016 to 0.0019 against 0.017 to 0.022 in Table 2). ABC also shows higher accuracy in prediction. The proposed frameworks have successfully predicted the magnitude of earthquake.

ACKNOWLEDGEMENT

The authors would like to thank Universiti Tun Hussein Onn Malaysia (UTHM) for supporting this research under the Postgraduate Incentive Research Grant Vote No. 0739.

REFERENCES

[1] Panakkat, A. & Adeli, H. (2007), Neural network models for earthquake magnitude prediction using multiple seismicity indicators, International Journal of Neural Systems, 17(1), 2007, 13-33.
[2] Liao, S.-H. and C.-H. Wen (2007). "Artificial neural networks classification and clustering of methodologies and applications - literature analysis from 1995 to 2005." Expert Systems with Applications 32(1): 1-11.
[3] Ghazali, R., A. Jaafar Hussain, et al. (2009). "Non-stationary and stationary prediction of financial time series using dynamic ridge polynomial neural network." Neurocomputing 72(10-12): 2359-2367.
[4] J. Connor and L. Atlas, "Recurrent Neural Networks and Time Series Prediction", IEEE International Joint Conference on Neural Networks, New York, USA, pp. I 301 - I 306.
[5] Du, K. L. (2010). "Clustering: A neural network approach." Neural Networks 23(1): 89-107.
[6] Carrasco, M. P. and M. V. Pato (2004). "A comparison of discrete and continuous neural network approaches to solve the class/teacher timetabling problem." European Journal of Operational Research 153(1): 65-79.
[7] Chen Guojin; Zhu Miaofen et al, "Application of Neural Networks in Image Definition Recognition," Signal Processing and Communications, 2007. ICSPC. IEEE International Conference on, vol, no., pp. 1207-1210, 24-27 Nov. 2007 doi: 10.1109/ICSPC.2007.
[8] Romano, Michele, et al "Artificial neural network for tsunami forecasting," Journal of Asian Earth Sciences Vol. 36, pp. 29-37, September 2009.
[9] Mohsen Hayati, and Zahra Mohebi, "Application of Artificial Neural Networks for Temperature forecasting," World Academy of Science, Engineering and Technology 28 2007.
[10] Muriel Perez, "Artificial neural networks and bankruptcy forecasting: a state of the art," Neural Comput & Application (2006) 15: 154–163.
[11] Leung, C., Member, Chow, W.S.: A Hybrid Global Learning Algorithm Based on Global Search and Least Squares Techniques for Backpropagation Networks, Neural Networks, 1997. In: International Conference on, vol. 3, pp. 1890–1895 (1997)
[12] Yao, X.: Evolutionary artificial neural networks. International Journal of Neural Systems 4(3), 203–222 (1993)
[13] Mendes, R., Cortez, P., Rocha, M., Neves, J.: Particle swarm for feedforward neural network training. In: Proceedings of the International Joint Conference on Neural Networks, vol. 2, pp. 1895–1899 (2002)
[14] Van der Bergh, F., Engelbrecht, A.: Cooperative learning in neural networks using particle swarm optimizers. South African Computer Journal 26, 84–90 (2000)
[15] Ilonen, J., Kamarainen, J.I., Lampinen, J.: Differential Evolution Training Algorithm for Feed-Forward Neural Networks
[16] Yu, B., He, X.: Training Radial Basis Function Networks with Differential Evolution, Granular Computing, 2006. In: IEEE International Conference on, May 10-12, 2006, pp. 369–372 (2006)
[17] Liu, Y.-P., M.-G. Wu, et al. (2006). Evolving Neural Networks Using the Hybrid of Ant Colony Optimization and BP Algorithms. Advances in Neural Networks - ISNN 2006. J. Wang, Z. Yi, J. Zurada, B.-L. Lu and H. Yin, Springer Berlin / Heidelberg. 3971: 714-722.
[18] N. M. Nawi, R. Ghazali, and M.N.M Salleh: "Predicting Patients with Heart Disease by Using an Improved Back-propagation Algorithm", Journal of Computing. Volume 3, Issue 2, February 2011, pp. 53-58
[19] D.E. Rumelhart, J.L. McClelland, and the PDP Research Group (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vols. 1 and 2 (MIT Press, Cambridge, MA)
[20] D. Karaboga, Bahriye Akay, "A comparative study of Artificial Bee Colony algorithm", Applied Mathematics and Computation 214 (2009) 108–132.
[21] A.E. Eiben, J.E. Smith, Introduction to Evolutionary Computing, Springer, 2003.
[22] R.C. Eberhart, Y. Shi, J. Kennedy, Swarm Intelligence, Morgan Kaufmann, 2001.
[23] D. Karaboga, B. Akay (2007). "Artificial Bee Colony (ABC) Algorithm on Training Artificial Neural Networks", Signal Processing and Communications Applications, SIU 2007. IEEE 15th.
[24] Zhang, C., D. Ouyang, et al. (2010). "An artificial bee colony approach for clustering." Expert Systems with Applications 37(7): 4761-4767.
[25] Dervis Karaboga, "An idea based on honey Bee Swarm for Numerical Optimization Technique" Report-06, Erciyes University Engineering Faculty, Computer Engineering Department (2005)
[26] Sandhya Samarasinghe (2007) Neural networks for applied sciences and engineering
[27] F. Rosenblatt, "A Probabilistic Model for Information Storage and Organization in the Brain", Cornell Aeronautical Laboratory, vol. 65, pp. 386-408, 1958.
[28] E. Bonabeau, M. Dorigo, G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, NY, 1999.
[29] Dorigo M, Maniezzo V, Colorni A (1996) The ant system: optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern part B 26(1):1–13 http://www.i-csrs.org/ijasca/index.html
[30] http://earthquake.usgs.gov/earthquakes/recenteqsww/
[31] L.N. De Castro, F.J. Von Zuben, Artificial immune systems, Part I. Basic theory and applications, Technical Report Rt Dca 01/99, Feec/Unicamp, Brazil, 1999.
[32] R.C. Eberhart, Y. Shi, J. Kennedy, Swarm Intelligence, Morgan Kaufmann, 2001.
[33] Weiß, G.: Neural Networks and Evolutionary Computation. Part I: Hybrid Approaches in Artificial Intelligence. In: International Conference on Evolutionary Computation, pp. 268–272 (1994)
[34] Mosavi, M. R. (2007). GPS receivers timing data processing using neural networks: Optimal estimation and errors modeling. International Journal of Neural Systems, 17(5), 383-393.
[35] Hung, S. L, & Adeli, H. (1993). Parallel backpropagation learning algorithms on Cray Y-MP8/864 supercomputer. Neurocomputing, 5(6), 287-302.
[36] Adeli, H., & Karim, A. (2000). Fuzzy-wavelet RBFNN model for freeway incident detection. Journal of Transportation Engineering, 126(6), 464-471.
[37] Liu, H., Wang, X., & Qiang, W. (2007). A fast method for implicit surface reconstruction based on radial basis functions network from 3D scattered points. International Journal of Neural Systems, 17(6), 459-465.
[38] Mayorga, R. V., & Carrera, J. (2007). A radial basis function network approach for the computational of inverse continuous time variant functions. International Journal of Neural Systems, 17(3), 149-160.
[39] Schaefer, A. M., & Zimmermann, H. G. (2007). Recurrent neural networks are universal approximators. International Journal of Neural Systems, 17(4), 253-263.
[40] Adeli, Hojjat, Panakkat, Ashif. "A probabilistic neural network for earthquake magnitude prediction" Earthquake engineering VL: 22 pp 1018-1024 SN- 0893-6080 September 2009

Habib Shah has been a Ph.D. student at Universiti Tun Hussein Onn Malaysia (UTHM) since 2010. His current research focuses on the optimization of Artificial Neural Networks using Swarm Intelligence Algorithms. He received his Masters in Computer Science from Federal Urdu University of Arts, Science and Technology, Karachi, Pakistan in 2007, and his Bachelors in Computer Science from University of Malakand, Pakistan in 2005.

Rozaida Ghazali is currently a Deputy Dean (Research and Development) at the Faculty of Information Technology and Multimedia, Universiti Tun Hussein Onn Malaysia (UTHM). She graduated with a Ph.D. degree from the School of Computing and Mathematical Sciences at Liverpool John Moores University, United Kingdom in 2007, on the topic of Higher Order Neural Networks for Financial Time Series Prediction. Earlier, in 2003, she completed her M.Sc. degree in Computer Science at Universiti Teknologi Malaysia (UTM). She received her B.Sc. (Hons) degree in Computer Science (Information System) from Universiti Sains Malaysia (USM) in 1997. In 2001, Rozaida joined the academic staff of UTHM. Her research areas include neural networks, data mining, financial time series prediction, data analysis, physical time series forecasting, and fuzzy logic.

Nazri Mohd Nawi received his B.S. degree in Computer Science from University of Science Malaysia (USM), Penang, Malaysia. His M.Sc. degree in computer science was received from University of Technology Malaysia (UTM), Skudai, Johor, Malaysia. He received his Ph.D. degree in the Mechanical Engineering department, Swansea University, Wales Swansea. He is currently an Associate Professor in the Software Engineering Department at Universiti Tun Hussein Onn Malaysia (UTHM). His research interests are in optimization, data-mining techniques and neural networks.