Z IBRAHIM et al.: BENCHMARKING OF AN EXTENDED COMPUTATIONAL MODEL OF ANT COLONY
Performance Evaluation and Benchmarking of an Extended Computational Model
of Ant Colony System for DNA Sequence Design
Zuwairie Ibrahim and Mohd Falfazli Mat Jusof
Faculty of Electrical and Electronic Engineering
Universiti Malaysia Pahang
26600 Pekan, Malaysia
zuwairie@ump.edu.my
Abstract: The ant colony system (ACS) algorithm is one of the biologically inspired algorithms that have been introduced to effectively
solve a variety of combinatorial optimisation problems. In the literature, ACS has been employed to solve the DNA sequence design problem.
The DNA sequence design problem was modelled based on a finite state machine in which the nodes represent the DNA bases {A, C,
T, G}. Later, in 2011, an extended computational model of the finite state machine was employed for DNA sequence design using
ACS. The performance evaluation, however, was limited. In this study, the extended computational model of the finite state machine is
revisited and an extensive performance evaluation is conducted using 5, 7, 10, 15, 20, 25, 30, 35, and 40 agents/ants, each with 100
independent runs. The performance of the extended computational model is also benchmarked against existing algorithms such as the
Genetic Algorithm (GA), Multi-Objective Evolutionary Algorithm (MOEA), and Particle Swarm Optimisation (PSO).
Keywords: Ant Colony System, DNA Sequence Design, Finite State Machine.
I. INTRODUCTION
DNA computing is an interdisciplinary research area that uses DNA molecules to solve computational problems. Adleman initiated the field of DNA computing in 1994 [1], when he discovered a method for solving a hard combinatorial problem using DNA. Adleman used the method of manipulating DNA to solve a seven-node Hamiltonian Path Problem (HPP). The goal of Adleman’s experiment was to determine the existence of a path that starts at the initial city, finishes at the end city and passes through each of the remaining cities exactly once. Based on Adleman’s success, researchers around the world are currently working to exploit the extremely dense information storage and massive parallelism properties of DNA, hoping to one day produce a DNA computer that outperforms conventional electronic computers.
The reliability of DNA computation is highly dependent on the information represented on the DNA strands and on the strand reactions. However, because of technological difficulties and the chemical characteristics of the molecules, DNA reactions could result in inaccuracies in the computation. One of the main approaches to preventing illegal reactions, and consequently removing in advance the potential error due to the biochemical reaction, is to design a good set of independent DNA sequences. An independent DNA sequence set means a set of DNA sequences that have a minimal tendency for cross-hybridisation and a maximal difference among them. In addition, they must have similar physical properties, such as length and melting temperature. By removing the error beforehand, no DNA is wasted because of an illegal reaction, and the reliability of the computation is improved, which consequently ensures a high computational accuracy.
Ant colony optimisation (ACO) [2] is one of the biologically inspired algorithms that have been introduced to effectively solve various combinatorial optimisation problems. ACO was first introduced in 1992 by Marco Dorigo. It was derived from an observation of real ants’ behaviours. The main idea behind the algorithm is that the self-organising and highly coordinated behaviour of ants can be exploited to solve complex computational problems.
The ant colony system (ACS) [3] is an extension of ACO. In the literature, ACS has been employed for solving the DNA sequence design problem [4-5]. To model the DNA sequence design problem as a path-finding problem, a simple model, similar to a finite state machine of four nodes, has been employed. In that model, the number of ants required is the same as the number of DNA sequences. This computational model, however, limits the efficiency of the ACS algorithm in finding good solutions. Hence, there is a need to extend the computational model in order to fully utilise the stochastic search of ACS. An extended computational model has been developed [6]. To improve the flexibility in terms of the number of ants, every ant represents a set of DNA sequences in the extended computational model. In this study, an
DOI 10.5013/IJSSST.a.15.06.06
49
ISSN: 1473-804x online, 1473-8031 print
extensive performance evaluation of the extended
computational model is presented, and its performance is
also benchmarked against existing algorithms that have
been employed in DNA sequence design.
II. DNA SEQUENCE DESIGN APPROACHES
A good set of DNA sequences, in which each sequence is unique but forms a stable duplex with its complement, is highly desirable to improve the reliability and accuracy of DNA computing. To achieve this goal, several objective functions and constraints have been employed in DNA sequence design. An ideal method should include all of the desired design criteria. These criteria can be classified into three categories [7] as follows:
• Prevent mismatch hybridisation - Mismatch hybridisation is considered to be an illegal reaction in DNA sequence design, and it reduces the reliability and efficiency of the DNA computation. Satisfying this first criterion forces each DNA sequence in the set to form a duplex only with its own complement and, consequently, improves the computational accuracy. The objective functions that fall under this category are the similarity, H-measure, distance reverse complement, and distance reverse Hamming.
• Prevent the formation of a secondary structure - Types of possible secondary structures include the internal loop, hairpin loop and bulge loop, which are usually formed by the interaction of single-stranded DNA or RNA. The formation of a secondary structure must be avoided because it lowers the efficiency of the computational system. A known cause of secondary structure is the continuous occurrence of the same base, which makes a strand twist or bend. The objective functions that are used when addressing the formation of secondary structures are the hairpin, continuity and forbidden subsequences.
• Maintain uniform chemical characteristics - In many cases, it is favourable to use DNA sequences that possess similar chemical characteristics, so that the system behaves uniformly. Some of the constraints that have previously been used to measure this criterion are the free energy, melting temperature (Tm) and GC content. GC content is the percentage of guanine and cytosine in a whole DNA sequence. The Tm is defined as the temperature at which half of the double-stranded DNA starts to break into its single-stranded form. The free energy is the energy that is necessary to make a duplex.
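As a small illustration of the criteria listed above, GC content can be computed directly from its definition. The sketch below is in Python; the function names `gc_content` and `longest_run` are illustrative and not taken from the paper, and the run-length check is only a simple proxy for the continuity criterion (an assumption, since the exact formulation is given in the cited literature). The two example strings are 20-mers that appear in the result tables of this paper.

```python
from itertools import groupby


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq):
    """Length of the longest run of one repeated base (simple continuity proxy)."""
    return max(len(list(group)) for _, group in groupby(seq.upper()))


print(gc_content("TCTCGTCTCTTCGCGCTCCT"))   # 60.0
print(longest_run("GAAAAAGGACCAAAAGAGAG"))  # 5
```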
Over the years, many advanced algorithms have been applied
to DNA sequence design. Various methods and approaches
[8-19] were used to derive these algorithms and ultimately
to produce a good DNA sequence design.
Figure 1. Illustration of the H-measure calculation [20].
III. FORMULATION OF OBJECTIVE FUNCTIONS AND
CONSTRAINTS IN THE DNA SEQUENCE DESIGN
To achieve a good set of DNA sequences, all of the design
criteria must be included. However, because some criteria
overlap with each other, a total of four objective functions
and two constraints have been selected for this study.
DNA sequence design, as a constrained multi-objective optimisation problem, consists of finding an optimal solution with respect to multiple objectives while adhering to several constraints. For DNA sequence design, the outlined objective functions are to be minimised while the required constraints are satisfied. The design problem is simplified and converted into a single-objective optimisation problem using the weighted summation method. Therefore, the new objective function is the weighted sum of the objective functions, as formulated in Eq. (1):
\[ \min f_{DNA} = \sum_{i} \omega_i f_i \tag{1} \]
subject to the Tm and GCcontent constraints, where f_i is the objective function for each i ∈ {Hmeasure, similarity, hairpin, continuity}, and ω_i is the weight for each f_i. For simplicity, each weight is set to 1. A brief description of the objective functions and constraints follows. A detailed description can be found in [20].
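The weighted summation of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the dictionary keys and the function name `weighted_objective` are hypothetical, and the four objective values are assumed to have been computed elsewhere. The example values are those listed for the first sequence in Table IV.

```python
OBJECTIVES = ("h_measure", "similarity", "hairpin", "continuity")


def weighted_objective(values, weights=None):
    """Weighted sum of Eq. (1); every weight defaults to 1, as in the text."""
    if weights is None:
        weights = {name: 1.0 for name in OBJECTIVES}
    return sum(weights[name] * values[name] for name in OBJECTIVES)


# Objective values of sequence 1 in Table IV.
row = {"h_measure": 40, "similarity": 70, "hairpin": 0, "continuity": 0}
print(weighted_objective(row))  # 110.0
```

With unit weights the result is simply the "Total" column of the result tables; other weightings would change the trade-off between the four objectives.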
A. H-measure
H-measure is a measure of the possibility of unintended DNA hybridisation based on the Hamming distance. In DNA sequence design, the Hamming distance is used to describe the degree of non-similarity between two DNA sequences. The greater the Hamming distance, the fewer complementary base pairs the two sequences share and the less likely mismatch hybridisation becomes. An illustration of the H-measure calculation is shown in Figure 1.
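The two ingredients mentioned above, the Hamming distance and the count of complementary base pairs, can be sketched as follows. This is a deliberately simplified illustration and not the H-measure of [20]: it compares two equal-length sequences in a single antiparallel alignment and ignores position shifts, gaps and thresholds. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(x, y))


def complementary_matches(x, y):
    """Count base pairs formed when x is aligned against y read in reverse."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(COMPLEMENT[a] == b for a, b in zip(x, reversed(y)))


print(hamming_distance("ATCG", "ATCC"))       # 1
print(complementary_matches("ATCG", "CGAT"))  # 4, since CGAT is the reverse complement of ATCG
```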
B. Similarity
Similarity measures how alike two given sequences are when compared in the same direction, including position shifts, so that each sequence is kept as unique as possible. An illustration of the similarity calculation is shown in Figure 2.
Figure 2. Illustration of the similarity calculation [20].
C. Continuity
Continuity assesses the degree of successive occurrence of the same DNA base. Continuous occurrence of the same base results in instability of the DNA structure. This ultimately increases the difficulty of managing the reaction, and the experiment becomes less controllable.
D. Hairpin
The hairpin objective function computes the probability that a single-stranded DNA forms a secondary structure, in particular a hairpin. A hairpin arises from the self-hybridisation of a single-stranded DNA, which causes the formation of a loop. The hairpin objective function considers the length of the hairpin loop, denoted r (ring), and the number of hybridised pairs, denoted p (pair). Figure 3 shows the possible formation of a hairpin based on the values of p and r.
Figure 3. Illustration of the hairpin calculation [20].
E. GCcontent Constraint
The GCcontent is the percentage of G and C bases in a DNA sequence. It is an important parameter because the content of G and C in a strand of DNA can affect the chemical properties of the DNA sequence. It can be calculated using Eq. (2):
\[ GC_{content} = \frac{yG + zC}{wA + xT + yG + zC} \tag{2} \]
where wA, xT, yG and zC are the numbers of A, T, G and C bases in a sequence, respectively.
F. Melting Temperature Constraint
The melting temperature, Tm, is defined as the temperature at which half of the double-stranded DNA starts to break into its single-stranded form. There are numerous equations that can be used to calculate Tm. In this study, the nearest-neighbour formulation in Eq. (3) is used:
\[ T_m(x) = \frac{\Delta H}{\Delta S + R \ln C_T} + 16.6 \log(Na^{+}) \tag{3} \]
where ΔH and ΔS are the enthalpy and entropy changes of the annealing reaction, respectively. R, C_T and Na+ denote the gas constant, the total oligonucleotide strand concentration and the salt concentration used for the salt adjustment, respectively.
IV. ANT COLONY SYSTEM
The ACS algorithm [3] comprises two main phases: the first phase is the construction of the solution, and the second phase is the pheromone update. In this algorithm, an ant moves from one node to another according to the state transition rule to incrementally construct the desired solution. During the construction, the ant releases pheromone at each step, which later influences the movement of subsequent ants, and the pheromone concentration is updated using the local pheromone updating rule. This process continues until all of the ants have completed their routes. The decision of the globally best ant, i.e. the ant that found the best solution, is then reinforced by allowing it to deposit pheromone.
A. State Transition Rule
The ACS state transition rule governs the movement of ants in the system when constructing the desired sequence. This rule is also known as the pseudo-random-proportional rule. Under this rule, ants prefer to move toward nodes that are connected by short edges carrying a high concentration of pheromone. The rule was developed to explicitly balance the exploration of new edges against the exploitation of a priori knowledge, and this balance is determined by the parameter q0. The transition rule is probabilistic. For an ant k on node r, the selection of the next node s depends on a random variable q and on q0, and is given by the transition rule shown in Eq. (4):
\[ p_k(r,s)_{ACS} = \begin{cases} \arg\max_{u \in J_k(r)} \left\{ [\tau(r,u)] \cdot [\eta(r,u)]^{\beta} \right\} & \text{if } q \le q_0 \\ S & \text{otherwise} \end{cases} \tag{4} \]
where q is a random variable uniformly distributed over [0, 1], q0 is a parameter between 0 and 1, and S is a variable that is randomly selected according to the probability distribution given by Eq. (5):
\[ S = \frac{[\tau(r,s)] \cdot [\eta(r,s)]^{\beta}}{\sum_{u \in J_k(r)} [\tau(r,u)] \cdot [\eta(r,u)]^{\beta}} \tag{5} \]
Jk(r) is the set of feasible components, in other words, the set of edges (r, s), where r is the current node and s is a next node that has not yet been visited by the k-th ant. Additionally, (r, u) represents the other edges, where u ranges over all of the nodes that have not yet been visited by the k-th ant. The parameter β (β ≥ 0) controls the relative importance of the pheromone versus the problem-dependent heuristic information η(r, s), which is given by:
\[ \eta(r,s) = \frac{1}{d_{rs}} \tag{6} \]
and τ(r, s) is the pheromone trail associated with the edge that joins nodes r and s. The parameter d_rs is the distance between nodes r and s. The quantity of pheromone represents the past experience of the colony with respect to choosing which path to take. If q ≤ q0, then the best edge, as described in Eq. (4), is chosen (exploitation); otherwise, an edge is chosen according to Eq. (5) (biased exploration). Therefore, the smaller the value of q0, the fewer best edges are exploited and the more the algorithm emphasises exploration.
B. Global Update Rule
The global update rule encourages the ants to search in the neighbourhood of the best solution that has been found thus far, which makes the search for the optimal solution more directed. This rule is applied after all of the ants have constructed a solution, and only the edges that belong to the globally best tour receive reinforcement. Therefore, only the globally best ant, i.e. the ant that found the best solution up to the current iteration of the algorithm, is permitted to deposit pheromone. This strategy favours exploitation, and the pheromone level is updated by applying the global updating rule according to Eq. (7):
\[ \tau(r,s)_{t+1} = (1-\rho) \cdot \tau(r,s)_t + \Delta\tau(r,s) \tag{7} \]
where ρ (0 ≤ ρ ≤ 1) is the evaporation rate, and Δτ(r, s) is the quantity of pheromone laid on edge (r, s) by ant k at iteration t, which is given by:
\[ \Delta\tau(r,s) = \begin{cases} \dfrac{1}{L_k} & \text{if } (r,s) \in \text{tour done by ant } k \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
where L_k is the length of the globally best tour, constructed by ant k, since the beginning of the trial.
The value of ρ is usually fixed. However, it can be changed dynamically according to the user's preferences with respect to exploration and exploitation. For small values of ρ, the pheromone on the edges evaporates slowly and the algorithm favours exploitation. With a large value of ρ, the pheromone deposited by the ants evaporates quickly and exploration is emphasised.
C. Local Update Rule
In constructing the solution, each ant deposits pheromone on each of the edges that it has visited. The change in the pheromone level on each of the edges is given by Eq. (9):
\[ \tau(r,s)_{t+1} = (1-\zeta) \cdot \tau(r,s)_t + \tau_0 \tag{9} \]
where ζ ∈ [0, 1] is the pheromone decay coefficient, and τ0 is the initial value of the pheromone.
The local pheromone update is applied by all of the ants after each construction step. The main goal of the local update is to diversify the search by decreasing the pheromone concentration on the traversed edges, which encourages subsequent ants to choose other routes and thus produce different solutions. This arrangement prevents several ants from producing identical solutions during an iteration.
V. ANT COLONY SYSTEM FOR DNA SEQUENCE DESIGN BASED ON THE EXTENDED COMPUTATIONAL MODEL
A model similar to a finite state machine is utilised in solving the DNA sequence design problem [4-5]. Each node of the finite state machine represents one of the four DNA bases A, T, C and G. Every node is connected to every other node and to itself, as illustrated in Figure 4.
Figure 4. Finite state machine used in DNA sequence design.
Figure 5. Sequence generation.
Figure 5 depicts the movement of an ant from one node to another in constructing a DNA sequence. The initial position of an ant is random. The path of an ant from one node to another forms the DNA sequence. For example, the movement from node A to node T (in Figure 5) forms a path that can be translated to the DNA sequence ‘AT’. If the next move of the ant is to node C, then the DNA sequence formed is ‘ATC’. This process continues until the number of required bases has been achieved.
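The transition and pheromone update rules of Section IV, applied to the four-node graph described above, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: all function names are hypothetical, every base is treated as a feasible next node at every step (the graph has self-loops and nodes may be revisited), the heuristic η is set to 1 on every edge because the graph carries no distances, and the initial pheromone value 0.01 is arbitrary. The `deposit` argument of `local_update` is the term added in Eq. (9); passing τ0 gives the equation as printed.

```python
import random

BASES = "ACGT"  # the four nodes of the finite state machine


def choose_next(tau, eta, r, q0, beta, rng):
    """Pick the next node from node r with the rule of Eqs. (4)-(5)."""
    score = {u: tau[(r, u)] * eta[(r, u)] ** beta for u in BASES}
    if rng.random() <= q0:
        # Exploitation: follow the edge with the highest score.
        return max(BASES, key=lambda u: score[u])
    # Biased exploration: sample a node in proportion to its score.
    return rng.choices(BASES, weights=[score[u] for u in BASES])[0]


def local_update(tau, edge, zeta, deposit):
    """Eq. (9); deposit = tau0 gives the equation as printed."""
    tau[edge] = (1.0 - zeta) * tau[edge] + deposit


def global_update(tau, best_edges, best_length, rho):
    """Eqs. (7)-(8): every edge evaporates, best-tour edges are reinforced."""
    for edge in tau:
        delta = 1.0 / best_length if edge in best_edges else 0.0
        tau[edge] = (1.0 - rho) * tau[edge] + delta


def construct_sequence(length, tau, eta, q0, beta, zeta, deposit, rng):
    """Walk the graph from a random start node and spell out the bases visited."""
    r = rng.choice(BASES)
    seq = [r]
    while len(seq) < length:
        s = choose_next(tau, eta, r, q0, beta, rng)
        local_update(tau, (r, s), zeta, deposit)
        seq.append(s)
        r = s
    return "".join(seq)


rng = random.Random(1)
tau = {(r, s): 0.01 for r in BASES for s in BASES}
eta = {edge: 1.0 for edge in tau}  # assumption: no distances, so eta = 1
print(construct_sequence(20, tau, eta, q0=0.5, beta=1.0,
                         zeta=0.05, deposit=0.01, rng=rng))
```

Setting `q0` close to 1 makes the walk follow the strongest edges almost deterministically, while a small `q0` lets the pheromone-weighted sampling dominate.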
D. Initialisation
The initialisation of the ACS for DNA sequence design comprises the initialisation of the ACS parameters, the initialisation of the DNA sequence parameters and the initial placement of the ants. The initial pheromone concentration is obtained using the following equation:
\[ \tau_0 = \frac{1}{n \cdot Q} \tag{10} \]
where Q is the sum of the objectives calculated for a set of randomly generated DNA sequences, and n is the number of sequences.
E. Construction Process
After the initialisation process, the ants move from one node to another to incrementally construct the DNA sequence, based on the state transition rule. The formulation has been modified so that the heuristic information is omitted from the calculation, and only the pheromone information is used in deciding the next node, as shown in Eq. (11):
\[ p_k(r,s)_{ACS} = \begin{cases} \arg\max_{u \in J_k(r)} \left\{ [\tau(r,u)] \right\} & \text{if } q \le q_0 \\ S & \text{otherwise} \end{cases} \tag{11} \]
Usually, the transition probability is a balance between the pheromone intensity, τ(r, s), which is the history of previously successful moves, and the heuristic information, η(r, s), which expresses the desirability of the move. This approach effectively balances the exploration-exploitation trade-off. Because the DNA sequence design problem offers no information that can be directly used as heuristic information, this model uses only pheromone information in the computations.
After each construction step, the local pheromone updating rule is applied, until a complete solution is constructed. The next step is to check the two constraints, Tm and GCcontent, of the solution. Only after the values of the two constraints are within a specified range does the construction of the solution for the next ant start. This process continues until all of the ants have constructed a solution with satisfactory constraint values. The total objective value is then calculated for every solution constructed by each ant. The solution with the minimum objective value is stored as the best-found solution. Then, the decision of the best ant is reinforced by applying the global updating rule. The iterations proceed until the specified number of iterations has been reached.
F. Extended Computational Model
Ants exhibit cooperative behaviour by contributing and sharing knowledge about the paths that have been taken by other ants, through the deposition of pheromone. An increase in the number of ants improves the exploration and exploitation ability of the algorithm because it enhances the information about the available routes and increases the chances of finding a new optimal route. A small number of ants can result in a sub-optimal solution, while having too many ants results in significantly higher computational time, which can make the approach impractical. Therefore, the number of ants used can affect the overall performance of the system. Based on the literature on applications of the ACS algorithm, the number of ants used is mostly problem-dependent. Different types of problems and models require different numbers of ants. Thus far, there are no suggestions as to the number of ants for the DNA sequence design problem.
Each ant represents a solution of the DNA sequence design problem by producing the required number of DNA sequences. For example, an ant may generate 7 DNA sequences, each of 20-mer length. This representation is given in Eq. (12):
\[ x(k)_t = [p_1, p_2, \ldots, p_n] \tag{12} \]
where x(k)_t is ant k at iteration t. Here, p_n is DNA sequence n, which is represented as p_n = [b_1, b_2, ..., b_z], b_z ∈ {A, C, G, T}, where b_z is the zth base of the nth DNA sequence.
Under this representation, the number of ants is independent of the number of sequences produced in the process. Each ant produces a solution based on the number of required bases. Table 1 summarises the ACS algorithm with the extended computational model. The parameters are initialised to the values that are tabulated in Table 2.
TABLE I. ACS ALGORITHM WITH EXTENDED COMPUTATIONAL MODEL.
Initialise the parameters in Table 2
Loop /* each loop is called an iteration */
    Loop /* each loop is called an ant */
        Position the ant randomly on a start node
        Loop
            Apply the state transition rule to construct a solution
            Apply the local pheromone updating rule
        Until the ant has built a complete solution
        If the GCcontent and Tm constraints are passed for all DNA sequences then
            Proceed with the next ant
        Else
            Repeat the DNA sequence generation using the current ant
        End if
    Until all ants have built a complete solution
    For each solution do
        Calculate the objective functions
    Next
    If the objective value is better than the previous best then
        Store the sequences as the best-found sequences
    End if
    Apply the global pheromone updating rule to the best solution
Until the stopping condition is met
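The loop structure of Table 1 can be sketched in Python as below. This is a toy illustration under several assumptions and should not be read as the authors' implementation: `toy_objective` (a count of adjacent identical bases) stands in for the weighted objective of Eq. (1); `gc_ok` uses a hypothetical 30-70% GC range and the Tm constraint is omitted; the initial pheromone `tau0 = 0.01` is arbitrary; a sequence that fails the constraint is simply rebuilt on its own; and the local update weights the added term by ζ, as is common in ACS implementations, whereas Eq. (9) as printed adds τ0 unweighted. The values of q0, ζ, ρ and the iteration count follow Table 2.

```python
import random

BASES = "ACGT"


def build_sequence(tau, length, q0, zeta, tau0, rng):
    """One walk on the four-node graph using the pheromone-only rule of Eq. (11)."""
    r = rng.choice(BASES)
    seq, edges = [r], []
    for _ in range(length - 1):
        if rng.random() <= q0:   # exploitation, ties broken at random
            s = max(BASES, key=lambda u: (tau[(r, u)], rng.random()))
        else:                    # biased exploration
            s = rng.choices(BASES, weights=[tau[(r, u)] for u in BASES])[0]
        # Local update with the deposit weighted by zeta (assumption, see above).
        tau[(r, s)] = (1 - zeta) * tau[(r, s)] + zeta * tau0
        edges.append((r, s))
        seq.append(s)
        r = s
    return "".join(seq), edges


def gc_ok(seq, low=30.0, high=70.0):
    """Stand-in constraint check; the GC range is a hypothetical choice."""
    gc = 100.0 * sum(b in "GC" for b in seq) / len(seq)
    return low <= gc <= high


def toy_objective(seqs):
    """Stand-in for Eq. (1): number of adjacent identical bases (minimised)."""
    return sum(a == b for s in seqs for a, b in zip(s, s[1:]))


def acs_design(n_ants=5, n_seqs=7, length=20, iterations=300,
               q0=0.5, zeta=0.05, rho=0.1, tau0=0.01, seed=0):
    rng = random.Random(seed)
    tau = {(r, s): tau0 for r in BASES for s in BASES}
    best_seqs, best_val, best_edges = None, float("inf"), set()
    for _ in range(iterations):
        for _ in range(n_ants):          # each ant builds a whole set of sequences
            seqs, edges = [], set()
            while len(seqs) < n_seqs:
                seq, walked = build_sequence(tau, length, q0, zeta, tau0, rng)
                if gc_ok(seq):           # failing sequences are rebuilt
                    seqs.append(seq)
                    edges.update(walked)
            val = toy_objective(seqs)
            if val < best_val:
                best_seqs, best_val, best_edges = seqs, val, edges
        for edge in tau:                 # global update, Eqs. (7)-(8)
            delta = 1.0 / max(best_val, 1) if edge in best_edges else 0.0
            tau[edge] = (1 - rho) * tau[edge] + delta
    return best_seqs, best_val


sequences, value = acs_design(iterations=20)
print(value)
print("\n".join(sequences))
```

Because each ant returns a full set of `n_seqs` sequences, `n_ants` can be varied freely (5 to 40 in this study) without changing the size of the designed set, which is the point of the extended model.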
TABLE II. THE ACS PARAMETERS.
Parameter                              Value
q0                                     0.5
β                                      0
ζ                                      0.05
ρ                                      0.1
Maximum number of iterations (tmax)    300

TABLE III. PERFORMANCE BASED ON THE NUMBER OF ANTS.
No. of Ants   Average   Std Dev.   Min      Max
5             109.045   3.298      103.10   117.98
7             108.180   3.266      102.57   117.29
10            106.685   2.158      102.91   113.29
15            105.995   2.179      102.14   112.86
20            105.542   2.352      101.29   111.71
25            105.519   2.026      100.88   110.43
30            105.527   1.998      101.00   110.14
35            105.503   1.868      101.88   109.86
40            105.483   1.973      99.57    109.29

TABLE IV. BEST RESULT USING 5 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TCTCGTCTCTTCGCGCTCCT    20   60   48.38  40      70      0      0      110
2   ATCTCTCTCTCTGTCCTCTT    20   45   39.37  29      69      0      0      98
3   CTTTCTCTTCTCTCTCTCGC    20   50   40.12  26      66      9      0      101
4   TCATCTTCGCTCTCATCGCA    20   45   41.40  57      58      0      0      115
5   CCGTATCGTCTCTATCCTCT    20   50   40.30  42      66      0      0      108
6   ACTCTGTTCCTCATGTACTT    20   40   38.62  41      57      0      0      98
7   TATCTGCTCCTCTCTCCCGA    20   55   44.31  40      67      9      0      116
Total                                        275     453     18     0      746
Fitness Value (1 run)                        39.286  64.714  2.571  0.000  106.571
Average Value (100 runs)                     46.494  62.164  0.287  0.100  109.045
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
VI. RESULTS AND DISCUSSION
In this study, 9 sets of experiments have been conducted using 5, 7, 10, 15, 20, 25, 30, 35 and 40 ants, each with 100 independent runs. The number of iterations is set to 300 for each set. Figure 6 shows the overall average of the total objective value for each set, while Table 3 tabulates this average together with the standard deviation, minimum and maximum values for each set. The minimum value reflects the best result obtained for each ant model. Table 4 to Table 11 show the best-found sequences for each number of ants used in this study. The best result is the solution that has the lowest total objective value.
A comparison between the ACS with extended computational model and other approaches has been made for benchmarking, based on the analysis of sequences generated using various algorithms, as shown in Table 12. To ensure consistency of the fitness measure and constraints imposed on the sequences, the sequences published by the authors were input into the same system that was developed, in order to regenerate the objective function values.
Table 13 and Figure 7 show a comparison between the results of the proposed ACS model and the results obtained using a Genetic Algorithm (GA) [21]. The GA was used to generate 7 DNA sequences of 20-mer length. The ACS model achieved lower values than the GA model for all four of the objective functions.
A comparison has also been made with the results of [7], in which 14 sequences of 20-mer length were generated, as shown in Table 14 and Figure 8. The sequences generated using the ACS with extended computational model have a smaller total objective value than the sequences generated using simulated annealing (SA), which indicates that the ACS model performs better than the SA model. They have much lower values for Hmeasure, similarity and hairpin but a slightly higher continuity value than the sequences generated using SA.
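The "Fitness Value (1 run)" rows of the result tables appear to be the column totals divided by the number of sequences. This reading is an inference from the tabulated numbers rather than a definition stated in the text. As a quick check, the following Python sketch reproduces the figure reported for 40 ants in Table XI; the function name is illustrative.

```python
def fitness_value(row_totals):
    """Average total objective value per sequence in a designed set."""
    return sum(row_totals) / len(row_totals)


# "Total" column of Table XI (best result using 40 ants).
table_xi_totals = [99, 82, 100, 103, 97, 103, 113]
print(sum(table_xi_totals), round(fitness_value(table_xi_totals), 3))  # 697 99.571
```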
Figure 6. Plot of the average of the total objective value of each ant model.
TABLE V. BEST RESULT USING 10 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TGAGTTGAGTAGTGGGAGGA    20   50   42.46  34      65      9      0      108
2   GTGTGGGTTGTGTTGTGTGA    20   50   44.05  16      78      0      0      103
3   TGTCTGTACGTGTAGTGTCG    20   50   42.10  42      56      0      0      98
4   AGCGTGTTCGTAGTGAGTGC    20   55   45.75  45      63      0      0      108
5   CTTGTCTGTGGTAAGTTGTG    20   45   39.26  37      68      0      0      105
6   TGCGTGCGTGTTGTGTGTTG    20   55   47.87  30      73      0      0      103
7   ATGTGGTGTGGTTGCGCGTT    20   55   49.10  37      60      0      0      97
Total                                        241     463     9      0      722
Fitness Value (1 run)                        34.429  66.143  1.286  0.000  103.143
Average Value (100 runs)                     44.292  62.054  0.268  0.071  106.685
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE VI. BEST RESULT USING 15 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   GTACAGGAAGAGAGGTTACA    20   45   38.69  39      61      0      0      100
2   CAGAACAGAGAGAGAGAGAG    20   50   38.84  29      68      0      0      97
3   TTCCAGTAGCATAGACATAA    20   35   35.92  60      55      0      0      115
4   GCAGAAGCAGTTCGTACCACA   20   55   45.51  54      54      0      0      108
5   GAGAGATAGAGTAGAGATAG    20   40   32.50  38      67      0      0      105
6   CAGAGAGAGGAGAGAGGACG    20   60   43.26  23      68      0      0      91
7   AGACAGGAGAGGAGAGAGAC    20   55   42.27  29      70      0      0      99
Total                                        272     443     0      0      715
Fitness Value (1 run)                        38.857  63.286  0.000  0.000  102.143
Average Value (100 runs)                     43.221  62.471  0.203  0.100  105.995
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE VII. BEST RESULT USING 20 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   CGAGAGGAGAGAGACGAGAC    20   60   43.54  30      66      0      0      96
2   TAGTGAGAGAGAAGAGCTTG    20   45   38.77  43      69      0      0      112
3   CGTGAGAGAGAGAGAGTGAG    20   55   41.34  28      65      0      0      93
4   CGAGAGAGTGTAGAGAGAAC    20   50   39.19  40      65      0      0      105
5   CGTGAGAAGACGAGAGACGC    20   60   45.49  33      60      0      0      93
6   AGAGCGAGAGAGTGAAGAGT    20   50   42.82  40      68      0      0      108
7   TAGAGATAGAGTAGGATTGT    20   34   33.99  40      62      0      0      102
Total                                        254     455     0      0      709
Fitness Value (1 run)                        36.286  65.000  0.000  0.000  101.286
Average Value (100 runs)                     41.988  62.759  0.589  0.182  105.542
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE VIII. BEST RESULT USING 25 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   GAGACGAGAGAAGAGAGCGA    20   55   43.49  28      70      0      0      98
2   TAGAGTAAGGTAGAGAGAAC    20   40   34.97  38      64      0      0      102
3   TAGAGAAGGAAGTACCACCA    20   45   39.90  39      63      0      0      102
4   GAGGAGAGAGAGAGCAGAAG    20   55   41.39  17      71      0      0      88
5   ATAGCCAAGAGAGTAGCAGA    20   45   40.41  41      65      0      0      106
6   CCGAAGAGATAGAAGAGACT    20   45   38.25  32      68      0      0      100
7   TAGAGCAGTAGAAGTTAGAG    20   40   35.67  45      61      0      0      106
Total                                        240     462     0      0      702
Fitness Value (1 run)                        34.286  66.000  0.000  0.000  100.286
Average Value (100 runs)                     43.538  61.569  0.261  0.070  105.519
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE IX. BEST RESULT USING 30 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TGTTGTGAGTTGGTGTTGGT    20   45   43.12  23      65      0      0      88
2   GGTGTGTGTAGTGTCCTTGT    20   50   42.64  32      64      0      0      96
3   TGTGTGTGCGTGTGTGTCGC    20   60   49.40  36      70      0      0      106
4   GTGTTCGTGTGTAGTGACGTA   20   50   42.37  44      60      0      0      104
5   GGTGTGAGGTAGTGTGTACG    20   55   42.76  40      70      0      0      110
6   GTGGAAGTAGAGTAGCGTGC    20   55   43.17  47      56      0      0      103
7   GTGTAGTGTGTGTGTGTGTG    20   50   41.95  27      73      0      0      100
Total                                        249     458     0      0      707
Fitness Value (1 run)                        35.571  65.429  0.000  0.000  101.000
Average Value (100 runs)                     42.896  61.887  0.664  0.149  105.517
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE X. BEST RESULT USING 35 ANTS.
No  DNA Sequences           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   AGAGTAGCAGATTAGAAGGA    20   40   37.49  41      64      0      0      105
2   CTACAGGAAGATAGAGTACA    20   40   35.35  45      61      0      0      106
3   CAGAACAGGAGGAGAGAGTC    20   55   41.61  31      69      0      0      100
4   AGAGGAAGAGAGAAGCATAA    20   40   38.09  36      72      0      0      108
5   GCAGAGAGCGTTCGTACCAG    20   60   45.85  57      54      0      0      111
6   AGAGAGATAGAGATACAGAT    20   35   33.13  35      64      0      0      99
7   GAGAGAGAGAGGAGAGAGGA    20   55   41.39  17      67      0      0      84
Total                                        262     451     0      0      713
Fitness Value (1 run)                        37.429  64.429  0.000  0.000  101.857
Average Value (100 runs)                     42.557  62.476  0.417  0.122  105.503
Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C
are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.
TABLE XI. BEST RESULT USING 40 ANTS.
No  DNA Sequences         len  GC%  Tm      Hm      Sm      Hr     C      Total
1   ACTCACCGACTTCAGGCACC  20   60   47.41   43      56      0      0      99
2   CACACAACACAACACACACA  20   45   42.41   18      64      0      0      82
3   TCAACTACTCACACGACACT  20   45   41.39   39      61      0      0      100
4   GGACAACACACACGCACACT  20   55   46.52   37      66      0      0      103
5   CCAGCCACAGCCACGCACGC  20   75   54.68   38      59      0      0      97
6   CCAACACATTCATCACACAT  20   40   39.35   37      66      0      0      103
7   TGTCACCCAACACAGCATCA  20   50   45.18   46      58      9      0      113
Total                                       258     430     9      0      697
Fitness Value (1 run)                       36.857  61.429  1.286  0.000  99.571
Average Value (100 runs)                    41.929  63.091  0.334  0.129  105.483
Remark: len is the length of the sequences. GC% and Tm are the GC content and melting temperature constraints, respectively. Hm, Sm, Hr,
and C are the H-measure, similarity, hairpin, and continuity objective functions, respectively.
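The GC% column can be recomputed from the sequences themselves. The sketch below uses the standard definition of GC content (the percentage of bases that are G or C); the function name `gc_percent` is illustrative. The melting temperature Tm is not recomputed because the formula used by the authors is not stated in this section.

```python
def gc_percent(seq: str) -> float:
    """Percentage of bases in seq that are G or C."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Sequences and GC% values as listed in Table XI.
TABLE_XI = {
    "ACTCACCGACTTCAGGCACC": 60,
    "CACACAACACAACACACACA": 45,
    "TCAACTACTCACACGACACT": 45,
    "GGACAACACACACGCACACT": 55,
    "CCAGCCACAGCCACGCACGC": 75,
    "CCAACACATTCATCACACAT": 40,
    "TGTCACCCAACACAGCATCA": 50,
}

for sequence, listed in TABLE_XI.items():
    print(sequence, round(gc_percent(sequence)), "listed:", listed)
```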
TABLE XII. ALGORITHMS SELECTED FOR BENCHMARKING.
Algorithm         Author                 DNA Seq         Year
GA                Deaton et al. [21]     7 Seq, 20-mer   1996
SA                Tanaka et al. [7]      14 Seq, 20-mer  2001
MOEA (NACST/Seq)  Shin et al. [16]       14 Seq, 20-mer  2005
MOPSO             Zhao et al. [17]       7 Seq, 20-mer   2007
PSO               GuangZhou et al. [18]  20 Seq, 20-mer  2007
P-ACO             Kurniawan et al. [22]  7 Seq, 20-mer   2009
BinPSO            Khalid et al. [19]     7 Seq, 20-mer   2010
ACS               Yakop et al. [23]      7 Seq, 20-mer   2010
Figure 7. Results obtained by Deaton et al. [21] vs results obtained by the ACS with extended computational model. (Bar chart; series: GA - Deaton et al. and Proposed ACS model; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
Figure 8. Results obtained by Tanaka et al. [7] vs results obtained by the ACS with extended computational model. (Bar chart; series: Tanaka et al. (SA) and Proposed ACS; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
TABLE XIII. SEQUENCES GENERATED BY DEATON ET AL. [21] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
DNA Sequence              Hm       Sm       Hr     C      Total
GA - Deaton et al.
ATAGAGTGGATAGTTCTGGG      55       66       0      9      130
CATTGGCGGCGCGTAGGCTT      44       62       0      0      106
CTTGTGACCGCTTCTGGGGA      60       70       0      16     146
GAAAAAGGACCAAAAGAGAG      40       69       0      41     150
GATGGTGCTTAGAGAAGTGG      51       61       0      0      112
TGTATCTCGTTTTAACATCC      41       74       4      16     135
TTGTAAGCCTACTGCGTGAC      47       64       0      0      111
Fitness Value             48.29    66.57    0.57   11.71  127.14
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41       64       0      0      105
CTACAGGAAGATAGAGTACA      45       61       0      0      106
CAGAACAGGAGGAGAGAGTC      31       69       0      0      100
AGAGGAAGAGAGAAGCATAA      36       72       0      0      108
GCAGAGAGCGTTCGTACCAG      57       54       0      0      111
AGAGAGATAGAGATACAGAT      35       64       0      0      99
GAGAGAGAGAGGAGAGAGGA      17       67       0      0      84
Fitness Value (1 run)     37.43    64.43    0.00   0.00   101.86
Average Value (100 runs)  42.557   62.476   0.417  0.122  105.571
Remark: Hm, Sm, Hr, and C are the H-measure, similarity, hairpin, and continuity objective functions, respectively.
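The C column for the GA sequences of Deaton et al. in Table XIII is consistent with a simple run-based reading of the continuity objective. The sketch below is an assumption inferred from the tabulated numbers, not the definition given by the authors: it sums the squared lengths of maximal runs of identical bases that are at least `min_run` bases long. The function name `continuity` and the parameter `min_run` are hypothetical.

```python
from itertools import groupby


def continuity(seq: str, min_run: int = 3) -> int:
    """Sum of squared lengths of maximal same-base runs of at least min_run bases.

    Assumed form of the continuity objective (inferred from Table XIII).
    """
    total = 0
    for _, run in groupby(seq):
        length = len(list(run))
        if length >= min_run:
            total += length * length
    return total


# GA sequences of Deaton et al. with the C values listed in Table XIII.
GA_ROWS = {
    "ATAGAGTGGATAGTTCTGGG": 9,
    "CATTGGCGGCGCGTAGGCTT": 0,
    "CTTGTGACCGCTTCTGGGGA": 16,
    "GAAAAAGGACCAAAAGAGAG": 41,
    "GATGGTGCTTAGAGAAGTGG": 0,
    "TGTATCTCGTTTTAACATCC": 16,
    "TTGTAAGCCTACTGCGTGAC": 0,
}

for sequence, listed in GA_ROWS.items():
    print(sequence, continuity(sequence), "listed:", listed)
```

For example, GAAAAAGGACCAAAAGAGAG contains one run of five A bases and one run of four A bases, giving 25 + 16 = 41 under this reading.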
A comparison was then made between the proposed model and
the multi-objective evolutionary algorithm (MOEA) designed
by Shin et al. [16]. Shin et al. published several sets of DNA
sequences generated by their system; the set of 14 DNA
sequences of 20-mer length shown in Table 15 is used for the
comparison with the proposed model. As seen in Figure 9, the
sequences generated by the ACS with extended computational
model have a much lower H-measure value than the sequences
generated by MOEA (91.00 against 145.5). The sequences
generated by MOEA outperformed the ACS model in terms of
the similarity, continuity, and hairpin measures. However, the
total objective value of the sequences produced by the ACS
with extended computational model is lower than that of the
sequences designed using MOEA (228.71 against 250).
Therefore, the overall performance of the ACS with extended
computational model is better than that of MOEA.
Table 16 and Figure 10 compare the results of the ACS
with extended computational model with those of the
multi-objective PSO (MOPSO) [17]. MOPSO was employed to
generate 7 DNA sequences of 20-mer length. The ACS with
extended computational model outperformed MOPSO on all
four objective measures.
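The per-objective comparisons above follow one rule: for every objective, the lower value is the better one. A minimal Python sketch of that rule is given below, using the "Fitness Value" rows of Tables XV and XVI as input; the function `compare` and the tuple layout are illustrative choices, not part of the original study.

```python
OBJECTIVES = ("H-measure", "Similarity", "Hairpin", "Continuity")


def compare(acs, other, other_name):
    """For each objective, report which method has the lower (better) value."""
    verdict = {}
    for name, a, b in zip(OBJECTIVES, acs, other):
        if a < b:
            verdict[name] = "ACS"
        elif b < a:
            verdict[name] = other_name
        else:
            verdict[name] = "tie"
    return verdict


# Fitness values (Hm, Sm, Hr, C) copied from Tables XV and XVI.
acs_14 = (91.00, 136.43, 1.29, 0.00)   # ACS, 14 sequences, 1 run
moea = (145.5, 104.5, 0.0, 0.0)        # Shin et al. [16]
acs_7 = (37.43, 64.43, 0.00, 0.00)     # ACS, 7 sequences, 1 run
mopso = (48.29, 66.57, 0.57, 11.71)    # Zhao et al. [17]

print(compare(acs_14, moea, "MOEA"))
print(compare(acs_7, mopso, "MOPSO"))
print("Totals:", round(sum(acs_14), 2), "vs", round(sum(moea), 2))
```

With the single-run values, the continuity entries for the ACS and MOEA are equal; the 100-run average of the ACS (0.129) is slightly above MOEA's value of 0.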
TABLE XIV. SEQUENCES GENERATED BY TANAKA ET AL. [7] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
Sequence                  Hm       Sm       Hr     C      Total
SA - Tanaka et al.
CGAGACATCGTGCATATCGT      111      159      4      0      274
TATAGCACGAGTGCGCGTAT      110      158      0      0      268
GATCTACGATCATGAGAGCA      119      160      4      0      283
TCTGTACTGCTGACTCGAGT      120      154      0      0      274
CGAGTAGTCACACGATGAGA      112      160      0      0      272
AGATGATCGGCAGCGAGAGT      117      157      0      0      274
TGTGCTCGTCTCTGCATACT      108      161      4      0      273
AGACGAGTCGTACAGTACAG      113      150      0      0      263
ATGTACGTGAGATGCAGCAG      111      150      0      0      261
ATCACTACTCGCTCGTCACT      109      151      0      0      260
TCAGAGATACTCACGTCACG      115      158      0      0      273
GACAGAGCTATCAGCTACTG      112      153      0      0      265
GCTGACATAGAGTGCGATAG      112      149      0      0      261
ACATCGACACTACTACGCAC      104      155      0      0      259
Fitness Value             112.36   155.36   0.86   0.00   268.57
ACS WITH EXTENDED COMPUTATIONAL MODEL
CAGCACTCCAAGCACAACAG      82       127      0      0      209
GCACTACACACACGCGCCAC      95       151      0      0      246
AGCACACAGTCACGAAGGCG      107      124      0      0      231
ACACACACACACAGCGAACA      67       153      0      0      220
ACACACACACACACACACGA      60       160      0      0      220
GCACTAACCGAGAGCACTAC      75       145      0      0      220
GCTACACTATAGCGGCACAG      114      129      0      0      243
GCACAACCACAACTAAACAC      73       147      0      0      220
GCAGGCAACGCGCGAGCCAA      103      119      0      0      222
TAACTACGCGACACTAGCAT      101      133      0      0      234
TCACACACATACACAAACAT      79       143      9      0      231
TATACACTAGTAGCAACTAC      106      121      0      0      227
TAGCCCTACAACGCACGGTA      111      119      9      0      239
GTACCAGCACCTACACACAC      101      139      0      0      240
Fitness Value (1 run)     91.00    136.43   1.29   0.00   228.71
Average Value (100 runs)  97.941   135.908  0.740  0.129  234.717
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Figure 9. Results obtained by Shin et al. [16] vs results obtained by the ACS with extended computational model. (Bar chart; series: Shin et al. (MOEA) and Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
TABLE XV. SEQUENCES GENERATED BY SHIN ET AL. [16] VS SEQUENCES GENERATED BY THIS STUDY.
Sequence                  Hm       Sm       Hr     C      Total
MOEA - Shin et al.
GTGACTTGAGGTAGGTAGGA      149      96       0      0      245
ATCATACTCCGGAGACTACC      143      101      0      0      244
CACGTCCTACTACCTTCAAC      145      113      0      0      258
ACACGCGTGCATATAGGCAA      144      105      0      0      249
AAGTCTGCACGGATTCCTGA      152      108      0      0      260
AGGCCGAAGTTGACGTAAGA      152      107      0      0      259
CGACACTTGTAGCACACCTT      142      102      0      0      244
TGGCGCTCTACCGTTGAATT      151      106      0      0      257
CTAGAAGGATAGGCGATACG      138      107      0      0      245
CTTGGTGCGTTATGTGTACA      148      93       0      0      241
TGCCAACGGTCTCAACATGA      152      108      0      0      260
TTATCTCCATAGCTCCAGGC      142      99       0      0      241
TGAACGAGCATCACCAACTC      144      110      0      0      254
CTAGATTAGCGGCCATAACC      135      108      0      0      243
Fitness Value             145.5    104.5    0      0      250
ACS WITH EXTENDED COMPUTATIONAL MODEL
CAGCACTCCAAGCACAACAG      82       127      0      0      209
GCACTACACACACGCGCCAC      95       151      0      0      246
AGCACACAGTCACGAAGGCG      107      124      0      0      231
ACACACACACACAGCGAACA      67       153      0      0      220
ACACACACACACACACACGA      60       160      0      0      220
GCACTAACCGAGAGCACTAC      75       145      0      0      220
GCTACACTATAGCGGCACAG      114      129      0      0      243
GCACAACCACAACTAAACAC      73       147      0      0      220
GCAGGCAACGCGCGAGCCAA      103      119      0      0      222
TAACTACGCGACACTAGCAT      101      133      0      0      234
TCACACACATACACAAACAT      79       143      9      0      231
TATACACTAGTAGCAACTAC      106      121      0      0      227
TAGCCCTACAACGCACGGTA      111      119      9      0      239
GTACCAGCACCTACACACAC      101      139      0      0      240
Fitness Value (1 run)     91.00    136.43   1.29   0.00   228.71
Average Value (100 runs)  97.941   135.908  0.740  0.129  234.717
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Figure 10. Results obtained by Zhao et al. [17] vs results obtained by the ACS with extended computational model. (Bar chart; series: Zhao et al. (MOPSO) and Proposed ACS; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
TABLE XVI. SEQUENCES GENERATED BY ZHAO ET AL. [17] VS SEQUENCES GENERATED BY THIS STUDY.
Sequence                  Hm       Sm       Hr     C      Total
MOPSO - Zhao et al.
CATCAGCCGGACTCGTCAGT      55       66       0      9      130
AGATCGCATGTAAAGGAGTG      44       62       0      0      106
AAAGCAGGGTGTATCAGTCA      60       70       0      16     146
TACAGGCGCTAATTAGCTCC      40       69       0      41     150
GCGGACCCAACACATATGAG      51       61       0      0      112
ATCATCATTTCATGGGGCAA      41       74       4      16     135
GGGATCGACGTATATTAACG      47       64       0      0      111
Fitness Value             48.29    66.57    0.57   11.71  127.14
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41       64       0      0      105
CTACAGGAAGATAGAGTACA      45       61       0      0      106
CAGAACAGGAGGAGAGAGTC      31       69       0      0      100
AGAGGAAGAGAGAAGCATAA      36       72       0      0      108
GCAGAGAGCGTTCGTACCAG      57       54       0      0      111
AGAGAGATAGAGATACAGAT      35       64       0      0      99
GAGAGAGAGAGGAGAGAGGA      17       67       0      0      84
Fitness Value (1 run)     37.43    64.43    0.00   0.00   101.86
Average Value (100 runs)  42.557   62.476   0.417  0.122  105.571
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Figure 11. Results obtained by GuangZhou et al. [18] vs results obtained by the ACS with extended computational model. (Bar chart; series: PSO - GuangZhou et al. and Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
GuangZhou et al. [18] employed a PSO algorithm to
generate 20 DNA sequences, each with a 20-mer length. Table
17 presents these sequences in comparison with the sequences
generated by the ACS with extended computational model, and
Figure 11 shows that the ACS with extended computational
model outperforms the PSO. The ACS with extended
computational model produced a much lower H-measure value
and also outperformed the PSO model in terms of hairpin and
continuity. The PSO model produced a lower similarity value
than the ACS, but in general the ACS model outperformed the
PSO model because it produced a much lower total objective
value (328.55 against 374.7).
Table 18 shows the sequences generated using binary
particle swarm optimisation (BinPSO) [19] in comparison with
the sequences generated using the ACS with extended
computational model. Based on the plot shown in Figure 12,
the overall performance of the ACS with extended
computational model is better than that of the BinPSO
algorithm. Examining the value of each objective function, the
ACS with extended computational model outperformed the
BinPSO model in terms of the H-measure and continuity, but
the BinPSO model has better similarity and hairpin values.
TABLE XVII. SEQUENCES GENERATED BY GUANGZHOU ET AL. [18] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
Sequence                  Hm       Sm       Hr     C      Total
PSO - GuangZhou et al.
GTCAAATTCCCTCTATCGTC      208      155      0      18     381
AGCGATAGTAGATCACCTGA      211      144      0      0      355
CACGATATAGCTTCGAGCCG      212      158      0      0      370
AATACACCGCTCACCAAGGA      211      155      0      0      366
AACAGGGAAGAATGCAGAGG      211      146      0      9      366
CCTCTACCAGCCAATGATGC      203      158      0      0      361
TTAGGACTCGACGCCACTCC      204      153      0      0      357
CCATGACCGAGGATCCACGT      203      175      0      0      378
CGCCATTATCAGGCCTTTAC      215      157      0      9      381
ACACAGTGGACGCACATACA      209      167      0      0      376
TTATCCCGCCTCTTCTCCGT      213      165      0      9      387
AATACGGTTCAAGCGGCTTC      216      158      4      0      378
TAAAGGCGCGTGATCGGAAG      221      152      0      9      382
TTGTTCGGGATTGAGCAACT      230      148      5      9      392
GTCACTGAGTCAGCACTCAT      210      156      4      0      370
CCATAAACTGCCAGCTCGCG      207      165      0      9      381
CAACATAGAGTCAGGCGCTG      210      163      0      0      373
CCAATGAGTCACCTCGTTCG      217      164      9      0      390
GGGGTGGAGGCCCAACTATT      206      157      0      25     388
CAGCGGTCTGAACCTCCATA      210      152      0      0      362
Fitness Value             211.35   157.4    1.1    4.85   374.7
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGAAACGCGATTAGAGTA      143      213      0      0      356
AGAAGATAGATTACGAGACC      146      205      0      0      351
TAGAGAAGAGAGATACGAGC      122      215      0      0      337
GGAGAGAGAGACGAGAGTCC      126      232      0      0      358
CGTAGACGAGAGAGCGAGAC      124      209      0      0      333
CGGTAGATAGAGAGAGAGTA      123      244      0      0      367
CGAGAAGATACCGAAGATAG      135      194      0      0      329
AGGACGAGAGAGAGAGAAGG      88       223      0      0      311
TAGATAGATTCCGAGAAGAG      158      194      0      0      352
AGACGCGAGAGAGAGGAGAG      107      214      0      0      321
AGCGTACGGAGAGATACCGT      153      203      0      8      364
GAGAGAGAGAGAGAGAGAGT      67       25       0      0      92
CTAGAGAGACGAGAGAGAGT      124      227      0      0      351
CCGCGTAGAGGTCGAGAGCG      161      195      0      0      356
TAGGTACGAGACGAGGAAGG      122      192      0      0      314
TAGTATAGCGAGACGTAAAG      157      194      9      0      360
TGAAGAGAGAGGAGAGTAGA      103      239      0      0      342
CGGAGGAGACCATACGTATA      143      184      0      0      327
CGAGTAGCGGATACAAGACT      159      179      0      0      338
GAGAGGAGAAGATTAGAGAG      104      208      0      0      312
Fitness Value (1 run)     128.25   199.45   0.45   0.4    328.55
Average Value (100 runs)  145.908  200.800  0.888  0.216  347.812
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Figure 12. Results obtained by Khalid et al. [19] vs results obtained by the ACS with extended computational model. (Bar chart; series: BinPSO - Khalid et al. and Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
TABLE XVIII. SEQUENCES GENERATED BY KHALID ET AL. [19] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
Sequence                  Hm       Sm       Hr     C      Total
BinPSO - Khalid et al.
CGGTCACGCCTCTTGTATTG      63       42       0      0      105
ATCCGCGCCGCACGGTCATG      70       45       0      0      115
CCAAATACATTGACTCCCAA      72       46       18     0      136
TATTTGCTCGGAGACCGCGG      65       46       9      0      120
GCATTTGATTCAGCGTTCCA      66       43       9      0      118
GTTGGAATGGTGTAGCTGAG      66       45       0      0      111
GTCTGTGTACTCTTCCGTGG      63       48       0      0      111
Fitness Value             66.43    45.00    5.14   0.00   116.57
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41       64       0      0      105
CTACAGGAAGATAGAGTACA      45       61       0      0      106
CAGAACAGGAGGAGAGAGTC      31       69       0      0      100
AGAGGAAGAGAGAAGCATAA      36       72       0      0      108
GCAGAGAGCGTTCGTACCAG      57       54       0      0      111
AGAGAGATAGAGATACAGAT      35       64       0      0      99
GAGAGAGAGAGGAGAGAGGA      17       67       0      0      84
Fitness Value (1 run)     37.43    64.43    0.00   0.00   101.86
Average Value (100 runs)  42.557   62.476   0.417  0.122  105.571
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Next, the sequences generated by the ACS with extended
computational model are compared with those of the
population-based ant colony optimisation (P-ACO) [22]. The
sequences generated by P-ACO outperformed the ACS with
extended computational model in terms of similarity,
continuity, and hairpin. P-ACO is also superior in terms of
overall performance, with a total objective value of 93.00
against 101.86. The sequences and the overall results are
presented in Table 19 and Figure 13, respectively.
Finally, the ACS with extended computational model is
compared with the original ACS model [23]. The result of the
comparison is shown in Table 20 and Figure 14. On the
averages over 100 runs, the ACS with extended computational
model outperformed the original ACS model in all of the
objective functions except for the similarity measure, for
which the original ACS model performed slightly better (60.1
against 62.476). The overall performance of the ACS with
extended computational model exceeds that of the original
ACS model, with an average total objective value of 105.571
against 115.89.
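The benchmarking results can be summarised by comparing total objective values only. The sketch below collects the single-run "Fitness Value" totals reported in Tables XIII to XX and computes the relative change of the proposed model against each benchmark, where a negative value means the ACS with extended computational model has the lower (better) total. The dictionary `TOTALS` and the function `relative_change` are illustrative names.

```python
# (benchmark total, proposed ACS total), copied from Tables XIII to XX.
TOTALS = {
    "GA [21]": (127.14, 101.86),
    "SA [7]": (268.57, 228.71),
    "MOEA [16]": (250.0, 228.71),
    "MOPSO [17]": (127.14, 101.86),
    "PSO [18]": (374.7, 328.55),
    "BinPSO [19]": (116.57, 101.86),
    "P-ACO [22]": (93.00, 101.86),
    "ACS [23]": (102.71, 101.86),
}


def relative_change(benchmark: float, proposed: float) -> float:
    """Percentage change of the proposed total relative to the benchmark total."""
    return 100.0 * (proposed - benchmark) / benchmark


wins = [name for name, (bench, ours) in TOTALS.items() if ours < bench]

for name, (bench, ours) in TOTALS.items():
    print(f"{name:12s} {relative_change(bench, ours):+7.2f} %")
print("Proposed model has the lower total in", len(wins), "of", len(TOTALS), "comparisons")
```

On these totals, P-ACO is the only benchmark with a lower total objective value than the proposed model.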
TABLE XIX. SEQUENCES GENERATED BY KURNIAWAN ET AL. [22] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
Sequence                  Hm       Sm       Hr     C      Total
P-ACO - Kurniawan et al.
CCACCACCACCACCAATAAT      23       77       0      0      100
ACCTCACTCACTCACTCAAC      49       35       0      0      84
TAACAGAACAGAACAGGCCG      21       68       0      0      89
CACACACACACACACACACA      25       74       0      0      99
AATCTCTCTCTCTCTCTGCC      20       81       0      0      101
CGCCAGCCAGCCTATATATA      39       49       0      0      88
TTGCATTCCTTCCTTCCTGG      21       69       0      0      90
Fitness Value             28.29    64.71    0.00   0.00   93.00
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41       64       0      0      105
CTACAGGAAGATAGAGTACA      45       61       0      0      106
CAGAACAGGAGGAGAGAGTC      31       69       0      0      100
AGAGGAAGAGAGAAGCATAA      36       72       0      0      108
GCAGAGAGCGTTCGTACCAG      57       54       0      0      111
AGAGAGATAGAGATACAGAT      35       64       0      0      99
GAGAGAGAGAGGAGAGAGGA      17       67       0      0      84
Fitness Value (1 run)     37.43    64.43    0.00   0.00   101.86
Average Value (100 runs)  42.557   62.476   0.417  0.122  105.571
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
Figure 13. Results obtained by Kurniawan et al. [22] vs results obtained by the ACS with extended computational model. (Bar chart; series: P-ACO - Kurniawan et al. and Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
Figure 14. Results obtained by Yakop et al. [23] vs results obtained by the ACS with extended computational model. (Chart; series: Proposed ACS model and Yakop et al.; categories: H-measure, Similarity, Continuity, Hairpin, Total.)
TABLE XX. SEQUENCES GENERATED BY YAKOP ET AL. [23] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
Sequence                  Hm       Sm       Hr     C      Total
ACS - Yakop et al.
GGAGTGAGAGAGAGGAAGAG      25       73       0      0      98
AGAGAGAATGAGTTCAGATG      43       67       0      0      110
CGAGGAGATCCGCGATACCG      53       60       0      0      113
AGATGAGGAGCGCAGAGGCG      39       69       0      0      108
AGAGCGATGAGAAGAGAGAT      28       72       0      0      100
TGAGAGAGAGATGAGAGAGT      23       74       0      0      97
AAGAGAAGAGAGAGAGAGAG      22       71       0      0      93
Fitness Value (1 run)     33.29    69.43    0.00   0.00   102.71
Average Value (100 runs)  51.89    60.1     3.25   0.65   115.89
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41       64       0      0      105
CTACAGGAAGATAGAGTACA      45       61       0      0      106
CAGAACAGGAGGAGAGAGTC      31       69       0      0      100
AGAGGAAGAGAGAAGCATAA      36       72       0      0      108
GCAGAGAGCGTTCGTACCAG      57       54       0      0      111
AGAGAGATAGAGATACAGAT      35       64       0      0      99
GAGAGAGAGAGGAGAGAGGA      17       67       0      0      84
Fitness Value (1 run)     37.43    64.43    0.00   0.00   101.86
Average Value (100 runs)  42.557   62.476   0.417  0.122  105.571
Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.
VII. CONCLUSION
This paper revisits an improved ACS algorithm for solving
the DNA sequence design problem in which the dependence
between the number of ants and the number of sequences
produced is eliminated. Unlike the original computational
model, this model allows the number of ants to be flexible in
order to enhance the possibility of finding an optimum DNA
sequence design. The total objective value decreased as more
ants were used in the system. However, improvement of the
total objective stagnates when the number of ants is larger than
25. Based on this finding, it can be concluded that, for the
application of the ACS algorithm in DNA sequence
design, a suitable number of ants for finding a feasible
solution is 30. The benchmarking of the results obtained using
the ACS algorithm against other algorithms suggests that the
ACS algorithm is superior in most cases and is a good method
for DNA sequence design.
REFERENCES
[1] L. Adleman, “Molecular Computation of Solutions to Combinatorial
Problems,” Science, vol. 266, pp. 1021-1024, 1994.
[2] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant System: Optimization
by a Colony of Cooperating Agents,” IEEE Transactions on Systems,
Man, and Cybernetics-Part B, vol. 26, no. 1, pp. 29-41, 1996.
[3] M. Dorigo and M. Gambardella, “Ant Colony System: A Cooperative
Learning Approach to the Traveling Salesman Problem,” IEEE
Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 53-66,
1997.
[4] Z. Ibrahim, T.B. Kurniawan, N.K. Khalid, S. Sudin, and M. Khalid,
“Implementation of an Ant Colony System for DNA Sequence
Optimization,” Artificial Life and Robotics, vol. 14, issue 2, pp. 293-296, 2009.
[5] T.B. Kurniawan, N.K. Khalid, Z. Ibrahim, M. Khalid, and M.
Middendorf, “Evaluation of Ordering Methods for DNA Sequence
Design Based on Ant Colony System,” Second Asia International
Conference on Modeling & Simulation, 2008, pp. 905-910.
[6] S.M. Mustaza, A.F. Zainal Abidin, Z. Ibrahim, S. Buyamin, M.A.
Shamsudin, A.R. Husain, J.A. Ahmed Mukred, “A Modified
Computational Model of Ant Colony System in DNA Sequences
Design,” Proceedings of IEEE Student Conference on Research and
Development, 2011, pp. 19-20.
[7] F. Tanaka, M. Nakatsugawa, M. Yamamoto, T. Shiba, and A. Ohuchi,
“Developing Support System for Sequence Design in DNA
Computing,” Proceedings of The 7th International Workshop on DNA
Based Computers, 2001, pp. 340-349.
[8] S. Hussini, L. Kari, and S. Konstantinidis, “Coding Properties of DNA
Languages,” Theoretical Computer Science, vol. 290, pp. 1557-1579,
2003.
[9] M.H. Garzon and R.J. Deaton, “Codeword Design and Information
Encoding in DNA Ensembles,” Natural Computing, vol. 3, pp. 253-292, 2004.
[10] A.J. Hartemink, D.K. Gifford, and J. Khodor, “Automated Constraint-based
Nucleotide Sequence Selection for DNA Computation,”
Proceedings of the 4th DIMACS Workshop DNA Based Computer,
1998, pp. 227-235.
[11] M. Arita, and S. Kobayashi, “DNA Sequence Design using
Templates,” New Generation Computing, vol. 20, pp. 263-277, 2002.
[12] U. Feldkamp, S. Saghafi, W. Banzhaf, and H. Rauhe, “DNA Sequence
Generator – A Program for the Construction of DNA Sequences,”
Proceedings of the 7th Int. Workshop DNA Based Computer, 2001, pp.
179-188.
[13] A. Marathe, A.E. Condon, and R.M. Corn, “On Combinatorial DNA
Word Design,” Proceedings of the 5th International Meeting on DNA
Based Computers, 1999.
[14] B.T. Zhang and S.Y. Shin, “Molecular Algorithms for Efficient and
Reliable DNA Computing,” Proceeding of Genetic Programming,
1998, pp. 735-742.
[15] R. Deaton, J. Chen, H. Bi, M. Garzon, H. Rubin, and D.H. Wood, “A
PCR-based Protocol for in vitro Selection of Non-crosshybridizing
Oligonucleotides,” Proceedings of 8th International Workshop on
DNA Based Computers, 2002, pp. 196-204.
[16] S.Y. Shin, I.H. Lee, D. Kim, and B.T. Zhang, “Multiobjective
Evolutionary Optimization of DNA Sequences for Reliable DNA
Computing,” IEEE Transaction on Evolutionary Computation, vol. 9,
no. 2, pp. 143-158, 2005.
[17] S. Zhou, Q. Zhang, J. Zhao, and J. Li, “DNA Encodings Based on
Multi-Objective Particle Swarm,” Journal of Computational and
Theoretical Nanoscience, vol. 4, pp. 1249-1252, 2007.
[18] G. Cui, Y. Niu, Y. Wang, X. Zhang, and L. Pan, “A New Approach
based on PSO Algorithms to Find Good Computational Encoding
Sequences,” Progress in Natural Science, vol. 17, no. 6, pp. 712-716,
2007.
[19] N.K. Khalid, Z. Ibrahim, T. Kurniawan, M. Khalid, and A.
Engelbrecht, “Implementation of Binary Particle Swarm Optimization
for DNA Sequence Design,” Distributed Computing, Artificial
Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted
Living, vol. 5518 of the series Lecture Notes in Computer Science, pp.
450-457.
[20] S.-Y. Shin, Multi-objective Evolutionary Optimization of DNA
Sequences for Molecular Computing, PhD Thesis, Seoul National
University, 2005.
[21] R. Deaton, M. Garzon, R.C. Murphy, and J.A. Rose, “Genetic Search
of Reliable Encodings for DNA-based Computation,” First Conference
on Genetic Programming, 1996.
[22] T.B. Kurniawan, Z. Ibrahim, N.K. Khalid, and M. Khalid, “A
Population-based Ant Colony Optimization Approach for DNA
Sequence Optimization,” Third Asia International Conference on
Modelling and Simulation, 2009, pp. 246-251.
[23] F. Yakop, Z. Ibrahim, A.F. Zainal Abidin, Z.M. Yusof, M.S.
Mohamad, K. Wan, and J. Watada, “An Ant Colony System for
Solving DNA Sequence Design Problem in DNA Computing,”
International Journal of Innovative Computing, Information and
Control, vol. 8, no. 10, pp. 7329-7339, 2012.