Z IBRAHIM et al.: BENCHMARKING OF AN EXTENDED COMPUTATIONAL MODEL OF ANT COLONY
DOI 10.5013/IJSSST.a.15.06.06. ISSN: 1473-804x online, 1473-8031 print

Performance Evaluation and Benchmarking of an Extended Computational Model of Ant Colony System for DNA Sequence Design

Zuwairie Ibrahim and Mohd Falfazli Mat Jusof
Faculty of Electrical and Electronic Engineering
Universiti Malaysia Pahang
26600 Pekan, Malaysia
zuwairie@ump.edu.my

Abstract- The ant colony system (ACS) algorithm is one of the biologically inspired algorithms that have been introduced to effectively solve a variety of combinatorial optimisation problems. In the literature, ACS has been employed to solve the DNA sequence design problem. The DNA sequence design problem was modelled based on a finite state machine in which the nodes represent the DNA bases {A, C, T, G}. Later, in 2011, an extended computational model of the finite state machine was employed for DNA sequence design using ACS. The performance evaluation, however, was limited. In this study, the extended computational model of the finite state machine is revisited and an extensive performance evaluation is conducted using 5, 7, 10, 15, 20, 25, 30, 35, and 40 agents/ants, each with 100 independent runs. The performance of the extended computational model is also benchmarked against existing algorithms such as a Genetic Algorithm (GA), Multi-Objective Evolutionary Algorithm (MOEA), and Particle Swarm Optimisation (PSO).

Keywords- Ant Colony System, DNA Sequence Design, Finite State Machine.

I. INTRODUCTION

DNA computing is an interdisciplinary research area that uses DNA molecules to solve computational problems. Adleman initiated the field of DNA computing in 1994 [1], when he discovered a method for solving a hard combinatorial problem using DNA. Adleman used the method of manipulating DNA to solve a seven-node Hamiltonian Path Problem (HPP). The goal of Adleman's experiment was to determine the existence of a path that starts at the initial city, finishes at the end city and passes through each of the remaining cities exactly once.

Based on Adleman's success, researchers around the world are working to exploit the extremely dense information storage and massive parallelism of DNA, hoping to one day produce a DNA computer that outperforms conventional electronic computers. The reliability of DNA computation is highly dependent on the information represented on the DNA strand and on the strand reaction. However, because of technological difficulties and the chemical characteristics of the molecules, DNA reactions can result in inaccuracies in the computation. One of the main approaches to overcoming illegal reactions, and consequently to removing the potential error due to the biochemical reaction in advance, is to design a good set of independent DNA sequences. An independent DNA sequence set means a set of DNA sequences that have a minimal tendency for cross-hybridisation and a maximal difference among them. In addition, they must have similar physical conditions, such as the length and melting temperature. By removing the error beforehand, no DNA is wasted because of an illegal reaction, and the reliability of the computation is improved, which consequently ensures a high computational accuracy.

Ant colony optimisation (ACO) [2] is one of the biologically inspired algorithms that have been introduced to effectively solve various combinatorial optimisation problems. ACO was first introduced in 1992 by Marco Dorigo. It was derived from observations of the behaviour of real ants. The main idea behind the algorithm is that the self-organising and highly coordinated behaviour of ants can be exploited to solve complex computational problems. The ant colony system (ACS) [3] is an extension of ACO. In the literature, ACS has been employed for solving the DNA sequence design problem [4-5]. To model the DNA sequence design problem as a path-finding problem, a simple model similar to a finite state machine of four nodes has been employed. In that model, the number of ants required equals the number of DNA sequences. This computational model, however, limits the ability of the ACS algorithm to find a good solution. Hence, there is a need to expand the computational model in order to fully utilise the stochastic search of ACS. An extended computational model has been developed [6]. To improve the flexibility in terms of the number of ants, every ant represents a set of DNA sequences in the extended computational model. In this study, an extensive performance evaluation of the extended computational model is presented, and the performance of the extended computational model is also benchmarked against existing algorithms that have been employed in DNA sequence design.

II. DNA SEQUENCE DESIGN APPROACHES

A good set of DNA sequences, in which each sequence is unique but forms a stable duplex with its complement, is highly desirable to improve the reliability and accuracy of DNA computing. To achieve this goal, several objective functions and constraints have been employed in DNA sequence design. An ideal method should include all of the desired design criteria. These criteria can be classified into three categories [7] as follows:

• Prevent mismatch hybridisation - Mismatch hybridisation is considered to be an illegal reaction in DNA sequence design, and it will reduce the reliability and efficiency of the DNA computation. Satisfying this first criterion forces the set of sequences to form duplexes only between a given DNA sequence and its complement and, consequently, improves the computational accuracy. The objective functions that fall under this category are the similarity, H-measure, distance reverse complement, and distance reverse Hamming.

• Prevent the formation of a secondary structure - Types of possible secondary structures include the internal loop, hairpin loop and bulge loop, which are usually formed by the interaction of single-stranded DNA or RNA. The formation of a secondary structure must be avoided because it lowers the efficiency of the computational system. A known cause of secondary structure is the continuous occurrence of the same base, which makes a strand twist or bend. The objective functions that are used when addressing the formation of secondary structures are the hairpin, continuity and forbidden subsequences.

• Maintain uniform chemical characteristics - In many cases, it is favourable to use DNA sequences that possess similar chemical characteristics, so that the system behaves uniformly. Some of the constraints that have previously been used to measure this criterion are the free energy, melting temperature (Tm) and GC content. GC content is the percentage of guanine or cytosine in a whole DNA sequence. The Tm is defined as the temperature at which half of the double-stranded DNA has dissociated into its single-stranded form. The free energy is the energy that is necessary to make a duplex.

Over the years, many advanced algorithms have emerged for DNA sequence design. Various methods and approaches [8-19] have been used to derive these algorithms and ultimately to produce a good DNA sequence design.

III. FORMULATION OF OBJECTIVE FUNCTIONS AND CONSTRAINTS IN THE DNA SEQUENCE DESIGN

To achieve a good set of DNA sequences, all of the design criteria must be included. However, because some criteria overlap with each other, a total of four objective functions and two constraints have been selected for this study. DNA sequence design, as a constrained multi-objective optimisation problem, consists of optimising multiple objectives while adhering to several constraints that must be satisfied. For DNA sequence design, the outlined objective functions are to be minimised while the required constraints are satisfied. The design problem is simplified and converted into a single-objective optimisation problem using the weighted summation method. Therefore, the new objective function is the weighted sum of the objective functions, as formulated in Eq. (1):

\[ \min f_{DNA} = \sum_{i} \omega_i f_i \tag{1} \]

subject to the Tm and GCcontent constraints, where f_i is the objective function for each i ∈ {Hmeasure, similarity, hairpin, continuity}, and ω_i is the weight for each f_i. For simplicity, each weight is set to 1. A brief description of the objective functions and constraints follows. A detailed description can be found in [20].

A. H-measure
H-measure is a measure of the possibility of unintended DNA hybridisation based on the Hamming distance. In DNA sequence design, the Hamming distance is used to describe the degree of non-similarity between two DNA sequences. The greater the Hamming distance, the fewer complementary base pairs the two sequences share and the less likely mismatch hybridisation becomes. The H-measure calculation is illustrated in Figure 1.

Figure 1. Illustration of H-measure calculation [20].

B. Similarity
Similarity computes the similarity of two given sequences in the same direction, including position shifts, to keep each sequence as unique as possible. The similarity calculation is illustrated in Figure 2.
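As a small illustration of the weighted summation in Eq. (1), the following Python sketch combines four objective values with unit weights, and also shows the plain Hamming distance on which the H-measure is based. The function names `hamming` and `weighted_sum` are hypothetical helpers introduced here. The full H-measure and similarity definitions (with shifts and thresholds) are given in [20] and are not reproduced. The numeric example uses the first row of Table IV.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def weighted_sum(objectives: dict, weights: dict = None) -> float:
    """Eq. (1): f_DNA = sum_i w_i * f_i, with every weight defaulting to 1."""
    if weights is None:
        weights = {name: 1.0 for name in objectives}
    return sum(weights[name] * value for name, value in objectives.items())


# Objective values of sequence 1 in Table IV (5 ants).
row = {"Hmeasure": 40, "similarity": 70, "hairpin": 0, "continuity": 0}
print(weighted_sum(row))         # 110.0, the Total reported for that row
print(hamming("ACGT", "ACGA"))   # 1
```

With all weights equal to 1, the weighted sum reduces to the plain row total, which is how the Total columns of the result tables can be read.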
Figure 2. Illustration of similarity measure calculation [20].

C. Continuity
Continuity assesses the degree of successive occurrence of the same DNA base. Continuous occurrence of the same base makes the DNA structure unstable, which in turn makes the reaction harder to manage and the experiment less controllable.

D. Hairpin
The hairpin objective function computes the probability that a single-stranded DNA forms a secondary structure, specifically a hairpin. A hairpin arises from the self-hybridisation of a single-stranded DNA, which causes the formation of a loop. The hairpin objective function considers the length of the hairpin loop, denoted r (ring), and the number of hybridised pairs, denoted p (pair). Figure 3 shows the possible formation of a hairpin based on the values of p and r.

Figure 3. Illustration of hairpin calculation [20].

E. GCcontent Constraint
The GCcontent is the percentage of G and C bases in a DNA sequence. It is an important parameter because the content of G and C in a strand of DNA affects the chemical properties of the sequence. It can be calculated using Eq. (2):

\[ GC_{content} = \frac{yG + zC}{wA + xT + yG + zC} \tag{2} \]

where wA, xT, yG, and zC are the numbers of A, T, G, and C bases in a sequence, respectively.

F. Melting Temperature Constraint
The melting temperature, Tm, is defined as the temperature at which half of the double-stranded DNA has dissociated into its single-stranded form. There are numerous equations that can be used to calculate Tm. In this study, the nearest-neighbour formulation in Eq. (3) is used:

\[ T_m(x) = \frac{\Delta H}{\Delta S + R \ln C_T} + 16.6 \log [Na^{+}] \tag{3} \]

where ΔH and ΔS are the enthalpy and entropy changes of the annealing reaction, respectively. R, C_T, and Na+ denote the gas constant, the total oligonucleotide strand concentration, and the salt concentration for salt adjustment, respectively.

IV. ANT COLONY SYSTEM

The ACS algorithm [3] comprises two main phases: the first phase is the construction of the solution, and the second phase is the pheromone update. In this algorithm, an ant moves from one node to another according to the state transition rule to incrementally construct the desired solution. During the construction, the ant releases pheromone at each step, which later influences the movement of the following ants, and the pheromone concentration is updated using the local pheromone updating rule. This process continues until all of the ants have completed their routes. Finally, the solution of the globally best ant is reinforced by allowing that ant to deposit pheromone.

A. State Transition Rule
The ACS state transition rule governs the movement of ants in the system when constructing the desired sequence. This rule is also known as the pseudo-random-proportional rule. Under this rule, ants prefer to move toward nodes that are connected by short edges and that have the highest concentration of pheromone. The rule was developed to explicitly balance the exploration of new edges against the exploitation of a priori knowledge, and this balance is determined by the parameter q0. The transition rule is probabilistic. For an ant k on node r, the selection of the next node s depends on a random variable q and on q0, and is given by the transition probability shown in Eq. (4):

\[ p_k(r,s)_{ACS} = \begin{cases} \arg\max_{u \in J_k(r)} \left\{ [\tau(r,u)] \cdot [\eta(r,u)]^{\beta} \right\} & \text{if } q \le q_0 \\ S & \text{otherwise} \end{cases} \tag{4} \]

where q is a uniformly distributed random variable in [0, 1], q0 is between 0 and 1, and S is a variable that is randomly selected according to the probability distribution given by Eq. (5):

\[ S = \frac{[\tau(r,s)] \cdot [\eta(r,s)]^{\beta}}{\sum_{u \in J_k(r)} [\tau(r,u)] \cdot [\eta(r,u)]^{\beta}} \tag{5} \]

J_k(r) is the set of feasible components, in other words, the set of edges (r, s), where r is the current node and s is a next node that has not yet been visited by the k-th ant. Additionally, (r, u) represents the other edges, where u ranges over all of the nodes that have not yet been visited by the k-th ant. The parameter β (β ≥ 0) controls the relative importance of the pheromone versus the problem-dependent heuristic information η(r, s), which is given by:

\[ \eta(r,s) = \frac{1}{d_{rs}} \tag{6} \]

and τ(r, s) is the pheromone trail associated with the edge that joins nodes r and s. The parameter d_rs is the distance between nodes r and s. The quantity of pheromone represents the past experience of the colony with respect to choosing which path to take. If q ≤ q0, then the best edge, as described in Eq. (4), is chosen (exploitation); otherwise, an edge is chosen according to Eq. (5) (biased exploration). Therefore, the smaller the value of q0, the fewer best edges are exploited and the more the algorithm emphasises exploration.

B. Global Update Rule
The global update rule encourages the ants to search in the neighbourhood of the best solution found thus far, which makes the search for the optimal solution more directed. This rule is applied after all of the ants have constructed a solution, and only the edges that belong to the globally best tour receive reinforcement. Therefore, only the globally best ant, which found the best solution up to the current iteration of the algorithm, is permitted to deposit pheromone. This strategy favours exploitation, and the pheromone level is updated by applying the global updating rule according to Eq. (7):

\[ \tau(r,s)_{t+1} = (1 - \rho) \cdot \tau(r,s)_{t} + \Delta\tau(r,s) \tag{7} \]

where ρ (0 ≤ ρ ≤ 1) is the evaporation rate, and Δτ(r, s) is the quantity of pheromone laid on edge (r, s) by ant k at iteration t, which is given by:

\[ \Delta\tau(r,s) = \begin{cases} \dfrac{1}{L_k} & \text{if } (r,s) \in \text{tour done by ant } k \\ 0 & \text{otherwise} \end{cases} \tag{8} \]

where L_k is the length of the globally best tour from the beginning of the trial, constructed by ant k. The value of ρ is usually fixed. However, it can be changed dynamically according to the user's preferences with respect to exploration and exploitation. For small values of ρ, the pheromone on the edges evaporates slowly and the algorithm favours exploitation. For large values of ρ, the pheromone deposited by the ants evaporates quickly and exploration is emphasised.

C. Local Update Rule
In constructing the solution, each ant deposits pheromone on each of the edges that it has visited. The change in the pheromone level on each of the edges is given by Eq. (9):

\[ \tau(r,s)_{t+1} = (1 - \zeta) \cdot \tau(r,s)_{t} + \tau_0 \tag{9} \]

where ζ ∈ [0, 1] is the pheromone decay coefficient, and τ0 is the initial value of the pheromone. The local pheromone update is applied by all of the ants after each construction step. The main goal of the local update is to diversify the search by decreasing the pheromone concentration on the traversed edges, which encourages subsequent ants to choose other routes and produce different solutions. This arrangement prevents several ants from producing identical solutions during an iteration.

V. ANT COLONY SYSTEM FOR DNA SEQUENCE DESIGN BASED ON THE EXTENDED COMPUTATIONAL MODEL

A model similar to a finite state machine is utilised in solving the DNA sequence design problem [4-5]. Each node of the finite state machine represents one of the four DNA bases A, T, C and G. Every node is connected to every other node and to itself, as illustrated in Figure 4. Figure 5 depicts the movement of an ant from one node to another in constructing a DNA sequence. The initial position of an ant is random. The path of an ant from one node to another forms the DNA sequence. For example, the movement from node A to node T (in Figure 5) forms a path that can be translated to the DNA sequence 'AT'. If the ant then moves to node C, the DNA sequence formed is 'ATC'. This process continues until the required number of bases has been reached.

Figure 4. Finite state machine used in DNA sequence design.

Figure 5. Sequence generation.

D. Initialization
The initialisation of the ACS for DNA sequence design comprises the initialisation of the ACS parameters, the initialisation of the DNA sequence parameters and the initial placement of the ants. The initial pheromone concentration is obtained using the following equation:

\[ \tau_0 = \frac{1}{Q \cdot n} \tag{10} \]

where Q is the sum of the objectives calculated for a set of randomly generated DNA sequences, and n is the number of sequences.

E. Construction Process
After the initialisation process, the ants move from one node to another to incrementally construct the DNA sequences, based on the state transition rule. The formulation has been modified: the heuristic information is omitted from the calculation, and only the pheromone information is used in deciding the next node, as shown in Eq. (11):

\[ p_k(r,s)_{ACS} = \begin{cases} \arg\max_{u \in J_k(r)} \left\{ [\tau(r,u)] \right\} & \text{if } q \le q_0 \\ S & \text{otherwise} \end{cases} \tag{11} \]

Usually, the transition probability balances the pheromone intensity, τ(r, s), which is the history of previously successful moves, against the heuristic information, η(r, s), which expresses the desirability of the move. This approach effectively balances the exploration-exploitation trade-off. Because the DNA sequence design problem offers no information that can be directly used as heuristic information, this model uses only pheromone information in the computations. After each construction step, the local pheromone updating rule is applied, and this is repeated until a complete solution is constructed. The next step is to check the two constraints, Tm and GCcontent, of the solution. Only after the values of the two constraints are within a specified range does the construction of the solution for the next ant start. This process continues until all of the ants have constructed a solution with satisfactory constraint values. The total objective value is then calculated for every solution constructed by each ant. The solution with the minimum objective value is stored as the best-found solution. Then, the decision of the best ant is reinforced by applying the global updating rule. The iterations proceed until the specified number of iterations has been reached.

F. Extended Computational Model
Ants exhibit cooperative behaviour by contributing and sharing knowledge about the paths that have been taken by other ants, through the deposition of pheromone. An increase in the number of ants improves the exploration and exploitation ability of the algorithm because it enriches the information about the available routes and increases the chances of finding a new optimal route. A small number of ants can result in a sub-optimal solution, while too many ants result in a significantly higher computational time, which makes the approach impractical. Therefore, the number of ants used can affect the overall performance of the system. Based on the literature on applications of the ACS algorithm, the number of ants used is mostly problem-dependent. Different types of problems and models require different numbers of ants. Thus far, there are no suggestions as to the number of ants for the DNA sequence design problem.

In the extended computational model, each ant represents a complete solution of the DNA sequence design problem by producing the required number of DNA sequences, for example, 7 DNA sequences of 20-mer length each. This can be expressed as in Eq. (12):

\[ x(k)_t = [p_1, p_2, \ldots, p_n] \tag{12} \]

where x(k)_t is ant k at iteration t. Here, p_n is DNA sequence n, which is represented as

\[ p_n = [b_1, b_2, \ldots, b_z], \quad b_z \in \{A, C, G, T\} \]

where b_z is the z-th base of the n-th DNA sequence. Under this representation, the number of ants is completely independent of the number of sequences produced. Each ant produces a solution based on the number of required bases. Table 1 summarises the ACS algorithm with the extended computational model. The parameters are initialised to the values that are tabulated in Table 2.

TABLE I. ACS ALGORITHM WITH EXTENDED COMPUTATIONAL MODEL.

Initialise the parameters in Table 2
Loop /* each loop is called an iteration */
    Loop /* each loop is called an ant */
        Each ant is positioned randomly on the start node
        Loop
            Each ant applies the state transition rule to construct a solution
            The local pheromone updating rule is applied
        Until the ant has built a complete solution
        If the GCcontent and Tm constraints are passed for all DNA sequences then
            Proceed with the next ant
        Else
            Repeat the DNA sequence generation using the current ant
        End if
    Until all ants have built a complete solution
    For each solution do
        Calculate the objective functions
    Next
    If the objective functions are better than the previous best then
        Store the sequences as the best-found sequences
    End if
    The global pheromone updating rule is applied to the best solution
Until the stopping condition is met
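The loop summarised in Table I can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cost function `placeholder_cost` is a stand-in (the real objectives are defined in [20]), the Tm constraint is omitted, and the values of `tau0`, `gc_range` and `max_tries` are hypothetical. The pheromone updates follow Eqs. (7) and (9) as printed, and the defaults for q0, ζ, ρ and the iteration count are taken from Table II.

```python
import random

BASES = "ACGT"  # the four nodes of the finite state machine


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (GC content as a ratio)."""
    return sum(base in "GC" for base in seq) / len(seq)


def placeholder_cost(seqs):
    """Stand-in objective (assumption): 1 + count of adjacent identical bases.
    It is NOT the H-measure/similarity/hairpin/continuity sum of the paper."""
    return 1 + sum(a == b for s in seqs for a, b in zip(s, s[1:]))


def choose_next(tau, r, q0, rng):
    """Pheromone-only state transition rule, in the spirit of Eq. (11)."""
    if rng.random() <= q0:                       # exploitation
        return max(BASES, key=lambda u: tau[r][u])
    weights = [tau[r][u] for u in BASES]         # biased exploration
    return rng.choices(BASES, weights=weights)[0]


def build_sequence(tau, length, q0, zeta, tau0, rng):
    """One walk over the four-node graph with the local update of Eq. (9)."""
    r = rng.choice(BASES)                        # random start node
    seq = [r]
    while len(seq) < length:
        s = choose_next(tau, r, q0, rng)
        tau[r][s] = (1 - zeta) * tau[r][s] + tau0
        seq.append(s)
        r = s
    return "".join(seq)


def run_acs(n_ants=5, n_seqs=7, length=20, iterations=300, q0=0.5, zeta=0.05,
            rho=0.1, tau0=0.01, gc_range=(0.3, 0.7), max_tries=50, seed=0):
    rng = random.Random(seed)
    tau = {r: {s: tau0 for s in BASES} for r in BASES}
    best_seqs, best_cost = None, float("inf")
    for _ in range(iterations):
        for _ in range(n_ants):
            # Extended model: each ant builds a whole set of n_seqs sequences.
            # Simplification: after max_tries failed constraint checks the
            # last attempt is kept, so that the sketch always terminates.
            for _ in range(max_tries):
                seqs = [build_sequence(tau, length, q0, zeta, tau0, rng)
                        for _ in range(n_seqs)]
                lo, hi = gc_range
                if all(lo <= gc_fraction(s) <= hi for s in seqs):
                    break
            cost = placeholder_cost(seqs)
            if cost < best_cost:
                best_seqs, best_cost = seqs, cost
        # Global update, Eqs. (7)-(8), once per edge used by the best solution.
        edges = {(r, u) for s in best_seqs for r, u in zip(s, s[1:])}
        for r, u in edges:
            tau[r][u] = (1 - rho) * tau[r][u] + 1.0 / best_cost
    return best_seqs, best_cost
```

A call such as `run_acs(n_ants=10, iterations=50)` returns the best set of sequences found and its placeholder cost. Because each ant carries a full set of sequences, `n_ants` can be varied independently of `n_seqs`, which is the point of the extended model.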
TABLE II. THE ACS PARAMETERS.

Parameter                              Value
q0                                     0.5
β                                      0
ζ                                      0.05
ρ                                      0.1
Maximum number of iteration (tmax)     300

TABLE III. PERFORMANCE BASED ON THE NUMBER OF ANTS.

No. of Ants   Average   Std Dev.   Min      Max
5             109.045   3.298      103.10   117.98
7             108.180   3.266      102.57   117.29
10            106.685   2.158      102.91   113.29
15            105.995   2.179      102.14   112.86
20            105.542   2.352      101.29   111.71
25            105.519   2.026      100.88   110.43
30            105.527   1.998      101.00   110.14
35            105.503   1.868      101.88   109.86
40            105.483   1.973      99.57    109.29

TABLE IV. BEST RESULT USING 5 ANTS

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TCTCGTCTCTTCGCGCTCCT   20   60   48.38  40      70      0      0      110
2   ATCTCTCTCTCTGTCCTCTT   20   45   39.37  29      69      0      0      98
3   CTTTCTCTTCTCTCTCTCGC   20   50   40.12  26      66      9      0      101
4   TCATCTTCGCTCTCATCGCA   20   45   41.40  57      58      0      0      115
5   CCGTATCGTCTCTATCCTCT   20   50   40.30  42      66      0      0      108
6   ACTCTGTTCCTCATGTACTT   20   40   38.62  41      57      0      0      98
7   TATCTGCTCCTCTCTCCCGA   20   55   44.31  40      67      9      0      116
Total                                       275     453     18     0      746
Fitness Value (1 run)                       39.286  64.714  2.571  0.000  106.571
Average Value (100 runs)                    46.494  62.164  0.287  0.100  109.045

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

VI. RESULTS AND DISCUSSION

In this study, nine sets of experiments were conducted using 5, 7, 10, 15, 20, 25, 30, 35 and 40 ants, each with 100 independent runs. The number of iterations is set to 300 for each set. Figure 6 shows the overall average of the objective functions for each set, while Table 3 tabulates the overall average of the total objective value as well as the standard deviation, minimum and maximum value for each set. The minimum value reflects the best result obtained for each ant model. Table 4 to Table 11 show the best-found sequences for each number of ants used in this study. The best result is the solution that has the smallest total objective value.

For benchmarking, the ACS with extended computational model has been compared with other approaches based on an analysis of sequences generated using various algorithms, as shown in Table 12. To ensure consistency of the fitness measure and of the constraints imposed on the sequences, the sequences published by the authors were input into the same system that was developed to regenerate the objective function values.

Table 13 and Figure 7 show a comparison between the results of the proposed ACS model and the results obtained using a Genetic Algorithm [21]. The GA was used to generate 7 DNA sequences of 20-mer length. The ACS model outperformed the GA model for all four of the objective functions. A comparison was also made with the simulated annealing (SA) approach in [7], which generated 14 sequences of 20-mer length, as shown in Table 14 and Figure 8. The sequences generated using the ACS with extended computational model have a smaller total objective value than the sequences generated using SA, which indicates that the performance of the ACS model is superior to that of the SA model. The sequences generated using the ACS with extended computational model have much lower values for Hmeasure, similarity and hairpin, but a slightly higher continuity value, than the sequences generated using SA.

Figure 6. Plot of the average of the total objective value of each ant model.

TABLE V.
BEST RESULT USING 10 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TGAGTTGAGTAGTGGGAGGA   20   50   42.46  34      65      9      0      108
2   GTGTGGGTTGTGTTGTGTGA   20   50   44.05  16      78      0      0      103
3   TGTCTGTACGTGTAGTGTCG   20   50   42.10  42      56      0      0      98
4   AGCGTGTTCGTAGTGAGTGC   20   55   45.75  45      63      0      0      108
5   CTTGTCTGTGGTAAGTTGTG   20   45   39.26  37      68      0      0      105
6   TGCGTGCGTGTTGTGTGTTG   20   55   47.87  30      73      0      0      103
7   ATGTGGTGTGGTTGCGCGTT   20   55   49.10  37      60      0      0      97
Total                                       241     463     9      0      722
Fitness Value (1 run)                       34.429  66.143  1.286  0.000  103.143
Average Value (100 runs)                    44.292  62.054  0.268  0.071  106.685

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE VI. BEST RESULT USING 15 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   GTACAGGAAGAGAGGTTACA   20   45   38.69  39      61      0      0      100
2   CAGAACAGAGAGAGAGAGAG   20   50   38.84  29      68      0      0      97
3   TTCCAGTAGCATAGACATAA   20   35   35.92  60      55      0      0      115
4   GCAGAAGCAGTTCGTACCACA  20   55   45.51  54      54      0      0      108
5   GAGAGATAGAGTAGAGATAG   20   40   32.50  38      67      0      0      105
6   CAGAGAGAGGAGAGAGGACG   20   60   43.26  23      68      0      0      91
7   AGACAGGAGAGGAGAGAGAC   20   55   42.27  29      70      0      0      99
Total                                       272     443     0      0      715
Fitness Value (1 run)                       38.857  63.286  0      0      102.143
Average Value (100 runs)                    43.221  62.471  0.203  0.100  105.995

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE VII.
BEST RESULT USING 20 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   CGAGAGGAGAGAGACGAGAC   20   60   43.54  30      66      0      0      96
2   TAGTGAGAGAGAAGAGCTTG   20   45   38.77  43      69      0      0      112
3   CGTGAGAGAGAGAGAGTGAG   20   55   41.34  28      65      0      0      93
4   CGAGAGAGTGTAGAGAGAAC   20   50   39.19  40      65      0      0      105
5   CGTGAGAAGACGAGAGACGC   20   60   45.49  33      60      0      0      93
6   AGAGCGAGAGAGTGAAGAGT   20   50   42.82  40      68      0      0      108
7   TAGAGATAGAGTAGGATTGT   20   34   33.99  40      62      0      0      102
Total                                       254     455     0      0      709
Fitness Value (1 run)                       36.286  65.000  0.000  0.000  101.286
Average Value (100 runs)                    41.988  62.759  0.589  0.182  105.542

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE VIII. BEST RESULT USING 25 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   GAGACGAGAGAAGAGAGCGA   20   55   43.49  28      70      0      0      98
2   TAGAGTAAGGTAGAGAGAAC   20   40   34.97  38      64      0      0      102
3   TAGAGAAGGAAGTACCACCA   20   45   39.9   39      63      0      0      102
4   GAGGAGAGAGAGAGCAGAAG   20   55   41.39  17      71      0      0      88
5   ATAGCCAAGAGAGTAGCAGA   20   45   40.41  41      65      0      0      106
6   CCGAAGAGATAGAAGAGACT   20   45   38.25  32      68      0      0      100
7   TAGAGCAGTAGAAGTTAGAG   20   40   35.67  45      61      0      0      106
Total                                       240     462     0      0      702
Fitness Value (1 run)                       34.286  66.000  0.000  0.000  100.286
Average Value (100 runs)                    43.538  61.569  0.261  0.070  105.519

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE IX. BEST RESULT USING 30 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   TGTTGTGAGTTGGTGTTGGT   20   45   43.12  23      65      0      0      88
2   GGTGTGTGTAGTGTCCTTGT   20   50   42.64  32      64      0      0      96
3   TGTGTGTGCGTGTGTGTCGC   20   60   49.40  36      70      0      0      106
4   GTGTTCGTGTGTAGTGACGTA  20   50   42.37  44      60      0      0      104
5   GGTGTGAGGTAGTGTGTACG   20   55   42.76  40      70      0      0      110
6   GTGGAAGTAGAGTAGCGTGC   20   55   43.17  47      56      0      0      103
7   GTGTAGTGTGTGTGTGTGTG   20   50   41.95  27      73      0      0      100
Total                                       249     458     0      0      707
Fitness Value (1 run)                       35.571  65.429  0.000  0.000  101.000
Average Value (100 runs)                    42.896  61.887  0.664  0.149  105.517

Remark: len is length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE X. BEST RESULT USING 35 ANTS.

No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   AGAGTAGCAGATTAGAAGGA   20   40   37.49  41      64      0      0      105
2   CTACAGGAAGATAGAGTACA   20   40   35.35  45      61      0      0      106
3   CAGAACAGGAGGAGAGAGTC   20   55   41.61  31      69      0      0      100
4   AGAGGAAGAGAGAAGCATAA   20   40   38.09  36      72      0      0      108
5   GCAGAGAGCGTTCGTACCAG   20   60   45.85  57      54      0      0      111
6   AGAGAGATAGAGATACAGAT   20   35   33.13  35      64      0      0      99
7   GAGAGAGAGAGGAGAGAGGA   20   55   41.39  17      67      0      0      84
Total                                       262     451     0      0      713
Fitness Value (1 run)                       37.429  64.429  0.000  0.000  101.857
Average Value (100 runs)                    42.557  62.476  0.417  0.122  105.503

Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE XI. BEST RESULT USING 40 ANTS.
No  DNA Sequence           len  GC%  Tm     Hm      Sm      Hr     C      Total
1   ACTCACCGACTTCAGGCACC   20   60   47.41  43      56      0      0      99
2   CACACAACACAACACACACA   20   45   42.41  18      64      0      0      82
3   TCAACTACTCACACGACACT   20   45   41.39  39      61      0      0      100
4   GGACAACACACACGCACACT   20   55   46.52  37      66      0      0      103
5   CCAGCCACAGCCACGCACGC   20   75   54.68  38      59      0      0      97
6   CCAACACATTCATCACACAT   20   40   39.35  37      66      0      0      103
7   TGTCACCCAACACAGCATCA   20   50   45.18  46      58      9      0      113
Total                                       258     430     9      0      697
Fitness Value (1 run)                       36.857  61.429  1.286  0.000  99.571
Average Value (100 runs)                    41.929  63.091  0.334  0.129  105.483

Remark: len is the length of the sequences. GC% and Tm are the GCcontent and melting temperature constraints, respectively. Hm, Sm, Hr and C are the Hmeasure, similarity, hairpin and continuity objective functions, respectively.

TABLE XII. ALGORITHMS SELECTED FOR BENCHMARKING.

Algorithm          Author                    DNA Seq          Year
GA                 Deaton et al. [21]        7 Seq, 20-mer    1996
SA                 Tanaka et al. [7]         14 Seq, 20-mer   2001
MOEA (NACST/Seq)   Shin et al. [16]          14 Seq, 20-mer   2005
MOPSO              Zhao et al. [17]          7 Seq, 20-mer    2007
PSO                GuangZhou et al. [18]     20 Seq, 20-mer   2007
P-ACO              Kurniawan et al. [22]     7 Seq, 20-mer    2009
BinPSO             Khalid et al. [19]        7 Seq, 20-mer    2010
ACS                Yaakop et al. [23]        7 Seq, 20-mer    2010

Figure 7. Results obtained by Deaton et al. [21] vs results obtained by the ACS with extended computational model (bar chart of H-measure, similarity, continuity, hairpin and total for GA - Deaton et al. and the proposed ACS model).

Figure 8. Results obtained by Tanaka et al. [7] vs results obtained by the ACS with extended computational model (bar chart of H-measure, similarity, continuity, hairpin and total for Tanaka et al. (SA) and the proposed ACS).

TABLE XIII. SEQUENCES GENERATED BY DEATON ET AL. [21] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.
DNA Sequence              Hm      Sm      Hr     C      Total
GA - Deaton et al.
ATAGAGTGGATAGTTCTGGG      55      66      0      9      130
CATTGGCGGCGCGTAGGCTT      44      62      0      0      106
CTTGTGACCGCTTCTGGGGA      60      70      0      16     146
GAAAAAGGACCAAAAGAGAG      40      69      0      41     150
GATGGTGCTTAGAGAAGTGG      51      61      0      0      112
TGTATCTCGTTTTAACATCC      41      74      4      16     135
TTGTAAGCCTACTGCGTGAC      47      64      0      0      111
Fitness Value             48.29   66.57   0.57   11.71  127.14
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41      64      0      0      105
CTACAGGAAGATAGAGTACA      45      61      0      0      106
CAGAACAGGAGGAGAGAGTC      31      69      0      0      100
AGAGGAAGAGAGAAGCATAA      36      72      0      0      108
GCAGAGAGCGTTCGTACCAG      57      54      0      0      111
AGAGAGATAGAGATACAGAT      35      64      0      0      99
GAGAGAGAGAGGAGAGAGGA      17      67      0      0      84
Fitness Value (1 run)     37.43   64.43   0.00   0.00   101.86
Average Value (100 runs)  42.557  62.476  0.417  0.122  105.571

Remark: Hm, Sm, Hr, and C are the H-measure, similarity, hairpin, and continuity objective functions, respectively.

A comparison was then made between the proposed model and the multi-objective evolutionary algorithm (MOEA) designed by Shin et al. [16]. Shin et al. published several sets of DNA sequences generated by their system; the set of 14 sequences with a 20-mer length, shown in Table 15, is used for the comparison with the proposed model. As seen in Figure 9, the sequences generated by the ACS with extended computational model have a significantly lower H-measure value than the sequences generated by MOEA. The sequences generated by MOEA outperformed the ACS model in terms of the similarity, continuity, and hairpin measures. However, the total objective value of the sequences produced by the ACS with extended computational model (228.71) is lower than that of the sequences designed using MOEA (250). Therefore, it can be concluded that the overall performance of the ACS with extended computational model is better than that of MOEA.
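The "Fitness Value" rows of Table XIII are numerically consistent with a plain arithmetic mean of each objective over the seven sequences. The following Python snippet is a minimal sketch under that assumption; the names `GA_DEATON`, `ACS_EXTENDED`, and `fitness_values` are illustrative and do not come from the paper. It recomputes the per-objective means and the mean total from the per-sequence values listed in the table.

```python
# Per-sequence objective values copied from Table XIII, in the column order
# printed there: (Hm, Sm, Hr, C). All names here are illustrative only.
GA_DEATON = [
    (55, 66, 0, 9), (44, 62, 0, 0), (60, 70, 0, 16), (40, 69, 0, 41),
    (51, 61, 0, 0), (41, 74, 4, 16), (47, 64, 0, 0),
]
ACS_EXTENDED = [
    (41, 64, 0, 0), (45, 61, 0, 0), (31, 69, 0, 0), (36, 72, 0, 0),
    (57, 54, 0, 0), (35, 64, 0, 0), (17, 67, 0, 0),
]


def fitness_values(rows):
    """Return ([mean Hm, mean Sm, mean Hr, mean C], mean total).

    Assumption: the reported fitness value is the arithmetic mean over the
    sequences of a set, and each row total is the sum of its four objectives.
    """
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    mean_total = sum(sum(row) for row in rows) / n
    return means, mean_total


if __name__ == "__main__":
    for name, rows in (("GA - Deaton et al.", GA_DEATON),
                       ("ACS, extended model", ACS_EXTENDED)):
        means, mean_total = fitness_values(rows)
        print(name, [round(m, 2) for m in means], round(mean_total, 2))
```

Under this reading, a lower mean total indicates the better sequence set, which is the criterion the comparisons in this section use to rank the algorithms.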
Table 16 and Figure 10 compare the results of the ACS with extended computational model with those of the multi-objective PSO (MOPSO) [17]. MOPSO was employed to generate 7 DNA sequences with a 20-mer length. The ACS with extended computational model significantly outperformed MOPSO for all four of the objective measures.

TABLE XIV. SEQUENCES GENERATED BY TANAKA ET AL. [7] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.

Sequence                  Hm      Sm      Hr     C      Total
SA - Tanaka et al.
CGAGACATCGTGCATATCGT      111     159     4      0      274
TATAGCACGAGTGCGCGTAT      110     158     0      0      268
GATCTACGATCATGAGAGCA      119     160     4      0      283
TCTGTACTGCTGACTCGAGT      120     154     0      0      274
CGAGTAGTCACACGATGAGA      112     160     0      0      272
AGATGATCGGCAGCGAGAGT      117     157     0      0      274
TGTGCTCGTCTCTGCATACT      108     161     4      0      273
AGACGAGTCGTACAGTACAG      113     150     0      0      263
ATGTACGTGAGATGCAGCAG      111     150     0      0      261
ATCACTACTCGCTCGTCACT      109     151     0      0      260
TCAGAGATACTCACGTCACG      115     158     0      0      273
GACAGAGCTATCAGCTACTG      112     153     0      0      265
GCTGACATAGAGTGCGATAG      112     149     0      0      261
ACATCGACACTACTACGCAC      104     155     0      0      259
Fitness Value             112.36  155.36  0.86   0.00   268.57
ACS WITH EXTENDED COMPUTATIONAL MODEL
CAGCACTCCAAGCACAACAG      82      127     0      0      209
GCACTACACACACGCGCCAC      95      151     0      0      246
AGCACACAGTCACGAAGGCG      107     124     0      0      231
ACACACACACACAGCGAACA      67      153     0      0      220
ACACACACACACACACACGA      60      160     0      0      220
GCACTAACCGAGAGCACTAC      75      145     0      0      220
GCTACACTATAGCGGCACAG      114     129     0      0      243
GCACAACCACAACTAAACAC      73      147     0      0      220
GCAGGCAACGCGCGAGCCAA      103     119     0      0      222
TAACTACGCGACACTAGCAT      101     133     0      0      234
TCACACACATACACAAACAT      79      143     9      0      231
TATACACTAGTAGCAACTAC      106     121     0      0      227
TAGCCCTACAACGCACGGTA      111     119     9      0      239
GTACCAGCACCTACACACAC      101     139     0      0      240
Fitness Value (1 run)     91.00   136.43  1.29   0.00   228.71
Average Value (100 runs)  97.941  135.908 0.740  0.129  234.717

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Figure 9. Results obtained by Shin et al. [16] vs results obtained by the ACS with extended computational model. [Bar chart; series: Shin et al. (MOEA), Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

TABLE XV. SEQUENCES GENERATED BY SHIN ET AL. [16] VS SEQUENCES GENERATED BY THIS STUDY.

Sequence                  Hm      Sm      Hr     C      Total
MOEA - Shin et al.
GTGACTTGAGGTAGGTAGGA      149     96      0      0      245
ATCATACTCCGGAGACTACC      143     101     0      0      244
CACGTCCTACTACCTTCAAC      145     113     0      0      258
ACACGCGTGCATATAGGCAA      144     105     0      0      249
AAGTCTGCACGGATTCCTGA      152     108     0      0      260
AGGCCGAAGTTGACGTAAGA      152     107     0      0      259
CGACACTTGTAGCACACCTT      142     102     0      0      244
TGGCGCTCTACCGTTGAATT      151     106     0      0      257
CTAGAAGGATAGGCGATACG      138     107     0      0      245
CTTGGTGCGTTATGTGTACA      148     93      0      0      241
TGCCAACGGTCTCAACATGA      152     108     0      0      260
TTATCTCCATAGCTCCAGGC      142     99      0      0      241
TGAACGAGCATCACCAACTC      144     110     0      0      254
CTAGATTAGCGGCCATAACC      135     108     0      0      243
Fitness Value             145.5   104.5   0      0      250
ACS WITH EXTENDED COMPUTATIONAL MODEL
CAGCACTCCAAGCACAACAG      82      127     0      0      209
GCACTACACACACGCGCCAC      95      151     0      0      246
AGCACACAGTCACGAAGGCG      107     124     0      0      231
ACACACACACACAGCGAACA      67      153     0      0      220
ACACACACACACACACACGA      60      160     0      0      220
GCACTAACCGAGAGCACTAC      75      145     0      0      220
GCTACACTATAGCGGCACAG      114     129     0      0      243
GCACAACCACAACTAAACAC      73      147     0      0      220
GCAGGCAACGCGCGAGCCAA      103     119     0      0      222
TAACTACGCGACACTAGCAT      101     133     0      0      234
TCACACACATACACAAACAT      79      143     9      0      231
TATACACTAGTAGCAACTAC      106     121     0      0      227
TAGCCCTACAACGCACGGTA      111     119     9      0      239
GTACCAGCACCTACACACAC      101     139     0      0      240
Fitness Value (1 run)     91.00   136.43  1.29   0.00   228.71
Average Value (100 runs)  97.941  135.908 0.740  0.129  234.717

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Figure 10. Results obtained by Zhao et al. [17] vs results obtained by the ACS with extended computational model. [Bar chart; series: Zhao et al. (MOPSO), Proposed ACS; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

TABLE XVI. SEQUENCES GENERATED BY ZHAO ET AL. [17] VS SEQUENCES GENERATED BY THIS STUDY.

Sequence                  Hm      Sm      Hr     C      Total
MOPSO - Zhao et al.
CATCAGCCGGACTCGTCAGT      55      66      0      9      130
AGATCGCATGTAAAGGAGTG      44      62      0      0      106
AAAGCAGGGTGTATCAGTCA      60      70      0      16     146
TACAGGCGCTAATTAGCTCC      40      69      0      41     150
GCGGACCCAACACATATGAG      51      61      0      0      112
ATCATCATTTCATGGGGCAA      41      74      4      16     135
GGGATCGACGTATATTAACG      47      64      0      0      111
Fitness Value             48.29   66.57   0.57   11.71  127.14
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41      64      0      0      105
CTACAGGAAGATAGAGTACA      45      61      0      0      106
CAGAACAGGAGGAGAGAGTC      31      69      0      0      100
AGAGGAAGAGAGAAGCATAA      36      72      0      0      108
GCAGAGAGCGTTCGTACCAG      57      54      0      0      111
AGAGAGATAGAGATACAGAT      35      64      0      0      99
GAGAGAGAGAGGAGAGAGGA      17      67      0      0      84
Fitness Value (1 run)     37.43   64.43   0.00   0.00   101.86
Average Value (100 runs)  42.557  62.476  0.417  0.122  105.571

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Figure 11. Results obtained by GuangZhou et al. [18] vs results obtained by the ACS with extended computational model. [Bar chart; series: PSO - GuangZhou et al., Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

GuangZhou et al. [18] employed a PSO algorithm to generate 20 DNA sequences, each with a 20-mer length. Table 17 presents these sequences alongside the sequences generated by the ACS with extended computational model. Figure 11 shows that the ACS with extended computational model outperforms the PSO. Table 18 shows the sequences generated using the binary particle swarm optimisation (BinPSO) [19] in comparison with the sequences generated using the ACS with extended computational model.
Based on the plot shown in Figure 12, the overall performance of the ACS with extended computational model is better than that of the BinPSO algorithm. Examining each objective function in Table 18, the ACS with extended computational model produced a significantly lower H-measure (37.43 vs 66.43) and was at least as good as BinPSO on the hairpin and continuity measures, whereas BinPSO produced a lower similarity value (45.00 vs 64.43). In general, the ACS model outperformed the BinPSO model because it produced a lower total objective value (101.86 vs 116.57).

TABLE XVII. SEQUENCES GENERATED BY GUANGZHOU ET AL. [18] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.

Sequence                  Hm      Sm      Hr     C      Total
PSO - GuangZhou et al.
GTCAAATTCCCTCTATCGTC      208     155     0      18     381
AGCGATAGTAGATCACCTGA      211     144     0      0      355
CACGATATAGCTTCGAGCCG      212     158     0      0      370
AATACACCGCTCACCAAGGA      211     155     0      0      366
AACAGGGAAGAATGCAGAGG      211     146     0      9      366
CCTCTACCAGCCAATGATGC      203     158     0      0      361
TTAGGACTCGACGCCACTCC      204     153     0      0      357
CCATGACCGAGGATCCACGT      203     175     0      0      378
CGCCATTATCAGGCCTTTAC      215     157     0      9      381
ACACAGTGGACGCACATACA      209     167     0      0      376
TTATCCCGCCTCTTCTCCGT      213     165     0      9      387
AATACGGTTCAAGCGGCTTC      216     158     4      0      378
TAAAGGCGCGTGATCGGAAG      221     152     0      9      382
TTGTTCGGGATTGAGCAACT      230     148     5      9      392
GTCACTGAGTCAGCACTCAT      210     156     4      0      370
CCATAAACTGCCAGCTCGCG      207     165     0      9      381
CAACATAGAGTCAGGCGCTG      210     163     0      0      373
CCAATGAGTCACCTCGTTCG      217     164     9      0      390
GGGGTGGAGGCCCAACTATT      206     157     0      25     388
CAGCGGTCTGAACCTCCATA      210     152     0      0      362
Fitness Value             211.35  157.4   1.1    4.85   374.7
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGAAACGCGATTAGAGTA      143     213     0      0      356
AGAAGATAGATTACGAGACC      146     205     0      0      351
TAGAGAAGAGAGATACGAGC      122     215     0      0      337
GGAGAGAGAGACGAGAGTCC      126     232     0      0      358
CGTAGACGAGAGAGCGAGAC      124     209     0      0      333
CGGTAGATAGAGAGAGAGTA      123     244     0      0      367
CGAGAAGATACCGAAGATAG      135     194     0      0      329
AGGACGAGAGAGAGAGAAGG      88      223     0      0      311
TAGATAGATTCCGAGAAGAG      158     194     0      0      352
AGACGCGAGAGAGAGGAGAG      107     214     0      0      321
AGCGTACGGAGAGATACCGT      153     203     0      8      364
GAGAGAGAGAGAGAGAGAGT      67      25      0      0      92
CTAGAGAGACGAGAGAGAGT      124     227     0      0      351
CCGCGTAGAGGTCGAGAGCG      161     195     0      0      356
TAGGTACGAGACGAGGAAGG      122     192     0      0      314
TAGTATAGCGAGACGTAAAG      157     194     9      0      360
TGAAGAGAGAGGAGAGTAGA      103     239     0      0      342
CGGAGGAGACCATACGTATA      143     184     0      0      327
CGAGTAGCGGATACAAGACT      159     179     0      0      338
GAGAGGAGAAGATTAGAGAG      104     208     0      0      312
Fitness Value (1 run)     128.25  199.45  0.45   0.4    328.55
Average Value (100 runs)  145.908 200.800 0.888  0.216  347.812

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Figure 12. Results obtained by Khalid et al. [19] vs results obtained by the ACS with extended computational model. [Bar chart; series: BinPSO - Khalid et al., Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

TABLE XVIII. SEQUENCES GENERATED BY KHALID ET AL. [19] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.

Sequence                  Hm      Sm      Hr     C      Total
BinPSO - Khalid et al.
CGGTCACGCCTCTTGTATTG      63      42      0      0      105
ATCCGCGCCGCACGGTCATG      70      45      0      0      115
CCAAATACATTGACTCCCAA      72      46      18     0      136
TATTTGCTCGGAGACCGCGG      65      46      9      0      120
GCATTTGATTCAGCGTTCCA      66      43      9      0      118
GTTGGAATGGTGTAGCTGAG      66      45      0      0      111
GTCTGTGTACTCTTCCGTGG      63      48      0      0      111
Fitness Value             66.43   45.00   5.14   0.00   116.57
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41      64      0      0      105
CTACAGGAAGATAGAGTACA      45      61      0      0      106
CAGAACAGGAGGAGAGAGTC      31      69      0      0      100
AGAGGAAGAGAGAAGCATAA      36      72      0      0      108
GCAGAGAGCGTTCGTACCAG      57      54      0      0      111
AGAGAGATAGAGATACAGAT      35      64      0      0      99
GAGAGAGAGAGGAGAGAGGA      17      67      0      0      84
Fitness Value (1 run)     37.43   64.43   0.00   0.00   101.86
Average Value (100 runs)  42.557  62.476  0.417  0.122  105.571

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Next, the sequences generated by the ACS with extended computational model are compared with those generated by the population-based ant colony optimisation (P-ACO) [22]. According to Table 19, the sequences generated by P-ACO have a lower H-measure (28.29 vs 37.43), the similarity values are nearly equal (64.71 vs 64.43), and both sets score zero on continuity and hairpin. P-ACO is also superior in terms of overall performance (total objective value 93.00 vs 101.86). The sequences and the overall results are presented in Table 19 and Figure 13, respectively.

Finally, the ACS with extended computational model is compared with the original ACS model [23]. The result of the comparison is shown in Table 20 and Figure 14.
On the 100-run averages, the ACS with extended computational model outperformed the original ACS model in all of the objective functions except for the similarity measure, for which the original ACS model performed slightly better. The overall performance of the ACS with extended computational model exceeds that of the original ACS model.

TABLE XIX. SEQUENCES GENERATED BY KURNIAWAN ET AL. [22] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.

Sequence                  Hm      Sm      Hr     C      Total
P-ACO - Kurniawan et al.
CCACCACCACCACCAATAAT      23      77      0      0      100
ACCTCACTCACTCACTCAAC      49      35      0      0      84
TAACAGAACAGAACAGGCCG      21      68      0      0      89
CACACACACACACACACACA      25      74      0      0      99
AATCTCTCTCTCTCTCTGCC      20      81      0      0      101
CGCCAGCCAGCCTATATATA      39      49      0      0      88
TTGCATTCCTTCCTTCCTGG      21      69      0      0      90
Fitness Value             28.29   64.71   0.00   0.00   93.00
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41      64      0      0      105
CTACAGGAAGATAGAGTACA      45      61      0      0      106
CAGAACAGGAGGAGAGAGTC      31      69      0      0      100
AGAGGAAGAGAGAAGCATAA      36      72      0      0      108
GCAGAGAGCGTTCGTACCAG      57      54      0      0      111
AGAGAGATAGAGATACAGAT      35      64      0      0      99
GAGAGAGAGAGGAGAGAGGA      17      67      0      0      84
Fitness Value (1 run)     37.43   64.43   0.00   0.00   101.86
Average Value (100 runs)  42.557  62.476  0.417  0.122  105.571

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

Figure 13. Results obtained by Kurniawan et al. [22] vs results obtained by the ACS with extended computational model. [Bar chart; series: P-ACO - Kurniawan et al., Proposed ACS Model; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

Figure 14. Results obtained by Yakop et al. [23] vs results obtained by the ACS with extended computational model. [Bar chart; series: Proposed ACS model, Yakop et al.; categories: H-measure, Similarity, Continuity, Hairpin, Total.]

TABLE XX. SEQUENCES GENERATED BY YAKOP ET AL. [23] VS SEQUENCES GENERATED BY THE ACS WITH EXTENDED COMPUTATIONAL MODEL.

Sequence                  Hm      Sm      Hr     C      Total
ACS - Yakop et al.
GGAGTGAGAGAGAGGAAGAG      25      73      0      0      98
AGAGAGAATGAGTTCAGATG      43      67      0      0      110
CGAGGAGATCCGCGATACCG      53      60      0      0      113
AGATGAGGAGCGCAGAGGCG      39      69      0      0      108
AGAGCGATGAGAAGAGAGAT      28      72      0      0      100
TGAGAGAGAGATGAGAGAGT      23      74      0      0      97
AAGAGAAGAGAGAGAGAGAG      22      71      0      0      93
Fitness Value (1 run)     33.29   69.43   0.00   0.00   102.71
Average Value (100 runs)  51.89   60.1    3.25   0.65   115.89
ACS WITH EXTENDED COMPUTATIONAL MODEL
AGAGTAGCAGATTAGAAGGA      41      64      0      0      105
CTACAGGAAGATAGAGTACA      45      61      0      0      106
CAGAACAGGAGGAGAGAGTC      31      69      0      0      100
AGAGGAAGAGAGAAGCATAA      36      72      0      0      108
GCAGAGAGCGTTCGTACCAG      57      54      0      0      111
AGAGAGATAGAGATACAGAT      35      64      0      0      99
GAGAGAGAGAGGAGAGAGGA      17      67      0      0      84
Fitness Value (1 run)     37.43   64.43   0.00   0.00   101.86
Average Value (100 runs)  42.557  62.476  0.417  0.122  105.571

Note: Hm, Sm, Hr, C and Total are the H-measure, similarity, hairpin, continuity and total objective value, respectively.

VII. CONCLUSION

This paper revisits an improved ACS algorithm for solving the DNA sequence design problem in which the dependence between the number of ants and the number of sequences produced is eliminated. Unlike the original computational model, this model allows the number of ants to be flexible in order to enhance the possibility of finding an optimum DNA sequence design. The total objective value decreased as more ants were used in the system. However, improvement of the total objective stagnates when the number of ants is larger than 25. Based on this finding, it can be concluded that, for the application of the ACS algorithm in DNA sequence design, a suitable number of ants for finding a feasible solution is 30. The benchmarking of the results obtained using the ACS algorithm against other algorithms suggests that the ACS algorithm is superior in most cases and is a good method for DNA sequence design.

REFERENCES

[1] L. Adleman, “Molecular Computation of Solutions to Combinatorial Problems,” Science, vol. 266, pp. 1021-1024, 1994.
[2] M. Dorigo, V. Maniezzo, and A. Colorni, “Ant System: Optimization by a Colony of Cooperating Agents,” IEEE Transactions on Systems, Man, and Cybernetics-Part B, vol. 26, no. 1, pp. 29-41, 1996.
[3] M. Dorigo and M. Gambardella, “Ant Colony System: A Cooperative Learning Approach to the Traveling Salesman Problem,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 53-66, 1997.
[4] Z. Ibrahim, T.B. Kurniawan, N.K. Khalid, S. Sudin, and M. Khalid, “Implementation of an Ant Colony System for DNA Sequence Optimization,” Artificial Life and Robotics, vol. 14, issue 2, pp. 293-296, 2009.
[5] T.B. Kurniawan, N.K. Khalid, Z. Ibrahim, M. Khalid, and M. Middendorf, “Evaluation of Ordering Methods for DNA Sequence Design Based on Ant Colony System,” Second Asia International Conference on Modeling & Simulation, 2008, pp. 905-910.
[6] S.M. Mustaza, A.F. Zainal Abidin, Z. Ibrahim, S. Buyamin, M.A. Shamsudin, A.R. Husain, and J.A. Ahmed Mukred, “A Modified Computational Model of Ant Colony System in DNA Sequences Design,” Proceedings of IEEE Student Conference on Research and Development, 2011, pp. 19-20.
[7] F. Tanaka, M. Nakatsugawa, M. Yamamoto, T. Shiba, and A. Ohuchi, “Developing Support System for Sequence Design in DNA Computing,” Proceedings of The 7th International Workshop on DNA Based Computers, 2001, pp. 340-349.
[8] S. Hussini, L. Kari, and S. Konstantinidis, “Coding Properties of DNA Languages,” Theoretical Computer Science, vol. 290, pp. 1557-1579, 2003.
[9] M.H. Garzon and R.J. Deaton, “Codeword Design and Information Encoding in DNA Ensembles,” Natural Computing, vol. 3, pp. 253-292, 2004.
[10] A.J. Hartemink, D.K. Gifford, and J. Khodor, “Automated Constraint-based Nucleotide Sequence Selection for DNA Computation,” Proceedings of the 4th DIMACS Workshop DNA Based Computer, 1998, pp. 227-235.
[11] M. Arita and S. Kobayashi, “DNA Sequence Design using Templates,” New Generation Computing, vol. 20, pp. 263-277, 2002.
[12] U. Feldkamp, S. Saghafi, W. Banzhaf, and H. Rauhe, “DNA Sequence Generator - A Program for the Construction of DNA Sequences,” Proceedings of the 7th Int. Workshop DNA Based Computer, 2001, pp. 179-188.
[13] A. Marathe, A.E. Condon, and R.M. Corn, “On Combinatorial DNA Word Design,” Proceedings of the 5th International Meeting on DNA Based Computers, 1999.
[14] B.T. Zhang and S.Y. Shin, “Molecular Algorithms for Efficient and Reliable DNA Computing,” Proceeding of Genetic Programming, 1998, pp. 735-742.
[15] R. Deaton, J. Chen, H. Bi, M. Garzon, H. Rubin, and D.H. Wood, “A PCR-based Protocol for in vitro Selection of Non-crosshybridizing Oligonucleotides,” Proceedings of 8th International Workshop on DNA Based Computers, 2002, pp. 196-204.
[16] S.Y. Shin, I.H. Lee, D. Kim, and B.T. Zhang, “Multiobjective Evolutionary Optimization of DNA Sequences for Reliable DNA Computing,” IEEE Transaction on Evolutionary Computation, vol. 9, no. 2, pp. 143-158, 2005.
[17] S. Zhou, Q. Zhang, J. Zhao, and J. Li, “DNA Encodings Based on Multi-Objective Particle Swarm,” Journal of Computational and Theoretical Nanoscience, vol. 4, pp. 1249-1252, 2007.
[18] G. Cui, Y. Niu, Y. Wang, X. Zhang, and L. Pan, “A New Approach based on PSO Algorithms to Find Good Computational Encoding Sequences,” Progress in Natural Science, vol. 17, no. 6, pp. 712-716, 2007.
[19] N.K. Khalid, Z. Ibrahim, T. Kurniawan, M. Khalid, and A. Engelbrecht, “Implementation of Binary Particle Swarm Optimization for DNA Sequence Design,” Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living, vol. 5518 of the series Lecture Notes in Computer Science, pp. 450-457.
[20] S.-Y. Shin, Multi-objective Evolutionary Optimization of DNA Sequences for Molecular Computing, PhD Thesis, Seoul National University, 2005.
[21] R. Deaton, M. Garzon, R.C. Murphy, and J.A. Rose, “Genetic Search of Reliable Encodings for DNA-based Computation,” First Conference on Genetic Programming, 1996.
[22] T.B. Kurniawan, Z. Ibrahim, N.K. Khalid, and M. Khalid, “A Population-based Ant Colony Optimization Approach for DNA Sequence Optimization,” Third Asia International Conference on Modelling and Simulation, 2009, pp. 246-251.
[23] F. Yakop, Z. Ibrahim, A.F. Zainal Abidin, Z.M. Yusof, M.S. Mohamad, K. Wan, and J. Watada, “An Ant Colony System for Solving DNA Sequence Design Problem in DNA Computing,” International Journal of Innovative Computing, Information and Control, vol. 8, no. 10, pp. 7329-7339, 2012.

DOI 10.5013/IJSSST.a.15.06.06. ISSN: 1473-804x online, 1473-8031 print.