Abstract
For the multiple sequence alignment problem in molecular biological sequence analysis, a hybrid genetic algorithm and an associated software package called HGA-COFFEE are presented. The COFFEE function is used to measure individual fitness, and five novel genetic operators are designed: a selection operator, two crossover operators, and two mutation operators. One mutation operator is based on COFFEE's consistency information and improves global search ability; the other is implemented by dynamic programming and improves individuals locally. Experimental results on the 144 benchmarks from BAliBASE show that the proposed algorithm is feasible, and for datasets in the twilight zone and those comprising N/C-terminal extensions, HGA-COFFEE generates better alignments than the other methods studied in this paper.
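To make the idea of a consistency-based fitness concrete, the following is a minimal sketch of a simplified score in the spirit of COFFEE: the weighted fraction of residue pairs aligned in a candidate multiple alignment that also appear in a library of reference pairwise alignments. It is not the HGA-COFFEE implementation. The function names (`aligned_pairs`, `consistency_score`), the library layout, the uniform default weights, and the choice of denominator are all assumptions made for illustration.

```python
from itertools import combinations


def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (index_in_a, index_in_b) residue pairs that
    share a column in two gapped rows of equal length."""
    pairs = set()
    ia = ib = 0  # residue counters, ignoring gaps
    for ca, cb in zip(row_a, row_b):
        if ca != gap and cb != gap:
            pairs.add((ia, ib))
        if ca != gap:
            ia += 1
        if cb != gap:
            ib += 1
    return pairs


def consistency_score(msa, library, weights=None):
    """Simplified consistency score (an illustrative assumption, not the
    exact COFFEE formula).

    msa     : list of gapped strings of equal length
    library : dict mapping (i, j), i < j, to the set of residue pairs
              taken from a reference pairwise alignment of sequences i, j
    weights : optional dict mapping (i, j) to a pair weight (default 1.0)
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(len(msa)), 2):
        w = 1.0 if weights is None else weights[(i, j)]
        pairs = aligned_pairs(msa[i], msa[j])
        # Count aligned residue pairs that the library also supports.
        numerator += w * len(pairs & library.get((i, j), set()))
        denominator += w * len(pairs)
    return numerator / denominator if denominator else 0.0


# Hypothetical toy data: three short sequences and their pairwise library.
library = {
    (0, 1): aligned_pairs("ACG", "ATG"),
    (0, 2): aligned_pairs("AC-G", "ACTG"),
    (1, 2): aligned_pairs("A-TG", "ACTG"),
}
candidate = ["AC-G", "A-TG", "ACTG"]
print(consistency_score(candidate, library))
```

In a genetic algorithm, a score of this kind could serve as the fitness of each individual (each candidate alignment), so that selection favors alignments that agree more with the pairwise library.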
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Liu, Lf., Huo, Hw., Wang, Bs. (2005). HGA-COFFEE: Aligning Multiple Sequences by Hybrid Genetic Algorithm. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_56
DOI: https://doi.org/10.1007/11527503_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science (R0)