Analysis of the Effects of Multiple Sequence Alignments in Protein Secondary Structure Prediction

Georgios Joannis Pappas Jr. &
Shankar Subramaniam

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3594)

Included in the following conference series:

Brazilian Symposium on Bioinformatics

Abstract

Secondary structure prediction methods are widely used bioinformatics algorithms that provide initial insights into protein structure from sequence information. Significant efforts have been made over the past years to improve prediction accuracy, especially through the incorporation of information from multiple sequence alignments. This motivated the search for the factors contributing to this improvement. We show that in two highly ranked secondary structure prediction methods, DSC and PREDATOR, the use of multiple alignments consistently improves prediction accuracy compared with the use of single sequences. This is validated using different measures of accuracy, which also make it possible to identify that helical regions benefit the most from alignments, whereas β-strands seem to have reached a plateau in terms of predictability. The origins of this improvement are also explored in terms of sequence specificity, secondary structure composition, and the extent of sequence similarity that provides optimal performance. We find that divergent sequences, in the identity range of 25–55%, provide the largest accuracy gain, and that above 65% identity there is almost no advantage in using multiple alignments.
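
The abstract does not name the accuracy measures it uses. As an illustration only, the sketch below computes two measures that are common in this field: the three-state per-residue accuracy (Q3) and a per-state Matthews correlation coefficient. The three-state alphabet (H, E, C), the function names, and the toy strings are assumptions made for this example and are not taken from the paper.

```python
import math


def q3(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def matthews(observed: str, predicted: str, state: str) -> float:
    """Matthews correlation for one state, scored one-vs-rest."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        if p == state and o == state:
            tp += 1
        elif p == state:
            fp += 1
        elif o == state:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (hypothetical data): H = helix, E = strand, C = coil.
observed = "CHHHHHCCEEEECC"
predicted = "CHHHHCCCEEECCC"

print(f"Q3     = {q3(observed, predicted):.3f}")
for s in "HEC":
    print(f"MCC({s}) = {matthews(observed, predicted, s):.3f}")
```

Reporting a per-state coefficient alongside Q3 is what allows helices and strands to be compared separately, since a single overall accuracy can hide differences between states.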

Author information

Authors and Affiliations

Biotechnology and Genomic Sciences program, Universidade Católica de Brasília,
Georgios Joannis Pappas Jr.
Departments of Bioengineering, Chemistry, and Biochemistry, University of California at San Diego, La Jolla, California
Shankar Subramaniam

Editor information

Editors and Affiliations

Department of Computer Science, Virginia Bioinformatics Institute and Virginia Polytechnic Institute and State University, Bioinformatics 1, Box 0477, 24060-0477, Blacksburg, VA, USA
João Carlos Setubal
Instituto de Quimica, Departamento de Bioquimica, Universidade de Sao Paulo, Av. Prof. Lineu Prestes 748, 05508-000, Sao Paulo, SP, Brazil
Sergio Verjovski-Almeida

About this paper

Cite this paper

Pappas, G.J., Subramaniam, S. (2005). Analysis of the Effects of Multiple Sequence Alignments in Protein Secondary Structure Prediction. In: Setubal, J.C., Verjovski-Almeida, S. (eds) Advances in Bioinformatics and Computational Biology. BSB 2005. Lecture Notes in Computer Science, vol 3594. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532323_14

DOI: https://doi.org/10.1007/11532323_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28008-8
Online ISBN: 978-3-540-31861-3
eBook Packages: Computer Science, Computer Science (R0)
