Skip to main content
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of... more
PredictProtein is a meta-service for sequence analysis
that has been predicting structural and functional
features of proteins since 1992. Queried with
a protein sequence it returns: multiple sequence
alignments, predicted aspects of structure (secondary
structure, solvent accessibility, transmembrane
helices (TMSEG) and strands, coiled-coil regions,
disulfide bonds and disordered regions) and
function. The service incorporates analysis methods
for the identification of functional regions (ConSurf),
homology-based inference of Gene Ontology terms
(metastudent), comprehensive subcellular localization
prediction (LocTree3), protein–protein binding
sites (ISIS2), protein–polynucleotide binding sites
(SomeNA) and predictions of the effect of point mutations
(non-synonymous SNPs) on protein function
(SNAP2). Our goal has always been to develop a system
optimized to meet the demands of experimentalists
not highly experienced in bioinformatics. To this
end, the PredictProtein results are presented as both
text and a series of intuitive, interactive and visually
appealing figures. The web server and sources are
available at http://ppopen.rostlab.org.
Research Interests:
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins;... more
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predi...
The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular... more
The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18=80±3% for eukaryotes and a six-state accuracy Q6=89±4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic pr...
We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they... more
We report on several proteins recently solved by structural genomics consortia, in particular by the Northeast Structural Genomics consortium (NESG). The proteins considered in this study differ substantially in their sequences but they share a similar structural core, characterized by a pseudobarrel five-stranded beta sheet. This core corresponds to the PUA domain-like architecture in the SCOP database. By connecting sequence information with structural knowledge, we characterize a new subgroup of these proteins that we propose to be distinctly different from previously described PUA domain-like domains such as PUA proper or ASCH. We refer to these newly defined domains as EVE. Although EVE may have retained the ability of PUA domains to bind RNA, the available experimental and computational data suggests that both the details of its molecular function and its cellular function differ from those of other PUA domain-like domains. This study of EVE and its relatives illustrates how the combination of structure and genomics creates new insights by connecting a cornucopia of structures that map to the same evolutionary potential. Primary sequence information alone would have not been sufficient to reveal these evolutionary links.
Research Interests: