Papers by Esmeralda Vicedo
F1000Research, 2015
Recent experiments established that a culture of Saccharomyces cerevisiae (baker's yeast) survives sudden high temperatures by specifically duplicating the entire chromosome III and two chromosomal fragments (from IV and XII). Heat shock proteins (HSPs) are not significantly over-abundant in the duplication. In contrast, we suggest a simple algorithm to "postdict" the experimental results: Find a small enough chromosome with minimal protein disorder and duplicate this region. This algorithm largely explains all observed duplications. In particular, all regions duplicated in the experiment reduced the overall content of protein disorder. The differential analysis of the functional makeup of the duplication remained inconclusive. Gene Ontology (GO) enrichment suggested over-representation in processes related to reproduction and nutrient uptake. Analyzing the protein-protein interaction network (PPI) revealed that few network-central proteins were duplicated. The predictive hypothesis hinges upon the concept of reducing proteins with long regions of disorder in order to become less sensitive to heat shock attack.
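The selection rule stated in the abstract above ("find a small enough chromosome with minimal protein disorder and duplicate this region") can be illustrated with a minimal sketch. This is not the authors' implementation: the `Chromosome` record, the `max_proteins` size cutoff, the use of a per-chromosome fraction of proteins with long disordered regions as the disorder measure, and the toy numbers are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chromosome:
    """Hypothetical record; the field names are illustrative assumptions."""
    name: str
    n_proteins: int    # proteins encoded on this chromosome
    n_disordered: int  # proteins predicted to have a long disordered region

    @property
    def disorder_fraction(self) -> float:
        return self.n_disordered / self.n_proteins


def postdict_duplication(chromosomes: List[Chromosome],
                         max_proteins: int) -> Optional[Chromosome]:
    """Among chromosomes that are 'small enough' (assumed here to mean at
    most max_proteins proteins), return the one with the lowest disorder
    fraction, or None if no chromosome passes the size cutoff."""
    candidates = [c for c in chromosomes if c.n_proteins <= max_proteins]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.disorder_fraction)


def genome_disorder_fraction(chromosomes: List[Chromosome],
                             duplicated: Optional[Chromosome] = None) -> float:
    """Overall disorder fraction, optionally counting one chromosome twice."""
    total = sum(c.n_proteins for c in chromosomes)
    disordered = sum(c.n_disordered for c in chromosomes)
    if duplicated is not None:
        total += duplicated.n_proteins
        disordered += duplicated.n_disordered
    return disordered / total


# Invented toy values, NOT yeast data.
toy = [
    Chromosome("chrA", n_proteins=100, n_disordered=20),
    Chromosome("chrB", n_proteins=500, n_disordered=150),
    Chromosome("chrC", n_proteins=120, n_disordered=48),
]

choice = postdict_duplication(toy, max_proteins=200)
before = genome_disorder_fraction(toy)
after = genome_disorder_fraction(toy, duplicated=choice)
print(choice.name, round(before, 3), round(after, 3))
```

In this toy setting, duplicating the selected chromosome lowers the overall disorder fraction, which mirrors the abstract's observation that the duplicated regions reduced the overall content of protein disorder. How "small enough" and "minimal disorder" are quantified in the paper is not stated here, so both thresholds remain assumptions.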
BMC Bioinformatics, 2007
BMC Bioinformatics, 2014
This report summarizes the scientific content and activities of the annual symposium organized by the Student Council of the International Society for Computational Biology (ISCB), held in conjunction with the Intelligent Systems for Molecular Biology (ISMB) / European Conference on Computational Biology (ECCB) conference in Berlin, Germany, on July 19, 2013.
PLOS ONE, 2015
Many prokaryotic organisms have adapted to incredibly extreme habitats. The genomes of such extremophiles differ from their non-extremophile relatives. For example, some proteins in thermophiles sustain high temperatures by being more compact than homologs in non-extremophiles. Conversely, some proteins have increased volumes to compensate for freezing effects in psychrophiles that survive in the cold. Here, we revealed that some differences in organisms surviving in extreme habitats correlate with a simple single feature, namely the fraction of proteins predicted to have long disordered regions. We predicted disorder with different methods for 46 completely sequenced organisms from diverse habitats and found a correlation between protein disorder and the extremity of the environment. More specifically, the overall percentage of proteins with long disordered regions tended to be more similar between organisms of similar habitats than between organisms of similar taxonomy. For example, predictions tended to detect substantially more proteins with long disordered regions in prokaryotic halophiles (survive high salt) than in their taxonomic neighbors. Another peculiar environment is that of high radiation, survived, e.g., by Deinococcus radiodurans. The relatively high fraction of disorder predicted in this extremophile might provide a shield against mutations. Although our analysis fails to establish causation, the observed correlation between such a simplistic, coarse-grained, microscopic molecular feature (disorder content) and a macroscopic variable (habitat) remains stunning.
BMC Bioinformatics, 2015
Computational Statistics
Due to the increasing availability of powerful hardware resources, parallel computing is becoming an important issue, as a noticeable speedup may be achieved. The statistical programming language R allows for parallel computing on computer clusters as well as multicore systems through several packages. This tutorial gives a short, practical overview of four, in view of the authors, important packages for parallel computing in R, namely multicore, snow, snowfall and nws. First, the general principle of parallelizing simple tasks is briefly illustrated based on a statistical cross-validation example. Afterwards, the usage of each of the introduced packages is demonstrated on the example. Furthermore, we address some specific features of the packages and provide guidance for selecting an adequate package for the computing environment at hand. Keywords: Parallel computing, R, Multicore, Snow, Snowfall, nws
Background / Purpose: One common definition of regions of “disorder” in proteins is that they do not adopt a regular three-dimensional structure in isolation (i.e., when not bound to other molecules) on their own. These disordered regions are in contrast to regions that are well structured or “ordered”. Notably, there is a great variety of “flavours” of disorder: some proteins adopt a unique regular 3D secondary structure only upon binding, others, for example loops, remain irregular. It is also possible for a protein to be almost entirely disordered whilst others have only short disordered regions. Numerous computational methods exist that predict disorder based on a variety of concepts. One of these methods, NORSnet, has been developed in our group. NORSnet aims to predict disordered regions of the “loopy” type. Here, we predict secondary structure and disorder for all completely sequenced organisms. Main conclusion: We report two observations: most of the predicted disorder regio...
Bioinformatics and biology insights, 2009
Microarray data repositories as well as large clinical applications of gene expression allow to analyse several hundreds of microarrays at one time. The preprocessing of large amounts of microarrays is still a challenge. The algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets will be very time consuming. Here, preprocessing has to be a part of the cross-validation and resampling strategy which is necessary to estimate the rule's prediction quality honestly. This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. Partition of data can be applied on arrays and parallelization of algorithms is a straightforward consequence. The partition of data and distribution to several nodes solves the main memory problems and accelerates preprocessing by up to the factor 20 for 200 or more arrays. affyPara is a free and open source package, un...
Nature Methods, 2013
Current Opinion in Structural Biology, 2011
Computational Statistics, 2011
Due to the increasing availability of powerful hardware resources, parallel computing is becoming an important issue, as a noticeable speedup may be achieved. The statistical programming language R allows for parallel computing on computer clusters as well as multicore systems through several packages. This tutorial gives a short, practical overview of four, in view of the authors, important packages for
Computational Statistics, 2011
As microarray data quality can affect each step of the microarray analysis process, quality assessment and control is an integral part. It detects divergent measurements beyond the acceptable level of random fluctuations. This empirical study identifies association and correlation between the six quality assessment methods for microarray outlier detection used in the arrayQualityMetrics package version 2.2.2. For evaluation two different
BMC Bioinformatics, 2013
Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.
BioMed Research International, 2013