
    Rita Casadio

    Grouping residue variations in a protein according to their physicochemical properties allows a dimensionality reduction of all the possible substitutions in a variant with respect to the wild type. Here, by using a large dataset of proteins with disease-related and benign variations, as derived by merging Humsavar and ClinVar data, we investigate to what extent our physicochemical grouping procedure can help in determining whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. We download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated with Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain–Mondo categor...
    The Bologna ENZyme Web Server (BENZ WS) annotates four-level Enzyme Commission numbers (EC numbers) as defined by the International Union of Biochemistry and Molecular Biology (IUBMB). BENZ WS filters a target sequence with a combined system of Hidden Markov Models, modelling protein sequences annotated with the same molecular function, and Pfams, carrying along conserved protein domains. When successful, BENZ WS returns an associated four-level EC number for any enzyme target sequence. Our system can annotate both monofunctional and polyfunctional enzymes, and it can be a valuable resource for sequence functional annotation.
    In this paper we describe an algorithm based on neural networks that builds on previously published results (ISPRED, www.biocomp.unibo.it) and improves the prediction of protein-protein interaction sites in protein structures. The goal is to reduce the number of spurious assignments and to develop a knowledge-based computational approach that focuses on clusters of predicted residues on the protein surface. The
    PCPV is a member of the parapoxvirus genus, the type species of which is Orf virus (ORFV). PCPV is maintained in cattle, while ORFV is maintained in sheep and goats, and both infect humans. Recently, a homolog of the vascular endothelial growth factor (VEGF) has been identified in the genome of pseudocowpoxvirus (PCPV) VR634. The relatedness between PCPV VR634 VEGF and ORFV NZ7 VEGF raised the possibility that the ORFV NZ7 strain is a natural recombinant between strains of ORFV and PCPV and that the DNA segment spanning the VEGF gene of ORFV NZ7 is derived from PCPV. In our study, the VEGF genes of two PCPV field strains (1303/05 and 380/06), recently isolated in Italy, were characterised. Phylogenetic analysis was carried out using the maximum likelihood approach, and a variety of statistical analyses regarding genetic differentiation and gene flow were performed with DNASP version 4.10. To screen for recombination, we employed two preliminary detection programs, Single Breakpoint Recombination and Genetic Algorithms for Recombination Detection, both implemented in Datamonkey. A recombination breakpoint was identified at nucleotide position 237 of the NZ-2 gene, corresponding to a conserved codon encoding Cys in all the viral variants, indicating that recombination may occur at this site, while DNASP and statistical tests support the hypothesis that PCPV 1303 and 380 VEGFs differentiated from NZ-2 variants. PCPV amino acid sequences were compared with other viral VEGFs using Clustal W. Identity values of 51% and 41.6% were computed with the NZ-2 and VR634 variants, respectively. Secondary and tertiary structure analysis was performed with PSIPRED and homology modelling, indicating the conservation of the functional motifs and signature. The C-terminus was characterised to compare the amino acid residues involved in the interaction with the NP-1 and R1 receptors. C-terminal peptides identified in the PCPV 1303 and 380 VEGF variants suggest a low affinity for the receptor NP-1, while the presence of few glycosylation sites may lead to the activation of R1, which mediates monocyte migration. Taken together, our results provide new insights into interspecies recombination between bovine and sheep parapoxviruses and into the role of VEGF in normal and pathological conditions.
    KIT/PDGF receptor-α (PDGFRA) wild-type (WT) gastrointestinal stromal tumors (GIST) are characterized by an overexpression of IGF-1 receptor (IGF1R) at the mRNA and protein level. More recently, germline and somatic mutations in succinate dehydrogenase (SDH) subunits A, B and C have been identified in KIT/PDGFRA WT sporadic GIST. Until now, the molecular basis of IGF1R overexpression in KIT/PDGFRA WT GIST has not been explained. In this brief report we investigate the status of the SDH complex at the genomic and protein level in relation to IGF1R expression at the mRNA and protein level in seven KIT/PDGFRA WT sporadic GIST patients. We found that IGF1R was upregulated in all patients harboring SDH mutations or displaying a SDH dysfunction, with respect to KIT/PDGFRA WT GIST without SDH mutations. Western blot analysis confirmed that all patients with an upregulation of IGF1R mRNA had detectable IGF1R protein expression. This report would suggest that IGF1R overexpression in KIT/PDGFR...
    Becerra et al. exposed S. cerevisiae cells to a range of heat and cold shocks, including a pre-adaptation to heat shock, and then observed and contrasted the changes in gene expression in each case, identifying general and specific shock-related genes.
    For transmembrane proteins, experimental determination of three-dimensional structure is problematic. However, membrane proteins have an important impact on molecular biology in general, and on drug design in particular. Thus, prediction methods are needed. Here we introduce a method that starts from the output of the profile-based neural network system PHDhtm (Rost, et al.
    The availability of whole genome sequences in public databases permits genome-wide comparative studies of various bacterial species. Whole genome sequence-single nucleotide polymorphisms (WGS-SNP) analysis has been used in recent studies and allows the discrimination of various Brucella species and strains. In the present study, 13 Brucella spp. strains from cattle of various locations in provinces of South Africa were typed and discriminated. WGS-SNP analysis indicated a maximum pairwise distance ranging from 4 to 77 single nucleotide polymorphisms (SNPs) between the South African Brucella abortus virulent field strains. Moreover, it was shown that the South African B. abortus strains grouped closely to B. abortus strains from Mozambique and Zimbabwe, as well as other Eurasian countries, such as Portugal and India. WGS-SNP analysis of South African B. abortus strains demonstrated that the same genotype circulated in one farm (Farm 1), whereas another farm (Farm 2) in the same provi...
    A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis ...
    In the last decades, effective cellulose degradation became a major point of interest due to the properties of cellulose as a renewable energy source and the widespread application of cellulases (the cellulose-degrading enzymes) in many industrial processes. Effective bioconversion of lignocellulosic biomass into soluble sugars for ethanol production requires the use of thermostable and highly active cellulases. The library of current cellulases includes enzymes that can work at acidic and neutral pH in a wide temperature range. However, only a few cellulases are reported to be thermostable. To address this, we applied a hybrid approach to the thermostabilization of a key cellulase, Endoglucanase I (EGI) from Trichoderma reesei. We combined in silico and in vitro experiments to modulate the thermostability of EGI. Four different predictive algorithms were used to set up a library of mutations. Three thermostabilizer mutations (Q126F, K272F, Q274V) were selected and mole...
    Targeting peptides are N-terminal sorting signals in proteins that promote their translocation to mitochondria through the interaction with different protein machineries. We recently developed TPpred, a machine learning-based method that ranks among the best available for predicting the presence of a targeting peptide in a protein sequence and its cleavage site. Here we introduce TPpred2, which improves on the performance of TPpred in identifying the cleavage site of targeting peptides. TPpred2 is now available as a web interface and as a stand-alone version that users can freely download and adopt for processing large volumes of sequences. Availability and implementation: TPpred2 is available both as a web server and as a stand-alone version at http://tppred2.biocomp.unibo.it. Contact: gigi@biocomp.unibo.it. Supplementary data are available at Bioinformatics online.
    In the genomic era a key issue is protein annotation, namely how to endow protein sequences, upon translation from the corresponding genes, with structural and functional features. Routinely, this operation is done electronically by deriving and integrating information from previous knowledge. The reference database for protein sequences is UniProtKB, divided into two sections: UniProtKB/TrEMBL, which is automatically annotated and not reviewed, and UniProtKB/Swiss-Prot, which is manually annotated and reviewed. The annotation process is essentially based on sequence similarity search. The question therefore arises as to what extent annotation based on transfer by inheritance is valuable, and specifically whether it is possible to statistically validate inherited features when little homology exists between the target sequence and its template(s). In this paper we address the problem of annotating protein sequences in a statistically validated manner considering as a reference annotation resour...
    Mapping of the coding sequences of the best characterized subfamilies of G-protein-coupled receptors is performed with unsupervised neural networks based on a winner-take-all strategy. High-order features extracted from the mapping give rise to signals along the aligned protein sequences of the different subfamilies. These plots reveal domains that are common to and/or characteristic of the receptor subfamily. Comparison with the existing experimental results shows that most of the regions signalled by clustering overlap with possible functional regions in the folded proteins. This is particularly noticeable for the third cytoplasmic loop, which is likely to be involved in the molecular coupling with the G-proteins. The results suggest that functional regions in proteins may be characterized by intrinsic representative features in the coding sequences, which can be highlighted by high-order mapping.
    Chloroplast glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is a light-regulated, NAD(P)H-dependent enzyme involved in plant photosynthetic carbon reduction. Unlike lower photosynthetic organisms, which only contain A4–GAPDH, the major GAPDH isoform of land plants is made up of A and B subunits, the latter containing a C-terminal extension (CTE) with fundamental regulatory functions. Light-activation of AB–GAPDH depends on the redox state of a pair of cysteines of the CTE, which can form a disulfide bond under control of thioredoxin f, leading to specific inhibition of the NADPH-dependent activity. The tridimensional structure of A2B2–GAPDH from spinach chloroplasts, crystallized in the oxidized state, shows that each disulfide-containing CTE is docked into a deep cleft between a pair of A and B subunits. The structure of the CTE was derived from crystallographic data and computational modeling and confirmed by site-specific mutagenesis. Structural analysis of oxidized A 2 B...
    Endemic Burkitt lymphoma (eBL) constitutes the commonest cancer in children in developing countries, while the sporadic (sBL) and immunodeficiency-associated BL (ID-BL) forms are mainly encountered in Western countries. The molecular hallmark of the three BL variants is the translocation of the MYC proto-oncogene to the immunoglobulin heavy chain gene [t(8;14)(q24;q32)] or to one of the light chain genes [t(2;8)(p12;q24) and t(8;22)(q24;q11)], leading to constitutive MYC activation. However, additional genetic events contribute to BL pathogenesis, most of which have been studied in sBL only. Here, we performed RNA sequencing aiming to identify genetic changes possibly cooperating with MYC in the pathogenesis of eBL. We studied by RNA sequencing (Illumina HiScanSQ) 21 eBL cases, collected at different African institutions, as a discovery set. Total RNA was extracted with Trizol and libraries were prepared according to the TruSeq RNA sample preparation v2 protocol. Sequence variants were obtained using the SAV...
    By analogy with its human nectin1 counterpart, murine nectin1 serves as a cellular receptor for the entry of herpes simplex virus (HSV) into murine cells. HSV entry mediated by either receptor is dependent on the viral glycoprotein D (gD). Whereas human nectin1 binds gD at high affinity and in a saturable manner, murine nectin1 binds gD in a barely detectable fashion, depending on the sensitivity of the assay. The immunoglobulin type V domain of murine nectin differs from its human counterpart in 11 amino acids. To identify the key residues responsible for the high-affinity binding of gD to human nectin1, we replaced each of the 11 divergent amino acids with the human counterparts singly or in groups in an incremental manner. Replacement in murine nectin1 of six amino acids that lie within the gD binding region of human nectin1 (previously mapped to residues 64 to 94, likely the CC′C″ surface) increased the gD binding activity to a limited extent. In contrast, the single P138L subst...
    Cytosol-synthesized preproteins destined for the mitochondria are transported across the outer membrane by the translocase of the mitochondrial outer membrane (TOM complex). This dynamic transport machinery can be divided into receptors that recognize preprotein targeting signals and components of the general import pore complex that mediate preprotein transport across the outer membrane. This review focuses on recent studies dealing with the central questions regarding the pore-forming subunits, and architecture and gating of the translocation channel of the outer membrane.
    The LIBI project (International Laboratory of BioInformatics), which started in 2005 and will end in 2009, was initiated with the aim of setting up an advanced bioinformatics and computational biology laboratory, focusing on basic and applied research in modern biology and biotechnologies. One of the goals of this project has been the development of a Grid Problem Solving Environment, built on top of the EGEE, DEISA and SPACI infrastructures, to allow the submission and monitoring of jobs mapped to complex experiments in bioinformatics. In this work we describe the architecture of this environment and present several case studies and related results obtained using it.
    Eukaryotic Subcellular Localization DataBase collects the annotations of subcellular localization of eukaryotic proteomes. So far five proteomes have been processed and stored: Homo sapiens, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae and Arabidopsis thaliana. For each sequence, the database lists localization obtained adopting three different approaches: (i) experimentally determined (when available); (ii) homology-based (when possible); and (iii) predicted. The latter is computed with a suite of machine learning-based methods, developed in house. All the data are available at our website and can be searched by sequence, by protein code and/or by protein description. Furthermore, a more complex search can be performed combining different search fields and keys. All the data contained in the database can be freely downloaded in flat file format. The database is available at
    Knowing the number of residue contacts in a protein is crucial for deriving constraints useful in modeling protein folding and/or scoring remote homology search. Here we focus on the prediction of residue contacts and show that this figure can be predicted with a neural network-based method. The accuracy of the prediction is 12 percentage points higher than that of a simple statistical method. The neural network is used to discriminate between two different states of residue contacts, characterized by a contact number higher or lower than the average value of the residue distribution. When evolutionary information is taken into account, our method correctly predicts 69% of the residue states in the database, and it adds to the prediction of residue solvent accessibility. The predictor is available at http://www.biocomp.unibo.it
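    The two-state labelling described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 6.5 Å distance cutoff, the use of one representative coordinate per residue, and the choice of a single mean over all residues as the threshold are all assumptions made for illustration. The abstract does not state how contacts are defined or how the average is taken.

```python
import math

def contact_numbers(coords, cutoff=6.5):
    """Count, for each residue, how many other residues lie within `cutoff`.

    `coords` holds one (x, y, z) point per residue (hypothetical input,
    e.g. C-alpha positions); the 6.5 cutoff is an assumed value.
    """
    counts = [0] * len(coords)
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                counts[i] += 1
                counts[j] += 1
    return counts

def two_state_labels(counts):
    """Label each residue 1 if its contact number exceeds the mean, else 0.

    Assumption: one global mean is used as the threshold.
    """
    mean = sum(counts) / len(counts)
    return [1 if c > mean else 0 for c in counts]
```

    Labels of this kind would serve as training targets for a classifier such as the neural network mentioned above; the sketch covers only the labelling step, not the prediction.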
    Motivation. Knowledge of the subcellular localization of a protein is fundamental for elucidating its function. It is difficult to determine the subcellular location for eukaryotic cells with experimental high-throughput procedures. Computational procedures are therefore needed for annotating the subcellular location of proteins in large-scale genomic projects. Results. BaCelLo is a predictor for five classes of subcellular localization (secretory pathway, cytoplasm, nucleus, mitochondrion and chloroplast) and it is based on different SVMs organized in a decision tree. The system exploits the information derived from the residue sequence and from the evolutionary information contained in alignment profiles. It analyzes the whole sequence composition and the compositions of both the N- and C-termini. The training set is curated in order to avoid redundancy. For the first time a balancing procedure is introduced in order to mitigate the effect of biased training sets. Three kingdom-spec...
    The three papers in this special section come from the Fifth International Workshop on Algorithms in Bioinformatics (WABI '05).
