Genome Annotation and Tools


The Genome:

• The genome contains all the biological information required to build and maintain any given
living organism.
• The genome contains the organism's molecular history.
• Decoding the biological information encoded in these molecules will have an enormous impact on
our understanding of biology.
Genome annotation:
• It is the process of taking the raw DNA sequence produced by genome-sequencing projects and adding
the layers of analysis and interpretation necessary to extract its biological significance and place it into
the context of our understanding of biological processes.
Genome annotation is classified into two types:
• Structural genome annotation
• Functional genome annotation
• The first genome annotation software system was designed in 1995 by Dr. Owen White at The Institute for
Genomic Research, which sequenced and analysed the first genome of a free-living organism to be decoded, the
bacterium Haemophilus influenzae.
• The workflow involves assembling the reads into contigs, then aligning them to a reference genome or
using de novo assembly to obtain the complete genome.
• Variations such as mutations, SNPs and InDels can be identified.
• The genome is then annotated by structural and functional annotation.
• The main objective of genome annotation is to extract biological knowledge from anonymous genomic
sequence.
• Extensive use of computer tools is needed to minimise slow and costly human intervention. This is
the reason why annotation is often synonymous with prediction.
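The identification of variations mentioned above can be illustrated with a minimal sketch. It assumes the reference and the sample have already been aligned to the same length, and it only reports single-base mismatches. The function name find_snps and the example sequences are hypothetical; real variant callers work from read alignments and quality scores.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        # A gap character marks an insertion or deletion, not a SNP, so skip it.
        if "-" in (ref_base, sample_base):
            continue
        if ref_base != sample_base:
            snps.append((position, ref_base, sample_base))
    return snps


# Hypothetical aligned sequences that differ at position 5 (G in the reference, A in the sample).
reference = "ATGCGTACGT"
sample = "ATGCATACGT"
print(find_snps(reference, sample))
```

Gap positions are skipped on purpose, because they correspond to InDels and would need separate handling.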
Structural genome annotation:
• A detailed structural explanation of a gene is called structural genome annotation.
• The genome sequence is the first raw material.
• Genomic DNA is isolated from the desired source.
• Predicting the gene elements is a complex problem, and it is of primary importance because it
affects all the subsequent analyses.
• The genomic DNA is isolated by the Dellaporta method to obtain it in a pure form.
• The O.D. ratio (A260/A280) of pure genomic DNA is 1.8 to 2.0.
• Structural annotation helps in predicting or identifying elements such as introns, exons, CDS, and
start and stop codons in the genome.
• Pyrosequencing (a next-generation sequencing method) is performed on the purified
genomic DNA.
• The complete genome sequence is then available as a computer-readable sequence.
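The O.D. range of 1.8 to 2.0 quoted above is commonly read as the ratio of absorbance at 260 nm to absorbance at 280 nm. Assuming that interpretation, a purity check could be sketched as follows. The function names and the example readings are hypothetical, not measured values.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def looks_pure(a260: float, a280: float, low: float = 1.8, high: float = 2.0) -> bool:
    """Assumption: a ratio inside [low, high] is treated as pure DNA, as in the text."""
    return low <= a260_a280_ratio(a260, a280) <= high


# Hypothetical spectrophotometer readings.
print(looks_pure(0.95, 0.50))  # ratio of about 1.9, inside the range
print(looks_pure(0.75, 0.50))  # ratio of about 1.5, below the range
```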
Tools used:
• ORF finder
• Promoter 2.0
• ARNold
• GrailEXP
• GenomeScan
• Gene Finder
• ORF finder:
ORF finder is used to identify the open reading frames in the genome sequence.
• Promoter 2.0:
The Promoter 2.0 server is used to identify the promoter sequences present in the genome sequence.
• ARNold server:
The ARNold server is used to identify the terminator sequences present in the genome sequence.
• GrailEXP server:
The GrailEXP server is used to identify the repeated DNA sequences in the genome sequence.
• GenomeScan:
GenomeScan is used to identify the exons and introns in the genome sequence.
• Gene Finder:
Gene Finder is used to identify the 5’ splice sites and 3’ splice sites.
• Structural genome annotation identifies the structural components in the genome using
bioinformatics tools.
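To make the idea of an open reading frame concrete, here is a minimal sketch that scans the three forward reading frames for a start codon (ATG) followed in frame by a stop codon. It is an illustration only, not the algorithm used by the ORF finder tool: it ignores the reverse strand and alternative start codons, and the function name find_orfs and the min_codons parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) tuples with 0-based start and exclusive end.

    Only the forward strand is scanned. min_codons counts the start and stop codons.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon in the reported ORF
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence with one ORF (ATG AAA TTT TAG) starting at index 2.
print(find_orfs("CCATGAAATTTTAGGG"))
```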

• Physical map:
Identifying the distance from one gene to another, or between repeated regions, is called
physical mapping.
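The gene-to-gene distance idea can be sketched with simple coordinate arithmetic. This assumes each gene is described by 1-based inclusive start and end positions in base pairs on the same sequence. The gene names and coordinates below are hypothetical.

```python
def intergenic_distances(genes: dict[str, tuple[int, int]]) -> list[tuple[str, str, int]]:
    """Return (gene, next gene, distance in bp) for neighbouring genes.

    genes maps a gene name to (start, end), 1-based and inclusive.
    """
    ordered = sorted(genes.items(), key=lambda item: item[1][0])
    distances = []
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        # Bases strictly between the two genes; overlapping genes are reported as 0.
        distances.append((name_a, name_b, max(0, start_b - end_a - 1)))
    return distances


# Hypothetical coordinates: geneA and geneC overlap, geneC and geneB are 299 bp apart.
genes = {"geneA": (100, 500), "geneB": (1200, 1800), "geneC": (450, 900)}
print(intergenic_distances(genes))
```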
Functional genome annotation:
• A detailed functional explanation of a gene is called functional genome annotation.
• The output of structural genome annotation is the raw material for functional genome annotation.
• Without the structural genome annotation output, we cannot perform functional genome annotation.
• We can predict the mRNA that is produced from the gene.
• The gene segment in the genome sequence undergoes transcription and produces
heterogeneous nuclear RNA (hnRNA).
• The hnRNA undergoes splicing. The introns are removed and only
the exons remain.
• The exons are joined to form a mature mRNA.
• We can predict the structure of the protein that is produced from the mRNA.
• We can also predict the secondary structure of the protein with the CFSSP server, using the
primary structure of the protein in FASTA format.
• We can also check whether a tertiary structure of the protein is available using the PDB server.
• We can also predict whether the protein is a transmembrane protein or a non-transmembrane
protein using the TMHMM server.
• We can predict the signal peptide in the protein using the SignalP 5.0 server.
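The transcription and splicing steps described above can be sketched as follows. This is a simplified illustration that assumes the coding strand and the exon coordinates are already known from structural annotation. The function names, the toy gene and its coordinates are hypothetical.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA copy of a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exons of a primary transcript, dropping everything in between.

    exons holds 0-based (start, end) pairs with an exclusive end.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical gene: exon 1, an intron that begins with GT and ends with AG, then exon 2.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
exons = [(0, 6), (18, 24)]

pre_mrna = transcribe(gene)        # unspliced transcript, introns still present
mature_mrna = splice(pre_mrna, exons)
print(pre_mrna)
print(mature_mrna)
```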
Other features that can be determined by functional annotation are:
• Signal peptides
• Transmembrane domains
• Low complexity regions
• Various binding sites, glycosylation sites, etc.
• Protein domains
• Secretome
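As one example of the features listed above, potential N-linked glycosylation sites are often screened with the simplified sequence motif N-X-S/T, where X is any residue except proline. The sketch below assumes that motif. A match only marks a candidate site and is not evidence that the site is glycosylated. The function name and the example peptide are hypothetical.

```python
import re

# Lookahead so that overlapping candidate motifs are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glycosylation_motifs(protein: str) -> list[int]:
    """Return 1-based positions of the asparagine in each N-X-S/T motif (X not P)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: NGT at position 2 and NAS at position 10 match, NPS at position 6 does not.
print(find_n_glycosylation_motifs("MNGTLNPSANAS"))
```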
Thank You