Answer Multiple Sequence Alignment (MSA) Practical 2

The document outlines a practical workflow for performing multiple sequence alignment (MSA) using tools like Clustal Omega, MAFFT, and MUSCLE, focusing on the P53 protein and mitochondrial cytochrome b sequences from various rodent species. It details steps for data collection, alignment execution, and result interpretation, including identifying conserved and variable regions. Additionally, it provides a case study with specific accession numbers and questions to assess understanding of the alignment process.

Uploaded by

mernagoodgirl666


Multiple sequence alignment (MSA)

Practical Workflow:
1. Data Collection:
• Download the sequences you want to align from public sequence
databases like NCBI. Ensure diversity in the dataset (sequences
from different species or variants for meaningful comparisons).
• Store the sequences in a file for easy access.

For now, we will investigate the evolutionary history and diversity of the
P53 protein, a critical player in cell regulation and DNA repair, using six P53
protein sequences from various species with the following accession
numbers:
AAC53040.1, BAA08629.1, AAA39883.1, AEG21062.2, AAL83290.1,
AAA39882.1
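
The download step can also be scripted. The sketch below is a minimal example, assuming the NCBI E-utilities `efetch` endpoint is reachable from your machine; the function names and the output file name are illustrative and are not part of the practical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities endpoint for fetching records by accession.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# The six P53 accession numbers listed above.
P53_ACCESSIONS = [
    "AAC53040.1", "BAA08629.1", "AAA39883.1",
    "AEG21062.2", "AAL83290.1", "AAA39882.1",
]

def build_efetch_url(accessions, db="protein"):
    """Return a URL that requests the given records as FASTA text."""
    query = urlencode({
        "db": db,
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"

def download_fasta(accessions, out_path):
    """Download the records into one FASTA file (requires network access)."""
    with urlopen(build_efetch_url(accessions)) as response:
        text = response.read().decode("utf-8")
    with open(out_path, "w") as handle:
        handle.write(text)
    # Count header lines to confirm how many records were stored.
    return sum(line.startswith(">") for line in text.splitlines())

# Example (hypothetical file name): download_fasta(P53_ACCESSIONS, "p53.fasta")
```

Keeping all records in a single FASTA file makes it easy to paste or upload the same input into each alignment tool.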

2. Clustal Omega:
• Clustal Omega employs a progressive alignment algorithm. It builds a
guide tree, aligning the most closely related sequences first and then
adding others.
• The tool generates a final MSA, highlighting conserved regions and
variable regions.
• Open Clustal Omega, a multiple sequence alignment tool. Go to
http://www.ebi.ac.uk/Tools/msa/clustalo/.

• Step 1 - You will get a page where you select the type of sequences to be
aligned (Protein, DNA or RNA), enter the sequences directly into
the input box in FASTA format (or upload a file in a supported format),
and set the output format.
Copy all of your sequences in FASTA format into the open frame
below the submission form, making sure that each sequence starts on
a new line with its own ">" header line. Clustal Omega will attempt to align these amino
acid sequences based on their similarities. Click RUN; your results
might take a few seconds.
• Step 2 - Set your parameters.

Multiple Sequence Alignment Tool Output Examples

Clustal Omega: Clustal w/o Numbers

MAFFT: Pearson/FASTA

MUSCLE: HTML

*For more information about the output formats, please check
(https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/Multiple+Sequence+Alignment+Tool+Output+Examples#MultipleSequenceAlignmentToolOutputExamples-ClustalOmegaproteinoutputexamples)
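
If you save the Clustal-format output as text, it can be read back into a script for further analysis. The following is a minimal sketch, assuming the usual Clustal layout (a header line starting with "CLUSTAL", then blocks of "name sequence" lines followed by a consensus-symbol line that begins with whitespace); the function name is illustrative.

```python
def parse_clustal(text):
    """Parse Clustal-format alignment text into {name: aligned_sequence}."""
    alignment = {}
    for line in text.splitlines():
        # Skip the header, blank lines, and consensus-symbol lines
        # (consensus lines start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        fields = line.split()
        name, chunk = fields[0], fields[1]  # a trailing residue count, if any, is ignored
        # Blocks for the same sequence are concatenated in order.
        alignment[name] = alignment.get(name, "") + chunk
    return alignment
```

The result maps each sequence name to its full gapped sequence, so every value has the same length as the alignment.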

3. MAFFT:
• Open MAFFT, another multiple sequence alignment tool
(https://www.ebi.ac.uk/Tools/msa/mafft/ ).
• Load the same set of sequences into MAFFT.
• MAFFT offers both progressive and iterative refinement methods. In
automatic mode it selects an appropriate strategy for your dataset
(e.g., the fast progressive FFT-NS-2 for large datasets, or the more
accurate iterative L-INS-i for small ones).
• MAFFT outputs an aligned sequence file.
4. MUSCLE:
• Repeat the process with MUSCLE
(https://www.ebi.ac.uk/Tools/msa/muscle/ ).
• MUSCLE employs a progressive method and also incorporates
iterative refinement.
• It aligns sequences and produces an aligned output.
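
The same three alignments can be run locally instead of through the web forms. The sketch below only builds the command lines and is an illustration under assumptions: it presumes that `clustalo`, `mafft`, and MUSCLE version 5 are installed and on your PATH (MUSCLE version 3 uses `-in` and `-out` instead), and the file names are hypothetical.

```python
import subprocess

def alignment_commands(fasta_in, prefix):
    """Build one command line per aligner; nothing is executed here."""
    return {
        # Clustal Omega: write Clustal-format output, overwriting old files.
        "clustalo": ["clustalo", "-i", fasta_in, "-o", f"{prefix}.clustalo.aln",
                     "--outfmt=clu", "--force"],
        # MAFFT: --auto lets MAFFT choose a strategy; output goes to stdout.
        "mafft": ["mafft", "--auto", fasta_in],
        # MUSCLE v5 syntax (assumption: version 5 is the installed version).
        "muscle": ["muscle", "-align", fasta_in, "-output", f"{prefix}.muscle.afa"],
    }

def run_all(fasta_in, prefix):
    """Run the three aligners in turn (requires the tools to be installed)."""
    for tool, cmd in alignment_commands(fasta_in, prefix).items():
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        if tool == "mafft":
            # MAFFT prints the alignment, so save it explicitly.
            with open(f"{prefix}.mafft.fasta", "w") as handle:
                handle.write(result.stdout)

# Example (hypothetical file name): run_all("p53.fasta", "p53")
```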
5. Comparison:
• Compare the results from Clustal Omega, MAFFT, and MUSCLE.
• Look for consensus in conserved regions and evaluate any
differences in variable regions.
• Different algorithms may produce slightly different alignments,
and some tools may perform better for specific types of
sequences or datasets.
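
One way to quantify how much two tools agree is to count the residue pairs that both alignments place in the same column. The sketch below is one simple measure, not the method used by any of the tools, and it assumes each alignment is a dictionary of equal-length gapped sequences with the same sequence names.

```python
from itertools import combinations

def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    A residue is identified as (sequence name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    counters = {name: 0 for name in names}
    pairs = set()
    length = len(next(iter(alignment.values())))
    for col in range(length):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, counters[name]))
                counters[name] += 1
        # Every pair of residues in this column is "aligned".
        pairs.update(combinations(present, 2))
    return pairs

def pair_agreement(aln_a, aln_b):
    """Fraction of aligned residue pairs in aln_a that also appear in aln_b."""
    pairs_a, pairs_b = aligned_pairs(aln_a), aligned_pairs(aln_b)
    return len(pairs_a & pairs_b) / len(pairs_a) if pairs_a else 1.0
```

A value of 1.0 means the second alignment reproduces every aligned pair of the first; lower values usually point to differences in gap placement within variable regions.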
6. Results Interpretation:
• Analyze the aligned sequences to identify conserved domains or
motifs.
• Evaluate the positions of indels and their potential impact on the
function or structure.
• Consider the evolutionary implications of the alignment.
Here's a step-by-step guide on how to interpret MSA results (e.g., Clustal Omega:
Clustal w/o Numbers):

In Clustal Omega results, you typically have access to various tabs or sections
that provide additional information and options for analyzing and extracting
information from your multiple sequence alignment.
(Please note: Depending on the specific software or interface you are using, the available
tabs and their functionalities may vary).

▪ Sequence Alignment Tab:

This is the primary tab where you can view the aligned sequences. It displays
the MSA itself, often with color-coding to represent conserved regions, gaps,
and sequence properties.

1. Conserved Regions:
• Start by identifying segments within the alignment where most of the
sequences have identical or highly similar amino acids. These are
conserved regions.
• Conserved regions are often functionally important. They may
correspond to critical structural elements or functional domains of the
protein or DNA.
Consensus Symbols:
"*" means that the residues or nucleotides in that column are identical in
all sequences in the alignment.
":" means that conserved substitutions have been observed, i.e., the amino
acid is replaced by one having strongly similar characteristics.
"." means that semi-conserved substitutions are observed, i.e., the amino
acids in that column have only weakly similar properties.
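
The consensus line can be recomputed from the aligned sequences. The sketch below assumes the "strong" and "weak" residue groups conventionally listed for the Clustal programs; treat the group lists as an assumption and check them against the documentation of the version you use.

```python
# Residue groups assumed from the Clustal documentation (not from this practical).
STRONG = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
        "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]

def column_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = {c.upper() for c in column}
    if "-" in residues:
        return " "                      # columns with gaps get no symbol
    if len(residues) == 1:
        return "*"                      # fully identical column
    if any(residues <= set(group) for group in STRONG):
        return ":"                      # all residues fit one strong group
    if any(residues <= set(group) for group in WEAK):
        return "."                      # all residues fit one weak group
    return " "

def consensus_line(sequences):
    """Build the consensus line for a list of equal-length aligned sequences."""
    return "".join(column_symbol(col) for col in zip(*sequences))
```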
2. Variable Regions:
• Look for segments in the alignment where the sequences show
variations. These variable regions may indicate adaptational differences
or regions with less functional constraint.
3. Gaps (Indels):
• Identify gaps or insertions/deletions (indels) in the alignment. Gaps
represent regions where sequences differ in length or have insertions or
deletions.
• Indels can be functionally significant, potentially indicating structural
variation or unique features in specific sequences.
4. Alignment Quality:
• Assess the overall quality of the alignment. A well-aligned region should
have minimal gaps and few sequence variations. Higher alignment quality
indicates stronger sequence similarity.
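
Two simple numbers help when judging alignment quality: the fraction of gap characters and the fraction of fully identical columns. The sketch below is a rough illustration, assuming the alignment is a dictionary of equal-length gapped sequences; it is not a formal quality score.

```python
def gap_fraction(alignment):
    """Fraction of all alignment cells that are gap characters."""
    cells = "".join(alignment.values())
    return cells.count("-") / len(cells)

def identical_column_fraction(alignment):
    """Fraction of columns where every sequence has the same residue and no gap."""
    columns = list(zip(*alignment.values()))
    identical = sum(1 for col in columns if "-" not in col and len(set(col)) == 1)
    return identical / len(columns)
```

A low gap fraction together with a high fraction of identical columns suggests closely related sequences, which is what you would expect for a strongly conserved protein.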

▪ Show Colours Tab:

Returning to your results, you will notice that there are various tabs at the upper
part of your results page. Click on the tab called "Show Colours". Now your
sequences appear in color.

The use of colors can be a visual guide to conservation
or divergence at each position in the alignment. Clustal programs use specific
colors to represent different amino acid properties. For example, in the Clustal
color scheme red marks small and hydrophobic residues (AVFPMILW), blue marks
acidic residues (DE), magenta marks basic residues (RK), and green marks residues
with hydroxyl, sulfhydryl, or amine groups plus glycine (STYHCNGQ).
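
As an illustration, a color scheme is just a lookup from residue to color class. The sketch below assumes the scheme documented for Clustal Omega results at EMBL-EBI (red for small and hydrophobic, blue for acidic, magenta for basic, green for hydroxyl, sulfhydryl, and amine residues plus glycine); other viewers use different schemes.

```python
# Assumed Clustal-style color classes; other tools may color residues differently.
CLUSTAL_COLOURS = {
    "red": "AVFPMILW",      # small and hydrophobic
    "blue": "DE",           # acidic
    "magenta": "RK",        # basic
    "green": "STYHCNGQ",    # hydroxyl, sulfhydryl, amine, plus glycine
}

def residue_colour(residue):
    """Return the color class for a one-letter amino acid code."""
    for colour, residues in CLUSTAL_COLOURS.items():
        if residue.upper() in residues:
            return colour
    return "grey"           # gaps and any other symbol
```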

▪ Guide Tree Tab:

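The "Guide Tree" is a graphical representation of the approximate relationships
among the sequences that were aligned, estimated from pairwise sequence
similarity. It helps you visualize how closely related or distant the sequences
are from one another. Sequences joined by short branches are the most similar
and are aligned first. The guide tree is built to decide the order of alignment,
so it should not be read as a full phylogenetic analysis.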
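
To see how a guide tree determines the alignment order, the idea can be sketched with a simple clustering. This is a simplified illustration and not Clustal Omega's actual procedure: it assumes a k-mer based distance and average-linkage joining, and all function names are hypothetical.

```python
from itertools import combinations

def kmer_set(seq, k=2):
    """All overlapping k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k=2):
    """1 minus the Jaccard similarity of the two k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    return 1 - len(ka & kb) / len(ka | kb)

def guide_tree(sequences, k=2):
    """Join the closest clusters first; return the tree as nested tuples."""
    clusters = {name: [name] for name in sequences}  # tree node -> member names

    def average_distance(x, y):
        pairs = [(m, n) for m in clusters[x] for n in clusters[y]]
        total = sum(kmer_distance(sequences[m], sequences[n], k) for m, n in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Pick the two clusters with the smallest average distance and merge them.
        x, y = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))
```

The innermost tuple holds the pair that would be aligned first; each enclosing tuple adds the next most similar sequence or group.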

▪ Phylogenetic Analysis Tab:

If your alignment was used to build a phylogenetic tree, this tab may include
options for visualizing or further analyzing the tree, such as selecting a root
node or adjusting tree display settings.
Case-study: Multiple Sequence Alignment of Mitochondrial Cytochrome b
in Rodents
Introduction: The objective of this experiment is to perform MSA on a set of
mitochondrial cytochrome b protein sequences from various rodent species.
MSA is a fundamental bioinformatics technique used to identify conserved
regions and sequence variations, aiding in the study of molecular evolution. In
this study, we aim to uncover evolutionary patterns and similarities among
rodent cytochrome b proteins.

Exercise 1:
o Data Collection: Retrieve the mitochondrial cytochrome b protein
sequences for eight different rodent species from the GenBank database.
Species included in the analysis are Mus musculus (house mouse), Rattus
norvegicus (brown rat), Cricetulus griseus (Chinese hamster),
Peromyscus eremicus (cactus mouse) and others.
Use the following accession numbers to download the sequences from
the Protein database in FASTA format, as before.
YP_001686710.1, AP_004904.1, YP_537131.1, YP_006073056.1,
YP_009245653.1, YP_009245095.1, YP_009186415.1, YP_009166339.1

o Clustal Omega, MAFFT, and MUSCLE Alignment: Use the three different
MSA tools: Clustal Omega, MAFFT, and MUSCLE. Apply each tool to align
the sequences separately with default parameters.
• Clustal Omega:
o Input the sequences into the Clustal Omega tool.
• MAFFT:
o Load the same set of sequences into MAFFT.
• MUSCLE:
o Repeat the process with MUSCLE.
Questions:
1. How many Cytochrome b protein sequences did you include in the
alignment?

Eight sequences

2. What is the length of the longest sequence and the shortest sequence?

The longest sequence is 381 residues long and the shortest
sequence is 379.

3. What are the conserved regions in the Clustal Omega alignment, and what
might these regions signify in terms of function or structure?

Conserved regions are often functionally important. They may
correspond to critical structural elements or functional domains of the
protein.

4. Did you observe any differences between the MAFFT alignment and the
Clustal Omega alignment? If so, what might explain these differences?
They differ in output format:
MAFFT: Pearson/FASTA
Clustal Omega: Clustal w/o Numbers

5. Are there any additional conserved regions or differences in the aligned
sequences revealed by MAFFT?
No, there are no additional conserved regions or differences.
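
Questions 1 and 2 can be checked directly from the FASTA file. The sketch below is a minimal example, assuming standard FASTA text (header lines starting with ">" followed by sequence lines); the function names are illustrative.

```python
def read_fasta(text):
    """Parse FASTA text into {accession: sequence}."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]   # keep only the accession
            records[header] = ""
        elif header is not None:
            records[header] += line        # join wrapped sequence lines
    return records

def length_summary(records):
    """Return (number of sequences, (longest, length), (shortest, length))."""
    lengths = {name: len(seq) for name, seq in records.items()}
    longest = max(lengths, key=lengths.get)
    shortest = min(lengths, key=lengths.get)
    return len(records), (longest, lengths[longest]), (shortest, lengths[shortest])
```

Run it on the unaligned input file, since aligned sequences include gap characters and all have the same length.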
Exercise 2:
o Repeat the previous exercise, adding the sequence "XP_059124694.1"
to the previous set of sequences.

o Clustal Omega, MAFFT, and MUSCLE Alignment: Use the three
different MSA tools: Clustal Omega, MAFFT, and MUSCLE.
Apply each tool to align the sequences separately.

Questions:
1. How many Cytochrome b protein sequences did you include in the
alignment?

2. What is the length of the longest sequence and the shortest sequence?

3. Do you observe any difference between this alignment and the previous
one?
