Computational Biology
Problems for Practice
1. Write Python code for transcribing DNA into RNA.
dna_sequence = "ACGTACGTATTTTTTTCGT"
rna_sequence = ""
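# Build the complementary RNA one nucleotide at a time
# (A -> U, C -> G, G -> C, T -> A), treating the DNA as the template strand
for nucleotide in dna_sequence:
    if nucleotide == "A":
        rna_sequence += "U"
    elif nucleotide == "C":
        rna_sequence += "G"
    elif nucleotide == "G":
        rna_sequence += "C"
    elif nucleotide == "T":
        rna_sequence += "A"
print(rna_sequence)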
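The loop above can also be expressed with Python's built-in str.maketrans and str.translate, which apply the whole substitution table in one pass. This is a minimal sketch, assuming the same template-strand convention as the loop (A -> U, C -> G, G -> C, T -> A); the names COMPLEMENT_TABLE and transcribe are hypothetical and not part of the original solution.

```python
# Hypothetical lookup table: each DNA base maps to its complementary RNA base
COMPLEMENT_TABLE = str.maketrans("ACGT", "UGCA")

def transcribe(dna_sequence):
    # Uppercase first so lowercase input is handled the same way
    return dna_sequence.upper().translate(COMPLEMENT_TABLE)

print(transcribe("ACGTACGTATTTTTTTCGT"))
```

Characters outside A, C, G, T pass through unchanged here, whereas the loop silently drops them, so the two versions differ on malformed input.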
2. Write MATLAB code for transcribing DNA into RNA.
% Define the DNA sequence
dna_sequence = 'ACGTACGTATTTTTTTCGT';
% Initialize an empty string for the RNA sequence
rna_sequence = '';
% Loop through each nucleotide in the DNA sequence
for i = 1:length(dna_sequence)
    nucleotide = dna_sequence(i);
    % Find the RNA complement of each DNA nucleotide
    if nucleotide == 'A'
        rna_sequence = [rna_sequence 'U'];
    elseif nucleotide == 'C'
        rna_sequence = [rna_sequence 'G'];
    elseif nucleotide == 'G'
        rna_sequence = [rna_sequence 'C'];
    elseif nucleotide == 'T'
        rna_sequence = [rna_sequence 'A'];
    end
end
% Display the RNA sequence
fprintf('RNA Sequence: %s\n', rna_sequence);
3. Given an RNA sequence, write a Python program that translates the RNA
into its protein sequence, using the standard genetic code.
def translate_rna(rna_sequence):
    # Partial codon table; add the remaining codons as needed
    genetic_code = {
        'AUG': 'Methionine', 'UUU': 'Phenylalanine', 'UUC': 'Phenylalanine',
    }
    # Read complete codons only; codons missing from the table map to 'Unknown'
    protein_sequence = [genetic_code.get(rna_sequence[i:i+3], 'Unknown')
                        for i in range(0, len(rna_sequence) - 2, 3)]
    return protein_sequence
# Example RNA sequence
rna_sequence = "AUGUUUUUC"
print(f"Protein: {translate_rna(rna_sequence)}")
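The dictionary above lists only three codons. One compact way to fill in all 64 is to pair every codon, generated in U, C, A, G order, with the one-letter amino acid string of the standard genetic code. The sketch below assumes one-letter output, '*' for stop codons, translation that halts at the first stop codon, and 'X' for any codon containing a character other than U, C, A, G; the names CODON_TABLE and translate_to_letters are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; '*' marks a stop codon
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_to_letters(rna_sequence):
    protein = []
    # Read complete codons only, starting at the first nucleotide
    for i in range(0, len(rna_sequence) - 2, 3):
        amino_acid = CODON_TABLE.get(rna_sequence[i:i + 3], "X")
        if amino_acid == "*":
            break  # assumption: translation ends at the first stop codon
        protein.append(amino_acid)
    return "".join(protein)

print(translate_to_letters("AUGUUUUUC"))
```

For the example sequence this gives M, F, F, which agrees with the Methionine, Phenylalanine, Phenylalanine result of the three-codon table.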
4.Given an RNA sequence, write a MATLAB code that
translates the RNA into its protein sequence, using the
standard genetic code.
% Main script
rna_sequence = 'AUGUUUUUC';
protein_sequence = translate_rna(rna_sequence);
fprintf('Protein: %s\n', strjoin(protein_sequence, ', '));
% Function definition
function protein_sequence = translate_rna(rna_sequence)
    % Define the genetic code dictionary (add remaining codons as needed)
    genetic_code = containers.Map( ...
        {'AUG', 'UUU', 'UUC'}, ...
        {'Methionine', 'Phenylalanine', 'Phenylalanine'});
    % Initialize the protein sequence
    protein_sequence = {};
    % Iterate over the RNA sequence in steps of 3, reading complete codons only
    for i = 1:3:length(rna_sequence) - 2
        codon = rna_sequence(i:i+2);
        if isKey(genetic_code, codon)
            protein_sequence{end+1} = genetic_code(codon);
        else
            protein_sequence{end+1} = 'Unknown'; % Handle unknown codons
        end
    end
end
5. Calculate the length of each DNA sequence.
sequence1 = "ATCGATCGATCGTACG"
sequence2 = "TTAGGCTAATCGGCTA"
sequence1 = "ATCGATCGATCGTACG"
sequence2 = "TTAGGCTAATCGGCTA"
length_sequence1 = len(sequence1)
length_sequence2 = len(sequence2)
print("Length of sequence1:", length_sequence1)
print("Length of sequence2:", length_sequence2)
6. Calculate the length of each DNA sequence. Use MATLAB code.
sequence1 = "ATCGATCGATCGTACG"
sequence2 = "TTAGGCTAATCGGCTA"
% Define the sequences
sequence1 = 'ATCGATCGATCGTACG';
sequence2 = 'TTAGGCTAATCGGCTA';
% Calculate the lengths of the sequences
length_sequence1 = length(sequence1);
length_sequence2 = length(sequence2);
% Display the lengths of the sequences
fprintf('Length of sequence1: %d\n', length_sequence1);
fprintf('Length of sequence2: %d\n', length_sequence2);
7. You are given a DNA sequence as a string, and your
task is to write a Python program to calculate and
display the GC content (percentage of G and C
nucleotides) in the sequence.
sequence = "ATCGATCGATCGTACG"
def calculate_gc_content(sequence):
    # Convert the sequence to uppercase to handle lowercase inputs
    sequence = sequence.upper()
    # Calculate the count of 'G' and 'C' nucleotides
    g_count = sequence.count('G')
    c_count = sequence.count('C')
    # Calculate the total length of the sequence
    total_length = len(sequence)
    # An empty sequence has no GC content; avoid dividing by zero
    if total_length == 0:
        return 0.0
    # Calculate the GC content percentage
    gc_content = ((g_count + c_count) / total_length) * 100
    return gc_content
# Given DNA sequence
sequence = "ATCGATCGATCGTACG"
# Calculate the GC content
gc_content = calculate_gc_content(sequence)
# Display the result
print(f"GC Content: {gc_content:.2f}%")
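As a quick cross-check of the result, the given sequence consists of four repeats that each contain one G and one C, so it has 8 G/C nucleotides out of 16, or 50.00%. The short sketch below recomputes this with a single generator expression; the variable names gc_count and gc_percent are hypothetical.

```python
sequence = "ATCGATCGATCGTACG"

# Count every nucleotide that is either G or C
gc_count = sum(1 for base in sequence.upper() if base in "GC")
gc_percent = 100 * gc_count / len(sequence)

print(f"{gc_count} of {len(sequence)} nucleotides are G or C: {gc_percent:.2f}%")
```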
8. You are given a DNA sequence as a string, and your task is to
write a MATLAB program to calculate and display the GC
content (percentage of G and C nucleotides) in the sequence.
sequence = "ATCGATCGATCGTACG"
% Main script
sequence = 'ATCGATCGATCGTACG';
% Calculate the GC content
gc_content = calculate_gc_content(sequence);
% Display the result
fprintf('GC Content: %.2f%%\n', gc_content);
% Function definition
function gc_content = calculate_gc_content(sequence)
    % Convert the sequence to uppercase to handle lowercase inputs
    sequence = upper(sequence);
    % Initialize the count for G and C nucleotides
    g_count = 0;
    c_count = 0;
    % Loop through each nucleotide in the sequence
    for i = 1:length(sequence)
        if sequence(i) == 'G'
            g_count = g_count + 1;
        elseif sequence(i) == 'C'
            c_count = c_count + 1;
        end
    end
    % Calculate the total length of the sequence
    total_length = length(sequence);
    % Calculate the GC content percentage
    gc_content = ((g_count + c_count) / total_length) * 100;
end
9. Determine if the RNA sequences contain any stop codons
(UAA, UAG, UGA). If yes, report their positions.
sequence1 = "AUCGAUCGAUCGUACGUGGUAAUAAUGA"
sequence2 = "UUAGGCUAAUCGGCUA"
sequence1 = "AUCGAUCGAUCGUACGUGGUAAUAAUGA"
sequence2 = "UUAGGCUAAUCGGCUA"
stop_codons = ["UAA", "UAG", "UGA"]
def find_stop_codons(sequence):
    # Scan codon by codon from the first nucleotide (reading frame 0);
    # reported positions are 0-based string indices
    stop_codon_positions = []
    for i in range(0, len(sequence) - 2, 3):
        codon = sequence[i:i + 3]
        if codon in stop_codons:
            stop_codon_positions.append(i)
    return stop_codon_positions
stop_codon_positions_sequence1 = find_stop_codons(sequence1)
stop_codon_positions_sequence2 = find_stop_codons(sequence2)
if stop_codon_positions_sequence1:
    print("Stop codons found in sequence1 at positions:", stop_codon_positions_sequence1)
else:
    print("No stop codons found in sequence1")
if stop_codon_positions_sequence2:
    print("Stop codons found in sequence2 at positions:", stop_codon_positions_sequence2)
else:
    print("No stop codons found in sequence2")
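The program above reads codons starting at index 0 only, so a stop codon that sits in another reading frame is not reported. If the question is read as asking for stop codons anywhere in the sequence, one possible variant scans all three frames. This is a sketch under that assumption; the name find_stop_codons_all_frames is hypothetical, and positions are 0-based.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_stop_codons_all_frames(sequence):
    # Map each reading frame offset (0, 1, 2) to the 0-based start
    # positions of the stop codons found in that frame
    return {
        frame: [i for i in range(frame, len(sequence) - 2, 3)
                if sequence[i:i + 3] in STOP_CODONS]
        for frame in range(3)
    }

for name, seq in [("sequence1", "AUCGAUCGAUCGUACGUGGUAAUAAUGA"),
                  ("sequence2", "UUAGGCUAAUCGGCUA")]:
    print(name, find_stop_codons_all_frames(seq))
```

For sequence1 all three stop codons fall in frame 1 (positions 19, 22 and 25), which is why a frame-0 scan finds none; for sequence2 there is one in frame 0 (position 6) and one in frame 1 (position 1).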
10. Determine if the RNA sequences contain any stop codons
(UAA, UAG, UGA). If yes, report their positions. Give MATLAB code.
sequence1 = "AUCGAUCGAUCGUACGUGGUAAUAAUGA"
sequence2 = "UUAGGCUAAUCGGCUA"
% Define sequences
sequence1 = 'AUCGAUCGAUCGUACGUGGUAAUAAUGA';
sequence2 = 'UUAGGCUAAUCGGCUA';
% Define stop codons
stop_codons = {'UAA', 'UAG', 'UGA'};
% Find stop codons in sequences
stop_codon_positions_sequence1 = find_stop_codons(sequence1, stop_codons);
stop_codon_positions_sequence2 = find_stop_codons(sequence2, stop_codons);
% Display results for sequence1
if ~isempty(stop_codon_positions_sequence1)
    fprintf('Stop codons found in sequence1 at positions: ');
    fprintf('%d ', stop_codon_positions_sequence1);
    fprintf('\n');
else
    fprintf('No stop codons found in sequence1\n');
end
% Display results for sequence2
if ~isempty(stop_codon_positions_sequence2)
    fprintf('Stop codons found in sequence2 at positions: ');
    fprintf('%d ', stop_codon_positions_sequence2);
    fprintf('\n');
else
    fprintf('No stop codons found in sequence2\n');
end
% Function definition to find stop codons
function stop_codon_positions = find_stop_codons(sequence, stop_codons)
    % Scan codon by codon from the first nucleotide (reading frame 1);
    % reported positions are 1-based indices
    stop_codon_positions = [];
    for i = 1:3:length(sequence) - 2
        codon = sequence(i:i + 2);
        if any(strcmp(codon, stop_codons))
            stop_codon_positions = [stop_codon_positions, i];
        end
    end
end