p2 Python Project

This program asks the user to enter a number of DNA sequences. It then finds the consensus sequence by analyzing the sequences column by column, counting how often each nucleotide appears and selecting the most frequent nucleotide in each column. It outputs the consensus sequence and displays the nucleotide frequencies for each column.
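The column-by-column analysis described above can be sketched compactly as follows. This is a minimal illustration, not the program's own implementation: it assumes all sequences have the same length, the name `consensus_sketch` is hypothetical, and it uses `collections.Counter` and `zip`, which the original program does not use. Ties are broken in the order A, C, G, T, which is an arbitrary choice made for this sketch.

```python
from collections import Counter


def consensus_sketch(sequences):
    """Return (consensus, per-column counts) for equal-length DNA strings."""
    consensus = []
    column_counts = []
    # zip(*sequences) yields one tuple of nucleotides per column.
    for column in zip(*sequences):
        counts = Counter(column)
        column_counts.append(counts)
        # max() returns the first maximal item, so ties resolve in the
        # order A, C, G, T (an assumption of this sketch).
        consensus.append(max("ACGT", key=lambda n: counts[n]))
    return "".join(consensus), column_counts


seqs = ["ACGT", "ACGA", "TCGT"]
result, counts = consensus_sketch(seqs)
print("Consensus:", result)
```

Because `Counter` returns 0 for a missing key, nucleotides that never occur in a column need no special handling.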

Uploaded by

Daniella Vargas

"""This program asks the user to enter a number of DNA sequences and finds the
consensus sequence. The output is the consensus.

Add the corresponding code to accomplish the requested tasks
"""

##### ADD YOUR NAME, Student ID, and Section number #######
# NAME: Daniella Vargas Figueroa
# STUDENT ID:802228453
# SECTION:096
###########################################################

# Auxiliary functions

# The function valid_seq() checks whether the given sequence is valid.
# seq: a string containing the sequence entered by the user
def valid_seq(seq):
    isvalid = False
    # Every character must be A, C, T, or G; stop at the first invalid one.
    # An empty sequence is reported as invalid.
    for s in seq:
        if s in "ACTG":
            isvalid = True
        else:
            isvalid = False
            break
    return isvalid
# max_nuc() takes four inputs, the frequency of each nucleotide in a column,
# determines which nucleotide is most frequent,
# and returns a list with two elements: the nucleotide with maximum frequency and
# its frequency.
# A, G, C, T: the number of occurrences of each nucleotide in the column
def max_nuc(A, G, C, T):
    # Using >= guarantees a result when counts tie (ties resolve in the order
    # A, G, C, T); strict > comparisons would fall through and return None.
    if A >= G and A >= C and A >= T:
        return ["A", A]
    elif G >= C and G >= T:
        return ["G", G]
    elif C >= T:
        return ["C", C]
    else:
        return ["T", T]
#########################
# The function load_data takes the number of sequences as an argument, reads the
# DNA sequences from the user, saves them in a list, and returns the list.
# a: the number of sequences to be input
def load_data(a):
    # Create a counter for the while loop.
    counter = a
    # Create an empty list named sequences.
    sequences = []
    # The while loop keeps adding entered sequences to the list until it holds
    # the number of sequences the user requested.
    while counter > 0:
        seq = input("DNA sequence: ")
        if valid_seq(seq):
            sequences.append(seq)
            counter -= 1
        else:
            print("Invalid Input. Try again")
    # Create a new list holding all the valid sequences.
    validseq = []
    for i in sequences:
        if valid_seq(i):
            validseq.append(i)
    return validseq
# input sequences
# validate sequences
# save list
# return list
# New function to sort the frequencies from greatest to least for the
# challenge.
def order(l):
    # Reverse each [label, count] pair so the count comes first, sort the list
    # in ascending order, then reverse it to get greatest to least.
    for element in l:
        element.reverse()
    l.sort()
    l.reverse()
    # Restore each pair to [label, count] order.
    for element in l:
        element.reverse()
    return l

# The function count_nucl_freq takes the list returned by load_data and returns
# the frequencies of the nucleotides for each column.
# a: a list of DNA sequences
def count_nucl_freq(a):
    # Create an empty list to store each column's most frequent nucleotide.
    frequencies = []
    # Another empty list to store the ordered frequencies for the challenge.
    bono = []
    # Use for loops to count the frequency of each letter in each column.
    for i in range(0, len(a[0])):
        columnfrec = [0, 0, 0, 0]
        for j in range(0, len(a)):
            let = a[j][i]
            if let == "A":
                columnfrec[0] = columnfrec[0] + 1
            elif let == "G":
                columnfrec[1] = columnfrec[1] + 1
            elif let == "C":
                columnfrec[2] = columnfrec[2] + 1
            else:
                columnfrec[3] = columnfrec[3] + 1
        # Append each letter frequency from greatest to least for the challenge display.
        bono.append(order([["A:", columnfrec[0]], ["G:", columnfrec[1]],
                           ["C:", columnfrec[2]], ["T:", columnfrec[3]]]))  # bonus
        # Append the maximum frequency of this column to the list frequencies.
        frequencies.append(max_nuc(columnfrec[0], columnfrec[1], columnfrec[2],
                                   columnfrec[3]))
    # Return both lists.
    return frequencies, bono

# analyze the list by columns
# find nucleotide frequencies
# find the nucleotide with the maximum number of repetitions for each column
# append the output from the max_nuc() function to a list Result

# The function find_consensus takes the frequency list from count_nucl_freq and
# returns a consensus sequence.
# a: the list of [nucleotide, frequency] pairs returned by count_nucl_freq
def find_consensus(a):
    freq_lst = a
    consensusString = ""
    # Take the element at index 0 (the nucleotide) of each pair in the frequency
    # list and add it to the consensus string.
    for element in freq_lst:
        x = element[0]
        consensusString = consensusString + x
    return consensusString

# The function main starts the program and makes the function calls.

def main():
    # Ask for the number of DNA sequences.
    n_seq = int(input('Number of DNA sequences: '))
    # Call all the functions defined above.
    list_seq = load_data(n_seq)
    list_freq, list_bono = count_nucl_freq(list_seq)
    consensus = find_consensus(list_freq)
    # Display the DNA consensus.
    print("Consensus:", consensus)
    # Display the word Challenge.
    print("Challenge:")
    # Loop over the columns to display the frequency of each letter in each column.
    counter = 1
    for col in list_bono:
        # Assign a variable to each position accessed in the list
        # named list_bono.
        x = col[0][0]
        x2 = col[0][1]
        y = col[1][0]
        y2 = col[1][1]
        z = col[2][0]
        z2 = col[2][1]
        f = col[3][0]
        f2 = col[3][1]
        # Display each column number followed by each letter's frequency.
        print("Col", str(counter) + ": ", sep=" ", end="")
        counter += 1
        print(str(x) + '' + str(x2), end=" ")
        print(str(y) + '' + str(y2), end=" ")
        print(str(z) + '' + str(z2), end=" ")
        print(str(f) + '' + str(f2))

if __name__ == "__main__":
    main()
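
The `order` helper above sorts the `[label, count]` pairs by reversing each pair, sorting, and reversing again. As a possible alternative, the same ordering can be obtained with `sorted` and a key function. This is only a sketch: the name `order_by_frequency` is hypothetical, and unlike `order` it returns a new list instead of modifying its argument.

```python
def order_by_frequency(pairs):
    """Return [label, count] pairs sorted from highest to lowest count."""
    # Sorting on (count, label) in reverse matches the tie order produced by
    # the reverse/sort/reverse approach, without mutating the input list.
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]), reverse=True)


column = [["A:", 2], ["G:", 0], ["C:", 1], ["T:", 0]]
ordered = order_by_frequency(column)
for label, count in ordered:
    print(label + str(count), end=" ")
print()
```

Leaving the input untouched makes the helper easier to reuse, at the cost of building one extra list per column.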
