Download the complete genome sequence (..genomic.fna) for the bacterium Clostrid
ID: 3605630 • Letter: D
Question
Download the complete genome sequence (..genomic.fna) for the bacterium Clostridium phytofermentans from ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/Lachnoclostridium_phytofermentans/representative/GCF_000018685.1_ASM1868v1/ (or your favorite bacterial genome sequence). Write a program that reads this sequence (discarding the first line, which holds the > FASTA header) and writes a tab-delimited output file (Hint: use \t and \n in your .write statement) containing the count of each nucleotide and the total number of nucleotides.
Explanation / Answer
def readfile(filename):
    nucleotides_dict = {}
    with open(filename, 'r') as f:  # open the file in read mode
        next(f)  # skip the first line (the > FASTA header)
        lines = f.readlines()  # read the remaining lines into a list
    try:
        data = ''.join(lines)  # convert the list into one string
        data = data.replace('\n', '')  # remove newline characters
        total_nucleotides = len(data)  # total characters in the string
        nucleotides = list(set(data))  # unique characters in the string
        for nucleotide in nucleotides:  # count each nucleotide
            nucleotides_dict.update({nucleotide: data.count(nucleotide)})
        nucleotides_dict.update({'Total': total_nucleotides})
    except Exception as e:
        print(e)
    return nucleotides_dict


def writefile(nucleo_dict):
    with open('output.txt', 'w') as f:  # open the file in write mode
        for key, value in nucleo_dict.items():
            # one "name<TAB>count" row per line
            line = key + '\t' + str(value) + '\n'
            f.write(line)


if __name__ == '__main__':
    filepath = ''  # full path to GCF_000018685.1_ASM1868v1_genomic.fna
    nucleo_dict = readfile(filepath)  # read the file and count nucleotides
    writefile(nucleo_dict)  # write the counts to output.txt
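The answer above loads the whole genome into one string and calls data.count() once per distinct character. As a possible alternative, here is a minimal sketch that counts line by line with collections.Counter. The function names count_nucleotides and write_counts are hypothetical and not part of the original answer. Two choices here are assumptions that go slightly beyond the question: every line starting with > is skipped (not only the first), and letters are upper-cased before counting.

```python
from collections import Counter


def count_nucleotides(fasta_path):
    """Count each sequence character in a FASTA file, one line at a time."""
    counts = Counter()
    with open(fasta_path, 'r') as handle:
        for line in handle:
            # Assumption: skip every header line, not just the first one.
            if line.startswith('>'):
                continue
            # Assumption: treat lower-case and upper-case letters as the same base.
            counts.update(line.strip().upper())
    return counts


def write_counts(counts, out_path):
    """Write one 'name<TAB>count' row per nucleotide, then a Total row."""
    with open(out_path, 'w') as out:
        for base in sorted(counts):
            out.write(f"{base}\t{counts[base]}\n")
        out.write(f"Total\t{sum(counts.values())}\n")
```

To use it, call counts = count_nucleotides(path_to_fna) and then write_counts(counts, 'output.txt'). Reading line by line keeps only one line in memory at a time, and sorting the keys makes the row order in the output file reproducible.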