Consider the Motif Finding Problem for the following DNA sample. Assume the size
ID: 669770 • Letter: C
Question
Consider the Motif Finding Problem for the following DNA sample. Assume the size of the motif is 3.
1
2
3
4
T
C
C
A
T
C
C
G
C
T
C
A
The following tree structure represents the complete search space for the problem above.
The leaf nodes 4, 5, 7, 8, 11, 12, 14, and 15 in the tree above represent all possible 3-mer starting positions in the DNA sample. For example, node 7 represents the alignment with starting positions (1,2,1), i.e., the alignment for the 3-mers TCC, CCG, and CTC. The other nodes in the tree represent partial 3-mer starting positions. For example, the partial starting positions at node 6 is (1,2,-), and the corresponding alignment for the 3-mers involving only the first two sequences is TCC and CCG.
Run the BranchAndBoundMotifSearch algorithm for the Motif Finding problem above to answers the questions below.
Find the score values for the following starting position vectors:
Score(s4) = ___7_____
Score(s5) = _________
Score(s7) = _________
Score(s8) = _________
Score(s11) = ________
Score(s12) = ________
Score(s14) = ________
Score(s15) = ________
Find the score values for the following partial starting position vectors:
Score(s3,2) = ___6____
Score(s6,2) = ___4____
Score(s10,2) = _______
Score(s13,2) = _______
Score(s9,1) = ________
Score(s1,0) = ________
Write the 3-mer consensus motif found by the BranchAndBoundMotifSearch algorithm: _____________
1
2
3
4
T
C
C
A
T
C
C
G
C
T
C
A
Explanation / Answer
#search.py
#BranchandBoundMotifSearch
def Score(s,i,l,dna):
t=len(dna) #t=number of dna sequences
alignment=[][]
for j in range(0,i):
alignment.append(dna[j][s[j].s[j]+l])
profile=[[],[],[]]
for j in range(0,4):
profile[j]=[0]*l
for c in range(0,l): #for each column number c
for r in range(0,i): #for each row number r in column c
if alignment[r][c]=='a':
profile[0][c]=profile[0][c]+1
elif alignment[r][c]=='t':
profile[1][c]=profile[1][c]+1
elif alignment[r][c]=='g':
profile[2][c]=profile[2][c]+1
else
profile[3][c]=profile[3][c]+1
score=0
for c in range(0,l):
score=score+max([profile[0][c],profile[1][c],profile[2][c],profile[3][c]])
return score
def NextVertex(a,i,l,k):
if i<l
a[i]=0
return(a,i+1)
else
for j in range(l,0,-1): #level l to 1
if a[j-1]<k-1:
a[j-1]=a[j-1]+1
return(a,j)
return(a,0)
defBypass(a,i,l,k):
for j in range(i,0,-1): #level i down to 1
if a[j-1]<k-1:
a[j-1]=a[j-1]+1
return(a,j)
return(a,0)
def BB_MotifSearch(dna,t,n,l):
s=[0]*t
bestScore=0
i=1 #current tree level
while i>0
if i<t
optimisticScore=Score(s,i,l,dna)+(t-1)*l
if optimisticScore<bestScore:
(s,i)=Bypass(s,i,t,n-l+1)
else
(s,i)=NextVertex(s,i,t,n-l+1)
else
score=Score(s,t,l,dna)
if score>bestScore
bestScore=score
bestMotif=s[i] #make a copy
(s,i)=NextVertex(s,i,t,n-l+1)
return(bestScore,bestMotif)
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.