
Question

3. [Bonus Problem] DNA Subsequence

A DNA sequence is a sequence of some combination of the characters A (adenine), C (cytosine), G (guanine), and T (thymine), which correspond to the four nucleobases that make up DNA.

Given a long DNA sequence, it is often necessary to compute the number of instances of a certain subsequence.

For this exercise, you will develop a program that processes a DNA sequence from a file and, given a subsequence s, searches the DNA sequence and counts the number of times s appears.

As an example, consider the following sequence:

GGAAGTAGCAGGCCGCATGCTTGGAGGTAAAGTTCATGGTTCCCTGGCCC

If we were to search for the subsequence GTA, it appears twice.
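The count in this example can be checked with a short sketch. It assumes that "appears" means a contiguous run of characters (the only reading under which GTA occurs exactly twice in the sequence above) and that overlapping matches are counted. The helper name countOccurrences is made up for illustration and is not part of the assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts contiguous, possibly overlapping,
   occurrences of pattern in text. Both must be NUL-terminated. */
int countOccurrences(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    int count = 0;
    /* an empty pattern is treated as "no matches" (an assumption) */
    if (pattern[0] == '\0')
        return 0;
    /* strstr returns a pointer to the next match, or NULL when none is left;
       restarting one character after each match allows overlaps */
    for (const char *p = strstr(text, pattern); p != NULL; p = strstr(p + 1, pattern))
        count++;
    return count;
}
```

Called with the sequence above and the pattern "GTA", this returns 2: the matches start at the 5th and the 27th character.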

You will write a program (place your source in a file named dnaSearch.c) that takes, as command line inputs, an input file name and a valid DNA (sub)sequence. That is, it should be callable from the command line as follows:

./dnaSearch dna01.txt GTA

Sample output: GTA appears 2 times
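The statement asks for a valid DNA (sub)sequence on the command line but does not say what to do with an invalid one. One option is to check the argument before searching. The sketch below assumes that only the uppercase letters A, C, G and T are valid and that an empty string is rejected; the name isValidDna is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is non-empty and contains only
   the characters A, C, G and T, otherwise returns 0. */
int isValidDna(const char *s)
{
    assert(s != NULL);
    /* an empty string is treated as invalid (an assumption) */
    if (s[0] == '\0')
        return 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] != 'A' && s[i] != 'C' && s[i] != 'G' && s[i] != 'T')
            return 0;
    }
    return 1;
}
```

A program could call isValidDna(argv[2]) right after checking the argument count and exit with an error message when it returns 0.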

Explanation / Answer

#include<stdio.h>

#include<string.h>

//Counts how many times target[] appears in DNA[] as a contiguous run of characters.
//Overlapping matches are counted. This is the reading the GTA example in the
//question requires; counting non-contiguous subsequences would give a larger number.
int countSubSequence(char DNA[],char target[],int txtLen,int targetLen)
{
    int i,count=0;
    //an empty target, or a target longer than the DNA text, cannot match
    if(targetLen==0||targetLen>txtLen)
        return 0;
    //compare target[] against every window of targetLen characters
    for(i=0;i+targetLen<=txtLen;i++)
    {
        if(memcmp(DNA+i,target,targetLen)==0)
            count++;
    }
    return count;
}

//main function with command line arguments:
//argv[1] is the input file name, argv[2] is the subsequence to search for
int main(int argc,char *argv[])
{
    //static so the large buffer is not placed on the stack;
    //the size is an assumed upper bound, the question does not give one
    static char DNA[1000000];
    char *target;
    int txtLen=0,targetLen,count,ch;
    //both the file name and the subsequence are required
    if(argc!=3)
    {
        printf("usage: %s inputFile subsequence\n",argv[0]);
        return 1;
    }
    //argv[2] is already a NUL-terminated string, so no copy is needed
    target=argv[2];
    targetLen=strlen(target);
    //open the file named by the first command line argument
    FILE *f=fopen(argv[1],"r");
    //stop if the file does not exist or cannot be read
    if(f==NULL)
    {
        printf("cannot open file %s\n",argv[1]);
        return 1;
    }
    //store the nucleobase characters in DNA[], skipping newlines and anything else
    while(txtLen<(int)sizeof(DNA)&&(ch=fgetc(f))!=EOF)
    {
        if(ch=='A'||ch=='C'||ch=='G'||ch=='T')
            DNA[txtLen++]=ch;
    }
    fclose(f); //closing the file that we opened
    //count how many times target appears in the DNA text
    count=countSubSequence(DNA,target,txtLen,targetLen);
    //print the result in the format of the sample output
    printf("%s appears %d times\n",target,count);
    return 0;
}
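The program stores the file contents in a fixed-size array, so a sequence longer than the array cannot be handled. If the input size is unknown, one alternative is a heap buffer that doubles when it fills up. This is only a sketch: the name readDna, the starting capacity of 64, and the choice to keep only the characters A, C, G and T are assumptions, not requirements from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads every A, C, G or T from f into a heap buffer
   that grows as needed. Returns a NUL-terminated string the caller must
   free(), or NULL if memory runs out. *len receives the character count. */
char *readDna(FILE *f, size_t *len)
{
    assert(f != NULL && len != NULL);
    size_t cap = 64, n = 0;
    char *buf = malloc(cap);
    int ch;
    if (buf == NULL)
        return NULL;
    while ((ch = fgetc(f)) != EOF) {
        if (ch != 'A' && ch != 'C' && ch != 'G' && ch != 'T')
            continue;               /* skip newlines and anything else */
        if (n + 1 == cap) {         /* keep one byte free for the NUL */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)ch;
    }
    buf[n] = '\0';
    *len = n;
    return buf;
}
```

The returned pointer and length can be passed to the counting function in place of the fixed array, followed by a call to free() once the count has been printed.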
