c++ For this lab, you will practice while loops, if statements, reading from an
Question
C++
For this lab, you will practice while loops, if statements, reading from an input file, writing to an output
file, and string operations. You are given a file containing protein sequences, with the task of finding
motifs. Motifs are certain patterns of amino acids that appear many times in protein sequences and
act as indicators or markers for special regions, genes, mutations, etc. An example of what this file will
look like is below:
>ENSG00000035141|ENST00000037869
MAELQQLRVQEAVESMVKSLERENIRKMQGLMFRCSASCCEDSQASMKQVHQCIERCHVP
LAQAQALVTSELEKFQDRLARCTMHCNDKAKDSIDAGSKELQVKQQLDSCVTKCVDDHMH
LIPTMTKKMKEALLSIGK*
>ENSG00000003137|ENST00000001146
MLFEGLDLVSALATLAACLVSVTLLLAVSQQLWQLRWAATRDKSCKLPIPKGSMGFPLIG
ETGHWLLQGSGFQSSRREKYGNVFKTHLLGRPLIRVTGAENVRKILMGEHHLVSTEWPRS
TRMLLGPNTVSNSIGDIHRNKRKVFSKIFSHEALESYLPKIQLVIQDTLRAWSSHPEAIN
VYQEAQKLTFRMAIRVLLGFSIPEEDLGHLFEVYQQFVDNVFSLPVDLPFSGYRRGIQAR
QILQKGLEKAIREKLQCTQGKDYLDALDLLIESSKEHGKEMTMQELKDGTLELIFAAYAT
TASASTSLIMQLLKHPTVLEKLRDELRAHGILHSGGCPCEGTLRLDTLSGLRYLDCVIKE
VMRLFTPISGGYRTVLQTFELDGFQIPKGWSVMYSIRDTHDTAPVFKDVNVFDPDRFSQA
RSEDKDGRFHYLPFGGGVRTCLGKHLAKLFLKVLAVELASTSRFELATRTFPRITLVPVL
HPVDGLSVKFFGLDSNQNEILPETEAMLSATV*
These lines:
>ENSG00000035141|ENST00000037869
>ENSG00000003137|ENST00000001146
Are the sequence names. They start with '>'. What follows is the protein sequence itself.
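To tell the two kinds of lines apart, checking the first character is enough. A minimal sketch (the helper name isSequenceName is made up for illustration and is not part of the lab handout):

```cpp
#include <string>

// Hypothetical helper (name not from the lab): returns true when the line
// is a sequence name, i.e. it is non-empty and starts with '>'.
bool isSequenceName(const std::string& line) {
    return !line.empty() && line[0] == '>';
}
```

Any line for which this returns false would then be treated as an amino acid line.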
Your task, should you choose to accept it, is to read through the given input file and:
• Output the sequence name to a file called motifSequences.txt followed by the amino acids
before the motif, a space, and the amino acids after the motif, if the motif is present in the
sequence.
• Count the number of motifs that exist in the file
• Count the number of protein sequences
• Count the number of amino acid lines without motifs
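The "before the motif, a space, after the motif" step can be sketched with std::string::find and substr. The helper name splitAroundMotif is hypothetical, and the sketch assumes that only the first occurrence of the motif in a line matters, which the assignment text does not state explicitly:

```cpp
#include <string>

// Hypothetical helper (not from the lab handout).
// If `motif` occurs in `line`, stores "<before> <after>" in `out` and
// returns true; otherwise leaves `out` untouched and returns false.
// Assumption: only the first occurrence of the motif is used.
bool splitAroundMotif(const std::string& line, const std::string& motif,
                      std::string& out) {
    const std::string::size_type pos = line.find(motif);
    if (pos == std::string::npos) {
        return false;  // motif not present in this line
    }
    out = line.substr(0, pos) + " " + line.substr(pos + motif.size());
    return true;
}
```

For example, calling it on "AAASLRBBB" with the motif "SLR" stores "AAA BBB" in out.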
Notes:
• This lab will involve creating an input file stream
• This lab will involve creating one output file stream for the proteins with the motif
• The only output to the screen will be the number of motifs that exist in the file, the number of
protein sequences, and the number of amino acid lines without motifs
• Because the input file is very large (1.4 MB, 25003 lines), do not copy the input file to your C
account. Instead use this section of code in your program to open the input file stream:
    ifstream proteinFile;
    proteinFile.open("/nfshome/mw3n/human_aa_chr2_partial.txt");
• Use "SLR" for the motif to search for
Sample output:
To the terminal:
Total number of sequences: 2063
Total number of lines with motifs: 534
Total number of lines without motifs: 22406
First 10 lines in motifSequences.txt:
>ENSG00000040933|ENST00000074304
PPVTRSVDTVNGRMVLPVDESLTEALGIRSKYA KDTLLKSVFGGAICRMYRFPTTDG
>ENSG00000040933|ENST00000074304
NHLRILEQMAESVLSLHVPRQFVKLLLEEDAARVCELEELGELSPCWE RQIVTQYQT
>ENSG00000072080|ENST00000168148
MISRMEKMTMMMKILIMFALGMNYWSCSGFPVYDYDPS DALSASVVKVNSQSLSPYL
>ENSG00000015568|ENST00000016946
IIDDGDSNLSVVKKLPVPLESVKQMLNSVMQELEDYSEGGPLYKNG NADSEIKHSTP
>ENSG00000183091|ENST00000172853
SVRGKVAPTTKTVDLDRALHAYKLQSSNLYKT TLPTGYRLPGDTPHFKHIKDTRYMS
Explanation / Answer
#include <stdio.h>

int main(void)
{
    FILE *fileptr1, *fileptr2;
    char filechar[40];
    int c;                      /* int, not char, so EOF can be detected */
    int delete_line, temp = 1;  /* temp is the current line number */

    printf("Enter file name: ");
    if (scanf("%39s", filechar) != 1)
        return 1;
    fileptr1 = fopen(filechar, "r");
    if (fileptr1 == NULL) {
        perror(filechar);
        return 1;
    }

    /* print the contents of the file */
    while ((c = getc(fileptr1)) != EOF)
        putchar(c);

    printf("\nEnter line number to be deleted and replaced: ");
    if (scanf("%d", &delete_line) != 1) {
        fclose(fileptr1);
        return 1;
    }
    /* discard the rest of the input line (fflush(stdin) is undefined) */
    while ((c = getchar()) != '\n' && c != EOF) { }

    /* take fileptr1 back to the start */
    rewind(fileptr1);
    /* open replica.c in write mode */
    fileptr2 = fopen("replica.c", "w");
    if (fileptr2 == NULL) {
        perror("replica.c");
        fclose(fileptr1);
        return 1;
    }

    while ((c = getc(fileptr1)) != EOF) {
        /* until the line to be replaced comes, copy the content across */
        if (temp != delete_line) {
            putc(c, fileptr2);
            if (c == '\n')
                temp++;
            continue;
        }
        /* read and skip the rest of the old line */
        while (c != '\n' && c != EOF)
            c = getc(fileptr1);
        /* take the new text from the user and place it in the new file */
        printf("Enter new text: ");
        while ((c = getchar()) != '\n' && c != EOF)
            putc(c, fileptr2);
        putc('\n', fileptr2);
        temp++;
    }
    fclose(fileptr1);
    fclose(fileptr2);

    /* replace the original file with the edited copy */
    remove(filechar);
    rename("replica.c", filechar);

    /* print the updated file */
    fileptr1 = fopen(filechar, "r");
    if (fileptr1 == NULL) {
        perror(filechar);
        return 1;
    }
    while ((c = getc(fileptr1)) != EOF)
        putchar(c);
    fclose(fileptr1);
    return 0;
}