Question

MATLAB ASSIGNMENT: I JUST NEED HELP FOR PART 2

HERE'S WHAT I DID FOR PART 1:

Part 1: Create a MATLAB script hw7_1.m that prompts the user for two DNA sequence strings and scores an alignment.

• Each A or T match contributes a value of +2
• Each C or G match contributes a value of +3
• Each mismatch or gap contributes a score of -2

You can assume the input will be valid (only DNA bases) and the sequences will be at most 99 bases in length. For example, two sequences and their corresponding score would be:

Sequence 1: ATGCTGACTGCA

Sequence 2: CTTGAGACG---

A/T score = 2 * 2 = +4

G/C score = 2 * 3 = +6

Mismatches + gaps = 8 * -2 = -16

Score = -6

Part 1:

clc;
clear;

% Read the DNA sequences from the user
seq1 = input('Please enter the first DNA sequence ', 's');
seq2 = input('Please enter the second DNA sequence ', 's');

% Any difference in length is treated as gaps, each scoring -2
score = 0;
score = score - (2*abs(numel(seq1) - numel(seq2)));

% Only the overlapping positions are compared base by base
shorterLength = min(numel(seq1), numel(seq2));

for index = 1:shorterLength
    if seq1(index) == seq2(index)
        if seq1(index) == 'A' || seq1(index) == 'T'
            score = score + 2;   % A/T match
        else
            score = score + 3;   % C/G match
        end
    else
        score = score - 2;       % mismatch or gap
    end
end

% Display the final score
score
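
For comparison, here is a minimal vectorized sketch of the same scoring rules, hard-coded to the example sequences from the assignment so the counts can be checked by eye. It is only one possible way to write it. The variable names (same, atMatch, cgMatch, nOther) are made up for this illustration, and it assumes a position where both strings hold '-' counts as a gap (-2) rather than as a match.

```matlab
% Vectorized sketch of the Part 1 scoring rules, using the assignment's example.
seq1 = 'ATGCTGACTGCA';
seq2 = 'CTTGAGACG---';

n = min(numel(seq1), numel(seq2));       % length of the overlapping region
a = seq1(1:n);
b = seq2(1:n);

same    = (a == b) & (a ~= '-');         % assumption: gap-vs-gap is not a match
atMatch = same & (a == 'A' | a == 'T');  % matches worth +2
cgMatch = same & (a == 'C' | a == 'G');  % matches worth +3

% Everything else in the overlap, plus any length difference, costs -2 each.
nOther = (n - sum(atMatch) - sum(cgMatch)) + abs(numel(seq1) - numel(seq2));

score = 2*sum(atMatch) + 3*sum(cgMatch) - 2*nOther;
fprintf('A/T matches: %d, C/G matches: %d, mismatches + gaps: %d, score: %d\n', ...
        sum(atMatch), sum(cgMatch), nOther, score);
```

With the example input this reports 2 A/T matches, 2 C/G matches, 8 mismatches + gaps and a score of -6, the same numbers as the worked example in the question.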

Part 2: Create a MATLAB script hw7_2.m (copy and modify Part 1). The second program will additionally assess whether each DNA sequence could be a coding strand for a protein (assume no introns). Scan each sequence (in both forward and reverse directions, in all 3 reading frames) to determine if there is at least one start codon (i.e. the coding strand DNA must contain “ATG”, which is transcribed to the RNA start codon “AUG”) with an in-frame stop codon (i.e. the coding strand DNA must contain “TAA”, “TAG” or “TGA”, which are transcribed to the RNA stop codons “UAA”, “UAG” or “UGA”, respectively). The output for each sequence should simply be a message that the sequence does or does not code for a protein. If the sequence does, report the number of amino acids in the translated protein.
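
One way to approach the scan described above is sketched below for a single sequence. It is a starting point under several assumptions, not a finished hw7_2.m. "Reverse direction" is taken to mean the string read backwards with fliplr (a reverse complement would also need the bases swapped). The amino acid count includes the ATG (Met) and excludes the stop codon. If several open reading frames exist, the longest is reported. The test sequence and the names bestLen, strands, inOrf and nAA are hypothetical.

```matlab
% Sketch for Part 2: look for ATG followed by an in-frame stop codon.
seq   = 'CCATGAAATGA';             % hypothetical test sequence
stops = {'TAA', 'TAG', 'TGA'};

bestLen = 0;                       % longest protein found, in amino acids
strands = {seq, fliplr(seq)};      % assumption: "reverse" = string read backwards

for d = 1:2
    s = strands{d};
    for frame = 1:3
        inOrf = false;             % true once a start codon has been seen
        nAA   = 0;                 % amino acids counted in the current ORF
        for k = frame:3:(numel(s) - 2)
            codon = s(k:k+2);
            if ~inOrf
                if strcmp(codon, 'ATG')
                    inOrf = true;
                    nAA   = 1;     % ATG codes for Met, the first amino acid
                end
            elseif any(strcmp(codon, stops))
                bestLen = max(bestLen, nAA);   % the stop codon is not translated
                inOrf   = false;
            else
                nAA = nAA + 1;
            end
        end
    end
end

if bestLen > 0
    fprintf('The sequence codes for a protein of %d amino acids.\n', bestLen);
else
    fprintf('The sequence does not code for a protein.\n');
end
```

For the test sequence, the third forward frame reads ATG AAA TGA, so the sketch reports a protein of 2 amino acids. In the actual script the same loop would be run once for each of the two sequences read with input.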

Explanation / Answer

Sequence 1: ATGCTGACTGCA
Sequence 2: CTTGAGACG

A/T score = 3 * 2 = 6
C/G score = 3 * 3 = 9
Non-match = 6 * -2 = -12
Total score = 3

Enter the first DNA sequence: ATGCTGACTGCA
Enter the second DNA sequence: CTTGAGACG

seq1 ---------ATGCTGACTGCA---------
seq2 CTTGAGACG---------------------
0 A/T matches
0 C/G matches
21 mismatches + gaps
score: -42

seq1 ---------ATGCTGACTGCA---------
seq2 -CTTGAGACG--------------------
0 A/T matches
0 C/G matches
20 mismatches + gaps
score: -40

seq1 ---------ATGCTGACTGCA---------
seq2 --CTTGAGACG-------------------
0 A/T matches
0 C/G matches
19 mismatches + gaps
score: -38

seq1 ---------ATGCTGACTGCA---------
seq2 ---CTTGAGACG------------------
1 A/T matches
1 C/G matches
16 mismatches + gaps
score: -27

...(others not shown)...

seq1 ---------ATGCTGACTGCA---------
seq2 ---------------------CTTGAGACG
0 A/T matches
0 C/G matches
21 mismatches + gaps
score: -42

Maximum score = ???
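
The sliding comparison in the sample run can be sketched roughly as follows. This is an illustration under assumptions rather than the code that produced the output. seq1 is padded with numel(seq2) dashes on each side, seq2 is placed at every offset, and columns where both rows hold '-' are not counted. The names row1, row2, used, bestScore and bestShift are hypothetical.

```matlab
% Sketch: slide seq2 across a padded seq1 and score every offset.
seq1 = 'ATGCTGACTGCA';
seq2 = 'CTTGAGACG';
n1 = numel(seq1);
n2 = numel(seq2);
width = n1 + 2*n2;

row1 = [repmat('-', 1, n2), seq1, repmat('-', 1, n2)];  % seq1 stays fixed
bestScore = -Inf;
bestShift = 0;

for shift = 0:(n1 + n2)
    row2 = repmat('-', 1, width);
    row2(shift+1 : shift+n2) = seq2;                    % place seq2 at this offset

    used   = (row1 ~= '-') | (row2 ~= '-');             % ignore gap-vs-gap columns
    same   = (row1 == row2) & (row1 ~= '-');
    nAT    = sum(same & (row1 == 'A' | row1 == 'T'));   % matches worth +2
    nCG    = sum(same & (row1 == 'C' | row1 == 'G'));   % matches worth +3
    nOther = sum(used) - nAT - nCG;                     % mismatches + gaps
    score  = 2*nAT + 3*nCG - 2*nOther;

    if score > bestScore                                % keep the first best offset
        bestScore = score;
        bestShift = shift;
    end
end

fprintf('Best score %d at shift %d\n', bestScore, bestShift);
```

At shift 0 this gives 21 mismatches + gaps and a score of -42, and at shift 3 it gives 1 A/T match, 1 C/G match and a score of -27, in line with the rows shown above.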