Question

GenBank Assignment

You are called to the scene of a gruesome murder in the primate house at the local zoo. An apprentice keeper has been strangled to death. The victim and your friend, the head primate keeper, were the only people in the locked building at the time of the murder. According to the policeman on the scene, your friend is the prime suspect. You find blood under the victim's fingernails and extract DNA from the blood. Though little intact DNA is present, you do get the following DNA sequence.

gaaaaaaaat cgagtaagag accactgtgg cagtgattgc acagaactgg aaaacactgt

Who is the killer?

Can this sequence data be used to determine the killer's identity?

Use the National Center for Biotechnology Information BLAST server to solve this mystery.

I'll send you to the BLAST site in a moment -- but first, some comments.

The NCBI site is a repository for various DNA and protein sequences and a lot of other information. There are zillions (well, actually something over 99 billion, as of 2008) of bases of DNA information in there. Thousands of organisms are represented. Many have ALL of their DNA sequence (their entire genome) in the database. Most have varying amounts. This is a fantastic resource for researchers. I use it almost every day when I get gene sequences from the fungi I work on. I submit my data to see if there are any similar genes in the databank. When one lines up, I have a way to identify what gene I have pulled out (if the gene in the databank has been identified in some other organism and mine lines up against it, then I can infer that it is likely my gene codes for something similar to what the one in the databank codes for).

Anyway, you are going to get a crack at using the GenBank database. Because the site has all kinds of links and it takes some familiarity with all this stuff to navigate, I will send you straight to the appropriate search engine and even set the defaults for you (all in the magic of a single URL!). Your job will be to submit the sequence, run the search and interpret the results.
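For anyone who would rather script this than paste by hand, here is a minimal Python sketch of the "single URL" idea. It strips the spacing from the 60-base sequence and builds a query string in the style of NCBI's BLAST URL API. The parameter names (CMD, PROGRAM, DATABASE, QUERY) follow that API as I understand it, and the helper names are made up for illustration. This is not the link the instructor prepared, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Assumed endpoint for NCBI's BLAST URL API; check the current NCBI docs.
BLAST_ENDPOINT = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"

# The sequence exactly as printed in the assignment (blocks of 10 bases).
RAW_SEQUENCE = "gaaaaaaaat cgagtaagag accactgtgg cagtgattgc acagaactgg aaaacactgt"


def clean_sequence(raw):
    """Keep only nucleotide letters, dropping spaces, digits and newlines."""
    return "".join(ch for ch in raw.lower() if ch in "acgtn")


def build_blast_url(sequence, program="blastn", database="nt"):
    """Build (but do not send) a BLAST submission URL for the sequence."""
    params = {
        "CMD": "Put",          # assumed: 'Put' submits a new search
        "PROGRAM": program,    # nucleotide-vs-nucleotide search
        "DATABASE": database,  # assumed: 'nt' nucleotide collection
        "QUERY": sequence,
    }
    return BLAST_ENDPOINT + "?" + urlencode(params)


query = clean_sequence(RAW_SEQUENCE)
print(len(query))              # number of bases after cleaning
print(build_blast_url(query))
```

Cleaning first matters because the spaces between the blocks of ten are formatting, not sequence data.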

Here's what you do:

Highlight the sequence (60 bases long) above and copy it (use "Edit" on the menu bar and then "Copy", OR press Ctrl-C, OR right click your mouse and select "Copy").

Click on the link at the bottom of the page, to BLASTn. (It will open in a new window, and you can switch back and forth from that one to this one (hold down "alt" and hit the "tab" key to switch windows) if you need to read the instructions further).

Paste the sequence into the text box (it says "Enter accession numbers, gis, or FASTA sequences" above it). To paste, click in the box and use "Edit" then "Paste", OR Ctrl-V, OR right mouse click and select "Paste".

DON'T type in any other box or change ANY of the already selected options.

Click the "BLAST!" button to submit the search. In the resulting window, click the "Format!" button (it looks like the "BLAST!" button).

A window with your search results will open (as a separate window). If you get a screen saying it will retry in X number of seconds, you can wait. It may take several minutes before the results come back (it is a BIG database, and lots of people use it).

There is a lot of information on this screen. Scroll down past the distribution graphic (the box with colored lines), and scroll past the list of sequences. They're sorted in order from the best match down. Further down the page you will see the Alignments.

The Alignments section shows exactly how your sequence lined up against sequences in the database, with the two sequences displayed one above the other. Each alignment has a Score, which is a measure of how well they matched, and a line that says "Identities =", which tells how many of the total nucleotides were the same and then gives a percent identity in parentheses.
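To make the "Identities =" line concrete, here is a small Python sketch that counts matching positions in two already-aligned sequences and turns the count into a percentage. The function name and the toy sequences are made up for illustration, and the percentage is rounded to the nearest whole number here; BLAST's own rounding may differ.

```python
def identity(query_aln, subject_aln):
    """Return (matches, aligned_length, percent) for two aligned strings.

    Both strings must already be aligned (same length, '-' for gaps).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    # A position counts as identical only if both letters agree and
    # the position is not a gap.
    matches = sum(
        1 for q, s in zip(query_aln, subject_aln) if q == s and q != "-"
    )
    length = len(query_aln)
    percent = round(100 * matches / length)
    return matches, length, percent


# Toy 10-base alignment with one mismatch (made-up data, not BLAST output).
print(identity("acgtacgtac", "acgtacgtaa"))  # (9, 10, 90)
```

On the results page the same information appears in the form "Identities = matches/length (percent)".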

To determine the identity of the killer, you need to find the best match in the database. One way you can judge this is by the number of "identities" -- nucleotides that are the same. Try looking at the top scoring hit (at the top of the list), and the first one in the Alignments.
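If you copy the hits into a list, picking the best match by identities can be sketched like this in Python. The `Hit` record and the three entries are hypothetical placeholders, not real BLAST results; substitute the descriptions and counts you actually see on your results page.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    description: str      # organism or sequence name from the hit list
    identities: int       # number of identical nucleotides
    aligned_length: int   # number of positions in the alignment


def best_hit(hits):
    """Return the hit with the highest fraction of identical positions.

    Ties are broken in favour of the hit with more identical bases.
    """
    return max(
        hits, key=lambda h: (h.identities / h.aligned_length, h.identities)
    )


# Hypothetical values for illustration only.
hits = [
    Hit("hypothetical species A", 55, 60),
    Hit("hypothetical species B", 60, 60),
    Hit("hypothetical species C", 58, 60),
]
print(best_hit(hits).description)
```

Ranking by the fraction rather than the raw count keeps a short partial alignment from being compared unfairly against a full-length one.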

To run the search, click here: BLASTn

Now, here's the assignment:
Think about this carefully. Realize, the database has DNA sequences of various different species of organisms. It does not have DNA of a lot of specific individual organisms (at least, not much of that). Thus, it has a representative DNA sequence from a human, not DNA from Ralph, Jerry, Mary, your instructor, etc.

Question 1

What was the hypothesis of the Police Department?

Question 2

According to their hypothesis, what BLAST search results were predicted?

Question 3

Do the data and prediction agree?

Question 4

If not, "Who done it?" (Who do the data implicate?)

Question 5

How do you know?

Question 6

What other organisms had the most similar DNA sequences to this one, and how many of the 60 nucleotides were identical in them?

Explanation / Answer

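1. Hypothesis of the police:

The police hypothesized that your friend, the head primate keeper, is the killer, because the friend and the victim were the only people in the locked building at the time of the murder.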

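2. Predicted and actual BLAST results:

If the police were right, the blood under the victim's fingernails would be human, so the best BLAST match should be a human sequence. Instead, the sequence obtained from the blood matched gorilla: the top hit has 100% identity with a gorilla sequence.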

3. Do the data and prediction agree?

No. The data and the prediction do not agree.

4. Who did it?

The BLAST analysis implicates a gorilla, one of the animals in the primate house, rather than a human.

5. How do you know?

The top hit shows 100% identity to a gorilla sequence, and no human sequence matches as well.

6. Other matches:

The next-best hit is human, with 97% identity to the given DNA sequence (roughly 58 of the 60 nucleotides, if the alignment spans the full sequence).