NCBI BLAST = http://blast.ncbi.nlm.nih.gov/Blast.cgi 1. Using the unknown sequen
Question
NCBI BLAST = http://blast.ncbi.nlm.nih.gov/Blast.cgi
1. Using the unknown sequence provided at the bottom of this document, perform a BLASTN (i.e., nucleotide BLAST) search with “Database” set to “RefSeq Genome Database (refseq_genomes)” and “Organism” set to “vertebrates (taxid:7742)”:
Also, make sure the Program Optimization setting is set to "Highly similar sequences (megablast)":
Keep all other settings and parameters at the default value.
1.a. How many Blast Hits do you get for the sequence?
Answer =
1.b. What is the sequence most similar to? Provide the top-listed sequence Accession as your answer.
Answer =
1.c. What is the likely SOURCE ORGANISM for this sequence?
Answer =
1.d. What is the E-value for this sequence and is it significant and why?
Answer =
1.e. What is the Max score for this sequence?
Answer =
1.f. What is the Total score for the top BLASTN search result?
Answer =
1.g. What is the Query coverage for the top BLASTN search result?
Answer =
1.h. What is the Ident (i.e., Identity) for the top BLASTN search result?
Answer =
1.i. Are there any other results having the same E-value as this sequence? If so, how many?
Answer =
1.j. For the other results having the same E-value as this sequence, how many also have the same Query coverage?
Answer =
1.k. For those results having the same E-value and same Query coverage, do they also have the same Total score?
Answer =
[Screenshot: BLAST "Choose Search Set" panel. Database: RefSeq Genome Database (refseq_genomes); Organism: vertebrates (taxid:7742)]
Explanation / Answer
Because the nucleotide sequence is not included with this problem, a substitute query sequence with the following properties was used.
Query Length 2597 nucleotides
Database Name refseq_genomes (241 databases)
Keeping the required settings as given, the following were observed.
1a. 100 Blast Hits
1b. NC_034576.1 Mus caroli chromosome 7, CAROLI_EIJ_v1.1
1c. Mus caroli (Ryukyu mouse, also called the ricefield mouse)
1d. 2e-169.
The E-value (Expect value) is the number of hits with a similar or better score that one can "expect" to see by chance when searching a database of a given size. In other words, it measures the background noise of random, non-relevant matches. The lower the E-value, or the closer it is to zero, the more significant the match. An E-value of 1 for a subject sequence hit means that in a database of that size one can expect to see, by chance, 1 match with a similar score. Here the top-listed sequence has the lowest E-value of all hits, 2e-169, which is effectively zero, so the match is highly significant and, judged by E-value alone, the most significant one in the results.
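The relationship between score, search-space size, and E-value can be sketched with the Karlin-Altschul estimate E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the database length. This is a minimal illustration, not how the BLAST server reports its numbers: BLAST uses effective (edge-corrected) lengths, the database size `ASSUMED_DB_LEN` is a hypothetical value that does not come from the search above, and the `1e-5` cutoff is only one commonly used convention.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Karlin-Altschul estimate: E = m * n * 2**(-S') for a bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


def is_significant(e_value: float, cutoff: float = 1e-5) -> bool:
    """True when the E-value falls below the chosen cutoff (an assumed convention)."""
    return e_value < cutoff


QUERY_LEN = 2597          # query length reported in the answer
ASSUMED_DB_LEN = 10**11   # hypothetical database size in nucleotides (assumption)

# Low scores are expected by chance in a large database; a score of 610 is not.
for score in (30, 50, 610):
    e = expected_hits(score, QUERY_LEN, ASSUMED_DB_LEN)
    print(f"bit score {score}: E ~ {e:.3g}, significant: {is_significant(e)}")
```

Each additional bit of score halves the expected number of chance hits, which is why a score in the hundreds drives the E-value to practically zero regardless of the exact database size.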
1e. 610
1f. 3932
1g. 89%
1h. 98%
1i. No
1j. Not applicable for this sequence.
But please note that there is a hit with 100% query coverage, max score 601, total score 4783, sequence identity 97%, and E-value 1e-166: accession NC_005100.4 (Rattus norvegicus strain mixed chromosome 1, Rnor_6.0). It becomes the top-listed accession if the results are sorted by Total score instead of Max score. Another accession (AC_000069.1) has the same E-value of 1e-166, but its query coverage is lower (82%).
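The effect of the sort column can be sketched with the two hits whose values are fully reported above. Max score is the score of the best single aligned segment, while Total score sums all aligned segments from the same subject sequence, so the two columns can rank hits differently. The `Hit` class and the `same_e_value` helper are hypothetical names introduced for illustration; only the numbers come from the answer.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str
    max_score: int
    total_score: int
    query_cover: int   # percent
    identity: int      # percent
    e_value: float


# Values copied from the answer; other hits are omitted because
# their full statistics are not reported.
hits = [
    Hit("NC_034576.1", 610, 3932, 89, 98, 2e-169),
    Hit("NC_005100.4", 601, 4783, 100, 97, 1e-166),
]


def same_e_value(all_hits, reference):
    """Other hits that share the reference hit's E-value (question 1.i)."""
    return [h for h in all_hits
            if h is not reference and h.e_value == reference.e_value]


top_by_max = max(hits, key=lambda h: h.max_score)
top_by_total = max(hits, key=lambda h: h.total_score)

print("Top by Max score:  ", top_by_max.accession)
print("Top by Total score:", top_by_total.accession)
print("Hits sharing the top E-value:", len(same_e_value(hits, top_by_max)))
```

Questions 1.b through 1.h refer to the default ordering of the results table, so the hit with the highest Max score is the one to report.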
1k. Not applicable to this sequence.