Question

1) This question is a modification of Question 2.3, page 134, in the textbook Introduction to Bioinformatics by Arthur Lesk.

It is estimated that the human immune system can produce 10^15 antibodies. Would it be feasible for such a large number of proteins each to be encoded entirely by a separate gene, with the diversity arising from gene duplication and divergence? A typical gene for an immune system biomolecule is ~2000 bp long. Show calculations to justify your answer.

To answer the question above you will need to find the size of the human genome. Do this using the databases at either NCBI or Ensembl. Use your chosen database's most recent reference assembly for the human genome. Give the size, and the URL (at either NCBI, Ensembl, or EBI) at which you found the size.

2) This is a modification of Question 2.1, page 132, in a textbook by Arthur Lesk.

According to the NCBI database, the size of the E. coli strain K12 substrain MG1655 reference genome is 4,641,652 bp. The overall base composition of the genome is A/T=49.2%, G/C=50.8%. In a random sequence of 4,641,652 nucleotides with these properties, what is the expected number of occurrences of the sequence CTAG? That is, how many times would we expect to see the sequence CTAG in the E. coli genome due to random chance alone? Give your answer to the nearest integer.

Explanation / Answer

Answer 1 -
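
One way to set up the calculation for Question 1 is to compare the total DNA needed for one gene per antibody with the size of the genome. The sketch below uses the 10^15 antibodies and ~2000 bp per gene given in the question. The genome size is only a rough placeholder of about 3.1 billion bp (an assumption, not a value from the question); it should be replaced with the figure and URL found at NCBI or Ensembl for the current reference assembly. The variable names are illustrative.

```python
# Question 1 feasibility check: DNA needed for one gene per antibody
# versus the size of the human genome.

N_ANTIBODIES = 10**15      # from the question
GENE_LENGTH_BP = 2_000     # from the question (~2000 bp per gene)

# ASSUMPTION: rough placeholder, not given in the question.
# Replace with the size reported for the current reference assembly.
HUMAN_GENOME_BP = 3_100_000_000

# Total DNA required if every antibody had its own gene.
required_bp = N_ANTIBODIES * GENE_LENGTH_BP

# How many times larger than the genome that requirement is.
fold_excess = required_bp / HUMAN_GENOME_BP

# Upper bound on 2000 bp genes that fit if the genome held nothing else.
max_genes_that_fit = HUMAN_GENOME_BP // GENE_LENGTH_BP

feasible = required_bp <= HUMAN_GENOME_BP

print(f"DNA required:        {required_bp:.2e} bp")
print(f"Genome size (assumed): {HUMAN_GENOME_BP:.2e} bp")
print(f"Shortfall factor:    {fold_excess:.2e}")
print(f"Max genes that fit:  {max_genes_that_fit:,}")
print(f"Feasible:            {feasible}")
```

Under these assumptions the requirement is 2 x 10^18 bp, several hundred million times the size of the genome, so encoding each antibody in a separate gene would not be feasible. The conclusion does not depend on the exact genome size used.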

Answer 2 -
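
For Question 2, a minimal sketch of the expected-count calculation follows. It assumes that positions in the sequence are independent, and that A and T each make up half of the A/T fraction (24.6% each) and C and G each make up half of the G/C fraction (25.4% each). The question gives only the combined percentages, so the even split is an assumption. The probability of CTAG at one position is then P(C) x P(T) x P(A) x P(G), and the expected count is that probability times the number of possible start positions. Both a linear count (N - 3 start positions) and a circular count (N start positions) are shown, since the E. coli chromosome is circular.

```python
# Question 2: expected number of CTAG occurrences in a random sequence
# with the stated length and base composition.

GENOME_LENGTH = 4_641_652   # from the question
AT_FRACTION = 0.492         # from the question (A + T combined)
GC_FRACTION = 0.508         # from the question (G + C combined)
MOTIF = "CTAG"

# ASSUMPTION: A and T are equally frequent, as are C and G.
base_prob = {
    "A": AT_FRACTION / 2,
    "T": AT_FRACTION / 2,
    "C": GC_FRACTION / 2,
    "G": GC_FRACTION / 2,
}

# Probability that the motif starts at any one given position,
# assuming independent bases.
motif_prob = 1.0
for base in MOTIF:
    motif_prob *= base_prob[base]

# Number of possible start positions.
linear_positions = GENOME_LENGTH - len(MOTIF) + 1   # linear sequence
circular_positions = GENOME_LENGTH                  # circular chromosome

expected_linear = linear_positions * motif_prob
expected_circular = circular_positions * motif_prob

print(f"P(CTAG) per position: {motif_prob:.6f}")
print(f"Expected (linear):    {round(expected_linear)}")
print(f"Expected (circular):  {round(expected_circular)}")
```

Both counts round to the same integer, so the linear versus circular choice does not affect the answer here. Because CTAG is its own reverse complement, counting on one strand already covers every site on both strands.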