
Write code using R Language or Python, that reads in the .csv files and generate


Question

Write code in R or Python that reads in the .csv files and generates a Manhattan plot (using any library).

I cannot attach the .csv files mentioned below, so any files that work with your code will be fine. I struggled because there was a lot of data, so please make sure the code can handle large input files.

We will visualize Genome-Wide Association Studies (GWAS) for psychiatric disorders with a Manhattan plot. The two data sets, SNPs and phenotypes, contain genome data for two groups: psychiatric disorders (y = 1) and control (y = 0). In "SNP.csv", there are 37,853 SNPs for 130 samples. Each value in the file is the number of minor alleles at that SNP (i.e., a value in {0, 1, 2}). In "Phenotype.csv", zero indicates that the sample is a control, while one indicates a psychiatric disorder (one of bipolar disorder, schizophrenia, or major depression).

Compute p-values using a t-test (you can use any library for the t-test). Perform the t-test pairwise between each SNP and the phenotype, i.e., you need to perform 37,853 t-tests and compute their p-values. Then make a Manhattan plot with Bonferroni multiple-testing correction (i.e., use the p-value cutoff 0.05/37,853).
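The p-value step described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference solution: it assumes both files are plain numeric CSVs without a header row, and the function names `load_data`, `snp_pvalues`, and `bonferroni_cutoff` are made up for illustration. The t-test is run as a two-sample test per SNP, comparing genotype values of cases (y = 1) against controls (y = 0).

```python
import numpy as np
from scipy import stats


def load_data(snp_path="SNP.csv", pheno_path="Phenotype.csv"):
    """Load genotypes and labels (assumed layout: numeric CSVs, no header)."""
    snps = np.loadtxt(snp_path, delimiter=",")
    pheno = np.loadtxt(pheno_path, delimiter=",").ravel().astype(int)
    # Orient the matrix as samples x SNPs if it was stored the other way round.
    if snps.shape[0] != pheno.size and snps.shape[1] == pheno.size:
        snps = snps.T
    return snps, pheno


def snp_pvalues(snps, pheno):
    """Two-sample t-test per SNP column: cases (y=1) vs. controls (y=0)."""
    snps = np.asarray(snps, dtype=float)
    pheno = np.asarray(pheno)
    cases = snps[pheno == 1]
    controls = snps[pheno == 0]
    # axis=0 runs every test in one vectorized call, one per SNP column,
    # so no Python-level loop over the 37,853 SNPs is needed.
    _, pvals = stats.ttest_ind(cases, controls, axis=0)
    # SNPs with no variation yield NaN; treat them as non-significant.
    return np.nan_to_num(pvals, nan=1.0)


def bonferroni_cutoff(n_tests, alpha=0.05):
    """Bonferroni-corrected significance threshold: alpha / number of tests."""
    return alpha / n_tests
```

With the real files in place, typical use would be `snps, pheno = load_data()`, then `pvals = snp_pvalues(snps, pheno)` and `cutoff = bonferroni_cutoff(pvals.size)`. Running the tests along an axis keeps the runtime short even with tens of thousands of SNPs.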

Figure 1. Manhattan Plot (x-axis: SNPs)

Explanation / Answer

Here is a function which can make a Manhattan plot using lattice graphics. While the function itself is quite long, you don't have to worry about most of it. You really only need to pay attention to the parameters that you pass to the function. An example of its use is given below.
