using matlab ChIP-seq is used to identify genomic segments bound by transcriptio
Question
Using MATLAB:
ChIP-seq is used to identify genomic segments bound by transcription factors. In a ChIP-seq experiment, you obtain the chromosome regions that the transcription factor FoxA binds. Write a MATLAB function locationtogenes(chr,start,finish,genefile) that takes a chromosome name, the start and finish of the chromosome region in base pairs, and a gene-location file, and determines which genes are located in that region. The gene-location file is a tab-delimited text file in which each line contains four fields: chromosome, start, finish, and gene name. Find the genes whose position overlaps the input range (a gene does not have to be completely enclosed in the input range to count as overlapping; an overlap of at least one nucleotide is sufficient). Return these genes as a cell array. If no genes are found in the input range, return an empty cell array. An example gene-location file can be downloaded from http://sacan.biomed.drexel.edu/ftp/bmeprog/genelocs_sample.txt. Your code should not download any files from the web; you can assume that your function will be given the name of a file that already exists.
Explanation / Answer
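function out = locationtogenes(chr, start, finish, genefile)
%LOCATIONTOGENES Genes overlapping a chromosome region.
%   out = locationtogenes(chr, start, finish, genefile) returns a cell array
%   of the gene names listed in the tab-delimited file genefile (one
%   chromosome, start, finish, genename record per line) that lie on
%   chromosome chr and share at least one nucleotide with [start, finish].
out = {};                        % stays an empty cell array if nothing overlaps
f = fopen(genefile, 'r');        % open the gene-location file for reading
if f < 0
    error('locationtogenes:cannotOpen', 'Cannot open file %s', genefile);
end
while true
    line = fgetl(f);
    if ~ischar(line); break; end             % fgetl returns -1 at end of file
    fields = strsplit(line, sprintf('\t'));  % split the record on tabs
    if numel(fields) < 4; continue; end      % skip blank or malformed lines
    start1 = str2double(fields{2});          % gene start as a number
    finish1 = str2double(fields{3});         % gene finish as a number
    if isnan(start1) || isnan(finish1); continue; end  % non-numeric, e.g. a header
    % Same chromosome, and the closed intervals overlap: each one starts
    % no later than the other one ends.
    if strcmp(strtrim(fields{1}), chr) && start1 <= finish && finish1 >= start
        out{end+1} = strtrim(fields{4}); %#ok<AGROW>
    end
end
fclose(f);
end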
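The overlap rule in the question (at least one shared nucleotide) comes down to a two-sided comparison on closed intervals: a gene spanning [gstart, gfinish] overlaps the query [start, finish] when gstart <= finish and gfinish >= start. The script below is a minimal sketch of that test, not a definitive solution. The helper name overlappinggenes, the temporary file, and the gene names and coordinates in it are all made up for illustration, and it assumes the file has no header line and that each start is no greater than its finish. It reads all four columns at once with textscan instead of looping over lines.

```matlab
% Write a small hypothetical gene-location file (made-up names and coordinates).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'chr1\t100\t200\tgeneA\n');
fprintf(fid, 'chr1\t150\t400\tgeneB\n');
fprintf(fid, 'chr2\t100\t200\tgeneC\n');
fclose(fid);

hits = overlappinggenes('chr1', 200, 300, fname);  % geneA shares only base 200
none = overlappinggenes('chr1', 500, 600, fname);  % no chr1 gene reaches 500
disp(hits);
delete(fname);                                     % clean up the temporary file

function out = overlappinggenes(chr, start, finish, genefile)
% Hypothetical helper: read all four tab-separated columns, then filter.
fid = fopen(genefile, 'r');
cols = textscan(fid, '%s %f %f %s', 'Delimiter', '\t');
fclose(fid);
% Closed intervals [a,b] and [c,d] overlap exactly when a <= d and b >= c.
keep = strcmp(cols{1}, chr) & cols{2} <= finish & cols{3} >= start;
out = cols{4}(keep);             % column cell array of matching gene names
if isempty(out); out = {}; end   % normalize the no-match case to a 0x0 cell
end
```

Local functions in a script file need MATLAB R2016b or later; on older releases, save overlappinggenes in its own file. Matching the chromosome name with strcmp rather than a substring search matters here, because a substring search for 'chr1' would also match 'chr10'.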