Roughly half of the genes on the Affymetrix gene expression platform have more than one probeset representing their transcript. The gene expression measures derived from such synonymous probesets often vary more than expected, which raises the problem of individual probeset quality. The goal is to use various resources to help the user choose the most trustworthy probeset. These resources are existing gene expression data and sequence-derived features. The overall result is an analysis pipeline that suggests the best-behaving probeset for every gene that has more than one probeset on the Affymetrix platform.
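As a rough illustration of what such a pipeline could compute, the sketch below scores each probeset of a multi-probeset gene and suggests the highest-scoring one. The scoring rule is an assumption made for illustration only, not the method prescribed by this project: it averages the Pearson correlation of a probeset with its sibling probesets across samples and mixes that with a hypothetical per-probeset sequence mapping score in the range 0 to 1 (for example, the fraction of probes that align uniquely to the target transcript). The names `suggest_probesets`, `gene_of`, `mapping_score` and `weight` are likewise hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / (sx * sy)


def suggest_probesets(expression, gene_of, mapping_score, weight=0.5):
    """Suggest one probeset per gene that has more than one probeset.

    expression    -- probeset id -> list of expression values (same sample order)
    gene_of       -- probeset id -> gene id
    mapping_score -- probeset id -> assumed sequence mapping score in [0, 1]
    weight        -- assumed mixing weight between the two evidence sources
    """
    # Group probesets by the gene they represent.
    by_gene = {}
    for probeset, gene in gene_of.items():
        by_gene.setdefault(gene, []).append(probeset)

    suggestions = {}
    for gene, probesets in by_gene.items():
        if len(probesets) < 2:
            continue  # nothing to choose for single-probeset genes

        def score(p):
            # Agreement with the sibling probesets of the same gene.
            agreement = mean(
                pearson(expression[p], expression[q])
                for q in probesets
                if q != p
            )
            return weight * agreement + (1 - weight) * mapping_score[p]

        suggestions[gene] = max(probesets, key=score)
    return suggestions


if __name__ == "__main__":
    # Made-up toy data, for demonstration only.
    expression = {
        "ps1": [1, 2, 3, 4],
        "ps2": [1.1, 2.1, 2.9, 4.2],
        "ps3": [4, 1, 3, 2],
        "ps4": [5, 6, 7, 8],
    }
    gene_of = {"ps1": "GENE_A", "ps2": "GENE_A", "ps3": "GENE_A", "ps4": "GENE_B"}
    mapping_score = {"ps1": 1.0, "ps2": 0.8, "ps3": 0.9, "ps4": 1.0}
    print(suggest_probesets(expression, gene_of, mapping_score))
```

In a real pipeline the expression profiles would come from existing Affymetrix datasets and the mapping scores from aligning probe sequences with a tool such as BLAT or BLAST; how the two sources should be weighted is an open design question.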
Tasks
• Get an overview of gene expression microarray technology (Affymetrix)
• Get an overview of DNA alignment tools (BLAT / BLAST)
• Use gene expression data and sequence mapping accuracy to estimate the quality of selected probesets
• Implement an analysis pipeline that outputs the suggested probesets to be used in the MEM tool

Literature
• http://nar.oxfordjournals.org/content/33/3/e31.short
• http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000088
• http://iospress.metapress.com/content/b9617387608mnv4w/
• P. Adler, R. Kolde, M. Kull, A. Tkatšenko, H. Peterson, J. Reimand and J. Vilo: Mining for coexpression across hundreds of datasets using novel rank aggregation and visualisation methods (2009) Genome Biology