Searching for regulatory motifs in histone-defined promoter/enhancer regions

Category:Data Analysis
Supervisor:Raivo Kolde
Abstract:The search for patterns in the regulatory regions of the genome is among the first problems that bioinformaticians tackled. In simpler organisms such as bacteria and yeast it works well: a lot of information about the regulation of genes can be deduced from the motifs present in their promoter sequences. In mammals, however, it works less well, because the space between genes is much larger and the regulatory regions do not have to lie exactly at the gene start. The signal is therefore much more dispersed than in simpler organisms and, as a result, harder to find. However, it has been shown that incorporating additional information can solve the problem. Regulatory regions of the DNA appear to be packed differently from other regions, so we can pinpoint more precisely where to look for motifs. These regions are already available for download. The task of the student is to download these regions and build an analysis pipeline that takes a list of genes and returns the possible regulators of those genes.


  • Download epigenetic information from UCSC
  • Associate regulatory regions with nearby genes
  • Run motif matching and motif discovery algorithms on the data
  • Perform motif enrichment analysis on some gene lists
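
The last three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the intended pipeline: the function names, the 5 kb distance cutoff, the toy coordinates and the motif labels M1 and M2 are all hypothetical, motif hits per region are assumed to come from some external motif matching tool, and the one-sided hypergeometric test is one common choice for enrichment rather than a method prescribed by the project.

```python
from math import comb


def assign_regions_to_genes(regions, gene_tss, max_dist=5_000):
    """Link each gene to regions whose midpoint lies within max_dist of its TSS.

    regions:  {region_id: (chrom, start, end)}   (hypothetical input format)
    gene_tss: {gene: (chrom, tss_position)}      (hypothetical input format)
    """
    gene_regions = {gene: [] for gene in gene_tss}
    for region_id, (chrom, start, end) in regions.items():
        mid = (start + end) // 2
        for gene, (g_chrom, tss) in gene_tss.items():
            if chrom == g_chrom and abs(mid - tss) <= max_dist:
                gene_regions[gene].append(region_id)
    return gene_regions


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N genes, K of which carry the motif."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def motif_enrichment(gene_list, gene_regions, region_motifs):
    """Return {motif: p-value} for a query gene list against all annotated genes."""
    # Collect the motifs found in any region linked to each gene.
    gene_motifs = {
        gene: set().union(*(region_motifs.get(r, set()) for r in region_ids))
        for gene, region_ids in gene_regions.items()
    }
    background = set(gene_motifs)
    query = set(gene_list) & background
    all_motifs = set().union(*gene_motifs.values())

    pvalues = {}
    for motif in all_motifs:
        K = sum(1 for g in background if motif in gene_motifs[g])
        k = sum(1 for g in query if motif in gene_motifs[g])
        pvalues[motif] = hypergeom_pvalue(k, len(query), K, len(background))
    return pvalues


# Toy data, invented purely for illustration.
regions = {
    "r1": ("chr1", 1_000, 1_200),
    "r2": ("chr1", 90_000, 90_200),
    "r3": ("chr2", 500, 700),
}
gene_tss = {
    "A": ("chr1", 1_500),
    "B": ("chr1", 90_000),
    "C": ("chr2", 1_000),
    "D": ("chr1", 500_000),
}
# Motif hits per region, as an external motif matcher might report them.
region_motifs = {"r1": {"M1"}, "r2": {"M1", "M2"}, "r3": {"M2"}}

gene_regions = assign_regions_to_genes(regions, gene_tss)
pvalues = motif_enrichment(["A", "B"], gene_regions, region_motifs)
for motif, p in sorted(pvalues.items(), key=lambda item: item[1]):
    print(motif, round(p, 3))
```

In a real analysis the distance cutoff, the choice of background gene set and a correction for multiple testing would all need to be decided and justified; the sketch leaves them out to keep the idea visible.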


