Genome-wide discovery of cis-elements in promoter sequences using gene expression data

Tatiana Tatarinova, John Bouck, Richard Flavell, Nickolai Alexandrov, Maxim Troukhan

Research output: Contribution to journal › Article › peer-review


The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allows more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. The strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences.
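As an illustration of components (2) and (3), the sketch below tests whether a motif found at a fixed position window relative to the transcription start site is over-represented among the promoters of a co-expressed gene set. It is a minimal sketch under stated assumptions, not the authors' implementation: the hypergeometric test, the function names (`has_motif`, `hypergeom_tail`, `motif_enrichment`), the input layout, and the default window are all hypothetical choices made for this example.

```python
from math import comb


def has_motif(promoter, motif, tss_index, window):
    """Return True if `motif` starts within `window` relative to the TSS.

    `window` is (lo, hi) in bases relative to the TSS; negative values are
    upstream. `tss_index` is the 0-based position of the TSS in `promoter`.
    """
    lo, hi = window
    start = max(0, tss_index + lo)
    # Extend the slice by the motif length so a motif starting at `hi` fits.
    end = min(len(promoter), tss_index + hi + len(motif))
    return motif in promoter[start:end]


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n promoters are drawn from N, K of which carry the motif."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def motif_enrichment(promoters, cluster_ids, motif, window=(-50, -20)):
    """Count motif carriers in a gene cluster and return (count, p-value).

    `promoters` maps gene id -> (sequence, tss_index); `cluster_ids` is the
    set of co-expressed genes and is assumed to be a subset of `promoters`.
    The default window is an arbitrary illustrative choice.
    """
    carriers = {
        gene
        for gene, (seq, tss) in promoters.items()
        if has_motif(seq, motif, tss, window)
    }
    cluster = set(cluster_ids)
    k = len(carriers & cluster)
    return k, hypergeom_tail(k, len(carriers), len(cluster), len(promoters))
```

When many candidate motifs and windows are screened this way, the resulting p-values would need a multiple-testing correction before any motif is called significant.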
Original language: English
Pages (from-to): 139-151
Number of pages: 12
Journal: OMICS: A Journal of Integrative Biology
Issue number: 2
Publication status: Published - 30 Apr 2009


  • gene expression
  • bioinformatics
  • transcription factor binding sites

