Dataset Information

Quantitative modeling of gene expression using DNA shape features of binding sites.


ABSTRACT: Prediction of gene expression levels driven by regulatory sequences is pivotal in genomic biology. A major focus in transcriptional regulation is sequence-to-expression modeling, which interprets the enhancer sequence based on transcription factor concentrations and DNA binding specificities and predicts precise gene expression levels in varying cellular contexts. Such models largely rely on the position weight matrix (PWM) model for DNA binding, and the effect of alternative models based on DNA shape remains unexplored. Here, we propose a statistical thermodynamics model of gene expression using DNA shape features of binding sites. We used rigorous methods to evaluate the fits of expression readouts of 37 enhancers regulating spatial gene expression patterns in the Drosophila embryo, and show that DNA shape-based models perform arguably better than PWM-based models. We also observed that DNA shape captures information complementary to the PWM, in a way that is useful for expression modeling. Furthermore, we tested whether combining shape and PWM-based features provides better predictions than using either binding model alone. Our work demonstrates that the increasingly popular DNA-binding models based on local DNA shape can be useful in sequence-to-expression modeling. It also provides a framework for future studies to predict gene expression better than with PWM models alone.
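For readers unfamiliar with the PWM model mentioned in the abstract, the following is a minimal sketch of how a PWM scores a candidate binding site. It is an illustration only and is not the model used in the publication, which is a thermodynamic model with DNA shape features. The count matrix, the uniform background, the pseudocount, and the function names `build_pwm`, `score_site`, and `best_site` are all hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per position).
# These numbers are invented for illustration, not taken from the dataset.
COUNTS = [
    {"A": 1, "C": 1, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 1, "G": 1, "T": 8},
]

# Assumed uniform background nucleotide frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def build_pwm(counts, background, pseudocount=1.0):
    """Convert per-position counts into log2-odds weights against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(column)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in column
        })
    return pwm


def score_site(pwm, site):
    """Sum the position-specific weights for a site of the same width as the PWM."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal PWM width")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_site(pwm, sequence):
    """Slide the PWM along the sequence and return (score, start) of the best window."""
    width = len(pwm)
    scored = [
        (score_site(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    pwm = build_pwm(COUNTS, BACKGROUND)
    score, start = best_site(pwm, "GGCTAATCG")
    print(f"best window starts at {start} with log-odds score {score:.2f}")
```

A PWM of this kind treats each position independently. The abstract's point is that DNA shape features may capture information that such position-independent scores miss.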

SUBMITTER: Peng PC 

PROVIDER: S-EPMC5291265 | biostudies-literature | 2016 Jul

REPOSITORIES: biostudies-literature

Publications

Quantitative modeling of gene expression using DNA shape features of binding sites.

Pei-Chen Peng, Saurabh Sinha

Nucleic Acids Research, 2016-06-01, issue 13


Similar Datasets

2014-11-04 | E-GEOD-59845 | biostudies-arrayexpress
2015-03-01 | E-GEOD-60200 | biostudies-arrayexpress
| S-EPMC4403198 | biostudies-literature
2015-03-01 | GSE60200 | GEO
2014-11-04 | GSE59845 | GEO
| S-EPMC7145579 | biostudies-literature
| S-EPMC3526315 | biostudies-other
| S-EPMC3491397 | biostudies-literature
| S-EPMC5042832 | biostudies-literature
| S-EPMC6166240 | biostudies-literature