Genome-wide motif statistics are shaped by DNA binding proteins over evolutionary time scales

Long Qian, Edo Kussell

Research output: Contribution to journal, Article, peer-review

Abstract

The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome-wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.
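
The "similar words have correlated frequencies" signal can be illustrated with a minimal sketch. This is not the authors' analysis: the function names, the choice of one-mismatch neighbors as the notion of "similar", the log transform, and the pseudocount are all assumptions made for illustration. A raw correlation like this one is also confounded by base composition, so a real analysis would need a null model that controls for it.

```python
from collections import Counter
from itertools import product
import math


def kmer_counts(seq, k):
    """Count every overlapping k-letter word in seq, skipping words with non-ACGT letters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def one_mismatch_neighbors(word):
    """Yield every word that differs from `word` at exactly one position."""
    for i, base in enumerate(word):
        for alt in "ACGT":
            if alt != base:
                yield word[:i] + alt + word[i + 1:]


def pearson(xs, ys):
    """Pearson correlation coefficient; returns NaN if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def neighbor_frequency_correlation(seq, k, pseudocount=1.0):
    """Correlate log counts of each k-mer with those of its one-mismatch neighbors.

    Hypothetical helper: the pseudocount (an assumption) keeps absent words finite.
    """
    counts = kmer_counts(seq, k)
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    logf = {w: math.log(counts[w] + pseudocount) for w in words}
    xs, ys = [], []
    for w in words:
        for nb in one_mismatch_neighbors(w):
            xs.append(logf[w])
            ys.append(logf[nb])
    return pearson(xs, ys)
```

Calling `neighbor_frequency_correlation(genome_sequence, 6)` on a genome string returns a single coefficient; comparing it against the same statistic on composition-matched shuffled sequences would be one way to judge whether the value is meaningful.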

Original language: English (US)
Article number: 041009
Journal: Physical Review X
Volume: 6
Issue number: 4
State: Published - 2016

Keywords

  • Biological Physics
  • Statistical Physics

ASJC Scopus subject areas

  • Physics and Astronomy (all)
