OutPredict: multiple datasets can improve prediction of expression and inference of causality

Jacopo Cirrone, Matthew D. Brooks, Richard Bonneau, Gloria M. Coruzzi, Dennis E. Shasha

Research output: Contribution to journalArticle

Abstract

The ability to accurately predict the causal relationships from transcription factors to genes would greatly enhance our understanding of transcriptional dynamics. This could lead to applications in which one or more transcription factors could be manipulated to effect a change in genes leading to the enhancement of some desired trait. Here we present a method called OutPredict that constructs a model for each gene based on time series (and other) data and that predicts gene's expression in a previously unseen subsequent time point. The model also infers causal relationships based on the most important transcription factors for each gene model, some of which have been validated from previous physical experiments. The method benefits from known network edges and steady-state data to enhance predictive accuracy. Our results across B. subtilis, Arabidopsis, E.coli, Drosophila and the DREAM4 simulated in silico dataset show improved predictive accuracy ranging from 40% to 60% over other state-of-the-art methods. We find that gene expression models can benefit from the addition of steady-state data to predict expression values of time series. Finally, we validate, based on limited available data, that the influential edges we infer correspond to known relationships significantly more than expected by chance or by state-of-the-art methods.

Original languageEnglish (US)
Article number6804
JournalScientific reports
Volume10
Issue number1
DOIs
StatePublished - Dec 1 2020

ASJC Scopus subject areas

  • General

Fingerprint Dive into the research topics of 'OutPredict: multiple datasets can improve prediction of expression and inference of causality'. Together they form a unique fingerprint.

  • Cite this