TY - JOUR
T1 - Robust discovery of gene regulatory networks from single-cell gene expression data by Causal Inference Using Composition of Transactions
AU - Shojaee, Abbas
AU - Carol Huang, Shao Shan
N1 - Publisher Copyright:
© The Author(s) 2023. Published by Oxford University Press.
PY - 2023/11/1
Y1 - 2023/11/1
N2 - Gene regulatory networks (GRNs) drive organism structure and functions, so the discovery and characterization of GRNs is a major goal in biological research. However, accurate identification of causal regulatory connections and inference of GRNs using gene expression datasets, more recently from single-cell RNA-seq (scRNA-seq), has been challenging. Here we employ the innovative method of Causal Inference Using Composition of Transactions (CICT) to uncover GRNs from scRNA-seq data. The basis of CICT is that if all gene expressions were random, a non-random regulatory gene should induce its targets at levels different from the background random process, resulting in distinct patterns in the whole relevance network of gene–gene associations. CICT proposes novel network features derived from a relevance network, which enable any machine learning algorithm to predict causal regulatory edges and infer GRNs. We evaluated CICT using simulated and experimental scRNA-seq data in a well-established benchmarking pipeline and showed that CICT outperformed existing network inference methods representing diverse approaches with many-fold higher accuracy. Furthermore, we demonstrated that GRN inference with CICT was robust to different levels of sparsity in scRNA-seq data, the characteristics of data and ground truth, the choice of association measure and the complexity of the supervised machine learning algorithm. Our results suggest aiming at directly predicting causality to recover regulatory relationships in complex biological networks substantially improves accuracy in GRN inference.
AB - Gene regulatory networks (GRNs) drive organism structure and functions, so the discovery and characterization of GRNs is a major goal in biological research. However, accurate identification of causal regulatory connections and inference of GRNs using gene expression datasets, more recently from single-cell RNA-seq (scRNA-seq), has been challenging. Here we employ the innovative method of Causal Inference Using Composition of Transactions (CICT) to uncover GRNs from scRNA-seq data. The basis of CICT is that if all gene expressions were random, a non-random regulatory gene should induce its targets at levels different from the background random process, resulting in distinct patterns in the whole relevance network of gene–gene associations. CICT proposes novel network features derived from a relevance network, which enable any machine learning algorithm to predict causal regulatory edges and infer GRNs. We evaluated CICT using simulated and experimental scRNA-seq data in a well-established benchmarking pipeline and showed that CICT outperformed existing network inference methods representing diverse approaches with many-fold higher accuracy. Furthermore, we demonstrated that GRN inference with CICT was robust to different levels of sparsity in scRNA-seq data, the characteristics of data and ground truth, the choice of association measure and the complexity of the supervised machine learning algorithm. Our results suggest aiming at directly predicting causality to recover regulatory relationships in complex biological networks substantially improves accuracy in GRN inference.
KW - causal inference
KW - gene regulatory network
KW - network inference
KW - single-cell RNA sequencing
KW - systems inference
UR - http://www.scopus.com/inward/record.url?scp=85175272316&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175272316&partnerID=8YFLogxK
U2 - 10.1093/bib/bbad370
DO - 10.1093/bib/bbad370
M3 - Article
C2 - 37897702
AN - SCOPUS:85175272316
SN - 1467-5463
VL - 24
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 6
M1 - bbad370
ER -