TY - JOUR
T1 - Integrating single-cell transcriptomic data across different conditions, technologies, and species
AU - Butler, Andrew
AU - Hoffman, Paul
AU - Smibert, Peter
AU - Papalexi, Efthymia
AU - Satija, Rahul
N1 - Publisher Copyright: © 2018 Nature Publishing Group. All rights reserved.
PY - 2018/6/1
Y1 - 2018/6/1
N2 - Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.
UR - http://www.scopus.com/inward/record.url?scp=85046298440&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046298440&partnerID=8YFLogxK
U2 - 10.1038/nbt.4096
DO - 10.1038/nbt.4096
M3 - Article
C2 - 29608179
AN - SCOPUS:85046298440
SN - 1087-0156
VL - 36
SP - 411
EP - 420
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 5
ER -