TY - JOUR
T1 - Correlated mutations and homologous recombination within bacterial populations
AU - Lin, Mingzhi
AU - Kussell, Edo
N1 - Publisher Copyright: © 2017 by the Genetics Society of America.
PY - 2017/2
Y1 - 2017/2
AB - Inferring the rate of homologous recombination within a bacterial population remains a key challenge in quantifying the basic parameters of bacterial evolution. Due to the high sequence similarity within a clonal population, and unique aspects of bacterial DNA transfer processes, detecting recombination events based on phylogenetic reconstruction is often difficult, and estimating recombination rates using coalescent model-based methods is computationally expensive, and often infeasible for large sequencing data sets. Here, we present an efficient solution by introducing a set of mutational correlation functions computed using pairwise sequence comparison, which characterize various facets of bacterial recombination. We provide analytical expressions for these functions, which precisely recapitulate simulation results of neutral and adapting populations under different coalescent models. We used these to fit correlation functions measured at synonymous substitutions using whole-genome data on Escherichia coli and Streptococcus pneumoniae populations. We calculated and corrected for the effect of sample selection bias, i.e., the uneven sampling of individuals from natural microbial populations that exists in most datasets. Our method is fast and efficient, and does not employ phylogenetic inference or other computationally intensive numerics. By simply fitting analytical forms to measurements from sequence data, we show that recombination rates can be inferred, and the relative ages of different samples can be estimated. Our approach, which is based on population genetic modeling, is broadly applicable to a wide variety of data, and its computational efficiency makes it particularly attractive for use in the analysis of large sequencing datasets.
KW - Adapting populations
KW - Bacteria
KW - Bolthausen–Sznitman coalescent
KW - Homologous recombination
KW - Population diversity
KW - Sample ages
KW - Sample selection bias
UR - http://www.scopus.com/inward/record.url?scp=85021847801&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021847801&partnerID=8YFLogxK
U2 - 10.1534/genetics.116.189621
DO - 10.1534/genetics.116.189621
M3 - Article
C2 - 28007887
AN - SCOPUS:85021847801
SN - 0016-6731
VL - 205
SP - 891
EP - 917
JO - Genetics
JF - Genetics
IS - 2
ER -