Variation in LINE composition is one of the major determinants of the substantial size and structural differences among vertebrate genomes. In particular, the larger genomes of mammals are characterized by hundreds of thousands of copies from a single LINE clade, L1, whereas nonmammalian vertebrates possess a much greater diversity of LINEs, yet with copy numbers that are orders of magnitude lower. It has been proposed that such variation in copy number among vertebrates is due to the differential effect of LINE insertions on host fitness. To investigate LINE selection, we deployed a framework of demographic modeling, coalescent simulations, and probabilistic inference against population-level whole-genome data sets for four model species: one population each of threespine stickleback, green anole, and house mouse, as well as three human populations. Specifically, we inferred a null demographic background utilizing SNP data, which was then exploited to simulate a putative null distribution of summary statistics that was compared with LINE data. Subsequently, we applied the inferred null demographic model with an additional exponential size change parameter, coupled with model selection, to test for neutrality as well as estimate the strength of either negative or positive selection. We found a robust signal for purifying selection in anole and mouse, but a lack of clear evidence for selection in stickleback and human. Overall, we demonstrated LINE insertion dynamics that are not in accordance with a mammalian versus nonmammalian dichotomy; instead, the degree of existing LINE activity together with host-specific demographic history may be the main determinants of LINE abundance.
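The idea of comparing LINE data with a simulated null distribution of a summary statistic can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. As a simplifying assumption, it replaces the SNP-inferred demographic model with a constant-size neutral population, whose expected site frequency spectrum is proportional to 1/i. The summary statistic (mean number of chromosomes carrying an insertion), the sample size, the number of loci, and the observed value are all hypothetical placeholders.

```python
import random


def neutral_sfs_weights(n):
    """Relative neutral expectation (1/i) for an insertion carried by
    i of n sampled chromosomes, i = 1..n-1.
    Assumes constant population size; a stand-in for an inferred demography."""
    return [1.0 / i for i in range(1, n)]


def simulate_mean_count(n, num_loci, rng):
    """Draw num_loci polymorphic insertions from the neutral spectrum and
    return the mean number of chromosomes carrying each insertion."""
    counts = rng.choices(range(1, n), weights=neutral_sfs_weights(n), k=num_loci)
    return sum(counts) / num_loci


def empirical_p_lower(observed, null_values):
    """One-sided Monte Carlo p-value: share of null draws at or below the
    observed statistic, with the usual +1 correction."""
    hits = sum(1 for v in null_values if v <= observed)
    return (hits + 1) / (len(null_values) + 1)


rng = random.Random(42)   # fixed seed so the sketch is reproducible
n = 20                    # hypothetical number of sampled chromosomes
num_loci = 200            # hypothetical number of polymorphic LINE insertions
null = [simulate_mean_count(n, num_loci, rng) for _ in range(1000)]

observed = 2.0            # hypothetical observed mean count for LINE insertions
p = empirical_p_lower(observed, null)
print(f"null mean = {sum(null) / len(null):.2f}, one-sided p = {p:.4f}")
```

In this toy setting, an observed mean far below the null distribution reflects an excess of low-frequency insertions, the pattern expected under purifying selection. A real analysis would simulate under the inferred demographic history instead of the constant-size assumption made here.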
- Approximate Bayesian computation
- Comparative population genomics
- Composite likelihood optimization
- Purifying selection
- Transposable elements
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics