16Stimator: Statistical estimation of ribosomal gene copy numbers from draft genome assemblies

Matthew Perisin, Madlen Vetter, Jack A. Gilbert, Joy Bergelson

Research output: Contribution to journal › Article › peer-review


The 16S rRNA gene (16S) is an accepted marker of bacterial taxonomic diversity, even though differences in copy number obscure the relationship between amplicon and organismal abundances. Ancestral state reconstruction methods can predict 16S copy numbers through comparisons with closely related reference genomes; however, the database of closed genomes is limited. Here, we extend the reference database of 16S copy numbers to de novo assembled draft genomes by developing 16Stimator, a method to estimate 16S copy numbers when these repetitive regions collapse during assembly. Using a read depth approach, we estimate 16S copy numbers for 12 endophytic isolates from Arabidopsis thaliana and confirm estimates by qPCR. We further apply this approach to draft genomes deposited in NCBI and demonstrate accurate copy number estimation regardless of sequencing platform, with an overall median deviation of 14%. The expanded database of isolates with 16S copy number estimates increases the power of phylogenetic correction methods for determining organismal abundances from 16S amplicon surveys.
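The read depth idea can be illustrated with a minimal sketch. This is not the 16Stimator implementation, whose statistical model the abstract does not describe. It assumes a simplified estimator: when all 16S copies collapse into one assembled region, reads from every copy pile up there, so the ratio of its depth to the depth over single-copy regions approximates the copy number. The function names and the depth values are hypothetical and chosen only for illustration.

```python
from statistics import median


def estimate_copy_number(depth_16s, depth_single_copy):
    """Simplified read-depth estimate (an assumption, not the published method):
    median depth over the collapsed 16S region divided by the median depth
    over regions assumed to be single copy."""
    if not depth_16s or not depth_single_copy:
        raise ValueError("depth lists must be non-empty")
    background = median(depth_single_copy)
    if background <= 0:
        raise ValueError("single-copy background depth must be positive")
    return median(depth_16s) / background


def percent_deviation(estimate, reference):
    """Absolute deviation of an estimate from a reference copy number, in percent."""
    return abs(estimate - reference) / reference * 100


# Made-up per-base depths, for illustration only.
single_copy = [48, 50, 52, 49, 51]
collapsed_16s = [198, 205, 200, 210, 195]

estimate = estimate_copy_number(collapsed_16s, single_copy)
print(f"estimated 16S copy number: {estimate:.1f}")
print(f"deviation from a reference of 5 copies: {percent_deviation(estimate, 5):.0f}%")
```

Medians are used here so that a few positions with unusual coverage do not dominate the ratio. In practice, per-base depths would come from mapping the reads back to the draft assembly.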

Original language: English (US)
Pages (from-to): 1020-1024
Number of pages: 5
Journal: ISME Journal
Issue number: 4
State: Published - Apr 1 2016

ASJC Scopus subject areas

  • Microbiology
  • Ecology, Evolution, Behavior and Systematics

