Abstract
We consider the classic problem of estimating T, the total number of species in a population, from repeated counts in a simple random sample. We first show that the frequently used Chao-Lee estimator can in fact be obtained by Bayesian methods with a Dirichlet prior, and then use this observation to develop a new estimator; numerical tests and some real experiments show that the new estimator is more flexible than existing ones, in the sense that it adapts to changes in the normalized interspecies variance γ². Our method involves simultaneous estimation of T, γ², and the parameter λ of the Dirichlet prior, and the only limitation seems to come from the required convergence of the prior, which imposes the restriction γ² ≤ 1. We also obtain confidence intervals for T and an estimate of the species distribution. Some numerical examples are given, together with applications to sampling from a Census database closely following Benford's law, showing good performance of the new estimator, even beyond γ² = 1. Tests on confidence intervals show that the coverage frequency appears to be in good agreement with the desired confidence level.

AMS (2000) subject classification. Primary 62G05; Secondary 62P10, 62P30, 62P35.
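
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of the coverage-based estimator commonly attributed to Chao and Lee, written from the widely quoted textbook form rather than from this paper. The function name `chao_lee_estimate` and its input format (a list of per-species counts in the sample) are assumptions made for illustration; the paper's own notation, and its new Bayesian estimator, are not reproduced here.

```python
from collections import Counter


def chao_lee_estimate(counts):
    """Coverage-based species-richness estimate (illustrative sketch).

    counts: observed number of individuals for each species seen in the sample.
    Returns (t_hat, coverage, gamma2), where gamma2 is the estimated
    normalized interspecies variance, truncated at zero.
    """
    counts = [c for c in counts if c > 0]
    n = sum(counts)            # sample size
    d = len(counts)            # number of distinct species observed
    freq = Counter(counts)     # freq[i] = number of species seen exactly i times
    f1 = freq.get(1, 0)        # singletons

    if n < 2 or f1 == n:
        # Estimated coverage would be zero (or n(n-1) would vanish).
        raise ValueError("sample too small or made only of singletons")

    coverage = 1.0 - f1 / n    # Good-Turing estimate of sample coverage
    base = d / coverage        # estimate under equal species probabilities

    # Estimated squared coefficient of variation of species probabilities.
    s = sum(i * (i - 1) * f for i, f in freq.items())
    gamma2 = max(base * s / (n * (n - 1)) - 1.0, 0.0)

    # Correction term grows with heterogeneity and with unseen probability mass.
    t_hat = base + n * (1.0 - coverage) / coverage * gamma2
    return t_hat, coverage, gamma2
```

Calling `chao_lee_estimate([5, 3, 2, 1, 1, 1])` treats the input as six observed species in a sample of 13 individuals. When `gamma2` is zero the estimate reduces to the number of observed species divided by the estimated coverage.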
Original language | English (US) |
---|---|
Pages (from-to) | 80-100 |
Number of pages | 21 |
Journal | Sankhya: The Indian Journal of Statistics |
Volume | 74 |
Issue number | 1 A |
State | Published - Dec 1 2012 |
Keywords
- Bayesian posterior
- Confidence interval
- Dirichlet prior
- Point estimator
- Simple random sample
- Unobserved probability
- Unobserved species
ASJC Scopus subject areas
- Statistics, Probability and Uncertainty
- Statistics and Probability