Abstract

Nonparametric Estimators for the Number of Classes in Multiple Populations
Changxuan Mao and Bruce G. Lindsay


Consider a collection of populations, each containing infinitely many individuals. All individuals are classified into $N$ disjoint classes. The statistical goal is to estimate $N$, the total number of distinct classes in these populations, from independent abundance-based or incidence-based random subsamples drawn from each population. A collection of estimators is obtained by linear extrapolation of a multivariate log ratio function; it includes several estimators developed in the literature, such as the well-known Lincoln-Petersen estimator and some triple-system estimators used in census undercount adjustment. A second kind of estimation procedure is based on frequency-profile combinations, which allows estimation procedures for a single population to be applied directly. The asymptotic behavior of each estimator is presented, together with a bootstrap method for confidence inference. Three datasets from epidemiology, ecology and genomics are studied for illustration.
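
For orientation, the two-sample estimator named above can be written out explicitly. The notation is assumed here for illustration and is not taken from the paper: $n_{11}$ is the number of classes observed in both samples, and $n_{10}$ and $n_{01}$ are the numbers observed only in the first or only in the second sample. With $n_{11} > 0$, the Lincoln-Petersen estimator is

$$
\hat{N}_{LP} = \frac{(n_{11} + n_{10})(n_{11} + n_{01})}{n_{11}} = n_{11} + n_{10} + n_{01} + \frac{n_{10}\, n_{01}}{n_{11}} .
$$

The last term, $n_{10} n_{01} / n_{11}$, estimates the number of classes missed by both samples, under the assumption that the two samples detect classes independently.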

Key Words: Number of species; Number of shared species; Population size; Capture-recapture; Poisson mixture; Multivariate Poisson mixture; Multivariate binomial mixture; Moment estimator.