A non-parametric estimator of the lower bound of species richness.
Usage
chao1(counts, margin = 1L, cpus = n_cpus())
Arguments
- counts
A numeric matrix of count data (samples \(\times\) features). Typically contains absolute abundances (integer counts), though proportions are also accepted.
- margin
The margin containing samples.
1 if samples are rows, 2 if samples are columns. Ignored when counts is a special object class (e.g. phyloseq). Default: 1
- cpus
How many parallel processing threads should be used. The default,
n_cpus(), will use all logical CPU cores.
Details
The Chao1 estimator uses the numbers of singletons and doubletons to estimate how many species went unobserved, then adds that estimate to the observed richness: $$n + \frac{(F_1)^2}{2 F_2}$$
Where:
\(n\) : The number of observed features.
\(F_1\) : Number of features observed once (singletons).
\(F_2\) : Number of features observed twice (doubletons).
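As an illustration with made-up numbers (not taken from any dataset), a sample with \(n = 20\) observed features, \(F_1 = 6\) singletons, and \(F_2 = 3\) doubletons would give: $$20 + \frac{6^2}{2 \cdot 3} = 20 + 6 = 26$$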
Base R Equivalent:
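A minimal sketch of the formula in base R, assuming a hypothetical helper chao1_base() and made-up counts (neither comes from the package). It does not guard against samples without doubletons, where the division by zero yields Inf or NaN.

```r
# Hypothetical helper (not part of the package): Chao1 for one sample.
chao1_base <- function(x) {
  n  <- sum(x > 0)   # number of observed features
  f1 <- sum(x == 1)  # singletons
  f2 <- sum(x == 2)  # doubletons
  n + f1^2 / (2 * f2)
}

# Made-up counts: 3 samples (rows) x 6 features (columns).
counts <- matrix(
  c(1, 1, 2, 5, 0, 9,
    2, 2, 1, 0, 0, 4,
    1, 1, 1, 2, 2, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("S", 1:3), paste0("F", 1:6))
)

# A single sample.
chao1_base(counts["S1", ])

# Samples in rows, analogous to margin = 1.
apply(counts, 1, chao1_base)

# Samples in columns, analogous to margin = 2.
apply(t(counts), 2, chao1_base)
```

Both apply() calls return one estimate per sample. Only the orientation of the input differs, which is what the margin argument describes.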
Input Types
The counts parameter is designed to accept a simple numeric matrix, but
seamlessly supports objects from the following biological data packages:
- phyloseq
- rbiom
- SummarizedExperiment
- TreeSummarizedExperiment
For large datasets, standard matrix operations may be slow. See
vignette('performance') for details on using optimized formats
(e.g. sparse matrices) and parallel processing.
References
Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics, 11, 265-270.
See also
Other Richness metrics:
ace(),
margalef(),
menhinick(),
observed(),
squares()
