Simpson alpha diversity metric.
Usage
simpson(counts, cpus = n_cpus())
Arguments
- counts
An OTU abundance matrix where each column is a sample and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.
- cpus
How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.
Details
The Simpson index is a widely used metric that incorporates both richness and evenness to describe community diversity. The most common version, the Gini-Simpson index (implemented here), measures the probability that two individuals selected randomly from the community will belong to different species. The value ranges from 0 to just under 1 (at most \(1 - 1/n\) for a community of \(n\) species), where higher values indicate greater diversity. Because the calculation squares the proportional abundance of each species, the index is heavily weighted by the most abundant (dominant) taxa and is less sensitive to the presence of rare species. A low value suggests that the community is dominated by one or a few species, which makes the index a strong measure of community dominance.
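The weighting toward dominant taxa can be illustrated with a minimal base-R sketch. The helper gini_simpson() below is hypothetical (it is not part of this package) and assumes the Gini-Simpson definition of one minus the sum of squared relative abundances; the count vectors are made-up illustration data.

```r
# Hypothetical helper, not part of the package: Gini-Simpson index in base R.
gini_simpson <- function(x) {
  x <- x[x > 0]      # ignore taxa with zero abundance
  p <- x / sum(x)    # relative abundances
  1 - sum(p^2)       # chance that two random individuals differ in taxon
}

# Three abundant taxa (illustrative counts).
common <- c(500, 400, 100)

# The same community plus ten taxa observed once each.
with_rare <- c(common, rep(1, 10))

gini_simpson(common)     # about 0.58
gini_simpson(with_rare)  # about 0.588
```

Richness rises from 3 to 13 taxa, yet the index moves by less than 0.01, because each singleton contributes only a tiny squared proportion to the sum.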
Calculation
Pre-transformation: drop all OTUs with zero abundance.
In the formulas below, \(x\) is a single column (sample) from counts.
\(p\) are the relative abundances.
$$p_{i} = \displaystyle \frac{x_i}{\sum x}$$
$$D = \displaystyle 1 - \sum_{i = 1}^{n} p_{i}^{2}$$
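As a rough cross-check, the steps can be reproduced in base R. This is only a sketch: manual_simpson() is a hypothetical name, not a function from this package, and it assumes the Gini-Simpson definition given in Details (one minus the sum of squared relative abundances), applied to each column independently and without any parallel processing.

```r
# Hypothetical helper, not part of the package.
manual_simpson <- function(counts) {
  counts <- as.matrix(counts)
  apply(counts, 2, function(x) {
    x <- x[x > 0]      # pre-transformation: drop OTUs with zero abundance
    p <- x / sum(x)    # relative abundances
    1 - sum(p^2)       # Gini-Simpson index for this sample
  })
}

# Two illustrative samples: one perfectly even, one with counts 1, 2, 3.
m <- cbind(A = c(20, 20, 20, 20, 20),
           B = c(1, 2, 3, 0, 0))

manual_simpson(m)  # A = 0.8, B is about 0.6111111
```

Dropping zero-abundance OTUs does not change the result for this metric, since a zero proportion adds nothing to the sum; the step is kept here only to mirror the documented pre-transformation.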
References
Simpson EH 1949. Measurement of diversity. Nature, 163. doi:10.1038/163688a0
Examples
# Example counts matrix
ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
# Simpson diversity values
simpson(ex_counts)
#>     Saliva       Gums       Nose      Stool
#> 0.50725478 0.18924937 0.64075388 0.01295525
# Low diversity
simpson(c(100, 1, 1, 1, 1)) # 0.075
#> [1] 0.07507396
# High diversity
simpson(c(20, 20, 20, 20, 20)) # 0.8
#> [1] 0.8
# Low richness
simpson(1:3) # 0.61
#> [1] 0.6111111
# High richness
simpson(1:100) # 0.99
#> [1] 0.9867327