Observed Features

Computes the Observed Features alpha diversity metric: the number of distinct OTUs detected in each sample.

Usage

observed(counts, cpus = n_cpus())

Arguments

counts: An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.
cpus: How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.

Value

A numeric vector with one value per sample.

Details

This is the most straightforward and intuitive measure of diversity: a simple count of the number of unique microbial taxa (such as Amplicon Sequence Variants, or ASVs) detected in a sample. A higher value indicates greater richness.

While easy to understand, this metric is highly sensitive to the number of sequences per sample (sequencing depth). A sample with more sequences is more likely to detect rare taxa by chance, which inflates its richness relative to shallower samples. Therefore, Observed Features should not be compared directly between samples with different sequencing depths without first normalizing the data, typically through rarefaction (subsampling all samples to an equal depth).
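
As a rough illustration of the depth effect described above, the base R sketch below subsamples one sample to a smaller depth and recounts its features. The helper `rarefy_sample()` and the `deep` count vector are hypothetical names invented for this illustration, not functions or data documented on this page.

```r
# Hypothetical helper: randomly subsample one sample's counts down to
# `depth` reads, without replacement.
rarefy_sample <- function(x, depth) {
  stopifnot(depth <= sum(x))
  reads <- rep(seq_along(x), times = x)            # one entry per read, labelled by OTU index
  kept  <- reads[sample.int(length(reads), depth)] # pick `depth` reads at random
  tabulate(kept, nbins = length(x))                # rebuild a per-OTU count vector
}

set.seed(1)
deep    <- c(500, 300, 150, 40, 5, 3, 1, 1)  # made-up sample: 1000 reads, 8 taxa
shallow <- rarefy_sample(deep, depth = 50)   # the same community at 50 reads

sum(deep > 0)     # observed features at full depth
sum(shallow > 0)  # observed features at 50 reads; can never exceed the value above
```

Rare taxa (here, those with only a handful of reads) are the ones most likely to disappear after subsampling, which is why richness values are only comparable at equal depth.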

Calculation

Pre-transformation: drop all OTUs with zero abundance.

In the formulas below, $x$ is a single column (sample) from counts, and $n$ is the number of OTUs remaining in $x$ after the pre-transformation.

$$p_{i} = \displaystyle \frac{x_i}{\sum x}$$

$$D = \displaystyle \sum_{i = 1}^{n} 1 = n$$

  x <- c(4, 0, 3, 2, 6)[-2]  # drop the zero-abundance OTU (element 2)
  length(x)                  # count the remaining OTUs
  #> [1] 4
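
The same calculation can be sketched for a whole matrix in base R by counting the nonzero entries in each column. This is only an illustration of the formula, not necessarily how `observed()` is implemented. The `counts` matrix is typed in by hand from the example matrix on this page, and `richness` is a name chosen here.

```r
# Example counts: rows are OTUs, columns are samples.
counts <- matrix(
  c(162, 793,  22,   1,
      2,   4,   2, 611,
      0,   0, 498,   1,
    180,  87,   2,   1,
      1,   1, 251,   0,
      0,   1, 236,   1),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Streptococcus", "Bacteroides", "Corynebacterium",
      "Haemophilus", "Propionibacterium", "Staphylococcus"),
    c("Saliva", "Gums", "Nose", "Stool")
  )
)

# `counts > 0` marks detected OTUs; colSums() counts them per sample.
richness <- colSums(counts > 0)
richness
# Saliva = 4, Gums = 5, Nose = 6, Stool = 5
```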

Examples

    # Example counts matrix
    ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
    
    # Observed features
    observed(ex_counts)
#> Saliva   Gums   Nose  Stool 
#>      4      5      6      5