Introduction
Ecodive calculates ecological diversity metrics. Alpha diversity metrics provide insight about a single sample’s diversity, whereas beta diversity metrics indicate how different a pair of samples are from each other.
In this guide, we’ll use the ex_counts dataset included with ecodive. ex_counts is a feature table that enumerates how many times each bacterial genus was observed at different body sites.
library(ecodive)
ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
In this example, the ‘features’ in our feature table are genera. However, your own dataset can use whatever feature makes sense: species, OTUs, ASVs, or even something completely unrelated to ecology.
Alpha Diversity
Alpha diversity metrics describe how many different genera are present in a sample. Depending on the metric, this can take into account the number of unique genera (richness), how evenly the population is split among genera (evenness), or how distantly related the genera are (phylogenetic diversity).
Classic metrics: chao1(), shannon(), simpson(), inv_simpson()
Phylogenetic metrics: faith()
Further reading: vignette('adiv')
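To make richness and evenness concrete, here is a minimal base R sketch that computes a few classic quantities by hand for the Saliva column of ex_counts shown above. The formulas are common textbook definitions (natural-log Shannon, Gini-Simpson, inverse Simpson), and the `*_by_hand` names are made up for this illustration. Ecodive's own functions may differ in details such as normalization, so treat this as an illustration rather than the package's implementation.

```r
# Saliva counts copied from the ex_counts table above.
saliva <- c(Streptococcus = 162, Bacteroides = 2, Corynebacterium = 0,
            Haemophilus = 180, Propionibacterium = 1, Staphylococcus = 0)

# Richness: number of genera observed at least once.
richness <- sum(saliva > 0)

# Relative abundances of observed genera (zeros dropped so log() is finite).
p <- saliva[saliva > 0] / sum(saliva)

# Shannon index with natural log (assumed definition): -sum(p * log(p)).
shannon_by_hand <- -sum(p * log(p))

# Gini-Simpson index (assumed definition): 1 - sum(p^2).
simpson_by_hand <- 1 - sum(p^2)

# Inverse Simpson (assumed definition): 1 / sum(p^2).
inv_simpson_by_hand <- 1 / sum(p^2)

print(c(richness = richness, shannon = shannon_by_hand,
        simpson = simpson_by_hand, inv_simpson = inv_simpson_by_hand))
```

The Shannon value computed here, about 0.7412, agrees with the Saliva entry that shannon(counts) prints in the Classic Metrics example, because rarefaction leaves the Saliva column unchanged.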
Beta Diversity
Beta diversity metrics describe how different two samples are, based on the genera observed in each. These values are also known as “distances” or “dissimilarities”. UniFrac metrics incorporate a phylogenetic tree into this calculation.
Classic metrics: bray_curtis(), canberra(), euclidean(), gower(), jaccard(), kulczynski(), manhattan()
Phylogenetic metrics: unweighted_unifrac(), weighted_unifrac(), weighted_normalized_unifrac(), generalized_unifrac(), variance_adjusted_unifrac()
Further reading: vignette('bdiv') and vignette('unifrac').
Example
Rarefaction
The ex_counts feature table has 345 observations for saliva but 1011 for nose. This unequal sampling depth can cause systematic biases. Specifically, rare genera are more likely to be observed in samples with greater sampling depth, which artificially inflates the observed richness.
The first step, then, is to rarefy ex_counts so that all samples have the same number of observations. Rarefying randomly removes observations from the samples that have more of them.
colSums(ex_counts)
#> Saliva   Gums   Nose  Stool
#>    345    886   1011    615
counts <- rarefy(ex_counts)
colSums(counts)
#> Saliva   Gums   Nose  Stool
#>    345    345    345    345
counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  309    6     1
#> Bacteroides            2    2    0   341
#> Corynebacterium        0    0  171     1
#> Haemophilus          180   34    0     1
#> Propionibacterium      1    0   82     0
#> Staphylococcus         0    0   86     1
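The random subsampling idea behind rarefaction can be sketched in base R for a single sample. The helper `rarefy_by_hand()` below is hypothetical and is not how ecodive's rarefy() is implemented; it only shows the principle of drawing a fixed number of observations without replacement. The seed is arbitrary, so the individual counts will not match the rarefied table above.

```r
# Hypothetical helper (not part of ecodive): rarefy one sample's counts
# to `depth` observations by sampling without replacement.
rarefy_by_hand <- function(x, depth) {
  stopifnot(sum(x) >= depth)
  # Expand counts into one label per observation, e.g. 3 -> "A", "A", "A".
  pool <- rep(names(x), times = x)
  # Draw `depth` observations at random, without replacement.
  kept <- sample(pool, size = depth)
  # Tally the draws, keeping genera that dropped to zero.
  tab <- table(factor(kept, levels = names(x)))
  setNames(as.integer(tab), names(x))
}

# Nose counts copied from the ex_counts table above.
nose <- c(Streptococcus = 22, Bacteroides = 2, Corynebacterium = 498,
          Haemophilus = 2, Propionibacterium = 251, Staphylococcus = 236)

set.seed(1)  # arbitrary seed, only to make this sketch reproducible
nose_rarefied <- rarefy_by_hand(nose, depth = 345)
sum(nose_rarefied)  # always equals the requested depth
```

Because observations are drawn without replacement, no genus can end up with more observations than it started with, and rare genera are the most likely to drop to zero.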
Classic Metrics
These alpha and beta diversity metrics have been around for 50+ years and don’t require a phylogenetic tree. The beta diversity functions can take a weighted = FALSE argument to use only presence/absence information instead of relative abundances.
## Alpha Diversity -------------------
shannon(counts)
#>     Saliva       Gums       Nose      Stool
#> 0.74119910 0.35692121 1.10615349 0.07927797
## Beta Diversity --------------------
bray_curtis(counts)
#>          Saliva      Gums      Nose
#> Gums  0.4260870
#> Nose  0.9797101 0.9826087
#> Stool 0.9884058 0.9884058 0.9913043
bray_curtis(counts, weighted = FALSE)
#>          Saliva      Gums      Nose
#> Gums  0.1428571
#> Nose  0.5000000 0.7142857
#> Stool 0.3333333 0.2500000 0.3333333
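To see what the weighted argument changes, the Saliva versus Gums value can be reproduced by hand in base R. This sketch uses the standard Bray-Curtis formula, sum(|x - y|) / sum(x + y); the helper name `bray_by_hand()` is made up for illustration and is not an ecodive function.

```r
# Rarefied counts for two samples, copied from the `counts` table above.
saliva <- c(162, 2, 0, 180, 1, 0)
gums   <- c(309, 2, 0,  34, 0, 0)

# Bray-Curtis dissimilarity (standard formula): sum|x - y| / sum(x + y).
bray_by_hand <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Weighted: use the counts themselves.
bc_weighted <- bray_by_hand(saliva, gums)

# Unweighted: reduce counts to presence (1) / absence (0) first.
bc_unweighted <- bray_by_hand(as.numeric(saliva > 0), as.numeric(gums > 0))

print(round(c(weighted = bc_weighted, unweighted = bc_unweighted), 7))
```

The two results, 294/690 and 1/7, round to the 0.4260870 and 0.1428571 shown in the Gums/Saliva cells above.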
Phylogenetic Metrics
A phylogenetic tree enables alpha and beta diversity metrics to take into account evolutionary relatedness between the observed genera, generally giving higher diversity values for samples with more distantly related genera. Faith (for alpha diversity) and UniFrac (for beta diversity) are examples of phylogenetic metrics.
The ex_tree object included with ecodive provides the phylogenetic tree for the genera in ex_counts. For your own datasets, you can use ecodive’s read_tree() function to import a phylogenetic tree from a Newick-formatted string or file.
## Alpha Diversity -------------------
faith(counts, tree = ex_tree)
#> Saliva   Gums   Nose  Stool
#>    180    155    101    202
## Beta Diversity --------------------
weighted_normalized_unifrac(counts, tree = ex_tree)
#>          Saliva      Gums      Nose
#> Gums  0.4328662
#> Nose  0.7928701 0.6767840
#> Stool 0.9677535 0.9829736 0.9936121
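The intuition behind Faith's phylogenetic diversity can be sketched in base R on a made-up three-tip tree. Everything here is hypothetical: the tree, the tip names A, B, and C, and the helper `faith_by_hand()` are unrelated to ex_tree and to ecodive's implementation. The sketch assumes one common convention, in which a sample's diversity is the total length of every branch that leads to at least one observed tip.

```r
# Hypothetical toy tree: ((A:2, B:3):1, C:4). Each branch is described by
# its length and the tips that descend from it.
branch_length <- c(1, 2, 3, 4)
branch_tips   <- list(c("A", "B"), "A", "B", "C")

faith_by_hand <- function(sample_counts, branch_length, branch_tips) {
  present <- names(sample_counts)[sample_counts > 0]
  # A branch counts if at least one of its descendant tips is present.
  keep <- vapply(branch_tips, function(tips) any(tips %in% present), logical(1))
  sum(branch_length[keep])
}

# Sample with A and B observed: branches of length 1 + 2 + 3.
pd_ab <- faith_by_hand(c(A = 10, B = 5, C = 0), branch_length, branch_tips)
# Sample with only C observed: the single branch of length 4.
pd_c  <- faith_by_hand(c(A = 0, B = 0, C = 7), branch_length, branch_tips)
print(c(pd_ab = pd_ab, pd_c = pd_c))
```

Note that only presence matters in this sketch: the counts 10, 5, and 7 could be any positive numbers without changing the result.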
Distance Matrices
Beta diversity functions return a dist object. You can convert this to a standard R matrix with the as.matrix() function.
dm <- bray_curtis(counts, weighted = FALSE)
dm
#>          Saliva      Gums      Nose
#> Gums  0.1428571
#> Nose  0.5000000 0.7142857
#> Stool 0.3333333 0.2500000 0.3333333
mtx <- as.matrix(dm)
mtx
#>           Saliva      Gums      Nose     Stool
#> Saliva 0.0000000 0.1428571 0.5000000 0.3333333
#> Gums   0.1428571 0.0000000 0.7142857 0.2500000
#> Nose   0.5000000 0.7142857 0.0000000 0.3333333
#> Stool  0.3333333 0.2500000 0.3333333 0.0000000
mtx['Saliva', 'Nose']
#> [1] 0.5
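Because dist is a standard base R class, the result also works with other base R tools. The sketch below does not call ecodive: it rebuilds the same distances from the matrix printed above, then shows the reverse conversion with as.dist(), hierarchical clustering with hclust(), and a lookup of the most similar pair of samples. The choice of average linkage is arbitrary and only for illustration.

```r
# Unweighted Bray-Curtis values copied from the matrix printed above.
samples <- c("Saliva", "Gums", "Nose", "Stool")
mtx <- matrix(
  c(0.0000000, 0.1428571, 0.5000000, 0.3333333,
    0.1428571, 0.0000000, 0.7142857, 0.2500000,
    0.5000000, 0.7142857, 0.0000000, 0.3333333,
    0.3333333, 0.2500000, 0.3333333, 0.0000000),
  nrow = 4, dimnames = list(samples, samples))

# as.dist() goes the other way: square matrix -> dist object.
dm <- as.dist(mtx)

# dist objects plug directly into base R tools such as hierarchical clustering.
hc <- hclust(dm, method = "average")

# Find the most similar pair of samples using the lower triangle only.
lower <- mtx
lower[upper.tri(lower, diag = TRUE)] <- NA
closest <- which(lower == min(lower, na.rm = TRUE), arr.ind = TRUE)
closest_pair <- c(rownames(mtx)[closest[1, "row"]],
                  colnames(mtx)[closest[1, "col"]])
print(closest_pair)
```

Here the closest pair is Gums and Saliva, the pair with the smallest dissimilarity (0.1428571) in the matrix.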