Skip to contents

Unweighted UniFrac beta diversity metric.

Usage

unweighted_unifrac(counts, tree = NULL, pairs = NULL, cpus = n_cpus())

Arguments

counts

An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.

tree

A phylo-class object representing the phylogenetic tree for the OTUs in counts. The OTU identifiers given by colnames(counts) must be present in tree. Can be omitted if a tree is embedded with the counts object or as attr(counts, 'tree').

pairs

Which combinations of samples should distances be calculated for? The default value (NULL) calculates all-vs-all. Provide a numeric or logical vector specifying positions in the distance matrix to calculate. See examples.

cpus

How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.

Value

A dist object.

Calculation

Given \(n\) branches with lengths \(L\) and a pair of samples' abundances (\(A\) and \(B\)) on each of those branches:

$$D = \displaystyle \frac{\sum_{i = 1}^{n} L_i(|A_i - B_i|)}{\sum_{i = 1}^{n} L_i(max(A_i,B_i))}$$

Abundances in \(A\) and \(B\) are coded as 1 or 0 to indicate their presence or absence, respectively, on each branch.

See https://cmmr.github.io/ecodive/articles/unifrac.html for details and a worked example.

References

Lozupone C, Knight R 2005. UniFrac: A new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology, 71(12). doi:10.1128/AEM.71.12.8228-8235.2005

Examples

    # Example counts matrix
    ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
    
    # Unweighted UniFrac distance matrix
    unweighted_unifrac(ex_counts, tree = ex_tree)
#>           Saliva       Gums       Nose
#> Gums  0.05759162                      
#> Nose  0.16279070 0.11162791           
#> Stool 0.22325581 0.17209302 0.06046512
    
    # Only calculate distances for A vs all.
    unweighted_unifrac(ex_counts, tree = ex_tree, pairs = 1:3)
#>           Saliva       Gums       Nose
#> Gums  0.05759162                      
#> Nose  0.16279070         NA           
#> Stool 0.22325581         NA         NA