Weighted UniFrac beta diversity metric.

Usage

weighted_unifrac(counts, tree = NULL, pairs = NULL, cpus = n_cpus())

Arguments

counts

An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.

tree

A phylo-class object representing the phylogenetic tree for the OTUs in counts. Because each row of counts is an OTU, the OTU identifiers given by rownames(counts) must be present in tree. Can be omitted if a tree is embedded in the counts object or attached as attr(counts, 'tree').

pairs

Which combinations of samples should distances be calculated for? The default value (NULL) calculates all-vs-all. Provide a numeric or logical vector specifying positions in the distance matrix to calculate; positions that are not selected are returned as NA. See examples.

cpus

How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.

Value

A dist object.

Calculation

Given \(n\) branches with lengths \(L\), a pair of samples' abundances (\(A\) and \(B\)) on each of those branches, and the samples' total abundances (\(A_T\) and \(B_T\)):

$$D = \sum_{i = 1}^{n} L_i \left| \frac{A_i}{A_T} - \frac{B_i}{B_T} \right|$$

See vignette('unifrac') for details and a worked example.
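
The formula above can be sketched in base R. This is a minimal illustration on hypothetical toy data, not the package's implementation: the names branch_lengths and branch_abund are made up for this sketch, and the tree is assumed to be a three-branch star tree in which every branch leads to a single OTU. In a real tree, an internal branch carries the summed abundance of all OTUs descending from it.

```r
# Hypothetical toy data: 3 branches, 2 samples (not derived from ex_counts/ex_tree).
branch_lengths <- c(0.5, 1.0, 2.0)      # L_i

# Abundance of each sample on each branch (A_i and B_i).
branch_abund <- cbind(
  A = c(10, 30,  0),
  B = c( 5,  5, 10))

# Total abundance per sample (A_T and B_T). Column sums are only valid here
# because every branch in this assumed star tree is a tip.
totals <- colSums(branch_abund)

# Relative abundance of each sample on each branch.
rel <- sweep(branch_abund, 2, totals, "/")

# D = sum_i L_i * |A_i / A_T - B_i / B_T|
D <- sum(branch_lengths * abs(rel[, "A"] - rel[, "B"]))
D
#> [1] 1.5
```

Only the second and third branches differ in relative abundance (by 0.5 each), so the result is 1.0 * 0.5 + 2.0 * 0.5 = 1.5.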

References

Lozupone CA, Hamady M, Kelley ST, Knight R 2007. Quantitative and Qualitative \(\beta\) Diversity Measures Lead to Different Insights into Factors That Structure Microbial Communities. Applied and Environmental Microbiology, 73(5). doi:10.1128/AEM.01996-06

Examples

    # Example counts matrix
    ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
    
    # Weighted UniFrac distance matrix
    weighted_unifrac(ex_counts, tree = ex_tree)
#>          Saliva      Gums      Nose
#> Gums   37.08021                    
#> Nose   67.00360  55.56710          
#> Stool 110.25564 109.96250 110.14056
    
    # Only calculate distances for Saliva vs all (the first three positions).
    weighted_unifrac(ex_counts, tree = ex_tree, pairs = 1:3)
#>          Saliva      Gums      Nose
#> Gums   37.08021                    
#> Nose   67.00360        NA          
#> Stool 110.25564        NA        NA
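
To work out which positions to pass to pairs, it can help to list the sample pairs in the order a dist object stores them (down each column of the lower triangle). The sketch below uses only base R; the helper name pair_index is made up for illustration, and it assumes pairs follows the standard dist ordering, which is consistent with the 1:3 example above.

```r
samples <- c("Saliva", "Gums", "Nose", "Stool")

# All unordered sample pairs, one per row, in dist (column-wise) order.
pair_index <- t(combn(samples, 2))

# Positions of every pair that involves Saliva.
which(pair_index[, 1] == "Saliva" | pair_index[, 2] == "Saliva")
#> [1] 1 2 3

# Positions of every pair that involves Stool.
which(pair_index[, 1] == "Stool" | pair_index[, 2] == "Stool")
#> [1] 3 5 6
```

Under that assumption, the resulting integer vector could be passed as the pairs argument to restrict the calculation to one sample versus all others.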