Variance Adjusted UniFrac beta diversity metric.

Usage

variance_adjusted_unifrac(counts, tree = NULL, pairs = NULL, cpus = n_cpus())

Arguments

counts

An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.

tree

A phylo-class object representing the phylogenetic tree for the OTUs in counts. The OTU identifiers given by rownames(counts) must be present in tree. Can be omitted if a tree is embedded in the counts object or attached as attr(counts, 'tree').

pairs

The sample pairs for which distances should be calculated. The default value (NULL) calculates all-vs-all. Otherwise, provide a numeric or logical vector specifying which positions in the distance matrix to calculate; positions that are not selected are returned as NA. See examples.

cpus

The number of parallel processing threads to use. The default, n_cpus(), uses all logical CPU cores.

Value

A dist object.

Calculation

Given \(n\) branches with lengths \(L\) and a pair of samples' abundances (\(A\) and \(B\)) on each of those branches, where \(A_T\) and \(B_T\) are the total abundances of samples \(A\) and \(B\):

$$D = \displaystyle \frac{\sum_{i = 1}^{n} L_i\displaystyle \frac{|\frac{A_i}{A_T} - \frac{B_i}{B_T}|}{\sqrt{(A_i + B_i)(A_T + B_T - A_i - B_i)}} }{\sum_{i = 1}^{n} L_i\displaystyle \frac{\frac{A_i}{A_T} + \frac{B_i}{B_T}}{\sqrt{(A_i + B_i)(A_T + B_T - A_i - B_i)}} }$$

See vignette('unifrac') for details and a worked example.
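
The formula can be sketched in base R for a single pair of samples. This is only an illustration under stated assumptions, not the package's implementation: va_unifrac_pair() is a hypothetical helper, the four-branch toy tree and its counts are made up, and skipping branches whose variance term is zero is an assumption made here to avoid dividing by zero.

```r
# Hypothetical helper (not part of the package): evaluates the formula above
# for one pair of samples.
#   L        - branch lengths
#   A, B     - each sample's abundance descending from each branch
#   A_T, B_T - each sample's total abundance
va_unifrac_pair <- function(L, A, B, A_T, B_T) {

  # Variance adjustment term for each branch.
  v <- sqrt((A + B) * (A_T + B_T - A - B))

  # Assumption: branches with a zero variance term are skipped.
  keep <- v > 0

  num <- sum(L[keep] * abs(A[keep] / A_T - B[keep] / B_T) / v[keep])
  den <- sum(L[keep] * (A[keep] / A_T + B[keep] / B_T) / v[keep])

  num / den
}

# Toy tree ((t1:1, t2:1):0.5, t3:2), branches ordered: t1, t2, (t1,t2), t3.
L   <- c(1, 1, 0.5, 2)
A   <- c(4, 0, 4, 6)   # sample A: t1 = 4, t2 = 0, t3 = 6
B   <- c(0, 5, 5, 5)   # sample B: t1 = 0, t2 = 5, t3 = 5
A_T <- 10
B_T <- 10

D <- va_unifrac_pair(L, A, B, A_T, B_T)
D
```

Under these assumptions, two identical samples give a distance of 0, and two samples that share no branches give a distance of 1.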

References

Chang Q, Luan Y, Sun F (2011). Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny. BMC Bioinformatics, 12. doi:10.1186/1471-2105-12-118

Examples

    # Example counts matrix
    ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
    
    # Variance Adjusted UniFrac distance matrix
    variance_adjusted_unifrac(ex_counts, tree = ex_tree)
#>          Saliva      Gums      Nose
#> Gums  0.4242631                    
#> Nose  0.7753369 0.5565010          
#> Stool 0.9655749 0.9807634 0.9785147
    
    # Only calculate distances for Saliva vs all.
    variance_adjusted_unifrac(ex_counts, tree = ex_tree, pairs = 1:3)
#>          Saliva      Gums      Nose
#> Gums  0.4242631                    
#> Nose  0.7753369        NA          
#> Stool 0.9655749        NA        NA
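
The positions used by pairs follow the order in which a dist object stores its values, which is the same order that combn() enumerates sample pairs. The base R sketch below, using the sample names from the example above, shows one way to look up positions or build a logical vector; the variable names pair_names and involves_stool are illustrative only.

```r
samples <- c("Saliva", "Gums", "Nose", "Stool")

# Label each position of the distance matrix with its pair of samples.
pair_names <- combn(samples, 2, paste, collapse = " vs ")
pair_names[1:3]
#> [1] "Saliva vs Gums"  "Saliva vs Nose"  "Saliva vs Stool"

# Logical vector selecting every pair that involves Stool.
involves_stool <- combn(samples, 2, function(p) "Stool" %in% p)
which(involves_stool)
#> [1] 3 5 6
```

Because pairs accepts a numeric or logical vector, either which(involves_stool) or involves_stool itself should be usable as the pairs argument to restrict the calculation to Stool vs all.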