
Gower beta diversity metric.

Usage

gower(counts, weighted = TRUE, pairs = NULL, cpus = n_cpus())

Arguments

counts

An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.

weighted

If TRUE, the algorithm takes relative abundances into account. If FALSE, only presence/absence is considered.

pairs

Which combinations of samples should distances be calculated for? The default value (NULL) calculates all-vs-all. Provide a numeric or logical vector specifying positions in the distance matrix to calculate. See examples.

cpus

The number of parallel processing threads to use. The default, n_cpus(), uses all logical CPU cores.

Value

A dist object.

Calculation

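Each row (OTU) of counts is rescaled to the range 0-1 by subtracting the row minimum and dividing by the row range (maximum minus minimum). Rows in which every value is identical have a range of zero, so their values are replaced with 0.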

counts                 scaled counts
     A B C  D                 A   B   C D
OTU1 0 0 0  0    ->    OTU1 0.0 0.0 0.0 0
OTU2 0 8 9 10    ->    OTU2 0.0 0.8 0.9 1
OTU3 5 5 5  5    ->    OTU3 0.0 0.0 0.0 0
OTU4 2 0 0  0    ->    OTU4 1.0 0.0 0.0 0
OTU5 4 6 4  1    ->    OTU5 0.6 1.0 0.6 0
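
The rescaling step can be reproduced in base R. This is a minimal sketch for illustration, not the package's internal implementation; the helper name rescale_rows is hypothetical, and the matrix is the toy counts table shown above.

```r
# Toy counts matrix from the table above (rows = OTUs, columns = samples).
counts <- matrix(
  c(0, 0, 0,  0,
    0, 8, 9, 10,
    5, 5, 5,  5,
    2, 0, 0,  0,
    4, 6, 4,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:5), c("A", "B", "C", "D")))

# Hypothetical helper: min-max rescale each row to the range 0-1.
rescale_rows <- function(m) {
  lo  <- apply(m, 1, min)          # per-row minimum
  rng <- apply(m, 1, max) - lo     # per-row range
  out <- (m - lo) / rng            # vectors recycle down columns, so they align with rows
  out[rng == 0, ] <- 0             # constant rows give 0/0 = NaN; replace with 0
  out
}

scaled <- rescale_rows(counts)
scaled
```

The result should match the scaled counts column of the table, including the all-zero rows for OTU1 and OTU3.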

In the formulas below, x and y are two columns (samples) from the scaled counts. n is the number of rows (OTUs) in counts.

$$D = \displaystyle \frac{1}{n}\sum_{i = 1}^{n} |x_i - y_i|$$

  x <- c(0, 0, 0, 1, 0.6)
  y <- c(0, 0.8, 0, 0, 1)
  sum(abs(x-y)) / length(x)
  #> [1] 0.44
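
The same formula can be applied to every pair of samples at once. The sketch below assumes the scaled counts from the table above (hard-coded here so the block runs on its own) and uses base R's dist() with the Manhattan method, divided by n. It illustrates the arithmetic only and is not necessarily how gower() computes it.

```r
# Scaled counts from the table above (rows = OTUs, columns = samples).
scaled <- matrix(
  c(0.0, 0.0, 0.0, 0,
    0.0, 0.8, 0.9, 1,
    0.0, 0.0, 0.0, 0,
    1.0, 0.0, 0.0, 0,
    0.6, 1.0, 0.6, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:5), c("A", "B", "C", "D")))

n <- nrow(scaled)   # number of OTUs

# dist() compares rows, so transpose to put samples in rows.
# Manhattan distance is sum(|x_i - y_i|); dividing by n gives D.
d <- dist(t(scaled), method = "manhattan") / n
d

# The A-vs-B entry equals the 0.44 computed by hand above.
as.matrix(d)["A", "B"]
```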

References

Gower JC 1971. A general coefficient of similarity and some of its properties. Biometrics. 27(4). doi:10.2307/2528823

Gower JC, Legendre P 1986. Metric and Euclidean Properties of Dissimilarity Coefficients. Journal of Classification. 3. doi:10.1007/BF01896809

Examples

    # Example counts matrix
    ex_counts
#>                   Saliva Gums Nose Stool
#> Streptococcus        162  793   22     1
#> Bacteroides            2    4    2   611
#> Corynebacterium        0    0  498     1
#> Haemophilus          180   87    2     1
#> Propionibacterium      1    1  251     0
#> Staphylococcus         0    1  236     1
    
    # Gower weighted distance matrix
    gower(ex_counts)
#>          Saliva      Gums      Nose
#> Gums  0.2206319                    
#> Nose  0.6945328 0.7405680          
#> Stool 0.3689187 0.4138592 0.6709761
    
    # Gower unweighted distance matrix
    gower(ex_counts, weighted = FALSE)
#>          Saliva      Gums      Nose
#> Gums  0.3333333                    
#> Nose  0.3333333 0.3333333          
#> Stool 1.0000000 0.6666667 0.6666667
    
    # Only calculate distances for Saliva vs all.
    gower(ex_counts, pairs = 1:3)
#>          Saliva      Gums      Nose
#> Gums  0.2206319                    
#> Nose  0.6945328        NA          
#> Stool 0.3689187        NA        NA
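
To work out which positions to pass to pairs, it can help to list the sample pairs in the order a dist object stores them (lower triangle, column by column). The sketch below uses only base R and assumes that pairs follows this standard ordering, which is consistent with the pairs = 1:3 output above; the variable names are illustrative.

```r
samples <- c("Saliva", "Gums", "Nose", "Stool")

# combn() enumerates pairs in the same order as the entries of a dist object.
idx <- combn(length(samples), 2)
pair_names <- paste(samples[idx[1, ]], "vs", samples[idx[2, ]])

# Positions 1 to 3 are the Saliva comparisons.
pair_names[1:3]

# Positions of every pair that involves Gums (sample 2).
gums_pos <- which(idx[1, ] == 2 | idx[2, ] == 2)
pair_names[gums_pos]
```

Under that assumption, passing gums_pos as the pairs argument would restrict the calculation to Gums vs all.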