Dice-Sorensen dissimilarity

A presence/absence statistic that quantifies the dissimilarity between two samples.

Usage

sorensen(counts, margin = 1L, pairs = NULL, cpus = n_cpus())

Arguments

counts: A numeric matrix of count data (samples $\times$ features). Typically contains absolute abundances (integer counts), though proportions are also accepted.
margin: The margin containing samples. 1 if samples are rows, 2 if samples are columns. Ignored when counts is a special object class (e.g. phyloseq). Default: 1
pairs: Which combinations of samples should distances be calculated for? The default value (NULL) calculates all-vs-all. Provide a numeric or logical vector specifying positions in the distance matrix to calculate. See examples.
cpus: How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.
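
To make the pairs argument more concrete, here is a minimal base R sketch of how positions might be numbered. It assumes the positions follow the column-wise lower-triangle order used by stats::dist(); that ordering and the sample names below are assumptions for illustration, not something this page confirms.

```r
# Hypothetical sample names; any character vector works.
samples <- c("Saliva", "Gums", "Nose", "Stool")

# combn() enumerates pairs column by column, which is the same order in
# which stats::dist() stores its lower triangle.
idx <- combn(length(samples), 2)
pair_table <- data.frame(
  position = seq_len(ncol(idx)),
  sample_1 = samples[idx[1, ]],
  sample_2 = samples[idx[2, ]]
)
print(pair_table)

# A logical selector (assumed usage): keep only pairs involving "Saliva".
keep <- pair_table$sample_1 == "Saliva" | pair_table$sample_2 == "Saliva"
which(keep)
```

Under that assumed ordering, four samples yield six positions, and either the logical vector keep or the numeric vector which(keep) would select the three pairs that involve the first sample.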

Details

The Dice-Sorensen dissimilarity is defined as: $$1 - \frac{2J}{(A + B)}$$

Where:

$A$, $B$ : Number of features present (nonzero) in each sample.
$J$ : Number of features present in both samples (intersection).
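
As a worked illustration with hypothetical numbers (not taken from ex_counts), suppose one sample contains 4 features, the other contains 5, and 4 of those are shared. Taking the dissimilarity as one minus the overlap ratio gives:

$$A = 4,\quad B = 5,\quad J = 4 \quad\Rightarrow\quad 1 - \frac{2 \times 4}{4 + 5} = \frac{1}{9} \approx 0.111$$

Two samples with identical feature sets give 0, and two samples with no features in common give 1.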

Base R Equivalent:

x <- ex_counts[1,]
y <- ex_counts[2,]
1 - 2 * sum(x & y) / sum(x > 0, y > 0)
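
The same calculation can be extended to every pair of samples. The following is a sketch only: the helper name sorensen_base and the toy matrix are hypothetical, and it is not the package's implementation. It assumes non-negative counts and takes the dissimilarity as one minus the overlap ratio.

```r
# Hypothetical toy data: 3 samples (rows) x 4 features (columns).
toy <- matrix(
  c(5, 0, 2, 1,
    3, 1, 0, 4,
    0, 0, 7, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), paste0("F", 1:4))
)

# Hypothetical helper, not part of the package.
sorensen_base <- function(counts, margin = 1L) {
  if (margin == 2L) counts <- t(counts)  # put samples on rows
  present <- (counts > 0) * 1            # presence/absence as 0/1
  shared  <- tcrossprod(present)         # J for every pair; diagonal holds A and B
  n_feat  <- diag(shared)                # number of features per sample
  # A sample with no features at all would produce NaN (0 / 0) here.
  as.dist(1 - 2 * shared / outer(n_feat, n_feat, `+`))
}

sorensen_base(toy)
```

Here S1 and S2 share 2 of their 3 features each, so their dissimilarity is 1 - 4/6 = 1/3; S1 and S3 give 0.2, and S2 and S3 give 0.6.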

Input Types

The counts parameter accepts a plain numeric matrix, as well as objects from the following biological data packages:

phyloseq
rbiom
SummarizedExperiment
TreeSummarizedExperiment

For large datasets, standard matrix operations may be slow. See vignette('performance') for details on using optimized formats (e.g. sparse matrices) and parallel processing.

References

Sørensen, T. (1948). A method of establishing groups of equal amplitude in plant sociology based on similarity of species content. Kongelige Danske Videnskabernes Selskab, Biologiske Skrifter, 5, 1–34.

Dice, L. R. (1945). Measures of the amount of ecologic association between species. Ecology, 26(3), 297–302. doi:10.2307/1932409

Examples

sorensen(ex_counts)
#>           Saliva       Gums       Nose
#> Gums  0.11111111                      
#> Nose  0.20000000 0.09090909           
#> Stool 0.33333333 0.20000000 0.09090909