A measure of overlap between samples that is independent of sample size. Requires integer counts.
Usage
morisita(counts, margin = 1L, pairs = NULL, cpus = n_cpus())
Arguments
- counts
A numeric matrix of count data (samples \(\times\) features). Typically contains absolute abundances (integer counts), though proportions are also accepted.
- margin
The margin containing samples. 1 if samples are rows, 2 if samples are columns. Ignored when counts is a special object class (e.g. phyloseq). Default: 1
- pairs
Which combinations of samples should distances be calculated for? The default value (NULL) calculates all-vs-all. Provide a numeric or logical vector specifying positions in the distance matrix to calculate. See examples.
- cpus
How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.
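To make "positions in the distance matrix" concrete, the sketch below lists sample pairs in the order a base R dist object stores them. It assumes, without confirmation from this page, that pairs follows that same ordering; the sample names A to D are hypothetical.

```r
# Hypothetical illustration: four samples named A-D.
samples <- c("A", "B", "C", "D")

# combn() enumerates unordered pairs in the same order that a base R
# 'dist' object stores its lower triangle (column by column).
pair_index <- t(combn(samples, 2))
colnames(pair_index) <- c("sample_1", "sample_2")
pair_index

# Positions of every comparison that involves sample A.
# ASSUMPTION: `pairs` indexes comparisons in this 'dist' order.
a_vs_all <- which(pair_index[, "sample_1"] == "A" |
                  pair_index[, "sample_2"] == "A")
a_vs_all
```

Under that assumption, a numeric vector such as a_vs_all (or the equivalent logical vector of length 6) would restrict the calculation to the comparisons involving sample A.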
Details
The Morisita dissimilarity is defined as: $$1 - \frac{2\sum_{i=1}^{n}X_{i}Y_{i}}{\left(\frac{\sum_{i=1}^{n}X_i(X_i - 1)}{X_T(X_T - 1)} + \frac{\sum_{i=1}^{n}Y_i(Y_i - 1)}{Y_T(Y_T - 1)}\right)X_{T}Y_{T}}$$
Where:
\(X_i\), \(Y_i\) : Absolute counts of the \(i\)-th feature.
\(X_T\), \(Y_T\) : Total counts in each sample. \(X_T = \sum_{i=1}^{n} X_i\) and \(Y_T = \sum_{i=1}^{n} Y_i\).
\(n\) : The number of features.
Base R Equivalent:
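The formula above can be written directly in base R. The following is a minimal sketch, not the package's own implementation. The function name morisita_base is hypothetical, and x and y are assumed to be integer count vectors over the same features, each with a total greater than 1.

```r
# Hypothetical helper: Morisita dissimilarity between two samples.
# x, y: integer count vectors over the same features (same length).
morisita_base <- function(x, y) {
  stopifnot(length(x) == length(y))
  x_total <- sum(x)  # X_T
  y_total <- sum(y)  # Y_T

  # The two fractions inside the parentheses of the denominator.
  lambda_x <- sum(x * (x - 1)) / (x_total * (x_total - 1))
  lambda_y <- sum(y * (y - 1)) / (y_total * (y_total - 1))

  1 - (2 * sum(x * y)) / ((lambda_x + lambda_y) * x_total * y_total)
}

morisita_base(c(10, 0, 5), c(3, 7, 5))  # 113/267, about 0.4232
```

Because of the \(X_i - 1\) and \(X_T - 1\) terms, the result is only meaningful for integer counts; a sample whose total is 0 or 1 makes the denominator zero.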
Input Types
The counts parameter is designed to accept a simple numeric matrix, but
seamlessly supports objects from the following biological data packages:
- phyloseq
- rbiom
- SummarizedExperiment
- TreeSummarizedExperiment
For large datasets, standard matrix operations may be slow. See
vignette('performance') for details on using optimized formats
(e.g. sparse matrices) and parallel processing.
References
Morisita, M. (1959). Measuring of interspecific association and similarity between communities. Memoirs of the Faculty of Science, Kyushu University, Series E (Biology), 3, 65-80.
See also
beta_div(), vignette('bdiv'), vignette('bdiv_guide')
Other Abundance metrics:
aitchison(),
bhattacharyya(),
bray(),
canberra(),
chebyshev(),
chord(),
clark(),
divergence(),
euclidean(),
gower(),
hellinger(),
horn(),
jensen(),
jsd(),
lorentzian(),
manhattan(),
matusita(),
minkowski(),
motyka(),
psym_chisq(),
soergel(),
squared_chisq(),
squared_chord(),
squared_euclidean(),
topsoe(),
wave_hedges()
