Distance / dissimilarity between samples.
Usage
bdiv_table(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  md = ".all",
  within = NULL,
  between = NULL,
  delta = ".all",
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL
)

bdiv_matrix(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL
)

bdiv_distmat(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  cpus = NULL
)
Arguments
- biom
An rbiom object, such as from as_rbiom(). Any value accepted by as_rbiom() can also be given here.
- bdiv
Beta diversity distance algorithm(s) to use. Options are: "Bray-Curtis", "Manhattan", "Euclidean", "Jaccard", and "UniFrac". For "UniFrac", a phylogenetic tree must be present in biom or explicitly provided via tree=. Multiple/abbreviated values allowed. Default: "Bray-Curtis"
- weighted
Take relative abundances into account. When weighted=FALSE, only presence/absence is considered. Multiple values allowed. Default: TRUE
- normalized
Only affects the weighted UniFrac calculation, where it divides the result by the total branch weights. Default: TRUE
- tree
A phylo object representing the phylogenetic relationships of the taxa in biom. Only required when computing UniFrac distances. Default: biom$tree
- md
Dataset field(s) to include in the output data frame, or '.all' to include all metadata fields. Default: '.all'
- within, between
Dataset field(s) for intra- or inter-sample comparisons. Alternatively, dataset field names given elsewhere can be prefixed with '==' or '!=' to assign them to within or between, respectively. Default: NULL
- delta
For numeric metadata, report the absolute difference in values for the two samples, for instance 2 instead of "10 vs 12". Default: TRUE
- transform
Transformation to apply. Options are: c("none", "rank", "log", "log1p", "sqrt", "percent"). "rank" is useful for correcting for non-normal distributions before applying regression statistics. Default: "none"
- ties
When transform="rank", how to rank identical values. Options are: c("average", "first", "last", "random", "max", "min"). See rank() for details. Default: "random"
- seed
Random seed for permutations. Must be a non-negative integer. Default: 0
- cpus
The number of CPUs to use. Set to NULL to use all available, or to 1 to disable parallel processing. Default: NULL
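The ties options above defer to rank() for their meaning. As a rough illustration, the base R sketch below shows how each option ranks a tied pair. It assumes the ties values are passed through to rank()'s ties.method, which the pointer to rank() suggests but which is not confirmed here; the vector x is made up for illustration and does not come from rbiom.

```r
# Hypothetical distances containing one tie (illustrative values only).
x <- c(0.15, 0.39, 0.39, 0.74)

rank(x, ties.method = "average")  # 1.0 2.5 2.5 4.0
rank(x, ties.method = "first")    # 1 2 3 4
rank(x, ties.method = "last")     # 1 3 2 4
rank(x, ties.method = "max")      # 1 3 3 4
rank(x, ties.method = "min")      # 1 2 2 4

# "random" breaks ties at random, so fix the RNG state to make it repeatable.
set.seed(0)
r <- rank(x, ties.method = "random")
```

With "random", the two tied values receive ranks 2 and 3 in an order that depends on the random number generator state.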
Value
bdiv_matrix() - An R matrix of samples x samples.
bdiv_distmat() - A dist-class distance matrix.
bdiv_table() - A tibble data.frame with column names .sample1, .sample2, .weighted, .bdiv, .distance, and any fields requested by md. Numeric metadata fields will be returned as abs(x - y); categorical metadata fields as "x", "y", or "x vs y".
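To show how the matrix and dist return types relate, here is a minimal base R sketch that does not call rbiom. It rebuilds a samples x samples matrix by hand from the UniFrac values printed in the Examples section, then converts it with as.dist() and as.matrix(). The variable names (samples, m, d, m2) are illustrative, and the hand-built matrix lacks the "cmd" attribute that bdiv_matrix() attaches.

```r
samples <- c("HMP18", "HMP19", "HMP20", "HMP21")

# Full symmetric matrix, values copied from the bdiv_matrix() example output.
m <- matrix(
  c(0.0000000, 0.7353910, 0.7649262, 0.7705909,
    0.7353910, 0.0000000, 0.4332007, 0.3874131,
    0.7649262, 0.4332007, 0.0000000, 0.1503528,
    0.7705909, 0.3874131, 0.1503528, 0.0000000),
  nrow = 4, byrow = TRUE, dimnames = list(samples, samples)
)

d  <- as.dist(m)     # keeps only the lower triangle, as a dist object
m2 <- as.matrix(d)   # expands back to a full symmetric matrix

isSymmetric(m)       # TRUE
all.equal(m, m2)     # TRUE
```

A dist object is the form expected by functions such as hclust(), while the full matrix is convenient for indexing a specific pair, for example m["HMP20", "HMP21"].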
Metadata Comparisons
Prefix metadata fields with == or != to limit comparisons to within or between groups, respectively. For example, md = '==Sex' will run calculations only for intra-group comparisons, returning "Male" and "Female", but NOT "Female vs Male". Similarly, setting md = '!=Body Site' will only show the inter-group comparisons, such as "Saliva vs Stool", "Anterior nares vs Buccal mucosa", and so on.
The same effect can be achieved by using the within and between parameters. md = '==Sex' is equivalent to md = 'Sex', within = 'Sex'.
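The two ways of restricting comparisons can be put side by side. The sketch below is untested and assumes the rbiom package and its hmp50 dataset are available, using the same four-sample subset and md argument as the Examples section. The names prefixed and explicit are illustrative only.

```r
library(rbiom)

# Same four-sample subset used in the Examples section.
biom <- hmp50$clone()
biom$counts <- biom$counts[, c("HMP18", "HMP19", "HMP20", "HMP21")]

# Prefix form: '==' marks Sex as a within-group field.
prefixed <- bdiv_table(biom, 'unifrac', md = "==Sex")

# Explicit form: the same field passed through the `within` parameter.
explicit <- bdiv_table(biom, 'unifrac', md = "Sex", within = "Sex")

# If the two forms are equivalent as described, both list the same pairs.
prefixed[, c(".sample1", ".sample2")]
explicit[, c(".sample1", ".sample2")]
```

The prefix form keeps the restriction next to the field name, whereas within and between make the restriction visible as separate arguments.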
See also
Other beta_diversity: bdiv_boxplot(), bdiv_clusters(), bdiv_corrplot(), bdiv_heatmap(), bdiv_ord_plot(), bdiv_ord_table(), bdiv_stats(), distmat_stats()
Examples
library(rbiom)
# Subset to four samples
biom <- hmp50$clone()
biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
# Return in long format with metadata
bdiv_table(biom, 'unifrac', md = ".all")
#> # A tibble: 6 × 9
#> .sample1 .sample2 .weighted .bdiv .distance Age BMI `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <dbl> <dbl> <fct> <fct>
#> 1 HMP18 HMP19 TRUE UniFrac 0.735 0 3 Saliva vs Sto… Fema…
#> 2 HMP18 HMP20 TRUE UniFrac 0.765 1 2 Saliva vs Sto… Fema…
#> 3 HMP18 HMP21 TRUE UniFrac 0.771 4 2 Saliva vs Sto… Male
#> 4 HMP19 HMP20 TRUE UniFrac 0.433 1 5 Stool Fema…
#> 5 HMP19 HMP21 TRUE UniFrac 0.387 4 1 Stool Fema…
#> 6 HMP20 HMP21 TRUE UniFrac 0.150 5 4 Stool Fema…
# Only look at distances among the stool samples
bdiv_table(biom, 'unifrac', md = c("==Body Site", "Sex"))
#> # A tibble: 3 × 7
#> .sample1 .sample2 .weighted .bdiv .distance `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <fct> <fct>
#> 1 HMP19 HMP20 TRUE UniFrac 0.433 Stool Female
#> 2 HMP19 HMP21 TRUE UniFrac 0.387 Stool Female vs Male
#> 3 HMP20 HMP21 TRUE UniFrac 0.150 Stool Female vs Male
# Or between males and females
bdiv_table(biom, 'unifrac', md = c("Body Site", "!=Sex"))
#> # A tibble: 4 × 7
#> .sample1 .sample2 .weighted .bdiv .distance `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <fct> <fct>
#> 1 HMP18 HMP19 TRUE UniFrac 0.735 Saliva vs Stool Female vs Male
#> 2 HMP18 HMP20 TRUE UniFrac 0.765 Saliva vs Stool Female vs Male
#> 3 HMP19 HMP21 TRUE UniFrac 0.387 Stool Female vs Male
#> 4 HMP20 HMP21 TRUE UniFrac 0.150 Stool Female vs Male
# All-vs-all matrix
bdiv_matrix(biom, 'unifrac')
#> HMP18 HMP19 HMP20 HMP21
#> HMP18 0.0000000 0.7353910 0.7649262 0.7705909
#> HMP19 0.7353910 0.0000000 0.4332007 0.3874131
#> HMP20 0.7649262 0.4332007 0.0000000 0.1503528
#> HMP21 0.7705909 0.3874131 0.1503528 0.0000000
#> attr(,"cmd")
#> [1] "bdiv_matrix(biom, \"unifrac\")"
# All-vs-all distance matrix
dm <- bdiv_distmat(biom, 'unifrac')
dm
#> HMP18 HMP19 HMP20
#> HMP19 0.7353910
#> HMP20 0.7649262 0.4332007
#> HMP21 0.7705909 0.3874131 0.1503528
plot(hclust(dm))