Distance / dissimilarity between samples.
Usage
bdiv_table(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  tree = NULL,
  md = ".all",
  within = NULL,
  between = NULL,
  delta = ".all",
  transform = "none",
  ties = "random",
  seed = 0
)

bdiv_matrix(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  ties = "random",
  seed = 0
)

bdiv_distmat(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none"
)
Arguments
- biom
An rbiom object, such as from as_rbiom(). Any value accepted by as_rbiom() can also be given here.
- bdiv
Beta diversity distance algorithm(s) to use. Options are: "Bray-Curtis", "Manhattan", "Euclidean", "Jaccard", and "UniFrac". For "UniFrac", a phylogenetic tree must be present in biom or explicitly provided via tree=. Default: "Bray-Curtis". Multiple/abbreviated values allowed.
- weighted
Take relative abundances into account. When weighted=FALSE, only presence/absence is considered. Default: TRUE. Multiple values allowed.
- tree
A phylo object representing the phylogenetic relationships of the taxa in biom. Only required when computing UniFrac distances. Default: biom$tree
- md
Dataset field(s) to include in the output data frame, or '.all' to include all metadata fields. Default: '.all'
- within, between
Dataset field(s) for intra- or inter- sample comparisons. Alternatively, dataset field names given elsewhere can be prefixed with '==' or '!=' to assign them to within or between, respectively. Default: NULL
- delta
For numeric metadata, report the absolute difference in values for the two samples, for instance 2 instead of "10 vs 12". Default: ".all"
- transform
Transformation to apply. Options are: c("none", "rank", "log", "log1p", "sqrt", "percent"). "rank" is useful for correcting for non-normal distributions before applying regression statistics. Default: "none"
- ties
When transform="rank", how to rank identical values. Options are: c("average", "first", "last", "random", "max", "min"). See rank() for details. Default: "random"
- seed
Random seed for permutations. Default: 0
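To make the effect of weighted concrete, here is a minimal base R sketch using the textbook Bray-Curtis formula, sum(|x - y|) / sum(x + y). It is an illustration only and is not taken from rbiom's source: the toy counts matrix and the helper names bray_curtis() and pairwise() are hypothetical, and rbiom may normalize or compute things differently. Applying the same formula to presence/absence data gives the unweighted variant.

```r
# Toy counts (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(
  c(10, 0, 5,
     0, 4, 5,
     6, 6, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2", "S3"))
)

# Textbook Bray-Curtis dissimilarity between two abundance vectors.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Apply a two-sample dissimilarity function to every pair of columns.
pairwise <- function(m, f) {
  n   <- ncol(m)
  out <- matrix(0, n, n, dimnames = list(colnames(m), colnames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      out[i, j] <- f(m[, i], m[, j])
    }
  }
  out
}

# Weighted: abundances are used as-is.
weighted_bc <- pairwise(counts, bray_curtis)

# Unweighted: abundances are reduced to presence (1) / absence (0) first.
unweighted_bc <- pairwise((counts > 0) * 1, bray_curtis)

weighted_bc
unweighted_bc
```

In this toy data, S1 and S2 share OTU3 at equal abundance, so the weighted and unweighted values for that pair differ only because the weighted form also accounts for how large the unshared counts are.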
Value
bdiv_matrix() - An R matrix of samples x samples.
bdiv_distmat() - A dist-class distance matrix.
bdiv_table() - A tibble data.frame with columns named .sample1, .sample2, .weighted, .bdiv, .distance, and any fields requested by md. Numeric metadata fields will be returned as abs(x - y); categorical metadata fields as "x", "y", or "x vs y".
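The three return shapes carry the same pairwise information. The base R sketch below shows how a square matrix, a dist object, and a long pairwise table relate to one another. It uses a small hand-written matrix with made-up values, and the column names sample1, sample2, and distance are placeholders rather than rbiom's actual output columns.

```r
# A small symmetric dissimilarity matrix (made-up values).
m <- matrix(
  c(0.0, 0.6, 0.7,
    0.6, 0.0, 0.4,
    0.7, 0.4, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Square matrix -> dist object (stores only the lower triangle).
d <- as.dist(m)

# dist object -> square matrix again.
m2 <- as.matrix(d)

# Square matrix -> long table with one row per unordered pair.
idx  <- which(lower.tri(m), arr.ind = TRUE)
long <- data.frame(
  sample1  = colnames(m)[idx[, "col"]],
  sample2  = rownames(m)[idx[, "row"]],
  distance = m[idx],
  stringsAsFactors = FALSE
)

long
```

The dist form is what functions such as hclust() expect, while the long form is the natural shape for joining pairwise distances to per-pair metadata.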
Metadata Comparisons
Prefix metadata fields with == or != to limit comparisons to within or between groups, respectively. For example, stat.by = '==Sex' will run calculations only for intra-group comparisons, returning "Male" and "Female", but NOT "Female vs Male". Similarly, setting stat.by = '!=Body Site' will only show the inter-group comparisons, such as "Saliva vs Stool", "Anterior nares vs Buccal mucosa", and so on.
The same effect can be achieved by using the within and between parameters. stat.by = '==Sex' is equivalent to stat.by = 'Sex', within = 'Sex'.
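The within/between logic described above can be sketched in base R as a filter over sample pairs. This is an illustration of the idea only, not rbiom's implementation: the sample names S1 to S4, the sex vector, and the resulting table are all hypothetical.

```r
# Hypothetical per-sample metadata.
sex <- c(S1 = "Male", S2 = "Female", S3 = "Female", S4 = "Male")

# Every unordered pair of samples, one pair per row.
pairs <- t(combn(names(sex), 2))
x <- unname(sex[pairs[, 1]])
y <- unname(sex[pairs[, 2]])

# Label each pair as "x" when both samples agree, otherwise "x vs y"
# with the two values in alphabetical order.
label <- ifelse(
  x == y,
  x,
  ifelse(x < y, paste(x, "vs", y), paste(y, "vs", x))
)

tbl <- data.frame(
  sample1 = pairs[, 1],
  sample2 = pairs[, 2],
  Sex     = label,
  stringsAsFactors = FALSE
)

# Analogous to '==Sex': keep only intra-group pairs.
within_sex <- tbl[x == y, ]

# Analogous to '!=Sex': keep only inter-group pairs.
between_sex <- tbl[x != y, ]

within_sex
between_sex
```

With this toy metadata, the intra-group filter keeps the "Male" and "Female" pairs, and the inter-group filter keeps only "Female vs Male" pairs.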
See also
Other beta_diversity: bdiv_boxplot(), bdiv_clusters(), bdiv_corrplot(), bdiv_heatmap(), bdiv_ord_plot(), bdiv_ord_table(), bdiv_stats(), distmat_stats()
Examples
library(rbiom)
# Subset to four samples
biom <- hmp50$clone()
biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
# Return in long format with metadata
bdiv_table(biom, 'unifrac', md = ".all")
#> # A tibble: 6 × 9
#> .sample1 .sample2 .weighted .bdiv .distance Age BMI `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <dbl> <dbl> <fct> <fct>
#> 1 HMP18 HMP19 TRUE UniFrac 0.665 0 3 Saliva vs Sto… Fema…
#> 2 HMP18 HMP20 TRUE UniFrac 0.681 1 2 Saliva vs Sto… Fema…
#> 3 HMP19 HMP20 TRUE UniFrac 0.418 1 5 Stool Fema…
#> 4 HMP18 HMP21 TRUE UniFrac 0.717 4 2 Saliva vs Sto… Male
#> 5 HMP19 HMP21 TRUE UniFrac 0.390 4 1 Stool Fema…
#> 6 HMP20 HMP21 TRUE UniFrac 0.149 5 4 Stool Fema…
# Only look at distances among the stool samples
bdiv_table(biom, 'unifrac', md = c("==Body Site", "Sex"))
#> # A tibble: 3 × 7
#> .sample1 .sample2 .weighted .bdiv .distance `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <fct> <fct>
#> 1 HMP19 HMP20 TRUE UniFrac 0.418 Stool Female
#> 2 HMP19 HMP21 TRUE UniFrac 0.390 Stool Female vs Male
#> 3 HMP20 HMP21 TRUE UniFrac 0.149 Stool Female vs Male
# Or between males and females
bdiv_table(biom, 'unifrac', md = c("Body Site", "!=Sex"))
#> # A tibble: 4 × 7
#> .sample1 .sample2 .weighted .bdiv .distance `Body Site` Sex
#> <chr> <chr> <lgl> <fct> <dbl> <fct> <fct>
#> 1 HMP18 HMP19 TRUE UniFrac 0.665 Saliva vs Stool Female vs Male
#> 2 HMP18 HMP20 TRUE UniFrac 0.681 Saliva vs Stool Female vs Male
#> 3 HMP19 HMP21 TRUE UniFrac 0.390 Stool Female vs Male
#> 4 HMP20 HMP21 TRUE UniFrac 0.149 Stool Female vs Male
# All-vs-all matrix
bdiv_matrix(biom, 'unifrac')
#> HMP18 HMP19 HMP20 HMP21
#> HMP18 0.0000000 0.6651627 0.6810017 0.7170374
#> HMP19 0.6651627 0.0000000 0.4183059 0.3896741
#> HMP20 0.6810017 0.4183059 0.0000000 0.1490926
#> HMP21 0.7170374 0.3896741 0.1490926 0.0000000
#> attr(,"cmd")
#> [1] "bdiv_matrix(biom, \"unifrac\")"
# All-vs-all distance matrix
dm <- bdiv_distmat(biom, 'unifrac')
dm
#> HMP18 HMP19 HMP20
#> HMP19 0.6651627
#> HMP20 0.6810017 0.4183059
#> HMP21 0.7170374 0.3896741 0.1490926
plot(hclust(dm))