This function reduces the number of observations (reads) in each sample to
a fixed integer value (depth). Samples with fewer observations than the
specified depth are discarded.
Rarefaction is a common technique in microbiome analysis used to account for uneven sequencing effort across samples. By standardizing the library size, it allows for fair comparisons of alpha and beta diversity metrics.
Usage
rarefy(
  biom,
  depth = NULL,
  seed = 0L,
  inflate = FALSE,
  clone = TRUE,
  cpus = n_cpus()
)
Arguments
- biom
An rbiom object, or any value accepted by as_rbiom().
- depth
The number of observations to keep per sample. Must be an integer greater than 0.
If NULL (the default), a depth is automatically selected that retains at least 10% of the dataset's total abundance while maximizing the number of samples kept. See suggest_rarefy_depth() for the specific heuristic used.
Samples with total counts less than depth will be dropped from the result.
- seed
Random seed for permutations. Must be a non-negative integer. Default: 0
- inflate
Logical. Handling for non-integer data (e.g. relative abundances).
FALSE (default): The function will error if non-integers are detected. Rarefaction requires discrete counts (integers).
TRUE: The function will automatically rescale (inflate) non-integers to integers using biom_inflate() before rarefying. This is useful for 'shoehorning' metagenomic relative abundance data into diversity functions that strictly require integers.
- clone
Create a copy of biom before modifying. If FALSE, biom is modified in place as a side-effect. See speed ups for use cases. Default: TRUE
- cpus
The number of CPUs to use. Set to NULL to use all available, or to 1 to disable parallel processing. Default: NULL
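The automatic depth selection described under depth can be illustrated with a small base-R sketch. This is only one possible reading of "retain at least 10% of total abundance while keeping as many samples as possible"; sketch_depth() and min_frac are hypothetical names introduced here, and suggest_rarefy_depth() remains the authoritative heuristic.

```r
# Hypothetical helper, NOT rbiom's suggest_rarefy_depth().
# Assumption: candidate depths are the observed library sizes, and the
# smallest qualifying one is chosen because it keeps the most samples.
sketch_depth <- function(sums, min_frac = 0.10) {
  candidates <- sort(unique(sums))
  for (d in candidates) {
    kept     <- sum(sums >= d)   # samples that survive at depth d
    retained <- d * kept         # observations left after rarefying to d
    if (retained >= min_frac * sum(sums)) return(d)
  }
  max(sums)                      # fallback if no candidate qualifies
}

# Library sizes taken from the Examples section
sketch_depth(c(HMP01 = 1660, HMP02 = 1371, HMP03 = 1353, HMP04 = 1895, HMP05 = 3939))
#> [1] 1353

# A very shallow sample is sacrificed to retain more of the data
sketch_depth(c(10, 5000, 5200, 4800))
#> [1] 4800
```

In the first call the lowest library size already retains well over 10% of the total, so no sample would be dropped under this reading.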
Value
An rbiom object.
Details
Normalizes the library sizes of a dataset by randomly sub-sampling observations from each sample to a specific depth.
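As a rough illustration of the sub-sampling idea, the base-R sketch below rarefies a plain taxa-by-samples count matrix. It is not rbiom's implementation: rarefy_sketch() is a hypothetical function, it assumes integer counts, and it draws reads without replacement, which is the usual meaning of rarefaction but is an assumption here.

```r
# Hypothetical base-R sketch, not rbiom's code.
# counts: numeric matrix of integer counts, taxa in rows, samples in columns.
rarefy_sketch <- function(counts, depth, seed = 0L) {
  set.seed(seed)  # note: this resets the global RNG state

  # Drop samples with fewer observations than the requested depth
  keep   <- colSums(counts) >= depth
  counts <- counts[, keep, drop = FALSE]

  for (j in seq_len(ncol(counts))) {
    # One entry per observation, labelled by the row index of its taxon
    reads  <- rep(seq_len(nrow(counts)), times = counts[, j])
    # Draw `depth` observations without replacement
    picked <- reads[sample.int(length(reads), depth)]
    # Count how many picked observations fall in each taxon
    counts[, j] <- tabulate(picked, nbins = nrow(counts))
  }
  counts
}

counts <- matrix(
  c(5, 3, 2,   1, 1, 0,   10, 0, 10),
  nrow = 3,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("A", "B", "C"))
)

res <- rarefy_sketch(counts, depth = 8)
colSums(res)
#> A C 
#> 8 8 
```

Sample B has only 2 observations, so it is dropped; A and C are each reduced to exactly 8. Calling the function again with the same seed returns the same matrix.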
See also
suggest_rarefy_depth() for details on the default depth selection.
Other transformations:
biom_inflate(),
biom_relativize(),
biom_rescale(),
modify_metadata,
slice_metadata,
subset(),
with()
Examples
library(rbiom)
biom <- hmp50[1:5]
sample_sums(biom)
#> HMP01 HMP02 HMP03 HMP04 HMP05
#> 1660 1371 1353 1895 3939
# Rarefy to the lowest sample depth
# (All samples are kept, but counts are reduced)
biom_rare <- rarefy(biom, depth = min(sample_sums(biom)))
sample_sums(biom_rare)
#> HMP01 HMP02 HMP03 HMP04 HMP05
#> 1353 1353 1353 1353 1353
# Auto-select depth (may drop samples with low coverage)
biom_auto <- rarefy(biom)
sample_sums(biom_auto)
#> HMP01 HMP02 HMP03 HMP04 HMP05
#> 1353 1353 1353 1353 1353
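The examples above never actually drop a sample. The following sketch picks a depth above the two smallest library sizes; going by the description of depth, HMP02 (1371) and HMP03 (1353) would be expected to be removed and the remaining samples reduced to 1500. It uses only calls already shown on this page, and the output is omitted since it has not been run here.

```r
library(rbiom)

biom <- hmp50[1:5]

# Depth above the two smallest library sizes: samples with fewer than
# 1500 observations are expected to be dropped from the result.
biom_1500 <- rarefy(biom, depth = 1500)
sample_sums(biom_1500)
```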