Sub-sample OTU observations such that all samples have an equal number.
If called on data with non-integer abundances, values will be re-scaled to
integers between 1 and depth such that they sum to depth.
Usage
rarefy(
  counts,
  depth = 0.1,
  n_samples = NULL,
  seed = 0,
  times = NULL,
  cpus = n_cpus()
)
Arguments
- counts
An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.
- depth
How many observations to keep per sample. When 0 < depth < 1, it is taken as the minimum percentage of the dataset's observations to keep. Ignored when n_samples is specified. Default: 0.1
- n_samples
The number of samples to keep. When 0 < n_samples < 1, it is taken as the percentage of samples to keep. If negative, that number of samples is dropped. If 0, all samples are kept. If NULL, then depth is used instead. Default: NULL
- seed
An integer seed for randomizing which observations to keep or drop. If you need to create different random rarefactions of the same data, set the seed to a different number each time.
- times
How many independent rarefactions to perform. If set, rarefy() will return a list of matrices. The seeds for each matrix will be sequential, starting from seed.
- cpus
How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.
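The interplay of seed and times can be pictured with a small base R sketch. This is only an illustration of sequential seeding under the description above, not rbiom's internal code; the obs vector and the use of set.seed() with sample() are assumptions made for the example.

```r
# Hypothetical illustration in base R; not how rarefy() is implemented.
# One sample with 15 observations spread over four OTUs.
obs   <- rep(c("OTU1", "OTU3", "OTU4", "OTU5"), times = c(4, 3, 2, 6))
seed  <- 0
times <- 3
depth <- 14

# Sequential seeds starting from `seed`: 0, 1, 2.
seeds <- seed + seq_len(times) - 1

draws <- lapply(seeds, function(s) {
  set.seed(s)                       # one seed per rarefaction (changes the global RNG state)
  table(sample(obs, size = depth))  # keep `depth` observations, without replacement
})

length(draws)        # one result per seed
sapply(draws, sum)   # every draw keeps exactly `depth` observations
```

Re-running with the same starting seed reproduces the same list; changing the seed gives a different set of draws.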
Examples
# Create an OTU matrix with 4 samples (A-D) and 5 OTUs.
counts <- matrix(
  data = c(4,0,3,2,6,0,8,0,0,5,0,9,0,0,7,0,10,0,0,1),
  nrow = 5,
  dimnames = list(paste0('OTU', 1:5), LETTERS[1:4]) )
counts
#>      A B C  D
#> OTU1 4 0 0  0
#> OTU2 0 8 9 10
#> OTU3 3 0 0  0
#> OTU4 2 0 0  0
#> OTU5 6 5 7  1
colSums(counts)
#>  A  B  C  D
#> 15 13 16 11
counts <- rarefy(counts, depth = 14)
counts
#>      A B C D
#> OTU1 4 0 0 0
#> OTU2 0 0 9 0
#> OTU3 3 0 0 0
#> OTU4 2 0 0 0
#> OTU5 5 0 5 0
colSums(counts)
#>  A  B  C  D
#> 14  0 14  0
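
In the output above, samples B (13 observations) and D (11 observations) fall short of depth = 14 and come back as all zeros, while A and C are reduced to exactly 14. The base R sketch below mimics that idea for a single sample. The helper subsample_column() is hypothetical and written only for this illustration; it is a generic without-replacement draw, not rbiom's algorithm, and it will not reproduce the exact numbers shown above.

```r
counts <- matrix(
  data = c(4,0,3,2,6,0,8,0,0,5,0,9,0,0,7,0,10,0,0,1),
  nrow = 5,
  dimnames = list(paste0('OTU', 1:5), LETTERS[1:4]) )
depth <- 14

# Samples with fewer than `depth` observations cannot be sub-sampled to `depth`.
deep_enough <- colSums(counts) >= depth
deep_enough   # A and C are TRUE; B and D are FALSE

# Hypothetical helper: draw `depth` observations from one sample without replacement.
subsample_column <- function(x, depth) {
  pool <- rep(seq_along(x), times = x)            # one entry per observation, labelled by OTU index
  kept <- pool[sample.int(length(pool), depth)]   # sample.int() draws without replacement by default
  tabulate(kept, nbins = length(x))               # count the kept observations per OTU
}

set.seed(0)
a <- subsample_column(counts[, "A"], depth)
sum(a)                    # equals depth
all(a <= counts[, "A"])   # never more than originally observed
```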