
Sub-sample OTU observations so that every sample has the same number of observations. If called on data with non-integer abundances, values will be re-scaled to integers between 1 and depth such that they sum to depth.
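To make the idea of sub-sampling concrete, here is a minimal base-R sketch for a single sample. It is not rbiom's implementation: the helper name `rarefy_one` is hypothetical, and drawing without replacement is an assumption about how the observations are chosen.

```r
# Hypothetical helper, for illustration only (not part of rbiom).
# Sub-samples one sample's integer counts down to `depth` observations.
rarefy_one <- function(x, depth, seed = 0) {
  stopifnot(sum(x) >= depth)
  set.seed(seed)
  # Expand the counts into one entry per observation, labelled by OTU index.
  obs  <- rep(seq_along(x), times = x)
  # Draw `depth` observations without replacement (assumed behaviour).
  kept <- obs[sample.int(length(obs), size = depth)]
  # Count how many kept observations belong to each OTU.
  out  <- tabulate(kept, nbins = length(x))
  names(out) <- names(x)
  out
}

x <- c(OTU1 = 4, OTU2 = 0, OTU3 = 3, OTU4 = 2, OTU5 = 6)
r <- rarefy_one(x, depth = 14)
sum(r)  # always equals the requested depth
```

Because observations are drawn without replacement, no OTU can end up with more observations than it started with.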

Usage

rarefy(
  counts,
  depth = 0.1,
  n_samples = NULL,
  seed = 0,
  times = NULL,
  cpus = n_cpus()
)

Arguments

counts

An OTU abundance matrix where each column is a sample, and each row is an OTU. Any object coercible with as.matrix() can be given here, as well as phyloseq, rbiom, SummarizedExperiment, and TreeSummarizedExperiment objects.

depth

How many observations to keep per sample. A value strictly between 0 and 1 is instead interpreted as the minimum fraction of the dataset's observations to keep (for example, 0.1 means 10%). Ignored when n_samples is specified. Default: 0.1

n_samples

The number of samples to keep. A value strictly between 0 and 1 is interpreted as the fraction of samples to keep. A negative value drops that many samples. If 0, all samples are kept. If NULL, depth is used instead. Default: NULL
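One way to picture how n_samples could map onto a rarefaction depth is sketched below in base R. This is only an illustration under stated assumptions, not rbiom's actual logic: the helper `depth_for_n_samples` is hypothetical, it assumes the deepest samples are the ones kept, and it assumes fractional values are rounded up.

```r
# Hypothetical helper, for illustration only (not part of rbiom).
# Returns the depth at which the requested number of samples would
# still have enough observations to be kept.
depth_for_n_samples <- function(counts, n_samples) {
  sums <- sort(colSums(counts), decreasing = TRUE)
  n    <- length(sums)
  if (n_samples > 0 && n_samples < 1) {
    keep <- ceiling(n_samples * n)  # fraction of samples (rounding is assumed)
  } else if (n_samples < 0) {
    keep <- n + n_samples           # drop that many samples
  } else if (n_samples == 0) {
    keep <- n                       # keep every sample
  } else {
    keep <- n_samples               # keep exactly this many samples
  }
  keep <- max(1, min(n, keep))
  # The shallowest kept sample determines the depth.
  unname(sums[keep])
}

counts <- matrix(
  data     = c(4,0,3,2,6,0,8,0,0,5,0,9,0,0,7,0,10,0,0,1),
  nrow     = 5,
  dimnames = list(paste0('OTU', 1:5), LETTERS[1:4]) )

depth_for_n_samples(counts, 2)    # keep the two deepest samples
depth_for_n_samples(counts, -1)   # drop the shallowest sample
depth_for_n_samples(counts, 0)    # keep all four samples
```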

seed

An integer seed for randomizing which observations to keep or drop. If you need to create different random rarefactions of the same data, set the seed to a different number each time.

times

How many independent rarefactions to perform. If set, rarefy() will return a list of matrices. The seeds for each matrix will be sequential, starting from seed.
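The relationship between seed and times can be sketched in base R as follows. The `subsample` function is a hypothetical stand-in for one rarefaction of one sample, and reading "sequential, starting from seed" as seed, seed + 1, seed + 2, and so on is an assumption.

```r
# Hypothetical stand-in for a single rarefaction (not rbiom's code).
subsample <- function(x, depth, seed) {
  set.seed(seed)
  obs <- rep(seq_along(x), times = x)
  tabulate(obs[sample.int(length(obs), size = depth)], nbins = length(x))
}

x     <- c(4, 0, 3, 2, 6)
seed  <- 0
times <- 3

# One seed per replicate: seed, seed + 1, ..., seed + times - 1.
seeds <- seed + seq_len(times) - 1
reps  <- lapply(seeds, function(s) subsample(x, depth = 10, seed = s))

length(reps)  # one result per requested rarefaction
```

Under this reading, the second replicate is the same result a single call with seed + 1 would give, so any individual replicate can be reproduced on its own.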

cpus

How many parallel processing threads should be used. The default, n_cpus(), will use all logical CPU cores.

Value

An integer matrix, or a list of integer matrices when times is set.

Examples

    # Create an OTU matrix with 4 samples (A-D) and 5 OTUs.
    counts <- matrix(
      data     = c(4,0,3,2,6,0,8,0,0,5,0,9,0,0,7,0,10,0,0,1),
      nrow     = 5,
      dimnames = list(paste0('OTU', 1:5), LETTERS[1:4]) )
    counts
#>      A B C  D
#> OTU1 4 0 0  0
#> OTU2 0 8 9 10
#> OTU3 3 0 0  0
#> OTU4 2 0 0  0
#> OTU5 6 5 7  1
    colSums(counts)
#>  A  B  C  D 
#> 15 13 16 11 
    
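    # Rarefy to 14 observations per sample. Samples B and D have fewer
    # than 14 observations, so their counts are all zero in the result.
    counts <- rarefy(counts, depth = 14)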
    counts
#>      A B C D
#> OTU1 4 0 0 0
#> OTU2 0 0 9 0
#> OTU3 3 0 0 0
#> OTU4 2 0 0 0
#> OTU5 5 0 5 0
    colSums(counts)
#>  A  B  C  D 
#> 14  0 14  0