Generate a matrix of samples by taxa, at the specified taxonomic rank.

taxa.rollup(biom, rank = "OTU", map = NULL, lineage = FALSE, sparse = FALSE)

Arguments

biom

A matrix, simple_triplet_matrix, or BIOM object, as returned from read.biom. For matrices, the rows and columns are assumed to be the taxa and samples, respectively.

rank

The taxonomic rank, e.g. "OTU" or "Phylum". May also be given numerically: 0 for OTU, 1 for the highest level (i.e. Kingdom), and so on up to the number of taxonomic ranks encoded in the original biom file. See the example below for how to fetch the names of all available ranks.

map

A character matrix defining the value that each taxon ID is assigned at each taxonomic rank. If map = NULL and biom is a BIOM class object, the map is loaded automatically from biom$taxonomy. map must not be NULL when biom is a matrix or simple_triplet_matrix. See the example below for map's structure.

lineage

Include all ranks in the name of each taxon. For instance, setting this to TRUE will produce Bacteria; Actinobacteria; Coriobacteriia; Coriobacteriales, whereas setting it to FALSE (the default) will return simply Coriobacteriales. Set this to TRUE if you have genus names (such as Incertae_Sedis) that map to multiple higher-level ranks.

sparse

If TRUE, returns a sparse matrix as described by slam::simple_triplet_matrix; otherwise returns a normal R matrix object. Sparse matrices will likely be considerably more memory efficient in this scenario.
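How rank, map, and lineage interact can be illustrated with a minimal base R sketch. This is not rbiom's implementation: the function name rollup_sketch and the toy counts and map below are hypothetical, and the sketch simply sums the rows of counts that share a label at the requested rank.

```r
# Toy inputs (hypothetical data): taxa in rows, samples in columns.
counts <- matrix(
  c(5, 0, 2,
    1, 3, 0,
    0, 4, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("otu1", "otu2", "otu3"), c("S1", "S2", "S3"))
)

# One row per taxon ID, one column per rank, ordered from highest to lowest.
map <- matrix(
  c("Bacteria", "Firmicutes",     "Incertae_Sedis",
    "Bacteria", "Firmicutes",     "Incertae_Sedis",
    "Bacteria", "Actinobacteria", "Incertae_Sedis"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("otu1", "otu2", "otu3"), c("Kingdom", "Phylum", "Genus"))
)

# Hypothetical helper, for illustration only.
rollup_sketch <- function(counts, map, rank, lineage = FALSE) {
  # Numeric ranks: 0 is the OTU level, 1 is the first column of `map`, and so on.
  if (is.numeric(rank)) rank <- c("OTU", colnames(map))[rank + 1]
  stopifnot(!is.na(rank))
  if (rank == "OTU") return(counts)

  # Align the map to the count matrix and locate the requested rank.
  map  <- map[rownames(counts), , drop = FALSE]
  upto <- match(rank, colnames(map))
  stopifnot(!is.na(upto))

  labels <- if (lineage) {
    # Full lineage: join every rank from the highest down to `rank`.
    apply(map[, seq_len(upto), drop = FALSE], 1, paste, collapse = "; ")
  } else {
    map[, upto]
  }

  # rowsum() adds up the rows of `counts` that share a label.
  rowsum(counts, group = labels)
}

phyla        <- rollup_sketch(counts, map, "Phylum")
phyla_by_num <- rollup_sketch(counts, map, 2)
genus_short  <- rollup_sketch(counts, map, "Genus")
genus_full   <- rollup_sketch(counts, map, "Genus", lineage = TRUE)
```

In this toy data, genus_short collapses all three taxa into a single Incertae_Sedis row, while genus_full keeps the Firmicutes and Actinobacteria lineages as separate rows, which is the situation the lineage argument is meant to handle.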

Value

A numeric matrix with samples as column names, and taxonomic identifiers as row names.
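When sparse = TRUE the result is stored in triplet form rather than as a dense matrix. The base R sketch below shows the idea on hypothetical data, without using slam itself: only the row index, column index, and value of each non-zero cell are kept. The field names i, j, v, nrow, ncol, and dimnames are assumed here to mirror the layout of slam::simple_triplet_matrix; check the slam documentation before relying on them.

```r
# Hypothetical rolled-up counts in which most entries are zero.
dense <- matrix(0, nrow = 4, ncol = 5,
                dimnames = list(paste0("taxon", 1:4), paste0("S", 1:5)))
dense["taxon1", "S2"] <- 7
dense["taxon3", "S5"] <- 2
dense["taxon4", "S1"] <- 9

# Triplet form: keep only the position and value of each non-zero cell.
nz <- which(dense != 0, arr.ind = TRUE)
triplets <- list(
  i = unname(nz[, "row"]),   # row indices of non-zero cells
  j = unname(nz[, "col"]),   # column indices of non-zero cells
  v = dense[nz],             # the non-zero values themselves
  nrow = nrow(dense),
  ncol = ncol(dense),
  dimnames = dimnames(dense)
)

# Rebuild the dense matrix from the triplets to show that nothing was lost.
rebuilt <- matrix(0, triplets$nrow, triplets$ncol, dimnames = triplets$dimnames)
rebuilt[cbind(triplets$i, triplets$j)] <- triplets$v
```

Here 3 values are stored instead of 20 cells; the saving grows with the proportion of zero entries in the matrix.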

Examples

library(rbiom)

infile <- system.file("extdata", "hmp50.bz2", package = "rbiom")
biom <- read.biom(infile)

colnames(biom$taxonomy)
#> [1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus"

phyla <- taxa.rollup(biom, 'Phylum')
phyla[1:4,1:6]
#>                       HMP01 HMP02 HMP03 HMP04 HMP05 HMP06
#> __Actinobacteria         18    60   126   120    30    71
#> __Bacteroidetes         276   221   313   218   144   880
#> __Cyanobacteria           0     0     0     0     0     0
#> __Deinococcus_Thermus     0     0     0     0     0     0

# Custom matrices should be formatted like so:
counts <- as.matrix(biom$counts)
map <- biom$taxonomy

counts[1:3,1:6]
#>          HMP01 HMP02 HMP03 HMP04 HMP05 HMP06
#> UncO2713     0     0     0     0     0     0
#> UncO4101     1     5     6    18     5    15
#> AnmMass2     0     0     0     0     0     0
map[1:3,1:4]
#>          Kingdom    Phylum             Class              Order
#> UncO2713 "Bacteria" "__Bacteroidetes"  "__Bacteroidia"    "__Bacteroidales"
#> UncO4101 "Bacteria" "__Firmicutes"     "__Clostridia"     "__Clostridiales"
#> AnmMass2 "Bacteria" "__Actinobacteria" "__Actinobacteria" "__Actinomycetales"

phyla <- taxa.rollup(counts, 'Phylum', map=map)
phyla[1:3,1:6]
#>                  HMP01 HMP02 HMP03 HMP04 HMP05 HMP06
#> __Actinobacteria    18    60   126   120    30    71
#> __Bacteroidetes    276   221   313   218   144   880
#> __Cyanobacteria      0     0     0     0     0     0