This package is a toolkit for working with Biological Observation Matrix (BIOM) files. Features include reading and writing all BIOM formats, rarefaction, alpha diversity, beta diversity (including UniFrac), summarizing counts by taxonomic level, and sample subsetting. Standalone functions for reading, writing, and subsetting phylogenetic trees are also provided. All CPU-intensive operations are implemented in C with multithreading support.
Reference material is available online at https://cmmr.github.io/rbiom/index.html
Source code can be found at https://github.com/cmmr/rbiom
The latest stable version can be downloaded from CRAN.
The development version is available on GitHub.
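A minimal sketch of both installation routes. The CRAN call is the standard one for any CRAN package; the GitHub route assumes the `remotes` package (not named in this document) and takes the repository path from the source code URL above.

```r
# Stable release from CRAN
install.packages("rbiom")

# Development version from GitHub.
# Assumption: uses the 'remotes' package to install from the cmmr/rbiom repository.
install.packages("remotes")
remotes::install_github("cmmr/rbiom")
```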
library(rbiom)

infile <- system.file("extdata", "hmp50.bz2", package = "rbiom")
biom <- read.biom(infile)

# Rarefy to 1000 reads per sample
biom <- rarefy(biom, depth=1000)

# Summarize counts by phylum
phyla <- taxa.rollup(biom, 'Phylum')
phyla[1:4,1:6]

# Work with metadata
table(biom$metadata$Sex, biom$metadata$Body.Site)
sprintf("Mean age: %.1f", mean(biom$metadata$Age))

# Draw the phylogenetic tree
plot(biom$phylogeny)

# Get unifrac distance matrix
dm <- beta.div(biom, 'unifrac')
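Alpha diversity is listed among the features but not shown in the example. The sketch below assumes the function is named `alpha.div()`, following the dotted naming of `beta.div()` and `taxa.rollup()`, and that it returns a data frame with one row per sample. Check the reference material for the exact signature in your installed version.

```r
library(rbiom)

infile <- system.file("extdata", "hmp50.bz2", package = "rbiom")
biom   <- read.biom(infile)

# Assumption: alpha.div() accepts a BIOM object and returns a
# data frame of per-sample diversity metrics.
adiv <- alpha.div(biom)
head(adiv)
```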
By default, several functions use all available CPU cores. To limit the number of cores used, set the numThreads option:
RcppParallel::setThreadOptions(numThreads = 4)
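Rather than hard-coding a thread count, you may prefer to derive it from the machine. The following sketch leaves one core free; `n_threads` is a name chosen for illustration, and `defaultNumThreads()` is RcppParallel's helper for querying the default thread count.

```r
library(RcppParallel)

# Leave one core free for other work, but never drop below one thread.
n_threads <- max(1L, defaultNumThreads() - 1L)
setThreadOptions(numThreads = n_threads)

# To go back to using all available cores later:
# setThreadOptions(numThreads = "auto")
```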
To enable caching, which speeds up repeated operations, call init.cache().
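One way to see the effect of caching is to time the same computation twice. This is a sketch only: it assumes `init.cache()` can be called with no arguments, as written above, and that a repeated `beta.div()` call with identical inputs is eligible for caching. Actual timings depend on your machine.

```r
library(rbiom)

infile <- system.file("extdata", "hmp50.bz2", package = "rbiom")
biom   <- read.biom(infile)

# Turn on caching (assumption: defaults are acceptable).
init.cache()

# Run the same UniFrac calculation twice and record elapsed time.
first  <- system.time(dm1 <- beta.div(biom, 'unifrac'))
second <- system.time(dm2 <- beta.div(biom, 'unifrac'))

# Compare elapsed seconds; the second call may be served from the cache.
c(first = first[["elapsed"]], second = second[["elapsed"]])
```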
rbiom requires the following system libraries, which can be installed through your operating system’s package manager.
Debian/Ubuntu (apt): libudunits2-dev libssl-dev libxml2-dev libcurl4-openssl-dev libgdal-dev
Fedora/RHEL/CentOS (dnf or yum): udunits2-devel openssl-devel libxml2-devel libcurl-devel gdal-devel
Solaris (CSW): libssl_dev email@example.com libxml2_dev gdal_dev