Create an rbiom object

The fastest way to make an rbiom object is with as_rbiom(), which accepts:

  • A filepath or URL to a BIOM file.
  • An abundance matrix with OTUs in rows and samples in columns.
  • A phyloseq-class object, from the phyloseq Bioconductor R package.
  • A list with counts and, optionally, metadata, taxonomy, tree, etc. (see as_rbiom()).

library(rbiom)

# create a simple matrix ------------------------
mtx <- matrix(
  data     = floor(runif(24) * 1000), 
  nrow     = 6, 
  dimnames = list(paste0("OTU", 1:6), paste0("Sample", 1:4)) )
mtx
#>      Sample1 Sample2 Sample3 Sample4
#> OTU1      33     456     657     403
#> OTU2     779     479      45     802
#> OTU3     575      85     995     381
#> OTU4      25     295     841       8
#> OTU5     848     738     524     613
#> OTU6     387     393     651     452

# convert matrix to rbiom -----------------------
biom <- as_rbiom(biom = mtx)
biom
#> An rbiom object (2024-03-26)
#>       4 Samples:  Sample1, Sample2, Sample3, and Sample4
#>       6 OTUs:     OTU5, OTU2, OTU3, OTU6, OTU1, and OTU4
#>       1 Ranks:    .otu
#>       1 Metadata: .sample
#>         Tree:     <absent>


# convert from phyloseq to rbiom ----------------
file <- system.file("extdata", "rich_sparse_otu_table.biom", package="phyloseq")
phy  <- phyloseq::import_biom(file)
phy
#> phyloseq-class experiment-level object
#> otu_table()   OTU Table:         [ 5 taxa and 6 samples ]
#> sample_data() Sample Data:       [ 6 samples by 4 sample variables ]
#> tax_table()   Taxonomy Table:    [ 5 taxa by 7 taxonomic ranks ]

biom <- as_rbiom(biom = phy)
biom
#> Imported PhyloSeq Data (2024-03-26)
#>       6 Samples:  Sample1, Sample2, ..., and Sample6
#>       5 OTUs:     GG_OTU_2, GG_OTU_3, GG_OTU_4, ...
#>       8 Ranks:    .otu, Rank1, Rank2, ..., and Rank7
#>       5 Metadata: .sample, BarcodeSequence, ...
#>         Tree:     <absent>

Now we have biom, an rbiom-class object that can be used with this package’s functions. If you loaded your data from a BIOM file or phyloseq object, it might already include metadata, ranks, and a tree. These attributes are technically optional, but more analyses become possible when extra information about samples and OTUs is present.
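
as_rbiom() expects an abundance matrix with OTUs in rows and samples in columns, but count tables are often stored the other way around. The following is a minimal base-R sketch of reorienting such a table before conversion. The object names by_sample and mtx2 and the count values are hypothetical, chosen only for illustration.

```r
# Hypothetical input: samples in rows, OTUs in columns (made-up counts).
by_sample <- matrix(
  data     = c(10, 0, 5, 7, 3, 12),
  nrow     = 2,
  byrow    = TRUE,
  dimnames = list(paste0("Sample", 1:2), paste0("OTU", 1:3)) )

# Transpose so that OTUs become rows and samples become columns.
mtx2 <- t(by_sample)

dim(mtx2)        # number of OTUs, then number of samples
rownames(mtx2)   # OTU names
colnames(mtx2)   # sample names
```

The transposed matrix could then be passed along as as_rbiom(biom = mtx2).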

Attach metadata

$metadata lets you set arbitrary data for each sample.

A few quick rules:

  • .sample should be the first column.
  • Other column names cannot start with a dot (.).
  • Sample names need to match biom$samples.

# create example metadata -----------------------
md <- data.frame(
  .sample   = paste0("Sample", 1:4),
  state     = c("TX", "TX", "WA", "WA"),
  age       = c(32, 19, 36, 40),
  treatment = c(1, 2, 1, 2) )
md
#>   .sample state age treatment
#> 1 Sample1    TX  32         1
#> 2 Sample2    TX  19         2
#> 3 Sample3    WA  36         1
#> 4 Sample4    WA  40         2

# add metadata to rbiom object ------------------
biom <- as_rbiom(biom = mtx)
biom$metadata <- md
biom
#> An rbiom object (2024-03-26)
#>       4 Samples:  Sample1, Sample2, Sample3, and Sample4
#>       6 OTUs:     OTU5, OTU2, OTU3, OTU6, OTU1, and OTU4
#>       1 Ranks:    .otu
#>       4 Metadata: .sample, state, age, and treatment
#>         Tree:     <absent>

# or in a single step ---------------------------
biom <- as_rbiom(biom = list(counts = mtx, metadata = md))
biom
#> An rbiom object (2024-03-26)
#>       4 Samples:  Sample1, Sample2, Sample3, and Sample4
#>       6 OTUs:     OTU5, OTU2, OTU3, OTU6, OTU1, and OTU4
#>       1 Ranks:    .otu
#>       4 Metadata: .sample, state, age, and treatment
#>         Tree:     <absent>
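
The three metadata rules listed earlier can be checked in plain R before the data frame is attached. This is only a sketch: check_metadata() is a hypothetical helper, not an rbiom function, and it assumes that every metadata sample must appear among the matrix column names. Whether rbiom requires an exact match or tolerates a subset is not stated here.

```r
# Same shape as the example above: 6 OTUs by 4 samples (placeholder counts).
mtx <- matrix(
  data     = 1:24,
  nrow     = 6,
  dimnames = list(paste0("OTU", 1:6), paste0("Sample", 1:4)) )

md <- data.frame(
  .sample   = paste0("Sample", 1:4),
  state     = c("TX", "TX", "WA", "WA"),
  age       = c(32, 19, 36, 40),
  treatment = c(1, 2, 1, 2) )

# Hypothetical helper (not part of rbiom): returns a character vector
# describing each rule that the metadata table breaks.
check_metadata <- function(md, sample_names) {
  problems <- character(0)
  if (!identical(names(md)[1], ".sample"))
    problems <- c(problems, "first column is not .sample")
  if (any(startsWith(names(md)[-1], ".")))
    problems <- c(problems, "a column other than the first starts with a dot")
  if (!all(md[[1]] %in% sample_names))
    problems <- c(problems, "some sample names are not in the count matrix")
  problems
}

problems <- check_metadata(md, colnames(mtx))
length(problems) == 0   # TRUE when none of the three rules is broken
```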

Setting categorical variables

Any categorical metadata variable that is stored as numbers, such as “treatment” in the example above, is read as numeric and must be converted to a factor manually.

class(pull(biom, 'treatment'))
#> [1] "numeric"

biom$metadata$treatment %<>% as.factor()

class(pull(biom, 'treatment'))
#> [1] "factor"
pull(biom, 'treatment')
#> Sample1 Sample2 Sample3 Sample4 
#>       1       2       1       2 
#> Levels: 1 2
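
The same conversion can be done on the data frame before it is attached, using only base R. This is a minimal sketch; the labels "control" and "drug" are invented for illustration and do not come from the example data.

```r
md <- data.frame(
  .sample   = paste0("Sample", 1:4),
  treatment = c(1, 2, 1, 2) )

# Recode the numeric codes as a factor with readable (made-up) labels.
md$treatment <- factor(
  md$treatment,
  levels = c(1, 2),
  labels = c("control", "drug") )

class(md$treatment)    # now a factor rather than numeric
levels(md$treatment)   # the two labels, in the order given above
```

Descriptive labels can make later plots and tables easier to read than bare numeric codes.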

Attach a tree

Use $tree to set the tree. You can assign a phylo object directly, or a Newick file or string.

# define an example tree ------------------------
biom$tree <- "(((OTU6,(OTU5,OTU4)),OTU3),(OTU2,OTU1));"
biom
#> An rbiom object (2024-03-26)
#>       4 Samples:  Sample1, Sample2, Sample3, and Sample4
#>       6 OTUs:     OTU5, OTU2, OTU3, OTU6, OTU1, and OTU4
#>       1 Ranks:    .otu
#>       4 Metadata: .sample, state, age, and treatment
#>         Tree:     <present>
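
Before attaching a tree, it can help to confirm that its tip labels line up with the OTU names. The sketch below uses the ape package's read.tree() to parse the Newick string and compares the tips against the six OTU names from the example. How rbiom itself handles mismatched tips is not covered here, so treat this as an optional pre-check.

```r
library(ape)

newick <- "(((OTU6,(OTU5,OTU4)),OTU3),(OTU2,OTU1));"
tree   <- read.tree(text = newick)

otus <- paste0("OTU", 1:6)

# OTUs with no tip in the tree, and tips that are not OTUs.
missing_from_tree <- setdiff(otus, tree$tip.label)
extra_in_tree     <- setdiff(tree$tip.label, otus)

length(missing_from_tree)   # 0 when every OTU has a tip
length(extra_in_tree)       # 0 when every tip is a known OTU
```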

Attach taxonomy

Use $taxonomy to define taxonomic clades for each OTU.

# .otu must match otu_names(biom) ---------------
# (the 3-element vectors below are recycled across the 6 OTUs)
map <- data.frame(
  .otu   = paste0("OTU", 1:6),
  Phylum = c("Bacteroidetes", "Firmicutes", "Firmicutes"),
  Order  = c("Bacteroidia", "Clostridiales", "Bacillales") )
map
#>   .otu        Phylum         Order
#> 1 OTU1 Bacteroidetes   Bacteroidia
#> 2 OTU2    Firmicutes Clostridiales
#> 3 OTU3    Firmicutes    Bacillales
#> 4 OTU4 Bacteroidetes   Bacteroidia
#> 5 OTU5    Firmicutes Clostridiales
#> 6 OTU6    Firmicutes    Bacillales

biom$taxonomy <- map
biom
#> An rbiom object (2024-03-26)
#>       4 Samples:  Sample1, Sample2, Sample3, and Sample4
#>       6 OTUs:     OTU5, OTU2, OTU3, OTU6, OTU1, and OTU4
#>       3 Ranks:    .otu, Phylum, and Order
#>       4 Metadata: .sample, state, age, and treatment
#>         Tree:     <present>
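
The example above relies on data.frame() recycling three values across six rows, which is convenient for a demo but easy to get wrong with real data. A minimal base-R sketch that writes out one entry per OTU and checks the .otu column against the expected names:

```r
otus <- paste0("OTU", 1:6)

# One entry per OTU, written out explicitly instead of relying on recycling.
map <- data.frame(
  .otu   = otus,
  Phylum = c("Bacteroidetes", "Firmicutes", "Firmicutes",
             "Bacteroidetes", "Firmicutes", "Firmicutes"),
  Order  = c("Bacteroidia", "Clostridiales", "Bacillales",
             "Bacteroidia", "Clostridiales", "Bacillales") )

# TRUE when the .otu column covers every OTU with no duplicates.
setequal(map$.otu, otus) && !anyDuplicated(map$.otu)
```

The resulting table has the same contents as the recycled version and could be assigned to biom$taxonomy in the same way.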