Introduction
One of the most powerful features of h5lite is its
ability to map R’s nested list objects directly to the
hierarchical structure of an HDF5 file. This allows you to save and load
complex, organized data structures with a single command.
This vignette explores: * How h5_write() recursively
saves a list. * How h5_read() recursively loads a file
structure into a list. * The important detail of alphabetical ordering.
* How to manually manage the file structure using h5lite’s
organizational functions.
For an introduction to writing data, see
vignette("h5lite").
1. Recursive Writing with Lists
When you pass a list to h5_write(), it
treats it as a blueprint for creating a file structure:
- The
listitself becomes an HDF5 group. - Named elements inside the
listbecome datasets or sub-groups.
Let’s create a nested list representing a session’s data.
session_data <- list(
metadata = list(
user = "test",
version = 1.2,
timestamp = "2025-11-17"
),
raw_data = matrix(rnorm(100), ncol = 10)
)
# We can even add an attribute to the top-level list
attr(session_data, "info") <- "This is the root group attribute"
# Write the entire structure in one call
h5_write(file, "session_1", session_data, attrs = TRUE)We can now inspect the file with h5_str() to see the
hierarchy that was created.
h5_str(file)
#> /
#> └── session_1
#> ├── @info <string x 1>
#> ├── metadata
#> │ ├── user <string x 1>
#> │ ├── version <float64 x 1>
#> │ └── timestamp <string x 1>
#> └── raw_data <float64 x 10 x 10>Recursive Reading of Groups
Just as h5_write() can write a list recursively,
h5_read() can read an HDF5 group recursively. When you read
a group path, h5lite traverses its contents and builds a
nested R list that mirrors the file structure.
# Read the 'session_1' group back into a list
read_data <- h5_read(file, "session_1", attrs = TRUE)
str(read_data)
#> List of 2
#> $ metadata:List of 3
#> ..$ timestamp: chr "2025-11-17"
#> ..$ user : chr "test"
#> ..$ version : num 1.2
#> $ raw_data: num [1:10, 1:10] -1.40004 0.25532 -2.43726 -0.00557 0.62155 ...
#> - attr(*, "info")= chr "This is the root group attribute"Important: HDF5 Group Ordering
A critical detail to remember is that HDF5 groups do not
preserve the creation order of their members. When
h5_read() reads a group, it returns the elements in the
list sorted alphabetically by name.
Our original session_data$metadata list had the order
user, version, timestamp. The
read-back list has the order timestamp, user,
version.
To correctly compare a list that has been round-tripped, you should sort the original list’s elements by name first.
Manual File Organization
While recursive list writing is convenient, you often need to manage
the file structure manually. h5lite provides a suite of
simple functions for this.
Creating Groups
You can pre-build a directory structure using
h5_create_group(). It works like mkdir -p,
creating any necessary parent groups.
h5_create_group(file, "/archive/2024/run_01")
h5_ls(file, recursive = TRUE)
#> [1] "session_1" "session_1/metadata"
#> [3] "session_1/metadata/user" "session_1/metadata/version"
#> [5] "session_1/metadata/timestamp" "session_1/raw_data"
#> [7] "archive" "archive/2024"
#> [9] "archive/2024/run_01"Moving and Renaming Objects
h5_move() allows you to efficiently rename or move any
object (group or dataset). This is a fast metadata operation that does
not rewrite any data.
# Let's move our session data into the archive
h5_move(file, from = "session_1", to = "/archive/2024/run_01/data")
h5_str(file)
#> /
#> └── archive
#> └── 2024
#> └── run_01
#> └── data
#> ├── @info <string x 1>
#> ├── metadata
#> │ ├── user <string x 1>
#> │ ├── version <float64 x 1>
#> │ └── timestamp <string x 1>
#> └── raw_data <float64 x 10 x 10>Deleting Objects
Finally, you can remove any object using h5_delete().
The deletion is recursive, so if you target a group, everything inside
it will also be removed.
# Let's create a temporary group to delete
h5_write(file, "/tmp/deleteme", 1:10)
h5_ls(file, "/tmp")
#> [1] "deleteme"
# Delete just the dataset
h5_delete(file, "/tmp/deleteme")
h5_exists(file, "/tmp/deleteme") # FALSE
#> [1] FALSE
# Delete the now-empty group
h5_delete(file, "/tmp")
h5_exists(file, "/tmp") # FALSE
#> [1] FALSEThese organizational functions give you full control over the structure of your HDF5 file, allowing you to build and manage complex data hierarchies with simple, predictable commands.
# Clean up the temporary file
unlink(file)