Introduction
This article is for R package developers who want to use the HDF5 C
library in their own package. hdf5lib makes this easy by
providing a reliable, self-contained HDF5 build that you can link to
without requiring your users to install any system dependencies.
We will walk through the four main steps to link your package to
hdf5lib and then demonstrate a complete example using
Rcpp.
The 4 Steps to Link Your Package
To use hdf5lib in your package, you need to:
- Declare the dependency in your DESCRIPTION file.
- Tell the compiler where to find the hdf5lib headers and libraries in a src/Makevars file.
- Manage the global filter state to initialize compression plugins.
- Include the HDF5 headers in your C/C++ code.
Step 1: Update DESCRIPTION
Add hdf5lib to the LinkingTo field in your
DESCRIPTION file. This tells R that your package needs to
access header files from hdf5lib.
(We’ve also added Rcpp because we’ll use it in our
example below.)
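Reconstructed from the description above, the relevant DESCRIPTION fields might look like the following fragment. The LinkingTo line follows directly from the text; the Imports line is an assumption based on common Rcpp practice and is not stated in this article.

```
LinkingTo: hdf5lib, Rcpp
Imports: Rcpp
```

LinkingTo only makes the headers visible at compile time; the linker flags come from src/Makevars in Step 2.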
Step 2: Create src/Makevars
Create a file named Makevars inside your package’s
src/ directory. This file provides instructions to the
compiler. hdf5lib provides two helper functions,
c_flags() and ld_flags(), that output the
exact strings needed.
Add the following lines to src/Makevars:
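A plausible form of those lines is sketched here. It assumes that c_flags() and ld_flags() return character strings that can be printed with cat(), and that the compiler flags belong in PKG_CPPFLAGS; neither detail is confirmed by this article, so check the hdf5lib documentation for the exact recipe.

```
PKG_CPPFLAGS = `$(R_HOME)/bin/Rscript -e "cat(hdf5lib::c_flags())"`
PKG_LIBS = `$(R_HOME)/bin/Rscript -e "cat(hdf5lib::ld_flags())"`
```

The backticks make the build run Rscript at compile time and substitute its output, so the paths always match the installed copy of hdf5lib.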
Step 3: Manage Global Filter State (Important)
To use the bundled compression plugins (LZ4, Zstd, Blosc, etc.), you must register them with the HDF5 library. Registering filters modifies global state and spins up background thread pools (via Blosc2), so never register filters on a per-I/O basis. Instead, expose the registration functions to R and call them exactly once during your package’s load/unload cycle.
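The "exactly once" requirement can also be guarded on the native side, which may help if several entry points could otherwise trigger registration. The sketch below is plain C++ and does not link against hdf5lib: fake_register_all_filters() is a stand-in that only counts its calls, and ensure_filters_registered() and register_call_count() are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Stand-in for hdf5lib_register_all_filters(); it only counts its calls so
// the sketch compiles without hdf5lib. This is NOT the real function.
static int g_register_calls = 0;
static void fake_register_all_filters() { ++g_register_calls; }

// Hypothetical guard: however often it is called, the registration routine
// runs a single time per process.
void ensure_filters_registered() {
    // C++11 guarantees that a function-local static is initialised once,
    // even if several threads reach this line at the same time.
    static const bool registered = (fake_register_all_filters(), true);
    (void)registered;
    assert(g_register_calls == 1);  // invariant: registered exactly once
}

// Number of times the stand-in actually ran.
int register_call_count() { return g_register_calls; }
```

A guard like this complements the load/unload hooks rather than replacing them: it does not cover teardown, which still belongs in a single, well-defined place.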
Create C Wrappers (e.g., in
src/init.c):
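#include <Rinternals.h>
#include "hdf5lib.h"

/* Register every bundled filter with HDF5; called once from .onLoad(). */
SEXP r_register_hdf5_filters(void) {
    hdf5lib_register_all_filters();
    return R_NilValue;
}

/* Tear the filters down again; called once from .onUnload(). */
SEXP r_destroy_hdf5_filters(void) {
    hdf5lib_destroy_all_filters();
    return R_NilValue;
}

Hook into R Package Load (e.g., in
R/zzz.R):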
.onLoad <- function(libname, pkgname) {
  # Register plugins and spin up Blosc thread pools once per session
  .Call("r_register_hdf5_filters", PACKAGE = pkgname)
}

.onUnload <- function(libpath) {
  # Cleanly tear down threads and free memory to prevent Valgrind warnings.
  # .onUnload() receives no pkgname argument, so the package name is spelled out.
  .Call("r_destroy_hdf5_filters", PACKAGE = "myhdf5package")
}

Example: Creating a Package with Rcpp
Let’s create a minimal R package that uses Rcpp to
provide one function: get_hdf5_version(), which calls the
HDF5 C library and returns its version string.
C++ Source Code (src/hdf5_helpers.cpp)
#include <Rcpp.h>
#include <hdf5.h>
#include <cstdio>
#include <string>
#include <vector>

//' Get the version of the linked HDF5 library
//'
//' @export
// [[Rcpp::export]]
Rcpp::String get_hdf5_version() {
    unsigned int majnum, minnum, relnum;

    // Call the HDF5 C function
    herr_t status = H5get_libversion(&majnum, &minnum, &relnum);
    if (status < 0) {
        Rcpp::stop("Failed to get HDF5 library version.");
    }

    // Format the version string (snprintf is declared in <cstdio>)
    std::vector<char> version_str(20);
    std::snprintf(version_str.data(), version_str.size(), "%u.%u.%u", majnum, minnum, relnum);
    return Rcpp::String(version_str.data());
}

Build and Run
- Run Rcpp::compileAttributes() to generate the Rcpp export files.
- Build and install your package.
Now, you can use your new function from R:
library(myhdf5package)
get_hdf5_version()
#> [1] "2.1.1"
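The formatting step inside get_hdf5_version() can be tried on its own, outside R and without HDF5 installed. The sketch below is plain C++; format_version() is a hypothetical helper name that belongs to neither HDF5 nor hdf5lib, and the buffer size assumes a 32-bit unsigned int.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of HDF5 or hdf5lib): joins three version
// components into "major.minor.release", as get_hdf5_version() does.
std::string format_version(unsigned int majnum, unsigned int minnum,
                           unsigned int relnum) {
    // Assuming 32-bit unsigned int: at most 10 digits per component, plus
    // two dots and the terminating NUL, which is 33 bytes in total.
    char buf[33];
    int written = std::snprintf(buf, sizeof buf, "%u.%u.%u",
                                majnum, minnum, relnum);
    // snprintf returns the length it wanted to write; it must fit in buf.
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(written));
}
```

For example, format_version(1, 14, 6) returns the string 1.14.6. Sizing the buffer for the worst case avoids silent truncation, which a fixed 20-byte buffer would allow for very large component values.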