
Jobs go in. Results come out.

Usage

jobqueue(
  globals = NULL,
  packages = NULL,
  namespace = NULL,
  init = NULL,
  max_cpus = availableCores(),
  workers = ceiling(max_cpus * 1.2),
  timeout = NULL,
  hooks = NULL,
  reformat = NULL,
  signal = FALSE,
  cpus = 1L,
  stop_id = NULL,
  copy_id = NULL
)

Arguments

globals

A named list of variables that all <job>$exprs will have access to. Alternatively, an object that can be coerced to a named list with as.list(), e.g. named vector, data.frame, or environment.
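To see what the coercion described above produces, the following base R sketch applies as.list() to each kind of input. It does not call jobqueue itself, and the variable names are illustrative only.

```r
# A named vector yields one list element per name.
from_vector <- as.list(c(N = 42, M = 7))

# A data.frame yields one list element per column.
from_df <- as.list(data.frame(id = 1:3, label = c("a", "b", "c")))

# An environment yields one list element per binding.
env <- new.env()
assign("N", 42, envir = env)
from_env <- as.list(env)

names(from_vector)
names(from_df)
names(from_env)
```

In each case the element names are what a job expression would refer to, so unnamed elements are best avoided.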

packages

Character vector of package names to load on workers.

namespace

The name of a package to attach to the worker's environment.

init

A call or R expression wrapped in curly braces to evaluate on each worker just once, immediately after start-up. Will have access to variables defined by globals and assets from packages and namespace. Returned value is ignored.

max_cpus

Total number of CPU cores that can be reserved by all running jobs (sum(<job>$cpus)). Does not enforce limits on actual CPU utilization.

workers

How many background worker processes to start. Set to more than max_cpus to enable standby workers to quickly swap out with workers that need to restart.
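As a rough illustration of the default shown under Usage, this base R sketch evaluates ceiling(max_cpus * 1.2) for a few hypothetical core counts. The standby column assumes every running job reserves exactly one CPU, which is an assumption made for the arithmetic and not something the queue enforces.

```r
# Hypothetical core counts, chosen only for illustration.
max_cpus <- c(1, 2, 4, 8)

# The default worker count from the Usage section.
workers <- ceiling(max_cpus * 1.2)

# Workers left idle when max_cpus one-CPU jobs are running.
standby <- workers - max_cpus

data.frame(max_cpus, workers, standby)
```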

timeout

A named numeric vector indicating the maximum number of seconds allowed for each state the job passes through, or 'total' to apply a single timeout from 'submitted' to 'done'. Can also limit the 'starting' state for workers. A function (job) can be used in place of a number. Example: timeout = c(total = 2.5, running = 1). See vignette('stops').
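A minimal sketch of the timeout vector in use, assuming the jobqueue package is installed. It reuses the limits from the example in the description and submits a job that deliberately sleeps past them. The exact form of the returned object is not stated here, so the sketch inspects it instead of assuming it.

```r
library(jobqueue)

# Limits taken from the description: 2.5 seconds overall,
# 1 second in the 'running' state.
jq <- jobqueue(workers = 1, timeout = c(total = 2.5, running = 1))

# This job sleeps well past the 'running' limit.
job <- jq$run({ Sys.sleep(10); "finished" })

# Look at what came back rather than assuming its shape.
class(job$result)

jq$stop()
```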

hooks

A named list of functions to run when the job state changes, of the form hooks = list(created = function (job) {...}). Or a function (job) that returns the same. Names of job hooks are typically 'created', 'submitted', 'queued', 'dispatched', 'starting', 'running', 'done', or '*' (duplicates okay). See vignette('hooks').

reformat

Set reformat = function (job) to define what <job>$result should return. The default, reformat = NULL, passes <job>$output to <job>$result unchanged. See vignette('results').
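A small sketch of a reformat function, assuming the jobqueue package is installed and that <job>$output can be read inside the function as the description suggests. The wrapper fields value and label are hypothetical names chosen for this example.

```r
library(jobqueue)

# Hypothetical reformat: wrap the raw output in a labelled list.
jq <- jobqueue(
  workers  = 1,
  reformat = function (job) list(value = job$output, label = "wrapped")
)

job <- jq$run({ 6 * 7 })

# <job>$result now returns whatever the reformat function built.
str(job$result)

jq$stop()
```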

signal

Should calling <job>$result signal condition objects? When FALSE, <job>$result will return the condition object without taking additional action. Setting to TRUE or a character vector of condition classes, e.g. c('interrupt', 'error', 'warning'), will cause the equivalent of stop(<condition>) to be called when those conditions are produced. Alternatively, a function (job) that returns TRUE or FALSE. See vignette('results').
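The sketch below, which assumes the jobqueue package is installed, shows one way to handle a signaled condition: with signal = TRUE, an error raised inside the job is raised again when <job>$result is read, so the access is wrapped in tryCatch(). The error message is made up for the example.

```r
library(jobqueue)

jq  <- jobqueue(workers = 1, signal = TRUE)
job <- jq$run({ stop("something went wrong") })

# With signal = TRUE the error surfaces here, so catch it.
outcome <- tryCatch(
  job$result,
  error = function (e) paste("caught:", conditionMessage(e))
)
outcome

jq$stop()
```

With the default signal = FALSE, the same access would return the condition object itself, which can then be tested with inherits().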

cpus

The default number of CPU cores to reserve per job, or a function (job) that returns the number of CPU cores to reserve for a given job. Used to limit the number of jobs running simultaneously to respect <jobqueue>$max_cpus. Does not prevent a job from using more CPUs than reserved.
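The reservation arithmetic can be sketched in base R. This only models the stated rule that the sum of reserved CPUs across running jobs must stay within max_cpus. The helper fits() is hypothetical, and the order in which the real queue picks waiting jobs is not described here.

```r
max_cpus <- 4L

# If every job reserves the same number of cores, this many can run at once.
cpus_per_job <- c(1L, 2L, 3L, 4L)
concurrent   <- max_cpus %/% cpus_per_job
concurrent

# Mixed case: two jobs are running and reserve 2 and 1 cores.
running <- c(2L, 1L)

# Hypothetical check: would one more job with this request fit?
fits <- function (request) sum(running) + request <= max_cpus

fits(1L)
fits(2L)
```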

stop_id

If an existing job in the jobqueue has the same stop_id, that job will be stopped and return an 'interrupt' condition object as its result. stop_id can also be a function (job) that returns the stop_id to assign to a given job. A stop_id of NULL disables this feature. See vignette('stops').

copy_id

If an existing job in the jobqueue has the same copy_id, the newly submitted job will become a "proxy" for that earlier job, returning whatever result the earlier job returns. copy_id can also be a function (job) that returns the copy_id to assign to a given job. A copy_id of NULL disables this feature. See vignette('stops').
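The difference between stop_id and copy_id can be illustrated with a toy model in base R. This is not the package's implementation: the submit() function, the job fields, and the choice to check copy_id before stop_id are all assumptions made for the sketch. It only mirrors the two rules as described: a matching stop_id stops the earlier job, and a matching copy_id turns the new job into a proxy for the earlier one.

```r
# Toy model only: jobs are plain lists, ids are matched by value.
submit <- function (jobs, name, stop_id = NULL, copy_id = NULL) {
  job <- list(name = name, stop_id = stop_id, copy_id = copy_id,
              state = "running", proxy_for = NA_character_)

  for (i in seq_along(jobs)) {
    old <- jobs[[i]]
    if (old$state != "running") next

    # Same copy_id: the new job becomes a proxy for the earlier one.
    if (!is.null(copy_id) && identical(old$copy_id, copy_id)) {
      job$state     <- "proxy"
      job$proxy_for <- old$name
      break
    }

    # Same stop_id: the earlier job is stopped. In the real queue its
    # result would be an 'interrupt' condition object.
    if (!is.null(stop_id) && identical(old$stop_id, stop_id)) {
      jobs[[i]]$state <- "stopped"
    }
  }

  c(jobs, list(job))
}

jobs <- list()
jobs <- submit(jobs, "a", stop_id = "plot")
jobs <- submit(jobs, "b", stop_id = "plot")    # stops "a"
jobs <- submit(jobs, "c", copy_id = "report")
jobs <- submit(jobs, "d", copy_id = "report")  # becomes a proxy for "c"

states <- vapply(jobs, function (j) j$state, character(1))
states
```

In short, stop_id favors the newest submission, while copy_id favors the oldest one that is still running.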

Value

A jobqueue object.

Examples


jq <- jobqueue(globals = list(N = 42), workers = 2)
print(jq)
#> ── Q1 <jobqueue/R6> ────────────────────── idle ──
#> 
#>      0 jobs - 0 are running
#>      2 workers - 0 are busy
#>      0 of 4 CPUs are currently in use
#> 
#> ────────────── 0 jobs run in 1 secs ──────────────

job <- jq$run({ paste("N is", N) })
job$result
#> [1] "N is 42"

jq$stop()