sm_snapshot() serializes an sm_corpus to disk as a compressed RDS file. When no path is supplied, the corpus hash is embedded in the generated filename for traceability.

sm_snapshot_load() reads a previously saved snapshot and validates that the loaded object is a well-formed sm_corpus.

Usage

sm_snapshot(
  corpus,
  path = NULL,
  compress = c("xz", "gzip", "bzip2", "none"),
  call = rlang::caller_env()
)

sm_snapshot_load(path, call = rlang::caller_env())

Arguments

corpus

An sm_corpus object.

path

Character. File path for the snapshot. If NULL, a path is generated in the current working directory using the corpus hash.

compress

Character. Compression method passed to saveRDS(). One of "xz" (the default; typically the smallest files, but the slowest to write), "gzip", "bzip2", or "none".

call

Caller environment for error reporting.
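
The interaction between path and compress described above can be illustrated with a rough base R sketch. This is not the package's implementation: snapshot_sketch(), the "snapshot_" filename prefix, the MD5-based hash, and the 12-character truncation are all assumptions made for illustration. The one confirmed detail is that saveRDS() has no "none" option, so the sketch maps it to compress = FALSE.

```r
# Hypothetical helper, NOT part of the package: shows one way a hash-based
# default path and the four compress choices could be handled in base R.
snapshot_sketch <- function(x, path = NULL,
                            compress = c("xz", "gzip", "bzip2", "none")) {
  compress <- match.arg(compress)

  if (is.null(path)) {
    # Assumption: derive a hash from the uncompressed serialized object
    # and keep the first 12 hex characters for the filename.
    tmp <- tempfile(fileext = ".rds")
    on.exit(unlink(tmp), add = TRUE)
    saveRDS(x, tmp, compress = FALSE)
    hash <- substr(unname(tools::md5sum(tmp)), 1, 12)
    path <- file.path(getwd(), paste0("snapshot_", hash, ".rds"))
  }

  # saveRDS() accepts "xz", "gzip", "bzip2", or a logical; FALSE disables
  # compression, which is how "none" is expressed here.
  saveRDS(x, path, compress = if (compress == "none") FALSE else compress)

  # Return the path invisibly, mirroring the documented return value.
  invisible(path)
}
```

Calling snapshot_sketch(obj) with no path writes a file named after the hash into the working directory, while snapshot_sketch(obj, path = tempfile(fileext = ".rds")) uses the given path unchanged.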

Value

For sm_snapshot(): the file path (invisibly). For sm_snapshot_load(): an sm_corpus object.
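
For readers wondering what "validates that the loaded object is a well-formed sm_corpus" might involve, here is a minimal base R sketch. It is an illustration under stated assumptions, not the package's actual checks: load_sketch() is a hypothetical name, and the two conditions (an "sm_corpus" class attribute and a works data frame) are guesses based on the example output rather than documented requirements.

```r
# Hypothetical helper, NOT part of the package: reads an RDS file and
# applies two assumed structural checks before returning the object.
load_sketch <- function(path) {
  if (!file.exists(path)) {
    stop("Snapshot file not found: ", path, call. = FALSE)
  }

  obj <- readRDS(path)

  # Assumption: a corpus carries the "sm_corpus" class.
  if (!inherits(obj, "sm_corpus")) {
    stop("Object in ", path, " is not an sm_corpus.", call. = FALSE)
  }

  # Assumption: a corpus is a list with a `works` data frame.
  if (!is.list(obj) || !is.data.frame(obj[["works"]])) {
    stop("sm_corpus in ", path, " has no `works` data frame.", call. = FALSE)
  }

  obj
}
```

Failing early with a clear message is useful here because readRDS() will happily return any R object, so a stale or unrelated .rds file would otherwise surface as a confusing error further downstream.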

Examples

corpus <- sm_example_corpus()
path <- tempfile(fileext = ".rds")
sm_snapshot(corpus, path = path)
#>  Corpus snapshot saved to /tmp/RtmpIQMgiF/file237076e2a90a.rds.
#>  Size: 122K | Hash: ea446b5f4465

loaded <- sm_snapshot_load(path)
#>  Loaded corpus from /tmp/RtmpIQMgiF/file237076e2a90a.rds.
#>  200 works, 80 authors.
identical(nrow(corpus$works), nrow(loaded$works))
#> [1] TRUE