Computes a content-addressable SHA-256 hash of the corpus. The hash covers the works, authors, authorships, references, concepts, screening, and provenance tables. Metadata, which contains timestamps, is excluded from the hash so that corpus content can be compared across sessions.
Two corpora with identical scientific content will produce the same hash, regardless of when they were built.
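The idea of hashing content while ignoring build-time metadata can be illustrated with a minimal sketch. This is not the package's actual implementation: it assumes the digest package is installed and that a corpus is a plain named list of data frames, and the helper hash_content() and the toy corpora a and b are hypothetical names introduced only for illustration.

```r
# Illustrative sketch only: NOT the package's actual implementation.
# Assumes the digest package is installed and that a corpus is a plain
# named list of data frames (a hypothetical layout).
library(digest)

hash_content <- function(corpus, exclude = "metadata") {
  # Keep only the content elements, in sorted order, so that neither the
  # excluded metadata nor the order of list elements affects the result.
  keep <- sort(setdiff(names(corpus), exclude))
  digest(corpus[keep], algo = "sha256")
}

# Two toy corpora with the same content but different build timestamps.
works <- data.frame(id = c("W1", "W2"), title = c("A", "B"))
a <- list(works = works, metadata = list(built = "2024-01-01T00:00:00Z"))
b <- list(metadata = list(built = "2025-06-30T12:00:00Z"), works = works)

identical(hash_content(a), hash_content(b))
#> [1] TRUE
```

Because the timestamped element is dropped before hashing, the two toy corpora hash identically, while any change to the works table would change the digest.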
See also
Other reproducibility:
sm_certificate(),
sm_cite_corpus(),
sm_diff_corpora(),
sm_provenance(),
sm_snapshot()
Examples
corpus <- sm_example_corpus()
sm_hash_corpus(corpus)
#> [1] "ea446b5f44659ca0804636faa6e2c6cb66dacc49278ed0ffb03b415cac54dee4"