
Parse RIS (Research Information Systems) tag-value format files into an sm_corpus object. Supports the standard RIS specification with two-letter tags followed by two spaces, a dash, and a space.

Usage

sm_read_ris(
  path,
  encoding = "UTF-8",
  engine = c("native", "bibliometrix", "auto"),
  verbose = TRUE,
  call = rlang::caller_env()
)

Arguments

path

Character scalar. Path to a .ris file.

encoding

Character scalar. File encoding (default "UTF-8").

engine

Character scalar. One of "native" (built-in parser), "bibliometrix" (delegate to bibliometrix::convert2df()), or "auto" (try bibliometrix first, then fall back to the native parser). Defaults to "native", the first choice.

verbose

Logical scalar. Whether to print progress messages (default TRUE).

call

Caller environment for error reporting.
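The "auto" engine described above is a try-then-fall-back pattern. The following is a minimal sketch of that pattern only, not the package's actual code: read_with_fallback() and the two stand-in engine functions are hypothetical names introduced for illustration.

```r
# Hypothetical helper (not part of the package): run a primary reader and,
# if it signals an error, run a fallback reader on the same path instead.
read_with_fallback <- function(path, primary, fallback, verbose = TRUE) {
  tryCatch(
    primary(path),
    error = function(e) {
      if (verbose) {
        message("Primary engine failed (", conditionMessage(e), "); falling back.")
      }
      fallback(path)
    }
  )
}

# Stand-in engines, used only to exercise the pattern.
failing_engine <- function(path) stop("bibliometrix is not installed")
native_engine <- function(path) paste("parsed", path, "natively")

result <- read_with_fallback(
  "references.ris", failing_engine, native_engine, verbose = FALSE
)
```

Here the first engine signals an error, so result holds the value returned by the second one.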

Value

An sm_corpus object.

Implementation

The native parser follows the RIS format specification (Thomson Reuters). Each record starts with a "TY  - " line and ends with an "ER  - " line. Tags are two uppercase letters followed by two spaces, a hyphen, and a space. Repeatable tags such as AU and KW may occur several times in a record, and each occurrence adds one value to that field.
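The record structure described above can be illustrated with a few lines of base R. This is a simplified sketch under stated assumptions, not the package's native parser: parse_ris_lines() is a hypothetical helper, it skips any line that is not a tag line (so continuation lines are ignored), and it returns a plain list rather than an sm_corpus object.

```r
# Hypothetical helper (not part of the package): split RIS lines into records.
parse_ris_lines <- function(lines) {
  # Tag line: two uppercase letters, two spaces, a hyphen, an optional space,
  # then the value (which may be empty, as on the ER line).
  tag_pattern <- "^([A-Z]{2})  - ?(.*)$"
  records <- list()
  current <- NULL
  for (line in lines) {
    m <- regmatches(line, regexec(tag_pattern, line))[[1]]
    if (length(m) == 0) next  # not a tag line; ignored in this sketch
    tag <- m[2]
    value <- trimws(m[3])
    if (tag == "TY") {
      # TY opens a new record
      current <- list(TY = value)
    } else if (tag == "ER") {
      # ER closes the current record
      if (!is.null(current)) records[[length(records) + 1]] <- current
      current <- NULL
    } else if (!is.null(current)) {
      # Repeated tags (e.g. AU, KW) accumulate into a character vector
      current[[tag]] <- c(current[[tag]], value)
    }
  }
  records
}

ris <- c(
  "TY  - JOUR",
  "AU  - Aria, M.",
  "AU  - Cuccurullo, C.",
  "TI  - bibliometrix: An R-tool for comprehensive science mapping analysis",
  "PY  - 2017",
  "ER  - "
)
records <- parse_ris_lines(ris)
records[[1]]$AU
```

The two AU lines end up as a single character vector of length two, which is the "multiple values per record" behaviour described for repeatable tags.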

References

Aria, M. & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975. doi:10.1016/j.joi.2017.08.007

Examples

if (FALSE) { # \dontrun{
corpus <- sm_read_ris("references.ris")
corpus$works
} # }