Parse RIS (Research Information Systems) tag-value format files into an
sm_corpus object. Supports the standard RIS specification with two-letter
tags followed by two spaces, a dash, and a space.
Usage
sm_read_ris(
  path,
  encoding = "UTF-8",
  engine = c("native", "bibliometrix", "auto"),
  verbose = TRUE,
  call = rlang::caller_env()
)
Arguments
- path
Character scalar. Path to a .ris file.
- encoding
Character scalar. File encoding (default "UTF-8").
- engine
Character scalar. One of "native" (built-in parser), "bibliometrix" (delegate to bibliometrix::convert2df()), or "auto" (try bibliometrix first, fall back to native).
- verbose
Logical. Print progress messages?
- call
Caller environment for error reporting.
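The "auto" engine is described as trying bibliometrix first and falling back to the native parser. A minimal sketch of that try-then-fall-back pattern is shown here; read_with_fallback() and its primary/fallback arguments are hypothetical names used for illustration only and are not part of the package.

```r
# Hypothetical helper (not part of the package): call a primary reader and,
# if it signals an error, call a fallback reader on the same path.
read_with_fallback <- function(path, primary, fallback, verbose = TRUE) {
  tryCatch(
    primary(path),
    error = function(e) {
      if (verbose) {
        message("Primary engine failed (", conditionMessage(e), "); using fallback.")
      }
      fallback(path)
    }
  )
}

# Stand-in readers for demonstration; a real setup would pass actual parsers.
failing_reader <- function(path) stop("engine not available")
simple_reader  <- function(path) paste("parsed", path)

read_with_fallback("refs.ris", failing_reader, simple_reader, verbose = FALSE)
```

Passing the readers as functions keeps the fallback logic independent of any particular parser; how the package itself dispatches between engines may differ.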
Value
An sm_corpus object.
Implementation
The native parser follows the RIS format specification (Thomson Reuters).
Each record starts with a TY  - line and ends with an ER  - line. Tags are two
uppercase letters followed by two spaces, a hyphen, and a space. Repeating
tags (AU, KW) generate multiple values per record.
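The record structure described above can be illustrated with a short base-R sketch. This is not the package's native parser: parse_ris_lines() is a hypothetical helper, and its pattern also admits a digit in the second tag position (as in T1 or N1), which is an assumption about the input rather than a statement about what sm_read_ris() accepts.

```r
# Hypothetical helper (not the package's parser): turn RIS lines into a list
# of records, where each record is a named list of character vectors.
parse_ris_lines <- function(lines) {
  # Tag, two spaces, hyphen, optional space, then the value.
  pattern <- "^([A-Z][A-Z0-9])  - ?(.*)$"
  records <- list()
  current <- NULL
  for (line in lines) {
    m <- regmatches(line, regexec(pattern, line))[[1]]
    if (length(m) == 0) next            # skip blank and non-tag lines
    tag <- m[2]
    value <- trimws(m[3])
    if (tag == "TY") {
      current <- list(TY = value)       # TY opens a new record
    } else if (tag == "ER") {
      if (!is.null(current)) records[[length(records) + 1]] <- current
      current <- NULL                   # ER closes the record
    } else if (!is.null(current)) {
      # Repeating tags (e.g. AU, KW) accumulate into one vector
      current[[tag]] <- c(current[[tag]], value)
    }
  }
  records
}

lines <- c(
  "TY  - JOUR",
  "AU  - Aria, M.",
  "AU  - Cuccurullo, C.",
  "KW  - bibliometrics",
  "PY  - 2017",
  "ER  - "
)
recs <- parse_ris_lines(lines)
length(recs[[1]]$AU)  # 2
```

Tag lines that appear outside a TY ... ER pair are ignored in this sketch, and a record without a closing ER line is dropped; a production parser would likely warn in both cases.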
References
Aria, M. & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975. doi:10.1016/j.joi.2017.08.007