Attribute matched affiliations to a controlled institution vocabulary
Source: R/affiliation.R
sm_attribute_institution.Rd
Rolls institution matches (from sm_affiliation_match()) up to a controlled
vocabulary, writing a normalised institution_id and institution_name
onto the authorships table. Supports a ROR-backed vocabulary (via a
user-supplied offline ROR table) or a "custom" vocabulary derived directly
from the matched institution names.
Usage
sm_attribute_institution(
  corpus,
  vocabulary = c("ror", "custom"),
  ror_table = NULL,
  call = rlang::caller_env()
)
Arguments
- corpus
An sm_corpus. If it has no institution_match column, sm_affiliation_match() is run first with default settings.
- vocabulary
"ror" (default) or "custom".
- ror_table
For vocabulary = "ror", a data frame with columns ror_id, name, and aliases (aliases is either a ;-separated string or a list-column). Matching is case-insensitive against name and each alias, as well as against the institution_match value. A synthetic example ships at system.file("extdata", "example_ror.csv", package = "scimapR").
- call
Caller environment for error reporting.
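The two accepted shapes of the aliases column, and what case-insensitive matching against name and each alias means, can be illustrated with a minimal base R sketch. It is not the package's implementation: the table contents, split_aliases(), and lookup_ror() are hypothetical names invented for this illustration.

```r
# Hypothetical ROR-style table; ids and names are made up for illustration.
ror_table <- data.frame(
  ror_id  = c("ror-0001", "ror-0002"),
  name    = c("Example University", "Sample Institute of Technology"),
  aliases = c("Example Uni;EU", "SIT"),   # ;-separated string form
  stringsAsFactors = FALSE
)

# Normalise aliases to a list of character vectors, whichever shape was given.
split_aliases <- function(x) {
  if (is.list(x)) {
    return(lapply(x, as.character))
  }
  lapply(strsplit(x, ";", fixed = TRUE), trimws)
}

# Hypothetical helper: case-insensitive lookup of one label against
# the name and every alias of each row; returns NA when nothing matches.
lookup_ror <- function(label, table) {
  alias_list <- split_aliases(table$aliases)
  key <- tolower(trimws(label))
  hit <- vapply(seq_len(nrow(table)), function(i) {
    key %in% tolower(c(table$name[i], alias_list[[i]]))
  }, logical(1))
  if (any(hit)) table$ror_id[which(hit)[1]] else NA_character_
}

lookup_ror("example uni", ror_table)      # "ror-0001"
lookup_ror("Unknown College", ror_table)  # NA

# The same table with aliases as a list-column gives the same result.
ror_list <- ror_table
ror_list$aliases <- list(c("Example Uni", "EU"), "SIT")
lookup_ror("eu", ror_list)                # "ror-0001"
```

Normalising both alias shapes to a list up front keeps the lookup itself independent of how the table was read in, for example from a CSV (string form) or built in memory (list-column form).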
Value
The corpus with its authorships table gaining institution_id
and institution_name columns (for "ror", institution_id holds the
ROR id). Unmatched rows keep NA – the function never errors on
unmatched affiliations. Type-stable.
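The NA behaviour described above can be sketched in base R. This assumes, for illustration only, that "type-stable" means the two new columns stay character vectors whether or not anything matches; the authorships and vocab data frames below are hypothetical stand-ins, not the package's internal structures.

```r
# Hypothetical authorships rows: one matchable, one missing, one unknown.
authorships <- data.frame(
  institution_match = c("EXAMPLE UNIVERSITY", NA, "Nowhere Lab"),
  stringsAsFactors = FALSE
)

# Hypothetical one-row controlled vocabulary.
vocab <- data.frame(
  institution_id   = "ror-0001",
  institution_name = "Example University",
  stringsAsFactors = FALSE
)

# match() returns NA for rows with no counterpart, so indexing with it
# yields NA rather than an error, and the result stays character.
idx <- match(tolower(authorships$institution_match),
             tolower(vocab$institution_name))
authorships$institution_id   <- vocab$institution_id[idx]
authorships$institution_name <- vocab$institution_name[idx]

authorships$institution_id
#> [1] "ror-0001" NA         NA
is.character(authorships$institution_name)
#> [1] TRUE
```

A quick check such as sum(is.na(authorships$institution_id)) then gives the number of rows left unattributed.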
Examples
ror <- utils::read.csv(
  system.file("extdata", "example_ror.csv", package = "scimapR"),
  stringsAsFactors = FALSE
)
corpus <- sm_example_corpus(n_works = 5, n_authors = 5)
corpus$authorships$raw_affiliation[1] <- "Charite Universitatsmedizin Berlin"
corpus <- sm_affiliation_match(corpus)
#> ✔ Affiliation matching flagged 1 authorship across 1 institution.
#> ℹ By signal: name_token: 1. See `sm_affiliation_summary()` for the full
#> breakdown.
corpus <- sm_attribute_institution(corpus, vocabulary = "ror",
                                   ror_table = ror)
corpus$authorships$institution_name[1]
#> [1] "Charite Berlin"