Identifies self-citations from the references already in the corpus — no
per-citation API calls. A citation from a citing work to a cited work is a
self-citation at the chosen level when the citing and cited works share an
author (or institution). This is the quota-light "reference overlap" method.
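The reference-overlap idea can be illustrated with a minimal base R sketch. The toy `authorship` and `references` data frames and the intermediate column names are assumptions made for illustration; this page does not show how `sm_corpus` stores this data, and this is not the package's implementation.

```r
# Hypothetical toy data: which authors wrote which works, and who cites whom.
authorship <- data.frame(
  work_id   = c("W1", "W1", "W2", "W3"),
  author_id = c("A1", "A2", "A1", "A3"),
  stringsAsFactors = FALSE
)
references <- data.frame(
  citing_work_id = c("W2", "W3", "W3"),
  cited_work_id  = c("W1", "W1", "W2"),
  stringsAsFactors = FALSE
)

# Attach the authors of the citing work to every citation pair.
citing <- merge(references, authorship,
                by.x = "citing_work_id", by.y = "work_id")
names(citing)[names(citing) == "author_id"] <- "citing_author_id"

# Attach the authors of the cited work as well.
both <- merge(citing, authorship,
              by.x = "cited_work_id", by.y = "work_id")
names(both)[names(both) == "author_id"] <- "cited_author_id"

# A pair is a self-citation when the same author appears on both sides.
shared <- both[both$citing_author_id == both$cited_author_id, ]
provenance <- data.frame(
  citing_work_id   = shared$citing_work_id,
  cited_work_id    = shared$cited_work_id,
  shared_author_id = shared$citing_author_id,
  stringsAsFactors = FALSE
)
print(provenance)
```

Only the W2 to W1 citation survives the filter, because A1 is an author of both works. Everything is computed from data already in memory, which is why no per-citation lookups are needed.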
Usage
sm_self_citation(
corpus,
level = c("author", "institution"),
method = c("reference_overlap"),
call = rlang::caller_env()
)
# S3 method for class 'sm_self_citation'
print(x, ...)
# S3 method for class 'sm_self_citation'
summary(object, ...)
Arguments
- corpus
An sm_corpus with a populated references sub-tibble whose cited_work_id links to corpus works.
- level
"author" (default) or "institution".
- method
Self-citation definition; currently only "reference_overlap".
- call
Caller environment for error reporting.
- x
An sm_self_citation object.
- ...
Ignored.
- object
An sm_self_citation object.
Value
An sm_self_citation S3 object (a list) with components:
- by_entity
Tibble: entity_id, n_citations_received (internal citations to the entity's works), n_self_citations, self_citation_share.
- by_work
Tibble: cited_work_id, n_citations_received, n_self_citations, self_citation_share (per cited work).
- provenance
Tibble: citing_work_id, cited_work_id, and shared_author_id (author level) or shared_institution_id (institution level); this is the evidence behind each self-citation.
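As a rough illustration of how the by_work columns relate, the sketch below assumes that self_citation_share is n_self_citations divided by n_citations_received. That reading is consistent with the example output on this page (292 / 351 ≈ 0.832), but the `citations` data frame and its `is_self` flag are hypothetical and not part of the package.

```r
# Hypothetical citation pairs with a precomputed self-citation flag.
citations <- data.frame(
  cited_work_id = c("W1", "W1", "W1", "W2"),
  is_self       = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Count all citations and self-citations per cited work.
n_received <- tapply(citations$is_self, citations$cited_work_id, length)
n_self     <- tapply(citations$is_self, citations$cited_work_id, sum)

by_work <- data.frame(
  cited_work_id        = names(n_received),
  n_citations_received = as.integer(n_received),
  n_self_citations     = as.integer(n_self),
  stringsAsFactors = FALSE
)

# Assumed definition: share = self-citations / citations received.
by_work$self_citation_share <-
  by_work$n_self_citations / by_work$n_citations_received
print(by_work)
```

The same aggregation keyed on an author or institution identifier instead of the cited work would give a by_entity-style table.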
Type-stable: when references is absent/empty the components are 0-row
tibbles with the documented columns, returned after a cli::cli_warn
(the function never spins).
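The type-stability guarantee can be pictured with a small base R sketch. The helpers `empty_by_work()` and `summarise_by_work()` are hypothetical names, and base `warning()` with plain data frames stands in for the cli::cli_warn and tibbles the package is documented to use.

```r
# Hypothetical helper: the documented by_work columns, typed, with zero rows.
empty_by_work <- function() {
  data.frame(
    cited_work_id        = character(0),
    n_citations_received = integer(0),
    n_self_citations     = integer(0),
    self_citation_share  = numeric(0),
    stringsAsFactors = FALSE
  )
}

# Hypothetical summariser: warns and returns the empty shape when there is
# nothing to summarise, otherwise fills the same columns.
summarise_by_work <- function(citations) {
  if (is.null(citations) || nrow(citations) == 0L) {
    warning("`references` is absent or empty; returning 0-row result.",
            call. = FALSE)
    return(empty_by_work())
  }
  groups <- split(citations$is_self, citations$cited_work_id)
  out <- data.frame(
    cited_work_id        = names(groups),
    n_citations_received = as.integer(lengths(groups)),
    n_self_citations     = vapply(groups, sum, integer(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
  out$self_citation_share <- out$n_self_citations / out$n_citations_received
  out
}

empty  <- suppressWarnings(summarise_by_work(NULL))
filled <- summarise_by_work(data.frame(
  cited_work_id = c("W1", "W1"),
  is_self       = c(TRUE, FALSE),
  stringsAsFactors = FALSE
))

# Both results expose the same columns with the same types.
print(identical(sapply(empty, class), sapply(filled, class)))
```

Because the empty result has the same columns and types as a populated one, downstream code that selects or joins on these columns does not need a special case for an empty corpus.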
print returns x invisibly.
summary returns the by_entity tibble.
Examples
corpus <- sm_example_corpus(n_works = 40, seed = 1)
sc <- sm_self_citation(corpus, level = "author")
head(sc$by_entity)
#> # A tibble: 6 × 4
#>   entity_id  n_citations_received n_self_citations self_citation_share
#>   <chr>                     <int>            <int>               <dbl>
#> 1 A000000001                  351              292              0.832
#> 2 A000000003                   47                7              0.149
#> 3 A000000061                   47                5              0.106
#> 4 A000000015                   40                4              0.1
#> 5 A000000046                   56                4              0.0714
#> 6 A000000057                   43                4              0.093