Beyond the h-index
scimapR implements several modern citation indicators that go beyond simple counts.
library(scimapR)
corpus <- sm_example_corpus(seed = 42)
Classical indices
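For reference, the h-index is the largest h such that h of an author's works have at least h citations each. The following is a minimal base-R sketch of that definition on a made-up citation vector; `h_index()` is a hypothetical helper written for illustration and is not scimapR's implementation.

```r
# Hypothetical helper, not part of scimapR: h-index of one author's
# citation counts.
h_index <- function(citations) {
  ranked <- sort(citations, decreasing = TRUE)
  # Count the ranks at which the citation count still meets the rank.
  sum(ranked >= seq_along(ranked))
}

h_index(c(10, 8, 5, 4, 3, 0))
#> [1] 4
```

Four works have at least four citations each, but there are not five works with at least five, so h is 4.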
h <- sm_metric_h_index(corpus, level = "author")
head(h, 10)
#> # A tibble: 10 × 2
#>    author_id  h_index
#>    <chr>        <int>
#>  1 A000000001      14
#>  2 A000000032      11
#>  3 A000000042      11
#>  4 A000000013      10
#>  5 A000000038      10
#>  6 A000000015       9
#>  7 A000000020       9
#>  8 A000000022       9
#>  9 A000000036       9
#> 10 A000000045       9
CD index (disruption)
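The CD index (Funk & Owen-Smith, 2017) measures whether a work consolidates or disrupts its field. It ranges from -1 (later work cites it together with its references, i.e. consolidating) to 1 (later work cites it without its references, i.e. disruptive).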
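As a rough illustration of the idea, the sketch below computes a disruption score from a toy edge list using the common count-based form (n_i - n_j) / (n_i + n_j + n_k), where n_i later works cite only the focal work, n_j cite both the focal work and its references, and n_k cite only the references. The data and the `cd_index()` helper are made up for this example; the sketch ignores the time window used in the published measure and is not a description of what `sm_metric_disruption()` does internally.

```r
# Toy citation edge list (made-up data): each row is "citing cites cited".
edges <- data.frame(
  citing = c("F", "F", "A", "B", "B", "C", "D"),
  cited  = c("R1", "R2", "F", "F", "R1", "R2", "F"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not scimapR's implementation.
cd_index <- function(edges, focal) {
  refs <- edges$cited[edges$citing == focal]
  cites_focal <- unique(edges$citing[edges$cited == focal])
  cites_refs <- setdiff(unique(edges$citing[edges$cited %in% refs]), focal)

  n_i <- length(setdiff(cites_focal, cites_refs))    # cite the focal work only
  n_j <- length(intersect(cites_focal, cites_refs))  # cite focal work and its references
  n_k <- length(setdiff(cites_refs, cites_focal))    # cite the references only

  total <- n_i + n_j + n_k
  if (total == 0) return(NA_real_)
  (n_i - n_j) / total
}

cd_index(edges, "F")
#> [1] 0.25
```

Here A and D cite only F, B cites F and one of its references, and C cites only a reference, giving (2 - 1) / 4.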
cd <- sm_metric_disruption(corpus)
head(cd)
#> # A tibble: 6 × 2
#>   work_id    cd_index
#>   <chr>         <dbl>
#> 1 W000000001   0
#> 2 W000000002   0.0101
#> 3 W000000003  -0.0642
#> 4 W000000004   0.167
#> 5 W000000005   0.0303
#> 6 W000000006   0.188
Relative Citation Ratio
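The RCR (Hutchins et al., 2016) divides a work's citations by the rate expected for its field, where the field is defined by the work's co-citation network. A value of 1 means the work is cited at the expected rate.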
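To make the co-citation idea concrete, here is a deliberately simplified sketch on made-up data. It treats the works that appear in the same reference lists as the focal work as its "field" and uses their mean citation count as the expected value. The published RCR works with citation rates per year, journal citation rates of the co-cited papers, and a benchmark regression; this sketch omits all of that, and `co_cited_with()` and `simple_rcr()` are hypothetical helpers rather than scimapR functions.

```r
# Made-up citation edge list: each row is "citing cites cited".
edges <- data.frame(
  citing = c("P1", "P1", "P2", "P2", "P3", "P3", "P4", "P5"),
  cited  = c("W1", "W2", "W1", "W3", "W1", "W2", "W2", "W2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: works that share a reference list with `focal`.
co_cited_with <- function(edges, focal) {
  citers <- edges$citing[edges$cited == focal]
  setdiff(unique(edges$cited[edges$citing %in% citers]), focal)
}

# Simplified ratio: citations of the focal work divided by the mean
# citations of its co-cited works.
simple_rcr <- function(edges, focal) {
  counts <- table(edges$cited)
  expected <- mean(as.vector(counts[co_cited_with(edges, focal)]))
  as.vector(counts[focal]) / expected
}

simple_rcr(edges, "W1")
#> [1] 1.2
```

W1 has 3 citations and is co-cited with W2 (4 citations) and W3 (1 citation), so the expected value is 2.5 and the ratio is 1.2.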
rcr <- sm_metric_rcr(corpus)
head(rcr)
#> # A tibble: 6 × 4
#>   work_id    cited_by_count expected_rate   rcr
#>   <chr>               <int>         <dbl> <dbl>
#> 1 W000000001              3          10.2 0.293
#> 2 W000000002              9          12.2 0.738
#> 3 W000000003             28          11.3 2.47
#> 4 W000000004             29          27.5 1.05
#> 5 W000000005             16          16   1
#> 6 W000000006             16          16   1
Field-Normalized Citation Impact
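FNCI (Waltman et al., 2011) divides a work's citation count by the mean citation count of works from the same field and publication year, so a value of 1 matches the field average.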
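The normalisation can be sketched in a few lines of base R. The data frame below is made up, with column names chosen to mirror the output of `sm_metric_fnci()`; how the package assigns fields or handles sparse field-year cells is not shown here and may differ.

```r
# Made-up works table; column names mirror the package output.
works <- data.frame(
  work_id        = c("W1", "W2", "W3", "W4", "W5"),
  field          = c("immunology", "immunology", "immunology",
                     "genomics", "genomics"),
  year           = c(2020L, 2020L, 2021L, 2020L, 2020L),
  cited_by_count = c(10, 30, 8, 5, 15),
  stringsAsFactors = FALSE
)

# Mean citations within each field-year cell, aligned to the rows of `works`.
works$field_mean <- ave(works$cited_by_count, works$field, works$year,
                        FUN = mean)

# Ratio of each work's citations to its field-year mean.
works$fnci <- works$cited_by_count / works$field_mean
works$fnci
#> [1] 0.5 1.5 1.0 0.5 1.5
```

W3 is the only immunology work from 2021, so it is its own baseline and scores exactly 1, which shows why small field-year cells need care when interpreting the ratio.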
fnci <- sm_metric_fnci(corpus)
head(fnci)
#> # A tibble: 6 × 6
#>   work_id    field                    year cited_by_count field_mean  fnci
#>   <chr>      <chr>                   <int>          <int>      <dbl> <dbl>
#> 1 W000000001 clinical outcomes        2023              3       10.2 0.293
#> 2 W000000002 colorectal cancer        2020              9       12.2 0.738
#> 3 W000000003 gene expression          2024             28       11.3 2.47
#> 4 W000000004 spatial transcriptomics  2020             29       27.5 1.05
#> 5 W000000005 immune checkpoint        2020             16       16   1
#> 6 W000000006 immune checkpoint        2018             16       16   1
Uzzi novelty
Uzzi novelty (Uzzi et al., 2013) measures how atypical the journal combinations in a work's reference list are.
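The underlying idea is to compare how often two journals are cited together with how often that would happen by chance, expressed as a z-score per journal pair. The sketch below illustrates this on made-up reference lists. As a stated simplification, the null model here just shuffles journal labels across all references while keeping each list's length, whereas Uzzi et al. use randomised citation networks; taking the median z-score per work is likewise only one possible summary. All helper names are hypothetical, and the sign convention of scimapR's `novelty` column is not implied by this example.

```r
# Made-up reference lists: each element holds the journals cited by one work.
ref_journals <- list(
  W1 = c("J1", "J2", "J3"),
  W2 = c("J1", "J2"),
  W3 = c("J1", "J2", "J4"),
  W4 = c("J3", "J4"),
  W5 = c("J1", "J4")
)

# Unordered journal pairs in one reference list, as "Ja|Jb" keys.
pair_keys <- function(journals) {
  j <- sort(unique(journals))
  if (length(j) < 2) return(character(0))
  apply(combn(j, 2), 2, paste, collapse = "|")
}

# Pair counts over all reference lists, restricted to the given keys.
count_pairs <- function(lists, keys) {
  all_pairs <- unlist(lapply(lists, pair_keys), use.names = FALSE)
  as.vector(table(factor(all_pairs, levels = keys)))
}

keys <- unique(unlist(lapply(ref_journals, pair_keys), use.names = FALSE))
observed <- count_pairs(ref_journals, keys)

# Null model (an assumption of this sketch): shuffle journal labels across
# all references while keeping every list's length.
set.seed(1)
sizes <- lengths(ref_journals)
pool <- unlist(ref_journals, use.names = FALSE)
null_counts <- replicate(500, {
  shuffled <- split(sample(pool), rep(seq_along(sizes), sizes))
  count_pairs(shuffled, keys)
})

# z-score per journal pair: observed count against the shuffled baseline.
z <- (observed - rowMeans(null_counts)) / apply(null_counts, 1, sd)
names(z) <- keys

# Work-level score: median z over the pairs in each reference list.
pair_score <- vapply(ref_journals, function(j) median(z[pair_keys(j)]),
                     numeric(1))
```

Pairs with a z-score well below zero co-occur less often than the shuffled baseline, which is the sense in which a combination counts as atypical.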
nov <- sm_metric_novelty(corpus)
head(nov)
#> # A tibble: 6 × 2
#>   work_id    novelty
#>   <chr>        <dbl>
#> 1 W000000001  -0.323
#> 2 W000000002  -0.384
#> 3 W000000003  -0.335
#> 4 W000000004  -0.313
#> 5 W000000005  -0.345
#> 6 W000000006  -0.378
Summary tables
sm_summary_authors(corpus) |> head(5)
#> # A tibble: 1 × 8
#>   n_authors n_with_orcid pct_orcid mean_works_per_author median_works_per_author
#>       <int>        <int>     <dbl>                 <dbl>                   <dbl>
#> 1        80           45      56.2                  9.44                       9
#> # ℹ 3 more variables: max_works_per_author <int>, mean_authors_per_work <dbl>,
#> #   single_author_pct <dbl>
Self-citation and self-corrected indices
sm_self_citation() derives author (or institution) self-citation from
the reference lists already in the corpus, so it makes no per-citation
API calls. It returns per-entity and per-work tibbles plus a provenance
trail showing which works drove each self-citation, suitable for an
institutional report.
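The derivation can be pictured as two joins: attach authors to the citing work, attach authors to the cited work, and keep the pairs that share an author. The following base-R sketch does this on made-up tables; the `citations` and `authorship` data frames and their layout are assumptions for illustration, not the internal structure of a scimapR corpus.

```r
# Made-up toy data, not the package's example corpus.
citations <- data.frame(
  citing_work_id = c("W2", "W3", "W3", "W4"),
  cited_work_id  = c("W1", "W1", "W2", "W1"),
  stringsAsFactors = FALSE
)
authorship <- data.frame(
  work_id   = c("W1", "W2", "W2", "W3", "W4"),
  author_id = c("A1", "A1", "A2", "A2", "A3"),
  stringsAsFactors = FALSE
)

# Attach authors of the citing work, then authors of the cited work.
citing <- merge(citations, authorship,
                by.x = "citing_work_id", by.y = "work_id")
both <- merge(citing, authorship,
              by.x = "cited_work_id", by.y = "work_id",
              suffixes = c("_citing", "_cited"))

# A self-citation is a citing/cited pair that shares at least one author.
self <- both[both$author_id_citing == both$author_id_cited,
             c("citing_work_id", "cited_work_id", "author_id_cited")]
names(self)[3] <- "shared_author_id"
self <- self[order(self$citing_work_id, self$cited_work_id), ]
rownames(self) <- NULL
self
#>   citing_work_id cited_work_id shared_author_id
#> 1             W2            W1               A1
#> 2             W3            W2               A2
```

Each row names the shared author along with the two works, which is the kind of record a provenance trail needs to make a self-citation count auditable.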
sc_corpus <- readRDS(system.file("extdata", "example_self_citation_corpus.rds",
                                 package = "scimapR"))
sc <- sm_self_citation(sc_corpus, level = "author")
sc$by_entity
#> # A tibble: 2 × 4
#>   entity_id n_citations_received n_self_citations self_citation_share
#>   <chr>                    <int>            <int>               <dbl>
#> 1 A1                           5                4                 0.8
#> 2 A2                           4                2                 0.5
head(sc$provenance)
#> # A tibble: 6 × 3
#>   citing_work_id cited_work_id shared_author_id
#>   <chr>          <chr>         <chr>
#> 1 W2             W1            A1
#> 2 W4             W2            A1
#> 3 W4             W2            A2
#> 4 W5             W3            A2
#> 5 W6             W2            A1
#> 6 W6             W4            A1
The h/g/m indices accept self_corrected = TRUE, which
recomputes the index after removing those self-citations. The corrected
value is always <= the uncorrected one:
merge(
  sm_metric_h_index(sc_corpus, "author"),
  sm_metric_h_index(sc_corpus, "author", self_corrected = TRUE),
  by = "author_id", suffixes = c("", "_corrected")
)
#>   author_id h_index h_index_corrected
#> 1        A1       3                 3
#> 2        A2       3                 3
#> 3        A3       1                 1
References
- Funk, R. J. & Owen-Smith, J. (2017). A Dynamic Network Measure of Technological Change. Management Science, 63(3), 791–817.
- Hutchins, B. I. et al. (2016). Relative Citation Ratio (RCR): A New Metric That Uses Citation Rates to Measure Influence at the Article Level. PLOS Biology, 14(9), e1002541.
- Waltman, L. et al. (2011). Towards a new crown indicator. Scientometrics, 87(3), 467–481.
- Uzzi, B. et al. (2013). Atypical Combinations and Scientific Impact. Science, 342(6157), 468–472.