Create an sm_corpus object
Source: R/corpus-class.R, R/corpus-coerce.R, R/corpus-print.R
Build a typed, tibble-of-tibbles corpus container for bibliometric analysis. The corpus is the central data structure in scimapR, holding works, authors, authorships, institutions, sources, references, concepts, embeddings, provenance, screening decisions, and metadata.
Usage
sm_corpus(
  works,
  authors = NULL,
  authorships = NULL,
  institutions = NULL,
  sources = NULL,
  references = NULL,
  concepts = NULL,
  embeddings = NULL,
  provenance = NULL,
  screening = NULL,
  metadata = list()
)
# S3 method for class 'sm_corpus'
x[i, ...]
# S3 method for class 'sm_corpus'
length(x)
# S3 method for class 'sm_corpus'
dim(x)
# S3 method for class 'sm_corpus'
as_tibble(x, ...)
# S3 method for class 'sm_corpus'
as.data.frame(x, ...)
# S3 method for class 'sm_corpus'
print(x, ...)
# S3 method for class 'sm_corpus'
format(x, ...)
# S3 method for class 'sm_corpus'
summary(object, ...)
# S3 method for class 'sm_corpus'
str(object, ...)
Arguments
- works
A tibble of works (publications). See Details for schema.
- authors
A tibble of authors. If NULL, constructed from works.
- authorships
A tibble linking works to authors. If NULL, empty.
- institutions
A tibble of institutions. If NULL, empty.
- sources
A tibble of publication sources/journals. If NULL, empty.
- references
A tibble of cited references. If NULL, empty.
- concepts
A tibble of concepts/keywords. If NULL, empty.
- embeddings
A numeric matrix of work embeddings, or NULL.
- provenance
A tibble tracking data lineage. If NULL, empty.
- screening
A tibble of screening decisions. If NULL, empty.
- metadata
A list of corpus-level metadata.
- x, object
An sm_corpus object.
- i
Row index for subsetting.
- ...
Ignored.
Value
An sm_corpus S3 object.
- [: An sm_corpus with the selected works.
- length(): Number of works (integer).
- dim(): Integer vector of length 2 (rows, columns of works table).
- as_tibble(): The works tibble.
- as.data.frame(): The works table as a data frame.
Examples
corpus <- sm_corpus(
  works = tibble::tibble(
    work_id = "W000000001",
    doi = "10.1234/example",
    title = "Example Work",
    abstract = "An example abstract.",
    year = 2024L,
    type = "journal-article",
    source_id = NA_character_,
    cited_by_count = 0L,
    oa_status = "closed",
    language = "en",
    pmid = NA_character_,
    arxiv_id = NA_character_,
    openalex_id = NA_character_,
    is_retracted = FALSE,
    retraction_date = NA_real_,
    last_refreshed = Sys.time()
  )
)
print(corpus)
#>
#> ── <sm_corpus> ─────────────────────────────────────────────────────────────────
#> Works: 1 | Authors: 0 | Institutions: 0
#> Years: 2024 - 2024
#> Sources (journals): 0
#> Embeddings: none
#> Status: Unlocked (last refreshed: never)
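To make the method contract listed under Value concrete, here is a minimal, self-contained sketch in base R. It does not use scimapR: `toy_corpus`, `new_toy_corpus()`, and the two-column works table are hypothetical stand-ins, and the real `sm_corpus` methods may differ (for example, in how subsetting treats the linked tables). The sketch only mirrors the documented behaviour: `[` returns the same class with the selected works, `length()` counts works, `dim()` reports the shape of the works table, and `as.data.frame()` returns the works table.

```r
# Toy stand-in for illustration only; NOT scimapR's implementation.
# A classed list of tables in which `works` is the primary table.
new_toy_corpus <- function(works, authors = NULL) {
  if (is.null(authors)) {
    # Assumption: a missing table is stored as an empty one.
    authors <- data.frame(author_id = character(0), stringsAsFactors = FALSE)
  }
  structure(list(works = works, authors = authors), class = "toy_corpus")
}

# `[` filters rows of the works table and keeps the container class.
`[.toy_corpus` <- function(x, i, ...) {
  x$works <- x$works[i, , drop = FALSE]
  x
}

# length() is the number of works; dim() is the shape of the works table.
length.toy_corpus <- function(x) nrow(x$works)
dim.toy_corpus <- function(x) dim(x$works)

# Coercion returns the works table itself.
as.data.frame.toy_corpus <- function(x, ...) x$works

works <- data.frame(
  work_id = c("W1", "W2", "W3"),
  year = c(2022L, 2023L, 2024L),
  stringsAsFactors = FALSE
)
toy <- new_toy_corpus(works)

n_works <- length(toy)                      # number of works
shape <- dim(toy)                           # rows, columns of the works table
recent <- toy[toy$works$year >= 2023L]      # still a toy_corpus
recent_works <- as.data.frame(recent)       # plain data frame of works
```

With the real package, the equivalent calls would be `length(corpus)`, `dim(corpus)`, `corpus[i]`, and `as.data.frame(corpus)` on an `sm_corpus` object such as the one built above.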