ScienceDirect API in R#

by Michael T. Moen

These recipe examples demonstrate how to use Elsevier’s ScienceDirect API to retrieve full-text articles in various formats (XML, text).

This tutorial content is intended to help facilitate academic research. Please check with your institution about its Text and Data Mining or related license agreement with Elsevier.

Documentation
- ScienceDirect API
- ScienceDirect API Documentation
Terms
- ScienceDirect API Terms of Use
Data Reuse
- Elsevier Text & Data Mining

Note: See your institution’s rate limit in the ScienceDirect API Terms of Use.

These recipe examples were tested on February 7, 2025.

Setup#

Import Libraries#

library(httr)

Import API Key#

An API key is required to access the ScienceDirect API. Registration is available on the Elsevier developer portal. The key is imported from an environment variable below:

myAPIKey <- Sys.getenv("sciencedirect_key")
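
Because `Sys.getenv()` returns an empty string when a variable is not set, a missing key would otherwise only surface later as a failed request. The following is a minimal sketch of an early check; the helper name `check_api_key` is hypothetical and not part of the original recipe.

```r
# Hypothetical helper (not part of the original recipe):
# stop early with a clear message if the key is empty.
check_api_key <- function(key) {
  if (!nzchar(key)) {
    stop("No API key found. Set the sciencedirect_key environment variable.")
  }
  invisible(key)
}

# Example usage with the variable defined above:
# check_api_key(myAPIKey)
```

One common way to set the variable is a line such as `sciencedirect_key=<your key>` in the `.Renviron` file, which R reads at startup.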

Identifier Note#

We will use DOIs as the article identifiers. See our Crossref and Scopus API tutorials for workflows on how to create lists of DOIs and identifiers for specific searches and journals. The Elsevier ScienceDirect Article (Full-Text) API also accepts other identifiers like Scopus IDs and PubMed IDs (see API specification documents linked above).
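
As a rough sketch of how other identifier types could be handled, the helper below builds the request URL from an identifier and its type. The function name `build_article_url` is hypothetical, and the path segments `pii` and `pubmed_id` are assumptions; please confirm the exact segments in the API specification before relying on them.

```r
# Hypothetical helper: build an article URL for a given identifier type.
# Only "doi" is used in this tutorial; "pii" and "pubmed_id" are assumed
# path segments that should be checked against the API specification.
build_article_url <- function(id, id_type = "doi") {
  base_url <- "https://api.elsevier.com/content/article/"
  id_type <- match.arg(id_type, c("doi", "pii", "pubmed_id"))
  paste0(base_url, id_type, "/", id)
}

# Example: the DOI-based URL used in the recipes that follow
doi_url <- build_article_url("10.1016/j.tetlet.2017.07.080")
```

An unsupported `id_type` raises an error from `match.arg()`, which helps catch typos before a request is sent.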

1. Retrieve full-text XML of an article#

# For XML download
elsevier_url <- "https://api.elsevier.com/content/article/doi/"
doi1 <- '10.1016/j.tetlet.2017.07.080' # Example Tetrahedron Letters article
fulltext1 <- GET(paste0(elsevier_url, doi1, "?APIKey=", myAPIKey, "&httpAccept=text/xml"))

# Save to file
writeLines(content(fulltext1, "text", encoding = "UTF-8"), "fulltext1.xml")
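
The request above writes whatever the server returns, including an error message if the key is invalid or the article is not covered by your institution's entitlements. Below is a minimal sketch of a variant that checks the HTTP status first. It also sends the key and format as the `X-ELS-APIKey` and `Accept` request headers rather than as query parameters, which keeps the key out of the URL. The function name `fetch_article` is hypothetical, and this is only one possible approach.

```r
# Hypothetical helper: fetch one article by DOI and return its text,
# or NULL (with a warning) if the request did not succeed.
fetch_article <- function(doi, api_key, accept = "text/xml") {
  url <- paste0("https://api.elsevier.com/content/article/doi/", doi)
  resp <- httr::GET(
    url,
    httr::add_headers("X-ELS-APIKey" = api_key, Accept = accept)
  )
  if (httr::http_error(resp)) {
    warning("Request for ", doi, " failed with HTTP status ",
            httr::status_code(resp))
    return(NULL)
  }
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Example usage (requires a valid key and network access):
# xml_text <- fetch_article("10.1016/j.tetlet.2017.07.080", myAPIKey)
# if (!is.null(xml_text)) writeLines(xml_text, "fulltext1.xml")
```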

2. Retrieve plain text of an article#

# For simplified text download
doi2 <- '10.1016/j.tetlet.2022.153680' # Example Tetrahedron Letters article
fulltext2 <- GET(paste0(elsevier_url, doi2, "?APIKey=", myAPIKey, "&httpAccept=text/plain"))

# Save to file
writeLines(content(fulltext2, "text", encoding = "UTF-8"), "fulltext2.txt")

3. Retrieve full-text in a loop#

# Make a vector of 5 DOIs for testing
dois <- c('10.1016/j.tetlet.2018.10.031',
          '10.1016/j.tetlet.2018.10.033',
          '10.1016/j.tetlet.2018.10.034',
          '10.1016/j.tetlet.2018.10.038',
          '10.1016/j.tetlet.2018.10.041')

for (doi in dois) {
  article <- GET(paste0(elsevier_url, doi, "?APIKey=", myAPIKey, "&httpAccept=text/plain"))
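  doi_name <- gsub("/", "_", doi)  # Replace "/" so the DOI can be used in a file name
  writeLines(content(article, "text", encoding = "UTF-8"), paste0(doi_name, "_plain_text.txt"))
  Sys.sleep(1)  # Pause between requests to stay within the rate limit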
}
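
For longer DOI lists, it can help to keep going when a single request fails and to record which DOIs need a retry. The sketch below shows one way to do this. The names `download_all` and `fetch` are hypothetical: `fetch` stands for any function that takes a DOI and returns the article text, or raises an error or returns `NULL` on failure.

```r
# Hypothetical helper: save the text for each DOI and return the DOIs
# that could not be retrieved. `fetch` is a user-supplied function.
download_all <- function(dois, fetch, out_dir = ".", pause = 1) {
  failed <- character(0)
  for (doi in dois) {
    # Treat any error raised by fetch() as a failed download
    text <- tryCatch(fetch(doi), error = function(e) NULL)
    if (is.null(text)) {
      failed <- c(failed, doi)
    } else {
      file_name <- paste0(gsub("/", "_", doi, fixed = TRUE), "_plain_text.txt")
      writeLines(text, file.path(out_dir, file_name))
    }
    Sys.sleep(pause)  # Pause between requests to stay within the rate limit
  }
  failed
}
```

The returned vector can be passed back to `download_all()` later to retry only the failed DOIs.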