Wiley Text and Data Mining (TDM) in R

by Michael T. Moen

Wiley TDM: https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining

Wiley TDM Terms of Use: Check with your institution for the terms of its Text and Data Mining agreement.

The Wiley Text and Data Mining (TDM) API allows users to retrieve the full text of Wiley articles in PDF form. This tutorial is intended to facilitate academic research.

These recipe examples were tested on February 12, 2025.

NOTE: The Wiley TDM API allows a maximum of 3 requests per second.
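
One way to stay under this limit is to enforce a minimum gap between consecutive requests. The sketch below is illustrative only: `make_throttle` is a hypothetical helper (not part of httr or the Wiley API), and its default interval of 1/3 second is simply derived from the limit stated above.

```r
# Hypothetical helper: returns a function that pauses, when needed, so that
# consecutive calls are at least `min_interval` seconds apart.
make_throttle <- function(min_interval = 1 / 3) {
  last_call <- NULL  # time of the previous call; NULL before the first call
  function() {
    now <- Sys.time()
    if (!is.null(last_call)) {
      elapsed <- as.numeric(difftime(now, last_call, units = "secs"))
      if (elapsed < min_interval) {
        Sys.sleep(min_interval - elapsed)
      }
    }
    last_call <<- Sys.time()
    invisible(NULL)
  }
}

# Usage: call throttle() immediately before each API request
throttle <- make_throttle()
throttle()
throttle()  # pauses if less than 1/3 second has passed since the previous call
```

A fixed `Sys.sleep()` between requests, as used later in this tutorial, is simpler and more conservative. A helper like this only matters when throughput close to the limit is needed.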

Setup

Import Libraries

This tutorial uses the following libraries:

library(httr)

Text and Data Mining Token

A token is required to access the Wiley TDM API. Sign-up information is available from the Wiley TDM page linked at the top of this tutorial. Load your token from an environment variable below:

wiley_token <- Sys.getenv("wiley_token")

# The token will be sent as a header in the API calls
headers <- add_headers("Wiley-TDM-Client-Token" = wiley_token)
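
`Sys.getenv()` returns an empty string when the variable is not set, so a missing token would otherwise only surface later as a failed request. A minimal sketch of an early check follows. `check_token` is a hypothetical helper name introduced here for illustration, not part of httr.

```r
# Hypothetical helper: stop early with a clear message if the token is missing
check_token <- function(token) {
  if (!is.character(token) || length(token) != 1 ||
      is.na(token) || !nzchar(token)) {
    stop("No Wiley TDM token found. Set the 'wiley_token' environment variable.")
  }
  invisible(token)
}

# Usage with a placeholder value (a real token would come from Sys.getenv)
check_token("example-token-value")
```

One common way to set the variable is a line of the form `wiley_token=your-token-here` in a `.Renviron` file, which R reads at startup.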

1. Retrieve full-text of an article

The Wiley TDM API returns the full-text of an article as a PDF when given the article’s DOI.

In the first example, we download the full-text of the article with the DOI “10.1002/net.22207”. This article was found on the Wiley Online Library.

# DOI to download
doi <- "10.1002/net.22207"
url <- paste0("https://api.wiley.com/onlinelibrary/tdm/v1/articles/", doi)

response <- GET(url, headers)

if (status_code(response) == 200) {
  # Download if status code indicates success
  filename <- paste0(gsub("/", "_", doi), ".pdf")
  writeBin(content(response, "raw"), filename)
  cat(paste0(filename, " downloaded successfully\n"))
  
} else {
  # Print status code if unsuccessful
  cat(paste0("Failed to download PDF. Status code: ", status_code(response), "\n"))
}
## 10.1002_net.22207.pdf downloaded successfully
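
A 200 status code does not by itself prove that the response body is a PDF. As an optional extra safeguard, the bytes can be checked for the `%PDF` signature that PDF files begin with before they are written to disk. This is a sketch rather than part of the recipe above, and `looks_like_pdf` is a hypothetical helper.

```r
# Hypothetical helper: TRUE if a raw vector starts with the "%PDF" signature
looks_like_pdf <- function(bytes) {
  is.raw(bytes) &&
    length(bytes) >= 4 &&
    identical(bytes[1:4], charToRaw("%PDF"))
}

# Offline demonstration with made-up bytes (no API call is made here)
looks_like_pdf(charToRaw("%PDF-1.7 example"))    # TRUE
looks_like_pdf(charToRaw("<html>error</html>"))  # FALSE
```

In the recipe above, the check would be applied to `content(response, "raw")` before the call to `writeBin()`.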

2. Retrieve full-text of multiple articles

In this example, we request six DOIs: five articles found in the Wiley Online Library, plus one deliberately invalid DOI to show how a failed request is reported.

# DOIs of articles to download
dois <- c(
  "10.1111/j.1467-8624.2010.01564.x",
  "10.1111/1467-8624.00164",
  "10.1111/cdev.12864",
  "10.1111/j.1467-8624.2007.00995.x",
  "10.1111/j.1467-8624.2010.01499.x",
  "10.1111/j.1467-8624.2010.0149.x"  # Invalid DOI; the request returns a 404 status code
)

# Loop through DOIs and download each article
for (doi in dois) {
  url <- paste0("https://api.wiley.com/onlinelibrary/tdm/v1/articles/", doi)
  response <- GET(url, headers)
  
  if (status_code(response) == 200) {
    # Download if status code indicates success
    filename <- paste0(gsub("/", "_", doi), ".pdf")
    writeBin(content(response, "raw"), filename)
    cat(paste0(filename, " downloaded successfully\n"))
    
  } else {
    # Print status code if unsuccessful
    cat(paste0("Failed to download PDF. Status code: ", status_code(response), "\n"))
  }
  
  # Wait 1 second to be nice to Wiley's servers
  Sys.sleep(1)
}
## 10.1111_j.1467-8624.2010.01564.x.pdf downloaded successfully
## 10.1111_1467-8624.00164.pdf downloaded successfully
## 10.1111_cdev.12864.pdf downloaded successfully
## 10.1111_j.1467-8624.2007.00995.x.pdf downloaded successfully
## 10.1111_j.1467-8624.2010.01499.x.pdf downloaded successfully
## Failed to download PDF. Status code: 404
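
The failure message above does not say which DOI failed. One possible approach is to record each DOI with its status code and review the results after the loop. The sketch below assumes the status codes have already been collected (for example, by storing `status_code(response)` on each iteration). `summarize_downloads` is a hypothetical helper, and the two rows shown reuse outcomes printed above.

```r
# Hypothetical helper: pair each DOI with its status code and, for successful
# requests, the filename the PDF was saved under
summarize_downloads <- function(dois, status_codes) {
  stopifnot(length(dois) == length(status_codes))
  data.frame(
    doi = dois,
    status = status_codes,
    filename = ifelse(status_codes == 200,
                      paste0(gsub("/", "_", dois, fixed = TRUE), ".pdf"),
                      NA_character_),
    stringsAsFactors = FALSE
  )
}

# Illustration using two of the outcomes printed above
results <- summarize_downloads(
  dois = c("10.1111/cdev.12864", "10.1111/j.1467-8624.2010.0149.x"),
  status_codes = c(200, 404)
)

# Show only the requests that did not succeed
print(results[results$status != 200, c("doi", "status")])
```

Keeping the results in a data frame also makes it straightforward to retry only the failed DOIs later.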