Wiley Text and Data Mining (TDM) in Python#
by Michael T. Moen
The Wiley Text and Data Mining (TDM) API allows users to retrieve the full text of subscribed Wiley content in PDF form. TDM use is limited to non-commercial scholarly research; see the terms and restrictions in the links below.
This tutorial is intended to help facilitate academic research. Please check with your institution about its Text and Data Mining or related license agreement with Wiley.
Please see the following resources for more information on API usage:
Documentation
Terms
Data Reuse
Wiley TDM Data Reuse (see sections 4 and 5 of Text and Data Mining Agreement)
These recipe examples were tested on April 4, 2025.
NOTE: The Wiley TDM API allows a maximum of 3 requests per second.
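One way to stay under the rate limit is to enforce a minimum interval between consecutive requests. The sketch below is only an illustration: the `Throttle` class and its parameters are hypothetical helpers and not part of the Wiley API or of this tutorial's code. The clock and sleep functions are injectable so the logic can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, max_per_second=3, clock=time.monotonic, sleeper=time.sleep):
        # 3 requests per second means at least 1/3 s between requests
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleeper = sleeper
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to respect the limit, then record the call time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleeper(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each `requests.get(...)` would space requests at least one third of a second apart with the default setting. A fixed `sleep(1)` between requests, as used later in this tutorial, is simpler and more conservative.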
Setup#
Import Libraries#
The following external libraries need to be installed into your environment to run the code examples in this tutorial:
requests
python-dotenv
We import the libraries used in this tutorial below:
import os
import requests
from time import sleep
from dotenv import load_dotenv
Import Text and Data Mining Token#
A token is required for text and data mining with Wiley. You can sign up for one here.
We keep our token in a .env file and use the dotenv library to access it. If you would like to use this method, create a .env file and add the following line to it:
WILEY_TDM_TOKEN=PUT_YOUR_TOKEN_HERE
load_dotenv()
try:
    WILEY_TDM_TOKEN = os.environ["WILEY_TDM_TOKEN"]
except KeyError:
    print("Token not found. Please set 'WILEY_TDM_TOKEN' in your .env file.")
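The cell above only prints a message when the token is missing, so later cells would fail with a `NameError`. As a possible alternative, the sketch below raises an error right away and builds the request header in one place. The function names `get_tdm_token` and `build_headers` are hypothetical; only the `WILEY_TDM_TOKEN` variable and the `Wiley-TDM-Client-Token` header name come from this tutorial.

```python
import os


def get_tdm_token(env=None, name='WILEY_TDM_TOKEN'):
    """Return the token from the environment, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    token = env.get(name, '').strip()
    if not token:
        raise RuntimeError(f"Token not found. Please set '{name}' in your .env file.")
    return token


def build_headers(token):
    """Build the header dictionary sent with every TDM request."""
    return {'Wiley-TDM-Client-Token': token}
```

After `load_dotenv()` has run, `headers = build_headers(get_tdm_token())` would produce the same header dictionary used in the examples that follow.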
1. Retrieve Full-Text of an Article#
The Wiley TDM API returns the full text of an article as a PDF when given the article’s DOI.
In the first example, we download the full text of the article with the DOI “10.1002/net.22207”. This article was found on the Wiley Online Library.
# DOI of article to download
doi = '10.1002/net.22207'
url = f'https://api.wiley.com/onlinelibrary/tdm/v1/articles/{doi}'
# The token is sent in a custom request header
headers = {
    "Wiley-TDM-Client-Token": WILEY_TDM_TOKEN
}
response = requests.get(url, headers=headers)
# Download PDF if status code indicates success
if response.status_code == 200:
    # Replace the slash so the DOI can be used as a filename
    filename = f'{doi.replace("/", "_")}.pdf'
    with open(filename, 'wb') as file:
        file.write(response.content)
    print(f'{filename} downloaded successfully')
else:
    print(f'Failed to download PDF. Status code: {response.status_code}')
10.1002_net.22207.pdf downloaded successfully
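The example above places the DOI directly in the URL and only replaces the slash when building the filename. Some DOIs contain other characters that are awkward in URLs or filenames, so a more defensive approach is sketched below. It assumes that the API accepts a percent-encoded DOI in the path; the helper names `build_article_url` and `doi_to_filename` are hypothetical.

```python
import re
from urllib.parse import quote

TDM_BASE_URL = 'https://api.wiley.com/onlinelibrary/tdm/v1/articles/'


def build_article_url(doi):
    """Percent-encode the whole DOI (including '/') and append it to the base URL."""
    return TDM_BASE_URL + quote(doi, safe='')


def doi_to_filename(doi):
    """Replace every character outside letters, digits, '.', '_' and '-' with '_'."""
    return re.sub(r'[^A-Za-z0-9._-]', '_', doi) + '.pdf'


print(build_article_url('10.1002/net.22207'))
print(doi_to_filename('10.1002/net.22207'))
```

For the DOI used above, `doi_to_filename` gives the same filename as the original `replace('/', '_')` approach, so the two can be swapped without changing the output files.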
2. Retrieve Full-Text of Multiple Articles#
In this example, we download 5 articles found in the Wiley Online Library. The list also includes a sixth, deliberately invalid DOI to show how a failed request is reported:
# DOIs of articles to download
dois = [
    '10.1111/j.1467-8624.2010.01564.x',
    '10.1111/1467-8624.00164',
    '10.1111/cdev.12864',
    '10.1111/j.1467-8624.2007.00995.x',
    '10.1111/j.1467-8624.2010.01499.x',
    '10.1111/j.1467-8624.2010.0149.x'   # Invalid DOI, returns a 404 status code
]
# Send an HTTP request for each DOI
for doi in dois:
    url = f'https://api.wiley.com/onlinelibrary/tdm/v1/articles/{doi}'
    response = requests.get(url, headers=headers)
    # Download PDF if status code indicates success
    if response.status_code == 200:
        filename = f'{doi.replace("/", "_")}.pdf'
        with open(filename, 'wb') as file:
            file.write(response.content)
        print(f'{filename} downloaded successfully')
    else:
        print(f'Failed to download PDF for {doi}.')
        print(f'Status code: {response.status_code}')
    sleep(1)    # Wait 1 second between requests to be nice to Wiley's servers
10.1111_j.1467-8624.2010.01564.x.pdf downloaded successfully
10.1111_1467-8624.00164.pdf downloaded successfully
10.1111_cdev.12864.pdf downloaded successfully
10.1111_j.1467-8624.2007.00995.x.pdf downloaded successfully
10.1111_j.1467-8624.2010.01499.x.pdf downloaded successfully
Failed to download PDF for 10.1111/j.1467-8624.2010.0149.x.
Status code: 404
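When downloading many articles, it can help to collect the failures instead of only printing them, so they can be inspected or retried later. The sketch below restructures the loop above into a function. The names `download_all`, `save_pdf`, and the `fetch` callback are hypothetical: `fetch` is assumed to take a DOI and return a `(status_code, content)` pair, for example by wrapping `requests.get` with the URL and headers shown earlier.

```python
from pathlib import Path
from time import sleep


def save_pdf(filename, content):
    """Write the PDF bytes to disk in the current directory."""
    Path(filename).write_bytes(content)


def download_all(dois, fetch, save=save_pdf, delay=1.0, pause=sleep):
    """Download each DOI in turn; return (saved_filenames, failed_status_by_doi)."""
    saved, failed = [], {}
    for i, doi in enumerate(dois):
        if i > 0:
            pause(delay)  # space out requests to stay well under the rate limit
        status, content = fetch(doi)
        if status == 200:
            filename = doi.replace('/', '_') + '.pdf'
            save(filename, content)
            saved.append(filename)
        else:
            failed[doi] = status  # remember the status code for later inspection
    return saved, failed
```

With the six DOIs above, `failed` would be expected to contain a single entry mapping the invalid DOI to its 404 status code, which makes it straightforward to log or re-check problem DOIs after the run.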