Scopus API in Python#
By Vincent F. Scalfani and Avery Fernandez
The Scopus API, provided by Elsevier, offers programmatic access to a comprehensive database of abstracts and citations from peer-reviewed literature. It supports advanced search capabilities, author and affiliation retrieval, and citation analysis, facilitating a wide range of academic and research applications.
This tutorial is intended to support academic research.
Please see the following resources for more information on API usage:
- Documentation 
- Terms 
- Data Reuse 
- Scopus Platform 
NOTE: The Scopus API limits requests to a maximum of 2 per second.
These recipe examples were tested on May 7, 2025.
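To stay under this limit, the recipe code below pauses between calls with sleep(). If you would rather enforce the limit in one place, a small wrapper such as the hypothetical throttled_get below is one option (a self-contained sketch; the function name and 0.6-second delay are our own choices, not part of the Scopus API):
import time
import requests

def throttled_get(url, params=None, delay=0.6):
    """GET a URL, then pause so successive calls stay under ~2 requests/second."""
    response = requests.get(url, params=params)
    time.sleep(delay)  # 0.6 s spacing keeps calls below the 2 requests/second cap
    return response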
Setup#
Import Libraries#
The following external libraries need to be installed into your environment to run the code examples in this tutorial:
- requests
- python-dotenv
- pandas
We import the libraries used in this tutorial below:
import requests
from time import sleep
from pprint import pprint
from dotenv import load_dotenv
import os
import pandas as pd
Import API Key#
An API key is required to access the Scopus API. You can sign up for one at the Scopus Developer Portal.
We keep our API key in a separate .env file and use the dotenv library to access it. If you use this method, create a file named .env in the same directory as this notebook and add the following line to it:
SCOPUS_API_KEY=PUT_YOUR_API_KEY_HERE
load_dotenv()
try:
    API_KEY = os.environ["SCOPUS_API_KEY"]
except KeyError:
    print("API key not found. Please set 'SCOPUS_API_KEY' in your .env file.")
else:
    print("Environment and API key successfully loaded.")
Environment and API key successfully loaded.
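The examples that follow all query the Scopus Search API, so we define its base URL once here for reuse in each request:
# Base URL for the Scopus Search API
BASE_URL = "https://api.elsevier.com/content/search/scopus"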
3. Get References via a Title Search#
Number of Title Match Records#
# Search Scopus for all references containing 'ChemSpider' in the record title
params = {
    "query": "TITLE(ChemSpider)",
    "apiKey": API_KEY,
    "httpAccept": "application/json"
}
try:
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()  # Raise an error for bad responses
    data = response.json()
    print(data["search-results"]["opensearch:totalResults"])
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
7
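To see what metadata each record contains, you can pretty-print one entry from the same response using pprint (imported above). This reuses the data variable from the previous cell; the exact fields returned can vary by record and by your API key entitlements:
# Inspect the metadata fields available in the first returned record
pprint(data["search-results"]["entry"][0])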
# Repeat this in a loop
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']
# Get number of Scopus records for each title search
num_records_title = []
for titleWord in titleWord_list:
    # Set up query parameters
    params = {
        "query": f"TITLE({titleWord})",
        "apiKey": API_KEY,
        "httpAccept": "application/json"
    }
    try:
        # Make the API request
        response = requests.get(BASE_URL, params=params)
        response.raise_for_status()  # Raise an error for bad responses
        data = response.json()
        # Extract the total number of results
        numt = data["search-results"]["opensearch:totalResults"]
        # Compile saved Scopus data into a list of lists
        num_records_title.append([titleWord, numt])
        # Delay 1 second between API calls to be nice to Elsevier servers
        sleep(1)
    except requests.exceptions.RequestException as e:
        print(f"An error occurred for {titleWord}: {e}")
        num_records_title.append([titleWord, None])
num_records_title
[['ChemSpider', '7'],
 ['PubChem', '102'],
 ['ChEMBL', '64'],
 ['Reaxys', '9'],
 ['SciFinder', '34']]
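If you prefer a tabular view, the list of lists above converts directly to a pandas DataFrame (an optional step; the rest of the tutorial uses the list as-is):
# Optional: display the title-word counts as a DataFrame
counts_df = pd.DataFrame(num_records_title, columns=["titleWord", "numRecords"])
counts_df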
Download Title Match Record Data#
# Download records and create a list of selected metadata
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']
scopus_title_data = []
for titleWord in titleWord_list:
    # Set up query parameters
    params = {
        "query": f"TITLE({titleWord})",
        "apiKey": API_KEY,
        "httpAccept": "application/json"
    }
    try:
        # Make the API request
        response = requests.get(BASE_URL, params=params)
        # Delay 1 second between API calls to be nice to Elsevier servers
        sleep(1)
        # Raise an error for bad responses
        response.raise_for_status()  
        data = response.json()
        # Extract the list of 'entry' records from the response
        entries = data['search-results'].get('entry', [])
        for entry in entries:
            # Extract relevant metadata
            doi = entry.get('prism:doi', None)
            title = entry.get('dc:title', None)
            coverDate = entry.get('prism:coverDate', None)
            # Append to the list
            scopus_title_data.append([titleWord, doi, title, coverDate])
    except requests.exceptions.RequestException as e:
        print(f"An error occurred for {titleWord}: {e}")
        scopus_title_data.append([titleWord, None, None, None])
# Convert the collected metadata into a DataFrame with descriptive column names
scopus_title_data_df = pd.DataFrame(scopus_title_data,
                                    columns=["titleWord", "doi", "title", "coverDate"])
scopus_title_data_df
|  | titleWord | doi | title | coverDate |
|---|---|---|---|---|
| 0 | ChemSpider | 10.1039/c5np90022k | Editorial: ChemSpider-a tool for Natural Produ... | 2015-08-01 | 
| 1 | ChemSpider | 10.1021/bk-2013-1128.ch020 | ChemSpider: How a free community resource of d... | 2013-01-01 | 
| 2 | ChemSpider | 10.1007/s13361-011-0265-y | Identification of "known unknowns" utilizing a... | 2012-01-01 | 
| 3 | ChemSpider | 10.1002/9781118026038.ch22 | Chemspider: A Platform for Crowdsourced Collab... | 2011-05-03 | 
| 4 | ChemSpider | 10.1021/ed100697w | Chemspider: An online chemical information res... | 2010-11-01 | 
| ... | ... | ... | ... | ... | 
| 86 | SciFinder | None | SciFinder not affordable [1] | 2006-03-13 | 
| 87 | SciFinder | 10.1021/ci050481b | SciFinder Scholar 2006: An empirical analysis ... | 2006-01-01 | 
| 88 | SciFinder | 10.2174/1570163054064693 | Exploration tools for drug discovery and beyon... | 2005-06-01 | 
| 89 | SciFinder | 10.1021/ed082p652 | A literature exercise using SciFinder Scholar ... | 2005-01-01 | 
| 90 | SciFinder | 10.1002/asi.10192 | Analysis of SciFinder scholar and web of scien... | 2002-12-01 | 
91 rows × 4 columns
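Note that a single Scopus Search request returns at most 25 records by default, which is why the table above holds 91 rows (7 + 25 + 25 + 9 + 25) even though PubChem alone matched 102 records. To download every matching record, you can page through the results with the API's start and count parameters. A minimal sketch for one title word, reusing BASE_URL and API_KEY from above, is shown below:
# Sketch: retrieve all matching records for one title word by paging
# through the results with the 'start' and 'count' parameters
all_entries = []
start = 0
count = 25  # records per page
while True:
    params = {
        "query": "TITLE(PubChem)",
        "apiKey": API_KEY,
        "httpAccept": "application/json",
        "start": start,
        "count": count,
    }
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    # Delay 1 second between API calls to be nice to Elsevier servers
    sleep(1)
    results = response.json()["search-results"]
    entries = results.get("entry", [])
    all_entries.extend(entries)
    start += count
    # Stop once we have paged past the total number of results
    if start >= int(results["opensearch:totalResults"]) or not entries:
        break
print(f"Retrieved {len(all_entries)} records")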
