Scopus API in Python#
By Vincent F. Scalfani and Avery Fernandez
The Scopus API, provided by Elsevier, offers programmatic access to a comprehensive database of abstracts and citations from peer-reviewed literature. It supports advanced search capabilities, author and affiliation retrieval, and citation analysis, facilitating a wide range of academic and research applications.
This tutorial content is intended to facilitate academic research.
Please see the following resources for more information on API usage:
- Documentation
- Terms
- Data Reuse
- Scopus Platform
NOTE: The Scopus API limits requests to a maximum of 2 per second.
These recipe examples were tested on May 7, 2025.
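Since the limit is 2 requests per second, one option is to route all calls through a small throttling helper. Below is a minimal sketch; the helper name and structure are our own convention, not part of the Scopus API:

import time
import requests

MIN_INTERVAL = 0.5  # seconds between calls, i.e., at most 2 requests/second
_last_call = 0.0

def throttled_get(url, **kwargs):
    """Send a GET request, sleeping as needed to stay under 2 requests/second."""
    global _last_call
    wait = MIN_INTERVAL - (time.time() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.time()
    return requests.get(url, **kwargs)

For simplicity, the examples in this tutorial instead add an explicit one-second sleep between calls.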
Setup#
Import Libraries#
The following external libraries need to be installed into your environment to run the code examples in this tutorial:

- requests
- python-dotenv
- pandas

We import the libraries used in this tutorial below:
import requests
from time import sleep
from pprint import pprint
from dotenv import load_dotenv
import os
import pandas as pd
Import API Key#
An API key is required to access the Scopus API. You can sign up for one at the Scopus Developer Portal.
We keep our API key in a separate .env file and use the dotenv library to access it. If you use this method, create a file named .env in the same directory as this notebook and add the following line to it:
SCOPUS_API_KEY=PUT_YOUR_API_KEY_HERE
load_dotenv()
try:
    API_KEY = os.environ["SCOPUS_API_KEY"]
except KeyError:
    print("API key not found. Please set 'SCOPUS_API_KEY' in your .env file.")
else:
    print("Environment and API key successfully loaded.")
Environment and API key successfully loaded.
Get References via a Title Search#
Number of Title Match Records#
# Scopus Search API endpoint
BASE_URL = "https://api.elsevier.com/content/search/scopus"

# Search Scopus for all references containing 'ChemSpider' in the record title
params = {
    "query": "TITLE(ChemSpider)",
    "apiKey": API_KEY,
    "httpAccept": "application/json"
}

try:
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()  # Raise an error for bad responses
    data = response.json()
    print(data["search-results"]["opensearch:totalResults"])
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
7
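As an aside, TITLE() restricts matching to the record title, but Scopus supports other search fields as well, such as TITLE-ABS-KEY(), which also searches abstracts and keywords. A quick sketch comparing the two (the loop structure here is our own; the examples below continue with TITLE() searches):

# Compare title-only matches with title/abstract/keyword matches for the same term
for field in ["TITLE", "TITLE-ABS-KEY"]:
    params = {
        "query": f"{field}(ChemSpider)",
        "apiKey": API_KEY,
        "httpAccept": "application/json"
    }
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    total = response.json()["search-results"]["opensearch:totalResults"]
    print(f"{field}: {total} records")
    sleep(1)  # stay under the 2 requests/second limit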
# Repeat this in a loop
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']

# Get number of Scopus records for each title search
num_records_title = []
for titleWord in titleWord_list:
    # Set up query parameters
    params = {
        "query": f"TITLE({titleWord})",
        "apiKey": API_KEY,
        "httpAccept": "application/json"
    }
    try:
        # Make the API request
        response = requests.get(BASE_URL, params=params)
        response.raise_for_status()  # Raise an error for bad responses
        data = response.json()

        # Extract the total number of results
        numt = data["search-results"]["opensearch:totalResults"]

        # Compile saved Scopus data into a list of lists
        num_records_title.append([titleWord, numt])

        # Delay 1 second between API calls to be nice to Elsevier servers
        sleep(1)
    except requests.exceptions.RequestException as e:
        print(f"An error occurred for {titleWord}: {e}")
        num_records_title.append([titleWord, None])

num_records_title
[['ChemSpider', '7'],
['PubChem', '102'],
['ChEMBL', '64'],
['Reaxys', '9'],
['SciFinder', '34']]
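As an optional convenience (not part of the original recipe), the list of lists can be loaded into a pandas DataFrame for easier viewing; the column names below are our own:

# Display the title-word counts as a small DataFrame
num_records_title_df = pd.DataFrame(num_records_title, columns=["titleWord", "numRecords"])
print(num_records_title_df)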
Download Title Match Record Data#
# Download records and create a list of selected metadata
titleWord_list = ['ChemSpider', 'PubChem', 'ChEMBL', 'Reaxys', 'SciFinder']

scopus_title_data = []
for titleWord in titleWord_list:
    # Set up query parameters
    params = {
        "query": f"TITLE({titleWord})",
        "apiKey": API_KEY,
        "httpAccept": "application/json"
    }
    try:
        # Make the API request
        response = requests.get(BASE_URL, params=params)

        # Delay 1 second between API calls to be nice to Elsevier servers
        sleep(1)

        # Raise an error for bad responses
        response.raise_for_status()
        data = response.json()

        # Extract the 'entry' data from the search results
        entries = data['search-results'].get('entry', [])
        for entry in entries:
            # Extract relevant metadata
            doi = entry.get('prism:doi', None)
            title = entry.get('dc:title', None)
            coverDate = entry.get('prism:coverDate', None)

            # Append to the list
            scopus_title_data.append([titleWord, doi, title, coverDate])
    except requests.exceptions.RequestException as e:
        print(f"An error occurred for {titleWord}: {e}")
        scopus_title_data.append([titleWord, None, None, None])

# Add to DataFrame
scopus_title_data_df = pd.DataFrame(scopus_title_data)
scopus_title_data_df.rename(columns={0: "titleWord", 1: "doi", 2: "title", 3: "coverDate"},
                            inplace=True)
scopus_title_data_df
|    | titleWord  | doi                        | title                                             | coverDate  |
|----|------------|----------------------------|---------------------------------------------------|------------|
| 0  | ChemSpider | 10.1039/c5np90022k         | Editorial: ChemSpider-a tool for Natural Produ... | 2015-08-01 |
| 1  | ChemSpider | 10.1021/bk-2013-1128.ch020 | ChemSpider: How a free community resource of d... | 2013-01-01 |
| 2  | ChemSpider | 10.1007/s13361-011-0265-y  | Identification of "known unknowns" utilizing a... | 2012-01-01 |
| 3  | ChemSpider | 10.1002/9781118026038.ch22 | Chemspider: A Platform for Crowdsourced Collab... | 2011-05-03 |
| 4  | ChemSpider | 10.1021/ed100697w          | Chemspider: An online chemical information res... | 2010-11-01 |
| ...| ...        | ...                        | ...                                               | ...        |
| 86 | SciFinder  | None                       | SciFinder not affordable [1]                      | 2006-03-13 |
| 87 | SciFinder  | 10.1021/ci050481b          | SciFinder Scholar 2006: An empirical analysis ... | 2006-01-01 |
| 88 | SciFinder  | 10.2174/1570163054064693   | Exploration tools for drug discovery and beyon... | 2005-06-01 |
| 89 | SciFinder  | 10.1021/ed082p652          | A literature exercise using SciFinder Scholar ... | 2005-01-01 |
| 90 | SciFinder  | 10.1002/asi.10192          | Analysis of SciFinder scholar and web of scien... | 2002-12-01 |

91 rows × 4 columns
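Note that the Scopus Search API returns results one page at a time (25 records per page by default), so the loop above only collects the first page of each search. That is why 91 rows appear here even though the title searches matched 216 records in total (7 + 25 + 25 + 9 + 25 = 91). Retrieving every match requires paging through the results with the start and count parameters. A minimal sketch for a single title search, using the same request pattern as above:

# Page through all results for one title search using 'start' and 'count'
all_entries = []
start = 0
count = 25  # records per page
total = None

while total is None or start < total:
    params = {
        "query": "TITLE(PubChem)",
        "apiKey": API_KEY,
        "httpAccept": "application/json",
        "start": start,
        "count": count
    }
    response = requests.get(BASE_URL, params=params)
    response.raise_for_status()
    data = response.json()

    # Total number of matching records, reported with every page
    total = int(data["search-results"]["opensearch:totalResults"])
    all_entries.extend(data["search-results"].get("entry", []))

    start += count
    sleep(1)  # stay under the 2 requests/second limit

print(f"Retrieved {len(all_entries)} of {total} records")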