Z39.50 in Bash#
by Vincent F. Scalfani and Cyrus Gomes
Note
This is an alpha version of a Z39.50 tutorial; we are still learning about Z39.50 query construction and how best to apply a workflow for searching library catalogs programmatically. Comments are certainly welcome.
UA Libraries Z39.50 Connection Details: http://www.lib.ua.edu/research-tools/university-libraries-catalog/z39-50-connection/
These recipe examples were tested in February 2023 using Ubuntu 22.04 LTS and YAZ 5.31.1.
Program requirements#
For sections 1 through 3 below, you will need to install Index Data’s yaz-client.
yaz-client is a Z39.50 and SRU (Search/Retrieve via URL) command-line program.
More broadly, the YAZ suite of software is an open source toolkit:
On Debian-based Linux systems, you can install yaz-client (and its dependencies) via apt-get:
sudo apt-get install yaz yaz-doc
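Before running the examples, it can help to confirm that yaz-client is actually on your PATH. The following is a minimal sketch; the helper name have_cmd is our own and is not part of YAZ:

```bash
#!/usr/bin/env bash
# Return success if the named program can be found on PATH.
# (have_cmd is a hypothetical helper name used only for this sketch.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd yaz-client; then
    echo "yaz-client found: $(command -v yaz-client)"
else
    echo "yaz-client not found; install the yaz package first" >&2
fi
```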
For other ways to install yaz and yaz-client, see Index Data’s documentation:
A very brief introduction to Z39.50#
What is Z39.50?#
Z39.50 is a standardized protocol that defines communication rules between a client computer (e.g., your computer) and a server computer (e.g., a library database server) [1]. The protocol also includes a query language, defines record syntax, and other important parameters for exchanging information [2]. For additional Z39.50 background and references, see [3], [4], and [5]. A copy of the Z39.50 standard is available from the Library of Congress [6].
Z39.50 has been around for decades and is still widely used in libraries at the staff level, for example, to retrieve metadata and search partner library catalogs.
What do end-users in 2023 use Z39.50 for?#
Today, most end-users search library catalogs through a graphical web interface (via HTTP/HTTPS), not Z39.50. In our experience, end-users still use Z39.50 to configure bibliographic management software to search the UA Libraries catalog.
What else can I use Z39.50 for?#
You can use Z39.50 to search library catalogs programmatically with your own custom scripts and software.
Why would I use Z39.50 instead of a web service API?#
If the information is available via a web service (e.g., REST) API, it’s definitely easier to use an API compared to Z39.50. However, to our knowledge, there is not wide availability of API access for individual institution level library catalog searches. As a result, Z39.50 may be your only choice.
How do I find Z39.50 connection details?#
You can find UA Libraries Z39.50 connection details here:
http://www.lib.ua.edu/research-tools/university-libraries-catalog/z39-50-connection/
For other Z39.50 connection details, see the Library of Congress Z39.50 Gateway:
Index Data also maintains some Z39.50 accessible database indexes such as Project Gutenberg and Wikipedia:
How do I construct Z39.50 queries?#
Constructing Z39.50 queries is likely different from what you are used to with modern database Boolean queries. Z39.50 allows for different query specifications (see p. 23 in [6]). A commonly implemented specification is the Type-1 query, which is Reverse Polish Notation (RPN) with the bib-1 attribute set. In yaz-client, these queries are entered in Prefix Query Format (PQF), in which each operator precedes its operands. Standard operators typically accepted include AND, OR, and AND-NOT. Result sets are often limited to a maximum of 10,000 records.
The entire bib-1 attribute set can be viewed in the manual pages on Linux-based systems:
man bib1-attr
or online at: https://www.loc.gov/z3950/agency/defns/bib1.html
The bib-1 set defines several attribute types, including Use (1), Relation (2), Position (3), Structure (4), Truncation (5), and Completeness (6).
Here are a few selected examples for each attribute type along with the corresponding value and name:
# not a complete set, examples only.
USE (1)
1 Personal-name
4 Title
7 ISBN
16 LC-call-number
21 Subject-heading
30 Date
62 Abstract
1001 Record-type
1003 Author
1016 Any
1018 Publisher
1023 Indexed-by
1036 Author-Title-Subject
RELATION (2)
1 Less than
2 Less than or equal
3 Equal
4 Greater or equal
5 Greater than
6 Not equal
POSITION (3)
1 First in field
2 First in subfield
3 Any position in field
STRUCTURE (4)
1 Phrase
2 Word
3 Key
4 Year
TRUNCATION (5)
1 Right truncation
2 Left truncation
3 Left and right truncation
100 Do not truncate
COMPLETENESS (6)
1 Incomplete subfield
2 Complete subfield
3 Complete field
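To avoid memorizing numeric values, a small lookup function can translate a readable name into the corresponding Use attribute value. This is only a sketch covering a handful of the Use values listed above; the function name use_attr and the lowercase keys are our own choices:

```bash
# Map a few bib-1 Use attribute names (from the list above) to their values.
# use_attr and its lowercase keys are hypothetical names for this sketch.
use_attr() {
    case "$1" in
        personal-name)   echo 1 ;;
        title)           echo 4 ;;
        isbn)            echo 7 ;;
        lc-call-number)  echo 16 ;;
        subject-heading) echo 21 ;;
        author)          echo 1003 ;;
        any)             echo 1016 ;;
        *) echo "unknown Use attribute: $1" >&2; return 1 ;;
    esac
}

# Print a query fragment for a title search.
printf '@attr 1=%s "%s"\n' "$(use_attr title)" "cheminformatics"
```

The last line prints a title search in the same form used throughout this tutorial.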
Hint
Something to be aware of is that Z39.50 implementations do not have to support all bib-1 attributes, so you will want to look at the Z39.50 connection details carefully for a list of supported attributes. For example, the UA Z39.50 implementation does not support relation attributes; all relations are considered equal.
To construct a query, you first define the operator (if needed), then the attribute(s), then the keyword(s). Here are a few basic examples:
# search for `cheminformatics` in the title field
@attr 1=4 "cheminformatics"
# search for `cheminformatics` in the title field at first position with truncation
@attr 1=4 @attr 3=1 @attr 5=1 "cheminformatics"
# search for `cheminformatics` in the title field and author `noordik`
@and @attr 1=4 "cheminformatics" @attr 1=1003 "noordik"
# search for `cheminformatics` in the title field but not "bioinformatics"
@not @attr 1=4 "cheminformatics" @attr 1=4 "bioinformatics"
# search for `drug discovery` in the abstract or title
@or @attr 1=4 "drug discovery" @attr 1=62 "drug discovery"
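Because the operator always comes first, queries like these are easy to assemble from smaller pieces in a script. Here is a minimal sketch; pqf_term and pqf_bool are hypothetical helper names, and the sketch assumes search terms contain no double quotes:

```bash
# Build a single Use-attribute term: pqf_term <use-value> <term>
# (assumes the term itself contains no double quotes)
pqf_term() {
    printf '@attr 1=%s "%s"' "$1" "$2"
}

# Combine two sub-queries with a boolean operator (and, or, not).
# The operator is written first, followed by its two operands.
pqf_bool() {
    printf '@%s %s %s' "$1" "$2" "$3"
}

title=$(pqf_term 4 "cheminformatics")
author=$(pqf_term 1003 "noordik")
pqf_bool and "$title" "$author"
echo
```

Since the result of pqf_bool is itself a valid sub-query, it can be passed back in as an operand to nest operators.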
1. Basic UA Libraries Catalog Searching#
We will use the yaz-client program for these search examples. First, start yaz-client in your terminal:
yaz-client
After starting yaz-client, you should see a Z> prompt in the terminal. Next, open the connection to the UA Libraries Catalog:
open library.ua.edu:7090/voyager
If the connection is successful, you should get something like this:
Output:
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID : 34
Name : Voyager LMS - Z39.50 Server
Version: 2010.3.0
Options: search present
Elapsed: 0.358596
Once connected to the UA Libraries Catalog, we can then search the catalog and retrieve records.
To exit yaz-client, type quit:
quit
Output:
See you later, alligator.
Identifier searches#
Search for the government document NAS 1.15:110209 by GPO number (1=50):
find @attr 1=50 "NAS 1.15:110209"
Output:
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1
records returned: 0
Elapsed: 0.024986
Find all LC call number (1=16) matches that start with TP145:
find @attr 1=16 "TP145"
Output:
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 92
records returned: 0
Elapsed: 0.027160
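When scripting, you will usually want just the hit count and not the full response. One possible approach is to filter the yaz-client output with sed. In this sketch, hit_counts is a hypothetical helper name and the sample text is copied from the TP145 search above:

```bash
# Read yaz-client output on stdin and print each reported hit count,
# one per line. (hit_counts is a hypothetical helper for this sketch.)
hit_counts() {
    sed -n 's/^Number of hits: \([0-9][0-9]*\).*/\1/p'
}

# Sample output copied from the TP145 search above.
sample='Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 92
records returned: 0
Elapsed: 0.027160'

printf '%s\n' "$sample" | hit_counts
```

With a live connection, the same filter could be placed at the end of a pipeline that feeds commands to yaz-client -f /dev/stdin.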
2. Searching UA Libraries Catalog in a Loop#
Here are a few ways to run multiple searches with yaz-client:
First, create a file with your queries. In this example we will search for 5 books via their ISBN identifiers:
cat mysearches
Output:
open library.ua.edu:7090/voyager
find @attr 1=7 "1683925041"
sleep 1
find @attr 1=7 "9780470183014"
sleep 1
find @attr 1=7 "1565925858"
sleep 1
find @attr 1=7 "9780136778851"
sleep 1
find @attr 1=7 "1785284444"
quit
Next, run yaz-client with the option -f:
yaz-client -f mysearches
Output:
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID : 34
Name : Voyager LMS - Z39.50 Server
Version: 2010.3.0
Options: search present
Elapsed: 0.353889
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1
records returned: 0
Elapsed: 0.007999
Done sleeping 1 seconds
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1
records returned: 0
Elapsed: 0.005176
Done sleeping 1 seconds
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 2
records returned: 0
Elapsed: 0.004862
Done sleeping 1 seconds
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1
records returned: 0
Elapsed: 0.004774
Done sleeping 1 seconds
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1
records returned: 0
Elapsed: 0.003902
See you later, alligator.
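A command file like this does not have to be written by hand. The sketch below generates one from a list of ISBNs given as arguments; make_search_file is a hypothetical helper name, and the queries are written with the full @attr 1=7 form:

```bash
# Print a yaz-client command file for the ISBNs given as arguments.
# (make_search_file is a hypothetical helper name for this sketch.)
make_search_file() {
    local first=1 isbn
    printf 'open library.ua.edu:7090/voyager\n'
    for isbn in "$@"; do
        # Pause between searches, but not before the first one.
        [ "$first" -eq 1 ] || printf 'sleep 1\n'
        first=0
        printf 'find @attr 1=7 "%s"\n' "$isbn"
    done
    printf 'quit\n'
}

# Print the generated commands; redirect to a file to use with yaz-client -f.
make_search_file 1683925041 9780470183014
```

Redirecting the output to a file (for example, mysearches) gives something that can be passed to yaz-client -f as shown above.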
Here is an alternative method with a bash loop:
for isbn in \
    "1683925041" \
    "9780470183014" \
    "1565925858" \
    "9780136778851" \
    "1785284444"
do
    # %s is replaced by the ISBN; single quotes keep the double quotes in the query
    printf 'open library.ua.edu:7090/voyager\nfind @attr 1=7 "%s"\nquit\n' "$isbn" |
        yaz-client -f /dev/stdin
    sleep 1
done
Note
/dev/stdin allows us to pass a string via stdin with the -f option, since yaz-client -f expects a file [7].
And here is a more efficient method suggested on GitHub, which does not quit yaz-client on each loop [8]:
for isbn in \
    "1683925041" \
    "9780470183014" \
    "1565925858" \
    "9780136778851" \
    "1785284444"
do
    # %s is replaced by the ISBN; single quotes keep the double quotes in the query
    printf 'open library.ua.edu:7090/voyager\nfind @attr 1=7 "%s"\nsleep 1\n' "$isbn"
done | yaz-client -f /dev/stdin
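When all searches run through a single yaz-client session, the responses arrive as one stream, so it is useful to line each hit count up with the ISBN that produced it. The following sketch does this with sed and paste. It assumes exactly one successful search response per ISBN, in the same order as the input file; a failed search would shift the pairing. pair_hits is a hypothetical helper name, and the demo uses canned output instead of a live connection:

```bash
# pair_hits <isbn-file>: read yaz-client output on stdin and print
# "ISBN<TAB>hits" lines. Assumes one successful search per ISBN, in order.
pair_hits() {
    sed -n 's/^Number of hits: \([0-9][0-9]*\).*/\1/p' | paste "$1" -
}

# Demo with a temporary ISBN list and canned output
# (hit counts taken from the batch run shown earlier).
isbn_file=$(mktemp)
printf '1683925041\n1565925858\n' > "$isbn_file"
printf 'Number of hits: 1\nrecords returned: 0\nNumber of hits: 2\n' |
    pair_hits "$isbn_file"
rm -f "$isbn_file"
```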
Finally, if you have a file with one search string per line, use a while loop so that you do not have to write out the strings or declare them as a bash variable:
cat isbns.txt
Output:
1683925041
9780470183014
1565925858
9780136778851
1785284444
cat isbns.txt |
while read -r isbn
do
    # %s is replaced by the ISBN read from the current line of isbns.txt
    printf 'open library.ua.edu:7090/voyager\nfind @attr 1=7 "%s"\nsleep 1\n' "$isbn"
done | yaz-client -f /dev/stdin
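Input files collected from spreadsheets or other sources often contain hyphens, blank lines, or Windows line endings, any of which would end up inside the query. As a sketch, the list can be normalized before it reaches the loop; clean_isbns is a hypothetical helper name, and the pattern only checks the general shape of an ISBN-10 or ISBN-13, not its check digit:

```bash
# Normalize a list of ISBNs read from stdin:
#  - remove spaces, carriage returns, and hyphens
#  - keep only lines shaped like an ISBN-10 or ISBN-13
# (clean_isbns is a hypothetical helper; check digits are not verified)
clean_isbns() {
    tr -d ' \r-' | grep -E '^([0-9]{9}[0-9Xx]|[0-9]{13})$'
}

printf '978-0-470-18301-4\n\n1565925858 \r\nnot-an-isbn\n' | clean_isbns
```

The cleaned list could then be piped into the while loop in place of cat isbns.txt.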
3. Retrieve Record(s) Data#
USmarc#
For catalog records at The University of Alabama, the default format returned within yaz-client is USmarc (MARC 21). The records are rendered as (mostly) human-readable within the terminal output.
If you are looking for “raw” MARC, that is, the complete machine-readable binary file, see the section “Saving Raw MARC data” below.
To retrieve records in the terminal with yaz-client, use the show command with a start position and an optional number of records. For example, to get the first record:
open library.ua.edu:7090/voyager
Output:
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID : 34
Name : Voyager LMS - Z39.50 Server
Version: 2010.3.0
Options: search present
Elapsed: 0.514120
find @or @attr 1=4 "dinosaur" @attr 1=4 "dinosauria"
Output:
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 222
records returned: 0
Elapsed: 0.087466
show 1
Output:
Sent presentRequest (1+1).
Records: 1
[VOYAGER]Record type: USmarc
01239cam 2200325Ka 4500
001 3444796
005 20171110111851.0
008 101221s2008 nyua b 000 0 eng d
020 $a 0760783950
020 $a 9780760783955
035 $a (OCoLC)ocn828688251
035 $a (OCoLC)828688251
035 $a 3444796
040 $a ALM $c ALM $d UtOrBLW
049 $a ALMM
050 4 $a PZ7.H672 $b Adv 2008
100 1 $a Hoff, Syd, $d 1912-2004. $0 http://id.loc.gov/authorities/names/n78086441
245 10 $a Adventures of Danny and the dinosaur / $c Syd Hoff.
264 1 $a New York : $b Barnes & Noble, $c 2008.
300 $a 128 pages : $b color illustrations ; $c 24 cm.
336 $a text $b txt $2 rdacontent
337 $a unmediated $b n $2 rdamedia
338 $a volume $b nc $2 rdacarrier
490 1 $a I can read
505 0 $a Danny and the dinosaur -- Happy birthday, Danny and the dinosaur! -- Danny and the dinosaur go to camp.
520 $a Danny goes to a museum to see the dinosaurs and ends up spending the day outside with one.
650 1 $a Dinosaurs $v Fiction.
650 0 $a Dinosaurs $v Juvenile fiction. $0 http://id.loc.gov/authorities/subjects/sh2008102274
830 0 $a I can read book. $0 http://id.loc.gov/authorities/names/n42013105
994 $a C0 $b ALM
nextResultSetPosition = 2
Elapsed: 0.060194
To show the first 3 results, add the number of records after the start position, show 1+3:
open library.ua.edu:7090/voyager
find @or @attr 1=4 "dinosaur" @attr 1=4 "dinosauria"
show 1+3
quit
To quickly scan multiple records from a search, we can pipe the USmarc stdout to grep and display selected lines:
printf 'open library.ua.edu:7090/voyager\nfind @or @attr 1=4 "dinosaur" @attr 1=4 "dinosauria"\nshow 1+10\n' | \
yaz-client -f /dev/stdin | grep "^245"
Output:
245 10 $a Adventures of Danny and the dinosaur / $c Syd Hoff.
245 10 $a Age of tephra beds at the Ocean Point dinosaur locality, North Slope, Alaska, based on K-Ar and 40Ar/39Ar analyses / $c by James E. Conrad, Edwin H. McKee, and Brent D. Turrin.
245 10 $a Age of tephra beds at the Ocean Point dinosaur locality, North Slope, Alaska, based on K-Ar and 40Ar/39Ar analyses / $c by James E. Conrad, Edwin H. McKee, and Brent D. Turrin.
245 10 $a American dinosaur abroad : $b a cultural history of Carnegie's plaster diplodocus / $c Ilja Nieuwland.
245 10 $a American experience. $p Dinosaur wars $h [videorecording] / $c WGBH Boston ; produced by Mark Davis and Anna Saraceno ; written and directed by Mark Davis.
245 14 $a The archaeology of Castle Park Dinosaur National Monument / $c by Robert F. Burgh and Charles R. Scoggin, with appendices by Edgar Anderson, Richard E. Pillmore [and] Volney H. Jones.
245 10 $a Archeological investigations at two sites in Dinosaur National Monument $h [microform] : $b 42UN1724 and 5MF2645 / $c by James A. Truesdale.
245 00 $a Artist With Dinosaur Model $h [electronic resource].
245 10 $a Atlas of dinosaur adventures / $c illustrated by Lucy Letherland ; written by Emily Hawkins.
245 10 $a Auks, rocks, and the odd dinosaur : $b inside stories from the Smithsonian's Museum of Natural History / $c Peggy Thomson.
How cool is that!
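If only the main title is needed, the $a subfield can be pulled out of each 245 line. The following sketch works on the human-readable rendering shown above (not on raw MARC) and assumes $a is the first subfield on the line; marc_title is a hypothetical helper name:

```bash
# Read rendered 245 lines on stdin and print only the $a subfield (the title
# proper), without the trailing " /" or " :" punctuation.
# (marc_title is a hypothetical helper; assumes $a is the first subfield.)
marc_title() {
    sed -n 's/^245 [0-9 ]*\$a \([^$]*\).*/\1/p' | sed 's| *[/:]* *$||'
}

# Sample lines copied from the output above.
printf '%s\n' \
    '245 10 $a Adventures of Danny and the dinosaur / $c Syd Hoff.' \
    '245 10 $a American dinosaur abroad : $b a cultural history of Carnegie'"'"'s plaster diplodocus / $c Ilja Nieuwland.' |
    marc_title
```

The first sed keeps everything between $a and the next subfield marker; the second trims the separator punctuation left at the end.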
OPAC#
The University of Alabama Catalog also supports the OPAC format, which can be useful for finding the library location or checking if a book is available:
open library.ua.edu:7090/voyager
find @attr 1=4 "core python programming"
format opac
show 1
Output:
...
...
...
Data holdings 0
typeOfRecord: x
encodingLevel: 1
receiptAcqStatus: 2
generalRetention: 8
completeness: 4
dateOfReport: 000000
nucCode: sel
localLocation: Science & Engineering Library
callNumber: QA76.73.P98 C48 2007
circulation 0
availableNow: 1
itemId: 2359071
renewable: 0
onHold: 0
nextResultSetPosition = 2
Elapsed: 0.060914
Here is a fun example: let’s look at the availability of print books under the subject heading C (Computer program language):
printf "open library.ua.edu:7090/voyager\nfind @not @attr 1=21 \"C (Computer program language)\" \
@attr 1=1016 \"electronic resource\"\nformat opac\nshow 1+10\n" | \
yaz-client -f /dev/stdin | grep --text -e "^245" -e "callNumber" -e "availableNow" -e "localLocation"
Output:
245 10 $a Applications of numerical techniques with C / $c Suresh Chandra.
localLocation: Archival Facility (use Request Item button for retrieval)
callNumber: QA297 .C49 2006
availableNow: 1
localLocation: Science & Engineering Library
callNumber: QA297 .C49 2006
availableNow: 1
245 10 $a Artificial intelligence using C / $c Herbert Schildt.
localLocation: Science & Engineering Library
callNumber: Q336 .S35 1987
availableNow: 1
245 12 $a A book on C : $b programming in C / $c Al Kelley, Ira Pohl.
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 K44 1998
availableNow: 1
245 10 $a C.
localLocation: Gorgas Library Gov. Doc.
callNumber: C 13.52:160
availableNow: 1
245 10 $a C & C++ code capsules : $b a guide for practitioners / $c Chuck Allison ; [foreword by Bruce Eckel].
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 A44 1998
availableNow: 1
245 10 $a C, an introduction to programming / $c Jim Keogh, Peter Aitken, Bradley L. Jones.
localLocation: Gorgas Library
callNumber: QA76.73.C15 K466; 1996
availableNow: 1
245 14 $a The C and UNIX dictionary : $b from absolute pathname to Zombie / $c Kaare Christian.
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 C49 1988
availableNow: 1
245 10 $a C/C++ programmers reference / $c Herbert Schildt.
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 S348; 1997
availableNow: 0
245 10 $a C for programmers : $b a complete tutorial based on the ANSI standard / $c Leendert Ammeraal.
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 A46; 1991
availableNow: 1
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 A46; 1991
availableNow: 1
245 10 $a C in a nutshell / $c Peter Prinz and Tony Crawford.
localLocation: Science & Engineering Library
callNumber: QA76.73.C15 P74 2016
availableNow: 1
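The filtered output above can be summarized further, for example to list only the titles with no copy currently available. This awk sketch assumes the input is the grep-filtered stream shown above, where each 245 line is followed by the holdings lines for that record; unavailable_titles is a hypothetical helper name:

```bash
# Read filtered OPAC output (245 / localLocation / callNumber / availableNow
# lines) on stdin and print the 245 line of every record whose copies all
# report availableNow: 0. (unavailable_titles is a hypothetical helper.)
unavailable_titles() {
    awk '
        function report() {
            if (title != "" && copies > 0 && available == 0) print title
        }
        /^245/          { report(); title = $0; copies = 0; available = 0; next }
        /availableNow:/ { copies++; available += $NF }
        END             { report() }
    '
}

# Sample lines copied from the output above.
printf '%s\n' \
    '245 10 $a C/C++ programmers reference / $c Herbert Schildt.' \
    'localLocation: Science & Engineering Library' \
    'callNumber: QA76.73.C15 S348; 1997' \
    'availableNow: 0' \
    '245 10 $a C in a nutshell / $c Peter Prinz and Tony Crawford.' \
    'localLocation: Science & Engineering Library' \
    'callNumber: QA76.73.C15 P74 2016' \
    'availableNow: 1' |
    unavailable_titles
```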
Saving Raw MARC data#
If you are looking to process or parse MARC records with software designed for MARC, you probably want the raw binary MARC. In that case, you can use the yaz-client set_marcdump command to save the results to a named binary MARC file:
open library.ua.edu:7090/voyager
find @not @attr 1=21 "C (Computer program language)" @attr 1=1016 "electronic resource"
set_marcdump C_books.marc
show 1+10
quit
If you have multiple queries and want to use a loop as shown above to save MARC data, here is one potential workflow that prints human-readable MARC to the terminal output and saves a file, isbn_records.marc, with the raw binary MARC data:
cat isbns.txt |
while read -r isbn
do
    # %s is replaced by the ISBN read from the current line of isbns.txt
    printf 'open library.ua.edu:7090/voyager\nfind @attr 1=7 "%s"\nshow 1\nsleep 1\n' "$isbn"
done | yaz-client -f /dev/stdin -m isbn_records.marc
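A quick sanity check on a saved file is to count how many records it contains. In the binary MARC (ISO 2709) format, each record ends with a record terminator byte (hexadecimal 1D, octal 035), so counting those bytes gives the record count. The sketch below uses a hypothetical helper name, count_marc_records, and demonstrates it on a small stand-in file, not real catalog data:

```bash
# Count records in a binary MARC file by counting record terminator bytes
# (octal 035). (count_marc_records is a hypothetical helper for this sketch.)
count_marc_records() {
    # Delete every byte except the terminator, then count what is left.
    LC_ALL=C tr -cd '\035' < "$1" | wc -c | tr -d ' '
}

# Demo on a stand-in file holding two fake "records" (not real MARC data).
demo_file=$(mktemp)
printf 'record-one\036\035record-two\036\035' > "$demo_file"
count_marc_records "$demo_file"
rm -f "$demo_file"
```

For the workflow above, the call would be count_marc_records isbn_records.marc, and the result should match the number of records shown in the terminal.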
References