U.S. Census Geocoding API in Python#
by Michael T. Moen
This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
Please see the following resources for more information on API usage:
Documentation
Terms of Use
These recipe examples were tested on March 7, 2025.
Setup#
The code examples in this tutorial use the following libraries. Only requests is external and needs to be installed into your environment; csv and pprint are part of the Python standard library:
import requests
import csv
from pprint import pprint
1. Address Lookup#
One of the main use cases of this API is finding the latitude and longitude of an address. In this example, we find the latitude and longitude of the Bruno Business Library at the University of Alabama.
The API allows searching through two methods: address and onelineaddress. These methods are nearly identical; the only difference is the format of the parameters passed to the API.
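To make the difference concrete, the sketch below writes the same address in both parameter formats. The helper name to_oneline is hypothetical and not part of the API; it only joins the structured fields into the single string that the onelineaddress method takes.

```python
def to_oneline(street, city, state, zip_code):
    """Join structured address fields into a single-line address string."""
    return f'{street}, {city}, {state} {zip_code}'

# Parameters for the 'address' search type: one key per address component
address_params = {
    'street': '425 Stadium Dr',
    'city': 'Tuscaloosa',
    'state': 'AL',
    'zip': '35401',
}

# Parameters for the 'onelineaddress' search type: the whole address in one key
oneline_params = {
    'address': to_oneline(
        address_params['street'],
        address_params['city'],
        address_params['state'],
        address_params['zip'],
    )
}

print(oneline_params['address'])
```

Either dictionary still needs the benchmark and format entries used in the requests in this section before it is sent to the API.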
Using address Search#
BASE_URL = 'https://geocoding.geo.census.gov/geocoder/'
return_type = 'locations'
search_type = 'address'
params = {
    # Specify the address to look up with the following parameters
    'street': '425 Stadium Dr',
    'city': 'Tuscaloosa',
    'state': 'AL',
    # Pass the ZIP code as a string so that leading zeros are preserved
    'zip': '35401',
    # Specify the version of the locator to be searched
    'benchmark': 'Public_AR_Current',
    # Specify that data should be returned in JSON format
    'format': 'json'
}
response = requests.get(f'{BASE_URL}{return_type}/{search_type}', params=params)
# Status code of 200 indicates success
response.status_code
200
response.json()
{'result': {'input': {'address': {'zip': '35401',
'city': 'Tuscaloosa',
'street': '425 Stadium Dr',
'state': 'AL'},
'benchmark': {'isDefault': True,
'benchmarkDescription': 'Public Address Ranges - Current Benchmark',
'id': '4',
'benchmarkName': 'Public_AR_Current'}},
'addressMatches': [{'tigerLine': {'side': 'L', 'tigerLineId': '636109874'},
'coordinates': {'x': -87.549700416257, 'y': 33.21105403378},
'addressComponents': {'zip': '35401',
'streetName': 'STADIUM',
'preType': '',
'city': 'TUSCALOOSA',
'preDirection': '',
'suffixDirection': '',
'fromAddress': '401',
'state': 'AL',
'suffixType': 'DR',
'toAddress': '499',
'suffixQualifier': '',
'preQualifier': ''},
'matchedAddress': '425 STADIUM DR, TUSCALOOSA, AL, 35401'}]}}
# Parse the JSON once and take the coordinates of the first match
coordinates = response.json()['result']['addressMatches'][0]['coordinates']
latitude = coordinates['y']
longitude = coordinates['x']
# Display coordinates
latitude, longitude
(33.21105403378, -87.549700416257)
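Indexing addressMatches[0] raises an IndexError when the list is empty. The sketch below is one defensive way to pull out the coordinates. It assumes that an address the geocoder cannot match comes back with an empty addressMatches list, and the function name extract_coordinates is hypothetical. The sample payloads are trimmed-down stand-ins for response.json(), not real API output.

```python
def extract_coordinates(payload):
    """Return (latitude, longitude) of the first match, or None if there is none."""
    matches = payload.get('result', {}).get('addressMatches', [])
    if not matches:
        # Assumption: an unmatched address yields an empty list here
        return None
    coords = matches[0]['coordinates']
    # The API reports longitude as 'x' and latitude as 'y'
    return coords['y'], coords['x']

# Trimmed-down payloads shaped like the response shown above
matched = {'result': {'addressMatches': [
    {'coordinates': {'x': -87.549700416257, 'y': 33.21105403378}}
]}}
unmatched = {'result': {'addressMatches': []}}

print(extract_coordinates(matched))
print(extract_coordinates(unmatched))
```

With a live request, the same function would be called as extract_coordinates(response.json()).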
Using onelineaddress Search#
return_type = 'locations'
search_type = 'onelineaddress'
params = {
    # Specify the full address to look up as a single string
    'address': '425 Stadium Dr, Tuscaloosa, AL 35401',
    # Specify the version of the locator to be searched
    'benchmark': 'Public_AR_Current',
    # Specify that data should be returned in JSON format
    'format': 'json'
}
response = requests.get(f'{BASE_URL}{return_type}/{search_type}', params=params)
# Status code of 200 indicates success
response.status_code
200
# Parse the JSON once and take the coordinates of the first match
coordinates = response.json()['result']['addressMatches'][0]['coordinates']
latitude = coordinates['y']
longitude = coordinates['x']
# Display coordinates
latitude, longitude
(33.21105403378, -87.549700416257)
2. Batch Address Lookup#
The U.S. Census Geocoding API also allows for batch geocoding with the submission of a CSV, TXT, DAT, XLS, or XLSX file. These files must contain one record per line, with each record formatted as follows: Unique ID, Street address, City, State, ZIP. Users are limited to 10,000 records per batch file.
This example uses the CSV file created below:
# Create list of addresses for the batch lookup
# Note that each record must begin with a unique ID
addresses = [
    ['1', '425 Stadium Dr', 'Tuscaloosa', 'AL', '35401'],
    ['2', '1600 Pennsylvania Avenue NW', 'Washington', 'DC', '20500'],
    ['3', '350 Fifth Avenue', 'New York', 'NY', '10118'],
    ['4', '660 Cannery Row', 'Monterey', 'CA', '93940'],
    ['5', '700 Clark Ave', 'St. Louis', 'MO', '63102']
]
# Export addresses to a CSV file
input_filename = 'batch_addresses.csv'
with open(input_filename, 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerows(addresses)
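Before uploading, the batch rules described in this section (five fields per record, a unique ID, at most 10,000 records) can be checked locally. The following is a minimal sketch of such a check; the function name validate_batch and the constant MAX_BATCH_RECORDS are hypothetical and not part of the API.

```python
MAX_BATCH_RECORDS = 10_000  # Per-file record limit stated in this tutorial

def validate_batch(records):
    """Raise ValueError if the records break the batch format rules."""
    if len(records) > MAX_BATCH_RECORDS:
        raise ValueError(f'{len(records)} records exceeds the limit of {MAX_BATCH_RECORDS}')
    seen_ids = set()
    for line_number, record in enumerate(records, start=1):
        # Each record is: Unique ID, Street address, City, State, ZIP
        if len(record) != 5:
            raise ValueError(f'Record {line_number} has {len(record)} fields, expected 5')
        record_id = record[0]
        if record_id in seen_ids:
            raise ValueError(f'Record {line_number} repeats the ID {record_id!r}')
        seen_ids.add(record_id)

sample_records = [
    ['1', '425 Stadium Dr', 'Tuscaloosa', 'AL', '35401'],
    ['2', '1600 Pennsylvania Avenue NW', 'Washington', 'DC', '20500'],
]
validate_batch(sample_records)  # Passes silently when the records are valid
```

Calling validate_batch(addresses) before writing the CSV would catch a duplicate ID or a missing field without spending a request.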
# Format parameters needed for POST request
return_type = 'locations'
params = {
    'benchmark': 'Public_AR_Current'
}
url = f'https://geocoding.geo.census.gov/geocoder/{return_type}/addressbatch'
# Open the file in a with block so that it is closed after the upload
with open(input_filename, 'rb') as address_file:
    files = {
        'addressFile': address_file
    }
    response = requests.post(url, data=params, files=files)
# Status code of 200 indicates success
response.status_code
200
# Save content of response to a new CSV
output_filename = 'geocoded_addresses.csv'
with open(output_filename, 'wb') as f:
    f.write(response.content)
# Print contents of the CSV for demonstration purposes
with open(output_filename, newline='') as f:
    csv_reader = csv.reader(f)
    for row in csv_reader:
        print(row)
['1', '425 Stadium Dr, Tuscaloosa, AL, 35401', 'Match', 'Exact', '425 STADIUM DR, TUSCALOOSA, AL, 35401', '-87.549700416257,33.211054033781', '636109874', 'L']
['2', '1600 Pennsylvania Avenue NW, Washington, DC, 20500', 'Match', 'Exact', '1600 PENNSYLVANIA AVE NW, WASHINGTON, DC, 20500', '-77.036543957308,38.898690918656', '76225813', 'L']
['3', '350 Fifth Avenue, New York, NY, 10118', 'Match', 'Exact', '350 5TH AVE, NEW YORK, NY, 10118', '-73.985077152891,40.747848600317', '59653473', 'L']
['4', '660 Cannery Row, Monterey, CA, 93940', 'Match', 'Exact', '660 CANNERY ROW, MONTEREY, CA, 93940', '-121.901280304574,36.617235842516', '647390330', 'R']
['5', '700 Clark Ave, St. Louis, MO, 63102', 'Match', 'Non_Exact', '700 CLARK AVE, SAINT LOUIS, MO, 63119', '-90.340369438036,38.602422417149', '100141071', 'R']
Note that the last two columns of the above data are the TIGER/Line ID and TIGER/Line Side. For more information on these values, please see the U.S. Census TIGER/Line Geodatabase Documentation. However, this tutorial does not utilize any TIGER/Line data.
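The batch output has no header row, and the coordinates arrive as a single "longitude,latitude" string. The sketch below is one way to turn the rows into labeled records and to flag matches that are not exact, such as record 5 above, whose matched ZIP code differs from the one submitted. The field names in FIELDS are labels chosen for this example rather than names defined by the API, and the padding step assumes that unmatched rows may contain fewer columns.

```python
import csv
import io

# Labels chosen here for the eight columns seen in the output above
FIELDS = ['id', 'input_address', 'match_status', 'match_type',
          'matched_address', 'coordinates', 'tiger_line_id', 'tiger_line_side']

def parse_batch_output(text):
    """Convert batch geocoder CSV text into a list of dictionaries."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        # Assumption: unmatched rows may be shorter, so pad them with blanks
        row = row + [''] * (len(FIELDS) - len(row))
        record = dict(zip(FIELDS, row))
        if record['coordinates']:
            # Coordinates are given as "longitude,latitude"
            lon, lat = record['coordinates'].split(',')
            record['longitude'], record['latitude'] = float(lon), float(lat)
        else:
            record['longitude'] = record['latitude'] = None
        records.append(record)
    return records

# Two rows copied from the output above, written as CSV text
sample_text = (
    '"1","425 Stadium Dr, Tuscaloosa, AL, 35401","Match","Exact",'
    '"425 STADIUM DR, TUSCALOOSA, AL, 35401","-87.549700416257,33.211054033781","636109874","L"\n'
    '"5","700 Clark Ave, St. Louis, MO, 63102","Match","Non_Exact",'
    '"700 CLARK AVE, SAINT LOUIS, MO, 63119","-90.340369438036,38.602422417149","100141071","R"\n'
)

records = parse_batch_output(sample_text)
# IDs of records that should be reviewed by hand
needs_review = [r['id'] for r in records if r['match_type'] != 'Exact']
print(needs_review)
```

With a live response, parse_batch_output(response.text) would take the place of the sample text.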
3. Retrieving Additional Geographic Data#
The geographies return type allows for the retrieval of additional data associated with a given address or set of coordinates. The example below retrieves this data using the address of the Bruno Business Library at the University of Alabama.
Note that the geographies return type requires the vintage parameter to be specified.
Users may additionally include the layers parameter, which determines the types of geography data returned. For a list of all layers, see here.
return_type = 'geographies'
search_type = 'address'
params = {
    # Specify the address to look up with the following parameters
    'street': '425 Stadium Dr',
    'city': 'Tuscaloosa',
    'state': 'AL',
    # Pass the ZIP code as a string so that leading zeros are preserved
    'zip': '35401',
    # Specify the version of the locator to be searched
    'benchmark': 'Public_AR_Current',
    # Specify the vintage
    'vintage': 'Current_Current',
    # Specify what categories of geographic data to retrieve
    'layers': 'all',
    # Specify that data should be returned in JSON format
    'format': 'json'
}
response = requests.get(f'{BASE_URL}{return_type}/{search_type}', params=params)
# Status code of 200 indicates success
response.status_code
200
Note that the geographies return type returns all of the data that the locations return type does, in addition to the geographies data.
pprint(response.json()['result']['addressMatches'][0], depth=1)
{'addressComponents': {...},
'coordinates': {...},
'geographies': {...},
'matchedAddress': '425 STADIUM DR, TUSCALOOSA, AL, 35401',
'tigerLine': {...}}
The geographies data contains the following categories:
pprint(response.json()['result']['addressMatches'][0]['geographies'], depth=1)
{'119th Congressional Districts': [...],
'2020 Census Blocks': [...],
'2020 Census Public Use Microdata Areas': [...],
'2020 Census ZIP Code Tabulation Areas': [...],
'2024 State Legislative Districts - Lower': [...],
'2024 State Legislative Districts - Upper': [...],
'Census Block Groups': [...],
'Census Divisions': [...],
'Census Regions': [...],
'Census Tracts': [...],
'Counties': [...],
'County Subdivisions': [...],
'Incorporated Places': [...],
'Metropolitan Statistical Areas': [...],
'States': [...],
'Unified School Districts': [...],
'Urban Areas': [...]}
As an example, this is how the Counties data is formatted.
response.json()['result']['addressMatches'][0]['geographies']['Counties']
[{'GEOID': '01125',
'CENTLAT': '+33.2894031',
'AREAWATER': '78666216',
'STATE': '01',
'BASENAME': 'Tuscaloosa',
'OID': '2759075608325',
'LSADC': '06',
'FUNCSTAT': 'A',
'INTPTLAT': '+33.2902197',
'NAME': 'Tuscaloosa County',
'OBJECTID': 3113,
'CENTLON': '-087.5250366',
'COUNTYCC': 'H1',
'COUNTYNS': '00161588',
'AREALAND': '3421017287',
'INTPTLON': '-087.5227834',
'MTFCC': 'G4020',
'COUNTY': '125'}]
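Each layer in the geographies data is a list of records like the one above, so a short loop can reduce the full response to the name and GEOID of every matched geography. The sketch below assumes that each record carries NAME and GEOID keys, as the Counties record does; the function name summarize_geographies is hypothetical, and the sample dictionary is a trimmed-down copy of the Counties output.

```python
def summarize_geographies(geographies):
    """Map each layer name to a list of (NAME, GEOID) pairs."""
    return {
        layer: [(entry.get('NAME'), entry.get('GEOID')) for entry in entries]
        for layer, entries in geographies.items()
    }

# Trimmed-down copy of the Counties record shown above
sample_geographies = {
    'Counties': [{
        'GEOID': '01125',
        'NAME': 'Tuscaloosa County',
        'STATE': '01',
        'COUNTY': '125',
    }]
}

summary = summarize_geographies(sample_geographies)
print(summary['Counties'])

# In this record, the county GEOID is the state code followed by the county code
county = sample_geographies['Counties'][0]
print(county['STATE'] + county['COUNTY'] == county['GEOID'])
```

With a live response, the function would be given response.json()['result']['addressMatches'][0]['geographies'].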