Looking for location-based insights in your data? Geocoding is the process of determining geographic coordinates for place names, street addresses, or ZIP codes, allowing you to explore your data spatially. Street addresses are the most common inputs to geocoding and return the most specific location data. Related tasks include:

 

  • Batch geocoding: submitting multiple locations in one request to a geocoding service, as opposed to making one API request at a time. Batch processing is typically faster.
  • Reverse geocoding: determining the nearest street address for a given latitude and longitude.
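
To make these three tasks concrete, here is a minimal Python sketch. It does not call any real geocoding service: the `TOY_INDEX` dictionary, its addresses, and its coordinates are all made up for illustration, and the function names are hypothetical. A real service does far more (address parsing, fuzzy matching, interpolation along street segments), so treat this only as a picture of the inputs and outputs involved.

```python
import math

# Toy, made-up address index for illustration only (not real geocoder output).
TOY_INDEX = {
    "100 Example St": (37.8700, -122.2700),
    "200 Sample Ave": (37.8800, -122.2600),
    "300 Demo Blvd": (37.8600, -122.2500),
}

def geocode(address):
    """Forward geocoding: address in, (lat, lon) out, or None if unmatched."""
    return TOY_INDEX.get(address)

def batch_geocode(addresses):
    """Batch geocoding: many addresses in one call, results in the same order.
    A real batch service would process the whole list in a single request."""
    return [geocode(a) for a in addresses]

def haversine_km(p, q):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def reverse_geocode(lat, lon):
    """Reverse geocoding: coordinates in, nearest known address out."""
    return min(TOY_INDEX, key=lambda addr: haversine_km((lat, lon), TOY_INDEX[addr]))

print(geocode("100 Example St"))
print(batch_geocode(["200 Sample Ave", "not in the index"]))
print(reverse_geocode(37.8701, -122.2699))
```

Note that an unmatched address comes back as `None` rather than raising an error. Real geocoders behave similarly, so it is worth checking how many of your inputs went unmatched.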

 

Once processed, geocoded data can be mapped, linked spatially to other data, or used as inputs to spatial analysis. This makes geocoding an extremely valuable tool for data scientists. Consequently, services that provide high-quality (fast, current, and accurate) street address geocoding can be quite expensive, on the order of $4 per 1,000 geocodes. (That said, it’s hard to find current pricing information online.)
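
For a rough sense of what that rate means for a project budget, here is a small Python sketch. It assumes simple linear pricing at the ballpark $4 per 1,000 figure quoted above. Real vendors may use tiers, minimums, or subscriptions, so the function name and default price are illustrative assumptions only.

```python
def geocoding_cost(n_addresses, price_per_thousand=4.0):
    """Estimated cost in dollars, assuming simple linear pricing."""
    return n_addresses / 1000 * price_per_thousand

print(geocoding_cost(250_000))  # 250 thousands x $4 = 1000.0
```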

 

 

Tools

 

There are a huge number of geocoding tools available online. Some tools require programming, while others have an easy-to-use graphical user interface (GUI) or integrate with desktop software. Some are free or freemium (providing a limited number of geocodes for free), while others can be quite expensive. Some have global coverage and are kept current, while others are dated and have limited geographic coverage. It’s often hard to figure out what you are getting when you use these tools, so I encourage you to carefully read the documentation and terms of use and also examine the output returned by the tool.
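
If you are wondering what "requires programming" looks like in practice, here is a minimal Python sketch using the OpenStreetMap Nominatim search endpoint as the example service. It only builds the request URL and parses a reply; it does not send anything over the network. The helper names are my own, and `SAMPLE_REPLY` is a hand-written stand-in shaped like a Nominatim JSON reply, not real output. If you do send requests, read the Nominatim usage policy first: it asks for an identifying User-Agent and no more than one request per second.

```python
import json
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"

def build_search_url(address):
    """Build a Nominatim free-text search URL that asks for one JSON result."""
    params = {"q": address, "format": "json", "limit": 1}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"

def parse_first_hit(response_text):
    """Return (lat, lon, display_name) for the top match, or None if no match."""
    hits = json.loads(response_text)
    if not hits:
        return None
    top = hits[0]
    # Nominatim returns coordinates as strings, so convert them to floats.
    return float(top["lat"]), float(top["lon"]), top["display_name"]

# Hand-written sample shaped like a Nominatim reply (illustrative values only).
SAMPLE_REPLY = (
    '[{"lat": "37.8700", "lon": "-122.2700", '
    '"display_name": "Example Hall, Berkeley, California, United States"}]'
)

print(build_search_url("Berkeley, CA"))
print(parse_first_hit(SAMPLE_REPLY))
print(parse_first_hit("[]"))
```

Looking at the parsed result, and not just the coordinates, is a quick way to examine what a tool actually matched your address to.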

 

The table below provides a comparative summary of some of the best geocoding tools currently available to the UC Berkeley community.

 

 

Comparison of Geocoding Tools as of March 2018

| Service / Software | Free geocodes? | Programming required? | Relative output quality, speed, & coverage | Online or locally installed | Key benefits |
| --- | --- | --- | --- | --- | --- |
| US Census Online Geocoding Tool | Unlimited | No | Medium quality. Slowish. US street addresses only. | Online | Can be configured to return the Geo IDs (FIPS codes) of census tracts or blocks. Up to 10,000 addresses can be batch processed at a time. ~1200 addresses geocoded in 3 minutes* |
| Google Geocoding API | 2,500 per day. API key required. | Yes, unless you use it with the QGIS MMQGIS plug-in | High quality. Moderately fast. Global. | Online | Super accurate. When used programmatically you have a high level of control over the output metadata. There are R (ggmap) and Python (geopy) packages to make this easier. ~1200 in 20 minutes (1 API request per second) |
| OpenStreetMap Nominatim geocoding API | Unlimited but not meant for large numbers of geocodes (read docs) | Yes, unless you use it with the QGIS MMQGIS plug-in | Medium quality. Moderately fast. Global, but spotty. | Online | Free, no API key needed. Really easy when used in QGIS. ~1200 in 20 minutes in QGIS |
| ArcGIS Online with ESRI World Geocoding Service | Unlimited via UCB Site License. Current CalNet ID required. | No | High quality. Fast. Global. | Online | Easy to use once you figure out the workflow, but limited metadata output with results. ~1200 in 2 minutes. |
| ArcGIS Desktop or Pro with ESRI World Geocoding Service | Unlimited via UCB Site License. Current CalNet ID required. | No | High quality. Fast. Global. | Online | Highly customizable. Detailed output metadata. Great online documentation. |
| ArcGIS Online with ESRI Business Analyst Data (updated yearly) | Unlimited via UCB Site License. Current CalNet ID required. | No | High quality. Super fast. Global. | Local. You need to install the software & 60+GB data files locally. | Your best option when you have > 100,000 addresses to geocode. Scales up to millions of addresses. Suitable for geocoding restricted use data that cannot be put online. Performance tuning options for the fastest performance your machine can deliver, even faster if used with ArcGIS Pro. ~1,000,000 in an hour or less (~365,000 in ~20 minutes). Customizable input and output options. |

*Mileage may vary - results based on limited testing and local hardware, software and network configurations.
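
To turn the timings in the table into a rough planning estimate, here is a small Python sketch. The rates come straight from the figures I reported above (for example, ~1200 addresses in 20 minutes), so they inherit the same caveat: they reflect limited testing on one setup. The dictionary keys and function names are just labels I made up for this example, and the estimate assumes throughput scales linearly, which may not hold for very large jobs.

```python
import math

# Approximate addresses per minute, derived from the timings in the table above.
RATES_PER_MINUTE = {
    "census": 1200 / 3,              # ~1200 in 3 minutes
    "google": 1200 / 20,             # ~1200 in 20 minutes (1 request per second)
    "arcgis_online": 1200 / 2,       # ~1200 in 2 minutes
    "business_analyst": 365_000 / 20,  # ~365,000 in ~20 minutes
}

def estimated_minutes(n_addresses, tool):
    """Rough run time in minutes, assuming throughput scales linearly."""
    return n_addresses / RATES_PER_MINUTE[tool]

def days_under_daily_quota(n_addresses, per_day=2500):
    """Whole days needed if you stay within a free daily quota."""
    return math.ceil(n_addresses / per_day)

for tool in RATES_PER_MINUTE:
    print(f"{tool}: ~{estimated_minutes(100_000, tool):.0f} minutes for 100,000 addresses")

print(days_under_daily_quota(100_000), "days at 2,500 free geocodes per day")
```

Under these assumptions, a 100,000-address job would take 40 days on a 2,500-per-day free quota, which is why the locally installed option becomes attractive at that scale.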

 

Getting Started

 

For a quick summary of these geocoding options and to stay abreast of changes, bookmark this link to the Library Guide to Geocoding.

 

For a detailed guide on geocoding with the ESRI software tools, read this Guide to Geocoding with ESRI Software and Data put together by me and Susan Powell, the UC Berkeley GIS and Map Librarian.

 

If you need more help, send me a D-Lab consult request, but please read the above references first!

Author: Patty Frontiera

Dr. Patty Frontiera is the D-Lab Data Services Lead and a geospatial data scientist. She is the official campus representative for ICPSR, the Roper Center, and the Census State Data Center network, and serves as the Co-Director of the Berkeley Federal Statistical Research Data Center (FSRDC). Patty also develops the geospatial workshop curriculum, teaches workshops, and consults on geospatial topics. Patty has been with the D-Lab since 2014 and served as the Academic Coordinator through Spring 2017. Patty received her Ph.D.