When & Where
Date: Tue, February 28, 2017 - 10:00 AM to 11:30 AM
Location: Barrows 356: D-Lab Convening Room
Description
Getting research materials in a digital form that you can search and computationally analyze can be a time-consuming initial step in the research process. While Adobe Acrobat can do basic optical character recognition (OCR, transforming an image of a text into editable text), it performs poorly on documents with complex layouts or non-English text.

This workshop will cover how to use ABBYY FineReader, professional-level OCR software, either via the OCR virtual research desktop provided by Research IT or in the D-Lab. It will also briefly cover the pros and cons of FineReader compared to the open-source OCR package Tesseract, and how you can use Tesseract on the Savio high-performance compute cluster for large-scale OCR jobs.

Prior knowledge: No prior knowledge is required for this workshop. Register if you have any interest in learning more about OCR tools and resources.

Technology requirement: None. This workshop will demonstrate realistic applications of OCR software.

Details
D-Lab Facilitator: Jon Stiles
Format Detail: Follow-along