Text Analysis Fundamentals: Basic Tools and Techniques

Instructors:

Ben Gebre-Medhin

Ben (www.gebre-medhin.com) is a PhD Candidate in sociology. His interests are in the subfields of organization, the professions, and higher education. His dissertation focuses on elite universities and the MOOC movement. For part of that project he uses topic models and text analysis tools to document changes in discourse within an organizational or professional field over time.

When & Where

Date:

Tue, December 5, 2017 - 10:00 AM to 12:00 PM

Location:

Barrows 371

Description

Type:

Workshop

This hands-on workshop goes through the common “preprocessing recipe” that is used as the foundation for a variety of other applications, as well as some basic natural language processing techniques. These include: a) digitization (UTF-8), b) removal of stopwords, numbers, and punctuation, c) tokenization, d) calculation of word frequencies / proportions, e) part-of-speech tagging, and f) concordances.

Prior knowledge: We will be using the NLTK Python package, so basic familiarity with Python is required if you wish to follow along with the tutorial. Completion of D-Lab's Python FUN!damentals workshop series will be sufficient.

This workshop is one of a four-part series that will prepare participants to move forward with text analysis research, with a special focus on humanities and social science applications. Please register for each workshop separately.

Text Analysis Fundamentals: Methods and Approaches 

Text Analysis Fundamentals: Unsupervised Approaches 

Text Analysis Fundamentals: Supervised Methods 
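
As a rough illustration of the preprocessing recipe named in the description (steps a through d, plus f), here is a minimal sketch. It is not the workshop's own material: it uses only the Python standard library rather than NLTK so that it runs without any corpus downloads, the stopword list is a tiny hypothetical stand-in for the fuller list NLTK provides, and the function names are invented for this example. Part-of-speech tagging (step e) needs a trained tagger and is left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on"}


def preprocess(raw_bytes):
    """Decode UTF-8 bytes, lowercase, tokenize, and drop stopwords."""
    text = raw_bytes.decode("utf-8", errors="replace")
    # Keep runs of letters only, which discards numbers and punctuation.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def word_proportions(tokens):
    """Map each word to its share of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def concordance(tokens, keyword, width=2):
    """Return each occurrence of keyword with up to `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lines.append(" ".join(tokens[max(0, i - width): i + width + 1]))
    return lines


sample = "The cat sat on the mat. The cat, aged 3, is on the sofa!".encode("utf-8")
tokens = preprocess(sample)
print(tokens)
print(Counter(tokens).most_common(1))
print(word_proportions(tokens)["cat"])
print(concordance(tokens, "cat", width=1))
```

In this sketch the concordance runs over the already filtered tokens to keep the example short; in practice a concordance is often built from the unfiltered token list so that the surrounding context stays readable.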

Keyword:

Software Tools, Text Analysis

Details

Training Host:

D-Lab

D-Lab Facilitator:

Ben Gebre-Medhin

Format Detail:

hands-on, interactive