Aditi is a PhD candidate in computer science who specializes in natural language processing, machine learning, information retrieval, UI design and visualization. She is the creator of WordSeer, a text analysis environment for scholarly research.
Social science research frequently involves analyzing collections of text: interview transcripts, letters, pamphlets, newspaper articles and other kinds of documents. This 4-hour workshop on "natural language processing" or "text mining" will explore the use of computers to find common patterns and meaningful information in text collections. It will be both an overview and a practical hands-on tutorial. The workshop will cover various techniques that extract different kinds of information from text, with examples of how social scientists have used them to gain new insights. Then, as a practical first step, you will learn how to use the Python programming language to find the most frequent words and phrases in a collection of text.
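To give a flavor of that hands-on step, here is a minimal sketch of counting frequent words and two-word phrases using only Python's standard library. It is not the workshop's actual material: the function names `tokenize` and `most_frequent`, the simple tokenization rule, and the three-sentence toy collection are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())

def most_frequent(documents, n=1, top=3):
    """Count n-word phrases across all documents and return the `top` most common."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        # Slide a window of n tokens over the document to form each phrase.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(top)

# Hypothetical toy collection, standing in for real transcripts or articles.
documents = [
    "The strike began on Monday.",
    "Workers joined the strike on Tuesday.",
    "The strike ended after the workers won.",
]

print(most_frequent(documents, n=1, top=1))  # most frequent single word
print(most_frequent(documents, n=2, top=1))  # most frequent two-word phrase
```

In this toy collection the most frequent word is "the", which illustrates why real analyses usually filter out very common function words before counting.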
If you have a text collection you'd like to analyze, contact us, and we'll try to set you up to use it during the workshop.
Prerequisite: Knowledge of Python is required (at a minimum, you should know how to write a function in Python).
Lunch will be provided for the first 24 registrants.