This hands-on workshop builds on part 2 by introducing the basics of Python's scikit-learn package for implementing unsupervised text analysis methods. It will cover a) vectorization and document-term matrices, b) term weighting (tf-idf), and c) uncovering patterns with topic modeling.
Prior knowledge: A basic familiarity with Python is required if you wish to follow along with the tutorial.
This workshop is part of a four-part series that prepares participants to move forward with text analysis research, with a special focus on humanities and social science applications. Please register for each workshop separately. The other workshops in the series are listed below: