This is an archive of our past training offerings. We are looking to include workshops on topics not yet covered here. Is there something not currently on the list? Send us a proposal.

February 25, 2020
Author:
Chris Kennedy

This workshop introduces the basic concepts of Deep Learning - the training and performance evaluation of large neural networks, especially for image classification, natural language processing, and time-series data.

February 24, 2020
Author:
Drew Hart

It is often said that 80% of data analysis is spent on cleaning and preparing the data. This workshop will introduce tools (notably dplyr and tidyr) that make data wrangling and manipulation much easier. Participants will learn how to use these packages to subset and reshape data sets, do calculations across groups of data, clean data, and handle other common data-preparation tasks.

February 24, 2020
Author:
Drew Hart

It is often said that 80% of data analysis is spent on cleaning and preparing the data. This workshop will introduce tools (notably dplyr and tidyr) that make data wrangling and manipulation much easier. Participants will learn how to use these packages to subset and reshape data sets, do calculations across groups of data, clean data, and handle other common data-preparation tasks.

February 24, 2020
Author:
Aniket Kesari, Renata Barreto

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python.

In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

We plan to cover:

February 21, 2020
Author:
Ilya Akdemir

In this workshop we will cover the most common computational text analysis (CTA) task: supervised classification. Using the Python library scikit-learn, we will implement Logistic Regression and Random Forest methods to perform sentiment analysis. Optional: introduction to word vector representations with Word2Vec.

February 20, 2020
Author:
Ilya Akdemir

This hands-on workshop builds on part 1 by introducing the basics of Python's scikit-learn package to implement unsupervised text analysis methods. This workshop will cover a) vectorization and document-term matrices, b) weighting (tf-idf), and c) uncovering patterns using topic modeling.
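As a rough illustration of steps (a) and (b), the sketch below builds a document-term matrix and applies a tf-idf weighting using only the Python standard library. It is not the workshop's material and does not use scikit-learn: the function names, the whitespace tokenizer, the sample documents, and the unsmoothed formula idf = ln(N / df) are all assumptions made for this example, and scikit-learn's own weighting differs in its details. Topic modeling is left out.

```python
import math
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    # Assumption: naive lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    counts = [Counter(doc) for doc in tokenized]
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tfidf(matrix):
    """Weight raw counts by idf = ln(N / df), an unsmoothed textbook variant."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


# Hypothetical toy corpus.
docs = ["red apple", "green apple", "red red car"]
vocab, dtm = document_term_matrix(docs)
weights = tfidf(dtm)
```

Terms that appear in few documents (here "car" and "green") receive a larger weight per occurrence than terms spread across many documents.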

February 19, 2020
Author:
Emily Grabowski

This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets.

February 19, 2020
Author:
Ann Glusker

This two-part series will focus on how to set up database-like structures, navigate them, create models, and build various types of reports in Microsoft Excel. By the end of this series, participants will be able to sort and look for information within large datasets, use character-based functions and pivot tables, and build basic financial models.

February 19, 2020
Author:
Ilya Akdemir

This hands-on workshop goes through the common “preprocessing recipe” that is used as the foundation for a variety of other applications, as well as some basic natural language processing techniques. These include: a) removal of stopwords, numbers, and punctuation, b) tokenization, c) calculation of word frequencies / proportions, and d) part-of-speech tagging.
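Steps (a) through (c) of such a recipe can be sketched with the Python standard library alone. This is an illustrative example, not the workshop's code: the stopword list, the function names, and the sample sentence are assumptions, and part-of-speech tagging is omitted because it needs a dedicated NLP library.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "is", "to", "in"}


def preprocess(text):
    """Lowercase, strip numbers and punctuation, tokenize, drop stopwords."""
    text = text.lower()
    # Keep only ASCII letters and whitespace; everything else becomes a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]


def word_proportions(tokens):
    """Map each word to its share of all remaining tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical sample sentence.
tokens = preprocess("The cat and the dog: 2 cats, 1 dog!")
proportions = word_proportions(tokens)
```

Note that "cat" and "cats" stay separate tokens here; merging them would require stemming or lemmatization, which this sketch does not attempt.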

February 18, 2020
Author:
Chris Kennedy

This workshop introduces the basic concepts of Deep Learning - the training and performance evaluation of large neural networks, especially for image classification, natural language processing, and time-series data.

February 14, 2020
Author:
Evan Muzzall

Machine learning often evokes images of Skynet, self-driving cars, and computerized homes. However, these ideas are less science fiction than they are tangible phenomena predicated on description, classification, prediction, and pattern recognition in data.

February 14, 2020
Author:
Pelagie Elimbi Moudio

For this workshop, we'll provide an introduction to visualization with Python. We'll cover visualization theory and plotting with Matplotlib and Seaborn, working through examples in a Jupyter (formerly IPython) notebook. The following plot types will be covered:

Working Group Session: Data Scholars: Discovery
February 13, 2020

February 13, 2020
Author:
Aniket Kesari

This four-part, interactive workshop series is a complete introduction to Python programming for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

Working Group Session: Data Scholars: Pathways
February 12, 2020

February 12, 2020
Author:
Adam G. Anderson, Stacy Reardon

The Berkeley Digital Humanities Working Group began in 2011 as a place to facilitate interdisciplinary conversations around topics in the Digital Humanities (broadly defined).  We welcome participants from all disciplinary backgrounds, beginners and experts in digital skills, students, faculty, and staff.  The agenda for our biweekly meetings is participant driven, and we typically h

February 12, 2020
Author:
Emily Grabowski

This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets.

February 12, 2020
Author:
Evan Muzzall

In the final part, we will review data importation, subsetting, and visualization. Students will then be given the majority of time to reproduce a workflow on two different datasets, ask questions, and review the solutions as a group.

February 11, 2020
Author:
Aniket Kesari

This four-part, interactive workshop series is a complete introduction to Python programming for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

Part 3 Topics:

February 10, 2020
Author:
Pelagie Elimbi Moudio

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python.

In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

We plan to cover:
