This workshop introduces the basic concepts of Deep Learning - the training and performance evaluation of large neural networks, especially for image classification, natural language processing, and time-series data.
This is an archive of our past training offerings. We are looking to include workshops on topics not yet covered here. Is there something not currently on the list? Send us a proposal.
Unfortunately, due to a family loss, Prof. Ozer is unable to be present at this event. Catherine Duarte and Brian Villa, her research team members and planned co-presenters, will incorporate a previously recorded lecture by Prof. Ozer and otherwise deliver the presentation as planned. We look forward to your attendance at this event. To Prof.
In this workshop we will cover the most common computational text analysis (CTA) task: supervised classification. Using the Python library scikit-learn, we will implement Logistic Regression and Random Forest methods to perform sentiment analysis. Optional: introduction to word vector representations with Word2Vec.
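As a rough illustration of the supervised classification task described above, the sketch below fits a Logistic Regression and a Random Forest classifier on a tiny sentiment corpus. The example texts, labels, and the choice of tf-idf features are assumptions made for illustration; they are not taken from the workshop materials.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus, for illustration only (1 = positive, 0 = negative).
texts = [
    "a wonderful, moving film",
    "great acting and a great story",
    "I loved every minute",
    "an excellent and fun movie",
    "boring and far too long",
    "terrible plot and awful acting",
    "I hated this movie",
    "a dull, disappointing mess",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Each pipeline turns raw text into tf-idf features, then fits a classifier.
models = {
    "logistic_regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": make_pipeline(
        TfidfVectorizer(), RandomForestClassifier(n_estimators=100, random_state=0)
    ),
}

new_reviews = ["a wonderful and fun story", "awful, boring and dull"]
predictions = {}
for name, model in models.items():
    model.fit(texts, labels)
    predictions[name] = model.predict(new_reviews).tolist()
    print(name, predictions[name])
```

With a corpus this small the predictions are not meaningful; a real analysis would hold out a test set and compare the two models on it.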
In this hands-on workshop, we will learn how to create web graphics for your digital publishing projects and websites. We will cover topics such as: image editing tools in Photoshop; image resolution for the web; sources for free public domain and Creative Commons images; and image upload to publishing tools such as WordPress. If possible, please bring a laptop with Photoshop installed.
This workshop will cover theory and techniques for maximizing the effectiveness of figures used for visualizing information. Rather than teaching any particular visualization software, this course will teach students about the "nuts and bolts" of effective data visualization.
Qualitative Data Analysis (QDA) software is used to organize and structure data, codes, memos, and other components of a qualitative study.
This workshop is a two-part series for qualitative researchers, new and established, interested in learning about MAXQDA, a QDA software for which D-Lab provides substantive support.
Research on Data Science Career Paths
This is a six-hour tutorial on machine learning in R that covers data preprocessing, cross-validation, ordinary least squares regression, lasso, decision trees, random forest, xgboost, and superlearner algorithms.
This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets.
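For a sense of the scikit-learn syntax this workshop focuses on, here is a minimal sketch of the usual pattern: split the data, fit an estimator, then score it. The iris dataset and the choice of a scaled support vector classifier are assumptions picked for brevity, and TPOT is not shown.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small dataset bundled with scikit-learn (no download needed).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Every estimator follows the same fit / predict / score interface.
clf = make_pipeline(StandardScaler(), SVC())
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

# 5-fold cross-validation on the training portion only.
cv_scores = cross_val_score(clf, X_train, y_train, cv=5)

print(f"test accuracy: {test_accuracy:.2f}")
print(f"mean CV accuracy: {cv_scores.mean():.2f}")
```

Swapping `SVC()` for another estimator leaves the rest of the code unchanged, which is the main point of the shared interface.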
Git is a powerful tool for keeping track of changes you make to the files in a project. You can use it to synchronize your work across computers, collaborate with others, and even deploy applications to the cloud. In this workshop, we'll learn the basics of understanding and using Git, including working with the popular "social coding" website, GitHub.
This hands-on workshop builds on part 1 by introducing the basics of Python's scikit-learn package to implement unsupervised text analysis methods. This workshop will cover a) vectorization and document-term matrices, b) weighting (tf-idf), and c) uncovering patterns using topic modeling.
Crowdsourcing is a method increasingly used in qualitative, quantitative, and mixed-methods research. However, many researchers remain unclear about what this method is, when it may be appropriate to use, and how it could be implemented.
This workshop will provide a comprehensive overview of graphics in R, including base graphics and ggplot2. Participants will learn how to construct, customize, and export a variety of plot types in order to visualize relationships in data.
R Fundamentals Part 4: Putting it all together
In the final part, we will review data importation, subsetting, and visualization. Students will then be given the majority of time to reproduce a workflow on two different datasets, ask questions, and review the solutions as a group.
This workshop introduces the basic principles of understanding digital imagery, including the fundamentals of multi-spectral imagery. Participants will learn how to find and download satellite and aerial imagery, how to display and enhance digital imagery, and basic techniques for image interpretation and analysis.
For this workshop, we'll provide an introduction to visualization with Python. We'll cover visualization theory and plotting with Matplotlib and Seaborn, working through examples in a Jupyter (formerly IPython) notebook. The following plot types will be covered:
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables doing practical, real world data analysis in Python.
In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.
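To give a flavor of what such preparation steps can look like in pandas, here is a small sketch. The table, its column names, and the particular cleaning steps are hypothetical and are not drawn from the workshop's example data.

```python
import pandas as pd

# Hypothetical raw table with stray whitespace, a duplicate row,
# and a non-numeric placeholder in a numeric column.
raw = pd.DataFrame(
    {
        "name": [" Ana", "Ben", "Ben", "Caro ", "Dev"],
        "dept": ["bio", "stat", "stat", "bio", "stat"],
        "score": ["90", "85", "85", "n/a", "70"],
    }
)

clean = (
    raw.assign(
        name=raw["name"].str.strip(),  # trim whitespace
        score=pd.to_numeric(raw["score"], errors="coerce"),  # "n/a" becomes NaN
    )
    .drop_duplicates()  # drop the repeated "Ben" row
    .dropna(subset=["score"])  # drop rows with no usable score
    .reset_index(drop=True)
)

# Summarize the cleaned data by group.
mean_by_dept = clean.groupby("dept")["score"].mean()
print(clean)
print(mean_by_dept)
```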
We plan to cover:
This hands-on workshop goes through the common “preprocessing recipe” that is used as the foundation for a variety of other applications as well as some basic natural language processing techniques. These include: a) digitization (UTF-8), b) removal of stopwords, numbers, punctuation, c) tokenization, d) calculation of word frequencies / proportions, e) part of speech tagging, and f) concordan
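A minimal sketch of the first few steps of such a preprocessing recipe, using only the Python standard library, could look like this. The sample sentence and the stopword list are invented for illustration. Part-of-speech tagging is left out because it would need an NLP library.

```python
import re
from collections import Counter

# Tiny hypothetical stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "it", "was"}

# Made-up input, stored as UTF-8 bytes as it might be read from a file.
raw_bytes = (
    "The café served 2 coffees, and the coffee was good: "
    "the best coffee in town!"
).encode("utf-8")

# a) Decode as UTF-8 and normalize case.
text = raw_bytes.decode("utf-8").lower()

# b) Remove numbers and punctuation (accented letters are kept).
text = re.sub(r"\d+", " ", text)
text = re.sub(r"[^\w\s]", " ", text)

# c) Tokenize on whitespace, then drop stopwords.
tokens = [t for t in text.split() if t not in STOPWORDS]

# d) Word frequencies and proportions.
counts = Counter(tokens)
total = sum(counts.values())
proportions = {word: n / total for word, n in counts.items()}

print(counts.most_common(3))
print(round(proportions["coffee"], 2))
```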