Please visit our Hate Speech Research website for more information.
According to the Pew Research Center, 41% of American adults have experienced online hate speech and harassment, and 66% have witnessed it. D-Lab’s Online Hate Index research project investigates hate speech as a social and linguistic phenomenon. The project is grounded in prior domestic and international policy and law and in academic research, and it is carried out in collaboration with advocacy organizations. Our hate speech research innovates on several fronts:
Detailed survey instrument.
We developed and tested a survey instrument of specific questions, drawing on the academic literature on hate speech, to clearly distinguish hate speech from other offensive language. The items we developed are both highly reliable, meaning that different labellers tend to agree, and valid, meaning that the items capture the same concept.
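To make the idea of reliability concrete, here is a minimal sketch of one common way to quantify agreement between two labellers: raw percent agreement and Cohen's kappa, which corrects for agreement expected by chance. This is an illustration only. The choice of Cohen's kappa, the function names, and the example labels are all assumptions and do not describe the project's actual instrument, data, or reliability statistics.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of comments on which two labellers gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both labellers pick the same label
    # if each labelled independently at their own observed rates.
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both labellers used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for eight comments (invented for illustration).
labeller_1 = ["hate", "not_hate", "hate", "not_hate",
              "hate", "not_hate", "not_hate", "hate"]
labeller_2 = ["hate", "not_hate", "hate", "hate",
              "hate", "not_hate", "not_hate", "not_hate"]

print(percent_agreement(labeller_1, labeller_2))
print(cohens_kappa(labeller_1, labeller_2))
```

With more than two labellers, or with ordinal survey items, a different statistic such as Krippendorff's alpha may be more appropriate.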
Crowdsourcing of labellers.
We first worked with ten students who were diverse in terms of race/ethnicity, gender, religion, linguistic background, nationality, and sexuality in order to decode and discern hateful comments targeting a wide variety of groups of people. Subsequently, we developed a system to use workers from Amazon Mechanical Turk. In this way, we were able to scale comment labelling by orders of magnitude.
Hate speech lexicon.
As part of our data sampling process, we are creating an expansive lexicon of hate speech words and phrases, improving on existing sources such as Hatebase.
Enhanced Natural Language Processing and machine learning methods.
Our team has experimented with Natural Language Processing and machine learning methodologies to improve the accuracy of the models and to allow us to compare across platforms and over time.
D-Lab's Online Hate Index Research In the Press
This work has received both financial support and media exposure. Most recently, the Berkeley Institute for Data Science recognized this research through a $30,000 grant. Sadly, recent events have brought hate speech to the forefront. One piece by Rachael Myrow, titled "Why It's So Hard to Scrub Hate Speech Off Social Media" and featured on the California Report on KQED, discussed the connections between hate speech and hate acts and the often elusive language used by offenders. A longer audio blog by Rachael Myrow and Devin Katayama, titled "Silicon Valley Is Trying To Prevent Hate Speech. Is It Working?", is also available. Previously, Brittan Heller, an online human rights advocate, wrote the following piece featured in Wired Magazine: "What Mark Zuckerberg Gets Wrong--and Right--About Hate Speech."
If you would like to know more about hate speech, don’t hesitate to contact me, Claudia von Vacano, at cvonvacano at berkeley dot edu.