Detection of Abusive Language: the Problem of Biased Datasets

Michael Wiegand*⁰, Josef Ruppenhofer†‡, Thomas Kleinbauer*
* Spoken Language Systems, Saarland University, Saarbrücken, Germany
† Leibniz ScienceCampus, Heidelberg/Mannheim, Germany
‡ Institute for German Language, Mannheim, Germany
[email protected] [email protected] [email protected]

Abstract

We discuss the impact of data bias on abusive language detection. We show that classification scores on popular datasets reported in previous work are much lower under realistic settings in which this bias is reduced. Such biases are most notably observed on datasets that are created by focused sampling instead of random sampling. Datasets with a higher proportion of implicit abuse are more affected than datasets with a lower proportion.

1 Introduction

Abusive or offensive language is commonly defined as hurtful, derogatory or obscene utterances made by one person to another person. Examples are (1)-(3). In the literature, closely related terms include hate speech (Waseem and Hovy, 2016) or cyber bullying (Zhong et al., 2016). While there may be nuanced differences in meaning, they are all compatible with the general definition above.

(1) stop editing this, you dumbass.
(2) Just want to slap the stupid out of these bimbos!!!
(3) Go lick a pig you arab muslim piece of scum.

Due to the rise of user-generated web content, in particular on social media networks, the amount of abusive language is also steadily growing.
NLP methods are required to focus human review efforts towards the most relevant microposts.

In this paper, we examine the issue of data bias. For the creation of manually annotated datasets, randomly sampling microposts from large social media platforms typically results in too small a proportion of abusive comments (Wulczyn et al., 2017; Founta et al., 2018). Therefore, more focused sampling strategies have to be applied, which cause biases in the resulting datasets. We show what implications this has for classifiers trained on these datasets: previous evaluations reported high classification performance on datasets with difficult cases of abusive language, e.g. implicit abuse (§2). In contrast, we find that the high classification scores are likely to be the result of modeling the bias in those datasets.

Although we will explicitly name shortcomings of existing individual datasets, our paper is not intended as a reproach of those who created them. On the contrary, we acknowledge the great efforts the researchers have taken to provide these resources. Without them, much existing research would not have been possible. However, we also noticed a lack of awareness of the special properties of those datasets among researchers using them. As we will illustrate with specific examples, this may result in unexpected results for particular classification approaches.

⁰ Present affiliation: Leibniz ScienceCampus, Heidelberg/Mannheim, Germany
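The sampling issue described above can be illustrated with a small simulation. This is a hypothetical sketch (all numbers, word lists, and helper names are invented for illustration, not taken from the paper or its datasets): under random sampling, abusive posts are rare; under keyword-based focused sampling, the abuse rate is inflated and implicitly abusive posts (those without a query term) are excluded entirely, which is one way the resulting dataset becomes biased.

```python
import random

random.seed(0)

# Hypothetical query lexicon and topic words (illustrative only).
SLURS = ["dumbass", "bimbo", "scum"]
NEUTRAL = ["weather", "football", "music"]

def make_post():
    """Generate a toy micropost with a gold abuse label."""
    abusive = random.random() < 0.03  # abuse is rare in a random stream
    if abusive:
        # Most, but not all, abusive posts contain an explicit slur;
        # the rest are implicitly abusive (no lexicon match).
        word = random.choice(SLURS) if random.random() < 0.8 else "implicit insult"
    else:
        # Non-abusive posts occasionally quote a slur (e.g. counter-speech).
        word = random.choice(SLURS) if random.random() < 0.01 else random.choice(NEUTRAL)
    return {"text": f"some post about {word}", "abusive": abusive}

corpus = [make_post() for _ in range(100_000)]

def abuse_rate(sample):
    return sum(p["abusive"] for p in sample) / len(sample)

# Random sampling: realistic label distribution, but few abusive examples.
random_sample = random.sample(corpus, 2_000)

# Focused sampling: keep only posts matching the slur lexicon.
focused_sample = [p for p in corpus if any(s in p["text"] for s in SLURS)]

print(f"abuse rate, random sampling:  {abuse_rate(random_sample):.3f}")
print(f"abuse rate, focused sampling: {abuse_rate(focused_sample):.3f}")
```

In this toy setup the focused sample is dominated by explicitly abusive posts, so a classifier trained on it can score well simply by memorizing the query terms, while implicit abuse never enters the training data at all.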