Automatic detection of online insults

Hate speech in the digital sphere has the potential to silence voices and thereby threaten democracy. But hate is not always expressed through swear words online; implicit insults are also ubiquitous. Tracking these down efficiently by technical means, however, is extremely challenging. Michael Wiegand is currently working on “Recognition of Implicit Insults”, a project funded by the FWF.

“In many comments, the insult is obvious because of the profanity used; but often, the issue is more complicated, and we frequently need a human being to detect the insult,” explains Michael Wiegand, who conducts research and teaches in the field of computational linguistics at the Digital Age Research Center (D!ARC).

Implicit insults, however, are just as tricky: even the human eye often spots them only at second glance. The problem is all the more complex because most programmes for hate speech detection cannot detect these insults at all. “If you say, ‘Well, you’re not very intelligent,’ most systems will not comprehend this as an insult. So, in addition to word recognition, we also have to make an effort to recognise the linguistic patterns behind remarks like these, which will help us to identify them as offensive,” Michael Wiegand goes on to explain.

A crucial factor here is data sets from which an algorithm can learn such linguistic phenomena. For this purpose, people’s utterances have to be labelled as either “offensive” or “non-offensive”. Michael Wiegand elaborates: “Once you have the annotated data, you can use machine learning techniques that allow the machine to recognise, from the observations and in a relatively autonomous way, the signals, interactions and linguistic structures that constitute an insult. However, the learning techniques should not only refer to words, but must also take more complex patterns into account.”
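To make this set-up concrete, here is a minimal sketch of such a supervised classifier in Python, using scikit-learn with purely word-based TF-IDF features and a handful of invented labelled utterances; neither the data nor the model come from the project itself. A word-based baseline of this kind tends to catch profanity but miss implicit insults, which is precisely the gap the project addresses.

```python
# Minimal supervised sketch of "offensive" / "non-offensive"
# classification with scikit-learn (illustrative example only;
# the utterances and labels below are invented).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A tiny hypothetical annotated corpus, as produced by human labelling.
texts = [
    "You idiot, nobody wants you here.",   # explicit insult
    "Get lost, you pathetic loser.",       # explicit insult
    "Thanks, that was really helpful.",    # harmless
    "See you at the meeting tomorrow.",    # harmless
]
labels = ["offensive", "offensive", "non-offensive", "non-offensive"]

# A purely word-based model: TF-IDF features + logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(texts, labels)

# Explicit profanity is likely to be flagged ...
print(model.predict(["You are such an idiot."]))
# ... while an implicit insult shares no vocabulary with the
# training data and will typically slip through.
print(model.predict(["Well, you're not very intelligent."]))
```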

Michael Wiegand has been working in the field of hate speech detection for many years. In his current project, the focus is on implicit insults. Dedicated data sets will be compiled for a number of concrete linguistic phenomena, such as euphemisms (e.g. “I’m glad that at least we won’t see each other at the weekend.”). This is challenging because expressions of this kind are severely underrepresented in existing text collections. Large language models, which have advanced rapidly in computational linguistics in recent years, are set to play a central role in the classification procedures that will subsequently be tested on the new data. At present, they are the only machine learning methods with a genuine, albeit rudimentary, understanding of text, and they can thus manage to read between the lines to a limited extent.
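As an illustration of how a pretrained language model can be asked to “read between the lines”, the following sketch applies Hugging Face’s zero-shot classification pipeline, with a publicly available model, to the euphemism above. This is a generic example, not the classification procedure developed in the project.

```python
# Illustrative zero-shot classification with a pretrained model
# from the Hugging Face Hub (not the project's actual procedure).
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

# A euphemistic remark that contains no offensive vocabulary at all.
utterance = "I'm glad that at least we won't see each other at the weekend."

result = classifier(utterance,
                    candidate_labels=["offensive", "non-offensive"])

# The pipeline returns the candidate labels ranked by score.
print(result["labels"][0], round(result["scores"][0], 3))
```

Whether such a model actually flags the remark as offensive depends heavily on the model and the phrasing, which is exactly why dedicated data sets and systematic evaluation are needed.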