Understanding Counterspeech for Online Harm Mitigation

Yi-Ling Chung; Gavin Abercrombie; Florence Enock; Jonathan Bright; Verena Rieser

doi:10.3384/nejlt.2000-1533.2024.5203

Authors

Yi-Ling Chung, The Alan Turing Institute
Gavin Abercrombie
Florence Enock
Jonathan Bright
Verena Rieser


Abstract

Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support for targets of abuse. It provides a promising alternative to more contentious measures, such as content moderation and deplatforming, by contributing more positive online speech rather than attempting to mitigate harmful content through removal. Advances in large language models mean that counterspeech could be produced more efficiently by automating its generation, which would enable large-scale online campaigns. However, we currently lack a systematic understanding of several important factors relating to the efficacy of counterspeech for hate mitigation, such as which types of counterspeech are most effective, which conditions are optimal for implementation, and which specific effects of hate it can best ameliorate. This paper aims to fill this gap by systematically reviewing counterspeech research in the social sciences and comparing its methodologies and findings with natural language processing (NLP) and computer science efforts in automatic counterspeech generation. By taking this multi-disciplinary view, we identify promising future directions in both fields.
