Twitter May Be Able to Curb Hate Speech With Warnings, Study Finds


Hate speech on social media is a rising concern across the globe. Now, researchers have found a way to temporarily dissuade people from posting hate speech on Twitter. According to a study by New York University’s Center for Social Media and Politics, warning Twitter users about the consequences of using hateful language can significantly reduce hate speech on the microblogging platform for almost a week. Twitter and other social media platforms regularly roll out updates and policy changes aimed at eradicating hate speech. The study backs one such approach with data and suggests a direction for future efforts.

In a paper published in the journal Perspectives on Politics, researchers have examined one of the many ways in which hate speech may be curbed on Twitter, by issuing warnings of potential suspension of accounts.

To determine the effects of such warnings, the researchers conducted a series of experiments. They focussed on followers of users whose accounts had been suspended for posting tweets that contained hate speech. These followers, the researchers reasoned, would consider themselves potential “suspension candidates” and would be willing to moderate their behaviour after a warning.

On July 21, 2020, the research team downloaded more than 600,000 tweets that had been posted over the previous week. Each of these tweets contained at least one word regarded as hateful. This was a period when hate speech against Asian and Black communities was surging on Twitter, amid the coronavirus pandemic and the Black Lives Matter protests. From this pool of tweets, researchers shortlisted approximately 4,300 users who were followers of accounts that had been suspended during this period.

The users were divided into seven groups: six treatment groups and one control group. Users in the six treatment groups were sent different types of warning messages, all of which started with the line, “The user [@account] you follow was suspended, and I suspect that this was because of hateful language.” The messages then warned either that the user's account could be temporarily suspended or that the user might permanently lose their posts, followers, and account. The control group did not receive any warning message.

Researchers found that users who received these warning messages reduced the ratio of their tweets containing hateful language by up to 10 percent a week later. When the warning message was more politely phrased, the decline reached 15 to 20 percent. But the effect was temporary, fading within a month.

“Even though the impact of warnings are temporary, the research nonetheless provides a potential path forward for platforms seeking to reduce the use of hateful language by users,” said Mustafa Mikdat Yildirim, an NYU doctoral candidate and the lead author of the paper.