Governing Hate Speech by Means of Counter Speech


Governing Hate Speech by Means of Counter Speech on Facebook

Carla Schieb
Institute for Communication Research, Westfälische Wilhelms-Universität Münster

Mike Preuss
European Research Center for Information Systems, Westfälische Wilhelms-Universität Münster

Abstract

Counter speech is currently advocated by social networks as a measure for limiting the effects of hate speech. Conveniently, counter speech is left to the dedicated user, so that internet companies do not have to come up with new technologies or invest in manual moderation. But how efficient is counter speech? Our approach is twofold: Firstly, we review existing literature in order to find examples where counter speech worked well. Secondly, due to the lack of available data, we set up a computational simulation model that is used to answer general questions concerning the effects that hinder or support the impact of counter speech. On the basis of our findings, we argue that the defining factors for the success of counter speech are the proportion of the hate speech faction and the type of influence the counter speakers can exert on the undecided.


Introduction

The internet enables boundless, rather inexpensive, and ubiquitous communication, providing individuals with immediate information, enabling them to share opinions, and bringing people together. High hopes were associated with its diffusion in the late 1990s and early 2000s (Bowman & Willis, 2003; Deuze, 1999; Elin & Davis, 2002; Shane, 2004). However, along with the good it did, various problematic issues have also been noted, such as an increase in websites, communities, postings, comments, pictures, and videos devoted to hateful speech and other antisocial activities (Erjavec & Kovačič, 2012; Gerstenfeld, Grant, & Chiang, 2003; Citron & Norton, 2011). Hate speech can be shared at a great pace via social networks, and it reaches large audiences, unfolding its poisonous effect (Benesch, 2014a).

Crude racist content published by users on Facebook caused major waves of anger in Germany throughout 2015, culminating in late August 2015 when the refugee crisis in Europe took on a dramatic scale. The discontent resulted from users who reported violations of Facebook's community standards [1] (particularly hate speech and hoaxes profiling refugees as criminals, etc.) yet were largely ignored by the social network site's (SNS) administrators. Facebook prohibits hate speech in its community standards, namely verbal attacks and the promotion of hatred based on people's race, ethnicity, national origin, etc., which is closely related to scholarly definitions of hate speech (Delgado & Stefancic, 2014; Boyle, 2001). However, the community standards also state that hate speech is not prohibited per se, but is allowed under certain circumstances, for example as an expression of humor or satire or to raise awareness of certain topics. Comments that call for the death of refugees and for violent acts such as burning down refugee hostels not only violate Facebook's terms but constitute severe criminal acts which are being prosecuted in Germany and several other Western countries. While there are reports of single cases in which such prosecution based on hate speech incidents on Facebook is currently under way, this is only possible if the statements are very explicit and hence violate national law.

[1] https://www.facebook.com/communitystandards

However, the controversy about hate posts on Facebook concerning the refugee crisis is so salient and omnipresent in German media that the German minister of justice met with leading European Facebook managers in September 2015 in order to call for a more efficient hate speech deletion process, and he still argues regularly in favor of more deletions on his Facebook page and in press statements [2].

[2] Statement by Heiko Maas from October 30, 2015 (in German): http://www.tagesspiegel.de/themen/hasskommentare-im-internet/hasskommentare-im-internet-die-trolle-sind-monster-geworden/12521932.html

Why did Facebook not react instantaneously to those cases of verbal arson powerful enough to lead to real violence? If we take the enormous amount of comments and postings into account (Facebook speaks of several million hate postings per week [3]), checking the incoming reported content involves a huge effort and an enormous amount of labor. In order to obtain a very rough estimation of the financial and organizational consequences of such an endeavor, we embark on a simple thought experiment: let us assume that a native speaker needs about one minute to check whether a complaint justifies the deletion of a post because it does not comply with the community standards, and to actively perform the deletion and/or block the responsible user. As stated above, this implies deciding that the hate speech is not used in a humorous or satirical way, that it is a direct statement by the author, and that it is not meant to raise awareness or simply to comment on the phenomenon.

[3] Online report on a discussion between journalists and Facebook representatives (in German): "Facebook äußert sich zu Hatespeech-Vorwürfen: 'Ja, wir haben Fehler gemacht!'" ["Facebook comments on hate speech accusations: 'Yes, we made mistakes!'"] http://t3n.de/news/facebook-hatespeech-fehler-gemacht-637860/


Most likely, it also means that the context of the post has to be analyzed in order to identify the intention of the poster. We doubt that this can be automated with high accuracy. But let us further assume that tools were available to push the error rate down to around 10%, so that only the remaining posts would have to be checked manually. Even under these (probably much too optimistic) assumptions, Facebook would need around 100 native German speakers in its community operations teams to cope with this amount of complaints.
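To make the arithmetic behind this estimate explicit, a small illustration follows. The input figures are assumptions rather than reported data: 2.5 million reported posts per week as a stand-in for "several million", a residual of 10% that still needs manual review, one minute of review time per post, and a 40-hour work week.

```python
# Back-of-the-envelope estimate of the moderation workload; all inputs are
# illustrative assumptions, chosen only to show the order of magnitude.
reported_per_week = 2_500_000   # assumed stand-in for "several million" reports
manual_share = 0.10             # assumed share left after automatic pre-filtering
minutes_per_check = 1           # assumed review time per reported post
hours_per_reviewer = 40         # assumed weekly working hours per reviewer

manual_posts = reported_per_week * manual_share            # 250,000 posts/week
review_hours = manual_posts * minutes_per_check / 60       # about 4,167 hours/week
reviewers = review_hours / hours_per_reviewer              # about 104 reviewers

print(f"{manual_posts:,.0f} posts/week, {review_hours:,.0f} h/week, "
      f"about {reviewers:.0f} full-time reviewers")
```

Under these assumptions the estimate lands at roughly one hundred full-time reviewers, the order of magnitude used in the text above.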

Our estimation makes Facebook's predicament obvious: is it possible to distinguish between real hate speech and, for example, humorous statements that are ambiguous enough to appear as hate speech? Especially in online environments, successful decoding of ambiguous statements proves difficult, because it depends on underlying knowledge of the context, the intention, or the social background. Furthermore, human language processing of ambiguous statements is delayed (Giora, Fein, & Schwartz, 1998) and would constitute too great a burden for a community operations team that has to cope with a myriad of reported postings.

Concerning the demanded removal of hate speech posts, it is currently argued that Facebook promptly removes other content that is not compatible with its community standards, namely pictures containing nudity. However, the technical background is completely different: computationally, detecting nudity is a relatively well-solved problem. Fully automated methods were presented already around 20 years ago (Fleck, Forsyth, & Bregler, 1996) and have been vastly improved since (Kakumanu, Makrogiannis, & Bourbakis, 2007). Whereas skin recognition is comparatively simple, high-accuracy hate speech recognition, especially its separation from humorous posts or discussions about hate speech, is currently computationally almost intractable.

In our present work, we argue that Facebook is reluctant to approach the hate speech problem as German politicians suggest, namely to promptly delete inciting statements, and instead focuses on promoting a diverse communication culture [4]. This happens not only because of financial considerations (e.g., additional costs caused by more administrators, implementation of new features, loss of trust of the SNS's shareholders); three more aspects come into play:

• Firstly, the efforts required to organize the revision of the reported hate postings are, apart from being costly, immense (e.g., coordination of staff).

• Secondly, the SNS would establish a precedent if it conceded to the German demands. Other countries would possibly urge Facebook to adapt to their more restrictive laws, which leads us to the third point.

• Facebook Inc. is a US-based enterprise and hence advocates US law; users outside of the US and Canada contract with Facebook Ltd. Ireland and hence are subject to European Union law. One of the main questions regarding internet governance also affects a globally widespread SNS such as Facebook: whose laws are to be applied?

Facebook responds to reasonable requests of national authorities whenever criminal acts are involved (e.g., Holocaust denial, which is illegal in Germany) (Facebook, 2015). And yet, the SNS encourages the sharing of diverse opinions and views which might be challenging or disturbing to some users. Its community standards implicate a broad understanding of freedom of expression which derives from the First Amendment. The First Amendment guarantees free speech in the US constitution and establishes it as an almost absolute right with very few limitations (Boyle, 2001; Citron & Norton, 2011). Challenging opinions and ideas are therefore protected even if they express problematic views.

[4] Facebook's community standards: https://www.facebook.com/communitystandards


Facebook's approach is twofold: on the one hand, it collaborates with national authorities in cases of criminal acts; on the other hand, it promotes a diverse communication culture by encouraging counter speech. Reasonable and accurate arguments, facts and figures, employed in direct responses to hate posts, are seen as a helpful treatment to restrict the impact of hate speech instead of deleting reported posts. Users are emboldened to respect and treat each other mindfully, and their desire to discuss controversial topics is given an environment that lets them experience a sense of self-efficacy while counter-arguing. "When used wisely, counterspeech may prove to be a very effective solution for harmful or threatening expression." (Richards & Calvert, 2000). Counter speech is advocated to minimize the risks of violent acts (Benesch, 2014a) by encouraging audiences to take a stand against individuals who spread hate and mistrust.

Our primary aim in this article is to evaluate the efficacy of counter speech by means of a simple computational model. The main question is: is it reasonable to encourage users to use counter speech against hate posts on SNS? We chose a simulation model for a number of reasons. First and foremost, reliable data is not available; Twitter data is not useful because we are aiming at blackboard discussions with a "group" character and a defined audience. Secondly, control conditions can easily be simulated: we can employ the model to answer questions such as "what is the likely effect if the audience is twice as large?". Thirdly, the operation is low-cost, unlike content analysis, and finally, computing capacities allow for more sophisticated models. In the following sections, we first delve into related work regarding hate speech and its proposed resolution, namely counter speech. In order to explore the impact of counter speech on Facebook, we then implement a simple computational simulation model that is used to answer general questions concerning the effects that hinder or support the impact of counter speech.


Related Work

"Hate speech describes a problematic category of speech [...] that involves the advocacy of hatred and discrimination against groups on basis of their race, colour, ethnicity, religious beliefs, sexual orientation, or other status." (Boyle, 2001). As the definition implies, hate speech is not new to internet culture; rather, like bullying or stalking, it has long been part of antisocial behavior (Delgado & Stefancic, 2014). As a matter of course, bullying and stalking also occur in online environments, showing an upward trend as the use of social media becomes widespread (Festl & Quandt, 2013). Bullying and hate speech differ from each other in the number of people being harassed: one or many bullies pick on an individual, while many speakers imbued with hatred address their statements towards certain groups. The Gamergate controversy, for example, began in 2013 as an act of cyberbullying against Zoë Quinn, a game developer, and extended to others, mostly women in the game industry environment. Misogynistic hate speech and massive threats arose on Reddit, 4chan, and Twitter and focused not only on individuals but on women in the game industry in general (Chess & Shaw, 2015).

Research shows that hate speech deepens prejudice and stereotypes in a society (Citron & Norton, 2011). It also has a detrimental effect on the mental health and emotional well-being of targeted groups, and especially of targeted individuals (Citron & Norton, 2011; Festl & Quandt, 2013; Benesch, 2014a), and it is a source of harm in general for those under attack (Waldron, 2012), particularly when it culminates in violent acts incited by hateful speech (Lawrence III, 1990). Such violent hate crimes may erupt in the aftermath of certain key events, e.g. anti-Muslim hate crimes in response to the 9/11 terrorist attacks (King & Sutton, 2013).


Hate speech is an American expression (Boyle, 2001), though not a solely US-American phenomenon, and as such it is closely tied to questions regarding the freedom of speech established through the First Amendment (Henry, 2009). The fact that free speech is protected by the US constitution does not mean that it is an absolute right; in fact, it has some, though very few, limitations (Boyle, 2001; Citron & Norton, 2011). US courts apply a broad interpretation of the First Amendment, with the result that challenging opinions and ideas are protected even if they express problematic views. This fact has provoked a controversy around the US-centric governance of internet-related issues (DeNardis, 2014), which will be discussed shortly as it affects our argumentation in part. Websites are accessible worldwide and may violate certain laws in countries with a divergent understanding of the law. The broad interpretation of free speech protected by the First Amendment in the US is not shared by any other country. Democratic countries protect free speech, but they define clear boundaries: Holocaust denial, for example, is prohibited in Belgium, Germany, France, Spain, and Switzerland, but it is not a violation of the law in the US (Boyle, 2001). It is important to remark that the First Amendment applies predominantly to governmental restrictions on free speech and does not apply to private actors, which means that an SNS cannot be held liable if others spread offensive, hateful posts (Citron & Norton, 2011). Whether it concerns hateful and inciting comments on online news websites (Erjavec & Kovačič, 2012) or on SNS such as Facebook or Twitter (Burnap & Williams, 2015), US-American internet content providers are free to choose how they respond to hate content. In essence, they may choose (1) inaction, (2) deletion of improper and hateful speech, (3) education and promotion of respectful conduct, or (4) addressing hate speech with counter speech (Citron & Norton, 2011).

Ad (1) inaction: Inaction can lead to greater harm to targets of hate speech and may signal to users that content providers do not take victims of hate speech seriously.


Ad (2) deletion: Citron and Norton rank the removal of hateful speech as the most powerful tool at their disposal (Citron & Norton, 2011). Removal includes not only the deletion of offensive, hateful content but also blocking users or closing their accounts. The latter options are chosen by content providers especially in cases of violent threats towards individuals or certain social groups, leaving criminal prosecution untouched.

Ad (3) education: Content providers could play an active role in promoting respectful behavior and thus inform users about the harms resulting from hate speech. Furthermore, they could make their actions against hate speech public, exposing their motives and hence taking a stand against hate speech (Citron & Norton, 2011).

Ad (4) counter speech: Counter speech by online content providers themselves is rare but occurs from time to time, especially as they are not only a platform for hate speech but also among its targets. Far more often, counter speech is performed by users themselves and is meant to encourage users to get to know and to tolerate more diverse opinions.

Counter speech is regarded as the most important remedy, even as the "constitutionally preferred" one (Benesch, 2014a). As stated above, the freedom of speech guaranteed by the First Amendment comprises even hate speech, and as such it is regarded as beneficial if "bad" speech is met with more speech, i.e. counter speech. Scholarly definitions of the term are scarce; rather, some vague examples serve for clarification (Richards & Calvert, 2000; Benesch, 2014a; Henry, 2009). "Counter-speech is a common, crowd-sourced response to extremism or hateful content. Extreme posts are often met with disagreement, derision, and counter-campaigns." (Bartlett & Krasodomski-Jones, 2015). We define counter speech as all communicative actions aimed at refuting hate speech through thoughtful and cogent reasons, and true and fact-bound arguments.


Such communicative actions can be memes such as the Panzagar (flower speech) meme of Burmese blogger Nay Phone Latt (Benesch, 2014b), the billboard of citizens in Missouri responding to the Ku Klux Klan (Richards & Calvert, 2000), information spread in online hate groups by the Southern Poverty Law Center (Henry, 2009), and many other means to fight hate speech. Academic work on counter speech is descriptive in nature and tackles the subject matter merely in terms of successful case studies. Our aim is to move beyond this and to turn towards an analytically more sophisticated approach that is able to identify the potential counter speech may have in an instigative environment.

Simulation Model

In order to investigate the effect of counter speech on a Facebook page, we set up a simple simulation model that basically reduces all posts to expressions of an opinion on a one-dimensional opinion scale. We are aware of the fact that the model can only help in identifying trends; the numerical results cannot be directly transferred to implementable policies. However, we can ask "what if" questions and find out what kind of knowledge is most urgently needed, because certain parameters or conditions have much more influence than others. Furthermore, we want to obtain a general idea of how much counter speech is needed to balance the leading opinion or even revert it, and what the important factors for the effect of counter speech are. The contents of a post are not modeled; we only look at the influence posts exert on participants of such a forum at a given point in time. The model is therefore based on the following assumptions:

• All posts are concerned with only one general area of opinion, that is, participants do not discuss completely different matters, but focus on a single, possibly very general topic. As an example, this could be the immigration of refugees into Germany.

Table 1
Property values for the different modeled groups

Property/interval            Core      Clowns     Followers   Counter Speaker
Opinion                      −1        [−1, 0]    [−1, 0]     1
Volatility                   0         [0, 1]     [0, 1]      0
Activity                     1         1          [0, 1]      1
Default scenario fraction    20%       5%         75%         var
Scenario 2 fraction          10%       5%         85%         var

• Opinions of participants are well reflected in their posts, so that they can be recognized as expressions of a specific opinion by the audience. Posters may use linguistic devices such as irony or sarcasm, but readers are still able to determine which opinion is expressed (positive or negative in different strengths, or neutral).

• Most participants, except the ones with extreme opinions, can be influenced by counter speech, and they change their opinions only gradually as a reaction to the posts they see, or respectively, the opinions that are expressed by these posts. This change is, at least in principle, possible in both directions (positive, negative).

We call the fixed group of participants that is directly or indirectly involved in the discussion on a specific part (e.g., a Facebook page) of an SNS at a given time the audience. Note that this does not even potentially encompass all Facebook users, but only the ones who visit a specific page for whatever reason. The audience consists of two factions, namely supporters of the original post, which is assumed to be hate speech, and counter speakers. Whereas the latter are homogeneous, the supporters come in three flavors: core, clowns, and followers. Members of the core have extreme opinions and no volatility, that is, they cannot be influenced. Clowns follow the haters, have less extreme opinions, and show a high activity.


This group is related to people known as trolls in other network contexts (Buckels, Trapnell, & Paulhus, 2014). Followers are much easier to influence than the core, but have a lower activity. Counter speakers are the core's antagonists: they also have extreme opinions, but at the other end of the allowed interval, and they are likewise highly active and cannot be influenced. The intervals for all groups are given in Table 1.

Our general approach to simulating a mutual influencing process is related to agent-based modeling (see Heath, Hill, & Ciarallo, 2009, for an example-based survey), only that in our model, an agent is little more than a container for three numbers that represent three defining properties. It is thus similar to the approach pursued with opinion formation models (Watts & Dodds, 2007), only that we ignore the network component here and presume that every participant in the audience is able to see every post on a specific Facebook page:

• Opinion o ∈ [−1, 1], where −1 stands for the one extreme (in our context a hater), and 1 for the other extreme,

• Volatility v ∈ [0, 1], where 0 means that the opinion of the participant is not mutable at all, and 1 means that it is very easily influenced, and

• Activity a ∈ [0, 1], which corresponds to the probability that a participant actively takes part in a discussion (by posting or liking).

A participant and its behavior during the simulation is completely defined by this triple p = (p_o, p_v, p_a). In order to keep our simulation model simple, we assume the following influence process:

• Every participant in the audience can see posts and likes and is influenced by them if their own volatility is > 0.


• Every participant can choose to act by writing a post of their own or by liking the original post. If the original post is liked, the direction of the influence exerted on the other participants corresponds to that of the original post (and does not necessarily fully correspond to the liker's own opinion). Likes also generally exert a weaker influence than posts; we utilize a multiplicative like damping factor L ∈ [0, 1] with a default value of 0.5 to soften the influence.

• Every participant is allowed to react exactly once per iteration. That is, we allow every participant one action before another one who already acted is allowed to consider posting or liking again.

• The potential for an opinion change is the higher, the larger the difference in opinions between two participants is. This does not necessarily mean that the resulting opinion change is always large: a participant's volatility acts as a filter here; low volatility reduces the change severely, high volatility enables it. Thus, the potential for an opinion change is larger for participants with very different opinions than for participants who have similar opinions (in a numerical sense within the given interval of [−1, 1]). Next to this original (linear) influence shape, we also experiment with an alternative one that defines the influence potential as a triangle that is minimal (0) when opinions are equal or maximally different, and maximal (1) if the difference corresponds to half of the available interval (neutral to one extreme).

According to the given assumptions, we can handle the influencing process in a sequential manner, by computing the influence exerted by a single post (or like) on the rest of the audience. At first, we describe the effects for the linear influence shape.


A single interaction is characterized by equation (1), using p as the posting participant, r as the receiving participant, and D as a damping factor that reduces the range of possible opinion changes within one interaction. This factor is set to D = 0.1 per default, but its importance for the simulation is limited because it slows down or speeds up all interactions at the same rate.

r_o = r_o + (p_o − r_o) · r_v · D    (1)

For a like instead of a post, the situation is slightly different, and we obtain equation (2) by exchanging p_o with o_o, which stands for the opinion of the poster who wrote the post that is liked. Additionally, we use L as the like damping factor as detailed above.

r_o = r_o + (o_o − r_o) · r_v · D · L    (2)

Next to the linear influence shape, we define both equations also for the triangle influence shape as described above, for which equations (1) and (2) read:

r_o = r_o + (1 − |1 − (p_o − r_o)|) · r_v · D    (3)

r_o = r_o + (1 − |1 − (o_o − r_o)|) · r_v · D · L    (4)
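As an illustration of equations (1)-(4), the following Python sketch implements the single-interaction update for both influence shapes. The function name and signature are illustrative choices for this paper and not part of our R implementation.

```python
def update_opinion(r_o, r_v, source_o, D=0.1, like=False, L=0.5, shape="linear"):
    """One interaction: a receiver (opinion r_o, volatility r_v) sees a post or like.

    source_o is p_o for a post and o_o (opinion of the original post) for a like;
    shape="linear" corresponds to equations (1)/(2), shape="triangle" to (3)/(4)."""
    diff = source_o - r_o
    potential = diff if shape == "linear" else 1 - abs(1 - diff)
    damping = D * (L if like else 1.0)
    return r_o + potential * r_v * damping

# A follower at opinion 0.0 with volatility 0.5 reads a counter speaker's post
# (opinion 1.0); both shapes give the same shift here, because the opinion
# difference is exactly half of the available interval.
print(update_opinion(0.0, 0.5, 1.0, shape="linear"))    # 0.05
print(update_opinion(0.0, 0.5, 1.0, shape="triangle"))  # 0.05
```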

The overall simulation method is given by the pseudo-code in Algorithm 1, where S is the share of likes in relation to the total number of activities; we set this to S = 0.9 per default, which means that 90% of all actions are likes. "Model like" and "model post" in Algorithm 1 are performed by applying equation (1)/(3) or (2)/(4), respectively, to the whole audience, with the current participant as poster p. Note that in contrast to many other opinion formation models, we do not strive for a discrete state (as necessary for decisions, e.g., in an election context); the participants may end up with gradually different opinions distributed over the whole possible interval [−1, 1].

Algorithm 1: Influence Model
1   create initial audience according to values in Table 1;
2   mark all participants ∈ audience as "ready";
3   select supporter with extreme opinion, mark as "done", model post;
4   if #counter speakers > 0 then
5       select counter speaker, mark as "done", model post;
6   while there are participants ∈ audience that are "ready" do
7       randomly select one of these as p;
8       mark p as "done";
9       if random number ∈ [0, 1] < p_a then        // the participant "chooses" to get active
10          if random number ∈ [0, 1] < S then
11              model like;
12          else
13              model post;
14  if !termination then                            // we can do more than one iteration by starting again
15      goto step 2
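For concreteness, a minimal, self-contained Python sketch of Algorithm 1 with the linear influence shape follows. Group sizes and property ranges follow Table 1 (default scenario); the data structures, function names, and the demo configuration at the bottom are illustrative choices, and some details of the full R implementation are omitted (for instance, the two initial posters are not excluded from acting again).

```python
import random

D, L, S = 0.1, 0.5, 0.9   # damping factor, like damping factor, share of likes

def participant(opinion, volatility, activity):
    # a participant is just the triple p = (p_o, p_v, p_a)
    return {"o": opinion, "v": volatility, "a": activity}

def make_audience(n_supporters, n_counter, core=0.20, clowns=0.05):
    """Audience according to Table 1 (default scenario fractions)."""
    aud = [participant(-1.0, 0.0, 1.0) for _ in range(round(core * n_supporters))]
    aud += [participant(random.uniform(-1, 0), random.uniform(0, 1), 1.0)
            for _ in range(round(clowns * n_supporters))]
    while len(aud) < n_supporters:            # remaining supporters are followers
        aud.append(participant(random.uniform(-1, 0), random.uniform(0, 1),
                               random.uniform(0, 1)))
    aud += [participant(1.0, 0.0, 1.0) for _ in range(n_counter)]
    return aud

def influence(audience, source_opinion, like=False):
    """One post (or like) influences the whole audience; linear shape, eq. (1)/(2)."""
    damp = D * (L if like else 1.0)
    for r in audience:
        r["o"] += (source_opinion - r["o"]) * r["v"] * damp

def run_once(n_supporters, n_counter, original_opinion=-1.0):
    aud = make_audience(n_supporters, n_counter)
    before = sum(p["o"] for p in aud) / len(aud)
    influence(aud, original_opinion)          # step 3: initial (hate) post
    if n_counter > 0:
        influence(aud, 1.0)                   # step 5: one counter speech post
    order = list(range(len(aud)))
    random.shuffle(order)                     # steps 6-8: each participant acts at most once
    for i in order:
        p = aud[i]
        if random.random() < p["a"]:          # step 9: participant chooses to get active
            if random.random() < S:
                influence(aud, original_opinion, like=True)   # steps 10-11: like
            else:
                influence(aud, p["o"])                        # steps 12-13: own post
    return sum(p["o"] for p in aud) / len(aud) - before       # average opinion shift

if __name__ == "__main__":
    random.seed(0)
    shifts = [run_once(n_supporters=100, n_counter=5) for _ in range(50)]
    print(f"mean opinion shift over 50 repeats: {sum(shifts) / len(shifts):+.3f}")
```

The demo at the bottom already illustrates the averaging over repeated stochastic runs that is used in the experiments below.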

Experimental Setup

Research Question: Is a small number of counter speakers able to significantly influence a larger audience, and if so, what are the most important factors defining the strength of the influence?

Pre-experimental Planning: As our simulations are stochastic, we need to trade off computation time against the number of repeats done per configuration. On a normal PC, our R implementation of Algorithm 1 needs about 8 hours for computing the data needed to plot Figure 1 (220 different configurations with 50 repeats per configuration). The measured standard deviation is usually below 0.03 with this setup, which is deemed sufficient for a qualitative evaluation.


Task: We are interested in detecting single parameters or parameter interactions that lead to high differences in the aggregated opinion change over the whole audience, compared to the initial state.

Setup: We run several simulation studies; in each of them, two parameters are varied while everything else is fixed. For each parameter combination, 50 repeats are performed, and the result is the average shift in opinions of the whole audience. More specifically, we investigate #counter speakers over #supporters of the original post [5], for core fractions of 20% and 10% (default scenario and scenario 2, according to Table 1), each with the original (linear) and the triangle influence shape.

[5] # stands for "number of".

[Figure 1: contour plot of the average opinion shift; x-axis: supporters of original post (10-200), y-axis: counter speakers (0-10)]

Figure 1. Results of a simulation study (220 simulations with 50 repeats each) with up to 200 supporters of an original post and 0 to 10 counter speakers (linear influence shape, default scenario with core fraction 20%). At the 0 contour line, both influences cancel each other out on average; negative numbers mean an overall shift towards the original post, positive numbers stand for an overall shift towards the opinion of the counter speakers.


[Figure 2: contour plot, same axes as Figure 1]

Figure 2. Similar to Figure 1, but with 10% core fraction (scenario 2), linear influence. The counter speakers are much more effective here.

The remaining parameters are set as follows: fraction of likes S = 0.9, weight of likes (like damping factor) L = 0.5.
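To show how the grid of configurations behind Figures 1 and 2 can be produced, a condensed, self-contained sketch of the same experiment follows (linear influence shape, default scenario fractions, 50 repeats per cell). It restates the model from the sketch after Algorithm 1 in compressed form; the grid printed here is only a coarse, illustrative subset of the 220 configurations used for the figures.

```python
import random

def make_audience(n_sup, n_cs, core=0.20, clowns=0.05):
    # participants are [opinion, volatility, activity]; Table 1, default scenario
    aud = [[-1.0, 0.0, 1.0] for _ in range(round(core * n_sup))]
    aud += [[random.uniform(-1, 0), random.uniform(0, 1), 1.0]
            for _ in range(round(clowns * n_sup))]
    while len(aud) < n_sup:
        aud.append([random.uniform(-1, 0), random.uniform(0, 1), random.random()])
    return aud + [[1.0, 0.0, 1.0] for _ in range(n_cs)]

def spread(aud, src, damp):
    for r in aud:                             # linear influence, equations (1)/(2)
        r[0] += (src - r[0]) * r[1] * damp

def shift_once(n_sup, n_cs, D=0.1, L=0.5, S=0.9):
    aud = make_audience(n_sup, n_cs)
    before = sum(r[0] for r in aud) / len(aud)
    spread(aud, -1.0, D)                      # original hate post
    if n_cs:
        spread(aud, 1.0, D)                   # one counter speech post
    for r in random.sample(aud, len(aud)):    # every participant may act once
        if random.random() < r[2]:
            if random.random() < S:
                spread(aud, -1.0, D * L)      # like of the original post
            else:
                spread(aud, r[0], D)          # own post
    return sum(r[0] for r in aud) / len(aud) - before

if __name__ == "__main__":
    random.seed(1)
    print("columns = supporters of original post:", (25, 50, 100, 200))
    for n_cs in (0, 5, 10):
        row = [sum(shift_once(n_sup, n_cs) for _ in range(50)) / 50
               for n_sup in (25, 50, 100, 200)]
        print("counter speakers:", n_cs, ["%+.3f" % v for v in row])
```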

Results

Figures 1 and 2 depict the results for the linear influence shape, for the default scenario and scenario 2, respectively. Figures 3 and 4 show the results for the triangle influence shape. With the linear influence shape (Figures 1 and 2), we find that the strongest effect of counter speech is reached with a group of around 30 supporters for 10 counter speakers. For the triangle influence shape (Figures 3 and 4), this effect is, if it exists at all, much weaker. Apart from this effect, the general shape of the lines of equally strong shifts in the contour plots is almost linear, with the slope depending on the scenario as well as on the chosen influence shape.


[Figure 3: contour plot, same axes as Figure 1]

Figure 3. Similar to Figure 1 (20% core fraction, default scenario), but with the triangle influence shape. Counter speakers are much less effective now.

[Figure 4: contour plot, same axes as Figure 1]

Figure 4. Same configuration as Figure 3 (triangle influence shape), but for a core fraction of 10% (scenario 2). The counter speakers are more effective than in Figure 3, but much less so than for the linear influence shape.


The unusual shape of the region of strongest influence of the counter speakers appears to be an overkill effect. With a small audience, the counter speakers cannot use their full potential: they could convince more listeners, but there are none available. This is visible especially in Figure 2, and to a lesser extent also in Figure 4. We presume that this effect becomes more prominent the lower the core fraction, which functions as the counterweight to the counter speakers.

It is an interesting but non-trivial question which effect is stronger: that of the influence shape or that of the core fraction. Keeping in mind that several of our basic model assumptions have an ad-hoc character and are not grounded in reliable data, the core fraction effect may at least be more relevant, as it may be considered a mutable parameter that changes for different audiences (visitors of Facebook pages). We assume that the real influence shape for online communication media is much less volatile, but it has, to the best of our knowledge, not yet been fully identified. However, it may well be that the influence shape also depends on the domain of communication, and in particular on the emotional involvement of the participants: in this respect, a discussion about different product types (e.g., books, computer mainboards) will most likely possess a different influence shape than a discussion about the number of refugees Germany should accept. We can only state that it makes a huge difference whether or not counter speakers are able to influence people with far-from-neutral opinions. If not (this corresponds to the triangle influence shape and is presumably the case for highly emotionally charged debates), counter speech seems to be much less effective than for the linear influence shape, which allows for a strong influence on participants with completely different opinions.


Conclusion

There is little academic literature exploring the effect of counter speech in an empirical or experimental way. To the best of our knowledge, scholars concentrate on mere descriptions of successful case studies and deduce, at best, ex post facto conditions for the effectiveness of counter speech (Richards & Calvert, 2000). In this paper we have presented a simulation model to investigate the effectiveness of counter speech, since "more" and reasonable speech is regarded by many SNS as the number one remedy against hate speech. We focused on two distinctive features of Facebook audiences: the proportion of the core faction, i.e. users who initiate hate speech posts, and the type of influence counter speakers can exert on the undecided, the large faction of followers but also clowns, i.e. trolls.

In our study, we found that counter speech can have a considerable impact on a given audience, although it strongly depends on the fraction of hate speakers. That is to say, our first variation, larger vs. smaller core faction, revealed a result that seems trivial at first sight: the smaller the proportion of the core, the higher the influence of counter speech. However, we noticed an unexpected shape which we interpret as an overkill effect: with a small audience, the effect of counter speech seems to be constrained by the missing availability of listeners. These first results were obtained with a very optimistic linear influence shape function: the potential of an opinion change grows linearly with the mutual difference in opinions of the acting and the receiving individual. Our second condition regarding the influence shape function (triangle shape) assumes that counter speech works best if the opinions of individuals are not further apart than half of the available interval; this revealed a considerably weaker influence of counter speech. Audiences are prone to confirmation bias and seek out information and opinions they already hold.


Overall, we find that even a small group of counter speakers can influence an audience that is much larger, if there is a significant number of listeners who do not hold extreme opinions.

We return to the question we asked at the beginning: is it reasonable to encourage users to use counter speech against hate posts on SNS? We are aware of the fact that our findings reflect only trends and cannot be directly transferred into implementable policies. However, our results indicate that counter speech might not only stop a specific social media audience from drifting towards more extreme opinions via the confirmation bias effect, but that it is even possible to influence the audience (on average) in the opposite direction, albeit only slightly. According to our model, this holds true only for a medium-sized audience with a large number of inactive, rather undecided individuals. Nevertheless, we have only considered the effects of counter speech on relatively small groups within a small time frame. As opinion formation is an ongoing process, it is yet unclear how sustainable the effect of counter speech is: a single social media group is most likely not the only source of influence on the opinions of networked citizens with a political interest.

Next to some generic extensions (e.g., larger audiences, observation and modeling of a longer period of time), it will be interesting to improve the simulation model and apply it to further situations:

• We kept volatility and activity fixed during the simulation. However, especially in the context of a longer modeled time period, it makes sense to allow them to change as a result of the course of the discussion, too. For instance, some individuals may become frustrated or confused and remain quiet for some time.

• The chosen influence shape has a large influence on the effects of counter speech; it would therefore make sense to investigate which influence shapes are realistic for a specific scenario.


• Currently, we have assumed a fully connected network structure; we do not care how the audience arrived at the blackboard where the discussion takes place. However, a single participant is connected only to a limited number of people, and the connection structure may make a difference, as related work on actor-based modeling in dynamic network contexts (Veenstra, Steglich, Laursen, Little, & Card, 2012) shows.

References

Bartlett, J., & Krasodomski-Jones, A. (2015). Counter-speech: Examining content that challenges extremism online. Retrieved from http://www.demos.co.uk/wp-content/uploads/2015/10/Counter-speech.pdf

Benesch, S. (2014a). Countering dangerous speech: New ideas for genocide prevention. Working paper. Retrieved from http://www.ushmm.org/m/pdfs/20140212-benesch-countering-dangerous-speech.pdf

Benesch, S. (2014b). Flower speech: New responses to hatred online. Retrieved from https://thenetmonitor.org/research/2014/

Bowman, S., & Willis, C. (2003). We media: How audiences are shaping the future of news and information.

Boyle, K. (2001). Hate speech: The United States versus the rest of the world. Maine Law Review, 53, 487.

Buckels, E. E., Trapnell, P. D., & Paulhus, D. L. (2014). Trolls just want to have fun. Personality and Individual Differences, 67, 97–102.

Burnap, P., & Williams, M. L. (2015). Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet.

Chess, S., & Shaw, A. (2015). A conspiracy of fishes, or, how we learned to stop worrying about #GamerGate and embrace hegemonic masculinity. Journal of Broadcasting & Electronic Media, 59(1), 208–220.

Citron, D. K., & Norton, H. L. (2011). Intermediaries and hate speech: Fostering digital citizenship for our information age. Boston University Law Review, 91, 1435.

Delgado, R., & Stefancic, J. (2014). Hate speech in cyberspace. Wake Forest Law Review, 49.

DeNardis, L. (2014). The global war for internet governance. Yale University Press.

Deuze, M. (1999). Journalism and the web: An analysis of skills and standards in an online environment. International Communication Gazette, 61(5), 373–390.

Elin, L., & Davis, S. (2002). Click on democracy: The internet's power to change political apathy into civic action. Westview Press.

Erjavec, K., & Kovačič, M. P. (2012). "You don't understand, this is a new war!" Analysis of hate speech in news web sites' comments. Mass Communication and Society, 15(6), 899–920.

Facebook. (2015). Government requests report. Retrieved from https://govtrequests.facebook.com/

Festl, R., & Quandt, T. (2013). Social relations and cyberbullying: The influence of individual and structural attributes on victimization and perpetration via the internet. Human Communication Research, 39(1), 101–126.

Fleck, M. M., Forsyth, D. A., & Bregler, C. (1996). Finding naked people. In Computer Vision – ECCV '96 (pp. 593–602). Springer.

Gerstenfeld, P. B., Grant, D. R., & Chiang, C.-P. (2003). Hate online: A content analysis of extremist internet sites. Analyses of Social Issues and Public Policy, 3(1), 29–44.

Giora, R., Fein, O., & Schwartz, T. (1998). Irony: Graded salience and indirect negation. Metaphor and Symbol, 13(2), 83–101.

Heath, B., Hill, R., & Ciarallo, F. (2009). A survey of agent-based modeling practices (January 1998 to July 2008). Journal of Artificial Societies and Social Simulation, 12(4), 9.

Henry, J. S. (2009). Beyond free speech: Novel approaches to hate on the internet in the United States. Information & Communications Technology Law, 18(2), 235–251.

Kakumanu, P., Makrogiannis, S., & Bourbakis, N. (2007). A survey of skin-color modeling and detection methods. Pattern Recognition, 40(3), 1106–1122.

King, R. D., & Sutton, G. M. (2013). High times for hate crimes: Explaining the temporal clustering of hate-motivated offending. Criminology, 51(4), 871–894.

Lawrence III, C. R. (1990). If he hollers let him go: Regulating racist speech on campus. Duke Law Journal, 431–483.

Richards, R. D., & Calvert, C. (2000). Counterspeech 2000: A new look at the old remedy for bad speech. Brigham Young University Law Review, 553.

Shane, P. M. (2004). Democracy online: The prospects for political renewal through the internet. Routledge.

Veenstra, R., Steglich, C., Laursen, B., Little, T., & Card, N. (2012). Actor-based model for network and behavior dynamics. Handbook of developmental research methods, 598–618.

Waldron, J. (2012). The harm in hate speech. Harvard University Press.

Watts, D. J., & Dodds, P. S. (2007). Influentials, networks, and public opinion formation. Journal of Consumer Research, 34(4), 441–458.