Do tweets to scientific articles contain positive or negative sentiments?

Natalie Friedrich1, Timothy D. Bowman2 & Stefanie Haustein3

1 Heinrich Heine University Düsseldorf, Institute of Linguistics and Information, Department of Information Science, Düsseldorf (Germany)

2 Research Unit for the Sociology of Education – RUSE, University of Turku, Turku (Finland)

3 École de bibliothéconomie et des sciences de l'information, Université de Montréal, Montréal (Canada)

Introduction

With 23% of the adult online population on Twitter (Duggan, Ellison, Lampe, Lenhart, & Madden, 2015), 500 million tweets sent per day and 21% of recent journal articles tweeted at least once (Haustein, Costas, & Larivière, 2015), the microblogging platform has been identified as one of the altmetrics data sources with the largest potential to measure the impact of papers beyond the scientific community. Although it is not quite clear what type of impact tweets to scientific documents measure, tweets are already used as metrics by data aggregators such as Altmetric.com, ImpactStory and Plum Analytics. In this context, it is crucial to determine if and to what extent tweets contain positive or negative opinions about the papers they link to.

A case study based on manual coding of 270 random tweets to journal articles found that Twitter users hardly express any sentiments towards the papers they link to, which suggests that Twitter is mostly used to disseminate scientific papers (Thelwall, Tsou, Weingart, Holmberg, & Haustein, 2013). Friedrich, Bowman, Stock, and Haustein (2015) found similar results based on 1,000 intellectually analyzed tweets, which were used to adapt automated sentiment analysis with SentiStrength to tweets about scientific papers. This paper builds upon the results of Friedrich et al. (2015) and Friedrich (2015) using improved methods, and reports the results of a large-scale sentiment analysis of tweets mentioning journal articles, with a particular focus on differences between scientific disciplines. The results will contribute to the understanding of tweets as impact measures.

Methods

The study is based on all articles and reviews published in 2012 in the Web of Science (WoS) that were linked, via the Digital Object Identifier (DOI), to tweets captured by Altmetric.com until 30 June 2014, as described in Haustein, Costas, and Larivière (2015).
Retweets were excluded to focus on original contributions on Twitter, and tweets were restricted to those from accounts with English language settings to limit the number of tweets in other languages. This resulted in a set of 487,610 tweets mentioning 192,832 papers, which was analysed with SentiStrength. Since the direct processing of tweets with SentiStrength led to inaccurate results, both the tweets and the lexicon were adapted as described in Friedrich (2015). The preprocessing of tweets included removing Twitter-specific affordances such as user names, URLs and hash signs, as well as terms that appeared in the article title. Adaptation of the lexicon involved removing terms that often appeared as the subjects of studies rather than carrying a negative (e.g., cancer) or positive (e.g., baby) sentiment. Each pre-processed tweet was assigned a sentiment (positive, negative or neutral). Results were aggregated at the level of scientific disciplines using the NSF classification assigned to the journal in which the tweeted paper was published. It should be noted that, although the automated sentiment detection was improved from 56.8% (Cohen's Kappa K=0.10) to 92.1% (K=0.54, moderate agreement) correctly classified tweets based on a random sample of 1,000, misclassifications are possible.
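The two adaptation steps described above can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use SentiStrength's actual interface or lexicon format: the function names, the regular expressions, the word-level matching of title terms, and the toy lexicon with its scores are all assumptions made for illustration.

```python
import re

# Assumed patterns for Twitter-specific affordances (illustrative only).
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WORD_RE = re.compile(r"[a-z0-9']+")


def preprocess_tweet(text: str, article_title: str) -> str:
    """Remove URLs, @user names, hash signs and words found in the article title."""
    title_terms = set(WORD_RE.findall(article_title.lower()))
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # drop only the hash sign, keep the word
    kept = []
    for token in text.split():
        # Compare the token without surrounding punctuation, case-insensitively.
        core = token.lower().strip(".,;:!?\"'()[]")
        if core and core in title_terms:
            continue  # word also occurs in the title, so it is treated as topical
        kept.append(token)
    return " ".join(kept)


def adapt_lexicon(lexicon: dict[str, int], topic_terms: set[str]) -> dict[str, int]:
    """Return a copy of the lexicon without terms that are usually study subjects."""
    return {term: score for term, score in lexicon.items() if term not in topic_terms}


# Hypothetical example data (the scores are invented, not SentiStrength values).
tweet = "Great read! New #cancer screening results from @jsmith http://example.org/abc"
title = "Cancer screening outcomes in older adults"
cleaned = preprocess_tweet(tweet, title)
lexicon = adapt_lexicon({"cancer": -4, "baby": 3, "awful": -4}, {"cancer", "baby"})
```

Note that matching title terms word by word also removes common words such as "in" when they occur in the title; a real pipeline might exclude stop words from that step.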

[Figure 1 image not reproduced. The chart shows, per discipline, positive, neutral and negative sentiment in % (axis: sentiment in %, 0% to 100%) together with the number of documents and the number of tweets (axis: number of documents and tweets, 0 to 200,000). Disciplines shown: Arts, Biology, Biomedical Research, Chemistry, Clinical Medicine, Earth and Space, Engineering and…, Health, Humanities, Mathematics, Physics, Professional Fields, Psychology, Social Sciences.]

Results

Of the 487,610 tweets, 11.0% were identified as containing positive and 7.3% negative sentiments, while 81.7% of tweets were neutral. At the level of scientific disciplines (Figure 1), positive tweets outnumber negative tweets, although notable differences can be observed, for example between Arts and Humanities, with a large share of sentiments, and Chemistry, where 92.2% of tweets did not contain any sentiments. In Clinical Medicine (8.9% positive, 7.7% negative) and Health (9.0%, 7.5%), sentiments were most evenly distributed. Psychology (11.8%) and Social Sciences (11.6%) were the disciplines with the highest share of negative sentiments.
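The per-discipline percentages reported here are simple shares of labelled tweets. A minimal sketch of that aggregation is given below, assuming each tweet has already been reduced to a (discipline, sentiment label) pair. The function name and the toy records are hypothetical and do not reproduce the study's data.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "negative", "neutral")


def sentiment_shares(records):
    """Compute the percentage of each sentiment label per discipline.

    `records` is an iterable of (discipline, label) pairs.
    """
    counts = defaultdict(Counter)
    for discipline, label in records:
        counts[discipline][label] += 1

    shares = {}
    for discipline, label_counts in counts.items():
        total = sum(label_counts.values())
        shares[discipline] = {
            label: 100.0 * label_counts[label] / total for label in LABELS
        }
    return shares


# Toy input, invented for illustration only.
records = (
    [("Chemistry", "neutral")] * 9
    + [("Chemistry", "positive")]
    + [("Psychology", "positive")] * 2
    + [("Psychology", "negative"), ("Psychology", "neutral")]
)
shares = sentiment_shares(records)
```

A tweet linked to a paper whose journal falls into more than one discipline would need an explicit counting rule; this sketch simply counts one record per pair.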

Figure 1. Percentage of tweets with positive, negative and neutral sentiments per scientific discipline.

Conclusions

Considering tweets to scientific papers as an altmetric indicator, the results show that the majority of the processed tweets do not contain any sentiments and are therefore neither praise nor criticism but merely diffusion of the paper. A possible reason for the lack of sentiments might be the limitation to 140 characters. Still, close to 20% of the tweets (18.3%) contain some sentiment and therefore express an opinion about the linked article. Although positive sentiments prevail in all disciplines, negative sentiments represent more than 10% of tweets in the Social Sciences, Psychology and Humanities. This might be because topics studied in these disciplines are often related to people's experiences and opinions and are thus more likely to trigger negative comments by Twitter users. On the other hand, papers in the natural sciences, such as chemistry, physics and engineering, provoke fewer sentiments, as natural phenomena are less likely than social ones to cause emotional reactions among Twitter users. Differentiating between neutral, positive and negative tweets can help to improve the value of tweets as altmetrics indicators.

References

Duggan, M., Ellison, N. B., Lampe, C., Lenhart, A., & Madden, M. (2015). Social Media Update 2014. Pew Research Center, January 2015. Retrieved from: http://www.pewinternet.org/2015/01/01/social-media-update-2014/

Friedrich, N. (2015). Applying sentiment analysis for tweets linking to scientific papers. Bachelor's thesis (Bachelor of Arts, degree program Information Science and Language Technology), Heinrich Heine University Düsseldorf, Düsseldorf, Germany.

Friedrich, N., Bowman, T. D., Stock, W. G., & Haustein, S. (2015). Adapting sentiment analysis for tweets linking to scientific papers. In Proceedings of ISSI 2015 Istanbul: 15th International Society of Scientometrics and Informetrics Conference, Istanbul, Turkey, 29 June to 3 July, 2015 (pp. 107-108). Istanbul, Turkey: Bogaziçi University Printhouse. Retrieved from: http://www.issi2015.org/files/downloads/all-papers/0107.pdf

Haustein, S., Costas, R., & Larivière, V. (2015). Characterizing social media metrics of scholarly papers: the effect of document properties and collaboration patterns. PLoS ONE, 10(3), e0120495.

Thelwall, M., Tsou, A., Weingart, S., Holmberg, K., & Haustein, S. (2013). Tweeting links to academic articles. Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, (17), 1-8.