PoliticAlly: Finding Political Friends on Twitter

2015 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS)

Suchita Jain, Vanya Sharma and Rishabh Kaushal
Department of Information Technology
Indira Gandhi Delhi Technical University for Women, Delhi, India

Abstract—Twitter is fast becoming the most popular platform for spreading information in general and political views in particular. Like any other social networking site, Twitter is a medium for socializing with people, and in the political space in particular it is pertinent to gauge public opinion from time to time. One of the key requirements is to find people with similar political opinions in order to consolidate support and find new political friends (online sepoys). In this work, our objective is to assist political parties in addressing this issue through a recommendation system that recommends new people to an already registered party worker, based on the worker's participation in trending hashtags on Twitter. Our proposed system is based on a new metric, which we call relatedness, between any two users on Twitter. This metric is derived from the analysis of two sources, namely the links and the content of tweets. The link source is characterized by the party worker's participation on Twitter in the form of the @mentions and retweets (RT) that he/she posts in trending hashtags. The content source is characterized by the words appearing in the tweets posted by the party worker, combined with the hashtags present in them. We construct a directed weighted graph in which users are the nodes and the proposed relatedness metric is the edge weight. The well-known Walktrap community detection algorithm is then used to identify clusters of people with similar views, on the basis of which recommendations are made. A prototype system called PoliticAlly has been developed, which provides a simple web interface for party workers to obtain friend recommendations.

Keywords—Similarity metrics, Community Detection and Mining, Twitter Followee Recommendation, Trending Hashtags

I. INTRODUCTION

Twitter has become a popular online tool for expressing political views. News often breaks on Twitter and only later spreads through electronic and print media. As of March 2015 [1], Twitter has over 500 million users worldwide, 302 million of them active. Twitter's active user base in the political space includes almost all world leaders and politicians. Getting to know people with similar political opinions and interests is one of the most pertinent requirements of political parties, which seek to woo such people into the party fold by officially making them their political sepoys. Unfortunately, finding people with similar interests in the vast Twitter space is a time-consuming job and beyond the capacity of party workers who are not very tech savvy. To the best of our knowledge, Twitter currently suggests who to follow¹ based on the user's own network. In this paper, we propose a recommendation system that recommends followees (whom to follow) based on the user's past participation in trending hashtags. The system is based on the assumption that users in the same community participate in political debates on Twitter in a similar manner.

The rest of the paper is structured as follows. Section II briefly reviews related work. Section III describes the proposed system. Section IV gives the implementation outcomes of our recommendation system. Finally, Section V concludes the paper and Section VI discusses open issues.

¹ Who to Follow? Link: https://support.twitter.com/articles/227220suggestions-for-you-discover-who-to-follow

978-1-5090-0293-1/15/$31.00 ©2015 IEEE

II. RELATED WORK

Many methods have been proposed for recommendation; indicative examples are the works of Hannon et al. [2] and Guy et al. [3]. In the political domain, research has focused on predicting political outcomes, for instance in the works of Conover et al. [10], [11]. Election prediction is addressed in the works of Tumasjan et al. [12], [13]. Various community detection algorithms have also been proposed ([4], [5], [6], [8], [9]). For brevity, we mention only a few indicative works and move on to describe our proposed approach.

III. PROPOSED APPROACH

A. Data Collection

We use a web tool, Twitter Archiver², to collect tweets posted on hashtags related to the Indian political space. It automatically saves tweets to a Google Drive sheet in real time. The 20 most popular political hashtags were identified, and a total of 159,814 tweets from 46,004 unique Twitter users were collected from these hashtags over a period of 30 days.

B. Graph Generation

After collecting the data and removing duplicate entries, we generate a directed graph based on link (interaction) analysis and content analysis of the tweets. Users are represented as the vertices of the graph, and each edge weight represents the relatedness metric defined below.

Relatedness: It is expressed as the weight of the edge from Ui to Uj and represents the degree to which user Ui is similar to user Uj and vice versa. It is calculated by analyzing the tweets on two bases, namely link analysis and content analysis.

² Twitter Archiver, Link: http://wersm.com/twitter-archiver-automaticallysaves-tweets-to-your-google-drive/


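1) Link Analysis: Here, tweets are analyzed based on their interactions. Two types of interaction are captured, namely retweets (RT) and mentions (@M):

$$RT_{t_i}(U_i, U_j) = \begin{cases} 1 & \text{if tweet } t_i \text{ of } U_i \text{ is a retweet of } U_j \\ 0 & \text{otherwise} \end{cases}$$

$$@M_{t_i}(U_i, U_j) = \begin{cases} 1 & \text{if tweet } t_i \text{ of } U_i \text{ mentions } U_j \\ 0 & \text{otherwise} \end{cases}$$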
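As an illustration, the two indicator functions can be sketched in Python. This is a minimal sketch, not the authors' implementation: it assumes that each tweet is available as plain text, that a retweet is recognizable by a leading "RT @handle" prefix, and that the retweeted handle is not additionally counted as a mention. The function names are hypothetical.

```python
import re

# Assumption: tweets are plain text and retweets start with "RT @handle".
MENTION_RE = re.compile(r"@(\w+)")
RETWEET_RE = re.compile(r"^RT @(\w+)")


def retweet_indicator(text, user_j):
    """RT_{t_i}(U_i, U_j): 1 if the tweet text is a retweet of user_j, else 0."""
    match = RETWEET_RE.match(text)
    return int(bool(match) and match.group(1).lower() == user_j.lower())


def mention_indicator(text, user_j):
    """@M_{t_i}(U_i, U_j): 1 if the tweet text mentions user_j, else 0.

    Assumption: the handle in the "RT @handle" prefix is treated as a
    retweet only, so it is stripped before looking for mentions.
    """
    body = RETWEET_RE.sub("", text, count=1)
    handles = {handle.lower() for handle in MENTION_RE.findall(body)}
    return int(user_j.lower() in handles)
```

For the text "RT @alice: hello @bob", this sketch treats the tweet as a retweet of alice and a mention of bob.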

2) Content Analysis: Content analysis of the tweets is based on hashtags and full text. In hashtag analysis, an edge with weight HS, indicating the relative number of common hashtags, is added between the users of two tweets if the commonality $\sigma$ is greater than a threshold $\tau$ (0.25):

$$\sigma_{t_i,t_j} = \frac{n(H_{t_i} \cap H_{t_j})}{n(H_{t_i})}$$

$$HS_{t_i,t_j}(U_i, U_j) = \begin{cases} \sigma_{t_i,t_j} & \text{if } \sigma_{t_i,t_j} > \tau \\ 0 & \text{otherwise} \end{cases}$$

where $H_{t_i}$ is the set of hashtags in tweet $t_i$ and $H_{t_j}$ is the set of hashtags in tweet $t_j$.

In full-text content analysis, the texts of the two tweets are compared for similarity using a well-known sequence matcher metric $\epsilon$, which contributes to the edge weight if it is greater than a threshold value $\rho$ (0.60):

$$\epsilon_{t_i,t_j} = \frac{2M}{T}$$

$$FS_{t_i,t_j}(U_i, U_j) = \begin{cases} \epsilon_{t_i,t_j} & \text{if } \epsilon_{t_i,t_j} > \rho \\ 0 & \text{otherwise} \end{cases}$$

where $M$ is the number of matches in the two tweets and $T$ is the total number of words in the two tweets.

3) Relatedness Metric: The relatedness metric between two users is calculated as a weighted sum of all the above factors:

$$\text{Relatedness}(U_i, U_j) = \sum_{t_i} \Big( \alpha\, RT_{t_i}(U_i, U_j) + \beta\, @M_{t_i}(U_i, U_j) + \sum_{t_j} \big( \gamma\, HS_{t_i,t_j}(U_i, U_j) + \delta\, FS_{t_i,t_j}(U_i, U_j) \big) \Big)$$

The values of $\alpha$, $\beta$, $\gamma$ and $\delta$ are taken as 2, 5, 1 and 1, respectively, based on their relative contribution to the Relatedness metric.

D. Proposed Algorithm

This section describes the proposed algorithm, which finds friends for a user based on his/her participation in political discussions (hashtags). The algorithm takes as input the list HT of all popular political hashtags and the user U to whom friends are to be recommended. It outputs a list of recommended users and their corresponding Relatedness scores.

Algorithm Recommend-Political-Friends(HT, U)
    for all H_i ∈ HT do
        TD_i ← getTweets(H_i)
    end for
    TD ← ∅
    for all TD_i do
        TD ← TD ∪ TD_i
    end for
    W(u_i, u_j) ← 0
    for all t_i ∈ TD do
        if t_i is a retweet of tweet t_j then
            W(u_i, u_j) ← W(u_i, u_j) + α
        end if
        if t_i mentions user u_j then
            W(u_i, u_j) ← W(u_i, u_j) + β
        end if
        for all t_j ∈ TD do
            σ ← n(H_ti ∩ H_tj) / n(H_ti)
            if σ > τ then
                W(u_i, u_j) ← W(u_i, u_j) + γσ
            end if
            ε ← sequenceMatcher(t_i, t_j)
            if ε > ρ then
                W(u_i, u_j) ← W(u_i, u_j) + δε
            end if
        end for
    end for
    Rescale all the edge weights to percentages from 1 to 100
    G_c ← CommunityDetection(G)
        (G_c now has vertex structure V<U, c>, where c is the ID of the community to which user U belongs)
    for all neighbours u_j of U in G_c do
        if c_j = c and isFollower(U, u_j) = False then
            print u_j and W(U, u_j)
        end if
    end for
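The weight-accumulation loop of the algorithm can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' code: tweets are assumed to be (user, text) pairs, retweets are assumed to start with "RT @handle", handles are assumed to match user names exactly, and the "sequence matcher" is assumed to be the word-level `difflib.SequenceMatcher.ratio()` from the Python standard library, whose value is 2M/T. Skipping comparisons between a user's own tweets is also an assumption, as is returning 0 for the hashtag commonality when the first tweet has no hashtags.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

ALPHA, BETA, GAMMA, DELTA = 2, 5, 1, 1  # weights stated in the text
TAU, RHO = 0.25, 0.60                   # thresholds stated in the text

HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")
RETWEET_RE = re.compile(r"^RT @(\w+)")  # assumed retweet convention


def hashtag_similarity(text_i, text_j):
    """sigma: share of the hashtags of tweet i that also occur in tweet j."""
    tags_i = {tag.lower() for tag in HASHTAG_RE.findall(text_i)}
    tags_j = {tag.lower() for tag in HASHTAG_RE.findall(text_j)}
    if not tags_i:
        return 0.0  # assumption: no hashtags means no commonality
    return len(tags_i & tags_j) / len(tags_i)


def text_similarity(text_i, text_j):
    """epsilon: word-level 2M/T ratio computed by difflib."""
    return SequenceMatcher(None, text_i.split(), text_j.split()).ratio()


def edge_weights(tweets):
    """Return {(u_i, u_j): weight} for a list of (user, text) tweets."""
    weights = defaultdict(float)
    for user_i, text_i in tweets:
        retweet = RETWEET_RE.match(text_i)
        if retweet:
            weights[(user_i, retweet.group(1))] += ALPHA
        body = RETWEET_RE.sub("", text_i, count=1)
        for handle in set(MENTION_RE.findall(body)):
            weights[(user_i, handle)] += BETA
        for user_j, text_j in tweets:
            if user_j == user_i:
                continue  # assumption: no self-loops
            sigma = hashtag_similarity(text_i, text_j)
            if sigma > TAU:
                weights[(user_i, user_j)] += GAMMA * sigma
            epsilon = text_similarity(text_i, text_j)
            if epsilon > RHO:
                weights[(user_i, user_j)] += DELTA * epsilon
    return dict(weights)
```

Note that the nested loop compares every pair of tweets, so the running time of this sketch grows quadratically with the number of tweets.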

C. Recommendation using Community Detection

The graph is then processed by a community detection algorithm. Numerous community detection algorithms exist; in our work, we use the Walktrap algorithm [8], as it is computationally inexpensive on large graphs and gives good results in a short time. Users in the same community are expected to share common interests.

Neighbours of the user in the same community who are not currently followed by the user are recommended as potential friends.

To establish a baseline for comparison, we rescale the Relatedness metric to percentages (from 1 to 100).

IV. IMPLEMENTATION

The proposed recommendation system is implemented as a web application, PoliticAlly, which helps Twitter users find their political allies (friends with similar political interests). The homepage (see Fig. 1) prompts the user to enter his/her screen name and choose all the hashtags he/she is interested in.

The proposed algorithm runs in the background: it filters the dataset according to the selected hashtags, analyzes the tweets based on their links and content, constructs a graph, runs the community detection algorithm on the graph and uses the result to recommend political allies. A sample graph is shown in Fig. 2.

Users in the same community are assumed to share similar interests, so the system recommends Twitter users from the same community whom the user is not currently following.
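The rescaling and same-community filtering steps can be sketched in Python as follows. The paper does not state the rescaling formula, so min-max scaling onto the range 1 to 100 is an assumption here. The community labels are taken as given (for example, as produced by a Walktrap implementation), and the names `rescale_to_percent`, `recommend`, `community` and `following` are hypothetical.

```python
def rescale_to_percent(weights):
    """Rescale edge weights to the range [1, 100].

    Assumption: min-max scaling; the text only says that the weights are
    rescaled to percentages from 1 to 100.
    """
    if not weights:
        return {}
    lowest, highest = min(weights.values()), max(weights.values())
    if highest == lowest:
        return {edge: 100.0 for edge in weights}
    span = highest - lowest
    return {
        edge: 1 + 99 * (weight - lowest) / span
        for edge, weight in weights.items()
    }


def recommend(user, weights, community, following):
    """List (candidate, percentage) pairs for `user`, best match first.

    weights:   {(u_i, u_j): percentage} edge weights of the directed graph
    community: {user: community_id} as returned by community detection
    following: set of users that `user` already follows
    """
    target = community[user]
    candidates = [
        (other, percentage)
        for (source, other), percentage in weights.items()
        if source == user
        and community.get(other) == target
        and other not in following
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

Sorting by decreasing percentage mirrors the order in which the result page lists recommended friends.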

A screenshot of the result page is shown in Fig. 3. The system outputs a list of recommended political friends in order of decreasing percentage.

Fig. 1. Home page of web portal PoliticAlly.

Fig. 2. A sample graph formed in the background after link and content analysis of the set of tweets containing hashtags #aapkacm or #mufflerman.

Fig. 3. A sample result page of web portal PoliticAlly. The webpage shows recommendations for Twitter user Lakshmi (AKOnlineSena) based on her participation in all the hashtags listed on the login page.

V. CONCLUSION

Through our work, we show that it is possible to recommend political friends based on a user's participation in trending hashtags. This implementation can be extended to find friends with common and/or similar interests in other domains such as education, entertainment and social issues.

VI. OPEN ISSUES AND FUTURE WORK

As this is our first step in this direction, a number of extensions are possible. We need to improve the performance of our proposed algorithm in terms of time and space. The weights and threshold values of the identified parameters can be further fine-tuned for improved results. Other community detection algorithms can be used and compared for better results. We aim to address these issues in our future work.

REFERENCES

[1] Twitter MAU Were 302M For Q1, Up 18% YoY - Twitter (NYSE:TWTR) - Benzinga. April 28, 2015. Retrieved May 2, 2015.
[2] Hannon, John, Mike Bennett, and Barry Smyth. Recommending Twitter users to follow using content and collaborative filtering approaches. Proceedings of the fourth ACM conference on Recommender systems. ACM, 2010.
[3] I. Guy, I. Ronen, and E. Wilcox. Do you know?: Recommending people to invite into your social network. In IUI 09: Proceedings of the 13th international conference on Intelligent user interfaces, pages 77-86, New York, NY, USA, 2009. ACM.
[4] Lancichinetti, Andrea, and Santo Fortunato. Community detection algorithms: a comparative analysis. Physical Review E 80.5 (2009): 056117.
[5] Girvan, Michelle, and Mark EJ Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99.12 (2002): 7821-7826.
[6] Clauset, Aaron, Mark EJ Newman, and Cristopher Moore. Finding community structure in very large networks. Physical Review E 70.6 (2004): 066111.
[7] Wakita, Ken, and Toshiyuki Tsurumi. Finding community structure in mega-scale social networks: [extended abstract]. Proceedings of the 16th International Conference on World Wide Web. ACM, 2007.
[8] Pons, Pascal, and Matthieu Latapy. Computing communities in large networks using random walks. Computer and Information Sciences - ISCIS 2005. Springer Berlin Heidelberg, 2005. 284-293.
[9] Raghavan, Usha Nandini, Réka Albert, and Soundar Kumara. Near linear time algorithm to detect community structures in large-scale networks. Physical Review E 76.3 (2007).
[10] Michael D. Conover, Bruno Goncalves, Jacob Ratkiewicz, Alessandro Flammini and Filippo Menczer. Predicting the Political Alignment of Twitter Users, 2011.
[11] Conover, Michael, et al. Political Polarization on Twitter. ICWSM. 2011.
[12] Tumasjan, Andranik, et al. Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment. ICWSM 10 (2010): 178-185.
[13] Tumasjan, Andranik, et al. Election forecasts with Twitter: How 140 characters reflect the political landscape. Social Science Computer Review (2010).