ARISTOTLE UNIVERSITY OF THESSALONIKI

Film Buddy: A Social Recommender Engine using Interactivity and Explanation Techniques

by Sofia Yfantidou - Student ID: 2247

A thesis submitted in partial fulfillment for the Undergraduate degree
in the Faculty of Sciences, School of Informatics

Supervising Professor: Dr. Athena Vakali

February 2017

Declaration of Authorship

I, Yfantidou Sofia, declare that this thesis titled, ‘Film Buddy: A Social Recommender Engine using Interactivity and Explanation Techniques’ and the work presented in it are my own. I confirm that:



This work was done wholly or mainly while in candidature for a research degree at this University.



Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.



Where I have consulted the published work of others, this is always clearly attributed.



Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.



I have acknowledged all main sources of help.



Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signed:

Date:


“What would you do if you weren’t afraid?” Sheryl Sandberg


Abstract

Faculty of Sciences, School of Informatics
Undergraduate Degree
by Sofia Yfantidou - Student ID: 2247

This thesis introduces an interactive recommender system, “Film Buddy”, which provides movie recommendations based on Social and Semantic Web technologies, specifically Facebook, Wikipedia and WordNet. It is a knowledge-based recommender system, whose architecture is inspired by Information Retrieval Systems. “Film Buddy” gives the user control over the recommendation process through an interactive and explanatory graphical interface, automates user and item profiling by taking advantage of the aforementioned technologies, and puts emphasis on the diversity and novelty of the suggested results. In addition, user-centric A/B Testing is utilized for the evaluation of the platform, in order to identify the effect of semantic technologies on the recommendation process. Also, association rules are used to examine the factors that affect user satisfaction. The results of this research indicate that there is a direct relationship between interactive interfaces, diversity and novelty of the results, user satisfaction and precision of the system’s recommendations. Finally, the use of semantics for user profiling was not found to have a statistically significant effect.

Acknowledgements

Creating “Film Buddy” and writing this thesis have not been easy tasks. However, during this challenging time, I have been extremely lucky to have the best companions.

First of all, I would like to thank my supervisor, Professor Athena Vakali, for the guidance, the support, the trust and the time she offered, as well as the opportunities she provided me with throughout our cooperation. Her input has been an integral part of this effort and of utmost importance for my personal and professional development. I will always be grateful to her.

I would also like to sincerely thank my family for always prioritizing my education. They have always made sure I would not miss any opportunity, even when times were rough.

Finally, I would like to thank all the awesome, rebellious women in my life. My mum, who is always by my side whatever happens. My sister, who is the best role model I could have asked for; my love and admiration for her achievements cannot be described with words, while her contribution in extracting the statistics of this thesis has been pivotal. Despoina, who tolerated me on those nights when I could not even tolerate myself. Konstantina and Sofia; I hope our friendship will be stronger than time. My niece, Alexandra, and my goddaughter, Chrysanthi, who give me the strength to keep going; I hope some day I will be a person you can look up to and believe that you can achieve everything.

Contents

Declaration of Authorship
Abstract
Acknowledgements
List of Figures
List of Tables
Abbreviations

1 Introduction
  1.1 Overview
    1.1.1 The Social Web as an Active Users Space
    1.1.2 The Semantic Web as a Knowledge Extraction Space
    1.1.3 Recommender Systems: Techniques & Challenges
      1.1.3.1 Recommender Systems’ Techniques
      1.1.3.2 Recommender Systems’ Challenges
      1.1.3.3 Knowledge-based Recommendation Assets
  1.2 Idea & Motivation
    1.2.1 User-side Demands and User Profiling
    1.2.2 Implementation Demands
  1.3 Contribution
  1.4 Thesis Structure

2 Literature Review
  2.1 Collaborative filtering systems
    2.1.1 Pros & Cons
    2.1.2 Tools & Implementation
  2.2 Content-based filtering systems
    2.2.1 Pros & Cons
    2.2.2 Tools & Implementation
  2.3 Knowledge-based systems
    2.3.1 Pros & Cons
    2.3.2 Tools & Implementation
  2.4 Hybrid systems
    2.4.1 Pros & Cons
    2.4.2 Tools & Implementation
  2.5 Comparison

3 Notation & Fundamentals
  3.1 Notation
  3.2 Principles and Fundamentals
    3.2.1 Formulating the Recommendation Process as an IR Task
    3.2.2 IR Methods as Part of the Recommendation Process
    3.2.3 An IR System as the Backbone of the Recommendation Process
      3.2.3.1 Inverted Index Structure: Organising the Movies
      3.2.3.2 Inverted Index Querying & Scoring: Ranking the Movies
      3.2.3.3 Querying Top-k Algorithm: Retrieving Recommendations

4 Basic Principles and Implementation Framework for an IR Recommender System using Semantics
  4.1 General IR Architecture of a Recommender System
  4.2 Film Buddy Datasets
    4.2.1 Film Buddy: Wikipedia Film Dataset
    4.2.2 Film Buddy: OMDb Film Dataset
  4.3 Framework Visualization

5 Implementation and Core Components of “Film Buddy”
  5.1 Framework Components
    5.1.1 Content Analyzer
      5.1.1.1 Content Analyzer and the Content Component
      5.1.1.2 Content Analyzer and the User Component
    5.1.2 User Component
    5.1.3 Content Component
    5.1.4 Search Engine Component
      5.1.4.1 Search Engine Component: Building the Inverted Index
      5.1.4.2 Search Engine Component: Querying the Inverted Index
    5.1.5 Web Platform Component

6 Experimentation & Validation
  6.1 Experimentation
    6.1.1 Film Buddy Sample
    6.1.2 Film Buddy Questionnaire
    6.1.3 Film Buddy User Survey
  6.2 Results
    6.2.1 Exploratory Analysis and Reliability
    6.2.2 Correlation
    6.2.3 T-Test
    6.2.4 Multivariate Analysis of Variance (MANOVA)
    6.2.5 User’s Habits regarding Facebook & Movie Discovery
    6.2.6 Demographics
    6.2.7 Users’ opinion about Film Buddy’s characteristics
    6.2.8 Metrics Calculation
      6.2.8.1 Precision@k & MAP
      6.2.8.2 Breese’s R-Score Utility
      6.2.8.3 Normalized Discounted Cumulative Gain
    6.2.9 Apriori & Association Rules

7 Conclusions & Future Work
  7.1 Conclusions
    7.1.1 Social media collective profiling
    7.1.2 Semantics aware User and Item Profiling
    7.1.3 Interactive implementation
    7.1.4 Serendipitous Results
  7.2 Suggestions for Future Work

A Mathematical Notation
  A.1 General Notation
  A.2 Set Notation
  A.3 Logic Notation

B Major IR Methods
  B.1 Standard Boolean Model
    B.1.1 Definitions
    B.1.2 Advantages
    B.1.3 Limitations
  B.2 Vector Space Model
    B.2.1 Definitions
    B.2.2 Applications
    B.2.3 Advantages
    B.2.4 Limitations
  B.3 Probabilistic Relevance Model
    B.3.1 Definitions
    B.3.2 Limitations

C Apache Lucene
  C.1 Lucene Features
    C.1.1 Scalable, High-Performance Indexing
    C.1.2 Powerful, Accurate and Efficient Search Algorithms
    C.1.3 Cross-Platform Solution

D Film Buddy Questionnaire

Bibliography

List of Figures

1.1 Number of users of social networking sites as of January 2016.
1.2 Computer-based personality judgment accuracy (y axis), plotted against the number of Likes available for prediction (x axis).
1.3 Visual facilitation of the semantic expansion graph based on Twitter following the 2012 Colorado theatre shooting.
1.4 A visual representation of what happens on average every minute on Web’s most popular sites.
1.5 Total movies produced based on IMDb data.
1.6 “Film Buddy” as an intersection of Web 2.0, Web 3.0 and Recommender Technologies.

3.1 Visualization of the Recommendation Process.
3.2 The information need can be expressed as a combination of the U IQ and the explicit information I.

4.1 Visual facilitation of the general IR architecture of a Recommender System.
4.2 An example record from the “Film Buddy: Wikipedia Film Dataset”.
4.3 An example document from the “Film Buddy: OMDb Film Dataset”.
4.4 Visual facilitation of the general framework of “Film Buddy”.
4.5 Visual facilitation of the detailed back end framework of “Film Buddy”.

5.1 The configuration file for Solr’s DataImportHandler.
5.2 Excerpt of a Solr’s response JSON.
5.3 “Film Buddy” Homepage.
5.4 “Film Buddy” Permission Dialogue.
5.5 “Film Buddy” Loading Page.
5.6 “Film Buddy” Interests Page.
5.7 “Film Buddy” Results Page.
5.8 “Film Buddy” Movie’s Page.
5.9 “Film Buddy” Example Movie’s Categories.
5.10 “Film Buddy” Category Results Page.

6.1 Multivariate Analysis of Variance Results.
6.2 The participants’ responses to the question “How many hours per week do you spend on Facebook?”
6.3 The participants’ responses to the question “How often do you post status updates on Facebook?”
6.4 The participants’ responses to the question “How many Facebook pages have you liked approximately?”
6.5 The participants’ responses to the question “How often do you watch movies?”
6.6 The participants’ responses to the question “What are your sources for movie suggestions?”
6.7 The responders’ gender.
6.8 The responders’ age group.
6.9 The responders’ ethnicity.
6.10 The responders’ educational background.
6.11 Mean values displayed as dots and standard deviations as error bars of the participants’ opinions.
6.12 Precision@k metric curve.
6.13 Breese’s R-Score Utility metric curve.
6.14 NDCG metric curve.
6.15 Apriori with LHS “InteractFilters” and RHS metrics with minlen = 2, maxlen = 5, supp = 0.2, conf = 0.5.
6.16 Apriori with LHS “InteractFilters” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.8.
6.17 Apriori with LHS “InteractKeywords” and RHS metrics with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.5.
6.18 Apriori with LHS “InteractKeywords” or “InteractFilters” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.8.
6.19 Apriori with LHS “AttractiveGUI” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.
6.20 Apriori with LHS “Control” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.
6.21 Apriori with LHS “Diverse” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.25, conf = 0.85.
6.22 Apriori with LHS “Explanations” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.
6.23 Apriori with LHS not specified and RHS metrics with minlen = 2, maxlen = 5, supp = 0.5, conf = 0.5.
6.24 Apriori with LHS not specified and RHS “UseAgain” with minlen = 2, maxlen = 4, supp = 0.2, conf = 1.

B.1 Application of the Vector Space Model (image taken from Wikipedia).

List of Tables

2.1 List of related CF systems’ projects.
2.2 List of related CBF systems’ projects.
2.3 List of related KBF systems’ projects.
2.4 List of related Hybrid systems’ projects.
2.5 Comparison between selected related tools.

3.1 General Recommendation Process Notation.
3.2 Index Notation for the Recommendation Algorithms.
3.3 An example of an II; each row includes a (term, total frequency) pair (column 1) and one or more (movie plot document, f_{t,m}) pairs (column 2).

6.1 Principal Component Analysis of the 4th section of the questionnaire.
6.2 The associations between questions and variable names.

Abbreviations

aka          also known as
API          Application Programming Interface
BIR          Boolean Information Retrieval
BM           Boolean Model
CBF          Content Based Filtering
CF           Collaborative Filtering
CSV          Comma Separated Value(s)
DCG          Discounted Cumulative Gain
e-commerce   electronic commerce
e.g.         exempli gratia (“for the sake of an example”)
etc.         et cetera (“and other things”)
i.e.         id est (“that is”)
IMDb         Internet Movie Database
II           Inverted Index
IR           Information Retrieval
JSON         JavaScript Object Notation
KBF          Knowledge Based Filtering
MANOVA       Multivariate ANalysis Of VAriance
MAP          Mean Average Precision
MI           Mutual Information
NDCG         Normalized Discounted Cumulative Gain
NLP          Natural Language Processing
OMDb         Open Movie Database
PCA          Principal Component Analysis
RESTful      Representational State Transfer
RDF          Resource Description Framework
RP           Recommendation Process
SDK          Software Development Kit
SEM-ALL      SEMantic distance from ALL tracks
SEM-GMM      SEMantic Gaussian Mixture Model
SEM-MEAN     SEMantic distance from the MEAN
SQL          Structured Query Language
SVD          Singular Value Decomposition
tf-idf       term frequency - inverse document frequency
UGC          User Generated Content
UI           User Interface
URL          Uniform Resource Locator
UX           User eXperience
VSM          Vector Space Model
WWW          World Wide Web
XML          eXtensible Markup Language

Dedicated to my sister, who’s always been there for me. . .


Chapter 1

Introduction

Social media’s impact on everyday life is growing at an unprecedented rate, and new opportunities for harvesting knowledge are opening up. In this social media era, novel and unforeseen data correlations can reveal new phenomena when information is used creatively. These correlations enable more personalized experiences to be created, by surfacing content related to people’s activities and interests. Such possibilities raise questions that can trigger new approaches to social media analytics. For instance, can our social media content, such as our Facebook profile’s posts and likes, be used to elicit our interests and help us find our next favorite entertainment item, such as an unforgettable movie? Or maybe the ideal travel destination?

In today’s society, social media users face a huge information overload. As a result, users tend to seek Infotainment services, which combine information and entertainment, in order to accommodate their need for information in a “fresher”, more compelling way [1]. Consequently, providing users with personalized and useful content, presented in an appealing way, has become a necessity for services dedicated to improving user experience. When it comes to entertainment in particular, no user wants to spend more time looking for a movie than actually watching it. That is why more and more movie-related websites, such as IMDb or Rotten Tomatoes, rely on user profiling in order to recommend the most suitable films for each user.

This thesis is motivated by the endless hours spent in search of a good movie and the frustration they caused. It aims at exploring alternative ways of providing users with accurate movie recommendations without the dire need for their input. It then proposes a recommendation process which simplifies recommendations by implicitly building the user’s profile from one’s Facebook data. Thus it spares the user valuable time, without depriving one of essential control over the recommendation process. In order to match users with movies that suit their interests, this thesis uses information and technologies from various sources, including IMDb, Wikipedia and the Semantic Web.

The rest of this chapter includes an overview of current web trends, especially in relation to the Web 2.0 era (Section 1.1), a brief description of recommender system approaches (Section 1.1.3), the idea and motivation behind this thesis (Section 1.2), the contribution of this work (Section 1.3) and an outline of the overall thesis (Section 1.4).

1.1 Overview

Undoubtedly, the WWW plays a major role in people’s lives today. WWW generations include Web 1.0, i.e. websites that provide only static content with no interaction; Web 2.0 (aka the Social Web), i.e. websites which place emphasis on UGC, usability and interoperability; and Web 3.0 (aka the Semantic Web), which encapsulates the evolution of the Web as an extension of Web 2.0 in which data and semantics contribute to a smarter Web. As current trends indicate, the introduction of Web 2.0 has largely contributed to revolutionising the Web’s status and momentum [2, 3].

The Social Web is not just about the people; it exists because of and for the people, who are the drivers of the Web’s evolution and extent. Web 2.0 gives power to the users, by allowing them to express themselves in various forms, including posting, tagging, participating in online communities and generating content in general. This ability to interact and participate to such a great degree and at a continuous pace has led to a huge increase in the amount of User-Generated Content (UGC) available. This data offers multiple valuable sources which can be exploited to deliver knowledge and insights once intelligence is embedded inside the Social Web’s applications and services.

The concept of “Collective Intelligence” has arisen as a result of advancing Web 2.0 services with smart correlations of people’s tasks, actions and inter-relations. Such correlations involve the combination of behaviors, preferences and ideas from individuals and groups of people to create novel insights and knowledge entities. Such insights can then be used to improve the user experience and the quality of services provided to the users, by personalising the delivered “product” (e.g. content, item) according to the users’ interests.

Furthermore, the large impact of Web 2.0 on the evolution of the WWW overall has been linked to Web 2.0’s extension into a new Web 3.0 era, which has already started to gain momentum. This era exploits the earlier Semantic Web generation which, according to Markoff (2006) [4], puts emphasis on “machine-facilitated understanding of information, to provide a more productive and intuitive user experience”. It has created a web experience that can be personalized with intelligent search and behavioural advertisements, where content is generated by machines instead of by humans. The full impact of Web 3.0 on the evolution of the WWW is yet to be discovered in the upcoming years, but its current strong trend provides promising potential [5, 6]. This thesis builds upon both the Social and the Semantic Web technologies and practices, as detailed in the next sub-sections.

1.1.1 The Social Web as an Active Users Space

Web 2.0, as mentioned above, allows users to interact and collaborate with each other in a virtual community setting. Users are no longer passive visitors of websites, but creators of their own UGC. Examples of popular Web 2.0 technologies according to Wikipedia include “social networking sites, blogs, wikis, folksonomies, video sharing sites, Web applications, and mashups”. Each of these technologies prioritizes different features and addresses various entities (users, content, metadata etc.), targeting services in which the “Wisdom of the crowds” (the collective opinion of a group of individuals rather than that of a single expert)¹ is exploited and advanced.

¹ Wisdom of the crowd - Wikipedia, the free encyclopedia. 2017. Retrieved from: https://en.wikipedia.org/wiki/Wisdom_of_the_crowd.

Social networking sites (aka social media), Web 2.0’s biggest success judging by their influence on users, provide varied functionality to their users. Generally, they allow users to create a profile, articulate a list of other users that they share a connection with, upload content to their profile and view the profiles of their connections. Social media are extremely popular nowadays, as indicated by various facts and figures. Facebook has a total of 1.6 billion users, followed by YouTube with 1 billion and Google+ with 440 million, as can be seen in Figure 1.1. Facebook also accounts for an astonishing 1.44 billion monthly active users,² who generate huge amounts of data. More specifically, with the average time spent per visit being 20 minutes, Facebook users upload 510 comments, 293,000 statuses and 136,000 photos every 60 seconds!³

² Facebook Statistics - Statistic Brain. 2016. Retrieved from: http://www.statisticbrain.com/facebook-statistics/.
³ Social Media Facts and Statistics for 2016 - Growing Social Media. 2016. Retrieved from http://growingsocialmedia.com/social-media-facts-and-statistics-for-2016/.

Figure 1.1: Number of users of social networking sites as of January 2016.

The aforementioned numbers show that Facebook users upload posts, statuses and photos to their profiles on a regular basis, in order to share their life moments with their connections. However, their uploads reveal much more. According to Bachrach, Kosinski, Graepel, Kohli, and Stillwell (2012) [7], personality traits are correlated with patterns of social network use, as reflected by features of one’s Facebook profile. Other research indicates that it is possible to obtain information about users’ personality rather precisely, starting from information about their interactions in social networks (Ortigosa, Carro, and Quiroga, 2014) [8]. Last but not least, a recent study by Youyou, Kosinski, and Stillwell (2015) [9] shows that, using several criteria, automated estimations of people’s personalities are more accurate and valid even than judgments made by their close others or acquaintances (friends, family, spouse, colleagues, etc.). These computer estimations are based on their digital footprints, specifically their Facebook likes (see Figure 1.2).

Figure 1.2: Computer-based personality judgment accuracy (y axis), plotted against the number of Likes available for prediction (x axis).

These results pave the way for automatically generating a user’s personality “profile” by analyzing one’s Facebook behavior (including wall posts and page likes), without requiring human input. Such “profiling” techniques could be extremely valuable in applications like advertising or recommender systems. This work’s focus on recommenders is highly inspired by these earlier approaches because, despite the above-mentioned evidence of a strong correlation between a user’s personality and the content of one’s social media profile, social data largely remains unused when it comes to recommendation.

Last but not least, Web 2.0 is characterized by the emergence of mashups. Mashups are applications that allow data from different sources to be pulled together, in order to provide valuable and novel results from the different combinations of the data. This allows for a whole range of handcrafted merges of data sources. Merging the concept of mashups with recommender systems has been shown to provide better results than traditional recommendation approaches, according to Bostandjiev, Donovan, and Höllerer (2012) [10].
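As an illustration only, the idea of a mashup can be sketched as a merge of records that describe the same item in different sources. The function names, the field names and the toy records below are all hypothetical and are not taken from the “Film Buddy” implementation; the sketch simply assumes that each source provides a list of records with a `title` field.

```python
def normalise_title(title):
    """Lower-case a title and collapse whitespace so records from different sources can be matched."""
    return " ".join(title.lower().split())


def mash_up(*sources):
    """Merge lists of record dicts on their normalised 'title'.

    When two sources provide the same field, the value from the
    earlier source is kept (an arbitrary choice made for this sketch).
    """
    merged = {}
    for records in sources:
        for record in records:
            key = normalise_title(record["title"])
            combined = merged.setdefault(key, {})
            for field, value in record.items():
                combined.setdefault(field, value)
    return list(merged.values())


# Hypothetical toy sources with invented values.
plots = [{"title": "The Godfather", "plot": "A crime family saga."}]
ratings = [
    {"title": "the  godfather", "rating": 9.0},
    {"title": "Alien", "rating": 8.0},
]

films = mash_up(plots, ratings)
for film in films:
    print(film)
```

Matching on a normalised title is the simplest possible join key; a real mashup would more likely rely on a shared identifier when the sources offer one.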

1.1.2 The Semantic Web as a Knowledge Extraction Space

The Semantic Web is based on the concept of the Semantic Network Model, which was formed in the 1960s by the cognitive scientist Allan M. Collins, linguist M. Ross Quillian and psychologist Elizabeth F. Loftus, in order to represent semantically structured knowledge.⁴ When applied in the context of the modern Web, “it extends the network of hyperlinked human-readable web pages by inserting machine-readable metadata about pages and how they are related to each other. This enables automated agents to access the Web more intelligently and perform more tasks on behalf of users”.⁵ It is important to note here that the Semantic Web “is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation” [11].

⁴ Semantic Network - Wikipedia, the free encyclopedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Semantic_network.
⁵ Semantic Web - Wikipedia, the free encyclopedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Semantic_Web.

“Semantic relatedness” is a concept linked with the Semantic Web, due to its interwoven information. The term denotes a large group of language phenomena which indicate a relation among words, varying from simple synonyms and antonyms to more complicated ones, like sets of words used in a particular scientific field or domain, or sets of words usually co-occurring in the English language. For instance, the word “mafia” usually co-occurs with words such as “American” or “Sicilian”. This means that the word “mafia” is semantically related to the words “American” and “Sicilian”. Semantically related words can be of huge value in the field of automatic natural language processing, as identifying them is a key step in solving many problems in the field, such as word sense disambiguation, machine translation, document classification and information retrieval [12]. Taking advantage of such semantic relations can help build systems that overcome the above-mentioned problems.

Figure 1.3: Visual facilitation of the semantic expansion graph based on Twitter following the 2012 Colorado theatre shooting.

In addition, interwoven with the concept of semantic relatedness is the concept of “semantic expansion”. It is defined as “a kind of technique that adds words to a set of words (e.g. user’s query) to better represent an object or meaning; this technique is utilized to restructure a query in information retrieval systems” [13]. For example, a user’s query “Netherlands” at a search engine may be expanded with words like “Holland” or “Amsterdam”, due to their semantic relation to the original query. The words added to the set are typically semantically related to the original words of the set. For instance, in Figure 1.3⁶, the word “Colorado” is semantically related to the word “shooting” at that specific period of time. Including semantics in recommender systems with an information retrieval architecture can be of great value in the recommendation process, as will be discussed in detail in Section 1.2.1.

Overall, the Semantic Web can offer its users a more personalized browsing experience, due to its capacity to contextualize information and to augment content with “meaning”. Most personalization approaches rely on the exploitation of the navigational patterns and/or behaviors of the users, e.g. users’ ratings. However, relying only on user-based results is a narrow approach in the context of Web 3.0, as it may lead to loss of conceptually related information, e.g. users’ location, interests, etc. On the contrary, taking advantage of the semantics in the field of an application “can considerably improve the results of web personalization, since it provides a more abstract yet uniform and both machine and human understandable way of processing and analyzing the data” [14].

6 Our Semantic Keyword Expansion API is ON! - Three.Fourteen. 2014. Retrieved from http://www.nirg.net/sem-exp-api.html.
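The semantic expansion described in this section can be illustrated with a minimal sketch. It assumes a small, hand-written relatedness table (`RELATED_TERMS`) that stands in for a real semantic resource; the table, the function name `expand_terms` and the lowercasing step are illustrative assumptions, not the method used by “Film Buddy”.

```python
from typing import Dict, List

# Hypothetical, hand-written relatedness table (illustrative only).
# A real system would obtain related terms from a semantic resource.
RELATED_TERMS: Dict[str, List[str]] = {
    "netherlands": ["holland", "amsterdam"],
    "mafia": ["american", "sicilian"],
}


def expand_terms(terms: List[str], related: Dict[str, List[str]]) -> List[str]:
    """Return the original terms followed by their related terms, without duplicates."""
    expanded: List[str] = []
    seen = set()
    for term in terms:
        key = term.lower()
        # Keep the original term first, then append whatever is related to it.
        for candidate in [key] + related.get(key, []):
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_terms(["Netherlands"], RELATED_TERMS))
```

A term that is missing from the table is simply kept unchanged, so in this sketch the expansion never removes information from the original set of words.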

1.1.3

Recommender Systems: Techniques & Challenges

Given the information overload today’s users face on the Web, recommender systems (a.k.a. recommender engines) have become extremely popular over the years, as they help users identify what best suits their interests among the endless possibilities in various domains. Recommender systems can be standalone web platforms or tools integrated in existing platforms. Recommender Systems or Recommendation Systems are a “subclass of information filtering system that seeks to predict the ‘rating’ or ‘preference’ that a user would give to an item”⁷. The most popular Web applications (See Figure 1.4⁸) that have developed advanced recommendation systems focus on:

• content such as music (e.g. Spotify, Pandora, Last.fm), videos (e.g. YouTube, Vine, Vimeo), and movies (e.g. IMDb, MovieLens, Netflix, Rotten Tomatoes, TasteKid),
• products (e.g. eBay, Amazon), and
• people (e.g. Facebook, Pinterest, Twitter).

The movie domain in particular has received significant attention over the years. The Netflix Prize⁹, offered by the American multinational entertainment company, awarded a $1M Grand Prize to a team that implemented an alternative recommendation algorithm, which outperformed Netflix’s own by 10.06%. Moreover, in 2010 and 2011, ACM introduced a Challenge on Context-Aware Movie Recommendation (CAMRa)¹⁰ ¹¹, which drew researchers’ attention worldwide.

7 Recommender system - Wikipedia, the free encyclopedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Recommender_system.
8 This Happens In an Internet Minute: The Astounding Growth of Web. 2015. Retrieved from blog.capitalogix.com/public/2015/06/this-happens-in-an-internet-minute-the-astounding-growth-of-web-content.html.
9 Netflix Prize. 2009. Retrieved from http://www.netflixprize.com/.
10 RecSys 2010 - CAMRa - RecSys - ACM. 2010. Retrieved from https://recsys.acm.org/recsys10/camra/.
11 RecSys 2011 - CAMRa - RecSys - ACM. 2011. Retrieved from https://recsys.acm.org/recsys11/camra/.


Figure 1.4: A visual representation of what happens on average every minute on the Web’s most popular sites.

1.1.3.1

Recommender Systems’ Techniques

Recommendation systems use a number of different technologies, depending on the way they produce their recommendations. We can classify these systems into a few broad basic groups [15, 16], which will be analysed in depth in Chapter 2. Here, we give a brief definition of each:

Collaborative Filtering Systems: Systems that recommend items to users based on what similar users (i.e. users with similar interests) have liked in the past.

Content-based Filtering Systems: Systems that recommend items to users based on a comparison between the content of the items and the users’ profiles. In other words, they recommend content similar to what the users have liked in the past.

Hybrid Systems: Systems that recommend items to users based on a combination of techniques, such as the combination of collaborative and content-based filtering that IMDb uses.

1.1.3.2

Recommender Systems’ Challenges

Recommender systems based on the typical existing approaches mentioned above face various challenges that decrease their performance (efficiency, precision, recall, novelty, etc.) and/or lower the quality of their results. These include, but are not limited to:

1. Scalability issues, meaning that traditional systems face difficulties in coping with large-scale applications, such as e-commerce, due to the constantly increasing number of users and volume of items becoming available for recommendation [17].
2. Content Unpredictability issues, meaning that systems are not designed to handle the unpredicted bursts of content generation, which occur frequently in applications that reflect societal concerns, such as social media [18].
3. Cold-start issues, meaning that systems are unable to provide new users with effective recommendations, due to lack of information about the users, or unable to recommend new items to users, due to the lack of ratings and/or recorded preference for the specific items [19].
4. “Black box” issues, meaning that systems are incapable of providing explanations to the users about their recommendations, resulting in a loss of users’ trust [20]. In addition, “Black box” issues in recommender systems are connected with limitations in accepting feedback for the recommendations, and thus an inability to improve [19].
5. Diversity and novelty issues, meaning that the systems recommend items of the same type, without diversifying, and items that are famous amongst the majority of the users, without recommending less well-known content that may be relevant [21, 22].
6. Social streams issues, meaning that traditional systems are not built to handle the huge amounts of real-time data generated by social networking sites’ users, such as homogeneous, public data from Twitter and heterogeneous, limited-access data from Facebook. Consequently, challenges arise in terms of filtering and personalization [23].
7. Cross-domain issues, meaning that systems face difficulties handling data originating from different domains/sources, whether these refer to different fields, such as movies and music, or different social networking sites, such as Twitter and Facebook [23].

1.1.3.3

Knowledge-based Recommendation Assets

In an attempt to face most of the challenges in Section 1.1.3.2, many new recommendation approaches have arisen [15]. One of them is knowledge-based recommendation, which has the ability to integrate information from external sources into the recommendation process.

Knowledge-based recommender systems recommend items based on inferences about users’ preferences. In other words, they extract knowledge about the users’ needs, behaviors and preferences and then detect the content that matches these preferences. Knowledge-based systems face many of the above challenges depending on their architecture and UI design. More details about knowledge-based systems and their pros are given in Chapter 2. Their strong asset, though, is that they overcome the cold-start problem by using the extracted knowledge for new users, for example based on their profiles in other platforms. However, a potential subsequent drawback can be knowledge acquisition bottlenecks, caused by the system’s need to define recommendation knowledge explicitly. Taking into consideration Section 1.1, this problem can be resolved by using the abundance of user data from social networking sites and more specifically from Facebook, since it has been proven to provide an accurate representation of the users’ personalities and preferences (as discussed in Section 1.1.1). Expanding the aforementioned social data with semantic information, as discussed in Section 1.1.2, can lead to even better results compared to using raw UGC.

1.2

Idea & Motivation

The movie domain is quite popular in the field of recommender systems, as mentioned in Section 1.1.3, and not without reason. Hundreds of thousands of movies have been produced (See Figure 1.5¹²), making it nearly impossible for a user to choose the most suitable for one’s interests. Most of the existing commercial movie recommender systems, such as Netflix, IMDb, Rotten Tomatoes or MovieLens, require the user’s input to produce recommendations, or allow limited or no interaction with the user. However, as discussed in Section 1.1.1, there is an abundance of users’ information available online, which remains unused. Therefore, this thesis aims at adding further intelligence to the movie recommendation process by taking advantage of such information.

1.2.1

User-side Demands and User Profiling

Users nowadays have certain expectations when it comes to recommendations. This thesis describes and addresses the ones listed below.

1. Immediate Results
A recent study by The Pew Research Center’s Internet & American Life Project indicates that the hyperconnected lives of today’s people have “negative effects including a need for instant gratification and loss of patience” [24]. Today’s Web users want immediate results. They are not willing to wait, so systems that waste users’ limited time may be easily abandoned. Under these circumstances, applications and services should not expect users to devote time to providing input to a recommender system. The abundance of information in the social web (e.g. Facebook) can greatly contribute to automating this task, as discussed in Section 1.1.1. Automating this task will not only improve the user experience, but also resolve the cold-start problems that traditional recommender systems face (as mentioned in Section 1.1.3.2).

12 Watching all the movies ever made - Justgeek.de. 2014. Retrieved from http://www.justgeek.de/watching-all-the-movies-ever-made/.

Figure 1.5: Total movies produced based on IMDb data.

2. Accurate User-profiling
As mentioned in Section 1.1.1, user-profiling can be based on Facebook data. However, raw Facebook data is not enough to extract the knowledge needed for our knowledge-based recommender system. What is essential, in order to build the users’ and items’ “profiles”, is the expansion of the raw data with semantically related information and metadata. A study by Liang, Yang, Chen, and Ku (2008) indicates that the semantic-expansion approach significantly outperformed the keyword-only approach in capturing the user’s interests [25]. In addition, a study by Freitas and Curry (2014) uses semantic relatedness to tackle the gap between the way users express their information needs and the representation of the data. As a result, they outperform existing systems in recall and query coverage [26]. Therefore, taking advantage of semantics can contribute to more accurate user-profiling than traditional keyword-only approaches.


1.2.2

Implementation Demands

Apart from the user-side demands, there are some practical requirements related to the implementation of recommender systems, which play a role in users’ satisfaction. This thesis describes and addresses the ones listed below.

1. Interactive User Interfaces
Even with a solid back-end, a recommendation engine relies heavily on its user interface. Studies indicate that users are happier when they are given control (Harper et al., 2015 [27]) and that interaction with the recommender system increases user satisfaction (Gretarsson, O’Donovan, Bostandjiev, Hall, and Höllerer, 2010 [28]). In addition, explanation interfaces lead to improved acceptance of a predicted rating and better results [10, 20].

2. Serendipitous Results
A recommender system that provides users solely with foreseeable results is of little use to them. Serendipity is defined as the ability to give novel and “surprising” results [21]. A study by Jones and Pu (2007) [29] indicates that recommender websites can attract new users by maximizing the novelty of the results. As mentioned in Section 1.1.3.2 though, most traditional recommender systems suffer from serendipity issues.

The above issues have provided the motivation to build “Film Buddy”, a knowledge-based and semantics-aware recommender engine with knowledge acquisition from Facebook. “Film Buddy” uses semantics to capture the users’ interests and items’ characteristics, and gives users control over the recommendation process, in order to increase user satisfaction.

1.3

Contribution

Based on the above user-side demands and profiling (Section 1.2.1), implementation demands (Section 1.2.2) and the recommender systems’ challenges and limitations (Section 1.1.3.2), “Film Buddy” contributes to the field of recommender systems in the ways discussed below:

Figure 1.6: “Film Buddy” as an intersection of Web 2.0, Web 3.0 and Recommender Technologies.

1. Social media collective profiling
“Film Buddy” extracts user data from Facebook, given the user’s permission. This way it eliminates the time and effort needed for a user to build one’s profile, either by giving ratings or by stating one’s preferences explicitly in any other way. Not only does “Film Buddy” utilize the user’s likes to build one’s profile, a practice common in many social recommender systems based on Facebook data, but it also uses the user’s posts, which far outweigh the user’s likes in volume, if we consider the number of words included in a post compared to a like (on average, 86 posts shared per user per month, each with multiple words, versus 81 likes per user per month, each with limited words¹³). “Film Buddy” allows users to get recommendations that suit their interests just by clicking a button; the rest is taken care of in the background. This social element is crucial in the recommendation process of “Film Buddy”. In addition, this method of data acquisition eliminates the cold-start problems frequent in traditional recommender systems. Last but not least, the “Film Buddy” recommender system is built on a search engine’s architecture, which means response times are limited to milliseconds¹⁴.

2. Semantics-aware User and Item Profiling
“Film Buddy” not only uses raw data and keywords extracted from a user’s social media profile, but also expands this data based on semantic relatedness (See Section 1.1.2 for more information), in order to achieve better profiling. Moreover, it uses the same semantic expansion technique to build the movies’ profiles, introducing an automatic semantics-aware tagging technique, which will be explained in depth in Chapter 3.

3. Interactive implementation
The “Film Buddy” User Interface (UI) is designed to improve the user experience and enhance the users’ satisfaction. Users have a say over what is recommended to them. “Film Buddy” allows users to edit their recommendation profile and thus play a major role in the recommendation process. Also, it offers an interactive UI, where users can filter results, as well as get explanations about the recommendations, increasing users’ trust as a result.

13 41 Up-to-Date Facebook Facts and Stats - Wishpond. 2015. Retrieved from http://blog.wishpond.com/post/115675435109/40-up-to-date-facebook-facts-and-stats.
14 Measuring SOLR query performance - 29min - WordPress.com. 2013. Retrieved from https://29min.wordpress.com/2013/07/31/measuring-solr-query-performance/.

4. Serendipitous Results
“Film Buddy” does not use users’ ratings or items’ popularity to provide recommendations. Instead, it only uses the degree to which an item matches the user’s interests and preferences, recommending only the best matches. The calculation of this similarity degree will be discussed in detail in Section 3.2.3.2. This way it does not face any serendipity issues, as all movies are treated equally, whether they are a box-office hit or an indie European production. It thus succeeds in revealing even the most surprising, unexpected results, if they suit a user’s interests.

To sum up, the mission of “Film Buddy” is to create a movie recommender engine that saves users’ time by automatically building their profiles, achieves more accurate profiling by using semantics and gives users control of the recommendation process by providing a highly interactive UI.

1.4

Thesis Structure

This thesis is structured as follows: In the 1st Chapter, an Introduction to the Social and Semantic Web is provided, as well as an Introduction to the field of Recommender Systems. Also, the contribution of this work is discussed. In the 2nd Chapter, the Main Approaches of Recommendation Techniques, along with their advantages, disadvantages and related bibliography, are introduced. In addition, a comparison between existing recommender systems is presented. The 3rd Chapter describes the Notation used throughout this thesis, as well as the Basic Principles of the Platform’s Architecture as an IR System. Moreover, it provides all the Formulas and Algorithms that are utilized by this approach. In the 4th Chapter, the Implementation of the “Film Buddy” Platform and the created Datasets are presented, while in the 5th Chapter a Detailed Implementation of the Platform is discussed. Finally, in the 6th Chapter, the Experimentation and its Results are discussed, while in the 7th Chapter, the aforementioned Results are analyzed and Suggestions for Future Work are provided.

Chapter 2

Literature Review

In this chapter, the most important recommendation approaches are presented in detail, along with their advantages and disadvantages, as well as a list of recommender systems for each approach that use social media information and/or semantics. Finally, in subsection 2.5 a comparative discussion of all approaches’ projects is provided for a better understanding of the related work’s emphasis and open issues.

2.1

Collaborative filtering systems

Collaborative Filtering (CF) is the “process of filtering or evaluating items using the opinions of other people” [30]. CF systems use implicit and explicit information about their users’ behaviors and preferences, such as ratings and reviews, in order to provide recommendations. Their basic principle is that users who have shown similar behaviors (for instance by providing similar ratings and reviews) are more likely to have similar interests and thus to be interested in the same items. Consequently, CF systems predict what users will like based on their similarity to other users of the same platform.
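The basic CF principle above can be sketched in a few lines. The example assumes a user-based variant with cosine similarity over co-rated items and a similarity-weighted average as the prediction; these choices, the sample ratings in `SAMPLE_RATINGS` and all function names are illustrative assumptions rather than a description of any specific system discussed in this chapter.

```python
from math import sqrt
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}

# Hypothetical sample ratings (illustrative only).
SAMPLE_RATINGS: Ratings = {
    "alice": {"Heat": 5, "Alien": 1},
    "bob": {"Heat": 5, "Alien": 1, "Ronin": 4},
    "carol": {"Heat": 1, "Alien": 5, "Ronin": 2},
    "dave": {},  # a new user without any ratings
}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings: Ratings, user: str, item: str) -> float:
    """Similarity-weighted average of the other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    # Without any similar user there is nothing to base a prediction on.
    return weighted_sum / weight_total if weight_total else 0.0


if __name__ == "__main__":
    print(predict_rating(SAMPLE_RATINGS, "alice", "Ronin"))
    print(predict_rating(SAMPLE_RATINGS, "dave", "Ronin"))
```

In this sketch the prediction for "alice" is pulled towards the rating of "bob", whose past ratings match hers, while the user without ratings receives no usable prediction, which is the new-user cold-start situation.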

2.1.1

Pros & Cons

CF is one of the most popular recommendation approaches nowadays due to its content independence and its capacity of “evolution” through time, meaning that users’ ratings “evolve” along with the users’ interests through time [15]. However, it faces many problems, which have led to the adoption of hybrid approaches [31]. More details on its pros and cons can be found below:


1. Pros
The most important advantages of CF recommenders are [16, 32]:
i. No knowledge database required: There is no need for a knowledge database to store the characteristics of the items or users, as CF recommendations are based only upon the users’ ratings and do not need any additional information about the users or items.
ii. Improving quality rate: The quality of the recommendations is not static, but is constantly improving, as the number of the users’ ratings in the system increases through time.
iii. Domain knowledge not required: Domain knowledge is not needed to produce recommendations. CF methods utilize only users’ ratings, as mentioned above.

2. Cons
However, CF approaches often suffer from various problems [15]:
i. Cold-start: These systems often require a large amount of existing data on a user or item, in order to make accurate recommendations. This prevents new users with few ratings from receiving accurate recommendations. The same applies to new items, which have not yet been rated by many users and are thus not recommended either [33].
ii. Scalability: In most of today’s information systems the number of items, as well as the number of users, is extremely high. Consequently, large amounts of computational power are needed to calculate the recommendations [17, 34].
iii. Sparsity: The root of the problem is the existence of a huge number of items, especially in large databases, compared to a small number of users. Even the most active users will only have rated a small subset of the overall database [35]. Consequently, even the most popular items have very few ratings.
iv. “Banana Problem”: CF systems tend to recommend popular items, i.e. items that most users show a preference towards, thus lacking novelty in their recommendations [32].
v. “Black-box Problem”: Due to the anonymity of the ratings and the methodology of the recommendation, it is nearly impossible to give adequate explanations to users about their recommendations [34].
vi. Unused knowledge: These systems have knowledge of neither the specific characteristics of the products they recommend nor the personal interests of the people they recommend them to. This means that they let a huge amount of valuable information about their users, available on the Social Web, go unused.


Table 2.1: List of related CF systems’ projects.

2.1.2

Tools & Implementation

“MovieLens” is mentioned in this section, although it has no semantic or social features, as it is one of the most popular research CF recommender systems to this day. In an effort to eliminate the need for user input, which is the main drawback of “MovieLens”, and the new-user cold-start problem, the 3rd (Sedhain, Sanner, Braziunas, Xie, and Christensen, 2014) [36], 4th (Fernández-Tobías, and Cantador, 2014) [37] and 5th (Wu, and Shih, 2015) [38] projects from Table 2.1 utilize users’ Facebook data (or social media data in general). More specifically, they use likes, friendships and ratings, but without providing the users with a GUI, remaining purely theoretical. The same applies to the 2nd project by Martín-Vicente, et al. (2014), which does not use Facebook data, but enriches the CF algorithms with semantic information, improving their performance [35]. None of the research CF projects found have both semantic and social features.

2.2

Content-based filtering systems

Content-based filtering (CBF) methods are based on “a description of the item and a profile of the user’s preference” [39]. CBF recommenders recommend items that are similar to the ones the user has liked in the past. The similarity between the items is calculated based on their common characteristics with respect to the user’s actions. For instance, if a user has given high ratings to “Julia Roberts” movies, then the recommender system, based on this preference, will recommend more movies that share this characteristic (i.e. “Julia Roberts” as the female lead).
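The “Julia Roberts” example can be turned into a small sketch of content-based scoring. It assumes that each movie is described by a set of feature strings and that the user profile is simply a count of the features of the liked movies; the tiny catalogue in `MOVIE_FEATURES`, the feature naming and the scoring rule are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical mini-catalogue: movie -> descriptive features (illustrative only).
MOVIE_FEATURES: Dict[str, Set[str]] = {
    "Pretty Woman": {"actor:Julia Roberts", "genre:romance"},
    "Notting Hill": {"actor:Julia Roberts", "genre:romance"},
    "Erin Brockovich": {"actor:Julia Roberts", "genre:drama"},
    "Alien": {"actor:Sigourney Weaver", "genre:sci-fi"},
}


def build_profile(liked: List[str], features: Dict[str, Set[str]]) -> Counter:
    """Count how often each feature occurs among the items the user liked."""
    profile: Counter = Counter()
    for item in liked:
        profile.update(features.get(item, set()))
    return profile


def rank_unseen(liked: List[str], features: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Score every unseen item by the summed profile weight of its features."""
    profile = build_profile(liked, features)
    scores = [
        (item, sum(profile[f] for f in feats))
        for item, feats in features.items()
        if item not in liked
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(rank_unseen(["Pretty Woman", "Notting Hill"], MOVIE_FEATURES))
```

Note that the score depends only on the item’s features and the user’s profile, not on how many other users rated the item, which is why an unrated new item can still be recommended under this scheme.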

2.2.1

Pros & Cons

CBF reduces some of the issues and challenges of CF techniques. However, it still suffers from the “New-user” cold-start problem and is highly dependent on the items’ representation, as will be explained below. More details on its pros and cons are summarized next:

1. Pros
The most important advantages of CBF include [15, 16]:
i. “Banana Problem” resolution: CBF recommenders overcome the “Banana problem” that CF recommenders face, as the popularity of a specific item does not matter when it comes to recommending it. What matters is its characteristics and how they match the user’s preferences.
ii. “New-item” cold-start problem overcome: These systems have the ability to recommend new items, not yet rated, thus overcoming the cold-start problem when it comes to new items.
iii. Lack of “Black-box Problem”: CBF approaches overcome the “Black box problem”, as they are capable of offering explanations for their recommendations.

2. Cons
However, CBF approaches face various problems [15, 16]:
i. Item representation: The quality of the recommendations is highly dependent on the number and quality of the characteristics associated with each item. If there is not enough information about an item for the system to decide whether it suits a user’s interests or not, then the recommendation may be unsuccessful.
ii. Capture Serendipity: The items recommended suit only the user’s expressed preferences. This means that if the user has expressed very specific preferences to the system, e.g. has rated only movies by a specific director, then the system will mainly suggest items that share the same characteristic. In the case above, it will recommend movies mostly by this specific director, thus lacking serendipity, i.e. the ability to recommend novel and surprising items.
iii. “New-user” cold-start accuracy: These systems often require a large amount of existing data on a user, in order to make accurate recommendations. This prevents new users with few ratings from receiving accurate recommendations.


2.2.2

Tools & Implementation

Table 2.2: List of related CBF systems’ projects

Of the cases in Table 2.2, neither the “C-eVSM” project (Musto, Semeraro, Lops, de Gemmis, 2014) [40] nor the last one (Bogdanov, et al., 2013) [41] has included social media data in its recommendation process, so they fail to solve the new-user cold-start problem automatically. On the contrary, the “MORE” approach uses both semantics for item-profiling and social data. Its authors found that, without considering the semantic information, the recommendation results got drastically worse. To receive recommendations, the user is given the option to provide access to the movies’ section of his/her Facebook profile, in which users save their favorite or to-watch movies. However, not all users update the movies’ section of their profiles. This means that users without an updated movies’ section are unable to use this social functionality. Instead, they can only provide input manually, even though there is an abundance of unused, valuable information in their Facebook profiles, e.g. likes, posts, etc. Although “MORE” tries to eliminate the new-user cold-start problem, it fails in specific cases, as mentioned above. In addition, no semantic information is used by “MORE” to improve the quality of the user-profiling, as discussed in Section 1.2.1. Finally, the resulting movies are ranked based on popularity, which means that serendipity issues (See Cons of CBF in Section 2.2.1) may arise, as less known films, which may suit the user’s interests, are ranked lower in the list.

2.3

Knowledge-based systems

Knowledge-based filtering (KBF) recommender systems take advantage of the knowledge about items and/or users, in order to produce recommendations. Therefore, these systems aim to create/extract knowledge about the characteristics of the items to be recommended and/or the users’ interests and desires [15]. Based on this knowledge they match users to items that suit their personal interests. The knowledge about the users and/or items can be implicit or explicit [15]:

• Implicit: Implicit knowledge acquisition requires the use of data mining techniques on large amounts of data and the selection of useful information. For instance, extracting information about a user’s personality from their Facebook profile (with their consent) and using it to produce recommendations can be categorised as implicit knowledge acquisition.
• Explicit: Explicit knowledge acquisition requires the characteristics of the items and/or users to be inserted directly into the system. For instance, asking a user to state one’s preferences before the recommendation process can be categorised as explicit knowledge acquisition.
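The difference between the two acquisition modes can be sketched as follows. The example assumes a small, hypothetical vocabulary of interest keywords (`KNOWN_INTERESTS`) and a naive tokenizer; real implicit acquisition would rely on proper data mining techniques, so this is only an illustration of where the knowledge comes from in each case.

```python
import re
from collections import Counter
from typing import Iterable, Set

# Hypothetical vocabulary of interest keywords (illustrative only).
KNOWN_INTERESTS: Set[str] = {"thriller", "comedy", "space", "music"}


def implicit_interests(posts: Iterable[str], vocabulary: Set[str]) -> Counter:
    """Implicit acquisition: mine interest keywords from the user's free text."""
    counts: Counter = Counter()
    for post in posts:
        for token in re.findall(r"[a-z]+", post.lower()):
            if token in vocabulary:
                counts[token] += 1
    return counts


def explicit_interests(stated: Iterable[str], vocabulary: Set[str]) -> Counter:
    """Explicit acquisition: record the preferences the user entered directly."""
    return Counter(term.lower() for term in stated if term.lower() in vocabulary)


if __name__ == "__main__":
    mined = implicit_interests(
        ["Loved that space thriller!", "Space documentaries are great"],
        KNOWN_INTERESTS,
    )
    stated = explicit_interests(["Comedy", "Opera"], KNOWN_INTERESTS)
    # Both sources can be merged into a single user profile.
    print(mined + stated)
```

Merging the two counters shows that the approaches are complementary: explicit statements can fill gaps in what was mined, and mined interests spare the user from typing everything in.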

2.3.1

Pros & Cons

KBF eliminates most of the problems CF and CBF suffer from, such as the cold-start problem and the “Black-box” problem. However, it faces a few new challenges of its own. More details on its pros and cons can be found below:

1. Pros
Knowledge-based systems have many advantages [15] to counterbalance their shortcomings, which will be mentioned later on:
i. Cold-start problem resolution: They overcome the cold-start problem. By implicitly or explicitly eliciting information about new users, KBF recommender systems have no problem recommending items to new users, even without a single rating. The same applies to new items. Implicit knowledge acquisition allows these systems to recommend new items, which have not been rated by users before.
ii. Accurate user-profiling: Since it is possible to elicit information about the user at any time, these systems are capable of capturing the shifts in the preferences of the user, thus producing more accurate recommendations.
iii. Computational power efficiency: They do not have large computational power requirements, especially in comparison with CF systems, which need huge computational power, in order to compute the nearest neighbors of each user.
iv. Serendipity capturing: They can produce novel recommendations, since they are not based on an item’s popularity or previous ratings, but solely on the item properties and the user’s interests.
v. Absence of “Black-box Problem”: They are capable of giving explanations for their results and, more importantly, there is the possibility of giving the user some control over the recommendations.

2. Cons
KBF systems face their own challenges [15]:
i. Complex knowledge database: KBF systems require a complex knowledge database, which includes information about the user desires and how these desires can be met by the itemset, as well as information and characteristics about the itemset itself. Building this database can be a difficult procedure.
ii. Staticity Issues: Traditional KBF systems suffer from staticity issues (the condition of being static), meaning that the complexity of their knowledge database makes it hard to update the knowledge about the users’ preferences. As a result, the preferences in the system remain static, contrary to real life, where they “evolve” through time [32].

2.3.2

Tools & Implementation

Table 2.3: List of related KBF systems’ projects.

As indicated in Table 2.3, neither “RecomMetz” (Colombo-Mendoza, Valencia-García, Rodríguez-González, Alor-Hernández, and Samper-Zapater, 2015) [42] nor the 1st system, by Liang et al. [25], utilizes any kind of social media information. Thus, they both suffer from the new-user cold-start problem or, alternatively, they need the user’s input, leaving valuable information about the user unused. However, “IBRS” (De Graaff, Van De Venis, Van Keulen, and De By, 2015) [43] has considered both social and semantic features. More specifically, “IBRS” extracts the user’s Facebook likes, matches them to DBpedia pages to identify broader concepts and finally matches these concepts with a known tag-set, which is based on the item-set, to achieve uniform tagging between the user’s interests and the item-set properties. However, matching the user’s likes to DBpedia pages had only a 19.2% success rate, which means that 80.8% of the user’s likes were not utilized. Moreover, DBpedia caused more problems in the recommendation process, since some paths in the RDF graph formed a reason not to recommend an item. For example, “in the holiday home domain, a user was less likely to book a home in his own town, even though there may be many paths between him and that holiday home based on his local likes”. Also, many paths that are quite common, such as the “European Central Time”, are of no use to the user, although they are again connected with one’s interests. Based on these issues, exploring alternatives to DBpedia when it comes to semantics may give better results. Last but not least, “IBRS” does not provide users with an interactive interface, but is limited to a static presentation of the recommendation results, which does not enhance user satisfaction, as discussed in Section 1.2.2.

2.4

Hybrid systems

Hybrid systems result from a combination of two or more of the recommendation approaches mentioned in the previous sub-Sections. By combining traditional approaches, these systems aim to eliminate the shortcomings of each approach and take advantage of their assets [16]. Depending on the recommendation needs, there can be multiple combinations. Most of today’s applications use hybrid approaches in order to solve the problems of CF and CBF; combining these two approaches is also the most common combination.

2.4.1

Pros & Cons

The pros and cons of Hybrid recommender systems are strongly dependent on the combination of recommendation approaches selected.

1. Pros
As mentioned above, the pros of Hybrid recommender systems depend on the combination at hand. In other words, Hybrid recommender systems use the advantages of each of the combined approaches to create a system that counterbalances the disadvantages of these approaches. For instance, one can combine CF and CBF approaches to utilize both the representation of the items (CBF) and the similarities among users (CF). By utilizing CBF’s representation of items, the hybrid system overcomes the new-item cold-start problem CF systems face (See Section 2.1.1), while by utilizing CF’s user similarity, it overcomes the Serendipity issue CBF systems face (See Section 2.2.1).

2. Cons
Hybrid recommender systems are built to face the challenges connected to individual recommendation approaches. However, this does not mean that they come without a “cost”. Pinpointing the appropriate combination for the recommender system and combining approaches is not an easy task, and the decision should be made based on the pros and cons of each approach and the specific requirements of a given application.

2.4.2

Tools & Implementation

Table 2.4: List of related Hybrid systems’ projects.

All selected Hybrid projects have exploited social features, utilizing data from Facebook, Twitter and other social media. The 1st (Agrawal, and Karimzadehgan, 2009) [44] and 2nd (Carrer-Neto, Hernández-Alcaraz, Valencia-García, and García-Sánchez, 2012) [45] projects from Table 2.4 use only Facebook connections to aid the recommendation process, and so fail to overcome the new-user cold-start problem, since they do not capture the user’s preferences from social media. The 3rd project (David, Bajaj, and Jazra, 2012) [46] from Table 2.4 faces this challenge by utilizing Facebook data of unspecified nature. However, it lacks a GUI and does not take advantage of semantics to improve recommendations. “TasteWeights” (Bostandjiev, Donovan, and Höllerer, 2012) [10], on the other hand, provides users with a highly interactive GUI and exploits semantics (Wikipedia specifically) in order to identify broader categories. In addition, it uses both Facebook friendships as a trust measure and Facebook likes as an initial source for user-profiling. However, it does not use semantics to improve user-profiling as discussed in Section 1.2.1, and most importantly it is limited to domains for which the user has expressed an explicit preference by liking a directly relevant Facebook page. In other words, it is unable to automatically identify the user’s general interests in order to use them in cross-domain cases.

2.5

Comparison

To sum up, “Film Buddy” introduces a standalone platform for recommendation with the characteristics described in Section 1.3, a summary of which can be found in Table 2.5. Only half (8 out of 16) of the other projects presented in Table 2.5 have introduced standalone software. The rest solely describe theoretical frameworks. The majority of these standalone recommenders utilize semantics, mainly when it comes to content-profiling. Only “IBRS” uses semantics in order to build the user profile. However, “IBRS” falls behind when it comes to content-profiling, as discussed in Section 2.3.2. This means that no other standalone software discussed uses semantic technologies for both user and content profiling, except for “Film Buddy”. The advantages of using semantics for profiling have been discussed in detail in Section 1.2.1.

In addition, many of the standalone recommenders take advantage of social media UGC. Thus, they achieve automatic user profiling as mentioned in Section 1.2.1 and face the new-user cold-start problem as discussed in Section 1.1.3.2. However, as discussed in Sections 2.1.2, 2.2.2, 2.3.2 and 2.4.2, none of these recommenders utilizes users’ Facebook posts as “Film Buddy” does, ignoring a great amount of user-generated textual data.

Also, out of the standalone recommenders with social features, only “TasteWeights” provides explanations in combination with an interactive UI to its users. Because interactive and explanatory UIs increase user satisfaction, as discussed in Section 1.2.2, and face the “Black box” issue, as presented in Section 1.1.3.2, “Film Buddy” follows in the steps of “TasteWeights”. However, “TasteWeights” completely ignores semantics when it comes to user-profiling, while it faces cross-domain issues (See Sections 1.1.3.2 and 2.4.2). “Film Buddy” is not domain specific, because it mainly uses Wikipedia (not domain specific) as its source of content information, as will be discussed in detail in Section 5.1.3. Overall, “Film Buddy” is the only recommender system amongst the selected recommenders that combines all features described in Table 2.5.

[Table 2.5 body: the per-cell entries (check marks and “∼” symbols) are not recoverable from the extracted layout; only the row and column labels survive.
Rows (projects, grouped by approach): CF: 2.1.1 MovieLens, 2.1.2, 2.1.3, 2.1.4, 2.1.5; CBF: 2.2.1 MORE, 2.2.2 C-eVSM, 2.2.3; KBF: 2.3.1, 2.3.2 IBRS, 2.3.3 RecomMetz; Hybrid: 2.4.1, 2.4.2, 2.4.3, 2.4.4 TasteWeights, 2.4.5; Film Buddy.
Columns (attributes): Standalone Software, Social Features, Content-profiling Semantics, User-profiling Semantics, Explicit Interests Extraction, Cross-domain, Serendipitous results, Faces Cold-start, Explanatory GUI, Interactive GUI.]



Table 2.5: Comparison between selected related tools.

Note: The project numbering refers to the Table the project belongs to and its position in the table. For instance, 2.1.3 refers to the 3rd project of Table 2.1. The “∼” symbol is used to denote lack of information or a debatable topic. All numbering is clickable and redirects to the appropriate table.

Overall, in this chapter, a literature review was provided along with a comparison between available recommendation system tools and “Film Buddy”, in order to create a platform, as discussed in Chapters 4 and 5, which covers the needs in the field of recommendation, as discussed in Section 1.2.1.

Chapter 3

Notation & Fundamentals

In this chapter, the notation and the definitions of terms used throughout the thesis are introduced. More definitions are introduced in other chapters (where needed). In addition, the basic concepts of the project’s architecture are presented.

3.1

Notation

This section introduces the basic recommendation-specific notation used throughout this chapter. For the mathematical notation (e.g. set notation, logic notation, etc.) used in the formulas of this chapter, see Appendix A.

I     Explicit Information - Contextual information about the user U (e.g. favorite genres, preferred release year, etc.), provided by the user oneself
IN    Information Need - Context (setting) of the recommendation, e.g. touristic destinations, movies, etc.
M     Collection of movies to be searched
R     Collection of retrieved recommended movies in response to UIQ
RP    Recommendation Process
U     User
UIQ   User Interest Query - Tailored data for the query, retrieved from a user’s U social platform, e.g. Twitter, Facebook, etc.

Table 3.1: General Recommendation Process Notation.

For the algorithms presented in this chapter, the index i will be omitted for readability purposes. For instance, $m_i$ will be written simply as lowercase $m$.

$m_i$    ($i = 1, \dots, M$)   Movie i; a movie $m_i$ is represented by its semantically expanded Wikipedia extended plot
$M$                            Set of movies $m_i$ (M in total) to be searched
$qt_i$   ($i = 1, \dots, Q$)   Query term i that appears at least once in the UIQ
$QT$                           Set of terms $qt_i$ (Q in total) with $qt_i \in UIQ$
$r_i$    ($i = 1, \dots, K$)   Recommended movie i for the user U; a retrieved movie $r_i$ is represented by its IMDb details
$R$                            Set of retrieved movies $r_i$ (K in total) to be recommended to the user U
$t_i$    ($i = 1, \dots, N$)   Term i that occurs in at least one movie plot $m_i \in M$
$T$                            Set of terms $t_i$ (N in total) that occur in all movie plots in M

Table 3.2: Index Notation for the Recommendation Algorithms.

3.2

Principles and Fundamentals

In this section various definitions of fundamental terms related to IR and the recommendation process, which are used throughout this thesis, are introduced.

3.2.1

Formulating the Recommendation Process as an IR Task

Section 1.1.3 gives a general, informal explanation of how recommender systems work and the challenges they face. In this section, a more formal definition of this project’s recommendation process is introduced. It is possible to model the recommendation process with an IR task at its core. Following the notation used by Dominich (2008) in his book “The Modern Algebra of Information Retrieval” [47], we define a model for a recommender system. This model follows the definitions below, in order to provide users with recommendations.

“The user U has an information need IN, i.e. in our case he/she requires recommendations of M movies that suit his/her interests. Therefore, IN is the context (setting) of the recommendation. This information need is expressed in the form of a user interest query, provided by filling an online form (according to the syntax of some query language). The UIQ represents the user’s interests in the form of keywords’ list. In our case, these interests are retrieved from a user’s U Facebook profile upon the user’s permission. The recommender system then delivers the information (movies) entities as a result R in the form of a list, in response to the UIQ. Thus, the meaning of the term RP, which is defined as the entire recommendation process, may be formulated formally as the following mapping (See also Figure 3.1):”

Problem 1
1. Given: A set of movies M and a user U with an information need IN expressed as an interest query UIQ,
2. Extract: a set of recommended movies R for the user U,
3. In order to: succeed in the recommendation process RP by satisfying the user’s U information need IN

$$RP : (U, IN, UIQ, M) \rightarrow R$$ (p. 8)

Figure 3.1: Visualization of the Recommendation Process.

However, the information need cannot be fully expressed through the user interest query alone, since additional information about the user (such as location and culture, spoken languages, preferred movie genres, etc.) is required by the recommender system (See Figure 3.2). The importance of the additional information lies in the fact that it affects the user’s relevance judgement. In other words, the user’s opinion on whether a retrieved movie is relevant to his/her interests or not is affected by such additional information. Therefore, the additional information is an explicit (i.e., not expressed in UIQ) information I, which is specific to the user U and is provided by U through filtering options. For instance, a specific user U who is interested only in European Cinema (I) should be able to filter the results based on this criterion. If such a criterion is not available, the user is obliged to scroll through uninteresting results. The problem addressed is then more formally defined as follows:

Problem 2
1. Given: An implicit user interest query UIQ retrieved from a social media platform,
2. Extract: the explicit information that is provided by the user U through filtering options,
3. In order to: identify the user’s information need IN

$$IN = (UIQ, I)$$

Figure 3.2: The information need can be expressed as a combination of the UIQ and the explicit information I.
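The pairing of an implicit interest query with explicit filters can be illustrated with a small data structure. The sketch below is only an illustration under stated assumptions, not the platform's actual implementation: the class names `Movie` and `InformationNeed`, the `region` attribute and the helper `apply_explicit_filters` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Movie:
    title: str
    region: str  # hypothetical attribute, used only for this illustration


@dataclass
class InformationNeed:
    """IN = (UIQ, I): implicit interest query plus explicit information."""
    uiq: list[str]  # interest keywords, e.g. extracted from a social profile
    explicit: dict[str, str] = field(default_factory=dict)  # filter name -> required value


def apply_explicit_filters(movies: list[Movie], need: InformationNeed) -> list[Movie]:
    """Keep only the movies whose attributes satisfy every explicit filter in I."""
    return [
        m for m in movies
        if all(getattr(m, name, None) == value for name, value in need.explicit.items())
    ]


# Toy data (invented): a user interested only in European cinema.
need = InformationNeed(uiq=["space", "robots"], explicit={"region": "Europe"})
catalogue = [Movie("A", "Europe"), Movie("B", "North America"), Movie("C", "Europe")]
filtered = apply_explicit_filters(catalogue, need)
```

With no explicit filters, the function returns the catalogue unchanged, which corresponds to the case where the user has to scroll through all results.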

Based on the above, the recommendation process can be reformulated more strictly than in Problem 1 as follows: “RP refers to finding a relevance relationship ℜ between movies M and information need IN”:

Problem 3
1. Given: A set of movies M and the user’s information need IN as defined in Problem 2,
2. Extract: the relation between M and IN,
3. In order to: succeed in the recommendation process RP by satisfying the user’s information need IN

RP = [...]

[...] 0.05. Similarly, there is no statistically significant effect on the components of either the gender variable, with $F_{1,116} = 1.497$, $p = 0.224 > 0.05$ and $F_{1,116} = 1.564$, $p = 0.214 > 0.05$, or the educational background variable, with $F_{6,116} = 1.391$, $p = 0.225 > 0.05$ and $F_{6,116} = 1.683$, $p = 0.133 > 0.05$, respectively. Moreover, there is no statistically significant effect of the ethnicity variable on the components, with $F_{4,117} = 0.498$, $p = 0.737 > 0.05$ (1st component) and $F_{4,117} = 1.376$, $p = 0.247 > 0.05$ (2nd component). The same applies to the age variable, with $F_{2,117} = 0.568$, $p = 0.568 > 0.05$ and $F_{2,117} = 1.402$, $p = 0.251 > 0.05$. Regarding the user’s habits, there is no statistically significant effect on either component of the hours the user spends on Facebook, with $F_{3,117} = 1.665$, $p = 0.180 > 0.05$ and $F_{3,117} = 1.426$, $p = 0.240 > 0.05$, or of the user’s number of posts or likes, with $F_{5,117} = 0.630$, $p = 0.677 > 0.05$ and $F_{5,117} = 2.103$, $p = 0.072 > 0.05$ (posts), and $F_{4,117} = 0.191$, $p = 0.943 > 0.05$ and $F_{4,117} = 0.556$, $p = 0.695 > 0.05$ (likes). However, there is a statistically significant effect of the user’s movie watching frequency on both components. Specifically, for the 1st component $F_{3,117} = 2.971$, $p = 0.035 < 0.05$ and for the 2nd component $F_{3,117} = 4.105$, $p = 0.009 < 0.05$. In particular, users who watch movies on a daily basis have an especially high mean for the 2nd component (Interaction and Interface Adequacy), mean = 4.281, and for the 1st component, mean = 4.016, while users who watch movies on a yearly basis have a comparatively low mean for the 2nd component, mean = 2.750, and for the 1st component, mean = 2.597. Similarly, users who watch movies on a monthly basis have higher means, mean = 3.842 (2nd component) and mean = 3.572 (1st component), than the users who watch movies on a yearly basis. More information is visualized in Figure 6.1.
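To make the F values reported here easier to read, the sketch below shows how an F statistic compares between-group and within-group variance. It is a simplified one-way ANOVA for a single factor and a single dependent variable, not the multivariate analysis used in this evaluation; the function name `one_way_f` and the toy scores are invented for illustration, and no p-value is computed.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric scores."""
    k = len(groups)                       # number of groups (factor levels)
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Toy component scores for three hypothetical watching-frequency groups.
f_value, df1, df2 = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F means the group means differ by more than the spread inside the groups would suggest; whether that is significant depends on the F distribution with the two degrees of freedom.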
In the upcoming subsection, the users’ responses for the 1st and 2nd section of the questionnaire will be presented.

6.2.5

User’s Habits regarding Facebook & Movie Discovery

Regarding the users’ Facebook habits, the respondents were asked about the hours they spend on Facebook, their posting frequency, as well as the number of pages they have liked.


Figure 6.1: Multivariate Analysis of Variance Results.

The responses regarding the frequency of Facebook use were split. In other words, nearly the same percentage of responders said that they use Facebook less than 6 hours per week (29.1%), 6 to 10 hours per week (31.6%) and 11 to 20 hours per week (25.6%). Fewer responders spend more than 20 hours per week on Facebook (13.7%), as seen in Figure 6.2. Regarding the users’ posting frequency, the majority of the responders upload content less than weekly (41%) or never (24.8%). Fewer responders post several times a week (16.2%) or once per week (12%); daily uploads of content are quite rare (5.1%) and several uploads per day are even rarer (0.9%), as seen in Figure 6.3. Regarding the number of liked pages per responder, the majority of the responders have liked 0 to 100 pages (38.5%) or 101 to 300 pages (24.8%). 14.5% of the responders have liked 301 to 500 pages, while only 1 out of 10 responders (11.1%) has liked more than 500 pages. In addition, a few participants could not specify how many pages they have liked (11.1%), as seen in Figure 6.4.

Figure 6.2: The participants’ responses to the question “How many hours per week do you spend on Facebook?”

Figure 6.3: The participants’ responses to the question “How often do you post status updates on Facebook?”

Figure 6.4: The participants’ responses to the question “How many Facebook pages have you liked approximately?”

Regarding the users’ habits related to watching and discovering movies, the responders were asked about their movie watching frequency, as well as the sources they use to discover new movies. As regards the movie watching frequency, about 1 out of 2 users (54.7%) watches a few movies per month, while 1 out of 3 (30.8%) watches a few movies per week. The users who watch a few movies per year follow (8.5%), and then the users who watch movies daily (6%). There were no responders who do not watch movies at all, as can be seen in Figure 6.5. Concerning the question about sources of movie suggestions, responders could select multiple choices, thus the results show the percentage of users who use each specific source (and do not sum up to 100%). The striking majority of users trust family or friends for movie suggestions (91.4%), then IMDb (77.6%), Google Search (44%) and YouTube (39.7%). Significantly smaller percentages go to Netflix (11.2%), Rotten Tomatoes (8.6%) and the website https://www.icheckmovies.com/ (2.6%). 4.5% of the responders trust other sources. The above-mentioned numbers are visualized in Figure 6.6.

Figure 6.5: The participants’ responses to the question “How often do you watch movies?”

Figure 6.6: The participants’ responses to the question “What are your sources for movie suggestions?”


6.2.6

Demographics

Of the questionnaire’s 117 responders, 52 are men (44.4%), 64 are women (54.7%) and 1 responder identifies as other (0.9%) (See Figure 6.7). There are 3 different age groups (since there are no responders under 18 or above 45): 18-24 (67.5%), 25-34 (28.2%) and 35-44 (4.3%) (See Figure 6.8). Regarding the responders’ ethnicity, the striking majority of the participants identify themselves as Caucasian/White (82.9%), while significantly smaller percentages identify as Latino/Hispanic (2.6%), Middle Eastern (3.4%), mixed (3.4%) or other (7.7%) ethnicity (See Figure 6.9). Finally, as regards the responders’ educational background, 1 out of 2 participants have completed or currently attend Bachelor studies (55.6%), 23.9% Master studies, and 9.4% are High School graduates. Smaller percentages have completed or currently attend Doctorate studies (5.5%), Technical education (2.6%), Associate degrees (1.7%) or Professional degrees (1.7%) (See Figure 6.10).

Figure 6.7: The responders’ gender.

Figure 6.8: The responders’ age group.

Figure 6.9: The responders’ ethnicity.

Figure 6.10: The responders’ educational background.

6.2.7

Users’ opinion about Film Buddy’s characteristics

An overview of the participants’ responses to the questions of the 4th section of the questionnaire, before PCA (Section 6.2.1), is given in Figure 6.11, which will be discussed in detail in the upcoming Chapter. Moreover, it is worth mentioning that the mean value of “I interacted with the proposed interest keywords.” is µ = 3.38 with standard deviation σ = 1.267, while the mean value of “I interacted with the filtering options before the evaluation.” is µ = 3.23 with standard deviation σ = 1.575.

Figure 6.11: Mean values displayed as dots and standard deviations as error bars of the participants’ opinions.

6.2.8

Metrics Calculation

Up to this point, the results derived from the 1st, 2nd and 4th sections of the questionnaire (See Appendix D) have been discussed. This subsection presents the results of the 3rd section of the questionnaire, namely the users’ scores for the recommended movies. For evaluating these results and calculating the evaluation metrics, the programming language R was used. The metrics used will be discussed below.

6.2.8.1

Precision@k & MAP

When it comes to evaluating new web apps that utilize Information Retrieval techniques, metrics such as plain precision or recall are of little use, because queries return thousands of results and the majority of users navigate only through the top results⁹. Thus, it is impossible to determine all the relevant results. In contrast, the Precision@k metric takes into consideration only the top k documents and is calculated based on how many of these k documents are relevant. This metric works only with binary weights (1 if the document is relevant, 0 otherwise) and does not take into consideration the position of the document in the top-k result list. Moreover, for queries that return fewer than k results, even a perfect system would get a score less than 1¹⁰. Precision@k can take values from 0 to 1 and is given by the formula below (for a single query):

$$P@k = \frac{\sum_{i=1}^{k} rel_i}{k}$$

Where $rel_i \in \{0, 1\}$, depending on whether the retrieved document is relevant to the query or not. As far as “Film Buddy” is concerned, users evaluated 10 movies ($k = 10$) each, on a scale from 1 to 5. Thus, the scores needed to be discretized into binary weights. Based on previous research [10], scores below 3 are considered to indicate a non-relevant result, while scores greater than or equal to 3 are considered to indicate a relevant result. Version A’s precision is calculated at $P@10_A = 0.575$ with standard deviation $\sigma_A = 0.335$ and Version B’s at $P@10_B = 0.664$ with standard deviation $\sigma_B = 0.298$, thus the total precision is calculated at $P@10_{AB} = 0.621$ with $\sigma_{AB} = 0.318$. Superficially, without taking into account the 5-star ratings and the position of the items, the use of semantics seems to decrease the system’s total precision. The metric’s curve is depicted in Figure 6.12.

Figure 6.12: Precision@k metric curve.

Moreover, the mean of the average precision scores for each user/query (MAP) is calculated based on the formula below:

$$MAP = \frac{\sum_{q=1}^{Q} AveP(q)}{Q}$$

Where Q is the total number of queries and $AveP(q)$ is the average precision for the query q. It is calculated that $MAP_A = 0.582$ and $MAP_B = 0.666$.

9 The Value of Google Result Positioning - Chitika - Online Advertising. 2013. Retrieved from https://chitika.com/google-positioning-value.
10 Information retrieval - Wikipedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Information_retrieval.
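The two metrics of this subsection can be sketched in a few lines. The evaluation itself was carried out in R; the Python below is only an illustrative re-statement, with invented function names and toy scores. The relevance threshold of 3 follows the text, while the exact definition of AveP is an assumption (the common one: the mean of Precision@i over the positions i that hold a relevant item).

```python
def binarize(scores, threshold=3):
    """Map 1-5 user scores to binary relevance (relevant if score >= threshold)."""
    return [1 if s >= threshold else 0 for s in scores]


def precision_at_k(rels, k):
    """Fraction of the top-k results that are relevant; missing results count as misses."""
    return sum(rels[:k]) / k


def average_precision(rels):
    """Assumed AveP: mean of Precision@i over the positions i holding a relevant item."""
    hits, total = 0, 0.0
    for i, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def mean_average_precision(rels_per_query):
    """MAP: mean of the per-query (per-user) average precision values."""
    return sum(average_precision(r) for r in rels_per_query) / len(rels_per_query)


# Toy scores for one user and ten recommended movies (invented data).
rels = binarize([5, 2, 4, 3, 1, 1, 2, 5, 3, 1])
p10 = precision_at_k(rels, 10)
ap = average_precision(rels)
```

Unlike Precision@k, the assumed AveP rewards lists that place their relevant items near the top, which is why MAP and P@10 can differ for the same set of scores.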

6.2.8.2

Breese’s R-Score Utility

As discussed in Section 6.2.8.1, Precision@k takes into account neither the item’s position in the results’ list nor the exact score. In order to take this data into consideration, Breese’s R-Score Utility metric is used [57], which calculates a utility score for the results’ list of a search query. The premise of this metric is that a recommendation’s value decreases exponentially with its position in the retrieved recommendation list. This metric is given by the formula below:

$$R_u = \sum_{j} \frac{\max(r_{u,i_j} - d, 0)}{2^{\frac{j-1}{\alpha-1}}}$$

Where $i_j$ is the item that is placed in the j-th position in the list, $r_{u,i_j}$ is the score of user u for the item $i_j$ on a 1 to 5 scale, d is Breese’s “don’t care” threshold, which is set to 2 based on previous research [10], and α is the half-life parameter, which controls the exponential decline of the significance of positions in the ranked list and is set to 1.5. As far as “Film Buddy” is concerned, the mean value of this metric for Version A is $R_A = 1.515$ with standard deviation $\sigma_A = 1.107$ and maximum value $R_{max_A} = 3.747$, and for Version B it is $R_B = 1.643$ with $\sigma_B = 1.343$ and maximum value $R_{max_B} = 3.999$. In this case, even though Version B has a better mean value, its larger standard deviation indicates that Version B’s scores are more widely spread than Version A’s. The curve of this metric is depicted in Figure 6.13.

Figure 6.13: Breese’s R-Score Utility metric curve.
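As an illustration of how the R-Score behaves with the parameters stated above (d = 2, α = 1.5), the following sketch computes the utility of a single ranked list. The function name `r_score` and the example lists are invented; this is not the R code used in the evaluation.

```python
def r_score(scores, d=2.0, alpha=1.5):
    """Breese's R-Score for one ranked list of 1-5 user scores.

    Each item contributes max(score - d, 0), discounted by 2 ** ((j - 1) / (alpha - 1)),
    where j is the 1-based position of the item in the list.
    """
    return sum(
        max(r - d, 0) / 2 ** ((j - 1) / (alpha - 1))
        for j, r in enumerate(scores, start=1)
    )


best_possible = r_score([5] * 10)   # ten movies, all rated 5
front_loaded = r_score([5, 5, 1, 1])
back_loaded = r_score([1, 1, 5, 5])
```

With α = 1.5 the discount at position j is 4^(j-1), so a list of ten 5-star ratings scores 4(1 - 4^-10), just under 4, which agrees with the reported maximum of 3.999 for Version B. The same two high ratings are worth far more at the top of the list than at the bottom.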

6.2.8.3

Normalized Discounted Cumulative Gain

In order to take advantage of the items’ positions in the list and the 1-5 scale, a second metric is used, NDCG. NDCG is essentially a normalized DCG, namely DCG converted to the [0, 1] scale. The metric’s premise is that documents with high scores that appear lower in the list should be “penalized”, as the score is decreased logarithmically depending on the item’s position in the list¹¹. In other words:

$$DCG_p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i+1)} = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2(i+1)}$$

Where p is the position in the list up to which the gain is accumulated and $rel_i$ is the score of the item at position i. For the normalization the following formula is used:

$$nDCG_p = \frac{DCG_p}{IDCG_p}$$

Where $IDCG_p$ is the ideal DCG. “Film Buddy” considers the largest DCG of all queries to be the ideal DCG. As far as “Film Buddy” is concerned, the mean value of NDCG for Version A is $NDCG_A = 0.901$ with standard deviation $\sigma_A = 0.074$, for Version B it is $NDCG_B = 0.902$ with $\sigma_B = 0.077$, and in total it is $NDCG_{AB} = 0.9016$ with standard deviation $\sigma_{AB} = 0.076$. In other words, in this case there is no significant difference between the two versions. The metric’s curve is depicted in Figure 6.14.

Figure 6.14: NDCG metric curve.

11 Discounted cumulative gain - Wikipedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Discounted_cumulative_gain.
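The DCG and NDCG formulas can be sketched as follows. The function names and the toy score lists are invented for illustration. By default the sketch normalizes by the DCG of the same list sorted in descending order (the common convention); passing `ideal_dcg` explicitly allows the variant described in the text, where the largest DCG over all queries is taken as the ideal.

```python
import math


def dcg(rels):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) over 1-based positions i."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(rels, start=1))


def ndcg(rels, ideal_dcg=None):
    """DCG divided by an ideal DCG.

    If ideal_dcg is None, the ideal is the DCG of the same scores sorted in
    descending order. A caller may instead pass a shared ideal value, e.g. the
    largest DCG observed over all queries.
    """
    if ideal_dcg is None:
        ideal_dcg = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy data: the 1-5 scores two users gave to their ranked lists (invented).
queries = [[5, 4, 3], [3, 4, 5]]
shared_ideal = max(dcg(q) for q in queries)          # "largest DCG of all queries"
scores = [ndcg(q, ideal_dcg=shared_ideal) for q in queries]
```

Both toy lists contain the same scores, but the second one places its best-rated movie last and therefore receives a lower NDCG.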


6.2.9

Apriori & Association Rules

Apriori is an algorithm used for extracting association rules. Association rule learning is a machine learning technique for extracting useful conclusions out of large datasets¹². The Apriori method is usually utilized in Market Basket Analysis, but in “Film Buddy” it is utilized to examine the associations between the various variables. Variables must be discrete in order to use this method. As a result, the numerical scale (1 to 5) of the users’ opinion questions about “Film Buddy” (4th section of the questionnaire) is discretized into 3 levels:

• LOW: From 0 to 2
• MEDIUM: 3
• HIGH: From 4 to 5

The numerical scale of the 3 aforementioned metrics (Precision@k, Breese’s R-Score Utility and NDCG) is discretized as follows:

• LOW: From 0 to mean value
• HIGH: From mean value to maximum value

In addition, for readability purposes, abbreviations of the questions are used as variable names in the following figures; these abbreviations are given in Table 6.2. Association rules take the form MatchInterests = LOW → precision = LOW, which means that the left side (MatchInterests = LOW) leads to/is associated with the right side (precision = LOW). How significant each rule is depends on 3 metrics:

1. Support, which indicates how frequently the itemset appears in the dataset and takes values in [0, 1].
2. Confidence, which indicates how often the rule has been found to be true among the items of the dataset and takes values in [0, 1].
3. Lift, which indicates the degree of dependence between the two sides of the rule, or in other words whether the rule is coincidental or not. If lift = 1, the two sides are independent; if lift > 1, they are positively dependent and the rule is possibly useful; if lift < 1, they occur together less often than expected under independence.

12 Association rule learning - Wikipedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Association_rule_learning.
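The discretization levels and the three rule metrics can be made concrete with a small sketch. It does not implement the Apriori search itself; it only evaluates one given rule on a toy dataset. All function names and the four example rows are invented, and placing a value exactly equal to the mean in the LOW level is an assumption, since the boundary is not specified.

```python
def discretize_opinion(score):
    """Map a 1-5 opinion score to LOW / MEDIUM / HIGH."""
    if score <= 2:
        return "LOW"
    return "MEDIUM" if score == 3 else "HIGH"


def discretize_metric(value, mean):
    """Map a metric value to LOW / HIGH around the mean (mean itself -> LOW, assumed)."""
    return "LOW" if value <= mean else "HIGH"


def rule_metrics(rows, lhs, rhs):
    """Return (support, confidence, lift) of the rule lhs -> rhs.

    rows: list of dicts (variable name -> level); lhs, rhs: dicts of required levels.
    """
    def matches(row, items):
        return all(row.get(name) == level for name, level in items.items())

    n = len(rows)
    n_lhs = sum(matches(r, lhs) for r in rows)
    n_rhs = sum(matches(r, rhs) for r in rows)
    n_both = sum(matches(r, lhs) and matches(r, rhs) for r in rows)

    support = n_both / n                               # P(lhs and rhs)
    confidence = n_both / n_lhs if n_lhs else 0.0      # P(rhs | lhs)
    lift = confidence / (n_rhs / n) if n_rhs else 0.0  # P(rhs | lhs) / P(rhs)
    return support, confidence, lift


# Toy responses (invented), already discretized.
rows = [
    {"MatchInterests": "LOW", "precision": "LOW"},
    {"MatchInterests": "LOW", "precision": "LOW"},
    {"MatchInterests": "HIGH", "precision": "HIGH"},
    {"MatchInterests": "HIGH", "precision": "LOW"},
]
support, confidence, lift = rule_metrics(
    rows, {"MatchInterests": "LOW"}, {"precision": "LOW"}
)
```

In this toy dataset the rule holds for every row that matches its left side, but because the right side is common anyway (3 of 4 rows), the lift is only moderately above 1.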


Table 6.2: The associations between questions and variable names.

The following figures present various executions of the Apriori algorithm with various left side variables (LHS) and right side variables (RHS), different minimum and maximum rule lengths (minlen and maxlen), different support thresholds (supp) and different confidence thresholds (conf).

Figure 6.15: Apriori with LHS “InteractFilters” and RHS metrics with minlen = 2, maxlen = 5, supp = 0.2, conf = 0.5.


Figure 6.16: Apriori with LHS “InteractFilters” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.8.

Initially, Figures 6.15 and 6.16 show the effect that interacting with the filtering options has on the other variables. Figure 6.17 shows the effect that interacting with the interest keywords has on the other variables, while Figure 6.18 combines the aforementioned interactions. All figures indicate that interaction leads to positive results regarding the various metrics and the users’ opinions about the system.

Figure 6.17: Apriori with LHS “InteractKeywords” and RHS metrics with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.5.

Figure 6.19 shows the effect that the graphical user interface of “Film Buddy” has on the results, Figure 6.20 shows the effect that the sense of control over the results has on other variables, Figure 6.21 shows the effect that the diversity and novelty of the results has on other variables and Figure 6.22 shows the effect that providing explanations has on other variables. Similarly, the adequacy of the GUI, the high sense of control, the diversity of the results, as well as the explanations, lead to better metric values and more positive user opinions. Finally, Figure 6.23 depicts the variables that lead to high precision, while Figure 6.24 shows the variable that leads to users using the platform again.


Figure 6.18: Apriori with LHS “InteractKeywords” or “InteractFilters” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.1, conf = 0.8.

Figure 6.19: Apriori with LHS “AttractiveGUI” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.

Figure 6.20: Apriori with LHS “Control” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.


Figure 6.21: Apriori with LHS “Diverse” and RHS not specified with minlen = 2, maxlen = 5, supp = 0.25, conf = 0.85.

Figure 6.22: Apriori with LHS “Explanations” and RHS not specified with minlen = 2, maxlen = 4, supp = 0.1, conf = 0.65.

Figure 6.23: Apriori with LHS not specified and RHS metrics with minlen = 2, maxlen = 5, supp = 0.5, conf = 0.5.


Figure 6.24: Apriori with LHS not specified and RHS “UseAgain” with minlen = 2, maxlen = 4, supp = 0.2, conf = 1.

To sum up, this chapter describes the evaluation process of the “Film Buddy” platform, as well as the results of the evaluation. The upcoming chapter discusses the aforementioned results in order to reach useful conclusions.

Chapter 7

Conclusions & Future Work

In the previous chapter, the results of the “Film Buddy” qualitative survey were presented. This chapter further discusses the aforementioned findings and the possibilities for future work and further improvement of this work’s results and techniques.

7.1

Conclusions

As mentioned in Section 1.2.1, “Film Buddy” addresses specific user-side and implementation demands, as well as specific limitations of recommender systems. Below, it will be examined whether the system managed to tackle these demands and limitations or not and to what degree, based on the contributions defined in Section 1.3, which can be found below.

7.1.1

Social media collective profiling

Facebook is a deciding factor of “Film Buddy”, as discussed in Chapter 4. Thus, this work managed to successfully tackle cold-start problems with zero input from the users themselves. Regarding the users’ opinions, as shown in Figure 6.11, the participants appear to be neutral about whether the movies recommended match their interests, with µ = 3.11, while they are also neutral about whether the system takes their personal context into consideration, with µ = 3.38. A reason for this neutrality may be the abundance of languages used in Facebook posts. “Film Buddy” is designed to work with English input, but English is not the mother tongue of the majority of the participants, who may publish content in other languages. This hypothesis has not been examined in depth in this work, but it may be the main reason why the system faces difficulties in capturing the users’ preferences accurately.


7.1.2

Semantics aware User and Item Profiling

Both of the platform’s versions utilize semantics when it comes to item-profiling, which may affect the results that follow. The difference lies in the use (Version A) or not (Version B) of semantics in user-profiling, as mentioned in Section 6.1.3. Based on the PCA and the components’ independence, the version does not have a statistically significant effect on either of the components (See Section 6.2.3), namely it affects neither the perceived quality of the results nor the interface and interaction adequacy. Regarding the metrics, superficially the use of semantics appears to decrease the system’s Precision@k by 9% (See Section 6.2.8.1). However, if we take into consideration the items’ ranking and the 1 to 5 scale, this gap narrows to a decrease of 3.5% for Breese’s metric (See Section 6.2.8.2). The versions’ curves, though, do not provide clear results, as Version B has more high values of the metric than Version A, but also more low values at the same time. Things become clearer if we take into account NDCG (See Section 6.2.8.3), which confirms the aforementioned T-Test, as it does not indicate any significant difference between the two versions, as shown in Figure 6.14.

7.1.3 Interactive implementation

This contribution is validated by the PCA, and specifically by the 2nd component, “Interaction and Interface Adequacy”, as sense of control, explanations and interface attractiveness are directly associated, as shown in Table 6.1. However, based on Pearson’s Linear Correlation Coefficient (See Section 6.2.2), these characteristics also directly affect the perceived quality of recommended items (1st component). Moreover, this contribution is shown to be affected by the film-watching frequency, as discussed in Section 6.2.4, since users who rarely watch movies give lower ratings to the interaction and interface adequacy. The reasons why this happens need to be examined. It is worth mentioning that the features of this contribution are among the highest ranked by the users, with µExplanations = 3.79, µAttractiveGUI = 4.15, µControl = 3.50 and µUnderstanding = 3.48. Finally, the Association Rules (See Section 6.2.9) validate the significance of this contribution. Based on Figure 6.15, low or no interaction with filtering options leads to low list utility (Breese), while high interaction leads to higher values of the three metrics. Similarly, based on Figure 6.16, the use of filtering options contributes to finding the ideal movie, user satisfaction and platform reuse, while it enables users to express their preferences. Likewise, high interaction with interest keywords leads to higher values for all metrics, while low interaction with keywords worsens the metrics’ values, based on Figure 6.17. Figure 6.19 shows that users who give positive feedback on the interface adequacy have higher user satisfaction, as well as higher understanding due to explanations. As shown in Figure 6.23, better user satisfaction leads to higher precision. In addition, for participants who have a low sense of control over the results, understanding, satisfaction and personal context consideration are also low, while the contrary applies to the participants who have a high sense of control over the results (See Figure 6.20). Finally, explanations lead to better understanding, reuse and satisfaction, as shown in Figure 6.22, which in turn lead to higher precision (See Figure 6.23). The value of interaction is verified by the aforementioned results. The fact that the users of “Film Buddy” have not taken full advantage of the interaction options (µInteractFilters = 3.23 and µInteractKeywords = 3.38) may have had a negative impact on the produced metrics.
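As an illustration of how a rule such as “high interaction with filters leads to high list utility” can be quantified, the sketch below computes the standard support, confidence and lift of a single association rule. The function name, the item labels and the four toy records are hypothetical and serve only to show the calculation; they are not the survey data, and the rule-quality measures actually used in Section 6.2.9 may differ.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Each transaction is a collection of item labels describing one participant.
    """
    records = [set(t) for t in transactions]
    if not records:
        raise ValueError("at least one transaction is required")
    a, c = set(antecedent), set(consequent)
    n = len(records)
    n_a = sum(a <= t for t in records)         # records containing the antecedent
    n_c = sum(c <= t for t in records)         # records containing the consequent
    n_ac = sum((a | c) <= t for t in records)  # records containing both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical toy records (NOT the survey data from this thesis).
toy_records = [
    {"filters=high", "utility=high"},
    {"filters=high", "utility=high"},
    {"filters=low", "utility=low"},
    {"filters=high", "utility=low"},
]
support, confidence, lift = rule_metrics(
    toy_records, {"filters=high"}, {"utility=high"}
)
```

A lift above 1 indicates that the consequent occurs more often together with the antecedent than would be expected if the two were independent, which is the sense in which a rule can be said to “lead to” an outcome.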

7.1.4 Serendipitous Results

Based on the survey, “Film Buddy” manages to overcome the novelty problems of traditional recommender systems, as the features of this contribution get mainly positive ratings from users, with µNovelInteresting = 3.40 and µDiverse = 3.80 (See Figure 6.11). Based on the Association Rules and Figures 6.21 and 6.23, the results’ variety and novelty lead to higher user satisfaction and user confidence regarding the recommendations, as well as higher precision and probability of platform reuse.

7.2 Suggestions for Future Work

This work has brought up issues that need further examination, and future researchers are encouraged to work on them. Some of them are described below:

• Alternative Datasets: The system’s performance has been examined using a movie dataset. However, the design and implementation of “Film Buddy” allow recommendations to be produced from other datasets as well, such as literature, travel destinations, etc. Utilizing other datasets will provide a more exhaustive view of the system’s performance.

• Language Independence: As discussed in Section 7.1.1, the performance of “Film Buddy” is possibly limited by non-English input fed into the system. It is encouraged to examine ways of utilizing social media content regardless of its language.

• Alternative Semantic Approaches: “Film Buddy” uses semantic dictionaries to integrate semantics into the recommendation process. The use of alternative semantic networks (apart from WordNet), such as Gellish Models¹, to examine their effect on the system’s performance, is suggested.

• Intuitive Design: As mentioned in Section 7.1.3, a great percentage of users interacted neither with the filtering options nor with the interest keywords, which are a crucial part of “Film Buddy”, although both elements were quite visible on the online platform. It is worth examining a more intuitive design, which may lead the user to unconsciously interact with the platform. Moreover, the reasons why the non-movie-fan users give lower ratings when it comes to interface adequacy (See Section 7.1.3) need to be examined.

• Alternative Algorithms: “Film Buddy” uses the default Solr algorithms both for indexing and querying. There may be interest in examining the effect an alternative Solr algorithm might have on the system’s performance.

¹ Gellish - Wikipedia. 2017. Retrieved from https://en.wikipedia.org/wiki/Gellish.

Appendix A

Mathematical Notation

This appendix introduces the general mathematical notation used throughout this thesis.

A.1 General Notation