An Experiment in Social Search*

Jill Freyne and Barry Smyth

Smart Media Institute, Department of Computer Science, University College Dublin, Belfield, Dublin 4, Ireland
{Jill.Freyne, Barry.Smyth}@ucd.ie

Abstract. Social search is an approach to Web search that attempts to offer communities of like-minded individuals more targeted search services, based on the search behaviour of their peers, bringing together ideas from Web search, social networking and personalization. In this paper we describe the I-SPY architecture for social search and present the results of a recent live-user evaluation that highlight the potential benefits of our approach in a realistic search setting.

* The support of the Informatics Research Initiative of Enterprise Ireland is gratefully acknowledged.

1 Introduction

The brief history of Web search to date is characterised by a variety of significant technological developments that have seen Web search engines evolve beyond their information retrieval (IR) origins. For example, meta-search [1, 2] was an early attempt to combine the capabilities of many underlying search engines in order to improve overall coverage and relevance ranking. More recently, search engines such as Google [3] have argued for the need to consider factors such as link-connectivity information (in addition to the more traditional IR term-based factors) as a way to guide search towards more informative documents; see also [4].

Currently most of the main Web search engines adopt a “one size fits all” approach to search: two users with the same query receive the same result-list, regardless of their preferences or context. While there is broad agreement that this is far from optimal, practical personalization techniques that can cope with Internet-scale search tasks, and that are likely to be acceptable to today’s privacy-conscious users, have been slow to emerge. That said, a number of researchers have looked at the issue of context-sensitive search, where the search engine draws additional contextual information from the searcher [5].

One area of research that may offer the right mix of personalization, while at the same time protecting user privacy, comes from recent work that has focused on the intersection between social networking and Web search. Social networking applications such as Friendster (www.friendster.com) or Orkut (www.orkut.com) allow users to create, maintain and participate in online communities, and provide a range of applications and services to help these communities socialise more effectively on-line and off-line. Inevitably, the members of a given community will all share certain characteristics and interests; for example, the members of a “caving and potholing” community will obviously all have an interest in caving and potholing activities, but they are also likely to share a range of peripheral preferences too: a general interest in outdoor activities, regular travel, and so on. The point is that these well-defined communities of like-minded individuals provide a backdrop for social search applications: search engines that are sensitive to the needs and preferences of a specific community of users, operating within a well-defined topical domain.

Indeed social search is an increasingly important topic for the search industry and many commentators have predicted that it will become the “next big thing”. In this paper we will describe our own model of social search, called I-SPY, but other similar services such as Eurekster (www.eurekster.com) are rapidly emerging to try to take advantage of this social dimension to search. Indeed it is worth speculating about the intentions of Google in this regard, especially since they have launched their own social networking service (www.orkut.com) but have yet to announce how they plan to integrate it with the Google search engine.

I-SPY operates as a post-processing engine for traditional search engines, or as a meta-search service over a number of underlying search engines. It allows groups of users to establish online search communities and anonymously records their interaction patterns as they search. These interaction patterns (the queries provided and the resulting selections) allow I-SPY to adaptively re-rank the result-lists of future queries in a way that is sensitive to the preferences of a given community of users.
For example, over time I-SPY will learn that a community of AI researchers is more likely to be interested in the work of the Berkeley professor than the basketball star when they search under the vague query, “Michael Jordan”; I-SPY actively promotes such relevant sites for this community of users.

In the following sections we will describe the operation of I-SPY in detail (see Section 3) and present the results of a live-user study over a particular community of users (Section 4). First we will outline related work with a particular focus on research that has sought to exploit the interaction patterns of users to provide other types of adaptive, collaborative information services on the Web. We will look, in particular, at collaborative browsing services and their relationship to our own proposal for collaborative search.

2 Related Work

The research described in this paper, the I-SPY system and its collaborative-ranking features, touches on a number of areas of related research. One legitimate view of our work is that I-SPY is focused on disambiguating vague queries, that it provides a form of context-sensitive search. As such we could compare our work to other attempts at developing context-sensitive search techniques, differentiating between explicit [6] and implicit [7] forms of context or between external [8] and local [9] sources of context. Alternatively, one might view I-SPY’s approach to search as a way of coping with disparities between the query-space and indexing-space and cite related work in which researchers have attempted to leverage knowledge about the query-space to solve this problem and improve search performance [10–12]. Indeed, these particular perspectives are considered in [13, 14].

However, as indicated in the introduction, we will take a different standpoint, one that looks at I-SPY from an interaction perspective, one that highlights the social and collaborative aspects of information seeking activities. To begin with it is worth highlighting the work of Hill and Hollan [15] in which they introduced the concept of computational wear in the context of digital information. Real world objects are history-rich (a well-read book naturally falls open on a popular page, important chapters become dog-eared with regular use, and paragraphs may be underlined or annotated with margin notes) and this history often helps us to make better use of these objects. In contrast, digital information is usually history-poor (Web pages do not change as surfers view them, for example) and this lack of history can limit the manner in which we can use and benefit from digital information. Hill and Hollan describe ways in which interaction histories can be incorporated into digital objects. Specifically they describe the Edit Wear and Read Wear applications which allow the user to leave wear marks on a document as they edit or read it. These wear marks are represented visually as part of the document’s scroll-bar and help the user to quickly appreciate where the main focus of editing or reading activities has been.

The Footprints project [16], inspired by the work of Hill and Hollan, applied the idea of computational wear and interaction histories to the Web, building a number of tools to help users navigate through sites. For example, one tool graphically represented the traffic through a Web site, allowing surfers to appreciate the links that are being followed regularly by others. Another Footprints tool visualises the paths (sequences of links) followed by individual surfers.
These tools represent the interaction histories of previous users, and they allow the current user to adapt their browsing pattern as appropriate. Rather than surfing blind, the current user can be guided by the actions of others. The Footprints tools facilitate a form of collaborative browsing, an idea that is closely related to the work of [17]. The I-SPY philosophy is related to the above. I-SPY also attempts to exploit interaction history or computational wear, as we shall discuss in the next section, but rather than focusing on editing or reading documents, or browsing Web pages, I-SPY focuses on search. It records the interactions of searchers with result-lists and uses these interactions to improve result-ranking and helps the searcher to understand which results have been found relevant in the past. Thus search becomes a more social activity. New searchers benefit from the searches carried out by past users. They see the results that past users have liked, and they benefit from an ordering of results that is sensitive to the degree to which these past results have been preferred.

3 An Architecture for Social Search

The basic architecture for our model of social search, implemented by I-SPY (ispy.ucd.ie), is presented in Figure 1. On the face of it, I-SPY is a meta-search engine, adapting a user query, $q_T$, for a series of underlying search engines, $S_1, \ldots, S_n$, and merging each of their result-lists, $R_1, \ldots, R_n$, to produce a final result-list, $R$, that is returned to the user. The uniqueness of I-SPY stems from a combination of important features: (1) its capturing of interaction histories; (2) the use of these interaction histories to re-rank search results; and (3) its ability to separate the interaction histories of individual search communities (social groups) so that this re-ranking can take place in a community-sensitive manner.

Fig. 1. The I-SPY system architecture for social, collaborative search.
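
The meta-search step shown in Figure 1 can be sketched in a few lines. The paper does not say how I-SPY merges the underlying result-lists, so the reciprocal-rank scoring used here is purely an illustrative assumption, as are the function name `merge_result_lists` and the example page names.

```python
from collections import defaultdict


def merge_result_lists(result_lists):
    """Merge ranked result-lists R_1..R_n into a single list R.

    Assumption: a page scores 1/rank in every list that returns it and
    pages are ordered by summed score. This is a stand-in for whatever
    merging function I-SPY actually uses.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, page in enumerate(results, start=1):
            scores[page] += 1.0 / rank
    # Descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda page: (-scores[page], page))


# Hypothetical result-lists from two underlying engines for one query.
r1 = ["a.example", "b.example", "c.example"]
r2 = ["b.example", "a.example", "d.example"]
merged = merge_result_lists([r1, r2])
print(merged)
```

Pages returned by several engines accumulate score and rise towards the top, which is one simple way to combine coverage from multiple sources.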

3.1 Interaction Histories

Each time a user selects a page, $p_j$, from a result-list generated by I-SPY in response to a query, $q_i$, a record of this selection is noted by incrementing a counter in I-SPY’s so-called hit-matrix, $H$; see Figure 1. Thus, the value of $H_{ij}$ represents the number of times that $p_j$ has been selected for query $q_i$. The hit-matrix represents the interaction history (in the sense of [16]) relative to a set of queries and their relevant results.

3.2 Collaborative Ranking

The hit-matrix is a record of what past searchers have viewed as relevant to their queries and I-SPY takes advantage of this information to re-rank some of the search results that are returned from the meta-search. If any of the meta-search results, for a query $q_T$, have non-zero hit-values in the hit-matrix row that corresponds to $q_T$, then this provides further evidence (the number of past selections) that these results are relevant to $q_T$. Furthermore the degree of relevance can be estimated as the proportion of selections that a given page has received for this query; see Equation 1.

$$\mathrm{Relevance}(p_j, q_T) = \frac{H_{Tj}}{\sum_{\forall j} H_{Tj}} \qquad (1)$$

This relevance score can be used to rank these previously selected results ahead of the other results returned by the meta-search. So the first results presented to the user are those that have been previously selected for their query, ordered according to their past selection probability (see Figure 2(a)). The remainder of the results are ordered according to their standard meta-search score. The hope is that the promoted results will turn out to be more relevant to the searchers, helping them to locate their target information more efficiently; we will test this hypothesis directly in Section 4.

3.3 Search Communities

I-SPY is designed to support multiple hit-matrices, each for a different community of searchers. For example, consider one group of searchers interested in motoring information and another interested in wildlife. The query “jaguar” has very different meanings for each of these communities: the former are looking for information about the high-performance car while the latter are interested in the big cat variety. Ordinarily a standard search engine or meta-search engine would respond to each community in the same way. However, with I-SPY, the likelihood is that the previous interactions of the motoring community will have produced a hit-matrix that prioritises car-related sites for the “jaguar” query, while the wildlife hit-matrix will prioritise wildlife-related sites; see Figure 2(a).

Thus, I-SPY’s collaborative ranking is designed to operate for well-defined communities of searchers. To facilitate this I-SPY allows individual users or user-groups to configure their own search service by filling out an online form; see Figure 2(b). The result is a unique hit-matrix that is linked to a search community and a URL that contains a version of I-SPY whose queries are associated with this new hit-matrix. Alternatively, the searchers can add a piece of JavaScript to their site to include a search-box that is linked to their own hit-matrix.
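
The mechanics of Sections 3.1 to 3.3 can be illustrated with a small sketch: one hit-matrix per community, the relevance score of Equation 1, and promotion of previously selected results ahead of the remaining meta-search results. The class and method names (`Community`, `record_selection`, `rerank`) and the example page names are assumptions made for illustration, not I-SPY's actual implementation.

```python
from collections import defaultdict


class Community:
    """A search community with its own hit-matrix H[query][page]."""

    def __init__(self, name):
        self.name = name
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_selection(self, query, page):
        # Each selection of `page` for `query` increments one counter.
        self.hits[query][page] += 1

    def relevance(self, page, query):
        # Equation 1: the page's share of all selections for this query.
        row = self.hits.get(query, {})
        total = sum(row.values())
        return row.get(page, 0) / total if total else 0.0

    def rerank(self, query, meta_results):
        # Promote results with non-zero hit values, ordered by relevance;
        # the remainder keep their original meta-search order.
        promoted = [p for p in meta_results if self.relevance(p, query) > 0]
        promoted.sort(key=lambda p: -self.relevance(p, query))
        rest = [p for p in meta_results if self.relevance(p, query) == 0]
        return promoted + rest


motoring = Community("motoring")
wildlife = Community("wildlife")
for _ in range(3):
    motoring.record_selection("jaguar", "jaguar-cars.example")
motoring.record_selection("jaguar", "car-reviews.example")
wildlife.record_selection("jaguar", "big-cats.example")

# The same hypothetical meta-search list is re-ranked per community.
meta = ["big-cats.example", "car-reviews.example", "jaguar-cars.example"]
motoring_list = motoring.rerank("jaguar", meta)
wildlife_list = wildlife.rerank("jaguar", meta)
```

Because each community owns a separate hit-matrix, the same query and the same underlying results produce different orderings for the two groups, with no per-user profile involved.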

4 Evaluation

Collaborative ranking on its own is unlikely to be successful in a general-purpose search context because the alternative meanings of vague queries are likely to be merged in the hit-matrix; for example, we would expect to find selections for car sites and wildlife sites recorded for the “jaguar” query in a general-purpose search engine. The secret of I-SPY is that this problem is largely eliminated once we allow for the separation of interaction histories for different communities of users. And we believe that this particular combination of collaborative ranking and community-based search will pay dividends when it comes to improving overall search performance. To test this hypothesis we report on a live trial of I-SPY that was carried out in late 2003 on a community of third-year computer science students.

Fig. 2. (a) A result-list for a motoring community for the query “jaguar”, the eyes denote promoted results; (b) I-SPY’s on-line configuration form facilitates the creation of new community-specific search services.

4.1 Setup

A total of 92 students were asked to answer 25 general knowledge computer science questions. They were directed to use I-SPY (configured to use Google, HotBot, AllTheWeb, and Teoma) to source their answers and they were divided into two groups of 45 and 47 students, respectively. The first group served as a training group. They did not benefit from I-SPY’s collaborative ranking technique (the results were ranked using the standard meta-search ranking function only) but their queries and selections were used to construct a hit-matrix for the second group of users (the test group) who did benefit from I-SPY’s collaborative ranking. Each user group was allotted 45 minutes to complete the questions.

Overall more than 70% of the training group’s queries were repeated by the test group, and these repeated queries were used by approximately 65% of the 92 students. This high repeat rate suggests a close correspondence between the query formation capabilities of each group of users. However, it is worth pointing out that the test users tended to use slightly shorter queries (fewer terms) than the training users; the average query length for the test group was 2.16, compared to 2.57 for the training group. All other things being equal this might suggest that the training group were better able to produce focused queries and that the training users might be at a slight advantage when it comes to their inherent search abilities.

4.2 Selection Behaviour

Our first question concerns the selection behaviour of the training and test users. If I-SPY’s collaborative ranking technique is working to promote more relevant results to higher positions within result-lists then we should find that the test users are selecting results nearer to the top of result-lists, when compared to the selections of the training users. Figure 3(a) plots the position of result selections during each search session for the training and test users; for each search session we plot the median position of the selected results. Although the position of results varies considerably from search session to search session, as expected, there is a marked difference between the position of results for the training and test users. The test users are selecting results that appear nearer to the top of result-lists (lower position values) when compared to the training users, which we attribute to the improved position of relevant results for the test users as a consequence of I-SPY’s collaborative ranking technique.

Fig. 3. (a) Median positions of selected results; (b) Mean positions of all result selections between training and test groups; (c) Mean number of questions attempted/correctly answered per student; (d) Percentage of students that achieve a given test-score.

To provide a clearer picture of the benefits of I-SPY’s collaborative ranking we can summarise the results of Figure 3(a) by computing the mean position values of the selections of the training group and comparing these to the mean positions of the selections of the test group; see Figure 3(b). The test users selected results with an average position of 2.24 whereas the training users selected results with an average position of 4.26: a 47% reduction in the position of selected results for the test users compared to the training users, and a strong indicator of the benefit of I-SPY’s collaborative ranking function. These results indicate that the test users were more likely to select results from higher positions within the result-lists. We argue that this is because these users were able to benefit from the interaction histories of their peers within the training group because I-SPY’s collaborative ranking technique was actively promoting the selections of these peers. The hope is that these community preferences will turn out to be useful results, when it comes to the students answering their questions, and we will consider this issue in the next section.

4.3 Search Accuracy

Of course finding out that the test users selected results from higher positions in the result-lists, compared to the training users, does not really tell us anything about the usefulness of these results. It may be, for example, that the selections of the training users were not so relevant to the task at hand and that I-SPY’s eagerness to promote these misleading results simply encouraged the test users to follow false leads, and thus hampered their ability to answer the test questions. If this is the case then our experiment in social search will have failed. If, however, the promoted results were more likely to be relevant, then we should find that the test users were able to answer questions more efficiently. We should find that the test users attempt more questions than the training users and that they answer more of these attempted questions correctly. In short, we should find that the test students achieve higher overall test-scores.

Figure 3(c) compares the training and test users in terms of the mean number of questions attempted per student and the mean number of correctly answered questions per student. The results indicate a clear advantage for the test users: they attempted more questions on average than the training group (13.93 versus 9.93) and they answered more of these questions correctly (11.54 versus 7.58). Both of these differences are significant at the 0.01 level. Indeed it is revealing to note that the test group answered more questions correctly (11.54) than the training group even managed to attempt (9.93). To look at this another way: of the 9.93 questions attempted, on average, by the training group, only 76% (7.58) were answered correctly. The test group not only attempted 40% more questions but also answered a higher proportion of these attempted questions correctly (11.54 out of 13.93, or approximately 83%).

Figure 3(d) plots the percentage of students in each group that achieved different overall test-scores; the test score is the overall percentage of the 25 questions that a student has answered correctly. These scores clarify the gap that exists between the performance of the training and test groups. For example, more than twice as many test students (70% of the test group, or 33 students) achieved a pass grade of 40% compared to the training students (30%, or 13 students). And while 56% of the test group achieved an honours grade (55%), none of the training group managed to score more than 52%, and 5 of the test students achieved a distinction (70% or greater).

4.4 Summary

To begin with in this experiment it appeared that, if anything, the training users might be at a slight advantage owing to their longer queries. However, the results clearly indicate superior search performance from the test group: they attempted more questions and they answered more of these questions correctly. We attribute this advantage to the model of social search implemented by I-SPY. And although this is a small-scale experiment, with 92 academically related students on a limited search task, we believe that this positive outcome speaks to the potential of social search as a valuable approach to more focused search.
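
As a quick sanity check, the headline ratios of the evaluation can be re-derived from the group means reported in Sections 4.2 and 4.3. The variable names below are ours, and the calculation uses only the rounded means quoted in the text, so the results are approximate.

```python
# Mean position of selected results (Section 4.2).
train_pos, test_pos = 4.26, 2.24
position_reduction = (train_pos - test_pos) / train_pos

# Mean questions attempted and answered correctly per student (Section 4.3).
train_attempted, train_correct = 9.93, 7.58
test_attempted, test_correct = 13.93, 11.54

train_accuracy = train_correct / train_attempted
test_accuracy = test_correct / test_attempted
extra_attempts = test_attempted / train_attempted - 1

for label, value in [
    ("reduction in mean selected position", position_reduction),
    ("training accuracy on attempted questions", train_accuracy),
    ("test accuracy on attempted questions", test_accuracy),
    ("additional questions attempted by test group", extra_attempts),
]:
    print(f"{label}: {value:.1%}")
```

The ratios agree with the percentages quoted in the text to within about one percentage point, which is the precision the rounded means allow.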

5 Conclusions

In this paper we have argued for the benefits of social search and we have described a particular approach that integrates ideas from social networking and adaptive information retrieval to provide a personalized search service to well-defined communities of like-minded individuals. The I-SPY system has delivered significant performance advantages in live search scenarios, with communities able to locate the right information faster and more reliably by leveraging the past search behaviour of their peers.

I-SPY delivers this level of personalization in a relatively anonymous fashion. Individual community members are not tracked, nor are they identified. Instead, personalization operates at the level of the community rather than the individual. We believe that this level of personalization strikes the right balance between accuracy and privacy: the community-based ranking of results is sufficiently accurate for the individual user to benefit from the social search, but at the same time they can be confident that their privacy and identity have been protected.

In recent work we have considered a number of issues arising out of this model of social search. We have proposed the use of various strategies to protect against fraudulent search activity. For example, by discounting or filtering sequences of consecutive result selections we can frustrate users who attempt to drive the promotion of certain result pages by making repeated selections. Similarly we have proposed the use of decay models to reduce hit-values over time in order to reduce the natural bias that operates in favour of older pages; older pages will have had more of an opportunity to attract hits and may therefore be promoted above newer but more relevant pages.

Finally, we have recently explored the possibility of leveraging the interaction histories of queries that are similar to the target query (right now I-SPY operates on the basis of exact matches between the target query and the entries of the hit-matrix) and our initial results show that such an extension has the potential to improve the performance of I-SPY still further by increasing its precision and recall characteristics [18].
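
Two of the extensions mentioned above, decaying hit-values over time and borrowing evidence from similar queries, might be prototyped along the following lines. The paper gives no formulas for either, so the exponential half-life decay, the Jaccard term-overlap similarity, and all names and parameter values here are assumptions chosen only to make the ideas concrete.

```python
def decayed_hits(hits, age_days, half_life_days=30.0):
    """Discount a hit-value so that it halves every `half_life_days`."""
    return hits * 0.5 ** (age_days / half_life_days)


def query_similarity(q1, q2):
    """Jaccard overlap between the term sets of two queries."""
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    return len(t1 & t2) / len(t1 | t2) if t1 | t2 else 0.0


def similar_query_relevance(page, target, hit_matrix):
    """Similarity-weighted average of per-query selection shares.

    `hit_matrix` maps query -> {page: hits}. Queries that share no
    terms with `target` contribute nothing.
    """
    weighted, weights = 0.0, 0.0
    for query, row in hit_matrix.items():
        sim = query_similarity(target, query)
        total = sum(row.values())
        if sim > 0 and total > 0:
            weighted += sim * row.get(page, 0) / total
            weights += sim
    return weighted / weights if weights else 0.0


# Hypothetical hit-matrix with no exact entry for the query "jaguar".
hit_matrix = {
    "jaguar cars": {"jaguar-cars.example": 4},
    "jaguar cat": {"big-cats.example": 2},
}
score = similar_query_relevance("jaguar-cars.example", "jaguar", hit_matrix)
```

With exact matching the query “jaguar” would find no history in this hit-matrix; under the assumed similarity weighting it inherits evidence from both related queries.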

References

1. Selberg, E., Etzioni, O.: The Meta-Crawler Architecture for Resource Aggregation on the Web. IEEE Expert Jan-Feb (1997) 11–14
2. Dreilinger, D., Howe, A.: Experiences with Selecting Search Engines Using Meta Search. ACM Transactions on Information Systems 15(3) (1997) 195–222
3. Brin, S., Page, L.: The Anatomy of a Large-Scale Web Search Engine. In: Proceedings of the Seventh International World-Wide Web Conference. (1998)
4. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. In: Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms. (1998) 668–677
5. Lawrence, S.: Context in Web Search. IEEE Data Engineering Bulletin 23(3) (2000) 25–32
6. Glover, E., Lawrence, S., Gordon, M.D., Birmingham, W.P., Giles, C.L.: Web Search - Your Way. Communications of the ACM (2000)
7. Rhodes, B.J., Starner, T.: Remembrance Agent: A Continuously Running Automated Information Retrieval System. In: Proceedings of the First International Conference on the Practical Applications of Intelligent Agents and Multi-Agent Technologies. (1996) 487–495
8. Budzik, J., Hammond, K.: User Interactions with Everyday Applications as Context for Just-In-Time Information Access. In: Proceedings International Conference on Intelligent User Interfaces, ACM Press (2000)
9. Bharat, K.: SearchPad: Explicit Capture of Search Context to Support Web Search. In: Proceedings of the Ninth International World-Wide Web Conference. (2000)
10. Raghavan, V.V., Sever, H.: On the reuse of past optimal queries. In: Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, ACM Press (1995) 344–350
11. Fitzpatrick, L., Dent, M.: Automatic feedback using past queries: social searching? In: Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval, ACM Press (1997) 306–313
12. Glance, N.S.: Community search assistant. In: Proceedings of the 6th international conference on Intelligent user interfaces, ACM Press (2001) 91–96
13. Freyne, J., Smyth, B., Coyle, M., Briggs, P., Balfe, E.: Further experiments in collaborative ranking in community-based web search. AI Review: An International Science and Engineering Journal (In Press) (2004)
14. Freyne, J., Smyth, B.: Query based indexing in collaborative search. In: Submitted to 27th Annual International ACM SIGIR Conference. (2004)
15. Hill, W., Hollan, J., Wroblewski, D., McCandless, T.: Edit Wear and Read Wear. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM Press (1992) 3–9
16. Wexelblat, A., Maes, P.: Footprints: History-Rich Web Browsing. In: Proceedings of the Third International Conference on Computer-Assisted Information Retrieval. (1997) Montreal, Quebec, Canada.
17. Twidale, M., Nichols, D., Paice, C.: Browsing is a collaborative process. Information Processing and Management 33(6) (1997) 761–83
18. Balfe, E., Smyth, B.: Case based collaborative web search. In: Submitted to the 7th European Conference on Case Based Reasoning. (2004)