Measuring Users' Privacy Payoff Using Intelligent Agents

Abdulsalam Yassine, Shervin Shirmohammadi
DISCOVER Laboratory, School of Information Technology and Engineering
University of Ottawa, Ottawa, Canada
[email protected], [email protected]

Abstract—As many people now take advantage of online services, the value of the private data they own has emerged as a problem of fundamental concern. This paper takes the position that individuals are entitled to secure control over their personal information, disclosing it as part of a transaction only when they are fairly compensated. To make this a concrete possibility, users require technical instruments to measure their privacy payoff and track the use of their private data. In this paper, we propose an intelligent agent-based framework for privacy payoff measurement and negotiation. Intelligent agents in our system work collaboratively on behalf of users to maximize their benefit and protect the use of their private data. The overall framework is described, and a simulation experiment is presented to evaluate our approach.

I. INTRODUCTION

As the number of Internet users increases, systems architects, designers, and administrators face the challenge of protecting users’ privacy and preferences [3]. In applications such as games, virtual reality, and e-business, users’ behavior is often tracked and recorded. For example, in virtual boutiques, an e-business application, users’ tastes and preferences are particularly valuable to the growing commercial markets inside virtual worlds [6][9]. Another example is online social networks such as Facebook, MySpace, and Second Life, where profiling users’ personal data is considered an integral part of the service providers’ business model. According to [15] and [10], large amounts of private data about users, usually assembled in sophisticated databases, change hands as part of online transactions and other strategic decisions by firms, which often include selling lists of user information to other firms. A look at the present day reveals that users’ profile data is now considered among the most valuable assets owned by businesses in the virtual world [5]. In such an “unfair” information market, users cannot participate, since they have no instrument to capture a share of their private data asset or to receive compensation. This is due, first, to the fact that current systems for processing transactions are designed to facilitate a one-time surrender of control over personal information; second, under current law, the ownership right to personal information is given to the collector of that information and not to the individual to whom the information refers [5][12].

This paper takes the position that individuals are entitled to secure control over their personal information, disclosing it as part of a transaction only when they are fairly compensated. While similar views have already been expressed in a number of legal and non-legal publications concerning information privacy [15][16][17][18], this paper extends our previous work in [4] and further contributes to this debate by proposing a novel intelligent agent-based framework for privacy payoff measurement and negotiation. Intelligent agents in our framework work on behalf of users: they collect users’ private data, categorize it, apply privacy risk weights defined by the users, monitor service providers and collect information related to their reputation, and finally negotiate with service providers a payoff value in return for the dissemination of the information. In reality, people cannot appreciate the magnitude of the privacy threat when revealing their private data, or its impact later in their lives [11]; our intelligent agents help users overcome this obstacle. The rest of the paper is organized as follows: in the next section, we elaborate on the intended contribution of this work. In Section 3, we present the framework architecture. In Section 4, a simulation experiment is presented to evaluate our approach. Finally, in Section 5, we provide conclusions and discuss plans for future work.

II. INTENDED CONTRIBUTION

The intended contribution of our work is to employ an intelligent agent-based solution to quantify and measure privacy payoff in virtual environments, through private data valuation and privacy risk quantification. We also intend to show, through a simulated case study, that our approach of allowing users to participate in the market for private data in the virtual world will help make it a privacy-friendly environment.

III. FRAMEWORK ARCHITECTURE

Figure 1 depicts the high-level architecture of the framework. It is based on the following: 1) users open accounts that record their tastes, preferences, and personal data; 2) each agent is an independent entity with its own goal and information; however, the information under the control of an individual agent is not sufficient to satisfy its goals, so the agent must interact with other agents; 3) the architecture allows interaction with external agents for information sharing. The proposed architecture consists of five types of core modules: 1) the facilitator agent, 2) the database agent, 3) the reputation agent, 4) the payoff agent, and 5) the negotiation agent. The facilitator agent manages the interaction of the agents, orchestrates the order of task execution, and acts as a single point of contact with agents inside and outside our system. The database agent receives data specifications from users, classifies them into categories, and derives a privacy risk quantification based on the private data’s sensitivity. The reputation agent collaborates with other agents in the online world, creating a network where they exchange transaction ratings about service providers with whom their users have dealt; this is called a users’ coalition. The payoff agent employs a payoff model that determines the value of the private information according to the privacy risk quantification and the privacy context weight (discussed later in the paper). Finally, the negotiation agent negotiates with agents representing e-businesses or service providers for a tradeoff value in return for the dissemination of the information.

Figure 1. High level framework architecture
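To make the division of labor concrete, the following minimal Python sketch (ours, not the paper’s code; all class and method names are hypothetical) shows the facilitator driving the other four modules in the task order just described:

```python
class Facilitator:
    """Drives the four worker modules in the order described above:
    classify data -> assess provider reputation -> compute payoff ->
    negotiate dissemination. Module interfaces are illustrative."""

    def __init__(self, db, reputation, payoff, negotiation):
        self.db, self.reputation = db, reputation
        self.payoff, self.negotiation = payoff, negotiation

    def handle_request(self, user_profile, provider_id):
        categories = self.db.classify(user_profile)           # categories + risk weights
        rep = self.reputation.score(provider_id)              # coalition ratings
        value = self.payoff.value(categories, rep)            # privacy payoff value
        return self.negotiation.bargain(provider_id, value)   # tradeoff with provider


# Stub modules so the sketch runs end to end.
class DBAgent:
    def classify(self, profile): return {"contact": 0.3, "preferences": 0.2}

class ReputationAgent:
    def score(self, provider_id): return 0.65

class PayoffAgent:
    def value(self, categories, rep): return 100.0 * sum(categories.values()) * rep

class NegotiationAgent:
    def bargain(self, provider_id, value): return f"ask {provider_id} for ${value:.2f}"

facilitator = Facilitator(DBAgent(), ReputationAgent(), PayoffAgent(), NegotiationAgent())
print(facilitator.handle_request({"email": "alice@example.com"}, "provider-1"))
```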

A. Agent Communication
Cooperation between agents is achieved by means of a common message format through which requests for work can be made and results returned. The communication paradigm is based on asynchronous message passing; thus, each agent has a message queue and a message parser. In our system, messages are based on the ACL (Agent Communication Language) of the FIPA-ACL standard [7], which is similar to KQML (the Knowledge Query and Manipulation Language).

B. Facilitator Agent
The facilitator agent manages the interaction of the agents, orchestrates the order of task execution, and acts as a single point of contact for agents inside and outside our system. The facilitator agent, depicted in Figure 2, consists of three main components (namely, the message manager, the task manager, and the decision manager), which run concurrently and intercommunicate by exchanging internal, i.e. intra-agent, messages. When a message is received at the message manager component, it is relayed to the task manager using a message-queuing mechanism. All components implement the same behavior and remain inactive while no messages are available to be processed.
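As a rough sketch of this scheme (the message fields and class names below are our own; only the performative vocabulary follows FIPA-ACL), each agent can be modeled as a coroutine blocked on its queue:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class ACLMessage:
    """Illustrative subset of a FIPA-ACL message envelope."""
    performative: str        # e.g. "request", "inform", "propose"
    sender: str
    receiver: str
    content: dict = field(default_factory=dict)

class Agent:
    """Base agent: a message queue plus a handler acting as the parser."""
    def __init__(self, name):
        self.name = name
        self.inbox = asyncio.Queue()

    async def send(self, other, msg):
        await other.inbox.put(msg)

    async def run(self):
        while True:                      # inactive while the queue is empty
            msg = await self.inbox.get()
            await self.handle(msg)

    async def handle(self, msg):
        print(f"{self.name} got '{msg.performative}' from {msg.sender}")

async def demo():
    facilitator, db_agent = Agent("facilitator"), Agent("db_agent")
    runner = asyncio.create_task(facilitator.run())
    await db_agent.send(facilitator, ACLMessage(
        "inform", "db_agent", "facilitator", {"risk": 0.4}))
    await asyncio.sleep(0.01)            # let the facilitator drain its queue
    runner.cancel()

asyncio.run(demo())
```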

Figure 2. Facilitator agent architecture

The message manager of the facilitator agent is responsible for the agent’s interaction with its environment, that is, the user, the agents in the system, and the agent representing the service provider. The task manager handles the parts of the interaction protocol between (i) the agents and the user, (ii) the agent and the online seller’s agent, and (iii) the agents themselves, i.e. the order of task execution among them. In addition, the task manager helps construct the tasks that need to be handled by the agents and monitors the current tasks and the agents’ finished tasks (that is, the history of task execution). As shown in Figure 2, the task manager interacts with both the message manager and the decision manager. Finally, the decision manager uses a knowledge base containing the preferences and rules that dictate the final decisions to be made. The main functionality of the decision-making component is ultimately to find the best offer (to be then recommended to the user), according to the user’s choices, sensitivity to privacy, and the information at hand.

C. Reputation Agent
The reputation of a service provider is a measure of trustworthiness [1]. It is defined as the amount of trust inspired by the particular provider in the specific domain of interest, which in our case is the handling of private data. Let $t_{i,j}$ represent a transaction that agent i has with provider j, and let $Q_1(t_{i,j}), \ldots, Q_n(t_{i,j})$ be the n associated reputation factors assigned by agent i to provider j, based on agent i’s experience with provider j. The value of each reputation factor satisfies $0 \le Q \le 1$, because reputation factors are scores assigned by the agents to the service provider, such as a score of 3/5 for privacy statement compliance, 4/5 for reliability, or 2/5 for security. Agent i then assigns provider j a reputation component $R(t_{i,j})$ as follows:

$$R(t_{i,j}) = \frac{1}{n}\sum_{k=1}^{n} Q_k(t_{i,j}) \qquad (1)$$

Over the course of m transactions, agent i assigns provider j a reputation $R_{i,j}$ as follows:

$$R_{i,j} = \frac{1}{m}\sum_{t_{i,j}} R(t_{i,j}) \qquad (2)$$

Notice that $0 \le R(t_{i,j}), R_{i,j} \le 1$.
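Equations (1) and (2) reduce to two nested averages. A minimal sketch (the function names are ours, not the paper’s):

```python
from typing import List

def transaction_reputation(factor_scores: List[float]) -> float:
    """Equation (1): R(t_ij) is the mean of the n reputation factors
    Q_1..Q_n (each in [0, 1]) assigned by agent i for one transaction."""
    return sum(factor_scores) / len(factor_scores)

def provider_reputation(transactions: List[List[float]]) -> float:
    """Equation (2): R_ij averages R(t_ij) over agent i's m transactions."""
    return sum(transaction_reputation(t) for t in transactions) / len(transactions)

# Scores such as 3/5 for privacy-statement compliance, 4/5 for
# reliability, and 2/5 for security, as in the example in the text.
history = [[3/5, 4/5, 2/5], [4/5, 4/5, 3/5]]
print(provider_reputation(history))  # 0.666...
```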

There are different sets of criteria that can be taken into consideration when forming reputation metrics to assess a service provider. In return, a user provides his or her agent with ratings about a transaction in order to build the reputation database of the services. Agents create a network where they exchange transaction ratings about service providers with whom their users have dealt; this is called a users’ coalition. In this way, agents are involved in a joint recommendation process that takes into account each agent’s experience with the service provider as well as that of other agents. The reader may refer to [1] and [2] for reputation criteria that can be used to reflect the conduct of service providers with regard to how they collect and use private data, such as their compliance with privacy statements, their proneness to disclosing private data, transaction protection, reliability, and security.

D. Database Agent
The database agent receives data specifications, classifies them into categories, derives a privacy risk quantification based on the private data’s sensitivity, and processes the conversion of ACL messages and queries. Once users open their accounts and record their tastes, preferences, and personal data, the agent determines which private data objects are perceived to be valuable, in order to capitalize on them and maximize their value once the user decides to reveal them. The agent uses the following methods to address the problem of private data valuation [19]:
• Define data sets, taking into account the different overall sensitivity levels towards each private data item and the different importance one may assign to specific data;
• Define context-dependent weights to fit different situations, since a single user rule about any private data item may not fit all situations;
• Determine the privacy risk value under each context, which will be used later to value the user’s payoff once he or she decides to reveal the data.

1) Data classification: The agent receives data specifications from users, reflecting their true personal information, and classifies them into M different categories ($C_1, \ldots, C_M$), such as personal identification, contact information, address, hobbies, tastes, and salary. Within every category, private data are further divided into subsets ($S_{ij}$, $i = 1, \ldots, M$), and each subset may contain one or more private data items, as shown in the example below.

Example:
Category (contact) = {Subset (telephone number), Subset (email)}
Subset (telephone number) = {Private Data (work phone number), Private Data (home phone number), Private Data (cellular phone)}
Subset (email) = {Private Data (work email), Private Data (personal email)}

The privacy sensitivity of each private data item depends on the parameters $(W_{ij}, \psi_{ij})$ given in the following definitions:

Context-dependent risk weight $W_{ij}$: a value specific to each user, representing the user’s valuation of each context. Users and providers are engaged in a transaction process, where users provide their data to complete a transaction in return for a service. The context, which represents the nature of the transaction, differs from one transaction process to another. For example, a transaction process that involves releasing private data about the user’s health differs in context and nature from one that involves releasing private data about the user’s movie preferences. The user evaluates the privacy risk weight of each transaction based on its context and on the information related to the service provider’s reputation collected by the reputation agent. Such knowledge helps the user make an informed decision about the release strategy for his or her private data and the risk weight he or she wishes to assign to the context of the transaction. According to [14], users have various global valuation levels for each transaction that involves private data; the context-dependent risk weight represents the user’s type. Each user provides a weight value for private information category i under context j.

Privacy risk $\psi_{ij}$: the weighted privacy risk value of revealing private data of category i under context j.

The presented data classification has three interesting characteristics in the context of profile data subsets that are related to each other. First, data in different categories may have different context-dependent weights: since the value of private data may differ from one context to another, its composition may have different implications for the level of revelation. Second, the substitution rate of private data within the same subset is constant and independent of the current level of revealed data; that is, assuming one private data item has been revealed, revealing the rest of the data in the same subset will not increase the privacy disclosure risk. For instance, a user’s age can be expressed by age, year of birth, or high school graduation year; knowing all of them at the same time yields only a marginal improvement. This allows us to consider each private data subset as one unit. Third, data in different subsets are not substitutable, and revealing any one of them will increase the privacy risk. For example, a user’s telephone number and email address constitute two possible ways to contact the user, but they are not interchangeable.

2) Risk quantification: Let $T_i$ be the private data size, i.e. the number of subsets in category i. If $f_i$ subsets have been revealed, then the weighted privacy risk of revealing $f_i$ private data subsets, normalized over the number of subsets in category i, is calculated as follows:

$$\psi_{ij} = \frac{f_i \cdot W_{ij}}{T_i} \qquad (3)$$

where the following properties apply:
- $W_{ij}$ is linear in the interval [0, 1];
- the sum of the context privacy risk weights over all categories under each context is 1.

In equation (3), if the user reveals all his private data, that is, $f_i = T_i$, then his risk weight valuation is maximized and equals the overall privacy risk value. To compute the overall privacy risk $\psi$ of the revealed private data, the privacy risk in (3) is further normalized over the privacy risk value of each category under context j and then summed. Below is an example to illustrate the computation.
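As an illustrative sketch (the categories, revealed counts, and weights below are hypothetical, chosen so the weights sum to 1 across categories as the property above requires):

```python
# Hypothetical profile: category -> (number of subsets T_i, revealed f_i,
# context-dependent risk weight W_ij for the current context j).
profile = {
    "identification": {"T": 3, "f": 1, "W": 0.5},
    "contact":        {"T": 2, "f": 2, "W": 0.3},   # e.g. telephone, email
    "preferences":    {"T": 4, "f": 2, "W": 0.2},
}

def category_risk(T: int, f: int, W: float) -> float:
    """Equation (3): weighted risk of revealing f of the T subsets."""
    return f * W / T

# Overall risk: per-category risks summed across categories (our reading
# of the normalization step; the weights already sum to 1 over categories).
psi = sum(category_risk(c["T"], c["f"], c["W"]) for c in profile.values())
print(round(psi, 3))  # 0.5/3 + 0.3 + 0.1 = 0.567
```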

E. Measurement and Payoff Agent
One approach to privacy protection is to impose costs (a risk premium) on the use of information, so that service providers in the virtual world are more conservative when handling users’ personal information, since privacy risk penalties and reputation consequences for violators of users’ presumed privacy rights are more likely to be costly. In [4] we presented a payoff module based on a linear correlation between the risk of using the private data and the compensation payoff. In this paper, we consider a model in which the information acquired today may have a learning effect in the future, and hence the service provider may benefit from selling or using the information at different time intervals. The goal of this model is to find a payoff value for the private data at the current time.

To measure the value of the private data in such a situation, let us consider a process $\pi(t)$, $0 \le t \le T$, representing the discounted payoff of benefiting from the data at time t, and a class of admissible learning times $\Gamma$ with values in [0, T]. The problem, then, is to find the expected discounted payoff

$$\sup_{\tau \in \Gamma} E[\pi(\tau)] \qquad (4)$$

The process $\pi$ is commonly derived from more primitive elements. With little loss of generality, we restrict attention to problems that can be formulated through Markov processes [20] $\{X(t), 0 \le t \le T\}$, where X(t) represents the state of the learning process at time t. The reason for this restriction is the assumption that the learning effects at different time intervals are not necessarily related, and hence we can assume a memoryless relationship between any two intervals. The payoff to the service provider from learning at time t is $\tilde{h}(X(t))$ for some nonnegative payoff function $\tilde{h}$. If we further suppose the existence of an instantaneous short-rate process $\{r(t), 0 \le t \le T\}$, the payoff problem becomes the calculation of

$$\sup_{\tau \in \Gamma} E\left[e^{-\int_0^{\tau} r(u)\,du}\, \tilde{h}(X(\tau))\right] \qquad (5)$$

The Markovian formulation of the above payoff problem lends itself to a characterization of the private data value through dynamic programming. Let $\tilde{h}_i$ denote the payoff function for the provider at time $t_i$, and let $\tilde{V}_i(x)$ denote the value of the private data at time $t_i$ given $X_i = x$. We are ultimately interested in $\tilde{V}_0(x_0)$. This value is determined recursively as follows:

$$\tilde{V}_m(x) = \tilde{h}_m(x) \qquad (6)$$

$$\tilde{V}_{i-1}(x) = \max\left\{\tilde{h}_{i-1}(x),\; E\left[\tilde{V}_i(X_i) \mid X_{i-1} = x\right]\right\}, \quad i = 1, \ldots, m \qquad (7)$$

Now that the general shape of the payoff problem is formulated, we need to construct the learning process at each time interval with the associated privacy risk weight value $\psi$ that results from using the private data. In the first step, let us assume that the learning process follows different independent paths of learning nodes according to the Markov process $X_0, X_1, \ldots, X_m$; in the second step, we “forget” which node at time i generated which node at time i+1, interconnect all nodes at consecutive time steps, and assign a privacy risk weight $\psi$ to the transition from the jth node at step i to the kth node at step i+1, as shown in Figure 4 below.

Figure 4: Construction of independent stochastic mesh from independent paths

We use $X_{ij}$ to denote the jth node at the ith exercise time (the time at which the service provider learns new information from the private data set), for $i = 1, \ldots, m$ and $j = 1, \ldots, b$, and we use $\tilde{V}_{ij}$ to denote the estimated value at this node, computed as follows. At the terminal nodes we set $\tilde{V}_{mj} = \tilde{h}_{mj}(x_{mj})$; we then work backward recursively by defining

$$\hat{V}_{ij} = \max\left\{h_i(x_{ij}),\; \frac{1}{b}\sum_{k=1}^{b} \psi_{ijk}\, \hat{V}_{i+1,k}\right\} \qquad (8)$$

At the root, we set

$$\hat{V}_0 = \frac{1}{b}\sum_{k=1}^{b} \hat{V}_{1k} \qquad (9)$$

For similar problems related to option pricing [13], simulation methods are used to determine the value of the weight. In this paper, and for the sake of simplicity, we consider the privacy risk weight to be constant at each time interval.
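Under that simplification, the backward recursion of equations (6)–(9) can be sketched as follows; the payoff function, Markov step, and parameters here are illustrative placeholders, not the paper’s calibration:

```python
import random

def value_private_data(h, m: int, b: int, psi, x0: float, step) -> float:
    """Backward-recursive valuation over a stochastic mesh, after
    equations (8) and (9). h(i, x) is the provider's payoff from
    learning at step i; psi(i, j, k) is the risk weight on the
    transition from node j at step i to node k at step i+1; `step`
    advances the Markov learning process one interval."""
    # Simulate b independent paths of the learning process (Figure 4).
    nodes = [[x0] * b]
    for i in range(1, m + 1):
        nodes.append([step(x) for x in nodes[-1]])

    # Terminal nodes: V_mj = h_m(x_mj), as in equation (6).
    V = [h(m, x) for x in nodes[m]]

    # Work backward through the mesh, equation (8): "forget" path
    # identity and average over all b successor nodes.
    for i in range(m - 1, 0, -1):
        V = [max(h(i, x), sum(psi(i, j, k) * V[k] for k in range(b)) / b)
             for j, x in enumerate(nodes[i])]

    # Root value, equation (9).
    return sum(V) / b

random.seed(1)
v0 = value_private_data(
    h=lambda i, x: x,                 # payoff = current learning state
    m=4, b=50,
    psi=lambda i, j, k: 1.0,          # constant risk weight, per Sec. III.E
    x0=10.0,
    step=lambda x: x * random.uniform(0.9, 1.2),
)
print(round(v0, 2))
```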

F. Negotiation Agent
The negotiation agent, depicted in Figure 5 below, includes three components: the negotiation strategy, the reasoning mechanism, and the offer construction graph. Below is a description of each component.

Figure 5. Negotiation agent architecture

1) Negotiation Strategy: The negotiation strategy is a means of analyzing the opponent’s negotiation strategy, thereby understanding his proposal, and offering a responding proposal. In our system, we follow a negotiation process based on a game-theoretic model. Game theory is a branch of applied mathematics often used in the context of economics [20].
2) Reasoning Mechanism: The reasoning mechanism provides a decision-support criterion for comparing one’s own negotiation proposal with the counterpart’s proposal. The MADM (Multiple Attribute Decision Making) approach presented in [8] is used to formalize each negotiation item on the basis of defined uniform criteria.
3) Offer Construction Graph: This component stores a relational graph of all the offers and the strategy for executing them. The offer construction graph forms a library of all offers.
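A minimal weighted-sum reading of such an MADM comparison (the attributes, weights, and normalization below are invented for illustration; [8] specifies the actual method):

```python
# Each proposal is pre-normalized per attribute to [0, 1], where higher
# is better for the user; the weights encode the uniform criteria.
WEIGHTS = {"discount": 0.5, "data_requested": 0.3, "retention": 0.2}

def score(proposal: dict) -> float:
    """Weighted-sum utility of a proposal under the uniform criteria."""
    return sum(WEIGHTS[a] * proposal[a] for a in WEIGHTS)

own     = {"discount": 0.6, "data_requested": 0.8, "retention": 0.7}
counter = {"discount": 0.4, "data_requested": 0.9, "retention": 0.5}
print("accept" if score(counter) >= score(own) else "counter-offer")
```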

IV. SIMULATION EXPERIMENT

Consider that the agent is working on behalf of a list of 1000 users. The agent’s objective is to negotiate a payoff paid to the users by the service provider in return for revealing their private information. The payoff can be equivalent to a price discount offered to the user once the negotiation process completes with acceptance. The negotiation steps between the agent and the service provider are as follows:
• The agent sends a request to the service provider asking for a price discount;
• The service provider first replies with a discount that maximizes his profit. The service provider is assumed to be a monopoly, but one striving for the largest demand at the lowest cost. If the service provider offers zero discount, then no agreement is reached, and both the service provider and the agent walk away with zero payoff;
• Based on the offered discount, the agent replies with a demand of $D = N'$ users, where $0 \le N' \le N$, and asks the service provider to go higher than the current offer. If the service provider decides not to go higher, the same offer is sent twice, the negotiation process ends, and the service provider walks away with $N'$ users only;
• Otherwise, the service provider increases the discount offer by a step $\theta$, and step 3 (the agent’s reply) is repeated;
• The negotiation ends when either 1) the service provider reaches $(P_r - C_r) = l$, or 2) $D = N$, i.e. the maximum number of users has been reached.
The negotiation process ends in the first case because the service provider has no incentive to increase the discount further, as negative returns are undesirable; in this case the service provider sends the same message twice. In the second case, the negotiation process ends when the same number of users N is sent to the service provider twice. A toy simulation of this bargaining loop is sketched below.
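In the following sketch, the users’ accept rule, the price $P_r$, cost $C_r$, floor $l$, and step $\theta$ are invented for illustration; the loop terminates under exactly the two conditions above:

```python
import random

def negotiate(risk_values, Pr, Cr, l, theta, record_value=100.0):
    """Toy bargaining loop. Each user accepts once the offered discount
    covers psi * record_value (an invented accept rule); the provider
    raises the discount by theta until the margin Pr - Cr - discount
    falls to the floor l, or until demand D = N is reached twice."""
    N = len(risk_values)
    discount, prev_demand = 0.0, -1
    while True:
        demand = sum(1 for psi in risk_values if psi * record_value <= discount)
        if demand == N and prev_demand == N:   # case 2: D = N sent twice
            break
        if Pr - Cr - discount <= l:            # case 1: margin floor hit
            break
        prev_demand = demand
        discount += theta
    return discount, demand

random.seed(7)
users = [random.random() for _ in range(1000)]   # each user's psi in [0, 1]
d, n = negotiate(users, Pr=100.0, Cr=20.0, l=10.0, theta=5.0)
print(f"final discount ${d:.0f}, records acquired: {n} of 1000")
```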

Reputation metrics: The reputation metrics, which include compliance with the privacy policy, reliability, and reputation, are shown in Table 5, where all values lie between 0 and 1.

TABLE 5: REPUTATION RATING OF THE PROVIDER
Compliancy: 0.65   Reliability: 0.75   Reputation: 0.6

Based on the reputation metrics provided above, each user provides a weighted privacy risk value for each private data category. The database agent calculates the users’ privacy valuations $\psi$ as shown in Figure 6 below.

Figure 6: Users’ privacy valuation measurements (privacy risk values, 0 to 1, versus number of users)

Consider also that the service provider is engaged in the information market and will sell the users’ list to other providers, making $100 for each private data record after exhausting all possible selling time intervals. The valuation of the private data based on a one-time selling interval is performed by the payoff agent, and the resulting value of each user’s private data is depicted in Figure 7 below.

Figure 7: Users’ private data value of one trading interval (value in dollars versus number of users)

The service provider’s negotiation strategy is to acquire the largest number of private data records in the list for the minimum payoff, while the negotiation agent’s strategy is to maximize the users’ payoff based on the highest privacy regime. Assume that the service provider sets a target of $60 per record and that the negotiation agent sets its strategy based on four selling intervals. We start our simulation by assessing the results based on the privacy risk values the user assigned from his prior knowledge of the provider, given in Table 5, and then repeat the test with different reputation metrics, thus creating different privacy risk values (regimes) each time. We assume that the user will change his privacy risk valuation for his private data categorization depending on the reputation metrics provided by the reputation agent in Table 5. We increased the values in Table 5 as follows: for compliancy (0.75, 0.85, 0.90, and 0.95), for reliability (0.80, 0.85, 0.90, and 0.95), and for reputation (0.6, 0.65, 0.75, and 0.85). The final result of the negotiation for all variations is depicted in Figure 8, which shows that when privacy perception increases, both the service provider and the user are better off; that is, better privacy practices yield greater benefit to both parties, while poor privacy practices are not as beneficial.

Figure 8: Benefit for users and provider under different privacy regimes (benefit in dollars versus reputation setting in different intervals)

The results show that private data assets are no exception to the principle that assets are usually safely, and often better, left to markets. In such a market, the incentives for service providers to protect privacy are entirely financial.


V. CONCLUSION

In this paper, we presented a framework for measuring a user’s privacy payoff using intelligent agents. We have shown that users’ privacy is a measurable risk that can be used to determine the users’ payoff in risky situations. In our model, not only do users benefit from the revelation of their private data; it is also beneficial for service providers in virtual environments. Since attention in the information market of the virtual world is one of the main reasons, if not the most important reason, behind the collection of personal information, it is essential for service providers in the virtual world to find ways to economize on attention. A future direction for our work is to investigate the optimality of the discounted payoff and its validity in a real-world application. Another direction is to extend the negotiation problem to multiple providers.


REFERENCES
[1] A. Gutowska and K. Bechkoum, “The Issue of Online Trust and Its Impact on International Curriculum Design,” Third China-Europe International Symposium on Software Industry-Oriented Education, Dublin, 6-7 February 2007, pp. 134-140.
[2] A. Gutowska and K. Buckly, “Computing Reputation Metric in Multi-Agent E-Commerce Reputation System,” 28th International Conference on Distributed Computing Systems Workshops (ICDCS ’08), 17-20 June 2008, pp. 255-260.
[3] A. Masaud-Wahaisi, H. Ghenniwa, and W. Shen, “A privacy-based brokering architecture for collaboration in virtual environments,” IFIP International Federation for Information Processing, Volume 243, Establishing the Foundation of Collaborative Networks, Springer, 2007, pp. 283-290.
[4] A. Yassine and S. Shirmohammadi, “Privacy and the Market for Private Data: A Negotiation Model to Capitalize on Private Data,” AICCSA, Qatar, 2008, pp. 669-678.
[5] C. Prins, “When personal data, behavior and virtual identities become a commodity: Would a property rights approach matter?,” SCRIPT-ed, Volume 3, Issue 4, 2006.
[6] E. Paquet, H. Victor, and S. Peters, “The virtual boutique: a synergic approach to virtualization, content-based management of 3D information, 3D data mining and virtual reality for e-commerce,” First International Symposium on 3D Data Processing Visualization and Transmission, 19-21 June 2002, pp. 268-276.
[7] FIPA specification, http://www.fipa.org/.
[8] H. R. Choi, J. Park, H. S. Kim, Y. S. Park, and Y. J. Park, “Multi-agent based negotiation support systems for order based manufacturers,” Proceedings of the 5th International Conference on Electronic Commerce (ICEC 2003), 2003.
[9] I.-H. Hann, K. Hui, T. S. Lee, and I. P. L. Png, “The Value of Online Information Privacy: An Empirical Investigation,” AEI-Brookings Joint Center for Regulatory Studies, October 2003.
[10] J. Rendelman, “Customer data means money,” InformationWeek, no. 851, Aug. 20, 2001, pp. 49-50.
[11] J. Turow, D. K. Mulligan, and C. J. Hoofnagle, “Users fundamentally misunderstand the online advertising marketplace,” University of Pennsylvania Annenberg School for Communication and UC Berkeley Samuelson Law, Technology and Public Policy Clinic, 2007.
[12] J. A. Deighton, “Market Solutions to Privacy Problems?,” Chap. 6 in Digital Anonymity and the Law - Tensions and Dimensions, The Hague: T.M.C. Asser Press, 2003.
[13] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2004.
[14] S. Preibusch, “Implementing Privacy Negotiation Techniques in E-Commerce,” Proceedings of the Seventh IEEE International Conference on E-Commerce Technology (CEC ’05), 2005.
[15] T. Zarsky, “Privacy and Data Collection in Virtual Worlds,” in The State of Play: Law, Games and Virtual Worlds, NYU Press, 2006, pp. 217-223.
[16] A. Masaud-Wahaisi, H. Ghenniwa, and W. Shen, “A privacy-based brokering architecture for collaboration in virtual environments,” IFIP International Federation for Information Processing, Volume 243, Establishing the Foundation of Collaborative Networks, Springer, 2007, pp. 283-290.
[17] J. A. Deighton, “Market Solutions to Privacy Problems?,” Chap. 6 in Digital Anonymity and the Law - Tensions and Dimensions, The Hague: T.M.C. Asser Press, 2003.
[18] M. Schwartz, “Property, Privacy, and Personal Data,” Harvard Law Review, 117, pp. 2056-2127, 2004.
[19] T. Yu, Y. Zhang, and K. J. Lin, “Modeling and measuring privacy risks in QoS Web services,” Proceedings of the 8th IEEE International Conference on E-Commerce Technology and the 3rd IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services (CEC/EEE ’06), 2006.
[20] D. Fudenberg and J. Tirole, Game Theory, The MIT Press, Cambridge, Massachusetts, 1992.