Privacy Friendly Information Disclosure

Steven Gevers and Bart De Decker

Department of Computer Science, K.U.Leuven, Celestijnenlaan 200A, B-3001 Leuven, Belgium

Abstract. When using electronic services, people are often asked to provide personal information. This raises many privacy issues. To gain the trust of the user, service providers can use privacy policy languages such as P3P to declare the purpose and usage of this personal information. User agents can compare these policies to privacy preferences of a user and warn the user if his privacy is threatened. This paper extends two languages: P3P and APPEL. It makes it possible to refer to certified data and credentials. This allows service providers to define the minimal level of assurance. It is also shown how different ways of disclosure (exact, blurred, verifiably encrypted, ...) can be specified to achieve more privacy friendly policies. Last, the paper describes a privacy agent that makes use of the policies to automate privacy friendly information disclosure.

1 Introduction

When using electronic services, people are often asked to provide personal information. This raises many privacy issues. To gain the trust of the user, service providers can use privacy policy languages to specify the purpose and usage of this personal information. User agents can compare these policies to the privacy preferences of a user and warn the user if his privacy is threatened. Two well known privacy languages are P3P (The Platform for Privacy Preferences [14]) and APPEL (A P3P Preference Exchange Language [15]). The former is used for privacy policies, the latter for privacy preferences.

There are three ways in which a user can prove personal information to a service provider. He can provide it as uncertified data, as certified data or embedded in a credential. In this paper, the term information structure denotes all three. Uncertified data is data that is not certified by another entity. It can easily have been stolen, forged or made up. Certified data is data that is endorsed (e.g. signed) by a certifying entity. When a service provider receives certified data, he can be sure that the information is correct. However, anyone who obtains certified data can easily pretend to be its legitimate owner. Credentials contain certified data and offer a means to assure the service provider that the person sending the data is indeed the one to whom the credential was issued. Examples are X.509 certificates [13] and private credentials [1, 2]. Hence, credentials offer the highest assurance and allow for implementing secure services.

R. Meersman, Z. Tari, P. Herrero et al. (Eds.): OTM Workshops 2006, LNCS 4277, pp. 636–646, 2006. © Springer-Verlag Berlin Heidelberg 2006


Private credentials have many privacy friendly properties. They make it possible to hide the values that the service provider does not need to know (selective disclosure). They also support different ways of disclosure. It is possible to prove properties of attributes (e.g. proving being over eighteen) or just prove possession of the credential without revealing it.

The paper extends P3P and APPEL with the different ways of disclosure and the different types of information structures. This way, service providers can request a certain level of assurance, and the user’s privacy can be better protected. The policies make it possible to automate personal information disclosure. The paper describes how they are used by a privacy agent that discloses information structures to a service provider. User intervention is only required when absolutely necessary, yet the user’s privacy is protected according to his preferences. More detailed information about our approach can be found in [4].

The next section extends P3P and APPEL with information structures and the different ways of disclosure. Section 3 gives a description of the privacy agent. Section 4 discusses our approach. The last section gives some conclusions.

2 Extending Privacy Languages

This section extends P3P and APPEL. The extensions help service providers to define which (parts of) information structures they are willing to accept. Users can define accurate privacy preferences about their information structures. This section assumes basic knowledge of P3P and APPEL. A short introduction can be found in [4].

2.1 Information Structure Description Language

This section introduces a language that is able to define information structures in XML. By including these descriptions in privacy policies and privacy preferences it is possible to state which information structures may be used. The language makes it possible to reason about the contents of different types of information structures in a uniform way. Uncertified data keeps information about the owner of the data. Certified data and credentials also include information about the certifying entity. Furthermore, they may have several properties. An information structure description can thus be divided into one part about the owner, another part about the certifier and yet another part about properties. Figure 1 shows the description of a private credential containing the age and name of its owner. More examples can be found in [4]. The name attribute of the description is used to make references to it. The type attribute defines the type of information structure. Currently, privatecredential, X.509Certificate, uncertifieddata and certifieddata are defined. A generic value credential can be used to denote both private credentials and X.509 certificates.
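The three-part division described above can be illustrated with a small data model. This is only a sketch in Python, not the XML language of the paper: the class name `InfoStructureDescription`, its field names, the certifier name `municipalityCA` and the property key `valid-until` are assumptions made for illustration. Only the type names are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Type names as listed in the text; "credential" is the generic value.
KNOWN_TYPES = {
    "privatecredential", "X.509Certificate",
    "uncertifieddata", "certifieddata", "credential",
}


@dataclass
class InfoStructureDescription:
    """Hypothetical in-memory form of an information structure description."""
    name: str                       # used to refer to the description
    type: str                       # one of KNOWN_TYPES
    owner: Dict[str, str] = field(default_factory=dict)       # owner attributes
    certifier: Optional[str] = None                           # name of the certifier's description
    properties: Dict[str, str] = field(default_factory=dict)  # e.g. validity period

    def __post_init__(self) -> None:
        if self.type not in KNOWN_TYPES:
            raise ValueError(f"unknown information structure type: {self.type}")
        # Uncertified data only keeps information about the owner.
        if self.type == "uncertifieddata" and self.certifier is not None:
            raise ValueError("uncertified data has no certifier")


# A private credential containing the name and age of its owner.
# The certifier name and the "valid-until" key are illustrative assumptions.
id_credential = InfoStructureDescription(
    name="IDcredential",
    type="privatecredential",
    owner={"name": "Bill", "age": "18"},
    certifier="municipalityCA",
    properties={"valid-until": "03-06-07 01:01:01"},
)
```

Keeping the owner, certifier and property parts separate is what lets different types of information structures be handled by the same code.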


The owner part in figure 1 defines properties (attributes) of the owner. In this case, the credential includes information about the person’s age and name.

Fig. 1. The information structure description language (example values: name Bill, age 18, timestamp 03-06-07 01:01:01)

Credentials are verified through the use of other credentials. This leads to a chain of credentials that ends with a root credential (e.g. X.509 certificate chains). One tag points to an information structure description of the next credential in the chain; another tag defines the root credential. The last part defines the properties of the information structure. Certified data typically has a signature algorithm as property. Common properties of credentials are, for example, the validity period and revocation information. It is important that every information structure is mapped onto the tags in a consistent way. That way it is possible to use the different types of information structures interchangeably.

2.2 Privacy Policies

Whenever a service provider needs personal information from the user, he has to create a privacy policy. To include information structures in P3P, the service provider first has to define the information structures he is willing to accept from users. He can do this by using the information structure description language described in section 2.1. Then, the service provider is able to make references to these descriptions in the P3P policy. To make references, two parts of P3P policies are extended: the part that defines data elements and the part that specifies the information that has to be disclosed. Note that P3P privacy policies also contain parts about, for example, purpose and retention time (e.g. the address of a user is required for sending advertisements). Our approach does not change these parts. Unchanged parts that are irrelevant for the examples are not included in this paper.



Fig. 2. Information structure descriptions

Defining Acceptable Information Structures. The service provider has to create information structure descriptions containing the distinguishing tags of the information structures he is willing to accept. An information structure matches a description if it contains at least every tag in the description. The user is allowed to use every information structure that matches the description. Figure 2 shows an information structure description of a credential. If the service provider requests an IDcredential, every credential that contains the attributes age and name can be used. Furthermore, the credential has to be certified by the municipality CA. A tag in the description points to an information structure description of a credential of this entity. In [4], a level attribute is described that makes it possible to refer to credentials higher in the credential chain.

Note that it is possible to point to multiple information structures with only one description. Private credentials as well as X.509 certificates can match the specification in figure 2. This is an important aspect of this paper: different types of information structures can be used interchangeably. Services can be made more accessible by allowing users to show their personal information in different ways.

Creating References to Information Structures. In P3P, a dedicated tag is used to define data elements. This tag is extended to be able to refer to attributes of information structures. Figure 3 shows that if the service provider requests a statement on the age of a user, only the age attribute of the IDcredential has to be used. The ’@’ indicates the parts of the information structure a user has to disclose. Hence, when using private credentials, the owner may hide the other attributes.

Fig. 3. References to information structures

Defining Different Ways of Disclosure. In standard P3P a service provider can only state that he needs certain information in clear text. With private credentials, however, there are more possibilities.

– It is possible to show/prove the value of credential attributes (i.e. clear text).
– Users are able to prove knowledge of a certain attribute. This corresponds to proving ownership of a credential containing the attribute without revealing any attributes.
– A user can disclose information as a verifiable encryption. A verifiable encryption is associated with a condition and a third party. If the condition is fulfilled, the third party is allowed to decrypt the encrypted information. This can, for example, be useful to identify a person in case of abuse. Cryptographic mechanisms ensure that the encryption contains the information requested by the service provider.
– A user can prove equations (≤, ≥, <, >, = and ≠). It is possible to compare an attribute with a known value or with attributes of other credentials. These equations can be relatively complex: e.g. attr1 + 7 ≤ attr2 · (4 + attr3).

Service providers frequently request the interval a certain attribute belongs to. An example is a site that needs to know that the user’s income is in a certain interval. These intervals are typically defined by a start point and a step. The user then has to provide a number k and prove start + k · step ≤ attribute ≤ start + (k + 1) · step.

In order to allow the service provider to define how personal information must be shown, the data group part of P3P policies is extended. Figure 4 defines that the user can either use his IDcredential and prove being over eighteen or prove knowledge of a VISA card credential (i.e. prove to be the owner of a valid VISA card credential).
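The interval disclosure described above can be made concrete with a short worked example. The sketch below, in Python, computes the number k for a given start point and step and checks the statement start + k · step ≤ attribute ≤ start + (k + 1) · step. The function names and the income figures are hypothetical. In the actual system the statement would be proven with a cryptographic protocol so that the service provider never sees the attribute; the plain comparison here only shows which statement is being proven.

```python
def interval_index(attribute: int, start: int, step: int) -> int:
    """Return k such that start + k*step <= attribute < start + (k+1)*step."""
    if step <= 0:
        raise ValueError("step must be positive")
    if attribute < start:
        raise ValueError("attribute lies below the first interval")
    return (attribute - start) // step


def statement_holds(attribute: int, start: int, step: int, k: int) -> bool:
    """Check the statement the user has to prove for a given k."""
    return start + k * step <= attribute <= start + (k + 1) * step


# Hypothetical values: an income of 37500 with intervals of width 10000.
income, start, step = 37500, 0, 10000
k = interval_index(income, start, step)            # k == 3
assert statement_holds(income, start, step, k)     # 30000 <= 37500 <= 40000
```

The service provider learns only k, i.e. that the income lies between 30000 and 40000 in this example.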
The information will be used for pseudonymous analysis. The VISA card credential is not worked out in this paper. Additional tags are introduced to support the different ways of disclosure. Further tags can be used to define more complex policies such as the ones in [3]. More information and examples about the different tags can be found in [4].

Our approach allows for a distinction between the personal information requested by the service provider and the technologies that are used to realize the disclosure. For instance, if a user’s IDcredential is a private credential, a zero knowledge proof can be used to prove his adulthood. If he owns an X.509 certificate, he has to show the entire certificate and disclose every attribute in clear text.

The level of disclosure is a partial relation based on the different ways of disclosure and the level of assurance provided by the types of information structures. x has a higher level of disclosure than y if x reveals, in every case, more information than y. For example, showing an attribute in clear text reveals more information than proving an equation. Also, the type of information structure used must provide at least an equal level of assurance. x has a higher (or equal) level of disclosure than y if it has at least the same level in both figure 5(a) and 5(b). The level at which the service provider wants to receive the attributes is the minimum level of disclosure. When a user has to disclose a property of an attribute, the service provider will of course accept it if the user gives it away in clear text. Moreover, he will also accept it if the user discloses more information than necessary (for example, by using X.509 certificates). Likewise, if the service provider requests uncertified data, he will also accept a credential (assuming he is capable of handling the protocols associated with that type of information structure).

Fig. 4. Extending a data group in P3P

Fig. 5. Levels of disclosure: (a) ways of disclosure; (b) information structures
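The comparison of levels of disclosure can be sketched as a check on two independent orderings. The Python fragment below is an illustration under stated assumptions: it encodes only the orderings mentioned in the text (clear text reveals more than proving an equation; credentials give more assurance than certified data, which gives more than uncertified data). The remaining ways of disclosure are treated as comparable only with themselves, which is a simplification of this sketch and not taken from figure 5. All identifiers are hypothetical.

```python
# For each way of disclosure: the ways it reveals at least as much as.
# Only "cleartext covers equation" is stated in the text; the rest is assumed.
WAY_COVERS = {
    "cleartext": {"cleartext", "equation"},
    "equation": {"equation"},
    "knowledge": {"knowledge"},
    "verifiable_encryption": {"verifiable_encryption"},
}

# Level of assurance per type of information structure (higher is stronger).
ASSURANCE = {"uncertifieddata": 0, "certifieddata": 1, "credential": 2}


def at_least(offered: tuple, required: tuple) -> bool:
    """True if `offered` has a higher or equal level of disclosure than `required`.

    Both arguments are (way_of_disclosure, information_structure_type) pairs.
    The offer must be at least as high in BOTH orderings.
    """
    way_offered, struct_offered = offered
    way_required, struct_required = required
    return (way_required in WAY_COVERS[way_offered]
            and ASSURANCE[struct_offered] >= ASSURANCE[struct_required])


# A service provider asks for a proven equation over certified data.
minimum = ("equation", "certifieddata")
assert at_least(("cleartext", "credential"), minimum)           # more is accepted
assert not at_least(("cleartext", "uncertifieddata"), minimum)  # too little assurance
```

Because both orderings must hold at once, two levels can be incomparable, which is why the relation is only partial.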

2.3 Privacy Preferences

The service provider has specified which information he wants to receive and how the user should provide it. APPEL can be extended in the same way as P3P to include information structures. First, the user needs to have descriptions of his information structures. If an information structure has well defined semantics (e.g. X.509 certificates), it is possible to generate the tags of its description automatically [4]. If its semantics are not specified, it is impossible to automate the generation of information structure descriptions. In this case, the entity that issued the information structure could provide the description. References to these descriptions are made in the same way as in extended P3P. Figure 6 specifies that it is allowed (behavior=’request’) to disclose being older than 18 for pseudonymous analysis.

Fig. 6. Defining how an attribute can be disclosed

The level at which the user wants to show attributes is the maximum level of disclosure. In the example, the user will not reveal his age in clear text. Comparisons can be made weaker. The user will allow proving that he is older than sixteen, because this reveals less information than proving that he is over eighteen. However, he will not prove being older than, for example, thirty.
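The rule for weaker comparisons can be written down in a few lines. The following Python sketch assumes that the preference is stored as a single permitted lower bound (here 18) and that requests have the form "older than n"; the function name is hypothetical. A request is within the maximum level of disclosure when the requested statement follows from the permitted one.

```python
def lower_bound_allowed(requested: int, permitted: int) -> bool:
    """True if proving 'age > requested' reveals no more than 'age > permitted'.

    'age > permitted' implies 'age > requested' exactly when
    requested <= permitted, so such a request reveals nothing new.
    """
    return requested <= permitted


PERMITTED = 18  # the user allows proving that he is older than 18

assert lower_bound_allowed(16, PERMITTED)       # weaker statement: allowed
assert lower_bound_allowed(18, PERMITTED)       # the permitted statement itself
assert not lower_bound_allowed(30, PERMITTED)   # stronger statement: refused
```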

3 A Privacy Agent

The previous section explained how P3P and APPEL can be extended with different types of information structures and different ways of disclosure. This section describes a privacy agent that shows the benefits of these extensions.

When a user wants to use a service, a privacy policy is sent to the privacy agent. The privacy agent first checks whether the user’s privacy preferences allow the disclosure. Then, the privacy agent searches the user’s information structures to find the ones that can be used to fulfil the privacy policy. After that, the privacy agent can either start the information disclosure automatically or show the (combinations of) information structures that can be used to the user. The latter is comparable to the identity selector of Microsoft CardSpace [8]. The user can then choose how his personal information has to be disclosed.

For the privacy agent to be successful, it has to be user friendly. One aspect is privacy preferences. Most users are not able to generate complex privacy preferences. To handle this, the privacy agent retrieves privacy preferences from a trusted third party. This approach is described in [5]. Also, if the user has a choice between several (combinations of) information structures, the privacy agent tries to help the user. By using sensitivities (described in [4]) the most privacy friendly combination is calculated. This combination is suggested to the user to help him protect his privacy.

Fig. 7. Description of the system

Figure 7 shows the interaction. The following steps are necessary:

1. The user application requests the service.
2. The service provider sends his privacy policy to the privacy agent.
3. The privacy agent checks whether the policy matches the privacy preferences of the user.
4. The privacy agent selects the necessary credentials that comply with the policy. This action comprises more than just selecting information structures that contain the requested attributes. For example, when using X.509 certificates, all included attributes are revealed, which may be more than necessary. The privacy preferences must be checked again to see whether it is allowed to show every attribute included. If there are different possibilities to show the information, the most privacy friendly combination should be calculated (using sensitivities [4]). Based on the user’s preferences, the privacy agent can either start the information disclosure automatically or request the user’s consent.
5. A protocol is started to disclose the information. First, the privacy agent informs the service how the information will be shown. When this information is exchanged, the correct protocol can be started. This approach makes it necessary for the service provider to support the appropriate protocols for the different information structures. To deal with this, a protocol compiler as mentioned in [3] can be used.
6. The user application can make use of the service.
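Step 4 of the list above can be sketched as follows. The Python code is a minimal illustration, not the agent's implementation: information structures are plain dictionaries, matching follows the rule that a structure must contain at least every attribute of the description, and the sensitivity of a disclosure is assumed to be the sum of per-attribute numbers. The sensitivity model of [4] is not reproduced here, and all names and values are hypothetical.

```python
from typing import Dict, List, Optional, Set


def matches(structure: Dict, description: Dict) -> bool:
    """A structure matches if it contains at least every required attribute
    and was issued by the required certifier."""
    return (description["attributes"] <= set(structure["attributes"])
            and structure["certifier"] == description["certifier"])


def revealed(structure: Dict, requested: Set[str]) -> Set[str]:
    """Attributes actually shown: an X.509 certificate reveals everything,
    other structures are assumed to support selective disclosure."""
    if structure["type"] == "X.509Certificate":
        return set(structure["attributes"])
    return set(requested)


def select(structures: List[Dict], description: Dict, requested: Set[str],
           allowed: Set[str], sensitivity: Dict[str, int]) -> Optional[Dict]:
    """Pick the matching structure with the lowest total sensitivity whose
    revealed attributes are all allowed by the user's preferences."""
    candidates = []
    for s in structures:
        if not matches(s, description):
            continue
        shown = revealed(s, requested)
        if shown <= allowed:  # preferences are checked again on what is shown
            cost = sum(sensitivity.get(a, 0) for a in shown)
            candidates.append((cost, s["name"], s))
    if not candidates:
        return None
    return min(candidates, key=lambda c: (c[0], c[1]))[2]


attrs = {"name": "Bill", "age": 18, "address": "Leuven"}
structures = [
    {"name": "eID-x509", "type": "X.509Certificate",
     "certifier": "municipalityCA", "attributes": attrs},
    {"name": "eID-private", "type": "privatecredential",
     "certifier": "municipalityCA", "attributes": attrs},
]
description = {"certifier": "municipalityCA", "attributes": {"age", "name"}}
sensitivity = {"age": 1, "name": 2, "address": 3}

choice = select(structures, description, requested={"age"},
                allowed={"age", "name"}, sensitivity=sensitivity)
assert choice["name"] == "eID-private"  # the X.509 certificate would reveal the address
```

If `select` returns `None`, no information structure satisfies both the policy and the preferences, and the agent would have to involve the user.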

4 Discussion

The extensions of the language provide several advantages. Including the different ways of disclosure allows for more privacy friendly policies. It makes it possible to include the properties of private credentials. Other privacy languages, such as Rei [10], EPAL [12] and XPref [9], are not able to do this. Using the different types of information structures, service providers can define a certain level of assurance in their policy. It also makes it possible to automatically select the information structures that can be used for the personal information disclosure.

Camenisch et al. [3] propose a language that allows one to specify what data to release and how to release it. Their specifications can be converted to the XML notation proposed in this paper. Our work extends the functionality by allowing the user to use different types of information structures. Our work also uses privacy preferences to check whether the personal information disclosure is allowed.

Usability is an important concern for the privacy agent. Almost every aspect of personal information disclosure can be automated. This way, complex technologies (e.g. private credentials) can be used without even understanding their basics.

The privacy agent is highly extensible. The information structure description language is very general. This makes it possible to include all types of information structures in the system by mapping them onto the language.

A positive aspect of our approach is that it can be very useful for the service provider. A service provider can easily give the user many options to prove personal information. For example, in figure 4, if a user does not want to prove knowledge of a VISA card, he can still prove his age with his IDcredential. This makes services more accessible. By making use of credentials, services can be made more secure.

The impact of the extensions on the evaluator of the privacy preferences is rather limited. Policies that give the users a choice are split into multiple policies. Checking the levels of disclosure does not require complex calculations either.

Microsoft CardSpace [8] provides a mechanism similar to our work. However, it does not include privacy policies. It also does not include the different ways of disclosure: only clear text claims can be handled. Our approach is able to put constraints on the properties of information structures that are allowed. This can, for example, be useful if a site wants its users to have a passport that will remain valid for at least six months. Note that Microsoft CardSpace fits perfectly in our system. Infocards can be described using the information structure description language. Claims are properties of the owner; the reference to the security token service is a property of the information structure itself. Requested claims can easily be included in a predefined privacy policy. Our approach is also more general than Microsoft CardSpace. CardSpace always needs to contact a security token service to obtain a credential. Our approach is able to use credentials that are fully under the control of the user such as, for example, credentials on an electronic identity card.

Our privacy agent provides more functionality than existing user agents focussing on privacy policies such as AT&T’s Privacy Bird [6] and JRC P3P Proxy [7]. Both only warn users when a site does not respect their preferences. Our privacy agent is able to use the different information structures and to define different ways of disclosure, which makes it much more useful.

To make the system more usable, it can be extended to support trust negotiation. Instead of sending a privacy policy to the privacy agent, the trust-target graph procedure discussed in [11] can be used. The extensions of the privacy languages proposed in this paper can be used to support the communication between the different parties. This is future work.

5 Conclusions

This paper proposed two extensions to P3P and APPEL. It is possible to include different types of information structures and different ways of disclosure. A privacy agent is described that makes use of these policies. The privacy agent provides user and privacy friendly information disclosure.

References

1. J. Camenisch and E. Van Herreweghen: Design and Implementation of the Idemix Anonymous Credential System. In Proc. 9th ACM Conf. Computer and Comm. Security, 2002.
2. S. Brands: Rethinking Public Key Infrastructures and Digital Certificates: Building in Privacy, 2000.
3. J. Camenisch, D. Sommer and R. Zimmermann: A general certification framework with applications to privacy-enhancing certificate infrastructures. Tech. Rep. RZ 3629, IBM Zurich Research Laboratory, July 2005.
4. S. Gevers and B. De Decker: Automating privacy friendly information disclosure. Tech. Rep. CW441, Katholieke Universiteit Leuven, May 2006.
5. G. Yee and L. Korba: Semi-Automated Derivation of Personal Privacy Policies. In IRMA ’04: Proceedings of the 2004 Information Resources Management Association International Conference, 2004.
6. AT&T Privacy Bird. http://www.privacybird.com/
7. JRC P3P Resource Centre. http://p3p.jrc.it/
8. Microsoft CardSpace. http://msdn.microsoft.com/winfx/reference/infocard/default.aspx
9. R. Agrawal, J. Kiernan, R. Srikant, and Y. Xu: An XPath based preference language for P3P. In Proc. of the 12th Int’l World Wide Web Conference, 2003.
10. L. Kagal, T. Finin, and A. Joshi: A policy based approach to security for the semantic web. In Proceedings of the 2nd International Semantic Web Conference, 2003.
11. J. Li, N. Li and W.H. Winsborough: Automated trust negotiation using cryptographic credentials. In Proceedings of the 12th ACM Conference on Computer and Communications Security, 2005.
12. The Enterprise Privacy Authorization Language (EPAL 1.1). http://www.zurich.ibm.com/security/enterprise-privacy/epal/
13. R. Housley, W. Ford, W. Polk and D. Solo: RFC 2459: Internet X.509 Public Key Infrastructure Certificate and CRL Profile.
14. Platform for Privacy Preferences (P3P) Project. http://www.w3.org/P3P/
15. A P3P Preference Exchange Language 1.0 (APPEL1.0). http://www.w3.org/TR/P3P-preferences/