Research Techniques Derived From Rough Sets Theory: Rough Classification and Rough Clustering

Kevin E. Voges
Department of Management, University of Canterbury, Christchurch, New Zealand
[email protected]

Abstract: Research techniques are often divided into quantitative approaches, concerned with measurement, and qualitative approaches, concerned with meaning. Methodologists have often found it difficult to reconcile this division. However, computational intelligence is directly concerned with capturing meaning in a rigorous form. One commonly used technique is rough sets theory, which analyses an information system consisting of a set of objects. The concept of an object is broad enough to include both qualitative and quantitative data structures. This paper provides an overview of rough sets theory, and shows how the theory can be used in both classification and clustering tasks. It presents an analysis of a number of simplified data sets that demonstrates how rough sets theory can be applied to problems in business research.

Keywords: Computational intelligence, Rough sets, Rough classification, Rough clustering

1. Introduction

Research techniques are often divided into quantitative and qualitative approaches, and there has been little interaction, and occasionally even animosity, between the two "camps." One distinction sometimes made between these approaches is that quantitative techniques are concerned with the measurement of definable constructs, whereas qualitative techniques are concerned with uncovering meaning. Research methodologists often find it difficult to reconcile these two aims. However, a field of study developed over the last few decades has been directly concerned with capturing meaning in a form sufficiently rigorous to enable it to be expressed computationally. This field is computational intelligence, whose primary aim is the automation of intelligent behaviour, including the representation and transfer of meaning. A key area of interest has been the development of expressive and relatively complex ways of representing meaning, including formats such as frames, scripts, narratives, the event calculus, fuzzy sets, and rough sets. The last approach has received considerable attention in the computational intelligence literature since its development by Zdzislaw Pawlak in the early 1980s.

Rough sets theory operates on an information system, which consists of a set of objects. The concept of an object is broad enough to include both qualitative and quantitative data structures. The only assumption required is that each object has associated with it a set of attributes used to describe the object. Because rough sets theory is derived from set theory, the usual assumptions of traditional quantitative research techniques do not apply. A rough set is formed from two sets, referred to as the lower and upper approximations. The lower approximation contains objects that are definitely in the set. The remaining objects are either definitely not in the set, or their set membership is unknown. The set of objects whose membership is unknown is called the boundary region. The upper approximation is the union of the lower approximation and the boundary region. Results of analyses using rough sets theory are usually presented as sets of rules linking attributes of objects.

To date, the majority of published research applications have concentrated on rough classification, where at least one of the attributes partitions the information system into pre-existing subgroups. A more recent extension of this approach is rough clustering, where the information system has no pre-existing subgroups, and the technique identifies these subgroups or clusters. This paper provides an overview of rough sets theory, and shows how the theory can be used in both classification and clustering tasks. It presents the results of an analysis of a number of simplified data sets, and demonstrates how rough sets theory can be applied to problems in business research.



2. Rough Sets Overview

The concept of rough or approximation sets was introduced by Pawlak (1982, 1991), and is based on a basic assumption: with every object (referred to as a data record in traditional statistical analysis) of an information system (data matrix) there is associated a certain amount of information. This information is expressed by means of attributes (variables) used as descriptions of the objects. For example, if the objects are customers in a database, the information about the customers consists of various measures such as demographics and purchase history. The data is treated from the perspective of set theory, and none of the traditional assumptions of multivariate analysis are relevant. For an introduction to rough sets, see Pawlak (1991) or Munakata (1998).

2.1 Information systems and indiscernibility

The complete information system expresses all the knowledge available about the objects being studied. More formally, the information system is a pair, S = ( U, A ), where U is a non-empty finite set of objects called the universe and A = { a_1, …, a_j } is a non-empty finite set of attributes on U. With every attribute a ∈ A we associate a set V_a such that a : U → V_a. The set V_a is called the domain or value set of a. In statistical terms, this value set equates to the range of values associated with a specific variable.

The initial detailed data contained in the information system is used as the basis for the development of subsets of the data that are "coarser" or "rougher" than the original set. As with any data analysis technique, detail is lost, but the removal of detail is controlled to uncover the underlying characteristics of the data. The technique works by 'lowering the degree of precision in data, based on a rigorous mathematical theory. By selecting the right roughness or precision of data, we will find the underlying characteristics' (Munakata 1998, p. 141).

A core concept of rough sets theory is that of equivalence between objects (called indiscernibility). Objects in the information system about which we have the same knowledge form an equivalence relation. Let S = ( U, A ) be an information system; then with any B ⊆ A there is associated an equivalence relation, IND_A(B), called the B-indiscernibility relation. It is defined as:

IND_A(B) = { ( x, x' ) ∈ U² | ∀ a ∈ B, a( x ) = a( x' ) }    (1)

If ( x, x' ) ∈ IND_A(B), then the objects x and x' are indiscernible from each other when considering the subset B of attributes. Each such equivalence relation partitions the universe into equivalence classes, which can then be used to build new subsets of the universe.
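As an illustration of equation (1), the following minimal sketch (in Python) partitions a universe into B-indiscernibility classes. It assumes the information system is stored as a dictionary mapping object identifiers to attribute dictionaries; the function name and toy data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def indiscernibility_classes(universe, B):
    """Partition the universe into equivalence classes of the
    B-indiscernibility relation: objects fall into the same class
    when they agree on every attribute in B (equation (1))."""
    classes = defaultdict(list)
    for obj_id, attributes in universe.items():
        key = tuple(attributes[a] for a in B)   # attribute signature over B
        classes[key].append(obj_id)
    return list(classes.values())

# Toy information system: four customers described by two attributes.
U = {
    1: {"age_band": "18-24", "gender": "F"},
    2: {"age_band": "18-24", "gender": "F"},
    3: {"age_band": "25-34", "gender": "M"},
    4: {"age_band": "18-24", "gender": "M"},
}
print(indiscernibility_classes(U, ["age_band", "gender"]))
# [[1, 2], [3], [4]]  -- objects 1 and 2 are indiscernible over B
```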

2.2 Lower and upper approximations

Let S = ( U, A ) be an information system, and let B ⊆ A and X ⊆ U. We can describe the subset X using only the information contained in the attribute values from the subset B by constructing two subsets, referred to as the B-lower and B-upper approximations of X, and denoted B_*(X) and B^*(X) respectively, where:

B_*(X) = { x | [x]_B ⊆ X }    (2)

B^*(X) = { x | [x]_B ∩ X ≠ ∅ }    (3)

The lower approximation (LA), defined in (2), contains objects that are definitely in the subset X and the upper approximation (UA), defined in (3), contains objects that may or may not be in X. A third subset is also useful in analysis, the boundary region, which is the difference between the upper and lower approximations. This definition of a rough (approximate) set in terms of two other sets is the simple but powerful insight contributed by Pawlak.
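A minimal sketch of equations (2) and (3) follows, assuming the B-indiscernibility partition of the universe has already been computed (for example, as in the previous sketch). The function name and the toy partition are illustrative only.

```python
def approximations(partition, X):
    """Compute the B-lower and B-upper approximations of a target set X,
    given the partition of U induced by the B-indiscernibility relation.
    Each block [x]_B is either wholly inside X (lower approximation),
    overlapping X (upper approximation), or disjoint from X."""
    X = set(X)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= X:          # [x]_B is a subset of X -> definitely in X
            lower |= block
        if block & X:           # [x]_B intersects X     -> possibly in X
            upper |= block
    boundary = upper - lower    # membership undecidable from B alone
    return lower, upper, boundary

# Equivalence classes over some attribute subset B, and a target concept X.
partition = [{1, 2}, {3}, {4, 5}]
X = {1, 2, 4}
print(approximations(partition, X))
# lower = {1, 2}, upper = {1, 2, 4, 5}, boundary = {4, 5}
```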

2.3 Decision rules

To date, most of the published literature on rough sets has concentrated on a specific type of information system, referred to as a decision system (see Section 3 below). In a decision system, at least one of the attributes is a decision attribute. This decision attribute partitions the information system into groups (in rough set terminology, concepts). The problem is expressed in rough sets theory as finding mappings from the partitions induced by the equivalence relations in the condition attributes to the partitions induced by the equivalence relations in the decision attribute(s). These mappings are usually expressed in terms of decision rules.


More formally, with an information system S = ( U, A ), we can associate a formal language L(S). Expressions in this language are logical formulas built up from attributes and attribute-value pairs and standard logical connectives (Pawlak 1999). A decision rule in L is an expression φ → ψ (read if φ then ψ), where φ and ψ are respectively the conditions and decisions of the rule. Each rule can be assigned a confidence factor, which is the number of objects in the attribute subset that also satisfy the decision subset (concept), divided by the total number of objects in the attribute subset. Let φi be a partition of the condition attributes and ψj be a partition of the decision attribute (concept). The confidence factor, α , for a rule φ → ψ is:

"=

| #1 $ % 1 | | #1 |

(4)

where | A | is the cardinality of set A.
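The confidence factor in (4) reduces to a ratio of set cardinalities. The sketch below assumes the objects satisfying the condition and the objects belonging to the decision concept are available as sets of object identifiers; the identifiers themselves are hypothetical, chosen only so the counts match the "Expensive" row for Non-dancers in Table 1 below.

```python
def rule_confidence(condition_objects, decision_objects):
    """Confidence factor of a decision rule phi -> psi (equation (4)):
    the fraction of objects satisfying the condition that also belong
    to the decision class (concept)."""
    phi = set(condition_objects)
    psi = set(decision_objects)
    return len(phi & psi) / len(phi) if phi else 0.0

# 45 respondents agree that "dancing is expensive"; 30 of them are non-dancers.
phi = set(range(1, 46))   # hypothetical IDs of objects satisfying the condition
psi = set(range(1, 31))   # hypothetical IDs of objects in the decision class
print(round(rule_confidence(phi, psi), 2))   # 0.67
```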

2.4 Templates

A useful extension of rough sets theory is the concept of a template, as described in Nguyen (2000). Let S = ( U, A ) be an information system. Any clause of the form D = ( a ∈ V_a ) is called a descriptor, with the value set V_a called the range of D. A template is a conjunction of unique descriptors defined over attributes from B ⊆ A. Any propositional formula T = Λ_{a ∈ B} ( a ∈ V_a ) is called a template of S. Template T is simple if every descriptor of T has a range of one element. Templates with descriptors having a range of more than one element are called generalized.
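As a sketch of how a template might be represented and tested, the fragment below encodes a template as a mapping from attributes to allowed value ranges: singleton ranges give simple templates, larger ranges give generalized ones. The representation, function name, and attribute values are illustrative assumptions, not taken from Nguyen (2000).

```python
def matches_template(obj, template):
    """Check whether an object satisfies a template, represented here as
    a dict mapping attribute names to their allowed value ranges.
    A simple template has singleton ranges; a generalized template may
    allow several values per descriptor."""
    return all(obj[attr] in allowed for attr, allowed in template.items())

# Simple template: Package = "VU" AND Place = "VU".
simple = {"Package": {"VU"}, "Place": {"VU"}}
# Generalized template: Price in {"I", "VI"} AND Package in {"I", "VI"}.
generalized = {"Price": {"I", "VI"}, "Package": {"I", "VI"}}

obj = {"Image": "I", "Package": "VI", "Price": "I", "Alcohol": "U", "Place": "U"}
print(matches_template(obj, simple), matches_template(obj, generalized))
# False True
```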

2.5 Further reading

Rough sets theory has developed an extensive literature well beyond the brief introduction provided here, and the interested reader is referred to Orlowska (1998), Peters and Skowron (2004), and Polkowski, Tsumoto and Lin (2000) for recent comprehensive overviews of developments in the field. One common extension is that of probabilistic rough sets, which overcomes some of the limitations of the canonical rough sets theory originally developed (Pawlak, Wong & Ziarko 1988). In recent years, there have been numerous edited books and conferences extending Pawlak's original insight into new areas of application and theory (e.g. Lin and Cercone 1997; Polkowski & Skowron 1998; Polkowski, Tsumoto & Lin 2000; Wang, Liu, Yao & Skowron 2003; Zhong, Skowron & Ohsuga 1999). In the last decade, rough set approaches have been applied to a variety of business problems, such as predicting business failure (Beynon & Peel 2001; Dimitras, Slowinski, Susmaga & Zopounidis 1999; McKee 2000; Zopounidis, Slowinski, Doumpos, Dimitras & Susmaga 1999), stock market analysis (Golan & Ziarko 1995; Grzymala-Busse 1997; Shen & Loh 2004), marketing (Beynon, Curry & Morgan 2001), and tourism (Goh & Law 2003).

3. Rough Classification

3.1 Classification

Classification is a fundamental technique in both traditional data analysis and in data mining. It involves assigning new objects to pre-existing groups on the basis of current information regarding those objects whose group membership is known. Classification techniques based on traditional statistical approaches include discriminant function analysis and logistic regression (Hair, Anderson, Tatham and Black 1998). Discriminant function analysis is used to determine which variables discriminate between two or more pre-existing groups. For example, a marketing researcher may want to investigate which variables discriminate between high-use consumers and non-consumers. The researcher would collect data on a range of variables that may aid in this discrimination – for example, demographic and psychographic variables. Discriminant analysis would then be used to determine which variable(s) are the best predictors of the consumer's purchase intention.

3.2 Rough classification

Early applications of rough sets theory used the technique for classification problems, where prior group membership is known. Results are usually expressed in terms of rules for group membership (Grzymala-Busse & Zou 1998; Pawlak 1984). Rough classification techniques are widely used for problems involving decision-making (Greco, Matarazzo & Slowinski 2000; Pawlak 2000, 2001).

3.3 Example

A rough classification algorithm was applied to a study of the attitudes of young adults (18 to 24 years) towards attending formal dancing classes. As part of a wider study, 116 participants were asked to subjectively rank their attitudes towards dancing on a five-point Likert scale (from Strongly Agree to Strongly Disagree). Two groups were identified through self-report – Non-dancers (59 people) and Dancers (57 people). Two demographic variables were obtained – gender and age. For the purpose of this example, four attitudinal variables were considered – dancing is a female activity, dancing is expensive, prefer sport to dancing, and dancing is a good way to meet people.

Table 1: Rough classification of dancing preference data

                               Non-dancers                     Dancers
Variable          | φi |²      | φi ∩ ψj |¹      α³            | φi ∩ ψj |¹      α³
Gender = female      62             30           0.48               32            0.52
Age < 20             70             20           0.28               50            0.71
Female activity      30             17           0.57               13            0.43
Expensive            45             30           0.67               15            0.33
Prefer sport         51             36           0.71               15            0.29
Meet people          42             20           0.48               22            0.52

¹ Cardinal value of the intersection between the condition attribute and the decision attribute
² Cardinal value of the condition attribute
³ Confidence factor

Table 1 shows the results of the rough classification analysis. The second column shows the cardinal value of the condition attribute, that is, the number of people who satisfy the condition shown in the variable column. For the four attitude variables, the table shows the number of people who agree with the attitudinal statement. The third column shows the cardinal value of the intersection between the condition attribute and the decision attribute for the non-dancer group. Similarly, the fifth column shows the cardinal value of the intersection between the condition attribute and the decision attribute for the dancer group. The fourth and sixth columns show the corresponding confidence factors, as calculated by equation (4). The results show that the main differences between the dancing and non-dancing groups are accounted for by age (dancers are younger), perceived cost (non-dancers consider the activity too expensive), and sport (non-dancers prefer sport). No differences were found based on gender, considering dancing a female activity, or considering dancing as a way to meet people.
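The quantities in Table 1 can be generated directly from a coded data set. The sketch below is one possible implementation, assuming each respondent is coded as a dictionary of attribute values; the attribute names, condition predicates, and the three-respondent sample are hypothetical.

```python
def classification_table(objects, conditions, decision_attr, decision_values):
    """For each condition predicate, report the cardinality of the matching
    set and, for each decision class, the intersection cardinality and the
    confidence factor (equation (4)) -- the quantities shown in Table 1."""
    rows = []
    for label, pred in conditions.items():
        phi = [o for o in objects if pred(o)]
        row = {"condition": label, "|phi|": len(phi)}
        for d in decision_values:
            inter = [o for o in phi if o[decision_attr] == d]
            row[d] = (len(inter), round(len(inter) / len(phi), 2) if phi else 0.0)
        rows.append(row)
    return rows

# Hypothetical coding of the dancing survey: each respondent is one dict.
sample = [
    {"gender": "F", "age": 19, "expensive": "agree",    "group": "non-dancer"},
    {"gender": "M", "age": 22, "expensive": "disagree", "group": "dancer"},
    {"gender": "F", "age": 23, "expensive": "agree",    "group": "dancer"},
]
conditions = {
    "Gender = female": lambda o: o["gender"] == "F",
    "Age < 20":        lambda o: o["age"] < 20,
    "Expensive":       lambda o: o["expensive"] == "agree",
}
for row in classification_table(sample, conditions, "group", ["non-dancer", "dancer"]):
    print(row)
```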

4. Rough Clustering

4.1 Cluster analysis

Cluster analysis is a second fundamental technique in both traditional data analysis and in data mining. The technique is defined as grouping 'individuals or objects into clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters' (Hair et al. 1998, p. 470). Many clustering methods have been identified, including partitioning, hierarchical, nonhierarchical, overlapping, and mixture models. One of the most commonly used nonhierarchical methods is the k-means approach (MacQueen 1967).


In the last few decades, as data sets have grown in size and complexity, and the field of data mining has matured, many new techniques based on developments in computational intelligence have started to be more widely used as clustering algorithms. A technique currently receiving considerable attention is the theory of rough sets (Pawlak 1991). Applied to clustering problems, the technique is referred to as rough clustering.

4.2 Rough clustering

The concept of a rough cluster was introduced in Voges, Pope and Brown (2002) as a simple extension of the notion of rough sets described in Section 2. A rough cluster was defined in a similar manner to a rough set, that is, with a lower and upper approximation. The lower approximation of a rough cluster contains objects that belong only to that cluster. The upper approximation of a rough cluster contains objects in the cluster that are also members of other clusters.

To use the theory of rough sets in clustering, the value set ( V_a ) needs to be ordered. This allows a measure of the distance between each object to be defined. Distance is a form of similarity, which relaxes the strict requirement of indiscernibility outlined in canonical rough sets theory and allows the inclusion of objects that are similar rather than identical. Clusters of objects are then formed on the basis of their distance from each other. An important distinction between rough clustering and traditional clustering approaches is that, with rough clustering, an object can belong to more than one cluster.

More formally, let S = ( U, A ) be an information system, where U is a non-empty finite set of M objects (1 ≤ i ≤ M), and A is a non-empty finite set of N attributes (1 ≤ j ≤ N) on U. The j-th attribute of the i-th object has value R( i, j ) drawn from the ordered value set V_a. For any pair of objects, p and q, the distance between the objects is defined as:

D( p, q ) = Σ_{j=1}^{N} | R( p, j ) − R( q, j ) |    (5)

That is, the absolute differences between the values of each object pair's attributes are summed. The distance measure ranges from 0 (indicating indiscernible objects) to a maximum determined by the number of attributes and the size of the value set for each attribute. The clustering algorithm described in Voges, Pope and Brown (2002) used this distance measure (5) to construct a similarity matrix, and each object pair in this similarity matrix was assigned to existing or new clusters depending on whether neither, one, or both objects in the pair were currently assigned. Problems with this approach were the large number of clusters generated and uncertainty as to whether the lower approximations of each cluster provided the best coverage of the data set.

A different approach was followed by do Prado, Engel and Filho (2002), who used reducts to develop clusters. Reducts are subsets of the attribute set A that provide the same information as the original data set. The reducts were used as initial group centroids, which were then grouped together to form clusters. One problem with this approach is that not all information systems have reducts, and some sets of reducts overlap, which means that the cluster centroids are not necessarily well separated.

Another approach was followed by Lingras (2001, 2002), who used a genetic algorithm in which the genome comprised two sections: lower approximation membership and upper approximation membership. This approach also had a number of limitations. The algorithm required repair operators, as some invalid genes were randomly generated, and the number of clusters needed to be specified in advance. This preliminary knowledge is not always available for larger data sets.

Voges and Pope (2004) presented an extension to their original rough clustering approach that attempted to overcome the limitations of these previous attempts. Their approach used an evolutionary algorithm to maximize the coverage of the data set by the clusters, without prespecifying the number of clusters required, and without relying on structural characteristics of the cluster such as reducts.
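The following sketch illustrates the distance measure (5) and one possible reading of the pair-based assignment idea described above, with objects encoded as tuples of ordinal codes. The threshold parameter and the assignment rules are simplifying assumptions for illustration, not the published algorithm.

```python
from itertools import combinations

def distance(p, q):
    """Sum of absolute attribute differences (equation (5));
    objects are encoded as tuples of ordinal codes."""
    return sum(abs(a - b) for a, b in zip(p, q))

def rough_cluster_pairs(objects, threshold):
    """Simplified sketch of pair-based rough clustering: object pairs are
    taken in order of increasing distance (up to a threshold); a pair whose
    members are both unassigned starts a new cluster, otherwise an
    unassigned member joins its partner's cluster(s). Objects may end up
    in more than one cluster, giving each cluster lower (exclusive) and
    upper (inclusive) regions. Illustrative only, not the 2002 algorithm
    verbatim."""
    membership = {i: set() for i in range(len(objects))}
    clusters = []
    pairs = sorted(combinations(range(len(objects)), 2),
                   key=lambda pq: distance(objects[pq[0]], objects[pq[1]]))
    for i, j in pairs:
        if distance(objects[i], objects[j]) > threshold:
            break                              # remaining pairs are too dissimilar
        if not membership[i] and not membership[j]:
            clusters.append({i, j})            # both unassigned: new cluster
            cid = len(clusters) - 1
            membership[i].add(cid)
            membership[j].add(cid)
        else:
            for a, b in ((i, j), (j, i)):
                if not membership[a]:          # join the partner's clusters
                    for cid in membership[b]:
                        clusters[cid].add(a)
                        membership[a].add(cid)
    return clusters, membership

objects = [(1, 1), (1, 2), (4, 4), (4, 3), (2, 4)]   # ordinal-coded survey answers
print(rough_cluster_pairs(objects, threshold=1))
```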



4.3 Example

An evolutionary-algorithm-based rough clustering algorithm was applied to a study of the beer preferences of "emerging drinkers" (i.e. young adults experiencing alcohol consumption for the first time). See Voges and Pope (2004) for a more detailed outline of the data structure used and how the cluster solution was derived. As part of a wider study, 174 participants were asked to subjectively rank attributes of beer on a four-point scale (covering Very Important, Important, Unimportant, and Very Unimportant) in terms of which attributes they considered when making a purchasing decision. Five attributes were used: image, packaging, price, alcohol content, and place sold. The data was used to conduct a rough cluster analysis, partitioning the participants into distinct clusters depending on which beer attributes were considered important.

To create a viable description of a cluster using templates (see Section 2.4), at least two attributes from B need to be chosen. This results in compact, but non-trivial, descriptions of the rough cluster. In this example, only simple templates are used. However, the technique could easily be extended to include generalized templates, incorporating intervals of attributes (i.e. using a similarity relation rather than an indiscernibility relation). The cluster solution, C, is defined as any conjunction of k unique templates:

C = T_1 Λ … Λ T_k    (6)

A template describes a partition of U, and the conjunction of templates contained in a cluster solution results in some templates having both LAs (that is, objects satisfying one template only) and UAs (that is, objects satisfying more than one template). Consequently C is a rough cluster solution.

The "best" cluster solution obtained is shown in Table 2. This cluster solution achieved a coverage of 94.8% of the data set. Table 2 shows the thirteen templates that comprise this cluster solution, and the size of the lower and upper approximations for each template. The accuracy of each template ranged from 0.50 to 1.00.

A number of interesting cluster descriptions are apparent in the cluster solution. Template 13 shows an "image conscious" cluster, unconcerned with price, and Template 12 shows a weaker version of this. Templates 8 to 11 show various combinations of importance assigned to price and packaging. Templates 6 and 7 show another group whose major concern is the level of alcohol content in the beer. Templates 3 and 4 relate to the importance of the purchase location. Templates 1, 2 and 5 are difficult to interpret, as they show the "unimportance" of certain attributes, but don't trade these off against "important" attributes.

Table 2: "Best" cluster solution for rough cluster analysis of beer preference data

T     Image   Package   Price   Alcohol   Place   | [T_i]^* X |¹   | [T_i]_* X |²
1       *       VU        *        *       VU          26               25
2       *       VU        *        *        U          15               14
3       *       VU        *        *        I          12               11
4       *       VU        *        *       VI          10                9
5       *        U        *        U        *          11                9
6       *        U        *        I        *          11               11
7       *        U        *       VI        *          10                8
8       *        I        I        *        *          18               18
9       *        I       VI        *        *          11               11
10      *       VI        I        *        *          13               13
11      *       VI       VI        *        *          15               15
12      I        *        U        *        *          12                6
13     VI        *       VU        *        *           9                7

¹ Size of upper approximation
² Size of lower approximation
VI - Very Important, I - Important, U - Unimportant, VU - Very Unimportant, * - "Don't care"
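The template statistics reported in Table 2 (upper and lower approximation sizes, and overall coverage) can be computed as sketched below, assuming respondents are coded as attribute dictionaries and a cluster solution is given as a list of simple templates. The function names and the three-respondent sample are hypothetical.

```python
def template_matches(obj, template):
    """An object satisfies a simple template when it has the specified
    value for every attribute the template constrains."""
    return all(obj.get(attr) == val for attr, val in template.items())

def solution_summary(objects, templates):
    """For each template in a cluster solution, report the size of its
    upper approximation (all matching objects) and lower approximation
    (objects matching no other template), plus overall coverage --
    the quantities reported in Table 2."""
    match_sets = [{i for i, o in enumerate(objects) if template_matches(o, t)}
                  for t in templates]
    summary = []
    for k, matched in enumerate(match_sets):
        others = set().union(*(m for j, m in enumerate(match_sets) if j != k)) \
                 if len(match_sets) > 1 else set()
        summary.append({"template": k + 1,
                        "upper": len(matched),
                        "lower": len(matched - others)})
    covered = set().union(*match_sets) if match_sets else set()
    coverage = len(covered) / len(objects) if objects else 0.0
    return summary, round(coverage, 3)

# Hypothetical respondents coded on the five beer attributes.
objects = [
    {"Image": "VI", "Package": "U",  "Price": "VU", "Alcohol": "U", "Place": "U"},
    {"Image": "I",  "Package": "I",  "Price": "I",  "Alcohol": "U", "Place": "U"},
    {"Image": "U",  "Package": "VU", "Price": "I",  "Alcohol": "I", "Place": "VU"},
]
templates = [{"Package": "VU", "Place": "VU"},   # cf. Template 1
             {"Image": "VI", "Price": "VU"},     # cf. Template 13
             {"Package": "I", "Price": "I"}]     # cf. Template 8
print(solution_summary(objects, templates))
```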


5. Conclusion

The use of rough sets theory in classification and clustering tasks provides an alternative to quantitative techniques based on "traditional" statistical approaches. Due to space limitations, this paper has not made comparisons between rough set approaches to classification and clustering and traditional methods such as discriminant analysis and k-means clustering. However, such comparisons are available in the literature. See, for example, Beynon, Curry and Morgan (2001) for a comparison between rough classification and discriminant analysis, and Voges, Pope and Brown (2002) for a comparison between rough clustering and k-means clustering. More comparative studies are likely to be produced.

One interesting implication of rough sets theory is its applicability to qualitative analysis. A number of papers have been presented in this area. For example, Klopotek and Wierzchon (1998) present an extension of rough sets using diversity of support rather than frequency counts to model belief functions, and Huang and Tseng (2004) use rough sets to analyse representations of unstructured information contained in business cases. This promises to be a useful and interesting area of extension of the theory of rough sets.

6. Acknowledgements

Funding for research and conference attendance was provided by the Department of Management at the University of Canterbury. My thanks to Jolie Doig and Kelly Emanuel for providing the data used in the rough classification example, and to Michael Veitch and Kevin Nicholson for providing the data used in the rough clustering example.

References

Beynon, M., Curry, B. & Morgan, P. 2001, 'Knowledge discovery in marketing: An approach through rough set theory', European Journal of Marketing, vol. 35, no. 7/8, pp. 915–935.
Beynon, M.J. & Peel, M.J. 2001, 'Variable precision rough set theory and data discretisation: An application to corporate failure prediction', Omega, vol. 29, pp. 561–576.
do Prado, H.A., Engel, P.M. & Filho, H.C. 2002, 'Rough clustering: An alternative to find meaningful clusters by using the reducts from a dataset', in Rough Sets and Current Trends in Computing: Third International Conference, RSCTC 2002, Lecture Notes in Computer Science, Vol. 2475, eds J.J. Alpigini, J.F. Peters, A. Skowron & N. Zhong, Springer-Verlag, Berlin, pp. 234–238.
Dimitras, A.I., Slowinski, R., Susmaga, R. & Zopounidis, C. 1999, 'Business failure prediction using rough sets', European Journal of Operational Research, vol. 114, pp. 263–280.
Goh, C. & Law, R. 2003, 'Incorporating the rough sets theory into travel demand analysis', Tourism Management, vol. 24, pp. 511–517.
Golan, R. & Ziarko, W. 1995, 'A methodology for stock market analysis utilizing rough sets theory', in Proceedings IEEE/IAFE 1995 Conference on Computational Intelligence in Financial Engineering, New York, pp. 32–40.
Greco, S., Matarazzo, B. & Slowinski, R. 2000, 'Extension of the rough set approach to multicriteria decision support', INFOR, vol. 38, no. 3, pp. 161–195.
Grzymala-Busse, J.W. 1997, 'A new version of the rule induction system LERS', Fundamenta Informaticae, vol. 31, pp. 27–39.
Grzymala-Busse, J.W. & Zou, X. 1998, 'Classification strategies using certain and possible rules', in Rough Sets and Current Trends in Computing: First International Conference, RSCTC 98, Lecture Notes in Computer Science, Vol. 1424, eds L. Polkowski & A. Skowron, Springer, Berlin, pp. 37–44.
Hair, J.E., Anderson, R.E., Tatham, R.L. & Black, W.C. 1998, Multivariate Data Analysis, 5th edn, Prentice-Hall International, London.
Huang, C-C. & Tseng, T-L. 2004, 'Rough set approach to case-based reasoning application', Expert Systems with Applications, vol. 26, pp. 369–385.
Klopotek, M. & Wierzchon, S.T. 1998, 'A new qualitative rough-set approach to modeling belief functions', in Rough Sets and Current Trends in Computing: First International Conference, RSCTC 98, Lecture Notes in Computer Science, Vol. 1424, eds L. Polkowski & A. Skowron, Springer, Berlin, pp. 346–354.



Lin, T.Y. & Cercone, N. (eds) 1997, Rough Sets and Data Mining: Analysis of Imprecise Data, Kluwer, Boston.
Lingras, P. 2001, 'Unsupervised rough set classification using GAs', Journal of Intelligent Information Systems, vol. 16, no. 3, pp. 215–228.
Lingras, P. 2002, 'Rough set clustering for web mining', in Proceedings of the 2002 IEEE International Conference on Fuzzy Systems.
MacQueen, J. 1967, 'Some methods for classification and analysis of multivariate observations', in Proceedings of the Fifth Berkeley Symposium on Mathematics, Statistics and Probability, Volume 1, eds L.M. Le Cam & J. Neyman, University of California Press, Berkeley, pp. 281–298.
McKee, T.E. 2000, 'Developing a bankruptcy prediction model via rough sets theory', International Journal of Intelligent Systems in Accounting, Finance and Management, vol. 9, no. 3, pp. 159–173.
Munakata, T. 1998, Fundamentals of the New Artificial Intelligence: Beyond Traditional Paradigms, Springer-Verlag, New York.
Nguyen, S.H. 2000, 'Regularity analysis and its applications in data mining', in Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems, eds L. Polkowski, S. Tsumoto & T.Y. Lin, Physica-Verlag, Heidelberg, pp. 289–378.
Orlowska, E. 1998, Incomplete Information: Rough Set Analysis, Physica-Verlag, Heidelberg.
Pawlak, Z. 1982, 'Rough sets', International Journal of Information and Computer Sciences, vol. 11, no. 5, pp. 341–356.
Pawlak, Z. 1984, 'Rough classification', International Journal of Man-Machine Studies, vol. 20, pp. 469–483.
Pawlak, Z. 1991, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer, Boston.
Pawlak, Z. 1999, 'Decision rules, Bayes' rule and rough sets', in New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, eds N. Zhong, A. Skowron & S. Ohsuga, Springer, Berlin, pp. 1–9.
Pawlak, Z. 2000, 'Rough sets and decision analysis', INFOR, vol. 38, no. 3, pp. 132–144.
Pawlak, Z. 2001, 'Rough sets and decision algorithms', in Rough Sets and Current Trends in Computing (Second International Conference, RSCTC 2000), eds W. Ziarko & Y. Yao, Springer, Berlin, pp. 30–45.
Pawlak, Z., Wong, S.K. & Ziarko, W. 1988, 'Rough sets: Probabilistic versus deterministic approach', International Journal of Man-Machine Studies, vol. 29, pp. 81–95.
Peters, J.F. & Skowron, A. (eds) 2004, Transactions on Rough Sets I, Springer-Verlag, Berlin.
Polkowski, L. & Skowron, A. (eds) 1998, Rough Sets and Current Trends in Computing (First International Conference, RSCTC 98), Springer, Berlin.
Polkowski, L., Tsumoto, S. & Lin, T.Y. (eds) 2000, Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems, Physica-Verlag, Heidelberg.
Shen, L. & Loh, H.T. 2004, 'Applying rough sets to market timing decisions', Decision Support Systems, vol. 37, pp. 583–597.
Voges, K.E., Pope, N.K.Ll. & Brown, M.R. 2002, 'Cluster analysis of marketing data examining online shopping orientation: A comparison of k-means and rough clustering approaches', in Heuristics and Optimization for Knowledge Discovery, eds H.A. Abbass, R.A. Sarker & C.S. Newton, Idea Group Publishing, Hershey, pp. 207–224.
Voges, K.E. & Pope, N.K.Ll. 2004, 'Generating compact rough cluster descriptions using an evolutionary algorithm', in GECCO 2004: Genetic and Evolutionary Computation Conference, Lecture Notes in Computer Science, Vol. 3103, eds K. Deb et al., Springer-Verlag, Berlin, pp. 1332–1333.
Wang, G., Liu, Q., Yao, Y. & Skowron, A. (eds) 2003, Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing: Proceedings 9th International Conference, RSFDGrC 2003, Springer, New York.
Zhong, N., Skowron, A. & Ohsuga, S. (eds) 1999, New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, Springer, Berlin.
Zopounidis, C., Slowinski, R., Doumpos, M., Dimitras, A.I. & Susmaga, R. 1999, 'Business failure prediction using rough sets', Fuzzy Economic Review, vol. 4, no. 1, pp. 3–33.
