Network Selection using Fuzzy Logic
Shubha Kher, Arun K. Somani, and Rohit Gupta
Dependable Computing and Networking Laboratory, Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011
E-mail: {shubha, arun, rohit}@iastate.edu

Abstract—Peer-to-peer technology offers many advantages but also poses many novel challenges for the research community. Modern peer-to-peer systems are characterized by large scale, poor reliability, and extreme dynamism of the participating nodes, with a continuous flow of nodes joining and leaving the systems. Selecting an optimal network requires estimating its rank from attributes such as storage, average load, cost, and reliability. There are multiple networks, each vying for users' business by providing novel services. The user in turn has to decide which network best meets its service requirements and join that network. In this paper, we propose a model with two ranking schemes, one network-specific and the other user-specific. Both schemes use fuzzy logic to rank different networks based on their attributes. The difference between the two lies in the dynamism they offer: the user-specific ranking criteria are flexible, while the network-specific scheme uses fixed criteria. The network-specific scheme provides a structure for visualizing the overall performance index of the networks. The user-specific scheme is adaptive in the sense that it caters to the specific needs of the users. Our simulations show that the two schemes are lightweight, highly accurate, and easy to implement.

I. INTRODUCTION
In this paper we propose novel network selection schemes based on fuzzy logic. The schemes use fuzzy logic to rank different networks based on the services they offer (as advertised by the respective network operators) and on user requirements. The networks under consideration are peer-to-peer (P2P) based grid systems, and the applications pertain to distributed computations. Much of the work on grid computing to date has assumed the existence of static grid sites, which have out-of-band trust relationships among themselves. However, there is also a growing body of research that allows small workgroups and laboratories to develop a less formal and less organized grid environment. Such an ad-hoc computing environment eliminates the need for grid infrastructure administration and instead offers users a more peer-to-peer type of grid environment (for more details on ad-hoc grid computing see [6]).

Peer-to-peer networks [1–5] are flexible distributed systems that allow nodes (also called peers) to act as both clients and servers to access and provide services to each other. P2P is a powerful emerging networking paradigm that permits sharing of virtually unlimited data and computational resources in a completely distributed, fault-tolerant, scalable, and flexible manner. In the near future, with the tremendous growth in the popularity of P2P and grid computing, and the possible confluence of the two technologies (whereby large grids are created and organized in a P2P fashion), one can envision several networks capable of catering to users' needs. We refer to such networks simply as P2P systems or networks. Moreover, one can imagine that existing research grids would be commercialized by offering short-term membership to users with general computation requirements in terms of CPU cycles, storage space, etc. These multiple networks would be vying for users' business by providing novel differentiated services. The user in turn has to decide which network best meets its goals for task execution and join that network. Users of such networks would have more options at their disposal and the flexibility of joining the network that offers them the “best” service. The word “best” as used here is relative, since users would typically perceive services differently depending on their own goals and requirements.

As the tasks to be executed become more complicated and the services offered by the networks become more varied, the selection decision can no longer be based on a single parameter, and selecting an optimal network to join becomes a difficult problem. To model this imperfect information with interdependencies and loose boundaries, and to devise a mechanism for evaluating the service offerings of different networks, we have explored the role of fuzzy logic, which can incorporate the imprecision and infer a crisp output value. We propose two novel schemes for network selection using fuzzy logic.

The research reported in this paper is funded in part by the Jerry R. Junkins Chair position at Iowa State University.

0-7803-9277-9/05/$20.00/©2005 IEEE
The first ranking technique, called network-specific ranking, uses fuzzy logic to rank different networks based on the services they offer (as advertised by the respective network operators). This is a generalized scheme that considers all available attributes to rank a network. It assigns equal weights to the attributes and applies them to rank the network. For example, if a network specifies the cost of service as 0.7, reliability as 0.6, reputation as 0.9, capacity as 0.4 and anonymity as 0.2, then the attributes cost and reliability will play a major role in generating a ranking index. The network-specific ranking technique ranks the networks by a composite index of all the attributes a network has to offer.

The second ranking technique, called user-specific ranking, considers the user requirements in addition to the rank generated by the first scheme, and ranks the networks based on the selection of the user. Suppose a user can afford a cost of up to 0.8 (where 0.8 is the normalized index of cost) and demands a capacity of at least 0.6, a reliability of not less than 0.8 and an anonymity of 0.5; the decision support system must then meet these constraints if possible.

The rest of the paper is organized as follows. Section II gives the selection mechanism. Section III explains our system model. Section IV shows how fuzzy logic can be used to rank networks. Section V and Section VI discuss the design of the network-specific scheme and the user-specific scheme, respectively, and evaluate their performance. We conclude the paper in Section VII.

II. NETWORK SELECTION STRATEGY
There are many instances in which several alternatives need to be assigned scores, and such problems have been widely studied in different areas, such as economics, political science, and statistics. Chess players are ranked according to the ELO scoring system [14]. Citation indexes, like the Journal Citation Reports database, are used for ranking scientific journals by their intellectual impact [13], [30]. Different methods for ranking candidates in elections are studied in [18, 24]. These ranking schemes employ a discrete function as the measurement metric. In social choice theory, social alternatives are ranked based on voters' preferences [15, 16]; see also [17], which emphasizes impossibility results. In the last decade, with the advent of the World Wide Web (WWW), numerous web search engines have developed efficient techniques for mining information from the Internet in response to user queries.
The newer search engines view the WWW as a directed graph, which is analyzed relative to the given query using methods based on the Perron-Frobenius theory of the eigenvectors of non-negative matrices. See, for instance, [35–37] and [38]. However, to the best of our knowledge, no prior work exists that ranks different networks based on their service offerings and user requirements. There are several research efforts that rate service providers based on the quality of service they provide. Another commonly used approach is to categorize service providers on the basis of their reputation, and several reputation management systems have been proposed in P2P and grid computing to differentiate among service providers.

For P2P systems we claim that a continuous function is the appropriate metric, and hence propose a dynamic model from the users' perspective. It may be hard for a user to define the parameters on which the decision to join a network should be based; in the real world, users would have only partial, imprecise knowledge about the networks. Such knowledge is a combination of some statistical information along with other derived variables such as cost and reputation.

For complex systems, physical or knowledge-based modelling is either time consuming or fails altogether. In such cases a data-based modelling approach using available system data may be successful [19]. However, the success of a data-based modelling approach depends decisively on a suitable selection of the actual input variables. A well-known method is correlation analysis, but it identifies only the linear dependencies between variables. It is possible to design a fuzzy method that detects both linear and nonlinear dependencies, is tailored to data-based modelling, and can be improved by adding rules generated from expert knowledge [20]. To model the imperfect information with interdependencies and loose boundaries, and to devise a mechanism for evaluating the service offerings of different networks, we have explored the role of fuzzy logic. The model we propose selects relevant sets of variables by evaluating the relevance of these sets depending on the chosen granularity. The granularity allows us to automatically generate the membership functions required for data-based modelling using the fuzzy method.

[Figure 1 omitted; labels: P2P Network, Note, Bulletin Board, User Agent layer, User]

Fig. 1. Users rely on specialized agents to select the network that best meets their requirements for distributed computation.

III. SYSTEM MODEL
Our system model consists of several P2P networks, out of which a user selects one to join for obtaining service. Each network advertises its characteristics in terms of its attributes. Nodes essentially join a network to obtain service but may also offer services. A network is composed of two kinds of nodes: user nodes and service nodes. A user node seeks service from a network and is willing to join if the desired service is offered. Once the node joins a network, it may also share its resources and become one of the service-provider nodes. Further, depending on its satisfaction after getting a service, the user may stay in or leave the network.

We use the following model for our ranking scheme. The set $S$ of $i$ networks to be ranked is given by
$S = \{N_1, N_2, \ldots, N_i\}$.
Each network $N_i$ consists of a set of $k_i$ nodes defined by $\{u_1(N_i), u_2(N_i), u_3(N_i), \ldots, u_{k_i}(N_i)\}$. Each network $N_i$ advertises information regarding its attributes, given by the set
$A = \{C_s, R_e, R_p, C_p, A_t\}$,
where
$C_s(N_i)$ = cost to join the network,
$R_e(N_i)$ = reliability of the network,
$R_p(N_i)$ = reputation of the network,
$C_p(N_i)$ = capacity available,
$A_t(N_i)$ = anonymity provided by the network.

Each network has a controlling entity called the network operator (NetOps) that manages the state of the network. The network operator is not necessarily one of the nodes in its network, but provides other services, such as bootstrapping new nodes (i.e., allowing new nodes to join the network), serving as a certification authority (CA), and providing a platform to users for community formation and resource sharing. The network operator typically charges users a fixed amount when they join the network.

A user has a fixed task to execute, for which it would like to join one of the several networks. The desirability of joining a network depends on how well the network would be able to execute the task on behalf of the user. The user assigns the task of selecting the best network to its UserAgentLayer. The attributes identified above are the input parameters of the UserAgentLayer and are described in detail below.

Cost ($C_s$): Networks advertise the cost of providing services through their NetOps. The cost depends on the service provided, and a network can quote a cost according to its resources. It is given by $C_s(N_i) = f_1(\text{history}, \text{feedback}, \text{initial cost})$, where $f_1$ is a function of the history information, the feedback from users and the initial cost, such that $C_s \in \mathbb{R}^+$.
Reliability ($R_e$): This is the measure of correctness in terms of executing the desired task and producing the correct result. It is represented by an index $R_e$ such that $0 \le R_e \le 1$.

Reputation ($R_p$): This is generated by the NetOps based on the past-performance information a user may report to the NetOps after getting service from the network. The NetOps periodically collect all the service feedback from the users and provide it to the agent as a history record. In addition, the NetOps keep a record of the rank index. If the history record is good, users tend to choose the same network again. If the history record is bad, users avoid joining this network. Reputation is given by $R_p(N_i) = f_2(\text{services offered}, \text{rank}, \text{feedback})$ such that $0 \le R_p \le 1$, where $f_2$ is a function of the services offered and the rank information along with the feedback from the user.

Capacity ($C_p$): A NetOps advertises the resources it has available and their capacity.

Anonymity ($A_t$): A user might like to remain anonymous upon entering a network. This might be purely for commercial reasons. For example, a user might like to hide information about the tasks it executes in the network from its competitors. Anonymity is represented as $0 \le A_t \le 1$.

The rank generated by the User Agent Layer is an ordered set of networks $S_{rn}$ given by
• $S_{rn} = \{N_1, N_2, \ldots, N_i\}$, such that $S_{rn} = S$ and $\mathrm{rank}(N_i) \ge \mathrm{rank}(N_j)\ \forall\, i \le j$, where $S_{rn}$ indicates the network-specific rank.
• Suppose the user specifies that the rank be computed with least cost and highest reliability; then the ranking model orders the networks by these attributes and calculates the rank that best meets the user requirements. This is given by $S_{ru} = N_j$ for some $j \in \{1, \ldots, i\}$.

NetOps post the information regarding their attributes on a bulletin board. Since the networks are essentially large-scale distributed systems, obtaining precise information about their attributes is very difficult. To account for this uncertainty, NetOps specify the values of the different attributes in terms of probability density distributions.

IV. FUZZY LOGIC MODULE FOR NETWORK SELECTION
A large number of proposals have recently been devoted to modelling nonlinear systems using fuzzy logic. Theoretically, Fuzzy Logic Controllers (FLC) have been proved to be universal approximators under certain weak assumptions [39]: they are capable of approximating any real continuous function on a compact set to arbitrary accuracy. In practice, many industrial problems have been successfully solved using fuzzy logic. Fuzzy logic has found successful applications in real-time control, automatic control, data classification, decision analysis, expert systems, time series prediction, robotics, and pattern recognition. (Readers can refer to [40] for a detailed description of the fuzzy logic concepts used in this paper.)
Fuzzy logic can represent human knowledge or experience as fuzzy rules [41]. However, in most existing FLCs, the shapes and internal parameters of the membership functions and the fuzzy rules are determined and tuned by operators through trial and error. The construction of a fuzzy system involves the selection of several parameters: position, shape and distribution of the membership functions, rule base construction, selection of logical operations, consequences of the rules, etc. This large number of degrees of freedom makes it difficult to select all these parameters at once. A typical approach is to fix the logical operations and the type of membership functions in advance using certain criteria. The remaining parameters are estimated from the data using different strategies, with a single objective, which is to minimize the approximation error between the output values and the values given by the fuzzy model.

[Figure 2 omitted; labels: NetOps; networks N1, N2, ..., Ni-1, Ni; Data sets; User Agent Layer; Fuzzy modules; Network-Specific Rank (Srn); User-Specific Rank (Sru); User; User joining as per the received rank]

Fig. 2. Ranking model

Fig. 3. Mamdani model for generalized ranking

Figure 2 gives the components of the ranking scheme. Here, the NetOps provide data (related to the attributes of a network) to the fuzzy modules. The fuzzy module is composed of two schemes that rank the networks either by the service offered or by the user demand. The fuzzy decision system in each scheme fits a nonlinear function to the data set at hand and derives a rank for the network. In this way the entire set $S$ of networks is evaluated and a rank $S_{rn}$ is displayed. The rank obtained from the network-specific ranking scheme is also used as input to the user-specific ranking scheme. This generates a new rank $S_{ru}$, which caters to a particular user's requirements.

V. NETWORK-SPECIFIC RANKING
The purpose of this ranking is to provide a composite rank for the networks that gives equal priority to all the input attributes. The network-specific ranking is developed using fuzzy logic, where the output of a rule is computed using the Mamdani method [40], as depicted in Figure 3. The idea behind using a Mamdani rule base is that the rules for many systems can easily be described by humans in terms of fuzzy variables. Thus we can effectively model a complex nonlinear system with common-sense rules on fuzzy variables. In this method the input and the output are fuzzy partitions of the original data sets. The output is computed by applying user-defined fuzzy rules to user-defined fuzzy variables. Figure 3 shows a system with five inputs and one output related by the rule set. For the ranking, we divided each input into three partitions, called memberships, and the output into five partitions. The rules are defined on this partition set. For a given value of the input data, the desired output is derived from a number of applicable rules. For a single condition, more than one rule may be applicable because the partitions overlap. The procedure is explained below.

Consider a system with two inputs $x$ and $y$ and one output $z$, with three membership functions each, $x(A_1, A_2, A_3)$, $y(B_1, B_2, B_3)$ and $z(C_1, C_2, C_3)$, and the following rules:
If $x$ is $A_1$ $\wedge$ $y$ is $B_1$ $\rightarrow$ $z$ is $C_1$;
If $x$ is $A_2$ $\wedge$ $y$ is $B_2$ $\rightarrow$ $z$ is $C_2$;
If $x$ is $A_3$ $\wedge$ $y$ is $B_3$ $\rightarrow$ $z$ is $C_3$.
Then the following steps are carried out.
• Fuzzification: convert the numerical values ($x = x_0$, $y = y_0$) on numerical scales into membership values, e.g., $\mu_{A_1}(x_0), \mu_{A_2}(x_0), \mu_{A_3}(x_0), \mu_{B_1}(y_0), \mu_{B_2}(y_0), \mu_{B_3}(y_0)$.
• Strength of the antecedents (fuzzy operation in the antecedents): suppose currently $x = x_0$, $y = y_0$. We evaluate the antecedents of the rules, i.e., “$x$ is $A_i$ $\wedge$ $y$ is $B_i$” for $i = 1, 2, 3$, from $\mu_{A_i}(x)$ and $\mu_{B_i}(y)$ using the fuzzy AND operation. This gives the strength of the antecedent of each rule: $S_i(x_0, y_0) = \min(\mu_{A_i}(x_0), \mu_{B_i}(y_0))$.
• Implication of each rule (Mamdani's method): each rule contributes to the fuzzy output as $D_i(z) = \min(S_i(x_0, y_0), \mu_{C_i}(z))$.
• Aggregation of the implied consequences of all rules: $C_{\mathrm{Aggregate}}(z) = \max(D_1(z), D_2(z), D_3(z))$.
• Defuzzification: translate the function $C_{\mathrm{Aggregate}}(z)$ into a crisp value $z_0$, using a suitable method such as centroid, bisector, etc.
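The five steps above can be traced in a few lines of code. The following is a minimal sketch only, not the authors' implementation: it assumes triangular membership functions on [0, 1] (the paper itself uses bell-shaped ones), reuses the same three shapes for x, y and z, and approximates the centroid on a uniform grid. The names `tri`, `SETS` and `mamdani` are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Assumed partitions on [0, 1]; the same shapes serve as (A1, A2, A3),
# (B1, B2, B3) and (C1, C2, C3) purely for illustration.
def low(v):
    return tri(v, -0.5, 0.0, 0.5)


def mid(v):
    return tri(v, 0.0, 0.5, 1.0)


def high(v):
    return tri(v, 0.5, 1.0, 1.5)


SETS = (low, mid, high)


def mamdani(x0, y0, steps=1001):
    """Return the crisp output z0, or None if no rule fires."""
    # Fuzzification and antecedent strength: S_i = min(mu_Ai(x0), mu_Bi(y0)).
    strengths = [min(m(x0), m(y0)) for m in SETS]
    num = den = 0.0
    for k in range(steps):
        z = k / (steps - 1)
        # Implication D_i(z) = min(S_i, mu_Ci(z)), aggregation by max.
        agg = max(min(s, c(z)) for s, c in zip(strengths, SETS))
        num += z * agg
        den += agg
    # Centroid defuzzification on the sampled grid.
    return num / den if den > 0 else None
```

With only the three "diagonal" rules of the example, an input such as (0.0, 1.0) fires no rule at all, which is why the sketch returns `None` in that case; a complete rule base would cover such combinations.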


A. Design
The inputs considered here are $C_s$, $R_e$, $R_p$, $A_t$ and $C_p$, and the output is $S_{rn}$; we consider the input-output data to be normalized between 0 and 1. The design includes the following steps.
• For a wide range of non-zero truth values of the inputs, the generalized gaussian bell membership function appears to be a better fit than trapezoidal and triangular membership functions. We therefore selected this membership function for all the inputs and the output. It is given by
$\mu(x; a, b, c) = \dfrac{1}{1 + |(x - c)/a|^{2b}}$,
where $c$ is the center of the bell, $b$ is a positive number, the slope at the crossover point $x = c + a$ is $-b/(2a)$, and the width of the bell (at membership 0.5) is $2a$.
• Each input is partitioned into three sets with membership functions (bad, moderate, good), and the output into five sets (very low, low, moderate, good, very good).
• Based on the partitions, and taking into consideration the interdependencies of the inputs, the rule set is designed. The philosophy underlying the design is given below.
  • Pessimistic: When more than two input variables are “poor/bad”, while the other three inputs are “good”, the rank is below the acceptable value and the network is likely to be rejected. However, if either of the two variables is in the “moderate” (0.6) range, the rank takes a marginally fair value.
  • Optimistic: When all the inputs are “moderate”, the rank takes the value of the judgement index, i.e., the acceptable value of the rank is met. Any NetOps meeting this value can compete to serve users. When all inputs are in the highest region (0.8), the rank is around 0.89. However, when four inputs are good (0.9), the rule base can disregard the fifth input value and treat it as a “don't care” (null value). The rule base supports this feature by using an “OR” operator. When this rule is fired the outcome is 0.956.
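As a quick check of the membership function used in the design, the sketch below evaluates the bell function and fuzzifies a normalized input into the three input partitions. The centers (0, 0.5, 1) and the shape parameters a = 0.25, b = 2 are assumptions chosen for illustration; the paper does not list the values it used. The names `gbell`, `PARTITIONS` and `fuzzify` are hypothetical.

```python
def gbell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Assumed (a, b, c) for the three input partitions on [0, 1].
PARTITIONS = {
    "bad": (0.25, 2, 0.0),
    "moderate": (0.25, 2, 0.5),
    "good": (0.25, 2, 1.0),
}


def fuzzify(x):
    """Map a normalized attribute value to its membership grade in each set."""
    return {label: gbell(x, *params) for label, params in PARTITIONS.items()}


def strongest_label(x):
    """Return the partition in which x has the highest membership grade."""
    grades = fuzzify(x)
    return max(grades, key=grades.get)
```

The membership equals 1 at the center c and 0.5 at c ± a, so 2a is the width of the bell at half height, in line with the definition above.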
Choosing between marginally overlapping partitions therefore requires refining the rules by adding weights to the output. Below we give an example of how this is done.
If $C_s$ is low $\wedge$ $R_e$ is low $\wedge$ $R_p$ is low $\wedge$ $C_p$ is low $\wedge$ $A_t$ is low $\rightarrow$ $S_{rn}$ is very low (0.2).
If $C_s$ is low $\wedge$ $R_e$ is moderate $\wedge$ $R_p$ is low $\wedge$ $C_p$ is low $\wedge$ $A_t$ is low $\rightarrow$ $S_{rn}$ is very low (0.4).
If $C_s$ is low $\wedge$ $R_e$ is moderate $\wedge$ $R_p$ is moderate $\wedge$ $C_p$ is moderate $\wedge$ $A_t$ is low $\rightarrow$ $S_{rn}$ is very low (0.7).
If $C_s$ is low $\wedge$ $R_e$ is moderate $\wedge$ $R_p$ is moderate $\wedge$ $C_p$ is moderate $\wedge$ $A_t$ is low $\rightarrow$ $S_{rn}$ is low (0.4).

Our goal is to identify a set of rules that can process all possible combinations of inputs. With this combination of input-output partitions we can have $3^5 = 243$ rules that do not conflict, because each input has three possibilities and there are five inputs. In general, the number of rules will be $k^n$, where $n$ is the number of inputs and $k$ is the number of partitions for each input. In a more general situation, the number of rules will be $\prod_{i=1}^{n} k_i$, where $k_i$ is the number of partitions for the $i$th input. This method of partitioning is called grid partitioning.

However, not all the rules may apply to the system, because a partition of one input may not combine with every partition of the other inputs. For example, if the input partitions of cost are low, moderate and high, and those of anonymity are also low, moderate and high, it may be that when anonymity is low the cost is never high, so the combination of low anonymity and high cost never occurs. Rules containing such a combination therefore never apply. Based on our knowledge of the input parameters, we designed a subset of the 243 rules that includes the possible combinations of the inputs. This is equivalent to choosing the rules manually by trial and error. Of course, this method is not well defined and requires knowledge of the problem domain, in addition to experimentation, to correct the rule set.

Simulations were carried out with a rule set of 105 rules to observe whether it generates the rank in accordance with the knowledge provided. The results confirm that the technique is effective and that the rank $S_{rn}$ generated matches the pessimistic and optimistic philosophy. Subsequently, in an attempt to develop a system with a minimum number of rules, we refined the rule set. The next simulation was carried out with 58 rules. These rules were selected using a simple intuitive heuristic that chooses, for each fuzzy combination, the one rule whose output most closely matches the output observed for the real system. Figure 4 is a graphic representation of the network-specific FIS model.
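The rule counts quoted above follow directly from enumerating the grid. The sketch below is illustrative only: it enumerates every antecedent combination for five inputs with three partitions each, and then prunes combinations using the single infeasibility example given in the text (low anonymity together with high cost). The labels, the `is_feasible` predicate and all function names are assumptions; the paper's actual 105-rule and 58-rule sets were chosen by hand and are not reproduced here.

```python
from itertools import product
from math import prod

INPUTS = ("Cs", "Re", "Rp", "Cp", "At")
LABELS = ("low", "moderate", "high")


def grid_rule_count(partitions):
    """Number of grid-partition rules: the product of the k_i."""
    return prod(partitions)


# Every antecedent combination: 3^5 = 243 tuples of labels.
ANTECEDENTS = list(product(LABELS, repeat=len(INPUTS)))


def is_feasible(rule):
    """Hypothetical domain constraint: low anonymity never pairs with high cost."""
    assignment = dict(zip(INPUTS, rule))
    return not (assignment["At"] == "low" and assignment["Cs"] == "high")


FEASIBLE = [rule for rule in ANTECEDENTS if is_feasible(rule)]
```

Each additional domain constraint of this kind removes a block of combinations, which is how a full grid can shrink toward a hand-tuned rule set.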
As stated earlier, Figure 4 shows five inputs (the first five columns) and one output (the last column) represented by their membership functions. Each row is a rule indicating the combination of inputs that contributes to generating $S_{rn}$, making a total of 58 rules. It is evident that, for a particular value of the inputs, more than one rule may be applicable. Applying the process of implication and defuzzification generates the composite rank index, which is shown in the last row of the last column. This represents the $S_{rn}$ of the network.

B. Evaluation of the Network-specific Model
The results of the ranking scheme with 105 rules are compared with those of the scheme with 58 rules. The results generated by either rule set are in accordance with the selection criteria. The results for the rule set of 58 rules are given in Figure 4. They indicate that, by choosing the rule set of 58 rules, we are not making any significant compromise on the ranking index. However, this is not an optimal design, and further investigation is needed to identify sufficient conditions for generating rules.


Fig. 4. Graphical representation of the rules: the set of 58 rules for the network-specific ranking and the output

VI. USER-SPECIFIC RANKING
This scheme is developed to allow users to specify their criteria by prioritizing the desired attributes for selecting a network. Here the objective is to define an output function that can assign different weight values to the inputs according to the user's choice. We consider a nonlinear mapping between input and output, with the weight index as the measure of the user's choice parameter, and develop an adaptive fuzzy model (ANFIS) using the Sugeno method [40].

A. Adaptive fuzzy model (ANFIS)
The type of fuzzy model first suggested by Takagi and Sugeno uses fuzzy inputs and rules, but its outputs are non-fuzzy. It provides a powerful tool for modelling complex nonlinear problems when combined with a network structure, as in the Adaptive Network based Fuzzy Inference System, or ANFIS [40]. ANFIS can be applied to nonlinear prediction of the ranking index of a network based on user-defined priorities. ANFIS is a class of adaptive multi-layer feed-forward networks that is functionally equivalent to a fuzzy inference system. It was proposed in an effort to formalize a systematic approach to generating fuzzy rules from an input-output data set. The steps involved in ANFIS are explained below.

A typical fuzzy rule in a Sugeno fuzzy system has the format
If $x$ is $A$ and $y$ is $B$ then $z = f(x, y)$,
where $A$ and $B$ are fuzzy sets in the antecedent and $z = f(x, y)$ is a crisp function in the consequent. Usually this function is a polynomial in the input variables $x$ and $y$, but it can be any other function that approximately describes the output of the system within the fuzzy region specified by the antecedent of the rule. When $f(x, y)$ is a first-order polynomial, the model is called a first-order Sugeno fuzzy model. Consider such a model that contains two rules:
If $x$ is $A_1$ $\wedge$ $y$ is $B_1$ $\rightarrow$ $f_1 = p_1 x + q_1 y + r_1$
If $x$ is $A_2$ $\wedge$ $y$ is $B_2$ $\rightarrow$ $f_2 = p_2 x + q_2 y + r_2$
The firing strengths $w_1$ and $w_2$ are usually obtained as the product of the membership grades of the premise part, and the output $f$ is the weighted average of the rules' outputs. Figure 5 shows the ANFIS structure, where nodes within the same layer perform functions of the same type, as detailed below.

[Figure 5 omitted; labels: inputs x1 to x5; membership nodes A1, A2, B1, B2, C1, C2, D1, D2, E1, E2; nodes TT1 to TT7; nodes n1 to n5; nodes 1 to 5; sum; output y; Layer 1 to Layer 6]

Fig. 5. ANFIS architecture

Note that $O_{j,i}$ denotes the output of the $i$th node in the $j$th computing layer. In the equations below the index $j$ does not count the input layer, so Layer 2 of Figure 5 produces $O_{1,i}$ and Layer 6 produces $O_{5}$.
• Layer 1: Contains the input nodes of the network.
• Layer 2: Each node in this layer generates a membership grade of a linguistic label. For instance, the node function of the $i$th node may be a generalized gaussian bell membership function:
$O_{1,i} = \mu_{A_i}(x) = \dfrac{1}{1 + |(x - c_i)/a_i|^{2 b_i}}$,    (1)
where $x$ is the input to node $i$; $A_i$ is the linguistic label (small, large, etc.) associated with this node; and $(a_i, b_i, c_i)$ is the parameter set that changes the shape of the membership function. Parameters in this layer are referred to as the premise parameters.
• Layer 3: Each node in this layer calculates the firing strength of a rule via multiplication:
$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2$    (2)
• Layer 4: Node $i$ in this layer calculates the ratio of the $i$th rule's firing strength to the total of all firing strengths:
$O_{3,i} = \bar{w}_i = \dfrac{w_i}{w_1 + w_2}, \quad i = 1, 2$    (3)
• Layer 5: Node $i$ in this layer computes the contribution of the $i$th rule towards the overall output, with the following node function:
$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i)$,
where $\bar{w}_i$ is the normalized firing strength from Layer 4 and $(p_i, q_i, r_i)$ is the parameter set of the rule. Parameters in this layer are referred to as the consequent parameters.
• Layer 6: The single node in this layer computes the overall output as the sum of the contributions of all rules:
$O_{5} = \sum_i \bar{w}_i f_i = \dfrac{\sum_i w_i f_i}{\sum_i w_i}$    (4)
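The layer-by-layer computation can be followed in code for the two-rule, two-input model used in Eqs. (1) to (4). This is a minimal forward-pass sketch under stated assumptions, not the MATLAB implementation used in the paper: the premise parameters `(a, b, c)` and consequent parameters `(p, q, r)` are passed in as plain tuples, and the function names are hypothetical.

```python
def gbell(x, a, b, c):
    """Layer 2 node: generalized bell membership grade, Eq. (1)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a first-order Sugeno ANFIS with one rule per entry.

    premise:    list of ((aA, bA, cA), (aB, bB, cB)), one pair per rule
    consequent: list of (p, q, r), one triple per rule
    Returns the overall output f and the normalized firing strengths.
    """
    # Layer 3: firing strength w_i = mu_Ai(x) * mu_Bi(y), Eq. (2).
    w = [gbell(x, *pa) * gbell(y, *pb) for pa, pb in premise]
    # Layer 4: normalized strength w_bar_i = w_i / sum(w), Eq. (3).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 5: rule contribution w_bar_i * (p_i x + q_i y + r_i).
    contrib = [wb * (p * x + q * y + r)
               for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 6: overall output as the sum of contributions, Eq. (4).
    return sum(contrib), w_bar
```

Because the bell function is strictly positive, the sum of firing strengths in Layer 4 never vanishes, so the normalization is always defined.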

The ANFIS in Figure 5 uses a hybrid algorithm [40] as its basic learning rule: error signals are calculated recursively from the output layer backwards to the input nodes, for an effective search of the optimal parameters. From this architecture it is seen that, given the values of the premise parameters, the overall output $f$ can be expressed as
$f = \bar{w}_1 f_1 + \bar{w}_2 f_2$    (5)
$f = (\bar{w}_1 x) p_1 + (\bar{w}_1 y) q_1 + (\bar{w}_1) r_1 + (\bar{w}_2 x) p_2 + (\bar{w}_2 y) q_2 + (\bar{w}_2) r_2$    (6)
which is linear in the consequent parameters $p_1, q_1, r_1, p_2, q_2, r_2$. From this observation, we have $S = (S_1, S_2)$, where $S$ is the set of all parameters, $S_1$ is the set of nonlinear (premise) parameters, and $S_2$ is the set of linear (consequent) parameters.

We can use backpropagation or the hybrid algorithm to identify the ANFIS parameters. With the hybrid learning algorithm, in the forward pass the node outputs go forward until layer 5 and the consequent parameters are identified by the least squares method. In the backward pass, the error signals propagate backward and the premise parameters are updated by gradient descent. The consequent parameters thus identified are optimal under the condition that the premise parameters are fixed. Accordingly, the hybrid algorithm converges much faster, since it reduces the dimension of the search space of the backpropagation method [40]. We implement ANFIS as a five-input, one-output system using the fuzzy logic toolbox of MATLAB.

B. Data sets
ANFIS is used for ranking the networks according to the priorities the user assigns to the attributes. To use ANFIS, a training data set is needed.
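Given such a training set, the least-squares step of the hybrid algorithm can be written out explicitly. The following is a sketch for the two-rule model of Eq. (6), assuming a batch (non-recursive) solution; the symbols $P$ (number of training samples), $t_k$ (target output of sample $k$), $a_k$, $A$, $\theta$ and $t$ are notation introduced here for illustration and do not appear in the original. The normalized strengths $\bar{w}_1, \bar{w}_2$ in row $a_k$ are evaluated at the $k$th sample $(x_k, y_k)$ with the premise parameters held fixed.

```latex
\begin{aligned}
a_k &= \begin{bmatrix}
  \bar{w}_1 x_k & \bar{w}_1 y_k & \bar{w}_1 &
  \bar{w}_2 x_k & \bar{w}_2 y_k & \bar{w}_2
\end{bmatrix}, \\
\theta &= \begin{bmatrix} p_1 & q_1 & r_1 & p_2 & q_2 & r_2 \end{bmatrix}^{\top}, \\
A &= \begin{bmatrix} a_1 \\ \vdots \\ a_P \end{bmatrix}, \qquad
t = \begin{bmatrix} t_1 \\ \vdots \\ t_P \end{bmatrix}, \\
\theta^{*} &= \arg\min_{\theta} \lVert A\theta - t \rVert^{2}
           = (A^{\top} A)^{-1} A^{\top} t .
\end{aligned}
```

The closed form requires $A^{\top} A$ to be invertible, which in practice means the training set must contain more independent samples than there are consequent parameters.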

Fig. 6. Relationship between reliability, cost and the rank Sru

For simulation purposes we emulated the data as follows. The input-output data set has five inputs and one output. The inputs are labelled $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$, and represent the attributes cost, reliability, reputation, capacity and anonymity. The output is labelled $y$ and represents the rank $S_{ru}$ generated by the ANFIS. Figure 6 shows the relationship between reliability, cost and the rank $S_{ru}$ for the generated data set. We emulate the real-time data using a random function,
x1 = rand(200, 1)    (7)
which returns 200 random values in [0,1]. $x_2$, $x_3$, $x_4$ and $x_5$ are taken to be nonlinear functions of $x_1$. The output for the training data is shown in Figure 7. The relationship between the inputs and $y$ is taken to be a first-order Sugeno function:
$y = p_1 x_1 + q_1 x_2 + r_1 x_3 + s_1 x_4 + t_1 x_5$    (8)

Here, depending on the user's requirements, the weight of a specific attribute can be initialized to a higher value than the weights of the other attributes, i.e., (pi, qi, ri, si, ti) are user dependent. The resulting rank is therefore specific to that user. The entire data set comprises three sets of 200 samples each, referred to as the training data, the testing data and the checking data. We first carried out the simulation using a set of 200 samples to train the ANFIS. Upon training, the ANFIS reports the training error, which reflects how good the mapping function is. To validate the model, we then apply the testing data to see how the ANFIS behaves for known data. The ANFIS maps the function onto the testing data as per the training, and the quality of this mapping is given by the testing error. Finally, we use a set of 200 unknown samples to check the validity. When the ANFIS maps the function for the unknown data with negligible error, called the checking error, it is validated.
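The data emulation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the paper does not specify the nonlinear functions that derive x2 to x5 from x1, nor the weight values, so the functions and the reliability-heavy weight vector below are hypothetical, and Python's `random` module stands in for the MATLAB `rand` call of Equation (7).

```python
import math
import random

def emulate_dataset(n_samples, weights, seed):
    """Return a list of ((x1..x5), y) pairs following Equations (7) and (8)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x1 = rng.random()                    # Equation (7): uniform in [0, 1)
        # Hypothetical nonlinear functions of x1, each staying within [0, 1].
        x2 = x1 ** 2
        x3 = math.sqrt(x1)
        x4 = math.sin(math.pi * x1 / 2.0)
        x5 = 1.0 - x1 ** 3
        inputs = (x1, x2, x3, x4, x5)
        # Equation (8): first-order Sugeno output with user-dependent weights.
        y = sum(w * x for w, x in zip(weights, inputs))
        data.append((inputs, y))
    return data

# Hypothetical user priorities (p, q, r, s, t) for cost, reliability,
# reputation, capacity and anonymity; they sum to 1 so that y stays in [0, 1].
user_weights = (0.1, 0.5, 0.2, 0.1, 0.1)

# Three independent sets of 200 samples each.
training_data = emulate_dataset(200, user_weights, seed=1)
testing_data = emulate_dataset(200, user_weights, seed=2)
checking_data = emulate_dataset(200, user_weights, seed=3)
```

Changing `user_weights` changes the target rank for the same attribute values, which is what makes the resulting ranking user specific.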





Fig. 7. 200 samples of input-output data for training

Having created the data set, the next step is to train the network. This means we create a new FIS to fit the data into membership functions. Using the subtractive clustering technique, the ANFIS automatically selects the membership functions and also generates the new FIS. Subtractive clustering uses tree partitioning and generates an optimum number of rules, unlike the grid partitioning method. The technique clusters the data points by defining a cluster center and choosing all the points in the vicinity of the center, while at the same time refining the cluster center itself. Once the FIS is generated, we train the ANFIS using the hybrid optimization method [40]. Training here is a way to express the set of rules that can be applied over the generated FIS. The rules and the membership functions are updated in each epoch (iteration) to reduce the error in the optimization. Upon training, another data set is chosen to test the ANFIS model so developed.

C. Evaluation of the User Specific Model

We carried out simulations using an input-output data set with five inputs and one output. In turn, the system generates a rank Sru that best suits the user's needs.

• We configured ANFIS and simulated it using three different rule sets of 12, 15, and 19 rules, respectively.
• With 12 rules, the training error is 2.6674e-005. The training error indicates how far the generated FIS deviates from the actual input-output relationship in the data. With the FIS trained using the initial data set of 200 samples, we then applied another data set of 200 samples to test the FIS on new data. We observed a testing error of 6.4587e-005. Finally, to validate the FIS, we used a new data set of 200 samples to see if the error for the new set is within the acceptable range. This data set is called the checking data and the error is called the checking error. We found that the checking error for the FIS with 12 rules is 6.6682e-005. With 15 rules, the training error was 2.135e-005, the testing error was 4.9193e-005 and the checking error was 5.7698e-005. With 19 rules, the training error was 2.4781e-005, the testing error was 5.9809e-005 and the checking error was 5.9024e-005. For data values in [0, 1], an error of magnitude 2.4781e-005 is considered negligible. The error is observed to be smallest when the rule set of 15 rules is used. Figure 8 shows the results of the ANFIS training.

Fig. 8. ANFIS training results
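The errors reported for the three rule sets can be compared side by side with a short Python sketch. The numbers are the ones reported in this section; the dictionary layout and the helper name `best_rule_count` are only illustrative choices.

```python
# Training, testing and checking errors reported for each rule set.
errors = {
    12: {"training": 2.6674e-5, "testing": 6.4587e-5, "checking": 6.6682e-5},
    15: {"training": 2.135e-5, "testing": 4.9193e-5, "checking": 5.7698e-5},
    19: {"training": 2.4781e-5, "testing": 5.9809e-5, "checking": 5.9024e-5},
}

def best_rule_count(error_table, kind):
    """Return the number of rules with the smallest error of the given kind."""
    return min(error_table, key=lambda rules: error_table[rules][kind])

# Rule count that minimizes each of the three error measures.
best = {kind: best_rule_count(errors, kind)
        for kind in ("training", "testing", "checking")}
```

For these values, the same rule count minimizes all three measures, so no trade-off between fitting the training data and generalizing to unseen data has to be resolved.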

It is evident that with 15 rules the training error, the testing error and the checking error are all at their minimum for the data set considered in the simulation. Given that each input value lies in [0, 1] and the output value also lies in [0, 1], a training error of the order of 2.135e-005 is considered promising. With this training error, the testing error and the checking error are also very small and of the same order as the training error. The checking data represents incoming data values. The average testing error is found to be 4.9193e-005 and the checking error is 5.7698e-005. Figure 9 shows the testing error, Figure 10 shows the checking data set along with the training data, and Figure 11 shows the checking error. The simulations were carried out for a number of data sets. We observe that the technique shows promise and can be implemented in the UserAgentLayer. The weights of the various inputs can be selected according to the priority defined by the user. This way, each time the ranking scheme evaluates the network against the criteria specified by the user.

Fig. 9. ANFIS testing using another test-data set of 200 samples

Fig. 10. Plot of checking data against training data

Fig. 11. ANFIS result validation using the checking data set

VII. CONCLUSION AND FUTURE WORK

In this paper, we described two novel network ranking schemes based on fuzzy logic. These schemes enable users to evaluate different P2P-based grid networks for their suitability for joining. Our ranking schemes evaluate P2P-based grid networks on a fixed set of commonly used attributes such as cost, capacity, reliability, etc. We believe that this is a valid strategy, since a ranking scheme must be specific to a particular application domain and problem context. However, our proposed schemes can easily be extended if additional network attributes are to be considered. Moreover, our methodology of utilizing fuzzy logic for ranking networks can also be employed in other ranking problems, such as the Web-server selection problem.

The network-specific ranking technique shows good promise for generating the rank and gives the user guidance on which network to join. The proposed ranking scheme is based on an intuitive design of rules, which can be optimized by applying Boolean logic to identify combinations of inputs that do not apply. We are exploring the use of fuzzy-logic-based genetic algorithms to optimize the number of rules. On the other hand, the user-specific ranking scheme ranks networks according to a user's needs. This gives the user the capability to decide which network to join. The scheme is adaptive and the rule set is optimal. The above observations are substantiated by our simulation results, which show that the training time is negligible and the resulting outputs are close to optimal. The proposed ranking schemes can easily be implemented inside a user agent layer.

REFERENCES

[1] Napster. http://www.napster.com/.
[2] Gnutella. http://gnutella.wego.com/.
[3] SETI@home. http://setiathome.ssl.berkeley.edu.
[4] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. In Proceedings of the 2001 ACM SIGCOMM Conference, 2001.
[5] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker. A scalable content-addressable network. In Proceedings of ACM SIGCOMM (San Diego, 2001), 2001.
[6] I. Foster and C. Kesselman. The Grid: Blueprint for a New Computing Infrastructure, 2nd Edition, Morgan Kaufmann, 2004.
[7] S. D. Kamvar, M. T. Schlosser, and H. Garcia-Molina. EigenRep: Reputation Management in P2P Networks. In Proc. of ACM World Wide Web Conference, Budapest, Hungary, May 2003.
[8] T. Moreton and A. Twigg. Enforcing collaboration in P2P routing services. In First International Conference on Trust Management, 2003.
[9] Z. Diamadi and M. J. Fischer. A Simple Game for the Study of Trust in Distributed Systems. Appeared in Wuhan University Journal of Natural Sciences, 2001.

[10] K. Aberer and Z. Despotovic. Managing trust in a peer-2-peer information system. In Proc. of the Tenth International Conference on Information and Knowledge Management (CIKM '01), pages 310-317, 2001.
[11] S. Lee, R. Sherwood, and B. Bhattacharjee. Cooperative Peer Groups in NICE. In Proc. of INFOCOM 2003, 2003.
[12] A. Farag and M. Muthucumaru. Evolving and Managing Trust in Grid Computing Systems. In Proc. of the 2002 IEEE Canadian Conference on Electrical and Computer Engineering, 2002.
[13] G. Pinski and F. Narin. Citation influence for journal aggregates of scientific publications: Theory, with applications to the literature of physics. 12, pages 297-312, 1976.
[14] A. E. Elo. The Rating of Chess Players, Past and Present. Arco Publishing Inc., 1978.
[15] K. J. Arrow. Social Choice and Individual Values (2nd ed.). New Haven: Yale University Press, 1963.
[16] A. Sen. Individual Choice and Social Welfare. Holden Day, 1970.
[17] H. Moulin. Axioms of Cooperative Decision Making. Cambridge, U.K.: Cambridge University Press, 1988.
[18] D. G. Saari. Decisions and Elections: Explaining the Unexpected. Cambridge University Press, 2001.
[19] D. P. Solomatine. Data-Driven Modelling: Paradigm, Methods, Experiences. In Proc. of the 5th International Conference on Hydroinformatics, Cardiff, UK, 2002.
[20] R. Mikut, J. Jakel, and L. Groll. Interpretability issues in data-based learning of fuzzy systems. Appeared in Elsevier Journal of Fuzzy Sets and Systems, July 2004.
[21] T. Wei. The algebraic foundation of ranking theory. Ph.D. Thesis, Cambridge University, 1952.
[22] M. G. Kendall. Further contributions to the theory of paired comparisons. Biometrics 11, pages 43-62, 1955.
[23] H. E. Daniels. Round-robin tournament scores. Biometrika 56, pages 295-299, 1969.
[24] J. W. Moon. Topics on Tournaments. New York: Holt, Rinehart and Winston, 1968.
[25] M. Kano and A. Sakamoto. Ranking the vertices of a paired comparison digraph. SIAM Journal on Algebraic and Discrete Methods 6, pages 79-92, 1985.
[26] J. P. Keener. The Perron-Frobenius theorem and the ranking of football teams. SIAM Review 35, pages 80-93, 1993.
[27] V. S. Lechenkov. Self-consistent rule for group choice: Axiomatic approach. Discussion Paper 95-3, Conservatoire National des Arts et Metiers, 1995.
[28] Van Den Brink and R. Gilles. Measuring domination in directed networks. Social Networks 22, pages 141-157, 2000.
[29] J. J. Herings, G. Van Der Laan, and D. Talman. Measuring the power of nodes in digraphs. Discussion paper, October 2001.
[30] I. Palacios-Huerta and O. Volij. The measurement of intellectual influence. Mimeo, 2002.
[31] G. R. Conner and C. P. Grant. An extension of Zermelo's model for ranking by paired comparisons. European Journal of Applied Mathematics 11, pages 225-247, 2000.
[32] H. A. David. Ranking from unbalanced paired-comparison data. Biometrika 74, pages 432-436, 1987.
[33] H. A. David. The Method of Paired Comparisons (2nd ed.). London: Charles Griffin and Company, 1988.
[34] J. Laslier. Tournament Solutions and Majority Voting. Berlin: Springer, 1997.
[35] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks 30(1-7), pages 107-117, 1998.
[36] J. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM 46, pages 604-632, 1999.
[37] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical Report, Stanford University, 1999.
[38] S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Hypersearching the web. Scientific American, June 1999.
[39] L. X. Wang. Fuzzy systems are universal approximators. IEEE Trans. on SMC, vol. 25, no. 4, pages 629-634, 1995.
[40] J. S. R. Jang, C. T. Sun, and E. Mizutani. Neuro-Fuzzy and Soft Computing. Prentice Hall, 1997.
[41] D. W. Dorsey and M. D. Coovert. Mathematical modelling of decision making: a soft and fuzzy approach to computing hard decisions. Appeared in the Journal of Human Factors and Ergonomics Society, vol. 45, issue 1, pages 117-135, Spring 2003.
