
Improving Web Server Performance by a Clustering-Based Dynamic Load Balancing Algorithm∗

Lai Kuen Ho, Hau Yee Sit, Kei Shiu Ho, Hong Va Leong, Robert W. P. Luk
Department of Computing, The Hong Kong Polytechnic University, Hong Kong
{cslkho,c9142126,csksho,cshleong,csrluk}@comp.polyu.edu.hk

Abstract

In this paper, a load balancing scheme is presented which allows HTTP requests to be dynamically migrated between clustered back-end web servers based on the loading condition of the system. We adopt a nearest neighborhood clustering algorithm whereby an adaptive number of requests are migrated as determined by the real-time distribution of load among the servers. Experiment results demonstrate that our proposed algorithm yields the best performance when compared with several other common approaches.

∗ The work described in this article was fully supported by a grant from The Hong Kong Polytechnic University (4Z03D).

1. Introduction

A critical issue in the design of a dynamic load balancing algorithm is to determine the amount of load to be migrated in each step. Existing approaches usually employ a migration policy in which a fixed number or proportion of client requests are migrated, regardless of the current load level of individual servers and overall system load. Previous research has revealed that these approaches will lead to problems such as system instability and slow responsiveness to load fluctuations [4]. In this paper, we propose a clustering-based dynamic load balancing algorithm to tackle this problem and study the performance experimentally.

2. System Design

An overview of the system architecture is shown in Figure 1. It consists of three parts: the Front-end Web Server, the Load Balancer and a cluster of back-end web server modules, which execute the clients’ requests. During operation, HTTP requests submitted by clients will be directed to the Front-end Web Server first, which dispatches the received requests in a round-robin manner [2]. Effectively, static load balancing is achieved.

Each back-end web server is made up of two components: the Queue Agent (QA) and the Coordinator Agent (CA). The Queue Agent maintains two internal data structures: the Request Queue and the Redirect Queue. The Request Queue is for queuing up the HTTP requests and the Redirect Queue is a multi-level queue structure, which keeps track of how many times each request in the Request Queue has been migrated.

[Figure 1. Overview of the system. Components shown: Web client, Front-end Web Server, Load Balancer, and back-end servers Server_1, ..., Server_i, ..., Server_k, ..., Server_n, each with a Coordinator Agent (CA) and a Queue Agent (QA) holding a Request Queue and a Redirect Queue with levels 0, 1, ..., p. Message labels: HTTP request (initial); HTTP redirect (static load balancing); HTTP request (re-send); HTTP redirect (dynamic load balancing) or HTTP response; Update load information.]

The QA provides load information of the server to the Coordinator Agent (CA) for load balancing purposes. Periodically, the CA communicates with the CAs of other back-end web servers to exchange load information. Each time, one of the CAs serves as the leader to consolidate load information from different back-end servers (the leader role is rotated among the CAs). The Load Balancer is responsible for making decisions regarding dynamic load migration between the back-end web servers. Figure 2 details our dynamic load balancing algorithm.
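To make the two Queue Agent data structures concrete, the following is a minimal Python sketch of a Request Queue paired with a multi-level Redirect Queue. The names `Request`, `QueueAgent`, `enqueue` and `dequeue_for_migration` are hypothetical and do not come from the paper. The sketch also assumes that requests migrated fewer times are picked first when load is shed; the paper only states that higher-priority queues are preferred, so this ordering is an assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """A queued HTTP request (hypothetical fields, for illustration only)."""
    request_id: int
    migrations: int = 0  # number of times this request has been redirected


class QueueAgent:
    """Sketch of a Queue Agent: a Request Queue plus a multi-level Redirect Queue."""

    def __init__(self, max_level):
        self.max_level = max_level  # p, the highest tracked migration count
        self.request_queue = deque()
        # Level j holds the ids of queued requests that were migrated j times.
        self.redirect_levels = [deque() for _ in range(max_level + 1)]

    def enqueue(self, request):
        level = min(request.migrations, self.max_level)
        self.request_queue.append(request)
        self.redirect_levels[level].append(request.request_id)

    def dequeue_for_migration(self, count):
        """Remove up to `count` requests, least-migrated levels first (assumed policy)."""
        chosen_ids = []
        for level in self.redirect_levels:
            while level and len(chosen_ids) < count:
                chosen_ids.append(level.popleft())
        chosen = set(chosen_ids)
        migrated = [r for r in self.request_queue if r.request_id in chosen]
        self.request_queue = deque(
            r for r in self.request_queue if r.request_id not in chosen
        )
        for r in migrated:
            r.migrations += 1  # the next server files it one level higher
        return migrated
```

Keeping only request ids in the Redirect Queue levels lets the Request Queue stay in arrival order for normal service, while the levels answer the question of which requests are the best candidates for redirection.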

Proceedings of the 18th International Conference on Advanced Information Networking and Application (AINA’04) 0-7695-2051-0/04 $ 20.00 © 2004 IEEE

let Server_h be the most heavily loaded server, Server_l be the most lightly loaded server, CA_h be the CA of Server_h, CA_leader be the leader CA and LB be the Load Balancer;
for every T_c seconds do
    CA_leader consolidates all load information and identifies Server_h and Server_l, as well as their load difference
    if the conditions for triggering load migration are satisfied then
        CA_leader sends the load information to LB
        LB computes the number of requests that should be migrated and sends back the result to CA_h
        CA_h instructs Server_h to dequeue the required number of requests
        Server_h sends a redirect response to each of the clients concerned, which carries the IP address of Server_l

Figure 2. Dynamic Load Balancing algorithm.

3. Load Balancing Policy

Every load balancing approach involves two major design dimensions: load estimation method and load migration policy.

3.1. Load Estimation

The total load of $Server_i$ at the $n$th time interval is defined as:

$$\mathit{Load}_{i,n} = T_{i,n} \times N_{i,n} \tag{1}$$

where $T_{i,n}$ is the expected service time to serve a request and $N_{i,n}$ is the number of requests in the Request Queue. Assume $T_{i,0} = 0$. The expected service time is defined as:

$$T_{i,n} = (1 - \alpha)\,T_{i,n-1} + \alpha\,t_{i,n} \tag{2}$$

where $t_{i,n}$ is the measured actual service time used to serve the latest request and $\alpha$ is a constant between 0 and 1. The average load of all the back-end servers at the $n$th time interval is thus defined as:

$$\mathit{Load}_{avg,n} = \frac{1}{N_s} \sum_{i=1}^{N_s} \mathit{Load}_{i,n} \tag{3}$$

where $N_s$ is the number of servers.

3.2. Load Migration

There are several areas to be considered when establishing a load migration policy:

• Frequency of performing dynamic load balancing: If load migration is performed more frequently, then the Load Balancer will have more accurate information regarding the system’s current load distribution. A more effective load balancing decision can thus be made. However, a high frequency will result in undesirable overhead, thus hurting the system’s performance.

• Selection of servers for load migration: Requests are migrated from $Server_h$ to $Server_l$.

• Conditions for load migration: The conditions that trigger load migration are:

$$\Delta_n > \left( \sum_{i=1}^{N_s} \mathit{Load}_{i,n} \right) \times \theta \tag{4}$$

$$\Delta_n > T_{avg,n} \times \beta \tag{5}$$

$$\Delta_n > T_{avg,n} \times \tau \tag{6}$$

where $\Delta_n$ is the load difference, $T_{avg,n} = \frac{1}{N_s} \sum_{i=1}^{N_s} T_{i,n}$ is the average service time, and $\beta < \tau$. Requests are migrated when both conditions (4) and (5) are satisfied or when condition (6) holds.

• Number of requests to be migrated: The Load Balancer is responsible for computing the number of requests to be migrated from $Server_h$ to $Server_l$ based on a clustering algorithm.

• Selection of requests for migration: The Redirect Queue consists of a number of priority queues, each being associated with a non-negative priority value (based on the number of times a request has been migrated). When requests are to be unloaded from $Server_h$, requests in queues of higher priority will be preferred to requests in queues of lower priority for migration.

• Request migration mechanism: The normal HTTP redirection mechanism is adopted.
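The load estimate of Section 3.1 and the trigger conditions (4) to (6) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the load difference is taken as the gap between the most and least loaded servers, condition (4) is read as comparing that gap with θ times the summed load, and the default parameter values are the ones listed in Table 2.

```python
def update_service_time(prev_estimate, measured, alpha=0.2):
    """Eq. (2): exponentially weighted estimate of the per-request service time."""
    return (1 - alpha) * prev_estimate + alpha * measured


def server_load(service_time, queue_length):
    """Eq. (1): expected service time multiplied by the Request Queue length."""
    return service_time * queue_length


def should_migrate(loads, service_times, theta=0.3, beta=30, tau=80):
    """Return True when (4) and (5) both hold, or when (6) holds on its own."""
    # Assumption: the load difference is max load minus min load.
    delta = max(loads) - min(loads)
    t_avg = sum(service_times) / len(service_times)
    cond4 = delta > sum(loads) * theta  # assumed reading of condition (4)
    cond5 = delta > t_avg * beta
    cond6 = delta > t_avg * tau
    return (cond4 and cond5) or cond6
```

Because β < τ, condition (6) acts as an override: a very large absolute gap triggers migration even when the gap is small relative to the total load, while (4) and (5) together cover gaps that are significant in both relative and absolute terms.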

4. Clustering Algorithm

An adaptive nearest neighborhood algorithm is adopted for determining the number of requests to be transferred in each load migration process [3, 5], as shown in Figure 3. The inputs $x_n$ to the algorithm are vectors of current load distribution. These inputs are grouped into a number of clusters. The algorithm then assigns $x_n$ to the cluster whose centroid is nearest to it. The number of clusters formed (i.e., $C$) is adapted by the algorithm itself, as determined by the inputs and the threshold $R$.

let x_n be the input, centroid_k be the centroid of cluster_k;
initialize number of clusters: C ← 1; assign x_1 to cluster_1: centroid_1 ← x_1
for each input x_n where n > 1 do
    m ← argmin_k |x_n − centroid_k|, where 1 ≤ k ≤ C; dist_min ← min_k |x_n − centroid_k|, 1 ≤ k ≤ C
    if (dist_min ≤ R) then
        assign x_n to cluster_m
        calculate closeness of x_n to centroid_m:
            $\mu_m(x_n) = 1 - \frac{|x_n - centroid_m|}{R}$
        update centroid_m:
            $centroid_m^{new} = \frac{\mu_m(x_n) \times x_n + \sum_{j=1}^{N_m} \mu_m(x_j^m) \times x_j^m}{\mu_m(x_n) + \sum_{j=1}^{N_m} \mu_m(x_j^m)}$
            where $x_1^m, x_2^m, \ldots, x_{N_m}^m$ are in cluster_m before x_n is assigned to it
    else
        form a new cluster and initialize it: C ← C + 1; centroid_C ← x_n; initialize µ_C(x_n) ← 1

Figure 3. Clustering algorithm.

Each cluster corresponds to a certain number of requests to be migrated from $Server_h$ to $Server_l$, given the load discrepancy $x_n$. More specifically, if $x_n$ is assigned to $cluster_k$, the number of requests ($M$) to be migrated is given as follows:

$$M = \mathrm{round}\left( \frac{1}{2} \times \frac{\Delta_{n,centroid_k}}{T_{avg,n}} \right) \tag{7}$$

where $T_{avg,n}$ is the adjusted average service time per request across all servers and $\Delta_{n,centroid_k}$ is the load difference between $Server_h$ and $Server_l$ of the centroid of $cluster_k$.

Table 1. Parameters in different load levels

Load Level   No. of UEs   No. of requests   R
1            160          187K              1750
2            180          209K              750
3            200          231K              1500
4            220          253K              1750
5            240          275K              500
6            260          297K              1250

Table 2. Parameters used in the experiments

Parameter   Value
Tc          0.5 s
θ           0.3
β           30
τ           80
α           0.2
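The clustering step and Eq. (7) can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the authors' code: the class and function names are hypothetical, distance is taken to be Euclidean, each member keeps the closeness value it had when it was assigned, and the load difference of a centroid is assumed to be the gap between its largest and smallest components.

```python
import math


class AdaptiveNNClusterer:
    """Sketch of the adaptive nearest neighborhood clustering of Figure 3."""

    def __init__(self, radius):
        self.radius = radius  # threshold R
        self.centroids = []   # one centroid vector per cluster
        self.members = []     # per cluster: list of (closeness, vector) pairs

    def assign(self, x):
        """Assign x to the nearest cluster within R, else open a new cluster."""
        x = list(x)
        if self.centroids:
            dists = [math.dist(x, c) for c in self.centroids]
            m = min(range(len(dists)), key=dists.__getitem__)
            if dists[m] <= self.radius:
                mu = 1 - dists[m] / self.radius  # closeness of x to centroid m
                self.members[m].append((mu, x))
                total = sum(w for w, _ in self.members[m])
                # Closeness-weighted mean of all members, including x.
                self.centroids[m] = [
                    sum(w * v[d] for w, v in self.members[m]) / total
                    for d in range(len(x))
                ]
                return m
        # No cluster is close enough: x founds a new cluster with closeness 1.
        self.centroids.append(x)
        self.members.append([(1.0, x)])
        return len(self.centroids) - 1


def migration_count(centroid, t_avg):
    """Eq. (7), with the centroid's load difference assumed to be max minus min."""
    delta = max(centroid) - min(centroid)
    return round(0.5 * delta / t_avg)
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer; the paper does not say how ties are resolved. The effect of clustering is that nearby load distributions share one centroid, so small fluctuations in the measured loads map to the same migration count instead of producing a different count at every step.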

5. Preliminary Evaluation

Experiments were conducted to evaluate the performance of the proposed load balancing algorithm. Three back-end servers, one Front-end Web Server and one Load Balancer were deployed on five separate PCs. Another three client machines used a tool called SURGE [1] to generate representative Web workloads. Six sets of experiments were carried out to simulate different load situations. Each set was repeated 10 times and the average results were reported. The values of the parameters were set as in Table 1 and Table 2. Four different approaches were implemented and their performance was compared:

• Approach 1: The number of requests migrated is determined using the clustering algorithm.

• Approach 2: At most one request is migrated from $Server_h$ to $Server_l$ each time, regardless of the current load discrepancy [4].

• Approach 3: The number of requests migrated is equal to $\mathrm{round}\left( \frac{1}{2} \times \frac{\Delta_n}{T_{avg,n}} \right)$, where $\Delta_n$ is the load difference between $Server_h$ and $Server_l$, and $T_{avg,n}$ is the adjusted average service time per request.

• Approach 4: No dynamic load balancing is performed.

5.1. Performance Indices

Three performance indices were adopted:

• Average load: The overall average load $\mathit{Load}_{avg}$ was computed, which is defined as:

$$\mathit{Load}_{avg} = \frac{1}{P} \sum_{n=1}^{P} \mathit{Load}_{avg,n} \tag{8}$$

where $P$ is the number of time intervals.

• Standard deviation of load: The average standard deviation of load $\mathit{Load}_{std}$ was computed, which is defined as:

$$\mathit{Load}_{std} = \frac{1}{P} \sum_{n=1}^{P} \sqrt{ \frac{1}{N_s} \sum_{i=1}^{N_s} \left( \mathit{Load}_{i,n} - \mathit{Load}_{avg,n} \right)^2 } \tag{9}$$

• Average response time: Average response time is defined as the average duration from the time a request arrives at the Front-end Web Server to the time a back-end server finishes serving the request.

[Figure 4. Comparing the performances of the four load balancing approaches: (a) standard deviation of load (s), (b) average response time (s), (c) average load (s). Each panel is plotted against load level (1: lightest, 6: heaviest) for Approach 1 (clustering), Approach 2 (fixed), Approach 3 (half) and Approach 4 (no DLB).]
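The two load-based indices, Eqs. (8) and (9), can be computed from a log of per-interval server loads. The sketch below assumes the log is a list with one entry per time interval, each entry being the list of loads of all servers in that interval; the function names are hypothetical, and the standard deviation is assumed to be the population form with a square root.

```python
import math


def average_load(loads_per_interval):
    """Eq. (8): mean over time intervals of the per-interval average load."""
    per_interval = [sum(loads) / len(loads) for loads in loads_per_interval]
    return sum(per_interval) / len(per_interval)


def load_std(loads_per_interval):
    """Eq. (9): mean over intervals of the standard deviation across servers."""
    total = 0.0
    for loads in loads_per_interval:
        mean = sum(loads) / len(loads)
        variance = sum((x - mean) ** 2 for x in loads) / len(loads)
        total += math.sqrt(variance)
    return total / len(loads_per_interval)
```

Averaging the per-interval deviation, rather than pooling all samples, keeps the index focused on imbalance between servers at the same moment and ignores how the overall load level drifts over time.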

5.2. Evaluation Results

The results of the experiments are shown in Figure 4. As depicted in Figure 4(a), in almost all load scenarios, the three load migration approaches resulted in a smaller standard deviation of load among the servers than approach 4, where load balancing was not adopted. This indicates that the system’s load was more evenly distributed among the servers when dynamic load migration was used.

Figure 4(b) presents how the average response time varied with the total load of the system under the four approaches. Among the three load migration approaches, the performance of approach 1 and approach 3 was close, and both consistently outperformed approach 2. However, the average response time of approach 1 was lower than that of approach 3 when the load level was 2, 4, 5 and 6, and it was only marginally higher than that of approach 3 when the load level was 1 and 3. The result indicates that approach 3 did not perform as well as approach 1, especially under heavily loaded scenarios. This is probably caused by a weakness of approach 3: the extra load of $Server_h$ is equally shared by $Server_h$ and $Server_l$, regardless of the load level of $Server_h$ and the system’s load. As a result, too many requests may be migrated, thus overshooting the “equilibrium” and causing the herd effect [4]. In other situations, the number of requests migrated may not be enough. In either case, the imbalance in load cannot be efficiently rectified.

By examining Figure 4(c), one can see that approach 1 led to the lowest average load when the load level was 4, 5 and 6. That is, approach 1 outperformed the other approaches when the system was suffering from heavy load.

6. Conclusion

We proposed a nearest neighborhood clustering-based load balancing algorithm whereby the load discrepancy between the servers is mapped to a cluster whose centroid is used to derive the appropriate number of requests for migration in order to rectify the imbalance of load. Preliminary evaluation reveals that it is effective in improving system scalability and shortening the turnaround time of client requests.

References

[1] P. Barford and M. Crovella. Generating representative web workloads for network and server performance evaluation. In Measurement and Modeling of Computer Systems, pages 151–160, 1998.
[2] T. Brisco. DNS Support for Load Balancing. RFC 1794, 1995.
[3] M. H. Dunham. Data Mining: Introductory and Advanced Topics. Prentice Hall, 2003.
[4] K. S. Ho and H. V. Leong. Improving the Scalability of the CORBA Event Service with a Multi-agent Load Balancing Algorithm. Software-Practice and Experience Journal, 32(5):417–441, Apr. 2002.
[5] D. Michie, D. J. Spiegelhalter, and C. C. Taylor, editors. Machine Learning, Neural and Statistical Classification. Ellis Horwood, 1994.