Improving Performance of Composite Web Services

Dmytro Dyachuk, Ralph Deters
Department of Computer Science, University of Saskatchewan
[email protected], [email protected]

Abstract

Composite Web Services (CWS) aggregate multiple Web Services into one logical unit in order to accomplish a complex task (e.g. a business process). This orchestration is typically achieved by use of a workflow language. Workflows facilitate the process of aggregating existing atomic and composite services into new service layers. However, due to numerous consumers and possible fluctuations in their arrival rates, service performance under various loads becomes an important issue. Service compositions exposed to transient overloads exhibit problematic behaviour due to complex interactions of the underlying services, which usually results in performance degradation. This paper proposes scheduling service requests in order to improve overall CWS performance in overload situations. Different scheduling policies are evaluated for the CWS workflow patterns sequence and split-synchronization. In addition, the paper presents a scheduling policy called Augmented Least Work Remaining (ALWKR) that extends LWKR by taking advantage of existing workflow topology information.

system and significant fluctuations in its performance. This is particularly worrisome in the context of CWS, where the output of one service is the input of another.

Figure 1.a: Sequence Pattern

Figure 1.a shows the most basic workflow pattern, called sequence [4]. If a service in a sequence experiences an overload, the resulting decrease in throughput will propagate through the sequence, leading to an underutilization of the subsequent services and an increased (worse) response time of the CWS [3]. Consequently, it is important to avoid overloads in interconnected services.

Figure 1.b: Parallel Split &Synchronization Pattern

1. Introduction

Well defined, loosely coupled services are the basic building blocks of the service-oriented (SO) design/integration paradigm. Services are computational elements that expose functionality in a platform-independent manner and can be described, published, discovered, orchestrated and consumed across language, platform and organizational borders. While service orientation can be achieved using different technologies [1, 2], Web Services (WS) is the most commonly used one due to the standardization efforts and the available tools/infrastructure. However, the ease with which components (e.g. legacy systems) can be aggregated/orchestrated leads to the question of how the new Composite Web Services (CWS) perform. Our previous studies [3] show that unexpected bulk arrivals of service requests cause the destabilization of the

A similar problem can be observed in the Parallel Split and Synchronization pattern (fig. 1.b). In the split phase, multiple parallel service requests are created that “converge into one single thread of control” [4] at the synchronization point. This workflow pattern is vulnerable to overloads, since performance degradation in one of the parallel branches delays the synchronization point, thus increasing the response time of the workflow. As a result, the business value and attractiveness of the Composite Web Service is reduced, since service selection mechanisms often consider average response time as one of the main selection criteria. This paper focuses on the use of request scheduling and admission control as mechanisms for improving CWS performance in overload situations and introduces a novel scheduling policy called Augmented Least Work Remaining (ALWKR), which, unlike LWKR, uses the workflow topology when scheduling. As a result

ALWKR is no longer constrained to the sequential pattern of LWKR, allowing a much broader deployment. The remainder of the paper is organized as follows. The following section discusses service behaviour with respect to the loads encountered. Section three presents an introduction to adaptive load control and scheduling. The results of the experiments are presented in section four, and related work is discussed in section five. The paper ends with a summary and an outlook on future work.

2. Services & Loads

Service providers that do not share resources (e.g. no two providers expose the same database) can each be modeled as a separate service. Providers handling multiple requests simultaneously assign a separate thread to each incoming request [1, 2]. Due to a finite amount of resources, a service provider can only serve a certain number of requests concurrently (the actual number depends on the job sizes of the requests). The typical behavior [5] of services under various loads is shown in figure 2. A service gradually exposed to an ever-increasing number of service requests experiences three distinct stages, namely underload, saturation and overload. In the initial phase the service handles a load that is below its capacity (underload) and consequently is not fully utilized; an increase in the number of requests therefore leads to an increase in throughput (number of completed jobs per time unit). As the rate of incoming requests continues to grow, the server reaches its saturation point (peak load). During the saturation phase the service is fully utilized and operating at full capacity. As the load keeps increasing, the throughput starts to drop (overload) and ultimately the thrashing effect [5] appears. Thrashing emerges due to an overload of physical resources (resource contention), such as processor or memory, or as a result of locking (data contention).

[Figure 2 plots throughput against the number of jobs on the server, with one curve showing thrashing and one without, across the underload, saturation and overload regions.]
3. Adaptive Load Control and Scheduling

Since transient overload situations lead to a decline in service throughput, it is important to avoid them. Heiss and Wagner [5] propose employing adaptive load control, which artificially limits the number of concurrently processed jobs in order to escape thrashing. The adaptive algorithm determines the maximum number of parallel requests (e.g. the maximum number of simultaneous consumers) and then buffers new requests once the saturation point has been reached. The outcome of applying this approach is depicted in figure 2. The darker curve (circles) represents the characteristic three phases a server without load control can face, namely underload, saturation and overload. With admission control (grey, triangles), thrashing is eliminated by constraining the number of concurrent threads and buffering requests above peak load.

3.1 Admission Control for Web Services

A legacy service can be augmented with admission control by means of the proxy pattern [6-9]. The placement of a transparent shielding proxy implementing load control is shown in figure 3.

Figure 3: Transparent Admission Control

The main task of the proxy is to monitor the number of requests being processed simultaneously and to prevent overloading. In case the number of incoming requests exceeds the service capacity, a queue is used to buffer the excess requests [fig. 4]. Works [7-10] show that transparent admission control is an effective approach for dealing with transient overloads and request bursts, especially if the requests impose similar loads on the service. Nevertheless, using First-In-First-Out (FIFO) as the default scheduling policy may not be sufficient from a performance point of view.

Figure 2: Service behavior under various loads.

Figure 4: Transparent Scheduling.

3.2 Scheduling of Requests

Since SO middleware (e.g. Web Services) tends to foster declarative communication styles (e.g. SOAP [12]), the task of analyzing service-bound traffic becomes relatively simple. The analysis includes identifying each request in order to estimate the impact it will have on the service provider (job size) [7, 11]. This in turn makes it possible to reorder (schedule) the service requests. Scheduling of requests (fig. 4) opens a wide range of possible options, such as maximizing service performance in terms of interactions per time unit and/or minimizing the variance of service response times.

3.3 Scheduling & Composite Services (CWS)

Depending on the amount of control available, scheduling can be done at the system (workflow/CWS) or component (service) level. While system-level scheduling treats the CWS as an atomic entity and thus reorders the requests prior to invoking the workflow, component-level scheduling focuses on scheduling sub-requests for each service and can adjust to changes during the execution of the workflow. In this paper, composite services composed according to the Sequence and Parallel Split & Synchronization patterns [4] with a static structure are considered. It is also assumed that the size of the sub-requests is known a priori or can be estimated [7, 11]. In component-level scheduling a separate scheduler is placed in front of each service, resulting in multi-step scheduling [fig. 5]. If the component schedulers are not provided with workflow topology/context information and can only estimate the size of the sub-request assigned to them, SJF should be used. This occurs, for instance, when a Web Service is incorporated into an enterprise composite service as an outsourced component and the topology of the composite service is commercially classified information. For an atomic (non-composite) service, SJF (Shortest Job First) scheduling is sufficient as a means for optimizing average response time. SJF is a scheduling policy that minimizes the response time of light requests at the expense of the heavier ones. All incoming service calls placed in the waiting queue are processed in order of their size, as shown in figure 6. Smith [13] proved that SJF is the optimal scheduling policy for minimizing average response time if accurate information on job sizes is available.
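An SJF waiting queue amounts to a priority queue keyed on the estimated job size. The following sketch (our own illustration, not code from the paper) shows the idea; the FIFO tie-break keeps requests of equal size in arrival order.

```python
import heapq

class SJFQueue:
    """Waiting queue that releases the request with the smallest estimated
    job size first (Shortest Job First)."""

    def __init__(self):
        self._heap = []
        self._tie = 0           # arrival counter: FIFO tie-break for equal sizes

    def put(self, size, request):
        heapq.heappush(self._heap, (size, self._tie, request))
        self._tie += 1

    def get(self):
        # pop the entry with the smallest (size, arrival) key
        return heapq.heappop(self._heap)[2]
```

A proxy implementing transparent scheduling would `put` each classified request with its estimated size and `get` the next one whenever the admission controller frees a slot.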

Figure 5: Component-Level Scheduling


Figure 6: SJF Scheduling

If each component scheduler can evaluate the size of the remaining work in the workflow and is provided with information on the topology of the workflow, it is possible to use ALWKR. ALWKR is similar to LWKR [11, 14] with one major difference: the use of the workflow topology. LWKR assumes a sequence pattern, i.e. that each component request has at most one component preceding it [14]. ALWKR relaxes this assumption by analyzing the actual workflow topology, thus allowing for more complex workflow patterns. LWKR sums up all remaining sub-requests to determine the costs, while ALWKR calculates the costs for each branch of the workflow, thus recognizing that some requests are executed in parallel. For the sequence pattern, the remaining time is the sum of the processing times at each remaining service, so the closer the composite request comes to the end, the higher its priority becomes. The priority of a request is inversely proportional to its remaining work. Consequently, small requests are processed significantly faster.

Figure 7: Nested composition using Split, Sequence and Synchronization

For the parallel split-synchronization pattern, the remaining time is the completion time of the slowest branch of the workflow. Therefore all branches are evaluated, but only the one with the largest cost is considered. In the case of nested compositions, the service time is substituted by the corresponding remaining time for the segment of the composite service. It is noteworthy that the remaining time does not necessarily describe the exact time for processing a particular request; rather, it indicates the weight of the current request in the overall context compared to other requests. An example of a


composite service with the corresponding resulting remaining times is shown in figure 7. The composition is divided into five segments. The remaining time for executing the j-th segment is Rj, and the time for processing a request by the i-th service is Si. The corresponding remaining work amounts for each segment are: R0 = max(R1, R2), R1 = S1 + R3, R2 = S2 + R4 + S7, R3 = max(S3, S4), R4 = max(S5, S6).
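The recursion behind these formulas (sequence segments add up, parallel branches take the maximum) can be sketched as follows. This is our own illustration of the ALWKR cost computation under the stated rules; the topology encoding and function names are assumptions, not from the paper.

```python
def remaining(node, S):
    """Remaining work of a workflow segment.

    A node is ("seq", parts) or ("par", parts); each part is either a
    service index (its cost is S[index]) or a nested node. Sequences sum
    their parts, parallel splits are dominated by their slowest branch.
    """
    kind, parts = node
    costs = [S[p] if isinstance(p, int) else remaining(p, S) for p in parts]
    return sum(costs) if kind == "seq" else max(costs)

# The figure 7 composition: R0 = max(R1, R2), with
# R1 = S1 + max(S3, S4) and R2 = S2 + max(S5, S6) + S7.
FIG7 = ("par", [
    ("seq", [1, ("par", [3, 4])]),        # branch R1
    ("seq", [2, ("par", [5, 6]), 7]),     # branch R2
])
```

For example, with service times Si = i the branch costs are R1 = 1 + max(3, 4) = 5 and R2 = 2 + max(5, 6) + 7 = 15, so the composite request's remaining work is 15; its ALWKR priority is the inverse of that value.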


Figure 8.a: Model and Echo Service Response Time

4. Experiments


4.1 Simulation of Axis Services

In order to investigate the impact of scheduling, a large controlled environment containing multiple services would be required, since every service might need a separate host. A simulation capturing only the primary aspects of Web Services behaviour requires a less resource-intensive environment and allows conducting more comprehensive studies within the same time frame. Using the simulation tool AnyLogic [15], a model that describes the way WS and CWS respond to various loads was developed. Experiments (simulation and a reference implementation of a real system) indicate that services, including WS running a transactional database at the backend [16] as well as standalone computation-bound WS, are subject to thrashing. For instance, an origin of thrashing can be the parsing that is a mandatory part of each WS invocation. Consequently, the thrashing effect can appear in any WS without admission control if the load exceeds certain limits. Calibration of the model was performed by means of a basic WS called echo, implemented in Axis [1]. In order to accurately describe the observed behavior (e.g. page faults), the two main resources, CPU and memory, governed by PS (Processor Sharing), are included in the WS simulation model. These resources have been tuned to demonstrate the same thrashing effects that have been observed in the experiments. Figures 8.a and 8.b present a comparison of the actual Axis WS and the simulated WS. The observed response times and throughput indicate a close match. Since the simulated “echo” service exhibits the same behavior as the Axis WS, several models aggregated into one make it possible to simulate a CWS.


Figure 8.b: Model and Echo Service Throughput

Seven composite services [fig. 9, 12, 14, 16] used for evaluating the different admission and scheduling approaches include up to seven identical atomic services. To simplify the simulation, the costs of orchestration are assumed to be zero; therefore the time between receiving a response from a prior service and sending a request to the next service is neglected. Without loss of generality, any other value bigger than zero could be used instead. The upper limit on the number of concurrent jobs per service is set to 5. The job sizes of the client requests for the CWS are exponentially distributed with an average of 100 ms. The load has a periodic impulse nature: each impulse contains a peak of high load lasting 33 seconds (overload), followed by low load with a duration of 66 seconds. Each experiment covers 2 hours 46 minutes of simulated time. The charts depict the average response time in seconds of various compositions exposed to a gradual load. The X axis indicates the average request rate measured in requests per second.

4.2 Impact of Scheduling on CWS

ALWKR applied to a sequence of services behaves exactly like the Least Work Remaining scheduling policy. In [11] we investigate possible outcomes of applying LWKR to sequences of services and evaluate it in comparison to FIFO and component-level SJF. With sequences of services, LWKR exhibits stable response time optimizations of up to 50%.
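The pulse workload described above can be sketched as a generator of (arrival time, job size) pairs. The paper fixes only the 33 s/66 s pulse shape, the mean job size of 100 ms and the average request rate, so the 3:1 burst-to-base rate ratio, the seed and the function name below are illustrative assumptions of this sketch.

```python
import random

def pulse_workload(avg_rate, horizon_s, high_s=33.0, low_s=66.0,
                   burst_factor=3.0, seed=1):
    """Generate (arrival_time, job_size) pairs for a periodic pulse load.

    Job sizes are exponential with mean 0.1 s, matching the simulation
    setup; arrivals are Poisson with a rate that alternates between a
    high phase (high_s seconds) and a low phase (low_s seconds).
    """
    rng = random.Random(seed)
    period = high_s + low_s
    # pick the two rates so the time-average over one period equals avg_rate
    low_rate = avg_rate * period / (burst_factor * high_s + low_s)
    high_rate = burst_factor * low_rate
    t, jobs = 0.0, []
    while t < horizon_s:
        rate = high_rate if (t % period) < high_s else low_rate
        t += rng.expovariate(rate)                   # Poisson inter-arrival
        jobs.append((t, rng.expovariate(1 / 0.1)))   # mean job size 100 ms
    return jobs
```

For an average rate of 9 req/s this yields roughly 5.4 req/s in the low phase and 16.2 req/s during the 33-second burst, which is the kind of transient overload the admission controller and schedulers are evaluated against.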


Figure 9: The CWS (2 & 4 services) (Split & Synchronization)


Figure 11: Average response time of CWS with Split and Synchronization patterns (4 components)


Figure 10: Average response time of the CWS with Split and Synchronization patterns (2 components)

In the first experiment the composite service consists of two parallel invocations [fig. 9]. Figure 10 shows that scheduling compensates for the steep increase in response time caused by the increasing load. Due to the underlying topology [fig. 9], the chance that scheduling decisions made by ALWKR will differ much from SJF is very small, and as a result the performance of SJF is very close to that of ALWKR. Nevertheless, ALWKR still outperforms SJF by 10-16%. In the composition of four parallel services, the variation in processing times between different branches is higher. Therefore scheduling of a composition with four services [fig. 9] improves performance at peak loads by 25-80% [fig. 11]. Applying SJF yields a 75% response time reduction, while ALWKR provides another 25-50% improvement compared to SJF. The quality of ALWKR scheduling depends on the diversity of job sizes within a particular CWS request; therefore a more sophisticated topology [fig. 12] allows better optimizations. The difference between the scheduling approaches can already be seen even at lower loads [fig. 13]. SJF exhibits a stable performance improvement of 18%-27% compared to FIFO, while ALWKR shows a 36%-40% improvement in average response time [fig. 13].

Figure 12: The CWS with Sequence, Split and Synchronization pattern (Split of Sequences)


Figure 13: Average response time of the composite service containing Sequence, Split and Synchronization patterns (Split of Sequences)

Four services orchestrated with a different topology, namely a Sequence of Splits, have two synchronization points [fig. 14]; the sequence contains two splits. The proximity of the results of SJF and ALWKR applied to a split of two services was shown earlier [fig. 9, 10]. A similar effect is observed with a sequence of splits. Scheduling saves up to 33-36% of the time for processing composite requests [fig. 15], while the difference between SJF and ALWKR does not exceed 10-15%.

better than FIFO; however, the difference does not exceed 10% in the case of lower loads (7-7.5 req/s) and 2-3% in the remaining cases [fig. 17].


Figure 14: The composite service containing Sequence, Split and Synchronization patterns (Sequence of Splits)


Figure 17: Average response time of the CWS with Sequence, Split and Synchronization pattern (Mixed)


Figure 15: Average response time of the composite service containing Sequence, Split and Synchronization patterns (Sequence of Splits)

ALWKR serves composite requests with less remaining work first. Therefore the mean processing time at Service 1 and Service 2 under this policy is higher than with SJF [fig. 18]. The remaining work includes the job sizes for Services 1 and 2, so there is a correlation between them and the total remaining work [fig. 7]; thus the processing times for Services 1 and 2 are lower than with FIFO. The same effect takes place while handling the subsequent component requests [fig. 18]. Consequently, the “smaller” composite service invocations travel faster through the composition, and as a result the overall composition response time is reduced [fig. 17].


Figure 16: The composite service containing Sequence, Split and Synchronization patterns (Mixed)


With the composition shown in figure 16, SJF demonstrates behaviour similar to FIFO, while ALWKR still exhibits stable performance improvements [fig. 17]. In this last experiment, Services 1 and 2 [fig. 16] are the first nodes to process the composite request. SJF attempts to reduce the average response time for each service as a separate entity and consequently optimizes the response times of Service 1 and Service 2 [fig. 18]. Nevertheless, the order in which the composite requests are processed at Service 1 and Service 2 strongly impacts the order of processing of the subsequent requests belonging to the same composite service invocation. It is important to note that in the case of FIFO and SJF the subsequent atomic services do not see the same request arrivals. Therefore, even after further attempts to minimize response time, Services 3..7 governed by SJF process requests more slowly than if FIFO were used. As a result the total average processing time for SJF is still


Figure 18: Average processing times for each component service with Sequence, Split and Synchronization pattern (Mixed). Avg. arrival rate 9.1 req/s.

In the current approach we have also employed an aging technique [17, 18] in order to avoid infinite delays of larger requests.
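One common way to combine aging with a remaining-work policy is to subtract a waiting-time credit from each request's priority, so large composite requests eventually win even against a steady stream of small ones. The sketch below is our own illustration of this idea; the paper does not specify its aging formula, and the `aging_rate` knob and class name are assumptions.

```python
import itertools
import time

class AgingALWKRQueue:
    """ALWKR-style queue with aging.

    A request's effective priority is its remaining work minus a credit
    that grows with waiting time, so requests with large remaining work
    cannot be postponed indefinitely. Smallest effective priority is
    served first.
    """

    def __init__(self, aging_rate=0.5, clock=time.monotonic):
        self.aging_rate = aging_rate
        self.clock = clock
        self.items = []              # (enqueue_time, remaining_work, id, request)
        self._ids = itertools.count()

    def put(self, remaining_work, request):
        self.items.append((self.clock(), remaining_work, next(self._ids), request))

    def get(self):
        now = self.clock()
        # re-evaluate aged priorities at every dispatch
        best = min(self.items,
                   key=lambda it: it[1] - self.aging_rate * (now - it[0]))
        self.items.remove(best)
        return best[3]
```

With `aging_rate = 0` this degenerates to pure least-remaining-work ordering; larger rates trade some mean response time for bounded waiting of heavy requests.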

5. Related Work

Elnikety et al. [18] propose using admission control and request scheduling for dynamic web pages. Three-tiered web application servers with a database at the backend are considered. A transparent proxy hides the server and implements scheduling. The approach also includes overload protection and preferential SJF scheduling augmented with an aging method. In addition, the authors describe mechanisms for estimating the job sizes of incoming requests. Another successful application of SJF scheduling, in the domain of static web pages, was done by Cherkasova [17]. She proposes the use of Alpha-scheduling to improve the overall performance. Alpha-scheduling is a modification of the SJF algorithm focused on preventing starvation of big requests, similar in spirit to aging methods [18]. There is also a set of works targeting the problem of Quality of Service (QoS) differentiation in the domain of Web Services. Siddhartha et al. [19] propose controlling SOAP traffic by means of the Smartware platform. Smartware is a management infrastructure for Web Services formed by three main components, namely an interceptor, a scheduler and a dispatcher. The interceptor is responsible for intercepting the incoming requests, parsing them and classifying them according to the user and client (device) type. The classified requests are placed in a waiting queue and their processing order is then determined. The QoS differentiation is achieved by means of scheduling driven by predefined business policies, which describe the priority of one class of clients over another. Smartware uses an implementation of static scheduling: the algorithm assigns priorities to service calls according to the classification of the interceptor. Since assigning static priorities could cause starvation of low-priority requests, Smartware applies a randomized probabilistic scheduling policy derived from lottery scheduling [20].
As the last step, the dispatcher forwards the requests to the service endpoints. Sharma et al. [21] also implement QoS differentiation by means of predefined business policies. The authors employ static scheduling: each request is assigned to a certain class, determined by the application type, device type and client (user) level. The priority is then calculated from component priorities, for example as a sum or a product. In order to avoid starvation of low-priority requests the authors propose applying probabilistic scheduling. Since static scheduling cannot ensure accurate throughput differentiation in the case of fluctuating loads [21], the authors propose extending static scheduling by adapting the priorities at runtime based on the observed

throughput. In case the observed throughput differs from its desired value, an adjustment is made. The adjustments are implemented in the form of a correction factor of a hyperbolic nature. As a result, the slower requests are sped up while the faster requests are slowed down, according to each request class. In our previous work [3] we investigate the impact of transient overloads on Composite Web Service behaviour. We showed that CWS orchestrating services without admission control tend to exhibit chaotic behaviour, which can be avoided if the orchestrated services are governed by admission control; in addition, admission control achieves significant performance increases. The combination of admission control with scheduling for optimizing single-service performance is studied in [6, 7, 11]. Using TPC-App [16], a two-tier B2B application benchmark, we evaluate the performance gains of SJF (Shortest Job First) compared to pure admission control and standard PS for a single service. By introducing a proxy between a consumer and a provider, it is possible to achieve transparent scheduling and admission control that leads to significant performance improvements in overload cases. In addition, the experimental evaluation shows that even in the absence of a priori knowledge, an SJF scheduler that uses observed runtime behaviour can produce schedules that outperform FIFO and PS, making it an attractive approach for boosting Web Services performance. The results indicate that transparent scheduling can be applied to Web Services as an effective and easy-to-implement approach for boosting performance and avoiding service provider thrashing. Besides that, we show in [7] that the processing time for a particular request can be estimated in an online fashion.
Clustering requests according to the operation type and applying exponential smoothing to the clusters achieves a strong 0.7 correlation between estimated and observed processing times.
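The per-cluster estimation just described can be sketched as follows. This is an illustrative reconstruction, not the implementation from [7]; the smoothing factor `alpha`, the default prior and the class name are assumptions.

```python
class JobSizeEstimator:
    """Online job-size estimation by operation cluster.

    Requests are clustered by operation name; each cluster keeps an
    exponentially smoothed estimate of the observed processing times:
    new = alpha * sample + (1 - alpha) * old.
    """

    def __init__(self, alpha=0.3, default=0.1):
        self.alpha = alpha
        self.default = default      # prior estimate for unseen operations (s)
        self.estimates = {}

    def estimate(self, operation):
        return self.estimates.get(operation, self.default)

    def observe(self, operation, measured):
        # the first sample seeds the cluster's estimate
        prev = self.estimates.get(operation, measured)
        self.estimates[operation] = self.alpha * measured + (1 - self.alpha) * prev
```

A scheduling proxy would call `estimate` when ordering the waiting queue and `observe` with the measured processing time once each response returns.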

6. Conclusions

This paper presents transparent scheduling of Web Services requests as a means for achieving better performance. In order to evaluate CWS scheduling policies, a simulation model was developed. The simulations show that scheduling can significantly reduce the average response time of a CWS in the case of pulse loads. The experiments showed the benefits of applying scheduling to CWS that orchestrate according to the sequence and split-synchronization workflow patterns. This paper also introduces a heuristic-based scheduling policy named Augmented Least Work Remaining

(ALWKR). ALWKR extends Least Work Remaining by considering parallel workflow branches. LWKR can be applied only to sequences of services and thus uses the sum of all remaining sub-requests to determine the costs; ALWKR calculates the costs for each branch of the workflow, thus recognizing that some requests are executed in parallel. ALWKR improves performance by up to 70-80%, but requires placing a proxy in front of each component. Applying component-level SJF scheduling appears beneficial in terms of performance only for simple structures orchestrating up to three services. For bigger and more complex structures, the attempt to improve the overall performance by means of SJF does not yield any noticeable optimizations. ALWKR outperforms SJF on average by 20-40%. While the results of applying scheduling are very promising, it is important to note that the current work focused on a very simplified SOA architecture. Future work in transparent scheduling of Web Services will address the following issues. Composite Web Services with more complex workflow patterns: scheduling Composite Web Services that implement different workflow patterns is still an open issue. Service-Level Agreements (SLA): SLAs are an increasingly important aspect of SOA; scheduling can be used as a means for minimizing penalties and supporting QoS contracts in critical situations.

References

[1] Apache Axis. [Online]. Available: http://ws.apache.org/axis/
[2] Visual Studio Home. [Online]. Available: http://msdn.microsoft.com/vstudio/
[3] D. Dyachuk and R. Deters, "The impact of transient loads on the performance of service ecologies," in DEST 2007: Inaugural IEEE International Conference on Digital Ecosystems and Technologies, 2007 (to appear).
[4] W. M. P. van der Aalst, A. H. M. ter Hofstede, B. Kiepuszewski, and A. P. Barros, "Workflow patterns," Distributed and Parallel Databases, 14(3), 2003, pp. 5-51.
[5] H. Heiss and R. Wagner, "Adaptive load control in transaction processing systems," in VLDB '91: Proceedings of the 17th International Conference on Very Large Data Bases, 1991, pp. 47-54.
[6] D. Dyachuk and R. Deters, "Scheduling of Composite Web Services," in DOA'06: OTM Workshops 2006, LNCS 4277, 2006, pp. 19-20.
[7] D. Dyachuk and R. Deters, "Transparent Scheduling of Web Services," in 3rd International Conference on Web Information Systems and Technologies, 2007 (to appear).
[8] A. Erradi and P. Maheshwari, "wsBus: QoS-Aware Middleware for Reliable Web Services Interactions," in EEE '05: Proceedings of the 2005 IEEE International Conference on e-Technology, e-Commerce and e-Service, 2005, pp. 634-639.
[9] S. P. Abdelkarim Erradi and Niranjan Varadharajan, "Differential QoS support in Web Services Management," ICWS, pp. 781-788, 2006.
[10] C. D. Gill, D. L. Levine and D. C. Schmidt, "The Design and Performance of a Real-Time CORBA Scheduling Service," Real-Time Systems, vol. 20, pp. 117-154, 2001.
[11] D. Dyachuk and R. Deters, "Optimizing performance of web service providers," in AINA 2007: The IEEE International Conference on Advanced Information Networking and Applications, 2007 (to appear).
[12] N. Mitra. SOAP Version 1.2 Part 0. [Online]. Available: http://www.w3c.org/TR/soap12-part0/
[13] W. E. Smith, "Various optimizers for single-stage production," Naval Research Logistics Quarterly, 1956.
[14] R. W. Conway, W. L. Maxwell and L. W. Miller, Theory of Scheduling. Addison-Wesley, 1967.
[15] XJ Technologies. AnyLogic 5.5. [Online]. Available: http://www.xjtek.com/
[16] TPC-App - Application Server and Web Service Benchmark. [Online]. Available: http://www.tpc.org/tpc_app/
[17] L. Cherkasova, "Scheduling strategy to improve response time for web applications," in HPCN Europe 1998: Proceedings of the International Conference and Exhibition on High-Performance Computing and Networking, 1998, pp. 305-314.
[18] S. Elnikety, E. Nahum, J. Tracey and W. Zwaenepoel, "A method for transparent admission control and request scheduling in e-commerce web sites," in WWW '04: Proceedings of the 13th International Conference on World Wide Web, 2004, pp. 276-286.
[19] P. Siddhartha, R. Ganesan and S. Sengupta, "Smartware - a management infrastructure for web services," in WSMAI, 2003, pp. 42-49.
[20] C. A. Waldspurger and W. E. Weihl, "Lottery scheduling: Flexible proportional-share resource management," in Operating Systems Design and Implementation, 1994, pp. 1-11.
[21] A. Sharma, H. Adarkar and S. Sengupta, "Managing QoS through Prioritization in Web Services," WISEW, pp. 140-148, 2003.