CSMR, VOL. 1, NO. 1 (2011)

On Scheduling in Service Oriented Architectures

Marius Ion
University POLITEHNICA of Bucharest
Faculty of Automatic Control and Computers, Computer Science Department
Email: [email protected]

Abstract

Service oriented architectures become more and more popular with the emergence and consolidation of new paradigms such as Clouds and Grids. In this context, scheduling becomes an important and difficult problem because the services are abstractions. In this paper I present some of the problems encountered when designing a job scheduler for service oriented architectures: estimating the performance of the services, estimating the running time of the jobs, and enabling the scalability of the system, and I propose solutions for them.

Keywords: Grid, Scheduling, Web Services, SOA, Resource Allocation

1. Introduction

Grid Computing relies on a large number of individual computers interconnected through a private or public network such as the Internet. The loose coupling of the elements that form a computing grid makes it highly scalable: computing grids range in dimension from only a few computers connected through a local area network (enterprise computing grids) to continental and intercontinental computing grids such as the EGEE Grid (Enabling Grids for E-SciencE) [2] and the TeraGrid [15], which provide computing power in the range of hundreds of teraflops. Grid Computing facilitates the access to computing resources that are geographically dispersed, and allows users remote access to resources that are underutilized. Another important objective for Grid Computing is to allow users to easily share data, typically in the form of files that are being jointly produced and used by collaborators in disparate locations. To this day, the Globus Toolkit [5] remains the de facto standard for building grid solutions, and it is based on OGSA (Open Grid Service Architecture).

In a distributed environment (e.g. Grids), discrete software agents must work together to perform tasks. Furthermore, the agents in a distributed system do not operate in the same processing environment, so they must communicate through hardware/software protocol stacks over a network. This means that communication within a distributed system is intrinsically slower and less reliable than direct code invocation and shared memory. This has important architectural implications, because distributed systems require that developers consider the unpredictable latency of remote access, and take into account issues of concurrency and the possibility of partial failure [10].
Service Oriented Architecture (SOA) is a good solution for increasing the abstraction level of communication, allowing applications to bind to services that evolve and improve over time without requiring modification to the applications that consume them. The advantages of SOA led to its adoption in Grid systems at a time when this concept was relatively new. This model was based on the existence of several layers, arranged in a way similar to an hourglass: the narrow part at the center corresponds to a small set of protocols and basic abstractions, on top of which a large number of components can be mapped, and underneath lies a large variety of low-level technologies. Grid Services can be aggregated in order to fulfill the needs of the Virtual Organizations. This new model, known as OGSA, aimed to align the Grid technologies with the web services standards. The advantages of this technology are the possibility to automatically generate code from the WSDL description, the possibility to discover services through public catalogs, the association between the descriptions of the services and interoperable network protocols, etc. The OGSA model has been implemented in the Globus Toolkit. The widely used gLite middleware [1] (especially its Workload Management System) is also based on a Service Oriented Architecture [11].

In this context, dynamic resource allocation improves the execution of workflow applications and allows users to define adequate policies. The most challenging issue is to allow users to dynamically change the policy employed by the scheduler at runtime, through a class loading mechanism. This enables the use of application profiling techniques to finely tune the scheduler in accordance with the characteristics of the environment it is running in, either by changing the parameters of the proposed policies, or by loading completely new ones.

The rest of the paper is structured as follows: Section 2 presents related work, Section 3 introduces scheduling in a SOA environment, Section 4 discusses the problems of estimating the performance of the services, Section 5 the difficulties of estimating the running times of the jobs, Section 6 presents methods for enabling the scalability of the system, and Section 7 concludes the paper.

2. Related Work

The Grid technologies that have been developed within the Grid community have produced protocols, services, and tools that address precisely the challenges that arise when we seek to build scalable VOs. These technologies include security solutions that support management of credentials and policies when computations span multiple institutions; resource management protocols and services that support secure remote access to computing and data resources and the co-allocation of multiple resources; information query protocols and services that provide configuration and status information about resources, organizations, and services; and data management services that locate and transport datasets between storage systems and applications. Because of their focus on dynamic, cross-organizational sharing, Grid technologies complement rather than compete with existing distributed computing technologies. For example, enterprise distributed computing systems can use Grid technologies to achieve resource sharing across institutional boundaries; Grid technologies can be used to establish dynamic markets for computing and storage resources, hence overcoming the limitations of current static configurations [6].

The gLite middleware [16] is developed within the EGEE project, and is widely adopted because of its maturity. The gLite Grid services follow a Service Oriented Architecture, which facilitates interoperability among Grid services, using solutions based on P-GRADE [8], and allows easier compliance with upcoming standards, such as OGSA, that are also based on these principles [14]. The architecture constituted by this set of services is not bound to specific implementations of the services; although the services are expected to work together in a concerted way in order to achieve the goals of the end-user, they can be deployed and used independently, allowing their exploitation in different contexts.
The Information Service (IS) provides information about the WLCG/EGEE Grid resources and their status [4]. This information is essential for the operation of the whole Grid, as it is via the IS that resources are discovered. The published information is also used for monitoring and accounting purposes. Much of the data published to the IS conforms to the GLUE Schema [1], a data model used for Grid resource monitoring and discovery.

The full potential of Web Services as an integration platform will be achieved only when applications and business processes are able to integrate their complex interactions by using a standard process integration model. The interaction model that is directly supported by WSDL is essentially a stateless model of request-response or uncorrelated one-way interactions [7]. To define richer interactions, a formal description of the message exchange protocols used by business processes in their interactions is needed. WS-BPEL defines a model and a grammar for describing the behavior of a business process based on interactions between the process and its partners. WS-BPEL also introduces systematic mechanisms for dealing with business exceptions and processing faults. Moreover, WS-BPEL introduces a mechanism to define how individual or composite activities within a unit of work are to be compensated in cases where exceptions occur or a partner requests reversal. A WS-BPEL process is a reusable definition that can be deployed in different ways and in different scenarios, while maintaining a uniform application-level behavior across all of them. In Grids, application workflows can be developed using the Globus Toolkit [12], and WS-BPEL can be integrated with Globus.

The resource allocation process in the Grid can be divided into three stages: resource discovering and filtering, resource selecting and scheduling according to certain objectives and policies, and job submission [17]. For the job submission process, within a domain, one or multiple local schedulers run with locally specified resource management policies [3]. Examples of such local schedulers include OpenPBS and Condor. A local resource manager (LRM) also collects local resource information, and reports the resource status information to the GIS.

3. Scheduling in a SOA Environment

Service oriented architectures become more and more popular with the emergence and consolidation of new paradigms such as Clouds and Grids. In this context, scheduling becomes an important and difficult problem because services are abstractions: they hide their actual implementation, the hardware characteristics of the systems they are running on, and the fact that they are spread across multiple institutions and geographical locations.

4. Difficulties When Estimating the Performance of the Services

The Glue Schema 1.3 specification [1], which is currently used in all the major grid projects, provides information about the deployed resources that is of limited use when scheduling jobs on services. This is because there is no direct connection between the elements describing the services and the hardware resources on which they are running. Therefore, methods other than the GIS must be used in order to acquire information about the status of the services of interest running in the Grid environment. One possible approach is to make estimates based on the previous performance of the services. This can be done by gathering information from the GIS, by keeping track, in the scheduler, of all the tasks sent to each service, and by receiving updates from a fault tolerance module which signals when the services have encountered errors [18]. A drawback of this technique is that it assumes that all the jobs executed on the services are sent through our scheduler, which is not always true. Because of this, a service may have multiple jobs waiting in its queue while the scheduler is aware of only a small fraction of them, and would consider the service less loaded than it actually is.
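The history-based estimation described above can be sketched as follows. This is a minimal, hypothetical illustration (the class and its fields are not part of any Grid middleware API): the scheduler keeps, per service, an exponential moving average of observed run times and a count of the jobs it knows it submitted, and derives a lower-bound wait estimate from them.

```python
from dataclasses import dataclass

@dataclass
class ServicePerformanceEstimator:
    """Estimates a service's load from the scheduler's own submission history.

    Hypothetical sketch: a real scheduler would also merge GIS data and
    fault-tolerance reports, which are assumed to arrive externally here.
    """
    alpha: float = 0.3       # smoothing factor for the moving average
    avg_runtime: float = 0.0 # exponentially weighted mean job runtime
    pending: int = 0         # jobs this scheduler knows it queued
    failures: int = 0        # error reports from the fault-tolerance module

    def job_submitted(self):
        self.pending += 1

    def job_finished(self, runtime: float):
        self.pending -= 1
        # Exponential moving average favours recent behaviour over old history.
        if self.avg_runtime:
            self.avg_runtime = self.alpha * runtime + (1 - self.alpha) * self.avg_runtime
        else:
            self.avg_runtime = runtime

    def job_failed(self):
        self.pending -= 1
        self.failures += 1

    def estimated_wait(self) -> float:
        # Lower bound only: jobs submitted outside this scheduler are
        # invisible, which is exactly the drawback discussed above.
        return self.pending * self.avg_runtime
```

A scheduler would then pick the candidate service with the smallest `estimated_wait()`, while remembering that the bound is optimistic for services that also receive jobs from elsewhere.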


Figure 1: Core Glue Schema 1.3 [1]

An improvement to this technique would be to have multiple proxies, one for each service that accepts jobs from the scheduler. These proxies intercept all the incoming invocations (even those which have not been sent through the scheduler), as well as the return values and the exceptions. In this way we gain access to much more precise information, such as the duration of the execution of each job; we can intercept all the exceptions and report all this information to the scheduler, thus enabling it to make better decisions in the resource allocation process. These proxies have to emulate the behavior of the underlying services, in order to ensure transparency and full compatibility. Therefore, they must expose an identical interface and have similar WSDL descriptions.

Another problem that may arise when scheduling tasks onto services is the fact that several services may share the same hardware resources. When using Glue Schema 1.3 [1] it is very difficult to overcome this shortcoming, because the performance of one service may depend on the performance of another one, and the scheduler may not be aware of this. Even if it is aware, it would be unable to correctly predict how well a service behaves, since its jobs may execute poorly because jobs from another service are running on the same hardware resource.
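The interception mechanism of such a proxy can be sketched as a thin wrapper that delegates every call to the real service while timing it and reporting the outcome. The `service` and `scheduler` objects below are hypothetical stand-ins; a real deployment would expose the same WSDL interface as the wrapped web service.

```python
import time

class MonitoringProxy:
    """Wraps a service endpoint and reports every call's outcome to the scheduler."""

    def __init__(self, service, scheduler, service_id):
        self._service = service
        self._scheduler = scheduler
        self._service_id = service_id

    def __getattr__(self, name):
        target = getattr(self._service, name)  # delegate to the real service

        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                result = target(*args, **kwargs)
                self._scheduler.report(self._service_id, name,
                                       time.monotonic() - start, ok=True)
                return result
            except Exception:
                # Exceptions are reported too, then re-raised transparently.
                self._scheduler.report(self._service_id, name,
                                       time.monotonic() - start, ok=False)
                raise

        return wrapper
```

Because the proxy sits in front of the service, it also sees invocations that bypass the scheduler, which is exactly what makes the reported load figures more trustworthy than scheduler-side bookkeeping alone.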

Figure 2: Glue Schema 2.0 Computing Service UML [19]

The Glue Schema 2.0 draft [19] proposes a completely new solution for describing the resources in the Grid environment. Unlike the previous versions, where there was a separation between the main entities of Service, Storage Element and Computing Element, in this draft the schema is centered on the idea of service oriented architecture. As we can see in Figure 2, there is a connection between the Computing Service entity and its Execution Environment, which describes the hardware capabilities of the system on which the service is running.


The Computing Service entity contains information about the total number of scheduled jobs: how many are suspended, how many are running, how many are waiting, and how many are currently managed by the Grid software layer while waiting to be passed to the underlying Computing Manager (LRMS), and hence are not yet candidates to start execution. The local resource management systems (LRMS) are modeled by Computing Manager entities, which provide extensive information about their characteristics. Because these entities are linked together in the Glue Schema 2.0 specification published in the GIS, the scheduler can have access to enough information to efficiently allocate resources based solely on this data. However, the current version is only a draft, and there have been no implementations of Glue Schema 2.0 [19] so far.
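A scheduler consuming such records could model them roughly as below. The field names follow the spirit of the draft's ComputingService attributes but are illustrative, not the normative schema names; `least_loaded` shows the kind of decision the linked entities enable.

```python
from dataclasses import dataclass

@dataclass
class ComputingService:
    """Subset of the GLUE 2.0 ComputingService job counts discussed above."""
    total_jobs: int
    running_jobs: int
    waiting_jobs: int
    suspended_jobs: int
    staging_jobs: int  # held by the Grid layer, not yet passed to the LRMS

    def backlog(self) -> int:
        # Jobs that still compete for execution slots on this service.
        return self.waiting_jobs + self.suspended_jobs + self.staging_jobs

def least_loaded(services):
    """Pick the service with the smallest backlog of not-yet-running jobs."""
    return min(services, key=lambda s: s.backlog())
```

With Glue Schema 1.3 this computation is impossible to do reliably, because the published service entities carry no such job-state breakdown tied to an execution environment.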

5. Difficulties When Estimating the Running Time of the Jobs

Scheduling algorithms usually require approximate knowledge about the running times of the jobs they receive as input. These estimates are either user-given or determined in various ways, such as code profiling [24], statistical determination of execution times [22], linear regression, or task templating [21], [25]. User-given estimates depend on the user's prior experience with executing similar tasks. Furthermore, users may overestimate task execution times knowing that schedulers rely on them. This kind of malicious behavior can lead to scheduling decisions where overestimated tasks are assigned to resources ahead of other tasks which might have executed earlier had the estimates been honest. To counteract this, schedulers could implement penalty systems where tasks belonging to such users are intentionally delayed from execution.

Code profiling works well on CPU intensive tasks but fails to cope with data intensive applications. Statistical estimation of run times can be done on static code, but faces the same problems as code profiling. In addition, these methods are not applicable to service oriented environments, where users do not have access to the actual solver application source code and have no insight into the solvers' performance. A solution to these issues could be to let the service decide how long a task would take to execute, but this would require additional time and probably executing the actual application [20].

Templating has also been used for assigning task estimates, by placing newly arrived tasks into already existing categories. This method relies on general task characteristics for matching, such as owner, solver application, machine used for submitting the task, input data size, arguments used, submission time, start time, etc. After selecting the set of criteria, a method based on genetic algorithms can be used to search the global space for similarities.
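The templating idea can be sketched as follows. This is a simplified, hypothetical illustration: the template key reduces a task to a few coarse attributes (owner, solver, rounded input size), and the estimate is a plain mean over previously observed run times for that key; the genetic-algorithm search over matching criteria mentioned above is out of scope here.

```python
def template_key(task):
    """Reduce a task to the coarse attributes used for template matching.

    Input size is rounded to the nearest 10 MB so that near-identical
    submissions fall into the same category.
    """
    return (task["owner"], task["solver"], round(task["input_size_mb"], -1))

class TemplateEstimator:
    """Predicts run times from the mean of previously observed similar tasks."""

    def __init__(self):
        self.history = {}  # template key -> list of observed run times

    def record(self, task, runtime: float):
        self.history.setdefault(template_key(task), []).append(runtime)

    def estimate(self, task, default: float = 60.0) -> float:
        observed = self.history.get(template_key(task))
        if not observed:
            return default  # no similar past submissions
        return sum(observed) / len(observed)
```

The choice of key is the whole game: too coarse a key mixes unrelated tasks, too fine a key degenerates into the "identical past submissions" problem discussed below for symbolic solvers.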
However, there are cases where such estimates are hard to give due to the nature of the solver application. For instance, consider two examples of solvers: one which processes satellite images and another which solves symbolic mathematical problems. In the first case it is quite easy to determine runtime estimates from historical execution times, as they depend on the image size and the required operation. In contrast, the latter case proves to be more problematic, as the run time depends both on the algorithm and the input parameters, and may vary widely even for nearly identical inputs. For example, when factorizing large integers, the run time for an input of size n can be very different from one of size n+1 or n-1. A solution would be to refine as much as possible the notion of similarity between two tasks, but in this particular case it could mean searching for identical past submissions. Besides this problem, services cannot be trusted, as their interfaces act as black boxes whose content can change without notice, invalidating previous historical information about them [20]. Transfer costs play an important role when estimating task completion times and are also a problem in SOA, since little or nothing is known about the location and network route towards a particular service. Since migrating large amounts of data such as satellite images requires a lot of time, this may constitute an issue for data intensive tasks.

There are scheduling algorithms which do not use such estimates at all, but rely on periodical workload re-balancing. Such algorithms take into consideration only resource load and migrate tasks from one resource to another when their loads become unbalanced. This approach works well, and tests have shown that heuristics such as Round-Robin [23] give results comparable to other classic heuristics based on run time estimations.

In practice there exist cases where users submit tasks with deadline constraints, without being interested in fast execution or workload management. As a result, it does not matter how fast, how slow, or where a task gets executed, as long as it completes inside the specified time interval. This is also applicable to environments based on SOA: as mentioned in the previous paragraph, little is known about the actual solvers, since they are exposed through a standard WSDL interface. Consequently, users usually attach deadline constraints instead of estimations to workflows or batch tasks and hope they will not be exceeded too greatly. It is then up to the scheduler to make sure that this does not happen [20]. This can be implemented by maintaining a queue of jobs for each web service. These queues can be contained in the scheduling module or, if proxies for the web services are used, inside these proxies. The idea is that we want to be able to migrate jobs from one web service to a similar one when the algorithm dictates so, and by keeping the queues on the proxies we can also migrate jobs that have not been submitted through the scheduler, if we find it necessary. This also reduces the performance requirements on the scheduler, since a part of the computations is moved to the machines which host the proxies and the services.

An important aspect regarding job migration from one service to another, or the case when a user wants to invoke two methods on a service, is the fact that services may be stateful. In this case, the invocations must be submitted to the same service, even if it is not the best choice from the point of view of load balancing. For example, let's assume that there is a web service which has two methods: one for submitting the job, and one for getting the results.
If a user calls the first method and submits a job to the service, he will need to get the results from the same service, even if other similar services exist which may be less loaded. Therefore, a mechanism must exist that forces the scheduler to assign tasks to a specific web service, and to prevent the migration of the tasks to other similar services, even if it affects the performance. This must be done in order to ensure the correctness of the results.
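The per-service queue with deadline ordering and non-migratable stateful jobs can be sketched as follows. All names are hypothetical; the point is that each job carries a deadline (earliest-deadline-first ordering) and a sticky flag that pins jobs belonging to stateful interactions to their service, while everything else remains available for re-balancing.

```python
import heapq

class DeadlineQueue:
    """Per-service job queue ordered by deadline (earliest-deadline-first).

    Sticky jobs belong to stateful interactions and must never be migrated;
    everything else may be moved to a similar, less loaded service.
    """

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so equal deadlines pop in FIFO order

    def push(self, job_id, deadline, sticky=False):
        heapq.heappush(self._heap, (deadline, self._counter, job_id, sticky))
        self._counter += 1

    def pop(self):
        """Return (job_id, deadline) of the most urgent job."""
        deadline, _, job_id, _ = heapq.heappop(self._heap)
        return job_id, deadline

    def steal_migratable(self):
        """Remove and return all non-sticky jobs, e.g. for load re-balancing."""
        movable = [e for e in self._heap if not e[3]]
        self._heap = [e for e in self._heap if e[3]]
        heapq.heapify(self._heap)
        return [(job_id, deadline) for deadline, _, job_id, _ in movable]
```

Hosting such a queue inside each proxy, rather than in the central scheduler, is what allows jobs submitted outside the scheduler to be re-balanced as well.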

6. Enabling the Scalability of the System

The Grid enables the sharing of resources over a wide geographical area; these resources are located in various sites which are administered by different organizations. Having a single scheduler assign jobs to services located on more than one site can lead to a serious performance bottleneck, as well as to a single point of failure in the system. Moreover, grid sites are administered by their organizations in different ways: each site may have its own policies for sharing resources. This increases the complexity of the system and makes the use of a single centralized scheduler inefficient. For these reasons, a scalable scheduling architecture must be used.


Figure 3: A Scalable Architecture Using Scheduling Agents [20]

In [20] a scalable architecture which uses policy-enabled scheduling agents is described. In this approach, each service provider exposes an agent responsible for scheduling decisions inside its domain. This agent accepts inquiries from other agents that either request to transfer some of their tasks to it or ask for information about its current load. The inquired agent then decides whether to accept new tasks or not. In this way a completely decentralized approach is achieved, in which each agent responsible for a domain makes its own decisions about which tasks to accept or where to relocate them. Furthermore, each agent can implement its own internal scheduling policy without interfering with external ones. This is why each provider and VO is independent, can use its own security and scheduling mechanisms, and at the same time can keep interacting with external services. In order for all these agents to cooperate, they must register themselves with a yellow-page service responsible for keeping track of them. Having agents run different scheduling algorithms can lead to problems, as the makespan might be negatively influenced, but the approach is essential when trying to maintain the scheduling policy independence of each VO or service provider. An important moment is when an agent decides to move a task to another agent: each time a task is rescheduled inside an agent's domain, if necessary, the rest of the agents are queried for proposals and the best one is chosen as the final location.
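The inquiry-and-proposal protocol can be sketched as below. This is a deliberately simplified, hypothetical model: the yellow-page registry is a plain dictionary, a proposal is just the agent's remaining capacity, and the asking agent relocates a task to the highest bidder (or keeps it locally when nobody accepts). A real implementation in [20] would exchange these messages between domains over the network.

```python
class SchedulingAgent:
    """One agent per provider domain (hypothetical sketch of the protocol above)."""

    def __init__(self, name, registry, capacity=10):
        self.name = name
        self.load = 0
        self.capacity = capacity
        self.registry = registry
        registry[name] = self  # register with the yellow-page service

    def propose(self):
        """Return remaining capacity as a bid, or None to decline new tasks."""
        free = self.capacity - self.load
        return free if free > 0 else None

    def relocate(self, task):
        """Query every other agent and hand the task to the best bidder."""
        bids = {name: agent.propose()
                for name, agent in self.registry.items() if name != self.name}
        bids = {n: b for n, b in bids.items() if b is not None}
        if not bids:
            self.load += 1  # nobody accepts: keep the task locally
            return self.name
        best = max(bids, key=bids.get)
        self.registry[best].load += 1
        return best
```

Because `propose` is the only point of contact, each domain stays free to compute its bid with whatever internal scheduling policy it likes, which is precisely the independence property argued for above.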

7. Conclusions

I have identified several problems which must be taken into account when designing a scheduler for service oriented architectures deployed in grid environments: estimating the performance of the services, estimating the running times of the jobs, and enabling the scalability of the system, and I have proposed solutions for these problems.

8. Acknowledgments

I would like to thank Prof. Dr. Ing. Valentin Cristea and Sl. Dr. Ing. Florin Pop, who coordinated the realization of this project.

References

[1] S. Andreozzi, S. Burke, L. Field, S. Fisher, B. Konya, M. Mambelli, J.M. Schopf, M. Viljoen, and A. Wilson. GLUE Schema Specification - Version 1.2, December 2005.


[2] Rüdiger Berlich, Marcus Hardt, Marcel Kunze, Malcolm Atkinson, and David Fergusson. EGEE: building a pan-European grid training organisation. In ACSW Frontiers '06: Proceedings of the 2006 Australasian Workshops on Grid Computing and e-Research, pages 105–111, Darlinghurst, Australia, 2006. Australian Computer Society.
[3] Fangpeng Dong and Selim G. Akl. An adaptive double-layer workflow scheduling approach for grid computing. In HPCS '07: Proceedings of the 21st International Symposium on High Performance Computing Systems and Applications, page 7, USA, 2007. IEEE Computer Society.
[4] A. Nobrega Duarte, Piotr Nyczyk, Antonio Retico, and Domenico Vicinanza. Global grid monitoring: the EGEE/WLCG case. In GMW '07: Proceedings of the 2007 Workshop on Grid Monitoring, pages 9–16, New York, NY, USA, 2007. ACM.
[5] Ian Foster and Carl Kesselman. Globus: A metacomputing infrastructure toolkit. International Journal of Supercomputer Applications, 11(2):115–128, 1997.
[6] Ian Foster, Carl Kesselman, and Steven Tuecke. The anatomy of the Grid: Enabling scalable virtual organizations. Int. J. High Perform. Comput. Appl., 15(3):200–222, 2001.
[7] Matjaz B. Juric. Business Process Execution Language for Web Services: BPEL and BPEL4WS, 2nd Edition. Packt Publishing, 2006.
[8] Peter Kacsuk, Tamas Kiss, and Gergely Sipos. Solving the grid interoperability problem by P-GRADE portal at workflow level. Future Gener. Comput. Syst., 24(7):744–751, 2008.
[9] Dimka Karastoyanova, Alejandro Houspanossian, Mariano Cilia, Frank Leymann, and Alejandro Buchmann. Extending BPEL for run time adaptability. In EDOC '05: Proceedings of the Ninth IEEE International EDOC Enterprise Computing Conference, pages 15–26, Washington, DC, USA, 2005. IEEE Computer Society.
[10] Samuel C. Kendall, Jim Waldo, Ann Wollrath, and Geoff Wyant. A note on distributed computing. Technical report, CA, USA, 1994.
[11] Marco Cecchi, Fabio Capannini, Alvise Dorigo, Antonia Ghiselli, Francesco Giacomini, Alessandro Maraschini, Moreno Marzolla, Salvatore Monforte, Fabrizio Pacini, Luca Petronzio, and Francesco Prelz. The gLite workload management system. In GPC '09: Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing, pages 256–268, Berlin, Heidelberg, 2009. Springer-Verlag.
[12] Dana Petcu. A comprehensive development guide for the Globus Toolkit. IEEE Distributed Systems Online, 9(6):4, 2008.
[13] Florin Pop, Ciprian Dobre, and Valentin Cristea. Decentralized dynamic resource allocation for workflows in grid environments. In Proceedings of the 10th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC '08), pages 557–563, Timisoara, Romania, 2008. IEEE Computer Society.
[14] Radu Prodan and Thomas Fahringer. From web services to OGSA: Experiences in implementing an OGSA-based grid application. In GRID '03: Proceedings of the 4th International Workshop on Grid Computing, page 2, Washington, DC, USA, 2003. IEEE Computer Society.
[15] Mohamed Sayeed, Kumar Mahinthakumar, and Nicholas T. Karonis. Grid-enabled solution of groundwater inverse problems on the TeraGrid network. Simulation, 83(6):437–448, 2007.


[16] Diego Scardaci and Giordano Scuderi. A secure storage service for the gLite middleware. In IAS '07: Proceedings of the Third International Symposium on Information Assurance and Security, pages 261–266, Washington, DC, USA, 2007. IEEE Computer Society.
[17] Jennifer M. Schopf. Ten actions when grid scheduling: the user as a grid scheduler. Pages 15–23, 2004.
[18] Marius Ion, Florin Pop, Ciprian Dobre, and Valentin Cristea. Dynamic resources allocation in grid environments. In Proceedings of the 11th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC '09), Timisoara, Romania, 2009. IEEE Computer Society.
[19] S. Andreozzi, S. Burke, L. Field, G. Galang, B. Konya, M. Litmaath, P. Millar, and J.P. Navarro. GLUE Schema Specification - Version 2.0, March 2009.
[20] Marc E. Frincu. Distributed scheduling policy in service oriented environments. In Proceedings of the 11th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC '09), Timisoara, Romania, 2009. IEEE Computer Society.
[21] A. Ali, J. Bunn, et al. Predicting resource requirements of a job submission. In Proceedings of the Conference on Computing in High Energy and Nuclear Physics, pages 273–281, 2004.
[22] L. David and I. Puaut. Static determination of probabilistic execution times. In Proceedings of the 16th Euromicro Conference on Real-Time Systems, pages 223–230. IEEE Press, 2004.
[23] N. Fujimoto and K. Hagihara. A comparison among grid scheduling algorithms for independent coarse-grained tasks. In International Symposium on Applications and the Internet Workshops, pages 674–680. IEEE Press, 2004.
[24] M. Maheswaran, T.D. Braun, and Howard Jay Siegel. Heterogeneous distributed computing. In Encyclopedia of Electrical and Electronics Engineering, vol. 8, pages 679–690. John Wiley & Sons, New York, NY, 1999.
[25] W. Smith, I. Foster, and V.E. Taylor. Predicting application run times using historical information. In Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing, LNCS vol. 1459, pages 122–142. Springer, 1998.
