Online and Offline Independent Tasks Allocation Schemes on Large Scale Master-Worker Heterogeneous Platforms

Olivier Beaumont, Lionel Eyraud-Dubois, Hejer Rejeb, Christopher Thraves∗
INRIA Bordeaux – Sud-Ouest, University of Bordeaux, LaBRI

Scientific research now gives rise to computations that are intractable for a single machine. Desktop Grids have emerged as an attractive and cheap solution for performing such computations: they exploit the idle time of machines in a network to carry out, collectively, a hard computation. On the Internet, where only 5 to 10% of the computational power of personal computers is used, platforms like BOINC [1] or SETI@home [2] use volunteers' machines to perform hard computations in mathematics, biology, and medicine or, in the case of SETI@home, to search for intelligent life outside Earth. Each machine performs a small part of a huge computation, small enough not to disturb the work of the volunteer machine, and together the machines complete the whole job. All the applications running on these platforms consist of a huge number of independent tasks, and all communications take place under the master-worker paradigm.

In this context, we consider the problem of allocating a large number of independent, equal-sized tasks to a heterogeneous large-scale computing platform. We model the platform as a set of servers that initially hold (or generate) the tasks to be processed by a set of workers (volunteers). All resources have different communication and computation speeds, and we model contention using the bounded multi-port model. Under this model, a processor can be involved in several communications simultaneously, provided that its incoming and outgoing bandwidths are not exceeded. For the sake of realism, however, another parameter needs to be introduced to bound the number of simultaneous connections that can be opened at a server node.
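The constraints of this model can be made concrete with a small feasibility check. The sketch below is only an illustration under stated assumptions, not code from the paper: the names `Server`, `Worker`, `check_allocation`, and `extra_degree` are hypothetical, each worker is summarized by a single capacity (the maximal rate of tasks it can receive and process), and an allocation is given as a matrix `rate[j][i]` of tasks per time unit sent from server j to worker i.

```python
from dataclasses import dataclass


@dataclass
class Server:
    bandwidth: float  # outgoing bandwidth, in tasks per time unit (assumed unit)
    max_degree: int   # bound on simultaneous connections at this server


@dataclass
class Worker:
    capacity: float   # assumed: max rate of tasks the worker can receive and process


def check_allocation(servers, workers, rate, extra_degree=0, eps=1e-9):
    """Return the throughput of `rate` if it is feasible, else raise ValueError.

    rate[j][i] is the task rate sent from server j to worker i.
    extra_degree models resource augmentation on the connection bound.
    """
    for j, server in enumerate(servers):
        row = rate[j]
        if any(r < -eps for r in row):
            raise ValueError(f"server {j}: negative rate")
        # Bounded multi-port: total outgoing rate must fit the bandwidth.
        if sum(row) > server.bandwidth + eps:
            raise ValueError(f"server {j}: outgoing bandwidth exceeded")
        # Degree constraint: count the connections actually used.
        degree = sum(1 for r in row if r > eps)
        if degree > server.max_degree + extra_degree:
            raise ValueError(f"server {j}: too many connections")
    for i, worker in enumerate(workers):
        incoming = sum(rate[j][i] for j in range(len(servers)))
        if incoming > worker.capacity + eps:
            raise ValueError(f"worker {i}: capacity exceeded")
    # Throughput: total number of (fractional) tasks processed per time unit.
    return sum(sum(row) for row in rate)


servers = [Server(bandwidth=10.0, max_degree=2), Server(bandwidth=4.0, max_degree=1)]
workers = [Worker(6.0), Worker(5.0), Worker(3.0)]

within_degree = [[6.0, 4.0, 0.0], [0.0, 0.0, 3.0]]
print(check_allocation(servers, workers, within_degree))

# Uses two connections at the second server, so it needs one extra connection.
needs_extra = [[6.0, 4.0, 0.0], [0.0, 1.0, 3.0]]
print(check_allocation(servers, workers, needs_extra, extra_degree=1))
```

In this toy instance the second allocation saturates every worker, but only by opening a second connection at a server whose bound is one; without `extra_degree=1` the check rejects it.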
We prove that, unfortunately, this additional parameter makes the problem of maximizing the overall throughput (i.e., the fractional number of tasks that can be processed within one time unit) NP-complete. On the other hand, we propose a polynomial-time algorithm, based on a slight resource augmentation, to solve this problem. More specifically, we prove that, if d_j denotes the maximal number of connections that can be opened at server S_j, then the throughput achieved by this algorithm with d_j + 1 connections is at least the optimal throughput with d_j connections. Going further, we consider the same problem in a dynamic setting, i.e., when workers and/or servers join and leave the system in an online manner. In that case, we show that if disconnections are not allowed, no approximation factor with respect to the optimal throughput (without resource augmentation) can be guaranteed, even using (any) resource augmentation. Finally, we propose an online version of the offline algorithm that maintains the optimal throughput using a very small resource augmentation, while producing at most one disconnection, one new connection, and one change in bandwidth allocations each time a worker joins or leaves the system.
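To see why one additional connection can matter, consider the special case of a single server. The sketch below is a toy illustration and not the algorithm of the paper, which handles a set of servers. It assumes each worker is described by a single capacity and uses the hypothetical name `single_server_throughput`. With one server of bandwidth b and a connection bound d, the best choice is to connect to the d workers of largest capacity, so the throughput is the minimum of b and the sum of those capacities.

```python
def single_server_throughput(bandwidth, capacities, degree):
    """Best throughput of one server limited to `degree` connections.

    Assumption: each worker absorbs at most its capacity, so serving the
    `degree` largest workers is optimal in this single-server case.
    """
    best = sorted(capacities, reverse=True)[:max(degree, 0)]
    return min(bandwidth, sum(best))


capacities = [4.0, 3.0, 3.0, 2.0]
for d in (2, 3):
    print(d, single_server_throughput(10.0, capacities, d))
```

Here, moving from two to three connections lets the server use its full bandwidth. The augmentation result stated above is of a different nature: it compares an algorithm allowed d_j + 1 connections with the optimum restricted to d_j.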

References

[1] D.P. Anderson. BOINC: A System for Public-Resource Computing and Storage. In 5th IEEE/ACM International Workshop on Grid Computing, pages 365–372, 2004.
[2] D.P. Anderson, J. Cobb, E. Korpela, M. Lebofsky, and D. Werthimer. SETI@home: an experiment in public-resource computing. Communications of the ACM, 45(11):56–61, 2002.

∗ Christopher Thraves is supported by the French ANR project Alpage.
