
TRAVELER: A Mobile Agent Based Infrastructure for Wide Area Parallel Computing

Brian Wims and Cheng-Zhong Xu
Department of Electrical and Computer Engineering
Wayne State University, Detroit, MI 48202
[email protected]

Abstract

This paper proposes a Java-based mobile agent infrastructure, TRAVELER, to support wide area parallel applications. Unlike other meta-computing systems, TRAVELER allows users to dispatch their compute-intensive jobs as mobile agents via a resource broker. The broker forms a parallel virtual machine atop servers to execute the agents. Because the agents can be programmed to satisfy their goals even if they move and lose contact with their creators, they can survive intermittent or unreliable network connections. During their lifetime, the agents can also move autonomously from one machine to another for load balancing, enhancing data locality, and tolerating faults. TRAVELER relies on an integrated distributed shared array runtime system to support agent communication on clusters of servers. We demonstrate the feasibility of TRAVELER on LU factorization problems.

executes trades between clients and servers and, upon receiving an agent, forms a parallel virtual machine out of the available servers. The agent is then cloned for each server. The cloned agents run in a single-program-multiple-data (SPMD) paradigm. They execute on the virtual machine independently of the broker. Servers of the machine can report results to the broker or directly to clients. Note that the system may comprise more than one broker. Brokers are organized hierarchically for a wide area computational grid.
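The broker's role described above (forming a virtual machine from available servers, then cloning the agent once per server for SPMD execution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, and the selection rule (prefer servers with the most free CPUs) stands in for whatever trading policy TRAVELER actually applies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: none of these names come from the TRAVELER code base.
final class BrokerSketch {
    /** A registered server and the number of CPUs it currently offers. */
    static final class ServerInfo {
        final String name;
        final int freeCpus;
        ServerInfo(String name, int freeCpus) { this.name = name; this.freeCpus = freeCpus; }
    }

    /** One clone of the submitted agent, bound to a server and given an SPMD rank. */
    static final class AgentClone {
        final String server;
        final int rank;
        final int size;
        AgentClone(String server, int rank, int size) {
            this.server = server; this.rank = rank; this.size = size;
        }
    }

    /** Assumed policy: take servers with the most free CPUs until the request is covered. */
    static List<ServerInfo> formVirtualMachine(List<ServerInfo> registered, int cpusWanted) {
        List<ServerInfo> sorted = new ArrayList<>(registered);
        sorted.sort(Comparator.comparingInt((ServerInfo s) -> s.freeCpus).reversed());
        List<ServerInfo> vm = new ArrayList<>();
        int covered = 0;
        for (ServerInfo s : sorted) {
            if (covered >= cpusWanted) break;
            vm.add(s);
            covered += s.freeCpus;
        }
        if (covered < cpusWanted) throw new IllegalStateException("not enough free CPUs");
        return vm;
    }

    /** Clones the agent once per server; every clone runs the same program under its own rank. */
    static List<AgentClone> cloneForEachServer(List<ServerInfo> vm) {
        List<AgentClone> clones = new ArrayList<>();
        for (int rank = 0; rank < vm.size(); rank++) {
            clones.add(new AgentClone(vm.get(rank).name, rank, vm.size()));
        }
        return clones;
    }
}
```

Each clone carries its rank and the virtual machine size, which is the usual way an SPMD program decides which share of the data it owns.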

[Figure 1 labels: Application; Client; VPI; Agent Task; Agent; Server; Virtual Machine. Step 1: Client submits Agent to Broker. Step 3: Virtual Machine executes task while communicating with Client.]

1 Introduction

The 1990s have seen explosive growth of the Internet and of Web-based information sharing and dissemination systems. The Internet also shows the potential to form a giant computing resource out of networked computers. Existing web-based high performance metacomputing infrastructures, such as Bayanihan [3], Charlotte [1], and Javelin [2], run in a “pull” model. That is, one machine maintains a pool of tasks as Java applets and dispatches these applets to clients on demand. Because it relies on voluntary participants, this model works well for applications of common interest to the Internet community. However, it cannot guarantee service quality from the perspective of end users. This paper proposes a novel “push”-based high performance meta-computing infrastructure, TRAVELER. Mobile agents have been applied to artificial intelligence, information gathering, and services on the Internet. TRAVELER employs mobile agent technology in a nontraditional way.

2 Architecture of TRAVELER

Relying on Java-based mobile agent technologies, TRAVELER allows clients to declare their parallel applications as mobile agents and request services by dispatching the agents to a broker. As shown in Figure 1, the broker

[Figure 1 labels, continued: Broker. Step 2: Broker distributes Agent to an initial Virtual Machine.]

Figure 1. Architecture of Traveler

Specifically, a client defines a computational task as an AgentTask and creates a virtual processor interface (VPI) for communication among the client, the broker, and the servers. The VPI creates a ParallelAgent to wrap the AgentTask object and then sends the ParallelAgent object to a broker via RMI. The broker collects the states of the registered servers and forms a virtual machine out of them for the ParallelAgent object. On each server, the task agent spawns threads for multiprocessing. A monitoring agent can be created to oversee execution of the code on the virtual machine. Multithreaded agents run on the virtual machine, supported by an integrated distributed shared array (DSA) run-time system. Agents communicate with one another by accessing user-defined shared regions of distributed arrays. Servers can contact clients for input data or to return results through callback handlers carried with the task agent. The virtual machine is destroyed upon completion of the computation.

Throughout the lifetime of an agent, the availability of computational resources on its servers may change. Servers may also stop their services due to unexpected events. In both scenarios, TRAVELER’s virtual machine must be reconfigured to adapt to the change in resource supply. Java virtual machines allow only an implementation of weak mobility. That is, upon arrival at a destination the agent must be restarted from the beginning with its instance values restored. Because migrating Java-based multithreaded agents is complex, we gear reconfigurable virtual machines toward the popular bulk synchronous computational model. Such a computation proceeds in phases. During each phase, agents perform calculations independently and then communicate with their data-dependent peers. The phases are separated by global synchronization operations. For simplicity of implementation, we restrict a virtual machine to be reconfigured only between phases of a computation. Since little information needs to be recorded for the computation to proceed into the subsequent phase, the threads of an agent can be restarted easily from its limited instance variables.

In TRAVELER, each user program starts with a master thread of control. For example, in LU factorization of matrices, the thread calls a parallel method, luVPI, which establishes an application-specific VPI object and creates a ParallelAgent object to wrap a task agent luFact. Note that luFact extends AgentTask and implements the actual LU factorization functions. The luFact object is wrapped into the ParallelAgent via the addTask method, called inside luVPI.
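The restart-at-phase-boundary idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not TRAVELER's implementation: the class name, the toy per-phase computation, and the use of an in-memory Java serialization round trip to stand in for a move between servers are all hypothetical. The point it shows is that a phase counter plus the application data is enough state to resume after a move.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical sketch of weak mobility at phase boundaries; not TRAVELER's actual classes.
final class PhasedAgentSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private int nextPhase = 0;      // the only control state that must survive a move
    private final int totalPhases;
    private long partialSum = 0;    // stand-in for application data

    PhasedAgentSketch(int totalPhases) { this.totalPhases = totalPhases; }

    boolean isDone() { return nextPhase >= totalPhases; }
    long result() { return partialSum; }

    /** Runs exactly one phase; a real agent would compute, communicate, then synchronize. */
    void runOnePhase() {
        partialSum += nextPhase + 1;   // toy computation for the current phase
        nextPhase++;                   // reaching here marks the phase boundary
    }

    /** Simulates a move: serialize the instance values, then rebuild the agent from them. */
    static PhasedAgentSketch migrate(PhasedAgentSketch agent) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(agent);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PhasedAgentSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("migration failed", e);
        }
    }

    /** Runs all phases, moving the agent after every phase to show that restart works. */
    static long runWithMigrations(int totalPhases) {
        PhasedAgentSketch agent = new PhasedAgentSketch(totalPhases);
        while (!agent.isDone()) {
            agent.runOnePhase();
            agent = migrate(agent);    // reconfiguration happens only here, between phases
        }
        return agent.result();
    }
}
```

Because no thread stacks or program counters are captured, the restored agent simply re-enters its loop at `nextPhase`, which is what makes the bulk synchronous restriction convenient under weak mobility.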

were run either on the same machine as the broker or on workstations of a remote local area network. All code was written in Java and compiled with JDK 1.1.6.

[Figure 2 plot: bars for 1 Server, 2 Servers, and 3 Servers; y-axis in mSec (0.00 to 200.00); legend text: "to Server 3 to Server 2 to Server 1 Overhead to Broker". Bar values as extracted, in source order: 26.89, 23.00, 23.33, 21.78, 24.44, 20.89, 63.11, 44.56, 29.89, 37.44, 40.56, 48.44.]

Figure 2. Cost of Creating a Virtual Machine

Establishing a virtual machine involves three major steps: a client submits agents to a broker, the broker executes trades, and the broker dispatches the agents to selected servers. Figure 2 shows the overall time, and a breakdown of that time, for creating a virtual machine of up to three servers upon receiving an agent from a client. The time from client to broker and from broker to each server is dominated by the cost of RMI and object serialization. Figure 3 presents the total execution time of LU factorization of a 100x100 integer array in TRAVELER. The array was partitioned among threads by the simplest row-wise block decomposition. The figure demonstrates the benefit of a parallel virtual machine spanning multiple servers.
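The row-wise block decomposition mentioned above can be sketched on a single machine as follows. This is an illustrative sketch under several assumptions, not the paper's luFact code: it uses plain Java threads instead of agents and the DSA runtime, works on a `double[][]` rather than the integer or float arrays of the experiments, and omits pivoting, so it is only valid for matrices whose pivots stay nonzero. The class name is hypothetical.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical sketch: LU factorization without pivoting, rows split into contiguous blocks.
final class BlockLuSketch {
    /** Factors a in place: U on and above the diagonal, multipliers of L below it. */
    static void factor(double[][] a, int threads) {
        int n = a.length;
        CyclicBarrier barrier = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            // Thread t owns rows [lo, hi): the simplest row-wise block decomposition.
            final int lo = t * n / threads;
            final int hi = (t + 1) * n / threads;
            workers[t] = new Thread(() -> {
                try {
                    for (int k = 0; k < n - 1; k++) {
                        // Update only the owned rows that lie below pivot row k.
                        for (int i = Math.max(lo, k + 1); i < hi; i++) {
                            a[i][k] /= a[k][k];
                            for (int j = k + 1; j < n; j++) {
                                a[i][j] -= a[i][k] * a[k][j];
                            }
                        }
                        barrier.await();   // row k+1 must be final before step k+1 reads it
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

The barrier after each elimination step plays the role of the global synchronization that separates phases in the bulk synchronous model. A block decomposition leaves the threads that own the upper rows idle in later steps, which is one reason it is described as the simplest choice rather than the best balanced one.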

[Figure 3 plot legend: 6 cpu (1 Server); 10 cpu (2 Servers); 14 cpu (3 Servers). Y-axis in mSec.]

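    public void luVPI() {
        VPI luVpi = new VPI();                   // application-specific virtual processor interface
        ParallelAgent pa = new ParallelAgent();  // wrapper that is dispatched to the broker
        luFact ft = new luFact(matrix);          // task agent; matrix is a float[][] assumed to be in scope
        pa.addTask(ft);                          // wrap the task agent in the ParallelAgent
        luVpi.addAgent(pa);
        luVpi.sendAgent();                       // send the ParallelAgent to a broker
        float[][] answer = (float[][]) luVpi.waitComplete();  // wait for the factored matrix
    }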

[Figure 3 plot axes: x-axis, Number of Threads Per Server (3 to 8); y-axis ticks from 600 to 1400.]

Figure 3. Timing of LU Factorization

3 Experimental Results

We evaluated TRAVELER in two respects. The first is the time to establish a parallel virtual machine, including the cost of RMI. The second is TRAVELER’s overall performance in LU factorization of matrices. Experiments with TRAVELER on more parallel applications are reported in [4]. All experiments were conducted on a cluster of four SUN Enterprise Servers: one 6-way E4000 and three 4-way E3000s. The machines are connected by a 155 Mb/s ATM switch. We designated one 4-way machine as the broker and the others as servers for parallel virtual machines. Clients

References

[1] A. Baratloo, et al. Charlotte: Metacomputing on the Web. In Proc. of 9th Int’l Conf. on PDCS, 1996.
[2] B. Christiansen, et al. Javelin: Internet-based parallel computing using Java. Tech. Report, UC at Santa Barbara, 1997.
[3] L. Sarmenta and S. Hirano. Bayanihan: Building and studying web-based volunteer computing systems using Java. Future Generation Computer Systems, Vol. 15(6), 1999.
[4] B. Wims and C. Xu. Traveler: A Mobile Agent Infrastructure for Wide Area Parallel Computing. Tech. Report, ECE, WSU, January 1999. http://www.pdcl.eng.wayne.edu/traveler