Web Based Framework for Parallel Computing

Zhikai Chen (a), Kurt Maly (a), Piyush Mehrotra (b), Praveen K. Vangala (a), Mohammad Zubair (a)

(a) Computer Science Department, Old Dominion University, Norfolk VA 23529 USA
(b) ICASE, MS 132C, NASA Langley Research Center, Hampton VA 23681 USA

Abstract

Parallel and distributed computing on a cluster of workstations is being increasingly applied to a variety of large computational problems. Several software systems have been developed that make distributed computing available to an application programmer. However, these systems either are not web-based or lack a collaborative environment. The increasing use of Web technology for Internet and Intranet applications is making the web an attractive framework for distributed applications, not least because the interface can be made platform independent. In this paper, we describe JAVADC, a Web-Java based framework that enables parallel applications written using PVM, pPVM, and MPI. We also discuss the issues and the design for a general Collaborative Distributed Computing Environment (CDCE). CDCE is an environment to design, execute, monitor, and control a distributed heterogeneous computing application with the following features: (i) it will allow collaborative design and control, (ii) it is easy to access, (iii) it is easy to use, (iv) it will work on heterogeneous platforms, and (v) it will be flexible enough to adapt to different scenarios.

This work was supported by the National Aeronautics and Space Administration under NASA Contract No. NAG 1 1550 and NASA Contract No. NAS1-19480, while the authors were in residence at ICASE, NASA Langley Research Center, Hampton, VA 23681.

1 Introduction

Parallel and distributed computing on a cluster of workstations is being increasingly applied to a variety of large computational problems. Several software systems have been developed that make distributed computing available to an application programmer. We distinguish two categories of such systems. The first consists of run-time environments created to make distributed computing available to the general application programmer. Examples of these environments are PVM [1], pPVM [5], Linda [10], P4 [11], and Express [12]. They all support distributed computing in varying degrees of generality; however, none of them is web based, they lack collaborative features, and they are mostly suitable for running SPMD programs. The second category, which addresses large distributed heterogeneous applications, includes FIDO and MIDAS [7, 8]. However, both of these systems are either hardwired to a specific problem or too restrictive. Their other major limitation is that they lack a collaborative environment that allows different members of a group to interact with the application at various stages of its execution.

There is a Web based computing environment, WebFlow, that enables distributed applications over the Web [9]. WebFlow is an attractive working environment for parallel distributed applications written in Java, but it is not targeted at integrating legacy applications written in Fortran or C.

In this paper, we describe JAVADC [13] (JAVA for Distributed Computing), which is a Web-Java based framework to enable parallel and distributed applications written using PVM, pPVM, and MPI [6]. The Java interface is integrated with the Netscape Web browser to keep the user interface familiar and easy to learn. The use of Java makes the graphical interface portable to different platforms.

In this environment, a user in one Internet domain can configure a parallel environment on a high-performance workstation cluster, HPC, in another domain, run an application on the HPC, and monitor its progress. We have successfully run scientific computations that were launched from a PC in the odu.edu domain, with all files located in that domain, executed on the HPC in the icase.edu domain, and monitored on the PC. We did not observe any significant performance degradation due to the Java server, the client communication, or the execution.

The JAVADC prototype is a good environment for running applications based on PVM, pPVM, MPI, etc. However, the environment is not rich enough to support a general distributed heterogeneous computing application. Such an application, for example the multidisciplinary design optimization (MDO) of an aircraft, generally consists of multiple heterogeneous modules interacting with each other to solve an overall design problem [7]. Typically these modules are developed in different disciplines and are optimized independently. The traditional way of integrating these modules and optimizing them for the overall design is a long and tedious process (typically taking several weeks). The slowness of this process is mainly due to the absence of a collaborative environment where (i) different modules and their interaction can be specified, and (ii) testing, monitoring, and steering of the overall design can be done.

To address these problems we propose a Collaborative Distributed Computing Environment (CDCE) that extends JAVADC. The CDCE is a three-tier architecture that provides a collaborative environment and significantly reduces the overall time for solving a multidisciplinary optimization problem. The front-end, or first tier, of this system is a graphical user interface integrated with the Web. It is implemented using Netscape Internet Foundation Classes (IFCs). Integrating the interface in a web browser, e.g., Netscape, will provide users with a familiar interface on desktops ranging from Unix-based workstations to Windows-based PCs and Macintoshes. The middle tier consists of logic to process the user input and interact with modules running on a heterogeneous set of machines. These modules, along with the control processes, form the last tier. The overall design is a client-server based architecture. The main advantage of a three-tier system is that the client, or front-end, becomes very thin, thus making it feasible to run on low-end machines.

We now briefly discuss JAVADC. Next we discuss the design of CDCE, and finally we present the conclusion.

2 JAVADC

Consider a situation where a user in one Internet domain wants to (i) configure a parallel environment on a high-performance workstation cluster, HPC, in another domain, (ii) run an application on the HPC, and (iii) monitor its progress. We have designed an architecture, JAVADC (JAVA for Distributed Computing), which supports this process independent of the platform of the user's host machine. The architecture of JAVADC is illustrated in Figure 1; it consists of five basic components. For lack of space we describe the working of JAVADC only for enabling MPI applications.

[Figure 1 diagram. Labels: Remote machine, Web Client, Java Client, Web Server, CGI Script, Java Server, HPC, Host A, Host B, Host C, Host D, Application Site Domain, Host 1, Host 2, Host 3. Numbered arrows: 1. Send the properties of MPI and application details to the server. 2. Get the application source. 3. Run the application.]

Figure 1: Basic Components of JAVADC.

Web Client: A graphical World Wide Web browser with a Java runtime environment. For our prototype we are using the Netscape browser.

Web Server: An HTTP server. We are using the public domain Apache server.

JavaClient: A Java based graphical user interface which runs under the browser environment. The JavaClient interacts with the user and communicates with the JavaServer over the network. The client supports a drag and drop feature for running MPI applications (see Figure 2).

JavaServer: A Java based server which is the main working engine of JAVADC. It collects user preferences regarding the application and MPI, and then runs the application on the MPI HPC. For this, it may be necessary to move the application files from a different site to the MPI HPC; this task is also done by the server. The server details are discussed in the next section.

CGI Script: Executed on the web server when a user follows the JavaClient interface link from the MPI home page. The script starts the JavaServer if it is not already running.

HPC: A cluster of workstations on which a user runs an MPI application.

Application Site: The MPI application is started on one of the machines in this domain. In our JAVADC architecture the application site domain need not be the same as the MPI HPC domain.
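The "start the JavaServer if it is not already running" behavior of the CGI script can be sketched in Java as follows. This is only an illustration under stated assumptions: the paper does not say how the script detects a running JavaServer, so the sketch assumes a simple TCP probe on a known port. The class name ServerLauncher, the timeout, and the launch command are hypothetical and not part of JAVADC.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;

// Hypothetical helper (not part of JAVADC): probes a port and launches
// a server process only when nothing is listening there yet.
final class ServerLauncher {
    private ServerLauncher() {}

    // Returns true if something accepts TCP connections on host:port
    // within timeoutMs milliseconds.
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Starts the server with the given command unless it is already
    // listening. Returns true only if a new process was started.
    static boolean ensureRunning(String host, int port, List<String> command)
            throws IOException {
        if (isListening(host, port, 500)) {
            return false; // already running, nothing to do
        }
        new ProcessBuilder(command).inheritIO().start();
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A local ServerSocket stands in for an already running JavaServer.
        try (ServerSocket stub = new ServerSocket(0)) {
            int port = stub.getLocalPort();
            System.out.println("listening: " + isListening("127.0.0.1", port, 500));
            System.out.println("started:   "
                    + ensureRunning("127.0.0.1", port, List.of("unused-command")));
        }
    }
}
```

Probing a port is only one possible check; a lock file or a process table lookup would serve the same purpose, and the paper does not indicate which approach the prototype uses.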

Figure 2: Running an application using drag and drop.

2.1 Example

We illustrate the working of JAVADC by stepping through the execution of an application where the code resides in one domain, the application is run on a Unix HPC in another domain, and the output data are sent to a PC in another network of that domain.

Step 1: The user launches the JavaClient using a Netscape browser. We assume the user is already browsing the MPI home page, which contains a link "Running MPI interface". The link points to a CGI script on the Web Server which, when executed, starts the JavaServer (if it is not already running). The CGI script also sends the default configuration file to the Web Client along with the Java applet. Next, the browser starts the applet and opens a graphical interface. The user specifies the application properties, such as the location of the source code and the input/output files.

Step 2: The user runs the application by dragging the application icon onto the MPI icon (see Figure 2). We assume here that the user is interested in running the application on the default configuration of MPI. Once the application icon is dropped onto the MPI icon, the client sends the application information along with the MPI configuration to the server.

Step 3: After receiving the configuration and application details file, the server gets the application from the application domain site and runs it on the HPC.

Step 4: The server monitors the host machines and sends the status to the client. After the application has executed, the server sends the output to the client. See Figure 3.
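The hand-off in Steps 2 and 3, where the client sends the application information and MPI configuration to the server, can be sketched as a small message type. This is an illustration only: the paper does not specify JAVADC's wire format, so the sketch assumes a key=value text encoding built on java.util.Properties. The class name JobRequest and the keys app.source, app.output, and mpi.processes are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical message format (not taken from JAVADC).
final class JobRequest {
    final String sourceLocation; // where the server fetches the application from
    final String outputFile;     // where the results should be written
    final int processes;         // number of MPI processes requested

    JobRequest(String sourceLocation, String outputFile, int processes) {
        this.sourceLocation = sourceLocation;
        this.outputFile = outputFile;
        this.processes = processes;
    }

    // Client side: flatten the request into key=value text.
    String encode() {
        Properties p = new Properties();
        p.setProperty("app.source", sourceLocation);
        p.setProperty("app.output", outputFile);
        p.setProperty("mpi.processes", Integer.toString(processes));
        StringWriter out = new StringWriter();
        try {
            p.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Server side: parse the text and reject incomplete requests.
    static JobRequest decode(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String source = p.getProperty("app.source");
        String output = p.getProperty("app.output");
        String count = p.getProperty("mpi.processes");
        if (source == null || output == null || count == null) {
            throw new IllegalArgumentException("incomplete job request");
        }
        return new JobRequest(source, output, Integer.parseInt(count));
    }

    public static void main(String[] args) {
        String wire = new JobRequest("host1:/apps/solver", "result.out", 4).encode();
        JobRequest received = decode(wire);
        System.out.println(received.sourceLocation + " on " + received.processes + " processes");
    }
}
```

Validating the request on the server side before fetching the application keeps a malformed submission from reaching the HPC; whether the prototype performs such a check is not stated in the paper.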

3 CDCE: General Web Based Framework

The CDCE system is a three-tier architecture. The first tier consists of a client interface that is implemented as an applet using Netscape IFCs. The client interface has four sub-interfaces, namely application specification interface, resource mapping and execution interface, and monitor and control interface. The middle tier consists of a user interface server (UIS) and an execution controller (EC). The third tier consists of a lightweight process along with the application execution modules.

Figure 3: Monitor and Status Window.

3.1 CDCE Overview

[Figure 4 diagram. Labels: User 1, User 2, Web Client (twice), Applet (twice), Web Server, Java UIS, Java EC, Lightweight process (three times), Modules (three times).]

Figure 4: Overview of the complete system.

The overview of the complete system is shown in Figure 4. The project manager sets up the project with the help of a web-based client interface. Typically, this interface is invoked by downloading a document identified by a URL. The Java based applet is downloaded along with the document and, after an authentication process, opens an interface for the manager. To complete the set-up the manager:

(i) uses the application specification interface to specify, visually or otherwise, the different modules and their dependencies; and

(ii) sets up the privileges of the team members involved in the project, providing individual members access to their modules.
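The two set-up tasks above, recording modules with their dependencies and granting members access to their modules, can be sketched as a small data structure. This is only an illustration: the paper does not describe how CDCE stores this information. The class name ProjectSpec, its methods, and the module and member names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical project description (not taken from CDCE).
final class ProjectSpec {
    // module name -> names of the modules it depends on
    private final Map<String, Set<String>> prerequisites = new LinkedHashMap<>();
    // team member -> modules that member may access
    private final Map<String, Set<String>> privileges = new HashMap<>();

    // Registers a module together with the modules it depends on.
    void addModule(String module, String... dependsOn) {
        prerequisites.put(module, new LinkedHashSet<>(Arrays.asList(dependsOn)));
    }

    // Gives a team member access to a module that has already been registered.
    void grant(String member, String module) {
        if (!prerequisites.containsKey(module)) {
            throw new IllegalArgumentException("unknown module: " + module);
        }
        privileges.computeIfAbsent(member, k -> new HashSet<>()).add(module);
    }

    // True only if the member was explicitly granted access to the module.
    boolean canAccess(String member, String module) {
        return privileges.getOrDefault(member, Collections.emptySet()).contains(module);
    }

    // Read-only view of a module's prerequisites (empty if the module is unknown).
    Set<String> prerequisitesOf(String module) {
        return Collections.unmodifiableSet(
                prerequisites.getOrDefault(module, Collections.emptySet()));
    }

    public static void main(String[] args) {
        ProjectSpec spec = new ProjectSpec();
        spec.addModule("aerodynamics");
        spec.addModule("structures", "aerodynamics");
        spec.grant("alice", "structures");
        System.out.println(spec.canAccess("alice", "structures"));
        System.out.println(spec.prerequisitesOf("structures"));
    }
}
```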

For entering this information the manager may need to get some information from remote machines. This is done with the help of the user information server (UIS) running on the same machine as the web server. The information entered by the manager in this phase is also communicated to the UIS. Note that the UIS needs to run on the web-server machine from which the client interface applet was downloaded; the Java run-time environment imposes this restriction.

The team members complete the rest of the set-up. They give additional information about their modules (such as the module location, input-output files, etc.). They also map modules to an available and appropriate hardware resource with the help of the resource-mapping interface. On completion of the set-up, the UIS has received all the information regarding the project. Based on this information the UIS creates an intermediate representation, a script that is interpreted by the execution controller (EC).

The execution controller requests an appropriate lightweight process to run a specific module. These lightweight processes run on all the hardware resources available for the project. The execution controller also specifies the status variables related to the module execution which need to be updated, and it maintains a database that holds the current status information. The UIS reads this database periodically, or whenever there is a change in status, and sends the information to the monitor interface of all the active web clients. The UIS is a multi-threaded Java server that keeps track of all active clients.

A team member can also change data values for a particular module to steer the computation in the right direction. For example, in a design cycle, the responsible team member may decide that a particular module is not affecting the optimization and may bypass the module by using old values in each cycle. Similarly, the team could replace a module with a plug-compatible module to use another algorithm.
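The role of the execution controller can be sketched as follows. This is an illustration under stated assumptions, since the paper does not give the format of the intermediate script: the sketch assumes the script reduces to an ordered list of module names obtained by a topological sort (Kahn's algorithm) of the module dependencies, and it runs the modules in-process as plain functions rather than through remote lightweight processes. The bypass option reuses the value recorded in an earlier cycle, mirroring the steering example in the text. All class, method, and module names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch (not taken from CDCE).
final class ExecutionSketch {
    private ExecutionSketch() {}

    // Kahn's algorithm: orders modules so each one runs after its prerequisites.
    static List<String> executionOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String pre : e.getValue()) {
                dependents.computeIfAbsent(pre, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String module = ready.remove();
            order.add(module);
            for (String d : dependents.getOrDefault(module, Collections.emptyList())) {
                if (pending.merge(d, -1, Integer::sum) == 0) ready.add(d);
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalStateException("cyclic or undeclared dependency");
        }
        return order;
    }

    // One design cycle: each module computes a value from the values produced
    // so far, unless it is bypassed and a value from an earlier cycle exists.
    static Map<String, Double> runCycle(
            List<String> order,
            Map<String, Function<Map<String, Double>, Double>> modules,
            Set<String> bypassed,
            Map<String, Double> previous) {
        Map<String, Double> values = new LinkedHashMap<>();
        for (String module : order) {
            if (bypassed.contains(module) && previous.containsKey(module)) {
                values.put(module, previous.get(module)); // reuse the old value
            } else {
                values.put(module, modules.get(module).apply(values));
            }
        }
        return values;
    }
}
```

A typical use would compute the order once after set-up, call runCycle for each design cycle, and pass the previous cycle's result map back in so that bypassed modules keep their old values.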

4 Conclusion

In this paper we have discussed JAVADC, which enables users in one Internet domain to run a distributed application on an HPC in another domain. JAVADC is based on Java and is integrated with the Web. This makes the interface portable and easily accessible. We have implemented JAVADC in conjunction with MPI, PVM, and pPVM. We have successfully run scientific computations that were launched from a PC in the odu.edu domain, with all files located in that domain, executed on the HPC in the icase.edu domain, and monitored on the PC without any significant performance degradation. We also discussed the design of a Collaborative Distributed Computing Environment (CDCE) that extends JAVADC. The CDCE is an environment to design, execute, monitor, and control a distributed heterogeneous computing application with the following features: (i) it will allow collaborative design and control, (ii) it is easy to access, (iii) it is easy to use, (iv) it will work on heterogeneous platforms, and (v) it will be flexible enough to adapt to different scenarios.

References

[1] A. Beguelin, J. Dongarra, G. A. Geist, R. Manchek, and V. Sunderam. A Users' Guide to PVM Parallel Virtual Machine. Technical report, Oak Ridge National Laboratory, TR ORNL/TM-11826, Oak Ridge, TN 37831, June 1991.

[2] G. A. Geist, A. Beguelin, J. Dongarra, R. Manchek, and V. Sunderam. PVM 3 User's Guide and Reference Manual. Technical report, Oak Ridge National Laboratory, TR ORNL/TM-11826, Oak Ridge, TN 37831, May 1993.

[3] T. Mattson. Programming Environments for Parallel Computing: A comparison of CPS, Linda, P4, PVM, POSYBL, and TCGMSG. Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, Vol 2, pages 586-594, January 1994.

[4] V. Sunderam. PVM: A Framework for Parallel Distributed Computing. Concurrency: Practice and Experience, Vol 2 No 4, December 1990.

[5] "Scientific Computing using pPVM". International Conference on Parallel Processing, August 1994.

[6] "Message Passing Interface". Argonne National Laboratories and Mississippi State University. http://www.mcs.anl.gov/mpi/index.html.

[7] Thomas M. Eidson and Robert P. Weston. FIDO - Framework for Interdisciplinary Design Optimization. http://hpccp-www.larc.nasa.gov/ do/homepage.html

[8] John C. Peterson. Multidisciplinary Integrated Design Assistant For Spacecraft (MIDAS). http://mishkin.jpl.nasa.gov/Midas Page

[9] Dimple Bhatia, Vanco Burzevski, Maja Camuseva, Geoffrey Fox, Wojtek Furmanski, and Girish Premchandran. WebFlow - a visual programming paradigm for Web/Java based coarse grain distributed computing. http://www.npac.syr.edu/projects/webbasedhpcc/index.html

[10] N. Carriero and D. Gelernter. Linda in context. Communications of the ACM, Vol 32 No. 4, pages 444-458, April 1989.

[11] R. Butler and E. Lusk. User's Guide to the P4 Programming System. Technical report, Argonne National Laboratory, ANL-92/17, 1992.

[12] A. Kolawa. The Express Programming Environment. Workshop on Heterogeneous Network-Based Concurrent Computing, Tallahassee, October 1991.

[13] Kurt Maly, Praveen K. Vangala, and Mohammad Zubair. "JAVADC: A Web-Java Based Environment to Run and Monitor Parallel Distributed Applications". Technical Report, Old Dominion University, 1997.