Proc. of HP European Users Conference, Brighton 1993

Heterogeneous Distributed Computing in Practice

Giuseppe Attardi, Tito Flagella, Luca Francesconi, Stefano Suin
Dipartimento di Informatica, Corso Italia 40, I-56125 Pisa, Italy
net: [email protected]

Abstract

The computing facilities of a computer science department offer a good opportunity to experiment with issues of distributed computing. A typical environment consists of a network of heterogeneous computers with different operating systems, which must be accessible to users with widely varying expertise and requirements. While future versions of Unix (USL Atlas or OSF DCE/DME) promise to deliver facilities for building truly distributed environments, a solution for current systems is not readily available. We present the architecture and the solutions involved in building a distributed computing environment integrating Unix, MacOS and MS-DOS/Windows, in which we achieved distribution in an effective, economical and flexible way. We discuss the possible evolution of our solution and mention current developments in the field.

1. Introduction

The computing facilities of a computer science department offer the opportunity to experiment with issues of distributed computing. A typical environment consists of a network of heterogeneous computers with different operating systems, which must be accessible to users with widely varying expertise and requirements. While future versions of Unix (USL Atlas or OSF DCE/DME) promise to deliver facilities for building truly distributed environments, a solution for current systems is not readily available. We therefore decided to explore the level of integration achievable with present technology and limited investments.
It is fairly clear that in a system where each machine is managed independently, without integration with the others, much effort and many resources are wasted in replicating resources and repeating similar maintenance or update tasks on each individual machine. Even though independent machines may be attractive for the degree of freedom and flexibility they provide, as their number grows the economic disadvantages of such an approach become apparent. Another motivation for integration is the possibility of obtaining services which require cooperation, in particular those involving communication. Finally, in a dynamic environment, where computing requirements vary quite frequently, it is convenient to be able to allocate the available resources dynamically to accommodate demand, rather than statically.

2. Architecture

We aimed to build a distributed architecture with the following goals:
• to exploit heterogeneous computers, so that one may choose the most suitable platform for each task
• to provide interoperability among heterogeneous systems
• to provide uniform management and administration of the network, accounts and groups
• to be easily extensible
• to support client/server applications.
To achieve a high degree of openness and vendor independence, we decided to restrict ourselves to standard or public domain software and tools. Also, since interoperability is a very complex problem, we decided to concentrate on three major families of hardware/software platforms upon which to build our distributed computing environment. The overall organisation is shown in the following figure, where the lowest OSI network layers (Ethernet, IP, TCP) constitute the beams across the pillars, i.e. the elements of the network, each made up of a hardware base and a software column. Further network protocols and software components provide the support for distributed applications.

[Figure: the layered architecture. Three pillars (RISC hardware with Unix SVR4, Motorola hardware with MacOS, Intel hardware with MS-DOS) rest on Ethernet and TCP/IP; above them, the layers of Security (C2 passwords, Kerberos), Network File System (NFS, PC-NFS, AFP), Network Information Service (users, services, hosts, naming directory), Interoperability and Resource Sharing (RPP, PAP), G.U.I. (X11, Xview, MacX, Windows 3) and Management (resource access, accounting, dump) support the distributed applications.]

3. Interoperability

The objectives of interoperability are easy to state: to be able to access and use, in the same way and from anywhere, all resources, facilities and information wherever they are available. Even if we had the same computers everywhere, interoperability would not be guaranteed, since it requires that systems be designed and built with interoperability in mind. Since in reality we have to deal with different architectures, different operating systems and different software, the complexity of the task is apparent.
Interoperability should be a principle taken into account at every level of a processing system. Such a principle could be enforced by an authority (as in the case of the telephone system) or through the development of international standards. In the computer field, manufacturers have been reluctant to provide interoperability, in order to maintain a competitive advantage or to lock customers into proprietary solutions. Customers, however, have begun to perceive the benefits of interoperability and to request open systems. These benefits are fairly obvious from both a technical and an economical perspective: a customer may select the best product among a variety of competing products; competition fosters improvements in the products; larger volumes reduce production costs; the spreading of competence and technology reduces development costs.
The architecture of our heterogeneous system consists of two levels. The first level provides remote access to all resources in the network. The following resources are considered: CPUs, disks, graphic workplaces, printers. These are turned into network-wide resources through several protocols: Remote Procedure Call for sharing CPUs; NFS, PC-NFS and the Apple File Protocol for disk sharing; the X Window protocol for graphic resources; the Remote Printer Protocol and the Printer AppleTalk Protocol for printer sharing.



[Figure: CPUs and disks attached to the network and turned into network-wide resources.]

One essential advantage of this solution is a high degree of expandability. The system can easily grow to accommodate demand: when a shortage of CPU power appears, new CPUs can be added; when new workplaces are required, they can be added just as easily. The resources are not dedicated and can easily be moved or allocated differently. For instance, in one semester there may be a greater demand for X Window terminals than for PCs, so we allocate some PCs to work as X servers. The second level allows several facilities to be grouped into virtual laboratories, through which access can be granted to satisfy specific demands while complying with the appropriate access regulations. In the following figure we show each laboratory as a rounded rectangle enclosing a group of resources.


Some resources may belong to more than one laboratory, meaning that they are shared among different uses. Virtual laboratories are built by a structured use of the netgroup facility of the NIS (Network Information Service [Sun 90]), which is part of NFS. The following sections describe the interoperability achieved for each pair of operating systems, with Unix playing a central role.

3.1. Unix to Unix

Unix provides several mechanisms to achieve interoperability, but some guidance is required to put them to work. In this section we briefly describe how we exploited these mechanisms, sometimes overriding the defaults used in proprietary Unix implementations. The specific goals that we addressed were to allow each user to access his own resources independently of the location from which he accesses the network. This means in particular that each user should be able to access, from any machine:
• his own mailbox
• his own home directory
• his own customised environment
• the same collection of general tools
• the data relative to the projects in which he is involved.
We achieved these goals with an appropriate use of NFS (Network File System) and associated facilities. The /usr/spool/mail directory is exported from the mail server host and visible from all other hosts. The user home directories reside on various network disk servers and are exported to all hosts on the network. Thanks to a global naming scheme, explained below, home directories can be accessed with the same path on any machine. A collection of initialisation profiles has been produced which works uniformly across the various Unix variants. A collection of widely used tools, ranging from text processing, graphics and program development to system utilities, is maintained centrally and distributed to all platforms.
Software distribution of the centrally maintained tools is organised in two steps. The first step is the creation of architecture-specific versions of the tools. While the sources of programs are maintained in a single read-only global repository, compilation is performed on a machine which creates a virtual copy of the sources by means of symbolic links, as sketched below. The second step is the distribution of new releases. Two possibilities are available: for each machine architecture a central repository exists which contains the binaries for that architecture, directly accessible via NFS over the network; alternatively, one can use a tool (like rdist [Cooper 92]) which automatically transfers a new release to each machine, where having a local copy of the software is preferred for reliability reasons.
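The first of these steps can be illustrated with a small Perl sketch that mirrors the read-only source repository as a tree of symbolic links, so that the build runs on a writable, architecture-specific copy. The paths, and the choice of Perl rather than a dedicated tool such as lndir, are illustrative and not necessarily those used in our installation.

#!/usr/bin/perl
# Sketch only: mirror a read-only source tree as a tree of
# symbolic links, so that builds can run in a writable copy.
# $src and $dst are illustrative paths, not the real ones.
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);

my ($src, $dst) = ("/share/src/tool-1.0", "/local/build/tool-1.0");

find(sub {
    my $rel = $File::Find::name;
    $rel =~ s/^\Q$src\E\/?//;              # path relative to the source root
    return if $rel eq "";
    if (-d $File::Find::name) {
        make_path("$dst/$rel");            # recreate directories for real
    } else {
        symlink($File::Find::name, "$dst/$rel")
            or warn "cannot link $rel: $!\n";
    }
}, $src);

Pointing the compiler at the copy then produces object files and binaries locally, while the sources remain shared and read-only.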


The information sharing needed for our goals requires globally accessible repositories which can grow dynamically. We build these repositories from a pool of network disks, to which new elements can be added whenever necessary and wherever it is most convenient. The idea is similar to the mechanism of Logical Volumes in HP-UX or OSF/1, except that it applies to disks spread over the network rather than to local disk partitions. Our solution is obtained through a general naming schema for the data, a distributed partitioning schema for physically placing the information, and a mapping from the logical names to the physical locations. The mapping is distributed so that all machines maintain the same logical view.

Naming. The naming schema that we adopted is to have some global directories (currently /home, /share and /project) for the shared repositories. /share contains subdirectories for various common material (/share/src, /share/lib, /share/doc, etc.) and a subdirectory for each of the supported architectures (/share/hp, /share/sparc, /share/dos, /share/apple, etc.) which contains architecture-specific binaries and information. /home is the repository for the home directories of all users, and /project contains a repository for each research project (/project/project_name).

Distributed partitioning. The repositories are placed in a set of network disks, i.e. directories with unique network names exported via NFS. The naming of the network disks could be arbitrary, but for convenience we chose names such as /net/home/host1, /net/home/host2, etc. for those used for home directories, /net/share/disc_i for those used for /share, and /net/project/host_i for the repositories of projects.

Mapping. Since these repositories tend to grow beyond the size of physical disks, and hence also of network disks, they must be split among several such disks. It then becomes difficult to manage them by means of NFS alone, since the map of NFS mount points would have to be propagated to each machine, raising problems of consistency. Another problem with NFS is that when a server is down, the client machine hangs waiting for the server to restart. We solved both problems by adopting an automounter, an interface to NFS through which filesystems are mounted on demand when they are first referenced, and unmounted after a period of inactivity.
One problem with automounters is that the automounted directories are not listed by the Unix ls command: only when an access is attempted do they become visible and accessible. This is a serious problem, since it may confuse users. Consider for instance the directory /project, containing the subdirectories oikos and theory, which must be split among three network disks: /net/project/olimpo (for oikos), /net/project/inti and /net/project/limbo (for theory). The simplest approach is to make /project an automount point, with the following map describing which network disk to mount:

# Automounter map for directory /project:
oikos         type=nfs; rhost=olimpo; rfs=/net/project/olimpo/oikos
theory        type=auto; fs:=${map}; pref:=${key}/
theory/tools  type=nfs; rhost=inti; rfs=/net/project/inti/theory/tools
theory/users  type=nfs; rhost=limbo; rfs=/net/project/limbo/theory/users

This map defines /project/theory as an automount point as well. Nothing is visible under /project before the first explicit reference to one of its entries. If we blindly issue the command ls /project/theory, the directory itself will appear, but listed as an empty directory. Only after issuing ls /project/theory/tools will its contents appear. To overcome this problem we introduce an extra level of indirection, moving the automount point from /project to /net/project, and making /project a directory containing links to entries inside /net/project. For instance:

/project/oikos          is a link to    /net/project/olimpo/oikos
/project/theory         is a directory
/project/theory/tools   is a link to    /net/project/inti/theory/tools
/project/theory/users   is a link to    /net/project/limbo/theory/users

The contents of /project are now always visible. A reference to a specific project, like /project/oikos, will induce the automount of /net/project/olimpo/oikos from the server olimpo in a fully transparent way. The mount map for /net/project now becomes:


# Automounter map for directory /net/project:
olimpo  type=nfs; rhost=olimpo; rfs=/net/project/olimpo
limbo   type=nfs; rhost=limbo; rfs=/net/project/limbo
inti    type=nfs; rhost=inti; rfs=/net/project/inti

However, this map will not work on the machines which export the network disks. For instance, on olimpo the network disk /net/project/olimpo is not imported: it is just a link to a local filesystem. By using the selectors provided by the Amd automounter [Pendry 90], we can write a single conditional entry for olimpo, selecting between two alternatives depending on the host name:

olimpo  host == olimpo; type=link; fs=/local/project; \
        host != olimpo; type=nfs; rhost=olimpo; rfs=/net/project/olimpo

This solution, however, introduces a problem of consistency: on each machine in the network the tree of directories and links for the shared repositories must be created and maintained. Luckily, we can use for this task the same tool we adopted for software distribution: with rdist we periodically update these directory structures from a single central distribution.
To summarise: we use a pool of network disks to contain the global repositories; we describe how to combine these disks into logical filesystems by means of automounter maps; the maps are distributed to all machines through the NIS; to keep the structure of the repositories always visible, the structure is replicated on each machine up to the automounter mount points; and this structure is distributed to all machines by means of rdist.
As already mentioned, we adopted Amd, a public domain automounter, rather than the standard automounter from Sun, for its support of selectors and multiple filesystem types. With selectors, the choice of which filesystem to mount can be controlled for each machine in terms of its hostname, architecture or OS. Amd also supports a variety of filesystem types, including link, NFS and UFS. The combination of selectors and multiple filesystem types allows identical configuration files to be used on all machines. This feature, combined with the possibility of using the NIS to distribute Amd maps, crucially reduces the administrative overhead. Amd ensures that it will not hang if a remote server goes down. Moreover, Amd can determine when a remote server has become inaccessible and then mount replacement filesystems as and when they become available.
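The skeleton of directories and links that rdist replicates can itself be generated from a small table. The following Perl sketch uses the /project example above; the table stands in for the real, centrally maintained map.

#!/usr/bin/perl
# Sketch: rebuild the local skeleton of /project, i.e. the plain
# directories plus the symbolic links into the automounted
# /net/project tree.  The table reproduces the example above and
# stands in for the real map.
use strict;
use warnings;
use File::Path qw(make_path);
use File::Basename qw(dirname);

my %skeleton = (
    "/project/oikos"        => "/net/project/olimpo/oikos",
    "/project/theory"       => "",                              # plain directory
    "/project/theory/tools" => "/net/project/inti/theory/tools",
    "/project/theory/users" => "/net/project/limbo/theory/users",
);

for my $path (sort keys %skeleton) {
    my $target = $skeleton{$path};
    if ($target eq "") {
        make_path($path) unless -d $path;    # ordinary directory
    } else {
        make_path(dirname($path));           # make sure the parent exists
        next if -l $path;                    # link already in place
        symlink($target, $path)
            or warn "cannot create link $path: $!\n";
    }
}

rdist then only has to keep this table (or the resulting tree) identical on every machine.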

3.2. DOS to Unix

To integrate the Unix and DOS environments we experimented with several solutions before finding one satisfying all requirements, in particular reliability and performance. The problem of reliability arises from the lack of protections in PC environments, since a user can modify the system, deleting or changing critical files. No protection is effective if a user is allowed to boot his own copy of the operating system from a floppy, thereby taking full charge of the system and bypassing any software barrier. Experience with large numbers of users shows that in such a configuration DOS systems are completely unmanageable: in a few days most of the systems are out of order because of either deliberate or incompetent actions.
Our solution is a mixed hardware/software one: the PCs boot through the network from a Unix server, using the BOOTP protocol of TCP/IP implemented in EPROM on the Ethernet board. Booting from the local disk is disabled. All disk space for the PCs is provided through PC-NFS, on directories which are remotely mounted and appear as logical disk units to DOS. To use a PC, users must supply their personal password to the Unix server. In this way we achieve access control and uniformity of management, since we can use the same tools to administer accounts for PCs and Unix systems. The logical disks which contain system software and tools are shared among all PCs and write-protected through the Unix mechanisms, while each user has full access to his home directory.
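For illustration, the following Perl sketch emits /etc/bootptab entries for a list of PCs. The hardware addresses, IP numbers and boot file name are invented, and the tag names follow common bootpd conventions rather than our exact configuration.

#!/usr/bin/perl
# Sketch: generate /etc/bootptab entries for PCs that boot over
# the network.  Addresses and the boot file are invented; tag
# names follow common bootpd conventions (ht = hardware type,
# ha = hardware address, ip = IP address, sm = subnet mask,
# bf = boot file, tc = template).
use strict;
use warnings;

my %pc = (                      # hostname => [ Ethernet address, IP address ]
    "pc01" => [ "00608C000001", "131.114.4.101" ],
    "pc02" => [ "00608C000002", "131.114.4.102" ],
);

# A common template entry, then one line per PC referring to it.
print ".pcboot:ht=ethernet:sm=255.255.255.0:bf=pcboot.img:\n";
for my $host (sort keys %pc) {
    my ($ether, $ip) = @{ $pc{$host} };
    print "$host:tc=.pcboot:ha=$ether:ip=$ip:\n";
}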


We chose the PC-NFS solution rather than NetWare because of its more direct integration with Unix, and rather than LAN Manager because of its significantly better performance. We selected the Beame & Whiteside implementation of PC-NFS for its support of remote boot and for the ability to access the PC via telnet from Unix. This last ability is quite useful for performing system administration over the network. As part of PC-NFS, the Remote Printer Protocol provides a way to share printers.
The PCs can be used either as DOS/Windows machines or as X Window terminals for Unix through an X server implementation like XView. In this case integration with Unix reaches the level of the graphical user interface. On a networked PC equipped with TCP/IP and PC-NFS, several client/server applications are available, which will be mentioned later.
While a DOS user has full access to the facilities provided by a Unix server, the only access possible from a Unix system to a DOS PC is by means of ftp or a DOS terminal emulator. A greater degree of interoperability is possible with X/Deskview, a multitasking environment based on X Window which constitutes an alternative to MS Windows. Using X/Deskview, DOS and Windows applications become accessible to Unix users. This solution is of particular interest when considering the cost of PC applications compared to Unix ones.

3.3. MacOS to Unix

The integration between the Unix and MacOS environments has been achieved using the group of tools called the Columbia AppleTalk Package (CAP). The basic functionality of this package is an implementation of the AppleTalk protocols for Unix systems. The services which MacOS provides through AppleTalk thus become accessible from Unix, and vice versa Unix services become available to the Macintosh. Through the Apple File Protocol, for instance, Unix systems can provide the AppleShare file-sharing service to Macintoshes. A Macintosh user can access Unix directories by connecting to them remotely by means of the Chooser: they will appear as disks on the Macintosh desktop. The user can work on such disks with the normal operations: transfers between folders, copying or removal of documents, and activation of applications can all be performed with the usual mouse and menu operations. Using the Printer AppleTalk Protocol (PAP), Unix systems can access AppleTalk printers, and vice versa Macintosh users can print on Unix printers.
Security for MacOS is still an open question, since booting from the network is complicated and costly, and the mechanisms for software protection are not very reliable.

4. Network Services

Given that all our platforms are equipped with TCP/IP, basic services like telnet and ftp are immediately available throughout the network. Several network information services have been made available on top of the basic protocols. They are examples of client/server applications, where the servers reside on Unix nodes, while the clients run on personal computers or are available through X Window terminals. The main ones are [Kehoe 92]:
• Usenet News: Usenet is the set of machines that exchange articles tagged with one or more universally recognised labels, called newsgroups. Usenet encompasses government agencies, large universities, high schools, businesses of all sizes, home computers, etc. There are newsgroups for various topics, from those of interest to computer professionals to groups oriented towards hobbies and recreational activities. Many clients are available for this service: rn, nn, xrn and mxrn are some of the Unix clients; Nuntius and The News are Mac clients.


• Popper: an implementation of the Post Office Protocol server that runs on a variety of Unix computers to manage electronic mail for Macintosh and MS-DOS computers. Eudora is a client mail program for the Macintosh and MS Windows; it uses the POP3 and SMTP protocols to receive and send mail from a network-connected Macintosh or PC.
• Archie [Emtage 92]: a query system currently tracking the contents of over 800 anonymous ftp archive sites holding over a million files across the Internet. Xarchie is an X11 browser interface to the service.
• WAIS [Kahle 90]: Wide Area Information Servers, an Internet-based approach to information retrieval developed jointly by Thinking Machines Corporation, Apple Computer and Dow Jones. It allows users to access information through keyword searches; the list of items likely to be of interest is returned sorted in order of probable relevance to the search. Users may then select the documents they wish to view, and screens of text are sent across the network.
• Gopher [Alberti 91]: a distributed document delivery service. It allows a neophyte user to access various types of data residing on multiple hosts in a seamless fashion. This is accomplished by presenting the user with a hierarchical arrangement of documents and by using a client-server communication model. The Internet Gopher server accepts simple queries and responds by sending the client a document. Xgopher is an X11 client interface to the server; Mac and PC clients for Gopher are also available.
• World Wide Web and Mosaic: the WWW project [Berners-Lee 93] involves the processing of structured documents by a number of systems around the globe. WWW has been developed to facilitate wide-area, network-based information discovery and retrieval on different platforms. With its HyperText Markup Language, WWW can display:
  • menus of options
  • online help
  • database query results
  • hypertextual documentation

Mosaic [Andreessen 93] is a WWW client and networked information browser which displays documents (but also images, audio and video) with hypertextual links, represented as highlighted text. The links connect to other documents, which may reside on different hosts. In this way Mosaic provides access to a large body of material, including manuals for different operating systems and the CERN libraries, together with information-space navigation and history-tracking facilities. In addition, Mosaic supplies a graphical interface to the major Internet services such as Gopher, WAIS, ftp, Usenet News, Hytelnet, Whois, X.500, Texinfo and Archie.
• Fax service [Leffler 90]: FlexFAX is a utility that provides spooling and management of fax communications.

5. Security

System security is a vital concern in a computing centre. The vast amount of labour involved in the creation and maintenance of the information stored in a computing system makes such information valuable and precious. According to the National Computer Security Center, systems are classified into seven security levels. The C2 level is an intermediate level in which security-related events are audited, the login-password procedure provides authentication, and encrypted passwords are stored in a place inaccessible to unprivileged users. We brought all our systems to the C2 level as supplied by their respective manufacturers. Some difficulties arose, since there is no standard solution for such implementations. We faced for instance incompatibilities with the Network Information Service (NIS), which allows network-wide account information to be maintained and is fundamental for our system management.


Only the Sun implementation provides a service of remote password authentication. To overcome this problem we created suitable filters to convert the password files into the formats required by the various systems (a sketch is given at the end of this section). A more sophisticated solution could be achieved through the adoption of the Kerberos authentication facility from the Athena project. Though in use for some time, Kerberos is not yet standardised and requires significant changes to various components of the operating system.
Protection of copyrighted material is another sensitive issue: besides the standard protection mechanisms of Unix, CAP provides a facility which allows users to launch applications from their Macintosh while denying them permission to copy the files.
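A minimal sketch of such a password filter follows, assuming the NIS passwd map as input and a shadow-style split as output (a world-readable file with the password field starred out, plus a restricted file with the encrypted passwords); the real target formats differ from one C2 implementation to another.

#!/usr/bin/perl
# Sketch: split NIS passwd entries into a world-readable passwd
# file (password field replaced by "*") and a restricted file
# holding the encrypted passwords.  The output formats are
# illustrative; each C2 implementation expects its own layout.
use strict;
use warnings;

open my $pub,  ">", "passwd.public"  or die "passwd.public: $!";
open my $priv, ">", "passwd.private" or die "passwd.private: $!";

while (<STDIN>) {                    # e.g. fed from: ypcat passwd
    chomp;
    my ($user, $crypt, $uid, $gid, $gecos, $home, $shell) = split /:/, $_, 7;
    next unless defined $shell;
    print $pub  join(":", $user, "*", $uid, $gid, $gecos, $home, $shell), "\n";
    print $priv "$user:$crypt\n";
}
close $pub;
close $priv;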

5.1. Filtering Network Services

Protection from unauthorised access from outside, and limitations on the use of external network facilities, can be achieved using the per-port filtering of incoming and outgoing connections available on many routers. We have set up filtering using the IP Access List and IP Extended Access List mechanisms provided by Cisco routers [Cisco 90]. This allows us both to filter incoming untrusted connections and to restrict access to specific external services from single hosts or whole subnetworks. Consider the following situation, in which it is necessary:
• to deny connections from certain external untrusted networks to the subnetwork 131.114.4
• to allow external connections only for a restricted set of network services (like Usenet News, e-mail, X11 clients), while denying other services (like telnet, rlogin, rsh, ftp) to the subnetwork 131.114.11

[Figure: network topology showing the Internet, the gateway, the Cisco router and the two Ethernet subnetworks 131.114.11.0 and 131.114.4.0.]

The effect is achieved by configuring the IP Extended Access Lists as follows:

# Subnetwork 131.114.4
# Remote connections from network x.x.x.0 restricted to the e-mail service (port 25, tcp protocol)
access-list 101 deny tcp x.x.x.0 0.0.0.255 131.114.4.0 0.0.0.255 neq 25
# All other connections allowed
access-list 101 permit ip 0.0.0.0 255.255.255.255 131.114.4.0 0.0.0.255
# Subnetwork 131.114.11
# Remote connections restricted to some services
# Usenet News (port 119, tcp protocol)
access-list 101 permit tcp 131.114.11.0 0.0.0.255 0.0.0.0 255.255.255.255 eq 119
… other allowed services …
# All other services denied
access-list 101 deny tcp 131.114.11.0 0.0.0.255 0.0.0.0 255.255.255.255


6. Management

A rather delicate issue is the ability to manage the distributed system without propagating root privileges: security and reliability reasons dictate avoiding the use of superuser privileges as much as possible. On the other hand, prompt intervention on the system can be ensured only if enough personnel are enabled to act. We solved this issue by means of a collection of operator procedures which cover the most frequent areas of intervention: dumps, printer spooling, process priorities, updating the NIS and nameserver databases, and account and resource management. The invocation of these procedures is strictly limited to the staff of the centre but does not require superuser privileges. The procedures have been implemented in the Perl language [Wall 91]. Perl was chosen for portability and security: Perl scripts run unchanged on all machines, and Perl checks that no security leaks are present in these procedures. The most significant procedures are described in the following sections.
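All these procedures share the same skeleton: verify that the caller belongs to the operators' group, log the request, then carry out the action. The following Perl sketch shows the idea; the group name, log file and command table are illustrative rather than our actual ones.

#!/usr/bin/perl
# Sketch of the skeleton shared by the operator procedures:
# check group membership, log the request, dispatch the action.
# Group name, log file and command table are illustrative.
use strict;
use warnings;

my $GROUP = "oper";
my $LOG   = "/usr/local/adm/operator.log";
my %command = (
    # illustrative actions; the real procedures cover dumps,
    # spooling, NIS updates, accounts and resources
    "printer-status" => [ "lpstat", "-t" ],
    "disk-usage"     => [ "df" ],
);

my $user = getpwuid($<);
my @gr   = getgrnam($GROUP) or die "no such group: $GROUP\n";
die "$user is not allowed to run operator procedures\n"
    unless grep { $_ eq $user } split ' ', $gr[3];

my $action = shift @ARGV       or die "usage: $0 action\n";
my $cmd    = $command{$action} or die "unknown action: $action\n";

if (open my $log, ">>", $LOG) {            # keep a trace of who did what
    print $log scalar(localtime), " $user $action\n";
    close $log;
}
system(@$cmd) == 0 or die "$action failed\n";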

6.1. Resource Access

The main operator procedure is the one for managing user accounts and the resources allocated to each account. There is no standard in this area, and the various Unix implementations provide their own tools (SAM in HP-UX, SMIT in IBM AIX, SysAdmSH in SCO Unix, NextAdmin in NeXTOS). However, these tools are meant for managing an individual machine rather than a network. Since we wanted a uniform mechanism across the network, we had to develop our own procedure.
The account procedure supports the abstraction of virtual laboratories, mentioned earlier, as the mechanism to control access to network resources. Handling virtual laboratories instead of individual resources greatly improves the flexibility of the system. Since virtual laboratories are implemented through the netgroup mechanism, the granularity of control they provide is limited: netgroups only control access to machines, therefore it is not possible to assign different network resources, like disks, printers or terminals, to laboratories which share the same machine.
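For illustration, a virtual laboratory can be described simply as a list of machines and turned into entries for the NIS netgroup map, as in the Perl sketch below; the laboratory names and machine lists are made up, while the (host,user,domain) triple is the standard netgroup syntax.

#!/usr/bin/perl
# Sketch: turn a table of virtual laboratories into NIS netgroup
# entries of the form (host,user,domain).  Laboratory names and
# machine lists are illustrative.
use strict;
use warnings;

my %laboratory = (
    "lab_databases" => [ "olimpo", "pc01", "pc02" ],
    "lab_graphics"  => [ "inti",   "pc02" ],          # pc02 is shared
);

for my $lab (sort keys %laboratory) {
    my @triples = map { "($_,,)" } @{ $laboratory{$lab} };
    print "$lab\t@triples\n";
}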

6.2. Disk Quotas

Disk usage must be strictly regulated in an environment with hundreds of users. For this purpose, an operator procedure is provided as a higher-level interface to the standard Unix quota facility. The procedure must be distributed, because a home directory can be located on any of the file servers.
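The distributed part can be sketched as follows, assuming that the file server holding a home directory can be recognised from its automounted path under /net/home and queried with the standard quota command over rsh; option letters and paths vary between Unix variants.

#!/usr/bin/perl
# Sketch: report the disk quota of a user by asking the file
# server that holds the home directory.  Assumes home directories
# live under the automounted /net/home/<server>/... tree.
use strict;
use warnings;

my $user = shift @ARGV            or die "usage: $0 user\n";
my $home = (getpwnam($user))[7]   or die "unknown user: $user\n";

# Resolve a possible automounter link to find the real location,
# e.g. /net/home/olimpo/attardi -> server olimpo.
my $real = readlink($home) || $home;
my ($server) = $real =~ m{^/net/home/([^/]+)/}
    or die "cannot determine the file server for $home\n";

# Ask the server; "quota -v" is the usual reporting command, and
# rsh access for the operator account is assumed to be allowed.
system("rsh", $server, "quota", "-v", $user) == 0
    or warn "quota query on $server failed\n";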

6.3. Dump

Periodic backups of disk storage must be performed regularly. The procedure that we developed for this task had to solve problems of heterogeneity and security. Our first approach was to use the Unix facility rdump to dump each file server onto a single machine equipped with a DAT drive. Unfortunately, rdump requires the dumping host to have root access on the DAT server. This is an unacceptable security hole, because anybody gaining root access on any machine in the network could obtain root access on the central DAT server. For this reason we modified the Unix command rdump so that the local part of the operation (reading the file system) is done by root, while the remote connection is opened as an unprivileged user. We applied the same patch to amanda, the tool that we later adopted. amanda was developed at the University of Maryland and fulfils the requirements of a complex distributed environment.


We are looking for a solution allowing the remote dump of the local disks of PCs and Macintoshes from a Unix dump server. The programs currently in use work in the opposite direction and do not allow Macintosh and PC local disks to be included in the central dump schedule.

7. Monitoring

The processes, configuration files and log files which constitute the distributed environment need to be monitored regularly, in order to avoid services being disrupted by the failure of any one of them. Facing a malfunctioning service can be very annoying, since it may be difficult to relate the problem to the failure causing it: for instance, a block in the printer spooling may be caused by problems in the nameserver. To overcome these problems we developed a set of utilities which automatically monitor the critical components of the distributed system, repair malfunctions whenever possible and alert the operator in the other cases. This monitoring tool consists of a collection of Perl [Wall 91] procedures, conceived to anticipate undesired system behaviours. Typical system aspects controlled by the monitor are:
• file system occupation, to avoid full file systems
• network media status
• critical processes
• printing, mailing and other spooling areas.
We are currently working to integrate the monitor with the maintenance procedures: for instance, when a critical process dies, the monitor detects the problem and can automatically restart it.
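A reduced sketch of one such check, covering file system occupation and a few critical processes, is shown below; the threshold, the process list, the alert address and the exact output of df and ps are all assumptions that vary between Unix variants.

#!/usr/bin/perl
# Sketch of a monitoring check: warn the operator when a file
# system is nearly full or a critical daemon has disappeared.
# Threshold, process list and mail address are illustrative;
# the output formats of df and ps vary between Unix variants.
use strict;
use warnings;

my $THRESHOLD = 90;                        # per cent occupation
my @CRITICAL  = ("named", "sendmail", "lpd");
my @alarm;

for my $line (`df -k`) {
    my @f = split ' ', $line;
    next unless @f >= 6 and $f[4] =~ /^(\d+)%$/;
    push @alarm, "file system $f[5] is $1% full" if $1 >= $THRESHOLD;
}

my $ps = `ps -e 2>/dev/null` || `ps ax`;   # SysV first, then BSD
for my $daemon (@CRITICAL) {
    push @alarm, "process $daemon is not running"
        unless $ps =~ /\b\Q$daemon\E\b/;
}

if (@alarm) {
    open my $mail, "|-", "mail", "-s", "monitor alert", "operator"
        or die "cannot run mail: $!";
    print $mail map { "$_\n" } @alarm;
    close $mail;
}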

8. Evolution

The main directions for the evolution of our environment are in the areas of system management and of interoperability between the supported operating systems. While we successfully manage our network using our current tools, we feel that system management remains an area where further investigation is required. It is significant, for instance, that even in the recently announced Common Open Software Environment (COSE) this aspect is still to be developed, while consensus and solutions exist in areas like the desktop environment, graphics, networking, multimedia and object technology. In the following sections we discuss some interesting proposals in the field, like the Tivoli Management Environment and the Athena project [Rosenstein 90], analysing their impact on the administration of large heterogeneous installations.

8.1. Tools for Network Management

Tools like SAM for HP-UX, SMIT for IBM AIX, or those included in Solaris 2.x [Hanlon 92] provide a graphical interface and high-level procedures to simplify system administration. However, they principally target inexperienced system administrators who manage individual systems.
A number of network management tools have appeared recently. Most notable is the Tivoli Management Environment (TME) from Tivoli Systems, which has been selected by the Open Software Foundation as the heart of its DME. Currently, however, it is available only on Sun workstations. TME has an appealing drag-and-drop graphical user interface: each administrator has an operator desktop, and he can move users between hosts simply by dragging user icons and dropping them onto host icons. Users can be added to groups just as easily. One can create desktops with limited privileges (such as managing users on a subset of systems), which allows privileges to be delegated, a well known critical feature.
Unfortunately, the implementation does not seem to be of the same quality as the interface. In a recent review [UnixWorld 92], the system required the activation of 308 processes and several minutes to complete simple tasks. One can expect that the system will be tuned in the future, but one must keep in mind that a graphical interface is only a marginal aspect of the task of managing a wide network. The Athena project, for instance, explicitly rejected any dependence on a graphical interface for administrative tools, because crucial access might be required via modem or through a character terminal. Other problems, like software and configuration distribution, required many years of research and tuning; these aspects are missing from TME and similar graphical tools.

8.2. Athena Project

Because of the apparent limitations of proprietary tools, public domain solutions of good quality have been in daily use for administration on large university campuses like the University of Southern California (USC), MIT and others, where some thousands of machines are centrally, and mostly automatically, administered using public domain tools.
Project Athena is a large cooperative project at MIT addressing many aspects of a heterogeneous environment, such as the graphical user interface (X Window), security (Kerberos) and system management (Moira). Moira was designed to keep management costs down in the face of a rising number of workstations and servers. Rather than keeping configuration information on workstations or manually managing copies of server configuration files, Moira collects the information in a central location. In this way, when data must be modified, it can be done in one place rather than on thousands of machines. Moira provides central management of a database of information on users, machines, lists and network servers; this information in turn is used to automatically manage network services in the Athena environment. Moira's main components are:
• the database: Moira requires an underlying relational database system. It resides on a secure machine and currently takes up about 13 megabytes of storage.
• the server: a continuously running program which accepts requests across the network and performs manipulations on the database according to those requests.
• the client programs: run by users to effect changes in the SMS (Service Management System) database regarding user information such as groups, mail and other lists, post offices, etc.
• the Data Control Manager (DCM): a program run periodically by cron on the main Moira server machine. It looks in the database to see which servers need to be updated, converts the necessary information from the database into a format suitable for the server's consumption, and propagates it to the server's machine with the cooperation of an Update Server running on that machine.
Athena relies on well-proven technologies, like track, NFS and AFS. Athena can be considered a model for future solutions for very large installations, which may draw inspiration from the results of several years of research and experimentation.


9. Conclusions

We have presented our experience in building a distributed heterogeneous environment. The experience has shown what degree of interoperability is currently possible, without large investments, using widely available or public domain tools. A significant effort is still required to put all these tools into place, and additional work is required to fill in the missing pieces. The degree of integration achieved is satisfactory, but several issues require further work. We have sketched the possible evolution of our system and mentioned new technologies under development which may have an impact in the future and provide further degrees of integration.

10. Acknowledgements

We gratefully acknowledge the encouragement and support from Antonio Albano. Lorenzo Coslovi from Hewlett-Packard Italiana SpA provided a fellowship to partially support Luca Francesconi.

11. References

[Andreessen 93]  M. Andreessen, Mosaic, NCSA, 1993.
[Berners-Lee 92] T. Berners-Lee et al., World Wide Web: the information universe, CERN, 1992.
[Cisco 91]       Cisco Systems Inc., Gateway Cisco Manual, 1991.
[Cooper 92]      M. A. Cooper, Overhauling Rdist for the '90s, University of Southern California, 1992.
[Dern 92]        D. P. Dern, Get your computers chatting, UnixWorld Magazine, April 1992.
[Emtage 92]      A. Emtage and P. Deutsch, Archie – an electronic directory service for the Internet, 1992 Usenix Conference, 1992.
[Farrow 92]      R. Farrow, Object Oriented Network Management, UnixWorld Magazine, November 1992.
[Hanlon 92]      S. Hanlon, Sun OS 5.0 System Administration - A White Paper, Sun Microsystems, 1992.
[Hubley 91]      M. Hubley, Distributed open environments, Byte, 16(12), November 1991.
[Kahle 90]       B. Kahle et al., WAIS interface prototype functional specification, Thinking Machines Corporation, 1990.
[Kehoe 92]       B. P. Kehoe, Zen and the Art of the Internet, 1992.
[Leffler 90]     S. J. Leffler, FlexFAX – a network-based facsimile service, Silicon Graphics, 1990.
[Pendry 90]      J. S. Pendry, Amd, an Automounter, Department of Computing, Imperial College, London, 1990.
[Rosenstein 90]  M. A. Rosenstein, The Athena Service Management System, Project Athena, Cambridge, Massachusetts, 1990.
[Steiner 90]     J. G. Steiner, Network Services in the Athena Environment, Project Athena, Cambridge, Massachusetts, 1990.
[Sun 90]         Sun Microsystems Inc., System & Network Administration, 1990.
[Wall 91]        L. Wall and R. Schwartz, Programming Perl, O'Reilly, 1991.
