Traffic forwarding with GSH/GLOGIN

Herbert Rosmanith, Jens Volkert
GUP, Joh. Kepler University Linz
Altenbergerstr. 69, A-4040 Linz, Austria/Europe
rosmanith|[email protected]

Abstract

Interactive grid computing requires steering of remote parallel applications. On the one hand, the control data stream of the local steering part of a distributed application needs to be forwarded to all involved nodes in the grid. On the other hand, results from the grid need to be collected and sent back to the local application. Interactivity requires that these data streams pass through the network with as little delay as possible; ideally, the time spent on computation and transmission should be below the threshold of human perception. The glogin tool offers interactive connections for the grid and various methods of forwarding network traffic over such connections for remote parallel applications in the grid.

1 Introduction

Nowadays, we see a huge demand for computational power in science and engineering. Applications from e.g. high energy physics or medical image processing produce large amounts of data which need to be analysed. One way to meet this demand is with computational grids. However, today's grids are based on an approach that is decades old: batch processing, which only allows users to submit jobs, wait patiently for their completion and review the results later. With the appearance of desktop computers, the way users worked changed entirely. By having computing resources available at their fingertips, users were able to work interactively, which allowed them to carry out their work much more efficiently. Everyday work is unimaginable today without interactive software. We think that providing the grid with interactivity will increase its acceptance and usability.

glogin is a light-weight tool for the Globus Toolkit [1]. It addresses the problem of establishing interactive connections. This is achieved by starting glogin on both endpoints of the connection, using a special technique to bypass the GASS cache [2]. glogin is started as an ordinary job by means of globusrun on a fork-type gatekeeper service. This provides communication with applications executing on the grid entry point on which it was started, including interactive shells such as bash or tcsh (which require pseudo terminals). In this simple case, it is sufficient to use the standard UNIX I/O mechanism of redirecting data to and from the standard input and output file descriptors. This allows two programs to communicate in the grid environment without being aware that their communication is sent over a GSS-secured tunnel [3, 4]. Yet one of the significant attributes of the grid is the provision of enormous computing power by utilising the computational capacity of the attached nodes.
Thus, for applications on the grid to be steered interactively, glogin needs to allow piloting of all nodes in question. Obviously, we need to forward control from the local steering application (executing on the user's desktop) to the grid. glogin offers two different methods of data forwarding. The solution which solves it all at once is forwarding entire IP networks. In this case it makes no difference whether the networks are public or private [8]. For interconnecting private networks, the term virtual private network (VPN) has been established. Unfortunately, this solution requires work by network administrators and coordination on the part of the organisational authorities involved in the grid. A different solution, which can be used immediately by an everyday user, is "TCP port forwarding". Its use is restricted to TCP, and each required port has to be specified separately. Thus, it is not feasible if a wide range of TCP ports needs to be forwarded.


[Figure 1. Securely routing networks with PPP and GSS: two machines, clio (ppp0: 5.6.7.8) and pan (ppp0: 1.2.3.4), each run pppd on a PTY provided by glogin; the two glogin instances are linked by a GSS-secured connection, joining the networks 192.168.1.0/24 (network 1) and 192.168.2.0/24 (network 2).]

Finally, it is also possible to forward traffic from the X11 windowing system in glogin. As a special case of TCP forwarding, it does not fit entirely into the general forwarding mechanism and thus has to be handled separately.

This paper is organised as follows: Section 2 describes how routing of entire networks is implemented. Section 3 shows in detail the steps necessary for TCP port forwarding. Section 4 introduces the implementation of X11 forwarding. An overview of related work is given in Section 5, and a summary and a perspective on future work conclude the paper.

2 Routing networks for Globus with glogin

Back in the "good old days" of computer communication, terminals were often used for dial-up lines. As TCP/IP became more popular, these terminals turned into network devices: if the "line discipline" of a terminal is changed to C/SLIP [5, 6] or PPP [7], the UNIX kernel will create an appropriate network device. The device only needs to be configured by a few ioctl() calls. Fortunately, the widely available "point to point protocol daemon", pppd [9], already provides this functionality. (Of course, we could use SLIP too, and although it is the faster protocol, PPP has become more popular in the last decade.)

Let us assume that glogin is started manually in a terminal and instructed to execute pppd on the remote node. By default, glogin will not allocate a "pseudo terminal" (a "PTY") if a command is specified, but this behaviour can be altered by specifying the appropriate parameter. The pppd will use the PTY provided by glogin and create a network device on the grid node, provided there is a properly configured ppp environment. Now, traffic from the remote pppd will be sent to the remote glogin, which will forward it to the local glogin. Of course, our local instance writes the ppp messages to our terminal, and what looks like modem noise is really a series of LCP requests. What is missing is a local instance of pppd which understands these messages. The crucial questions here are: 1. where do we obtain a PTY for the local instance of pppd to use (the PTY of the local shell cannot be used), and 2. how do we instruct the local pppd to call glogin? What is required is a piece of code inside pppd which creates a PTY and connects it to glogin's standard input and standard output file descriptors. Fortunately, pppd provides a corresponding feature. As of pppd-2.4, the "pty" option has been introduced, which performs exactly these two steps. Obviously, there has been a need for this type of functionality in other projects as well. To quote the manual page:

  pty script
    Specifies that the command script is to be used to communicate rather than a specific terminal device. Pppd will allocate itself a pseudo-tty master/slave pair and use the slave as its terminal device.
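The master/slave PTY pair that pppd's "pty" option allocates can be illustrated with a short sketch. This is Python and purely illustrative (glogin and pppd themselves are C programs); it only demonstrates the primitive involved:

```python
import os

# Illustrative sketch: allocate a pseudo-tty master/slave pair, the same
# primitive pppd's "pty" option uses internally. Bytes written to the
# master end appear as terminal input on the slave end; pppd would use
# the slave as its "terminal device" while the helper command (here: us)
# talks on the master.
master, slave = os.openpty()

os.write(master, b"fake LCP frame\n")   # pretend modem noise arrives
data = os.read(slave, 100)              # the slave sees it as input
assert b"fake LCP frame" in data
```

In glogin's case, the "script" given to the pty option is glogin itself, so the slave end becomes pppd's terminal and the master end is wired to glogin's standard input and output.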

So, the "call chain" is asymmetric: on the local node, pppd is executed first and starts glogin, which in turn starts glogin on the remote node, which starts the remote pppd. On the remote node, the invocation order of the programs is therefore reversed. Although proper configuration of pppd is simple, this topic is beyond the scope of this paper. It may be interesting to note that once the PTY is in network mode, the main job of the pppd is to keep the "connection" up, which, in this case, means to keep the pipe to glogin active. Therefore, the pppd will be "sleeping" most of the time. When the UNIX kernel receives a packet which is destined for the ppp network interface, it will send it to glogin; the pppd will never see such packets. When glogin receives a packet from the PTY, it encrypts it by means of gss_wrap and sends it over the tunnel to the glogin process running on the other node. The other process decrypts the packet by means of gss_unwrap and writes the clear-text to its PTY pipe. Again, the kernel which receives the packet will not send it to the other end of the PTY, but instead performs IP routing on it and sends it to a target (possibly private) network; the trip over the GSS-secured ppp tunnel has ended. This shows that any traffic between the two networks is encrypted by GSS. Technically speaking, this approach turns the Globus node into something which can be called a "GSS router". We can attach arbitrary networks to the nodes, which will route traffic for their private networks over the GSS tunnel if IP forwarding is enabled. In Figure 1, we see clio and pan, which have been configured to route the networks 192.168.1.0/24 and 192.168.2.0/24 over their ppp interfaces. Neither of the pppds is aware that data travels over a GSS-secured connection. It may also be interesting to look at the layering of data as it travels over the network. Figure 2 shows how the entire network layer from the private network is transported piggyback over the GSS tunnel.
The figure also makes clear that this process is not restricted to IP: if the pppd is configured such that it accepts e.g. IPX, then the stacking of protocols would be IPX over PPP, securely transported over GSS.

[Figure 2. Protocol stacking when tunneling VPNs: the VPN layers (application data, TCP, IP, PPP) travel as encrypted data inside GSS, which itself is carried unencrypted over TCP, IP and e.g. Ethernet.]
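Conceptually, the relay loop in each glogin instance is simple: read a packet from the PTY, wrap it, send it through the tunnel, and in the other direction unwrap and write back to the PTY. The sketch below is illustrative Python with a trivial XOR stand-in for gss_wrap/gss_unwrap, since the real calls require an established GSS security context:

```python
# Illustrative relay sketch. wrap()/unwrap() are stand-ins for
# gss_wrap()/gss_unwrap(); a real implementation passes the packet
# through the established GSS-API security context instead.
KEY = 0x5A

def wrap(packet: bytes) -> bytes:
    """Stand-in for gss_wrap: turn a clear-text packet into a token."""
    return bytes(b ^ KEY for b in packet)

def unwrap(token: bytes) -> bytes:
    """Stand-in for gss_unwrap: recover the clear-text packet."""
    return bytes(b ^ KEY for b in token)

# A packet read from the PTY on one side ...
packet = b"\x45\x00\x00\x54 ip-payload"
token = wrap(packet)            # ... is wrapped, crosses the tunnel ...
assert token != packet          # (it is no longer clear-text)
assert unwrap(token) == packet  # ... and is written back unchanged.
```

The point of the sketch is that the pppds on either end only ever see the clear-text packets; all wrapping happens between the two glogin processes.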

Disadvantages of PPP

Although the solution presented here is fully operational, it also has its disadvantages. One is the administrative cost: standard users are not allowed to modify most of the configuration. They are not allowed to add network routes, they are not allowed to create or modify ppp configuration files, and they are not allowed to pass arbitrary parameters to pppd. The pppd has to be installed with "root permissions", which could be a security problem. If installed on a large site, there has to be a central authority keeping track of network assignments. From this it follows that configuring VPNs cannot be done in an ad-hoc manner and requires system administrator privileges on all involved nodes. (Interested readers can find online instructions for configuring pppd at http://www.gup.uni-linz.ac.at/glogin.)

3

3 Port forwarding

Port forwarding with gsh/glogin is another method of transporting data over a GSS connection. It differs from the previous solution in that only a set of specific TCP ports is forwarded instead of an entire network. If programs communicate by means of TCP/IP, then port forwarding to and from a grid node can be an easy "plug-in"-like solution. A forwarder is determined by three pieces of information: a port to listen on, a destination host and a destination port (instead of port numbers, a service name from /etc/services may also be used). There are two types of forwarders: "local" and "remote" forwarders. The difference is the side of the tunnel on which the listener is created. Local forwarders are created on the system where the user starts glogin; remote forwarders are created on the target grid node. It is possible to use both types of forwarders at once. Furthermore, it is possible to create multiple forwarders, resulting in the need to maintain multiple local and multiple remote forwarders. Since there is only one connection between the two glogin processes, although each forwarder can accept multiple connections, connection multiplexing is required. This is done by assigning a unique token to each forwarded connection. This unique token is called a "channel number".
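The multiplexing idea can be sketched in a few lines: every chunk of forwarded data is prefixed with its channel number, so a single tunnel byte-stream can interleave many connections. This is illustrative Python; glogin's actual wire format is not specified here, and the fixed header layout below is an assumption for the sketch:

```python
import struct

# Illustrative multiplexing sketch: each payload is framed with a
# channel number and a length so that one tunnel can carry many
# forwarded TCP connections. Header layout (2-byte channel, 4-byte
# length) is invented for this example.
def frame(channel: int, payload: bytes) -> bytes:
    return struct.pack("!HI", channel, len(payload)) + payload

def unframe(buf: bytes):
    """Split a buffer of concatenated frames into (channel, payload) pairs."""
    while buf:
        channel, length = struct.unpack("!HI", buf[:6])
        yield channel, buf[6:6 + length]
        buf = buf[6 + length:]

# Two connections (channels 0 and 7) interleaved on one stream:
tunnel = frame(0, b"GET /") + frame(7, b"QUIT") + frame(0, b" HTTP/1.0")
assert list(unframe(tunnel)) == [(0, b"GET /"), (7, b"QUIT"), (0, b" HTTP/1.0")]
```

As described below, glogin avoids inventing artificial channel numbers by reusing the remote socket descriptor as the channel number.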

[Figure 3. Multiple port forwarders creating multiple forwarded connections: on one node, forwarders fwd 0, fwd 1, ..., fwd N-1 each accept connect() requests, and all resulting connections share the single GSS tunnel.]

[Figure 4. Relaying connect()-requests created by remote forwarders: (1) a connect() at the remote (right) glogin instance yields socket descriptor sr; (2) a TMSG_FWD_CONNECT message carrying index=1 and socket=sr is sent over the GSH protocol to the local (left) instance; (3) the forwarder index selects the host:port entry in the forwarder target table; (4) the local instance issues the relayed connect(), obtaining socket descriptor sl; (5) the pair (sl, sr) is entered into the two socket descriptor look-up tables.]

[Figure 5. Relaying connect()-requests created by local forwarders: (1) a connect() at the local (left) instance yields socket descriptor sl; (2) a TMSG_FWD_CONNECT message with index=1 and socket=sl is sent to the remote (right) instance; (3) the remote instance connect()s to the host:port entry of its forwarder table, obtaining socket sr; (4), (5) a TMSG_FWD_REPLY carrying the socket pair (sl, sr) is returned; (6) the pair is registered in the local look-up tables.]

Figure 3 shows a particular node with N port forwarders. Each of them is listening for new connections. Multiple connections can be created per forwarder. The same applies on the other side of the tunnel.

First we look at remote port forwarding (simply because it was implemented first). The issue that needs to be addressed first is how to relate a connect() on one side to the proper connect() on the other side of the tunnel. To solve this, the local node needs to maintain a forwarder target table, each entry of this table consisting of the destination host and port. The index of the creating forwarder relates to the index in the forwarder target table. Figure 4 illustrates establishing the forwarded connection. (1) A particular forwarder receives a connection request. For this request, a socket descriptor sr is created by the operating system. (2) The other side of the tunnel needs to know the index of the creating forwarder. This information is passed by sending a "forwarder connect" message, which holds the index of the forwarder, but also the socket descriptor sr; the socket descriptor is required as well, as we will see later. (3) The index of the forwarder is used as an index into the forwarder table to determine the correct destination address for the relayed connect. (4) The local node connects to the destination, for which the operating system will assign a local socket descriptor sl. (5) The association (sl, sr) is entered into the two socket descriptor look-up tables. These tables implement the functions sr = f(sl) and sl = f^-1(sr). Theoretically, it would be sufficient to maintain one table only and compute f^-1 by performing a linear search, but this is not feasible for reasons of performance. By maintaining an additional look-up table, direct addressing can be used instead; after all, the memory requirement for one table is only around 4 kilobytes.

When relaying traffic, we distinguish between two cases: (1) data received from the tunnel contains the remote socket descriptor sr. The corresponding local socket descriptor sl, to which the data has to be written, is obtained by looking up sl = f^-1(sr). (2) data is received from a local socket descriptor. The corresponding remote socket descriptor is obtained by sr = f(sl). The data which has been read from the local socket is sent over the tunnel along with the socket descriptor sr. This shows that on the remote node, it is not necessary to maintain look-up tables: when it receives incoming traffic from the other glogin, the socket descriptor to send the data to is present in the packet. As we see now, the socket descriptor from the remote grid node is always unique; in fact, the operating system guarantees its uniqueness. Therefore, instead of creating an artificial channel number, we can just use this remote socket descriptor as the channel number.

With local port forwarding, the situation is a bit different. Again, it is possible to specify multiple local port forwarders. The difference here is that socket descriptors are created before a channel number is ready for transmission. The implementation is based on unique channel numbers, so if the local socket descriptor were used as the channel number, traffic forwarding would not work: socket descriptors are only unique within one process, so using the local socket descriptor would result in a "descriptor collision". As we have seen above, the remote node assigns channel numbers, so why not just ask the remote node to assign a free, unused channel number? The local node does so by sending a "forwarder connect" message to the remote node again, but this time a "forwarder reply" message will be returned. Before sending the "forwarder reply", the remote node connects to the target address in its forwarding table. The resulting socket descriptor can now be used as the unique channel number. In the reply, both the local and the remote socket descriptor are sent; therefore the local node does not have to store context information, nor wait for the reply and block all other requests. Instead, the reply contains the socket pair whose elements are associated with each other. At the local node, the socket descriptors are now registered in the two look-up tables, and data forwarding can continue as usual. This process is shown in Figure 5.
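The pair of look-up tables f and f^-1 can be sketched as follows. Python dicts are used here for brevity; the paper describes direct-addressed arrays of roughly 4 KB each:

```python
# Illustrative sketch of the two socket-descriptor look-up tables:
# f maps a local descriptor sl to its remote channel number sr,
# f_inv maps sr back to sl. Real glogin uses direct-addressed arrays
# (~4 KB each) rather than hash tables, but the functions are the same.
f = {}       # sr = f[sl]
f_inv = {}   # sl = f_inv[sr]

def register(sl: int, sr: int) -> None:
    """Step (5): enter the association (sl, sr) into both tables."""
    f[sl] = sr
    f_inv[sr] = sl

register(sl=11, sr=42)

# Relay case (1): a tunnel packet tagged sr=42 is written to local socket 11.
assert f_inv[42] == 11
# Relay case (2): data read from local socket 11 is tagged with sr=42.
assert f[11] == 42
```

Maintaining both directions explicitly is exactly the trade of a little memory for O(1) look-ups instead of a linear search over one table.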

4 X11 forwarding

Forwarding X11 displays is comparable to TCP port forwarding. X11 is a windowing system that can be used in a networked environment. The X-server is responsible for operating the graphics hardware. X-clients contact the X-server by means of sockets: UNIX sockets if executing on the same machine, or inet-sockets if executing on a remote machine (the X protocol is not restricted to TCP/IP, however; it should e.g. also be possible to use AF_IPX). The server and the client exchange data via a well-defined protocol, the X protocol, which is described on several hundred pages in [10]. The location of the X-server is stored in the "DISPLAY" environment variable. The X-library which a client is linked against honours the contents of this variable and opens a TCP connection to the destination. The destination is determined by a "host:display" tuple, where the port value is calculated by adding 6000 to the display number. On one machine, there can be multiple X-servers, so there needs to be a way to address them separately. The X-server for display number 0 listens on port 6000, while :1 can be reached on 6001 and so on. Of course, it is most common that there is only one X-server. This mechanism can be exploited to redirect X-traffic to glogin. Two steps are necessary: first, create a TCP port forwarder listening at port address 6000 plus some offset; second, change the content of the DISPLAY variable on the remote node so that it points to "localhost:offset". Now, glogin will receive all traffic from X11 clients. The forwarding mechanism is the same as with port forwarding, except for some important differences. For one (and this makes life easier), X-clients are accepted at the remote grid node only. It does not make much sense to forward X-sessions to the grid: we want to "see" what happens on the grid, but the grid does not need to see what happens on our desktop. Secondly, the destination address is not taken from the forwarder table, but from the DISPLAY variable on the local node.
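The host:display-to-port mapping can be expressed in a few lines. This is an illustrative helper written for this paper's explanation, not glogin code:

```python
# Illustrative helper (not from glogin): derive the TCP endpoint of an
# X-server from a DISPLAY value of the form "host:display[.screen]".
# The port is 6000 plus the display number; an empty host part means
# the local machine.
def x11_endpoint(display: str):
    host, _, rest = display.partition(":")
    number = int(rest.split(".")[0])   # drop an optional ".screen" suffix
    return (host or "localhost", 6000 + number)

assert x11_endpoint("localhost:0") == ("localhost", 6000)
assert x11_endpoint(":1.0") == ("localhost", 6001)
assert x11_endpoint("clio:2") == ("clio", 6002)
```

Redirecting X-traffic then amounts to listening on such a port and rewriting the remote DISPLAY to point at the forwarder.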
If this variable points to the local machine, glogin will not create a TCP socket, but will connect to the UNIX socket where the X-server can be reached. If DISPLAY contains a host name, glogin will establish a TCP connection to this machine and forward the (now unencrypted) traffic to it. Note that this makes it possible to forward X-sessions to machines for which glogin is not available. The greatest difference to TCP port forwarding is the forwarding of X11 authorisation data. Aside from host-based authorisation (the use of which is not recommended), the key to accessing an X-server is a "MIT-MAGIC-COOKIE". X-clients find these cookies in the .Xauthority file of the machine they are running on. Upon establishing a connection by means of XOpenDisplay(), a client sends the cookie to the server, which compares it to its locally stored cookie. If the cookies do not match, the server refuses the connection. Sending the cookie over the network for authentication has always been a security problem. The situation is analogous to sending plain passwords over the network: the possibility of eavesdropping on the wire cannot be ruled out. Therefore, implementations for securely transporting the X-cookies have been built: at MIT, support for KERBEROS was added to the X-server, which then was not available due to export restrictions. Since glogin uses GSS, however, it is not necessary to take care of cookie encryption: the tunnel already encrypts all X11 traffic, including the MIT-MAGIC-COOKIE. Another issue is the creation of cookies. When we log into a grid node with glogin, a cookie has to be created there. The fastest method would be to just copy the server cookie to the grid node. But this is not a good method, since then the server "password" would be stored on a remote machine. Should that machine be compromised, an intruder would gain access to the X-server from everywhere, not just from the compromised machine.
So, it is better to create a different cookie for the forwarded X11 connection. This approach works well if the X-server lives on the same machine where the glogin client was started. Problems arise when the DISPLAY variable of the local machine points to another machine. Let us assume that on the local machine, we have a valid cookie for the remote X-server. If glogin adds a randomly generated cookie for the connection between the local and the grid node, then the remote X-server will refuse connections from the local node. The reason is that now there are two cookies: one is stored in the .Xauthority file on the local node, the other in the .Xauthority file on the grid node. As the cookie travels through the machines, it is merely forwarded but not changed. Therefore, the XOpenDisplay() call to the remote X-server will end up with the wrong cookie. So, when dealing with a remote X-server, we do what we just said could be a security issue: we export the server cookie to the grid node. The only other option would be to analyse the data stream and exchange the cookie on the fly. For the sake of simplicity, we have refrained from that method for now.
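The cookie check itself is a plain byte comparison, which is why a freshly generated cookie on the grid node cannot authenticate against the original X-server. A minimal sketch (illustrative Python; real cookies live in .Xauthority and the comparison is done by the X-server):

```python
import secrets

# Illustrative sketch of MIT-MAGIC-COOKIE-1 behaviour: the server
# accepts a client exactly when the presented cookie equals its own.
server_cookie = secrets.token_bytes(16)           # a 128-bit cookie

def server_accepts(presented: bytes) -> bool:
    # constant-time comparison, as a careful server would do
    return secrets.compare_digest(presented, server_cookie)

# A client holding the real cookie is accepted ...
assert server_accepts(server_cookie)
# ... but a freshly generated per-connection cookie, as glogin would
# create for the grid node, is refused by the original X-server.
fresh_cookie = secrets.token_bytes(16)
assert not server_accepts(fresh_cookie)
```

This is precisely the two-cookie mismatch described above: the fresh cookie is only valid against the forwarder that generated it, not against the remote X-server.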

5 Related work

The ubiquitous SSH [11] package can offer the same functionality as glogin. However, it does not fit well into the Globus environment: the standard distribution does not make use of Globus-related authentication. To address this problem, NCSA has released a patch to OpenSSH which adds support for GSI [12]. Still, compared to glogin, SSH is a heavy-weight tool. The ssh server must be run separately as a daemon or be started by inetd, and it is unclear whether it can be started by a normal user via the gatekeeper service, not least because it requires superuser privileges on the target machine. glogin, in contrast, is a single program which can be copied into a user's home directory and started from there.

Another solution which addresses interactivity for the grid is I-GASP [13]. The remote display solution used in I-GASP is VNC [14], which is, as experience shows, a slow and sometimes unreliable graphics protocol. This architecture can roughly be compared to X11 forwarding, with the difference that the separation of an application into a local part and one or more remote parts is not done at the user's workstation. Forwarding VNC sessions can also be configured easily with glogin; after all, it is just a matter of specifying the appropriate port addresses. I-GASP supports neither TCP traffic forwarding nor IP routing.

The CrossGrid Deliverable D3.5 [15] shows a method called "Job Shadow" for redirecting data from the standard UNIX file descriptors. This can be compared to glogin's UNIX pipe redirection. Since this solution does not support pseudo terminals, features such as interactive shells and network traffic routing are not available. A closer inspection of the source code reveals that SSH seems to be used for the interactive transport. On the other hand, the "Job Shadow" mechanism also seems to support nodes reachable via PBS jobmanagers; this feature has only partly been implemented in glogin.
Currently, it is not possible to directly reach “worker nodes” which are located in private subnets, but work is underway to support this kind of configuration.

6 Conclusion and Future Work

glogin provides interactive steering of grid applications. This is achieved by forwarding traffic from grid nodes to the local steering part of a distributed application. Various ways of forwarding traffic have been implemented: routing entire networks with PPP, TCP port forwarding and, as a special case, X11 forwarding. Integration with the Globus Toolkit 3 and PBS-type jobmanagers is currently being investigated, and performance issues when using the encryption provided by Globus remain to be resolved. (Downloads of the latest version of glogin can be found at http://www.gup.uni-linz.ac.at/glogin.)

Acknowledgments

We would like to thank Paul Heinzlreiter and Martin Polak for their contribution to this work, which is partially supported by the EU CrossGrid project [16], "Development of Grid Environment for Interactive Applications", under contract IST-2001-32243.

References

[1] I. Foster and C. Kesselman: Globus: A Metacomputing Infrastructure Toolkit, International Journal of Supercomputer Applications, 1997.
[2] J. Bester, I. Foster, C. Kesselman, J. Tedesco, S. Tuecke: GASS: A Data Movement and Access Service for Wide Area Computing Systems, Sixth Workshop on I/O in Parallel and Distributed Systems, May 5, 1999.
[3] J. Linn: Generic Security Service Application Program Interface, RFC 2743, Internet Engineering Task Force, January 2000.
[4] J. Wray: Generic Security Service API Version 2: C-bindings, RFC 2744, Internet Engineering Task Force, January 2000.
[5] J. L. Romkey: Nonstandard for Transmission of IP Datagrams over Serial Lines: SLIP, Internet Engineering Task Force, June 1988.
[6] Van Jacobson: Compressing TCP/IP Headers for Low-Speed Serial Links, RFC 1144, Internet Engineering Task Force, February 1990.
[7] Perkins, Drew D.: Point-to-Point Protocol for the Transmission of Multi-Protocol Datagrams over Point-to-Point Links, RFC 1171, Internet Engineering Task Force, July 1990.
[8] Rekhter, Yakov; Moskowitz, Robert G.; Karrenberg, Daniel; de Groot, Geert Jan; Lear, Eliot: Address Allocation for Private Internets, RFC 1918, Internet Engineering Task Force, February 1996.
[9] pppd, ftp://ftp.samba.org/pub/ppp/
[10] Nye, Adrian: X Protocol Reference Manual: Volume Zero for X11, Release 6, O'Reilly & Associates Inc.
[11] Ylönen, Tatu: SSH Secure Login Connections over the Internet, Sixth USENIX Security Symposium, pp. 37-42 of the Proceedings, SSH Communications Security Ltd., 1996.
[12] Philips, Chase; Von Welch; Wilkinson, Simon: GSI-Enabled OpenSSH, available on the internet from http://grid.ncsa.uiuc.edu/ssh/, January 2002.
[13] Sujoy Basu; Vanish Talwar; Bikash Agarwalla; Raj Kumar: Interactive Grid Architecture for Application Service Providers, Mobile and Media Systems Laboratory, HP Laboratories Palo Alto, Technical Report, July 2003.
[14] T. Richardson, Q. Stafford-Fraser, K. Wood and A. Hopper: Virtual Network Computing, IEEE Internet Computing, 2(1):33-38, Jan/Feb 1998.
[15] Various Authors: CrossGrid Deliverable D3.5: Report on the Result of the WP3 2nd and 3rd Prototype, http://gridportal.fzk.de/distribution/crossgrid/crossgrid/wp3/wp3 2-scheduling/docs/CG3.0-D3.5-v1.2-PSNC010-Proto2Status.pdf, pp. 52-57, February 2004.
[16] CrossGrid Consortium: CrossGrid: Development of Grid Environment for Interactive Applications, http://www.eu-crossgrid.org/, June 2004. CrossGrid is a project funded by the European Union under contract IST-2001-32243.