PolyUiBot: Sensibility Improvement Using Streaming Technology for Internet Telerobotics

Meng Wang, James N.K. Liu
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong

Abstract: While Internet-based telerobotics has attracted much interest around the world, up to now there has been little systematic research into Internet Telerobotics (IT). This article proposes an innovative method for measuring the performance of IT systems: the ARIUS Criteria. With the goal of orienting the direction of further research, ARIUS can be used to assess existing IT projects according to five criteria: sensibility, interactivity, intellectuality, robustness, and usability. In this article, we focus on using a streaming technology-based approach to improve the sensibility of our "PolyUiBot" project. The results of preliminary experiments testing real robot teleoperation over a 33.6 kbps (modem) Internet connection have been promising, with this approach performing better than other approaches.

Key-Words: Internet telerobotics, Robot teleoperation, Streaming technology, Telecommunication, Video transmission, Performance measurement criterion

1 Introduction

Researchers have been investigating remotely controlled operations since the 1950s. Recent advances in computer technology and software engineering, together with the development of inexpensive sensory equipment, have allowed the development not only of local, on-site robot applications but also of Internet-based, remotely controlled telerobotics. These developments now allow humans to safely explore hazardous or less accessible places, whether deep under the oceans, out in space, or polluted with radioactivity. An Internet-based robot teleoperation platform obviates the need for dedicated networks, devices, and operators, reduces costs, extends operating distances, and is accessible from any node on the Web. Yet the Internet has some inherent limitations, such as uncertain time delays, delay jitter, packet loss, and data transmission security problems. The 1994 "Mercury Project" was probably the first successful implementation of telerobotics over the Internet [1]. In that project a remotely controlled industrial robot arm was used to explore a sandbox filled with buried artifacts. Since then, other well-known Internet-based robot teleoperation systems have been successfully developed. For example, Stein of Wilkes University in the US developed the PumaPaint project, which allowed a PUMA 760 robot to paint by remote control through the Internet; Mizoguchi et al. developed a service robot that provides web users with office services; Simmons of Carnegie Mellon University in the US experimented with a mobile robot (Xavier) that, via the Internet, navigated through different offices in a building [2]; Carnegie Mellon University and the University of Bonn,

Germany, also jointly developed the first robotic museum tour-guide robots, RHINO and MINERVA [3,4]. The Chinese University of Hong Kong (CUHK), jointly with universities in the United States and Japan [19,20], has developed an Internet-operated, supermedia-enhanced real-time telerobotic system that includes the bilateral control of mobile manipulators. Internet-based telerobotics has potential applications in many fields, from consumer home pet services, entertainment, telemedicine, and distance learning to the sharing of laboratory resources, industrial automation, and military and security applications [5]. Despite these many advances, Internet Telerobotics (IT) continues to be treated in much research as merely a subspecialty of related domains, such as robotics, artificial intelligence, or Internet technologies. Up to now very few studies have carried out systematic research devoted exclusively to IT. In this article, we propose the innovative ARIUS Criteria for measuring the performance of IT against five criteria: sensibility, interactivity, intellectuality, robustness, and usability. It is our hope that these criteria can be used to assess existing IT projects and help direct their further research. To address deficiencies in the kernel components of IT (sensibility, interactivity, intellectuality, and robustness), we have launched the project "PolyUiBot", which denotes "The Hong Kong Polytechnic University Internet-Intelligent Robot". The "PolyUiBot" project seeks to create an innovative agent-based teleoperation platform that has the capacity to let a robot act autonomously, and allows an intelligent mobile robot (vehicle) to be controlled remotely through the Internet, navigating

in the unknown, structured and non-structured, highly dynamic real world. To improve the "Sensibility" of IT, we propose an approach based on streaming technology. This paper is organized as follows. Section 2 introduces related published research. Section 3 proposes an innovative performance measurement criterion for Internet telerobotics and uses it to assess the projects introduced in the previous section. Section 4 discusses the streaming technology-based approach used in the "PolyUiBot" project. Section 5 presents the results of our preliminary experiments. The last section summarizes the paper.

2 Related Work

2.1 KhepOnTheWeb
Developed at the Swiss Federal Institute of Technology, KhepOnTheWeb [6] aims to provide access, through network communication facilities, to a modestly complex mobile robotics setup for carrying out research in mobile robotics. The system consists of a mobile robot that moves in a wooden maze. The robot is a small Khepera mobile robot carrying a small color CCD camera. Additionally, there is a panoramic overhead camera (Figure 2.1). The control method of KhepOnTheWeb is based on plain HTML pages. When the user connects to the system, the server creates a new CGI process, which continuously updates the video on the user interface. Using clickable images, the user can control the robot's movements and the orientation and zoom of an overhead camera. A Java applet running in the client continually requests information about the state of the robot and the time left to the user.

Figure 2.1 Khepera with its on-board video camera in the maze, which is just 65×90 cm

This system was available from May 1997 to May 1998. There were 27,498 visits performed by 18,408 unique machines, and 3,178 machines accessed the system more than once. The robot has no local intelligence, such as obstacle avoidance, so the operator's control of the robot is closed-loop, using video as the feedback. This has a major drawback: controlling the robot under significant delays is difficult without assistance. Moreover, the robot operates in

a tethered mode. Although KhepOnTheWeb provides a satisfactory user experience, the approach does not scale to real world environments.

2.2 Xavier
Carnegie Mellon University's Xavier [2] web venture started as an aid in testing navigation algorithms. The experiment consists of an autonomous mobile robot that moves in a real building. The robot, Xavier, is a custom mobile robot built on a robotic base from Real World Interface, with a full set of sensors (bumpers, encoders, laser, and sonar) and a color CCD camera. The way Xavier is controlled is very simple and is based on plain HTML pages (see Figure 2.2). Unlike KhepOnTheWeb, Xavier does not permit closed-loop operator control, allowing only very high-level task commands, such as reaching a goal in the building. To achieve the desired task, the robot is provided with a handmade model of the building, and then employs a task sequencing module and a navigation module.

Figure 2.2 Xavier and its web control interface

Operating through the Internet from December 1995 to December 1999, Xavier has been very successful, receiving nearly 40,000 requests and traveling 240 kilometers. Although the Xavier experiment suffers from real-world problems such as inaccurate sensors, moving obstacles, and limits on autonomy, most of the time the system performs satisfactorily, with an extremely low failure ratio. Since the robot is tasked at a high level, high-bandwidth interaction is not needed, and both video and position are updated at a very low rate. Unfortunately, the Xavier approach has a real negative impact on web-based interaction: commanding at a high level is not as interactive as teleoperation.

2.3 RHINO and MINERVA
The first robotic museum tour-guide robots, RHINO and MINERVA [3,4], developed jointly by the Robot Learning Laboratory at Carnegie Mellon University, Pittsburgh, and by the Computer Science Department III at the University of Bonn, Germany, were installed successfully in two museums, the Deutsches Museum in Bonn and the Smithsonian National Museum of American History in Washington, DC (see Figure 2.3). RHINO and MINERVA not only enable Internet users to control the robot remotely through the Internet for a museum visit, but also provide a control interface for local visitors in the museum. MINERVA can also interact with people; she has a face to exhibit her emotional states, which people naturally find friendlier. Users can select a target point by clicking in the map. The left side of the window displays information such as the robot's current position, speed, and pending target locations. Users can get live video from both the robot's camera and from a stationary camera.

Figure 2.3 Autonomous tour-guide robots: (a) RHINO, (b) MINERVA, (c) MINERVA in the museum

During three weeks of on-line operation (RHINO one week in Germany and MINERVA two weeks in the US, in the summer of 1998), approximately 4000 web users controlled the robots. At certain times the robots were under the exclusive control of Internet users, while at other times web users and conventional museum visitors shared control. The approach of RHINO and MINERVA addressed the challenges arising from the size and dynamics of the environment (highly populated public places) and from the need to interact with crowds of people. Although web users perhaps had a less complete experience than the physical visitors in the museum, the empirical results of the exhibitions indicated a high level of robustness and effectiveness.

2.4 CAMERA system
The implementation of the CAMERA [7,15] system is based on a Nomad-200 autonomous mobile robot in a lab at the National Chung Cheng University, Taiwan. The purpose of the system is to demonstrate that, despite high latency, remote behavior-programming control of an intelligent mobile robot through the Internet is feasible and reliable. The web control interface consists of three sub-windows (see Figure 2.4). The left one is for command setting, and the top-right one shows a global map of the remote environment and the trajectory of the mobile robot obtained from dead-reckoning measurements. The last sub-window shows visual information from the robot's vision system. The operator can set a command from the menu selection, fill in the corresponding arguments, and then press the SUBMIT button. After the robot executes the command and reaches a position close to the reference object, it slows down and stops before returning the trajectory and captured image to the operator. It then waits for the next command.

Figure 2.4 CAMERA robot and its web interface

The components providing the autonomous capabilities of the mobile robot are grouped into a motion planner, a motion executor, a motion assistant, and a rule-based behavior arbitrator. Multiple intelligent behaviors are built in, such as goal following, obstacle avoidance, wall following, and docking. The experiment has demonstrated that a behavior-programming control mode is more effective than a direct control mode, but this system must provide web users with a global map of the environment, and the coordinates of reference positions must be calculated a priori.

3 Performance Measurement Criterion

In Section 2 we discussed several IT projects which, over long periods, conducted experiments involving public Internet users around the world: Mercury [1] from August 1994 to March 1995; KhepOnTheWeb [6] from May 1997 to May 1998; Xavier [2] from Dec. 1995 to Dec. 1999. These projects investigated not only crucial academic and engineering problems, but also explored human factors such as behavioral psychology, sociological issues, and issues arising from the Internet culture, such as the "surfing effect". In March and July 2000, the IEEE Robotics & Automation Magazine published two "Robots On The Web" [8] issues on IT, presenting eight research papers. These special issues presented a snapshot of recent research activities and promoted the development of IT, yet Internet telerobotics continued to be regarded as a

subspecialty of related domains, lacking its own explicit direction for development. In this article, we propose an innovative criterion for measuring the performance of Internet telerobotics, the ARIUS Criteria. These criteria are derived from the contributions of other researchers as well as our own investigations. We believe these criteria are significant not just for measuring the performance of an IT system, but more importantly for orienting further IT research directions by highlighting challenging research issues.

ARIUS Criteria = {a, r, i, u, s}

in which (a, r, i, u, s) denote (interActivity, Robustness, Intellectuality, Usability, Sensibility), respectively. The details are elaborated next.


Sensibility
Sensibility is the ability to provide the Internet operator with informative feedback reflecting the situation of the remotely located robot. If there is sensibility, the operator will have some "situation awareness" of the remote robot location. In the literature, this is often referred to as "tele-presence".
"Virtual tele-presence: a Web interface enables people around the world to monitor the robot, control its movement, and watch live images. --- Sebastian Thrun [3]"
Sensibility is difficult to achieve because of the inherent limitations of the Internet, as expressed in low bandwidth, uncertain time delays and delay jitter, packet loss and data errors. Prof. Simmons et al. summarized their lessons from the four-year Xavier experiment as follows.
"Based on our experience, we have a number of observations that can guide the implementation of future web-based robots. The most important is the need for high-quality feedback. --- Simmons [2]"
Prof. Patrick Saucy similarly alluded to the significance of sensibility in the KhepOnTheWeb project.
"Our site is functionally less attractive for users far away because of the unacceptable response time. --- Patrick Saucy [6]"

Intellectuality
Without a doubt, intellectuality is the kernel ability of a robot. To deal with the restricted bandwidth and arbitrary transmission delays of the Internet, it is very important that a robot be able to use local intelligence and have a high degree of autonomy to complete sophisticated or dexterous tasks. This is reflected in Prof. Sebastian Thrun's experience with the museum tour-guide robots RHINO and MINERVA.
"With communication as unpredictable as it is on the web, autonomy plays a much greater role for web controlled robots. As a minimum, the web interface must ensure 'safe' operation even if communication breaks down. --- Sebastian Thrun [4]"
People also like intelligent robots.
"... 27.0% of all people (predominately kids of 10 years of age or less) believed MINERVA was 'alive' ... --- Sebastian Thrun [4]"
"Judging by the feedback we have received, the overall response to Xavier has been extremely positive, and people are generally very impressed that robots can have such capabilities (autonomy). --- Simmons [2]"
The CAMERA system's developer also emphasized the importance of intellectuality.
"Promoting local intelligence in the networked intelligent robot system must be the major issue to be investigated in the future. --- R.C. Luo [5]"

Interactivity
Interactivity is the ability not only to provide Internet operators with high-quality feedback through sensibility, but also to let them actively interact with the robot and to map human intelligence onto remote robot behaviors. In general, robots can meet only predefined goals for which they have been trained in advance, yet the ability of people to control and influence remote robot behaviors is highly desirable. As both Simmons and Saucy have pointed out:
"Autonomy can help in reducing the bandwidth requirements for control but this introduces problems of its own, particularly in the area of interactivity. People seem to prefer 'hands on' control. ... The only real negative impact of autonomy on web-based interaction is that commanding at a high level is not as interactive as (conventional) teleoperation. --- Simmons [2]"
"... Another problem is obviously the delay that prevents people from having a good interaction and from taking interest in the site. That's one reason why users do not come back. --- Patrick Saucy [6]"
Interactivity differs from "user involvement"; this difference is discussed later in this section.

Robustness
Robustness is the quality that allows an Internet-based telerobotics system to operate effectively under three specific conditions of use.
• Non-roboticist operator: Internet users typically lack the technical education and skill required by most existing teleoperation interfaces.
• Internet latency: Internet telerobotics systems must take into account different Internet user situations, such as different bandwidths, time delays, packet loss or errors.
• Dynamic environment change: Robots may encounter not only static environmental obstacles (tables, walls), but also obstacles that arise in a dynamic, changing environment, such as lost targets or people blocking paths.

Usability
Usability may mean different things for different types of robots (e.g., wheeled robots [2,3] or manipulated robots [1,18]) as well as for different robotic applications.
(1) Radio coverage and battery recharging are two important factors when using wheeled robots.
"From the perspective of web-based robotics, we learned lessons that stemmed from the facts that the robot was both mobile and autonomous. The robot's mobility had both positive and negative impacts on web interactions. The positive effect (gleaned through comments in our guestbook) was that users felt that controlling a mobile robot remotely was a unique experience. However, many of the effects of mobility on connecting with the web were negative, especially as compared to stationary web-based robots. The need for radio communication limits the bandwidth to the robot, which lessens the interactivity that can be achieved. Running on batteries limits the on-line time of the robot to a few hours per day. ... By far, the most frequent complaint is from visitors who miss those few hours when Xavier is live on the web. --- Simmons [2]"
(2) Client type: Different types of client interface may be used to remotely control a robot through the Internet. This has been foreseen by R.C. Luo.
"Instead of using desktop PCs, more and more researchers have recently been using PDAs and mobile phones as client interfaces. Using PDAs and mobile phones as client interfaces can take advantage of mobility and convenience. We think that using hand-held devices as the user interface will take the place of the desktop PC. --- R.C. Luo [5]"
(3) Teleoperation architecture: Research within Internet telerobotics may involve a One-to-One control architecture (one user controls one robot), as well as One-to-Many, Many-to-One, and Many-to-Many architectures.

Other problems
When Simmons summarized his experience of the Xavier experiments [2], he also pointed out that a practical IT system requires reliability and user involvement. Reliability means that the normal running of the IT system is guaranteed. "User involvement", or "user interaction" as Simmons referred to it, concerns the ways to get Internet operators involved and the ways to provide users with feedback and comments. In our view, reliability and user involvement are chiefly engineering rather than research problems.

Performance Measurement Comparison
Our project "PolyUiBot" uses an Internet-based wheeled intelligent robot, so we do not consider using a manipulated robot with a haptic device or force feedback [1,18]. The ARIUS criteria cannot be used to measure existing IT systems quantitatively, but they can be used to make a qualitative assessment (see Table 1). The results fall into five categories: VERY BAD, BAD, MEDIUM, GOOD, VERY GOOD.

Table 1. Performance comparison: Internet telerobotics systems

KhepOnTheWeb
- Sensibility (s): Server-push JPEG images (160*120) from both onboard and ceiling cameras, very low image rate. Level: BAD
- interActivity (a): Commands for robot and camera movement. Level: MEDIUM
- Intellectuality (i): No local intelligence, sometimes blocked by walls. Level: VERY BAD
- Robustness (r): Structured and limited environment (maze), requires high bandwidth. Level: VERY BAD
- Usability (u): Tethered connection, available 24 hours per day. Level: MEDIUM

Xavier
- Sensibility (s): Camera image updated every 20 seconds at low bandwidth (dial-up Internet), status updated every 5-10 s, email sent after the task is finished. Level: BAD
- interActivity (a): High-level command for a goal in a map of the building. Level: MEDIUM
- Intellectuality (i): Autonomy for obstacle avoidance, path planning and navigation. Level: GOOD
- Robustness (r): Easy to use for non-roboticists, can serve low bandwidth, needs a map while avoiding dynamic obstacles such as people. Level: VERY GOOD
- Usability (u): Wireless Ethernet system for data exchange. Level: MEDIUM

MINERVA
- Sensibility (s): Image client-pull from both onboard and ceiling cameras, robot position updated 4 times per second in a virtual map. Level: MEDIUM
- interActivity (a): High-level command for a target location in the map. Level: MEDIUM
- Intellectuality (i): Autonomy for mapping, localization, collision avoidance and path planning. Level: GOOD
- Robustness (r): Easy to use for non-roboticists at low bandwidth, in an unmodified, populated museum. Level: VERY GOOD
- Usability (u): Automatic monitoring of battery recharging, Ethernet bridge, web and local users. Level: VERY GOOD

CAMERA
- Sensibility (s): Returns the trajectory and image after the task is completed, shows a virtual map of the environment. Level: BAD
- interActivity (a): High-level behaviors (e.g., wall following). Level: GOOD
- Intellectuality (i): Autonomy for motion planning, assistance and execution. Level: GOOD
- Robustness (r): Reference positions must be known in advance in a structured office. Level: MEDIUM
- Usability (u): Radio Ethernet device in an office. Level: MEDIUM

It must be noted that existing IT projects often focus on certain aspects of the ARIUS criteria, such as intellectuality or robustness, in order to conduct proof-of-concept experiments (e.g., behavior programming) or to realize specific application goals. Our investigation (Table 1) indicates, however, that many researchers have neglected sensibility, interactivity and usability. To date, the literature in these areas is sparse. This is unfortunate, as these three abilities are important to Internet-based robotics systems; they greatly affect the quality of service and the attractiveness of a system, while the feeling of interactivity is important to Internet users.

4 PolyUiBot

4.1 Objective
The project "PolyUiBot", "The Hong Kong Polytechnic University Internet-Intelligent Robot", is an on-going project that seeks to address deficiencies in a number of central components of Internet telerobotics: sensibility, interactivity, intellectuality, and robustness. The mid-term objective of "PolyUiBot" is to create an innovative agent-based teleoperation platform that has the capacity to let a robot act autonomously, and allows the remote control through the Internet of an intelligent mobile robot (vehicle) navigating in the unknown, structured and non-structured, highly dynamic real world. The long-term objective is to make use of this platform to study behaviorism in artificial intelligence. It is our assertion that artificial intelligence can progressively evolve like human intelligence, and that this intelligence can be seen in interactions between robots and the real world. In this view, intelligent robot behaviors can progressively evolve through the continuous training and learning of both basic and more advanced behaviors using a "perception-action" model. This article addresses only the ways to improve sensibility. The most effective way to do this is through vision feedback, because vision is the most direct and informative way for people to obtain information about a robot's surroundings. Researchers have approached the problem of improving remote sensing (sensibility) in a variety of ways. Early researchers used a picture transmission scheme (e.g. JPEG or GIF) or hybrid image-virtual reality approaches [1-4,6,14,15]. The drawback of picture transmission is the very low frame rate and large time delay (over 10-20 s). A more serious problem is that degraded Internet performance, such as a drop in bandwidth, may cause service-stop errors.

Researchers have now begun to use video conferencing systems instead of picture transmissions [16,18], but the crucial video coding algorithms of these systems are obsolete (e.g. H.261, H.263). The best current candidates for transferring multimedia perception information with the best quality of service (QoS) through the Internet are emerging streaming technologies such as MPEG4, RTP, and MMS [21].

4.2 Streaming Technology
Streaming media technologies were introduced in 1995 [10]. Streaming offers a whole new approach to media on the Internet. Instead of waiting for the whole file to download to a user's computer before playback begins, streaming media plays back as the file is being transferred. The data travels across the Internet, is played back and then discarded. To deal with the stochastic nature of bandwidth on the Internet, the streaming media player utilizes a buffer. The first few seconds of the media are stored in the computer's memory before playback begins. This gives the media player a reserve of bits to fall back on when the user's bandwidth becomes constricted. Streaming media is made possible by different pieces of software that communicate on a number of different levels. A basic streaming media system has three components [9].
- Player. The software that viewers use to watch or listen to streaming media.
- Server. The software that delivers streams to audience members.
- Encoder. The software that converts raw audio and video files into a format that can be streamed.
These components communicate with each other using specific protocols (e.g. RTSP, MMS), and exchange files in particular formats (e.g. RM, WMV, MOV, MP4). Some files contain data that has been encoded using a particular codec (e.g. MPEG4, Windows Media Video, Real Video, Sorenson Video), which is an algorithm designed to reduce the size of files [21].
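To make the buffering idea concrete, the following minimal sketch is our own illustration rather than part of any streaming product: it pre-fills a few seconds of media before playback starts and keeps draining the buffer at the playback rate, so that short bandwidth dips are absorbed by the reserve. The chunk-fetching and rendering callables, the chunk duration and the pre-buffer length are all hypothetical parameters.

    import collections
    import time

    def play_stream(fetch_chunk, render_chunk, chunk_duration_s=0.2, prebuffer_s=3.0):
        """Toy streaming-player loop: pre-fill a buffer, then play while refilling.

        fetch_chunk() stands in for the network call that returns the next encoded
        media chunk (or None at the end of the stream); render_chunk(chunk) stands
        in for the decoder/renderer. Both are supplied by the caller.
        """
        buffer = collections.deque()
        # 1. Pre-buffer: store the first few seconds in memory before playback begins.
        while len(buffer) * chunk_duration_s < prebuffer_s:
            chunk = fetch_chunk()
            if chunk is None:
                break
            buffer.append(chunk)
        # 2. Playback: consume one chunk per interval while refilling from the network.
        while buffer:
            render_chunk(buffer.popleft())
            nxt = fetch_chunk()              # may be slow when bandwidth is constricted;
            if nxt is not None:              # the buffered reserve hides short stalls
                buffer.append(nxt)
            time.sleep(chunk_duration_s)     # pace playback at the nominal chunk rate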

4.3 The iBot Platform
Figure 4.1 shows our iBot platform model. This platform allows users to remotely control a robot in response to live video images captured by the camera on the robot. The iBot Server connects to the robot and camera over a wireless channel, obviating problems associated with cables.

Figure 4.1 The iBot Platform Model: the robot (with on-board camera and microphone), the iBot Server and a Streaming Media Server (SMS) are connected across the Internet to one Master iBot Client and multiple Slave iBot Clients

The iBot Server contains two components: a Streaming Media Encoder (SME) and a Controller Server (CS). The iBot Client contains two further components: a Streaming Media Player (SMP) and a Controller Client (CC) (see Figure 4.2). The SME captures, encodes and transfers the real-time video and audio under the instruction of the CS. The SMP receives, decodes and displays the media data. The CC communicates with the SMP and interacts with the CS to implement the intelligent control algorithms. The CS ultimately distributes the robot movement commands and camera pan-tilt-zoom commands.
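As an illustration of the control path only, the following sketch shows how a Controller Client might encode a movement or pan-tilt-zoom request and send it to the Controller Server over TCP. The message format, field names and port are hypothetical; the paper does not specify the actual iBot protocol.

    import json
    import socket

    def send_command(host, port, kind, **params):
        """Send one hypothetical control message to the Controller Server (CS).

        kind is e.g. "move" or "ptz"; params carry the command arguments, e.g.
        send_command(host, 9000, "move", linear=0.2, angular=0.0) or
        send_command(host, 9000, "ptz", pan=15, tilt=-5, zoom=2).
        """
        message = json.dumps({"type": kind, "params": params}).encode("utf-8")
        with socket.create_connection((host, port), timeout=5.0) as conn:
            conn.sendall(message + b"\n")        # newline-delimited request
            reply = conn.makefile().readline()   # CS acknowledgement, if any
        return reply.strip()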


Figure 4.2 Principle of the iBot Server and iBot Client

The SME, SMP and SMS use and improve on streaming media technologies. Because it uses the latest streaming technologies, the iBot framework has considerable flexibility and scalability. If there is enough network bandwidth (over 100 kbps), the SME can simultaneously but separately capture and encode video and audio, and combine them into an integrated stream sent to the SMP. Given the limits of most Internet users (under 100 kbps) and our Internet export bandwidth (under 150 kbps), our main concern is video processing; after all, sight is the main way we recognize the world. Under the iBot framework, all kinds of text, animation and images can be encoded and transferred by the SME, which establishes a base for potential future extensions. Two stream transmission schemes can be adopted: "push" and "pull". In a push scheme, the SME actively pushes the encoded media stream to the SMS or the SMP. If the receiver is not working, however, this scheme has the potential to waste network resources, so we have adopted the pull scheme, in which the SME listens on a predefined port and transfers the media stream after it receives a request from the SMS or SMP.
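A minimal sketch of the pull scheme just described, given as our own illustration rather than the actual iBot code: the encoder side listens on a known port and only starts sending encoded chunks once a receiver (the SMS or SMP) has asked for them. The port number and chunking are assumptions.

    import socket

    def serve_pull_stream(chunk_source, port=8500):
        """Toy pull-scheme sender: wait for a request, then stream chunks.

        chunk_source is an iterable of already-encoded media chunks (bytes);
        in a real system this would be the live encoder output.
        """
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            srv.bind(("", port))
            srv.listen(1)
            conn, _addr = srv.accept()         # block until the SMS/SMP connects
            with conn:
                conn.recv(1024)                # read and ignore the request line
                for chunk in chunk_source:     # send encoded chunks only on demand
                    conn.sendall(chunk)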

5 Preliminary Experiments & Results

Microsoft provides a complete series of software development kits (SDKs), the Windows Media Series SDK [11]. This SDK helps researchers to develop streaming applications based on Windows Media Series 9. Using the SDK, we built a prototype system. Table 2 records the stream media size of a 19-second video clip under different network bandwidths, using different compression algorithms and different streaming optimizations. WMV9 denotes Windows Media Video 9, as provided in Windows Media Encoder 9; MPEG4 denotes the ISO MPEG4 video codec [13], implemented here using QuickTime Player Pro [12].

Table 2. Network Bandwidth versus Video Codec

Audience bandwidth | WMV9 | MPEG4
20 kbps (28k dial-up modem, 3 fps) | 50 KB | *
34 kbps (56k dial-up modem, 12 fps) | 92 KB | 183 KB
50 kbps (64k Single ISDN, 15 fps) | 131 KB | 193 KB
100 kbps (128k Dual ISDN, 15 fps) | 240 KB | 260 KB
150 kbps (150k LAN or DSL, 15 fps) | 384 KB | 378 KB

Note: fps means frames per second; * means the codec could not produce a stream at 20 kbps. This 19-second video clip simulates rapid movement of the robot on campus. The source resolution is 320*240, the frame rate is 25 fps, and the size of the uncompressed RGB24 data is 110 MB. In Table 2, we classify the potential audiences with reference to the classifications used by RealNetworks and Microsoft. According to the actual Internet export bandwidth of our university, we define five audiences with capacities ranging from a 28 kbps dial-up modem to a 150 kbps LAN or DSL connection. It is worth noting that the actual media stream rate should be lower than the theoretical network bandwidth; for example, to ensure stable performance, a 50 kbps media stream is provided for a 64 kbps Single ISDN audience. The results in Table 2 show that at low bandwidth (< 100 kbps) WMV9 is more effective than MPEG4, while at higher bandwidth (over 100 kbps) their performance is similar. More importantly, it is entirely feasible to compress the video images heavily for streaming, as the following results further confirm.
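As a sanity check on these figures, the arithmetic below (our own, derived only from the source parameters stated above) reproduces the quoted uncompressed size and gives the lower bound on stream size implied by each target bit rate, ignoring container overhead and encoder buffering.

    # Uncompressed source: 320x240 pixels, 3 bytes per pixel (RGB24), 25 fps, 19 s
    frame_bytes = 320 * 240 * 3                # 230,400 bytes per frame
    raw_bytes = frame_bytes * 25 * 19          # 109,440,000 bytes, i.e. about 110 MB

    # Lower bound on the size of a 19-second stream at a given target bit rate
    def min_stream_kb(bitrate_kbps, duration_s=19):
        return bitrate_kbps * duration_s / 8   # kilobits -> kilobytes

    print(raw_bytes)           # 109440000
    print(min_stream_kb(50))   # ~119 KB, versus the 131 KB measured for WMV9
    print(min_stream_kb(100))  # ~237 KB, versus the 240 KB measured for WMV9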


There are two video encoding methods that can be applied to a live broadcast: Constant Bit Rate (CBR) encoding and quality-based Variable Bit Rate (VBR) encoding. CBR encoding allows us to specify the average bit rate that we want to maintain and then to set the size of the buffer. The bit rate will fluctuate across the stream, but the fluctuations are constrained by the buffer size. Quality-based VBR allows us to specify a desired quality level (from 0 to 100); during encoding, the bit rate then fluctuates according to the complexity of the stream: a higher bit rate is used for intense detail or high motion, and a lower bit rate is used for simple content. We conducted the following experiments using WMV9 as the video codec and MMS (TCP) as the streaming protocol on the iBot server side, with CBR and quality-based VBR encoding at different bandwidths. The source was the 19-second campus video clip, broadcast ten times for a total of 190 seconds. We measured the actual receiving bit rate at the iBot client every second. The results are shown in Figure 5.1.
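The contrast between the two modes can be caricatured with a deliberately simplified, purely illustrative rate-control loop (it is unrelated to the Windows Media encoder internals, and the constants are arbitrary): CBR adjusts quality so that a leaky-bucket buffer stays near the target rate, while quality-based VBR fixes the quality and lets the bit rate track scene complexity.

    def encode_frame_cbr(frame_complexity, buffer_fullness, buffer_size, target_bits):
        """CBR caricature: lower the quality as the rate-control buffer fills up.

        frame_complexity and the returned quality are abstract values in [0, 1];
        bits spent grow with both complexity and quality, but the buffer keeps
        the running rate close to target_bits per frame.
        """
        headroom = 1.0 - buffer_fullness / buffer_size
        quality = max(0.05, min(1.0, headroom))
        bits_spent = target_bits * (0.5 + frame_complexity) * quality
        return quality, bits_spent

    def encode_frame_vbr(frame_complexity, quality_level):
        """Quality-based VBR caricature: quality is fixed, bits follow complexity."""
        bits_spent = 1000.0 * (0.5 + frame_complexity) * quality_level
        return quality_level, bits_spent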

Figure 5.1 Receiving bit rate (kbps) at the iBot client versus time (s): (a) quality-based VBR, quality level 100; (b) quality-based VBR, quality level 50; (c) CBR, 100 kbps (campus Internet), 15 fps; (d) CBR, 20 kbps (33.6 kbps modem), 5 fps

It can be seen from Figure 5.1(a) that the curve is well regulated, because at a high bandwidth (over 2.5 Mbps) quality-based VBR maintains a consistent quality across all streams; the crests and troughs reflect the repeated scene details. At a low bandwidth (about 100 kbps), however, playback performance is poor (Figure 5.1(b)). The advantage of quality-based VBR encoding is that the quality remains consistent across all streams for which the specified quality setting is the same. The disadvantage is that we cannot predict the file size or bandwidth requirements of the encoded content. We must conclude, then, that quality-based VBR is not suitable for our iBot live broadcast. Under 100 kbps (Figure 5.1(c)) and 20 kbps (Figure 5.1(d)) bandwidth, CBR performs well. The content quality fluctuates to ensure that the buffer does not over- or underflow. The advantage of CBR encoding is that the bit rate and size of the content are known before encoding, so we can predict the final size and bandwidth requirements of the encoded content. Of course, when the content varies in complexity, the encoding quality is not constant. Using CBR encoding in our iBot framework ensures that the media is streamed smoothly.

We have carried out a successful teleoperation experiment using a real P3-DX8 robot. The robot uses a multifunctional Hitachi H8S-based microcontroller and has a 44 cm x 38 cm x 22 cm aluminum body and eight forward sonars. The control commands are transferred through radio Ethernet devices, and the video/audio data is fed back from a pan-tilt-zoom camera mounted on the robot deck through a set of 2.4 GHz A/V transmitter-receivers.

Figure 5.2 User interface window of iBot client

We set up the experiment as in Figure 4.1. Through a telephone line, the iBot client connects to the Internet via a 33.6 kbps dial-up modem. With the streaming video feedback, we can see the remote robot's surroundings. The Internet operator can remotely control the real robot to explore areas of interest, and can also observe details via the camera's pan-tilt-zoom movement. Figure 5.2 shows the current Internet user interface window of the iBot client. Although there is a large time delay (about 12 seconds), we did succeed in remotely controlling the robot, using direct control to negotiate a complicated hall with many desks and walls (see Figure 5.3). Further investigations continue.

Figure 5.3 The robot moving in a complicated hall

Project name | Test environment | Technology | Image size | Efficiency
Mercury [1] (1994) | 14.4K modem Internet | GIF client pull | 192 × 165 | 1 frame every 60 seconds
Xavier [2] (1995) | Low-bandwidth Internet | GIF server push | Low resolution | 1 frame every 20 seconds
EPFL [14] (1998) | LAN | GIF or JPEG server push | 200 × 150 | 10-15 fps
BGen [16] (2001) | Internet | Video conference using H.261 | 176 × 144 | 7.5 frames per second (fps)
Essex [17] (2001) | Internet | JPEG server push | 200 × 150 | 7-8 fps / 50 frames total
VLAB [18] (2003) | Internet | Video conferencing | unknown | 3-4 fps
PolyUiBot (2003) | 33.6K modem Internet | Streaming technology using WMV9 or MPEG4 | 320 × 240 | 5 fps / 25 frames total

Table 3. Comparison of IT projects using different approaches for vision feedback

We have compared Internet telerobotics projects according to their differing approaches to vision feedback (see Table 3) and have come to the conclusion that the streaming media technology-based approach has a number of clear advantages over the earlier approaches.
• Streaming technology, based on the WMV9 or MPEG4 compression algorithms, can greatly improve the quality of service over the low-bandwidth, uncertain Internet transmission channel, producing a more stable system, higher image resolution, and smoother images.
• Streaming technology allows many Internet users to monitor the remote robot's actions simultaneously, without reducing the quality of service or increasing the network bandwidth. This function derives from an attractive feature of streaming technology: multicast. When using multicast streams, the media server generates one single stream to which multiple player-clients can connect (a minimal multicast sketch is given at the end of this section). Users watch the content from the time they join the broadcast. The client connects to the stream, not to the server. This method saves network bandwidth and is mostly used for live broadcasts. Unfortunately, to distribute multicast streams, a network must be equipped with routers and switches that support multicast protocols.
• Streaming technology can incorporate multiple types of data into a single transmission stream. This will be an advantage if future applications of Internet telerobotics need more multimedia information feedback, such as audio.
In our "PolyUiBot" project, we intend to use approaches based on both streaming technology and virtual reality (e.g. sonar reading visualization, virtual trajectory display) to improve sensibility as much as possible.
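To make the multicast feature mentioned above concrete, here is a generic UDP multicast sketch (our own illustration using an arbitrary group address and port, not the MMS/Windows Media delivery path): the sender emits each encoded chunk once to a group address, and any number of receivers that have joined the group get the same stream, so the sender's load does not grow with the audience.

    import socket
    import struct

    GROUP, PORT = "239.1.2.3", 5004    # arbitrary multicast group and port

    def send_stream(chunks, ttl=1):
        """Send each encoded media chunk once to the multicast group."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        for chunk in chunks:
            sock.sendto(chunk, (GROUP, PORT))
        sock.close()

    def receive_stream(handle_chunk):
        """Join the multicast group and pass every received chunk to the player."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        membership = struct.pack("4s4s", socket.inet_aton(GROUP),
                                 socket.inet_aton("0.0.0.0"))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
        while True:
            data, _ = sock.recvfrom(65536)
            handle_chunk(data)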

6. Conclusion

In this article, we have introduced several Internet Telerobotics (IT) systems. More importantly, based on published research contributions and our own investigations, we have proposed the innovative ARIUS criteria for measuring the performance of IT. With the goal of orienting the direction of further research, ARIUS applies five criteria – sensibility, interactivity, intellectuality, robustness, and usability – to the assessment of existing IT projects. In this article, we focus on improving the sensibility of our "PolyUiBot" project using a streaming technology-based approach. The results of our preliminary experiments, involving a test of real robot teleoperation over a 33.6 kbps (modem) Internet connection, have been promising. Indeed, our approach is shown to perform better than other approaches, which greatly encourages us about its potential for improving sensibility in Internet telerobotics.

References:
[1] Goldberg, K.; Gentner, S.; Sutter, C.; Wiegley, J., "The Mercury Project: a feasibility study for Internet robots", IEEE Robotics & Automation Magazine, Vol.7, Iss.1, 35-40, Mar 2000
[2] Simmons, R.; Fernandez, J.L.; et al., "Lessons learned from Xavier", IEEE Robotics & Automation Magazine, Vol.7, Iss.2, 33-39, 2000
[3] Thrun, S.; Bennewitz, M.; et al., "MINERVA: a second-generation museum tour-guide robot", IEEE International Conference on Robotics and Automation, Vol.3, 1999-2005, 1999
[4] Schulz, D.; Burgard, W.; Fox, D.; Thrun, S.; Cremers, A.B., "Web interfaces for mobile robots in public places", IEEE Robotics & Automation Magazine, Vol.7, Iss.1, 48-56, Mar 2000

[5] Luo, R.C.; Su, K.L., "Networked intelligent robots through the Internet: issues and opportunities", Proceedings of the IEEE, Vol.91, Iss.3, 371-382, 2003
[6] Saucy, P.; Mondada, F., "KhepOnTheWeb: open access to a mobile robot on the Internet", IEEE Robotics & Automation Magazine, Vol.7, Iss.1, 41-47, Mar 2000
[7] Luo, R.C.; Tse Min Chen; Chih-Chen Yih, "Intelligent autonomous mobile robot control through the Internet", Proceedings of the 2000 IEEE International Symposium on Industrial Electronics, Vol.1, PL6-P11, 2000
[8] Siegwart, R.; Goldberg, K., "Robots on the web", IEEE Robotics & Automation Magazine, Vol.7, Iss.1, p.4, Mar 2000
[9] Steve Mack, Streaming Media Bible (New York: Hungry Minds Inc., 2002)
[10] Eyal Menin, Streaming Media Handbook (New Jersey: Prentice Hall PTR, 2002)
[11] http://www.microsoft.com/windows/windowsmedia/
[12] http://www.apple.com/mpeg4/
[13] ISO/IEC JTC1/SC29/WG11 N4668, Overview of the MPEG-4 Standard, March 2002
[14] R. Siegwart, C. Wannaz, P. Garcia and R. Blank, "Guiding mobile robots through the Web", Proceedings of the IROS'98 Workshop on Web Robots, 1-6, Victoria, Canada, 12-17 Oct. 1998
[15] Luo, R.C.; Tse Min Chen, "Development of a multi-behavior based mobile robot for remote supervisory control through the Internet", IEEE/ASME Transactions on Mechatronics, Vol.5, Iss.4, 376-385, Dec 2000
[16] Barbera, H.M.; Izquierdo, M.A.Z.; Skarmeta, A.F.G., "Web-based supervisory control of mobile robots", Proceedings of the 10th IEEE International Workshop on Robot and Human Interactive Communication, 256-261, 2001
[17] Lixiang Yu; Pui Wo Tsui; et al., "A Web-based telerobotic system for research and education at Essex", IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Vol.1, 37-42, 2001
[18] Safaric, R.; Sinjur, S., "Control of robot arm with virtual environment via the Internet", Proceedings of the IEEE, Vol.91, Iss.3, 422-429, 2003
[19] Elhajj, I.; Ning Xi; et al., "Supermedia-enhanced Internet-based telerobotics", Proceedings of the IEEE, Vol.91, Iss.3, 396-421, 2003
[20] Wai-keung Fung; Ning Xi; et al., "Improving efficiency of Internet based teleoperation using network QoS", IEEE International Conference on Robotics and Automation (ICRA '02), Vol.3, 2707-2712, 2002
[21] Meng Wang; James N.K. Liu, "Streaming Technologies based Internet Telerobotics", 2nd IASTED International Conference on Communications, Internet & Information Technology, Scottsdale, USA, Nov. 2003