
A Situated Approach to Scalable Control for Strongly Cooperative Robot Teams

Ph.D. Dissertation Proposal submitted by Barry Brian Werger

September 2000

Guidance Committee: M. Mataric (Chairperson), R. Hill, B. Khoshnevis (External Member), N. Medvidovic, G. Sukhatme

Contents

List Of Figures
1 Introduction
2 Situatedness
2.1 Introduction and Related Research
2.2 Situated Interaction: what it is and what it's not
2.3 Evaluation of the Situated Approach
2.3.1 Minimalism
2.3.2 Scalability
3 Survey of Related Research: Control of Multi-Robot Teams
3.1 Behavior-based Robotics
3.2 Behavior-Based Control of Multi-Robot Teams
4 Port-Arbitrated Behavior-Based Control
4.1 Behavior-Producing Modules
4.2 Connections Between Ports
4.3 Survey of Related Research: PAB Systems
5 Minimalist Multi-Robot Systems
5.1 Physically-Situated Role Assumption
5.2 Experiment: Robot Chaining
5.2.1 An Ant-like Robotic System
5.2.2 The Foraging Task
5.2.3 Robot Chains
5.2.3.1 Behaviors for Chain-Based Foraging
5.2.4 Implementation Details
5.2.4.1 Behavioral Details
5.2.4.2 The Robot Herd
5.2.4.3 Performance
5.2.5 Adaptive Chains
5.2.5.1 Chain Direction
5.2.5.2 Chain Length
5.2.5.3 Chains in Motion
5.3 Experiment: Minimal Control for Sophisticated Individual and Team Behavior in Robot Soccer
5.3.1 The Soccer Robots
5.3.2 RoboCup Robotic Soccer
5.3.3 Individual Competence in Robotic Soccer
5.3.3.1 Safety - de-coupled motor control
5.3.3.2 Patrol - bottom level behavior
5.3.3.3 Ball Manipulation
5.3.3.4 Emergent properties of ball-manipulation basis behaviors
5.3.3.5 Kicking - style without sequences
5.3.3.6 Sophistication of ball-manipulation behavior
5.3.4 Team Cooperation
5.3.5 RoboCup Team Behavior
5.3.5.1 Offensive formation
5.3.5.2 Defensive group formation
5.3.5.3 Transition between formations
5.3.5.4 Survey of Related Research: Formations
5.3.5.5 Discussion
5.4 Generality of Techniques Exploiting Physical Situatedness
6 Extending PAB and Situatedness to Networked Robot Teams
6.1 Abstract Situatedness
6.2 PAB Control as Situated Interaction
7 Ayllu: Scalable, Distributed Port-Arbitrated Behavior-Based Control
7.1 The Ayllu Language
7.1.1 Ayllu Extensions to the PAB Paradigm
7.1.1.1 Connections Over Networks
7.1.1.2 Scalability Features
7.1.1.3 Write-Inhibition
7.2 Ayllu Programming
7.2.1 Ayllu Behavior Structure
7.3 Brief Survey of Ayllu-based Systems
8 A PAB Approach to a Common Control Language
9 Broadcast of Local Eligibility
9.1 BLE-Enabled BPMs
9.2 Cross-Inhibition of Behaviors
9.3 Cross-Subsumption
9.4 Allocation of Multiple Robots to a Task
9.5 Strict vs. Opportunistic Cross-Subsumption
9.6 Heterogeneous Systems
9.7 Failure Recovery and Role Switching
9.8 Scalability of BLE
9.9 BLE Summary
10 BLE Experiments: Multi-Target Observation
10.1 The CMOMMT Task
10.2 Experimental Design
10.3 Robot Behaviors
10.4 Results
10.5 Comparison to ALLIANCE
11 Conclusions and Timetable of Future Work
11.1 Future Experiments with BLE
11.2 Common Control Language
11.3 Analysis of Situated Systems
11.4 Dissertation and Defense
Reference List
Appendix A Ayllu Code for CMOMMT
A.1 The Observer Behavior
A.1.1 Behavior Class Definition
A.1.2 Process Definition
A.2 main() Function
Appendix B Glossary of Terms

List Of Figures

4.1 Effects of Connection Type on Message Propagation
4.2 A Subsumption Hierarchy
5.1 Foraging with a Robot Chain
5.2 Communication in the Robot Chain
5.3 Excursion Search
5.4 Behaviors for Chaining
5.5 Robot Chain Formation
5.6 A Nerd Herd Robot
5.7 Nerd Herd Sensors
5.8 Shifting the Robot Chain
5.9 Optimizing Chain Length
5.10 Sweeping with a Robot Chain
5.11 Following the Sweeping Chain
5.12 Pioneer 1 Soccer-playing Robot
5.13 Patrol - Low-level Behavior
5.14 Basis Behaviors for Ball Manipulation
5.15 Ball Manipulation Trajectories
5.16 Addition of the Ball-Manipulation Layer
5.17 Table Implementation of Ball-Manipulation Behaviors
5.18 The Rear-End Kick
5.19 Complete Behavior System for Soccer
5.20 Offensive Group Formation
5.21 Defensive Group Formation
9.1 Cross-Inhibition and Cross-Subsumption
9.2 Assigning multiple robots to tasks
9.3 Cross-Subsumption of Heterogeneous Robots
9.4 Robot or Communication Failure
10.1 Experimental Environment
10.2 Local Subsumption for CMOMMT
10.3 BLE Control for CMOMMT
10.4 Greedy Control for CMOMMT
10.5 Centralized Control for CMOMMT
10.6 Observation and Weighted Observation, by Algorithm
10.7 Simultaneous Observation, by Algorithm
10.8 Observation Over Time

Chapter 1 Introduction

This thesis examines the exploitation of the situated nature of multi-robot systems. We believe that such exploitation is the key to the construction of robust, scalable systems with minimal requirements for sensing, computation, and communication. Traditional approaches to autonomous robotics view controllers as symbol-processing systems, while situated approaches map perceptions to actions without symbolic deliberation. Situated approaches (in the form of behavior-based or reactive control) were the key to overcoming many of the problems of early autonomous robots, and have been so influential that almost all current robot control systems are described as at least partly "reactive." The question of how much of the intelligent control spectrum can be covered by situated systems remains open, yet most researchers turn to hybrid systems [Mat95a] (with both deliberative and reactive "layers") when attempting to scale behavior to higher levels of complexity.

Many biologically-inspired systems take advantage of individual agents' situatedness to reduce or eliminate the need for centralized control or explicit global knowledge. This reduces the complexity of sensing, computation, and communication required of individuals and leads to robust, scalable systems. Situatedness is not often associated with strongly cooperative [parkertra98] robot team behavior, that is, with performance of tasks that require distinct roles to be concurrently filled, and which cannot be performed by a single robot. In natural insect systems, however, situated approaches have proven effective both for task performance [HM00, DGF+91, TGGD91] and for task allocation [Gor99, HW90]. We are inspired by these insects to focus on a situated approach to the problem of role assumption (distributed task allocation in which each agent selects its own task-performing role), a key factor in strong cooperation.

While our earlier work in ant-like robotics [WM99] has focused on physically situated interaction, the desire for more general, principled situated techniques has led us to study the exploitation of abstract situatedness, that is, situatedness in non-physical environments. The Port-Arbitrated Behavior-Based Control approach we will discuss provides a well-structured abstract behavior space in which agents can participate in situated interaction. We present the general, principled Broadcast of Local Eligibility (BLE) technique for role assumption in such behavior-space-situated systems.


Chapter 2 discusses the concept of situatedness and the relation of a situated approach to minimalism and scalability.

Chapter 3 surveys the spectrum of approaches to multi-robot control.

Chapter 4 presents the Port-Arbitrated Behavior-Based control paradigm.

Chapter 5 presents two experimental multi-robot systems that take advantage of situatedness in the physical world, one inspired by ant pheromone trail formation and one designed for Robot Soccer competition, demonstrating both the benefits of such situated systems and the need for their generalization.

Chapter 6 introduces the concept of abstract situatedness, focusing on situatedness in the behavior space created by port-arbitrated behavior-based control techniques.

Chapter 7 describes Ayllu, a language developed to extend Port-Arbitrated Behavior-Based Control for increased scalability and distribution over networks.

Chapter 8 presents the application of PAB principles to a Common Control Language for operator-robot and inter-robot communication.

Chapter 9 introduces the Broadcast of Local Eligibility, a technique enabled by Ayllu which serves as a general, language-level tool for building arbitrarily scalable, strongly cooperative teams by structuring the PAB behavior space for generalized types of situated interaction.

Chapter 10 presents experiments applying BLE to a cooperative multi-target observation task, an elevator control task, and a material transportation task.

Chapter 11 provides a summary of concepts, work to date, and a timetable of future work towards the dissertation.

Appendix A provides an example of PAB (specifically, Ayllu) programming: the code for the observation task of Chapter 10.

Appendix B is an easy-reference glossary of important terminology used throughout this thesis, including both brief definitions and page references to their introductions in the text.


Chapter 2 Situatedness

2.1 Introduction and Related Research

While the concept of intelligent behavior arising from environmental interaction goes at least as far back as Simon's wandering ant [Sim69], Brooks' formulation of situatedness [Bro91a] and successful exploitation of such interaction are considered responsible for the behavior-based revolution in robot control [Ark98b, Mat97]. The basic tenet of situatedness is "the world is its own best model": given the dynamism and uncertainty of the world and of robots' perception of it, it is better to rely on rapid sensor-based feedback ("shallow computation") than on persistent internal models of world state. A number of robots implemented with extremely simple control systems demonstrated the effectiveness of a situated approach by robustly performing tasks that had proved problematic for previous deliberative robots, particularly in navigation and insect-like legged locomotion [Bro90c, Ark98b].

A variety of research projects began to study exploitation of situatedness for control of groups of robots. Many were inspired by natural phenomena such as flocking [Mat95b] and stigmergic [BHD94, HM00, WM96] insect interaction (i.e., interaction through environmental effects; see Section 5.2 below); others examined applicability to tasks such as military formation control [Par93]; and still others, focusing on manipulation tasks, characterized situated techniques as "externalization" of information into the environment, and began investigation of methods for rigorous analysis of what we call situated systems [Don95, DJR94].

Particularly in the area of communication between multiple robots, these issues are in need of investigation. It is here that the symbol-grounding problem becomes most difficult, since symbols and representations must be shared by robots that have different, uncertain perceptions of their environment. Techniques for situated communication through the environment, inspired by natural systems, have only begun to be investigated (as discussed in Section 5.2), and we have only recently introduced the techniques for communication "situated" in abstract task spaces discussed in Chapter 9.

Whereas earlier work in situated control has focused on physically-situated aspects of multi-robot systems, below we extend the concept of situatedness to include environmental interaction between robots in an abstract task space. Whereas a number of researchers have classified inter-robot communication as either state or goal transmission [SV00, Ark98b], we introduce a new form of "Ethereal" communication that is analogous to environmental interaction.

We do not claim that there is no place for deliberation, world models, or symbolic computation in robotic systems. We do claim that the situated approach can scale to higher levels of complexity than many robotics researchers assume, and that both the process of development of situated systems and the resulting systems themselves can in many cases be more efficient and robust than those that rely on "classical AI" deliberative approaches. By exploiting the environment as "its own best model," taking advantage of the environment itself as the direct source of information for computation and constraints on behavior, requirements for sensing, computational complexity, and symbolic communication can be minimized.

2.2 Situated Interaction: what it is and what it's not

"Interaction through the environment" encompasses a wide range of interactions, of which individual robots may or may not be aware.

The behavior of a robot may be directly affected by some factor in the environment that is not perceived by the robot; that is, its actions within the environment may be interfered with. This is usually understood as a problem, such as physical interference [GM96] (collisions between robots), relocation within the environment (by the experimenter or some other external force, common in testing localization strategies), or undetectable impediments to motion (such as floor gaps or carpet edges). However, it is possible to develop strategies that take advantage of such undetectable interaction to increase system robustness while reducing the need for environmental sensing, as seen in compliant motion control of robot arms and "gas-like" area coverage strategies.

The behavior of a robot may be directly affected by some factor in the environment that is perceived by the robot. This is again commonly seen in "problems" of collisions with walls and other robots (when such collisions are detected), but can also be exploited positively as in, for example, a wall-following system that monitors contact with the wall to generate behavior. Minimal systems that interact in this manner have performed tasks such as object clustering and sorting [HM00, BHD94] and cooperative large-object manipulation [DJR94].

The behavior of a robot may also be indirectly affected by some factor in the environment perceived by the robot; that is, its actions in the environment are not interfered with, but the control system of the robot responds to an environmental stimulus. A common example of this is reactive collision avoidance. A survey of the literature shows that the bulk of robot-environment interactions (through vision, sonar, infrared, laser, and other sensors) fall into this category, especially since physical interaction, as discussed above, is usually considered problematic.

In non-situated interaction, neither direct influence nor immediate perception plays a role; instead, the robot bases its selection of actions on manipulation of symbolic information. This includes, for example, behavior based on the use of an internal map, even if such a map was created sometime previously through processing of perceptions. In multi-robot systems, it is common for robots to share a global coordinate system and broadcast their positions, so that robot behavior depends on calculation, rather than perception, of relative positions.

Note that there are a number of gray areas between situated and non-situated interaction. One notable area is limited-range communication, in which robots communicate potentially non-situated information, but the communication limitations impose some constraints of locality which can be exploited. However, such systems in practice are generally artificially limited: the robots in actuality receive all information and then calculate, based on other robots' position broadcasts, which data is "within their range" [Mat95b, Par99, VySM00]. (We see this willful discarding of "global" data as an affirmation of the benefits of situated techniques.) There are also systems which communicate symbolic information directionally, such as through narrow-beam IR (e.g., [Don95]). This allows position-relational tasks to be carried out without a shared global coordinate system, and can thus reduce the complexity of symbolic interactions.

Another factor in situated interaction involves modification of the environment by the robots (known as stigmergy [HM00] or externalization of state or programs [Don95]). This is discussed fully in Section 5.2 below.
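The artificially limited communication described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not code from any of the cited systems: the `PositionBroadcast` record, the `within_range` function, and the sample values are hypothetical names and data introduced here. Each robot receives every broadcast and then discards those whose sender's reported position lies outside a chosen range.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionBroadcast:
    """Hypothetical message: globally received, tagged with sender position."""
    sender: str
    x: float
    y: float
    payload: str


def within_range(own_x, own_y, broadcasts, comm_range):
    """Discard broadcasts whose sender is farther away than comm_range."""
    return [
        b for b in broadcasts
        if math.hypot(b.x - own_x, b.y - own_y) <= comm_range
    ]


# Assumed sample data: three robots broadcasting in a shared coordinate frame.
received = [
    PositionBroadcast("r1", 1.0, 0.0, "target seen"),
    PositionBroadcast("r2", 10.0, 0.0, "target seen"),
    PositionBroadcast("r3", 0.0, 2.0, "idle"),
]
nearby = within_range(0.0, 0.0, received, comm_range=3.0)
nearby_senders = [b.sender for b in nearby]  # r2 is discarded as "out of range"
```

Note that the locality here is computed from symbolic position data rather than perceived, which is why the text classifies such systems as a gray area.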

2.3 Evaluation of the Situated Approach

There are three major axes of evaluation we would like to apply to our situated systems: minimalism, scalability, and robustness.

2.3.1 Minimalism

Donald and colleagues [DJR97] discuss minimalism, the pursuit of the minimal configuration of resources for performance of a task, as an approach to the design of multi-agent systems. This is theoretically interesting because it can prove that certain resources are inessential to the information structure of the task ([Don95] discusses this in greater detail). The practical benefits of minimalism, including easier and faster development and debugging and a more efficient and robust execution system, have led to recent popularity of minimalist systems; [DJR97] gives an overview which includes walking and running machines without static stability, dextrous manipulation without sensing, walkers without sensors or actuators, and behavior-based control systems.

They also present the concept of supermodularity, or relocatable modularity, which partially orders the ability of systems to function when physically embedded in different ways. For multi-robot systems, this functions as a measure of simplicity, ease of reuse, and fault tolerance, and provides for certain performance guarantees. Analyzing the supermodularity of a system allows us some insight into the emergent properties [Bro91a, Ste94] of such systems. [DJR94] presents the beginnings of a methodology for minimizing parallel manipulation protocols through insights gained from this and other analyses based on their information invariants [Don95] approach to multi-robot system design.

Our plans for future work (see Section 11) include analysis of our situated approaches through these "information invariants" and other analytic techniques, and the extension of these techniques to cover and elucidate abstract as well as physical situatedness.

2.3.2 Scalability

We also aim to develop techniques for analysis of the scalability of robot teams in the process of completing the dissertation. We demonstrate scalability of strongly cooperative robot teams in all three of our example systems through appropriate self-selection of roles.

One factor in analysis of scalability is that of indexical-functional perception, introduced by [AC87]. This involves the ability to react, in a sense, to the roles played by other agents in the environment, relative to oneself, rather than to an identity particular to an agent (that is, in [AC87]'s terms, to "the bee that is chasing me" rather than to "bee 27"). Systems able to make such distinctions are in some ways inherently scalable, as other agents do not need to be modeled when not perceived, and are categorized into something addressable (i.e., a chasing bee) when they are perceived.

Other approaches to scalability include what [Kra97] calls "physics-based." These model such phenomena as attraction and repulsion between particles, are fully distributed, and require no communication; they are thus arbitrarily scalable, extremely simple to implement, and robust to environmental changes and robot failures. Such interaction has been shown useful in such tasks as robot flocking [Mat95a], collective sorting [HM00], Parker's [Par99] implementation of CMOMMT, and our own work in robot soccer (discussed in Section 5.3).
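As a rough illustration of why such physics-based interaction scales, the following Python sketch sums a simple attraction and repulsion term over whatever neighbors a robot currently perceives. The spring-like force law, the function name `social_force`, and the parameters `preferred` and `gain` are assumptions made for this example; they are not taken from [Kra97] or from any of the systems cited above.

```python
import math


def social_force(own, neighbors, preferred=1.0, gain=1.0):
    """Sum an assumed spring-like force from each perceived neighbor.

    Neighbors farther away than `preferred` attract; closer ones repel.
    Neighbors are anonymous (x, y) positions: no identities are modeled.
    """
    fx = fy = 0.0
    for nx, ny in neighbors:
        dx, dy = nx - own[0], ny - own[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # coincident positions give no usable direction
        magnitude = gain * (dist - preferred)
        fx += magnitude * dx / dist
        fy += magnitude * dy / dist
    return fx, fy


# A distant neighbor pulls the robot toward it; a close one pushes it away.
pull = social_force((0.0, 0.0), [(2.0, 0.0)])
push = social_force((0.0, 0.0), [(0.5, 0.0)])
```

The cost per robot depends only on the number of neighbors it perceives, and no message passing or agent identity is involved, which matches the indexical-functional view described above.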


Chapter 3 Survey of Related Research: Control of Multi-Robot Teams

Our focus is on systems of cooperative autonomous robots in dynamic environments. In this area there is intense debate over deliberative, behavior-based, and hybrid control strategies [Wer99b]. Many researchers state that various types of deliberation, models of other agents, and explicit communication are necessary for cooperative behavior [BA95, Jen95, Kra97, SV99, Tam97b]; others advocate ethologically-inspired systems which cooperate only through environmental interaction [BHD94, DGF+91, Bro90b, KZ92, Mat95b, WM96]; and still others [DJR97, DJR94, DJMW95, OT97, Wer98] pursue other methodologies for minimal control of groups of robots. [Bro90b, AH92, Mat95a, Par93, Kra97] present general discussions of the tradeoffs between local and global control in multi-robot systems.

We can briefly summarize our argument as follows: symbolic processing and communication allow the use of well-known, well-understood traditional AI methods which have proven successful for high-level reasoning in centralized, noiseless problem spaces (such as chess-playing), but which have proven brittle when symbols must be grounded in real-world uncertainty. Situated control, defined as it is by such grounding, has proven effective in dealing with real-world uncertainty, but does not have the history of proven techniques for high-level behavior. This leads somewhat naturally to the promise of hybrid systems in which high-level, abstract reasoning is handled by a symbolic deliberative system which is "grounded" in the behavior space of a situated system. However, it is not clear that hybrid approaches truly address issues of uncertainty rather than merely delay them (i.e., the "horizon effect" discussed in [Bro91a]); and, as discussed here, techniques for scaling the situated approach toward higher levels of behavior continue to make progress.

3.1 Behavior-based Robotics

As we have mentioned, "behavior-based" has become a very popular term, but one without exact definition. Mataric [Mat97] gives an overview of common conceptions of the behavior-based approach. Brooks [Bro91a] describes a set of four key concepts and their key ideas that lead to behavior-based robotics: situatedness - the world as its own best model, embodiment - the world grounds regress, intelligence - intelligence as determined by the dynamics of interaction with the world, and emergence - intelligence in the eye of the beholder.

Behavior-based systems thus are structured in terms of the observable activity that they produce, rather than traditional functional decompositions [Bro91b]. The activity-producing components, behaviors, compete for actuator resources and share perceptions of the world rather than any centralized representation. Behaviors tend to be simple, so that computational "depth" [Bro91a] - the computational path from sensor to actuator - is minimized to maintain a high degree of interactivity with the environment. Behavior-based systems are highly parallel so that capability - new behaviors - can be added as increased computational "breadth." Behaviors are "layered" [Bro86] in such a way that capability is incrementally added to a functional system, leading to a design process that goes not from isolated components to a final system which integrates them into something meaningful, but from simple yet complete behavior to more complex complete behavior [Bro91b, Bro90b, Mat95b].

Basis behaviors [Mat95a] are a set of minimal behaviors that are sufficient to be combined into solutions to a class of tasks. Mataric's [Mat95b] research on group behavior showed how various complex, biologically-inspired group behaviors could be composed from a set of general basis behaviors for spatial tasks, through two operators, summation of outputs and switching of outputs. Flocking, for example, is achieved by the summation of homing, dispersion, aggregation, and safe-wandering, while foraging results from switching (based on sensory conditions) between safe-wandering, dispersion, homing, and following.

Development of basis behaviors is somewhat analogous to the selection of representations in symbol-processing systems: the choice of basis behaviors has great influence on the efficiency, and even tractability, of both the development process and the final system. Effort expended in refining basis behavior choices is usually paid back many times over; it is all too easy to reach (and sometimes difficult to detect) a state where a good percentage of a system's code is dedicated to working around earlier implementation choices. We emphasize the importance of maintaining the design of a system at the level of physically grounded [Bro90c] behavior, always with an eye to perfecting the set of basis behaviors, until final coding, in order to avoid these pitfalls. This facilitates analysis of information flow through the system and allows for the necessary major restructurings with little cost.
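The two combination operators named above, summation of outputs and switching of outputs, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [Mat95b]: the toy behavior bodies, the percept keys (`home_dx`, `nearest_dist`, `carrying`, and so on), the distance thresholds, and the switching conditions are all hypothetical, and the following behavior is omitted for brevity.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

Vector = Tuple[float, float]
Percept = Dict[str, Any]
Behavior = Callable[[Percept], Vector]


def summation(behaviors: Sequence[Behavior]) -> Behavior:
    """Combine behaviors by adding their output vectors."""
    def combined(percept: Percept) -> Vector:
        outputs = [b(percept) for b in behaviors]
        return (sum(o[0] for o in outputs), sum(o[1] for o in outputs))
    return combined


def switching(cases: Sequence[Tuple[Callable[[Percept], bool], Behavior]],
              default: Behavior) -> Behavior:
    """Run the first behavior whose sensory condition holds, else the default."""
    def combined(percept: Percept) -> Vector:
        for condition, behavior in cases:
            if condition(percept):
                return behavior(percept)
        return default(percept)
    return combined


# Toy basis behaviors; bodies and thresholds are assumptions for illustration.
def homing(p: Percept) -> Vector:
    return (p["home_dx"], p["home_dy"])

def safe_wandering(p: Percept) -> Vector:
    return (0.5, 0.0)

def dispersion(p: Percept) -> Vector:
    too_close = p["nearest_dist"] < 1.0
    return (-p["nearest_dx"], -p["nearest_dy"]) if too_close else (0.0, 0.0)

def aggregation(p: Percept) -> Vector:
    too_far = p["nearest_dist"] > 2.0
    return (p["nearest_dx"], p["nearest_dy"]) if too_far else (0.0, 0.0)


flock = summation([homing, dispersion, aggregation, safe_wandering])
forage = switching(
    [(lambda p: p["nearest_dist"] < 1.0, dispersion),
     (lambda p: p["carrying"], homing)],
    default=safe_wandering,
)

percept = {"home_dx": 1.0, "home_dy": 0.0, "nearest_dx": 0.0,
           "nearest_dy": 3.0, "nearest_dist": 3.0, "carrying": False}
flock_output = flock(percept)
forage_output = forage(percept)
```

Because both operators return an ordinary behavior, composed behaviors can themselves be summed or switched, which is one way to read the claim that a small basis set covers a class of tasks.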

3.2 Behavior-Based Control of Multi-Robot Teams

Pioneering work in control of groups of behavior-based robots involved tasks such as flocking and foraging [Mat95b] and clustering [HM00], which are weakly cooperative, that is, which may benefit from multiple robots but do not require coordination between robots for completion. [FM96] modified such foraging strategies to include adaptive territorial division, requiring some coordination among robots, and [GM96] investigated role-based task division in foraging systems, both with and without explicit coordination. [Kra97] examines the full spectrum of communication and negotiation in multi-agent systems, from physics-based, stigmergic interaction to full centralized negotiation. [Par93] shows that local control alone is not enough for certain classes of multi-robot tasks, and that increasing use of global knowledge can result in steadily improving group behavior in her implementation of a formation task.

When analyzing their schema blending (AuRA) and voting (DAMN) approaches to control, [BA98] discuss how communication is "foreign" to behavior-selection methods in both, since schemas and voting are egocentric and cannot be blended across robots. [SDHK97] clearly states a common view of strictly behavior-based multi-robot control: "Purely reactive approaches such as that of Brooks are efficient but lack a mechanism for global control..." They discuss as a solution potential levels of abstraction that allow entire robots to be seen as single behaviors in a behavioral system, but rely on "observer" behaviors which come close to a centralized solution. The BeRoSH system [WNM95] extends a subsumption approach to allow inter-robot communication as input to behaviors (but not for inhibition/suppression). Task performance is fully distributed, but one robot "host" must perform coordination functions, assign goals to robots, and redirect data to proper destinations, which can be seen as centralized control. [JKW98] presents the Method of Dynamic Teams, which provides elegant high-level abstractions for team coordination, recruitment, and failure recovery. It also relies on individual robots to occasionally act as group coordinators, and does not address issues of failure of such coordinators during task performance. [VSH98b] discuss the use of "locker room agreements," or periodic synchronization (full-scale planning at opportune moments), which leads to individual behaviors and stock "team behaviors," patterns of role-assumption given certain environmental conditions.

The ALLIANCE architecture [Par98] addresses the main problems preventing real-world application of multi-robot systems: the need for fault-tolerance and lack of generality. The original formulation of ALLIANCE [Par96] involved behavior-based control on individual robots.
Behaviors were grouped into "sets" of which only one can be active at a time and "motivational" behaviors which control activation. Later extensions [Par98] incorporated distributed coordination of robots through communication at the level of motivational behaviors only. Acquiescence and impatience, factors in the calculation of motivation for behavior sets, were introduced as a means of fault-tolerance, allowing robots to monitor the activities of other robots and take over their tasks when appropriate. The ALLIANCE architecture comes closest to our approach of extending behavior-based control across networks, but it operates at a level too high to provide the generality and simplicity that we aim for. Motivational behaviors must be re-written to reflect changes in the system, and, as the gateways between robots, restrict the ways in which robots can interact with each other.


Chapter 4 Port-Arbitrated Behavior-Based Control

The behavior-based approach to robot control introduced by Brooks [Bro99] has been so influential in the field of robotics that the term "revolution" is commonly accepted as a description of its effect. Years after this revolution, the principles of horizontal decomposition, the world as its own model, and "emergent" intelligence (in the eye of the beholder) are Brooks' most widely-embraced contributions [Ark98b]. One of Brooks' lesser-known (but importantly enabling) contributions is a set of well-defined abstractions and techniques for behavior interaction, implemented in a number of special-purpose languages which are more flexible successors to the well-known Subsumption Architecture [Bro86]. We refer to these abstractions and techniques as the Port-Arbitrated Behavior, or PAB, paradigm [Wer00].

4.1 Behavior-Producing Modules

In PAB systems, controllers are written in terms of behavior-producing modules¹ (BPMs), each of which is an encapsulated piece of code that, when properly interfaced to sensors and actuators and run as a process (or behavior), generates an observable behavior. Behaviors run continuously and concurrently with other behaviors. Each behavior has a public interface for message-based interaction with other behaviors, the accessible elements of which are referred to as ports; they are registers that hold a single data item or "message," which may be simple or complex (e.g., an integer or an arbitrary data structure). Ports are local to behaviors, so that an individual port is addressed by a pair (behaviorname, portname). BPMs can be multiply-instantiated under different names and differently interfaced to system resources to form different behaviors. Private, or internally accessible, parts of the behavior interface are slots, which are registers like the ports but accessible only to code within the BPM, and monostables, which are boolean variables which remain true for a specified brief period of time after they are triggered.

¹ For clarity, we introduce this term to distinguish program code (behavior-producing modules) from instantiated computational processes (behaviors) and observable behavior of a robot. Previous referenced papers do not make these distinctions, though context generally suffices to distinguish between different meanings of behavior.
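The port-addressing and monostable abstractions described in this section can be sketched as follows. This is only an illustration under stated assumptions: the class name `Monostable`, the `ports` dictionary, the behavior and port names, and the use of an explicit `now` timestamp (rather than a real clock) are hypothetical and are not taken from any PAB implementation.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical port table: each port is a register holding one message,
# addressed by the pair (behaviorname, portname).
ports: Dict[Tuple[str, str], Any] = {}
ports[("avoid", "heading")] = 90          # a simple message
ports[("avoid", "heading")] = 45          # a new message replaces the old one


class Monostable:
    """Boolean that stays true for `period` time units after being triggered."""

    def __init__(self, period: float) -> None:
        self.period = period
        self._triggered_at: Optional[float] = None

    def trigger(self, now: float) -> None:
        self._triggered_at = now

    def is_true(self, now: float) -> bool:
        if self._triggered_at is None:
            return False
        return now - self._triggered_at < self.period
```

Passing the time in explicitly keeps the sketch deterministic; a real controller would presumably read a system clock instead.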


4.2 Connections Between Ports

Ports in different behaviors are linked together by connections, unidirectional data paths between a source port and a destination port. A port can have any number of incoming and outgoing connections. When a message (data item) arrives at a port, either written directly by code within the BPM or indirectly through a connection, it generally replaces the previous data item in the port and is propagated along all of the port's outgoing connections. Such data flow can, however, be modified by connections which are specified to be suppressive, inhibitory, or overriding. Given a connection C_{s,d} from port s to port d with an associated period p, the following are the effects of C_{s,d}'s connection type on d whenever a message m is propagated from s:

If C_{s,d} is Normal, then m is written to d, and is propagated along all of d's outgoing connections. p is not specified for Normal connections;

If C_{s,d} is Suppressive, then m is not written to d, and for period p, no incoming connections will be able to write to d (that is, for all ports x, any connection C_{x,d} will be temporarily disabled);

If C_{s,d} is Inhibitory, then m is not written to d, and for period p, no messages will be propagated out from d (that is, for all ports x, any connection C_{d,x} will be temporarily disabled);

If C_{s,d} is Input Overriding, then d is suppressed for period p, except that only messages arriving along C_{s,d} (including m) are written to d and propagated as normal;

If C_{s,d} is Output Overriding, then d is inhibited for period p, except that only messages arriving along C_{s,d} (including m) are propagated outward along all C_{d,x} as normal.
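The five connection types listed above can be sketched as a small message-propagation model. This is a minimal sketch, not the Behavior Language, MARS/L, or Ayllu implementation: the class and constant names, the explicit `now` timestamp, and the helper `build_panel` are assumptions made for the example. It also assumes that suppression blocks only writes arriving over connections (not direct writes by the owning BPM), which is how we read the definitions above.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

NORMAL, SUPPRESS, INHIBIT, OVERRIDE_IN, OVERRIDE_OUT = range(5)


@dataclass
class Connection:
    dest: "Port"
    kind: int = NORMAL
    period: float = 0.0  # ignored for NORMAL connections


@dataclass
class Port:
    name: str
    value: Any = None
    outgoing: List[Connection] = field(default_factory=list)
    suppressed_until: float = float("-inf")
    inhibited_until: float = float("-inf")
    in_owner: Optional[Connection] = None   # exempt from suppression (Input Overriding)
    out_owner: Optional[Connection] = None  # exempt from inhibition (Output Overriding)

    def connect(self, dest: "Port", kind: int = NORMAL, period: float = 0.0) -> Connection:
        conn = Connection(dest, kind, period)
        self.outgoing.append(conn)
        return conn

    def write(self, message: Any, now: float) -> None:
        """Direct write by code inside the owning BPM."""
        self.value = message
        self._propagate(message, now, via=None)

    def _propagate(self, message: Any, now: float, via: Optional[Connection]) -> None:
        inhibited = now < self.inhibited_until
        if inhibited and (via is None or via is not self.out_owner):
            return  # outgoing connections are temporarily disabled
        for conn in self.outgoing:
            conn.dest._receive(message, now, conn)

    def _receive(self, message: Any, now: float, conn: Connection) -> None:
        if conn.kind == SUPPRESS:      # m is not written; block incoming writes
            self.suppressed_until, self.in_owner = now + conn.period, None
            return
        if conn.kind == INHIBIT:       # m is not written; block outgoing messages
            self.inhibited_until, self.out_owner = now + conn.period, None
            return
        if conn.kind == OVERRIDE_IN:
            self.suppressed_until, self.in_owner = now + conn.period, conn
        if conn.kind == OVERRIDE_OUT:
            self.inhibited_until, self.out_owner = now + conn.period, conn
        if now < self.suppressed_until and conn is not self.in_owner:
            return
        self.value = message
        self._propagate(message, now, via=conn)


def build_panel(kind: int, period: float = 5.0):
    """Wire A->C with the given type, and B->C and C->D as Normal."""
    a, b, c, d = (Port(n) for n in "ABCD")
    a.connect(c, kind, period)
    b.connect(c)
    c.connect(d)
    return a, b, c, d
```

For example, with `a, b, c, d = build_panel(INHIBIT)`, calling `a.write("a", 1)` and then `b.write("b", 2)` leaves `"b"` in C's port while D's port receives nothing until the period expires.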

Figure 4.1 illustrates the effect of the various connection types on message propagation. It is through these mechanisms of suppression and inhibition that subsumption hierarchies, as well as other forms of arbitration, can be efficiently and intuitively implemented. Since connections are external to the BPMs, behavior code is easily re-usable, and interaction between behaviors can be modified dynamically. The port abstraction enforces a data-driven approach to programming that facilitates grounding of computation in sensor readings and effector actions. By placing coordination in the interaction between behaviors (connections) rather than in the BPM code, these systems allow complex controllers to be built "bottom-up" from simple, easily testable behaviors. The PAB approach allows a clean, uniform interface between encapsulated system components (behaviors) at all levels that abstracts away many issues of timing and communication; the "black boxes" of BPMs may contain reactive mappings or deliberative planners. While our research focuses on non-deliberative approaches, we believe that PAB interaction between system components can help reduce the complexity of the components themselves, whatever their type.


[Figure 4.1 diagram. Six panels, one per type of the A-C connection; in each, behaviors A, B, and C write a, b, and c. Port values shown, as C's port / D's port: Normal abc / abc; Inhibit bc / none; OverrideOut abc / a; Suppress c / c; OverrideIn ac / ac; InhibitWrite bc / b.]

Figure 4.1: Effects of the Six Connection Types on Message Propagation. The rectangles are behaviors, and the circles are ports. All connections, indicated by arrows, are normal except for that between A and C, which is of the type indicated by the phrase next to the A-C arrow in each subdiagram. Letters in the circles indicate possible values of the port. In the Normal case, for example, D's port might hold any one of the values a, b, or c, depending on the timing of message arrival; while in the Inhibit case, A's message is not propagated (see Section 4.2) and no data is propagated from C's port, so C's port can hold values of b or c as determined by message arrival timing, but D's port will receive no messages.

4.3 Survey of Related Research: PAB Systems

The earliest PAB systems were the Subsumption Architecture [Bro86] and the Colony Architecture [Con90], both of which focused on the possibility of "compiling" control systems into electronic circuits. Various compelling systems were demonstrated, but neither reached the convenience level of a high-level language or development system. The Behavior Language [Bro90a] has been used in numerous experiments and applications, and has demonstrated scalability of the behavior-based approach to higher levels of competence, particularly in Mataric's demonstration of distributed mapping and "path planning" [Mat90] and social learning [Mat97], and Maes' work in action selection [Mae90]. The MARS/L system [BR95] maintains the inter-behavior communication methods of the Behavior Language but allows behaviors to be coded with all the facilities of Common LISP. This system has been used successfully in such domains as robot soccer [Wer99b], multi-robot learning [MM99], and human-robot interaction [MV99].

[Figure 4.2 diagram: behaviors B1 through B5.]

Figure 4.2: Subsumption: diagram of a traditional single-robot subsumption hierarchy. Dashed arrows pointing into the tops of circles indicate data that overrides the data flowing through the circle. Behaviors "higher up" thus have highest priority.

Unfortunately, the Behavior Language and MARS/L have been limited to specific target and development platforms, and have not seen widespread use. Other systems built to implement behavior-based control have not captured their elegance and ease of inter-behavior message passing, inhibition, and suppression. We believe that this lack of a good, widely-available implementation of the basic substrate of such behavior-based systems is one of the major reasons that the approach is seen to have problems scaling to higher levels of behavior (see, e.g., [Ark98b]). Furthermore, neither MARS/L nor the Behavior Language has facilities for behaviors distributed across robots. The ALLIANCE architecture [Par98] and BeRoSH [WNM95], among others, provide PAB-style control within robots and communication between robots, but do not generalize communication so that PAB arbitration techniques can be used between behaviors on different robots. We believe that by extending the notion of PAB's inter-behavior connections across networks of distributed computers (e.g., over IP), we will be able to open up a wide range of new, simple, robust algorithms for multi-robot coordination. We have developed Ayllu, a language for such distributed PAB control [Wer00, Wer99a], in order to begin exploration of these algorithms.


Chapter 5 Minimalist Multi-Robot Systems

5.1 Physically-Situated Role Assumption

In this section we present some of our previous work involving the exploitation of physical situatedness for role assumption: a robot chaining system for foraging and a robot soccer team. Both of these systems demonstrate the simplicity, scalability, and robustness that can result from the exploitation of situatedness, as well as the need for generality in situated techniques.

5.2 Experiment: Robot Chaining

Purpose: To perform a position-dependent foraging task efficiently using only local contact sensing.

Situatedness: Perceived influential (contact switches).

Scalability: Robots are able to determine their roles through environmental interaction, so that a near-optimal number of robots dedicate themselves to building a chain and as many as possible dedicate themselves to the foraging task.

Robustness: Variations in robot hardware cause robots to dedicate themselves to the task for which they are most suitable. Though some sub-tasks have a high failure rate, robots recover from failure and try again or take on other roles. Most importantly, all successful actions persist in the environment, while failures leave no lasting mark.

Minimization: The position-dependent task and dynamic role assignment are achieved with no need for localization, global knowledge of agents or their activities, or the sensors or communication equipment required for either of these.


5.2.1 An Ant-like Robotic System

Previous research inspired by insect behavior has aimed to reproduce particular instances of stigmergy - "the production of a certain behavior in agents as a consequence of the effects produced in the local environment by previous behavior" [BHD94] (see also [HM00, DGF+91, TGGD91]). While purely stigmergic solutions have been found for tasks such as clustering items in the environment ([HM00], [BHD94]) and even sorting of scattered heterogeneous items into homogeneous clusters ([HM00], [DGF+91]), tasks which require particular behaviors to take place at specific locations have generally relied upon some type of global position sensing, globally visible beacons, or random encounter of some locally-sensible position marker.

Our goal in robot chaining is to reproduce the stigmergic techniques and benefits of pheromone-trail formation by natural ants [ADGP90, GADP89, MW88, HW90]. Individual ants deliberately encode information into the physical environment (by depositing chemicals known as pheromones), and over time interesting global properties emerge that allow these chemical markings to be used as a navigational aid for position-dependent tasks. The release of pheromones leads to trails that can be followed. Over time, as large numbers of ants use these trails, different paths are differentiated by "strength," which is a function of frequency of use of the trail and chemical decay. If pheromones are released only during certain phases of a task (e.g., while carrying some item back to the nest), then trails can begin to form efficient paths to useful locations, such as rich supply areas. Since paths that take less time to traverse (and are thus traversed more frequently) gain more pheromone strength than longer ones, a very simple control strategy of probabilistically choosing the most frequently used path leads to group behavior that adjusts to follow dynamically determined shortest paths to dynamically changing useful destinations.
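The trail-strength dynamics described above (strength as a function of frequency of use and chemical decay, with probabilistic path choice) can be illustrated with a small expected-value model. This is a sketch under our own assumptions, not a model from the cited ant literature: the function name, the decay and deposit constants, and the rule that a path is chosen with probability proportional to its strength are all hypothetical.

```python
from typing import List


def simulate_trails(lengths: List[float], steps: int = 200,
                    decay: float = 0.05, deposit: float = 1.0) -> List[float]:
    """Return each path's final share of total pheromone strength.

    lengths[i] is the traversal time of path i. Ants choosing a path
    complete trips, and so deposit pheromone, at a rate of 1 / length.
    """
    strength = [1.0] * len(lengths)          # all paths start equally marked
    for _ in range(steps):
        total = sum(strength)
        share = [s / total for s in strength]  # assumed choice probabilities
        strength = [
            (1.0 - decay) * s + deposit * p / length
            for s, p, length in zip(strength, share, lengths)
        ]
    total = sum(strength)
    return [s / total for s in strength]
```

Under these assumptions, calling `simulate_trails([1.0, 2.0])` shows the shorter path accumulating nearly all of the pheromone, since its strength grows by a larger factor than the longer path's at every step.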
The robot chaining system for foraging that we present replaces the chemical pheromones of the ant trails with the physical bodies of robots. We have demonstrated [WM99] that a group of robots equipped with only physical contact sensors is able to form itself into a physical pathway that members of the group can use for navigation (see Figure 5.3). Rather than depositing pheromones and having paths "emerge" through chemical processes, the chain links can collect some statistics of the activity of the chain-following robots, and use them to adapt to the environment by physically modifying the chain, since the links of the chain are capable of computation and motion. By these means, robot chains are able to form shortest paths to rich deposits much as pheromone trails do. This work has been reported in [WM96], [WM99], and [WM00].

5.2.2 The Foraging Task

Variations of foraging - collecting items from the environment and depositing them at a specific location - form a common class of robotic tasks that require some knowledge of global positioning for efficient performance. Approaches to the foraging task have used a variety of sensory modalities and strategies, including use of an omniscient Planner that can "see" the whole task environment and all foragers' positions within it; various types of global positioning such as satellite GPS, laser positioning systems, radio-sonar triangulation, and dead-reckoning; various forms of beacon-following including phototaxis and vision-based color-blob tracking; and landmark recognition and topological mapping. A system that takes advantage of multiple robots in a way that begins to imitate ant-trail formation was tested in simulation in [DGS+90] and [GD91]. We next present a pheromone-trail-inspired robot chaining system, in which multiple robots that have only physical-contact sensing (perceived influential interaction) suffice for efficient foraging.

Figure 5.1: Foraging with a robot chain: A robot returns to the chain carrying a puck after a circular excursion.

5.2.3 Robot Chains

The system we present involves the formation of a "chain" by a group of robots in order to provide local information sufficient for the performance of globally position-dependent tasks. The chain maintains contact with a starting point (Home). Robots that are not currently part of the chain are able to follow the chain both away from Home and back towards it. The chains can adjust to link Home with other points, such as rich supply areas, and re-form when the supply diminishes or new deposits are discovered, and, potentially, be put into motion to completely sweep an area.

[Figure 5.2 diagram: panels A through F, each showing the robots of the chain, numbered 1 to 4.]

Figure 5.2: Communication passed down the chain. A) The chain in resting state. B) Robot 3 taps robot 2 twice to initiate the communication act. C) 3 returns to normal position. D) 2 taps 3 twice to acknowledge communication. E) 2 returns to normal position, terminating communication act. F) 2 taps 1, initiating next communication act in the passing of the message down the chain. In non-phatic communication, stages B, D, and E are modified.

Our approach to chaining uses only physical contact sensors (simple microswitches). Each member (link) of the chain maintains periodic contact with the links ahead and behind by touch. Limited communication is implemented through the same mechanisms to allow for chain maintenance. Most communication between members of the chain is phatic, intended only to assert the existence of the line of communication (i.e., the integrity of the robot chain). This is implemented as a "double tap." One robot begins the communicative act (Figure 5.2) by moving enough to tap the robot ahead or behind twice and returning to its (approximate) initial position. The tapped robot answers by tapping back twice and returning. Two taps are used to distinguish communication from the many random taps of other robots in the environment. More informative communication can be performed similarly, with contact held for a fixed period, or taps added, at points B, D, and F of Figure 5.2. Many interesting behaviors require no more than just this simple 1-bit phatic communication, but it is possible to pass more elaborate messages through combinations of "short" and "long" taps (as in Morse code) or through the use of an electrical signal transmitted through the contact point.
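The double-tap relay and the "short"/"long" tap encoding described above can be sketched as follows. This is an illustrative sketch only: the function names, the event tuples, and the contact durations and threshold (in seconds) are hypothetical values chosen for the example, not parameters of the implemented system.

```python
from typing import List, Tuple

SHORT, LONG = 0.2, 0.6   # hypothetical contact durations, in seconds
THRESHOLD = 0.4          # hypothetical boundary between short and long


def relay_down_chain(links: List[int]) -> List[Tuple[int, int, str]]:
    """List the tap events as a phatic message passes along the chain.

    `links` is ordered from the initiating robot to the far end,
    e.g. [3, 2, 1] for the sequence shown in Figure 5.2.
    """
    events = []
    for sender, receiver in zip(links, links[1:]):
        events.append((sender, receiver, "double tap"))      # initiate
        events.append((receiver, sender, "double tap ack"))  # acknowledge
    return events


def encode_bits(bits: List[int]) -> List[float]:
    """Encode each bit as a long (1) or short (0) contact."""
    return [LONG if b else SHORT for b in bits]


def decode_taps(durations: List[float]) -> List[int]:
    """Recover bits from measured contact durations."""
    return [1 if d >= THRESHOLD else 0 for d in durations]
```

For instance, `relay_down_chain([3, 2, 1])` yields robot 3 tapping robot 2, robot 2 acknowledging, and then robot 2 tapping robot 1, mirroring stages B through F of Figure 5.2.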

5.2.3.1 Behaviors for Chain-Based Foraging

The behavior-based control system used for the chain-based foraging task is diagrammed in Figure 5.4; the individual behaviors are:

FollowChain: Following along the chain of robots is performed by tacking, which involves motion along arcs of two radii, a and b, where a > b (that is, arc b is tighter than arc a). The robot continually moves forward (counterclockwise) along arc a until its front contact sensors make contact with the chain, then moves backward (clockwise) along the tighter arc b until the rear contact sensors touch the chain. This produces forward motion with periodic chain contact, and enforces directionality: the right side of the chain (viewed from Home) is for outbound traffic, and the left side for inbound traffic. This behavior suffices to get the robot around the end of the chain; after the last contact on the right side of the chain, arc a will eventually bring the robot into contact with the left side of the chain for the return to Home. Note that the behavior is fairly insensitive to the actual values of a and b as long as their relation holds; this proved essential in our implementation with imprecisely aligned robots.

[Figure 5.3 diagram: R1 robots and the HOME location.]

Figure 5.3: Excursion Search strategy: Robots search for pucks and return to the chain by making roughly circular "excursions" from the chain.

[Figure 5.4 diagram. Labels: EndOfChain, Contact, Link, ExcursionSearch, Carrying, JoinChain, FollowChain, Motor Control.]

Figure 5.4: Subsumption diagram of robot chain-based foraging system

ExcursionSearch: Whenever the rear sensors make contact (as a result of backing up in the FollowChain behavior), if the robot is not carrying a puck, there is a probability that ExcursionSearch will activate itself, causing the robot to follow an arc of random radius clockwise-forward (that is, away from the chain; see Figure 5.3) until front contact is made (that is, until the chain is re-acquired). Since the radius is held constant, the robot will generally make a circle and return to its starting position; since the starting position (backed on a tight radius into the chain) cannot be achieved as a result of forward motion, the robot will generally make front contact with the chain somewhere slightly before its starting point.

Link: Maintains communication with the "next" and "previous" links of the chain by returning taps as described above (and illustrated in Figure 5.2).

HomeLink: The first Link of the chain, which maintains contact with the Home location. The HomeLink responds to taps, and periodically initiates a wave of taps that travels to the end of the chain (as in Figure 5.2). These waves of taps tend to "tighten up" the chain.

EndOfChain: Responds to taps from the previous link, as well as taps from robots attempting to JoinChain (and, in doing so, aids them in the alignment process). Once a new robot has successfully joined the chain, the previous EndOfChain switches to Link behavior. If the EndOfChain detects that there is a rich supply of pucks closer to "home" than itself (see Section 5.2.5.2), it can communicate its intention to the "previous" link to transfer EndOfChain status, and leave the chain to become a forager (FollowChain).

JoinChain: When a robot executing FollowChain rounds the end of the chain, there is an unusually long period of motion along arc a before the robot makes a contact. When JoinChain detects this, it causes the robot to back up along an arc of radius c, which is slightly smaller than a. With some probability, this causes the robot to make contact with the EndOfChain. If it gets a tapping response from the EndOfChain, it continues to exchange taps and realign itself until the taps are registered on both of its rear contact sensors (indicating it is roughly aligned with the current chain direction). When this occurs, it switches to EndOfChain behavior. If at any point during the JoinChain process the robot fails to receive tapped responses from the EndOfChain in a timely manner, it reverts to FollowChain behavior.
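The tacking rule of FollowChain and the probabilistic activation of ExcursionSearch, as described in the list above, can be sketched as two small decision functions. This is a simplified sketch and not the Behavior Language code that ran on the robots: the function names, the mode strings, and the `rng` argument are hypothetical, while the activation probability of 0.125 is the value reported in Section 5.2.4.1.

```python
import random


def follow_chain_step(mode: str, front_contact: bool, rear_contact: bool) -> str:
    """Return the next tacking mode.

    'forward'  = moving forward along the wider arc a
    'backward' = backing up along the tighter arc b
    """
    if mode == "forward" and front_contact:
        return "backward"   # touched the chain: back up along arc b
    if mode == "backward" and rear_contact:
        return "forward"    # rear touched the chain: resume arc a
    return mode             # no relevant contact: keep going


def maybe_start_excursion(rear_contact: bool, carrying: bool,
                          rng: random.Random, p: float = 0.125) -> bool:
    """Decide at a rear contact whether to begin a circular excursion."""
    return bool(rear_contact and not carrying and rng.random() < p)
```

A controller loop would call `follow_chain_step` on every sensor update and, on each rear contact, consult `maybe_start_excursion` to decide whether ExcursionSearch should take over.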

5.2.4 Implementation Details

We have implemented a foraging system which moves metal pucks, distributed around an area, to the Home location using only physical contact-level sensing. We describe this here in brief; full details are elaborated in [WM96].

Figure 5.5: Four robots form a chain from Home

The system is designed for the foraging team to begin in the Home area and commence chain construction from there. Downward-pointing infrared emitter/detector sensors with an effective range of less than 1 inch, located on the underside of the robots, are used to determine when the robots are at Home, which is marked with a non-reflective black area on the floor. These extremely short-ranged sensors can be replaced with physical sensors capable of detecting some property of Home, or of the HomeLink. The robots are powered up sequentially by hand, though this could be better achieved either by fixed timing based on unique ID numbers or, ideally, through messages passed back through the chain to team members waiting at Home. The current system assumes that the environment contains only robots, pucks, and Home. A QuickTime movie of robots forming chains and performing the ExcursionSearch is available on the World Wide Web at http://www-robotics.usc.edu/barry/chaining.mov.

5.2.4.1 Behavioral Details

Additional behaviors were implemented to deal with the details of detecting and leaving Home, and dropping pucks there, using the IR sensors. JoinChain was composed of two sub-behaviors, BackInto and AlignBack, each with separate time-outs to determine whether or not to give up the joining attempt. Since the R1 robots are Ackermann-steered (i.e., steered by turning the front wheels, like a car), ExcursionSearch was simple to implement as randomly choosing, then holding, a steering angle, with random probability (0.125) at each rear contact.

5.2.4.2 The Robot Herd

Our experiments are implemented and tested on the Nerd Herd, the Interaction Lab's group of 20 IS Robotics R1 mobile robots. Each member of the Nerd Herd is a 12-inch long four-wheeled vehicle, equipped with two front, two side, and two rear contact sensors, and a two-pronged forklift for picking up, carrying, and stacking pucks (Figure 5.6). The forklift contains two downward-pointing IR sensors with a range of less than 1 inch (which we use to detect Home), and two break-beam sensors for detecting a puck within the forklift (Figure 5.7). The pucks are special-purpose light ferrous foam-filled disks, 1.5 inches in diameter and between 1.5 and 2.0 inches in height. They are sized to fit into the unactuated fork and be held in the fork by an electro-magnet. Each robot also has one piezo-electric bump sensor on each side of the chassis. All processing is performed by two Motorola 68HC11 microprocessors on each robot. The control systems are programmed using the Behavior Language [Bro90a].

Figure 5.6: A Nerd Herd Robot: Each of the Nerd Herd robots is a 12"-long four-wheeled base equipped with front, rear, and side contact sensors and a two-pronged forklift for picking up pucks.

Hardware Limitations. The R1 robots' mechanical steering system, when in perfect condition, is "accurate" to within about 20 degrees. At certain steering angles, a drive wheel is lifted off the ground, while at others, the steering wheels jam against metal parts of the chassis. "Perfect condition" does not last through even part of an experiment, since during any type of physical interaction parts tend to change alignment. Thus, given the same program, different robots show significantly different behavior. The uncertainty and variability inherent in any work with physical robots, and especially salient in the case of the R1s, although frustrating, is beneficial to experimental validity. Hardware variability between robots is necessarily reflected in their group behavior. Even when programmed with identical software, the robots behave differently due to their varied sensory and actuator properties. Small differences among individuals become amplified as many robots interact over extended time. As in nature, individual variability creates a demand for more robust and adaptive behavior. The variance in mechanics and the resulting behavior have provided stringent tests for our methodologies. The R1 robots in general require more maintenance time than they provide useful experimental time, and this was exacerbated by the physical demands of the chaining task. Given 20 robots from which to mix and match parts, we were able to maintain a stable level of five or six robots functioning at a time for experiments.


[Figure 5.7 diagram. Labels: IR, contact, breakbeam IRs, radio, bumper (contact), bump.]

Figure 5.7: R1 Sensors: Each of the Nerd Herd R1 robots is equipped with contact sensors at the ends of the fork and on the rear of the chassis, piezo-electric bump sensors on each side and six infra-red sensors on the fork (not all of these are used in experiments).

5.2.4.3 Performance

The foraging system tested with six working R1 robots demonstrates the practicability of our robot chain concept. While, as expected, JoinChain demonstrated a high failure rate, graceful recovery allowed multiple attempts and eventual success, as detailed below. The ability of robots to follow the formed chains was robust, and was lost only when mechanical failures led to following robots pushing chain robots so far as to open up wide gaps in the chain. The average separation in well-formed chains was observed to be about six inches, and the nature of the communication along the chain tended to maintain this distance through minor (though not major¹) "pushing" by chain-following robots. This can be improved with slightly more sophisticated chain maintenance behavior, and/or with the ability of chain link robots to anchor themselves against pushing. The effective length of a chain can be said to be approximately 1.5 times the length of the robots that form it (when using the R1s). Our limited number of robots only allowed us to have one or two robots searching for pucks while the others formed the chain; in this case, with fairly high puck density, the searchers brought an average of one puck Home each trip around a chain of four robots, which took about two minutes, depending on the number of circular excursions. With a greater number of functional robots we could examine the effects of interference along the chain and its influence on scalability.

¹ On occasion, the front contact sensors would fall off the chain-following robots or become misaligned, leading to their pushing "through" the chain; this was generally not recoverable in the implementation under discussion.


The JoinChain process requires the most precision and was most prone to failure. Approximately fifty percent of attempts made a successful first contact (in BackInto), and of these approximately fifty percent successfully exchanged taps and resulted in joining the chain. These rates could be improved by tuning the steering systems of the robots and/or tuning the timing of individual robots, but improvement would be only temporary since alignment changes rapidly. Though a raw success rate of twenty-five percent does not seem impressive, graceful recovery (timing out of the JoinChain attempt) and repeated attempts allowed eventual joining in most cases. Those robots that were, due to their mechanical properties, unable to join the chain continued to be useful as foragers. This can be viewed as a case of emergent dynamic role assignment. As seen in Section 5.2.5.2, the probabilistic nature of chain joining actually benefits the process of optimizing chain length.

As in natural systems, such as ant pheromone trail formation, global behavior is a result of the cumulative effects of many actions. The key point we see in both natural (i.e., ant) and artificial (robotic) systems is that while individual successes benefit the system as a whole through lasting stigmergic effect on the environment, individual failures do not accumulate. The most efficient ant paths are more frequently traveled than the longer ones, and are thus given a stronger marking that overpowers, and out-survives, the weaker ones. Analogously, in robot chains, only those robots that successfully join the chain have a lasting effect on the behavior of others. In both systems, success results in a persistent encoding of information in the environment, while failure does not.
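The arithmetic behind the raw success rate, and the effect of repeated attempts, can be made explicit with a short calculation. This sketch assumes that attempts are independent and share the same approximate rates reported above; that is a simplification, since, as noted, some robots were mechanically unable to join at all. The function name and parameters are hypothetical.

```python
def join_probability(attempts: int, p_contact: float = 0.5,
                     p_exchange: float = 0.5) -> float:
    """Probability of joining the chain within `attempts` tries.

    p_contact  : chance that BackInto makes a successful first contact
    p_exchange : chance that taps are then exchanged successfully
    Assumes independent attempts (a simplification).
    """
    p_single = p_contact * p_exchange          # 0.5 * 0.5 = 0.25 per attempt
    return 1.0 - (1.0 - p_single) ** attempts  # at least one success
```

Under this assumption a single attempt succeeds with probability 0.25, while eight attempts succeed at least once with probability of about 0.90, which is consistent with the observation that graceful recovery and repetition led to eventual joining in most cases.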

5.2.5 Adaptive Chains

Like ant systems, robot chains can be adaptive to the environment in order to optimize system function. Ants use pheromones to encode information into the environment about their current activities. When returning to the nest with useful material, an ant leaves a trail that can be followed back to the source of the material. As groups of ants forage, the trails accumulate and decay, so that little-used paths tend to fade while oft-used paths grow stronger. Since shorter paths lead to more frequent trips from a source to the nest, the shortest paths tend to be the strongest, and therefore most attractive, ones.

We can achieve a similar effect in our robot chains without the need to leave or sense markers, since the links of the chain are capable of computation and motion. Rather than depositing pheromones and having paths "emerge" through chemical processes, the chain links can collect some statistics of the activity of the chain-following robots, and use them to adapt to the environment by physically modifying the chain. Two types of chain modification are sufficient for generating an optimal path to a rich source in a plane with no insurmountable obstacle: 1) shifting of chain direction, and 2) lengthening/shortening of the chain. Both are based on a simple report, analogous to a pheromone deposit: after a chain-following robot returns to the chain from an excursion having picked something up, it taps messages indicating this while tacking. We call this message a SuccessReport; it can consist of either a pattern of taps against the sides of chain robots made at the end of the arc motion (described in Section 5.2.3.1), or a contact held for a longer duration than normal at the same point (of course, there are many other ways to do this if we allow different sensing modalities).

[Gor99] describes findings about how ants change roles (e.g., from foragers to internal nest workers). This is found to happen in response to the number of encounters each ant has with ants fulfilling other roles - a nest worker that encounters a number of successful foragers in a given time period will decide to forage. As seen below, the process we describe in [WM99] for adjusting the length of the chain functions in a very similar manner: robots periodically make a decision to assume the role of forager or chain-link based on encounters with other robots. Through a physically-situated approach, robots are able to divide themselves efficiently into foragers and chain links and perform position-dependent tasks using only local sensing and interaction. [WM99] discusses further interesting properties of the chaining system regarding efficient role assumption given the inherent physical heterogeneity of the particular robots used.

5.2.5.1 Chain Direction

In order for the chain to move to intercept a rich source, all that is necessary is for the chain links to monitor how many times they have had SuccessReports on their right and left sides. If basic behaviors are in place that maintain chain integrity, individual robots can shift towards the direction of more SuccessReports (within constraints of chain integrity) without need for explicit communication with neighboring links. In this way, the entire chain will slowly shift towards a rich source. Figure 5.8 illustrates a situation in which the chain shifts towards a source.

In order to more clearly replicate the ant systems, and eliminate the risk of the chain infinitely extending in a direction with no sources, it would be necessary to introduce random direction-shifting of chain links with some probability. Decay of trails could be replicated in two ways: either the links could factor recency into their statistics, or, more minimally, the links could merely react by shifting towards the direction of every SuccessReport, allowing such temporally-based statistics to be computed "physically."
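The first of these options, a link that factors recency into its left/right statistics, might be sketched as follows. This is an illustrative model only: the class name, the decay factor, and the decision margin are our assumptions, and recency is approximated by discounting old counts each time a new report arrives rather than by elapsed time.

```python
from dataclasses import dataclass


@dataclass
class ChainLink:
    """One chain-link robot tallying SuccessReports by side (hypothetical model)."""
    left: float = 0.0
    right: float = 0.0
    decay: float = 0.9  # assumed recency weighting; 1.0 means no trail decay

    def report(self, side: str) -> None:
        """Record a SuccessReport tapped on the 'left' or 'right' side."""
        if side not in ("left", "right"):
            raise ValueError("side must be 'left' or 'right'")
        # Discount older reports so that recent ones dominate.
        self.left *= self.decay
        self.right *= self.decay
        if side == "left":
            self.left += 1.0
        else:
            self.right += 1.0

    def shift(self, margin: float = 0.5) -> int:
        """Return -1 (shift left), +1 (shift right), or 0 (hold position)."""
        diff = self.right - self.left
        if diff > margin:
            return 1
        if diff < -margin:
            return -1
        return 0


link = ChainLink()
for side in ["right", "right", "left", "right"]:
    link.report(side)
print(link.shift())
```

Each link decides using only its own tallies, so no communication with neighboring links is needed; keeping the chain intact while shifting is left to the chain-maintenance behaviors and is not modeled here.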

Figure 5.8: Moving Toward a Rich Deposit: After a robot picks up an item, it communicates the fact of the pickup to the first link robot it contacts when returning to the chain. The links collect statistics that allow them to determine whether most pickups are coming from the left or the right. Using this information, the links can begin shifting the chain towards rich deposits.

5.2.5.2 Chain Length

Ideally, once the chain has shifted to intersect a rich source, we would like it to end there - that is, we would like the EndOfChain to be near the center of the richest area, so that robots can return directly from the source to Home. Figure 5.9 demonstrates a situation where the chain should be shortened in order to both optimize the pathway and allow more robots to participate in transport of material. There are two ways for this to happen; in either case, the chain will tend to shorten to the optimal length when there is a rich deposit, and naturally begin to grow again if this source begins to be exhausted.

One way is for the chain links to collect SuccessReport statistics (most likely, the number of recent SuccessReports at each link, for comparison) and pass them along the chain through some protocol, allowing the EndOfChain to decide when it should leave the chain and become a forager (by passing EndOfChain status to the preceding link).

Another minimal, environmentally-oriented way to adjust the chain length is to simply have the EndOfChain leave the chain after a period of time. If the chain extends past a rich source, there will be fewer robots attempting to append themselves to the end of the chain (since many will be carrying material and thus be ineligible); if the chain does not reach a source, few if any robots will be carrying and thus most will attempt to append themselves and lengthen the chain.

This can be seen as dynamic role fulfillment such as [Gor99] finds in ant colonies: when the EndOfChain encounters mostly successful foragers (which do not attempt to append themselves to the chain), it is likely to leave the chain and become a forager. When the foragers encounter mostly chain links without finding useful material, they tend to become chain links. The robots, like the ants, fulfill roles as determined by global constraints.
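The second, timeout-based scheme can be illustrated with a deliberately crude toy model. Everything in it is an assumption made for illustration: distances are counted in links, foragers return empty-handed exactly when the chain stops short of the source, at most one robot joins per time step, and the EndOfChain leaves every `timeout` steps.

```python
from typing import List


def simulate_chain_length(source_at: int, start: int, steps: int,
                          timeout: int = 3) -> List[int]:
    """Track chain length over time under the timeout-based scheme (toy model)."""
    length = start
    history = [length]
    for t in range(1, steps + 1):
        if t % timeout == 0 and length > 1:
            length -= 1  # EndOfChain times out and becomes a forager
        if length < source_at:
            length += 1  # an empty-handed robot appends itself to the chain
        history.append(length)
    return history


print(simulate_chain_length(source_at=4, start=8, steps=15))
print(simulate_chain_length(source_at=4, start=1, steps=6))
```

In this model a chain that overshoots the source shrinks one link per timeout, while a chain that falls short is extended again on the same step; in both runs the length settles at the distance to the source.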

Figure 5.9: Optimizing Chain Length: Using statistics of where along the chain robots return after successful excursions, the chain length can be optimized so that foraging robots get to, and return from, rich deposits along a shortest path. When robots near the End-Of-Chain robot record significantly fewer successful returns than "earlier" links, the End-Of-Chain robot switches roles to become a chain-follower, decreasing the chain length by one. This is repeated until most successful returns are close to the end of the chain.

5.2.5.3 Chains in Motion

Robot chains need not serve only as "passive" pathways for other robots; rather, they can structure motion of a group of robots. Specifically, a chain of robots that remain within perceptual range of each other can thoroughly sweep an area, for applications such as de-mining, planetary exploration, or search-and-rescue. As shown in Figure 5.10, a chain of robots "anchored" at one end can sweep a circular area with some guarantee of coverage: either the robots remain in sensor range, providing a maximum space between concentric circles of the sweep, or the system can report that the sweep was not complete.

The sweeping chain can be of great benefit in situations where coverage is essential but localization is difficult (such as surf-zone de-mining), or where heterogeneous resources are combined. A single robot with a global positioning system can serve as the "anchor" for a sweeping chain, and allow for rough localization of any robot in the system (using the chain as a polar coordinate system relative to the anchor). Thus large areas can be thoroughly covered, and items of interest localized, with a single, non-moving GPS receiver. After a complete sweep, the anchor can move to a new location for a new sweep.

In situations where items being sought must be further investigated or manipulated (e.g., sensed in more detail or retrieved), it is possible to combine the concepts of a sweeping chain and a chain pathway, as follows. A small number of specialized robots wait at the anchor point as the sweep progresses. If any robot in the chain encounters an object that requires further attention, it sends a report down the chain. The chain pauses and one of the specialized robots follows the chain to the notifying robot (Figure 5.11), at which point it can perform its specialized task. This allows heterogeneous systems to redirect resources effectively without need for either localization or global communication.
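The rough localization mentioned above, with the chain serving as a polar coordinate system relative to the anchor, can be sketched as below. The sketch assumes a straight chain with uniform spacing between links and a known sweep bearing at the anchor; the function and parameter names are hypothetical.

```python
import math
from typing import Tuple


def locate_link(anchor_xy: Tuple[float, float], link_index: int,
                spacing: float, bearing_rad: float) -> Tuple[float, float]:
    """Rough position of the link_index-th robot out from the anchor.

    Assumes (our simplification) a straight chain with uniform spacing;
    bearing_rad is the chain's current sweep angle measured at the anchor.
    """
    radius = link_index * spacing
    x = anchor_xy[0] + radius * math.cos(bearing_rad)
    y = anchor_xy[1] + radius * math.sin(bearing_rad)
    return (x, y)


# Third link of a chain with 2-unit spacing, swept a quarter turn from the x-axis.
x, y = locate_link((100.0, 50.0), link_index=3, spacing=2.0,
                   bearing_rad=math.pi / 2)
print(round(x, 6), round(y, 6))
```

Only the anchor needs an absolute position fix; the error of such an estimate grows with distance along the chain, which is why the text speaks of rough localization.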


Figure 5.10: Sweeping an Area: Rather than growing outward from a Home, a robot chain can be "anchored" at a given point and sweep a circular area around it for thorough coverage. A chain of simple robots can be combined with a more capable anchor robot (e.g., equipped with GPS) for efficient and accurate patterned searches.

Figure 5.11: Following the Sweeping Chain: If objects encountered during a sweep require the attention of specialized equipment (e.g., more detailed sensing, or retrieval), the chain can pause the sweep upon encounter of such objects, allowing an appropriate robot to follow along the chain to the object of interest and back to the anchor area.


5.3 Experiment: Minimal Control for Sophisticated Individual and Team Behavior in Robot Soccer

Purpose: To develop a minimal control strategy for effective soccer play, including smooth trajectories for ball manipulation and dynamic assumption of roles and formations appropriate to field situations, without communication, position information, or internal state. Demonstration of incremental layered approach to strongly-cooperative, situated behavior design.

Situatedness: Perceptual (Sonar, Vision)

Scalability: The most appropriate robots assume roles in formations limited in size by task dynamics, while others take on roles (territorial patrol) that distribute them over physical space through local interaction.

Robustness: Formations cover fumbles, membership in formations is determined reactively through game situation and thus self-repairing without communication, perceptual decay allows behavior switching (interference resolution) without loss of "awareness" of game situation. Tolerance of extremely noisy sensors.

Minimization: Teamwork with no need for communication, world model, explicit roles or role negotiation. Effective ball tracking and manipulation without trajectory calculation through emergence from simple basis behaviors. De-coupling of rotational and translational motion leads to "emergence" of additional modalities of ball manipulation. Simple physical modification replaces extremely complex sonar/vision-based distinction of ball from other items on field.

As we have discussed in Section 5.2.5, ant-like agents are able to determine, through their local interactions with other agents, what roles would be globally efficient for them to assume (e.g., forager or domestic nest worker), and our chaining robots are able to use similar means of assuming efficient roles. Here we discuss a system we have implemented for a drastically different domain - robotic soccer - which is also able to use local interactions to determine globally efficient roles.

Here we describe at length our physically-situated minimalist approach to team cooperation for robot soccer, embodied in our robot soccer team, the "Spirit of Bolivia." Though individual players can perceive only the ball, the goals, and obstacles (which are not distinguished but may be walls, opponents, or teammates), and have no communication equipment, the team displays sophisticated cooperative behavior, falling dynamically into appropriate formations for offensive and defensive situations with an interesting property of formation size limitation. The players also display sophisticated ball handling skills, though they do not attempt to model ball trajectories. [Wer98] presents a long example of the application of a number of minimalist design principles in the development of this team, including the many early drastic transformations it went through that are not presented here. Below, we summarize the bottom-up behavioral design of each fielder, starting with individual competence and adding in, through an extremely simple modification, dynamic role selection for cooperative behavior.

5.3.1 The Soccer Robots

The soccer playing robots are Pioneer I robots sold by ActivMedia, Inc. (see Figure 5.12). They feature an 18"x14" differentially steered base, five forward- and two side-facing sonars, and 2-DOF grippers with contact and breakbeam sensors. They are equipped with the Fast Track Vision System from Newton Laboratories, an on-board processor based on a 16-MHz 68332 microcontroller, which extracts color-blob information from a video source at a frame rate of 60 Hz. It can be trained to recognize three distinct colors at a time, and outputs the size, visual field location, bounding box, and other data for multiple blobs of each color. It also provides various types of line-tracking data. The robots use wide angle (60-degree) non-actuated cameras.

The Pioneer has a low level controller that runs the Pioneer Server Operating System (PSOS) from ROM. PSOS provides sensor readings at 10 Hz, and accepts commands which set rotational velocity, translational velocity, and individual wheel speeds, as well as other motor- and sensor-control functions. Software was developed for the MARS/L system from IS Robotics [BR95], an on-board Motorola 68332 microcontroller programmed in a Common LISP extended for behavior-based control, and later ported to ActivMedia's C-based PAI (Pioneer Application Interface) for control of borrowed robots not equipped with the MARS/L hardware, and later to Ayllu.

The only physical modification we made to the stock Pioneer robots for RoboCup competition was to raise the sonar pingers to a level (about three inches higher than normal) at which they would not perceive the ball, but would perceive other robots and walls. This simple exploitation of physics avoided the need to use complicated sonar-visual analysis to distinguish the ball from other objects, which would have been far less reliable.

Figure 5.12: The Pioneer 1s are differentially steered bases with seven sonar pingers, grippers with contact sensors on the tips, and a Cognachrome vision system. The grippers were "filled" to prevent "holding."

5.3.2 RoboCup Robotic Soccer

The domain of robot soccer has been characterized by friendly and hostile agents, inter-agent cooperation, real-time interaction, and a dynamic, uncertain environment [SM94]. A notion of sportsmanship (and the rules of many competitions) adds the characteristic of non-damaging interaction between players. We define a "comprehensive" set of soccer team behaviors as one that addresses all of these characteristics; a minimally comprehensive set of behaviors will cause members of a team to, at the very least, progress towards the goal, and obstruct progress of opponents, by interacting constructively with teammates, and safely with all agents, in a physical, real-time soccer competition.

RoboCup [KAK+95] has become a popular domain for multi-agent systems research. It is a soccer competition that has leagues for small, mid-size, and legged physical robots, and for simulated players. The environment is extremely dynamic and the task complex.

5.3.3 Individual Competence in Robotic Soccer

Individual players of robotic soccer teams must display certain skills in ball-searching, ball-approaching, ball-manipulation, and obstacle avoidance [ASK+98, SM94, VSH+98a, Tam97b]. The RoboCup organizers [ASK+98] claim that a pan-tilt active vision system (and the accompanying complicated control and spatial memory organization for lost objects) or omnidirectional lens is necessary to keep track of the ball and other field items, and that a purely behavior-based approach would be sufficient for very simple behavior but difficult to scale. Prediction of ball trajectories is said to be the key to catching passes and intercepting, as well as goal-keeping.

Our team members follow trajectories very similar to those generated by the global knowledge and precise calculation of the small league teams such as [VSH+98a] (see Figure 5.15). They manage to chase down, intercept, dribble, and smoothly circle the ball in order to line up properly for an advance, while effectively avoiding collisions, without any calculation of ball motion or modeling of world state - that is, in a completely situated and minimalist manner.

Two behaviors, Patrol and Safety, comprise the bottom level of the robots' behavioral structure, diagrammed in Figure 5.13. They run in parallel and do not compete with each other - Patrol controls rotational velocity of the robot, while Safety controls translational velocity.

Figure 5.13: The lowest level behaviors of our soccer players. Patrol (which outputs a fixed rotational velocity) and Safety (which outputs a forward velocity proportional to the distance to the closest sonar-detected obstacle) combine to produce a robust, obstacle-avoiding navigation behavior.
[Diagram labels: Sonar: Forward; Patrol; Safety; Rotation Vel; Forward Vel]

5.3.3.1 Safety - de-coupled motor control

The translational component of all motion control, Safety, receives input from the five forward-facing sonars and outputs a velocity proportional to the distance to the closest sonar-perceived object. By itself, this causes the robot to maintain the maximum safe speed to avoid hitting anything. The Safety behavior runs in parallel with all other behaviors and has no competition for velocity control; this decoupling of control and simple, uncontested velocity component turns out to be common to almost all systems we build. An Ayllu version of the BPM code for this behavior is presented and discussed in Section 7.2.1.
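As a language-neutral illustration (this is not the Ayllu BPM code of Section 7.2.1), the proportional rule can be sketched in a few lines of Python. The gain, the speed cap, and the units are assumptions chosen only to make the example concrete.

```python
from typing import Sequence


def safety_velocity(sonar_ranges: Sequence[float], gain: float = 0.5,
                    max_speed: float = 1.0) -> float:
    """Forward velocity proportional to the nearest forward sonar reading.

    gain and max_speed are hypothetical tuning values, not taken from the robots.
    """
    nearest = min(sonar_ranges)
    return max(0.0, min(max_speed, gain * nearest))


# Five forward-facing sonar readings (assumed to be in meters).
print(safety_velocity([2.4, 1.8, 0.6, 1.9, 3.0]))
```

Because the output depends only on the nearest reading, the robot slows smoothly as it approaches any obstacle and speeds up again as soon as the way ahead clears, without any other behavior needing to contend for velocity control.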

5.3.3.2 Patrol - bottom level behavior

The Patrol behavior basically outputs a fixed rotational velocity. This causes the robot to move in a circle in the absence of obstacles, and in a somewhat distorted circle in the presence of obstacles, due to Safety's varying of translational velocity. When approaching a wall, for example, the speed of the robot decreases with the distance to the wall, while the rotation continues at its normal rate, until the robot is again headed in a direction without obstacles. An interesting emergent property of this simple Patrol behavior results from the application of "minimal heterogeneity" [Wer99b]: a difference in direction of rotation. If two or more robots in an enclosed space (such as the RoboCup field) Patrol with different directions of rotation, they tend to patrol separate territories due to the Safety/Patrol interaction.

The basis behavior Safety by itself moves straight forward until something gets in its way, then hovers at a small distance from the obstacle; Patrol by itself rotates in place. From their parallel execution emerges a robust behavior of navigating around the environment and avoiding obstacles.
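The effect of combining a fixed rotational velocity with a varying forward velocity can be seen in a small kinematic sketch. It integrates an idealized differential-drive pose with simple Euler steps; the rates, time step, and function names are assumptions, and the two forward speeds merely stand in for what Safety would output in open space and near a wall.

```python
import math
from typing import List, Tuple

Pose = Tuple[float, float, float]  # x, y, heading in radians


def patrol_rotation(clockwise: bool = False, rate: float = 0.5) -> float:
    """Patrol: a fixed rotational velocity; the sign encodes patrol direction."""
    return -rate if clockwise else rate


def drive(pose: Pose, forward_vel: float, rot_vel: float, dt: float) -> Pose:
    """Advance an idealized differential-drive pose by one Euler step."""
    x, y, heading = pose
    return (x + forward_vel * math.cos(heading) * dt,
            y + forward_vel * math.sin(heading) * dt,
            heading + rot_vel * dt)


def trace(forward_vel: float, rot_vel: float, dt: float = 0.01,
          steps: int = 1300) -> List[Pose]:
    """Path followed when both velocities are held constant."""
    pose: Pose = (0.0, 0.0, 0.0)
    path = [pose]
    for _ in range(steps):
        pose = drive(pose, forward_vel, rot_vel, dt)
        path.append(pose)
    return path


open_field = trace(forward_vel=1.0, rot_vel=patrol_rotation())
near_wall = trace(forward_vel=0.2, rot_vel=patrol_rotation())
# The circle's diameter is 2 * v / w: wide in the open, tight near obstacles.
print(max(p[1] for p in open_field), max(p[1] for p in near_wall))
```

With rotation held fixed, the turning radius is the forward speed divided by the rotation rate, so a robot slowed by Safety turns within a much smaller area until it faces open space again.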

5.3.3.3 Ball Manipulation

Our players manipulate the ball in three ways: dribbling, kicking, and batting, always towards the opponent's goal. Dribbling entails moving the ball through fairly continuous contact with the front of the robot, kicking is propulsion of the ball away from the robot at high speed (by swinging the rear around rapidly), and batting (with the side of the gripper) is used when the ball is resting against the wall or "held" by an opponent. The robots follow smooth, complex trajectories to properly align with the ball for forward progress.

Figure 5.14: The three basis behaviors sufficient to generate ball-manipulation trajectories for soccer play.
[Panels: Orbit-CW, Kick-Ahead, Orbit-CCW]

Ball-manipulation basis behaviors. The basic intuition behind our ball manipulation is that the robot should aim to interpolate itself between the ball and its own goal, then move towards the opponent's goal. This observable behavior suffices for both offense and defense. It is, however, too complicated to be considered a basis behavior and implemented monolithically. We decompose this behavior into three component BPMs that we determined to be minimal yet sufficient for generation of all ball manipulation. These three basis behaviors, Orbit-CW, Orbit-CCW, and Kick-Ahead, are illustrated in Figure 5.14. Orbit-CW and Orbit-CCW approach the ball directly from a distance, but fall into an "orbit" around the ball when close; they differ only in direction (clockwise or counter-clockwise) of this orbit. Kick-Ahead basically pushes the ball forward with small corrections when necessary to keep the ball centered in the field of view, and in some situations (described in Section 5.3.3.5) performs the high-speed kick. Rapid switching between these behaviors (at a maximum rate of 10 Hz) based on environmental factors (described in the next section, and in Table 5.1) leads to trajectories such as those illustrated in Figure 5.15, which emerge quickly enough to handle a moving ball.

Behavior Selection. This switching between basis behaviors is handled by the Orienter behavior. It passes the visual position of the ball to one of the basis behaviors, selected as follows:

Given that B is the bounding box of the ball in the visual field, OG that of the opponent's goal, and MG that of the robot's own goal, and that "north" is the direction towards the opponent's goal:

    If robot sees OG
        If B overlaps OG,              Kick-Ahead
        Else if B is left of OG,       Orbit-CW
        Else if B is right of OG,      Orbit-CCW
    Else if robot sees MG
        If B is left of MG,            Orbit-CCW
        Else if B is right of MG,      Orbit-CW
    Else if robot facing north,        Kick-Ahead
    Else if robot facing east,         Orbit-CCW
    Else if robot facing west,         Orbit-CW

Figure 5.15: Some typical trajectories generated by combination of the ball-manipulation basis behaviors.

This behavior selection can be performed as look-up in a table such as Table 5.1. Whenever one of the basis behaviors receives the ball-position information, it in turn outputs a rotational velocity. If the ball is not seen, there is no output from the ball manipulation behaviors at all, and the default Patrol rotational velocity is used.

The generation of trajectories by combination of basis behaviors can best be understood by tracing the process on paper. The ball is always approached from a direction that minimizes the chance of its being accidentally knocked towards the robot's own goal - this is the importance of the east-west distinction. The robot will push the ball forward with the Kick-Ahead behavior, which makes minor heading adjustments to center the ball in the visual field, until its alignment to the ball and goal change enough to mandate a temporary switch to an Orbit. Figure 5.16 shows the behavioral structure of the ball-manipulation system.
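The table look-up can be sketched directly from the rows of Table 5.1. The encoding below is our reading of that table: each percept is a tuple of four 0/1 flags plus a dead-reckoned heading, "?" entries are treated as wildcards, and a ball overlapping the opponent's goal is represented by both of the first two flags being set. The function name and the use of `None` for uncovered cases are assumptions.

```python
from typing import Optional, Tuple

ANY = None  # wildcard, written "?" in Table 5.1

# (ball left of goal, ball right of goal, ball left of my goal,
#  ball right of my goal, heading) -> behavior, transcribed from Table 5.1.
POLICY = [
    ((0, 1, ANY, ANY, ANY), "Orbit-CCW"),
    ((1, 0, ANY, ANY, ANY), "Orbit-CW"),
    ((1, 1, ANY, ANY, ANY), "Kick-Ahead"),
    ((0, 0, 0, 1, ANY), "Orbit-CW"),
    ((0, 0, 1, ANY, ANY), "Orbit-CCW"),
    ((0, 0, 0, 0, "West"), "Orbit-CW"),
    ((0, 0, 0, 0, "East"), "Orbit-CCW"),
    ((0, 0, 0, 0, "North"), "Kick-Ahead"),
]


def orienter(percept: Tuple[int, int, int, int, str]) -> Optional[str]:
    """Return the behavior of the first matching row, or None if no row matches."""
    for pattern, behavior in POLICY:
        if all(p is ANY or p == v for p, v in zip(pattern, percept)):
            return behavior
    return None


print(orienter((1, 1, 0, 0, "East")))   # ball overlaps the opponent's goal
print(orienter((0, 0, 0, 0, "South")))  # a case the table does not list
```

The policy is purely reactive: it is re-evaluated on every perception cycle and carries no memory from one selection to the next.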

Table 5.1: A reactive policy for the Orienter behavior

    Ball Left   Ball Right   Ball Left    Ball Right
    of Goal     of Goal      of My Goal   of My Goal   Heading   Behavior
    0           1            ?            ?            ?         Orbit-CCW
    1           0            ?            ?            ?         Orbit-CW
    1           1            ?            ?            ?         Kick-Ahead
    0           0            0            1            ?         Orbit-CW
    0           0            1            ?            ?         Orbit-CCW
    0           0            0            0            West      Orbit-CW
    0           0            0            0            East      Orbit-CCW
    0           0            0            0            North     Kick-Ahead

Figure 5.16: Addition of the ball manipulation layer.
[Diagram labels: Vision: Ball; decay; Vision: My Goal; Vision: Other Goal; Dead-Reckoning; Orienter; OrbitCW; OrbitCCW; Kick; Sonar: Forward; Patrol; Safety; Rotation Vel; Forward Vel]

Figure 5.17: Ball-Manipulation Basis Behaviors. Each of these behaviors is implemented as a table that assigns rotational velocities to segments of the visual field (represented here by arrows). The rapid acceleration away from the ball indicated in the lower corners of Kick-Ahead produces the "rear-end kick."

Basis behavior implementation issues - hidden state. Given the sensing limitations of the fixed, limited field-of-view cameras on our robots, the implementation of the Orbit behaviors is not straightforward. As the robot only sees in the direction in which it is headed, it is unavoidable that the ball must be out of the robot's field of view for significant portions of the process of orbiting the ball. Thus the control system must deal with the issue of hidden state, important parts of the task environment that are not perceivable. Purely reactive systems are not well-suited to such non-Markovian environments [Lit94].

Bowling et al. [BSV96] claim that an accurate memory model is needed to overcome problems of hidden state in nondeterministic environments, giving examples from a simulated soccer system with limited perception. They advocate use of a probabilistic model that maintains "reasonable estimates" of locations of objects relative to the agent. Estimation of objects' motion within the environment and effects of the agent's motion must be calculated. They compare this favorably to the "simplest model of memory that provides enough functionality to be usable by the client," one that updates a memory of directions to unseen objects by the amount of agent rotation. Arkin and Balch [AB98] also make use of a world model that must be updated to keep track of other agents in the environment as a robot loses perception of them. The RoboCup Challenge [ASK+98] states that an actuated camera or extremely wide-angle lens (such as an omnidirectional one) is necessary to keep track of the ball and targets, and that modeling of ball speed and direction is key to catching and intercepting the ball.

On a more minimalist note, Littman [Lit94] examines the improvement of performance brought to reactive systems by even a single bit of memory. In many cases, this minimal state addition can lead to optimal performance in non-Markovian environments.

We employ a situated approach that has power similar to Littman's explicit addition of internal state, which we call perceptual decay. Quite simply, our robots retain an "afterimage" of the most recent perception of an object for a fixed period of time after such perception is lost. This afterimage is not distinguished from a direct perception by the behaviors within the system. When these behaviors are properly designed, the afterimage will lead to recovery of direct perception. Perceptual decay has become ubiquitous in all of our robotic systems that operate in dynamic environments. In our soccer robots, the application of a few seconds of decay to visual perception of the ball not only solves this problem of hidden state effectively and through situatedness, but also allows us to exploit (as discussed below and in Section 5.3.5.1) the loss of perception of the ball in powerful ways. Our implementation of the ball-manipulation basis behaviors makes this clear.
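Perceptual decay as described above amounts to a very small wrapper around a sensor reading. The sketch below is one possible rendering, not the implementation used on the robots: the class name, the explicit `now` timestamps, and the two-second hold time are assumptions (the text specifies only "a few seconds").

```python
from typing import Optional, Tuple

Reading = Tuple[int, int]  # e.g. the grid cell in which the ball was seen


class DecayingPercept:
    """Holds the last reading for `hold_time` seconds after perception is lost."""

    def __init__(self, hold_time: float) -> None:
        self.hold_time = hold_time
        self._value: Optional[Reading] = None
        self._stamp: Optional[float] = None

    def update(self, reading: Optional[Reading], now: float) -> Optional[Reading]:
        """Feed the current reading (None if unseen); return what behaviors see."""
        if reading is not None:
            self._value, self._stamp = reading, now
        elif self._stamp is not None and now - self._stamp > self.hold_time:
            self._value, self._stamp = None, None  # the afterimage has expired
        return self._value


ball = DecayingPercept(hold_time=2.0)
print(ball.update((3, 1), now=0.0))  # direct perception
print(ball.update(None, now=1.5))    # afterimage still reported
print(ball.update(None, now=2.5))    # decayed: nothing reported
```

Downstream behaviors receive the same value whether it is fresh or an afterimage, which matches the statement that the afterimage is not distinguished from a direct perception.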

Table implementation of ball-manipulation basis behaviors. Armed with this perceptual decay of the ball's position, we implement the three ball-manipulation basis behaviors as simple lookup tables that map the ball position to robot rotation. We divide the visual field into a 5x5 grid and assign a rotational velocity to each of the twenty-five resulting areas. Figure 5.17 gives an example of such a table. These tables are fairly robust; our initial guessing of the values was functional, though not optimal. Hand tuning was fairly simple, but deciding the values of the entries in the table would be a good problem for learning.

Careful examination of the table will reveal that "recovery" behavior is built into the edges of the visual field. When the ball leaves the field of view, it leaves an afterimage in one of the edge squares, and the rotation indicated by that entry in the table is performed until either the period of decay expires or visual perception of the ball is re-acquired. We place in all the edges rotational velocities that have a high likelihood of steering the robot back towards the ball.

Making use of this built-in recovery, the Orbit behaviors actually cause the robot to "tack" around the ball. When approaching from a distance, the ball tends to stay smoothly within the field of view. When the robot gets close to the ball and must circumnavigate it, it rotates away from the ball until it is fairly tangential and loses direct perception, then rotates back rapidly (tracing this process through Figure 5.17 helps in understanding it). This process repeats, as the robot moves forward at the speed determined by Safety, until vision or dead-reckoning indicates a change in behavior. The tacking occurs rapidly enough that a smooth trajectory around the ball is generated. This tacking behavior ensures that the robot thoroughly explores the space of headings while traveling around the ball, making it highly likely that at some point it will either see both the ball and the opponent's goal at the same time, or see the ball while dead-reckoning indicates a heading of roughly "north" - which, as indicated in Section 5.3.3.3, are the triggers for the Kick-Ahead behavior.
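A table-driven basis behavior of this kind might look like the sketch below. The numbers in the table are placeholders invented for illustration and are NOT the entries of Figure 5.17; the image size, the sign convention (positive meaning counter-clockwise), and the function names are likewise assumptions. Clamping out-of-range positions to the border cells is one way to obtain the edge "recovery" entries described above.

```python
from typing import List, Tuple

GRID = 5  # the visual field is divided into a 5x5 grid

# Placeholder rotational velocities (rad/s, positive = counter-clockwise).
# Illustrative guesses only, NOT the values of Figure 5.17. Rows run from
# the top of the image to the bottom, columns from left to right.
KICK_AHEAD_SKETCH: List[List[float]] = [
    [0.8, 0.3, 0.0, -0.3, -0.8],
    [0.8, 0.3, 0.0, -0.3, -0.8],
    [0.8, 0.3, 0.0, -0.3, -0.8],
    [0.8, 0.3, 0.0, -0.3, -0.8],
    [-2.0, 0.3, 0.0, -0.3, 2.0],  # lower corners: spin away for the rear-end kick
]


def cell_of(x: float, y: float, width: float, height: float) -> Tuple[int, int]:
    """Map an image position to a (row, col) grid cell, clamped to the edges."""
    col = min(GRID - 1, max(0, int(GRID * x / width)))
    row = min(GRID - 1, max(0, int(GRID * y / height)))
    return row, col


def rotation_for(table: List[List[float]], x: float, y: float,
                 width: float = 160.0, height: float = 120.0) -> float:
    """Look up the rotational velocity for a (possibly decayed) ball position."""
    row, col = cell_of(x, y, width, height)
    return table[row][col]


print(rotation_for(KICK_AHEAD_SKETCH, 80.0, 60.0))   # ball centered
print(rotation_for(KICK_AHEAD_SKETCH, 5.0, 115.0))   # ball in lower-left corner
```

Changing a behavior's style then requires editing only table entries, which is why the entries are a natural target for hand tuning or learning.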

5.3.3.4 Emergent properties of ball-manipulation basis behaviors

Two major functionalities emerge from the interaction of the ball-manipulation behaviors and the de-coupled velocity control of Safety: detours and batting.

When a robot or other obstacle comes between a player and the ball, the player makes a detour around the obstruction without losing awareness of the ball. As the robot's velocity slows due to Safety's perception of the obstacle, the tacking behavior becomes more pronounced and the robot tends to shift towards one side of the obstacle until it again has a clear path toward the ball. Since rotation is still doing what is necessary to tack around the ball, the robot rotates back towards the ball frequently enough that the perceptually-persistent "afterimage" of the ball is never lost. This same combination of perceptual decay and de-coupled velocity control is what allowed our hors-d'oeuvres serving robots [Ark98a] to navigate as a cohesive group through dense crowds with many obstructions.

When the ball is stuck against a wall (a common occurrence at RoboCup '97) or being advanced by an opponent, the robot still tries to tack around it. Again, when Safety detects the wall or robot behind the ball, it slows the velocity to an eventual standstill. In this case, the back and forth motion of the tacking behavior becomes extreme, and generally leads to the side of the robot's gripper "batting" the ball out of its stuck position. Since the Orbit behaviors cause the robot to interpolate between the ball and its own goal, this batting almost always causes the ball to move towards the opponent's goal.

Figure 5.18: The proper configuration for a rear-end kick.

5.3.3.5 Kicking - style without sequences

Consideration of the breakdown of the visual field in the behavior tables will reveal that there are two squares - the lower left and right corners - that can be quite problematic for the Kick-Ahead behavior. If the ball is perceived in one of these areas, it is likely to be just to the side of the robot (see Figure 5.18). Attempting to rotate towards the ball would likely knock it towards the side of the field, away from its path to the goal. It seems that the robot would either have to pass the ball and come around for a new approach, or back up far enough to re-align without danger of pushing it in the wrong direction. The former is unappealing because of the time it would take to re-acquire the ball and the chance that it might not be re-acquired before being stolen; the latter involves the type of sequenced behavior which we strive to avoid. Our solution to this problem exploits the physics of the Pioneers: their lack of rotational symmetry, which in some cases can make control more difficult, here turned out to be an asset. When the ball enters one of the problematic areas of the visual field, the robot rotates very rapidly away from the ball, and kicks the ball with its rear end as it comes around. Though the configuration necessary for this to occur successfully is fairly rare (once or twice a game), the accuracy and speed of a successful kick led to its being hailed as the most stylish move yet seen in RoboCup competition [Nor97]. When the kick is not successful, the fast rotation causes the robot to regain visual contact with the ball quickly. This drastically different style of ball manipulation is implemented by merely placing two appropriate rotational velocities in the Kick-Ahead table.

5.3.3.6 Sophistication of ball-manipulation behavior

We believe that our ball-manipulation behavior operates on a level comparable to that of teams such as [VSH+98a], which use global vision systems for a clear view of all objects in the environment. They describe trajectories that bring robots around the ball and into a position to advance, and which detour around other agents, based on calculation of motion of objects in the environment and setting of intermediate targets for segmented approaches. Our robots also navigate around the ball as appropriate in order to line up for an advance, and make detours around opponents, but do so with no such calculation, internal representation, or global information. The sophistication of having three effective ball-handling techniques is not only unmatched by other systems, but achieved without any added control system complexity. Furthermore, rather than being unable to function in an inaccessible (hidden-state) environment as many predict a strictly behavior-based system would be, our players exploit loss of perception to generate the tacking behavior which leads to both the thorough exploration of heading space necessary to trigger behavior changes, and the useful emergent behaviors of extracting stuck balls by batting and detouring around obstacles. If, as the principle of emergence dictates, intelligence is in the eye of the beholder [Bro91a], the purposeful, smooth, dynamically-adjusted trajectories generated by the high reactivity of this approach are similar in intelligence to many of the precisely-calculated, segmented trajectories generated by deliberative systems.

5.3.4 Team Cooperation

As we have discussed in Section 3, it is often assumed that team cooperation entails some type of explicit communication between team members and a shared world model. "Ethereal" communication (such as by radio), however, adds a great deal of complexity, often, in practice, with little benefit. In many cases, when a symbolic message is received, a mapping must be performed between the perceived environment and the information of the message. This mapping is subject to both the sender's and the receiver's errors of perception. The decision of how to target communication can be very difficult: either one specific robot is the target of the message, in which case choosing the recipient is itself hard (requiring mapping of symbolic representations to perceived objects in the environment), or the message is broadcast to all robots, which respond en masse, only to have to sort out later, through perception, what each should do. It is far easier, when possible, to skip straight to the perception. This is one of the reasons why traditional planning for teams of robots can be so brittle in dynamic, uncertain environments: if one robot fails, or if some part of an internal representation gets out of synch with the physical world, reorganization of the physically embodied system through symbolic means requires prohibitive analysis and model verification. Kraus [Kra97] begins to look at systems that take a "classical mechanics" approach to team coordination. Robots can be given behaviors that simulate properties of particles in physics, which can attract or repel each other. The complexity of such systems is low, and they can be analyzed using theory from physics. This is commonly seen in ethologically inspired flocking behavior [Mat95b]. These systems are robust to failure because they don't assign specific roles to specific robots, and do not specify how tasks are to be accomplished. No unit is essential, and error in many cases can be "more creative than inefficient" [DGF+91]: interesting properties tend to emerge from unexpected events, rather than system failures. Below we describe the cooperative behaviors implemented for the "Spirit of Bolivia." Though based on an extremely simple modification of the system as described so far, we believe that our team behavior embodies many of the ideals that researchers in both stigmergic and deliberative camps strive for, while suffering few of the pitfalls they seek to avoid.

[Diagram labels: Sonar: Sides, Sonar: Forward, Vision: Ball decay, Vision: My Goal, Vision: Other Goal, Dead-Reckoning, Disperse, OrbitCW, OrbitCCW, Orienter, Kick, Patrol, Safety, Rotation Vel, Forward Vel]

Figure 5.19: The final behavioral system from RoboCup '97.

5.3.5 RoboCup Team Behavior

The only modification of our behavioral system to achieve team cooperation was the addition of a simplified form of Dispersion [Mat95b]: when the robots sense something close to their sides through sonar, they move away from it a little bit. When merged in with our other behaviors (see Figure 5.19), this leads to very interesting emergent effects.

5.3.5.1 Offensive formation

In an offensive situation, robots are trying to align with the ball and move it toward the opponent's goal. The robots do not collide head-on in pursuit of the ball, due to the Safety behavior, and maintain a set distance from the robots on their sides due to Disperse. Their tendency, however, due to the ball-manipulation behaviors, is to approach the ball from directly behind. One robot will by serendipity be the first to be so aligned and push the ball forward. Any other robots close to the ball will continue trying to get directly behind it, but will be caused to veer away by Disperse's reaction to the physical presence of the first robot. The robots on the sides stay slightly behind, because they slow down as they draw abreast of the ball and turn more sharply towards it (thereby causing Disperse to perceive the pushing robot). As the lead robot pushes the ball forward, the other robots will continue to follow the ball, remaining roughly parallel to and slightly behind the ball-pushing robot due to the competition between Disperse and the ball-manipulation behaviors. The result is motion across the field in the reliable formation shown in Figure 5.20. There is no "decision" to enter such a formation; it follows naturally from the robots' "attraction" to the ball and "repulsion" from each other, in situations where the ball is moving forward. The cooperative behaviors result from the interaction of simple individual behaviors such as attraction to the ball, repulsion from obstacles, and patrolling of an area when the ball is not visible. In an offensive situation, one robot pushes the ball forward, while the forces of attraction and repulsion cause the teammates to fall into a "V" formation (as shown in Figure 5.20). This formation provides effective "fumble protection" that is essential in the robotic soccer domain. Robots often accidentally knock the ball off course while dribbling it forward; this formation provides backup and recovery. With this formation it is not uncommon for possession of the ball to transfer between the robots of an advancing group without loss of possession by the team. The formation also provides for a very quick defense if the ball is stolen. The size of the offensive formation is limited by the physical interaction between the robots. Once there are three robots in the formation, other teammates are unable to pursue the ball without occasional visual occlusion by formation members, and will thus revert to defensive patrolling. In this way, necessary roles are filled (attacker, supporters, and defense) without negotiation, explicit definition or assignment of roles within the system, or even any explicit representation of teammates. In a defensive situation the ball is not advancing toward the opponent's goal. The same forces described above cause the robots to fall into a semi-circular arrangement around the ball rather than the V-formation of the advance (see Figure 5.21). This formation very effectively prevents the opponent from continuing to move the ball up the field, and places players in a good position to gain possession of the ball. An emergent "batting behavior" (described in [Wer99b]) makes it likely that the center robot will jostle the ball towards one of its teammates, which can smoothly begin an advance from the side; this can be seen as a rudimentary pass.
Transition between offensive and defensive formations is determined by motion of the ball, and is not even perceived by the robots; there is no concept of "offensive" or "defensive" (or even of "formation") anywhere in the behavior structure. Simple sensing of the local environment leads to flexible, dynamic team behavior that many researchers claim requires higher deliberation and explicit communication ([BA98, BA95, Jen95, OT97, SV99, Tam97b]; see [Kra97] for a detailed discussion of similar "physics-inspired" systems). Thus, in our soccer system, the situated approach allows robots to efficiently assume roles in offensive and defensive formations as determined purely by physics-inspired interaction and visual occlusion. Simple, stateless control allows sophisticated behavior including dynamically-determined limited-size formations, maintenance and recovery of ball possession, and simple passing. Assumption of roles takes place without any communication or explicit representation or coding of roles - the role behavior "emerges" from the interaction of a few simple behaviors. This formation provides an effective "fumble protection" that is very important in the robotic soccer domain. A robot will often accidentally knock the ball off course while dribbling it forward; this formation provides backup and recovery. In our team it is not uncommon for possession of the ball to transfer between the robots of an advancing group without loss of possession by the team. The formation also provides for a very quick defense if the ball is stolen. The size of the offensive formation is limited by the interaction between the perceptual decay of the ball position and the Disperse behavior. As the group grows larger, peripheral robots tend to lose sight of the ball for longer amounts of time. This provides for a "drop out" of group members; once they lose perception of the ball for more than the decay period, they leave the formation by reverting to their Patrol behavior. When the robots are "dressed" for RoboCup, the stable group size is three. In this way, necessary roles are filled (attacker, supporters, and defense) without negotiation, explicit definition or assignment of roles, or even any representation of teammates.

Figure 5.20: Offensive group formation. Interaction of ball manipulation, Disperse, and Safety behaviors causes the robots to fall into a V-formation when the ball is in motion roughly towards the opponent's goal. Perceptual properties limit the formation to three robots.

5.3.5.2 Defensive group formation

In a defensive situation the ball is not advancing toward the opponent's goal. The same behaviors described in Section 5.3.5.1 cause the robots to fall into a semi-circular arrangement around the ball rather than the V-formation of the advance (see Figure 5.21), since the robots on the sides are no longer kept behind by lower speed. This formation very effectively prevents the opponent from continuing to move the ball up the field, and places players in a good position to gain possession of the ball. An emergent "batting behavior" (another result of the interaction between the four behaviors listed above, described in [Wer99b]) makes it likely that the Pushing robot will jostle the ball towards one of its teammates, which can smoothly begin an advance from the side.

5.3.5.3 Transition between formations

Given a Patrol behavior as we have discussed in Section 5.3.3.2, which assigns a rough territory to each player, we get fairly sophisticated formation behavior. The robots patrol their territories when not close to the ball, dynamically enter population-controlled formations when they are useful for advancing the ball, and drop into defensive formations when they need to block an opponent advance and turn it around. Transition between offensive and defensive formations is triggered solely by behavior of the ball and other agents, and is not perceived by the robots; there is no concept of "offensive" or "defensive" (or even of "formation," "teammate," or "opponent") anywhere in the behavior structure. Simple sensing of the local environment leads to the type of flexible, dynamic team behavior that many researchers claim requires higher deliberation and explicit communication ([BA98, BA95, Jen95, OT97, SV99, Tam97b]; see [Kra97] for a detailed discussion).

Figure 5.21: Defensive Formation in Robot Soccer: When the ball is not moving roughly towards the opponent's goal, the robots cluster around it to form an effective barrier and be in good positions for recovery.

5.3.5.4 Survey of Related Research: Formations

Balch and Arkin [BA98, BA95] have done extensive work with formations of robots. These formations consist of simple geometric patterns of four (or in some cases two) robots. Their approach requires global knowledge of the direction in which the group is traveling and a predetermined spatial relation to at least one specific robot in the group. [BA98] discusses performance problems linked to radio communication as a demonstration of the utility of a passive (environmental) communication approach, and [BA95] mentions that problems of robot failure have not been addressed, and that possible solution paths are automatic reconfiguration and fault-tolerant communications. We believe that the sophistication of our offensive formation is not far from that of Balch and Arkin's work. Our formations, however, are not dependent on specific relations between specific robots, and thus exhibit some of the tolerance to robot failure that they seek. Our soccer system also performs something akin to automatic reconfiguration in the case of such failure; new robots in the environment are incorporated as necessary to maintain the stable group size determined by the density dependence [Bro90b] that arises from the interaction dynamics of the system (as discussed in Section 5.20). Parker's [Par93] work on formations compares systems with varying amounts of global knowledge. Performance of the strictly local system was relatively bad, but we attribute this at least in part to the lack of a true minimalist approach. The robots were aligning themselves relative to the heading of their peers, rather than their mere relative location. This caused formations to break during sharp turns as robots tried to stay, for example, on the left side of some other robot, rather than trying to keep the other robot to their left side. Again, the formations of four robots required specific spatial relations between specific robots, and were thus brittle to robot failures or behavioral fluctuations. Parker also discusses the suitability of behavioral analysis as a robust approximation to global knowledge. This is realized to some extent as well in our formation behaviors - it is the behavior of the ball and other robots, rather than any symbolic distinction, that causes transitions from patrolling to offensive to defensive behavior. Her assertion that global goals (known at design time) allow more local control is another factor that allows minimization of systems such as the "Spirit of Bolivia."

5.3.5.5 Discussion

We hope that we have shown that, contrary to what is claimed by Tambe [Tam97a], it is not necessarily true that "an agent must be provided 'deep' or causal models of its domains of operation," such as a general model of teamwork, for effective cooperation. In the simulated soccer domain, Tambe says that failure detection and recovery requires advanced spatial reasoning and agent tracking/plan recognition skill, and because of this has not been implemented, leaving the team susceptible to breakdown [Tam97a]. The need for agents to share joint intentions - to know the intentions of their teammates - makes the chance of such failures high in uncertain domains. The flexibility brought to pre-determined team actions by Stone and Veloso's formations [SV99] is a step towards embracing the minimalist principle of homogeneity, but still requires that at some point specific roles be assigned to specific robots, and that agents negotiate in order to change roles or formations. Challenges they list for their locker-room agreements include determination of when to switch roles or formations, smooth transitions of roles and formations, and how to make sure that all agents use the same formation. Though its transitions are simpler, our current system makes some headway towards addressing these challenges. Robots reliably enter into and switch roles smoothly and in appropriate situations, and flow in and out of formations (in the colloquial sense) in appropriate situations. We expect to scale our formations and strategies to the point where our minimalist situated strategies will have the power to generate team behaviors as complicated as those generated by deliberative systems, with a greater level of robustness, as they have done for the ball manipulation behaviors.

We can use situatedness as a partial ordering on the amount of reactivity of a system; it is a measure of the amount of internal state maintained. A purely reactive system is of course purely situated, whereas a classical model-based planning system has very low situatedness. The problem with internal state is that it must usually be kept synchronized with the environment, at great expense and with grave consequences of failure; the diametrically opposed problem of reactive systems is that they can guarantee no continuity of behavior, leading to frequent problems of oscillations due to local minima and temporary losses of perception. The interesting degree of situatedness for us is what we define as perceptual decay: the persistence of some perceptual data for a brief period of time after it is perceived, analogous to a retinal afterimage. As we will discuss in Section 5.3.3, much research has been done on the power added by even a single bit of state to a reactive system; perceptual decay allows us to take great advantage of this without changing our "situated" code or approach; behaviors are all purely reactive, but some of their inputs fade over time, rather than disappearing suddenly. A slow decay of visual perception of the ball is the only state in our sophisticated soccer players, and is not distinguished from a direct visual perception by any part of the system. Thus we refer to the system as "situated" (or, to be very precise, "highly situated") even though it is not strictly reactive. This allows us to ground the entire system in sensing and action, that is, use only the world as its own model, yet address problems of hidden perceptual state and perceptual noise.
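Perceptual decay as defined above can be illustrated with a minimal C sketch. It is not taken from the soccer system: the type `decaying_percept`, the functions `percept_update` and `percept_read`, the integer cell index, and the choice to store a timestamp are all assumptions made for illustration. The point it shows is that a reader of the percept gets the same answer whether the ball was seen just now or within the decay period, so behaviors consuming it can remain purely reactive.

```c
#include <stdbool.h>

/* Hypothetical representation of a decaying visual percept. */
typedef struct {
    int    cell;       /* visual-field cell in which the ball was last seen */
    double seen_at;    /* time of that observation, in seconds */
    bool   ever_seen;  /* false until the first observation */
} decaying_percept;

/* Record a fresh observation; when the ball is not visible, nothing changes,
 * so the previous observation persists as an "afterimage". */
void percept_update(decaying_percept *p, bool visible, int cell, double now)
{
    if (visible) {
        p->cell = cell;
        p->seen_at = now;
        p->ever_seen = true;
    }
}

/* Read the percept. Returns true and fills *cell if the ball is seen directly
 * or its afterimage has not yet decayed; callers cannot tell which. */
bool percept_read(const decaying_percept *p, double now,
                  double decay_period, int *cell)
{
    if (!p->ever_seen || now - p->seen_at > decay_period) {
        return false;
    }
    *cell = p->cell;
    return true;
}
```

With a decay period of one second, a ball last seen in cell 3 at time 0.0 is still reported in cell 3 at time 0.5 and is no longer reported at time 2.0.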

5.4 Generality of Techniques Exploiting Physical Situatedness

Though we (and other researchers discussed above) have been able to build systems that effectively exploit their physically-situated nature, the techniques used have been fairly specific to their task domains and experimental environments. Our goal is to derive techniques that provide the benefits of situated approaches - simple agents, scalability, and robustness to environmental changes and robot failures - in a generally applicable manner. For this, we have investigated situatedness in abstract environments.


Chapter 6 Extending PAB and Situatedness to Networked Robot Teams

While the PAB approach to robotic control was introduced more than a decade ago, it is only with the recent introduction of our Ayllu language implementation (Chapter 7) that PAB behavior interaction has been able to take place between networked robots - that is, that separate robots could inhabit the same behavior space. It is precisely this development that has motivated us to explore situatedness in behavior space and apply principles derived from our physically-interacting ant-like systems to ethereally-interacting robot teams.

6.1 Abstract Situatedness

Though our previous example systems have focused on communication through the physical environment, there is a strong trend in the robotics community towards distributed robotic systems. While, as a result, it is not uncommon these days to see mobile robots equipped with wireless Ethernet communication, there are few if any that have communications architectures that address the many problems specific to the robotic domain. Some of these are:

Unreliable communication: Even the most sophisticated radio systems are subject to interference, both loss of bandwidth due to RF noise and competition for bandwidth.

Real world task and safety constraints: Robotic tasks have time dependencies that are crucial to task performance and system survival. Much information has a severely limited useful lifetime, which influences how reliability of communication must be addressed.

Scalability: Much of the motivation for multi-robot research involves economies of scale; the vision of "swarm" robotics, and all group robotics to a lesser extent, is to be able to increase task performance linearly merely by dropping in more robots. A good distributed robotics architecture must allow robots to be efficiently added to a system.

Dynamic reconfigurability: At least for the present, robots are prone to failure; communication problems mentioned above lead to some robots being unreachable by others for varying periods of time; and scalability factors mentioned above lead to change in the makeup of a robot team. Most applications would benefit from the ability to match available robots to tasks without user programming or configuration.

Heterogeneity: Robots may be equipped differently, or may differ in reliability of subsystems. Again, most applications would benefit from the ability to automatically match available robots to tasks based on individual robot capabilities, which may vary over time.

6.2 PAB Control as Situated Interaction

These problems of communicating robots are of course extremely similar to problems of deliberative robots, discussed in Chapter 1, which we have seen addressed by situated PAB systems. Though there has been much discussion of the suitability of PAB techniques for building truly distributed (that is, networked) systems, previous research has not investigated the possibility of extending the PAB paradigm across networks. Our intention in the rest of this thesis is to demonstrate that:

the PAB approach can indeed be extended to allow connections to be made transparently over IP (so that there need be no distinction between intra- and inter-robot message passing);

that this extension can allow arbitrarily scalable networked teams analogous to the scalable teams in our previous examples which communicate through the environment;

that PAB interaction between robots can also provide the benefits of minimalism and robustness previously gained through environmental interaction; and

that PAB-networked robots can directly influence each other's behavior, much as we have seen that robots can directly influence each other's behavior through their presence in the physical environment.

In other words, we make the claim that "ethereal" PAB interaction between robots, such as through wireless Ethernet, can be considered to be situated interaction, and this abstract situatedness can be exploited for many of the benefits of physical situatedness. The structure of a PAB "behavior space" provides something analogous to physical locality through addressable behaviors and ports, the port-behavior-robot hierarchy, and broadcast messages to related groups of ports such as "peer groups" (Section 9.2), in which certain broadcast messages are targeted only to members of a restricted group of behaviors. These peer groups can be seen as spatial "neighborhoods." Robots are able to directly influence or interfere with each other's behavior through suppression, inhibition, and overriding. Interactions, through unreliable messaging, are asynchronous and uncertain: messages sent may not arrive, as a result of either communications failure or arbitration processes. Due to the nature of connections and, especially, broadcast connections, behavior systems are scalable, and new BPMs or robots can begin interacting with a running system without modification or even notification. The Broadcast of Local Eligibility technique presented in Section 9 provides the same type of indexical-functional (or deictic) [AC87] addressing of other robots that [Bro91a] equates with situatedness. We have developed a number of tools to exploit this "behavior-space situatedness," and dedicate the rest of this thesis to their presentation and analysis. Specifically, we show that when the PAB paradigm is extended across networks, the resulting systems are able to dynamically reconfigure themselves in order to efficiently allocate resources in response to changing environmental conditions, in a manner that is scalable and robust to robot failures.


Chapter 7 Ayllu: Scalable, Distributed Port-Arbitrated Behavior-Based Control

Distributed control of a team of mobile robots presents a number of unique challenges, including highly unreliable communication, real world task and safety constraints, scalability, dynamic reconfigurability, heterogeneous platforms, and a lack of standardized tools or techniques. Similar problems plagued development of single-robot systems until the "behavior-based" revolution led to new techniques for robot control based on port-arbitrated behaviors (PAB). Though there are now many implementations of systems for behavior-based control of single robots, the potential for distributing PAB control across robots for multi-agent control has not until now been fully realized. The rest of this chapter briefly describes Ayllu, a language designed to address these issues of distributed robot control. Ayllu allows standard PAB interaction (message passing, inhibition, and suppression) to take place over IP networks, and extends the PAB paradigm to provide for arbitrary scalability. It allows distribution in many senses: distribution of control of one robot over several machines, coordination of several robots by one machine, and most interestingly, group interactions without centralized control, through either explicit communication or sharing of sensor readings. Below we discuss Ayllu's extensions to the port-arbitrated behavior paradigm on which it is based and survey a number of successful Ayllu systems.

7.1 The Ayllu Language

Ayllu is an extension of the C language intended for development of distributed control systems for groups of mobile robots. It facilitates communication between distributed system components and scheduling of tasks. While Ayllu has many features specialized for behavior-based systems, it is useful for development of a wide range of architectures from reactive to deliberative, and provides the means for coordinating processor-intensive tasks, such as high-level planning and vision processing, with responsive low-level control. A small interface, referred to as AylluLite, can be easily added to non-Ayllu programs, allowing Ayllu to serve as a simple, effective, and powerful homogeneous "glue" between heterogeneous system components, thus simplifying, and making more robust, implementations of hybrid systems such as the "three layer architectures" [Gat92, BFG+97], which require a number of different communication protocols between system components. Ayllu is designed to be highly portable among operating systems and languages, and to allow maximal interoperability. Ayllu's principal goal is to facilitate implementation of robust multi-robot systems that must cope with noisy and range-limited communication, rapidly-changing real-world situations, variations in resource availability, tasks that require redistribution of system resources, and various hardware failures. Towards this goal, Ayllu extends subsumption-style message passing to the multi-robot domain, provides for a wide variety of behavior-arbitration techniques, and allows a great deal of run-time system flexibility including dynamic reconfiguration of behavior structure and redistribution of tasks across a group of robots as determined by either task constraints or changing availability of resources. The question arises as to whether Ayllu is a language or a library. Strictly speaking, Ayllu may be referred to as a language, since it involves syntactic structures foreign to C itself, and some differences in semantics. Ayllu makes extensive use of C's macro facilities, however, and as a result the standard C preprocessor is able to translate all Ayllu code into standard C code. Since Ayllu therefore consists of header files and object code that uses standard C compilation, it can also be viewed as a library, or toolkit.

7.1.1 Ayllu Extensions to the PAB Paradigm

Ayllu extends the PAB paradigm in two primary ways. One is the ability to connect behaviors (and arbitrate between them) over IP networks; the other is the addition of features that provide for scalable systems.

7.1.1.1 Connections Over Networks

The extension across IP is straightforward; when making a connection $C_{s,d}$ from source port $s$ to destination port $d$, $s$ and $d$ can each be specified either as a (behavior, port) pair or as a (host, behavior, port) triple. Connections are made dynamically at run time, so within the behaviors themselves, no distinction need be made between local and network communication. Any connection can be specified to be a broadcast connection, in which any message written to the source port results in message propagation to (or inhibition, suppression, or overriding of) the named destination port on each robot on the local network. Ayllu supports two forms of message broadcasting. One of these is IP broadcasting on a local subnet; if a connection is made to $d$ = (host, behavior, port) where host is specified as a broadcast address, then any message propagated along the connection is sent to port (behavior, port) in every host on the network except the sender. Such broadcast connections can be special (i.e., suppressive, inhibitory, or overriding). The other form of broadcast is local to a host (i.e., not over IP). In this case, if a connection is made to $d$ = (BROADCAST, port), then messages propagated along the connection will be (potentially) delivered to every behavior on the local host that has a port of the name specified.
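The addressing cases just described can be summarized in a short C sketch. This is not Ayllu syntax, which is not shown in this section: the `endpoint` struct, the `delivery` enum, the `classify` function, the use of a NULL host to mean "this host", and the spelling of the `LOCAL_BROADCAST` marker are all hypothetical. The sketch only shows how a destination given as a pair or a triple might be sorted into the four delivery cases.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker standing in for BROADCAST in a (BROADCAST, port) pair. */
#define LOCAL_BROADCAST "BROADCAST"

/* Hypothetical destination: a (behavior, port) pair has host == NULL,
 * a (host, behavior, port) triple has all three fields set. */
typedef struct {
    const char *host;
    const char *behavior;
    const char *port;
} endpoint;

typedef enum {
    DELIVER_LOCAL,           /* one behavior on this host */
    DELIVER_REMOTE,          /* one behavior on a named host */
    DELIVER_IP_BROADCAST,    /* the named behavior on every other host */
    DELIVER_LOCAL_BROADCAST  /* every local behavior with a port of this name */
} delivery;

/* Decide how a message sent to d would be delivered. broadcast_addr is the
 * subnet's broadcast address, supplied by the caller. */
delivery classify(const endpoint *d, const char *broadcast_addr)
{
    if (d->host == NULL) {
        if (strcmp(d->behavior, LOCAL_BROADCAST) == 0) {
            return DELIVER_LOCAL_BROADCAST;
        }
        return DELIVER_LOCAL;
    }
    if (strcmp(d->host, broadcast_addr) == 0) {
        return DELIVER_IP_BROADCAST;
    }
    return DELIVER_REMOTE;
}
```

Because the classification depends only on how the destination was specified when the connection was made, code inside a behavior that writes to the source port would look the same in all four cases, which is the transparency the text describes.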

7.1.1.2 Scalability Features

Given the single-item register nature of ports, in previous PAB systems it is not easy (and often not possible) to arbitrarily add new behaviors to a system; unless new ports are added to behaviors, comparisons cannot be made between data coming from various sources. As new robots are added to a system, either behavior interfaces in all robots must be changed so that data sent by the new robots will be available, or an increasing percentage of all robots' data will be overwritten and lost. This does not allow for flexible group size, or robustness to robot failures. Ayllu addresses this problem through the addition of four specialized port types which facilitate scalability by filtering messages, which may arrive from different sources, for selective replacement of the current value of the port. These port types are MaxPorts, MinPorts, SumPorts, and PriorityPorts. When a message $m_{incoming}$ is written to a port $p$ holding a data item $m_{current}$,

  if $p$ is a normal port, $m_{incoming}$ replaces $m_{current}$
  if $p$ is a MaxPort, $m_{current}$ becomes $\max(m_{incoming}, m_{current})$
  if $p$ is a MinPort, $m_{current}$ becomes $\min(m_{incoming}, m_{current})$
  if $p$ is a SumPort, $m_{current}$ becomes $m_{incoming} + m_{current}$
  if $p$ is a PriorityPort, $m_{current}$ becomes $m_{incoming}$ iff $priority(m_{incoming}) > priority(m_{current})$

As we will demonstrate in Section 9, a combination of only MaxPorts and NormalPorts is sufficient for implementation of some arbitrarily scalable group coordination strategies.
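The replacement rules for the five port types can be sketched in plain C as follows. This is only an illustration of the filtering semantics listed above: the `Port`, `Message`, and `port_write` names are hypothetical and are not part of Ayllu's API, and messages are assumed to carry a single integer value plus an integer priority.

```c
/* Hypothetical port model for illustration; these names are not Ayllu's API. */
typedef enum { PORT_NORMAL, PORT_MAX, PORT_MIN, PORT_SUM, PORT_PRIORITY } PortType;

typedef struct {
    int value;     /* message payload */
    int priority;  /* consulted only by priority ports */
} Message;

typedef struct {
    PortType type;
    Message  current;  /* the single item the port holds */
} Port;

/* Apply the replacement rule of the port's type to an incoming message. */
void port_write(Port *p, Message incoming)
{
    switch (p->type) {
    case PORT_NORMAL:      /* incoming always replaces current */
        p->current = incoming;
        break;
    case PORT_MAX:         /* keep the larger value */
        if (incoming.value > p->current.value) p->current = incoming;
        break;
    case PORT_MIN:         /* keep the smaller value */
        if (incoming.value < p->current.value) p->current = incoming;
        break;
    case PORT_SUM:         /* accumulate */
        p->current.value += incoming.value;
        break;
    case PORT_PRIORITY:    /* replace only on strictly higher priority */
        if (incoming.priority > p->current.priority) p->current = incoming;
        break;
    }
}
```

Because a MaxPort keeps only the largest value it has received, any number of senders can write to it without the receiving behavior needing one port per sender, which is the property the scalability argument relies on.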

7.1.1.3 Write-Inhibition

Ayllu also adds one new type of connection to the list found in Section 4.2:

  If $C_{s,d}$ is Write-Inhibitory, then $m$ is not written to $d$, and for period $p$, no messages written directly from within the BPM will be propagated out from $d$, though messages arriving through connections will be written and propagated.

This provides a "closure" of the control of information flow through ports, and allows construction of multi-dimensional arbitration structures such as the cross-subsumption we present in Section 9.3. It is illustrated in Figure 4.1.
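A minimal sketch of write-inhibition on a destination port, in plain C, may help make the rule concrete. The `OutPort` type and the three functions are hypothetical names, time is assumed to be a number of seconds supplied by the caller, and the sketch assumes that a locally written value is simply dropped while the port is inhibited (the text only states that it is not propagated).

```c
#include <stdbool.h>

/* Hypothetical model, not Ayllu's implementation. */
typedef struct {
    int    value;
    double inhibited_until;  /* local writes are blocked before this time */
} OutPort;

/* A message arrived over a Write-Inhibitory connection: the message itself is
   not written, and locally written output is blocked for `period` seconds. */
void write_inhibit(OutPort *d, double now, double period)
{
    double until = now + period;
    if (until > d->inhibited_until) d->inhibited_until = until;
}

/* Write from within the BPM. Returns true if the value is propagated. */
bool local_write(OutPort *d, int value, double now)
{
    if (now < d->inhibited_until) return false;  /* inhibited: drop it */
    d->value = value;
    return true;
}

/* Write arriving through an ordinary connection: written and propagated
   regardless of any write-inhibition in force. */
bool connection_write(OutPort *d, int value)
{
    d->value = value;
    return true;
}
```

If the inhibiting sender stops sending, the inhibition simply expires after the period, so the inhibited behavior regains control of its output without any explicit release message.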

7.2 Ayllu Programming

[Wer99a] provides a thorough description of Ayllu's structure, use, capabilities, standard sensor and motor-control behaviors, and a detailed tutorial. As Ayllu is gaining popularity in various research laboratories, we are beginning a project to develop a standard task-level behavior library, something promised by behavior-based systems but not so far realized due to lack of widely available, multi-platform, interoperable development tools.

7.2.1 Ayllu Behavior Structure

As a brief example of the structure of an Ayllu BPM, we present here a simple translational velocity control behavior as used for collision avoidance in both the robot soccer system (Section 5.3 above) and the multi-target observation system (Chapter 10). A more extended example, the full code for the cooperative multi-target observation task, is presented in Appendix A. Given that minfront and minback are the smallest sonar readings (distance to closest object) from the forward and backward pointing sonars respectively, scale, offset, and backstopdist are parameters with default values, and maxvel is an input specifying the desired speed for a robot, this BPM calculates:

$$\mathit{velocity} = \min\bigl(\mathit{maxvel},\ \mathit{scale} \cdot (\mathit{minfront} - \mathit{offset})\bigr) \tag{7.1}$$

that is, it outputs a velocity proportional to the distance to the closest front obstacle, with an upper bound of maxvel and, if there is an object behind closer than backstopdist, with a lower bound of 0. The Ayllu process that continuously performs this calculation is:

    ayDefProcess(SetSafeVelProc)
    {
        ayLocalPort MinFront, MinBack, Velocity, Scale, Offset, MaxVel, BackStopDist;
        int safevel;

        /* Speed proportional to front clearance; negative when too close. */
        safevel = (ayReadIntPort(MinFront) - ayReadIntPort(Offset))
                  * ayReadFloatPort(Scale);

        /* Do not back up if an object behind is closer than BackStopDist. */
        if ((safevel < 0) && (ayReadIntPort(MinBack) < ayReadIntPort(BackStopDist)))
            safevel = 0;

        /* Cap the result at the requested speed. */
        ayWriteIntPort(Velocity, ayMin(safevel, ayReadIntPort(MaxVel)));

        /* Reset the MinPorts before the next cycle. */
        ayResetPort(MinFront);
        ayResetPort(MinBack);
    }

And the Behavior Class definition is:

    ayDefBehaviorClass(SafeVelocity)
    {
        ayINTERFACE {
            ayIntMinPort(MinFront, ayMAXINT);
            ayIntMinPort(MinBack, ayMAXINT);
            ayIntPort(Velocity, 0);
            ayFloatPort(Scale, 0.3);
            ayIntPort(Offset, 400);
            ayIntPort(MaxVel, 500);
            ayIntPort(BackStopDist, 550);
        }
        ayPROCESSES {
            ayInitProcess(SetSafeVelProc, ratepersecond(20));
        }
    }
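For readers without an Ayllu installation, the same calculation can be restated as an ordinary C function. This is a sketch that follows the prose description (a lower bound of 0 when an object behind is closer than the back-stop distance); the function name `safe_velocity` and its parameter list are hypothetical, with parameters named after the ports of the SafeVelocity behavior.

```c
/* Plain-C restatement of the SafeVelocity calculation; not part of Ayllu. */
int safe_velocity(int min_front, int min_back, double scale,
                  int offset, int max_vel, int back_stop_dist)
{
    /* Speed proportional to front clearance; negative when too close. */
    int safevel = (int)((min_front - offset) * scale);

    /* Never back up into an obstacle closer than back_stop_dist. */
    if (safevel < 0 && min_back < back_stop_dist)
        safevel = 0;

    /* Cap at the requested speed. */
    return safevel < max_vel ? safevel : max_vel;
}
```

With an offset of 400 and a scale of 0.5, a front reading of 1000 gives a velocity of 300, while a front reading of 200 gives -100 (backing away) unless something behind is closer than the back-stop distance, in which case the result is 0.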

7.3 Brief Survey of Ayllu-based Systems

Ayllu has been used to implement the cooperative behaviors for ground-based robots which cooperate with an autonomous robot helicopter [SMM99] and formation control in outdoor environments. [DMS99] uses Ayllu to build topological mapping systems that run on one robot that communicates with desktop displays, and for work on cooperative mapping with heterogeneous robots. [PM00] has used Ayllu as a substrate for both fuzzy-logic and multiple-objective behavior control. It is being used in experiments involving artificial emotions in social robotics in continuation of research presented in [MV99], and in work on planar object manipulation [GM00]. [NM00] uses Ayllu in experiments in which robots learn about cooperation through analysis of the benefits of interaction with humans. Ayllu is being ported to a number of new platforms at the Naval Undersea Warfare Center, where it is being used for development of surf-zone mine-sweeping systems and as the basis for the common control language for heterogeneous systems discussed in Chapter 8 and in [DW00]. In addition to the research projects listed above, Ayllu has been used in real-world applications such as museum installations, film special effects, and roaming sales exhibits at trade shows; in theatrical works and performance art pieces; and in a number of robot competitions including the AAAI and RoboCup Robot Soccer competitions.


Chapter 8 A PAB Approach to a Common Control Language

Recent trends in research sponsored by the Office of Naval Research have indicated the desirability of teams of heterogeneous underwater and surf-zone autonomous vehicles for reconnaissance operations [DW00]. Important factors for such robotic systems include intermittent, low-bandwidth communication; the need for simple and rapid system configuration; and the need for human operator capabilities of observation, override, and re-tasking. In order to tie together widespread efforts by numerous research groups in academia, industry, and the military, there has been a call for a common control language (CCL) as a standard interface for information exchange and task delegation between human operators, swimming vehicles, bottom crawlers, moored communication systems, and other agents.

Previous approaches to the CCL have involved low-level commands for such things as vehicle motion control, subsystem activation, and requests for sensor readings. In these approaches, adding capabilities to robots necessitates modification of the CCL. Further, they do not provide much support for autonomous behavior or interaction between autonomous systems; rather, they seem to have evolved from single-robot remote-control commands.

Our PAB framework naturally implies a different approach to a CCL, in which the language focuses on behavior manipulation rather than vehicle capabilities. Such a language is extremely simple, requiring only four basic commands: Instantiate (a behavior), Connect (a source port to a destination port, either normally or inhibitory/suppressive/overriding), SendMessage (to a specified destination port), and RequestMessage. The first three of these commands reflect basic capabilities of any PAB system; RequestMessage merely asks that the value of a specified source port be sent to a specified destination port.
This concise command set allows not only messaging between robots or between an operator and a robot, but construction and modification of controllers on-the-fly (through instantiation and connection). We refer to such a port-arbitrated behavior-based common control language as PABCL. Since the PABCL should be, in general, used to construct or modify controllers rather than directly control vehicles, it can be far more bandwidth-efficient than previous CCL designs. PABCL abstracts away low-level implementation details so that different BPMs with a similar interface of ports can be written for different vehicles. New capabilities can be added to robots or operator consoles by adding BPMs, without need for any modification of the PABCL itself.

Given PAB's access to any addressable ports within the system, PABCL provides a uniform communication protocol for all levels of control. An operator can monitor system operations by examining appropriate ports, and can change system behavior by sending messages to appropriate "parameter" ports. The system can be overridden at any level by the operator who can, for example, tele-operate vehicles at the group level, the individual level, or at the level of individual subsystems, leaving low-level safety behaviors intact or overriding them.

The following is the syntax (in Backus-Naur Form) of the PABCL:

    :: INSTANTIATE | CONNECT | SENDMESSAGE | REQUESTMESSAGE
    ::
    :: | BROADCAST
    :: NORMAL | DISCONNECT |
    :: SUPPRESS | INHIBIT | OVERRIDEIN | OVERRIDEOUT
    :: legal C identifier
    :: INTEGER integer | FLOAT float | STRING len string
    :: float

We have modified Ayllu to support PABCL commands, and work on a parser and operator "console" is nearly complete.
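The terminal keywords of the grammar can be captured in a few C enumerations. The sketch below is hypothetical throughout: only the keyword spellings come from the grammar, while the C type names, the `pabcl_parse_command` and `pabcl_is_special` functions, and the `PabclValue` layout are assumptions made for illustration and do not describe the Ayllu parser.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The four PABCL commands. */
typedef enum { PABCL_INSTANTIATE, PABCL_CONNECT,
               PABCL_SENDMESSAGE, PABCL_REQUESTMESSAGE } PabclCommand;

/* Connection keywords accepted by CONNECT. */
typedef enum { CONN_NORMAL, CONN_DISCONNECT, CONN_SUPPRESS,
               CONN_INHIBIT, CONN_OVERRIDEIN, CONN_OVERRIDEOUT } PabclConnType;

/* A message value: INTEGER, FLOAT, or a length-prefixed STRING. */
typedef enum { VAL_INTEGER, VAL_FLOAT, VAL_STRING } PabclValueType;
typedef struct {
    PabclValueType type;
    union {
        long   i;
        double f;
        struct { size_t len; const char *data; } s;
    } as;
} PabclValue;

/* Map a keyword to its command; returns false for unknown keywords. */
bool pabcl_parse_command(const char *kw, PabclCommand *out)
{
    static const char *names[] = { "INSTANTIATE", "CONNECT",
                                   "SENDMESSAGE", "REQUESTMESSAGE" };
    for (int c = 0; c < 4; c++)
        if (strcmp(kw, names[c]) == 0) { *out = (PabclCommand)c; return true; }
    return false;
}

/* True for the connection types the grammar groups separately from
   NORMAL and DISCONNECT. */
bool pabcl_is_special(PabclConnType t)
{
    return t == CONN_SUPPRESS || t == CONN_INHIBIT ||
           t == CONN_OVERRIDEIN || t == CONN_OVERRIDEOUT;
}
```

Note that nothing in these types refers to vehicle motion or sensors; vehicle-specific capability lives entirely in the BPMs that the commands instantiate and connect.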


Chapter 9 Broadcast of Local Eligibility

We now introduce our Broadcast of Local Eligibility (BLE) approach to multi-robot coordination. BLE is a general tool that may be applied to strongly cooperative tasks in which a number of roles must be filled, for each of which there is a suitable local-eligibility estimate. The BLE mechanism involves a comparison of locally determined eligibility (i.e., eligibility determined through a robot's own sensory input) with the best eligibility calculated by a peer behavior on another robot. When a robot's local eligibility is best for some behavior $B_n$ which performs task $T_n$, it inhibits its peer behaviors (that is, behaviors $B_n$) on all other robots, thereby "claiming" task $T_n$. Since this inhibition is an active process, failure of a robot which has claimed a task results in the task being immediately "freed" for potential takeover by another robot.

Since BLE is based on broadcast messages and receiving ports that filter their input for the "best" eligibility, BLE-based systems are inherently scalable. Up to the limit of communication bandwidth (as discussed in Section 9.8), any number of BLE-enabled robots added to a system will properly interact. BLE allows heterogeneous robots to efficiently allocate themselves to appropriate tasks without the need for any explicit communication or global knowledge of particular abilities. The ability to dynamically instantiate and connect BLE-enabled BPMs allows systems to scale in capability as well as in number of robots.

Note on BLE Terminology: $B_n$ represents the name of a behavior which generates the observable behavior of performance of task $T_n$. The $B_n$ peer group consists of all behaviors named $B_n$ on all robots on the local network. All behaviors $B_i \ldots B_j$ which are based on the same BPM, and which are connected and parameterized in the same way except for their relative priorities, are said to fulfill the same role; that is, if tasks $T_n$ and $T_m$ are exactly the same, then two robots, each performing one of $T_n$ or $T_m$, are said to be concurrently playing the same role.

9.1 BLE-Enabled BPMs

BLE action selection requires that each BLE-ready BPM include three specific ports: Local, Best, and Inhibit (see Figure 9.2a). Useful BPMs will usually have additional ports for task-related input and output. We generically refer to the BLE-arbitrated output of a BPM as Output, though the actual output may be through any number of arbitrarily named ports. The Best port is a MaxPort (described in Section 7.1.1.2), which accepts only values that are larger than its current value.

9.2 Cross-Inhibition of Behaviors

Cross-inhibition refers to the process of arbitration between peer behaviors, which are generally¹ instances of the same BLE behavior on different robots. Given that there is some behavior instance $B_n$ (which performs task $T_n$) on each robot, cross-inhibition results in the selection of at most a single robot to perform $T_n$. The selected robot is the one that is most eligible (according to local criteria) for the task. There may be multiple sets of cross-inhibiting behaviors active at the same time and, in general, cross-inhibition of behaviors $B_i$ across robots is independent of cross-inhibition of behaviors $B_j$; Section 9.3 below discusses one manner in which local arbitration between different cross-inhibited behaviors can take place.

Cross-inhibition is performed in a continuous series of decision cycles, the maximum frequency of which is limited by network bandwidth as discussed in Section 9.8. In practice, decision cycles are usually programmed to take place at a fixed rate between 1 Hz and 10 Hz. In each decision cycle, one robot from each peer group is selected to perform a task. As illustrated in Figure 9.2a, the Local port of each robot's behavior $B_n$ broadcasts a locally computed eligibility estimate to the Best port of each other robot's behavior $B_n$. Each Best port maintains the maximum of the eligibility messages it has received in the current decision cycle. Whichever robot has a local eligibility better than or equal to the Best it receives writes to its Inhibit port, causing write-inhibition (described in Section 7.1.1.3) of behavior $B_n$'s Output port(s) in the other robots for a period slightly longer than a decision cycle.

For cases where multiple robots are "most eligible" for some $T_n$, they all inhibit such that no robot performs the task for the period of inhibition; if the eligibility function is well-written and based at least in part on sensor data, then real-world uncertainty and dynamism should lead to a single robot quickly emerging as "most eligible." If this is not possible or acceptable in a given task domain, then various methods could be used to guarantee that ties do not occur, such as the use of a unique identifier for each robot as part of the eligibility function.
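One decision cycle of cross-inhibition within a single peer group can be sketched in plain C as follows. This is a simplified simulation for illustration only: `ble_cycle`, `MAX_ROBOTS`, and the `alive` array are hypothetical names, and real BLE robots would exchange eligibilities through broadcast connections to their Best ports rather than through shared arrays.

```c
#include <stdbool.h>

#define MAX_ROBOTS 16

/* One decision cycle for a single peer group B_n.
   local[i]  : local eligibility of robot i
   alive[i]  : false if robot i has failed or is out of communication range
   active[i] : output, true if robot i's Output is not inhibited this cycle */
void ble_cycle(const int local[], const bool alive[], int n, bool active[])
{
    bool inhibited[MAX_ROBOTS] = { false };
    if (n > MAX_ROBOTS) n = MAX_ROBOTS;

    for (int i = 0; i < n; i++) {
        if (!alive[i]) continue;

        /* Best acts as a MaxPort: the largest eligibility heard from peers. */
        bool heard = false;
        int  best  = 0;
        for (int j = 0; j < n; j++) {
            if (j == i || !alive[j]) continue;
            if (!heard || local[j] > best) { best = local[j]; heard = true; }
        }

        /* A robot at least as eligible as its best peer inhibits all peers. */
        if (!heard || local[i] >= best)
            for (int j = 0; j < n; j++)
                if (j != i && alive[j]) inhibited[j] = true;
    }

    for (int i = 0; i < n; i++)
        active[i] = alive[i] && !inhibited[i];
}
```

With eligibilities 5, 2, and 3, only the first robot stays active; if that robot fails, the robot with eligibility 3 takes over on the next cycle. Two robots with equal eligibility inhibit each other, so neither performs the task, which matches the tie behavior described above.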

9.3 Cross-Subsumption

Cross-inhibition arbitrates only between peer behaviors on different robots; some local mechanism must arbitrate between different behaviors on the same robot. While, as we have seen in Chapter 4, in Section 7.3, and in our two previous example systems (Sections 5.2 and 5.3), the PAB approach

¹ This is the case for BLE. It is certainly possible to use the same mechanism for selection between different behaviors, either within one robot or across robots, provided a local eligibility function can be normalized across the tasks. We refer to such selection between different tasks as Broadcast of Local Desirability.


Figure 9.4: Robot or Communication Failure. a) When all robots are functioning and communicating, Robot 1 (with Local Eligibility 5) inhibits 2 and 3. b) If Robot 1 fails or leaves communication range, Robot 3 (Local Eligibility 3) takes over.

or 50 robots each participating in 50 peer groups at 1 cycle/second. In practice, wireless Ethernet has a lower effective bandwidth, and the limitations of BLE are proportionally reduced.

9.9 BLE Summary

The Broadcast of Local Eligibility technique structures behavior space so that a principled approach can be taken to building systems that exploit their abstract situatedness. Peer groups provide locality, local eligibility signals provide constant "environmental input," and inhibition allows robots to directly affect each other's actions. The broadcast basis of BLE leads to scalability limited only by communication bandwidth. BLE provides automatic recovery from robot failures and automatic retasking in response to changes in the environment or the robot team. BLE allows heterogeneous robots to assume task-achieving roles efficiently without shared knowledge of capabilities, and provides a number of means for prioritizing tasks.


Chapter 10 BLE Experiments: Multi-Target Observation

We have tested our BLE approach on a multi-target observation task known as CMOMMT (Cooperative Multi-robot Observation of Multiple Moving Targets) introduced by [Par97], and a prioritized variation that we call W-CMOMMT. CMOMMT is an NP-hard problem that requires strong cooperation [Par98] for good performance. It has the benefits of simple formulation and evaluation, and of existing implemented systems for comparison. [Par99] gives a thorough overview of related work; [KDPT97] investigates efficient algorithms for the related multi-robot observation of entire areas, including trade-offs between communicative, non-communicative, and centralized methods.


Purpose: To optimally assign robots to observe multiple prioritized moving targets.

Situatedness: Influential (in task-space).

Scalability: The broadcast-based BLE technique will properly assign an arbitrary number of robots to tasks. Robots do not need to know of the existence of other robots; they perceive at most only their own status (most appropriate or not) relevant to each BLE-arbitrated behavior, regardless of how many robots are on the team.

Robustness: Robot or communication failure leads to redundant coverage of highest-priority targets rather than loss of coverage. Local eligibility estimates lead to role assignment based on perceptual/actuator capability, overcoming variations or failures in these systems. Active inhibition rather than symbolic task-claiming provides instant, automatic recovery from failures, and beneficial task trade-off.

Minimization: No need for positioning information, knowledge of team members, or reliable negotiation. Heterogeneous systems are handled with no need for shared knowledge of agent capabilities. The standard BLE technique makes coding of strongly cooperative behaviors extremely simple and compact.

10.1 The CMOMMT Task

Our version of the CMOMMT problem is defined as follows. Given:

  $S$: a bounded, enclosed region,
  $R$: a team of $m$ robots with noisy, limited-range, limited field-of-view sensors, and
  $O(t)$: a set of $n$ targets $o_j(t)$ such that $In(o_j(t), S)$ is true, where $In(o_j(t), S)$ means that target $o_j(t)$ is within $S$ at time $t$.

Define an $m \times n$ matrix $A(t)$ where

$$a_{ij}(t) = \begin{cases} 1, & \text{if robot } r_i \text{ is observing target } o_j \text{ at time } t \\ 0, & \text{otherwise} \end{cases} \tag{10.1}$$

A robot is said to be observing a target when the target is in the robot's field of view and within a certain distance $d_{obs}$.


We define a logical OR operator over a vector $H$:

$$\bigvee_{i=1}^{k} h_i = \begin{cases} 1, & \text{if there exists an } i \text{ such that } h_i = 1 \\ 0, & \text{otherwise} \end{cases} \tag{10.2}$$

The goal of the CMOMMT is then to maximize

$$\mathit{Observation} = \sum_{t=1}^{T} \sum_{j=1}^{n} \frac{\bigvee_{i=1}^{m} a_{ij}(t)}{T \cdot n} \tag{10.3}$$

that is, to maximize the time during which each target in $S$ is being observed by at least one robot. We assume that the area covered by the sensors of the robots is much smaller than the total area to be monitored and that targets move slower than the robots. The original formulation of the problem [Par99] assumes that robots share a known global coordinate system; we replace this with the assumption that the robots can visually distinguish each target from the others¹. Thus our formulation focuses on task space where Parker's formulation and implementation using predictive tracking and the local force-vector algorithm [Par99] tend to be more oriented towards physical space.

We also introduce a prioritized version of the problem which we call Weighted CMOMMT, or W-CMOMMT. Given:

  $W$: a vector of weights such that $w_i$ reflects the priority of target $o_i$,

the goal of W-CMOMMT is to maximize

$$\mathit{WObservation} = \sum_{t=1}^{T} \sum_{j=1}^{n} \frac{w_j \bigvee_{i=1}^{m} a_{ij}(t)}{T \cdot \sum_{v=1}^{n} w_v} \tag{10.4}$$
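Both metrics can be computed from a log of the observation matrix $A(t)$. The following C sketch assumes a fixed team of three robots and four targets, as in the experiments, and assumes the scores are normalized by the number of time steps and the total target weight so that they fall between 0 and 1; the function name `w_observation` and the array layout are hypothetical.

```c
#define N_ROBOTS  3
#define N_TARGETS 4

/* a[t][i][j] is 1 if robot i observes target j at step t, else 0.
   w[j] is the priority weight of target j; pass all ones to obtain the
   unweighted Observation score. */
double w_observation(int steps, int a[][N_ROBOTS][N_TARGETS],
                     const double w[N_TARGETS])
{
    double weight_total = 0.0, covered = 0.0;
    for (int j = 0; j < N_TARGETS; j++) weight_total += w[j];
    if (steps <= 0 || weight_total <= 0.0) return 0.0;

    for (int t = 0; t < steps; t++)
        for (int j = 0; j < N_TARGETS; j++) {
            int seen = 0;                    /* logical OR over robots */
            for (int i = 0; i < N_ROBOTS; i++) seen |= a[t][i][j];
            if (seen) covered += w[j];
        }
    return covered / (steps * weight_total);
}
```

Because the inner term is a logical OR over robots, two robots watching the same target earn no more than one robot would; the metric rewards spreading the team across targets.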

10.2 Experimental Design

We have implemented controllers for CMOMMT on a team of three ActivMedia Pioneer 2DX robots (see Figure 10.1b). These are differentially steered wheeled bases with on-board sonar (for obstacle avoidance) and vision (for identifying and tracking targets). The video cameras have a 45-degree field of view. Each robot is connected to a wireless Ethernet LAN, and programmed using Ayllu [Wer00].

The Experimental Environment: Current experiments take place in an 18 by 22 foot enclosure. Targets are colored paper cylinders which an experimenter moves by hand in a fixed pattern at an average speed of about 2 feet/minute; sequences of positions for each target are marked on the floor, and a workstation provides verbal prompts for precise timing of target motion. The targets all begin at one end of the enclosure, and move in a criss-cross pattern that varies from a very dispersed to a very condensed formation (see Figure 10.1a). Trials are run with three robots and four targets.

¹ Parker [Par99] mentions that the ability to distinguish targets is important in her formulation as well, and might need to be performed through such techniques as analysis of target motion when targets are close together.

Figure 10.1: a) Experimental Environment: The 18 by 22 foot enclosure. Robots are shown with observation ranges; fields of view extend further. Targets are numbered circles. Light grey targets and dashed lines indicate initial positions and paths of targets. b) Robotic Testbed: Three Pioneer 2DX robots.

10.3 Robot Behaviors

Four controllers have been implemented for comparison in CMOMMT: a BLE controller, a local greedy controller, a local subsumption controller, and a random controller. Each of these controllers is implemented using the same BPMs, with differences only in behavior arbitration.

Common Behaviors: A single behavior on each robot controls translational motion to maintain a safe velocity based on the distance to sonar-detected obstacles. The task-oriented BPMs specified below only control rotational motion of the robots. Two classes of behavior are implemented:

  Observer behaviors: the target-observing behaviors are instances of a single BPM which rotates the robot towards a specific target in its field of view. This, combined with the common velocity control behavior, causes the robot to approach this specific target and maintain a distance of approximately 1 foot. One observer behavior is instantiated for each target to be tracked. The observation range of the robots is approximately four feet, and the robots are able to perceive targets up to fifteen feet away, depending on lighting conditions in different parts of the enclosure.

  Search behavior: the search behavior is random wandering.

Figure 10.2: Local Subsumption for CMOMMT. Arrows pointing into circles indicate Output-Overriding (see Section 4.2) connections; thus behaviors "higher" in the diagram have higher priority.

BLE Coordination: The BLE controller is a subsumption hierarchy of Observer behaviors, with the Target 1 observer having highest priority (illustrated in Figure 10.2). Each Observer is then joined into a cross-inhibiting peer group which consists of Observers of the same target on each robot, such that the controller becomes a cross-subsumption hierarchy (Figure 10.3). The highest-priority behavior that is not cross-inhibited controls the robot; that is, each robot approaches and tracks the highest-priority target it sees that is not being observed by another robot. The local evaluation function for each Observer behavior is proportional to the width of its associated target in the visual field (an approximation of distance). It favors observation of multiple targets by increasing for each additional target viewed in observation range. Given the definitions from Section 10.1, the local eligibility of the target $j$ observer on robot $i$ at time $t$ is calculated as:

$$LE_{i,j}(t) = \begin{cases} \sum_{k=1}^{n} a_{i,k}(t) \cdot \mathit{width}_{i,k}(t), & \text{if } a_{i,j}(t) = 1 \\ -\infty, & \text{otherwise} \end{cases} \tag{10.5}$$

that is, if a robot is observing some target $j$, then its eligibility for the task of observing $j$ is the sum of the widths of all targets that it observes, but if it is not observing target $j$, it has minimum eligibility for the task. This is calculated simply in the Ayllu code in Section A.1.2. The search behavior is active when all other behaviors are either cross-inhibited or unable to perceive any targets.

Figure 10.3: BLE Control for CMOMMT. The local subsumption hierarchy is cross-inhibited.
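The local eligibility of Equation 10.5 can be sketched in plain C as follows. This is not the Ayllu code of Section A.1.2; the function name `local_eligibility` and the array arguments are hypothetical, and the sketch assumes that "minimum eligibility" is represented as negative infinity.

```c
#include <math.h>

#define N_TARGETS 4

/* observing[k] : 1 if this robot is observing target k at the current time
   width[k]     : apparent width of target k in the visual field
   Returns the eligibility of this robot's observer for target j. */
double local_eligibility(const int observing[N_TARGETS],
                         const double width[N_TARGETS], int j)
{
    /* Not observing target j: minimum eligibility (assumed -infinity). */
    if (!observing[j]) return -INFINITY;

    /* Otherwise, sum the widths of every target currently observed. */
    double sum = 0.0;
    for (int k = 0; k < N_TARGETS; k++)
        if (observing[k]) sum += width[k];
    return sum;
}
```

A robot that observes two targets at once therefore reports a higher eligibility for each of them than a robot observing only one of them at the same apparent width, which is how the function favors robots that cover several targets.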

Local Subsumption Only: The Local Subsumption (LS) controller is the same as the BLE controller, but connections are not made across the peer groups so that no cross-inhibition takes place. The robot approaches and tracks the highest-priority target it sees, or searches if it sees no targets.

Local Greedy: The Local Greedy controller (seen in Figure 10.4) has neither cross-inhibition nor local subsumption; instead, the behavior with the highest evaluation function controls the robot. The robot approaches and tracks whatever target is most salient (as influenced by perceptual uncertainty) in the field of view, or searches if no target is perceived.

Centralized In the centralized controller (seen in Figure 10.5), a single behavior (running on a desktop computer in our experiments) processes the visual information from all the robots and assigns robots to tasks according to the following algorithm:

  Step 1. Make a list $T$ of all targets and a list $R$ of all robots.
  Step 2. Sort $R$ by the number of targets in $T$ observed by each $r_n$. In the case of ties, rank robots secondarily on their approximate distance to targets.
  Step 3. Assign $r_1$ to cover the highest-priority target it is observing. Remove $r_1$ from $R$, and remove any targets observed by $r_1$ from $T$.
  Step 4. If any $r_n$ is observing any $t_m$, go to Step 2.
  Step 5. Assign all remaining members of $R$ to Search.

A behavior MUX is instantiated on each robot which acts as a multiplexor, receiving an assignment from the central controller and passing output from the appropriate Observer or Search behavior to the motor controller.

Figure 10.4: Greedy Control for CMOMMT. The Select behavior chooses to track whichever target is closest.
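Steps 1 to 5 of the centralized assignment can be sketched in plain C as follows. The sketch makes several assumptions not stated in the text: target index 0 is the highest priority, a smaller `dist` value wins ties in Step 2, and Step 3 picks the highest-priority target still in $T$. The names `central_assign`, `sees`, `dist`, and `SEARCH` are hypothetical.

```c
#include <stdbool.h>

#define N_ROBOTS  3
#define N_TARGETS 4
#define SEARCH   (-1)

/* sees[r][t] : robot r currently observes target t
   dist[r]    : robot r's approximate distance to its targets (tie-breaker)
   assign[r]  : output, target index for robot r, or SEARCH */
void central_assign(bool sees[N_ROBOTS][N_TARGETS],
                    const double dist[N_ROBOTS], int assign[N_ROBOTS])
{
    /* Step 1: all robots are free and all targets are open. */
    bool robot_free[N_ROBOTS], target_open[N_TARGETS];
    for (int r = 0; r < N_ROBOTS; r++) { robot_free[r] = true; assign[r] = SEARCH; }
    for (int t = 0; t < N_TARGETS; t++) target_open[t] = true;

    for (;;) {
        /* Step 2: find the free robot observing the most open targets. */
        int best = -1, best_count = 0;
        for (int r = 0; r < N_ROBOTS; r++) {
            if (!robot_free[r]) continue;
            int count = 0;
            for (int t = 0; t < N_TARGETS; t++)
                if (target_open[t] && sees[r][t]) count++;
            if (count > best_count ||
                (count == best_count && count > 0 && dist[r] < dist[best])) {
                best = r;
                best_count = count;
            }
        }
        if (best < 0) break;  /* Step 4: no free robot sees an open target */

        /* Step 3: assign its highest-priority open target, then remove the
           robot and every target it observes. */
        for (int t = 0; t < N_TARGETS; t++)
            if (target_open[t] && sees[best][t]) {
                if (assign[best] == SEARCH) assign[best] = t;
                target_open[t] = false;
            }
        robot_free[best] = false;
    }
    /* Step 5: robots still free keep assign[r] == SEARCH. */
}
```

In contrast to BLE, this procedure needs the observations of every robot gathered in one place before any assignment can be made.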

Random In the Random controller, only the random wander behavior is active, at all times.

Figure 10.5: Centralized Control for CMOMMT. The MUX behavior selects which target tracker's output will control the robot based on a signal from the central controller, which makes its decision based on input from all robots.

10.4 Results

Five trials of approximately 12 minutes each were run for each of the BLE, Greedy, and LS controllers; two trials of the Random controller were run for a baseline. The most important measures are of course the Observation and W-Observation metrics of Section 10.1. As seen in Figure 10.6, on CMOMMT the BLE approach, with an average Observation of 0.7963, scored significantly higher (p = 0.0017 on a pairwise t-test) than the Greedy approach at 0.69940 and the LS approach at 0.51995 (p < 0.0001). Using the W-CMOMMT metric, the relative performance of BLE was even better, scoring 0.860984 to the Greedy score of 0.717251 (p < 0.00001) and the LS score of 0.630928.

The distribution of the robots across the targets can be clarified with information on simultaneous target coverage, illustrated in Figure 10.7. On average, the BLE approach observed all four targets 41.68 percent of the time, and observed at least three targets 82.17 percent of the time. The Greedy approach averaged four targets only 23.27 percent of the time, and at least three targets 66.12 percent of the time. The LS approach observed four targets 9.71 percent of the time on average, and three or more 30.38 percent of the time. Thus, BLE achieved better distribution than either Greedy or LS. Surprisingly, BLE also achieved marginally higher observation of the highest-priority target than Local Subsumption (95.74 vs. 95.62 percent).

It can be seen from the target motion patterns of Figure 10.1a that during the last third of each trial, targets 2 and 3 were consistently close enough to be observed by a single robot. While the BLE approach resulted in a stable configuration of all four targets being observed for the majority of the final third of every trial, neither the Greedy nor the LS approaches maintained such a stable full observation in even a single trial.

Figure 10.8 illustrates typical patterns of observation; for each algorithm we have chosen a trace of the trial which scored closest to the average. In the BLE approach, the three highest-priority targets are covered fairly constantly, although the observing robots switch off; overlap of observation is minimal. The stable four-target observation for the last third of the trial can be seen, with robot 1 covering both targets 2 and 3. In the Greedy trace, there are clearly both larger periods of overlap and larger periods in which some targets are not covered at all. In LS, as expected, the highest-priority target was well and redundantly covered, while others were not.

In all trials, periods in which a particular robot seems not to be observing anything often reflect a blocked robot which is tracking a target, but not close enough to observe. This situation was common to Greedy and LS trials where robots often "queued up" behind other robots observing a salient target.
Further, our collision avoidance, resulting only from the translational velocity control described in Section 10.3, did not deal effectively with robots approaching each other from the side, as when both were trying to get close to the same target; this resulted in occasional collisions during the LS and Greedy trials. The task-space separation of the BLE approach proved to be very effective in preventing both of these physical-space problems of interference.

Further, observation of the different approaches in action led to the realization that the BLE approach was effective in overcoming perceptual limitations of the robots. While in the Greedy and LS trials robots tended to cluster around targets that were "better perceived" (due to details of the color-tracking implementation, and environment), exacerbating the physical-space problems described above, in the BLE trials, highly visible targets were quickly observed, driving other robots to pursue less salient targets.

Figure 10.6: Average Observation and W-Observation scores by algorithm (BLE, Greedy, Local Subsumption, Random). Error bars span 2 standard deviations; metrics described in Section 10.1.

10.5 Comparison to ALLIANCE

This section on BLE and CMOMMT demands comparison with the ALLIANCE architecture [Par98], a PAB extension which comes closest to our approach and provided the first (partial) physical-robot implementation of the CMOMMT task. We must first state that direct comparison of the two systems is not necessarily useful, as Ayllu and BLE are aimed at a lower level of abstraction than ALLIANCE; it is clear that ALLIANCE could be easily implemented using BLE, but our interest is in examining the range of tasks that can be covered by our small set of clean, standard, situated, language-level abstractions. The main differences are that ALLIANCE does not provide a homogeneous model for both inter- and intra-robot communication, and that ALLIANCE addresses robot failure and role (re)assignment through acquiescence and impatience.


[Figure 10.7: bar chart, "Simultaneous Targets Observed"; vertical axis "Percentage of Time" (0 to 0.6); for each algorithm (BLE, Greedy, Local Subsumption, Random), one bar each for 0 Targets, 1 Target, 2 Targets, 3 Targets, and 4 Targets.]

Figure 10.7: Simultaneous Observation scores, by algorithm. The percentage of time during which 1, 2, 3, 4, or no targets were observed, averaged over trials.

In ALLIANCE, inter-robot communication, by non-PAB means, is permitted only to a restricted class of motivational behaviors. As far as we can tell,² these motivational behaviors must be rewritten whenever new behaviors or capabilities are added to the system. In BLE, only inter-behavior connections (at most) need to be changed, and this can be done dynamically at run-time. Thus, ALLIANCE does not add scalability or physical distribution features to the PAB paradigm, but allows a limited set of behaviors to make use of "foreign" communication techniques. Further, the computation performed by motivational behaviors is fairly unconstrained, whereas BLE is a standard computation. ALLIANCE is therefore perhaps strictly more powerful (as future analysis will investigate), whereas with BLE we seek to ask such questions as "how powerful is (i.e., to what class of tasks can we successfully apply) this set of standardized techniques?"

Impatience and acquiescence are measures of a robot's eagerness to take over a task and its willingness to let a team-mate claim the task, respectively. They require that each robot maintain records of each team-mate's task performance, which evolve over time. This, and the means of explicit task claiming rather than cross-inhibition, make ALLIANCE a non-situated approach to cooperation (though the individual robots may interact with the environment in a situated manner).

² We are in communication with Parker to fully determine the implementation details of ALLIANCE.
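The contrast between explicit task claiming and cross-inhibition can be made concrete with a small model. The following is a minimal sketch in plain C, not Ayllu code and not the implementation used in the experiments. It assumes that, for each task, the robot with the highest local eligibility inhibits its peers, and that a robot already engaged in a higher-priority task does not compete for lower-priority ones. The names `ble_assign`, `NUM_ROBOTS`, `NUM_TASKS`, and `NO_TASK` are hypothetical.

```c
#define NUM_ROBOTS 3
#define NUM_TASKS  4
#define NO_TASK   (-1)

/* Hypothetical model of BLE-style arbitration (an assumption, not Ayllu).
 * elig[r][t] is robot r's local eligibility for task t; task 0 has the
 * highest priority. task_of[r] receives the task robot r ends up with. */
void ble_assign(double elig[NUM_ROBOTS][NUM_TASKS], int task_of[NUM_ROBOTS])
{
    for (int r = 0; r < NUM_ROBOTS; r++)
        task_of[r] = NO_TASK;

    for (int t = 0; t < NUM_TASKS; t++) {
        int best = -1;
        for (int r = 0; r < NUM_ROBOTS; r++) {
            if (task_of[r] != NO_TASK)
                continue;            /* busy with a higher-priority task */
            if (elig[r][t] <= 0.0)
                continue;            /* robot r cannot perceive this task */
            if (best < 0 || elig[r][t] > elig[best][t])
                best = r;            /* most eligible peer so far */
        }
        if (best >= 0)
            task_of[best] = t;       /* peers are inhibited for task t */
    }
}
```

No robot keeps a record of its team-mates' past performance here; the outcome depends only on the eligibilities broadcast at that moment, which is the sense in which the text calls the approach situated.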


[Figure 10.8: "Observation of Targets Over Time, By Robot"; four timeline panels (BLE, Greedy, Local Subsumption, Random), each with one row per robot (R1-R3) for each target (T1-T4); the horizontal axis is time.]

Figure 10.8: Observation Over Time, by Algorithm. Each quadrant shows coverage of targets T1-T4 by robots R1-R3 during an average run of each algorithm. The horizontal axis is time. In the Local Subsumption quadrant, for example, we can see that Target 3 was covered by both Robots 1 and 2 for the first third of the trial. The BLE example demonstrates that BLE achieves high coverage of high-priority targets with low redundancy, even though robots switch targets.

At this point, direct CMOMMT comparisons are not possible. While Parker claims the first implementation of CMOMMT on physical robots [Par99], it is described as a proof-of-concept test of individual robot behaviors without any group coordination, with minimal anecdotal results. The implementation also relies on a 360-degree field of view and a shared global coordinate system. She has reported results from simulation studies, comparing CMOMMT against randomly moving observers.
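The per-robot coverage timelines of Figure 10.8 and the simultaneous-observation fractions of Figure 10.7 are related by a simple count. The sketch below, in plain C, shows one way such fractions could be derived from a coverage log for a single trial. It assumes the log is sampled at fixed intervals; the array layout and the name `simultaneous_fractions` are hypothetical, and this is not the analysis code used for the figures.

```c
#define NUM_ROBOTS  3
#define NUM_TARGETS 4

/* covered[s][r][t] is nonzero when robot r observes target t at sample s.
 * fraction[k] receives the fraction of samples in which exactly k distinct
 * targets were observed (k = 0 .. NUM_TARGETS). */
void simultaneous_fractions(int num_samples,
                            int covered[][NUM_ROBOTS][NUM_TARGETS],
                            double fraction[NUM_TARGETS + 1])
{
    int count[NUM_TARGETS + 1] = {0};

    for (int s = 0; s < num_samples; s++) {
        int observed = 0;
        for (int t = 0; t < NUM_TARGETS; t++) {
            int seen = 0;
            for (int r = 0; r < NUM_ROBOTS; r++)
                if (covered[s][r][t])
                    seen = 1;        /* redundant observers count once */
            observed += seen;
        }
        count[observed]++;
    }

    for (int k = 0; k <= NUM_TARGETS; k++)
        fraction[k] = num_samples > 0 ? (double)count[k] / num_samples : 0.0;
}
```

Because a target watched by two robots still counts once, redundant coverage of the kind visible in the Local Subsumption panel lowers the fractions for higher target counts without raising any other.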


Chapter 11 Conclusions and Timetable of Future Work

We have demonstrated that a situated approach to role assumption can be effective in physical domains to build flexible, scalable systems; that this approach can be extended to abstract task spaces with similar effectiveness; and that the Broadcast of Local Eligibility provides a simple, general tool for building such abstract task spaces.

Experimentation has shown that the PAB paradigm, and BLE in particular, are able to support fully distributed, efficient coordination of teams of robots using simple and general low-level components. The resulting systems are scalable, robust, and flexible, adapting to changing environmental conditions and resource availability. Cross-subsumption can assign heterogeneous robots to tasks appropriately with no need for explicit negotiation or recognition. PAB is a principled approach, providing standard, well-defined abstractions for behavior coordination. Behaviors are fully encapsulated, facilitating "bottom up" system design and testing.

In the future, we plan to thoroughly analyze the class of tasks to which the PAB/BLE approach can scale. Papers in mobile robotics often make claims such as "[this architecture is] superior to subsumption for those applications which require higher-level reasoning to determine which behavior to activate" [Par99]. We plan to continue experimentation to increase the capabilities of behavior-based systems, and to investigate and adapt analytic techniques in order to rigorously address such questions of relative capability. The development of simple, standardized coordination techniques such as PAB/BLE is an important step in the difficult problem of constructing analyzable behavior-based robotic systems.

We also show that situatedness can be extended to include situatedness in abstract behavior spaces, and that many if not all of the benefits of situatedness in the physical world are obtainable through situatedness in such abstract spaces. Further work and analysis will cast light on the nature of the non-symbolic "ethereal" interaction that we have claimed is analogous to physical-world interaction.


11.1 Future Experiments with BLE

Closure of CMOMMT. We will run further experimental trials which focus on robot and communication failures to begin to quantify robustness, as well as trials with larger numbers of robots and targets to demonstrate scalability.
End of October, 2000

Simulated Elevator Control. Our next round of experiments applies BLE to the problem of elevator dispatching as described in [CB96], which compares a number of algorithms (both heuristic and learning-based) in a simulated 10-story building with 4 elevator cars of limited capacity. The simulator models traffic flow with passenger arrival rates varying at five-minute intervals, and models the elevator cars at the level of velocity, stop time, turn-around time, and loading time. The goal is to minimize average wait time, total system time, and/or the percentage of passengers whose wait is longer than a specified dissatisfaction threshold. We have obtained a copy of the simulator and comparison algorithms used in [CB96], and are currently in the process of translating the code from FORTRAN and integrating it with Ayllu. The simulated system allows us direct comparison with both industry-standard and other researchers' elevator algorithms and, running much faster than real-time, will allow us to extensively test robustness to various types of failures.
Mid-December, 2000
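The three evaluation measures named above can be stated compactly. The sketch below, in plain C, is illustrative only and is not taken from the [CB96] simulator. It assumes that wait time runs from a passenger's arrival to boarding and that system time runs from arrival to leaving the car; the `Passenger` record, the `ElevatorMetrics` type, and the function name `evaluate` are hypothetical, and the threshold is a parameter rather than a value from the source.

```c
typedef struct {
    double arrive;   /* time the passenger arrives at a floor */
    double board;    /* time the passenger boards a car */
    double leave;    /* time the passenger leaves the car */
} Passenger;

typedef struct {
    double avg_wait;            /* mean of (board - arrive) */
    double avg_system;          /* mean of (leave - arrive) */
    double pct_over_threshold;  /* percent of passengers with wait > threshold */
} ElevatorMetrics;

/* Computes the three measures over n passengers (assumed definitions). */
ElevatorMetrics evaluate(const Passenger *p, int n, double threshold)
{
    ElevatorMetrics m = {0.0, 0.0, 0.0};
    int over = 0;

    if (n <= 0)
        return m;

    for (int i = 0; i < n; i++) {
        double wait = p[i].board - p[i].arrive;
        m.avg_wait   += wait;
        m.avg_system += p[i].leave - p[i].arrive;
        if (wait > threshold)
            over++;
    }
    m.avg_wait   /= n;
    m.avg_system /= n;
    m.pct_over_threshold = 100.0 * over / n;
    return m;
}
```

The same record would also fit the material transport task if "passenger" is read as a unit of material moving from source to sink, since that task is to be evaluated in terms of wait time and system time as well.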

Material Transport. The material transport task performed by teams of physical robots will be similar to the simulated elevator control task. Multiple sources will provide materials of different priorities at different rates, which must be delivered to appropriate sinks. The evaluation will again be in terms of wait time and system time. In these experiments we will focus on issues of heterogeneous robots each capable of a subset of the necessary subtasks.
Mid-February, 2001

11.2 Common Control Language

We will complete the implementation of the basics of the port-arbitrated behavior-based common control language (PABCL).
End of November, 2000

Though we will perform some simple, informal experiments for testing, more thorough testing and experimentation of PABCL will thereafter be handed over to a team at the Naval Undersea Warfare Center in Newport, RI.


11.3 Analysis of Situated Systems

Literature Review: Analysis of situated systems, including information invariants, Horswill's comparative techniques, and situated automata.
Mid-January, 2001

Literature Review: Philosophy of situated systems, including situated action, situated cognition, situated automata, embodied cognition, and indexical knowledge.
End of February, 2001

11.4 Dissertation and Defense

Dissertation Writing: January-April, 2001
Dissertation Defense: Early May, 2001


Reference List

[AB98] Ronald C. Arkin and Tucker Balch. Cooperative multiagent robotic systems. In David Kortenkamp, R. Peter Bonasso, and Robin Murphy, editors, Artificial Intelligence and Mobile Robots: Case Studies of Successful Robot Systems, pages 277-296. AAAI Press/The MIT Press, Cambridge, MA, 1998.
[AC87] P. E. Agre and D. Chapman. Pengi: an implementation of a theory of activity. In Proceedings of 6th Annual Meeting of AAAI, 1987.
[ADGP90] S. Aron, J. L. Deneubourg, S. Goss, and J. M. Pasteels. Functional self-organization illustrated by inter-nest traffic in ants: The case of the argentinian ant. In W. Alt and G. Hoffman, editors, Biological Motion, volume 89 of Lecture Notes in BioMathematics, pages 533-547. Springer-Verlag, Berlin, 1990.
[AH92] R. Arkin and J. D. Hobbs. Dimensions of communication and social organization in multi-agent robotic systems. In Proceedings of the 2nd International Conference on Simulation of Adaptive Behavior, 1992.
[Ark98a] R. Arkin. The 1997 aaai mobile robot competition and exhibition. AI Magazine, 19:3, 1998.
[Ark98b] R. Arkin. Behavior-Based Robotics. MIT Press, 1998.
[ASK+98] M. Asada, P. Stone, H. Kitano, B. Werger, Y. Kuniyoshi, A. Drogoul, D. Duhaut, M. Veloso, H. Asama, and S. Suzuki. The robocup physical agent challenge: Phase i. Applied Artificial Intelligence, 12, 1998.
[BA95] T. Balch and R. Arkin. Motor schema-based formation control for multi-agent robot teams. In Proceedings of the International Conference on Multiagent Systems (ICMAS), pages 10-16, San Francisco, 1995.
[BA98] T. Balch and R. Arkin. Behavior-based formation control for multi-robot teams. IEEE Transactions on Robotics and Automation, 14(6):926-939, December 1998.
[BFG+97] R. P. Bonasso, R. J. Firby, E. Gat, D. Kortenkamp, D. P. Miller, and M. G. Slack. Experiences with an architecture for intelligent, reactive agents. Experimental and Theoretical Artificial Intelligence, 9(1), March 1997.
[BHD94] R. Beckers, O. E. Holland, and J.L. Deneubourg. From local actions to global tasks: Stigmergy and collective robotics. In Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems, pages 181-189. MIT Press, 1994.
[BR95] R. A. Brooks and C. Rosenberg. L - a common lisp for embedded systems. In Proceedings of the Lisp Vendors and Users Conference, 1995.
[Bro86] Rodney A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1):14-23, April 1986.

[Bro90a] R. A. Brooks. The behavior language; user's guide. Memo 1227, MIT AI Lab, April 1990.
[Bro90b] R. A. Brooks. Challenges for complete creature architectures. In Proceedings of the 1st International Conference on Simulation of Adaptive Behavior, 1990.
[Bro90c] Rodney A. Brooks. Elephants don't play chess. Robotics and Autonomous Systems, 6:3-16, 1990.
[Bro91a] Rodney A. Brooks. Intelligence without reason. In Proceedings of IJCAI, pages 569-595, Sydney, Australia, 1991.
[Bro91b] Rodney A. Brooks. Intelligence without representation. Artificial Intelligence, 47:139-159, 1991.
[Bro99] R. A. Brooks. Cambrian Intelligence. MIT Press, 1999.
[BSV96] M. Bowling, P. Stone, and M. Veloso. Predictive memory for an inaccessible environment. In International Conference on Intelligent Robots and Systems, 1996.
[CB96] Robert H. Crites and Andrew G. Barto. Improving elevator performance using reinforcement learning. In D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8. MIT Press, Cambridge, MA, 1996.
[Con90] Jonathan H. Connell. Minimalist Mobile Robotics. Academic Press, 1990.
[DGF+91] J. L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, and L. Cretien. The dynamics of collective sorting: Robot-like ants and ant-like robots. In Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 356-363. MIT Press, 1991.
[DGS+90] J. Deneubourg, S. Goss, G. Sandini, F. Ferrari, and P. Dario. Self-organizing collection and transport of objects in unpredictable environments. In USA-Japan Symposium on Flexible Automation, Kyoto, Japan, July 1990.
[DJMW95] G. Dudek, M. Jenkin, E. Milios, and D. Wilkes. Experiments in sensing and communication for robot convoy navigation. In Proceedings of the IEEE International Conference on Intelligent Robots and Systems, 1995.
[DJR94] B. Donald, J. Jennings, and D. Rus. Information invariants for distributed manipulation. In R. Wilson and J.-C. Latombe, editors, The First Workshop on the Algorithmic Foundations of Robotics. A.K. Peters, 1994.
[DJR97] B. Donald, J. Jennings, and D. Rus. Minimalism + distribution = supermodularity. Journal of Experimental and Theoretical Artificial Intelligence, 9:2-3, 1997.
[DMS99] D. Dedeoglu, M. J. Mataric, and G. S. Sukhatme. Incremental, on-line topological map building with a mobile robot. In Sensor Fusion and Decentralized Control in Autonomous Robotic Systems, Proceedings of SPIE, pages 123-139, 1999.
[Don95] B. R. Donald. Information invariants in robotics. Artificial Intelligence, 72, 1995.
[DW00] Christiane N. Duarte and Barry Brian Werger. Defining a common control language for multiple autonomous vehicle operations. In Proceedings of OCEANS 2000 MTS/IEEE, pages 1861-1865, September 2000.

[FM96] Miguel Schneider Fontan and Maja J. Mataric. A study of territoriality: the role of critical mass in adaptive task division. In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cape Cod, September 1996.
[GADP89] S. Goss, S. Aron, J. L. Deneubourg, and J. M. Pasteels. Self-organized shortcuts in the argentine ant. Naturwissenschaften, 76:579-581, 1989.
[Gat92] E. Gat. Integrating planning and reacting in a heterogeneous asynchronous architecture for controlling real-world mobile robots. In Proceedings of the Tenth National Conference on Artificial Intelligence, 1992.
[GD91] S. Goss and J. L. Deneubourg. Harvesting by a group of robots. In Proceedings of the First European Conference on Artificial Life. MIT Press, 1991.
[GM96] Dani Goldberg and Maja J Mataric. Interference as a tool for designing and evaluating multi-robot controllers. In Proceedings of AAAI-97, pages 637-642, Providence, RI, July 1996.
[GM00] Brian Gerkey and Maja Mataric. Murdoch: Publish/subscribe task allocation for heterogeneous agents. In Proceedings of Autonomous Agents, 2000.
[Gor99] Deborah M. Gordon. Ants at Work. The Free Press, New York, 1999.
[HM00] Owen Holland and Chris Melhuish. Stigmergy, self-organisation, and sorting in collective robotics. Artificial Life, 5:2:173-202, 2000.
[HW90] Bert Holldobler and Edward O. Wilson. The Ants. The Belknap Press of Harvard University Press, Cambridge, Massachusetts, 1990.
[Jen95] N. Jennings. Controlling cooperative problem solving in industrial multi-agent systems using joint intentions. Artificial Intelligence, 75(2):195-240, 1995.
[JKW98] J. Jennings and C. Kirkwood-Watts. Distributed mobile robotics by the method of dynamic teams. In Proceedings of the Conference on Distributed Autonomous Robot Systems (DARS), Karlsruhe, Germany, May 1998.
[KAK+95] H. Kitano, M. Asada, Y. Kuniyoshi, I. Noda, and E. Osawa. Robocup: The robot world cup initiative. In Proceedings of IJCAI-95 Workshop on Entertainment and AI/ALife, 1995.
[KDPT97] K. Kwok, B. Driessen, C. Phillips, and C. Tovey. Analyzing the multiple-target-multiple-agent scenario using optimal assignment algorithms. In Sensor Fusion and Decentralized Control in Autonomous Robotic Systems, Proceedings of SPIE 3209, pages 111-122, 1997.
[Kra97] S. Kraus. Negotiation and cooperation in multi-agent environments. Artificial Intelligence, 94:79-97, 1997.
[KZ92] C. R. Kube and H. Zhang. Collective robotic intelligence. In Proceedings of the 2nd International Conference on Simulation of Adaptive Behavior, 1992.
[Lit94] M. Littman. Memoryless policies: theoretical limitations and practical results. In Proceedings of the 3rd International Conference on Simulation of Adaptive Behavior, 1994.
[Mae90] Pattie Maes. Situated agents can have goals. Robotics and Autonomous Systems, 6(1-2):49-70, 1990.

[Mat90] Maja J Mataric. Navigating with a rat brain: A neurobiologically-inspired model for robot spatial representation. In Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats. MIT Press, 1990.
[Mat95a] M. Mataric. Issues and approaches in the design of collective autonomous agents. Robotics and Autonomous Systems, 16, 1995.
[Mat95b] Maja J Mataric. Designing and understanding adaptive group behavior. Adaptive Behavior, 4:1:51-80, December 1995.
[Mat97] Maja J. Mataric. Behavior-based control: Examples from navigation, learning, and group behavior. Journal of Experimental and Theoretical Artificial Intelligence, 9(2-3):323-336, 1997.
[MM99] Francois Michaud and Maja J Mataric. Representation of behavioral history for learning in nonstationary conditions. Robotics and Autonomous Systems, 29(2):1-14, 1999.
[MV99] Francois Michaud and Minh Tuan Vu. Managing robot autonomy and interactivity using motives and visual communication. In Proc. Conf. Autonomous Agents, pages 160-167, 1999.
[MW88] Martin Muller and Rudiger Wehner. Path integration in desert ants: Cataglyphis fortis. In Proceedings of the National Academy of Sciences, volume 85, pages 5287-5290, 1988.
[NM00] Monica Nicolescu and Maja Mataric. Learning cooperation from human-robot interaction. In Proceedings of DARS, October 2000.
[Nor97] D. Normile. Robocup soccer match is a challenge for silicon rookies. Science, 277, September 1997.
[OT97] S. Onn and M. Tennenholtz. Determination of social laws for multi-agent mobilization. Artificial Intelligence, 95:155-167, 1997.
[Par93] L. E. Parker. Designing control laws for cooperative agent teams. In Proceedings of the IEEE International Conference on Robotics and Automation, pages 582-587, 1993.
[Par96] L. E. Parker. On the design of behavior-based multi-robot teams. Advanced Robotics, 10:547-578, 1996.
[Par97] L. E. Parker. Behavior-based cooperative robotics applied to multi-target observation. In R. Bolles, H. Bunke, and H. Noltemeier, editors, Intelligent robots: Sensing, modeling, and planning. World Scientific, 1997.
[Par98] L. E. Parker. Alliance: An architecture for fault tolerant multi-robot cooperation. IEEE Transactions on Robotics and Automation, 14, 1998.
[Par99] L. E. Parker. Cooperative robotics for multi-target observation. Intelligent Automation and Soft Computing, 5:5-19, 1999.
[PM00] Paolo Pirjanian and Maja Mataric. Multi-robot target acquisition using multiple objective behavior coordination. In IEEE International Conference on Robotics and Automation, San Francisco, April 2000.
[SDHK97] J. Salido, J. M. Dolan, J. Hampshire, and P. Khosla. A modified reactive control framework for cooperative mobile robots. In Sensor Fusion and Decentralized Control in Autonomous Robotic Systems, Proceedings of SPIE 3209, 1997.

[Sim69] Herbert A. Simon. The Sciences of the Artificial. MIT Press, Cambridge, Massachusetts, 1969.
[SM94] M. Sahota and A. Mackworth. Can situated robots play soccer? In Conference of the Canadian Society for Computational Studies of Intelligence, 1994.
[SMM99] G. S. Sukhatme, J. F. Montgomery, and M. J. Mataric. Design and implementation of a mechanically heterogeneous robot group. In Sensor Fusion and Decentralized Control in Autonomous Robotic Systems, Proceedings of SPIE, pages 111-122, 1999.
[Ste94] L. Steels. The artificial life roots of artificial intelligence. Artificial Life, 1(1/2), 1994.
[SV99] Peter Stone and Manuela M. Veloso. Task decomposition and dynamic role assignment for real time strategic teamwork. In Intelligent Agents V - Proceedings of the Fifth International Workshop on Agent Theories, Architectures, and Languages (ATAL '98), Heidelberg, 1999. Springer Verlag.
[SV00] P. Stone and M. Veloso. Multi-agent systems: a survey from a machine learning perspective. Autonomous Robotics, 8(3), July 2000.
[Tam97a] M. Tambe. Implementing agent teams in dynamic multi-agent environments. Applied Artificial Intelligence, 1997.
[Tam97b] Milind Tambe. Towards flexible teamwork. Journal of Artificial Intelligence Research, 7:83-124, 1997.
[TGGD91] Guy Theraulaz, Simon Goss, Jacques Gervet, and Jean-Louis Deneubourg. Task differentiation in polistes wasp colonies: a model for self-organizing groups of robots. In Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 346-355. MIT Press, 1991.
[VSH+98a] M. Veloso, P. Stone, K. Han, and S. Achim. The cmunited-97 small robot team. In RoboCup-97: The First Robot World Cup Soccer Games and Conferences, 1998.
[VSH98b] M. Veloso, P. Stone, and K. Han. The cmunited-97 robotic soccer team: Perception and multiagent control. In Proceedings of Autonomous Agents, pages 78-85, 1998.
[VySM00] R. Vaughan, K. Støy, G. Sukhatme, and M. Mataric. Blazing a trail: Insect-inspired resource transportation by a robot team. 2000.
[Wer98] Barry Brian Werger. Principles of minimal control for comprehensive team behavior. In Proceedings of the International Conference on Robotics and Automation, 1998.
[Wer99a] Barry Brian Werger. Ayllu: Distributed Behavior-Based Control; User's Manual. ActivMedia Robotics, 1999. Available at http://robots.activmedia.com.
[Wer99b] Barry Brian Werger. Cooperation without deliberation: A minimal behavior-based approach to multi-robot teams. Artificial Intelligence, 110:293-320, 1999.
[Wer00] Barry Brian Werger. Ayllu: Distributed port-arbitrated behavior-based control. In Proceedings of DARS, October 2000.
[WM96] Barry Brian Werger and Maja J Mataric. Robotic food chains: Externalization of state and program for minimal-agent foraging. In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior: From Animals to Animats 4, pages 625-634. MIT Press, 1996.

[WM99] Barry Brian Werger and Maja J Mataric. Exploiting embodiment in multi-robot teams. Technical Report 99-378, USC Institute for Robotics and Intelligent Systems, 1999.
[WM00] Barry Brian Werger and Maja J Mataric. From insect to Internet: situated control for networked robot teams. Annals of Mathematics and Artificial Intelligence, 2000.
[WNM95] Z.-D. Wang, E. Nakano, and T. Matsukawa. Designing behavior of a multi-robot system for cooperative object manipulation. In Proceedings of the International Symposium on Microsystems, Intelligent Materials, and Robots, 1995.


Appendix A Ayllu Code for CMOMMT

Below we include all of the Ayllu code necessary for the CMOMMT task of Section 10.

    #include
    #include    /* for MakeWanderBeh and MakeSafeVelBeh */

A.1 The Observer Behavior

A.1.1 Behavior Class Definition

    ayDefBehaviorClass(ObserveTarget) {
      ayINTERFACE {
        ayIntPort(Target, 0);
        ayStringPort(VisBlobs, "");
        ayFloatPort(Observed, 0);
        ayFloatSumPort(OthersObserved, 0);
        ayIntPort(Rotate, 0);
      }
      ayPROCESSES {
        ayInitProcess(TurnToTarget, ayMAXRATE);
        ayInitProcess(BLE_Select, ratepersecond(1));
      }
    }

A.1.2 Process Definition

    ayDefProcess(TurnToTarget) {
      ayLocalPort Target, VisBlobs, Observed, OthersObserved, Rotate;
      int targnum = ayReadIntPort(Target);
      char *blobs = ayReadStrPort(VisBlobs);
      double eligibility = ayBlobWidth(targnum, 0, blobs);

      ayWriteIntPort(Rotate, ayVISCENTERX - ayBlobX(targnum, 0, blobs));
      if (eligibility >= MIN_OBSERVE_WIDTH) {
        ayWriteFloatPort(Observed, eligibility);
        /* BLE_Local is local eligibility estimate */
        ayWriteFloatPort(BLE_Local, eligibility + ayReadFloatPort(OthersObserved));
      }
      ayResetPort(OthersObserved);
    }


    #define MakeObserver(num, next)                                  \
      ayInitBehavior(Observer, OBS##num);                            \
      aySendIntMessage(OBS##num, Target, num);                       \
      ayConnect(Vision, ACTS, BlobInfo, OBS##num, VisBlobs);         \
      ayOverrideOut(OBS##num, Rotate, next, Rotate, seconds(0.2));   \
      ayBLE_Connect(OBS##num, Rotate, Peers);                        \
      ayLocalBroadcast(OBS##num, Observed, OthersObserved)

A.2 main() Function

    void main ()
    {
      ayBROADCAST Peers = ayMakeBroadcast("10.255.255.255");
      ayHOST Vision = ayMakeHost(localhost, ayVISION);

      ayInitPioControl("/dev/ttyS0");    /* for Pioneer mobile robot */
      ayInitIPComms();

      ayMakeSafeVelBeh(VEL);
      ayMakeWanderBeh(WANDER);
      MakeObserver(5, WANDER);
      MakeObserver(4, OBS5);
      MakeObserver(3, OBS4);
      MakeObserver(2, OBS3);
      MakeObserver(1, OBS2);

      ayRunBehaviors();
    }
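The sequence of MakeObserver calls above builds a priority chain in which each observer's Rotate output overrides that of the next behavior down, ending at the wander behavior. The following plain-C model is a sketch of that arbitration and is not part of Ayllu. It assumes the override connections compose so that the highest-priority behavior currently producing output determines the final Rotate command; the `Behavior` type and `arbitrate_rotate` are hypothetical names.

```c
typedef struct {
    const char *name;  /* e.g. "OBS1" ... "OBS5", "WANDER" */
    int active;        /* nonzero if the behavior is currently writing Rotate */
    int rotate;        /* the Rotate value it writes */
} Behavior;

/* chain[0] has the highest priority; the last element is the default
 * behavior. Returns the Rotate value of the first active behavior and
 * stores its name in *winner (if winner is non-null). Returns 0 and
 * stores a null pointer when no behavior is active. */
int arbitrate_rotate(const Behavior *chain, int n, const char **winner)
{
    for (int i = 0; i < n; i++) {
        if (chain[i].active) {
            if (winner)
                *winner = chain[i].name;
            return chain[i].rotate;   /* overrides everything below it */
        }
    }
    if (winner)
        *winner = 0;
    return 0;
}
```

In this model, an observer that is inhibited by a more eligible peer is simply marked inactive, so control falls through to the next target's observer or, if none is active, to wandering.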


Appendix B Glossary of Terms

Ayllu (48) our language for development of networked PAB control systems.

Bn (55) the name of a behavior which generates the observable behavior of performance of task Tn.

basis behavior (8) a set of minimal behaviors that are sufficient to be combined into solutions to a class of tasks.

behavior (10) a computational process resulting from instantiation, connection, and parameterization of a BPM. A behavior generally receives sensory input and generates actuator output as observable behavior.

BPM (10) behavior-producing module; program code which, when properly interfaced to sensors and actuators, forms a behavior (which generates observable behavior).

CMOMMT (62) cooperative multi-robot observation of multiple moving targets.

connection (11) unidirectional data path between a source port and a destination port. Anything written to a source port will be propagated along all outgoing connections to their destination ports. Connections may be suppressive, inhibitory, or overriding, as discussed in Section 4.2.

indexical-functional perception (6) task-related perception of the roles played by other agents in the environment, relative to oneself, rather than to identities particular to an agent.

local eligibility (55) eligibility (for a task) determined through a robot's own sensory input.

observable behavior (10) activity of a robot, as observed by humans, resulting from

PAB (10) port-arbitrated behavior-based control.

PABCL (53) port-arbitrated behavior-based control language.

peer behavior (55) a behavior of the same name on another robot on the local network.

peer group (55, 57) all behaviors of the same name on all robots on the local network.

port (10) register that holds a single data item or "message," which may be simple or complex (i.e., an integer or an arbitrary data structure). Ports are local to behaviors, and are addressed by a pair (behaviorname, portname) or a triple (robotname, behaviorname, portname).

Rn one of the robots on the local network.

role (55) all behaviors Bi ... Bj which are based on the same BPM, and which are connected and parameterized in the same way except for their relative priorities, are said to fulfill the same role.

shallow computation (3) a minimal computational "path" between sensor input and actuator output.

stigmergy (15) the production of a certain behavior in agents as a consequence of the effects produced in the local environment by previous behavior.

strong cooperation (1) describes performance of tasks that require distinct roles to be concurrently filled, and which cannot be performed by a single robot.

Subsumption (13) a control hierarchy in which higher-priority behaviors override the outputs of lower-priority behaviors.

Tn (55) the task performed by behavior Bn.

task (55) in BLE, the goal-oriented observable behavior resulting from the outputs of a BLE-arbitrated behavior. Also used informally to refer to the overall task being performed by a group of robots.
