Building Advanced Autonomous AI Systems for Large Scale Real Time Simulations

John E. Laird and Randolph M. Jones
University of Michigan
1101 Beal Ave.
Ann Arbor, MI 48109-2110
[email protected], [email protected]

In 1992, researchers from the University of Michigan, the University of Southern California Information Sciences Institute (USC/ISI), and Carnegie Mellon University (CMU) started work on a project to develop synthetic pilots for large-scale distributed simulations. The goal was to develop real-time, high-fidelity models of human behavior. We built our synthetic pilots using an AI architecture called Soar (Laird, Newell, & Rosenbloom, 1991; Rosenbloom, Laird, & Newell, 1993). In October 1997, our entities participated in the Synthetic Theater of War 97 (STOW-97). STOW-97 (Ceranowicz, 1998) was a DOD Advanced Concept Technology Demonstration (ACTD) that was integrated with United Endeavor 98-1, a USACOM-sponsored exercise. As an ACTD, the overall goal of STOW-97 was to permit an early and inexpensive evaluation of advanced technologies that show promise for improving military effectiveness. STOW-97 provided a rigorous test of our entities because it involved large numbers of active entities (~4,000) in a realistic battlefield with a synthetic environment (weather, day, night, dynamic terrain).

In this paper, we describe the work done by the University of Michigan group, which developed pilots for fixed-wing aircraft, culminating in a system called TacAir-Soar. The group at USC/ISI developed pilots for helicopters (Hill et al., 1997), while the group at CMU did research on real-time natural language processing (Lehman et al., 1995). This paper describes our approach to developing synthetic pilots for STOW-97 and its applicability to developing AI systems for computer games. Earlier papers discussed the lessons learned from developing the behaviors for STOW-97 (Laird et al., 1998a) and from integrating our software within the context of the larger simulation arena (Laird et al., 1998b).

The point of our research is to develop a general, integrated architecture for building AI systems. Over the last fifteen years, Soar has been used for research in AI, including problem solving, planning, and learning. Soar has also been used as the basis for detailed modeling of human behavior. Our agents for STOW are the largest and most complex systems ever developed in Soar. They, together with the helicopter entities developed by USC/ISI, are probably the first AI systems used in a large-scale military exercise.

The application of this technology to computer games appears obvious. Current first-person perspective competitive games, such as Doom, Quake, or Descent, have relatively simple AI opponents and depend on numbers and firepower to make the game interesting. We wish to explore games where the AI opponent is on a par with humans, so that each level is more like a deathmatch against one or two opponents than a slugfest against many. We also envision high-fidelity AI systems that can substitute for real players in long-term role-playing games, where continual human play is not possible.

The remainder of the paper is organized according to the main classes of capabilities required by autonomous AI entities. Such agents must:

1. Exist in a complex environment
2. Respond to relevant changes in the environment
3. Deliberate about the selection and execution of actions
4. Use goals
5. Generate human-like behavior
6. Coordinate behavior with other entities
7. Behave with low computational expense

For each of these capabilities, we review its contribution to intelligent behavior, illustrate how it is supported in Soar using examples from STOW-97, summarize any relevant results from STOW-97, and, when appropriate, hypothesize about its application to computer games.

1. Exist in a complex environment

Much of the earlier work in AI problem solving was done without concern for integration with an external environment. Early AI systems solved problems that existed only in their own "mind". That approach ignores many difficult issues in integrating an intelligent system into a dynamic environment with a variety of sensors and motor systems.

In this project, our AI system was embedded in a realistic real-time simulation of a battlefield, which included a large, detailed terrain database. The terrain database was 575 megabytes (3.7 GB for visualization) and covered 500 x 775 kilometers. The database included 13.5K buildings, 20K destructible objects, runways, roads, bridges, bushes, rivers, bodies of water, etc. During the exercise there were up to 3,700 computer-generated vehicles that covered all U.S. military services as well as opposition forces. The simulation was completely distributed, running on over 300 computers at six sites. Time in the simulation corresponded to time in the real world, and active-duty military personnel participated in the simulation by observing the battles via simulated messages and sensors and giving real-world commands to the computer-generated forces. Our participation involved fielding all of the U.S. fixed-wing aircraft, which included up to 100 airplanes flying at one time, distributed across 20-30 machines.

The integration of our system in this environment involved interfacing our AI architecture with a system called ModSAF (Calder et al., 1993), as shown in Figure 1. ModSAF includes all of the necessary software for interacting over the network with other entities in the simulation, as well as a local copy of the terrain database. ModSAF also provides simulations of the vehicles, weapons, and sensors that are controlled by TacAir-Soar. Soar is written in ANSI C and runs on all major platforms and under all major operating systems. In a typical configuration, we run all of the software as a single process under Linux on a Pentium Pro class machine with 256 Mbytes of memory. The large memory is required because of the terrain database. Six to ten TacAir-Soar entities run on top of ModSAF, with a round-robin scheduler providing service to the TacAir-Soar entities and ModSAF. To provide realistic behavior, the scheduler must run through all of the entities at least 3-4 times a second.
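To make this scheduling arrangement concrete, the sketch below interleaves stubbed-out entity decision cycles with a stubbed ModSAF tick. The names (Agent, agent_step, modsaf_tick) are ours for illustration; they are not the actual Soar or ModSAF interfaces.

/* Round-robin service of TacAir-Soar entities and ModSAF.
   All names here are illustrative stubs, not the real APIs. */
#include <stdio.h>
#include <stddef.h>

typedef struct { const char *callsign; } Agent;

/* One Soar decision cycle for one entity (stubbed out). */
static void agent_step(Agent *a) { printf("%s: decision cycle\n", a->callsign); }

/* Advance ModSAF's vehicle, sensor, and weapon simulations (stubbed out). */
static void modsaf_tick(void) { printf("ModSAF: simulation tick\n"); }

int main(void)
{
    Agent agents[] = { { "Hawk121" }, { "Hawk122" } };
    size_t n = sizeof agents / sizeof agents[0];

    /* The scheduler must complete this loop at least 3-4 times per
       second of simulated time to keep behavior realistic. */
    for (int frame = 0; frame < 4; frame++) {
        for (size_t i = 0; i < n; i++)
            agent_step(&agents[i]);
        modsaf_tick();
    }
    return 0;
}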

[Figure 1 diagram: multiple Soar agents, each consisting of TacAir rules and a working memory, run on top of the Soar-ModSAF-Interface; the interface connects to the simulation interface (sensors, weapons, vehicle, communication), which communicates over the network with other simulators.]

Figure 1: System organization for TacAir-Soar

Soar is a rule-based system, with TacAir-Soar being the combination of Soar and the rules for tactical air missions. The interaction between TacAir-Soar and ModSAF is modulated by the Soar-ModSAF-Interface (SMI), which calls functions in ModSAF to gather all of the necessary input data for the TacAir-Soar agents (Schwamb et al., 1994). Each entity has access only to information from its own sensors, so there is no possibility of "cheating". The SMI attempts to organize data so that TacAir-Soar has available the same information a pilot would have, including data from radar and vision sensors, as well as data on the current status of the plane and messages received via its radios. All of this information is added directly to the working memory of each entity and updated each cycle of the simulation. For a single entity, there are over 200 inputs and over 30 different types of outputs for controlling the plane. Based on paper designs for Descent and Quake II, we expect there would be 120-150 inputs and 15-20 outputs for comparable computer games.

2. Respond to relevant changes in the environment

All knowledge about how to behave in an environment is encoded as rules in Soar. TacAir-Soar has approximately 5,200 rules, which cover almost all of the missions flown by U.S. pilots. The rules test input information as well as internal data structures, both of which are represented in working memory. Rules are inherently responsive to changes in the environment, and in Soar they are matched every cycle against all of working memory. In STOW-97, TacAir-Soar planes would quickly respond to new threats and other changes in their environment. This responsiveness will also be important in computer game applications, where AI opponents must rapidly respond to the actions of humans and other AI systems.
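As a toy illustration of this match cycle, the sketch below treats working memory as a flat list of attribute-value pairs and a rule as a test over that list. Real Soar matches identifier-attribute-value elements with a Rete network; the representation and all names here are our own simplification, not Soar's.

/* Toy working memory and one "rule" matched against all of it. */
#include <stdio.h>
#include <string.h>

typedef struct { const char *attr; const char *value; } WME;

/* Does any working memory element carry this attribute-value pair? */
static int holds(const WME *wm, int n, const char *attr, const char *value)
{
    for (int i = 0; i < n; i++)
        if (strcmp(wm[i].attr, attr) == 0 && strcmp(wm[i].value, value) == 0)
            return 1;
    return 0;
}

int main(void)
{
    /* Inputs from the SMI are deposited directly into working memory. */
    WME wm[] = {
        { "radar-contact", "bogey-1" },
        { "contact-side",  "hostile" },
    };
    int n = (int)(sizeof wm / sizeof wm[0]);

    /* Every cycle, all rules are matched against all of working memory;
       here a single rule reacts to a hostile contact. */
    if (holds(wm, n, "contact-side", "hostile"))
        printf("rule fired: treat bogey-1 as a threat\n");
    return 0;
}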

3. Deliberate about the selection and execution of actions

Although all knowledge is represented in rules, the rules are used in service of selecting and applying "operators", which are the basic deliberative acts in Soar. In traditional AI systems, operators are used for planning and consist of preconditions, which determine when the operator can apply, and actions, which specify changes to the world state. This formulation of action is a useful abstraction for planning, because it organizes actions into units of behavior that can be deliberately selected. However, it assumes atomic, instantaneous actions and is therefore less useful for real-time action in the world. In Soar, we retain the advantages of operators by organizing behavior as the selection and application of operators, but we enhance the way the preconditions and actions of an operator are represented.

In Soar, rules propose, compare, apply, and terminate operators. This makes it easy to have several different situations in which the same operator should be selected: there are just multiple rules that propose the operator. Once the operator is proposed, rules can compare the operator to other operators and create preferences for which of the proposed operators should be selected given the current situation. A fixed decision procedure examines all of the preferences and selects the operator with the dominant preferences. For example, if a plane is intercepting an enemy plane, there might be a rule that proposes an operator to fire a missile. There might be a second rule that proposes an operator to perform a defensive maneuver because of the risk that the enemy plane has fired its own missile. A third rule might test that both of these operators have been proposed, and test that for this mission it is critical to destroy the enemy plane. That rule can then prefer the operator for firing the missile over the operator for the defensive maneuver.

Once an operator is selected, additional rules will fire that test for that operator being selected. Many different rules can fire in order to apply the operator, some of them initiating output commands, while others make changes to the internal representation of the situation in working memory. Rules provide a rich language for specifying operator actions; it is easy to have conditional actions, as well as actions that unfold over time. Once the actions of an operator are complete, rules that test the current situation and the operator declare that the operator is terminated. Once the operator is terminated, a new operator can be selected.

An important aspect of the Soar architecture is that all of the rule firings happen in parallel (simulated). Thus, when operators are being proposed, all rules that propose operators fire at the same time. Similarly, during operator application, all application rules that match fire simultaneously. An important caveat is that a rule will fire only once on a specific set of working memory elements that match its conditions, but it can fire multiple times on different matches to the current working memory. The justification for parallel rule firing is that Soar's rule memory is similar to an associative memory whose purpose is to retrieve all relevant operator proposals, comparisons, applications, and terminations. In contrast, many rule-based systems (such as CLIPS or OPS5) use a conflict resolution scheme to select the single "best" rule to fire on each cycle. Unfortunately, they must depend on some syntactic means to distinguish between the various rules that match. Soar instead tries to pick the best operator on each basic cycle. It uses knowledge about the domain to propose and compare operators, so the selection is based not on arbitrary syntactic properties of the matched rules but on what the system knows about the domain. In some situations, it may even know that it does not matter which operator is selected, so it creates indifferent preferences that lead to random selections.

In viewing the trace behavior of a Soar entity at the operator level, there is a continual selection and application of operators. At a lower level, there is the continual matching and firing of rules. Rules match and fire in parallel until quiescence, which means there are no new rule firings proposed. At quiescence, a new operator is selected (assuming the previous one has terminated). Therefore there are two cycles in Soar, one for rule firings and one for operator selections. Figure 2 shows an abstract trace of operator selections and rule firings. On the far left is a count of the current operator selection cycle, followed by the count of the current rule cycle. More than one rule can fire on the same cycle.

1:1  Rule1 fires proposing operator 1
1:1  Rule2 fires proposing operator 2
1:2  Rule3 fires preferring operator 1 to operator 2
2    Operator 1 is selected
2:1  Rule4 fires performing one aspect of operator 1
2:1  Rule5 fires performing one aspect of operator 1
2:1  Rule6 fires performing one aspect of operator 1
2:2  Rule7 fires terminating operator 1
2:2  Rule8 fires proposing operator 3
3    Operator 3 is selected
…

Figure 2: Abstract trace of behavior in Soar
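The fire-missile versus defensive-maneuver example above can be made concrete with the following sketch of a preference-driven decision procedure. The data layout is ours, and real Soar supports a richer set of preference types (acceptable, reject, better, worse, indifferent, and so on); this sketch shows only domination by "better" preferences.

/* Sketch of a fixed decision procedure over proposals and preferences. */
#include <stdio.h>

enum { FIRE_MISSILE, DEFENSIVE_MANEUVER, N_OPS };
static const char *names[N_OPS] = { "fire-missile", "defensive-maneuver" };

int main(void)
{
    int proposed[N_OPS] = { 1, 1 };   /* both proposal rules fired */

    /* better[a][b] != 0 means "a is preferred to b"; here a rule that
       tests the mission's priority prefers firing over evading. */
    int better[N_OPS][N_OPS] = { { 0, 1 }, { 0, 0 } };

    /* Select a proposed operator that no other proposed operator
       dominates. */
    for (int a = 0; a < N_OPS; a++) {
        if (!proposed[a])
            continue;
        int dominated = 0;
        for (int b = 0; b < N_OPS; b++)
            if (proposed[b] && better[b][a])
                dominated = 1;
        if (!dominated) {
            printf("selected operator: %s\n", names[a]);
            break;
        }
    }
    return 0;
}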

4. Use goals

Using knowledge to deliberately select and apply operators is an important capability for an intelligent entity, but action must be in the service of goals, and the determination of goals itself can be quite complex. Soar unifies the reasoning about action and goals by treating goals as abstract operators to be selected and applied. Thus, in Soar, there can be abstract operators (goals) that cannot be directly applied to the current situation, but must be decomposed into simpler operators. These abstract operators are proposed and selected just like any other operators. However, once they are selected, there will be no rules that can complete their application directly. In addition, no rules will fire to terminate the operator. When this happens, Soar has reached an impasse in its problem solving. In response to an impasse, Soar automatically generates a subgoal to resolve the impasse. The subgoal is a new data structure added to working memory. In addition to the impasse described above, Soar can have other impasses, such as when no operators are proposed, or when multiple operators are proposed and no preferences are created to select between them. In all cases, a subgoal is created and problem solving continues in the subgoal via the selection and application of operators to resolve the impasse. These other types of impasses do not arise in expert-level Soar systems such as TacAir-Soar; however, they do arise in planning and learning systems implemented in Soar.

The purpose of the subgoal is to apply the selected operator, and the processing in the subgoal is of the same form as before: the repeated selection and application of operators. Of course, now there is the subgoal, which can be tested by rules that propose appropriate operators for carrying out the original operator. A selected operator in the subgoal may also require further decomposition, leading to a hierarchy of goals. The hierarchy is generated dynamically based on the details of the current situation. The hierarchy bottoms out when operators are selected that are implemented directly by rules. An individual subgoal is terminated when the impasse that led to it is resolved, in this case when the original operator can be terminated.

Figure 3 gives an example trace from TacAir-Soar for a simple intercept. It is at the level of operator selections and subgoal generation (it abstracts away from rule firings). Each line is an operator selection or goal generation, and indentation shows the creation of a subgoal. Some actions, such as turn-toward-bogey at decision 1774, require changes in the world that take time to complete. For these operators, a turn command is issued by a rule, but the termination conditions are not immediately met and there are no further actions for the agent to take. Soar's response is still to create a subgoal, and in that subgoal a trivial operator, called wait, is selected, which does nothing (but wait). When the world finally changes so that the turn is completed, a rule fires (on decision 1795) and the subgoal is terminated. At this point achieve-proximity has nothing further to do until the planes close to a range where further maneuvering is required to line up a good missile shot, which finally happens at decision 1832.

1770: Intercept-bogey
1771:   >>subgoal
1772:     Achieve-proximity
1773:       >>subgoal
1774:         Turn-toward-bogey (issues turn output command)
1775:           >>subgoal
1776:             wait (waiting for turn to complete)
…
1795:             wait (waiting to get close to bogey)
…
1810:     Employ-weapons
1811:       >>subgoal
1812:         Select-missile
1813:         Get-missile-launch-acceptability-region
1814:           >>subgoal
1815:             cut-to-lateral-separation
…
1829:         Launch-Missile
1830:           >>subgoal
1831:             Lock-radar
1832:             Fire-missile
1833:             Wait-for-missile-to-clear
…

Figure 3: A trace of hierarchical operator selections from TacAir-Soar
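The following sketch caricatures this decomposition on a fragment of the intercept hierarchy (compare Figures 3 and 4). Unlike the real system, which selects one operator at a time based on the current situation, the sketch simply walks a hard-coded table; the table and all code-level names are ours for illustration.

/* Impasse-driven decomposition: if no rule can apply a selected
   operator directly, a subgoal is created and simpler operators
   are selected inside it. */
#include <stdio.h>

typedef struct Op {
    const char *name;
    const struct Op *children;   /* NULL => primitive, applied by rules */
    int n_children;
} Op;

static const Op launch_kids[] = {
    { "lock-radar", NULL, 0 },
    { "fire-missile", NULL, 0 },
    { "wait-for-missile-to-clear", NULL, 0 },
};
static const Op employ_kids[] = {
    { "select-missile", NULL, 0 },
    { "launch-missile", launch_kids, 3 },
};
static const Op intercept = { "intercept-bogey", employ_kids, 2 };

static void select_and_apply(const Op *op, int depth)
{
    printf("%*s%s\n", depth * 2, "", op->name);
    if (op->children == NULL)
        return;                                    /* rules apply it directly */
    printf("%*s>>subgoal\n", (depth + 1) * 2, ""); /* impasse: no apply rule */
    for (int i = 0; i < op->n_children; i++)
        select_and_apply(&op->children[i], depth + 2);
}

int main(void)
{
    select_and_apply(&intercept, 0);   /* prints a Figure-3-like trace */
    return 0;
}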

In TacAir-Soar, there are over 450 operators. Approximately 200 of the operators are goals that are dynamically decomposed into more primitive operators. Figure 4 shows part of the hierarchy relevant to the trace in Figure 3. Each level shows the operators available for implementing the operator above it. Operators with sub-operators beneath them become goals, while the leaf operators are primitive, implemented directly by rules. This hierarchy is implicit in the proposal conditions for operators. For example, the rules that propose Employ-Weapons all test that the Intercept operator is selected in their supergoal (a sketch of such a proposal condition appears after Figure 4). There is no sequencing between operators at the same level; sequencing arises dynamically based on the selection of the operators given the current situation.

Intercept
    Achieve Proximity
    Employ Weapons
        Execute Tactic
        Get Missile LAR
        Select Missile
        Launch Missile
            Lock Radar
            Lock IR
            Fire-Missile
            Wait-for-Missile-Clear
        Get Steering Circle
        Sort Group
    Search
    Scram

Figure 4: Operator Hierarchy

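As noted above, this hierarchy is encoded in the proposal conditions rather than in any explicit data structure. The sketch below mimics a proposal rule for Employ-Weapons that tests the supergoal's selected operator; the State layout, field names, and the hypothetical racetrack patrol goal are ours, not TacAir-Soar's.

/* A proposal "rule" whose supergoal test makes the hierarchy implicit. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *supergoal_operator;   /* operator selected one level up */
    int contact_within_range;         /* abstracted radar test          */
} State;

static int propose_employ_weapons(const State *s)
{
    return s->supergoal_operator != NULL
        && strcmp(s->supergoal_operator, "intercept") == 0
        && s->contact_within_range;
}

int main(void)
{
    State in_intercept = { "intercept", 1 };
    State on_patrol    = { "racetrack", 1 };   /* hypothetical patrol goal */

    printf("intercept context: %s\n",
           propose_employ_weapons(&in_intercept) ? "proposed" : "not proposed");
    printf("patrol context:    %s\n",
           propose_employ_weapons(&on_patrol) ? "proposed" : "not proposed");
    return 0;
}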
The operator hierarchy is very effective at producing top-down, goal-driven behavior. However, not all behavior is best represented in strict hierarchies. In TacAir-Soar, we discovered that there are "implicit" goals, such as survival and understanding the current situation, that are always active. Many operators help achieve these implicit goals, and they are appropriate in many different situations, independent of the current goal hierarchy. For example, there is an operator to identify radar contacts. This operator analyzes the available information on a radar contact, comparing it to earlier sightings and radio contacts, and tries to decide on the identity of the contact. This operator is useful throughout the goal hierarchy. We call these operators "floating operators"; they differ from other operators only in that they do not test the superoperators that define the current goals. Thus, identify-contact is proposed for many different goals.

By organizing our rules in terms of the proposal, comparison, application, and termination of operators, we were able to represent the appropriate doctrine and tactics for our synthetic TacAir-Soar pilots. We worked closely with former and current pilots to encode the appropriate behaviors. This was an iterative process, in which we would interview experts on the details of performing different missions, build systems that performed the missions, demonstrate the behaviors to experts, and revise the behaviors as appropriate. There was a significant amount of knowledge to encode so that all missions were flown according to the appropriate doctrine; however, by encoding the knowledge hierarchically we were able to share knowledge across many different missions and tactics. For example, the air-to-air tactics involved different methods for getting into a position to fire a missile, depending on rules of engagement, enemy threat, etc. However, there is still a significant amount of common knowledge shared across these tactics in the way they use the radar and finally employ the missiles. This sharing falls out of decomposing complex operators into simpler ones, and of using multiple rules to suggest the same operator in different situations.

The decomposition also made it easier for our experts to understand TacAir-Soar's behavior. The linear trace of operator selections and applications mapped well onto how the pilots made decisions, and they could read these traces and directly critique the behavior. Another problem we avoided was having rigid tactics where, once a tactic is started, it must be carried out even if it is no longer appropriate. The reactive nature of the rules in TacAir-Soar meant that tactics were dynamically strung together by rules proposing the next operator to select based on the current situation. For example, in STOW-97, an F/A-18 on a close-air support mission broke off its ground attack because it was threatened by an opposition plane. The F/A-18 intercepted the enemy plane, shot it down, and then returned to its close-air support mission, without any human intervention.

5. Generate human-like behavior

In order to be realistic and effective for training, the behavior of our synthetic pilots had to correspond to the behavior of human pilots. It was unclear how close the behavior of our synthetic pilots had to be to human behavior. Our goal was for the behavior to appear human at the tactical level (Jones and Laird, 1997). From earlier experience in cognitive modeling, we knew that attempting to model human behavior as closely as possible would greatly slow development and increase the required computational resources. We adopted three principles to aid in approximating human behavior without attempting to build a detailed model.

The first principle was to develop our synthetic pilots using an AI architecture (Soar) that roughly corresponds to the human cognitive architecture. Soar has a reasoning cycle that consists of parallel access to long-term memory in service of selecting and performing actions, and a goal structure that supports hierarchical reasoning. This level of behavior clearly abstracts away from the underlying implementation mechanisms (silicon in Soar, neurons in humans), but captures the regularities of human behavior at the appropriate time scale for tactical aircraft: tens of milliseconds, not nanoseconds or microseconds. Soar has been used for developing a variety of detailed models of human behavior (Newell, 1990). To speed development, we abstracted even further, modeling decisions made 2-4 times a second. This abstraction eliminates the low-level eye and finger movements, but retains the basic tactical decisions. Thus, complex tactical decisions take longer than simple tactical decisions in a Soar system, just as they do in a human, providing us with a gross model of human reaction time.

While the first principle is to give the system the same basic processing structure, the second principle is to give the system the same basic sensory and motor systems. To that end, we created a "virtual cockpit" interface between our pilots and ModSAF. However, once again we abstracted away from the details of human vision and motor control. In TacAir-Soar we modeled the semantic content of the sensors available to a human pilot, and the basic commands for controlling an aircraft. There were times when we violated this principle, which in turn compromised the fidelity of our entities' behavior.

For example, in order to avoid the complex reasoning required to decide whether a plane was an enemy, we allowed the radar to provide the side (friend or foe) of an aircraft. At the time this was originally implemented, it did not seem worth the development effort to encode the necessary logic. Although this simplified development, it hurt the fidelity: subject matter experts noticed that our aircraft would decide to commit and intercept an enemy plane much too quickly.

The final principle was to give the pilots the same basic knowledge as human pilots. Thus, we built our synthetic pilots based on the information we extracted from subject matter experts. We did not attempt to derive new tactics, base the behavior of our agents on some general theory of warfare, or create learning systems that would attempt to learn behaviors on their own. Some of the information we encoded was standard doctrine and tactics. However, much of it consisted of details that are not mentioned as doctrine or tactics because they are "obvious" to a human pilot. For example, we needed to encode details such as the geometric features the wingman should pay attention to when turning while in formation. Our principle was to rely on experts, and we usually did. However, sometimes we would believe that the appropriate response to a situation was "obvious", and we would insert it into the system. Invariably we would find that the situation was more complex than we thought, and that what is "common sense" to an experienced pilot is quite different from the "common sense" of an AI researcher. A critical component of this effort was to have experts critique the behavior of our pilots. Incorporating these behaviors was sufficient to give our synthetic pilots the veneer of human behavior.

6. Coordinate behavior with other entities

In almost all missions, a plane does not fly alone but coordinates its behavior with other forces. At the most immediate level, planes are organized into groups of two or four, in which they maneuver together and carry out the same mission. Small groups can be combined into packages, such as a strike package, where the subgroups have different specific missions, such as escorts and attack aircraft, but the planes must coordinate their joint behavior. Communications were simulated using radios sending messages. Figure 5 shows an example interaction between our planes. Here Kiwi is an automated AWACS controller, and Hawk121 and Hawk122 are automated fighter pilots. The interactions unfold dynamically based on the situation and are not scripted. In the scenario, Kiwi gets a radar contact, sends a report to the fighter pilots, and identifies the bogey as an enemy plane (a bandit). When the bandit gets within a predefined commit range, the fighters start an intercept and eventually shoot a missile at the bandit.

Within a TacAir-Soar entity, coordination was supported by explicitly representing the different groups the entity was a part of, its role in each group, and the other members of the group and their roles. Knowledge about how to carry out its role in a group was encoded just like all other knowledge: as rules for selecting and applying operators. Thus, when the role of an entity in a group changed, different actions would be selected and applied. For example, many of the rules in TacAir-Soar tested whether a pilot was the lead or the wingman of its current formation.
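A minimal sketch of this representation and a role-conditional "rule" follows; the field names are illustrative, not TacAir-Soar's actual working memory attributes.

/* Sketch of the coordination state: each entity records the groups it
   belongs to, its role, and the other members. */
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *group;     /* e.g. a two-ship section        */
    const char *role;      /* "lead" or "wingman"            */
    const char *partner;   /* call sign of the other member  */
} GroupInfo;

int main(void)
{
    GroupInfo self = { "section-1", "wingman", "Hawk121" };

    /* Role-conditional behavior: the same situation leads to
       different operators for a lead than for a wingman. */
    if (strcmp(self.role, "wingman") == 0)
        printf("maintain formation on %s\n", self.partner);
    else
        printf("direct the section's tactics\n");
    return 0;
}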

Kiwi: kiwi, hawk121 your bogey is at bearing 23 for 143 angels 8
; Each plane prefaces its communication with its call sign.
; Here Kiwi is giving the bearing (23 degrees), range (143 nm), and altitude (8,000 ft).
Hawk121: Roger
Kiwi: kiwi, Contact is a bandit
; Kiwi is confirming that the bogey is an enemy plane.
Hawk121: hawk121, Contact is a bandit
Hawk122: Roger
Hawk121: hawk121, Commit bearing 23 for 140 angels 8
; Hawk121 decides its commit criteria have been achieved and starts to intercept the bandit.
; Hawk121 uses the information from Kiwi to plot an intercept course.
Kiwi: kiwi, hawk121 your bogey is at bearing 21 for 137 angels 8
; Kiwi periodically reports position information to the fighters.
Hawk121: Roger
Hawk122: Roger
Kiwi: kiwi, Bandit is closing on a hot vector
Hawk121: hawk121, Bandit is closing on a hot vector
Hawk121: hawk121, Go to defensive combat-spread formation.
; The section changes formation for the attack.
Kiwi: kiwi, hawk121 your bogey is at bearing 12 for 116 angels 8
Hawk121: Roger
Hawk122: Roger
Hawk121: hawk121, Bandit is closing on a hot vector
Hawk121: hawk121, Fox three
; Hawk121 fires a long-range missile and then performs an f-pole maneuver.
Hawk121: hawk121, Cranking right

Figure 5: Trace of interaction between an automated AWACS and automated pilots

In addition to coordination between synthetic aircraft, we also needed to support coordination between manned control stations, such as an AWACS or forward air controller, and a synthetic aircraft. The controllers had to communicate with the synthetic aircraft, once again via simulated radio. However, the humans needed an interface to the radio system, and that was provided by two approaches: a graphical user interface called the Communications Panel (Jones, 1998), and a speech-to-text and text-to-speech system called CommandTalk. In STOW-97, command and control interactions included changing stations, directing fighters to intercept enemy planes, vectoring aircraft along various desired routes, and running numerous on-call close-air support missions against a variety of targets (including ground vehicles, buildings, runways, and ships). During the exercise, we did not have to micro-manage the planes, and we discovered that our air-to-air intercepts worked best when we directed the planes toward the enemy planes and then shut up. The Soar planes often had a better handle on the tactical situation than the controllers did (sometimes engaging and shooting down planes the controllers had missed).

This type of coordination has already been used in some computer games, but with limited types of commands. In STOW-97, we had over 100 different types of messages that could be sent between our synthetic forces. There is still a long way to go before we have full natural language, but systems such as TacAir-Soar provide the basic architecture into which natural language can be incorporated.

7. Behave with low computational expense

Another important requirement was that we had to fly large numbers of our synthetic pilots at the same time without excessive computational expense. If it cost $50,000 for each pilot, then flying 100 pilots would cost $5,000,000. We knew from the beginning that our approach would require more computational resources than prior approaches based on hierarchical finite-state automata, which had much more limited behaviors. Would that additional cost be excessive? This was especially problematic because we were planning on creating the largest real-time rule-based systems ever.

In 1994, Soar was rewritten from scratch in C (from Lisp), and significant enhancements were made to its rule matcher to improve its efficiency. Included in these enhancements were changes that allowed Soar to match very large numbers of rules (over a million) without significant degradation in performance (Doorenbos, 1993). However, an efficient matcher was only the first step. Throughout the development of TacAir-Soar we continually tested its efficiency, rewriting rules and developing techniques to keep rule firing to a minimum. Midway through the project we realized that some of the computations being performed by rules are normally performed by a piece of hardware on a plane instead of by the pilot. This led us to redesign our interface, moving calculations for waypoint headings and bomb attacks into a "mission computer". In retrospect, this is exactly what we should have done from the beginning had we followed our second principle from Section 5. The final contribution to efficiency was the result of Moore's Law: the transition of ModSAF (and Soar) from SGI Indigos to Pentium Pros provided a significant improvement in performance coupled with a significant decrease in cost.

Throughout STOW-97, we had sufficient computational resources to fly all of the required missions. Our computers were 200 MHz Pentium Pros (P6s) with 256 Mbytes of RAM. Before STOW-97, we estimated that we could fly up to 8 planes per machine. During STOW-97, we had sufficient machines that we could keep the number of planes per machine lower. In general, we would have 4-6 planes per machine, although some missions, such as Recon/Intel, would be flown with only one plane per machine. At its peak, we had 102 planes flying at the same time, and we could have gone much higher if required.

8. Conclusions

Many of the problems we faced in developing realistic synthetic pilots are the same problems faced in developing state-of-the-art opponents in computer games. How does our work on simulated pilots scale to computer games? For STOW-97, the AI components had to share a 200 MHz Pentium Pro with all of the underlying simulation software, so each individual AI was getting on the order of 5-10% of the CPU. Thus, it might be realistic to expect to have 2-4 intelligent opponents running on a single machine as part of a single-person game. Our hope is that this experience would be similar to a personal deathmatch, where the level of battle would be similar to human-to-human play. Another scenario to consider is having 8-10 intelligent opponents on dedicated networked computers, providing enemies in large-scale games. This spring we are integrating Soar with Quake II and Descent and expect to learn about the viability of such configurations.

There are many areas of further research worth pursuing. Currently all of our entities have the same knowledge base. Although the complexity of that knowledge base and environment makes it difficult to predict their behavior, except in terms of their programmed doctrine, it would be useful to create customized entities with different behavior and different doctrine. We will be doing this initially to model the doctrine of different opposition forces, but we also plan to create individualized entities within a single force. This is relatively straightforward to do in the current system, as we can easily modify the rule base that drives the entities' behavior.

We are also working with researchers at Brooks Air Force Base, under funding from the Office of Naval Research, to develop a general computational model of the effects of fatigue on performance in computer-generated forces (Jones, Neville, & Laird, 1998). This is part of a general plan to try to make Soar entities even more humanlike. We hope to incorporate other factors that lead to the rich variety of human behaviors, such as emotions, stress, morale, experience, and skill level.

We are also developing a system for learning behaviors automatically by observation (van Lent and Laird, 1998). The system attempts to induce the knowledge used by a pilot by tracking his behavior while he is flying a simulated plane. To constrain the learning, we have the pilot annotate his behavior with the goals he is trying to achieve. This behavioral trace is used for learning the hierarchical operator structure, and then induction is used to learn the conditions under which these goals should be pursued. Our hope is that this can greatly reduce the time required to build such entities, as well as allow the creation of entities that closely model the behavior of specific individuals.

To obtain more information on Soar, including the software, the manual, tutorials, and papers, visit our web site: http://ai.eecs.umich.edu/soar.

9. Acknowledgments

The TacAir-Soar team also includes Karen Coulter, Patrick Kenny, Frank Koss, and Paul Nielsen. Without their efforts, this project would not have been possible. Our work on simulated pilots was one part of the AirSF development effort and was supported by many other organizations; the ones we worked with most closely are listed here. BMH Associates provided most of our subject matter experts (SMEs) and did the on-site development and support of the Oceana hardware and software installation. Lockheed Martin Information Systems (originally a part of BBN, and then Loral) developed the underlying software to support the simulation of vehicles, weapons, and sensors, as well as doing significant work on the integration of the overall simulation software. MITRE developed CCSIL, the language for communicating between vehicles. Sytronics developed behaviors for intelligence missions and the weapons editor component of the Exercise Editor. SRI and MITRE developed CommandTalk, a speech-to-text system for command and control. SAIC developed MRCI, which translates an ATO into CCSIL. The version of TacAir-Soar described in this paper was developed under contract N00014-92K-2015 from the Advanced Systems Technology Office of DARPA and NRL, and contract N66001-95-C-6013 from the Advanced Systems Technology Office of DARPA and the RDT&E division of NOSC.

10. Bibliography

Calder, R., Smith, J., Courtemanche, A., Mar, J., & Ceranowicz, A. (1993) "ModSAF behavior simulation and control." Proceedings of the Third Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Ceranowicz, A. (1998) "STOW-97 - 99." Proceedings of the Seventh Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Doorenbos, R. (1993) "Matching 100,000 learned rules." Proceedings of the Eleventh National Conference on Artificial Intelligence, 290-296. Menlo Park, CA: AAAI Press.

Hill, R. W., Chen, J., Gratch, J., Rosenbloom, P., & Tambe, M. (1997) "Intelligent agents for the synthetic battlefield: A company of rotary wing aircraft." Proceedings of the Ninth Conference on Innovative Applications of Artificial Intelligence, 1006-1012. Menlo Park, CA: AAAI Press.

Jones, R. M. (1998) "A graphical user interface for human control of intelligent synthetic forces." Proceedings of the Seventh Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Jones, R. M., & Laird, J. E. (1997) "Constraints on the design of a high-level model of cognition." Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society.

Jones, R. M., Neville, K., & Laird, J. E. (1998) "Modeling pilot fatigue with a synthetic behavior model." Proceedings of the Seventh Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Laird, J. E., Coulter, K. J., Jones, R. M., Kenny, P. G., Koss, F. V., & Nielsen, P. E. (1998b) "Integrating intelligent computer generated forces in distributed simulation: TacAir-Soar in STOW-97." Proceedings of the 1998 Simulation Interoperability Workshop. Orlando, FL.

Laird, J. E., Jones, R. M., & Nielsen, P. E. (1998a) "Lessons learned from TacAir-Soar in STOW-97." Proceedings of the Seventh Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Laird, J. E., Newell, A., & Rosenbloom, P. S. (1991) "Soar: An architecture for general intelligence." Artificial Intelligence, 47, 289-325.

Lehman, J. F., Van Dyke, J., & Rubinoff, R. (1995) "Natural language processing for IFORs: Comprehension and generation in the air combat domain." Proceedings of the Fifth Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Newell, A. (1990) Unified Theories of Cognition. Cambridge, MA: Harvard University Press.

Rosenbloom, P. S., Laird, J. E., & Newell, A. (1993) The Soar Papers: Readings on Integrated Intelligence. Cambridge, MA: MIT Press.

Schwamb, K. B., Koss, F. V., & Keirsey, D. (1994) "Working with ModSAF: Interfaces for programs and users." Proceedings of the Fourth Conference on Computer Generated Forces and Behavioral Representation. Orlando, FL.

Tambe, M., Johnson, W. L., Jones, R. M., Koss, F., Laird, J. E., Rosenbloom, P. S., & Schwamb, K. B. (1995) "Intelligent agents for interactive simulation environments." AI Magazine, 16(1), 15-39.

van Lent, M., & Laird, J. E. (1998) "Learning by observation in a complex domain." Proceedings of the 1998 International Knowledge Acquisition Workshop. Banff, Alberta.