On Generating Hypotheses Using Computer Simulations

Kathleen M. Carley
Department of Social and Decision Sciences and H.J. Heinz III School of Policy and Management
Carnegie Mellon University
Pittsburgh, PA 15213 USA
412-268-3225
[email protected]

1999 Command and Control Research and Technology Symposium (CCRTS)

This work was supported in part by the Office of Naval Research (ONR), United States Navy Grant No. N00014-97-1-0037.

Abstract

Computational models of complex systems, such as teams, task forces, and organizations, can be used to reason about the behavior of those systems under diverse conditions. The large number of integrated processes and variables, and the non-linearities inherent in the underlying processes, make it difficult for humans, unassisted by computer simulations, to reason effectively about the consequences of any one action. Computer simulation therefore becomes an important tool for generating hypotheses about the behavior of these systems, hypotheses that can then be tested in the lab and in the field.

1. Introduction and Motivation

The use of formal techniques in general, and computational analysis in particular, is playing an increasingly important role in the development of theories of complex systems such as groups, teams, organizations, and their command and control architectures. One reason for this is the growing recognition that the underlying processes are complex, dynamic, adaptive, and non-linear; that group or team behavior emerges from interactions within and between the agents and entities that comprise the unit (the people, sub-groups, technologies, etc.); and that the relationships among these entities constrain and enable individual and unit-level action. Another reason for the movement to computational approaches is the recognition that units composed of multiple people are inherently computational, since they have a need to scan and observe their environment, store facts and programs, communicate among members and with their environment, and transform information by human or automated decision making [Baligh, Burton and Obel, 1990].

In general, the aim of this computational research is to build new concepts, theories, and knowledge about complex systems such as groups, teams, or command and control architectures. This aim can be, and is being, met through the use of a wide range of computational models, including computer-based simulation, numerical enumeration, and emulation models that focus on the underlying processes. A large number of claims are being made about the value and use of computer-based simulation in general and computational process models in particular. These claims appear in articles in almost every discipline. One of the strongest claims is that such computer-based simulation can be used for theory development and hypothesis generation. Simple but non-linear processes often underlie team and group behavior.

An example of such a non-linearity is the decreasing ability of a new piece of information to alter an agent's opinion as the agent gains experience. As the agent accumulates more and more information that confirms a previously held idea, any information that disconfirms it is increasingly discounted. Such non-linearities make it non-trivial to think through the results of various types of learning, adaptation, and response in teams and groups, particularly in changing environments. Computational analysis enables the theorist to think through the possible ramifications of such non-linear processes and to develop a series of consistent predictions. These predictions are the hypotheses that can then be tested in human laboratory experiments or in live simulations. Thus, computer-based simulation models can be, and have been, used in a normative fashion to generate a series of hypotheses by running virtual experiments.
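To make this non-linearity concrete, the following minimal sketch (in Python; illustrative only, not ORGAHEAD code, and the 1/(n+1) weighting is an assumed rule) shows an agent whose opinion is moved less and less by each new observation as its experience grows, so that a late disconfirming observation is heavily discounted.

    # Illustrative only (not ORGAHEAD code): an agent whose opinion is updated
    # with a weight that decays as experience grows, so late disconfirming
    # information barely moves a well-confirmed belief.
    class DiscountingAgent:
        def __init__(self) -> None:
            self.belief = 0.5      # current opinion, in [0, 1]
            self.experience = 0    # number of observations seen so far

        def observe(self, evidence: float) -> None:
            weight = 1.0 / (self.experience + 1)   # assumed 1/(n+1) weighting
            self.belief += weight * (evidence - self.belief)
            self.experience += 1

    agent = DiscountingAgent()
    for e in [1.0] * 20 + [0.0]:     # 20 confirming observations, then 1 disconfirming
        agent.observe(e)
    print(round(agent.belief, 3))    # roughly 0.95: the disconfirming item is discounted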

2. Virtual Experiments

One of the most effective ways of generating hypotheses from computational models is to run a virtual experiment. A virtual experiment is an experiment in which the data for each cell in the experimental design are generated by running a computer simulation model. In generating this experiment, standard principles of good experimental design should be followed. The results should then be analyzed statistically. The results of that analysis are the hypotheses that can be examined using data from human laboratory experiments, live simulations, games, field studies, or archival sources.

In conducting a virtual experiment and generating a series of hypotheses, the following stages are gone through:

Stage 1. Identify core variables.
Stage 2. Explore the parameter space.
Stage 3. Set non-core variables.
Stage 4. Run the simulations in the virtual experiment.
Stage 5. Statistically analyze the results.

To demonstrate the value of this approach, a particular illustrative virtual experiment is described and used to illustrate each of these stages.

2.1 Illustrative Virtual Experiment

To illustrate how a virtual experiment is done and hypotheses are generated, a specific virtual experiment was run using ORGAHEAD [Carley, 1996a; Carley, 1998; Carley and Svoboda, 1996; Carley and Lee, 1998]. ORGAHEAD illustrates several aspects of computational process models:

1. ORGAHEAD has been built in a building-block fashion, by adding additional computational process modules onto a base model. This building-block approach is one of the strongest approaches for building computational models, as it enables the designer to validate the model as it is developed and to generate intermediate results.
2. Computational process models should have face validity. ORGAHEAD has demonstrated this level of validity and captures the core aspects of unit-level architecture.
3. ORGAHEAD, like any computational process model, enables huge numbers of predictions in multiple areas.
4. ORGAHEAD, like any good computational process model, is testable.

ORGAHEAD is a computer-based simulation model for reasoning about organizational performance. It predicts performance for units with different command and control architectures and different task environments. Each member of the organizational unit is modeled as an agent with the ability to learn. In ORGAHEAD the commander can change the C3I architecture in response to various external and internal triggers. Each ORGAHEAD agent may be a person, a subgroup, or a platform. Agents are boundedly and structurally rational, and so exhibit limited attention, memory, information-processing capability, and access to information. The performance of the unit is determined by the agents' actions as they process tasks. ORGAHEAD has been used to make predictions about training, learning, the fragility of organizational success, the type of emergent form, the relative value of different organizational forms, and so on. One of the interesting predictions from ORGAHEAD is that organizations can trade individual experience or learning for structural learning. Another finding is that, while all successful organizational forms are similar, their nearest neighbor may be a completely unsuccessful form. Thus small changes in an organization's command and control architecture, that is, small changes in a group's structure, can be devastating.

3. A Staged Approach to Hypothesis Generation

Each of the five stages listed in Section 2 is now described in turn, using the illustrative ORGAHEAD virtual experiment.

3.1 Stage 1

Begin by identifying the core variables. Core variables are the parameters or variables of concern. They should be the parameters or model modules hypothesized to be the most relevant in affecting the dependent variable of interest. An example of a core variable in ORGAHEAD is task complexity.

3.2 Stage 2

Once the core parameters have been identified, define which values of each parameter will be explored. The choice should reflect concerns with these parameters and expectations as to where different values of a parameter will produce different system-level behavior. In general, two or more values should be chosen for each parameter. For example, for the parameter task complexity we might choose values reflecting low, medium, and high complexity. Choosing the parameters and their values defines a virtual experiment. The experiment used here is described in Table 1. These variations of the parameters yield 512 different experimental conditions.

3.3 Stage 3

Non-core variables should be set randomly, fixed at a level needed for the analysis, or set to match conditions known to be true of human groups. For example, ORGAHEAD enables the user to look at units whose size changes over time. In this experiment, however, size is held fixed over the course of each run.
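As a concrete illustration of Stages 2 and 3, the sketch below (illustrative Python, not part of ORGAHEAD; parameter names follow Table 1, and the fixed setting is an assumed placeholder) enumerates the full factorial design and confirms that the choices in Table 1 yield 512 experimental conditions.

    from itertools import product

    # Stage 2: core parameters and the values to be explored, taken from Table 1.
    design_space = {
        "task_limit":       [20_000, 80_000],
        "task_complexity":  ["binary", "trinary"],
        "task_information": [7, 9],
        "agent_ability":    [5, 7],
        "stressors":        ["stable", "periodic"],
        "unit_size":        [9, 12, 18, 36],
        "shake_ups":        [1, 2, 3, 4],
    }

    # Stage 3: non-core variables are fixed; here, unit size does not change within a run.
    fixed_settings = {"size_changes_over_time": False}

    # Every combination of core values is one cell of the virtual experiment.
    conditions = [dict(zip(design_space, values), **fixed_settings)
                  for values in product(*design_space.values())]

    print(len(conditions))   # 2 * 2 * 2 * 2 * 2 * 4 * 4 = 512 conditions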

3.4 Stage 4

At this point the simulations can be run. For each condition, that is, each cell in the table describing the experiment, multiple simulations should be run because the model contains stochastic elements; for a deterministic model a single run per condition suffices. These simulations are the virtual experiment. For the virtual experiment just described, each condition was simulated 40 times. In general, the number of observations generated via a virtual experiment will be much larger than the number generated via a human laboratory experiment or in a gaming or live-simulation setting. For example, the virtual experiment described here resulted in 20,480 data observations at each point in time.

Parameter           Categories
Task limit          20,000 and 80,000
Task complexity     binary and trinary
Task information    7 and 9
Agent ability       5 and 7
Stressors           stable and periodic
Unit size           9, 12, 18, and 36
Shake-ups           1, 2, 3, and 4

Table 1: Summary of Parameters
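A minimal sketch of Stage 4, again illustrative rather than actual ORGAHEAD code: run_simulation is a hypothetical stand-in for one stochastic run of the model, and replicating each of the 512 conditions 40 times yields the 20,480 observations mentioned above.

    import random
    from itertools import product

    REPLICATIONS = 40   # 512 conditions x 40 replications = 20,480 observations

    # Rebuild the 512-cell design of Table 1 (same values as the Stage 2/3 sketch).
    names  = ["task_limit", "task_complexity", "task_information", "agent_ability",
              "stressors", "unit_size", "shake_ups"]
    levels = [[20_000, 80_000], ["binary", "trinary"], [7, 9], [5, 7],
              ["stable", "periodic"], [9, 12, 18, 36], [1, 2, 3, 4]]
    conditions = [dict(zip(names, values)) for values in product(*levels)]

    def run_simulation(condition: dict, seed: int) -> dict:
        """Hypothetical stand-in for one stochastic run of a model such as ORGAHEAD;
        a real run would return performance and C3I-structure measures."""
        rng = random.Random(seed)
        return {**condition, "performance": rng.random()}   # placeholder outcome

    observations = [run_simulation(cond, seed=cell * REPLICATIONS + rep)
                    for cell, cond in enumerate(conditions)
                    for rep in range(REPLICATIONS)]
    print(len(observations))   # 20480, one row per simulated run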

3.5 Stage 5

Computer-based simulation models generate more data than human laboratory experiments. Nevertheless, the results should still be statistically analyzed. Since there is so much data, it is possible to conduct multiple explorations of a single virtual experiment. For the virtual experiment just described, the impact of the meta-adaptation strategies on performance was examined first, and then their impact on the C3I structure was examined.

For the first analysis, the results indicate that, in order of impact, the four factors that most affect sustained performance are: the number of resources available to each agent, the size of the unit, the length of the task (the amount of information in, and resources associated with, the task), and the number of shake-ups. These results are summarized in Table 2.

For the second analysis, the results indicate that increasing organizational size and increasing the amount of task information from 7 to 9 bits reduce the number of re-assignments made (changes in who reports to whom) and increase the number of re-taskings (changes in who is doing what task). The first part of this finding is quite non-intuitive: if there are more people, the probability of a re-assignment should increase. Instead, this number decreases, implying that units adapt by creating more direct linkages between personnel and tasks, thus reducing the complexities brought on by intra-organizational communication. As a side result, the amount of information and the number of resources available to any one individual increase.

Predictor                  Coefficient    p value
Intercept                   0.000000      1.000
Task limit                  0.031853      0.000
Task complexity            -0.024068      0.000
Environmental stressors    -0.014568      0.027
Unit size                   0.170226      0.000
Agent ability               0.265205      0.000
Task information            0.091118      0.000
Shake-ups                  -0.012299      0.063

R2 (adj) = 10.9%, df = 7, 20472, p

Table 2: Regression of sustained performance on the manipulated parameters
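To show how a table like Table 2 can be produced in Stage 5, the sketch below fits an ordinary least squares regression of performance on the manipulated parameters using the pandas and statsmodels libraries. The data frame is filled with placeholder values standing in for real simulation output (for example, the observations built in the Stage 4 sketch), so its coefficients are meaningless; with actual ORGAHEAD output, the fitted coefficients and p values would be read off as the hypotheses to take to the lab or field.

    import random
    import pandas as pd
    import statsmodels.formula.api as smf

    # Placeholder rows standing in for the 20,480 simulated observations; with real
    # output, df would simply be pd.DataFrame(observations) from the Stage 4 sketch.
    rng = random.Random(0)
    levels = {"task_limit": [20_000, 80_000], "task_complexity": ["binary", "trinary"],
              "task_information": [7, 9], "agent_ability": [5, 7],
              "stressors": ["stable", "periodic"], "unit_size": [9, 12, 18, 36],
              "shake_ups": [1, 2, 3, 4]}
    rows = [dict({k: rng.choice(v) for k, v in levels.items()}, performance=rng.random())
            for _ in range(2000)]
    df = pd.DataFrame(rows)

    # Stage 5: regress the outcome on the manipulated parameters; categorical factors
    # are dummy-coded by the formula interface. The signs, magnitudes, and p values of
    # the fitted coefficients (compare Table 2) are the hypotheses to be tested.
    model = smf.ols("performance ~ task_limit + C(task_complexity) + task_information"
                    " + agent_ability + C(stressors) + unit_size + shake_ups",
                    data=df).fit()
    print(model.summary())
    print(model.rsquared_adj)   # analogous to the adjusted R-squared reported above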