Achieving Robust Human-Computer Communication
Susan W. McRoy
Department of Electrical Engineering and Computer Science and Department of Mathematical Sciences
University of Wisconsin-Milwaukee, Milwaukee, WI 53201
[email protected] November 25, 1997
Abstract

This paper describes a computational approach to robust human-computer interaction. The approach relies on an explicit, declarative representation of the content and structure of the interaction that a computer system builds over the course of the interaction. We show how this representation allows the system to recognize and repair misunderstandings between the human and the computer, and we demonstrate the utility of the representation by showing how it facilitates the repair process.
1 Introduction

In dialogs between people or between people and machines, understanding is an uncertain process. If the goals or beliefs of two discourse[1] participants differ, one of them might interpret an event in the dialog in a way that she believes is complete and correct, although her interpretation is not the one that the other had intended. When this happens, we as analysts (or observers) would say that a misunderstanding has occurred. The participants themselves might come to know it as a misunderstanding as well, if the problem manifests itself as something incongruous in the interaction. Discourse participants can circumvent many misunderstandings by planning their actions carefully. For example, they can try to correct apparent misconceptions or try to clarify a potential ambiguity. This has been the predominant approach [Calistri-Yeh, 1991, Eller and Carberry, 1992, Goodman, 1985, McCoy, 1985, Zukerman, 1991]. However, because participants' beliefs and goals always differ to some extent, it is impossible to prevent all misunderstandings. Moreover, because people do not always recognize when communication has broken down, participants cannot always rely on each other's judgments about what has been understood. Although misunderstanding cannot be fully prevented, discourse participants can continue to communicate by detecting and repairing the problem. For example, one participant might notice an anomaly and attribute it to a misunderstanding. Then, that participant might choose to interrupt the dialog to initiate a repair by herself or the other.
[1] Here we use "discourse" to refer to the general notion of a multi-party communication that continues over a number of actions by various participants using various modalities, such as writing, mouse-clicks, or speech; "dialog" refers to the subclass of two-party interactions using various modalities, not necessarily restricted to speech.

However, for misunderstandings to be diagnosed and resolved, discourse participants must have access to
an internal representation of the dialog that they can use to evaluate the reasonableness of each new action [McRoy, 1995a, McRoy and Hirst, 1995a]. To construct repairs, they can use this representation to re-evaluate earlier interpretations. This paper describes a computational representation of human-computer communication that supports flexible and coherent dialog. The representation has been incorporated in a collaborative tutoring system, called B2, that allows medical students to practice their decision-making skills and to receive natural language explanations of complex statistical relationships. The primary modality of interaction is natural language, but other modalities (such as mouse-clicks) are also supported. Each participant in the dialogs can make statements, ask questions, or generate requests. Some of these actions are used to initiate repairs. Moreover, some of these actions would be ambiguous (or uninterpretable) out of context. For example, in dialogs between students and a tutoring system that we have been building, students ask the system questions such as "Is CT[2] the best test to rule in gallstones?" (referring to a medical case that has been under discussion) and can follow up with more context-dependent questions, such as "What about HIDA[3]?" or "Why?" To support robust understanding, a fine-grained representation of the dialog has been used because, in the case of unexpected dialog actions, the system needs to explain why the student might have said what she did and to decide what it should do to restore the coherence of its model. Producing an explanation might also require that the system make some (reasonable, but perhaps unprovable) assumptions about what the other participant believes. After interpretation, these explanations become part of the system's understanding of the dialog. The knowledge that serves as a basis for the system's reasoning includes information
[2] CT stands for computed tomography, a diagnostic test.
[3] HIDA stands for radio-nuclide hepatobiliary imaging, another diagnostic test.
about language and language use, facts about the world, and information about the content and structure of prior discourse. The representation framework used to represent this information is a mixed-depth representation, that is, one that mixes representations from different levels of abstraction or detail. A mixed-depth representation is as informative as needed, without overcommitment, because it represents each component concept at the most relevant level of abstraction, given the task at hand. This framework thus allows us to integrate information from the different sources incrementally, as it is needed. It also allows us to capture vagueness or ambiguity that is present in the utterance form. The design of the discourse model used in our work builds on McRoy and Hirst's previous work on the Recognition and Repair of Speech Act Misunderstandings (RRM) project [McRoy and Hirst, 1993, McRoy, 1995b, McRoy and Hirst, 1995b]. This work unifies theories of speech-act production, interpretation, and repair. It combines intentional and social accounts of interaction to constrain the amount of reasoning that is necessary to identify a plausible interpretation. B2 is being developed using the Common LISP programming language. We are using the SNePS 2.3.1 and ANALOG 1.1 tools to create the lexicon, parser, generator, and underlying knowledge representations of domain and discourse information [Ali, 1994a, Ali, 1994b, Shapiro and Group, 1992, Shapiro and Rapaport, 1992]. Developed at the State University of New York at Buffalo, SNePS (Semantic Network Processing System) provides tools for building and reasoning over nodes in a propositional semantic network. (We will define what a semantic network is shortly.) An Internet-accessible, graphical front-end to B2 has been developed using the JAVA 1.1 programming language. It can be run using a network browser, such as Netscape. The interface that the user sees communicates with a server-side program that initiates a LISP
process.[4] (The Bayesian network software is executed from LISP as a foreign function.) The focus of this paper is on describing the internal representation of the dialog that is constructed by our tutoring system. It will show how the representations facilitate reasoning about understanding and misunderstanding. (It does not consider the problem of non-understanding, which is beyond the scope of this paper.) The next section begins by reviewing some previous approaches to misunderstanding. Sections 3 and 4 introduce the kinds of information in the discourse model, along with their representation. Section 5 describes our account of coherence in dialog and its use of the representation. Section 6 then explains the discourse model that would be constructed for an example dialog. The paper concludes with Section 7.
2 Background

Most previous work on representations of interactive discourse has focused on just one aspect of communication, such as interpretation or generation. It is also typical for discourse understanding to be treated as a generalization of parsing, in which plans (recipes for action) take the place of grammar rules [Lambert and Carberry, 1991, Litman, 1986, Novick and Ward, 1993]. This limits the applicability of these approaches to interactions that fit one of the pre-defined plans. However, in flexible communication between people and computers, surprises are inevitable:
- The user's attention might not be focused on the aspect of the presentation that the system expects.
[4] The URL for this site is http://tigger.cs.uwm.edu/~wendy/b2/b2.html; however, it is only available when the server program is running.
- The user might not have the same understanding of what a verbal description or a graphical image is meant to convey.
- The user might lack some of the domain knowledge necessary to interpret a proposed explanation.
Thus, the discourse model must include a representation of what actions the participants performed, what inferences those actions would warrant, and what evidence there is for a misunderstanding between the explanation system and the user. A computer system that includes information about possible misunderstandings as well as intended interpretations would have to deal with a potential combinatorial explosion of alternatives. To prevent this, a system must be able to focus its processing on the most likely interpretations and avoid extended inference unless it is necessary. The approach taken by McRoy and Hirst [1993, 1995b] combines intentional and social accounts of discourse to capture the expectations that help constrain discourse, without requiring extended inference about plans. In intentional accounts, speakers use their beliefs, goals, and expectations to decide what to say; when they interpret an utterance, speakers identify goals that might account for it. This process can be time-consuming because speakers may have many different goals that they are trying to achieve. In the sociological accounts provided by Ethnomethodology, discourse interactions and the resolution of misunderstandings are normal activities guided by social conventions. These conventions include expected sequences of utterances, such as question-answer. The approach extends intentional accounts by using expectations deriving from social conventions to guide interpretation. As a result, it does not require reasoning about mutual beliefs[5] or multiple levels of goals, avoiding most of the
[5] Mutual beliefs are those of the form "I believe that you believe that I believe ...".
time-consuming inference that is required by plan-based approaches such as [Allen, 1983, Carberry, 1990, Litman, 1986]. A prototype system, RRM, has been implemented in a logic-programming framework for default and abductive reasoning. The system includes an axiomatization of McRoy and Hirst's theory of how expected actions are understood and how misunderstandings can lead to unexpected actions and utterances. (The specifics of this theory will be discussed in Section 5, as it forms the basis for our current work.) An action is explained as a manifestation of misunderstanding if it is not expected, given the prior interaction, and there is a plausible reason for supposing that misunderstanding has occurred. The system builds a model of the dialog as it is communicating. This model includes a representation of the speech act interpretation that the system gave to each of the utterances (that it either produced or received), including a temporal ordering relation over the speech acts, and a representation of the beliefs that such speech acts conventionally express. The model also enables the system to form expectations about the types of speech acts that could be expected to occur next. This model also enables the system to detect inconsistencies in the beliefs that had been expressed over the course of a dialog and to resolve the inconsistencies by providing new interpretations of previous utterances. The prototype has been used to simulate conversations involving the detection and correction of speech-act misunderstandings that occurred in human-human dialogs previously discussed in the literature. (In the simulations, two versions of the system actually talk to each other, given some initial beliefs and goals.)
Although this work demonstrated the utility of combining multiple sources of knowledge, the prototype implementation was too inefficient for practical use and did not represent enough semantic information to account for misunderstandings of word- or phrase-level information. The work described here attempts
to address these shortcomings. Representative of plan-based approaches is the work by Ardissono, Boella, and Damiano [1998]. This work also provides a unified theory of interaction that addresses misunderstandings, but does so by extending a purely intentional (plan-based) approach. (Plan-based approaches formulate knowledge as recipes for action, akin to STRIPS operators [Fikes and Nilsson, 1971], that specify the preconditions and constraints that must be true for an action to succeed, and the effects or goals that an action achieves.) As in previous plan-based approaches to discourse by Litman [1986] and Lambert and Carberry [1991], interpretations are recognized by trying to find a plan-based relation that links the utterance and the previous context. For example, a new utterance would be coherent if it seeks to gain information about the state of a constraint. Variations to domain plans are allowed by introducing meta-level actions that manage the interaction itself. The system's representation of the dialog consists of a structure that includes an instantiated plan recipe that links, by means of relations such as precondition or subgoal, all of the utterances that have been made by either participant. The values used to instantiate a recipe include more specific instances of actions (instantiated recipes), complex mutual beliefs[6], or goals. Ardissono et al. extend the plan-recognition algorithm so that misunderstandings are recognized when the system fails to find a plan-based relation. When such failures occur, a meta-level action results in a subgoal being posted to resolve the misunderstanding; repairs are actions that are performed to satisfy these goals.
Thus, in comparison with McRoy and Hirst, the meta-level action of posting a goal (e.g., to find an alternate interpretation) takes the place of the RRM theorem prover's attempt to prove that such an alternative

[6] For example, one effect of a question is that the speaker and hearer mutually believe that the speaker has the communicative intention of expressing that the speaker has the goal of the hearer adopting a goal to answer the question.
exists. The strengths of purely plan-based approaches are the uniformity of the representations that are used to specify both domain and discourse information and the widespread availability of plan-recognition algorithms. Also, the work of Ardissono et al. accounts for several types of breakdown, including ambiguities in syntax, semantics, or pragmatics. The primary weakness of all plan-based approaches is that they do not provide an explicit account of the constraints needed to focus the search for an interpretation; instead, they embed these constraints in the control structure of the plan-recognition algorithm. Also, the domain must be kept narrow, so that the number of domain-level ambiguities, and hence the effort needed to identify plan-based relations, is minimized. In the next section, we consider the design of the discourse model used in B2. To allow for both flexibility and robustness, we have based the B2 discourse model on the approach of McRoy and Hirst [1995b]. However, to address the uniformity and efficiency issues, B2 represents knowledge in a semantic network framework, which we will discuss in Section 4.
3 The Kinds of Information in the Discourse Model

We now consider the information that is represented by the B2 system. The discourse model combines information about the discourse-level actions performed by the system, as well as its interpretation of the user's utterances. Consider the dialogue shown in Figure 1.
B2: What is the best test to rule in Gallstones?
Student: HIDA.
B2: No, CT.
Student: Ok.

Figure 1: A Dialogue between the B2 System and a Medical Student
In the discourse model, this dialogue leads to the assertion of a number of propositions about what was said by each participant, how the system interpreted what was said as a discourse action, and how each action relates to prior ones. For the exchange above, the knowledge representation would include representations of facts that have been glossed as shown in Figure 2.
What was said
u1 The system asked What is the best test to rule in gallstones?
u2 The student said HIDA.
u3 The system said No.
u4 The system said CT.
u5 The student said Ok.

The system's interpretation of what was said
i1 The system asks What is the best test to rule in gallstones?
i2 The student answers HIDA is the best test to rule in gallstones.
i3 The system disconfirms No, HIDA is not the best test to rule in gallstones.
i4 The system informs CT is the best test to rule in gallstones.
i5 The student agrees that CT is the best test to rule in gallstones.

Relations
r1 i1 is the interpretation of u1
r2 i2 is the interpretation of u2
r3 i3 is the interpretation of u3
r4 i4 is the interpretation of u4
r5 i5 is the interpretation of u5
r6 i2 accepts i1
r7 i3 accepts i2
r8 i5 accepts i4
r9 i4 justifies i3

Figure 2: Propositions that would be Added to the Discourse Model

This model of the discourse is used both to interpret students' utterances and to generate the system's responses. When a user produces an utterance, the parser will generate a representation of its surface content (e.g., a word, phrase, or sentence), form (e.g., interrogative, declarative, or imperative), and attitude (e.g., action, belief, or desire). B2 then uses its model of the discourse to build an interpretation of the utterance that captures both its complete propositional content and its relationship to the preceding discourse. This model specifies what the participants have expressed in the discourse, which may be different from what they actually believe or are believed to believe. Having this discourse model allows the system greater flexibility than previous explanation systems, because it enables the system to judge whether:
- The student understood the question that was just asked and has produced a response that can be evaluated as an answer to it; or
- The student has rejected the question and is asking a question of her own; or
- The student has misunderstood the question and has produced a response that can be analyzed to determine how it might repair the misunderstanding.
For example, if the student had answered "No" instead of "HIDA" after utterance u1, it would have been a good indication that she had misread (or misheard) the question. If she had responded "What is the patient's sex?", it would have been an indication that she feels that more information about the medical case must be specified before a correct answer can be determined. Similarly, when the system generates an utterance, the discourse model will be augmented to include a representation of it and its relation to the preceding discourse. These representations of the system's own outputs are useful because they allow it to produce a focused answer to a question like "Why?", yet still be able to respond to requests for more information, such as "What about ultrasound?".
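To make the bookkeeping concrete, the propositions of Figure 2 can be rendered schematically as follows. This dictionary encoding and the names `discourse_model` and `is_accepted` are our illustration only; B2 itself stores these facts as nodes in a propositional semantic network.

```python
# Schematic rendering of Figure 2's propositions: what was said (u1-u5),
# how each utterance was interpreted (i1-i5), and the relations (r6-r9)
# that tie the interpretations together. Illustrative only.
discourse_model = {
    "said": {
        "u1": ("system",  "What is the best test to rule in gallstones?"),
        "u2": ("student", "HIDA"),
        "u3": ("system",  "No"),
        "u4": ("system",  "CT"),
        "u5": ("student", "Ok"),
    },
    "interpretations": {          # discourse act assigned to each utterance
        "i1": ("ask",        "u1"),
        "i2": ("answer",     "u2"),
        "i3": ("disconfirm", "u3"),
        "i4": ("inform",     "u4"),
        "i5": ("agree",      "u5"),
    },
    "relations": [                # links between discourse acts
        ("i2", "accepts",   "i1"),
        ("i3", "accepts",   "i2"),
        ("i5", "accepts",   "i4"),
        ("i4", "justifies", "i3"),
    ],
}

def is_accepted(interp, model):
    """True if some later discourse act accepts this interpretation."""
    return any(rel == "accepts" and target == interp
               for _, rel, target in model["relations"])

print(is_accepted("i1", discourse_model))   # True: i2 accepts i1
```

A query such as `is_accepted` is the kind of judgment sketched above: whether a response can be evaluated as an answer that accepts the question it follows.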
4 The Representation of Discourse

The discourse model has five levels of representation, shown in Figure 3. The utterance level represents the linguistic or graphical form of communication between the user and the system. The utterance-sequence level represents the temporal ordering of utterances. The interpretation level represents the communicative role that the utterance achieves and the domain plan that it invokes. The exchange level represents the socially-defined relationship that determines whether a communicative act has been accepted or understood. The exchange-interpretation level represents (explicitly) the acceptance or failure of an exchange.

interpretation of exchanges
exchanges (pairs of interpretations)
system's interpretation of each utterance
sequence of utterances
utterance level
Figure 3: Five Levels of Representation

These levels capture the content expressed by the student and the system, as well as the relations that link their actions to the ongoing discourse. Although many systems give questions and requests a procedural semantics (interpreting them as operations to be performed, or queries to be derived), such systems are not able to recover from misunderstandings. Thus, in B2 the discourse model includes a representation of all types of communicative acts, including questions, requests, and statements of fact.
4.1 The Representation Language

The system represents both domain knowledge and discourse knowledge in a uniform framework as a propositional semantic network. A propositional semantic network is a framework for representing the concepts of a cognitive agent that can use language (hence the term semantic). The information is represented as a graph composed of nodes and labeled directed arcs. In a propositional semantic network, the propositions are represented by the nodes, rather than the arcs; arcs represent only non-conceptual binary relations between nodes. The particular semantic network framework that we are using is SNePS [Shapiro and Group, 1992] with ANALOG [Ali, 1994a, Ali, 1994b]. This framework has the following additional constraints:

1. Each node represents a unique concept.
2. Each concept represented in the network is represented by a unique node.
3. The knowledge represented about each concept is represented by the structure of the entire network connected to the node that represents that concept.

These constraints allow efficient inference when processing natural language. For example, such networks can represent complex descriptions (common in the medical domain), and can support the resolution of ellipsis and anaphora, as well as general reasoning tasks such as concept subsumption [Ali, 1994a, Ali, 1994b, Maida and Shapiro, 1982, Shapiro and Rapaport, 1987, Shapiro and Rapaport, 1992]. Propositions are expressed using case frames. A case frame is a conventionally agreed-upon set of arcs emanating from a node. For example, to express that A isa B we use the MEMBER-CLASS case frame, which is a node with a MEMBER arc and a CLASS arc. The
SNePS dictionary [1994] provides a dictionary of standard case frames used in our semantic network. Additional case frames can be defined as needed. In the knowledge representation, assertion is the mechanism that is used to indicate that a proposition is believed by the cognitive agent that is implemented by the system. We use unasserted nodes to represent descriptions of observations, including surface-level representations of the user's utterances. After interpretation, the system may decide to accept this new information as credible and use it as a basis for inference. When this occurs, it asserts the corresponding node and treats the concept as a belief. (In network diagrams, an exclamation point to the right of a node is used to indicate that it has been asserted.) Thus, for an agent, believing is not automatic; it is a result of inference that the agent performs. An important advantage of this representation is that it is uniform. In B2, discourse knowledge, domain knowledge, and probabilistic knowledge (from the Bayesian net) are all represented in the same knowledge base using the same inference processes. This uniformity facilitates the interaction between different processing tasks (such as domain- or discourse-level reasoning). We will now consider each of the five levels of the discourse model in turn, starting with the utterance level, shown at the bottom of Figure 3. Then, in the next section, we will consider how these representations are used to achieve robust understanding. (For more details about the knowledge representation, see [Ali et al., 1997, McRoy et al., 1997].)
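As a rough illustration of these ideas, a MEMBER-CLASS case frame and the assertion mechanism might be sketched as follows. This is our own toy sketch in Python, not the SNePS implementation; the class and helper names are invented for exposition.

```python
class Node:
    """A node in a toy propositional semantic network. Arcs are labeled
    binary relations to other nodes (or, for LEX arcs, to words)."""
    registry = {}                        # one node per concept, and vice versa

    def __init__(self, name, **arcs):
        self.name, self.arcs = name, arcs
        self.asserted = False            # unasserted until believed
        Node.registry[name] = self

def member_class(name, member, cls):
    """Build a MEMBER-CLASS case frame: the proposition 'member isa cls'."""
    return Node(name, MEMBER=member, CLASS=cls)

story_word = Node("M1", LEX="story")     # the word "story"
b2 = Node("B2")                          # a discourse entity
m3 = member_class("M3", b2, story_word)  # proposition: B2 is a story

# Believing is the result of inference, not automatic: only after the
# system accepts the observation does it assert the node (written "M3!").
m3.asserted = True
```

Note that the proposition itself is a node (`m3`), so other propositions can point at it with arcs, which is what makes meta-level facts like interpretations representable in the same network.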
4.2 The Utterance Level

For all inputs, the parser produces a representation of the surface content, which the system will add to its model as part of an occurrence of an event of type say. This representation describes the syntactic structure of the utterance along with some semantic annotations, such
as MEMBER-CLASS information. The grammar has broad linguistic coverage (comparable to Winograd [1983]) to accommodate the variety of structures that people might use across a variety of domains. The grammar also parses and builds representations for utterances that are fragmentary and ambiguous. Such utterances often require extensive reasoning about the domain or the discourse model to fully resolve; the system avoids unnecessary inference by producing a mixed-depth representation. A mixed-depth representation is one that may be shallow or deep in different places, depending on what was known or needed at the time the representation was created [Hirst and Ryan, 1992]. Moreover, "shallow" and "deep" are a matter of degree. A shallow representation can be a surface syntactic structure, or the text itself (as a string of characters). A deep representation might be a conventional first-order (or higher-order) AI knowledge representation, taking into account such aspects of language understanding as lexical disambiguation, marking case relations, attachment of modifiers of uncertain placement, reference resolution, quantifier scoping, and distinguishing extensional, intensional, generic, and descriptive noun phrases. Unlike quasi-logical form, which is used primarily for storage of information, mixed-depth representations are well-formed propositions, subject to logical inference. Disambiguation, when it occurs, is done by reasoning. In B2, deeper semantic analysis (which makes use of domain-specific knowledge) will occur during the interpretation phase (to be described below), if it is needed. At that stage, the system would add additional annotations to account for domain-specific referring expressions (e.g., the system might recognize that a story refers to a description of a medical case) or other underspecified meaning relations (e.g., the system might recognize that a possessive relation such as the patient's arm refers to a PART-OF relation, rather than a
KINSHIP relation, such as in the patient's mother). At the utterance level, the content of the user's utterance is always represented by what she said literally. For example, in the medical tutoring domain, the student may request a story problem directly, with an imperative sentence ("Tell me a story.") or indirectly, with a declarative sentence that expresses a desire ("I want you to tell me a story."). Figures 4-7 include representations for the following examples (an imperative sentence, a declarative sentence, an interrogative sentence, and an interrogative sentence where the syntactic object has been moved to the front):

1. Tell me a story. (See Figure 4.)
2. I want you to tell me a story. (See Figure 5.)
3. What is the best test? (See Figure 6.)
4. What does HIDA detect? (See Figure 7.)

Consider the example shown in Figure 4. In the network, square nodes correspond to potential discourse entities; round nodes are propositional concepts. Node B3 is the discourse entity corresponding to the utterance itself. Node M5 is a case frame that represents the structure of this utterance, which includes its syntactic form (imperative), the mental attitude that it expresses (action[7]), and its content (which is represented by node M4). Node M4 is the proposition that the hearer (B1) is to perform the action described by the node M6. Node M6 indicates the type of the action, tell (B4), the object of the action (B2), and the indirect object (recipient) of the action (B5). Nodes B1, B2, and B5 represent the discourse

[7] Attitude is the term that we use to indicate whether the content of the sentence describes a mental state (believe, want, or desire), a state of affairs (be, have), or an event (action). Attitude is indicated by the meaning of the main verb.
entities expressed by the noun phrases corresponding to the subject (the implied hearer), the object (a story), and the indirect object (me), respectively. Nodes M8 and M13 (attached to B1 and B5) are case frames that indicate that the noun phrases have been expressed as pronouns. Attached to node B2, there is a syntactic case frame that expresses that there is a determiner, a, and a semantic case frame that expresses that the entity B2 is a member of a class story. Throughout the figure, arcs marked LEX are used to indicate the word that the speaker used; word-sense discrimination is achieved by attaching additional information to the appropriate discourse entity (e.g., B2's interpretation as a medical case).
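The case frames just described can be paraphrased as a nested record. The node labels below follow the gloss of Figure 4; the dictionary form itself is our sketch, not the network representation.

```python
# Utterance-level representation of "Tell me a story." following the
# gloss of Figure 4. Note how shallow the representation stays: B2 is
# just 'a story' until interpretation resolves it to a medical case.
utterance_B3 = {
    "FORM": "imperative",
    "ATTITUDE": "action",        # the main verb describes an event
    "CONTENT": {                 # M4: the hearer is to perform act M6
        "AGENT": "B1",           # implied hearer, expressed as a pronoun
        "ACT": {                 # M6
            "ACTION": "tell",    # B4, reached via a LEX arc to the word
            "OBJECT": "B2",      # 'a story' (MEMBER of CLASS story)
            "IOBJECT": "B5",     # 'me', the speaker
        },
    },
}
```

This illustrates the mixed-depth idea: form and attitude are already semantic, while the noun phrases remain shallow discourse entities awaiting deeper, domain-specific interpretation.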
[Semantic network diagrams omitted.]

Figure 4: Node B3 represents an utterance whose form is imperative, and whose content (M4) is the proposition that the hearer (B1) will tell a story (B2) to the speaker (B5).

Figure 6: Node B3 represents an utterance whose form is interrogative and whose content is a proposition that the object designated by what (marked as the queried item) is a type of best test.
[Semantic network diagrams omitted.]
Figure 5: Node B3 represents an utterance whose form is declarative and whose content (which contains an embedded utterance) is that the speaker wants the hearer to tell a story to the speaker.
Figure 7: Node B2 represents an utterance whose form is interrogative and whose content is a proposition that HIDA detects an object designated by what.
4.3 Sequence of Utterances

The second level corresponds to the sequence of utterances. (This level is comparable to the linguistic structure in the tripartite model of Grosz and Sidner [1986].) In the semantic network, we represent the sequencing of utterances explicitly, with asserted propositions that use the BEFORE-AFTER case frame. The order in which utterances occurred (system and user) can be determined by traversing these structures.
B2: What is the best test to rule in gallstones? (1)
User: HIDA (2)
B2: No, it is not the case that HIDA is the best test to rule in gallstones. (3)
    CT is the best test to rule in gallstones. (4)
Figure 8: A Sequence of Utterances

Consider the example dialog shown in Figure 8 and its (partially glossed) representation, shown in Figure 9. In Figure 9, asserted propositional nodes M99, M103, M119, and M122 represent the events of utterances 1, 2, 3, and 4, respectively.[8] For example, node M103 asserts that the user said HIDA. Asserted nodes M100, M104, M120, and M123 encode the order of the utterances. For example, node M120 asserts that the event of the user saying HIDA (M103) occurred just before the system said HIDA is not the best test to rule in gallstones (M119).

[8] Propositional expressions that are written in italics (for example, ∃t.best-test(t, gallstones, jones-case)) represent subgraphs of the semantic network that are relevant, but that we have glossed for reasons of space.
[Figure 9 semantic network diagram not reproduced. It links the utterance events M99, M103, M119, and M122 through BEFORE-AFTER nodes M100, M104, M120, and M123, with glossed contents such as ∃t.best-test(t, gallstones, jones-case), best-test(HIDA, gallstones, jones-case), and best-test(CT, gallstones, jones-case).]
Figure 9: Nodes M100, M104, M120, and M123 represent the sequence of utterances produced by the system and the user, shown in Figure 8. For example, Node M104 represents the proposition that the event M103 immediately followed event M99.
4.4 The Interpretation Level

In the third level, we represent the system's interpretation of each utterance. Each utterance event (from level 1) will have an associated system interpretation, which is represented using the INTERPRETATION-OF–INTERPRETATION case frame. For example, consider the interpretation of the utterance "Tell me a story." (as well as "I want you to tell me a story."), shown in Figure 10. The network indicates that the utterance has been interpreted as a request for the system to describe a case to the user. The system can distinguish multiple possible interpretations from the one that it believes is correct. It does this by building several case frames of this type, but only asserting one as believed. Changed beliefs are represented by unasserting certain nodes, while asserting others. (The knowledge representation framework incorporates a justification-based truth maintenance system to deal with contradictions.)
[Figure 10 semantic network diagram not reproduced. It connects utterance node M21 ("tell me a story") via the INTERPRETATION-OF–INTERPRETATION case frame (node M31) to node M22, the request proposition described in the caption.]
Figure 10: Node M31 is a proposition that the interpretation of Tell me a story (which is glossed in this figure) is M22. Node M22 is the proposition that the user requested that the system describe a case to the user. (Describing a case is a domain-specific action; the pronouns from the utterance level have been interpreted according to the context.)
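A minimal sketch of this assert-one-of-several discipline follows. The store and helper are illustrative stand-ins: the actual system records these case frames in the SNePS network, with a justification-based TMS handling the unassert/assert bookkeeping.

```python
class InterpretationStore:
    """Several INTERPRETATION-OF frames may be built for one utterance,
    but only one is asserted as believed at a time (illustrative sketch)."""
    def __init__(self):
        self.candidates = {}   # utterance node -> candidate interpretation nodes
        self.believed = {}     # utterance node -> the asserted interpretation

    def add_candidate(self, utterance, interp):
        self.candidates.setdefault(utterance, []).append(interp)

    def assert_interpretation(self, utterance, interp):
        # Asserting a new belief implicitly unasserts the previous one;
        # the real system uses a justification-based TMS for this step.
        assert interp in self.candidates[utterance]
        self.believed[utterance] = interp

store = InterpretationStore()
store.add_candidate("M21", "M22")    # request: describe a case (Figure 10)
store.add_candidate("M21", "M22x")   # hypothetical alternative reading
store.assert_interpretation("M21", "M22")
```

Keeping the unasserted candidates around is what later makes reinterpretation cheap: a repair only has to switch which candidate is believed.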
4.5 The Exchange and Exchange Interpretation Levels

The fourth and fifth levels of representation in the discourse model are exchanges and interpretations of exchanges, respectively. A conversational exchange is a pair of interpreted events that fits one of the conventional structures for dialog (e.g., QUESTION–ANSWER); it is related to the notion of adjacency pair [Schegloff and Sacks, 1973]. An exchange may comprise a sequence of utterances or it may have another exchange nested within it, as when one defers the answer to a question to ask a clarifying question. (See Figure 11.)
E1  B2:   What is the best test to rule in gallstones?
E2    User: What are my choices?
E2    B2:   CT, HIDA, Ultrasound for Cholecystitis, and Ultrasound for Gallstones.
E1  User: CT.
Figure 11: A pair of nested exchanges

The system's knowledge includes a specification of these structures and cues for recognizing them. When recognized, an exchange will be interpreted either as an acceptance, indicating that it has been understood and completed, or as a rejection, indicating that it was either abandoned or not understood. In cases where a repair is indicated, the system will step back through the history of exchanges to identify possible misunderstandings. (How this is done is discussed in Section 6.) Moreover, when an utterance cannot be interpreted as either the completion of an open exchange or the start of a new one, it will be taken as evidence of a failure in understanding. Figure 12 gives the network representation of a conversational exchange and its interpretation. Node M113 represents the exchange in which the system has asked a question and the user has answered it. (This is determined by the type of action within each interpretation; the type names are indicated by nodes M7, an ask, and M106, an answer, respectively.) Using the MEMBER–CLASS case frame, propositional node M115 asserts that the node M113 is an exchange. Propositional node M112 represents the system's interpretation of this exchange: that the user has accepted the system's question (i.e., that the user has understood the question and requires no further clarification). Finally, propositional node M116 represents the system's belief that node M112 is the interpretation of the exchange represented by node M113.
[Figure 12 semantic network diagram not reproduced. It links events M99 (the system's ask, glossed ∃t.best-test(t, gallstones, jones-case)) and M108 (the user's answer, glossed best-test(HIDA, gallstones, jones-case)) as EVENT1 and EVENT2 of exchange M113, with M112 as its accept interpretation and M115/M116 as the MEMBER–CLASS and INTERPRETATION-OF assertions.]
Figure 12: Node M115 represents the proposition that node M113 is an exchange comprised of the events M99 and M108. M108 is the proposition that The user answered "HIDA is the best test to rule in gallstones". Additionally, node M116 represents the proposition that the interpretation of M113 is event M112. M112 is the proposition that the user has accepted M96. (M96 is the question that the system asked in event M99.)
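The recognition step described above might be sketched as follows. The pairing table and action names are illustrative, not the system's actual inventory of conventional structures.

```python
# Conventional structures, in the spirit of adjacency pairs (illustrative).
ADJACENCY_PAIRS = {("ask", "answer"), ("request", "act")}

def recognize_exchange(event1, event2):
    """If two interpreted events fit a conventional pair, return the exchange
    frame and its interpretation: the second event accepts the first."""
    if (event1["action"], event2["action"]) not in ADJACENCY_PAIRS:
        return None
    exchange = {"EVENT1": event1["node"], "EVENT2": event2["node"]}
    interpretation = {"action": "accept", "object": event1["node"]}
    return exchange, interpretation

m99 = {"node": "M99", "action": "ask"}        # system asks the question
m108 = {"node": "M108", "action": "answer"}   # user answers "HIDA"
result = recognize_exchange(m99, m108)
```

Here the acceptance interpretation is built at the same moment the exchange is recognized, mirroring the M113/M112 pairing in Figure 12.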
4.6 Interaction among the Levels

An advantage of the network representation is the knowledge sharing between these five levels. We term this knowledge sharing associativity. This occurs because the representation is uniform and every concept is represented by a unique node. As a result, we can retrieve and make use of information that is represented in the network implicitly, by the arcs that connect propositional nodes. For example, if the system needed to explain why the user had said HIDA, it could follow the links from node M103 (shown in Figure 9) to the system's interpretation of that utterance, node M108 (shown in Figure 12), to determine that
The user's utterance was understood as the answer within an exchange (node M113), and
The user's answer indicated her acceptance and understanding of the discourse, up to that point (node M112).
This same representation could be used to explain why the system believed that the user had understood the system's question. This associativity in the network is vital if the interaction starts to fail, because the system will need to reassess its interpretation of the user's previous utterances or its beliefs about how the user has interpreted the system's utterances.
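Assuming a simple arc table (hand-encoded here from Figures 9 and 12; the arc names are illustrative labels, not SNePS relations), the retrieval amounts to following links from node to node:

```python
# Arcs connecting the utterance event to its role in the discourse.
links = {
    "M103": {"interpretation": "M108"},   # user said "HIDA" -> its interpretation
    "M108": {"event2_of": "M113"},        # the answer within exchange M113
    "M113": {"interpretation": "M112"},   # exchange interpreted as acceptance
}

def explain_utterance(node):
    """Follow arcs from an utterance event toward the exchange interpretation,
    collecting the chain of nodes that explains the utterance's role."""
    path = [node]
    for arc in ("interpretation", "event2_of", "interpretation"):
        nxt = links.get(node, {}).get(arc)
        if nxt is None:
            break
        path.append(nxt)
        node = nxt
    return path
```

Because every concept is a unique node, the same chain answers both questions in the text: why the user said HIDA, and why the system believed its question had been understood.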
5 An Account of Coherence in Dialog

The discourse model described above represents the system's interpretation of what has occurred during its interaction with the user. For a system to allow for the possibility of misunderstanding, belief cannot be automatic. Instead, beliefs are conclusions (or assumptions) that it makes on the basis of whether an interpretation seems reasonable, given the
context of prior discourse. Moreover, such beliefs may later be withdrawn in the face of contradictory evidence. McRoy and Hirst [1995a] provide a computational theory of coherence in dialog that accounts for factors such as social expectations (including conversational exchanges), linguistic information (including the link between discourse acts and the beliefs that they express), and mental attitudes, such as belief, desire, or intention. According to this theory, an action is consistent with a discourse only if the beliefs that it expresses are consistent with the beliefs expressed by prior actions. Together, these different factors enable a system to distinguish actions that are reasonable, although not fully expected (such as an incorrect answer or a clarification question), from actions that are indications of misunderstanding. The core of this theory includes six general schema for interpreting an exchange relationship. These schema apply to dialogs in which two dialog participants share initiative, each acting as both actor and addressee. The definitions of the schema assume that a discourse model is a representation of the subjective perspective of a dialog participant. For clarity, the definitions will be given in terms of the system and the user, assuming the system's understanding of the dialog and assuming that the system has the initiative. These same definitions will apply when the user has initiative.
Plan adoption In plan adoption, the system begins a new conversational exchange. An utterance counts as a coherent attempt to initiate a conversational exchange if:
There is some action that the system wants the user to do.
(The definitions would also apply from the user's point of view; however, her interpretations would differ from the system's when there had been a misunderstanding. McRoy [1993] discusses this in detail, showing the representations that result when two copies of the system communicate with each other.)
This action would be the expected response to the utterance.
There is an additional coherence constraint that both the beliefs that are expressed by the system's utterance and those that would be expressed by the user's response should be consistent with the beliefs already expressed in the discourse. For example, if the system asks a question, such as "What is the best test to rule in gallstones?", it is plan adoption, because the question creates an expectation for an answer. (Similarly, it is plan adoption if the student makes a request for the system to "Tell me a story." or asks the system "Why?".)
Acceptance In acceptance, the user continues a conversational exchange in a way that affirms the previous utterance's role in the context. For example, providing the answer to a question legitimizes the question and displays the user's understanding of it. An action counts as acceptance if there is another action in the discourse that has created an expectation for this one. There is an additional constraint that the beliefs expressed by the actor must be consistent with those already expressed in the discourse. An example of acceptance would be when the student responds to a question, such as "What is the best test to rule in gallstones?", with an answer, such as "HIDA" or "I don't know.".
Other-misunderstanding In other-misunderstanding, the user's utterance can be explained by the system as a manifestation of a misunderstanding by the user. (The derivation of expressed beliefs from the discourse model is a process that is still under refinement. Minimally it includes implicatures, as discussed in Hinkelman [1990].) A user's utterance manifests her
misunderstanding, if it does not display acceptance, but there is an alternative interpretation of an earlier utterance by the system for which it would display acceptance. An other-misunderstanding, if detected, normally leads to a third-turn repair (see Figure 13, which is described below). In the figure, utterance 2 can be taken to manifest the user's misunderstanding of the system's question, because the system was expecting an answer like "The patient does not have gallstones.", and the user's utterance would make sense if the referring expression a negative ultrasound were replaced by the one given by the user, an ultrasound.
Self-misunderstanding In self-misunderstanding, the user's action can be explained as the result of a misunderstanding by the system. An action can be counted as a sign that the system has misunderstood, if:
The user's action is inconsistent with the beliefs already expressed in the discourse by some earlier utterance.
The prior utterance allows an alternative interpretation that renders the user's action consistent.
A self-misunderstanding, if detected, could lead the system to produce a fourth-turn repair (see Figure 14, which is described below), or the system could choose to revise its beliefs silently. In the example of Figure 14, the system interprets utterance 3 as an indication of self-misunderstanding, because it was expecting the user to acknowledge that her question (utterance 1) had been accepted and answered by utterance 2.
Third-turn repair In third-turn repair, the system initiates a repair to be performed by the user. A third-turn repair is indicated when there has been a misunderstanding by the user (an other-misunderstanding). There is an additional constraint that the fulfillment of the system's original intention must still be possible.
1  B2:   Given a pretest probability of gallstones of .135, what would a negative ultrasound suggest?
2  User: An ultrasound raises the probability of gallstones.
3  B2:   No. What would a negative test suggest?
Figure 13: A third-turn repair

An example of a third-turn repair by B2 is shown in Figure 13. From the system's perspective, the user has apparently misread the question, confusing a negative ultrasound with a positive one. B2 attempts to recover by re-asking its question, emphasizing the salient term.
Fourth-turn repair In fourth-turn repair, the system initiates a self-repair, after a self-misunderstanding. A fourth-turn repair is indicated when there has been a misunderstanding by the system. There is an additional constraint that the fulfillment of the user's original intention must still be possible. (The terminology of third-turn repair and fourth-turn repair is from Schegloff [1992]. Turns are counted from the location of the problematic turn. Hence, a third-turn repair is one that is initiated by an actor after the addressee has produced a response that appears to demonstrate that the addressee has misunderstood. A fourth-turn repair is one that is initiated by an actor after misunderstanding some action (turn 1) that was performed by the addressee, then producing a response (turn 2), and then receiving feedback (turn 3) that indicates that there was trouble earlier. Note that each participant will have her own perspective on the problem. A third-turn repair by one participant can trigger a fourth-turn repair by the other. This interpretation of events is slightly different from Schegloff's original, which takes the view of the analyst, or an overhearer, of a conversation.)
An example of a third-turn repair initiated by the user, followed by a fourth-turn repair by B2, is shown in Figure 14. In this example, the user requests an explanation. The system interprets it as a request to explain the result with respect to the current case, which includes the probability relationships. However, the user rejects this interpretation, and indicates that she wants the more general explanation, which includes the underlying cause. When utterance 2 is rejected, B2 recognizes self-misunderstanding of utterance 1, and produces the other type of explanation.
1  User: Why does a positive HIDA suggest gallstones?
2  B2:   In the case of Mr. Jones, the pretest probability of gallstones is 0.135. A positive HIDA test results in a post-test probability of 0.307.
3  User: I mean for what reason.
4  B2:   Oh. HIDA detects cholecystitis, which is caused by gallstones.
Figure 14: A dialog that illustrates a fourth-turn repair by B2

In our system, these schema guide the construction of the upper levels of the discourse model. The representation of discourse facilitates the reasoning described by these schema by allowing the system to represent all this information in the same network and by providing mechanisms for reasoning over the information efficiently.
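A deliberately coarse sketch of the schema choice for a single incoming action follows. The set-based bookkeeping is an illustrative simplification of the belief-consistency reasoning described above, not the system's actual inference.

```python
def classify_action(action, expected, reinterpretable):
    """Coarse schema choice for one interpreted action.
    `expected` holds responses licensed by open exchanges;
    `reinterpretable` holds actions that would be expected under an
    alternative reading of some earlier utterance."""
    if action in expected:
        return "acceptance"
    if action in reinterpretable:
        # Other- or self-misunderstanding, depending on who produced
        # the earlier utterance that admits the alternative reading.
        return "misunderstanding"
    return "plan adoption or non-understanding"
```

For instance, after the question in Figure 13, an answer about gallstone probability is acceptance; the user's actual reply only fits an alternative reading of the question, so it is classified as a misunderstanding.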
6 An Example

We will now consider the example shown in Figure 14 in greater detail. Figure 15 illustrates the utterance sequence, interpretation, and exchange levels of representation that would
result after this conversation. Starting from the top of the figure, we see the following:
[Figure 15 diagram not reproduced. Its utterance-level boxes gloss the four utterances of Figure 14; its interpretation-level boxes read: "The user requests that the system explain the probability relation between HIDA and gallstones" (M141), "The system tells the user the probability relations" (M151), "The user requests a repair (an alternative interpretation of an earlier utterance)" (M161), "The user requests that the system explain the causality relation between HIDA and gallstones" (M162), and "The system tells the user the causality relations between HIDA and gallstones" (M171). Sequencing nodes M150, M160, and M170 and exchange nodes M152, M153, M172, and M173 connect them.]
Figure 15: This figure shows (a gloss of) the utterance level representations, the utterance sequence-level representations, (a gloss of) the interpretation level representations, and the exchange-level representations for a dialog containing a fourth-turn repair.
Nodes M150, M160, and M170 represent the sequencing relations that hold between utterances 1–4. (The first row of boxes gloss the subnetworks corresponding to the utterance representations, which would be similar to those discussed in section 4.2.)
(Currently, our system produces simplistic natural language output, rather than the output shown. We are still working on improving the text planning part of our system.)
The sequence nodes would be constructed using the utterance level representations produced by the parser.
Nodes M141, M151, M161, and M171 represent the interpretations that B2 gives to utterances 1–4, at the time that it parsed them. (The second row of boxes gloss the subnetworks corresponding to the interpretations, which would be similar to those discussed in section 4.4.) The interpretations themselves correspond to representations of the speaker's actions on the domain or the discourse; these interpretations are derived from the utterance representations on the basis of linguistic information, social conventions, and domain-specific plans. For example, to derive the interpretation of M141, the system reasons that when the user asks a why-question about an unspecified relation between two concepts in the Bayesian network, then it can be interpreted as a request for the system to perform the action describe-probability-chain on the two nodes. (Other interpretations are possible; the system need only find one that applies.) In the terminology of the previous section, M141 is explained as a case of plan-adoption, M151 is explained as acceptance, M161 is a self-misunderstanding (of utterance 1), and M171 is a fourth-turn repair, which also acts as the acceptance for the interpretation of M141.
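The why-question rule mentioned in the example might be sketched as follows. The action name describe-probability-chain comes from the text; the rule's shape, the return structure, and the node set are illustrative.

```python
def interpret_why_question(concept_a, concept_b, bayes_net_nodes):
    """Sketch of one interpretation rule: a why-question about an
    unspecified relation between two concepts in the Bayesian network
    becomes a request to describe the probability chain between them."""
    if concept_a in bayes_net_nodes and concept_b in bayes_net_nodes:
        return {"act": "request",
                "action": "describe-probability-chain",
                "objects": (concept_a, concept_b)}
    return None   # rule does not apply; another interpretation must be found

interp = interpret_why_question("HIDA", "gallstones", {"HIDA", "gallstones", "CT"})
```

Returning None when the rule fails reflects the text's point that the system need only find one applicable interpretation, trying rules until one fires.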
Nodes M152 and M172 identify the exchange structure of utterances 1 and 2 (the original interpretation) and the exchange structure of utterances 1 and 4 (the repaired interpretation). When a node is recognized as the acceptance of another node, those two nodes can be taken as a complete exchange. In the figure, M153 and M173 are propositions that the two structures, respectively, are indeed exchanges.
The exchange structure is significant because the system will step back through this structure if it needs to reason about alternative interpretations. The exchange structure indicates how each speaker displayed their understanding of the other's previous utterances. (The utterance sequence will not always provide this information because exchanges can be nested inside each other, e.g., to ask a clarifying question.) The interpretation of M161 is special, because it neither begins a new exchange nor completes an open one. (This would be determined by its linguistic form and by the expectations created by the previous interaction.) When the system fails to find a domain plan (e.g., a request to display part of the network) or a discourse plan (e.g., a request to clarify a previous utterance or an answer to a question from the system), then it considers this evidence of failure. In this case, the surface form of the utterance suggests looking back in the conversation to consider an alternative domain plan. Finding an utterance that admits an alternative interpretation, corresponding to an alternative domain plan, a new interpretation (M162) is constructed, which results in a repair action by the system to accept it. (If there had been no alternative domain plan, then the system would not be able to form any interpretation of utterance 3, a case of non-understanding, and would subsequently ask the user to provide more information about the problem.)

The result of dialog processing is thus a detailed network of propositions that indicates the content of the utterances produced by the system or the user, their role in the interaction, and the system's belief about what has been understood. If necessary, the system will be able to explain why it produced the utterances that it did and recover from situations where communication has failed.
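The step-back search described above can be sketched as follows. The structures and names are illustrative; M141 and M162 follow the example, standing for the believed and alternative interpretations of utterance 1.

```python
def find_repair(exchange_history, candidates):
    """Step back (most recent first) through interpreted utterances,
    looking for one that admits an unasserted alternative interpretation.
    Returns (utterance, alternative) for the first such utterance."""
    for utterance, believed in exchange_history:
        alternatives = [i for i in candidates.get(utterance, []) if i != believed]
        if alternatives:
            return utterance, alternatives[0]
    return None   # non-understanding: ask the user for more information

# Utterance 1 was believed to be M141 (probability reading); the unasserted
# alternative M162 (causality reading) makes a fourth-turn repair possible.
history = [("utterance-1", "M141")]
candidates = {"utterance-1": ["M141", "M162"]}
repair = find_repair(history, candidates)
```

The None branch corresponds to the non-understanding case in the text, where no alternative domain plan exists and the system must ask the user for more information.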
7 Conclusion

When a system allows users the flexibility to initiate tasks or to ask follow-up questions, it must allow for possible misunderstanding. Such a system will need to examine the evidence provided by the user's actions over a sequence of interactions, to verify that understanding has occurred. A system will also need the ability to deviate from its planned sequence of actions to address apparent failures. Here we considered an approach to robust human-computer interaction that relies on an explicit, declarative representation of the content and structure of the interaction. The representation includes five levels:
The utterance level represents the linguistic or graphical form of communication between the user and the system.
The utterance sequence level represents the temporal ordering of utterances.
The interpretation level represents the communicative role that the utterance achieves and the domain plan that it invokes.
The exchange level represents the socially-defined relationship that determines whether a communicative act has been accepted or understood.
The exchange interpretation level represents (explicitly) the acceptance or failure of an exchange.
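As a sketch, the five levels can be pictured as one record that grows per utterance. The field names are illustrative; in B2 all five levels live in a single semantic network rather than separate containers.

```python
from dataclasses import dataclass, field

@dataclass
class DiscourseModel:
    """Illustrative five-level record built over the interaction."""
    utterances: list = field(default_factory=list)        # level 1: forms
    sequence: list = field(default_factory=list)          # level 2: BEFORE-AFTER pairs
    interpretations: dict = field(default_factory=dict)   # level 3: node -> reading
    exchanges: list = field(default_factory=list)         # level 4: paired events
    exchange_interps: dict = field(default_factory=dict)  # level 5: accept/fail

    def add_utterance(self, node, form):
        # Level 2 grows automatically: each new utterance follows the last.
        if self.utterances:
            self.sequence.append((self.utterances[-1][0], node))
        self.utterances.append((node, form))

model = DiscourseModel()
model.add_utterance("M99", "What is the best test to rule in gallstones?")
model.add_utterance("M103", "HIDA")
```

Keeping all five levels in one structure is what enables the associativity discussed next: any level can be reached from any other by following links.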
Although these structures may grow quite large, two aspects of the representation have made it tractable for an interactive system. First, there is knowledge sharing or associativity between the structures in each of the five levels. This occurs because the representation is uniform and every concept is represented by a unique node. As a result, we can retrieve and
make use of information that is represented in the network implicitly, by the arcs that connect propositional nodes. Second, the system uses mixed-depth representations, which allows it to defer building and elaborating interpretations until the information is most readily available and only when it is actually needed. Building such a representation over the course of an interaction allows the system to recognize and repair misunderstandings, because it has a record of why it invoked the plan that it did and what evidence it had that it was successful. If necessary, it can later reevaluate this evidence in light of new, possibly contradictory (or unexplainable) observations. The construction and use of the representation is facilitated by using a framework that allows us to represent all this information in the same network and by providing mechanisms for reasoning over the information efficiently.
Acknowledgments

Thanks are due to Susan Haller and Syed Ali for their contributions to the work on B2 and for their comments on this paper. Thanks also to Graeme Hirst for his contributions to the work on miscommunication. This work has been supported by the University of Wisconsin–Milwaukee and the National Science Foundation, IRI-9701617.
References

[Ali et al., 1997] Syed S. Ali, Susan W. McRoy, and Susan M. Haller. Mixed-depth representations for dialog processing. 1997. Submitted for publication.

[Ali, 1994a] Syed S. Ali. A Logical Language for Natural Language Processing. In Proceedings of the 10th Biennial Canadian Artificial Intelligence Conference, pages 187–196, Banff, Alberta, Canada, May 16-20 1994.

[Ali, 1994b] Syed S. Ali. A "Natural Logic" for Natural Language Processing and Knowledge Representation. PhD thesis, State University of New York at Buffalo, Computer Science, January 1994.

[Allen, 1983] James Allen. Recognizing intentions from natural language utterances. In Michael Brady, Robert C. Berwick, and James Allen, editors, Computational Models of Discourse, pages 107–166. The MIT Press, 1983.

[Ardissono et al., 1998] Liliana Ardissono, Guido Boella, and Rossana Damiano. A plan-based model of misunderstandings in cooperative dialogue. International Journal of Human-Computer Studies/Knowledge Acquisition, 1998.

[Calistri-Yeh, 1991] Randall J. Calistri-Yeh. Utilizing user models to handle ambiguity and misconceptions in robust plan recognition. User Modeling and User-Adapted Interaction, 1(4):289–322, 1991.

[Carberry, 1990] Sandra Carberry. Plan Recognition in Natural Language Dialogue. The MIT Press, Cambridge, MA, 1990.

[Eller and Carberry, 1992] Rhonda Eller and Sandra Carberry. A meta-rule approach to flexible plan recognition in dialogue. User Modeling and User-Adapted Interaction, 2(1–2):27–53, 1992.

[Fikes and Nilsson, 1971] R. E. Fikes and Nils J. Nilsson. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial Intelligence, 2:189–208, 1971.

[Goodman, 1985] Bradley Goodman. Repairing reference identification failures by relaxation. In The 23rd Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference, pages 204–217, Chicago, 1985.

[Grosz and Sidner, 1986] B. J. Grosz and C. L. Sidner. Attention, intentions, and the structure of discourse. Computational Linguistics, 12, 1986.

[Hinkelman, 1990] Elizabeth Ann Hinkelman. Linguistic and Pragmatic Constraints on Utterance Interpretation. PhD thesis, Department of Computer Science, University of Rochester, Rochester, New York, 1990. Published as University of Rochester Computer Science Technical Report 288, May 1990.

[Hirst and Ryan, 1992] Graeme Hirst and Mark Ryan. Mixed-depth representations for natural language text. In Paul Jacobs, editor, Text-Based Intelligent Systems. Lawrence Erlbaum Associates, 1992.

[Lambert and Carberry, 1991] Lynn Lambert and Sandra Carberry. A tripartite plan-based model of dialogue. In 29th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 47–54, Berkeley, CA, 1991.

[Litman, 1986] Diane J. Litman. Linguistic coherence: A plan-based alternative. In 24th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 215–223, New York, 1986.

[Maida and Shapiro, 1982] A. S. Maida and S. C. Shapiro. Intensional concepts in propositional semantic networks. Cognitive Science, 6(4):291–330, 1982. Reprinted in R. J. Brachman and H. J. Levesque, eds., Readings in Knowledge Representation, Morgan Kaufmann, Los Altos, CA, 1985, 170–189.

[McCoy, 1985] Kathleen F. McCoy. The role of perspective in responding to property misconceptions. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence, volume 2, pages 791–793, 1985.

[McRoy and Hirst, 1993] Susan W. McRoy and Graeme Hirst. Abductive explanation of dialogue misunderstandings. In 6th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, pages 277–286, Utrecht, The Netherlands, 1993.

[McRoy and Hirst, 1995a] Susan W. McRoy and Graeme Hirst. The repair of speech act misunderstandings by abductive inference. Computational Linguistics, 21(4):435–478, December 1995.

[McRoy et al., 1997] Susan McRoy, Susan Haller, and Syed Ali. Uniform knowledge representation for NLP in the B2 system. Journal of Natural Language Engineering, 3(2), 1997.

[McRoy, 1993] Susan W. McRoy. Abductive Interpretation and Reinterpretation of Natural Language Utterances. PhD thesis, Department of Computer Science, University of Toronto, Toronto, Canada, 1993. Available as CSRI Technical Report No. 288, University of Toronto, Department of Computer Science.

[McRoy, 1995a] Susan W. McRoy. Misunderstanding and the negotiation of meaning. Knowledge-based Systems, 8(2–3):126–134, 1995.

[McRoy, 1995b] Susan W. McRoy. Misunderstanding and the negotiation of meaning. Knowledge-based Systems, 1995.

[Novick and Ward, 1993] David Novick and Karen Ward. Mutual beliefs of multiple conversants: A computational model of collaboration in air traffic control. In Proceedings, National Conference on Artificial Intelligence (AAAI-93), 1993.

[Schegloff and Sacks, 1973] Emanuel A. Schegloff and Harvey Sacks. Opening up closings. Semiotica, 7:289–327, 1973.

[Schegloff, 1992] Emanuel A. Schegloff. Repair after next turn: The last structurally provided defense of intersubjectivity in conversation. American Journal of Sociology, 97(5):1295–1345, 1992.

[Shapiro and Group, 1992] Stuart C. Shapiro and The SNePS Implementation Group. SNePS-2.1 User's Manual. Department of Computer Science, SUNY at Buffalo, 1992.

[Shapiro and Rapaport, 1987] S. C. Shapiro and W. J. Rapaport. SNePS considered as a fully intensional propositional semantic network. In N. Cercone and G. McCalla, editors, The Knowledge Frontier, pages 263–315. Springer-Verlag, New York, 1987.

[Shapiro and Rapaport, 1992] Stuart C. Shapiro and William J. Rapaport. The SNePS family. Computers & Mathematics with Applications, 23(2–5), 1992.

[Shapiro et al., 1994] Stuart Shapiro, William Rapaport, Sung-Hye Cho, Joongmin Choi, Elissa Feit, Susan Haller, Jason Kankiewicz, and Deepak Kumar. A dictionary of SNePS case frames, 1994.

[Winograd, 1983] Terry Winograd. Language as a Cognitive Process, Volume 1: Syntax. Addison-Wesley, Reading, MA, 1983.

[Zukerman, 1991] Ingrid Zukerman. Avoiding mis-communications in concept explanations. In Proceedings of the 13th Annual Conference of the Cognitive Science Society, pages 406–411, Chicago, IL, 1991.