This article was downloaded by: [The University of Edinburgh] On: 11 June 2015, At: 03:45 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Language and Cognitive Processes

Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/plcp20

What makes dialogues easy to understand?

Holly P. Branigan, Ciara M. Catchpole & Martin J. Pickering

Department of Psychology, University of Edinburgh, Edinburgh, UK

Published online: 08 Dec 2010.

To cite this article: Holly P. Branigan, Ciara M. Catchpole & Martin J. Pickering (2011) What makes dialogues easy to understand?, Language and Cognitive Processes, 26:10, 1667-1686, DOI: 10.1080/01690965.2010.524765

To link to this article: http://dx.doi.org/10.1080/01690965.2010.524765


LANGUAGE AND COGNITIVE PROCESSES 2011, 26 (10), 1667-1686

What makes dialogues easy to understand?

Holly P. Branigan, Ciara M. Catchpole, and Martin J. Pickering

Department of Psychology, University of Edinburgh, Edinburgh, UK

Two experiments investigate the question of why dialogues tend to be easier for anyone to understand than monologues. One possibility is that overhearers of dialogue have access to the different perspectives provided by the interlocutors, whereas overhearers of monologue have access to the speaker’s perspective alone (Fox Tree, 1999). Directors first described a set of geometric shapes to matchers in monologue or dialogue eight times. Experiment 1 found that descriptions taken from dialogue were easier to understand than descriptions taken from monologue or descriptions taken from dialogue in which the matcher’s contributions were excised. This advantage occurred on early trials (when the matcher made a considerable contribution) but also on late trials (when the matcher simply accepted a description). Experiment 2 replicated this finding and ruled out an explanation in which the advantage of dialogue is due to its use of discourse markers. We argue that the ease of dialogue occurs because interlocutors negotiate a perspective that they can agree on (Clark, 1996). This grounded perspective is likely to be objectively easier to understand than a perspective that has not been grounded.

Keywords: Dialogue; Monologue; Overhearer.

Correspondence should be addressed to Holly Branigan, Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK. E-mail: holly. [email protected]

We acknowledge support of ESRC Grant No. RES-062-23-0376 and an ESRC postgraduate studentship, and thank Janet McLean for her comments.

© 2011 Psychology Press, an imprint of the Taylor & Francis Group, an Informa business
http://www.psypress.com/lcp
http://dx.doi.org/10.1080/01690965.2010.524765

Speakers tend to produce utterances that are intelligible not only to their addressees, but also to anyone who hears them (overhearers). Speakers might achieve this entirely independently, but (as we shall see) utterances produced in contexts in which the speaker and addressee both contribute (i.e., utterances produced in dialogue) tend to be more intelligible than utterances produced by
an isolated speaker (i.e., utterances produced in monologue). Why might this be so? In this paper, we investigate this issue by comparing the comprehensibility of descriptions made to an addressee who could not provide feedback (monological descriptions) with descriptions made to an addressee who could provide feedback (dialogical descriptions), both when overhearers could hear that feedback and when they could not. One possibility is that descriptions produced in a dialogue context are more comprehensible because addressees’ contributions tend to propose a different perspective to that provided in speakers’ contributions, so that any overhearer has more than one way of understanding descriptions in dialogue. Alternatively, they may be more comprehensible because the speaker and addressee jointly construct a perspective that they agree on, and any agreed-on perspective tends to be more intelligible than a perspective that has not been agreed on. Speakers try to ensure that their message is understood by their addressees, and so they assess what their addressees are likely to understand. In this paper, we are concerned with cases where the speaker and addressee are strangers, and the messages that they wish to convey concern novel shapes (specifically, complex geometric shapes, or tangrams, such as Figure 1), so that they cannot fall back on previous shared experiences or knowledge. To refer successfully to a tangram as in Figure 1, then, speakers need to assess what kind of description their addressees are likely to be able to interpret as referring to that shape. In monologue, they rely solely on this assessment to choose between alternative descriptions (e.g., The ice skater vs. The chicken) because their addressee cannot provide feedback about their comprehension. For example, when a speaker utters The ice skater, she cannot know whether or not her addressee understands that it refers to Figure 1. 
That is, she does not know whether the addressee believes himself to have correctly interpreted her perspective and the corresponding referring expression. She therefore has to decide when to move on to her next contribution without confirmation of understanding.

Figure 1. Example tangram figure.

But when a speaker in a dialogue describes the same tangram as The ice skater, she can tell from her addressee’s feedback whether he
believes he understands its referent or not. By providing confirmatory feedback (e.g., uh-huh), the addressee indicates that he accepts the speaker’s perspective (and the corresponding referring expression). In Clark’s (1996) terms, the speaker and the addressee both (mutually) believe that the addressee has understood the speaker (well enough for current purposes). Their mutually accepted perspective and the corresponding referring expression have then been grounded and may then enter their store of mutually known information, or common ground.

Thus addressees should comprehend speakers’ utterances more easily when they can provide feedback (i.e., in dialogue) than when they cannot (i.e., in monologue): by providing feedback, they are able to shape the speaker’s contributions, and in particular to help the speaker adopt a perspective that they can interpret. For example, they can indicate that they find a particular perspective difficult to adopt or offer an alternative perspective that may be more easily adopted (e.g., I don’t know what you mean by ice skater - do you mean the one that looks a bit like a chicken?).

In keeping with this, Clark and Wilkes-Gibbs (1986) found that when pairs of participants repeatedly described a set of tangrams to each other, they initially went through a process of putting forward possible perspectives and negotiating until they agreed on a perspective. At first, they often made multiple contributions that drew on different perspectives, with the director (who had to communicate the appropriate tangram) and the matcher (who had to find the appropriate tangram) proposing alternatives. But once a perspective (and corresponding referring expression) had been mutually accepted (i.e., grounded), participants used this single perspective. Thus, later discussions of the same tangram tended to make use of this perspective.
Later turns were much shorter than earlier turns, both because participants used a single perspective from the outset in later turns, and because such agreed perspectives could be expressed more concisely. Moreover, identification of the relevant tangram was usually achieved in just two turns, with the director providing the agreed perspective and the matcher acknowledging understanding. There is also evidence that listeners who cannot provide feedback understand less well than listeners who can provide feedback. Schober and Clark (1989) had overhearers listen to tape recordings of dialogues in which a director described tangrams to a matcher. They found that the overhearers were much worse at understanding directors’ descriptions than were the original matchers (even when the overhearers were permitted to pause the recordings, thus allowing them extra time to interpret the directors’ description that the matchers did not have). Taken together, these studies show that listeners who are able to provide feedback to the speaker find it easier to comprehend descriptions than listeners who are not able to provide feedback. This may be because providing feedback to the speaker allows listeners to direct the speaker

towards a perspective that they find easier to adopt (and listeners who cannot provide feedback cannot direct the speaker in this way). If so, feedback facilitates comprehension specifically for the person who provided that feedback.

However, these results do not indicate whether descriptions produced in dialogue are objectively more or less comprehensible than descriptions produced in monologue. That is, do listeners who cannot provide feedback find dialogical or monological descriptions easier to understand? In fact, we might expect that any listener would find descriptions produced in monologue generally easier to understand than descriptions produced in dialogue. In monologue, speakers should try to use perspectives that can be easily understood, because they know that the listener cannot provide feedback to help them refine their descriptions. The perspectives that they choose may therefore be more accessible to any listener from the same community as the intended addressee. But in dialogue, speakers’ perspectives will be closely tailored to fit their particular addressees’ feedback; as a result, their partner-specific perspectives may be less accessible to a third party.

Yet surprisingly, there is some evidence that listeners find it easier to comprehend dialogical descriptions than monological descriptions, even when they are not participants in the dialogue (and hence cannot provide feedback). Fox Tree (1999) had participants listen to tape recordings of monological tangram descriptions (in which the matcher had been in a different room to the director and could not provide feedback), and dialogical tangram descriptions (in which the addressee had been in the same room as the speaker and provided feedback). (Note that the dialogical descriptions included any feedback produced by the matcher.)
Participants comprehended dialogical descriptions better than monological descriptions, identifying the tangram 90% of the time in the dialogical condition, compared to 85% of the time in the monological condition. Dialogical descriptions may therefore tend to be more comprehensible than monological descriptions, and this difference in comprehensibility for listeners is not reducible to their ability to provide feedback.

Fox Tree (1999) suggested two possible reasons for this finding. First, it might reflect the high frequency of discourse markers such as I mean and you know in dialogue, a class of expressions that specify how the second part of an utterance should be interpreted with respect to the first part (Fraser, 1999). She argued that these markers could help speakers structure their descriptions (and so make them more comprehensible). In keeping with this, Bangerter and Clark (2003) proposed that discourse markers (in their terms, project markers) serve in part to manage dialogue by signalling “vertical transitions”, such as a change of topic or a request for more information. Discourse markers might be helpful to listeners because they mark such vertical transitions and allow the addressee to follow the speaker’s train of
thought more easily. In support of this, Fox Tree (1999) found a positive correlation between the number of discourse markers in a description and overhearers’ accuracy in identifying the appropriate referent, and Fox Tree and Schrock (1999) found that listeners were faster to recognise a word if it was preceded by the discourse marker oh than if the oh was replaced with a pause or simply excised from speech. However, Fox Tree (1999) also considered an alternative possibility. Dialogues might be easier to understand than monologues because two interlocutors tend to present two (or perhaps more) perspectives on an issue under discussion. Multiple perspectives increase the chance that the listener will be able to adopt one of them, whereas an isolated speaker typically presents a single perspective that may or may not coincide with a perspective that a listener can adopt. Fox Tree and Mayer (2008) tested these possibilities by comparing how accurately participants identified tangrams when listening to recordings of monological and dialogical descriptions that included a single perspective or multiple perspectives. To do this, they selected descriptions recorded in monologue and dialogue that they judged to include a single perspective or multiple perspectives; for example, they treated It looks like a fish. . .there’s a big blob and then a little triangle on the bottom as multiple perspectives. They found that overhearers were more accurate when they heard descriptions that included multiple perspectives, irrespective of the number or rate of discourse markers, and irrespective of whether those perspectives were produced in monologue or dialogue. They therefore suggested that listeners find it easier to understand descriptions that involve multiple perspectives than single perspectives. 
Moreover, they argued that dialogue is not inherently easier to comprehend than monologue; its advantage lies in the fact that it tends to contain more perspectives than does monologue. However, the monologue and dialogue items that Fox Tree and Mayer selected may also have differed in other respects, as selection depended on their intuitions. We assume that successful comprehension occurs when the listener is able to adopt the same perspective (e.g., conceptualising a tangram as a chicken) as the speaker. Under Fox Tree and Mayer’s (2008) account, then, the advantage that dialogue generally offers over monologue for overhearers lies in the access that it offers to multiple perspectives, because there is a greater chance that one of them will coincide with a perspective that an overhearer can adopt (and presumably, the more perspectives to which he has access, the more likely he is to be able to adopt one of them). Note that these perspectives might constitute unrelated interpretations of an entity (e.g., chicken vs. ice skater, for Figure 1), but could also constitute related interpretations (e.g., chicken with wings vs. chicken with legs). In contrast, when there is a single perspective available, as tends to be the case in monologue, this perspective may or may not coincide with a

perspective that an overhearer can adopt. If it does not coincide, then the overhearer will not correctly interpret the speaker’s intention. Under this distinct perspectives account, overhearers should benefit from overhearing the process by which interlocutors negotiate multiple perspectives in order to agree upon one of them, but should not benefit from hearing only the outcome of that process (i.e., the single perspective that interlocutors ultimately agree upon). Indeed, Fox Tree and Mayer (2008) suggested that such jointly established perspectives may be idiosyncratic, and hence particularly opaque to overhearers.

However, the evidence is also compatible with an alternative, the grounded perspective account, in which overhearers generally find dialogical descriptions easier than monological descriptions because a perspective that a speaker and addressee can agree on is likely to be easier for another person to adopt than a perspective that the speaker alone adopts. That is, a grounded perspective is more likely than an ungrounded perspective to coincide with a perspective that the overhearer can adopt. This might be because participants iteratively refine a perspective through their interaction in ways that are likely to make it more generally accessible to any comprehender (e.g., looks like an ice skater → kind of like a bird, maybe a chicken → a chicken with a pointy beak). But it might also be because the perspective that they ultimately agree upon is likely to be the perspective that involves least collaborative effort to adopt (see Clark & Schaefer, 1987). For example, both of them may easily conceptualise a tangram as a chicken, whereas one of them may find it very difficult to conceptualise it as an ice skater. That perspective is therefore also likely to be more accessible to another person. Either way, it is more likely that an overhearer will be able to adopt a grounded perspective than an ungrounded perspective.
Under this account, then, the primary advantage of dialogue over monologue for overhearers is not that it offers multiple perspectives, but rather that any perspective it offers is better than the perspective offered by monologue. This is because it gives access to a single perspective that has been mutually accepted. For this reason, overhearers should benefit from hearing the outcome of the process by which interlocutors come to agree upon a perspective (i.e., hearing just the grounded perspective), and not merely from hearing the process of negotiating alternative perspectives. The grounded perspective account therefore predicts that the perspectives that speakers arrive at in dialogue are objectively more comprehensible than those that speakers adopt in monologue. To test these alternative accounts, we carried out two experiments in which participants listened to recordings of directors and matchers carrying out a tangram-description task (Clark & Wilkes-Gibbs, 1986), and attempted to identify the tangrams that were being described. The director repeatedly described the same set of tangrams and the matcher was either permitted to

provide feedback to the director or was not. The recordings included descriptions made when the interlocutors first encountered the tangrams (hence had to ground a perspective for the first time) and subsequent descriptions of the same tangrams (when they could appeal to a grounded perspective).

In Experiment 1, we manipulated the type of description that participants heard. In one condition, they heard a director describing a tangram to a matcher who could not provide feedback, so that the director was producing descriptions in a monologue (monologue condition). In another condition, they heard both a director describing a tangram to a matcher and the matcher’s feedback, hence both the director’s and the matcher’s contributions to a dialogue (full-dialogue condition). In a third condition, they heard only the director’s contributions to a dialogue (half-dialogue condition). In Experiment 2, we replicated the monologue and full-dialogue conditions in order to be more confident about our findings, but replaced the half-dialogue condition with a condition in which participants overheard the director’s and the matcher’s contributions but with discourse markers excised (see Sections “Introduction” to “Experiment 2”). We now discuss how we generated these descriptions.

DESCRIPTION-GENERATION TASK

To generate the tangram descriptions, we carried out a referential-communication tangram-matching study in two conditions (dialogue-generation and monologue-generation conditions). Twelve pairs of participants from the University of Edinburgh student community took part (data from four further pairs were excluded owing to audio-recording failure). One member of each pair was randomly assigned the role of director; the other was assigned the role of matcher. Six pairs were randomly assigned to the monologue-generation condition and six pairs to the dialogue-generation condition.

The participants were tested one pair at a time in an experimental room. The director and the matcher sat at separate tables and faced opposite directions, so that they could hear each other, but could not see each other or each other’s materials; thus visual feedback was eliminated. Each pair took part in two games. The director was given an A4 file containing 96 pages, 48 for each game. Each page contained a grid showing a tangram (Elffers, 1976) in one of the squares; all the other grid squares were empty. The grid squares were labelled A-D along the bottom and 1-4 up the side. The director was also given the full array of 12 tangrams laid out on the table (to facilitate his or her descriptions). The matcher was given a set of 12 tangram cards laid out on the table, beside a larger, empty version of the director’s grid. The director described to the matcher the tangram shown on each page and its position in the grid, in a way which would allow the matcher to pick out the tangram from his or her own set of 12 and place it in the correct position on his or her own grid. Each tangram was described four times in each of two games, such that once placed on the grid, each tangram would be moved to a further three positions within the same game. Participants did not receive feedback about their performance (i.e., how accurately they matched descriptions to tangrams). Following the completion of the first game, the matcher’s tangrams were removed from the grid, shuffled, and laid out on the table again, and the director continued to the next set of 48 pages. The order of description of tangrams was randomised for each pair of participants, subject to the restriction that each tangram would appear once within 12 consecutive moves. This resulted in the production of eight descriptions of each tangram (Rounds 1-8).

In the monologue-generation condition, only the director was permitted to speak. The director wore a lapel microphone attached to a DAT recorder and a pair of sound-attenuating earmuffs to block out the sound of the matcher placing his or her cards (because Schober, 1993, found in pilot tests that experimental participants in a similar condition used the sounds of their partners’ pens scratching as a form of feedback). Thus the director received no feedback about when the matcher had picked out a tangram. In the dialogue-generation condition, both director and matcher were permitted to speak, the director did not wear earmuffs, and both participants wore lapel microphones. Responses were recorded into a single file on a DAT recorder, with the director and matcher recorded on separate channels. Table 1 gives examples of descriptions produced for the same tangram in Rounds 1 and 8 by a director in the monologue-generation condition and a director in the dialogue-generation condition.
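The ordering restriction for the description task (each tangram appearing once within 12 consecutive moves) can be illustrated with a blocked randomisation. The sketch below is only one possible reading: it assumes that the 48 pages of a game form four non-overlapping blocks of 12 moves, each an independent random permutation of the 12 tangrams. The function name `blocked_order` and its parameters are hypothetical and are not taken from the study materials.

```python
import random


def blocked_order(n_tangrams=12, n_blocks=4, seed=None):
    """Return one game's description order (hypothetical helper).

    Assumption: the game is split into consecutive blocks of n_tangrams moves,
    and every tangram occurs exactly once in each block.
    """
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = list(range(1, n_tangrams + 1))
        rng.shuffle(block)  # independent permutation for each block of moves
        order.extend(block)
    return order


# One game of 48 moves; each tangram is described four times (four rounds).
game = blocked_order(seed=1)
```

Under this reading, two games per pair give the eight descriptions of each tangram, with block k of a game corresponding to one round.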
TABLE 1
Description-generation task. Example descriptions produced in Rounds 1 and 8 by a director in the monologue-generation condition and a director in the dialogue-generation condition

Monologue
  Round 1: Looks like a man’s like crouching like you’d see in a driving instructor manual, it’s like he’s prepared to drive, ready for the driving position.
  Round 8: Person that looks like he’s about to drive.
Dialogue
  Round 1: Somebody sat down in a chair except there’s no chair and their hands are out in front of them.
  Round 8: Person sat in mid-air.

The session lasted about 30 minutes. The mean lengths of the director’s descriptions (in words) are shown in Table 2. Two (Condition, between-participants and within-items) × eight (Round, within-participants and within-items) repeated measures ANOVAs yielded main effects of Round (F1(7, 77) = 35.03, p < .01; F2(7, 77) = 132.84, p < .01), with descriptions shortening in later repetitions; and Contribution Type (F1(1, 11) = 97.59, p < .01; F2(1, 11) = 74.66, p < .01), with descriptions in the dialogue-generation condition being shorter than those in the monologue-generation condition. There was also an interaction (F1(7, 77) = 4.89, p < .05; F2(7, 77) = 7.45, p < .01): descriptions shortened more with repetition in the dialogue-generation condition than the monologue-generation condition. In the dialogue-generation condition (see Table 3), the matcher produced less feedback (indexed by mean number of words) in later rounds than in earlier rounds (F1(7, 35) = 3.45, p < .01; F2(7, 77) = 15.19, p < .001). The number of turns taken per tangram also reduced in later rounds (F1(7, 35) = 8.43, p < .001; F2(7, 77) = 35.2, p < .001).

TABLE 2
Description-generation task. Mean number of words (and SD) used by director per description

Round   Monologue generation   Dialogue generation
1       30.1 (12.1)            30.4 (20.69)
2       27.3 (14.8)            17.3 (11.21)
3       26.1 (13.11)           12.5 (9.57)
4       23.3 (12.7)            10.7 (8.04)
5       20.5 (11.60)           8.8 (8.52)
6       19.4 (11.28)           6.8 (7.17)
7       18.8 (11.35)           5.8 (4.36)
8       18.3 (11.74)           6.0 (4.59)
Mean    23.0 (13.10)           12.3 (12.97)
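As a rough illustration of the Round by Condition interaction, the round means reported in Table 2 can be converted into proportional reductions between Round 1 and Round 8. The sketch below uses only the means from Table 2; the helper name `proportional_reduction` is introduced here for illustration, and the calculation is descriptive arithmetic, not part of the reported analyses.

```python
# Mean words per director description, Rounds 1-8, copied from Table 2.
monologue = [30.1, 27.3, 26.1, 23.3, 20.5, 19.4, 18.8, 18.3]
dialogue = [30.4, 17.3, 12.5, 10.7, 8.8, 6.8, 5.8, 6.0]


def proportional_reduction(means):
    """Share of the Round 1 length that has been lost by the final round."""
    return (means[0] - means[-1]) / means[0]


for label, means in (("monologue", monologue), ("dialogue", dialogue)):
    print(f"{label}: {proportional_reduction(means):.0%} shorter in Round 8 than in Round 1")
```

On these means, descriptions shorten by roughly 39% in monologue and roughly 80% in dialogue, which is consistent with the reported interaction.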

TABLE 3
Description-generation task. Mean number of words (and SD) used by matcher and number of turns (and SD) per description in dialogue-generation condition

Round   Words by matcher (SD)   Turns (SD)
1       12.5 (23.05)            4.7 (3.29)
2       4.5 (7.23)              2.8 (1.28)
3       3.3 (6.73)              2.7 (1.41)
4       2.7 (4.75)              2.5 (1.11)
5       2.1 (4.05)              2.4 (1.13)
6       1.5 (2.03)              2.2 (0.65)
7       1.0 (0.12)              2.0 (0.22)
8       1.1 (0.60)              2.1 (0.36)
Mean    3.6 (9.78)              2.7 (1.65)

These results replicate previous findings that speakers who repeatedly describe the same novel objects shorten their descriptions in dialogue and that they do so to a greater extent than in monologue (Clark & Wilkes-Gibbs, 1986; Krauss & Weinheimer, 1966). Moreover, addressees shortened their feedback largely to a single-word response by Round 8. This suggests that feedback serves to ground descriptions so that speakers do not need to provide an alternative and can express a single description more succinctly. Therefore we assume that, by Round 8, the director provided a single grounded description and the matcher accepted this grounded description without further elaboration. We also found that directors shortened their descriptions to some extent in monologue. Directors presumably reduce their false starts, sometimes decide for themselves among competing descriptions, or guess that an intelligent matcher is likely to understand a somewhat more succinct description.

EXPERIMENT 1

In Experiment 1, we used the descriptions generated in the tangram-description task to compare how well participants could understand descriptions of tangrams that were produced in the monologue-generation condition or the dialogue-generation condition. We also manipulated whether participants who overheard descriptions in dialogue heard both the director’s and the matcher’s contributions (full dialogue), or only the director’s contributions (half dialogue). We further manipulated whether the descriptions were taken from an early round or from the final round of the task.

The distinct perspectives account predicts that overhearers should find full-dialogue descriptions from early rounds easier to understand than monologue descriptions from early rounds, because the full-dialogue descriptions often involve different perspectives from the director and matcher and the monologue descriptions do not. Overhearers should also find full-dialogue descriptions from early rounds easier to understand than half-dialogue descriptions from early rounds, again because the half-dialogue descriptions do not involve different perspectives from the director and matcher. However, the benefit of hearing the matcher’s contributions in full-dialogue descriptions should be restricted to descriptions from early rounds (and most marked for the first round), where multiple perspectives are most likely to be presented and negotiated; it should not hold for the final round, where directors and matchers do not present multiple perspectives (and in particular, matchers’ contributions largely comprise single-word acknowledgements). Specifically, the distinct perspectives account predicts that overhearers should not find full-dialogue descriptions from the final round easier to understand than half-dialogue and monologue descriptions from the final round, because full-dialogue descriptions in this round do not contain more perspectives than the other conditions.

The grounded perspective account notes that both kinds of description generated in dialogue should be grounded by the end of the first round. It therefore predicts that overhearers should find both full-dialogue and half-dialogue descriptions from the first round easier to understand than monologue descriptions from the first round. Critically, the benefit of hearing half-dialogue and full-dialogue descriptions will also hold for descriptions from later rounds, because these descriptions have been grounded (in earlier rounds) but the monologue descriptions are still not grounded. Specifically, overhearers should find full- and half-dialogue descriptions from the final round easier to understand than monologue descriptions from the final round.

Method

Participants

Seventy-two further University of Edinburgh students volunteered to participate.

Items and design

Our items were 12 tangram pictures and associated descriptions taken from the description-generation task. We used descriptions produced when a particular tangram was described in the first, second, and eighth rounds. We selected these rounds because using all eight rounds would be unnecessary, Rounds 1 and 8 are maximally different, and Round 2 is most clearly different from Rounds 1 and 8 (at least in terms of length, see Table 2). To construct the monologue descriptions, we extracted the descriptions by the 12 directors of each tangram in the monologue-generation condition of the description-generation task. To construct the dialogue descriptions, we extracted the descriptions by the 12 directors and matchers of each tangram in the dialogue-generation condition of the description-generation task. To construct the half-dialogue descriptions, we extracted the descriptions by the 12 directors of each tangram in the dialogue-generation condition of the description-generation task (i.e., excising the matchers’ contributions). Given that six pairs completed the description task in the monologue-generation condition and six in the dialogue-generation condition, this procedure led to six monologue and six dialogue descriptions for each tangram for each of Rounds 1, 2, and 8. We then constructed stimuli for each participant by pairing each picture with one of these descriptions. The stimuli for each participant consisted of four tangrams paired with monologue descriptions, four tangrams paired with half-dialogue descriptions, and four tangrams paired with full-dialogue descriptions. Each participant received stimuli from one of Rounds 1, 2, or 8. All participants received each tangram once, and each tangram was presented equally often in the different conditions. Pairings between tangrams and descriptions were randomly varied across participants subject to these constraints, except that every description was used at least once in the experiment. The order of descriptions was randomised across participants, such that no two participants heard the same order of tangrams. There were nine conditions, created by crossing Contribution type (monologue vs. full dialogue vs. half dialogue; within-participants and -items) and Round (1 vs. 2 vs. 8; between-participants and within-items).
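The counterbalancing constraints described above can be illustrated with a minimal Python sketch. It is not the procedure the authors used: the article says only that pairings were randomly varied subject to the constraints, whereas this sketch assumes a simple rotation of tangrams through conditions. The names `assign_conditions` and `build_trial_list` are hypothetical, and the choice of a specific recorded description for each tangram is left out.

```python
import random

# Condition labels taken from the design; the rotation scheme is an assumption.
CONDITIONS = ["monologue", "half-dialogue", "full-dialogue"]
N_TANGRAMS = 12


def assign_conditions(participant_index, n_tangrams=N_TANGRAMS, conditions=CONDITIONS):
    """Map each tangram index to a condition by rotating across participants.

    Every participant gets n_tangrams / len(conditions) tangrams per condition,
    and over any len(conditions) consecutive participants each tangram appears
    once in every condition.
    """
    k = len(conditions)
    return {t: conditions[(t + participant_index) % k] for t in range(n_tangrams)}


def build_trial_list(participant_index, seed=None):
    """Return (tangram, condition) pairs in a shuffled presentation order."""
    rng = random.Random(seed)
    trials = list(assign_conditions(participant_index).items())
    rng.shuffle(trials)
    return trials
```

Shuffling alone does not guarantee that no two participants receive the same order, which the article reports as a property of its stimulus lists; a real implementation would need to check generated orders against those already used.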

Procedure

Participants were tested individually in an experimental cubicle with a PC. Each participant read instructions displayed on the screen and the experimenter answered any questions before the experiment began. Participants also saw a sample tangram picture, to familiarise them with the basic structure of the shapes. The experiment comprised 12 trials. In each trial, an array of 20 numbered tangrams appeared on the computer screen, as in Figure 2. Twelve of these tangrams were experimental items taken from the description-generation task (1, 2, 4, 7, 8, 10–13, 15, 17, and 19 in Figure 2), each corresponding to one of the recorded descriptions, and the remaining eight were foils. The same set of 20 tangrams was presented to every participant on every trial, but in a different order on each trial. We presented the materials using E-Prime software (Psychology Software Tools Inc., Pittsburgh, PA). For each trial, participants were instructed to look at the array for a few minutes when it first appeared to familiarise themselves with the tangrams, then to press the spacebar. Participants then heard a recorded description of one of the experimental tangrams played through external computer speakers. The array of pictures remained on the screen while the description was being played. Participants said the number of the tangram that they thought was being described. This response was recorded by a lapel microphone. After completing the experiment, participants were asked what they thought it had been about. All of them thought it was about how good people are at following instructions that were intended for someone else (which was the explanation given in the instructions).

Figure 2. Experiments 1 and 2: Example tangram array.

Results

We scored correctly identified tangrams as “1”, and incorrectly identified tangrams as “0” (see Table 4). Analyses on the mean values for each participant and each item revealed a main effect of Contribution type (F1(2, 138) = 6.94, p < .01; F2(2, 22) = 7.40, p < .01). Planned comparisons showed that participants were less accurate in the monologue condition than in the full-dialogue condition (t1(71) = 3.51, p < .01; t2(11) = 3.69, p < .01) and the half-dialogue condition (t1(71) = 2.81, p < .01; t2(11) = 2.2, p = .05); however, the full-dialogue and half-dialogue conditions did not differ from each other (|t1| < 1; t2(11) = 1.54, p = .15). There was also a main effect of Round, marginal by items (F1(2, 69) = 4.83, p < .05; F2(2, 22) = 3.22, p < .06), indicating that participants tended to be less accurate for later than earlier descriptions. There was no interaction between Contribution type and Round (both F < 1). These results show that overhearers found both full dialogue and half dialogue easier to understand than monologue, and that similar effects occurred for both early and late rounds. In subsequent analyses, we examined results for Rounds 1 and 8 separately.

TABLE 4
Experiment 1. Mean identification accuracy scores (and SD) for Rounds 1, 2, and 8 in monologue, half-dialogue, and full-dialogue conditions

Round   Monologue     Half dialogue   Full dialogue   Mean
1       0.46 (0.50)   0.53 (0.50)     0.64 (0.48)     0.54
2       0.45 (0.50)   0.54 (0.50)     0.54 (0.50)     0.51
8       0.32 (0.47)   0.46 (0.50)     0.47 (0.50)     0.42
Mean    0.41 (0.49)   0.51 (0.50)     0.55 (0.49)     0.49


In Round 1, there was a marginal effect of Contribution type (F1(2, 46) = 3.18, p < .06; F2(2, 22) = 3.29, p < .06). Planned comparisons showed that participants were less accurate in the monologue condition than in the full-dialogue condition (t1(23) = 2.66, p < .05; t2(11) = 2.60, p < .05), but there was no difference between the monologue and half-dialogue conditions (both |t| < 1.1), or between the half-dialogue and full-dialogue conditions (t1(23) = 1.42, p = .17; t2(11) = 1.76, p = .11). In Round 8, there was a main effect of Contribution type (F1(2, 46) = 3.63, p < .05; F2(2, 22) = 4.98, p < .05). Planned comparisons showed that participants were less accurate in the monologue condition than in the full-dialogue condition (t1(23) = 2.17, p < .05; t2(11) = 2.92, p < .05) and in the half-dialogue condition (marginal by items: t1(23) = 2.18, p < .05; t2(11) = 2.01, p < .07). The lack of feedback by Round 8 of course means that the half-dialogue and full-dialogue conditions should have been very similar; as expected, there was no difference between full-dialogue and half-dialogue conditions (both |t| < 1).
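The by-participants and by-items comparisons reported above start from binary trial scores that are averaged per participant (for t1 and F1) or per item (for t2 and F2) before any test is run. The Python sketch below illustrates that aggregation step and a paired t statistic, assuming a simple trial-record layout. The function names and record keys are hypothetical, and the sketch ignores the Round factor and the ANOVAs themselves.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def condition_means(trials, unit):
    """Average 0/1 scores within each (unit, condition) cell.

    trials: list of dicts with keys 'participant', 'item', 'condition', 'correct'.
    unit:   'participant' for by-participants means, 'item' for by-items means.
    """
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial[unit], trial["condition"])].append(trial["correct"])
    return {key: mean(scores) for key, scores in cells.items()}


def paired_t(means, cond_a, cond_b):
    """Paired t statistic and degrees of freedom for cond_a minus cond_b.

    Assumes every unit has a mean in both conditions and that the
    differences are not all identical (otherwise the SD is zero).
    """
    units = sorted({u for (u, _) in means})
    diffs = [means[(u, cond_a)] - means[(u, cond_b)] for u in units]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1
```

Calling `condition_means(trials, "participant")` and `condition_means(trials, "item")` on the same trial list yields the two sets of means that the two analyses would be based on.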

Discussion

Participants were more accurate at matching tangrams to descriptions when they overheard descriptions that had been produced in dialogue, even when they only overheard one person’s contribution, than when they overheard descriptions that had been produced in monologue. Importantly, this pattern held when overhearers heard directors describe a tangram for the eighth time (i.e., descriptions in which directors did not present multiple perspectives and matchers’ contributions were largely single-word acknowledgements). These results suggest that dialogical descriptions are more comprehensible to overhearers than monological descriptions. More specifically, they support the grounded perspective account and provide no support for the distinct perspectives account.

EXPERIMENT 2

We have assumed that overhearers find dialogical descriptions easy to understand because they are grounded. But we have noted that Fox Tree (1999) observed that dialogical descriptions contained more discourse markers than monological descriptions, and suggested that this might explain their comprehensibility. We therefore counted the number of discourse markers in the description-generation task. We considered the same five markers as Fox Tree (1999), well, I mean, you know, like, and oh, and followed her definition for determining which uses were discourse markers. “Nonpragmatic [nondiscourse marker] uses were defined as those which could carry normal nominal, verbal or adverbial functions, as in ‘do you know which one I mean?’ and ‘it looks like a seal’. Pragmatic [discourse marker] uses did not fit these criteria, as in ‘the other foot is supporting him and like he’s leaning over against the wall’ and ‘it’s just just you know pushed towards the left’” (p. 48).

We found the same pattern as Fox Tree in the description-generation task (see Table 5). ANOVAs revealed that there were more discourse markers in the dialogue-generation condition than in the monologue-generation condition (F1(1, 10) = 17.03, p < .01; F2(1, 5) = 8.39, p < .05), and in earlier rounds than in later rounds (F1(2, 20) = 5.37, p = .01; F2(2, 10) = 6.60, p < .05). However, the difference in the number of discourse markers between conditions eventually disappeared: it was greater in earlier rounds than in later rounds (F1(2, 20) = 4.97, p < .05; F2(2, 10) = 6.24, p < .05). Planned comparisons showed that there were more discourse markers in the dialogue-generation condition than in the monologue-generation condition in Round 1 (t1(10) = 2.68, p < .05; t2(10) = 2.64, p < .05) and Round 2 (t1(10) = 4.06, p < .01; t2(10) = 2.86, p < .05); there were no discourse markers in either condition in Round 8.

These differences might also play a role in explaining the differences in ease of understanding dialogical and monological descriptions. Fox Tree (1999) suggested that discourse markers in directors’ contributions might help overhearers (and addressees) to structure the content of dialogue and separate the ideas contained within it more effectively, and thus facilitate comprehension. In Experiment 2, we therefore replaced the half-dialogue condition from Experiment 1 with a condition in which participants heard both the director’s and matcher’s contributions from descriptions produced in the dialogue-generation condition in the description-generation task, but in which the five discourse markers considered by Fox Tree (1999) had been excised from the director’s contributions (well, I mean, you know, like, and oh).
If overhearers benefit from discourse markers, then they should be more accurate at identifying tangrams when they hear descriptions that include discourse markers than when they hear descriptions that do not include discourse markers. Experiment 2 also investigated whether the contrast between the full dialogue and monologue conditions in Experiment 1 replicated.

TABLE 5
Directors’ contributions in description-generation task: Mean number of discourse markers per 100 words in monologue-generation and dialogue-generation conditions

Reference   Monologue generation   Dialogue generation   Mean
1           0.068                  3.51                  1.79
2           0.062                  3.26                  1.11
8           0.000                  0.000                 0.000
Mean        0.043                  2.26                  0.967
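The measure reported in Table 5, discourse markers per 100 words, can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the function name `marker_rate_per_100_words` and the tokenisation rule are hypothetical, and the code counts every occurrence of the five marker strings, whereas the definition followed in the article excludes nonpragmatic uses, which require a human judgement. The result is therefore an upper bound on the rate a coder would obtain.

```python
import re

# The five markers considered in the article (after Fox Tree, 1999).
MARKERS = ["well", "i mean", "you know", "like", "oh"]


def marker_rate_per_100_words(text, markers=MARKERS):
    """Count marker strings per 100 words of a transcript.

    Assumption: words are runs of ASCII letters and straight apostrophes.
    Pragmatic and nonpragmatic uses are NOT distinguished here.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    normalized = " ".join(words)
    count = sum(
        len(re.findall(rf"\b{re.escape(marker)}\b", normalized))
        for marker in markers
    )
    return 100.0 * count / len(words)
```

Applied to the quoted example “it looks like a seal”, the sketch counts like as a marker even though that use is nonpragmatic, which shows why string matching alone cannot reproduce the coding scheme.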


Method

Participants

Seventy-two further University of Edinburgh students volunteered to participate.


Items, design, and procedure

These were the same as in Experiment 1, except that the half-dialogue descriptions were replaced by descriptions from the dialogue-generation condition of the generation task in which occurrences of the discourse markers produced by the directors were excised (dialogue-no-markers condition). There were nine conditions, created by crossing Contribution type (monologue vs. full dialogue vs. dialogue-no-markers; within-participants and -items) and Round (1 vs. 2 vs. 8; between-participants and within-items). Note that Round 8 contained no discourse markers, and hence the descriptions used in the dialogue and dialogue-no-markers conditions were identical. During a debriefing session, participants were asked if they thought that any of the descriptions sounded unusual or as if they had been altered. None of the participants reported any such awareness.

Results

The results are reported in Table 6. Analyses were conducted as in Experiment 1. These demonstrated an effect of Contribution type, reliable by participants only (F1(2, 138) = 4.09, p < .05; F2(2, 22) = 2.12, p < .15). Paired t tests showed that, by participants only, participants were more accurate at identifying tangrams in both the full-dialogue condition (t1(71) = 2.46, p < .05; t2(11) = 1.64, p < .13) and the dialogue-no-markers condition (t1(71) = 2.52, p < .05; t2(11) = 1.66, p < .13) than in the monologue condition, but that performance did not differ between the full-dialogue and dialogue-no-markers conditions (both |t| < 1). There was also a main effect of Round (F1(2, 69) = 19.51, p < .01; F2(2, 22) = 25.60, p < .01), with participants being more accurate in earlier rounds than in later rounds. There was no interaction between Contribution type and Round (both F < 1). As Round 8 contained no discourse markers, we also conducted 3 × 2 ANOVAs on Rounds 1 and 2 only. These showed an effect of Contribution type by participants only (F1(2, 92) = 3.16, p < .05; F2(2, 22) = 1.90, p < .18), no effect of Round (both F < 2.1) and no interaction (both F < 1).

TABLE 6
Experiment 2. Mean identification accuracy scores (and SD) for Rounds 1, 2, and 8 in monologue, full-dialogue, and dialogue-no-markers conditions

Round   Monologue     Full dialogue   Dialogue-no-markers   Mean
1       0.51 (0.50)   0.66 (0.48)     0.66 (0.48)           0.61
2       0.49 (0.50)   0.57 (0.50)     0.58 (0.50)           0.55
8       0.28 (0.45)   0.39 (0.49)     0.33 (0.47)           0.33
Mean    0.43 (0.49)   0.54 (0.50)     0.52 (0.50)           0.50

In further analyses, we examined whether participants’ greater accuracy for full-dialogue descriptions than monologue descriptions correlated with the difference in number of discourse markers between monologue and dialogue descriptions (as in Fox Tree, 1999). We compared mean values for descriptions produced in Round 1 (following Fox Tree, 1999), using the number of discourse markers per 100 words; observations more than 2.5 standard deviations from the mean were replaced with this cut-off value. There was no correlation (r(12) = .1).
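The outlier treatment and correlation described above can be sketched in Python as follows. This is an illustration under assumptions, not the authors’ analysis script: the article does not say whether the standard deviation was the sample or population estimate, or whether the cut-off was applied to each variable separately. The sketch uses the sample SD, and the names `clip_to_cutoff` and `pearson_r` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def clip_to_cutoff(values, n_sd=2.5):
    """Replace values beyond mean +/- n_sd sample SDs with the cut-off value."""
    m, s = mean(values), stdev(values)
    low, high = m - n_sd * s, m + n_sd * s
    return [min(max(v, low), high) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In use, the difference in marker rate and the difference in accuracy would each be passed through `clip_to_cutoff` before being handed to `pearson_r`.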

Discussion

As in Experiment 1, participants were more accurate in identifying tangrams when they heard descriptions produced in dialogue than when they heard descriptions in monologue, irrespective of whether those descriptions were produced in early or late rounds. Unlike Experiment 1, these results were significant by participants but not by items. Accuracy in identification was not affected by whether participants heard directors’ discourse markers or not. The difference in number of discourse markers between monologue and dialogue was not related to the difference in score for monologue and dialogue. The results therefore provide further support for the grounded perspective account and provide no evidence that the presence of discourse markers enhances comprehensibility.

GENERAL DISCUSSION

In two experiments, participants overheard descriptions of geometric shapes that had been produced over eight rounds in monologue or dialogue. In Experiment 1, they overheard both the director’s and the matcher’s contributions in dialogue, the director’s contributions in dialogue, or the director’s contributions in monologue. Participants were more accurate at identifying tangrams when they heard descriptions produced in dialogue than when they heard descriptions produced in monologue, but were not affected by whether they heard the director’s contributions or both the director’s and the matcher’s contributions in dialogue. In other words, contributions in dialogue were more comprehensible than contributions in monologue, and this was because they were produced in dialogue, not because they involved two contributors rather than one. Moreover, this advantage held for descriptions produced in Round 8, as well as for descriptions produced in Round 1. Experiment 2 replicated the advantage for full-dialogue descriptions over monologue descriptions (by participants only), and additionally demonstrated that this advantage was unaffected by the removal of discourse markers from the director’s contributions.

These results help confirm that descriptions produced in dialogue can be easier for overhearers to understand than descriptions produced in monologue. But more importantly, they elucidate why this is so. Dialogue and monologue differ in many ways. Perhaps the most important difference is that in dialogue, addressees can provide feedback during a speaker’s contribution. In the case of referential communication, such feedback may include indications that the addressee cannot understand the speaker’s perspective, requests for clarifications of the speaker’s perspective, or suggestions for alternative perspectives. The greater ease of comprehension for dialogue over monologue could reflect any of these aspects of feedback.

The results from Experiment 1 argue against possibly the simplest explanation for the advantage of dialogue over monologue: that overhearers benefit from actually hearing the addressee’s feedback, presumably because it gives them access to alternative perspectives. Instead, our results show that participants were as accurate at understanding descriptions when they did not hear feedback (in the half-dialogue condition) as when they did (in the full-dialogue condition). Moreover, the benefit of overhearing dialogue held even for descriptions produced in the final round, where addressees’ feedback was generally restricted to single-word acknowledgements. Hence we can conclude that any benefit of dialogue does not derive from being able to observe the addressee’s contributions, and using the content of those contributions directly to aid comprehension.
The effects of feedback in our experiments must therefore have been indirect (i.e., not contingent upon direct access to the matcher’s words). Addressees’ feedback of course plays a role in shaping the speaker’s subsequent contributions. For example, indications that the addressee cannot understand the speaker’s perspective or requests for clarifications of the speaker’s perspective should presumably cause the speaker to revise or clarify his or her description. Indeed, other evidence suggests that feedback provided by an addressee helps speakers produce better narratives than they do if they do not receive such feedback (Bavelas, Coates, & Johnson, 2000; see also Kraut, Lewis, & Swezey, 1982). Equally, if the addressee provides alternative perspectives of the referent, the speaker should respond to these suggestions in some way. Hence overhearers may have benefited even when they heard only the director’s contributions because those contributions incorporated the addressees’ feedback.

Our results suggest that this was the case, and are informative about the way in which this might have occurred. In both experiments, overhearers benefited from hearing directors’ descriptions in Round 8, when matchers did not produce feedback that shaped the content of directors’ contributions; note that of the 96 contributions that matchers made in Round 8, all but two were single-word simple acknowledgements (e.g., yeah, uh-huh, and ok). The benefit of overhearing dialogue did not therefore lie in the ability to observe the process by which directors responded to matchers’ contributions. Instead, the benefit must have lain in the result of such a process, namely the final perspective that the director and matcher agreed on. Further evidence that the benefit of overhearing is not based upon access to the entirety of the speaker’s contribution comes from the finding that dialogue descriptions were as easily comprehended when they did not contain discourse markers as when they did. It appears that overhearers benefited from the semantic content of the contribution, rather than details of the form of the director’s (or indeed the matcher’s) contributions. Hence the pattern of findings supports an account in which the advantage of dialogue lies in the perspectives that arise from interaction, not in the process of interaction itself.

Our results therefore argue against an account in which overhearers benefit simply from being exposed to multiple perspectives (on the assumption that exposure to multiple perspectives increases the likelihood that the overhearer will be able to adopt one of them). Instead, they support an account in which overhearers benefit from being exposed to a single perspective on which the interlocutors have agreed as a result of the interaction. On this account, interlocutors negotiate until they are able to agree upon a perspective that is shared (and which the interlocutors believe to be shared). In Clark’s (1996) terms, the perspective is grounded. It therefore appears that grounded perspectives tend to be well-adapted.
It is not surprising that they are well-adapted for the interlocutors, because they must both agree that the perspective is appropriate and is relatively easy to use, in the sense that the joint effort of using it is low. Note that such a perspective is not necessarily each interlocutor’s preferred perspective (i.e., the perspective that each individual interlocutor could adopt with least effort); indeed, it may be neither interlocutor’s preferred perspective. But it is the preferred joint perspective, because the interlocutors can adopt that perspective with less joint effort than an alternative perspective. More interestingly, our results indicate that grounded perspectives are objectively well-adapted, in the sense of being better for any overhearer. This goes against Fox Tree and Mayer’s (2008) suggestion that grounded perspectives tend to be idiosyncratic and obtuse. Thus interlocutors do not appear to set up a “private key” that allows privileged access to some hearers but not others (Clark & Schaefer, 1987). In some cases, interlocutors do produce descriptions that are hard for any outsider to understand; this is most likely to occur when interlocutors can draw upon shared knowledge to which the outsider is not privy (e.g., previous joint experiences), or when interlocutors set out to disguise their meaning deliberately. However, we suggest that such obtuse descriptions do not tend to occur in normal conversation between interlocutors who do not know each other well. In normal conversation, overhearers tend to benefit from interlocutors’ work in making their perspectives comprehensible.


Manuscript received 1 April 2010
Revised manuscript received 10 September 2010
First published online 8 December 2010

REFERENCES

Bangerter, A., & Clark, H. H. (2003). Navigating joint projects with dialogue. Cognitive Science, 27, 195–225.
Bavelas, J. B., Coates, L., & Johnson, T. (2000). Listeners as co-narrators. Journal of Personality and Social Psychology, 79, 941–952.
Clark, H. H. (1996). Using language. Cambridge, MA: Cambridge University Press.
Clark, H. H., & Schaefer, E. F. (1987). Collaborating on contributions to conversations. Language and Cognitive Processes, 2, 19–41.
Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring as a collaborative process. Cognition, 22, 1–39.
Elffers, J. (1976). Tangram. The ancient Chinese shapes game. New York: McGraw-Hill.
Fox Tree, J. E. (1999). Listening in on monologues and dialogues. Discourse Processes, 27, 35–53.
Fox Tree, J. E., & Mayer, S. A. (2008). Overhearing single and multiple perspectives. Discourse Processes, 45, 160–179.
Fox Tree, J. E., & Schrock, J. C. (1999). Discourse markers in spontaneous speech: Oh what a difference an oh makes. Journal of Memory and Language, 40, 280–295.
Fraser, B. (1999). What are discourse markers? Journal of Pragmatics, 31, 931–952.
Krauss, R. M., & Weinheimer, S. (1966). Concurrent feedback, confirmation, and the encoding of referents in verbal communication. Journal of Personality and Social Psychology, 4, 343–346.
Kraut, R., Lewis, S., & Swezey, L. W. (1982). Listener responsiveness and the coordination of conversation. Journal of Personality and Social Psychology, 43, 718–731.
Schober, M. F. (1993). Spatial perspective-taking in conversation. Cognition, 47, 1–24.
Schober, M. F., & Clark, H. H. (1989). Understanding by addressees and overhearers. Cognitive Psychology, 21, 211–232.