Coarse Strategic Control in Word Reading: Evidence from Tempo ...

RUNNING HEAD: Strategic Control in Naming

Coarse Strategic Control in Word Reading: Evidence from Tempo Naming and a Computational Model

Christopher T. Kello David C. Plaut

Carnegie Mellon University

Submitted to Journal of Experimental Psychology: Learning, Memory and Cognition

Address for Correspondence: Christopher T. Kello Center for the Neural Basis of Cognition Mellon Institute 115 – CNBC Carnegie Mellon University 4400 Fifth Ave. Pittsburgh, PA 15213-2683

Strategic Control in Naming

78

Abstract We investigate the hypothesis that lexical and rule knowledge can be strategically emphasized in word naming based on task demands, in contrast with the hypothesis that a time criterion to respond can be strategically manipulated. We devised the tempo naming to investigate the extent to which readers have precise control over response initiation in naming. Relative to baseline performance in the standard naming task, subjects were induced to respond with faster latencies, shorter durations, and lower levels of accuracy by instructing them to time response initiation with an experimentally controlled tempo. The effects of printed frequency and spelling-to-sound consistency were attenuated for latencies in the tempo naming task, and the number of regularization errors remained constant with faster tempos, while the frequency of word, articulatory, and other errors increased. We interpret the pattern of effects as inconsistent with existing models of word reading. We propose that cognitively coarse parameters serve as the levers of strategic control over response initiation in word reading. We account for the tempo naming results in a connectionist model that embodies coarse strategic control by scaling the net input to all units the network via input gain, and we extend the model to account for a set of stimulus blocking effects previously interpreted as evidence for a time criterion. In the general discussion, we address how coarse strategic control and its current implementation relate to a broader set of issues in word reading and other domains of strategic control.


79

Interest in strategic control over processing in word reading has grown in recent years (Baluch & Besner, 1991; Colombo & Tabossi, 1992; Jared, 1997; Lupker, Brown, & Colombo, 1997a; Monsell, Patterson, Graham, Hughes, & Milroy, 1989; Paap & Noel, 1991). Research on the empirical and theoretical aspects of strategic control has addressed questions concerning the phenomena itself, as well as the architecture and processes of word reading more broadly construed. One contributing factor to the popularity this research has been its relevance to dual-process theories of the mapping from orthography to phonology (e.g., Coltheart, Curtis, Atkins, & Haller, 1993; Plaut, McClelland, Seidenberg, & Patterson, 1997). The proposal of two distinct routes for generating the pronunciation of a letter string has prompted researchers to suggest that subjects can selectively emphasize or de-emphasize one or both routes depending on task demands. A number of studies have shown effects that have been explained as a de-emphasis of rule knowledge when that knowledge interferes with performance (Coltheart & Rastle, 1994; Herdman, 1992; Monsell et al., 1989), and likewise, a de-emphasis of lexical knowledge when it could interfere (albeit the latter has been in languages with transparent orthographies relative to English; Tabossi & Laghi, 1992; Baluch & Besner, 1991; Colombo & Tabossi, 1992). These studies have been used as evidence against a single-route account of word reading. Recently, researchers have provided an alternative to the route emphasis account (Jared, 1997; Lupker et al., 1997a). Rather than manipulate route processing, subjects are hypothesized to strategically control a time criterion to initiate the naming response. Lupker et al. and Jared argued that manipulation of a time criterion can account for a number of blocking effects that had previously been used to argue for the route emphasis account. The time criterion hypothesis, as they proposed it, was problematic for two reasons. First, manipulation of a time criterion did not explain at least one aspect of the results from the Lupker et al. (1997a) study. Second, the authors did not present an explicit mechanism of how a time criterion is applied within and across trials, and it is unclear what version would best account for the relevant phenomena. In the current study, we investigate the hypothesis of strategic control over a time criterion by introducing a novel methodology, called tempo naming, designed to manipulate explicitly the initiation of a naming response. The results from three tempo naming experiments lead us to argue for an alternative to the time criterion and route emphasis accounts of strategic blocking effects. We propose that strategic control over response initiation is accomplished by changing the rate of information integration across the central processes involved in word reading. We embody this concept in a connectionist framework by controlling a scale parameter on the net input to all units in the network, termed gain (Cohen & Servan-Schreiber, 1992). We argue that results from the tempo naming experiments, as well as Lupker et al.’s (1997a) stimulus blocking experiments, are best accounted for by changes in a single parameter (level of gain) that alters processing across multiple levels of representation. We show that the manipulation of gain, when applied uniformly to all processing units, allows for complex, qualitative changes in behavior due to the non-linear interactions between and within levels of representation. We present a distributed, connectionist model of word reading that instantiates the gain hypothesis and accounts for the main results of the tempo naming experiments, as well as the main results from the stimulus blocking study by Lupker and his colleagues. In the General Discussion, we broaden the scope of our work in the context of considering a wider range of phenomena in strategic control and word reading. Empirical and Theoretical Background It takes roughly 400--600 ms for a skilled reader to begin the pronunciation of a single, clearly printed word. This ballpark range comes from a long history of speeded word naming studies in which subjects were asked to pronounce a printed word under instructions to the effect of “as quickly and accurately as possible.” The speeded word naming task has been used to examine a wide variety of theoretical issues such as the following: processes that map orthography to phonology (Glushko, 1979; Seidenberg, Waters, Barnes, & Tanenhaus, 1984), organization of the lexicon (Forster & Chambers, 1973; Frederiksen & Kroll, 1976), semantic, phonological, and orthographic priming (Forster & Davis, 1991; Tabossi & Laghi, 1992; Taraban & McClelland,


80

1987), sentence and discourse processes (Hess, Foss, & Carroll, 1995; Trueswell, Tanenhaus, & Kello, 1993), reading impairments (Patterson & Behrmann, 1997; Stanovich, Siegel, & Gottardo, 1997), and reading acquisition (Lemoine, Levy, & Hutchinson, 1993; Manis, 1985). One common denominator among these areas of study is that accounting for differences in mean naming latencies across conditions has been a primary source of evidence. Therefore, understanding the processes responsible for the initiation of a naming response is of general importance for interpreting naming data across research domains. The standard mode of thinking about the initiation of a naming response is as follows. A representation of pronunciation is built up over time, based on the results of processing at one or more other levels of representation (e.g., lexical, semantic, orthographic, and syntactic knowledge; Coltheart et al., 1993; Kawamoto, 1993; Plaut et al., 1996). When the pronunciation is resolved according to some criterion of completeness, the response is initiated. Often, the exact nature of the criterion is left unexplained, but a common assumption is that a response is initiated as soon as an entire pronunciation is complete by some criterion (e.g., an activation or saturation threshold; Plaut et al., 1996; Coltheart et al., 1993; for an alternate view, see Kawamoto, Kello, Jones, & Bame, 1998). One reason why issues of response generation are often neglected is what Bock (1996) has termed the “mind in the mouth” assumption. She argued that researchers often implicitly assume that articulation provides a relatively direct reflection of cognitive processing. For example, with regard to an activation threshold of pronunciation readiness, different subjects may set the threshold at different levels as a function of trading speed for accuracy (Colombo & Tabossi, 1992; Stanovich & Pachella, 1976); in fact, naming instructions are usually ambiguous as to the emphasis on speed versus accuracy. Moreover, the set point may vary across trials within the same subject (Lupker, Taylor, & Pexman, 1997b; for analogous effects in other domains, see Treisman & Williams, 1984; Strayer & Kramer, 1994). Nonetheless, it is typically held that the processes driving activation of a pronunciation are essentially unchanged as a function of threshold. More recently, however, researchers have argued that gaining a better understanding of response generation in naming is important for interpreting naming data and developing theories of the underlying cognitive processes (Balota, Boland, & Shields, 1989; Jared, 1997; Kawamoto et al., 1998; Kello & Kawamoto, in press; Lupker et al., 1997a). One theoretical issue that has been informed by research focused on response generation is that of strategic control over processing routes in generating a pronunciation from print (Monsell et al., 1992; Lupker et al., 1997a; Jared, 1997). As mentioned above, many researchers have proposed that subjects can strategically emphasize or de-emphasize one or both routes based on task demands (the route emphasis hypothesis1 ; Herdman, 1992; Herdman, LeFevre, & Greenham, 1994; Paap & Noel, 1991; Monsell et al., 1992; Plaut et al., 1996). For example, if the stimuli in a word naming task consisted of nothing but irregular words (e.g., sure, pint, rouse, etc.)2 , it would behoove subjects to de-emphasize the rule route since, by definition, it would provide incorrect information on the irregular portion of all pronunciations. Monsell et al. (1992) tested the route emphasis hypothesis by dividing stimuli in a word naming task into pure and mixed blocks. The pure blocks contained either all pseudowords or all irregular words (of mixed frequency in Experiment 1, and separated by frequency in Experiment 2). The mixed blocks contained both pseudowords and irregular words. The dual-route approach stipulates that pseudowords depend on processing from the rule route, whereas irregular words depend on the lexical route. Monsell and his colleagues found that irregular words were generally named faster in pure versus mixed blocks, 1

The route emphasis hypothesis, as defined in this study, does not distinguish whether a given processing route is actually emphasized or de-emphasized. Rather, the hypothesis describes any situation in which the processing of one route is privileged over processing in another route. 2

Coltheart (Coltheart et al., 1993) defines an irregular word as one that has one or more irregular grapheme-tophoneme correspondences (GPC), as determined by a set of correspondence rules (see also Venezky, 1970).


81

and they interpreted this as evidence that subjects de-emphasized the rule route in pure blocks of irregular words, presumably to reduce interference from the rule route's incorrect processing of those words. However, one puzzling aspect of their results was that the pure block latency advantage was only reliable for blocks of high frequency (HF) and not low frequency (LF) irregular words (for similar results, see Andrews, 1982; Frederiksen & Kroll, 1976). Lupker et al. (1997a) and Jared (1997) revisited these findings and argued for an alternative to the route emphasis account. They first noted that if one operationalizes de-emphasis as slowing the processing times of the rule route (regardless of changes in variance), then LF irregular words should have an equal or greater advantage in the pure block compared to HF irregulars. This is because processing times to pseudowords must overlap more with LF compared to HF words, provided that the mean of the rule route processing times is greater than that of the lexical route (as suggested by previous findings; for example, words are named faster than nonwords; Forster & Chambers, 1973). By contrast, studies have found a greater pure block advantage for HF irregulars. Lupker and his colleagues reran Monsell et al.'s blocking experiment (with minor variations), and they replicated the pure block advantage for HF irregulars. Moreover, they found a statistically reliable pure block disadvantage for the LF irregulars. Lupker et al. ran a second experiment to provide a further test of the rule route emphasis account, in which all of the stimuli contained regular spelling-to-sound correspondences. In this case, the rule route should remain active in both the pure and mixed blocks, and therefore no block effect should be found. Once again, they found a pure block advantage for HF words (now regular), but unlike their first experiment, they found a pure block advantage for the LF words as well. Jared (1997) found analogous results, except that she compared blocks mixed with pseudowords versus blocks mixed with LF inconsistent words. In summary, the data from Jared’s study and Experiments 1 and 2 from Lupker et al.’s study were not predicted by the route emphasis hypothesis. To explain their results, Lupker et al. (1997a) re-categorized the stimuli as fast or slow, based on the mean latency for each stimulus type in the pure blocks. The pseudowords and LF irregulars were slow, and the HF regulars and irregulars were fast (LF regulars were in the middle). The pattern of results could then be described as follows: whenever fast and slow stimuli were mixed, response latencies increased for the fast stimuli, but decreased for the slow stimuli, relative to when those stimuli were in pure blocks. This insight lead Lupker et al. to propose that the blocking manipulation prompted subjects to adjust a time criterion to initiate naming responses. The general idea is that subjects can, to some degree, set a time deadline relative to stimulus onset (Ollman & Billington, 1972). If the pronunciation is not fully activated by that time (i.e., an activation threshold is also in place), then the response is initiated based on whatever representation of pronunciation is available at that time (also see Meyer, Osman, Irwin, & Kounios, 1988). In order to maintain a certain level of accuracy while responding quickly, subjects adjust the time criterion based on the difficulty of the stimuli presented during the experiment. A pure block of fast stimuli allows for a quicker criterion than a pure block of slow stimuli. When fast and slow stimuli are mixed, subjects set a middling time criterion: thus, fewer HF but more LF words are hurried. This hypothesis embodies a speed-accuracy tradeoff, so it predicts an increase in errors to slow stimuli in mixed blocks3 . This is in fact what Lupker et al. found. The Lupker et al. (1997a) study raises two issues in the current context. First, in its simplest form, a time criterion means that a response is initiated at a particular point in time, and not before nor after that point (aside from random fluctuations). If the response is fully activated before the criterion is satisfied, execution of the response is delayed. If activation is not complete by the criterion, the response is initiated at the risk of an error. Of course, this implies that one should find no effects of cognitive, non-strategic variables on naming latencies, as Lupker et al. note (but see Kawamoto et al., 1998). Since researchers often do find stimulus effects on naming latencies, the hypothesis must be amended. Lupker and his colleagues did, in fact, propose a mechanism in 3

The complementary prediction for fast stimuli (i.e., more errors in the pure block) could not be verified because performance was at ceiling for those stimuli in both the pure and mixed blocks.


82

addition to the time criterion in order to account for the full range of their results. Recall that LF irregular words and pseudowords were estimated to be of comparable speed (i.e., difficulty). This estimate was based on the fact that mean latencies to stimuli of each type were comparable in pure block conditions. Therefore, the time criterion hypothesis predicts no blocking effect when comparing pure blocks of each type with a mixed block of LF irregulars and pseudowords. However, the results from Experiment 1 showed a pure block disadvantage for LF irregulars and a pure block advantage for pseudowords. Lupker and his colleagues proposed an additional lexical checking strategy that subjects invoked only (but not always) when words were present in the stimulus block. The fact that Lupker et al. needed to invoke an additional mechanism raises the question of whether a more parsimonious alternative to the time criterion hypothesis could be proposed (for similar issues in decision response tasks, see Ruthruff, 1996). A second issue raised by the time criterion hypothesis has implications for how speeded naming can be used to investigate the nature of word reading and other cognitive processes. If one views strategic manipulation of a time criterion as one type of a speed/accuracy tradeoff, then shifting this criterion earlier in time should cause an increase in naming errors (as it did in Lupker et al., 1997a). If subjects could shift the criterion very early in time, a very high error rate should ensue. Speech errors have served as a primary source of evidence for developing and testing models of speech production (Dell & Reich, 1981; Dell, 1986; Levelt, 1989), and one could use the same approach toward models of word reading, provided that enough errors could be generated. For example, Kawamoto and Kitzis (1991) showed that the interactive-activation model of word recognition (McClelland & Rumelhart, 1981; Rumelhart & McClelland, 1982), as well as a distributed model of lexical memory (Kawamoto, 1993), both make a specific prediction concerning the time-course of phonological activation in word reading. When processing an irregular word such as "pint", the models showed a strong influence of the regular, incorrect pronunciation /pInt/ (to rhyme with "mint") early in the time course of processing. Depending on various factors, such as the printed frequency of the item, the regularized pronunciation can dominate the correct but irregular one early in processing. As activation settled to a fixed state, the models showed that the correct phoneme usually quashed activation of the incorrect phoneme, but only later in processing. The prediction made by Kawamoto and Kitzis (1991) is also a more general property of most existing connectionist models of word reading. In particular, if distributed representations of orthography, phonology and semantics all interact with each other, then orthography will tend to activate a phonological code more quickly than a semantic one (i.e., “triangle” models of word reading; Kawamoto, 1993; Plaut et al., 1996; Seidenberg & McClelland, 1989; Van Orden, 1991). This prediction was most clearly stated by Van Orden and his colleagues (Van Orden, 1991; Van Orden & Goldinger, 1994). Van Orden et al. argued that because the resonance (i.e., correlational structure) between orthography and phonology is stronger than that between orthography and semantics, phonology will tend to resolve more quickly than semantics in word reading. To support the early regularization hypothesis, Kawamoto and Zemblidge (1992) reported a naming experiment in which the mean latency of 16 regularized responses to the word “pint” was 601 ms while the mean latency of 46 correct, irregular pronunciations was 711 ms. If one could induce subjects to generate a naming response based on very early activation of phonology, then one could test the prediction of an increase in the proportion of regularization errors. The time criterion hypothesis suggests that subjects might, in principle, be able to initiate a response based on early activation of phonology. To our knowledge, no one has made an explicit prediction concerning early phonology in word naming within the dual-route framework. Intuitively, one might expect that a dual-route implementation, such as the dual-route cascade (DRC) model (Coltheart et al., 1993), would predict an increase in the proportion of word errors during early stages of processing. The lexical route must override the rule route in order to correctly pronounce irregular words, and naming latencies to pseudowords are generally slower than those to words (e.g., Forster & Chambers, 1973). These facts suggest that lexical route processing times are considerably faster than rule route processing times, and therefore, word errors arising from lexical-route processing would be higher in


83

proportion during the earlier cycles of processing. On the other hand, the rule route in the DRC model processes the input string from left to right over time, and the irregular grapheme of a test word is usually the second or third in position from left to right (most experiments that examine spelling-to-sound consistency/regularity use monosyllabic English words, and the vowel is usually the exceptional/irregular component of the mapping; Plaut et al., 1996). If the rule route has enough time to output at least the first few phonemes before the lexical route can significantly influence the computation of phonology, then the proportion of regularizations may be higher during the earlier cycles of processing. We examined the contents of phonology during the early cycles of processing in the DRC model, and found that as output was read from the model at successively earlier times in the course of processing, regularizations and word errors increased in frequency at about the same rate4 . We analyzed the DRC model using the test stimuli from Experiment 2 of the current study (see below). The naming latencies produced by the model for these stimuli were first gathered under normal processing conditions (i.e., a response was initiated when the activation of one or more phonemes in each position-specific pool crossed a given threshold). The overall mean latency under normal conditions, in number of cycles, served as the baseline time criterion (98 cycles). Responses were then extracted from the model after running it for 98 cycles (no more, no less) and taking the most active phoneme at each position at this point in processing (including null phonemes, if these were the most active). This served as our instantiation of a time criterion in the DRC model. To measure phonological output at earlier points in processing, we set the time criterion at lower numbers of cycles (from 70 to 45 cycles, in steps of 5). Regularizations were found to increase as responses were generated at earlier points in time, along with word errors and other types of errors. Therefore, the DRC model with the current parameter settings makes the same prediction as triangle models of word reading: regularization errors are prevalent early in the time course of processing. We believe that the time criterion hypothesis is worthy of investigation for two main reasons: 1) it does not clearly account for Lupker et al.’s (1997a) and Jared’s (1997) strategic blocking results, and despite this concern, 2) it can be used to address the time course of phonological processing in word reading. More generally, response criteria are important because they mediate interpretation of word naming data that are used to develop and test theories of the underlying cognitive processes (Balota et al., 1989; Kawamoto et al., 1998). Experiment 1 was devised to investigate the hypothesis of a time criterion to initiate naming responses. Experiment 1 Our initial research question was two-fold. First, to what extent can subjects obey a pure time criterion? In other words, can subjects be induced to respond at a specific moment in time? Positive evidence would provide indirect support for a time criterion as Lupker et al. (1997a) presented it, with the caveat that standard naming cannot be based solely on a pure time criterion (see above). Negative evidence would suggest that we look for an alternative to the time criterion hypothesis. At the least, such negative evidence would suggest that we precisely state a version of the time criterion hypothesis that can account for the current results, as well as findings from stimulus blocking experiments. Second, if subjects can respond at a specific moment in time, how soon can this be after stimulus presentation? As discussed above, to the extent that response initiation can be based on the close equivalent of a pure time criterion, moving this criterion closer to stimulus onset should cause responses to reflect an earlier point in the trajectory of processing. To investigate our research question, we developed a novel methodology called tempo naming. Prior to the presentation of a letter string, subjects are presented with a series of evenly spaced auditory beeps, accompanied by the incremental removal of visual flankers on the computer screen. The letter string is presented upon the final beep, and the task is to pronounce the letter string such that the response is initiated simultaneously with the following beep (which is not 4

We thank Max Coltheart for making available the output of the DRC model over processing cycles for our stimuli.


84

actually presented). The rate of beep presentation (i.e., tempo) can be increased or decreased to require subjects to respond more quickly or slowly. Tempo naming is similar to deadlined naming (Stanovich & Bauer, 1978; Colombo & Tabossi, 1992), in which subjects are simply told to respond more quickly if a given response is slower than a preset deadline. A version of the deadline paradigm analogous to tempo naming would instruct subjects to go no faster than the deadline, as well as no slower. However, tempo naming is distinct in two important respects. First, tempo naming gives an explicit and precise cue (the beeps and visual flankers) for when to initiate each response. Second, the subject receives quantitative feedback on every trial indicating the amount (in hundredth's of a second) and direction (fast or slow) that the response was off tempo. Subjects are instructed to adjust the timing of their responses such that their feedback is as close to zero as possible on every trial, even at the expense of accuracy. If subjects have a time criterion at their disposal, then they should place it on tempo to the best of their ability. Studies in finger tapping have shown that behavior can be entrained to a rhythmic cue (Kurganskii, 1994; Mates, Radil, & Poppel, 1992), although there is significant error and variability within and across subjects (Yamada, 1995). One strategy that subjects could adopt to perform the tempo naming task is to entrain an “internal metronome” to the tempo, and then synchronize the hypothesized time criterion with the rate of the internal metronome. The way in which subjects use the tempo is a research question in itself, and we shall address this question to some extent. However, the primary goal of tempo naming is to induce subjects to use a pure time criterion, if such a criterion is available. Recall that a pure time criterion predicts no effects of stimulus type on latency. To test this prediction, stimuli were chosen along well-studied dimensions of stimulus difficulty in word reading: body consistency and printed frequency. Previous studies have shown that in standard naming, low frequency words take longer to name than high frequency words, exception words take longer to name than consistent words, and low frequency words show a greater effect of consistency compared with high frequency words (the frequency by consistency interaction; Taraban & McClelland, 1987; Waters & Seidenberg, 1985; Jared, McRae, & Seidenberg, 1990; Seidenberg et al., 1984)5 . We chose three groups of stimuli based on these previous studies: High frequency exceptions (HFEs), low frequency exceptions (LFEs), and low frequency consistents (LFCs). These allowed us to examine how tempo naming modulates frequency effects (HFEs versus LFEs) as well as consistency effects (LFEs versus LFCs). A pure time criterion predicts an effect of stimulus type on error rate but not latency. In addition to testing the pure time criterion hypothesis, consistency was manipulated to also investigate the prediction made by Kawamoto and his colleagues (Kawamoto & Kitzis, 1991; Kawamoto & Zemblidge, 1992) concerning triangle models of word reading, and made by DRC model as well (Coltheart et al., 1993). If responses can be based on a pure time criterion, and this criterion can be shifted to be significantly sooner than a subject would normally place it in the standard naming task, then both models predict an increase in the number of regularization errors to exception words. This is because, in these models, manipulation of a pure time criterion does not affect the dynamics of the underlying processes, and the models show an early prevalence of the inconsistent/irregular pronunciation. To test this prediction, tempos were set to be as fast or faster 5

Word consistency is analogous to regularity, except that it is defined based on the body rather than the grapheme. The body of a monosyllabic word is the vowel and any final consonants (cf. rime). A body is consistent if it maps to the same pronunciation in all words sharing that body, and inconsistent if it has multiple mappings. A word is exceptional if its body is inconsistent and it maps to the minority pronunciation (e.g., the body of "pint" maps to a different pronunciation than "mint", "hint", "lint", etc.). The effect of body consistency has been questioned on the basis of whether grapheme regularity is, in fact, the driving factor (Coltheart et al., 1993), and whether or not consistency interacts with frequency when number and frequency of friends and enemies is controlled for (Jared, 1997). The purpose of manipulating consistency in the current study was to manipulate stimulus difficulty and create the opportunity for regularization errors. Therefore, although we have argued elsewhere that consistency is the relevant factor (Kawamoto & Kello, in press; Plaut et al., 1996), we are setting aside such issues for purposes of the current study.


85

than each subject's baseline naming latency, as determined by an initial block of standard naming trials. The only guide we had to determine how much faster than baseline subjects should be induced to respond was a study by Colombo and Tabossi (1992). Using a response deadline, they induced subjects to respond more than 60 ms faster than baseline without any significant increase in error rate. We wanted to induce errors, so we set the maximum tempo rate to induce responses considerably faster than 60 ms below baseline (150 ms maximum). We explored a range of tempos faster than baseline because we did not know how well subjects could perform the task, given its novelty. We considered one further issue in creating the tempo naming task. How do the acoustic properties of an initial phoneme affect subjects’ ability to time a given response with the tempo? It has been known for some time that such properties will affect naming latency as traditionally measured (Sherak, 1982; Sternberg, Wright, Knoll, & Monsell, 1980). Kawamoto et al. (1998) showed that even when problems with the voice key are alleviated, acoustic energy from responses with plosive stops as the initial phoneme (e.g., /p,t,k,b,d,g/) will begin much later (i.e., about 60-100 ms) than comparable responses with non-plosive initial phonemes. Timing in the tempo naming experiments is measured acoustically on-line, and given as feedback to subjects. Subjects are instructed to respond with the best possible timing as measured by their feedback. If subjects time response initiation based only on articulatory commands, then the acoustic timing (i.e., feedback) will be consistently slow for plosive compared with non-plosive initial phonemes. Alternately, subjects might be able to time their responses based on the onset of acoustic energy. If this does not depend on the type of acoustic energy (i.e., periodic versus non-periodic, as in voiced versus voiceless phonemes), then the type of initial phoneme will have no effect on timing. On the other hand, one might find differences based on the type of acoustic energy (elaborated in the Results section of Experiment 1). This issue was not central to our line of investigation, so we simply included a mix of initial phonemes in the test words and within blocks (but controlled for initial phoneme across levels of the independent variables). We mention it here because it will be important in interpreting certain aspects of the findings in the current set of experiments. Methods Subjects. 33 subjects participated in the experiment as part of a requirement for an undergraduate psychology course. Subjects reported being native English speakers with normal or corrected vision. Stimuli. 52 HFE words, 52 LFE words, and 52 LFC test words were chosen as the test stimuli for the tempo naming portion of the experiment. An additional 13 of each stimulus type were also chosen for the standard naming portion. For each stimulus type, 13 of the 52 words chosen for tempo naming were also included in standard naming for a total of 26 words of each type in standard naming. All test words were monosyllabic. The independent factors were printed frequency as estimated by Kucera and Francis (1967) and consistency as measured by the number of friends (words with the same body pronunciation) versus enemies (words with a different body pronunciation). The factors controlled for were initial phoneme, number of letters and phonemes, and summed positional bigram frequency. Words were chosen in triplets (one of each type), and each triplet was matched on the control factors as closely as possible (LFELFC pairs were also matched on frequency, and HFELFE pairs were also matched on consistency). The mean values for the independent and control factors as a function of stimulus type are given in Table 1, and the triplets are given in Appendix A. The standard naming blocks included 2 filler words at the beginning of each block, and the tempo naming blocks included 156 fillers mixed throughout the 4 blocks. Filler words were mono- and bisyllabic, and ranged in frequency and consistency.


86

Table 1. Means of control variables for the standard and tempo naming portions of Experiment 1, separated by stimulus type

Word Freq Bigram Freq # Letters # Phonemes # Friends Sum Freq Friends # Enemies Sum Freq Enemies

Standard Naming

Tempo Naming

Hi Freq Exc Lo Freq Exc Lo Freq Con

Hi Freq Exc Lo Freq Exc Lo Freq Con

999 9492.9 4.2 3.4 1 351.3 7 1994.5

4.2 5589.7 4.5 3.7 0.8 215.2 4.6 1881.7

2.8 4561 4.2 3.7 6.4 668.1 0 0

617.3 8072.6 4.3 3.3 1.8 941.7 6.2 1413.2

8.2 5107.3 4.6 3.6 1.4 145.4 6 1234.3

6.2 5023.3 4.3 3.6 8.1 632.5 0 0

Note: Sum Freq _____ denotes summed printed frequency of body neighbors that are pronounced the same (friends) or differently (enemies).

Standard naming consisted of one practice block and two test blocks, and tempo naming consisted of 1 practice block and 4 test blocks. Test stimuli were evenly mixed and balanced across blocks, and counter-balanced across subjects. The order of trials within blocks was randomized for each subject, under the constraint that 2 fillers began each standard naming block, and 10 fillers began each tempo naming block. Standard naming always preceded tempo naming, and the practice blocks began each portion of the experiment. The order of test blocks was counterbalanced across subjects (in a Latin square design for the tempo blocks). An equal portion of each stimulus type appeared in each test block within standard and tempo naming, and fillers were divided equally among test blocks as well. The standard naming and tempo naming practice blocks consisted of 10 and 40 fillers, respectively. Procedure. Subjects sat in front of a 17" monitor, approximately 2 feet away, and wore an Audio Technica headset cardiod microphone. The microphone was positioned approximately 1" away and 2" down from the subject's mouth, and it was plugged into a Soundblaster 16-bit sound card. Subjects were given written instructions for the standard naming task and asked to read them silently. Following this, the experimenter summarized the instructions, and any questions were answered. Subjects were instructed that they would see words presented in isolation on the monitor, and that their task was to pronounce each word out loud as quickly and accurately as possible. The level of recording was calibrated for each subject. Following calibration, the subject ran through the practice trials with the experimenter present to make sure that the subject performed the task correctly. The practice block was followed by 2 continuous blocks of test trials. Each standard naming trial began with a "Ready?" prompt centered on the monitor. The subject pressed the space bar to begin the trial, upon which the prompt was replaced with a fixation point. The fixation point remained for 500 ms, and was replaced by a single word. The word remained on the screen till a vocal response was detected. All stimuli (for tempo naming as well) were presented in lowercase, in a large, distinct font (similar in appearance to times new roman) to minimize letter confusions. The time from word onset to the beginning of the next trial was a fixed 1500 ms (i.e., the screen was blank for any remaining time after the response was detected). This was necessary because each vocal response was digitized and stored on the hard drive using the Runword software package (Kello & Kawamoto, in press). Immediately after each subject completed the standard naming portion of the experiment, the mean latency of all test trials was calculated (excluding any responses faster than 200 ms, but including any errors the subject may have made). Naming latency was calculated using an acoustic


87

analysis algorithm described in Kello and Kawamoto (in press). In brief, the algorithm is sensitive to increases in amplitude (e.g., to detect voicing) as well as frequency of acoustic energy (e.g., to detect frication). Each subject's mean latency in the standard naming task was set as the baseline tempo for the upcoming tempo naming blocks. The four test blocks for each subject were assigned four different tempos: baseline, and 50, 100, and 150 ms faster than baseline (B-0, B-50, B-100, and B-150, respectively). As noted above, the order of test blocks was counterbalanced in a Latin square design. Stimuli were rotated across subjects such that each subject saw each test word once in the tempo naming blocks (1/4 of the test words appeared in standard naming as well), and each test word appeared in every tempo and every block order across subjects. After completing the standard naming blocks, the subject was given written instructions for the tempo naming task. After reading them silently, the instructions were summarized and any questions were answered. A paraphrasing of the instructions is as follows (also see Figure 1): Each trial will begin with a prompt followed by the presentation of 5 pairs of visual flankers. Then, 5 beeps will be played successively in a steady rhythm, and the pairs of flankers will disappear one by one with each beep. Upon presentation of the fifth beep, a word will appear in between the last pair of flankers. Try to name the word such that the beginning of your response is timed with the sixth beep. However, no sixth beep will be played; your response should begin where the sixth beep would have been. You will get feedback after completing your response to tell you how well-timed it was to the tempo. The feedback is in the form of a number, the more positive it is, the slower your response, the more negative it is, the faster. A perfectly timed response produces a feedback of zero. Your primary task is to name the word on tempo, regardless of making errors. In the practice block will see a mix of both relatively fast and slow tempos, but then you will run through four test blocks, and each one will be set at a different, but uniform, tempo. After instruction, subjects ran through the practice block, which represented a randomly ordered, but balanced distribution of the four tempos. The experimenter stayed with the subject through a number of trials to be sure the task was understood, and to give any additional instruction if necessary. Each tempo naming trial proceeded as follows: A "Ready?" prompt was presented in the center of the screen, and the subject pressed the space bar to begin. There was a 500 ms delay with a blank screen after pressing the space bar, followed by the presentation of a


88

Stimulus Visual Auditory Ready?

Response Space Bar

>>>> >>> >>

Subject 500 ms

Blank Screen >>>>>

Duration

.1; Fi(6,459) < 1). Planned comparisons on stimulus type were not conducted because these are equivalent to the analogous comparisons with naming latency as the dependent measure. Planned comparisons to test whether timing was progressively delayed as tempo increased revealed that responses were in fact further delayed from tempo at each increase: B-0 to B-50 (F s(1,31) = 27, p < .001; Fi(1,153) = 33, p < .001), B-50 to B-100 (Fs(1,31) = 70, p < .001; Fi(1,153) = 70, p < .001), B-100 to B-150 (Fs(1,31) = 62, p < .001; Fi(1,153) = 126, p < .001). To characterize the failure of tempo to perfectly drive response initiation, a linear regression line was fit to the timing means at each level of tempo, averaged across stimulus type; the slope was 0.43 (i.e., 1 + SRT, where SRT = slope of the regression line for latency). If the tempo manipulation was perfect in determining response initiation, the slope would be 0.


94

0

-50

-100

-150

60 45 30 15 0 (ms) Timing -15 -30 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Figure 4. Mean timing with tempo from Experiment 1 as a function of stimulus type and tempo.

One unexpected finding in the timing analyses was that responses were, on average, faster than tempo in the B-0 condition. If tempo conditions were mixed within blocks, this finding could have arisen from hysteresis of the response criterion (i.e., a relatively slow tempo trial preceded by a fast one might have a tendency to be overly fast; Lupker et al., 1997b). However, since tempo was blocked and counterbalanced, and each block began with 10 practice trials, this explanation cannot be correct. We hypothesized that an adequate explanation would depend on the acoustic characteristics of response onset, as well as an answer to the question of how subjects time their responses with the tempo. We investigated our hypothesis by examining timing as a function of tempo and initial phoneme type (graphed in Figure 5). We categorized initial phoneme type based on acoustic characteristics that are known to affect the measurement of response latencies (Kello & Kawamoto, in press): voicing (voiced or unvoiced) and plosivity (plosive or non-plosive). Figure 5 shows that voiceless initial phonemes, especially non-plosive ones, were fast in the B-0 condition, whereas voiced initial phonemes were closely timed to tempo in the B-0 condition. This result suggests that subjects, despite the numeric feedback, timed their responses with the onset of voicing (which we argue and statistically support below). To answer the question of why responses were faster than tempo in the B-0 condition, recall that the latency algorithm used in this study is sensitive to high amplitude and high frequency acoustic energy. The onset of periodic energy is later in responses beginning with voiceless compared to voiced initial phonemes, relative to the onset of any type of acoustic energy. To accurately time the onset of voicing, measured latencies for voiceless initial phonemes must be fast.


95

0

-50

-100

-150

80 60 40 20 0 (ms) Timing -20 -40 Plos-Unvoc

Nonplos-Unvoc

Plos-Voc

Nonplos-Voc

Figure 5. Mean timing with tempo from Experiment 1 as a function of tempo and the articulatory character of the initial phoneme (plosive versus non-plosive and voiced versus unvoiced).

To argue for the hypothesis that subjects time responses with the onset of voicing, we shall distinguish it from the alternate hypotheses that timing was based on the onset of any acoustic energy (which was the basis of feedback), the onset of articulation, or the onset of the vowel. If timing was based on the onset of acoustic energy, there should be no effect of initial phoneme type. However, the timing of plosive initial words (i.e., /p, t, k, b, d, g, tS/6 , Nitem = 63) was 14 ms slower than non-plosive initial words (Nitem = 93; Fs(1,31) = 33, p < .001; Fi(1,154) = 16, p < .001), and the interaction of plosivity and tempo was marginally significant by items (Fs(3,93) = 1.5, p > .2; Fi(3,462) = 2.5, p < .06). The effect of plosivity diminished slightly as tempo increased (18 ms effect at B-0, 17 ms at B-50, 10 ms at B-100, and 11 ms at B-150). The main effect of plosivity (as well as other effects of initial phoneme reported below) rules out the acoustic energy hypothesis, and we shall return to the interaction effect in the section on duration analyses. Next, if timing was based on the onset of articulation, then responses beginning with non-plosive phonemes should be relatively well-timed (to the extent that subjects can keep up with the tempo), and those with plosive initial phonemes should be slow by comparison. This is because the onset of articulation more closely corresponds to the onset of acoustic energy for non-plosive compared to plosive-initial responses (Kello & Kawamoto, in press). However, plosive-initial responses were on tempo in the B-0 condition (timing = -2 ms), whereas non-plosive-initial responses were fast (timing = -18 ms; Fs(1,31) = 10.8, p < .05; Fi(1,153) = 32, p < .001). Finally, if timing was based on the onset of the vowel,7 then there should be no effect of voicing of the initial phoneme since the vowel presumably begins at roughly the same point in comparable responses with voiced vs. unvoiced initial phonemes. A post-hoc split of the non-plosive-initial words by voicing on the initial phoneme revealed that responses to voiced stimuli were 23 ms slower than unvoiced stimuli (Fs(1,31) = 70, p < .001; Fi(1,91) = 32, p < .001). This effect of voicing interacted with tempo such that, as with plosivity, the effect diminished as tempo increased (Fs(3,93) = 2.4 p < .07 marginal; Fi(3,273) = 6.6, p < .001; we return to this effect below). Furthermore, the mean timing 6 7

/tS/, as in “chip”, is technically an affricate, but it has plosive-like characteristics, so we treated it as a plosive.

Previous research on the perceptual center of syllables (Hoequist, 1983; Marcus, 1981) suggests that vowel onset (or at least a correlate thereof) may play a role in synchronizing repeated syllables with a metronome.


96

of voiced, non-plosive initial words in the B-0 condition was -3 ms, compared to -35 ms for comparable unvoiced words. The voicing onset hypothesis for the basis of timing in tempo naming explains both the voicing and plosivity effects found, but the vowel onset hypothesis cannot explain the effect of voicing. Therefore, the data provide evidence for the voicing onset hypothesis. Naming Duration Analyses. Researchers have shown that cognitive processes affect not only the onset of a naming response, its articulatory duration as well (Balota, Boland, & Shields, 1989; Kawamoto et al., 1998; Kawamoto, Kello, Higareda, & Vu, in press). For instance, Kawamoto and his colleagues showed initial phoneme duration effects by contrasting the size of consistency effects in responses beginning with plosive versus non-plosive phonemes. They argued that a larger consistency effect on latency for plosive versus non-plosive initial stimuli (controlling for confounds) is evidence that consistency of the vowel affects the duration of the initial consonant(s). The logic was based on the premise (mentioned above) that the acoustic onset of non-plosive-initial responses closely correspond to the actual response onset, whereas the acoustic onset of plosiveinitial responses conflates response onset with duration of the initial phoneme. If we apply the same logic to the timing analyses conducted above, then the interaction of initial phoneme type (with both voicing and plosivity) and tempo in the timing analyses above suggests that, at the least, naming durations for voiceless, non-plosive initial stimuli decreased as tempo increased. For example, if the duration of the initial phoneme caused the plosivity effect, then the weakening of this effect as a function of tempo indicates a decrease in initial phoneme duration. Similarly, if duration of the entire naming response was reduced for both plosive and non-plosiveinitial words, then the same interaction of tempo and plosivity would be predicted (analogous arguments could be made for voicing). We tested these alternate hypotheses by measuring whole word naming durations (i.e., time from onset to offset of acoustic energy) as a function of initial phoneme voicing and plosivity, in conjunction with tempo8 . The predictions were as follows. If tempo affected only initial phoneme duration, then there should be no effect of tempo on naming durations for plosive-initial responses. Since a duration effect on the initial phoneme in plosiveinitial responses will mostly alter the silent gap in acoustic energy caused by pressure build-up for the plosive release, the duration effect will be reflected in naming latency rather than acoustic duration. By contrast, if tempo affected whole word durations, then all responses with any type of initial phoneme should show an effect of tempo. The results unambiguously showed that tempo affected whole word durations. Figure 6 graphs naming durations as a function of tempo and initial phoneme type. Naming durations were calculated using the same algorithm for detecting the acoustic onset of a response, except that the algorithm was run backwards from the end of response recording (durations less than 50 ms or greater than 1000 ms were removed from the analyses; see Kello & Kawamoto, in press, for details). Naming durations of plosive-initial responses showed a reliable effect of tempo (Fs(3,93) = 4.0, p < .01; Fi(3,303) = 6.9, p < .001). The main effect of tempo on naming durations, collapsing across initial-phoneme type, was also reliable (Fs(3,93) = 7.4, p < .001; Fi(3,582) = 21, p < .001). This effect did not significantly interact with voicing (Fs(3,93) < 1; Fi(3,582) = 1.4, p > .2), but it did interact marginally with plosivity (Fs(3,93) = 1.6, p > .2; Fi(3,582) = 3.3, p < .05). Qualitatively, naming durations of non-plosive-initial responses showed a stronger effect of tempo than plosive-initial responses (which nonetheless showed a reliable effect of tempo; see above). This interaction simply indicates that initial phoneme durations contributed to the effect of tempo on whole word durations, and that this contribution was attenuated for plosive-initial responses because plosive phonemes have much less acoustic extent than non-plosive phonemes.

8

The duration of most initial phonemes is difficult to measure (Kawamoto et al., 1998; Kello & Kawamoto, in press), and we did not carefully choose stimuli with easily measured initial phoneme durations. We therefore chose whole word durations as the dependent measure to examine.


97

0

-50

-100

-150

400

375

350

325

300 Duration (ms) Naming

275 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Figure 6. Mean acoustic naming durations from Experiment 1 as a function of stimulus type and tempo.

Error Analyses. Errors were categorized as in the standard naming analyses. Figure 7 graphs mean error rate as a function of stimulus type and tempo. The main effect of stimulus type was reliable (Fs(2,62) = 38, p < .001; Fi(2,153) = 9.4, p < .001), as was the main effect of tempo (Fs(2,93) = 11, p < .001; Fi(3,459) = 10, p < .001). As in the latency analysis, the interaction was again non-significant (Fs(6,186) = 1.5, p > .1; Fi(6,459) = 1.4, p > .2). Planned comparisons for stimulus type showed that LFE words were reliably more error prone than HFE words (F s(1,31) = 55, p < .001; Fi(1,102) = 13, p < .001), and likewise for LFE words compared to LFC words (Fs(1,31) = 43, p < .001; Fi(1,102) = 9.4, p < .001). Error rate results from planned comparisons over levels of tempo differed in part from those found in the latency analyses. Whereas the each level of tempo was reliably faster than the previous, only the increase from B-100 to B-150 showed a reliable increase in error rate: from B-0 to B-50 [F s(1,31) < 1; Fi(1,153) < 1], from B-50 to B-100 [Fs(1,31) = 3.2, p > .05; Fi(1,153) = 2.7, p > .1], and from B-100 to B-150 [Fs(1,31) = 10.2, p < .01; Fi(1,153) = 8.2, p < .01].


98

Std

0

-50

-100

-150

0.4

0.3

0.2 Error Rate 0.1

0 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Figure 7. Mean error rates (proportion of errors) from the standard and tempo naming portions of Experiment 1 as a function of stimulus type and tempo. The dashed lines separate the standard naming means from the tempo means.

Errors were further analyzed by calculating chi-square statistics on a frequency table of tempo by error type, presented in Table 5. The overall frequency counts were significantly different than their expected values based on row and column means (Χ2(12) = 37, p < .001). Our hypotheses concerned how the counts of different error types would vary as tempo increased. The MantelHaenszel chi-square test for trend (Cody & Smith, 1997) is especially suited for contingency tables in which one or both variables have a specific ordering of levels (i.e., it collapses the independent contribution to degrees of freedom for individual levels of each variable), so we also conducted Mantel-Haenszel (M-H) tests on all tempo naming contingency tables. The M-H test approached significance for the overall frequency counts (Χ2(1) = 2.9, p < .1). Motivation for analyzing error types specifically came from predictions concerning regularization errors. To address these more closely, regularization and mixed errors (which are regularizations themselves) were pooled together, and these were compared against word errors: the M-H test was now significant (Χ2(1) = 10.4, p < .001), as was the chi-square (Χ2(9) = 37, p < .001). M-H analyses were also performed on the subset of data including only word, regularization, and mixed errors (with regularizations and mixed errors collapsed); the M-H test was significant (Χ2(1) = 5.2, p < .05), but the chi-square test was not (Χ2(6) = 5.8, p > .1). Based on the pattern of column percents in the frequency table, the chi-square results indicate that the frequency of regularization mixed errors remained constant while that of other error types (word errors in particular) increased as tempo increased. If we consider the mixed errors as mostly regularizations that coincidentally form words, then we can conclude that the proportion of regularization errors decreased as tempo increased.


99

Table 5. Frequency counts of errors in the tempo naming portion of Experiment 1, as a function of tempo and error type. 0

-50

-100

-150

Regularization

20 (22.7)

19 (19.8)

20 (16.7)

18 (10.8)

77 (16.4)

Word

33 (37.5)

43 (44.8)

54 (45.0)

60 (36.1)

190 (40.4)

Mixed

20 (22.7)

14 (14.6)

17 (14.2)

17 (10.2)

68 (14.5)

Nonword

10 (11.4)

8 (8.3)

21 (17.5)

39 (23.5)

78 (16.6)

5 (5.7)

12 (12.5)

8 (6.7)

32 (19.3)

57 (12.1)

88

96

120

166

Articulatory

Total

Total

470

Discussion The results of Experiment 1 can be summarized as follows. The manipulation of frequency and consistency in the standard naming task basically replicated the findings of previous studies. Responses to HFE words were faster than those to LFE words, and responses to LFC words were faster than those to LFE words (reliable by subjects only). Error rates also showed this pattern, but more reliably. The tempo naming task was effective in inducing progressively faster, more error prone responses (responses were 94 ms faster and 7% less accurate, on average, than baseline in the fastest tempo condition). This result indicates that, as Lupker et al. (1997a) noted, subjects must use a fairly conservative criterion to respond in the standard speeded naming task (otherwise one would expect tempo to affect speed less and accuracy more, relative to the observed effects). Moreover, the fact that responses reliably increased in speed by 27 ms from the B-100 to B-150 conditions suggests that subjects could be driven to respond correctly even faster. The effect of stimulus type on naming latency diminished in the tempo naming task compared to standard naming, and this did not seem to be due to a practice effect (as indicated by blocking analyses), or a change in items (as indicated by analyses of item subsets). Furthermore, there was no indication of an interaction of stimulus type with tempo in the latency analyses. An analysis of stimuli by plosivity and voicing of the initial phoneme indicated that subjects attempted to time the onset of voicing with the tempo. In addition, analyses of naming duration showed that as tempo increased, duration of the entire naming response decreased. Unlike naming latency, the pattern of error rate effects was essentially the same between standard and tempo naming: error rates to HFE and LFC words were both lower than those to LFE words. To complement the latency results, error rates increased with tempo, indicating a speed/accuracy tradeoff (albeit the only reliable increase was from the B-100 to B-150 condition). An analysis of the error types as a function of tempo showed that while word errors and other error types increased in number with increases in tempo, the number of regularization errors remained constant.


100

The findings from Experiment 1 indicate the following in terms of the two main research agendas stated at the outset. First, they fail to provide support for the pure time criterion hypothesis of response initiation for two reasons. The first reason is that although the effect of stimulus type was diminished in tempo naming compared to standard naming, there was a reliable main effect of frequency. One could argue that the tempo manipulation was not perfectly effective in controlling the putative time criterion, as evidenced by the progressive failure to time responses as tempo increased. To the extent that response onset was not completely entrained to the tempo, stimulus factors would have the opportunity to influence response onset. However, this account must also predict that the influence of stimulus type would increase as failure to maintain timing increased. The lack of an interaction between tempo and stimulus type, therefore, argues against the pure time criterion hypothesis. The second claim against a time criterion is that it does not account for the decrease in naming durations as tempo increased. The scope of the time criterion hypothesis encompasses response initiation, but not response execution, so one might argue that the hypothesis should not be held accountable for duration effects. However, understanding the factors underlying response duration will shed light on the processes underlying response initiation because they are, in fact, both measures of naming behavior. In summary, we interpret the results as suggesting that the pure time criterion be amended, or an alternate account proposed. The second research agenda was to analyze naming errors as a measure of the contents of phonology early in processing. The proportion of regularization errors significantly decreased as tempo increased, due to an increase in the occurrence of other error types (word errors most notably), in conjunction with a constant number of regularizations. If responses in the tempo naming task reflected earlier states of phonology in the normal course of processing (e.g., as is the case with a pure time criterion), then the decrease in proportion of regularizations argues against triangle models and dual-route models of word reading, as indicated by simulations of Kawamoto’s distributed model (Kawamoto & Kitzis, 1992) and the DRC model (Coltheart et al., 1993). The distributed model shows a stronger influence of the lower level (i.e., letter-to-phoneme) correlations early in processing, which are over-ridden by item-level mappings only later in processing. The rule route in the DRC model outputs the regularized phoneme in time to influence the earlier cycles of processing. Neither model specifies how the timing of responses is achieved (to the extent that it was or was not) in the tempo naming task. Additionally, neither model provides a ready explanation for why response durations decreased as tempo increased. We highlight the duration results here because they play an important role in our account of strategic control over response initiation, presented in the Simulation and General Discussion sections. Experiment 2 Experiment 2 was intended to replicate and extend the results of Experiment 1, and to test a route emphasis account of the error pattern in the tempo naming portion of Experiment 1. The tempo naming task is novel, so it would be useful to know if the results from Experiment 1 can be replicated with an extended set of stimuli and a second group of subjects. More importantly, however, the decrease in proportion of regularization errors as tempo increased may have been due to strategic factors based on stimulus composition. Recall that one motivation for this study came from a debate concerning strategic effects in word naming. The dual-route account proposed that subjects emphasized or de-emphasized one of the routes based on composition of the stimulus list (Monsell et al., 1992), whereas the response criterion account proposed that subjects adjusted a criterion to initiate pronunciation (Lupker et al., 1997a; Jared, 1997). Subjects may have deemphasized the rule route because close to half of all stimuli in Experiment 1 contained exceptional spelling-to-sound correspondences, and the rule route interferes with at least some aspects of the pronunciation of irregular words9 . The fact that errors to LFE words decreased from the first to

9

Of the 104 test exception words in our stimulus set, 88 (85%) were are also irregular as defined by GPC rules used in Coltheart et al.’s (1993) DRC model.


101

second block in the standard naming task is consistent with the idea that subjects de-emphasized the rule route as they became familiar with the stimulus composition. To test this account, we included pseudowords as stimuli in Experiment 2. Following the logic laid out by Monsell et al. (1992), pseudowords should inhibit de-emphasis of the rule route since the rule route is required to generate a pronunciation of pseudowords. If the decrease in proportion of regularizations across tempo in Experiment 1 was due to a strategic de-emphasis of the rule route, then the effect should be diminished when pseudowords are included. If, however, the regularization rate continues to decrease in the presence of pseudowords, a route emphasis explanation would be discredited. Methods Subjects. Thirty-four subjects participated in the experiment as part of a requirement for an undergraduate psychology course. Subjects reported being native English speakers with normal or corrected vision. Stimuli. All test stimuli from Experiment 1 were included. In addition, 52 pseudowords were created by shuffling the onsets and bodies of the LFC words (listed in Appendix A). All 52 pseudowords appeared in the tempo naming task, but none of these appeared in the standard naming task; an additional 26 pseudowords were created to appear exclusively in the standard naming blocks. Care was taken to avoid pseudohomophones (e.g., brane). Standard naming and tempo naming blocks of trials were created in the same way as in Experiment 1. An equal portion of each stimulus type appeared in each test block within standard and tempo naming, and fillers were divided equally among test blocks as well. There were 108 fillers in the tempo naming task, and 1/4 of these were pseudowords. The standard naming and tempo naming practice blocks consisted of 10 and 40 fillers, respectively, and 1/4 of both practice blocks were pseudowords. Therefore, 1/4 of all stimuli were pseudowords. Procedure. The procedure was the same as that in Experiment 1, except subjects were told that some letter strings were not legal English words. As with words, they were to name these letter strings as quickly and as accurately as possible. Results - Standard Naming Data Removal. Data from 2 subjects were removed due to equipment failure, and data from 1 item (chic) was removed as in Experiment 1. Pseudoword errors were categorized as words were in Experiment 1, except regularization and mixed were not possible (all pseudoword bodies contained consistent spelling-to-sound correspondences). Further data removal was carried out as in Experiment 1. Naming Latency Analyses. Mean latencies as a function of stimulus type and block are given in Table 6. The main effect of stimulus type was significant by subjects (Fs(3,93) = 25, p < .001) and items (Fi(3,99) = 11.1, p < .001). The 4 ms increase in mean latency from block 1 to 2 was not significant (Fs(1,31) < 1, Fi(1,99) < 1), nor was the interaction of block and stimulus type (Fs(3,93) = 1.0, Fi(3,99) < 1). Pairwise comparisons revealed that the 18 ms frequency effect was reliable (Fs(1,31) = 17, p < .001; Fi(1,49) = 7.2, p < .01). Unlike Experiment 1 (in which there was no consistency effect), there was a reliable 15 ms advantage consistency effect (Fs(1,31) = 15, p < .001; Fi(1,49) = 5.1, p < .05). The overall mean latencies of words compared to pseudowords was also tested (a lexicality effect); responses to pseudowords were 35 ms slower than to words overall (Fs(1,31) = 36, p < .001; Fi(1,101) = 24, p < .001).


102

Table 6. Mean naming latencies from the standard naming portion of Experiment 2, as a function of block and stimulus type. Block 1 Hi Freq Exc Lo Freq Exc Lo Freq Con Pseudoword Frequency Effect Consistency Effect Lexicality Effect

466 487 469 510

Block 2 (10.9) (12.6) (13.2) (16.2)

21 18 36

475 489 478 510

Mean (12.0) (13.6) (12.4) (17.5)

14 12 29

471 (8.1) 488 (9.2) 473 (9.0) 510 (11.8) 17 *** 15 *** 32 ***

Error Analyses. Errors were categorized and analyzed as in Experiment 1. Table 7 shows mean overall error rate as a function of block and stimulus type. The main effect of stimulus type was significant by subjects (Fs(3,93) = 31, p < .001) and items (Fi(3,99) = 10.4, p < .001), but the Table 7. Mean error rates from the standard naming portion of Experiment 2, as a function of block and stimulus type. Block 1 Hi Freq Exc Lo Freq Exc Lo Freq Con Pseudoword

2.4 16.6 3.5 10.7

Frequency Effect Consistency Effect Lexicality Effect

14.2 13.1 3.2

Block 2 (0.7) (2.1) (1.1) (2.0)

4.4 16 3.5 6.3 11.6 12.5 -1.7

Mean (1.1) (1.5) (1.2) (1.9)

3.4 16.3 3.5 8.5

(0.7) (1.3) (0.8) (1.4)

12.9 *** 12.8 *** 0.8

1.3% drop in errors from the first to second block was not (F s(1,31) < 1, p; Fi(1,99) < 1). The interaction of block with stimulus was significant by subjects (Fs(3,93) = 2.6, p < .05), but not by items (Fi(3,99) = 1.7, p > .1). Although the lack of a fully significant interaction prohibited posthoc analyses, the cell means clearly show a different pattern of results compared to Experiment 1: whereas as LFE errors decreased from block 1 to 2 in Experiment 1, they remained constant in Experiment 2, and pseudoword errors decreased from block 1 to 2. Planned comparisons showed that LFE words were reliably more error prone than both LFC words (F s(1,31) = 59, p < .001; Fi(1,49) = 15, p < .001) and HFE words (Fs(1,31) = 70, p < .001; Fi(1,49) = 14, p < .001). In addition, the overall error rate to words was not significantly different than to pseudowords (Fs(1,31) < 1; Fi(1,101) < 1). Error counts as a function of block and error type are presented in Table 8. Counts were not reliably different than their expected values based on means calculated across levels of block and error type (Χ2(4) = 4.7, p > .2). To compare with Experiment 1, counts were also tallied with the pseudowords removed, and these too did not differ reliably from their expected values (Χ2(4) = 3.4, p > .5).


103

Table 8. Frequency counts of errors in the standard naming portion of Experiment 2, as a function of block and error type. Block 1

Block 2

Word

31 (23.0)

31 (25.0)

62 (23.9)

Regularization

25 (18.5)

22 (17.7)

47 (18.2)

Mixed

14 (10.4)

20 (16.1)

34 (13.1)

Nonword

48 (35.6)

31 (25.0)

79 (30.5)

Articulatory

17 (12.6)

20 (16.1)

37 (14.3)

135

124

Total

Total (Mean)

259

Results - Tempo Naming Data Removal. The procedure for data removal was the same as that in Experiment 1. Latency Analyses. Figure 8 graphs mean naming latencies as a function of tempo and stimulus type. The main effect of stimulus type was significant by subjects (Fs(3,93) = 15.5, p < .001) and items (Fi(3,204) = 3.3, p < .05), as was the main effect of tempo (Fs(3,93) = 439, p < .001; Fi(3,612) = 246, p < .001). The interaction did not reach significance (Fs(9,279) = 1.5, p > .1; Fi(9,612) < 1). Planned comparisons showed that the 9 ms frequency effect was reliable by subjects (Fs(1,31) = 16.3, p < .001) but not items (Fi(1,102) = 2.4, p > .1), as was the 6 ms consistency effect (Fs(1,31) = 9.7, p < .01; Fi(1,102) = 1.8, p > .1). The 9 ms lexicality effect was reliable by subjects (Fs(1,31) = 24, p < .001) and items (Fi(1,206) = 7.0, p < .01). Planned comparisons of the tempo manipulation confirmed that each successively faster level of tempo caused responses to be reliably faster than the previous level: from B-0 to B-50 (F s(1,31) = 223, p < .001; Fi(1,207) = 94, p < .001), from B-50 to B-100 (Fs(1,31) = 234, p < .001; Fi(1,207) = 120, p < .001), and from B-100 to B-150 (Fs(1,31) = 87, p < .001, Fi(1,207) = 40, p < .001). As in Experiment 1, the influence of stimulus type on latencies was smaller in the tempo naming task compared to standard naming (an 18 ms frequency effect, 15 ms consistency effect, and 35 ms lexicality effect, versus 9 ms, 6 ms, and 9 ms effects in tempo naming, respectively). We argue for the validity of drawing this inference using the same logic as in Experiment 1. First, we removed repeated stimuli (and all pseudowords since none of these were repeated) from the tempo naming analyses, and the effects of stimulus type were essentially unchanged, albeit there was a loss of power by items (all effects were significant by subjects but not items): the main effect of stimulus type (Fs(2,62) = 8.2, p < .01; Fi(2,114) = 1.3, p > .2), an 8 ms frequency effect (Fs(1,31) = 9.1, p < .01; Fi(1,76) = 1.3, p > .2), and a 9 ms consistency effect (Fs(1,31) = 12.4, p < .01; Fi(1,76) = 2.2, p > .1). Second, the analyses of block order and stimulus type once again


104

Std

0

-50

-100

-150

550

500

450

400 Naming Latency (ms)

350 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Pseudo

Figure 8. Mean latencies from the standard and tempo naming portions of Experiment 2 as a function of stimulus type and tempo. The dashed lines separate the standard naming means from the tempo means.

revealed no discernable effect of block order (F s(3,93) < 1; Fi(3,612) = 2.1, p > .1). Third and finally, we analyzed standard naming latencies including only tempo naming stimuli (thereby excluding all pseudowords), and the main effect of stimulus type was still reliable by subjects (Fs(2,62) = 5.7, p < .01), but not by items (Fi(2,36) = 1.4, p > .2). Planned comparisons showed that the 16 ms frequency effect (cf. an 18 ms effect with all stimuli included) was reliable by subjects only (Fs(1,31) = 8.8, p < .01; Fi(1,24) = 2.9, p < .1), as was the 12 ms consistency effect (cf. a 15 ms effect with all stimuli; Fs(1,31) = 7.5, p < .01; Fi(1,24) = 1.7, p > .2). Taken together, these three analyses indicate that the decreased stimulus effect on latencies in the tempo naming task was due to the task itself, rather than practice or stimulus selection. Timing Analyses. Figure 9 graphs mean timing offsets as a function of stimulus type and tempo. As in the latency analyses, the main effect of stimulus type was reliable (Fs(3,93) = 15.5, p < .001; Fi(3,204) = 2.7, p < .05), as was the main effect of tempo (Fs(3,93) = 86, p < .001; Fi(3,612) = 165, p < .001). The interaction of stimulus type and tempo was not significant (Fs(9,279) = 1.5, p > .1; Fi(9,612) = 1.3, p > .2). Planned comparisons showed that, as in Experiment 1, responses were increasingly delayed from tempo at each step: B-0 to B-50 (F s(1,31) = 28, p < .001; Fi(1,207) = 32, p < .001), B-50 to B-100 (Fs(1,31) = 10.0, p < .01; Fi(1,207) = 24, p < .001), B-100 to B-150 (Fs(1,31) = 85, p < .001; Fi(1,207) = 120, p < .001). The analyses of timing as a function of initial phoneme in Experiment 1 supported the hypothesis that subjects timed their responses based on the onset of voicing. The main piece of evidence was that responses with voiced non-plosive-initial phonemes were timed more accurately than those with unvoiced ones, which were too fast in the B-0 condition. Timing in Experiment 2 was also analyzed as a function of voicing and tempo: Responses to voiced non-plosive-initial responses were 28 ms slower (cf. 23 ms effect in Experiment 1) than the unvoiced counterparts (Fs(1,31) = 104, p < .001; Fi(1,123) = 54, p < .001), and this effect interacted with tempo in the same way as in Experiment 1 (Fs(3,93) = 3.3, p < .05; Fi(1,123) = 3.5, p < .05). In particular, the


105

0

-50

-100

-150

60 45 30 15 0 (ms) Timing -15 -30 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Pseudo

Figure 9. Mean timing with tempo from Experiment 2 as a function of stimulus type and tempo.

voicing effect steadily decreased from a 38 ms difference in the B-0 condition, to an 18 ms in the B150 condition. Finally, timing of non-plosive-initial voiced responses was 3 ms in the B-0 condition, compared to -35 ms for the unvoiced responses. These analyses corroborate those from Experiment 1 in showing that subjects timed their responses based on the onset of voicing, rather than other candidates such as the onset of acoustic energy or the vowel. They also show indirect evidence of a decrease in naming duration as tempo increased. Naming Duration Analyses. Naming duration analyses in Experiment 1 showed that an increase in tempo caused a decrease in whole word naming duration for all initial phoneme types. Naming durations in Experiment 2 replicated the pattern found in Experiment 1: Naming durations, collapsing across initial phoneme type, steadily and reliably decreased from B-0 to B-150 (Fs(3,93) = 10.3, p < .001; Fi(3,612) = 20, p < .001). The same pattern of effects held when analyses were restricted to voiced responses (F s(3,93) = 4.0, p < .01; Fi(3,177) = 6.7, p < .001), as well as plosive-initial responses (Fs(3,93) = 7.3, p < .001; Fi(3,243) = 11.5, p < .001). The fact that the naming duration effect held for both of the above IP types indicates that the rime portion of the response decreased in duration, as well as the onset. Error Analyses. Figure 10 graphs mean error rate as a function of stimulus type and tempo. The error rate results replicated those of Experiment 1. The main effect of stimulus type was reliable (Fs(3,93) = 55, p < .001; Fi(3,204) = 14.9, p < .001), as was the main effect of tempo (Fs(3,93) = 13.9, p < .001; Fi(3,612) = 17.0, p < .001). The interaction of stimulus type and tempo was nonsignificant (Fs(9,279) < 1; Fi(9,612) = 1.3, p > .2). Pairwise comparisons showed that 7.2% increase in erred responses to LFE over HFE words was reliable (F s(1,31) = 35, p < .001; Fi(1,102) = 12.6, p < .01), as was the 7.4% increase in LFE over HFE errors (F s(1,31) = 60, p < .001; Fi(1,102) = 15.5, p < .001). To test the extent to which error rates increased with tempo as latencies decreased (i.e., a speed-accuracy tradeoff), error rates between adjacent levels of tempo


106

Std

0

-50

-100

-150

0.4

0.3

0.2 Error Rate 0.1

0 Hi Freq Exc

Lo Freq Exc

Lo Freq Con

Pseudo

Figure 10. Mean error rates (proportion of errors) from the standard and tempo naming portions of Experiment 2 as a function of stimulus type and tempo. The dashed lines separate standard naming means from the tempo means.

were compared. Although error rate increased with each increase in tempo, only the change from B50 to B-100 was significant (cf. only the change from B-100 to B-150 was significant in Experiment 1): from B-0 to B-50 (Fs(1,31) = 1.9, p > .1; Fi(1,207) = 2.3, p > .1), from B-50 to B-100 (Fs(1,31) = 11.9, p < .01; Fi(1,207) = 14.2, p < .001), and from B-100 to B-150 (Fs(1,31) < 1; Fi(1,207) < 1). Errors were further analyzed by calculating chi-square statistics on the frequency table of error counts, broken down by tempo by error type (shown in Table 9). With regularizations and mixed errors treated separately, the overall frequency counts were not significantly different than their expected values in a chi-square test, based on row and column means (Χ2(12) = 17.8, p > .1). However, when these two error types were combined, the chi-square test was significant (Χ2(9) = 16.7, p < .05). The purpose of analyzing error types here is to compare them with the analogous analyses in Experiment 1. In particular, we want to know whether the proportion of regularization errors fell as tempo increased (as they did in Experiment 1). The Mantel-Haenszel chi-square test was significant when comparing word errors, regularizations, and mixed errors for all stimuli (regularizations and mixed errors separated, M-H(1) = 5.6, p < .05; combined, M-H(1) = 8.3, p < .01), as well as when pseudowords were excluded from the analysis to compare with the corresponding results of Experiment 1 (separated, Χ2(1) = 4.4, p < .05; combined, Χ2(1) = 6.2, p < .05). Discussion Overall, the results of Experiment 2 replicated those of Experiment 1. Tempo again had a strong influence on response latencies and error rates (i.e., it induced subjects to incrementally trade accuracy for speed), but subjects did not fully keep pace with the tempo at the faster rates. Timing analyses showed again that subjects attempted to time their responses based on the onset


107

Table 9. Frequency counts of errors in the tempo naming portion of Experiment 2, as a function of tempo and error type. 0

-50

-100

-150

Total

Word

63 (39.9)

70 (40.2)

108 (44.4)

128 (46.0)

369 (43.3)

Regularization

28 (17.7)

26 (14.9)

31 (12.8)

25 (9.0)

110 (12.9)

Mixed

16 (10.1)

18 (10.3)

17 (7.0)

21 (7.6)

72 (8.4)

Nonword

39 (24.7)

39 (22.4)

54 (22.2)

57 (20.5)

189 (22.2)

Articulatory

12 (7.6)

21 (12.1)

33 (13.6)

47 (16.9)

113 (13.3)

Total

158

174

243

278

853

of voicing, rather than the onset of articulation or the vowel. The frequency effect was diminished (but reliable) from standard to tempo naming, as was the consistency effect. Naming duration analyses again showed that durations decreased with increased tempo. These results provide further evidence against the pure time criterion hypothesis. Two differences between the results of Experiments 1 and 2 were that 1) the consistency effect was more reliable in both the standard and tempo naming portions of Experiment 2, and 2) the accuracy of naming LFE words dropped from the first to second block of standard naming in Experiment 2, whereas the accuracy of naming pseudowords increased; error proportions did not significantly change in the standard naming blocks from Experiment 1. On one hand, these differences seem to support the route emphasis hypothesis; the inclusion of pseudowords in Experiment 2 inhibited the putative de-emphasis of the rule route in Experiment 1. On the other hand, route emphasis also predicts that regularization errors should increase as subjects trade speed for accuracy, as they did in the tempo naming task. To the contrary, the proportion of regularizations dropped with increases in tempo. If one interprets the results of Experiment 2 as evidence against the route emphasis account, then the data do not fit with the prediction of increasing regularizations made by Kawamoto’s (1993) distributed model and the DRC model (Coltheart et al., 1993). Moreover, response durations again decreased as tempo increased in both experiments, but neither model can readily speak to such duration effects. There are at least two points to consider before concluding that these models need revision or replacement to account for the data. First, the argument against them is based on the assumption that the tempo naming task taps into the early contents of phonological processing as that processing is carried out in the standard naming task. In other words, a tempo naming response is assumed to arise from earlier moments within the “normal” trajectory of processing that occurs during standard naming. This assumption is most obvious in the case of a time criterion, but as argued above, the results suggest that subjects did not use such a criterion. Responses in the tempo naming task do, in fact, begin earlier in time than standard naming responses, but the trajectory of processing in tempo naming could be qualitatively different from that in standard naming. This would add a level of complexity to how models of word reading would account for tempo naming results. The second point to consider is that one could still argue that the inclusion of pseudowords


108

did not place sufficient emphasis on the rule route to cause a significant increase in regularization errors. The first point is addressed in the Simulation and General Discussion sections, and the second point is addressed in Experiment 3. Experiment 3 If stimulus composition can cause subjects to preferentially emphasize the rule-like aspects of word reading, then the logically strongest condition of encouragement would be a stimulus block consisting of all non-homophonic pseudowords. The problem with using blocks of all pseudowords is that there would be no opportunity to observe regularization errors (e.g., for pseudowords with inconsistent bodies, both consistent and inconsistent pronunciations are valid), and the proportion of regularization errors in Experiments 1 and 2 served as a measure of emphasis placed on the rule route. A stimulus block of all pseudowords can still be useful, however, because the route emphasis hypothesis applies to the lexical route as well as the rule route. In this case, a pure pseudoword block should encourage de-emphasis of the lexical route because, on dual-route accounts, the lexical route produces interfering output for pseudowords. A pure pseudoword block does not provide an opportunity for regularization errors, but it does provide an opportunity for word errors. If word errors to pseudoword targets arise, at least in part, from processing in the lexical route, then such errors should decrease in proportion when the lexical route is deemphasized, compared to the proportion of word errors in Experiments 1 and 2. Let us assume that the lexical route was, in fact, partially responsible for word errors in Experiments 1 and 2 (we argue below that this was the case). We can then state the alternate possible outcome: If the lexical route is not de-emphasized in a block of all pseudowords, then the rate of word errors should be equal to a mixed block of words and pseudowords. The purpose of Experiment 3 was to provide a further test of the route emphasis hypothesis by presenting all pseudowords in the tempo naming task. Methods Subjects. Thirty-six subjects participated in the experiment as part of a requirement for an undergraduate psychology course. Subjects reported being native English speakers with normal or corrected vision. Stimuli. Out of the 52 pseudowords from Experiment 2 (all created by mixing LFC onsets and bodies), 48 were included in Experiment 3. The onsets and bodies of an additional 48 LFE words and 48 HFE words from Experiments 1 and 2 were mixed to create an additional 96 pseudowords, for a total of 144 test pseudowords in Experiment 3 (all test pseudowords are given in Appendix B). An additional 50 pseudowords were created for the standard naming task in order to measure a baseline naming latency for each subject. Standard naming and tempo naming blocks of trials were created in the same way as in Experiment 1 and 2. An equal portion of each pseudoword type (i.e., those created from LFC, LFE, or HFE words) appeared in each test block within tempo naming, and fillers were divided equally among test blocks as well. There were 40 pseudoword fillers in the tempo naming task and 2 pseudoword fillers in the standard naming task. The standard naming and tempo naming practice blocks consisted of 10 and 40 pseudoword fillers, respectively. Procedure. The procedure was the same as that in Experiment 1, except subjects were told that none of the letter strings would make legal English words. Subjects were instructed to name these letter strings as quickly and as accurately as possible. Results - Standard Naming Data Removal. Data from 3 subjects were removed due to equipment failure. Further data removal was carried out as in Experiments 1 and 2.


109

Naming Latency Analyses. Mean latencies were analyzed as a function of block. A 15 ms increase in mean latency from block 1 (538 ms) to block 2 (553 ms) was not significant (Fs(1,31) = 1.8, p > .1; Fi(1,49) < 1). Error Analyses. Errors were categorized and analyzed as in Experiments 1 and 2. For pseudowords with inconsistent word bodies, both the regular and exceptional pronunciations were coded as correct. A 2.1% increase in error rate from block 1 (4.2%) to block 2 (6.3%) was not significant (Fs(1,31) = 2.5, p > .1; Fi(1,49) < 1). Error counts as a function of block and error type are presented in Table 10. Counts were not reliably different than their expected values based on means calculated across levels of block and error type (Χ2(2) < 1). Results - Tempo Naming Data Removal. The procedure for data removal was the same as in Experiments 1 and 2. Latency Analyses. Figure 11 graphs mean naming latencies as a function of tempo and stimulus type. Tempo once again had a strong influence on response latencies (Fs(3,93) = 345, p < .001; Fi(3,423) = 166, p < .001), but the main effect of stimulus type was significantly reduced compared to Experiments 1 and 2 (Fs(2,62) = 3.4, p < .05; Fi(2,141) < 1). The interaction of stimulus type and tempo was not significant (Fs(6,186) < 1; Fi(6,423) < 1). Planned comparisons showed that the “pseudo-frequency” effect (i.e., latencies of pseudowords created from HFE Table 10. Frequency counts of errors in the standard naming portion of Experiment 3, as a function of block and error type. Block 1

Block 2

Word

15 (44.1)

23 (45.1)

38 (44.7)

Nonword

16 (47.1)

23 (45.1)

39 (45.9)

Articulatory

3 (8.8)

5 (9.8)

34

51

Total

Total (Mean)

8 (9.4) 85


110

Std

0

-50

-100

-150

550

500

450

400 Naming Latency (ms)

350 Pseudo Hi Exc

Pseudo Lo Exc

Pseudo Lo Con

Figure 11. Mean latencies from the standard and tempo naming portions of Experiment 3 as a function of stimulus type and tempo. The dashed lines separate standard naming means from the tempo means.

versus LFE words) was not reliable (F s(1,31) = 2.8, p > .1; Fi(1,94) < 1), but the pseudoconsistency effect was reliable by subjects (Fs(1,31) = 6.4, p < .05; Fi(1,94) < 1). Planned comparisons of the tempo manipulation confirmed that each successively faster level of tempo caused responses to be reliably faster than the previous level: from B-0 to B-50 (F s(1,31) = 187, p < .001; Fi(1,141) = 61, p < .001), from B-50 to B-100 (Fs(1,31) = 128, p < .001; Fi(1,141) = 46, p < .001), and from B-100 to B-150 (Fs(1,31) = 150, p < .001, Fi(1,141) = 71, p < .001). The small, partially reliable pseudo-consistency effects was presumably due the ambiguous spelling-tosound correspondences that these strings contained (e.g., “bost” can rhyme with “cost” or “most”; Glushko, 1979; Taraban & McClelland, 1987; Seidenberg et al., 1994). Timing Analyses. Figure 12 graphs mean timing offsets as a function of stimulus type and tempo. As in the latency analyses, the main effect of stimulus type was only reliable by subjects (Fs(3,62) = 3.4, p < .05; Fi(2,141) < 1), the main effect of tempo was significant by subjects and items (Fs(3,93) = 26, p < .001; Fi(3,423) = 53, p < .001), and the interaction was not significant (Fs(6,186) < 1; Fi(6,423) < 1). Planned comparisons showed that, as in Experiments 1 and 2, responses were increasingly delayed from tempo at each step: B-0 to B-50 (F s(1,31) = 18, p < .001; Fi(1,141) = 25, p < .001), B-50 to B-100 (Fs(1,31) = 16.0, p < .01; Fi(1,141) = 20, p < .001), and B-100 to B-150 (Fs(1,31) = 2.9, p > .1; Fi(1,141) = 5.9, p < .05). Timing analyses were again conducted as a function of initial phoneme characteristics to test the hypothesis that subjects attempt to time their responses based on the onset of voicing. As in Experiments 1 and 2, responses to voiced, non-plosive stimuli were reliably slower (32 ms difference) than those to unvoiced, non-plosive stimuli (Fs(1,31) = 68, p < .001; Fi(1,85) = 44, p < .001), and this effect interacted with tempo, albeit reliably only by subjects (Fs(3,93) = 3.3, p < .05; Fi(3,255) = 2.0, p > .1). In particular, the voiced effect steadily decreased from a 40 ms difference in the B-0 condition, to an 24 ms in the B-150 condition. Finally, timing of voiced, non-plosive items was 0 ms in the B-0 condition, compared to -40 ms for unvoiced, non-plosive responses.


111

0

-50

-100

-150

60 45 30 15 0 (ms) Timing -15 -30 Pseudo Hi Exc

Pseudo Lo Exc

Pseudo Lo Con

Figure 12. Mean timings with tempo from Experiment 3 as a function of stimulus type and tempo.

Naming Duration Analyses. Naming duration analyses were again conducted to show that the entire response decreased in duration as tempo increased, rather than just the initial phoneme. The results replicated those of Experiments 1 and 2: Naming durations, collapsing across IP type, steadily decreased from B-0 to B-150 (Fs(3,93) = 5.3, p < .01; Fi(3,423) = 16.8, p < .001). The same pattern of effects held when analyses were restricted to voiced, non-plosive pseudowords (Fs(3,93) = 2.8, p < .05; Fi(3,117) = 6.3, p < .001), as well as plosive-initial pseudowords (Fs(3,93) = 4.0, p < .01; Fi(3,165) = 5.6, p < .01). The fact that the whole response duration effect held for both of the above IP types indicates that the rime portion of the response decreased in duration, as well as the onset. Error Analyses. Figure 13 graphs mean error rate as a function of stimulus type and tempo. Error rates again increased as tempo increased, indicating a speed/accuracy tradeoff (F s(3,93) = 5.4, p < .01; Fi(3,423) = 8.8, p < .001). However, unlike naming latencies, error rates were only marginally affected by stimulus type (Fs(2,62) = 2.7, p < .1; Fi(2,141) < 1). The interaction of stimulus type and tempo was also not significant (Fs(6,186) = 1.3, p > .2; Fi(6,423) = 1.4, p > .2). Although error rate increased with each increase in tempo, the increase from B-0 to B-50 was not significant (Fs(1,31) < 1; Fi(1,141) < 1). The increase from B-50 to B-100 was reliable by items but not subjects (Fs(1,31) = 1.9, p > .1; Fi(1,141) = 3.9, p < .05), and the increase from B-100 to B-150 was fully reliable (Fs(1,31) = 5.9, p < .05; Fi(1,141) = 4.2, p < .05). Unlike Experiments 1 and 2, there was no opportunity for regularization errors. However, one of the purposes of using all pseudowords in the tempo naming task was to test whether subjects could de-emphasize the putative lexical route to pronunciation. If so, one would expect the proportion of word errors to decrease compared to Experiments 1 and 2. Error counts as a


112

Std

0

-50

-100

-150

0.4

0.3

0.2 Error Rate 0.1

0 Pseudo Hi Exc

Pseudo Lo Exc

Pseudo Lo Con

Figure 13. Mean latencies from the standard and tempo naming portions of Experiment 3 as a function of stimulus type and tempo.

function of error type and tempo are presented in Table 11. To the contrary of the route emphasis hypothesis, the proportion of word errors was greater in the current experiment (53%) than in the previous two (40% in Experiment 1 and 43% in Experiment 2). Furthermore, the rate of increase in word errors as tempo increased was approximately equal to that for nonword errors, as was the case in Experiments 1 and 2. The Mantel-Haenszel test for trend was significant (Χ2(1) = 4.8, p < .05), but inspection of the column percents shows that this was due to a greater rate of increase in articulatory errors across tempo, compared to word and nonword errors. Table 11. Frequency counts of errors in the tempo naming portion of Experiment 3, as a function of tempo and error type. 0

-50

-100

-150

Word

46 (63.0)

41 (51.3)

51 (48.1)

68 (51.5)

206 (52.7)

Nonword

23 (31.5)

33 (41.3)

45 (42.5)

43 (32.6)

144 (36.8)

4 (5.5)

6 (7.5)

10 (9.4)

21 (15.9)

41 (10.5)

73

80

106

132

Articulatory

Total

Total

391

Discussion The results of Experiment 3 were consistent with those of Experiments 1 and 2. Timing analyses again suggested that subjects time their responses with the onset of voicing in the tempo naming task. Furthermore, the number of word, nonword, and articulatory errors as a function of tempo were comparable to those of Experiments 1 and 2: they all steadily increased as tempo was


113

increased. If the route emphasis hypothesis was correct in accounting for the constant number of regularizations across tempos in Experiments 1 and 2, then one should also expect a decrease in the number of word errors in Experiment 3, which included nothing but pseudowords. The results failed to support this prediction. There is one important assumption to clarify before we can interpret the increase in word errors with increased tempo as evidence against the route emphasis hypothesis. Word errors can arise from lexical route processing, but they can also arise by chance from faulty processing in the rule route, or faulty input (visual through orthographic) or output (articulatory) processing. This leaves open the possibility that the observed increase in word errors was due to chance errors arising from non-lexical route sources. Therefore, subjects may have deemphasized the lexical route in Experiment 3, even though there was an increase in occurrence of word errors. We addressed this ambiguity by estimating the rate of chance word errors in Experiment 3. Determining chance error rates is a difficult problem since one must take into consideration the specific stimuli and conditions of the observed errors, the types of errors that are possible (e.g., only phonetically legal errors allowed), as well as any error biases that may be independent of a lexical bias (e.g., in speech production, initial consonants are more likely to produce errors than final ones; Dell, 1988; Schwartz & Goffman, 1995). We empirically determined a chance error rate for the stimuli in Experiment 3, using the most conservative method that we could devise given our intentions. In particular, we first estimated the frequency of occurrence of a large number of phonological error types (e.g., position-dependent phoneme substitutions, deletions, insertions, and transpositions) based on the observed errors. We then applied each phonological error type (when applicable), weighted by the frequency estimates, to each test item to generate a set of possible erred responses. Our estimate of the chance frequency of word errors was the proportion of errors that formed legal English words out of the set of derived phonological errors. This method is conservative because the estimated rate of each phonological error type is determined based on all observed errors (i.e., including word errors); these estimates will therefore include any real lexical biases that may correlate with phonological biases (e.g., initial consonant substitutions may correlate with more word errors than final consonant substitutions). The chance estimate of the word error rate for Experiment 3 was 28%10 . The observed rate of word errors was 53%, which was significantly different than chance (t(27) = 8.1, p < .001 by subjects; t(120) = 6.9, p < .001 by items). Therefore, the increase in word errors as a function of tempo observed in Experiment 3 stands as evidence against the route emphasis hypothesis. Empirical Summary and Modeling Overview Two issues in word reading motivated the current study: 1) strategic control over response initiation in word naming, and 2) the time course of phonological activation in word reading. The tempo naming task generated data that addressed the route emphasis and time criterion hypotheses, as well as data that addressed the dynamics of phonology in word reading. The experimental results did not support either hypotheses, nor did they fit the predictions concerning early phonology made by the DRC model or the distributed model. We present a connectionist model of word reading that simulates the tempo naming task and accounts for the basic findings in Experiments 1 through 3. The main results from the current experiments that we wanted to account for were11 : 10

As a comparison, Garret (1976) and Dell and Reich (1981) estimated the chance occurance of word errors in speech production (i.e., within phrases and sentences), and their estimates were 33% and 40%, respectively. Note that different type of errors can and do occur in speech production versus word naming (e.g., transpositions between words; “barn door” → “darn bore”. 11

We did not consider the pseudo-consistency effects found in Experiment 3 as important for understanding the central issue of strategic control over response initiation, so we did not address these results in the modeling work (for related work, see Seidenberg, Plaut, Petersen, McClelland, & McRae, 1994).


114

6.

The frequency, consistency, and lexicality effects found in the standard naming portion of each experiment, for both latencies and error rates.

7.

Decreased naming latencies and durations, and increased error rates with decreases in the tempo naming portion of each experiment.

8.

The persistence of frequency and consistency effects on latency and error rates in the tempo naming portion of Experiments 1 and 2.

9.

The constant number of regularization errors accompanied by increases in the occurrence of other error types with increased tempo (i.e., a decrease in the proportion of regularization errors). The current model is largely an extension of previous connectionist models of word reading (Kawamoto, 1988; Plaut et al., 1996; Seidenberg & McClelland, 1989). Distributed representations of orthography, phonology and semantics interact with each other through learned, internal representations. Spoken language comprehension and production involve mapping between phonology and semantics, whereas reading involves mapping orthographic input onto both semantics and phonology. We assume that the system has considerable experience with language prior to beginning to learn to read, such that the process of reading acquisition is strongly shaped by existing language knowledge (i.e., lexical knowledge in the current context). The bulk of the conceptual work necessary to simulate the current empirical results focused on creating a mechanism for strategic control over response initiation. We approached the problem from a general perspective on strategic control that we propose here: strategic control is not carried out by manipulating individual levels of representation, as is the case with route emphasis or a time criterion over the output representation. Strategic control is instead carried out by changes in global parameters spanning the entire system in question. We refer to this as the hypothesis of coarse strategic control. The tempo naming task provides a test bed for coarse strategic control because subjects actively entrain themselves to the tempo in order to synchronize the initiation of their responses. It is likely that entrainment has widespread effects as evidenced by anecdotal reports of subjects tapping their fingers or bobbing their heads in time with the tempo. It seems reasonable to hypothesize that the active process of entrainment also drives, at least in part, the speed of response initiation, as well as its duration. This hypothesis suggests a system-wide mechanism of entrainment. We chose to simulate the effects of tempo by manipulating gain (Cohen et al., 1992; Nowlan, 1988), which is a multiplicative scale factor on the net input to processing units. We chose gain for the following reasons. First, the computational properties of gain have been studied in connectionist networks (Nowlan, 1988), and gain has previously been used as a mechanism of strategic control (Cohen et al., 1992). Second, it is straightforward to manipulate gain across all processing units in a network (in keeping with coarse strategic control). Third, increasing the gain over the net input to a unit causes that unit to change its activation values more quickly in response to incoming information. Therefore, increasing the input gain over all units in the network should have the desired quality of causing the network to settle more quickly (given that some measure of settling time corresponds to naming behavior, as it does in the current work). Finally, unlike a pure time criterion, the manipulation of gain does not preclude the possibility of stimulus effects on naming latencies. Gain is not a type of response criterion per se, so by invoking gain one must still choose a mechanism to extract naming latencies from the network. We chose a mechanism that allows for stimulus effects on latencies (described below). Although the primary goal of the current simulation was to account for the main results from the tempo naming experiments, we also wanted to simulate the stimulus blocking results from the study by Lupker et al. (1997a) because we are proposing the manipulation of gain as an alternative to the time criterion hypothesis. A summary of our reasons for suspecting that the manipulation of


115

gain would meet these two goals is given here, and the details are presented in the simulation methods and results. If fast tempos correspond to high levels of gain on the net input to all units in the network, then the network should settle more quickly at fast tempos. If settling time affects both response initiation and execution, then latencies and durations should be shorter for faster tempos. Additionally, error rates should be higher because changes in gain can have disproportionate changes in the rate of activation change across different processing units, due to the nonlinearities in the activation function (see below). The manipulation of gain must also account for the proportions of error types as function of tempo; in particular, the number of regularizations remained constant while the number of other error types increased with faster tempos in Experiments 1 and 212 . A system-wide manipulation of gain will affect processing within semantics and phonology, and the interaction of the two, as well as the processes mapping orthography to phonology. As noted above, the mapping between semantics and phonology represents language knowledge learned prior to reading, which is largely based upon words. Therefore, increased gain should amplify the influence of lexically-based language knowledge, and thereby cause a tendency for word errors over regularizations. Finally, the manipulation of gain should account for effects of stimulus blocking because shifts in gain will have effects on response initiation analogous to shifts in a time criterion. A Distributed, Connectionist Model of Word Reading Network Architecture. Figure 14 shows a diagram of the pools of units and their connectivity in the current simulation. Semantics and phonology served as both input and output layers, depending on the phase of training (see below), and orthography served only as an input layer. Orthography and phonology consisted of 37 units each, and semantics consisted of 50 units (representations explained below). A set of 250 hidden units (hidden language) was fully and bidirectionally connected to both semantics and phonology, and each hidden unit had a bias weight. A second set of 100 hidden units (hidden reading) were fully and bi-directionally connected to the hidden language group, and these also had biases. Orthography fully projected to the hidden reading group. Note that the pathways from orthography to phonology and semantics were both routed through the hidden language group. This contrasts with previous models of word reading in which there is a processing pathway between orthography and phonology that computes somewhat independently of semantics (Coltheart et al., 1993; Plaut et al., 1996; Seidenberg & McClelland, 1989). Preliminary simulations conducted by us indicated that this architectural constraint was necessary to account for certain aspects of the tempo naming results (Kello & Plaut, in preparation). We consider the implications of this architecture in the General Discussion. Time plays a prominent role in the tempo naming task. We instantiated a time course in the current network by using connectionist processing units that are “stiff” in time; the output of a unit was a function of its net input as well as its output from the previous time step (Pearlmutter, 1989). In particular, the activation s of unit i at time t + ∆t was si (t + ∆ t ) = si (t ) + ∆t (σ (x i (t )) − si (t ))

1.

where the activation function was the logistic

σ ( xi (t )) =

12

1 1 + exp( − x i (t ))

2.

The proportion of error types as a function of tempo in Experiment 3 generally replicated those from Experiments 1 and 2, except that regularizations and mixed errors were not possible in Experiment 3.


116

Semantics (50)

Orthography (37)

Hidden Reading (100)

Hidden Language (250)

Phonology (37)

Figure 14. Diagram of network architecture. Numbers in parentheses denote number of processing units, lines denote full connectivity (i.e., every unit to every unit). The dashed box separates the portion of the network that was adapted only during reading training from the portion that was adapted during both the language and reading phases of training.

and the net input xi to unit i was   x i (t ) = γ  ∑ s j (t ) w ji + bi  + Ei 3. j   where wji was the weight from unit j to unit i, bi was a real-valued bias to unit i, Ei was the contribution of external input, and γ was the gain parameter. As the parameter ∆t goes to zero, the network approaches a continuous time system. However, digital computers require discrete time, so ∆t must be in the interval [0, 1] (a ∆t of 1 is equivalent to a fully discrete system). This parameter is not of theoretical importance, so a relatively coarse value was chosen (∆t = 0.2) to minimize training and processing time. One can think of the gain parameter γ as scaling the steepness of the sigmoid activation function (most simulations essentially fix gain at 1). This is equivalent to the inverse of temperature in Boltzmann machines (Hinton & Sejnowski, 1986). The learning algorithm used was a derivative of the recurrent backpropagation algorithm (Rumelhart, et al., 1986) generalized for continuous time systems (Pearlmutter, 1989). Input/Output Representations. Inputs and targets at phonology consisted of distributed patterns of binary activation that abstractly represented CVC strings. Training examples (i.e., words) were formed by choosing 2 of 15 consonants (with replacement) and 1 of 7 vowels. Each consonant was a pattern of 8 out of 15 units on, and each vowel was a pattern of 4 out of 7 units on. Therefore, each phonological and orthographic training item was a pattern of activation across 15 + 7 + 15 = 37 units. The similarity of consonant patterns ranged from a normalized dot product (NDP) of 0.125 to 0.825. Vowel similarity ranged from an NDP of 0.25 to 0.75. The CVC structure defined a space of 15 × 7 × 15 = 1,575 possible strings. To approximate the density of monosyllabic English words, a total of 700 phonological words were generated by randomly choosing 8 prototype CVC strings (neighborhoods) and iteratively distorting each prototype by one or more “phonemes” to generate individual training items (all CVCs; 700 / 1575 = 44% words).


117

Inputs and targets at semantics consisted of sparse, distributed, binary representations. Semantic training patterns were generated by randomly seeding 16 semantic prototypes (semantic categories), and distorting each prototype by one or more bits to create each training pattern. The probability of each prototype bit being on was 0.1, and the probability of distorting each prototype bit per exemplar was 0.2, with the constraint that each exemplar must have at least 6 bits on and cannot be identical to another prototype (i.e., there were no exact synonyms; see Clark, 1993). The sparsity constraint was meant to capture the notion that most of one’s semantic knowledge is not activated for any given concept. We divided the 700 words into two classes: 400 familiar and 300 unfamiliar words. The familiar words were assigned to semantic targets such that there was no systematic relationship between semantics and phonology (for a given morpheme in English, any one particular phoneme does not convey much, if any, information about the semantics of that morpheme). The unfamiliar words received no semantic targets (and therefore no error from semantics). Correspondingly, there were only 400 semantic input patterns (25 distortions × 16 prototypes = 400 words). This captured the fact that in learning to read, a child will most likely learn both the meaning and sound of some words (familiar), but only the sound of others (unfamiliar). Orthographic input patterns were identical for 490 of the 700 phonological patterns (240 familiar and 250 unfamiliar words); these served as words with consistent spelling-to-sound correspondences. For the remaining 210 phonological patterns (160 familiar and 50 unfamiliar words), the phonological vowels were changed to create exception words such that no homophones were formed. Each of the 7 possible vowels had a single exceptional pronunciation that corresponded to a neighboring vowel with an NDP of 0.75. Ninety-five of the exception words (45%) and 210 of the consistent words (43%) were presented 4 times as often as the remaining words, corresponding to high- versus low-frequency words. The 700 words left 1575 − 700 = 875 untrained, legal CVC patterns, 125 of which corresponded to the phonological targets for exception words (i.e., pseudohomophones). The remaining 750 CVC patterns served as pseudowords for testing purposes (see below). Network training. Training was divided into two sequential, overlapping stages: 1) learning to pronounce words and understand spoken words (language training), and 2) also learning to read (combined language-reading training, hereafter referred to as combined training). Language training consisted of learning the bi-directional mapping between phonology and semantics via the language hidden units. During this first stage, the orthographic and hidden reading units were fixed at zero. Language training continued until error fell below a preset criterion (5 units of cross-entropy error, summed across semantics and phonology through time, per training example; Hinton, 1989), at which point training examples with orthographic input were introduced into the training regimen, interleaved with the training examples from the language training phase. Combined training continued until error again fell below the same criterion as in language training. At the beginning of training, all weight values (including biases) were initialized probabilistically from a uniform distribution centered at 0 with a range of ±0.25. The 400 familiar words served as the training corpus during language training, and these words plus the 300 unfamiliar words served as the training corpus during combined training. Half of the language training trials had input presented at phonology, and half had input presented at semantics. For combined training, approximately 47% of the trials had input presented at orthography, and the remainder had input presented as in language training. For trials in which an input pattern was presented at phonology or semantics, the net input for each input unit was set such that the input pattern would gradually accrue over the first few iterations (ticks) of processing (soft-clamping). For trials in which an input pattern was presented at orthography, the input units were instantaneously set to their values according to the input pattern (hard-clamping). Orthographic inputs were hard-clamped because there were no recurrent connections back to orthography.


118

Each learning trial proceeded as follows. Activation values of all units in the network were initialized to 0.2. The input pattern was presented, and noise was applied to each input unit by taking the absolute value of a sample from a gaussian probability distribution centered at zero with a standard deviation of 0.05, and moving the activation value of the input unit away from its target value (0 or 1) by the sampled amount. The resulting amount of noise remained fixed throughout the processing of a given word. After the input was presented, activation propagated through the network for 35 ticks according to the equations given above. For the last 10 ticks, target patterns were presented at semantics and phonology (in cases where input was soft-clamped at semantics or phonology, the target and input patterns were identical), and cross-entropy error was applied (Hinton, 1989)13 . Targets were presented as 0.1 and 0.9 (with no penalty for being more extreme than the target value), rather than 0 and 1, to discourage weights from growing too large. After 35 ticks, error was backpropagated through time continuously to the beginning of the trial, and weight derivatives were accumulated and stored. Weight changes were made after every 50 trials with a learning rate of 0.001 and a momentum of 0.9. A small amount of weight decay (0.0001) was added to the weight derivatives to further discourage weights from growing too large. Finally, gain on the net input to units was fixed at 1.0 throughout learning. General network testing procedure. Network performance was tested only with orthographic input and phonological output (due to the current focus on word naming). For each test trial, an input pattern was hard-clamped to orthography, and activation began to propagate through the network. At each tick of processing, the mean stress14 of the activation pattern at phonology was computed. When the mean stress exceeded 0.93 (or activation had propagated for 50 ticks), naming latency was recorded in number of ticks and the network’s response was categorized (explained below), based on the current pattern of activation at phonology. Activation continued to propagate until the mean stress at phonology exceeded 0.99 or 50 ticks had expired. The number of ticks between the time to reach 0.93 and 0.99 (or the maximum number of ticks) corresponded to the duration of the network’s naming response. This is clearly an over-simplification of the processes that drive articulation, but it captures two aspects of the phenomena that we believe are important for accounting for duration effects: 1) processes involved in the mappings between orthography, phonology, and semantics support and affect the entire trajectory of articulation, and 2) the mechanism of control over response initiation (input gain) has necessary consequences for response duration. We are working to develop these ideas in more detail elsewhere (Kello, MacWhinney, & Plaut, in preparation). The process for categorizing each network response was as follows. For each consonant slot, the normalized dot product (NDP) was computed between the output pattern of activation and all 15 target consonant patterns, and likewise for the output pattern in the vowel slot and the 7 target vowel patterns. The consonant or vowel that most closely matched each output slot was taken, and if one or more of the 3 phoneme outputs (CVC) did not have an NDP exceeding 0.92 with its closest match, the response was labeled as an articulatory error. If the trial was an exception word, then the response was labeled a regularization if the closest matching phonological vowel equaled the orthographic input vowel and the consonants matched their targets. If the phonological CVC string output by the network equaled a word other than the correct one, it was labeled a word error. Otherwise, if the response did not match the target CVC string, it was labeled a nonword error. If the response matched the target, it was labeled as correct.

13

For unfamiliar words, targets were presented at phonology only.

14

Stress is a measure of how binary a given activation value is. This is a valid measure of when the network has settled to a stable response since the target patterns were binary. The stress of the activation value s is stress = 1 + s log( s) + (1 − s ) log(1 − s) − log(0. 5)


119

The testing procedure as described thus far applies to both the standard naming task and the tempo naming task. For standard naming, the only other procedural detail to mention is that the gain parameter was fixed at 1.0. The testing procedures for tempo naming and stimulus blocking required additional mechanisms, which we explain next. Network testing procedure for tempo naming. The tempo naming task requires subjects to respond at a particular moment in time to the best of their ability. In the current study, we set the tempos to be as fast or faster than baseline latencies in the standard naming task. Increasing the level of gain on the net input to some set of units should cause those units to settle more quickly (i.e., become binary). Therefore, the basic mechanism we used to simulate tempo naming was to increase gain at different points in time relative to the network’s mean latency to stimuli in the standard naming task. There are at least four parameters that must be specified to instantiate this mechanism of control over response initiation: 1) the units over which to raise the level of gain, 2) timing of the rise in gain, 3) the baseline level of gain, and 4) the amount of rise in gain. Our choices for each parameter are explained next. The hypothesis of coarse strategic control implies that gain should be manipulated uniformly across the network. In fact, coarse control may be necessary to account for the finding from Experiments 1 and 2 that the proportion of regularization errors decreases as a function of tempo. As hypothesized earlier, an increase in the influence of semantic processing with increased tempos may be necessary to reduce the influence of low-level spelling-to-sound correspondences. These correspondences are presumably responsible for regularization errors. By increasing gain across all units in the network, the rate of information accrual at semantics is scaled proportionally with information accrual elsewhere in the system. However, network-wide manipulation of gain may not be crucial to account for the empirical results. The tempo naming task induces an articulatory response, so it may be that an increase in gain within phonology only is sufficient to perform the task. To adjudicate between these two alternatives, we tested the network with gain manipulated across all units, as well as within phonology only. The testing procedures were otherwise identical between the two gain manipulation conditions. Gain must be increased in a time-dependent manner to simulate the tempo naming task. If we imagine gain as implemented by some physical system, then we would expect the rise in gain to be relatively smooth and continuous (e.g., an exponentially increasing function). However, choosing such a function would introduce unnecessary complexity and degrees of freedom to the simulation. Therefore, we chose to simply increase the level of gain instantaneously at a particular point in processing, termed the gain onset time. As mentioned above, the gain onset time corresponding to the slowest tempo was set to the mean latency of the network’s naming responses to all test stimuli in the standard naming task. The tempo naming experiments contained 3 faster tempo conditions as well. To make contact with the empirical data as closely as possible, we determined these gain onset times to be proportional to 50 ms decrements based on the mean standard naming latency found in Experiment 2, converted to ticks. The rise in gain must, of course, be relative to some baseline level. To begin with, we set the baselines for each tempo condition relative to the level of gain used during standard naming (1.0). In particular, baseline gain in the slowest tempo condition was set slightly lower than the standard naming level (0.8). We started with a lower baseline because the slowest tempo requires some responses to be slower relative to their latencies in the standard naming task. This arises because the slowest tempo was set to the mean latency across all stimuli in the standard naming task. Therefore, response latencies to the faster stimuli (e.g., HFE words) must increase to be accurately timed with the slowest tempo. Furthermore, we raised the baseline in small increments (0.2) for each successively faster level of tempo. This captured the notion that gain cannot be increased instantaneously in reality; therefore, the baseline must be higher in order to achieve a particular increase earlier in time. Finally, we needed to determine the amount that gain should be raised for each tempo condition. We chose to a level for the fastest tempo condition that produced error rates comparable


120

with those from Experiment 2 (2.4). We also decreased the amount of rise in gain by 0.2 for each successively slower level of tempo to be consistent with the variation in baseline gain (see previous). Network testing procedure for stimulus blocking. Lupker et al. (1997a) and Jared (1997) proposed to account for stimulus blocking effects by invoking a time criterion of response initiation that would vary as a function of stimulus difficulty. We adopted the basic logic of their account, except we proposed to vary the baseline level of gain rather than a time criterion. We simulated effects of stimulus blocking by simply running the network on all test stimuli with a range of baseline levels of gain. We equated the lower end of the range with blocks of relatively difficult stimuli, and the upper end with blocks of relatively easy stimuli. We matched particular levels of gain with particular blocking conditions via the mean latencies of those conditions, and we used these levels of gain to predict latencies in other blocking conditions (as Lupker et al. did). Further details are provided in the Stimulus Blocking Results section. Simulation of Experiment 2 As stated above, the primary goal of developing the current network was to simulate tempo naming. We chose to focus on Experiment 2 because it produced the four major findings from Experiments 1 through 3. For all network results, incorrect responses were excluded from the latency and duration analyses. All statistics are reported with items as the random factor because only one “subject” network was tested due to the time required for training. Standard Naming. To provide a comparison with the standard naming results from Experiment 2, naming latencies and error rates were analyzed for the network’s responses to HFE, LFE, and LFC words, and pseudowords. Table 12 presents the mean latencies and error rates of the network’s responses to each stimulus type. For latency, the main effect of stimulus type was significant, Fi(3,1072) = 28, p < .001. Planned comparisons showed significant effects of frequency (Fi(1,142) = 6.9, p < .05) and lexicality (Fi(1,1074) = 82, p < .001), but not consistency (Fi(1,223) < 1). For error rate, the main effect of stimulus type was significant as well, F i(3,1296) = 15.3, p < .001. Planned comparisons indicated that all three pairwise effects of interest were significant [frequency: Fi(1,158) = 4.5, p < .05; lexicality: Fi(1,1298) = 39, p < .001; consistency: Fi(1,238) = 16.6, p < .001]. The relative difference in mean latencies and error rates as a function of stimulus type match closely with those found in the standard naming portion of Experiment 2, as do the pattern of significance tests. Table 12. Mean latencies and error rates (with standard errors) of the network's standard naming responses, as a function of stimulus type. Latency Hi Freq Exc Lo Freq Exc Lo Freq Con Pseudoword

3.49 3.81 3.67 4.31

(0.07) (0.09) (0.06) (0.04)

Frequency Effect Consistency Effect Lexicality Effect

0.32 * 0.14 0.65 ***

Error Rate 5.0 15.0 1.9 21.0

(1.3) (3.6) (1.1) (1.4)

10.0 * 13.1 *** 13.7 ***

Tempo Naming. As explained above, network performance in simulating tempo naming was evaluated with gain manipulated across all units in the network, or within the units at phonology. Results from the naming latency, naming duration, and error rate analyses are reported for the network-wide manipulation condition because these results were essentially equivalent with the phonology-only manipulation. Analyses of the proportions of error types as a function of tempo


121

are reported for both gain conditions because these were the empirical data that, in part, motivated the issue, and because the two conditions produced different results for these analyses. Figure 15 shows the latency, duration, and error rate results broken down by stimulus type and gain onset time. With latency as the dependent measure, there were main effects of stimulus type (Fi(3,1223) = 19.3, p < .001) and tempo (Fi(3,3087) = 24,009, p < .001), but no interaction (Fi(9,3087) < 1). With error rate as the dependent measure, there also were main effects of stimulus type (Fi(3,1296) = 11.8, p < .001) and tempo (Fi(3,3888) = 54, p < .001), and no significant interaction (Fi(9,3888) = 1.4, p > .1). With naming duration as the dependent measure, both main effects as well as the interaction were significant (stimulus type: Fi(3,1288) = 56, p < .001; tempo: Fi(3,3415) = 15.1, p < .001; stimulus type by tempo: Fi(9,3415) = 1.9, p < .05). The qualitative pattern of simulation results for latency and error rate closely matched the empirical data, so we did not conduct further statistics to test the pairwise comparisons. The main effect of tempo on naming duration produced by the network also matched the empirical data, but there was a discrepancy for stimulus type. Naming duration mirrored naming latency as a function of stimulus type in the simulation, but the experiments showed no effect of stimulus type on duration. We believe that this shortcoming is due to our over-simplification of the processes linking cognition to articulation (see Kello et al., in preparation). However, the correct pattern of durations as a function of tempo is evidence that gain potentially has the right qualities for modeling the duration results. As explained above, the pattern of error types as a function of tempo provides a test of two alternate ways to manipulate input gain: across all units in the network (as we have reported thus far), or within phonology only. The experimental results showed that regularization errors remained constant, and we reasoned that the network-wide gain manipulation would account for this result, whereas the phonology-only manipulation would not. Table 12 presents frequency counts of errors produced by the network as a function of tempo and error type, for both types of gain manipulation. The table contains errors counted among all stimulus types, as well as errors counted among only the word stimuli. When gain was manipulated across the whole network, the frequency counts generally replicate the relative pattern of errors found in the empirical results. The chi-square test for independence was significant with all stimuli included (Χ2(9) = 17.2, p < .05), as well as with only words (Χ2(9) = 26, p < .01). The Mantael-Haenzel test for trend was only marginally significant with word stimuli (all stimuli: M-H(1) < 1; words: M-H(1) = 3.1, p < .1). When gain was manipulated over phonology only, regularization errors increased with tempo, while word errors remain constant (all stimuli: Χ2(9) = 19.6, p < .05; M-H(1) = 11.4, p < .001; words only: Χ2(9) = 5.0, p > 0.2; M-H(1) = 1.1, p > .2). Although the statistical tests with word stimuli did not reach significance, the same pattern of results held as with all stimuli included. One discrepancy between empirical and model results is that overall, the model produced a smaller proportion of word errors than subjects did. We plan to address this concern in future simulations by training on a larger, more realistic corpus (Kello & Plaut, in preparation), but we argue that it does not compromise the current simulation because the occurrence of word errors do, in fact, increase with increasing tempos, while regularizations remain constant. Therefore, results from the simulation of tempo naming argue for a network-wide manipulation of gain, and therefore in favor of coarse strategic control.


122

Std

B-0

B-50

B-100

B-150

22 20 18 16 14 12 10 Naming Latency (ticks) Hi Freq Exc

Lo Freq Exc

Lo Freq Reg

Pseudo

Lo Freq Exc

Lo Freq Reg

Pseudo

Lo Freq Reg

Pseudo

0.4 0.3 0.2 0.1Rate Error 0 Hi Freq Exc 9 8 7 6 5 4 3 2 1 0 Naming Duration (ticks) Hi Freq Exc

Lo Freq Exc

Figure 15. Mean latencies, error rates (proportion of errors), and naming durations produced by the simulation of tempo naming, as a function of stimulus type and gain onset (in ticks).

Stimulus Blocking. The tempo naming task was devised, in part, to investigate Lupker et al.’s (1997a) time criterion hypothesis of response initiation, and the gain hypothesis was offered as an alternative to this hypothesis. The analyses above showed that a network-wide manipulation of gain accounts for the tempo naming data in the current network, but the hypothesis would be further supported if it also explained Lupker et al.’s stimulus blocking results. As discussed above, we


123

adopted Lupker et al.’s interpretation of blocking effects as due to differences in the average difficulty of stimuli between blocks, except that we manipulated the level of baseline gain instead of a time criterion of response initiation. One advantage of manipulating the level of gain is that it allows for stimulus processing effects on latency without any additional mechanisms (recall that a pure time criterion cannot, by definition, produce latency effects). Furthermore, as shown below, gain manipulation accounts for all of the basic patterns of results that Lupker et al. found, including one effect that Lupker and his colleagues could not explain with a time criterion alone.

Table 12. Frequency counts of errors made by the network as a function of gain onset and error type. System-Wide

Phonology-Only All Stimuli

Word

0

-50

10 (6.4)

10 (5.9)

-100

-150

Total

13 39 (6.2) (11.5)

0

-50

-100

-150

Total

72 (8.2)

62 54 60 66 (34.8) (31.4) (32.1) (29.1)

242 (31.7)

Regularization

24 19 25 (15.4) (11.2) (12.0)

24 (7.1)

92 (10.5)

12 19 28 30 (6.7) (11.1) (15.0) (13.2)

89 (11.7)

Nonword

85 86 109 178 (54.5) (50.9) (52.2) (52.4)

458 (52.4)

95 89 84 101 (53.4) (51.7) (44.9) (44.5)

369 (48.3)

Articulatory

37 54 62 99 (23.7) (32.0) (29.7) (29.1)

252 (28.8)

9 (5.1)

10 (5.8)

15 30 (8.0) (13.2)

874

227

187

172

Total

156

169

209

340

64 (8.4)

178

764

Words Only Word

2 (6.9)

Regularization

1 4 13 (4.4) (10.5) (20.6)

20 (13.1)

3 3 4 5 (17.7) (11.1) (10.0) (10.2)

15 (11.3)

24 19 25 24 (82.8) (82.6) (65.8) (38.1)

92 (60.1)

12 19 28 30 (70.6) (70.4) (70.0) (61.2)

89 (66.9)

2 4 4 8 (11.8) (14.8) (10.0) (16.3)

18 (13.5)

Nonword

2 (6.9)

2 7 14 (8.7) (18.4) (22.2)

25 (16.3)

Articulatory

1 (3.5)

1 (4.4)

16 (10.5)

0 0.0

29

23

153

17

Total

2 12 (5.3) (19.1) 38

63

1 4 6 (3.7) (10.0) (12.2) 27

40

49

11 (8.3) 133

Note: counts for the simulation with gain varied across the entire network are shown in bold and italics. Counts for gain varied only over phonology are shown in plain text.

The latency effects from Lupker et al.’s (1997a) study that we wanted to account for were the following: 10.

HFE, HFC, or LFC words alone < mixed with pseudowords

Strategic Control in Naming 11.

LFE words alone > mixed with pseudowords

12.

Pseudowords alone > mixed with HFE, HFC, or LFC words

13.

Pseudowords alone < mixed with LFE words

14.

HFE words alone < mixed with LFE words

124

15. LFE words alone > mixed with HFE words Lupker et al. used mean latencies in the pure blocks as a measure of the difficulty of each stimulus type. More difficult stimuli induce a late time criterion to allow sufficient time for processing, and less difficult stimuli induce an early time criterion because subjects wish to respond as quickly as possible without a large penalty in accuracy. When stimuli of different levels of difficulty are mixed within the same block, response latencies should reflect a time criterion higher than the one set for a pure block of the easy stimuli, but lower than the one set for a pure block of the difficult stimuli (i.e., an average of the two pure block time criteria). Lupker and his colleagues were largely successful in accounting for the results from their study, except that they failed to account for effects 2 and 4 listed above: responses to LFE words alone were slower than those made to LFE words mixed with pseudowords, and correspondingly, responses to pseudowords, when presented alone, were faster than responses to pseudowords mixed with LFE words. These findings were problematic because the mean latency of responses to LFE words when presented alone was equal to that for pseudowords alone, and therefore the time criteria in these two blocks should have been roughly equal. According to their logic, mixing these two types of stimuli should have produced no systematic change in latencies relative to the pure blocks. The authors suggested an additional mechanism in the form of a lexical checking strategy to account for these effects. In the model, we used pure block latencies to determine the level of baseline gain for each stimulus type, analogous to the logic behind changes in time criteria. However, naming latencies produced under given baseline level of gain are modulated by stimulus type (unlike a time criterion). Therefore, different pure block latencies can correspond to equal baseline gains, and likewise, equal pure block latencies can correspond to different baseline gains. Figure 16 graphs mean latencies as a function of stimulus type and baseline gain. We used the ratio of mean latencies from the pure blocks in Experiments 1 and 2 in Lupker et al.’s study to determine the baseline gain values for each stimulus type. The only degree of freedom remaining when the ratios are used is the number of ticks (and therefore baseline gain) for one of the stimulus types (i.e., without this, the ratio scale could move within number of ticks). If a gain of 1.0 corresponds to normal, unspeeded naming, then the speeded naming task should generally induce faster baseline gains. We set the slowest stimulus condition (LFE words) to a baseline above 1.0 (i.e., 1.3) to ensure that all gains computed would be above 1.0. Starting with the baseline gain for LFE words and Lupker et al.’s ratios of mean latencies for pure blocks, we calculated the model’s baseline gain values for each stimulus type. These were 1.30, 1.51, 1.60, 1.62, and 1.70 for LFE words, pseudowords, HFE, LFC, and HFC words, respectively (illustrated in Figure 16). If we assume, as Lupker et al. did with a time criterion, that mixed blocks have a baseline gain equal to the mean of the pure block baseline gains, then the rank order of gains for each stimulus type accounts for the rank order effects listed above. For example, HFE, HFC, and LFC words all have greater baseline gains than pseudowords. Therefore, mixing any of these stimuli with pseudowords will reduce the baseline gain relative to the words, but increase it relative to the pseudowords. This will cause slower word latencies (effect 1) and faster pseudoword latencies (effect 3) in the mixed condition, relative to the pure block condition, because increases in baseline gain above 1.0 cause progressively faster latencies and greater error rates (i.e., speed/accuracy tradeoff). The same logic applies for the other 4 effects. Manipulation of baseline gain is able to account for effects 2 and 4 because even though LFE words are slightly slower than pseudowords in pure blocks, pseudowords are processed more slowly by


125

the network. Therefore, to obtain comparable latencies, the baseline gain must have been greater for pseudowords than LFE words in the pure blocks. Hi Freq Con Lo Freq Exc

Hi Freq Exc Pseudo

Lo Freq Con

24 22 20 18 16 14

Naming 12 Latency (ticks)

LFE

Pseudo

HFE, LFC

HFC

10 1.0

1.1

1.2

1.3

1.4

1.5

1.6

1.7

1.8

Baseline Gain Figure 16. Mean latencies of the network as a function of stimulus type and baseline gain. Dashed lines illustrate how baseline gains for each stimulus type were estimated from mean latencies in the pure block conditions reported by Lupker et al. (1997).

General Discussion The current study began with two related questions: can readers strategically control the timing of response initiation in word naming, and if so, can we use this ability to examine the early contents of phonological processing in word reading? We devised the tempo naming task as a method to address these questions, and the results led us to draw a set of empirical and theoretical conclusions. We supported the theoretical conclusions with a connectionist simulation of word reading and, in particular, tempo naming. A summary of the empirical results is as follows: 〈

〈

Subjects have a high degree of control over the timing of response initiation, but unlike a pure time criterion, timing is subject to stimulus factors such as frequency, consistency, and lexicality. Subjects can respond correctly to word stimuli nearly 100 ms faster than their standard speeded naming baseline, and the data suggest that they could be induced to


126

respond even faster. Moreover, the mean latency of correct responses to pseudowords in Experiment 3 was approximately 130 ms faster than baseline. 〈

Subjects timed their responses based on the onset of voicing, even though numeric feedback on their timing instructed them to use the onset of acoustic energy (i.e., either voiced or unvoiced).

〈

As subjects were induced to initiate their responses more quickly, response durations decreased and error rates increased.

〈

As error rates increased with increasing tempos, the number of regularization errors remained constant while the occurrence of other errors increased. These results, in combination with our theoretical perspective and results from past research, led us to propose the hypothesis of coarse strategic control. In its most general form, coarse strategic control states that strategic shifts in processing or behavior can be accomplished through quantitative changes in a single (or very few) parameter. Moreover, the parameter(s) affects processing at individual levels of representation in a nonspecific way. The hypothesis is analogous to the concept of a control parameter in dynamic systems theory (Abraham & Shaw, 1985) in that a global, quantitative change in a single parameter can have complex, qualitative effects on the behavior of individual components in a system. For the current study, we restricted the hypothesis in scope to the issue of response initiation in word naming. We developed a distributed, connectionist model of word reading with coarse strategic control and the tempo naming task in mind. We implemented control of response initiation by manipulating the timing and level of gain on the net input to processing units across the entire network. The model successfully accounted for the pattern of results from Experiment 2, which we treated as representative of the results from all three experiments in the current study. In the standard naming portion of the experiment, the results accounted for included the pattern of latencies and error rates as a function of stimulus type (i.e., frequency, consistency, and lexicality effects). Results from the tempo naming task included the pattern of naming latencies, durations, and error rates as a function of stimulus type and tempo, as well as the frequency counts of different error types as a function of tempo. Furthermore, the model successfully accounted for the set of stimulus blocking results that had partly motivated the current study. The manipulation of gain in our network provided a fuller account of the results from the Lupker et al. (1997a) study than the time criterion hypothesis, as put forth by the authors. We interpret the success of the model as evidence for system-wide modulation of gain as the mechanism of strategic control over response initiation in word naming. The current study raises a number of issues, which we address in the final two sections. Coarse Strategic Control Other examples of strategic effects in word reading. In addition to the Lupker et al. (1997a) and Jared (1997) studies, other studies in word reading have focused on strategic effects, and these are relevant to the hypothesis of gain manipulation. Perhaps most notably, the route emphasis hypothesis has drawn support from studies in which the composition of stimuli is manipulated to bias lexical versus rule-like processing, and associative priming is examined as a function of stimulus composition (Baluch & Besner, 1991; Colombo & Tabossi, 1992; Tabossi & Laghi, 1992). The magnitude of associative priming served as an index of emphasis on lexical processing in these studies, and the results all showed greater associative priming when some portion of the stimuli contained exceptional spelling-to-sound correspondences. The authors interpreted these


127

results as evidence that a relatively higher proportion of exception words induced emphasis on lexical route processing (and/or de-emphasis of the rule route). Manipulation of gain may account for these results in the same way as it accounted for the Lupker et al. findings: variation in the overall difficulty of stimuli in a given block corresponds to variation in the baseline level of gain, and baseline gain would modulate the magnitude of associative priming. The latter assertion is not obviously true, but one might expect variation in gain to modulate associative priming because all stimulus effects (i.e., presumably including associative priming) on naming latency become attenuated with high baseline gains (refer to Figure 16). In addition, variation in gain can have differential effects on the influence of semantics versus low-level spelling to sound correspondences in processing, as indicated by the network’s pattern of error proportions as a function of gain manipulation. The issue is further complicated by the fact that these studies were conducted in languages with varying degrees of regularity between orthography and phonology (i.e., Italian has a deeper orthography with regards to stress, and Persian is a mix of deep and shallow orthographies). The effect of gain manipulation interacts with the dynamics of processing in the word reading system. If the dynamics of processing in a model that learned to read in these languages differed from the dynamics of a model that learned English, then one would expect different effects of variation in gain. Further simulations are necessary to resolve these issues. Other examples of coarse strategic control. We restricted the scope of coarse strategic control to response initiation in word naming in the current study. However, as mentioned earlier, coarse strategic control could potentially be applied to other domains of response initiation, and more broadly, to other domains of strategic control. The core tenet of the hypothesis is that strategic control is accomplished through quantitative changes in a small number of system-wide, nonspecific parameters (analogous to control parameters in dynamic systems theory; Abraham & Shaw, 1985). Researchers have invoked control parameters such as muscle stiffness and cyclic frequency in models of movement coordination (Kelso, 1997; see also Thelen & Smith, 1994), but to our knowledge, gain manipulation in the current study is the only example of coarse strategic control applied to a cognitive parameter15 . Cohen & Servan-Schreiber (1992) used gain on the net input to connectionist processing units as a mechanism of control over the influence of context on behavior (i.e., whether the task is color naming or word reading in the Stroop paradigm). However, their usage of gain is not an example of coarse strategic control because it was applied to subset of the units in their network (i.e., the task demand units). We believe that the hypothesis of coarse strategic control will prove useful in generating future research on issues of word reading and response generation, as well as in other domains of cognition. A general model of word reading? There are a number of more or less mature models of word reading that currently exist in the literature (e.g., Coltheart et al., 1993; Plaut et al., 1996; Kawamoto, 1993; Seidenberg & McClelland, 1989; Van Orden, Bosman, Goldinger, & Farrar, 1997), yet we chose to develop a connectionist network that differs in its architecture from these extant models. All previous models contain a pathway from orthography to phonology in which processing is, at most, indirectly influenced by semantic structure or processing. The phonological mediation hypothesis as implemented by Van Orden and his colleagues provides the strongest example because, on this hypothesis, the strong correlation between orthographic and phonological structure (relative to that between orthography and semantics) dictates that orthography will map to phonology before semantic activation is strong enough to influence processing. The same reasoning holds true for other connectionist models of word reading that are examples of the triangle framework (Kawamoto, 1993; Plaut et al., 1996; Seidenberg & McClelland, 1989). Similar to these, the dual-route 15

The apparent lack of examples is not surprising, given that strategic control is often factored out of experiments in cognitive psychology as much as possible.


128

framework also contains a pathway from orthography to phonology that is relatively independent of semantic processing; the rule route explicitly stores spelling-to-sound correspondence rules that are devoid of semantic structure. By contrast, the model proposed in the current study required that the mapping from orthography to phonology be mediated by representations learned at the hidden language group. The hidden language group mediated the mapping between phonology and semantics during the language phase of training, so these hidden representations were shaped by the structure of inputs at phonology and semantics. Orthography mapped through the hidden reading units, which were connected only to the hidden language units, and not to phonology or semantics. This architecture is reasonable if one considers that learning to read occurs after a child has gained a fair amount of proficiency at understanding and producing language (usually spoken). This time course in learning, combined with the fact that both skills are linguistic in nature, suggests that reading skills are built upon more general language skills. However, to claim that the processing involved in pronouncing a string of letters must be mediated by representations partly shaped by semantics is not absolutely required by the argument presented here. In fact, an additional factor motivated our choice of architecture. One clear aspect of the results from Experiments 1 and 2 was that the proportion of regularizations decreased as tempo increased. This was surprising because the DRC model (Coltheart et al., 1993) and Kawamoto's (1993) distributed model show an equal or increased proportion of regularization errors at early points in the time course of processing. We reasoned that an increase in the influence of semantics as naming responses are pressured to be faster is necessary to account for the effect of tempo on error proportions in the experiments. Preliminary simulations of the triangle framework suggested that the presence of a relatively direct pathway from orthography to phonology causes an overly strong influence of the low-level spelling-to-sound correspondences on the output at phonology (Kello & Plaut, in preparation). To support this claim, the models we have tested thus far that contain a relatively direct orthography-to-phonology pathway have exhibited a constant or increasing proportion of regularizations with earlier rises in gain. However, we do not wish to make any strong claims about the architecture of the processes underlying word reading in the current study because the focus here is on strategic control of response initiation and execution. Therefore, we are investigating the architectural issue separately (Kello & Plaut, in preparation). Nonetheless, our claim is strong and potentially contentious, so it warrants at least a brief discussion here. We outline how the current model might account for three phenomena in word reading that seem particularly problematic. Acquired surface and phonological dyslexia. Dual-route theories have, in the past, received considerable support from certain patterns of acquired dyslexia, particularly surface and phonological dyslexia. Surface dyslexic patients (Patterson, Marshall, & Coltheart, 1985) are impaired at reading exception words but show relatively preserved reading of regular words and pseudowords, whereas phonological dyslexic patients (Beauvois & Derouesné, 1979; Coltheart, 1996) are impaired at reading pseudowords but show relatively preserved reading of both regular and exception words. It is commonly assumed (see, e.g., Coltheart, 1985) that accounting for these patterns of performance requires an anatomic separation of lexical/semantic and sublexical processing, such that the former is selectively impaired in surface dyslexia and the latter is selectively impaired in phonological dyslexia. It might therefore seem difficult to account for both patterns within the current modeling framework as there is only a single pathway from orthography to phonology. Although additional modeling work is needed to address this concern fully, we would point out that Patterson and colleagues (Patterson, Graham, & Hodges, 1994; Patterson & Hodges, 1992, Patterson & Marcel, 1992; Patterson, Suzuki, & Wydell, 1996) have proposed and provided empirical support for an account of both surface and phonological dyslexia that involves different locations of damage within the single pathway between semantics and phonology. On this account, surface dyslexia arises from damage to semantics (or between semantics and phonology) because correctly pronouncing an exception word requires interactive support from semantics to override the


129

strong consistency in the orthography-phonology mapping. By contrast, phonological dyslexia arises from damage directly to phonology because words but not nonwords benefit from interactive support from semantics. Although Patterson and colleagues have cast their account in terms of a model with a distinct pathway between orthography and phonologythe triangle model (Plaut et al., 1996; Seidenberg & McClelland, 1989)the account holds equally well within the current framework. Masked phonological priming. Lukatela and his colleagues (Lukatela & Turvey, 1994a, 1994b; Lukatela, Lukatela, & Turvey, 1993; see also Perfetti & Bell, 1991) have gathered a large body of evidence for a fast and primary role of phonological codes in visual word recognition. The majority of their studies examine the effect of masked primes on target word responses when those primes are phonologically related to the target, versus other-related and unrelated primes. The general finding is that phonologically similar homophones and pseudohomophones prime their targets at very short (less than 70 ms) prime durations, sometimes even more than identity primes (Lukatela, Savic, Urosevic, & Turvey, 1997). The lack of a direct pathway between orthography and phonology may seem contrary to the apparent primacy of phonological codes. However, there are three reasons to suspect that this is not the case. First, reading was mediated by internal representations shaped by both semantic and phonological structure, so orthography is, in fact, mediated by phonological structure to some degree. Second, when gain was manipulated over phonology only (to contrast with coarse strategic control), the network produced more regularization errors as gain was increased, whereas word errors remained roughly constant. If increased gain over phonology is analogous to a pure time criterion, then this effect suggests that phonological structure is prominent during the early moments of normal processing in the network. This follows from the assumptions that regularizations reflect the low-level spelling-to-sound correspondences, and that increased gain at phonology “saturates” whatever response was active at that point in time. Therefore, the model’s predictions should be in line with evidence for early activation of phonology in word reading. Hyperlexia. Hyperlexia is a developmental, cognitive disorder that is marked by impaired language comprehension and production (Glosser, Friedman, & Roeltgen, 1996), with a relatively spared ability to read aloud words and nonwords (even though understanding of those words may be severely limited). On the face of it, the current model does not seem able to account for hyperlexic reading because the mapping from orthography to phonology is mediated by hidden representations shaped by both phonology and semantics. However, if language development is impaired, as is the case in hyperlexia, then one would expect the internal representations supporting communication to be malformed. In particular, if semantic processes are severely impaired while phonological processes are relatively spared, then one would expect the hidden language representations to be almost completely structured by phonology (since semantics would not exert any coherent pressure). Hyperlexic children do, in fact, tend to show severe semantic impairment with preserved phonological skills (Glosser, Grugan, & Friedman, 1997). The accounts given for the above three phenomena are only conjecture at this point. Further simulation work is clearly necessary to support them, but we believe that these brief outlines showed how the currently proposed architecture could, in principle, provide the foundation for a general theory of word reading. Conclusions In this study, we used the tempo naming task to investigate the hypothesis that subjects can strategically manipulate a time criterion to initiate a naming response. We interpreted the results from three experiments with tempo naming as inconsistent with the time criterion hypothesis and current models of word reading. We proposed the hypothesis of coarse strategic control, in which cognitively coarse parameters serve as the levers of strategic control over response initiation in word reading. We embodied this hypothesis in the global manipulation of gain on the net input to units in a connectionist model of word reading. The model accounted for the tempo naming results, as well


130

as the main results from the stimulus blocking study by Lupker et al. (1997a). In developing this model, we found it necessary to constrain the mapping from orthography to phonology to be mediated by learned, internal representations that were shaped by the mapping between phonology and semantics. We interpreted this architecture as arising from the time course of language and reading acquisition, but further work is necessary to investigate its implications and viability as a general model of word reading.


131

References Balota, D. A., Boland, J. E., & Shields, L. W. (1989). Priming in pronunciation: Beyond pattern recognition and onset latency. Journal of Memory and Language, 28, 14-36. Baluch, B., Besner, D. (1992). Visual word recognition: Evidence for strategic control of lexical and nonlexical routines in oral reading. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 644-652. Baars, B. J., Motley, M. T., MacKay, D. G. (1975). Output editing for lexical status in artificially elicited slips of the tongue. Journal of Verbal Learning & Verbal Behavior, 14, 382-391. Besner, D., Twilley, L., McCann, R. S., & Seergobin, K. (1990). On the association between connectionism and data: Are a few words necessary? Psychological Review, 97, 432-446. Bock, K. (1996). Language production: Methods and methodologies. Psychonomic Bulletin & Review, 3, 395-421. Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York: Harper and Row. Cohen, J. D.; Servan-Schreiber, D. (1992). Context, cortex, and dopamine: A connectionist approach to behavior and biology in schizophrenia. Psychological Review, 99, 45-77. Colombo, L., & Tabossi, P. (1992). Strategies and stress assignment: Evidence from a shallow orthography. In R. Frost & L. Katz (Eds.), Orthography, phonology, morphology, and meaning (pp. 319-399). Amsterdam: North-Holland. Coltheart, M. (1985). Cognitive neuropsychology and the study of reading. In M. I. Posner and O. S. M. Marin (Eds.), Attention and Performance XI (pages 3-37). Hillsdale, NJ: Erlbaum. Coltheart, M. (Ed.) (1996). Special issue on Phonological Dyslexia. Cognitive Neuropsychology, 13, 749-934. Coltheart, M., & Rastle, K. (1994). Serial processing in reading aloud: Evidence for dualroute models of reading. Journal of Experimental Psychology: Human Perception and Performance, 6, 1197-1211. Dell, G. S., & Reich, P. A. (1981). Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning & Verbal Behavior, 20, 611-629. Dell, G. S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283-321. Dell, G. S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory & Language, 27, 124-142. Forster, K. I., & Chambers, S. M. (1973). Lexical access and naming time. Journal of Verbal Learning and Verbal Behavior, 12, 627-635. Forster, K. I., & Davis, C. (1991). The density constraint on form-priming in the naming task: Interference effects from a masked prime. Journal of Memory and Language, 30, 1-25. Frederiksen, J. R., & Kroll, J. F. (1976). Spelling and sound: Approaches to the internal lexicon. Journal of Experimental Psychology: Human Perception and Performance, 2, 361-379.


132

Garrett, M. F. (1976). Syntactic processes in sentence production. In R. J. Wales & E. Walker (Eds.), New approaches to language mechanisms. Amsterdam: North-Holland. Glushko, R. J. (1979). The organization and activation of orthographic knowledge in reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 5, 674-691. Herdman, C. M. (1992). Attentional resource demands of visual word recognition in naming and lexical decisions. Journal of Experimental Psychology: Human Perception & Performance, 18, 460-470. Ollman, R. T., & Billington, M. J. (1972). The deadline model for simple reaction times. Cognitive Psychology, 3, 311-336. Hinton, G. E., & Sejnowski, T. J. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition: Vol. 1. Foundations (pp. 282-317). Cambridge, MA: MIT Press. Hoequist, C. E. (1983). The perceptual center and rhythm categories. Language & Speech, 26, 367-376. Jared, D. (1997). Evidence that strategy effects in word naming reflect changes in output timing rather than changes in processing route. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23, 1424-1438. Jared, D., McRae, K., & Seidenberg, M. S. (1990). The basis of consistency effects in word naming. Journal of Memory & Language, 29, 687-715. Kawamoto, A. H. (1993). Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. Journal of Memory and Language, 32, 474-516. Kawamoto, A. H., & Kello, C. T. (in press). Effect of onset cluster complexity in speeded naming: A test of rule-based approaches. To appear in the Journal of Experimental Psychology: General. Kawamoto, A. H., Kello, C. T., Jones, R., & Bame, K. (1998). Initial phoneme versus whole-word criterion to initiate pronunciation: Evidence based on response latency and initial phoneme duration. Journal of Experimental Psychology: Learning, Memory, & Cognition, 24, 862-885. Kawamoto, A. H., Kello, C. T., Higareda, I., & Vu, J. (in press). Effect of frequency on onset durations in speeded naming. To appear in the Journal of Experimental Psychology: Learning, Memory, & Cognition. Kawamoto, A. H., & Kitzis, S. N. (1991). Time course of regular and irregular pronunciations. Connection Science, 3, 207-217. Kawamoto, A. H., & Zemblidge, J. (1992). Pronunciation of homographs. Journal of Memory and Language, 31, 349-374. Kurganskii, A. V. (1994). Quantitative analysis of periodical time-patterned finger tapping. Human Physiology, 20, 117-120. Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.


133

Lemoine, H. E., Levy, B. A., Hutchinson, A. (1993). Increasing the naming speed of poor readers: Representations formed across repetitions. Journal of Experimental Child Psychology, 55, 297-328. Lupker, S. J., Brown, P., & Colombo, L. (1997a). Strategic control in a naming task: Changing routes or changing deadlines? Journal of Experimental Psychology: Learning, Memory, & Cognition, 23, 570-590. Lupker, S. J., Taylor, T. E., & Pexman, P. M. (1997b). Strategic control of a time criterion in naming: New evidence and effects. Presentation at The 38th Annual Meeting of the Psychonomic Society. Lukatela, G., & Turvey, M. T. (1994a). Visual lexical access is initially phonological: I. Evidence from associative priming by words, homophones, and pseudohomophones. Journal of Experimental Psychology: General, 123, 107-128. Lukatela, G., & Turvey, M. T. (1994b). Visual lexical access is initially phonological: 2. Evidence from phonological priming by homophones and pseudohomophones. Journal of Experimental Psychology: General, 123, 331-353. Manis, F. R. (1985). Acquisition of word identification skills in normal and disabled readers. Journal of Educational Psychology, 77, 78-90. Marcus, S. M. (1981). Acoustic determinants of perceptual center (P-center) location. Perception & Psychophysics, 30, 247-256. Marshall, J. C., & Newcombe, F. (1973). Patterns of paralexia: A psycholinguistic approach. Journal of Psycholinguistic Research, 2, 175-199. Mates, J., Radil, T., & Poppel, E. (1992). Cooperative tapping: Time control under different feedback conditions. Perception & Psychophysics, 52, 691-704. McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part I. An account of basic findings. Psychological Review, 88, 375407. Meyer, D. E., Osman, A. M., Irwin, D. E., & Kounios, J. (1988). The dynamics of cognition and action: Mental processes inferred from speed-accuracy decompostion. Psychological Review, 95, 183-237. Monsell, S., Patterson, K. E., Graham, A., Hughes, C. H., & Milroy, R. (1992). Lexical and sublexical translation of spelling to sound: Strategic anticipation of lexical status. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 452-467. Nowlan, S. J. (1988). Gain variation in recurrent error propagation networks. Complex Systems, 2, 305-320. Ollman, R. T., & Billington, M. J. (1972). The deadline model for simple reaction times. Cognitive Psychology, 3, 311-336. Paap, K. R., Noel, R. W. (1991). Dual-route models of print to sound: Still a good horse race. Psychological Research, 53, 13-24.


134

Patterson, K., & Behrmann, M. (1997). Frequency and consistency effects in a pure surface dyslexic patient. Journal of Experimental Psychology: Human Perception & Performance, 23, 1217-1231. Patterson, K., Graham, N., & Hodges, J. R. (1994). The impact of semantic memory loss on phonological representations. Journal of Cognitive Neuroscience, 6, 57-69. Patterson, K., & Hodges, J. R. (1992). Deterioration of word meaning: Implications for reading. Neuropsychologia, 30, 1025-1040. Patterson, K., & Marcel, A. (1992). Phonological alexia or phonological alexia? In J. Alegria, D. Holender, J. Junca de Morais, & M. Radeau (Eds.), Analytic approaches to human cognition (pp. 259-274). Amsterdam, Netherlands: North-Holland. Patterson, K., Marshall, J. C., & Coltheart, M. (1985). Surface dyslexia. London: Lawrence Earlbaum. Patterson, K., Suzuki, T., & Wydell, T. N. (1996). Interpreting a case of Japanese phonological alexia: The key is in phonology. Cognitive Neuropsychology, 13, 803-22. Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536. Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60-94. Ruthruff, E. (1996). A test of the deadline model for speed-accuracy tradeoffs. Perception & Psychophysics, 58, 56-64. Schwartz, R. G., & Goffman, L. (1995). Metrical patterns of words and production accuracy. Journal of Speech & Hearing Research, 38, 876-888. Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568. Seidenberg, M. S., Petersen, A., MacDonald, M. C., & Plaut, D. C. (1996). Pseudohomophone effects and models of word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 48-62. Seidenberg, M. S., & Plaut, D. C. (1998). Evaluating word-reading models at the item level: Matching the grain of theory and data. Psychological Science, 9, 234-237. Seidenberg, M. S., Plaut, D. C., Petersen, A. S., McClelland, J. L., & McRae, K. (1994). Nonword pronunciation and models of word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1177-1196. Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23, 383-404.


135

Sherak, R. (1982). A real-time software voice key and an application. Behavior Research Methods and Instrumentation, 14, 2, 124-127. Spieler, D. H., & Balota, D. A. (1997). Bringing computational models of word naming down to the item level. Psychological Science, 8, 411-416. Stanovich, K. E., & Bauer, D. W. (1978). Experiments on the spelling-to-sound regularity effect in word recognition. Memory & Cognition, 6, 410-415. Stanovich, K E., Pachella, R. G. (1976). The effect of stimulus probability on the speed and accuracy of naming alphanumeric stimuli. Bulletin of the Psychonomic Society, 8, 281-284. Stanovich, K. E., Siegel, L. S., Gottardo, A. (1997). Converging evidence for phonological and surface subtypes of reading disability. Journal of Educational Psychology, 89, 114-127. Sternberg, S., Wright, C., Knoll, R., & Monsell, S. (1980). Motor programs in rapid speech: Additional evidence. In R. Cole (Ed.), Perception and production of fluent speech (pp. ). Hillsdale, NJ: Erlbaum. Strayer, D. L., & Kramer, A. F. (1994). Strategies and automaticity: II. Dynamic aspects of strategy adjustment. Journal of Experimental Psychology: Learning, Memory, & Cognition, 20, 342-365. Tabossi, P., & Laghi, L. (1992). Semantic priming in the pronunciation of words in two writing systems: Italian and English. Memory & Cognition, 20, 303-313. Taraban, R., & McClelland, J. L. (1987). Conspiracy effects in word pronunciation. Journal of Memory and Language, 26, 608-631. Treisman, M., & Williams, T. C. (1984). A theory of criteria setting with an application to sequential dependencies. Psychological Review, 91, 68-111. Trueswell, J. C., Tanenhaus, M. K., Kello, C. T. (1993). Verb-specific constraints in sentence processing: Separating effects of lexical preference from garden-paths. Journal of Experimental Psychology: Learning, Memory, & Cognition, 19, 528-553. Van Orden, G. C. (1991). Phonological mediation is fundamental in reading. In D. Besner & G. Humphreys (Eds.), Basic Processes in Reading: Visual Word Recognition (pp. 77-103). Hillsdale, NJ: Erlbaum. Van Orden, G. C., & Goldinger, S. D. (1994). Interdependence of form and function in cognitive systems explains perception of printed words. Journal of Experimental Psychology: Human Perception & Performance, 20, 1269-1291. Venezky, R. L. (1970). The structure of English Orthography. The Hague: Mouton. Waters, G. S., & Seidenberg, M. S. (1985). Spelling-sound effects in reading: Time-course and decision criteria. Memory & Cognition, 13, 557-572. Yamada, N. (1995). Nature of variability in rhythmical movement. Human Movement Science, 14, 371-384. Appendix A. Stimuli used in Experiments 1 and 2.


136

Standard Naming

Tempo Naming

Hi Freq Exc Lo Freq Exc Lo Freq Con *Pseudo

Hi Freq Exc Lo Freq Exc Lo Freq Con *Pseudo

break both friend full gone have low put source want what where word are dog his hour none once there warm watch whom whose worth your

blood both break broad come cost dead death done door foot four friend full give gone good great gross have head heard key learn lost love low month most move poor put said says shall show some son source stood through touch truth two want war were what where word work would

breast bush frost flange guise hearth limb pear suede wand wad wool womb ere cache swath rouge mule wharf **chic wolf rouse vise farce swamp wan

brunt bum fret flask grope hark lisp pest swish wit wax wade wane eke crag swoon rune mend whelp shrub wilt roam vibe fluff stench wean

brish bax frope flade gret hoam lask pag swench weke wub wibe wuff ean crelp swunt roon mest whark shrane wune rilt vit flisp stend wum

*These stimuli only appeared in Experiment 2 **Note: "chic" was removed from the analyes due a high error rate

blown bush breast plaid caste comb dough deaf drought doll flood flown frost flange glove guise gong grown ghoul hearth hind hook cough leapt loath lose limb mourn mould mow pint pear swap spook shoe shove sew steak suede suave threat tomb trough tread wand wart worm wad wool womb wasp warp

bloke bum brunt plod cask coal dole desk dank dolt flake float fret flask glum grope gob groan goon hark hitch hump cuff letch lobe loom lisp munch mole moan pine pest swig spurt shame shell sole stain swish swill thrift tote trite trait wit wink wick wax wade wane wipe weep

blisp bift brote plake cet cipe dest detch dit doan flain flill frask flod glane grole gade grig gax hait hame hink coom lesk loke luff lum meep mell mick pank pite swask spole shump shunch surt stine swobe swope thrish tunt tritch troat wark woal woan wob wole wolt woon wum


137

Appendix B. Stimuli used in Experiment 3. Stnd. Naming

Tempo Naming

Stnd. Naming

Hi Freq Exc Lo Freq Exc Lo Freq Con swant wug blan blig boog brear carg cowe dait dier diz dorg dup flet flote frak gabe geave gleap grap gream grud heab hib huf leab

bloor bood breat brone cays diend dord dource dut fost foth fough frome gaid gead goor grat grive har heak huth ko leath lon lould mant

blad bost broul cearth cong cood deak deast duave flaste flomb flove frap geat ghough glead goath grear haid heapt huede lange lew lomb masp mool

blift bope brole cark cobe dest disp dit flane fletch fline fritch gite glick goom grum hade het hink hod lask lish lolt luff meep mipe

Tempo Naming Hi Freq Exc Lo Freq Exc Lo Freq Con

losh loup mape meeb morp murp pleam poot sak shap shink soop soor spow sunth swode throob tring trouch wape woap wom woop wosk woup wunt

mave mork pead pone pood sall seard shouch sood sove stost sull throot tove trour twey wearn woad wome wonth woss wost

mough pleaf pove pown sarp shart shint sook sourn spown swand thromb torm trind trould woll wook wose wough wought wown wush

murt pake pank shain shig spoan stell sunt swax swole tait throke trob trote wame wask woat wole woon wum wump wunch


138 Acknowledgements

This research was supported by the National Institute of Mental Health (Grant MH47566) and the CMU Psychology Department Training Grant on “Individual Differences in Cognition”. We thank Marlene Behrmann, Max Coltheart, Alan Kawamoto, Jay McClelland, Chuck Perfetti, and the CMU PDP Research Group for helpful comments and discussions. Correspondence may be directed either to Christopher Kello ([email protected]) or David Plaut ([email protected]), Mellon Institute 115—CNBC, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213— 2683.