Coarticulation and Formant Transition Rate in Young Children Who Stutter

Soo-Eun Chang, Ralph N. Ohde, Edward G. Conture
Vanderbilt Bill Wilkerson Center for Otolaryngology and Communication Sciences, Vanderbilt University Medical Center, Nashville, TN

The purpose of this study was to assess anticipatory coarticulation and second formant (F2) transition rate (FTR) of speech production in young children who stutter (CWS) and who do not stutter (CWNS). Fourteen CWS and 14 age- and gender-matched CWNS in three age groups (3-, 4-, and 5-year-olds) participated in a picture-naming task that elicited single-word utterances. The initial consonant-vowel (CV) syllables of these utterances, comprising either bilabial [b m] or alveolar [d n s z] consonants and a number of vowels [i ɪ e ɛ æ u o ɔ ɑɪ ɑʊ], were used for acoustic analysis. To assess coarticulation and speech movement velocity, the F2 onset frequency and F2 vowel target frequency (for coarticulation) and FTR (for speech movement velocity) were computed for each CV syllable and for each participant. Based on these measures, locus equation statistics of slope, y-intercept, and standard error of estimate as well as the FTR were analyzed. Findings revealed a significant main effect for place of articulation and a significantly larger difference in FTR between the two places of articulation for CWNS than for CWS. Findings suggest that the organization of the FTR production for place of articulation may not be as contrastive or refined in CWS as in CWNS, a subtle difficulty in the speed of speech-language production, which may contribute to the disruption of their speech fluency.

KEY WORDS: stuttering, coarticulation, children, speech acoustics, transition rate

Recent theories of stuttering have suggested that people who stutter have an impaired or slower-than-normal phonological encoding ability (Perkins, Kent, & Curlee, 1991; Postma & Kolk, 1993; Wingate, 1988). Although it is unknown how difficulties in articulatory processing may contribute to disfluency in ongoing speech, two aspects of these processes, coarticulation and FTR, have been directly and indirectly addressed in previous studies of people who stutter (Hall, Amir, & Yairi, 1999; Howell & Vause, 1986; Klich & May, 1982; Robb & Blomgren, 1996; Yaruss & Conture, 1993; Zebrowski, Conture, & Cudahy, 1985). Using different age groups, analyses, and speech samples, these researchers have reported a range of production behaviors in people who do and do not stutter, with some reporting similarities and others reporting differences in coarticulatory and second formant (F2) transition rate (FTR) characteristics between the two groups. Coarticulation, the influence of one phoneme on another (Daniloff & Hammarberg, 1973; Henke, 1966; Kozhevnikov & Chistovich, 1965; Sharf & Ohde, 1981; Whalen, 1990), is viewed by some as involving “feature spreading,” which is planned prior to overt articulation rather


Journal of Speech, Language, and Hearing Research • Vol. 45 • 676–688 • August 2002 • ©American Speech-Language-Hearing Association • 1092-4388/02/4504-0676

than resulting from or being a byproduct of overt articulation. Others have argued that coarticulation is accounted for by a coproduction model, which does not have a “look-ahead” component that alters the plan for a sound segment based on context (Bell-Berti & Krakow, 1991). Instead, coproduction models suggest that coarticulation results from the overlap of gestures whose onsets have a stable relation to other aspects of the articulation of a sound segment.

Coarticulation and FTR can be quantified through the use of locus equation and FTR metrics, respectively. FTRs (Hz/ms) are derived by dividing the difference in frequency between the onset/offset F2 frequency measures (Hz) by the transition duration (ms). Locus equations, first introduced by Lindblom (1963), are derived from plots of F2 onset frequency taken at the first glottal pulse along the ordinate and the corresponding F2 vowel target frequency along the abscissa. The F2 transition represents the formant pattern changes associated with the consonant and vowel production. Previous research (Geitz, 1998; Sussman, Hoemeke, & McCaffrey, 1992) has shown locus equations to reasonably estimate the degree of consonant and vowel coarticulation in a consonant-vowel (CV) syllable. Minimum coarticulation (slope = 0.0) between the consonant and vowel is supposedly characterized by a locus equation having a relatively fixed F2 onset frequency across all vowel contexts. Maximum coarticulation (slope = 1.0) is represented by an F2 onset frequency that systematically varies with F2 vowel target frequencies.

Using the locus equation metric as a measure of amount of coarticulation, previous developmental findings indicated that younger children’s coarticulation strategies differ when compared to older children and adults (Geitz, 1998; Hamby, 2000; Robb & Wolk, 1997; Sussman et al., 1992; Sussman, Minifie, Buder, Stoel-Gammon, & Smith, 1996). For example, Sussman et al.
examined CV coarticulation in 3-, 4-, and 5-year-old children and found that although adults differentiated place of articulation on the basis of degree of coarticulation (e.g., bilabial consonants > velar consonants > alveolar consonants), this contrastive coarticulatory trend was not as apparent in the speech of younger children (3- and 4-year-olds). Younger children (3- and 4-year-olds) did not differentiate place of articulation to the same degree as 5-year-old children and adults. As reflected by smaller standard error of estimate scores, there was also less variation or tighter clustering and greater linearity of data points about the regression line for older children than for younger children. With regard to children who stutter (CWS), there appear to be no previous studies that used the locus equation metric as an objective measure of coarticulation. Robb and Blomgren (1996) did, however, use the

slope of the F2 transition following consonant release to compare differences in the extent of coarticulation between adults who do and do not stutter, and found that the former group generally had larger slopes than the latter group. A larger slope indicates greater coarticulation. However, coarticulatory behaviors of adults who do and do not stutter are not readily generalizable to the coarticulatory behavior of CWS and children who do not stutter (CWNS), due to many developmental differences in speech/language production between adults and children. For example, recent research clearly shows that the coarticulatory behaviors of children differ from those exhibited by adult speakers (Geitz, 1998; Hamby, 2000; Sussman et al., 1992). Likewise, the speech/language production of CWS is thought to differ from that of adults who stutter (Conture, 1991, 2001) in subtle as well as not so subtle ways. Thus, due to reported developmental differences in coarticulatory behavior of CWNS as well as general differences in speech production between children and adults, it seems inappropriate to predict coarticulatory patterns of CWS based on adult models. There is a need, therefore, to empirically study the coarticulatory abilities of children who stutter, especially those who are relatively close to the onset of stuttering (3-, 4-, and 5-year-olds) (Yaruss, LaSalle, & Conture, 1998). It was thought important to include CWS in this age range, as the (dis)fluent behavior of these children is relatively free of the influence of treatment as well as any learned or habitualized speech production reactions to or associated with longer-term history of stuttering— behaviors more often observed in older children, teenagers, and adults who stutter. The purpose of this study was to compare coarticulation and FTR development in the fluent speech of children who do and do not stutter. 
This was achieved by determining locus equations and FTRs for two different consonant places of articulation (bilabial and alveolar) in various vowel contexts. The results of this study should shed light on how these properties of speech-language production differ between CWS and CWNS.

Method

Participants

Twenty-six monolingual boys and 2 monolingual girls who spoke Standard American English served as participants. Of the 28 children, 14 were classified as CWS, and the other 14 were classified as CWNS and were matched in gender and age (±4 months) at the time of data recording. Participants ranged in age from 36 months to 65 months (M = 51.43, SD = 9.42) and were divided into three age groups (3-, 4-, and 5-year-olds).

Chang et al.: Coarticulation and Transition Rate in Children


For male CWNS participants with comparable ages within an age group (e.g., 3 CWNS who were 42 months old at the time of data recording), matching to comparable (±4 months) male CWS participants was random. All CWS participated in this study before receiving any prescribed speech-language treatment. Both CWS and CWNS were paid volunteer participants in ongoing research (Conture, Ohde, & Melnick, 1998; Logan & Conture, 1997) and were naive as to the purpose and method of this study. Table 1 depicts the age, sex, and age at the time of videotaping for both CWS and CWNS.

A child was included in the CWS group if he/she produced three or more within-word disfluencies (sound/syllable repetitions, sound prolongations, or monosyllabic whole-word repetitions) per 100 words of conversational speech (M = 8.17%; SD = 4.48) and if he/she had at least one adult expressing concern regarding speech fluency. Table 1 shows the mean and overall scores on the Riley (1980) Stuttering Severity Instrument (SSI) for each CWS. The mean SSI score for these children was 19.79 (range = 14 to 26, SD = 3.78), indicating a “moderate” risk for continued stuttering according to Riley’s criterion population (SSI-3). To be classified as CWNS, the child had to produce two or fewer within-word disfluencies per 100 words of conversational speech (M = 0.94%; SD = 0.75), with no concerns expressed by people in the children’s environment about speech fluency.

Both CWS and CWNS met the following criteria for selection: normal hearing as determined from pure-tone evaluations (bilateral testing at 25 dB SPL from 250 to 4000 Hz) and tympanometry impedance audiometry (between 800 and 3000 Ω), English as a native language with no known influence

Table 1. Descriptive data pertaining to participants in this study.

Column key: Onset = reported age at onset (mo.); Since = time since onset (mo.); Age = age at audio/videotaping (mo.); WWD = within-word disfluencies; SSI = SSI overall score; Severity = SSI severity rating.

           CWS                                                      CWNS
Age group  N/Sex  Onset  Since  Age    WWD     SSI    Severity      N/Sex  Age    WWD

3 years    1/M    27.00   9.00  36.00   7.80%  15.00  mild          1/M    40.00  1.50%
           2/M    31.00   8.00  39.00   5.00%  16.00  mild          2/M    42.00  1.20%
           3/M    28.00  12.00  40.00  19.80%  25.00  moderate      3/M    42.00  2.00%
           4/M    37.00   7.00  44.00   3.00%  14.00  mild          4/M    42.00  1.00%
           5/M    28.00  18.00  46.00  11.00%  26.00  moderate      5/M    44.00  0.60%
           M      30.20  10.80  41.00   9.32%  19.20  moderate      M      42.00  1.26%
           SD      4.09   4.44   4.00   6.58    5.81                SD      1.41  0.53

4 years    1/M    24.00  25.00  49.00   3.00%  21.00  moderate      1/M    47.00  0.30%
           2/M    30.00  25.00  49.00  10.00%  20.00  moderate      2/M    51.00  2.00%
           3/M    30.00  20.00  50.00   5.40%  19.00  moderate      3/M    53.00  2.00%
           4/M    41.00  14.00  55.00   7.00%  16.00  mild          4/M    54.00  0.00%
           M      31.25  21.00  50.75   6.35%  19.00  moderate      M      51.25  1.08%
           SD      7.09   5.22   2.87   2.93    2.16                SD      3.10  1.08

5 years    1/M    24.00  35.00  59.00  14.00%  22.00  moderate      1/M    60.00  1.30%
           2/M    30.00  31.00  61.00   6.30%  18.00  moderate      2/M    63.00  0.60%
           3/F    38.00  24.00  62.00   6.00%  20.00  moderate      3/F    63.00  0.00%
           4/M    24.00  41.00  65.00   8.20%  23.00  moderate      4/M    64.00  0.70%
           5/M    30.00  35.00  65.00   8.00%  22.00  moderate      5/M    65.00  0.00%
           M      29.20  33.20  62.40   8.50%  21.00  moderate      M      63.00  0.52%
           SD      5.76   6.26   2.61   3.22    2.00                SD      1.87  0.55

Overall    M      30.14  21.71  51.43   8.17%  19.79  moderate      M      52.14  0.94%
           SD      5.26  11.01   9.85   4.48    3.68                SD      9.44  0.75

Note. CWS = children who stutter, CWNS = children who do not stutter, mo. = months, SSI = Stuttering Severity Instrument, M = male and F = female.


of exposure to any other language, normal receptive and expressive functioning as determined by the Test of Early Language Development (TELD; Hresko, Reid, & Hammill, 1991) and the Peabody Picture Vocabulary Test (PPVT; Dunn & Dunn, 1981), no known or reported neurological illnesses or trauma, no evidence of oral or muscular weakness or dysarthria as determined by an oral peripheral examination, no known or reported difficulties in behavioral and/or intellectual functioning, and adequate motor functioning as determined by Wolk’s (1990) selected neuromotor test battery (SNTB). Of particular importance, due to the potential influence of speech-sound articulatory abilities on measures of coarticulation, all participants exhibited normal phonological abilities (a percentile rank of 25% or higher) for their age, as determined on the Goldman-Fristoe Test of Articulation (GFTA; Goldman & Fristoe, 1986).

Speech Samples

Responses to a picture-naming task (PNT; Wolk, Edwards, & Conture, 1993) served as stimuli for perceptual and acoustic analysis. The PNT consists of 162 simple line drawings representing all English consonant sounds in initial, medial, and final word positions, as well as a variety of word-initial clusters. An attempt was made to obtain spontaneous productions of all 162 words in the PNT, but at times it was necessary (especially for younger children) to obtain imitated responses. The majority of the children’s responses were spontaneous productions. A comparison of the locus equation results of the current research, which used mixed responses, with previous child locus equation findings that employed only imitated productions revealed very similar results (Geitz, 1998; Sussman et al., 1992). Thus, the method employed for obtaining child responses seems to have little effect on locus equation results.

Among the elicited sounds, words containing initial bilabial ([b m]) and alveolar ([d n s z]) features in various vowel contexts ([i ɪ e ɛ æ u o ɔ ɑɪ ɑʊ]) were selected for analysis. It is important to note that for a child’s utterance to qualify for acoustic analysis, productions had to meet both of the following criteria: according to perceptual analysis, (1) each word had to be fluent and (2) each word had to be correctly articulated without any age-inappropriate phonological processes. Perceptual analyses of articulation and fluency on all productions were completed by the first author. Perceptually fluent speech (i.e., speech containing neither within- nor between-word disfluencies) was required in order to exclude any possible difference in coarticulation processes between stuttering and nonstuttering groups resulting from speech disfluencies. Correctly articulated speech was required to exclude any possible influence on coarticulation and FTR processes due to incorrect or deficient articulation performance. In addition, any speech sample containing noise artifacts affecting F2 onset and offset measurements was eliminated from the study.

Data Collection

Setting

Details pertaining to similar procedures have been discussed elsewhere (LaSalle & Conture, 1995; Logan & Conture, 1997; Melnick & Conture, 2000; Wolk, Edwards, & Conture, 1993). Briefly, 14 participants were tested at the Gebbie Speech and Hearing Clinics, Syracuse University, and the remaining participants were tested under comparable conditions at the Vanderbilt Bill Wilkerson Center, Vanderbilt University. Each child’s testing was audio- and video-recorded and took place in a room specially designed for the experimental testing of young children; for example, testing rooms had been treated to minimize ambient and electrical noise. During testing, each child was seated across from the examiner who administered the PNT at a small table on which the PNT rested, with PNT pictures in front of the child within easy viewing distance. Administration of the PNT lasted about 30 minutes per participant and was the final part of a comprehensive speech and language assessment, which typically lasted 1.5 to 2.0 hours. Each child was allowed to take breaks during this period so that their speech-language productions would be minimally influenced by fatigue.

Audio- and Videotaping Procedures

Two high-quality color video cameras (Panasonic models WV-3500 and WV-3250 at Syracuse and JVC model HZ-714 at Vanderbilt) were used for recording. One camera was positioned and focused to obtain a well-illuminated, clear view of the child’s head, neck, upper torso, hands, and arms, and the other was similarly arranged to obtain a clear view of each PNT picture as it was presented to the child (simultaneous recording of each child and the test material the child was responding to was obtained to facilitate subsequent analysis of the data). The output of each camera was channeled to a video switcher (Panasonic model WJ-3500), where it was multiplexed to form a split-screen composite. A visually apparent time code (Hours:Minutes:Seconds::Videoframes) from an Evertz time code generator/reproducer (Model 3600D) was fed through the switcher, time locked to the videotape recording, and displayed on the split-screen composite. Each participant’s associated audio signals were obtained by a lapel microphone (SONY ECM-55), placed within 15 cm of the participant’s lips, fed to an audio channel of the VCR, and simultaneously recorded and monitored throughout recording on VU meters located on the front of the VCR.

The resulting video composite, together with the associated acoustic speech signal, was then recorded on a high-fidelity, 13-mm Panasonic videocassette recorder-reproducer (VCR; Model AG-1730). This VCR simultaneously recorded the video signal, associated time code, and audio signals at 30 frames per second (60 videofields per second). For the purpose of online monitoring of all recorded signals (during recording), the output of each participant’s composite signal (i.e., audio + multiplexed video image + time code generator’s time-locked visually apparent output) was simultaneously fed to and appeared on a television monitor (Sony Trinitron at Syracuse and JVC RM-713 at Vanderbilt).

Acoustic Analysis

Stimulus Selection

As mentioned above, only fluent and correctly articulated words containing the initial consonant sounds [b], [m], [d], [n], [s], and [z] were included for acoustic analysis. Although each consonant was equally represented in the initial sample, the occurrence of articulation errors, disfluencies, and so forth decreased the number of consonants available for analysis. Thus, the [b] and [m] productions were combined for the bilabial place of articulation and the [d], [n], [s], and [z] productions were combined for the alveolar place of articulation to provide sufficient data points for the locus equation analyses.

Although Sussman and Shore (1996) found a significant difference in the locus equation slope of [s] compared to [d], [n], or [z], there were only 12 [s] productions included in the current research that met the aforementioned fluent/correct articulation criteria. In addition, these 12 productions were distributed across 12 different speakers, including 6 CWS and 6 CWNS. Thus, it is unlikely that any difference between the alveolar slopes that included the [s] production and those that did not include [s] significantly affected the current findings. Also, the mean F2 onset values across the 12 speakers for [d], [n], [z], and [s] were 2435 Hz, 1978 Hz, 2714 Hz, and 2387 Hz, respectively. The mean F2 onset for [n] is relatively low because it occurred in the [ɑɪ] vowel context. This [n] value of 1978 Hz is very similar to the findings for children by Hamby (2000) of 1947 Hz for the [ɑ] vowel context. Overall, for the present study, the F2 onset values for [d], [z], and [s] are similar. In addition, there were no more than two [z] productions for a given speaker. Thus, it was not possible to complete separate locus equation analyses for [s] and [z].

There were 141 and 146 words for CWS and CWNS, respectively, that were acceptable for acoustic analysis. All acoustic measurements were performed on the F2 transition between the aforementioned word-initial consonant and the following vowel. This was because F2 provides the most important information about speech articulator movements, particularly relating to movements of the tongue (Gay, 1978). Also, the measurement of F2 was essential in the calculation of the locus equation as well as the FTR.

Data Analysis: Locus Equations

Participant recordings were digitized at a 25-kHz sampling rate using Kay Elemetrics Computerized Speech Lab (CSL version 5.5). To assess coarticulation, the onset and offset frequencies of F2 were measured at the first glottal pulse and the vowel steady state, respectively. The locus equation statistics were derived from these measurements for bilabial and alveolar place of consonant articulation. Measurement of the vowel frequencies of a given F2 transition involved producing sound spectrograms and waveforms for each word, which were displayed on the graphics monitor. To aid in location of relevant positions of the F2 onset and offset frequencies, the digital cursor was linked between the spectrogram and waveform displays. Next, the F2 onset and offset frequencies were measured from the visual center of the F2 energy bands on the spectrographic display.

To ensure measurement validity and to control for inherent difficulties in the analysis of children’s speech, three measurement formats were employed: (1) spectrograms, (2) fast Fourier transform (FFT) spectra, and (3) linear predictive coding (LPC). Sussman et al. (1992) used all three analytic measures to ensure that an accurate measurement of F2 was obtained. Measurement consensus using the three measures ensured an accurate measurement of F2 as opposed to an analysis of a strong harmonic, which is characteristic of children’s speech. An average of the three measurements was taken, with the LPC measurement discarded if its value was substantively different from the other measures.

Unlike Neary and Shammass (1987) and Robb and Blomgren (1996), who used a fixed time point (60-ms postconsonantal release and/or 120-ms postconsonantal release) to determine the loci of vowel target (offset), the present study used criteria similar to those used by Geitz (1998), Hamby (2000), and Sussman et al. (1992) for measuring the F2 offset frequencies. The vowel target measurements were made at variable positions, depending on the configuration of the F2 resonance. The rationale for using variable positions for the target measurements was that the vowel durations were variable, especially in younger children, and this precluded a fixed time window for F2 offset; another factor was the rate of frequency change measure, described below. The F2 vowel targets were measured as follows: (a) if the resonance was flat (manifested as a straight line configuration), a temporal midpoint was measured; (b) if the resonance was diagonally rising or falling, a temporal midpoint was measured; (c) if the resonance was U-shaped or the inverse, measurement was taken at the minimum or maximum frequency point, respectively. For diphthongal vowels, only the initial vowel was measured.

In computing locus equations, a straight-line regression fit was made to the resulting data points, which generally spanned all quadrants of the vowel space. The equation determined by Lindblom (1963) was as follows: F2 onset = k × F2 vowel + c, where k is the slope of the line and c is the y-intercept. Less steep or flat slopes (i.e., k approaching 0) indicate minimum coarticulation typical of alveolar and dental productions, whereas steeper slopes (i.e., k approaching 1) indicate maximum coarticulation, as might be seen with bilabials (Krull, 1988).
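The straight-line fit described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis software: the function name `fit_locus_equation` and the four example tokens are hypothetical, and ordinary least squares is assumed as the regression method.

```python
from typing import Sequence, Tuple


def fit_locus_equation(f2_vowel: Sequence[float],
                       f2_onset: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of F2 onset = k * F2 vowel + c.

    Returns (k, c): the locus equation slope and y-intercept (Hz).
    """
    n = len(f2_vowel)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need at least two paired (F2 vowel, F2 onset) values")
    mean_x = sum(f2_vowel) / n
    mean_y = sum(f2_onset) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel)
    if sxx == 0:
        raise ValueError("F2 vowel targets must differ across tokens")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_vowel, f2_onset))
    k = sxy / sxx               # slope: degree of coarticulation
    c = mean_y - k * mean_x     # y-intercept in Hz
    return k, c


# Hypothetical CV tokens (Hz), invented for illustration only.
vowel_targets = [3000.0, 2500.0, 2000.0, 1500.0]
onsets = [2500.0, 2250.0, 2000.0, 1750.0]
k, c = fit_locus_equation(vowel_targets, onsets)
print(f"slope k = {k:.3f}, y-intercept c = {c:.0f} Hz")
```

With these invented tokens the onset moves half as far as the vowel target, so the fitted slope falls midway between the minimum (k = 0) and maximum (k = 1) coarticulation cases described in the text.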

Table 2. Group mean frequencies (in Hz) for F2 onset and F2 vowel target and F2 transition duration (in ms) as a function of speaker age (in years), vowel context, and group.

               CWS                                 CWNS
Age  Context   F2 onset  F2 target  F2 transition  F2 onset  F2 target  F2 transition
                                    duration                            duration

Bilabial + Vowel
3    [e]       2317      2815        70            2183      2871        68
     [ɛ]       1860      2347        66            1751      1861        61
     [æ]       2308      2600        63            1865      2232        63
     [ɑɪ]      1788      2245        91            1514      2189        61
     [ɑʊ]      1642      2080        34            n/a       n/a         n/a
4    [e]       2445      2932        71            2568      3083        53
     [ɛ]       n/a       n/a         n/a           1806      2153        30
     [æ]       2208      2427        44            2476      2725        65
     [ɑɪ]      1959      2115        45            1311      2209        58
     [ɑʊ]      1703      1861        75            1860      1807        46
5    [e]       2350      2766        79            2595      2948        56
     [ɛ]       1934      2517        65            1835      2328        49
     [æ]       2199      2445        45            2298      2562        66
     [ɑɪ]      1605      1952        65            1661      2170        53
     [ɑʊ]      2080      1970        75            1740      1799        60

Alveolar + Vowel
3    [i]       3065      3322        35            2919      3138        68
     [ɪ]       2701      2748        50            2547      2469        61
     [u]       2062      1623        43            2061      1769        65
     [o]       2116      1581        61            2253      1714       147
     [ɔ]       2399      1523       167            2250      1556       124
     [ɑɪ]      1718      2061        47            2224      2003        84
4    [i]       2573      2955        59            3138      3248        26
     [ɪ]       2265      2230        41            2525      2440        48
     [u]       1888      1550        77            2089      1331        94
     [o]       2079      1532        94            2426      1567        97
     [ɔ]       1970      1313       109            2426      1641       128
     [ɑɪ]      2294      2071        67            1982      1939        70
5    [i]       2834      3053        52            3044      2956        59
     [ɪ]       2291      2369        43            2421      2336        56
     [u]       n/a       n/a         n/a           2553      1716       104
     [o]       2104      1849       103            1968      1222       145
     [ɔ]       2265      1519       121            1951      1276        81
     [ɑɪ]      2119      1790        65            2120      1899        70

Note. CWS = children who stutter; CWNS = children who do not stutter; n/a = no data available.


Data Analysis: FTR

To assess FTR (F2 transition rate), the frequency change in F2 transitions (Hz/ms) was measured. FTR was estimated by dividing the difference between the F2 onset and F2 offset frequencies (Hz) by the duration of the transition (ms). This measure is believed to approximate the speed with which the speech articulators move from one location to the next (Yaruss & Conture, 1993).
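As a worked illustration of the FTR calculation, the sketch below divides the F2 frequency change by the transition duration. The function name `formant_transition_rate` is hypothetical, and taking the absolute value of the frequency change is an assumption (the text does not say how falling transitions were signed). The example uses the Table 2 group means for 3-year-old CWS in the bilabial + [e] context; because the study computed FTR per token, the result is not expected to match any value in Table 4.

```python
def formant_transition_rate(f2_onset_hz: float,
                            f2_offset_hz: float,
                            duration_ms: float) -> float:
    """Return the F2 transition rate in Hz/ms.

    Assumption: the rate is reported as a magnitude, so rising and
    falling transitions both yield positive values.
    """
    if duration_ms <= 0:
        raise ValueError("transition duration must be positive")
    return abs(f2_offset_hz - f2_onset_hz) / duration_ms


# Table 2 group means, 3-year-old CWS, bilabial + [e]:
# F2 onset 2317 Hz, F2 target 2815 Hz, transition duration 70 ms.
rate = formant_transition_rate(2317.0, 2815.0, 70.0)
print(f"FTR = {rate:.2f} Hz/ms")
```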

Intrajudge and Interjudge Measurement Reliability

To assess interjudge and intrajudge measurement reliability on the acoustic measures of the F2 transitions and transition rates, as well as the judgment of fluency and articulation of the utterances, the first author and another individual trained in acoustic analysis and phonetic transcription served as judges for 28 randomly selected target words (10% of each of the 28 participants’ analyzed productions). The intrajudge/interjudge mean differences for the three acoustic measurements, as well as the percent agreement on the fluency and articulation of the single-word utterances, were as follows: F2 transition onset frequency, 90 Hz and 68 Hz (intrajudge and interjudge, respectively); F2 transition offset frequency, 68 Hz and 118 Hz; FTR, 1.42 Hz/ms and 3.50 Hz/ms; percent agreement on fluency: 100% and 100%; and percent agreement on articulation: 100% and 100%. These interjudge reliability values for F2 onset (68 Hz) and F2 offset (118 Hz) were similar to those of Sussman et al. (1992), who reported a mean difference of 97 Hz averaged across tokens for F2 onset and F2 vowel frequencies.

Results

Coarticulation: Locus Equation Slopes and Y-Intercepts

Mean frequencies for F2 onset and F2 target for each group across vowel contexts are presented in Table 2. In general, these values are consistent with previous reported data for children (Geitz, 1998; Hamby, 2000; Sussman et al., 1992). Coarticulation, assessed by means of locus equation slope and y-intercept values, was studied as a function of consonant place of articulation (see Table 3). A repeated-measures ANOVA (Group × Age × Place of Articulation) performed on the slopes showed a significant effect of place only, F(1, 22) = 16.74, p < .01. As illustrated in Figure 1, the slopes for all age groups differ between bilabial and alveolar places of articulation, a finding consistent with the studies of Geitz, Hamby, and Sussman et al. Across all ages, bilabial slopes are steeper than alveolar slopes, as also observed in the three studies.

Geitz (1998), Hamby (2000), and Sussman et al. (1992) reported that the y-intercepts, like slope, tend to have distinct values depending on the different places of articulation. Consistent with their findings, in the present study a repeated measures ANOVA also indicated a significant difference for this factor and place of articulation, F(1, 22) = 43.12, p < .01. Thus, as in previous research on children’s locus equations, the distinctiveness of place of articulation in the current data is reflected in both the slope and y-intercept functions.

Table 3. Individual slope and y-intercept values as a function of speaker age (in years), group (CWS = children who stutter, CWNS = children who do not stutter), and place of articulation.

                          Bilabial               Alveolar
Age  Speaker group-       Slope   Y-intercept    Slope   Y-intercept
     participant #

3    CWS-1                0.572     573          0.478   1483
     2                    1.078    –523          0.241   1980
     3                    0.798      98          0.507   1297
     4                    0.755     178          0.638    912
     5                    0.573     564          0.542   1188
     M                    0.755     178          0.480   1372
     CWNS-1               0.722     155          0.332   1486
     2                    0.690     386          0.590   1140
     3                    0.688     307          0.182   2315
     4                    0.988    –102          0.563   1148
     5                    0.354     955          0.530   1181
     M                    0.688     340          0.439   1454

4    CWS-1                0.810     114          0.516   1156
     2                    0.645     466          0.311   1544
     3                    0.540    1071          0.298   1599
     4                    0.925    –177          0.517   1250
     M                    0.730     369          0.410   1387
     CWNS-1               0.803     187          0.471   1528
     2                    0.230    1204          0.258   1619
     3                    0.988    –269          0.027   2504
     4                    0.780     140          0.981    –41
     M                    0.700     315          0.434   1402

5    CWS-1                0.549     428          0.758    653
     2                    0.887      14          0.458   1251
     3                    0.582     609          0.773    746
     4                    1.360   –1477          0.406   1382
     5                    0.816     267          0.353   1518
     M                    0.840     –31          0.549   1110
     CWNS-1               0.775      80          0.399   1417
     2                    0.895      18          0.405   1573
     3                    0.889     –28          0.480   1236
     4                    0.988    –286          0.150   1913
     5                    0.703     540          0.963    177
     M                    0.850      64          0.479   1263

     Grand Mean           0.763     205          0.468   1326

Figure 1. Locus equation slope as a function of speaker age and group (CWS = children who stutter; CWNS = children who do not stutter) and place of articulation. Bars indicate the standard error.

Figure 2. Mean FTR as a function of speaker age and group (CWS = children who stutter; CWNS = children who do not stutter) and place of articulation. Bars indicate the standard error.

Table 4. Individual mean FTR (Hz/ms) and difference (bilabial minus alveolar) measures as a function of speaker age (in years), group (CWS = children who stutter, CWNS = children who do not stutter), and place of articulation.

Age         Participant  CWS Bilabial  CWS Alveolar  CWS Difference  CWNS Bilabial  CWNS Alveolar  CWNS Difference
3           1            8.02          7.39          0.63            6.55           4.52           2.03
3           2            4.16          2.06          2.10            10.68          3.26           7.42
3           3            7.87          5.16          2.71            7.86           2.62           5.24
3           4            6.62          3.54          3.08            9.32           2.43           6.89
3           5            6.16          7.56          –1.40           6.48           4.87           1.61
3           M            6.56          5.14          1.42            8.17           3.54           4.63
4           1            6.52          3.92          2.60            8.10           5.32           2.78
4           2            5.76          4.36          1.40            9.73           6.01           3.72
4           3            2.93          2.31          0.62            6.59           2.42           4.17
4           4            4.78          3.91          0.87            5.44           2.66           2.78
4           M            4.99          3.62          1.37            7.46           4.10           3.36
5           1            5.76          3.25          2.51            9.52           4.26           5.26
5           2            7.43          2.93          4.50            6.66           4.08           2.58
5           3            8.33          2.17          6.16            3.10           1.99           1.11
5           4            9.12          3.97          5.15            7.56           5.82           1.74
5           5            3.46          3.91          –0.45           4.81           2.06           2.75
5           M            6.82          3.24          3.57            6.33           3.64           2.69
Grand mean               6.20          4.03          2.17            7.31           3.73           3.58

Variability: Standard Error of Estimate

To assess variability between and within talker groups, the standard error of estimate (SE) was used to measure the “goodness of fit” of the regression line to the data points. An ANOVA performed on the SE showed no significant main effects of speaker group or age, but there was a significant Age × Group interaction, F(2, 22) = 3.46, p < .05. This interaction shows that the SE decreased for CWNS but increased for CWS between 4 and 5 years of age. Although there were different patterns of variability for the speaker groups as a function of age, the variability for the children in the present study was comparable to variability previously reported for children at similar ages (Geitz, 1998; Hamby, 2000; Sussman et al., 1992).
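The three locus equation statistics used in these analyses (slope, y-intercept, and standard error of estimate) can be sketched as an ordinary least-squares fit of F2 onset frequency on F2 vowel target frequency. This is only an illustration: the function name and the token values are hypothetical, and the standard error of estimate is assumed to use the usual n − 2 degrees of freedom for simple regression.

```python
import math


def locus_equation(f2_vowel_hz, f2_onset_hz):
    """Fit F2onset = slope * F2vowel + intercept by least squares.

    Returns (slope, y_intercept, standard_error_of_estimate).
    """
    n = len(f2_vowel_hz)
    if n != len(f2_onset_hz) or n < 3:
        raise ValueError("need at least three paired measurements")
    mean_x = sum(f2_vowel_hz) / n
    mean_y = sum(f2_onset_hz) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel_hz)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_vowel_hz, f2_onset_hz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual scatter around the regression line ("goodness of fit")
    residual_ss = sum((y - (slope * x + intercept)) ** 2
                      for x, y in zip(f2_vowel_hz, f2_onset_hz))
    see = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, see


# Hypothetical CV tokens for one child and one place of articulation (Hz)
f2_vowel = [1000.0, 1500.0, 2000.0, 2500.0]
f2_onset = [1200.0, 1500.0, 2000.0, 2300.0]
slope, intercept, see = locus_equation(f2_vowel, f2_onset)
```

A steeper slope means F2 onset tracks the vowel target more closely, and a smaller standard error of estimate means the tokens lie closer to the fitted line.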

FTR: F2 Transition Rate

Mean durations for F2 transition for each group across vowel contexts are presented in Table 2. The pattern of these values is consistent with previously reported data for adults (Ohde & Sharf, 1977). For example, the longest F2 durations for children and adults occur with alveolars and back vowel contexts. The authors are not aware of previously reported F2 transition durations of obstruents for children of this age.

An indirect measure of the velocity of articulator movement was obtained by assessing FTR (Hz/ms) (see Table 4). A repeated measures ANOVA (Group × Age × Place of Articulation) revealed a significant effect for place of articulation, F(1, 22) = 57.70, p < .01, and a marginally significant Place × Group interaction, F(1, 22) = 3.84, p < .06. FTR was generally faster for bilabial than alveolar place of articulation. Moreover, as shown in Figure 2, the rate difference between the labial and alveolar place of articulation was not as great for CWS as for CWNS for the 3- and 4-year-old groups.
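As a rough illustration of the FTR measure, the sketch below assumes that FTR is the absolute change in F2 from consonant onset to vowel target divided by the transition duration. That definition matches the Hz/ms unit reported here but is an assumption, and the function name and token values are hypothetical.

```python
from statistics import mean


def formant_transition_rate(f2_onset_hz, f2_target_hz, duration_ms):
    """Return the F2 transition rate in Hz/ms (assumed: |F2 change| / duration)."""
    if duration_ms <= 0:
        raise ValueError("transition duration must be positive")
    return abs(f2_target_hz - f2_onset_hz) / duration_ms


# Hypothetical CV tokens: (F2 onset in Hz, F2 vowel target in Hz, duration in ms)
bilabial_tokens = [(1500.0, 2100.0, 60.0), (1200.0, 1000.0, 40.0)]
alveolar_tokens = [(1900.0, 2200.0, 75.0), (1800.0, 1400.0, 100.0)]

# Mean FTR per place of articulation for one participant
bilabial_ftr = mean(formant_transition_rate(*t) for t in bilabial_tokens)
alveolar_ftr = mean(formant_transition_rate(*t) for t in alveolar_tokens)

# Bilabial-minus-alveolar difference, the kind of score listed in Table 4
difference = bilabial_ftr - alveolar_ftr
```

With these made-up tokens the bilabial mean is larger than the alveolar mean, which mirrors the direction of the place effect reported above.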

The above findings suggest that CWNS have greater differences in the rate measures between the two place-of-articulation categories. Because this difference in rate between the place-of-articulation categories could reflect variations in production strategies between CWNS and CWS, a mean rate difference score was derived for each participant by subtracting the mean FTR of the alveolar contexts from the mean FTR of the bilabial contexts. As illustrated in Figure 3, CWNS showed greater rate differentiation than CWS for bilabial and alveolar place-of-articulation categories. A dependent one-tailed t test (CWS vs. CWNS) on this measure revealed a significant effect of group, t(13) = 1.776, p < .05. This finding indicates that CWNS differentiate bilabial and alveolar places of articulation as a function of FTR significantly more than CWS, with the difference between the two speaker groups being largest for the 3- and 4-year-old groups and then decreasing as a function of age.
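The dependent t test on the difference scores can be sketched with the individual values from Table 4. The pairing is an assumption: each CWS participant is paired with the CWNS participant in the same row of the table (same age group and participant number), since the matched pairs are not identified in this section.

```python
import math
from statistics import mean, stdev

# Bilabial-minus-alveolar FTR differences (Hz/ms) from Table 4,
# in table order: ages 3, 4, and 5, participants in row order.
cws = [0.63, 2.10, 2.71, 3.08, -1.40,
       2.60, 1.40, 0.62, 0.87,
       2.51, 4.50, 6.16, 5.15, -0.45]
cwns = [2.03, 7.42, 5.24, 6.89, 1.61,
        2.78, 3.72, 4.17, 2.78,
        5.26, 2.58, 1.11, 1.74, 2.75]

# Assumed pairing: row i of CWS is matched with row i of CWNS
pair_diffs = [n - s for s, n in zip(cws, cwns)]

# Paired (dependent) t statistic: mean difference over its standard error
df = len(pair_diffs) - 1
t_stat = mean(pair_diffs) / (stdev(pair_diffs) / math.sqrt(len(pair_diffs)))
```

Under this pairing assumption the statistic comes out close to the reported t(13) = 1.776. Standard t tables give a one-tailed .05 critical value of about 1.771 for 13 degrees of freedom, so the result only just exceeds that threshold.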

Discussion

The present study examined possible differences between CWS and CWNS in coarticulation and FTR, as well as differences across age levels (3-, 4-, and 5-year-olds) within and between these talker groups. The locus equation slopes for the bilabial consonant + vowel and alveolar consonant + vowel revealed the same linear relationship between F2 onset and F2 offset (vowel) frequencies as previously reported by others (Geitz, 1998; Hamby, 2000; Krull, 1988, 1989; Lindblom, 1963; Neary & Shammass, 1987; Sussman et al., 1992). Group means for the locus equation slopes as a function of consonant place of articulation revealed patterns similar to those reported in earlier related studies by Geitz, Hamby, and Sussman et al. (e.g., steepest for the bilabials and flattest for the alveolars). The y-intercept values in the current study also revealed the same kind of discriminating effect for place of articulation that has been previously reported for children (Geitz, 1998; Hamby, 2000; Sussman et al., 1992). Perhaps most interesting, the present results suggest that CWS and CWNS differ in FTR as a function of place of articulation, with CWNS exhibiting a greater contrast of FTRs between the labial and alveolar consonant contexts than CWS.

Figure 3. Mean FTR difference (bilabial minus alveolar) for place of articulation as a function of speaker group. Bars indicate the standard error.

Coarticulation Properties of CWS and CWNS

Present findings indicate that there are no appreciable differences between CWS and CWNS in terms of the degree of coarticulation as measured by the locus equation slope and y-intercept. In a related investigation, Robb and Blomgren (1996) studied adults who stutter, whereas the current study examined young children who stutter. Acknowledging this difference between the two studies, Robb and Blomgren’s interpretation that people who stutter have less refinement in articulation of individual speech segments due to “greater or quicker movement of the tongue body in transitioning from closing-to-opening-to-closing vocal tract gestures” is not consistent with present findings. Rather, we found that CWS and CWNS do not differ in terms of degree of coarticulation. Moreover, regarding FTR, it was CWNS, not CWS, that showed a greater tendency to have quicker movements during the CV transition in the bilabial context than in the alveolar environment (hence contributing to larger differences in FTR between bilabial and alveolar categories). Perhaps it is fair to interpret Robb and Blomgren’s finding as being related to the long-term effects of coping with instances of stuttering in the speech-language production of adults who stutter. If long-term effects of stuttering on speech-language production differentiate adults from children who stutter, then results from the present study strongly support further empirical studies of young children who stutter to determine the developmental basis for this speech-language production problem.

Formant Transition Rate (FTR) Properties of CWS and CWNS

FTR, in this context, refers to the speed (velocity) at which the tongue moves from one position in the oral cavity to another, with higher rates indicating a relatively fast movement and lower rates indicating a relatively slow movement. Thus, FTR is a relative temporal measure that represents speed of movement between two articulatory positions. Regarding the velocity of articulator movement, FTR differentiated the bilabial and alveolar places of articulation to a greater degree for the CWNS than for the CWS, a trend most apparent when comparing the youngest (3- and 4-year-old) CWS and CWNS. This tendency may indicate that CWS (particularly younger CWS) have less control than CWNS over the rate of articulator movement. The findings showing that CWNS differentiated FTR more than CWS suggest that the latter talker group may exhibit immature, less organized subsegmental phonological encoding abilities involving the temporal/spatial domain of speech-language production. Perhaps temporal/spatial aspects of CWS’s speech production are either not stored or are poorly stored, and/or CWS have trouble mapping these aspects of speech production programs onto the sound’s phonological form. Alternatively, these rate differences between CWS and CWNS could be due to lower-level speech motor control processes. In general, the young CWS participants were slower than CWNS in executing anticipatory movement of articulators for the vowel during bilabial feature production. An important challenge for future research will be to identify the independent contributions of higher level encoding and lower level motor execution processes involved in rate of speech production of CWS versus CWNS.

Most previous studies examining rate of speech production have not measured subsegmental properties such as FTR. For example, recent findings of Hall, Amir, and Yairi (1999) show that the fluent speech phone rates (phones per second) differ between children who do and do not stutter. As in the current research, Hall et al. found that phone rates were generally slower for CWS than for CWNS, and these findings were interpreted to suggest that CWS exhibit either slower motor execution or longer central processing (or perhaps both) before execution (Postma & Kolk, 1993). Although the present authors duly note Armson and Kalinowski’s (1994) cautions regarding the study of perceptually fluent speech (for example, determining whether perceptually fluent tokens are also physiologically fluent), we also note that accumulating evidence appears to support continued study of the temporal/spatial parameters of speech-language production of people who stutter.

In summary, at the segmental/subsegmental level of speech production, children who stutter appear to have difficulties regulating the temporal/spatial programs of speaking, whether at a peripheral motor execution level or at a more central level before motor execution, a finding consistent with Kent’s (1984) and Van Riper’s (1971) notion that stuttering is associated with a disturbance in temporal processes. Results of several studies, of both children and adults, indicate that the temporal processes of coarticulation differentiate those who do from those who do not stutter. Although these differences in the temporal parameters of speech-language production may be subtle, probably even subperceptual, they are consistent with the notion that time is of the essence in terms of stuttering, something implied in the very definition of stuttering as a disruption in fluency or rhythm of speech (Kent, 1984). Specifically, current findings indicate that CWS differ from CWNS in speed of movement between two articulatory positions, with the former generally slower than the latter. Thus, any perturbation in the rate of movement between two articulator positions may also contribute to a disruption in the fluency of speech.

Coarticulation as a Function of Age (3-, 4-, and 5-Year-Olds)

No significant between-age-group differences were found for any of the locus equation metrics examined in this study. All age groups exhibited significantly different locus equation slope and y-intercept measures for the bilabial and alveolar places of articulation, findings that support previous developmental research and suggest consistency in results across studies (Geitz, 1998; Hamby, 2000; Sussman et al., 1992). Sussman et al. (1992) reported that young children did not differentiate the coarticulation of consonant place of articulation to the same degree as adults. For example, the difference in locus-equation-slope values for stop consonant place of articulation categories tended to be smaller in young children than in adults, indicating a less distinctive and contrastive coarticulation function for children. The present findings suggest that there is a tendency for the mean locus-equation-slope values to increase with age, especially for the bilabial place of articulation (Table 3). These findings differ from the report by Sussman et al. that the slope value for the bilabial context was equally large at all ages, whereas the alveolar slope values for younger speakers tended to be larger than those of older children or adults. The present findings show that slope values for the bilabial place of articulation increase as a function of age, whereas slope values for the alveolar context are not substantively different among the three age groups. Although not statistically significant, the direction of age-related trends observed in the present study is consistent with previous developmental findings. Geitz (1998) examined a fairly large participant sample across eight age groups and found consistent differences among young children (3-, 4-, and 5-year-olds), older children (7-, 9-, and 11-year-olds), and adults in the variability of production. As in the present study, however, Geitz also found no significant differences in any of the locus equation measures among the youngest age groups (3-, 4-, and 5-year-olds).

It would be reasonable to suggest, based on the present findings, that the slope metric as an index of coarticulation does not significantly differ as a function of age, at least within the 3- to 5-year-old age bracket considered in this study. Moreover, although children at all ages from 3 to 5 years exhibited contrastive locus equation slopes and y-intercepts for bilabial versus alveolar places of articulation, the difference between bilabial and alveolar slopes did not vary as a function of age.

Conclusions

This study analyzed coarticulation and FTR properties in the fluent CV productions of 3-, 4-, and 5-year-old CWS and CWNS. Based on the findings of this research, conclusions are:

1. Coarticulation as measured by the locus equation slope, y-intercept, and standard error of estimate does not differ between CWS and CWNS talker groups.
2. The difference in FTR between bilabial and alveolar places of articulation is significantly larger in CWNS than in CWS.
3. Subtle difficulties learning, retrieving, storing, or executing certain temporal/spatial parameters of speech-language production may be associated with childhood stuttering.

Acknowledgments

This research was supported by NIH Grant DC0052308. The article is based on a master’s thesis completed at Vanderbilt University in 1999 by the first author under the direction of the second and third authors. The authors express their appreciation to Dan Ashmead for comments on earlier drafts of this paper and for his assistance on statistical analyses. We extend our most sincere thanks to the many children and their parents who participated in this research, without whose cooperation, help, and patience this study could not have been completed.

References

Armson, J., & Kalinowski, J. (1994). Interpreting results of the fluent speech paradigm: Difficulties in separating cause from effect. Journal of Speech and Hearing Research, 37, 69–82.
Bell-Berti, F., & Krakow, R. A. (1991). Anticipatory velar lowering: A coproduction account. Journal of the Acoustical Society of America, 90, 112–123.
Conture, E. (2001). Stuttering: Its nature, diagnosis, and treatment. Boston: Allyn & Bacon.
Conture, E. (1991). Young stutterers’ speech production: A critical review. In H. F. M. Peters, W. Hulstijn, & C. W. Starkweather (Eds.), Speech motor control and stuttering (pp. 365–384). Amsterdam: Elsevier/Excerpta Medica.
Conture, E., Ohde, R., & Melnick, K. (1998). Phonological priming of children who stutter: Preliminary findings. Paper presented at the annual convention of the American Speech-Language-Hearing Association, San Antonio, TX.
Daniloff, R., & Hammarberg, R. (1973). On defining coarticulation. Journal of Phonetics, 1, 239–248.
Dunn, A., & Dunn, A. (1981). Peabody Picture Vocabulary Test–Revised. Circle Pines, MN: American Guidance Service.
Gay, T. (1978). Effect of speaking rate on vowel formant movements. Journal of the Acoustical Society of America, 63, 223–230.
Geitz, B. (1998). The development of stop consonant place of articulation in preadolescent children. Unpublished master’s thesis, Vanderbilt University, Nashville, TN.
Goldman, R., & Fristoe, M. (1986). Goldman-Fristoe Test of Articulation. Circle Pines, MN: American Guidance Service.
Hall, K. D., Amir, O., & Yairi, E. (1999). A longitudinal investigation of speaking rate in preschool children who stutter. Journal of Speech, Language, and Hearing Research, 42, 1367–1377.
Hamby, M. J. (2000). The development of coarticulatory and segmental properties in nasal + vowel syllables. Unpublished master’s thesis, Vanderbilt University, Nashville, TN.
Henke, W. (1966). Dynamic articulatory model of speech production using computer simulation. Unpublished doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA.
Howell, P., & Vause, L. (1986). Acoustic analysis and perception of vowels in stuttered speech. Journal of the Acoustical Society of America, 79, 1571–1579.
Hresko, W., Reid, D., & Hammill, D. (1991). Test of Early Language Development. Austin, TX: Pro-Ed.
Kent, R. (1984). Stuttering as a temporal programming disorder. In R. F. Curlee & W. H. Perkins (Eds.), Nature and treatment of stuttering: New directions (pp. 283–303). San Diego, CA: College-Hill Press.
Klich, R., & May, G. (1982). Spectrographic study of vowels in stutterers’ fluent speech. Journal of Speech and Hearing Research, 25, 364–370.
Kozhevnikov, V., & Chistovich, L. A. (1965). Speech: Articulation and perception. Washington, DC: Joint Publications of Research Services.
Krull, D. (1988). Acoustic properties as predictors of perceptual responses: A study of Swedish voiced stops. PERILUS (Phonetic Experimental Research at the Institute of Linguistics, University of Stockholm), VII, 66–70.
Krull, D. (1989). Second formant locus patterns and consonant-vowel coarticulation in spontaneous speech. PERILUS (Phonetic Experimental Research at the Institute of Linguistics, University of Stockholm), X, 87–108.
LaSalle, L., & Conture, E. (1995). Disfluency clusters of children who stutter: Relation of stutterings to self-repairs. Journal of Speech and Hearing Research, 38, 965–977.
Lindblom, B. (1963). On vowel reduction. Report No. 29, The Royal Institute of Technology, Speech Transmission Laboratory, Stockholm, Sweden.
Logan, K., & Conture, E. (1997). Selected temporal, grammatical, and phonological characteristics of conversational utterances produced by children who stutter. Journal of Speech, Language, and Hearing Research, 40, 107–120.
Melnick, K., & Conture, E. (2000). Relationship of length and grammatical complexity to the systematic and nonsystematic speech errors and stuttering of children who stutter. Journal of Fluency Disorders, 25, 21–45.
Neary, T., & Shammass, S. (1987). Formant transitions as partly distinctive invariant properties in the identification of voiced stops. Canadian Acoustics, 15, 17–24.
Ohde, R. N., & Sharf, D. J. (1977). Order effect of acoustic segments of VC and CV syllables on stop and vowel identification. Journal of Speech and Hearing Research, 20, 543–554.
Perkins, W., Kent, R., & Curlee, R. (1991). A theory of neuropsycholinguistic function in stuttering. Journal of Speech and Hearing Research, 34, 734–752.
Postma, A., & Kolk, H. (1993). The covert repair hypothesis: Prearticulatory repair processes in normal and stuttered disfluencies. Journal of Speech and Hearing Research, 36, 472–487.
Riley, G. (1980). Stuttering Severity Instrument for Young Children (rev. ed.). Tigard, OR: C.C. Publications.
Robb, M., & Blomgren, M. (1996). Analysis of F2 transitions in the speech of stutterers and nonstutterers. Journal of Fluency Disorders, 21, 1–16.
Robb, M., & Wolk, L. (1997). A note on prespeech and early speech coarticulation. Logopedics Phoniatrics Vocology, 22, 99–104.
Sharf, D. J., & Ohde, R. N. (1981). Physiologic, acoustic, and perceptual aspects of coarticulation: Implications for the remediation of articulatory disorders. In N. Lass (Ed.), Speech and language: Advances in basic research and practice. New York: Academic Press.
Sussman, H., Hoemeke, K., & McCaffrey, H. (1992). Locus equation as an index of coarticulation for place of articulation distinctions in children. Journal of Speech and Hearing Research, 35, 769–781.
Sussman, H., Minifie, F., Buder, E., Stoel-Gammon, C., & Smith, J. (1996). Consonant-vowel interdependencies in babbling and early words: Preliminary examination of a locus equation approach. Journal of Speech and Hearing Research, 39, 424–433.
Sussman, H. M., & Shore, J. (1996). Locus equations as phonetic descriptors of consonantal place of articulation. Perception & Psychophysics, 58, 936–946.
Van Riper, C. (1971). The nature of stuttering. Englewood Cliffs, NJ: Prentice Hall.
Whalen, D. (1990). Coarticulation is largely planned. Journal of Phonetics, 18, 3–35.
Wingate, D. (1988). The structure of stuttering: A psycholinguistic analysis. New York: Springer-Verlag.
Wolk, L. (1990). An investigation of stuttering and phonological difficulties in young children. Unpublished doctoral dissertation, Syracuse University, Syracuse, NY.
Wolk, L., Edwards, M. L., & Conture, E. (1993). Coexistence of stuttering and disordered phonology in young children. Journal of Speech and Hearing Research, 36, 906–917.
Yaruss, S., & Conture, E. (1993). F2 transitions during sound/syllable repetitions of children who stutter and predictions of stuttering chronicity. Journal of Speech and Hearing Research, 36, 883–896.
Yaruss, S., LaSalle, L., & Conture, E. (1998). Evaluating young children who stutter: Diagnostic data. American Journal of Speech-Language Pathology, 7(4), 62–76.
Zebrowski, P., Conture, E., & Cudahy, E. (1985). Acoustic analysis of young stutterers’ fluency: Preliminary observations. Journal of Fluency Disorders, 10, 173–192.

Received September 21, 2001
Accepted April 4, 2002
DOI: 10.1044/1092-4388(2002/054)
Contact author: Ralph N. Ohde, PhD, Vanderbilt Bill Wilkerson Center, 1114 19th Avenue South, Nashville, TN 37212. E-mail: [email protected]