
SEGMENTAL EVIDENCE FOR METRICAL STRUCTURE IN IWAIJA

Bruce Birch
Dept. of Linguistics and Applied Linguistics, The University of Melbourne

ABSTRACT: Reduction of unstressed vowels in Australian languages has been noted by a number of authors. This study examines whether this phenomenon is present in Iwaija, a non-Pama-nyungan language of Northern Australia. An earlier intonation-based study (Birch, 1999) suggested that Iwaija is a stress accent language in which strong syllables in left-headed feet are associated with pitch accents. It was hypothesized that an examination of reduction and deletion processes might add support to this assertion. Results of the study so far show a peripheralization effect for vowels in syllables associated with pitch accents, and a further peripheralization for vowels associated specifically with L+H* accents. Conversely, while there was no significant across-the-board centralization of vowels in metrically weak syllables, vowel reduction in the weak syllables of a small set of high-frequency function words and bound morphemes resulted in significant and predictable centralization.

INTRODUCTION

The work presented here forms part of a larger study which aims to provide various kinds of acoustic evidence for prosodic constituency in Iwaija, a non-Pama-nyungan language of Northern Australia. This paper focuses on first and second formant values of vowels, in order to discover possible correlations between vowel quality and metrical structure.

It has been suggested by some observers that unstressed vowels may be centralized and elided in many Australian languages (Butcher, 1996). Butcher found that in Burarra and Kunbarlang, neighbouring languages which are genetically unrelated to Iwaija, reduction and elision of unstressed vowels are as common as in some Germanic languages such as English.

In an earlier acoustic analysis of the intonation system of Iwaija (Birch, 1999), the metrical status of syllables was identified on the basis of association with pitch accents. Syllables which were consistently associated with pitch accents were identified as metrically strong (stressed), whilst syllables which did not associate with pitch accents were identified as metrically weak (unstressed). The observed regularities supported a hypothesis that Iwaija is a stress accent language (Beckman, 1986) with a rhythmic structure based on left-headed feet. The current study aims to test this hypothesis further.

RELEVANT PHONOLOGICAL FACTS ABOUT IWAIJA

Iwaija has a triangular three-vowel system. The inventory consists of the low central vowel /a/ and the two high vowels /i/ and /u/. Possible syllable structures are [V(C)], [CV], and [CVC(C)]. Vowel length is not phonemic, although the frequent elision of intervocalic glides, especially those which are homorganic with respect to their vocalic environment, means that [CVVC] syllables are extremely common in all forms of speech. This is independent of factors such as speech rate and hypoarticulation. Such syllables also form across word boundaries under certain conditions (Birch, 2002a).

METHOD

A twenty-minute speech sample was extracted from a longer narrative in which the speaker, a male in his seventies, was asked to discuss issues relating to the joint management of Kakadu National Park. The text was transcribed using the EMU Speech Database System (Cassidy and Harrington, 1998). Around 2000 vowel tokens were labelled. A ToBI-style (Beckman and Hirschberg, 1993) intonational annotation was included. The hierarchical structure of the database allowed for isolation and subsequent analysis of the vowel subsets required for the study. F1 and F2 values were extracted for vowels using the "R" statistical computing and graphics environment. Values were checked, and adjusted in cases where the formant tracker had given misleading readings. Mean values were obtained and plotted on the Bark scale.

Proceedings of the 9th Australian International Conference on Speech Science & Technology, Melbourne, December 2 to 5, 2002. Australian Speech Science & Technology Association Inc. Accepted after abstract review.


RESULTS

All vowels

Figure 2 shows the mean formant frequencies for all vowels in the sample. The frequencies are spaced in accordance with the Bark scale, a measure of auditory similarity, so that the distance between any two vowels reflects how far apart they sound. Since most of the energy is in the first formant, the F1 scale is expanded in relation to the F2 scale.

Figure 2. Mean formant values for all vowels.
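The Hz-to-Bark conversion behind such a plot can be sketched as follows. The paper does not say which conversion formula was used, so this example assumes Traunmüller's (1990) widely used approximation; the function names `hz_to_bark` and `bark_distance` are illustrative only.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's (1990) approximation, without the
    corrections he gives for the extreme low and high ends of the scale.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_distance(v1, v2):
    """Euclidean distance in Bark between two (F1, F2) pairs given in Hz."""
    d_f1 = hz_to_bark(v1[0]) - hz_to_bark(v2[0])
    d_f2 = hz_to_bark(v1[1]) - hz_to_bark(v2[1])
    return (d_f1 ** 2 + d_f2 ** 2) ** 0.5
```

Because the scale is compressive, a 100 Hz difference in F1 corresponds to a larger Bark difference than a 100 Hz difference in F2, which is one reason plots of this kind are often drawn in Bark and not in Hz.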

Strong versus weak.

Figure 3 shows vowels in metrically strong syllables (aS, uS, iS) plotted against vowels in metrically weak syllables (aW, uW, iW). Tokens which were associated with pitch-accents were excluded in order to eliminate accentual effects from the sample. Vowels in phrase-final syllables were also excluded since their association with boundary tones introduced variation unrelated to metrical strength.

Figure 3. Mean formant values for vowels in metrically strong syllables compared to metrically weak syllables.
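The subsetting described for the strong/weak comparison might be sketched like this. The record fields (`vowel`, `strong`, `accented`, `phrase_final`, `f1`, `f2`) are hypothetical and do not reflect the actual structure of the EMU database used in the study.

```python
from collections import defaultdict
from statistics import mean


def mean_formants_by_strength(tokens):
    """Return mean (F1, F2) per label such as 'iS' or 'uW'.

    Each token is a dict with hypothetical keys: 'vowel' (str),
    'strong', 'accented', 'phrase_final' (bool), 'f1', 'f2' (float, Hz).
    Accented and phrase-final tokens are skipped, mirroring the
    exclusions described in the text.
    """
    groups = defaultdict(list)
    for tok in tokens:
        if tok["accented"] or tok["phrase_final"]:
            continue  # remove accentual and boundary-tone effects
        label = tok["vowel"] + ("S" if tok["strong"] else "W")
        groups[label].append((tok["f1"], tok["f2"]))
    return {
        label: (mean(f1 for f1, _ in vals), mean(f2 for _, f2 in vals))
        for label, vals in groups.items()
    }
```

Filtering before grouping keeps the exclusion criteria in one place, so a different comparison (for example, accented versus all vowels) only needs a different condition.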


This plot illustrates two points of interest. Firstly, the high vowels show more variation than the low vowel, which shows virtually none. Secondly, high vowels in strong syllables are centralized compared to high vowels in weak syllables, and to the set of all vowels (see Figure 2). I will comment only on the second point here, as the first requires further examination.

The more centralized position of high vowels in strong syllables is at least partly due to the fact that the strong syllables represented here are the subset of strong syllables which do not bear accent. These are often found in hypo-articulated (Lindblom, 1990) stretches of speech, particularly in anacruses. Metrically weak vowels, on the other hand, are often hyperarticulated. This is especially the case in immediately post-accentual syllables, in which they frequently align with the delayed peak of a high accent.

Many of the deaccented tokens in the sample occur in anacruses. This means that words which occur frequently in anacruses, and consequently the consonantal contexts which these words provide, are over-represented in the sample. Hence the relatively high F1 of /iS/ is to some extent explained by the frequent occurrence in the sample of the discourse particle ngindi, which has the centralized vowels common to many function words. Likewise, the relatively high F1 of /uS/ is largely explained by the frequent occurrence in the sample of the second-person pronoun nuyi, in which the pre-palatal /u/ is typically lowered. Confounding factors like these tend to render an across-the-board strong/weak comparison ineffective in a language which does not have the entrenched level of rhythmically motivated phonological vowel reduction exhibited by English.

Accentual effects (1)

Figure 4 clearly shows the more peripheral position of vowels associated with pitch accents (a*, u*, i*) compared to the set of all vowels (a, u, i). This suggests the possibility that hyperarticulation (Lindblom, 1990) may be a correlate of accent. Interestingly, the vowels differ in terms of the axis along which they vary. For the high vowels, variation is largely in terms of F2, an indicator of backness. For the low vowel the relevant parameter is F1, an indicator of height.

Figure 4. Mean formant values for accented vowels compared to those for all vowels.
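One way to put a number on "more peripheral" is the distance of a vowel's mean (F1, F2) from the centre of the vowel space, measured in Bark. This is a minimal sketch of that idea and not the procedure used in the paper, which compares plotted means; the Bark formula (Traunmüller's approximation) and all function names are assumptions.

```python
from math import hypot


def hz_to_bark(f_hz):
    """Hz to Bark, assuming Traunmueller's approximation."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def to_bark(f1_hz, f2_hz):
    """Convert an (F1, F2) pair from Hz to Bark."""
    return (hz_to_bark(f1_hz), hz_to_bark(f2_hz))


def centroid(points):
    """Arithmetic mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def peripherality(point, centre):
    """Euclidean distance of a point from the centre of the vowel space."""
    return hypot(point[0] - centre[0], point[1] - centre[1])
```

With the centre taken as the centroid of the mean /a/, /i/ and /u/ positions for all vowels, an accented subset would count as more peripheral if its `peripherality` value exceeds that of the corresponding all-vowel mean.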

Accentual effects (2)

Bitonal (L+H*) accented vowels show a further peripheralization compared to the set of all accented vowels. This is unsurprising, since bitonal accents result in increased duration and intensity of the vowel with which they associate. The generally greater salience of segments associated with L+H* accents means that these accents are the type most likely to accompany the introduction of new topics into the discourse, or the highlighting of important topics. Some evidence for the special status of L+H* accents is offered in the next section.

Figure 5. Mean formant values for L+H* accented vowels compared to accented vowels, compared to those for all vowels.

Distributional evidence for the special status of bitonal L+H* accents

The distribution of first formant values for accented low vowels (a*) is bimodal, showing a secondary peak between 700 Hz and 800 Hz. Examination of the context of the tokens represented by this peak links them to the hyperarticulation of focused words in the discourse. They are typically associated with the L+H* accent type. This is suggestive of a possible categorical difference for this accent type, although the sample is too small to confirm it.

Figure 6. Distribution of F1 values (represented on the x axis) for accented low vowels (a*).
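A simple way to look for more than one mode in a set of F1 values is to bin them and find bins that are taller than both neighbours. The sketch below illustrates that idea only; the paper does not describe how the distribution in Figure 6 was computed, and the bin width and function names are assumptions.

```python
def bin_counts(values, bin_width, start=0.0):
    """Count values per bin; bin i covers [start + i*w, start + (i+1)*w)."""
    counts = {}
    for v in values:
        idx = int((v - start) // bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    return counts


def local_peaks(counts, bin_width, start=0.0):
    """Return the left edges of bins strictly taller than both neighbours.

    Empty neighbouring bins count as zero. Flat-topped peaks (two equal
    adjacent bins) are not reported by this strict comparison.
    """
    peaks = []
    for idx in sorted(counts):
        n = counts[idx]
        if n > counts.get(idx - 1, 0) and n > counts.get(idx + 1, 0):
            peaks.append(start + idx * bin_width)
    return peaks
```

Two or more reported peaks are only a hint of bimodality. With a small sample the result is sensitive to the choice of bin width, which is consistent with the caution expressed in the text.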


Frequency effects on vowel reduction

A salient prosodic feature of Iwaija is that vowels in some frequently occurring bound morphemes, words, and constructions tend to exhibit more extreme forms of reduction than do vowels in the lexicon as a whole. This section argues that reduction to schwa in some high-frequency items may be distinguished categorically as phonological centralization, whereas vowel reduction in other cases constitutes phonetic contextual assimilation along the lines of the undershoot model first put forward by Lindblom (1963). In both cases the reduction tends to occur in metrically weak syllables.

In the syllables of words where it is argued that the reduction is phonological, however, it is assumed either that the output target is schwa, or that the vowel in these cases is phonologically "targetless". This paper does not attempt to distinguish between the two. The important point is that the centralized vowels in these words are predictable. Accordingly, vowels in these syllables are all labelled as schwa (aə, uə, iə) in the database. Formant analysis of this set of vowels shows highly significant centralization with respect to vowels in general (see Figure 7).

The many cases in which vowels with near-identical values to "schwa-labelled" vowels are not labelled as schwa are cases in which the realization is not predictable from the words in which they occur. In these cases, it is hypothesized that the reduction is a result of undershoot, or at least some kind of contextual assimilation, and that speech rate and hypo-articulation are factors. Formant analysis of this set of vowels (as represented by aW, uW and iW in Figure 3) shows no significant centralization with respect to vowels in general.
This argument is in part motivated by the usage-based phonology proposed by Bybee (2001) and Pierrehumbert (2001), which draws on Exemplar Theory (see Johnson, 1997) to suggest that some frequently occurring lexemes in a language tend to be reduced because of their predictability (see also Lieberman, 1963), and that this frequent reduction comes to bear on their representation in memory such that the reduced forms become phonologized. An often quoted English example (e.g. Bybee, 2001) involves the construction to be going to when used to convey future action, in which going to is frequently realized as [gəɾə].

An example from Iwaija is the ubiquitous demonstrative baraka. Whilst it may be heard in its fully realized form [baɻaga], such occurrences are rare and are in general the result of elicitation, whereas realizations such as [wəɻəgə] or [wəɻə] are the norm. Reduction in the metrically strong initial syllable of baraka is predictable as coarticulation with the following retroflex consonant. The centralization of the vowels in the two weak syllables which follow is predictable from the frequency of occurrence of the word in which they occur.

Another frequently occurring word in which centralization of the vowel in the weak syllable is predictable is the third person 'sequential' pronominal jamin [ɟamən]. Whereas the /i/ in jamin is realized as schwa in all contexts other than elicitation, the /i/ in the same syllable in the word burranymin ('to become bigger') is realized as a full vowel. The immediate consonantal context in both cases is identical, a fact which tends to rule out contextual assimilation per se as an explanation for the reduction in jamin. As both syllables are unstressed, they cannot be differentiated on the basis of metrical strength. Frequency of occurrence is a possible explanation, however, since jamin is one of the most frequently occurring words in the language.

Figure 7. "Phonologically reduced" vowels compared to all vowels and to accented vowels.

CONCLUSIONS

The conclusions presented here are preliminary in nature, as the current size of the database precludes more definitive statements about the prosodic structure of Iwaija. Rather, the results are suggestive of certain trends. These may be summarized as follows.

An across-the-board correlation between vowel quality and metrical strength is not supported by the study. Some correlations are supported, however. Hyperarticulation in Iwaija results in F1 and F2 values which place the affected vowels on the periphery of the vowel space. Accented vowels are more peripheral than vowels in general, and L+H* accented vowels are more peripheral than accented vowels in general. The vowels in weak syllables of some frequently occurring bound morphemes, words, and constructions show significantly centralized formant values in comparison to any other analysed vowel subset.

REFERENCES

Beckman, M.E. and Hirschberg, J. (1993) The ToBI Annotation Conventions. Unpublished MS.

Birch, B. (1999) Prominence Patterns in Iwaija. Honours Thesis. Dept. of Linguistics and Applied Linguistics, University of Melbourne. Melbourne, Australia.

Birch, B. (2002a) The IP as the Domain of Syllabification in Iwaija. Proceedings of Speech Prosody 2002. Aix-en-Provence.

Butcher, A. (1996) Some connected speech phenomena in Australian languages: universals and idiosyncrasies. In A. Simpson and M. Pätzold. Sounds and patterns of connected speech: description, models, and explanation. Proceedings of the Kiel University Symposium. Arbeitsberichte 31, 83-104.

Bybee, J. (2001) Phonology and Language Use. Cambridge: Cambridge University Press.

Johnson, K. (1997) Speech Perception without Speaker Normalization: An Exemplar Model, in K. Johnson and J.W. Mullennix (eds), Talker Variability in Speech Processing. San Diego: Academic Press, pp. 145-166.

Lieberman, P. (1963) Some Effects of Semantic and Grammatical Context on the Production of Speech. Language and Speech, 6, 172-187.

Lindblom, B. (1963) A Spectrographic Study of Vowel Reduction. Journal of the Acoustical Society of America, 35, 1773-1781.

Lindblom, B. (1990) Explaining phonetic variation: A Sketch of the H&H Theory, in W.J. Hardcastle and A. Marchal (eds), Speech Production and Speech Modelling. Dordrecht: Kluwer, pp. 403-439.

Pierrehumbert, J. (2001) Exemplar dynamics: Word frequency, lenition, and contrast, in J. Bybee and P. Hopper (eds), Frequency effects and the emergence of linguistic structure. Amsterdam: John Benjamins, pp. 137-157.
