Amplitude modulation detection with concurrent frequency modulation

Naveen K. Nagaraj: JASA Express Letters

[http://dx.doi.org/10.1121/1.4962374]

Published Online 9 September 2016

Naveen K. Nagaraj
Cognitive Hearing Science Lab, Department of Audiology and Speech Pathology, University of Arkansas for Medical Sciences/University of Arkansas at Little Rock, 2801 South University Avenue, Little Rock, Arkansas 72204, USA
[email protected]

Abstract: Human speech consists of concomitant temporal modulations in amplitude and frequency that are crucial for speech perception. In this study, amplitude modulation (AM) detection thresholds were measured for 550 and 5000 Hz carriers with and without concurrent frequency modulation (FM), at AM rates crucial for speech perception. Results indicate that adding 40 Hz FM interferes with AM detection, more so for the 5000 Hz carrier and for frequency deviations exceeding the critical bandwidth of the carrier frequency. These findings suggest that future cochlear implant processors encoding speech fine structure may consider limiting the FM to a narrow bandwidth and to low frequencies. © 2016 Acoustical Society of America

[QJF] Date Received: May 5, 2016

Date Accepted: July 29, 2016

1. Introduction

Temporal modulations in amplitude and frequency are the fundamental building blocks of complex communication signals produced by humans, animals, and birds. Human speech is rich in dynamic amplitude modulation (AM) and frequency modulation (FM) cues that are decoded in our auditory system. Speech contains distinct AM near the syllabic rate of 3 to 4 Hz, and speech perception studies suggest that AM frequencies below 20 Hz are crucial and sufficient for accurate speech recognition in quiet (Drullman et al., 1994; Shannon et al., 1995). Most current cochlear implant (CI) processors mainly extract low-frequency AM information and transmit it to specific electrodes depending on the spectral channel, achieving great success in speech recognition. However, research evidence suggests that AM cues alone cannot support good speech recognition in noise (Fu and Shannon, 1999; Stickney et al., 2004); FM cues such as formant transitions, changes in pitch, and the vibratory pattern of an individual’s vocal folds also help in speech recognition. Studies designed to understand the relative importance of AM and FM cues suggest that FM information extracted from the fine structure of the speech signal is important for speech recognition in noise, speaker identification, music perception, and tonal language perception (Smith et al., 2002; Zeng et al., 2005). Concurrent AM and FM are the hallmark of speech, and these findings imply that AM and FM cues offer distinct, complementary information about the speech signal and are crucial for segmenting and understanding target speech in complex listening situations. Based on these results, speech coding strategies for CI processors have been proposed to encode FM cues along with AM cues (Nie et al., 2005).
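To make the notion of concurrent AM and FM concrete, a sinusoidal carrier carrying both modulations can be written as s(t) = [1 + m sin(2π f_am t)] sin(2π f_c t + β sin(2π f_fm t)), where β = Δf / f_fm is the FM modulation index. The Python sketch below generates such a tone. It is an illustration only and is not the stimulus-generation procedure of this study: the 550 Hz carrier, the 40 Hz FM rate, and the 4 Hz syllabic AM rate are taken from the text, while the AM depth, the frequency deviation, the sampling rate, the duration, and the function name `mixed_modulation_tone` are assumptions chosen for the example.

```python
import math


def mixed_modulation_tone(fc, am_rate, am_depth, fm_rate, fm_dev,
                          dur=1.0, fs=44100):
    """Return samples of a sine carrier with concurrent sinusoidal AM and FM.

    fc       carrier frequency in Hz
    am_rate  amplitude-modulation rate in Hz
    am_depth AM depth m (0 = no AM, 1 = full modulation); assumed value
    fm_rate  frequency-modulation rate in Hz
    fm_dev   peak frequency deviation in Hz; assumed value
    dur, fs  duration in seconds and sampling rate in Hz; assumed values
    """
    beta = fm_dev / fm_rate  # FM modulation index
    n_samples = int(round(dur * fs))
    samples = []
    for i in range(n_samples):
        t = i / fs
        # Slowly varying envelope (the AM cue)
        envelope = 1.0 + am_depth * math.sin(2 * math.pi * am_rate * t)
        # Carrier phase with a sinusoidal frequency excursion (the FM cue)
        phase = (2 * math.pi * fc * t
                 + beta * math.sin(2 * math.pi * fm_rate * t))
        samples.append(envelope * math.sin(phase))
    return samples


# Illustrative parameters: 550 Hz carrier, 4 Hz AM, 40 Hz FM.
# The depth (0.5) and deviation (50 Hz) are placeholders, not study values.
tone = mixed_modulation_tone(fc=550.0, am_rate=4.0, am_depth=0.5,
                             fm_rate=40.0, fm_dev=50.0)
```

With this phase term the instantaneous frequency is f_c + Δf cos(2π f_fm t), so it sweeps between f_c − Δf and f_c + Δf at the FM rate while the envelope varies independently at the AM rate.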
Mixed modulation (simultaneous AM and FM) perception studies in normal-hearing listeners have demonstrated constructive summation of subthreshold FM and AM cues at low modulation rates (Moore and Sek, 1992; Ozimek and Sek, 1987). However, these studies used the same modulation rate for both AM and FM, whereas speech contains simultaneous temporal modulation in the range of 1 to 8 Hz corresponding to syllabic rate and spectral modulation