Disability and Rehabilitation: Assistive Technology, 2012; 7(2): 130–138. Copyright © 2012 Informa UK, Ltd. ISSN 1748-3107 print/ISSN 1748-3115 online. DOI: 10.3109/17483107.2011.602172

Research Article

Where did that sound come from? Comparing the ability to localise using audification and audition

T. Claire Davies1, Shane D. Pinder2, George Dodd3 & Catherine M. Burns4

1University of Auckland, Mechanical Engineering, Auckland, New Zealand; 2Auckland University of Technology, Mechanical Engineering, Auckland, New Zealand; 3University of Auckland, Acoustics Research Centre, Auckland, New Zealand; 4University of Waterloo, Systems Design Engineering, Waterloo, Canada

Purpose: A prototype device was developed to allow individuals to hear ultrasound reflections off environmental obstacles. Previous studies have shown that this device allows for better distance judgement than audition and allows for effective passage through the centreline of apertures. The purpose of this research was to evaluate audification as a method to localise direct sound sources as compared to audition. Method: In an anechoic environment, participants localised point-sound sources for three conditions: auditory, audified ultrasound with receivers facing laterally, and audified ultrasound with receivers facing forward. Results: Azimuth localisation was similar within a range of −35° to 35° in front of the participant among all conditions. At the periphery, −70° and 70°, audified ultrasound was more accurate than audition for novice participants. No difference was evident in user elevation accuracy for these signals among the different conditions. Conclusion: Audification of ultrasound can be effective for localising point-source sounds in the azimuth direction, but more evidence is required to evaluate accuracy in the vertical direction.

Implications for Rehabilitation

• Secondary mobility devices can be used by individuals with visual impairment to avoid obstacles above waist height.
• Audification allows for a skill-based response, enabling intuitive obstacle avoidance and localisation of point sound sources.
• Localisation of peripheral sounds was shown in this study to be better with audified ultrasound than with audition.

Keywords:  Audification, localisation, ultrasound, visual impairment, assistive technology

Introduction

The sense of hearing allows a baby to turn toward its mother's voice, a driver to react to an approaching emergency vehicle, and a latecomer to find the way to a child's birthday party. The ability to detect a sound and respond to it is learned from a very early age. Most individuals rely on perception of environmental obstacles through visual stimuli, but those who are functionally blind must use other methods of obstacle detection. In the past, echolocation of sound has been used to determine the presence of obstacles. There are two forms of echolocation, passive and active. Passive echolocation involves listening to the sounds around the observer and using the reflected echoes to localise obstacles in the environment. Active echolocation involves generating a signal, such as a click with the tongue or a clap of the hands. Signals generated near the ears provide the most easily interpreted echoes, as sound reflects back toward the location of transmission. With the advent of the cane and the introduction of guide dogs, active echolocation is no longer taught and is considered socially unacceptable, drawing too much attention to the individual [1]. Echoes can also be highly variable depending on the situation. Although many people with visual impairment capitalise on echolocation, few use it as their primary means of travel, owing both to interfering environmental influences and to perceived social rejection.

The long white cane is often used to perform haptic exploration and provides some echolocation cues, but obstacles that are not ground-based, such as wall-mounted bookcases, tree branches, or pedestal display cases, are difficult to detect with it. Although auditory devices have been developed to provide information above waist height [2], these do not provide interfaces that are easy to learn. The primary and secondary mobility devices currently used for travel have limitations that may be overcome by providing the visually impaired individual with the same information that would be used for echolocation.

Correspondence: T. Claire Davies, University of Auckland, Mechanical Engineering, Private Bag 92019, Auckland, 1142, New Zealand. Tel: 64 9 373 7599, ext 81898. E-mail: [email protected] (Accepted June 2011)



A need was identified by orientation and mobility instructors at the Ross McDonald School for the Blind for the development of an effective, easy-to-use secondary mobility device that provides information such as the location and material properties of obstacles, especially those above waist height. Such devices should adapt to the user, not require adaptation by the user.

Secondary mobility devices have been developed for individuals with functional blindness to enable auditory perception of the environment [2,3]. These include sonar and imaging devices which, for a determined, diligent user with many hours of dedicated training, can provide auditory "pictures" through sonification, the mapping of data streams onto auditory dimensions [4]. These devices have often been designed to provide as much information as possible by mapping distance to pitch and location to sound intensity, but such mappings are difficult to interpret [5–7]. Some sonar terminology is analogous to that of audition: the microphone is known as a receiver and the speaker as a transmitter. These terms will be used throughout this paper to enable the reader to differentiate between aspects of auditory localisation and ultrasound localisation (both are discussed).

More recent secondary mobility systems attempt to use sonification to provide more realistic localisation cues [8,9]. These systems measure head motion and apply head-related transfer functions to the signal in the form of interaural phase and intensity differences [10–12]. Devices to measure head motion in three dimensions are costly and are unlikely to be widely accepted unless miniaturisation can be achieved economically. Others provide chirps or tweets similar to bats and require significant signal processing [13,14], but human ears cannot move as the pinnae of bats do [15]. Rather than relying on complex signal processing to enable detection of obstacles, one should consider the ears of dolphins and whales and the ultrasound methods they use to communicate. These mammals cannot sweep the environment with their ears and instead use Doppler and motion perception to detect prey [16–18].

A prototype device (AUDEO: Audification of Ultrasound for Detection of Environmental Obstacles) that enables audification of ultrasound Doppler signals at normal walking speeds was developed using user-centred design methods [19] to allow individuals with visual impairment to detect and avoid obstacles within the environment. Audification is the direct translation of ultrasound into auditory sound. It is thought to be a skill-based behaviour that allows for intuitive response or direct perception [20]. Gibson hypothesised that response to environmental features is an innate behaviour such that perception and proprioception are complementary [21]. This intuitive response should thus enable efficient kinaesthetic response to environmental obstacles. An individual with minimal training should be able to use this method to detect and avoid environmental obstacles without the need for a full cognitive map. Previous experiments have shown that the AUDEO device can be used by novices to effectively judge distances and avoid apertures with minimal training [22].

A similar device, with only forward-facing receivers, has also been trialled with blind travellers [23]. However, it is also important to understand how effectively point sound sources can be localised through audification. The main purpose of this experiment was to determine the effectiveness with which localisation could be achieved using direct auditory signals and audification of ultrasound signals.

Spatial localisation requires both directional and distance parameters. With these, sound can be used to create a spatial map for orientation and locomotion. Localisation studies fall into two broad categories: those that hold the head in a fixed position [11,24,25] and those that allow head movement, or motional studies [26–30]. Although considerable research has been done in the area of static localisation, driven by the demand for surround-sound and virtual systems, there has been very limited research in the area of head motion for localisation. Most localisation studies present a stimulus to an observer with the head maintained in a stationary position, either with a device to fix the head or with a burst of sound too short for head motion to have an effect (see [30] for a review of stationary and motional studies). These studies evaluate localisation in both the horizontal and vertical directions without accounting for head movement. This research allows for refinement of the Duplex Theory (interaural phase and intensity differences) and further development of head-related transfer functions, which are based on the structure of the pinna and allow for the creation of virtual environments of simulated sound [24,31]. However, turning the head toward a source has been shown to be more effective for localising sound sources than stationary localisation [26,29]. Individuals naturally move the head to hear the source of an auditory stimulus [32], and any ambiguities can be resolved using head motion. This action is evident in infants as young as 14 weeks [33]. Sound localisation based on head movement is known to be very effective but requires a signal long enough to allow for this type of movement [27–30,34].

Hearing aid users must be retrained to localise with head movement, as the usual effect of the pinna (outer ear) is absent [35,36]. Since the motion of the head is not typically monitored in hearing aids, individualised head-related transfer functions cannot be applied. Localisation with hearing aids therefore depends largely on the ability to reorient the head, rather than on the shape of the ear. Resolution can occur in the horizontal plane (azimuth) because two sensors (ears) provide sufficient information to resolve the position along a cone of confusion, but once the nominal horizontal position of the sound is determined, head motions up and down are required to enable vertical localisation (elevation).

The AUDEO prototype device is shown in Figure 1 [37]. An ultrasound transmitter (400STR100, Midas Components, UK) is mounted forward-facing on the head, with the receivers mounted on a headset at the ears. The orientation of the receivers can be changed, and one focus of this testing was to determine whether the receivers should be oriented laterally or forward. Lateral orientation means that the intensity of a received sound is at a maximum when it is directly to the side of the user, just as in normal human hearing. Sound transmitted from a speaker is not omnidirectional and is higher in intensity in the direction the speaker is facing.


Ultrasound transmission is likewise directional in its distribution (though highly dependent on the source). With a head-mounted transmitter, the reflections are strongest in the direction of transmission, that is, forward. All commercially available secondary mobility devices collect signals from receivers facing in the same direction as the transmitter, oriented above the eyes, to ensure that obstacles immediately in front of the individual are detected [2,38]. Previous research has found, however, that receivers mounted in the lateral orientation may be better for obstacle avoidance than forward-facing receivers [22].

The AUDEO device allows normal ambient sounds to be heard by the user, as well as sounds resulting from ultrasound reflections off objects in the environment. It is the characteristics of these reflections that allow the user to create an auditory map, or soundscape, of the environment. To understand the characteristics of the resulting soundscape, imagine that an object in the environment emits a tone whose frequency is proportional to the rate at which the distance between that object and the user is changing. If the user walks toward a wall at a constant speed, the wall is heard as a tone of constant frequency whose volume increases as the distance decreases. Normal hearing allows localisation of such a sound in the reference frame of the user's head through interaural intensity differences. Now further imagine that every object that moves relative to the user emits a similar, though slightly different, tone, representative mostly of the rate of closure with that object but also influenced by the texture of the object. Such objects include other human beings, so someone waving a hand back and forth is heard as a fluctuating frequency, characteristic of the manner of movement. If an obstacle such as a pedestrian is moving to the left of the user, the user hears a sweeping sound that is greater in intensity on the left than on the right.
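To make this closure-rate mapping concrete, consider the two-way Doppler shift of the 40 kHz carrier reflected from a stationary obstacle. The short Python sketch below is illustrative only: the factor-of-two approximation f_d = 2·v·f0/c, the assumed speed of sound, and the walking speeds are not taken from the paper; only the 40 kHz transmitter frequency is.

```python
# Illustrative sketch (not from the paper): the audible tone that Doppler
# audification would produce for an obstacle approached at walking speed.
# Assumes the standard two-way Doppler approximation f_d = 2*v*f0/c for a
# reflection off a stationary object; the speed of sound and the walking
# speeds below are assumptions.

F_CARRIER = 40_000.0    # Hz, transmitter frequency (400STR100 transducer)
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 degrees C (assumed)

def audified_tone_hz(closure_speed_mps: float) -> float:
    """Approximate Doppler shift heard after downconversion when closing
    on a stationary reflector at the given speed."""
    return 2.0 * closure_speed_mps * F_CARRIER / SPEED_OF_SOUND

for v in (0.5, 1.4, 2.0):  # slow, normal and brisk walking speeds (m/s)
    print(f"closure speed {v:.1f} m/s -> tone of ~{audified_tone_hz(v):.0f} Hz")
```

At a normal walking speed of about 1.4 m/s this gives a tone near 330 Hz, comfortably audible, which is consistent with the description above of a wall heard as a constant-frequency tone during a constant-speed approach.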

Figure 1.  Photograph of a participant wearing the device. The top image shows the receivers oriented in the lateral direction and the bottom shows receivers in the forward-facing direction.

Naturally occurring environmental sounds pass through the receivers unimpeded, as 40 kHz is a sufficient sampling frequency to leave the full range of normal human audio unaffected and available to the user as usual. Dynamics of the head are intuitively incorporated in the hybrid system in exactly the same way as in normal hearing (i.e. without additional technology), as long as the necessary instrumentation moves with the observer's head. The design of this prototype assumes that head movement provides sufficient information for localisation [29,30].

The purpose of this project was to determine how effectively individuals with no training in localisation would be able to locate sound sources with audified ultrasound as compared to sound in the auditory range (audition). The transmitter was removed from the prototype system (the greyed portion of Figure 2), which prevented the system from being used to hear reflections. Instead, a series of transmitters (Figure 3) was mounted around the anechoic chamber (Figure 4) to allow the device to receive direct signals.

Figure 2.  A block diagram showing the operation of the AUDEO device. For this experiment, the transmitter was removed, and the sound was generated from an external source. Thus, the greyed areas were removed from the system.

Figure 3.  Diagram showing the transmitters located at the sound-source location. The ultrasound transmitter, the loudspeaker and the LED were collocated. When the stimulus was turned on, the LED was also illuminated to allow its detection in the head-mounted camera image.



Methods

Experimental design
This was a four-factor repeated-measures, within-subject design: two levels of experience (novice and experienced), three conditions (audition, ultrasound with laterally mounted receivers, ultrasound with forward-mounted receivers), five azimuth locations (each 1.9 m away from the participant, at angles of −70°, −35°, 0°, 35°, and 70°, with 0° directly in front of the participant), and two elevations (eye level, 0°, and 18° lower). For each trial, the coordinates of the "look direction" were determined from an image taken by a head-mounted camera and compared to the "actual" location of the sound source, which was collocated with an LED (light-emitting diode) (Figure 3). Head pointing has been found to yield better localisation performance than arm pointing [39,40], suggesting that the most accurate results would be obtained with a head-mounted camera.

Figure 4.  Experimental apparatus for the localisation experiment performed in the anechoic chamber. The participant sat at the centre of a circle of sound-source locations of radius 1.9 m. The experimenter sat behind and provided the stimuli from a central control board.


Conditions
Two ultrasound conditions were evaluated: audification with receivers oriented forward (the same direction as the eyes) and audification with receivers oriented laterally (the same direction as the ears). These ultrasound conditions were compared to audition (unoccluded ears).

Hypotheses
It was hypothesised, first, that audified ultrasound with the receivers oriented laterally would be localised better than audified ultrasound with the receivers facing forward. The two previous experiments with the AUDEO prototype showed that lateral orientation of the receivers provided greater accuracy than forward orientation [22]. Since this experiment required a direct comparison of auditory localisation to audified localisation by the same users, one would also expect better localisation with laterally mounted receivers. Second, it was expected that azimuthal localisation would be better with audition than with ultrasound audification: the participants had normal hearing and were already familiar with localising sound sources in their environments, and this prior experience should make them more effective with audition than with audified ultrasound. Finally, it was hypothesised that localisation accuracy with respect to elevation would be better with audition than with audified ultrasound. The shape of the pinna (outer ear) enables localisation in the vertical direction, whereas the receivers were not specifically designed to resolve elevation differences.

Participants
Eight females and seven males (six between 20 and 29 years, four between 30 and 39 years, four between 40 and 49 years, and one between 50 and 59 years) with normal hearing by self-assessment volunteered to participate. These participants also had normal vision. Minimising risk to participants is required during device development, and evidence-based design is required for acceptance of the device and minimisation of abandonment [41]. Once an evidence base exists to support effective device function, further testing with individuals with visual impairments can be undertaken. Eight of the participants were novices who had never used the device; the other seven had participated in one of the previous experiments [22] and formed a separate, experienced group. Each participant read and signed an informed consent form approved by the University of Waterloo Human Ethics Committee and reviewed by the University of Auckland Human Ethics Committee.

As this test involved precise localisation under all three conditions, it was important to characterise hearing in each ear. Although some individuals have frequency sensitivities ranging from 20 Hz to 20 kHz, a standard ANSI audiogram [42] is the accepted measure of hearing and was used to confirm a normal hearing level. Each ear was tested at frequencies of 4000 Hz, 2000 Hz, 1000 Hz, 500 Hz, 250 Hz, and 125 Hz. Each tone was played starting at 30 dB HL (hearing level) and decreased by 5 dB HL after each response. A plot of frequency versus sound pressure level for each ear was developed and reviewed. All minimum values were within the ANSI normal range of hearing of 10 dB HL to 25 dB HL.

Stimuli
Two stimuli were used. For audition, broadband white noise in the range of 0–14 kHz was chosen, as broadband signals are more easily localised than simple tones [30,43]. For ultrasound audification, broadband noise in the frequency range of 26–40 kHz was used; after digital downconversion, this signal was reduced to 0–14 kHz. Each signal was played for 5 seconds at 40 dB HL, long enough to allow sufficient head rotation, pivot, and tilt [30] to enable localisation [27], and the collocated LED was illuminated for the duration of the stimulus.
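The paper specifies only that the 26–40 kHz band was digitally downconverted to 0–14 kHz, not how. The sketch below shows one standard way to achieve such a mapping, heterodyning with a 26 kHz local oscillator followed by low-pass filtering; the 96 kHz sample rate, the filter order, and the cut-off are assumptions, not the authors' implementation.

```python
# Minimal sketch of heterodyne downconversion, one plausible way to map a
# 26-40 kHz band onto 0-14 kHz. The paper does not describe its actual
# signal chain; the sample rate, the mixer/low-pass approach and the
# filter parameters here are all assumptions.
import numpy as np
from scipy.signal import butter, lfilter

FS = 96_000    # Hz, sample rate (assumed; must exceed twice the 40 kHz band edge)
F_LO = 26_000  # Hz, local oscillator: moves the 26 kHz band edge down to 0 Hz

def downconvert(ultrasound: np.ndarray) -> np.ndarray:
    """Mix the band-limited ultrasound signal down by 26 kHz, then
    low-pass filter to keep only the audible 0-14 kHz difference band."""
    t = np.arange(len(ultrasound)) / FS
    mixed = ultrasound * np.cos(2.0 * np.pi * F_LO * t)  # sum and difference bands
    b, a = butter(6, 14_000, btype="low", fs=FS)         # reject the high sum band
    return lfilter(b, a, mixed)
```

Because a heterodyne is a pure frequency shift, broadband noise at 26–40 kHz emerges as broadband audible noise at 0–14 kHz, matching the bandwidth of the auditory stimulus.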



Counterbalancing
The order of the conditions was counterbalanced among individuals to reduce the effect of crossover information (one way of realising these ordering constraints is sketched in the code example following the Procedure subsection). For each successive group of three participants, each condition was presented before and after every other condition equally often. Trials within each condition were completely randomised, with the exception that the stimulus from the same location could not be repeated consecutively more than twice. As there were two experience groups, true counterbalancing was achieved after twelve individuals had completed the experiment, six in each group. A further three volunteered to participate, and these were counterbalanced relative to each other. The same statistical effects were observed with twelve participants as with fifteen.

Apparatus
To reduce the effect of external environmental sounds and uncontrolled reflections that might influence the results, a 5 m × 5 m anechoic chamber was used. The supporting floor of the chamber was an acoustically transparent mesh, below which the structural floor was covered with absorbing foam wedges. A central control box was used by the experimenter to select which loudspeaker or ultrasound transmitter would radiate the test signal. The stimulus was played from one of ten sound-source locations: two elevations (0° and 18° below eye level) at each of five azimuths, 35° apart, on a circle of 1.9 m radius around the participant (Figure 4).

Procedure
Each participant completed three sessions: one for audition, one for audified ultrasound with receivers mounted laterally at the ears, and one for audified ultrasound with receivers mounted forward at the ears. For each session, the individual was seated in a chair and blindfolded, and a head-mounted camera was secured to the forehead with elastic headgear to prevent camera movement. For audition, only the camera was mounted on the participant; for ultrasound audification, both the AUDEO system and the camera were mounted. In each session the stimulus was played five times from each source position (five azimuth locations at two elevations) for a total of fifty trials. A trial consisted of the participant facing forward, the experimenter playing the stimulus from the source, and the participant turning the head to face the direction of the source and maintaining that position while an image was taken by the head-mounted camera. For each condition, participants were permitted three practice trials to hear the stimulus at the extreme left, the centre and the extreme right at one elevation. After these three practice trials, testing began. Participants were not told how many sound sources existed, nor were they made aware of the differences in elevation. The AUDEO system was designed to allow two output devices (earphones) to be plugged into the processor (though only the participant could control the direction of the sound by head motion), thereby allowing the experimenter to hear what the participant heard. The experimenter sat behind the participant and listened to all the audified sound signals at the same time as the individual. This allowed the experimenter to make general qualitative observations about the various techniques used in processing the information.
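The paper does not publish its assignment tables, but the two ordering constraints described under Counterbalancing can be realised as follows; the condition names, the permutation-cycling scheme, and the rejection-sampling of trial orders are illustrative choices, not the authors' procedure.

```python
# Illustrative sketch of the two ordering constraints described above; the
# paper does not publish its actual assignment scheme. Cycling through all
# six permutations of the three conditions gives full counterbalancing
# after six participants per group, and trial orders are redrawn until no
# source location appears more than twice in a row.
import random
from itertools import permutations

CONDITIONS = ("audition", "ultrasound_lateral", "ultrasound_forward")

def condition_order(participant_index: int) -> tuple[str, ...]:
    """Condition order for the nth participant within a group (0-based)."""
    orders = sorted(permutations(CONDITIONS))  # 6 possible orders
    return orders[participant_index % len(orders)]

def trial_order(n_locations: int = 10, reps: int = 5,
                rng: random.Random | None = None) -> list[int]:
    """Random order of reps trials per source location, rejecting any
    draw in which the same location occurs three or more times in a row."""
    rng = rng or random.Random()
    trials = [loc for loc in range(n_locations) for _ in range(reps)]
    while True:
        rng.shuffle(trials)
        if all(not (trials[i] == trials[i + 1] == trials[i + 2])
               for i in range(len(trials) - 2)):
            return list(trials)
```

Cycling through all six permutations yields true counterbalancing after six participants per group, which is consistent with the paper's observation that counterbalancing was complete at twelve participants across the two groups.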

Figure 5.  Image from the head-mounted camera showing the calculation of the x and y error.


Analysis
From each head-mounted camera image, the location of the LED mounted with the sound source was identified. The LED was 5 mm in diameter, which resulted in a cross of approximately three pixels by three pixels in the image. The coordinates of the LED within the image were determined using a Matlab image-processing routine. Normalisation of the sound-source locations based on the digital image was required to determine the calibrated centre of the image, owing to variability between individuals in the mounting of the camera (different head sizes, heights, and orientations). For each of the three conditions, the position data from all the trials (a total of fifty trials for each condition and each subject) were averaged to determine the calibrated centre of the image. The displacement of the LED from this calibrated centre in each image was then taken as the observed error: the azimuth error in the "x" direction and the elevation error in the "y" direction, as shown in Figure 5.

Two independent repeated-measures ANOVAs were performed, one for the elevation error and the other for the azimuth error. Resolution is more easily achieved in the horizontal plane (azimuth), as two sensors (ears) can provide sufficient information to resolve position (with back-to-front ambiguity), but head motion is required to enable resolution in another degree of freedom, such as vertical localisation (elevation). Each ANOVA examined the error relations of experience × condition × location × height (2 × 3 × 5 × 2). A Tukey post hoc test was performed on any differences identified by the ANOVAs as having a p value of less than 0.05.
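The authors used a Matlab routine for this step; the following Python sketch re-expresses the same idea. The brightest-pixel detection rule and the array layout are illustrative assumptions, not the authors' code.

```python
# Python re-expression of the analysis described above (the authors used a
# Matlab routine; the brightest-pixel rule and the data layout here are
# illustrative assumptions, not their code).
import numpy as np

def led_position(gray_image: np.ndarray) -> tuple[float, float]:
    """(x, y) pixel coordinates of the LED, taken here as the brightest
    pixel; the real LED spanned roughly 3 x 3 pixels, so a small
    neighbourhood centroid would be a natural refinement."""
    row, col = np.unravel_index(np.argmax(gray_image), gray_image.shape)
    return float(col), float(row)

def localisation_errors(led_xy: np.ndarray) -> np.ndarray:
    """Given an (n_trials, 2) array of LED positions for one participant
    and condition, subtract the calibrated centre (the mean position over
    all fifty trials) to obtain per-trial azimuth ('x') and elevation
    ('y') errors, as in Figure 5."""
    calibrated_centre = led_xy.mean(axis=0)
    return led_xy - calibrated_centre
```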

Results

Azimuth error
The azimuth error was defined as the distance from the calibrated centre of the image to the "x" position of the LED. Figure 6 shows the results of these calculations.



Figure 6.  The three-way interaction of experience × condition × location for the distance from the calibrated centre of the image in the azimuth direction. Error bars represent the standard error of the mean. The # marks azimuth location estimates that were significantly different from all others; * and ** mark values that were significantly different from each other.

There was a three-way interaction effect of experience × condition × location (F(8,1617) = 2.98, p