Why is sound localization important?
Three of the groups underwent sound localization testing and training in a virtual environment, each utilizing a different training paradigm. The fourth group acted as a control and remained in a quiet room between testing sessions. A set of 19 acoustically complex stimuli was developed. The sounds were designed to be consistent with the virtual environment and to contain rich cues for localization.
As such, they were designed to resemble a short radio communications transmission; each stimulus comprised unique noise and speech fragments. A schematic of the stimulus is shown in Fig. From this set, a single stimulus was used only during testing, whilst the other 18 stimuli were drawn at random during training.
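As a loose illustration of this style of stimulus construction, the sketch below concatenates randomly chosen noise and speech fragments into a short, radio-transmission-like signal. The fragment counts, durations, and gap lengths are placeholders rather than the study's actual recipe.

```python
import random
import numpy as np

def build_stimulus(speech_fragments, noise_fragments, fs=48000, seed=None):
    """Concatenate randomly chosen noise and speech fragments into a short,
    radio-transmission-like stimulus. Fragment choice, counts and gap length
    are placeholders, not the study's actual specification."""
    rng = random.Random(seed)
    parts = [rng.choice(noise_fragments),
             rng.choice(speech_fragments),
             rng.choice(noise_fragments)]
    gap = np.zeros(int(0.05 * fs))        # 50 ms silence between fragments
    out = []
    for p in parts:
        out.extend([p, gap])
    return np.concatenate(out[:-1])       # drop the trailing gap

# Toy fragments (white noise and a tone standing in for speech):
fs = 48000
noise = [0.1 * np.random.randn(int(0.2 * fs)) for _ in range(3)]
speech = [0.2 * np.sin(2 * np.pi * 220 * np.arange(int(0.5 * fs)) / fs)]
stim = build_stimulus(speech, noise, fs, seed=0)
```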
In the schematic of the experimental design (Fig.), testing blocks are shown in orange and training blocks in blue; each session was carried out on a different day. Two HRTFs were randomly selected from a subset of the HRTF database used, which a previous study had determined to contain the seven HRTFs that produced the best subjective spatialization for the most listeners. The first of these HRTFs was used to spatialize sound during both training and testing; the second was used only during testing. The interaural time differences were left unmodified, as per the original HRTFs.
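To make the spatialization step concrete: binaural rendering of this kind amounts to convolving a mono stimulus with the left and right head-related impulse responses for the target direction. The sketch below is an illustration only, not the LIMSI Spatialization Engine used in the study; the toy impulse responses, nearest-direction lookup being omitted, and the normalization step are all assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

def spatialize(mono, hrir_left, hrir_right):
    """Convolve a mono stimulus with a left/right HRIR pair (sketch only)."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    stereo = np.stack([left, right], axis=-1)
    # Normalize to avoid clipping when written to a fixed-point audio file.
    return stereo / np.max(np.abs(stereo))

# Usage with hypothetical arrays standing in for a measured HRIR pair:
fs = 48000
mono = np.random.randn(fs)                 # 1 s of noise as a stand-in stimulus
hrir_l = np.zeros(256); hrir_l[10] = 1.0   # toy impulse responses: left ear earlier
hrir_r = np.zeros(256); hrir_r[40] = 0.5   # and louder, i.e. a source to the left
binaural = spatialize(mono, hrir_l, hrir_r)
```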
All stimuli were generated and stored in advance. Throughout this manuscript, two spherical coordinate systems are used. One of these is intuitive and mathematically convenient and is therefore used in the subsequent description of target orientation, as well as in the experiment software.
See reference 54 for a diagram illustrating these coordinate systems. For each participant, the experiment comprised three sessions, each completed on a different day and separated by no more than two days.
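The two systems referred to above are typically a vertical-polar (azimuth/elevation) description and an interaural-polar (lateral/polar angle) description; whether those are exactly the conventions used in this study is an assumption here. Under that assumption, a minimal conversion sketch looks like this:

```python
import numpy as np

def vertical_polar_to_interaural_polar(azimuth_deg, elevation_deg):
    """Convert azimuth/elevation (vertical-polar) to lateral/polar angles
    (interaural-polar). Conventions assumed: x = front, y = left, z = up,
    azimuth positive to the left, elevation positive upwards; the study's
    own conventions may differ (see its reference 54)."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    x = np.cos(el) * np.cos(az)            # front
    y = np.cos(el) * np.sin(az)            # left (interaural axis)
    z = np.sin(el)                          # up
    lateral = np.degrees(np.arcsin(y))      # angle away from the median plane
    polar = np.degrees(np.arctan2(z, x))    # angle around the interaural axis
    return lateral, polar

# e.g. a source 30 degrees to the left on the horizontal plane:
print(vertical_polar_to_interaural_polar(30.0, 0.0))   # (~30.0, 0.0)
```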
Each session incorporated sound localization testing and training blocks in a virtual environment. During a session, participants sat on a freely rotating chair in the centre of a dark, quiet room.
The virtual environment was presented using a head-mounted display, and auditory stimuli were presented over headphones. Participants interacted with the experiment software using a gamepad. For details of the equipment, see Equipment and the virtual sound localization environment.
During testing and training blocks, participants initiated trials in their own time by orienting towards a button within the virtual scene and activating it using the gamepad. Doing so initiated playback of a randomly selected complex auditory stimulus (see Stimuli). Testing and training blocks were differentiated by whether positional feedback, in the form of a visual sound source positional indicator, was given.
Participants underwent an initial testing block at the start of each session, followed by three 12-minute training blocks and a final testing block.
Participants were encouraged to take a 5-10 minute break between blocks, during which they remained in the quiet room. In order to capture any very rapid learning effects, additional testing blocks were carried out between each of the training blocks on the first day. This design is represented schematically in Fig. The control group followed the same process, but remained in the quiet testing room during the periods in which the other groups underwent training; during this time they were allowed to engage in non-auditory activities such as reading.
The choice of spatializing sources only in the upper hemisphere was made because the virtual (and indeed real) environment had a floor, and we wanted to avoid providing conflicting audio-visual cues.
The participants were instructed to remain oriented in the same direction throughout stimulus playback, with the orientation of the head-mounted display checked against a small deviation threshold during playback (a sketch of such a check is given below). Following stimulus playback, participants were instructed to orient towards the perceived direction of the sound source and indicate their response using a button on the gamepad. When the response was given, the stimulus was played back a second time, spatialized at the same source location.
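The deviation check referred to above might look like the following sketch, which computes the rotation angle between the head-mounted display's orientation at stimulus onset and its current orientation. The quaternion representation and the threshold value are assumptions; the study's exact criterion and its consequence are not given in this excerpt.

```python
import numpy as np

def angular_deviation_deg(q_start, q_now):
    """Smallest rotation angle (degrees) between two unit quaternions
    (w, x, y, z). A sketch of a head-stillness check, not the study's code."""
    dot = abs(float(np.dot(q_start, q_now)))
    dot = min(1.0, dot)                     # guard against rounding error
    return np.degrees(2.0 * np.arccos(dot))

# During playback the software could poll the HMD and compare against a
# threshold (value below is a placeholder, not the study's value):
THRESHOLD_DEG = 5.0
q0 = np.array([1.0, 0.0, 0.0, 0.0])                  # orientation at stimulus onset
q1 = np.array([0.9990482, 0.0, 0.0436194, 0.0])      # roughly 5 degrees of yaw later
print(angular_deviation_deg(q0, q1))                 # ~5.0
```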
Visual positional feedback was presented simultaneously by introducing a spherical object in the virtual scene at the target sound source location after each participant response.
The way this was visualized depended on the training paradigm used and is detailed in the following section. The size of the target object varied adaptively.
After five misses at a given target size, the target size reverted to the previous, larger one, up to a maximum of the initial size; a sketch of this adaptive procedure is given below. All participant groups undergoing training were instructed to initiate and complete trials continuously for 12 minutes per training block.
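The excerpt describes only the up-step of the adaptive procedure (after five misses the target grows back to its previous size); the rule for shrinking the target is not stated here, so in the sketch below a consecutive-hits rule is assumed and the size values are placeholders.

```python
class AdaptiveTargetSize:
    """Sketch of the adaptive target-size procedure described in the text:
    after five misses at a given size the target reverts to the previous,
    larger size, bounded by the initial size. The rule for decreasing the
    size (here: a fixed number of consecutive hits) is an assumption."""

    def __init__(self, sizes_deg=(20.0, 15.0, 10.0, 7.5, 5.0), hits_to_shrink=3):
        self.sizes = list(sizes_deg)        # placeholder sizes, largest first
        self.level = 0                      # index into self.sizes
        self.misses = 0
        self.hits = 0
        self.hits_to_shrink = hits_to_shrink

    @property
    def size(self):
        return self.sizes[self.level]

    def record(self, hit):
        if hit:
            self.hits += 1
            self.misses = 0
            if self.hits >= self.hits_to_shrink and self.level < len(self.sizes) - 1:
                self.level += 1             # smaller target = "level progression"
                self.hits = 0
        else:
            self.misses += 1
            self.hits = 0
            if self.misses >= 5 and self.level > 0:
                self.level -= 1             # revert to the previous, larger size
                self.misses = 0
```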
Testing blocks were carried out using the same virtual environment and process, except positional feedback was not provided following participant responses. Further, testing blocks comprised a fixed number of trials rather than a fixed time limit. In order to ensure consistency across participants, target stimuli were positioned systematically.
These orientations are visualized in Fig. In one testing block, four targets were presented for each target orientation with a different, random deviation each time.
Three of these were spatialized using the same HRTFs used in the training blocks. In order to test if learning effects transferred to more than one set of HRTFs, the fourth target was spatialized using a second HRTF set for which the participants had received no positional feedback.
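Putting the testing-block layout together, a trial list of this shape could be assembled as sketched below: a fixed grid of target orientations, four repeats per orientation with a small random deviation, three rendered with the trained HRTF and one with the untrained HRTF. The orientation grid, deviation range, and shuffling are placeholders rather than the study's actual parameters (the real grid is shown in the paper's figure).

```python
import random

def build_testing_block(target_orientations, max_deviation_deg=5.0, seed=None):
    """Assemble one testing block: four targets per orientation, each with a
    random deviation; three use the trained HRTF, one the untrained HRTF.
    Grid and deviation range are placeholders, not the study's values."""
    rng = random.Random(seed)
    trials = []
    for az, el in target_orientations:
        for hrtf in ("trained", "trained", "trained", "untrained"):
            trials.append({
                "azimuth": az + rng.uniform(-max_deviation_deg, max_deviation_deg),
                "elevation": el + rng.uniform(-max_deviation_deg, max_deviation_deg),
                "hrtf": hrtf,
            })
    rng.shuffle(trials)                     # present trials in random order
    return trials

# e.g. a toy grid of upper-hemisphere orientations:
grid = [(az, el) for az in range(-150, 181, 30) for el in (0, 30, 60)]
block = build_testing_block(grid, seed=1)
```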
Three versions of the training software were developed. Each utilised an identical virtual environment. If the target was not in the visual field of the participant when they gave their response, an arrow on the HUD indicated the direction to the target following the shortest path.
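One way such a shortest-path arrow can be computed is to express the target direction in head coordinates and take the angle of its right/up components, as in the sketch below. The axis conventions are assumptions, and this is not the study's Unity implementation.

```python
import numpy as np

def hud_arrow_angle_deg(target_dir_world, head_rotation):
    """Angle of an on-screen arrow pointing the shortest way towards an
    off-screen target. `target_dir_world` is a unit vector from listener to
    target; `head_rotation` is a 3x3 matrix whose columns are the head's
    right and up axes (convention assumed). Returns 0 for "up", 90 for
    "right", -90 for "left". A sketch only."""
    right, up = head_rotation[:, 0], head_rotation[:, 1]
    x = float(np.dot(target_dir_world, right))
    y = float(np.dot(target_dir_world, up))
    return float(np.degrees(np.arctan2(x, y)))   # clockwise from "up"

# e.g. a target directly to the listener's left with an identity head pose:
print(hud_arrow_angle_deg(np.array([-1.0, 0.0, 0.0]), np.eye(3)))   # -90.0
```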
Points were rewarded for target hits. When the target size decreased following the adaptive procedure described above, this was indicated to the participant as a level progression, using a sound and text on the HUD. The laser had an accompanying sound effect, which was not spatialized.
When the health ran out, this corresponded to an increase in target size, as per the adaptive procedure. A third version of the training software was implemented that incorporated this. There was no requirement for participants to remain oriented in the same direction during stimulus playback, which was looped until listeners gave their response.
Participants were not explicitly encouraged to move their head to better localize sound sources. Note that all participants utilised head tracking and were able to move their head relative to the sound source when receiving positional feedback upon giving a response. All participants used exactly the same version of the testing software.
A video of these training paradigms, as well as the testing paradigm, is available online (Supplementary Video). Please note that, due to the screen capture software and file size limitations, the audio in the video is not representative of the audio generated by the spatialization software. The virtual environment was rendered stereoscopically on a smartphone-based (iPhone 6s) head-mounted display. Participants interacted with the phone using a handheld controller (Mad Catz C.) connected via Bluetooth.
A virtual, moon-like environment was designed to be acoustically neutral to minimize the potential mismatch between the anechoic stimuli and the perceived acoustic properties of the virtual space. The scene was also populated with some landmarks, as it has been shown that a lack of a visual frame of reference is detrimental to sound localization accuracy. Screenshots of part of the environment are shown in Fig.
The virtual environment can be seen in more detail in the Supplementary Video. Four measures of localization error were used. These include (1) the angle formed by a vector oriented towards the location of the target virtual sound source and a vector oriented in the direction of the listener's head when indicating their response, referred to as the spherical angle error; (2) the difference in lateral angle between the target and response in the interaural coordinate system, referred to as the lateral error; and (3) the difference in polar angle between the target and response in the interaural coordinate system. Polar angle errors were calculated only for responses that were not front-back confused, and were scaled by the cosine of the target lateral angle to account for changes in the sizes of the cones of confusion.
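A hedged sketch of how these measures can be computed from unit vectors pointing towards the target and the response is given below; the axis convention and the front-back classification rule are assumptions, not the paper's exact definitions.

```python
import numpy as np

def localization_errors(target_vec, response_vec):
    """Sketch of the error measures described above, computed from unit
    vectors towards the target and the response direction. Axis convention
    assumed: x = front, y = left (interaural axis), z = up. The front-back
    classification rule is also an assumption."""
    t = np.asarray(target_vec, float); t /= np.linalg.norm(t)
    r = np.asarray(response_vec, float); r /= np.linalg.norm(r)

    # (1) Spherical angle error: angle between the two vectors.
    sphere_err = np.degrees(np.arccos(np.clip(np.dot(t, r), -1.0, 1.0)))

    # Interaural-polar coordinates: lateral = angle from the median plane,
    # polar = angle around the interaural axis.
    lat_t, lat_r = np.degrees(np.arcsin(t[1])), np.degrees(np.arcsin(r[1]))
    pol_t, pol_r = np.degrees(np.arctan2(t[2], t[0])), np.degrees(np.arctan2(r[2], r[0]))

    # (2) Lateral error.
    lateral_err = lat_r - lat_t

    # Front-back confusion: response in the opposite front/back hemifield
    # from the target (assumed rule; boundary treatments vary).
    front_back_confused = (t[0] * r[0]) < 0

    # (3) Polar error, only for non-confused responses, scaled by the cosine
    # of the target lateral angle (cone-of-confusion size correction).
    polar_err = None
    if not front_back_confused:
        diff = (pol_r - pol_t + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
        polar_err = diff * np.cos(np.radians(lat_t))

    return sphere_err, lateral_err, front_back_confused, polar_err
```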
Again, this metric (the polar angle error) is the same as that used by Carlile et al. The Unity project used to build the software for the iPhone 6 and the full dataset, including Matlab scripts to read it, are publicly available on Zenodo. Please note that the authors do not have permission to redistribute the LIMSI Spatialization Engine, which is required to reproduce the stimuli; please contact the authors for details.

References
Wightman, F. Headphone simulation of free-field listening. J. Acoust. Soc. Am. 85.
Kahana, Y. Numerical modelling of the transfer functions of a dummy-head and of the external ear.
Katz, B. Boundary element method calculation of individual head-related transfer function. J. Acoust. Soc. Am.
Dellepiane, M. Reconstructing head models from photographs for individualized 3D-audio processing. In Computer Graphics Forum.
Torres-Gallegos, E. Personalization of head-related transfer functions (HRTF) based on automatic photo-anthropometry and inference from a database.
Round robin comparison of HRTF measurement systems: preliminary results.
Burkhard, M. Anthropometric manikin for acoustic research.
Morimoto, M. Localization cues of sound sources in the upper hemisphere.
Wenzel, E. Localization using nonindividualized head-related transfer functions.
Begault, D. Headphone localization of speech. Hum. Factors 35.
Individualized head-related transfer functions and illusory ego-motion in virtual environments.
Seeber, B. Subjective selection of non-individual head-related transfer functions. Georgia Institute of Technology.
Iwaya, Y. Perceptually based head-related transfer function database optimization.
Fuchs, E. Adult neuroplasticity: more than 40 years of research. Neural Plasticity.
Hofman, P. Relearning sound localization with new ears.
Van Wanrooij, M. Relearning sound localization with a new ear.
Carlile, S. Accommodating to new ears: the effects of sensory and sensory-motor feedback.
A review on auditory space adaptations to altered head-related cues.
Zahorik, P. Perceptual recalibration in human sound localization: learning to remediate front-back reversals.
Majdak, P. On the improvement of localization accuracy with non-individualized HRTF-based sounds. J. Audio Eng. Soc.
Parseihian, G. Rapid head-related transfer function adaptation using a virtual auditory environment.
Learning auditory space: generalization and long-term effects. PLoS ONE 8.
Koepp, M. Evidence for striatal dopamine release during a video game. Nature.

Fletcher (Speech and Hearing in Communication, D. Van Nostrand)[2] demonstrated that binaural hearing has always been with us, and that its importance lies mainly in the fact that it enables us to pinpoint a sound source at its point of origin.
Even primitive auditory systems need to inform their owners of what is threatening and where the threat comes from. This dictates the direction for visual contact and offers directions for flight.
In Experiments in Hearing (McGraw-Hill, NY)[3], von Békésy stated that the binaural phenomenon, especially as it relates to localization, is incredibly complex. He added that in no other field of science does a stimulus produce so many different sensations as in the area of directional hearing.
Some sources list phase (period-related time) as a third factor, while other sources treat the phase change of the signal as a by-product of the time of arrival of the signal at the two ears. Regardless, a phase difference occurs because the signal reaches one ear before the other.
Localization is described by psychophysicists using two coordinates, one for azimuth (horizontal plane) and one for elevation (vertical plane) (Figure 1). For human hearing, localization occurs primarily via the horizontal plane (azimuth), and, because of this, little discussion will be made of vertical localization. Sound coming from directly in front (point A in Figure 2) will be the same in both ears, assuming symmetrical hearing sensitivity.
However, if the sound comes from somewhat to the right (point B) or left (just reverse this), the sound will be slightly louder in the ear closest to the sound (the near ear). The IID (interaural intensity difference) is directly related to the head shadow effect discussed in a previous post. Interaural time-of-arrival differences are thought to play a major role in how a listener determines how far right or left a sound source may be; a worked example of this interaural time difference is sketched below.
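The interaural time difference can be made concrete with the classic spherical-head (Woodworth) approximation; the head radius below is a nominal value, not a measured one.

```python
import math

def itd_woodworth(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Approximate interaural time difference (seconds) for a distant source
    at the given azimuth (0 = straight ahead, 90 = directly to one side),
    using the classic spherical-head (Woodworth) formula
    ITD = (a / c) * (theta + sin(theta)), valid roughly for 0-90 degrees."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))

# A source at 90 degrees gives roughly 0.65 ms; straight ahead gives 0.
print(round(itd_woodworth(90.0) * 1e6), "microseconds")   # ~656
print(round(itd_woodworth(0.0) * 1e6), "microseconds")    # 0
```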
To investigate this further, Ihlefeld, Alamatsaz, and Shapley asked healthy volunteers to localize sounds of different volumes.
These new findings also reveal key parallels to processing in the visual system. Visual areas of the brain estimate how far away an object is by comparing the input that reaches the two eyes. But these estimates are also systematically less accurate for low-contrast stimuli than for high-contrast ones, just as sound localization is less accurate for softer sounds than for louder ones. The idea that the brain uses the same basic strategy to localize both sights and sounds generates a number of predictions for future studies to test.