Auditory processing of complex sounds: an overview

1992, Vol 336 (1278), pp. 295-306

The past 30 years have seen a remarkable development in our understanding of how the auditory system - particularly the peripheral system - processes complex sounds. Perhaps the most significant advance has been in our understanding of the mechanisms underlying auditory frequency selectivity and their importance for normal and impaired auditory processing. Physiologically vulnerable cochlear filtering can account for many aspects of normal and impaired psychophysical frequency selectivity, with important consequences for the perception of complex sounds. In normal hearing, remarkable mechanisms in the organ of Corti, involving enhancement of mechanical tuning (in mammals probably by feedback of electromechanically generated energy from the hair cells), produce exquisite tuning, reflected in the tuning properties of cochlear nerve fibres. Recent comparisons of physiological (cochlear nerve) and psychophysical frequency selectivity in the same species indicate that the ear’s overall frequency selectivity can be accounted for by this cochlear filtering, at least in bandwidth terms. Because this cochlear filtering is physiologically vulnerable, it deteriorates under deleterious conditions of the cochlea - hypoxia, disease, drugs, noise overexposure, mechanical disturbance - and this deterioration is reflected in impaired psychophysical frequency selectivity. It is a fundamental feature of sensorineural hearing loss of cochlear origin and is of diagnostic value. Cochlear filtering, particularly as reflected in the temporal discharge patterns of cochlear nerve fibres in response to complex sounds, is remarkably robust over a wide range of stimulus levels. Furthermore, cochlear filtering properties are a prime determinant of the ‘place’ and ‘time’ coding of frequency at the cochlear nerve level, both of which appear to be involved in pitch perception. The problem of how the place and time coding of complex sounds is effected over the ear’s remarkably wide dynamic range is briefly addressed.
In the auditory brainstem, particularly the dorsal cochlear nucleus, inhibitory mechanisms enhance the spectral and temporal contrasts in complex sounds; these mechanisms are now being dissected neuropharmacologically. At the cortical level, mechanisms are evident that are capable of abstracting biologically relevant features of complex sounds. Fundamental studies of how the auditory system encodes and processes complex sounds are vital to promising recent applications in the diagnosis and rehabilitation of the hearing impaired.
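The "bandwidth terms" comparison above is commonly made via the equivalent rectangular bandwidth (ERB) of the auditory filter. As an illustrative aside (not part of the original abstract), the widely used Glasberg & Moore (1990) fit for normal hearing can be sketched as:

```python
def erb_hz(f_hz: float) -> float:
    """Equivalent rectangular bandwidth (Hz) of the normal auditory filter
    centred at f_hz, per the Glasberg & Moore (1990) fit."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def q_erb(f_hz: float) -> float:
    """Tuning sharpness: centre frequency divided by ERB."""
    return f_hz / erb_hz(f_hz)

# Absolute bandwidth grows with centre frequency (roughly 79 Hz at 500 Hz,
# 456 Hz at 4 kHz), while relative sharpness Q_ERB increases only modestly.
bandwidths = {f: erb_hz(f) for f in (500.0, 1000.0, 4000.0)}
```

Impaired frequency selectivity of cochlear origin shows up as a broadening of these filters, which is why psychophysical bandwidth estimates carry diagnostic value.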

2015, Vol 32 (5), pp. 445-459
Author(s): Kyung Myun Lee, Erika Skoe, Nina Kraus, Richard Ashley

Acoustic periodicity is an important factor for discriminating consonant and dissonant intervals. While previous studies have found that the periodicity of musical intervals is temporally encoded by neural phase locking throughout the auditory system, how the nonlinearities of the auditory pathway influence the encoding of periodicity, and how this effect is related to sensory consonance, have been underexplored. By measuring human auditory brainstem responses (ABRs) to four diotically presented musical intervals with increasing degrees of dissonance, this study seeks to explicate how the subcortical auditory system transforms the neural representation of acoustic periodicity for consonant versus dissonant intervals. ABRs faithfully reflect neural activity in the brainstem synchronized to the stimulus while also capturing nonlinear aspects of auditory processing. Results show that for the most dissonant interval, which has a less periodic stimulus waveform than the most consonant interval, the aperiodicity of the stimulus is intensified in the subcortical response. The decreased periodicity of dissonant intervals is related to a larger number of nonlinearities (i.e., distortion products) in the response spectrum. Our findings suggest that the auditory system transforms the periodicity of dissonant intervals, resulting in consonant and dissonant intervals becoming more distinct in the neural code than if they were processed by a linear auditory system.
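The link between nonlinearity and distortion products can be illustrated with a toy simulation (not the authors' ABR analysis; the tone pair and nonlinearity coefficients below are arbitrary choices): passing a two-tone interval through a memoryless nonlinearity creates combination tones such as f2 − f1 and 2f1 − f2 that a linear system would not produce.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs                    # 1 s of signal -> 1 Hz FFT bins
f1, f2 = 400, 500                         # a 4:5 (major-third-like) tone pair
x = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
y = x + 0.3 * x**2 + 0.2 * x**3           # weak quadratic + cubic nonlinearity

spec = np.abs(np.fft.rfft(y)) / len(y)    # amplitude spectrum of the output

def level(f_hz: int) -> float:
    """Spectral amplitude at f_hz (bins are exactly 1 Hz wide here)."""
    return spec[f_hz]

# The quadratic term creates the difference tone f2 - f1 = 100 Hz and the
# cubic term the combination tone 2*f1 - f2 = 300 Hz; neither frequency is
# present in the input x.
```

For a less periodic (more dissonant) tone pair, such combination tones fall at frequencies unrelated to a common fundamental, which is one way a nonlinear pathway can intensify aperiodicity in the response.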


Author(s): Leslie S. Smith

Audition is the ability to sense and interpret pressure waves within a range of frequencies. The auditory system solves the "what" and "where" tasks: what is the sound source (interpretation), and where is it (location)? Auditory environments vary in the number and location of sound sources, in their levels, and in the degree of reverberation, yet biological systems have robust techniques that work over a large range of conditions. We briefly review the auditory system, including the major components of the auditory brainstem and midbrain, attempting to connect its structure with the problems to be solved: locating some sounds, and interpreting important ones. Systems using knowledge of animal auditory processing are discussed, including both CPU-based and neuromorphic approaches, starting from the auditory filterbank and including silicon cochleae; feature-based (auditory landmark) systems are also considered. The level of performance associated with animal auditory systems has not been achieved, and we discuss ways forward.
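The auditory filterbank mentioned above is often realized in CPU-based systems as a gammatone filterbank. A minimal sketch (the filter order, bandwidth scaling, and channel centre frequencies are conventional choices, not taken from the text):

```python
import numpy as np

def erb_hz(f):
    """Equivalent rectangular bandwidth (Glasberg & Moore, 1990)."""
    return 24.7 * (4.37 * f / 1000.0 + 1.0)

def gammatone_ir(fc, fs, dur=0.05, order=4):
    """Impulse response of a gammatone filter centred at fc (Hz)."""
    t = np.arange(int(dur * fs)) / fs
    b = 1.019 * erb_hz(fc)                       # bandwidth parameter
    g = t**(order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))

def filterbank(x, fs, centre_freqs):
    """Filter x through one gammatone channel per centre frequency."""
    return np.array([np.convolve(x, gammatone_ir(fc, fs), mode="same")
                     for fc in centre_freqs])

fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)              # 1 kHz probe tone
channels = filterbank(tone, fs, [250, 1000, 4000])
energy = np.sum(channels**2, axis=1)
# The channel centred on the tone carries by far the most energy.
```

Neuromorphic and silicon-cochlea designs implement essentially this frequency decomposition in analog or spiking hardware rather than by digital convolution.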


2001, Vol 10 (2), pp. 68-77
Author(s): Aage R. Møller

The physiologic basis for cochlear and brainstem implants is discussed. It is concluded that the success of cochlear implants may be explained by assuming that the auditory system can adequately discriminate complex sounds, such as speech sounds, on the basis of their temporal structure when that structure is encoded in a few separate frequency bands offering moderate separation of spectral components. The most important roles of the cochlea seem to be to prepare complex sounds for temporal analysis and to create separate channels through which information in different frequency bands is transmitted separately to higher nervous centers for decoding of temporal information. It is then pertinent to ask how many channels are needed. Because speech discrimination is very important, it is probably sufficient to use enough channels to separate formants from each other.
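The idea that temporal envelopes in a few bands suffice can be illustrated with a noise vocoder in the spirit of Shannon and colleagues' classic acoustic simulations of implant processing (a sketch, not Møller's procedure; the four band edges below are illustrative): each band's fine structure is discarded and only its temporal envelope is kept.

```python
import numpy as np

def bandpass(x, fs, lo, hi):
    """Crude FFT-domain band-pass filter."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1 / fs)
    X[(f < lo) | (f >= hi)] = 0
    return np.fft.irfft(X, len(x))

def envelope(x):
    """Temporal envelope via the magnitude of the analytic signal."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    return np.abs(np.fft.ifft(X * h))

def vocode(x, fs, edges):
    """Replace each band's fine structure with envelope-modulated noise."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(x, fs, lo, hi))
        noise = bandpass(rng.standard_normal(len(x)), fs, lo, hi)
        out += env * noise
    return out

fs = 16000
t = np.arange(fs) / fs
speechlike = np.sin(2 * np.pi * 3 * t) * np.sin(2 * np.pi * 800 * t)  # AM toy signal
edges = [100, 500, 1000, 2000, 4000]                                  # 4 bands
y = vocode(speechlike, fs, edges)
```

Under this kind of processing, intelligibility holds up once roughly enough bands exist to keep formant regions in separate channels, which is consistent with the channel-count argument above.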


PeerJ, 2020, Vol 8, pp. e9363
Author(s): Priscilla Logerot, Paul F. Smith, Martin Wild, M. Fabiana Kubke

In birds the auditory system plays a key role in providing the sensory input used to discriminate between conspecific and heterospecific vocal signals. In species that are known to learn their vocalizations, for example songbirds, this ability is generally considered to arise and be manifest in the forebrain, although there is no a priori reason why brainstem components of the auditory system could not also play an important part. To test this assumption, we used groups of normally reared and cross-fostered zebra finches that had previously been shown in behavioural experiments to reduce their preference for conspecific songs after cross-fostering with Bengalese finches, a related species with a distinctly different song. The question we asked, therefore, is whether this experiential change also alters the bias in favour of conspecific song displayed by auditory midbrain units of normally reared zebra finches. By recording the responses of single units in MLd to a variety of zebra finch and Bengalese finch songs in both normally reared and cross-fostered zebra finches, we provide a positive answer to this question. That is, the difference in response to conspecific and heterospecific songs seen in normally reared zebra finches is reduced following cross-fostering. In birds the virtual absence of mammalian-like cortical projections upon auditory brainstem nuclei argues against the interpretation that MLd units change, as observed in the present experiments, as a result of top-down influences on sensory processing. Instead, it appears that MLd units can be influenced significantly by sensory inputs arising directly from a change in auditory experience during development.
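One simple way to quantify a "bias in favour of conspecific song" at the single-unit level is a normalized selectivity index; the index and the spike rates below are purely illustrative (hypothetical values, not the measure or data reported by the authors):

```python
import numpy as np

def selectivity_index(rates_conspecific, rates_heterospecific):
    """Normalized difference between mean spike rates to conspecific and
    heterospecific songs: +1 = fully conspecific-selective, 0 = no bias."""
    r_con = np.mean(rates_conspecific)
    r_het = np.mean(rates_heterospecific)
    return (r_con - r_het) / (r_con + r_het)

# Made-up spike rates (spikes/s) for one unit under the two rearing conditions:
normal = selectivity_index([42, 38, 45], [20, 25, 22])
fostered = selectivity_index([33, 30, 35], [28, 31, 27])
# normal > fostered: the conspecific bias is reduced after cross-fostering.
```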


2005, Vol 94 (2), pp. 1143-1157
Author(s): Sarah M. N. Woolley, John H. Casseday

The avian auditory midbrain nucleus, the mesencephalicus lateralis, dorsalis (MLd), is the first auditory processing stage in which multiple parallel inputs converge, and it provides the input to the auditory thalamus. We studied the responses of single MLd neurons in adult male zebra finches to four types of modulated sounds: 1) white noise; 2) band-limited noise; 3) frequency-modulated (FM) sweeps; and 4) sinusoidally amplitude-modulated (SAM) tones. Responses were compared with the responses of the same neurons to pure tones in terms of temporal response patterns, thresholds, characteristic frequencies, frequency tuning bandwidths, tuning sharpness, and spike rate/intensity relationships. Most neurons responded well to noise. More than one-half of the neurons responded selectively to particular portions of the noise, suggesting that, unlike forebrain neurons, many MLd neurons can encode specific acoustic components of highly modulated sounds such as noise. Selectivity for FM sweep direction was found in only 13% of cells that responded to sweeps. Those cells also showed asymmetric tuning curves, suggesting that asymmetric inhibition plays a role in FM directional selectivity. Responses to SAM showed that MLd neurons code temporal modulation rates using both spike rate and synchronization. Nearly all cells showed low-pass or band-pass filtering properties for SAM. Best modulation frequencies matched the temporal modulations in zebra finch song. Results suggest that auditory midbrain neurons are well suited for encoding a wide range of complex sounds with a high degree of temporal accuracy rather than selectively responding to only some sounds.
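Synchronization to SAM tones is conventionally quantified by vector strength (the Goldberg & Brown metric); a sketch with made-up spike times, not data from the study:

```python
import numpy as np

def vector_strength(spike_times, mod_freq_hz):
    """Vector strength: 1.0 = perfect phase locking to the modulator, ~0 = none."""
    phases = 2 * np.pi * mod_freq_hz * np.asarray(spike_times)
    return np.hypot(np.mean(np.cos(phases)), np.mean(np.sin(phases)))

fm = 50.0                          # modulation frequency (Hz)
period = 1.0 / fm
rng = np.random.default_rng(1)

# 100 spikes locked near one modulator phase (1 ms jitter) vs. 100 spikes
# at random times over the same 2 s stretch:
locked = np.arange(100) * period + rng.normal(0.0, 0.001, 100)
unlocked = rng.uniform(0.0, 100 * period, 100)
# vector_strength(locked, fm) is close to 1; vector_strength(unlocked, fm)
# is close to 0.
```

Plotting vector strength against modulation frequency gives the synchronization-based modulation transfer function, alongside the spike-rate-based one.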


2013, Vol 24 (04), pp. 307-328
Author(s): Joshua G.W. Bernstein, Van Summers, Elena Grassi, Ken W. Grant

Background: Hearing-impaired (HI) individuals with similar ages and audiograms often demonstrate substantial differences in speech-reception performance in noise. Traditional models of speech intelligibility focus primarily on average performance for a given audiogram, failing to account for differences between listeners with similar audiograms. Improved prediction accuracy might be achieved by simulating differences in the distortion that speech may undergo when processed through an impaired ear. Although some attempts to model particular suprathreshold distortions can explain general speech-reception deficits not accounted for by audibility limitations, little has been done to model suprathreshold distortion and predict speech-reception performance for individual HI listeners. Auditory-processing models incorporating individualized measures of auditory distortion, along with audiometric thresholds, could provide a more complete understanding of speech-reception deficits in HI individuals. A computational model capable of predicting individual differences in speech-recognition performance would be a valuable tool in the development and evaluation of hearing-aid signal-processing algorithms for enhancing speech intelligibility. Purpose: This study investigated whether biologically inspired models simulating peripheral auditory processing for individual HI listeners produce more accurate predictions of speech-recognition performance than audiogram-based models. Research Design: Psychophysical data on spectral and temporal acuity were incorporated into individualized auditory-processing models consisting of three stages: a peripheral stage, customized to reflect individual audiograms and spectral and temporal acuity; a cortical stage, which extracts spectral and temporal modulations relevant to speech; and an evaluation stage, which predicts speech-recognition performance by comparing the modulation content of clean and noisy speech. To investigate the impact of different aspects of peripheral processing on speech predictions, individualized details (absolute thresholds, frequency selectivity, spectrotemporal modulation [STM] sensitivity, compression) were incorporated progressively, culminating in a model simulating level-dependent spectral resolution and dynamic-range compression. Study Sample: Psychophysical and speech-reception data from 11 HI and six normal-hearing listeners were used to develop the models. Data Collection and Analysis: Eleven individualized HI models were constructed and validated against psychophysical measures of threshold, frequency resolution, compression, and STM sensitivity. Speech-intelligibility predictions were compared with measured performance in stationary speech-shaped noise at signal-to-noise ratios (SNRs) of −6, −3, 0, and 3 dB. Prediction accuracy for the individualized HI models was compared to the traditional audibility-based Speech Intelligibility Index (SII). Results: Models incorporating individualized measures of STM sensitivity yielded significantly more accurate within-SNR predictions than the SII. Additional individualized characteristics (frequency selectivity, compression) improved the predictions only marginally. A nonlinear model including individualized level-dependent cochlear-filter bandwidths, dynamic-range compression, and STM sensitivity predicted performance more accurately than the SII but was no more accurate than a simpler linear model. Predictions of speech-recognition performance simultaneously across SNRs and individuals were also significantly better for some of the auditory-processing models than for the SII. Conclusions: A computational model simulating individualized suprathreshold auditory-processing abilities produced more accurate speech-intelligibility predictions than the audibility-based SII. Most of this advantage was realized by a linear model incorporating audiometric and STM-sensitivity information. Although more consistent with known physiological aspects of auditory processing, modeling level-dependent changes in frequency selectivity and gain did not result in more accurate predictions of speech-reception performance.
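For contrast with the individualized models, the audibility-based SII baseline reduces to an importance-weighted sum of per-band audibilities (a hedged sketch of the ANSI S3.5-1997 idea; the band SNRs and equal importance weights below are illustrative, not values from the study):

```python
def band_audibility(snr_db: float) -> float:
    """ANSI S3.5-style band audibility: linear from -15 to +15 dB SNR,
    clipped to the range [0, 1]."""
    return min(1.0, max(0.0, (snr_db + 15.0) / 30.0))

def sii(band_snrs_db, importance_weights):
    """Importance-weighted sum of band audibilities."""
    assert abs(sum(importance_weights) - 1.0) < 1e-9
    return sum(w * band_audibility(snr)
               for snr, w in zip(band_snrs_db, importance_weights))

# Four illustrative bands at the study's -6 dB SNR condition, equal weights:
# every band has audibility (-6 + 15) / 30 = 0.3, so the index is 0.3.
score = sii([-6.0, -6.0, -6.0, -6.0], [0.25, 0.25, 0.25, 0.25])
```

Because such an index depends only on audibility, two listeners with the same audiogram get the same prediction, which is exactly the limitation the individualized suprathreshold models address.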


2021
Author(s): Goun Choe, Young Seok Kim, Myung-Whan Suh, Moo Kyun Park, Seung-Ha Oh, ...

Many otologists face a dilemma when deciding on surgical management for patients with cochlear nerve (CN) aplasia. Currently, evidence on cochlear implantation (CI) outcomes in patients with CN aplasia is limited. We scrutinized functional outcomes in 37 ears of 21 children with bilateral CN aplasia who underwent unilateral or bilateral CI, based on cross-sectional and longitudinal assessments. The Categories of Auditory Performance (CAP) scores gradually improved throughout the 3-year follow-up; however, outcomes varied between individuals. Specifically, the majority of recipients with a 1-year postoperative CAP score ≤1 remained steady or achieved awareness of environmental sounds, while recipients with early hearing benefit showed markedly improved auditory performance and could possibly discriminate some speech without lipreading. Meanwhile, the intraoperative electrically evoked compound action potential was not correlated with postoperative CAP score, and the dynamic range between T and C levels remained unchanged. Our results further refine those of previous studies on the clinical feasibility of CI as the first treatment modality to elicit favorable auditory performance in children with CN aplasia. However, special attention should be paid to pediatric patients with an early postoperative CAP score ≤1, to identify unsuccessful cochlear implants and to consider switching to an auditory brainstem implant.


2002, Vol 127 (1), pp. 84-96
Author(s): Vittorio Colletti, Francesco Fiorino, Marco Carner, Luca Sacchetto, Veronica Miorelli, ...

OBJECTIVE: We sought to describe the advantages of the retrosigmoid-transmeatal (RS-TM) approach in the application of auditory brainstem implants (ABIs) in adults with unilateral or bilateral vestibular schwannoma (VS) and in children with cochlear nerve aplasia. STUDY DESIGN: We conducted a retrospective case review. SETTING: The study was conducted at the ENT Department of the University of Verona, Italy. PATIENTS: Six adult patients (5 men and 1 woman) with neurofibromatosis type 2 (NF2) were operated on for VS removal with ABI placement. An additional patient had a unilateral VS in the only hearing ear. Tumor size ranged from 12 to 40 mm. In addition, 2 children received ABIs for bilateral cochlear nerve aplasia. INTERVENTION: An RS-TM approach was used in all VS patients, and an RS approach was used in the subjects with cochlear nerve aplasia. After tumor excision, the landmarks for the foramen of Luschka (the VII, VIII, and IX cranial nerves and the choroid plexus) were carefully identified. The choroid plexus was then partially removed and the tela choroidea divided and bent back, exposing the floor of the lateral recess of the fourth ventricle and the convolution of the dorsal cochlear nucleus. In the 2 subjects with no cochlear nerve, the choroid plexus and the VII and IX cranial nerves were used as landmarks. The electrode array was then inserted into the lateral recess, and the correct position was monitored with the aid of electrically evoked auditory brainstem responses (EABR) and neural response telemetry (NRT). RESULTS: Correct implantation was possible in all patients. Auditory sensations were induced in all patients with various numbers of electrodes, and different pitch sensations could be elicited by stimulating different electrodes. CONCLUSIONS: We believe that the RS approach is the route of choice for ABI candidates because of its easy and clear access to the cochlear nucleus area. It avoids some of the drawbacks of the translabyrinthine approach, such as mastoidectomy, labyrinthectomy, sealing of the cavity and posterior fossa with abdominal fat, and contamination from the middle ear. For these reasons, it is the route of choice in children with cochlear nerve aplasia or severe cochlear malformation and in adults with complete ossification of the cochlea or cochlear nerve disruption due to cranial trauma.

