Auditory–Articulatory Neural Alignment between Listener and Speaker during Verbal Communication

2019 ◽  
Vol 30 (3) ◽  
pp. 942-951 ◽  
Author(s):  
Lanfang Liu ◽  
Yuxuan Zhang ◽  
Qi Zhou ◽  
Douglas D Garrett ◽  
Chunming Lu ◽  
...  

Abstract Whether auditory processing of speech relies on reference to the speaker's articulatory motor information remains elusive. Here, we addressed this issue within a two-brain framework. Functional magnetic resonance imaging was used to record the brain activity of speakers while they told real-life stories, and later that of listeners while they listened to the audio recordings of these stories. Based on between-brain seed-to-voxel correlation analyses, we revealed that neural dynamics in listeners' auditory temporal cortex are temporally coupled with the dynamics in the speaker's larynx/phonation area. Moreover, the coupling response in the listener's left auditory temporal cortex follows the hierarchical organization of speech processing, with response lags in A1+, STG/STS, and MTG increasing linearly. Furthermore, listeners showing greater coupling responses understood the speech better. When comprehension failed, this interbrain auditory-articulatory coupling largely vanished. These findings suggest that a listener's auditory system and a speaker's articulatory system are inherently aligned during naturalistic verbal interaction, and that this alignment is associated with high-level information transfer from the speaker to the listener. Our study provides reliable evidence that reference to the speaker's articulatory motor information facilitates speech comprehension in naturalistic settings.
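The lagged between-brain coupling described above can be illustrated with a simple cross-correlation at varying lags. This is a minimal sketch, not the authors' actual fMRI pipeline; the function name `lagged_correlation` and the synthetic time series are hypothetical, and a real analysis would operate on preprocessed voxel/ROI BOLD series.

```python
import numpy as np

def lagged_correlation(speaker_ts, listener_ts, max_lag):
    """Pearson correlation between a speaker and a listener time series
    at each lag in [-max_lag, max_lag].

    A positive lag means the listener's signal trails the speaker's,
    as expected for speaker-to-listener information transfer.
    Returns (lags, correlations).
    """
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = []
    for lag in lags:
        if lag >= 0:
            a = speaker_ts[: len(speaker_ts) - lag]
            b = listener_ts[lag:]
        else:
            a = speaker_ts[-lag:]
            b = listener_ts[:lag]
        corrs.append(np.corrcoef(a, b)[0, 1])
    return lags, np.array(corrs)
```

The lag at which the correlation peaks estimates the listener's response delay; the abstract reports such lags increasing along the A1+ → STG/STS → MTG hierarchy.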

Author(s):  
Jayanthiny Kangatharan ◽  
Maria Uther ◽  
Fernand Gobet

Abstract Comprehension assesses a listener's ability to construe the meaning of an acoustic signal in order to answer questions about its contents, while intelligibility indicates the extent to which a listener can precisely retrieve the acoustic signal. Previous comprehension studies asking listeners for sentence-level or narrative-level information used native listeners as participants. This is the first study to examine whether clear speech properties (e.g. expanded vowel space) produce a clear speech benefit at the word level for L2 learners, for speech produced in naturalistic settings. This study explored whether hyperarticulated speech was more comprehensible than non-hyperarticulated speech for L1 British English speakers and for early and late L2 British English learners, in quiet and in noise. Sixteen British English listeners, 16 native Mandarin Chinese listeners who were early L2 learners, and 16 native Mandarin Chinese listeners who were late L2 learners rated hyperarticulated versus non-hyperarticulated speech samples, in the form of words, for comprehension under four listening conditions of varying white noise level (quiet, or SNR levels of +16 dB, +12 dB or +8 dB) in a 3 × 2 × 4 mixed design. Mean ratings showed that all three groups found hyperarticulated speech samples easier to understand than non-hyperarticulated speech under all listening conditions. Results are discussed in terms of other findings (Uther et al., 2012) suggesting that hyperarticulation may generally improve speech processing for all language groups.


2013 ◽  
Vol 25 (12) ◽  
pp. 2179-2188 ◽  
Author(s):  
Katya Krieger-Redwood ◽  
M. Gareth Gaskell ◽  
Shane Lindsay ◽  
Elizabeth Jefferies

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites (PMC, posterior superior temporal gyrus [pSTG], and occipital pole) and, for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to the two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meaning.


Author(s):  
Shuo Liu ◽  
Gil Keren ◽  
Emilia Parada-Cabaleiro ◽  
Björn Schuller

Abstract The unprecedented growth of noise pollution over recent decades has raised an ever-increasing need for efficient audio enhancement technologies. Yet the variety of difficulties related to processing audio sources in the wild, such as handling unseen noises or suppressing specific interferences, makes audio enhancement a still open challenge. In this regard, we present N-HANS (the Neuro-Holistic Audio-eNhancement System), a Python toolkit for in-the-wild audio enhancement that includes functionalities for audio denoising, source separation, and, for the first time in such a toolkit, selective noise suppression. The architecture is specially developed to adapt automatically to different environmental backgrounds and speakers. This is achieved through two identical neural networks comprising stacks of residual blocks, each conditioned on additional speech- and noise-based recordings through auxiliary sub-networks. Along with a Python API, a command-line interface is provided for researchers and developers, both carefully documented. Experimental results indicate that N-HANS performs strongly with respect to existing methods while preserving audio quality at a high level, ensuring reliable usage in real-life applications such as in-the-wild speech processing and encouraging the development of speech-based intelligent technology.


2021 ◽  
Author(s):  
Fabian Schmidt ◽  
Ya-Ping Chen ◽  
Anne Keitel ◽  
Sebastian Rösch ◽  
Ronny Hannemann ◽  
...  

Abstract The most prominent acoustic features in speech are intensity modulations, represented by the amplitude envelope of speech. Synchronization of neural activity with these modulations is vital for speech comprehension. Because the acoustic modulation of speech is related to the production of syllables, investigations of neural speech tracking rarely distinguish between lower-level acoustic (envelope modulation) and higher-level linguistic (syllable rate) information. Here we manipulated speech intelligibility using noise-vocoded speech and investigated the spectral dynamics of neural speech processing, across two studies at cortical and subcortical levels of the auditory hierarchy, using magnetoencephalography. Overall, cortical regions mostly track the syllable rate, whereas subcortical regions track the acoustic envelope. Furthermore, with less intelligible speech, tracking of the modulation rate becomes more dominant. Our study highlights the importance of distinguishing between envelope modulation and syllable rate and provides novel possibilities to better understand differences between auditory processing and speech/language processing disorders.
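The amplitude envelope referred to above is conventionally extracted as the magnitude of the analytic signal, low-pass filtered to retain the slow modulations at syllabic rates. This is a minimal sketch under those standard assumptions, not the authors' MEG pipeline; the function name `amplitude_envelope` and the cutoff value are illustrative.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(signal, fs, cutoff=10.0):
    """Broadband amplitude envelope of an audio signal.

    Takes the magnitude of the Hilbert analytic signal, then low-pass
    filters it (zero-phase) to keep only the slow intensity modulations
    that neural tracking analyses typically target.
    """
    env = np.abs(hilbert(signal))
    b, a = butter(3, cutoff / (fs / 2.0), btype="low")
    return filtfilt(b, a, env)
```

In a tracking analysis, an envelope like this would then be correlated (or coherence-estimated) against the recorded neural signal at matched sampling rates.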


2009 ◽  
Vol 14 (1) ◽  
pp. 78-89 ◽  
Author(s):  
Kenneth Hugdahl ◽  
René Westerhausen

The present paper is based on a talk on hemispheric asymmetry given by Kenneth Hugdahl at the Xth European Congress of Psychology, Prague, July 2007. Here, we propose that hemispheric asymmetry evolved because of a left hemisphere specialization for speech processing. The evolution of speech and the need for air-based communication necessitated a division of labor between the hemispheres, avoiding duplicate copies in both hemispheres that would increase processing redundancy. It is argued that the neuronal basis of this division of labor is the structural asymmetry observed in the peri-Sylvian region in the posterior part of the temporal lobe, with a left-larger-than-right planum temporale area. This is the only example where a structural, or anatomical, asymmetry matches a corresponding functional asymmetry. The increase in gray matter volume in the left planum temporale corresponds to a functional asymmetry of speech processing, as indexed by behavioral, dichotic listening, and functional neuroimaging studies. The functional anatomy of the corpus callosum also supports such a view, with regional specificity of information transfer between the hemispheres.


2012 ◽  
Vol 2012 ◽  
pp. 1-7 ◽  
Author(s):  
Joseph P. Pillion

Deficits in central auditory processing may occur in a variety of clinical conditions, including traumatic brain injury, neurodegenerative disease, auditory neuropathy/dyssynchrony syndrome, neurological disorders associated with aging, and aphasia. Subtler deficits in central auditory processing have also been studied extensively in neurodevelopmental disorders in children with learning disabilities, ADD, and developmental language disorders. Illustrative cases are reviewed demonstrating the use of an audiological test battery in patients with auditory neuropathy/dyssynchrony syndrome, bilateral lesions to the inferior colliculi, and bilateral lesions to the temporal lobes. Electrophysiological tests of auditory function were utilized to define the locus of dysfunction at levels ranging from the auditory nerve to the midbrain and cortex.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Chih-Hua Tai ◽  
Kuo-Hsuan Chung ◽  
Ya-Wen Teng ◽  
Feng-Ming Shu ◽  
Yue-Shan Chang

2021 ◽  
pp. 1-14
Author(s):  
Debo Dong ◽  
Dezhong Yao ◽  
Yulin Wang ◽  
Seok-Jun Hong ◽  
Sarah Genon ◽  
...  

Abstract Background Schizophrenia has been primarily conceptualized as a disorder of high-order cognitive functions with deficits in executive brain regions. Yet, owing to increasing reports of early sensory processing deficits, recent models focus more on the developmental effects of impaired sensory processes on high-order functions. The present study examined whether this pathological interaction relates to an overarching system-level imbalance, specifically a disruption in the macroscale hierarchy affecting integration and segregation of unimodal and transmodal networks. Methods We applied a novel combination of connectome gradient and stepwise connectivity analysis to resting-state fMRI to characterize the sensorimotor-to-transmodal cortical hierarchy organization (96 patients vs. 122 controls). Results We demonstrated compression of the cortical hierarchy in schizophrenia, with a prominent compression from the sensorimotor region and a less prominent compression from the fronto-parietal region, resulting in diminished separation between sensory and fronto-parietal cognitive systems. Further analyses suggested that the reduced differentiation relates to an atypical functional connectome transition from unimodal to transmodal brain areas. Specifically, we found hypo-connectivity within unimodal regions and hyper-connectivity between unimodal regions and fronto-parietal and ventral attention regions along the classical sensation-to-cognition continuum (voxel-level corrected, p < 0.05). Conclusions The compression of cortical hierarchy organization represents a novel and integrative system-level substrate underlying the pathological interaction of early sensory and cognitive function in schizophrenia.
This abnormal cortical hierarchy organization suggests that cascading impairments arising from disruption of the somatosensory-motor system, together with inefficient integration of bottom-up sensory information with attentional demands and executive control processes, partially account for the high-level cognitive deficits characteristic of schizophrenia.


2007 ◽  
Vol 52 (2) ◽  
pp. 290-298 ◽  
Author(s):  
Kilian G. Seeber ◽  
Christian Zelger

Abstract Simultaneous conference interpreting represents a highly complex linguistic task and a very delicate process of information transfer. Consequently, the notion of truth, which, applied to the field of simultaneous interpreting, entails an accurate rendition of the original message, is of pivotal importance. In spite of that, analysis of experimental transcripts and corpora sometimes seems to suggest that interpreters betray the speaker by deliberately altering the original. While we cannot exclude that such instances exist, we argue that what looks like betrayal may sometimes in fact be a rendition based on a sound ethical decision. In this paper we take a closer look at these situations in an attempt to shed more light on the potential motivations underlying the interpreter's decisions and actions. Using examples from real-life interpreting situations, we take the interpreter's output and put what at first sight appears to be a betrayal of the speaker on the ethical test bench, from both a deontological and a teleological perspective. Based on this analysis we propose a model suggesting that the interpreter uses three principal message components, verbal, semantic and intentional, in order to come up with an accurate interpretation of the original, which we call "truthful rendition."


2007 ◽  
Vol 18 (1) ◽  
pp. 230-242 ◽  
Author(s):  
Stephen M. Wilson ◽  
Istvan Molnar-Szakacs ◽  
Marco Iacoboni
