Speech comprehension
Recently Published Documents

TOTAL DOCUMENTS: 480 (five years: 167)
H-INDEX: 47 (five years: 5)

Author(s): Nevena Dimitrova, Şeyda Özçalışkan

Production and comprehension of gesture emerge early and are key to subsequent language development in typical development. Compared to typically developing (TD) children, children with autism spectrum disorders (ASD) exhibit difficulties and/or differences in gesture production. However, we do not yet know whether gesture production shows patterns similar to gesture comprehension across different ages and learners, or instead lags behind gesture comprehension, mimicking the pattern seen for speech comprehension and production. In this study, we focus on the gestures produced and comprehended by a group of young TD children and children with ASD, comparable in language ability, with the goal of identifying whether gesture production and comprehension follow similar patterns across ages and across learners. Across two studies, we elicited production of gesture in semi-structured parent-child play and comprehension of gesture in structured experimenter-child play. We tested whether young TD children (ages 2-4) follow a similar trajectory in their production and comprehension of gesture across ages (Study 1) and, if so, whether this alignment holds for verbal children with ASD (Mage = 5 years) comparable to TD children in language ability (Study 2). Our results provided evidence for similarities between gesture production and comprehension across ages and across learners, suggesting that comprehension and production of gesture form a largely integrated system of communication.


2022, Vol. 15
Author(s): Enrico Varano, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic, ...

Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker's face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person's face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although natural facial motions yield a still higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.
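A core ingredient of experiments like this is presenting speech at a controlled signal-to-noise ratio. The following is a minimal sketch (not the authors' pipeline; all signals and values are illustrative stand-ins) of mixing a speech waveform with noise at a target SNR:

```python
# Minimal sketch (not the authors' pipeline): mix a speech waveform with
# noise at a target signal-to-noise ratio, the basic building block of a
# speech-in-noise comprehension experiment. All signals are synthetic stand-ins.
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR (in dB), then add it."""
    noise = noise[: len(speech)]                 # trim noise to the speech length
    p_speech = np.mean(speech ** 2)              # average speech power
    p_noise = np.mean(noise ** 2)                # average noise power
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000, endpoint=False)
speech = np.sin(2 * np.pi * 220 * t)               # stand-in for a speech recording
babble = rng.standard_normal(t.size)               # stand-in for babble noise
mixture = mix_at_snr(speech, babble, snr_db=-6.0)  # -6 dB: a hard listening condition
```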


2021, Vol. 12 (1), pp. 33
Author(s): Andres Camarena, Grace Manchala, Julianne Papadopoulos, Samantha R. O’Connell, Raymond L. Goldsworthy

Cochlear implants have been used to restore hearing to more than half a million people around the world. The restored hearing allows most recipients to understand spoken speech without relying on visual cues. While speech comprehension in quiet is generally high for recipients, many complain about the sound of music. The present study examines consonance and dissonance perception in nine cochlear implant users and eight people with no known hearing loss. Participants completed web-based assessments to characterize low-level psychophysical sensitivities to modulation and pitch, as well as higher-level measures of musical pleasantness and speech comprehension in background noise. The underlying hypothesis is that sensitivity to modulation and pitch, along with higher levels of musical sophistication, relates to higher-level measures of music and speech perception. This hypothesis was borne out: strong correlations were observed between the measures of modulation and pitch and the measures of consonance ratings and speech recognition. Additionally, the cochlear implant users who were most sensitive to modulation and pitch, and who had higher musical sophistication scores, gave pleasantness ratings similar to those of the listeners with no known hearing loss. The implication is that better coding of, and focused rehabilitation for, modulation and pitch sensitivity will broadly improve perception of music and speech for cochlear implant users.
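As a hedged illustration of the correlational analysis described above, the sketch below computes Pearson correlations between stand-ins for the low-level psychophysical measures and the higher-level outcomes; all data and variable names are hypothetical, not the study's:

```python
# Hypothetical sketch of the correlational analysis: relate low-level
# psychophysical sensitivities to higher-level music/speech measures.
# The arrays are made-up placeholders, not the study's data.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n = 17  # 9 CI users + 8 listeners with no known hearing loss, as in the study
modulation_sensitivity = rng.normal(size=n)
pitch_sensitivity = 0.8 * modulation_sensitivity + rng.normal(scale=0.5, size=n)
consonance_rating = 0.6 * pitch_sensitivity + rng.normal(scale=0.5, size=n)
speech_in_noise = 0.6 * modulation_sensitivity + rng.normal(scale=0.5, size=n)

pairs = [
    ("modulation vs. speech-in-noise", modulation_sensitivity, speech_in_noise),
    ("pitch vs. consonance rating", pitch_sensitivity, consonance_rating),
]
for name, x, y in pairs:
    r, p = pearsonr(x, y)                  # strength and significance of the link
    print(f"{name}: r = {r:.2f}, p = {p:.3f}")
```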


Loquens, 2021, Vol. 7 (2), pp. e074
Author(s): Lei He, Yu Zhang

Lower modulation rates in the temporal envelope (ENV) of the acoustic signal are believed to be the rhythmic backbone of speech, facilitating speech comprehension through neuronal entrainment at δ and θ rates (phonetically, these rates are comparable to the foot and syllable rates, respectively). The jaw acts as a carrier articulator, regulating mouth opening in a quasi-cyclical way whose physical consequence is these low-frequency modulations. This paper describes a method for examining the joint roles of jaw oscillation and ENV in realizing speech rhythm using spectral coherence. Relative powers in the frequency bands corresponding to the δ- and θ-oscillations in the coherence (notated %δ and %θ, respectively) were quantified as one possible way of revealing the amount of concomitant foot- and syllable-level rhythmicity carried by both the acoustic and articulatory domains. Two English corpora (mngu0 and MOCHA-TIMIT) were used as a proof of concept. For an initial analysis, %δ and %θ were regressed on utterance duration. Results showed that the degrees of foot- and syllable-sized rhythmicity differ and are contingent on utterance length.
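A minimal sketch of such a coherence analysis is given below, assuming δ = 1-4 Hz and θ = 4-8 Hz band edges (my assumption; the paper derives its rates phonetically) and using a toy jaw trajectory in place of real articulography data:

```python
# Sketch of a jaw/envelope coherence analysis with band-relative powers.
# Band edges (1-4 Hz, 4-8 Hz) are assumptions; `jaw` would be a jaw-opening
# trajectory and `env` the acoustic temporal envelope, both at rate `fs`.
import numpy as np
from scipy.signal import coherence

fs = 100.0                                   # 100 Hz: a typical articulography rate
t = np.arange(0, 10, 1 / fs)
jaw = np.sin(2 * np.pi * 5 * t)              # toy 5 Hz (syllable-rate) oscillation
env = jaw + 0.3 * np.random.default_rng(2).standard_normal(t.size)

f, cxy = coherence(jaw, env, fs=fs, nperseg=256)   # magnitude-squared coherence

def relative_power(f, cxy, lo, hi):
    """Fraction of total coherence falling in the [lo, hi) Hz band."""
    band = (f >= lo) & (f < hi)
    return cxy[band].sum() / cxy.sum()

pct_delta = relative_power(f, cxy, 1.0, 4.0)  # foot-level rhythmicity (%δ)
pct_theta = relative_power(f, cxy, 4.0, 8.0)  # syllable-level rhythmicity (%θ)
print(f"%δ = {pct_delta:.2f}, %θ = {pct_theta:.2f}")
```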


2021, Vol. 2 (24 A), pp. 137-149
Author(s): Wioletta A. Piegzik

This paper presents the phenomenon of anticipation, one of the manifestations of linguistic maturity and language-user rationality. Anticipation, which takes place essentially in implicit structures and rests on evolutionarily old intuition, improves speech comprehension and increases the efficiency of cognitive processes. The phenomenon is presented using the example of foreign-language communication, because it is there that the mechanisms governing the formulation of accurate hypotheses about form and content are particularly evident. The first part of the article discusses speech perception, and with it the categorization and selection of an appropriate cognitive schema that conditions accurate anticipation. The second part presents factors that facilitate or hinder forming the right hypothesis. Finally, conclusions and directions for future research on anticipation are formulated.


Author(s): E. Artukarslan, F. Matin, F. Donnerstag, L. Gärtner, T. Lenarz, ...

Abstract
Introduction: Superficial hemosiderosis is a sub-form of hemosiderosis in which deposits of hemosiderin in the central nervous system damage nerve cells. This form of siderosis is caused by chronic cerebral hemorrhages, especially subarachnoid hemorrhages. The diversity of symptoms depends on the respective damage to the brain, but in most cases it presents as incipient unilateral or bilateral hearing loss, ataxia, and pyramidal tract signs. We investigate whether cochlear implantation is a treatment option for patients with superficial hemosiderosis and which diagnostic procedures should be performed preoperatively.
Materials and methods: In a tertiary hospital between 2009 and 2018, we examined five patients (N = 5) with radiologically confirmed central hemosiderosis who suffered from profound hearing loss to deafness and were treated with a cochlear implant (CI). We compared pre- and postoperative speech comprehension (Freiburg speech intelligibility test for monosyllables and HSM sentence test).
Results: Speech understanding improved on average by 20% (monosyllabic test in the Freiburg speech intelligibility test) and by 40% in noise (HSM sentence test) compared to preoperative speech understanding with optimized hearing aids.
Discussion: The results show that patients with superficial siderosis benefit from CI, with better speech understanding, although the results are below the average for all postlingually deaf CI patients. Superficial siderosis causes neural damage, which explains the reduced speech understanding in terms of a central hearing loss. It is important to weigh the patient's expectations correctly preoperatively and to involve neurologists in the therapy.


2021
Author(s): Jeremy Giroud, Jacques Pesnot Lerousseau, Francois Pellegrino, Benjamin Morillon

Humans are experts at processing speech, but how this feat is accomplished remains a major question in cognitive neuroscience. Capitalizing on the concept of channel capacity, we developed a unified measurement framework to investigate the respective influence of seven acoustic and linguistic features on speech comprehension, encompassing acoustic, sub-lexical, lexical, and supra-lexical levels of description. We show that comprehension is independently impacted by all of these features, but to varying degrees and with a clear dominance of the syllabic rate. Comparing comprehension of French words and sentences further reveals that when supra-lexical contextual information is present, the impact of all other features is dramatically reduced. Finally, we estimated the channel capacity associated with each linguistic feature and compared it with the feature's generic distribution in natural speech. Our data point towards supra-lexical contextual information as the feature limiting the flow of natural speech. Overall, this study reveals how multilevel linguistic features constrain speech comprehension.
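As one hedged reading of the channel-capacity idea (my interpretation, not the authors' code), the toy sketch below treats comprehension as a psychometric function of syllabic rate and takes the capacity as the maximum of rate times comprehension; the curve and its parameters are invented for illustration:

```python
# Toy illustration of the channel-capacity idea: if comprehension p(rate)
# falls off as speech is presented faster, the effective information flow is
# rate * p(rate), and "capacity" is its maximum. Parameters are made up.
import numpy as np

def comprehension(rate, midpoint=12.0, slope=1.5):
    """Invented psychometric curve: P(understood) vs. syllabic rate (syll/s)."""
    return 1.0 / (1.0 + np.exp((rate - midpoint) / slope))

rates = np.linspace(2, 25, 200)          # candidate syllabic rates, syllables/s
flow = rates * comprehension(rates)      # effective information flow
best = rates[np.argmax(flow)]
print(f"flow peaks at ~{best:.1f} syllables/s (toy 'capacity')")
```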

