Setting the Stage for Speech Production: Infants Prefer Listening to Speech Sounds With Infant Vocal Resonances

Author(s):  
Linda Polka ◽  
Matthew Masapollo ◽  
Lucie Ménard

Purpose: Current models of speech development argue for an early link between speech production and perception in infants. Recent data show that young infants (at 4–6 months) preferentially attend to speech sounds (vowels) with infant vocal properties compared to those with adult vocal properties, suggesting the presence of special “memory banks” for one's own nascent speech-like productions. This study investigated whether the vocal resonances (formants) of the infant vocal tract are sufficient to elicit this preference and whether this perceptual bias changes with age and emerging vocal production skills. Method: We selectively manipulated the fundamental frequency (f0) of vowels synthesized with formants specifying either an infant or adult vocal tract, and then tested the effects of those manipulations on the listening preferences of infants who were slightly older than those previously tested (at 6–8 months). Results: Unlike findings with younger infants (at 4–6 months), slightly older infants in Experiment 1 displayed a robust preference for vowels with infant formants over adult formants when f0 was matched. The strength of this preference was also positively correlated with age among infants between 4 and 8 months. In Experiment 2, this preference favoring infant over adult formants was maintained when f0 values were modulated. Conclusions: Infants between 6 and 8 months of age displayed a robust and distinct preference for speech with resonances specifying a vocal tract that is similar in size and length to their own. This finding, together with data indicating that this preference is not present in younger infants and appears to increase with age, suggests that nascent knowledge of the motor schema of the vocal tract may play a role in shaping this perceptual bias, lending support to current models of speech development. Supplemental Material: https://doi.org/10.23641/asha.17131805

Phonology ◽  
1998 ◽  
Vol 15 (2) ◽  
pp. 143-188 ◽  
Author(s):  
Grzegorz Dogil ◽  
Jörg Mayer

The present study proposes a new interpretation of the underlying distortion in apraxia of speech. Apraxia of speech, in its pure form, is the only neurolinguistic syndrome for which it can be argued that phonological structure is selectively distorted. "Apraxia of speech is a nosological entity in its own right which co-occurs with aphasia only occasionally. This … conviction rests on detailed descriptions of patients who have a severe and lasting disorder of speech production in the absence of any significant impairment of speech comprehension, reading or writing as well as of any significant paralysis or weakness of the speech musculature." (Lebrun 1990: 380) Based on the experimental investigation of poorly coarticulated speech of patients from two divergent languages (German and Xhosa) it is argued that apraxia of speech has to be seen as a defective implementation of phonological representations at the phonology–phonetics interface. We contend that phonological structure exhibits neither a homogeneously auditory pattern nor a motor pattern, but a complex encoding of sequences of speech sounds. Specifically, it is maintained that speech is encoded in the brain as a sequence of distinctive feature configurations. These configurations are specified with differing degrees of detail depending on the role the speech segments they underlie play in the phonological structure of a language. The transfer between phonological and phonetic representation encodes speech sounds as a sequence of vocal tract configurations. Like the distinctive feature representation, these configurations may be more or less specified. We argue that the severe and lasting disorders in speech production observed in apraxia of speech are caused by the distortion of this transfer between phonological and phonetic representation. The characteristic production deficits of apraxic patients are explained in terms of overspecification of phonetic representations.


2012 ◽  
Vol 107 (1) ◽  
pp. 442-447 ◽  
Author(s):  
Takayuki Ito ◽  
David J. Ostry

Interactions between auditory and somatosensory information are relevant to the neural processing of speech since speech processes, and certainly speech production, involve both auditory information and inputs that arise from the muscles and tissues of the vocal tract. We previously demonstrated that somatosensory inputs associated with facial skin deformation alter the perceptual processing of speech sounds. We show here that the reverse is also true, that speech sounds alter the perception of facial somatosensory inputs. As a somatosensory task, we used a robotic device to create patterns of facial skin deformation that would normally accompany speech production. We found that the perception of the facial skin deformation was altered by speech sounds in a manner that reflects the way in which auditory and somatosensory effects are linked in speech production. The modulation of orofacial somatosensory processing by auditory inputs was specific to speech and likewise to facial skin deformation. Somatosensory judgments were not affected when the skin deformation was delivered to the forearm or palm or when the facial skin deformation accompanied nonspeech sounds. The perceptual modulation that we observed in conjunction with speech sounds shows that speech sounds specifically affect neural processing in the facial somatosensory system and suggests the involvement of the somatosensory system in both the production and perceptual processing of speech.


2017 ◽  
Vol 284 (1859) ◽  
pp. 20171158 ◽  
Author(s):  
Bret Pasch ◽  
Isao T. Tokuda ◽  
Tobias Riede

Functional changes in vocal organ morphology and motor control facilitate the evolution of acoustic signal diversity. Although many rodents produce vocalizations in a variety of social contexts, few studies have explored the underlying production mechanisms. Here, we describe mechanisms of audible and ultrasonic vocalizations (USVs) produced by grasshopper mice (genus Onychomys). Grasshopper mice are predatory rodents of the desert that produce both loud, long-distance advertisement calls and USVs in close-distance mating contexts. Using live-animal recording in normal air and heliox, laryngeal and vocal tract morphological investigations, and biomechanical modelling, we found that grasshopper mice employ two distinct vocal production mechanisms. In heliox, changes in higher-harmonic amplitudes of long-distance calls indicate an airflow-induced tissue vibration mechanism, whereas changes in fundamental frequency of USVs support a whistle mechanism. Vocal membranes and a thin lamina propria aid in the production of long-distance calls by increasing glottal efficiency and permitting high frequencies, respectively. In addition, tuning of fundamental frequency to the second resonance of a bell-shaped vocal tract increases call amplitude. Our findings indicate that grasshopper mice can dynamically adjust motor control to suit the social context and have novel morphological adaptations that facilitate long-distance communication.


2020 ◽  
Vol 63 (4) ◽  
pp. 931-947
Author(s):  
Teresa L. D. Hardy ◽  
Carol A. Boliek ◽  
Daniel Aalto ◽  
Justin Lewicke ◽  
Kristopher Wells ◽  
...  

Purpose: The purpose of this study was twofold: (a) to identify a set of communication-based predictors (including both acoustic and gestural variables) of masculinity–femininity ratings and (b) to explore differences in ratings between audio and audiovisual presentation modes for transgender and cisgender communicators. Method: The voices and gestures of a group of cisgender men and women (n = 10 of each) and transgender women (n = 20) communicators were recorded while they recounted the story of a cartoon using acoustic and motion capture recording systems. A total of 17 acoustic and gestural variables were measured from these recordings. A group of observers (n = 20) rated each communicator's masculinity–femininity based on 30- to 45-s samples of the cartoon description presented in three modes: audio, visual, and audiovisual. Visual and audiovisual stimuli contained point light displays standardized for size. Ratings were made using a direct magnitude estimation scale without modulus. Communication-based predictors of masculinity–femininity ratings were identified using multiple regression, and analysis of variance was used to determine the effect of presentation mode on perceptual ratings. Results: Fundamental frequency, average vowel formant, and sound pressure level were identified as significant predictors of masculinity–femininity ratings for these communicators. Communicators were rated significantly more feminine in the audio than the audiovisual mode and unreliably in the visual-only mode. Conclusions: Both study purposes were met. Results support continued emphasis on fundamental frequency and vocal tract resonance in voice and communication modification training with transgender individuals and provide evidence for the potential benefit of modifying sound pressure level, especially when a masculine presentation is desired.


Animals ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. 167 ◽  
Author(s):  
Anton Baotic ◽  
Maxime Garcia ◽  
Markus Boeckle ◽  
Angela Stoeger

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the ‘rumble’, with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles most informatively by formant frequencies (the resonance frequencies of the vocal tract). As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcast and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant’s natural habitat.


1992 ◽  
Vol 35 (4) ◽  
pp. 761-768 ◽  
Author(s):  
Petra Zwirner ◽  
Gary J. Barnes

Acoustic analyses of upper airway and phonatory stability were conducted on samples of sustained phonation to evaluate the relation between laryngeal and articulomotor stability for 31 patients with dysarthria and 12 non-dysarthric control subjects. Significantly higher values were found for the variability in fundamental frequency and formant frequency of patients who have Huntington’s disease compared with normal subjects and patients with Parkinson’s disease. No significant correlations were found between formant frequency variability and the variability of the fundamental frequency for any subject group. These findings are discussed as they pertain to the relationship between phonatory and upper airway subsystems and the evaluation of vocal tract motor control impairments in dysarthria.


Author(s):  
Radhika Rani L ◽  
S. Chandra lingam ◽  
Anjaneyulu T ◽  
Satyanarayana K

Congenital heart defects (CHD) are critical heart disorders that are present in infants at birth. They fall into two main classes, cyanotic and acyanotic. The present paper concentrates on acyanotic heart disorders. An acyanotic heart disorder cannot be observed on external checkup, whereas bluish skin indicates an infant affected by a cyanotic disorder. An acyanotic heart disorder can only be diagnosed using chest X-ray, ECG, echocardiogram, cardiac catheterization, or MRI of the heart. The present work aims at estimating the fundamental frequency (pitch) and the vocal tract resonant frequencies (formants) from the cry signal of infants. The pitch frequency and formant frequencies are estimated using frequency-domain (cepstrum) and linear predictive coding (LPC) methods. The results show that the fundamental frequency of the cry signal was between 600 Hz and 800 Hz for infants with acyanotic heart disorders. This fundamental frequency helps in identifying acyanotic heart disorders at an early stage.
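
As an illustration of the cepstrum method named in this abstract, the following is a minimal sketch of cepstral pitch estimation. It is not the authors' implementation. The function names, the 16 kHz sample rate, the 2048-sample frame, the 400–1000 Hz search range, and the synthetic harmonic tone standing in for a cry recording are all assumptions made for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def real_cepstrum(frame):
    """Real cepstrum of one Hamming-windowed frame."""
    n = len(frame)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    spectrum = fft([s * w for s, w in zip(frame, window)])
    log_mag = [math.log(abs(c) + 1e-12) for c in spectrum]
    # log_mag is real and even, so its forward FFT divided by n
    # equals its inverse FFT.
    return [c.real / n for c in fft(log_mag)]


def estimate_f0(frame, sample_rate, f0_min=400.0, f0_max=1000.0):
    """Pick the cepstral peak inside the assumed pitch search range (Hz)."""
    cep = real_cepstrum(frame)
    q_lo = int(sample_rate / f0_max)  # shortest pitch period, in samples
    q_hi = int(sample_rate / f0_min)  # longest pitch period, in samples
    peak = max(range(q_lo, q_hi + 1), key=lambda q: cep[q])
    return sample_rate / peak


def synthetic_cry(f0, sample_rate, n, harmonics=10):
    """Hypothetical stand-in for a cry signal: a decaying harmonic series."""
    return [
        sum(math.sin(2 * math.pi * h * f0 * i / sample_rate) / h
            for h in range(1, harmonics + 1))
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = synthetic_cry(700.0, 16000, 2048)
    print(round(estimate_f0(signal, 16000), 1))
```

Because the estimate is quantized to whole-sample pitch periods, its resolution at these frequencies is a few tens of hertz; a real analysis would typically interpolate around the cepstral peak or use a higher sample rate.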

