The effect of emotion on voice production and speech acoustics

2017
Author(s):  
Tom Johnstone

The study of emotional expression in the voice has typically relied on acted portrayals of emotions, with the majority of studies focussing on the perception of emotion in such portrayals. The acoustic characteristics of natural, often involuntary encoding of emotion in the voice, and the mechanisms responsible for such vocal modulation, have received little attention from researchers. The small number of studies on natural or induced emotional speech has failed to identify acoustic patterns specific to different emotions. Instead, most acoustic changes measured have been explainable as resulting from the level of physiological arousal characteristic of different emotions. Thus, measurements of the acoustic properties of angry, happy and fearful speech have been similar, corresponding to their similarly elevated arousal levels. An opposing view, the most elaborate description of which was given by Scherer (1986), is that emotions affect the acoustic characteristics of speech along a number of dimensions, not only arousal. The lack of empirical data supporting such a theory has been blamed on the lack of sophistication of the acoustic analyses in the little research that has been done.

By inducing real emotional states in the laboratory, using a variety of computer-administered induction methods, this thesis aimed to test the two opposing accounts of how emotion affects the voice. The induction methods were designed to manipulate some of the principal dimensions along which, according to multidimensional theories, emotional speech is expected to vary. A set of acoustic parameters selected to capture temporal, fundamental frequency (F0), intensity and spectral characteristics of the voice was extracted from speech recordings.
In addition, electroglottal and physiological measurements were made in parallel with the speech recordings, in an effort to determine the mechanisms underlying the measured acoustic changes. The results indicate that a single arousal dimension cannot adequately describe the range of emotional vocal changes, and lend weight to a theory of multidimensional emotional response patterning as suggested by Scherer and others. The correlations between physiological and acoustic measures, although small, indicate that variations in sympathetic autonomic arousal do correspond to changes in F0 level and vocal fold dynamics as indicated by electroglottography. Changes to spectral properties, speech fluency and F0 dynamics, however, cannot be fully explained in terms of sympathetic arousal, and are probably also related to cognitive processes involved in speech planning.
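As an illustration of the kind of F0 measurement such studies rely on, the following is a minimal autocorrelation pitch estimator. This is a generic sketch, not the analysis pipeline used in the thesis; the function name, parameters and test signal are all illustrative, and tools such as Praat implement far more robust variants.

```python
import math

def estimate_f0(samples, sr, fmin=75.0, fmax=500.0):
    """Estimate fundamental frequency (F0) by autocorrelation.

    Searches for the lag (candidate period) that maximizes the
    signal's correlation with a delayed copy of itself, then
    converts that lag back to a frequency in Hz.
    """
    n = len(samples)
    lag_min = int(sr / fmax)                  # shortest period considered
    lag_max = min(int(sr / fmin), n - 1)      # longest period considered
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sr / best_lag if best_lag else 0.0

# Example: a 200 Hz sine sampled at 8 kHz for 0.1 s
sr = 8000
tone = [math.sin(2 * math.pi * 200 * t / sr) for t in range(800)]
print(round(estimate_f0(tone, sr)))  # prints 200
```

Real emotional-speech analyses track F0 frame by frame over an utterance; this sketch returns a single estimate for one stationary frame.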

Author(s):  
Rosario Signorello

Voice is one of the most reliable and efficient behaviors that charismatic leaders use to convey their personality traits and emotional states in order to influence followers. Charismatic leaders manipulate the acoustic characteristics of the voice through language- and culture-based conventions. These manipulations produce different vocal qualities, which in turn lead to the perception of different leader traits and types of charisma. This chapter first presents a sociocognitive approach to describing the phenomenon of charisma in leadership and illustrates how charisma is described across cultures. It then addresses several issues of voice in charismatic leadership: the biological and cultural functions of the charismatic voice, how vocal behavior conveys charismatic leadership, how the voice shapes the interaction between leaders and followers, and how the charismatic voice is perceived across languages and cultures.


2002
Vol 94 (3)
pp. 767-771
Author(s):  
Robert Kevin Manning ◽  
Donald Fucci ◽  
Richard Dean

The purpose of this study was to examine college-age males' ability to produce the acoustic properties of the normally aging voice when reading. The 17 subjects (M age = 21.13 yr., SD = 1.0) selected for this study were undergraduates who were placed into a single group. The procedure involved recording the subjects while reading The Rainbow Passage aloud. The first reading was in the subject's natural speaking voice; during the second reading, the subject imitated the voice of a normally aging 70-yr.-old man. Fundamental frequency and temporal measures were analyzed for each voice sample, and mean scores for each measure were compared between the natural-voice productions and the imitations. Analysis showed that temporal measures had the strongest influence on subjects' imitations of the normally aging voice, as reflected in an overall increase in all temporal measures.


Author(s):  
Vincent Martel-Sauvageau ◽  
Myriam Breton ◽  
Alexandra Chabot ◽  
Mélanie Langlois

Purpose Studies have reported that clear speech has the potential to influence suprasegmental and segmental aspects of speech in both healthy and dysarthric speakers. While the impact of clear speech on the articulation of individual segments has been studied, few studies have investigated its effects on coarticulation in multisegment sequences such as fricative–vowel. Objectives The goals of this study were to investigate, in healthy and dysarthric speech, the impact of clear speech on (a) the perception of anticipatory vowel coarticulation in fricatives and (b) the acoustic characteristics of this effect. Method Ten speakers with dysarthria secondary to idiopathic Parkinson's disease were recruited, as well as 10 age- and sex-matched healthy speakers. A sentence reading task was performed in natural and clear speaking conditions. The sentences contained words with the initial fricatives /s/ and /ʃ/ preceded by /ə/ and followed by the vowels /i/, /y/, /u/, or /a/. For the perceptual measurements, five listeners were recruited and asked to predict the upcoming word by listening only to the isolated fricative. Acoustic analyses consisted of spectral moment analysis (M1–M4) on averaged time series. Results Perceptual findings show that identification rates improved with clear speech for the speakers with dysarthria, but only for the fricative–/i/ sequences. Error pattern analysis indicates that this improvement is associated with better identification of the roundness feature (lip spreading). Acoustic results are unclear for M1 and M3, but suggest that clear speech increases the differentiation of M2 and M4 between rounded and unrounded vowel contexts for the speakers with dysarthria. Discussion Taken together, these findings suggest that clear speech may improve lip coordination in dysarthric speakers with Parkinson's disease. The impact of clear speech on the acoustic measures of fricative spectral moments, however, is somewhat limited. This suggests that these metrics, taken individually, do not capture the full complexity of fricative–vowel coarticulation.
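For readers unfamiliar with spectral moment analysis, the four moments can be computed from a power spectrum using the standard statistical definitions: M1 is the spectral mean (centroid), M2 the variance (some studies report the standard deviation instead), M3 the skewness, and M4 the excess kurtosis. The sketch below is illustrative and is not the authors' analysis code; the function name and sample values are invented.

```python
import math

def spectral_moments(freqs, powers):
    """Compute spectral moments M1-M4 of a power spectrum.

    Treats the normalized power spectrum as a probability
    distribution over frequency: M1 = mean, M2 = variance,
    M3 = skewness, M4 = excess kurtosis.
    """
    total = sum(powers)
    probs = [p / total for p in powers]
    m1 = sum(f * p for f, p in zip(freqs, probs))
    m2 = sum((f - m1) ** 2 * p for f, p in zip(freqs, probs))
    sd = math.sqrt(m2)
    m3 = sum(((f - m1) / sd) ** 3 * p for f, p in zip(freqs, probs))
    m4 = sum(((f - m1) / sd) ** 4 * p for f, p in zip(freqs, probs)) - 3.0
    return m1, m2, m3, m4

# A symmetric toy spectrum: centroid 2000 Hz, zero skewness
m1, m2, m3, m4 = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
print(round(m1), round(m3, 6))  # prints 2000 0.0
```

Differentiation between vowel contexts, as in the study above, would then be assessed by comparing these moments for fricatives preceding rounded versus unrounded vowels.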


Author(s):  
Evelyn Alves Spazzapan ◽  
Eliana Maria Gradim Fabron ◽  
Larissa Cristina Berti ◽  
Eduardo Federighi Baisi Chagas ◽  
Viviane Cristina de Castro Marino

2017
Vol 37 (6)
pp. 612-629
Author(s):  
Chiara Suttora ◽  
Nicoletta Salerni ◽  
Paola Zanchi ◽  
Laura Zampini ◽  
Maria Spinelli ◽  
...  

This study aimed to investigate specific associations between structural and acoustic characteristics of infant-directed (ID) speech and word recognition. Thirty Italian-acquiring children and their mothers were tested when the children were 1;3. Children's word recognition was measured with the looking-while-listening task. Maternal ID speech was recorded during a mother–child interaction session and analyzed in terms of amount of speech, lexical and syntactic complexity, positional salience of nouns and verbs, pitch height and variation, and temporal characteristics. The analyses revealed that final syllable length positively predicts children's accuracy in word recognition, whereas the use of verbs in utterance-final position has an adverse effect on children's performance. Several of the expected associations between ID speech features and children's word recognition skills, however, were not significant. Taken together, these findings suggest that only specific structural and acoustic properties of ID speech facilitate word recognition in children, fostering their ability to extract sound patterns from the speech stream and map them onto their referents.


2019
Vol 62 (1)
pp. 60-69
Author(s):  
Areen Badwal ◽  
JoHanna Poertner ◽  
Robin A. Samlan ◽  
Julie E. Miller

Purpose The zebra finch is used as a model to study the neural circuitry of auditory-guided human vocal production. The terminology of birdsong production and acoustic analysis, however, differs from that of human voice production, making it difficult for voice researchers of either species to navigate the literature from the other. The purpose of this research note is to identify common terminology and measures to better compare information across species. Method Terminology used in the birdsong literature will be mapped onto terminology used in the human voice production literature. Measures typically used to quantify the percepts of pitch, loudness, and quality will be described. Measures common to the literature in both species will be made from the songs of 3 middle-aged birds using Praat and Song Analysis Pro. Two measures, cepstral peak prominence (CPP) and Wiener entropy (WE), will be compared to determine whether they provide similar information. Results Similarities and differences in terminology and acoustic analyses are presented. A core set of measures including frequency, frequency variability within a syllable, intensity, CPP, and WE is proposed for future studies. CPP and WE are related yet provide unique information about syllable structure. Conclusions Using a core set of measures familiar to both human voice and birdsong researchers, along with both CPP and WE, will allow characterization of similarities and differences among birds. Standard terminology and measures will improve accessibility of the birdsong literature to human voice researchers and vice versa. Supplemental Material https://doi.org/10.23641/asha.7438964
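Of the two quality measures compared in this note, Wiener entropy has a compact closed-form definition: the log ratio of the geometric mean to the arithmetic mean of the power spectrum, i.e. a log spectral flatness. It is 0 for a maximally flat (white-noise-like) spectrum and increasingly negative for tonal, harmonically structured sound. The following is a minimal illustrative sketch of that definition, not the Song Analysis Pro implementation; the example spectra are invented.

```python
import math

def wiener_entropy(powers):
    """Wiener entropy (log spectral flatness) of a power spectrum.

    log(geometric mean / arithmetic mean): 0 for a perfectly flat
    spectrum, negative for spectra dominated by a few components.
    """
    n = len(powers)
    log_gm = sum(math.log(p) for p in powers) / n   # log of geometric mean
    am = sum(powers) / n                            # arithmetic mean
    return log_gm - math.log(am)

flat = [1.0] * 64                  # white-noise-like spectrum
peaky = [100.0] + [0.01] * 63      # tone-like spectrum, one dominant bin
print(wiener_entropy(flat))        # prints 0.0
print(wiener_entropy(peaky))       # strongly negative (tone-like)
```

CPP, by contrast, is computed in the cepstral domain (the prominence of the cepstral peak above a regression line fit to the cepstrum), which is why the two measures are related but not redundant.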


Author(s):  
Peter Townsend

Voice and singing are fundamental to music. Scales and content reflect our personal culture. Something beautiful and inspiring to one person may be a boring cacophony to another. Views of musical evolution seen through the lens of culture are therefore varied and individual. Input from science is generally less obvious, except for changes driven by the acoustics of buildings, broadcasting, and electronic sound equipment. Medical studies reveal how we form sounds and tone quality, and modern electronic signal processing reveals the complexity of the harmonic content of singing. The differences between sweetness, harshness, carrying power, and so on depend not just on volume but on the fundamental note and its harmonics, plus all the other frequencies generated in our vocalization. One fundamental may have 50 or more associated frequencies. This signal-processing tool is invaluable for understanding voice production.
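The harmonic analysis described here can be sketched with a naive discrete Fourier probe at integer multiples of the fundamental. This is an illustrative example only (the function and the synthetic "vowel" are invented); the signal-processing software the chapter refers to would use windowed FFTs on real recordings.

```python
import math

def harmonic_magnitudes(samples, sr, f0, n_harmonics):
    """Amplitude of each harmonic of f0, via a DFT probe per harmonic.

    Correlates the signal with a cosine and sine at each harmonic
    frequency and returns the resulting amplitudes.
    """
    n = len(samples)
    mags = []
    for h in range(1, n_harmonics + 1):
        f = h * f0
        re = sum(s * math.cos(2 * math.pi * f * i / sr) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * f * i / sr) for i, s in enumerate(samples))
        mags.append(2 * math.hypot(re, im) / n)
    return mags

# A synthetic "sung vowel": 100 Hz fundamental plus two weaker harmonics
sr = 8000
sig = [math.sin(2 * math.pi * 100 * i / sr)
       + 0.5 * math.sin(2 * math.pi * 200 * i / sr)
       + 0.25 * math.sin(2 * math.pi * 300 * i / sr)
       for i in range(800)]
print([round(m, 2) for m in harmonic_magnitudes(sig, sr, 100, 3)])  # prints [1.0, 0.5, 0.25]
```

The relative strengths recovered here are what distinguish, for example, a sweet from a harsh tone quality at the same fundamental.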


2019
Vol 23 (02)
pp. 203-208
Author(s):  
Aline Juliane Romann ◽  
Bárbara Costa Beber ◽  
Carla Aparecida Cielo ◽  
Carlos Roberto de Mello Rieder

Introduction Subthalamic nucleus deep brain stimulation (STN-DBS) improves motor function in individuals with Parkinson disease (PD). The evidence on the effects of STN-DBS on the voice is still inconclusive. Objective To verify the effect of STN-DBS on the voice of Brazilian individuals with PD. Methods Sixteen participants were evaluated on the Unified Parkinson Disease Rating Scale (Part III) and by measurement of acoustic modifications in the on- and off-conditions of stimulation. Results Motor symptoms showed significant improvement with STN-DBS on. Regarding the acoustic measures of the voice, only the maximum fundamental frequency (fhi) showed a statistically significant difference between the on- and off-conditions, with a reduction in the off-condition. Conclusion Changes in computerized acoustic measures are more valuable when interpreted in conjunction with changes in other measures. The isolated finding for fhi suggests that STN-DBS increases vocal instability. This result should be interpreted carefully, since it may be of limited value if other measures that also index instability do not differ significantly.

