Continuous Assessment of Children’s Emotional States Using Acoustic Analysis

Author(s):  
Yuan Gong ◽  
Christian Poellabauer


2020 ◽  
Vol 287 (1929) ◽  
pp. 20201148
Author(s):  
Roza G. Kamiloğlu ◽  
Katie E. Slocombe ◽  
Daniel B. M. Haun ◽  
Disa A. Sauter

Vocalizations linked to emotional states are partly conserved among phylogenetically related species. This continuity may allow humans to accurately infer affective information from vocalizations produced by chimpanzees. In two pre-registered experiments, we examine human listeners' ability to infer behavioural contexts (e.g. discovering food) and core affect dimensions (arousal and valence) from 155 vocalizations produced by 66 chimpanzees in 10 different positive and negative contexts at high, medium or low arousal levels. In experiment 1, listeners (n = 310) categorized the vocalizations in a forced-choice task with 10 response options and rated arousal and valence. In experiment 2, participants (n = 3120) matched vocalizations to production contexts using yes/no response options. The results show that listeners were accurate at matching vocalizations of most contexts, in addition to inferring arousal and valence. Judgments were more accurate for negative than for positive vocalizations. An acoustic analysis demonstrated that listeners made use of brightness and duration cues, relying on noisiness in making context judgments and on pitch to infer core affect dimensions. Overall, the results suggest that human listeners can infer affective information from chimpanzee vocalizations beyond core affect, indicating phylogenetic continuity in the mapping of vocalizations to behavioural contexts.
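
The four cues named in this abstract (brightness, duration, noisiness, and pitch) map onto standard signal-processing features. Below is a minimal Python sketch, assuming librosa and common proxies (spectral centroid for brightness, spectral flatness for noisiness, median F0 for pitch); the study's own feature definitions may differ.

import librosa
import numpy as np

def acoustic_cues(path):
    # Load the recording at its native sampling rate.
    y, sr = librosa.load(path, sr=None)
    duration = len(y) / sr                                     # seconds
    # Brightness proxy: mean spectral centroid (Hz).
    brightness = librosa.feature.spectral_centroid(y=y, sr=sr).mean()
    # Noisiness proxy: mean spectral flatness (0 = tonal, 1 = noise-like).
    noisiness = librosa.feature.spectral_flatness(y=y).mean()
    # Pitch proxy: median F0 over voiced frames (pyin returns NaN elsewhere).
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=1600, sr=sr)
    pitch = np.nanmedian(f0)
    return {"duration_s": duration, "brightness_hz": float(brightness),
            "noisiness": float(noisiness), "pitch_hz": float(pitch)}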


2015 ◽  
Vol 18 ◽  
Author(s):  
Francisco Martínez-Sánchez ◽  
José Antonio Muela-Martínez ◽  
Pedro Cortés-Soto ◽  
Juan José García Meilán ◽  
Juan Antonio Vera Ferrándiz ◽  
...  

Emotional states, attitudes and intentions are often conveyed by modulations in the tone of voice. Impaired recognition of emotions from tone of voice (receptive prosody) has been described as a characteristic symptom of schizophrenia. However, the ability to express non-verbal information in speech (expressive prosody) has been understudied. This paper describes a useful technique for quantifying the degree of expressive prosody deficits in schizophrenia using a semi-automatic method, and evaluates this method's ability to discriminate between patient and control groups. Forty-five medicated patients with a diagnosis of schizophrenia were matched with thirty-five healthy comparison subjects. Production of expressive prosodic speech was analyzed using variation in fundamental frequency (F0) measures on an emotionally neutral reading task. Results revealed that patients with schizophrenia exhibited significantly more pauses (p < .001), were slower (p < .001), and showed less pitch variability in speech (p < .05) and fewer variations in syllable timing (p < .001) than control subjects. These features have been associated with "flat" speech prosody. Signal processing algorithms applied to speech were shown to be capable of discriminating between patients and controls with an accuracy of 93.8%. These speech parameters may have diagnostic and prognostic value and could therefore be used as a dependent measure in clinical trials.
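
As an illustration of how such prosody measures are computed, here is a hedged Python sketch of a pause count and F0 variability, using librosa; the energy threshold, minimum pause duration, and F0 range are illustrative assumptions, not the study's settings.

import librosa
import numpy as np

def prosody_measures(path, top_db=30, min_pause_s=0.25):
    y, sr = librosa.load(path, sr=None)
    # Energy-based segmentation: regions more than top_db below the peak are
    # treated as silence; silent gaps of at least min_pause_s count as pauses.
    segments = librosa.effects.split(y, top_db=top_db)
    gaps = np.diff(segments.reshape(-1))[1::2] / sr    # gaps between segments
    pauses = gaps[gaps >= min_pause_s]
    # Pitch variability: standard deviation of the F0 track (voiced frames).
    f0, _, _ = librosa.pyin(y, fmin=60, fmax=400, sr=sr)
    return {"n_pauses": int(len(pauses)),
            "total_pause_s": float(pauses.sum()),
            "f0_sd_hz": float(np.nanstd(f0))}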


2020 ◽  
Author(s):  
Roza Gizem Kamiloglu ◽  
Disa Sauter

When experiencing different positive emotional states, like amusement or relief, we may produce nonverbal vocalizations such as laughs and sighs. In the current study, we describe the acoustic structure of posed and spontaneous nonverbal vocalizations of 14 different positive emotions, and test whether listeners (N = 201) map the vocalizations to emotions. The results show that vocalizations of 13 different positive emotions were recognized at better-than-chance levels, but not vocalizations of being moved. Emotions varied in whether vocalizations were better recognized from spontaneous or posed expressions.
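
A better-than-chance claim of this kind is commonly checked with a one-sided binomial test against the chance rate. A minimal sketch with hypothetical counts, assuming a 14-alternative forced-choice design (chance = 1/14):

from scipy.stats import binomtest

n_correct, n_trials, n_options = 58, 201, 14   # hypothetical counts
result = binomtest(n_correct, n_trials, p=1 / n_options, alternative="greater")
print(f"hit rate = {n_correct / n_trials:.2f}, "
      f"chance = {1 / n_options:.3f}, p = {result.pvalue:.2g}")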


2020 ◽  
Vol 63 (4) ◽  
pp. 1018-1032
Author(s):  
Chia-Hsin Wu ◽  
Roger W. Chan

Purpose: Semi-occluded vocal tract (SOVT) exercises with tubes or straws have been widely used for a variety of voice disorders. Yet, the effects of longer periods of SOVT exercises (lasting for weeks) on the aging voice are not well understood. This study investigated the effects of a 6-week straw phonation in water (SPW) exercise program.
Method: Thirty-seven elderly subjects with self-perceived voice problems were assigned to two groups: (a) SPW exercises with six weekly sessions and home practice (experimental group) and (b) vocal hygiene education (control group). Before and after intervention (2 weeks after the completion of the exercise program), acoustic analysis, auditory–perceptual evaluation, and self-assessment of vocal impairment were conducted.
Results: Analysis of covariance revealed significant postintervention differences between the two groups in smoothed cepstral peak prominence measures, harmonics-to-noise ratio, the auditory–perceptual parameter of breathiness, and Voice Handicap Index-10 scores. No significant differences between the two groups were found for other measures.
Conclusions: Our results supported the positive effects of SOVT exercises for the aging voice, with a 6-week SPW exercise program being a clinical option. Future studies should involve long-term follow-up and additional outcome measures to better understand the efficacy of SOVT exercises, particularly SPW exercises, for the aging voice.
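
The smoothed cepstral peak prominence (CPPS) measure reported above builds on basic cepstral peak prominence (CPP). The following numpy sketch computes CPP for a single analysis frame; CPPS additionally smooths across time and quefrency, which is omitted here, and the F0 search range is an assumption.

import numpy as np

def cpp(frame, sr, f0_min=60.0, f0_max=330.0):
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    spec = np.fft.fft(frame * np.hanning(len(frame)))
    log_mag = 20.0 * np.log10(np.abs(spec) + 1e-12)
    cep = np.fft.ifft(log_mag).real
    quef = np.arange(len(cep)) / sr                  # quefrency in seconds
    # Search for the cepstral peak within the plausible F0 period range.
    lo, hi = int(sr / f0_max), min(int(sr / f0_min), len(cep) // 2)
    peak = lo + np.argmax(cep[lo:hi])
    # Prominence: peak height above a line regressed through that region.
    slope, intercept = np.polyfit(quef[lo:hi], cep[lo:hi], 1)
    return cep[peak] - (slope * quef[peak] + intercept)

# Expects one windowed frame of at least ~50 ms (e.g., 2048 samples at 44.1 kHz).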


2020 ◽  
Vol 63 (1) ◽  
pp. 59-73 ◽  
Author(s):  
Panying Rong

Purpose: The purpose of this article was to validate a novel acoustic analysis of oral diadochokinesis (DDK) in assessing bulbar motor involvement in amyotrophic lateral sclerosis (ALS).
Method: An automated acoustic DDK analysis was developed, which filtered out the voice features and extracted the envelope of the acoustic waveform, reflecting the temporal pattern of syllable repetitions during an oral DDK task (i.e., repetitions of /tɑ/ at the maximum rate on one breath). Cycle-to-cycle temporal variability (cTV) of envelope fluctuations and syllable repetition rate (sylRate) were derived from the envelope and validated against two kinematic measures, tongue movement jitter (movJitter) and alternating tongue movement rate (AMR) during the DDK task, in 16 individuals with bulbar ALS and 18 healthy controls. After the validation, cTV, sylRate, movJitter, and AMR, along with an established clinical speech measure, speaking rate (SR), were compared in their ability to (a) differentiate individuals with ALS from healthy controls and (b) detect early-stage bulbar declines in ALS.
Results: cTV and sylRate were significantly correlated with movJitter and AMR, respectively, across individuals with ALS and healthy controls, confirming the validity of the acoustic DDK analysis in extracting the temporal DDK pattern. Among all the acoustic and kinematic DDK measures, cTV showed the highest diagnostic accuracy (0.87), with 80% sensitivity and 94% specificity in differentiating individuals with ALS from healthy controls, outperforming the SR measure. Moreover, cTV showed a large increase during the early disease stage, which preceded the decline of SR.
Conclusions: This study provided preliminary validation of a novel automated acoustic DDK analysis in extracting a useful measure, cTV, for early detection of bulbar ALS. The analysis overcomes a major barrier in existing acoustic DDK analyses, namely continuous voicing between syllables, which interferes with syllable structure. The approach has potential clinical applications as a novel bulbar assessment.
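
The envelope-based pipeline described above (envelope extraction, syllable peak picking, then rate and variability measures) can be sketched as follows. The filter settings, peak criteria, and the cTV formula used here (coefficient of variation of inter-syllable intervals) are assumptions for illustration, not the paper's exact definitions.

import numpy as np
from scipy.signal import butter, filtfilt, find_peaks, hilbert

def ddk_measures(y, sr):
    # Amplitude envelope via the Hilbert transform, low-pass filtered to keep
    # only syllable-level (< 10 Hz) fluctuations and suppress voicing detail.
    env = np.abs(hilbert(y))
    b, a = butter(4, 10 / (sr / 2), btype="low")
    env = filtfilt(b, a, env)
    # One envelope peak per /tɑ/ repetition; enforce 80 ms minimum separation.
    peaks, _ = find_peaks(env, distance=int(0.08 * sr), height=0.2 * env.max())
    intervals = np.diff(peaks) / sr                   # cycle durations (s)
    syl_rate = len(peaks) / (len(y) / sr)             # syllables per second
    ctv = intervals.std() / intervals.mean()          # cycle-to-cycle variability
    return syl_rate, ctv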


2014 ◽  
Vol 23 (3) ◽  
pp. 132-139 ◽  
Author(s):  
Lauren Zubow ◽  
Richard Hurtig

Children with Rett Syndrome (RS) are reported to use multiple modalities to communicate, although their intentionality is often questioned (Bartolotta, Zipp, Simpkins, & Glazewski, 2011; Hetzroni & Rubin, 2006; Sigafoos et al., 2000; Sigafoos, Woodyatt, Tucker, Roberts-Pennell, & Pittendreigh, 2000). This paper presents the results of a study analyzing the unconventional vocalizations of a child with RS. The primary research question addresses the ability of familiar and unfamiliar listeners to interpret unconventional vocalizations as “yes” or “no” responses. The paper also addresses the acoustic analysis and perceptual judgments of these vocalizations. Pre-recorded isolated vocalizations of “yes” and “no” were presented to five listeners (mother, father, one unfamiliar clinician, and two familiar clinicians), who were asked to rate each vocalization as either “yes” or “no.” The ratings were compared to the original identification made by the child's mother during the face-to-face interaction from which the samples were drawn. The findings suggest that, in this case, the child's vocalizations were intentional and could be interpreted by familiar and unfamiliar listeners as either “yes” or “no” without contextual or visual cues. The results suggest that communication partners should be trained to attend to eye gaze and vocalizations to ensure the child's intended choice is accurately understood.
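
For readers who want to quantify such listener judgments, the sketch below scores raters' yes/no labels against the mother's reference identifications, reporting raw agreement and a chance-corrected kappa. The labels are hypothetical placeholders, and the kappa statistic is an illustrative addition rather than the study's reported analysis.

import numpy as np
from sklearn.metrics import cohen_kappa_score

reference = np.array(["yes", "no", "yes", "yes", "no", "no"])  # mother's labels
listener = np.array(["yes", "no", "yes", "no", "no", "no"])    # one rater
agreement = float(np.mean(listener == reference))
kappa = cohen_kappa_score(reference, listener)  # corrects for chance agreement
print(f"raw agreement = {agreement:.0%}, Cohen's kappa = {kappa:.2f}")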


2011 ◽  
Vol 21 (2) ◽  
pp. 44-54
Author(s):  
Kerry Callahan Mandulak

Spectral moment analysis (SMA) is an acoustic analysis tool that shows promise for enhancing our understanding of normal and disordered speech production. It can augment auditory-perceptual analysis used to investigate differences across speakers and groups and can provide unique information regarding specific aspects of the speech signal. The purpose of this paper is to illustrate the utility of SMA as a clinical measure for both clinical speech production assessment and research applications documenting speech outcome measurements. Although acoustic analysis has become more readily available and accessible, clinicians need training with, and exposure to, acoustic analysis methods in order to integrate them into traditional methods used to assess speech production.
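
For readers unfamiliar with the technique, SMA treats a frame's magnitude spectrum as a probability distribution over frequency and summarizes it with its first four moments. A minimal numpy sketch:

import numpy as np

def spectral_moments(frame, sr):
    # Magnitude spectrum of one windowed frame, normalized to a distribution.
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1 / sr)
    p = mag / mag.sum()
    m1 = np.sum(freqs * p)                            # centroid (Hz)
    m2 = np.sum((freqs - m1) ** 2 * p)                # variance (Hz^2)
    skew = np.sum((freqs - m1) ** 3 * p) / m2 ** 1.5  # skewness
    kurt = np.sum((freqs - m1) ** 4 * p) / m2 ** 2    # kurtosis
    return m1, m2, skew, kurt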

