Analysis of emotional expression by visualization of the human and synthesized speech signal sets — A consideration of audio-visual advantage

Author(s):  
Kazuki Yamamoto ◽  
Keiji Takahashi ◽  
Kanta Kishiro ◽  
Shunsuke Sasaki ◽  
Hidehiko Hayashi
Author(s):  
G. Lan ◽  
A. S. Fadeev ◽  
A. N. Morgunov ◽  
...  

This article details the development of methods for synthesizing phonemes of the human voice based on the analytical description of individual formants. A technique for analyzing the spectrum and spectrograms of original phonemes to obtain the main amplitude-frequency characteristics of the signal components is presented. An algorithm to reconstruct a speech signal from the obtained sets of parameters is proposed. A technique for assessing the quality of the synthesized speech elements is described.
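The abstract does not reproduce the article's parameter sets or reconstruction algorithm. As a rough illustration of the general idea, a minimal formant-synthesis sketch, in which each phoneme is assumed to be described by formant center frequencies and bandwidths and the signal is reconstructed by exciting a cascade of second-order resonators with a glottal impulse train (all numeric values below are invented for the example):

```python
import math

def resonator_coeffs(f, bw, fs):
    """Second-order resonator tuned to formant frequency f (Hz)
    with bandwidth bw (Hz), at sample rate fs."""
    r = math.exp(-math.pi * bw / fs)          # pole radius from bandwidth
    theta = 2 * math.pi * f / fs              # pole angle from frequency
    b1 = 2 * r * math.cos(theta)
    b2 = -r * r
    a0 = 1 - b1 - b2                          # simple gain normalization
    return a0, b1, b2

def synthesize_vowel(formants, f0=120.0, fs=16000, dur=0.2):
    """Reconstruct a vowel-like signal: excite a cascade of formant
    resonators with an impulse train at fundamental frequency f0.
    `formants` is a list of (frequency_Hz, bandwidth_Hz) pairs."""
    n = int(fs * dur)
    period = int(fs / f0)
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for f, bw in formants:                    # run each resonator in series
        a0, b1, b2 = resonator_coeffs(f, bw, fs)
        y1 = y2 = 0.0
        out = []
        for x in signal:
            y = a0 * x + b1 * y1 + b2 * y2    # difference equation
            out.append(y)
            y2, y1 = y1, y
        signal = out
    peak = max(abs(s) for s in signal) or 1.0
    return [s / peak for s in signal]         # normalize to +/-1

# Rough /a/-like vowel from three hypothetical formants
samples = synthesize_vowel([(700, 80), (1200, 90), (2600, 120)])
```

The quality-assessment step described in the abstract would then compare the spectrum of `samples` against that of the original phoneme; that comparison is not sketched here.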


2001 ◽  
Vol 44 (5) ◽  
pp. 1052-1057 ◽  
Author(s):  
Kathryn D. R. Drager ◽  
Joe E. Reichle

The use of speech synthesis in electronic communication aids allows individuals who use augmentative and alternative communication (AAC) devices to communicate with a variety of partners. However, communication will only be effective if the speech signal is readily understood by the listener. The intelligibility of synthesized speech is influenced by a variety of factors, including the provision of context. Although the facilitative effects of context have been demonstrated extensively in studies with young adults, there are few investigations into older adults' ability to decode the synthesized speech signal. The present study investigated whether discourse context affected the intelligibility of synthesized sentences for young adult and older adult listeners. Listeners were asked to repeat 15-word sentences that were either presented in isolation or preceded by a story that set the context for the sentence. Participants correctly repeated significantly more words in the sentences when they were preceded by related sentences than when the sentences were presented in isolation. This research shows a facilitating effect of context in discourse, wherein previous words and sentences are related to later sentences, for both younger and older adult listeners. These results have direct implications for AAC system message transmission.


Author(s):  
Martin Chavant ◽  
Alexis Hervais-Adelman ◽  
Olivier Macherey

Purpose An increasing number of individuals with residual or even normal contralateral hearing are being considered for cochlear implantation. It remains unknown whether the presence of contralateral hearing is beneficial or detrimental to their perceptual learning of cochlear implant (CI)–processed speech. The aim of this experiment was to provide a first insight into this question using acoustic simulations of CI processing. Method Sixty normal-hearing listeners took part in an auditory perceptual learning experiment. Each subject was randomly assigned to one of three groups of 20 referred to as NORMAL, LOWPASS, and NOTHING. The experiment consisted of two test phases separated by a training phase. In the test phases, all subjects were tested on recognition of monosyllabic words passed through a six-channel “PSHC” vocoder presented to a single ear. In the training phase, which consisted of listening to a 25-min audio book, all subjects were also presented with the same vocoded speech in one ear but the signal they received in their other ear differed across groups. The NORMAL group was presented with the unprocessed speech signal, the LOWPASS group with a low-pass filtered version of the speech signal, and the NOTHING group with no sound at all. Results The improvement in speech scores following training was significantly smaller for the NORMAL than for the LOWPASS and NOTHING groups. Conclusions This study suggests that the presentation of normal speech in the contralateral ear reduces or slows down perceptual learning of vocoded speech but that an unintelligible low-pass filtered contralateral signal does not have this effect. Potential implications for the rehabilitation of CI patients with partial or full contralateral hearing are discussed.
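The abstract does not specify the internals of the six-channel PSHC vocoder. As a generic illustration of CI-style acoustic simulation, a minimal noise-excited channel vocoder (the band frequencies, bandwidths, and smoothing cutoff below are invented for the example and are not the study's parameters):

```python
import math
import random

def biquad_bandpass(x, f0, bw, fs):
    """Filter sequence x with a second-order resonator centred on f0 (Hz)."""
    r = math.exp(-math.pi * bw / fs)
    b1, b2 = 2 * r * math.cos(2 * math.pi * f0 / fs), -r * r
    g = 1 - r                                  # rough gain normalization
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = g * s + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def envelope(x, fs, cutoff=50.0):
    """Rectify and smooth with a one-pole low-pass to get the band envelope."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    e = 0.0
    out = []
    for s in x:
        e = a * e + (1 - a) * abs(s)
        out.append(e)
    return out

def noise_vocode(speech, fs, bands):
    """Channel-vocode `speech`: per band, extract the temporal envelope and
    re-impose it on band-limited noise, then sum the channels."""
    rng = random.Random(0)                     # fixed seed for reproducibility
    noise = [rng.uniform(-1, 1) for _ in speech]
    out = [0.0] * len(speech)
    for f0, bw in bands:
        env = envelope(biquad_bandpass(speech, f0, bw, fs), fs)
        carrier = biquad_bandpass(noise, f0, bw, fs)
        for i in range(len(out)):
            out[i] += env[i] * carrier[i]
    return out
```

A noise carrier is used here purely for simplicity; the study's PSHC vocoder uses pulse-spreading harmonic complexes as carriers, which this sketch does not implement.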


1983 ◽  
Vol 26 (4) ◽  
pp. 516-524 ◽  
Author(s):  
Donald J. Sharf ◽  
Ralph N. Ohde

Adult and Child manifolds were generated by synthesizing 5 × 5 matrices of /Cej/ type utterances in which F2 and F3 frequencies were systematically varied. Manifold stimuli were presented to 11 graduate-level speech-language pathology students in two conditions: (a) a rating condition in which stimuli were rated on a 4-point scale between good /r/ and good /w/; and (b) a labeling condition in which stimuli were labeled as "R," "W," "distorted R," or "N" (for none of the previous choices). It was found that (a) stimuli with low F2 and high F3 frequencies were rated 1.0–1.4, those with high F2 and low F3 frequencies were rated 3.6–4.0, and those with intermediate values were rated 1.5–3.5; (b) stimuli rated 1.0–1.4 were labeled as "W" and stimuli rated 3.6–4.0 were labeled as "R"; (c) none of the Child manifold stimuli were labeled as distorted "R," and one of the Adult manifold stimuli approached the percentage of identification obtained for "R" and "W"; and (d) rating and labeling tasks were performed with a high degree of reliability.


2011 ◽  
Vol 21 (2) ◽  
pp. 44-54 ◽  
Author(s):  
Kerry Callahan Mandulak

Spectral moment analysis (SMA) is an acoustic analysis tool that shows promise for enhancing our understanding of normal and disordered speech production. It can augment auditory-perceptual analysis used to investigate differences across speakers and groups and can provide unique information regarding specific aspects of the speech signal. The purpose of this paper is to illustrate the utility of SMA as a clinical measure for both clinical speech production assessment and research applications documenting speech outcome measurements. Although acoustic analysis has become more readily available and accessible, clinicians need training with, and exposure to, acoustic analysis methods in order to integrate them into traditional methods used to assess speech production.
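Spectral moment analysis treats the normalized magnitude spectrum as a probability distribution over frequency and summarizes it with its first four moments: the centroid (mean), standard deviation, skewness, and kurtosis. A minimal sketch of that computation, assuming a magnitude spectrum has already been obtained (clinical SMA pipelines add windowing, pre-emphasis, and segment selection not shown here):

```python
import math

def spectral_moments(freqs, magnitudes):
    """First four spectral moments of a magnitude spectrum.
    Returns (centroid_Hz, sd_Hz, skewness, excess_kurtosis)."""
    total = sum(magnitudes)
    p = [m / total for m in magnitudes]        # normalize to a distribution
    m1 = sum(f * w for f, w in zip(freqs, p))  # centroid (first moment)
    m2 = sum((f - m1) ** 2 * w for f, w in zip(freqs, p))
    m3 = sum((f - m1) ** 3 * w for f, w in zip(freqs, p))
    m4 = sum((f - m1) ** 4 * w for f, w in zip(freqs, p))
    sd = math.sqrt(m2)
    skew = m3 / sd ** 3                        # standardized third moment
    kurt = m4 / m2 ** 2 - 3                    # excess kurtosis
    return m1, sd, skew, kurt

# A symmetric toy spectrum: centroid 200.0 Hz, skewness 0.0
centroid, sd, skew, kurt = spectral_moments(
    [100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
```

Whether kurtosis is reported raw or as excess (minus 3, as here) varies across studies, so the convention should be stated alongside any reported values.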


2013 ◽  
Vol 61 (1) ◽  
pp. 7-15 ◽  
Author(s):  
Daniel Dittrich ◽  
Gregor Domes ◽  
Susi Loebel ◽  
Christoph Berger ◽  
Carsten Spitzer ◽  
...  

The present study examines the hypothesis of an alexithymia-associated deficit in recognizing emotional facial expressions in a clinical population. In addition, hypotheses concerning the role of specific emotion qualities and of gender differences are tested. 68 outpatient and inpatient psychiatric patients (44 women and 24 men) were assessed with the Toronto Alexithymia Scale (TAS-20), the Montgomery-Åsberg Depression Scale (MADRS), the Symptom Checklist (SCL-90-R), and the Emotional Expression Multimorph Task (EEMT). The stimuli of the face-recognition paradigm were facial expressions of basic emotions according to Ekman and Friesen, arranged into sequences of gradually increasing expression intensity. Using multiple regression analysis, we examined the association between TAS-20 score and facial emotion recognition (FER). While no significant relationship between TAS-20 score and FER emerged for the total sample or the male subsample, in the female subsample the TAS-20 score significantly predicted the total number of errors (β = .38, t = 2.055, p < 0.05) and the errors in recognizing the emotions anger and disgust (anger: β = .40, t = 2.240, p < 0.05; disgust: β = .41, t = 2.214, p < 0.05). For angry faces, the TAS-20 score explained 13.3% of the variance; for disgusted faces, 19.7%. There was no relationship between alexithymia and the time at which participants stopped the emotional sequences to give their rating (response latency). The results support the existence of an alexithymia-associated deficit in recognizing emotional facial expressions in female subjects in a heterogeneous clinical sample. 
This deficit could at least partly account for the difficulties that highly alexithymic individuals have in social interactions, and could thus explain a predisposition to psychiatric and psychosomatic disorders.


1999 ◽  
Author(s):  
Michele C. Fejfar ◽  
Lee Blonder ◽  
Michael Andrykowski
