Speaker variability in speech based emotion models - Analysis and normalisation

Author(s):  
Vidhyasaharan Sethu ◽  
Julien Epps ◽  
Eliathamby Ambikairajah
Author(s):  
Constantijn Kaland

ABSTRACT This paper reports an automatic, data-driven analysis for describing prototypical intonation patterns, particularly suitable for the initial stages of prosodic research and language description. The approach has several advantages over traditional ways of investigating intonation, such as its applicability to spontaneous speech, its language and domain independence, and its potential to reveal meaningful functions of intonation. These features make the approach particularly useful for language documentation, where the description of prosody is often lacking. The core of the approach is a cluster analysis of time-series f0 measurements, implemented in two scripts (Praat and R, available from https://constantijnkaland.github.io/contourclustering/). Graphical user interfaces can be used to perform the analyses on collected data ranging from spontaneous to highly controlled speech. There is limited need for manual annotation prior to analysis, and speaker variability can be accounted for. After cluster analysis, Praat textgrids can be generated with the cluster number annotated for each individual contour. Although further confirmatory analysis is still required, the outcomes provide useful and unbiased directions for any investigation of prototypical f0 contours based on their acoustic form.
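The idea of clustering f0 contours while accounting for speaker variability can be illustrated with a minimal Python sketch. This is not the Praat/R implementation referred to in the abstract. It assumes that each contour has already been time-normalised to the same number of f0 samples, that speaker differences are removed by z-scoring against each speaker's own mean and standard deviation, and that contours are grouped by average-linkage agglomerative clustering on Euclidean distance. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def normalise_by_speaker(contours):
    """Z-score every f0 value against the mean and SD of its own speaker."""
    by_speaker = {}
    for speaker, f0 in contours:
        by_speaker.setdefault(speaker, []).extend(f0)
    # Fall back to an SD of 1.0 if a speaker's f0 never varies.
    stats = {s: (mean(v), pstdev(v) or 1.0) for s, v in by_speaker.items()}
    return [[(x - stats[s][0]) / stats[s][1] for x in f0] for s, f0 in contours]


def euclidean(a, b):
    """Euclidean distance between two equal-length contours."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_contours(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per contour."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(euclidean(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the two closest clusters
    labels = [0] * len(vectors)
    for label, members in enumerate(clusters, start=1):
        for m in members:
            labels[m] = label
    return labels


# Hypothetical toy data: (speaker, f0 samples in Hz), four samples per contour.
contours = [
    ("A", [100, 110, 120, 130]),  # low-pitched speaker, rising contour
    ("A", [130, 120, 110, 100]),  # low-pitched speaker, falling contour
    ("B", [200, 220, 240, 260]),  # high-pitched speaker, rising contour
    ("B", [260, 240, 220, 200]),  # high-pitched speaker, falling contour
]

labels = cluster_contours(normalise_by_speaker(contours), n_clusters=2)
raw_labels = cluster_contours([f0 for _, f0 in contours], n_clusters=2)
print(labels, raw_labels)
```

In this toy example the normalised contours cluster by shape (rise versus fall), whereas the raw Hz values cluster by speaker, which is the kind of speaker effect the normalisation step is meant to remove.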


2010 ◽  
Vol 52 ◽  
pp. 19-42
Author(s):  
Melanie Weirich

This study examines articulatory and acoustic inter-speaker variability in the production of the German vowels /i/, /u/ and /a/. Our subjects are 3 monozygotic twin pairs (2 female and 1 male pair) and 2 dizygotic female twin pairs. All of them were born and raised in Berlin, still live there, and see their twin brother or sister regularly. We assume that monozygotic twins, who are genetically identical and share the same physiology, should be more similar in their articulation than dizygotic twins, but that the shared time and social environment of twins, regardless of their genetic similarity, also play a crucial role in the acoustic similarity of twins. Articulatory measurements were made with EMA (Electromagnetic Articulography) and the target positions of the produced vowels were analyzed. Additionally, the formants F1-F4 of each vowel were measured and compared within the twin pairs. Our data point to the importance of a shared environment and the strong influence of learning over the anatomical identity of the monozygotic twins in the production of vowels. However, additional results suggest (1) an impact of physiology on the production of a vowel following a velar consonant and (2) an interaction of physiology and stress in inter-speaker variability.


2014 ◽  
Author(s):  
Βάϊα Παπαχρήστου

Previous research on second language phonological acquisition has shown that mastery of the L2 phonological system constitutes a challenging task for L2 learners. Several parameters have been suggested to constrain pronunciation accuracy, such as interference from speakers’ mother tongue, learners’ age, quality and quantity of exposure to the target language, as well as motivation, attitude and other social and psychological factors. However, research on pronunciation teaching and its potential effectiveness on learners’ L2 phonological development has been quite limited, especially in foreign language contexts. The main aim of the present thesis is to investigate the production of English vowels by Greek learners of English and the effectiveness of explicit vs. implicit pronunciation instruction within a foreign language setting. To this end, three groups of speakers aged 9 and 15 years old were examined, i.e. two experimental groups, one which received explicit pronunciation tuition and one which was taught the pronunciation of the English vowels implicitly, via the use of recasts, and a control group which did not receive any pronunciation tuition. Both experimental groups received 43 mini pronunciation interventions embedded in the regular English classes at school. The methodology adopted was the one proposed by Celce-Murcia, Brinton and Goodwin (1996), moving from controlled and guided activities to more communicative ones. Additionally, L1 Greek and L1 English data were obtained in order to compare the vowel inventories of the two languages. The results showed that after teaching, explicit pronunciation instruction can selectively bring about a change in both young and older students’ L2 vowel production, while no improvement was reported for the implicit and control groups for either age group.
Generally, considerable intra- and inter-speaker variability was revealed after tuition and despite the small changes observed, systematic native-like production was difficult to attain. Moreover, no clear effect of learners’ age was documented. A thorough examination of the factors hindering pronunciation accuracy is presented and the findings are discussed on the basis of current theories of L2 phonological acquisition.


Author(s):  
Sheldon Schiffer

Video game non-player characters (NPCs) are a type of agent that often inherits emotion models and functions from ancestor virtual agents. Few emotion models have been designed explicitly for NPCs, and NPCs therefore do not approach the expressive possibilities available to live-action actors or hand-crafted animated characters. With distinct perspectives on emotion generation from multiple fields within narratology and computational cognitive psychology, the architecture of NPC emotion systems can reflect the theories and practices of performing artists. This chapter argues that the deployment of virtual agent emotion models applied to NPCs can constrain the performative aesthetic properties of NPCs. An actor-centric emotion model can accommodate creative processes for actors and may reveal which features of emotion model architectures are most useful for contemporary game production of photorealistic NPCs that achieve cinematic acting styles and robust narrative design.

