Comparative Study of English, Dutch and German Prosodic Features (Fundamental Frequency and Intensity) as Means of Speech

Author(s):  
Anna Moskvina
2013 ◽  
Vol 1 (1) ◽  
pp. 54-67
Author(s):  
Kanu Boku ◽  
Taro Asada ◽  
Yasunari Yoshitomi ◽  
Masayoshi Tabuse

Recently, methods for adding emotion to synthetic speech have received considerable attention in the field of speech synthesis research. For generating emotional synthetic speech, it is necessary to control the prosodic features of the utterances. The authors propose a case-based method for generating emotional synthetic speech by exploiting the characteristics of the maximum amplitude and the utterance time of vowels, and the fundamental frequency of emotional speech. As an initial investigation, they adopted the utterance of Japanese names, which are semantically neutral. Using the proposed method, emotional synthetic speech made from the emotional speech of one male subject was discriminable with a mean accuracy of 70% when ten subjects listened to synthetic utterances of the Japanese name “Taro” rendered as “angry,” “happy,” “neutral,” “sad,” or “surprised.”


Author(s):  
Ankur Bandhopadhyay ◽  
Indranil Chaterjee ◽  
Sanghamitra Dey

Background: Vocal sound is based on the complex yet co-ordinated interaction of the phonatory, resonatory and respiratory systems. Phonetography is a practicable and readily accessible method to investigate and map the quantitative potentialities of vocal output. The objectives of the present study were to determine the phonetograms of trained (Hindustani classical) singers, untrained singers and non-singers, elicited from a singing task as well as a speech task, and to see whether statistically significant differences were present that may indicate an effect of training.

Methods: Ninety female subjects between the ages of 20 and 45 (mean age 34.2 years for trained subjects, 26.3 years for untrained subjects and 25.8 years for non-singers) were divided into three groups of 30 subjects each. For the singing task, the individuals had to phonate |a| at habitual level by traversing through eight musical scales. In the speech task, the subjects were asked to count from one to twenty in Bengali at habitual level and at sustainable cohorts of intensity. This was recorded using the phonetogram software Dr. Speech (version 4). The parameters considered were fundamental frequency, intensity, semitones and area.

Results: In both the singing and the speech tasks, significant differences among the three groups were seen in all four phonetogram parameters (p=0.000) at the 95% level of confidence.

Conclusions: The present study depicted the phonetographic profile of a genre of trained singers and identified the parameters on which differences are pronounced between trained singers, untrained singers and non-singers.


1968 ◽  
Vol 11 (3) ◽  
pp. 481-487 ◽  
Author(s):  
George L. Huttar

The emotional states of an adult male American speaker, as reflected in 30 utterances, were evaluated by 12 subjects on nine 7-point semantic differential scales. The subjects also evaluated the utterances on similar scales for pitch, loudness, and speed. Significant correlations were found between some acoustic variables and the judgments of some types of emotion. Higher correlations were found between the acoustic variables and judgments of degree of emotion. Correlation coefficients between judgments of emotion and judgments of prosodic features were in general higher than the correlations involving the acoustic variables. Degree of perceived emotion was found to be highly and positively correlated with fundamental frequency range and intensity range. A causal explanation of these relations in terms of human physiology is suggested.


1983 ◽  
Vol 10 (1) ◽  
pp. 1-15 ◽  
Author(s):  
D. N. Stern ◽  
S. Spieker ◽  
R. K. Barnett ◽  
K. MacKain

The speech of 6 mothers to their healthy infants was examined longitudinally during the neonatal period and at 4, 12, and 24 months in a semi-naturalistic setting. Features of speech analysed were: contour of fundamental frequency, repetitiveness, timing (durations of vocalizations and pauses), tempo and MLU. The neonatal period was characterized by elongated pauses. During the 4-month period the extent of pitch contouring and repetitiveness was greater than at earlier or later ages. By 24 months, the duration of vocalizations and length of MLU became markedly greater. The period of intense face-to-face interaction around the fourth month proved to involve more changes in certain prosodic features. Some of the possible functions of these changes during this phase are discussed.


2019 ◽  
pp. 002383091988660
Author(s):  
Shu-chen Ou ◽  
Zhe-chen Guo

Experience with native-language prosody encourages language-specific strategies for speech segmentation. Conflicting findings from previous research suggest that these strategies may not be abstracted away from the acoustic manifestation of prosodic features in the native speech. Using the artificial language learning paradigm, the current study explores this possibility in connection with listeners of a lexical tone language called Taiwanese Southern Min (TSM). In TSM, the only rising lexical tone occurs almost only on the final syllable of the language’s tone sandhi domain and is phonetically associated with final lengthening. Based on these observations, Experiment I examined what constituted a sufficient finality cue for use by TSM listeners to support segmentation: (a) final fundamental frequency (F0) rise only; or (b) final F0 rise conjoined with final lengthening. The results showed that segmentation was inhibited by the former cue but facilitated by the latter. Experiment II showed that the facilitation cannot be attributed entirely to final lengthening, as a null effect was found when final lengthening was the sole prosodic cue to segmentation. It is thus assumed that acoustic details as fine-grained as the lengthening of the rising tone are involved in the modulation of the segmentation strategy whereby TSM listeners perceive F0 rise as signaling finality. The inhibitory effect of final F0 rise alone found in Experiment I motivated Experiment III, which revealed that initial F0 rise in the absence of lengthening cues improved TSM listeners’ segmentation. It is speculated that such use of initial F0 rise might reflect a cross-linguistic segmentation solution.


2017 ◽  
Vol 23 (2) ◽  
pp. 143-156 ◽  
Author(s):  
Annette Prochnow ◽  
Soly Erlandsson ◽  
Volker Hesse ◽  
Kathleen Wermke

The foetal environment is filled with a variety of noises. Among the manifold sounds of the maternal respiratory, gastrointestinal and cardiovascular systems, the intonation properties of the maternal language are well perceived by the foetus, whose hearing system is already functioning during the last trimester of gestation. These intonation (melodic) features, reflecting native-language prosody, have been found to shape vocal learning. Having had ample opportunity to become familiar with their mother’s language in the womb, newborns have been found to exhibit salient pitch-based elements in their own cry melodies. An interesting issue is whether intrauterine exposure to a maternal pitch accent language, such as Swedish, in which emphatic syllables are typically pronounced on a higher pitch relative to other syllables, will affect newborns’ cry melody (fundamental frequency contour). The present study aimed to answer this question by quantitatively analysing and comparing the melody structure of 52 Swedish and 79 German newborns. In accordance with previous approaches, cry melody structure was analysed by calculating a melody complexity index (MCI) expressing the share of cries exhibiting two or more (well-defined) arc-like substructures uttered during the recording sessions. A low MCI reflects a dominance of cries with a ‘simple’, i.e. single-arc melody. A significantly higher MCI was found in the Swedish infant group, which further corroborates the assumption that the well-known foetal sensitivity for musical (melodic) stimuli seems to shape infants’ cry melody.
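
As a rough illustration of the MCI described in this abstract, the index can be read as the fraction of recorded cries whose melody contains two or more arcs. The sketch below assumes that each cry has already been reduced to an arc count; the function name `melody_complexity_index` and the sample counts are hypothetical and are not taken from the study, which does not describe its arc-detection procedure here.

```python
from typing import Sequence


def melody_complexity_index(arc_counts: Sequence[int]) -> float:
    """Return the share of cries whose melody has two or more arcs.

    arc_counts holds one integer per cry: the number of arc-like
    substructures found in that cry's melody (an assumed input format).
    """
    if not arc_counts:
        raise ValueError("at least one cry is required")
    # A cry counts as 'complex' when it has two or more arcs.
    complex_cries = sum(1 for n in arc_counts if n >= 2)
    return complex_cries / len(arc_counts)


# Hypothetical arc counts for five cries (illustrative only, not study data).
example_counts = [1, 2, 1, 3, 1]
print(melody_complexity_index(example_counts))  # 2 of 5 cries are complex: 0.4
```

Under this reading, a value near 0 corresponds to the dominance of single-arc cries that the abstract associates with a low MCI.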


Author(s):  
Chuan Dong Ma ◽  
Lun Hua Tan

Supra-segmental phonemes, the prosodic features of a language, including stress, pitch, intonation, rhythm and juncture, play a very important role in distinguishing meaning in English. This paper analyzes the supra-segmental phoneme differences between English and Sichuan Dialect from the following four aspects: word stress, intonation, rhythm and juncture. We are convinced that if language teachers in China have some knowledge of transfer theory and clearly understand the similarities and differences between the supra-segmental phonemes of English and those of their mother tongue, it will be much easier for them to identify the key points and difficulties for learners, and their teaching will be more effective.


2013 ◽  
Vol 7 (6) ◽  
pp. 2423-2436
Author(s):  
I. El-Hussain ◽  
A. Deif ◽  
K. Al-Jabri ◽  
A. M. E. Mohamed ◽  
S. El-Hady ◽  
...  
