pitch contour
Recently Published Documents


TOTAL DOCUMENTS

174
(FIVE YEARS 29)

H-INDEX

19
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Martin Norgaard ◽  
Matthew Dunaway ◽  
Steven Patrick Black

Research about improvisation often focuses on one musical tradition. The current study investigated descriptions of thinking behind improvisation in different cultural traditions through interviews with advanced improvisers residing in a metropolitan area in the United States. The participants were rigorously trained in their tradition and have performance experience within it. However, as immigrants they are experienced in communicating with Western audiences and conversant in Western ways of thinking about music. Immediately after completing the improvisation, each participant listened to a recording and looked at its visual representation, while describing the underlying thinking. The visual representation showed pitch contour and note length without reference to any notational system. A thematic analysis revealed eight main themes: Licks and Conventions describe how prelearned material and convention guided creation; Reaction, Forward Looking, and Repetition & Variety outline various processes that shape creation in the moment; and Aesthetics, Communication, and Emotion provide clues to the improvisers’ motivation behind choices. Interestingly, the use of prelearned patterns appears to facilitate improvisations in all the traditions represented. This and other identified strategies appearing cross-culturally may be shaped by shared cognitive constraints. These shared strategies may also facilitate understanding as educators broaden their curricula to multiple musical traditions.


2021 ◽  
Vol 20 (No.4) ◽  
pp. 489-510
Author(s):  
Izzad Ramli ◽  
Nursuriati Jamil ◽  
Noraini Seman

Intonation generation in expressive speech such as storytelling is essential for producing a high-quality expressive speech synthesizer for the Malay language. One approach to intonation generation, explicit control, has shown good performance in terms of intelligibility with reasonably natural speech; thus, it was selected in this research. This approach modifies prosodic features, such as pitch contour, intensity, and duration, to generate the intonation. However, modification of the pitch contour remains a problem because the desired pitch contour is not achieved. This paper formulated an improved pitch contour algorithm to develop a modified pitch contour resembling the natural pitch contour. In this work, the syllable pitch contours of nine storytellers were extracted from their storytelling speech to create an expressive speech syllable dataset called STORY_DATA. All the pitch contour shapes in STORY_DATA were analyzed and clustered into the standard six main pitch contour clusters for storytelling. The clustering was performed using one minus the Pearson product-moment correlation as the distance measure. Then, an improved iterative two-step sinusoidal pitch contour formulation was introduced to modify the pitch contours of neutral speech into the expressive pitch contours of natural speech. Overall, the improved pitch contour formulation achieved 93 percent highly correlated matches, compared with 15 percent for the previous pitch contour formulation, indicating a high resemblance to natural contours. Therefore, the improved formula can be used in a text-to-speech (TTS) synthesizer to produce more natural expressive speech. The paper also discovered unique expressive pitch contours in the Malay language that need further investigation.
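
The distance measure named in this abstract, one minus the Pearson product-moment correlation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `pearson_distance` and the sample contours are hypothetical, and the sketch assumes both contours have already been resampled to the same number of pitch points.

```python
import math


def pearson_distance(contour_a, contour_b):
    """Return 1 - Pearson r for two equal-length pitch contours (Hz values).

    0 means the shapes are perfectly correlated, 2 means they are mirror images.
    """
    if len(contour_a) != len(contour_b) or len(contour_a) < 2:
        raise ValueError("contours must have the same length (at least 2 points)")
    mean_a = sum(contour_a) / len(contour_a)
    mean_b = sum(contour_b) / len(contour_b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(contour_a, contour_b))
    var_a = sum((x - mean_a) ** 2 for x in contour_a)
    var_b = sum((y - mean_b) ** 2 for y in contour_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a perfectly flat contour")
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Hypothetical contours: two rising shapes at different registers, one falling.
rising = [100.0, 110.0, 120.0, 130.0]
rising_higher = [150.0, 165.0, 180.0, 195.0]
falling = [130.0, 120.0, 110.0, 100.0]

same_shape = pearson_distance(rising, rising_higher)
opposite_shape = pearson_distance(rising, falling)
```

Because correlation ignores offset and scale, the two rising contours come out as near-identical in shape even though they sit in different pitch registers, which is why this kind of measure suits clustering by contour shape.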


2021 ◽  
Vol 15 ◽  
Author(s):  
Liis Kask ◽  
Nele Põldver ◽  
Pärtel Lippus ◽  
Kairi Kreegipuu

Similar to visual perception, auditory perception also has a clearly described “pop-out” effect, where an element with some extra feature is easier to detect among elements without an extra feature. This phenomenon is better known as auditory perceptual asymmetry. We investigated such asymmetry between shorter and longer duration, and between level and falling pitch, in linguistic stimuli that carry a meaning in one language (Estonian), but not in another (Russian). For the mismatch negativity (MMN) experiment, we created four different types of stimuli by modifying the duration of the first vowel [ɑ] (170, 290 ms) and the pitch contour (level vs. falling pitch) of the stimulus words (‘SATA,’ ‘SAKI’). The stimuli were synthesized from Estonian words (‘SATA,’ ‘SAKI’) and follow the three-way quantity system of Estonian, which incorporates tonal features (falling pitch contour) together with temporal patterns. This makes the meaning of the word dependent on the combination of both features and allows us to compare the relative contribution of duration and pitch contour in discrimination of language stimuli in the brain via MMN generation. The participants of the experiment were 12 Russian native speakers with little or no experience in Estonian and living in Estonia short-term, and 12 Estonian native speakers (age 18–27 years). We found that participants’ perception of the linguistic stimuli differed not only according to the physical features but also according to their native language, confirming that the meaning of the word interferes with the early automatic processing of phonological features. The GAMM and ANOVA analyses of the reversed-design results showed that the deviant with longer duration among shorter standards elicited an MMN response with greater amplitude than the short deviant among long standards, while changes in pitch contour (falling vs. level pitch) produced neither a strong MMN nor asymmetry. Thus, we demonstrate an effect of language background on asymmetric perception of linguistic stimuli that aligns with previous studies (Jaramillo et al., 2000) and contributes to the growing body of knowledge supporting auditory perceptual asymmetry.
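
For readers unfamiliar with how an MMN amplitude is typically quantified, the sketch below shows the conventional deviant-minus-standard difference wave and a mean amplitude over a time window. It is an illustration only: the helper names, the 100 to 200 ms window, and all voltage values are invented for this example and do not come from the study.

```python
def difference_wave(deviant_erp, standard_erp):
    """Subtract the standard ERP from the deviant ERP, sample by sample."""
    if len(deviant_erp) != len(standard_erp):
        raise ValueError("ERPs must have the same number of samples")
    return [d - s for d, s in zip(deviant_erp, standard_erp)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Average the wave over samples whose time lies in [start_ms, end_ms]."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


# Made-up averaged ERPs in microvolts, sampled every 50 ms (illustrative only).
times_ms = [0, 50, 100, 150, 200, 250]
standard = [0.0, 0.5, 1.0, 0.5, 0.0, 0.0]
deviant = [0.0, 0.5, 0.0, -1.5, -0.5, 0.0]

mmn_wave = difference_wave(deviant, standard)
# The 100-200 ms window is an assumption chosen for this toy example.
mmn_amplitude = mean_amplitude(mmn_wave, times_ms, 100, 200)
```

A more negative mean amplitude in the chosen window corresponds to a larger MMN, so an asymmetry such as the one reported for duration would show up as different values for the two deviant directions.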


Author(s):  
Tấn Thành Tạ

Rục, a dialect of the Chứt ethnic group spoken in the mountainous area of Quảng Bình province, has been described as a tonal language with four tones characterized by pitch (F0), voice quality and laryngeal features; however, there has been no experimental study on the tone system of Rục. In summer 2019, we recorded 20 Rục speakers (10 women) reading a wordlist of 66 words made of five vowels /iː, ɛː, uː, ɔː, aː/ in combination with different dental and velar onsets and the four tones. The results show that the four tones in Rục are differentiated by pitch height and pitch contour. Moreover, spectral measurements (H1-H2, H1-A1, H1-A2 and CPP) indicate that the two low-register tones (derived from voiced onsets) have a breathy voice, compared to a modal voice in the two high-register tones (derived from voiceless onsets). In words with the two low-register tones, vowels tend to be pronounced with a higher aperture (a lower F1) than in high-register tone contexts. These results support and update theories on tonogenesis and registrogenesis in Vietic languages and in Mon-Khmer languages in general.


2021 ◽  
Vol 12 ◽  
Author(s):  
I-Hui Hsieh ◽  
Wan-Ting Yeh

Speech comprehension across languages depends on encoding the pitch variations in frequency-modulated (FM) sweeps at different timescales and frequency ranges. While the timescale and spectral contour of FM sweeps play important roles in differentiating acoustic speech units, relatively little work has been done to understand the interaction between the two acoustic dimensions at early cortical processing. An auditory oddball paradigm was employed to examine the interaction of timescale and pitch contour at pre-attentive processing of FM sweeps. Event-related potentials to frequency sweeps that vary in linguistically relevant pitch contour (fundamental frequency F0 vs. first formant frequency F1) and timescale (local vs. global) in Mandarin Chinese were recorded. Mismatch negativities (MMNs) were elicited by all types of sweep deviants. For the local timescale, FM sweeps with F0 contours yielded larger MMN amplitudes than those with F1 contours. A reversed MMN amplitude pattern was obtained with respect to F0/F1 contours for global-timescale stimuli. An interhemispheric asymmetry of MMN topography was observed corresponding to local- and global-timescale contours. Difference waveforms for falling, but not rising, frequency sweep contours showed right-hemispheric dominance. Results showed that timescale and pitch contour interact with each other in pre-attentive auditory processing of FM sweeps. Findings suggest that FM sweeps, a type of non-speech signal, are processed at an early stage with reference to their linguistic function. The fact that the dynamic interaction between timescale and spectral pattern is processed during early cortical processing of non-speech frequency sweep signals may be critical to facilitating speech encoding at a later stage.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Ling Zhang ◽  
Liu Shi

This article reports on an empirical study of Chinese tone production in various contexts by Thai-speaking learners of L2 Chinese. Comparisons are made between Thai students and Chinese native speakers. The acoustic data are analyzed in terms of pitch register, pitch contour and duration, which show that the main problems of Thai students are: (1) T1 is lower in sentence-mid and sentence-initial positions; (2) T2 is less rising or even exhibits a falling-rising contour at a lower register; (3) T3 cannot approximate a full falling-rising contour in isolated characters and at sentence-final position; (4) T4 is too long and the falling slope is too strong. Our results suggest that Thai students should make efforts in both pitch and rhythm control and pay attention to context variations. It is also suggested that similar research methods can be applied to L2 Chinese learners with different first languages (L1s).


2021 ◽  
Vol 4 (2) ◽  
pp. 17-57
Author(s):  
Piet Mertens

This paper first proposes a labeling scheme for tonal aspects of speech and then describes an automatic annotation system using this transcription. This fine-grained transcription provides labels indicating pitch level and pitch movement of individual syllables. Of the five pitch levels, three (low, mid, high) are defined on the basis of pitch changes in the local context and two (bottom, top) are defined relative to the boundaries of the speaker’s global pitch range. For pitch movements, both simple and compound, the transcription indicates direction (rise, fall, level) and size, using size categories (pitch intervals) adjusted relative to the speaker’s pitch range. The automatic tonal annotation system combines several processing steps: segmentation into syllable peaks, pause detection, pitch stylization, pitch range estimation, classification of the intra-syllabic pitch contour, and pitch level assignment. It uses a dedicated rule-based procedure, which, unlike commonly used supervised learning techniques, does not require a labeled corpus for training the model. The paper also includes a preliminary evaluation of the annotation system on a reference corpus of nearly 14 minutes of spontaneous speech in French and Dutch, in order to quantify the annotation errors. The results, expressed in terms of the standard measures of precision, recall, accuracy and F-measure, are encouraging. For pitch levels low, mid and high, an F-measure between 0.815 and 0.946 is obtained, and for pitch movements a value between 0.708 and 1. Provided that additional modules for the detection of prominence and prosodic boundaries are added, the resulting annotation may serve as an input for a phonological annotation.
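
The evaluation measures mentioned in this abstract can be illustrated with a short sketch that computes precision, recall and F-measure for one label by comparing automatic labels against a reference annotation, syllable by syllable. The function name `precision_recall_f` and the label sequences are hypothetical and are not taken from the paper or its corpus.

```python
def precision_recall_f(reference, predicted, label):
    """Return (precision, recall, F-measure) for one label.

    reference and predicted are equal-length sequences of per-syllable labels.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(reference, predicted))
    true_pos = sum(1 for r, p in pairs if r == label and p == label)
    false_pos = sum(1 for r, p in pairs if r != label and p == label)
    false_neg = sum(1 for r, p in pairs if r == label and p != label)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical pitch-level labels: L = low, M = mid, H = high.
reference = ["L", "M", "H", "M", "L", "H"]
predicted = ["L", "M", "M", "M", "L", "H"]

mid_scores = precision_recall_f(reference, predicted, "M")
high_scores = precision_recall_f(reference, predicted, "H")
```

In this toy case one high syllable is mislabeled as mid, which lowers precision for the mid label and recall for the high label while leaving the low label unaffected.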

