scholarly journals Polytonia

2021 ◽  
Vol 4 (2) ◽  
pp. 17-57
Author(s):  
Piet Mertens

This paper first proposes a labeling scheme for tonal aspects of speech and then describes an automatic annotation system using this transcription. This fine-grained transcription provides labels indicating pitch level and pitch movement of individual syllables. Of the five pitch levels, three (low, mid, high) are defined on the basis of pitch changes in the local context and two (bottom, top) are defined relative to the boundaries of the speaker’s global pitch range. For pitch movements, both simple and compound, the transcription indicates direction (rise, fall, level) and size, using size categories (pitch intervals) adjusted relative to the speaker’s pitch range. The automatic tonal annotation system combines several processing steps: segmentation into syllable peaks, pause detection, pitch stylization, pitch range estimation, classification of the intra-syllabic pitch contour, and pitch level assignment. It uses a dedicated, rule-based procedure, which, unlike commonly used supervised learning techniques, does not require a labeled corpus for training the model. The paper also includes a preliminary evaluation of the annotation system, for a reference corpus of nearly 14 minutes of spontaneous speech in French and Dutch, in order to quantify the annotation errors. The results, expressed in terms of standard measures of precision, recall, accuracy and F-measure, are encouraging. For pitch levels low, mid and high, an F-measure between 0.815 and 0.946 is obtained, and for pitch movements a value between 0.708 and 1. Given additional modules for the detection of prominence and prosodic boundaries, the resulting annotation may serve as an input for a phonological annotation.
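The evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used. The function name and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) for one label, e.g. pitch level 'high'.

    Assumption: balanced F-measure (F1). All counts are hypothetical inputs.
    """
    predicted = true_pos + false_pos   # syllables the system labeled with this class
    actual = true_pos + false_neg      # syllables the reference labeled with this class
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For instance, with hypothetical counts of 90 correct labels, 10 spurious labels and 30 missed labels, `precision_recall_f1(90, 10, 30)` gives a precision of 0.9, a recall of 0.75 and an F1 of about 0.818.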

2019 ◽  
Vol 2 ◽  
pp. 205920431985719 ◽  
Author(s):  
E. Glenn Schellenberg ◽  
Michael W. Weiss ◽  
Chen Peng ◽  
Shayan Alam

Listeners remember the pitch level (key) and tempo of musical recordings they have heard multiple times. They also have long-term implicit memory for the key and tempo of novel melodies heard for the first time in the laboratory. In previous research, however, the stimulus melodies were simple and repetitive and the changes in key or tempo were large. Here, we tested the limits of implicit memory for the key and tempo of more complex stimulus melodies. Musically trained and untrained listeners heard 12 novel melodies during an exposure phase and 24 (12 old, 12 new) during a subsequent test (recognition) phase. From exposure to test, half of the melodies were transposed up or down (changed in key) (Experiment 1), or sped up or slowed down (Experiment 2), but to varying degrees. Musically trained listeners displayed enhanced recognition, but transposing or changing the tempo of the melodies reduced performance similarly for all listeners. The effect of the key change did not wane as the transposition was reduced from 6 semitones to 1, but recognition in general was worse as the pitch range of the stimulus melodies increased. The magnitude of the tempo change had a very small effect on response patterns, but Bayesian analyses indicated that the observed data were more likely without considering magnitude. The results suggest that musically trained and untrained listeners have implicit memory for key and tempo that is remarkably fine-grained, even for melodies that are heard for the first time in the laboratory, such that small changes in either feature make a melody less recognizable.


2003 ◽  
Vol 14 (3) ◽  
pp. 262-266 ◽  
Author(s):  
E. Glenn Schellenberg ◽  
Sandra E. Trehub

Here we show that good pitch memory is widespread among adults with no musical training. We tested unselected college students on their memory for the pitch level of instrumental soundtracks from familiar television programs. Participants heard 5-s excerpts either at the original pitch level or shifted upward or downward by 1 or 2 semitones. They successfully identified the original pitch levels. Other participants who heard comparable excerpts from unfamiliar recordings could not do so. These findings reveal that ordinary listeners retain fine-grained information about pitch level over extended periods. Adults' reportedly poor memory for pitch is likely to be a by-product of their inability to name isolated pitches.


1976 ◽  
Vol 24 (4) ◽  
pp. 169-176 ◽  
Author(s):  
John M. Geringer

The purpose of this study was to investigate tuning preferences regarding recorded orchestral music. Specifically, the study was designed to test subjects' tuning preferences while investigating both the direction and magnitude of mistuning. Sixty randomly selected undergraduate and graduate music students modulated a variable-speed tape recorder to preferred pitch levels. Stimuli were recorded excerpts of ten orchestral works, each representative of a different key. Subjects listened to the thirty-second excerpts and turned a linear continuous-speed control knob with a pitch range of approximately an augmented fourth. Data consisted of cent deviation scores relative to A = 440 Hz. Results indicated a marked propensity to tune these excerpts sharper than their recorded pitch level. Subjects' responses indicated the mean cent deviation for sharp tunings to be 149.29 cents (approximately 1 1/2 semitones); when tuning flat, the mean deviation was 88.43 cents.
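The cent deviation scores mentioned in this abstract follow the standard definition of the cent: 1200 cents per octave, hence 100 cents per equal-tempered semitone. The sketch below shows that conversion under this standard definition; the function names are hypothetical, and the code is not the procedure used in the study.

```python
import math

def cents_between(freq_hz, ref_hz=440.0):
    """Deviation of freq_hz from ref_hz in cents (positive means sharp)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)

def shift_by_cents(ref_hz, cents):
    """Frequency obtained by shifting ref_hz by the given number of cents."""
    return ref_hz * 2.0 ** (cents / 1200.0)

def cents_to_semitones(cents):
    """Convert cents to equal-tempered semitones (100 cents per semitone)."""
    return cents / 100.0
```

Under this definition, the reported mean sharp deviation of 149.29 cents corresponds to `cents_to_semitones(149.29)`, roughly 1.49 semitones, which agrees with the "approximately 1 1/2 semitones" stated in the abstract.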


1991 ◽  
Vol 34 (4) ◽  
pp. 753-760 ◽  
Author(s):  
Ruth A. Newman ◽  
Floyd W. Emanuel

This study was designed to investigate the effects of vocal f0 on vowel spectral noise level (SNL) and perceived vowel roughness for subjects in high- and low-pitch voice categories. The subjects were 40 adult singers (10 each sopranos, altos, tenors, and basses). Each produced the vowel /a/ in isolation at a comfortable speaking pitch, and at each of seven assigned pitches spaced at whole-tone intervals over a musical octave within his or her singing pitch range. The eight /a/ productions were repeated by each subject on a second test day. The SNL differences between repeated test samples (different days) were not statistically significant for any subject group. For the vowel samples produced at a comfortable pitch, a relatively large SNL was associated with samples phonated by the subjects of each sex who manifested the relatively low singing pitch range. Regarding the vowel samples produced at the assigned-pitch levels, it was found that both vowel SNL and perceived vowel roughness decreased as test-pitch level was raised over a range of one octave. The relationship between vocal pitch and either vowel roughness or SNL approached linearity for each of the four subject groups.


2021 ◽  
pp. 002383092199840
Author(s):  
Philipp Meer ◽  
Robert Fuchs

The current study provides a phonetic perspective on the question of whether a high degree of variability in pitch may be considered a characteristic, endonormative feature of Trinidadian English (TrinE) at the level of speech production and contribute to what is popularly described as ‘sing-song’ prosody. Based on read and spontaneous data from 111 speakers, we analyze pitch level, range, and dynamism in TrinE in comparison to Southern Standard British (BrE) and Educated Indian English (IndE) and investigate sociophonetic variation in TrinE prosody with a view to these global F0 parameters. Our findings suggest that a large pitch range could potentially be considered an endonormative feature of TrinE that distinguishes it from other varieties (BrE and IndE), at least in spontaneous speech. More importantly, however, it is shown that a high degree of pitch variation in terms of range and dynamism is not as much characteristic of TrinE as a whole as it is of female Trinidadian speakers. An important finding of this study is that pitch variation patterns are not homogeneous in TrinE, but systematically sociolinguistically conditioned across gender, age, and ethnic groups, and rural and urban speakers. The findings thus reveal that there is a considerable degree of systematic local differentiation in TrinE prosody. On a more general level, the findings may be taken to indicate that endonormative tendencies and sociolinguistic differentiation in TrinE prosody are interlinked.


Author(s):  
Maria I. Pavlikova ◽  
Olga V. Frolova ◽  
Elena E. Lyakso ◽  
...  

In the literature, there are no data, based on instrumental analysis of children’s speech, on the formation of intonation in Russian-speaking children with mild intellectual disabilities (mental retardation) without genetic syndromes and serious neurological disorders (for example, cerebral palsy). The aim of this study was to compare the intonation characteristics of speech in children, aged 5 to 7, with typical development and with mild intellectual disabilities. The participants of the study were 20 children aged 5 to 7: 10 children (5 girls and 5 boys) with typical development (TD group) and 10 children (6 boys and 4 girls) with mild intellectual disabilities (ID group, ICD-10-CM Code F70). Intellectual disabilities were not associated with genetic or severe neurological disorders (non-specific ID). Child speech was taken from the AD-CHILD.RU speech database. Audio and video recordings of speech and behavior of TD group children (in a kindergarten) and ID group children (in an orphanage) were made in the model situation of a “dialogue with an adult”. Two studies were conducted: a perceptual experiment (n=10 listeners – native speakers, researchers in the field of child speech development) and an instrumental spectrographic analysis of child speech. The instrumental analysis of speech was made in the Praat program. The duration of utterances and stressed vowels, pitch values (average, maximum and minimum), pitch range values of utterances, and pitch range values of vowels were analyzed. The perceptual experiment showed that the utterances of ID group children were classified as less clear and more emotional than the utterances of TD group children. The task of identifying phrase stress (words highlighted by voice) was more difficult for adults when they were listening to the speech of ID group children vs. TD group children. In ID group children, the values of utterance duration are lower and the values of vowel duration are higher than in TD group children. 
The average, maximum, and minimum pitch values and the pitch range values of ID group children’s utterances are higher than the corresponding parameters of TD group children’s speech. The duration and pitch range values of stressed vowels from ID group children’s words highlighted by intonation are higher than those of TD group children’s stressed vowels. The pitch contours of stressed vowels from TD group children’s words highlighted by intonation were in most cases rising; the pitch contours of stressed vowels from ID group children’s words highlighted by intonation were falling. The dome-shaped vowel pitch contour and U-shaped contour are more frequent in the speech of ID group children vs. TD group children. In the future, the intonation characteristics of speech of children with different diagnoses could be considered as additional diagnostic criteria of developmental disorders.


2021 ◽  
Author(s):  
Anina Kinzel

Online grooming has become a widespread and worryingly fast-growing issue in society. This thesis analyses a corpus of online grooming communication, made available by the Perverted Justice (PJ) archive, a non-profit organisation that from 2004 until 2019 employed volunteers who pretended to be children and entered chat rooms to catch and convict groomers, collaborating with law enforcement. The archive consists of 622 grooming chat logs and approx. 3.7 million words of groomer language. A corpus of this database was built, and a Corpus-Assisted Discourse Studies (CADS) approach used to analyse the language therein. Specifically, the language was compared to a reference corpus of general chat language data (PAN2012), and duration of online grooming and manipulative requesting behaviour were also investigated. The following research questions were answered: 1) What are the features of a corpus of online groomer language compared to that of a general digital chat language reference corpus? Is online groomer language distinct? How are online grooming intentions realised linguistically by online groomers? 2) Does duration of grooming influence the grooming process/intentions? Is usage of specific words/specific grooming intentions associated with different duration of grooming? Can different duration profiles be established and, if so, what are the cut-off points for these duration profiles? 3) How are requests realised in online grooming and how does duration influence this? How do groomers make requests and what support move functions do they use? Does duration influence how requests are made, and the type of support move functions that are used? The thesis newly identifies nuanced linguistic realisations of groomers’ intentions and strategies, proposing a new working terminology for discourse-based models of online grooming. This is based on a review of the literature followed by an empirical analysis refining this terminology, which has not been done before. 
It finds evidence for two distinct duration-based grooming approaches and yields a fine-grained qualitative analysis of groomer requests, also influenced by grooming duration. There have only been very few studies using a CADS analysis of such a large dataset of groomer language and this thesis will lead to new insights, implications and significance for the successful analysis, detection and prevention of online grooming.

