f0 contour
Recently Published Documents


Total documents: 66 (five years: 13)
H-index: 6 (five years: 2)

2020 ◽ Vol 63 (11) ◽ pp. 3855-3864
Author(s): Wanting Huang ◽ Lena L. N. Wong ◽ Fei Chen ◽ Haihong Liu ◽ Wei Liang

Purpose: Fundamental frequency (F0) is the primary acoustic cue for lexical tone perception in tonal languages but is processed in a limited way in cochlear implant (CI) systems. The aim of this study was to evaluate the importance of F0 contours in sentence recognition in Mandarin-speaking children with CIs and find out whether it is similar to/different from that in age-matched normal-hearing (NH) peers.

Method: Age-appropriate sentences, with F0 contours manipulated to be either natural or flattened, were randomly presented to preschool children with CIs and their age-matched peers with NH under three test conditions: in quiet, in white noise, and with competing sentences at 0 dB signal-to-noise ratio.

Results: The neutralization of F0 contours resulted in a significant reduction in sentence recognition. While this was seen only in noise conditions among NH children, it was observed throughout all test conditions among children with CIs. Moreover, the F0 contour-induced accuracy reduction ratios (i.e., the reduction in sentence recognition resulting from the neutralization of F0 contours compared to the normal F0 condition) were significantly greater in children with CIs than in NH children in all test conditions.

Conclusions: F0 contours play a major role in sentence recognition in both quiet and noise among pediatric implantees, and the contribution of the F0 contour is even more salient than that in age-matched NH children. These results also suggest that there may be differences between children with CIs and NH children in how F0 contours are processed.
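The accuracy reduction ratio mentioned in the Results can be illustrated with a small sketch. The abstract does not give the exact formula, so the definition below (the drop in accuracy divided by the natural-F0 accuracy) is an assumption, and the function name `reduction_ratio` and the example scores are hypothetical.

```python
def reduction_ratio(natural_acc: float, flattened_acc: float) -> float:
    """Relative drop in sentence recognition caused by flattening F0.

    Assumed definition (not stated in the abstract):
        (natural accuracy - flattened accuracy) / natural accuracy
    Both accuracies are proportions correct in the same test condition.
    """
    if natural_acc <= 0:
        raise ValueError("natural_acc must be positive")
    return (natural_acc - flattened_acc) / natural_acc


# Hypothetical scores for one listener in one condition.
ratio = reduction_ratio(natural_acc=0.8, flattened_acc=0.6)
print(f"reduction ratio: {ratio:.2f}")
```

Normalizing by the natural-F0 score makes the drop comparable across groups whose baseline accuracy differs, which is one plausible reason to report a ratio rather than a raw difference.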


2020 ◽ Vol 10 (18) ◽ pp. 6381
Author(s): Pongsathon Janyoi ◽ Pusadee Seresangtakul

The modeling of fundamental frequency (F0) in speech synthesis is a critical factor affecting the intelligibility and naturalness of synthesized speech. In this paper, we focus on improving the modeling of F0 for Isarn speech synthesis. We propose an F0 model for this purpose based on a recurrent neural network (RNN). The model uses F0 values sampled at the syllable level of continuous Isarn speech, combined with their dynamic features, to represent supra-segmental properties of the F0 contour. Different architectures of deep RNNs and different combinations of linguistic features are analyzed to find the conditions that give the best performance. To assess the proposed method, we compared it with several RNN-based baselines. The results of objective and subjective tests indicate that the proposed model significantly outperformed the baseline RNN model that predicts values of F0 at the frame level, and the baseline RNN model that represents the F0 contours of syllables by using the discrete cosine transform.
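The two syllable-level representations contrasted in this abstract can be sketched roughly as follows. This is not the authors' code: the number of sample points, the use of first differences as "dynamic features", and the unnormalized DCT-II are all assumptions, and the function names are hypothetical.

```python
import math


def sample_contour(f0, n_points):
    """Linearly interpolate a syllable's F0 track to n_points equally spaced samples."""
    if not f0 or n_points < 2:
        raise ValueError("need a non-empty contour and n_points >= 2")
    last = len(f0) - 1
    samples = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into f0
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        samples.append(f0[lo] * (1 - frac) + f0[hi] * frac)
    return samples


def deltas(values):
    """First differences as a simple stand-in for dynamic (delta) features."""
    return [0.0] + [b - a for a, b in zip(values, values[1:])]


def dct2(values, n_coeffs):
    """First n_coeffs coefficients of an unnormalized DCT-II of the contour."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


# Hypothetical rising contour (Hz) for one syllable.
syllable_f0 = [100.0, 200.0]
static = sample_contour(syllable_f0, 3)
print(static, deltas(static))
print(dct2(static, 2))
```

Sampled values keep the contour shape in the time domain, while DCT coefficients compress it into a few shape parameters; the abstract reports that the sampled representation with dynamic features worked better in the authors' experiments.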


2020 ◽ Vol 22 (1) ◽ pp. 5-27
Author(s): Doina Jitcă

This paper presents an Information Structure (IS) model at the information packaging (IPk) level and its use in utterance partitioning and in explaining semantic IS category realizations at the pragmatic level. The IPk model proposes a hierarchical view of F0 contours that transforms utterances into binary contrast unit (CU) hierarchies. CUs have binary IPk partitions with two independent and overlapping structures and a nuclear element that projects its IPk functions onto the whole unit it belongs to. Two nuclear accent identification rules are formulated for decoding the IPk partition hierarchy through F0 contour analysis. In the second part of the paper, several intonational contours of English sentences with different semantic IS events are interpreted by correlating the results of semantic IS analysis with those of the IPk model-based analysis. By decoding IPk structure and functional constituents from F0 contours, we can advance our knowledge about the relationship between prosody and intonational meaning.


2020 ◽ Vol 24 ◽ pp. 233121652092007
Author(s): Michael F. Dorman ◽ Sarah Cook Natale ◽ Leslie Baxter ◽ Daniel M. Zeitler ◽ Matthew L. Carlson ◽ ...

Fourteen single-sided deaf listeners fit with an MED-EL cochlear implant (CI) judged the similarity of clean signals presented to their CI and modified signals presented to their normal-hearing ear. The signals to the normal-hearing ear were created by (a) filtering, (b) spectral smearing, (c) changing overall fundamental frequency (F0), (d) F0 contour flattening, (e) changing formant frequencies, (f) altering resonances and ring times to create a metallic sound quality, (g) using a noise vocoder, or (h) using a sine vocoder. The operations could be used singly or in any combination. On a scale of 1 to 10 where 10 was a complete match to the sound of the CI, the mean match score was 8.8. Over half of the matches were 9.0 or higher. The most common alterations to a clean signal were band-pass or low-pass filtering, spectral peak smearing, and F0 contour flattening. On average, 3.4 operations were used to create a match. Upshifts in formant frequencies were implemented most often for electrode insertion angles less than approximately 500°. A relatively small set of operations can produce signals that approximate the sound of the MED-EL CI. There are large individual differences in the combination of operations needed. The sound files in Supplemental Material approximate the sound of the MED-EL CI for patients fit with 28-mm electrode arrays.
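Two of the operations listed above, changing overall F0 and F0 contour flattening, can be illustrated on a frame-level F0 track. This is only a sketch under stated assumptions: unvoiced frames are marked with 0, flattening means replacing every voiced frame with the mean voiced F0, and the function names are hypothetical. Resynthesizing audio from the modified track (for example with a vocoder) is not shown, and the study's actual signal processing is not described in the abstract.

```python
def flatten_f0(f0):
    """Replace every voiced frame with the mean voiced F0; keep unvoiced frames at 0."""
    voiced = [v for v in f0 if v > 0]
    if not voiced:
        return [0.0 for _ in f0]
    mean_f0 = sum(voiced) / len(voiced)
    return [mean_f0 if v > 0 else 0.0 for v in f0]


def shift_f0(f0, factor):
    """Scale the overall F0 of voiced frames by a constant factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [v * factor if v > 0 else 0.0 for v in f0]


# Hypothetical F0 track in Hz, with unvoiced frames at both ends.
track = [0.0, 100.0, 200.0, 0.0]
print(flatten_f0(track))      # contour removed, voicing pattern kept
print(shift_f0(track, 0.5))   # contour shape kept, overall pitch lowered
```

Flattening removes pitch movement but preserves the mean pitch, whereas shifting preserves the movement and changes the mean, which is why the two can be applied independently or together.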


2019 ◽ Vol 1 (1)
Author(s): Hongliu Jiang

As a representative of Southwestern Mandarin, the Chengdu dialect has distinctive pitch features in its tonal and intonational phonology. Research on the pronunciation and lexical tones of the Chengdu dialect has a long history and has produced a body of theoretical results. However, research on the intonation of the Chengdu dialect is still rare. This study provides an acoustic analysis of the intonational pitch features of interrogative and declarative sentences in the Chengdu dialect. It examines the F0 contour on the final syllable (character) of each sentence to determine whether statement or question mood is carried by the edge tone, and how lexical tone and intonation perturb the pitch of that syllable. The results show that statement and question mood in the Chengdu dialect are carried by the final syllable of an intonational phrase, and that the pitch of this syllable is perturbed by the coexistence of its lexical tone and intonation.


2019 ◽ Vol 28 (2S) ◽ pp. 875-886
Author(s): Jennifer M. Vojtech ◽ Jacob P. Noordzij ◽ Gabriel J. Cler ◽ Cara E. Stepp

Purpose: This study investigated how modulating fundamental frequency (f0) and speech rate differentially impact the naturalness, intelligibility, and communication efficiency of synthetic speech.

Method: Sixteen sentences of varying prosodic content were developed via a speech synthesizer. The f0 contour and speech rate of these sentences were altered to produce 4 stimulus sets: (a) normal rate with a fixed f0 level, (b) slow rate with a fixed f0 level, (c) normal rate with prosodically natural f0 variation, and (d) normal rate with prosodically unnatural f0 variation. Sixteen listeners provided orthographic transcriptions and judgments of naturalness for these stimuli.

Results: Sentences with f0 variation were rated as more natural than those with a fixed f0 level. Conversely, sentences with a fixed f0 level demonstrated higher intelligibility than those with f0 variation. Speech rate did not affect the intelligibility of stimuli with a fixed f0 level. Communication efficiency was highest for sentences produced at a normal rate and a fixed f0 level.

Conclusions: Sentence-level f0 variation increased naturalness ratings of synthesized speech, whether the variation was prosodically natural or not. However, these f0 variations reduced intelligibility. There is evidence of a trade-off in naturalness and intelligibility of synthesized speech, which may impact future speech synthesis designs.

Supplemental Material: https://doi.org/10.23641/asha.8847833


Author(s): Arman Kaliyev ◽ Yuri N. Matveev ◽ Elena E. Lyakso ◽ Sergey V. Rybin