Significance of spectral cues in automatic speech segmentation for Indian language speech synthesizers

2020 ◽  
Vol 123 ◽  
pp. 10-25
Author(s):  
Arun Baby ◽  
Jeena J. Prakash ◽  
Aswin Shanmugam Subramanian ◽  
Hema A. Murthy
1991 ◽  
Vol 34 (2) ◽  
pp. 415-426 ◽  
Author(s):  
Richard L. Freyman ◽  
G. Patrick Nerbonne ◽  
Heather A. Cote

This investigation examined the degree to which modification of the consonant-vowel (C-V) intensity ratio affected consonant recognition under conditions in which listeners were forced to rely more heavily on waveform envelope cues than on spectral cues. The stimuli were 22 vowel-consonant-vowel utterances, which had been mixed at six different signal-to-noise ratios with white noise that had been modulated by the speech waveform envelope. The resulting waveforms preserved the gross speech envelope shape, but spectral cues were limited by the white-noise masking. In a second stimulus set, the consonant portion of each utterance was amplified by 10 dB. Sixteen subjects with normal hearing listened to the unmodified stimuli, and 16 listened to the amplified-consonant stimuli. Recognition performance was reduced in the amplified-consonant condition for some consonants, presumably because waveform envelope cues had been distorted. However, for other consonants, especially the voiced stops, consonant amplification improved recognition. Patterns of errors were altered for several consonant groups, including some that showed only small changes in recognition scores. The results indicate that when spectral cues are compromised, nonlinear amplification can alter waveform envelope cues for consonant recognition.
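As a rough illustration of the quantities in this abstract, the following minimal sketch computes a C-V intensity ratio in dB from RMS levels and applies a 10 dB boost to the consonant segment. The function names and the toy sample values are assumptions made for illustration, segment boundaries are assumed to be known already, and this is not the authors' stimulus-processing procedure.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def cv_ratio_db(consonant, vowel):
    """Consonant-vowel intensity ratio in dB (negative when the consonant is weaker)."""
    return 20 * math.log10(rms(consonant) / rms(vowel))

def amplify_db(samples, gain_db):
    """Scale samples by an amplitude factor equivalent to gain_db decibels."""
    factor = 10 ** (gain_db / 20)
    return [factor * s for s in samples]

# Toy segments (hypothetical values): consonant 20 dB below the vowel.
consonant = [0.1, -0.1, 0.1, -0.1]
vowel = [1.0, -1.0, 1.0, -1.0]

before = cv_ratio_db(consonant, vowel)
after = cv_ratio_db(amplify_db(consonant, 10.0), vowel)
print(round(before, 2), round(after, 2))
```

A 10 dB boost corresponds to an amplitude factor of about 3.16, so the C-V ratio of the toy segments moves from about -20 dB to about -10 dB.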


Author(s):  
Ana Franco ◽  
Julia Eberlen ◽  
Arnaud Destrebecqz ◽  
Axel Cleeremans ◽  
Julie Bertels

Abstract. The Rapid Serial Visual Presentation procedure is a method widely used in visual perception research. In this paper we propose an adaptation of this method which can be used with auditory material and enables assessment of statistical learning in speech segmentation. Adult participants were exposed to an artificial speech stream composed of statistically defined trisyllabic nonsense words. They were subsequently instructed to perform a detection task in a Rapid Serial Auditory Presentation (RSAP) stream in which they had to detect a syllable in a short speech stream. Results showed that reaction times varied as a function of the statistical predictability of the syllable: second and third syllables of each word were responded to faster than first syllables. This result suggests that the RSAP procedure provides a reliable and sensitive indirect measure of auditory statistical learning.


Author(s):  
Jean Vroomen ◽  
Beatrice de Gelder

2012 ◽  
Vol 3 (2) ◽  
pp. 343-346
Author(s):  
Adabala Venkata Srinivasa Rao ◽  
D R Sandeep ◽  
V B Sandeep ◽  
S Dhanam Jaya

Recognition of Indian language scripts is a challenging problem. Work on the development of complete OCR systems for Indian language scripts is still in its infancy. Complete OCR systems have recently been developed for Devanagari and Bangla scripts. Research on the recognition of Telugu script faces major problems, mainly related to the touching and overlapping of characters. Segmenting touching Telugu characters is a difficult step in recognizing individual characters. In this paper, an algorithm is proposed for the segmentation of touching handwritten Telugu characters. The proposed method uses the Drop-fall algorithm, which is based on moving a marble on either side of the touching characters to select the point at which the fused components should be cut. The method improves segmentation accuracy over the existing one.
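The marble idea can be sketched as follows. This is a simplified, descending-only variant written for illustration, not the authors' implementation: the move priority (down, down-right, down-left, right, left), the visited-cell guard against oscillation, and the name `drop_fall_path` are all assumptions. The image is a binary grid where 1 is ink and 0 is background.

```python
def drop_fall_path(image, start_col):
    """Trace a marble from the top row to the bottom row of a binary image.

    image: list of rows, 1 = ink, 0 = background.
    Returns the list of (row, col) cells the marble passes through.
    """
    height, width = len(image), len(image[0])
    row, col = 0, start_col
    path = [(row, col)]
    visited = {(row, col)}
    while row < height - 1:
        candidates = [
            (row + 1, col),      # straight down
            (row + 1, col + 1),  # down-right
            (row + 1, col - 1),  # down-left
            (row, col + 1),      # right
            (row, col - 1),      # left
        ]
        for r, c in candidates:
            # Roll to the first background cell that has not been visited yet.
            if 0 <= c < width and image[r][c] == 0 and (r, c) not in visited:
                row, col = r, c
                break
        else:
            # Every option is blocked: cut straight down through the ink.
            row += 1
        path.append((row, col))
        visited.add((row, col))
    return path

# Two strokes joined by a solid bar in row 2 (hypothetical toy image).
image = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
print(drop_fall_path(image, 0))
```

Starting from the left edge, the marble rolls into the gap between the strokes and cuts through the joining bar at column 2. Running the same idea from the other side, or from the bottom upward, gives alternative cut paths to compare.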


Author(s):  
Barbra A. Meek

This chapter is an exploration of how race and language become entangled in representations and ideas about what it means to be seen and recognized as Native American. Most conceptions of Indianness derive from scholarly European-derived representations and evaluations and from popular narrative media, the one often bootstrapping the other. In tandem, these public manifestations perpetuate the racialization of Indian languages and of Indianness, most ubiquitously in and through a discourse of “blood.” Several ideologies configure the racial logic that determines Indianness: purism (percentage of “Indian blood”), visibility (racialized—and cultural—manifestations of “blood”), continuity (maintenance of a pre-contact “bloodline”), and primitivism (expression of indigenous “blood” in and through language). I argue that this “ideological assemblage” (Kroskrity 2018) undergirds the processes of “racing Indian language(s)” and “languaging an Indian race” (H. Samy Alim 2016) that have resulted in propagating conflicts over and denials of Native American heritage.


2017 ◽  
Vol 61 (1) ◽  
pp. 84-96 ◽  
Author(s):  
David M. Gómez ◽  
Peggy Mok ◽  
Mikhail Ordin ◽  
Jacques Mehler ◽  
Marina Nespor

Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, as well as assessing how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.
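For readers unfamiliar with transitional probabilities, the sketch below uses one common definition, TP(x → y) = count(xy) / count(x), and places a boundary wherever the TP falls below a threshold. The syllable stream, the threshold value, and the function names are hypothetical, and the computation runs over whole syllables; the study itself manipulates cues carried by consonants or by vowels, which this sketch does not model.

```python
from collections import Counter

def transitional_probabilities(units):
    """Map each adjacent pair (x, y) to count(xy) / count(x as a first element)."""
    pair_counts = Counter(zip(units, units[1:]))
    first_counts = Counter(units[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(units, threshold):
    """Split the stream wherever the TP between neighbours drops below threshold."""
    tps = transitional_probabilities(units)
    words, current = [], [units[0]]
    for a, b in zip(units, units[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical stream built from three trisyllabic nonsense words.
stream = ("tu pi ro go la bu bi da ku tu pi ro "
          "bi da ku go la bu tu pi ro").split()

print(segment(stream, threshold=0.75))
```

In this toy stream, within-word transitions have a TP of 1.0 and between-word transitions have a TP of 0.5, so a threshold between those values recovers the three nonsense words.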

