Construction of spoken language model including fillers using filler prediction model

Author(s):  
Kengo Ohta ◽  
Masatoshi Tsuchiya ◽  
Seiichi Nakagawa


2014 ◽  
Vol 36 (1) ◽  
pp. 35-43 ◽  
Author(s):  
Kevin J. Miller

This article reflects on the author’s experience supervising a public school program for students who are deaf or hard-of-hearing, specifically addressing national, regional, and local trends affecting it. These trends included teacher efficacy, changes in educational service delivery, advances in technology, the selection of the listening and spoken language model, the needs of university teacher education programs, and telepractice. Furthermore, the author describes how the program responded to these trends, which ultimately resulted in positive educational outcomes for the students being served.


2019 ◽  
Vol 9 (18) ◽  
pp. 3648
Author(s):  
Casper S. Shikali ◽  
Zhou Sijie ◽  
Liu Qihe ◽  
Refuoe Mokhosi

Deep learning has been used extensively in natural language processing, with sub-word representation vectors playing a critical role. However, this cannot be said of Swahili, a low-resource yet widely spoken language in East and Central Africa. This study proposed novel word embeddings from syllable embeddings (WEFSE) for Swahili to address the problem of word representation for agglutinative and syllable-based languages. Inspired by the way Swahili is taught in beginner classes, we encoded the syllables of words, rather than characters, character n-grams or morphemes, and generated quality word embeddings using a convolutional neural network. The quality of WEFSE is demonstrated by state-of-the-art results with a syllable-aware language model on both a small dataset (perplexity 31.229) and a medium dataset (perplexity 45.859), outperforming character-aware language models. We further evaluated the word embeddings on a word analogy task. To the best of our knowledge, syllabic alphabets have not previously been used to compose word representation vectors. The main contributions of the study are therefore a syllabic alphabet, WEFSE, a syllable-aware language model and a word analogy dataset for Swahili.
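
The following sketch (not the authors' code; the layer sizes, syllable inventory and ids are illustrative assumptions) shows how a word vector can be composed from syllable embeddings with a convolutional network, in the spirit of WEFSE:

```python
# Minimal sketch, assuming a pre-built syllable-to-id lookup: compose a
# word vector from syllable embeddings with a 1-D CNN. All names and
# dimensions are hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class SyllableWordEncoder(nn.Module):
    def __init__(self, n_syllables, syl_dim=50, word_dim=200, kernel=3):
        super().__init__()
        self.syl_emb = nn.Embedding(n_syllables, syl_dim, padding_idx=0)
        # Convolve over the syllable sequence, then max-pool over time
        self.conv = nn.Conv1d(syl_dim, word_dim, kernel_size=kernel, padding=1)

    def forward(self, syl_ids):                    # (batch, max_syllables)
        x = self.syl_emb(syl_ids).transpose(1, 2)  # (batch, syl_dim, time)
        h = torch.relu(self.conv(x))               # (batch, word_dim, time)
        return h.max(dim=2).values                 # (batch, word_dim)

# Toy usage: "matunda" -> syllables ma-tu-nda, mapped to hypothetical ids
syllable_ids = torch.tensor([[5, 9, 23]])
encoder = SyllableWordEncoder(n_syllables=300)
word_vec = encoder(syllable_ids)                   # one 200-dim word vector
```

The word vectors produced this way can then feed a downstream language model, which is how the paper's syllable-aware perplexity results are obtained.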


Author(s):  
Aditya Siddhant ◽  
Anuj Goyal ◽  
Angeliki Metallinou

User interaction with voice-powered agents generates large amounts of unlabeled utterances. In this paper, we explore techniques to efficiently transfer knowledge from these unlabeled utterances to improve model performance on Spoken Language Understanding (SLU) tasks. We use Embeddings from Language Models (ELMo) to take advantage of unlabeled data by learning contextualized word representations. Additionally, we propose ELMo-Light (ELMoL), a faster and simpler unsupervised pre-training method for SLU. Our findings suggest that unsupervised pre-training on a large corpus of unlabeled utterances leads to significantly better SLU performance than training from scratch, and that it can even outperform conventional supervised transfer. We also show that the gains from unsupervised transfer techniques can be further improved by supervised transfer. The improvements are more pronounced in low-resource settings: using only 1,000 labeled in-domain samples, our techniques match the performance of training from scratch on 10-15x more labeled in-domain data.
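
As a loose illustration of this kind of unsupervised transfer (a sketch under assumptions, not the paper's ELMoL implementation; all module names and sizes are invented), one can pre-train a small bidirectional LSTM language model on unlabeled utterances and then reuse its encoder as contextual features for an SLU classifier:

```python
# Hypothetical sketch: pre-train a bidirectional LSTM LM on unlabeled
# utterances, then reuse its encoder for intent classification. The
# pre-training loop and vocabulary handling are omitted.
import torch
import torch.nn as nn

class LMEncoder(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.lm_head = nn.Linear(2 * dim, vocab)   # used only in pre-training

    def forward(self, tokens):                     # (batch, seq)
        states, _ = self.lstm(self.emb(tokens))    # (batch, seq, 2*dim)
        return states

class IntentClassifier(nn.Module):
    def __init__(self, encoder, n_intents, dim=128):
        super().__init__()
        self.encoder = encoder                     # pre-trained, may be frozen
        self.head = nn.Linear(2 * dim, n_intents)

    def forward(self, tokens):
        states = self.encoder(tokens)
        return self.head(states.mean(dim=1))       # pool over time

encoder = LMEncoder(vocab=10_000)
# ... language-model pre-training on unlabeled utterances would go here ...
clf = IntentClassifier(encoder, n_intents=7)
logits = clf(torch.randint(0, 10_000, (2, 12)))    # (batch=2, n_intents=7)
```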


2014 ◽  
Vol 369 (1651) ◽  
pp. 20130295 ◽  
Author(s):  
Susan Goldin-Meadow

The goal of this paper is to widen the lens on language to include the manual modality. We look first at hearing children who are acquiring language from a spoken language model and find that even before they use speech to communicate, they use gesture. Moreover, those gestures precede, and predict, the acquisition of structures in speech. We look next at deaf children whose hearing losses prevent them from using the oral modality, and whose hearing parents have not presented them with a language model in the manual modality. These children fall back on the manual modality to communicate and use gestures, which take on many of the forms and functions of natural language. These homemade gesture systems constitute the first step in the emergence of manual sign systems that are shared within deaf communities and are full-fledged languages. We end by widening the lens on sign language to include gesture and find that signers not only gesture, but they also use gesture in learning contexts just as speakers do. These findings suggest that what is key in gesture's ability to predict learning is its ability to add a second representational format to communication, rather than a second modality. Gesture can thus be language, assuming linguistic forms and functions, when other vehicles are not available; but when speech or sign is possible, gesture works along with language, providing an additional representational format that can promote learning.


2020 ◽  
Vol 20 (04) ◽  
pp. 2050029
Author(s):  
Aparna Brahme ◽  
Umesh Bhadade

In this paper, we describe our work on spoken language identification using Visual Speech Recognition (VSR) and analyze the effect of the visual speech units used to transcribe visual speech on language recognition. We propose a new approach of word recognition followed by a word N-gram language model (WRWLM), which uses high-level syntactic features and a word-bigram language model for language discrimination. In contrast to the traditional visemic approach, we also propose a holistic approach that uses the signature of a whole word, referred to as a "Visual Word", as the visual speech unit for transcribing visual speech. The results show a Word Recognition Rate (WRR) of 88% and a Language Recognition Rate (LRR) of 94% in the speaker-dependent case, and a WRR of 58% and an LRR of 77% in the speaker-independent case, on an English and Marathi digit classification task. The proposed approach is also evaluated on continuous speech input. The results show that a spoken language identification rate of 50% is achievable even when the WRR of the visual speech recognizer is below 10%, using only 1 s of speech. The approach also improves language discrimination by about 5% compared to traditional visemic approaches.
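
To make the WRWLM decision rule concrete, here is a minimal sketch (toy data, not the authors' system): recognized word sequences are scored under per-language word-bigram models with add-alpha smoothing, and the language with the highest log-probability wins.

```python
# Sketch of bigram-LM language discrimination under assumed toy corpora.
import math
from collections import defaultdict

def train_bigram(sentences, alpha=1.0):
    """Add-alpha smoothed word-bigram counts."""
    bigrams, unigrams, vocab = defaultdict(int), defaultdict(int), set()
    for words in sentences:
        seq = ["<s>"] + words + ["</s>"]
        vocab.update(seq)
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            unigrams[a] += 1
    return bigrams, unigrams, len(vocab), alpha

def log_prob(words, model):
    bigrams, unigrams, v, alpha = model
    seq = ["<s>"] + words + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * v))
               for a, b in zip(seq, seq[1:]))

# Toy corpora of recognized digit words (illustrative only)
english = train_bigram([["one", "two"], ["two", "three"]])
marathi = train_bigram([["ek", "don"], ["don", "teen"]])

hypothesis = ["two", "three"]                      # output of the VSR stage
lang = max([("English", english), ("Marathi", marathi)],
           key=lambda m: log_prob(hypothesis, m[1]))[0]
print(lang)                                        # -> English
```

Because the language model absorbs much of the decision, this kind of scoring can still separate languages even when the underlying word recognizer is noisy, which is consistent with the paper's continuous-speech result.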


2022 ◽  
Vol 10 (1) ◽  
pp. 0-0

Sentence completion systems are actively studied by many researchers because they reduce cognitive effort and improve the user experience. A review of the literature reveals that most work in this area targets English, with limited effort spent on other languages, especially vernacular ones. This work aims to develop a state-of-the-art sentence completion system for Punjabi, the 10th most spoken language in the world. The presented system is the outcome of experiments on various neural network language model combinations. A new Sentence Search Algorithm (SSA) and a patching system were developed to search for, complete and rank candidate completions of a sub-string and return syntactically rich sentence(s). Both quantitative and qualitative evaluation metrics were used to evaluate the system. The results are promising, and the best-performing model completes a given sub-string with high acceptability. This best-performing model is used in the developed user interface.
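
As an illustration of how such a search-and-rank step can work (a hypothetical sketch, not the published SSA; the toy LM table and all names are invented), a beam search over a next-word distribution can complete a sub-string and rank candidates by log-probability:

```python
# Hypothetical beam-search completion loop. `next_word_logprobs` stands in
# for any trained neural language model; here it is a toy lookup table.
import heapq

def next_word_logprobs(prefix):
    # Assumed LM interface: maps a prefix to {word: log-prob}
    table = {
        ("the",): {"model": -0.4, "system": -1.0},
        ("the", "model"): {"works": -0.3, "</s>": -0.9},
        ("the", "model", "works"): {"</s>": -0.1},
    }
    return table.get(tuple(prefix), {"</s>": 0.0})

def complete(prefix, beam=3, max_len=6):
    beams = [(0.0, list(prefix))]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for w, lp in next_word_logprobs(words).items():
                if w == "</s>":
                    finished.append((score + lp, words))
                else:
                    candidates.append((score + lp, words + [w]))
        beams = heapq.nlargest(beam, candidates)   # keep the top hypotheses
        if not beams:
            break
    return max(finished)[1] if finished else []

print(complete(["the"]))   # -> ['the', 'model', 'works']
```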

