Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR

Author(s):  
Chang Liu ◽  
Zhen Zhang ◽  
Pengyuan Zhang ◽  
Yonghong Yan
Keyword(s):  
2018 ◽  
Vol 6 ◽  
pp. 451-465 ◽  
Author(s):  
Daniela Gerz ◽  
Ivan Vulić ◽  
Edoardo Ponti ◽  
Jason Naradowsky ◽  
Roi Reichart ◽  
...  

Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
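
The type-to-token ratio mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `type_token_ratio` and the whitespace tokenization are assumptions made purely for illustration.

```python
def type_token_ratio(tokens):
    """Return the number of distinct word forms (types) divided by the
    number of running words (tokens). Returns 0.0 for empty input."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Toy example with naive whitespace tokenization (an assumption).
sentence = "the cat sat on the mat".split()
ratio = type_token_ratio(sentence)  # 5 distinct types over 6 tokens
```

A corpus in a morphologically rich language tends to yield a higher ratio than an English corpus of the same size, because one lemma surfaces in many inflected forms. Many of those forms are rare, which is the setting the abstract targets.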


2015 ◽  
Vol 12 (2) ◽  
pp. 026007 ◽  
Author(s):  
Jaime F Delgado Saa ◽  
Adriana de Pesters ◽  
Dennis McFarland ◽  
Müjdat Çetin

2018 ◽  
Vol 6 ◽  
pp. 529-541 ◽  
Author(s):  
Jacob Buckman ◽  
Graham Neubig

In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions — including polysemy and the existence of multiword lexical items — into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
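
The idea of marginalizing over a lattice of paths can be sketched with a forward-style dynamic program. This is a deliberately simplified illustration, not the authors' neural model: it assumes context-independent token probabilities supplied in a hypothetical dictionary `token_prob`, whereas the paper conditions on context with a neural network. The names `lattice_marginal` and `max_len` are also assumptions.

```python
def lattice_marginal(chars, token_prob, max_len=2):
    """Sum the probability of every segmentation of `chars` into tokens
    of at most `max_len` characters.

    alpha[i] holds the total probability of all paths that cover the
    first i characters; unknown tokens contribute probability 0.
    """
    n = len(chars)
    alpha = [0.0] * (n + 1)
    alpha[0] = 1.0  # the empty prefix has exactly one (empty) path
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            token = "".join(chars[start:end])
            alpha[end] += alpha[start] * token_prob.get(token, 0.0)
    return alpha[n]


# Toy probabilities (hypothetical): two paths cover "ab",
# namely "a" + "b" and the single multi-character token "ab".
toy_probs = {"a": 0.5, "b": 0.25, "ab": 0.125}
total = lattice_marginal("ab", toy_probs)  # 0.5 * 0.25 + 0.125
```

The dynamic program visits each lattice edge once, so the cost grows linearly with sentence length for a fixed `max_len` rather than with the number of paths.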


2020 ◽  
Vol 51 (3) ◽  
pp. 544-560 ◽  
Author(s):  
Kimberly A. Murphy ◽  
Emily A. Diehm

Purpose: Morphological interventions promote gains in morphological knowledge and in other oral and written language skills (e.g., phonological awareness, vocabulary, reading, and spelling), yet we have a limited understanding of critical intervention features. In this clinical focus article, we describe a relatively novel approach to teaching morphology that considers its role as the key organizing principle of English orthography. We also present a clinical example of such an intervention delivered during a summer camp at a university speech and hearing clinic. Method: Graduate speech-language pathology students provided a 6-week morphology-focused orthographic intervention to children in first through fourth grade (n = 10) who demonstrated word-level reading and spelling difficulties. The intervention focused children's attention on morphological families, teaching how morphology is interrelated with phonology and etymology in English orthography. Results: Comparing pre- and posttest scores, children demonstrated improvement in reading and/or spelling abilities, with the largest gains observed in spelling affixes within polymorphemic words. Children and their caregivers reacted positively to the intervention. Therefore, data from the camp offer preliminary support for teaching morphology within the context of written words, and the intervention appears to be a feasible approach for simultaneously increasing morphological knowledge, reading, and spelling. Conclusion: Children with word-level reading and spelling difficulties may benefit from a morphology-focused orthographic intervention, such as the one described here. Research on the approach is warranted, and clinicians are encouraged to explore its possible effectiveness in their practice. Supplemental Material: https://doi.org/10.23641/asha.12290687


2020 ◽  
Vol 29 (4) ◽  
pp. 2170-2188
Author(s):  
Lindsey R. Squires ◽  
Sara J. Ohlfest ◽  
Kristen E. Santoro ◽  
Jennifer L. Roberts

Purpose: The purpose of this systematic review was to determine evidence of a cognate effect for young multilingual children (ages 3;0–8;11 [years;months], preschool to second grade) in terms of task-level and child-level factors that may influence cognate performance. Cognates are pairs of vocabulary words that share meaning with similar phonology and/or orthography in more than one language, such as rose – rosa (English–Spanish) or carrot – carotte (English–French). Despite the cognate advantage noted with older bilingual children and bilingual adults, there has been no systematic examination of the cognate research in young multilingual children. Method: We conducted searches of multiple electronic databases and hand-searched article bibliographies for studies that examined young multilingual children's performance with cognates based on study inclusion criteria aligned to the research questions. Results: The review yielded 16 articles. The majority of the studies (12/16, 75%) demonstrated a positive cognate effect for young multilingual children (measured in higher accuracy, faster reaction times, and doublet translation equivalents on cognates as compared to noncognates). However, not all bilingual children demonstrated a cognate effect. Both task-level factors (cognate definition, type of cognate task, word characteristics) and child-level factors (level of bilingualism, age) appear to influence young bilingual children's performance on cognates. Conclusions: Contrary to early 1990s research, current researchers suggest that even young multilingual children may demonstrate sensitivity to cognate vocabulary words. Given the limits in study quality, more high-quality research is needed, particularly to address test validity in cognate assessments, to develop appropriate cognate definitions for children, and to refine word-level features. Only one study included a brief instruction prior to assessment, warranting cognate treatment studies as an area of future need.
Supplemental Material: https://doi.org/10.23641/asha.12753179


2020 ◽  
Vol 51 (3) ◽  
pp. 603-616
Author(s):  
Kenn Apel ◽  
Victoria S. Henbest

Purpose: Morphological awareness is the ability to consciously manipulate the smallest units of meaning in language. Morphological awareness contributes to success with literacy skills for children with typical language and those with language impairment. However, little research has focused on the morphological awareness skills of children with speech sound disorders (SSD), who may be at risk for literacy impairments. No researcher has examined the morphological awareness skills of children with SSD and compared their skills to children with typical speech using tasks representing a comprehensive definition of morphological awareness, which was the main purpose of this study. Method: Thirty second- and third-grade students with SSD and 30 with typical speech skills, matched on age and receptive vocabulary, completed four morphological awareness tasks and measures of receptive vocabulary, real-word reading, pseudoword reading, and word-level spelling. Results: Results indicated there was no difference between the morphological awareness skills of students with and without SSD. Although morphological awareness was moderately to strongly related to the students' literacy skills, performance on the morphological awareness tasks contributed little to no additional variance to the children's real-word reading and spelling skills beyond what was accounted for by pseudoword reading. Conclusions: Findings suggest that early elementary-age students with SSD may not present with concomitant morphological awareness difficulties and that the morphological awareness skills of these students may not play a unique role in their word-level literacy skills. Limitations and suggestions for future research on the morphological awareness skills of children with SSD are discussed.


2019 ◽  
Vol 118 (7) ◽  
pp. 73-76
Author(s):  
Sharanabasappa ◽  
P Ravibabu

During image acquisition and transmission, image data can be corrupted by impulse noise, which is classified as salt-and-pepper noise or random impulse noise depending on the noise values. The median filter is a widely used nonlinear digital filter for edge preservation, impulse-noise removal, and signal smoothing. It is used more widely for removing salt-and-pepper noise than the rank-order filter, the morphological filter, or the unsharp masking filter. The median filter replaces a sample with the middle value among all the samples inside the sample window. Median filter architectures are of two types, depending on the number of samples processed in the same cycle: bit-level architecture and word-level architecture. This paper introduces a carry look-ahead adder median filter method to improve hardware resource usage in median filter architectures, covering window sizes of 5 and 9 for 8-bit and 16-bit median filters.
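
The replacement rule described above (each sample becomes the middle value of its window) can be sketched in software as follows. This is a plain reference sketch, not the carry look-ahead hardware architecture proposed in the paper. The function name `median_filter_1d`, the one-dimensional signal, and the edge-replication border handling are all assumptions.

```python
def median_filter_1d(samples, window=5):
    """Replace each sample with the median of the odd-sized window
    centred on it. Borders are handled by repeating the edge sample
    (an assumption; the abstract does not specify border handling)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(samples)
    filtered = []
    for i in range(n):
        # Clamp out-of-range indices to the first or last sample.
        neighbourhood = [
            samples[min(max(j, 0), n - 1)]
            for j in range(i - half, i + half + 1)
        ]
        # The middle element of the sorted window is the median.
        filtered.append(sorted(neighbourhood)[half])
    return filtered


# Isolated salt (255) and pepper (0) impulses in an otherwise flat signal.
noisy = [10, 10, 255, 10, 10, 0, 10, 10]
clean = median_filter_1d(noisy, window=5)
```

Because an isolated impulse never reaches the middle of the sorted window, it is discarded outright instead of being averaged into its neighbours, which is why the filter suits salt-and-pepper noise.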


2015 ◽  
Author(s):  
Beata Beigman Klebanov ◽  
Chee Wee Leong ◽  
Michael Flor

1994 ◽  
Author(s):  
R. Schwartz ◽  
L. Nguyen ◽  
F. Kubala ◽  
G. Chou ◽  
G. Zavaliagkos ◽  
...  
