The cross-linguistic performance of word segmentation models over time

2019 ◽  
Vol 46 (6) ◽  
pp. 1169-1201
Author(s):  
Andrew CAINES ◽  
Emma ALTMANN-RICHER ◽  
Paula BUTTERY

We select three word segmentation models with psycholinguistic foundations – transitional probabilities, the diphone-based segmenter, and PUDDLE – which track phoneme co-occurrence and positional frequencies in input strings, and in the case of PUDDLE build lexical and diphone inventories. The models are evaluated on caregiver utterances in 132 CHILDES corpora representing 28 languages and 11.9 million words. PUDDLE shows the best performance overall, albeit with wide cross-linguistic variation. We explore the reasons for this variation, fitting regression models to performance scores with linguistic properties which capture lexico-phonological characteristics of the input: word length, utterance length, diversity in the lexicon, the frequency of one-word utterances, the regularity of phoneme patterns at word boundaries, and the distribution of diphones in each language. These properties together explain four-tenths of the observed variation in segmentation performance, a strong outcome and a solid foundation for studying further variables which make the segmentation task difficult.

2011 ◽  
Vol 8 (17) ◽  
pp. 1437-1443
Author(s):  
Saima Athar ◽  
Oscar Gustafsson ◽  
Fahad Qureshi ◽  
Izzet Kale

Author(s):  
Louise Goyet ◽  
Séverine Millotte ◽  
Anne Christophe ◽  
Thierry Nazzi

The present chapter focuses on fluent speech segmentation abilities in early language development. We first review studies exploring the early use of major prosodic boundary cues which allow infants to cut full utterances into smaller-sized sequences like clauses or phrases. We then summarize studies showing that word segmentation abilities emerge around 8 months, and rely on infants’ processing of various bottom-up word boundary cues and top-down known word recognition cues. Given that most of these cues are specific to the language infants are acquiring, we emphasize how the development of these abilities varies cross-linguistically, and explore their developmental origin. In particular, we focus on two cues that might allow bootstrapping of these abilities: transitional probabilities and rhythmic units.


2020 ◽  
Vol 9 (8) ◽  
pp. 486 ◽  
Author(s):  
Aleksandar Milosavljević

The proliferation of high-resolution remote sensing sensors and platforms imposes the need for effective analyses and automated processing of high volumes of aerial imagery. The recent advance of artificial intelligence (AI) in the form of deep learning (DL) and convolutional neural networks (CNN) has shown remarkable results in several image-related tasks and has naturally gained the focus of the remote sensing community. In this paper, we focus on specifying a processing pipeline that relies on existing state-of-the-art DL segmentation models to automate building footprint extraction. The proposed pipeline is organized in three stages: image preparation, model implementation and training, and predictions fusion. For the first and third stages, we introduce several techniques that leverage remote sensing imagery specifics, while for the selection of the segmentation model, we rely on empirical examination. We present and discuss several experiments conducted on the Inria Aerial Image Labeling Dataset. Our findings confirm that automatic processing of remote sensing imagery using DL semantic segmentation is possible and can provide applicable results. The proposed pipeline can potentially be transferred to any other remote sensing imagery segmentation task if a corresponding dataset is available.


2013 ◽  
Vol 411-414 ◽  
pp. 308-312
Author(s):  
Hong Zhi Yu ◽  
Jin Xi Zhang ◽  
Guang Rong Shan ◽  
Ning Ma

Text normalization, word segmentation, division into basic concatenation units for rhythm analysis, and pronunciation conversion are important components of the front-end text processing modules of a speech synthesis system. Based on the linguistic and phonetic characteristics of Lhasa Tibetan, we propose an implementation of a text analysis module for Tibetan speech synthesis that analyzes and describes the language-layer information of Lhasa Tibetan and maps it to the speech layer. This study lays a solid foundation for further work on Tibetan speech synthesis systems.


2021 ◽  
Vol 102 ◽  
pp. 01003
Author(s):  
Katsunori Kotani ◽  
Takehiko Yoshimi

This study analyzes the extent to which dictation performance and linguistic features (the linguistic difficulty of sentences during dictation) can predict general proficiency in learners of English as a second language (ESL). To this end, this study constructed multiple linear and non-linear regression models that predict general ESL proficiency (in which the independent variables were the dictation performance scores and the linguistic features of sentences) and verified the correlation between the predicted and observed general ESL proficiencies. The results showed that general ESL proficiency could be predicted by dictation performance and linguistic features. Furthermore, the results indicated significant effects of dictation accuracy, sentence length, and mean word length.


2021 ◽  
Vol 1 (1) ◽  
pp. 50-52
Author(s):  
Bo Dong ◽  
Wenhai Wang ◽  
Jinpeng Li

We present our solutions to all three MedAI tasks: the polyp segmentation task, the instrument segmentation task, and the transparency task. We use the same framework to process the two segmentation tasks of polyps and instruments. The key improvement over last year is the use of new state-of-the-art vision architectures, especially transformers, which significantly outperform ConvNets on medical image segmentation tasks. Our solution consists of multiple segmentation models, each of which uses a transformer as the backbone network. After submission, we achieved a best IoU score of 0.915 on the instrument segmentation task and 0.836 on the polyp segmentation task. Meanwhile, we provide complete solutions at https://github.com/dongbo811/MedAI-2021.
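For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an IoU (intersection over union) score can be computed for a pair of binary masks. It is not taken from the authors' repository: the function name `iou`, the nested-list mask representation, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Illustrative helper, not the MedAI evaluation code.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 2x3 masks (made up for the example, not real data).
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]

print(iou(pred, target))  # 2 shared pixels / 4 pixels in the union = 0.5
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.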


Author(s):  
Jaime Leung

This study looks at the mechanisms behind how people learn the words of a new language. Syllables that occur within words have a higher chance of occurring together than syllables that span word boundaries. Both infants and adults use these transitional probabilities to extract the words in a language. However, previous research has examined speech segmentation when learners are presented with speech alone. In natural contexts, we look while we listen, and what we see is correlated with what we hear. The goal of my study was to explore how visual context affects adult speech segmentation. To do so, we used three conditions: one where adults were presented with only a word stream, one where adults saw animations that corresponded to the words they heard while listening, and one where the animations that the adults saw did not correspond to the words they heard. One hypothesis is that participants in the audio-visual conditions perform better at the segmentation task because the statistical boundaries in the audio are reinforced by the visual boundaries between animations. However, it is also possible that the visual information impairs performance because learners engage in learning the meanings of words in addition to speech segmentation. Preliminary results support the latter hypothesis.
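The statistical cue described above can be illustrated with a short sketch. The forward transitional probability of a syllable pair is the number of times the pair occurs divided by the number of times its first syllable occurs with a successor. The code below estimates these probabilities from a syllable stream and places a word boundary wherever the probability falls below a threshold. The three nonsense words, the stream order, the function names, and the fixed threshold of 0.75 are all hypothetical choices for illustration; they are not the study's materials or method, and other boundary rules (such as local minima) are also possible.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next | current) for each adjacent syllable pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, threshold):
    """Insert a word boundary wherever the transitional probability
    between two adjacent syllables is below `threshold` (assumed rule)."""
    if not syllables:
        return []
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))  # low TP: close the current word
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical three-syllable nonsense words, concatenated without pauses.
order = ["bidaku", "padoti", "golabu", "bidaku", "golabu", "padoti", "bidaku"]
stream = [w[i:i + 2] for w in order for i in range(0, 6, 2)]

tps = transitional_probabilities(stream)
print(tps[("bi", "da")])  # within-word pair: 1.0
print(tps[("ku", "pa")])  # pair spanning a word boundary: 0.5
print(segment(stream, threshold=0.75))
```

In this toy stream every within-word pair has a transitional probability of 1.0 and every pair spanning a boundary has 0.5, so any threshold between the two recovers the original words.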


2008 ◽  
Vol 36 (7) ◽  
pp. 1299-1305 ◽  
Author(s):  
P. PERRUCHET ◽  
S. DESAULTY
