word boundaries
Recently Published Documents

TOTAL DOCUMENTS: 174 (five years: 27)
H-INDEX: 20 (five years: 1)

2021 ◽  
Author(s):  
Georgia Loukatou ◽  
Sabine Stoll ◽  
Damián Ezequiel Blasi ◽  
Alejandrina Cristia

How can infants detect where words or morphemes start and end in the continuous stream of speech? Previous computational studies have investigated this question mainly for English, where morpheme and word boundaries are often isomorphic. Yet in many languages, words are often multimorphemic, such that word and morpheme boundaries do not align. Our study employed corpora of two languages that differ in the complexity of inflectional morphology, Chintang (Sino-Tibetan) and Japanese (Experiment 1), as well as corpora of artificial languages ranging in morphological complexity, as measured by the ratio and distribution of morphemes per word (Experiments 2 and 3). We used two baselines and three conceptually diverse word segmentation algorithms: two that rely purely on sublexical, distributional cues, and one that builds a lexicon. The algorithms’ performance was evaluated on both word- and morpheme-level representations of the corpora. Segmentation results were better for the morphologically simpler languages than for the morphologically more complex ones, in line with the hypothesis that languages with greater inflectional complexity could be more difficult to segment into words. We further show that the effect of morphological complexity is relatively small compared to that of algorithm and evaluation level. We therefore recommend that infant researchers look for signatures of the different segmentation algorithms and strategies before looking for differences in infant segmentation landmarks across languages varying in complexity.
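The abstract does not name its implementations. As an illustration only, one classic purely sublexical, distributional-cue strategy of the kind the study evaluates is transitional-probability segmentation; the minimal sketch below (toy syllable corpus and threshold are invented) posits a word boundary wherever the forward transitional probability between adjacent syllables drops below a cutoff:

```python
from collections import Counter

def segment_by_tp(utterances, threshold=0.5):
    """Place a word boundary wherever the forward transitional
    probability P(next | current) between adjacent syllables falls
    below `threshold`. A toy stand-in for sublexical,
    distributional-cue segmentation; not the study's actual code."""
    unigrams, bigrams = Counter(), Counter()
    for utt in utterances:
        unigrams.update(utt)
        bigrams.update(zip(utt, utt[1:]))
    segmented = []
    for utt in utterances:
        words, current = [], [utt[0]]
        for a, b in zip(utt, utt[1:]):
            tp = bigrams[(a, b)] / unigrams[a]
            if tp < threshold:  # low predictability -> word boundary
                words.append(current)
                current = []
            current.append(b)
        words.append(current)
        segmented.append(words)
    return segmented

# Invented toy corpus: utterances as syllable lists, where
# "ba-da" and "ku-pi" recur as cohesive units.
corpus = [["ba", "da", "ku", "pi"],
          ["ku", "pi", "ba", "da"],
          ["ba", "da", "ba", "da"]]
print(segment_by_tp(corpus))
```

On this toy corpus the within-word transitions (ba→da, ku→pi) have high transitional probability and survive, while low-probability transitions are split, so the first utterance comes out as [["ba", "da"], ["ku", "pi"]].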


2021 ◽  
Vol 12 ◽  
Author(s):  
Gaisha Oralova ◽  
Victor Kuperman

Given that Chinese writing conventions lack inter-word spacing, understanding whether and how readers of Chinese segment regular unspaced Chinese writing into words is an important question for theories of reading. This study examined the processing consequences of introducing spaces into written Chinese sentences at varying positions based on native-speaker consensus. The consensus measure for every character transition in our stimulus sentences was the percentage of raters who placed a word boundary at that position. The eye movements of native readers of Chinese were recorded while they silently read original unspaced sentences and their experimentally manipulated counterparts for comprehension. We introduced two types of spaced sentences: one with spaces inserted at every probable word boundary (heavily spaced), and another with spaces placed only at highly probable word boundaries (lightly spaced). Linear mixed-effects regression models showed that heavily spaced sentences took the same time to read as unspaced ones, despite shortened fixation times on individual words (Experiment 1). In contrast, reading times for lightly spaced sentences and words were shorter than those for unspaced ones (Experiment 2). Thus, spaces proved advantageous, but only when introduced at highly probable word boundaries. We discuss the methodological and theoretical implications of these findings.
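The consensus measure described here (the fraction of raters placing a boundary at each character transition) is straightforward to compute; the sketch below is a hypothetical illustration, with invented rater data rather than the study's materials:

```python
def boundary_consensus(ratings):
    """Given each rater's set of boundary positions (indices of
    character transitions) for one sentence, return, per transition,
    the fraction of raters who placed a word boundary there."""
    n_raters = len(ratings)
    n_transitions = max(max(r) for r in ratings if r) + 1
    return [sum(pos in r for r in ratings) / n_raters
            for pos in range(n_transitions)]

# Three hypothetical raters segmenting a 5-character sentence
# (4 character transitions, indexed 0..3).
raters = [{1, 3}, {1}, {1, 3}]
consensus = boundary_consensus(raters)
print(consensus)  # prints [0.0, 1.0, 0.0, 0.6666666666666666]

# The two spacing conditions then correspond to two thresholds on
# this measure: every probable boundary vs. only highly probable ones.
heavily = [p for p, c in enumerate(consensus) if c > 0.0]
lightly = [p for p, c in enumerate(consensus) if c >= 0.9]
```

Under these invented thresholds, `heavily` keeps transitions 1 and 3, while `lightly` keeps only transition 1, mirroring the heavily vs. lightly spaced manipulation.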


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Cheng Feng

This paper proposes a segmented, combined English text measurement method based on two sets of orthogonal linear image sensors and one area image sensor. The method combines the advantages of linear and area image sensors for long- and short-distance English text measurement and can continuously perform high-precision English text tracking over a large range of viewing distances. Based on this method, a segmented English text measurement system is designed and built. The paper presents a method for extracting English word boundaries based on semantic segmentation, to solve the problem of global positioning and horizontal initialization of English reading text. A semantic segmentation method based on fully convolutional networks (FCN) is analyzed and the target classification is defined. The classic FCN framework and model, fine-tuned with manually annotated data, achieved good segmentation results. For the definition and extraction of English word boundaries, a piecewise linear model measures the projection confidence of each word boundary point, yielding an overall observation of the word boundary. When the observation confidence is high enough, horizontal positioning is obtained by weighted matching against the word boundaries marked in the high-precision image. The paper concludes that English reading software can help learners learn English to a certain extent, and that such software is an effective supplement to blended-learning classrooms. Based on an analysis of learners and teaching content, an English teaching model built on blended learning with English reading software is designed. Experiments show that English reading software helps learners learn English, both expanding their vocabulary and broadening their horizons.


Author(s):  
Sabine Zerbian ◽  
Frank Kügler

The article analyses violations of the Obligatory Contour Principle (OCP) above the word level in Tswana, a Southern Bantu language, by investigating the realization of adjacent lexical high tones across word boundaries. The results show that across word boundaries, downstep (i.e. a lowering of the second in a series of adjacent high tones) only takes place within a phonological phrase. A phonological phrase break blocks downstep, even when the necessary tonal configuration is met. A phrase-based account is adopted to explain the occurrence of downstep. Our study confirms a pattern previously reported for the closely related language Southern Sotho and provides controlled, empirical data from Tswana, based on read speech from twelve speakers, analysed auditorily by two annotators as well as acoustically.


Author(s):  
Jack Isaac Rabinovitch

Through a corpus of five pre-Qin (before 221 BCE) texts, this paper argues that the authors of both prose and poetry in Classical Chinese were sensitive to OCP violations across word boundaries, and changed diction and used marked word order to avoid creating pseudogeminates across words. The frequencies of bigrams that result in pseudogeminates are compared to the predicted frequency of pseudogeminates across the corpus. The paper finds that pseudogeminates are significantly (p < 0.00001) rarer than expected under randomization. Furthermore, by analyzing these texts under multiple possible phonological reconstructions, the paper suggests that post-codas (segments which were present in Old Chinese but were elided during the process of tonogenesis between Old Chinese and Middle Chinese) were most likely still present in the Chinese of the texts’ writers. Evidence comes from the consistency of OCP avoidance across all tones of Chinese when post-codas are assumed, and the lack of such consistency when they are not.
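The comparison of observed to randomization-predicted pseudogeminate frequency can be illustrated with a small permutation test; the sketch below is a simplified stand-in for the paper's method, with syllables reduced to invented (onset, coda) pairs:

```python
import random

def pseudogeminate_count(syllables):
    """Count adjacent syllable pairs where the first syllable's coda
    matches the second's onset, i.e. a pseudogeminate forms across
    the word boundary. Syllables here are (onset, coda) tuples."""
    return sum(a[1] == b[0] for a, b in zip(syllables, syllables[1:]))

def permutation_p(syllables, n=5000, seed=0):
    """One-sided permutation test: the fraction of random orderings
    whose pseudogeminate count is <= the observed count. A small p
    means the attested text has fewer pseudogeminates than chance
    predicts, i.e. the authors plausibly avoided them."""
    rng = random.Random(seed)
    observed = pseudogeminate_count(syllables)
    shuffled = list(syllables)
    hits = 0
    for _ in range(n):
        rng.shuffle(shuffled)
        if pseudogeminate_count(shuffled) <= observed:
            hits += 1
    return observed, hits / n

# Invented toy "text" with no attested pseudogeminates, although
# several shuffled orderings would create them.
text = [("k", "t"), ("p", "n"), ("t", "k"), ("m", "ŋ"), ("n", "p")]
obs, p = permutation_p(text)
print(f"observed={obs}, p={p:.3f}")
```

A real analysis would operate over full phonological reconstructions and test each tone class separately, as the abstract describes; this sketch only shows the shape of the randomization comparison.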


2021 ◽  
Author(s):  
Catarina Realinho ◽  
Rita Gonçalves ◽  
Helena Moniz ◽  
Isabel Trancoso
