lexical statistics
Recently Published Documents



2021 ◽  
Vol 150 (4) ◽  
pp. A150-A150
Author(s):  
Margaret Cychosz ◽  
Jan R. Edwards ◽  
Nan Bernstein Ratner ◽  
Catherine Torrington Eaton ◽  
Rochelle S. Newman


Author(s):  
Pierina Cheung

Languages differ in how they refer to countable objects. In number-marking languages such as English, countable individuals are often labelled by count nouns (e.g. two hippos), but languages such as Japanese lack mass–count syntax and require classifiers when referring to entities (e.g. two CL hippo). These cross-linguistic differences have led some to propose that linguistic structure affects how speakers construe entities in the world. Shown an ambiguous entity, English speakers tend to construe it as an object kind, whereas Japanese speakers construe it as a substance kind. However, recent studies show that these differences are likely due to lexical statistics, with English speakers drawing on the distribution of mass–count nouns to infer that the ambiguous entity is likely an object kind. Developmental studies further show that infants can distinguish between objects and substances. Together, recent studies suggest that the language we speak does not affect how we construe entities.



Phonology ◽  
2021 ◽  
Vol 38 (2) ◽  
pp. 241-275
Author(s):  
Shuxiao Gong ◽  
Jie Zhang

This paper investigates the nature of native Mandarin Chinese speakers’ phonotactic knowledge via an experimental study and formal modelling of the experimental results. Results from a phonological well-formedness judgement experiment suggest that Mandarin speakers’ phonotactic knowledge is sensitive not only to lexical statistics, but also to grammatical principles such as systematic and accidental phonotactic constraints, allophonic restrictions and segment–tone co-occurrence restrictions. We employ the UCLA Phonotactic Learner to model Mandarin speakers’ phonotactic knowledge, and compare the model's well-formedness predictions with speakers’ judgements. The disparity between the model's predictions and the well-formedness ratings from the experiment indicates that grammatical principles and the lexicon are still not sufficient to explain all of the variations in the speakers’ judgements. We argue that multiple biases, such as naturalness bias, allophony bias and suprasegmental bias, are effective during phonotactic learning.



2021 ◽  
Author(s):  
Guilherme Duarte Garcia

In weight-sensitive languages, stress is influenced by syllable weight. As a result, heavy syllables should attract, not repel, stress. The Portuguese lexicon, however, presents a case where weight seems to negatively impact stress: antepenultimate stress is more frequent in light antepenultimate syllables than in heavy ones. This pattern is phonologically unexpected, and appears to contradict the typology of weight and stress: it is a case where lexical statistics and the grammar conflict. Portuguese also contains gradient, not categorical, weight effects, which weaken as we move away from the right edge of the word. In this paper, I examine how native speakers’ grammars capture these subtle weight effects, and whether the negative antepenultimate weight effect is learned or repaired. I show that speakers learn the gradient weight effects in the language, but do not learn the unnatural negative effect. Instead, speakers repair this pattern, and generalize a positive weight effect to all syllables in the stress domain. This study thus provides empirical evidence that speakers may not only ignore unnatural patterns, but also learn the opposite pattern.



Author(s):  
Shuxiao Gong ◽  
Jie Zhang

Syllable well-formedness judgment experiments reveal that speakers exhibit gradient judgment on novel words, and the gradience has been attributed to both grammatical factors and lexical statistics (e.g., Coetzee, 2008). This study investigates gradient phonotactics stemming from the violations of four types of grammatical constraints in Mandarin Chinese: 1) principled phonotactic constraints, 2) accidental phonotactic constraints, 3) allophonic restrictions, and 4) segmental-tonal cooccurrence restrictions. A syllable well-formedness judgment experiment was conducted with native Mandarin speakers to examine how the grammatical and lexical statistics factors contribute to the variation in phonotactic acceptability judgment.



Language ◽  
2019 ◽  
Vol 95 (4) ◽  
pp. 612-641 ◽  
Author(s):  
Guilherme Duarte Garcia


10.29007/2xzw ◽  
2018 ◽  
Author(s):  
Danilo S. Carvalho ◽  
Vu Tran ◽  
Khanh Van Tran ◽  
Nguyen Le Minh

Legal professionals worldwide are trying to keep pace with the explosive growth in legal documents available through digital means. This drives a need for high-efficiency Legal Information Retrieval (IR) and Question Answering (QA) methods. The IR task in particular has a set of unique challenges that invite the use of semantically motivated NLP techniques. In this work, a two-stage method for Legal Information Retrieval is proposed, combining lexical statistics and distributional sentence representations in the context of the Competition on Legal Information Extraction/Entailment (COLIEE). The combination is done with the use of disambiguation rules, applied over the rankings obtained through n-gram statistics. After the ranking is done, its results are evaluated for ambiguity, and disambiguation is applied if a result is judged unreliable for a given query. Competition and experimental results indicate small gains in overall retrieval performance using the proposed approach. Additionally, an analysis of error and improvement cases is presented for a better understanding of the contributions.
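The two-stage idea in this abstract (rank by n-gram statistics, then disambiguate only when the ranking looks unreliable) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names, the cosine scoring over unigram and bigram counts, the score-margin test for ambiguity, and the averaged toy word vectors standing in for distributional sentence representations are all hypothetical.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text, max_n=2):
    """Count word n-grams (n = 1..max_n) in a whitespace-tokenised text."""
    tokens = text.lower().split()
    profile = Counter()
    for n in range(1, max_n + 1):
        profile.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_vector(text, word_vectors):
    """Average the vectors of known words (a stand-in sentence representation)."""
    known = [word_vectors[t] for t in text.lower().split() if t in word_vectors]
    if not known:
        return {}
    dim = len(known[0])
    return {i: sum(vec[i] for vec in known) / len(known) for i in range(dim)}


def retrieve(query, docs, word_vectors, margin=0.1, top_k=3):
    """Stage 1: n-gram ranking. Stage 2: re-rank the head only if ambiguous."""
    q_profile = ngram_profile(query)
    scored = sorted(
        ((cosine(q_profile, ngram_profile(d)), d) for d in docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    ranked = [d for _, d in scored]
    # Assumed ambiguity rule: the top two lexical scores are too close to trust.
    if len(scored) < 2 or scored[0][0] - scored[1][0] >= margin:
        return ranked
    q_vec = sentence_vector(query, word_vectors)
    head = sorted(
        ranked[:top_k],
        key=lambda d: cosine(q_vec, sentence_vector(d, word_vectors)),
        reverse=True,
    )
    return head + ranked[top_k:]


# Toy two-dimensional vectors, invented for this example only.
TOY_VECTORS = {
    "tenant": [1.0, 0.0], "lessee": [1.0, 0.0], "rent": [0.9, 0.1],
    "tax": [0.0, 1.0], "pay": [0.5, 0.5],
}
DOCS = ["citizen must pay tax", "tenant must pay rent"]

print(retrieve("lessee must pay", DOCS, TOY_VECTORS)[0])
```

In this toy run the two documents tie on n-gram overlap with the query, so the second stage decides between them; a query with a clear lexical winner skips the second stage entirely.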



Phonology ◽  
2017 ◽  
Vol 34 (2) ◽  
pp. 269-298 ◽  
Author(s):  
Gaja Jarosz

Behavioural findings indicate that English, Mandarin and Korean speakers exhibit gradient sonority sequencing preferences among unattested initial clusters. While some have argued these results support an innate principle, recent modelling studies have questioned this conclusion, showing that computational models capable of inducing generalisations using abstract phonological features can detect these preferences from lexical statistics in the three languages. This paper presents a computational analysis of the development of initial clusters in Polish, which arguably presents a stronger test of these models. We show that (i) the statistics of Polish contradict the Sonority Sequencing Principle (SSP), favouring sonority plateaus, (ii) models that succeed in the other languages do not predict SSP preferences for Polish and (iii) children nonetheless exhibit sensitivity to the SSP, favouring onset clusters with larger sonority rises.



Author(s):  
Gaja Jarosz ◽  
Amanda Rysling

A growing body of behavioral results demonstrates cross-linguistic sensitivity to the SSP (Daland et al. 2011; Berent et al. 2007; Berent et al. 2008; Jarosz to appear; Ren, Gao & Morgan 2010). These consistent findings suggest a role for prior bias in phonological learning, but recent modeling studies question this conclusion, showing that for some languages these preferences can be derived from the input (Daland et al. 2011; Hayes 2011). Building on these results and Jarosz’s (to appear) developmental findings for Polish, the present paper investigates adult Polish speakers’ sensitivity to the SSP experimentally and computationally. We report the results of an online acceptability judgment experiment focusing on initial clusters and present the results of computational simulations evaluating the ability of phonotactic models to predict participants’ ratings on the basis of the lexical statistics of Polish. Our main findings are that 1) SSP is predictive of adults’ ratings, 2) sonority projection arises in both attested and unattested clusters, 3) while phonotactic models have significant predictive value, they do not subsume the SSP preferences observed in the participants’ ratings, and 4) participants’ sonority sequencing preferences are not entirely compatible with the SSP, suggesting a combined effect of prior bias and experience.
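To make concrete what it means for a phonotactic model to score forms "on the basis of the lexical statistics" of a language, here is a minimal sketch of a segment-bigram model with add-one smoothing. This is a deliberately simple baseline chosen for illustration, not one of the models evaluated in the paper; the toy lexicon, the function names, and the one-character-per-segment encoding are all assumptions.

```python
from collections import Counter
from math import log

BOUNDARY = "#"  # word-edge symbol, so initial and final positions are modelled


def train_bigrams(lexicon):
    """Collect segment-bigram counts from a list of words (one char = one segment)."""
    bigram_counts = Counter()
    context_counts = Counter()
    symbols = {BOUNDARY}
    for word in lexicon:
        segs = [BOUNDARY, *word, BOUNDARY]
        symbols.update(segs)
        for left, right in zip(segs, segs[1:]):
            bigram_counts[(left, right)] += 1
            context_counts[left] += 1
    return bigram_counts, context_counts, len(symbols)


def log_prob(word, model):
    """Add-one smoothed log probability of a word under the bigram model."""
    bigram_counts, context_counts, vocab_size = model
    segs = [BOUNDARY, *word, BOUNDARY]
    return sum(
        log((bigram_counts[(left, right)] + 1) / (context_counts[left] + vocab_size))
        for left, right in zip(segs, segs[1:])
    )


# Hypothetical toy lexicon; a real study would use a phonemically transcribed corpus.
model = train_bigrams(["bla", "bli", "pla", "tra"])
for form in ["bla", "bra", "rba"]:
    print(form, round(log_prob(form, model), 3))
```

A model of this kind can only reflect the frequencies of the segment sequences in its training lexicon, which is why a language whose lexical statistics run against the SSP makes a useful test: any SSP preference in speakers' ratings cannot come from such counts alone.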


