Contributions of phonetic token variability and word-type frequency to phonological representations

ABSTRACT: The experiments here build on the widely reported finding that children are most accurate when producing phonotactic sequences with high ambient-language frequency. What remains controversial is a description of the input that children must be tracking for this effect to arise. We present a series of experiments that compare two ambient-language properties, token and type frequency, as they contribute to phonotactic learning. Token frequency is the raw number of exposures children have to a particular pattern; type frequency refers to a count of abstract entities, such as unique words. Our results suggest that children's production accuracy is most sensitive to a combination of type and token frequency: children were able to generalize a target phonotactic sequence to a new word when familiarized with multiple word-types across tokens from multiple talkers, but not when presented with either word-types with no talker variability or multiple talker-tokens of a single word.
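
The token/type distinction defined in this abstract can be illustrated with a minimal Python sketch. The toy word list, the target sequence "st", and the function name are illustrative assumptions, not materials or methods from the study.

```python
def token_and_type_frequency(words, sequence):
    """Count occurrences of `sequence` in a list of word tokens.

    Token frequency: number of word tokens that contain the sequence.
    Type frequency: number of distinct words (types) that contain it.
    """
    matching = [w for w in words if sequence in w]
    token_freq = len(matching)
    type_freq = len(set(matching))
    return token_freq, type_freq


# Hypothetical toy input: each item stands for one heard token of a word.
corpus = ["stop", "stop", "stop", "star", "nest", "ball", "stop"]
tokens, types = token_and_type_frequency(corpus, "st")
print(tokens, types)  # 6 3
```

Here the sequence is heard six times (token frequency) but in only three different words (type frequency), so the two measures can diverge sharply for the same input.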

Word Type Frequency Alone Can Modulate Hemispheric Asymmetry in Visual Word Recognition: Evidence from Modeling Chinese Character Recognition

i-Perception ◽ 10.1068/ic343 ◽ 2011 ◽ Vol 2 (4) ◽ pp. 343-343

Author(s): Janet H. Hsiao ◽ Kit Cheung

Keyword(s): Word Recognition ◽ Visual Word Recognition ◽ Character Recognition ◽ Hemispheric Asymmetry ◽ Chinese Character ◽ Word Type ◽ Visual Word ◽ Chinese Character Recognition ◽ Type Frequency

Types and degrees of vowel neutrality

Linguistica ◽ 10.4312/linguistica.56.1.239-252 ◽ 2016 ◽ Vol 56 (1) ◽ pp. 239-252 ◽ Cited By ~ 1

Author(s): Péter Rebrus ◽ Miklós Törkenczy

Keyword(s): Word Type ◽ Explicit Measure ◽ Type Frequency

This paper argues that neutrality in a harmony system is a gradient property since it is due to a vowel’s participation in different patterns that are considered to be indicators of neutral behaviour in harmony. We examine three of these patterns of neutrality (transparency, affixal invariance and antiharmony) and show that a scale of neutrality can be defined on the basis of these patterns (their occurrence and variability) and the neutrality of harmony systems can be characterized with reference to this scale. We describe a tentative quantification of neutrality and then develop an explicit measure of neutrality based on relative word type frequency. This explicit measure is applied to the behaviour of neutral vowels in Hungarian front/back harmony where the different neutral vowels represent different degrees of neutrality in all three neutrality patterns.

Measuring analyticity and syntheticity in creoles

Journal of Pidgin and Creole Languages ◽ 10.1075/jpcl.29.1.02sie ◽ 2014 ◽ Vol 29 (1) ◽ pp. 49-85 ◽ Cited By ~ 12

Author(s): Jeff Siegel ◽ Benedikt Szmrecsanyi ◽ Bernd Kortmann

Keyword(s): Hong Kong ◽ American English ◽ The Other ◽ Word Class ◽ Language Varieties ◽ Token Frequency ◽ Quantitative Results ◽ Type Frequency ◽ Hong Kong English

Creoles (here including expanded pidgins) are commonly viewed as being more analytic than their lexifiers and other languages in terms of grammatical marking. The purpose of the study reported in this article was to examine the validity of this view by measuring the frequency of analytic (and synthetic) markers in corpora of two different English-lexified creoles — Tok Pisin and Hawai‘i Creole — and comparing the quantitative results with those for other language varieties. To measure token frequency, 1,000 randomly selected words in each creole corpus were tagged with regard to word class, and categorized as being analytic, synthetic, both analytic and synthetic, or purely lexical. On this basis, an Analyticity Index and a Syntheticity Index were calculated. These were first compared to indices for other languages and then to L1 varieties of English (e.g. standard British and American English and British dialects) and L2 varieties (e.g. Singapore English and Hong Kong English). Type frequency was determined by the size of the inventories of analytic and synthetic markers used in the corpora, and similar comparisons were made. The results show that in terms of both token and type frequency of grammatical markers, the creoles are not more analytic than the other varieties. However, they are significantly less synthetic, resulting in much higher ratios of analytic to synthetic marking. An explanation for this finding relates to the particular strategy for grammatical expansion used by individuals when the creoles were developing.
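
The tagging-and-counting procedure described in this abstract can be sketched in Python. This is only an illustration under stated assumptions: the tag labels, the toy sample, and the choice to express each index as marker tokens per 1,000 words are hypothetical, since the abstract does not give the exact formulas.

```python
def marking_indices(tags, per=1000):
    """Return (analyticity, syntheticity) as marker tokens per `per` words.

    `tags` holds one label per sampled word: "analytic", "synthetic",
    "both", or "lexical". A word tagged "both" counts toward both
    indices (an assumption made for this sketch).
    """
    n = len(tags)
    if n == 0:
        raise ValueError("empty sample")
    analytic = sum(t in ("analytic", "both") for t in tags)
    synthetic = sum(t in ("synthetic", "both") for t in tags)
    return analytic * per / n, synthetic * per / n


def inventory_size(markers):
    """Type frequency: the number of distinct markers observed."""
    return len(set(markers))


# Hypothetical ten-word sample standing in for the 1,000-word samples.
sample = ["analytic"] * 4 + ["synthetic"] + ["both"] + ["lexical"] * 4
a_index, s_index = marking_indices(sample)
print(a_index, s_index)  # 500.0 200.0
```

Under these assumptions the token-based indices come from counts in the tagged sample, while the type-based comparison only needs the set of distinct markers, which is what `inventory_size` returns.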

The locative alternation in verb-framed vs. satellite-framed languages

Studies in Language ◽ 10.1075/sl.38.4.08lew ◽ 2014 ◽ Vol 38 (4) ◽ pp. 864-895 ◽ Cited By ~ 9

Author(s): Wojciech Lewandowski

Keyword(s): Comparative Analysis ◽ Argument Structure ◽ Corpus Study ◽ Token Frequency ◽ Locative Alternation ◽ Change Of State ◽ Type Frequency ◽ Verb Meaning

I propose a comparative analysis of the locative alternation in Polish and Spanish. I adopt a constructional theory of argument structure (Goldberg (1995)), according to which the locative alternation is an epiphenomenon of the compatibility of a single verb meaning with two different constructions: the caused-motion construction and the causative + with adjunct construction. As claimed by Pinker (1989), a verb must specify a manner of motion from which a particular change of state can be obtained in order to be able to appear in both constructional schemas. However, I show through a corpus study that the compatibility between verbal and constructional meaning is further restricted by Talmy’s (1985, 1991, 2000) distinction between verb-framed and satellite-framed languages. In particular, Talmy’s lexicalization patterns theory systematically explains why both the token frequency and the type frequency of the alternating verbs are considerably higher in Polish than in Spanish.

Bilingual children's production of regular and irregular past tense morphology

Bilingualism Language and Cognition ◽ 10.1017/s1366728914000108 ◽ 2014 ◽ Vol 18 (2) ◽ pp. 290-303 ◽ Cited By ~ 8

Author(s): JUDITH RISPENS ◽ ELISE DE BREE

Keyword(s): Learning Strategy ◽ Bilingual Children ◽ Past Tense ◽ Token Frequency ◽ The Past ◽ Type Frequency ◽ The Difference ◽ Past Tenses

This study examined the production of the Dutch past tense in Dutch–Hebrew bilingual children and investigated the effect of type of past tense allomorph (de versus te) and token frequency on productions of the past tense. Seven-year-old bilingual children (n=11) were compared with monolingual children: age-matched (n=30) and younger vocabulary-matched (n=21). Accuracy of regular and novel past tense was similar for the bilingual and monolingual groups, but the former group was worse on irregular past tense than the age-matched monolingual peers. All three groups showed effects of type frequency: te past tenses were more accurate than de. The difference between the bilingual and monolingual children surfaces in the extent of the effect: for the bilingual children it was most pronounced in verbs with low token frequency and novel verbs. Results are interpreted as stemming from a learning strategy or from phonological transfer from the Hebrew morphosyntactic system.

A Corpus-based Case Study on the POS Tagging of Self-referential Lexemes in the Contemporary Chinese Dictionary

Theory and Practice in Language Studies ◽ 10.17507/tpls.1008.05 ◽ 2020 ◽ Vol 10 (8) ◽ pp. 879

Author(s): Jun Zhang ◽ Heng Zhang

Keyword(s): The Self ◽ Word Class ◽ Pos Tagging ◽ Token Frequency ◽ Categorization Theory ◽ Type Frequency ◽ Set Up ◽ Contemporary Chinese ◽ Grammatical Functions

The POS tagging in the 5th edition of the CCD has been revised in the 6th and the 7th editions. The noun POS of most sports and science lexemes are deleted, and their senses of noun (self-referential senses) are included into verbs. However, most of these lexemes can be used as nouns intuitively, and their noun POS and senses should exist. Based on the grammatical functions of words (Xv & Tang, 2006) and the two-level word class categorization theory (Wang, 2014), this study conducts a corpus-based case study of a science lexeme “guina”. The result shows that “guina” not only has self-referential usage, but has high token frequency, with 133 occurrences accounting for 42.8% of the total usages, and rich type frequency widely distributed in “guina + (of) + NP”, “NP + (of) + guina” and “VP + guina”, which conforms to the criterion of conventionalization. Therefore, it is necessary to tag the noun POS and to set up the self-referential sense for “guina”. This research has an implication for solving the POS tagging problem of self-referential lexemes in the CCD.

ANGER metaphors in American English and Kabyle

International Journal of Language and Culture ◽ 10.1075/ijolc.3.2.04bel ◽ 2016 ◽ Vol 3 (2) ◽ pp. 216-252

Author(s): Sadia Belkhir

Keyword(s): Cognitive Linguistics ◽ American English ◽ Prototype Model ◽ Sociocultural Influences ◽ Token Frequency ◽ Language Variety ◽ Type Frequency ◽ Human Thinking ◽ And Control ◽ Domain Concept

The position standardly held in cognitive linguistics is that anger is an emotion concept that communicates about human thinking and which is instantiated in language in ways that are often metaphorically, systematically, and conceptually structured. The container metaphor is claimed to be near-universal (Kövecses 2000), but also subject to variation (Kövecses 2005). Variation in metaphor frequencies across languages has also been investigated (Boers & Demecheleer 1997; Boers 1999; Deignan 2003; Kövecses et al. 2015). This article reports a corpus-based contrastive investigation of anger metaphors in American English and Kabyle — a Tamazight language variety spoken in the northern part of Algeria. Its main objective is to contrast these metaphors and try to find out the most used ones in these languages through a qualitative and quantitative analysis of the token frequency of linguistic expressions belonging to each of the conceptual metaphors, the type frequency of their linguistic realizations, and the number of their mappings. Aspects of the anger scenario are also studied and contrasted. The findings indicate similarities and differences in the use of anger metaphors in the two languages. The three most frequently used metaphors in American English involve the container, possessed object and opponent source domains while the most frequently used ones in Kabyle involve the fire, container and possessed object source domains. These results confirm the near-universality of the container metaphor. However, the most frequently used metaphorical source domain concept is different in the two languages due to sociocultural influences. In addition, the findings relating to aspects of the anger scenario (intensity and control) support Lakoff and Kövecses’ (1987) prototype model of anger, although it is found to be influenced by sociocultural specificities in American English and Kabyle.

Polish children's productivity with case marking: the role of regularity, type frequency, and phonological diversity

Journal of Child Language ◽ 10.1017/s0305000906007471 ◽ 2006 ◽ Vol 33 (3) ◽ pp. 559-597 ◽ Cited By ~ 44

Author(s): EWA DĄBROWSKA ◽ MARCIN SZCZERBIŃSKI

Keyword(s): Age Groups ◽ The Other ◽ Older Children ◽ Mechanism Model ◽ Token Frequency ◽ Poor Predictor ◽ Overall Performance ◽ Type Frequency ◽ Dual Mechanism

57 Polish-speaking children aged from 2;4 to 4;8 and 16 adult controls participated in a nonce-word inflection experiment testing their ability to use the genitive, dative and accusative inflections productively. Results show that this ability develops early: the majority of two-year-olds were already productive with all inflections apart from dative neuter; and the overall performance of the four-year-olds was very similar to that of adults. All age groups were more productive with inflections that apply to large and/or phonologically diverse classes, although class size and token frequency appeared to be more important for younger children (two- and three-year-olds) and phonological diversity for older children and adults. Regularity, on the other hand, was a very poor predictor of productivity. The results support usage-based models of language acquisition and are problematic for the dual mechanism model.

Caregiver speech and children's use of nouns versus verbs: A comparison of English, Italian, and Mandarin

Journal of Child Language ◽ 10.1017/s030500099700319x ◽ 1997 ◽ Vol 24 (3) ◽ pp. 535-565 ◽ Cited By ~ 154

Author(s): TWILA TARDIF ◽ MARILYN SHATZ ◽ LETITIA NAIGLES

Keyword(s): Morphological Variation ◽ Production Data ◽ Token Frequency ◽ Spontaneous Production ◽ English Speaking ◽ Type Frequency ◽ Child Speech

This paper examines naturalistic samples of adult-to-child speech to determine if variations in the input are consistent with reported variations in the proportions of nouns and verbs in children's early vocabularies. It contrasts two PRO-DROP languages, Italian and Mandarin, with English. Naturalistic speech samples from six 2;0 English-, six 1;11 Italian-, and ten 1;10 Mandarin-speaking children and their caregivers were examined. Adult-to-child speech was coded for the type frequency, token frequency, utterance position, and morphological variation of nouns and verbs as well as the types and placements of syntactic subjects and the pragmatic focus of adult questions. Children's spontaneous productions of nouns and verbs and their responses to adult questions were also examined. The results suggest a pattern consistent with the children's spontaneous production data. Namely, the speech of English-speaking caregivers emphasized nouns over verbs, whereas that of Mandarin-speaking caregivers emphasized verbs over nouns. The data from the Italian-speaking caregivers were more equivocal, though still noun-oriented, across these various input measures.

The impact of phonological neighborhood density on typical and atypical emerging lexicons

Journal of Child Language ◽ 10.1017/s030500091300010x ◽ 2013 ◽ Vol 41 (3) ◽ pp. 634-657 ◽ Cited By ~ 20

Author(s): STEPHANIE F. STOKES

Keyword(s): Statistical Learning ◽ Neighborhood Density ◽ Phonological Representations ◽ Phonological Neighborhood Density ◽ Learning Mechanism ◽ Late Talkers ◽ Ambient Language ◽ Current Article ◽ Phonological Neighborhood ◽ The Impact

ABSTRACT: According to the Extended Statistical Learning account (ExSL; Stokes, Kern & dos Santos, 2012), late talkers (LTs) continue to use neighborhood density (ND) as a cue for word learning when their peers no longer use a density learning mechanism. In the current article, LTs' expressive (active) lexicon ND values differed from those of their age-matched, but not language-matched, TD peers, a finding that provided support for the ExSL account. Stokes (2010) claimed that LTs had difficulty abstracting sparse words, but not dense ones, from the ambient language. If true, then LTs' receptive (passive) as well as active lexicons should be comprised of words of high ND. However, in the current research only active lexicons were of high ND. LTs' expressive lexicons may be small not because of an abstraction deficit, but because they are unable to develop sufficiently strong phonological representations to support word production.