content word Latest Research Papers

Pitch Accent Distribution and Focus Structure in Taifi Arabic: A Production Study

Prosodic encoding of focus in Taifi Arabic is not yet fully understood. A recent production study found significant acoustic differences between syntactically identical sentences with information focus, with contrastive focus, and without focus. This paper presents results from a production experiment investigating whether information focus and contrastive focus have prosodic effects on pitch-accent distribution. Using question-answer paradigms, 16 native speakers of Taifi Arabic were asked to read three target sentences in different focus conditions. Results reveal that every content word is pitch-accented in utterances with and without focus. Post-focus words are deaccented in relatively few cases (23.12%). The largest percentage of deaccentuation was observed in utterances with initial contrastive focus. The results show that focus structures in Taifi Arabic exhibit both deaccentuation and post-focus compression. Therefore, the prosodic realization of focus in Taifi Arabic differs from that in other Arabic dialects such as Egyptian and Lebanese Arabic. These findings have important implications for both prosodic typology and focus typology.

How analysis of mobile app reviews problematises linguistic approaches to internet troll detection

Humanities and Social Sciences Communications ◽

10.1057/s41599-021-00968-7 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Sergei Monakhov

Keyword(s):

Online Communication ◽

Automatic Detection ◽

Content Word ◽

Mobile App ◽

Experimental Setting ◽

Experimental Conditions ◽

Individual Contributions ◽

Time Distance ◽

Linguistic Approaches

Abstract: State-sponsored internet trolls repeat themselves in a unique way. They have a small number of messages to convey but they have to do it multiple times. Understandably, they are afraid of being repetitive because that will inevitably lead to their identification as trolls. Hence, their only possible strategy is to keep diluting their target message with ever-changing filler words. That is exactly what makes them so susceptible to automatic detection. One serious challenge to this promising approach is that the same troll-like effect may arise from collaborative repatterning that is not indicative of any malevolent practices in online communication. The current study addresses this issue by analysing more than 180,000 app reviews written in English and Russian and verifying the obtained results in an experimental setting where participants were asked to describe the same picture in two experimental conditions. The main finding of the study is that both observational and experimental samples became less troll-like as the time distance between their elements increased. Their ‘troll coefficient’, calculated as the ratio of the proportion of repeated content words among all content words to the proportion of repeated content word pairs among all content word pairs, was found to be a function of the time distance between separate individual contributions. These findings render the task of developing efficient linguistic algorithms for internet troll detection more complicated. However, the problem can be alleviated by our ability to predict what the value of the troll coefficient of a certain group of texts would be if it depended solely on these texts’ creation time.
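
The ratio described in the abstract can be illustrated with a minimal Python sketch. Everything beyond the ratio itself is an assumption made for illustration, not the paper's procedure: the tiny `FUNCTION_WORDS` list stands in for proper part-of-speech filtering, a "pair" is taken to be two adjacent content words within one text, and an item counts as "repeated" when it occurs more than once across the pooled group of texts.

```python
from collections import Counter
from itertools import chain

# Hypothetical, deliberately tiny function-word list (an assumption);
# a real analysis would identify content words with a POS tagger.
FUNCTION_WORDS = {"a", "an", "the", "is", "it", "and", "to", "of", "i", "this", "very"}


def content_words(text):
    """Lowercase the text, strip edge punctuation, and drop function words."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    return [t for t in tokens if t and t not in FUNCTION_WORDS]


def repeated_share(items):
    """Share of items whose value occurs more than once in the pooled list."""
    if not items:
        return 0.0
    counts = Counter(items)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(items)


def troll_coefficient(texts):
    """Ratio of the repeated-word share to the repeated-pair share.

    Returns None when no content word pair is repeated, because the
    ratio is undefined in that case.
    """
    per_text = [content_words(t) for t in texts]
    words = list(chain.from_iterable(per_text))
    # Assumption: pairs are adjacent content words inside a single text.
    pairs = list(chain.from_iterable(zip(ws, ws[1:]) for ws in per_text))
    pair_share = repeated_share(pairs)
    if pair_share == 0:
        return None
    return repeated_share(words) / pair_share


reviews = ["great app love it", "love this great app"]
print(troll_coefficient(reviews))
```

Under these assumptions, both reviews reuse the same three content words but only one of the four adjacent pairs recurs, so the word share is 1.0, the pair share is 0.5, and the ratio is 2.0. A larger value means that words recur more than their pairings do.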

Some puzzling findings regarding the acquisition of verbs

10.31234/osf.io/xw9e7 ◽

2021 ◽

Author(s):

Joshua K. Hartshorne ◽

Yujing Huang ◽

Lauren Skorb

Keyword(s):

Language Learning ◽

Empirical Investigation ◽

Content Word ◽

Causal Agent ◽

Input Frequency ◽

Prior Work ◽

Function Words ◽

Cast Doubt ◽

Type Frequency ◽

Noun Bias

On the whole, children acquire frequent words earlier than less frequent words. However, there are other factors at play, such as an early "noun bias" (relative to input frequency, toddlers learn nouns faster than verbs) and a "content-word bias" (content words are acquired disproportionately to function words). This paper follows up reports of a puzzling phenomenon within verb-learning, where "experiencer-object" emotion verbs (A frightened/angered/delighted B) are lower frequency but learned earlier than "experiencer-subject" emotion verbs (A feared/hated/loved B). In addition to the possibility that the aforementioned results are a fluke or due to some confound, prior work has suggested several possible explanations: experiencer-object ("frighten-type") verbs have higher type frequency, encode a causal agent as the sentential subject, and perhaps describe a more salient perspective on the described event. In three experiments, we cast doubt on all three possible explanations. The first experiment replicates and extends the prior findings regarding emotion verbs, ruling out several possible confounds and concerns. The second and third experiments investigate acquisition of chase/flee verbs and give/get verbs, which reveal surprising findings that are not explained by the aforementioned hypotheses. We conclude that these findings indicate a significant hole in our theories of language learning, and that the path forward likely requires a great deal more empirical investigation of the order of acquisition of verbs.

A Collocation Inventory for Beginners

10.26686/wgtn.16945729 ◽

2021 ◽

Author(s):

◽

Dongkwang Shin

Keyword(s):

High Frequency ◽

Teaching And Learning ◽

Incidental Learning ◽

Course Design ◽

Content Word ◽

Written Language ◽

Spoken English ◽

Balanced Approach ◽

Teaching Learning ◽

The Right

This study has two goals: (1) to see what criteria are needed to define collocations and (2) to make a list of the high frequency collocations of spoken English that would be useful for guiding teaching, learning and course design. The existing criteria for defining collocations are generally not well defined and have not been applied consistently. Wray and Perkins (2000) identify more than forty terms used for designating multi-word units. To avoid this confusion, three criteria are strictly applied: frequent co-occurrence, grammatical well-formedness and predictability in L1. The ten million word British National Corpus (BNC) spoken corpus is used as the data source, and the 1,000 most frequent spoken word types from that corpus are all investigated as pivot words. It is found that the three criteria can be applied in a systematic way. The most striking finding is that there are a large number of collocations meeting the first two criteria and a large number of these would qualify for inclusion in the most frequent 2,000 words of English, if no distinction were made between single words and collocations. There are nine major findings in this study: (1) there is a very large number of grammatically well-formed high frequency collocations, (2) collocations occur in spoken language much more frequently than they occur in written language, (3) the more frequent the pivot word, the greater the number of collocates, (4) a small number of pivot words account for a very large proportion of the tokens of collocations, (5) adjectives tend to have more collocates than other content words, (6) the shorter the collocation, the greater the frequency, (7) content word plus content word collocations outnumber other patterns of content word collocations, (8) there are more collocates on the left than collocates on the right, but this difference is not striking, (9) a third of the 500 most frequent collocations of English did not have word for word equivalents in Korean (L1). A balanced approach is needed for the teaching and learning of collocations, employing opportunities for both deliberate and incidental learning, and giving appropriate attention to each of the four skills of listening, speaking, reading and writing.

Content Word Production during Discourse in Aphasia: Deficits in Word Quantity, Not Lexical–Semantic Complexity

Journal of Cognitive Neuroscience ◽

10.1162/jocn_a_01772 ◽

2021 ◽

pp. 1-18

Author(s):

Reem S. W. Alyahya ◽

Ajay D. Halai ◽

Paul Conroy ◽

Matthew A. Lambon Ralph

Keyword(s):

Full Range ◽

Content Word ◽

Word Production ◽

Connected Speech ◽

Lexical Semantic ◽

Reduced Word ◽

Discourse Production ◽

Wide Range ◽

Semantic Complexity ◽

Lesion Symptom Mapping

Abstract: Although limited and reduced connected speech production is one of the most prominent features of aphasia, if not the most prominent, few studies have examined the properties of content words produced during discourse in aphasia, in comparison to the many investigations of single-word production. In this study, we used a distributional analysis approach to investigate the properties of content word production during discourse by 46 participants spanning a wide range of chronic poststroke aphasia and 20 neurotypical adults, using different stimuli that elicited three discourse genres (descriptive, narrative, and procedural). Initially, we inspected the discourse data with respect to the quantity of production, lexical–semantic diversity, and psycholinguistic features (frequency and imageability) of content words. Subsequently, we created a “lexical–semantic landscape,” which is sensitive to subtle changes and allowed us to evaluate the pattern of changes in discourse production across groups. Relative to neurotypical adults, all persons with aphasia (both fluent and nonfluent) showed significant reduction in the quantity and diversity of production, but the lexical–semantic complexity of word production directly mirrored neurotypical performance. Specifically, persons with aphasia produced the same rate of nouns/verbs, and their discourse samples covered the full range of word frequency and imageability, albeit with reduced word quantity. These findings provide novel evidence that, unlike in other disorders (e.g., semantic dementia), discourse production in poststroke aphasia has relatively preserved lexical–semantic complexity but demonstrates significantly compromised quantity of content word production. Voxel-wise lesion-symptom mapping using both univariate and multivariate approaches revealed that left frontal regions, particularly the pars opercularis, IC, and central and frontal opercular cortices, support word retrieval during connected speech, irrespective of word class or lexical–semantic complexity.

An efficient, accurate and clinically-applicable index of content word fluency in Aphasia

Aphasiology ◽

10.1080/02687038.2021.1923946 ◽

2021 ◽

pp. 1-19

Author(s):

Reem S. W. Alyahya ◽

Paul Conroy ◽

Ajay D. Halai ◽

Matthew A. Lambon Ralph

Keyword(s):

Content Word ◽

Word Fluency

Most Frequently Used Vocabulary in Selected English Drama Movie Scripts

International Journal of Linguistics Literature & Translation ◽

10.32996/ijllt.2020.3.12.19 ◽

2020 ◽

Vol 3 (12) ◽

pp. 154-161

Author(s):

Sawitri Suwanaroa ◽

Sutarat Polerk

Keyword(s):

Motion Pictures ◽

Content Word ◽

The Other ◽

English Drama ◽

Total Frequency ◽

Part Of Speech ◽

Other Hand ◽

Program Software

Understanding words in motion pictures is a rewarding pathway for audiences of culture-based media and entertainment. The study aimed to investigate the most frequently used vocabulary in selected English drama movie scripts. It sought to find out the most frequently occurring part of speech and to probe the frequency of occurrence of each content word in the movie scripts. Five English drama movie scripts were chosen purposively as sources of the corpora. Using the AntConc software developed by Anthony, a total of 108 content words were noted and categorized for corpus analysis. There were 38 verbs, 28 adverbs, 24 nouns and 18 adjectives noted as commonly used vocabulary from the movie scripts. Among the verb types, have (366), can (276) and know (232) emerged as the most frequently occurring action words, with verbs totalling 3,142 occurrences. In the case of adverbs, here (237), out (178), and about (163) came out as the most frequently used words, with a total of 1,598 occurrences. Nouns, on the other hand, had 1,574 occurrences, with time (134), tree (133), and going (127) as the most frequently used name words. Adjectives were found to be the least frequently occurring content words, with 752 occurrences, including right (146), old (54), and only (53) as the top three frequently used noun modifiers. In conclusion, verbs outnumbered the other three content word classes because drama by nature involves its characters in action, and ‘have’ came out as the most frequently used verb because drama scenes imply possessions and obligations.

ConSenses: Disambiguating content word groups based on knowledge base and definition embedding

2020 International Conference on Technologies and Applications of Artificial Intelligence (TAAI) ◽

10.1109/taai51410.2020.00055 ◽

2020 ◽

Author(s):

Kai-Wen Tuan ◽

Yi-Chien Lin ◽

Jason S. Chang ◽

Kuan-Lin Lee ◽

Li-Kuang Chen

Keyword(s):

Knowledge Base ◽

Content Word

Can a bilingual lexicon be sustained by phonotactics alone?

The Mental Lexicon ◽

10.1075/ml.19024.lip ◽

2020 ◽

Vol 15 (2) ◽

pp. 330-365

Author(s):

John M. Lipski

Keyword(s):

Lexical Decision ◽

False Memory ◽

Close Relationships ◽

Research Effort ◽

Content Word ◽

content word
Recently Published Documents

Pitch Accent Distribution and Focus Structure in Taifi Arabic: A Production Study

How analysis of mobile app reviews problematises linguistic approaches to internet troll detection

Some puzzling findings regarding the acquisition of verbs

A Collocation Inventory for Beginners

A Collocation Inventory for Beginners

Content Word Production during Discourse in Aphasia: Deficits in Word Quantity, Not Lexical–Semantic Complexity

An efficient, accurate and clinically-applicable index of content word fluency in Aphasia

Most Frequently Used Vocabulary in Selected English Drama Movie Scripts

ConSenses: Disambiguating content word groups based on knowledge base and definition embedding

Can a bilingual lexicon be sustained by phonotactics alone?
