Word frequency counts

2018 ◽  
Vol 41 (2) ◽  
pp. 224-239
Author(s):  
Bartosz Brzoza

Abstract Lexical frequency is one of the major variables involved in language processing. It constitutes a cornerstone of psycholinguistic, corpus linguistic, and applied research. Linguists take frequency counts from corpora and have come to take them for granted. However, some researchers have argued that corpora may not always provide a comprehensive picture of how frequently lexical items appear in a language. In the present contribution I compare corpus frequency counts for English and Polish words to native speakers’ perception of frequency. The analysis shows that, while objective and subjective values are generally related, there is a disparity between measures for frequent Polish words. The relationship, though positive, is also not as strong as in previous studies. I suggest linking objective with subjective frequency measures in research.
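
As a rough illustration of what a corpus frequency count is (not the procedure used in the study above), the sketch below counts word tokens in a toy text and normalises the counts to occurrences per million words, a unit commonly used for corpus frequencies. The toy corpus, the crude tokenizer, and the function name `frequency_per_million` are assumptions made for this example.

```python
import re
from collections import Counter


def frequency_per_million(text):
    """Return {word: occurrences per million tokens} for a text (hypothetical helper)."""
    # Lowercase and keep runs of letters/apostrophes: a deliberately crude tokenizer.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Normalise raw counts so that corpora of different sizes can be compared.
    return {word: count / total * 1_000_000 for word, count in counts.items()}


# Toy corpus, invented for illustration only.
toy_corpus = "the cat sat on the mat and the dog sat too"
freqs = frequency_per_million(toy_corpus)
```

Counts obtained this way depend heavily on the corpus and the tokenizer, which is one reason they may diverge from speakers' subjective impressions of frequency.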

2017 ◽  
Vol 12 (2) ◽  
pp. 234-262 ◽  
Author(s):  
C. Sophia Rammell ◽  
Diana Van Lancker Sidtis ◽  
David B. Pisoni

Abstract Background: Formulaic expressions, including idioms and other fixed expressions, comprise a significant proportion of discourse. Although much has been written about this topic, controversy remains about their psychological status. An important claim about formulaic expressions, that they are known to native speakers, has seldom been directly demonstrated. This study tested the hypothesis that formulaic expressions are known and stored as whole unit mental representations by performing three perceptual experiments. Method: Listeners transcribed two kinds of spectrally-degraded spoken sentences, half formulaic, and half novel, newly created expressions, matched for grammar and length. Two familiarity ratings, usage and exposure, were obtained from listeners for each expression. Text frequency data for the stimuli and their constituent words were obtained using a spoken corpus. Results: Participants transcribed formulaic more successfully than literal utterances. Usage and familiarity ratings correlated with accuracy, but formulaic utterances with low ratings were also transcribed correctly. Phrase types differed significantly in text frequency, but word frequency counts did not differentiate the two kinds of expressions. Discussion: These studies provide new converging evidence that formulaic expressions are encoded and processed as whole units, supporting a dual-process model of language processing, which assumes that grammatical and formulaic expressions are differentially processed.


2001 ◽  
Vol 1 ◽  
pp. 7-28 ◽  
Author(s):  
Batia Laufer ◽  
Paul Nation

This paper examines the relationship between fluency and vocabulary size, and also between fluency and word frequency level. Fluency was operationalised as the time learners need to recognize meanings of words sampled from different frequency levels. It was measured by a computerised vocabulary recognition speed test (VORST). The test was given to 488 native and non-native speakers who were divided by vocabulary size into four groups. The four groups were compared on speed of response to the 3000 level and University Word List (UWL) words. Speed was also correlated with vocabulary size. Additionally, response times to different frequency levels were compared for each subject. Results suggest that speed of retrieval is moderately related to vocabulary size and word frequency. Non-native speakers’ increase in speed lags behind increase in vocabulary size. Non-native speakers also respond more slowly to less frequent words. Responses of native speakers, on the other hand, are more homogeneous across subjects and across vocabulary frequencies. Speed of retrieval cannot be fully predicted from vocabulary knowledge and therefore speed tests should supplement tests of vocabulary size and depth.


2021 ◽  
pp. 174702182110479
Author(s):  
Ferdy Hubers ◽  
Catia Cucchiarini ◽  
Helmer Strik ◽  
Ton Dijkstra

Idiom processing studies have paid considerable attention to the relationship between idiomatic expressions as a whole and their constituent words. Although most research focused on the semantic properties of the constituent words, their orthographic form could also play a role in processing. To test this, we assessed both form and meaning activation of individual words during the processing of opaque idioms. In two primed word naming experiments, Dutch native speakers silently read sentences word by word and then named the last word of the sentence. This target word was embedded in either an idiomatic or a literal context, and was either expected/correct in this context (COR), or semantically related (REL) or unrelated (UNREL) to the expected word. The correct target word in the idiomatic context was always part of an opaque idiom. Faster naming latencies for the idiom-final noun than for the unrelated target in the idiomatic context indicated that the idiom was activated as a whole during processing. In addition, semantic facilitation was observed in the literal context (COR<REL<UNREL), but not in the idiomatic context (COR<REL=UNREL). This is evidence that the idiom-final noun was not activated at the meaning level of representation. However, an inhibitory effect of orthographic word frequency of the idiom-final noun indicated that the idiom-final noun was activated at the form level. These results provide evidence in favor of a hybrid model of idiom processing in which the individual words and the idiom as a whole interact on form and meaning levels of representation.


1991 ◽  
Vol 9 (1) ◽  
pp. 09 ◽  
Author(s):  
Tracey M. Derwing

This study investigates the relationship of native speakers' (NSs) personality traits and experience interacting with non-native speakers (NNSs) to the use of conversational adjustments and differences in word frequency and speech rate. Eight ESL instructors and eight persons who had no regular contact with NNSs were asked to view a film, then tell a NS and a NNS partner its story. Transcripts of the subjects' film narratives to the listeners were examined for differences in word frequency, rate, and conversational adjustments. Although the ESL instructors used certain conversational adjustments significantly more with NNSs than did the inexperienced subjects, the two groups did not differ in terms of word frequency or rate. When subjects were grouped according to the personality traits of interpersonal affect and social participation, they did not differ in overall usage of conversational adjustments, but significant differences were found in both word frequency and speech rate.




2018 ◽  
Vol 14 (2) ◽  
Author(s):  
Sri Mahendra Putra Wirawan

Gross Regional Domestic Product (GRDP), which provides a comprehensive picture of the economic conditions of a region, is an indicator for analyzing regional economic development. Another indicator that is no less important is inflation, which shows the level of price increases caused by growth in the money supply. Assessing the success of development must also take into account the income inequality of the population, which is illustrated by the Gini ratio. One of the main goals of regional development is to improve the welfare of the people, and the level of community welfare can be seen, among other things, from the level of unemployment in an area. To that end, to obtain an overview of the effects of GRDP, inflation, and the Gini ratio on unemployment in DKI Jakarta over the last ten years (2007-2016), an analysis was carried out using multiple linear regression. Taken together, the relationship of GRDP, inflation, and the Gini ratio with unemployment is categorized as "very strong", with a score of 0.936, and has a significant influence on unemployment. Partially, GRDP has a significant influence, but inflation and the Gini ratio do not. GRDP, inflation, and the Gini ratio together contributed 81.4% to unemployment in DKI Jakarta over the last ten years, while the remaining 18.6% is influenced by other variables not included in this research model. To reduce unemployment in DKI Jakarta, programs oriented to economic growth, suppressing inflation, and decreasing the Gini ratio therefore need to be carried out simultaneously. Keywords: GRDP, inflation, unemployment, DKI Jakarta, Gini ratio


Interpreting ◽  
2017 ◽  
Vol 19 (1) ◽  
pp. 1-20 ◽  
Author(s):  
Ena Hodzik ◽  
John N. Williams

We report a study on prediction in shadowing and simultaneous interpreting (SI), both considered as forms of real-time, ‘online’ spoken language processing. The study comprised two experiments, focusing on: (i) shadowing of German head-final sentences by 20 advanced students of German, all native speakers of English; (ii) SI of the same sentences into English head-initial sentences by 22 advanced students of German, again native English speakers, and also by 11 trainee and practising interpreters. Latency times for input and production of the target verbs were measured. Drawing on studies of prediction in English-language reading production, we examined two cues to prediction in both experiments: contextual constraints (semantic cues in the context) and transitional probability (the statistical likelihood of words occurring together in the language concerned). While context affected prediction during both shadowing and SI, transitional probability appeared to favour prediction during shadowing but not during SI. This suggests that the two cues operate on different levels of language processing in SI.
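
Transitional probability, described above as the statistical likelihood of words occurring together, is often estimated from bigram counts as count(w1 w2) / count(w1). The sketch below uses that forward definition on an invented token list; the abstract does not say how the study computed its measure, so the formula choice, the function name `transitional_probability`, and the data are assumptions.

```python
from collections import Counter


def transitional_probability(tokens, first, second):
    """Estimate P(second | first) from adjacent word pairs (hypothetical helper)."""
    # Count every adjacent pair of tokens.
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Count how often each word appears as the first member of a pair.
    starts = Counter(tokens[:-1])
    if starts[first] == 0:
        return 0.0  # The word never starts a pair in this sample.
    return bigrams[(first, second)] / starts[first]


# Invented sample, for illustration only.
tokens = "the dog barked and the dog slept and the cat slept".split()
tp = transitional_probability(tokens, "the", "dog")
```

In this sample "the" is followed by "dog" in two of its three occurrences, so the estimate for that pair is higher than for "the" followed by "cat".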


2021 ◽  
Vol 11 (3) ◽  
pp. 359
Author(s):  
Katharina Hogrefe ◽  
Georg Goldenberg ◽  
Ralf Glindemann ◽  
Madleen Klonowski ◽  
Wolfram Ziegler

Assessment of semantic processing capacities often relies on verbal tasks which are, however, sensitive to impairments at several language processing levels. Especially for persons with aphasia there is a strong need for a tool that measures semantic processing skills independent of verbal abilities. Furthermore, in order to assess a patient’s potential for using alternative means of communication in cases of severe aphasia, semantic processing should be assessed in different nonverbal conditions. The Nonverbal Semantics Test (NVST) is a tool that captures semantic processing capacities through three tasks—Semantic Sorting, Drawing, and Pantomime. The main aim of the current study was to investigate the relationship between the NVST and measures of standard neurolinguistic assessment. Fifty-one persons with aphasia caused by left hemisphere brain damage were administered the NVST as well as the Aachen Aphasia Test (AAT). A principal component analysis (PCA) was conducted across all AAT and NVST subtests. The analysis resulted in a two-factor model that captured 69% of the variance of the original data, with all linguistic tasks loading high on one factor and the NVST subtests loading high on the other. These findings suggest that nonverbal tasks assessing semantic processing capacities should be administered alongside standard neurolinguistic aphasia tests.


Author(s):  
Filiz Rızaoğlu ◽  
Ayşe Gürel

Abstract This study examines, via a masked priming task, the processing of English regular and irregular past tense morphology in proficient second language (L2) learners and native speakers in relation to working memory capacity (WMC), as measured by the Automated Reading Span (ARSPAN) and Operation Span (AOSPAN) tasks. The findings revealed quantitative group differences in the form of slower reaction times (RTs) in the L2-English group. While no correlation was found between the morphological processing patterns and WMC in either group, there was a negative relationship between English and Turkish ARSPAN scores and the speed of word recognition in the L2 group. Overall, comparable decompositional processing patterns found in both groups suggest that, like native speakers, high-proficiency L2 learners are sensitive to the morphological structure of the target language.

