Vocabulary: Common or Basic?

2021 ◽  
Vol 12 ◽  
Author(s):  
András Kornai

Neither linguistics nor psychology offers a single, unified notion of simplicity, and therefore the simplest “core” layer of vocabulary is hard to define in theory and hard to pinpoint in practice. In section 1 we briefly survey the main approaches, and distinguish two that are highly relevant to lexicography: we will call these common and basic. In sections 2 and 3 we compare these approaches, and in section 4 we point the reader to Kolmogorov complexity, unfamiliar as it may be to most working psychologists, lexicographers, and educators, as the best formal means to deal with core vocabulary.

2012 ◽  
Vol 21 (3) ◽  
pp. 74-81 ◽  
Author(s):  
Debbie Witkowski ◽  
Bruce Baker

Abstract: In the early elementary grades, the primary emphasis is on developing skills crucial to future academic and personal success, specifically oral and written communication skills. These skills are vital to student success as well as to meaningful participation in the classroom and interaction with peers. Children with complex communication needs (CCN) may require the use of high-performance speech generating devices (SGDs). The challenges for these students are further complicated by the task of learning language at a time when they are expected to apply their linguistic skills to academic tasks. However, by focusing on core vocabulary as a primary vehicle for instruction, educators can equip students who use SGDs to develop language skills and be competitive in the classroom. In this article, we will define core vocabulary and provide theoretical and practical insights into integrating it into the classroom routine for developing oral and written communication skills.


2010 ◽  
Vol 3 (2) ◽  
pp. 156-180 ◽  
Author(s):  
Renáta Gregová ◽  
Lívia Körtvélyessy ◽  
Július Zimmermann

The Universals Archive (Universal #1926) indicates a universal tendency for sound symbolism in the expression of diminutives and augmentatives. Earlier research on European languages (Štekauer et al. 2009) did not confirm this tendency at all. Therefore, our research was extended to cover three language families: Indo-European, Niger-Congo and Austronesian. A three-step analysis examining different aspects of phonetic symbolism was carried out on a core vocabulary of 35 lexical items. A research sample was selected out of 60 languages. The evaluative markers were analyzed according to both the phonetic classification of vowels and consonants and Ultan's and Niewenhuis' conclusions on the dominance of palatal and post-alveolar consonants in diminutive markers. Finally, the data obtained in our sample languages were evaluated by means of a three-dimensional model illustrating the place of articulation of the individual segments.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pilar López-Úbeda ◽  
Alexandra Pomares-Quimbaya ◽  
Manuel Carlos Díaz-Galiano ◽  
Stefan Schulz

Abstract: Background: Controlled vocabularies are fundamental resources for information extraction from clinical texts using natural language processing (NLP). Standard language resources available in the healthcare domain, such as the UMLS metathesaurus or SNOMED CT, are widely used for this purpose, but with limitations such as the lexical ambiguity of clinical terms. However, most of these terms are unambiguous within text limited to a given clinical specialty. This is one rationale, among others, for classifying clinical texts by the clinical specialty to which they belong. Results: This paper addresses this limitation by proposing and applying a method that automatically extracts Spanish medical terms classified and weighted per sub-domain, using Spanish MEDLINE titles and abstracts as input. The hypothesis is that biomedical NLP tasks benefit from collections of domain terms that are specific to clinical sub-domains. We use PubMed queries that generate sub-domain specific corpora from Spanish titles and abstracts, from which token n-grams are collected and metrics of relevance, discriminatory power, and broadness per sub-domain are computed. The generated term set, called Spanish core vocabulary about clinical specialties (SCOVACLIS), was made available to the scientific community and used in a text classification problem, obtaining improvements of 6 percentage points in the F-measure compared to the baseline using a Multilayer Perceptron, thus supporting the hypothesis that a specialized term set improves NLP tasks. Conclusion: The creation and validation of SCOVACLIS support the hypothesis that specific term sets reduce the level of ambiguity when compared to a specialty-independent and broad-scope vocabulary.
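The n-gram collection and per-sub-domain weighting described in this abstract can be illustrated with a minimal sketch. The abstract does not define its metrics, so the two scores below are assumptions chosen for illustration: "relevance" is taken as the n-gram's relative frequency within a sub-domain, and "discrimination" as the share of all occurrences of the n-gram that fall in that sub-domain. The function names `token_ngrams` and `score_terms` and the toy corpora are hypothetical and are not part of SCOVACLIS.

```python
from collections import Counter


def token_ngrams(text, n_max=2):
    """Return all lowercase whitespace-token n-grams of length 1..n_max."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def score_terms(corpora, n_max=2):
    """Score n-grams per sub-domain.

    corpora maps a sub-domain name to a list of texts. The result maps each
    sub-domain to {term: (relevance, discrimination)}, where both scores are
    illustrative assumptions, not the metrics used in the paper.
    """
    # Count n-grams separately for every sub-domain corpus.
    counts = {
        domain: Counter(g for text in texts for g in token_ngrams(text, n_max))
        for domain, texts in corpora.items()
    }
    # Total count of each n-gram across all sub-domains.
    totals = Counter()
    for domain_counts in counts.values():
        totals.update(domain_counts)

    scores = {}
    for domain, domain_counts in counts.items():
        size = sum(domain_counts.values())
        scores[domain] = {
            term: (k / size, k / totals[term])
            for term, k in domain_counts.items()
        }
    return scores


# Toy corpora standing in for sub-domain specific titles (hypothetical data).
toy = {
    "cardiology": ["infarto agudo de miocardio", "insuficiencia cardiaca aguda"],
    "neurology": ["infarto cerebral agudo", "crisis epiléptica"],
}
scores = score_terms(toy)
```

In this toy data, a term such as "miocardio" occurs only in the cardiology corpus and so receives the maximum discrimination score there, while "infarto" is shared between both sub-domains and is therefore less specific to either.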


2020 ◽  
pp. 1-28
Author(s):  
NIKITA MORIAKOV

Abstract: A theorem of Brudno says that the Kolmogorov–Sinai entropy of an ergodic subshift over $\mathbb{N}$ equals the asymptotic Kolmogorov complexity of almost every word in the subshift. The purpose of this paper is to extend this result to subshifts over computable groups that admit computable regular symmetric Følner monotilings, which we introduce in this work. For every $d \in \mathbb{N}$, the groups $\mathbb{Z}^d$ and $\mathsf{UT}_{d+1}(\mathbb{Z})$ admit computable regular symmetric Følner monotilings for which the required computing algorithms are provided.
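For orientation, the classical statement over $\mathbb{N}$ can be written out as follows. The notation is assumed here and does not come from the abstract: $(X, \sigma)$ is a subshift with shift map $\sigma$, $\mu$ is an ergodic $\sigma$-invariant probability measure, $h_\mu(X, \sigma)$ is its Kolmogorov–Sinai entropy (in bits), $K$ is Kolmogorov complexity, and $x_{[0,n)}$ denotes the first $n$ symbols of $x$.

$$\limsup_{n \to \infty} \frac{K\left(x_{[0,n)}\right)}{n} = h_\mu(X, \sigma) \quad \text{for } \mu\text{-almost every } x \in X.$$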

