lexical databases Latest Research Papers

Gorazd: An Old Church Slavonic Digital Hub and the Romanian Slavonic studies

Diacronia ◽

10.17684/i14a197en ◽

2021 ◽

Author(s):

Vladislav Knoll

Keyword(s):

Czech Academy ◽

Card Index ◽

Lexical Databases ◽

Academy Of Sciences ◽

Reference Tool

The aim of this paper is to present the utility of the Gorazd: An Old Church Digital Hub for scholars working with Old Romanian and Slavonic texts written on the territory of today’ s Romania. The Gorazd Project was realized during the years 2016–2020 and it includes an Old Church Slavonic Card Index and three Old Church Slavonic lexical databases, among which the largest one is represented by the digitized and updated version of the monumental Lexicon linguæ palæoslovenicæ (vol. I–IV, 1958–1997) composed by the Institute of Slavonic Studies of the Czech Academy of Sciences. As the Gorazd Project uses English as meta-language, its application is not limited to narrowly specialized Slavic philologists, but it is also open for scholars of neighbouring fields. The dictionaries within the Gorazd Digital Hub can serve as a reference tool not just for the oldest attested Slavonic vocabulary and its semantics, but also for the biblical concordance of the Slavonic oldest Bible redaction and the oldest attested Old Church Slavonic morphological forms.

Download Full-text

A Human Machine Hybrid Approach for Systematic Reviews and Maps in International Development and Social Impact Sectors

Forests ◽

10.3390/f12081027 ◽

2021 ◽

Vol 12 (8) ◽

pp. 1027

Author(s):

Murat Sartas ◽

Sarah Cummings ◽

Alessandra Garbero ◽

Akmal Akramkhanov

Keyword(s):

International Development ◽

Systematic Reviews ◽

Digital Literacy ◽

Social Impact ◽

Semantic Analysis ◽

Hybrid Approach ◽

Evidence Base ◽

Societal Relevance ◽

Lexical Databases ◽

Very High

The international development and social impact evidence community is divided about the use of machine-centered approaches in carrying out systematic reviews and maps. While some researchers argue that machine-centered approaches such as machine learning, artificial intelligence, text mining, automated semantic analysis, and translation bots are superior to human-centered ones, others claim the opposite. We argue that a hybrid approach combining machine and human-centered elements can have higher effectiveness, efficiency, and societal relevance than either approach can achieve alone. We present how combining lexical databases with dictionaries from crowdsourced literature, using full texts instead of titles, abstracts, and keywords. Using metadata sets can significantly improve the current practices of systematic reviews and maps. Since the use of machine-centered approaches in forestry and forestry-related reviews and maps are rare, the gains in effectiveness, efficiency, and relevance can be very high for the evidence base in forestry. We also argue that the benefits from our hybrid approach will increase in time as digital literacy and better ontologies improve globally.

Download Full-text

Lexical Databases and Textual Corpora: Perspectives of Integration for a Lexical Knowledge Base

Lexical Acquisition: Exploiting On-Line Resources to Build a Lexicon ◽

10.4324/9781315785387-10 ◽

2021 ◽

pp. 191-208

Author(s):

Nicoletta Calzolari

Keyword(s):

Knowledge Base ◽

Lexical Knowledge ◽

Lexical Databases ◽

Lexical Knowledge Base

Download Full-text

EFL Learners’ Comprehension of Scalar Emotion Verbs

International Journal of Modern Education Studies ◽

10.51383/ijonmes.2020.54 ◽

2020 ◽

Vol 4 (2) ◽

pp. 110

Author(s):

Ismail Saka

Keyword(s):

Semantic Analysis ◽

Native Speakers ◽

Linear Scale ◽

Lexical Resources ◽

Efl Learners ◽

Learning And Teaching ◽

The Usa ◽

Lexical Databases ◽

The Uk ◽

Scalar Properties

This study investigated EFL learners’ comprehension of scalar properties of three types of emotion verbs, namely, fear type, liking and disliking emotion verbs and compare their performance with instructors and native speakers of English. The participants were 38 non-native pre-service teachers from ELT department at a state university in Turkey, 11 ELT instructors at different universities and 10 native speakers from the USA and the UK. The data were collected through a scale construction task according to participants’ judgements on scalar emotion verbs in terms of their relative order on a linear scale. The results revealed that in terms of constructing consistent scales with with previously determined scales in literatutre, pre-service teachers performed poorly for fear-type and disliking emotion verbs, they were partly successful in constructing consistent scales for liking verbs. It was also found that similarly instructors performed poorly in constructing scales for fear-type and disliking verbs, but they were better than pre-service teachers. They were also successful in constructing scales for liking verbs. Native speakers were successful in fear-type and liking verbs; however, like non-native participants, they performed poorly in constructing consistent scales for disliking verbs. This means that there may be cross-cultural differences among participants’ judgement of emotion verbs on a linear scale in terms of their intensity. This study may provide valuable information for the studies on lexical resources (e.g., VerbNet, WordNet etc.) Previous studies (e.g. Fellbaum & Mathieu,2014; Sheinman, & Tokunaga, 2009) show a way to represent the scalar properties of emotion verbs in WordNet, and other possible extensions to additional verb families can cause a more subtle semantic analysis of emotion verbs in lexical databases with potential benefits for automatic inferencing, language pedagogy and translation. This study may contribute to semantic analysis of emotion verbs in lexical databases. It may also provide some implications for students, language teachers, and policy makers in terms of vocabulary learning and teaching.

Download Full-text

Introducing corpus linguistic tools to EFL undergraduates and trainee teachers

Education 4.0 revolution: transformative approaches to language teaching and learning, assessment and campus design ◽

10.14705/rpnet.2020.42.1086 ◽

2020 ◽

pp. 45-51

Author(s):

Mária Adorján

Keyword(s):

Foreign Language ◽

Language Teachers ◽

Information And Communications Technology ◽

Assessment Practices ◽

Trainee Teachers ◽

Teacher Trainees ◽

Vocabulary Building ◽

Learning Platforms ◽

N Gram ◽

Lexical Databases

Many language teachers use Information and Communications Technology (ICT) in their classrooms to create tasks, quizzes, or polls with general online learning platforms. Few teachers have experience, however, of incorporating online corpus tools in their teaching or assessment practices. This paper will explore how autonomous learning can be fostered by gradually introducing freely available lexical databases, online collocation dictionaries, pronunciation guides, concordancers, N-gram extractors, and other text analysis tools for vocabulary building, skills practice, or self-checking. Tasks used with English as a Foreign Language (EFL) undergraduates and teacher trainees on a Master’s Teaching English as a Foreign Language (MA TEFL) course will be presented. I will also explain why having some familiarity with linguistics research can enable teachers to use these applications more meaningfully.

Download Full-text

NLP Questions Answering Using DBpedia and YAGO

Vietnam Journal of Computer Science ◽

10.1142/s2196888820500190 ◽

2020 ◽

Vol 07 (04) ◽

pp. 339-354

Author(s):

Tomasz Boiński ◽

Julian Szymański ◽

Bartłomiej Dudek ◽

Paweł Zalewski ◽

Szymon Dompke ◽

...

Keyword(s):

Sparql Query ◽

User Intention ◽

Question Analysis ◽

User Query ◽

Class 1 ◽

Dependency Trees ◽

Three Stages ◽

Lexical Databases ◽

Answering Questions

In this paper, we present results of employing DBpedia and YAGO as lexical databases for answering questions formulated in the natural language. The proposed solution has been evaluated for answering class 1 and class 2 questions (out of 5 classes defined by Moldovan for TREC conference). Our method uses dependency trees generated from the user query. The trees are browsed for paths leading from the root of the tree to the question subject. We call those paths fibers and they represent the user intention. The question analysis consists of three stages: query analysis, query breakdown and information retrieval. The aim of those stages is the detection of the entities of interest and its attributes, indicating the users’ domain of interest. The user query is then converted into a SPARQL query and sent to the DBpedia and YAGO databases. The baseline and the extended methods are presented and the quality of the solution is evaluated and discussed.

Download Full-text

Theoretical and Methodological Foundations of The Lexical Databases Using The Example of “ arabdict”. A Work Report الأسس المنهجیة والنظریة للمعاجم الإلکترونیة من خلال المعجم الالکترونی "ارابدیکت "

مجلة قطاع الدراسات الإنسانیة ◽

10.21608/jsh.2020.99823 ◽

2020 ◽

Vol 25 (2) ◽

pp. 985-1034

Author(s):

محمد سالم محمد یوسف

Keyword(s):

Lexical Databases

Download Full-text

Fast paraphrase extraction in Ancient Greek literature

it - Information Technology ◽

10.1515/itit-2019-0042 ◽

2020 ◽

Vol 62 (2) ◽

pp. 75-89

Author(s):

Marcus Pöckelmann ◽

Janis Dähne ◽

Jörg Ritter ◽

Paul Molitor

Keyword(s):

Interactive Systems ◽

Final Report ◽

Brute Force ◽

Search Query ◽

Ancient Greek ◽

Pos Tagging ◽

Text Segment ◽

Text Corpora ◽

Brute Force Approach ◽

Lexical Databases

AbstractIn this paper,A shorter version of the paper appeared in German in the final report of the Digital Plato project which was funded by the Volkswagen Foundation from 2016 to 2019. [35], [28]. we present a method for paraphrase extraction in Ancient Greek that can be applied to huge text corpora in interactive humanities applications. Since lexical databases and POS tagging are either unavailable or do not achieve sufficient accuracy for ancient languages, our approach is based on pure word embeddings and the word mover’s distance (WMD) [20]. We show how to adapt the WMD approach to paraphrase searching such that the expensive WMD computation has to be computed for a small fraction of the text segments contained in the corpus, only. Formally, the time complexity will be reduced from \mathcal{O}(N\cdot {K^{3}}\cdot \log K) to \mathcal{O}(N+{K^{3}}\cdot \log K), compared to the brute-force approach which computes the WMD between each text segment of the corpus and the search query. N is the length of the corpus and K the size of its vocabulary. The method, which searches not only for paraphrases of the same length as the search query but also for paraphrases of varying lengths, was evaluated on the Thesaurus Linguae Graecae® (TLG®) [25]. The TLG consists of about 75\cdot {10^{6}} Greek words. We searched the whole TLG for paraphrases for given passages of Plato. The experimental results show that our method and the brute-force approach, with only very few exceptions, propose the same text passages in the TLG as possible paraphrases. The computation times of our method are in a range that allows its application in interactive systems and let the humanities scholars work productively and smoothly.

Download Full-text

Dictionaries and lexical databases of the Polish language

Prace Językoznawcze ◽

10.31648/pj.5231 ◽

2020 ◽

Vol 22 (2) ◽

pp. 23-37

Author(s):

Monika Czerepowicka

Keyword(s):

Polish Language ◽

Lexical Databases

Przedmiotem badań jest jednojęzyczna leksykografia elektroniczna. Celem artykułujest ukazanie wpływu technik komputerowych na organizację, rozmiar, przeznaczeniei zawartość słowników. W swych badaniach autorka koncentruje się na elektronicznychbazach danych. Definiuje, czym są, oraz objaśnia, jak ich budowa i sposób organizacjizgromadzonych w nich danych wpływają na postać słowników elektronicznych. W artykulezostały poddane analizie trzy współczesne słowniki języka polskiego: Uniwersalny słownikjęzyka polskiego PWN, Wielki słownik języka polskiego PAN oraz Słownik gramatycznyjęzyka polskiego. Autorka dowodzi, że sposób organizacji i prezentacji wiedzy w omówionychdziełach umożliwia użytkownikom korzystanie z nich w sposób zaawansowany,co oznacza sprawne dotarcie do szczegółowych informacji o jednostkach leksykalnych,grupowanie ich, jak również doraźne kompilowanie „podsłowników”, spełniających określoneoczekiwania odbiorców.

Download Full-text