An extensive empirical study of collocation extraction methods

Author(s):  
Pavel Pecina
2021 ◽  
Vol 11 (7) ◽  
pp. 2892
Author(s):  
Olivera Kitanović ◽  
Ranka Stanković ◽  
Aleksandra Tomašević ◽  
Mihailo Škorić ◽  
Ivan Babić ◽  
...  

The research presented in this paper aims to create a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach links dictionaries related to the raw material domain, both born-digital and printed, into a lexicon structure, aligning terminology from different dictionaries as far as possible. The paper presents the main features of this approach, the data used to compile the terminological database, the procedure by which it was generated, and a mobile application for its use. The available terminological resources are described: paper dictionaries and digital resources related to the raw material domain, as well as general lexica morphological dictionaries. Resource preparation started with dictionary (retro)digitisation and corpora enlargement, followed by the addition of new Serbian terms and bilingual terms to the general lexica dictionaries. Dictionary development relies on corpus analysis, details of which are also presented. Usage examples, collocations and concordances play an important role in raw material terminology and are also included in this research. Related issues discussed include collocation extraction methods, the use of domain labels, lexical and semantic relations, definitions and subentries.


Author(s):  
Lana Hudeček ◽  
Milica Mihaljević

The Croatian Web Dictionary – Mrežnik project aims to create a free, monolingual, easily searchable, hypertext, born-digital, corpus-based dictionary of the Croatian standard language. Collocations play an important role in Mrežnik. At the outset of the Mrežnik project, the concept of collocations and their presentation was modelled after the elexiko project. However, this concept was modified during the project on the basis of corpus analysis. This paper will outline the presentation of collocations of headwords of different word classes. Some important issues connected with collocations in Mrežnik are collocation extraction methods, collocations as a means of differentiating meanings and extracting new meanings, the use of stylistic and terminological labels in collocations, and the relationship of collocations with normative and pragmatic notes, definitions, and subentries.


1996 ◽  
Vol 81 (1) ◽  
pp. 76-87 ◽  
Author(s):  
Connie R. Wanberg ◽  
John D. Watt ◽  
Deborah J. Rumsey
