information retrieval thesauri
Recently Published Documents


Total documents: 7 (five years: 2)
H-index: 3 (five years: 0)

Author(s):  
Natalia Loukachevitch ◽  
Boris Dobrov

This chapter describes the Russian thesaurus RuThes, created as a linguistic and terminological resource for automatic document processing. Its structure combines two popular paradigms for computer thesauri. From information retrieval thesauri it takes concept-based units, a small set of relation types, and rules for including multiword expressions; from WordNet-like thesauri it takes language-motivated units, detailed sets of synonyms, and descriptions of ambiguous words. RuThes has been under continuous development for many years: new concepts, new senses, and multiword expressions found in contemporary texts are added regularly. The chapter gives examples of how newly emerged concepts related to important domestic and international events are represented.
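The structure outlined in this abstract can be pictured with a small sketch: concept-based units that each carry several text entries (synonyms and multiword expressions), a small set of typed relations, and a lookup that exposes ambiguous words. This is only an illustration under stated assumptions, not the RuThes data model; the class names, the relation label `broader`, and the English sample entries are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept-based unit: one meaning with several text entries."""
    name: str
    # Synonyms and multiword expressions that express the concept.
    text_entries: list[str] = field(default_factory=list)
    # Relation type -> names of related concepts (labels are hypothetical).
    relations: dict[str, list[str]] = field(default_factory=dict)


class Thesaurus:
    """Toy container that indexes concepts by their text entries."""

    def __init__(self):
        self.concepts = {}
        self._by_entry = defaultdict(list)

    def add(self, concept):
        self.concepts[concept.name] = concept
        for entry in concept.text_entries:
            self._by_entry[entry.lower()].append(concept.name)

    def senses(self, entry):
        """Return the names of all concepts a text entry can refer to."""
        return list(self._by_entry.get(entry.lower(), []))

    def is_ambiguous(self, entry):
        """A text entry is ambiguous if it is attached to several concepts."""
        return len(self.senses(entry)) > 1


# Invented sample data, not taken from RuThes.
thesaurus = Thesaurus()
thesaurus.add(Concept(
    name="BANK (FINANCIAL INSTITUTION)",
    text_entries=["bank", "banking institution"],
    relations={"broader": ["FINANCIAL INSTITUTION"]},
))
thesaurus.add(Concept(
    name="RIVER BANK",
    text_entries=["bank", "river bank", "riverside"],
))

print(thesaurus.senses("bank"))
print(thesaurus.is_ambiguous("riverside"))
```

Attaching the ambiguous word `bank` to two concepts, rather than storing two separate word senses, mirrors the concept-based side of the design; the per-concept synonym lists mirror the WordNet-like side.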


Terminology ◽  
2015 ◽  
Vol 21 (2) ◽  
pp. 237-262 ◽  
Author(s):  
Natalia Loukachevitch ◽  
Boris Dobrov

This paper presents the structure and current state of the Sociopolitical thesaurus, which was developed for automatic document analysis and information-retrieval applications in Russian in the broad domain of public affairs. In scope, the Sociopolitical thesaurus resembles traditional information-retrieval thesauri for broad domains, such as the EUROVOC or UNBIS thesauri. However, it is intended as a tool for automatic document processing, and this difference leads to considerable distinctions in the thesaurus structure and the principles of its development. Knowledge representation in the Sociopolitical thesaurus combines three existing traditions: the development of information-retrieval thesauri, wordnets, and formal ontology research. This combination facilitates both the consistent representation of such a broad scope of concepts and the automatic analysis of unstructured texts. The Sociopolitical thesaurus is used in applications such as conceptual indexing in information-retrieval systems, knowledge-based text categorization, automatic summarization of single and multiple documents, and question answering. The paper also presents an evaluation of the Sociopolitical thesaurus in automatic knowledge-based text categorization.

