RuThes Thesaurus for Natural Language Processing

The Palgrave Handbook of Digital Russia Studies ◽

10.1007/978-3-030-42855-6_18 ◽

2020 ◽

pp. 319-334

Author(s):

Natalia Loukachevitch ◽

Boris Dobrov

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Document Processing ◽

Multiword Expressions ◽

Ambiguous Words ◽

New Concepts ◽

Information Retrieval Thesauri ◽

Small Set

This chapter describes RuThes, a Russian thesaurus created as a linguistic and terminological resource for automatic document processing. Its structure combines two popular paradigms for computer thesauri. From information retrieval thesauri it takes concept-based units, a small set of relation types, and rules for including multiword expressions; from WordNet-like thesauri it takes language-motivated units, detailed sets of synonyms, and descriptions of ambiguous words. RuThes has been developed continuously for many years: new concepts, new senses, and multiword expressions found in contemporary texts are introduced regularly. The chapter shows examples of how newly emerged concepts related to important domestic and international events are represented.

Proceedings of the 2019 3rd International Conference on Natural Language Processing and Information Retrieval

10.1145/3342827 ◽

2019 ◽

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

International Conference

Thai Fake News Detection Based on Information Retrieval, Natural Language Processing and Machine Learning

SN Computer Science ◽

10.1007/s42979-021-00775-6 ◽

2021 ◽

Vol 2 (6) ◽

Author(s):

Phayung Meesad

Keyword(s):

Machine Learning ◽

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Fake News

Report on the 4th Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries at SIGIR 2019

ACM SIGIR Forum ◽

10.1145/3458553.3458554 ◽

2019 ◽

Vol 53 (2) ◽

pp. 3-10

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

Digital Libraries ◽

State Of The Art ◽

Shared Task ◽

Processing Information ◽

Joint Workshop

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 aimed to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated several paper sessions and the 5th edition of the CL-SciSumm Shared Task.

A lexicon of multiword expressions for linguistically precise, wide-coverage natural language processing

Computer Speech & Language ◽

10.1016/j.csl.2013.09.001 ◽

2014 ◽

Vol 28 (6) ◽

pp. 1317-1339 ◽

Cited By ~ 4

Author(s):

Toshifumi Tanabe ◽

Masahito Takahashi ◽

Kosho Shudo

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Multiword Expressions

Integrating natural language processing and information retrieval in a troubleshooting help desk

IEEE Expert ◽

10.1109/64.248348 ◽

1993 ◽

Vol 8 (6) ◽

pp. 9-17 ◽

Cited By ~ 5

Author(s):

P.G. Anick

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Help Desk

Applying Natural Language Processing, Information Retrieval and Machine Learning to Decision Support in Medical Coordination in an Emergency Medicine Context

2015 IEEE 28th International Symposium on Computer-Based Medical Systems ◽

10.1109/cbms.2015.82 ◽

2015 ◽

Cited By ~ 2

Author(s):

Juliana Tarossi Pollettini ◽

Hugo Cesar Pessotti ◽

Antonio Pazin Filho ◽

Evandro Eduardo Seron Ruiz ◽

Mario Sergio Adolfi Junior

Keyword(s):

Machine Learning ◽

Emergency Medicine ◽

Information Retrieval ◽

Natural Language Processing ◽

Decision Support ◽

Natural Language ◽

Language Processing ◽

Processing Information

Information retrieval using robust natural language processing

Proceedings of the workshop on Speech and Natural Language - HLT '91 ◽

10.3115/1075527.1075573 ◽

1992 ◽

Author(s):

Tomek Strzalkowski

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing

Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation (Preprint)

10.2196/preprints.17376 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sijia Liu ◽

Yanshan Wang ◽

Andrew Wen ◽

Liwei Wang ◽

Na Hong ◽

...

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Model ◽

Structured Data ◽

Common Data Model ◽

Concept System ◽

Unstructured Text ◽

Electronic Health

BACKGROUND: Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in their capability to extract the information embedded in unstructured clinical data, and information retrieval techniques provide flexible and scalable solutions that can augment natural language processing systems for retrieving and ranking relevant records.

OBJECTIVE: In this paper, we present the implementation of a cohort retrieval system that can execute textual cohort selection queries on both structured data and unstructured text: Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records (CREATE).

METHODS: CREATE is a proof-of-concept system that leverages a combination of structured queries and information retrieval techniques on natural language processing results to improve cohort retrieval performance, using the Observational Medical Outcomes Partnership Common Data Model to enhance model portability. The natural language processing component was used to extract common data model concepts from textual queries. We designed a hierarchical index to support the common data model concept search utilizing information retrieval techniques and frameworks.

RESULTS: Our case study on 5 cohort identification queries, evaluated using the precision at 5 information retrieval metric at both the patient level and the document level, demonstrates that CREATE achieves a mean precision at 5 of 0.90, which outperforms systems using only structured data or only unstructured text, with mean precision at 5 values of 0.54 and 0.74, respectively.

CONCLUSIONS: The implementation and evaluation on Mayo Clinic Biobank data demonstrated that CREATE outperforms cohort retrieval systems that use only structured data or only unstructured text in complex textual cohort queries.
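
The precision at 5 metric mentioned in this abstract can be illustrated with a minimal sketch. The function names, query labels, and patient identifiers below are hypothetical toy values chosen for illustration; they are not taken from the CREATE system or its evaluation data.

```python
from statistics import mean


def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k ranked items that are relevant.

    The denominator is always k, so a run that returns fewer than
    k items is penalised for the missing positions.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k


# Hypothetical toy data: query -> (ranked results, set of relevant ids).
runs = {
    "query_1": (["p1", "p2", "p3", "p4", "p5", "p6"], {"p1", "p2", "p4", "p5"}),
    "query_2": (["p9", "p7", "p8", "p3", "p2"], {"p2", "p3", "p7", "p8", "p9"}),
}

# Per-query precision at 5, then the mean over all queries.
scores = {q: precision_at_k(ranked, rel, k=5) for q, (ranked, rel) in runs.items()}
mean_p5 = mean(scores.values())
```

Averaging the per-query scores gives a mean precision at 5 of the kind reported in the abstract; the same function applies at the patient level or the document level, depending on what the ranked identifiers denote.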

Exact Expected Average Precision of the Random Baseline for System Evaluation

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2015-0007 ◽

2015 ◽

Vol 103 (1) ◽

pp. 131-138 ◽

Cited By ~ 2

Author(s):

Yves Bestgen

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

System Evaluation ◽

Average Precision ◽

The Difference

Average precision (AP) is one of the most widely used metrics in information retrieval and natural language processing research. It is usually thought that the expected AP of a system that ranks documents randomly is equal to the proportion of relevant documents in the collection. This paper shows that this value is only approximate, and provides a procedure for efficiently computing the exact value. An analysis of the difference between the approximate and the exact value shows that the discrepancy is large when the collection contains few documents, but becomes very small when it contains at least 600 documents.
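
The gap between the approximate and the exact expected AP can be checked on tiny collections by brute force. The sketch below is not the efficient procedure the paper provides; it simply enumerates every possible placement of the relevant documents, assuming a uniformly random ranking, and uses exact fractions. The function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def average_precision(relevant_ranks):
    """AP of one ranking, given the 1-based ranks of its relevant documents."""
    ranks = sorted(relevant_ranks)
    # Precision at the i-th relevant document (found at position `rank`) is i / rank.
    return sum(Fraction(i, rank) for i, rank in enumerate(ranks, start=1)) / len(ranks)


def expected_ap_random(n_docs, n_relevant):
    """Exact expected AP of a uniformly random ranking, by enumeration.

    Every set of rank positions for the relevant documents is equally
    likely, so the expectation is the plain mean over all such sets.
    Only practical for small collections.
    """
    if not 0 < n_relevant <= n_docs:
        raise ValueError("need 0 < n_relevant <= n_docs")
    total = Fraction(0)
    count = 0
    for placement in combinations(range(1, n_docs + 1), n_relevant):
        total += average_precision(placement)
        count += 1
    return total / count


# Compare the exact value with the usual approximation R / N.
for n_docs, n_relevant in [(3, 1), (3, 2), (10, 2)]:
    exact = expected_ap_random(n_docs, n_relevant)
    approx = Fraction(n_relevant, n_docs)
    print(n_docs, n_relevant, float(exact), float(approx))
```

Because the enumeration grows combinatorially with collection size, this approach only serves to make the discrepancy visible on small examples, which is the regime where the abstract says the difference is large.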

BUILD KNOWLEDGE GRAPH FROM HETEROGENEOUS DOCUMENTS

Journal of Science and Technology - IUH ◽

10.46242/jst-iuh.v47i05.761 ◽

2021 ◽

Vol 47 (05) ◽

Author(s):

NGUYỄN CHÍ HIẾU

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Semantic Analysis ◽

Knowledge Graph ◽

Question Answering Systems ◽

Knowledge Graphs

In recent years, knowledge graphs have been applied in many fields, such as search engines, semantic analysis, and question answering. However, building knowledge graphs faces many obstacles in methodology, data, and tools. This paper introduces a novel methodology for building a knowledge graph from heterogeneous documents. We use natural language processing and deep learning methods to build the graph. The knowledge graph can be used in question answering systems and information retrieval, especially in the computing domain.