entity linking
Recently Published Documents


Total documents: 407 (five years: 185)
H-index: 20 (five years: 5)

2021 ◽  
Author(s):  
Petar Ristoski ◽  
Zhizhong Lin ◽  
Qunzhi Zhou

Author(s):  
Elvys Linhares Pontes ◽  
Luis Adrián Cabrera-Diego ◽  
Jose G. Moreno ◽  
Emanuela Boros ◽  
Ahmed Hamdi ◽  
...  

Abstract Digital libraries have a key role in cultural heritage as they provide access to our culture and history by indexing books and historical documents (newspapers and letters). Digital libraries use natural language processing (NLP) tools to process these documents and enrich them with meta-information, such as named entities. Despite recent advances in these NLP models, most of them are built for specific languages and contemporary documents, and are not optimized for historical material that may, for instance, contain language variations and optical character recognition (OCR) errors. In this work, we focus on the entity linking (EL) task, which is fundamental to indexing documents in digital libraries. We developed a Multilingual Entity Linking architecture for HIstorical preSS Articles that combines multilingual analysis, OCR correction, and filter analysis to mitigate the difficulties that historical documents pose for the EL task. The source code is publicly available. Experiments were run over two historical documents covering five European languages (English, Finnish, French, German, and Swedish). Results show that our system improved the global performance for all languages and datasets, achieving an F-score@1 of up to 0.681 and an F-score@5 of up to 0.787.
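The abstract does not define F-score@1 or F-score@5. As a minimal sketch, assuming the common reading in which a mention counts as correct when its gold entity appears among the top k ranked candidates, the metric could be computed as follows. The function name, the data layout, and the entity identifiers are hypothetical and are not taken from the paper.

```python
def f_score_at_k(predictions, gold, k):
    """Assumed definition of F-score@k, not the paper's.

    predictions: mention id -> ranked list of candidate entity ids
    gold:        mention id -> correct entity id
    """
    # A mention is a hit if its gold entity is among the top-k candidates.
    hits = sum(1 for m, ranked in predictions.items()
               if m in gold and gold[m] in ranked[:k])
    # Precision is measured over mentions the system actually answered.
    answered = sum(1 for ranked in predictions.values() if ranked)
    precision = hits / answered if answered else 0.0
    # Recall is measured over all gold mentions.
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with made-up identifiers; mention "m3" receives no prediction.
GOLD = {"m1": "E1", "m2": "E2", "m3": "E3"}
PREDICTIONS = {"m1": ["E1", "E9"], "m2": ["E7", "E2"]}

score_at_1 = f_score_at_k(PREDICTIONS, GOLD, k=1)
score_at_5 = f_score_at_k(PREDICTIONS, GOLD, k=5)
```

Under this reading the score can only stay equal or rise as k grows, which is consistent with the reported F-score@5 being higher than the F-score@1.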


2021 ◽  
pp. 1-24
Author(s):  
Qiushuo Zheng ◽  
Hao Wen ◽  
Meng Wang ◽  
Guilin Qi

Abstract Existing visual scene understanding methods mainly focus on identifying coarse-grained concepts about visual objects and their relationships, largely neglecting fine-grained scene understanding. In fact, many data-driven applications on the web (e.g., news reading and e-shopping) need to accurately recognize fine-grained concepts as entities and properly link them to a knowledge graph (KG), which can take their performance to the next level. In light of this, we identify a new research task: visual entity linking for fine-grained scene understanding. To accomplish the task, we first extract features of candidate entities from different modalities, i.e., visual features, textual features, and KG features. Then, we design a learning-to-rank method, based on a deep modal-attention neural network, that aggregates all features and maps visual objects to entities in the KG. Extensive experimental results on the newly constructed dataset show that our proposed method is effective, improving accuracy from 66.46% to 83.16% compared with baselines.
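The idea of weighting modalities before ranking candidates can be illustrated with a small sketch. This is not the authors' network: the fixed attention logits below stand in for values a trained model would produce, and the per-modality scores, candidate names, and function names are all hypothetical.

```python
import math


def softmax(logits):
    """Turn raw attention logits into weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_candidates(candidates, attention_logits):
    """Rank candidate entities by an attention-weighted sum of modality scores.

    candidates: entity name -> (visual, textual, kg) scores (assumed layout)
    attention_logits: one logit per modality, in the same order
    """
    weights = softmax(attention_logits)
    fused = {
        name: sum(w * s for w, s in zip(weights, scores))
        for name, scores in candidates.items()
    }
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Toy scores: candidate_a looks right visually, candidate_b matches the text.
CANDIDATES = {
    "candidate_a": (0.9, 0.4, 0.7),
    "candidate_b": (0.8, 0.6, 0.2),
}

uniform_ranking = rank_candidates(CANDIDATES, [0.0, 0.0, 0.0])
text_heavy_ranking = rank_candidates(CANDIDATES, [0.0, 5.0, 0.0])
```

With uniform weights candidate_a ranks first; shifting attention toward the textual modality puts candidate_b first, which shows how the weighting alone can change the linked entity.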


2021 ◽  
Author(s):  
Ghadeer Mobasher ◽  
Lukrecia Mertova ◽  
Sucheta Ghosh ◽  
Olga Krebs ◽  
Bettina Heinlein ◽  
...  

Chemical named entity recognition (NER) is a significant step for many downstream applications, such as entity linking in the chemical text-mining pipeline. However, identifying chemical entities in biomedical text is challenging because of the diverse morphology of chemical entities and the different types of chemical nomenclature. In this work, we describe the approach we submitted to the BioCreative version 7 challenge, Track 2, focusing on the "Chemical Identification" task of identifying chemical entities and linking them using MeSH. We applied a two-stage approach: (a) a fine-tuned BioBERT model to identify chemical entities, and (b) semantic approximate search in the MeSH and PubChem databases for entity linking. There was some friction between the two stages, as our rule-based approach did not harmonise optimally with the partially recognized words forwarded by the BERT component. In future work, we aim to resolve the artefacts arising from BERT tokenizers, develop joint learning of chemical named entity recognition and entity linking using pretrained transformer-based models, and compare their performance with our preliminary approach. Next, we will improve the efficiency of our approximate search in reference databases during entity linking. This task is non-trivial, as it entails determining similarity scores of large sets of trees with respect to a query tree. Ideally, this will enable flexible parametrization and rule selection for the entity linking search.
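The second stage, looking up a recognized mention in a reference vocabulary with some tolerance for spelling variation, can be sketched as below. This swaps in plain character-level matching from Python's standard `difflib` module, not the semantic or tree-based search the authors describe, and the vocabulary and identifiers are made up rather than real MeSH or PubChem entries.

```python
import difflib

# Hypothetical miniature vocabulary: term -> made-up identifier.
TOY_VOCAB = {
    "aspirin": "ID:0001",
    "acetaminophen": "ID:0002",
    "ibuprofen": "ID:0003",
}


def link_mention(mention, vocab=TOY_VOCAB, cutoff=0.8):
    """Return the identifier of the closest vocabulary term, or None.

    cutoff is the minimum similarity ratio (0 to 1) accepted as a match.
    """
    key = mention.lower().strip()
    matches = difflib.get_close_matches(key, list(vocab), n=1, cutoff=cutoff)
    return vocab[matches[0]] if matches else None


misspelled = link_mention("Asprin")   # close to "aspirin"
unrelated = link_mention("water")     # nothing similar enough
```

A fixed similarity cutoff is a blunt instrument: a word fragment forwarded by a subword tokenizer may fall just above or below it, which is one way the friction between the two stages mentioned in the abstract can arise.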


2021 ◽  
pp. 1-42
Author(s):  
Aline Menin ◽  
Franck Michel ◽  
Fabien Gandon ◽  
Raphaël Gazzotti ◽  
Elena Cabrio ◽  
...  

Abstract The unprecedented mobilization of scientists resulting from the COVID-19 pandemic has generated an enormous number of scholarly articles, which is impossible for a human being to keep track of and explore without appropriate tool support. In this context, we created the Covid-on-the-Web project, which aims to assist the access, querying, and sense making of COVID-19 related literature by combining efforts from the semantic web, natural language processing, and visualization fields. In particular, in this paper, we present (i) an RDF dataset, a linked version of the “COVID-19 Open Research Dataset” (CORD-19), enriched via entity linking and argument mining, and (ii) the “Linked Data Visualizer” (LDViz), which assists the querying and visual exploration of that dataset. The LDViz tool supports the exploration of different views of the data by combining a query management interface, which enables the definition of meaningful subsets of data through SPARQL queries, and a visualization interface based on a set of six visualization techniques integrated in a chained visualization concept, which also supports the tracking of provenance information. Through use case scenarios, we demonstrate the potential of our approach to assist biomedical researchers in solving domain-related tasks and in performing exploratory analyses.


2021 ◽  
Vol 21 (S9) ◽  
Author(s):  
Cheng Yan ◽  
Yuanzhe Zhang ◽  
Kang Liu ◽  
Jun Zhao ◽  
Yafei Shi ◽  
...  

Abstract Background: A large number of medical mentions can be extracted from the huge volume of medical texts. To make use of these mentions, a prerequisite step is to link them to a medical domain knowledge base (KB). Linking mentions to a well-defined, unambiguous KB is a necessary part of downstream applications such as disease diagnosis and drug prescription. The demand becomes more urgent in colloquial and informal settings such as online medical consultation, where the medical language is more casual and vague. In this article, we propose an unsupervised method to link Chinese medical symptom mentions to the ICD10 classification in a colloquial setting. Methods: We propose an unsupervised entity linking model using multi-instance learning (MIL). Our approach builds on a basic unsupervised entity linking method (named BEL), which in this paper is an embedding similarity-based EL model, and uses the MIL training paradigm to boost the performance of BEL. First, we construct a dataset from a large-scale unlabeled Chinese medical consultation corpus with the help of BEL. Subsequently, we use a variety of encoders to obtain representations of the mention context and the ICD10 entities. The representations are then fed into a ranking network to score candidate entities. Results: We evaluate the proposed model on a test dataset annotated by professional doctors. The evaluation results show that our method achieves 60.34% accuracy, exceeding the basic BEL by 1.72%. Conclusions: We propose an unsupervised entity linking method for the medical domain, using the MIL training paradigm. We annotate a test set for evaluation. The experimental results show that our model performs better than the basic BEL model and provides insight for future research.
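An embedding similarity-based linker of the kind the abstract calls BEL can be sketched as picking the entity whose vector is closest to the mention vector. The abstract does not say which similarity measure BEL uses, so cosine similarity is an assumption here, and the three-dimensional vectors and entity labels are invented for illustration rather than real ICD10 codes or learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_by_similarity(mention_vec, entity_vecs):
    """Return the entity label whose embedding is most similar to the mention."""
    return max(entity_vecs, key=lambda label: cosine(mention_vec, entity_vecs[label]))


# Hypothetical entity embeddings keyed by made-up labels.
TOY_ENTITIES = {
    "entity_a": [1.0, 0.0, 0.0],
    "entity_b": [0.0, 1.0, 0.2],
}

# A mention embedding that points mostly along the second axis.
linked = link_by_similarity([0.1, 0.9, 0.3], TOY_ENTITIES)
```

In the described pipeline, a linker like this would only provide the noisy starting labels; the MIL-trained ranking network is what the authors report as improving on it.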


2021 ◽  
Author(s):  
Martin Gerlach ◽  
Marshall Miller ◽  
Rita Ho ◽  
Kosta Harlan ◽  
Djellel Difallah

2021 ◽  
Author(s):  
Jingru Gan ◽  
Jinchang Luo ◽  
Haiwei Wang ◽  
Shuhui Wang ◽  
Wei He ◽  
...  

2021 ◽  
Vol 37 (4) ◽  
pp. 365-402
Author(s):  
Han Li ◽  
Yash Govind ◽  
Sidharth Mudgal ◽  
Theodoros Rekatsinas ◽  
AnHai Doan

Semantic matching finds certain types of semantic relationships among schema/data constructs. Examples include entity matching, entity linking, coreference resolution, schema/ontology matching, semantic text similarity, textual entailment, question answering, and tagging. Semantic matching has received much attention in the database, AI, KDD, Web, and Semantic Web communities. Recently, many works have also applied deep learning (DL) to semantic matching. In this paper we survey this fast-growing topic. We define the semantic matching problem, categorize its variations into a taxonomy, and describe important applications. We describe DL solutions for important variations of semantic matching. Finally, we discuss future R&D directions.


2021 ◽  
pp. 229-258
Author(s):  
Sina Menzel ◽  
Hannes Schnaitter ◽  
Josefine Zinck ◽  
Vivien Petras ◽  
Clemens Neudecker ◽  
...  
