entity disambiguation

2021 ◽  
Vol 58 (1) ◽  
pp. 520-524
Author(s):  
Katherine Louise Polley ◽  
Vivian Teresa Tompkins ◽  
Brendan John Honick ◽  
Jian Qin

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Jiangtao Ma ◽  
Duanyang Li ◽  
Yonggang Chen ◽  
Yaqiong Qiao ◽  
Haodong Zhu ◽  
...  

The purpose of knowledge graph entity disambiguation is to match ambiguous entity mentions to the corresponding entities in the knowledge graph. Current entity disambiguation methods usually use the context information of the entity and its attributes to obtain a mention embedding vector, compare it with the candidate entity embedding vectors for similarity, and perform entity matching based on that similarity. The disadvantage of this type of method is that it ignores the structural characteristics of the knowledge graph in which the entity is located, that is, the connections between entities, and therefore cannot capture the global semantic features of the entity. To improve the Precision and Recall of entity disambiguation, we propose the EDEGE (Entity Disambiguation based on Entity and Graph Embedding) method, which uses the semantic embedding vector of entity relationships and the embedding vector of subgraph structure features. EDEGE first trains the semantic vector of the entity relationship, then trains the graph structure vector of the subgraph where the entity is located, and balances the weights of these two vectors through the entity similarity function. Finally, the balanced vector is input into the graph neural network, which outputs the matching between entities to achieve entity disambiguation. Extensive experimental results demonstrate the effectiveness of the proposed method. On the ACE2004 data set, the Precision, Recall, and F1 values of EDEGE are 9.2%, 7%, and 11.2% higher than those of the baseline methods.
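
The similarity-based matching described in this abstract can be illustrated with a minimal sketch. It is not the EDEGE implementation: the abstract does not give the entity similarity function or the graph neural network, so the sketch below simply assumes a fixed weight `alpha` (hypothetical) that balances cosine similarity over a semantic vector and a structure vector. The vectors, entity names, and the function `rank_candidates` are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(mention, candidates, alpha=0.5):
    """Rank candidate entities by a weighted mix of two similarities.

    `alpha` is a hypothetical balancing weight, not a value from the paper.
    Each entity is a dict with a 'semantic' and a 'structure' vector.
    """
    scores = {}
    for name, cand in candidates.items():
        semantic_sim = cosine(mention["semantic"], cand["semantic"])
        structure_sim = cosine(mention["structure"], cand["structure"])
        scores[name] = alpha * semantic_sim + (1 - alpha) * structure_sim
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy vectors, made up for illustration only.
mention = {"semantic": [1.0, 0.0], "structure": [0.0, 1.0]}
candidates = {
    "Paris_(city)": {"semantic": [1.0, 0.0], "structure": [0.0, 1.0]},
    "Paris_(mythology)": {"semantic": [0.0, 1.0], "structure": [1.0, 0.0]},
}
ranking = rank_candidates(mention, candidates)
```

In this toy setup the first entry of `ranking` is the candidate whose two vectors both align with the mention; a learned model would replace the fixed `alpha` and the raw cosine scores.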


Author(s):  
Emrah Inan ◽  
Oguz Dikenelli

General entity linking systems usually leverage the global coherence of all the mapped entities in the same document by using semantic embeddings and graph-based approaches. However, graph-based approaches are computationally expensive for open-domain datasets. In this paper, the authors address this problem by presenting an RDF embedding-based seq2seq entity linking method for specific domains. They filter candidate entities of mentions that have similar meanings by using the domain information of the annotated pairs. They resolve highly ambiguous pairs by using bi-directional long short-term memory (Bi-LSTM) and an attention mechanism for entity disambiguation. To evaluate the system against baseline methods, they generate a dataset covering the book, music, and movie categories. They achieved scores of 0.55 (Mi-F1), 0.586 (Ma-F1), 0.846 (Mi-F1), and 0.87 (Ma-F1) for the high- and low-ambiguity datasets. They also compare the method with existing methods on recent datasets (WNED-CWEB). Given its domain specificity, the proposed method tends to achieve competitive results on domain-oriented datasets.
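
The Mi-F1 and Ma-F1 scores reported above can be read with the following sketch, which assumes they denote micro- and macro-averaged F1. The abstract does not say how results are grouped for macro averaging, so grouping by category here is an assumption, and the counts are invented for illustration (they are not the paper's results).

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """Return (micro_f1, macro_f1) for a dict of group -> (tp, fp, fn).

    Micro-F1 pools the counts of all groups before computing F1;
    macro-F1 computes F1 per group and takes the unweighted mean.
    """
    total_tp = sum(c[0] for c in counts.values())
    total_fp = sum(c[1] for c in counts.values())
    total_fn = sum(c[2] for c in counts.values())
    micro = f1(total_tp, total_fp, total_fn)
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro


# Hypothetical per-category counts (tp, fp, fn), for illustration only.
counts = {"book": (8, 2, 2), "music": (1, 1, 3), "movie": (6, 0, 2)}
micro, macro = micro_macro_f1(counts)
```

Because the small "music" group performs poorly in this toy data, the macro average falls below the micro average, which is the usual reason both figures are reported.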


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 998
Author(s):  
Peng Zhang ◽  
Yi Bu ◽  
Peng Jiang ◽  
Xiaowen Shi ◽  
Bing Lun ◽  
...  

This study builds a coronavirus knowledge graph (KG) by merging two information sources. The first source is Analytical Graph (AG), which integrates more than 20 different public datasets related to drug discovery. The second source is CORD-19, a collection of published scientific articles related to COVID-19. We combined chemogenomic entities in AG with entities extracted from CORD-19 to expand knowledge in the COVID-19 domain. Before populating the KG with those entities, we performed entity disambiguation on the CORD-19 collection using Wikidata. Our newly built KG contains at least 21,700 genes, 2500 diseases, 94,000 phenotypes, and other biological entities (e.g., compounds, species, and cell lines). We define 27 relationship types and use them to label each edge in our KG. This research presents two cases to evaluate the KG’s usability: analyzing a subgraph (ego-centered network) from the angiotensin-converting enzyme (ACE) and revealing paths between biological entities (hydroxychloroquine and IL-6 receptor; chloroquine and STAT1). The ego-centered network captured information related to COVID-19. We also found significant COVID-19-related information in top-ranked paths with a depth of three based on our path evaluation.
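
The two evaluation cases (an ego-centered network and paths of bounded depth between two entities) can be sketched on a toy undirected graph. This is only an illustration of the general graph operations: the node names below are placeholders rather than entities or relations from the actual KG, the function names are hypothetical, and the paper's path-ranking method is not reproduced.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def ego_network(graph, center):
    """Edges among the center node and its direct neighbours."""
    nodes = {center} | graph[center]
    return {(a, b) for a in nodes for b in graph[a] if b in nodes and a < b}


def paths_up_to_depth(graph, source, target, max_depth=3):
    """All simple paths from source to target with at most max_depth edges."""
    paths = []
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_depth:
            continue  # already used the allowed number of edges
        for neighbour in graph[node]:
            if neighbour not in path:  # keep paths simple (no repeated nodes)
                stack.append(path + [neighbour])
    return sorted(paths, key=len)


# Placeholder edges, invented for illustration only.
edges = [
    ("drug_A", "gene_B"), ("gene_B", "protein_C"), ("drug_A", "protein_D"),
    ("protein_D", "gene_E"), ("gene_E", "protein_C"), ("gene_B", "protein_D"),
]
graph = build_graph(edges)
ego = ego_network(graph, "drug_A")
paths = paths_up_to_depth(graph, "drug_A", "protein_C", max_depth=3)
```

On a graph of the size reported in the abstract, a graph library or database would be a more practical choice than this exhaustive search.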


Author(s):  
Shengchen Jiang ◽  
Yantuan Xian ◽  
Hongbin Wang ◽  
Zhiju Zhang ◽  
Huaqin Li ◽  
...  

Entity disambiguation is extremely important in knowledge construction. The word representation model ignores the influence of word order on the sentence or text information. Thus, we propose a domain entity disambiguation method that fuses the doc2vec and LDA topic models. In this study, the doc2vec document representation model is used to obtain vector forms of the entity mention and the candidate entity from the domain corpus and the knowledge base, respectively. Moreover, the context similarity and category referential similarity are calculated based on a constructed domain knowledge base of upper and lower (hypernym-hyponym) relations. The LDA topic model and the doc2vec model are used to obtain word representations for the different meanings of polysemous words. We use the k-means algorithm to cluster the word vectors under different topics to obtain the topic domain keywords of the text, and calculate similarities under the domain keywords of the different topics. Finally, the similarities of the three feature types are merged, and the candidate entity with the highest similarity is selected as the final target entity. The experimental results demonstrate that the proposed method outperforms the existing model, which supports its feasibility and effectiveness.
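
The final step, merging three similarity features and selecting the best candidate, can be sketched as follows. The abstract does not state how the similarities are merged, so the equal weights are an assumption, and the scores, entity names, and the function `pick_entity` are hypothetical.

```python
def pick_entity(similarities, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return the candidate with the highest weighted sum of its similarities.

    `similarities` maps a candidate name to a tuple of
    (context, category, topic-keyword) similarity scores.
    The equal default weights are an assumption, not taken from the paper.
    """
    def fused(features):
        return sum(w * s for w, s in zip(weights, features))

    return max(similarities, key=lambda name: fused(similarities[name]))


# Made-up similarity scores, for illustration only.
similarities = {
    "Apple_(company)": (0.9, 0.8, 0.7),
    "Apple_(fruit)": (0.4, 0.9, 0.2),
}
best = pick_entity(similarities)
```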

