WikiIdRank++: EXTENSIONS AND IMPROVEMENTS OF THE WikiIdRank SYSTEM FOR ENTITY LINKING

The amount of information available on the Web has grown considerably in recent years, creating a need to structure it so that it can be accessed quickly and accurately. The Knowledge Base Population (KBP) track of the Text Analysis Conference (TAC) was created to foster techniques that automate this structuring process. This forum aims to encourage research on automated systems capable of capturing knowledge from unstructured information. One of the tasks proposed in the KBP track is named entity linking, whose goal is to link named entities mentioned in a document to instances in a reference knowledge base built from Wikipedia. This paper focuses on the entity linking task in KBP 2010, which considered two varieties of the task, depending on whether the use of Wikipedia text was allowed. Specifically, the paper proposes a set of modifications to WikiIdRank, a system that participated in KBP 2010, in order to improve its performance. The modifications were evaluated on the official KBP 2010 corpus, showing that the best combination increases the accuracy of the initial system by 7.04%. Although the resulting system, named WikiIdRank++, is unsupervised and does not take advantage of Wikipedia text, a comparison with other approaches in KBP indicates that it would rank 4th (out of 16) in the global comparison, outperforming other approaches that use human supervision and exploit Wikipedia textual content. Furthermore, the system would rank 1st in the category of systems that do not use Wikipedia text.

Using FRED for Named Entity Resolution, Linking and Typing for Knowledge Base Population

Semantic Web Evaluation Challenges - Communications in Computer and Information Science, 2015, pp. 40-50. DOI: 10.1007/978-3-319-25518-7_4

Cited By ~ 7

Author(s): Sergio Consoli, Diego Reforgiato Recupero

Keyword(s): Knowledge Base, Entity Resolution, Base Population, Named Entity, Knowledge Base Population

Joint Entity Linking and Relation Extraction with Neural Networks for Knowledge Base Population

2020 International Joint Conference on Neural Networks (IJCNN), 2020. DOI: 10.1109/ijcnn48605.2020.9207021

Author(s): Zhenyu Zhang, Xiaobo Sind, Tingwen Liu, Zheng Fang, Quangang Li

Keyword(s): Neural Networks, Knowledge Base, Relation Extraction, Base Population, Entity Linking, Knowledge Base Population

Knowledge base population from external data sources

2021. DOI: 10.14711/thesis-991012980225303412

Author(s): Xueling Lin

Keyword(s): Knowledge Base, Data Sources, Base Population, External Data, Knowledge Base Population

Enhancing Scientific Collaboration Through Knowledge Base Population and Linking for Meetings

Proceedings of the 51st Hawaii International Conference on System Sciences, 2018. DOI: 10.24251/hicss.2018.076

Author(s): Ning Gao, Mark Dredze, Douglas Oard

Keyword(s): Knowledge Base, Scientific Collaboration, Base Population, Knowledge Base Population

Domain-specific Evaluation Dataset Generator for Multilingual Text Analysis

Journal of Intelligent Systems with Applications, 2019, pp. 140-147. DOI: 10.54856/jiswa.201912084

Author(s): Emrah Inan, Vahab Mostafapour, Fatif Tekbacak

Keyword(s): Text Analysis, General Purpose, Entity Linking, Named Entity, Domain Specific, Benchmark Datasets, Concise Information, Multilingual Text, The Given, Specific Evaluation

The Web makes it possible to retrieve concise information about specific entities, including people, organizations, movies, and their features. However, a large share of Web resources is unstructured, which makes it difficult to find critical information about specific entities. Text analysis approaches such as Named Entity Recognition and Entity Linking aim to identify entities and link them to the relevant entries in a given knowledge base. Many general-purpose benchmark datasets exist for evaluating these approaches. However, domain-specific approaches are difficult to evaluate because evaluation datasets for specific domains are lacking. This study presents WeDGeM, a multilingual evaluation set generator for specific domains that exploits Wikipedia category pages and the DBpedia hierarchy. In addition, Wikipedia disambiguation pages are used to adjust the ambiguity level of the generated texts. Using the generated test data, well-known Entity Linking systems that support Turkish texts are evaluated in a use case in the movie domain.