Linking Named Entity in a Question with DBpedia Knowledge Base

Author(s):  
Huiying Li ◽  
Jing Shi


Author(s):  
Yu Gong ◽  
Xusheng Luo ◽  
Yu Zhu ◽  
Wenwu Ou ◽  
Zhao Li ◽  
...  

Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt models such as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce, where slot labels are more informative and carry richer expressions. In this work, inspired by the unique structure of the E-commerce knowledge base, we propose a novel multi-task model with cascade and residual connections that jointly learns segment tagging, named entity tagging, and slot filling. Experiments show the effectiveness of the proposed cascade and residual structures. Our model shows a 14.6% advantage in F1 score over strong baseline methods on a new Chinese E-commerce shopping assistant dataset, while achieving competitive accuracy on a standard dataset. Furthermore, an online test deployed on a dominant E-commerce platform shows a 130% improvement in the accuracy of understanding user utterances. The model has already gone into production on the E-commerce platform.
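The cascade and residual structure described in the abstract can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation: the layer sizes, label-set sizes, and the exact way lower-level task outputs feed higher-level heads are assumptions, and the CRF decoding layers are omitted for brevity.

```python
# Minimal sketch of a cascaded multi-task tagger with residual connections.
# Hypothetical layer sizes and label sets; CRF layers on each head are omitted.
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128,
                 n_seg=3, n_ner=9, n_slot=30):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        enc_dim = 2 * hidden
        self.seg_head = nn.Linear(enc_dim, n_seg)                     # segment tagging
        self.ner_head = nn.Linear(enc_dim + n_seg, n_ner)             # cascaded on segment logits
        self.slot_head = nn.Linear(enc_dim + n_seg + n_ner, n_slot)   # cascade + residual

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))                 # (batch, seq, enc_dim)
        seg = self.seg_head(h)
        ner = self.ner_head(torch.cat([h, seg], dim=-1))           # lower task feeds the next head
        slot = self.slot_head(torch.cat([h, seg, ner], dim=-1))    # residual: raw encoder states reused
        return seg, ner, slot

# Joint training would sum the three per-task (cross-entropy or CRF) losses.
```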


Author(s):  
Dat Ba Nguyen ◽  
Martin Theobald ◽  
Gerhard Weikum

Methods for Named Entity Recognition and Disambiguation (NERD) typically perform NER and NED in two separate stages. Therefore, NED may be penalized with respect to precision by NER false positives, and suffers in recall from NER false negatives. Moreover, NED does not fully exploit information computed by NER, such as the types of mentions. This paper presents J-NERD, a new approach that performs NER and NED jointly, by means of a probabilistic graphical model that captures mention spans, mention types, and the mapping of mentions to entities in a knowledge base. We present experiments with different kinds of texts from the CoNLL’03, ACE’05, and ClueWeb’09-FACC1 corpora. J-NERD consistently outperforms state-of-the-art competitors in end-to-end NERD precision, recall, and F1.
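The benefit of joint decoding over a two-stage pipeline can be illustrated with a toy example: span, type, and entity are chosen together under a compatibility term, so the decision does not decompose into independent NER and NED steps. All candidates and scores below are invented for illustration and are not J-NERD's actual factors or weights.

```python
# Toy joint NER+NED decoding: pick the (span, type, entity) triple that maximizes
# a combined score including a compatibility term (the "joint" part).
from itertools import product

span_scores = {"Page": 0.6, "Larry Page": 0.5}                      # candidate mention spans
type_scores = {"PER": 0.5, "ORG": 0.5}                              # candidate mention types
entity_scores = {"dbpedia:Larry_Page": 0.7, "dbpedia:Jimmy_Page": 0.3}
# Compatibility between span, predicted type, and KB entity (illustrative values).
compatible = {
    ("Larry Page", "PER", "dbpedia:Larry_Page"): 1.0,
    ("Page", "PER", "dbpedia:Jimmy_Page"): 0.4,
}

best = max(
    product(span_scores, type_scores, entity_scores),
    key=lambda t: (span_scores[t[0]] + type_scores[t[1]]
                   + entity_scores[t[2]] + compatible.get(t, 0.0)),
)
print(best)   # ('Larry Page', 'PER', 'dbpedia:Larry_Page')
```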


2013 ◽  
Vol 22 (03) ◽  
pp. 1350018 ◽  
Author(s):  
M. D. JIMÉNEZ ◽  
N. FERNÁNDEZ ◽  
J. ARIAS FISTEUS ◽  
L. SÁNCHEZ

The amount of information available on the Web has grown considerably in recent years, leading to the need to structure it in order to access it in a quick and accurate way. In order to develop techniques to automate the structuring process, the Knowledge Base Population (KBP) track of the Text Analysis Conference (TAC) was created. This forum aims to encourage research in automated systems capable of capturing knowledge from unstructured information. One of the tasks proposed in the context of the KBP track is named entity linking, whose goal is to link named entities mentioned in a document to instances in a reference knowledge base built from Wikipedia. This paper focuses on the entity linking task in the context of KBP 2010, where two different varieties of this task were considered, depending on whether the use of the text from Wikipedia was allowed or not. Specifically, the paper proposes a set of modifications to a system that participated in KBP 2010, named WikiIdRank, in order to improve its performance. The different modifications were evaluated on the official KBP 2010 corpus, showing that the best combination increases the accuracy of the initial system by 7.04%. Though the resulting system, named WikiIdRank++, is unsupervised and does not take advantage of Wikipedia text, a comparison with other approaches in KBP indicates that the system would rank 4th (out of 16) in the global comparison, outperforming other approaches that use human supervision and take advantage of Wikipedia textual contents. Furthermore, the system would rank 1st in the category of systems that do not use Wikipedia text.
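The KBP entity-linking setting itself can be sketched as below: generate candidate knowledge-base entries whose names match a mention, rank them, and return NIL when the mention has no referent in the KB. This is a generic sketch of the task, not the WikiIdRank++ algorithm; the toy KB, the alias matching, and the context-overlap ranking heuristic are all assumptions.

```python
# Generic entity-linking sketch: name-based candidate generation, context-overlap
# ranking, and a NIL answer when no KB entry matches.
def link(mention, doc_tokens, kb):
    """kb: dict mapping entity id -> {"name": str, "aliases": set, "context": set}."""
    candidates = [eid for eid, e in kb.items()
                  if mention.lower() == e["name"].lower()
                  or mention.lower() in {a.lower() for a in e["aliases"]}]
    if not candidates:
        return "NIL"                      # mention has no referent in the reference KB
    doc = {t.lower() for t in doc_tokens}
    return max(candidates, key=lambda eid: len(doc & kb[eid]["context"]))

kb = {
    "E1": {"name": "Georgia", "aliases": {"State of Georgia"},
           "context": {"atlanta", "usa", "state"}},
    "E2": {"name": "Georgia", "aliases": {"Republic of Georgia"},
           "context": {"tbilisi", "caucasus", "country"}},
}
print(link("Georgia", "Protests erupted in Tbilisi , Georgia".split(), kb))  # -> E2
```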


2015 ◽  
Vol 22 (3) ◽  
pp. 423-456 ◽  
Author(s):  
MENA B. HABIB ◽  
MAURICE VAN KEULEN

Twitter is a rich source of continuously and instantly updated information. The shortness and informality of tweets are challenges for Natural Language Processing tasks. In this paper, we present TwitterNEED, a hybrid approach to Named Entity Extraction and Named Entity Disambiguation for tweets. We believe that disambiguation can help to improve the extraction process; this mimics the way humans understand language and reduces error propagation in the whole system. Our extraction approach aims for high extraction recall first, after which a Support Vector Machine attempts to filter out false positives among the extracted candidates, using features derived from the disambiguation phase in addition to other word-shape and Knowledge Base features. For Named Entity Disambiguation, we obtain a list of entity candidates from the YAGO Knowledge Base in addition to top-ranked pages from the Google search engine for each extracted mention. We use a Support Vector Machine to rank the candidate pages according to a set of URL and context similarity features. For evaluation, five data sets are used to evaluate the extraction approach, and three of them to evaluate both the disambiguation approach and the combined extraction and disambiguation approach. Experiments show better results compared to our competitors DBpedia Spotlight, Stanford Named Entity Recognition, and the AIDA disambiguation system.
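The SVM-based candidate ranking described above can be illustrated with a minimal sketch: each (mention, candidate) pair gets a feature vector, an SVM is trained on correct/incorrect pairs, and candidates are ranked by the decision score at prediction time. The two features and the toy training data below are placeholders, not the TwitterNEED feature set.

```python
# Minimal SVM candidate-ranking sketch (toy features: URL similarity, context similarity).
import numpy as np
from sklearn.svm import SVC

# Toy training pairs: [url_similarity, context_similarity] -> 1 if the candidate is correct.
X_train = np.array([[0.9, 0.8], [0.2, 0.1], [0.7, 0.6], [0.1, 0.4]])
y_train = np.array([1, 0, 1, 0])

ranker = SVC(kernel="linear").fit(X_train, y_train)

def rank_candidates(candidates):
    """candidates: list of (candidate_id, feature_vector); highest-scoring first."""
    scores = ranker.decision_function(np.array([f for _, f in candidates]))
    return sorted(zip((c for c, _ in candidates), scores), key=lambda x: -x[1])

print(rank_candidates([("yago:Paris", [0.8, 0.7]), ("yago:Paris_Hilton", [0.3, 0.2])]))
```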


2019 ◽  
Vol 06 (04) ◽  
pp. 471-487 ◽  
Author(s):  
Ngoc-Vu Nguyen ◽  
Thi-Lan Nguyen ◽  
Cam-Van Nguyen Thi ◽  
Mai-Vu Tran ◽  
Tri-Thanh Nguyen ◽  
...  

Named entity recognition (NER) is a fundamental task that affects the performance of dependent tasks, e.g. machine translation. Lifelong machine learning (LML) is a continuous learning process in which the knowledge base accumulated from previous tasks is used to improve future learning tasks that have few samples. Since there are few studies on LML based on deep neural networks for NER, especially in Vietnamese, we propose a lifelong learning model based on deep learning with a CRF layer, named DeepLML–NER, for NER in Vietnamese texts. DeepLML–NER includes an algorithm to extract the knowledge of “prefix features” of named entities in previous domains; the model then uses this knowledge base to solve the current NER task. Preprocessing and model parameter tuning are also investigated to improve performance. The effectiveness of the model was demonstrated by in-domain and cross-domain experiments, achieving promising results.
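A rough sketch of accumulating prefix knowledge across tasks is shown below: prefixes of entity tokens collected from earlier, label-rich domains become extra indicator features for a new domain. The exact definition of prefix features in DeepLML–NER may differ; the prefix length, tag scheme, and feature encoding here are assumptions.

```python
# Sketch of a lifelong "prefix feature" knowledge base for NER.
from collections import defaultdict

knowledge_base = defaultdict(set)   # entity type -> set of token prefixes seen so far

def accumulate(labelled_sentences, prefix_len=3):
    """labelled_sentences: iterable of [(token, BIO-tag), ...] from previous domains."""
    for sent in labelled_sentences:
        for token, tag in sent:
            if tag != "O":
                knowledge_base[tag.split("-")[-1]].add(token[:prefix_len].lower())

def prefix_features(token, prefix_len=3):
    """Indicator features to be consumed alongside embeddings by the tagger."""
    p = token[:prefix_len].lower()
    return {f"kb_prefix_{etype}": int(p in prefixes)
            for etype, prefixes in knowledge_base.items()}

accumulate([[("Hanoi", "B-LOC"), ("is", "O"), ("beautiful", "O")]])
print(prefix_features("Hanoian"))   # {'kb_prefix_LOC': 1}
```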


Author(s):  
N. Senthil Kumar ◽  
Dinakaran Muruganantham

The web and the social web hold a huge amount of unstructured data, which makes searching more cumbersome. The principal task is to turn this unstructured data into structured data through the appropriate use of named entity detection. The goal of the paper is to automatically build and store a deep knowledge base of important facts and to construct comprehensive details about those facts, such as their related named entities, the semantic classes of those entities, and their mutual relationships, so that their temporal context can be thoroughly analyzed and probed. In this paper, the authors propose a model to identify all the major interpretations of named entities and effectively link them to the appropriate mentions in the knowledge base (DBpedia). They finally evaluate approaches that uniquely identify the DBpedia URIs of the selected entities and eliminate the other candidate mentions based on the authority rankings of those candidates.
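As an illustration of retrieving DBpedia candidates for a mention and ordering them by a crude authority proxy (here, the count of inbound wiki links), a sketch against the public SPARQL endpoint is given below. This is not the authors' ranking method: the choice of proxy, the exact-label candidate generation, and the endpoint usage are assumptions made for the example, and exact-label matching is far more brittle than real candidate generation.

```python
# Illustrative DBpedia candidate retrieval and authority-style ranking via SPARQL.
import requests

ENDPOINT = "https://dbpedia.org/sparql"

def dbpedia_candidates(mention, limit=5):
    # Note: the mention is interpolated directly for brevity; a real system should escape it.
    query = f"""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dbo:  <http://dbpedia.org/ontology/>
    SELECT ?uri (COUNT(?inlink) AS ?authority) WHERE {{
      ?uri rdfs:label "{mention}"@en .
      OPTIONAL {{ ?inlink dbo:wikiPageWikiLink ?uri }}
    }}
    GROUP BY ?uri ORDER BY DESC(?authority) LIMIT {limit}
    """
    resp = requests.get(ENDPOINT, params={"query": query,
                                          "format": "application/sparql-results+json"})
    resp.raise_for_status()
    return [(b["uri"]["value"], int(b["authority"]["value"]))
            for b in resp.json()["results"]["bindings"]]

print(dbpedia_candidates("Apple Inc."))
```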


2017 ◽  
Vol 5 ◽  
pp. 233-246 ◽  
Author(s):  
Bhavana Dalvi Mishra ◽  
Niket Tandon ◽  
Peter Clark

Our goal is to construct a domain-targeted, high-precision knowledge base (KB), containing general (subject, predicate, object) statements about the world, in support of a downstream question-answering (QA) application. Despite recent advances in information extraction (IE) techniques, no suitable resource for our task exists; existing resources are either too noisy, too named-entity centric, or too incomplete, and typically have not been constructed with a clear scope or purpose. To address these shortcomings, we have created a domain-targeted, high-precision knowledge extraction pipeline, leveraging Open IE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high-precision knowledge targeted to a particular domain, in our case elementary science. To measure the KB’s coverage of the target domain’s knowledge (its “comprehensiveness” with respect to science), we measure recall with respect to an independent corpus of domain text, and show that our pipeline produces output with over 80% precision and 23% recall with respect to that target, a substantially higher coverage of tuple-expressible science knowledge than other comparable resources. We have made the KB publicly available.
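The comprehensiveness measurement described above can be sketched in simplified form: recall of the KB against (subject, predicate, object) tuples drawn from an independent corpus of domain text. The tuple sets and the normalization below are stand-ins; the paper's actual extraction and matching procedure is more involved.

```python
# Simplified "comprehensiveness" (recall) measurement of a KB against domain-text tuples.
def comprehensiveness(kb_tuples, corpus_tuples):
    """Fraction of corpus tuples that the KB covers (recall w.r.t. the corpus)."""
    normalize = lambda t: tuple(x.strip().lower() for x in t)
    kb = {normalize(t) for t in kb_tuples}
    corpus = {normalize(t) for t in corpus_tuples}
    return len(kb & corpus) / len(corpus) if corpus else 0.0

kb = [("butterfly", "has part", "wings"), ("water", "has state", "liquid")]
corpus = [("Butterfly", "has part", "wings"), ("magnet", "attracts", "iron")]
print(comprehensiveness(kb, corpus))   # 0.5
```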

