Multi-Distribution Characteristics Based Chinese Entity Synonym Extraction from The Web

2019 ◽  
Vol 15 (3) ◽  
pp. 42-63
Author(s):  
Xiuxia Ma ◽  
Xiangfeng Luo ◽  
Subin Huang ◽  
Yike Guo

Entity synonyms play an important role in natural language processing applications, such as query expansion and question answering. There are three main distribution characteristics of entity synonyms in web texts: 1) appearing in parallel structures; 2) occurring with specific patterns in sentences; and 3) being distributed in similar contexts. The first and second characteristics rely on reliable prior knowledge and are susceptible to data sparsity, bringing high accuracy but low recall to synonym extraction. The third may lead to high recall but low accuracy, since it identifies a somewhat loose semantic similarity. Existing methods, such as context-based and pattern-based methods, consider only one characteristic for synonym extraction and rarely take their complementarity into account. To increase recall, this article proposes a novel extraction framework that combines the three characteristics for extracting synonyms from the web, where an Entity Synonym Network (ESN) is built to incorporate synonymous knowledge. To improve accuracy, the article treats synonym detection as a ranking problem and uses the Spreading Activation model as the ranking means to detect hard noise in the ESN. Experimental results show the proposed method achieves better accuracy and recall than state-of-the-art methods.
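The spreading-activation ranking idea can be sketched as follows; the toy graph, node names, decay factor, and iteration count are illustrative assumptions, not the paper's actual ESN or parameters.

```python
# Minimal sketch of spreading activation over a small synonym graph:
# activation flows from seed entities along weighted edges, so weakly
# connected (noisy) candidates end up with low scores.
from collections import defaultdict

def spread_activation(graph, seeds, decay=0.5, iterations=3):
    """Propagate activation from seed nodes along weighted edges."""
    activation = defaultdict(float)
    for node, value in seeds.items():
        activation[node] = value
    for _ in range(iterations):
        new_activation = defaultdict(float, activation)
        for node, value in activation.items():
            for neighbor, weight in graph.get(node, []):
                # Each neighbor receives a decayed share of the node's activation.
                new_activation[neighbor] += decay * weight * value
        activation = new_activation
    return dict(activation)

# Tiny illustrative graph: candidate synonyms linked to a seed entity.
graph = {
    "seed_entity": [("true_synonym", 0.9), ("noisy_candidate", 0.2)],
    "true_synonym": [("seed_entity", 0.9)],
}
scores = spread_activation(graph, {"seed_entity": 1.0})
```

Ranking candidates by `scores` then separates strongly linked synonyms from hard noise, which is the role the Spreading Activation model plays in the abstract above.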

2018 ◽  
Author(s):  
Wesley W. O. Souza ◽  
Diorge Brognara ◽  
João A. Leite ◽  
Estevam R. Hruschka Jr.

With advances in machine learning, natural language processing, processing speed, and data storage capacity, conversational agents are being used in applications that were not possible just a few years ago. NELL, a machine learning agent that learns to read the web, today has a considerably large ontology, and while it can be used for multiple fact queries, it is also possible to expand it further and specialize its knowledge. One of the first steps toward this goal is to refine the existing knowledge in NELL’s knowledge base so that future communication between it and humans is as natural as possible. This work describes the results of an experiment in which we investigate which machine learning algorithm performs best at the task of classifying candidate words into subcategories in the NELL knowledge base.


Author(s):  
Rafael E. Banchs ◽  
Carlos G. Rodríguez Penagos

The main objective of this chapter is to present a general overview of the most relevant applications of text mining and natural language processing technologies evolving and emerging around the Web 2.0 phenomenon (such as automatic categorization, document summarization, question answering, dialogue management, opinion mining, sentiment analysis, outlier identification, misbehavior detection, and social estimation and forecasting) along with the main challenges and new research opportunities that are directly and indirectly derived from them.


2018 ◽  
Vol 7 (3.6) ◽  
pp. 379
Author(s):  
S Jayalakshmi ◽  
Ananthi Sheshaayee

Information retrieval from web sources grows day by day, and providing users with an effective and efficient way to retrieve relevant documents from the web is an art. Asking the right question and retrieving the right answer to the posted query is a service provided by natural language processing. A Question Answering System (QAS) is one of the best ways to identify the candidate answer with high accuracy. The proposed web- and semantic-knowledge-driven QAS determines the candidate answer for the posted query using NLP tools. The method includes query expansion techniques and an entity linking method to identify information-source snippets with an ontology structure, and it ranks sentences by applying the conditional probability between query and answer to identify the optimal answer from the web corpus. The result provides an exact answer with higher accuracy than the baseline method.
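The conditional-probability ranking step can be illustrated with a deliberately simple estimate; the smoothed term-match product below is an illustrative stand-in, not the authors' exact model.

```python
# Hedged sketch: rank candidate sentences by a crude estimate of
# P(sentence | query), computed as a smoothed product of per-term matches.
import re

def conditional_score(query, sentence, alpha=0.1):
    """Approximate P(sentence | query) from query-term coverage."""
    q_terms = re.findall(r"\w+", query.lower())
    s_terms = set(re.findall(r"\w+", sentence.lower()))
    score = 1.0
    for term in q_terms:
        # Additive smoothing keeps the product nonzero for partial matches.
        score *= (1.0 if term in s_terms else 0.0) + alpha
    return score

candidates = [
    "Paris is the capital of France.",
    "France exports wine and cheese.",
]
best = max(candidates, key=lambda s: conditional_score("capital of France", s))
```

Sentences covering more query terms score higher, so the argmax picks the snippet most likely to contain the answer.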



AI Magazine ◽  
2019 ◽  
Vol 40 (3) ◽  
pp. 67-78
Author(s):  
Guy Barash ◽  
Mauricio Castillo-Effen ◽  
Niyati Chhaya ◽  
Peter Clark ◽  
Huáscar Espinoza ◽  
...  

The workshop program of the Association for the Advancement of Artificial Intelligence’s 33rd Conference on Artificial Intelligence (AAAI-19) was held in Honolulu, Hawaii, on Sunday and Monday, January 27–28, 2019. There were sixteen workshops in the program: Affective Content Analysis: Modeling Affect-in-Action, Agile Robotics for Industrial Automation Competition, Artificial Intelligence for Cyber Security, Artificial Intelligence Safety, Dialog System Technology Challenge, Engineering Dependable and Secure Machine Learning Systems, Games and Simulations for Artificial Intelligence, Health Intelligence, Knowledge Extraction from Games, Network Interpretability for Deep Learning, Plan, Activity, and Intent Recognition, Reasoning and Learning for Human-Machine Dialogues, Reasoning for Complex Question Answering, Recommender Systems Meet Natural Language Processing, Reinforcement Learning in Games, and Reproducible AI. This report contains brief summaries of all the workshops that were held.


Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has created the need for fact-checking organizations, groups that seek out claims and comment on their veracity, which have sprung up worldwide to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations currently in operation, disinformation continues to run rampant throughout the web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science that use natural language processing to automate fact checking. It follows the entire process of automated fact checking using natural language processing, from detecting claims to checking facts to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement before widespread use.
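The pipeline shape the survey follows (claim detection, then verification, then output) can be caricatured in a few lines; the cue-word heuristic and the tiny evidence store are illustrative assumptions, not any surveyed system.

```python
# Toy sketch of an automated fact-checking pipeline:
# 1) detect check-worthy claims, 2) verify them against evidence, 3) report.
evidence = {
    "the earth orbits the sun": True,
    "the sun orbits the earth": False,
}

def detect_claims(text):
    """Naive claim detection: keep sentences containing a factual verb cue."""
    cues = ("is", "are", "orbits", "causes")
    sentences = [s.strip() for s in text.lower().split(".") if s.strip()]
    return [s for s in sentences if any(f" {c} " in f" {s} " for c in cues)]

def check(claim):
    """Look the claim up in the evidence store; None means unverifiable."""
    return evidence.get(claim)

claims = detect_claims("The sun orbits the earth. Hello there.")
verdicts = {c: check(c) for c in claims}
```

Real systems replace each stage with learned models and retrieval over large corpora, but the staged structure is the same one the paper walks through.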


2021 ◽  
pp. 1-11
Author(s):  
Zhinan Gou ◽  
Yan Li

With the development of Web 2.0 communities, information retrieval has been widely applied on top of collaborative tagging systems. However, a user often issues only a brief query with one or two keywords, which leads to a series of problems such as inaccurate query words, information overload, and information disorientation. Query expansion addresses this issue by reformulating each search query with additional words. By analyzing the limitations of existing query expansion methods in folksonomy, this paper proposes a novel query expansion method for search in folksonomy, based on a user profile and a topic model. In detail, the topic model is first constructed by a variational autoencoder with Word2Vec. Then, query expansion is conducted using the user profile and the topic model. Finally, the proposed method is evaluated on a real dataset. Evaluation results show that the proposed method outperforms the baseline methods.
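The expansion step can be sketched with toy data; the hand-written topic-word weights and profile scores below stand in for the VAE/Word2Vec topic model and tag-based user profile described above.

```python
# Minimal sketch of profile- and topic-based query expansion: score topics by
# query overlap blended with the user's profile, then append the best topic's
# top words to the query.
topics = {
    "photography": {"camera": 0.9, "lens": 0.8, "exposure": 0.6},
    "cooking": {"recipe": 0.9, "oven": 0.7, "flour": 0.5},
}
user_profile = {"photography": 0.8, "cooking": 0.2}  # tag-derived interests

def expand_query(query_terms, topics, profile, k=2):
    """Append the top-k new words from the best-matching topic."""
    def topic_score(name):
        overlap = sum(topics[name].get(t, 0.0) for t in query_terms)
        return overlap + profile.get(name, 0.0)  # blend query match and profile
    best_topic = max(topics, key=topic_score)
    ranked = sorted(topics[best_topic], key=topics[best_topic].get, reverse=True)
    return list(query_terms) + [w for w in ranked if w not in query_terms][:k]

expanded = expand_query(["camera"], topics, user_profile)
```

A one-word query thus gains related terms from the user's dominant topic, mitigating the brief-query problem the abstract describes.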


Poetics ◽  
1990 ◽  
Vol 19 (1-2) ◽  
pp. 99-120
Author(s):  
Stefan Wermter ◽  
Wendy G. Lehnert

Author(s):  
Horacio Saggion

Over the past decades, information has been made available to a broad audience thanks to the availability of texts on the Web. However, understanding the wealth of information contained in texts can pose difficulties for a number of people, including those with poor literacy, cognitive or linguistic impairment, or limited knowledge of the language of the text. Text simplification was initially conceived as a technology to simplify sentences so that they would be easier to process by natural-language processing components such as parsers. Nowadays, however, automatic text simplification is conceived as a technology to transform a text into an equivalent which is easier to read and to understand by a target user. Text simplification concerns both the modification of the vocabulary of the text (lexical simplification) and the modification of the structure of its sentences (syntactic simplification). In this chapter, after briefly introducing the topic of text readability, we give an overview of past and recent methods to address these two problems. We also describe simplification applications and full systems, and outline language resources and evaluation approaches.
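Lexical simplification, the first of the two problems above, can be sketched as frequency-based substitution; the synonym table, counts, and threshold are toy assumptions, whereas real systems draw on resources such as WordNet and large corpora.

```python
# Hedged sketch of lexical simplification: replace rare ("complex") words
# with their most frequent known synonym, on the premise that more frequent
# words are easier to read.
frequencies = {"use": 900, "utilize": 40, "help": 800, "facilitate": 30}
synonyms = {"utilize": ["use"], "facilitate": ["help"]}

def simplify(sentence, freq_threshold=100):
    out = []
    for word in sentence.split():
        if frequencies.get(word, freq_threshold) < freq_threshold and word in synonyms:
            # Pick the most frequent known synonym as the simpler substitute.
            out.append(max(synonyms[word], key=lambda w: frequencies.get(w, 0)))
        else:
            out.append(word)
    return " ".join(out)

simplified = simplify("we utilize tools to facilitate work")
```

Syntactic simplification would additionally restructure sentences (e.g., splitting relative clauses), which this word-level sketch does not attempt.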

