Identification of Pleonastic It Using the Web

2009 ◽  
Vol 34 ◽  
pp. 339-389 ◽  
Author(s):  
Y. Li ◽  
P. Musilek ◽  
M. Reformat ◽  
L. Wyard-Scott

In a significant minority of cases, certain pronouns, especially the pronoun it, can be used without referring to any specific entity. This phenomenon of pleonastic pronoun usage poses serious problems for systems aiming at even a shallow understanding of natural language texts. In this paper, a novel approach is proposed to identify such uses of it: the extrapositional cases are identified using a series of queries against the web, and the cleft cases are identified using a simple set of syntactic rules. The system is evaluated with four sets of news articles containing 679 extrapositional cases as well as 78 cleft constructs. The identification results are comparable to those obtained by human efforts.
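The rule-based side of such a system can be sketched as follows. These patterns are hypothetical stand-ins for illustration, not the paper's actual rules: in the paper the extrapositional cases are resolved with web queries, whereas here a small hand-picked adjective list stands in for that evidence, and the cleft rule is a heavily simplified syntactic test.

```python
import re

# Hypothetical patterns for detecting non-referential "it". The adjective
# list and both regexes are illustrative assumptions, not the authors' rules.
EXTRAPOSITION = re.compile(
    r"\bit\s+(?:is|was|seems|appears)\s+"
    r"(?:important|clear|likely|possible|necessary|obvious|difficult|easy)"
    r"\s+(?:that|to)\b",
    re.IGNORECASE,
)
CLEFT = re.compile(
    # "It was John who ..." / "It is the chairman that ..."
    r"\b[Ii]t\s+(?:is|was)\s+(?:the\s+\w+|[A-Z]\w*)\s+(?:who|that|which)\b"
)

def classify_it(sentence: str) -> str:
    """Label a sentence containing 'it' as extrapositional, cleft, or referential."""
    if EXTRAPOSITION.search(sentence):
        return "extrapositional"
    if CLEFT.search(sentence):
        return "cleft"
    return "referential"
```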

Designs ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 42
Author(s):  
Eric Lazarski ◽  
Mahmood Al-Khassaweneh ◽  
Cynthia Howard

In recent years, disinformation and “fake news” have been spreading throughout the internet at rates never seen before. This has spurred the worldwide emergence of fact-checking organizations, groups that seek out claims and comment on their veracity, to stem the tide of misinformation. However, even with the many human-powered fact-checking organizations currently in operation, disinformation continues to run rampant throughout the Web, and the existing organizations are unable to keep up. This paper discusses in detail recent advances in computer science that use natural language processing to automate fact checking. It follows the entire process of automated fact checking, from detecting claims to verifying them to outputting results. In summary, automated fact checking works well in some cases, though generalized fact checking still needs improvement before widespread use.
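The three-stage process the paper follows can be sketched as a toy pipeline. The keyword cues, the word-overlap retrieval, and the overlap-ratio verdict below are illustrative stand-ins; real systems use trained claim classifiers and natural language inference models at each stage.

```python
# Toy three-stage fact-checking pipeline: claim detection -> evidence
# retrieval -> verdict. All heuristics here are illustrative assumptions.

CLAIM_CUES = ("percent", "according to", "increased", "decreased", "caused")

def detect_claims(sentences):
    """Stage 1: keep sentences that look like checkable factual claims."""
    return [s for s in sentences if any(cue in s.lower() for cue in CLAIM_CUES)]

def retrieve_evidence(claim, evidence_store):
    """Stage 2: naive word-overlap retrieval from a small evidence store."""
    claim_words = set(claim.lower().split())
    return max(evidence_store,
               key=lambda e: len(claim_words & set(e.lower().split())))

def check(claim, evidence):
    """Stage 3: toy verdict from the fraction of claim words found in evidence."""
    claim_words = set(claim.lower().split())
    overlap = len(claim_words & set(evidence.lower().split())) / len(claim_words)
    return "supported" if overlap > 0.5 else "not enough info"
```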


Author(s):  
Di Wu ◽  
Xiao-Yuan Jing ◽  
Haowen Chen ◽  
Xiaohui Kong ◽  
Jifeng Xuan

An Application Programming Interface (API) tutorial is an important API learning resource. To help developers learn APIs, an API tutorial is often split into a number of consecutive units that each describe one topic (i.e., tutorial fragments). We regard a tutorial fragment explaining an API as a relevant fragment of that API. Automatically recommending relevant tutorial fragments can help developers learn how to use an API. However, existing approaches typically recommend relevant fragments in a supervised or unsupervised manner, which suffers from heavy manual annotation effort or inaccurate recommendations. Furthermore, these approaches only allow developers to input exact API names. In practice, developers often do not know which APIs to use, so they are more likely to describe API-related questions in natural language. In this paper, we propose a novel approach, called Tutorial Fragment Recommendation (TuFraRec), to effectively recommend relevant tutorial fragments for API-related natural language questions without much manual annotation effort. For an API tutorial, we split it into fragments and extract APIs from each fragment to build API-fragment pairs. Given a question, TuFraRec first generates several clarification APIs that are related to the question. We use the clarification APIs and API-fragment pairs to construct candidate API-fragment pairs. Then, we design a semi-supervised metric learning (SML)-based model to find relevant API-fragment pairs in the candidate list; it works well with a few labeled API-fragment pairs and a large number of unlabeled ones. In this way, the manual effort for labeling the relevance of API-fragment pairs is reduced. Finally, we sort and recommend relevant API-fragment pairs based on the recommendation strategy. We evaluate TuFraRec on 200 API-related natural language questions and two public tutorial datasets (Java and Android). The results demonstrate that, on average, TuFraRec improves NDCG@5 by 0.06 and 0.09 and Mean Reciprocal Rank (MRR) by 0.07 and 0.09 on the two tutorial datasets compared with the state-of-the-art approach.
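The API-fragment pairing step can be sketched in a few lines. The dotted-name regex below is an assumption used for illustration; the paper's actual API extraction, its clarification-API generation, and the SML-based ranking model are considerably more involved.

```python
import re

# Illustrative sketch: split a tutorial into fragments, extract API mentions
# from each, and emit (API, fragment id) pairs. The regex is a hypothetical
# stand-in for the paper's API extraction.

API_NAME = re.compile(r"\b[A-Z]\w*(?:\.\w+)+\b")  # e.g. ArrayList.add

def build_api_fragment_pairs(fragments):
    """Build (API, fragment id) pairs from a list of tutorial fragments."""
    pairs = []
    for frag_id, text in enumerate(fragments):
        for api in sorted(set(API_NAME.findall(text))):
            pairs.append((api, frag_id))
    return pairs
```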


2018 ◽  
Vol 25 (2) ◽  
pp. 287-306 ◽  
Author(s):  
Cleiton Fernando Lima Sena ◽  
Daniela Barreiro Claro

Nowadays, there is an increasing amount of digital data. On the Web, a vast collection of heterogeneous data is generated daily, and a significant portion of it is available in natural language format. Open Information Extraction (Open IE) enables the extraction of facts from large quantities of text written in natural language. In this work, we propose an Open IE method to extract facts from texts written in Portuguese. We developed two new rules that generalize inference by transitivity and by symmetry, thereby increasing the number of implicit facts extracted from a sentence. Our novel symmetric inference approach is based on a list of symmetric features. Our results confirm that our method outperforms closely related work in both precision and number of valid extractions. Considering the number of minimal facts, our approach is equivalent to the most relevant methods in the literature.
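Inference by symmetry and transitivity over extracted triples can be sketched as a fixed-point closure. The English relation names below are hypothetical stand-ins for the authors' Portuguese symmetric-feature list, used only to show the shape of the two rules.

```python
# Sketch of the two generalized inference rules over Open IE triples.
# The relation lists are illustrative assumptions, not the paper's features.

SYMMETRIC = {"is married to", "borders"}
TRANSITIVE = {"is located in", "is part of"}

def infer(triples):
    """Return the input facts plus those implied by symmetry and transitivity."""
    facts = set(triples)
    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for (a, rel, b) in list(facts):
            if rel in SYMMETRIC and (b, rel, a) not in facts:
                facts.add((b, rel, a))
                changed = True
            if rel in TRANSITIVE:
                for (c, rel2, d) in list(facts):
                    if rel2 == rel and c == b and a != d and (a, rel, d) not in facts:
                        facts.add((a, rel, d))
                        changed = True
    return facts
```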


2019 ◽  
pp. 248-262
Author(s):  
Silvia Majó-Vázquez ◽  
Sandra González-Bailón

The Internet has fundamentally changed how people access and use news. As Dutton and others (Chapter 13, this volume) note, there are concerns that the Internet leads us to get stuck in “echo chambers” or “filter bubbles”—limiting our access to points of view that might challenge our preexisting beliefs. This chapter introduces a network approach to analyzing news consumption in the digital age. The authors explain how we can compare patterns of news consumption across demographic groups, countries, and digital platforms, and determine if there are differences across groups of users and media systems. Measuring news consumption has long been difficult owing to the limitations of self-reported data, so this chapter is notable in offering a novel approach that leverages the digital traces that people leave behind when navigating the Web.


Author(s):  
Hadj Ahmed Bouarara ◽  
Reda Mohamed Hamou ◽  
Abdelmalek Amine

With the advent of cyberspace and the development of communication tools such as social networks, the digital society is continually enriched with new content, especially human images, which represent more than 50% of the information on the web. Hence, an effective instrument for the automatic classification of this gigantic image base has become essential. Our work presents a novel approach, called clustering of human gesture images using a 3D cellular automaton, which consists of four steps: image vectoring using a new representation technique called n-gram pixels, with normalised term frequency as the weighting to compute the importance of each term in the image; a clustering strategy based on the principle of a 3D cellular automaton (3D-CA), using a set of properties (a transition function and the 3D Moore neighbourhood); experimentation using the MuHAVi dataset and a variety of validation measures; and a comparison of our approach against conventional methods in terms of representation (naive representation) and clustering strategy (k-means).
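The image-vectoring step can be sketched if we read an "n-gram pixel" as a horizontal run of n neighbouring grey levels; that reading, and the grid-of-pixels input, are assumptions based on the abstract rather than the authors' actual definition.

```python
from collections import Counter

# Assumed reading of "n-gram pixels": each horizontal run of n grey levels is
# one term, weighted by normalised term frequency. Illustrative sketch only.

def ngram_pixel_tf(image, n=2):
    """Map an image (a list of pixel rows) to {n-gram: normalised frequency}."""
    counts = Counter(
        tuple(row[i:i + n])
        for row in image
        for i in range(len(row) - n + 1)
    )
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}
```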


2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Jens Nevens ◽  
Paul Van Eecke ◽  
Katrien Beuls

In order to be able to answer a natural language question, a computational system needs three main capabilities. First, the system needs to be able to analyze the question into a structured query, revealing its component parts and how these are combined. Second, it needs to have access to relevant knowledge sources, such as databases, texts or images. Third, it needs to be able to execute the query on these knowledge sources. This paper focuses on the first capability, presenting a novel approach to semantically parsing questions expressed in natural language. The method makes use of a computational construction grammar model for mapping questions onto their executable semantic representations. We demonstrate and evaluate the methodology on the CLEVR visual question answering benchmark task. Our system achieves a 100% accuracy, effectively solving the language understanding part of the benchmark task. Additionally, we demonstrate how this solution can be embedded in a full visual question answering system, in which a question is answered by executing its semantic representation on an image. The main advantages of the approach include (i) its transparent and interpretable properties, (ii) its extensibility, and (iii) the fact that the method does not rely on any annotated training data.
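The first capability, mapping a question onto an executable representation, can be illustrated with a toy example. The regex parser and the query-plan format below are stand-in assumptions for a single CLEVR-style question shape; the paper uses a computational construction grammar, not pattern matching.

```python
import re

# Toy question -> query-plan -> execution pipeline. The parser and plan
# format are illustrative assumptions, not the paper's grammar.

def parse_question(question):
    """Map one CLEVR-style question shape onto an executable query plan."""
    m = re.fullmatch(r"How many (\w+) (\w+)s are there\?", question)
    if m is None:
        raise ValueError("unsupported question form")
    color, shape = m.groups()
    return [("filter_color", color), ("filter_shape", shape), ("count",)]

def execute(plan, scene):
    """Run the query plan against a scene (a list of attribute dicts)."""
    objs = scene
    for step in plan:
        if step[0] == "filter_color":
            objs = [o for o in objs if o["color"] == step[1]]
        elif step[0] == "filter_shape":
            objs = [o for o in objs if o["shape"] == step[1]]
        elif step[0] == "count":
            return len(objs)
```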


2020 ◽  
Vol 34 (04) ◽  
pp. 5182-5190
Author(s):  
Pasquale Minervini ◽  
Matko Bošnjak ◽  
Tim Rocktäschel ◽  
Sebastian Riedel ◽  
Edward Grefenstette

Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering. General neural architectures that jointly learn representations and transformations of text are very data-inefficient, and it is hard to analyse their reasoning process. These issues are addressed by end-to-end differentiable reasoning systems such as Neural Theorem Provers (NTPs), although they can only be used with small-scale symbolic KBs. In this paper we first propose Greedy NTPs (GNTPs), an extension to NTPs addressing their complexity and scalability limitations, thus making them applicable to real-world datasets. This result is achieved by dynamically constructing the computation graph of NTPs and including only the most promising proof paths during inference, thus obtaining models that are orders of magnitude more efficient. Then, we propose a novel approach for jointly reasoning over KBs and textual mentions, by embedding logic facts and natural language sentences in a shared embedding space. We show that GNTPs perform on par with NTPs at a fraction of their cost while achieving competitive link prediction results on large datasets, providing explanations for predictions, and inducing interpretable models.
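The greedy pruning idea can be sketched as a nearest-neighbour selection: rather than attempting unification of a goal against every KB fact (as NTPs do), keep only the k facts closest to the goal in embedding space and expand just those proof paths. The hand-written embeddings below are stand-ins; the real model learns them jointly with the reasoning process.

```python
import math

# Sketch of GNTP-style greedy fact selection: keep only the k facts nearest
# the goal in embedding space. Embeddings here are illustrative stand-ins.

def top_k_facts(goal_emb, fact_embs, k=2):
    """Return indices of the k facts nearest the goal embedding (Euclidean)."""
    dists = [math.dist(goal_emb, fact) for fact in fact_embs]
    return sorted(range(len(fact_embs)), key=dists.__getitem__)[:k]
```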

