CoRg: Commonsense Reasoning Using a Theorem Prover and Machine Learning

10.29007/lt5p ◽  
2019 ◽  
Author(s):  
Sophie Siebert ◽  
Frieder Stolzenburg

Commonsense reasoning is an everyday task that is intuitive for humans but hard to implement for computers. It requires large knowledge bases from which to draw the necessary data, although this data is often incomplete or even inconsistent. While machine learning algorithms perform rather well on these tasks, their reasoning process remains a black box. To close this gap, our system CoRg aims to be an explainable and well-performing system, consisting of both an explainable deductive derivation process and a machine learning part. We conduct our experiments on the COPA question-answering benchmark using the ontologies WordNet, Adimen-SUMO, and ConceptNet. The knowledge is fed into the theorem prover Hyper, and the derived models are then analyzed using machine learning algorithms to determine the most probable answer.
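The prover-then-rank pipeline described in the abstract can be illustrated with a toy sketch. This is not CoRg's actual code: the predicate names, the COPA-style alternatives, and the overlap score standing in for the learned model are all invented for illustration.

```python
# Toy sketch (not the actual CoRg pipeline): rank two COPA-style
# alternatives by how much the symbolic model derived for each one
# overlaps with the model derived from the premise alone.
# All predicate names are invented for illustration.

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two predicate sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Predicates a prover such as Hyper might derive from the premise.
premise_model = {"wet(street)", "rain(now)", "outside(man)"}

# Predicates derived when each candidate answer is added to the input.
alternatives = {
    "It was raining.":      {"rain(now)", "wet(street)", "cloud(sky)"},
    "The sun was shining.": {"sun(sky)", "dry(street)"},
}

# A learned model would score richer features; set overlap stands in here.
best = max(alternatives, key=lambda alt: jaccard(premise_model, alternatives[alt]))
print(best)
```

In the real system, the derived models feed feature vectors into trained classifiers rather than a fixed similarity measure, but the selection step (score each alternative's derivation, pick the maximum) has this shape.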

2021 ◽  
pp. 1-12
Author(s):  
Melesio Crespo-Sanchez ◽  
Ivan Lopez-Arevalo ◽  
Edwin Aldana-Bobadilla ◽  
Alejandro Molina-Villegas

In the last few years, text analysis has become a keystone in several domains for solving many real-world problems, such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached by means of machine learning algorithms. Most of these algorithms take as input a transformation of the text in the form of feature vectors containing an abstraction of the content. Most recent vector representations focus on the semantic component of text; however, we consider that also taking the lexical and syntactic components into account could benefit learning tasks. In this work, we propose a content spectral-based text representation applicable to machine learning algorithms for text analysis. This representation integrates the spectra from the lexical, syntactic, and semantic components of text, derived from feature vectors, into an abstract image that can be treated by both text and image learning algorithms. To demonstrate the merit of our proposal, we tested it on text classification and reading-complexity score prediction tasks, obtaining promising results.
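The core idea of combining per-token lexical, syntactic, and semantic feature vectors into a single image-like array can be sketched as follows. The paper does not specify its exact spectra here, so the feature values and dimensions below are purely illustrative.

```python
import numpy as np

# Hedged sketch: stack three per-token feature vectors (lexical,
# syntactic, semantic) into a 2-D array that can be consumed as a
# single-channel "image" by either text or image models.
# All numbers and dimensions are illustrative, not from the paper.

tokens = ["the", "cat", "sat"]
lexical   = np.array([[0.1, 0.9], [0.7, 0.2], [0.4, 0.5]])  # e.g. character n-gram features
syntactic = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # e.g. POS-tag one-hots
semantic  = np.array([[0.3, 0.3], [0.8, 0.1], [0.2, 0.6]])  # e.g. embedding slice

# Concatenate the components column-wise: one row per token,
# one column block per linguistic component.
image = np.hstack([lexical, syntactic, semantic])
print(image.shape)  # (3, 6): 3 tokens x (2 + 2 + 2) features
```

A CNN can then treat `image` as a single-channel input, while a text model can read it row by row as a token sequence, which is the "both text and image learning algorithms" property the abstract claims.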


2018 ◽  
Vol 18 (3-4) ◽  
pp. 623-637 ◽  
Author(s):  
ARINDAM MITRA ◽  
CHITTA BARAL

Abstract: Over the years the Artificial Intelligence (AI) community has produced several datasets which have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these machine learning algorithms, namely the Inductive Logic Programming algorithms, which aim at learning logic programs, have often failed at the task due to the vastness of these datasets. This has impacted the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we address this scalability issue for algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient, more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question-answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV.


Author(s):  
Xenia Naidenova

The purpose of this chapter is to demonstrate the possibility of transforming a large class of machine learning algorithms into commonsense reasoning processes based on well-known deductive and inductive logical rules. The concept of a good classification (diagnostic) test for a given set of positive examples lies at the basis of our approach to machine learning problems. The task of inferring all good diagnostic tests is formulated as searching for the best approximations of a given classification (a partitioning) on a given set of examples. Lattice theory is used as the mathematical language for constructing good classification tests. The algorithms for inferring good tests are decomposed into subtasks and operations that accord with the main rules of human commonsense reasoning.
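The notion of a good diagnostic test can be illustrated with a minimal sketch: an attribute subset whose value combinations cover positive examples while matching no negative one. The data, attribute names, and brute-force enumeration below are invented for illustration and stand in for the chapter's lattice-based inference.

```python
from itertools import combinations

# Hedged sketch of the "good diagnostic test" idea: find attribute
# subsets whose projection separates every positive example from every
# negative one. Examples and attributes are invented for illustration;
# the chapter's actual algorithms operate on a lattice, not brute force.

positives = [{"color": "red", "size": "big",   "shape": "round"},
             {"color": "red", "size": "small", "shape": "round"}]
negatives = [{"color": "blue", "size": "big",  "shape": "round"},
             {"color": "red",  "size": "big",  "shape": "square"}]

attrs = ["color", "size", "shape"]

def separates(subset):
    """True if no negative matches any positive on this attribute subset."""
    proj = lambda ex: tuple(ex[a] for a in subset)
    pos_patterns = {proj(ex) for ex in positives}
    return all(proj(ex) not in pos_patterns for ex in negatives)

# Enumerate subsets smallest-first; the minimal separating subsets
# play the role of the best approximations of the classification.
tests = [s for r in range(1, len(attrs) + 1)
         for s in combinations(attrs, r) if separates(s)]
print(tests[0])  # a smallest attribute subset that separates the classes
```

Here no single attribute separates the classes, but the pair `("color", "shape")` does: both positives are red and round, while neither negative is.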


Author(s):  
Abhishek Sharma ◽  
Keith M. Goolsbey

Cognitive systems must reason with large bodies of general knowledge to perform complex tasks in the real world. However, due to the intractability of reasoning in large, expressive knowledge bases (KBs), many AI systems have limited reasoning capabilities. Successful cognitive systems have used a variety of machine learning and axiom selection methods to improve inference. In this paper, we describe a search heuristic that uses a Monte-Carlo simulation technique to choose inference steps. We test the efficacy of this approach on a very large and expressive KB, Cyc. Experimental results on hundreds of queries show that this method is highly effective in reducing inference time and improving question-answering (Q/A) performance.
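The Monte-Carlo idea of the heuristic, estimating each candidate inference step's promise from random rollouts and expanding the best one first, can be sketched on a toy search space. This is not Cyc's implementation; the graph, rollout depth, and sample count are all invented for illustration.

```python
import random

# Hedged sketch of a Monte-Carlo step-selection heuristic (not the
# paper's actual system): score each candidate inference step by the
# fraction of random rollouts from it that reach a goal, then expand
# the best-scoring step first. The toy search space is a small graph
# where nodes stand in for proof states.

graph = {"q": ["a", "b"], "a": ["goal"], "b": ["c"], "c": [], "goal": []}

def rollout(node, depth=5):
    """Follow random successors; return True if a goal state is reached."""
    for _ in range(depth):
        if node == "goal":
            return True
        successors = graph.get(node, [])
        if not successors:
            return False        # dead end: rollout fails
        node = random.choice(successors)
    return node == "goal"

def best_step(node, n=200):
    """Pick the successor with the highest estimated rollout success rate."""
    scores = {s: sum(rollout(s) for _ in range(n)) / n for s in graph[node]}
    return max(scores, key=scores.get)

random.seed(0)
print(best_step("q"))  # "a": its rollouts reach the goal, "b"'s never do
```

In a real inference engine the "successors" would be applicable axioms or inference steps, and the rollout statistics would guide which branch of the proof search to expand, trading a small simulation cost for a much smaller explored search space.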


BMJ ◽  
2019 ◽  
pp. l886 ◽  
Author(s):  
David S Watson ◽  
Jenny Krutzinna ◽  
Ian N Bruce ◽  
Christopher EM Griffiths ◽  
Iain B McInnes ◽  
...  

2021 ◽  
Author(s):  
Gaelen P. Adam ◽  
Dimitris Pappas ◽  
Haris Papageorgiou ◽  
Evangelos Evangelou ◽  
Thomas A. Trikalinos

Abstract
Background: The typical approach to literature identification involves two discrete and successive steps: (i) formulating a search strategy (i.e., a set of Boolean queries) and (ii) manually identifying the relevant citations in the corpus returned by the query. We have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps and uses modern approaches for text encoding (dense text embeddings) to represent the text of the citations in a form that can be used by information retrieval and machine learning algorithms.
Methods: Pythia combines a set of natural-language questions with machine learning algorithms to rank all PubMed citations by relevance. It returns the 100 top-ranked citations for all questions combined. These 100 articles are exported, and a human screener adjudicates the relevance of each abstract and tags words that indicate relevance. The "curated" articles are then used by Pythia to refine the search and re-rank the abstracts, and a new set of 100 abstracts is exported and screened/tagged, until convergence (i.e., no further relevant abstracts are retrieved) or for a set number of iterations (batches). Pythia's performance was assessed using seven systematic reviews (three prospectively and four retrospectively). Sensitivity, precision, and the number needed to read (NNR) were calculated for each review.
Results: The ability of Pythia to identify the relevant articles (sensitivity) varied across reviews, from a low of 0.09 for a sleep apnea review to a high of 0.58 for a diverticulitis review. The number of abstracts a reviewer had to read to find one relevant abstract (NNR) was lower than in the manually screened project in four reviews, higher in two, and mixed in one. The reviews with greater overall sensitivity retrieved more relevant citations in early batches, but neither study design, study size, nor specific key question significantly affected retrieval across all reviews.
Conclusions: Future research should explore ways to encode domain knowledge in query formulation, possibly by incorporating a "reasoning" aspect into Pythia to elicit more contextual information, and by leveraging ontologies and knowledge bases to better enrich the questions used in the search.
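The rank-screen-refine loop described in the Methods can be sketched with stand-in data. Pythia's real system embeds PubMed citations with dense text encoders and has a human screener in the loop; the random vectors, relevance oracle, and refinement rule below are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch of an iterative rank-screen-refine loop in the style
# described above. The embeddings, the hidden "relevant" set (standing
# in for the human screener), and the 50/50 query update are stand-ins.

rng = np.random.default_rng(0)
citations = rng.normal(size=(500, 16))   # stand-in citation embeddings
relevant = set(range(20))                # hidden ground truth (screener oracle)
query = citations[:5].mean(axis=0)       # seed "question" embedding

def rank(q, exclude):
    """Cosine-similarity ranking of unscreened citations against the query."""
    sims = citations @ q / (np.linalg.norm(citations, axis=1) * np.linalg.norm(q))
    return [i for i in np.argsort(-sims) if i not in exclude]

screened, found = set(), set()
for batch in range(3):                   # fixed number of iterations
    top = rank(query, screened)[:100]    # export the 100 top-ranked citations
    hits = [i for i in top if i in relevant]  # screener adjudication stand-in
    screened.update(top)
    found.update(hits)
    if hits:                             # refine: pull query toward curated hits
        query = 0.5 * query + 0.5 * citations[hits].mean(axis=0)

print(len(screened))  # 300: three disjoint batches of 100
```

Each batch excludes everything already screened, so the reviewer never re-reads an abstract, and the query drifts toward the centroid of adjudicated-relevant citations, which is one simple way to realize the "curated articles refine the search" step.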


Author(s):  
Xenia Naidenova

The purpose of this paper is to demonstrate the possibility of transforming a large class of machine learning algorithms into commonsense reasoning processes based on well-known deductive and inductive logical rules. The concept of a good classification (diagnostic) test for a given set of positive examples lies at the basis of our approach to machine learning problems. The task of inferring all good diagnostic tests is formulated as searching for the best approximations of a given classification (a partitioning) on a given set of examples. Lattice theory is used as the mathematical language for constructing good classification tests. The algorithms for inferring good tests are decomposed into subtasks and operations that accord with the main rules of human commonsense reasoning.


Author(s):  
David Watson ◽  
Jenny Krutzinna ◽  
Ian Bruce ◽  
Christopher Griffiths ◽  
Iain McInnes ◽  
...  

2022 ◽  
Vol 9 (3) ◽  
pp. 0-0

Healthcare and medicine are key areas where machine learning algorithms are widely used. The medical decision support systems thus created are accurate enough; however, they lack transparency in decision making and exhibit black-box behavior. Transparency and trust are crucial in health and medicine, so a black-box system is suboptimal in terms of widespread applicability and reach. Explainability therefore makes such a system reliable and understandable, enhancing its social acceptability. The presented work explores a thyroid disease diagnosis system. SHAP, a popular method based on coalition game theory, is used to interpret the results. The work explains the system's behavior both locally and globally and shows how machine learning can be used to ascertain the causality of the disease and support doctors in suggesting the most effective treatment. The work not only demonstrates the results of machine learning algorithms but also explains the related feature importance and model insights.
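The coalition-game idea behind SHAP can be illustrated without the SHAP library: the exact Shapley value of a feature is its average marginal contribution over all coalitions of the other features. The toy scoring function and the feature names (loosely evoking thyroid lab tests) below are invented for illustration and are not the paper's model.

```python
from itertools import combinations
from math import factorial

# Hedged sketch of exact Shapley values (the game-theoretic quantity
# SHAP approximates), computed by brute force over all coalitions.
# The "model" is a hand-written toy scoring function; the feature
# names are illustrative, not the paper's actual inputs.

features = ["tsh", "t3", "t4"]

def value(coalition):
    """Toy model output when only the given features are 'present'."""
    score = 0.0
    if "tsh" in coalition:
        score += 0.6
    if "t3" in coalition:
        score += 0.3
    if "tsh" in coalition and "t4" in coalition:
        score += 0.1          # interaction term shared between tsh and t4
    return score

def shapley(feature):
    """Average marginal contribution of `feature` over all coalitions."""
    n = len(features)
    others = [f for f in features if f != feature]
    total = 0.0
    for r in range(n):
        for coal in combinations(others, r):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            total += weight * (value(set(coal) | {feature}) - value(set(coal)))
    return total

for f in features:
    print(f, round(shapley(f), 3))  # tsh 0.65, t3 0.3, t4 0.05
```

Note the efficiency property: the three values sum to the full model output (1.0), which is what makes Shapley-based attributions add up to the prediction, both locally per patient and, averaged over patients, globally.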

