A Study on Machine Learning for Imbalanced Datasets with Answer Validation of Question Answering

Author(s): Min-Yuh Day, Cheng-Chia Tsai
2021, pp. 1-12
Author(s): Melesio Crespo-Sanchez, Ivan Lopez-Arevalo, Edwin Aldana-Bobadilla, Alejandro Molina-Villegas

In the last few years, text analysis has become a keystone in several domains for solving real-world problems such as machine translation, spam detection, and question answering, to mention a few. Many of these tasks can be approached with machine learning algorithms, most of which take as input a transformation of the text into feature vectors that abstract its content. Most recent vector representations focus on the semantic component of text; however, we consider that also taking the lexical and syntactic components into account could benefit learning tasks. In this work, we propose a content spectral-based text representation applicable to machine learning algorithms for text analysis. This representation integrates the spectra of the lexical, syntactic, and semantic components of text, derived from their feature vectors, into an abstract image that can be processed by both text and image learning algorithms. To demonstrate the merit of our proposal, we tested it on text classification and reading complexity score prediction tasks, obtaining promising results.
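A minimal sketch of the general idea, stacking the three component feature vectors into an image-like matrix; the function name, fixed width, and normalization are illustrative assumptions, not the authors' construction.

```python
import numpy as np

def spectral_text_image(lexical_vec, syntactic_vec, semantic_vec, width=64):
    """Stack three component 'spectra' into a small 2-D array that can be
    fed to either a vector-based or an image-based learning algorithm."""
    channels = []
    for vec in (lexical_vec, syntactic_vec, semantic_vec):
        v = np.asarray(vec, dtype=np.float32)
        # pad or truncate each spectrum to a common width
        v = np.pad(v, (0, max(0, width - v.size)))[:width]
        # min-max normalize so all channels share a comparable scale
        span = v.max() - v.min()
        channels.append((v - v.min()) / span if span > 0 else v)
    # rows = components, columns = spectrum bins -> a 3 x width "image"
    return np.stack(channels, axis=0)

# toy usage with random stand-ins for the three feature vectors
rng = np.random.default_rng(0)
img = spectral_text_image(rng.random(50), rng.random(80), rng.random(64))
print(img.shape)  # (3, 64)
```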


2018, Vol 18 (3-4), pp. 623-637
Author(s): Arindam Mitra, Chitta Baral

Abstract: Over the years, the Artificial Intelligence (AI) community has produced several datasets that have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these algorithms aimed at learning logic programs, namely Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets. This has limited the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we address this scalability issue for algorithms that learn answer set programs. We present a sound and complete algorithm which takes the input in a slightly different manner and performs an efficient and more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV.


2016, Vol 7 (2), pp. 43-71
Author(s): Sangeeta Lal, Neetu Sardana, Ashish Sureka

Logging is an important yet difficult decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code constructs, but the prediction performance of these models is limited by the class-imbalance problem, since logged code constructs are far outnumbered by non-logged ones. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performance of J48, RF, and SVM classifiers for predicting logged catch-blocks and if-blocks on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, LogIm improves the performance of the baseline classifiers J48, RF, and SVM by 7.38%, 9.24%, and 4.6% for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13% for if-block logging prediction.
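A minimal sketch of an averaged-probability ensemble with a tuned decision threshold, in the spirit of a threshold-based model such as LogIm; the classifier choices, toy data, and threshold search are assumptions for illustration, not the authors' implementation (scikit-learn's DecisionTreeClassifier stands in for Weka's J48).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Imbalanced toy data: roughly 5% positive ("logged") examples.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

members = [
    DecisionTreeClassifier(random_state=0),        # stand-in for J48
    RandomForestClassifier(n_estimators=100, random_state=0),
    SVC(probability=True, random_state=0),
]
# Average the positive-class probabilities of the ensemble members.
probs = np.mean([m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in members], axis=0)

# Move the decision threshold instead of using the default 0.5
# (tuned on held-out data here for brevity; a separate validation split is cleaner).
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=lambda t: f1_score(y_te, probs >= t))
print(f"best threshold={best:.2f}, F1={f1_score(y_te, probs >= best):.3f}")
```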


2020, Vol 10 (3), pp. 20-34
Author(s): Lawrence Master

There are many applications for ranking, including page search, question answering, recommender systems, sentiment analysis, and collaborative filtering, to name a few. In the past several years, machine learning and information retrieval techniques have been used to develop ranking algorithms, and several listwise approaches to learning to rank have emerged. We propose two new methods, GeneticListMLE++ and GeneticListNet++, which build on the original ListMLE and ListNet algorithms. Our methods substantially improve on the original ListMLE and ListNet ranking approaches by incorporating genetic optimization of hyperparameters, a nonlinear neural network ranking model, and a regularization technique.
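A minimal sketch of the ListMLE listwise loss (the Plackett-Luce log-likelihood of the ground-truth ordering) that these methods build on; the tiny scoring network and tensor shapes are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

def listmle_loss(scores: torch.Tensor, relevance: torch.Tensor) -> torch.Tensor:
    """scores, relevance: shape (list_size,) for a single query."""
    # Order documents by descending ground-truth relevance.
    order = torch.argsort(relevance, descending=True)
    s = scores[order]
    # -log P(ordering) = sum_i [ logsumexp(s_i, ..., s_n) - s_i ]
    rev_logcumsum = torch.logcumsumexp(s.flip(0), dim=0).flip(0)
    return (rev_logcumsum - s).sum()

# Toy usage: a small nonlinear scorer over 10 documents with 5 features each.
scorer = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))
docs = torch.randn(10, 5)
relevance = torch.randint(0, 5, (10,)).float()
loss = listmle_loss(scorer(docs).squeeze(-1), relevance)
loss.backward()
print(loss.item())
```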


Author(s): Antonio Juárez-González, Alberto Téllez-Valero, Claudia Denicia-Carral, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda
