Self-bounding Majority Vote Learning Algorithms by the Direct Minimization of a Tight PAC-Bayesian C-Bound

Abstract

Objectives: To assess (1) differences in the hemodynamic response to the active stand test in older adults with a clinical diagnosis of vasovagal syncope compared to age-matched controls, and (2) whether the active stand test combined with machine learning approaches can be used to identify the presence of vasovagal syncope in older adults.

Approach: Adults aged 50 and over (Vasovagal Syncope N=46, Age=66.9±10.3; Control N=86, Age=65.3±9.5) completed an active stand test. Multiple features were extracted to characterize the hemodynamic responses to the active stand test and were compared between groups. Classification was performed using machine learning algorithms including linear discriminant analysis, quadratic discriminant analysis, support vector machine, and an ensemble majority vote classifier.

Main Results: Subjects with vasovagal syncope demonstrated a higher resting (supine) heart rate (69.8±13.1 bpm vs 63.3±12.1 bpm; P=0.007), a smaller initial systolic blood pressure drop (−20.2±20.1% vs −27.3±17.5%; P=0.005), larger drops in stroke volume (−14.7±24.0% vs −2.7±23.3%; P=0.010) and cardiac output (−6.4±18.5% vs 5.8±22.3%; P<0.001), and a larger increase in total peripheral resistance (8.1±30.4% vs −6.03±22.8%; P=0.002) compared to controls. A majority vote classifier identified the presence of vasovagal syncope with 82.6% sensitivity, 76.8% specificity, and average accuracy of 78.9%.

Significance: Older adults with vasovagal syncope display a unique hemodynamic and autonomic response to active standing, characterized by relative autonomic hypersensitivity and larger drops in cardiac output compared to age-matched controls. With suitable machine learning algorithms, the active stand test holds the potential to be used to screen older adults for reflex syncopes and hypotensive susceptibility, potentially reducing test time, cost, and patient discomfort. More broadly, this paper presents a machine learning framework to support use of the active stand test for classification of clinical outcomes of interest.
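The ensemble step and the reported metrics can be illustrated with a minimal sketch. It assumes each base classifier (for example LDA, QDA, and SVM) has already produced one binary label per subject; the function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers into one label per sample.

    predictions: list of equal-length label lists, one list per base classifier.
    With an odd number of classifiers and binary labels no ties can occur.
    """
    combined = []
    for votes in zip(*predictions):
        # Pick the label that received the most votes for this sample.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumes y_true contains at least one positive and one negative sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels (1 = syncope, 0 = control), invented for illustration only.
lda = [1, 0, 1, 0]
qda = [1, 1, 0, 0]
svm = [0, 1, 1, 0]
voted = majority_vote([lda, qda, svm])
sens, spec = sensitivity_specificity([1, 0, 1, 0], voted)
```

Sensitivity is the fraction of true syncope cases the vote labels as positive, and specificity is the fraction of controls it labels as negative.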

Identifying Mislabeled Training Data

Journal of Artificial Intelligence Research ◽ 10.1613/jair.606 ◽ 1999 ◽ Vol 11 ◽ pp. 131-167 ◽ Cited By ~ 354

Author(s): C. E. Brodley ◽ M. A. Friedl

Keyword(s): Classification Accuracy ◽ Majority Vote ◽ Learning Algorithms ◽ Empirical Evaluation ◽ Training Data ◽ Noise Levels ◽ New Approach ◽ Good Data ◽ Bad Data

This paper presents a new approach to identifying and eliminating mislabeled training instances for supervised learning. The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate single algorithm, majority vote and consensus filters on five datasets that are prone to labeling errors. Our experiments illustrate that filtering significantly improves classification accuracy for noise levels up to 30 percent. An analytical and empirical evaluation of the precision of our approach shows that consensus filters are conservative at throwing away good data at the expense of retaining bad data and that majority filters are better at detecting bad data at the expense of throwing away good data. This suggests that for situations in which there is a paucity of data, consensus filters are preferable, whereas majority vote filters are preferable for situations with an abundance of data.
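The difference between the two filter types can be sketched in a few lines. The sketch assumes that each filter classifier has already been evaluated on every training instance (for example under cross-validation) and that the outcome is stored as a boolean error flag; the function name and the toy matrix are hypothetical.

```python
def flag_mislabeled(errors, rule="majority"):
    """Return indices of training instances flagged as mislabeled.

    errors[i][j] is True when filter classifier j misclassifies instance i.
    rule="majority": flag when more than half of the classifiers err.
    rule="consensus": flag only when every classifier errs.
    """
    flagged = []
    for i, row in enumerate(errors):
        n_wrong = sum(row)
        if rule == "consensus":
            is_noise = n_wrong == len(row)
        elif rule == "majority":
            is_noise = n_wrong > len(row) / 2
        else:
            raise ValueError(f"unknown rule: {rule}")
        if is_noise:
            flagged.append(i)
    return flagged


# Toy error matrix for four instances and three filter classifiers.
errors = [
    [True, True, True],     # all three disagree with the given label
    [True, True, False],    # two of three disagree
    [False, True, False],   # one of three disagrees
    [False, False, False],  # none disagree
]
majority_flags = flag_mislabeled(errors, "majority")
consensus_flags = flag_mislabeled(errors, "consensus")
```

On this toy input the consensus rule flags a subset of what the majority rule flags, which matches the trade-off described above: consensus discards less good data but retains more bad data.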

Learning Bayesian Network Structure Using a MultiExpert Approach

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194014500119 ◽ 2014 ◽ Vol 24 (02) ◽ pp. 269-284 ◽ Cited By ~ 1

Author(s): Francesco Colace ◽ Massimo De Santo ◽ Luca Greco

Keyword(s): Bayesian Network ◽ Network Structure ◽ Scientific Community ◽ Text Categorization ◽ Experimental Validation ◽ Majority Vote ◽ Learning Algorithms ◽ Structural Learning ◽ Bayesian Network Structure ◽ Ontology Building

Learning the structure of a Bayesian network, especially for wide domains, can be a complex, time-consuming, and imprecise process. Consequently, the scientific community's interest in learning Bayesian network structure from data is increasing: many techniques and disciplines, such as data mining, text categorization, and ontology building, can benefit from this process. The literature offers many structural learning algorithms, but none of them provides good results on every dataset. This paper introduces a method for structural learning of Bayesian networks based on a MultiExpert approach. The proposed method combines five structural learning algorithms according to a majority vote combining rule in order to maximize their effectiveness and, more generally, to improve on the results obtained by using a single algorithm. This paper presents an experimental validation of the proposed algorithm on standard datasets.
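One way to picture a majority vote over structure learners is to vote on individual edges. The abstract does not specify the combining rule in detail, so the following is only a sketch under that assumption: each learner is represented by the set of directed edges it proposes, and an edge is kept when more than half of the learners propose it. The function name and the toy edge sets are hypothetical, and the sketch does not check that the result is acyclic.

```python
from collections import Counter


def vote_on_edges(edge_sets):
    """Keep the directed edges proposed by more than half of the learners.

    edge_sets: one collection of (parent, child) tuples per structure learner.
    """
    # Count each edge at most once per learner.
    counts = Counter(edge for edges in edge_sets for edge in set(edges))
    threshold = len(edge_sets) / 2
    return {edge for edge, n in counts.items() if n > threshold}


# Toy output of five hypothetical structure learners.
learners = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "D")},
    {("B", "C"), ("C", "D")},
    {("B", "C")},
]
combined = vote_on_edges(learners)
```

A complete method would also need to handle conflicting edge directions and cycles, which this sketch leaves out.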

Issues in Stacked Generalization

Journal of Artificial Intelligence Research ◽ 10.1613/jair.594 ◽ 1999 ◽ Vol 10 ◽ pp. 271-289 ◽ Cited By ~ 300

Author(s): K. M. Ting ◽ I. H. Witten

Keyword(s): Predictive Accuracy ◽ Majority Vote ◽ Learning Algorithms ◽ Stacked Generalization ◽ Black Art ◽ Different Types ◽ Level Model ◽ Classification Tasks ◽ High Level ◽ General Method

Stacked generalization is a general method of using a high-level model to combine lower-level models to achieve greater predictive accuracy. In this paper we address two crucial issues which have been considered to be a 'black art' in classification tasks ever since the introduction of stacked generalization in 1992 by Wolpert: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input. We find that best results are obtained when the higher-level model combines the confidence (and not just the predictions) of the lower-level ones. We demonstrate the effectiveness of stacked generalization for combining three different types of learning algorithms for classification tasks. We also compare the performance of stacked generalization with majority vote and published results of arcing and bagging.
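The distinction between combining confidences and combining predictions comes down to what the higher-level model receives as input. The sketch below builds both kinds of input from class-probability outputs of the lower-level models; the function names and toy probabilities are hypothetical, and training the higher-level model itself is left out.

```python
def level1_from_confidences(prob_outputs):
    """Concatenate the class-probability vectors of all level-0 models.

    prob_outputs[k][i] is the probability vector of model k for instance i.
    Returns one feature row per instance.
    """
    return [
        [p for probs in per_instance for p in probs]
        for per_instance in zip(*prob_outputs)
    ]


def level1_from_predictions(prob_outputs):
    """Keep only the predicted class index of each level-0 model."""
    return [
        [max(range(len(probs)), key=probs.__getitem__) for probs in per_instance]
        for per_instance in zip(*prob_outputs)
    ]


# Toy probability outputs of two level-0 models on two instances.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.45, 0.55]]
conf_rows = level1_from_confidences([model_a, model_b])
pred_rows = level1_from_predictions([model_a, model_b])
```

The prediction rows discard how certain each model was, whereas the confidence rows let the higher-level model tell a 0.9 vote apart from a 0.6 vote for the same class.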

Supplemental Material for One Model to Rule Them All? Using Machine Learning Algorithms to Determine the Number of Factors in Exploratory Factor Analysis

Psychological Methods ◽ 10.1037/met0000262.supp ◽ 2020

Keyword(s): Machine Learning ◽ Factor Analysis ◽ Exploratory Factor Analysis ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Number Of Factors

Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189120 ◽ 2020 ◽ Vol 39 (5) ◽ pp. 6579-6590

Author(s): Sandy Çağlıyor ◽ Başar Öztayşi ◽ Selime Sezgin

Keyword(s): Machine Learning ◽ Global Economy ◽ Learning Algorithms ◽ Forecast Model ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ High Stakes ◽ Box Office ◽ Industry Forecast ◽ The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performances, and the industry still struggles to predict box office performances in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. From various sources, a dataset of 1559 movies is constructed. First, independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees, and random forests) are employed, and the impact of each variable group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.
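Discretizing a continuous attendance count into classes can be sketched as a threshold lookup. The abstract does not state how the class boundaries were chosen, so the cut points below are purely hypothetical placeholders, as is the function name.

```python
from bisect import bisect_right


def discretize(values, thresholds):
    """Map each value to a class index using sorted cut points.

    A value below thresholds[0] gets class 0; a value equal to a cut point
    falls into the class above it. len(thresholds) + 1 classes result.
    """
    return [bisect_right(thresholds, v) for v in values]


# Hypothetical cut points giving three attendance classes.
three_class_cuts = [10_000, 100_000]
classes = discretize([5_000, 50_000, 500_000], three_class_cuts)
```

Moving to five or eight target classes only requires a longer list of cut points (four or seven thresholds respectively).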

Intelligent system of English composition scoring model based on improved machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-189235 ◽ 2020 ◽ pp. 1-11

Author(s): Jie Liu ◽ Lin Lin ◽ Xiufang Liang

Keyword(s): Machine Learning ◽ Evaluation System ◽ Intelligent System ◽ Learning Algorithm ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Assessment System ◽ English Composition ◽ Region Extraction ◽ Constraint Model

Online English teaching systems place certain requirements on intelligent scoring, and the most difficult stage of intelligent scoring in an English test is scoring the English composition with an intelligent model. To improve the intelligence of English composition scoring, this study combines machine learning algorithms with intelligent image recognition technology, and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, to verify the feasibility of the proposed algorithm model, that is, whether it meets the requirements of the group text, its performance is analyzed through designed experiments. Moreover, the basic conditions for composition scoring are input into the model as a constraint model. The results show that the proposed algorithm has a certain practical effect and can be applied to English assessment systems, online assessment systems, and homework evaluation systems.

The Unlearnable Checkerboard Pattern

Communications of the Blyth Institute ◽ 10.33014/issn.2640-5652.1.2.holloway.1 ◽ 2019 ◽ Vol 1 (2) ◽ pp. 78-80

Author(s): Eric Holloway

Keyword(s): Machine Learning ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Checkerboard Pattern ◽ Simple Task

Detecting some patterns is a simple task for humans, but nearly impossible for current machine learning algorithms. Here, the "checkerboard" pattern is examined, where human prediction nears 100% and machine prediction drops significantly below 50%.
