An Improved Binary Relevance Algorithm for Multi-Label Classification

Multi-label classification (MLC) is a machine learning task that aims to predict multiple labels for a given instance. The widely known binary relevance (BR) method learns one classifier per label without considering correlations among labels. In this paper, an improved binary relevance algorithm (IBRAM), derived from the binary relevance method, is proposed. It uses two layers, each of which decomposes the multi-label classification problem into L independent binary classification problems. In the first layer, one binary classifier is built for each label. In the second layer, the label information from the first layer is used to help generate the final prediction by taking the correlations among labels into account. Experiments on benchmark datasets validate the effectiveness of the proposed approach against other well-established methods.
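The two-layer idea can be illustrated with a minimal sketch. It is not the IBRAM implementation: the abstract does not say how first-layer label information is passed on, so this sketch assumes the second layer sees the original features plus the first-layer predictions for all labels (a stacking arrangement). The class names `CentroidClassifier` and `TwoLayerBinaryRelevance` and the toy nearest-centroid base learner are hypothetical stand-ins for whatever binary learner is actually used.

```python
from typing import List, Sequence


class CentroidClassifier:
    """Toy binary learner (assumption): predict the class with the nearer centroid."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "CentroidClassifier":
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:  # a class absent from training simply gets no centroid
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class TwoLayerBinaryRelevance:
    """Layer 1: plain binary relevance. Layer 2: features + layer-1 predictions."""

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "TwoLayerBinaryRelevance":
        n_labels = len(Y[0])
        # Layer 1: one independent binary classifier per label.
        self.layer1 = [
            CentroidClassifier().fit(X, [row[j] for row in Y]) for j in range(n_labels)
        ]
        # Layer 2: again one classifier per label, but each one sees the
        # layer-1 predictions for every label, so label correlations can be used.
        X_aug = [self._augment(x) for x in X]
        self.layer2 = [
            CentroidClassifier().fit(X_aug, [row[j] for row in Y]) for j in range(n_labels)
        ]
        return self

    def _augment(self, x: Sequence[float]) -> List[float]:
        return list(x) + [clf.predict_one(x) for clf in self.layer1]

    def predict_one(self, x: Sequence[float]) -> List[int]:
        x_aug = self._augment(x)
        return [clf.predict_one(x_aug) for clf in self.layer2]


# Tiny made-up dataset: two features, two labels that always co-occur.
X_train = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
Y_train = [[0, 0], [0, 0], [1, 1], [1, 1]]
model = TwoLayerBinaryRelevance().fit(X_train, Y_train)
pred_low = model.predict_one([0.1, 0.1])
pred_high = model.predict_one([1.0, 1.0])
```

Both layers still solve L independent binary problems; only the input of the second layer changes, which is what lets it exploit correlations that plain binary relevance ignores.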

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics ◽

10.3390/electronics10131550 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1550

Author(s):

Alexandros Liapis ◽

Evanthia Faliagka ◽

Christos P. Antonopoulos ◽

Georgios Keramidas ◽

Nikolaos Voros

Keyword(s):

Machine Learning ◽

Deep Learning ◽

User Experience ◽

Electrodermal Activity ◽

Binary Classification ◽

Research Question ◽

Classification Problem ◽

Categorical Variables ◽

Stress Detection ◽

Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets were recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. For the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved with both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep-learning model achieved a rather high agreement when a user-annotated dataset was used for validation.
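To make "combining continuous variables and entity embeddings" concrete, here is a minimal sketch of how such a network input can be assembled. It is not the authors' model: the abstract does not name the categorical variables, so `subject_a`, `subject_b`, and `subject_c`, the embedding size, and the two continuous values are hypothetical. The embedding weights are randomly initialised here; in a real network they would be learned by backpropagation together with the other layers.

```python
import random
from typing import List, Sequence


class EmbeddingTable:
    """Lookup table mapping each category to a dense vector (untrained sketch)."""

    def __init__(self, categories: Sequence[str], dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.index = {c: i for i, c in enumerate(categories)}
        # One row of `dim` small random weights per category.
        self.weights = [
            [rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in categories
        ]

    def lookup(self, category: str) -> List[float]:
        return self.weights[self.index[category]]


def build_input(continuous: Sequence[float], category: str, table: EmbeddingTable) -> List[float]:
    """Concatenate continuous features with the category's embedding vector."""
    return list(continuous) + list(table.lookup(category))


# Hypothetical categorical variable with three levels, embedded in 2 dimensions.
table = EmbeddingTable(["subject_a", "subject_b", "subject_c"], dim=2)
# Hypothetical continuous features, e.g. one EDA value and one skin temperature.
features = build_input([0.42, 33.1], "subject_b", table)
```

The resulting vector is what a feed-forward network would receive as input; compared with one-hot encoding, the embedding keeps the input width small and lets similar categories end up with similar vectors after training.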

TPOT-NN: augmenting tree-based automated machine learning with neural network estimators

Genetic Programming and Evolvable Machines ◽

10.1007/s10710-021-09401-z ◽

2021 ◽

Author(s):

Joseph D. Romano ◽

Trang T. Le ◽

Weixuan Fu ◽

Jason H. Moore

Keyword(s):

Neural Network ◽

Machine Learning ◽

Binary Classification ◽

Inductive Learning ◽

Future Directions ◽

High Performing ◽

Learning Tasks ◽

Benchmark Datasets ◽

Automated Machine Learning ◽

Standard Tree

Abstract Automated machine learning (AutoML) and artificial neural networks (ANNs) have revolutionized the field of artificial intelligence by yielding incredibly high-performing models to solve a myriad of inductive learning tasks. In spite of their successes, little guidance exists on when to use one versus the other. Furthermore, relatively few tools exist that allow the integration of both AutoML and ANNs in the same analysis to yield results combining both of their strengths. Here, we present TPOT-NN, a new extension to the tree-based AutoML software TPOT, and use it to explore the behavior of automated machine learning augmented with neural network estimators (AutoML+NN), particularly when compared to non-NN AutoML in the context of simple binary classification on a number of public benchmark datasets. Our observations suggest that TPOT-NN is an effective tool that achieves greater classification accuracy than standard tree-based AutoML on some datasets, with no loss in accuracy on others. We also provide preliminary guidelines for performing AutoML+NN analyses, and recommend possible future directions for AutoML+NN methods research, especially in the context of TPOT.

Confidence interval for micro-averaged F1 and macro-averaged F1 scores

Applied Intelligence ◽

10.1007/s10489-021-02635-5 ◽

2021 ◽

Author(s):

Kanae Takahashi ◽

Kouji Yamamoto ◽

Aya Kuchiba ◽

Tatsuki Koyama

Keyword(s):

Binary Classification ◽

Classification Problem ◽

Classification Problems ◽

Summary Measure ◽

Medical Field ◽

Predictive Values ◽

Binary Classification Problem ◽

Multi Class Classification ◽

Sensitivity Specificity ◽

Measures Of Performance

Abstract A binary classification problem is common in the medical field, and we often use sensitivity, specificity, accuracy, and negative and positive predictive values as measures of performance of a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier’s performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in the context of information retrieval and information extraction evaluation since it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to the problem of multi-class classification. There are three types of F1 scores, and the statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large-sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
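The point estimates behind two of these scores can be sketched in a few lines. The sketch below computes micro-averaged F1 (pool the counts over classes, then compute F1) and macro-averaged F1 taken as the mean of per-class F1 scores; it does not implement the confidence-interval method proposed in the paper. The function names, the toy labels, and the convention of returning 0.0 when a class has no true positives, false positives, or false negatives are assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # 0.0 when undefined (assumption)


def micro_macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> Tuple[float, float]:
    classes: List[Hashable] = sorted(set(y_true) | set(y_pred))
    counts = {c: [0, 0, 0] for c in classes}  # per class: [tp, fp, fn]
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t][0] += 1
        else:
            counts[p][1] += 1  # predicted class gains a false positive
            counts[t][2] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, weighting each class equally.
    per_class = [f1_from_counts(*counts[c]) for c in classes]
    macro = sum(per_class) / len(per_class)
    # Micro: pool the counts over all classes first, then compute one F1.
    tp = sum(v[0] for v in counts.values())
    fp = sum(v[1] for v in counts.values())
    fn = sum(v[2] for v in counts.values())
    micro = f1_from_counts(tp, fp, fn)
    return micro, macro


# Made-up three-class example.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]
micro, macro = micro_macro_f1(y_true, y_pred)
```

In this example the per-class F1 scores are 0.8, 0.5, and 2/3, so macro F1 is about 0.656, while micro F1 is 4/6 ≈ 0.667. For single-label multi-class data, micro F1 coincides with accuracy because every misclassification counts once as a false positive and once as a false negative.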

Dual Generative Network with Discriminative Information for Generalized Zero-Shot Learning

Complexity ◽

10.1155/2021/6656797 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Tingting Xu ◽

Ye Zhao ◽

Xueliang Liu

Keyword(s):

Classification Problem ◽

Learning Task ◽

Visual Features ◽

Learning Technology ◽

Softmax Classifier ◽

Generation Network ◽

Benchmark Datasets ◽

Discrimination Information ◽

Novel Model ◽

Generalized Zero

Zero-shot learning is dedicated to solving the classification problem of unseen categories, while generalized zero-shot learning aims to classify samples drawn from both seen and unseen classes, where “seen” classes are those available during training and “unseen” classes are those that are not. Nowadays, with the advance of deep learning technology, the performance of zero-shot learning has been greatly improved. Generalized zero-shot learning is a challenging topic that has promising prospects in many realistic scenarios. Although the zero-shot learning task has made gratifying progress, existing methods still show a strong deviation between seen classes and unseen classes. Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between the two domains, while ignoring the intrinsic characteristics of visual features, which are discriminative enough to be classified by themselves. To solve the above problems, we propose a novel model that uses the discriminative information of visual features to optimize the generative module, which is a dual generation network framework composed of a conditional VAE and an improved WGAN. Specifically, the model uses the discriminative information of visual features, synthesizes the visual features of unseen categories from the relevant semantic embedding by using the learned generator, and then trains the final softmax classifier on the generated visual features, thus realizing the recognition of unseen categories. In addition, this paper also analyzes the effect of additional classifiers with different structures on the transmission of discriminative information. We have conducted extensive experiments on six commonly used benchmark datasets (AWA1, AWA2, APY, FLO, SUN, and CUB). The experimental results show that our model outperforms several state-of-the-art methods for both traditional and generalized zero-shot learning.

Enhancing Vibration-Based Structural Health Monitoring via Edge Computing: A Tiny Machine Learning Perspective

10.1115/qnde2021-75153 ◽

2021 ◽

Author(s):

Federica Zonzini ◽

Francesca Romano ◽

Antonio Carbone ◽

Matteo Zauli ◽

Luca De Marchi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Structural Health Monitoring ◽

Health Monitoring ◽

Network Architecture ◽

Binary Classification ◽

Low Cost ◽

Classification Problems ◽

Computationally Efficient ◽

Structural Health

Abstract Despite the outstanding improvements achieved by artificial intelligence in the Structural Health Monitoring (SHM) field, some challenges remain. Among them is the need to reduce model complexity and data-to-user latency, both of which still affect state-of-the-art solutions. This is due to the continuous forwarding of a huge amount of data to centralized servers, where the inference process is usually executed in a bulky manner. Conversely, the emerging field of Tiny Machine Learning (TinyML), promoted by recent advancements in the electronic and information engineering community, has made sensor-near data inference a tangible, low-cost and computationally efficient alternative. In line with this observation, this work explored the embedding of the One Class Classifier Neural Network (OCCNN), i.e., a neural network architecture solving binary classification problems for vibration-based SHM scenarios, into a resource-constrained device. To this end, OCCNN has been ported to the Arduino Nano 33 BLE Sense platform and validated with experimental data from the Z24 bridge use case, reaching an average accuracy and precision of 95% and 94%, respectively.

Foundations of Machine Learning-Based Clinical Prediction Modeling: Part IV—A Practical Approach to Binary Classification Problems

10.1007/978-3-030-85292-4_5 ◽

2021 ◽

pp. 33-41

Author(s):

Victor E. Staartjes ◽

Julius M. Kernbach

Keyword(s):

Machine Learning ◽

Binary Classification ◽

Practical Approach ◽

Clinical Prediction ◽

Classification Problems ◽

Prediction Modeling

An Improved Multi-Label Classifier Chain Derived from Binary Relevance

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.551.302 ◽

2014 ◽

Vol 551 ◽

pp. 302-308

Author(s):

Tao Guo ◽

Gui Yang Li

Keyword(s):

Double Layer ◽

Classification Problem ◽

Classification Methods ◽

Improved Method ◽

Binary Relevance ◽

Space Experiments ◽

Benchmark Datasets ◽

Label Correlations ◽

Two Stages ◽

Attribute Space

In multi-label classification, each training example is associated with a set of labels, and the classification task is to predict the proper label set for each unseen instance. Recent multi-label classification methods mainly focus on exploiting label correlations to improve the accuracy of individual multi-label learners. In this paper, an improved method derived from binary relevance, named double layer classifier chaining (DCC), is proposed. This algorithm decomposes the multi-label classification problem into a two-stage classification process that generates a classifier chain. Each classifier in the chain is responsible for learning and predicting the binary association of its label given the attribute space, augmented by all prior binary relevance predictions in the chain. This chaining allows DCC to take correlations in the label space into account. Experiments on benchmark datasets validate the effectiveness of the proposed approach compared with other well-established methods.
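The chaining step described here can be sketched as a plain classifier chain. This is not the DCC algorithm, whose two-stage details the abstract does not give. The sketch assumes the common classifier-chain training scheme in which each link is trained on the attributes plus the true values of the earlier labels, and at prediction time on the attributes plus the earlier links' predictions. The class names and the toy nearest-centroid base learner are hypothetical.

```python
from typing import List, Sequence


class NearestCentroid:
    """Toy binary learner (assumption): predict the class with the nearer centroid."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestCentroid":
        self.centroids = {}
        for cls in (0, 1):
            rows = [x for x, t in zip(X, y) if t == cls]
            if rows:
                self.centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda cls: sq_dist(self.centroids[cls]))


class ClassifierChain:
    """Link j predicts label j from the attributes plus labels 0..j-1."""

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        n_labels = len(Y[0])
        self.links: List[NearestCentroid] = []
        for j in range(n_labels):
            # Training: augment the attributes with the true earlier labels.
            X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            target = [y[j] for y in Y]
            self.links.append(NearestCentroid().fit(X_aug, target))
        return self

    def predict_one(self, x: Sequence[float]) -> List[int]:
        preds: List[int] = []
        for clf in self.links:
            # Prediction: augment with the predictions made so far in the chain.
            preds.append(clf.predict_one(list(x) + preds))
        return preds


# Made-up data in which the second label is always the opposite of the first.
X_train = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
Y_train = [[0, 1], [0, 1], [1, 0], [1, 0]]
chain = ClassifierChain().fit(X_train, Y_train)
pred_low = chain.predict_one([0.1, 0.1])
pred_high = chain.predict_one([1.0, 1.0])
```

Because later links receive earlier label values as extra inputs, the order of labels in the chain matters, and an error early in the chain can propagate to later links.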

Multiple similarly effective solutions exist for biomedical feature selection and classification problems

Scientific Reports ◽

10.1038/s41598-017-13184-8 ◽

2017 ◽

Vol 7 (1) ◽

Cited By ~ 9

Author(s):

Jiamei Liu ◽

Cheng Xu ◽

Weifeng Yang ◽

Yayun Shu ◽

Weiwei Zheng ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Association Studies ◽

Binary Classification ◽

Learning Algorithms ◽

Optimal Solution ◽

Machine Learning Algorithms ◽

Disease Classification ◽

Genome Wide Association Studies ◽

Classification Problems

Abstract Binary classification is widely used to support decisions on various biomedical big data questions, such as clinical drug trials comparing treated participants and controls, and genome-wide association studies (GWASs) comparing participants with and without a phenotype. A machine learning model is trained for this purpose by optimizing its power to discriminate samples from the two groups. However, most classification algorithms tend to generate one locally optimal solution according to the input dataset and the mathematical assumptions about the dataset. Here we demonstrate, from the aspects of both disease classification and feature selection, that multiple different solutions may have similar classification performance. The existing machine learning algorithms may therefore have overlooked a whole shoal of fish by catching only one good one. Since most existing machine learning algorithms generate a solution by optimizing a mathematical goal, considering both the generated solution and the ignored ones may be essential for understanding the biological mechanisms behind the investigated classification question.

Predicting Takeover Success Using Machine Learning Techniques

Journal of Business & Economics Research (JBER) ◽

10.19030/jber.v10i10.7264 ◽

2012 ◽

Vol 10 (10) ◽

pp. 547

Author(s):

Mei Zhang ◽

Gregory Johnson ◽

Jia Wang

Keyword(s):

Machine Learning ◽

Learning Community ◽

Binary Classification ◽

Classification Problem ◽

Machine Learning Techniques ◽

Success Prediction ◽

Support Vector ◽

Font Size ◽

Network Support ◽

Learning Techniques

A takeover success prediction model aims to predict the probability that a takeover attempt will succeed by using information that is publicly available at the time of the announcement. We perform a thorough study using machine learning techniques to predict takeover success. Specifically, we model takeover success prediction as a binary classification problem, which has been widely studied in the machine learning community. Motivated by recent advances in machine learning, we empirically evaluate and analyze many state-of-the-art classifiers, including logistic regression, artificial neural networks, support vector machines with different kernels, decision trees, random forests, and AdaBoost. The experiments validate the effectiveness of applying machine learning to takeover success prediction, and we find that the support vector machine with a linear kernel and AdaBoost with stump weak classifiers perform best for the task. The result is consistent with the general observations of these two approaches.

Benchmarking machine learning models for the analysis of genetic data using FRESA.CAD Binary Classification Benchmarking

10.1101/733675 ◽

2019 ◽

Author(s):

Javier de Velasco Oriol ◽

Antonio Martinez-Torteya ◽

Victor Trevino ◽

Israel Alanis ◽

Edgar E. Vallejo ◽

...

Keyword(s):

Machine Learning ◽

Model Selection ◽

Binary Classification ◽

Genetic Data ◽

R Package ◽

Learning Models ◽

Classification Problems ◽

Machine Learning Methods ◽

Computational Perspective ◽

Machine Learning Models

Abstract Background: Machine learning models have proven to be useful tools for the analysis of genetic data. However, with the availability of a wide variety of such methods, model selection has become increasingly difficult, both from the human and computational perspective. Results: We present the R package FRESA.CAD Binary Classification Benchmarking, which performs systematic comparisons between a collection of representative machine learning methods for solving binary classification problems on genetic datasets. Conclusions: FRESA.CAD Binary Benchmarking proves to be a useful tool over a variety of binary classification problems comprising the analysis of genetic data, showing both quantitative and qualitative advantages over similar packages.
