Rademacher Complexity Bounds for a Penalized Multi-class Semi-supervised Algorithm

We propose Rademacher complexity bounds for multi-class classifiers trained with a two-step semi-supervised model. In the first step, the algorithm partitions the partially labeled data and then identifies dense clusters containing k predominant classes using the labeled training examples such that the proportion of their non-predominant classes is below a fixed threshold stands for clustering consistency. In the second step, a classifier is trained by minimizing a margin empirical loss over the labeled training set and a penalization term measuring the disability of the learner to predict the k predominant classes of the identified clusters. The resulting data-dependent generalization error bound involves the margin distribution of the classifier, the stability of the clustering technique used in the first step and Rademacher complexity terms corresponding to partially labeled training data. Our theoretical result exhibit convergence rates extending those proposed in the literature for the binary case, and experimental results on different multi-class classification problems show empirical evidence that supports the theory.

Download Full-text

Transductive Bounds for the Multi-Class Majority Vote Classifier

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013566 ◽

2019 ◽

Vol 33 ◽

pp. 3566-3573

Author(s):

Vasilii Feofanov ◽

Emilie Devijver ◽

Massih-Reza Amini

Keyword(s):

Learning Algorithm ◽

Majority Vote ◽

Confusion Matrix ◽

Data Sets ◽

Bayes Classifier ◽

Partially Labeled Data ◽

Margin Distribution ◽

Training Examples ◽

Multi Class Classification ◽

Self Learning

In this paper, we propose a transductive bound over the risk of the majority vote classifier learned with partially labeled data for the multi-class classification. The bound is obtained by considering the class confusion matrix as an error indicator and it involves the margin distribution of the classifier over each class and a bound over the risk of the associated Gibbs classifier. When this latter bound is tight and, the errors of the majority vote classifier per class are concentrated on a low margin zone; we prove that the bound over the Bayes classifier’ risk is tight. As an application, we extend the self-learning algorithm to the multi-class case. The algorithm iteratively assigns pseudo-labels to a subset of unlabeled training examples that have their associated class margin above a threshold obtained from the proposed transductive bound. Empirical results on different data sets show the effectiveness of our approach compared to the same algorithm where the threshold is fixed manually, to the extension of TSVM to multi-class classification and to a graph-based semi-supervised algorithm.

Download Full-text

Global Rademacher Complexity Bounds: From Slow to Fast Convergence Rates

Neural Processing Letters ◽

10.1007/s11063-015-9429-2 ◽

2015 ◽

Vol 43 (2) ◽

pp. 567-602 ◽

Cited By ~ 14

Author(s):

Luca Oneto ◽

Alessandro Ghio ◽

Sandro Ridella ◽

Davide Anguita

Keyword(s):

Convergence Rates ◽

Fast Convergence ◽

Complexity Bounds ◽

Rademacher Complexity

Download Full-text

Scalable Approach to High Coverages on Oxides via Iterative Training of a Machine-Learning Algorithm

10.26434/chemrxiv.10288514.v1 ◽

2019 ◽

Author(s):

Andrew Medford ◽

Shengchun Yang ◽

Fuzhu Liu

Keyword(s):

Machine Learning ◽

Chemical Potential ◽

Learning Algorithm ◽

Absolute Error ◽

Low Energy ◽

Training Data ◽

High Coverage ◽

Metal Compounds ◽

Adsorption Energies ◽

The Stability

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.

Download Full-text

Stability Analysis of the Nanjung Water Diversion Twin Tunnels based on Convergence Measurement

Indonesian Mining Professionals Journal ◽

10.36986/impj.v1i1.11 ◽

2019 ◽

Vol 1 (1) ◽

pp. 49-60

Author(s):

Simon Heru Prassetyo ◽

Ganda Marihot Simangunsong ◽

Ridho Kresna Wattimena ◽

Made Astawa Rai ◽

Irwandy Arif ◽

...

Keyword(s):

Stability Analysis ◽

Convergence Rate ◽

Critical Strain ◽

Convergence Rates ◽

Strain Analysis ◽

Water Diversion ◽

Tunnel Face ◽

Tunnel Stability ◽

The Stability ◽

Twin Tunnels

This paper focuses on the stability analysis of the Nanjung Water Diversion Twin Tunnels using convergence measurement. The Nanjung Tunnel is horseshoe-shaped in cross-section, 10.2 m x 9.2 m in dimension, and 230 m in length. The location of the tunnel is in Curug Jompong, Margaasih Subdistrict, Bandung. Convergence monitoring was done for 144 days between February 18 and July 11, 2019. The results of the convergence measurement were recorded and plotted into the curves of convergence vs. day and convergence vs. distance from tunnel face. From these plots, the continuity of the convergence and the convergence rate in the tunnel roof and wall were then analyzed. The convergence rates from each tunnel were also compared to empirical values to determine the level of tunnel stability. In general, the trend of convergence rate shows that the Nanjung Tunnel is stable without any indication of instability. Although there was a spike in the convergence rate at several STA in the measured span, that spike was not replicated by the convergence rate in the other measured spans and it was not continuous. The stability of the Nanjung Tunnel is also confirmed from the critical strain analysis, in which most of the STA measured have strain magnitudes located below the critical strain line and are less than 1%.

Download Full-text

MULFE: Multi-Label Learning via Label-Specific Feature Space Ensemble

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3451392 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-24

Author(s):

Yaojin Lin ◽

Qinghua Hu ◽

Jinghua Liu ◽

Xingquan Zhu ◽

Xindong Wu

Keyword(s):

Empirical Studies ◽

Feature Space ◽

Training Data ◽

Data Sets ◽

Learning Framework ◽

Feature Spaces ◽

Public Data ◽

Margin Distribution ◽

Label Correlations ◽

Label Correlation

In multi-label learning, label correlations commonly exist in the data. Such correlation not only provides useful information, but also imposes significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to explore label-specific features from the training data, and uses feature highly customized to the multi-label set for learning. While such feature embedding methods have demonstrated good performance, the creation of the feature embedding space is only based on a single label, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, mu lti- l abel-specific f eature space e nsemble (MULFE), takes consideration label-specific features, label correlation, and weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label’s negative and positive instances, MULFE first creates features customized to each label. After that, MULFE utilizes the label correlation to optimize the margin distribution of the base classifiers which are induced by the related label-specific feature spaces. By combining multiple label-specific features, label correlation based weighting, and ensemble learning, MULFE achieves maximum margin multi-label classification goal through the underlying optimization framework. Empirical studies on 10 public data sets manifest the effectiveness of MULFE.

Download Full-text

CrossGR

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3448100 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-23

Author(s):

Xinyi Li ◽

Liqiong Chang ◽

Fangfang Song ◽

Ju Wang ◽

Xiaojiang Chen ◽

...

Keyword(s):

Gesture Recognition ◽

Low Cost ◽

User Involvement ◽

Recognition System ◽

Training Data ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Target User ◽

Order Of Magnitude ◽

Training Examples

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?". This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge (such as who is currently performing a gesture) of the target user. Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data to build an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate many synthetic training data from a small set of real-world examples collected from a small number of users. Such a strategy allows CrossGR to minimize the user involvement and the associated cost in collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to perform gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). We demonstrate that CrossGR delivers comparable recognition accuracy, but uses an order of magnitude less training samples collected from the end-users when compared to state-of-the-art recognition systems.

Download Full-text

Confidence interval for micro-averaged F1 and macro-averaged F1 scores

Applied Intelligence ◽

10.1007/s10489-021-02635-5 ◽

2021 ◽

Author(s):

Kanae Takahashi ◽

Kouji Yamamoto ◽

Aya Kuchiba ◽

Tatsuki Koyama

Keyword(s):

Binary Classification ◽

Classification Problem ◽

Classification Problems ◽

Summary Measure ◽

Medical Field ◽

Predictive Values ◽

Binary Classification Problem ◽

Multi Class Classification ◽

Sensitivity Specificity ◽

Measures Of Performance

AbstractA binary classification problem is common in medical field, and we often use sensitivity, specificity, accuracy, negative and positive predictive values as measures of performance of a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier’s performance, F1 score, defined as the harmonic mean of precision and recall, is widely used in the context of information retrieval and information extraction evaluation since it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to the problem of multi-class classification. There are three types of F1 scores, and statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large sample multivariate central limit theorem for estimating F1 scores with confidence intervals.

Download Full-text

An integral equation formulation of the N-body dielectric spheres problem. Part 1: Numerical analysis

ESAIM Mathematical Modelling and Numerical Analysis ◽

10.1051/m2an/2020030 ◽

2020 ◽

Author(s):

Muhammad Hassan ◽

Benjamin Stamm

Keyword(s):

Integral Equation ◽

Numerical Analysis ◽

Error Analysis ◽

Convergence Rates ◽

A Priori ◽

Approximation Error ◽

Spherical Particles ◽

Integral Equation Formulation ◽

Dielectric Spheres ◽

The Stability

In this article, we analyse an integral equation of the second kind that represents the solution of N interacting dielectric spherical particles undergoing mutual polarisation. A traditional analysis can not quantify the scaling of the stability constants- and thus the approximation error- with respect to the number N of involved dielectric spheres. We develop a new a priori error analysis that demonstrates N-independent stability of the continuous and discrete formulations of the integral equation. Consequently, we obtain convergence rates that are independent of N.

Download Full-text

DANNP: an efficient artificial neural network pruning tool

PeerJ Computer Science ◽

10.7717/peerj-cs.137 ◽

2017 ◽

Vol 3 ◽

pp. e137 ◽

Cited By ~ 7

Author(s):

Mona Alshahrani ◽

Othman Soufan ◽

Arturo Magana-Mora ◽

Vladimir B. Bajic

Keyword(s):

Neural Network ◽

State Of The Art ◽

Model Performance ◽

Training Data ◽

Classification Problems ◽

Link Type ◽

On Line ◽

Pruning Algorithms ◽

Artificial Neural ◽

The Impact

Background Artificial neural networks (ANNs) are a robust class of machine learning models and are a frequent choice for solving classification problems. However, determining the structure of the ANNs is not trivial as a large number of weights (connection links) may lead to overfitting the training data. Although several ANN pruning algorithms have been proposed for the simplification of ANNs, these algorithms are not able to efficiently cope with intricate ANN structures required for complex classification problems. Methods We developed DANNP, a web-based tool, that implements parallelized versions of several ANN pruning algorithms. The DANNP tool uses a modified version of the Fast Compressed Neural Network software implemented in C++ to considerably enhance the running time of the ANN pruning algorithms we implemented. In addition to the performance evaluation of the pruned ANNs, we systematically compared the set of features that remained in the pruned ANN with those obtained by different state-of-the-art feature selection (FS) methods. Results Although the ANN pruning algorithms are not entirely parallelizable, DANNP was able to speed up the ANN pruning up to eight times on a 32-core machine, compared to the serial implementations. To assess the impact of the ANN pruning by DANNP tool, we used 16 datasets from different domains. In eight out of the 16 datasets, DANNP significantly reduced the number of weights by 70%–99%, while maintaining a competitive or better model performance compared to the unpruned ANN. Finally, we used a naïve Bayes classifier derived with the features selected as a byproduct of the ANN pruning and demonstrated that its accuracy is comparable to those obtained by the classifiers trained with the features selected by several state-of-the-art FS methods. The FS ranking methodology proposed in this study allows the users to identify the most discriminant features of the problem at hand. To the best of our knowledge, DANNP (publicly available at www.cbrc.kaust.edu.sa/dannp) is the only available and on-line accessible tool that provides multiple parallelized ANN pruning options. Datasets and DANNP code can be obtained at www.cbrc.kaust.edu.sa/dannp/data.php and https://doi.org/10.5281/zenodo.1001086.

Download Full-text