Cyclical Learning Rates (CLRs) for Improving Training Accuracies and Lowering Computational Cost

Author(s):  
Rushikesh Chopade ◽  
Aditya Stanam ◽  
Anand Narayanan ◽  
Shrikant Pawar

Prediction of different lung pathologies from chest X-ray images is a challenging task requiring robust training and testing accuracies. In this article, one-class classifier (OCC) and binary classification algorithms were tested to classify 14 different diseases (atelectasis, cardiomegaly, consolidation, effusion, edema, emphysema, fibrosis, hernia, infiltration, mass, nodule, pneumonia, pneumothorax, and pleural thickening). We utilized three different neural network architectures (MobileNetV1, AlexNet, and DenseNet-121) with three different optimizers (SGD, Adam, and RMSProp) to compare the best possible accuracies. Cyclical learning rate (CLR), a hyperparameter tuning technique, was found to yield faster convergence of the cost towards the minimum of the cost function. Here, we present a unique approach of re-training binary classification models, previously trained with a learning rate decay technique, using CLRs. In doing so, we found a significant improvement in training accuracies for each of the selected conditions. Thus, utilizing CLRs in callback functions appears to be a promising strategy for image classification problems.
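The abstract does not state which CLR policy was used; as an illustration, the common triangular policy can be sketched as follows (the function name and parameter values are illustrative, not taken from the article):

```python
import math

def triangular_clr(iteration, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate.

    The learning rate ramps linearly from base_lr up to max_lr and
    back down, completing one full cycle every 2 * step_size iterations.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```

A function like this can be passed to a per-iteration learning-rate callback, so the optimizer's rate oscillates between the two bounds instead of decaying monotonically.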

2017 ◽  
Vol 29 (12) ◽  
pp. 3353-3380 ◽  
Author(s):  
Shao-Bo Lin ◽  
Jinshan Zeng ◽  
Xiangyu Chang

This letter aims at a refined error analysis for binary classification using support vector machines (SVMs) with the Gaussian kernel and convex losses. Our first result shows that for some loss functions, such as the truncated quadratic loss and the quadratic loss, SVM with the Gaussian kernel can reach an almost optimal learning rate provided the regression function is smooth. Our second result shows that, for a large class of loss functions, under a Tsybakov noise assumption, if the regression function is infinitely smooth, then SVM with the Gaussian kernel can achieve a learning rate of order [Formula: see text], where [Formula: see text] is the number of samples.


2020 ◽  
Vol 11 (1) ◽  
pp. 19-34 ◽  
Author(s):  
Stefania Bellavia ◽  
Nataša Krklec Jerinkić ◽  
Greta Malaspina

This paper deals with subsampled spectral gradient methods for minimizing finite sums. Subsampled function and gradient approximations are employed in order to reduce the overall computational cost of classical spectral gradient methods. Global convergence is enforced by a nonmonotone line search procedure and is proved provided that functions and gradients are approximated with increasing accuracy. R-linear convergence and worst-case iteration complexity are investigated in the case of a strongly convex objective function. Numerical results on well-known binary classification problems are given to show the effectiveness of this framework and to analyze the effect of different spectral coefficient approximations arising from the variable-sample nature of this procedure.
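The spectral step length at the core of such methods is the Barzilai-Borwein (BB) rule, computed from successive iterates and their (possibly subsampled) gradients. The sketch below shows the first BB rule with a simple clipping safeguard; the safeguards and subsampling schedule analyzed in the paper may differ:

```python
import numpy as np

def bb_step(x_prev, x_curr, g_prev, g_curr, lam_min=1e-10, lam_max=1e10):
    """First Barzilai-Borwein (spectral) step length.

    s = change in iterates, y = change in (subsampled) gradients;
    the step (s.s)/(s.y) approximates the inverse curvature along s.
    """
    s = x_curr - x_prev
    y = g_curr - g_prev
    sy = float(s @ y)
    if sy <= 0:
        return lam_max  # safeguard when curvature estimate is non-positive
    return float(np.clip((s @ s) / sy, lam_min, lam_max))
```

On a quadratic with curvature a, the rule recovers the exact inverse curvature 1/a, which is what makes spectral methods competitive without a full line search at every step.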


2021 ◽  
Vol 13 (21) ◽  
pp. 4407
Author(s):  
Yuchao Feng ◽  
Jianwei Zheng ◽  
Mengjie Qin ◽  
Cong Bai ◽  
Jinglin Zhang

Owing to their outstanding feature extraction capability, convolutional neural networks (CNNs) have been widely applied to hyperspectral image (HSI) classification problems and have achieved impressive performance. However, it is well known that 2D convolution neglects spectral information, while 3D convolution incurs a huge computational cost. In addition, the cost of labeling and the limitation of computing resources make it urgent to improve the generalization performance of models trained on scarcely labeled samples. To address these issues, we design an end-to-end 3D octave and 2D vanilla mixed CNN, namely Oct-MCNN-HS, based on the typical 3D-2D mixed CNN (MCNN). Notably, two feature fusion operations are deliberately constructed to maximize discriminative features and practical performance: 2D vanilla convolution merges the feature maps generated by 3D octave convolutions along the channel direction, and homology shifting aggregates the information of pixels located at the same spatial position. Extensive experiments are conducted on four publicly available HSI datasets to evaluate the effectiveness and robustness of our model, and the results verify the superiority of Oct-MCNN-HS in both efficacy and efficiency.


2021 ◽  
Vol 13 (9) ◽  
pp. 1623
Author(s):  
João E. Batista ◽  
Ana I. R. Cabral ◽  
Maria J. P. Vasconcelos ◽  
Leonardo Vanneschi ◽  
Sara Silva

Genetic programming (GP) is a powerful machine learning (ML) algorithm that can produce readable white-box models. Although successfully used for solving an array of problems in different scientific areas, GP is still not well known in the field of remote sensing. The M3GP algorithm, a variant of the standard GP algorithm, performs feature construction by evolving hyperfeatures from the original ones. In this work, we use the M3GP algorithm on several sets of satellite images over different countries to create hyperfeatures from satellite bands to improve the classification of land cover types. We add the evolved hyperfeatures to the reference datasets and observe a significant improvement in the performance of three state-of-the-art ML algorithms (decision trees, random forests, and XGBoost) on multiclass classifications, and no significant effect on the binary classifications. We show that adding the M3GP hyperfeatures to the reference datasets brings better results than adding the well-known spectral indices NDVI, NDWI, and NBR. We also compare the performance of the M3GP hyperfeatures on the binary classification problems with that of features created by other feature construction methods, such as FFX and EFS.
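For context, the spectral indices used as the baseline comparison are all simple normalized band differences; NDVI, for example, can be computed as below (the epsilon guard and band values are illustrative, not from the article):

```python
def normalized_difference(a, b, eps=1e-12):
    """Generic normalized-difference index (a - b) / (a + b)."""
    return (a - b) / (a + b + eps)

def ndvi(nir, red):
    """NDVI: normalized difference of near-infrared and red reflectance."""
    return normalized_difference(nir, red)
```

NDWI and NBR follow the same formula with different band pairs; M3GP hyperfeatures generalize this idea by evolving arbitrary band combinations instead of a fixed ratio.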


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 134
Author(s):  
Loai Abdallah ◽  
Murad Badarna ◽  
Waleed Khalifa ◽  
Malik Yousef

In the computational biology community, many problems are treated as multi-one-class classification problems. Examples include the classification of multiple tumor types, protein fold recognition, and the molecular classification of multiple cancer types. In all of these cases, appropriately characterized real-world negative cases or outliers are impractical to obtain, and the positive cases might consist of different clusters, which in turn might lead to accuracy degradation. In this paper we present a novel algorithm named MultiKOC, a multi-one-class classifier based on k-means, to deal with this problem. The main idea is to execute a clustering algorithm over the positive samples to capture the hidden sub-structure of the given positive data, and then to build a one-class classifier for the examples of each cluster separately; in other words, to train an OC classifier on each piece of sub-data. For a given new sample, the generated classifiers are applied: if it is rejected by all of those classifiers, the sample is considered negative; otherwise, it is considered positive. The results of MultiKOC are compared with traditional one-class, multi-one-class, ensemble one-class, and two-class methods, yielding a significant improvement over the one-class methods and performance comparable to the two-class methods.
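The MultiKOC idea described above (cluster the positives, train one one-class classifier per cluster, and label a sample negative only if every classifier rejects it) can be sketched as follows. This pairing of k-means with a one-class SVM is for illustration only; the paper's exact one-class classifier and hyperparameters may differ:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

class MultiKOC:
    """Sketch of a multi-one-class classifier built on k-means."""

    def __init__(self, k=3, nu=0.1):
        self.k = k      # number of positive sub-clusters
        self.nu = nu    # one-class SVM outlier fraction bound

    def fit(self, X_pos):
        # Cluster the positive samples to expose hidden sub-data,
        # then train one one-class classifier per cluster.
        labels = KMeans(n_clusters=self.k, n_init=10, random_state=0).fit_predict(X_pos)
        self.models_ = [
            OneClassSVM(nu=self.nu, gamma="scale").fit(X_pos[labels == c])
            for c in range(self.k)
        ]
        return self

    def predict(self, X):
        # Each model votes +1 (accept) or -1 (reject); a sample is
        # negative only if all cluster models reject it.
        votes = np.stack([m.predict(X) for m in self.models_])
        return np.where((votes == 1).any(axis=0), 1, -1)
```

Training requires only positive samples, which matches the setting where characterized negatives are impractical to obtain.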


Author(s):  
Kanae Takahashi ◽  
Kouji Yamamoto ◽  
Aya Kuchiba ◽  
Tatsuki Koyama

A binary classification problem is common in the medical field, and we often use sensitivity, specificity, accuracy, and negative and positive predictive values as measures of the performance of a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier's performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in the context of information retrieval and information extraction evaluation, since it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to the problem of multi-class classification. There are three types of F1 scores, and the statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large-sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
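As a concrete illustration of common multi-class F1 variants, per-class, macro-averaged, and micro-averaged F1 can all be computed from a confusion matrix as follows (function and variable names are illustrative; the definitions assume every class has at least one true and one predicted instance):

```python
import numpy as np

def f1_scores(conf):
    """Per-class, macro-, and micro-averaged F1 from a confusion
    matrix, where conf[i, j] counts true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                 # true positives per class
    precision = tp / conf.sum(axis=0)  # column sums: predicted counts
    recall = tp / conf.sum(axis=1)     # row sums: actual counts
    per_class = 2 * precision * recall / (precision + recall)
    macro = per_class.mean()           # unweighted mean of per-class F1
    micro = tp.sum() / conf.sum()      # for single-label data, micro-F1 equals accuracy
    return per_class, macro, micro
```

For single-label multi-class data, total false positives equal total false negatives, so micro-averaged F1 reduces to overall accuracy, while macro-F1 weights every class equally regardless of prevalence.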

