Bi-Objective Continual Learning: Learning ‘New’ While Consolidating ‘Known’

In this paper, we propose a novel single-task continual learning framework named Bi-Objective Continual Learning (BOCL). BOCL aims at both consolidating historical knowledge and learning from new data. On the one hand, we propose to preserve old knowledge using a small set of pillars, and develop the pillar consolidation (PLC) loss to alleviate the catastrophic forgetting problem. On the other hand, we develop the contrastive pillar (CPL) loss term to improve the classification performance, and examine several data sampling strategies for efficient onsite learning from ‘new’ with a reasonable amount of computational resources. Comprehensive experiments on CIFAR10/100, CORe50 and a subset of ImageNet validate the BOCL framework. We also report the accuracy of different sampling strategies when used to finetune a given CNN model. The code will be released.

Trainable Undersampling for Class-Imbalance Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014707 ◽

2019 ◽

Vol 33 ◽

pp. 4707-4714 ◽

Cited By ~ 5

Author(s):

Minlong Peng ◽

Qi Zhang ◽

Xiaoyu Xing ◽

Tao Gui ◽

Xuanjing Huang ◽

...

Keyword(s):

Optimization Problem ◽

Class Imbalance ◽

Classification Performance ◽

Sampling Strategies ◽

Data Sampling ◽

Meta Learning ◽

Evaluation Metric ◽

Imbalance Learning ◽

Class Imbalance Learning ◽

The Given

Undersampling has been widely used in class-imbalance learning. The main deficiency of most existing undersampling methods is that their data sampling strategies are heuristic-based and independent of the classifier and evaluation metric in use. Thus, they may discard instances that are informative for the classifier during data sampling. In this work, we propose a meta-learning method built on undersampling to address this issue. The key idea of this method is to parametrize the data sampler and train it to optimize the classification performance over the evaluation metric. We solve the non-differentiable optimization problem of training the data sampler via reinforcement learning. By incorporating evaluation metric optimization into the data sampling process, the proposed method can learn which instances should be discarded for the given classifier and evaluation metric. In addition, as a data-level operation, this method can be easily applied to arbitrary evaluation metrics and classifiers, including non-parametric ones (e.g., C4.5 and KNN). Experimental results on both synthetic and realistic datasets demonstrate the effectiveness of the proposed method.
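The idea of training a parametrized sampler against a non-differentiable metric can be illustrated with a small sketch. This is not the authors' algorithm: the logistic sampler, the nearest-centroid classifier, the F1 reward, the moving-average baseline, and all function names below are assumptions chosen for illustration. The sketch uses a plain REINFORCE update, keeps every minority instance, and learns a keep-probability for each majority instance.

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 for the positive (minority) class; plays the role of the evaluation metric."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom > 0 else 0.0

def centroid_predict(X_train, y_train, X_eval):
    """Nearest-centroid classifier (hypothetical stand-in for any classifier)."""
    if not np.any(y_train == 0):
        # Every majority instance was discarded: predict the minority class.
        return np.ones(len(X_eval), dtype=int)
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_eval - c0, axis=1)
    d1 = np.linalg.norm(X_eval - c1, axis=1)
    return (d1 < d0).astype(int)

def train_sampler(X, y, X_val, y_val, steps=200, lr=0.5, seed=0):
    """Learn keep-probabilities for majority instances with REINFORCE."""
    rng = np.random.default_rng(seed)
    maj = np.flatnonzero(y == 0)
    mino = np.flatnonzero(y == 1)
    feats = np.hstack([X[maj], np.ones((len(maj), 1))])  # features plus bias
    w = np.zeros(feats.shape[1])
    baseline = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(feats @ w, -30, 30)))
        keep = rng.random(len(maj)) < p            # sample a subset
        idx = np.concatenate([maj[keep], mino])    # minority is always kept
        y_hat = centroid_predict(X[idx], y[idx], X_val)
        reward = f1_score(y_val, y_hat)            # non-differentiable reward
        # Gradient of the Bernoulli log-likelihood of the sampled mask.
        grad_logp = (keep.astype(float) - p) @ feats / len(maj)
        w += lr * (reward - baseline) * grad_logp
        baseline = 0.9 * baseline + 0.1 * reward   # variance-reduction baseline
    keep_prob = 1.0 / (1.0 + np.exp(-np.clip(feats @ w, -30, 30)))
    return w, keep_prob
```

Because the reward only requires predictions from the classifier, the same loop would work with any classifier and metric, which is the property the abstract highlights.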

Data Sampling Strategies for Click Fraud Detection Using Imbalanced User Click Data of Online Advertising: An Empirical Review

IETE Technical Review ◽

10.1080/02564602.2021.1915892 ◽

2021 ◽

pp. 1-10

Author(s):

Deepti Sisodia ◽

Dilip Singh Sisodia

Keyword(s):

Online Advertising ◽

Fraud Detection ◽

Sampling Strategies ◽

Data Sampling ◽

Click Fraud

A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification

Applied Sciences ◽

10.3390/app11114880 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4880

Author(s):

Abigail Copiaco ◽

Christian Ritz ◽

Nidhal Abdulaziz ◽

Stefano Fasciani

Keyword(s):

Network Architecture ◽

Single Channel ◽

Classification Performance ◽

Network Size ◽

Directed Acyclic Graphs ◽

Spectral Features ◽

Audio Classification ◽

Resource Requirements ◽

Efficient Alternative ◽

Computational Resources

Recent methodologies for audio classification frequently involve cepstral and spectral features, applied to single-channel recordings of acoustic scenes and events. Further, transfer learning has been widely used over the years and has proven to be an efficient alternative to training neural networks from scratch. The lower time and resource requirements of pre-trained models allow for more versatility in developing classification systems. However, information on classification performance when using different features for multi-channel recordings is often limited. Furthermore, pre-trained networks are initially trained on larger databases and are often unnecessarily large. This poses a challenge when developing systems for devices with limited computational resources, such as mobile or embedded devices. This paper presents a detailed study of the most apparent and widely used cepstral and spectral features for multi-channel audio applications. Accordingly, we propose the use of spectro-temporal features. Additionally, the paper details the development of a compact version of the AlexNet model for computationally limited platforms through studies of performance against various architectural and parameter modifications of the original network. The aim is to minimize the network size while maintaining the series network architecture and preserving the classification accuracy. Considering that other state-of-the-art compact networks use complex directed acyclic graphs, a series architecture offers an advantage in customizability. Experimentation was carried out in Matlab, using a database that we generated for this task, which consists of four-channel synthetic recordings of both sound events and scenes. The top-performing methodology resulted in a weighted F1-score of 87.92% for scalogram features classified via the modified AlexNet-33 network, which has a size of 14.33 MB. The AlexNet network returned 86.24% at a size of 222.71 MB.

An explainable Artificial Intelligence approach to study MCI to AD conversion via HD-EEG processing

Clinical EEG and Neuroscience ◽

10.1177/15500594211063662 ◽

2021 ◽

pp. 155005942110636

Author(s):

Francesco Carlo Morabito ◽

Cosimo Ieracitano ◽

Nadia Mammone

Keyword(s):

Artificial Intelligence ◽

Frontal Lobe ◽

Parietal Lobe ◽

Classification Performance ◽

Head Region ◽

Left Frontal Lobe ◽

Performance Accuracy ◽

Power Spectral ◽

Explainable Artificial Intelligence

An explainable Artificial Intelligence (xAI) approach is proposed to longitudinally monitor subjects affected by Mild Cognitive Impairment (MCI) by using high-density electroencephalography (HD-EEG). To this end, a group of MCI patients was enrolled at IRCCS Centro Neurolesi Bonino Pulejo of Messina (Italy) within a follow-up protocol that included two evaluation steps: T0 (first evaluation) and T1 (three months later). At T1, four MCI patients had converted to Alzheimer’s Disease (AD) and were included in the analysis, as the goal of this work was to use xAI to detect individual changes in EEGs possibly related to the degeneration from MCI to AD. The proposed methodology consists of mapping segments of HD-EEG into channel-frequency maps by means of the power spectral density. Such maps are used as input to a Convolutional Neural Network (CNN), trained to label the maps as “T0” (MCI state) or “T1” (AD state). Experimental results showed high intra-subject classification performance (accuracy rate up to 98.97% (95% confidence interval: 98.68–99.26)). Subsequently, the explainability of the proposed CNN is explored via a Grad-CAM approach. The procedure made it possible to detect which EEG channels (i.e., head regions) and ranges of frequencies (i.e., sub-bands) were more active in the progression to AD. The xAI analysis showed that the main information is included in the delta sub-band and that, limited to the analyzed dataset, the most relevant areas are: the left-temporal and central-frontal lobe for Sb01, the parietal lobe for Sb02, the left-frontal lobe for Sb03 and the left-frontotemporal region for Sb04.
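As a rough illustration of what a channel-frequency map is, the sketch below turns one multi-channel segment into a (channels × frequencies) array of power spectral density values. The abstract does not say which PSD estimator was used; the Hann-windowed periodogram, the function name `channel_frequency_map`, and the sampling rate in the usage example are assumptions.

```python
import numpy as np

def channel_frequency_map(segment, fs):
    """Estimate a one-sided PSD per channel with a Hann-windowed periodogram.

    segment: array of shape (n_channels, n_samples)
    fs:      sampling rate in Hz
    Returns (freqs, psd) with psd of shape (n_channels, n_samples // 2 + 1).
    """
    n = segment.shape[1]
    window = np.hanning(n)
    # Remove each channel's mean, then taper to reduce spectral leakage.
    x = (segment - segment.mean(axis=1, keepdims=True)) * window
    spectrum = np.fft.rfft(x, axis=1)
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    # Fold negative frequencies into the one-sided estimate
    # (every bin except DC, and Nyquist when n is even).
    if n % 2 == 0:
        psd[:, 1:-1] *= 2.0
    else:
        psd[:, 1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd

# Usage: two synthetic "channels", one dominated by 2 Hz and one by 10 Hz.
fs = 128
t = np.arange(4 * fs) / fs
segment = np.vstack([np.sin(2 * np.pi * 2 * t), np.sin(2 * np.pi * 10 * t)])
freqs, psd = channel_frequency_map(segment, fs)
```

Each row of `psd` is one channel's spectrum, so the array can be treated as a single-channel image and passed to a CNN.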

An Implementation of Yolo-family Algorithms in Classifying the Product Quality for the ABS Metallization

10.21203/rs.3.rs-898089/v1 ◽

2021 ◽

Author(s):

YUH-WEN CHEN ◽

Jing Mau Shiu

Keyword(s):

Visual Inspection ◽

Acrylonitrile Butadiene Styrene ◽

Labor Cost ◽

Classification Performance ◽

Video Data ◽

Production Lines ◽

Learning Framework ◽

Optical Inspection ◽

The Neural Network ◽

Product Surface

In the traditional electroplating industry of Acrylonitrile Butadiene Styrene (ABS), quality control inspection of the product surface is usually performed with the naked eye. However, defects on the surface of electroplated products are minor and easily overlooked under reflective conditions. If the number of defects and samples is too large, manual inspection becomes challenging and time-consuming. We applied Additive Manufacturing (AM) to design and assemble an automatic optical inspection (AOI) system. The system can identify defects on the reflective surface of the plated product. Based on the YOLO deep learning framework, we successfully ran the neural network model on a GPU using the family of YOLO algorithms from v2 to v5. Our efforts achieved an average accuracy rate of over 70 percent for detection on real-time video data in production lines. We also compare the classification performance among various YOLO algorithms. Our work significantly reduces the labor cost of visual inspection in the electroplating industry.

A Multivariate Empirical Orthogonal Function Method to Construct Nitrate Maps in the Southern Ocean

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-18-0018.1 ◽

2018 ◽

Vol 35 (7) ◽

pp. 1505-1519 ◽

Cited By ~ 4

Author(s):

Yu-Chiao Liang ◽

Matthew R. Mazloff ◽

Isabella Rosso ◽

Shih-Wei Fang ◽

Jin-Yi Yu

Keyword(s):

Southern Ocean ◽

General Circulation ◽

Model Simulation ◽

Potential Temperature ◽

Circulation Model ◽

Empirical Orthogonal Functions ◽

Cluster Method ◽

Sampling Strategies ◽

Data Sampling ◽

Mean Square Errors

The ability to construct nitrate maps in the Southern Ocean (SO) from sparse observations is important for marine biogeochemistry research, as it offers a geographical estimate of biological productivity. The goal of this study is to infer the skill of constructed SO nitrate maps using varying data sampling strategies. The mapping method uses multivariate empirical orthogonal functions (MEOFs) constructed from nitrate, salinity, and potential temperature (N-S-T) fields from a biogeochemical general circulation model simulation. Synthetic N-S-T datasets are created by sampling modeled N-S-T fields in specific regions, determined either by random selection or by selecting regions over a certain threshold of nitrate temporal variances. The first 500 MEOF modes, determined by their capability to reconstruct the original N-S-T fields, are projected onto these synthetic N-S-T data to construct time-varying nitrate maps. Normalized root-mean-square errors (NRMSEs) are calculated between the constructed nitrate maps and the original modeled fields for different sampling strategies. The sampling strategy according to nitrate variances is shown to yield maps with lower NRMSEs than mapping that adopts random sampling. A k-means cluster method that considers the N-S-T combined variances to identify key regions to insert data is most effective in reducing the mapping errors. These findings are further quantified by a series of mapping error analyses that also address the significance of data sampling density. The results provide a sampling framework to prioritize the deployment of biogeochemical Argo floats for constructing nitrate maps.
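The mapping idea (fit EOF amplitudes to sparse samples, rebuild the full field, score it with NRMSE) can be sketched in a few lines. This is a simplified single-variable version, not the paper's multivariate procedure: the SVD-based EOFs, the least-squares fit, the normalization of the RMSE by the standard deviation of the true field, and all function names are assumptions.

```python
import numpy as np

def eof_modes(fields, n_modes):
    """Leading EOFs of a (n_times, n_points) data matrix via SVD of the anomalies."""
    mean = fields.mean(axis=0)
    _, _, vt = np.linalg.svd(fields - mean, full_matrices=False)
    return mean, vt[:n_modes]                  # modes: (n_modes, n_points)

def reconstruct(obs, obs_idx, mean, modes):
    """Fit mode amplitudes to sparse observations, then rebuild the full map."""
    A = modes[:, obs_idx].T                    # (n_obs, n_modes)
    coeffs, *_ = np.linalg.lstsq(A, obs - mean[obs_idx], rcond=None)
    return mean + coeffs @ modes

def nrmse(estimate, truth):
    """RMSE normalized by the standard deviation of the truth (assumed convention)."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)) / np.std(truth))

# Usage on a synthetic rank-3 "model simulation" with 50 times and 40 grid points.
rng = np.random.default_rng(0)
fields = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 40))
mean, modes = eof_modes(fields, n_modes=3)
obs_idx = rng.choice(40, size=10, replace=False)   # random sampling strategy
truth = fields[7]
estimate = reconstruct(truth[obs_idx], obs_idx, mean, modes)
error = nrmse(estimate, truth)
```

Comparing sampling strategies then amounts to changing how `obs_idx` is chosen, for example picking the points with the largest temporal variance instead of random ones, and comparing the resulting `error` values.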

Fuzziness-based active learning framework to enhance hyperspectral image classification performance for discriminative and generative classifiers

PLoS ONE ◽

10.1371/journal.pone.0188996 ◽

2018 ◽

Vol 13 (1) ◽

pp. e0188996 ◽

Cited By ~ 15

Author(s):

Muhammad Ahmad ◽

Stanislav Protasov ◽

Adil Mehmood Khan ◽

Rasheed Hussain ◽

Asad Masood Khattak ◽

...

Keyword(s):

Active Learning ◽

Image Classification ◽

Hyperspectral Image ◽

Classification Performance ◽

Hyperspectral Image Classification ◽

Learning Framework

Incremental Dilations Using CNN for Brain Tumor Classification

Applied Sciences ◽

10.3390/app10144915 ◽

2020 ◽

Vol 10 (14) ◽

pp. 4915 ◽

Cited By ~ 1

Author(s):

Sanjiban Sekhar Roy ◽

Nishant Rodrigues ◽

Y-h. Taguchi

Keyword(s):

Neural Networks ◽

Brain Tumor ◽

Convolutional Neural Networks ◽

Model Performance ◽

Classification Performance ◽

Tumor Classification ◽

Computational Overhead ◽

Performance Accuracy ◽

Artificial Brain ◽

Dilation Rate

Brain tumor classification is a challenging task in the field of medical image processing. Technology now gives medical doctors additional aid for diagnosis. We aim to classify brain tumors using MRI images, which were collected from anonymous patients and artificial brain simulators. In this article, we carry out a comparative study of Simple Artificial Neural Networks with dropout, Basic Convolutional Neural Networks (CNN), and Dilated Convolutional Neural Networks. The experimental results show the high classification performance (accuracy 97%) of Dilated CNN. On the other hand, Dilated CNN suffers from the gridding phenomenon. An incremental, even-numbered dilation rate takes advantage of the reduced computational overhead and also overcomes the adverse effects of gridding. Comparative analysis of different combinations of dilation rates for the different convolution layers helps validate the results. The computational overhead, in terms of the efficiency of training the model to reach an acceptable threshold accuracy of 90%, is another parameter used to compare model performance.
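For readers unfamiliar with dilation and gridding, the following one-dimensional sketch may help. It is a generic illustration, not the paper's network: the function names are hypothetical, and the dilation schedules in the usage lines are chosen only to show the effect.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1D dilated cross-correlation (the operation CNN layers compute)."""
    k = len(kernel)
    span = (k - 1) * dilation                  # distance covered by the kernel taps
    out = np.zeros(len(x) - span)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 layers with the given dilation rates."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def contributing_offsets(kernel_size, dilations):
    """Input offsets that actually reach one output unit after all layers."""
    offsets = {0}
    for d in dilations:
        offsets = {o + j * d for o in offsets for j in range(kernel_size)}
    return sorted(offsets)

# Usage: a constant dilation rate leaves holes in the receptive field (gridding),
# while a varying schedule can cover every input position.
gridded = contributing_offsets(3, [2, 2])      # [0, 2, 4, 6, 8]
dense = contributing_offsets(3, [1, 2])        # [0, 1, 2, 3, 4, 5, 6]
```

In `gridded`, the odd offsets never contribute to the output, which is the checkerboard-like sampling pattern usually meant by gridding.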

Rethinking low-temperature thermochronology data sampling strategies for quantification of denudation and relief histories: A case study in the French western Alps

Earth and Planetary Science Letters ◽

10.1016/j.epsl.2011.05.003 ◽

2011 ◽

Vol 307 (3-4) ◽

pp. 309-322 ◽

Cited By ~ 9

Author(s):

Pierre G. Valla ◽

Peter A. van der Beek ◽

Jean Braun

Keyword(s):

Low Temperature ◽

Western Alps ◽

Sampling Strategies ◽

Data Sampling

A Developed Siamese CNN with 3D Adaptive Spatial-Spectral Pyramid Pooling for Hyperspectral Image Classification

Remote Sensing ◽

10.3390/rs12121964 ◽

2020 ◽

Vol 12 (12) ◽

pp. 1964 ◽

Cited By ~ 1

Author(s):

Mengbin Rao ◽

Ping Tang ◽

Zheng Zhang

Keyword(s):

Transfer Learning ◽

Hyperspectral Image ◽

Three Dimensional ◽

Classification Performance ◽

Generalization Capability ◽

Training Strategy ◽

Learning Framework ◽

Spectral Bands ◽

Learning Tasks ◽

3D Information

Since hyperspectral images (HSI) captured by different sensors often contain different numbers of bands, while most convolutional neural networks (CNNs) require a fixed-size input, the capability of deep CNNs to generalize across heterogeneous inputs and achieve better classification performance has become a research focus. For classification tasks with limited labeled samples, the training strategy of feeding CNNs with sample pairs instead of single samples has proven to be an efficient approach. Following this strategy, we propose a Siamese CNN with a three-dimensional (3D) adaptive spatial-spectral pyramid pooling (ASSP) layer, called ASSP-SCNN, that takes as input 3D sample pairs of varying size and can easily be transferred to another HSI dataset regardless of the number of spectral bands. The 3D ASSP layer can also extract different levels of 3D information to improve the classification performance of the equipped CNN. To evaluate the classification and generalization performance of ASSP-SCNN, our experiments consist of two parts: experiments with ASSP-SCNN without pre-training and experiments with an ASSP-SCNN-based transfer learning framework. Experimental results on three HSI datasets demonstrate that both ASSP-SCNN without pre-training and transfer learning based on ASSP-SCNN achieve higher classification accuracies than several state-of-the-art CNN-based methods. Moreover, we also compare the performance of ASSP-SCNN on different transfer learning tasks, which further verifies that ASSP-SCNN has a strong generalization capability.
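The reason a pyramid pooling layer removes the fixed-input-size constraint can be shown with a small sketch. This is generic 3D adaptive max pooling stacked over several grid sizes, not the paper's ASSP layer, whose exact design is not given in the abstract; the pyramid levels, the bin-edge rule, and the function names are assumptions.

```python
import numpy as np

def _bin_edges(size, n_bins):
    """Start/end indices of n_bins non-empty bins covering range(size)."""
    starts = [(i * size) // n_bins for i in range(n_bins)]
    ends = [-((-(i + 1) * size) // n_bins) for i in range(n_bins)]  # ceiling division
    return list(zip(starts, ends))

def adaptive_max_pool3d(volume, bins):
    """Max-pool a (bands, height, width) volume into a fixed grid of `bins` cells."""
    out = np.empty(bins)
    for i, (d0, d1) in enumerate(_bin_edges(volume.shape[0], bins[0])):
        for j, (h0, h1) in enumerate(_bin_edges(volume.shape[1], bins[1])):
            for k, (w0, w1) in enumerate(_bin_edges(volume.shape[2], bins[2])):
                out[i, j, k] = volume[d0:d1, h0:h1, w0:w1].max()
    return out

def pyramid_pool3d(volume, levels=(1, 2, 4)):
    """Concatenate pooled grids of several sizes into one fixed-length vector."""
    feats = [adaptive_max_pool3d(volume, (n, n, n)).ravel() for n in levels]
    return np.concatenate(feats)

# Usage: two patches with different band counts and spatial sizes
# yield feature vectors of the same length (1 + 8 + 64 = 73).
rng = np.random.default_rng(0)
patch_a = rng.normal(size=(103, 9, 9))
patch_b = rng.normal(size=(200, 7, 7))
vec_a = pyramid_pool3d(patch_a)
vec_b = pyramid_pool3d(patch_b)
```

Since the output length depends only on `levels`, any fully connected layers placed after the pooling see the same input size regardless of the sensor's band count.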
