Less is More: Culling the Training Set to Improve Robustness of Deep Neural Networks

Author(s):  
Yongshuai Liu ◽  
Jiyu Chen ◽  
Hao Chen
2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input data. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares detection results obtained with the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the relationship between the number of input samples and the obtained results had a significantly lower influence than in the case of other CNN models, which, in the authors' assessment, is a desirable feature in the case of a limited training set.
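Since the abstract includes no code, the following is a minimal, hedged sketch of the general recipe it describes: fine-tuning an off-the-shelf Faster R-CNN detector on a tiny annotated set (roughly 60 frames) for a single "power insulator" class. It uses torchvision's pre-trained Faster R-CNN; the dataset class, image sizes, class index, and hyperparameters are illustrative assumptions, not the authors' settings.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor


class InsulatorFrames(torch.utils.data.Dataset):
    """Hypothetical stand-in for the ~60 annotated inspection-video frames."""

    def __init__(self, n_frames=60):
        self.n_frames = n_frames

    def __len__(self):
        return self.n_frames

    def __getitem__(self, idx):
        image = torch.rand(3, 480, 640)                      # dummy frame
        target = {"boxes": torch.tensor([[100.0, 120.0, 220.0, 300.0]]),
                  "labels": torch.tensor([1])}               # class 1 = insulator
        return image, target


# Start from a detector pre-trained on COCO and replace its box predictor
# for two classes (background + power insulator).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

loader = torch.utils.data.DataLoader(InsulatorFrames(), batch_size=2, shuffle=True,
                                     collate_fn=lambda batch: tuple(zip(*batch)))
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

model.train()
for epoch in range(10):
    for images, targets in loader:
        loss_dict = model(list(images), list(targets))       # dict of detection losses
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```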


2020 ◽  
Vol 34 (04) ◽  
pp. 5216-5223 ◽  
Author(s):  
Sina Mohseni ◽  
Mandar Pitale ◽  
JBS Yadawa ◽  
Zhangyang Wang

The real-world deployment of Deep Neural Networks (DNNs) in safety-critical applications such as autonomous vehicles needs to address a variety of DNN vulnerabilities, one of which is detecting and rejecting out-of-distribution outliers that might result in unpredictable fatal errors. We propose a new technique that relies on self-supervision to learn generalizable out-of-distribution (OOD) features and to reject such samples at inference time. Our technique does not require prior knowledge of the distribution of targeted OOD samples and incurs no extra overhead compared to other methods. We perform multiple image classification experiments and observe that our technique performs favorably against state-of-the-art OOD detection methods. Interestingly, our method also reduces in-distribution classification risk by rejecting samples near the boundaries of the training set distribution.
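As a rough illustration of inference-time rejection (not the authors' exact self-supervised training scheme), the sketch below assumes a classifier with K in-distribution logits plus one extra OOD output, and rejects inputs whose OOD probability exceeds a threshold; the extra-output layout and the threshold value are assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def predict_with_rejection(model, x, threshold=0.5):
    """Generic score-and-reject sketch (not the authors' exact method).

    Assumes `model` returns logits over K in-distribution classes plus one
    extra OOD output as the last logit -- an assumption for illustration.
    """
    with torch.no_grad():
        logits = model(x)                       # shape: (batch, K + 1)
        probs = F.softmax(logits, dim=1)
        ood_score = probs[:, -1]                # probability mass on the OOD output
        preds = probs[:, :-1].argmax(dim=1)     # in-distribution class prediction
        preds[ood_score > threshold] = -1       # -1 marks rejected samples
    return preds, ood_score
```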


2020 ◽  
Vol 10 (2) ◽  
pp. 57-65
Author(s):  
Kaan Karakose ◽  
Metin Bilgin

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. Humans and animals learn much better when information is presented gradually, in a meaningful order that progresses from simple concepts to more complex samples, rather than randomly. The use of such training strategies in the context of artificial neural networks is called curriculum learning. In this study, a strategy was developed for curriculum learning. Using the CIFAR-10 and CIFAR-100 training sets, the last few layers of an Xception model pre-trained on ImageNet were trained to encode the training set knowledge in the model's weights. Finally, a much smaller model was trained with the presented sample-sorting methods using these difficulty levels. The findings of this study show that the accuracy obtained when training with the proposed method exceeded that of training with randomly shuffled data by more than 1% at each epoch.   Keywords: Curriculum learning, model distillation, deep learning, academia, neural networks.
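A hedged sketch of the easy-to-hard idea described above: a larger pre-trained "scoring" model assigns each training sample a difficulty score via its per-example loss, and the smaller model is then trained on samples sorted from easy to hard. The function and variable names are illustrative; the paper's exact sorting criterion and schedule may differ.

```python
import torch
import torch.nn.functional as F


def difficulty_scores(scoring_model, dataset, device="cpu"):
    """Score each sample by the per-example loss of a pre-trained scoring model
    (a generic curriculum-learning sketch, not the paper's exact criterion)."""
    scoring_model.eval().to(device)
    scores = []
    loader = torch.utils.data.DataLoader(dataset, batch_size=256)
    with torch.no_grad():
        for x, y in loader:
            logits = scoring_model(x.to(device))
            loss = F.cross_entropy(logits, y.to(device), reduction="none")
            scores.extend(loss.cpu().tolist())
    return scores


# Easy-to-hard ordering: train the smaller student model on samples sorted by
# ascending scoring-model loss (lower loss = easier example), e.g.:
# order = sorted(range(len(train_set)), key=lambda i: scores[i])
```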


2020 ◽  
Author(s):  
Tiago Luciano Passafaro ◽  
Fernando B. Lopes ◽  
João R. R. Dórea ◽  
Mark Craven ◽  
Vivian Breen ◽  
...  

Abstract Background: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5,000 observations, may have precluded the full potential of DNN. Therefore, the objective of this study was to investigate the impact of the size of the reference population on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers. Results: Predictive performance of DNN improved as sample size increased, reaching a plateau at a prediction correlation of about 0.32 when 60% of the entire training set was used. Interestingly, compared to Bayesian Ridge Regression (BRR) and Bayes Cπ without the tuning data included in the training data, DNN showed superior prediction correlation with smaller sample sizes and poorer prediction correlation with larger sample sizes. Conversely, Bayesian models fitted with both the training and tuning sets showed the best performance in terms of prediction correlation, but such an advantage vanished for larger sample sizes. DNN presented the lowest mean square error of prediction regardless of the amount of data used to train the predictive approaches, and regardless of whether the Bayesian models included the tuning set in the training set. The predictive bias was lower for DNN compared to Bayesian models regardless of the amount of data used, with estimates close to one for larger sample sizes. Conclusions: DNN had worse prediction correlation compared to BRR and Bayes Cπ, but improved mean square error of prediction and bias relative to both Bayesian models for genome-enabled prediction of body weight in broilers. Such findings highlight advantages and disadvantages between predictive approaches depending on the criterion used for comparison. Nonetheless, further analysis is necessary to detect scenarios where DNN can clearly outperform Bayesian benchmark models.
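For reference, the three evaluation criteria discussed above can be computed as in the sketch below; the bias is taken here as the slope of the regression of observed on predicted values (values near 1 indicate no bias), which is the usual convention, though the paper's exact definition is not restated in the abstract.

```python
import numpy as np


def prediction_metrics(y_obs, y_pred):
    """Prediction correlation, mean square error of prediction, and bias
    (slope of observed regressed on predicted; ~1 means unbiased)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    corr = np.corrcoef(y_obs, y_pred)[0, 1]
    mse = np.mean((y_obs - y_pred) ** 2)
    centered_pred = y_pred - y_pred.mean()
    slope = np.dot(centered_pred, y_obs - y_obs.mean()) / np.dot(centered_pred, centered_pred)
    return corr, mse, slope
```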


2021 ◽  
Author(s):  
Gregory Rutkowski ◽  
Ilgar Azizov ◽  
Evan Unmann ◽  
Marcin Dudek ◽  
Brian Arthur Grimes

As the complexity of microfluidic experiments and the associated image data volumes scale, traditional feature extraction approaches begin to struggle at both detection and analysis pipeline throughput. Deep neural networks trained to detect certain objects are rapidly emerging as data gathering tools that can either match or outperform the analysis capabilities of the conventional methods used in microfluidic emulsion science. We demonstrate that various convolutional neural networks can be trained and used as droplet detectors in a wide variety of microfluidic systems. A generalized microfluidic droplet training and validation dataset was developed and used to tune two versions of the You Only Look Once model (YOLOv3/YOLOv5) as well as Faster R-CNN. Each model was used to detect droplets in mono- and polydisperse flow cell systems. The detection accuracy of each model shows excellent statistical agreement with an implementation of the Hough transform as well as relevant ImageJ plugins. The models were also successfully used as droplet detectors in non-microfluidic micrograph observations, even though these data were not included in the training set. The models outperformed the traditional methods in more complex, porous-media-simulating chip architectures, with a significant speedup in per-frame analysis times. Implementing these neural networks as the primary detectors in these microfluidic systems not only makes the data pipeline more efficient but also opens the door for live detection and the development of autonomous microfluidic experimental platforms.
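A minimal inference sketch with one of the architectures named above (YOLOv5 loaded through torch.hub); the weights file and frame path are hypothetical placeholders rather than the authors' artifacts, and the equivalent-diameter estimate from the bounding box assumes near-circular droplet projections, a simplification for illustration.

```python
import torch

# Load a YOLOv5 model with custom droplet weights via torch.hub
# (the weights path is a hypothetical placeholder).
model = torch.hub.load("ultralytics/yolov5", "custom", path="droplet_best.pt")
model.conf = 0.25                          # confidence threshold for detections

# Run inference on a single micrograph / flow-cell frame.
results = model("frame_0001.png")
detections = results.xyxy[0]               # columns: x1, y1, x2, y2, confidence, class
print(f"{len(detections)} droplets detected")

# Rough per-droplet size from the bounding box (assumes near-circular projection).
for x1, y1, x2, y2, conf, cls in detections.tolist():
    diameter_px = 0.5 * ((x2 - x1) + (y2 - y1))
    print(f"droplet at ({x1:.0f}, {y1:.0f}) ~{diameter_px:.1f} px, conf={conf:.2f}")
```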


Author(s):  
Leonid Berlyand ◽  
Pierre-Emmanuel Jabin ◽  
C. Alex Safsten

We examine the stability of loss-minimizing training processes that are used for deep neural networks (DNN) and other classifiers. While a classifier is optimized during training through a so-called loss function, the performance of classifiers is usually evaluated by some measure of accuracy, such as the overall accuracy, which quantifies the proportion of objects that are well classified. This leads to the guiding question of stability: does decreasing loss through training always result in increased accuracy? We formalize the notion of stability and provide examples of instability. Our main result consists of two novel conditions on the classifier which, if either is satisfied, ensure stability of training; that is, we derive tight bounds on accuracy as loss decreases. We also derive a sufficient condition for stability on the training set alone, identifying flat portions of the data manifold as potential sources of instability. The latter condition is explicitly verifiable on the training dataset. Our results do not depend on the algorithm used for training, as long as loss decreases with training.
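As a purely illustrative aid to the guiding question (does lower loss always mean higher accuracy?), the short sketch below scans a recorded (loss, accuracy) training history for steps where the loss decreased while accuracy dropped. This is not the paper's formal stability criterion, only an empirical way to surface candidate counterexamples.

```python
def instability_events(history):
    """Return indices of training steps where loss decreased but accuracy fell,
    i.e., empirical counterexamples to 'lower loss implies higher accuracy'.

    `history` is a list of (loss, accuracy) pairs recorded during training;
    this is an illustration only, not the paper's formal criterion.
    """
    events = []
    for t in range(1, len(history)):
        prev_loss, prev_acc = history[t - 1]
        loss, acc = history[t]
        if loss < prev_loss and acc < prev_acc:
            events.append(t)
    return events


# Example: instability_events([(1.0, 0.60), (0.8, 0.65), (0.7, 0.62)]) -> [2]
```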


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Chenghao Cai ◽  
Yanyan Xu ◽  
Dengfeng Ke ◽  
Kaile Su

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs perform better than conventional DNNs, achieving a relative improvement of 5.60% on phoneme error rates. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly, without pretraining, when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% on word error rates.
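The abstract does not give the functional form, so the snippet below is only one plausible realization of an N-order multistate activation: a sum of shifted logistic sigmoids whose output settles into N + 1 roughly flat states. The shift spacing and exact parameterization are assumptions and may differ from the paper's definition.

```python
import torch


def msaf(x, order=2, spacing=2.0):
    """Plausible N-order multistate activation: a sum of `order` shifted
    sigmoids, yielding `order` + 1 approximately flat output states.
    The spacing and exact form are assumptions, not the paper's definition."""
    return sum(torch.sigmoid(x - k * spacing) for k in range(order))


# Example: msaf(torch.linspace(-4.0, 8.0, 7), order=2) rises from ~0 toward ~2 in two steps.
```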


Molecules ◽  
2021 ◽  
Vol 26 (5) ◽  
pp. 1285
Author(s):  
Alfonso T. García-Sosa

Substances that can modify the androgen receptor (AR) pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals in humans, chimps, and rats have been used to build machine-learning classifiers and regressors and to evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined, physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results, as determined by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving the assessment and design of compounds. Source code and data are available on GitHub.
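To make the reported metrics concrete, here is a small, self-contained sketch of how AUC and MCC are typically computed for a binary classifier with scikit-learn; the label and probability arrays are made-up placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef, roc_auc_score

# Hypothetical held-out labels and predicted probabilities, for illustration only.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])

y_pred = (y_prob >= 0.5).astype(int)       # hard labels from a 0.5 threshold
auc = roc_auc_score(y_true, y_prob)        # threshold-free ranking quality
mcc = matthews_corrcoef(y_true, y_pred)    # balanced metric, robust to class imbalance
print(f"AUC={auc:.2f}  MCC={mcc:.2f}")
```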


2020 ◽  
Author(s):  
Tiago Luciano Passafaro ◽  
Fernando B. Lopes ◽  
João R. R. Dórea ◽  
Mark Craven ◽  
Vivian Breen ◽  
...  

Abstract Background: Deep neural networks (DNN) are a particular case of artificial neural networks (ANN) composed of multiple hidden layers, and have recently gained attention in genome-enabled prediction of complex traits. Yet, few studies in genome-enabled prediction have assessed the performance of DNN compared to traditional regression models. Strikingly, no clear superiority of DNN has been reported so far, and results seem highly dependent on the species and traits of application. Nevertheless, the relatively small datasets used in previous studies, most with fewer than 5,000 observations, may have precluded the full potential of DNN. Therefore, the objective of this study was to investigate the impact of the dataset sample size on the performance of DNN compared to Bayesian regression models for genome-enabled prediction of body weight in broilers by sub-sampling 63,526 observations of the training set. Results: Predictive performance of DNN improved as sample size increased, reaching a plateau at a prediction correlation of about 0.32 when 60% of the entire training set was used (i.e., 39,510 observations). Interestingly, DNN showed superior prediction correlation using up to 3% of the training set, but poorer prediction correlation thereafter, compared to Bayesian Ridge Regression (BRR) and Bayes Cπ. Regardless of the amount of data used to train the predictive machines, DNN displayed the lowest mean square error of prediction compared to all other approaches. The predictive bias was lower for DNN compared to Bayesian models regardless of the amount of data used, with estimates close to one for larger sample sizes. Conclusions: DNN had worse prediction correlation compared to BRR and Bayes Cπ, but improved mean square error of prediction and bias relative to both Bayesian models for genome-enabled prediction of body weight in broilers. Such findings highlight advantages and disadvantages between predictive approaches depending on the criterion used for comparison. Nonetheless, further analysis is necessary to detect scenarios where DNN can clearly outperform Bayesian benchmark models.

