Reverse engineering imperceptible backdoor attacks on deep neural networks for detection and training set cleansing

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals in humans, chimpanzees, and rats have been used to build machine-learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available on GitHub.
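The abstract above reports the Matthews correlation coefficient (MCC) alongside AUC on an unbalanced set. As a minimal sketch of why MCC is informative in that setting, the snippet below computes it from confusion-matrix counts. The function name, the convention of returning 0.0 when the denominator vanishes, and the example counts are illustrative assumptions, not taken from the paper or its repository.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here for illustration).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical unbalanced set: 90 negatives, 10 positives, and a
# classifier that labels every sample negative.
tp, tn, fp, fn = 0, 90, 0, 10
accuracy = (tp + tn) / (tp + tn + fp + fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy={accuracy:.2f}, mcc={mcc:.2f}")
```

In this made-up case accuracy looks high while MCC is zero, which is one reason MCC is often preferred when classes are unbalanced.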


Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features

10.20944/preprints202102.0318.v3 ◽

2021 ◽

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals in humans, chimpanzees, and rats have been used to build machine learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically-relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML


Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically-Based Features

10.20944/preprints202102.0318.v2 ◽

2021 ◽

Author(s):

Alfonso T. García-Sosa

Keyword(s):

Neural Networks ◽

Androgen Receptor ◽

Logistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Protein Structures ◽

Training Set ◽

Multivariate Logistic Model ◽

And Training ◽

Better Than

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals in humans, chimpanzees, and rats have been used to build machine learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically-relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically-based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML


Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset

Applied Sciences ◽

10.3390/app10062104 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2104

Author(s):

Michał Tomaszewski ◽

Paweł Michalski ◽

Jakub Osuchowski

Keyword(s):

Neural Network ◽

Neural Networks ◽

Object Detection ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Detection Efficiency ◽

Training Data ◽

Training Dataset ◽

Training Set ◽

Convolutional Network

This article presents an analysis of the effectiveness of object detection in digital images when only a limited quantity of input data is available. Using a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares the detection results of the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the number of input samples has a significantly lower influence on its results than in the case of other CNN models, which, in the authors’ assessment, is a desired feature in the case of a limited training set.


Flat Minima

Neural Computation ◽

10.1162/neco.1997.9.1.1 ◽

1997 ◽

Vol 9 (1) ◽

pp. 1-42 ◽

Cited By ~ 156

Author(s):

Sepp Hochreiter ◽

Jürgen Schmidhuber

Keyword(s):

Neural Networks ◽

Error Function ◽

Low Complexity ◽

Generalization Error ◽

Input Output ◽

Generalization Capability ◽

Training Set ◽

Weight Decay ◽

Optimal Brain Surgeon ◽

And Training

We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a “flat” minimum of the error function. A flat minimum is a large connected region in weight space where the error remains approximately constant. An MDL-based, Bayesian argument suggests that flat minima correspond to “simple” networks and low expected overfitting. The argument is based on a Gibbs algorithm variant and a novel way of splitting generalization error into underfitting and overfitting error. Unlike many previous approaches, ours does not require Gaussian assumptions and does not depend on a “good” weight prior. Instead we have a prior over input-output functions, thus taking into account net architecture and training set. Although our algorithm requires the computation of second-order derivatives, it has backpropagation's order of complexity. It automatically and effectively prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms conventional backprop, weight decay, and “optimal brain surgeon/optimal brain damage.”


Reverse-Engineering Deep Neural Networks Using Floating-Point Timing Side-Channels

2020 57th ACM/IEEE Design Automation Conference (DAC) ◽

10.1109/dac18072.2020.9218707 ◽

2020 ◽

Cited By ~ 1

Author(s):

Cheng Gongye ◽

Yunsi Fei ◽

Thomas Wahl

Keyword(s):

Neural Networks ◽

Reverse Engineering ◽

Deep Neural Networks ◽

Floating Point ◽

Side Channels


The expressivity and training of deep neural networks: Toward the edge of chaos?

Neurocomputing ◽

10.1016/j.neucom.2019.12.044 ◽

2020 ◽

Vol 386 ◽

pp. 8-17

Author(s):

Gege Zhang ◽

Gangwei Li ◽

Weining Shen ◽

Weidong Zhang

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Edge Of Chaos ◽

And Training


Computational memory-based inference and training of deep neural networks

2019 Symposium on VLSI Circuits ◽

10.23919/vlsic.2019.8778178 ◽

2019 ◽

Author(s):

A. Sebastian ◽

I. Boybat ◽

M. Dazzi ◽

I. Giannopoulos ◽

V. Jonnalagadda ◽

...

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Computational Memory ◽

And Training


Self-Supervised Learning for Generalizable Out-of-Distribution Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5966 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5216-5223 ◽

Cited By ~ 1

Author(s):

Sina Mohseni ◽

Mandar Pitale ◽

JBS Yadawa ◽

Zhangyang Wang

Keyword(s):

Neural Networks ◽

Autonomous Vehicles ◽

Deep Neural Networks ◽

State Of The Art ◽

Feature Learning ◽

Detection Methods ◽

Training Set ◽

Safety Critical ◽

Multiple Image ◽

A New Technique

The real-world deployment of Deep Neural Networks (DNNs) in safety-critical applications such as autonomous vehicles needs to address a variety of DNN vulnerabilities, one of which is detecting and rejecting out-of-distribution outliers that might result in unpredictable fatal errors. We propose a new technique relying on self-supervision for generalizable out-of-distribution (OOD) feature learning and for rejecting those samples at inference time. Our technique does not need to know the distribution of targeted OOD samples in advance and incurs no extra overhead compared to other methods. We perform multiple image classification experiments and observe that our technique performs favorably against state-of-the-art OOD detection methods. Interestingly, we find that our method also reduces in-distribution classification risk by rejecting samples near the boundaries of the training set distribution.


Less is More: Culling the Training Set to Improve Robustness of Deep Neural Networks

Lecture Notes in Computer Science - Decision and Game Theory for Security ◽

10.1007/978-3-030-01554-1_6 ◽

2018 ◽

pp. 102-114 ◽

Cited By ~ 4

Author(s):

Yongshuai Liu ◽

Jiyu Chen ◽

Hao Chen

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Training Set ◽

Less Is More
