DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection

Explainable Artificial Intelligence (xAI) Approaches and Deep Meta-Learning Models

Advances and Applications in Deep Learning ◽

10.5772/intechopen.92172 ◽

2020 ◽

Author(s):

Evren Dağlarli

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Black Box ◽

Learning Models ◽

Learning Methods ◽

Data Set ◽

Box Models ◽

Explainable Artificial Intelligence ◽

Artificial Neural ◽

Black Box Models

Explainable artificial intelligence (xAI) is one of the notable topics to have emerged recently. Many researchers are approaching the subject from different angles, and interesting results have been reported. However, we are still at an early stage in understanding these types of models. The coming years are expected to be ones in which the openness of deep learning models is widely discussed. In artificial intelligence approaches today, we frequently encounter deep learning methods. These methods can yield highly effective results, depending on the data set size, the data set quality, the methods used in feature extraction, the hyperparameter set used in the deep learning models, the activation functions, and the optimization algorithms. However, current deep learning models have important shortcomings. These artificial neural network-based models are black box models that learn from and generalize the data passed to them. Therefore, the relational link between input and output is not observable. This is an important open problem in artificial neural networks and deep learning models. For these reasons, serious effort is needed on the explainability and interpretability of black box models.

Peeking inside the Black Box: Interpreting Deep-learning Models for Exoplanet Atmospheric Retrievals

The Astronomical Journal ◽

10.3847/1538-3881/ac1744 ◽

2021 ◽

Vol 162 (5) ◽

pp. 195

Author(s):

Kai Hou Yip ◽

Quentin Changeat ◽

Nikolaos Nikolaou ◽

Mario Morvan ◽

Billy Edwards ◽

...

Keyword(s):

Deep Learning ◽

Black Box ◽

Learning Models

Generating adversarial examples without specifying a target model

PeerJ Computer Science ◽

10.7717/peerj-cs.702 ◽

2021 ◽

Vol 7 ◽

pp. e702

Author(s):

Gaoming Yang ◽

Mingwei Li ◽

Xianjing Fang ◽

Ji Zhang ◽

Xingzhu Liang

Keyword(s):

Deep Learning ◽

Success Rate ◽

Black Box ◽

Time Cost ◽

Learning Models ◽

Security Threat ◽

Practical Situation ◽

Data Set ◽

Target Model ◽

Adversarial Examples

Adversarial examples are regarded as a security threat to deep learning models, and there are many ways to generate them. However, most existing methods need to query the target model while generating examples. In practical situations, an attacker who issues too many queries is easily detected, and this problem is especially pronounced in the black-box setting. To solve it, we propose the Attack Without a Target Model (AWTM). Our algorithm does not specify any target model when generating adversarial examples, so it does not need to query the target. Experimental results show that it achieved a maximum attack success rate of 81.78% on the MNIST data set and 87.99% on the CIFAR-10 data set. In addition, it has a low time cost because it is a GAN-based method.

3D-Adv: Black-Box Adversarial Attacks against Deep Learning Models through 3D Sensors

10.1109/dac18074.2021.9586275 ◽

2021 ◽

Author(s):

Kaichen Yang ◽

Xuan-Yi Lin ◽

Yixin Sun ◽

Tsung-Yi Ho ◽

Yier Jin

Keyword(s):

Deep Learning ◽

Black Box ◽

Learning Models

Ensemble Deep Learning on Large, Mixed-Site fMRI Datasets in Autism and Other Tasks

International Journal of Neural Systems ◽

10.1142/s0129065720500124 ◽

2020 ◽

Vol 30 (07) ◽

pp. 2050012

Author(s):

Matthew Leming ◽

Juan Manuel Górriz ◽

John Suckling

Keyword(s):

Deep Learning ◽

Autism Spectrum ◽

Black Box ◽

Typically Developing ◽

Learning Models ◽

Cross Sectional ◽

Functional Connections ◽

Independent Variable ◽

The Right ◽

And Task

Deep learning models for MRI classification face two recurring problems: they are typically limited by low sample size, and are abstracted by their own complexity (the “black box problem”). In this paper, we train a convolutional neural network (CNN) with the largest multi-source, functional MRI (fMRI) connectomic dataset ever compiled, consisting of 43,858 datapoints. We apply this model to a cross-sectional comparison of autism spectrum disorder (ASD) versus typically developing (TD) controls that has proved difficult to characterize with inferential statistics. To contextualize these findings, we additionally perform classifications of gender and task versus rest. Employing class-balancing to build a training set, we trained [Formula: see text] modified CNNs in an ensemble model to classify fMRI connectivity matrices with overall AUROCs of 0.6774, 0.7680, and 0.9222 for ASD versus TD, gender, and task versus rest, respectively. Additionally, we aim to address the black box problem in this context using two visualization methods. First, class activation maps show which functional connections of the brain our models focus on when performing classification. Second, by analyzing maximal activations of the hidden layers, we were also able to explore how the model organizes a large and mixed-center dataset, finding that it dedicates specific areas of its hidden layers to processing different covariates of data (depending on the independent variable analyzed), and other areas to mix data from different sources. Our study finds that deep learning models that distinguish ASD from TD controls focus broadly on temporal and cerebellar connections, with a particularly high focus on the right caudate nucleus and paracentral sulcus.

Local Post-hoc Explainable Methods for Adversarial Text Attacks

10.36227/techrxiv.17185568 ◽

2021 ◽

Author(s):

Yidong Chai ◽

Ruicheng Liang ◽

Hongyi Zhu ◽

Sagar Samtani ◽

Meng Wang ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Language Processing ◽

Black Box ◽

Learning Models ◽

Two Phase ◽

Sensitivity Estimation ◽

Execution Phase ◽

Adversarial Examples ◽

Post Hoc

Deep learning models have significantly advanced various natural language processing tasks. However, they are strikingly vulnerable to adversarial text attacks, even in the black-box setting where no model knowledge is accessible to hackers. Such attacks are conducted with a two-phase framework: 1) a sensitivity estimation phase to evaluate each element’s sensitivity to the target model’s prediction, and 2) a perturbation execution phase to craft the adversarial examples based on estimated element sensitivity. This study explored the connections between the local post-hoc explainable methods for deep learning and black-box adversarial text attacks and proposed a novel eXplanation-based method for crafting Adversarial Text Attacks (XATA). XATA leverages local post-hoc explainable methods (e.g., LIME or SHAP) to measure input elements’ sensitivity and adopts the word replacement perturbation strategy to craft adversarial examples. We evaluated the attack performance of the proposed XATA on three commonly used text-based datasets: IMDB Movie Review, Yelp Reviews-Polarity, and Amazon Reviews-Polarity. The proposed XATA outperformed existing baselines in various target models, including LSTM, GRU, CNN, and BERT. Moreover, we found that improved local post-hoc explainable methods (e.g., SHAP) lead to more effective adversarial attacks. These findings showed that when researchers constantly advance the explainability of deep learning models with local post-hoc methods, they also provide hackers with weapons to craft more targeted and dangerous adversarial attacks.
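
The two-phase framework described in this abstract can be sketched in a few lines. This is a minimal illustration, not the XATA implementation: it swaps in leave-one-out occlusion for the LIME or SHAP sensitivity estimate, and the black-box model (`toy_model`), its word weights, and the synonym table are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List

# Hypothetical black-box sentiment model: returns P(positive) for a token list.
WEIGHTS: Dict[str, float] = {"great": 0.3, "slow": -0.1}


def toy_model(tokens: List[str]) -> float:
    score = 0.5
    for token in tokens:
        score += WEIGHTS.get(token, 0.0)
    return max(0.0, min(1.0, score))


def estimate_sensitivity(predict: Callable[[List[str]], float],
                         tokens: List[str]) -> List[float]:
    """Phase 1: score each token by how much the prediction drops without it.

    Assumption: leave-one-out occlusion stands in for LIME or SHAP here.
    """
    base = predict(tokens)
    return [base - predict(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]


def craft_adversarial(predict: Callable[[List[str]], float],
                      tokens: List[str],
                      synonyms: Dict[str, List[str]],
                      threshold: float = 0.5) -> List[str]:
    """Phase 2: replace the most sensitive tokens until the label flips."""
    scores = estimate_sensitivity(predict, tokens)
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    adversarial = list(tokens)
    for i in order:
        if predict(adversarial) < threshold:
            break  # prediction already flipped to negative
        candidates = synonyms.get(adversarial[i], [])
        if not candidates:
            continue
        # Pick the replacement that lowers the prediction the most.
        best = min(candidates,
                   key=lambda w: predict(adversarial[:i] + [w] + adversarial[i + 1:]))
        if predict(adversarial[:i] + [best] + adversarial[i + 1:]) < predict(adversarial):
            adversarial[i] = best
    return adversarial


sentence = ["a", "great", "but", "slow", "movie"]
adversarial = craft_adversarial(toy_model, sentence, {"great": ["decent"]})
```

Only the `predict` callable touches the model, which matches the black-box setting: the attacker sees outputs but no weights or gradients. A better sensitivity estimate changes only the replacement order in phase 2.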

Explainable Deep Learning Models in Medical Image Analysis

Journal of Imaging ◽

10.3390/jimaging6060052 ◽

2020 ◽

Vol 6 (6) ◽

pp. 52 ◽

Cited By ~ 3

Author(s):

Amitojdeep Singh ◽

Sourya Sengupta ◽

Vasudevan Lakshminarayanan

Keyword(s):

Image Analysis ◽

Deep Learning ◽

Medical Image ◽

Medical Image Analysis ◽

Black Box ◽

Clinical Use ◽

Learning Models ◽

Literature Reviews ◽

Medical Diagnostic ◽

Practical Standpoint

Deep learning methods have been very effective for a variety of medical diagnostic tasks and have even outperformed human experts on some of them. However, the black-box nature of the algorithms has restricted their clinical use. Recent explainability studies aim to show the features that most influence a model's decision. Most literature reviews in this area have focused on taxonomy, ethics, and the need for explanations. A review of the current applications of explainable deep learning for different medical imaging tasks is presented here. The various approaches, the challenges for clinical deployment, and the areas requiring further research are discussed from the practical standpoint of a deep learning researcher designing a system for clinical end-users.

Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. When used to train the baseline models, Levenshtein augmentation increased performance over both non-augmented data and data augmented with conventional SMILES randomization. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the pattern recognition capabilities of the underlying network with respect to molecular motifs.
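
To make the idea of pairing reactants and products by string similarity concrete, here is a minimal sketch. It assumes the equivalent SMILES spellings of a reactant are already available (hand-written below) and simply selects the spelling with the smallest character-level Levenshtein distance to the product. The helper `pick_closest` is hypothetical, and the authors' actual pairing procedure may differ, for example by working on SMILES tokens or local sub-sequences rather than whole strings.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def pick_closest(variants: Sequence[str], product: str) -> str:
    """Return the reactant spelling most similar to the product string."""
    return min(variants, key=lambda smiles: levenshtein(smiles, product))


# Three hand-written SMILES spellings of ethanol, paired against acetaldehyde.
ethanol_variants = ["OCC", "C(O)C", "CCO"]
acetaldehyde = "CC=O"
chosen = pick_closest(ethanol_variants, acetaldehyde)
```

Here `CCO` is selected because a single inserted `=` turns it into `CC=O`, whereas the other spellings need at least two edits. Note that character-level distance treats multi-character atoms such as `Cl` as two symbols, so a tokenized variant would be more faithful for real data.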

Improving the Accuracy of Protein-Ligand Binding Affinity Prediction by Deep Learning Models: Benchmark and Model

10.26434/chemrxiv.9866912 ◽

2019 ◽

Author(s):

Mohammad Rezaei ◽

Yanjun Li ◽

Xiaolin Li ◽

Chenglong Li

Keyword(s):

Deep Learning ◽

Drug Design ◽

Binding Affinity ◽

Benchmark Dataset ◽

Rational Drug Design ◽

Learning Models ◽

Structure Based Drug Design ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Rational Drug

Introduction: The ability to discriminate among ligands binding to the same protein target in terms of their relative binding affinity lies at the heart of structure-based drug design. Any improvement in the accuracy and reliability of binding affinity prediction methods decreases the discrepancy between experimental and computational results. Objectives: The primary objectives were to find the most relevant features affecting binding affinity prediction, to minimize manual feature engineering, and to improve the reliability of binding affinity prediction using efficient deep learning models with tuned hyperparameters. Methods: The binding site of each target protein was represented as a grid box around its bound ligand. Both binary and distance-dependent occupancies were examined for how an atom affects its neighboring voxels in this grid. A combination of different features, including ANOLEA, ligand elements, and Arpeggio atom types, was used to represent the input. An efficient convolutional neural network (CNN) architecture, DeepAtom, was developed, trained, and tested on the PDBbind v2016 dataset. Additionally, an extended benchmark dataset was compiled to train and evaluate the models. Results: The best DeepAtom model showed improved accuracy in binding affinity prediction on the PDBbind core subset (Pearson’s R=0.83) and outperformed recent state-of-the-art models in this field. In addition, when the DeepAtom model was trained on our proposed benchmark dataset, it yielded a higher correlation than the baseline, which confirms the value of our model. Conclusions: The promising results for the predicted binding affinities are expected to pave the way for embedding deep learning models in virtual screening and rational drug design.
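
The difference between binary and distance-dependent voxel occupancy can be sketched as follows. This is an illustration under stated assumptions, not DeepAtom's featurization: the grid size, the 1 Å resolution, the cutoff radius, and especially the Gaussian form of the distance-dependent occupancy are choices made here for the example, and `occupancy_grid` is a hypothetical helper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[float]]]


def occupancy_grid(atoms: Sequence[Point], n: int, resolution: float,
                   mode: str = "binary", radius: float = 1.0,
                   sigma: float = 1.0) -> Grid:
    """Fill an n x n x n grid with per-voxel occupancy from a list of atoms.

    mode="binary":   1.0 if any atom lies within `radius` of the voxel center.
    mode="gaussian": exp(-d^2 / (2 sigma^2)) for the nearest atom (assumed form).
    """
    if mode not in ("binary", "gaussian"):
        raise ValueError(f"unknown mode: {mode}")
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                center = ((i + 0.5) * resolution,
                          (j + 0.5) * resolution,
                          (k + 0.5) * resolution)
                for atom in atoms:
                    d = math.dist(center, atom)
                    if mode == "binary":
                        value = 1.0 if d <= radius else 0.0
                    else:
                        value = math.exp(-d * d / (2.0 * sigma * sigma))
                    # Keep the strongest contribution among all atoms.
                    grid[i][j][k] = max(grid[i][j][k], value)
    return grid


# One atom sitting exactly at the center of voxel (0, 0, 0).
atoms = [(0.5, 0.5, 0.5)]
binary = occupancy_grid(atoms, n=4, resolution=1.0, mode="binary")
smooth = occupancy_grid(atoms, n=4, resolution=1.0, mode="gaussian")
```

The binary grid is a hard mask around each atom, while the distance-dependent grid decays smoothly, so neighboring voxels carry graded information about how close the atom is. In practice one such grid would be built per feature channel and stacked as CNN input.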
