Detecting Scene-Plausible Perceptible Backdoors in Trained DNNs without Access to the Training Set

2021 ◽  
pp. 1-43
Author(s):  
Zhen Xiang ◽  
David J. Miller ◽  
Hang Wang ◽  
George Kesidis

Backdoor data poisoning attacks add mislabeled examples to the training set, with an embedded backdoor pattern, so that the classifier learns to classify to a target class whenever the backdoor pattern is present in a test sample. Here, we address posttraining detection of scene-plausible perceptible backdoors, a type of backdoor attack that can be relatively easily fashioned, particularly against DNN image classifiers. A posttraining defender does not have access to the potentially poisoned training set, only to the trained classifier, as well as some unpoisoned examples that need not be training samples. Without the poisoned training set, the only information about a backdoor pattern is encoded in the DNN's trained weights. This detection scenario is of great import considering legacy and proprietary systems, cell phone apps, as well as training outsourcing, where the user of the classifier will not have access to the entire training set. We identify two important properties of scene-plausible perceptible backdoor patterns, spatial invariance and robustness, based on which we propose a novel detector using the maximum achievable misclassification fraction (MAMF) statistic. We detect whether the trained DNN has been backdoor-attacked and infer the source and target classes. Our detector outperforms existing detectors and, coupled with an imperceptible backdoor detector, helps achieve posttraining detection of most evasive backdoors of interest.
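
As a rough illustration of the MAMF idea, the sketch below optimizes a candidate patch to maximize the fraction of clean source-class images that a trained classifier flips to a putative target class, with random spatial placement standing in for the spatial-invariance property. The model interface, patch size, optimizer, and evaluation placement are illustrative assumptions, not the authors' exact procedure.

```python
# Hypothetical sketch of a MAMF-style statistic; not the authors' exact procedure.
import torch
import torch.nn.functional as F

def estimate_mamf(model, clean_src_images, target_class, patch_hw=(8, 8),
                  steps=200, lr=0.1):
    """Maximize the fraction of clean source-class images misclassified to
    `target_class` when a learned patch is pasted into them."""
    model.eval()
    n, c, h, w = clean_src_images.shape
    ph, pw = patch_hw
    patch = torch.rand(1, c, ph, pw, requires_grad=True)  # candidate backdoor pattern
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        # Random placement stands in for the spatial invariance of the pattern.
        top = torch.randint(0, h - ph + 1, (1,)).item()
        left = torch.randint(0, w - pw + 1, (1,)).item()
        x = clean_src_images.clone()
        x[:, :, top:top + ph, left:left + pw] = patch.clamp(0, 1)
        loss = F.cross_entropy(model(x), torch.full((n,), target_class))
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        x = clean_src_images.clone()
        x[:, :, :ph, :pw] = patch.clamp(0, 1)              # evaluate at one placement
        mamf = (model(x).argmax(1) == target_class).float().mean().item()
    return mamf  # compare against a threshold over all (source, target) class pairs
```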

Author(s):  
Shuhuan Zhao

Face recognition (FR) is a hotspot in pattern recognition and image processing because of its wide applications in real life. One of the most challenging problems in FR is single sample face recognition (SSFR). In this paper, we propose a novel algorithm based on nonnegative sparse representation, collaborative representation, and probabilistic graph estimation to address SSFR. The proposed algorithm is named Nonnegative Sparse Probabilistic Estimation (NNSPE). To extract variation information from the generic training set, we first select neighboring samples from the generic training set for each sample in the gallery set, so that the generic training set can be partitioned into reference subsets. To obtain a more meaningful reconstruction, the proposed method adopts nonnegative sparse representation to reconstruct training samples, and, according to the reconstruction coefficients, NNSPE computes a probabilistic label estimate for each sample of the generic training set. Then, for a given test sample, collaborative representation (CR) is used to acquire an adaptive variation subset. Finally, NNSPE classifies the test sample with the adaptive variation subset and the probabilistic label estimates. Experiments on the AR and PIE databases verify the effectiveness of the proposed method in terms of both recognition rate and time cost.
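
A minimal sketch of the nonnegative sparse coding and probabilistic label-estimation steps is given below, using scikit-learn's Lasso with a nonnegativity constraint as the sparse coder; the variable names, regularization strength, and normalization are assumptions rather than the paper's exact formulation.

```python
# Hypothetical sketch of the nonnegative sparse coding / label-estimation steps.
import numpy as np
from sklearn.linear_model import Lasso

def probabilistic_labels(generic_X, gallery_X, gallery_y, alpha=0.01):
    """Reconstruct each generic-set sample nonnegatively over the gallery set and
    turn the per-class coefficient energy into a probabilistic label estimate."""
    classes = np.unique(gallery_y)
    probs = np.zeros((generic_X.shape[0], classes.size))
    coder = Lasso(alpha=alpha, positive=True, fit_intercept=False, max_iter=5000)
    for i, x in enumerate(generic_X):
        coder.fit(gallery_X.T, x)            # columns of gallery_X.T are gallery faces
        coef = coder.coef_
        for k, c in enumerate(classes):
            probs[i, k] = coef[gallery_y == c].sum()   # nonnegative coefficients
        s = probs[i].sum()
        probs[i] = probs[i] / s if s > 0 else 1.0 / classes.size
    return probs                             # row i ~ P(class | generic sample i)
```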


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Hudson Fernandes Golino ◽  
Liliany Souza de Brito Amaral ◽  
Stenio Fernando Pimentel Duarte ◽  
Cristiano Mauro Assis Gomes ◽  
Telma de Jesus Soares ◽  
...  

The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that, for women, BMI, WC, and WHR form the combination that produces the best prediction, since it has the lowest deviance (87.42) and misclassification rate (.19) and the highest pseudo-R2 (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test set. For men, BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25) and misclassification rate (.16) and the highest pseudo-R2 (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the results from the classification tree analysis were compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.
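
The workflow below sketches a comparable classification-tree analysis on synthetic anthropometric data with scikit-learn; the data-generating rule, tree depth, and train/test split are purely illustrative and do not reproduce the study's dataset or results.

```python
# Illustrative classification-tree workflow on synthetic data (not the study's data).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
n = 400
bmi = rng.normal(25, 4, n)                    # body mass index
wc = rng.normal(85, 12, n)                    # waist circumference (cm)
whr = rng.normal(0.85, 0.08, n)               # waist-hip ratio
# Toy labeling rule plus noise, standing in for measured blood pressure status.
high_bp = (((bmi > 27) & (whr > 0.9)) | (rng.random(n) < 0.05)).astype(int)

X = np.column_stack([bmi, wc, whr])
X_tr, X_te, y_tr, y_te = train_test_split(X, high_bp, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
tn, fp, fn, tp = confusion_matrix(y_te, tree.predict(X_te), labels=[0, 1]).ravel()
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```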


2017 ◽  
Vol 17 (02) ◽  
pp. 1750007 ◽  
Author(s):  
Chunwei Tian ◽  
Guanglu Sun ◽  
Qi Zhang ◽  
Weibing Wang ◽  
Teng Chen ◽  
...  

Collaborative representation classification (CRC) is an important sparse method that is easy to implement and uses a linear combination of training samples to represent a test sample. The CRC method uses the offset (residual) between each class's representation result and the test sample to perform classification. However, this offset often cannot adequately express the difference between each class and the test sample. In this paper, we propose a novel representation method for image recognition to address this problem. The method not only fuses sparse representation and the CRC method to improve the accuracy of image recognition, but also introduces a novel fusion mechanism to classify images. The proposed method proceeds in the following steps. First, it produces the collaborative representation of the test sample; that is, a linear combination of all the training samples is determined to represent the test sample. Then, it obtains the sparse representation classification (SRC) of the test sample. Finally, the proposed method uses the CRC and SRC representations to obtain two kinds of scores for the test sample and fuses them to recognize the image. Face recognition experiments show that the combination of CRC and SRC yields satisfactory performance for image classification.
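
A compact sketch of the CRC/SRC score fusion is shown below: CRC is solved in closed form as a ridge-regularized least-squares problem, SRC with an L1 solver, and the two class-wise residuals are combined by a weighted sum. The regularization parameters and fusion weight are assumptions, not the paper's tuned values.

```python
# Hypothetical CRC + SRC score fusion (solvers and weights are assumptions).
import numpy as np
from sklearn.linear_model import Lasso

def crc_src_fusion(train_X, train_y, test_x, lam=0.01, alpha=0.01, w=0.5):
    """Return the class with the smallest fused CRC/SRC residual."""
    D = train_X.T                                   # dictionary: one column per training sample
    # CRC: ridge-regularized least squares, solved in closed form.
    crc_coef = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ test_x)
    # SRC: L1-regularized sparse coding.
    src_coef = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D, test_x).coef_
    classes = np.unique(train_y)
    scores = []
    for c in classes:
        m = train_y == c
        r_crc = np.linalg.norm(test_x - D[:, m] @ crc_coef[m])
        r_src = np.linalg.norm(test_x - D[:, m] @ src_coef[m])
        scores.append(w * r_crc + (1 - w) * r_src)  # fused score: smaller is better
    return classes[int(np.argmin(scores))]
```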


2018 ◽  
Vol 10 (12) ◽  
pp. 1934 ◽  
Author(s):  
Bao-Di Liu ◽  
Wen-Yang Xie ◽  
Jie Meng ◽  
Ye Li ◽  
Yanjiang Wang

In recent years, the collaborative representation-based classification (CRC) method has achieved great success in visual recognition by directly utilizing training images as dictionary bases. However, it describes a test sample with all training samples to extract shared attributes and does not consider the representation of the test sample with the training samples in a specific class to extract the class-specific attributes. For remote-sensing images, both the shared attributes and class-specific attributes are important for classification. In this paper, we propose a hybrid collaborative representation-based classification approach. The proposed method is capable of improving the performance of classifying remote-sensing images by embedding the class-specific collaborative representation into conventional collaborative representation-based classification. Moreover, we extend the proposed method to an arbitrary kernel space to exploit the nonlinear characteristics hidden in remote-sensing image features and further enhance classification performance. Extensive experiments on several benchmark remote-sensing image datasets clearly demonstrate the superior performance of our proposed algorithm over state-of-the-art approaches.
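
The sketch below illustrates the hybrid idea in the linear case: a shared collaborative representation over all training samples is combined with a class-specific representation, and their residuals are blended per class. The blending weight and regularization are assumptions, and a kernel variant would replace the inner products with a kernel matrix.

```python
# Hypothetical hybrid collaborative representation, linear case; a kernel variant
# would replace the inner products below with a kernel matrix.
import numpy as np

def hybrid_crc(train_X, train_y, test_x, lam=0.01, gamma=0.5):
    """Blend residuals from a shared (all-class) CR and class-specific CRs."""
    D = train_X.T
    classes = np.unique(train_y)
    # Shared attributes: represent the test sample with all training samples.
    shared = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ test_x)
    scores = []
    for c in classes:
        m = train_y == c
        Dc = D[:, m]
        # Class-specific attributes: represent the test sample with class c only.
        spec = np.linalg.solve(Dc.T @ Dc + lam * np.eye(Dc.shape[1]), Dc.T @ test_x)
        r_shared = np.linalg.norm(test_x - Dc @ shared[m])
        r_spec = np.linalg.norm(test_x - Dc @ spec)
        scores.append(gamma * r_shared + (1 - gamma) * r_spec)
    return classes[int(np.argmin(scores))]
```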


2021 ◽  
Vol 2082 (1) ◽  
pp. 012021
Author(s):  
Bingsen Guo

Data classification is one of the most critical issues in data mining, with a large number of real-life applications. In many practical classification problems, there are various forms of anomalies in the real dataset. For example, the training set may contain outliers, often enough to confuse the classifier and reduce its ability to learn from the data. In this paper, we propose a new approach for improving data classification based on kernel clustering. The proposed method improves classification performance by optimizing the training set. We first use an existing kernel clustering method to cluster the training set and optimize it based on the similarity between the training samples in each class and the corresponding class center. Then, a standard classifier is trained on the optimized, reliable training set in the kernel space to classify each query sample. Extensive performance analysis shows that the proposed method achieves high performance, thus improving the classifier's effectiveness.
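
A minimal sketch of the training-set optimization step is given below, with kernel-space class means standing in for the clustering step: each class keeps only the samples closest to its class center under an RBF kernel, and a standard kernel classifier is then trained on the reduced set. The kernel, keep ratio, and final classifier are assumptions.

```python
# Hypothetical training-set optimization via kernel-space class centers
# (RBF kernel, keep ratio, and final classifier are assumptions).
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def optimize_training_set(X, y, gamma=0.1, keep_ratio=0.8):
    """Keep, per class, the samples closest to that class's kernel-space center."""
    keep = []
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        K = rbf_kernel(X[idx], X[idx], gamma=gamma)
        # Squared kernel-space distance of each sample to the class mean.
        d2 = np.diag(K) - 2 * K.mean(axis=1) + K.mean()
        keep.extend(idx[np.argsort(d2)[: int(np.ceil(keep_ratio * idx.size))]])
    keep = np.sort(np.array(keep))
    return X[keep], y[keep]

# Usage sketch: train a standard kernel classifier on the optimized set.
# X_opt, y_opt = optimize_training_set(X_train, y_train)
# clf = SVC(kernel="rbf", gamma=0.1).fit(X_opt, y_opt)
```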


2020 ◽  
Vol 12 (4) ◽  
pp. 664 ◽  
Author(s):  
Binge Cui ◽  
Jiandi Cui ◽  
Yan Lu ◽  
Nannan Guo ◽  
Maoguo Gong

Hyperspectral image classification methods may not achieve good performance when only a limited number of training samples is provided. However, labeling sufficient samples of hyperspectral images to achieve adequate training is quite expensive and difficult. In this paper, we propose a novel sample pseudo-labeling method based on sparse representation (SRSPL) for hyperspectral image classification, in which sparse representation is used to select the purest samples to extend the training set. The proposed method consists of the following three steps. First, intrinsic image decomposition is used to obtain the reflectance components of hyperspectral images. Second, hyperspectral pixels are sparsely represented using an overcomplete dictionary composed of all training samples. Finally, information entropy is defined for the vectorized sparse representation, and the pixels with low information entropy are selected as pseudo-labeled samples to augment the training set. The quality of the generated pseudo-labeled samples is evaluated based on classification accuracy, i.e., overall accuracy, average accuracy, and Kappa coefficient. Experimental results on four real hyperspectral data sets demonstrate excellent classification performance using the newly added pseudo-labeled samples, which indicates that the generated samples are of high confidence.
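
The sketch below mirrors the sparse-coding and entropy-selection steps: each unlabeled pixel is coded over a dictionary of training samples, the per-class coefficient energies are normalized, and only low-entropy (high-confidence) pixels are pseudo-labeled. The solver, regularization, and entropy threshold are assumptions, and the intrinsic-image decomposition step is omitted.

```python
# Hypothetical sparse-representation pseudo-labeling (solver and threshold assumed).
import numpy as np
from sklearn.linear_model import Lasso

def pseudo_label(train_X, train_y, unlabeled_X, alpha=0.01, entropy_thresh=0.5):
    """Sparsely code each unlabeled pixel over the training dictionary and keep
    low-entropy (high-confidence) pixels as pseudo-labeled samples."""
    D = train_X.T                                  # dictionary of all training samples
    classes = np.unique(train_y)
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    keep, labels = [], []
    for i, x in enumerate(unlabeled_X):
        coef = coder.fit(D, x).coef_
        energy = np.array([np.abs(coef[train_y == c]).sum() for c in classes])
        total = energy.sum()
        p = energy / total if total > 0 else np.full(classes.size, 1 / classes.size)
        entropy = -(p * np.log(p + 1e-12)).sum()   # information entropy of the code
        if entropy < entropy_thresh:
            keep.append(i)
            labels.append(classes[int(np.argmax(p))])
    return np.array(keep), np.array(labels)        # indices and pseudo-labels
```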


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Jiajia Liu ◽  
Bailin Li ◽  
Ying Xiong ◽  
Biao He ◽  
Li Li

The detection of fastener defects is an important task for ensuring the safety of railway traffic. Earlier automatic inspection systems based on computer vision can effectively detect completely missing fasteners, but they have a weaker ability to recognize partially worn ones. In this paper, we propose a method for detecting both partly worn and completely missing fasteners. The proposed algorithm exploits the first and second symmetrical samples of the original test fastener image and integrates them for improved representation-based fastener recognition. This scheme is simple and computationally efficient. The underlying rationales of the scheme are as follows. First, the new virtual symmetrical images reflect possible appearances of the fastener, and integrating the two judgments made on the symmetrical samples can somewhat overcome the misclassification problem. Second, the improved sparse representation method discards the training samples that are "far" from the test sample and uses a small number of samples that are "near" to the test sample to represent it and perform classification, which reduces the side effects of misidentifying the original fastener image. The experimental results show that the proposed method outperforms state-of-the-art fastener recognition methods.
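
A simplified sketch of the two main ideas follows: two symmetrical virtual images are generated from the halves of the test image, each is represented by only its nearest training samples, and the class residuals of the two judgments are summed. The neighborhood size, solver, and fusion rule are assumptions; images are assumed to have even width so the virtual samples match the training dimension.

```python
# Simplified sketch: symmetrical virtual samples plus nearest-sample representation.
# Neighborhood size, regularization, and fusion rule are assumptions.
import numpy as np

def symmetrical_samples(img):
    """Two virtual images built from the left and right halves (even width assumed)."""
    w = img.shape[1]
    left, right = img[:, : w // 2], img[:, w // 2 :]
    sym1 = np.hstack([left, left[:, ::-1]])       # left half + its mirror
    sym2 = np.hstack([right[:, ::-1], right])     # mirror of right half + right half
    return sym1, sym2

def near_sample_residuals(train_X, train_y, x, k=30, lam=0.01):
    """Represent x with its k nearest training samples; return class-wise residuals."""
    near = np.argsort(np.linalg.norm(train_X - x, axis=1))[:k]
    D, yk = train_X[near].T, train_y[near]
    coef = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)
    return np.array([np.linalg.norm(x - D[:, yk == c] @ coef[yk == c])
                     if np.any(yk == c) else np.inf
                     for c in np.unique(train_y)])

def classify_fastener(train_X, train_y, test_img):
    s1, s2 = symmetrical_samples(test_img)
    scores = sum(near_sample_residuals(train_X, train_y, s.ravel()) for s in (s1, s2))
    return np.unique(train_y)[int(np.argmin(scores))]
```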


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Minna Qiu ◽  
Jian Zhang ◽  
Jiayan Yang ◽  
Liying Ye

Face recognition has become a very active field of biometrics. Different pictures of the same face may include various changes in expression, pose, and illumination. However, a face recognition system usually suffers from the problem that insufficient training samples cannot convey these possible changes effectively. The main reason is that a system has only limited storage space and limited time to capture training samples. Much of the previous literature ignored the problem of insufficient training samples. In this paper, we overcome the problem of insufficient training sample size by fusing two kinds of virtual samples with the original samples to perform small-sample face recognition. The two kinds of virtual samples used are mirror faces and symmetrical faces. First, we transform the original face image to obtain mirror faces and symmetrical faces. Second, we fuse these two kinds of virtual samples to obtain the matching scores between the test sample and each class. Finally, we integrate the matching scores to get the final classification results. We compare the proposed method with single-virtual-sample augmentation methods and the original representation-based classification. Experiments on various face databases show that the proposed scheme achieves the best accuracy among the representation-based classification methods.
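
The sketch below illustrates the fusion scheme: a mirror face and a symmetrical face are generated from the test image, a collaborative-representation matching score is computed for each view, and the scores are combined by a weighted sum. The weights, regularization, and the use of a single symmetrical face are simplifying assumptions.

```python
# Hypothetical fusion of mirror and symmetrical virtual samples (weights assumed).
import numpy as np

def cr_class_residuals(train_X, train_y, x, lam=0.01):
    """Collaborative-representation residual of x against each class."""
    D = train_X.T
    coef = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x)
    return np.array([np.linalg.norm(x - D[:, train_y == c] @ coef[train_y == c])
                     for c in np.unique(train_y)])

def classify_with_virtual_samples(train_X, train_y, test_img, w=(0.5, 0.3, 0.2)):
    wd = test_img.shape[1]                     # even image width assumed
    mirror = test_img[:, ::-1]                 # mirror face: full horizontal flip
    left = test_img[:, : wd // 2]
    sym = np.hstack([left, left[:, ::-1]])     # symmetrical face from the left half
    views = (test_img, mirror, sym)
    scores = sum(wi * cr_class_residuals(train_X, train_y, v.ravel())
                 for wi, v in zip(w, views))   # fuse the matching scores
    return np.unique(train_y)[int(np.argmin(scores))]
```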


2012 ◽  
Vol 157-158 ◽  
pp. 1399-1403
Author(s):  
Jian Wu Long ◽  
Xuan Jing Shen ◽  
Hai Peng Chen

In this work, principal component analysis (PCA) was adopted to construct a background model, and moving objects were detected by a background subtraction method. First, the matrix of training samples was constructed by converting the video sequence to vectors. Then the covariance matrix C of the training set was calculated, and the eigenvalues and eigenvectors of C were acquired through SVD decomposition. Next, the eigenvalues were sorted and the background model was reconstructed using the eigenvectors with the highest cumulative contribution. Finally, comparison experiments were performed against the detection results of a GMM approach. Experimental results show that the proposed method can establish more accurate background models and achieve more effective object detection.
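
A minimal numpy sketch of the eigenbackground procedure follows: frames are vectorized, an SVD of the mean-removed data yields the principal components, the components covering a chosen cumulative contribution reconstruct the background for a new frame, and thresholded subtraction gives the foreground mask. The energy fraction and threshold are assumptions.

```python
# Illustrative eigenbackground model via SVD (energy fraction and threshold assumed).
import numpy as np

def fit_background_model(frames, energy=0.95):
    """frames: (n_frames, h, w) grayscale video. Returns mean and top components."""
    n, h, w = frames.shape
    X = frames.reshape(n, -1).astype(float)              # one row vector per frame
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    contrib = np.cumsum(S ** 2) / np.sum(S ** 2)         # cumulative contribution
    k = int(np.searchsorted(contrib, energy)) + 1
    return mean, Vt[:k]

def detect_moving(frame, mean, components, thresh=30):
    """Foreground mask: reconstruct the background for this frame and subtract."""
    x = frame.astype(float).ravel() - mean
    background = mean + components.T @ (components @ x)
    diff = np.abs(frame.astype(float).ravel() - background)
    return (diff > thresh).reshape(frame.shape)
```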


2021 ◽  
Vol 38 (1) ◽  
pp. 61-71
Author(s):  
Xianrong Zhang ◽  
Gang Chen

Facing the image detection of dense small rigid targets, the main bottleneck of convolutional neural network (CNN)-based algorithms is the lack of massive correctly labeled training images. To make up for this lack, this paper proposes an automatic end-to-end synthesis algorithm to generate a huge number of labeled training samples. The synthetic image set was adopted to train the network progressively and iteratively, realizing the detection of dense small rigid targets based on the CNN and synthetic images. Specifically, standard images of the target classes and typical background images were imported, and the color, brightness, position, orientation, and perspective of real images were simulated by image processing algorithms, creating a sufficiently large initial training set with correctly labeled images. Then, the network was preliminarily trained on this set. After that, a few real images were compiled into the test set. Taking the missed and incorrectly detected target images as inputs, the initial training set was progressively expanded and then used to iteratively train the network. The results show that our method can automatically generate a training set that fully substitutes for a manually labeled dataset in network training, eliminating the dependence on massive manually labeled images. The research opens a new way to implement tasks similar to the detection of dense small rigid targets, and provides a good reference for solving similar problems through deep learning (DL).
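
The sketch below shows one way such a labeled sample could be synthesized with PIL: a standard target image is randomly rotated, its brightness jittered, and it is pasted at a random position on a background image, with the paste location providing the bounding-box label. The transform ranges and the omission of perspective warping are simplifying assumptions; the background is assumed larger than the rotated target.

```python
# Hypothetical labeled-sample synthesizer with PIL (transform ranges are assumptions;
# the background is assumed larger than the rotated target).
import numpy as np
from PIL import Image, ImageEnhance

def synthesize(target_img, background_img, rng):
    """Paste one target onto a background with random brightness, rotation and
    position; return the composite image and the bounding-box label."""
    target = ImageEnhance.Brightness(target_img.convert("RGB")).enhance(rng.uniform(0.7, 1.3))
    target = target.convert("RGBA").rotate(rng.uniform(0, 360), expand=True)
    bg = background_img.convert("RGB").copy()
    x = int(rng.integers(0, bg.width - target.width))
    y = int(rng.integers(0, bg.height - target.height))
    bg.paste(target, (x, y), target)                     # alpha channel as paste mask
    bbox = (x, y, x + target.width, y + target.height)
    return bg, bbox

# Usage sketch:
# rng = np.random.default_rng(0)
# image, box = synthesize(Image.open("target.png"), Image.open("scene.jpg"), rng)
```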

