ML-LOO: Detecting Adversarial Examples with Feature Attribution

2020 ◽  
Vol 34 (04) ◽  
pp. 6639-6647 ◽  
Author(s):  
Puyudi Yang ◽  
Jianbo Chen ◽  
Cho-Jui Hsieh ◽  
Jane-Ling Wang ◽  
Michael Jordan

Deep neural networks obtain state-of-the-art performance on a series of tasks. However, they are easily fooled by adding a small adversarial perturbation to the input. The perturbation is often imperceptible to humans on image data. We observe a significant difference in feature attributions between adversarially crafted examples and original examples. Based on this observation, we introduce a new framework to detect adversarial examples by thresholding a scale estimate of feature attribution scores. Furthermore, we extend our method to include multi-layer feature attributions in order to tackle attacks with mixed confidence levels. As demonstrated in extensive experiments, our method achieves superior performance in distinguishing adversarial examples generated by popular attack methods on a variety of real data sets compared to state-of-the-art detection methods. In particular, our method can detect adversarial examples of mixed confidence levels and transfers between different attack methods. We also show that our method achieves competitive performance even when the attacker has complete access to the detector.
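A minimal sketch of the detection idea described above, under stated assumptions: `model` is a hypothetical callable returning a scalar score for a flat feature vector, attributions are computed leave-one-out style, and the scale statistic is an interquartile range. This is an illustration of the thresholding scheme, not the authors' released code.

```python
import numpy as np

def loo_attributions(model, x, baseline=0.0):
    """Attribution of feature i = f(x) - f(x with feature i set to a baseline)."""
    base_score = model(x)
    attributions = np.empty(x.shape[0])
    for i in range(x.shape[0]):
        x_masked = x.copy()
        x_masked[i] = baseline  # mask one feature at a time
        attributions[i] = base_score - model(x_masked)
    return attributions

def dispersion(attributions):
    """IQR of the attribution map, used as the scale estimate."""
    q75, q25 = np.percentile(attributions, [75, 25])
    return q75 - q25

def is_adversarial(model, x, threshold):
    """Flag x if its attribution dispersion exceeds a threshold calibrated on clean data."""
    return dispersion(loo_attributions(model, x)) > threshold
```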

2021 ◽  
Vol 17 (3) ◽  
pp. e1008256
Author(s):  
Shuonan Chen ◽  
Jackson Loper ◽  
Xiaoyin Chen ◽  
Alex Vaughan ◽  
Anthony M. Zador ◽  
...  

Modern spatial transcriptomics methods can target thousands of different types of RNA transcripts in a single slice of tissue. Many biological applications demand a high spatial density of transcripts relative to the imaging resolution, leading to partial mixing of transcript rolonies in many voxels; unfortunately, current analysis methods do not perform robustly in this highly mixed setting. Here we develop a new analysis approach, BARcode DEmixing through Non-negative Spatial Regression (BarDensr): we start with a generative model of the physical process that leads to the observed image data and then apply sparse convex optimization methods to estimate the underlying (demixed) rolony densities. We apply BarDensr to simulated and real data and find that it achieves state-of-the-art signal recovery, particularly in densely labeled regions or data with low spatial resolution. Finally, BarDensr is fast and parallelizable. We provide open-source code as well as an implementation for the ‘NeuroCAAS’ cloud platform.
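A toy illustration of the non-negative regression at the core of this approach (a sketch only; the released BarDensr also models point-spread functions and sparsity, which are omitted here): each voxel's multi-round, multi-channel intensity vector is treated as a non-negative mixture of known barcode signatures, and per-barcode densities are recovered with non-negative least squares.

```python
import numpy as np
from scipy.optimize import nnls

def demix_voxels(observations, barcodes):
    """
    observations: (n_voxels, n_frames) intensity readouts per voxel.
    barcodes:     (n_barcodes, n_frames) expected signature of each barcode.
    Returns (n_voxels, n_barcodes) non-negative density estimates.
    """
    densities = np.zeros((observations.shape[0], barcodes.shape[0]))
    for v in range(observations.shape[0]):
        # Solve min ||B^T d - y||_2 subject to d >= 0 for each voxel.
        densities[v], _ = nnls(barcodes.T, observations[v])
    return densities
```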


2021 ◽  
Vol 25 (1) ◽  
pp. 27-50
Author(s):  
Tsung-Lin Li ◽  
Chen-An Tsai

Time series forecasting is a challenging task of interest in many disciplines. A variety of techniques have been developed to deal with the problem by combining different disciplines. Although various studies have demonstrated the success of hybrid models, none of them backed the comparisons with a solid statistical test. This paper proposes a new stepwise model determination method for artificial neural networks (ANN) and a novel hybrid model combining the autoregressive integrated moving average (ARIMA) model, ANN and the discrete wavelet transformation (DWT). Simulation studies are conducted to compare the performance of different models, including ARIMA, ANN, ARIMA-ANN, DWT-ARIMA-ANN and the proposed method, ARIMA-DWT-ANN. Two real data sets, Lynx data and cabbage data, are also used to demonstrate the applications. Our proposed method, ARIMA-DWT-ANN, outperforms the other methods on both the simulated data sets and the Lynx data, while ANN shows better performance on the cabbage data. We conducted a two-way ANOVA test to compare the performances of the methods; the results showed a significant difference between the methods. In brief, we suggest trying both ANN and ARIMA-DWT-ANN, given their robustness and high accuracy. Since the performance of hybrid models may vary across data sets depending on whether the data are more ARIMA-like or ANN-like in nature, all of these models should be considered when encountering a new data set in order to reach optimal performance.
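A hedged sketch of one plausible ARIMA-DWT-ANN pipeline; the paper's exact architecture, lag structure and thresholding rule are not reproduced here. ARIMA captures the linear component, its residuals are denoised with a wavelet transform, and an ANN models the remaining nonlinear structure from lagged residuals.

```python
import numpy as np
import pywt
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

def arima_dwt_ann_forecast(y, order=(1, 1, 1), wavelet="db4", lags=4):
    # 1. Linear component via ARIMA.
    arima = ARIMA(y, order=order).fit()
    residuals = arima.resid

    # 2. Wavelet denoising of the residuals (soft-threshold the detail coefficients).
    coeffs = pywt.wavedec(residuals, wavelet)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # robust noise estimate
    thresh = sigma * np.sqrt(2 * np.log(len(residuals)))
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    denoised = pywt.waverec(coeffs, wavelet)[: len(residuals)]

    # 3. Nonlinear component: ANN on lagged denoised residuals.
    X = np.column_stack([denoised[i: len(denoised) - lags + i] for i in range(lags)])
    target = denoised[lags:]
    ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000).fit(X, target)

    # 4. Combine one-step-ahead forecasts from both components.
    linear_part = arima.forecast(1)[0]
    nonlinear_part = ann.predict(denoised[-lags:].reshape(1, -1))[0]
    return linear_part + nonlinear_part
```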


2021 ◽  
Author(s):  
Enrico Gaffo ◽  
Alessia Buratin ◽  
Anna Dal Molin ◽  
Stefania Bortoluzzi

Current methods for identifying circular RNAs (circRNAs) suffer from low discovery rates and inconsistent performance across diverse data sets. The applied detection algorithm can therefore bias the findings of high-throughput studies by missing relevant circRNAs. Here, we show that our bioinformatics tool CirComPara2 (https://github.com/egaffo/CirComPara2), by combining multiple circRNA detection methods, consistently achieves high recall rates without loss of precision in simulated and various real data sets.
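A minimal sketch of the method-combination idea; CirComPara2 itself wraps several detection tools, and only the consensus step is illustrated here. Backsplice-junction calls from multiple detectors are pooled, and circRNAs reported by at least `min_methods` of them are kept.

```python
from collections import Counter

def consensus_circrnas(calls_per_method, min_methods=2):
    """
    calls_per_method: list of sets of circRNA coordinates, e.g.
        [{"chr1:100-500:+", ...}, {"chr1:100-500:+", ...}, ...]
    Returns the set of circRNAs supported by >= min_methods detectors.
    """
    support = Counter(c for calls in calls_per_method for c in set(calls))
    return {circ for circ, n in support.items() if n >= min_methods}
```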


2019 ◽  
Vol 9 (18) ◽  
pp. 3801 ◽  
Author(s):  
Hyuk-Yoon Kwon

In this paper, we propose a method to construct a lightweight key-value store based on native Windows features. The main idea is to provide a thin wrapper for the key-value store on top of a built-in Windows storage facility, the Windows registry. First, we define a mapping of the components of the key-value store onto the components of the Windows registry. Then, we present a hash-based multi-level registry index that distributes the key-value data evenly and accesses them efficiently. Third, we implement the basic operations of the key-value store (i.e., Get, Put, and Delete) by manipulating the Windows registry through the native Windows APIs. We call the proposed key-value store WR-Store. Finally, we propose an efficient ETL (Extract-Transform-Load) method to migrate data stored in WR-Store into any other environment that supports existing key-value stores. Because the performance of the Windows registry has not been studied much, we perform an empirical study to understand the characteristics of WR-Store and then tune its performance to find the best parameter setting. Through extensive experiments using synthetic and real data sets, we show that the performance of WR-Store is comparable to or even better than that of state-of-the-art systems (i.e., RocksDB, BerkeleyDB, and LevelDB). In particular, we show the scalability of WR-Store: it becomes much more efficient than the other key-value stores as the size of the data set increases. In addition, we show that the performance of WR-Store is maintained even under intensive registry workloads in which 1000 processes actively accessing the registry run concurrently.
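A hedged Python sketch of the WR-Store idea (the paper's implementation calls the native Windows APIs directly; the registry path and fan-out below are illustrative assumptions). Keys are hashed into a multi-level registry subkey path so that data spreads evenly, and Get/Put/Delete map onto registry value operations. Runs on Windows only.

```python
import hashlib
import winreg

ROOT = r"Software\WRStoreDemo"  # hypothetical demo location, not the paper's

def _index_path(key, levels=2):
    """Hash the key into a multi-level subkey path, e.g. Software\WRStoreDemo\a\3.
    One hex character per level gives a fan-out of 16 subkeys per level."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return ROOT + "\\" + "\\".join(digest[:levels])

def put(key, value):
    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, _index_path(key)) as h:
        winreg.SetValueEx(h, key, 0, winreg.REG_SZ, value)

def get(key):
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, _index_path(key)) as h:
        value, _type = winreg.QueryValueEx(h, key)
        return value

def delete(key):
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, _index_path(key), 0,
                        winreg.KEY_SET_VALUE) as h:
        winreg.DeleteValue(h, key)
```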


Author(s):  
Bahareh Khozaei ◽  
Mahdi Eftekhari

In this paper, two novel approaches for unsupervised feature selection are proposed based on spectral clustering. In the first proposed method, spectral clustering is applied to the features, and the cluster centers, together with their nearest neighbors, are selected. These features have minimal similarity (redundancy) among themselves, since they belong to different clusters. Next, the samples are clustered with spectral clustering, and the samples of each cluster are assigned a specific pseudo-label. The information gain of each feature is then computed with respect to these pseudo-labels, which secures maximum relevancy. Finally, the intersection of the features selected in the two previous steps is taken, simultaneously guaranteeing maximum relevancy and minimum redundancy. Our second approach is very similar to the first; its only, but significant, difference is that it selects one feature from each cluster and sorts all features by relevancy. The selected features are appended to a sorted list and excluded from the next iteration, and the algorithm continues with the remaining features until all features have been appended to the sorted list. Both of our proposed methods are compared with state-of-the-art methods, and the obtained results confirm the effectiveness of our approaches, especially the second one.
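A rough sketch of the first scheme under stated assumptions; the paper's exact similarity graph, neighbor rule and information-gain measure are not reproduced, and mutual information is used here as a stand-in for information gain. Step 1 clusters the features and keeps each cluster's most central feature (low redundancy), step 2 pseudo-labels the samples and scores feature relevancy, and step 3 intersects the two sets.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.feature_selection import mutual_info_classif

def select_features(X, n_feature_clusters=10, n_sample_clusters=5, top_k=10):
    # Step 1: cluster the features (rows of X.T); keep the feature nearest each cluster mean.
    f_labels = SpectralClustering(n_clusters=n_feature_clusters,
                                  affinity="nearest_neighbors").fit_predict(X.T)
    low_redundancy = set()
    for c in range(n_feature_clusters):
        members = np.where(f_labels == c)[0]
        center = X[:, members].mean(axis=1)
        low_redundancy.add(members[np.argmin(
            np.linalg.norm(X[:, members] - center[:, None], axis=0))])

    # Step 2: pseudo-label the samples, then rank features by mutual information.
    pseudo = SpectralClustering(n_clusters=n_sample_clusters,
                                affinity="nearest_neighbors").fit_predict(X)
    relevancy = mutual_info_classif(X, pseudo)
    high_relevancy = set(np.argsort(relevancy)[::-1][:top_k])

    # Step 3: keep features that are both non-redundant and relevant.
    return sorted(low_redundancy & high_relevancy)
```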


2015 ◽  
Vol 24 (04) ◽  
pp. 1540016 ◽  
Author(s):  
Muhammad Hussain ◽  
Sahar Qasem ◽  
George Bebis ◽  
Ghulam Muhammad ◽  
Hatim Aboalsamh ◽  
...  

As digital image processing techniques mature, many tools can forge an image easily without leaving visible traces, raising the problem of authenticating digital images. Based on the assumption that forgery alters the texture micro-patterns in a digital image and that texture descriptors can model this change, we employed two state-of-the-art local texture descriptors, the multi-scale Weber's law descriptor (multi-WLD) and the multi-scale local binary pattern (multi-LBP), for splicing and copy-move forgery detection. Because tamper traces are not visible to the naked eye, the chrominance components of an image, which encode these traces, were used for modeling them with the texture descriptors. To reduce the dimension of the feature space and get rid of redundant features, we employed a locally learning based (LLB) algorithm. For identifying an image as authentic or tampered, a support vector machine (SVM) was used. This paper presents a thorough investigation validating this forgery detection method. The experiments were conducted on three benchmark image data sets, namely, CASIA v1.0, CASIA v2.0, and Columbia color. The experimental results showed that the accuracy of the multi-WLD-based method was 94.19% on CASIA v1.0, 96.52% on CASIA v2.0, and 94.17% on the Columbia data set. It is not only significantly better than the multi-LBP-based method but also outperforms other state-of-the-art forgery detection methods.
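A simplified sketch of the multi-LBP branch of this pipeline; the WLD features and the LLB feature-selection step are omitted, and the scale and bin choices below are assumptions. Multi-scale LBP histograms are computed on the chrominance channels, concatenated, and classified with an SVM.

```python
import numpy as np
from skimage.color import rgb2ycbcr
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def multi_lbp_features(rgb_image, radii=(1, 2, 3)):
    ycbcr = rgb2ycbcr(rgb_image)
    features = []
    for channel in (ycbcr[..., 1], ycbcr[..., 2]):     # Cb and Cr only
        for r in radii:                                # multiple scales
            lbp = local_binary_pattern(channel, P=8 * r, R=r, method="uniform")
            # "uniform" LBP with P points yields P + 2 distinct codes.
            hist, _ = np.histogram(lbp, bins=np.arange(8 * r + 3), density=True)
            features.append(hist)
    return np.concatenate(features)

# Hypothetical usage with `images` and binary `labels` (0 = authentic, 1 = tampered):
# X = np.stack([multi_lbp_features(img) for img in images])
# clf = SVC(kernel="rbf").fit(X, labels)
```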


2010 ◽  
Vol 66 (7) ◽  
pp. 783-788 ◽  
Author(s):  
Pavol Skubák ◽  
Willem-Jan Waterreus ◽  
Navraj S. Pannu

Density modification is a standard technique in macromolecular crystallography that can significantly improve an initial electron-density map. To obtain optimal results, the initial and density-modified maps are combined. Current methods assume that these two maps are independent and propagate the initial map information and its accuracy indirectly through previously determined coefficients. A multivariate equation has been derived that no longer assumes independence between the initial and density-modified maps, considers the observed diffraction data directly and refines the errors that can occur in a single-wavelength anomalous diffraction experiment. The equation has been implemented and tested on over 100 real data sets. The results are dramatic: the method provides significantly improved maps over the current state of the art and leads to many more structures being built automatically.


2015 ◽  
Vol 24 (03) ◽  
pp. 1550003 ◽  
Author(s):  
Armin Daneshpazhouh ◽  
Ashkan Sami

The task of semi-supervised outlier detection is to find the instances that deviate markedly from the rest of the data, using some labeled examples. This issue is especially important in applications such as fraud detection and intrusion detection. Most existing techniques are unsupervised. Semi-supervised approaches, on the other hand, use both negative and positive instances to detect outliers. However, in many real-world applications, very few positive labeled examples are available. This paper proposes an innovative approach to address this problem. The proposed method works as follows. First, some reliable negative instances are extracted by a kNN-based algorithm. Afterwards, fuzzy clustering using both negative and positive examples is utilized to detect outliers. Experimental results on real data sets demonstrate that the proposed approach outperforms previous unsupervised state-of-the-art methods in detecting outliers.
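A loose sketch of the two-stage idea under stated assumptions; the paper's exact kNN rule and fuzzy objective may differ. Stage 1 marks as reliable negatives the unlabeled points farthest, on average, from the few known outliers; stage 2 runs a small fuzzy c-means over both groups and flags points whose membership leans toward the outlier cluster.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))        # random initial memberships
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted cluster centers
        d = pairwise_distances(X, centers) + 1e-12
        U = 1.0 / d ** (2 / (m - 1))                  # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

def detect_outliers(X_unlabeled, X_positive, k=5, n_negatives=50):
    # Stage 1: reliable negatives = unlabeled points with the largest mean
    # distance to their k nearest known outliers.
    d = np.sort(pairwise_distances(X_unlabeled, X_positive), axis=1)[:, :k]
    negatives = X_unlabeled[np.argsort(d.mean(axis=1))[-n_negatives:]]

    # Stage 2: fuzzy clustering over negatives + positives; the cluster whose
    # center is nearest the positives is treated as the outlier cluster.
    data = np.vstack([negatives, X_positive])
    _, centers = fuzzy_cmeans(data)
    outlier_cluster = np.argmin(pairwise_distances(centers, X_positive).mean(axis=1))
    U_all = 1.0 / (pairwise_distances(X_unlabeled, centers) + 1e-12) ** 2
    U_all /= U_all.sum(axis=1, keepdims=True)
    return U_all[:, outlier_cluster] > 0.5            # boolean outlier flags
```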


2013 ◽  
Vol 11 (02) ◽  
pp. 1250014 ◽  
Author(s):  
MARÍA M. ABAD-GRAU ◽  
NURIA MEDINA-MEDINA ◽  
SERAFÍN MORAL ◽  
ROSANA MONTES-SOLDADO ◽  
SERGIO TORRES-SÁNCHEZ ◽  
...  

It is already known that the power of multimarker transmission/disequilibrium tests may improve with the number of markers, as some associations may require several markers to be captured. However, a mechanism such as haplotype grouping must be used to avoid incremental complexity with the number of markers. 2G, a state-of-the-art transmission/disequilibrium test, implements this mechanism to its maximum extent by grouping haplotypes into only two groups, high- and low-risk haplotypes, so that the test has only one degree of freedom regardless of the number of markers. The test checks whether those haplotypes more often transmitted from parents to offspring are truly high-risk haplotypes. In this paper we use haplotype similarity as prior knowledge to classify haplotypes as high or low risk, starting with those haplotypes on which the prior will have the lowest impact, i.e., those with the largest differences between transmission and non-transmission counts. If these counts are very different, the prior knowledge has little effect and haplotypes are classified as low or high risk just as 2G does. We show a substantial gain in power achieved by this approach on both simulated and real data sets.
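An illustrative sketch only; the actual extension of 2G uses a principled statistical prior, whereas the `min_gap` cutoff and Hamming similarity below are hypothetical simplifications. Haplotypes are processed from the largest to the smallest gap between transmission (T) and non-transmission (NT) counts; clear-cut ones are classified by their counts alone, while ambiguous ones borrow the label of the most similar already-classified haplotype.

```python
def hamming_similarity(h1, h2):
    """Fraction of matching alleles between two equal-length haplotype strings."""
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)

def classify_haplotypes(counts, min_gap=5):
    """counts: dict mapping haplotype string -> (transmitted, not_transmitted)."""
    ordered = sorted(counts, key=lambda h: abs(counts[h][0] - counts[h][1]),
                     reverse=True)
    labels = {}
    for h in ordered:
        t, nt = counts[h]
        if abs(t - nt) >= min_gap or not labels:
            labels[h] = "high" if t > nt else "low"   # counts dominate
        else:                                         # similarity prior kicks in
            nearest = max(labels, key=lambda g: hamming_similarity(h, g))
            labels[h] = labels[nearest]
    return labels
```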


2020 ◽  
Vol 34 (05) ◽  
pp. 8384-8391
Author(s):  
Hui Liu ◽  
Yongzheng Zhang ◽  
Yipeng Wang ◽  
Zheng Lin ◽  
Yige Chen

Text classification is a basic task in natural language processing, but small character perturbations in words can greatly decrease the effectiveness of text classification models; this is known as a character-level adversarial example attack. There are two main challenges in defending against character-level adversarial examples: out-of-vocabulary words in the word embedding model and the distribution difference between training and inference. Both challenges make character-level adversarial examples difficult to defend against. In this paper, we propose a framework that jointly uses character embedding and adversarial stability training to overcome these two challenges. Our experimental results on five text classification data sets show that models based on our framework can effectively defend against character-level adversarial examples: our models can defend against 93.19% of gradient-based adversarial examples and 94.83% of natural adversarial examples, outperforming state-of-the-art defense models.
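A schematic sketch of joint character embedding plus adversarial stability training; the model architecture, character encoding and perturbation generator are placeholders rather than the paper's. The loss combines the clean-example loss, the perturbed-example loss, and a stability term tying the two output distributions together.

```python
import torch
import torch.nn.functional as F

def stability_training_step(model, chars, chars_perturbed, labels,
                            optimizer, alpha=1.0, beta=1.0):
    """
    chars / chars_perturbed: character-index tensors for clean and
    character-perturbed versions of the same batch (hypothetical encoding);
    `model` is assumed to embed characters internally and return logits.
    """
    optimizer.zero_grad()
    logits_clean = model(chars)
    logits_adv = model(chars_perturbed)
    loss = (F.cross_entropy(logits_clean, labels)                  # clean loss
            + alpha * F.cross_entropy(logits_adv, labels)          # perturbed loss
            + beta * F.kl_div(F.log_softmax(logits_adv, dim=-1),   # stability term
                              F.softmax(logits_clean, dim=-1),
                              reduction="batchmean"))
    loss.backward()
    optimizer.step()
    return loss.item()
```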

