VarDefense: Variance-Based Defense against Poison Attack

Wireless Communications and Mobile Computing ◽

10.1155/2021/1974822 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Mingyuan Fan ◽

Xue Du ◽

Ximeng Liu ◽

Wenzhong Guo

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Training Dataset ◽

Effective Metric ◽

Core Role

The emergence of poison attacks poses a serious risk to deep neural networks (DNNs). Specifically, an adversary can poison the training dataset to train a backdoor model, which behaves normally on clean data but produces targeted misclassification on any input carrying the crafted trigger. However, previous defense methods purify the backdoor model at the cost of degraded performance. In this paper, to relieve this problem, a novel defense method, VarDefense, is proposed, which combines an effective metric, i.e., variance, with a purifying strategy. In detail, variance is used to identify the bad neurons that play a core role in the poison attack, and these neurons are then purified. Moreover, we find that the bad neurons are generally located in the later layers of the backdoor model because the earlier layers only extract general features. Based on this, we design a purifying strategy in which only the later layers of the backdoor model are purified; in this way, the degradation of performance is greatly reduced compared to previous defense methods. Extensive experiments show that VarDefense significantly outperforms state-of-the-art defense methods.
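The idea of ranking neurons by variance and purifying only a late layer can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the decision rule, so the median-based outlier test, the threshold `k`, the zeroing of outgoing weights, and all function names below are assumptions made for illustration.

```python
import numpy as np

def neuron_variances(activations):
    """Per-neuron variance over a batch; activations has shape (n_samples, n_neurons)."""
    return np.var(activations, axis=0)

def flag_outlier_neurons(variances, k=3.0):
    """Flag neurons whose variance lies more than k robust deviations from the median.

    Assumption: the paper's actual criterion is not stated in the abstract.
    """
    median = np.median(variances)
    mad = np.median(np.abs(variances - median)) + 1e-12  # avoid division by zero
    return np.abs(variances - median) / mad > k

def purify_layer(weights, flagged):
    """Zero the outgoing weights of flagged neurons; weights has shape (n_neurons, n_out)."""
    purified = weights.copy()
    purified[flagged, :] = 0.0
    return purified

# Toy data standing in for the activations of one late layer.
rng = np.random.default_rng(0)
acts = rng.normal(0.0, 1.0, size=(512, 16))
acts[:, 5] *= 10.0  # hypothetical neuron with abnormal variance

flags = flag_outlier_neurons(neuron_variances(acts))
w = rng.normal(size=(16, 4))
w_clean = purify_layer(w, flags)
```

Earlier layers are left untouched in this sketch, mirroring the abstract's observation that they only extract general features.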

Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽

10.3390/a14020039 ◽

2021 ◽

Vol 14 (2) ◽

pp. 39

Author(s):

Carlos Lassance ◽

Vincent Gripon ◽

Antonio Ortega

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Objective Function ◽

Learning Process ◽

Deep Neural Networks ◽

State Of The Art ◽

The Core ◽

Learning Tasks ◽

Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
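A similarity graph over a batch of intermediate representations can be sketched as follows. The abstract does not specify the similarity measure or the graph construction, so the cosine similarity, the k-nearest-neighbor sparsification, the squared-difference distance between graphs, and the function names are all assumptions for illustration.

```python
import numpy as np

def latent_geometry_graph(features, k=3):
    """Build a symmetric cosine-similarity k-nearest-neighbor graph.

    features: array of shape (batch, dim), one row per input in the batch.
    Returns a (batch, batch) weighted adjacency matrix with a zero diagonal.
    Requires k <= batch - 1.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T

    # Exclude self-similarity when selecting neighbors.
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)
    idx = np.argsort(-masked, axis=1)[:, :k]

    adj = np.zeros_like(sim)
    rows = np.arange(sim.shape[0])[:, None]
    adj[rows, idx] = sim[rows, idx]

    # Keep an edge if either endpoint selected it.
    return np.where(np.abs(adj) >= np.abs(adj.T), adj, adj.T)

def geometry_distance(graph_a, graph_b):
    """Sum of squared differences between two adjacency matrices."""
    return float(np.sum((graph_a - graph_b) ** 2))

rng = np.random.default_rng(0)
teacher_feats = rng.normal(size=(8, 5))
student_feats = rng.normal(size=(8, 3))  # latent dimensions may differ
g_teacher = latent_geometry_graph(teacher_feats)
g_student = latent_geometry_graph(student_feats)
mimic_loss = geometry_distance(g_teacher, g_student)
```

Because both graphs have one node per input in the batch, they can be compared even when the teacher and student latent spaces have different dimensions, which is what makes a geometry-mimicking loss of this kind possible.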

Reconfigurable Binary Neural Network Accelerator with Adaptive Parallelism Scheme

Electronics ◽

10.3390/electronics10030230 ◽

2021 ◽

Vol 10 (3) ◽

pp. 230

Author(s):

Jaechan Cho ◽

Yongchul Jung ◽

Seongjoo Lee ◽

Yunho Jung

Keyword(s):

Neural Networks ◽

High Throughput ◽

Deep Neural Networks ◽

State Of The Art ◽

Throughput Performance ◽

Adaptive Parallelism ◽

Sensor Applications ◽

Binary Neural Network ◽

Target Layer ◽

Network Topologies

Binary neural networks (BNNs) have attracted significant interest for the implementation of deep neural networks (DNNs) on resource-constrained edge devices, and various BNN accelerator architectures have been proposed to achieve higher efficiency. BNN accelerators can be divided into two categories: streaming and layer accelerators. Although streaming accelerators designed for a specific BNN network topology provide high throughput, they are infeasible for various sensor applications in edge AI because of their complexity and inflexibility. In contrast, layer accelerators with reasonable resources can support various network topologies, but they operate with the same parallelism for all the layers of the BNN, which degrades throughput performance at certain layers. To overcome this problem, we propose a BNN accelerator with adaptive parallelism that offers high throughput performance in all layers. The proposed accelerator analyzes target layer parameters and operates with optimal parallelism using reasonable resources. In addition, this architecture is able to fully compute all types of BNN layers thanks to its reconfigurability, and it can achieve a higher area–speed efficiency than existing accelerators. In performance evaluation using state-of-the-art BNN topologies, the designed BNN accelerator achieved an area–speed efficiency 9.69 times higher than previous FPGA implementations and 24% higher than existing VLSI implementations for BNNs.
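The throughput loss caused by a fixed parallelism, and the benefit of choosing it per layer, can be shown with a toy cost model. This is not the accelerator's actual scheduling logic: the cycle formula, the processing-element budget, and the exhaustive search below are assumptions for illustration only.

```python
from math import ceil

def cycles(in_ch, out_ch, p_in, p_out):
    """Toy cost model: passes needed to cover every input/output channel pair
    when p_in input channels and p_out output channels are processed at once."""
    return ceil(in_ch / p_in) * ceil(out_ch / p_out)

def best_parallelism(in_ch, out_ch, budget):
    """Search (p_in, p_out) with p_in * p_out <= budget that minimizes cycles.

    Returns a tuple (p_in, p_out, cycles).
    """
    best = None
    for p_in in range(1, budget + 1):
        p_out = budget // p_in
        c = cycles(in_ch, out_ch, p_in, p_out)
        if best is None or c < best[2]:
            best = (p_in, p_out, c)
    return best

# Hypothetical layer shapes (input channels, output channels) and a budget of 64 units.
layers = [(3, 64), (64, 128), (128, 10)]
fixed = [cycles(i, o, 8, 8) for i, o in layers]
adaptive = [best_parallelism(i, o, 64)[2] for i, o in layers]
```

In this model the first layer has only 3 input channels, so a fixed 8 x 8 arrangement leaves most input lanes idle, while a per-layer choice can spend the whole budget on output channels.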

Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset

Applied Sciences ◽

10.3390/app10062104 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2104

Author(s):

Michał Tomaszewski ◽

Paweł Michalski ◽

Jakub Osuchowski

Keyword(s):

Neural Network ◽

Neural Networks ◽

Object Detection ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Detection Efficiency ◽

Training Data ◽

Training Dataset ◽

Training Set ◽

Convolutional Network

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input. The possibility of using a limited set of learning data was achieved by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article presents comparisons of detection results for the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks will achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision) at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, it can be noted that the number of input samples has a significantly lower influence on its results than in the case of other CNN models, which, in the authors’ assessment, is a desired feature in the case of a limited training set.

Label Distribution for Learning with Noisy Labels

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/356 ◽

2020 ◽

Author(s):

Yun-Peng Liu ◽

Ning Xu ◽

Yu Zhang ◽

Xin Geng

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Learning Algorithm ◽

State Of The Art ◽

Confidence Estimation ◽

Novel Method ◽

Real World Datasets ◽

Label Distribution ◽

Noisy Labels

The performance of deep neural networks (DNNs) crucially relies on the quality of labeling. In some situations, labels are easily corrupted and become noisy. Thus, designing algorithms that deal with noisy labels is of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address the problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distribution. Then, the boundary between clean labels and noisy labels becomes clear according to confidence scores. To verify the effectiveness of the method, LDCE is combined with an existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm over state-of-the-art methods.

Framework for TCAD augmented machine learning on multi- I–V characteristics using convolutional neural network and multiprocessing

Journal of Semiconductors ◽

10.1088/1674-4926/42/12/124101 ◽

2021 ◽

Vol 42 (12) ◽

pp. 124101

Author(s):

Thomas Hirtz ◽

Steyn Huurman ◽

He Tian ◽

Yi Yang ◽

Tian-Ling Ren

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Information Technologies ◽

Deep Neural Networks ◽

State Of The Art ◽

Data Driven ◽

Sufficient Data ◽

Learning Models ◽

Simulation Tools ◽

New Information

In a world where data is increasingly important for making breakthroughs, microelectronics is a field where data is sparse and hard to acquire. Only a few entities have the infrastructure that is required to automate the fabrication and testing of semiconductor devices. This infrastructure is crucial for generating sufficient data for the use of new information technologies. This situation generates a cleavage between most of the researchers and the industry. To address this issue, this paper will introduce a widely applicable approach for creating custom datasets using simulation tools and parallel computing. The multi-I–V curves that we obtained were processed simultaneously using convolutional neural networks, which gave us the ability to predict a full set of device characteristics with a single inference. We prove the potential of this approach through two concrete examples of useful deep learning models that were trained using the generated data. We believe that this work can act as a bridge between the state-of-the-art of data-driven methods and more classical semiconductor research, such as device engineering, yield engineering or process monitoring. Moreover, this research gives the opportunity to anybody to start experimenting with deep neural networks and machine learning in the field of microelectronics, without the need for expensive experimentation infrastructure.

Detecting Emotions in English and Arabic Tweets

Information ◽

10.3390/info10030098 ◽

2019 ◽

Vol 10 (3) ◽

pp. 98 ◽

Cited By ~ 4

Author(s):

Tariq Ahmad ◽

Allan Ramsay ◽

Hanady Ahmed

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Algorithms ◽

General Purpose ◽

Machine Learning Algorithms ◽

Current State ◽

Optimal Thresholds ◽

Alternative Approach

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general purpose machine learning algorithms.
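The lexicon-plus-threshold idea can be sketched in a few lines. The abstract does not give the exact weighting, the lexicon modification step, or the threshold criterion, so the estimate of P(label | word) from document counts, the mean-weight score, the F1-based threshold search, and the toy data below are all assumptions for illustration; the lexicon modification step is omitted.

```python
from collections import Counter, defaultdict

def build_lexicon(docs):
    """docs: list of (tokens, set_of_labels).

    Returns {label: {word: P(label | word)}}, estimated from document counts.
    """
    word_count = Counter()
    joint = defaultdict(Counter)
    for tokens, labels in docs:
        for w in set(tokens):
            word_count[w] += 1
            for lab in labels:
                joint[lab][w] += 1
    return {lab: {w: c / word_count[w] for w, c in counts.items()}
            for lab, counts in joint.items()}

def score(tokens, weights):
    """Mean lexicon weight of the tokens for one label."""
    if not tokens:
        return 0.0
    return sum(weights.get(w, 0.0) for w in tokens) / len(tokens)

def best_threshold(scores, gold):
    """Pick the threshold, among observed scores, that maximizes F1 for one label."""
    best_t, best_f1 = 0.0, -1.0
    for t in sorted(set(scores)):
        pred = [s >= t for s in scores]
        tp = sum(p and g for p, g in zip(pred, gold))
        fp = sum(p and not g for p, g in zip(pred, gold))
        fn = sum((not p) and g for p, g in zip(pred, gold))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Hypothetical toy corpus; multi-label documents would carry several labels per set.
docs = [
    (["happy", "great", "day"], {"joy"}),
    (["sad", "awful", "day"], {"sadness"}),
    (["great", "fun"], {"joy"}),
    (["awful", "loss"], {"sadness"}),
]
lexicon = build_lexicon(docs)
joy_scores = [score(tokens, lexicon["joy"]) for tokens, _ in docs]
joy_gold = ["joy" in labels for _, labels in docs]
threshold, f1 = best_threshold(joy_scores, joy_gold)
```

Fitting one threshold per label lets each class be decided independently, which suits a multi-label task where a document may receive several labels or none.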

A New Click-Through Rates Prediction Model Based on Deep&Cross Network

Algorithms ◽

10.3390/a13120342 ◽

2020 ◽

Vol 13 (12) ◽

pp. 342

Author(s):

Guojing Huang ◽

Qingliang Chen ◽

Congjian Deng

Keyword(s):

Neural Networks ◽

Prediction Model ◽

Deep Neural Networks ◽

Prediction Models ◽

State Of The Art ◽

Online Advertising ◽

Optimization Technique ◽

Proposed Model ◽

Great Progress ◽

Online Advertisement

With the development of E-commerce, online advertising began to thrive and has gradually developed into a new mode of business, of which Click-Through Rates (CTR) prediction is the essential driving technology. Given a user, commodities and scenarios, the CTR model can predict the user’s click probability of an online advertisement. Recently, great progress has been made with the introduction of Deep Neural Networks (DNN) into CTR. In order to further advance the DNN-based CTR prediction models, this paper introduces a new model of FO-FTRL-DCN, based on the prestigious model of Deep&Cross Network (DCN) augmented with the latest optimization technique of Follow The Regularized Leader (FTRL) for DNN. The extensive comparative experiments on the iPinYou datasets show that the proposed model has outperformed other state-of-the-art baselines, with better generalization across different datasets in the benchmark.

Self-Supervised Learning for Generalizable Out-of-Distribution Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5966 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5216-5223 ◽

Cited By ~ 1

Author(s):

Sina Mohseni ◽

Mandar Pitale ◽

JBS Yadawa ◽

Zhangyang Wang

Keyword(s):

Neural Networks ◽

Autonomous Vehicles ◽

Deep Neural Networks ◽

State Of The Art ◽

Feature Learning ◽

Detection Methods ◽

Training Set ◽

Safety Critical ◽

Multiple Image ◽

A New Technique

The real-world deployment of Deep Neural Networks (DNNs) in safety-critical applications such as autonomous vehicles needs to address a variety of DNNs' vulnerabilities, one of which is detecting and rejecting out-of-distribution outliers that might result in unpredictable fatal errors. We propose a new technique relying on self-supervision for generalizable out-of-distribution (OOD) feature learning and rejecting those samples at inference time. Our technique does not need to know the distribution of targeted OOD samples in advance and incurs no extra overhead compared to other methods. We perform multiple image classification experiments and observe that our technique performs favorably against state-of-the-art OOD detection methods. Interestingly, we find that our method also reduces in-distribution classification risk by rejecting samples near the boundaries of the training set distribution.

Dataset-aware multi-task learning approaches for biomedical named entity recognition

Bioinformatics ◽

10.1093/bioinformatics/btaa515 ◽

2020 ◽

Vol 36 (15) ◽

pp. 4331-4338

Author(s):

Mei Zuo ◽

Yang Zhang

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Named Entity Recognition ◽

Entity Recognition ◽

Quality Data ◽

Supplementary Information ◽

Named Entity ◽

Task Learning ◽

Biomedical Named Entity Recognition

Motivation: Named entity recognition is a critical and fundamental task for biomedical text mining. Recently, researchers have focused on exploiting deep neural networks for biomedical named entity recognition (Bio-NER). The performance of deep neural networks on a single dataset mostly depends on data quality and quantity, while high-quality data tends to be limited in size. To alleviate task-specific data limitation, some studies explored multi-task learning (MTL) for Bio-NER and achieved state-of-the-art performance. However, these MTL methods did not make full use of information from various Bio-NER datasets. The performance of the state-of-the-art MTL method was significantly limited by the number of training datasets. Results: We propose two dataset-aware MTL approaches for Bio-NER which jointly train all models for numerous Bio-NER datasets, so that each of these models can discriminatively exploit information from all related training datasets. Both of our approaches achieve substantially better performance than the state-of-the-art MTL method on 14 out of 15 Bio-NER datasets. Furthermore, we implemented our approaches by incorporating Bio-NER and biomedical part-of-speech (POS) tagging datasets. The results verify that Bio-NER and POS tagging can significantly enhance one another. Availability and implementation: Our source code is available at https://github.com/zmmzGitHub/MTL-BC-LBC-BioNER and all datasets are publicly available at https://github.com/cambridgeltl/MTL-Bioinformatics-2016. Supplementary information: Supplementary data are available at Bioinformatics online.

Automated Recognition of Arrhythmia Using Deep Neural Networks for 12-Lead Electrocardiograms with Fractional Time–Frequency Domain Extension

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2020.3212 ◽

2020 ◽

Vol 10 (11) ◽

pp. 2764-2767

Author(s):

Chuanbin Ge ◽

Di Liu ◽

Juan Liu ◽

Bingshuai Liu ◽

Yi Xin

Keyword(s):

Neural Networks ◽

Frequency Domain ◽

Deep Neural Networks ◽

Fractional Fourier Transform ◽

Training Dataset ◽

Clinical Tool ◽

Time Frequency ◽

Ecg Signals ◽

Physiological Signal ◽

Ecg Data

Arrhythmia is a group of conditions in which the heartbeat is irregular. There are many types of arrhythmia, and some can be life-threatening. The electrocardiogram (ECG) is an effective clinical tool used to diagnose arrhythmia. Automatic recognition of different arrhythmia types in ECG signals has become an important and challenging issue. In this article, we proposed an algorithm to detect arrhythmia in 12-lead ECG signals and classify signals into 9 categories. Two 19-layer deep neural networks combining a convolutional neural network and a gated recurrent unit were proposed to realize this work. The first one was trained directly with the raw 12-lead ECG data, while the other one was trained with 18-"lead" ECG data, where the six extra leads containing morphology information in the fractional time–frequency domain were generated using the fractional Fourier transform (FRFT). Overall detection results were obtained by fusing the outputs of these two networks, and the final classification results on the testing dataset show that our proposed algorithm obtained an F1 score of 0.855. Furthermore, with our proposed algorithm, a better F1 score of 0.81 was attained using the training dataset provided by the China Physiological Signal Challenge held in 2018.