A fast and efficient pre-training method based on layer-by-layer maximum discrimination for deep neural networks

2015 ◽  
Vol 168 ◽  
pp. 669-680 ◽  
Author(s):  
Seyyede Zohreh Seyyedsalehi ◽  
Seyyed Ali Seyyedsalehi


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 428
Author(s):  
Hyun Kwon ◽  
Jun Lee

This paper presents research focusing on visualization and pattern recognition based on computer science. Although deep neural networks demonstrate satisfactory performance in image and voice recognition, pattern analysis, and intrusion detection, they are vulnerable to adversarial examples: inputs created by adding a small amount of noise to the original data so that deep neural networks misclassify them, even though humans still perceive them as normal. In this paper, a diversity adversarial training method that is robust against adversarial attacks is demonstrated. In this approach, the target model becomes more robust to unknown adversarial examples because it is trained on a variety of adversarial samples. In the experiments, TensorFlow was employed as the deep learning framework, while MNIST and Fashion-MNIST were used as the datasets. The results reveal that the diversity training method lowers the attack success rate by an average of 27.2% and 24.3% for various adversarial examples, while maintaining accuracy rates of 98.7% and 91.5% on the original MNIST and Fashion-MNIST data, respectively.
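The abstract does not give implementation details, but the core idea, training on a mix of clean and diverse adversarial samples, can be sketched in TensorFlow. This is a minimal sketch only: FGSM is used as a stand-in attack, and the model, epsilon values, and loop structure are assumptions rather than the authors' exact method.

```python
import tensorflow as tf

# Minimal sketch of diversity adversarial training. Assumptions: FGSM as
# the attack, several epsilon values as the "diverse" adversarial examples.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10),
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

def fgsm(x, y, eps):
    """Generate FGSM adversarial examples at perturbation size eps."""
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, model(x))
    grad = tape.gradient(loss, x)
    return tf.clip_by_value(x + eps * tf.sign(grad), 0.0, 1.0)

@tf.function
def train_step(x, y):
    # Mix clean inputs with adversarial variants at several strengths.
    batches = [x] + [fgsm(x, y, eps) for eps in (0.05, 0.1, 0.2)]
    with tf.GradientTape() as tape:
        loss = tf.add_n([loss_fn(y, model(xb)) for xb in batches]) / len(batches)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```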


1997 ◽  
Vol 08 (05n06) ◽  
pp. 509-515
Author(s):  
Yan Li ◽  
A. B. Rad

A new structure and training method for multilayer neural networks are presented. The proposed method is based on cascade training of subnetworks and on optimizing the weights layer by layer. The training procedure is completed in two steps. First, a subnetwork with m inputs and n outputs, matching the format of the training samples, is trained on those samples. Second, another subnetwork with n inputs and n outputs is trained, taking the outputs of the first subnetwork as its inputs and the desired outputs of the training samples as its targets. Finally, the two trained subnetworks are connected to form a trained multilayer neural network. Numerical simulation results based on both the linear least squares back-propagation (LSB) and traditional back-propagation (BP) algorithms demonstrate the efficiency of the proposed method.
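The two-step procedure translates directly into code. The sketch below follows the cascade scheme under stated assumptions: the dimensions, optimizer, and epoch counts are illustrative, and ordinary gradient training stands in for the paper's LSB variant.

```python
import numpy as np
import tensorflow as tf

# Sketch of the two-step cascade scheme (dimensions, optimizer, and epoch
# counts are illustrative; the paper's LSB variant is not reproduced here).
m, n = 8, 3                            # input and output dimensionality
X = np.random.rand(1000, m)            # placeholder training samples
Y = np.random.rand(1000, n)            # placeholder desired outputs

# Step 1: subnetwork with m inputs and n outputs, trained on the samples.
sub1 = tf.keras.Sequential([tf.keras.Input(shape=(m,)),
                            tf.keras.layers.Dense(n, activation="sigmoid")])
sub1.compile(optimizer="adam", loss="mse")
sub1.fit(X, Y, epochs=20, verbose=0)

# Step 2: subnetwork with n inputs and n outputs, trained on sub1's
# outputs against the same desired outputs.
H = sub1.predict(X, verbose=0)
sub2 = tf.keras.Sequential([tf.keras.Input(shape=(n,)),
                            tf.keras.layers.Dense(n, activation="sigmoid")])
sub2.compile(optimizer="adam", loss="mse")
sub2.fit(H, Y, epochs=20, verbose=0)

# Finally: connect the trained subnetworks into one multilayer network.
cascade = tf.keras.Sequential([sub1, sub2])
```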


2018 ◽  
Vol 6 (1) ◽  
pp. 74-86 ◽  
Author(s):  
Zhi-Hua Zhou ◽  
Ji Feng

Abstract Current deep-learning models are mostly built upon neural networks, i.e. multiple layers of parameterized differentiable non-linear modules that can be trained by backpropagation. In this paper, we explore the possibility of building deep models based on non-differentiable modules such as decision trees. After a discussion about the mystery behind deep neural networks, particularly by contrasting them with shallow neural networks and traditional machine-learning techniques such as decision trees and boosting machines, we conjecture that the success of deep neural networks owes much to three characteristics, i.e. layer-by-layer processing, in-model feature transformation and sufficient model complexity. On one hand, our conjecture may offer inspiration for theoretical understanding of deep learning; on the other hand, to verify the conjecture, we propose an approach that generates a deep forest holding these characteristics. This is a decision-tree ensemble approach, with fewer hyper-parameters than deep neural networks, and its model complexity can be automatically determined in a data-dependent way. Experiments show that its performance is quite robust to hyper-parameter settings, such that in most cases, even across different data from different domains, it is able to achieve excellent performance by using the same default setting. This study opens the door to deep learning based on non-differentiable modules without gradient-based adjustment, and exhibits the possibility of constructing deep models without backpropagation.
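The layer-by-layer processing and in-model feature transformation the abstract names can be illustrated with a small cascade of scikit-learn forests: each level's class-probability vectors are appended to the input features of the next level. This is a hedged sketch in the spirit of the idea, not the authors' implementation; multi-grained scanning and the automatic depth control are omitted, and the forest types, sizes, and level count are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import cross_val_predict

def cascade_forest_fit_predict(X_train, y_train, X_test, n_levels=3):
    # Each level augments the raw features with the previous level's
    # class-probability vectors (layer-by-layer feature transformation).
    aug_train, aug_test = X_train, X_test
    for _ in range(n_levels):
        level_train, level_test = [], []
        for Forest in (RandomForestClassifier, ExtraTreesClassifier):
            clf = Forest(n_estimators=100, n_jobs=-1)
            # Out-of-fold probabilities avoid leaking labels level to level.
            p_train = cross_val_predict(clf, aug_train, y_train,
                                        cv=3, method="predict_proba")
            clf.fit(aug_train, y_train)
            level_train.append(p_train)
            level_test.append(clf.predict_proba(aug_test))
        aug_train = np.hstack([X_train] + level_train)
        aug_test = np.hstack([X_test] + level_test)
    # Final prediction: average the last level's probability vectors.
    return np.mean(level_test, axis=0).argmax(axis=1)

# Usage: preds = cascade_forest_fit_predict(X_tr, y_tr, X_te)
```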


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Yaochun Wu ◽  
Rongzhen Zhao ◽  
Wuyin Jin ◽  
Linfeng Deng ◽  
Tianjing He ◽  
...  

Deep learning (DL) has been successfully used in fault diagnosis. Training deep neural networks, such as convolutional neural networks (CNNs), requires plenty of labeled samples. However, in mechanical fault diagnosis, labeled data are costly and time-consuming to collect. A novel method based on a deep convolutional autoencoding network (DCAEN) and the adaptive nonparametric weighted-feature extraction Gustafson–Kessel (ANW-GK) clustering algorithm was developed for the fault diagnosis of bearings. First, the DCAEN, which is pretrained layer by layer on unlabeled samples and fine-tuned with a few labeled samples, is applied to learn representative features from the vibration signals. Then, the learned representative features are reduced by t-distributed stochastic neighbor embedding (t-SNE), and the low-dimensional main features are obtained. Finally, the low-dimensional features are input into the ANW-GK clustering algorithm for fault identification. Two datasets were used to validate the effectiveness of the proposed method. The experimental results show that the proposed method can effectively diagnose different fault types with only a few labeled samples.
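The three-stage pipeline (unsupervised feature learning, t-SNE reduction, clustering) can be sketched as follows. Heavy hedging applies: the convolutional autoencoder is reduced to a dense one, the data are placeholders, and scikit-learn's KMeans stands in for ANW-GK clustering, which is not available in standard libraries.

```python
import numpy as np
import tensorflow as tf
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

# Sketch of the feature-learning-then-clustering pipeline. KMeans is a
# stand-in for the ANW-GK algorithm; the autoencoder is simplified to
# dense layers and all sizes are illustrative.
signals = np.random.rand(500, 1024).astype("float32")  # placeholder vibration data

encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(1024,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
])
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1024),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(signals, signals, epochs=10, verbose=0)   # unsupervised pretraining

features = encoder.predict(signals, verbose=0)            # learned representations
embedded = TSNE(n_components=2).fit_transform(features)   # dimension reduction
labels = KMeans(n_clusters=4).fit_predict(embedded)       # fault-type clusters
```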


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1429
Author(s):  
Scythia Marrow ◽  
Eric J. Michaud ◽  
Erik Hoel

Deep Neural Networks (DNNs) are often examined at the level of their response to input, such as analyzing the mutual information between nodes and data sets. Yet DNNs can also be examined at the level of causation, exploring “what does what” within the layers of the network itself. Historically, analyzing the causal structure of DNNs has received less attention than understanding their responses to input. Yet definitionally, generalizability must be a function of a DNN’s causal structure as it reflects how the DNN responds to unseen or even not-yet-defined future inputs. Here, we introduce a suite of metrics based on information theory to quantify and track changes in the causal structure of DNNs during training. Specifically, we introduce the effective information (EI) of a feedforward DNN, which is the mutual information between layer input and output following a maximum-entropy perturbation. The EI can be used to assess the degree of causal influence nodes and edges have over their downstream targets in each layer. We show that the EI can be further decomposed in order to examine the sensitivity of a layer (measured by how well edges transmit perturbations) and the degeneracy of a layer (measured by how edge overlap interferes with transmission), along with estimates of the amount of integrated information of a layer. Together, these properties define where each layer lies in the “causal plane”, which can be used to visualize how layer connectivity becomes more sensitive or degenerate over time, and how integration changes during training, revealing how the layer-by-layer causal structure differentiates. These results may help in understanding the generalization capabilities of DNNs and provide foundational tools for making DNNs both more generalizable and more explainable.
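The definition of EI above lends itself to a rough numerical sketch: inject maximum-entropy (uniform) inputs into a layer, observe the outputs, and estimate mutual information by histogram binning. The code below approximates EI for a single dense layer by summing pairwise node-to-node MI estimates; the layer, bin count, sample size, and this pairwise decomposition are illustrative assumptions, and the paper's exact estimator may differ.

```python
import numpy as np

def layer(x, W, b):
    # A single tanh layer standing in for one layer of a feedforward DNN.
    return np.tanh(x @ W + b)

def effective_information(W, b, n_samples=100_000, bins=16):
    # Maximum-entropy perturbation: uniform inputs over the layer's domain.
    x = np.random.uniform(-1.0, 1.0, size=(n_samples, W.shape[0]))
    y = layer(x, W, b)
    ei = 0.0
    # Coarse approximation: sum binned MI estimates over input/output pairs.
    for i in range(x.shape[1]):
        for j in range(y.shape[1]):
            pxy, _, _ = np.histogram2d(x[:, i], y[:, j], bins=bins)
            pxy /= pxy.sum()
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            nz = pxy > 0
            ei += np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz]))
    return ei
```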


Entropy ◽  
2021 ◽  
Vol 23 (10) ◽  
pp. 1360
Author(s):  
Xin Du ◽  
Katayoun Farrahi ◽  
Mahesan Niranjan

In solving challenging pattern recognition problems, deep neural networks have shown excellent performance by forming powerful mappings between inputs and targets, learning representations (features) and making subsequent predictions. A recent tool to help understand how representations are formed is based on observing the dynamics of learning on an information plane using mutual information, linking the input to the representation (I(X;T)) and the representation to the target (I(T;Y)). In this paper, we use an information-theoretic approach to understand how Cascade Learning (CL), a method to train deep neural networks layer-by-layer, learns representations, as CL has shown comparable results while saving computation and memory costs. We observe that performance is not linked to information compression, which differs from observations on End-to-End (E2E) learning. Additionally, CL can inherit information about targets and gradually specialise extracted features layer-by-layer. We evaluate this effect by proposing an information transition ratio, I(T;Y)/I(X;T), and show that it can serve as a useful heuristic in setting the depth of a neural network that achieves satisfactory classification accuracy.
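As a sketch of how the proposed ratio could drive a depth heuristic, the hypothetical helper below grows the cascade until I(T;Y)/I(X;T) reaches a threshold. The threshold value and the interface for the mutual-information estimates are assumptions, not the authors' procedure; any standard MI estimator (binning, kernel-based) could supply the per-layer values.

```python
# Hypothetical depth heuristic built on the information transition ratio.
def transition_ratio(mi_ty, mi_xt):
    """I(T;Y) / I(X;T) for one trained layer's representation T."""
    return mi_ty / mi_xt

def choose_depth(layer_stats, threshold=0.9):
    # layer_stats: list of (I(X;T), I(T;Y)) tuples, one per cascaded layer,
    # computed by an external MI estimator (assumed, not prescribed here).
    for depth, (mi_xt, mi_ty) in enumerate(layer_stats, start=1):
        if transition_ratio(mi_ty, mi_xt) >= threshold:
            return depth          # representation is specialised enough
    return len(layer_stats)
```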


2016 ◽  
Vol 31 (4) ◽  
pp. 267
Author(s):  
Bao Quoc Nguyen ◽  
Thang Tat Vu ◽  
Mai Chi Luong

In this paper, a pre-training method based on the denoising auto-encoder is investigated and shown to provide good initial models for the bottleneck networks of a Vietnamese speech recognition system, resulting in better recognition performance than the base bottleneck features reported previously. The experiments are carried out on a dataset containing speech from the Voice of Vietnam (VOV) channel. The results show that the DBNF extraction for Vietnamese recognition decreases the word error rate by a relative 14% and 39% compared to the base bottleneck features and the MFCC baseline, respectively.
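Denoising-autoencoder pretraining of a bottleneck network can be sketched briefly: corrupt the input features, train the network to reconstruct the clean version, then keep the encoder as the bottleneck feature extractor. The layer sizes, noise level, and placeholder features below are assumptions, not the paper's settings.

```python
import numpy as np
import tensorflow as tf

# Sketch of denoising-autoencoder pretraining for bottleneck features.
feats = np.random.rand(2000, 440).astype("float32")  # placeholder acoustic features
noisy = feats + np.random.normal(0.0, 0.1, feats.shape).astype("float32")

encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(440,)),
    tf.keras.layers.Dense(1024, activation="sigmoid"),
    tf.keras.layers.Dense(40, activation="sigmoid"),   # bottleneck layer
])
decoder = tf.keras.layers.Dense(440)
dae = tf.keras.Sequential([encoder, decoder])
dae.compile(optimizer="adam", loss="mse")
dae.fit(noisy, feats, epochs=10, verbose=0)   # reconstruct clean from noisy

bottleneck_features = encoder.predict(feats, verbose=0)  # DBNF-style features
```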


Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 4977
Author(s):  
Ji-Won Kang ◽  
Jae-Eun Lee ◽  
Jang-Hwan Choi ◽  
Woosuk Kim ◽  
Jin-Kyum Kim ◽  
...  

This paper proposes a method to embed and extract a watermark in a digital hologram using a deep neural network. The entire watermarking algorithm for digital holograms consists of three sub-networks. For robustness, an attack simulation is inserted inside the deep neural network. By including the attack simulation and holographic reconstruction in the network, the deep neural network for watermarking can be trained for invisibility and robustness simultaneously. We propose a network training method that uses the hologram and its reconstruction. After training the proposed network, we analyze the robustness against each attack and re-train the network according to these results to improve its robustness. We quantitatively evaluate the robustness against various attacks and show the reliability of the proposed technique.
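The embed-attack-extract layout can be sketched with the Keras functional API: an embedder hides the watermark in the hologram, a differentiable attack simulation distorts the result, and an extractor recovers the watermark, all trained jointly. This is a sketch under stated assumptions: the shapes, layer choices, Gaussian noise as the attack, and the loss weighting are illustrative, and the holographic reconstruction step is omitted.

```python
import tensorflow as tf

# Sketch of the three-part watermarking layout with in-network attack
# simulation (assumptions: shapes, losses, and Gaussian noise attack).
H, W = 64, 64
hologram = tf.keras.Input(shape=(H, W, 1))
watermark = tf.keras.Input(shape=(H, W, 1))

# Sub-network 1: embedder produces the watermarked hologram.
x = tf.keras.layers.Concatenate()([hologram, watermark])
x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
marked = tf.keras.layers.Conv2D(1, 3, padding="same")(x)

# Sub-network 2: attack simulation, active during training only.
attacked = tf.keras.layers.GaussianNoise(0.1)(marked)

# Sub-network 3: extractor recovers the watermark from the attacked hologram.
y = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(attacked)
extracted = tf.keras.layers.Conv2D(1, 3, padding="same",
                                   activation="sigmoid")(y)

model = tf.keras.Model([hologram, watermark], [marked, extracted])
# Joint objective: invisibility (marked close to hologram) plus robustness
# (extracted close to watermark); the 1:1 weighting is an arbitrary choice.
model.compile(optimizer="adam",
              loss=["mse", "binary_crossentropy"],
              loss_weights=[1.0, 1.0])
# Usage: model.fit([holos, marks], [holos, marks], ...)
```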

