VOVU: A Method for Predicting Generalization in Deep Neural Networks

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Juan Wang ◽  
Liangzhu Ge ◽  
Guorui Liu ◽  
Guoyan Li

During the development of deep neural networks (DNNs), it is difficult to trade off fitting performance on the training set against generalization to unknown data (such as a test set). The usual solution is to reduce the complexity of the objective function with regularization methods. In this paper, we propose a method called VOVU (Variance Of Variance of Units in the last hidden layer) to balance fitting power and generalization while monitoring the training process. The main idea is to exploit the predictive power that the variance of the hidden-layer units carries about the complexity of the neural network model and to use it as a generalization index. In particular, we focus on the last hidden layer, since it has the greatest impact. The algorithm was tested on Fashion-MNIST and CIFAR-10. The experimental results demonstrate that VOVU and test loss are highly positively correlated, which implies that a smaller VOVU indicates better generalization. VOVU can therefore serve as an alternative to early stopping and a good predictor of generalization performance in DNNs. In particular, when the sample size is limited, VOVU is the better choice because it does not require setting aside part of the training data as a validation set.
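The abstract names the quantity precisely: the variance, across units of the last hidden layer, of each unit's activation variance over the samples. A minimal numpy sketch of that computation (the function name and array layout are our assumptions, not the authors' code):

```python
import numpy as np

def vovu(last_hidden_activations):
    """Variance Of Variance of Units (VOVU), sketched from the description.

    last_hidden_activations: array of shape (n_samples, n_units) holding
    the last hidden layer's activations over the training samples.
    """
    # Variance of each unit's activation across the samples.
    unit_variances = last_hidden_activations.var(axis=0)
    # VOVU: the variance of those per-unit variances.
    return unit_variances.var()
```

A lower VOVU then corresponds to units whose activation variances are more uniform, which the paper reports correlating with lower test loss.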

2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with a limited quantity of input data. The possibility of using a limited learning set was achieved by developing a detailed task scenario that strictly defined the operating conditions of the detector, in this case a convolutional neural network. The described solution uses known deep neural network architectures for learning and object detection. The article compares the detection results of the most popular deep neural networks trained on a limited set composed of a specific number of images selected from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines, and the object detector was built for power insulators. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not trivial. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The best results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), 0.8, for 60 frames. The R-FCN model obtained a worse AP; however, the number of input samples influenced its results significantly less than those of the other CNN models, which, in the authors' assessment, is a desirable feature for a limited training set.
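The comparison metric above, AP, is worth making concrete. A common way to compute average precision for one class ranks detections by confidence and averages the precision attained at each recovered ground-truth object; the sketch below follows that standard recipe (it is not the authors' evaluation code, and the IoU matching that decides true positives is assumed to have already happened):

```python
import numpy as np

def average_precision(scores, is_true_positive, n_ground_truth):
    """AP for one class, e.g. power insulators.

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth
    object (e.g. at IoU >= 0.5); each ground truth may match once.
    """
    order = np.argsort(-np.asarray(scores))          # rank by confidence
    tp = np.asarray(is_true_positive, float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)    # precision after each detection
    # Average the precision at each recall step reached by a true positive.
    return float(np.sum(precision * tp) / n_ground_truth)
```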


2020 ◽  
Vol 34 (10) ◽  
pp. 13791-13792
Author(s):  
Liangzhu Ge ◽  
Yuexian Hou ◽  
Yaju Jiang ◽  
Shuai Yao ◽  
Chao Yang

Despite their widespread applications, deep neural networks often tend to overfit the training data. Here, we propose a measure called VECA (Variance of Eigenvalues of Covariance matrix of Activation matrix) and demonstrate that VECA is a good predictor of a network's generalization performance during the training process. Experiments performed on fully-connected networks and convolutional neural networks trained on benchmark image datasets show a strong correlation between test loss and VECA, which suggests that we can calculate VECA to estimate generalization performance without sacrificing training data for a validation set.
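VECA, like VOVU above, is fully specified by its name: the variance of the eigenvalues of the covariance matrix of the activation matrix. A minimal numpy sketch under the assumption that activations are arranged samples-by-units (function name and layout are ours, not the authors'):

```python
import numpy as np

def veca(activations):
    """Variance of Eigenvalues of the Covariance matrix of the
    Activation matrix, sketched from the description.

    activations: array of shape (n_samples, n_units).
    """
    # Covariance matrix of the units (n_units x n_units).
    cov = np.cov(activations, rowvar=False)
    # The covariance matrix is symmetric, so its eigenvalues are real.
    eigvals = np.linalg.eigvalsh(cov)
    # VECA: the spread of the covariance spectrum.
    return eigvals.var()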


2021 ◽  
Vol 13 (12) ◽  
pp. 2392
Author(s):  
Heikki Astola ◽  
Lauri Seitsonen ◽  
Eelis Halme ◽  
Matthieu Molinier ◽  
Anne Lönnqvist

Estimation of forest structural variables is essential to provide relevant insights for public and private stakeholders in forestry and environmental sectors. Airborne light detection and ranging (LiDAR) enables accurate forest inventory, but it is expensive for large area analyses. The continuously increasing volume of open Earth Observation (EO) imagery from high-resolution (<30 m) satellites, together with modern machine learning algorithms, provides new prospects for spaceborne large area forest inventory. In this study, we investigated the capability of Sentinel-2 (S2) imagery and metadata, topography data, and the canopy height model (CHM), as well as their combinations, to predict growing stock volume with deep neural networks (DNN) in four forestry districts in Central Finland. We focused on investigating the effect of different input features, DNN depth, the amount of training data, and the size of the image data sampling window on model prediction performance. We also studied model transfer between different silvicultural districts in Finland, with the objective of minimizing the amount of new field data needed. We used forest inventory data provided by the Finnish Forest Centre for model training and performance evaluation. Leaving out CHM features, the model using RGB and NIR bands, the imaging and sun angles, and topography features as additional predictive variables obtained the best plot-level accuracy (RMSE% = 42.6%, |BIAS%| = 0.8%). We found 3×3 pixels to be the optimal size for the sampling window, and DNNs with two to three hidden layers to produce the best results, with relatively small improvement over single hidden layer networks. Including CHM features with S2 data and additional features led to reduced relative RMSE (RMSE% = 28.6–30.7%) but increased the absolute value of relative bias (|BIAS%| = 0.9–4.0%). Transfer learning was found to be beneficial mainly with training data sets containing fewer than 250 field plots.
The performance differences between DNN and random forest models were marginal. Our results contribute to improved structural variable estimation performance in boreal forests with the proposed image sampling and input feature concept.
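The 3×3 pixel sampling window can be illustrated concretely: for each field plot, a small neighborhood of pixels around the plot center is extracted from the raster and flattened into the DNN's input vector. A minimal numpy sketch (the array layout and function name are our assumptions, not the study's code):

```python
import numpy as np

def extract_window(image, row, col, size=3):
    """Extract a size x size pixel window centered on (row, col).

    image: (H, W, bands) raster, e.g. Sentinel-2 RGB + NIR bands.
    Returns a flattened feature vector of length size * size * bands,
    the kind of plot-level input a DNN regressor could consume.
    """
    half = size // 2
    win = image[row - half:row + half + 1, col - half:col + half + 1, :]
    return win.reshape(-1)
```

Additional scalar features (imaging and sun angles, topography) would then be concatenated to this vector before it is fed to the network.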


2019 ◽  
Vol 12 (3) ◽  
pp. 156-161 ◽  
Author(s):  
Aman Dureja ◽  
Payal Pahwa

Background: Activation functions play an important role in deep neural networks, and their choice affects the network in terms of optimization and quality of results. Several activation functions have been introduced in machine learning for many practical applications, but which activation function should be used at the hidden layers of deep neural networks had not been identified. Objective: The primary objective of this analysis was to determine which activation function should be used at the hidden layers of deep neural networks to solve complex non-linear problems. Methods: The comparative model was configured using a dataset of two classes (Cat/Dog). The network used three convolutional layers, each followed by a pooling layer. The dataset was divided into two parts: the first 8000 images were used for training the network and the next 2000 images for testing it. Results: The experimental comparison was done by analyzing the network with different activation functions (ReLU, Tanh, SELU, PReLU, ELU) at the hidden layers, measuring validation error and accuracy on the Cat/Dog dataset. Overall, ReLU gave the best performance, with a validation loss of 0.3912 and a validation accuracy of 0.8320 at the 25th epoch. Conclusion: A CNN model with ReLU hidden layers (three hidden layers here) gives the best results and improves overall performance in terms of accuracy and speed. These advantages of ReLU at the hidden layers are helpful for effective and fast retrieval of images from databases.
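For reference, the activation functions compared above all have simple closed forms. A numpy sketch (the SELU constants are the standard published values; the PReLU slope is fixed here for illustration, although it is learned per channel in practice; Tanh is available directly as `np.tanh`):

```python
import numpy as np

def relu(x):
    # max(0, x): zero for negative inputs, identity for positive.
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # Smooth saturation toward -alpha for negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # Scaled ELU with constants chosen for self-normalization.
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def prelu(x, a=0.25):
    # Leaky negative slope a; learned in real PReLU layers.
    return np.where(x > 0, x, a * x)
```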


Author(s):  
Serkan Kiranyaz ◽  
Junaid Malik ◽  
Habib Ben Abdallah ◽  
Turker Ince ◽  
Alexandros Iosifidis ◽  
...  

Abstract: The recently proposed network model, Operational Neural Networks (ONNs), generalizes conventional Convolutional Neural Networks (CNNs), which are homogeneous and limited to a linear neuron model. As a heterogeneous network model, ONNs are based on a generalized neuron model that can encapsulate any set of non-linear operators to boost diversity and to learn highly complex and multi-modal functions or spaces with minimal network complexity and training data. However, the default search method to find optimal operators in ONNs, the so-called Greedy Iterative Search (GIS), usually takes several training sessions to find a single operator set per layer. This is not only computationally demanding, but it also limits network heterogeneity, since the same set of operators is then used for all neurons in each layer. To address this deficiency and exploit a superior level of heterogeneity, this study focuses on searching for the best possible operator set(s) for the hidden neurons of the network based on the "Synaptic Plasticity" paradigm, the essential learning theory in biological neurons. During training, each operator set in the library is evaluated by its synaptic plasticity level and ranked from worst to best, and an "elite" ONN is then configured using the top-ranked operator sets found at each hidden layer. Experimental results over highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, achieve better learning performance than GIS-based ONNs, and as a result the performance gap over CNNs widens further.


Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3530
Author(s):  
Juan Parras ◽  
Santiago Zazo ◽  
Iván A. Pérez-Álvarez ◽  
José Luis Sanz González

In recent years, there has been a significant effort towards developing localization systems in the underwater medium, with current methods relying on anchor nodes, explicitly modeling the underwater channel, or cooperation from the target. Lately, there has also been some work on using the approximation capabilities of Deep Neural Networks to address this problem. In this work, we study how the localization precision of Deep Neural Networks is affected by the variability of the channel, the noise level at the receiver, the number of neurons in the neural network, and the use of the power or the covariance of the received acoustic signals. Our study shows that deep neural networks are a valid approach when the channel variability is low, which opens the door to further research in such localization methods for the underwater environment.


Dose-Response ◽  
2019 ◽  
Vol 17 (4) ◽  
pp. 155932581989417 ◽  
Author(s):  
Zhi Huang ◽  
Jie Liu ◽  
Liang Luo ◽  
Pan Sheng ◽  
Biao Wang ◽  
...  

Background: Plenty of evidence has suggested that autophagy plays a crucial role in the biological processes of cancers. This study aimed to screen autophagy-related genes (ARGs) and establish a novel scoring system for colorectal cancer (CRC). Methods: Autophagy-related gene sequencing data and the corresponding clinical data of CRC in The Cancer Genome Atlas were used as the training data set. The GSE39582 data set from the Gene Expression Omnibus was used as the validation set. An autophagy-related signature was developed in the training set using univariate Cox analysis followed by stepwise multivariate Cox analysis and assessed in the validation set. We then analyzed the function and pathways of the ARGs using the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Finally, a prognostic nomogram combining the autophagy-related risk score and clinicopathological characteristics was developed according to the multivariate Cox analysis. Results: After univariate and multivariate analysis, 3 ARGs were used to construct the autophagy-related signature. The KEGG pathway analyses showed several significantly enriched oncological signatures, such as the p53 signaling pathway, apoptosis, human cytomegalovirus infection, platinum drug resistance, necroptosis, and the ErbB signaling pathway. Patients were divided into high- and low-risk groups, and high-risk patients had significantly shorter overall survival (OS) than low-risk patients in both the training set and the validation set. Furthermore, a nomogram for predicting 3- and 5-year OS was established based on the autophagy-based risk score and clinicopathologic factors. The area under the curve and calibration curves indicated that the nomogram showed good predictive accuracy. Conclusions: Our proposed autophagy-based signature has important prognostic value and may provide a promising tool for the development of personalized therapy.
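In signatures of this kind, the risk score is typically the Cox model's linear predictor, a weighted sum of the signature genes' expression, with patients split into high- and low-risk groups at the median score. The sketch below illustrates that common recipe with placeholder coefficients; the paper's three ARGs and their fitted coefficients are not reproduced here:

```python
import numpy as np

def risk_score(expression, coefficients):
    """Linear predictor of a Cox model: sum_i beta_i * expr_i.

    expression: (n_patients, n_genes) matrix of the signature genes'
    expression. coefficients: the fitted Cox betas (placeholders here).
    """
    return expression @ coefficients

def split_by_median(scores):
    """Label patients high-risk (True) / low-risk (False) at the median."""
    return scores > np.median(scores)
```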

