generalization error
Recently Published Documents

TOTAL DOCUMENTS: 278 (five years: 98)
H-INDEX: 26 (five years: 3)

Processes ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 140
Author(s):  
Yanxia Yang ◽  
Pu Wang ◽  
Xuejin Gao

A radial basis function neural network (RBFNN), with its strong function-approximation ability, has proven to be an effective tool for nonlinear process modeling. However, in many instances the sample set is limited and the model evaluation error is fixed, which makes it very difficult to construct an optimal network structure that ensures the generalization ability of the established nonlinear process model. To solve this problem, a novel RBFNN with a high generalization performance (RBFNN-GP) is proposed in this paper. The proposed RBFNN-GP makes three contributions. First, a local generalization error bound that incorporates the sample mean and variance is developed to obtain a small error bound and reduce the error range. Second, a self-organizing structure method, based on the generalization error bound and network sensitivity, is established to determine a suitable number of neurons and improve the generalization ability. Third, the convergence of the proposed RBFNN-GP is proved theoretically, both when the structure is fixed and when it is adjusted. Finally, the performance of the proposed RBFNN-GP is compared with several popular algorithms on two numerical simulations and a practical application; the comparison results verify the effectiveness of RBFNN-GP.
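As a hedged illustration of the general idea only (not the authors' algorithm), the sketch below grows a Gaussian RBF network one neuron at a time and keeps the size with the lowest held-out error, used here as a crude stand-in for the paper's local generalization error bound. The function names, the shared width heuristic, and the stopping rule are assumptions introduced for this example.

```python
# Minimal sketch: growing an RBF network while tracking a held-out error proxy.
# The generalization error bound of the paper is replaced by a plain validation MSE;
# all names and heuristics below are illustrative assumptions.
import numpy as np

def rbf_features(X, centers, width):
    """Gaussian RBF activations for inputs X given centers and a shared width."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_output_weights(Phi, y, ridge=1e-6):
    """Least-squares output weights with a small ridge term for numerical stability."""
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def grow_rbf(X_train, y_train, X_val, y_val, max_neurons=30, width=0.5):
    """Add one randomly chosen center at a time; keep the size with the lowest validation MSE."""
    rng = np.random.default_rng(0)
    best = (np.inf, None)
    centers = np.empty((0, X_train.shape[1]))
    for _ in range(max_neurons):
        new_center = X_train[rng.integers(len(X_train))]
        centers = np.vstack([centers, new_center])
        w = fit_output_weights(rbf_features(X_train, centers, width), y_train)
        val_mse = np.mean((rbf_features(X_val, centers, width) @ w - y_val) ** 2)
        if val_mse < best[0]:
            best = (val_mse, (centers.copy(), w))
    return best  # (validation MSE, (centers, output weights))

# Toy usage: approximate a 1-D nonlinear function.
X = np.random.default_rng(1).uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(2).normal(size=200)
mse, (centers, w) = grow_rbf(X[:150], y[:150], X[150:], y[150:])
print(f"validation MSE with {len(centers)} centers: {mse:.4f}")
```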


Webology ◽  
2021 ◽  
Vol 18 (2) ◽  
pp. 439-448
Author(s):  
Parameswar Kanuparthi ◽  
Vaibhav Bejgam ◽  
V. Madhu Viswanatham

Agriculture is the primary sector of the Indian economy. It contributes around 18 percent of the overall GDP (gross domestic product), and more than fifty percent of Indians have an agricultural background. Agricultural production in India needs to increase rapidly because of the fast-growing population. Rice is the staple crop for most people in India, but it is also one of the crops most affected by disease, and the resulting reduction in yield leads to losses for farmers. A major challenge in cultivating rice is infection by diseases arising from various factors, including environmental conditions, the pesticides used, and natural disasters. Early detection of rice diseases can therefore help farmers avoid such losses and achieve better yields. In this paper, we propose a new method that ensembles transfer learning models to detect rice plants and classify their diseases from images. Using this model, the three most common rice crop diseases are detected: brown spot, leaf smut, and bacterial leaf blight. Transfer learning uses pre-trained models and generally gives better accuracy on image datasets, while ensembling (combining two or more machine learning models) reduces the generalization error and makes the model more robust. The ensembling technique used in this paper is majority voting. We propose a novel model that ensembles three transfer learning models, InceptionV3, MobileNetV2, and DenseNet121, and achieves an accuracy of 96.42%.
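As a small hedged sketch of the majority-voting step only (not the authors' full pipeline, and without the pre-trained backbones), the example below combines the predicted class labels of three classifiers per image. The class-label order, the made-up predictions, and all variable names are assumptions introduced for illustration.

```python
# Minimal sketch of hard majority voting over three classifiers' predictions.
# The class names and the fake predictions are placeholders; in the paper the three
# voters are fine-tuned InceptionV3, MobileNetV2 and DenseNet121 models.
import numpy as np

CLASSES = ["brown_spot", "leaf_smut", "bacterial_leaf_blight"]  # assumed label order

def majority_vote(*label_arrays):
    """Per-sample majority vote over several arrays of predicted class indices.
    Ties are broken in favour of the lowest class index (argmax behaviour)."""
    stacked = np.stack(label_arrays, axis=0)             # (n_models, n_samples)
    n_classes = stacked.max() + 1
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, stacked
    )                                                     # (n_classes, n_samples)
    return votes.argmax(axis=0)

# Toy usage with made-up per-model predictions for 5 images.
pred_inception = np.array([0, 1, 2, 2, 1])
pred_mobilenet = np.array([0, 1, 1, 2, 0])
pred_densenet  = np.array([0, 2, 2, 2, 1])
final = majority_vote(pred_inception, pred_mobilenet, pred_densenet)
print([CLASSES[i] for i in final])   # -> ['brown_spot', 'leaf_smut', ...]
```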


2021 ◽  
Vol 2021 (12) ◽  
pp. 124014
Author(s):  
Umut Şimşekli ◽  
Ozan Sener ◽  
George Deligiannidis ◽  
Murat A Erdogdu

Abstract Despite its success in a wide range of applications, characterizing the generalization properties of stochastic gradient descent (SGD) in non-convex deep learning problems remains an important challenge. While modeling the trajectories of SGD via stochastic differential equations (SDEs) under heavy-tailed gradient noise has recently shed light on several peculiar characteristics of SGD, a rigorous treatment of the generalization properties of such SDEs in a learning-theoretic framework is still missing. Aiming to bridge this gap, in this paper we prove generalization bounds for SGD under the assumption that its trajectories can be well approximated by a Feller process, which defines a rich class of Markov processes that includes several recent SDE representations (both Brownian and heavy-tailed) as special cases. We show that the generalization error can be controlled by the Hausdorff dimension of the trajectories, which is intimately linked to the tail behavior of the driving process. Our results imply that heavier-tailed processes should achieve better generalization; hence, the tail-index of the process can be used as a notion of 'capacity metric'. We support our theory with experiments on deep neural networks illustrating that the proposed capacity metric accurately estimates the generalization error and, unlike existing capacity metrics in the literature, does not necessarily grow with the number of parameters.
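As a hedged, self-contained illustration of how a tail-index can be estimated in practice, the sketch below applies the classical Hill estimator to synthetic stand-ins for gradient-noise norms. The Hill estimator is a standard heavy-tail estimator, not necessarily the capacity metric used in this paper, and the synthetic data and all names are assumptions for this example.

```python
# Minimal sketch: estimating a heavy-tail index with the classical Hill estimator.
# A smaller estimated alpha indicates heavier tails. This is a generic estimator on
# synthetic data, not the specific capacity metric of the paper.
import numpy as np

def hill_tail_index(samples, k):
    """Hill estimator of the tail index alpha from the k largest positive samples."""
    x = np.sort(np.asarray(samples))[::-1]      # descending order statistics
    logs = np.log(x[:k]) - np.log(x[k])         # log-excesses over the (k+1)-th largest
    return 1.0 / logs.mean()                    # alpha_hat = k / sum(log x_(i) / x_(k+1))

rng = np.random.default_rng(0)
# Synthetic stand-ins for gradient-noise norms: heavy-tailed (Pareto) vs light-tailed.
heavy = rng.pareto(a=1.5, size=10_000) + 1.0    # true tail index 1.5
light = np.abs(rng.normal(size=10_000))         # Gaussian: effectively very large alpha

for name, s in [("heavy-tailed", heavy), ("light-tailed", light)]:
    print(name, "estimated alpha:", round(hill_tail_index(s, k=500), 2))
```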


2021 ◽  
Vol 2021 (12) ◽  
pp. 124015
Author(s):  
Fabrizio Pittorino ◽  
Carlo Lucibello ◽  
Christoph Feinauer ◽  
Gabriele Perugini ◽  
Carlo Baldassi ◽  
...  

Abstract The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities than sharp ones. In this work we first discuss the relationship between alternative measures of flatness: the local entropy, which is useful for analysis and algorithm development, and the local energy, which is easier to compute and was shown empirically in extensive tests on state-of-the-art networks to be the best predictor of generalization capabilities. We show semi-analytically in simple controlled scenarios that these two measures correlate strongly with each other and with generalization. Then, we extend the analysis to the deep learning scenario via extensive numerical validations. We study two algorithms, entropy-stochastic gradient descent and replicated-stochastic gradient descent, that explicitly include the local entropy in the optimization objective. We devise a training schedule by which we consistently find flatter minima (using both flatness measures) and reduce the generalization error for common architectures (e.g. ResNet, EfficientNet).
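As a hedged sketch of a local-energy-style flatness proxy only (the average loss increase under random weight perturbations), and not of entropy-SGD or replicated SGD themselves, the example below compares a sharp and a flat one-dimensional minimum. The perturbation scale and all names are assumptions for illustration.

```python
# Minimal sketch of a "local energy"-style flatness proxy: the average increase in loss
# when the parameters are perturbed by Gaussian noise of scale sigma. A flatter minimum
# shows a smaller increase. This is an illustrative proxy, not the paper's exact measure.
import numpy as np

def local_energy(loss_fn, theta, sigma=0.1, n_samples=1000, seed=0):
    """Average loss increase under isotropic Gaussian perturbations of the parameters."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    base = loss_fn(theta)
    perturbed = [loss_fn(theta + sigma * rng.normal(size=theta.shape))
                 for _ in range(n_samples)]
    return np.mean(perturbed) - base

# Two toy 1-D minima at theta = 0 with the same loss value but different curvature.
sharp = lambda t: 50.0 * t[0] ** 2     # high curvature -> "sharp"
flat  = lambda t: 0.5 * t[0] ** 2      # low curvature  -> "flat"

print("sharp minimum local energy:", round(local_energy(sharp, [0.0]), 3))
print("flat  minimum local energy:", round(local_energy(flat,  [0.0]), 3))
```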


2021 ◽  
Vol 2021 (12) ◽  
pp. 124002
Author(s):  
Stéphane d’Ascoli ◽  
Levent Sagun ◽  
Giulio Biroli

Abstract A recent line of research has highlighted the existence of a ‘double descent’ phenomenon in deep learning, whereby increasing the number of training examples N causes the generalization error of neural networks (NNs) to peak when N is of the same order as the number of parameters P. In earlier works, a similar phenomenon was shown to exist in simpler models such as linear regression, where the peak instead occurs when N is equal to the input dimension D. Since both peaks coincide with the interpolation threshold, they are often conflated in the literature. In this paper, we show that despite their apparent similarity, these two scenarios are inherently different. In fact, both peaks can co-exist when NNs are applied to noisy regression tasks. The relative size of the peaks is then governed by the degree of nonlinearity of the activation function. Building on recent developments in the analysis of random feature models, we provide a theoretical ground for this sample-wise triple descent. As shown previously, the nonlinear peak at N = P is a true divergence caused by the extreme sensitivity of the output function to both the noise corrupting the labels and the initialization of the random features (or the weights in NNs). This peak survives in the absence of noise, but can be suppressed by regularization. In contrast, the linear peak at N = D is solely due to overfitting the noise in the labels, and forms earlier during training. We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization. Throughout the paper, we compare analytical results obtained in the random feature model with the outcomes of numerical experiments involving deep NNs.
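As a hedged, toy-scale sketch of the kind of random feature regression experiment described above (not the paper's exact setup), the code below fits a ReLU random feature model to a noisy linear teacher with the minimum-norm least-squares solution and prints the test error as the number of training samples N sweeps past the input dimension D and the number of features P. The dimensions, noise level, and teacher are assumptions chosen to keep the example fast.

```python
# Minimal toy sketch of sample-wise descent curves in a ReLU random feature model:
# test error vs. number of training samples N, with input dimension D and P random features.
# The setup (teacher, noise level, sizes) is illustrative only, not the paper's experiment.
import numpy as np

rng = np.random.default_rng(0)
D, P, noise = 20, 100, 0.5

W = rng.normal(size=(D, P)) / np.sqrt(D)          # fixed random first-layer weights
teacher = rng.normal(size=D) / np.sqrt(D)         # linear teacher
features = lambda X: np.maximum(X @ W, 0.0)       # ReLU random features

def test_error(n_train, n_test=2000):
    Xtr = rng.normal(size=(n_train, D))
    ytr = Xtr @ teacher + noise * rng.normal(size=n_train)
    Xte = rng.normal(size=(n_test, D))
    yte = Xte @ teacher
    a, *_ = np.linalg.lstsq(features(Xtr), ytr, rcond=None)   # min-norm least-squares fit
    return np.mean((features(Xte) @ a - yte) ** 2)

for n in [10, D, 50, P, 200, 1000]:               # sweep N past N = D and N = P
    print(f"N = {n:4d}  test MSE = {test_error(n):.3f}")
```

A single small run is noisy; averaging the test error over several random seeds (and over a finer grid of N) makes the behavior around N = D and N = P easier to see.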


2021 ◽  
Author(s):  
Eeshan Modak ◽  
Himanshu Asnani ◽  
Vinod M. Prabhakaran
