Application and Need-Based Architecture Design of Deep Neural Networks

Author(s):  
Soniya ◽  
Sandeep Paul ◽  
Lotika Singh

This paper applies a hybrid evolutionary approach to convolutional neural networks (CNNs), determining the number of layers and filters based on the application and user needs. It integrates a compact genetic algorithm with stochastic gradient descent (SGD) to simultaneously evolve the structure and parameters of the CNN, and defines an effective string representation that combines the two. The compact genetic algorithm evolves the network structure by optimizing the number of convolutional layers and the number of filters in each convolutional layer, while an optimal set of weight parameters is obtained using SGD updates. The approach thus effectively combines exploration of the network space by the compact genetic algorithm with exploitation in the weight space by SGD. It also elegantly incorporates user-defined parameters into the cost function, which control the network structure, and hence its performance, according to the user's needs. The effectiveness of the proposed approach is demonstrated on four benchmark datasets: MNIST, COIL-100, CIFAR-10, and CIFAR-100. The results clearly demonstrate the potential of the proposed approach to evolve architectures suited to the nature of the application and the needs of the user.
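To make the evolutionary loop concrete, here is a minimal sketch (not the authors' code) of a compact genetic algorithm over a bitstring encoding of depth and filter counts; `train_and_score` is a hypothetical stand-in for SGD training plus a user-weighted structure penalty in the cost function, and the bit layout is illustrative only.

```python
import random

BITS = 8  # e.g., 2 bits for layer count, 6 bits for per-layer filter counts

def sample(prob):
    # Draw a candidate bitstring from the cGA probability vector.
    return [1 if random.random() < p else 0 for p in prob]

def train_and_score(bits, size_weight=0.01):
    # Hypothetical fitness: validation accuracy of the decoded CNN
    # (trained with SGD) minus a user-weighted penalty on network size.
    accuracy = random.random()  # placeholder for actual SGD training
    return accuracy - size_weight * sum(bits)

prob = [0.5] * BITS  # cGA probability vector
for _ in range(200):
    a, b = sample(prob), sample(prob)
    winner, loser = (a, b) if train_and_score(a) >= train_and_score(b) else (b, a)
    for i in range(BITS):  # shift probabilities toward the winner
        if winner[i] != loser[i]:
            prob[i] += (1 / 50) if winner[i] else -(1 / 50)
            prob[i] = min(max(prob[i], 0.0), 1.0)

print(sample(prob))  # a high-probability architecture encoding
```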

2019 ◽  
Vol 10 (1) ◽  
pp. 64
Author(s):  
Yi Lin ◽  
Honggang Zhang

In the era of Big Data, multi-instance learning, as a weakly supervised learning framework, has found a variety of applications because it helps reduce the cost of the data-labeling process. Due to this weakly supervised setting, learning effective instance representations/embeddings is challenging. To address this issue, we propose an instance-embedding regularizer that can boost the performance of both instance- and bag-embedding learning in a unified fashion. Specifically, the crux of the instance-embedding regularizer is to maximize the correlation between instance embeddings and the underlying instance-label similarities. The embedding-learning framework was implemented using a neural network and optimized in an end-to-end manner using stochastic gradient descent. In experiments on a variety of applications, the results show that the proposed instance-embedding regularization method is highly effective, achieving state-of-the-art performance.
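A hedged sketch of the core idea follows: penalize low correlation between pairwise instance-embedding similarities and pairwise label similarities. In multi-instance learning the true instance labels are unknown, so the `labels` argument here stands in for whatever proxy label-similarity signal is available; the paper's exact formulation may differ.

```python
import torch

def embedding_regularizer(emb, labels):
    # Cosine similarities between all pairs of instance embeddings.
    emb = torch.nn.functional.normalize(emb, dim=1)
    sim_emb = emb @ emb.t()
    # Binary label-similarity matrix from (proxy) instance labels.
    sim_lab = (labels[:, None] == labels[None, :]).float()
    # Pearson correlation between the two similarity matrices.
    x = sim_emb.flatten() - sim_emb.mean()
    y = sim_lab.flatten() - sim_lab.mean()
    corr = (x * y).sum() / (x.norm() * y.norm() + 1e-8)
    return -corr  # minimizing this term maximizes the correlation
```

Added to the task loss, this term can be minimized end-to-end with SGD alongside the bag-level objective.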


Entropy ◽  
2020 ◽  
Vol 22 (2) ◽  
pp. 213 ◽  
Author(s):  
Yiğit Uğur ◽  
George Arvanitakis ◽  
Abdellatif Zaidi

In this paper, we develop an unsupervised generative clustering framework that combines the variational information bottleneck and the Gaussian mixture model. Specifically, our approach uses the variational information bottleneck method and models the latent space as a mixture of Gaussians. We derive a bound on the cost function of our model that generalizes the Evidence Lower Bound (ELBO) and provide a variational-inference-type algorithm for computing it. In the algorithm, the coders' mappings are parametrized using neural networks, and the bound is approximated by Markov sampling and optimized with stochastic gradient descent. Numerical results on real datasets demonstrate the effectiveness of our method.
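The following is a minimal sketch of such a Monte Carlo bound, assuming a diagonal-Gaussian encoder, a Gaussian mixture prior with component means `mu_c`, log-variances `logvar_c`, and log-weights `log_pi`, and a Bernoulli decoder; it illustrates the structure of the objective, not the authors' exact derivation.

```python
import math
import torch

def gaussian_logpdf(z, mu, logvar):
    # Diagonal Gaussian log-density, summed over the last dimension.
    return -0.5 * (((z - mu) ** 2) / logvar.exp() + logvar
                   + math.log(2 * math.pi)).sum(-1)

def vib_gmm_bound(x, x_recon_logits, mu, logvar, mu_c, logvar_c, log_pi, beta=1.0):
    # One Markov (reparameterized) sample from the encoder q(z|x).
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
    log_qz = gaussian_logpdf(z, mu, logvar)
    # log p(z) under the Gaussian mixture prior: logsumexp over components.
    comp = gaussian_logpdf(z[:, None, :], mu_c[None], logvar_c[None]) + log_pi
    log_pz = torch.logsumexp(comp, dim=1)
    # Reconstruction term log p(x|z) for a Bernoulli decoder.
    recon = -torch.nn.functional.binary_cross_entropy_with_logits(
        x_recon_logits, x, reduction="none").sum(-1)
    return (recon - beta * (log_qz - log_pz)).mean()  # maximize with SGD
```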


Author(s):  
Beitong Zhou ◽  
Jun Liu ◽  
Weigao Sun ◽  
Ruijuan Chen ◽  
Claire Tomlin ◽  
...  

We propose a novel technique for improving the stochastic gradient descent (SGD) method to train deep networks, which we term pbSGD. The proposed pbSGD method simply raises the stochastic gradient to a certain power elementwise during iterations and introduces only one additional parameter, the power exponent (when it equals 1, pbSGD reduces to SGD). We further propose pbSGD with momentum, which we term pbSGDM. The main results of this paper are comprehensive experiments on popular deep learning models and benchmark datasets. Empirical results show that the proposed pbSGD and pbSGDM achieve faster initial training speed than adaptive gradient methods, generalization ability comparable to SGD, and improved robustness to hyper-parameter selection and vanishing gradients. pbSGD is essentially a gradient modifier via a nonlinear transformation; as such, it is orthogonal and complementary to other techniques for accelerating gradient-based optimization, such as learning rate schedules. Finally, we provide a convergence rate analysis for both pbSGD and pbSGDM; the theoretical rates of convergence match the best known rates for SGD and SGDM on nonconvex functions.
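A minimal sketch of the updates as described in the abstract: the gradient is raised to a power gamma elementwise with its sign preserved, and the momentum variant applies the usual heavy-ball update to the powered gradient. Variable names and hyper-parameter values are illustrative, not the paper's.

```python
import numpy as np

def pbsgd_step(w, grad, lr=0.01, gamma=0.7):
    # Sign-preserving elementwise power of the gradient; gamma = 1 is plain SGD.
    powered = np.sign(grad) * np.abs(grad) ** gamma
    return w - lr * powered

def pbsgdm_step(w, grad, velocity, lr=0.01, gamma=0.7, momentum=0.9):
    # pbSGDM: heavy-ball momentum applied to the powered gradient.
    powered = np.sign(grad) * np.abs(grad) ** gamma
    velocity = momentum * velocity + powered
    return w - lr * velocity, velocity
```

Because the transform only modifies the gradient before the step, it composes directly with learning rate schedules and other step-size techniques.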


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Gábor Danner ◽  
Árpád Berta ◽  
István Hegedűs ◽  
Márk Jelasity

Privacy and security are among the highest priorities in data mining approaches over data collected from mobile devices. Fully distributed machine learning is a promising direction in this context. However, it is a hard problem to design protocols that are efficient yet provide sufficient levels of privacy and security. In fully distributed environments, secure multiparty computation (MPC) is often applied to solve these problems. However, in our dynamic and unreliable application domain, known MPC algorithms are not scalable or robust enough. We propose a lightweight protocol to quickly and securely compute the sum query over a subset of participants, assuming a semi-honest adversary. During the computation the participants learn no individual values. We apply this protocol to efficiently calculate the sum of gradients as part of a fully distributed minibatch stochastic gradient descent algorithm. The protocol achieves scalability and robustness by exploiting the fact that in this application domain a “quick and dirty” sum computation is acceptable. We utilize the Paillier homomorphic cryptosystem as part of our solution, combined with extreme lossy gradient compression to make the cost of the cryptographic algorithms affordable. We demonstrate both theoretically and experimentally, based on churn statistics from a real smartphone trace, that the protocol is indeed practically viable.
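A simplified illustration (not the full protocol) of why Paillier fits secure sum: ciphertexts can be added, so an aggregator learns only the total, never individual gradient contributions. This sketch uses the python-paillier (`phe`) package; in the paper this primitive is combined with extreme lossy gradient compression to keep the cryptographic cost affordable.

```python
from phe import paillier

# Key generation (in the protocol, key material is managed among peers).
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Each participant encrypts its (compressed) gradient contribution.
contributions = [0.12, -0.40, 0.25]
ciphertexts = [public_key.encrypt(g) for g in contributions]

# Any party can sum the ciphertexts without learning individual values.
encrypted_sum = sum(ciphertexts[1:], ciphertexts[0])

# Only the private-key holder recovers the aggregate gradient sum.
print(private_key.decrypt(encrypted_sum))  # -> approximately -0.03
```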


2018 ◽  
Vol 4 (1) ◽  
pp. 3
Author(s):  
Rene Bidart ◽  
Alexander Wong

In this study, we explore the training of monolithic deep neural networks in an effective manner. One of the biggest challenges with training such networks to the desired level of accuracy is the difficulty in converging to a good solution using iterative optimization methods such as stochastic gradient descent, due to the enormous number of parameters that need to be learned. To achieve this, we introduce a partitioned training strategy, where proxy layers are connected to different partitions of a deep neural network to enable isolated training of a much smaller number of parameters to convergence. To illustrate the efficacy of this training strategy, we introduce MonolithNet, a massive residual deep neural network consisting of 437 million parameters. The trained MonolithNet was able to achieve a top-1 accuracy of 97% on the CIFAR10 image classification dataset, which demonstrates the feasibility of the proposed training strategy for training monolithic deep neural networks to high accuracies.
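A hedged sketch of the partitioned training idea: attach a small proxy head to one partition of a deep network so that partition's parameters can be trained to convergence in isolation. Layer sizes and the training loop are illustrative, not MonolithNet's.

```python
import torch
import torch.nn as nn

partition = nn.Sequential(   # one partition of the deep network
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
)
proxy_head = nn.Sequential(  # temporary proxy layers for isolated training
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
)
opt = torch.optim.SGD(
    list(partition.parameters()) + list(proxy_head.parameters()), lr=0.1)

def train_partition_step(x, y):
    # Only this partition's (and the proxy head's) parameters are updated.
    opt.zero_grad()
    loss = nn.functional.cross_entropy(proxy_head(partition(x)), y)
    loss.backward()
    opt.step()
    return loss.item()

# After convergence, the proxy head is discarded, this partition is
# frozen, and the next partition is trained the same way.
```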


2022 ◽  
Vol 17 ◽  
Author(s):  
Xinyi Liao ◽  
Xiaomei Gu ◽  
Dejun Peng

Background: Many malaria infections are caused by Plasmodium falciparum. Accurate classification of the proteins secreted by the malaria parasite is essential for the development of anti-malarial drugs. Objective: To accurately classify the proteins secreted by the malaria parasite. Methods: To improve the accuracy of predicting Plasmodium secreted proteins, we established a classification model, MGAP-SGD. MonodikGap features (k=7) of the secreted proteins were extracted, and the optimal features were then selected with the AdaBoost method. Finally, based on the optimal feature set, the model predicts secreted proteins using the stochastic gradient descent (SGD) algorithm. Results: We validated the model with a 10-fold cross-validation set and an independent test set using the SGD classifier, obtaining accuracies of 98.5859% and 97.973%, respectively. Conclusion: This demonstrates that the effectiveness and robustness of the MGAP-SGD model can meet the needs of predicting Plasmodium secreted proteins.
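A hedged sketch of such a pipeline, assuming scikit-learn: AdaBoost-driven feature selection over gap-feature vectors, followed by an SGD classifier evaluated with 10-fold cross-validation. `X` and `y` are placeholders for the extracted k=7 gap features and the secreted/non-secreted labels; this is an illustration, not the authors' released code.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 400))    # placeholder gap-feature vectors
y = rng.integers(0, 2, 200)   # placeholder secreted-protein labels

model = make_pipeline(
    SelectFromModel(AdaBoostClassifier(n_estimators=100)),  # feature selection
    SGDClassifier(loss="log_loss", max_iter=1000),          # SGD classifier
)
scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
print(scores.mean())
```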


2022 ◽  
Vol 40 (4) ◽  
pp. 1-32
Author(s):  
Jinze Wang ◽  
Yongli Ren ◽  
Jie Li ◽  
Ke Deng

Factorization models have been successfully applied to recommendation problems and have had a significant impact on both academia and industry in the field of Collaborative Filtering (CF). However, the intermediate data generated in a factorization model's decision-making process (or training process, its footprint) has been overlooked, even though it may provide rich information to further improve recommendations. In this article, we introduce the concept of the Convergence Pattern, which records how ratings are learned step-by-step in factorization models for CF. We show that the Convergence Pattern exists from both the model perspective (e.g., classical Matrix Factorization (MF) and deep-learning factorization) and the training (learning) perspective (e.g., stochastic gradient descent (SGD), alternating least squares (ALS), and Markov Chain Monte Carlo (MCMC)). By utilizing the Convergence Pattern, we propose a prediction model to estimate the reliability of predicted missing ratings and then improve the quality of recommendations. Two applications have been investigated: (1) how to evaluate the reliability of predicted missing ratings and thus recommend those with high reliability; (2) how to use the estimated reliability to adjust the predicted ratings and further improve prediction accuracy. Extensive experiments have been conducted on several benchmark datasets across three recommendation tasks: decision-aware recommendation, rating prediction, and top-N recommendation. The experiment results verify the effectiveness of the proposed methods in various aspects.
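A hedged sketch of recording such a footprint: during SGD training of matrix factorization, log how each observed rating's prediction evolves epoch by epoch; these per-rating trajectories can then feed a downstream reliability model. Dimensions, hyper-parameters, and the synthetic data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 50, 40, 8
ratings = [(rng.integers(n_users), rng.integers(n_items), rng.uniform(1, 5))
           for _ in range(500)]
P = rng.normal(0, 0.1, (n_users, k))   # user factors
Q = rng.normal(0, 0.1, (n_items, k))   # item factors

trajectory = {}  # (user, item) -> list of predictions, one per epoch
for epoch in range(20):
    for u, i, r in ratings:            # one SGD pass with L2 regularization
        err = r - P[u] @ Q[i]
        p_old = P[u].copy()
        P[u] += 0.01 * (err * Q[i] - 0.02 * P[u])
        Q[i] += 0.01 * (err * p_old - 0.02 * Q[i])
    for u, i, _ in ratings:            # record this epoch's footprint
        trajectory.setdefault((u, i), []).append(P[u] @ Q[i])
```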

