Assessing Hyper Parameter Optimization and Speedup for Convolutional Neural Networks

Author(s):  
Sajid Nazir ◽  
Shushma Patel ◽  
Dilip Patel

The increased processing power of graphical processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results for complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture, and they open up further opportunities for image classification. Advances in CNNs enable the development of training models using large labelled image datasets, but the hyper parameters need to be specified, which is challenging and complex because there are so many of them. A substantial amount of computational power and processing time is required to determine the optimal hyper parameters to define a model yielding good results. This article provides a survey of the hyper parameter search and optimization methods for CNN architectures.
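
As a rough illustration of what a hyper parameter search involves, the following is a minimal sketch of random search, one of the simplest search strategies. It is not taken from the article: the search space, the parameter names, and `toy_validation_loss` (a cheap stand-in for actually training a CNN and measuring its validation loss) are all assumptions made for the example.

```python
import math
import random

# Hypothetical search space; the names and ranges are illustrative only.
SEARCH_SPACE = {
    "learning_rate": (1e-5, 1e-1),      # sampled log-uniformly
    "batch_size": [16, 32, 64, 128],
    "num_filters": [16, 32, 64],
}


def sample_config(rng):
    """Draw one random configuration from SEARCH_SPACE."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
        "batch_size": rng.choice(SEARCH_SPACE["batch_size"]),
        "num_filters": rng.choice(SEARCH_SPACE["num_filters"]),
    }


def toy_validation_loss(config):
    """Stand-in objective (assumption): a real search would train a CNN here."""
    lr_term = (math.log10(config["learning_rate"]) + 3.0) ** 2
    size_term = abs(config["num_filters"] - 32) / 32.0
    return lr_term + size_term


def random_search(num_trials, seed=0):
    """Evaluate num_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, math.inf
    for _ in range(num_trials):
        config = sample_config(rng)
        loss = toy_validation_loss(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    config, loss = random_search(num_trials=50)
    print(config, loss)
```

In a real setting each trial costs a full training run, which is why the computational budget mentioned in the abstract dominates the choice of search method.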

2019 ◽  
Vol 35 (17) ◽  
pp. 3208-3210 ◽  
Author(s):  
Yangzhen Wang ◽  
Feng Su ◽  
Shanshan Wang ◽  
Chaojuan Yang ◽  
Yonglu Tian ◽  
...  

Motivation: Functional imaging at single-neuron resolution offers a highly efficient tool for studying functional connectomics in the brain. However, mainstream neuron-detection methods focus on either the morphologies or the activities of neurons, which may lead to the extraction of incomplete information and may rely heavily on the experience of the experimenters. Results: We developed ImageCN, a toolbox based on convolutional neural networks and a fluctuation method, to increase the processing power for calcium imaging data. To evaluate the performance of ImageCN, nine different imaging datasets were recorded from awake mouse brains. ImageCN demonstrated superior neuron-detection performance when compared with other algorithms. Furthermore, ImageCN does not require sophisticated training for users. Availability and implementation: ImageCN is implemented in MATLAB. The source code and documentation are available at https://github.com/ZhangChenLab/ImageCN. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Mohammed Abdulla Salim Al Husaini ◽  
Mohamed Hadi Habaebi ◽  
Teddy Surya Gunawan ◽  
Md Rafiqul Islam ◽  
Elfatih A. A. Elsheikh ◽  
...  

Breast cancer is one of the most significant causes of death for women around the world. Breast thermography supported by deep convolutional neural networks is expected to contribute significantly to early detection and facilitate treatment at an early stage. The goal of this study is to investigate the behavior of different recent deep learning methods for identifying breast disorders. To evaluate our proposal, we built classifiers based on the deep convolutional neural network models Inception V3, Inception V4, and a modified version of the latter called Inception MV4. MV4 was introduced to maintain the computational cost across all layers by making the resultant number of features and the number of pixel positions equal. The DMR database was used with these deep learning models to classify thermal images of healthy and sick patients. Training runs of 3–30 epochs were used in conjunction with learning rates of 1 × 10⁻³, 1 × 10⁻⁴ and 1 × 10⁻⁵, a mini-batch size of 10, and different optimization methods. The training results showed that Inception V4 and MV4 with color images, a learning rate of 1 × 10⁻⁴, and the SGDM optimization method reached very high accuracy, verified through several experimental repetitions. With grayscale images, Inception V3 outperforms V4 and MV4 by a considerable accuracy margin for every optimization method. In fact, the performance of Inception V3 (grayscale) is almost comparable to that of Inception V4 and MV4 (color), but only after 20–30 epochs. Inception MV4 achieved a 7% faster classification response time than V4. The MV4 model was found to reduce energy consumption and to make arithmetic operations on the graphics processor run more smoothly. The results also indicate that increasing the number of layers may not necessarily improve performance.


2020 ◽  
Author(s):  
Bian Li ◽  
Yucheng T. Yang ◽  
John A. Capra ◽  
Mark B. Gerstein

Predicting mutation-induced changes in protein thermodynamic stability (∆∆G) is of great interest in protein engineering, variant interpretation, and understanding protein biophysics. We introduce ThermoNet, a deep, 3D-convolutional neural network designed for structure-based prediction of ∆∆Gs upon point mutation. To leverage the image-processing power inherent in convolutional neural networks, we treat protein structures as if they were multi-channel 3D images. In particular, the inputs to ThermoNet are uniformly constructed as multi-channel voxel grids based on biophysical properties derived from raw atom coordinates. We train and evaluate ThermoNet with a curated data set that accounts for protein homology and is balanced with direct and reverse mutations; this provides a framework for addressing biases that have likely influenced many previous ∆∆G prediction methods. ThermoNet demonstrates performance comparable to the best available methods on the widely used Ssym test set. However, ThermoNet accurately predicts the effects of both stabilizing and destabilizing mutations, while most other methods exhibit a strong bias towards predicting destabilization. We further show that homology between Ssym and widely used training sets like S2648 and VariBench has likely led to overestimated performance in previous studies. Finally, we demonstrate the practical utility of ThermoNet in predicting the ∆∆Gs for two clinically relevant proteins, p53 and myoglobin, and for pathogenic and benign missense variants from ClinVar. Overall, our results suggest that 3D convolutional neural networks can model the complex, non-linear interactions perturbed by mutations, directly from biophysical properties of atoms.

Author Summary: The thermodynamic stability of a protein, usually represented as the Gibbs free energy for the biophysical process of protein folding (∆G), is a fundamental thermodynamic quantity.
Predicting mutation-induced changes in protein thermodynamic stability (∆∆G) is of great interest in protein engineering, variant interpretation, and understanding protein biophysics. However, predicting ∆∆Gs in an accurate and unbiased manner has been a long-standing challenge in the field of computational biology. In this work, we introduce ThermoNet, a deep, 3D-convolutional neural network designed for structure-based ∆∆G prediction. To leverage the image-processing power inherent in convolutional neural networks, we treat protein structures as if they were multi-channel 3D images. ThermoNet demonstrates performance comparable to the best available methods. However, ThermoNet accurately predicts the effects of both stabilizing and destabilizing mutations, while most other methods exhibit a strong bias towards predicting destabilization. We also demonstrate that the presence of homologous proteins in commonly used training and testing sets for ∆∆G prediction methods has likely influenced previous performance estimates. Finally, we highlight the practical utility of ThermoNet by applying it to predicting the ∆∆Gs for two clinically relevant proteins, p53 and myoglobin, and for pathogenic and benign missense variants from ClinVar.
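
To make the idea of treating a structure as a multi-channel 3D image more concrete, here is a minimal sketch of voxelizing atom coordinates into a grid. It is an illustration under stated assumptions, not ThermoNet's actual featurization: the abstract says the channels are biophysical properties, whereas this sketch uses the element symbol as a stand-in channel, and `voxelize`, the grid size, and the resolution are hypothetical choices.

```python
import math


def voxelize(atoms, center, grid_size=16, resolution=1.0, channels=("C", "N", "O")):
    """Map atoms onto a binary occupancy grid indexed as grid[channel][x][y][z].

    atoms: iterable of (element, (x, y, z)) tuples, coordinates in angstroms.
    center: (x, y, z) point placed at the middle of the cubic grid.
    Channels here are element symbols, an assumption made for illustration.
    """
    half = grid_size * resolution / 2.0
    grid = [
        [[[0.0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
        for _ in channels
    ]
    for element, coords in atoms:
        if element not in channels:
            continue  # atom type not represented by any channel
        index = [
            int(math.floor((coord - c + half) / resolution))
            for coord, c in zip(coords, center)
        ]
        if all(0 <= i < grid_size for i in index):
            channel = channels.index(element)
            grid[channel][index[0]][index[1]][index[2]] = 1.0
    return grid


if __name__ == "__main__":
    atoms = [("C", (0.0, 0.0, 0.0)), ("O", (1.5, 0.0, 0.0)), ("N", (100.0, 0.0, 0.0))]
    grid = voxelize(atoms, center=(0.0, 0.0, 0.0), grid_size=4)
    print(grid[0][2][2][2], grid[2][3][2][2])
```

A grid of this shape can be fed to a 3D convolution in the same way a multi-channel image is fed to a 2D convolution; atoms outside the box are simply dropped.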


Author(s):  
I. Yu. Sesin ◽  
R. G. Bolbakov

General-purpose computing on graphics processing units (GPGPU) is a powerful technology for offloading parallel data processing tasks to graphics processing units (GPUs). It is used in a variety of domains, from science and commerce to hobbyist projects. GPU-run general-purpose programs will inevitably run into performance issues stemming from code branch predication. Code predication is a GPU feature that makes both conditional branches execute, masking the results of the branch that should not have been taken. This leads to considerable performance losses for GPU programs that have large amounts of code hidden away behind conditional operators. This paper analyzes existing approaches to improving software performance with a view to relieving this performance loss. Each approach is described along with its upsides, its downsides, the extent of its applicability, and whether it addresses the outlined problem. The approaches covered are: optimizing compilers, JIT compilation, branch prediction, speculative execution, adaptive optimization, run-time algorithm specialization, and profile-guided optimization. It is shown that these methods mostly cater to CPU-specific issues and are generally not applicable as far as the performance loss from branch predication is concerned. Lastly, we outline the need for a separate performance-improving approach that addresses the specifics of branch predication and the GPGPU workflow.
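
The cost of predication can be illustrated with a small CPU-side simulation. This is only a sketch in plain Python, not GPU code, and the function names and the "work" counter are assumptions made for the example: it models a group of lanes that all evaluate both sides of a conditional and then keep one result per lane through a mask.

```python
def predicated_select(values):
    """Simulate predication: every lane runs both branches, a mask picks one."""
    mask = [v >= 0 for v in values]
    then_results = [v * v for v in values]   # "then" side, computed for every lane
    else_results = [-v for v in values]      # "else" side, computed for every lane
    out = [t if m else e for m, t, e in zip(mask, then_results, else_results)]
    work = len(then_results) + len(else_results)  # branch bodies evaluated
    return out, work


def branching_reference(values):
    """Scalar reference: each element runs only the branch it actually needs."""
    out, work = [], 0
    for v in values:
        out.append(v * v if v >= 0 else -v)
        work += 1
    return out, work


if __name__ == "__main__":
    data = [3, -2, 0, -7]
    print(predicated_select(data))
    print(branching_reference(data))
```

Both functions return the same values, but the predicated version evaluates twice as many branch bodies, and the gap grows with the amount of code behind each condition. Real hardware is more nuanced: a group of threads that all agree on the condition can typically skip the side that nobody takes.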


Author(s):  
Oleksii Gorokhovatskyi ◽  
Olena Peredrii

This paper describes the results of an investigation into using shallow convolutional neural networks (CNNs), limited to only a few layers, to solve the video-based gender classification problem. Different architectures of shallow CNN are proposed, trained and tested using balanced and unbalanced static image datasets. The influence of various methods of voting over confidences, applied to frame-by-frame gender classification of the video stream, is investigated as a possible way to enhance classification accuracy. The possibility of grouping shallow networks into ensembles is also investigated; it is shown that accuracy may be improved further by voting over the classification results of the separate shallow CNNs inside an ensemble, for a single frame or across different frames.
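
Two common ways of voting over per-frame confidences can be sketched as follows. These are generic schemes chosen for illustration, not necessarily the ones evaluated in the paper, and the function names and confidence values are hypothetical.

```python
from collections import Counter


def majority_vote(frame_confidences):
    """Hard voting: each frame votes for its most confident label."""
    labels = [max(conf, key=conf.get) for conf in frame_confidences]
    return Counter(labels).most_common(1)[0][0]


def mean_confidence_vote(frame_confidences):
    """Soft voting: sum the confidences per label and pick the largest total."""
    totals = {}
    for conf in frame_confidences:
        for label, p in conf.items():
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Made-up confidences for three frames of one video.
    frames = [
        {"female": 0.55, "male": 0.45},
        {"female": 0.55, "male": 0.45},
        {"female": 0.05, "male": 0.95},
    ]
    print(majority_vote(frames), mean_confidence_vote(frames))
```

The example is built so that the two schemes disagree: two weakly confident frames win the hard vote, while one strongly confident frame wins the soft vote. The same functions apply to an ensemble if the list holds the outputs of several networks for one frame instead of one network over several frames.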


10.29007/cd8h ◽  
2020 ◽  
Author(s):  
Ramin Sharifi ◽  
Pouya Shiri ◽  
Amirali Baniasadi

Capsule networks (CapsNet) are the next generation of neural networks. CapsNet can be used to classify data of different types. Today’s General Purpose Graphical Processing Units (GPGPUs) are more capable than before and let us train these complex networks. However, time and energy consumption remain a challenge. In this work, we investigate whether skipping trivial operations, i.e., multiplications by zero, in CapsNet can save energy. We base our analysis on the number of multiplications by zero detected while training CapsNet on the MNIST and Fashion-MNIST datasets.
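
The kind of counting described here can be sketched on a single matrix-vector product. This is a simplified illustration rather than the authors' instrumentation: `matvec_skip_zeros` is a hypothetical helper, and a real measurement would hook into the tensor operations of the training framework.

```python
def matvec_skip_zeros(matrix, vector):
    """Matrix-vector product that skips, and counts, multiplications by zero."""
    result = []
    performed = skipped = 0
    for row in matrix:
        acc = 0.0
        for w, x in zip(row, vector):
            if w == 0 or x == 0:
                skipped += 1      # trivial multiply: the product would be zero
                continue
            acc += w * x
            performed += 1
        result.append(acc)
    return result, performed, skipped


if __name__ == "__main__":
    weights = [[1, 0, 2], [0, 0, 3]]
    activations = [4, 5, 0]
    print(matvec_skip_zeros(weights, activations))
```

The ratio of skipped to total multiplications gives an upper bound on the arithmetic that could be avoided; whether that translates into energy savings depends on the hardware cost of detecting the zeros.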


Author(s):  
Li’an Zhuo ◽  
Baochang Zhang ◽  
Chen Chen ◽  
Qixiang Ye ◽  
Jianzhuang Liu ◽  
...  

In stochastic gradient descent (SGD) and its variants, the optimized gradient estimators may be as expensive to compute as the true gradient in many scenarios. This paper introduces a calibrated stochastic gradient descent (CSGD) algorithm for deep neural network optimization. A theorem is developed to prove that an unbiased estimator for the network variables can be obtained in a probabilistic way based on the Lipschitz hypothesis. Our work is significantly distinct from existing gradient optimization methods in that it provides a theoretical framework for unbiased variable estimation in the deep learning paradigm to optimize the model parameter calculation. In particular, we develop a generic gradient calibration layer which can be easily used to build convolutional neural networks (CNNs). Experimental results demonstrate that CNNs with our CSGD optimization scheme can improve on state-of-the-art performance for natural image classification, digit recognition, ImageNet object classification, and object detection tasks. This work opens new research directions for developing more efficient SGD updates and analyzing the backpropagation algorithm.


Electronics ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 997 ◽  
Author(s):  
Lin ◽  
Lin ◽  
Sun ◽  
Wang

Various optimization methods and network architectures are used by convolutional neural networks (CNNs). Each optimization method and network architecture style has its own advantages and representation abilities. To make the most of these advantages, evolutionary-fuzzy-integral-based convolutional neural networks (EFI-CNNs) are proposed in this paper. The proposed EFI-CNNs were verified by way of face classification of age and gender. The trained CNNs’ outputs were set as inputs of a fuzzy integral. The classification results were combined using either Sugeno or Choquet output rules. Conventionally, the fuzzy density values of the fuzzy integral are decided by heuristic experiments. In this paper, particle swarm optimization (PSO) was used to adaptively find optimal fuzzy density values. To combine the advantages of each CNN type, the evaluation of each CNN type in EFI-CNNs is necessary. Three CNN structures, AlexNet, very deep convolutional neural network (VGG16), and GoogLeNet, and three databases, computational intelligence application laboratory (CIA), Morph, and cross-age celebrity dataset (CACD2000), were used in experiments to classify age and gender. The experimental results show that the proposed method achieved 5.95% and 3.1% higher accuracy, respectively, in classifying age and gender.
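
For readers unfamiliar with fuzzy integrals, the sketch below fuses the scores of several classifiers using the textbook Sugeno and Choquet integrals over a Sugeno λ-measure. It follows the standard definitions and is not claimed to match the paper's exact formulation; the function names, the bisection solver, and the example scores and densities are assumptions, and the PSO step that tunes the densities is not shown.

```python
from typing import Sequence, Tuple


def solve_lambda(densities: Sequence[float], iterations: int = 200) -> float:
    """Solve prod(1 + lam * g_i) = 1 + lam (lam > -1) by bisection."""
    if len(densities) < 2 or not all(0.0 < g < 1.0 for g in densities):
        raise ValueError("need at least two densities, each strictly in (0, 1)")
    total = sum(densities)
    if abs(total - 1.0) < 1e-9:
        return 0.0  # densities already form an additive measure

    def f(lam: float) -> float:
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total < 1.0:
        # Root is positive: f < 0 just above 0 and f > 0 for large lam.
        lo, hi = 0.0, 1.0
        while f(hi) <= 0.0:
            hi *= 2.0
        for _ in range(iterations):
            mid = (lo + hi) / 2.0
            if f(mid) < 0.0:
                lo = mid
            else:
                hi = mid
    else:
        # Root lies in (-1, 0): f > 0 near -1 and f < 0 just below 0.
        lo, hi = -1.0, 0.0
        for _ in range(iterations):
            mid = (lo + hi) / 2.0
            if f(mid) > 0.0:
                lo = mid
            else:
                hi = mid
    return (lo + hi) / 2.0


def fuse(scores: Sequence[float], densities: Sequence[float]) -> Tuple[float, float]:
    """Return (Sugeno integral, Choquet integral) of the classifier scores."""
    lam = solve_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    sugeno, choquet, measure = 0.0, 0.0, 0.0
    for rank, i in enumerate(order):
        g = densities[i]
        # Measure of the set holding the (rank + 1) highest-scoring classifiers.
        measure = g + measure + lam * g * measure
        next_score = scores[order[rank + 1]] if rank + 1 < len(order) else 0.0
        sugeno = max(sugeno, min(scores[i], measure))
        choquet += (scores[i] - next_score) * measure
    return sugeno, choquet


if __name__ == "__main__":
    # Made-up scores of three CNNs for one class, with made-up densities.
    print(fuse(scores=[0.9, 0.6, 0.3], densities=[0.3, 0.4, 0.1]))
```

The densities express how much each classifier is trusted on its own. When they sum to one, λ is zero and the Choquet integral reduces to a weighted average of the scores.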

