An Overview of Deep Learning Optimization Methods and Learning Rate Attenuation Methods

2018 ◽  
Vol 08 (04) ◽  
pp. 186-200 ◽  
Author(s):  
宇旭 冯
2018 ◽  
Vol 6 (4) ◽  
pp. 440-447
Author(s):  
Amita Khatana ◽  
V.K Narang ◽  
...  

Author(s):  
Mohammad Shorfuzzaman ◽  
M. Shamim Hossain ◽  
Abdulmotaleb El Saddik

Diabetic retinopathy (DR) is one of the most common causes of vision loss in people who have had diabetes for a prolonged period. Convolutional neural networks (CNNs) have become increasingly popular for computer-aided DR diagnosis using retinal fundus images. While these CNNs are highly reliable, their lack of sufficient explainability prevents them from being widely used in medical practice. In this article, we propose a novel explainable deep learning ensemble model where weights from different models are fused into a single model to extract salient features from various retinal lesions found on fundus images. The extracted features are then fed to a custom classifier for the final diagnosis of DR severity level. The model is trained on the APTOS dataset, which contains retinal fundus images of various DR grades, using a cyclical learning rate strategy with an automatic learning rate finder for decaying the learning rate to improve model accuracy. We develop an explainability approach by leveraging gradient-weighted class activation mapping (Grad-CAM) and Shapley additive explanations (SHAP) to highlight the areas of fundus images that are most indicative of different DR stages. This allows ophthalmologists to view our model's decision in a way that they can understand. Evaluation results using three different datasets (APTOS, MESSIDOR, IDRiD) show the effectiveness of our model, achieving superior classification rates with a high degree of precision (0.970), sensitivity (0.980), and AUC (0.978). We believe that the proposed model, which jointly offers state-of-the-art diagnosis performance and explainability, will address the black-box nature of deep CNN models in robust detection of DR grading.
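The abstract above does not give the details of its cyclical learning rate strategy. As a rough illustration only, the following sketch implements the commonly used triangular cyclical schedule; the function name `triangular_clr` and the values of `base_lr`, `max_lr`, and `step_size` are assumptions chosen for the example, not settings from the paper.

```python
def triangular_clr(step, base_lr=1e-4, max_lr=1e-2, step_size=100):
    """Triangular cyclical learning rate (hypothetical settings).

    The rate climbs linearly from base_lr to max_lr over step_size steps,
    then falls back to base_lr over the next step_size steps, and repeats.
    """
    cycle = step // (2 * step_size)            # index of the current cycle
    x = abs(step / step_size - 2 * cycle - 1)  # goes 1 -> 0 -> 1 within a cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


if __name__ == "__main__":
    for s in (0, 50, 100, 150, 200):
        print(s, triangular_clr(s))
```

In a training loop the function would be called once per optimizer step and its return value assigned as the current learning rate.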


Author(s):  
Shelvi Nur Rahmawati ◽  
Eka Wahyu Hidayat ◽  
Husni Mubarok

The Sundanese script is one of Indonesia's regional scripts, belonging in particular to the Sundanese people. With today's technological development, regional languages are being steadily eroded over time. The Sundanese script is also beginning to be forgotten: it is rarely used by Sundanese people in daily life, and understanding of their own regional language is lacking. Regional languages therefore need to be preserved in ways that keep pace with the times so that they remain known and maintained, one of which is the identification of Sundanese script using the Convolutional Neural Network (CNN) method. A CNN is a part of deep learning that is usually used for processing image data. This study used the ADAM optimizer with 20, 50, 100, and 500 epochs. Using 500 epochs with a learning rate of 0.1 gave the highest value, with an accuracy of 98.03%. Based on the training results with 100 epochs and a learning rate of 0.001, the accuracy was 96.71% on the training data and 92.02% on the testing data.


2018 ◽  
Vol 18 (01) ◽  
pp. 22-27 ◽  
Author(s):  
Royani Darma Nurfita ◽  
Gunawan Ariyanto

Fingerprint recognition systems have been widely used in biometrics for various purposes in recent years. Fingerprint recognition is used because fingerprints have intricate patterns that can identify a person and serve as each person's identity. Fingerprints are also widely used for both verification and identification. The problem addressed in this study is that computers have difficulty classifying objects, fingerprints among them. In this study, the authors use deep learning with the Convolutional Neural Network (CNN) method to address this problem. The CNN is used to carry out the machine learning process on the computer. The stages in the CNN are data input, preprocessing, and training. The CNN is implemented with the TensorFlow library in the Python programming language. The dataset comes from the website of a 2004 fingerprint verification competition, which used an optical sensor, the "V300" by CrossMatch, and contains 80 fingerprint images. Training used data of 24x24 pixels, and testing compared the number of epochs and the learning rate, showing that the larger the number of epochs and the smaller the learning rate, the better the training accuracy obtained. The training accuracy achieved in this study was 100%.


2021 ◽  
Author(s):  
Ryan Santoso ◽  
Xupeng He ◽  
Marwa Alsinan ◽  
Hyung Kwak ◽  
Hussein Hoteit

Automatic fracture recognition from borehole images or outcrops is applicable to the construction of fractured reservoir models. Deep learning for fracture recognition is subject to uncertainty due to sparse and imbalanced training sets and random initialization. We present a new workflow to optimize a deep learning model under uncertainty using U-Net. We consider both epistemic and aleatoric uncertainty of the model. We propose a U-Net architecture with a dropout layer inserted after every "weighting" layer. We vary the dropout probability to investigate its impact on the uncertainty response. We build the training set and assign a uniform distribution to each training parameter, such as the number of epochs, batch size, and learning rate. We then perform uncertainty quantification by running the model multiple times for each realization, where we capture the aleatoric response. In this approach, which is based on Monte Carlo Dropout, the variance map and F1-scores are used to decide whether to craft additional augmentations or stop the process. This work demonstrates the existence of uncertainty within deep learning caused by sparse and imbalanced training sets. This issue leads to unstable predictions. The overall responses are accommodated in the form of aleatoric uncertainty. Our workflow uses the uncertainty response (variance map) as a measure to craft additional augmentations in the training set. High variance in certain features denotes the need to add new augmented images containing those features, either through affine transformation (rotation, translation, and scaling) or by using similar images. The augmentation improves the accuracy of the prediction, reduces the prediction variance, and stabilizes the output. Architecture, number of epochs, batch size, and learning rate are optimized under a fixed-uncertain training set. We perform the optimization by searching for the global maximum of accuracy after running multiple realizations.
Besides the quality of the training set, the learning rate is the heavy-hitter in the optimization process. The selected learning rate controls the diffusion of information in the model. Under the imbalanced condition, fast learning rates cause the model to miss the main features. The other challenge in fracture recognition on a real outcrop is to optimally pick the parental images to generate the initial training set. We suggest picking images from multiple sides of the outcrop, which shows significant variations of the features. This technique is needed to avoid long iteration within the workflow. We introduce a new approach to address the uncertainties associated with the training process and with the physical problem. The proposed approach is general in concept and can be applied to various deep-learning problems in geoscience.
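The Monte Carlo Dropout idea described above (run the same input through the model many times with dropout left active, then inspect the spread of the outputs) can be sketched on a toy model. The example below is not the authors' U-Net workflow: it uses a hypothetical one-layer linear model, and the names `stochastic_forward` and `mc_dropout_predict` and all parameter values are assumptions made for illustration. In the paper's setting the variance would be computed per pixel to form a variance map.

```python
import random
import statistics


def stochastic_forward(x, weights, p_drop, rng):
    """One forward pass of a toy linear model with dropout kept active."""
    keep = 1.0 - p_drop
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep:       # this input unit survives dropout
            total += wi * xi / keep   # inverted-dropout rescaling
    return total


def mc_dropout_predict(x, weights, p_drop=0.5, n_passes=200, seed=0):
    """Return (mean, variance) of the output over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, weights, p_drop, rng)
               for _ in range(n_passes)]
    return statistics.mean(samples), statistics.pvariance(samples)


if __name__ == "__main__":
    mean, var = mc_dropout_predict([1.0, 2.0, 3.0], [0.5, -0.2, 0.1])
    print("mean prediction:", mean, "variance:", var)
```

A high variance for a given input would, following the workflow described above, flag that kind of input as a candidate for additional augmentation.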


2020 ◽  
Vol 10 (20) ◽  
pp. 7301
Author(s):  
Daniel Octavian Melinte ◽  
Ana-Maria Travediu ◽  
Dan N. Dumitriu

This paper presents extensive research carried out to enhance the performance of convolutional neural network (CNN) object detectors applied to municipal waste identification. In order to obtain an accurate and fast CNN architecture, several types of Single Shot Detectors (SSD) and Region Proposal Networks (RPN) have been fine-tuned on the TrashNet database. The network with the best performance is executed on an autonomous robot system, which is able to collect detected waste from the ground based on the CNN feedback. For this type of application, precise identification of municipal waste objects is very important. In order to develop a straightforward pipeline for waste detection, the paper focuses on boosting the performance of pre-trained CNN object detectors, in terms of precision, generalization, and detection speed, using different loss optimization methods, database augmentation, and asynchronous threading at inference time. The pipeline consists of data augmentation at training time, followed by CNN feature extraction and box predictor modules for localization and classification at different feature map sizes. The trained model is generated for inference afterwards. The experiments revealed better performance than all other object detectors trained on TrashNet or other garbage datasets, with an accuracy of 97.63% for SSD and 95.76% for Faster R-CNN. In order to find the optimal upper and lower bounds of the learning rate within which the network is actually learning, we trained our model for several epochs, updating the learning rate after each epoch, starting from 1 × 10^-10 and increasing it until reaching 1 × 10^-1.
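The learning rate sweep between 1 × 10^-10 and 1 × 10^-1 mentioned above can be sketched as a geometric schedule with one value per epoch. The abstract does not say how the rate was spaced or how the bounds were read off, so the geometric spacing, the number of epochs, and the helper `steepest_descent_lr` (a simple heuristic that returns the rate at which the recorded loss fell the most) are all assumptions for illustration.

```python
def lr_sweep(lr_start=1e-10, lr_end=1e-1, n_epochs=10):
    """Geometrically spaced learning rates, one per epoch (assumed spacing)."""
    if n_epochs < 2:
        raise ValueError("n_epochs must be at least 2")
    ratio = (lr_end / lr_start) ** (1.0 / (n_epochs - 1))
    return [lr_start * ratio ** i for i in range(n_epochs)]


def steepest_descent_lr(lrs, losses):
    """Hypothetical heuristic: the rate whose epoch produced the largest loss drop.

    losses[i] is taken to be the loss recorded after training with lrs[i].
    """
    best_i = max(range(1, len(losses)),
                 key=lambda i: losses[i - 1] - losses[i])
    return lrs[best_i]


if __name__ == "__main__":
    for lr in lr_sweep():
        print(f"{lr:.1e}")
```

In practice each value from `lr_sweep` would be used for one training epoch, the loss recorded, and the range where the loss falls fastest taken as the useful learning rate interval.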


2019 ◽  
Vol 110 ◽  
pp. 225-231 ◽  
Author(s):  
Huizhen Zhao ◽  
Fuxian Liu ◽  
Han Zhang ◽  
Zhibing Liang

2020 ◽  
Vol 14 ◽  
Author(s):  
Yaqing Zhang ◽  
Jinling Chen ◽  
Jen Hong Tan ◽  
Yuxuan Chen ◽  
Yunyi Chen ◽  
...  

Emotion is the human brain reacting to objective things. In real life, human emotions are complex and changeable, so research into emotion recognition is of great significance in real-life applications. Recently, many deep learning and machine learning methods have been widely applied in emotion recognition based on EEG signals. However, the traditional machine learning method has a major disadvantage in that the feature extraction process is usually cumbersome and relies heavily on human experts. End-to-end deep learning methods then emerged as an effective way to address this disadvantage with the help of raw signal features and time-frequency spectrums. Here, we investigated the application of several deep learning models to the research field of EEG-based emotion recognition, including deep neural networks (DNN), convolutional neural networks (CNN), long short-term memory (LSTM), and a hybrid model of CNN and LSTM (CNN-LSTM). The experiments were carried out on the well-known DEAP dataset. Experimental results show that the CNN and CNN-LSTM models had high classification performance in EEG-based emotion recognition, and their accurate extraction rate of RAW data reached 90.12% and 94.17%, respectively. The performance of the DNN model was not as accurate as other models, but the training speed was fast. The LSTM model was not as stable as the CNN and CNN-LSTM models. Moreover, with the same number of parameters, the training speed of the LSTM was much slower and it was difficult to achieve convergence. Additional parameter comparison experiments with other models, including epoch, learning rate, and dropout probability, were also conducted in the paper. Comparison results show that the DNN model converged to the optimum with fewer epochs and a higher learning rate. In contrast, the CNN model needed more epochs to learn. As for dropout probability, reducing the parameters by ~50% each time was appropriate.


Author(s):  
Mohammed Abdulla Salim Al Husaini ◽  
Mohamed Hadi Habaebi ◽  
Teddy Surya Gunawan ◽  
Md Rafiqul Islam ◽  
Elfatih A. A. Elsheikh ◽  
...  

Breast cancer is one of the most significant causes of death for women around the world. Breast thermography supported by deep convolutional neural networks is expected to contribute significantly to early detection and facilitate treatment at an early stage. The goal of this study is to investigate the behavior of different recent deep learning methods for identifying breast disorders. To evaluate our proposal, we built classifiers based on deep convolutional neural networks modelling Inception V3, Inception V4, and a modified version of the latter called Inception MV4. MV4 was introduced to maintain the computational cost across all layers by making the resultant number of features and the number of pixel positions equal. The DMR database was used with these deep learning models to classify thermal images of healthy and sick patients. Epoch counts from 3 to 30 were used in conjunction with learning rates of 1 × 10^-3, 1 × 10^-4, and 1 × 10^-5, a minibatch size of 10, and different optimization methods. The training results showed that Inception V4 and MV4 with color images, a learning rate of 1 × 10^-4, and the SGDM optimization method reached very high accuracy, verified through several experimental repetitions. With grayscale images, Inception V3 outperforms V4 and MV4 by a considerable accuracy margin for every optimization method. In fact, the performance of Inception V3 (grayscale) is almost comparable to that of Inception V4 and MV4 (color), but only after 20–30 epochs. Inception MV4 achieved a 7% faster classification response time compared to V4. The use of the MV4 model is found to contribute to saving energy and to the fluidity of arithmetic operations on the graphics processor. The results also indicate that increasing the number of layers may not necessarily improve performance.


2018 ◽  
Author(s):  
Kazunori D Yamada

In the deep learning era, stochastic gradient descent is the most common method used for optimizing neural network parameters. Among the various mathematical optimization methods, the gradient descent method is the most naive. Adjustment of learning rate is necessary for quick convergence, which is normally done manually with gradient descent. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to learning status. These optimizers were gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer: YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment estimation system, we incorporated an advantageous part of AdaDelta, namely a unit correction system, into YamAdam. According to benchmark tests on some common datasets, our optimizer showed similar or faster convergent performance compared to the existing methods. YamAdam is an option as an alternative optimizer for deep learning.
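For orientation, the first- and second-moment mechanism of Adam that the abstract above builds on can be sketched as follows. This is the standard Adam update for a single scalar parameter, not YamAdam: the abstract does not give the form of the AdaDelta-style unit correction, so it is not reproduced here. The function names and the toy objective f(theta) = (theta - 3)^2 are assumptions for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for a scalar parameter (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad         # first moment (mean of gradients)
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second moment (mean of squares)
    m_hat = m / (1.0 - beta1 ** t)               # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(theta0=5.0, n_steps=2000, lr=0.05):
    """Minimise the toy objective f(theta) = (theta - 3)^2 with Adam."""
    theta, m, v = theta0, 0.0, 0.0
    for t in range(1, n_steps + 1):
        grad = 2.0 * (theta - 3.0)
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta


if __name__ == "__main__":
    print(minimize_quadratic())
```

Because the update divides the first moment by the square root of the second, the step size is roughly `lr` regardless of the gradient's scale, which is the automatic learning rate adjustment the abstract refers to.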

