Accelerating Convolutional Neural Network Using Discrete Orthogonal Transforms

2021 ◽  
Author(s):  
Eduardo Reis ◽  
Rachid Benlamri

All experiments are implemented in Python, using the PyTorch and Torch-DCT libraries in the Google Colab environment. An Intel(R) Xeon(R) CPU @ 2.00GHz and a Tesla V100-SXM2-16GB GPU were assigned to the Google Colab runtime when profiling the DOT models. It should be noted that the current stable version of the PyTorch library, version 1.8.1, offers only an implementation of the FFT algorithm. Therefore, the Hartley and Cosine transforms listed in Table 1 are not implemented with the same optimizations (algorithm- and code-wise) adopted in the FFT. We benchmark the DOT methods using the LeNet-5 network shown in Figure 10. The ReLU activation function is adopted as the non-linear operation across the entire architecture. In this network, the convolutional operations have a kernel of size K = 5. The convolution is of type “valid”, i.e., padding is not applied to the input. Hence the output size M of each layer is smaller than its input size N, that is, M = N − K + 1. The optimizers used in our experiments are Adam, SGD, SGD with momentum of 0.9, and RMSProp with α = 0.99. The StepLR scheduler is used with a step size of 20 epochs and γ = 0.5. We train our model for 40 epochs using a mini-batch size of 128 and a learning rate of 0.001. Five datasets are used to benchmark the proposed DOT methods: MNIST, the MNIST variants EMNIST, KMNIST and Fashion-MNIST, and a more complex dataset, CIFAR-10.
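
The two numeric rules in this setup, the "valid" output size M = N − K + 1 and the StepLR schedule, can be checked with a few lines of plain Python. This is a minimal sketch, not the authors' code: the function names are hypothetical, the input width N = 28 is an assumption (MNIST-sized images), and the schedule assumes the scheduler is stepped once per epoch.

```python
def valid_output_size(n: int, k: int = 5) -> int:
    """Output width of a 'valid' (unpadded, stride-1) convolution: M = N - K + 1."""
    if k > n:
        raise ValueError("kernel size K must not exceed input size N")
    return n - k + 1


def step_lr(base_lr: float, epoch: int, step_size: int = 20, gamma: float = 0.5) -> float:
    """Learning rate at a given epoch when it is multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# Assumed input width N = 28; the abstract fixes only K = 5.
print("output size for N=28, K=5:", valid_output_size(28))

# Schedule from the abstract: 40 epochs, base rate 0.001, step size 20, gamma 0.5.
for epoch in (0, 19, 20, 39):
    print("epoch", epoch, "lr", step_lr(0.001, epoch))
```

Under these assumptions the learning rate stays at 0.001 for the first 20 epochs and is halved once, for the remaining 20.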


2020 ◽  
Vol 59 (1) ◽  
pp. 131-142
Author(s):  
Daniel Štifanić ◽  
Zlatan Car

Fish population monitoring systems based on underwater video recording are becoming more popular; however, manual processing and analysis of such data can be time-consuming. Machine learning algorithms allow the data to be processed more efficiently. In this research, the authors investigate the use of a convolutional neural network (CNN) for fish species classification. The dataset consists of four fish species (Plectroglyphidodon dickii, Chromis chrysura, Amphiprion clarkii, and Chaetodon lunulatus), for a total of 12859 fish images. Different combinations of hyperparameters were examined, as well as the impact of different activation functions on classification performance. The best CNN classification performance was achieved when the Identity activation function is applied to the hidden layers and RMSprop is used as the solver with a learning rate of 0.001 and a learning rate decay of 1e-5. Accordingly, the proposed CNN model is capable of performing high-quality fish species classification.
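
The abstract does not say how the learning rate decay of 1e-5 is applied. One common convention is time-based decay, where the rate at iteration t is the initial rate divided by (1 + decay · t). The sketch below assumes that convention; the function name is hypothetical and the formula may differ from what the authors' framework actually does.

```python
def decayed_learning_rate(initial_lr: float, decay: float, iteration: int) -> float:
    """Time-based decay (assumed convention): lr_t = lr_0 / (1 + decay * t)."""
    return initial_lr / (1.0 + decay * iteration)


# Values from the abstract: learning rate 0.001, decay 1e-5.
for iteration in (0, 10_000, 100_000):
    print(iteration, decayed_learning_rate(0.001, 1e-5, iteration))
```

With these values the rate falls slowly: under this assumed rule it takes 100,000 iterations to reach roughly half of the initial rate.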


2020 ◽  
Author(s):  
Takuma Yoshimura

In this research, I propose a two-variable activation function, "Yamatani", that satisfies first-degree homogeneity, and realize a super-resolution convolutional neural network that is independent of the dynamic range and symmetric under luminance inversion.
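
First-degree homogeneity means f(a·x, a·y) = a·f(x, y), so rescaling the inputs rescales the output by the same factor. Symmetry under luminance inversion is read here as f(−x, −y) = −f(x, y), which is an interpretation rather than a statement from the abstract. The sketch below tests both properties numerically. The function `stand_in_activation` is a hypothetical stand-in chosen only because it has these properties; it is not the Yamatani function, whose definition the abstract does not give.

```python
import random


def stand_in_activation(x: float, y: float) -> float:
    """Hypothetical two-variable function that is homogeneous of degree 1 (NOT Yamatani)."""
    denom = abs(x) + abs(y)
    if denom == 0.0:
        return 0.0
    return (x * abs(y) + y * abs(x)) / denom


def is_first_degree_homogeneous(f, trials: int = 1000, tol: float = 1e-9, seed: int = 0) -> bool:
    """Check f(a*x, a*y) == a*f(x, y) on random inputs and positive scales a."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        a = rng.uniform(0.1, 10.0)
        if abs(f(a * x, a * y) - a * f(x, y)) > tol:
            return False
    return True


def is_inversion_symmetric(f, trials: int = 1000, tol: float = 1e-9, seed: int = 1) -> bool:
    """Check f(-x, -y) == -f(x, y) on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if abs(f(-x, -y) + f(x, y)) > tol:
            return False
    return True


print(is_first_degree_homogeneous(stand_in_activation))
print(is_inversion_symmetric(stand_in_activation))
```

A plain product such as `lambda x, y: x * y` fails the first check because it is homogeneous of degree two, which makes it a useful negative control.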


2021 ◽  
Vol 905 (1) ◽  
pp. 012059
Author(s):  
Y Hendrawan ◽  
B Rohmatulloh ◽  
F I Ilmi ◽  
M R Fauzy ◽  
R Damayanti ◽  
...  

Various types of Indonesian coffee are already popular internationally, yet there are still few methods for classifying the typical Indonesian coffee types. Computer vision is a non-destructive method for classifying agricultural products. This study aimed to classify three types of Indonesian Arabica coffee beans, i.e., Gayo Aceh, Kintamani Bali, and Toraja Tongkonan, using computer vision. The classification method used was the AlexNet convolutional neural network, with a sensitivity analysis over several optimizers (SGDm, Adam, and RMSProp) and learning rates of 0.00005 and 0.0001. Each type of coffee used 500 data samples for training and validation, split into 70% training and 30% validation. The results showed that all AlexNet models achieved a perfect validation accuracy of 100% in 1,040 iterations. This study also used 100 testing-set samples for each type of coffee bean. In the testing confusion matrix, the accuracy reached 99.6%.
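
The experimental design described here comes down to a small grid of optimizer and learning-rate combinations and a fixed per-class split. The sketch below only enumerates that grid and computes the split sizes; it trains nothing, and the function and constant names are hypothetical.

```python
from itertools import product

# Values taken from the abstract.
OPTIMIZERS = ("SGDm", "Adam", "RMSProp")
LEARNING_RATES = (0.00005, 0.0001)


def sensitivity_grid():
    """All (optimizer, learning rate) combinations in the sensitivity analysis."""
    return list(product(OPTIMIZERS, LEARNING_RATES))


def split_counts(n_samples: int, train_percent: int = 70):
    """Number of training and validation samples for a percentage split."""
    n_train = n_samples * train_percent // 100
    return n_train, n_samples - n_train


for optimizer, lr in sensitivity_grid():
    print(optimizer, lr)

# 500 samples per coffee type, split 70% / 30%.
print(split_counts(500))
```

For 500 samples per coffee type this gives 350 training and 150 validation samples, and the grid contains six configurations.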


Author(s):  
Shelvi Nur Rahmawati ◽  
Eka Wahyu Hidayat ◽  
Husni Mubarok

Sundanese script (Aksara Sunda) is one of Indonesia's regional scripts, used in particular by the Sundanese people. With today's technological development, regional languages are steadily being eroded. Sundanese script has begun to be forgotten and is rarely used by Sundanese people in daily life, and many lack an understanding of their own regional language. Regional languages therefore need to be preserved in ways adapted to the times so that they remain known and sustained, one of which is identifying Sundanese script using the Convolutional Neural Network (CNN) method. CNN is a part of deep learning that is commonly used for processing image data. This study used ADAM optimization with 20, 50, 100 and 500 epochs. Training with 500 epochs and a learning rate of 0.1 gave the highest value, with an accuracy of 98.03%. Training with 100 epochs and a learning rate of 0.001 gave an accuracy of 96.71% on the training data and 92.02% on the testing data.


2018 ◽  
Vol 18 (01) ◽  
pp. 22-27 ◽  
Author(s):  
Royani Darma Nurfita ◽  
Gunawan Ariyanto

Fingerprint recognition systems have been widely used in biometrics for various purposes in recent years. Fingerprints are used because their complex patterns can identify a person and serve as each individual's identity. They are also widely used for both verification and identification. The problem addressed in this study is that computers have difficulty classifying objects, fingerprints among them. In this study the authors use deep learning with the Convolutional Neural Network (CNN) method to address this problem. CNN is used to carry out the machine learning process on the computer. The stages of the CNN are data input, preprocessing, and training. The CNN is implemented with the tensorflow library using the Python programming language. The dataset comes from the website of a 2004 fingerprint verification competition, which used an optical sensor, the “V300” by CrossMatch, and contains 80 fingerprint images. Training used data of 24x24 pixels, and testing compared the number of epochs and the learning rate, showing that the larger the number of epochs and the smaller the learning rate, the better the training accuracy obtained. The training accuracy achieved in this study was 100%.

