Parallel Computing, Graphics Processing Unit (GPU) and New Hardware for Deep Learning in Computational Intelligence Research

Author(s):  
M. Madiajagan ◽  
S. Sridhar Raj
2018 ◽  
Author(s):  
Maria Lorena Cordero-Maldonado ◽  
Simon Perathoner ◽  
Kees-Jan van der Kolk ◽  
Ralf Boland ◽  
Ursula Heins-Marroquin ◽  
...  

Abstract: One of the most popular techniques in zebrafish research is microinjection, as it is a rapid and efficient way to genetically manipulate early developing embryos and to introduce microbes or tracers at larval stages. Here we demonstrate the development of machine learning software that allows for microinjection at a trained target site in zebrafish eggs at unprecedented speed. The software is based on the open-source deep-learning library Inception v3. In a first step, the software distinguishes wells containing embryos at the one-cell stage from wells to be skipped, with an accuracy of 93%. A second step was developed to pinpoint the injection site. Deep learning predicts this location, on average, within 42 µm of manually annotated sites. Using a Graphics Processing Unit (GPU), both steps together take less than 100 milliseconds. We first tested our system by injecting a morpholino into the middle of the yolk and found that automated injection is as efficient as manual injection (~80%). Next, we tested both CRISPR/Cas9 and DNA construct injections into the zygote and obtained an efficiency comparable to that of an experienced experimentalist. Combined with a higher throughput, this results in a higher yield. Hence, the automated injection of CRISPR/Cas9 will allow high-throughput applications to knock out and knock in relevant genes to study their mechanisms or pathways of interest in diverse areas of biomedical research.
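The localization figure reported in this abstract (an average of 42 µm between predicted and manually annotated sites) is a mean Euclidean distance. A minimal sketch of how such a metric could be computed follows. The function name, the pixel coordinates, and the scale of 5 µm per pixel are illustrative assumptions, not values or code from the study.

```python
import math


def mean_localization_error_um(predicted, annotated, um_per_pixel):
    """Mean Euclidean distance between paired points, converted to micrometres."""
    if len(predicted) != len(annotated) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    distances = [
        math.hypot(px - ax, py - ay) * um_per_pixel
        for (px, py), (ax, ay) in zip(predicted, annotated)
    ]
    return sum(distances) / len(distances)


# Hypothetical pixel coordinates (x, y); these are not data from the paper.
predicted_sites = [(103, 200), (150, 154), (80, 90)]
annotated_sites = [(100, 204), (150, 160), (88, 96)]

# Assumed image scale of 5 µm per pixel, chosen only for illustration.
error_um = mean_localization_error_um(
    predicted_sites, annotated_sites, um_per_pixel=5.0
)
print(f"mean localization error: {error_um:.1f} µm")
```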


2020 ◽  
Vol 20 (1) ◽  
pp. 67-76
Author(s):  
Rahmadya Trias Handayanto ◽  
Herlawati Herlawati

In its early development, machine learning performed classification with two classes (bi-class), such as class -1 and class +1, 0 and 1, or categories such as true and false. Well-known methods include Artificial Neural Networks (ANN) and the Support Vector Machine (SVM). Later work addressed problems with more than two classes, known as multi-class problems. For SVM, multiple classes are sometimes handled by a staged process similar to a decision tree (DT). Meanwhile, ANN has developed rapidly and is now built with a large number of layers and newer activation functions, such as the rectified linear unit (ReLU) and the probabilistic softmax, together with optimizer methods (adam, sgd, and others). The term then changed to Deep Learning (DL). This study aimed to compare two well-known methods (DL and SVM) in classifying multiple classes. The DL network had six layers with 128, 64, 32, 8, 4, and 3 neurons, while the SVM used a radial basis function kernel with gamma and C of 0.7 and 5, respectively. In addition, this study compares the use of the Graphics Processing Unit (GPU) available on Google Interactive Notebook (Google Colab), an online Python programming application. The results showed that DL slightly outperformed SVM in accuracy (99% versus 98%) but required large computational resources. However, the use of the GPU overcame this problem and increased processing speed by as much as 47 times. Keywords: Artificial Neural Networks, Graphics Processing Unit, Google Interactive Notebook, Rectified Linear Units, Support Vector Machine.
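The building blocks named in this abstract can be written out in a few lines. The sketch below is plain Python, not the study's code: it defines ReLU, softmax, and the radial basis function kernel with gamma = 0.7, and counts the trainable parameters of a fully connected network with layer widths 128, 64, 32, 8, 4, and 3. The input width of 4 features and the sample vectors are assumptions made for illustration; the abstract does not state the dataset or its dimensions.

```python
import math


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rbf_kernel(x, y, gamma=0.7):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def dense_parameter_count(n_inputs, layer_widths):
    """Weights plus biases of a fully connected network."""
    total = 0
    fan_in = n_inputs
    for width in layer_widths:
        total += fan_in * width + width
        fan_in = width
    return total


LAYER_WIDTHS = [128, 64, 32, 8, 4, 3]  # layer sizes quoted in the abstract
N_INPUTS = 4  # assumption: the abstract does not give the input width

hidden_activations = relu([2.0, -1.0, 0.5])
class_probabilities = softmax([2.0, -1.0, 0.5])  # three scores, three classes
similarity = rbf_kernel([0.0, 0.0], [1.0, 0.0])
n_params = dense_parameter_count(N_INPUTS, LAYER_WIDTHS)
print(hidden_activations, class_probabilities, similarity, n_params)
```

The final layer width of 3 matches a three-class softmax output, which is why the example feeds three scores to `softmax`.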


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5330
Author(s):  
Marcin Łukasz Kowalski ◽  
Norbert Pałka ◽  
Jarosław Młyńczak ◽  
Mateusz Karol ◽  
Elżbieta Czerwińska ◽  
...  

Smuggling of drugs and cigarettes in small inflatable boats across border rivers is a serious threat to the EU’s financial interests. Early detection of such threats is challenging due to difficult and changing environmental conditions. This study reports on the automatic detection of small inflatable boats and people in rough, wild terrain in the infrared thermal domain. Three acquisition campaigns were carried out during spring, summer, and fall under various weather conditions. Three deep learning algorithms, namely YOLOv2, YOLOv3, and Faster R-CNN, working with six different feature extraction neural networks, were trained and evaluated in terms of performance and processing time. The best performance was achieved with Faster R-CNN with ResNet101; however, processing requires a long time and a powerful graphics processing unit.


SPE Journal ◽  
2016 ◽  
Vol 21 (04) ◽  
pp. 1425-1435 ◽  
Author(s):  
Cheng Chen ◽  
Zheng Wang ◽  
Deepak Majeti ◽  
Nick Vrvilo ◽  
Timothy Warburton ◽  
...  

Summary: Shale permeability is sufficiently low to require an unconventional scale of stimulation treatments, such as very-large-volume, high-rate, multistage hydraulic-fracturing applications. Upscaling of hydrocarbon transport processes in shales is challenging because of the low permeability and strong heterogeneity. Rock characterization with high-resolution imaging [X-ray tomography and scanning electron microscope (SEM)] is usually highly localized and contains significant uncertainties because of the small field of view. Therefore, an effective high-performance computing method is required to collect information over a larger scale to meet the ergodicity requirement in upscaling. The lattice Boltzmann (LB) method has received significant attention in computational fluid dynamics because of its ability to cope with complicated boundary conditions. A combination of high-resolution imaging and LB simulation is a powerful approach for evaluating the transport properties of a porous medium in a timely manner, on the basis of the numerical solution of the Navier-Stokes equations and Darcy's law. In this work, a graphics-processing-unit (GPU)-enhanced lattice Boltzmann simulator (GELBS) was developed, which was optimized by GPU parallel computing on the basis of the inherent parallelism of the LB method. Specifically, the LB method was used to implement the computational kernel; a sparse data structure was applied to optimize memory allocation; and the OCCA (Medina et al. 2014) portability library was used, which enables the GELBS codes to use different application-programming interfaces (APIs), including open computing language (OpenCL), compute unified device architecture (CUDA), and open multiprocessing (OpenMP). OpenCL is an open standard for cross-platform parallel computing, CUDA is supported only by NVIDIA devices, and OpenMP is primarily used on central processing units (CPUs).
It was found that the GPU-accelerated code was approximately 1,000 times faster than the unoptimized serial code and 10 times faster than the parallel code run on a standalone CPU. The CUDA code was slightly faster than the OpenCL code on the NVIDIA GPU because of the extra cost OpenCL incurs in adapting to a heterogeneous platform. The GELBS was validated by comparing it with analytical solutions, laboratory measurements, and other independent numerical simulators in previous studies, and it was shown to have second-order global accuracy. The GELBS was then used to analyze thin cuttings extracted from a sandstone reservoir and a shale-gas reservoir. The sandstone permeabilities were found to be relatively isotropic, whereas the shale permeabilities were strongly anisotropic because of the horizontal lamination structure. In shale cuttings, the average permeability in the horizontal direction was higher than that in the vertical direction by approximately two orders of magnitude. Correlations between porosity and permeability were observed in both rocks. The combination of GELBS and high-resolution imaging methods makes for a powerful tool for permeability evaluation when conventional laboratory measurement is impossible because of small cuttings sizes. The constitutive correlations between geometry and transport properties can be used for upscaling in different rock types. The GPU-optimized code significantly accelerates the computing speed; thus, many more samples can be analyzed in the same processing time. Consequently, the ergodicity requirement is met, which leads to better reservoir characterization.
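The inherent parallelism of the LB method mentioned in this summary comes from the collision step, which updates each lattice node using only that node's own data. The sketch below shows a single-node collision in plain Python. The D2Q9 lattice, the BGK (single-relaxation-time) operator, the relaxation time, and the sample populations are assumptions chosen for illustration; the summary does not state which lattice or collision operator GELBS uses.

```python
# D2Q9 lattice: nine discrete velocities and their standard weights.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def moments(f):
    """Density and velocity computed from the nine populations at one node."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for the D2Q9 lattice."""
    u_squared = ux * ux + uy * uy
    feq = []
    for w, (cx, cy) in zip(WEIGHTS, VELOCITIES):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * u_squared))
    return feq


def bgk_collide(f, tau):
    """Relax the populations of one node toward equilibrium (BGK collision)."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]


# Hypothetical populations at a single node; tau = 0.8 is an assumed value.
node = [0.40, 0.12, 0.10, 0.11, 0.10, 0.03, 0.02, 0.03, 0.03]
post = bgk_collide(node, tau=0.8)

rho_before, ux_before, uy_before = moments(node)
rho_after, ux_after, uy_after = moments(post)
print(rho_before, rho_after)  # collision conserves mass and momentum
```

Because `bgk_collide` reads and writes only one node's populations, each node can be assigned to its own GPU thread. The streaming step, not shown here, then moves populations to neighboring nodes.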

