Performance evaluation of GPU- and cluster-computing for parallelization of compute-intensive tasks

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Alexander Döschl ◽  
Max-Emanuel Keller ◽  
Peter Mandl

Purpose
This paper aims to evaluate different approaches to the parallelization of compute-intensive tasks. The study compares a Java multi-threaded algorithm, distributed computing solutions based on the MapReduce (Apache Hadoop) and resilient distributed dataset (RDD, Apache Spark) paradigms, and a graphics processing unit (GPU) approach with Numba for the compute unified device architecture (CUDA).

Design/methodology/approach
The paper uses a simple but computationally intensive puzzle as a case study for experiments. To find all solutions by brute-force search, 15! permutations had to be computed and tested against the solution rules. The experimental application comprises a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and RDD (Apache Spark), and a GPU approach with Numba for CUDA. The implementations were benchmarked on Amazon EC2 instances for performance and scalability measurements.

Findings
The comparison of the Apache Hadoop and Apache Spark solutions under Amazon EMR showed that the processing time measured in CPU minutes was up to 30% lower with Spark, whose performance especially benefits from an increasing number of tasks. The CUDA implementation achieves more than 16 times faster execution than the Spark solution at the same price. Apart from the multi-threaded implementation, the processing times of all solutions scale approximately linearly. Finally, several application suggestions for the different parallelization approaches are derived from the insights of this study.

Originality/value
There are numerous studies that have examined the performance of parallelization approaches, most of which deal with processing large amounts of data or mathematical problems. This work, in contrast, compares these technologies on their ability to implement computationally intensive distributed algorithms.
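A minimal sketch of the CUDA strategy described above, assuming the standard technique of decoding each thread's global index into a permutation via the factorial number system (Lehmer code); the puzzle's solution rules are replaced by a hypothetical check_rules placeholder, so this illustrates the pattern rather than the authors' implementation:

```python
import math

import numpy as np
from numba import cuda

N = 15  # permutation length; 15! = 1,307,674,368,000 candidates in total


@cuda.jit(device=True)
def check_rules(perm):
    # Hypothetical stand-in for the puzzle's solution rules.
    return perm[0] < perm[1]


@cuda.jit
def search(start, count, factorials, hits):
    i = cuda.grid(1)
    if i >= count:
        return
    idx = start + i
    # Decode idx into the idx-th permutation (factorial number system).
    perm = cuda.local.array(15, dtype=np.int32)
    avail = cuda.local.array(15, dtype=np.int32)
    for k in range(N):
        avail[k] = k
    for k in range(N):
        f = factorials[N - 1 - k]
        d = idx // f
        idx -= d * f
        perm[k] = avail[d]
        for m in range(d, N - 1 - k):  # remove the chosen element
            avail[m] = avail[m + 1]
    if check_rules(perm):
        cuda.atomic.add(hits, 0, 1)


factorials = np.array([math.factorial(k) for k in range(N)], dtype=np.int64)
hits = np.zeros(1, dtype=np.int64)
chunk = 10_000_000  # the full search space would be covered chunk by chunk
search[(chunk + 255) // 256, 256](0, chunk, factorials, hits)
print(hits[0])
```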

Repositor ◽  
2020 ◽  
Vol 2 (4) ◽  
pp. 463
Author(s):  
Rendiyono Wahyu Saputro ◽  
Aminuddin Aminuddin ◽  
Yuda Munarko

Abstract
Technological developments have resulted in ever faster and larger data growth. This is due to the large number of data sources, such as search engines, RFID, digital transaction records, video and photo archives, user-generated content, the Internet of Things, and scientific research in fields such as genomics, meteorology, astronomy and physics. In addition, these data have unique characteristics that prevent them from being processed by conventional database technology. Various distributed computing frameworks, such as Apache Hadoop and Apache Spark, have therefore been developed to process data in a distributed manner on a computer cluster. Given this variety of distributed computing frameworks, a test is needed to determine the computing performance of both. Testing was done by processing datasets of various sizes on computer clusters with different numbers of nodes. In all test results, Apache Hadoop required less time than Apache Spark, because the throughput and throughput/node of Apache Hadoop were higher than those of Apache Spark.
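For reference, a minimal sketch of the metrics the conclusion rests on, assuming throughput is defined as dataset size divided by wall-clock runtime and throughput/node divides that by the cluster size (the function names and figures are illustrative):

```python
def throughput_mb_s(dataset_mb: float, runtime_s: float) -> float:
    """MB processed per second for one benchmark run."""
    return dataset_mb / runtime_s


def throughput_per_node(dataset_mb: float, runtime_s: float, nodes: int) -> float:
    """Per-node share of the cluster throughput."""
    return throughput_mb_s(dataset_mb, runtime_s) / nodes


# Example: a 2,048 MB dataset processed in 120 s on a 4-node cluster.
print(throughput_per_node(2048, 120, 4))  # ~4.27 MB/s per node
```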


2019 ◽  
Vol 16 (12) ◽  
pp. 5140-5148
Author(s):  
Sarabjeet Singh ◽  
Satvir Singh ◽  
Vijay Kumar Banga

In this paper, a fast and efficient framework is proposed to obtain an optimum output from a noisy data set of a system by using an interval type-2 (IT2) fuzzy logic system. The concept of GPGPU (General-Purpose Computing on Graphics Processing Units) is used for fast execution of the fuzzy rule base on a Graphics Processing Unit (GPU). The Whale Optimization Algorithm (WOA) is applied to ascertain the optimum output from the noisy data set; it is integrated with the IT2 fuzzy logic system and executed on the GPU for faster execution. The proposed framework is designed for parallel execution on the GPU, and the results are compared with serial program execution. It is clearly observed that the rule base evolved by parallel execution provides better accuracy in less time. The proposed framework (IT2 FLS) has been validated with the classical benchmark problem of the Mackey-Glass time series. Non-stationary time-series data with additive Gaussian noise was processed with both the proposed framework and a type-1 (T1) FLS, and the IT2 FLS was observed to provide the better rule base for the noisy data set.
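A minimal sketch of the benchmark data used for validation, generating the Mackey-Glass time series with additive Gaussian noise (assumptions: the classical parameter setting beta = 0.2, gamma = 0.1, n = 10, tau = 17 and simple Euler integration; the paper does not state its exact generation settings):

```python
import numpy as np


def mackey_glass(length=1000, tau=17, beta=0.2, gamma=0.1, n=10, dt=1.0):
    """Euler integration of dx/dt = beta*x(t-tau)/(1 + x(t-tau)^n) - gamma*x(t)."""
    x = np.zeros(length + tau)
    x[:tau] = 1.2  # constant history before t = 0
    for t in range(tau, length + tau - 1):
        x_tau = x[t - tau]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1 + x_tau**n) - gamma * x[t])
    return x[tau:]


rng = np.random.default_rng(0)
series = mackey_glass()
noisy = series + rng.normal(0.0, 0.1, series.shape)  # additive Gaussian noise
```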


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
R. Cavicchioli ◽  
J. Cheng Hu ◽  
E. Loli Piccolomini ◽  
E. Morotti ◽  
L. Zanni

Abstract
Digital Breast Tomosynthesis (DBT) is a modern 3D Computed Tomography X-ray technique for the early detection of breast tumors, which is receiving growing interest in the medical and scientific community. Since DBT performs incomplete sampling of data, image reconstruction approaches based on iterative methods are preferable to classical analytic techniques, such as the Filtered Back Projection algorithm, as they produce fewer artifacts. In this work, we consider a Model-Based Iterative Reconstruction (MBIR) method well suited to describe the DBT data acquisition process and to include prior information on the reconstructed image. We propose a gradient-based solver named Scaled Gradient Projection (SGP) for the solution of the constrained optimization problem arising in the considered MBIR method. Although the SGP algorithm exhibits fast convergence, the time required on a serial computer for the reconstruction of a real DBT data set is too long for clinical needs. In this paper we propose a parallel SGP version designed to perform the most expensive computations of each iteration on a Graphics Processing Unit (GPU). We apply the proposed parallel approach on three different GPU boards, with computational performance comparable with that of the boards usually installed in commercial DBT systems. The numerical results show that the proposed GPU-based MBIR method provides accurate reconstructions in a time suitable for clinical trials.
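A minimal NumPy sketch of the SGP iteration for a nonnegativity-constrained least-squares model: a scaled gradient step followed by projection onto the feasible set. A small random system stands in for the DBT projector, and the diagonal scaling and step-length rules are simplified relative to the paper's algorithm:

```python
import numpy as np


def sgp(A, b, iters=500, alpha=1e-3):
    """Scaled gradient projection for min 0.5*||Ax - b||^2 subject to x >= 0."""
    x = np.zeros(A.shape[1])
    d = np.ones_like(x)  # diagonal scaling D_k (simplified: kept fixed)
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                   # gradient of the data-fit term
        x = np.maximum(x - alpha * d * grad, 0.0)  # scaled step + projection
    return x


# Example on a small random problem with a convergent fixed step length.
rng = np.random.default_rng(0)
A = rng.normal(size=(64, 32))
x_true = np.maximum(rng.normal(size=32), 0.0)
x_hat = sgp(A, A @ x_true, alpha=1.0 / np.linalg.norm(A, 2) ** 2)
```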


2020 ◽  
Author(s):  
Ryan N Gutenkunst

Extracting insight from population genetic data often demands computationally intensive modeling. dadi is a popular program for fitting models of demographic history and natural selection to such data. Here, I show that running dadi on a Graphics Processing Unit (GPU) can speed computation by orders of magnitude compared to the CPU implementation, with minimal user burden. This speed increase enables the analysis of more complex models, which motivated the extension of dadi to four- and five-population models. Remarkably, dadi performs almost as well on inexpensive consumer-grade GPUs as on expensive server-grade GPUs. GPU computing thus offers large and accessible benefits to the community of dadi users. This functionality is available in dadi version 2.1.0.
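In practice the switch is a one-line change to an existing inference script. A minimal sketch, assuming the GPU toggle introduced with this functionality is named cuda_enabled (check the dadi documentation for the exact call in your version):

```python
import dadi

# Route dadi's integration onto the GPU, if one is available
# (assumed toggle name; verify against your installed version).
dadi.cuda_enabled(True)

# An existing analysis then runs unchanged, e.g. with one of dadi's
# built-in demographic models such as the two-population
# split-migration model (dadi.Demographics2D.split_mig).
```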


2020 ◽  
Author(s):  
Dylan G Rees

The contact centre industry employs 4% of the entire United Kingdom and United States' working population and generates gigabytes of operational data that require analysis to provide insight and improve efficiency. This thesis is the result of a collaboration with QPC Limited, who provide data collection and analysis products for call centres. They provided a large data set featuring almost 5 million calls to be analysed. This thesis utilises novel visualisation techniques to create tools for the exploration of the large, complex call centre data set and to facilitate unique observations into the data.

A survey of information visualisation books is presented, providing a thorough background of the field. Following this, a feature-rich application that visualises large call centre data sets using scatterplots supporting millions of points is presented. The application utilises both CPU and GPU acceleration for processing and filtering and is exhibited with millions of call events.

This is expanded upon with the use of glyphs to depict agent behaviour in a call centre. A technique is developed to cluster overlapping glyphs into a single parent glyph dependent on zoom level and a customisable distance metric. This hierarchical glyph represents the mean value of all child agent glyphs, removing overlap and reducing visual clutter. A novel technique for visualising individually tailored glyphs using a Graphics Processing Unit is also presented and demonstrated rendering over 100,000 glyphs at interactive frame rates. An open-source code example is provided for reproducibility.

Finally, a novel interaction and layout method is introduced for improving the scalability of chord diagrams to visualise call transfers. An exploration of sketch-based methods for showing multiple links and direction is made, and a sketch-based brushing technique for filtering is proposed. Feedback from domain experts in the call centre industry is reported for all applications developed.
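A minimal sketch (not the thesis implementation) of the hierarchical glyph clustering idea: glyphs closer than a zoom-dependent screen distance are merged, and each parent glyph carries the mean of its children's values. All names here are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    x: float      # screen position
    y: float
    value: float  # agent metric encoded by the glyph


def cluster_glyphs(glyphs, zoom, base_radius=32.0):
    threshold = base_radius / zoom  # zooming in shrinks the merge radius
    clusters = []
    for g in glyphs:
        for c in clusters:
            if (g.x - c[0].x) ** 2 + (g.y - c[0].y) ** 2 < threshold**2:
                c.append(g)
                break
        else:
            clusters.append([g])
    # Each parent glyph is the mean position and value of its children.
    return [Glyph(sum(g.x for g in c) / len(c),
                  sum(g.y for g in c) / len(c),
                  sum(g.value for g in c) / len(c)) for c in clusters]
```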


2019 ◽  
Vol 16 (2) ◽  
pp. 304-308
Author(s):  
Chao Peng

Purpose
The purpose of this paper is to investigate possibilities for adopting state-of-the-art computer graphics technologies for big data visualization in engineering applications. Toward this purpose, a conceptual heterogeneous system for graphical rendering is proposed, built with multiple central processing unit (CPU) cores and multiple graphics processing units (GPUs).

Design/methodology/approach
The design of the system supports both general-purpose computation and graphics-related computation. Three processing components are discussed to fulfill the execution requirements in load balancing, data streaming and display. This design fully uses computational and memory resources and enhances performance through GPU-based parallelization.

Findings
The advantages and disadvantages of particular technical methods for each processing component are discussed, and possible ways to integrate them are analyzed.

Originality/value
The contribution of this work is the use of computer graphics technologies in engineering applications.
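A minimal sketch of the three processing components as a CPU-side producer/consumer pipeline; the GPU draw call is a placeholder and all names are illustrative rather than taken from the paper:

```python
import queue
import threading

chunks = queue.Queue(maxsize=64)


def stream_worker(part):
    """Data streaming: prepare one partition of the dataset on a CPU core."""
    for item in part:
        chunks.put(item)  # hand prepared chunks to the display stage


def render(chunk):
    """Placeholder for uploading a chunk to the GPU and issuing draw calls."""
    pass


# Load balancing: split the dataset evenly across CPU worker threads.
parts = [range(0, 500), range(500, 1000)]
workers = [threading.Thread(target=stream_worker, args=(p,)) for p in parts]
for w in workers:
    w.start()

# Display: consume chunks as they become available.
for _ in range(1000):
    render(chunks.get())
for w in workers:
    w.join()
```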


2020 ◽  
Vol 12 (3) ◽  
pp. 415 ◽  
Author(s):  
Qiang Yin ◽  
You Wu ◽  
Fan Zhang ◽  
Yongsheng Zhou

With the development of polarimetric synthetic aperture radar (PolSAR), quantitative parameter inversion has seen great progress, especially in the field of soil parameter inversion, where it has achieved good results in applications. However, PolSAR data sets often comprise many terabytes, and this huge amount of data directly affects the efficiency of the inversion. The efficiency of soil moisture and roughness inversion has therefore become a problem in the application of this PolSAR technique. A parallel realization based on a graphics processing unit (GPU) for multiple inversion models of PolSAR data is proposed in this paper. This method utilizes the high-performance parallel computing capability of a GPU to optimize the realization of the surface inversion models for polarimetric SAR data. Three classical forward scattering models and their corresponding inversion algorithms are analyzed; they differ in their polarimetric data requirements, application situations and inversion performance. Specifically, the inversion process of PolSAR data is accelerated mainly by the highly concurrent threads of the GPU. Various optimization strategies are applied along the inversion process, such as parallel task allocation and optimizations at the instruction level, in data storage and in data transmission between CPU and GPU. The advantages of a GPU in processing computationally intensive data are shown in the experiments, where the efficiency of soil roughness and moisture inversion is increased by one to two orders of magnitude.
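A minimal Numba/CUDA sketch of the per-pixel parallelization idea: each GPU thread inverts one PolSAR pixel independently. The scattering-model inversion itself is reduced to a hypothetical invert_pixel placeholder, since the paper's three models each have their own formulas:

```python
import numpy as np
from numba import cuda


@cuda.jit(device=True)
def invert_pixel(hh, vv):
    # Hypothetical stand-in for one forward-scattering model's inversion.
    return hh / (vv + 1e-12)


@cuda.jit
def invert(hh, vv, out):
    i, j = cuda.grid(2)  # one thread per pixel
    if i < out.shape[0] and j < out.shape[1]:
        out[i, j] = invert_pixel(hh[i, j], vv[i, j])


hh = np.random.rand(1024, 1024).astype(np.float32)
vv = np.random.rand(1024, 1024).astype(np.float32)
out = np.zeros_like(hh)
threads = (16, 16)
blocks = ((hh.shape[0] + 15) // 16, (hh.shape[1] + 15) // 16)
invert[blocks, threads](hh, vv, out)
```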


Author(s):  
K. Bhargavi ◽  
Sathish Babu B.

GPUs (graphics processing units) are mainly used to speed up compute-intensive, high-performance computing applications, and several tools and technologies are available for running general-purpose, computationally intensive applications on them. This chapter primarily discusses GPU parallelism, applications and probable challenges, and also highlights some GPU computing platforms, including CUDA (Compute Unified Device Architecture), OpenCL (Open Computing Language), OpenMPC (OpenMP extended for CUDA), MPI (Message Passing Interface), OpenACC (Open Accelerators), DirectCompute and C++ AMP (C++ Accelerated Massive Parallelism). Each of these platforms is discussed briefly along with its advantages and disadvantages.
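To make the shared programming model concrete, here is a minimal data-parallel vector addition written in Python with Numba's CUDA backend: one lightweight GPU thread per array element, the same pattern that CUDA, OpenCL and the other platforms above express in their own syntax:

```python
import numpy as np
from numba import cuda


@cuda.jit
def vec_add(a, b, c):
    i = cuda.grid(1)  # global thread index
    if i < c.size:
        c[i] = a[i] + b[i]


n = 1 << 20
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
c = np.zeros_like(a)
vec_add[(n + 255) // 256, 256](a, b, c)  # grid of 256-thread blocks
```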


2019 ◽  
Vol 8 (1) ◽  
pp. 14 ◽  
Author(s):  
Elham Nazari ◽  
Mohammad Hasan Shahriari ◽  
Hamed Tabesh

Introduction: Health care data is growing rapidly. Correct analysis of such data can improve the quality of care and reduce costs. This kind of data has features such as high volume, variety and high-speed production, which make it impossible to analyze with ordinary hardware and software platforms, so choosing the right platform to manage it is very important. The purpose of this study is to introduce and compare the most popular and most widely used platform for processing big data, Apache Hadoop MapReduce, with the two platforms Apache Spark and Apache Flink, which have recently gained great prominence.

Material and Methods: This study is a survey based on subject searches of the ProQuest, PubMed, Google Scholar, Science Direct, Scopus, IranMedex, Irandoc, Magiran, ParsMedline and Scientific Information Database (SID) databases, as well as web reviews and specialized books, using related and standard keywords. Finally, 80 articles related to the subject of the study were reviewed.

Results: The findings showed that each of the studied platforms has distinguishing features in areas such as data processing, support for different languages, processing speed, computational model, memory management, optimization, latency, fault tolerance, scalability, performance, compatibility, security and so on. Overall, the Apache Hadoop environment offers simplicity, error detection and cluster-based scalability management, but because its processing is batch-based it is slow for complex analyses and does not support stream processing. Apache Spark is a distributed computational platform that can process a big data set in memory with a very fast response time. Apache Flink allows users to store data in memory and load it multiple times, and provides a sophisticated fault-tolerance mechanism that continuously recovers the state of the data flow.

Conclusion: Which big data analysis and processing platform to apply varies according to the needs. In other words, these technologies are complementary: each is applicable in a particular field and cannot be separated from the others, and depending on the purpose and expected outcome, a platform must be selected for the analysis or custom tools must be designed on top of these platforms.
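A minimal PySpark sketch of the in-memory processing the results attribute to Spark: an RDD is cached on first use and then queried repeatedly without re-reading from disk (the file path and filter are illustrative):

```python
from pyspark import SparkContext

sc = SparkContext(appName="cache-demo")
records = sc.textFile("hdfs:///data/health_records.csv").cache()  # keep in memory

total = records.count()                                   # first pass loads and caches
flagged = records.filter(lambda r: "ERROR" in r).count()  # reuses the cached RDD
print(total, flagged)
sc.stop()
```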

