cluster computing
Recently Published Documents


TOTAL DOCUMENTS

645
(FIVE YEARS 81)

H-INDEX

23
(FIVE YEARS 2)

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 491
Author(s):  
Woong Seo ◽  
Sanghun Park ◽  
Insung Ihm

Cluster computing has attracted much attention as an effective way of solving large-scale problems. However, only a few attempts have been made to explore mobile computing clusters that can be easily built using commodity smartphones and tablets. To investigate the possibility of mobile cluster-based rendering of large datasets, we developed a mobile GPU ray tracer that renders nontrivial 3D scenes with many millions of triangles at an interactive frame rate on a small-scale mobile cluster. To cope with the limited processing power and memory space, we first present an effective 3D scene representation scheme suitable for mobile GPU rendering. Then, to avoid performance impairment caused by the high latency and low bandwidth of mobile networks, we propose using a static load balancing strategy, which we found to be more appropriate for the vulnerable mobile clustering environment than a dynamic strategy. Our mobile distributed rendering system achieved a few frames per second when ray tracing 1024 × 1024 images, using only 16 low-end smartphones, for large 3D scenes, some with more than 10 million triangles. Through a conceptual demonstration, we also show that the presented rendering scheme can be effectively explored for augmenting real scene images, captured or perceived by augmented and mixed reality devices, with high quality ray-traced images.
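
The static load-balancing strategy mentioned above can be illustrated with a small sketch: the frame is partitioned once, up front, into fixed bands of scanlines, one per node, so no work has to be redistributed over the high-latency mobile network during rendering. This is only a schematic of the idea, assuming mpi4py is available; `trace_rows` is a hypothetical stand-in for the per-node GPU ray tracer, not the authors' code.

```python
# Schematic static load balancing for distributed image rendering.
# Assumes mpi4py; trace_rows() is a placeholder for the per-node renderer.
from mpi4py import MPI
import numpy as np

WIDTH, HEIGHT = 1024, 1024

def trace_rows(row_start, row_end, width):
    """Placeholder: ray trace a horizontal band of the image on this node."""
    return np.zeros((row_end - row_start, width, 3), dtype=np.uint8)

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Static partition: each node gets a fixed, contiguous band of scanlines,
# decided once so no per-frame redistribution crosses the mobile network.
rows_per_node = HEIGHT // size
start = rank * rows_per_node
end = HEIGHT if rank == size - 1 else start + rows_per_node

band = trace_rows(start, end, WIDTH)

# Gather the rendered bands on the master node and stack them into a frame.
bands = comm.gather(band, root=0)
if rank == 0:
    frame = np.vstack(bands)  # full 1024 x 1024 RGB frame
```

In a real system the band sizes would be weighted by measured per-device performance rather than split evenly, but the assignment would still be fixed before rendering starts.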


2022 ◽  
Author(s):  
Baran Kılıç ◽  
Can Özturan ◽  
Alper Sen

Ability to perform fast analysis on massive public blockchain transaction data is needed in various applications such as tracing fraudulent financial transactions. The blockchain data is continuously growing and is organized as a sequence of blocks containing transactions. This organization, however, cannot be used for parallel graph algorithms, which need efficient distributed graph data structures. Using the Message Passing Interface (MPI), we develop a scalable cluster-based system that constructs a distributed transaction graph in parallel and implement various transaction analysis algorithms. We report performance results from our system operating on roughly 5 years of Ethereum Mainnet blockchain data comprising 10.2 million blocks. We report timings obtained from tests involving distributed transaction graph construction, partitioning, page ranking of addresses, degree distribution, token transaction counting, connected component finding and our new parallel blacklisted address trace forest computation algorithm on an economical 16-node cluster set up on the Amazon cloud. Our system is able to construct a distributed graph of 766 million transactions in 218 s and compute the forest of blacklisted address traces in 32 s.
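
As a rough illustration of how such a distributed transaction graph can be built, the sketch below hashes each sender address to an owning rank and exchanges edges with an all-to-all step, assuming mpi4py; the transaction decoding, partitioning scheme and trace-forest algorithm from the paper are not reproduced.

```python
# Sketch: distribute blockchain transaction edges across MPI ranks by hashing
# the sender address, so each rank owns a disjoint part of the graph.
# Assumes mpi4py; the paper's actual algorithms are omitted.
from mpi4py import MPI
from collections import defaultdict
import zlib

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def owner(address: str) -> int:
    """Deterministically map an address to the rank storing its out-edges."""
    return zlib.crc32(address.encode()) % size

def build_local_graph(transactions):
    """transactions: iterable of (from_addr, to_addr, value) read by this rank."""
    # Bucket each edge by the rank that should own its sender address.
    buckets = [[] for _ in range(size)]
    for frm, to, value in transactions:
        buckets[owner(frm)].append((frm, to, value))

    # All-to-all exchange: every edge ends up on the rank owning its sender.
    received = comm.alltoall(buckets)

    adjacency = defaultdict(list)
    for bucket in received:
        for frm, to, value in bucket:
            adjacency[frm].append((to, value))
    return adjacency
```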


Author(s):  
Л.Б. Соколинский ◽  
И.М. Соколинская

The paper presents and evaluates a scalable algorithm for validating solutions to linear programming (LP) problems on cluster computing systems. The main idea of the method is to generate a regular set of points (the validation set) on a small-radius hypersphere centered at the solution point submitted for validation. The objective function is computed at each point of the validation set that belongs to the feasible region. If all the values are less than or equal to the value of the objective function at the point being validated, then this point is considered a correct solution. The parallel implementation of the VaLiPro algorithm is written in C++ using the parallel BSF-skeleton, which encapsulates, in the problem-independent part of its code, all aspects related to the MPI-based parallelization of the program. We provide the results of large-scale computational experiments on a cluster computing system that confirm the efficiency of the proposed approach and the scalability of the VaLiPro algorithm.
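
The validation idea can be sketched as follows for a maximization LP of the form max c·x subject to Ax ≤ b. Note that this toy version samples the hypersphere randomly instead of generating the regular validation set used by VaLiPro, and it runs serially rather than through the BSF-skeleton.

```python
# Toy sketch of the validation idea behind VaLiPro: probe a small hypersphere
# around a candidate LP solution and check that no feasible probe point gives
# a better objective value. Random sampling stands in for the regular
# validation set of the actual algorithm.
import numpy as np

def validate_lp_solution(c, A, b, x_star, radius=1e-3, n_points=10_000, tol=1e-9):
    """Return True if x_star appears optimal for  max c.x  s.t.  A x <= b."""
    n = len(x_star)
    rng = np.random.default_rng(0)
    # Points (approximately) uniform on the hypersphere of the given radius.
    directions = rng.normal(size=(n_points, n))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    points = x_star + radius * directions

    f_star = c @ x_star
    for p in points:
        if np.all(A @ p <= b + tol) and c @ p > f_star + tol:
            return False  # a feasible probe point improves the objective
    return True
```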


2021 ◽  
Vol 7 (5) ◽  
pp. 4438-4448
Author(s):  
Jie Chang

Objectives: Based on cluster computing, this paper studies the implementation of enterprise personnel recruitment management, taking employment recommendation as the starting point. Methods: First, historical employment data of college graduates are used together with a Sensitive-Personal Rank algorithm to calculate the sensitivity of graduates to the historical recruitment data of each enterprise. Results: The method for computing the correlation between current graduates and previous graduates was then optimized, and recommendations were made to recent graduates based on their similarity to previous graduates, providing effective employment reference and guidance. Conclusion: The experimental results showed that RBSI achieved relatively high recommendation accuracy and satisfaction.
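
The abstract does not give the Sensitive-Personal Rank formulas, so the following is only a hedged sketch of the general idea it gestures at: a personalized PageRank-style score over a graduate-to-enterprise hiring graph, with all names, weights and iteration counts chosen purely for illustration.

```python
# Hedged sketch: personalized PageRank-style sensitivity scores computed on a
# graduate/enterprise hiring graph. Illustrative only; the paper's
# Sensitive-Personal Rank algorithm is not reproduced here.
import numpy as np

def personalized_rank(transition, seed, alpha=0.85, iters=100):
    """transition: column-stochastic matrix over graduate and enterprise nodes;
    seed: restart distribution concentrated on the target graduate."""
    score = seed.copy()
    for _ in range(iters):
        score = alpha * transition @ score + (1 - alpha) * seed
    return score  # larger entries = enterprises the graduate is more "sensitive" to
```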


2021 ◽  
Vol 13 (18) ◽  
pp. 3631
Author(s):  
Austin Madson ◽  
Yongwei Sheng

Of the approximately 6700 lakes and reservoirs larger than 1 km2 in the Contiguous United States (CONUS), only ~430 (~6%) are actively gaged by the United States Geological Survey (USGS) or their partners and are available for download through the National Water Information System database. Remote sensing analysis provides a means to fill in these data gaps in order to glean a better understanding of the spatiotemporal water level changes across the CONUS. This study takes advantage of two-plus years of NASA’s ICESat-2 (IS-2) ATLAS photon data (ATL03 products) in order to derive water level changes for ~6200 overlapping lakes and reservoirs (>1 km2) in the CONUS. Interactive visualizations of large spatial datasets are becoming more commonplace as data volumes for new Earth observing sensors have markedly increased in recent years. We present such a visualization created from an automated cluster computing workflow that utilizes tens of billions of ATLAS photons which derives water level changes for all of the overlapping lakes and reservoirs in the CONUS. Furthermore, users of this interactive website can download segmented and clustered IS-2 ATL03 photons for each individual waterbody so that they may run their own analysis. We examine ~19,000 IS-2 derived water level changes that are spatially and temporally coincident with water level changes from USGS gages and find high agreement with our results as compared to the in situ gage data. The mean squared error (MSE) and the mean absolute error (MAE) between these two products are 1 cm and 6 cm, respectively.
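
As a rough illustration of the per-lake processing step (not the study's production workflow), one way to estimate a water surface from a cloud of ATL03 photon heights is to keep only photons near a robust central height and take their median; the window threshold below is an assumption.

```python
# Illustrative per-lake step: estimate a water surface elevation from ATL03
# photon heights by clustering around the median, then difference two passes.
# Thresholds and helpers are assumptions, not the study's actual workflow.
import numpy as np

def surface_level(photon_heights, window_m=1.0):
    """Robust water-surface elevation (m) from geolocated photon heights."""
    heights = np.asarray(photon_heights, dtype=float)
    center = np.median(heights)
    # Keep photons within +/- window_m of the median to reject noise,
    # canopy and subsurface returns, then re-estimate the surface.
    surface = heights[np.abs(heights - center) <= window_m]
    return float(np.median(surface))

def level_change(heights_pass_1, heights_pass_2):
    """Water level change between two IS-2 overpasses of the same lake."""
    return surface_level(heights_pass_2) - surface_level(heights_pass_1)
```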


2021 ◽  
Author(s):  
B. Darmawan

Pertamina EP plays an important role in maintaining the oil production supply for national energy stability. Thus, they bear a great responsibility to accelerate all the development plans and execute them in a timely manner. However, realizing those plans is a big challenge because they are not fully equipped with the advanced computing technology needed to speed up the reservoir modeling and simulation phase. As a result, the effort of finalizing and executing 33 plan of development (POD) projects within 5 years looked like a never-ending undertaking. To face the challenge, Pertamina EP evaluated the possibility of building a cluster technology that can accommodate a high number of simulations and a heavy simulation workload. The evaluation process covered compiling, sorting and selecting the analog reservoir models (highest grid count and longest simulation time), benchmarking, and performance testing to determine the most optimal cluster configuration. A supercomputer was then procured and configured based on the optimized setup, and the evaluation was completed by testing the three most demanding POD models. This paper describes the success story and innovation of running complex, finer-scale reservoir models using hybrid parallel-computing technology on a set of 8 high-performance computing nodes. Three models were tested with satisfying results. This paper discusses the parallel scalability of complex computing systems built from multi-CPU clusters. A multi-CPU distributed-memory computing system is proven to improve and accelerate reservoir modeling and simulation time when it is used in combination with a new, so-called "hybrid" approach. In this approach, the common Message Passing Interface (MPI) synchronization between the cluster nodes is interleaved with shared-memory, thread-based synchronization at the node level. The model with the longest simulation time was accelerated by 60%. The most demanding model, with the highest number of simulation steps, was accelerated by 80%. The model with the largest grid (21.7 million active grid cells) finished its simulation in just 27 minutes, whereas previously it was impossible even to open and run it. The successful case study was then followed by the implementation of the cluster computing technology for two pilot POD projects, which led to very good results. With this improvement, Pertamina EP can finally perform the probabilistic simulations recommended by SKKMIGAS in PTK Rev-2/2018. It is now possible to run multiple reservoir realizations for each of the 33 POD structures.
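
The "hybrid" pattern described above, MPI between cluster nodes interleaved with shared-memory threads within a node, can be sketched generically as below. The reservoir simulator itself is proprietary, so `solve_cells` is a stand-in, and in practice the threaded level would drive native numerical kernels rather than pure Python.

```python
# Generic hybrid-parallel pattern: MPI across nodes, threads within a node.
# Assumes mpi4py; solve_cells() is a placeholder for per-block reservoir work.
from mpi4py import MPI
from concurrent.futures import ThreadPoolExecutor
import os

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

def solve_cells(cell_block):
    """Placeholder numeric kernel for one block of grid cells."""
    return float(sum(cell_block))

# 1. Coarse decomposition across nodes via MPI (one grid partition per rank).
my_cells = list(range(1_000_000))[rank::size]

# 2. Fine decomposition inside the node via threads sharing memory.
n_threads = os.cpu_count() or 4
chunks = [my_cells[i::n_threads] for i in range(n_threads)]
with ThreadPoolExecutor(max_workers=n_threads) as pool:
    node_result = sum(pool.map(solve_cells, chunks))

# 3. MPI reduction combines the node-level results.
total = comm.allreduce(node_result, op=MPI.SUM)
```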


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Alexander Döschl ◽  
Max-Emanuel Keller ◽  
Peter Mandl

Purpose
This paper aims to evaluate different approaches for the parallelization of compute-intensive tasks. The study compares a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and resilient distributed dataset (RDD) (Apache Spark) paradigms and a graphics processing unit (GPU) approach with Numba for compute unified device architecture (CUDA).

Design/methodology/approach
The paper uses a simple but computationally intensive puzzle as a case study for experiments. To find all solutions using brute force search, 15! permutations had to be computed and tested against the solution rules. The experimental application comprises a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and RDD (Apache Spark) paradigms and a GPU approach with Numba for CUDA. The implementations were benchmarked on Amazon EC2 instances for performance and scalability measurements.

Findings
The comparison of the solutions with Apache Hadoop and Apache Spark under Amazon EMR showed that the processing time measured in CPU minutes with Spark was up to 30% lower, while the performance of Spark especially benefits from an increasing number of tasks. With the CUDA implementation, more than 16 times faster execution is achievable for the same price compared to the Spark solution. Apart from the multi-threaded implementation, the processing times of all solutions scale approximately linearly. Finally, several application suggestions for the different parallelization approaches are derived from the insights of this study.

Originality/value
There are numerous studies that have examined the performance of parallelization approaches. Most of these studies deal with processing large amounts of data or mathematical problems. This work, in contrast, compares these technologies on their ability to implement computationally intensive distributed algorithms.
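
To make the brute-force setup concrete, here is a hedged sketch of one common way to partition such a 15! search: fix the first element to create 15 independent work units and test each completion against the puzzle rules in parallel. `is_solution` is a placeholder since the abstract does not state the rules, and this plain-Python version only illustrates the partitioning idea, not the benchmarked Java/Hadoop/Spark/CUDA implementations.

```python
# Sketch of partitioning a 15! brute-force search: fixing the first element
# yields 15 independent tasks that can be evaluated in parallel.
# is_solution() is a placeholder; lower N for a quick local test.
from itertools import permutations
from multiprocessing import Pool

N = 15

def is_solution(perm):
    """Placeholder for the puzzle's solution rules."""
    return False

def search_with_first(first):
    rest = [i for i in range(N) if i != first]
    candidates = ((first,) + p for p in permutations(rest))
    return [perm for perm in candidates if is_solution(perm)]

if __name__ == "__main__":
    with Pool() as pool:
        parts = pool.map(search_with_first, range(N))
    solutions = [s for part in parts for s in part]
```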


Author(s):  
Zeyi Wen ◽  
Qinbin Li ◽  
Bingsheng He ◽  
Bin Cui

In the last few years, Gradient Boosting Decision Trees (GBDTs) have been widely used in various applications such as online advertising and spam filtering. However, GBDT training is often a key performance bottleneck for such data science pipelines, especially for training a large number of deep trees on large data sets. Thus, many parallel and distributed GBDT systems have been researched and developed to accelerate the training process. In this survey paper, we review recent GBDT systems with respect to acceleration using emerging hardware as well as cluster computing, and compare the advantages and disadvantages of the existing implementations. Finally, we present the research opportunities and challenges in designing fast next-generation GBDT systems.

