Upgrading a high performance computing environment for massive data processing

Author(s):  
Lucas M. Ponce ◽  
Walter dos Santos ◽  
Wagner Meira ◽  
Dorgival Guedes ◽  
Daniele Lezzi ◽  
...  

Abstract High-performance computing (HPC) and massive data processing (Big Data) are two trends that are beginning to converge. In that process, aspects of hardware architectures, systems support and programming paradigms are being revisited from both perspectives. This paper presents our experience on this path of convergence and proposes a framework that addresses some of the programming issues arising from such integration. Our contribution is an environment that integrates (i) COMPSs, a programming framework for the development and execution of parallel applications for distributed infrastructures; (ii) Lemonade, a data mining and analysis tool; and (iii) HDFS, the most widely used distributed file system for Big Data systems. To validate our framework, we used Lemonade to create COMPSs applications that access data through HDFS, and compared them with equivalent applications built with Spark, a popular Big Data framework. The results show that the HDFS integration benefits COMPSs by simplifying data access and by rearranging data transfer, reducing execution time. The integration with Lemonade facilitates the use of COMPSs and may help popularize it in the Data Science community by providing efficient algorithm implementations for data-domain experts who want to develop applications at a higher level of abstraction.

2018 ◽  
Vol 88 ◽  
pp. 693-695 ◽  
Author(s):  
Yulei Wu ◽  
Yang Xiang ◽  
Jingguo Ge ◽  
Peter Muller

2017 ◽  
Vol 46 (3) ◽  
pp. 508-527 ◽  
Author(s):  
Awais Ahmad ◽  
Anand Paul ◽  
Sadia Din ◽  
M. Mazhar Rathore ◽  
Gyu Sang Choi ◽  
...  

2021 ◽  
Author(s):  
Mohsen Hadianpour ◽  
Ehsan Rezayat ◽  
Mohammad-Reza Dehaqani

Abstract Owing to rapid progress in neurophysiological recording technologies, neuroscientists face various complexities in dealing with unstructured large-scale neural data. In the neuroscience community, these complexities can create serious bottlenecks in storing, sharing, and processing neural datasets. In this article, we developed a distributed high-performance computing (HPC) framework called 'Big neuronal data framework' (BNDF) to overcome these complexities. BNDF is based on the open-source big data frameworks Hadoop and Spark, providing a flexible and scalable structure. We examined BNDF on three different large-scale electrophysiological recording datasets from nonhuman primate brains. Our results exhibited faster runtimes with scalability due to the distributed nature of BNDF. We compared BNDF results to a widely used platform, MATLAB, on equivalent computational resources. Compared with other similar methods, BNDF provides more than five times faster performance in spike sorting, a common neuroscience application.


2017 ◽  
pp. 777-806 ◽  
Author(s):  
H. Anzt ◽  
J. Dongarra ◽  
M. Gates ◽  
J. Kurzak ◽  
P. Luszczek ◽  
...  
