2021 ◽  
Vol 11 (18) ◽  
pp. 8476
Author(s):  
June Choi ◽  
Jaehyun Lee ◽  
Jik-Soo Kim ◽  
Jaehwan Lee

In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system “Apache Spark”. Despite its distributed memory management capability for iterative jobs and intermediate data, Spark suffers significant performance degradation when the available amount of main memory (DRAM, typically used for data caching) is limited. To address this problem, we leverage an SSD (solid-state drive) to supplement the lack of main memory bandwidth. Specifically, we present an effective optimization methodology for Apache Spark by jointly investigating the effects of changing the capacity fraction ratios of the shuffle and storage spaces in the “Spark JVM Heap Configuration” and of applying different “RDD Caching Policies” (e.g., SSD-backed memory caching). Our extensive experimental results show that the proposed optimization techniques improve overall performance by up to 42%.
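As a rough illustration of what tuning the heap fractions means, the sketch below estimates how a Spark executor heap is divided under the unified memory model (Spark 1.6 and later), where `spark.memory.fraction` sizes the shared execution/storage region and `spark.memory.storageFraction` sets the part of it protected for cached data. The abstract does not state which Spark version or settings the authors used, so this is an assumption, and the function name `unified_memory_regions` is hypothetical.

```python
# Fixed amount Spark's unified memory manager reserves before applying fractions.
RESERVED_MB = 300


def unified_memory_regions(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate heap regions (in MB) for Spark's unified memory model.

    memory_fraction  mirrors spark.memory.fraction (default 0.6)
    storage_fraction mirrors spark.memory.storageFraction (default 0.5)

    Hypothetical helper for illustration; not taken from the paper.
    """
    if heap_mb <= RESERVED_MB:
        raise ValueError("heap must be larger than the reserved region")
    if not (0.0 < memory_fraction <= 1.0 and 0.0 <= storage_fraction <= 1.0):
        raise ValueError("fractions must lie in the unit interval")

    usable = heap_mb - RESERVED_MB          # heap left after the reservation
    unified = usable * memory_fraction      # shared execution + storage region
    storage = unified * storage_fraction    # cached blocks protected from eviction
    execution = unified - storage           # shuffle, join, sort, aggregation buffers
    user = usable - unified                 # user data structures and metadata

    return {
        "reserved": float(RESERVED_MB),
        "unified": unified,
        "storage": storage,
        "execution": execution,
        "user": user,
    }


if __name__ == "__main__":
    # Compare a storage-heavy and a shuffle-heavy split of the same heap.
    for storage_fraction in (0.75, 0.25):
        regions = unified_memory_regions(4396, 0.5, storage_fraction)
        print(storage_fraction, regions)
```

Note that the storage/execution boundary in this model is soft: either side can borrow unused space from the other, so the numbers are starting budgets rather than hard limits. An SSD-backed caching policy of the kind the abstract mentions would roughly correspond to persisting RDDs with a storage level such as `MEMORY_AND_DISK` while Spark's local directory sits on the SSD, though the abstract does not confirm the exact mechanism.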


2016 ◽  
Vol 4 (1) ◽  
pp. 61-71
Author(s):  
Hirotaka Kawata ◽  
Gaku Nakagawa ◽  
Shuichi Oikawa

The performance of mobile devices such as smartphones and tablets has been rapidly improving in recent years. However, these improvements have seriously increased power consumption. One of the greatest challenges is to achieve efficient power management for battery-equipped mobile devices. To solve this problem, the authors focus on emerging non-volatile memory (NVM), which has been receiving increasing attention in recent years. Since its performance is comparable with that of DRAM, it is possible to replace the main memory with NVM, thereby reducing power consumption. However, the price and capacity of NVM are problematic. Therefore, the authors provide a large memory space without performance degradation by combining NVM with other memory devices. In this study, they propose a design for non-volatile main memory systems that use DRAM as a swap space. This enables both high-performance and energy-efficient memory management through dynamic power management in NVM and DRAM.


2018 ◽  
Vol 88 ◽  
pp. 1-15
Author(s):  
Lin Wu ◽  
Qingfeng Zhuge ◽  
Edwin Hsing-Mean Sha ◽  
Xianzhang Chen ◽  
Linfeng Cheng

2021 ◽  
Vol 17 (1) ◽  
pp. 1-22
Author(s):  
Wen Cheng ◽  
Chunyan Li ◽  
Lingfang Zeng ◽  
Yingjin Qian ◽  
Xi Li ◽  
...  

In high-performance computing (HPC), data and metadata are stored on special server nodes, and client applications access the servers’ data and metadata through a network, which induces network latencies and resource contention. These server nodes are typically equipped with (slow) magnetic disks, while the client nodes store temporary data on fast SSDs or even on non-volatile main memory (NVMM). Therefore, the full potential of parallel file systems can only be reached if fast client-side storage devices are integrated into the overall storage architecture. In this article, we propose an NVMM-based hierarchical persistent client cache for the Lustre file system (NVMM-LPCC for short). NVMM-LPCC implements two caching modes: a read-and-write mode (RW-NVMM-LPCC for short) and a read-only mode (RO-NVMM-LPCC for short). NVMM-LPCC integrates with the Lustre Hierarchical Storage Management (HSM) solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, while maintaining a global unified namespace for the entire Lustre file system. The evaluation results presented in this article show that NVMM-LPCC can increase the average read throughput by up to 35.80 times and the average write throughput by up to 9.83 times compared with the native Lustre system, while providing excellent scalability.

