Memory management techniques for large-scale persistent-main-memory systems

2017 ◽  
Vol 10 (11) ◽  
pp. 1166-1177 ◽  
Author(s):  
Ismail Oukid ◽  
Daniel Booss ◽  
Adrien Lespinasse ◽  
Wolfgang Lehner ◽  
Thomas Willhalm ◽  
...  
2016 ◽  
Vol 4 (1) ◽  
pp. 61-71
Author(s):  
Hirotaka Kawata ◽  
Gaku Nakagawa ◽  
Shuichi Oikawa

The performance of mobile devices such as smartphones and tablets has been rapidly improving in recent years. However, these improvements have been seriously affecting power consumption. One of the greatest challenges is to achieve efficient power management for battery-equipped mobile devices. To solve this problem, the authors focus on the emerging non-volatile memory (NVM), which has been receiving increasing attention in recent years. Since its performance is comparable with that of DRAM, it is possible to replace the main memory with NVM, thereby reducing power consumption. However, the price and capacity of NVM are problematic. Therefore, the authors provide a large memory space without performance degradation by combining NVM with other memory devices. In this study, they propose a design for non-volatile main memory systems that use DRAM as a swap space. This enables both high performance and energy efficient memory management through dynamic power management in NVM and DRAM.


2020 ◽  
Vol 10 (3) ◽  
pp. 999
Author(s):  
Hyokyung Bahn ◽  
Kyungwoon Cho

Recently, non-volatile memory (NVM) has advanced as a fast storage medium, and legacy memory subsystems optimized for DRAM (dynamic random access memory) and HDD (hard disk drive) hierarchies need to be revisited. In this article, we explore the memory subsystems that use NVM as an underlying storage device and discuss the challenges and implications of such systems. As storage performance becomes close to DRAM performance, existing memory configurations and I/O (input/output) mechanisms should be reassessed. This article explores the performance of systems with NVM based storage emulated by the RAMDisk under various configurations. Through our measurement study, we make the following findings. (1) We can decrease the main memory size without performance penalties when NVM storage is adopted instead of HDD. (2) For buffer caching to be effective, judicious management techniques like admission control are necessary. (3) Prefetching is not effective in NVM storage. (4) The effect of synchronous I/O and direct I/O in NVM storage is less significant than that in HDD storage. (5) Performance degradation due to the contention of multi-threads is less severe in NVM based storage than in HDD. Based on these observations, we discuss a new PC configuration consisting of small memory and fast storage in comparison with a traditional PC consisting of large memory and slow storage. We show that this new memory-storage configuration can be an alternative solution for ever-growing memory demands and the limited density of DRAM memory. We anticipate that our results will provide directions in system software development in the presence of ever-faster storage devices.


2022 ◽  
Vol 15 (2) ◽  
pp. 1-33
Author(s):  
Mikhail Asiatici ◽  
Paolo Ienne

Applications such as large-scale sparse linear algebra and graph analytics are challenging to accelerate on FPGAs due to the short irregular memory accesses, resulting in low cache hit rates. Nonblocking caches reduce the bandwidth required by misses by requesting each cache line only once, even when there are multiple misses corresponding to it. However, such reuse mechanism is traditionally implemented using an associative lookup. This limits the number of misses that are considered for reuse to a few tens, at most. In this article, we present an efficient pipeline that can process and store thousands of outstanding misses in cuckoo hash tables in on-chip SRAM with minimal stalls. This brings the same bandwidth advantage as a larger cache for a fraction of the area budget, because outstanding misses do not need a data array, which can significantly speed up irregular memory-bound latency-insensitive applications. In addition, we extend nonblocking caches to generate variable-length bursts to memory, which increases the bandwidth delivered by DRAMs and their controllers. The resulting miss-optimized memory system provides up to 25% speedup with 24× area reduction on 15 large sparse matrix-vector multiplication benchmarks evaluated on an embedded and a datacenter FPGA system.


2021 ◽  
Vol 11 (18) ◽  
pp. 8476
Author(s):  
June Choi ◽  
Jaehyun Lee ◽  
Jik-Soo Kim ◽  
Jaehwan Lee

In this paper, we present several optimization strategies that can improve the overall performance of the distributed in-memory computing system, “Apache Spark”. Despite its distributed memory management capability for iterative jobs and intermediate data, Spark has a significant performance degradation problem when the available amount of main memory (DRAM, typically used for data caching) is limited. To address this problem, we leverage an SSD (solid-state drive) to supplement the lack of main memory bandwidth. Specifically, we present an effective optimization methodology for Apache Spark by collectively investigating the effects of changing the capacity fraction ratios of the shuffle and storage spaces in the “Spark JVM Heap Configuration” and applying different “RDD Caching Policies” (e.g., SSD-backed memory caching). Our extensive experimental results show that by utilizing the proposed optimization techniques, we can improve the overall performance by up to 42%.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Sol Ji Kang ◽  
Sang Yeon Lee ◽  
Keon Myung Lee

With problem size and complexity increasing, several parallel and distributed programming models and frameworks have been developed to efficiently handle such problems. This paper briefly reviews the parallel computing models and describes three widely recognized parallel programming frameworks: OpenMP, MPI, and MapReduce. OpenMP is the de facto standard for parallel programming on shared memory systems. MPI is the de facto industry standard for distributed memory systems. MapReduce framework has become the de facto standard for large scale data-intensive applications. Qualitative pros and cons of each framework are known, but quantitative performance indexes help get a good picture of which framework to use for the applications. As benchmark problems to compare those frameworks, two problems are chosen: all-pairs-shortest-path problem and data join problem. This paper presents the parallel programs for the problems implemented on the three frameworks, respectively. It shows the experiment results on a cluster of computers. It also discusses which is the right tool for the jobs by analyzing the characteristics and performance of the paradigms.


2017 ◽  
Vol 17 (2) ◽  
pp. 16-26 ◽  
Author(s):  
Chien-Chung Ho ◽  
Yu-Ming Chang ◽  
Yuan-Hao Chang ◽  
Hsiu-Chang Chen ◽  
Tei-Wei Kuo

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 167351-167373
Author(s):  
Anna Pupykina ◽  
Giovanni Agosta

2010 ◽  
pp. 167-189
Author(s):  
Mahmood Shah ◽  
Steve Clarke

Project management is an important concept in business development. Often, the development of information technology or managing change will be run as projects, and managed using various well established project management techniques and tools. E-banking is often treated like a large scale project and broken into several small scale projects to manage various different aspects (called project portfolios), ranging from BPR to make the organization ready for online operations, to actual implementation of e-banking technologies.


Sign in / Sign up

Export Citation Format

Share Document