scholarly journals Q-Selector-Based Prefetching Method for DRAM/NVM Hybrid Main Memory System

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2158
Author(s):  
Jeong-Geun Kim ◽  
Shin-Dug Kim ◽  
Su-Kyung Yoon

This research is to design a Q-selector-based prefetching method for a dynamic random-access memory (DRAM)/ Phase-change memory (PCM)hybrid main memory system for memory-intensive big data applications generating irregular memory accessing streams. Specifically, the proposed method fully exploits the advantages of two-level hybrid memory systems, constructed as DRAM devices and non-volatile memory (NVM) devices. The Q-selector-based prefetching method is based on the Q-learning method, one of the reinforcement learning algorithms, which determines a near-optimal prefetcher for an application’s current running phase. For this, our model analyzes real-time performance status to set the criteria for the Q-learning method. We evaluate the Q-selector-based prefetching method with workloads from data mining and data-intensive benchmark applications, PARSEC-3.0 and graphBIG. Our evaluation results show that the system achieves approximately 31% performance improvement and increases the hit ratio of the DRAM-cache layer by 46% on average compared to a PCM-only main memory system. In addition, it achieves better performance results compared to the state-of-the-art prefetcher, access map pattern matching (AMPM) prefetcher, by 14.3% reduction of execution time and 12.89% of better CPI enhancement.

2020 ◽  
Vol 10 (3) ◽  
pp. 999
Author(s):  
Hyokyung Bahn ◽  
Kyungwoon Cho

Recently, non-volatile memory (NVM) has advanced as a fast storage medium, and legacy memory subsystems optimized for DRAM (dynamic random access memory) and HDD (hard disk drive) hierarchies need to be revisited. In this article, we explore the memory subsystems that use NVM as an underlying storage device and discuss the challenges and implications of such systems. As storage performance becomes close to DRAM performance, existing memory configurations and I/O (input/output) mechanisms should be reassessed. This article explores the performance of systems with NVM based storage emulated by the RAMDisk under various configurations. Through our measurement study, we make the following findings. (1) We can decrease the main memory size without performance penalties when NVM storage is adopted instead of HDD. (2) For buffer caching to be effective, judicious management techniques like admission control are necessary. (3) Prefetching is not effective in NVM storage. (4) The effect of synchronous I/O and direct I/O in NVM storage is less significant than that in HDD storage. (5) Performance degradation due to the contention of multi-threads is less severe in NVM based storage than in HDD. Based on these observations, we discuss a new PC configuration consisting of small memory and fast storage in comparison with a traditional PC consisting of large memory and slow storage. We show that this new memory-storage configuration can be an alternative solution for ever-growing memory demands and the limited density of DRAM memory. We anticipate that our results will provide directions in system software development in the presence of ever-faster storage devices.


Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 1013
Author(s):  
Hao Sun ◽  
Lan Chen ◽  
Xiaoran Hao ◽  
Chenji Liu ◽  
Mao Ni

Conventional main memory can no longer meet the requirements of low energy consumption and massive data storage in an artificial intelligence Internet of Things (AIoT) system. Moreover, the efficiency is decreased due to the swapping of data between the main memory and storage. This paper presents a hybrid storage class memory system to reduce the energy consumption and optimize IO performance. Phase change memory (PCM) brings the advantages of low static power and a large capacity to a hybrid memory system. In order to avoid the impact of poor write performance in PCM, a migration scheme implemented in the memory controller is proposed. By counting the write times and row buffer miss times in PCM simultaneously, the write-intensive data can be selected and migrated from PCM to dynamic random-access memory (DRAM) efficiently, which improves the performance of hybrid storage class memory. In addition, a fast mode with a tmpfs-based, in-memory file system is applied to hybrid storage class memory to reduce the number of data movements between memory and external storage. Experimental results show that the proposed system can reduce energy consumption by 46.2% on average compared with the traditional DRAM-only system. The fast mode increases the IO performance of the system by more than 30 times compared with the common ext3 file system.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kanghyeok Jeon ◽  
Jeeson Kim ◽  
Jin Joo Ryu ◽  
Seung-Jong Yoo ◽  
Choongseok Song ◽  
...  

AbstractConventional computing architectures are poor suited to the unique workload demands of deep learning, which has led to a surge in interest in memory-centric computing. Herein, a trilayer (Hf0.8Si0.2O2/Al2O3/Hf0.5Si0.5O2)-based self-rectifying resistive memory cell (SRMC) that exhibits (i) large selectivity (ca. 104), (ii) two-bit operation, (iii) low read power (4 and 0.8 nW for low and high resistance states, respectively), (iv) read latency (<10 μs), (v) excellent non-volatility (data retention >104 s at 85 °C), and (vi) complementary metal-oxide-semiconductor compatibility (maximum supply voltage ≤5 V) is introduced, which outperforms previously reported SRMCs. These characteristics render the SRMC highly suitable for the main memory for memory-centric computing which can improve deep learning acceleration. Furthermore, the low programming power (ca. 18 nW), latency (100 μs), and endurance (>106) highlight the energy-efficiency and highly reliable random-access memory of our SRMC. The feasible operation of individual SRMCs in passive crossbar arrays of different sizes (30 × 30, 160 × 160, and 320 × 320) is attributed to the large asymmetry and nonlinearity in the current-voltage behavior of the proposed SRMC, verifying its potential for application in large-scale and high-density non-volatile memory for memory-centric computing.


Author(s):  
Xiongpai Qin ◽  
Yueguo Chen

In the last decade, computer hardware progressed by leaps and bounds. The advancements of hardware include the application of multi-core CPUs, use of GPUs in data intensive tasks, bigger and bigger main memory capacity, maturity and production use of non-volatile memory, etc. Database systems immediately benefit from faster CPU/GPU and bigger memory and run faster. However, there are some pitfalls. For example, database systems running on multi-core processors may suffer from cache conflicts when the number of concurrently executing DB processes increases. To fully exploit advantages of new hardware to improve the performance of database systems, database software should be more or less revised. This chapter introduces some efforts of database research community in this aspect.


Author(s):  
Xiongpai Qin ◽  
Yueguo Chen

In the last decade, computer hardware progresses with leaps and bounds. The advancements of hardware include: widely application of multi-core CPUs, using of GPUs in data intensive tasks, bigger and bigger main memory capacity, maturity and production use of non-volatile memory etc. Database systems immediately benefit from faster CPU/GPU and bigger memory, and run faster. However, there are some pitfalls. For example, database systems running on multi-core processors may suffer from cache conflicts when the number of concurrently executing DB processes increases. To fully exploit advantages of new hardware to improve the performance of database systems, database software should be more or less revised. This chapter introduces some efforts of database research community in this aspect.


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1399
Author(s):  
Taepyeong Kim ◽  
Sangun Park ◽  
Yongbeom Cho

In this study, a simple and effective memory system required for the implementation of an AI chip is proposed. To implement an AI chip, the use of internal or external memory is an essential factor, because the reading and writing of data in memory occurs a lot. Those memory systems that are currently used are large in design size and complex to implement in order to handle a high speed and a wide bandwidth. Therefore, depending on the AI application, there are cases where the circuit size of the memory system is larger than that of the AI core. In this study, SDRAM, which has a lower performance than the currently used memory system but does not have a problem in operating AI, was used and all circuits were implemented digitally for simple and efficient implementation. In particular, a delay controller was designed to reduce the error due to data skew inside the memory bus to ensure stability in reading and writing data. First of all, it verified the memory system based on the You Only Look Once (YOLO) algorithm in FPGA to confirm that the memory system proposed in AI works efficiently. Based on the proven memory system, we implemented a chip using Samsung Electronics’ 65 nm process and tested it. As a result, we designed a simple and efficient memory system for AI chip implementation and verified it with hardware.


IEEE Micro ◽  
2021 ◽  
pp. 1-1
Author(s):  
Zixuan Wang ◽  
Xiao Liu ◽  
Jian Yang ◽  
Theodore Michailidis ◽  
Steven Swanson ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document