memory wall Latest Research Papers

Non-volatile memory (NVM) devices are a promising way to address the memory wall in computers. However, the current I/O software stack in operating systems becomes a performance bottleneck for applications based on NVM devices, especially key–value stores. We analyzed the characteristics of key–value stores and NVM devices and designed PMEKV, a new embedded key–value store for an NVM device simulator. The embedded processor in the NVM device manages key–value pairs, which reduces data transfer between NVM devices and key–value applications. It also cuts down data copying between user space and kernel space in the operating system, reducing the impact of the I/O software stack on the efficiency of key–value stores. The architecture, data layout, management strategy, new interface, and log strategy of PMEKV are described. Finally, a prototype of PMEKV was implemented based on PMEM. We used YCSB to test it and compare it with Redis, MongoDB, and Memcached. PMEM-Redis (Redis for PMEM) and PMEM-KV were also tested and compared with PMEKV. The results show that PMEKV has advantages in throughput and adaptability over current key–value stores.

Breaking down memory walls

Proceedings of the VLDB Endowment ◽

10.14778/3430915.3430916 ◽

2020 ◽

Vol 14 (3) ◽

pp. 241-254

Author(s):

Chen Luo ◽

Michael J. Carey

Keyword(s):

Memory Management ◽

Storage Systems ◽

Memory Allocation ◽

Adaptive Memory ◽

Component Structure ◽

Memory Wall ◽

Buffer Cache ◽

Management Architecture

Log-Structured Merge-trees (LSM-trees) have been widely used in modern NoSQL systems. Due to their out-of-place update design, LSM-trees have introduced memory walls among the memory components of multiple LSM-trees and between the write memory and the buffer cache. Optimal memory allocation among these regions is non-trivial because it is highly workload-dependent. Existing LSM-tree implementations instead adopt static memory allocation schemes due to their simplicity and robustness, sacrificing performance. In this paper, we attempt to break down these memory walls in LSM-based storage systems. We first present a memory management architecture that enables adaptive memory management. We then present a partitioned memory component structure with new flush policies to better exploit the write memory to minimize the write cost. To break down the memory wall between the write memory and the buffer cache, we further introduce a memory tuner that tunes the memory allocation between these two regions. We have conducted extensive experiments in the context of Apache AsterixDB using the YCSB and TPC-C benchmarks and we present the results here.
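To make the idea of tuning memory between the write memory and the buffer cache concrete, here is a minimal sketch. It is not the tuner from the paper: the hill-climbing loop, the function names, and the `toy_cost` model (cost on each side falling as that side gets more memory) are all assumptions chosen for illustration.

```python
def toy_cost(write_mb, cache_mb, write_weight=1.0, read_weight=1.0):
    """Hypothetical cost model (an assumption, not from the paper):
    write cost shrinks with more write memory, read cost with more cache."""
    return write_weight / write_mb + read_weight / cache_mb


def tune_write_memory(total_mb, write_mb, cost_fn,
                      step_mb=64, min_mb=64, rounds=20):
    """Hill-climb on the write-memory size; the buffer cache gets the rest.

    Each round tries moving one step of memory in either direction and
    keeps the move only if the estimated cost drops.
    """
    for _ in range(rounds):
        best = write_mb
        best_cost = cost_fn(write_mb, total_mb - write_mb)
        for cand in (write_mb - step_mb, write_mb + step_mb):
            # Keep at least min_mb on both sides of the split.
            if min_mb <= cand <= total_mb - min_mb:
                cost = cost_fn(cand, total_mb - cand)
                if cost < best_cost:
                    best, best_cost = cand, cost
        if best == write_mb:  # no neighbour is cheaper: local optimum
            break
        write_mb = best
    return write_mb, total_mb - write_mb


if __name__ == "__main__":
    # Balanced workload: memory ends up split evenly.
    print(tune_write_memory(1024, 64, toy_cost))
    # Write-heavy workload: more memory moves to the write side.
    print(tune_write_memory(
        1024, 64, lambda w, c: toy_cost(w, c, write_weight=4.0)))
```

The point of the sketch is only that the best split depends on the workload, which is why a static allocation can leave performance on the table.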

Rediscovering Majority Logic in the Post-CMOS Era: A Perspective from In-Memory Computing

Journal of Low Power Electronics and Applications ◽

10.3390/jlpea10030028 ◽

2020 ◽

Vol 10 (3) ◽

pp. 28

Author(s):

John Reuben

Keyword(s):

Integrated Circuits ◽

Logic Gates ◽

Expressive Power ◽

Boolean Logic ◽

Majority Gate ◽

Von Neumann ◽

Majority Logic ◽

Memory Array ◽

Memory Wall ◽

Parallel Prefix

As we approach the end of Moore’s law, many alternative devices are being explored to satisfy the performance requirements of modern integrated circuits. At the same time, the movement of data between processing and memory units in contemporary computing systems (‘von Neumann bottleneck’ or ‘memory wall’) necessitates a paradigm shift in the way data is processed. Emerging resistance switching memories (memristors) show promise in overcoming the ‘memory wall’ by enabling computation in the memory array. Majority logic is a type of Boolean logic which has been found to be an efficient logic primitive due to its expressive power. In this review, the efficiency of majority logic is analyzed from the perspective of in-memory computing. Recently reported methods to implement a majority gate in a Resistive RAM array are reviewed and compared. Conventional CMOS implementations accommodate heterogeneous logic gates (NAND, NOR, XOR), while in-memory implementations usually accommodate homogeneous gates (only IMPLY, only NAND, or only MAJORITY). In view of this, memristive logic families which can implement the MAJORITY gate and NOT (to make it functionally complete) are to be favored for in-memory computing. One-bit full adders implemented in a memory array using different logic primitives are compared and the efficiency of the majority-based implementation is underscored. To investigate whether the efficiency of the majority-based implementation extends to n-bit adders, eight-bit adders implemented in a memory array using different logic primitives are compared. Parallel-prefix adders implemented in majority logic can reduce the latency of in-memory adders by 50–70% when compared to IMPLY, NAND, NOR and other similar logic primitives.
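The claim that MAJORITY plus NOT is functionally complete can be checked in a few lines. The sketch below is a plain software model, not a memristive implementation; the helper names (`maj`, `not_`, `and_`, `or_`, `nand`) are chosen here for illustration.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def not_(a):
    """Single-bit inversion."""
    return 1 - a


def and_(a, b):
    # Tying one majority input to 0 leaves AND of the other two.
    return maj(a, b, 0)


def or_(a, b):
    # Tying one majority input to 1 leaves OR of the other two.
    return maj(a, b, 1)


def nand(a, b):
    # NAND is itself universal, so MAJ + NOT is functionally complete.
    return not_(and_(a, b))


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, and_(a, b), or_(a, b), nand(a, b))
```

Because NAND alone can express any Boolean function, building it from `maj` and `not_` is enough to show completeness.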

Breaking the Memory Wall for AI Chip with a New Dimension

2020 5th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM) ◽

10.1109/seeda-cecnsm49515.2020.9221795 ◽

2020 ◽

Author(s):

Eugene Tam ◽

Shenfei Jiang ◽

Paul Duan ◽

Shawn Meng ◽

Yue Pang ◽

...

Keyword(s):

Memory Wall

In-memory computing to break the memory wall

Chinese Physics B ◽

10.1088/1674-1056/ab90e7 ◽

2020 ◽

Vol 29 (7) ◽

pp. 078504

Author(s):

Xiaohe Huang ◽

Chunsen Liu ◽

Yu-Gang Jiang ◽

Peng Zhou

Keyword(s):

Memory Wall

Binary Addition in Resistance Switching Memory Array by Sensing Majority

Micromachines ◽

10.3390/mi11050496 ◽

2020 ◽

Vol 11 (5) ◽

pp. 496 ◽

Cited By ~ 2

Author(s):

John Reuben

Keyword(s):

Energy Efficient ◽

Full Adder ◽

Resistance Switching ◽

Boolean Logic ◽

Majority Gate ◽

Efficient Manner ◽

Computing Systems ◽

Von Neumann ◽

Memory Array ◽

Memory Wall

The flow of data between processing and memory units in contemporary computing systems is their main performance and energy-efficiency bottleneck, often referred to as the ‘von Neumann bottleneck’ or ‘memory wall’. Emerging resistance switching memories (memristors) show promise in overcoming the ‘memory wall’ by enabling computation in the memory array. Majority logic is a type of Boolean logic, and in many nanotechnologies, it has been found to be an efficient logic primitive. In this paper, a technique is proposed to implement a majority gate in a memory array. The majority gate is realised in an energy-efficient manner as a memory READ operation. The proposed logic family decomposes arithmetic operations into majority and NOT operations, which are implemented as memory READ and WRITE operations. A 1-bit full adder can be implemented in 6 steps (memory cycles) in a 1T–1R array, which is faster than IMPLY, NAND, NOR and other similar logic primitives.
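As an illustration of decomposing addition into majority and NOT operations, the sketch below uses a well-known majority-logic form of the full adder. It is a software model only: it does not reproduce the paper's mapping to READ and WRITE cycles or its 6-step schedule, and the function names are assumptions.

```python
def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def not_(a):
    """Single-bit inversion."""
    return 1 - a


def full_adder(a, b, cin):
    """1-bit full adder using only majority and NOT.

    carry = MAJ(a, b, cin)
    sum   = MAJ(NOT carry, MAJ(a, b, NOT cin), cin)
    """
    cout = maj(a, b, cin)
    s = maj(not_(cout), maj(a, b, not_(cin)), cin)
    return s, cout


def add(x, y, width=8):
    """Ripple-carry addition of two unsigned integers, bit by bit."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    # The final carry becomes the extra most-significant bit.
    return result | (carry << width)


if __name__ == "__main__":
    print(add(100, 55))
```

The ripple-carry chain is the simplest composition; it is used here only to check the 1-bit cell against ordinary integer addition.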

Joint Management of CPU and NVDIMM for Breaking Down the Great Memory Wall

IEEE Transactions on Computers ◽

10.1109/tc.2020.2964254 ◽

2020 ◽

Vol 69 (5) ◽

pp. 722-733

Author(s):

Chun-Feng Wu ◽

Yuan-Hao Chang ◽

Ming-Chang Yang ◽

Tei-Wei Kuo

Keyword(s):

Memory Wall ◽

Joint Management ◽

Great Memory

Data Processing and Information Classification—An In-Memory Approach

Sensors ◽

10.3390/s20061681 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1681 ◽

Cited By ~ 1

Author(s):

Milena Andrighetti ◽

Giovanna Turvani ◽

Giulia Santoro ◽

Marco Vacca ◽

Andrea Marchesin ◽

...

Keyword(s):

Information Society ◽

Electronic Devices ◽

Cmos Technology ◽

Hardware Accelerator ◽

Huge Amount ◽

Memory Wall ◽

Information Classification ◽

Enormous Amount ◽

Many Sources ◽

Problem Data

To live in the information society means to be surrounded by billions of electronic devices full of sensors that constantly acquire data. This enormous amount of data must be processed and classified. A commonly adopted solution is to send these data to server farms to be processed remotely. The drawback is a huge battery drain due to the large amount of information that must be exchanged. To compensate for this problem, data must be processed locally, near the sensor itself, but this solution requires huge computational capabilities. While microprocessors, even mobile ones, nowadays have enough computational power, their performance is severely limited by the Memory Wall problem: memories are too slow, so microprocessors cannot fetch enough data from them. A solution is the Processing-In-Memory (PIM) approach, in which new memories are designed that can process data internally, eliminating the Memory Wall problem. In this work we present an example of such a system, using the Bitmap Indexing algorithm as a case study. This algorithm is used to classify data coming from many sources in parallel. We propose a hardware accelerator designed around the Processing-In-Memory approach that can implement this algorithm and can also be reconfigured to perform other tasks or to work as standard memory. The architecture has been synthesized using CMOS technology. The results show that it is possible not only to process and classify huge amounts of data locally, but also to do so with very low power consumption.
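For readers unfamiliar with bitmap indexing, a minimal software sketch follows. It says nothing about the accelerator's architecture; the column data, function names, and the use of Python integers as bitmaps are assumptions made for illustration.

```python
def build_bitmap_index(values):
    """Map each distinct value to a bitmap; bit i is set when row i has it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


if __name__ == "__main__":
    # Two hypothetical attribute columns over the same five rows.
    colors = ["red", "blue", "red", "green", "blue"]
    sizes = ["S", "M", "M", "S", "M"]

    color_idx = build_bitmap_index(colors)
    size_idx = build_bitmap_index(sizes)

    # A conjunctive query is one bitwise AND over two bitmaps.
    print(rows_of(color_idx["red"] & size_idx["M"]))
    # A disjunctive query is one bitwise OR.
    print(rows_of(color_idx["blue"] | color_idx["green"]))
```

Queries reduce to bulk bitwise operations over long bit vectors, which is the kind of work that can be carried out close to where the data is stored.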

Data Processing and Information Classification: An In-Memory Approach

10.20944/preprints202002.0294.v1 ◽

2020 ◽

Author(s):

Milena Andrighetti ◽

Giovanna Turvani ◽

Giulia Santoro ◽

Marco Vacca ◽

Andrea Marchesin ◽

...

Keyword(s):

Information Society ◽

Electronic Devices ◽

Cmos Technology ◽

Hardware Accelerator ◽

Huge Amount ◽

Memory Wall ◽

Information Classification ◽

Enormous Amount ◽

Many Sources ◽

Problem Data

To live in the information society means to be surrounded by billions of electronic devices full of sensors that constantly acquire data. This enormous amount of data must be processed and classified. A commonly adopted solution is to send these data to server farms to be processed remotely. The drawback is a huge battery drain due to the large amount of information that must be exchanged. To compensate for this problem, data must be processed locally, near the sensor itself, but this solution requires huge computational capabilities. While microprocessors, even mobile ones, nowadays have enough computational power, their performance is severely limited by the Memory Wall problem: memories are too slow, so microprocessors cannot fetch enough data from them. A solution is the Processing-In-Memory (PIM) approach, in which new memories are designed that can process data internally, eliminating the Memory Wall problem. In this work we present an example of such a system, using the Bitmap Indexing algorithm as a case study. This algorithm is used to classify data coming from many sources in parallel. We propose a hardware accelerator designed around the Processing-In-Memory approach that can implement this algorithm and can also be reconfigured to perform other tasks or to work as standard memory. The architecture has been synthesized using CMOS technology. The results show that it is possible not only to process and classify huge amounts of data locally, but also to do so with very low power consumption.

memory wall
Recently Published Documents

Puncturing the memory wall

A New Embedded Key–Value Store for NVM Device Simulator

Breaking down memory walls

Rediscovering Majority Logic in the Post-CMOS Era: A Perspective from In-Memory Computing

Breaking the Memory Wall for AI Chip with a New Dimension

In-memory computing to break the memory wall

Binary Addition in Resistance Switching Memory Array by Sensing Majority

Joint Management of CPU and NVDIMM for Breaking Down the Great Memory Wall

Data Processing and Information Classification—An In-Memory Approach

Data Processing and Information Classification: An In-Memory Approach
