SM3: A dynamically partitionable multicomputer system with switchable main memory modules

Author(s):  
Tinghe Fei ◽  
Chaitanya K. Baru ◽  
Stanley Y. W. Su


2021 ◽
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Qingfeng Zhuge ◽  
Hao Zhang ◽  
Edwin Hsing-Mean Sha ◽  
Rui Xu ◽  
Jun Liu ◽  
...  

Efficiently accessing remote file data remains a challenging problem for data processing systems. Developments in non-volatile dual in-line memory modules (NVDIMMs), in-memory file systems, and RDMA networks provide new opportunities for solving the problem of remote data access. The general understanding of NVDIMMs, such as Intel Optane DC Persistent Memory (DCPM), is that they expand main memory capacity at the cost of performance several times lower than that of DRAM. With the in-depth exploration presented in this paper, however, we show the interesting finding that the potential of NVDIMMs for high-performance, remote in-memory accesses can be revealed through careful design. We explore multiple architectural structures for accessing remote NVDIMMs in a real system using Optane DCPM, and compare the performance of the various structures. Experiments show significant performance gaps among different ways of exposing NVDIMMs as memory address space accessible through an RDMA interface. Furthermore, we design and implement a prototype user-level, in-memory file system, RIMFS, in device DAX mode on Optane DCPM. Comparing against the DAX-supported Linux file system Ext4-DAX, we show that the performance of remote reads on RIMFS over RDMA is 11.44× higher than that on remote Ext4-DAX on average. The experimental results also show that the performance of remote accesses on RIMFS is maintained on a heavily loaded data server with CPU utilization as high as 90%, while the performance of remote reads on Ext4-DAX drops significantly by 49.3%, and the performance of local reads on Ext4-DAX drops even more, by 90.1%. The performance comparisons of writes exhibit the same trends.
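The key mechanism this abstract relies on, mapping a device-DAX region of persistent memory and registering it for one-sided RDMA reads so clients bypass the server CPU, can be sketched with standard mmap and libibverbs calls. This is a minimal illustration under assumed names (the /dev/dax0.0 path and region size are placeholders), not the RIMFS implementation itself:

```c
/* Minimal sketch: expose a device-DAX persistent-memory region for
 * one-sided RDMA READs. Assumes a /dev/dax0.0 device exists; error
 * handling and connection setup are abbreviated. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <infiniband/verbs.h>

#define REGION_SIZE (1UL << 30)  /* 1 GiB, placeholder */

int main(void)
{
    /* Map the persistent-memory device directly into our address space. */
    int fd = open("/dev/dax0.0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }
    void *pmem = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (pmem == MAP_FAILED) { perror("mmap"); return 1; }

    /* Open the first RDMA device and allocate a protection domain. */
    struct ibv_device **devs = ibv_get_device_list(NULL);
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* Register the DAX mapping so remote peers can issue RDMA READs
     * against it without involving the server CPU. */
    struct ibv_mr *mr = ibv_reg_mr(pd, pmem, REGION_SIZE,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ);
    if (!mr) { perror("ibv_reg_mr"); return 1; }

    /* A client needs (addr, rkey) to address this region remotely;
     * exchanging them out of band (e.g., over TCP) is omitted here. */
    printf("region addr=%p rkey=0x%x\n", pmem, mr->rkey);

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}
```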


2019 ◽  
Vol 42 ◽  
Author(s):  
Olya Hakobyan ◽  
Sen Cheng

Abstract: We fully support dissociating the subjective experience from the memory contents in recognition memory, as Bastin et al. posit in the target article. However, having two generic memory modules with qualitatively different functions is not mandatory and is in fact inconsistent with experimental evidence. We propose that quantitative differences in the properties of the memory modules can account for the apparent dissociation of recollection and familiarity along anatomical lines.


Author(s):  
Huazhuang Yao ◽  
Yongyan Wang ◽  
Shuai Wang ◽  
Kun Li ◽  
Chao Guo

2021 ◽  
Author(s):  
Bashar Romanous ◽  
Skyler Windh ◽  
Ildar Absalyamov ◽  
Prerna Budhkar ◽  
Robert Halstead ◽  
...  

Abstract: The join and group-by aggregation are two memory-intensive operators that affect the performance of relational databases. Hashing is a common approach used to implement both operators. Recent paradigm shifts in multi-core processor architectures have reinvigorated research into how the join and group-by aggregation operators can leverage these advances. However, the poor spatial locality of the hashing approach has hindered performance on multi-core processor architectures, which rely on large cache hierarchies for latency mitigation. Multithreaded architectures can better cope with poor spatial locality by masking memory latency with many outstanding requests. Nevertheless, the number of parallel threads, even in the most advanced multithreaded processors such as UltraSPARC, is not enough to fully cover the main memory access latency. In this paper, we explore the hardware reconfigurability of FPGAs to enable deeper execution pipelines that maintain hundreds (instead of tens) of outstanding memory requests across four FPGAs, drastically increasing concurrency and throughput. We present two end-to-end in-memory accelerators for the join and group-by aggregation operators using FPGAs. Both accelerators use massive multithreading to mask the long memory delays of traversing linked-list data structures, while concurrently managing hundreds of thread states across four FPGAs locally. We explore how content-addressable memories (CAMs) can be intermixed within our multithreaded designs to act as a synchronizing cache, which enforces locks and merges jobs together before they are written to memory. Throughput results for our hash-join accelerator show a speedup between 2× and 3.4× over the best multi-core approaches with comparable memory bandwidths on uniform and skewed datasets. The accelerator for the hash-based group-by aggregation operator demonstrates that leveraging CAMs achieves an average speedup of 3.3×, with a best case of 9.4×, in terms of throughput over CPU implementations across five types of data distributions.
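The data structure at the heart of this design is a chained hash table whose bucket lists are traversed during the probe phase; the latency masking described above comes from keeping hundreds of such traversals in flight in hardware. As a point of reference, a minimal software analogue of the build and probe phases (sequential, with hypothetical table sizes and key types) looks like this:

```c
/* Software analogue of a chained hash join: build a hash table over
 * relation R, then probe it with each tuple of S. The FPGA design keeps
 * hundreds of these bucket-list traversals in flight to hide memory
 * latency; this sketch runs them one at a time. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS (1u << 20)  /* placeholder table size */

struct node {
    uint64_t key;
    uint64_t payload;
    struct node *next;  /* bucket chain: the linked list traversed on probe */
};

static struct node *table[NBUCKETS];

static uint32_t hash(uint64_t k)
{
    return (uint32_t)((k * 0x9E3779B97F4A7C15ULL) >> 44);  /* top 20 bits */
}

/* Build phase: insert each R-tuple at the head of its bucket chain. */
static void build(uint64_t key, uint64_t payload)
{
    struct node *n = malloc(sizeof *n);
    uint32_t b = hash(key) % NBUCKETS;
    n->key = key;
    n->payload = payload;
    n->next = table[b];
    table[b] = n;
}

/* Probe phase: walk the bucket chain, emitting every match. Each
 * n->next dereference is the random memory access that the accelerator
 * overlaps across many concurrent hardware threads. */
static void probe(uint64_t key)
{
    for (struct node *n = table[hash(key) % NBUCKETS]; n; n = n->next)
        if (n->key == key)
            printf("match: key=%llu payload=%llu\n",
                   (unsigned long long)key, (unsigned long long)n->payload);
}

int main(void)
{
    build(42, 100); build(7, 200); build(42, 300);
    probe(42);  /* emits two matches */
    return 0;
}
```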


2021 ◽  
Vol 11 (5) ◽  
pp. 2405
Author(s):  
Yuxiang Sun ◽  
Tianyi Zhao ◽  
Seulgi Yoon ◽  
Yongju Lee

The Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based resource description framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure that combines a disk-based 3D R*-tree with an extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. The results indicate that our method outperforms the existing methods because it quickly obtains target results by avoiding unnecessary data scanning, and it reduces the amount of main memory required to load filtering results.
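The pruning of false hits described here relies on spatial index structures over triples encoded as 3D points. A minimal 3D k-d tree with an axis-aligned box range query illustrates the idea, since subtrees whose splitting plane lies outside the query box are never visited; this is an illustrative sketch, not the paper's combined R*-tree/k-d tree structure:

```c
/* Minimal 3D k-d tree with a box range query. Subtrees that cannot
 * intersect the query box are pruned, which is the effect the paper's
 * two-step index exploits to skip false hits. */
#include <stdio.h>
#include <stdlib.h>

struct kdnode {
    double pt[3];            /* an RDF triple encoded as a 3D point */
    struct kdnode *lo, *hi;
};

static struct kdnode *insert(struct kdnode *n, const double pt[3], int depth)
{
    if (!n) {
        n = calloc(1, sizeof *n);
        for (int i = 0; i < 3; i++) n->pt[i] = pt[i];
        return n;
    }
    int axis = depth % 3;    /* cycle through the three dimensions */
    if (pt[axis] < n->pt[axis]) n->lo = insert(n->lo, pt, depth + 1);
    else                        n->hi = insert(n->hi, pt, depth + 1);
    return n;
}

/* Report every point inside the axis-aligned box [lo, hi]. */
static void range(const struct kdnode *n, const double lo[3],
                  const double hi[3], int depth)
{
    if (!n) return;
    int axis = depth % 3, inside = 1;
    for (int i = 0; i < 3; i++)
        if (n->pt[i] < lo[i] || n->pt[i] > hi[i]) inside = 0;
    if (inside)
        printf("hit: (%g, %g, %g)\n", n->pt[0], n->pt[1], n->pt[2]);
    /* Prune: descend only into subtrees that can intersect the box. */
    if (lo[axis] <= n->pt[axis]) range(n->lo, lo, hi, depth + 1);
    if (hi[axis] >= n->pt[axis]) range(n->hi, lo, hi, depth + 1);
}

int main(void)
{
    struct kdnode *root = NULL;
    double pts[][3] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
    for (int i = 0; i < 3; i++) root = insert(root, pts[i], 0);
    double lo[3] = {0, 0, 0}, hi[3] = {5, 5, 7};
    range(root, lo, hi, 0);  /* prints the first two points */
    return 0;
}
```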


Author(s):  
Muhammad Attahir Jibril ◽  
Philipp Götze ◽  
David Broneske ◽  
Kai-Uwe Sattler

Abstract: After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolios, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but it guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for easy caching. Our experiments show that, if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results also hold when selective caching is used for parallel workloads.
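The core idea of selective caching, statically pinning hot upper-level tree nodes in DRAM while dynamically caching lower-level nodes fetched from persistent memory, can be sketched as follows. The node layout, cache capacity, and the clock-style replacement policy are illustrative assumptions, not the paper's Elf-specific design:

```c
/* Sketch of selective caching: nodes at or above PIN_LEVEL live
 * permanently in DRAM; deeper nodes are fetched from persistent memory
 * on demand into a small DRAM cache with second-chance (clock)
 * replacement. Sizes are illustrative assumptions. */
#include <stdint.h>
#include <string.h>

#define NODE_SIZE   256   /* bytes per tree node, placeholder */
#define CACHE_SLOTS 1024
#define PIN_LEVEL   2     /* tree levels 0..2 are statically pinned */

struct slot {
    uint64_t node_id;     /* which PMem node occupies this slot */
    int valid, referenced;
    uint8_t data[NODE_SIZE];
};

static struct slot cache[CACHE_SLOTS];
static size_t clock_hand;
static const uint8_t *pmem_base;  /* mmap'ed PMem pool, set at startup */
static const uint8_t *pinned;     /* upper levels, copied to DRAM once */

/* Return a DRAM pointer to the node, copying from PMem on a miss. */
static const uint8_t *get_node(uint64_t node_id, int level)
{
    /* Static part: upper levels were copied to DRAM at load time
     * (assumes their node ids are contiguous from 0). */
    if (level <= PIN_LEVEL)
        return pinned + node_id * NODE_SIZE;

    /* Dynamic part: look for the node in the DRAM cache. */
    for (size_t i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].valid && cache[i].node_id == node_id) {
            cache[i].referenced = 1;
            return cache[i].data;
        }

    /* Miss: evict with the clock policy, then copy the node from PMem. */
    while (cache[clock_hand].valid && cache[clock_hand].referenced) {
        cache[clock_hand].referenced = 0;
        clock_hand = (clock_hand + 1) % CACHE_SLOTS;
    }
    struct slot *s = &cache[clock_hand];
    clock_hand = (clock_hand + 1) % CACHE_SLOTS;
    s->node_id = node_id;
    s->valid = 1;
    s->referenced = 1;
    memcpy(s->data, pmem_base + node_id * NODE_SIZE, NODE_SIZE);
    return s->data;
}

int main(void)
{
    /* Demo: stand in for the PMem mapping with a static buffer. */
    static uint8_t fake_pmem[1 << 20];
    pmem_base = fake_pmem;
    pinned = fake_pmem;                   /* pretend upper levels were copied */
    const uint8_t *n = get_node(42, 5);   /* deep node: goes through cache */
    (void)n;
    return 0;
}
```

Reads to the pinned upper levels never touch persistent memory, which is what lets a well-configured cache approach pure-DRAM speeds while writes still reach the persistent copy.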

