Zen

2021 ◽  
Vol 14 (5) ◽  
pp. 835-848
Author(s):  
Gang Liu ◽  
Leying Chen ◽  
Shimin Chen

Emerging Non-Volatile Memory (NVM) technologies like 3D XPoint promise significant performance potential for OLTP databases. However, transactional databases need to be redesigned because the key assumptions that non-volatile storage is orders of magnitude slower than DRAM and only supports block-oriented access no longer hold. NVM is byte-addressable, almost as fast as DRAM, and much (4-16x) larger in capacity than DRAM. These characteristics make it possible to build an OLTP database entirely in NVM main memory. This paper studies the structure of OLTP engines with hybrid NVM and DRAM memory. We observe three challenges in designing an OLTP engine for NVM: tuple metadata modifications, NVM write redundancy, and NVM space management. We propose Zen, a high-throughput log-free OLTP engine for NVM. Zen addresses the three design challenges with three novel techniques: a metadata-enhanced tuple cache, log-free persistent transactions, and light-weight NVM space management. Experimental results on a real machine equipped with Intel Optane DC Persistent Memory show that Zen achieves up to 10.1x improvement over existing solutions, supports an OLTP database as large as the NVM capacity, and recovers quickly from failures.
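
A minimal sketch of the log-free commit idea described in this abstract, assuming out-of-place tuple versions and a per-transaction commit flag residing in NVM (an illustration, not Zen's actual code; persist_range, TupleVersion, and TxnDescriptor are hypothetical names):

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// On real PM this would issue CLWB/CLFLUSHOPT per cache line followed by an
// SFENCE; kept as a stub so the sketch compiles anywhere.
inline void persist_range(const void* /*addr*/, std::size_t /*len*/) {}

struct TupleVersion {
    std::uint64_t txn_id;   // transaction that produced this version
    char payload[48];       // fixed-size tuple body (illustrative)
};

struct TxnDescriptor {
    std::atomic<std::uint64_t> committed{0};  // resides in NVM; 1 => durable
};

// Write and persist the new tuple versions first, then persist a single
// commit flag. A crash before the flag persists means recovery ignores the
// versions, so no redo/undo log is needed.
void commit(TxnDescriptor& txn, TupleVersion* versions, std::size_t n,
            std::uint64_t txn_id, const char (&new_payload)[48]) {
    for (std::size_t i = 0; i < n; ++i) {
        versions[i].txn_id = txn_id;
        std::memcpy(versions[i].payload, new_payload, sizeof new_payload);
        persist_range(&versions[i], sizeof versions[i]);
    }
    txn.committed.store(1, std::memory_order_release);
    persist_range(&txn, sizeof txn);   // durability point of the transaction
}
```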

Author(s):  
Muhammad Attahir Jibril ◽  
Philipp Götze ◽  
David Broneske ◽  
Kai-Uwe Sattler

After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolios, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for easy caching. Our experiments show that, if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results are also reflected when selective caching is used for parallel workloads.
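
A hedged sketch of such a selective cache, assuming tree nodes can be copied into DRAM by identifier (Node, NodeId, and load_from_pm are placeholders, not the paper's API): the upper tree levels are pinned statically, and everything else goes through a small LRU-managed dynamic cache backed by the persistent copy.

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

using NodeId = std::uint64_t;
struct Node { std::vector<std::uint8_t> bytes; };

class SelectiveCache {
public:
    explicit SelectiveCache(std::size_t dynamic_capacity) : capacity_(dynamic_capacity) {}

    // Static part: nodes of the upper levels are copied to DRAM once, never evicted.
    void pin(NodeId id, Node node) { pinned_[id] = std::move(node); }

    // Dynamic part: LRU over the remaining nodes; load_from_pm stands in for a
    // read of the node's persistent copy.
    const Node& get(NodeId id, Node (*load_from_pm)(NodeId)) {
        if (auto it = pinned_.find(id); it != pinned_.end()) return it->second;
        if (auto it = cached_.find(id); it != cached_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second.second);  // mark as MRU
            return it->second.first;
        }
        if (cached_.size() == capacity_) {        // evict least-recently-used node
            cached_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(id);
        auto [it, inserted] = cached_.emplace(id, std::make_pair(load_from_pm(id), lru_.begin()));
        return it->second.first;
    }

private:
    std::size_t capacity_;
    std::unordered_map<NodeId, Node> pinned_;
    std::list<NodeId> lru_;
    std::unordered_map<NodeId, std::pair<Node, std::list<NodeId>::iterator>> cached_;
};
```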


2021 ◽  
Vol 14 (10) ◽  
pp. 1872-1885
Author(s):  
Baoyue Yan ◽  
Xuntao Cheng ◽  
Bo Jiang ◽  
Shibin Chen ◽  
Canfang Shang ◽  
...  

The recent byte-addressable and large-capacity commercialized persistent memory (PM) is promising to drive database as a service (DBaaS) into uncharted territories. This paper investigates how to leverage PM to revisit the conventional LSM-tree based OLTP storage engines designed for the DRAM-SSD hierarchy for DBaaS instances. Specifically, we (1) propose a light-weight PM allocator named Halloc customized for LSM-trees, (2) build a high-performance Semi-persistent Memtable utilizing the persistent in-memory writes of PM, (3) design a concurrent commit algorithm named Reorder Ring to achieve log-free transaction processing for OLTP workloads, and (4) present a Global Index as the new globally sorted persistent level with non-blocking in-memory compaction. The design of the Reorder Ring and the Semi-persistent Memtable achieves fast writes without synchronized logging overheads and near-instant recovery time. Moreover, the design of the Semi-persistent Memtable and the Global Index with in-memory compaction enables byte-addressable persistent levels in PM, which significantly reduces read and write amplification as well as background compaction overheads. The overall evaluation shows that our proposal over the PM-SSD hierarchy outperforms the baseline by up to 3.8x on the YCSB benchmark and by 2x on the TPC-C benchmark.
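
The following illustrative sketch (not the engine's code; pm_alloc and persist_range stand in for a PM allocator such as the paper's Halloc and for CLWB/SFENCE flushing) shows why persistent in-memory writes can make a separate write-ahead log unnecessary: the record becomes durable in PM before the volatile index publishes it, so recovery can rebuild the index by scanning PM.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <new>
#include <string>

inline void persist_range(const void*, std::size_t) {}  // CLWB + SFENCE on real PM

struct PmRecord {
    std::uint32_t key_len;
    std::uint32_t val_len;
    // key bytes followed by value bytes are stored right after this header
};

// Stand-in for an allocator over a persistent-memory arena.
char* pm_alloc(std::size_t bytes) { return static_cast<char*>(::operator new(bytes)); }

std::map<std::string, PmRecord*> volatile_index;  // DRAM index, rebuilt on recovery

void put(const std::string& key, const std::string& value) {
    std::size_t total = sizeof(PmRecord) + key.size() + value.size();
    char* buf = pm_alloc(total);
    auto* rec = new (buf) PmRecord{static_cast<std::uint32_t>(key.size()),
                                   static_cast<std::uint32_t>(value.size())};
    std::memcpy(buf + sizeof(PmRecord), key.data(), key.size());
    std::memcpy(buf + sizeof(PmRecord) + key.size(), value.data(), value.size());
    persist_range(buf, total);    // record is durable before it is indexed
    volatile_index[key] = rec;    // no separate write-ahead-log entry is written
}
```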


2013 ◽  
Vol 10 (3) ◽  
pp. 60-81 ◽  
Author(s):  
Yunlei Sun ◽  
Xiuquan Qiao ◽  
Wei Tan ◽  
Bo Cheng ◽  
Ruisheng Shi ◽  
...  

In order to build a low-latency and light-weight publish/subscribe (pub/sub) system for delay-sensitive IoT services, the authors propose an efficient and scalable broker architecture named Grid Quorum-based pub/sub (GQPS). As a key component in the Event-Driven Service-Oriented Architecture (EDSOA) for IoT services, this architecture organizes multiple pub/sub brokers into a quorum-based peer-to-peer topology for efficient topic searching. It also leverages a topic searching algorithm and a one-hop caching strategy to minimize search latency. Light-weight RESTful interfaces make the authors’ GQPS more suitable for IoT services. Cost analysis and an experimental study demonstrate that GQPS achieves a significant performance gain in search satisfaction without compromising search cost. The authors apply the proposed GQPS in the District Heating Control and Information Service System in Beijing, China, which further validates its effectiveness.
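
A generic grid-quorum placement, sketched here for illustration (it is not necessarily GQPS's exact protocol): with n brokers arranged in a sqrt(n) x sqrt(n) grid, advertising a topic along one row and searching along one column guarantees an intersection at exactly one broker, which is what keeps topic searches cheap.

```cpp
#include <vector>

// Brokers are numbered 0..side*side-1 in row-major order.

// Advertisement quorum: every broker in the publisher's row.
std::vector<int> advertise_quorum(int broker_id, int side) {
    std::vector<int> row;
    int r = broker_id / side;
    for (int c = 0; c < side; ++c) row.push_back(r * side + c);
    return row;
}

// Search quorum: every broker in the subscriber's column.
std::vector<int> search_quorum(int broker_id, int side) {
    std::vector<int> col;
    int c = broker_id % side;
    for (int r = 0; r < side; ++r) col.push_back(r * side + c);
    return col;
}

// Any row and any column share exactly one broker, so a search that probes
// O(sqrt(n)) brokers is guaranteed to meet the advertisement of any topic.
```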


2019 ◽  
Author(s):  
Rafael Murari ◽  
João Paulo Carvalho ◽  
Guido Araujo ◽  
Alexandro Baldassin

Emerging persistent memory (PM) technologies aim to eliminate the gap between main memory and storage. Nevertheless, their adoption requires measures to guarantee consistency, since crash failures might leave the program in an unrecoverable state. In this context, durable transactions are one of the main approaches investigated to ease the adoption of PM. However, today's implementations are based exclusively on software (SW) or hardware (HW), which might degrade system performance. This paper presents NV-PhTM, a transactional system for PM that delivers the best of both HW and SW transactions by dynamically switching the execution mode according to the application's characteristics.
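
A hedged sketch of phase-based switching between hardware and software transactions (the counters, window size, and threshold below are illustrative assumptions, not NV-PhTM's actual heuristic):

```cpp
#include <atomic>
#include <cstdint>

enum class Phase { Hardware, Software };

struct PhaseController {
    std::atomic<Phase> phase{Phase::Hardware};
    std::atomic<std::uint64_t> hw_aborts{0};
    std::atomic<std::uint64_t> hw_commits{0};

    // Called after each hardware attempt: over a window of 1024 attempts,
    // fall back to the software phase if more than a quarter aborted,
    // otherwise stay in (or return to) the hardware phase.
    void record_hw(bool committed) {
        (committed ? hw_commits : hw_aborts).fetch_add(1);
        std::uint64_t a = hw_aborts.load(), c = hw_commits.load();
        if (a + c >= 1024) {
            phase.store(a * 4 > a + c ? Phase::Software : Phase::Hardware);
            hw_aborts.store(0);
            hw_commits.store(0);
        }
    }

    Phase current() const { return phase.load(); }
};
```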


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Qingfeng Zhuge ◽  
Hao Zhang ◽  
Edwin Hsing-Mean Sha ◽  
Rui Xu ◽  
Jun Liu ◽  
...  

Efficiently accessing remote file data remains a challenging problem for data processing systems. Developments in non-volatile dual in-line memory modules (NVDIMMs), in-memory file systems, and RDMA networks provide new opportunities for solving the problem of remote data access. A general understanding about NVDIMMs, such as Intel Optane DC Persistent Memory (DCPM), is that they expand main memory capacity at the cost of performance several times lower than DRAM. With the in-depth exploration presented in this paper, however, we show an interesting finding: the potential of NVDIMMs for high-performance, remote in-memory accesses can be revealed through careful design. We explore multiple architectural structures for accessing remote NVDIMMs in a real system using Optane DCPM and compare the performance of the various structures. Experiments show significant performance gaps among different ways of using NVDIMMs as memory address space accessible through an RDMA interface. Furthermore, we design and implement a prototype of a user-level, in-memory file system, RIMFS, in device DAX mode on Optane DCPM. By comparing against the DAX-supported Linux file system, Ext4-DAX, we show that the performance of remote reads on RIMFS over RDMA is 11.44x higher than on remote Ext4-DAX on average. The experimental results also show that the performance of remote accesses on RIMFS is maintained on a heavily loaded data server with CPU utilization as high as 90%, while the performance of remote reads on Ext4-DAX drops significantly, by 49.3%, and the performance of local reads on Ext4-DAX drops even more, by 90.1%. The performance comparisons of writes exhibit the same trends.
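
A minimal sketch of mapping a PM region exposed in device DAX mode so that it can be accessed with ordinary loads/stores and then registered with an RDMA NIC (the device path and length are assumptions; error handling is minimal):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const size_t len = 1ull << 30;                 // map 1 GiB of the DAX device
    int fd = open("/dev/dax0.0", O_RDWR);          // example device DAX path
    if (fd < 0) { perror("open"); return 1; }

    void* pm = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pm == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    // From here the mapping can be handed to ibv_reg_mr() so remote peers can
    // read/write the persistent region directly via RDMA, bypassing the page
    // cache and any kernel file-system path.
    munmap(pm, len);
    close(fd);
    return 0;
}
```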


2017 ◽  
Vol 139 (5) ◽  
Author(s):  
Markus Schnoes ◽  
Eberhard Nicke

Airfoil shapes tailored to specific inflow conditions and loading requirements can offer significant performance potential over classic airfoil shapes. However, their optimal operating range has to be matched carefully to the overall compressor layout. This paper describes methods to organize a large set of optimized airfoils in a database and their application in throughflow design. Optimized airfoils are structured along five dimensions: inlet Mach number, blade stagger angle, pitch–chord ratio, maximum thickness–chord ratio, and a parameter for aerodynamic loading. In this space, a large number of airfoil geometries are generated by means of numerical optimization. During the optimization of each airfoil, the performance at design and off-design conditions is evaluated with the blade-to-blade flow solver MISES. Together with the airfoil geometry, the database stores automatically calibrated correlations which describe the cascade performance in throughflow calculations. Based on these methods, two subsonic stages of a 4.5-stage transonic research compressor are redesigned. The performance of the baseline and updated geometries is evaluated with 3D CFD. The overall approach offers accurate throughflow design incorporating optimized airfoil shapes and a fast transition from throughflow to 3D CFD design.
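
As a rough illustration of how such a five-dimensional database might be queried from a throughflow code (the field names and scales below are assumptions, not the paper's schema), a normalized nearest-neighbor lookup over the five parameters returns the closest stored airfoil:

```cpp
#include <limits>
#include <vector>

struct AirfoilEntry {
    double inlet_mach, stagger_deg, pitch_chord, thickness_chord, loading;
    int geometry_id;   // handle to the stored geometry and calibrated correlations
};

int find_closest(const std::vector<AirfoilEntry>& db, const AirfoilEntry& q) {
    // Per-dimension scales keep the five parameters comparable in the distance.
    const double scale[5] = {0.1, 5.0, 0.1, 0.01, 0.1};
    double best = std::numeric_limits<double>::max();
    int best_id = -1;
    for (const auto& e : db) {
        double d[5] = {e.inlet_mach - q.inlet_mach, e.stagger_deg - q.stagger_deg,
                       e.pitch_chord - q.pitch_chord,
                       e.thickness_chord - q.thickness_chord, e.loading - q.loading};
        double dist = 0.0;
        for (int i = 0; i < 5; ++i) dist += (d[i] / scale[i]) * (d[i] / scale[i]);
        if (dist < best) { best = dist; best_id = e.geometry_id; }
    }
    return best_id;
}
```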


2021 ◽  
Vol 17 (2) ◽  
pp. 1-38
Author(s):  
Joonsung Kim ◽  
Kanghyun Choi ◽  
Wonsik Lee ◽  
Jangwoo Kim

Modern servers are actively deploying Solid-State Drives (SSDs) thanks to their high throughput and low latency. However, current server architects cannot achieve the full performance potential of commodity SSDs, as SSDs are complex devices designed for specific goals (e.g., latency, throughput, endurance, cost) with their internal mechanisms undisclosed to users. In this article, we propose SSDcheck, a novel SSD performance model to extract various internal mechanisms and predict the latency of the next access to commodity black-box SSDs. We identify key performance-critical features (e.g., garbage collection, write buffering) and find their parameters (i.e., size, threshold) from each SSD by using our novel diagnosis code snippets. Then, SSDcheck constructs a performance model for a target SSD and dynamically manages the model to predict the latency of the next access. In addition, SSDcheck extracts and provides other useful internal mechanisms (e.g., the fetch unit in multi-queue SSDs, idle-time intervals that trigger background tasks) for the storage system to fully exploit SSDs. Using these features and the performance model, we propose multiple practical use cases. Our evaluations show that SSDcheck’s performance model is highly accurate, and the proposed use cases achieve significant performance improvements in various scenarios.
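
A hedged sketch of what a diagnosis snippet in this spirit could look like (the target path, write size, and spike threshold are assumptions, not SSDcheck's actual probes): issue back-to-back direct writes and watch for the latency jump that marks the internal write buffer filling up.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

int main() {
    const size_t block = 4096;
    void* buf = std::aligned_alloc(block, block);   // O_DIRECT needs aligned buffers
    int fd = open("/mnt/ssd/probe.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 || buf == nullptr) { std::perror("setup"); return 1; }

    std::vector<double> lat_us;
    for (int i = 0; i < 4096; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        if (pwrite(fd, buf, block, static_cast<off_t>(i) * block) != (ssize_t)block) break;
        auto t1 = std::chrono::steady_clock::now();
        lat_us.push_back(std::chrono::duration<double, std::micro>(t1 - t0).count());
    }
    // A jump from buffered-write latency to flash-program latency suggests the
    // point at which the internal write buffer filled up.
    for (size_t i = 1; i < lat_us.size(); ++i) {
        if (lat_us[i] > 10.0 * lat_us[0]) {
            std::printf("buffer filled after ~%zu writes\n", i);
            break;
        }
    }
    close(fd);
    std::free(buf);
    return 0;
}
```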

