Zen

2021 ◽  
Vol 14 (5) ◽  
pp. 835-848
Author(s):  
Gang Liu ◽  
Leying Chen ◽  
Shimin Chen

Emerging Non-Volatile Memory (NVM) technologies like 3D XPoint promise significant performance potential for OLTP databases. However, transactional databases need to be redesigned because the key assumptions that non-volatile storage is orders of magnitude slower than DRAM and only supports block-oriented access no longer hold. NVM is byte-addressable, almost as fast as DRAM, and much (4-16x) larger in capacity than DRAM. These characteristics make it possible to build an OLTP database entirely in NVM main memory. This paper studies the structure of OLTP engines with hybrid NVM and DRAM memory. We observe three challenges in designing an OLTP engine for NVM: tuple metadata modifications, NVM write redundancy, and NVM space management. We propose Zen, a high-throughput log-free OLTP engine for NVM. Zen addresses the three design challenges with three novel techniques: a metadata-enhanced tuple cache, log-free persistent transactions, and light-weight NVM space management. Experimental results on a real machine equipped with Intel Optane DC Persistent Memory show that Zen achieves up to 10.1x improvement over existing solutions, supports an OLTP database as large as the NVM capacity, and recovers quickly from failures.
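
A minimal sketch of the log-free commit idea described in this abstract, assuming out-of-place tuple versions and a per-transaction commit flag residing in NVM (an illustration, not Zen's actual code; persist_range, TupleVersion, and TxnDescriptor are hypothetical names):

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// On real PM this would issue CLWB/CLFLUSHOPT per cache line followed by an
// SFENCE; kept as a stub so the sketch compiles anywhere.
inline void persist_range(const void* /*addr*/, std::size_t /*len*/) {}

struct TupleVersion {
    std::uint64_t txn_id;   // transaction that produced this version
    char payload[48];       // fixed-size tuple body (illustrative)
};

struct TxnDescriptor {
    std::atomic<std::uint64_t> committed{0};  // resides in NVM; 1 => durable
};

// Write and persist the new tuple versions first, then persist a single
// commit flag. A crash before the flag persists means recovery ignores the
// versions, so no redo/undo log is needed.
void commit(TxnDescriptor& txn, TupleVersion* versions, std::size_t n,
            std::uint64_t txn_id, const char (&new_payload)[48]) {
    for (std::size_t i = 0; i < n; ++i) {
        versions[i].txn_id = txn_id;
        std::memcpy(versions[i].payload, new_payload, sizeof new_payload);
        persist_range(&versions[i], sizeof versions[i]);
    }
    txn.committed.store(1, std::memory_order_release);
    persist_range(&txn, sizeof txn);   // durability point of the transaction
}
```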

Author(s):  
Muhammad Attahir Jibril ◽  
Philipp Götze ◽  
David Broneske ◽  
Kai-Uwe Sattler

After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolios, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for easy caching. Our experiments show that, if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results are also reflected when selective caching is used for parallel workloads.
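
A hedged sketch of such a selective cache, assuming tree nodes can be copied into DRAM by identifier (Node, NodeId, and load_from_pm are placeholders, not the paper's API): the upper tree levels are pinned statically, and everything else goes through a small LRU-managed dynamic cache backed by the persistent copy.

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

using NodeId = std::uint64_t;
struct Node { std::vector<std::uint8_t> bytes; };

class SelectiveCache {
public:
    explicit SelectiveCache(std::size_t dynamic_capacity) : capacity_(dynamic_capacity) {}

    // Static part: nodes of the upper levels are copied to DRAM once, never evicted.
    void pin(NodeId id, Node node) { pinned_[id] = std::move(node); }

    // Dynamic part: LRU over the remaining nodes; load_from_pm stands in for a
    // read of the node's persistent copy.
    const Node& get(NodeId id, Node (*load_from_pm)(NodeId)) {
        if (auto it = pinned_.find(id); it != pinned_.end()) return it->second;
        if (auto it = cached_.find(id); it != cached_.end()) {
            lru_.splice(lru_.begin(), lru_, it->second.second);  // mark as MRU
            return it->second.first;
        }
        if (cached_.size() == capacity_) {        // evict least-recently-used node
            cached_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(id);
        auto [it, inserted] = cached_.emplace(id, std::make_pair(load_from_pm(id), lru_.begin()));
        return it->second.first;
    }

private:
    std::size_t capacity_;
    std::unordered_map<NodeId, Node> pinned_;
    std::list<NodeId> lru_;
    std::unordered_map<NodeId, std::pair<Node, std::list<NodeId>::iterator>> cached_;
};
```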


2021 ◽  
Vol 14 (10) ◽  
pp. 1872-1885
Author(s):  
Baoyue Yan ◽  
Xuntao Cheng ◽  
Bo Jiang ◽  
Shibin Chen ◽  
Canfang Shang ◽  
...  

The recent byte-addressable and large-capacity commercialized persistent memory (PM) is promising to drive database as a service (DBaaS) into uncharted territories. This paper investigates how to leverage PM to revisit the conventional LSM-tree based OLTP storage engines designed for the DRAM-SSD hierarchy for DBaaS instances. Specifically, we (1) propose a light-weight PM allocator named Halloc customized for LSM-trees, (2) build a high-performance Semi-persistent Memtable utilizing the persistent in-memory writes of PM, (3) design a concurrent commit algorithm named Reorder Ring to achieve log-free transaction processing for OLTP workloads, and (4) present a Global Index as the new globally sorted persistent level with non-blocking in-memory compaction. The design of the Reorder Ring and the Semi-persistent Memtable achieves fast writes without synchronized logging overheads and near-instant recovery time. Moreover, the design of the Semi-persistent Memtable and the Global Index with in-memory compaction enables byte-addressable persistent levels in PM, which significantly reduces read and write amplification as well as background compaction overheads. The overall evaluation shows that our proposal over the PM-SSD hierarchy outperforms the baseline by up to 3.8x on the YCSB benchmark and by 2x on the TPC-C benchmark.
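
The following illustrative sketch (not the engine's code; pm_alloc and persist_range stand in for a PM allocator such as the paper's Halloc and for CLWB/SFENCE flushing) shows why persistent in-memory writes can make a separate write-ahead log unnecessary: the record becomes durable in PM before the volatile index publishes it, so recovery can rebuild the index by scanning PM.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <new>
#include <string>

inline void persist_range(const void*, std::size_t) {}  // CLWB + SFENCE on real PM

struct PmRecord {
    std::uint32_t key_len;
    std::uint32_t val_len;
    // key bytes followed by value bytes are stored right after this header
};

// Stand-in for an allocator over a persistent-memory arena.
char* pm_alloc(std::size_t bytes) { return static_cast<char*>(::operator new(bytes)); }

std::map<std::string, PmRecord*> volatile_index;  // DRAM index, rebuilt on recovery

void put(const std::string& key, const std::string& value) {
    std::size_t total = sizeof(PmRecord) + key.size() + value.size();
    char* buf = pm_alloc(total);
    auto* rec = new (buf) PmRecord{static_cast<std::uint32_t>(key.size()),
                                   static_cast<std::uint32_t>(value.size())};
    std::memcpy(buf + sizeof(PmRecord), key.data(), key.size());
    std::memcpy(buf + sizeof(PmRecord) + key.size(), value.data(), value.size());
    persist_range(buf, total);    // record is durable before it is indexed
    volatile_index[key] = rec;    // no separate write-ahead-log entry is written
}
```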


2013 ◽  
Vol 10 (3) ◽  
pp. 60-81 ◽  
Author(s):  
Yunlei Sun ◽  
Xiuquan Qiao ◽  
Wei Tan ◽  
Bo Cheng ◽  
Ruisheng Shi ◽  
...  

In order to build a low-latency and light-weight publish/subscribe (pub/sub) system for delay-sensitive IoT services, the authors propose an efficient and scalable broker architecture named Grid Quorum-based pub/sub (GQPS). As a key component in the Event-Driven Service-Oriented Architecture (EDSOA) for IoT services, this architecture organizes multiple pub/sub brokers into a quorum-based peer-to-peer topology for efficient topic searching. It also leverages a topic searching algorithm and a one-hop caching strategy to minimize search latency. Light-weight RESTful interfaces make the authors’ GQPS more suitable for IoT services. Cost analysis and an experimental study demonstrate that GQPS achieves a significant performance gain in search satisfaction without compromising search cost. The authors apply the proposed GQPS in the District Heating Control and Information Service System in Beijing, China, which further validates its effectiveness.
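
A generic grid-quorum placement, sketched here for illustration (it is not necessarily GQPS's exact protocol): with n brokers arranged in a sqrt(n) x sqrt(n) grid, advertising a topic along one row and searching along one column guarantees an intersection at exactly one broker, which is what keeps topic searches cheap.

```cpp
#include <vector>

// Brokers are numbered 0..side*side-1 in row-major order.

// Advertisement quorum: every broker in the publisher's row.
std::vector<int> advertise_quorum(int broker_id, int side) {
    std::vector<int> row;
    int r = broker_id / side;
    for (int c = 0; c < side; ++c) row.push_back(r * side + c);
    return row;
}

// Search quorum: every broker in the subscriber's column.
std::vector<int> search_quorum(int broker_id, int side) {
    std::vector<int> col;
    int c = broker_id % side;
    for (int r = 0; r < side; ++r) col.push_back(r * side + c);
    return col;
}

// Any row and any column share exactly one broker, so a search that probes
// O(sqrt(n)) brokers is guaranteed to meet the advertisement of any topic.
```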


2019 ◽  
Author(s):  
Rafael Murari ◽  
João Paulo Carvalho ◽  
Guido Araujo ◽  
Alexandro Baldassin

Emerging persistent memory (PM) technologies aim to eliminate the gap between main memory and storage. Nevertheless, their adoption requires measures to guarantee consistency, since crash failures might leave the program in an unrecoverable state. In this context, durable transactions are one of the main approaches investigated to ease the adoption of PM. However, today's implementations are based exclusively on software (SW) or hardware (HW), which might degrade system performance. This paper presents NV-PhTM, a transactional system for PM that delivers the best of both HW and SW transactions by dynamically switching the execution mode according to the application's characteristics.
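
A hedged sketch of phase-based switching between hardware and software transactions (the counters, window size, and threshold below are illustrative assumptions, not NV-PhTM's actual heuristic):

```cpp
#include <atomic>
#include <cstdint>

enum class Phase { Hardware, Software };

struct PhaseController {
    std::atomic<Phase> phase{Phase::Hardware};
    std::atomic<std::uint64_t> hw_aborts{0};
    std::atomic<std::uint64_t> hw_commits{0};

    // Called after each hardware attempt: over a window of 1024 attempts,
    // fall back to the software phase if more than a quarter aborted,
    // otherwise stay in (or return to) the hardware phase.
    void record_hw(bool committed) {
        (committed ? hw_commits : hw_aborts).fetch_add(1);
        std::uint64_t a = hw_aborts.load(), c = hw_commits.load();
        if (a + c >= 1024) {
            phase.store(a * 4 > a + c ? Phase::Software : Phase::Hardware);
            hw_aborts.store(0);
            hw_commits.store(0);
        }
    }

    Phase current() const { return phase.load(); }
};
```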


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Qingfeng Zhuge ◽  
Hao Zhang ◽  
Edwin Hsing-Mean Sha ◽  
Rui Xu ◽  
Jun Liu ◽  
...  

Efficiently accessing remote file data remains a challenging problem for data processing systems. Developments in non-volatile dual in-line memory modules (NVDIMMs), in-memory file systems, and RDMA networks provide new opportunities for solving the problem of remote data access. A general understanding about NVDIMMs, such as Intel Optane DC Persistent Memory (DCPM), is that they expand main memory capacity at the cost of performance several times lower than DRAM. With the in-depth exploration presented in this paper, however, we show an interesting finding: the potential of NVDIMMs for high-performance, remote in-memory accesses can be revealed through careful design. We explore multiple architectural structures for accessing remote NVDIMMs in a real system using Optane DCPM and compare the performance of the various structures. Experiments show significant performance gaps among different ways of using NVDIMMs as memory address space accessible through an RDMA interface. Furthermore, we design and implement a prototype of a user-level, in-memory file system, RIMFS, in device DAX mode on Optane DCPM. By comparing against the DAX-supported Linux file system, Ext4-DAX, we show that the performance of remote reads on RIMFS over RDMA is 11.44x higher than on remote Ext4-DAX on average. The experimental results also show that the performance of remote accesses on RIMFS is maintained on a heavily loaded data server with CPU utilization as high as 90%, while the performance of remote reads on Ext4-DAX drops significantly, by 49.3%, and the performance of local reads on Ext4-DAX drops even more, by 90.1%. The performance comparisons of writes exhibit the same trends.
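
A minimal sketch of mapping a PM region exposed in device DAX mode so that it can be accessed with ordinary loads/stores and then registered with an RDMA NIC (the device path and length are assumptions; error handling is minimal):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const size_t len = 1ull << 30;                 // map 1 GiB of the DAX device
    int fd = open("/dev/dax0.0", O_RDWR);          // example device DAX path
    if (fd < 0) { perror("open"); return 1; }

    void* pm = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pm == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    // From here the mapping can be handed to ibv_reg_mr() so remote peers can
    // read/write the persistent region directly via RDMA, bypassing the page
    // cache and any kernel file-system path.
    munmap(pm, len);
    close(fd);
    return 0;
}
```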


2017 ◽  
Vol 139 (5) ◽  
Author(s):  
Markus Schnoes ◽  
Eberhard Nicke

Airfoil shapes tailored to specific inflow conditions and loading requirements can offer significant performance potential over classic airfoil shapes. However, their optimal operating range has to be matched carefully to the overall compressor layout. This paper describes methods to organize a large set of optimized airfoils in a database and their application in throughflow design. Optimized airfoils are structured along five dimensions: inlet Mach number, blade stagger angle, pitch–chord ratio, maximum thickness–chord ratio, and a parameter for aerodynamic loading. In this space, a large number of airfoil geometries are generated by means of numerical optimization. During the optimization of each airfoil, the performance at design and off-design conditions is evaluated with the blade-to-blade flow solver MISES. Together with the airfoil geometry, the database stores automatically calibrated correlations which describe the cascade performance in throughflow calculations. Based on these methods, two subsonic stages of a 4.5-stage transonic research compressor are redesigned. The performance of the baseline and updated geometries is evaluated with 3D CFD. The overall approach offers accurate throughflow design incorporating optimized airfoil shapes and a fast transition from throughflow to 3D CFD design.
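
As a rough illustration of how such a five-dimensional database might be queried from a throughflow code (the field names and scales below are assumptions, not the paper's schema), a normalized nearest-neighbor lookup over the five parameters returns the closest stored airfoil:

```cpp
#include <limits>
#include <vector>

struct AirfoilEntry {
    double inlet_mach, stagger_deg, pitch_chord, thickness_chord, loading;
    int geometry_id;   // handle to the stored geometry and calibrated correlations
};

int find_closest(const std::vector<AirfoilEntry>& db, const AirfoilEntry& q) {
    // Per-dimension scales keep the five parameters comparable in the distance.
    const double scale[5] = {0.1, 5.0, 0.1, 0.01, 0.1};
    double best = std::numeric_limits<double>::max();
    int best_id = -1;
    for (const auto& e : db) {
        double d[5] = {e.inlet_mach - q.inlet_mach, e.stagger_deg - q.stagger_deg,
                       e.pitch_chord - q.pitch_chord,
                       e.thickness_chord - q.thickness_chord, e.loading - q.loading};
        double dist = 0.0;
        for (int i = 0; i < 5; ++i) dist += (d[i] / scale[i]) * (d[i] / scale[i]);
        if (dist < best) { best = dist; best_id = e.geometry_id; }
    }
    return best_id;
}
```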


2021 ◽  
Vol 17 (2) ◽  
pp. 1-38
Author(s):  
Joonsung Kim ◽  
Kanghyun Choi ◽  
Wonsik Lee ◽  
Jangwoo Kim

Modern servers are actively deploying Solid-State Drives (SSDs) thanks to their high throughput and low latency. However, current server architects cannot achieve the full performance potential of commodity SSDs, as SSDs are complex devices designed for specific goals (e.g., latency, throughput, endurance, cost) with their internal mechanisms undisclosed to users. In this article, we propose SSDcheck, a novel SSD performance model to extract various internal mechanisms and predict the latency of the next access to commodity black-box SSDs. We identify key performance-critical features (e.g., garbage collection, write buffering) and find their parameters (i.e., size, threshold) from each SSD by using our novel diagnosis code snippets. Then, SSDcheck constructs a performance model for a target SSD and dynamically manages the model to predict the latency of the next access. In addition, SSDcheck extracts and provides other useful internal mechanisms (e.g., the fetch unit in multi-queue SSDs, idle-time intervals that trigger background tasks) for the storage system to fully exploit SSDs. Using these features and the performance model, we propose multiple practical use cases. Our evaluations show that SSDcheck’s performance model is highly accurate, and the proposed use cases achieve significant performance improvements in various scenarios.
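
A hedged sketch of what a diagnosis snippet in this spirit could look like (the target path, write size, and spike threshold are assumptions, not SSDcheck's actual probes): issue back-to-back direct writes and watch for the latency jump that marks the internal write buffer filling up.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <fcntl.h>
#include <unistd.h>
#include <vector>

int main() {
    const size_t block = 4096;
    void* buf = std::aligned_alloc(block, block);   // O_DIRECT needs aligned buffers
    int fd = open("/mnt/ssd/probe.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 || buf == nullptr) { std::perror("setup"); return 1; }

    std::vector<double> lat_us;
    for (int i = 0; i < 4096; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        if (pwrite(fd, buf, block, static_cast<off_t>(i) * block) != (ssize_t)block) break;
        auto t1 = std::chrono::steady_clock::now();
        lat_us.push_back(std::chrono::duration<double, std::micro>(t1 - t0).count());
    }
    // A jump from buffered-write latency to flash-program latency suggests the
    // point at which the internal write buffer filled up.
    for (size_t i = 1; i < lat_us.size(); ++i) {
        if (lat_us[i] > 10.0 * lat_us[0]) {
            std::printf("buffer filled after ~%zu writes\n", i);
            break;
        }
    }
    close(fd);
    std::free(buf);
    return 0;
}
```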

