write amplification
Recently Published Documents


TOTAL DOCUMENTS

49
(FIVE YEARS 22)

H-INDEX

7
(FIVE YEARS 1)

2021 ◽  
Vol 17 (4) ◽  
pp. 1-32
Author(s):  
Siying Dong ◽  
Andrew Kryczka ◽  
Yanqin Jin ◽  
Michael Stumm

This article is an eight-year retrospective on development priorities for RocksDB, a key-value store developed at Facebook that targets large-scale distributed systems and that is optimized for Solid State Drives (SSDs). We describe how the priorities evolved over time as a result of hardware trends and extensive experiences running RocksDB at scale in production at a number of organizations: from optimizing write amplification, to space amplification, to CPU utilization. We describe lessons from running large-scale applications, including that resource allocation needs to be managed across different RocksDB instances, that data formats need to remain backward- and forward-compatible to allow incremental software rollouts, and that appropriate support for database replication and backups are needed. Lessons from failure handling taught us that data corruption errors needed to be detected earlier and that data integrity protection mechanisms are needed at every layer of the system. We describe improvements to the key-value interface. We describe a number of efforts that in retrospect proved to be misguided. Finally, we describe a number of open problems that could benefit from future research.


2021 ◽  
Vol 17 (3) ◽  
pp. 1-26
Author(s):  
Baoquan Zhang ◽  
David H. C. Du

Computer systems utilizing byte-addressable Non-Volatile Memory ( NVM ) as memory/storage can provide low-latency data persistence. The widely used key-value stores using Log-Structured Merge Tree ( LSM-Tree ) are still beneficial for NVM systems in aspects of the space and write efficiency. However, the significant write amplification introduced by the leveled compaction of LSM-Tree degrades the write performance of the key-value store and shortens the lifetime of the NVM devices. The existing studies propose new compaction methods to reduce write amplification. Unfortunately, they result in a relatively large read amplification. In this article, we propose NVLSM, a key-value store for NVM systems using LSM-Tree with new accumulative compaction. By fully utilizing the byte-addressability of NVM, accumulative compaction uses pointers to accumulate data into multiple floors in a logically sorted run to reduce the number of compactions required. We have also proposed a cascading searching scheme for reads among the multiple floors to reduce read amplification. Therefore, NVLSM reduces write amplification with small increases in read amplification. We compare NVLSM with key-value stores using LSM-Tree with two other compaction methods: leveled compaction and fragmented compaction. Our evaluations show that NVLSM reduces write amplification by up to 67% compared with LSM-Tree using leveled compaction without significantly increasing the read amplification. In write-intensive workloads, NVLSM reduces the average latency by 15.73%–41.2% compared to other key-value stores.


2021 ◽  
Vol 14 (11) ◽  
pp. 2216-2229
Author(s):  
Subhadeep Sarkar ◽  
Dimitris Staratzis ◽  
Ziehen Zhu ◽  
Manos Athanassoulis

Log-structured merge (LSM) trees offer efficient ingestion by appending incoming data, and thus, are widely used as the storage layer of production NoSQL data stores. To enable competitive read performance, LSM-trees periodically re-organize data to form a tree with levels of exponentially increasing capacity, through iterative compactions. Compactions fundamentally influence the performance of an LSM-engine in terms of write amplification, write throughput, point and range lookup performance, space amplification, and delete performance. Hence, choosing the appropriate compaction strategy is crucial and, at the same time, hard as the LSM-compaction design space is vast, largely unexplored, and has not been formally defined in the literature. As a result, most LSM-based engines use a fixed compaction strategy, typically hand-picked by an engineer, which decides how and when to compact data. In this paper, we present the design space of LSM-compactions, and evaluate state-of-the-art compaction strategies with respect to key performance metrics. Toward this goal, our first contribution is to introduce a set of four design primitives that can formally define any compaction strategy: (i) the compaction trigger, (ii) the data layout, (iii) the compaction granularity, and (iv) the data movement policy. Together, these primitives can synthesize both existing and completely new compaction strategies. Our second contribution is to experimentally analyze 10 compaction strategies. We present 12 observations and 7 high-level takeaway messages, which show how LSM systems can navigate the compaction design space.


2021 ◽  
Vol 14 (10) ◽  
pp. 1872-1885
Author(s):  
Baoyue Yan ◽  
Xuntao Cheng ◽  
Bo Jiang ◽  
Shibin Chen ◽  
Canfang Shang ◽  
...  

The recent byte-addressable and large-capacity commercialized persistent memory (PM) is promising to drive database as a service (DBaaS) into unchartered territories. This paper investigates how to leverage PMs to revisit the conventional LSM-tree based OLTP storage engines designed for DRAM-SSD hierarchy for DBaaS instances. Specifically we (1) propose a light-weight PM allocator named Hal-loc customized for LSM-tree, (2) build a high-performance Semi-persistent Memtable utilizing the persistent in-memory writes of PM, (3) design a concurrent commit algorithm named Reorder Ring to aschieve log-free transaction processing for OLTP workloads and (4) present a Global Index as the new globally sorted persistent level with non-blocking in-memory compaction. The design of Reorder Ring and Semi-persistent Memtable achieves fast writes without synchronized logging overheads and achieves near instant recovery time. Moreover, the design of Semi-persistent Memtable and Global Index with in-memory compaction enables the byte-addressable persistent levels in PM, which significantly reduces the read and write amplification as well as the background compaction overheads. The overall evaluation shows that the performance of our proposal over PM-SSD hierarchy outperforms the baseline by up to 3.8x in YCSB benchmark and by 2x in TPC-C benchmark.


2021 ◽  
Author(s):  
Qizhong Mao ◽  
Steven Jacobs ◽  
Waleed Amjad ◽  
Vagelis Hristidis ◽  
Vassilis J. Tsotras ◽  
...  

AbstractModern NoSQL database systems use log-structured merge (LSM) storage architectures to support high write throughput. LSM architectures aggregate writes in a mutableMemTable(stored in memory), which is regularly flushed to disk, creating a new immutable file called anSSTable. Some of the SSTables are chosen to be periodicallymerged—replaced with a single SSTable containing their union. Amergepolicy(a.k.a. compaction policy) specifies when to do merges and which SSTables to combine. Abounded depthmerge policy is one that guarantees that the number of SSTables never exceeds a given parameterk, typically in the range 3–10. Bounded depth policies are useful in applications where low read latency is crucial, but they and their underlying combinatorics are not yet well understood. This paper compares several bounded depth policies, including representative policies from industrial NoSQL databases and two new ones based on recent theoretical modeling, as well as the standard Tiered policy and Leveled policy. The results validate the proposed theoretical model and show that, compared to the existing policies, the newly proposed policies can have substantially lower write amplification with comparable read amplification.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Youngjoo Woo ◽  
Taesoo Kim ◽  
Sungin Jung ◽  
Euiseong Seo

Sign in / Sign up

Export Citation Format

Share Document