Application of both Temporal and Spatial Localities in the Management of Kernel Buffer Cache

Author(s):  
Song Jiang

As the hard disk remains the mainstream on-line storage device, it continues to be the performance bottleneck of data-intensive applications. One of the most effective existing solutions to ameliorate this bottleneck is to use the buffer cache in the OS kernel to achieve two objectives: reducing direct access to on-disk data and improving disk performance. Both objectives can be achieved by applying temporal and spatial locality in the management of the buffer cache. Traditionally only temporal locality is exploited for this purpose, while spatial locality, which refers to the on-disk sequentiality of requested blocks, is largely ignored. Because the throughput of access to sequentially placed disk blocks can be an order of magnitude higher than that of access to randomly placed blocks, ignoring spatial locality in buffer management can seriously degrade the performance of applications without dominant sequential accesses. In this chapter, we introduce a state-of-the-art technique that seamlessly integrates these two locality properties embedded in data access patterns into the management of the kernel buffer cache. After elaborating on why spatial locality is needed in addition to temporal locality, we detail a framework, DULO (DUal LOcality), in which these two properties are taken into account simultaneously. A prototype implementation of DULO in the Linux kernel as well as experimental results are presented, showing that DULO can significantly increase disk I/O throughput for real-world applications such as a Web server, the TPC benchmark, a file system benchmark, and scientific programs. It reduces their execution times by as much as 53%. We conclude the chapter by identifying and encouraging a new direction for research and practice on improving disk I/O performance: exposing more disk-specific data layout and access pattern information to upper-level system software so that disk-oriented policies can be devised.
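To make the dual-locality idea concrete, the following user-space sketch (our illustration, not the DULO kernel code; all names and data are hypothetical) groups cached blocks into disk-contiguous sequences and prefers evicting blocks that belong to long sequences, which are cheap to refetch with one sequential read, while isolated random blocks are kept under the usual temporal-locality criterion.

```c
/* Minimal user-space sketch of dual-locality eviction (hypothetical; not the
 * actual DULO kernel code).  Cached blocks are first grouped into sequences of
 * disk-contiguous blocks; eviction then prefers blocks from long sequences,
 * since they can be refetched with one cheap sequential read, while random
 * (sequence length 1) blocks are kept to preserve temporal locality. */
#include <stdio.h>
#include <stdlib.h>

struct cached_block {
    long lbn;        /* logical block number on disk */
    long last_use;   /* timestamp of most recent access */
    int  seq_len;    /* length of the contiguous sequence it belongs to */
};

static int cmp_lbn(const void *a, const void *b)
{
    const struct cached_block *x = a, *y = b;
    return (x->lbn > y->lbn) - (x->lbn < y->lbn);
}

/* Label each block with the length of the contiguous run it belongs to. */
static void mark_sequences(struct cached_block *blk, int n)
{
    qsort(blk, n, sizeof(*blk), cmp_lbn);
    for (int i = 0; i < n; ) {
        int j = i + 1;
        while (j < n && blk[j].lbn == blk[j - 1].lbn + 1)
            j++;
        for (int k = i; k < j; k++)
            blk[k].seq_len = j - i;
        i = j;
    }
}

/* Pick a victim: prefer the block in the longest sequence; break ties by
 * least-recent use, so temporal locality still matters. */
static int pick_victim(const struct cached_block *blk, int n)
{
    int victim = 0;
    for (int i = 1; i < n; i++) {
        if (blk[i].seq_len > blk[victim].seq_len ||
            (blk[i].seq_len == blk[victim].seq_len &&
             blk[i].last_use < blk[victim].last_use))
            victim = i;
    }
    return victim;
}

int main(void)
{
    struct cached_block cache[] = {
        { 100, 5, 0 }, { 101, 6, 0 }, { 102, 7, 0 },  /* a 3-block sequence */
        { 500, 1, 0 },                                /* an old random block */
        { 900, 9, 0 },                                /* a recent random block */
    };
    int n = sizeof(cache) / sizeof(cache[0]);

    mark_sequences(cache, n);
    int v = pick_victim(cache, n);
    printf("evict block %ld (sequence length %d)\n",
           cache[v].lbn, cache[v].seq_len);
    return 0;
}
```

In this toy example the victim comes from the 3-block sequence (blocks 100-102) even though the random block 500 is older, which is the trade-off the dual-locality argument is about.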

Author(s):  
Feng Chen ◽  
Xiaoning Ding ◽  
Song Jiang

As the major secondary storage device, the hard disk plays a critical role in modern computer systems. To improve disk performance, most operating systems implement data prefetching policies by tracking I/O access patterns, mostly at the level of file abstractions. Although such a solution is useful for exploiting application-level access patterns, file-level prefetching has constraints that limit its ability to fully exploit disk performance. The reasons are twofold. First, certain prefetch opportunities, such as those involving metadata blocks, can only be detected by knowing the data layout on the hard disk. Second, because of the non-uniform access cost of the hard disk, mis-prefetching a random block is much more costly than mis-prefetching a sequential block. To address these intrinsic limitations of file-level prefetching, we propose to prefetch data blocks directly at the disk level in a portable way. Our proposed scheme, called DiskSeen, is designed to supplement file-level prefetching. DiskSeen observes the workload access pattern by tracking the locations and access times of disk blocks. Based on an analysis of the temporal and spatial relationships of disk data blocks, DiskSeen can significantly increase the sequentiality of disk accesses and in turn improve disk performance. We implemented the DiskSeen scheme in the Linux 2.6 kernel and show that it can significantly improve the effectiveness of file-level prefetching and reduce execution times by 20-53% for various types of applications, including grep, CVS, and TPC-H.
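The core of disk-level prefetching can be illustrated with a small user-space sketch (our simplification, not the DiskSeen implementation; the constants and function names are invented): a short history of recently accessed logical block numbers is kept, and a sequential readahead is triggered once the last few accesses form a contiguous forward run on disk.

```c
/* Illustrative user-space sketch (not the DiskSeen implementation): keep a
 * small history of recently accessed logical block numbers, and when the
 * last few accesses form a contiguous forward run on disk, issue a
 * sequential prefetch for the blocks that follow. */
#include <stdio.h>

#define HISTORY   4   /* how many recent block numbers to remember */
#define PREFETCH  8   /* how many blocks to read ahead once a run is seen */

static long history[HISTORY];
static int  hist_len;

/* Record an access; return the first block to prefetch, or -1 if the recent
 * accesses do not look sequential at the disk level. */
static long on_block_access(long lbn)
{
    /* shift the history window and append the new block number */
    for (int i = 0; i < HISTORY - 1; i++)
        history[i] = history[i + 1];
    history[HISTORY - 1] = lbn;
    if (hist_len < HISTORY) {
        hist_len++;
        return -1;
    }

    /* sequential if each remembered access is exactly the successor block */
    for (int i = 1; i < HISTORY; i++)
        if (history[i] != history[i - 1] + 1)
            return -1;
    return lbn + 1;   /* prefetch [lbn+1, lbn+PREFETCH] */
}

int main(void)
{
    long trace[] = { 10, 700, 11, 40, 41, 42, 43, 44 };
    for (unsigned i = 0; i < sizeof(trace) / sizeof(trace[0]); i++) {
        long start = on_block_access(trace[i]);
        if (start >= 0)
            printf("access %ld -> prefetch blocks %ld..%ld\n",
                   trace[i], start, start + PREFETCH - 1);
    }
    return 0;
}
```

Because the detection works on disk block numbers rather than file offsets, runs that span metadata and data of different files can still be recognized, which is the opportunity file-level prefetching misses.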


Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 573
Author(s):  
Xiaochang Li ◽  
Zhengjun Zhai ◽  
Xin Ye

Emerging scale-out I/O-intensive applications, which process large amounts of data in the buffer/cache for reorganization or analysis, are now broadly used, and their performance is greatly affected by the speed of the I/O system. Efficient management of the limited kernel buffer plays a key role in improving I/O system performance, for example by caching hinted data for future reuse, prefetching hinted data, and evicting data that will not be accessed again; these are called proactive mechanisms in buffer management. However, most existing buffer management schemes cannot identify the data reference regularities (i.e., sequential or looping patterns) that proactive mechanisms can benefit from, nor can they operate at the application level to manage specific applications. In this paper, we present an Application Oriented I/O Optimization (AOIO) technique that automatically benefits the kernel buffer/cache by exploring the I/O regularities of applications based on the program counter technique. In our design, the input/output data and the looping pattern are in strict symmetry. With AOIO, each application can provide the operating system with more appropriate predictions, which achieve significantly better accuracy than other buffer management schemes. Trace-driven simulation results show that hit ratios are improved by an average of 25.9% and execution times are reduced by as much as 20.2% compared to other schemes for the workloads we used.
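As an illustration of the program counter technique the abstract refers to, the sketch below (hypothetical names and thresholds; not the actual AOIO classifier) keys a small table by I/O call site and labels each site as sequential or looping from the stride of the blocks it accesses.

```c
/* Minimal sketch of program-counter-based access classification (hypothetical
 * names; not the actual AOIO classifier).  Each I/O call site (identified by
 * its program counter) keeps the block it touched last and simple counters;
 * a site whose accesses keep increasing by one is labeled sequential, while a
 * site that revisits the block it started from is labeled looping. */
#include <stdio.h>

enum pattern { PAT_UNKNOWN, PAT_SEQUENTIAL, PAT_LOOPING };

struct pc_record {
    unsigned long pc;        /* call-site identifier */
    long          first_blk; /* first block seen from this site */
    long          last_blk;  /* most recent block */
    int           seq_hits;  /* consecutive +1 strides observed */
    enum pattern  pat;
};

#define MAX_PC 64            /* toy fixed-size table, no eviction policy */
static struct pc_record table[MAX_PC];
static int nrec;

static struct pc_record *lookup(unsigned long pc)
{
    for (int i = 0; i < nrec; i++)
        if (table[i].pc == pc)
            return &table[i];
    if (nrec >= MAX_PC)
        return &table[MAX_PC - 1];       /* overflow not handled in this toy */
    struct pc_record *r = &table[nrec++];
    r->pc = pc;
    r->first_blk = r->last_blk = -1;
    return r;
}

static void record_access(unsigned long pc, long blk)
{
    struct pc_record *r = lookup(pc);

    if (r->first_blk < 0) {
        r->first_blk = blk;
    } else if (blk == r->first_blk && r->pat != PAT_SEQUENTIAL) {
        r->pat = PAT_LOOPING;            /* wrapped back to the start */
    } else if (blk == r->last_blk + 1 && r->pat != PAT_LOOPING) {
        if (++r->seq_hits >= 3)          /* threshold is arbitrary here */
            r->pat = PAT_SEQUENTIAL;
    } else if (r->pat == PAT_UNKNOWN) {
        r->seq_hits = 0;
    }
    r->last_blk = blk;
}

int main(void)
{
    /* call site 0x1 performs a scan; call site 0x2 loops over three blocks */
    long scan[] = { 10, 11, 12, 13, 14 };
    long loop[] = { 7, 8, 9, 7, 8, 9 };

    for (int i = 0; i < 5; i++) record_access(0x1, scan[i]);
    for (int i = 0; i < 6; i++) record_access(0x2, loop[i]);

    for (int i = 0; i < nrec; i++)
        printf("pc %#lx -> %s\n", table[i].pc,
               table[i].pat == PAT_SEQUENTIAL ? "sequential" :
               table[i].pat == PAT_LOOPING    ? "looping" : "unknown");
    return 0;
}
```

A classification of this kind is what lets a buffer manager prefetch ahead of sequential call sites and retain the working set of looping call sites, which pure recency-based schemes cannot do.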


Author(s):  
M. Kandemir ◽  
A. Choudhary ◽  
J. Ramanujam ◽  
N. Shenoy ◽  
P. Banerjee

1974 ◽  
Vol 12 (1) ◽  
pp. 39-54
Author(s):  
Pieter Kritzinger ◽  
Mary Thompson ◽  
Wesley Graham

1993 ◽  
Vol 04 (01) ◽  
pp. 85-98
Author(s):  
Alistair Moffat ◽  
Ola Petersson

When searching for an item x in a dictionary, let t be the number of distinct items referenced since the previous reference to x. The move-to-front list is a widely known dictionary data structure that supports a search for x in O(t) time. We present a new self-organizing data structure, called the Historical Search Tree, which supports the search in O(log t) time and thus exploits temporal locality more efficiently. The Historical Search Tree also handles unsuccessful searches, insertions, and deletions efficiently. We further show how the data structure can be modified to take advantage of spatial locality.
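For reference, the move-to-front list that serves as the baseline above can be sketched as follows (an illustration only; the Historical Search Tree itself is not reproduced here). Searching walks from the head and, on a hit, splices the node to the front, so an item referenced again after t distinct other items is found within t+1 comparisons.

```c
/* Small sketch of a move-to-front list: searching walks from the head and,
 * on a hit, moves the node to the front, so an item referenced again after
 * t distinct other items is found after at most t+1 comparisons. */
#include <stdio.h>
#include <stdlib.h>

struct node {
    int key;
    struct node *next;
};

static struct node *head;

static void insert_front(int key)
{
    struct node *n = malloc(sizeof(*n));
    n->key = key;
    n->next = head;
    head = n;
}

/* Return the number of comparisons used; move the found node to the front. */
static int search_mtf(int key)
{
    int cost = 0;
    struct node *prev = NULL;
    for (struct node *cur = head; cur; prev = cur, cur = cur->next) {
        cost++;
        if (cur->key == key) {
            if (prev) {                    /* splice out and move to front */
                prev->next = cur->next;
                cur->next = head;
                head = cur;
            }
            return cost;
        }
    }
    return -1;   /* unsuccessful search */
}

int main(void)
{
    for (int k = 1; k <= 6; k++)
        insert_front(k);                   /* list: 6 5 4 3 2 1 */

    search_mtf(2);                         /* list becomes 2 6 5 4 3 1 */
    search_mtf(6);
    search_mtf(5);
    /* Two distinct items (6 and 5) were referenced since the last access to 2,
     * so t = 2 and the next search for 2 costs at most 3 comparisons. */
    printf("cost of re-searching 2: %d comparisons\n", search_mtf(2));
    return 0;
}
```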


2020 ◽  
Vol 2020 (1) ◽  
pp. 216-234
Author(s):  
Anrin Chakraborti ◽  
Radu Sion

Oblivious RAMs (ORAMs) allow a client to access data on an untrusted storage device without revealing the access patterns. Typically, the ORAM adversary can observe both read and write accesses. Write-only ORAMs target a more practical, multi-snapshot adversary that only monitors client writes, which is typical of plausible-deniability and censorship-resilient systems. This allows write-only ORAMs to achieve significantly better asymptotic performance. However, these apparent gains do not materialize in real deployments, primarily because of the random data placement strategies used to break correlations between the logical and physical namespaces, a property required for write-access privacy. Random access performs poorly on both rotational disks and SSDs (often increasing wear significantly and interfering with wear-leveling mechanisms). In this work, we introduce SqORAM, a new locality-preserving write-only ORAM that preserves write-access privacy without requiring random data access. Data blocks close to each other in the logical domain land in close proximity on the physical media. Importantly, SqORAM maintains this data locality property over time, significantly increasing read throughput. A full Linux kernel-level implementation of SqORAM is 100x faster than non-locality-preserving solutions for standard workloads and is 60-100% faster than the state of the art for typical file system workloads.
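The performance argument, that random logical-to-physical placement destroys sequential read throughput while a locality-preserving placement does not, can be illustrated with a toy comparison (this is not an ORAM and carries none of SqORAM's security machinery; the placement functions are invented purely for illustration).

```c
/* Toy comparison of the head travel incurred by a sequential scan of logical
 * blocks under (a) a random placement and (b) a locality-preserving placement
 * that keeps logical neighbors physically close.  Illustration only; no ORAM
 * properties are implemented here. */
#include <stdio.h>
#include <stdlib.h>

#define NBLOCKS 1024

static long scan_distance(const long *phys)
{
    long dist = 0;
    for (int i = 1; i < NBLOCKS; i++)
        dist += labs(phys[i] - phys[i - 1]);
    return dist;
}

int main(void)
{
    long random_map[NBLOCKS], local_map[NBLOCKS];

    srand(42);
    for (int i = 0; i < NBLOCKS; i++) {
        random_map[i] = rand() % (NBLOCKS * 64);  /* scattered anywhere */
        local_map[i]  = i + rand() % 4;           /* stays near logical index */
    }

    printf("random placement:    head travel %ld blocks\n",
           scan_distance(random_map));
    printf("locality-preserving: head travel %ld blocks\n",
           scan_distance(local_map));
    return 0;
}
```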

