Effect of node size on the performance of cache-conscious B+-trees

Author(s):  
Richard A. Hankins ◽  
Jignesh M. Patel
Keyword(s):  
B Trees ◽  
2003 ◽  
Vol 31 (1) ◽  
pp. 283-294

2021 ◽  
Vol 8 (3) ◽  
pp. 1-20
Author(s):  
Michael A. Bender ◽  
Alex Conway ◽  
Martín Farach-Colton ◽  
William Jannen ◽  
Yizheng Jiao ◽  
...  

Storage devices have complex performance profiles, including costs to initiate IOs (e.g., seek times in hard drives), parallelism and bank conflicts (in SSDs), costs to transfer data, and firmware-internal operations. The Disk-Access Machine (DAM) model simplifies reality by assuming that storage devices transfer data in blocks of size B and that all transfers have unit cost. Despite its simplifications, the DAM model is reasonably accurate. In fact, if B is set to the half-bandwidth point, where the latency and bandwidth of the hardware are equal, then the DAM approximates the IO cost on any hardware to within a factor of 2. Furthermore, the DAM model explains the popularity of B-trees in the 1970s and the current popularity of Bε-trees and log-structured merge trees. But it fails to explain why some B-trees use small nodes, whereas all Bε-trees use large nodes. In a DAM, all IOs, and hence all nodes, are the same size. In this article, we show that the affine and PDAM models, which are small refinements of the DAM model, yield a surprisingly large improvement in predictability without sacrificing ease of use. We present benchmarks on a large collection of storage devices showing that the affine and PDAM models give good approximations of the performance characteristics of hard drives and SSDs, respectively. We show that the affine model explains node-size choices in B-trees and Bε-trees. Furthermore, the models predict that B-trees are highly sensitive to variations in the node size, whereas Bε-trees are much less sensitive. These predictions are borne out empirically. Finally, we show that in both the affine and PDAM models, it pays to organize data structures to exploit varying IO size. In the affine model, Bε-trees can be optimized so that all operations are simultaneously optimal, even up to lower-order terms. In the PDAM model, Bε-trees (or B-trees) can be organized so that both sequential and concurrent workloads are handled efficiently.
We conclude that the DAM model is useful as a first cut when designing or analyzing an algorithm or data structure, but the affine and PDAM models enable the algorithm designer to optimize parameter choices and fill in design details.
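
The factor-of-2 claim in the abstract above can be checked with a small numerical sketch. It assumes one plausible reading of the affine model, in which an IO of n bytes costs a fixed setup time plus a per-byte transfer time, and it charges each DAM block the affine cost of one block at the half-bandwidth point. The function names and the device parameters are hypothetical illustrations, not values or code from the article.

```python
import math


def affine_cost(nbytes, setup, per_byte):
    """Cost of one IO of nbytes: fixed setup time plus linear transfer time."""
    return setup + per_byte * nbytes


def half_bandwidth_point(setup, per_byte):
    """IO size at which the setup time equals the transfer time."""
    return setup / per_byte


def dam_cost(nbytes, setup, per_byte):
    """DAM-style estimate: count blocks of size B (the half-bandwidth point)
    and charge each block the affine cost of a single block-sized IO."""
    block = half_bandwidth_point(setup, per_byte)
    blocks = math.ceil(nbytes / block)
    return blocks * affine_cost(block, setup, per_byte)


# Hypothetical hard-drive-like parameters (illustrative, not measured).
SETUP_S = 0.01       # assumed 10 ms to initiate an IO
PER_BYTE_S = 1e-8    # assumed 100 MB/s sequential transfer

sizes = [512, 4096, 65536, 2**20, 2**24, 2**28]
ratios = [
    dam_cost(n, SETUP_S, PER_BYTE_S) / affine_cost(n, SETUP_S, PER_BYTE_S)
    for n in sizes
]

for n, r in zip(sizes, ratios):
    print(f"{n:>10} bytes: DAM estimate / affine cost = {r:.3f}")
```

Under these assumptions, an IO spanning k blocks has an affine cost between k and k + 1 setup times, while the DAM estimate is 2k setup times, so the ratio stays between 1 and 2 for every IO size. The sketch only illustrates that bound; it says nothing about the PDAM model or about the article's benchmarks.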


1983 ◽  
Vol 13 (2) ◽  
pp. 36-38
Author(s):  
Juergen Klonk

2021 ◽  
Vol 555 ◽  
pp. 81-88
Author(s):  
Atsushi Igarashi ◽  
Takashi Kato ◽  
Hiromi Sesaki ◽  
Miho Iijima

2002 ◽  
Vol 149 (4) ◽  
pp. 251-256
Author(s):  
J.-M. Lin ◽  
H.-E. Yi ◽  
Y.-W. Chang

1985 ◽  
Vol 10 (1) ◽  
pp. 127-134
Author(s):  
K.D. Sharma ◽  
Rekha Rani

2012 ◽  
Vol 241-244 ◽  
pp. 3171-3174
Author(s):  
Chang Guang Shi

Many experts would agree that, had it not been for telephony, the construction of B-trees might never have occurred. Given the current status of random theory, information theorists urgently desire the unfortunate unification of virtual machines and voice-over-IP, which embodies the unproven principles of robotics. We show that even though voice-over-IP and e-commerce can collaborate to achieve this goal, courseware and Internet QoS can synchronize to realize this mission.

