XStore: Fast RDMA-Based Ordered Key-Value Store Using Remote Learned Cache

2021 ◽  
Vol 17 (3) ◽  
pp. 1-32
Author(s):  
Xingda Wei ◽  
Rong Chen ◽  
Haibo Chen ◽  
Binyu Zang

RDMA (Remote Direct Memory Access) has gained considerable interest in network-attached in-memory key-value stores. However, traversing the remote tree-based index in ordered key-value stores with RDMA becomes a critical obstacle, causing an order-of-magnitude slowdown and limited scalability due to multiple round trips. Using an index cache with conventional wisdom—caching partial data and traversing it locally—usually has limited effect because of unavoidable capacity misses, massive random accesses, and costly cache invalidations. We argue that a machine learning (ML) model is a perfect cache structure for the tree-based index, termed the learned cache. Based on it, we design and implement XStore, an RDMA-based ordered key-value store with a new hybrid architecture that retains a tree-based index at the server to handle dynamic workloads (e.g., inserts) and leverages a learned cache at the client to handle static workloads (e.g., gets and scans). The key idea is to decouple ML model retraining from index updating by maintaining a layer of indirection from logical to actual positions of key-value pairs, which allows a stale learned cache to continue predicting a correct position for a lookup key. XStore ensures correctness using a validation mechanism with a fallback path and further uses speculative execution to minimize the cost of cache misses. Evaluations with YCSB benchmarks and production workloads show that a single XStore server can achieve over 80 million read-only requests per second, outperforming state-of-the-art RDMA-based ordered key-value stores (namely DrTM-Tree, Cell, and eRPC+Masstree) by up to 5.9× (from 3.7×). For workloads with inserts, XStore still provides up to 3.5× (from 2.7×) throughput speedup, achieving 53M reqs/s. The learned cache also reduces client-side memory usage and provides an efficient memory-performance tradeoff, e.g., saving 99% of memory at the cost of 20% of peak throughput.
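
To make the described lookup flow concrete—model prediction, one-sided read, validation, fallback—here is a minimal Python sketch. All names (model.predict, remote_read, server_lookup) are stand-ins of ours, not XStore's actual API:

```python
def learned_cache_get(key, model, remote_read, server_lookup):
    """Look up `key` via a client-side learned cache (illustrative only)."""
    # 1. The (possibly stale) learned model predicts the key's logical
    #    position plus an error bound, giving a small range to fetch.
    pos, err = model.predict(key)
    lo, hi = max(0, pos - err), pos + err

    # 2. One one-sided RDMA read fetches the candidate slots; the
    #    logical-to-actual indirection layer lets a stale model still
    #    land on the right entry after server-side updates.
    for k, v in remote_read(lo, hi):
        if k == key:          # 3. validation: did we read the right key?
            return v

    # 4. Fallback path on a cache miss: one RPC that traverses the
    #    server-side tree index.
    return server_lookup(key)
```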

2011 ◽  
Vol 14 (2) ◽  
Author(s):  
Thomas G Koch

Current estimates of obesity costs ignore the impact of future weight loss and gain, and may either over- or underestimate the economic consequences of weight loss. In light of this, I construct static and dynamic measures of the medical costs associated with body mass index (BMI), to be balanced against the cost of one-time interventions. This study finds that ignoring the implications of weight loss and gain over time overstates the medical-cost savings of such interventions by an order of magnitude. When the relationship between spending and age is allowed to vary, weight-loss attempts appear to be cost-effective starting and ending with middle age. Some interventions recently proven to decrease weight may also be cost-effective.


1999 ◽  
Vol 86 (5) ◽  
pp. 1657-1662 ◽  
Author(s):  
Young-Hui Chang ◽  
Rodger Kram

Previous studies have suggested that generating vertical force on the ground to support body weight (BWt) is the major determinant of the metabolic cost of running. Because horizontal forces exerted on the ground are often an order of magnitude smaller than vertical forces, some have reasoned that they have negligible cost. Using applied horizontal forces (AHF; negative is impeding, positive is aiding) equal to −6, −3, 0, +3, +6, +9, +12, and +15% of BWt, we estimated the cost of generating horizontal forces while subjects were running at 3.3 m/s. We measured rates of oxygen consumption (V̇O2) for eight subjects. We then used a force-measuring treadmill to measure ground reaction forces from another eight subjects. With an AHF of −6% BWt, V̇O2 increased 30% compared with normal running, presumably because of the extra work involved. With an AHF of +15% BWt, the subjects exerted ∼70% less propulsive impulse and exhibited a 33% reduction in V̇O2. Our data suggest that generating horizontal propulsive forces constitutes more than one-third of the total metabolic cost of normal running.
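
A back-of-the-envelope reading of the final claim (our arithmetic, assuming metabolic cost scales roughly linearly with propulsive impulse): removing ∼70% of the propulsive impulse cut V̇O2 by 33%, so generating the full propulsive impulse would account for about

\[
\frac{0.33}{0.70} \approx 0.47
\]

of the net cost of running, comfortably more than one-third.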


2009 ◽  
Vol 13 (2) ◽  
Author(s):  
Richard C. Hicks ◽  
Keith Wright

Implementations of inference engine systems incur many costs, including the cost of the inference engine itself, the cost of integrating the inference engine, and the cost of the specialized personnel needed to create and maintain the system. These costs make a very high return on investment a criterion for incorporating these systems into the corporate portfolio of applications and technologies. Recently, the No Inference Engine Theory (NIET) [8] has been developed for creating procedural propositional logic rule-based systems. NIET systems are implemented in traditional procedural languages such as C++ and do not need an inference engine or proprietary languages, thus eliminating the cost of the inference engine, the cost of integrating the system, and the cost of expertise in a proprietary language. In addition, these procedural systems are an order of magnitude faster [8] than inference systems and maintain linear performance. For problems using propositional logic, the procedural systems described in this paper offer dramatically lower costs, higher performance, and ease of integration. Lowering the external costs and eliminating the need for specialized skills should make NIET systems more profitable and lead to the wider use of propositional logic systems in business.
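
A minimal sketch of the "no inference engine" idea, with an invented rule set of ours (the paper itself targets procedural languages such as C++; Python is used here only for brevity):

```python
# Propositional rules compiled into ordinary procedural statements,
# so no inference engine or proprietary rule language is needed.

def assess_credit(income_ok: bool, history_ok: bool, employed: bool) -> str:
    # Rules are ordered so every antecedent is derived before it is
    # used, removing the match/fire cycle and keeping evaluation
    # linear in the number of rules.
    low_risk = income_ok and history_ok          # rule 1
    medium_risk = employed and not history_ok    # rule 2
    if low_risk:
        return "approve"
    if medium_risk:
        return "review"
    return "decline"
```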


Author(s):  
Eric Timmons ◽  
Brian C. Williams

State estimation methods based on hybrid discrete and continuous state models have emerged as a way of precisely computing belief states for real-world systems; however, they have difficulty scaling to systems with more than a handful of components. Classical, consistency-based diagnosis methods scale to this level by combining best-first enumeration and conflict-directed search. While best-first methods have been developed for hybrid estimation, conflict-directed methods have thus far been elusive: conflicts summarize constraint violations, but probabilistic hybrid estimation is relatively unconstrained. In this paper we present an approach (A*BC) that unifies best-first enumeration and conflict-directed search in relatively unconstrained problems through the concept of "bounding" conflicts, an extension of conflicts that represents tighter bounds on the cost of regions of the search space. Experiments show that an A*BC-powered state estimator produces estimates up to an order of magnitude faster than the current state of the art, particularly on large systems.
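
The following sketch is our schematic reading of the abstract, not the authors' A*BC code: a bounding conflict pairs a set of assignments with a tighter lower bound on the cost of any candidate containing them. Real A*BC discovers conflicts during search; here they are given up front to keep the sketch short:

```python
import heapq

def best_first(candidates, base_cost, conflicts):
    """Yield candidates in nondecreasing bounded-cost order.

    candidates: list of frozensets of assignments
    base_cost:  dict mapping candidate -> optimistic cost
    conflicts:  list of (frozenset of assignments, float bound)
    """
    def bound(cand):
        b = base_cost[cand]
        for assigns, conflict_bound in conflicts:
            if assigns <= cand:              # conflict covers candidate
                b = max(b, conflict_bound)   # tighten its lower bound
        return b

    # The tie-breaking index keeps frozensets out of heap comparisons.
    heap = [(bound(c), i, c) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap:
        _, _, cand = heapq.heappop(heap)
        yield cand
```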


2021 ◽  
Author(s):  
Robert Godin ◽  
James R. Durrant

The energy cost of lifetime gain in solar energy conversion systems is determined from a breadth of technologies. The cost of 87 meV per order of magnitude lifetime improvement is strikingly close to the 59 meV determined from a simple kinetic model.
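
The 59 meV figure matches the thermal energy per decade at room temperature, which is presumably what the simple kinetic model reduces to (our arithmetic, assuming T = 298 K):

\[
k_\mathrm{B} T \ln 10 \;=\; 0.0257\,\mathrm{eV} \times 2.303 \;\approx\; 59\,\mathrm{meV}.
\]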


Author(s):  
Mohammad Saleh Nambakhsh ◽  
M. Shiva

Exchange of databases between hospitals needs efficient and reliable transmission and storage techniques to cut down the cost of health care. This exchange involves a large amount of vital patient information such as biosignals and medical images. Interleaving one form of data, such as a 1-D signal, over digital images can combine the advantages of data security with efficient memory utilization (Norris, Englehart & Lovely, 2001), but nothing prevents the user from manipulating or copying the decrypted data for illegal uses. Embedding vital information of patients inside their scan images will help physicians make a better diagnosis of a disease. In order to solve these issues, watermarking algorithms have been proposed as a way to complement the encryption processes and provide some tools to track the retransmission and manipulation of multimedia contents (Barni, Podilchuk, Bartolini & Delp, 2001; Vallabha, 2003). A watermarking system is based on an imperceptible insertion of a watermark (a signal) in an image. This technique is adapted here for interleaving graphical ECG signals within medical images to reduce storage and transmission overheads as well as to aid computer-aided diagnostic systems. In this chapter, we present a new wavelet-based watermarking method combined with the EZW coder. The principle is to replace significant wavelet coefficients of ECG signals by the corresponding significant wavelet coefficients belonging to the host image, which is much bigger in size than the mark signal. This chapter presents a brief introduction to watermarking and the EZW coder that acts as a platform for our watermarking algorithm.
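
A hedged Python sketch of the coefficient-interleaving step, using the pywt wavelet library; the wavelet, decomposition level, band choice, and selection rule are our illustrative assumptions, whereas the chapter's actual method orders significant coefficients via the EZW coder:

```python
import numpy as np
import pywt

def embed(host_image: np.ndarray, ecg: np.ndarray) -> np.ndarray:
    """Interleave an ECG mark into a host image's wavelet coefficients."""
    # Decompose the 2-D host image and the 1-D ECG mark.
    coeffs = pywt.wavedec2(host_image, "haar", level=2)
    mark = pywt.wavedec(ecg, "haar", level=2)[0]  # ECG approximation band

    # Overwrite the smallest-magnitude coefficients of one detail band
    # with the mark (assumes mark.size <= band size).
    cH, cV, cD = coeffs[1]
    flat = cD.ravel().copy()
    idx = np.argsort(np.abs(flat))[: mark.size]
    flat[idx] = mark
    coeffs[1] = (cH, cV, flat.reshape(cD.shape))

    return pywt.waverec2(coeffs, "haar")
```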


2020 ◽  
Vol 34 (03) ◽  
pp. 2327-2334
Author(s):  
Vidal Alcázar ◽  
Pat Riddle ◽  
Mike Barley

In the past few years, new, very successful bidirectional heuristic search algorithms have been proposed. Their key novelty is a lower bound on the cost of a solution that includes information from the g values in both directions. Kaindl and Kainz (1997) proposed measuring how inaccurate a heuristic is while expanding nodes in the opposite direction, and using this information to raise the f value of the evaluated nodes. However, this approach has several disadvantages and has yet to be exploited to its full potential. Additionally, Sadhukhan (2013) presented BAE∗, a bidirectional best-first search algorithm based on the accumulated heuristic inaccuracy along a path. However, no complete comparison with other bidirectional algorithms, either theoretical or empirical, has yet been done. In this paper we define individual bounds within the lower-bound framework and show how both Kaindl and Kainz's and Sadhukhan's methods can be generalized, thus creating new bounds. This overcomes previous shortcomings and allows newer algorithms to benefit from these techniques as well. Experimental results show a substantial improvement, up to an order of magnitude in the number of necessarily expanded nodes compared to state-of-the-art near-optimal algorithms on common benchmarks.
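
For context, the lower-bound framework referred to above is commonly written as follows (a standard formulation from the bidirectional-search literature, our rendering rather than a quotation from this paper): for a forward-frontier node u and a backward-frontier node v,

\[
lb(u, v) \;=\; \max\bigl( f_F(u),\; f_B(v),\; g_F(u) + g_B(v) + \epsilon \bigr),
\]

where \(f_F(u) = g_F(u) + h_F(u)\), \(f_B(v) = g_B(v) + h_B(v)\), and \(\epsilon\) is the minimum edge cost; the methods discussed here tighten such bounds using observed heuristic inaccuracy.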


2020 ◽  
Vol 29 (6) ◽  
pp. 1287-1310
Author(s):  
Sebastian Kruse ◽  
Zoi Kaoudi ◽  
Bertty Contreras-Rojas ◽  
Sanjay Chawla ◽  
Felix Naumann ◽  
...  

Data analytics are moving beyond the limits of a single platform. In this paper, we present the cost-based optimizer of Rheem, an open-source cross-platform system that copes with these new requirements. The optimizer allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to determine efficient data movement plans among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. We show that our optimizer can perform tasks more than one order of magnitude faster when using multiple platforms than when using a single platform.
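
As a toy illustration of the objective such an optimizer minimizes (all names and costs below are invented; the real system explores plans via graph transformations and an enumeration algebra rather than brute force):

```python
from itertools import product

# Per-(subtask, platform) execution costs, plus a penalty for moving
# data across platforms between consecutive subtasks.
exec_cost = {("parse", "Spark"): 4, ("parse", "JavaStreams"): 1,
             ("join", "Spark"): 2, ("join", "JavaStreams"): 9}

def move_cost(a: str, b: str) -> int:
    # Crossing platforms forces data conversion/transfer.
    return 0 if a == b else 4

def best_plan(tasks, platforms):
    def cost(plan):
        run = sum(exec_cost[(t, p)] for t, p in zip(tasks, plan))
        move = sum(move_cost(p, q) for p, q in zip(plan, plan[1:]))
        return run + move
    return min(product(platforms, repeat=len(tasks)), key=cost)

print(best_plan(["parse", "join"], ["Spark", "JavaStreams"]))
# -> ('Spark', 'Spark'): one platform beats cherry-picking per-task
#    optima once data-movement costs are counted.
```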


2017 ◽  
Vol 29 (1) ◽  
pp. 52-64 ◽  
Author(s):  
Inês Bramão ◽  
Mikael Johansson

This study investigated context-dependent episodic memory retrieval. An influential idea in the memory literature is that performance benefits when the retrieval context overlaps with the original encoding context. However, such memory facilitation may not be driven by the encoding–retrieval overlap per se but by the presence of diagnostic features in the reinstated context that discriminate the target episode from competing episodes. To test this prediction, the encoding–retrieval overlap and the diagnostic value of the context were manipulated in a novel associative recognition memory task. Participants were asked to memorize word pairs presented together with diagnostic (unique) and nondiagnostic (shared) background scenes. At test, participants recognized the word pairs in the presence and absence of the previously encoded contexts. Behavioral data show facilitated memory performance in the presence of the original context but, importantly, only when the context was diagnostic of the target episode. The electrophysiological data reveal an early anterior ERP encoding–retrieval overlap effect that tracks the cost associated with having nondiagnostic contexts present at retrieval, that is, contexts shared by multiple previous episodes, and a later posterior encoding–retrieval overlap effect that reflects facilitated access to the target episode during retrieval in diagnostic contexts. Taken together, our results underscore the importance of the diagnostic value of the context and suggest that context-dependent episodic memory effects are multiply determined.

