data placement
Recently Published Documents


TOTAL DOCUMENTS

561
(FIVE YEARS 121)

H-INDEX

28
(FIVE YEARS 5)

2022 ◽  
Vol 41 (2) ◽  
pp. 661-676
Author(s):  
B. Prabhu Shankar ◽  
S. Chitra

2021 ◽  
Vol 11 (24) ◽  
pp. 11842
Author(s):  
Gijun Oh ◽  
Junseok Yang ◽  
Sungyong Ahn

Log-structured merge-tree (LSM-tree)-based key–value stores are attracting attention for their high input/output (I/O) performance, which stems from their sequential write pattern. However, excessive writes caused by compaction shorten the lifespan of solid-state drives (SSDs). Several studies therefore aim to reduce garbage collection overhead by using Zoned Namespace (ZNS) SSDs, in which the host can determine data placement. However, the existing studies achieve limited performance improvement because they do not consider the lifetime and hotness of key–value data. In this paper, we propose a technique that improves space efficiency and minimizes the garbage collection overhead of SSDs by placing key–value data according to its characteristics. The proposed method was implemented by modifying ZenFS of RocksDB; according to the performance evaluation, space efficiency improved by up to 75%.
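The placement idea in this abstract can be illustrated with a minimal sketch. It is not the authors' ZenFS modification: the `Zone` and `LifetimeAwareAllocator` classes, the integer `lifetime_class` hint, and the file names and sizes are all hypothetical. The sketch assumes that files with a similar expected lifetime should share a zone, so that a zone's contents tend to become invalid together and the zone can be reset without first copying live data elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """A hypothetical append-only zone holding files of one lifetime class."""
    capacity: int
    lifetime_class: int
    used: int = 0
    files: list = field(default_factory=list)

    def fits(self, size: int) -> bool:
        return self.used + size <= self.capacity


class LifetimeAwareAllocator:
    """Keeps one open zone per lifetime class (illustrative policy only)."""

    def __init__(self, zone_capacity: int):
        self.zone_capacity = zone_capacity
        self.zones = []      # every zone allocated so far
        self.open_zone = {}  # lifetime_class -> currently open Zone

    def place(self, name: str, size: int, lifetime_class: int) -> Zone:
        if size > self.zone_capacity:
            raise ValueError("file is larger than a zone")
        zone = self.open_zone.get(lifetime_class)
        # Open a fresh zone when this class has none or the open one is full.
        if zone is None or not zone.fits(size):
            zone = Zone(self.zone_capacity, lifetime_class)
            self.zones.append(zone)
            self.open_zone[lifetime_class] = zone
        zone.files.append(name)
        zone.used += size
        return zone


if __name__ == "__main__":
    alloc = LifetimeAwareAllocator(zone_capacity=100)
    # Assumption: a lower class number means shorter-lived data.
    alloc.place("a.sst", 60, lifetime_class=0)
    alloc.place("b.sst", 60, lifetime_class=1)
    alloc.place("c.sst", 30, lifetime_class=0)
    for z in alloc.zones:
        print(z.lifetime_class, z.files, z.used)
```

A real policy would also need to derive the lifetime hint itself, for example from the LSM-tree level or observed access frequency; the sketch takes it as given.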


Author(s):  
Hindol Bhattacharya ◽  
Matangini Chattopadhyay ◽  
Samiran Chattopadhyay

2021 ◽  
Vol 11 (21) ◽  
pp. 9940
Author(s):  
Jack Marquez ◽  
Oscar H. Mondragon ◽  
Juan D. Gonzalez

Cloud computing systems are rapidly evolving toward multicloud architectures built on heterogeneous hardware. Cloud service providers now widely offer different types of storage infrastructure and multi-NUMA (nonuniform memory access) servers. Existing cloud resource allocation solutions do not comprehensively consider this heterogeneous infrastructure. In this study, we present a novel approach consisting of a hierarchical framework based on genetic programming to solve problems related to data placement and virtual machine allocation for analytics applications running on heterogeneous hardware with a variety of storage types and nonuniform memory access. Our approach optimizes data placement using the Hadoop File System on heterogeneous storage devices in multicloud systems. It guarantees the efficient allocation of virtual machines on physical machines with multiple NUMA domains by minimizing contention between workloads. We show that our solutions for data placement and virtual machine allocation outperform other state-of-the-art approaches.


2021 ◽  
pp. 102295
Author(s):  
Jinting Ren ◽  
Xianzhang Chen ◽  
Duo Liu ◽  
Yujuan Tan ◽  
Moming Duan ◽  
...  

2021 ◽  
Vol 2 (3) ◽  
pp. 1-28
Author(s):  
Jie Song ◽  
Qiang He ◽  
Feifei Chen ◽  
Ye Yuan ◽  
Ge Yu

In big data query processing, there is a trade-off between query accuracy and query efficiency; for example, sampling-based query approaches trade query completeness for efficiency. In this article, we argue that query performance can be significantly improved by slightly reducing the possibility of query completeness, that is, the chance that a query is complete. To quantify this possibility, we define a new concept, Probability of query Completeness (hereinafter referred to as PC). For example, if a query is executed 100 times, PC = 0.95 guarantees that there are no more than 5 incomplete results among the 100 results. Leveraging probabilistic data placement and scanning, we trade off PC for query performance. We propose PoBery (POssibly-complete Big data quERY), a method that supports neither complete queries nor incomplete queries, but possibly-complete queries. Experimental results on HiBench show that PoBery can significantly accelerate queries while ensuring the PC. Specifically, it is guaranteed that the percentage of complete queries is larger than the given PC confidence. Through comparison with state-of-the-art key-value stores, we show that while Drill-based PoBery performs as fast as Drill on complete queries, it is 1.7×, 1.1×, and 1.5× faster on average than Drill, Impala, and Hive, respectively, on possibly-complete queries.


2021 ◽  
Vol 17 (3) ◽  
pp. 1-32
Author(s):  
Amina Chikhaoui ◽  
Laurent Lemarchand ◽  
Kamel Boukhalfa ◽  
Jalil Boukhobza

Cloud federation enables service providers to collaborate to provide better services to customers. For cloud storage services, optimizing the placement of customer objects for a member of a federation is a real challenge. Storage, migration, and latency costs need to be considered, and these costs conflict in some cases. In this article, we model object placement as a multi-objective optimization problem. The proposed model takes into account parameters related to the local infrastructure, the federated environment, customer workloads, and their SLAs. To solve this problem, we propose CDP-NSGAII IR, a Constraint Data Placement matheuristic based on NSGAII with Injection and Repair functions. The injection function aims to enhance the quality of the solutions: it computes some solutions using an exact method and then injects them into the initial population of NSGAII. The repair function ensures that the solutions obey the problem constraints and thus prevents the exploration of large sets of infeasible solutions, which drastically reduces the execution time of NSGAII. Experimental results show that the injection function improves the hypervolume (HV) of NSGAII and the exact method by up to 94% and 60%, respectively, while the repair function reduces the execution time by an average of 68%.
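The injection and repair steps described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' CDP-NSGAII IR: a solution is assumed to be a list mapping each object index to a storage site, the only constraint modeled is per-site capacity, the repair is a simple greedy move that may not fix every violation, and the "injected" solutions are passed in as a plain list rather than computed by an exact method. The names `repair`, `is_feasible`, and `initial_population` are hypothetical, and the NSGA-II selection loop itself is omitted.

```python
import random


def site_loads(assignment, sizes, capacities):
    """Total object size currently assigned to each site."""
    load = [0] * len(capacities)
    for obj, site in enumerate(assignment):
        load[site] += sizes[obj]
    return load


def is_feasible(assignment, sizes, capacities):
    load = site_loads(assignment, sizes, capacities)
    return all(l <= c for l, c in zip(load, capacities))


def repair(assignment, sizes, capacities):
    """Greedily move objects off overloaded sites (best effort, not exact)."""
    fixed = list(assignment)
    load = site_loads(fixed, sizes, capacities)
    for obj, site in enumerate(fixed):
        if load[site] <= capacities[site]:
            continue  # this object's site is within capacity
        # Candidate sites that can take the object without overflowing.
        candidates = [s for s in range(len(capacities))
                      if s != site and load[s] + sizes[obj] <= capacities[s]]
        if not candidates:
            continue
        # Prefer the site with the most free space.
        target = max(candidates, key=lambda s: capacities[s] - load[s])
        load[site] -= sizes[obj]
        load[target] += sizes[obj]
        fixed[obj] = target
    return fixed


def initial_population(pop_size, sizes, capacities, injected, rng):
    """Seed the population with injected solutions, then fill it randomly."""
    population = [repair(s, sizes, capacities) for s in injected[:pop_size]]
    while len(population) < pop_size:
        candidate = [rng.randrange(len(capacities)) for _ in sizes]
        population.append(repair(candidate, sizes, capacities))
    return population


if __name__ == "__main__":
    sizes, capacities = [4, 4, 4], [8, 8]
    pop = initial_population(4, sizes, capacities,
                             injected=[[0, 1, 0]], rng=random.Random(0))
    print(pop)
```

Repairing candidates before evaluation keeps the search inside the feasible region, which is the effect the abstract credits for the shorter execution time.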

