Selective caching: a persistent memory approach for multi-dimensional index structures

Distributed and Parallel Databases ◽

10.1007/s10619-021-07327-0 ◽

2021 ◽

Author(s):

Muhammad Attahir Jibril ◽

Philipp Götze ◽

David Broneske ◽

Kai-Uwe Sattler

Keyword(s):

Main Memory ◽

Index Structure ◽

Index Structures ◽

Cloud Infrastructure ◽

General Technique ◽

Persistent Memory ◽

The Cost ◽

Cloud Applications ◽

Memory Layout ◽

Analytical Index

Abstract: After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolio, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for easy caching. Our experiments show that, if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results are also reflected when selective caching is used for parallel workloads.
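The mix of static and dynamic caching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SelectiveCache`, the dictionary standing in for persistent memory, the choice of which nodes to pin, and the use of LRU as the replacement strategy are all assumptions made for illustration.

```python
from collections import OrderedDict


class SelectiveCache:
    """Hypothetical DRAM-side cache in front of a slower persistent store."""

    def __init__(self, backing_store, pinned_ids, dynamic_capacity):
        # backing_store stands in for persistent memory (assumption: a dict).
        self.backing_store = backing_store
        # Static part: selected nodes are copied once and never evicted.
        # These initial copies are not counted as slow reads.
        self.pinned = {nid: backing_store[nid] for nid in pinned_ids}
        # Dynamic part: bounded LRU cache, oldest entry first.
        self.dynamic = OrderedDict()
        self.dynamic_capacity = dynamic_capacity
        self.slow_reads = 0  # number of reads served by the backing store

    def get(self, node_id):
        if node_id in self.pinned:
            return self.pinned[node_id]
        if node_id in self.dynamic:
            self.dynamic.move_to_end(node_id)  # mark as most recently used
            return self.dynamic[node_id]
        # Miss: read from the slow store and cache the node dynamically.
        self.slow_reads += 1
        node = self.backing_store[node_id]
        if self.dynamic_capacity > 0:
            self.dynamic[node_id] = node
            if len(self.dynamic) > self.dynamic_capacity:
                self.dynamic.popitem(last=False)  # evict least recently used
        return node


store = {i: f"node-{i}" for i in range(6)}
cache = SelectiveCache(store, pinned_ids=[0, 1], dynamic_capacity=2)
for node_id in [0, 2, 3, 2, 4, 3]:
    cache.get(node_id)
```

In this sketch, the pinned set would typically hold nodes that almost every lookup touches (for example, the upper levels of a tree), while the LRU part adapts to the current workload. Which nodes to pin and how large to make the dynamic part are exactly the configuration choices the abstract says must be made well.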

Study and Optimization of T-Tree Index in Main Memory Database

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.2531 ◽

2013 ◽

Vol 427-429 ◽

pp. 2531-2535 ◽

Cited By ~ 1

Author(s):

Feng Dong Sun ◽

Quan Guo ◽

Lan Wang

Keyword(s):

High Performance ◽

Main Memory ◽

Memory Access ◽

Index Structure ◽

Index Structures ◽

Clock Speed ◽

Main Memory Database ◽

Tree Index ◽

Overall Performance

In a main memory database, the bottleneck is not disk I/O but the fact that CPU clock speed is faster than memory speed. To achieve high performance in a main memory database, a good approach is to design new index structures that improve memory access speed. This chapter first presents the T-tree index structure and its algorithms in a main memory database. It then presents two optimizations of the T-tree index, the T-tail tree and the TTB-tree. Our results indicate that the T-Tree provides good overall performance in main memory.

Optimization of T-Tree Index of Main Memory Database in Critical Application

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.40-41.206 ◽

2010 ◽

Vol 40-41 ◽

pp. 206-211

Author(s):

Zhi Lin Zhu

Keyword(s):

Data Structures ◽

High Performance ◽

Main Memory ◽

Ongoing Study ◽

Index Structure ◽

Index Structures ◽

Main Memory Database ◽

Data Structures And Algorithms ◽

Tree Index ◽

Overall Performance

One approach to achieving high performance in a DBMS for critical applications is to store the database in main memory rather than on disk. One can then design new data structures and algorithms oriented towards increasing the efficiency of the main memory database (MMDB). In this paper we present some results on index structures from an ongoing study of MMDB. We propose a new index structure, the T-tail Tree. We give the main algorithms of the T-tail Tree and their performance. Our results indicate that the T-tail Tree provides good overall performance in main memory.

Micro Service Based System for Cost Based Selection of Cloud Provider Services

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e4848.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 985-991

Keyword(s):

Service Providers ◽

Cloud Service ◽

Daily Basis ◽

Cloud Services ◽

Cloud Provider ◽

Cloud Infrastructure ◽

Cloud Service Providers ◽

Architectural Framework ◽

The Cost ◽

Cloud Applications

The increase in the amount of data generated on a daily basis, coupled with the need to store and manage this data, has encouraged organizations to adopt cloud computing. To ensure better availability and reliability of their data and resources, most organizations use one or more cloud service providers. But the use of cloud resources also poses challenges, one of which is detailed monitoring. As the number of services used by cloud consumers increases, the number of logs and metrics they generate also scales rapidly. The dynamic nature of cloud infrastructure and the variety of services offered by several cloud vendors demand a sophisticated mechanism to calculate and analyze the cost of using different services. The billing reports from cloud service providers deliver statistics about resource usage and the associated costs. They contain a large amount of data that must be processed to gain useful information. In this paper, we propose a microservice-based architectural framework that gathers data from two different cloud service providers. This data is not only stored but also processed to generate reports that enable optimal use of cloud infrastructure. The microservices framework provides benefits and is a preferred framework for developing cloud applications. The main aim of this work is to provide an integrated mechanism for comparing the cost of using similar cloud services.

Efficient Retrieval of Music Recordings Using Graph-Based Index Structures

Signals ◽

10.3390/signals2020021 ◽

2021 ◽

Vol 2 (2) ◽

pp. 336-352

Author(s):

Frank Zalkow ◽

Julian Brandner ◽

Meinard Müller

Keyword(s):

Nearest Neighbor ◽

Response Times ◽

Negative Impact ◽

Nearest Neighbor Search ◽

Index Structure ◽

Search Problem ◽

Index Structures ◽

Music Retrieval ◽

Retrieval Systems ◽

Music Recordings

Flexible retrieval systems are required for conveniently browsing through large music collections. In a particular content-based music retrieval scenario, the user provides a query audio snippet, and the retrieval system returns music recordings from the collection that are similar to the query. In this scenario, a fast response from the system is essential for a positive user experience. For realizing low response times, one requires index structures that facilitate efficient search operations. One such index structure is the K-d tree, which has already been used in music retrieval systems. As an alternative, we propose to use a modern graph-based index, denoted as Hierarchical Navigable Small World (HNSW) graph. As our main contribution, we explore its potential in the context of a cross-version music retrieval application. In particular, we report on systematic experiments comparing graph- and tree-based index structures in terms of the retrieval quality, disk space requirements, and runtimes. Despite the fact that the HNSW index provides only an approximate solution to the nearest neighbor search problem, we demonstrate that it has almost no negative impact on the retrieval quality in our application. As our main result, we show that the HNSW-based retrieval is several orders of magnitude faster. Furthermore, the graph structure also works well with high-dimensional index items, unlike the tree-based structure. Given these merits, we highlight the practical relevance of the HNSW graph for music information retrieval (MIR) applications.
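For readers unfamiliar with the tree-based baseline mentioned above, the following Python sketch shows an exact nearest-neighbor search in a K-d tree. It is a generic textbook version, not the implementation evaluated in the paper. The function names `build_kd` and `nearest` and the dictionary-based node layout are assumptions made for illustration.

```python
import math


def build_kd(points, depth=0):
    """Build a K-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kd(points[:mid], depth + 1),
        "right": build_kd(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, point) of the exact nearest neighbor of query."""
    if node is None:
        return best
    d = math.dist(node["point"], query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    # Signed distance from the query to the splitting plane of this node.
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match so far; otherwise no point there can be an improvement.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kd(points)
dist, point = nearest(tree, (9, 2))
```

The pruning step is what makes the tree efficient in low dimensions. As the number of dimensions grows, the splitting plane is rarely farther away than the current best match, so more branches must be visited. That is consistent with the abstract's remark that the tree-based structure copes poorly with high-dimensional index items.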

A Hybrid Approach Combining R*-Tree and k-d Trees to Improve Linked Open Data Query Performance

Applied Sciences ◽

10.3390/app11052405 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2405

Author(s):

Yuxiang Sun ◽

Tianyi Zhao ◽

Seulgi Yoon ◽

Yongju Lee

Keyword(s):

Flash Memory ◽

Query Language ◽

Hybrid Approach ◽

Open Data ◽

Main Memory ◽

Linked Open Data ◽

Index Structure ◽

Identification Algorithm ◽

Distributed Computing Systems ◽

Query Performance

Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based resource description framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it can quickly obtain target results by reducing unnecessary data scanning and reduce the amount of main memory required to load filtering results.

Selection of optimal data placement using cloud infrastructure

Proceedings of the Russian higher school Academy of sciences ◽

10.17212/1727-2769-2021-2-34-42 ◽

2021 ◽

pp. 34-42

Author(s):

Vladimir Meikshan ◽

Natalia Teslya

Keyword(s):

Mathematical Model ◽

Data Storage ◽

Service Providers ◽

Cloud Service ◽

Cloud Services ◽

Information Storage ◽

Cloud Infrastructure ◽

Digital Ecosystem ◽

Cloud Service Providers ◽

The Cost

The benefits of cloud technology are obvious and its application is expanding, which drives steady growth in demand. Cloud computing has acquired particular relevance for large companies in Internet services, retailing, and logistics that generate large volumes of business and other information. Cloud technologies allow resources to be consumed jointly and solve the problems of storing and transferring significant amounts of data. Russian consumer cooperation comprises large, territorially distributed organizations that are actively forming their own digital ecosystem. Data storage and processing is therefore highly relevant for consumer cooperation organizations. At the same time, the prices of cloud service providers differ significantly, which requires solving the problem of minimizing the cost of storing and transferring significant amounts of data. Linear programming is applied to select the optimal data storage scheme across several cloud service providers whose packages have different technical and economic parameters (maximum amount of storage, cost of allocated resources). The mathematical model includes the cost equation for data storage and transfer and restrictions on the amount of storage, the amount of data, and its safety. Microsoft Excel, in combination with the "search for solutions" add-on, is chosen as the software tool for the numerical calculations. In accordance with the mathematical model, the conditions for minimizing cloud storage costs and the necessary restrictions are established. Initial data are set for three data forming centers, storages of a certain size at five cloud service providers, and a nominal price for information storage and transmission. Expenses are calculated in several variants: without optimization, with the optimization problem solved, and with a price increase by cloud service providers. The results of the calculations confirm the necessity of solving the problem of minimizing the cost of cloud services for corporate clients. The presented model can be extended to any cost conditions as well as to different areas of cloud applications.
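The abstract does not give the model itself, but a linear program of this kind might take the following generic form. All symbols are assumptions introduced here for illustration: $x_{ij}$ is the amount of data from center $i$ stored with provider $j$, $d_i$ is the data volume of center $i$, $S_j$ is the maximum storage of provider $j$, $s_j$ is the unit storage price, and $t_{ij}$ is the unit transfer price. With the initial data described above, $m = 3$ and $n = 5$. The safety restriction is omitted because the abstract does not specify it.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{m}\sum_{j=1}^{n} \left(s_j + t_{ij}\right) x_{ij} \\
\text{s.t.}\quad & \sum_{j=1}^{n} x_{ij} = d_i, && i = 1,\dots,m, \\
& \sum_{i=1}^{m} x_{ij} \le S_j, && j = 1,\dots,n, \\
& x_{ij} \ge 0 .
\end{aligned}
```

The first constraint places all of each center's data, and the second keeps every provider within its storage limit. The "price increase" variant mentioned in the abstract would correspond to re-solving the same program with larger $s_j$ or $t_{ij}$.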

Operational Cost of Running Real-Time Mobile Cloud Applications

Advances in Wireless Technologies and Telecommunication - Enabling Real-Time Mobile Cloud Computing through Emerging Technologies ◽

10.4018/978-1-4666-8662-5.ch010 ◽

2015 ◽

pp. 294-321 ◽

Cited By ~ 2

Author(s):

Ovunc Kocabas ◽

Regina Gyampoh-Vidogah ◽

Tolga Soyata

Keyword(s):

Real Time ◽

Large Scale ◽

Service Level ◽

Cloud Services ◽

Mobile Cloud ◽

Cost Models ◽

Operational Cost ◽

Resource Requirements ◽

The Cost ◽

Cloud Applications

This chapter describes the concepts and cost models used for determining the cost of providing cloud services to mobile applications using different pricing models. Two recently implemented mobile-cloud applications are studied in terms of both the cost of providing such services by the cloud operator, and the cost of operating them by the cloud user. Computing resource requirements of both applications are identified and worksheets are presented to demonstrate how businesses can estimate the operational cost of implementing such real-time mobile cloud applications at a large scale, as well as how much cloud operators can profit from providing resources for these applications. In addition, the nature of available service level agreements (SLA) and the importance of quality of service (QoS) specifications within these SLAs are emphasized and explained for mobile cloud application deployment.

Cloud Security Engineering

Cloud Computing Advancements in Design, Implementation, and Technologies ◽

10.4018/978-1-4666-1879-4.ch010 ◽

2013 ◽

pp. 147-153 ◽

Cited By ~ 1

Author(s):

Shadi Aljawarneh

Keyword(s):

Life Cycle ◽

Ad Hoc ◽

Service Development ◽

Infrastructure Development ◽

Cloud Infrastructure ◽

Data Owner ◽

Development Life Cycle ◽

Platform Development ◽

Cloud Applications ◽

Ad Hoc Security

Information security is a key challenge in the Cloud because the data will be virtualized across different host machines, hosted on the Web. The Cloud provides a channel to the service or platform in which it operates. However, the owners of data will be worried because their data and software are not under their control. In addition, the data owner may not know where the data is geographically located at any particular time. So there is still a question mark over how data can be more secure if the owner does not control its data and software. Indeed, due to the lack of control over the Cloud infrastructure, the use of ad-hoc security tools is not sufficient to protect the data in the Cloud; this paper discusses this security issue. Furthermore, a vision and strategy are proposed to mitigate or avoid the security threats in the Cloud. This broad vision is based on software engineering principles to secure Cloud applications and services. In this vision, security is built into all phases of the Service Development Life Cycle (SDLC), Platform Development Life Cycle (PDLC), or Infrastructure Development Life Cycle (IDLC).
