Opportunities for optimism in contended main-memory multicore transactions

AgiMonitor: a remote database monitoring tool for main memory database

Future Information Engineering ◽

10.2495/icie130752 ◽

2014 ◽

Author(s):

Huazhuang Yao ◽

Yongyan Wang ◽

Shuai Wang ◽

Kun Li ◽

Chao Guo

Keyword(s):

Main Memory ◽

Monitoring Tool ◽

Main Memory Database

A Scheme for Protecting Confidentiality of No-volatile Main Memory Based on Phase-Change Memory

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2011.02114 ◽

2011 ◽

Vol 34 (11) ◽

pp. 2114-2120 ◽

Cited By ~ 1

Author(s):

Peng ZHAO ◽

Long-Yun ZHU

Keyword(s):

Phase Change ◽

Phase Change Memory ◽

Main Memory ◽

Change Memory

Efficient local locking for massively multithreaded in-memory hash-based operators

The VLDB Journal ◽

10.1007/s00778-020-00642-5 ◽

2021 ◽

Author(s):

Bashar Romanous ◽

Skyler Windh ◽

Ildar Absalyamov ◽

Prerna Budhkar ◽

Robert Halstead ◽

...

Keyword(s):

Relational Databases ◽

Aggregation Operators ◽

Main Memory ◽

Paradigm Shifts ◽

Multithreaded Processors ◽

Cache Hierarchies ◽

Processor Architectures ◽

Spatial Locality ◽

Content Addressable Memories ◽

Multi Core Processor

Abstract. The join and group-by aggregation are two memory intensive operators that are affecting the performance of relational databases. Hashing is a common approach used to implement both operators. Recent paradigm shifts in multi-core processor architectures have reinvigorated research into how the join and group-by aggregation operators can leverage these advances. However, the poor spatial locality of the hashing approach has hindered performance on multi-core processor architectures which rely on using large cache hierarchies for latency mitigation. Multithreaded architectures can better cope with poor spatial locality by masking memory latency with many outstanding requests. Nevertheless, the number of parallel threads, even in the most advanced multithreaded processors, such as UltraSPARC, is not enough to fully cover the main memory access latency. In this paper, we explore the hardware re-configurability of FPGAs to enable deeper execution pipelines that maintain hundreds (instead of tens) of outstanding memory requests across four FPGAs, drastically increasing concurrency and throughput. We present two end-to-end in-memory accelerators for the join and group-by aggregation operators using FPGAs. Both accelerators use massive multithreading to mask long memory delays of traversing linked-list data structures, while concurrently managing hundreds of thread states across four FPGAs locally. We explore how content addressable memories can be intermixed within our multithreaded designs to act as a synchronizing cache, which enforces locks and merges jobs together before they are written to memory. Throughput results for our hash-join operator accelerator show a speedup between 2× and 3.4× over the best multi-core approaches with comparable memory bandwidths on uniform and skewed datasets. The accelerator for the hash-based group-by aggregation operator demonstrates that leveraging CAMs achieves average speedup of 3.3× with a best case of 9.4× in terms of throughput over CPU implementations across five types of data distributions.
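
For readers unfamiliar with the two operators named in this abstract, the following is a minimal single-threaded software sketch of a hash join (build, then probe) and a hash-based group-by aggregation. It is only an illustration of the general hashing approach, not the FPGA accelerators described in the paper; the function names and the row-as-dictionary layout are assumptions made for this example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dict rows using a hash table (illustrative only)."""
    # Build phase: hash every row of the (ideally smaller) build relation.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row and emit one output row per match.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            result.append({**match, **row})
    return result


def hash_group_by_sum(rows, key, value):
    """Group rows by `key` and sum `value` per group using a hash table."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical sample data.
customers = [{"cust": 1, "name": "a"}, {"cust": 2, "name": "b"}]
orders = [
    {"cust": 1, "amount": 10},
    {"cust": 2, "amount": 5},
    {"cust": 1, "amount": 7},
]

joined = hash_join(customers, orders, "cust", "cust")
totals = hash_group_by_sum(orders, "cust", "amount")
```

Both loops touch hash buckets in an order dictated by the key values, which is the source of the poor spatial locality the abstract mentions.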

A Hybrid Approach Combining R*-Tree and k-d Trees to Improve Linked Open Data Query Performance

Applied Sciences ◽

10.3390/app11052405 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2405

Author(s):

Yuxiang Sun ◽

Tianyi Zhao ◽

Seulgi Yoon ◽

Yongju Lee

Keyword(s):

Flash Memory ◽

Query Language ◽

Hybrid Approach ◽

Open Data ◽

Main Memory ◽

Linked Open Data ◽

Index Structure ◽

Identification Algorithm ◽

Distributed Computing Systems ◽

Query Performance

Semantic Web has recently gained traction with the use of Linked Open Data (LOD) on the Web. Although numerous state-of-the-art methodologies, standards, and technologies are applicable to the LOD cloud, many issues persist. Because the LOD cloud is based on graph-based resource description framework (RDF) triples and the SPARQL query language, we cannot directly adopt traditional techniques employed for database management systems or distributed computing systems. This paper addresses how the LOD cloud can be efficiently organized, retrieved, and evaluated. We propose a novel hybrid approach that combines the index and live exploration approaches for improved LOD join query performance. Using a two-step index structure combining a disk-based 3D R*-tree with the extended multidimensional histogram and flash memory-based k-d trees, we can efficiently discover interlinked data distributed across multiple resources. Because this method rapidly prunes numerous false hits, the performance of join query processing is remarkably improved. We also propose a hot-cold segment identification algorithm to identify regions of high interest. The proposed method is compared with existing popular methods on real RDF datasets. Results indicate that our method outperforms the existing methods because it can quickly obtain target results by reducing unnecessary data scanning and reduce the amount of main memory required to load filtering results.

Selective caching: a persistent memory approach for multi-dimensional index structures

Distributed and Parallel Databases ◽

10.1007/s10619-021-07327-0 ◽

2021 ◽

Author(s):

Muhammad Attahir Jibril ◽

Philipp Götze ◽

David Broneske ◽

Kai-Uwe Sattler

Keyword(s):

Main Memory ◽

Index Structure ◽

Index Structures ◽

Cloud Infrastructure ◽

General Technique ◽

Persistent Memory ◽

The Cost ◽

Cloud Applications ◽

Memory Layout ◽

Analytical Index

Abstract. After the introduction of Persistent Memory in the form of Intel’s Optane DC Persistent Memory on the market in 2019, it has found its way into manifold applications and systems. As Google and other cloud infrastructure providers are starting to incorporate Persistent Memory into their portfolio, it is only logical that cloud applications have to exploit its inherent properties. Persistent Memory can serve as a DRAM substitute, but guarantees persistence at the cost of compromised read/write performance compared to standard DRAM. These properties particularly affect the performance of index structures, since they are subject to frequent updates and queries. However, adapting each and every index structure to exploit the properties of Persistent Memory is tedious. Hence, we require a general technique that hides this access gap, e.g., by using DRAM caching strategies. To exploit Persistent Memory properties for analytical index structures, we propose selective caching. It is based on a mixture of dynamic and static caching of tree nodes in DRAM to reach near-DRAM access speeds for index structures. In this paper, we evaluate selective caching on the OLAP-optimized main-memory index structure Elf, because its memory layout allows for an easy caching. Our experiments show that if configured well, selective caching with a suitable replacement strategy can keep pace with pure DRAM storage of Elf while guaranteeing persistence. These results are also reflected when selective caching is used for parallel workloads.
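
The idea of mixing static and dynamic caching of tree nodes can be sketched roughly as follows. This is a toy model under stated assumptions, not the authors' implementation: a plain dictionary stands in for Persistent Memory, the statically cached nodes are chosen by the caller, and the dynamic part uses LRU replacement, which is only one possible choice since the abstract does not name the replacement strategy. The class name `SelectiveCache` and its parameters are hypothetical.

```python
from collections import OrderedDict


class SelectiveCache:
    """Toy two-part node cache: a pinned (static) set plus an LRU (dynamic) set."""

    def __init__(self, backing, pinned_ids, dynamic_capacity):
        self.backing = backing            # dict simulating slow persistent memory
        # Static part: nodes copied to "DRAM" once and never evicted.
        self.static = {nid: backing[nid] for nid in pinned_ids}
        # Dynamic part: bounded cache, least recently used entry first.
        self.dynamic = OrderedDict()
        self.capacity = dynamic_capacity
        self.slow_reads = 0               # number of reads served by the backing store

    def get(self, node_id):
        if node_id in self.static:
            return self.static[node_id]
        if node_id in self.dynamic:
            self.dynamic.move_to_end(node_id)   # mark as most recently used
            return self.dynamic[node_id]
        # Miss: read from the slow store and admit into the dynamic part.
        self.slow_reads += 1
        node = self.backing[node_id]
        if self.capacity > 0:
            self.dynamic[node_id] = node
            if len(self.dynamic) > self.capacity:
                self.dynamic.popitem(last=False)  # evict least recently used
        return node


# Hypothetical usage: node 0 (e.g. a root) is pinned, two dynamic slots.
backing = {i: f"node{i}" for i in range(5)}
cache = SelectiveCache(backing, pinned_ids=[0], dynamic_capacity=2)
for nid in [0, 1, 2, 1, 3, 1, 2]:
    cache.get(nid)
```

Pinning the upper levels of a tree statically is a common choice because they are visited by almost every lookup, while the dynamic part adapts to the current working set.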

Robust Performance of Main Memory Data Structures by Configuration

Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data ◽

10.1145/3318464.3389725 ◽

2020 ◽

Cited By ~ 1

Author(s):

Tiemo Bang ◽

Ismail Oukid ◽

Norman May ◽

Ilia Petrov ◽

Carsten Binnig

Keyword(s):

Data Structures ◽

Main Memory ◽

Robust Performance

GPrimer: a fast GPU-based pipeline for primer design for qPCR experiments

BMC Bioinformatics ◽

10.1186/s12859-021-04133-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Jeongmin Bae ◽

Hajin Jeon ◽

Min-Soo Kim

Keyword(s):

Data Structures ◽

Primer Design ◽

Main Memory ◽

Design Tools ◽

Workload Balancing ◽

qPCR Analysis ◽

Computational Speed ◽

Entire Sequence ◽

Target Sequences ◽

Coalesced Memory

Abstract. Background: Design of valid high-quality primers is essential for qPCR experiments. MRPrimer is a powerful pipeline based on MapReduce that combines both primer design for target sequences and homology tests on off-target sequences. It takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB. Due to the effectiveness of primers designed by MRPrimer in qPCR analysis, it has been widely used for developing many online design tools and building primer databases. However, the computational speed of MRPrimer is too slow to deal with the sizes of sequence DBs growing exponentially and thus must be improved. Results: We develop a fast GPU-based pipeline for primer design (GPrimer) that takes the same input and returns the same output with MRPrimer. MRPrimer consists of a total of seven MapReduce steps, among which two steps are very time-consuming. GPrimer significantly improves the speed of those two steps by exploiting the computational power of GPUs. In particular, it designs data structures for coalesced memory access in GPU and workload balancing among GPU threads and copies the data structures between main memory and GPU memory in a streaming fashion. For human RefSeq DB, GPrimer achieves a speedup of 57 times for the entire steps and a speedup of 557 times for the most time-consuming step using a single machine of 4 GPUs, compared with MRPrimer running on a cluster of six machines. Conclusions: We propose a GPU-based pipeline for primer design that takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB at once without an additional step using BLAST-like tools. The software is available at https://github.com/qhtjrmin/GPrimer.git.

Implications of NVM Based Storage on Memory Subsystem Management

Applied Sciences ◽

10.3390/app10030999 ◽

2020 ◽

Vol 10 (3) ◽

pp. 999

Author(s):

Hyokyung Bahn ◽

Kyungwoon Cho

Keyword(s):

Random Access ◽

Disk Drive ◽

Main Memory ◽

Memory Storage ◽

Storage Device ◽

Storage Devices ◽

Large Memory ◽

Memory Subsystems ◽

Non Volatile Memory ◽

Management Techniques

Recently, non-volatile memory (NVM) has advanced as a fast storage medium, and legacy memory subsystems optimized for DRAM (dynamic random access memory) and HDD (hard disk drive) hierarchies need to be revisited. In this article, we explore the memory subsystems that use NVM as an underlying storage device and discuss the challenges and implications of such systems. As storage performance becomes close to DRAM performance, existing memory configurations and I/O (input/output) mechanisms should be reassessed. This article explores the performance of systems with NVM based storage emulated by the RAMDisk under various configurations. Through our measurement study, we make the following findings. (1) We can decrease the main memory size without performance penalties when NVM storage is adopted instead of HDD. (2) For buffer caching to be effective, judicious management techniques like admission control are necessary. (3) Prefetching is not effective in NVM storage. (4) The effect of synchronous I/O and direct I/O in NVM storage is less significant than that in HDD storage. (5) Performance degradation due to the contention of multi-threads is less severe in NVM based storage than in HDD. Based on these observations, we discuss a new PC configuration consisting of small memory and fast storage in comparison with a traditional PC consisting of large memory and slow storage. We show that this new memory-storage configuration can be an alternative solution for ever-growing memory demands and the limited density of DRAM memory. We anticipate that our results will provide directions in system software development in the presence of ever-faster storage devices.

Main Memory-Based Algorithms for Efficient Parallel Aggregation for Temporal Databases

Distributed and Parallel Databases ◽

10.1023/b:dapd.0000028553.70337.e1 ◽

2004 ◽

Vol 16 (2) ◽

pp. 123-163 ◽

Cited By ~ 12

Author(s):

Dengfeng Gao ◽

Jose Alvin G. Gendrano ◽

Bongki Moon ◽

Richard T. Snodgrass ◽

Minseok Park ◽

...

Keyword(s):

Main Memory ◽

Temporal Databases

Dynamic Time-slice Scaling for Addressing OS Problems Incurred by Main Memory DVFS in Intelligent System

Mobile Networks and Applications ◽

10.1007/s11036-015-0587-2 ◽

2015 ◽

Vol 20 (2) ◽

pp. 157-168 ◽

Cited By ~ 4

Author(s):

Gangyong Jia ◽

Guangjie Han ◽

Jinfang Jiang ◽

Aohan Li

Keyword(s):

Intelligent System ◽

Main Memory ◽

Time Slice ◽

Dynamic Time
