Towards multi-purpose main-memory storage structures: Exploiting sub-space distance equalities in totally ordered data sets for exact knn queries

Implications of NVM Based Storage on Memory Subsystem Management

Applied Sciences ◽

10.3390/app10030999 ◽

2020 ◽

Vol 10 (3) ◽

pp. 999

Author(s):

Hyokyung Bahn ◽

Kyungwoon Cho

Keyword(s):

Random Access ◽

Disk Drive ◽

Main Memory ◽

Memory Storage ◽

Storage Device ◽

Storage Devices ◽

Large Memory ◽

Memory Subsystems ◽

Non Volatile Memory ◽

Management Techniques

Recently, non-volatile memory (NVM) has advanced as a fast storage medium, and legacy memory subsystems optimized for DRAM (dynamic random access memory) and HDD (hard disk drive) hierarchies need to be revisited. In this article, we explore memory subsystems that use NVM as the underlying storage device and discuss the challenges and implications of such systems. As storage performance approaches DRAM performance, existing memory configurations and I/O (input/output) mechanisms should be reassessed. We measure the performance of systems with NVM-based storage, emulated by a RAMDisk, under various configurations. Our measurement study yields the following findings. (1) The main memory size can be decreased without performance penalties when NVM storage is adopted instead of HDD. (2) For buffer caching to be effective, judicious management techniques such as admission control are necessary. (3) Prefetching is not effective in NVM storage. (4) The effect of synchronous I/O and direct I/O is less significant in NVM storage than in HDD storage. (5) Performance degradation due to contention among multiple threads is less severe in NVM-based storage than in HDD. Based on these observations, we discuss a new PC configuration consisting of small memory and fast storage in comparison with a traditional PC consisting of large memory and slow storage. We show that this new memory-storage configuration can be an alternative solution to ever-growing memory demands and the limited density of DRAM. We anticipate that our results will provide directions for system software development in the presence of ever-faster storage devices.
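
Finding (2) above mentions admission control for the buffer cache. The abstract does not say which admission policy the authors used, so the following is only a minimal sketch of one common variant: a block is admitted to an LRU cache on its second reference, tracked through a small history of recently seen block ids. The class name `AdmissionLRUCache`, the `history_size` parameter, and the `load_block` callback are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class AdmissionLRUCache:
    """LRU buffer cache that admits a block only on its second reference.

    Illustrative policy only; not taken from the cited article.
    """

    def __init__(self, capacity, history_size):
        self.capacity = capacity          # max number of cached blocks
        self.history_size = history_size  # max number of remembered block ids
        self.cache = OrderedDict()        # block_id -> data, oldest first
        self.history = OrderedDict()      # ids seen once but not yet cached

    def access(self, block_id, load_block):
        """Return (data, hit) for block_id, loading from storage on a miss."""
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # refresh LRU position
            return self.cache[block_id], True

        data = load_block(block_id)  # miss: read from the storage device
        if block_id in self.history:
            # Second reference: admit the block, evicting the LRU entry if full.
            del self.history[block_id]
            self.cache[block_id] = data
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)
        else:
            # First reference: remember the id, but do not cache the data.
            self.history[block_id] = None
            if len(self.history) > self.history_size:
                self.history.popitem(last=False)
        return data, False
```

With a policy of this kind, blocks that are touched only once never displace cached blocks, which is the usual motivation for admission control when a storage miss is cheap.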


A Snappy B+-Trees Index Reconstruction for Main-Memory Storage Systems

Computational Science and Its Applications - ICCSA 2006 - Lecture Notes in Computer Science ◽

10.1007/11751540_113 ◽

2006 ◽

pp. 1036-1044 ◽

Cited By ~ 1

Author(s):

Ig-hoon Lee ◽

Junho Shim ◽

Sang-goo Lee

Keyword(s):

Storage Systems ◽

Main Memory ◽

Memory Storage ◽

B Trees


A PREDICTABLE MULTI-THREADED MAIN-MEMORY STORAGE MANAGER

Journal of Zhejiang University SCIENCE A ◽

10.1631/jzus.2001.0416 ◽

2001 ◽

Vol 2 (4) ◽

pp. 416

Author(s):

Guang-hua SONG

Keyword(s):

Main Memory ◽

Memory Storage


The architecture of the Dalí main memory storage manager

Bell Labs Technical Journal ◽

10.1002/bltj.2030 ◽

2002 ◽

Vol 2 (1) ◽

pp. 36-47 ◽

Cited By ~ 2

Author(s):

Philip L. Bohannon ◽

Rajeev R. Rastogi ◽

Avi Silberschatz ◽

S. Sudarshan

Keyword(s):

Main Memory ◽

Memory Storage


Testing for equality of rates of evolution

Paleobiology ◽

10.1017/s0094837300008861 ◽

1987 ◽

Vol 13 (3) ◽

pp. 272-285 ◽

Cited By ~ 18

Author(s):

Jennifer A. Kitchell ◽

George Estabrook ◽

Norman MacLeod

Keyword(s):

Rapid Change ◽

Time Interval ◽

Data Sets ◽

Independent Data ◽

Change Rate ◽

Rates Of Evolution ◽

Generative Processes ◽

Pattern Of Variation ◽

Ordered Data ◽

Post Hoc

A new method of data analysis offers a potentially powerful tool for statistically evaluating hypotheses of rate in temporally-ordered evolutionary phenomena. We present a method for bootstrapping time-ordered data sets to test hypotheses of the equality of rate. This method is applicable to both nonrandom and random generative processes. The method is applied to the data of Malmgren et al. (1983) for the Globorotalia plesiotumida–G. tumida planktonic foraminiferan lineage and the data of Reyment (1982) for the benthonic foraminiferan Afrobolivina afar. G. plesiotumida is recognizable on the basis of independent data as a species distinct from G. tumida, its descendant. Evolutionary change rate during the evolution of G. tumida from G. plesiotumida is shown to be faster than rates within either species. The pattern of variation exhibited by A. afar includes a time interval of more rapid change; this more rapid change is observed post hoc. A bootstrapping model based on post hoc observations reveals the rate in this time interval to be not significantly faster than expected in such post hoc intervals.


Identification of step pattern in ordered data sets using the Walsh transform algorithm

Ecological Modelling ◽

10.1016/j.ecolmodel.2004.07.008 ◽

2005 ◽

Vol 182 (1) ◽

pp. 11-24 ◽

Cited By ~ 3

Author(s):

Radhouan Ben-Hamadou ◽

Ibanez Frédéric ◽

Picheral Marc ◽

Gorsky Gabriel

Keyword(s):

Walsh Transform ◽

Data Sets ◽

Ordered Data


DEVELOPING A PARALLEL CLASSIFIER FOR MINING IN BIG DATA SETS

IIUM Engineering Journal ◽

10.31436/iiumej.v22i2.1541 ◽

2021 ◽

Vol 22 (2) ◽

pp. 119-134

Author(s):

Ahad Shamseen ◽

Morteza Mohammadi Zanjireh ◽

Mahdi Bahaghighat ◽

Qin Xin

Keyword(s):

Data Mining ◽

Big Data ◽

Decision Tree ◽

Main Memory ◽

Experimental Results ◽

Primary Data ◽

Data Sets ◽

Decision Tree Classifier ◽

Vast Amount ◽

Tree Classifier

Data mining is the extraction of information and its roles from a vast amount of data, and it is one of the most important topics today. Massive amounts of data are generated and stored each day. This data holds useful information in different fields, which attracts the attention of programmers and engineers. One of the primary data mining classification algorithms is the decision tree. Decision tree techniques have several advantages but also present drawbacks. One of their main drawbacks is that the data must reside in main memory. SPRINT is a decision tree classifier builder that has proposed a fix for this problem. In this paper, we develop a new parallel decision tree classifier that builds on the SPRINT results. Our experimental results show considerable improvements in runtime and memory requirements compared to the SPRINT classifier. The proposed classifier can be implemented in serial and parallel environments and can deal with big data.


An Index Structure for Main-memory Storage Systems using The Level Pre-fetching

International Journal of Contents ◽

10.5392/ijoc.2007.3.1.019 ◽

2007 ◽

Vol 3 (1) ◽

pp. 19-23

Author(s):

Seok-Jae Lee ◽

Jong-Hyun Yoon ◽

Seok-Il Song ◽

Jae-Soo Yoo

Keyword(s):

Storage Systems ◽

Main Memory ◽

Memory Storage ◽

Index Structure


Locality-Sensitive Hashing for Information Retrieval System on Multiple GPGPU Devices

Applied Sciences ◽

10.3390/app10072539 ◽

2020 ◽

Vol 10 (7) ◽

pp. 2539 ◽

Cited By ~ 1

Author(s):

Toan Nguyen Mau ◽

Yasushi Inoguchi

Keyword(s):

Big Data ◽

Information Retrieval ◽

Retrieval System ◽

Hash Table ◽

Information Retrieval System ◽

Main Memory ◽

Locality Sensitive Hashing ◽

Data Sets ◽

Similar Data ◽

Data Set

It is challenging to build a real-time information retrieval system, especially for systems with high-dimensional big data. To structure big data, many hashing algorithms have been proposed that map similar data items to the same bucket to speed up the search. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set by using a family of hash functions and a hash table. The LSH hash table is an additional component that supports the indexing of hash values (keys) for the corresponding data items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm with a dynamically structured hash table, optimized for storage in main memory and in General-Purpose computation on Graphics Processing Units (GPGPU) memory. This supports the handling of constantly updated data sets, such as songs, images, or text databases. The DLSH algorithm works effectively with data sets that are updated with high frequency and is compatible with parallel processing. However, a single GPGPU device is inadequate for processing big data because of the small memory capacity of GPGPU devices. When using multiple GPGPU devices for searching, an effective search algorithm is needed to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. Different search strategies on multiple DLSH clusters are also proposed to suit our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
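
The idea of hashing similar items into the same bucket, described in the abstract above, can be illustrated with a small sketch. This is plain random-hyperplane LSH for cosine similarity on the CPU, not the authors' DLSH algorithm or its GPGPU implementation; the names `make_hyperplanes`, `lsh_key`, and `LSHTable` are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a vector to a tuple of bits: one sign test per hyperplane."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append(1 if dot >= 0.0 else 0)
    return tuple(bits)


class LSHTable:
    """Hash table mapping LSH keys to lists of item ids (buckets)."""

    def __init__(self, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.buckets = defaultdict(list)

    def insert(self, item_id, vector):
        """Place item_id in the bucket selected by its vector's key."""
        self.buckets[lsh_key(vector, self.hyperplanes)].append(item_id)

    def candidates(self, vector):
        """Return ids stored in the same bucket as the query vector."""
        key = lsh_key(vector, self.hyperplanes)
        return list(self.buckets.get(key, []))


table = LSHTable(num_bits=8, dim=3)
table.insert("a", [1.0, 2.0, 3.0])
print(table.candidates([2.0, 4.0, 6.0]))  # same direction, so same bucket
```

Vectors separated by a small angle agree on most sign tests and therefore tend to share a bucket, so a query only needs to be compared against the candidates in its own bucket rather than the whole data set. Practical systems usually keep several such tables to reduce missed neighbors.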


The Architecture of the Dalí Main-Memory Storage Manager

Multimedia Database Management Systems ◽

10.1007/978-1-4615-6149-1_3 ◽

1997 ◽

pp. 23-59

Author(s):

Philip Bohannon ◽

Daniel Lieuwen ◽

Rajeev Rastogi ◽

Avi Silberschatz ◽

S. Seshadri ◽

...

Keyword(s):

Main Memory ◽

Memory Storage
