An interactive SQL relational interface for querying main-memory data structures

Computing ◽  
2015 ◽  
Vol 97 (12) ◽  
pp. 1141-1164 ◽  
Author(s):  
Marios Fragkoulis ◽  
Diomidis Spinellis ◽  
Panos Louridas
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jeongmin Bae ◽  
Hajin Jeon ◽  
Min-Soo Kim

Background: Design of valid, high-quality primers is essential for qPCR experiments. MRPrimer is a powerful pipeline based on MapReduce that combines primer design for target sequences with homology tests on off-target sequences. It takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB. Because primers designed by MRPrimer are effective in qPCR analysis, it has been widely used for developing online design tools and building primer databases. However, MRPrimer is too slow to cope with sequence DBs whose sizes are growing exponentially, and thus must be improved. Results: We develop a fast GPU-based pipeline for primer design (GPrimer) that takes the same input and returns the same output as MRPrimer. MRPrimer consists of seven MapReduce steps, two of which are very time-consuming. GPrimer significantly improves the speed of those two steps by exploiting the computational power of GPUs. In particular, it designs data structures for coalesced memory access on the GPU and for workload balancing among GPU threads, and it copies the data structures between main memory and GPU memory in a streaming fashion. For the human RefSeq DB, GPrimer achieves a speedup of 57 times for the entire pipeline and a speedup of 557 times for the most time-consuming step using a single machine with 4 GPUs, compared with MRPrimer running on a cluster of six machines. Conclusions: We propose a GPU-based pipeline for primer design that takes an entire sequence DB as input and returns all feasible and valid primer pairs existing in the DB at once, without an additional step using BLAST-like tools. The software is available at https://github.com/qhtjrmin/GPrimer.git.


2018 ◽  
Vol 16 (9) ◽  
pp. 2328-2335 ◽  
Author(s):  
Cristian Vallejos ◽  
Monica Caniupan ◽  
Gilberto Gutierrez

Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 172
Author(s):  
Kotaro Matsuda ◽  
Shuhei Denzumi ◽  
Kunihiko Sadakane

Zero-suppressed Binary Decision Diagrams (ZDDs) are data structures for representing set families in a compressed form. With ZDDs, many valuable operations on set families can be done in time polynomial in the ZDD size. In some cases, however, the ZDDs representing large set families become too large to store in main memory. This paper proposes the top ZDD, a novel representation of ZDDs that uses less space than existing ones. The top ZDD is an extension of the top tree, which compresses trees, to compress directed acyclic graphs by sharing identical subgraphs. We prove that navigational operations on ZDDs can be done in time poly-logarithmic in the ZDD size, and show that there exist set families for which the size of the top ZDD is exponentially smaller than that of the ZDD. We also show experimentally that our top ZDDs have smaller sizes than ZDDs for real data.
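
To make the underlying structure concrete, here is a minimal sketch of a plain ZDD (not the top ZDD the paper proposes). It shows the two standard reduction rules: a node whose 1-edge leads to the empty family is suppressed, and identical subgraphs are shared through a unique table. The class name `ZDD` and its methods are hypothetical, chosen only for this illustration, and every set element is assumed to appear in `variables`.

```python
class ZDD:
    """Tiny ZDD manager (illustrative only). Ids 0 and 1 are the terminals:
    0 is the empty family, 1 is the family containing only the empty set."""

    def __init__(self):
        self.nodes = [None, None]   # id -> (var, lo, hi) for internal nodes
        self.unique = {}            # (var, lo, hi) -> id, used for sharing

    def node(self, var, lo, hi):
        if hi == 0:                 # zero-suppression: skip the node entirely
            return lo
        key = (var, lo, hi)
        if key not in self.unique:  # reuse an identical existing subgraph
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def from_sets(self, sets, variables):
        """Build the ZDD of a set family; `variables` fixes the variable order."""
        family = frozenset(frozenset(s) for s in sets)
        return self._build(family, 0, tuple(variables))

    def _build(self, family, i, variables):
        if not family:
            return 0
        if i == len(variables):     # only the empty set can remain here
            return 1
        v = variables[i]
        lo = frozenset(s for s in family if v not in s)
        hi = frozenset(s - {v} for s in family if v in s)
        return self.node(v,
                         self._build(lo, i + 1, variables),
                         self._build(hi, i + 1, variables))

    def count(self, root, memo=None):
        """Number of sets in the family, computed once per shared node."""
        memo = {} if memo is None else memo
        if root < 2:
            return root
        if root not in memo:
            _, lo, hi = self.nodes[root]
            memo[root] = self.count(lo, memo) + self.count(hi, memo)
        return memo[root]


zdd = ZDD()
root = zdd.from_sets([{"a", "c"}, {"b", "c"}], ["a", "b", "c"])
print(zdd.count(root), len(zdd.nodes) - 2)  # sets in the family, internal nodes
```

Counting visits each shared node once, which is an instance of the abstract's point that operations run in time polynomial in the ZDD size rather than in the size of the family.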


2020 ◽  
Vol 14 (3) ◽  
pp. 431-444
Author(s):  
Yongjun He ◽  
Jiacheng Lu ◽  
Tianzheng Wang

Data stalls are a major overhead in main-memory database engines due to the use of pointer-rich data structures. Lightweight coroutines ease the implementation of software prefetching to hide data stalls by overlapping computation and asynchronous data prefetching. Prior solutions, however, mainly focused on (1) individual components and operations and (2) intra-transaction batching that requires interface changes, breaking backward compatibility. It was not clear how they apply to a full database engine and how much end-to-end benefit they bring under various workloads. This paper presents CoroBase, a main-memory database engine that tackles these challenges with a new coroutine-to-transaction paradigm. Coroutine-to-transaction models transactions as coroutines and thus enables inter-transaction batching, avoiding application changes but retaining the benefits of prefetching. We show that on a 48-core server, CoroBase can perform close to 2x better for read-intensive workloads and remain competitive for workloads that inherently do not benefit from software prefetching.
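
The control flow behind inter-transaction batching can be sketched with Python generators. This is only an illustration of the scheduling idea, not CoroBase's implementation: Python cannot issue hardware prefetches, so the `yield` merely marks the point where an engine would issue a prefetch and switch to another transaction. The names `transaction` and `run_batch` are hypothetical.

```python
from collections import deque


def transaction(txn_id, keys, table, log):
    """A read-only transaction written as a coroutine (generator)."""
    total = 0
    for key in keys:
        # A real engine would issue a software prefetch for `key` here;
        # yielding lets another transaction run while the data arrives.
        yield key
        total += table[key]
        log.append((txn_id, key))
    return total


def run_batch(transactions):
    """Round-robin scheduler that interleaves transactions at yield points."""
    ready = deque(enumerate(transactions))
    results = {}
    while ready:
        idx, txn = ready.popleft()
        try:
            next(txn)                  # resume until the next suspension point
            ready.append((idx, txn))
        except StopIteration as done:  # transaction finished
            results[idx] = done.value
    return [results[i] for i in range(len(transactions))]


table = {"a": 1, "b": 2, "c": 3, "d": 4}
log = []
batch = [transaction(0, ["a", "b"], table, log),
         transaction(1, ["c", "d"], table, log)]
print(run_batch(batch))  # [3, 7]
print(log)               # accesses of the two transactions are interleaved
```

The transaction body reads like ordinary sequential code, and the batching happens across transactions in the scheduler, which is what lets the application-facing interface stay unchanged.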


2010 ◽  
Vol 40-41 ◽  
pp. 206-211
Author(s):  
Zhi Lin Zhu

One approach to achieving high performance in a DBMS for critical applications is to store the database in main memory rather than on disk. One can then design new data structures and algorithms oriented towards increasing the efficiency of the main-memory database (MMDB). In this paper we present some results on index structures from an ongoing study of MMDBs. We propose a new index structure, the T-tail Tree. We give the main algorithms of the T-tail Tree and evaluate their performance. Our results indicate that the T-tail Tree provides good overall performance in main memory.


2020 ◽  
Vol 7 (12) ◽  
pp. 199-208
Author(s):  
A. Kavinilavu ◽  
S. Neelavathy Pari

Data structures are chosen to save space and to grant fast access to data by its key for a particular structural representation. The data structures surveyed are linear lists, hierarchical structures, and graph structures. The B+ tree is an extension of the B-tree data structure that allows efficient insertion, deletion, and search operations. It is used to store large amounts of data that cannot be held in main memory. B+ tree leaf nodes are connected in the form of a singly linked list to make search queries more efficient. The drawback of binary tree geometry is that the decrease in memory use comes at the expense of more frequent memory access, which might slow down simulations in which frequent memory access constitutes a significant part of the execution time. The work addresses the processing and compression of voxel phantoms without loss of quality. Voxels are often used in the visualization and analysis of medical and scientific information. Voxel phantoms, which comprise a set of small volume elements, appeared towards the end of the 1980s and improved on the first scientific models of the human body. These phantoms are an extremely exact representation. The aim is to fetch records in an equal number of disk accesses and to reduce access time by reducing the height of the tree and increasing the number of branches per node.
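
The benefit of linking B+ tree leaves can be seen in a range query: once the first relevant leaf is found, the scan follows `next` pointers instead of climbing back up the tree. The sketch below models only the leaf level. The names `Leaf`, `link_leaves`, and `range_scan` are hypothetical, and the internal nodes that would normally locate the starting leaf are omitted, so the scan simply starts at the first leaf.

```python
from bisect import bisect_left


class Leaf:
    """A B+ tree leaf: sorted keys, their values, and a link to the next leaf."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values
        self.next = None


def link_leaves(leaves):
    """Chain the leaves into a singly linked list and return the first one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0]


def range_scan(start_leaf, lo, hi):
    """Yield (key, value) pairs with lo <= key <= hi by walking the leaf chain."""
    leaf = start_leaf
    while leaf is not None:
        i = bisect_left(leaf.keys, lo)   # skip keys below the range
        while i < len(leaf.keys):
            if leaf.keys[i] > hi:        # keys are sorted, so the scan is done
                return
            yield leaf.keys[i], leaf.values[i]
            i += 1
        leaf = leaf.next


head = link_leaves([Leaf([1, 3], ["a", "b"]),
                    Leaf([5, 7], ["c", "d"]),
                    Leaf([9, 11], ["e", "f"])])
print(list(range_scan(head, 3, 9)))  # [(3, 'b'), (5, 'c'), (7, 'd'), (9, 'e')]
```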


2020 ◽  
Vol 3 (1) ◽  
pp. 40
Author(s):  
Rifat Osmanaj ◽  
Hysen Binjaku

Sorting is widely used in massive data applications, insurance systems, education, health, business, etc. By sorting the data as desired, quick access to the required data is achieved. Typically, sorted data are organized in strings as file elements or tables. The most common case is when tabular data are processed in the main memory of the computer. The paper presents the algorithms currently used for sorting objects held in static and dynamic data structures. The data sets on which particular algorithms will be applied are then selected, and the advantages and disadvantages of each algorithm are examined. Thereafter, the efficiency of the sorting algorithms is determined, and the factors that are decisive when selecting an appropriate sorting algorithm are considered.


2017 ◽  
Vol 9 (02) ◽  
Author(s):  
Vinesh Kumar ◽  
Jayant Shekhar ◽  
Sunil Kumar

Data representation in memory is one of the tasks in big data. Data representation includes several types of tree data structures through which the system can access data accurately and efficiently. Succinct data structures can play an important role in data representation when big data is processed in main memory. Data representation is a very complex problem in big data. We propose some solutions to problems of data representation in big data. Data processing in big data can be used to make decisions in data mining. We know the functions and rules for query processing. We have to either change the method of processing or change the way of representation. In this paper, different kinds of tree data structures are presented for data representation in the main memory of a computer system for big data by using succinct data structures. We first compare all the data structures in a table. Each method has different space and time complexity. Big data information services are increasing day by day, so the low space complexity of succinct data structures is making them very popular in practice.
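
As one concrete example of a succinct tree representation, the balanced-parentheses encoding stores the shape of an n-node tree in 2n bits. The abstract does not say which encodings the paper compares, so this is a well-known technique shown for orientation only. The function names are hypothetical, a string stands in for the bit vector, and the linear scan replaces the rank/select indexes that practical succinct structures use.

```python
def encode(tree):
    """Encode a tree given as nested lists of children as balanced parentheses."""
    return "(" + "".join(encode(child) for child in tree) + ")"


def subtree_size(bp, i):
    """Number of nodes in the subtree whose opening parenthesis is at index i."""
    depth = 0
    for j in range(i, len(bp)):
        depth += 1 if bp[j] == "(" else -1
        if depth == 0:                 # matching close parenthesis found
            return (j - i + 1) // 2    # two symbols per node
    raise ValueError("unbalanced parentheses")


# A root with two children; the second child has two leaf children.
tree = [[], [[], []]]
bp = encode(tree)
print(bp)                   # (()(()()))
print(subtree_size(bp, 0))  # 5 nodes in the whole tree
print(subtree_size(bp, 3))  # 3 nodes in the second child's subtree
```

A pointer-based tree needs at least one machine word per edge, so two bits per node is a large saving when the whole structure must fit in main memory.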

