MONTRES-NVM: An External Sorting Algorithm for Hybrid Memory

Author(s):  
Mohammed Bey Ahmed Khernache ◽  
Arezki Laga ◽  
Jalil Boukhobza
Cybernetics ◽  
1968 ◽  
Vol 2 (5) ◽  
pp. 75-79
Author(s):  
V. A. Litvinov

2020 ◽  
Vol 36 (9) ◽  
pp. 2705-2711 ◽  
Author(s):  
Gianvito Urgese ◽  
Emanuele Parisi ◽  
Orazio Scicolone ◽  
Santa Di Cataldo ◽  
Elisa Ficarra

Abstract
Motivation: High-throughput next-generation sequencing can generate huge sequence files, whose analysis requires alignment algorithms that are typically very demanding in terms of memory and computational resources. This is a significant issue, especially for machines with limited hardware capabilities. As the redundancy of the sequences typically increases with coverage, collapsing such files into compact sets of non-redundant reads has the twofold advantage of reducing file size and speeding up the alignment by avoiding mapping the same sequence multiple times.
Method: BioSeqZip generates compact and sorted lists of alignment-ready non-redundant sequences, keeping track of their occurrences in the raw files as well as of their quality score information. By exploiting a memory-constrained external sorting algorithm, it can be executed on either single- or multi-sample datasets even on computers with medium computational capabilities. On request, it can even re-expand the compacted files to their original state.
Results: Our extensive experiments on RNA-Seq data show that BioSeqZip considerably brings down the computational costs of a standard sequence analysis pipeline, with particular benefits for the alignment procedures that typically have the highest requirements in terms of memory and execution time. In our tests, BioSeqZip was able to compact 2.7 billion reads into 963 million unique tags, reducing the size of sequence files by up to 70% and speeding up the alignment by at least 50%.
Availability and implementation: BioSeqZip is available at https://github.com/bioinformatics-polito/BioSeqZip.
Supplementary information: Supplementary data are available at Bioinformatics online.
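
The collapsing step described in this abstract can be illustrated with a minimal in-memory sketch. It is not BioSeqZip's implementation: the function name `collapse_reads` is hypothetical, quality scores are ignored, and the whole input is assumed to fit in memory, whereas the tool itself relies on a memory-constrained external sort.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse duplicate reads into a sorted list of (sequence, count) pairs.

    Hypothetical helper for illustration only; it keeps the occurrence
    count of each distinct read but drops quality scores and read order.
    """
    counts = Counter(reads)
    # Sorting by sequence gives a compact, ordered list of unique tags.
    return sorted(counts.items())


if __name__ == "__main__":
    sample = ["ACGT", "TTGA", "ACGT", "ACGT", "GGCA", "TTGA"]
    for sequence, occurrences in collapse_reads(sample):
        print(sequence, occurrences)
```

Each distinct read would then be aligned once, with its count carried alongside. Restoring the original file order, as the abstract mentions, would need extra bookkeeping that this sketch does not attempt.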


1991 ◽  
Vol 92 (2) ◽  
pp. 141-160 ◽  
Author(s):  
Walter Cunto ◽  
Gastón H. Gonnet ◽  
J. Ian Munro ◽  
Patricio V. Poblete

Author(s):  
Wenhan Chen ◽  
Yang Liu ◽  
Zhiguang Chen ◽  
Fang Liu ◽  
Nong Xiao

2003 ◽  
Vol 86 (5) ◽  
pp. 229-233 ◽  
Author(s):  
Rafiqul Islam ◽  
Nasim Adnan ◽  
Nur Islam ◽  
Shohorab Hossen

Author(s):  
Asaduzzaman Nur Shuvo ◽  
Apurba Adhikary ◽  
Md. Bipul Hossain ◽  
Sultana Jahan Soheli

Data sets in large applications are often too large to fit completely inside the computer’s internal memory. The resulting input/output (I/O) communication between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. Sorting such a data set requires external sorting. This paper is concerned with a new in-place external sorting algorithm. Our proposed algorithm uses the concepts of Quick-Sort and Divide-and-Conquer, resulting in a faster sorting algorithm that avoids any additional disk space. In addition, we show that the average time complexity can be reduced compared to existing external sorting approaches.


2017 ◽  
Vol 66 (10) ◽  
pp. 1689-1702 ◽  
Author(s):  
Arezki Laga ◽  
Jalil Boukhobza ◽  
Frank Singhoff ◽  
Michel Koskas

2000 ◽  
Vol 75 (4) ◽  
pp. 159-163 ◽  
Author(s):  
Fang-Cheng Leu ◽  
Yin-Te Tsai ◽  
Chuan Yi Tang

2021 ◽  
Vol 20 (4) ◽  
pp. 1-21
Author(s):  
Riley Jackson ◽  
Jonathan Gresl ◽  
Ramon Lawrence

Embedded devices are ubiquitous in areas of industrial and environmental monitoring, health and safety, and consumer appliances. A common use case is data collection, processing, and performing actions based on data analysis. Although many Internet of Things (IoT) applications use the embedded device simply for data collection, there are benefits to having more data processing done closer to data collection to reduce network transmissions and power usage and provide faster response. This work implements and evaluates algorithms for sorting data on embedded devices with specific focus on the smallest memory devices. In devices with less than 4 KB of available RAM, the standard external merge sort algorithm has limited application as it requires a minimum of three memory buffers and is not flash-aware. The contribution is a memory-optimized external sorting algorithm called no output buffer sort (NOBsort) that reduces the minimum memory required for sorting, has excellent performance for sorted or near-sorted data, and sorts on external memory such as SD cards or raw flash chips. When sorting large datasets, no output buffer sort reduces I/O and execution time by 20% to 35% compared to standard external merge sort.
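
For context, the standard external merge sort that this abstract uses as its baseline can be sketched as follows. This is the textbook technique, not NOBsort. The function name `external_sort` and the parameter `run_size` are hypothetical, temporary files stand in for SD-card or flash storage, and values are assumed to be integers.

```python
import heapq
import tempfile


def external_sort(values, run_size):
    """Yield integers in sorted order while buffering at most run_size values.

    Phase 1 writes sorted runs to temporary files; phase 2 merges the runs.
    Illustrative sketch only: it is neither flash-aware nor tuned for
    devices with a few kilobytes of RAM.
    """
    runs = []
    buffer = []

    def flush():
        # Sort the in-memory buffer and spill it to disk as one run.
        if not buffer:
            return
        buffer.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{value}\n" for value in buffer)
        run.seek(0)
        runs.append(run)
        buffer.clear()

    for value in values:
        buffer.append(value)
        if len(buffer) >= run_size:
            flush()
    flush()

    # Each run is read back lazily, so the merge keeps only one pending
    # value per run in memory.
    streams = [(int(line) for line in run) for run in runs]
    try:
        yield from heapq.merge(*streams)
    finally:
        for run in runs:
            run.close()


if __name__ == "__main__":
    print(list(external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)))
```

The merge phase here reads all runs in a single pass. On a device with very little RAM, the number of runs that can be merged at once is bounded by the available buffers, so several merge passes may be needed.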


2017 ◽  
Vol 5 (12) ◽  
pp. 169-172
Author(s):  
Rina Damdoo ◽  
Kanak Kalyani
