dynamic data structure
Recently Published Documents


TOTAL DOCUMENTS

54
(FIVE YEARS 6)

H-INDEX

9
(FIVE YEARS 1)

2022 ◽  
Vol 18 (1) ◽  
pp. 1-63
Author(s):  
Siu-Wing Cheng ◽  
Man-Kit Lau

We propose a dynamic data structure for the distribution-sensitive point location problem in the plane. Suppose that there is a fixed query distribution within a convex subdivision S, and we are given an oracle that can return in O(1) time the probability of a query point falling into a polygonal region of constant complexity. We can maintain S such that each query is answered in O(opt(S)) expected time, where opt(S) is the expected time of the best linear decision tree for answering point location queries in S. The space and construction time are O(n log^2 n), where n is the number of vertices of S. An update of S as a mixed sequence of k edge insertions and deletions takes O(k log^4 n) amortized time. As a corollary, the randomized incremental construction of the Voronoi diagram of n sites can be performed in O(n log^4 n) expected time so that, during the incremental construction, a nearest neighbor query at any time can be answered optimally with respect to the intermediate Voronoi diagram at that time.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Luiz F. A. Brito ◽  
Marcelo K. Albertini ◽  
Arnaud Casteigts ◽  
Bruno A. N. Travençolo

Author(s):  
Ahsan Sanaullah ◽  
Degui Zhi ◽  
Shaojie Zhang

Motivation: Durbin’s positional Burrows-Wheeler transform (PBWT) is a scalable data structure for haplotype matching. It has been successfully applied to identical by descent (IBD) segment identification and genotype imputation. Once the PBWT of a haplotype panel is constructed, it supports efficient retrieval of all shared long segments among all individuals (long matches) and efficient query between an external haplotype and the panel. However, the standard PBWT is an array-based static data structure and does not support dynamic updates of the panel. Results: Here, we generalize the static PBWT to a dynamic data structure, d-PBWT, where the reverse prefix sorting at each position is stored with linked lists. We also developed efficient algorithms for insertion and deletion of individual haplotypes. In addition, we verified that d-PBWT can support all algorithms of PBWT. In doing so, we systematically investigated variations of set maximal match and long match query algorithms: while they all have average case time complexity independent of database size, they have different worst case complexities and dependencies on additional data structures. Availability: The benchmarking code is available at genome.ucf.edu/d-PBWT. Supplementary information: Supplementary Materials are available at Bioinformatics online.
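For orientation, the reverse prefix sorting that the abstract refers to can be sketched for the static, array-based case. This is a minimal illustration of the standard positional prefix array construction for a binary panel, not the d-PBWT code from the paper; the function name `build_prefix_arrays` and the list-of-lists panel layout are assumptions made for the example. The d-PBWT described above stores each of these orderings in a linked list instead of an array so that a haplotype can be inserted or deleted without rebuilding.

```python
def build_prefix_arrays(haplotypes):
    """Return positional prefix arrays a_0 .. a_N for a binary haplotype panel.

    haplotypes: list of equal-length lists of 0/1 alleles (hypothetical layout).
    arrays[k] lists haplotype indices sorted by their reversed prefix
    ending just before site k, with ties kept in index order.
    """
    m = len(haplotypes)
    n_sites = len(haplotypes[0]) if m else 0
    order = list(range(m))      # a_0: no sites seen yet, original order
    arrays = [order[:]]
    for k in range(n_sites):
        # Stable partition by the allele at site k: haplotypes carrying 0
        # come first, then those carrying 1. Stability preserves the
        # ordering already established by the earlier sites.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        arrays.append(order[:])
    return arrays


panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
]
arrays = build_prefix_arrays(panel)
```

Because each step is a stable partition of the previous ordering, haplotypes that share a long match ending at a site end up adjacent in the ordering for that site, which is what makes the long-match queries mentioned in the abstract efficient.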


Author(s):  
Nina Luhmann ◽  
Guillaume Holley ◽  
Mark Achtman

BlastFrost is a highly efficient method for querying 100,000s of genome assemblies. It builds on Bifrost, a recently developed dynamic data structure for compacted and colored de Bruijn graphs from bacterial genomes. BlastFrost queries a Bifrost data structure for sequences of interest, and extracts local subgraphs, thereby enabling the efficient identification of the presence or absence of individual genes or single nucleotide sequence variants. Here we describe the algorithms and implementation of BlastFrost. We also present two exemplar practical applications. In the first, we determined the presence of the individual genes within the SPI-2 Salmonella pathogenicity island within a collection of 926 representative genomes in minutes. In the second application, we determined the existence of known single nucleotide polymorphisms associated with fluoroquinolone resistance in the genes gyrA, gyrB and parE among 190,209 Salmonella genomes. BlastFrost is available for download at https://github.com/nluhmann/BlastFrost.
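The idea of asking which genomes contain a query sequence can be illustrated with a much simpler structure than a compacted de Bruijn graph. The sketch below is a toy colored k-mer index built on a Python dictionary; it is not the Bifrost or BlastFrost API, and the class `ColoredKmerIndex`, its methods, and the `min_fraction` parameter are all hypothetical names chosen for the example. It also ignores reverse complements and graph compaction, which real tools handle.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


class ColoredKmerIndex:
    """Toy index mapping each k-mer to the set of genomes (colors) containing it."""

    def __init__(self, k):
        self.k = k
        self.colors = defaultdict(set)

    def add_genome(self, name, sequence):
        # Genomes can be added at any time, so the index is dynamic.
        for km in kmers(sequence, self.k):
            self.colors[km].add(name)

    def query(self, sequence, min_fraction=1.0):
        """Return genomes containing at least min_fraction of the query k-mers."""
        query_kmers = list(kmers(sequence, self.k))
        if not query_kmers:
            return set()
        hits = defaultdict(int)
        for km in query_kmers:
            # .get avoids creating empty entries for unseen k-mers.
            for genome in self.colors.get(km, ()):
                hits[genome] += 1
        total = len(query_kmers)
        return {g for g, count in hits.items() if count / total >= min_fraction}


index = ColoredKmerIndex(k=3)
index.add_genome("A", "ACGTAC")
index.add_genome("B", "ACGTTT")
exact = index.query("CGTA")
relaxed = index.query("CGTA", min_fraction=0.5)
```

Here `exact` contains only genome `A`, while `relaxed` also admits genome `B`, which shares one of the two query k-mers. Lowering the threshold is one simple way to tolerate a single nucleotide difference between the query and a genome.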


Author(s):  
V. N. Tereshchenko ◽  
A. A. Marchenko ◽  
Y. V. Tereshchenko ◽  
A. N. Tara

This article develops a dynamic data structure for solving proximity problems, based on the dynamic Voronoi diagram. The data structure can serve as the core of a common algorithmic space model for solving a set of visualization and computer modeling problems. It is based on the divide-and-conquer strategy for Voronoi diagram construction. As in the original algorithm, we store a binary tree that represents the Voronoi diagram, but we define three new operations: insert, delete, and balance. To keep these operations efficient, we propose using a red-black tree. In general, the proposed data structure shows much better results than the original static algorithm. Compared to existing algorithms, this data structure is both simple and efficient.


2017 ◽  
Author(s):  
Harun Mustafa ◽  
André Kahles ◽  
Mikhail Karasikov ◽  
Gunnar Rätsch

Much of the DNA and RNA sequencing data available is in the form of high-throughput sequencing (HTS) reads and is currently unindexed by established sequence search databases. Recent succinct data structures for indexing both reference sequences and HTS data, along with associated metadata, have been based on either hashing or graph models, but many of these structures are static in nature, and thus, not well-suited as backends for dynamic databases. We propose a parallel construction method for and novel application of the wavelet trie as a dynamic data structure for compressing and indexing graph metadata. By developing an algorithm for merging wavelet tries, we are able to construct large tries in parallel by merging smaller tries constructed concurrently from batches of data. When compared against general compression algorithms and those developed specifically for graph colors (VARI and Rainbowfish), our method achieves compression ratios superior to gzip and VARI, converging to compression ratios of 6.5% to 2% on data sets constructed from over 600 virus genomes. While marginally worse than compression by bzip2 or Rainbowfish, this structure allows for both fast extension and query. We also found that additionally encoding graph topology metadata improved compression ratios, particularly on data sets consisting of several mutually-exclusive reference genomes. It was also observed that the compression ratio of wavelet tries grew sublinearly with the density of the annotation matrices. This work is a significant step towards implementing a dynamic data structure for indexing large annotated sequence data sets that supports fast query and update operations. At the time of writing, no established standard tool has filled this niche.

