d-PBWT: dynamic positional Burrows-Wheeler transform

Abstract. Motivation: Durbin’s positional Burrows-Wheeler transform (PBWT) is a scalable data structure for haplotype matching. It has been successfully applied to identical by descent (IBD) segment identification and genotype imputation. Once the PBWT of a haplotype panel is constructed, it supports efficient retrieval of all shared long segments among all individuals (long matches) and efficient query between an external haplotype and the panel. However, the standard PBWT is an array-based static data structure and does not support dynamic updates of the panel. Results: Here, we generalize the static PBWT to a dynamic data structure, d-PBWT, where the reverse prefix sorting at each position is stored with linked lists. We also developed efficient algorithms for insertion and deletion of individual haplotypes. In addition, we verified that d-PBWT can support all algorithms of PBWT. In doing so, we systematically investigated variations of set maximal match and long match query algorithms: while they all have average case time complexity independent of database size, they have different worst case complexities and dependencies on additional data structures. Availability: The benchmarking code is available at genome.ucf.edu/d-PBWT. Supplementary information: Supplementary Materials are available at Bioinformatics online.
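As a rough illustration of the "reverse prefix sorting at each position" mentioned in the abstract, the sketch below builds the array-based (static) orderings that, per the abstract, d-PBWT stores with linked lists instead. It uses the standard stable-partition construction of the PBWT ordering and is not the authors' d-PBWT code; the function name `build_prefix_arrays` and the toy panel are assumptions made for illustration.

```python
def build_prefix_arrays(panel):
    """Return the haplotype orderings of a binary panel, one per position.

    panel: list of M haplotypes, each a list of N alleles (0 or 1).
    arrays[k] lists haplotype indices sorted by their reversed prefix
    panel[i][:k]; ties keep their original input order.
    """
    m = len(panel)
    n = len(panel[0]) if m else 0
    order = list(range(m))          # before any site is read: input order
    arrays = [order]
    for k in range(n):
        # Stable partition by the allele at site k: zeros first, then ones.
        zeros = [i for i in order if panel[i][k] == 0]
        ones = [i for i in order if panel[i][k] == 1]
        order = zeros + ones
        arrays.append(order)
    return arrays


# Hypothetical toy panel: 4 haplotypes, 3 sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
arrays = build_prefix_arrays(panel)
```

Because each `arrays[k]` is a plain Python list, adding or removing a haplotype means rebuilding or shifting every column, which is the limitation of the static representation that the abstract describes.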


d-PBWT: dynamic positional Burrows-Wheeler transform

10.1101/2020.01.14.906487 ◽

2020 ◽

Author(s):

Ahsan Sanaullah ◽

Degui Zhi ◽

Shaojie Zhang

Keyword(s):

Data Structure ◽

Time Complexity ◽

Linear Time ◽

Genotype Imputation ◽

Worst Case ◽

Average Case ◽

Insertion And Deletion ◽

Static Data ◽

Efficient Retrieval ◽

Burrows Wheeler Transform

Abstract. Durbin’s PBWT, a scalable data structure for haplotype matching, has been successfully applied to identical by descent (IBD) segment identification and genotype imputation. Once the PBWT of a haplotype panel is constructed, it supports efficient retrieval of all shared long segments among all individuals (long matches) and efficient query between an external haplotype and the panel. However, the standard PBWT is an array-based static data structure and does not support dynamic updates of the panel. Here, we generalize the static PBWT to a dynamic data structure, d-PBWT, where the reverse prefix sorting at each position is represented by linked lists. We developed efficient algorithms for insertion and deletion of individual haplotypes. In addition, we verified that d-PBWT can support all algorithms of PBWT. In doing so, we systematically investigated variations of set maximal match and long match query algorithms: while they all have average case time complexity independent of database size, they have different worst case complexities, linear time complexity with the size of the genome, and dependency on additional data structures.


DYNAMIZATION OF THE TRAPEZOID METHOD FOR PLANAR POINT LOCATION IN MONOTONE SUBDIVISIONS

International Journal of Computational Geometry & Applications ◽

10.1142/s0218195992000184 ◽

1992 ◽

Vol 02 (03) ◽

pp. 311-333 ◽

Cited By ~ 9

Author(s):

YI-JEN CHIANG ◽

ROBERTO TAMASSIA

Keyword(s):

Data Structure ◽

Point Location ◽

Worst Case ◽

Dynamic Data ◽

Location Data ◽

Insertion And Deletion ◽

Planar Point Location ◽

Vertex Insertion ◽

Trapezoid Method ◽

Dynamic Data Structure

We present a fully dynamic data structure for point location in a monotone subdivision, based on the trapezoid method. The operations supported are insertion and deletion of vertices and edges, and horizontal translation of vertices. Let n be the current number of vertices of the subdivision. Point location queries take O(log n) time, while updates take O(log² n) time (amortized for vertex insertion/deletion and worst-case for the other updates). The space requirement is O(n log n). This is the first fully dynamic point location data structure for monotone subdivisions that achieves optimal query time.


The Bit Probe Complexity Measure Revisited

DAIMI Report Series ◽

10.7146/dpb.v21i396.6631 ◽

1992 ◽

Vol 21 (396) ◽

Author(s):

Peter Bro Miltersen

Keyword(s):

Data Structure ◽

Polynomial Time ◽

Communication Complexity ◽

Complexity Measure ◽

Additive Constant ◽

Static Structure ◽

Worst Case ◽

Static Data ◽

On Line ◽

Probe Complexity

The bit probe complexity of a static data structure problem within a given size bound was defined by Elias and Flower. It is the number of bits one needs to probe in the data structure for worst case data and query with an optimal encoding of the data within the space bound. We make some further investigations into the properties of the bit probe complexity measure. We determine the complexity of the full problem, which is the problem where every possible query is allowed, within an additive constant. We show a trade-off between structure size and the number of bit probes for all problems. We show that the complexity of almost every problem, even with small query sets, equals that of the full problem. We show how communication complexity can be used to give small, but occasionally tight lower bounds for natural functions. We define the class of access feasible static structure problems and conjecture that not every polynomial time computable problem is access feasible. We show a link to dynamic problems by showing that if polynomial time computable functions without feasible static structures exist, then there are problems in P which cannot be reevaluated efficiently on-line.


A Selectable Sloppy Heap

Algorithms ◽

10.3390/a12030058 ◽

2019 ◽

Vol 12 (3) ◽

pp. 58 ◽

Cited By ~ 2

Author(s):

Adrian Dumitrescu

Keyword(s):

Data Structure ◽

Upper Bounds ◽

Structure Design ◽

Constant Time ◽

Worst Case ◽

Slowing Down ◽

Insertion And Deletion ◽

Speed Up ◽

Amortized Complexity ◽

Dynamic Version

We study the selection problem, namely that of computing the ith order statistic of n given elements. Here we offer a data structure called selectable sloppy heap that handles a dynamic version in which upon request (i) a new element is inserted or (ii) an element of a prescribed quantile group is deleted from the data structure. Each operation is executed in constant time, and is thus independent of n (the number of elements stored in the data structure), provided that the number of quantile groups is fixed. This is the first result of this kind accommodating both insertion and deletion in constant time. As such, our data structure outperforms the soft heap data structure of Chazelle (which only offers constant amortized complexity for a fixed error rate 0 < ε ≤ 1/2) in applications such as dynamic percentile maintenance. The design demonstrates how slowing down a certain computation can speed up the data structure. The method described here is likely to have further impact in the field of data structure design in extending asymptotic amortized upper bounds to asymptotic worst-case bounds of the same form.
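To make the two operations in this abstract concrete, the sketch below is a deliberately naive baseline, not the selectable sloppy heap itself: it keeps all elements in a sorted Python list, so each operation costs O(n) rather than the constant time claimed for the paper's structure. The class name `NaiveQuantileStore`, the 1-based group numbering, and the rank boundaries used for each group are assumptions made for illustration.

```python
import bisect


class NaiveQuantileStore:
    """Exact, O(n)-per-operation baseline for insert / delete-from-quantile-group."""

    def __init__(self, q):
        if q < 1:
            raise ValueError("number of quantile groups must be at least 1")
        self.q = q
        self.items = []  # always kept in sorted order

    def insert(self, x):
        # O(n): binary search for the position, then shift the list tail.
        bisect.insort(self.items, x)

    def delete_from_group(self, j):
        """Remove and return one element of the j-th quantile group (1-based).

        Assumed convention: group j holds the elements whose 0-based rank r
        satisfies (j - 1) * n // q <= r < j * n // q.
        """
        if not 1 <= j <= self.q:
            raise ValueError("group index out of range")
        n = len(self.items)
        lo = (j - 1) * n // self.q
        hi = j * n // self.q
        if lo >= hi:
            raise LookupError("quantile group is empty")
        return self.items.pop(lo)  # O(n): the smallest element of the group


store = NaiveQuantileStore(q=4)
for value in [5, 1, 8, 3, 7, 2, 6, 4]:
    store.insert(value)
first = store.delete_from_group(1)   # from the lowest quarter
last = store.delete_from_group(4)    # from the highest quarter
```

This baseline always returns the exact smallest element of the requested group; the point of the abstract is that relaxing which element of the group is returned allows both operations to run in constant time when the number of groups is fixed.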


On Average Case Complexity of Problems that are Intractable in the Worst Case

1992 American Control Conference ◽

10.23919/acc.1992.4792069 ◽

1992 ◽

Author(s):

G.W. Wasilkowski

Keyword(s):

Worst Case ◽

Average Case ◽

Case Complexity ◽

Average Case Complexity


Rule Based Classifiers for Suspect Detection from CCTV Footages

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200922142931 ◽

2020 ◽

Vol 13 ◽

Author(s):

Sunil Pathak

Keyword(s):

Answer Sheet ◽

Normal Activity ◽

Contact Detection ◽

Accuracy Rate ◽

Hand Detection ◽

Worst Case ◽

Average Case ◽

The Face ◽

Case Simulation ◽

Material Exchange

Background: Significant work has been done on identifying suspects, gathering information, and examining videos from CCTV footage. This work aims to recognize suspicious activities, i.e., object exchange, entry of another person, peeping into another's answer sheet, and person exchange, in video captured by a surveillance camera during examinations. This requires face recognition, hand recognition, and detection of contact between the face and hands of the same person and between different people. Methods: Segmented frames are given as input to obtain the foreground image using Gaussian filtering and a background modeling method. These foreground images are given to an activity recognition model to detect normal or suspicious activity. Results: Accuracy rate, precision, and recall are calculated for activity detection and contact detection in the best case, average case, and worst case. Simulation results are compared across performance parameters such as material exchange, position exchange, introduction of a new person, face and hand detection, and multi-person scenario. Conclusion: In this paper, a framework is prepared for suspect detection. This framework is expected to bring about a major change in security surveillance in the education sector.


A-Tree: A Dynamic Data Structure for Efficiently Indexing Arbitrary Boolean Expressions

Proceedings of the 2021 International Conference on Management of Data ◽

10.1145/3448016.3457266 ◽

2021 ◽

Author(s):

Shuping Ji ◽

Hans-Arno Jacobsen

Keyword(s):

Data Structure ◽

Dynamic Data ◽

Dynamic Data Structure ◽

Boolean Expressions


Average case complexity under the universal distribution equals worst-case complexity

Information Processing Letters ◽

10.1016/0020-0190(92)90138-l ◽

1992 ◽

Vol 42 (3) ◽

pp. 145-149 ◽

Cited By ~ 32

Author(s):

Ming Li ◽

Paul M.B. Vitányi

Keyword(s):

Worst Case ◽

Average Case ◽

Case Complexity ◽

Universal Distribution ◽

Average Case Complexity ◽

Worst Case Complexity


A dynamic data structure for 3-D convex hulls and 2-D nearest neighbor queries

Journal of the ACM ◽

10.1145/1706591.1706596 ◽

2010 ◽

Vol 57 (3) ◽

pp. 1-15 ◽

Cited By ~ 15

Author(s):

Timothy M. Chan

Keyword(s):

Data Structure ◽

Nearest Neighbor ◽

Convex Hulls ◽

Dynamic Data ◽

Dynamic Data Structure ◽

Nearest Neighbor Queries


A Splay Tree-Based Approach for Efficient Resource Location in P2P Networks

The Scientific World JOURNAL ◽

10.1155/2014/830682 ◽

2014 ◽

Vol 2014 ◽

pp. 1-11

Author(s):

Wei Zhou ◽

Zilong Tan ◽

Shaowen Yao ◽

Shipu Wang

Keyword(s):

Routing Algorithm ◽

Adaptive Routing ◽

Upper And Lower Bounds ◽

Resource Location ◽

Hop Count ◽

Worst Case ◽

Average Case ◽

Splay Tree ◽

Efficient Resource ◽

P2p System

Resource location in structured P2P systems has a critical influence on system performance. Existing analytical studies of the Chord protocol have shown some potential improvements in performance. In this paper, a new splay tree-based Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. An adaptive routing algorithm is proposed for implementation, and it can be shown that hop count is significantly reduced without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm, as compared to Chord variants, and demonstrate sharp upper and lower bounds for both worst-case and average-case settings. In addition, we theoretically analyze the hop reduction in SChord and show that SChord can significantly reduce the routing hops as compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results show the efficiency of SChord.
