Effective BST Approach to Find Underflow Condition in Interval Trees Using Augmented Data Structure

Author(s): Keyur N. Upadhyay, Hemant D. Vasava, Viral V. Kapadia
2019, Vol 35 (23), pp. 4907-4911
Author(s): Jianglin Feng, Aakrosh Ratan, Nathan C Sheffield

Abstract. Motivation: Genomic data is frequently stored as segments or intervals. Because this data type is so common, interval-based comparisons are fundamental to genomic analysis. As the volume of available genomic data grows, developing efficient and scalable methods for searching interval data is necessary. Results: We present a new data structure, the Augmented Interval List (AIList), to enumerate intersections between a query interval q and an interval set R. An AIList is constructed by first sorting R as a list by the interval start coordinate, then decomposing it into a few approximately flattened components (sublists), and then augmenting each sublist with the running maximum interval end. The query time for AIList is O(log₂ N + n + m), where n is the number of overlaps between R and q, N is the number of intervals in the set R and m is the average number of extra comparisons required to find the n overlaps. Tested on real genomic interval datasets, AIList code runs 5–18 times faster than standard high-performance code based on augmented interval trees, nested containment lists or R-trees (BEDTools). For large datasets, the memory usage of AIList is 4–60% of that of other methods. The AIList data structure, therefore, provides a significantly improved fundamental operation for highly scalable genomic data analysis. Availability and implementation: An implementation of the AIList data structure with both construction and search algorithms is available at http://ailist.databio.org. Supplementary information: Supplementary data are available at Bioinformatics online.
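The construction and query described in this abstract can be illustrated with a minimal Python sketch. It is a simplification rather than the authors' implementation: it keeps a single sorted list and omits the decomposition into sublists, so it shows only the sort, the running-maximum-end augmentation, and the binary-search query. The function names build_index and query_overlaps, and the half-open [start, end) interval convention, are assumptions made for illustration.

```python
from bisect import bisect_left
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open: [start, end)


def build_index(intervals: List[Interval]):
    """Sort by start and augment with the running maximum end.

    Simplified: one list only, no decomposition into sublists.
    """
    ivs = sorted(intervals)
    starts = [s for s, _ in ivs]
    max_end = []
    running = float("-inf")
    for _, end in ivs:
        running = max(running, end)
        max_end.append(running)
    return starts, ivs, max_end


def query_overlaps(index, q_start: int, q_end: int) -> List[Interval]:
    """Return all stored intervals that overlap [q_start, q_end)."""
    starts, ivs, max_end = index
    # Binary search: last interval whose start is strictly below q_end.
    i = bisect_left(starts, q_end) - 1
    hits = []
    # Walk left; once the running maximum end is at or before q_start,
    # no interval further left can overlap the query.
    while i >= 0 and max_end[i] > q_start:
        if ivs[i][1] > q_start:
            hits.append(ivs[i])
        i -= 1
    return hits


index = build_index([(1, 5), (3, 4), (6, 10), (8, 9), (12, 15)])
result = query_overlaps(index, 4, 7)
```

In this sketch the leftward walk can visit intervals that do not overlap the query (here, (3, 4)); these correspond to the extra comparisons counted by m in the stated query time. The sublist decomposition described in the abstract, omitted here, is the step intended to keep that overhead small when long intervals contain many short ones.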


2019
Author(s): Jianglin Feng, Aakrosh Ratan, Nathan C. Sheffield

Abstract. Motivation: Genomic data is frequently stored as segments or intervals. Because this data type is so common, interval-based comparisons are fundamental to genomic analysis. As the volume of available genomic data grows, developing efficient and scalable methods for searching interval data is necessary. Results: We present a new data structure, the augmented interval list (AIList), to enumerate intersections between a query interval q and an interval set R. An AIList is constructed by first sorting R as a list by the interval start coordinate, then decomposing it into a few approximately flattened components (sublists), and then augmenting each sublist with the running maximum interval end. The query time for AIList is O(log₂ N + n + m), where n is the number of overlaps between R and q, N is the number of intervals in the set R, and m is the average number of extra comparisons required to find the n overlaps. Tested on real genomic interval datasets, AIList code runs 5–18 times faster than standard high-performance code based on augmented interval trees (AITree), nested containment lists (NCList), or R-trees (BEDTools). For large datasets, the memory usage of AIList is 4–60% of that of other methods. The AIList data structure, therefore, provides a significantly improved fundamental operation for highly scalable genomic data analysis. Availability: An implementation of the AIList data structure with both construction and search algorithms is available at code.databio.org/AIList.


Author(s): Leonidas Guibas, John Hershberger, Jack Snoeyink

In this paper, we investigate the problem of finding the common tangents of two convex polygons that intersect in two (unknown) points. First, we give a Θ(log² n) bound for algorithms that store the polygons in independent arrays. Second, we show how to beat the lower bound if the vertices of the convex polygons are drawn from a fixed set of n points. We introduce a data structure called a compact interval tree that supports common tangent computations, as well as the standard binary-search-based queries, in O(log n) time apiece. Third, we apply compact interval trees to solve the subpath hull query problem: given a simple path, preprocess it so that we can find the convex hull of a query subpath quickly. With O(n log n) preprocessing, we can assemble a compact interval tree that represents the convex hull of a query subpath in O(log n) time. In order to represent arrangements of lines implicitly, Edelsbrunner et al. used a less efficient structure, called bridge trees, to solve the subpath hull query problem. Our compact interval trees improve their results by a factor of O(log n). Thus, the present paper replaces the paper on bridge trees referred to by Edelsbrunner et al.


This article describes approaches to creating distributed models that can, with a given accuracy and under given restrictions, replace classical physical models of construction objects. These approaches are made feasible by the cyber-physical integration of building systems. The article presents principles for forming the data structure of designed objects and distributed models; these principles make it possible to uniquely identify elements and to increase the model's level of detail. The data structure diagram for distributed modeling includes, among other things, the level at which signals about physical processes inside cyber-physical building systems are formed and transmitted. A generalized algorithm for creating the structure of the distributed model is presented. It covers developing the data structure, formalizing requirements for the parameters of a design object and its operating modes (normal operating conditions as well as extreme conditions such as natural disasters), and selecting objects for a complete group that provides distributed modeling. The article formulates the main approaches to an important practical application of the cyber-physical integration of building systems, namely the formation of distributed physical models of designed construction objects, and outlines directions for further research.

