Incremental Computation: Latest Research Papers

Efficient algorithms for dynamic bidirected Dyck-reachability

Proceedings of the ACM on Programming Languages ◽

10.1145/3498724 ◽

2022 ◽

Vol 6 (POPL) ◽

pp. 1-29

Author(s):

Yuanbo Li ◽

Kris Satya ◽

Qirun Zhang

Keyword(s):

Program Analysis ◽

Transitive Closure ◽

Dynamic Graph ◽

Alias Analysis ◽

Worst Case ◽

Dynamic Algorithm ◽

Dynamic Algorithms ◽

Incremental Computation ◽

Deterministic Dynamic ◽

Bidirected Graphs

Dyck-reachability is a fundamental formulation for program analysis, which has been widely used to capture properly-matched-parenthesis program properties such as function calls/returns and field writes/reads. Bidirected Dyck-reachability is a relaxation of Dyck-reachability on bidirected graphs where each edge $u \xrightarrow{(_i} v$ labeled by an open parenthesis "$(_i$" is accompanied by an inverse edge $v \xrightarrow{)_i} u$ labeled by the corresponding close parenthesis "$)_i$", and vice versa. In practice, many client analyses such as alias analysis adopt the bidirected Dyck-reachability formulation. Bidirected Dyck-reachability admits an optimal reachability algorithm. Specifically, given a graph with $n$ nodes and $m$ edges, the optimal bidirected Dyck-reachability algorithm computes all-pairs reachability information in $O(m)$ time. This paper focuses on the dynamic version of bidirected Dyck-reachability. In particular, we consider the problem of maintaining all-pairs Dyck-reachability information in bidirected graphs under a sequence of edge insertions and deletions. Dynamic bidirected Dyck-reachability can formulate many program analysis problems in the presence of code changes. Unfortunately, solving dynamic graph reachability problems is challenging. For example, even for maintaining transitive closure, the fastest deterministic dynamic algorithm requires $O(n^2)$ update time to achieve $O(1)$ query time. All-pairs Dyck-reachability is a generalization of transitive closure. Despite extensive research on incremental computation, there is no algorithmic development on dynamic graph algorithms for program analysis with worst-case guarantees. Our work fills the gap and proposes the first dynamic algorithm for Dyck-reachability on bidirected graphs. Our dynamic algorithms can handle each graph update (i.e., edge insertion and deletion) in $O(n \cdot \alpha(n))$ time and support any all-pairs reachability query in $O(1)$ time, where $\alpha(n)$ is the inverse Ackermann function.
We have implemented and evaluated our dynamic algorithm on an alias analysis and a context-sensitive data-dependence analysis for Java. We compare our dynamic algorithms against a straightforward approach based on the $O(m)$-time optimal bidirected Dyck-reachability algorithm and a recent incremental Datalog solver. Experimental results show that our algorithm achieves orders of magnitude speedup over both approaches.
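
To make the bidirected formulation concrete, the following is a minimal sketch of a static (non-dynamic) computation of bidirected Dyck-reachability classes with a union-find structure. It is not the paper's dynamic algorithm, nor the optimal $O(m)$ algorithm the abstract mentions. It relies on one observation: if two nodes u and v both have an open-parenthesis edge with the same label into a node w, then the path u, w, v reads "$(_i )_i$", so u and v are mutually reachable and can be merged. The function name `bidirected_dyck_components` and the edge encoding `(u, label, v)` are assumptions made for this illustration.

```python
def bidirected_dyck_components(num_nodes, edges):
    """Return a list mapping each node to a representative of its class.

    edges: iterable of (u, label, v), meaning an open-parenthesis edge
    u --(label--> v; the inverse close-parenthesis edge v --)label--> u
    is implicit (hypothetical encoding chosen for this sketch).
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # incoming[w][label] = one source of an open-`label` edge into class w.
    incoming = [dict() for _ in range(num_nodes)]
    pending = []  # pairs of nodes that must end up in the same class

    for u, label, v in edges:
        if label in incoming[v]:
            # Two sources share target v under the same label: merge them.
            pending.append((incoming[v][label], u))
        else:
            incoming[v][label] = u

    while pending:
        a, b = pending.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        # Merge the class with fewer recorded labels into the other.
        if len(incoming[a]) < len(incoming[b]):
            a, b = b, a
        parent[b] = a
        for label, src in incoming[b].items():
            if label in incoming[a]:
                # The merged class now has two sources for this label.
                pending.append((incoming[a][label], src))
            else:
                incoming[a][label] = src
        incoming[b] = {}

    return [find(x) for x in range(num_nodes)]


comp = bidirected_dyck_components(
    7, [(0, 1, 2), (1, 1, 2), (3, 2, 0), (4, 2, 1), (5, 1, 6)]
)
print(comp[0] == comp[1], comp[3] == comp[4], comp[0] == comp[2])
```

In this example, nodes 0 and 1 merge because both point into node 2 with label 1; that merge in turn forces nodes 3 and 4 together, since both now point into the same class with label 2. Recomputing these classes from scratch after every edge change is the "straightforward approach" the abstract compares against.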

Efficient Incremental Computation of Aggregations over Sliding Windows

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining ◽

10.1145/3447548.3467360 ◽

2021 ◽

Author(s):

Chao Zhang ◽

Reza Akbarinia ◽

Farouk Toumani

Keyword(s):

Sliding Windows ◽

Incremental Computation

DiterGraph: Toward I/O-Efficient Incremental Computation over Large Graphs with Billion Edges

10.1109/bigcom53800.2021.00006 ◽

2021 ◽

Author(s):

Yujie Du ◽

Zhigang Wang ◽

Ning Wang ◽

Luqing Xie ◽

Zhiqiang Wei

Keyword(s):

Large Graphs ◽

Incremental Computation

Incremental Computation for Structured Argumentation over Dynamic DeLP Knowledge Bases

Artificial Intelligence ◽

10.1016/j.artint.2021.103553 ◽

2021 ◽

pp. 103553

Author(s):

Gianvincenzo Alfano ◽

Sergio Greco ◽

Francesco Parisi ◽

Gerardo I. Simari ◽

Guillermo R. Simari

Keyword(s):

Knowledge Bases ◽

Incremental Computation ◽

Structured Argumentation

Chunk-wise regularised PCA-based imputation of missing data

Statistical Methods & Applications ◽

10.1007/s10260-021-00575-5 ◽

2021 ◽

Author(s):

A. Iodice D’Enza ◽

A. Markos ◽

F. Palumbo

Keyword(s):

Missing Data ◽

Principal Component ◽

Multivariate Techniques ◽

Sufficient Information ◽

Data Sets ◽

Incremental Approach ◽

Distributed Approach ◽

Missing Completely At Random ◽

Incremental Computation ◽

PCA Algorithm

Standard multivariate techniques like Principal Component Analysis (PCA) are based on the eigendecomposition of a matrix and therefore require complete data sets. Recent comparative reviews of PCA algorithms for missing data showed the regularised iterative PCA algorithm (RPCA) to be effective. This paper presents two chunk-wise implementations of RPCA suitable for the imputation of “tall” data sets, that is, data sets with many observations. A “chunk” is a subset of the whole set of available observations. In particular, one implementation is suitable for distributed computation as it imputes each chunk independently. The other implementation, instead, is suitable for incremental computation, where the imputation of each new chunk is based on all the chunks analysed so far. The proposed procedures were compared to batch RPCA considering different data sets and missing data mechanisms. Experimental results showed that the distributed approach had similar performance to batch RPCA for data with entries missing completely at random. The incremental approach showed appreciable performance when the data is missing not completely at random, and the first analysed chunks contain sufficient information on the data structure.

RECURSIVE JOIN PROCESSING IN BIG DATA ENVIRONMENT

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/37/2/15889 ◽

2021 ◽

Vol 37 (2) ◽

pp. 107-122

Author(s):

Anh-Cang Phan ◽

Thanh-Ngoan Trieu ◽

Thuong-Cang Phan

Keyword(s):

Big Data ◽

Large Scale ◽

Large Datasets ◽

Experimental Results ◽

Hierarchical Data ◽

Efficient Approach ◽

Intermediate Data ◽

Incremental Computation ◽

Data Environment ◽

Over Time

In the era of information explosion, Big data is receiving increased attention as having important implications for the growth, profitability, and survival of modern organizations. However, it also poses many challenges in the way data is processed and queried over time. A join operation is one of the most common operations appearing in many data queries. In particular, a recursive join is a join type used to query hierarchical data, but it is considerably more complex and costly. The evaluation of the recursive join in MapReduce consists of several iterations, each comprising two tasks: a join task and an incremental computation task. Those tasks are significantly expensive and reduce the performance of queries on large datasets because they generate plenty of intermediate data that is transmitted over the network. In this study, we thus propose a simple but efficient approach for Big recursive joins based on reducing by half the number of the required iterations in the Spark environment. This improvement significantly reduces the number of required tasks as well as the amount of intermediate data generated and transferred over the network. Our experimental results show that the improved recursive join is more efficient and faster than a traditional one on large-scale datasets.
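
The iteration structure described in this abstract (a join task followed by an incremental computation task) can be sketched in a few lines of plain Python. This is only an in-memory illustration of a conventional semi-naive recursive join computing a transitive closure; it is not the authors' Spark implementation and does not include their halving of the iteration count. The function name `recursive_join` and the edge-pair encoding are assumptions made for this example.

```python
def recursive_join(edges):
    """Compute the transitive closure of `edges` by semi-naive iteration.

    edges: iterable of (source, target) pairs, e.g. parent/child links in
    hierarchical data. Returns (closure, number_of_iterations).
    """
    edges = set(edges)

    # Index the base relation by source so the join is a dictionary lookup.
    by_source = {}
    for src, dst in edges:
        by_source.setdefault(src, set()).add(dst)

    closure = set(edges)  # all pairs found so far
    delta = set(edges)    # pairs discovered in the previous iteration
    iterations = 0

    while delta:
        iterations += 1
        # Join task: extend each new pair (a, b) with base edges (b, c).
        joined = {(a, c) for a, b in delta for c in by_source.get(b, ())}
        # Incremental computation task: keep only pairs not seen before.
        delta = joined - closure
        closure |= delta

    return closure, iterations


closure, iterations = recursive_join([(1, 2), (2, 3), (3, 4)])
print(sorted(closure), iterations)
```

Each pass through the loop corresponds to one iteration in the abstract's sense, so a long chain of hierarchical links needs many passes. That cost is what motivates reducing the number of iterations.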

EXPERIENCE: Algorithms and Case Study for Explaining Repairs with Uniform Profiles over IoT Data

Journal of Data and Information Quality ◽

10.1145/3436239 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1-17

Author(s):

Zhicheng Liu ◽

Yang Zhang ◽

Ruihong Huang ◽

Zhiwei Chen ◽

Shaoxu Song ◽

...

Keyword(s):

Water Temperature ◽

Decision Maker ◽

Block Size ◽

Temperature Data ◽

Prototype System ◽

Time Intervals ◽

Incremental Computation ◽

Water Temperature Data ◽

GPS Trajectories

IoT data with timestamps, such as GPS trajectories or sensor readings, often contain outliers. While existing systems mostly focus on detecting temporal outliers without explanations and repairs, a decision maker may be more interested in the cause of the outliers so that subsequent actions can be taken, e.g., cleaning unreliable readings, repairing broken devices, or adopting a strategy for data repairs. Such outlier detection, explanation, and repairs are expected to be performed in either offline (batch) or online modes (over streaming IoT data with timestamps). In this work, we present TsClean, a new prototype system for detecting and repairing outliers with explanations over IoT data. The framework defines uniform profiles to explain the outliers detected by various algorithms, including outliers with variant time intervals, and takes approaches to repair them. Both batch and streaming processing are supported in a uniform framework. In particular, by varying the block size, it provides a tradeoff between computing accurate results and approximating them with efficient incremental computation. In this article, we present several case studies of applying TsClean in industry, e.g., how this framework works in detecting and repairing outliers over excavator water temperature data, and how to get reasonable explanations and repairs for the detected outliers in tracking excavators.

Efficient EMD-based Similarity Search via Batch Pruning and Incremental Computation

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2021.3100566 ◽

2021 ◽

pp. 1-1

Author(s):

Yu Chen ◽

Yong Zhang ◽

Jin Wang ◽

Jiacheng Wu ◽

Chunxiao Xing

Keyword(s):

Similarity Search ◽

Incremental Computation

ORBITS

Proceedings of the VLDB Endowment ◽

10.14778/3430915.3430920 ◽

2020 ◽

Vol 14 (3) ◽

pp. 294-306

Author(s):

Mourad Khayati ◽

Ines Arous ◽

Zakhar Tymchenko ◽

Philippe Cudré-Mauroux

Keyword(s):

Time Series ◽

Linear Time ◽

Multiple Time ◽

Multiple Time Series ◽

Online Data ◽

Incremental Computation ◽

Online Recovery ◽

Recovery Technique ◽

Centroid Decomposition

With the emergence of the Internet of Things (IoT), time series streams have become ubiquitous in our daily life. Recording such data is rarely a perfect process, as sensor failures frequently occur, yielding occasional blocks of data that go missing in multiple time series. These missing blocks do not only affect real-time monitoring but also compromise the quality of online data analyses. Effective streaming recovery (imputation) techniques either have a quadratic runtime complexity, which is infeasible for any moderately sized data, or cannot recover more than one time series at a time. In this paper, we introduce a new online recovery technique to recover multiple time series streams in linear time. Our recovery technique implements a novel incremental version of the Centroid Decomposition technique and reduces its complexity from quadratic to linear. Using this incremental technique, missing blocks are efficiently recovered in a continuous manner based on previous recoveries. We formally prove the correctness of our new incremental computation, which yields an accurate recovery. Our experimental results on real-world time series show that our recovery technique is, on average, 30% more accurate than the state of the art while being vastly more efficient.

EFFICIENT SURFACE-AWARE SEMI-GLOBAL MATCHING WITH MULTI-VIEW PLANE-SWEEP SAMPLING

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-2-w7-137-2019 ◽

2019 ◽

Vol IV-2/W7 ◽

pp. 137-144

Author(s):

B. Ruf ◽

T. Pollok ◽

M. Weinmann

Keyword(s):

Image Matching ◽

Structural Information ◽

Input Sequence ◽

Depth Estimation ◽

Aerial Image ◽

Smoothness Assumption ◽

Plane Sweep ◽

Incremental Computation ◽

Accuracy Measure ◽

Global Matching

Online augmentation of an oblique aerial image sequence with structural information is an essential aspect in the process of 3D scene interpretation and analysis. One key aspect in this is efficient dense image matching and depth estimation. Here, the Semi-Global Matching (SGM) approach has proven to be one of the most widely used algorithms for efficient depth estimation, providing a good trade-off between accuracy and computational complexity. However, SGM only models a first-order smoothness assumption, thus favoring fronto-parallel surfaces. In this work, we present a hierarchical algorithm that allows for efficient depth and normal map estimation together with confidence measures for each estimate. Our algorithm relies on a plane-sweep multi-image matching followed by an extended SGM optimization that incorporates local surface orientations, thus achieving more consistent and accurate estimates in areas made up of slanted surfaces, inherent to oblique aerial imagery. We evaluate numerous configurations of our algorithm on two different datasets using an absolute and relative accuracy measure. In our evaluation, we show that the results of our approach are comparable to the ones achieved by refined Structure-from-Motion (SfM) pipelines, such as COLMAP, which are designed for offline processing. In contrast, however, our approach only considers a confined image bundle of an input sequence, thus allowing an online and incremental computation at 1 Hz–2 Hz.
