Stream Data Cleaning under Speed and Acceleration Constraints

2021 ◽  
Vol 46 (3) ◽  
pp. 1-44
Author(s):  
Shaoxu Song ◽  
Fei Gao ◽  
Aoqian Zhang ◽  
Jianmin Wang ◽  
Philip S. Yu

Stream data are often dirty, for example, owing to unreliable sensor readings or erroneous extraction of stock prices. Most stream data cleaning approaches employ a smoothing filter, which may seriously alter the data without preserving the original information. We argue that the cleaning should avoid changing data that are originally correct/clean, a.k.a. the minimum modification rule in data cleaning. To capture the knowledge about what is clean, we consider the (widely existing) constraints on the speed and acceleration of data changes, such as fuel consumption per hour, daily limit of stock prices, or the top speed and acceleration of a car. Guided by these semantic constraints, in this article, we propose a constraint-based approach for cleaning stream data. Notably, existing data repair techniques clean (a sequence of) data as a whole and fail to support stream computation. To this end, we have to relax the global optimum over the entire sequence to the local optimum in a window. In contrast to the NP-hardness commonly observed for general data repairing problems, our major contributions include (1) a polynomial-time algorithm for the global optimum, (2) a linear-time algorithm towards the local optimum under an efficient median-based solution, and (3) experiments on real datasets demonstrating that our method achieves significantly lower L1 error than existing approaches such as smoothers.
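
To make the idea of a speed constraint concrete, here is a minimal Python sketch. It is not the median-based algorithm of the article: it only detects violations and applies a naive greedy clamp. The names `violates_speed` and `clamp_repair`, the `(timestamp, value)` representation, and the choice to trust the first point are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value); timestamps strictly increasing


def violates_speed(points: List[Point], s_min: float, s_max: float) -> List[int]:
    """Return indices k whose change from point k-1 breaks the speed bounds."""
    bad = []
    for k in range(1, len(points)):
        (t0, x0), (t1, x1) = points[k - 1], points[k]
        speed = (x1 - x0) / (t1 - t0)
        if speed < s_min or speed > s_max:
            bad.append(k)
    return bad


def clamp_repair(points: List[Point], s_min: float, s_max: float) -> List[Point]:
    """Greedy repair (illustrative only): move each value into the range
    reachable from the previously repaired point, changing it as little
    as possible. The first point is assumed to be clean."""
    repaired = [points[0]]
    for t, x in points[1:]:
        t_prev, x_prev = repaired[-1]
        dt = t - t_prev
        lo, hi = x_prev + s_min * dt, x_prev + s_max * dt
        repaired.append((t, min(max(x, lo), hi)))
    return repaired
```

Values that already satisfy the constraint are left untouched, which mirrors the minimum modification rule, but a greedy clamp can propagate an early error forward; that weakness is one reason a window-based optimum is of interest.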

Author(s):  
Bengt J. Nilsson ◽  
Paweł Żyliński

We present new results on two types of guarding problems for polygons. For the first problem, we present an optimal linear time algorithm for computing a smallest set of points that guard a given shortest path in a simple polygon having [Formula: see text] edges. We also prove that in polygons with holes, there is a constant [Formula: see text] such that no polynomial-time algorithm can solve the problem within an approximation factor of [Formula: see text], unless P=NP. For the second problem, we present a [Formula: see text]-FPT algorithm for computing a shortest tour that sees [Formula: see text] specified points in a polygon with [Formula: see text] holes. We also present a [Formula: see text]-FPT approximation algorithm for this problem having approximation factor [Formula: see text]. In addition, we prove that the general problem cannot be polynomially approximated better than by a factor of [Formula: see text], for some constant [Formula: see text], unless P [Formula: see text]NP.


2009 ◽  
Vol 35 (4) ◽  
pp. 559-595 ◽  
Author(s):  
Liang Huang ◽  
Hao Zhang ◽  
Daniel Gildea ◽  
Kevin Knight

Systems based on synchronous grammars and tree transducers promise to improve the quality of statistical machine translation output, but are often very computationally intensive. The complexity is exponential in the size of individual grammar rules due to arbitrary re-orderings between the two languages. We develop a theory of binarization for synchronous context-free grammars and present a linear-time algorithm for binarizing synchronous rules when possible. In our large-scale experiments, we found that almost all rules are binarizable and the resulting binarized rule set significantly improves the speed and accuracy of a state-of-the-art syntax-based machine translation system. We also discuss the more general, and computationally more difficult, problem of finding good parsing strategies for non-binarizable rules, and present an approximate polynomial-time algorithm for this problem.
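
A rough Python sketch of a linear-time binarizability test follows. It assumes that the re-ordering of a synchronous rule is given as a permutation of 1..n and uses a shift-reduce scan that merges adjacent spans; the function name `is_binarizable` and this input encoding are assumptions, and the sketch only decides binarizability rather than producing the binarized rules.

```python
from typing import List, Tuple


def is_binarizable(perm: List[int]) -> bool:
    """Shift-reduce check on a non-empty permutation of 1..n.

    Each stack entry is a span (lo, hi) of target positions. After every
    shift, the top two spans are merged for as long as they are adjacent,
    i.e. together they cover a contiguous range."""
    stack: List[Tuple[int, int]] = []
    for x in perm:
        stack.append((x, x))  # shift
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                # reduce: combine two adjacent spans into one
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1
```

Each element is pushed once and every merge shrinks the stack by one entry, so the scan does a linear amount of work. The permutation (2, 4, 1, 3) is the smallest case that never reduces to a single span.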


2014 ◽  
Vol 24 (03) ◽  
pp. 225-236 ◽  
Author(s):  
DAVID KIRKPATRICK ◽  
BOTING YANG ◽  
SANDRA ZILLES

Given an arrangement A of n sensors and two points s and t in the plane, the barrier resilience of A with respect to s and t is the minimum number of sensors whose removal permits a path from s to t such that the path does not intersect the coverage region of any sensor in A. When the surveillance domain is the entire plane and sensor coverage regions are unit line segments, even with restricted orientations, the problem of determining the barrier resilience is known to be NP-hard. On the other hand, if sensor coverage regions are arbitrary lines, the problem has a trivial linear time solution. In this paper, we study the case where each sensor coverage region is an arbitrary ray, and give an O(n^2 m) time algorithm for computing the barrier resilience when there are m ⩾ 1 sensor intersections.
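
One way to see why the arbitrary-lines case is easy: every line that separates s from t must be removed, and the straight segment from s to t crosses no other line. The Python sketch below counts separating lines under that reading. The `(a, b, c)` encoding of a line as ax + by + c = 0, the function name, and the assumption that neither s nor t lies on a line are choices made here, not taken from the paper.

```python
from typing import List, Tuple

Line = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y + c = 0
Point = Tuple[float, float]


def barrier_resilience_lines(lines: List[Line], s: Point, t: Point) -> int:
    """Count the lines that have s and t strictly on opposite sides.

    Assumes neither s nor t lies on any of the lines."""
    def side(line: Line, p: Point) -> float:
        a, b, c = line
        return a * p[0] + b * p[1] + c

    return sum(1 for line in lines if side(line, s) * side(line, t) < 0)
```

The count takes one pass over the n lines, which matches the linear time bound mentioned in the abstract.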


Computing ◽  
2021 ◽  
Author(s):  
Peter Chini ◽  
Roland Meyer ◽  
Prakash Saivasan

We study liveness and model checking problems for broadcast networks, a system model of identical clients communicating via message passing. The first problem that we consider is Liveness Verification. It asks whether there is a computation such that one client visits a final state infinitely often. The complexity of the problem has been open. It was shown to be P-hard but in EXPSPACE. We close the gap by a polynomial-time algorithm. The latter relies on a characterization of live computations in terms of paths in a suitable graph, combined with a fixed-point iteration to efficiently check the existence of such paths. The second problem is Fair Liveness Verification. It asks for a computation where all participating clients visit a final state infinitely often. We adjust the algorithm to also solve fair liveness in polynomial time. Both problems can be instrumented to answer model checking questions for broadcast networks against linear time temporal logic specifications. The first problem in this context is Fair Model Checking. It demands that for all computations of a broadcast network, all participating clients satisfy the specification. We solve the problem via the Vardi–Wolper construction and a reduction to Liveness Verification. The second problem is Sparse Model Checking. It asks whether each computation has a participating client that satisfies the specification. We reduce the problem to Fair Liveness Verification.


2015 ◽  
Vol 07 (02) ◽  
pp. 1550018 ◽  
Author(s):  
Viet Hung Nguyen

A star is a graph in which some node is incident with every edge of the graph, i.e., a graph of diameter at most 2. A star forest is a graph in which each connected component is a star. Let G be a connected graph whose edges may carry positive weights. A spanning star forest of G is a subgraph of G which is a star forest spanning the nodes of G. The size of a spanning star forest F of G is defined to be the number of edges of F if G is unweighted and the total weight of all edges of F if G is weighted. We are interested in the problem of finding a Maximum Weight spanning Star Forest (MWSFP) in G. In [C. T. Nguyen, J. Shen, M. Hou, L. Sheng, W. Miller and L. Zhang, Approximating the spanning star forest problem and its applications to genomic sequence alignment, SIAM J. Comput. 38(3) (2008) 946–962], the authors introduced the MWSFP and proved its NP-hardness. They also gave a polynomial time algorithm for the MWSF problem when G is a tree. In this paper, we present a linear time algorithm that solves the MWSF problem when G is a cactus.
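
The definitions above can be checked mechanically. The following Python sketch tests whether a given edge set on nodes 0..n-1 is a star forest; it does not search for a maximum-weight one and is unrelated to the cactus algorithm. The function name and the assumption of a simple graph (no self-loops, no parallel edges) are illustrative choices.

```python
from collections import defaultdict
from typing import List, Tuple


def is_star_forest(n: int, edges: List[Tuple[int, int]]) -> bool:
    """True if every connected component has a node incident with all of
    that component's edges. Isolated nodes count as (trivial) stars."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    seen = set()
    for start in range(n):
        if start in seen:
            continue
        # Collect one connected component with an iterative DFS.
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        # Every edge is counted once from each endpoint.
        edge_count = sum(len(adj[u]) for u in comp) // 2
        if not any(len(adj[u]) == edge_count for u in comp):
            return False
    return True
```

Because all n nodes are visited, a forest that passes this check is also spanning for the node set 0..n-1.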


2005 ◽  
Vol 16 (04) ◽  
pp. 803-827 ◽  
Author(s):  
TAKEHIRO ITO ◽  
XIAO ZHOU ◽  
TAKAO NISHIZEKI

Assume that a tree T has a number n_s of "supply vertices" and all the other vertices are "demand vertices." Each supply vertex is assigned a positive number called a supply, while each demand vertex is assigned a positive number called a demand. One wishes to partition T into exactly n_s subtrees by deleting edges from T so that each subtree contains exactly one supply vertex whose supply is no less than the sum of demands of all demand vertices in the subtree. The "partition problem" is a decision problem to ask whether T has such a partition. The "maximum partition problem" is an optimization version of the partition problem. In this paper, we give three algorithms for the problems. The first is a linear-time algorithm for the partition problem. The second is a pseudo-polynomial-time algorithm for the maximum partition problem. The third is a fully polynomial-time approximation scheme (FPTAS) for the maximum partition problem.
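
As an illustration of the feasibility condition only, the Python sketch below verifies a proposed set of deleted edges; it is a checker, not the linear-time decision algorithm of the paper. The function name `is_valid_partition` and the dictionary encoding of supplies and demands are assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def is_valid_partition(edges: Iterable[Edge], deleted: Iterable[Edge],
                       supply: Dict[Hashable, float],
                       demand: Dict[Hashable, float]) -> bool:
    """Check that, after removing `deleted` from the tree, every component
    holds exactly one supply vertex whose supply covers the total demand
    of that component."""
    removed = {frozenset(e) for e in deleted}
    adj = defaultdict(list)
    for u, v in edges:
        if frozenset((u, v)) not in removed:
            adj[u].append(v)
            adj[v].append(u)

    seen = set()
    for start in set(supply) | set(demand):
        if start in seen:
            continue
        # Collect one component of the remaining forest.
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sources = [u for u in comp if u in supply]
        total_demand = sum(demand.get(u, 0) for u in comp)
        if len(sources) != 1 or supply[sources[0]] < total_demand:
            return False
    return True
```

If every component contains exactly one supply vertex, the number of components necessarily equals the number of supply vertices, so that count does not need a separate check.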


Author(s):  
RANI SIROMONEY ◽  
LISA MATHEW ◽  
K.G. SUBRAMANIAN ◽  
V.R. DARE

Learning of certain classes of two-dimensional picture languages is considered in this paper. Linear time algorithms that learn, in the limit from positive data, the classes of local picture languages and locally testable picture languages are presented. A crucial step for obtaining the learning algorithm for local picture languages is an explicit construction of a two-dimensional on-line tessellation acceptor for a given local picture language. A polynomial time algorithm that learns the class of recognizable picture languages from positive data and restricted subset queries is presented, in contrast to the fact that this class is not learnable in the limit from positive data alone.


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6599
Author(s):  
Muhammad Shahid Iqbal ◽  
Yalcin Sadi ◽  
Sinem Coleri

Wireless powered communication networks (WPCNs) will be a major enabler of massive machine type communications (MTCs), which is a major service domain for 5G and beyond systems. These MTC networks will be deployed by using low-power transceivers and a very limited set of transmission configurations. We investigate a novel minimum length scheduling problem for multi-cell full-duplex wireless powered communication networks to determine the optimal power control and scheduling for a constant-rate transmission model. The formulated optimization problem is combinatorial in nature and, thus, difficult to solve for the global optimum. As a solution strategy, we first decompose the problem into the power control problem (PCP) and the scheduling problem. For the PCP, we propose an optimal polynomial-time algorithm based on the evaluation of Perron–Frobenius conditions. For the scheduling problem, we propose a heuristic algorithm that aims to maximize the number of concurrently transmitting users by maximizing the allowable interference on each user without violating the signal-to-noise-ratio (SNR) requirements. Through extensive simulations, we demonstrate a 50% reduction in the schedule length by using the proposed algorithm in comparison to unscheduled concurrent transmissions.


2019 ◽  
Author(s):  
Md. Khaledur Rahman ◽  
M. Sohel Rahman

The genome rearrangement problem computes the minimum number of operations that are required to sort all elements of a permutation. A block-interchange operation exchanges two blocks of a permutation which are not necessarily adjacent, and in a prefix block-interchange, one block is always the prefix of that permutation. In this paper, we focus on applying prefix block-interchanges on binary and ternary strings. We present upper bounds to group and sort a given binary/ternary string. We also provide upper bounds for a different version of the block-interchange operation which we refer to as the ‘restricted prefix block-interchange’. We observe that our obtained upper bound for restricted prefix block-interchange operations on binary strings is better than that of other genome rearrangement operations to group fully normalized binary strings. Consequently, we provide a linear-time algorithm to solve the problem of grouping binary normalized strings by restricted prefix block-interchanges. We also provide a polynomial time algorithm to group normalized ternary strings by prefix block-interchange operations. Finally, we provide a classification for ternary strings based on the required number of prefix block-interchange operations.
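
The operation itself is easy to state in code. This Python sketch applies a single prefix block-interchange to a string; the index convention (half-open slices with 0 < i <= j < k <= len(s)) and the function name are assumptions made for illustration, and the restricted variant defined in the paper is not modelled.

```python
def prefix_block_interchange(s: str, i: int, j: int, k: int) -> str:
    """Exchange the prefix s[:i] with the later block s[j:k].

    The two blocks must not overlap; the middle part s[i:j] and the
    suffix s[k:] keep their relative positions."""
    if not (0 < i <= j < k <= len(s)):
        raise ValueError("require 0 < i <= j < k <= len(s)")
    return s[j:k] + s[i:j] + s[:i] + s[k:]
```

For example, interchanging the prefix "0" of "0110" with the block at positions 2..3 yields "1100", in which equal symbols are adjacent.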


2019 ◽  
Vol 30 (02) ◽  
pp. 197-230 ◽  
Author(s):  
Markus Chimani ◽  
Giuseppe Di Battista ◽  
Fabrizio Frati ◽  
Karsten Klein

In this paper, we show a polynomial-time algorithm for testing [Formula: see text]-planarity of embedded flat clustered graphs with at most two vertices per cluster on each face. Our result is based on a reduction to the planar set of spanning trees in topological multigraphs (pssttm) problem, which is defined as follows. Given a (non-planar) topological multigraph [Formula: see text] with [Formula: see text] connected components [Formula: see text], do spanning trees of [Formula: see text] exist such that no two edges in any two spanning trees cross? Kratochvíl et al. [SIAM Journal on Discrete Mathematics, 4(2): 223–244, 1991] proved that the problem is NP-hard even if [Formula: see text]; on the other hand, Di Battista and Frati presented a linear-time algorithm to solve the pssttm problem for the case in which [Formula: see text] is a [Formula: see text]-planar topological multigraph [Journal of Graph Algorithms and Applications, 13(3): 349–378, 2009]. For any embedded flat clustered graph [Formula: see text], an instance [Formula: see text] of the pssttm problem can be constructed in polynomial time such that [Formula: see text] is [Formula: see text]-planar if and only if [Formula: see text] admits a solution. We show that, if [Formula: see text] has at most two vertices per cluster on each face, then it can be tested in polynomial time whether the corresponding instance [Formula: see text] of the pssttm problem is positive or negative. Our strategy for solving the pssttm problem on [Formula: see text] is to repeatedly perform a sequence of tests, which might let us conclude that [Formula: see text] is a negative instance, and simplifications, which might let us simplify [Formula: see text] by removing or contracting some edges. 
Most of these tests and simplifications are performed “locally”, by looking at the crossings involving a single edge or face of a connected component [Formula: see text] of [Formula: see text]; however, some tests and simplifications have to consider certain global structures in [Formula: see text], which we call [Formula: see text]-donuts. If no test concludes that [Formula: see text] is a negative instance of the pssttm problem, then the simplifications eventually transform [Formula: see text] into an equivalent [Formula: see text]-planar topological multigraph on which we can apply the cited linear-time algorithm by Di Battista and Frati.

