scholarly journals The Decidability of Verification under PS 2.0

Author(s):  
Parosh Aziz Abdulla ◽  
Mohamed Faouzi Atig ◽  
Adwait Godbole ◽  
S. Krishna ◽  
Viktor Vafeiadis

AbstractWe consider the reachability problem for finite-state multi-threaded programs under thepromising semantics() of Lee et al., which captures most common program transformations. Since reachability is already known to be undecidable in the fragment of with only release-acquire accesses (-), we consider the fragment with only relaxed accesses and promises (). We show that reachability under is undecidable in general and that it becomes decidable, albeit non-primitive recursive, if we bound the number of promises.Given these results, we consider a bounded version of the reachability problem. To this end, we bound both the number of promises and of “view-switches”, i.e., the number of times the processes may switch their local views of the global memory. We provide a code-to-code translation from an input program under (with relaxed and release-acquire memory accesses along with promises) to a program under SC, thereby reducing the bounded reachability problem under to the bounded context-switching problem under SC. We have implemented a tool and tested it on a set of benchmarks, demonstrating that typical bugs in programs can be found with a small bound.

Author(s):  
Michael Blondin ◽  
Javier Esparza ◽  
Stefan Jaax ◽  
Philipp J. Meyer

AbstractPopulation protocols are a well established model of computation by anonymous, identical finite-state agents. A protocol is well-specified if from every initial configuration, all fair executions of the protocol reach a common consensus. The central verification question for population protocols is the well-specification problem: deciding if a given protocol is well-specified. Esparza et al. have recently shown that this problem is decidable, but with very high complexity: it is at least as hard as the Petri net reachability problem, which is -hard, and for which only algorithms of non-primitive recursive complexity are currently known. In this paper we introduce the class $${ WS}^3$$ WS 3 of well-specified strongly-silent protocols and we prove that it is suitable for automatic verification. More precisely, we show that $${ WS}^3$$ WS 3 has the same computational power as general well-specified protocols, and captures standard protocols from the literature. Moreover, we show that the membership and correctness problems for $${ WS}^3$$ WS 3 reduce to solving boolean combinations of linear constraints over $${\mathbb {N}}$$ N . This allowed us to develop the first software able to automatically prove correctness for all of the infinitely many possible inputs.


Author(s):  
A. R. Balasubramanian ◽  
Javier Esparza ◽  
Mikhail Raskin

AbstractIn rendez-vous protocols an arbitrarily large number of indistinguishable finite-state agents interact in pairs. The cut-off problem asks if there exists a number B such that all initial configurations of the protocol with at least B agents in a given initial state can reach a final configuration with all agents in a given final state. In a recent paper [17], Horn and Sangnier prove that the cut-off problem is equivalent to the Petri net reachability problem for protocols with a leader, and in "Image missing" for leaderless protocols. Further, for the special class of symmetric protocols they reduce these bounds to "Image missing" and "Image missing" , respectively. The problem of lowering these upper bounds or finding matching lower bounds is left open. We show that the cut-off problem is "Image missing" -complete for leaderless protocols, "Image missing" -complete for symmetric protocols with a leader, and in "Image missing" for leaderless symmetric protocols, thereby solving all the problems left open in [17].


1994 ◽  
Vol 05 (03n04) ◽  
pp. 281-292
Author(s):  
HSU-CHUN YEN ◽  
BOW-YAW WANG ◽  
MING-SHANG YANG

We define a subclass of Petri nets called m–state n–cycle Petri nets, each of which can be thought of as a ring of n bounded (by m states) Petri nets using n potentially unbounded places as joins. Let Ring(n, l, m) be the class of m–state n–cycle Petri nets in which the largest integer mentioned can be represented in l bits (when the standard binary encoding scheme is used). As it turns out, both the reachability problem and the boundedness problem can be decided in O(n(l+log m)) nondeterministic space. Our results provide a slight improvement over previous results for the so-called cyclic communicating finite state machines. We also compare and contrast our results with that of VASS(n, l, s), which represents the class of n-dimensional s-state vector addition systems with states where the largest integer mentioned can be described in l bits.


2013 ◽  
Vol 24 (02) ◽  
pp. 283-302
Author(s):  
NATALIYA SKRYPNYUK ◽  
FLEMMING NIELSON

In this work we present an algorithm for solving the reachability problem in finite systems that are modelled with process algebras. Our method is based on Static Analysis, in particular, Data Flow Analysis, of the syntax of a process algebraic system with multi-way synchronisation. The results of the Data Flow Analysis are used in order to build a set of Horn clauses whose least model corresponds to an overapproximation of the reachable states. The computed model can be refined after each transition, and the algorithm runs until either a state whose reachability should be checked is encountered or it is not in the least model for all constructed states and thus is definitely unreachable. The advantages of the algorithm are that in many cases only a part of the Labelled Transition System will be built which leads to lower time and memory consumption. Also, it is not necessary to save all the encountered states which leads to further reduction of the memory requirements of the algorithm.


2004 ◽  
Vol 11 (7) ◽  
Author(s):  
Mojmír Kretínský ◽  
Vojtech Rehák ◽  
Jan Strejcek

We provide a unified view on three extensions of Process rewrite systems (PRS) and compare their and PRS's expressive power. We show that the class of Petri Nets is less expressible up to bisimulation than the class of Process Algebra extended with finite state control unit. Further we show our main result that the reachability problem for PRS extended with a so called weak finite state unit is decidable.


2021 ◽  
Vol 34 (2) ◽  
pp. 133-177
Author(s):  
Javier Esparza ◽  
Stefan Jaax ◽  
Mikhail Raskin ◽  
Chana Weil-Kennedy

AbstractPopulation protocols (Angluin et al. in PODC, 2004) are a model of distributed computation in which indistinguishable, finite-state agents interact in pairs to decide if their initial configuration, i.e., the initial number of agents in each state, satisfies a given property. In a seminal paper Angluin et al. classified population protocols according to their communication mechanism, and conducted an exhaustive study of the expressive power of each class, that is, of the properties they can decide (Angluin et al. in Distrib Comput 20(4):279–304, 2007). In this paper we study the correctness problem for population protocols, i.e., whether a given protocol decides a given property. A previous paper (Esparza et al. in Acta Inform 54(2):191–215, 2017) has shown that the problem is decidable for the main population protocol model, but at least as hard as the reachability problem for Petri nets, which has recently been proved to have non-elementary complexity. Motivated by this result, we study the computational complexity of the correctness problem for all other classes introduced by Angluin et al., some of which are less powerful than the main model. Our main results show that for the class of observation models the complexity of the problem is much lower, ranging from $$\varPi _2^p$$ Π 2 p to .


2018 ◽  
Vol 3 (7) ◽  
pp. 12
Author(s):  
DaeHwan Kim

Nowadays, GPU processors are widely used for general-purpose parallel computation applications. In the GPU programming, thread and block configuration is one of the most important decisions to be made, which increases parallelism and hides instruction latency. However, in many cases, it is often difficult to have sufficient parallelism to hide all the latencies, where the high latencies are often caused by the global memory accesses. In order to reduce the number of  those accesses, the shared memory is instead used which is  much faster than the global memory being located on a chip. The performance of the proposed thread configuration is evaluated on the GPU 960 processor. The experimental result shows that the best configuration improves the performance by 7.3 times compared to the worst configuration in the experiment. The experiences are also discussed for the shared memory performance when compared to that of the global memory.


Author(s):  
Antonio Fuentes-Alventosa ◽  
Juan Gómez-Luna ◽  
José Maria González-Linares ◽  
Nicolás Guil ◽  
R. Medina-Carnicer

AbstractCAVLC (Context-Adaptive Variable Length Coding) is a high-performance entropy method for video and image compression. It is the most commonly used entropy method in the video standard H.264. In recent years, several hardware accelerators for CAVLC have been designed. In contrast, high-performance software implementations of CAVLC (e.g., GPU-based) are scarce. A high-performance GPU-based implementation of CAVLC is desirable in several scenarios. On the one hand, it can be exploited as the entropy component in GPU-based H.264 encoders, which are a very suitable solution when GPU built-in H.264 hardware encoders lack certain necessary functionality, such as data encryption and information hiding. On the other hand, a GPU-based implementation of CAVLC can be reused in a wide variety of GPU-based compression systems for encoding images and videos in formats other than H.264, such as medical images. This is not possible with hardware implementations of CAVLC, as they are non-separable components of hardware H.264 encoders. In this paper, we present CAVLCU, an efficient implementation of CAVLC on GPU, which is based on four key ideas. First, we use only one kernel to avoid the long latency global memory accesses required to transmit intermediate results among different kernels, and the costly launches and terminations of additional kernels. Second, we apply an efficient synchronization mechanism for thread-blocks (In this paper, to prevent confusion, a block of pixels of a frame will be referred to as simply block and a GPU thread block as thread-block.) that process adjacent frame regions (in horizontal and vertical dimensions) to share results in global memory space. Third, we exploit fully the available global memory bandwidth by using vectorized loads to move directly the quantized transform coefficients to registers. Fourth, we use register tiling to implement the zigzag sorting, thus obtaining high instruction-level parallelism. An exhaustive experimental evaluation showed that our approach is between 2.5$$\times$$ × and 5.4$$\times$$ × faster than the only state-of-the-art GPU-based implementation of CAVLC.


Sign in / Sign up

Export Citation Format

Share Document