Static Analysis for Java Servlets and JSP

We present an approach for statically reasoning about the behavior of Web applications that are developed using Java Servlets and JSP. Specifically, we attack the problems of guaranteeing that all output is well-formed and valid XML and of ensuring consistency of XHTML form fields and session state. Our approach builds on a collection of program analysis techniques developed earlier in the JWIG and XACT projects, combined with work on balanced context-free grammars. Together, these provide the necessary foundation for reasoning about output streams and application control flow.

Layered Region Based Flow-Sensitive Demand-Driven Alias Analysis

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.577.917 ◽

2014 ◽

Vol 577 ◽

pp. 917-920

Author(s):

Long Pang ◽

Xiao Hong Su ◽

Pei Jun Ma ◽

Ling Ling Zhao

Keyword(s):

Program Analysis ◽

Control Flow ◽

Alias Analysis ◽

Reachability Problem ◽

Analysis Algorithm ◽

Context Free Language ◽

Context Free ◽

Free Language ◽

Demand Driven Analysis

Pointer alias information is indispensable for program analysis. Compared with computing points-to sets, it is more efficient to formulate aliasing as a context-free language (CFL) reachability problem. However, the precision of this formulation is limited by its flow-insensitivity. To solve this problem, we propose a flow-sensitive, demand-driven analysis algorithm for answering may-alias queries. First, partial static single assignment (SSA) form is used to discriminate the address-taken pointers. Then, the order of control flow is encoded in a level linearization code to ease comparison. Finally, a demand-driven alias query is converted into a search for CFL reachability with feasible flows. The experiments demonstrate the effectiveness of the proposed approach.
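As a rough illustration of the CFL-reachability formulation mentioned in this abstract, the following Python sketch computes which node pairs of a labeled graph are joined by a path whose labels form a balanced string. It is a generic, exhaustive, flow-insensitive baseline, not the paper's flow-sensitive demand-driven algorithm, and it does not use the paper's alias grammar. The function name, the edge encoding, and the example graph are all assumptions made for this sketch; in an alias setting the matched labels could stand, for example, for paired operations on the same memory location.

```python
from collections import defaultdict


def balanced_reachability(nodes, edges):
    """Return all pairs (u, v) joined by a path whose labels are balanced.

    Each edge is (src, kind, index, dst) with kind in
    {"open", "close", "plain"} (a hypothetical encoding).
    Grammar: M -> empty | plain | M M | open_i M close_i.
    """
    reach = {(n, n) for n in nodes}   # the empty path is balanced
    opens_into = defaultdict(set)     # node -> {(src, index)} of open edges ending there
    closes_from = defaultdict(set)    # node -> {(index, dst)} of close edges starting there
    for src, kind, index, dst in edges:
        if kind == "plain":
            reach.add((src, dst))
        elif kind == "open":
            opens_into[dst].add((src, index))
        elif kind == "close":
            closes_from[src].add((index, dst))
        else:
            raise ValueError(f"unknown edge kind: {kind}")

    # Naive fixpoint: apply both grammar rules until nothing new appears.
    while True:
        new = set()
        for a, b in reach:
            # Rule M -> M M: concatenate two balanced paths.
            for c, d in reach:
                if b == c:
                    new.add((a, d))
            # Rule M -> open_i M close_i: wrap a balanced path in a matched pair.
            for src, i in opens_into[a]:
                for j, dst in closes_from[b]:
                    if i == j:
                        new.add((src, dst))
        if new <= reach:
            return reach
        reach |= new


# Hypothetical example graph: a -(1-> b -> c -)1-> d, and c -)2-> e.
NODES = ["a", "b", "c", "d", "e"]
EDGES = [
    ("a", "open", 1, "b"),
    ("b", "plain", None, "c"),
    ("c", "close", 1, "d"),
    ("c", "close", 2, "e"),
]
pairs = balanced_reachability(NODES, EDGES)
print(("a", "d") in pairs)  # True: open 1 is matched by close 1
print(("a", "e") in pairs)  # False: open 1 is not matched by close 2
```

The sketch ignores statement order entirely, which is the flow-insensitivity the abstract identifies as the limit on precision. A demand-driven variant would explore only the part of the graph relevant to a single query instead of computing every pair.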

Efficient generation of random sentences

Natural Language Engineering ◽

10.1017/s1351324996001234 ◽

1996 ◽

Vol 2 (1) ◽

pp. 1-13 ◽

Cited By ~ 8

Author(s):

Mark-Jan Nederhof

Keyword(s):

Static Analysis ◽

Finite Domain ◽

Random Generation ◽

Efficient Generation ◽

Parameter Values ◽

Context Free ◽

Context Free Grammars

We discuss the random generation of strings using the grammatical formalism AGFL. This formalism consists of context-free grammars extended with a parameter mechanism, where the parameters range over a finite domain. Our approach consists in static analysis of the combinations of parameter values with which derivations can be constructed. After this analysis, generation of sentences can be performed without backtracking.
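The idea of analyzing a grammar statically so that generation never has to backtrack can be sketched in Python on a plain context-free grammar. This is a simplification and not the paper's method: AGFL parameters are left out, and the analysis shown only computes which nonterminals can derive a terminal string, then prunes the alternatives that cannot. The grammar, the function names, and the fixed random seed are assumptions made for this example.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives.
# Symbols that are not keys of GRAMMAR are terminals.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["NP", "Loop"]],
    "N": [["cat"], ["dog"]],
    "VP": [["sleeps"], ["sees", "NP"]],
    "Loop": [["Loop", "x"]],  # can never derive a terminal string
}


def productive_nonterminals(grammar):
    """Static analysis: find nonterminals that derive at least one terminal string."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            if nt in productive:
                continue
            for alt in alternatives:
                if all(s not in grammar or s in productive for s in alt):
                    productive.add(nt)
                    changed = True
                    break
    return productive


def prune(grammar):
    """Keep only alternatives built entirely from terminals and productive nonterminals."""
    productive = productive_nonterminals(grammar)
    return {
        nt: [alt for alt in alternatives
             if all(s not in grammar or s in productive for s in alt)]
        for nt, alternatives in grammar.items()
        if nt in productive
    }


def generate(pruned, symbol, rng):
    """Expand a symbol by random choices; no choice can lead to a dead end.

    The pruned toy grammar is non-recursive, so this always terminates.
    """
    if symbol not in pruned:
        return [symbol]
    words = []
    for s in rng.choice(pruned[symbol]):
        words.extend(generate(pruned, s, rng))
    return words


pruned = prune(GRAMMAR)
rng = random.Random(0)  # fixed seed only to make runs repeatable
print(" ".join(generate(pruned, "S", rng)))
```

Without the pruning step, a generator that picked the alternative `NP -> NP Loop` would never finish and would have to undo that choice. After pruning, every remaining alternative is guaranteed to lead to a sentence.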

Obtaining Real-World Benchmark Programs from Open-Source Repositories Through Abstract-Semantics Preserving Transformations

10.18122/td/1644/boisestate ◽

2020 ◽

Author(s):

Maria Paquin

Keyword(s):

Open Source ◽

Static Analysis ◽

Real World ◽

Program Analysis ◽

Symbolic Execution ◽

Analysis Techniques ◽

Second Stage ◽

The Third ◽

Third Stage ◽

Transformation Algorithms

Benchmark programs are an integral part of program analysis research. Researchers use benchmark programs to evaluate existing techniques and test the feasibility of new approaches. The larger and more realistic the set of benchmarks, the more confident a researcher can be about the correctness and reproducibility of their results. However, obtaining an adequate set of benchmark programs has been a long-standing challenge in the program analysis community. In this thesis, we present the APT tool, a framework we designed and implemented to automate the generation of realistic benchmark programs suitable for program analysis evaluations. Our tool targets intra-procedural analyses that operate on an integer domain, specifically symbolic execution. The framework is composed of three main stages. In the first stage, the tool extracts potential benchmark programs suitable for symbolic execution from open-source repositories. In the second stage, the tool transforms the extracted programs into compilable, stand-alone benchmarks by removing external dependencies and nonlinear expressions. In the third stage, the benchmarks are verified and made available to the user. We have designed our transformation algorithms to remove program dependencies and nonlinear expressions while preserving the programs' semantics under the abstraction of the symbolic analysis. That is, we want the information the analysis computes on the original program and on its transformed version to be equivalent. Our work provides static analysis researchers with concise, compilable benchmark programs that are relevant to symbolic execution, allowing them to focus their efforts on advancing analysis techniques. Furthermore, our work benefits the software engineering community by enabling static analysis researchers to perform benchmarking with a large, realistic set of programs, thus strengthening the empirical evidence of the advancements in static program analysis.

Learning metamorphic malware signatures from samples

Journal of Computer Virology and Hacking Techniques ◽

10.1007/s11416-021-00377-z ◽

2021 ◽

Author(s):

Marco Campion ◽

Mila Dalla Preda ◽

Roberto Giacobazzi

Keyword(s):

Control Flow ◽

Graph Representation ◽

Approximation Process ◽

Upper Approximation ◽

Detection Systems ◽

Transformation Rules ◽

Signature Matching ◽

Metamorphic Malware ◽

Context Free ◽

Context Free Grammars

Metamorphic malware are self-modifying programs which apply semantic preserving transformations to their own code in order to foil detection systems based on signature matching. Metamorphism impacts both software security and code protection technologies: it is used by malware writers to evade detection systems based on pattern matching and by software developers for preventing malicious host attacks through software diversification. In this paper, we consider the problem of automatically extracting metamorphic signatures from the analysis of metamorphic malware variants. We define a metamorphic signature as an abstract program representation that ideally captures all the possible code variants that might be generated during the execution of a metamorphic program. For this purpose, we developed MetaSign: a tool that takes as input a collection of metamorphic code variants and produces, as output, a set of transformation rules that could have been used to generate the considered metamorphic variants. MetaSign starts from a control flow graph representation of the input variants and agglomerates them into an automaton which approximates the considered code variants. The upper approximation process is based on the concept of widening automata, while the semantic preserving transformation rules, used by the metamorphic program, can be viewed as rewriting rules and modeled as grammar productions. In this setting, the grammar recognizes the language of code variants, while the production rules model the metamorphic transformations. In particular, we formalize the language of code variants in terms of pure context-free grammars, which are similar to context-free grammars with no terminal symbols. After the widening process, we create a positive set of samples from which we extract the productions of the grammar by applying a learning grammar technique. This allows us to learn the transformation rules used by the metamorphic engine to generate the considered code variants. We validate the results of MetaSign on some case studies.

Analyzing Ambiguity of Context-Free Grammars

BRICS Report Series ◽

10.7146/brics.v13i9.21965 ◽

2006 ◽

Vol 13 (9) ◽

Cited By ~ 3

Author(s):

Claus Brabrand ◽

Robert Giegerich ◽

Anders Møller

Keyword(s):

Static Analysis ◽

Real World ◽

Full Text ◽

Language Design ◽

RNA Analysis ◽

Parser Generation ◽

Context Free ◽

Linguistic Characterization ◽

Context Free Grammars

It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures. However, the fact that the problem is undecidable does not mean that there are no useful approximations to the problem. We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this to conservatively approximate the problem based on local regular approximations and grammar unfoldings. As an application, we consider grammars that occur in RNA analysis in bioinformatics, and we demonstrate that our static analysis of context-free grammars is sufficiently precise and efficient to be practically useful. Full text: http://dx.doi.org/10.1016/j.scico.2009.11.002
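To make the notion of ambiguity concrete, the following Python sketch counts the parse trees of a given string for a small grammar in Chomsky normal form. A string with more than one parse tree witnesses that the grammar is ambiguous. This is only a brute-force check on individual strings, not the conservative approximation based on regular approximations and grammar unfoldings described in the abstract. The grammar and the function name are assumptions made for this example.

```python
from functools import lru_cache

# Hypothetical grammar in Chomsky normal form: E -> E E | 'a'
BINARY = {"E": [("E", "E")]}   # rules of the form A -> B C
TERMINAL = {"E": ["a"]}        # rules of the form A -> 'a'


def count_parses(word, start="E"):
    """Count the distinct parse trees deriving `word` from `start`."""
    if not word:
        return 0  # a CNF grammar without an empty rule derives no empty string

    @lru_cache(maxsize=None)
    def count(nt, i, j):
        # Number of parse trees for word[i:j] rooted at nonterminal nt.
        if j - i == 1:
            return TERMINAL.get(nt, []).count(word[i])
        total = 0
        for left, right in BINARY.get(nt, []):
            for k in range(i + 1, j):  # every split point of the span
                total += count(left, i, k) * count(right, k, j)
        return total

    return count(start, 0, len(word))


print(count_parses("aa"))   # 1
print(count_parses("aaa"))  # 2, so the grammar is ambiguous
```

Because ambiguity is undecidable in general, testing strings one by one can confirm ambiguity when a witness is found but can never prove that a grammar is unambiguous, which is why a conservative static approximation is useful.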

Analyzing Ambiguity of Context-Free Grammars

BRICS Report Series ◽

10.7146/brics.v14i10.21932 ◽

2007 ◽

Vol 14 (10) ◽

Cited By ~ 2

Author(s):

Claus Brabrand ◽

Robert Giegerich ◽

Anders Møller

Keyword(s):

Static Analysis ◽

Real World ◽

Language Design ◽

RNA Analysis ◽

Parser Generation ◽

Context Free ◽

Linguistic Characterization ◽

Context Free Grammars

It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures. We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this to conservatively approximate the problem based on local regular approximations and grammar unfoldings. As an application, we consider grammars that occur in RNA analysis in bioinformatics, and we demonstrate that our static analysis of context-free grammars is sufficiently precise and efficient to be practically useful.
