abstract syntax
Recently Published Documents


TOTAL DOCUMENTS: 330 (five years: 69)
H-INDEX: 26 (five years: 3)

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-29
Author(s):  
Marcelo Fiore ◽  
Dmitrij Szamozvancev

Despite extensive research on both the theoretical and practical fronts, formalising, reasoning about, and implementing languages with variable binding is still a daunting endeavour – repetitive boilerplate and the overly complicated metatheory of capture-avoiding substitution often get in the way of progressing to the actually interesting properties of a language. Existing developments offer some relief, though at the expense of inconvenient and error-prone term encodings and a lack of formal foundations. We present a mathematically inspired language-formalisation framework implemented in Agda. The system translates the description of a syntax signature with variable-binding operators into an intrinsically-encoded, inductive data type equipped with syntactic operations such as weakening and substitution, along with their correctness properties. The generated metatheory further incorporates metavariables and their associated operation of metasubstitution, which enables second-order equational/rewriting reasoning. The underlying mathematical foundation of the framework – initial algebra semantics – derives compositional interpretations of languages into their models satisfying the semantic substitution lemma by construction.
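For readers new to the area, a minimal Haskell sketch of the underlying idea – intrinsically well-scoped terms whose substitution is capture-avoiding by construction – may help; the paper's framework is implemented in Agda and is far more general, so everything below is illustrative only.

    {-# LANGUAGE DeriveFunctor #-}
    -- Illustrative only: scope is tracked in the type, so ill-scoped
    -- terms are unrepresentable and substitution cannot capture.
    data Term a
      = Var a                   -- free variable drawn from `a`
      | App (Term a) (Term a)   -- application
      | Lam (Term (Maybe a))    -- binder: Nothing is the bound variable
      deriving (Functor, Show)

    -- Weakening: reuse a term in a context with one extra variable.
    weaken :: Term a -> Term (Maybe a)
    weaken = fmap Just

    -- Simultaneous substitution; the environment is re-indexed under
    -- each Lam, which is exactly what rules out variable capture.
    subst :: (a -> Term b) -> Term a -> Term b
    subst env (Var x)   = env x
    subst env (App f e) = App (subst env f) (subst env e)
    subst env (Lam b)   = Lam (subst env' b)
      where env' Nothing  = Var Nothing      -- bound variable stays
            env' (Just x) = weaken (env x)   -- free variables shift

    -- Root beta-reduction, expressed with subst.
    beta :: Term a -> Term a
    beta (App (Lam b) e) = subst env b
      where env Nothing  = e
            env (Just x) = Var x
    beta t = t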


2021 ◽  
Vol 4 ◽  
pp. 72-77
Author(s):  
Volodymyr Protsenko

When creating a programming language, it is necessary to define its syntax and semantics. The task of the syntax is to describe all constructions that are elements of the language. For this purpose, a concrete syntax singles out the syntactically correct sequences of characters over the language's alphabet, most often as a finite set of rules that generates the infinite set of all language constructions, for example in Extended Backus-Naur Form (EBNF).

To describe the semantics of a language, preference is given to its abstract syntax, which for real programming languages is shorter and clearer than the concrete one. In a compiler, the parsing phase establishes the correspondence between abstract syntax objects and the concrete syntax of the program.

Denotational semantics is used to describe the semantics. First, the denotations of the simplest syntactic objects are fixed. Then each compound syntactic construction is associated with a semantic function that computes the construction's meaning, its denotation, from the denotations of its components. Since a program is itself a syntactic construction, its denotation can be determined by the appropriate semantic function. Note that the program itself is not executed when its denotation is computed.

The denotational description of a programming language therefore comprises the abstract syntax of its constructions, the denotations (the meanings of the constructions), and the semantic functions that map elements of the abstract syntax (language constructions) to their denotations (meanings).

The paper considers the use of the functional programming language Haskell as a metalanguage. The Haskell type system is a good tool for constructing abstract syntax, and its rich facilities for defining pure functions, which are typically the denotations of programming-language constructions, make Haskell well suited to describing denotational semantics.

The paper gives a formal specification of a simple imperative programming language with integer data, block structure, and the traditional set of operators: assignment, input, output, loop, and conditional. Haskell's ability to implement parsing effectively, which links the concrete syntax with the abstract one, allows the formal specification of the language to be extended into an implementation: a pure interpreter function. The work contains all the functions and data types that make up the interpreter of this simple imperative programming language.
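As a hedged sketch of the style the paper describes (hypothetical names, not the paper's own code; the fragment below covers only assignment, sequencing, conditional, and loop, omitting blocks and input/output):

    -- Illustrative denotational-semantics sketch in Haskell; names and
    -- the State representation are assumptions, not the paper's code.
    import qualified Data.Map as Map

    type Ident = String
    type State = Map.Map Ident Integer   -- denotation of memory

    -- Abstract syntax, built from ordinary Haskell data types.
    data Expr = Lit Integer | Var Ident | Add Expr Expr | Leq Expr Expr
    data Stmt = Assign Ident Expr | Seq Stmt Stmt
              | If Expr Stmt Stmt | While Expr Stmt | Skip

    -- Semantic function for expressions: Expr -> (State -> Integer).
    evalE :: Expr -> State -> Integer
    evalE (Lit n)   _ = n
    evalE (Var x)   s = Map.findWithDefault 0 x s
    evalE (Add a b) s = evalE a s + evalE b s
    evalE (Leq a b) s = if evalE a s <= evalE b s then 1 else 0

    -- Semantic function for statements: a pure state transformer.
    -- The program is never executed; only its meaning is computed.
    evalS :: Stmt -> State -> State
    evalS (Assign x e) s = Map.insert x (evalE e s) s
    evalS (Seq p q)    s = evalS q (evalS p s)
    evalS (If c p q)   s = if evalE c s /= 0 then evalS p s else evalS q s
    evalS w@(While c b) s
      | evalE c s /= 0   = evalS w (evalS b s)
      | otherwise        = s
    evalS Skip         s = s

    main :: IO ()
    main = print (evalS (Assign "x" (Add (Lit 2) (Lit 3))) Map.empty)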


2021 ◽  
Vol 5 (ICFP) ◽  
pp. 1-29
Author(s):  
Chaitanya Koparkar ◽  
Mike Rainey ◽  
Michael Vollmer ◽  
Milind Kulkarni ◽  
Ryan R. Newton

Recent work showed that compiling functional programs to use dense, serialized memory representations for recursive algebraic datatypes can yield significant constant-factor speedups for sequential programs. But serializing data in a maximally dense format consequently serializes the processing of that data, yielding a tension between density and parallelism. This paper shows that a disciplined, practical compromise is possible. We present Parallel Gibbon, a compiler that obtains the benefits of dense data formats and parallelism. We formalize the semantics of the parallel location calculus underpinning this novel implementation strategy, and show that it is type-safe. Parallel Gibbon exceeds the parallel performance of existing compilers for purely functional programs that use recursive algebraic datatypes, including, notably, abstract-syntax-tree traversals as in compilers.
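The tension is easy to see in a small, hedged Haskell sketch (not Gibbon's actual representation): in a dense preorder encoding with no child pointers, the right child of a node cannot even be located until the entire left child has been scanned.

    -- Illustrative only: a binary tree flattened to a dense tag stream.
    data Tree = Leaf Int | Node Tree Tree deriving Show
    data Tag  = TLeaf Int | TNode         deriving Show

    -- Dense serialization: one tag per constructor, no offsets.
    serialize :: Tree -> [Tag]
    serialize (Leaf n)   = [TLeaf n]
    serialize (Node l r) = TNode : serialize l ++ serialize r

    -- Decoding is forced to be sequential: the right subtree's start
    -- position is only known once the left subtree is fully parsed,
    -- which is the density-versus-parallelism tension the paper's
    -- parallel location calculus is designed to break.
    deserialize :: [Tag] -> (Tree, [Tag])
    deserialize (TLeaf n : rest) = (Leaf n, rest)
    deserialize (TNode   : rest) =
      let (l, rest')  = deserialize rest    -- must finish first ...
          (r, rest'') = deserialize rest'   -- ... before this can start
      in (Node l r, rest'')
    deserialize [] = error "truncated stream"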


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yingjie Xu ◽  
Gengran Hu ◽  
Lin You ◽  
Chengtang Cao

In recent years many vulnerabilities have been found in smart contracts. Hackers have exploited them to attack contracts deployed on blockchain systems such as Ethereum, causing substantial economic losses, so it is important to find the potential problems of smart contracts and to develop more secure ones. As blockchain security incidents have raised increasingly serious issues, more and more smart-contract security analysis methods have been developed. Most are based on traditional static or dynamic analysis; only a few use emerging technologies such as machine learning, and some machine-learning models for detecting smart-contract vulnerabilities spend considerable time on manual feature extraction. In this paper, we introduce a novel machine-learning-based analysis model for smart-contract vulnerabilities built on shared child nodes. We build the abstract syntax trees (ASTs) of smart contracts with known vulnerabilities from two data sets, SmartBugs and SolidiFI-benchmark, and the ASTs of the labeled smart contracts from the data set named Smartbugs-wilds. We then extract the shared child nodes of the ASTs to measure structural similarity and automatically construct a feature vector of similarity scores, on which we build our machine learning model. Finally, we obtain a KNN model that can predict eight types of vulnerabilities, including Re-entrancy, Arithmetic, Access Control, Denial of Service, Unchecked Low Level Calls, Bad Randomness, and Front Running. The accuracy, recall, and precision of our KNN model are all above 90%, and compared with other analysis tools, including Oyente and SmartCheck, our model achieves higher accuracy and requires less training time.
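A hedged Haskell sketch of the shared-subtree idea (a deliberate simplification; real Solidity ASTs and the paper's exact feature construction are far richer):

    -- Illustrative only: measure structural similarity of two toy ASTs
    -- by the subtrees they share; one such score per labelled
    -- vulnerability class would populate the KNN feature vector.
    import qualified Data.Set as Set

    data AST = ASTNode String [AST]

    -- Every subtree, rendered to a canonical string for comparison.
    subtrees :: AST -> Set.Set String
    subtrees t@(ASTNode _ kids) =
      Set.insert (render t) (Set.unions (map subtrees kids))
      where render (ASTNode l cs) = l ++ "(" ++ concatMap render cs ++ ")"

    -- Jaccard similarity over shared subtrees.
    similarity :: AST -> AST -> Double
    similarity a b =
      fromIntegral (Set.size (Set.intersection sa sb))
        / fromIntegral (Set.size (Set.union sa sb))
      where sa = subtrees a
            sb = subtrees b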


2021 ◽  
Vol 2021 (4) ◽  
pp. 464-479
Author(s):  
Edwin Dauber ◽  
Robert Erbacher ◽  
Gregory Shearer ◽  
Michael Weisman ◽  
Frederica Nelson ◽  
...  

Source code authorship attribution can be used for many types of intelligence on binaries and executables, including forensics, but introduces a threat to the privacy of anonymous programmers. Previous work has shown how to attribute individually authored code files and code segments. In this work, we examine authorship segmentation, in which we determine authorship of arbitrary parts of a program. While previous work has performed segmentation at the textual level, we attempt to attribute subtrees of the abstract syntax tree (AST). We focus on two primary problems: identifying the primary author of an arbitrary AST subtree, and identifying the AST edges on which primary authorship changes. We demonstrate that the former is a difficult problem but the latter is much easier, and we present methods by which the easier problem can be leveraged to improve accuracy on the harder one. While identifying the author of subtrees is difficult overall, this is primarily due to the abundance of small subtrees: in the validation set we can attribute subtrees of at least 25 nodes with accuracy over 80% and of at least 33 nodes with accuracy over 90%, while in the test set we can attribute subtrees of at least 33 nodes with an accuracy of 70%. While our baseline accuracy for single AST nodes is 20.21% on the validation set and 35.66% on the test set, we present techniques that increase this to 42.01% and 49.21%, respectively. We further present observations about collaborative code found on GitHub that may drive further research.
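As a hedged illustration of the size threshold at work (hypothetical helper names; the paper's classifiers and features are not shown), one might enumerate only the subtrees large enough to attribute reliably:

    -- Illustrative only: collect AST subtrees above a node-count
    -- threshold; a per-subtree attribution model would run on these,
    -- while smaller fragments would fall back on the easier
    -- authorship-change-edge analysis described above.
    data AST = ASTNode String [AST]

    size :: AST -> Int
    size (ASTNode _ kids) = 1 + sum (map size kids)

    attributable :: Int -> AST -> [AST]
    attributable n t@(ASTNode _ kids)
      | size t >= n = t : concatMap (attributable n) kids
      | otherwise   = []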

