abstract syntax tree Latest Research Papers

Recent work showed that compiling functional programs to use dense, serialized memory representations for recursive algebraic datatypes can yield significant constant-factor speedups for sequential programs. But serializing data in a maximally dense format consequently serializes the processing of that data, yielding a tension between density and parallelism. This paper shows that a disciplined, practical compromise is possible. We present Parallel Gibbon, a compiler that obtains the benefits of dense data formats and parallelism. We formalize the semantics of the parallel location calculus underpinning this novel implementation strategy, and show that it is type-safe. Parallel Gibbon exceeds the parallel performance of existing compilers for purely functional programs that use recursive algebraic datatypes, including, notably, abstract-syntax-tree traversals as in compilers.
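The tension between density and parallelism described in this abstract can be illustrated with a small sketch. The layout below (integer tags followed by payloads, written in preorder) is an assumption made for illustration and is not Gibbon's actual format; the names `serialize` and `sum_dense` are hypothetical.

```python
# Illustrative only: a binary tree packed into a flat preorder list.
# Tags and layout are assumptions for this sketch, not Gibbon's real format.
LEAF, NODE = 0, 1

def serialize(tree, out):
    """Append `tree` to `out` in preorder. A tree is an int leaf or a (left, right) tuple."""
    if isinstance(tree, tuple):
        out.append(NODE)
        serialize(tree[0], out)
        serialize(tree[1], out)
    else:
        out.append(LEAF)
        out.append(tree)
    return out

def sum_dense(buf, pos=0):
    """Sum the leaves of the tree starting at buf[pos]; return (total, next_pos)."""
    if buf[pos] == LEAF:
        return buf[pos + 1], pos + 2
    left_total, pos = sum_dense(buf, pos + 1)
    # The start of the right subtree is known only after the left one is read,
    # so the two halves cannot be handed to separate workers without extra data.
    right_total, pos = sum_dense(buf, pos)
    return left_total + right_total, pos

buf = serialize(((1, 2), (3, 4)), [])
total, end = sum_dense(buf)
```

In this sketch the traversal is forced to be sequential because no pointer or offset to the right child is stored; storing one would restore the option of parallel traversal at the cost of some density.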

A Novel Machine Learning-Based Analysis Model for Smart Contract Vulnerability

Security and Communication Networks ◽

10.1155/2021/5798033 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yingjie Xu ◽

Gengran Hu ◽

Lin You ◽

Chengtang Cao

Keyword(s):

Machine Learning ◽

Denial Of Service ◽

Structural Similarity ◽

Data Sets ◽

Smart Contracts ◽

Abstract Syntax ◽

Analysis Model ◽

Abstract Syntax Tree ◽

Syntax Tree ◽

Smart Contract

In recent years, many vulnerabilities have been found in smart contracts. Hackers have exploited these vulnerabilities to attack contracts deployed on blockchain systems such as Ethereum, causing substantial economic losses. It is therefore important to identify potential problems in smart contracts and to develop more secure ones. As blockchain security incidents have drawn more attention, more and more smart contract security analysis methods have been developed. Most of these methods are based on traditional static or dynamic analysis; only a few use emerging technologies such as machine learning. Some models that use machine learning to detect smart contract vulnerabilities spend considerable time extracting features manually. In this paper, we introduce a novel machine learning-based analysis model for smart contract vulnerabilities that uses shared child nodes. We build the abstract syntax tree (AST) for smart contracts with known vulnerabilities from two data sets, SmartBugs and SolidiFI-benchmark. Then, we build the AST of the labeled smart contracts from the data set named Smartbugs-wilds. Next, we extract the shared child nodes of the two ASTs to obtain their structural similarity, and we automatically construct a feature vector from the values that measure structural similarity to build our machine learning model. Finally, we obtain a KNN model that can predict eight types of vulnerabilities including Re-entrancy, Arithmetic, Access Control, Denial of Service, Unchecked Low Level Calls, Bad Randomness, Front Running, and Denial of Service. The accuracy, recall, and precision of our KNN model are all higher than 90%. In addition, our model has higher accuracy than other analysis tools, including Oyente and SmartCheck, and requires less training time.
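The pipeline outlined in this abstract (build ASTs, measure structural similarity through shared nodes, classify with KNN) can be sketched roughly as follows. This is a minimal sketch under several assumptions: Python's standard `ast` module stands in for a Solidity parser, "shared child nodes" is approximated by the overlap of node-type multisets, and the labels and helper names (`node_types`, `similarity`, `knn_predict`) are hypothetical rather than taken from the paper.

```python
import ast
from collections import Counter

def node_types(source):
    """Multiset of AST node type names in `source` (Python stands in for Solidity)."""
    return Counter(type(n).__name__ for n in ast.walk(ast.parse(source)))

def similarity(a, b):
    """Fraction of node types shared by both multisets (multiset Jaccard index)."""
    shared = sum((a & b).values())
    total = sum((a | b).values())
    return shared / total if total else 0.0

def knn_predict(sample, labeled, k=1):
    """Label `sample` by majority vote among its k most similar labeled snippets."""
    target = node_types(sample)
    ranked = sorted(
        labeled,
        key=lambda item: similarity(target, node_types(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: the labels are placeholders, not real vulnerability classes.
labeled = [
    ("for i in range(10):\n    total = i * 2\n", "loop"),
    ("if x > 0:\n    y = 1\nelse:\n    y = 2\n", "branch"),
]
prediction = knn_predict("for j in range(3):\n    s = j * 5\n", labeled)
```

Because the similarity is computed directly from the trees, no hand-written feature extraction is needed, which is the property the abstract emphasizes.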

Loop Transformations using Clang’s Abstract Syntax Tree

10.1145/3458744.3473359 ◽

2021 ◽

Author(s):

Michael Kruse

Keyword(s):

Abstract Syntax ◽

Abstract Syntax Tree ◽

Syntax Tree ◽

Loop Transformations

Supervised Authorship Segmentation of Open Source Code Projects

Proceedings on Privacy Enhancing Technologies ◽

10.2478/popets-2021-0080 ◽

2021 ◽

Vol 2021 (4) ◽

pp. 464-479

Author(s):

Edwin Dauber ◽

Robert Erbacher ◽

Gregory Shearer ◽

Michael Weisman ◽

Frederica Nelson ◽

...

Keyword(s):

Open Source ◽

Source Code ◽

Difficult Problem ◽

Authorship Attribution ◽

Abstract Syntax ◽

Test Set ◽

Abstract Syntax Tree ◽

Improve Accuracy ◽

Validation Set ◽

Primary Author

Source code authorship attribution can be used for many types of intelligence on binaries and executables, including forensics, but introduces a threat to the privacy of anonymous programmers. Previous work has shown how to attribute individually authored code files and code segments. In this work, we examine authorship segmentation, in which we determine authorship of arbitrary parts of a program. While previous work has performed segmentation at the textual level, we attempt to attribute subtrees of the abstract syntax tree (AST). We focus on two primary problems: identifying the primary author of an arbitrary AST subtree and identifying on which edges of the AST primary authorship changes. We demonstrate that the former is a difficult problem but the latter is much easier. We also demonstrate methods by which we can leverage the easier problem to improve accuracy for the harder problem. We show that while identifying the author of subtrees is difficult overall, this is primarily due to the abundance of small subtrees: in the validation set we can attribute subtrees of at least 25 nodes with accuracy over 80% and at least 33 nodes with accuracy over 90%, while in the test set we can attribute subtrees of at least 33 nodes with accuracy of 70%. While our baseline accuracy for single AST nodes is 20.21% for the validation set and 35.66% for the test set, we present techniques by which we can increase this accuracy to 42.01% and 49.21% respectively. We further present observations about collaborative code found on GitHub that may drive further research.

Multi-Granularity Code Smell Detection using Deep Learning Method based on Abstract Syntax Tree

10.18293/seke2021-014 ◽

2021 ◽

Author(s):

Weiwei Xu

Keyword(s):

Deep Learning ◽

Abstract Syntax ◽

Learning Method ◽

Abstract Syntax Tree ◽

Syntax Tree ◽

Code Smell

An Intelligent Code Search Approach Using Hybrid Encoders

Wireless Communications and Mobile Computing ◽

10.1155/2021/9990988 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Yao Meng

Keyword(s):

Deep Neural Networks ◽

Source Code ◽

Structural Features ◽

Abstract Syntax ◽

Abstract Syntax Tree ◽

Discriminative Models ◽

Learning Framework ◽

Code Search ◽

Parsing Algorithm ◽

Search Approach

Intelligent code search with natural language queries has become an important research area in software engineering. In this paper, we propose a novel deep learning framework, At-CodeSM, for source code search. The code encoder in At-CodeSM, which is implemented with an abstract syntax tree parsing algorithm (Tree-LSTM) and token-level encoders, preserves both the lexical and structural features of source code during code vectorization. Both the representative and discriminative models are implemented with deep neural networks. Our experiments on the CodeSearchNet dataset show that At-CodeSM yields better performance in the task of intelligent code search than previous approaches.

Enhancing the Software Clone Detection in BigCloneBench

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2021070102 ◽

2021 ◽

Vol 12 (3) ◽

pp. 17-31

Author(s):

Amandeep Kaur ◽

Munish Saini

Keyword(s):

Structural Information ◽

Source Code ◽

Maintenance Cost ◽

Clone Detection ◽

Abstract Syntax ◽

Code Clones ◽

Major Drawback ◽

Abstract Syntax Tree ◽

Detection Techniques ◽

Code Clone

In a software system, code snippets that are copied and pasted within the same software or into other software result in cloning. The basic cause of cloning is either a programmer's constraints or language constraints. The major drawback of code clones is an increase in the maintenance cost of software, so clone detection techniques are required to remove or refactor code clones. Recent studies show that the abstract syntax tree (AST) captures the structural information of source code appropriately. Many researchers have used tree-based convolution to identify clones, but this technique has certain drawbacks. Therefore, in this paper, the authors propose an approach that finds semantic clones through square-based convolution applied to the abstract syntax representation of source code. Experimental results show the effectiveness of the approach on the popular BigCloneBench benchmark.

Research on PowerShell Obfuscation Technology Based on Abstract Syntax Tree Transformation

10.1109/isctis51085.2021.00032 ◽

2021 ◽

Author(s):

XiaoMeng Xu ◽

ShuWen Liu ◽

Pu Yu ◽

YunTian Zhao

Keyword(s):

Abstract Syntax ◽

Abstract Syntax Tree ◽

Syntax Tree ◽

Tree Transformation

Framework for State-Aware Virtual Hardware Fuzzing

Wireless Communications and Mobile Computing ◽

10.1155/2021/6698311 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Hang Xu ◽

Ganyu Qin ◽

Junhu Zhu ◽

Zimian Liu ◽

Zhiqiang Liu

Keyword(s):

The State ◽

State Condition ◽

Prototype System ◽

Abstract Syntax ◽

Abstract Syntax Tree ◽

Code Coverage ◽

Syntax Tree ◽

Software Vulnerabilities

Coverage-based greybox fuzzing has strong capabilities in discovering virtualization software vulnerabilities. Efficiency is one of the most important indicators when evaluating greybox fuzzing. However, the interference of virtual hardware state conditions with testcase evaluation severely impairs the efficiency of greybox fuzzing. To reduce this interference and increase the efficiency of fuzzing, we propose a state-based virtual hardware fuzzing framework named SAVHF (State-Aware Virtual Hardware Fuzzing). In this framework, a source-to-source instrumentation method based on the abstract syntax tree is proposed to detect the state conditions of virtual hardware. Building on this instrumentation, we then propose a state-based fuzzing strategy that adapts to the state conditions of virtual hardware. We implement a prototype of SAVHF and use it to evaluate 17 popular virtual hardware devices of QEMU, finding 16 bugs with 1 CVE (Common Vulnerabilities and Exposures) number assigned. Evaluation results demonstrate that the proposed SAVHF framework covers an average of more than 61% of virtual hardware code branches in 18 hours of testing and improves average code coverage by 11.04% compared with the path-based fuzzing strategy.
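AST-based source-to-source instrumentation of the kind this abstract mentions can be sketched in a few lines. The sketch below is only an analogy: SAVHF instruments virtual hardware source code, whereas this example rewrites Python code with the standard `ast` module. The hook name `record`, the class `BranchInstrumenter`, and the sample function `step` are all hypothetical.

```python
import ast

class BranchInstrumenter(ast.NodeTransformer):
    """Insert a call to a hypothetical `record(lineno)` hook before every `if`."""

    def visit_If(self, node):
        self.generic_visit(node)  # instrument nested `if` statements first
        hook = ast.Expr(
            value=ast.Call(
                func=ast.Name(id="record", ctx=ast.Load()),
                args=[ast.Constant(value=node.lineno)],
                keywords=[],
            )
        )
        return [hook, node]  # the hook statement runs just before the branch

SOURCE = """
def step(state):
    if state == "ready":
        return "running"
    if state == "running":
        return "done"
    return state
"""

tree = BranchInstrumenter().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)  # give the inserted nodes line/column info

hits = []
namespace = {"record": hits.append}
exec(compile(tree, "<instrumented>", "exec"), namespace)
result = namespace["step"]("running")
```

After the call, `hits` holds the line numbers of the state checks that were reached, which is the sort of signal a state-aware fuzzing strategy could use to guide testcase selection.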

abstract syntax tree
Recently Published Documents

Abstract Syntax Tree (AST) and Control Flow Graph (CFG) Construction of Notasi Algoritmik

Efficient tree-traversals: reconciling parallelism and dense data representations

A Novel Machine Learning-Based Analysis Model for Smart Contract Vulnerability

Loop Transformations using Clang’s Abstract Syntax Tree

Supervised Authorship Segmentation of Open Source Code Projects

Multi-Granularity Code Smell Detection using Deep Learning Method based on Abstract Syntax Tree

An Intelligent Code Search Approach Using Hybrid Encoders

Enhancing the Software Clone Detection in BigCloneBench

Research on PowerShell Obfuscation Technology Based on Abstract Syntax Tree Transformation

Framework for State-Aware Virtual Hardware Fuzzing
