program synthesis
Recently Published Documents


TOTAL DOCUMENTS

431
(FIVE YEARS 100)

H-INDEX

27
(FIVE YEARS 3)

2022 ◽  
Author(s):  
Harsh Patel ◽  
Praveen Venkatesh ◽  
Shivam Sahni ◽  
Varun Jain ◽  
Mrinal Anand ◽  
...  
Keyword(s):  

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Yujie Zhao ◽  
Zhanyong Tang ◽  
Guixin Ye ◽  
Xiaoqing Gong ◽  
Dingyi Fang

Data obfuscation is usually used by malicious software to avoid detection and reverse analysis. When analyzing the malware, such obfuscations have to be removed to restore the program into an easier understandable form (deobfuscation). The deobfuscation based on program synthesis provides a good solution for treating the target program as a black box. Thus, deobfuscation becomes a problem of finding the shortest instruction sequence to synthesize a program with the same input-output behavior as the target program. Existing work has two limitations: assuming that obfuscated code snippets in the target program are known and using a stochastic search algorithm resulting in low efficiency. In this paper, we propose fine-grained obfuscation detection for locating obfuscated code snippets by machine learning. Besides, we also combine the program synthesis and a heuristic search algorithm of Nested Monte Carlo Search. We have applied a prototype implementation of our ideas to data obfuscation in different tools, including OLLVM and Tigress. Our experimental results suggest that this approach is highly effective in locating and deobfuscating the binaries with data obfuscation, with an accuracy of at least 90.34%. Compared with the state-of-the-art deobfuscation technique, our approach’s efficiency has increased by 75%, with the success rate increasing by 5%.


2021 ◽  
Author(s):  
Yuxin Wang ◽  
Zeyu Ding ◽  
Yingtai Xiao ◽  
Daniel Kifer ◽  
Danfeng Zhang

2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-29
Author(s):  
Kasra Ferdowsifard ◽  
Shraddha Barke ◽  
Hila Peleg ◽  
Sorin Lerner ◽  
Nadia Polikarpova

One vision for program synthesis, and specifically for programming by example (PBE), is an interactive programmer's assistant, integrated into the development environment. To make program synthesis practical for interactive use, prior work on Small-Step Live PBE has proposed to limit the scope of synthesis to small code snippets, and enable the users to provide local specifications for those snippets. This paradigm, however, does not work well in the presence of loops. We present LooPy, a synthesizer integrated into a live programming environment, which extends Small-Step Live PBE to work inside loops and scales it up to synthesize larger code snippets, while remaining fast enough for interactive use. To allow users to effectively provide examples at various loop iterations, even when the loop body is incomplete, LooPy makes use of live execution , a technique that leverages the programmer as an oracle to step over incomplete parts of the loop. To enable synthesis of loop bodies at interactive speeds, LooPy introduces Intermediate State Graph , a new data structure, which compactly represents a large space of code snippets composed of multiple assignment statements and conditionals. We evaluate LooPy empirically using benchmarks from competitive programming and previous synthesizers, and show that it can solve a wide variety of synthesis tasks at interactive speeds. We also perform a small qualitative user study which shows that LooPy's block-level specifications are easy for programmers to provide.


2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-27
Author(s):  
Xiang Gao ◽  
Arjun Radhakrishna ◽  
Gustavo Soares ◽  
Ridwan Shariffdeen ◽  
Sumit Gulwani ◽  
...  

Use of third-party libraries is extremely common in application software. The libraries evolve to accommodate new features or mitigate security vulnerabilities, thereby breaking the Application Programming Interface(API) used by the software. Such breaking changes in the libraries may discourage client code from using the new library versions thereby keeping the application vulnerable and not up-to-date. We propose a novel output-oriented program synthesis algorithm to automate API usage adaptations via program transformation. Our aim is not only to rely on the few example human adaptations of the clients from the old library version to the new library version, since this can lead to over-fitting transformation rules. Instead, we also rely on example usages of the new updated library in clients, which provide valuable context for synthesizing and applying the transformation rules. Our tool APIFix provides an automated mechanism to transform application code using the old library versions to code using the new library versions - thereby achieving automated API usage adaptation to fix the effect of breaking changes. Our evaluation shows that the transformation rules inferred by APIFix achieve 98.7% precision and 91.5% recall. By comparing our approach to state-of-the-art program synthesis approaches, we show that our approach significantly reduces over-fitting while synthesizing transformation rules for API usage adaptations.


2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-29
Author(s):  
Rohan Bavishi ◽  
Caroline Lemieux ◽  
Koushik Sen ◽  
Ion Stoica

While input-output examples are a natural form of specification for program synthesis engines, they can be imprecise for domains such as table transformations. In this paper, we investigate how extracting readily-available information about the user intent behind these input-output examples helps speed up synthesis and reduce overfitting. We present Gauss, a synthesis algorithm for table transformations that accepts partial input-output examples, along with user intent graphs. Gauss includes a novel conflict-resolution reasoning algorithm over graphs that enables it to learn from mistakes made during the search and use that knowledge to explore the space of programs even faster. It also ensures the final program is consistent with the user intent specification, reducing overfitting. We implement Gauss for the domain of table transformations (supporting Pandas and R), and compare it to three state-of-the-art synthesizers accepting only input-output examples. We find that it is able to reduce the search space by 56×, 73× and 664× on average, resulting in 7×, 26× and 7× speedups in synthesis times on average, respectively.


2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-29
Author(s):  
Kia Rahmani ◽  
Mohammad Raza ◽  
Sumit Gulwani ◽  
Vu Le ◽  
Daniel Morris ◽  
...  

Multi-modal program synthesis refers to the task of synthesizing programs (code) from their specification given in different forms, such as a combination of natural language and examples. Examples provide a precise but incomplete specification, and natural language provides an ambiguous but more "complete" task description. Machine-learned pre-trained models (PTMs) are adept at handling ambiguous natural language, but struggle with generating syntactically and semantically precise code. Program synthesis techniques can generate correct code, often even from incomplete but precise specifications, such as examples, but they are unable to work with the ambiguity of natural languages. We present an approach that combines PTMs with component-based synthesis (CBS): PTMs are used to generate candidates programs from the natural language description of the task, which are then used to guide the CBS procedure to find the program that matches the precise examples-based specification. We use our combination approach to instantiate multi-modal synthesis systems for two programming domains: the domain of regular expressions and the domain of CSS selectors. Our evaluation demonstrates the effectiveness of our domain-agnostic approach in comparison to a state-of-the-art specialized system, and the generality of our approach in providing multi-modal program synthesis from natural language and examples in different programming domains.


2021 ◽  
Author(s):  
Jarrett Holtz ◽  
Simon Andrews ◽  
Arjun Guha ◽  
Joydeep Biswas

Sign in / Sign up

Export Citation Format

Share Document