Gauss: program synthesis by reasoning over graphs

2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-29
Author(s):  
Rohan Bavishi ◽  
Caroline Lemieux ◽  
Koushik Sen ◽  
Ion Stoica

While input-output examples are a natural form of specification for program synthesis engines, they can be imprecise for domains such as table transformations. In this paper, we investigate how extracting readily-available information about the user intent behind these input-output examples helps speed up synthesis and reduce overfitting. We present Gauss, a synthesis algorithm for table transformations that accepts partial input-output examples, along with user intent graphs. Gauss includes a novel conflict-resolution reasoning algorithm over graphs that enables it to learn from mistakes made during the search and use that knowledge to explore the space of programs even faster. It also ensures the final program is consistent with the user intent specification, reducing overfitting. We implement Gauss for the domain of table transformations (supporting Pandas and R), and compare it to three state-of-the-art synthesizers accepting only input-output examples. We find that it is able to reduce the search space by 56×, 73× and 664× on average, resulting in 7×, 26× and 7× speedups in synthesis times on average, respectively.

Author(s):  
Christopher D. Rosin

Inductive program synthesis, from input/output examples, can provide an opportunity to automatically create programs from scratch without presupposing the algorithmic form of the solution. For induction of general programs with loops (as opposed to loop-free programs, or synthesis for domain-specific languages), the state of the art is at the level of introductory programming assignments. Most problems that require algorithmic subtlety, such as fast sorting, have remained out of reach without the benefit of significant problem-specific background knowledge. A key challenge is to identify cues that are available to guide search towards correct looping programs. We present MAKESPEARE, a simple delayed-acceptance hillclimbing method that synthesizes low-level looping programs from input/output examples. During search, delayed acceptance bypasses small gains to identify significantly-improved stepping stone programs that tend to generalize and enable further progress. The method performs well on a set of established benchmarks, and succeeds on the previously unsolved “Collatz Numbers” program synthesis problem. Additional benchmarks include the problem of rapidly sorting integer arrays, in which we observe the emergence of comb sort (a Shell sort variant that is empirically fast). MAKESPEARE has also synthesized a record-setting program on one of the puzzles from the TIS100 assembly language programming game.
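For readers unfamiliar with comb sort, the algorithm the abstract reports emerging during search, the following is a minimal textbook-style sketch in Python. It is not the program MAKESPEARE synthesized; the function name `comb_sort` and the shrink factor of 1.3 are illustrative assumptions.

```python
def comb_sort(values, shrink=1.3):
    """Return a sorted copy of `values` using comb sort.

    Elements a fixed gap apart are compared and swapped; the gap shrinks
    by `shrink` (an assumed, commonly used factor) on every pass until it
    reaches 1, at which point the passes behave like bubble sort.
    """
    items = list(values)
    gap = len(items)
    swapped = True
    # Keep going until a full pass with gap 1 makes no swaps.
    while gap > 1 or swapped:
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(items) - gap):
            if items[i] > items[i + gap]:
                items[i], items[i + gap] = items[i + gap], items[i]
                swapped = True
    return items
```

Calling `comb_sort([5, 1, 4, 2, 3])` returns a new sorted list and leaves the input untouched.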


Author(s):  
Marlene Goncalves ◽  
María Esther Vidal

Criteria that induce a Skyline naturally represent a user’s preference conditions, which are useful for discarding irrelevant data in large datasets. However, in the presence of high-dimensional Skyline spaces, the size of the Skyline can still be very large. To identify the best k points among the Skyline, the Top-k Skyline approach has been proposed. This chapter describes existing solutions and proposes the TKSI algorithm for the Top-k Skyline problem. TKSI reduces the search space by computing only the subset of the Skyline that is required to produce the top-k objects. In addition, the Skyline Frequency Metric is implemented to single out the Skyline objects that best meet the multidimensional criteria. The chapter’s authors have empirically studied the quality of TKSI, and their experimental results show that TKSI may be able to speed up the computation of the Top-k Skyline by at least 50% relative to state-of-the-art solutions.
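The Top-k Skyline problem can be illustrated with a naive Python sketch. This is not TKSI, which avoids computing the full Skyline; it is a brute-force baseline that assumes smaller values are preferred in every dimension and that skyline frequency means the number of non-empty dimension subsets in which a point is not dominated. All function names are hypothetical.

```python
from itertools import combinations


def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def project(point, dims):
    """Restrict a point to the given dimension indices."""
    return tuple(point[i] for i in dims)


def skyline_frequency(point, points):
    """Count the dimension subsets in which `point` is undominated (assumed definition)."""
    count = 0
    for size in range(1, len(point) + 1):
        for dims in combinations(range(len(point)), size):
            target = project(point, dims)
            if not any(dominates(project(q, dims), target) for q in points):
                count += 1
    return count


def top_k_skyline(points, k):
    """Rank skyline points by skyline frequency, highest first, and keep k."""
    candidates = skyline(points)
    ranked = sorted(candidates, key=lambda p: -skyline_frequency(p, points))
    return ranked[:k]
```

The brute-force version enumerates every subspace, so it is only practical for a handful of dimensions; it is meant to make the problem statement concrete rather than to be efficient.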


Author(s):  
Jack Hughes ◽  
Dominic Orchard

Linear types provide a way to constrain programs by specifying that some values must be used exactly once. Recent work on graded modal types augments and refines this notion, enabling fine-grained, quantitative specification of data use in programs. The information provided by graded modal types appears to be useful for type-directed program synthesis, where these additional constraints can be used to prune the search space of candidate programs. We explore one of the major implementation challenges of a synthesis algorithm in this setting: how does the synthesis algorithm efficiently ensure that resource constraints are satisfied throughout program generation? We provide two solutions to this resource management problem, adapting Hodas and Miller’s input-output model of linear context management to a graded modal linear type theory. We evaluate the performance of both approaches via their implementation as a program synthesis tool for the programming language Granule, which provides linear and graded modal typing.


Author(s):  
Jianwei Zhang ◽  
Dong Li ◽  
Lituan Wang ◽  
Lei Zhang

Neural Architecture Search (NAS), which aims to design neural architectures automatically, has recently drawn growing research interest. Unlike conventional NAS methods, in which a large number of neural architectures must be trained for evaluation, one-shot NAS methods only have to train one supernet that synthesizes all possible candidate architectures. As a result, search efficiency can be significantly improved by sharing the supernet’s weights during the evaluation of candidate architectures. This strategy can greatly speed up the search process, but it suffers from the problem that evaluation based on shared weights is not predictive enough. Recently, pruning the supernet during the search has been shown to be an efficient way to alleviate this problem. However, the pruning direction in complex-structured search spaces remains unexplored. In this paper, we revisit the role of the path dropout strategy, which drops neural operations instead of neurons, in supernet training, and find several interesting characteristics of supernets trained with dropout. Based on these observations, we propose a Hierarchically-Ordered Pruning Neural Architecture Search (HOPNAS) algorithm that dynamically prunes the supernet with a proper pruning direction. Experimental results indicate that our method is competitive with state-of-the-art approaches on CIFAR10 and ImageNet.


2018 ◽  
Vol 2018 ◽  
pp. 1-19 ◽  
Author(s):  
Gonçalo P. Amador ◽  
Abel J. P. Gomes

We propose a new pathfinding technique called xTrek that combines conventional pathfinding and influence fields; that is, we introduce a new influence-sensitive, or influence-aware, pathfinder. The leading idea of influence-aware pathfinding is to avoid unwanted regions and/or converge to desired regions of the search space during the path search. As shown throughout the paper, this region avoidance/convergence is more striking with our technique than with other field-aware pathfinders such as risk-averse pathfinders and constraint-aware navigation pathfinders. Furthermore, our technique constrains the search space even more than such state-of-the-art influence-aware pathfinders, aiming to reduce memory consumption, to speed up pathfinding computations, and at the same time to give better control over the paths to be discovered.


2021 ◽  
Vol 297 ◽  
pp. 126645
Author(s):  
Gajanan Sampatrao Ghodake ◽  
Surendra Krushna Shinde ◽  
Avinash Ashok Kadam ◽  
Rijuta Ganesh Saratale ◽  
Ganesh Dattatraya Saratale ◽  
...  

2012 ◽  
Vol 49 (2) ◽  
pp. 285-327 ◽  
Author(s):  
RUI P. CHAVES

Subject phrases impose particularly strong constraints on extraction. Most research assumes a syntactic account (e.g. Kayne 1983, Chomsky 1986, Rizzi 1990, Lasnik & Saito 1992, Takahashi 1994, Uriagereka 1999), but there are also pragmatic accounts (Erteschik-Shir & Lappin 1979; Van Valin 1986, 1995; Erteschik-Shir 2006, 2007) as well as performance-based approaches (Kluender 2004). In this work I argue that none of these accounts captures the full range of empirical facts, and show that subject and adjunct phrases (phrasal or clausal, finite or otherwise) are by no means impermeable to non-parasitic extraction of nominal, prepositional and adverbial phrases. The present empirical reassessment indicates that the phenomena involving subject and adjunct islands defy the formulation of a general grammatical account. Drawing from insights by Engdahl (1983) and Kluender (2004), I argue that subject island effects have a functional explanation. Independently motivated pragmatic and processing limitations cause subject-internal gaps to be heavily dispreferred and therefore extremely infrequent. In turn, this has led to heuristic parsing expectations that preempt subject-internal gaps and therefore speed up processing by pruning the search space of filler–gap dependencies. Such expectations cause processing problems when violated, unless they are dampened by prosodic and pragmatic cues that boost the construction of the correct parse. This account predicts subject islands and their (non-)parasitic exceptions.


2020 ◽  
Vol 14 (4) ◽  
pp. 653-667
Author(s):  
Laxman Dhulipala ◽  
Changwan Hong ◽  
Julian Shun

Connected components is a fundamental kernel in graph applications. The fastest existing multicore algorithms for solving graph connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as various tree linking and compression schemes. ConnectIt enables us to obtain several hundred new variants of connectivity algorithms, most of which extend to computing spanning forest. In addition to static graphs, we also extend ConnectIt to support mixes of insertions and connectivity queries in the concurrent setting. We present an experimental evaluation of ConnectIt on a 72-core machine, which we believe is the most comprehensive evaluation of parallel connectivity algorithms to date. Compared to a collection of state-of-the-art static multicore algorithms, we obtain an average speedup of 12.4x (2.36x average speedup over the fastest existing implementation for each graph). Using ConnectIt, we are able to compute connectivity on the largest publicly-available graph (with over 3.5 billion vertices and 128 billion edges) in under 10 seconds using a 72-core machine, providing a 3.1x speedup over the fastest existing connectivity result for this graph, in any computational setting. For our incremental algorithms, we show that our algorithms can ingest graph updates at up to several billion edges per second. To guide the user in selecting the best variants in ConnectIt for different situations, we provide a detailed analysis of the different strategies. Finally, we show how the techniques in ConnectIt can be used to speed up two important graph applications: approximate minimum spanning forest and SCAN clustering.
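As background for the tree linking and compression schemes mentioned above, here is a minimal sequential union-find sketch in Python. It is not ConnectIt, which is a parallel framework with sampling; the function name, the choice of path halving for compression, and linking by smaller vertex ID are all illustrative assumptions.

```python
def connected_components(num_vertices, edges):
    """Label each vertex with the smallest vertex ID in its component.

    A sequential union-find with path halving; one of many possible
    link/compress combinations, chosen here only for brevity.
    """
    parent = list(range(num_vertices))

    def find(v):
        # Path halving: point every visited vertex at its grandparent.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # Link the larger root under the smaller one so that the
            # final root is the minimum ID in the component.
            if root_u < root_v:
                parent[root_v] = root_u
            else:
                parent[root_u] = root_v

    return [find(v) for v in range(num_vertices)]
```

Two vertices are in the same component exactly when they receive the same label, so a connectivity query reduces to comparing two entries of the returned list.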


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6523
Author(s):  
Pieter Van Molle ◽  
Cedric De Boom ◽  
Tim Verbelen ◽  
Bert Vankeirsbilck ◽  
Jonas De Vylder ◽  
...  

Deep neural networks have achieved state-of-the-art performance in image classification. Due to this success, deep learning is now also being applied to other data modalities such as multispectral images, lidar and radar data. However, successfully training a deep neural network requires a large dataset. Therefore, transitioning to a new sensor modality (e.g., from regular camera images to multispectral camera images) might result in a drop in performance, due to the limited availability of data in the new modality. This might hinder the adoption rate and time to market for new sensor technologies. In this paper, we present an approach that leverages the knowledge of a teacher network, trained on the original data modality, to improve the performance of a student network on a new data modality: a technique known in the literature as knowledge distillation. By applying knowledge distillation to the problem of sensor transition, we can greatly speed up this process. We validate this approach using a multimodal version of the MNIST dataset. Especially when little data is available in the new modality (i.e., 10 images), training with additional teacher supervision results in increased performance, with the student network scoring a test set accuracy of 0.77, compared to an accuracy of 0.37 for the baseline. We also explore two extensions to the default method of knowledge distillation, which we evaluate on a multimodal version of the CIFAR-10 dataset: an annealing scheme for the hyperparameter α and selective knowledge distillation. Of these two, the first yields the best results. Choosing the optimal annealing scheme results in an increase in test set accuracy of 6%. Finally, we apply our method to the real-world use case of skin lesion classification.
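The general shape of a knowledge distillation loss with a weighting hyperparameter α can be sketched in plain Python. This follows one common formulation (hard-label cross-entropy weighted by α, temperature-softened teacher/student divergence weighted by 1 − α); the abstract does not specify the exact loss or annealing scheme, so the weighting convention, the temperature, and the linear schedule `linear_alpha` are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, alpha, temperature=2.0):
    """Weighted sum of hard-label cross-entropy and a soft teacher term.

    Assumed convention: alpha weights the hard-label term, and
    (1 - alpha) weights the KL divergence between softened outputs.
    """
    hard = -math.log(softmax(student_logits)[label])
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    soft = sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )
    # The T^2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


def linear_alpha(epoch, num_epochs, start=0.1, end=0.9):
    """Hypothetical annealing scheme: move alpha linearly from start to end."""
    fraction = epoch / (num_epochs - 1) if num_epochs > 1 else 1.0
    return start + (end - start) * fraction
```

In a training loop, `linear_alpha(epoch, num_epochs)` would be passed as `alpha` so that the student leans on the teacher early and on the ground-truth labels later; the opposite direction is an equally valid schedule to try.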


Author(s):  
Yu Zeng ◽  
Yan Gao ◽  
Jiaqi Guo ◽  
Bei Chen ◽  
Qian Liu ◽  
...  

Neural semantic parsers usually fail to parse long and complicated utterances into nested SQL queries, due to the large search space. In this paper, we propose a novel recursive semantic parsing framework called RECPARSER to generate the nested SQL query layer by layer. It decomposes the complicated nested SQL query generation problem into several progressive non-nested SQL query generation problems. Furthermore, we propose a novel Question Decomposer module to explicitly encourage RECPARSER to focus on different components of an utterance when predicting SQL queries of different layers. Experiments on the Spider dataset show that our approach is more effective than previous work at predicting nested SQL queries. In addition, we achieve an overall accuracy that is comparable with state-of-the-art approaches.

