Data-driven abductive inference of library specifications

2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-29
Author(s):  
Zhe Zhou ◽  
Robert Dickerson ◽  
Benjamin Delaware ◽  
Suresh Jagannathan

Programmers often leverage data structure libraries that provide useful and reusable abstractions. Modular verification of programs that make use of these libraries naturally relies on specifications that capture important properties about how the library expects these data structures to be accessed and manipulated. However, these specifications are often missing or incomplete, making it hard for clients to be confident they are using the library safely. When library source code is also unavailable, as is often the case, the challenge of inferring meaningful specifications is further exacerbated. In this paper, we present a novel data-driven abductive inference mechanism that infers specifications for library methods sufficient to enable verification of the library's clients. Our technique combines a data-driven, learning-based framework that postulates candidate specifications with SMT-provided counterexamples that refine these candidates, taking special care to prevent generating specifications that overfit to sampled tests. The resulting specifications form a minimal set of requirements on the behavior of library implementations that ensures safety of a particular client program. Our solution thus provides a new multi-abduction procedure for precise specification inference of data structure libraries guided by client-side verification tasks. Experimental results on a wide range of realistic OCaml data structure programs demonstrate the effectiveness of the approach.
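The refinement loop the abstract describes can be sketched at a very small scale. The candidate grammar, toy client, and one-state "verifier" below are hypothetical stand-ins for illustration only, not the paper's actual algorithm (which learns specifications over OCaml data structures):

```python
# A minimal sketch of counterexample-guided specification inference:
# pick the weakest candidate consistent with observed client states,
# and feed verification counterexamples back as refinements.

def infer_precondition(candidates, verify, good_states):
    """candidates: predicates over states, ordered weakest-first.
    good_states: states the client actually reaches; the spec must admit
    them, which is what guards against overfitting to rejections alone.
    verify(p): None if the client is safe assuming p, otherwise a
    counterexample state that p wrongly admits."""
    bad_states = []
    while True:
        for p in candidates:
            if all(p(s) for s in good_states) and not any(p(s) for s in bad_states):
                cex = verify(p)
                if cex is None:
                    return p              # weakest safe candidate found
                bad_states.append(cex)    # refine and restart the search
                break
        else:
            return None  # no candidate is both safe and permissive enough

# Toy client: popping a stack of size n is safe only when n >= 1.
candidates = [lambda n: True, lambda n: n >= 1, lambda n: n >= 2]
verify = lambda p: 0 if p(0) else None   # 0 (empty stack) is the unsafe state
spec = infer_precondition(candidates, verify, good_states=[1, 3])
```

Here the loop first tries the trivial spec `True`, receives the empty stack as a counterexample, and settles on `n >= 1`: the weakest requirement that still verifies the client.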

2021 ◽  
pp. 204141962199349
Author(s):  
Jordan J Pannell ◽  
George Panoutsos ◽  
Sam B Cooke ◽  
Dan J Pope ◽  
Sam E Rigby

Accurate quantification of the blast load arising from detonation of a high explosive has applications in transport security, infrastructure assessment and defence. In order to design efficient and safe protective systems in such aggressive environments, it is of critical importance to understand the magnitude and distribution of loading on a structural component located close to an explosive charge. In particular, peak specific impulse is the primary parameter that governs structural deformation under short-duration loading. Within this so-called extreme near-field region, existing semi-empirical methods are known to be inaccurate, and high-fidelity numerical schemes are generally hampered by a lack of available experimental validation data. As such, the blast protection community is not currently equipped with a satisfactory fast-running tool for load prediction in the near-field. In this article, a validated computational model is used to develop a suite of numerical near-field blast load distributions, which are shown to follow a similar normalised shape. This forms the basis of the data-driven predictive model developed herein: a Gaussian function is fit to the normalised loading distributions, and a power law is used to calculate the magnitude of the curve according to established scaling laws. The predictive method is rigorously assessed against the existing numerical dataset, and is validated against new test models and available experimental data. High levels of agreement are demonstrated throughout, with typical variations of <5% between experiment/model and prediction. The new approach presented in this article allows the analyst to rapidly compute the distribution of specific impulse across the loaded face of a wide range of target sizes and near-field scaled distances and provides a benchmark for data-driven modelling approaches to capture blast loading phenomena in more complex scenarios.
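The two-part structure of the predictive model (a Gaussian for the normalised shape, a power law in scaled distance for the magnitude) can be written down schematically. The constants below are placeholders, not the fitted values from the article:

```python
import math

def peak_specific_impulse(z, k=1.0, alpha=1.5):
    """Power-law magnitude at the charge-normal point, i0 = k * z**(-alpha),
    with z the scaled distance. k and alpha are illustrative constants."""
    return k * z ** (-alpha)

def impulse_distribution(x, z, sigma=0.5):
    """Specific impulse at normalised offset x from the charge-normal
    point: a Gaussian shape scaled by the power-law magnitude."""
    return peak_specific_impulse(z) * math.exp(-x ** 2 / (2 * sigma ** 2))
```

The shape function is symmetric and peaks directly beneath the charge, while the magnitude decays monotonically with scaled distance, mirroring the behaviour the article reports for its fitted model.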


2021 ◽  
Author(s):  
Danila Piatov ◽  
Sven Helmer ◽  
Anton Dignös ◽  
Fabio Persia

Abstract We develop a family of efficient plane-sweeping interval join algorithms for evaluating a wide range of interval predicates such as Allen’s relationships and parameterized relationships. Our technique is based on a framework, components of which can be flexibly combined in different manners to support the required interval relation. In temporal databases, our algorithms can exploit a well-known and flexible access method, the Timeline Index, thus expanding the set of operations it supports even further. Additionally, employing a compact data structure, the gapless hash map, we utilize the CPU cache efficiently. In an experimental evaluation, we show that our approach is several times faster and scales better than state-of-the-art techniques, while being much better suited for real-time event processing.
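The plane-sweep idea can be illustrated for the simplest interval predicate, plain overlap. This is the textbook endpoint sweep, not the paper's Timeline-Index or gapless-hash-map implementation:

```python
def overlap_join(r, s):
    """Return pairs (i, j) such that intervals r[i] and s[j] overlap.
    Intervals are half-open (start, end) tuples. Sweep over sorted
    endpoint events, maintaining the set of currently active intervals
    on each side; at equal positions, end events (flag 0) sort before
    start events (flag 1), giving half-open semantics."""
    events = []
    for i, (lo, hi) in enumerate(r):
        events.append((lo, 1, 'R', i))
        events.append((hi, 0, 'R', i))
    for j, (lo, hi) in enumerate(s):
        events.append((lo, 1, 'S', j))
        events.append((hi, 0, 'S', j))
    events.sort()
    active_r, active_s, result = set(), set(), []
    for _, is_start, side, idx in events:
        if not is_start:
            (active_r if side == 'R' else active_s).discard(idx)
        elif side == 'R':
            active_r.add(idx)
            result.extend((idx, j) for j in active_s)
        else:
            active_s.add(idx)
            result.extend((i, idx) for i in active_r)
    return result
```

Each interval is touched twice, so after sorting the sweep itself is linear in the input plus output size; the paper's framework generalises this event-processing skeleton to Allen's relations and parameterized predicates.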


2018 ◽  
Vol 18 (3-4) ◽  
pp. 470-483 ◽  
Author(s):  
GREGORY J. DUCK ◽  
JOXAN JAFFAR ◽  
ROLAND H. C. YAP

Abstract Malformed data-structures can lead to runtime errors such as arbitrary memory access or corruption. Despite this, reasoning over data-structure properties for low-level heap manipulating programs remains challenging. In this paper we present a constraint-based program analysis that checks data-structure integrity, w.r.t. given target data-structure properties, as the heap is manipulated by the program. Our approach is to automatically generate a solver for properties using the type definitions from the target program. The generated solver is implemented using a Constraint Handling Rules (CHR) extension of built-in heap, integer and equality solvers. A key property of our program analysis is that the target data-structure properties are shape neutral, i.e., the analysis does not check for properties relating to a given data-structure graph shape, such as doubly-linked-lists versus trees. Nevertheless, the analysis can detect errors in a wide range of data-structure manipulating programs, including those that use lists, trees, DAGs, graphs, etc. We present an implementation that uses the Satisfiability Modulo Constraint Handling Rules (SMCHR) system. Experimental results show that our approach works well for real-world C programs.


2021 ◽  
Vol 143 (3) ◽  
Author(s):  
Suhui Li ◽  
Huaxin Zhu ◽  
Min Zhu ◽  
Gang Zhao ◽  
Xiaofeng Wei

Abstract Conventional physics-based or experimental-based approaches for gas turbine combustion tuning are time consuming and cost intensive. Recent advances in data analytics provide an alternative method. In this paper, we present a cross-disciplinary study on the combustion tuning of an F-class gas turbine that combines machine learning with physics understanding. An artificial-neural-network-based (ANN) model is developed to predict the combustion performance (outputs), including NOx emissions, combustion dynamics, combustor vibrational acceleration, and turbine exhaust temperature. The inputs of the ANN model are identified by analyzing the key operating variables that impact the combustion performance, such as the pilot and the premixed fuel flow, and the inlet guide vane angle. The ANN model is trained by field data from an F-class gas turbine power plant. The trained model is able to describe the combustion performance with acceptable accuracy over a wide range of operating conditions. In combination with a genetic algorithm, the model is applied to optimize the combustion performance of the gas turbine. Results demonstrate that the data-driven method offers a promising alternative for combustion tuning at a low cost and fast turn-around.
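The surrogate-plus-optimizer pattern the abstract describes can be sketched with a placeholder objective standing in for the trained ANN. The quadratic surrogate, variable ranges, and GA constants below are invented for illustration, not the paper's model or data:

```python
import random

def surrogate_nox(pilot_ratio, igv_angle):
    """Placeholder surrogate standing in for the trained ANN: a smooth
    function with a minimum at pilot_ratio = 0.3, igv_angle = 45."""
    return (pilot_ratio - 0.3) ** 2 + 0.001 * (igv_angle - 45.0) ** 2

def ga_minimize(f, bounds, pop_size=30, generations=60, seed=0):
    """Minimal elitist genetic algorithm: keep the best half, refill
    with averaged, mutated children clipped to the variable bounds."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: f(*ind))
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [(x + y) / 2 + rng.gauss(0, 0.01 * (hi - lo))
                     for x, y, (lo, hi) in zip(a, b, bounds)]
            children.append([min(hi, max(lo, v))
                             for v, (lo, hi) in zip(child, bounds)])
        population = elite + children
    return min(population, key=lambda ind: f(*ind))

best = ga_minimize(surrogate_nox, [(0.0, 1.0), (20.0, 70.0)])
```

In the paper's setting the objective would be the trained ANN evaluated on candidate operating points, with emissions and dynamics limits folded in as constraints or penalties.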


2021 ◽  
Author(s):  
Elton Figueiredo de Souza Soares ◽  
Renan Souza ◽  
Raphael Melo Thiago ◽  
Marcelo de Oliveira Costa Machado ◽  
Leonardo Guerreiro Azevedo

In our data-driven society, there are hundreds of possible data systems on the market, each with a wide range of configuration parameters, making it very hard for enterprises and users to choose the most suitable data system. There is a lack of representative empirical evidence to help users make an informed decision. Using benchmark results is a widely adopted practice, but just as there are many data systems, there are many benchmarks. This ongoing work presents the architecture and methods of a system that supports the recommendation of the most suitable data system for an application. We also illustrate how the recommendation would work in a fictitious scenario.


2021 ◽  
Vol 17 (2) ◽  
pp. e1008635
Author(s):  
Gerrit Ansmann ◽  
Tobias Bollenbach

Many ecological studies employ general models that can feature an arbitrary number of populations. A critical requirement imposed on such models is clone consistency: If the individuals from two populations are indistinguishable, joining these populations into one shall not affect the outcome of the model. Otherwise a model produces different outcomes for the same scenario. Using functional analysis, we comprehensively characterize all clone-consistent models: We prove that they are necessarily composed from basic building blocks, namely linear combinations of parameters and abundances. These strong constraints enable a straightforward validation of model consistency. Although clone consistency can always be achieved with sufficient assumptions, we argue that it is important to explicitly name and consider the assumptions made: They may not be justified or limit the applicability of models and the generality of the results obtained with them. Moreover, our insights facilitate building new clone-consistent models, which we illustrate for a data-driven model of microbial communities. Finally, our insights point to new relevant forms of general models for theoretical ecology. Our framework thus provides a systematic way of comprehending ecological models, which can guide a wide range of studies.
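Clone consistency can be checked concretely for the generalised Lotka-Volterra model, which is built from exactly the linear-in-parameters-and-abundances blocks the characterisation above allows. The parameter values are arbitrary examples:

```python
def glv_rhs(x, r, A):
    """Generalised Lotka-Volterra right-hand side:
    dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    return [xi * (ri + sum(a * xj for a, xj in zip(row, x)))
            for xi, ri, row in zip(x, r, A)]

# One population of abundance 2 versus two indistinguishable clones of
# abundance 1 each (same growth rate, duplicated interaction row and
# column). Clone consistency demands the same total rate of change.
merged = glv_rhs([2.0], [0.5], [[-1.0]])
split = glv_rhs([1.0, 1.0], [0.5, 0.5], [[-1.0, -1.0], [-1.0, -1.0]])
```

A model that, say, coupled populations through a nonlinear function of individual abundances rather than their linear combination would generally fail this check, which is the kind of hidden assumption the authors argue should be made explicit.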


Author(s):  
Patrick Gelß ◽  
Stefan Klus ◽  
Jens Eisert ◽  
Christof Schütte

A key task in the field of modeling and analyzing nonlinear dynamical systems is the recovery of unknown governing equations from measurement data only. There is a wide range of application areas for this important instance of system identification, ranging from industrial engineering and acoustic signal processing to stock market models. In order to find appropriate representations of underlying dynamical systems, various data-driven methods have been proposed by different communities. However, if the given data sets are high-dimensional, then these methods typically suffer from the curse of dimensionality. To significantly reduce the computational costs and storage consumption, we propose the method multidimensional approximation of nonlinear dynamical systems (MANDy) which combines data-driven methods with tensor network decompositions. The efficiency of the introduced approach will be illustrated with the aid of several high-dimensional nonlinear dynamical systems.
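The underlying regression task can be shown in its simplest scalar form: express measured derivatives in a library of candidate basis functions and solve for the coefficients by least squares. This stand-in omits the tensor-network decompositions that make MANDy scale to high dimensions, and the dynamics and basis below are invented examples:

```python
def fit_two_term(xs, dxs, f1, f2):
    """Least-squares fit dx ~ a*f1(x) + b*f2(x) by solving the 2x2
    normal equations directly."""
    s11 = sum(f1(x) ** 2 for x in xs)
    s12 = sum(f1(x) * f2(x) for x in xs)
    s22 = sum(f2(x) ** 2 for x in xs)
    b1 = sum(f1(x) * d for x, d in zip(xs, dxs))
    b2 = sum(f2(x) * d for x, d in zip(xs, dxs))
    det = s11 * s22 - s12 * s12
    return (s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det

# Noise-free "measurements" of dx/dt = 2x - 0.5x^3; the fit recovers
# the governing coefficients from data alone.
xs = [0.25 * k for k in range(1, 11)]
dxs = [2.0 * x - 0.5 * x ** 3 for x in xs]
a, b = fit_two_term(xs, dxs, lambda x: x, lambda x: x ** 3)
```

In d dimensions with a product basis, the library matrix grows exponentially in d; representing it as a tensor train, as MANDy does, is what keeps the regression tractable.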


Author(s):  
Wen-Chen Hu

There are two kinds of handheld computing and programming, namely client-side and server-side handheld computing and programming. The most popular applications of the latter are used with database-driven mobile web content, whose construction steps were described in the previous section. The remainder of this book will be devoted to client-side handheld computing and programming, whose applications do not need the support of server-side programs. Client-side handheld applications are varied and numerous, covering a wide range of everyday activities. Popular application examples include:

• address books, which store personal information such as addresses, telephone numbers, and email addresses in an accessible format,
• appointments, which allow users to edit, save, and view times reserved for business meetings and visits to the doctor,
• calculators, which may be a standard 4-function pocket calculator or a multifunction scientific calculator,
• datebooks/calendars, which allow users to enter hourly activities and show a daily or weekly schedule, or a simple monthly view,
• expenses, which allow users to track and record common business expenses such as car mileage, per diems, air fees, and hotel bills,
• mobile office functions, which include viewing and processing documents, spreadsheets, presentations, and inventory,
• multimedia, which includes playing music and videos, photography, and personal albums,
• note pads, which allow users to save, view, and edit text notes,
• to-do lists, which allow users to enter a list of tasks to be performed, and
• video games, in addition to those online video games that require the support of server-side programs.


2007 ◽  
Vol 98 (6) ◽  
pp. 3648-3665 ◽  
Author(s):  
Michael A. Farries ◽  
Adrienne L. Fairhall

Spike timing–dependent synaptic plasticity (STDP) has emerged as the preferred framework linking patterns of pre- and postsynaptic activity to changes in synaptic strength. Although synaptic plasticity is widely believed to be a major component of learning, it is unclear how STDP itself could serve as a mechanism for general purpose learning. On the other hand, algorithms for reinforcement learning work on a wide variety of problems, but lack an experimentally established neural implementation. Here, we combine these paradigms in a novel model in which a modified version of STDP achieves reinforcement learning. We build this model in stages, identifying a minimal set of conditions needed to make it work. Using a performance-modulated modification of STDP in a two-layer feedforward network, we can train output neurons to generate arbitrarily selected spike trains or population responses. Furthermore, a given network can learn distinct responses to several different input patterns. We also describe in detail how this model might be implemented biologically. Thus our model offers a novel and biologically plausible implementation of reinforcement learning that is capable of training a neural population to produce a very wide range of possible mappings between synaptic input and spiking output.
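The core mechanism, gating a pairwise STDP update by a scalar performance signal, can be written in a few lines. The exponential window is the standard textbook form and the constants are illustrative, not those of the Farries and Fairhall model:

```python
import math

def stdp_kernel(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Classic exponential STDP window: potentiation when the presynaptic
    spike precedes the postsynaptic one (dt = t_post - t_pre > 0),
    depression otherwise."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)

def reward_modulated_update(w, dt, reward, lr=0.01):
    """Performance-modulated STDP: the raw spike-timing eligibility is
    multiplied by a scalar reward before being applied to the weight."""
    return w + lr * reward * stdp_kernel(dt)
```

With reward fixed at +1 this reduces to ordinary STDP; a negative performance signal flips the sign of the update, which is what lets timing-based plasticity implement reinforcement learning.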


Author(s):  
Madhumitha Ramachandran ◽  
Zahed Siddique

Abstract Rotary seals are found in many types of manufacturing equipment and machines used for various applications under a wide range of operating conditions. Rotary seal failure can be catastrophic and can lead to costly downtime and large expenses, so it is extremely important to assess the degradation of rotary seals to avoid fatal breakdown of machinery. Physics-based rotary seal prognostics require direct estimation of different physical parameters to assess the degradation of seals. Data-driven prognostics utilizing sensor technology and computational capabilities can aid in the indirect estimation of a rotary seal's running condition, unlike the physics-based approach. An important aspect of data-driven prognostics is collecting appropriate data in order to reduce the cost and time associated with data collection, storage, and computation. Seals operate in harsh conditions; in the oil field especially, they are exposed to aggressive fluids that gradually reduce the elastic modulus and hardness of the seal material, resulting in lower friction torque and excessive leakage. Therefore, in this study we implement a data-driven prognostics approach that utilizes friction torque and leakage signals along with a Multilayer Perceptron classifier to compare the performance of the two metrics in classifying the running condition of rotary seals. Friction torque was found to perform better than leakage in differentiating the running condition of rotary seals throughout their service life. Although this approach was designed for seals in the oil and gas industry, it can be implemented in any manufacturing industry with similar applications.

