PATSQL

2021 ◽  
Vol 14 (11) ◽  
pp. 1937-1949
Author(s):  
Keita Takenouchi ◽  
Takashi Ishio ◽  
Joji Okada ◽  
Yuji Sakata

SQL is one of the most popular tools for data analysis, and it is now used by an increasing number of users without expertise in databases. Several studies have proposed programming-by-example approaches to help such non-experts write correct SQL queries. While existing methods support a variety of SQL features such as aggregation and nested queries, they suffer from a significant increase in computational cost as the scale of the example tables grows. In this paper, we propose an efficient algorithm that exploits properties known in relational algebra to synthesize SQL queries from input and output tables. Our key insight is that a projection operator in a program sketch can be lifted above other operators by applying transformation rules of relational algebra while preserving the semantics of the program. This enables quick inference of the appropriate columns in the projection operator, an essential component of synthesis that causes combinatorial explosion in prior work. We also introduce a novel form of constraint and a top-down propagation mechanism for efficient sketch completion. We implemented this algorithm in our tool PATSQL and evaluated it on 226 queries from prior benchmarks and Kaggle's tutorials. As a result, PATSQL solved 68% of the benchmarks and found 89% of the solutions within a second. Our tool is available at https://naist-se.github.io/patsql/.
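As a toy illustration of the projection-lifting idea described in the abstract (not PATSQL's actual implementation; the table and column names are hypothetical), the following pandas sketch shows that a projection pushed below a join is equivalent to a single projection lifted above it, so the projected columns can be read off the output example once the projection sits at the top of the sketch.

```python
# Toy illustration, not PATSQL's implementation.
import pandas as pd

emp = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Bob"], "dept": [10, 20]})
dept = pd.DataFrame({"dept": [10, 20], "city": ["Oslo", "Kyoto"]})

# Projection applied before the join ...
below = emp[["name", "dept"]].merge(dept, on="dept")[["name", "city"]]
# ... is equivalent to one projection lifted above the join, so the output
# columns {name, city} determine the projection directly.
lifted = emp.merge(dept, on="dept")[["name", "city"]]

assert below.equals(lifted)  # same result either way
```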

2010 ◽  
Vol 14 (2) ◽  
pp. 369-382 ◽  
Author(s):  
M. G. Kleinhans ◽  
M. F. P. Bierkens ◽  
M. van der Perk

Abstract. From an outsider's perspective, hydrology combines fieldwork with modelling but mostly ignores the potential for gaining understanding and conceiving new hypotheses from controlled laboratory experiments. Sivapalan (2009) pleaded for a question- and hypothesis-driven hydrology in which data analysis and top-down modelling approaches lead to general explanations and an understanding of general trends and patterns. We discuss why and how such understanding is gained very effectively from controlled experimentation in comparison to fieldwork and modelling. We argue that many major issues in hydrology are open to experimental investigation. Though experiments may have scale problems, these are of similar gravity to the well-known problems of fieldwork and modelling and have not impeded spectacular progress through experimentation in other geosciences.


2018 ◽  
Vol 13 (2) ◽  
pp. 188
Author(s):  
Tota Suhendrata

Abstract: One way to increase the productivity of lowland rice is to set the right plant spacing. Rice seedling transplanters (rice transplanters) are now being developed that introduce plant spacings ranging from narrow to wide, for both the legowo row planting system and the tile (tegel) planting system. Given the introduction of these technologies, further research is needed on the effect of plant spacing on the growth, productivity (grain yield), and income of lowland rice farmers. The assessment was carried out on paddy fields of the Rukun Tani Sulur farmer group, Blimbing Village, Sambirejo District, Sragen Regency, Central Java, during the third planting season of 2014 (July–October). It consisted of three plant-spacing treatments under the legowo row 2:1 planting system, namely 20 x 10 x 40 cm, 20 x 13 x 15 cm, and 20 x 15 x 40 cm, each treatment repeated 7 times, with each treatment plot covering about 0.33 ha. The assessment involved 7 farmers, each carrying out all 3 treatments. Seedlings were planted with a 4-row rice transplanter for the legowo 2:1 system, which offers three plant-spacing combinations: 20 x 10 x 40 cm, 20 x 13 x 15 cm, and 20 x 15 x 40 cm. The data collected included the number of productive tillers, productivity, and farm inputs and outputs. The three treatments were compared using paired t-tests (SPSS Statistics 17.0), while the financial feasibility of the rice farming technology was assessed with partial budget analysis. The results showed that the legowo row 2:1 planting system with the wide plant spacing (20 x 15 x 40 cm) produced more productive tillers, higher productivity, and higher income than the legowo 2:1 system with the narrower plant spacings (20 x 10 x 40 cm and 20 x 13 x 40 cm).
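As a hedged illustration of the statistical comparison mentioned above (the yield values below are invented, not the study's data), a paired t-test between two spacing treatments can be run as follows; the pairing reflects that each farmer carried out every treatment.

```python
# Illustrative only: paired t-test for two spacing treatments (made-up yields).
from scipy import stats

yield_wide   = [7.1, 6.8, 7.4, 7.0, 6.9, 7.3, 7.2]  # t/ha, 7 farmers, 20 x 15 x 40 cm
yield_narrow = [6.5, 6.4, 6.9, 6.6, 6.3, 6.8, 6.7]  # t/ha, same farmers, 20 x 10 x 40 cm

t_stat, p_value = stats.ttest_rel(yield_wide, yield_narrow)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # paired because each farmer ran both treatments
```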


2013 ◽  
pp. 1494-1521
Author(s):  
Jose M. Garcia-Manteiga

Metabolomics represents the new ‘omics’ approach of the functional genomics era. It consists of the identification and quantification of all small molecules, namely metabolites, in a given biological system. While metabolomics refers to the analysis of any possible biological system, metabonomics is specifically applied to disease and physiopathological situations. The data collected with these approaches are highly integrative of the other, higher levels and are hence amenable to exploration from a top-down systems biology point of view. The aim of this chapter is to give a global view of the state of the art in metabolomics, describing the two analytical techniques usually used to generate this kind of data, nuclear magnetic resonance (NMR) and mass spectrometry. In addition, the author focuses on the different data analysis tools that can be applied in such studies to extract information, with special attention to attempts to integrate metabolomics with other ‘omics’ approaches and its relevance to systems biology modeling.
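As a generic, hedged sketch of the kind of data analysis tools such a chapter surveys (not taken from the chapter itself; the intensity matrix is simulated), unsupervised exploration of a metabolite table with PCA might look like this.

```python
# Illustrative sketch: unsupervised exploration of a simulated metabolite matrix.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical data: 20 samples x 50 metabolite features (e.g., NMR bins or MS peaks)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(20, 50))

X_scaled = StandardScaler().fit_transform(np.log(X))  # log-transform and autoscale
scores = PCA(n_components=2).fit_transform(X_scaled)  # project samples onto 2 components
print(scores.shape)  # (20, 2): coordinates used to look for grouping among samples
```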


Author(s):  
Joseph Fong ◽  
Kamalakar Karlapalem ◽  
Qing Li ◽  
Irene Kwan

A practitioner’s approach to integrating databases and evolving them to support new database applications is presented. The approach consists of a joint bottom-up and top-down methodology: the bottom-up approach is taken to integrate existing databases using standard schema integration techniques (B-Schema), while the top-down approach is used to develop a database schema for the new applications (T-Schema). The T-Schema uses a joint functional-data analysis. The B-Schema is evolved by comparing it with the generated T-Schema. This facilitates an evolutionary approach to integrating existing databases to support new applications as and when needed. The mutual completeness check of the T-Schema against the B-Schema derives the schema modification steps to be performed on the B-Schema to meet the requirements of the new database applications. A case study is presented to illustrate the methodology.
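A toy sketch of the completeness-check idea (not the authors' algorithm; the schema contents are hypothetical): comparing the required T-Schema against the available B-Schema yields the entities and attributes that the schema modification steps must add.

```python
# Toy sketch, not the authors' algorithm: compare a top-down schema (T-Schema)
# with a bottom-up schema (B-Schema) to list what must be added to the B-Schema.
b_schema = {"Customer": {"id", "name"}, "Order": {"id", "customer_id"}}
t_schema = {"Customer": {"id", "name", "email"}, "Invoice": {"id", "order_id", "total"}}

missing_entities = set(t_schema) - set(b_schema)
missing_attributes = {
    entity: attrs - b_schema.get(entity, set())
    for entity, attrs in t_schema.items()
    if attrs - b_schema.get(entity, set())
}
print(missing_entities)    # e.g. {'Invoice'}: entities the B-Schema must gain
print(missing_attributes)  # e.g. {'Customer': {'email'}, ...}: attributes to add
```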


Water ◽  
2018 ◽  
Vol 10 (9) ◽  
pp. 1175 ◽  
Author(s):  
Pavel Praks ◽  
Dejan Brkić

Widely used in hydraulics, the Colebrook equation for flow friction relates the output parameter, the flow friction factor λ, implicitly to the input parameters, the Reynolds number Re and the relative roughness of the inner pipe surface ε/D: λ = f(λ, Re, ε/D). In this paper, a few explicit approximations to the Colebrook equation, λ ≈ f(Re, ε/D), are generated using the ability of artificial intelligence to build inner patterns that connect input and output parameters in an explicit way, without knowing their nature or the physical law that connects them, but only the raw numbers {Re, ε/D}→{λ}. The fact that the genetic programming tool used does not know the structure of the Colebrook equation, which is based on a computationally expensive logarithmic law, is exploited to obtain approximations whose structure is less demanding to compute yet sufficiently accurate. All generated approximations have low computational cost because they contain only a limited number of logarithmic forms, used for normalization of the input parameters or for acceleration. In the best case, the relative error in the friction factor λ is up to 0.13% with only two logarithmic forms used. Since the second logarithm can be accurately approximated by a Padé approximation, practically the same error is obtained using only one logarithm.
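For context, the implicit Colebrook equation that such explicit approximations replace can be solved by fixed-point iteration; the sketch below is a standard illustration of that baseline, not one of the paper's AI-generated approximations.

```python
# Standard fixed-point solution of the implicit Colebrook equation (illustration only).
import math

def colebrook_friction(reynolds: float, rel_roughness: float, tol: float = 1e-12) -> float:
    """Darcy friction factor λ from 1/sqrt(λ) = -2*log10(ε/D / 3.7 + 2.51 / (Re*sqrt(λ)))."""
    x = 6.0  # initial guess for 1/sqrt(λ)
    while True:
        x_new = -2.0 * math.log10(rel_roughness / 3.7 + 2.51 * x / reynolds)
        if abs(x_new - x) < tol:
            return 1.0 / (x_new * x_new)
        x = x_new

print(colebrook_friction(reynolds=1e5, rel_roughness=1e-4))  # ≈ 0.0185 for this turbulent case
```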


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Kyle Ellrott ◽  
Alex Buchanan ◽  
Allison Creason ◽  
Michael Mason ◽  
Thomas Schaffter ◽  
...  

Abstract Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.


2002 ◽  
Vol 96 (1) ◽  
pp. 260-262
Author(s):  
Gallya Lahav

Joel Fetzer is to be congratulated for a serious attempt to bring a public opinion approach to comparative immigration politics. His book represents an ambitious step toward bridging the gap between policy input and output in the immigration equation of advanced industrialized democracies. Its occasionally choppy organization and underdeveloped data analysis tend to distract from the import of the work and leave the reader yearning for a deeper and more substantive discussion.


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 736 ◽  
Author(s):  
Chao Fu ◽  
Guojin Feng ◽  
Jiaojiao Ma ◽  
Kuan Lu ◽  
Yongfeng Yang ◽  
...  

In this paper, the non-probabilistic steady-state dynamics of a dual-rotor system with parametric uncertainties under two-frequency excitations are investigated using the non-intrusive simplex-form mathematical metamodel. The Lagrangian formulation is employed to derive the equations of motion (EOM) of the system. The simplex-form metamodel, which requires no distribution functions for the interval uncertainties, is formulated in a non-intrusive way. In cases with multiple uncertainties, strategies aimed at reducing the computational cost are incorporated. In numerical simulations for different interval parametric uncertainties, a special propagation mechanism is observed that cannot be found in single-rotor systems. Validations of the metamodel in terms of efficiency and accuracy are also carried out by comparison with the scanning method. The results will be helpful for understanding the dynamic behaviors of dual-rotor systems subject to uncertainties and provide guidance for robust design and analysis.
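As a generic, hedged illustration of the scanning method used above as the validation reference (not the paper's dual-rotor model; the response function below is a made-up single-degree-of-freedom surrogate), an interval uncertainty can be propagated by sweeping a grid over the interval and recording the response envelope.

```python
# Generic scanning-method sketch for interval uncertainty propagation.
import numpy as np

def response(stiffness: float, omega: float = 1.0) -> float:
    # Hypothetical steady-state amplitude of a 1-DOF surrogate, for illustration only
    return 1.0 / abs(stiffness - omega**2)

stiffness_interval = (1.8, 2.2)            # interval-valued uncertain parameter
grid = np.linspace(*stiffness_interval, 201)
amplitudes = np.array([response(k) for k in grid])
print(amplitudes.min(), amplitudes.max())  # lower/upper bounds of the response envelope
```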


Author(s):  
Artur Boronat

Abstract When model transformations are used to implement consistency relations between very large models, incrementality plays a cornerstone role in detecting and resolving inconsistencies efficiently when models are updated. Given a directed consistency relation between two models, the problem studied in this work consists in propagating model changes from a source model to a target model in order to ensure consistency while minimizing computational costs. The mechanism that enforces such consistency is called a consistency maintainer and, in this context, its scalability is a key non-functional requirement. State-of-the-art model transformation engines with support for incrementality normally rely on an observer pattern for linking model changes, also known as deltas, to the application of model transformation rules, in so-called dependencies, at run time. These model changes can then be propagated along an already executed model transformation. Only a few approaches to model transformation provide domain-specific languages for representing and storing model changes in order to enable their use in asynchronous, event-based execution environments. The principal contribution of this work is the design of a forward change propagation mechanism for incremental execution of model transformations, which decouples dependency tracking from change propagation using two innovations. First, the observer pattern-based model is replaced with dependency injection, decoupling domain models from consistency maintainers. Second, a standardized representation of model changes is reused, enabling interoperability with EMF-compliant tools, both for defining model changes and for processing them asynchronously. This procedure has been implemented in a model transformation engine whose performance has been evaluated experimentally using the VIATRA CPS benchmark. In the experiments performed, the new transformation engine shows gains of several orders of magnitude in the initial phase of the incremental execution of the benchmark model transformation; change propagation is performed in real time for model sizes that are processable by other tools, and the engine is, in addition, able to process much larger models.
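A hedged sketch of the dependency-injection idea described above (not the paper's engine; all names and the propagation rule are hypothetical): the propagation strategy is injected into the consistency maintainer, so domain models never have to register observers or know about the maintainer at all.

```python
# Hedged sketch: change propagation injected into a consistency maintainer.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Delta:
    element: str
    feature: str
    new_value: object

@dataclass
class ConsistencyMaintainer:
    # The propagation strategy is injected, not hard-wired via observers.
    propagate: Callable[[Delta, dict], None]
    target: dict = field(default_factory=dict)

    def on_change(self, delta: Delta) -> None:
        self.propagate(delta, self.target)

def forward_rule(delta: Delta, target: dict) -> None:
    # Toy forward-propagation rule: mirror the changed feature in the target model.
    target[(delta.element, delta.feature)] = delta.new_value

maintainer = ConsistencyMaintainer(propagate=forward_rule)
maintainer.on_change(Delta("Task1", "name", "deploy"))
print(maintainer.target)  # {('Task1', 'name'): 'deploy'}
```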

