RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases

2021 ◽  
pp. 1-24
Author(s):  
DongHyun Choi ◽  
Myeong Cheol Shin ◽  
EungGyun Kim ◽  
Dong Ryeol Shin

Abstract. Text-to-SQL is the problem of converting a user question into an SQL query, given the question and the database. In this paper, we present a neural network approach called RYANSQL (Recursively Yielding Annotation Network for SQL) to solve complex Text-to-SQL tasks for cross-domain databases. A Statement Position Code (SPC) is defined to transform a nested SQL query into a set of non-nested SELECT statements; a sketch-based slot filling approach is proposed to synthesize each SELECT statement for its corresponding SPC. Additionally, two input manipulation methods are presented to further improve generation performance. RYANSQL achieved a competitive result of 58.2% accuracy on the challenging Spider benchmark. At the time of paper submission (April 2020), RYANSQL v2, a variant of the original RYANSQL, was positioned at 3rd place among all systems and 1st place among the systems not using database content, with 60.6% exact matching accuracy. The source code is available at https://github.com/kakaoenterprise/RYANSQL.
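The idea of splitting a nested query into flat SELECT statements keyed by position can be illustrated with a small sketch. This is not the authors' implementation: the tuple-based position codes, the slot names, and the `{sub}` placeholder are all hypothetical, chosen only to show how flat slot sets might be recombined recursively.

```python
from typing import Dict, Tuple

# Hypothetical position codes: a tuple describing the path from the
# outermost SELECT to a nested one. () is the outermost statement.
Position = Tuple[str, ...]

def fill_sketch(slots: dict) -> str:
    """Fill a fixed, non-nested SELECT sketch; an empty WHERE slot is skipped."""
    sql = f"SELECT {', '.join(slots['select'])} FROM {slots['from']}"
    if slots.get("where"):
        sql += f" WHERE {slots['where']}"
    return sql

def assemble(statements: Dict[Position, dict], pos: Position = ()) -> str:
    """Recursively rebuild a nested query from flat, per-position slot sets."""
    slots = dict(statements[pos])
    child = pos + ("WHERE",)
    if child in statements:
        # {sub} (an assumption of this sketch) marks where the nested SELECT goes.
        slots["where"] = slots["where"].format(sub=assemble(statements, child))
    return fill_sketch(slots)

statements = {
    (): {"select": ["name"], "from": "singer", "where": "age > ({sub})"},
    ("WHERE",): {"select": ["AVG(age)"], "from": "singer"},
}
print(assemble(statements))
```

Each entry in `statements` is a flat statement that could be predicted independently; only the assembly step needs to know about nesting.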

Author(s):  
K. Ahkouk ◽  
M. Machkour ◽  
K. Majhadi ◽  
R. Mama

Abstract. Sequence-to-sequence models have been widely used in recent years across natural language processing tasks. In particular, they have been widely adopted for translating natural language questions into SQL. In this context, many studies use sequence-to-sequence approaches to predict the target SQL queries on the available datasets. In this paper, we shed light on another way to solve natural language processing tasks, especially natural language to SQL: sketch-based decoding, which relies on a sketch with holes that the model incrementally tries to fill. We present the pros and cons of each approach and show how a sketch-based model can outperform existing solutions in predicting the intended SQL queries and in generalizing to unseen input pairs in different contexts and cross-domain datasets. Finally, we discuss the test results of previously proposed models, using exact matching scores, error propagation, and training time as metrics.
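A sketch with holes that are filled one at a time might look like the following toy example. The sketch string, the hole names, and the keyword rules in `predict_hole` are all assumptions made for illustration; in a real system each hole would be filled by a learned classifier rather than hand-written rules.

```python
import re
from string import Template

# Hypothetical sketch: fixed SQL skeleton with named holes.
SKETCH = "SELECT $AGG($SEL_COL) FROM $TABLE WHERE $COND_COL $OP $VALUE"

def predict_hole(hole, question, table, columns, filled):
    """Toy stand-in for a learned per-hole predictor (hypothetical rules)."""
    q = question.lower()
    mentioned = [c for c in columns if c in q]
    if hole == "AGG":
        return "COUNT" if "how many" in q else "MAX"
    if hole == "SEL_COL":
        # Conditions on an earlier decision, which incremental filling allows.
        return "*" if filled["AGG"] == "COUNT" else mentioned[0]
    if hole == "TABLE":
        return table
    if hole == "COND_COL":
        return mentioned[-1]
    if hole == "OP":
        return ">" if "more than" in q else "="
    if hole == "VALUE":
        return re.search(r"\d+", q).group()
    raise ValueError(f"unknown hole: {hole}")

def fill(question, table, columns):
    """Fill the holes left to right, then substitute them into the sketch."""
    filled = {}
    for hole in re.findall(r"\$(\w+)", SKETCH):
        filled[hole] = predict_hole(hole, question, table, columns, filled)
    return Template(SKETCH).substitute(filled)

print(fill("How many singers have age more than 30?", "singer", ["name", "age"]))
```

Because the skeleton is fixed, the output is always syntactically valid SQL; the model only chooses among a small set of candidates per hole, which is the contrast with free-form sequence-to-sequence decoding.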


Author(s):  
Hengtong Lu ◽  
Zhuoxin Han ◽  
Caixia Yuan ◽  
Xiaojie Wang ◽  
Shuyu Lei ◽  
...  

2019 ◽  
Author(s):  
Rui Zhang ◽  
Tao Yu ◽  
Heyang Er ◽  
Sungrok Shim ◽  
Eric Xue ◽  
...  

2013 ◽  
Vol 753-755 ◽  
pp. 3108-3111
Author(s):  
Yin Bing Li

To address the characteristics of color image matching in robot vision navigation systems, the sequential similarity detection algorithm (SSDA) is improved and an adaptive genetic algorithm is introduced, together with a hierarchical search strategy that combines coarse and exact matching. The improved algorithm increases image matching speed without reducing matching accuracy, so that the real-time requirements of robot vision navigation can be met and the navigation becomes more robust.
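For orientation, a basic SSDA-style matcher can be sketched as below. This is the plain early-abandonment idea only, assuming grayscale images stored as nested lists and the best error so far as the abandonment threshold; it does not include the paper's improvements (the adaptive genetic algorithm or the coarse-to-exact search).

```python
def ssda_match(image, template):
    """Find the template position with the smallest sum of absolute differences.

    The error for a candidate position is accumulated row by row and the
    candidate is abandoned as soon as it can no longer beat the best so far.
    """
    H, W = len(image), len(image[0])
    h, w = len(template), len(template[0])
    best_pos, best_err = None, float("inf")
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            err = 0
            for i in range(h):
                for j in range(w):
                    err += abs(image[r + i][c + j] - template[i][j])
                if err >= best_err:
                    break  # early abandonment: this position cannot win
            if err < best_err:
                best_pos, best_err = (r, c), err
    return best_pos, best_err

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 1, 2, 3],
    [4, 5, 9, 9],
]
template = [[1, 2], [5, 9]]
print(ssda_match(image, template))
```

The speed-up comes from skipping most of the per-pixel work at poor candidate positions, which is why the method suits real-time matching.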


2021 ◽  
Vol 11 (22) ◽  
pp. 10675
Author(s):  
Yinpei Dai ◽  
Yichi Zhang ◽  
Hong Liu ◽  
Zhijian Ou ◽  
Yi Huang ◽  
...  

Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots. An ontology is defined by the collection of slots and the values that each slot can take. The most widely used practice of treating slot filling as a sequence labeling task suffers from two main drawbacks. First, the ontology is usually pre-defined and fixed and therefore is not able to detect new labels for unseen slots. Second, the one-hot encoding of slot labels ignores the correlations between slots with similar semantics, which makes it difficult to share knowledge learned across different domains. To address these problems, we propose a new model called elastic conditional random field (eCRF), where each slot is represented by the embedding of its natural language description and modeled by a CRF layer. New slot values can be detected by eCRF whenever a language description is available for the slot. In our experiment, we show that eCRFs outperform existing models in both in-domain and cross-domain tasks, especially in predicting unseen slots and values.
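The sequence-labeling view of slot filling mentioned above can be made concrete with a small decoder. The BIO tagging scheme and the slot names `city` and `date` are assumptions for illustration (the abstract does not name a tagging scheme), and this shows only the conventional baseline formulation, not the eCRF model.

```python
def decode_bio(tokens, tags):
    """Group per-token BIO tags into (slot_name, value) pairs."""
    slots, name, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and name == tag[2:]:
            words.append(token)  # continue the current slot value
            continue
        if name is not None:
            slots.append((name, " ".join(words)))  # close the open slot
            name, words = None, []
        if tag.startswith("B-"):
            name, words = tag[2:], [token]  # open a new slot
    if name is not None:
        slots.append((name, " ".join(words)))
    return slots

tokens = "book a flight to new york tomorrow".split()
tags = ["O", "O", "O", "O", "B-city", "I-city", "B-date"]
print(decode_bio(tokens, tags))
```

Since the tag set here is a fixed list of strings, a slot that never appeared in training has no tag at all, which is the first drawback the abstract describes.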


2021 ◽  
pp. 517-528
Author(s):  
Shudong Liu ◽  
Peijie Huang ◽  
Zhanbiao Zhu ◽  
Hualin Zhang ◽  
Jianying Tan

2019 ◽  
Vol 36 (7) ◽  
pp. 2314-2315 ◽  
Author(s):  
Jacobo de la Cuesta-Zuluaga ◽  
Ruth E Ley ◽  
Nicholas D Youngblut

Abstract. Summary: Taxonomic and functional information from microbial communities can be efficiently obtained by metagenome profiling, which requires databases of genes and genomes to which sequence reads are mapped. However, the databases that accompany metagenome profilers are not updated at a pace that matches the increase in available microbial genomes, and unifying database content across metagenome profiling tools can be cumbersome. To address this, we developed Struo, a modular pipeline that automates the acquisition of genomes from public repositories and the construction of custom databases for multiple metagenome profilers. The use of custom databases that broadly represent the known microbial diversity by incorporating novel genomes results in a substantial increase in the mappability of reads in synthetic and real metagenome datasets. Availability and implementation: Source code is available for download at https://github.com/leylabmpi/Struo. Custom Genome Taxonomy Database (GTDB) databases are available at http://ftp.tue.mpg.de/ebio/projects/struo/. Supplementary information: Supplementary data are available at Bioinformatics online.

