query translation
Recently Published Documents


TOTAL DOCUMENTS

175
(FIVE YEARS 18)

H-INDEX

15
(FIVE YEARS 1)

2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Senthilselvan Natarajan ◽  
Subramaniyaswamy Vairavasundaram ◽  
Yuvaraja Teekaraman ◽  
Ramya Kuppusamy ◽  
Arun Radhakrishnan

The modern web increasingly expects data in Resource Description Framework (RDF) format, a machine-readable form that makes data easy to share and reuse without human intervention. However, most information is still stored in relational form. Conventional methods transform data from RDB to RDF using instance-level mapping, which has not yielded the expected results because of poor mapping. Hence, this paper proposes a novel schema-based RDB-RDF (relational database to Resource Description Framework) mapping method, an improved approach for transforming a relational database into RDF. It provides both data materialization and on-demand mapping. RDB-RDF reduces the data retrieval time for non-primary-key searches by using schema-level mapping. The resulting RDF graph presents the relational database as a conceptual schema and maintains the instance triples as a data graph. This mechanism is known as data materialization and is well suited to static datasets. In a dynamic environment, query translation (on-demand mapping) is preferable to converting the whole dataset. The proposed approach directly converts a SPARQL query into an SQL query using the mapping descriptions available in the proposed system. The mapping description is the key component of the system and is responsible for quick data retrieval and query translation. The join expression introduced in the proposed RDB-RDF mapping method efficiently handles all complex operations involving primary and foreign keys. The method is evaluated experimentally on a graphics designer database. The results show that the proposed schema-based RDB-RDF mapping method achieves a more comprehensible mapping than conventional methods by resolving structural and operational differences.
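As a rough illustration of on-demand mapping, the sketch below translates a single SPARQL-style triple pattern into a parameterized SQL query by looking up a mapping description. It is not the authors' method: the `MAPPING` dictionary, the `ex:` predicates, and the `designer` table and its columns are all hypothetical names chosen for the example.

```python
# Hypothetical mapping description: each RDF predicate is tied to the
# table, key column, and value column that store it in the relational schema.
MAPPING = {
    "ex:name": {"table": "designer", "key": "id", "column": "name"},
    "ex:worksOn": {"table": "designer", "key": "id", "column": "project_id"},
}


def triple_pattern_to_sql(predicate, object_value=None):
    """Translate the pattern (?s predicate ?o) or (?s predicate literal).

    Returns the SQL text and its parameter list. Only single triple
    patterns are handled; joins and filters are out of scope here.
    """
    m = MAPPING[predicate]
    sql = f'SELECT {m["key"]} AS s, {m["column"]} AS o FROM {m["table"]}'
    params = []
    if object_value is not None:
        # A bound object becomes a WHERE condition on the mapped column.
        sql += f' WHERE {m["column"]} = ?'
        params.append(object_value)
    return sql, params
```

Because the lookup happens at the schema level, a search on a non-key column such as `name` becomes an ordinary SQL condition on that column, with no need to materialize the instance triples first.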


Author(s):  
Raphael W. Majeed ◽  
Patrick Fischer ◽  
Andreas Günther

In the era of translational research, data integration and clinical data warehouses are important enabling technologies for clinical researchers. The OMOP common data model (CDM) is a widespread choice as a target for data integration in medical informatics. Its portability of queries and analyses across different institutions and data is also ideal from the viewpoint of the FAIR principles. Yet the OMOP CDM lacks a simple and intuitive user interface that lets untrained users run simple queries for feasibility analysis. The aim of this study is to provide an algorithm that translates any given i2b2 query into an equivalent query that can be run on an OMOP CDM database. The provided algorithm converts queries created in the i2b2 webclient into SQL statements that can be executed programmatically on a standard OMOP CDM database.
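To make the idea concrete, here is a minimal sketch of how a panel-style feasibility query could be turned into SQL over the OMOP `condition_occurrence` table. It is not the algorithm from the study: it assumes an i2b2 query has already been reduced to a list of panels of condition concept IDs (items within a panel are ORed, panels are ANDed), and the function name and concept IDs are hypothetical.

```python
import sqlite3


def panels_to_omop_sql(panels):
    """Build a patient-count query: OR within a panel, AND across panels.

    `panels` is a list of lists of condition concept IDs (an assumed,
    simplified stand-in for a parsed i2b2 query definition).
    """
    if not panels or any(not panel for panel in panels):
        raise ValueError("need at least one non-empty panel")
    subqueries, params = [], []
    for concept_ids in panels:
        placeholders = ", ".join("?" for _ in concept_ids)
        subqueries.append(
            "SELECT person_id FROM condition_occurrence "
            f"WHERE condition_concept_id IN ({placeholders})"
        )
        params.extend(concept_ids)
    # INTERSECT keeps only the patients that match every panel.
    inner = " INTERSECT ".join(subqueries)
    return f"SELECT COUNT(*) FROM ({inner}) AS cohort", params


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence "
        "(person_id INTEGER, condition_concept_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?)",
        [(1, 100), (1, 200), (2, 100), (3, 200), (3, 300)],
    )
    sql, params = panels_to_omop_sql([[100], [200, 300]])
    print(conn.execute(sql, params).fetchone()[0])
```

A real translation would also have to cover other OMOP domains (drugs, measurements, visits), date constraints, and exclusion panels, none of which are handled here.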


Semantic Web ◽  
2021 ◽  
pp. 1-34
Author(s):  
David Chaves-Fraga ◽  
Edna Ruckhaus ◽  
Freddy Priyatna ◽  
Maria-Esther Vidal ◽  
Oscar Corcho

Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets (e.g., relational databases, CSV and JSON files), either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented (e.g., referential integrity among sources, datatypes, or data integrity); thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work is focused on applying implicit constraints on the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with the GTFS-Madrid benchmark; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, without and with the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude, while it is able to produce all the query answers.
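The general idea of enforcing implicit constraints before translation can be sketched as follows. This is not Morph-CSV's implementation; it only assumes that datatypes and join columns have already been inferred from the mappings and queries, and the function and its parameters are hypothetical. The sketch loads CSV text into a typed SQLite table and indexes the columns expected to appear in joins.

```python
import csv
import io
import sqlite3


def load_csv_with_constraints(conn, table, csv_text, column_types, index_columns):
    """Load CSV text into a typed table and index the join columns.

    `column_types` maps column name to SQL type; `index_columns` lists the
    columns to index. Both are assumed to come from an earlier analysis of
    the mappings and queries, which is not shown here.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    cols = list(column_types)
    col_defs = ", ".join(f'"{c}" {column_types[c]}' for c in cols)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [[row[c] for c in cols] for row in rows],
    )
    for c in index_columns:
        conn.execute(f'CREATE INDEX "idx_{table}_{c}" ON "{table}" ("{c}")')
```

With declared types, numeric comparisons and aggregates in the generated SQL operate on numbers and not on strings, and the indexes give the SQL engine a chance to avoid full scans on joins.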


2021 ◽  
pp. 016555152199275
Author(s):  
Juryong Cheon ◽  
Youngjoong Ko

Translation language resources, such as bilingual word lists and parallel corpora, are important factors affecting the effectiveness of cross-language information retrieval (CLIR) systems. When large domain-appropriate parallel corpora are not available, developing an effective CLIR system is particularly difficult. Furthermore, creating a large parallel corpus is costly and requires considerable effort. Therefore, we demonstrate the construction of parallel corpora from Wikipedia as well as improved query translation, wherein the queries are used for a CLIR system. To do so, we first constructed a bilingual dictionary, termed WikiDic. Then, we evaluated individual language resources and combinations of them in terms of their ability to extract parallel sentences; the combination of our proposed WikiDic with the translation probability from the Web’s bilingual example sentence pairs and WikiDic was found to be best suited to parallel sentence extraction. Finally, to evaluate the parallel corpus generated from this best combination of language resources, we compared its performance in query translation for CLIR to that of a manually created English–Korean parallel corpus. The corpus generated by our proposed method achieved better performance than the manually created corpus, demonstrating the effectiveness of the proposed method for automatic parallel corpus extraction. The method can be used to inform the construction of other parallel corpora from readily available language resources, and the parallel sentence extraction method will naturally improve as Wikipedia continues to be used and its content develops.
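For readers unfamiliar with dictionary-based query translation in CLIR, a minimal sketch is given below. It is a generic illustration, not the WikiDic pipeline: the dictionary entries and their probabilities are invented for the example, and the function name is hypothetical. Each query term is replaced by its most probable translations, and terms missing from the dictionary are passed through unchanged.

```python
def translate_query(query, dictionary, top_k=1):
    """Translate a query term by term using translation probabilities.

    `dictionary` maps a source term to {target term: probability}.
    Out-of-vocabulary terms are kept as they are.
    """
    translated = []
    for term in query.lower().split():
        candidates = dictionary.get(term)
        if not candidates:
            translated.append(term)
            continue
        # Keep the top_k most probable target terms for this source term.
        best = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        translated.extend(word for word, _ in best[:top_k])
    return translated


if __name__ == "__main__":
    # Hypothetical English-Korean entries with made-up probabilities.
    toy_dictionary = {
        "bank": {"은행": 0.8, "둑": 0.2},
        "loan": {"대출": 0.9},
    }
    print(translate_query("Bank loan rates", toy_dictionary))
```

Raising `top_k` keeps several candidate translations per term, which trades precision for recall when a source term is ambiguous.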


2020 ◽  
Vol 49 (09) ◽  
pp. 2113-2118
Author(s):  
Norita Md Norwawi ◽  
Sundresan a/l Perumal ◽  
Emran Huda ◽  
Waka Jeng

Author(s):  
Liang Yao ◽  
Baosong Yang ◽  
Haibo Zhang ◽  
Boxing Chen ◽  
Weihua Luo

Author(s):  
Ganesh Chandra ◽  
Sanjay K. Dwivedi

The quality of retrieved documents in CLIR is often poor compared to that of an IR system because of (1) query mismatching, (2) multiple representations of query terms, and (3) untranslated query terms. Inappropriate translation may lead to poor-quality results. Hence, automated query translation is performed using the back-translation approach to improve query translation. This chapter mainly focuses on query expansion (Q.E.) and proposes an algorithm to address the query drift issue for Hindi-English CLIR. The system uses FIRE datasets and a set of 50 Hindi-language queries for evaluation. The purpose of the term ordering-based algorithm is to resolve the query drift issue in Q.E. The results show that the relevancy of Hindi-English CLIR is improved by performing Q.E. with the term ordering-based algorithm: accuracy reaches 60.18% with the algorithm, compared with 57.46% for Q.E. without it.

