Research and Application on Oilfield Product Heterogeneous Data Integration Based on Ontology

The oil and gas industry is moving toward Integrated Operations (IO). Of the several ways to achieve data integration, ontology-based approaches have drawn much attention. This paper introduces an ontology-based distributed data integration framework (ODDIF). The framework addresses semantic interoperability between heterogeneous data sources at the semantic level. Metadata specify the distributed, heterogeneous data and describe the semantic information of each data source. With an ontology as the common semantic model, semantic matches are established through ontology mapping between the heterogeneous data sources, and semantic differences between institutions are shielded, so that the semantic heterogeneity of the data sources can be effectively resolved. The proposed method reduces development difficulty, improves development efficiency, and enhances the maintainability and extensibility of the system.

Design and Implementation of Oilfield Heterogeneous Data Integration Model Based on Ontology

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.912-914.1201 ◽

2014 ◽

Vol 912-914 ◽

pp. 1201-1204

Author(s):

Gang Huang ◽

Xiu Ying Wu ◽

Man Yuan

Keyword(s):

Data Integration ◽

Heterogeneous Data ◽

Data Sources ◽

Semantic Heterogeneity ◽

Integration Model ◽

Integration Framework ◽

Heterogeneous Data Integration ◽

Semantic Level ◽

Heterogeneous Data Sources ◽

Semantic Difference

This paper presents an ontology-based distributed heterogeneous data integration framework (ODHDIF). The framework addresses semantic interoperability between heterogeneous data sources at the semantic level. Metadata specify the distributed, heterogeneous data and describe the semantic information of each data source. With an ontology as the common semantic model, semantic matches are established through ontology mapping between the heterogeneous data sources, and semantic differences between institutions are shielded, so that the semantic heterogeneity of the data sources can be effectively resolved. The framework provides an effective technical means of sharing an enterprise's internal information accurately and in a timely manner.
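The ontology-mapping idea in this abstract can be illustrated with a minimal sketch. It is not the ODHDIF implementation: the source names, field names, and ontology terms below are all hypothetical, and a real system would express the mappings in an ontology language rather than Python dictionaries. The sketch only shows how per-source mappings onto a shared vocabulary hide naming differences from the consumer.

```python
# Hypothetical per-source mappings from local field names to shared
# ontology terms. None of these names come from the paper.
SOURCE_MAPPINGS = {
    "source_a": {"WELL_NM": "wellName", "OIL_PROD": "dailyOilOutput"},
    "source_b": {"name": "wellName", "oil_t_per_day": "dailyOilOutput"},
}


def to_ontology(source, record):
    """Rename one source record's fields to the shared ontology terms."""
    mapping = SOURCE_MAPPINGS[source]
    unified = {}
    for field, value in record.items():
        term = mapping.get(field)
        if term is None:
            continue  # fields without a mapping are dropped in this sketch
        unified[term] = value
    return unified


def integrate(records_by_source):
    """Produce one list of records expressed only in ontology terms."""
    result = []
    for source, records in records_by_source.items():
        for record in records:
            result.append(to_ontology(source, record))
    return result


rows = integrate({
    "source_a": [{"WELL_NM": "W-1", "OIL_PROD": 12.5, "OPERATOR_ID": 7}],
    "source_b": [{"name": "W-2", "oil_t_per_day": 9.0}],
})
```

After integration, every record uses the same keys regardless of which source it came from, which is the sense in which the semantic differences are "shielded".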

An approach for semantic integration of heterogeneous data sources

PeerJ Computer Science ◽

10.7717/peerj-cs.254 ◽

2020 ◽

Vol 6 ◽

pp. e254

Author(s):

Giuseppe Fusco ◽

Lerina Aversano

Keyword(s):

Data Integration ◽

Heterogeneous Data ◽

Semantic Integration ◽

Data Sources ◽

Complex Data ◽

Semantic Heterogeneity ◽

Heterogeneous Information ◽

Heterogeneous Data Sources ◽

Autonomous Data Sources ◽

Unified View

Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to most users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view.
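For readers unfamiliar with the Global-as-View (GAV) approach named in this abstract, the following is a minimal sketch in which a global relation is defined as a query over the sources. The two sources, their schemas, and the function names are hypothetical; the sketch does not reproduce DIF or its ontology-based generalization.

```python
# Two hypothetical sources with different local schemas.
crm_customers = [{"cust_id": 1, "full_name": "Ada"}]
billing_accounts = [{"account": 2, "holder": "Grace"}]


def global_customer():
    """GAV: the global relation is defined by a query over the sources."""
    for row in crm_customers:
        yield {"id": row["cust_id"], "name": row["full_name"]}
    for row in billing_accounts:
        yield {"id": row["account"], "name": row["holder"]}


def query_global(predicate):
    """Answer a query on the global view by unfolding its definition."""
    return [c for c in global_customer() if predicate(c)]


names = sorted(c["name"] for c in query_global(lambda c: c["id"] >= 1))
```

In Local-as-View, by contrast, each source is described as a view over the global schema, so answering a query requires rewriting it in terms of those source descriptions rather than simply unfolding a definition as above.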

Methodology of Big Data Integration from A Priori Unknown Heterogeneous Data Sources

Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence - CSAI '18 ◽

10.1145/3297156.3297249 ◽

2018 ◽

Author(s):

Alexey Samoylov ◽

Nikolay Sergeev ◽

Margarita Kucherova ◽

Boris Denisov

Keyword(s):

Big Data ◽

Data Integration ◽

A Priori ◽

Heterogeneous Data ◽

Data Sources ◽

Heterogeneous Data Sources

Integração, Relacionamento e Representação de Dados em Cidades Inteligentes: Uma Revisão de Literatura [Data Integration, Relationship, and Representation in Smart Cities: A Literature Review]

10.5753/wbci.2018.3231 ◽

2018 ◽

Author(s):

Larysse Silva ◽

José Alex Lima ◽

Nélio Cacho ◽

Eiji Adachi ◽

Frederico Lopes ◽

...

Keyword(s):

Decision Making ◽

Literature Review ◽

Data Integration ◽

Smart Cities ◽

Heterogeneous Data ◽

Data Sources ◽

Application Development ◽

Continuous Integration ◽

Heterogeneous Data Sources ◽

Computational Systems

A notable characteristic of smart cities is the increase in the amount of available data generated by numerous devices and computational systems, which heightens the challenges of developing software that integrates large volumes of data. In this context, this paper presents a literature review aimed at identifying the main strategies used in developing solutions for data integration, relationship, and representation in smart cities. The study systematically selected and analyzed eleven studies published from 2015 to 2017. The results reveal gaps in solutions for the continuous integration of heterogeneous data sources in support of application development and decision-making.

Enabling semantic queries across federated bioinformatics databases

Database ◽

10.1093/database/baz106 ◽

2019 ◽

Vol 2019 ◽

Cited By ~ 9

Author(s):

Ana Claudia Sima ◽

Tarcisio Mendes de Farias ◽

Erich Zbinden ◽

Maria Anisimova ◽

Manuel Gil ◽

...

Keyword(s):

Gene Expression ◽

Data Integration ◽

Heterogeneous Data ◽

Biological Data ◽

Data Sources ◽

Biological Knowledge ◽

Biological Databases ◽

Semantic Level ◽

Sparql Endpoint ◽

Description Framework

Motivation: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases. Results: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology DS; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.
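The two mechanisms described in this abstract, relational-to-RDF mapping and joint queries over intersection points, can be sketched in a few lines. This is an illustration under stated assumptions only: the rows, the `ex:` predicates, and the identifiers are hypothetical and are not taken from GenEx, the Orthology ontology, Bgee, or OMA, and plain tuples stand in for an RDF store and SPARQL.

```python
# Hypothetical relational rows from a gene expression table.
expression_rows = [
    {"gene_id": "G1", "anatomy": "brain"},
    {"gene_id": "G2", "anatomy": "liver"},
]

# Hypothetical triples already held by a second, RDF-style store.
orthology_triples = [
    ("gene:G1", "ex:orthologOf", "gene:H1"),
]


def rows_to_triples(rows):
    """Map each relational row to a (subject, predicate, object) triple."""
    for row in rows:
        subject = f"gene:{row['gene_id']}"
        yield (subject, "ex:expressedIn", f"anat:{row['anatomy']}")


def joint_query(expr_triples, orth_triples):
    """Join the two sources on the gene identifier they share."""
    orthologs = {}
    for s, p, o in orth_triples:
        if p == "ex:orthologOf":
            orthologs.setdefault(s, []).append(o)
    results = []
    for s, p, o in expr_triples:
        if p == "ex:expressedIn":
            for other in orthologs.get(s, []):
                results.append((s, o, other))
    return results


answers = joint_query(rows_to_triples(expression_rows), orthology_triples)
```

The shared `gene:` identifier plays the role of an intersection point here: it is the only thing the two sources need to agree on for the join to work.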

Integrating Heterogeneous Data Sources in the Web

Database Technologies ◽

10.4018/978-1-60566-058-5.ch150 ◽

2009 ◽

pp. 2472-2488

Author(s):

Angelo Brayner ◽

Marcelo Meirelles ◽

José de Aguiar Moraes Filho

Keyword(s):

Query Language ◽

Heterogeneous Data ◽

Data Sources ◽

Distributed Data ◽

Local Data ◽

Multidatabase System ◽

Integration Strategy ◽

Heterogeneous Data Sources ◽

Integration Problems ◽

The Web

Integrating data sources published on the Web requires an integration strategy that guarantees the local data sources’ autonomy. A multidatabase system (MDBS) has been consolidated as an approach to integrate multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is to guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called the MultiDatabase Language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery Language, called MXQuery, which supports queries over several data sources and solves such integration problems as semantic heterogeneity and incomplete information.

Semantics-aware data integration for heterogeneous data sources

Journal of Ambient Intelligence and Humanized Computing ◽

10.1007/s12652-012-0165-4 ◽

2012 ◽

Vol 4 (4) ◽

pp. 471-491 ◽

Cited By ~ 4

Author(s):

Marcello Leida ◽

Alex Gusmini ◽

John Davies

Keyword(s):

Data Integration ◽

Heterogeneous Data ◽

Data Sources ◽

Heterogeneous Data Sources

Research on the Expert System for Sweet Corn Standard Production Based on Heterologous Data Integration Technology

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.655-657.1730 ◽

2013 ◽

Vol 655-657 ◽

pp. 1730-1733

Author(s):

Lin Peng ◽

Qiang Zheng ◽

Zhao Rong Liu

Keyword(s):

Expert System ◽

Data Integration ◽

Data Management ◽

Sweet Corn ◽

Heterogeneous Data ◽

Data Sources ◽

Integration Technology ◽

Shared Data ◽

Heterogeneous Data Sources ◽

Standard Production

To improve the sharing of agricultural information under existing agricultural informatization conditions, and to meet agricultural departments' new need for data management that is locally autonomous yet globally shared during the standardized production of sweet corn, this paper provides a method for the integrated sharing of heterogeneous data sources and applies it to standardized sweet corn production. The method solves the data integration and sharing problems that arise during standardized production of sweet corn. An expert system for sweet corn standard production that is able to combine heterogeneous data is constructed. The system proved to be reliable, to perform well, and to be easy to operate.

Microarray Information and Data Integration Using SAMIDI

Encyclopedia of Artificial Intelligence ◽

10.4018/978-1-59904-849-9.ch157 ◽

2011 ◽

pp. 1064-1071 ◽

Cited By ~ 2

Author(s):

Juan M. Gómez ◽

Ricardo Colomo ◽

Marcos Ruano ◽

Ángel García

Keyword(s):

Data Integration ◽

Information Integration ◽

Protein Function ◽

Heterogeneous Data ◽

Data Sources ◽

New Paradigm ◽

Array Data ◽

Heterogeneous Data Sources ◽

Critical Issues ◽

Micro Array

Technological advances in high-throughput techniques and efficient data gathering methods, coupled with computational biology efforts, have resulted in a vast amount of life science data, often available in distributed and heterogeneous repositories. These repositories contain information such as sequence and structure data, annotations for biological data, results of complex computations, genetic sequences and multiple bio-datasets. However, the heterogeneity of these data has created a need for research in resource integration and platform-independent processing of investigative queries involving heterogeneous data sources. When processing huge amounts of data, information integration is one of the most critical issues, because it is crucial to preserve the intrinsic semantics of all the merged data sources. This integration would allow the proper organization of data, fostering analysis of and access to the information needed to accomplish critical tasks, such as processing micro-array data to study protein function and supporting medical researchers in making detailed studies of protein structures to facilitate drug design (Ignacimuthu, 2005). Furthermore, the DNA micro-array research community urgently requires technology that allows up-to-date micro-array data to be found, accessed and delivered in a secure framework (Sinnot, 2007). Several research disciplines where information integration is critical, such as Bioinformatics, could benefit from harnessing the potential of a new approach: the Semantic Web (SW). The term SW was coined by Berners-Lee, Hendler and Lassila (2001) to describe the evolution from a Web that consisted largely of documents for humans to read towards a new paradigm that includes data and information for computers to manipulate. The SW is about adding machine-understandable and machine-processable metadata to Web resources through its key enabling technology: ontologies (Fensel, 2002). An ontology is a formal, explicit and shared specification of a conceptualization. The SW was conceived as a way to meet the need for data integration on the Web. This article presents SAMIDI, a Semantics-based Architecture for Micro-array Information and Data Integration. The most remarkable innovation offered by SAMIDI is the use of semantics as a tool for reconciling different vocabularies and terminologies and fostering integration. SAMIDI is composed of a software architecture and a methodology for unifying heterogeneous data sources, starting from an analysis of the requirements of the unified data set.

Integrating Heterogeneous Data Sources in the Web

Web Data Management Practices ◽

10.4018/978-1-59904-228-2.ch009 ◽

2007 ◽

pp. 199-219

Author(s):

Angelo Brayner ◽

Macelo Meireles ◽

José de Aguiar Moraes Filho

Keyword(s):

Query Language ◽

Heterogeneous Data ◽

Data Sources ◽

Distributed Data ◽

Local Data ◽

Multidatabase System ◽

Integration Strategy ◽

Heterogeneous Data Sources ◽

Integration Problems ◽

The Web

Integrating data sources published on the Web requires an integration strategy that guarantees the local data sources' autonomy. The multidatabase system (MDBS) has been consolidated as an approach to integrate multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is to guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called a multidatabase language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery language, called MXQuery, which supports queries over several data sources and solves such integration problems as semantic heterogeneity and incomplete information.
