Research and Application on Oilfield Product Heterogeneous Data Integration Based on Ontology

2014 ◽  
Vol 530-531 ◽  
pp. 809-812
Author(s):  
Gang Huang ◽  
Xiu Ying Wu ◽  
Man Yuan ◽  
Rui Fang Li

The Oil & Gas industry is moving forward with Integrated Operations (IO). There are different ways to achieve data integration, and ontology-based approaches have drawn much attention. This paper introduces an ontology-based distributed data integration framework (ODDIF). The framework resolves the problem of semantic interoperability between heterogeneous data sources in semantic level. By metadatas specifying the distributed, heterogeneous data and by describing semantic information of data source , having "ontology" as a common semantic model, semantic match is established through ontology mapping between heterogeneous data sources and semantic difference institutions are shielded, so that semantic heterogeneity problem of the heterogeneous data sources can be effectively solved. The proposed method reduces developing difficulty, improves developing efficiency, and enhances the maintainability and expandability of the system.

2014 ◽  
Vol 912-914 ◽  
pp. 1201-1204
Author(s):  
Gang Huang ◽  
Xiu Ying Wu ◽  
Man Yuan

This paper provides an ontology-based distributed heterogeneous data integration framework (ODHDIF). The framework resolves the problem of semantic interoperability between heterogeneous data sources in semantic level. By metadatas specifying the distributed, heterogeneous data and by describing semantic information of data source , having "ontology" as a common semantic model, semantic match is established through ontology mapping between heterogeneous data sources and semantic difference institutions are shielded, so that semantic heterogeneity problem of the heterogeneous data sources can be effectively solved. It provides an effective technology measure for the interior information of enterprises to be shared in time accurately.


2020 ◽  
Vol 6 ◽  
pp. e254
Author(s):  
Giuseppe Fusco ◽  
Lerina Aversano

Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications. Very often their structure is unknown to the large part of users. Moreover, the stored data is often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both data sources to be integrated and the global view.


2018 ◽  
Author(s):  
Larysse Silva ◽  
José Alex Lima ◽  
Nélio Cacho ◽  
Eiji Adachi ◽  
Frederico Lopes ◽  
...  

A notable characteristic of smart cities is the increase in the amount of available data generated by several devices and computational systems, thus augmenting the challenges related to the development of software that involves the integration of larges volumes of data. In this context, this paper presents a literature review aimed to identify the main strategies used in the development of solutions for data integration, relationship, and representation in smart cities. This study systematically selected and analyzed eleven studies published from 2015 to 2017. The achieved results reveal gaps regarding solutions for the continuous integration of heterogeneous data sources towards supporting application development and decision-making.


Database ◽  
2019 ◽  
Vol 2019 ◽  
Author(s):  
Ana Claudia Sima ◽  
Tarcisio Mendes de Farias ◽  
Erich Zbinden ◽  
Maria Anisimova ◽  
Manuel Gil ◽  
...  

Abstract Motivation: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases. Results: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology DS; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, is made available in a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.


2009 ◽  
pp. 2472-2488
Author(s):  
Angelo Brayner ◽  
Marcelo Meirelles ◽  
José de Aguiar Moraes Filho

Integrating data sources published on the Web requires an integration strategy that guarantees the local data sources’ autonomy. A multidatabase system (MDBS) has been consolidated as an approach to integrate multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is to guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called the MultiDatabase Language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery Language, called MXQuery, which supports queries over several data sources and solves such integration problems as semantic heterogeneity and incomplete information.


2013 ◽  
Vol 655-657 ◽  
pp. 1730-1733
Author(s):  
Lin Peng ◽  
Qiang Zheng ◽  
Zhao Rong Liu

To better share agricultural information in existed agricultural informatization condition, and to meet agro-departments new needs about local self-governed and global shared data management during standardized production of the sweet corn, this paper provides a method of integrated sharing of heterogeneous data sources to apply to standardized product of the sweet corn. This method solves the data integration and sharing problems during standardized production of the sweet corn. In this paper, the expert system for sweet corn standard production which is ability to combine heterogeneous data is constructed. This system is proved to be reliable, perform well and it is easy to operate.


Author(s):  
Juan M. Gómez ◽  
Ricardo Colomo ◽  
Marcos Ruano ◽  
Ángel García

Technological advances in high-throughput techniques and efficient data gathering methods, coupled computational biology efforts, have resulted in a vast amount of life science data often available in distributed and heterogeneous repositories. These repositories contain information such as sequence and structure data, annotations for biological data, results of complex computations, genetic sequences and multiple bio-datasets. However, the heterogeneity of these data, have created a need for research in resource integration and platform independent processing of investigative queries, involving heterogeneous data sources. When processing huge amounts of data, information integration is one of the most critical issues, because it’s crucial to preserve the intrinsic semantics of all the merged data sources. This integration would allow the proper organization of data, fostering the analysis and access the information to accomplish critical tasks, such as the processing of micro-array data to study protein function and medical researches in making detailed studies of protein structures to facilitate drug design (Ignacimuthu, 2005). Furthermore, DNA micro-array research community urgently requires technology to allow up-to-date micro-array data information to be found, accessed and delivered in a secure framework (Sinnot, 2007). Several research disciplines, such as Bioinformatics, where information integration is critical, could benefit from harnessing the potential of a new approach: the Semantic Web (SW). The SW term was coined by Berners-Lee, Hendler and Lassila (2001) to describe the evolution of a Web that consisted of largely documents for humans to read towards a new paradigm that included data and information for computers to manipulate. The SW is about adding machine-understandable and machine-processable metadata to Web resource through its key-enabling technology: ontologies (Fensel, 2002). Ontologies are a formal explicit and shared specification of a conceptualization. The SW was conceived as a way to solve the need for data integration on the Web. This article expounds SAMIDI, a Semantics-based Architecture for Micro-array Information and Data Integration. The most remarkable innovation offered by SAMIDI is the use of semantics as a tool for leveraging different vocabularies and terminologies and foster integration. SAMIDI is composed of a methodology for the unification of heterogeneous data sources from the analysis of the requirements of the unified data set and a software architecture.


2007 ◽  
pp. 199-219
Author(s):  
Angelo Brayner ◽  
Macelo Meireles ◽  
José de Aguiar Moraes Filho

Integrating data sources published on the web requires an integration strategy that guarantees local data sources autonomy. Multidatabase System (MDBS) has been consolidated as an approach to integrate multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is to guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called multidatabase language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery language, called MXQuery, which supports queries over several data sources and solves integration problems as semantic heterogeneity and incomplete information.


Sign in / Sign up

Export Citation Format

Share Document