Integrating Heterogeneous Data Sources in the Web

2009 ◽  
pp. 2472-2488
Author(s):  
Angelo Brayner ◽  
Marcelo Meirelles ◽  
José de Aguiar Moraes Filho

Integrating data sources published on the Web requires an integration strategy that guarantees the local data sources’ autonomy. A multidatabase system (MDBS) has been consolidated as an approach to integrate multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is to guarantee a higher degree of local autonomy. In order to adopt the MDBS strategy, it is necessary to use a query language, called the MultiDatabase Language (MDL), which provides the necessary constructs for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery Language, called MXQuery, which supports queries over several data sources and solves such integration problems as semantic heterogeneity and incomplete information.
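The two integration problems named in this abstract can be illustrated with a minimal Python sketch. It is not MXQuery and does not reproduce its syntax; the two XML sources, the `MAPPINGS` table, and the function names are all hypothetical. Semantic heterogeneity is handled by mapping each source's local tag names onto shared attribute names, and incomplete information by returning `None` when a source lacks an attribute.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Two hypothetical sources describing the same kind of entity with different tags.
SOURCE_A = "<library><book><title>XML Basics</title><price>30</price></book></library>"
SOURCE_B = "<catalog><item><name>Query Languages</name></item></catalog>"

# Hypothetical correspondences: shared attribute name -> local tag in each source.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "A": {"record": "book", "title": "title", "price": "price"},
    "B": {"record": "item", "title": "name", "price": "cost"},
}


def query_source(xml_text: str, mapping: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Return every record of one source, renamed to the shared attribute names."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(mapping["record"]):
        row = {}
        for shared_name, local_tag in mapping.items():
            if shared_name == "record":
                continue
            # findtext returns None when the element is absent, so a source
            # with incomplete information still contributes a row.
            row[shared_name] = record.findtext(local_tag)
        rows.append(row)
    return rows


def query_all(sources: Dict[str, str]) -> List[Dict[str, Optional[str]]]:
    """Run the same logical query over every source and concatenate the results."""
    result = []
    for source_id, xml_text in sources.items():
        result.extend(query_source(xml_text, MAPPINGS[source_id]))
    return result


rows = query_all({"A": SOURCE_A, "B": SOURCE_B})
```

Each source is queried independently through its own mapping, so no source has to change its schema to take part in the query.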

2007 ◽  
pp. 199-219
Author(s):  
Angelo Brayner ◽  
Marcelo Meireles ◽  
José de Aguiar Moraes Filho

Integrating data sources published on the Web requires an integration strategy that guarantees the autonomy of local data sources. The multidatabase system (MDBS) has been consolidated as an approach to integrating multiple heterogeneous and distributed data sources in flexible and dynamic environments such as the Web. A key property of MDBSs is that they guarantee a higher degree of local autonomy. To adopt the MDBS strategy, it is necessary to use a query language, called a multidatabase language (MDL), which provides the constructs needed for jointly manipulating and accessing data in heterogeneous data sources. In other words, the MDL is responsible for solving integration conflicts. This chapter describes an extension to the XQuery language, called MXQuery, which supports queries over several data sources and solves integration problems such as semantic heterogeneity and incomplete information.


Author(s):  
Barbara Catania ◽  
Elena Ferrari

The Web is characterized by a huge number of very heterogeneous data sources that differ in both media support and format representation. In this scenario, an integrated approach to querying heterogeneous Web documents is needed. For this purpose, XML can play an important role, since it is becoming a standard for data representation and exchange over the Web. Owing to its flexibility, XML is currently used as an interface language over the Web, through which (parts of) document sources are represented and exported. Under this assumption, the problem of querying heterogeneous sources can be reduced to the problem of querying XML data sources. In this chapter, we first survey the most relevant query languages for XML data proposed by both the scientific community and standardization committees, e.g., the W3C, focusing mainly on their expressive power. Then, we investigate how typical Information Retrieval concepts, such as ranking, similarity-based search, and profile-based search, can be applied to XML query languages. Commercial products based on the considered approaches are then briefly surveyed. Finally, we conclude the chapter with an overview of the most promising research trends in the field.


2014 ◽  
Vol 530-531 ◽  
pp. 809-812
Author(s):  
Gang Huang ◽  
Xiu Ying Wu ◽  
Man Yuan ◽  
Rui Fang Li

The Oil & Gas industry is moving forward with Integrated Operations (IO). There are different ways to achieve data integration, and ontology-based approaches have drawn much attention. This paper introduces an ontology-based distributed data integration framework (ODDIF). The framework addresses semantic interoperability between heterogeneous data sources at the semantic level. Metadata specify the distributed, heterogeneous data and describe the semantic information of each data source, with an ontology serving as the common semantic model. Semantic matches are established through ontology mapping between the heterogeneous data sources, and semantic differences between institutions are shielded, so that the semantic heterogeneity of the data sources can be effectively resolved. The proposed method reduces development difficulty, improves development efficiency, and enhances the maintainability and extensibility of the system.


Author(s):  
Naïma Souâd Ougouti ◽  
Hafida Belbachir ◽  
Youssef Amghar

The Semantic Web offers new opportunities for the field of multi-source integration, and many approaches, such as P2P systems, are being revisited to take the new requirements into account. In this chapter, the authors present their P2P system for integrating heterogeneous and distributed data. It is a super-peer system in which peers are grouped by type of data (relational, image, text, etc.) around a super-peer that holds a domain ontology. Peers' data sources are exported in a common format, as a semantically rich ontology. Schema reconciliation is done by matching domain and local ontologies using a similarity function that draws on the direct and indirect semantic neighborhood. Queries are described using ontologies and then routed to the relevant peers through a semantic topology built on top of the existing physical one.
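The abstract does not define the similarity function, so the following Python sketch is only one possible reading of "direct and indirect semantic neighborhood". It assumes that an ontology is a plain adjacency map, that neighbors are compared by exact label, that the indirect neighborhood means concepts two hops away, and that the two parts are combined with a hypothetical weight `direct_weight`. None of these choices comes from the chapter.

```python
from typing import Dict, Set

# Assumed representation: concept label -> labels of directly related concepts.
Ontology = Dict[str, Set[str]]


def direct_neighbors(onto: Ontology, concept: str) -> Set[str]:
    """Concepts one hop away from the given concept."""
    return set(onto.get(concept, set()))


def indirect_neighbors(onto: Ontology, concept: str) -> Set[str]:
    """Concepts exactly two hops away (neighbors of neighbors)."""
    direct = direct_neighbors(onto, concept)
    two_hop: Set[str] = set()
    for neighbor in direct:
        two_hop |= onto.get(neighbor, set())
    return two_hop - direct - {concept}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets count as no evidence (0.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def neighborhood_similarity(
    local: Ontology,
    local_concept: str,
    domain: Ontology,
    domain_concept: str,
    direct_weight: float = 0.7,  # hypothetical weighting, not from the source
) -> float:
    """Weighted mix of direct and indirect neighborhood overlap."""
    direct = jaccard(
        direct_neighbors(local, local_concept),
        direct_neighbors(domain, domain_concept),
    )
    indirect = jaccard(
        indirect_neighbors(local, local_concept),
        indirect_neighbors(domain, domain_concept),
    )
    return direct_weight * direct + (1.0 - direct_weight) * indirect


# Hypothetical local (peer) and domain (super-peer) ontologies.
local_onto = {"Painting": {"Artist", "Museum"}, "Artist": {"Country"}, "Museum": {"City"}}
domain_onto = {"Artwork": {"Artist", "Museum"}, "Artist": {"Country"}, "Museum": {"Address"}}

score = neighborhood_similarity(local_onto, "Painting", domain_onto, "Artwork")
```

Here `Painting` and `Artwork` share both direct neighbors and one of three indirect ones, so the score is high even though the two labels differ.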


Author(s):  
J. F. Aldana Montes ◽  
A. C. Gómez Lora ◽  
N. Moreno Vergara ◽  
I. Navas Delgado ◽  
M. M. Roldán Garcia

The database community has been seriously disturbed by the expansion of Web technologies. In particular, two reports have caused a special commotion in the database field. The first, the Asilomar report (Bernstein et al., 1998), sets out new directions for database research, anticipating the impact of the Web on the field. The second, Breaking out of the Box (Silberschatz & Zdonik, 1996), proposes how the database community should transfer its technology into Web technology. In this sense, the database box must be broken up into its autonomous functional components, which must then be used to solve the problem of integrating heterogeneous data sources.

