Application of Hadoop Based Distributed Data Cleaning Technology in Periodical Meta Data Integration

Author(s): Xiuyuan Liu, Aizhang Guo, Tao Sun
2018, Vol 176, pp. 03003
Author(s): Guang Yang, Zhihan Yang, Zhijie Han

This paper studies Shanghai's real property registration, examining data integration for the information database of unified real property registration in Shanghai and the establishment of its information systems. Data cleaning and data integration are used to consolidate all kinds of basic real property data, which have complex structures and differing formats, building a basic information database for real property and providing strong support for the establishment of the information platform.
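The consolidation step the abstract describes can be sketched minimally: records for the same parcel arrive from different source systems with inconsistent field names and formats, and cleaning maps them to one unified schema before merging. The field names and formats below are illustrative assumptions, not the paper's actual schema.

```python
def clean_record(raw: dict) -> dict:
    """Map a raw record with inconsistent keys/formats onto a unified schema."""
    # Tolerate several source spellings for the same logical field.
    parcel = raw.get("parcel_id") or raw.get("ParcelID") or raw.get("parcel")
    owner = (raw.get("owner") or raw.get("Owner") or "").strip().title()
    # Normalize area given either as a number or as a "88.5 m2" string.
    area = raw.get("area_m2") or raw.get("area")
    if isinstance(area, str):
        area = float(area.lower().replace("m2", "").strip())
    return {"parcel_id": str(parcel), "owner": owner, "area_m2": float(area)}

def integrate(sources):
    """Merge cleaned records from all sources, keyed by parcel id."""
    db = {}
    for src in sources:
        for raw in src:
            rec = clean_record(raw)
            db[rec["parcel_id"]] = {**db.get(rec["parcel_id"], {}), **rec}
    return db

# Two hypothetical source systems describing the same parcel differently:
cadastre = [{"ParcelID": "SH-001", "Owner": " li wei ", "area": "88.5 m2"}]
housing = [{"parcel_id": "SH-001", "owner": "Li Wei", "area_m2": 88.5}]
print(integrate([cadastre, housing]))
```

Deduplicating by a stable key (here the parcel id) is what lets heterogeneous records collapse into one row of the basic information database.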


Author(s): Bartosz Dobrzelecki, Amrey Krause, Alastair C. Hume, Alistair Grant, Mario Antonioletti, ...

OGSA-DAI (Open Grid Services Architecture Data Access and Integration) is a framework for building distributed data access and integration systems. Until recently, it lacked the built-in functionality that would allow easy creation of federations of distributed data sources. The latest release of the OGSA-DAI framework introduced the OGSA-DAI DQP (Distributed Query Processing) resource. The new resource encapsulates a distributed query processor that is able to orchestrate distributed data sources when answering declarative user queries. The query processor has many extensibility points, making it easy to customize. We have also introduced a new OGSA-DAI Views resource that provides a flexible method for defining views over relational data. The interoperability of the two new resources, together with the flexibility of the OGSA-DAI framework, allows the building of highly customized data integration solutions.
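The core DQP idea, orchestrating sources to answer one declarative query, can be sketched as a coordinator that pushes the selection down to each source and merges the partial results. The classes and query shape below are illustrative assumptions, not the OGSA-DAI API.

```python
class TableSource:
    """One participating data source holding rows of a shared relation."""
    def __init__(self, rows):
        self.rows = rows

    def select(self, predicate):
        # Predicate push-down: only matching rows leave the source.
        return [r for r in self.rows if predicate(r)]

class QueryProcessor:
    """Orchestrates all sources to answer a query over the federated view."""
    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate, project):
        merged = []
        for src in self.sources:
            merged.extend(src.select(predicate))
        return sorted(project(r) for r in merged)

site_a = TableSource([{"name": "alice", "age": 34}, {"name": "bob", "age": 19}])
site_b = TableSource([{"name": "carol", "age": 41}])
dqp = QueryProcessor([site_a, site_b])
# "SELECT name WHERE age > 30" evaluated across the federation:
print(dqp.query(lambda r: r["age"] > 30, lambda r: r["name"]))  # → ['alice', 'carol']
```

A real DQP additionally plans joins across sources and exposes extensibility points; this sketch only shows the select-then-merge skeleton.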


Author(s): Mikko Koho, Petri Leskinen, Eero Hyvönen

Abstract: Semantic data integration from heterogeneous, distributed data silos enables Digital Humanities research and application development employing a larger, mutually enriched and interlinked knowledge graph. However, data integration is challenging, involving aligning the data models and reconciling the concepts and named entities, such as persons and places. This paper presents a record linkage process to reconcile person references in different military historical person registers with structured metadata. The information about persons is aggregated into a single knowledge graph. The process was applied to reconcile three person registers of the popular semantic portal “WarSampo – Finnish World War 2 on the Semantic Web”. The registers contain detailed information about some 100 000 people and are individually maintained by domain experts. Thus, the integration process needs to be automatic and adaptable to changes in the registers. An evaluation of the record linkage results is promising and provides some insight into military person register reconciliation in general.
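A minimal record-linkage sketch of the kind the abstract describes: candidate pairs are first blocked on a shared structured attribute, then matched by fuzzy name similarity. The field names and the 0.8 threshold are assumptions for illustration; WarSampo's actual pipeline uses richer structured metadata.

```python
from difflib import SequenceMatcher

def link(register_a, register_b, threshold=0.8):
    """Return pairs of names judged to refer to the same person."""
    # Blocking: only compare records that share a birth year,
    # which keeps the comparison count far below |A| x |B|.
    by_year = {}
    for b in register_b:
        by_year.setdefault(b["born"], []).append(b)
    links = []
    for a in register_a:
        for b in by_year.get(a["born"], []):
            score = SequenceMatcher(None, a["name"].lower(),
                                    b["name"].lower()).ratio()
            if score >= threshold:
                links.append((a["name"], b["name"]))
    return links

reg_a = [{"name": "Mikko Virtanen", "born": 1918}]
reg_b = [{"name": "Mikko Wirtanen", "born": 1918},   # spelling variant: match
         {"name": "M. Korhonen", "born": 1918}]      # same year, different person
print(link(reg_a, reg_b))  # → [('Mikko Virtanen', 'Mikko Wirtanen')]
```

Because the registers are maintained independently and keep changing, an automatic, threshold-based process like this can simply be re-run after each register update.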


2019, Vol 1213, pp. 022021
Author(s): Chuan Li, Yunqi Hou, Zhang Yu

2011, Vol 366, pp. 45-49
Author(s): Jie Liu, Chuan Sheng Zhou

Today, data with hidden knowledge drives almost every activity in business and enterprises. Computer technology has successfully solved the problems of data storage, querying, usability, and transmission, but how to integrate all of this huge, distributed, heterogeneous data for high-level applications is still a critical problem. Drawing on research into software bus and multi-agent technologies, this paper illustrates a multi-agent-based design architecture for data integration.
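The architecture the abstract outlines can be sketched as follows: each agent wraps one heterogeneous source, and a software-bus-style mediator broadcasts a request to every registered agent and combines their answers. The class names and the key/value store are illustrative assumptions, not the paper's design.

```python
class SourceAgent:
    """Agent translating a shared request into its local source's terms."""
    def __init__(self, name, data):
        self.name = name
        self.data = data  # local store: key -> value

    def handle(self, key):
        # Each agent answers only for data it actually holds.
        return {self.name: self.data[key]} if key in self.data else {}

class Bus:
    """Software bus: registers agents and routes each request to all of them."""
    def __init__(self):
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)

    def request(self, key):
        merged = {}
        for agent in self.agents:
            merged.update(agent.handle(key))
        return merged

bus = Bus()
bus.register(SourceAgent("crm", {"cust-7": "Alice"}))
bus.register(SourceAgent("billing", {"cust-7": "overdue"}))
print(bus.request("cust-7"))  # → {'crm': 'Alice', 'billing': 'overdue'}
```

Decoupling requesters from sources through the bus is what lets new heterogeneous sources join the integration by registering one more agent, with no change to existing clients.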


2014, Vol 530-531, pp. 809-812
Author(s): Gang Huang, Xiu Ying Wu, Man Yuan, Rui Fang Li

The Oil & Gas industry is moving toward Integrated Operations (IO). There are different ways to achieve data integration, and ontology-based approaches have drawn much attention. This paper introduces an ontology-based distributed data integration framework (ODDIF). The framework resolves the problem of semantic interoperability between heterogeneous data sources at the semantic level. Metadata specifies the distributed, heterogeneous data and describes the semantic information of each data source; with an ontology as the common semantic model, semantic matches are established through ontology mapping between heterogeneous data sources, shielding their semantic differences so that the semantic heterogeneity problem can be effectively solved. The proposed method reduces development difficulty, improves development efficiency, and enhances the maintainability and extensibility of the system.
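The ontology-mapping idea can be sketched minimally: each source publishes a mapping from its local field names to shared ontology terms, so a query phrased against the ontology can be answered by any source regardless of its local schema. The ontology terms and mappings below are invented for illustration, not ODDIF's actual model.

```python
ONTOLOGY = {"WellPressure", "WellName"}  # shared semantic model

class MappedSource:
    def __init__(self, rows, mapping):
        # mapping: local field name -> ontology term
        assert set(mapping.values()) <= ONTOLOGY, "unmapped ontology term"
        self.rows, self.mapping = rows, mapping

    def fetch(self, term):
        """Answer an ontology-level query by translating it to local fields."""
        local = {loc for loc, t in self.mapping.items() if t == term}
        return [row[f] for row in self.rows for f in local if f in row]

# Two sources with different local schemas for the same concept:
scada = MappedSource([{"p_bar": 210.5, "well": "A-12"}],
                     {"p_bar": "WellPressure", "well": "WellName"})
historian = MappedSource([{"pressure": 208.9}],
                         {"pressure": "WellPressure"})

# One ontology term retrieves semantically equivalent data from both:
values = [v for s in (scada, historian) for v in s.fetch("WellPressure")]
print(values)  # → [210.5, 208.9]
```

The mapping layer is what shields clients from each source's schema: adding a source only requires declaring its local-field-to-ontology mapping.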

