Application of Hadoop Based Distributed Data Cleaning Technology in Periodical Meta Data Integration

Author(s): Xiuyuan Liu, Aizhang Guo, Tao Sun
2018, Vol 176, pp. 03003
Author(s): Guang Yang, Zhihan Yang, Zhijie Han

This paper studies Shanghai's real property registration, examining data integration for the information database of unified real property registration in Shanghai and the establishment of its information systems. Data cleaning and data integration are used to consolidate all kinds of basic real property data, which have complex structures and differing formats, building a basic information database for real property and providing strong support for the establishment of the information platform.
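The consolidation step the abstract describes can be sketched minimally: records for the same parcel arrive from different source systems with inconsistent field names and formats, and cleaning maps them to one unified schema before merging. The field names and formats below are illustrative assumptions, not the paper's actual schema.

```python
def clean_record(raw: dict) -> dict:
    """Map a raw record with inconsistent keys/formats onto a unified schema."""
    # Tolerate several source spellings for the same logical field.
    parcel = raw.get("parcel_id") or raw.get("ParcelID") or raw.get("parcel")
    owner = (raw.get("owner") or raw.get("Owner") or "").strip().title()
    # Normalize area given either as a number or as a "88.5 m2" string.
    area = raw.get("area_m2") or raw.get("area")
    if isinstance(area, str):
        area = float(area.lower().replace("m2", "").strip())
    return {"parcel_id": str(parcel), "owner": owner, "area_m2": float(area)}

def integrate(sources):
    """Merge cleaned records from all sources, keyed by parcel id."""
    db = {}
    for src in sources:
        for raw in src:
            rec = clean_record(raw)
            db[rec["parcel_id"]] = {**db.get(rec["parcel_id"], {}), **rec}
    return db

# Two hypothetical source systems describing the same parcel differently:
cadastre = [{"ParcelID": "SH-001", "Owner": " li wei ", "area": "88.5 m2"}]
housing = [{"parcel_id": "SH-001", "owner": "Li Wei", "area_m2": 88.5}]
print(integrate([cadastre, housing]))
```

Deduplicating by a stable key (here the parcel id) is what lets heterogeneous records collapse into one row of the basic information database.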


Author(s): Bartosz Dobrzelecki, Amrey Krause, Alastair C. Hume, Alistair Grant, Mario Antonioletti, ...

OGSA-DAI (Open Grid Services Architecture Data Access and Integration) is a framework for building distributed data access and integration systems. Until recently, it lacked the built-in functionality that would allow easy creation of federations of distributed data sources. The latest release of the OGSA-DAI framework introduced the OGSA-DAI DQP (Distributed Query Processing) resource. The new resource encapsulates a distributed query processor that is able to orchestrate distributed data sources when answering declarative user queries. The query processor has many extensibility points, making it easy to customize. We have also introduced a new OGSA-DAI Views resource that provides a flexible method for defining views over relational data. The interoperability of the two new resources, together with the flexibility of the OGSA-DAI framework, allows the building of highly customized data integration solutions.
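The core DQP idea, orchestrating sources to answer one declarative query, can be sketched as a coordinator that pushes the selection down to each source and merges the partial results. The classes and query shape below are illustrative assumptions, not the OGSA-DAI API.

```python
class TableSource:
    """One participating data source holding rows of a shared relation."""
    def __init__(self, rows):
        self.rows = rows

    def select(self, predicate):
        # Predicate push-down: only matching rows leave the source.
        return [r for r in self.rows if predicate(r)]

class QueryProcessor:
    """Orchestrates all sources to answer a query over the federated view."""
    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate, project):
        merged = []
        for src in self.sources:
            merged.extend(src.select(predicate))
        return sorted(project(r) for r in merged)

site_a = TableSource([{"name": "alice", "age": 34}, {"name": "bob", "age": 19}])
site_b = TableSource([{"name": "carol", "age": 41}])
dqp = QueryProcessor([site_a, site_b])
# "SELECT name WHERE age > 30" evaluated across the federation:
print(dqp.query(lambda r: r["age"] > 30, lambda r: r["name"]))  # → ['alice', 'carol']
```

A real DQP additionally plans joins across sources and exposes extensibility points; this sketch only shows the select-then-merge skeleton.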


Author(s): Mikko Koho, Petri Leskinen, Eero Hyvönen

Abstract: Semantic data integration from heterogeneous, distributed data silos enables Digital Humanities research and application development employing a larger, mutually enriched and interlinked knowledge graph. However, data integration is challenging, involving aligning the data models and reconciling the concepts and named entities, such as persons and places. This paper presents a record linkage process to reconcile person references in different military historical person registers with structured metadata. The information about persons is aggregated into a single knowledge graph. The process was applied to reconcile three person registers of the popular semantic portal “WarSampo – Finnish World War 2 on the Semantic Web”. The registers contain detailed information about some 100 000 people and are individually maintained by domain experts. Thus, the integration process needs to be automatic and adaptable to changes in the registers. An evaluation of the record linkage results is promising and provides some insight into military person register reconciliation in general.
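A minimal record-linkage sketch of the kind the abstract describes: candidate pairs are first blocked on a shared structured attribute, then matched by fuzzy name similarity. The field names and the 0.8 threshold are assumptions for illustration; WarSampo's actual pipeline uses richer structured metadata.

```python
from difflib import SequenceMatcher

def link(register_a, register_b, threshold=0.8):
    """Return pairs of names judged to refer to the same person."""
    # Blocking: only compare records that share a birth year,
    # which keeps the comparison count far below |A| x |B|.
    by_year = {}
    for b in register_b:
        by_year.setdefault(b["born"], []).append(b)
    links = []
    for a in register_a:
        for b in by_year.get(a["born"], []):
            score = SequenceMatcher(None, a["name"].lower(),
                                    b["name"].lower()).ratio()
            if score >= threshold:
                links.append((a["name"], b["name"]))
    return links

reg_a = [{"name": "Mikko Virtanen", "born": 1918}]
reg_b = [{"name": "Mikko Wirtanen", "born": 1918},   # spelling variant: match
         {"name": "M. Korhonen", "born": 1918}]      # same year, different person
print(link(reg_a, reg_b))  # → [('Mikko Virtanen', 'Mikko Wirtanen')]
```

Because the registers are maintained independently and keep changing, an automatic, threshold-based process like this can simply be re-run after each register update.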


2019, Vol 1213, pp. 022021
Author(s): Chuan Li, Yunqi Hou, Zhang Yu

2011, Vol 366, pp. 45-49
Author(s): Jie Liu, Chuan Sheng Zhou

Today, data with hidden knowledge drives almost every activity in business and enterprises. Computer technology has successfully solved the problems of data storage, querying, usability, and transmission, but how to integrate all of this huge, distributed, heterogeneous data for high-level applications is still a critical problem. Drawing on research into software bus and multi-agent technologies, this paper illustrates a multi-agent-based design architecture for data integration.
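The architecture the abstract outlines can be sketched as follows: each agent wraps one heterogeneous source, and a software-bus-style mediator broadcasts a request to every registered agent and combines their answers. The class names and the key/value store are illustrative assumptions, not the paper's design.

```python
class SourceAgent:
    """Agent translating a shared request into its local source's terms."""
    def __init__(self, name, data):
        self.name = name
        self.data = data  # local store: key -> value

    def handle(self, key):
        # Each agent answers only for data it actually holds.
        return {self.name: self.data[key]} if key in self.data else {}

class Bus:
    """Software bus: registers agents and routes each request to all of them."""
    def __init__(self):
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)

    def request(self, key):
        merged = {}
        for agent in self.agents:
            merged.update(agent.handle(key))
        return merged

bus = Bus()
bus.register(SourceAgent("crm", {"cust-7": "Alice"}))
bus.register(SourceAgent("billing", {"cust-7": "overdue"}))
print(bus.request("cust-7"))  # → {'crm': 'Alice', 'billing': 'overdue'}
```

Decoupling requesters from sources through the bus is what lets new heterogeneous sources join the integration by registering one more agent, with no change to existing clients.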


2014, Vol 530-531, pp. 809-812
Author(s): Gang Huang, Xiu Ying Wu, Man Yuan, Rui Fang Li

The Oil & Gas industry is moving toward Integrated Operations (IO). There are different ways to achieve data integration, and ontology-based approaches have drawn much attention. This paper introduces an ontology-based distributed data integration framework (ODDIF). The framework resolves the problem of semantic interoperability between heterogeneous data sources at the semantic level. Metadata specifies the distributed, heterogeneous data and describes the semantic information of each data source; with an ontology as the common semantic model, semantic matches are established through ontology mapping between heterogeneous data sources, shielding their semantic differences so that the semantic heterogeneity problem can be effectively solved. The proposed method reduces development difficulty, improves development efficiency, and enhances the maintainability and extensibility of the system.
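The ontology-mapping idea can be sketched minimally: each source publishes a mapping from its local field names to shared ontology terms, so a query phrased against the ontology can be answered by any source regardless of its local schema. The ontology terms and mappings below are invented for illustration, not ODDIF's actual model.

```python
ONTOLOGY = {"WellPressure", "WellName"}  # shared semantic model

class MappedSource:
    def __init__(self, rows, mapping):
        # mapping: local field name -> ontology term
        assert set(mapping.values()) <= ONTOLOGY, "unmapped ontology term"
        self.rows, self.mapping = rows, mapping

    def fetch(self, term):
        """Answer an ontology-level query by translating it to local fields."""
        local = {loc for loc, t in self.mapping.items() if t == term}
        return [row[f] for row in self.rows for f in local if f in row]

# Two sources with different local schemas for the same concept:
scada = MappedSource([{"p_bar": 210.5, "well": "A-12"}],
                     {"p_bar": "WellPressure", "well": "WellName"})
historian = MappedSource([{"pressure": 208.9}],
                         {"pressure": "WellPressure"})

# One ontology term retrieves semantically equivalent data from both:
values = [v for s in (scada, historian) for v in s.fetch("WellPressure")]
print(values)  # → [210.5, 208.9]
```

The mapping layer is what shields clients from each source's schema: adding a source only requires declaring its local-field-to-ontology mapping.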

