Integrating Heterogeneous Data for Big Data Analysis

Author(s): Richard Millham

Data is an integral part of most business-critical applications. As business data grows in volume and variety, driven by technological, business, and other factors, managing it becomes more difficult. Data virtualization has emerged as a new paradigm for data management. Although much research has addressed techniques for storing huge amounts of data accurately and processing it with optimal resource utilization, how to handle divergent data from multiple data sources remains an open research question. In this chapter, the authors first look at the emerging problem of “big data,” briefly introduce data virtualization, and describe an existing system that implements it. Because data virtualization requires techniques to integrate data, the authors then examine the problems of divergent data in terms of value, syntactic, semantic, and structural differences. Some proposed methods for resolving these differences are examined, with the aim of mapping this divergent data into a homogeneous global schema that can more easily be used for big data analysis. Finally, some tools and industrial examples are given to demonstrate different approaches to heterogeneous data integration.

