A rewrite/merge approach for supporting real-time data warehousing via lightweight data integration

In standard ETL (Extract, Transform, Load), data warehouse refreshment must be performed outside of peak hours. This means that operational and analysis activities are both halted during the refresh. As a result, the data in the data warehouse is not fresh: it does not reflect the latest operational transactions. This issue is known as data latency. Near real-time data warehousing is employed as a remedy for this issue. It updates the data warehouse in a near real-time fashion, immediately after new data is found at the data source, so data latency can be reduced. However, near real-time data warehousing has issues of its own that do not arise in traditional ETL. This paper discusses the issues and the available solutions at every stage of near real-time data warehousing. The issues and available solutions are drawn from a literature review of other studies that focus on near real-time data warehousing.

Tuned X-HYBRIDJOIN for Near-Real-Time Data Warehousing

Web Technologies and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-642-37401-2_49 ◽

2013 ◽

pp. 494-505 ◽

Cited By ~ 2

Author(s):

M. Asif Naeem

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data

Big Data Management in the Context of Real-Time Data Warehousing

Big Data Management, Technologies, and Applications - Advances in Data Mining and Database Management ◽

10.4018/978-1-4666-4699-5.ch007 ◽

2013 ◽

pp. 150-176

Author(s):

M. Asif Naeem ◽

Gillian Dobbie ◽

Gerald Weber

Keyword(s):

Big Data ◽

Data Integration ◽

Real Time ◽

Real Life ◽

Skewed Distribution ◽

Stream Data ◽

Time Data ◽

Master Data ◽

Real Time Data ◽

Resource Aware

In order to make timely and effective decisions, businesses need the latest information from big data warehouse repositories. To keep these repositories up to date, real-time data integration is required. An important phase in real-time data integration is data transformation where a stream of updates, which is huge in volume and infinite, is joined with large disk-based master data. Stream processing is an important concept in Big Data, since large volumes of data are often best processed immediately. A well-known algorithm called Mesh Join (MESHJOIN) was proposed to process stream data with disk-based master data, which uses limited memory. MESHJOIN is a candidate for a resource-aware system setup. The problem that the authors consider in this chapter is that MESHJOIN is not very selective. In particular, the performance of the algorithm is always inversely proportional to the size of the master data table. As a consequence, the resource consumption is in some scenarios suboptimal. They present an algorithm called Cache Join (CACHEJOIN), which performs asymptotically at least as well as MESHJOIN but performs better in realistic scenarios, particularly if parts of the master data are used with different frequencies. In order to quantify the performance differences, the authors compare both algorithms with a synthetic dataset of a known skewed distribution as well as TPC-H and real-life datasets.
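
As a rough illustration of the idea behind caching frequently used master data in a stream join, the following Python sketch probes an in-memory cache before falling back to a master-data lookup. It is a simplified sketch under stated assumptions, not the CACHEJOIN or MESHJOIN algorithm as published: the names `cache_join`, `master_lookup`, `cache_size`, and `promote_after` are hypothetical, and a plain dictionary lookup stands in for disk access.

```python
from collections import Counter, OrderedDict


def cache_join(stream, master_lookup, cache_size=2, promote_after=2):
    """Join stream tuples with master data, caching frequently used rows.

    stream        -- iterable of (key, payload) update tuples
    master_lookup -- callable key -> master row or None; a stand-in for
                     an expensive disk read (an assumption of this sketch)
    cache_size    -- maximum number of master rows kept in memory
    promote_after -- times a key must be seen before its row is cached
    Returns (joined_rows, simulated_disk_reads).
    """
    cache = OrderedDict()  # hot master rows, kept in least-recently-used order
    seen = Counter()       # per-key frequency observed in the stream
    disk_reads = 0
    joined = []

    for key, payload in stream:
        seen[key] += 1
        if key in cache:
            # Cache hit: no disk access needed for this stream tuple.
            cache.move_to_end(key)
            row = cache[key]
        else:
            # Cache miss: fall back to the (simulated) disk-based master data.
            row = master_lookup(key)
            disk_reads += 1
            if row is not None and seen[key] >= promote_after:
                cache[key] = row
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict least recently used row
        if row is not None:
            joined.append((key, payload, row))

    return joined, disk_reads


if __name__ == "__main__":
    master = {1: "a", 2: "b", 3: "c"}          # toy master data
    updates = [(1, "x1"), (1, "x2"), (1, "x3"),
               (2, "y1"), (1, "x4"), (9, "z")]  # skewed toward key 1
    rows, reads = cache_join(updates, master.get)
    print(len(rows), "joined rows,", reads, "simulated disk reads")
```

With a skewed stream, in which a few keys dominate, most probes are answered from the cache, which is the kind of scenario in which the abstract reports that a cache-based join performs better.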

Bioterrorism Surveillance with Real-Time Data Warehousing

Intelligence and Security Informatics - Lecture Notes in Computer Science ◽

10.1007/3-540-44853-5_24 ◽

2003 ◽

pp. 322-335 ◽

Cited By ~ 12

Author(s):

Donald J. Berndt ◽

Alan R. Hevner ◽

James Studnicki

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data

Introduction to Real-Time Data Integration

Managing Data in Motion ◽

10.1016/b978-0-12-397167-8.00011-x ◽

2013 ◽

pp. 77-78

Author(s):

April Reeve

Keyword(s):

Data Integration ◽

Real Time ◽

Time Data ◽

Real Time Data

Real-Time Data Warehousing: A Rewrite/Merge Approach

Data Warehousing and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/978-3-319-10160-6_8 ◽

2014 ◽

pp. 78-88 ◽

Cited By ~ 2

Author(s):

Alfredo Cuzzocrea ◽

Nickerson Ferreira ◽

Pedro Furtado

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data

Continuous Improvement through Real-Time Data Integration into Reservoir Management Workflows

10.2118/128660-ms ◽

2010 ◽

Cited By ~ 4

Author(s):

Tor K. Kragas ◽

Oktay Metin Gokdemir

Keyword(s):

Data Integration ◽

Real Time ◽

Continuous Improvement ◽

Reservoir Management ◽

Time Data ◽

Real Time Data

Handling of internal inconsistency OLAP-based lock table using Message Oriented Middleware in near real time data warehousing

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA) ◽

10.1109/isitia.2015.7220001 ◽

2015 ◽

Author(s):

Ardianto Wibowo ◽

Saiful Akbar

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Message Oriented Middleware ◽

Real Time Data ◽

Internal Inconsistency

Problems and available solutions on the stage of Extract, Transform, and Loading in near real-time data warehousing (a literature study)

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA) ◽

10.1109/isitia.2015.7220004 ◽

2015 ◽

Cited By ~ 7

Author(s):

Ardianto Wibowo

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Literature Study ◽

Real Time Data
