An Innovative Method to Extract Data in a Real-time Data Warehousing Environment

Mapping Intimacies ◽

10.5121/csit.2021.112401 ◽

2021 ◽

Author(s):

Flavio de Assis Vilela ◽

Ricardo Rodrigues Ciferri

Keyword(s):

Real Time ◽

Data Warehousing ◽

Data Extraction ◽

Synthetic Data ◽

Knowledge Discovery In Databases ◽

Data Repository ◽

Time Data ◽

Innovative Method ◽

Real Time Data ◽

Time Requirements

ETL (Extract, Transform, and Load) is an essential process for data extraction in knowledge discovery in databases and in data warehousing environments. The ETL process gathers data from operational sources, processes it, and stores it in an integrated data repository. The ETL process can also be performed in a real-time data warehousing environment, storing the data in a data warehouse. This paper presents a new and innovative method named Data Extraction Magnet (DEM), which performs the extraction phase of the ETL process in a real-time data warehousing environment and is based on non-intrusive, tag, and parallelism concepts. DEM has been validated in a dairy farming domain using synthetic data. The results showed a large performance gain over the traditional trigger technique and showed that the real-time requirements were met.


Issues and Handy Solutions Addressed at Every Stage in Real Time Data Warehousing, I.E. ETL (Extraction, Transformation & Loading)

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.e1100.0785s319 ◽

2019 ◽

Vol 8 (5S3) ◽

pp. 344-348

Keyword(s):

Real Time ◽

Data Warehouse ◽

Data Warehousing ◽

Time Data ◽

Processing Load ◽

Real Time Data ◽

Data Source

In standard ETL (Extract, Transform, Load), the data warehouse refresh must be performed outside of peak hours. This implies that operations and analysis are halted while the refresh runs. As a result, the data in the data warehouse does not reflect the latest operational transactions. This issue is known as data latency. Near real-time data warehousing is employed as a remedy for this issue. It updates the data warehouse in a near real-time fashion, immediately after data is found in the data source. Data latency can therefore be reduced. However, near real-time data warehousing has issues that were not identified in traditional ETL. This paper discusses the issues and available solutions at every stage of near real-time data warehousing, i.e., ETL (extraction, transformation, and loading). The issues and available solutions are based on a literature review of additional studies that focus on near real-time data warehousing issues.


Tuned X-HYBRIDJOIN for Near-Real-Time Data Warehousing

Web Technologies and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-642-37401-2_49 ◽

2013 ◽

pp. 494-505 ◽

Cited By ~ 2

Author(s):

M. Asif Naeem

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data


Cassandra-based data repository design for food supply chain traceability

VINE Journal of Information and Knowledge Management Systems ◽

10.1108/vjikms-08-2019-0119 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Sandeep Kumar Singh ◽

Mamata Jenamani

Keyword(s):

Supply Chain ◽

Real Time ◽

Radio Frequency Identification ◽

Query Language ◽

Data Repository ◽

Time Data ◽

Content Type ◽

The Real ◽

Traceability System ◽

Real Time Data

Purpose: The purpose of this paper is to design a supply chain database schema for Cassandra to store real-time data generated by Radio Frequency IDentification technology in a traceability system. Design/methodology/approach: The real-time data generated in such traceability systems are of high frequency and volume, making them difficult to handle with traditional relational database technologies. To overcome this difficulty, a NoSQL database repository based on Cassandra is proposed. The efficacy of the proposed schema is compared with two such databases, document-based MongoDB and column family-based Cassandra, which are suitable for storing traceability data. Findings: The proposed Cassandra-based data repository outperforms the traditional Structured Query Language-based and MongoDB systems from the literature in terms of concurrent reading, and performs on par with them for writing and updating of tracing queries. Originality/value: The proposed schema is able to store the real-time data generated in a supply chain with low latency. To test the performance of the Cassandra-based data repository, a test-bed is designed in the lab, and supply chain operations of the Indian Public Distribution System are simulated to generate data.


Bioterrorism Surveillance with Real-Time Data Warehousing

Intelligence and Security Informatics - Lecture Notes in Computer Science ◽

10.1007/3-540-44853-5_24 ◽

2003 ◽

pp. 322-335 ◽

Cited By ~ 12

Author(s):

Donald J. Berndt ◽

Alan R. Hevner ◽

James Studnicki

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data


Real-Time Data Warehousing: A Rewrite/Merge Approach

Data Warehousing and Knowledge Discovery - Lecture Notes in Computer Science ◽

10.1007/978-3-319-10160-6_8 ◽

2014 ◽

pp. 78-88 ◽

Cited By ~ 2

Author(s):

Alfredo Cuzzocrea ◽

Nickerson Ferreira ◽

Pedro Furtado

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data


Handling of internal inconsistency OLAP - Based lock table using Message Oriented Middleware in near real time data warehousing

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA) ◽

10.1109/isitia.2015.7220001 ◽

2015 ◽

Author(s):

Ardianto Wibowo ◽

Saiful Akbar

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Message Oriented Middleware ◽

Real Time Data ◽

Internal Inconsistency


Problems and available solutions on the stage of Extract, Transform, and Loading in near real-time data warehousing (a literature study)

2015 International Seminar on Intelligent Technology and Its Applications (ISITIA) ◽

10.1109/isitia.2015.7220004 ◽

2015 ◽

Cited By ~ 7

Author(s):

Ardianto Wibowo

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Literature Study ◽

Real Time Data


From Data Warehouses to Streaming Warehouses: A Survey on the Challenges for Real-Time Data Warehousing and Available Solutions

International Journal of Computer Applications ◽

10.5120/13984-1990 ◽

2013 ◽

Vol 81 (2) ◽

pp. 15-18

Author(s):

Revathy. S ◽

Saravana Balaji. B ◽

N. K. Karthikeyan

Keyword(s):

Real Time ◽

Data Warehousing ◽

Data Warehouses ◽

Time Data ◽

Real Time Data


Efficient Usage of Memory Resources in Near-Real-Time Data Warehousing

Communications in Computer and Information Science - Emerging Trends and Applications in Information Communication Technologies ◽

10.1007/978-3-642-28962-0_32 ◽

2012 ◽

pp. 326-337 ◽

Cited By ~ 1

Author(s):

Muhammad Asif Naeem ◽

Gillian Dobbie ◽

Gerald Weber ◽

Imran Sarwar Bajwa

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Real Time Data ◽

Memory Resources


Near Real-Time Data Warehousing with Multi-stage Trickle and Flip

Lecture Notes in Business Information Processing - Perspectives in Business Informatics Research ◽

10.1007/978-3-642-24511-4_6 ◽

2011 ◽

pp. 73-82 ◽

Cited By ~ 6

Author(s):

Janis Zuters

Keyword(s):

Real Time ◽

Data Warehousing ◽

Time Data ◽

Multi Stage ◽

Real Time Data
