NoSQL schema evolution and big data migration at scale

Author(s):  
Meike Klettke ◽  
Uta Störl ◽  
Manuel Shenavai ◽  
Stefanie Scherzinger
Energies ◽  
2020 ◽  
Vol 13 (17) ◽  
pp. 4508
Author(s):  
Xin Li ◽  
Liangyuan Wang ◽  
Jemal H. Abawajy ◽  
Xiaolin Qin ◽  
Giovanni Pau ◽  
...  

Efficient big data analysis is critical to support applications and services in Internet of Things (IoT) systems, especially time-intensive services. A data center may therefore host heterogeneous big data analysis tasks for multiple IoT systems. Scheduling is challenging because data centers usually need to place a large number of periodic or online tasks in a short time. In this paper, we investigate the heterogeneous task scheduling problem with the goal of reducing global task execution time, which is also an effective way to reduce the energy consumption of data centers. We model task execution for heterogeneous tasks based on data locality, which also captures the relationships among tasks, data blocks, and servers. We propose a heterogeneous task scheduling algorithm with data migration. The core idea of the algorithm is to maximize efficiency by comparing the cost of remote task execution against the cost of data migration; migrating data when it is cheaper improves data locality and reduces task execution time. We conduct extensive simulations, and the experimental results show that our algorithm outperforms traditional methods and that data migration indeed reduces the overall task execution time. The algorithm also achieves acceptable fairness across heterogeneous tasks.
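For intuition, the placement rule described in this abstract can be sketched as a simple cost comparison. This is a minimal illustration, not the paper's algorithm: the cost model and all parameters (`net_delay`, `bandwidth`, `block_size`, `accesses`) are assumptions introduced here for the example.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    compute_time: float  # execution time once its data block is local

# Hypothetical cost model (the paper's actual model is more detailed):
# remote execution pays a per-access network penalty, while migration
# pays a one-off block transfer but makes every later access local.
def remote_cost(task: Task, accesses: int, net_delay: float) -> float:
    return task.compute_time + accesses * net_delay

def migration_cost(task: Task, block_size: float, bandwidth: float) -> float:
    return task.compute_time + block_size / bandwidth

def place_task(task, block_server, target_server, *,
               accesses=1, net_delay=0.5, block_size=1024.0, bandwidth=100.0):
    """Decide between remote execution and data migration by comparing costs."""
    if block_server == target_server:
        return "local", task.compute_time  # data locality: no extra cost
    r = remote_cost(task, accesses, net_delay)
    m = migration_cost(task, block_size, bandwidth)
    return ("migrate", m) if m < r else ("remote", r)

print(place_task(Task("t1", 2.0), "s1", "s2", accesses=30))
# -> ('migrate', 12.24): migration wins once remote accesses get expensive
```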


Big data testing services deliver end-to-end testing methodologies that address big data challenges. The testing module includes two types of functionality: functional testing and non-functional testing. Functional testing should be performed at every stage of big data processing; it comprises extraction testing of big data sources, data migration testing, and testing of the big data ecosystem, and it completes the ETL test strategy, MapReduce job validation, multi-source data integration validation, and data duplication checks. Non-functional testing, in turn, ensures that there are no quality defects in the data and no performance-related issues; it covers security testing and performance testing, which address monitoring and the identification of bottlenecks.
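As a concrete example of the functional side, data migration testing often reduces to comparing source and target tables. The sketch below is a generic illustration of such checks (row counts, content checksums, duplicate detection), not part of any specific testing service; all names are hypothetical.

```python
import hashlib

def row_hash(row) -> int:
    """Hash one row by joining its fields into a canonical string."""
    return int(hashlib.sha256("|".join(map(str, row)).encode()).hexdigest(), 16)

def table_checksum(rows) -> int:
    # XOR-folding row hashes gives an order-independent table fingerprint;
    # duplicate rows cancel under XOR, so pair this with a duplicate check.
    digest = 0
    for row in rows:
        digest ^= row_hash(row)
    return digest

def validate_migration(source_rows, target_rows) -> dict:
    """Functional checks typical of data migration testing."""
    return {
        "row_count_matches": len(source_rows) == len(target_rows),
        "content_matches": table_checksum(source_rows) == table_checksum(target_rows),
        "no_duplicates": len(set(map(tuple, target_rows))) == len(target_rows),
    }

src = [(1, "Ada"), (2, "Grace")]
dst = [(2, "Grace"), (1, "Ada")]  # same content, different order
print(validate_migration(src, dst))  # all three checks pass
```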


2020 ◽  
Vol 7 (3) ◽  
pp. 1857-1871 ◽  
Author(s):  
Maria Kanwal ◽  
Asad Waqar Malik ◽  
Anis Ur Rahman ◽  
Imran Mahmood ◽  
Muhammad Shahzad

ETRI Journal ◽  
2014 ◽  
Vol 36 (6) ◽  
pp. 988-998 ◽  
Author(s):  
Hai Thanh Mai ◽  
Kyoung Hyun Park ◽  
Hun Soon Lee ◽  
Chang Soo Kim ◽  
Miyoung Lee ◽  
...  

2021 ◽  
Vol 13 (3) ◽  
pp. 1-15
Author(s):  
Rada Chirkova ◽  
Jon Doyle ◽  
Juan Reutter

Assessing and improving the quality of data are fundamental challenges in Big-Data applications. These challenges have given rise to numerous solutions targeting transformation, integration, and cleaning of data. However, while schema design, data cleaning, and data migration are nowadays reasonably well understood in isolation, not much attention has been given to the interplay between standalone tools in these areas. In this article, we focus on the problem of determining whether the available data-transforming procedures can be used together to bring about the desired quality characteristics of the data in business or analytics processes. For example, to help an organization avoid building a data-quality solution from scratch when facing a new analytics task, we ask whether the data quality can be improved by reusing the tools that are already available, and if so, which tools to apply, and in which order, all without presuming knowledge of the internals of the tools, which may be external or proprietary. Toward addressing this problem, we conduct a formal study in which individual data cleaning, data migration, or other data-transforming tools are abstracted as black-box procedures with only some of the properties exposed, such as their applicability requirements, the parts of the data that the procedure modifies, and the conditions that the data satisfy once the procedure has been applied. As a proof of concept, we provide foundational results on sequential applications of procedures abstracted in this way, to achieve prespecified data-quality objectives, for the use case of relational data and for procedures described by standard relational constraints. We show that, while reasoning in this framework may be computationally infeasible in general, there exist well-behaved cases in which these foundational results can be applied in practice for achieving desired data-quality results on Big Data.
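To make the black-box abstraction concrete, the sketch below models each tool by exactly the properties the abstract names: its applicability requirements, what it invalidates, and the conditions guaranteed after it runs, then searches for a sequence of tools that establishes a data-quality goal. This is a toy over opaque condition labels, not the authors' formalism based on relational constraints; all identifiers are hypothetical.

```python
from collections import deque
from dataclasses import dataclass

@dataclass(frozen=True)
class Procedure:
    name: str
    requires: frozenset                 # applicability conditions on the data
    ensures: frozenset                  # conditions guaranteed after application
    destroys: frozenset = frozenset()   # conditions no longer guaranteed afterwards

def plan(initial, goal, procedures, max_len=6):
    """Breadth-first search for a sequence of black-box tools whose
    combined effect establishes every goal condition."""
    start = frozenset(initial)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, seq = queue.popleft()
        if goal <= state:               # all goal conditions hold
            return seq
        if len(seq) == max_len:
            continue
        for p in procedures:
            if p.requires <= state:     # tool is applicable in this state
                nxt = (state - p.destroys) | p.ensures
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, seq + [p.name]))
    return None                         # goal unreachable with these tools

tools = [
    Procedure("standardize_dates", frozenset({"ingested"}), frozenset({"dates_iso8601"})),
    Procedure("dedup", frozenset({"ingested"}), frozenset({"unique_keys"})),
    Procedure("ingest", frozenset(), frozenset({"ingested"})),
]
print(plan(frozenset(), frozenset({"unique_keys", "dates_iso8601"}), tools))
# -> ['ingest', 'standardize_dates', 'dedup']
```

As the abstract notes, reasoning of this kind is computationally infeasible in general; the bounded breadth-first search here only illustrates the well-behaved case where the condition vocabulary is small and finite.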


