Towards an Automated Testing Framework for Big Data

Big data testing services deliver end-to-end testing methodologies that address big data challenges. The testing module includes two types of functionality: functional testing and non-functional testing. Functional testing should be performed at every stage of big data processing; it covers extraction testing of big data sources, data migration testing, and big data ecosystem testing, completing the ETL test strategy, MapReduce job validation, multi-core data integration validation, and data duplication checks. Non-functional testing, on the other hand, ensures that there are no quality defects in the data and no performance-related issues; it covers security testing and performance testing, which address the problems of monitoring and identifying bottlenecks.
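As a concrete illustration of two of the functional checks named above, the following Python sketch reconciles a source extract against a migrated target and flags duplicate business keys. It assumes pandas DataFrames; the function name and key columns are hypothetical and not part of any specific testing product.

```python
# Minimal sketch of a migration-validation and duplication check.
# All names (validate_migration, the key columns) are illustrative.
import pandas as pd

def validate_migration(source: pd.DataFrame, target: pd.DataFrame,
                       keys: list[str]) -> dict:
    """Compare extracted source data against migrated target data."""
    report = {
        "source_rows": len(source),
        "target_rows": len(target),
        "row_count_match": len(source) == len(target),
        # Duplication check: target rows sharing the same business key.
        "duplicate_keys": int(target.duplicated(subset=keys).sum()),
    }
    # Rows present in the source but missing after migration.
    merged = source.merge(target[keys].drop_duplicates(), on=keys,
                          how="left", indicator=True)
    report["missing_rows"] = int((merged["_merge"] == "left_only").sum())
    return report
```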

2021, Vol 8 (1)
Author(s): Syed Iftikhar Hussain Shah, Vassilios Peristeras, Ioannis Magnisalis

The public sector, private firms, the business community, and civil society are generating data that is high in volume, velocity, and veracity and comes from a diversity of sources. This kind of data is known as big data. Public Administrations (PAs) pursue big data as the “new oil” and implement data-centric policies to transform data into knowledge, to promote good governance, transparency, innovative digital services, and citizens’ engagement in public policy. From the above, the Government Big Data Ecosystem (GBDE) emerges. Managing big data throughout its lifecycle is a challenging task for governmental organizations, and despite the vast interest in this ecosystem, appropriate big data management remains a challenge. This study intends to fill this gap by proposing a data lifecycle framework for data-driven governments. Through a systematic literature review, we identified and analysed 76 data lifecycle models to propose a data lifecycle framework for data-driven governments (DaliF). In this way, we contribute to the ongoing discussion around big data management, which attracts researchers’ and practitioners’ interest.
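To make the lifecycle idea concrete, the sketch below tracks a dataset through explicit, governed stage transitions. The abstract does not enumerate DaliF’s actual phases, so the stages here are generic ones common to many of the reviewed lifecycle models, not DaliF itself.

```python
# Sketch: enforcing ordered lifecycle stages for a government dataset.
# Stage names are generic placeholders, NOT the DaliF phases.
from enum import Enum, auto

class Stage(Enum):
    PLAN = auto()
    COLLECT = auto()
    PROCESS = auto()
    STORE = auto()
    PUBLISH = auto()
    ARCHIVE = auto()
    DISPOSE = auto()

# Permitted transitions, so a dataset cannot skip governance steps.
TRANSITIONS = {
    Stage.PLAN: {Stage.COLLECT},
    Stage.COLLECT: {Stage.PROCESS},
    Stage.PROCESS: {Stage.STORE},
    Stage.STORE: {Stage.PUBLISH, Stage.ARCHIVE},
    Stage.PUBLISH: {Stage.ARCHIVE},
    Stage.ARCHIVE: {Stage.DISPOSE},
}

def advance(current: Stage, nxt: Stage) -> Stage:
    """Move a dataset to the next stage, rejecting illegal jumps."""
    if nxt not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt
```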


Energies, 2020, Vol 13 (17), pp. 4508
Author(s): Xin Li, Liangyuan Wang, Jemal H. Abawajy, Xiaolin Qin, Giovanni Pau, ...

Efficient big data analysis is critical to support applications and services in Internet of Things (IoT) systems, especially time-intensive services. A data center may therefore host heterogeneous big data analysis tasks for multiple IoT systems, which is a challenging problem because data centers usually need to schedule a large number of periodic or online tasks in a short time. In this paper, we investigate the heterogeneous task scheduling problem with the goal of reducing global task execution time, which is also an efficient way to reduce the energy consumption of data centers. We model task execution for each class of heterogeneous tasks based on data locality, which also captures the relationships among tasks, data blocks, and servers. We then propose a heterogeneous task scheduling algorithm with data migration. The core idea of the algorithm is to maximize efficiency by comparing the cost of remote task execution against the cost of data migration, which improves data locality and reduces task execution time. We conduct extensive simulations, and the experimental results show that our algorithm outperforms traditional methods and that data migration does reduce the overall task execution time. The algorithm also shows acceptable fairness for the heterogeneous tasks.
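The cost comparison at the heart of the algorithm can be sketched as follows. The cost model (a bandwidth term plus compute time, amortized over pending tasks on the same block) is an illustrative simplification of the idea, not the paper’s actual formulation, and all parameter names are assumptions.

```python
# Sketch of the remote-execution vs. data-migration decision.
from dataclasses import dataclass

@dataclass
class Task:
    compute_time: float   # seconds of compute on a server (illustrative)
    block_size: float     # GB of input data (illustrative)

def schedule(task: Task, data_local: bool, remote_read_bw: float,
             migration_bw: float, pending_on_block: int = 1) -> str:
    """Decide where to run one task: 'local', 'remote', or 'migrate'."""
    if data_local:
        return "local"  # data locality already holds; no transfer cost
    # Remote execution streams the block over the network for every
    # pending task that reads it.
    remote_cost = pending_on_block * (task.compute_time
                                      + task.block_size / remote_read_bw)
    # Migration pays the copy once; all pending tasks on the block then
    # run on local data, which is how migration improves locality.
    migrate_cost = (task.block_size / migration_bw
                    + pending_on_block * task.compute_time)
    return "migrate" if migrate_cost < remote_cost else "remote"
```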


Author(s): Jyotsna Talreja Wassan

Big data is revolutionizing the world in the internet age. A wide variety of areas, such as online business, electronic health management, social networking, demographics, geographic information systems, and online education, are gaining insight from big data principles. Big data comprises heterogeneous datasets that are too large to be handled by traditional relational database systems. An important reason for the explosion of interest in big data is that storing large volumes of data has become cheap and computation capacity has risen sharply. This chapter gives an overview of big data ecosystems, covering the various big data platforms useful in today's competitive world.


Author(s): Kedareshwaran Subramanian, Kedar Pandurang Joshi, Sourabh Deshmukh

In this book chapter, the authors highlight the potential of big data analytics for improving the forecasting capabilities that support the after-sales customer service supply chain of a global manufacturing organization. The forecasting function in customer service drives the downstream resource planning processes that provide the best customer experience at optimal cost. In a mature, global organization, existing systems and processes have evolved over time and become complex. These complexities create informational silos that lead to sub-optimal use of data and, in turn, to inaccurate forecasts that adversely affect the planning processes supporting the customer service function. To address this problem, the authors argue for frameworks that are best suited to a big data ecosystem. Drawing on the existing literature, the concepts of the data lake and the data value chain are used as theoretical approaches to devise a road map for implementing a better data architecture that improves forecasting capabilities in the given organizational scenario.


Author(s): Alex Ng, Shiping Chen

Performance testing is one of the vital activities spanning the whole life cycle of software engineering. As a result, a considerable number of performance testing products and open source tools are available. It has been observed that most existing performance testing products and tools are either too expensive and complicated for small projects, or too specific and simple for diverse performance tests. In this chapter, we present an overview of existing performance test products and tools, summarize some contemporary system performance testing frameworks, and capture the key requirements for a general-purpose performance testing framework. Based on our previous work, we propose a system performance testing framework that suits both simple, small projects and complicated, large-scale performance testing projects. The core of our framework is an abstraction that facilitates performance testing by separating the application logic from the common performance testing functionality, together with a set of general-purpose data models.
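The separation the chapter describes can be illustrated with a small sketch: the framework owns timing, warm-up, and reporting, while the user supplies only application logic behind a narrow interface. The interface and method names below are assumptions for illustration, not the authors’ actual framework API.

```python
# Sketch: common test-harness functionality separated from app logic.
import time
from abc import ABC, abstractmethod
from statistics import mean

class PerfTask(ABC):
    """Application-specific logic under test (illustrative interface)."""
    def setup(self) -> None: ...
    @abstractmethod
    def run_once(self) -> None: ...
    def teardown(self) -> None: ...

def run_benchmark(task: PerfTask, iterations: int = 100) -> dict:
    """Common functionality: warm-up, timing loop, summary statistics."""
    task.setup()
    try:
        task.run_once()  # warm-up run, excluded from measurement
        samples = []
        for _ in range(iterations):
            start = time.perf_counter()
            task.run_once()
            samples.append(time.perf_counter() - start)
    finally:
        task.teardown()
    return {"iterations": iterations,
            "mean_s": mean(samples),
            "max_s": max(samples)}

# Usage: only the application logic is written per test.
class SortTask(PerfTask):
    def run_once(self) -> None:
        sorted(range(100_000, 0, -1))

print(run_benchmark(SortTask(), iterations=10))
```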


Big Data, 2016, pp. 1422-1451
Author(s): Jurgen Janssens

For the deeply rooted layers of catalyzing technology and optimized modelling to gain their true value for education, healthcare, or other public services, it is necessary to prepare the Big Data environment in which the Big Data will be developed and to integrate elements of it into the project approach. It is by integrating and managing these non-technical aspects of project reality that analytics will be accepted, enabling data power to infuse organizational processes and ultimately offer real added value. This chapter sheds light on the complementary actions required at different levels. It analyzes how this layered effort starts with a good understanding of the different elements that contribute to the definition of an organization's Big Data ecosystem, explains how this understanding interacts with the management of expectations, needs, goals, and change, and, lastly, takes a closer look at the importance of portfolio-based big-picture thinking.

