Genetic Based Data Placement for Geo-Distributed Data-Intensive Applications in Cloud Computing

Author(s):  
Weifeng Fan ◽  
Jun Peng ◽  
Xiaoyong Zhang ◽  
Zhiwu Huang
Author(s):  
Christopher Miceli ◽  
Michael Miceli ◽  
Bety Rodriguez-Milla ◽  
Shantenu Jha

Grids, clouds and cloud-like infrastructures are capable of supporting a broad range of data-intensive applications. There are interesting and unique performance issues that appear as the volume of data and degree of distribution increases. New scalable data-placement and management techniques, as well as novel approaches to determine the relative placement of data and computational workload, are required. We develop and study a genome sequence matching application that is simple to control and deploy, yet serves as a prototype of a data-intensive application. The application uses a SAGA-based implementation of the All-Pairs pattern. This paper aims to understand some of the factors that influence the performance of this application and the interplay of those factors. We also demonstrate how the SAGA approach can enable data-intensive applications to be extensible and interoperable over a range of infrastructure. This capability enables us to compare and contrast two different approaches for executing distributed data-intensive applications—simple application-level data-placement heuristics versus distributed file systems.


2011 ◽  
Vol 55-57 ◽  
pp. 1053-1057
Author(s):  
Gui De Zheng ◽  
Ming Chen

The next generation of scientific experiments and studies are being carried out by large collaborations of researchers distributed around the world engaged in analysis of huge collections of data generated by scientific instruments. Grid computing has emerged as an enabler for such collaborations as it aids communities in sharing resource to achieve common objective. This paper defines the problem of scheduling distributed data-intensive application on to Gird resource and presents a formal resource and application model for the problem.


2021 ◽  
Vol 34 (1) ◽  
pp. 66-85
Author(s):  
Yiannis Verginadis ◽  
Dimitris Apostolou ◽  
Salman Taherizadeh ◽  
Ioannis Ledakis ◽  
Gregoris Mentzas ◽  
...  

Fog computing extends multi-cloud computing by enabling services or application functions to be hosted close to their data sources. To take advantage of the capabilities of fog computing, serverless and the function-as-a-service (FaaS) software engineering paradigms allow for the flexible deployment of applications on multi-cloud, fog, and edge resources. This article reviews prominent fog computing frameworks and discusses some of the challenges and requirements of FaaS-enabled applications. Moreover, it proposes a novel framework able to dynamically manage multi-cloud, fog, and edge resources and to deploy data-intensive applications developed using the FaaS paradigm. The proposed framework leverages the FaaS paradigm in a way that improves the average service response time of data-intensive applications by a factor of three regardless of the underlying multi-cloud, fog, and edge resource infrastructure.


Author(s):  
Ganesh Chandra Deka

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. NoSQL databases have lots of advanced features in addition to the conventional RDBMS features. Hence, the “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases having different features to deal with exponentially growing data-intensive applications are available with open source and proprietary option. This chapter discusses some of the popular NoSQL databases and their features on the light of CAP theorem.


2011 ◽  
Vol 135-136 ◽  
pp. 43-49
Author(s):  
Han Ning Wang ◽  
Wei Xiang Xu ◽  
Chao Long Jia

The application of high-speed railway data, which is an important component of China's transportation science data sharing, has embodied the typical characteristics of data-intensive computing. A reasonable and effective data placement strategy is needed to deploy and execute data-intensive applications in the cloud computing environment. Study results of current data placement approaches have been analyzed and compared in this paper. Combining the semi-definite programming algorithm with the dynamic interval mapping algorithm, a hierarchical structure data placement strategy is proposed. The semi-definite programming algorithm is suitable for the placement of files with various replications, ensuring that different replications of a file are placed on different storage devices. And the dynamic interval mapping algorithm could guarantee better self-adaptability of the data storage system. It has been proved both by theoretical analysis and experiment demonstration that a hierarchical data placement strategy could guarantee the self-adaptability, data reliability and high-speed data access for large-scale networks.


Sign in / Sign up

Export Citation Format

Share Document