Data Intensive Cloud Computing

2015 ◽

pp. 305-320

Author(s):

Jayalakshmi D. S. ◽

R. Srinivasan ◽

K. G. Srinivasa

Keyword(s):

Cloud Computing ◽

Big Data ◽

Cluster Computing ◽

Resource Provisioning ◽

Data Intensive ◽

Scientific Value ◽

Data Intensive Applications ◽

Cloud Applications ◽

Problem Data ◽

Huge Challenge

Processing Big Data is a huge challenge for today's technology. New ways of computing must be found, applied, and analyzed in order to derive business and scientific value from Big Data. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing on the cloud builds upon already mature parallel and distributed computing technologies such as HPC, grid, and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling, with reference to data-intensive applications in the cloud. Future directions for further research enabling data-intensive applications in cloud environments are identified.

NoSQL Databases

Advances in Data Mining and Database Management - Handbook of Research on Cloud Infrastructures for Big Data Analytics ◽

10.4018/978-1-4666-5864-6.ch008 ◽

2014 ◽

pp. 186-215 ◽

Cited By ~ 2

Author(s):

Ganesh Chandra Deka

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Processing ◽

Open Source ◽

Data Storage ◽

Big Data Processing ◽

Nosql Databases ◽

Data Intensive ◽

Huge Data ◽

Data Intensive Applications

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. They offer many advanced features in addition to conventional RDBMS features, which is why “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases, with different features for dealing with exponentially growing data-intensive applications, are available as open source and proprietary options. This chapter discusses some of the popular NoSQL databases and their features in light of the CAP theorem.

PrEstoCloud

Information Resources Management Journal ◽

10.4018/irmj.2021010104 ◽

2021 ◽

Vol 34 (1) ◽

pp. 66-85

Author(s):

Yiannis Verginadis ◽

Dimitris Apostolou ◽

Salman Taherizadeh ◽

Ioannis Ledakis ◽

Gregoris Mentzas ◽

...

Keyword(s):

Cloud Computing ◽

Software Engineering ◽

Response Time ◽

Fog Computing ◽

Data Sources ◽

Data Intensive ◽

Service Response Time ◽

Enabling Services ◽

Data Intensive Applications ◽

Multi Cloud

Fog computing extends multi-cloud computing by enabling services or application functions to be hosted close to their data sources. To take advantage of the capabilities of fog computing, serverless and the function-as-a-service (FaaS) software engineering paradigms allow for the flexible deployment of applications on multi-cloud, fog, and edge resources. This article reviews prominent fog computing frameworks and discusses some of the challenges and requirements of FaaS-enabled applications. Moreover, it proposes a novel framework able to dynamically manage multi-cloud, fog, and edge resources and to deploy data-intensive applications developed using the FaaS paradigm. The proposed framework leverages the FaaS paradigm in a way that improves the average service response time of data-intensive applications by a factor of three regardless of the underlying multi-cloud, fog, and edge resource infrastructure.

A Survey on Comparison of Performance Analysis on a Cloud-Based Big Data Framework

10.4018/978-1-6684-3662-2.ch091 ◽

2022 ◽

pp. 1865-1875

Author(s):

Krishan Tuli ◽

Amanpreet Kaur ◽

Meenakshi Sharma

Keyword(s):

Cloud Computing ◽

Big Data ◽

It Services ◽

Huge Amount ◽

Data Framework ◽

Development Tools ◽

Cloud Framework ◽

Suitable Framework ◽

Cloud Applications ◽

Day By Day

Cloud computing offers various IT services to many users on a pay-as-you-use basis. As data grows day by day, there is a strong need for cloud applications that can manage such huge amounts of data, and for solutions that can analyze them and handle large datasets. Various companies provide such frameworks for particular applications. A cloud framework is a collection of components, such as development tools, middleware for particular applications, and database management services, that are needed to deploy, develop, and manage cloud applications. The result is an effective model for scaling huge amounts of data on dynamically allocated resources while solving complex problems. This article surveys the performance of cloud-based big data frameworks from various vendors, to help enterprises pick a suitable framework for their work and obtain the desired outcome.

Resource provisioning for data-intensive applications with deadline constraints on hybrid clouds using Aneka

Future Generation Computer Systems ◽

10.1016/j.future.2017.05.042 ◽

2018 ◽

Vol 79 ◽

pp. 765-775 ◽

Cited By ~ 37

Author(s):

Adel Nadjaran Toosi ◽

Richard O. Sinnott ◽

Rajkumar Buyya

Keyword(s):

Resource Provisioning ◽

Hybrid Clouds ◽

Data Intensive ◽

Deadline Constraints ◽

Data Intensive Applications

Analysis of big data for data-intensive applications

2016 International Conference on Recent Advances and Innovations in Engineering (ICRAIE) ◽

10.1109/icraie.2016.7939551 ◽

2016 ◽

Author(s):

Meenu Dave ◽

Hemant Kumar Gianey

Keyword(s):

Big Data ◽

Data Intensive ◽

Data Intensive Applications

Significance of Hierarchical and Markov Clustering in Grouping Aware Data Placement for Data Intensive Applications Having Interest Locality

Scalable Computing Practice and Experience ◽

10.12694/scpe.v19i3.1375 ◽

2018 ◽

Vol 19 (3) ◽

pp. 245-258

Author(s):

Vengadeswaran Shanmugasundaram ◽

Balasundaram Sadhu Ramakrishnan

Keyword(s):

Big Data ◽

Data Placement ◽

Query Execution ◽

Access Pattern ◽

Clustering Techniques ◽

Data Intensive ◽

Markov Clustering ◽

Default Data ◽

Data Intensive Applications ◽

Grouping Behavior

In this data era, massive volumes of data are generated every second in a variety of domains such as geoscience, the social web, finance, e-commerce, health care, climate modelling, physics, astronomy, and government. Hadoop is well recognized as the de facto big data processing platform; it has been extensively adopted and is currently widely used in many application domains that process Big Data. Even though it is considered an efficient solution for such complex query processing, it has its own limitations when the data to be processed exhibit interest locality. The data required for query execution follow a grouping behavior in which only a part of the Big Data is accessed frequently. In such scenarios, the time taken to execute a query and return results increases exponentially as the amount of data increases, leading to long waiting times for the user. Since the Hadoop default data placement strategy (HDDPS) does not consider such grouping behavior, it does not perform efficiently, resulting in shortcomings such as fewer local map task executions and increased query execution time. Hence, we propose an Optimal Data Placement Strategy (ODPS) based on grouping semantics. In this paper we examine the significance of the two most promising clustering techniques, Hierarchical Agglomerative Clustering (HAC) and Markov Clustering (MCL), in grouping-aware data placement for data-intensive applications having interest locality. Initially, the user access pattern is identified by dynamically analyzing the history log. Then both clustering techniques (HAC and MCL) are separately applied to the access pattern to obtain independent clusters. These clusters are interpreted and validated to extract the Optimal Data Groupings (ODG). Finally, the proposed strategy reorganizes the default data layouts in HDFS based on the ODG to achieve maximum parallel execution per group, subject to the Load Balancer and Rack Awareness.
The proposed strategy is tested on a 10-node, multi-rack cluster with Hadoop installed on every node, deployed on a cloud platform. The proposed strategy reduces query execution time, significantly improves data locality, and proves more efficient for processing massive datasets in a heterogeneous distributed environment. MCL also shows marginally better performance than HAC for queries exhibiting interest locality.
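
The pipeline described in this abstract (derive access patterns from a history log, cluster co-accessed data, then spread each group across nodes) can be illustrated with a minimal Python sketch. It is not the authors' ODPS implementation: the thresholded single-linkage grouping below is a simplified stand-in for HAC and MCL, it ignores Load Balancer and Rack Awareness constraints, and the function names, the `min_co_access` threshold, and the block and node identifiers are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_access_counts(query_log):
    """Count how often each pair of blocks is read by the same query."""
    counts = defaultdict(int)
    for blocks in query_log:
        for a, b in combinations(sorted(set(blocks)), 2):
            counts[(a, b)] += 1
    return counts


def group_blocks(query_log, min_co_access=2):
    """Group blocks linked by at least `min_co_access` shared queries.

    Simplified stand-in for HAC/MCL: a single-linkage cut at a fixed
    threshold (an assumption, not the clustering used in the paper).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Register every block, including ones that never pass the threshold.
    for blocks in query_log:
        for block in blocks:
            find(block)

    # Merge blocks that are frequently accessed together.
    for (a, b), n in co_access_counts(query_log).items():
        if n >= min_co_access:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for block in list(parent):
        groups[find(block)].append(block)
    return sorted(sorted(g) for g in groups.values())


def spread_groups(groups, nodes):
    """Assign each group's blocks round-robin over nodes, so that blocks
    queried together can be read in parallel from different nodes."""
    placement = {}
    for group in groups:
        for i, block in enumerate(group):
            placement[block] = nodes[i % len(nodes)]
    return placement


# Hypothetical history log: each entry lists the blocks one query touched.
history_log = [
    ["b1", "b2"],
    ["b1", "b2", "b3"],
    ["b4", "b5"],
    ["b4", "b5"],
    ["b3", "b6"],
]
groups = group_blocks(history_log, min_co_access=2)
placement = spread_groups(groups, nodes=["n1", "n2"])
```

In this toy log, `b1`/`b2` and `b4`/`b5` each end up in one group and are placed on different nodes, which is the property a grouping-aware strategy aims for; a real deployment would also have to respect replication, rack topology, and load balancing.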

A REVIEW ON ENGERY EFFICIENT STRATEGY FOR TASK ALLOCATION IN CLOUD ENVIRONMENT

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v15i9.705 ◽

2016 ◽

Vol 15 (9) ◽

pp. 7035-7040

Author(s):

Sakshi Grover ◽

Mr. Navtej Singh Ghumman

Keyword(s):

Energy Efficiency ◽

Cloud Computing ◽

Resource Provisioning ◽

Research Issues ◽

Efficient Management ◽

Dynamic Resource Provisioning ◽

Computing Platforms ◽

And Performance ◽

Cloud Applications ◽

Allocation Algorithms

Although cloud computing is becoming more advanced and mature, with many companies having released their own computing platforms to provide services to the public, research on cloud computing is still in its infancy. Among the many challenges of cloud computing, efficient management of energy is one of the most challenging research issues. In this paper we review existing dynamic resource provisioning and allocation algorithms that work holistically to boost data center energy efficiency and performance. The paper addresses (a) heterogeneous workloads and their implications for data center energy efficiency and (b) the problem of VM resource scheduling for cloud applications.

Resource Provisioning and Scheduling of Big Data Processing Jobs

Advances in Data Mining and Database Management - Handbook of Research on Big Data Storage and Visualization Techniques ◽

10.4018/978-1-5225-3142-5.ch014 ◽

2018 ◽

pp. 382-401

Author(s):

Rajni Aron ◽

Deepak Kumar Aggarwal

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Processing ◽

Research Area ◽

Resource Provisioning ◽

Time Data ◽

Big Data Processing ◽

Cloud Resource Management ◽

Big Data Applications ◽

Cloud Resource

Cloud Computing has become a buzzword in the IT industry. By providing inexpensive computing resources on a pay-as-you-go basis, Cloud Computing is rapidly gaining momentum as a substitute for traditional in-house Information Technology (IT) in organizations. The increased utilization of Clouds therefore makes the execution of Big Data processing jobs a vital research area. As more and more users store and process their real-time data in Cloud environments, Resource Provisioning and Scheduling of Big Data processing jobs becomes a key consideration for the efficient execution of Big Data applications. This chapter discusses the fundamental concepts behind Cloud Computing and Big Data and the relationship between them. It will help researchers identify the important characteristics of Cloud Resource Management Systems for handling Big Data processing jobs, and will also help them select the most suitable technique for processing Big Data jobs in a Cloud Computing environment.

Big Data in Massive Parallel Processing

Advances in Data Mining and Database Management - Handbook of Research on Big Data Storage and Visualization Techniques ◽

10.4018/978-1-5225-3142-5.ch011 ◽

2018 ◽

pp. 276-302 ◽

Cited By ~ 1

Author(s):

Vijayalakshmi Saravanan ◽

Anpalagan Alagan ◽

Isaac Woungang

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Storage ◽

Mobile Phones ◽

Scientific Data ◽

Computational Power ◽

Data Intensive ◽

Heterogeneous Devices ◽

Processing Power ◽

Massive Parallel Processing

With the advent of novel wireless technologies and Cloud Computing, large volumes of data are being produced by various heterogeneous devices such as mobile phones, credit cards, and computers. Managing this data has become the de facto challenge in current Information Systems. Although processor speeds are no longer doubling at the pace associated with Moore's law, processing power continues to grow rapidly, which gives rise to new data-intensive scientific problems in every field, especially in the Big Data domain. The Big Data revolution lies in improved statistical analysis and computational power, both of which depend on processing speed. Hence, putting massively multi-core systems on the job is vital in order to overcome the physical limits of complexity and speed. This also raises many challenges, such as difficulties in capturing massive applications, data storage, and analysis. This chapter discusses some of the architectural challenges of Big Data from the perspective of multi-core processors.
