Data Intensive Cloud Computing

Big Data ◽  
2016 ◽  
pp. 639-654
Author(s):  
Jayalakshmi D. S. ◽  
R. Srinivasan ◽  
K. G. Srinivasa

Processing Big Data is a huge challenge for today's technology. New ways of computing must be found, applied, and analyzed to derive business and scientific value from Big Data. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing on the cloud builds upon already mature parallel and distributed computing technologies such as HPC, grid, and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling, with reference to data-intensive applications in the cloud. Future research directions for enabling data-intensive applications in cloud environments are identified.


Author(s):  
Ganesh Chandra Deka

NoSQL databases are designed to meet the huge data storage requirements of cloud computing and big data processing. They offer many advanced features in addition to conventional RDBMS features; hence, “NoSQL” databases are popularly known as “Not only SQL” databases. A variety of NoSQL databases, with different features for handling exponentially growing data-intensive applications, are available as open-source and proprietary options. This chapter discusses some of the popular NoSQL databases and their features in light of the CAP theorem.


2021 ◽  
Vol 34 (1) ◽  
pp. 66-85
Author(s):  
Yiannis Verginadis ◽  
Dimitris Apostolou ◽  
Salman Taherizadeh ◽  
Ioannis Ledakis ◽  
Gregoris Mentzas ◽  
...  

Fog computing extends multi-cloud computing by enabling services or application functions to be hosted close to their data sources. To take advantage of the capabilities of fog computing, the serverless and function-as-a-service (FaaS) software engineering paradigms allow for the flexible deployment of applications on multi-cloud, fog, and edge resources. This article reviews prominent fog computing frameworks and discusses some of the challenges and requirements of FaaS-enabled applications. Moreover, it proposes a novel framework able to dynamically manage multi-cloud, fog, and edge resources and to deploy data-intensive applications developed using the FaaS paradigm. The proposed framework leverages the FaaS paradigm in a way that improves the average service response time of data-intensive applications by a factor of three, regardless of the underlying multi-cloud, fog, and edge resource infrastructure.


2022 ◽  
pp. 1865-1875
Author(s):  
Krishan Tuli ◽  
Amanpreet Kaur ◽  
Meenakshi Sharma

Cloud computing offers various IT services to many users on a pay-as-you-use basis. As data grows day by day, there is a strong need for cloud applications that can manage such huge amounts of data and provide a good solution for analyzing large datasets. Various companies provide such frameworks for particular applications. A cloud framework is a collection of components, such as development tools, middleware for particular applications, and database management services, that are needed to develop, deploy, and manage cloud applications. This results in an effective model for scaling huge amounts of data on dynamically allocated resources while solving complex problems. This article surveys the performance of cloud-based big data frameworks drawn from various efforts, to help enterprises pick a suitable framework for their work and obtain the desired outcome.


2018 ◽  
Vol 19 (3) ◽  
pp. 245-258
Author(s):  
Vengadeswaran Shanmugasundaram ◽  
Balasundaram Sadhu Ramakrishnan

In this data era, massive volumes of data are generated every second in a variety of domains such as geoscience, the social web, finance, e-commerce, health care, climate modelling, physics, astronomy, and government. Hadoop is well recognized as the de facto big data processing platform and is widely used in many application domains. Although it is considered an efficient solution for complex query processing, it has limitations when the data to be processed exhibit interest locality: the data required for query execution follow a grouping behavior in which only part of the Big Data is accessed frequently. In such scenarios, the time taken to execute a query and return results increases exponentially as the amount of data increases, leading to long waiting times for the user. Since the Hadoop default data placement strategy (HDDPS) does not consider such grouping behavior, it performs inefficiently, resulting in shortcomings such as fewer local map task executions and increased query execution time. Hence, we propose an Optimal Data Placement Strategy (ODPS) based on grouping semantics. In this paper we examine the significance of two promising clustering techniques, Hierarchical Agglomerative Clustering (HAC) and Markov Clustering (MCL), in grouping-aware data placement for data-intensive applications with interest locality. First, the user access pattern is identified by dynamically analyzing the history log. Then both clustering techniques (HAC and MCL) are applied separately to the access pattern to obtain independent clusters. These clusters are interpreted and validated to extract the Optimal Data Groupings (ODG). Finally, the proposed strategy reorganizes the default data layouts in HDFS based on the ODG to achieve maximum parallel execution per group, subject to the Load Balancer and Rack Awareness.
The proposed strategy is tested on a 10-node cluster in a multi-rack setup, with Hadoop installed on every node, deployed on a cloud platform. It reduces query execution time, significantly improves data locality, and proves more efficient for processing massive datasets in a heterogeneous distributed environment. MCL also shows marginally better performance than HAC for queries exhibiting interest locality.
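
The grouping-aware placement idea described above can be illustrated with a minimal Python sketch. It is not the authors' ODPS implementation: the query log format, the co-access similarity measure, the average-linkage merge rule, the `min_similarity` threshold, and all function names are assumptions made for illustration, and the Load Balancer and Rack Awareness constraints are not modeled. The sketch derives block groups from a history log with a simple agglomerative clustering and then spreads each group's blocks across nodes so that a query touching the group can run its map tasks in parallel.

```python
from collections import defaultdict
from itertools import combinations


def co_access_counts(query_log):
    """Count how often each pair of blocks is read by the same query."""
    counts = defaultdict(int)
    for blocks in query_log:
        for a, b in combinations(sorted(set(blocks)), 2):
            counts[(a, b)] += 1
    return counts


def average_linkage(c1, c2, counts):
    """Mean co-access count over all cross-cluster block pairs."""
    total = sum(counts.get((min(a, b), max(a, b)), 0) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerative_groups(query_log, min_similarity=1.0):
    """Merge the most similar clusters until none reaches min_similarity.

    min_similarity is a hypothetical cut-off, not a value from the paper.
    """
    counts = co_access_counts(query_log)
    clusters = [[b] for b in sorted({b for q in query_log for b in q})]
    while len(clusters) > 1:
        best_score, (i, j) = max(
            (average_linkage(clusters[i], clusters[j], counts), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_score < min_similarity:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


def place_round_robin(groups, nodes):
    """Spread each group's blocks over the nodes in round-robin order."""
    placement = {}
    for group in groups:
        for idx, block in enumerate(group):
            placement[block] = nodes[idx % len(nodes)]
    return placement


# Hypothetical history log: each entry lists the blocks one query read.
history_log = [
    ["b1", "b2"],
    ["b1", "b2", "b3"],
    ["b4", "b5"],
    ["b4", "b5"],
    ["b3", "b1"],
]
groups = agglomerative_groups(history_log, min_similarity=1.5)
placement = place_round_robin(groups, nodes=["n1", "n2"])
```

Blocks that are frequently read together end up in the same group but on different nodes, which is the intuition behind maximizing parallel execution per group. Markov Clustering would replace only the `agglomerative_groups` step; the placement step would stay the same.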


2016 ◽  
Vol 15 (9) ◽  
pp. 7035-7040
Author(s):  
Sakshi Grover ◽  
Mr. Navtej Singh Ghumman

Although cloud computing is becoming more advanced and mature, with many companies releasing their own computing platforms to provide services to the public, research on cloud computing is still in its infancy. Among the many challenges of cloud computing, efficient energy management is one of the most difficult research issues. In this paper we review existing dynamic resource provisioning and allocation algorithms and work holistically to boost data center energy efficiency and performance. The paper addresses (a) heterogeneous workloads and their implications for data center energy efficiency and (b) the problem of scheduling VM resources for cloud applications.


Author(s):  
Rajni Aron ◽  
Deepak Kumar Aggarwal

Cloud Computing has become a buzzword in the IT industry. Cloud Computing, which provides inexpensive computing resources on a pay-as-you-go basis, is rapidly gaining momentum as a substitute for traditional Information Technology (IT)-based organizations. The increased utilization of Clouds therefore makes the execution of Big Data processing jobs a vital research area. As more and more users have started to store and process their real-time data in Cloud environments, Resource Provisioning and Scheduling of Big Data processing jobs becomes a key consideration for the efficient execution of Big Data applications. This chapter discusses the fundamental concepts behind Cloud Computing and Big Data and the relationship between them. It will help researchers identify the important characteristics of Cloud Resource Management Systems for handling Big Data processing jobs and select the most suitable technique for processing Big Data jobs in a Cloud Computing environment.


Author(s):  
Vijayalakshmi Saravanan ◽  
Anpalagan Alagan ◽  
Isaac Woungang

With the advent of novel wireless technologies and Cloud Computing, large volumes of data are being produced by various heterogeneous devices such as mobile phones, credit cards, and computers. Managing this data has become the de facto challenge in current Information Systems. Although processor speeds are no longer doubling as Moore's law once suggested, processing power continues to grow rapidly, which leads to new data-intensive scientific problems in every field, especially in the Big Data domain. The Big Data revolution lies in improved statistical analysis and computational power, both of which depend on processing speed. Hence, putting massively multi-core systems to work is vital in order to overcome the physical limits of complexity and speed. This also raises many challenges, such as difficulties in capturing massive applications, data storage, and analysis. This chapter discusses some of the Big Data architectural challenges from the perspective of multi-core processors.

