Resource Provisioning in SLA-Based Cluster Computing

With the recent exponential growth in the volume of data processed in Big Data applications, the high energy consumption of data processing engines in datacenters has become a major issue, underlining the need for efficient resource allocation to make computing more energy-efficient. We previously proposed the Best Trade-off Point (BToP) method, a general approach that uses an algorithm with mathematical formulas to find the best trade-off point on an elbow curve of performance vs. resources for efficient resource provisioning in Hadoop MapReduce. The BToP method is expected to work for any application or system that relies on a trade-off elbow curve, non-inverted or inverted, for making good decisions. In this paper, we apply the BToP method to the emerging cluster computing framework Apache Spark and show that the resulting performance and energy consumption are better than those of Spark with its built-in dynamic resource allocation enabled. Our Spark-Bench tests confirm the effectiveness of using the BToP method with Spark to determine the optimal number of executors for any workload in production environments where job profiling for behavioral replication leads to the most efficient resource provisioning.
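The idea of locating a trade-off point on an elbow curve can be illustrated with a generic heuristic. The sketch below is not the BToP algorithm, whose formulas are not given in this abstract. It assumes a simple rule instead: after normalizing both axes, pick the measured point farthest from the straight line joining the first and last points of the curve. The function name `elbow_point` and the sample executor counts and runtimes are hypothetical.

```python
import math


def elbow_point(resources, runtimes):
    """Return the index of the point farthest from the chord that joins
    the first and last points of the curve (a generic elbow heuristic,
    assumed here for illustration; not the BToP formulas)."""
    if len(resources) != len(runtimes) or len(resources) < 3:
        raise ValueError("need at least three (resource, runtime) pairs")

    # Normalize both axes to [0, 1] so neither unit dominates the distance.
    x_min, x_max = min(resources), max(resources)
    y_min, y_max = min(runtimes), max(runtimes)
    if x_min == x_max or y_min == y_max:
        raise ValueError("curve is flat along one axis; no elbow to find")
    xs = [(x - x_min) / (x_max - x_min) for x in resources]
    ys = [(y - y_min) / (y_max - y_min) for y in runtimes]

    # Endpoints of the chord.
    ax, ay = xs[0], ys[0]
    bx, by = xs[-1], ys[-1]
    chord_length = math.hypot(bx - ax, by - ay)

    def distance_to_chord(px, py):
        # Perpendicular distance from (px, py) to the line through A and B.
        return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / chord_length

    distances = [distance_to_chord(x, y) for x, y in zip(xs, ys)]
    return max(range(len(distances)), key=distances.__getitem__)


if __name__ == "__main__":
    # Hypothetical profiling data: executor counts and job runtimes in seconds.
    executors = [1, 2, 4, 8, 16, 32]
    runtime_s = [100, 52, 28, 20, 18, 17]
    best = elbow_point(executors, runtime_s)
    print(f"elbow at {executors[best]} executors ({runtime_s[best]} s)")
```

Because the distance is taken as an absolute value, the same function handles both decreasing curves (runtime vs. resources) and increasing ones (throughput vs. resources). A production method would also need to deal with noisy measurements, which this sketch ignores.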


Power-aware resource provisioning in cluster computing

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010. DOI: 10.1109/ipdps.2010.5470395

Cited By ~ 3

Author(s): Kaiqi Xiong

Keyword(s): Cluster Computing, Resource Provisioning


Data Intensive Cloud Computing

Big Data, 2016, pp. 639-654. DOI: 10.4018/978-1-4666-9840-6.ch029

Author(s): Jayalakshmi D. S., R. Srinivasan, K. G. Srinivasa

Keyword(s): Cloud Computing, Big Data, Cluster Computing, Resource Provisioning, Data Intensive, Scientific Value, Data Intensive Applications, Cloud Applications, Problem Data, Huge Challenge

Processing Big Data is a huge challenge for today's technology. There is a need to find, apply, and analyze new ways of computing that make use of Big Data so as to derive business and scientific value from it. Cloud computing, with its promise of seemingly infinite computing resources, is seen as the solution to this problem. Data-intensive computing in the cloud builds upon already mature parallel and distributed computing technologies such as HPC, grid, and cluster computing. However, handling Big Data in the cloud presents its own challenges. In this chapter, we analyze issues specific to data-intensive cloud computing and provide a study of available solutions in programming models, data distribution and replication, and resource provisioning and scheduling, with reference to data-intensive applications in the cloud. Future directions for further research on enabling data-intensive applications in cloud environments are identified.


Data Intensive Cloud Computing

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Advanced Research on Cloud Computing Design and Applications, 2015, pp. 305-320. DOI: 10.4018/978-1-4666-8676-2.ch019

Author(s): Jayalakshmi D. S., R. Srinivasan, K. G. Srinivasa

Keyword(s): Cloud Computing, Big Data, Cluster Computing, Resource Provisioning, Data Intensive, Scientific Value, Data Intensive Applications, Cloud Applications, Problem Data, Huge Challenge

