Tutorial on Challenges for Big Data Application Performance Tuning and Prediction

Author(s):  
Rekha Singhal
2021 ◽  
Author(s):  
Jeanne Alcantara

Apache Spark enables a big data application, one that takes massive data as input and may produce massive data during its execution, to run in parallel on multiple nodes. Performance is therefore a vital concern for such applications. This project analyzes a WordCount application running on Apache Spark and assesses the impact on execution time and average node utilization. For this assessment, the number of executor cores and the size of executor memory are varied across different input data sizes and different numbers of nodes in the cluster on which the application runs. The project concludes that different pairs (data size, number of nodes in the cluster) require different numbers of executor cores and different executor memory sizes to obtain optimum execution time and average node utilization.
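
As a rough illustration of the kind of parameter sweep the abstract describes, the sketch below builds one spark-submit command per (executor cores, executor memory) pair. It is not the author's experimental setup: the function name build_spark_submit_grid, the application file wordcount.py, the input path, the "yarn" master, and the specific core and memory values are all assumptions chosen for the example. Only the --master, --executor-cores, and --executor-memory options are standard spark-submit flags. The commands are printed rather than executed, so the sketch runs without a Spark installation.

```python
from itertools import product


def build_spark_submit_grid(app_path, input_path, cores_options, memory_options,
                            master="yarn"):
    """Return one spark-submit argument list per (cores, memory) pair.

    All argument values are illustrative assumptions, not settings taken
    from the study summarized above.
    """
    commands = []
    for cores, memory in product(cores_options, memory_options):
        commands.append([
            "spark-submit",
            "--master", master,
            "--executor-cores", str(cores),   # cores per executor
            "--executor-memory", memory,      # heap size per executor, e.g. "2g"
            app_path,
            input_path,
        ])
    return commands


# Hypothetical sweep: 3 core settings x 3 memory settings = 9 configurations.
grid = build_spark_submit_grid(
    app_path="wordcount.py",                  # hypothetical WordCount script
    input_path="hdfs:///data/input.txt",      # hypothetical input location
    cores_options=[1, 2, 4],
    memory_options=["1g", "2g", "4g"],
)

for cmd in grid:
    print(" ".join(cmd))
```

Repeating such a sweep for each data size and cluster size, and recording the execution time and node utilization of every run, would yield the kind of comparison the abstract reports.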


2021 ◽  
Vol 11 (5) ◽  
pp. 2340
Author(s):  
Sanjay Mathrani ◽  
Xusheng Lai

Web data have grown exponentially to reach zettabyte scales. Mountains of data come from many online sources, such as e-commerce, social media, web and sensor-based devices, business web sites, and other information posted by users. Big data analytics (BDA) can help to derive new insights from this huge and fast-growing data source. The core advantage of BDA technology lies in its ability to mine these data and provide information on underlying trends. BDA, however, faces innate difficulty in optimizing the process and capabilities required to merge diverse data assets into viable information. This paper explores the BDA process and capabilities in leveraging data through three case studies of prime users of BDA tools. Findings emphasize four key components of the BDA process framework: system coordination, data sourcing, big data application service, and end users. Further building blocks are data security, privacy, and management, which represent services that provide functionality to the four components of the BDA process across information and technology value chains.


Author(s):  
Bernard Tuffour Atuahene ◽  
Sittimont Kanjanabootra ◽  
Thayaparan Gajendran

Big data applications consist of (i) data collection using big data sources, (ii) storing and processing the data, and (iii) analysing the data to gain insights that create organisational benefit. The influx of digital technologies and digitization in the construction process includes big data as one newly emerging digital technology adopted in the construction industry. Big data application is at a nascent stage in construction, and there is a need to understand the tangible benefits that big data can offer the construction industry. This study explores the benefits of big data in the construction industry. Using a qualitative case study design, construction professionals in an Australian construction firm were interviewed. The research highlights that the benefits of big data include reduction of litigation amongst project stakeholders, enablement of near real-time communication, and facilitation of effective subcontractor selection. By implication, on a broader scale, these benefits can improve contract management, procurement, and management of construction projects. This study contributes to an ongoing discourse on big data application and, more generally, digitization in the construction industry.

