Machine-Learning Based Memory Prediction Model for Data Parallel Workloads in Apache Spark

Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 697
Author(s):  
Rohyoung Myung ◽  
Sukyong Choi

A lack of memory can lead to job failures or increase processing times for garbage collection. However, if too much memory is provided, the processing time is only marginally reduced, and most of the memory is wasted. Many big data processing tasks are executed in cloud environments. When renting virtual resources in a cloud environment, it is necessary to pay according to the specifications of the resources (i.e., the number of virtual cores and the size of memory) as well as the rental time. In this paper, given the type of workload and the volume of the input data, we analyze the memory usage pattern and derive the efficient memory size of data-parallel workloads in Apache Spark. We then propose a machine-learning-based prediction model that determines the efficient memory for a given workload and data. To validate the proposed model, we applied it to data-parallel workloads including a deep learning model. The predicted memory values were in close agreement with the actual amount of required memory. Additionally, building the proposed model takes at most 44% of the total execution time of a data-parallel workload. The proposed model can improve memory efficiency by up to 1.89 times compared with the vanilla Spark setting.
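The core idea, predicting an efficient memory size from workload characteristics, can be illustrated with a minimal sketch. The sketch below fits memory demand against input data volume with ordinary least squares; the profiling numbers and the single-feature model are illustrative assumptions, not the paper's dataset or learning algorithm.

```python
# Minimal sketch: predict efficient executor memory (GB) from input data
# volume (GB) with ordinary least squares. The workload profile numbers
# below are illustrative assumptions, not the paper's data.

def fit_least_squares(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical profiling runs: (input volume in GB, efficient memory in GB)
volumes = [1.0, 2.0, 4.0, 8.0]
memory = [1.2, 2.1, 4.3, 8.2]

a, b = fit_least_squares(volumes, memory)
predicted = a * 16.0 + b  # extrapolate to a 16 GB input
```

In practice the paper's model also conditions on the workload type; the point of the sketch is only that a cheap fitted predictor can replace trial-and-error memory sizing.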

The Internet of Things (IoT) is one of the fastest-growing technology paradigms, used in every sector, and Quality of Service (QoS) is a critical component of such systems from the perspective of both producers and consumers ("prosumers"). Most recent research on QoS in IoT has used Machine Learning (ML) techniques for improved performance and solutions. The adoption of ML methodologies has become a common trend and need across technologies and domains, through open-source frameworks, task-specific algorithms, and AI techniques. In this work we propose an ML-based prediction model for resource optimization in the IoT environment for QoS provisioning. The proposed methodology is implemented using a multi-layer neural network (MNN) with Long Short-Term Memory (LSTM) learning in a layered IoT environment. The model treats resources such as bandwidth and energy as QoS parameters and provides the required QoS through efficient utilization of those resources. The performance of the proposed model is evaluated in a real field deployment on a civil construction project, where real data is collected using video sensors and mobile devices as edge nodes. The results show improved bandwidth and energy utilization, in turn providing the required QoS in the IoT environment.
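To make the LSTM component concrete, here is a minimal sketch of a single LSTM cell step applied to a resource-usage series. The scalar weights and the normalized bandwidth samples are toy assumptions for illustration; the paper's multi-layer network learns weight matrices over feature vectors.

```python
import math

# Minimal sketch of one LSTM cell step, as used in LSTM-based resource
# predictors. All weights are toy scalars chosen for illustration.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step on scalar input x with per-gate (wx, wh, b) weights."""
    f = sigmoid(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])   # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])   # input gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2]) # candidate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])   # output gate
    c = f * c_prev + i * g            # new cell state
    h = o * math.tanh(c)              # new hidden state (predicted load signal)
    return h, c

weights = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "g", "o")}
h, c = 0.0, 0.0
for bandwidth_sample in [0.2, 0.4, 0.6, 0.8]:  # hypothetical normalized series
    h, c = lstm_step(bandwidth_sample, h, c, weights)
```

The gating structure is what lets the model remember longer-term usage patterns, which is why LSTM suits time-varying QoS parameters like bandwidth and energy.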


Mathematics ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1620 ◽  
Author(s):  
Ganjar Alfian ◽  
Muhammad Syafrudin ◽  
Norma Latif Fitriyani ◽  
Muhammad Anshari ◽  
Pavel Stasa ◽  
...  

Extracting information from individual risk factors provides an effective way to identify diabetes risk and associated complications, such as retinopathy, at an early stage. Deep learning and machine learning algorithms are being utilized to extract information from individual risk factors to improve early-stage diagnosis. This study proposes a deep neural network (DNN) combined with recursive feature elimination (RFE) to provide early prediction of diabetic retinopathy (DR) based on individual risk factors. The proposed model uses RFE to remove irrelevant features and DNN to classify the diseases. A publicly available dataset was utilized to predict DR during initial stages, for the proposed and several current best-practice models. The proposed model achieved 82.033% prediction accuracy, which was a significantly better performance than the current models. Thus, important risk factors for retinopathy can be successfully extracted using RFE. In addition, to evaluate the proposed prediction model robustness and generalization, we compared it with other machine learning models and datasets (nephropathy and hypertension–diabetes). The proposed prediction model will help improve early-stage retinopathy diagnosis based on individual risk factors.
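The RFE step described above can be sketched in a few lines: repeatedly drop the lowest-ranked feature until the desired number remain. In this illustration an absolute-correlation score stands in for the model-derived importance that a real RFE+DNN pipeline would use, and the risk-factor columns are synthetic.

```python
# Minimal sketch of recursive feature elimination (RFE). The correlation
# score is a stand-in for model-based feature importance; the risk-factor
# data below is synthetic, not the study's dataset.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def rfe(features, labels, keep):
    """features: {name: column}. Drop the weakest feature until `keep` remain."""
    features = dict(features)
    while len(features) > keep:
        weakest = min(features,
                      key=lambda f: abs(correlation(features[f], labels)))
        del features[weakest]
    return sorted(features)

data = {
    "hba1c": [5.0, 6.5, 7.9, 9.1, 5.4, 8.8],   # hypothetical risk factors
    "age":   [40, 55, 61, 70, 45, 66],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
labels = [0, 0, 1, 1, 0, 1]                     # retinopathy present or not
selected = rfe(data, labels, keep=2)
```

The uninformative "noise" column is eliminated first, leaving the features that actually track the label, which is the behavior RFE contributes before the DNN classifies.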


Author(s):  
Melda Alkan Çakıroğlu ◽  
Ahmet Ali Süzen ◽  

Masonry buildings have been built for centuries as housing and animal shelters, especially in rural areas, because they are economical, can be built with local materials, and do not require skilled labor. The walls, which are the load-bearing elements of masonry structures, are formed by placing stones, bricks, or blocks on top of each other with a binding mortar. In this study, a model based on the XGBoost algorithm, a tree-based classification algorithm, is proposed to classify the cost of samples reinforced with welded wire reinforcement/polypropylene-fiber-added dry-mix shotcrete. The model performs cost classification based on the independent parameters concrete, steel mesh, steel, epoxy, fiber, and workmanship. A softmax function was incorporated into the model for classification. A confusion matrix was produced to evaluate the classification performance of the model, and the model was compared to other machine learning algorithms. The model yielded higher accuracy and lower false-positive rates; as a result, the proposed model makes better estimates in cost classification than the other machine learning methods. In conclusion, the classification ability of the model is used to measure the cost effect in a construction process that demands a large labor force, time, and cost.
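The softmax output stage mentioned above is a small, standard function and can be sketched directly. The three logits below are illustrative scores for hypothetical low/medium/high cost classes, not values from the paper's model.

```python
import math

# Minimal sketch of the softmax used for a cost-class output layer.
# The logits are illustrative, not values from the paper's XGBoost model.

def softmax(logits):
    m = max(logits)                       # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])          # scores for low/medium/high cost
predicted_class = probs.index(max(probs))
```

Softmax turns raw per-class scores into a probability distribution, so the predicted cost class is simply the one with the highest probability.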


2020 ◽  
Vol 10 (21) ◽  
pp. 7612
Author(s):  
Yaw-Shyan TSAY ◽  
Chiu-Yu YEH

Recently, micro-perforated panels (MPP) have become a popular sound-absorbing material in the field of architectural acoustics. However, the cost of MPP is still high for the commercial market in Taiwan, and MPP is still not widely used compared with other sound-absorbing materials and devices. The objective of this study is to develop a prediction model for MEMM via a machine learning approach. An experiment covering 14 types of MEMM was first carried out in a reverberation room based on ISO 354. To predict the sound absorption coefficient of the MEMM, three conventional models and three machine learning (ML) models of the supervised learning method were studied for the development of the prediction model. The results showed that among the conventional models, predictions using an equivalent perimeter agreed best with the measurements compared with other parameters, with root mean square errors (RMSE) between the prediction models and experimental data of around 0.2~0.3. However, the RMSE of all ML models was less than 0.1, and the gradient boost model achieved an RMSE of 0.033 on the training set and 0.062 on the testing set, the best agreement with the experimental data.
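Since the study's model comparison hinges on RMSE, a minimal sketch of the metric is useful. The absorption coefficient values below are illustrative, not the study's measurements.

```python
import math

# Minimal sketch of the RMSE metric used to compare prediction models
# against measured sound absorption coefficients. Values are illustrative.

def rmse(predicted, measured):
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

measured  = [0.35, 0.48, 0.62, 0.70]   # absorption coefficients per band
predicted = [0.33, 0.50, 0.60, 0.73]
error = rmse(predicted, measured)
```

An RMSE below 0.1, as the ML models achieved, means the predicted coefficients deviate from the measured ones by well under a tenth on average across frequency bands.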


Author(s):  
Shengwei Tian ◽  
Yilin Yan ◽  
Long Yu ◽  
Mei Wang ◽  
Li Li

Malaria is a disease that greatly threatens human health; nearly half of the world's population is at risk of malaria. Anti-malarial drugs, which are sought, developed, and synthesized to keep malaria under control, have received increasing attention in the drug discovery field. Machine learning techniques have been used widely in drug research and development. On the basis of semi-supervised machine learning over molecular descriptors, this research develops a multilayer deep belief network (DBN) that can identify whether compounds have anti-malarial activity. First, the influence of feature dimensionality on prediction accuracy is discussed. The proposed model is then compared against shallow machine learning models and supervised models with a similar deep architecture. The results show that the proposed model predicts anti-malarial activity accurately, and its stable performance on the evaluation metrics confirms its practicability. The proposed DBN model performs better than both shallow supervised models and deep supervised models, and it could be applied to reduce the cost and time of drug discovery.
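At inference time, a DBN classifier amounts to a stack of sigmoid layers producing an activity probability. The sketch below shows only that forward pass; the weights and the three-feature molecular descriptor are toy assumptions, and a real DBN learns its layers by unsupervised pretraining before fine-tuning.

```python
import math

# Minimal sketch of a DBN used as a classifier at inference time: stacked
# sigmoid layers ending in a probability of anti-malarial activity.
# Weights and the descriptor are toy assumptions for illustration.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected sigmoid layer."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def dbn_predict(descriptor, layers):
    h = descriptor
    for weights, biases in layers:
        h = layer(h, weights, biases)
    return h[0]  # probability the compound is anti-malarially active

toy_layers = [
    ([[0.4, -0.2, 0.7], [0.1, 0.3, -0.5]], [0.0, 0.1]),  # hidden layer, 2 units
    ([[0.8, -0.6]], [0.05]),                              # output layer, 1 unit
]
p = dbn_predict([0.5, 1.0, 0.2], toy_layers)
```

What distinguishes a DBN from a plain feed-forward network is how these weights are obtained (layer-wise pretraining on unlabeled compounds), which is what lets it exploit the semi-supervised setting described above.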


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: To develop an improved and robust machine learning model for predicting Myocardial Infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine-learning-based computer-aided analysis system for early and accurate prediction of Myocardial Infarction (MI), using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to remove missing values from the dataset, then applies principal component analysis (PCA) to extract the optimal features and enhance classifier performance. After PCA, the reduced features are partitioned into a training set and a testing set. The training set (70%) is given as input to four widely used classifiers, namely support vector machine, k-nearest neighbor, logistic regression, and decision tree, and the remaining 30% is used to evaluate the output of the machine learning model using performance metrics: the confusion matrix, classifier accuracy, precision, sensitivity, F1-score, and AUC-ROC curve. Results: The classifier outputs are evaluated using these performance measures. We observed that logistic regression provides higher accuracy than the k-NN, SVM, and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the proposed classifiers, analyzed in Figures 4 and 5, show that logistic regression exhibits a good AUC-ROC score, around 70%, compared to the k-NN and decision tree algorithms.
Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system, predicting acute myocardial infarction at an earlier stage than existing machine-learning-based prediction models. It is capable of predicting the presence of acute myocardial infarction from a person's heart disease risk factors, helping to decide when to start lifestyle modification and medical treatment to prevent heart disease.
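The evaluation metrics named in the Methods section all derive from the 2x2 confusion matrix, and can be sketched directly. The counts below are illustrative, not the study's results.

```python
# Minimal sketch of the evaluation metrics used above, computed from a
# 2x2 confusion matrix. The counts are illustrative, not the study's.

def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # a.k.a. recall
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, f1

# Hypothetical test-set outcome for a classifier on the 30% hold-out split
acc, prec, sens, f1 = metrics(tp=45, fp=10, fn=15, tn=30)
```

Reporting precision and sensitivity alongside accuracy matters here because, in a clinical setting, missed MI cases (false negatives) and false alarms (false positives) carry very different costs.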


Author(s):  
Muhammad Junaid ◽  
Shiraz Ali Wagan ◽  
Nawab Muhammad Faseeh Qureshi ◽  
Choon Sung Nam ◽  
Dong Ryeol Shin
