Evaluating Resilience of Infrastructures Towards Endogenous Events by Non-Destructive High-Performance Techniques and Machine Learning Regression Algorithms

Author(s):  
Nicholas Fiorentini ◽  
Pietro Leandri ◽  
Massimo Losa

<p>In order to plan infrastructure maintenance strategies, Non-Destructive Techniques (NDT) have been largely employed in recent years, achieving outstanding results in the identification of infrastructural deficiencies. Nevertheless, the extensive combination of different NDT that can cover the various factors affecting infrastructure durability has not yet been thoroughly investigated.</p><p>This paper proposes a methodology for evaluating the resilience of infrastructures towards endogenous factors by combining different NDT outcomes. Machine Learning (ML) regression algorithms have been used to predict the pavement surface roughness connected to a set of potential endogenous conditioning factors. The development, application, and comparison of two different regression algorithms, specifically Regression Tree (RT) and Random Forest (RF), have been carried out.</p><p>The study area involves 4 testing sites, in both rural and urban contexts, for a total length of 11400 m. In addition to the International Roughness Index (IRI) calculated from profilometric measurements, a set of endogenous features of the infrastructure was collected by using NDT such as the Falling Weight Deflectometer (FWD) and Ground Penetrating Radar (GPR). Moreover, a set of topographical data of roadside areas, information on the properties of materials composing the subgrade and the pavement structure, traffic flow, rainfall, temperature, and age of the infrastructure was gathered.</p><p>The database was randomly split into a Training (70%) and a Test (30%) set. With the Training set, through a 10-Fold Cross-Validation (CV), the models have been trained and validated. A set of three performance metrics, namely the Coefficient of Determination (R<sup>2</sup>), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), has been used for the Goodness-of-Fit (GoF) assessment. Also, with the Test set, the Predictive Performance (PP) of the models has been evaluated.</p><p>Results indicate that the suggested methodology is satisfactory for supporting road maintenance planning processes by National Road Authorities (NRA) and allows decision-makers to pursue better solutions.</p>
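As a minimal sketch in plain Python (an illustrative reimplementation, not the authors' code; the sample IRI values are hypothetical), the three GoF metrics named in the abstract can be computed as:

```python
import math

def r2(y_true, y_pred):
    """Coefficient of Determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical IRI values (m/km) vs. model predictions
y_true = [1.2, 2.5, 3.1, 1.8, 2.2]
y_pred = [1.3, 2.4, 2.9, 2.0, 2.1]
gof = (r2(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

The same three functions can be applied unchanged to the Test-set predictions when assessing Predictive Performance.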

Author(s):  
N. Fiorentini ◽  
M. Maboudi ◽  
M. Losa ◽  
M. Gerke

Abstract. Technologically advanced strategies in infrastructural maintenance are increasingly required in countries such as Italy, where recovery and rehabilitation interventions are preferred to new works. For this purpose, Interferometric Synthetic Aperture Radar (InSAR) techniques have been employed in recent years, achieving reliable outcomes in the identification of infrastructural instabilities. Nevertheless, using the InSAR survey exclusively, it is not feasible to recognize the reasons for such vulnerabilities, and further in-depth investigations are essential. The primary purpose of this paper is to predict infrastructural displacements connected to surface motion, and the related causes, by combining InSAR techniques and Machine Learning algorithms. The development and application of a Regression Tree-based algorithm have been carried out for estimating the displacement of road pavement structures detected by the Persistent Scatterer InSAR technique. The study area is located in the province of Pistoia, Tuscany, Italy. Sentinel-1 images from 2014 to 2019 were used for the interferometric process, and a set of 29 environmental parameters was collected in a GIS platform. The database is randomly split into a Training (70%) and a Test (30%) set. With the Training set, through a 10-Fold Cross-Validation, the model is trained and validated, and the Goodness-of-Fit is evaluated. Also, with the Test set, the Predictive Performance of the model is assessed. Lastly, we applied the model to a stretch of a two-lane rural road that crosses the area. Results show that the suggested procedure can be used for supporting decision-making processes on road maintenance planning by National Road Authorities.
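The 70/30 split and 10-Fold Cross-Validation described above can be sketched in plain Python (helper names and the random seed are our own; real studies would typically use a library implementation):

```python
import random

def split_70_30(records, seed=0):
    """Randomly split records into a Training (70%) and a Test (30%) set."""
    idx = list(range(len(records)))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * len(records)))
    return [records[i] for i in idx[:cut]], [records[i] for i in idx[cut:]]

def k_fold_indices(n, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for j in range(k):
        train = [i for f in folds if f is not folds[j] for i in f]
        yield train, folds[j]
```

Each of the k folds serves once as the validation set while the remaining k-1 folds are used for training.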


2021 ◽  
Vol 2083 (3) ◽  
pp. 032059
Author(s):  
Qiang Chen ◽  
Meiling Deng

Abstract Regression algorithms are commonly used in machine learning. Building on encryption-based privacy-protection methods, this work studies regression algorithms combined with homomorphic encryption, a key current technology in this area. The paper proposes a PPLAR-based algorithm in which the correlation between data items is obtained by the logistic regression formula. The algorithm is distributed and parallelized on the Hadoop platform to improve the computing speed of the cluster while preserving the algorithm's mean absolute error.
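The logistic-regression core that the abstract alludes to can be sketched as a single plain-Python gradient step (a generic textbook formulation, not the PPLAR algorithm itself; no encryption or Hadoop distribution is modeled here):

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def gradient_step(w, X, y, lr=0.1):
    """One batch gradient-descent step for logistic regression weights w."""
    n = len(X)
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        # Prediction error for this sample
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            grad[j] += err * xj / n
    return [wj - lr * gj for wj, gj in zip(w, grad)]
```

In a distributed setting of the kind the abstract describes, the per-sample gradient contributions are what a map phase would compute in parallel before a reduce phase sums them.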


2021 ◽  
Vol 25 (5) ◽  
pp. 1073-1098
Author(s):  
Nor Hamizah Miswan ◽  
Chee Seng Chan ◽  
Chong Guan Ng

Hospital readmission is a major cost for healthcare systems worldwide. If patients with a higher potential of readmission could be identified at the start, existing resources could be used more efficiently, and appropriate plans could be implemented to reduce the risk of readmission. Therefore, it is important to predict the right target patients. Medical data is usually noisy, incomplete, and inconsistent. Hence, before developing a prediction model, it is crucial to efficiently set up the predictive model so that improved predictive performance is achieved. The current study aims to analyse the impact of different preprocessing methods on the performance of different machine learning classifiers. The preprocessing methods applied in previous hospital readmission studies were compared, and the most common approaches were highlighted, such as missing-value imputation, feature selection, data balancing, and feature scaling. The hyperparameters were selected using Bayesian optimisation. The different preprocessing pipelines were assessed using various performance metrics and computational costs. The results indicated that the preprocessing approaches helped improve the model's prediction of hospital readmission.
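Two of the preprocessing steps compared in the study, missing-value imputation and feature scaling, can be sketched in plain Python (mean imputation and min-max scaling are chosen here as illustrative variants; the study compares several alternatives):

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def min_max_scale(column):
    """Scale values to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]
```

In a full pipeline, imputation is applied before scaling, and both are fitted on the training split only to avoid leaking test-set statistics.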


Mathematics ◽  
2021 ◽  
Vol 9 (20) ◽  
pp. 2537
Author(s):  
Luis Rolando Guarneros-Nolasco ◽  
Nancy Aracely Cruz-Ramos ◽  
Giner Alor-Hernández ◽  
Lisbeth Rodríguez-Mazahua ◽  
José Luis Sánchez-Cervantes

Cardiovascular Diseases (CVDs) are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. As an effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to discern patients suffering from a CVD from those who do not suffer from any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, f1-score, and roc-auc) using the train-test split technique and k-fold cross-validation. Our study identifies the top-two and top-four attributes of the CVD datasets by analyzing the accuracy metric, in order to determine which attributes are best for predicting and diagnosing CVD. As our main findings, the ten ML classifiers exhibited appropriate classification and predictive performance with the accuracy metric on the top-two attributes, identifying three main attributes for the diagnosis and prediction of a CVD, such as arrhythmia and tachycardia. Hence, they can be successfully implemented for improving current CVD diagnosis efforts and helping patients around the world, especially in regions where medical staff is lacking.


2019 ◽  
Author(s):  
Ramin Mohammadi ◽  
Amanda Jayne Centi ◽  
Mursal Atif ◽  
Stephen Agboola ◽  
Kamal Jethwani ◽  
...  

Abstract It is well established that lack of physical activity is detrimental to the overall health of an individual. Modern-day activity trackers enable individuals to monitor their daily activity to meet and maintain targets and to promote activity-encouraging behavior. However, the benefits of activity trackers are attenuated over time due to waning adherence. One of the key methods to improve adherence to goals is to motivate individuals to improve on their historic performance metrics. In this work we developed a machine learning model that dynamically adjusts and prescribes an activity target for the forthcoming week that can be realistically achieved by activity-tracker users. We considered user-specific personal, social, and environmental factors, along with the daily step count through the current week (7 days). In addition, we computed an entropy measure that characterizes the pattern of daily step count for the current week. Data for training the machine learning model was collected from 30 participants over a duration of 9 weeks. The model predicted the target daily count with a mean absolute error of 1545 steps. The proposed work can be used to set personalized goals in accordance with the individual's level of activity, thereby improving adherence to the fitness tracker.
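The entropy measure characterizing the weekly step-count pattern could, for example, be a Shannon entropy over each day's share of the weekly total; this is a plausible reading, as the abstract does not give the authors' exact definition:

```python
import math

def step_pattern_entropy(daily_steps):
    """Shannon entropy (bits) of the share of weekly steps taken each day."""
    total = sum(daily_steps)
    shares = [s / total for s in daily_steps if s > 0]
    return -sum(p * math.log2(p) for p in shares)
```

A perfectly even week maximizes the entropy at log2(7), about 2.81 bits, while activity concentrated on a single day yields 0 bits, so the measure distinguishes steady walkers from burst-pattern users.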


The aim of this paper is to model a network and predict the exchange rate of the United States Dollar to the Indian Rupee using daily exchange rates from Dec 18, 1991 to Jul 19, 2007. In this paper, the Water Cycle Algorithm (WCA) has been used to optimize an Artificial Neural Network (ANN) for foreign exchange prediction on the basis of its predictive performance. The performance metrics considered for the evaluation of the models are the root mean square error (RMSE) and mean absolute error (MAE). The tabulated outcome shows the efficiency of the model over other popular models.


2021 ◽  
Author(s):  
Mariza Ferro ◽  
Vinicius P. Klôh ◽  
Matheus Gritz ◽  
Vitor de Sá ◽  
Bruno Schulze

Understanding the computational impact of scientific applications on computational architectures through their runtime should guide the use of computational resources in high-performance computing systems. In this work, we propose an analysis of Machine Learning (ML) algorithms to gather knowledge about the performance of these applications through hardware events and derived performance metrics. Nine NAS benchmarks were executed and their hardware events were collected. These experimental results were used to train a Neural Network, a Decision Tree Regressor, and a Linear Regression focusing on predicting the runtime of scientific applications according to the performance metrics.
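In the spirit of the Linear Regression baseline above, a least-squares fit of runtime against a single hardware event can be sketched in plain Python (the counter values and runtimes below are made-up, and real models would use many events at once):

```python
def fit_linreg(x, y):
    """Ordinary least squares for the model y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# Hypothetical cache-miss counts (millions) vs. measured runtimes (seconds)
misses = [10.0, 20.0, 30.0, 40.0]
runtime = [1.5, 2.5, 3.5, 4.5]
a, b = fit_linreg(misses, runtime)
predicted = a + b * 25.0  # predicted runtime for an unseen counter value
```

The slope b quantifies how strongly the chosen hardware event drives runtime, which is exactly the kind of relationship the trained regressors are meant to capture.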


2019 ◽  
Vol 8 (3) ◽  
pp. 1262-1267

In recent times, with technological advancement, industries and organizations are transforming all their inflow and outflow operations into a digital identity. In this setting, the reputation of the organization is also in the hands of its employees. One of the major needs of an employee in the working environment is to avail leave or vacation based on family circumstances. Based on the health condition and needs of the employee, the organization must extend leave for the satisfaction of the employee. The performance of the employee is also predicted based on the working days in the organization. With this view, this paper attempts to analyze the performance of the employee and the number of working hours by using machine learning algorithms. The Absenteeism at work dataset from the UCI Machine Learning Repository is used for prediction analysis. The prediction of absent hours is achieved in three ways. Firstly, the correlation between each of the dataset attributes is found and depicted as a histogram. Secondly, the most highly correlated features are identified and directly fitted to regression models such as Linear regression, SGD regression, RANSAC regression, Ridge regression, Huber regression, ARD regression, Passive Aggressive regression, and Theil-Sen regression. Thirdly, performance analysis is done by examining performance metrics such as Mean Squared Error, Mean Absolute Error, R2 Score, Explained Variance Score, and Mean Squared Log Error. The implementation is done in Python in the Anaconda Spyder Navigator Integrated Development Environment. Experimental results show that Passive Aggressive regression achieved effective prediction of the number of absent hours, with a minimum MSE of 0.04, MAE of 0.16, EVS of 0.03, MSLE of 0.32, and a reasonable R2 Score of 0.89.
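Two of the less common metrics reported above, Mean Squared Log Error and Explained Variance Score, can be sketched as follows (plain-Python reimplementations of the usual definitions, not the paper's code):

```python
import math

def msle(y_true, y_pred):
    """Mean Squared Log Error over non-negative targets and predictions."""
    return sum((math.log1p(t) - math.log1p(p)) ** 2
               for t, p in zip(y_true, y_pred)) / len(y_true)

def explained_variance(y_true, y_pred):
    """Explained Variance Score: 1 - Var(residuals) / Var(y_true)."""
    n = len(y_true)
    res = [t - p for t, p in zip(y_true, y_pred)]
    mr = sum(res) / n
    var_res = sum((r - mr) ** 2 for r in res) / n
    my = sum(y_true) / n
    var_y = sum((t - my) ** 2 for t in y_true) / n
    return 1 - var_res / var_y
```

Unlike MSE, MSLE penalizes relative rather than absolute errors, and EVS, unlike R2, ignores any constant bias in the predictions because it subtracts the mean residual.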


Author(s):  
Luis Rolando Guarneros-Nolasco ◽  
Nancy Aracely Cruz-Ramos ◽  
Giner Alor-Hernández ◽  
Lisbeth Rodríguez-Mazahua ◽  
José Luis Sánchez-Cervantes

CVDs are a leading cause of death globally. In CVDs, the heart is unable to deliver enough blood to other body regions. Since effective and accurate diagnosis of CVDs is essential for CVD prevention and treatment, machine learning (ML) techniques can be effectively and reliably used to discern patients suffering from a CVD from those who do not suffer from any heart condition. Namely, machine learning algorithms (MLAs) play a key role in the diagnosis of CVDs through predictive models that allow us to identify the main risk factors influencing CVD development. In this study, we analyze the performance of ten MLAs on two datasets for CVD prediction and two for CVD diagnosis. Algorithm performance is analyzed on the top-two and top-four dataset attributes/features with respect to five performance metrics (accuracy, precision, recall, f1-score, and roc-auc) using the train-test split technique and k-fold cross-validation. Our study identifies the top two and top four attributes from each CVD diagnosis/prediction dataset. As our main findings, the ten MLAs exhibited appropriate diagnostic and predictive performance; hence, they can be successfully implemented for improving current CVD diagnosis efforts and helping patients around the world, especially in regions where medical staff is lacking.


Author(s):  
Samir Bandyopadhyay ◽  
Shawni Dutta ◽  
Upasana Mukherjee

The novel coronavirus disease (COVID-19) has created immense threats to public health on various levels around the globe. The unpredictable outbreak of this disease and the pandemic situation are causing severe depression, anxiety, and other mental as well as physical health-related problems among human beings. To combat this disease, vaccination is essential, as it will boost the immune system of human beings when in contact with infected people. The vaccination process is thus necessary to confront the outbreak of COVID-19. This deadly disease has posed an enormous challenge to the social and economic condition of the entire world. The worldwide vaccination progress should be tracked to identify how fast the entire economic as well as social life will be stabilized. To monitor the vaccination progress, a machine learning-based regressor model is developed in this study. This tracking process has been applied to data from 14th December, 2020 to 24th April, 2021. A set of ensemble-based machine learning regressor models, namely Random Forest, Extra Trees, Gradient Boosting, AdaBoost, and Extreme Gradient Boosting, is implemented and their predictive performance is compared. The comparative study reveals that the AdaBoost regressor outperforms the others, with a minimized mean absolute error (MAE) of 9.968 and root mean squared error (RMSE) of 11.133.
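The final model-selection step of a comparison like the one above can be sketched as picking the regressor with the lowest error; note that only the AdaBoost figures come from the abstract, while the other entries below are hypothetical placeholders for illustration:

```python
# (MAE, RMSE) per model; only AdaBoost's numbers are reported in the abstract,
# the rest are hypothetical placeholders.
scores = {
    "RandomForest": (12.4, 14.9),
    "ExtraTrees": (11.7, 13.8),
    "GradientBoosting": (10.5, 12.2),
    "AdaBoost": (9.968, 11.133),
    "XGBoost": (10.9, 12.7),
}

def best_model(score_table):
    """Return the model name with the lowest RMSE (second tuple entry)."""
    return min(score_table, key=lambda name: score_table[name][1])
```

Ranking by MAE instead would simply change the key to index 0; here both metrics agree on the winner.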

