Integrated hydrodynamic and machine learning models for compound flooding prediction in a data-scarce estuarine delta

Abstract. Flood forecasting based on water level modeling is an essential non-structural measure against compound flooding worldwide. With vulnerability increasing under climate change, every coastal area urgently needs a water level model for better flood risk management. Unfortunately, for local water management agencies in developing countries, building such a model is challenging because of limited computational resources and scarce observational data. Here, we attempt to address the issue by proposing an integrated hydrodynamic and machine learning approach to predict compound flooding in those areas. As a case study, this integrated approach is implemented in Pontianak, the densest coastal urban area in the Kapuas River delta, Indonesia. First, we built a hydrodynamic model to simulate several compound flooding scenarios; its outputs are then used to train the machine learning model. To obtain a robust machine learning model, we consider three machine learning algorithms: Random Forest (RF), Multiple Linear Regression (MLR), and Support Vector Machine (SVM). The results show that the integrated scheme works. Random Forest is the most accurate algorithm for predicting flooding hazards in the study area, with RMSE = 0.11 m compared to SVM (RMSE = 0.18 m) and MLR (RMSE = 0.19 m). The machine learning model with the RF algorithm predicted ten out of seventeen compound flooding events during the testing phase. Therefore, Random Forest is proposed as the most appropriate algorithm to build a reliable ML model capable of assessing compound flood hazards in the area of interest.
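The abstract ranks the three algorithms by RMSE. As a minimal sketch of that comparison, the snippet below computes RMSE for two candidate prediction series and picks the lower one. The water levels and the names `model_a` and `model_b` are hypothetical placeholders, not data or models from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical water levels in metres (illustrative only).
observed = [1.20, 1.45, 1.80, 2.10]
candidates = {
    "model_a": [1.25, 1.40, 1.85, 2.05],
    "model_b": [1.00, 1.70, 1.60, 2.40],
}

# Score every candidate and keep the one with the smallest error.
scores = {name: rmse(observed, pred) for name, pred in candidates.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Because RMSE squares the residuals before averaging, a few large misses on flood peaks weigh more heavily than many small ones, which is one reason it is a common choice for water level models.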

Machine Learning Model for Credit Card Fraud Detection- A Comparative Analysis

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/6/6 ◽

2021 ◽

Author(s):

Pratyush Sharma ◽

Souradeep Banerjee ◽

Devyanshi Tiwari ◽

Jagdish Chandra Patni

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Credit Card ◽

Detection System ◽

Fraud Detection ◽

Learning Model ◽

Machine Learning Algorithms ◽

Support Vector ◽

Machine Learning Model ◽

Artificial Neural Network Ann

In today's world, we are on an express train to a cashless society, which has led to a tremendous escalation in credit card transactions. The flip side is that fraudulent activity is also on the rise; a methodical fraud detection system is therefore indispensable to cardholders and card-issuing banks alike. In this paper, we use different machine learning algorithms, namely random forest, logistic regression, Support Vector Machine (SVM), and neural networks, to train models on the given dataset and compare the accuracy and other measures achieved by each algorithm. Comparing the F_1 scores lets us determine which algorithm is best suited to our purpose. Our study concluded that the Artificial Neural Network (ANN) performed best, with an F_1 score of 0.91.
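The comparison above rests on the F_1 score, the harmonic mean of precision and recall. A minimal sketch of the calculation from confusion-matrix counts follows; the counts in the example are hypothetical and do not come from the paper's dataset.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2 * precision * recall / (precision + recall)."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return 0.0  # no true positives at all
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 frauds caught, 10 false alarms, 10 frauds missed.
print(f1_score(90, 10, 10))
```

F_1 ignores true negatives, so it is not inflated by the large number of legitimate transactions that a fraud dataset typically contains.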

Machine learning building price prediction with green building determinant

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v9.i3.pp379-386 ◽

2020 ◽

Vol 9 (3) ◽

pp. 379

Author(s):

Thuraiya Mohd ◽

Syafiqah Jamil ◽

Suraya Masrom

Keyword(s):

Machine Learning ◽

Random Forest ◽

Model Building ◽

Green Building ◽

Learning Model ◽

Machine Learning Algorithms ◽

Classification Problems ◽

Price Prediction ◽

Machine Learning Model ◽

Housing Issue

In the era of Industry 4.0, many urgent issues in industry can be effectively solved with artificial intelligence techniques, including machine learning. Designing an effective machine learning model for prediction and classification problems is an ongoing endeavor. In addition, time and expertise are needed to tailor the model to a specific issue, such as the green building housing issue. Green building (GB) is known as a potential approach to increase the efficiency of a building. To the best of our knowledge, no machine learning model has yet been applied to GB valuation factors for building price prediction, in contrast to conventional building development. This paper reports an empirical study that models building price prediction based on green building and other common determinants. The experiments used five common machine learning algorithms, namely Linear Regression, Decision Tree, Random Forest, Ridge, and Lasso, tested on a set of real building datasets covering Kuala Lumpur District, Malaysia. The results showed that the Random Forest algorithm outperforms the other four algorithms on the tested dataset and that the green building determinant has a promising effect on the model.

An integrated machine learning model for indoor network optimization to maximize coverage

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i1.pp394-402 ◽

2021 ◽

Vol 24 (1) ◽

pp. 394

Author(s):

Ahmed Wasif Reza ◽

Abdullah Al Rifat ◽

Tanvir Ahmed

Keyword(s):

Machine Learning ◽

Network Optimization ◽

Nearest Neighbor ◽

Learning Model ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Simple Task ◽

Machine Learning Model ◽

Minimum Number

Indoor network optimization is not a simple task due to the obstacles, interference, and attenuation of the signal in an environment. Intense noise can affect the intelligibility of the signal and reduce the coverage strength significantly, which results in a poor user experience. Most existing works focus on finding the location of the devices via different mathematical and generic algorithmic approaches, but very few apply machine learning algorithms. The purpose of this research is to introduce an integrated machine learning model to find maximum indoor coverage with a minimum number of transmitters. The users in the indoor environment are allocated based on the most reliable signal strength, and the system is also capable of allocating new users. K-means clustering, K-nearest neighbor (KNN), support vector machine (SVM), and Gaussian Naïve Bayes (GNB) have been used to provide an optimized solution. KNN, SVM, and GNB obtained a maximum accuracy of 100% in some cases. However, among all the algorithms, KNN performed the best and provided an average accuracy of 93.33%. The K-fold cross-validation (Kf-CV) technique has been added to validate the experimental simulations and re-evaluate the outcomes of the machine learning models.
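To make the KNN and K-fold cross-validation steps concrete, here is a minimal pure-Python sketch. It assumes each user is described by two signal-strength readings and labelled with the transmitter it is allocated to; the readings, the labels `tx_a` and `tx_b`, and the interleaved fold assignment are illustrative assumptions, not the authors' setup.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, folds):
    """Yield (train, test) index lists; every index is tested exactly once."""
    indices = list(range(n))
    for f in range(folds):
        test = indices[f::folds]  # interleaved assignment, no shuffling
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def cross_validated_accuracy(xs, ys, folds=5, k=3):
    """Fraction of samples classified correctly while held out."""
    correct = 0
    for train, test in k_fold_indices(len(xs), folds):
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        correct += sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test)
    return correct / len(xs)


# Hypothetical signal strengths (dBm) from two transmitters per user.
xs = [(-40, -80), (-42, -78), (-45, -82), (-41, -79), (-43, -81),
      (-80, -40), (-78, -42), (-82, -45), (-79, -41), (-81, -43)]
ys = ["tx_a"] * 5 + ["tx_b"] * 5
print(cross_validated_accuracy(xs, ys, folds=5, k=3))
```

Averaging accuracy over held-out folds, rather than scoring on the training data, is what keeps a figure such as an average accuracy from being an artifact of one favourable split.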

Feature Selection and Comparison of Machine Learning Algorithms in Classification of Grazing and Rumination Behaviour in Sheep

Sensors ◽

10.3390/s18103532 ◽

2018 ◽

Vol 18 (10) ◽

pp. 3532 ◽

Cited By ~ 16

Author(s):

Nicola Mansbridge ◽

Jurgen Mitsch ◽

Nicola Bollard ◽

Keith Ellis ◽

Giuliana Miguel-Pacheco ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Time Budget ◽

Learning Algorithms ◽

Eating Behaviour ◽

Machine Learning Algorithms ◽

Support Vector ◽

Optimum Number ◽

Eating Behaviours ◽

Adaptive Boosting

Grazing and ruminating are the most important behaviours for ruminants, as they spend most of their daily time budget performing these. Continuous surveillance of eating behaviour is an important means for monitoring ruminant health, productivity and welfare. However, surveillance performed by human operators is prone to human variance, time-consuming and costly, especially on animals kept at pasture or free-ranging. The use of sensors to automatically acquire data, and software to classify and identify behaviours, offers significant potential in addressing such issues. In this work, data collected from sheep by means of an accelerometer/gyroscope sensor attached to the ear and collar, sampled at 16 Hz, were used to develop classifiers for grazing and ruminating behaviour using various machine learning algorithms: random forest (RF), support vector machine (SVM), k nearest neighbour (kNN) and adaptive boosting (Adaboost). Multiple features extracted from the signals were ranked on their importance for classification. Several performance indicators were considered when comparing classifiers as a function of algorithm used, sensor localisation and number of used features. Random forest yielded the highest overall accuracies: 92% for collar and 91% for ear. Gyroscope-based features were shown to have the greatest relative importance for eating behaviours. The optimum number of feature characteristics to be incorporated into the model was 39, from both ear and collar data. The findings suggest that one can successfully classify eating behaviours in sheep with very high accuracy; this could be used to develop a device for automatic monitoring of feed intake in the sheep sector to monitor health and welfare.
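Classifiers of this kind are usually trained on summary features computed over fixed-length windows of the raw signal. The sketch below shows one way to do that for a single sensor axis at the 16 Hz sampling rate given in the abstract. The window length and the three chosen features are assumptions for illustration; the abstract does not list the features the authors extracted.

```python
import statistics

SAMPLE_RATE_HZ = 16  # sampling rate stated in the abstract


def window_features(signal, window_seconds):
    """Split one axis into non-overlapping windows and summarise each one."""
    size = int(SAMPLE_RATE_HZ * window_seconds)
    if size < 1:
        raise ValueError("window must contain at least one sample")
    features = []
    # Trailing samples that do not fill a whole window are dropped.
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        features.append({
            "mean": statistics.fmean(window),
            "std": statistics.pstdev(window),
            "range": max(window) - min(window),
        })
    return features


# Hypothetical 4-second trace alternating between two values.
trace = [0.0, 1.0] * 32
for row in window_features(trace, window_seconds=2):
    print(row)
```

Each dictionary becomes one row of the feature table passed to a classifier, with the behaviour label for that window as the target.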

A Latent Dirichlet Allocation and Fuzzy Clustering Based Machine Learning Model for Text Thesaurus

International Journal of Computers Communications & Control ◽

10.15837/ijccc.2020.2.3811 ◽

2020 ◽

Vol 15 (2) ◽

Author(s):

Jia Luo ◽

Dongwen Yu ◽

Zong Dai

Keyword(s):

Machine Learning ◽

Fuzzy Clustering ◽

Latent Dirichlet Allocation ◽

Learning Model ◽

Machine Learning Algorithms ◽

Text Data ◽

Huge Data ◽

Machine Learning Model ◽

N Gram ◽

Dirichlet Allocation

Manual methods cannot realistically process huge amounts of structured and semi-structured data. This study aims to solve the problem of processing such data with machine learning algorithms. We collected text data on public opinion about the company through crawlers, used the Latent Dirichlet Allocation (LDA) algorithm to extract keywords from the text, and used fuzzy clustering to group the keywords into topics. The topic keywords serve as a seed dictionary for new word discovery. To verify the efficiency of machine learning in new word discovery, algorithms based on association rules, N-Gram, PMI, and Word2vec were compared. The experimental results show that the Word2vec algorithm, which is based on a machine learning model, achieves the highest accuracy, recall, and F-value.
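Of the baselines listed, PMI (pointwise mutual information) is the simplest to illustrate: adjacent tokens that co-occur more often than their individual frequencies predict are candidates for a new word. The sketch below uses the standard definition on a toy token list; the tokens are hypothetical, and the way the authors counted and thresholded candidates is not stated in the abstract.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """PMI(a, b) = log2( P(a, b) / (P(a) * P(b)) ) for each adjacent pair."""
    if len(tokens) < 2:
        return {}
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (a, b), count in bigrams.items():
        p_pair = count / n_bigrams
        p_a = unigrams[a] / n_unigrams
        p_b = unigrams[b] / n_unigrams
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


# Hypothetical token stream; pairs are printed from highest to lowest PMI.
tokens = ["cloud", "native", "app", "runs", "on", "cloud", "native", "stack"]
for pair, score in sorted(bigram_pmi(tokens).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

On small corpora PMI favours rare pairs, so in practice a minimum count is usually applied before ranking.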

Internet of Things-Based Intelligent Smart Home Control System

Security and Communication Networks ◽

10.1155/2021/9928254 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Olutosin Taiwo ◽

Absalom E. Ezugwu

Keyword(s):

Machine Learning ◽

Mobile Application ◽

Smart Home ◽

Human Life ◽

Machine Learning Algorithms ◽

Support Vector ◽

Automation System ◽

Home Automation ◽

Area Of Interest ◽

Home Automation System

The smart home is now an established area of interest and research that contributes to comfort in modern homes. With the Internet being an essential part of broad communication in modern life, IoT has allowed homes to go beyond buildings and become interactive abodes. In many spheres of human life, the IoT has grown exponentially, including monitoring ecological factors, controlling the home and its appliances, and storing data generated by devices in the house in the cloud. A smart home includes multiple components, technologies, and devices that generate valuable data for predicting home and environment activities. This work presents the design and development of a ubiquitous, cloud-based intelligent home automation system. The system controls, monitors, and oversees the security of a home and its environment via an Android mobile application. One module controls and monitors electrical appliances and environmental factors, while another module oversees the home’s security by detecting motion and capturing images. Our work uses a camera to capture images of objects when their motion is detected. To avoid false alarms, we used machine learning to differentiate between images of regular home occupants and those of an intruder. The support vector machine algorithm is proposed in this study to classify the features of the captured image and determine whether it shows a regular home occupant or an intruder before sending an alarm to the user. The design of the mobile application allows a graphical display of the activities in the house. Our work shows that machine learning algorithms can improve home automation system functionality and enhance home security. The work’s prototype was implemented using an ESP8266 board, an ESP32-CAM board, a 5 V four-channel relay module, and sensors.

Techniques for Detecting Malware Traffic: A Comprehensive Approach to Feature Selection and Classification

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39088 ◽

2021 ◽

Vol 9 (12) ◽

pp. 1-10

Author(s):

Harsha A K

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Malware Detection ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Steady Increase ◽

Extreme Gradient Boosting

Abstract: Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional approaches to detecting malware, such as packet content analysis, are ineffective on encrypted data. In the absence of actual packet contents, we can make use of other features such as packet size, arrival time, source and destination addresses, and similar metadata to detect malware. Such information can be used to train machine learning classifiers to distinguish malicious from benign packets. In this paper, we offer an efficient malware detection approach using machine learning classification algorithms such as support vector machine, random forest, and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set to assess their respective performances. We further attempt to tune the hyperparameters of the algorithms to achieve better results. The random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, with area under the curve values of 0.9928 and 0.9998, respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
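The reported figures are area-under-the-ROC-curve values. One way to see what that number means is the pairwise definition: the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The sketch below implements that definition directly; the labels and scores are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are 1 (malicious) or 0 (benign); ties count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four packets.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values above 0.99 indicate that the two classes are almost fully separable on the selected features.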

Detection and defense of cyberattacks on the machine learning control of robotic systems

The Journal of Defense Modeling and Simulation Applications Methodology Technology ◽

10.1177/15485129211043874 ◽

2021 ◽

pp. 154851292110438

Author(s):

George W Clark ◽

Todd R Andel ◽

J Todd McDonald ◽

Tom Johnsten ◽

Tom Thomas

Keyword(s):

Machine Learning ◽

Autonomous Vehicles ◽

Defense Mechanisms ◽

Autonomous Vehicle ◽

Learning Algorithms ◽

Learning Model ◽

Machine Learning Algorithms ◽

Robotic Systems ◽

Machine Learning Model ◽

Attack Surface

Robotic systems are no longer simply built and designed to perform sequential repetitive tasks primarily in a static manufacturing environment. Systems such as autonomous vehicles make use of intricate machine learning algorithms to adapt their behavior to dynamic conditions in their operating environment. These machine learning algorithms provide an additional attack surface for an adversary to exploit in order to perform a cyberattack. Since an attack on robotic systems such as autonomous vehicles has the potential to cause great damage and harm to humans, it is essential that the detection of and defenses against these attacks be explored. This paper discusses the plausibility of direct and indirect cyberattacks on a machine learning model through the use of a virtual autonomous vehicle operating in a simulation environment using a machine learning model for control. Using this vehicle, this paper proposes various methods for detecting cyberattacks on its machine learning model and discusses possible defense mechanisms to prevent such attacks.

Abstract 15895: Machine Learning Algorithms to Predict Major Adverse Cardiovascular Events in Patients Undergoing Orthotopic Liver Transplantation: A Retrospective Cohort Study

Circulation ◽

10.1161/circ.142.suppl_3.15895 ◽

2020 ◽

Vol 142 (Suppl_3) ◽

Author(s):

Vardhmaan Jain ◽

Vikram Sharma ◽

Agam Bansal ◽

Cerise Kleb ◽

Chirag Sheth ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Cardiovascular Events ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Major Adverse Cardiovascular Events ◽

Support Vector ◽

Post Transplant ◽

Extreme Gradient Boosting ◽

All Cause Mortality

Background: Post-transplant major adverse cardiovascular events (MACE) are among the leading causes of death among orthotopic liver transplant (OLT) recipients. Despite years of guideline-directed therapy, there are limited data on predictors of post-OLT MACE. We assessed whether machine learning algorithms (MLA) can predict MACE and all-cause mortality in patients undergoing OLT. Methods: We tested three MLA, namely support vector machine, extreme gradient boosting (XG-Boost), and random forest, against traditional logistic regression for prediction of MACE and all-cause mortality on a cohort of consecutive patients undergoing OLT at our center between 2008 and 2019. The cohort was randomly split into a training (80%) and testing (20%) cohort. Model performance was assessed using the c-statistic or AUC. Results: We included 1,459 consecutive patients who underwent OLT, with mean ± SD age 54.2 ± 13.8 years, 32% female. There were 199 (13.6%) MACE and 289 (20%) deaths at a mean follow-up of 4.56 ± 3.3 years. The random forest MLA was the best performing model for predicting MACE [AUC: 0.78, 95% CI: 0.70-0.85] as well as mortality [AUC: 0.69, 95% CI: 0.61-0.76], with all models performing better when predicting MACE vs mortality. See Table and Figure. Conclusion: Random forest machine learning algorithms were more predictive and discriminative than traditional regression models for predicting major adverse cardiovascular events and all-cause mortality in patients undergoing OLT. Validation and subsequent incorporation of MLA in clinical decision making for OLT candidacy could help risk stratify patients for post-transplant adverse cardiovascular events.
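The random 80/20 split described in the Methods can be sketched in a few lines. The function name `split_cohort`, the fixed seed, and the use of integer positions as patient identifiers are assumptions for illustration; the abstract does not say how the authors performed or seeded the split.

```python
import random


def split_cohort(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle a copy of the IDs and cut it into training and testing lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Cohort size taken from the abstract; the IDs themselves are placeholders.
train_ids, test_ids = split_cohort(range(1459))
print(len(train_ids), len(test_ids))
```

Splitting by patient rather than by record keeps every patient's data on one side of the split, so the testing cohort stays unseen during training.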

MEWS++: Enhancing the Prediction of Clinical Deterioration in Admitted Patients through a Machine Learning Model

Journal of Clinical Medicine ◽

10.3390/jcm9020343 ◽

2020 ◽

Vol 9 (2) ◽

pp. 343 ◽

Cited By ~ 4

Author(s):

Arash Kia ◽

Prem Timsina ◽

Himanshu N. Joshi ◽

Eyal Klang ◽

Rohit R. Gupta ◽

...

Keyword(s):

Machine Learning ◽

At Risk ◽

Area Under The Curve ◽

Learning Model ◽

Clinical Deterioration ◽

Early Warning Score ◽

Support Vector ◽

Adult Age ◽

Machine Learning Model ◽

Patients At Risk

Early detection of patients at risk for clinical deterioration is crucial for timely intervention. Traditional detection systems rely on a limited set of variables and are unable to predict the time of decline. We describe a machine learning model called MEWS++ that enables the identification of patients at risk of escalation of care or death six hours prior to the event. A retrospective single-center cohort study was conducted from July 2011 to July 2017 of adult (age > 18) inpatients excluding psychiatric, parturient, and hospice patients. Three machine learning models were trained and tested: random forest (RF), linear support vector machine, and logistic regression. We compared the models’ performance to the traditional Modified Early Warning Score (MEWS) using sensitivity, specificity, and Area Under the Curve for Receiver Operating Characteristic (AUC-ROC) and Precision-Recall curves (AUC-PR). The primary outcome was escalation of care from a floor bed to an intensive care or step-down unit, or death, within 6 h. A total of 96,645 patients with 157,984 hospital encounters and 244,343 bed movements were included. Overall rate of escalation or death was 3.4%. The RF model had the best performance with sensitivity 81.6%, specificity 75.5%, AUC-ROC of 0.85, and AUC-PR of 0.37. Compared to traditional MEWS, sensitivity increased 37%, specificity increased 11%, and AUC-ROC increased 14%. This study found that using machine learning and readily available clinical data, clinical deterioration or death can be predicted 6 h prior to the event. The model we developed can warn of patient deterioration hours before the event, thus helping make timely clinical decisions.
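The gap between an AUC-ROC of 0.85 and an AUC-PR of 0.37 follows from how rare the outcome is. As a rough illustration, Bayes' rule converts sensitivity and specificity into positive predictive value (precision) at a given prevalence. The sketch below plugs in the abstract's sensitivity and specificity and, as a simplifying assumption, treats the 3.4% overall event rate as the prevalence at prediction time, which the study does not claim.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = TP rate / (TP rate + FP rate), weighted by prevalence."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos_rate + false_pos_rate == 0:
        return 0.0
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Sensitivity and specificity from the abstract; prevalence is an assumption.
ppv = positive_predictive_value(0.816, 0.755, 0.034)
print(round(ppv, 3))
```

Under that assumption roughly one alert in ten would be followed by an event, which shows why precision-based metrics look modest for rare outcomes even when sensitivity and specificity are good.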