Flood Prediction and Warning System using Dam Data Monitoring

Floods have been among the most devastating natural calamities affecting parts of the state over the past few years. The recurring calamity necessitates an efficient early warning system, since anticipation and preparedness play a key role in mitigating the impact. Though heavy and erratic rainfall has been identified as one of the main causes of flooding in several places, the floods witnessed in various regions of Kerala were the result of the sudden opening of reservoirs, indicating poor dam management. The unforeseen flow of water often left little time for evacuation. Prediction thus plays a key role in avoiding the loss of life and property that follows such calamities. The benefits and potential offered by machine learning make it the most promising approach. The developed system is modeled with the Malampuzha Dam as reference. A Support Vector Machine (SVM) is used as the machine learning method for prediction and is programmed in Python. The idea has been to create an early flood prediction and warning system by monitoring different weather parameters and dam-related data. The feature vectors include current live storage, current reservoir level, rainfall, and relative humidity for the period 2016-2019. Based on the analysis of these parameters, the opening or closure of the dam's shutters is predicted. The release of shutters has varied impacts on nearby regions; these are assessed by a subsequent prediction that maps regions according to the level of warning to be issued. Warnings are issued through a Flask-based server, which identifies vulnerable areas based on a flood hazard reference for the regions. The dam status prediction model delivered a highest prediction accuracy of 99.14%, and the associated warning levels were generated on the development server, thus preventing unexpected release.

Exploring Impact of Age and Gender on Sentiment Analysis Using Machine Learning

Electronics ◽

10.3390/electronics9020374 ◽

2020 ◽

Vol 9 (2) ◽

pp. 374 ◽

Cited By ~ 2

Author(s):

Sudhanshu Kumar ◽

Monika Gahalawat ◽

Partha Pratim Roy ◽

Debi Prosad Dogra ◽

Byung-Gyu Kim

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Age Groups ◽

Modern World ◽

Support Vector ◽

Digital Information ◽

Age And Gender ◽

And Gender ◽

The Impact

Sentiment analysis is a rapidly growing field of research due to the explosive growth in digital information. In the modern world of artificial intelligence, sentiment analysis is one of the essential tools for extracting emotional information from massive data. Sentiment analysis is applied to a variety of user data, from customer reviews to social network posts. To the best of our knowledge, there is little work on sentiment analysis based on the categorization of users by demographics. Demographics play an important role in deciding the marketing strategies for different products. In this study, we explore the impact of age and gender on sentiment analysis, as this can help e-commerce retailers market their products based on specific demographics. The dataset is created by collecting reviews of books from Facebook users, who were asked to answer a questionnaire containing questions about their preferences in books, along with their age group and gender. Next, the paper analyzes the segmented data for sentiments based on each age group and gender. Finally, sentiment analysis is done using different Machine Learning (ML) approaches, including maximum entropy, support vector machine, convolutional neural network, and long short-term memory, to study the impact of age and gender on user reviews. Experiments have been conducted to identify new insights into the effect of age and gender on sentiment analysis.

Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival

Scientific Reports ◽

10.1038/s41598-021-86327-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Arturo Moncada-Torres ◽

Marissa C. van Maaren ◽

Mathijs P. Hendriks ◽

Sabine Siesling ◽

Gijs Geleijnse

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Explicit Knowledge ◽

Cox Regression ◽

Metastatic Breast ◽

Gradient Boosting ◽

Support Vector ◽

Netherlands Cancer Registry ◽

Extreme Gradient Boosting ◽

The Impact

Abstract. Cox Proportional Hazards (CPH) analysis is the standard for survival analysis in oncology. Recently, several machine learning (ML) techniques have been adapted for this task. Although they have been shown to yield results at least as good as classical methods, they are often disregarded because of their lack of transparency and little to no explainability, which are key for their adoption in clinical settings. In this paper, we used data from the Netherlands Cancer Registry of 36,658 non-metastatic breast cancer patients to compare the performance of CPH with ML techniques (Random Survival Forests, Survival Support Vector Machines, and Extreme Gradient Boosting [XGB]) in predicting survival using the $c$-index. We demonstrated that in our dataset, ML-based models can perform at least as well as the classical CPH regression ($c$-index $\sim 0.63$), and in the case of XGB even better ($c$-index $\sim 0.73$). Furthermore, we used Shapley Additive Explanation (SHAP) values to explain the models’ predictions. We concluded that the difference in performance can be attributed to XGB’s ability to model nonlinearities and complex interactions. We also investigated the impact of specific features on the models’ predictions as well as their corresponding insights. Lastly, we showed that explainable ML can generate explicit knowledge of how models make their predictions, which is crucial in increasing the trust and adoption of innovative ML techniques in oncology and healthcare overall.
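
The c-index used in this abstract to compare models can be illustrated with a small sketch. This is a plain-Python version of Harrell's concordance index under common conventions (tied risk scores count as half, pairs with tied times are skipped); it is not the authors' code, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: share of comparable pairs ranked correctly.

    A pair is comparable when the subject with the shorter follow-up
    time had an observed event (events[i] is truthy). A higher risk
    score is expected to go with a shorter survival time.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier subject was censored
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # assumed convention: ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event flag (1 = event observed),
# and a model's predicted risk for four subjects.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of about 0.63 and 0.73 should be read.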

Development of Machine Learning Models to Evaluate the Toughness of OPH Alloys

Materials ◽

10.3390/ma14216713 ◽

2021 ◽

Vol 14 (21) ◽

pp. 6713

Author(s):

Omid Khalaj ◽

Moslem Ghobadi ◽

Ehsan Saebnoori ◽

Alireza Zarezadeh ◽

Mohammadreza Shishesaz ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Mechanical Alloying ◽

Fuzzy Inference ◽

Oxide Dispersion Strengthened ◽

Machine Learning Techniques ◽

Support Vector ◽

Anfis Model ◽

Inference Systems ◽

The Impact

Oxide Precipitation-Hardened (OPH) alloys are a new generation of Oxide Dispersion-Strengthened (ODS) alloys recently developed by the authors. The mechanical properties of this group of alloys are significantly influenced by the chemical composition and appropriate heat treatment (HT). The main steps in producing OPH alloys consist of mechanical alloying (MA) and consolidation, followed by hot rolling. Toughness was obtained from standard tensile test results for different variants of OPH alloy to understand their mechanical properties. Three machine learning techniques were developed using experimental data to simulate different outcomes. The effect of each parameter on the toughness of OPH alloys is discussed. Using experimental results obtained by the authors, the composition of OPH alloys (Al, Mo, Fe, Cr, Ta, Y, and O), HT conditions, and mechanical alloying (MA) were used as inputs to train the models, and toughness was set as the output. The results demonstrated that all three models are suitable for predicting the toughness of OPH alloys, and the models fulfilled all the desired requirements. However, several criteria confirmed that the adaptive neuro-fuzzy inference system (ANFIS) model gives better results and has a better ability to simulate. The mean square error (MSE) for the artificial neural network (ANN), ANFIS, and support vector regression (SVR) models was 459.22, 0.0418, and 651.68, respectively. After performing the sensitivity analysis (SA), an optimized ANFIS model was achieved with an MSE value of 0.003; the analysis demonstrated that HT temperature is the most significant of these parameters and plays a critical role in training the data sets.
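
For readers comparing the MSE figures above, a minimal sketch of the metric follows. The function name and the toy values are hypothetical and are not taken from the paper; the only thing assumed is the standard definition of MSE as the average squared difference between observed and predicted values, expressed in squared units of the target.

```python
def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical toughness values (arbitrary units), for illustration only.
toy_observed = [10.0, 20.0, 30.0]
toy_predicted = [12.0, 18.0, 33.0]
toy_mse = mean_squared_error(toy_observed, toy_predicted)
```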

Fault detection for air conditioning system using machine learning

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v9.i1.pp109-116 ◽

2020 ◽

Vol 9 (1) ◽

pp. 109

Author(s):

Noor Asyikin Sulaiman ◽

Md Pauzi Abdullah ◽

Hayati Abdullah ◽

Muhammad Noorazlan Shah Zainudin ◽

Azdiana Md Yusop

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Air Conditioning ◽

Machine Learning Algorithms ◽

Coefficient Of Performance ◽

Support Vector ◽

Air Conditioning System ◽

Learning Classifier ◽

Negative Impacts ◽

The Impact

The air conditioning system is a complex system and consumes the most energy in a building. Any fault in the system operation, such as a faulty cooling tower fan, compressor failure, or a stuck damper, could lead to energy wastage and a reduction in the system’s coefficient of performance (COP). Due to the complexity of the air conditioning system, detecting those faults is hard, as it requires exhaustive inspections. This paper consists of two parts: i) to investigate the impact of different faults related to the air conditioning system on COP and ii) to analyse the performance of machine learning algorithms in classifying those faults. Three supervised learning classifier models were developed: deep learning, support vector machine (SVM) and multi-layer perceptron (MLP). The performance of each classifier was investigated in terms of six different classes of faults. Results showed that different faults have different negative impacts on the COP. Also, the three supervised learning classifier models were able to classify all faults with more than 94% accuracy, and MLP produced the highest accuracy and precision of all.

Exploring the Mechanism of Crashes with Autonomous Vehicles Using Machine Learning

Mathematical Problems in Engineering ◽

10.1155/2021/5524356 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hengrui Chen ◽

Hong Chen ◽

Ruiyu Zhou ◽

Zhizhen Liu ◽

Xiaoke Sun

Keyword(s):

Machine Learning ◽

Autonomous Vehicles ◽

Classification And Regression Tree ◽

Gradient Boosting ◽

Support Vector ◽

Crash Severity ◽

Apriori Algorithm ◽

Driving Mode ◽

Extreme Gradient Boosting ◽

The Impact

Safety has become a critical obstacle that cannot be ignored in the marketization of autonomous vehicles (AVs). The objective of this study is to explore the mechanism of AV-involved crashes and analyze the impact of each feature on crash severity. We use the Apriori algorithm to explore the causal relationships between multiple factors and thereby the mechanism of crashes. We use various machine learning models, including support vector machine (SVM), classification and regression tree (CART), and eXtreme Gradient Boosting (XGBoost), to analyze crash severity. In addition, we apply Shapley Additive Explanations (SHAP) to interpret the importance of each factor. The results indicate that XGBoost obtains the best result (recall = 75%; G-mean = 67.82%). Both XGBoost and the Apriori algorithm effectively provided meaningful insights about AV-involved crash characteristics and their relationships. Among all these features, vehicle damage, weather conditions, accident location, and driving mode are the most critical. We found that most rear-end crashes involve conventional vehicles bumping into the rear of AVs. Drivers should be extremely cautious when driving in fog, snow, and insufficient light. Drivers should also be careful when driving near intersections, especially in the autonomous driving mode.
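
The recall and G-mean figures quoted above can be illustrated with a short sketch. It assumes the binary form of the G-mean, the square root of sensitivity times specificity; the abstract does not say how the metric was computed for the crash-severity classes, so this is an illustration rather than the authors' procedure. The function name and labels are hypothetical.

```python
import math


def recall_and_gmean(y_true, y_pred, positive=1):
    """Return (recall, G-mean) for a binary classification result.

    Recall (sensitivity) is TP / (TP + FN); specificity is TN / (TN + FP);
    the G-mean is the square root of their product.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return recall, math.sqrt(recall * specificity)


# Hypothetical labels: 1 = severe crash, 0 = non-severe crash.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_recall, toy_gmean = recall_and_gmean(toy_true, toy_pred)
```

Because the G-mean multiplies the two rates, a classifier that ignores the minority class scores near zero even when its overall accuracy is high, which is why it is often reported alongside recall on imbalanced data.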

COVID-19 Risk Factors, Economic Factors, and Epidemiological Factors nexus on Economic Impact: Machine Learning and Structural Equation Modelling Approaches

Journal of the Nigerian Society of Physical Sciences ◽

10.46481/jnsps.2021.173 ◽

2021 ◽

pp. 395-405

Author(s):

David Opeoluwa Oyewola ◽

Emmanuel Gbenga Dada ◽

Juliana Ngozi Ndunagu ◽

Terrang Abubakar Umar ◽

Akinwunmi S.A

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Structural Equation Modeling ◽

Latent Variables ◽

Structural Equation ◽

Equation Modeling ◽

Support Vector ◽

Economic Factors ◽

Negative Effects ◽

The Impact

Since the declaration of COVID-19 as a global pandemic, it has been transmitted to more than 200 nations of the world. The harmful impact of the pandemic on the economy of nations is far greater than anything suffered in almost a century. The main objective of this paper is to apply Structural Equation Modeling (SEM) and Machine Learning (ML) to determine the relationships among COVID-19 risk factors, epidemiology factors and economic factors. Structural equation modeling is a statistical technique for calculating and evaluating the relationships of manifest and latent variables. It explores the causal relationship between variables while taking measurement error into account. Bagging (BAG), Boosting (BST), Support Vector Machine (SVM), Decision Tree (DT) and Random Forest (RF) Machine Learning techniques were applied to predict the impact of COVID-19 risk factors. Data from patients who came into contact with coronavirus disease between 23 January 2020 and 24 June 2020 were collected from the Kaggle database. Results indicate that COVID-19 risk factors have negative effects on epidemiology factors. They also have negative effects on economic factors.

The Influence of Inhomogeneous Input Data from Different Waves on Predictive Model Development for COVID-19 ICU Patients (Preprint)

10.2196/preprints.31539 ◽

2021 ◽

Author(s):

Sebastian Johannes Fritsch ◽

Konstantin Sharafutdinov ◽

Moein Einollahzadeh Samadi ◽

Gernot Marx ◽

Andreas Schuppert ◽

...

Keyword(s):

Machine Learning ◽

Convex Hull ◽

Prediction Models ◽

Model Development ◽

Predictive Performance ◽

Support Vector ◽

Good Prediction ◽

The Impact ◽

Second Wave ◽

Over Time

BACKGROUND During the course of the COVID-19 pandemic, a variety of machine learning models were developed to predict different aspects of the disease, such as long-term causes, organ dysfunction or ICU mortality. The number of training datasets used has increased significantly over time. However, these data now come from different waves of the pandemic, which did not always involve the same therapeutic approaches and which showed changing outcomes between waves. The impact of these changes on model development has not yet been studied. OBJECTIVE The aim of the investigation was to examine the predictive performance of several models trained with data from one wave when predicting the second wave's data, and the impact of pooling these data sets. Finally, a method for comparing different datasets for heterogeneity is introduced. METHODS We used two datasets from waves one and two to develop several predictive models for patient mortality. Four classification algorithms were used: logistic regression (LR), support vector machine (SVM), random forest classifier (RF) and AdaBoost classifier (ADA). We also performed a mutual prediction on the data of the wave that was not used for training. Then, we compared the performance of models when a pooled dataset from the two waves was used. The populations from the different waves were checked for heterogeneity using a convex hull analysis. RESULTS 63 patients from wave one (03-06/2020) and 54 from wave two (08/2020-01/2021) were evaluated. For both waves separately, we found models reaching sufficient accuracies, up to 0.79 AUROC (95%-CI 0.76-0.81) for SVM on the first wave and up to 0.88 AUROC (95%-CI 0.86-0.89) for RF on the second wave. After the pooling of the data, the AUROC decreased relevantly. In the mutual prediction, models trained on the second wave's data showed, when applied to the first wave's data, a good prediction for non-survivors but an insufficient classification for survivors. The opposite situation (training: first wave, test: second wave) revealed the inverse behaviour, with models correctly classifying survivors and incorrectly predicting non-survivors. The convex hull analysis for the first and second wave populations showed a more inhomogeneous distribution of underlying data when compared to randomly selected sets of patients of the same size. CONCLUSIONS Our work demonstrates that a larger dataset is not a universal solution to all machine learning problems in clinical settings. Rather, it shows that inhomogeneous data used to develop models can lead to serious problems. With the convex hull analysis, we offer a solution for this problem. The outcome of such an analysis can raise concerns if the pooling of different datasets would cause inhomogeneous patterns preventing a better predictive performance.
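
The abstract does not describe how its convex hull analysis was carried out, so the following is only a sketch of one way such a check could look: build the convex hull of one population in a two-dimensional feature space and measure what fraction of a second population falls inside it. The two-feature restriction, the coverage measure, the function names and the toy points are all assumptions made for illustration.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    if len(hull) < 3:
        raise ValueError("hull must have at least three vertices")
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def hull_coverage(reference_points, other_points):
    """Fraction of other_points that fall within the hull of reference_points."""
    hull = convex_hull(reference_points)
    inside = sum(1 for p in other_points if inside_hull(hull, p))
    return inside / len(other_points)


# Hypothetical two-feature patient data for two waves.
toy_wave_one = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
toy_wave_two = [(1, 1), (3, 3), (5, 5), (6, 1)]
toy_coverage = hull_coverage(toy_wave_one, toy_wave_two)
```

A low coverage value would suggest that the second population occupies regions of feature space the first never covered, which is the kind of heterogeneity the abstract warns about when pooling datasets.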

Sentiment Analysis of Impact of Technology on Employment from Text on Twitter

International Journal of Interactive Mobile Technologies (iJIM) ◽

10.3991/ijim.v14i07.10600 ◽

2020 ◽

Vol 14 (07) ◽

pp. 88

Author(s):

Shahzad Qaiser ◽

Nooraini Yusoff ◽

Farzana Kabir Ahmad ◽

Ramsha Ali

Keyword(s):

Machine Learning ◽

Social Media ◽

Social Issues ◽

Support Vector ◽

Ripple Effect ◽

Learning Classifier ◽

The People ◽

Impact Of Technology ◽

Negative Sentiment ◽

The Impact

Many studies are in progress to analyze the content created by users on social media, due to its influence and social ripple effect. Content created on social media contains information and users' sentiments about social issues. This study aims to analyze people’s sentiments about the impact of technology on employment and advancements in technologies and to build a machine learning classifier to classify the sentiments. People are becoming anxious and depressed, and some are even dying by suicide, due to unemployment; hence, it is essential to explore this relatively new area of research. The study has two main objectives: 1) to preprocess text collected from Twitter concerning the impact of technology on employment and analyze its sentiment, and 2) to evaluate the performance of the Naïve Bayes (NB) machine learning classifier on the text. To achieve this, a methodology is proposed that includes 1) data collection and preprocessing, 2) sentiment analysis, 3) building a machine learning classifier and 4) comparing the performance of NB and support vector machine (SVM). NB and SVM achieved 87.18% and 82.05% accuracy, respectively. The study found that 65% of the people hold negative sentiment regarding the impact of technology on employment and technological advancements; hence, people must acquire new skills to minimize the effect of structural unemployment.

Design flood estimation for global river networks based on machine learning models

10.5194/hess-2020-594 ◽

2020 ◽

Author(s):

Gang Zhao ◽

Paul Bates ◽

Jeffrey Neal ◽

Bo Pang

Keyword(s):

Machine Learning ◽

Regression Models ◽

Flood Hazard ◽

Model Development ◽

Flood Frequency ◽

Support Vector ◽

Design Flood ◽

Flood Estimation ◽

Design Floods ◽

Anderson Darling

Abstract. Design flood estimation is a fundamental task in hydrology. In this research, we propose a machine learning based approach to estimate design floods globally. This approach mainly involves three stages: (i) estimating at-site flood frequency curves for global gauging stations by the Anderson-Darling test and Bayesian MCMC method; (ii) clustering these stations into subgroups by a K-means model based on twelve globally available catchment descriptors; and (iii) developing a regression model in each subgroup for regional design flood estimation using the same descriptors. A total of 11793 stations globally were selected for model development and three widely used regression models were compared for design flood estimation. The results showed that: (1) the proposed approach achieved the highest accuracy for design flood estimation when using all twelve descriptors for clustering, and the performance of regression was improved by considering more descriptors during training and validation; (2) support vector machine regression provides the highest prediction performance among all regression models tested, with a root mean square normalised error of 0.708 for 100-year return period flood estimation; (3) the 100-year design flood in tropical, arid, temperate, cold and polar climate zones could be reliably estimated, with mean relative biases (RBIAS) of −0.199, −0.233, −0.169, 0.179 and −0.091 respectively; (4) this machine learning based approach shows considerable improvement over the index-flood based method introduced by Smith et al. (2015, https://doi.org/10.1002/2014WR015814) for design flood estimation at global scales, and the average RBIAS in estimation is less than 18 % for 10, 20, 50 and 100-year design floods. We conclude that the proposed approach is a valid method to estimate design floods anywhere on the global river network, improving our prediction of the flood hazard, especially in ungauged areas.
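
The RBIAS values reported above can be read with the help of a small sketch. The abstract does not define RBIAS, so this assumes a common form, the mean of (estimated − observed) / observed across stations; the paper's exact formula may differ. The function name and the discharge values are hypothetical.

```python
def relative_bias(estimated, observed):
    """Mean relative bias, assuming RBIAS = mean((est - obs) / obs).

    Negative values indicate underestimation on average, positive
    values indicate overestimation.
    """
    if len(estimated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [(e - o) / o for e, o in zip(estimated, observed)]
    return sum(ratios) / len(ratios)


# Hypothetical 100-year design floods (m^3/s) at three stations.
toy_estimated = [80.0, 120.0, 90.0]
toy_observed = [100.0, 100.0, 100.0]
toy_rbias = relative_bias(toy_estimated, toy_observed)
```

Under this assumed definition, a value such as −0.199 would mean the estimates are on average about 20% below the at-site values.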

Studying the Effect of Taking Statins before Infection in the Severity Reduction of COVID-19 with Machine Learning

BioMed Research International ◽

10.1155/2021/9995073 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Alireza Davoudi ◽

Mohsen Ahmadi ◽

Abbas Sharifi ◽

Roshina Hassantabar ◽

Narges Najafi ◽

...

Keyword(s):

Machine Learning ◽

Diastolic Pressure ◽

Systolic Pressure ◽

Illness Severity ◽

Support Vector ◽

Learning Approaches ◽

Angiotensin Converting Enzyme 2 ◽

Decision Tree Method ◽

The Impact

Statins can help in the treatment of COVID-19 patients because of their involvement in angiotensin-converting enzyme-2. The main objective of this study is to evaluate the impact of statins on COVID-19 severity for people who had been taking statins before COVID-19 infection. The patients examined include people who had taken one of three types of statins: Atorvastatin, Simvastatin, and Rosuvastatin. The case study includes 561 patients admitted to the Razi Hospital in Ghaemshahr, Iran, during February and March 2020. The illness severity was encoded based on the respiratory rate, oxygen saturation, systolic pressure, and diastolic pressure in five categories: mild, medium, severe, critical, and death. Since 69.23% of participants were in mild severity condition, the results showed the positive effect of Simvastatin on COVID-19 severity for people who took Simvastatin before being infected by the COVID-19 virus. Also, systolic pressure for this case study is 137.31, which is higher than that of the total patients. Another result of this study is that Simvastatin takers have an average of 95.77 mmHg O2Sat; however, the O2Sat is 92.42, which is medium severity for evaluating the entire case study. In the rest of this paper, we used machine learning approaches to diagnose COVID-19 patients’ severity based on clinical features. Results indicated that the decision tree method could predict patients’ illness severity with 87.9% accuracy. Other methods, including the K-nearest neighbors (KNN) algorithm, support vector machine (SVM), Naïve Bayes classifier, and discriminant analysis, showed accuracy levels of 80%, 68.8%, 61.1%, and 85.1%, respectively.