Development of Water Level Prediction Models Using Machine Learning in Wetlands: A Case Study of Upo Wetland in South Korea

Wetlands play a vital role in hydrologic and ecologic communities. Since there are few studies conducted for wetland water level prediction due to the unavailability of data, this study developed a water level prediction model using various machine learning models such as artificial neural network (ANN), decision tree (DT), random forest (RF), and support vector machine (SVM). The Upo wetland, which is the largest inland wetland in South Korea, was selected as the study area. The daily water level gauge data from 2009 to 2015 were used as dependent variables, while the meteorological data and upstream water level gauge data were used as independent variables. Predictive performance evaluation using RF as the final model revealed 0.96 value for correlation coefficient (CC), 0.92 for Nash–Sutcliffe efficiency (NSE), 0.09 for root mean square error (RMSE), and 0.19 for persistence index (PI). The results indicate that the water level of the Upo wetland was well predicted, showing superior results compared to that of the ANN, which was used in a previous study. The results intend to provide basic data for development of a wetland management method, using water levels of previously ungauged areas.

Download Full-text

Onion Yield Prediction Based on Machine Learning

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽

10.17762/turcomat.v12i2.1972 ◽

2021 ◽

Vol 12 (2) ◽

Author(s):

A. Selvi, Et. al.

Keyword(s):

Machine Learning ◽

Water Level ◽

Climate Changes ◽

Learning Algorithm ◽

Future Price ◽

Support Vector ◽

Yield Prediction ◽

Past Data ◽

Vegetables And Fruits ◽

Water Level Prediction

Indian economy is mostly based on agriculture but the price of each vegetables and fruits varies in day today life. In recent days, the price of onion is increasing wisely, to predict the future price of onion by using past data and also the water level need for a particular crop using current dataset and also by using climate changes. Using Machine learning, collect the dataset of weather report and past and current priceof vegetables (e.g. onion) and water level prediction of plants. By analyzing these dataset the price of vegetables and the water level required for particular plant on particular day according to the climate change are analyzed and predicted. By using machine learning algorithm like Support vector machine, stemming, and tokenization. This will be helpful for the cultivators to crop a plant according to the weather and it will be useful when the water level needed for the particular plant is known previously. And these will also help to bring customer and farmer will gain more. The water level will also help to enhance the harvest yield and also will produce good crops all the time. The cultivator will gain more and the harvesting will also be more when compared to other systems.

Download Full-text

Prediction of Groundwater Level Variations in a Changing Climate: A Danish Case Study

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10110792 ◽

2021 ◽

Vol 10 (11) ◽

pp. 792

Author(s):

Rebeca Quintero Gonzalez ◽

Jamal Jokar Arsanjani

Keyword(s):

Climate Change ◽

Machine Learning ◽

Water Level ◽

Water Table ◽

Water Levels ◽

Series Data ◽

Future Climate Change ◽

Support Vector ◽

Climate Change Scenarios ◽

The Future

Shallow groundwater is a key resource for human activities and ecosystems, and is susceptible to alterations caused by climate change, causing negative socio-economic and environmental impacts, and increasing the need to predict the evolution of the water table. The main objective of this study is to gain insights about future water level changes based on different climate change scenarios using machine learning algorithms, while addressing the following research questions: (a) how will the water table be affected by climate change in the future based on different socio-economic pathways (SSPs)?: (b) do machine learning models perform well enough in predicting changes of the groundwater in Denmark? If so, which ML model outperforms for forecasting these changes? Three ML algorithms were used in R: artificial neural networks (ANN), support vector machine (SVM) and random forest (RF). The ML models were trained with time-series data of groundwater levels taken at wells in the Hovedstaden region, for the period 1990–2018. Several independent variables were used to train the models, including different soil parameters, topographical features and climatic variables for the time period and region selected. Results show that the RF model outperformed the other two, resulting in a higher R-squared and lower mean absolute error (MAE). The future prediction maps for the different scenarios show little variation in the water table. Nevertheless, predictions show that it will rise slightly, mostly in the order of 0–0.25 m, especially during winter. The proposed approach in this study can be used to visualize areas where the water levels are expected to change, as well as to gain insights about how big the changes will be. The approaches and models developed with this paper could be replicated and applied to other study areas, allowing for the possibility to extend this model to a national level, improving the prevention and adaptation plans in Denmark and providing a more global overview of future water level predictions to more efficiently handle future climate change scenarios.

Download Full-text

In silico Prediction of Inhibitory Constant of Thrombin Inhibitors Using Machine Learning

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666181220130232 ◽

2019 ◽

Vol 21 (9) ◽

pp. 662-669 ◽

Cited By ~ 1

Author(s):

Junnan Zhao ◽

Lu Zhu ◽

Weineng Zhou ◽

Lingfeng Yin ◽

Yuchen Wang ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Regression Tree ◽

Large Data ◽

Thrombin Inhibitors ◽

Coagulation Cascade ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Descriptor Selection

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Taking advantage of finding non-intuitive regularities on high-dimensional datasets, machine learning can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods including multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM) were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best one among these methods with R2=0.84, MSE=0.55 for the training set and R2=0.83, MSE=0.56 for the test set. Several validation methods such as yrandomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is full of help for designing novel thrombin inhibitors.

Download Full-text

Machine Learning Methods Applied to the Prediction of Pseudo-nitzschia spp. Blooms in the Galician Rias Baixas (NW Spain)

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10040199 ◽

2021 ◽

Vol 10 (4) ◽

pp. 199

Author(s):

Francisco M. Bellas Aláez ◽

Jesus M. Torres Palenzuela ◽

Evangelos Spyrakos ◽

Luis González Vilas

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Prediction Models ◽

Support Vector ◽

False Alarms ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Rías Baixas ◽

New Algorithms

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.

Download Full-text

Development of Machine Learning Models for Prediction of Smoking Cessation Outcome

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18052584 ◽

2021 ◽

Vol 18 (5) ◽

pp. 2584

Author(s):

Cheng-Chien Lai ◽

Wei-Hsin Huang ◽

Betty Chia-Chen Chang ◽

Lee-Ching Hwang

Keyword(s):

Machine Learning ◽

Smoking Cessation ◽

Success Rate ◽

Prediction Models ◽

Smoking Status ◽

Medical Center ◽

Machine Learning Algorithms ◽

Classification And Regression Tree ◽

Support Vector ◽

Smoking Cessation Outcome

Predictors for success in smoking cessation have been studied, but a prediction model capable of providing a success rate for each patient attempting to quit smoking is still lacking. The aim of this study is to develop prediction models using machine learning algorithms to predict the outcome of smoking cessation. Data was acquired from patients underwent smoking cessation program at one medical center in Northern Taiwan. A total of 4875 enrollments fulfilled our inclusion criteria. Models with artificial neural network (ANN), support vector machine (SVM), random forest (RF), logistic regression (LoR), k-nearest neighbor (KNN), classification and regression tree (CART), and naïve Bayes (NB) were trained to predict the final smoking status of the patients in a six-month period. Sensitivity, specificity, accuracy, and area under receiver operating characteristic (ROC) curve (AUC or ROC value) were used to determine the performance of the models. We adopted the ANN model which reached a slightly better performance, with a sensitivity of 0.704, a specificity of 0.567, an accuracy of 0.640, and an ROC value of 0.660 (95% confidence interval (CI): 0.617–0.702) for prediction in smoking cessation outcome. A predictive model for smoking cessation was constructed. The model could aid in providing the predicted success rate for all smokers. It also had the potential to achieve personalized and precision medicine for treatment of smoking cessation.

Download Full-text

Machine Learning-Based Prediction of Air Quality

Applied Sciences ◽

10.3390/app10249151 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9151

Author(s):

Yun-Chia Liang ◽

Yona Maimury ◽

Angela Hsiang-Ling Chen ◽

Josue Rodolfo Cuevas Juarez

Keyword(s):

Machine Learning ◽

Air Quality ◽

Random Forest ◽

Prediction Models ◽

Superior Performance ◽

Support Vector ◽

Economic Activities ◽

Adaptive Boosting ◽

Series Of Experiments ◽

Artificial Neural Network Ann

Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found the stacking ensemble delivers consistently superior performance for R2 and RMSE, while AdaBoost provides best results for MAE.

Download Full-text

EXAMINATION OF SHORT-TERM WATER LEVEL PREDICTION MODEL IN LOW-LYING LAKES USING MACHINE LEARNING

Journal of Japan Society of Civil Engineers Ser B1 (Hydraulic Engineering) ◽

10.2208/jscejhe.76.2_i_439 ◽

2020 ◽

Vol 76 (2) ◽

pp. I_439-I_444

Author(s):

Masaomi KIMURA ◽

Takahiro ISHIKAWA ◽

Naoto OKUMURA ◽

Issaku AZECHI ◽

Toshiaki IIDA

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Water Level ◽

Short Term ◽

Water Level Prediction

Download Full-text

Forecasting the risk at infractions: an ensemble comparison of machine learning approach

Industrial Management & Data Systems ◽

10.1108/imds-10-2020-0603 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lei Li ◽

Desheng Wu

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Short Term Memory ◽

Model Performance ◽

Large Data ◽

Support Vector ◽

Learning Approaches ◽

Content Type ◽

Day To Day Operations ◽

Prediction Approach

PurposeThe infraction of securities regulations (ISRs) of listed firms in their day-to-day operations and management has become one of common problems. This paper proposed several machine learning approaches to forecast the risk at infractions of listed corporates to solve financial problems that are not effective and precise in supervision.Design/methodology/approachThe overall proposed research framework designed for forecasting the infractions (ISRs) include data collection and cleaning, feature engineering, data split, prediction approach application and model performance evaluation. We select Logistic Regression, Naïve Bayes, Random Forest, Support Vector Machines, Artificial Neural Network and Long Short-Term Memory Networks (LSTMs) as ISRs prediction models.FindingsThe research results show that prediction performance of proposed models with the prior infractions provides a significant improvement of the ISRs than those without prior, especially for large sample set. The results also indicate when judging whether a company has infractions, we should pay attention to novel artificial intelligence methods, previous infractions of the company, and large data sets.Originality/valueThe findings could be utilized to address the problems of identifying listed corporates' ISRs at hand to a certain degree. Overall, results elucidate the value of the prior infraction of securities regulations (ISRs). This shows the importance of including more data sources when constructing distress models and not only focus on building increasingly more complex models on the same data. This is also beneficial to the regulatory authorities.

Download Full-text

Machine Learning Frameworks in Cancer Detection

E3S Web of Conferences ◽

10.1051/e3sconf/202129701073 ◽

2021 ◽

Vol 297 ◽

pp. 01073

Author(s):

Sabyasachi Pramanik ◽

K. Martin Sagayam ◽

Om Prakash Jena

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Cancer Development ◽

Support Vector ◽

Learning Approaches ◽

Learning Techniques ◽

Fact Finding ◽

Risk Of Cancer

Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and forecast of cancer types have graced essentially in cancer fact-finding methods since they may help to improve the clinical treatment of cancer survivors. The significance of categorizing cancer suffers into higher or lower-threat categories has prompted numerous fact-finding associates from the bioscience and genomics field to investigate the utilization of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of simulating the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches utilized in the simulation of cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we offer the most current papers that have used these approaches to predict risk of cancer or patient outcomes in order to better understand cancer.

Download Full-text

Kabul River Flow Prediction Using Automated ARIMA Forecasting: A Machine Learning Approach

Sustainability ◽

10.3390/su131910720 ◽

2021 ◽

Vol 13 (19) ◽

pp. 10720

Author(s):

Muhammad Ali Musarat ◽

Wesam Salah Alaloul ◽

Muhammad Babar Ali Rabbani ◽

Mujahid Ali ◽

Muhammad Altaf ◽

...

Keyword(s):

Machine Learning ◽

Water Level ◽

River Flow ◽

Early Stage ◽

Moving Average ◽

Weather Prediction ◽

Maximum Level ◽

Information Criterion ◽

Water Levels ◽

Kabul River

The water level in a river defines the nature of flow and is fundamental to flood analysis. Extreme fluctuation in water levels in rivers, such as floods and droughts, are catastrophic in every manner; therefore, forecasting at an early stage would prevent possible disasters and relief efforts could be set up on time. This study aims to digitally model the water level in the Kabul River to prevent and alleviate the effects of any change in water level in this river downstream. This study used a machine learning tool known as the automatic autoregressive integrated moving average for statistical methodological analysis for forecasting the river flow. Based on the hydrological data collected from the water level of Kabul River in Swat, the water levels from 2011–2030 were forecasted, which were based on the lowest value of Akaike Information Criterion as 9.216. It was concluded that the water flow started to increase from the year 2011 till it reached its peak value in the year 2019–2020, and then the water level will maintain its maximum level to 250 cumecs and minimum level to 10 cumecs till 2030. The need for this research is justified as it could prove helpful in establishing guidelines for hydrological designers, the planning and management of water, hydropower engineering projects, as an indicator for weather prediction, and for the people who are greatly dependent on the Kabul River for their survival.

Download Full-text