Evaluation of Watershed Scale Aquatic Ecosystem Health by SWAT Modeling and Random Forest Technique

2019
Vol 11 (12)
pp. 3397
Author(s):
So Young Woo
Chung Gil Jung
Ji Wan Lee
Seong Joon Kim

In this study, we evaluated aquatic ecosystem health (AEH) in five grades (A, very good, to E, very poor) for the FAI (Fish Assessment Index), TDI (Trophic Diatom Index), and BMI (Benthic Macroinvertebrate Index), using SWAT (Soil and Water Assessment Tool) results for stream water temperature (WT) and water quality (T-N, T-P, NH4, NO3, and PO4). Random Forest, a machine learning algorithm for classification, was trained on the SWAT results to grade each AEH index. For the Han River watershed (34,418 km²) in South Korea, we used 8 years (2008–2015) of observed AEH data for the Spring and Fall periods at 86 locations from NAEMP (National Aquatic Ecological Monitoring Program). The AEH was trained separately for Spring (FAIs, TDIs, and BMIs) and Fall (FAIa, TDIa, and BMIa). With the SWAT outputs (WT, T-N, T-P, NH4, NO3, and PO4) as input variables, Random Forest achieved accuracies of 0.42, 0.48, 0.62, 0.45, 0.4, and 0.58, respectively. The low accuracy was attributed to the weak strength of the individual trees and the high correlation between the trees composing the Random Forest, both caused by data imbalance. The AEH distribution results showed that the numbers of Grade A results for total FAI, TDI, and BMI were 84, 0, and 158, respectively, mostly located in the upstream watersheds. The numbers of Grade E results for total FAI, TDI, and BMI were 4, 50, and 13, and these were found in the downstream watersheds.
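As a rough illustration of why accuracy figures need context when grades are imbalanced, the following sketch computes the accuracy of a classifier that always predicts the most frequent grade. The grade sample and the function name `majority_baseline_accuracy` are hypothetical and are not taken from the NAEMP data or from the study's workflow.

```python
from collections import Counter


def majority_baseline_accuracy(grades):
    """Accuracy obtained by always predicting the most frequent grade."""
    counts = Counter(grades)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(grades)


# Hypothetical, deliberately imbalanced sample of AEH grades (not NAEMP data).
grades = ["A"] * 2 + ["B"] * 5 + ["C"] * 10 + ["D"] * 2 + ["E"] * 1

print(Counter(grades))
print(f"majority-class baseline accuracy: {majority_baseline_accuracy(grades):.2f}")
```

Comparing a model's reported accuracy against such a baseline is one simple way to judge how much a classifier has learned beyond the dominant grade.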

2021
Vol 13 (18)
pp. 10435
Author(s):
Seoro Lee
Jonggun Kim
Gwanjae Lee
Jiyeong Hong
Joo Hyun Bae
...  

Changes in hydrological characteristics and increases in pollutant loadings due to rapid climate change and urbanization contribute significantly to the deterioration of aquatic ecosystem health (AEH). It is therefore important to evaluate AEH in advance and establish appropriate strategic plans. Recently, machine learning (ML) models have been widely used to solve hydrological and environmental problems. However, collecting sufficient data for ML training is generally time-consuming and labor-intensive, and in classification problems in particular, data imbalance can lead to erroneous predictions. In this study, we proposed a method that addresses the data imbalance problem through data augmentation based on a Wasserstein Generative Adversarial Network (WGAN) and efficiently predicts the grades (A to E) of AEH indices (i.e., Benthic Macroinvertebrate Index (BMI), Trophic Diatom Index (TDI), and Fish Assessment Index (FAI)) with ML models. Raw datasets for the AEH indices, composed of physicochemical factors (i.e., WT, DO, BOD5, SS, TN, TP, and Flow) and AEH grades, were built and augmented through the WGAN. The performance of each ML model was evaluated through 10-fold cross-validation (CV), and the models trained on the raw and WGAN-based training sets were compared through AEH grade prediction on the test sets. The ML models trained on the WGAN-based training set achieved an average F1-score across grades of 0.9 or greater for each AEH index on the test set, outperforming the models trained only on the raw training set (which contained fewer data than the other datasets). These results confirmed that, with the dataset augmented through WGAN, the ML model can yield better AEH grade predictive performance than a model trained on limited datasets; this approach reduces the effort needed for actual data collection from rivers, which requires enormous time and cost. In the future, the results of this study can be used as basic data to construct big data of aquatic ecosystems, which is needed to efficiently evaluate and predict AEH in rivers with ML models.
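The F1-score averaged over grades can be sketched as follows. This is a minimal illustration that assumes an unweighted (macro) average over the grade labels; the abstract does not state which averaging the authors used, and the labels below are hypothetical.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical observed and predicted grades (not data from the study).
observed = ["A", "A", "B", "B", "C", "C"]
predicted = ["A", "B", "B", "B", "C", "A"]

print(f"macro F1: {macro_f1(observed, predicted, ['A', 'B', 'C']):.3f}")
```

Because each grade contributes equally to a macro average, a rare grade that is predicted poorly lowers the score as much as a common one, which is why this kind of metric is often preferred over plain accuracy for imbalanced grades.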


2020
Vol 8 (5)
pp. 5353-5362

Background/Aim: Prostate cancer is regarded as the most prevalent cancer in the world and the main cause of deaths worldwide. Early strategies for estimating prostate cancer risk have helped guide decisions about the changes occurring in high-risk patients, which reduced their risk. Methods: In the proposed research, we used a dataset from Kaggle and pre-processed it for missing values: three missing values in the compactness attribute and two missing values in the fractal dimension attribute were replaced by the mean of their respective columns. The performance of the diagnosis model was assessed using classification accuracy, sensitivity, and specificity analysis. This paper proposes a model to predict whether a person has prostate cancer and to support awareness or diagnosis of the disease. This is done by comparing the accuracies of applying rules to the individual results of the Support Vector Machine, Random Forest, Naive Bayes, and logistic regression classifiers on a dataset taken in a region, in order to present an accurate model for predicting prostate cancer. Results: The machine learning algorithms under study were able to predict prostate cancer in patients with accuracy between 70% and 90%. Conclusions: Logistic Regression and Random Forest both achieved better accuracy (90%) than the other machine learning algorithms.
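The mean-imputation step described in the Methods can be sketched as below. The column values and the function name `impute_column_mean` are hypothetical; missing entries are assumed to be represented as `None`, which the abstract does not specify.

```python
def impute_column_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical column with two missing entries (not the Kaggle data).
compactness = [1.0, None, 3.0, 2.0, None]

print(impute_column_mean(compactness))
```

Mean imputation keeps the column mean unchanged but shrinks its variance, so it is usually reasonable only when few values are missing, as is the case in the abstract.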


2020
Author(s):
Hamed Vagheei
Paolo Vezza
Guillermo Palau-Salvador
Fulvio Boano

The Impacts of Water Quality Changes on Aquatic Ecosystems: A Case Study of Clariano River, Spain

Hamed Vagheei (1), Paolo Vezza (2), Guillermo Palau-Salvador (3), Fulvio Boano (4)

1. PhD Student, the Polytechnic University of Turin
2. Assistant Professor, the Polytechnic University of Turin
3. Associate Professor, the Polytechnic University of Valencia
4. Associate Professor, the Polytechnic University of Turin

Abstract

Water quality degradation resulting from anthropogenic activities such as agriculture, deforestation, and urbanization is a serious worldwide challenge that has negative impacts on aquatic ecology. Unfortunately, it is still difficult to quantitatively determine the impacts of water quality changes on aquatic communities. The objective of the present research activity is to investigate aquatic ecosystem responses to water quality deterioration using a case study of the Clariano River, Spain. The Clariano River faces low water quality and the loss of biodiversity in some parts as a result of agricultural, industrial, and livestock activities as well as wastewater treatment plant (WWTP) effluents entering the river. The Soil and Water Assessment Tool (SWAT), an eco-hydrological model, is used in the present study to model discharge, sediment, and nutrients. SWAT-CUP is also used to calibrate and validate the SWAT model. We are currently employing the results from the calibrated model to obtain a better understanding of possible relations between water quality and biodiversity. Specifically, the present study will focus on macroinvertebrates as biological indicators of stream health, and the model predictions will be coupled with empirical correlations between stream water quality and macroinvertebrate presence in order to assess the impacts of water quality changes on the aquatic ecosystem. In addition, different model scenarios will be compared to explore the potential impacts of changes in land use, climate, and WWTP operation on the aquatic ecosystem.

Keywords: aquatic ecosystem, Clariano River, eco-hydrological modelling, water quality, water resources management


2021
Vol 5 (2)
pp. 355-368
Author(s):
Nadya Dwi Muchisha
Novian Tamara
Andriansyah Andriansyah
Agus M Soleh

Monitoring GDP in real time is important because of its usefulness for policy making. We built and compared ML models to forecast Indonesia's GDP growth in real time. We used 18 variables consisting of a number of quarterly macroeconomic and financial market statistics. We evaluated the performance of six popular ML algorithms, namely Random Forest, LASSO, Ridge, Elastic Net, Neural Networks, and Support Vector Machines, in real-time forecasting of GDP growth over the 2013:Q3 to 2019:Q4 period. We used the RMSE, MAD, and Pearson correlation coefficient as measures of forecast accuracy. The results showed that all of these models outperformed the AR(1) benchmark. The individual model with the best performance was Random Forest. To obtain more accurate forecasts, we ran forecast combinations using equal weighting and lasso regression. The best model was obtained from the forecast combination using lasso regression with selected ML models, namely Random Forest, Ridge, Support Vector Machine, and Neural Network.
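The accuracy measures and the equal-weight forecast combination mentioned above can be sketched as follows. This assumes that MAD denotes the mean absolute deviation of the forecast errors, which the abstract does not define; the growth figures and model names are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error of the forecasts."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors (assumed meaning of MAD)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def equal_weight_combination(forecasts):
    """Average the forecasts of several models period by period."""
    n_models = len(forecasts)
    return [sum(period) / n_models for period in zip(*forecasts)]


# Hypothetical quarterly GDP growth (%) and two hypothetical model forecasts.
actual = [5.0, 5.2, 4.9]
model_a = [5.4, 5.0, 5.1]
model_b = [4.6, 5.4, 4.7]
combined = equal_weight_combination([model_a, model_b])

for name, forecast in [("model_a", model_a), ("model_b", model_b), ("combined", combined)]:
    print(f"{name}: RMSE={rmse(actual, forecast):.3f}, MAD={mad(actual, forecast):.3f}")
```

In this constructed example the two models err in opposite directions, so their average is closer to the actual values than either model alone. The lasso-based combination in the abstract would instead estimate the weights from data.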


2011
Vol 47
pp. S3-S14
Author(s):
Sang-Woo Lee
Soon-Jin Hwang
Jae-Kwan Lee
Dong-Il Jung
Yeon-Jae Park
...  

This paper provides an overview of the development and application of the National Aquatic Ecological Monitoring Program (NAEMP) in Korea, which uses biological and habitat–riparian criteria for river/stream and watershed management. Development of NAEMP began in 2003, with recognition by the Korean Ministry of Environment (MOE) of the limitations of applying chemical parameters (e.g., biochemical oxygen demand (BOD)) as the principal targets of water environment management. Ecosystem health criteria under NAEMP were developed from 2003 to 2006. Candidate sites for monitoring were also screened and established across the country. NAEMP was implemented in 2007, and since then a standard protocol of nationwide monitoring based on multi-criteria has been implemented to assess the ecological condition of rivers and streams. The monitoring results indicate that many Korean rivers and streams are severely degraded, with biological conditions that are much worse than their water chemistry suggests. In 2009, 24% of rivers and streams were in classes C (Fair) and D (Poor) for BOD, but more than 71, 53, and 27% were categorized as Fair to Poor according to fish, diatom, and benthic macroinvertebrate assemblages, respectively. NAEMP is promising in that the results have already had great impacts on policy making and scientific research relevant to lotic water environment and watershed management in Korea. In the future, NAEMP results will be used to develop more aggressive regulations for the preservation and restoration of rivers/streams, riparian buffer areas and watersheds. Another future aim of the NAEMP is to develop aquatic ecological modeling based on the monitoring results.


2007
Vol 42 (4)
pp. 303-310
Author(s):
Zhi Chen
Lin Zhao
Kenneth Lee
Charles Hannath

There has been growing interest in assessing the risks to the marine environment from produced water discharges. This study describes the development of a numerical approach, POM-RW, based on an integration of the Princeton Ocean Model (POM) and a Random Walk (RW) simulation of pollutant transport. Specifically, the POM is employed to simulate local ocean currents. It provides three-dimensional hydrodynamic input to a Random Walk model focused on the dispersion of toxic components within the produced water stream on a regional spatial scale. Model development and field validation of the predicted current field and pollutant concentrations were conducted in conjunction with a water quality and ecological monitoring program for an offshore facility located on the Grand Banks of Canada. Results indicate that the POM-RW approach is useful for addressing environmental risks associated with produced water discharges.
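The general idea of random-walk pollutant transport can be sketched as below: each particle is advected by the current and displaced by a random step whose size is set by the diffusivity. This is a simplified two-dimensional illustration with a uniform, constant current standing in for the three-dimensional POM output; all parameter values and function names are hypothetical and do not reproduce the POM-RW implementation.

```python
import math
import random


def random_walk_step(x, y, u, v, diffusivity, dt, rng):
    """Advance one particle: advection by (u, v) plus a diffusive random step."""
    sigma = math.sqrt(2.0 * diffusivity * dt)  # standard deviation of the random step
    new_x = x + u * dt + sigma * rng.gauss(0.0, 1.0)
    new_y = y + v * dt + sigma * rng.gauss(0.0, 1.0)
    return new_x, new_y


def simulate_particles(n_particles, n_steps, u, v, diffusivity, dt, seed=0):
    """Release particles at the origin and track them through n_steps."""
    rng = random.Random(seed)
    particles = [(0.0, 0.0)] * n_particles
    for _ in range(n_steps):
        particles = [
            random_walk_step(x, y, u, v, diffusivity, dt, rng) for x, y in particles
        ]
    return particles


# Hypothetical release: 1000 particles, 0.2 m/s eastward current, D = 1 m^2/s.
cloud = simulate_particles(n_particles=1000, n_steps=50, u=0.2, v=0.0,
                           diffusivity=1.0, dt=10.0)
mean_x = sum(x for x, _ in cloud) / len(cloud)
print(f"mean eastward displacement after 500 s: {mean_x:.1f} m")
```

Counting particles per grid cell at the end of such a run gives an estimate of the relative pollutant concentration field.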


2018
Author(s):
Liyan Pan
Guangjian Liu
Xiaojian Mao
Huixian Li
Jiexin Zhang
...  

Background: Central precocious puberty (CPP) in girls seriously affects their physical and mental development in childhood. The method of diagnosis (the gonadotropin-releasing hormone (GnRH)-stimulation test or the GnRH analogue (GnRHa)-stimulation test) is expensive and makes patients uncomfortable due to the need for repeated blood sampling. Objective: We aimed to combine multiple CPP-related features and construct machine learning models to predict response to the GnRHa-stimulation test. Methods: In this retrospective study, we analyzed clinical and laboratory data of 1757 girls who underwent a GnRHa test in order to develop XGBoost and random forest classifiers for prediction of response to the GnRHa test. The local interpretable model-agnostic explanations (LIME) algorithm was used with the black-box classifiers to increase their interpretability. We measured the sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of the models. Results: Both the XGBoost and random forest models achieved good performance in distinguishing between positive and negative responses, with the AUC ranging from 0.88 to 0.90, sensitivity ranging from 77.91% to 77.94%, and specificity ranging from 84.32% to 87.66%. Basal serum luteinizing hormone, follicle-stimulating hormone, and insulin-like growth factor-I levels were found to be the three most important factors. In the interpretable models of LIME, these variables made high contributions to the prediction probability. Conclusions: The prediction models we developed can help diagnose CPP and may be used as a prescreening tool before the GnRHa-stimulation test.
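The three reported metrics can be computed as in the sketch below. The AUC is obtained here from the rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one); the labels, scores, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical test responses (1 = positive) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, AUC={auc(labels, scores):.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the abstract reports them separately.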


2020
Vol 13 (1)
pp. 10
Author(s):
Andrea Sulova
Jamal Jokar Arsanjani

Recent studies have suggested that, due to climate change, the number of wildfires across the globe has been increasing and will continue to grow. The recent massive wildfires that hit Australia during the 2019–2020 summer season raised questions about the extent to which the risk of wildfires can be linked to various climate, environmental, topographical, and social factors, and about how to predict fire occurrences in order to take preventive measures. Hence, the main objective of this study was to develop an automated, cloud-based workflow for generating a training dataset of fire events at a continental level, using freely available remote sensing data at a reasonable computational expense, for input into machine learning models. As a result, a data-driven model was set up on the Google Earth Engine platform, which is publicly accessible and open for further adjustments. The training dataset was applied to different machine learning algorithms, i.e., Random Forest, Naïve Bayes, and Classification and Regression Tree. The findings show that Random Forest outperformed the other algorithms, and hence it was used further to explore the driving factors through variable importance analysis. The study indicates the probability of fire occurrences across Australia and identifies the potential driving factors of Australian wildfires for the 2019–2020 summer season. The methodical approach, the achieved results, and the conclusions drawn can be of great importance to policymakers, environmentalists, and climate change researchers, among others.

