Retrieval of Case 2 Water Quality Parameters with Machine Learning

<p>With ecological and environmental problems in lakes becoming increasingly prominent, demand for monitoring lake water quality by satellite remote sensing is growing. Traditional water quality sampling is normally conducted manually and is time-consuming and labor-intensive. It cannot provide a full picture of water bodies over time because of limited sampling points and low sampling frequency. This study proposes using hyperspectral remote sensing in conjunction with machine learning to retrieve water quality parameters and map them across a lake. The retrieval of both optically active parameters, chlorophyll-a (CHLA) and dissolved oxygen concentration (DO), and non-optically active parameters, total phosphorus (TP), total nitrogen (TN), turbidity (TB), and pH, was studied. Three machine learning algorithms were compared: Random Forests (RF), Support Vector Regression (SVR), and Artificial Neural Networks. Water quality measurements collected by Environment and Climate Change Canada over 20 years were used as the ground truth for model training and validation. Two sets of remote sensing data, from MODIS and Sentinel-2, were used and evaluated. This research proposes a new approach to retrieving both optically active and non-optically active parameters for water bodies and provides a new strategy for water quality monitoring.</p>

Implementation of Machine Learning Methods for Monitoring and Predicting Water Quality Parameters

Biointerface Research in Applied Chemistry ◽ 10.33263/briac112.92859295 ◽ 2020 ◽ Vol 11 (2) ◽ pp. 9285-9295 ◽ Cited By ~ 1

Keyword(s): Machine Learning ◽ Water Quality ◽ River Basin ◽ Water Quality Index ◽ Quality Index ◽ Quality Parameters ◽ Water Quality Parameters ◽ Quality Data ◽ Learning Tools ◽ Water Quality Data

The importance of good water quality for human use and consumption cannot be overstated, and water quality is determined through effective monitoring of the water quality index. Different approaches have been employed in the treatment and monitoring of water quality parameters (WQP). At present, water quality assessment is carried out through laboratory experiments, which require costly reagents and skilled labor and are time-consuming, making it necessary to search for an alternative method. Recently, machine learning tools have been successfully applied to the monitoring, estimation, and prediction of the river water quality index as an alternative to laboratory analytical methods. In this study, the potential of one machine learning tool, the artificial neural network, was explored for predicting and estimating water quality in the Kelantan River basin. Water quality data collected from the 14 stations of the river basin were used for modeling and predicting WQP. In the WQP analysis, the best prediction was obtained for pH. The low kurtosis values of pH indicate that the presence of outliers has a negative impact on prediction performance. In the per-station analysis, WQP prediction at stations 1, 2, and 3 gave good results. This is attributed to those stations having more available data than the other stations, except station 8.
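The link between kurtosis and outliers mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the pH-like readings are synthetic, and the helper `excess_kurtosis` is a hypothetical name that computes the population excess kurtosis (fourth standardized moment minus 3).

```python
import random
import statistics


def excess_kurtosis(values):
    """Population excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


# Hypothetical pH-like readings (synthetic, not data from the study).
rng = random.Random(0)
clean = [rng.gauss(7.0, 0.3) for _ in range(500)]

# The same series with a few extreme readings appended to mimic outliers.
with_outliers = clean + [2.0, 12.5, 13.0, 1.5]

k_clean = excess_kurtosis(clean)
k_outliers = excess_kurtosis(with_outliers)
print(f"excess kurtosis, clean series:  {k_clean:.2f}")
print(f"excess kurtosis, with outliers: {k_outliers:.2f}")
```

A series with few extreme values stays near zero excess kurtosis, while a handful of outliers raises it sharply, which is why a low-kurtosis parameter is plausibly easier to predict.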

Prediction of E. coli Concentrations in Agricultural Pond Waters: Application and Comparison of Machine Learning Algorithms

Frontiers in Artificial Intelligence ◽ 10.3389/frai.2021.768650 ◽ 2022 ◽ Vol 4

Author(s): Matthew D. Stocker ◽ Yakov A. Pachepsky ◽ Robert L. Hill

Keyword(s): Machine Learning ◽ Water Quality ◽ Quality Parameters ◽ Machine Learning Algorithms ◽ Water Quality Parameters ◽ Gradient Boosting ◽ Support Vector ◽ E Coli ◽ Stochastic Gradient Boosting ◽ Significant Difference

The microbial quality of irrigation water is an important issue as the use of contaminated waters has been linked to several foodborne outbreaks. To expedite microbial water quality determinations, many researchers estimate concentrations of the microbial contamination indicator Escherichia coli (E. coli) from the concentrations of physiochemical water quality parameters. However, these relationships are often non-linear and exhibit changes above or below certain threshold values. Machine learning (ML) algorithms have been shown to make accurate predictions in datasets with complex relationships. The purpose of this work was to evaluate several ML models for the prediction of E. coli in agricultural pond waters. Two ponds in Maryland were monitored from 2016 to 2018 during the irrigation season. E. coli concentrations along with 12 other water quality parameters were measured in water samples. The resulting datasets were used to predict E. coli using stochastic gradient boosting (SGB) machines, random forest (RF), support vector machines (SVM), and k-nearest neighbor (kNN) algorithms. The RF model provided the lowest RMSE value for predicted E. coli concentrations in both ponds in individual years and over consecutive years in almost all cases. For individual years, the RMSE of the predicted E. coli concentrations (log10 CFU 100 ml−1) ranged from 0.244 to 0.346 and 0.304 to 0.418 for Pond 1 and 2, respectively. For the 3-year datasets, these values were 0.334 and 0.381 for Pond 1 and 2, respectively. In most cases there was no significant difference (P > 0.05) between the RMSE of RF and other ML models when these RMSE were treated as statistics derived from 10-fold cross-validation performed with five repeats. Important E. coli predictors were turbidity, dissolved organic matter content, specific conductance, chlorophyll concentration, and temperature. Model predictive performance did not significantly differ when 5 predictors were used vs. 8 or 12, indicating that more tedious and costly measurements provide no substantial improvement in the predictive accuracy of the evaluated algorithms.
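The evaluation scheme described above, RMSE values collected from 10-fold cross-validation with five repeats, can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the data are synthetic, the two predictors are hypothetical scaled stand-ins for turbidity and temperature, and a plain k-nearest-neighbor regressor (one of the algorithm families named in the abstract) is used only because it is short to write. The function names `knn_predict` and `repeated_kfold_rmse` are assumptions introduced here.

```python
import math
import random
import statistics


def knn_predict(train, x, k=5):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return statistics.fmean([y for _, y in nearest])


def repeated_kfold_rmse(data, n_splits=10, n_repeats=5, k=5, seed=0):
    """Return one RMSE per held-out fold (n_splits * n_repeats values)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)  # a fresh random partition for each repeat
        for fold in range(n_splits):
            test = shuffled[fold::n_splits]
            train = [row for i, row in enumerate(shuffled)
                     if i % n_splits != fold]
            sq_err = [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
            scores.append(math.sqrt(statistics.fmean(sq_err)))
    return scores


# Synthetic stand-in data: two scaled predictors and a log10-style target.
rng = random.Random(42)
data = []
for _ in range(200):
    turbidity = rng.uniform(0.0, 1.0)
    temperature = rng.uniform(0.0, 1.0)
    log_ecoli = 1.0 + 1.5 * turbidity + 0.5 * temperature + rng.gauss(0.0, 0.2)
    data.append(((turbidity, temperature), log_ecoli))

scores = repeated_kfold_rmse(data)
print(f"folds evaluated: {len(scores)}")
print(f"mean RMSE: {statistics.fmean(scores):.3f}")
```

Treating the 50 per-fold RMSE values as a sample is what allows two models evaluated on the same splits to be compared with a significance test, as the abstract describes.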
