Geochemical profile and source identification of surface and groundwater pollution of District Chitral, Northern Pakistan

Coupling numerical simulations of groundwater pollution with Genetic Algorithms (GAs) leads to a heavy computational load; simplifying the flow field can compensate in design applications, but not in real-time/operational ones. Various Machine Learning/Deep Learning (ML/DL) methods and problem formulations were tested and evaluated for real-time inverse problems of aquifer pollution source identification. The aim is to investigate data-driven approaches for replacing flow simulation with trained ML/DL models that identify the source faster, yet effectively enough. Steady flow in a theoretical 1500 m x 1500 m confined, isotropic aquifer of known characteristics is studied. Two pumping wells (PWs) near the southern boundary provide irrigation/drinking water and, together with a varying North-South natural flow, define the flow field. Six suspected sources, each capable of instantaneous leakage, may release a conservative pollutant. Particle tracking simulates advective mass transport in a 2D flow field for 2500 one-day timesteps. The 14x14 inner grid nodes serve as locations of sources, PWs and monitoring wells (MWs; used for simple daily yes/no pollution detection and/or drawdown measurement). A total of 15,246 combinations of 6 source numbers, 21 N-S hydraulic gradients, and 11 flow rates for each of PW1 and PW2 (6 x 21 x 11 x 11) were simulated with the authors' existing software, providing the datasets needed for ML training and evaluation. Two basic ML/DL approaches were implemented: Classification (CL) and Computer Vision (CV). In CL, every source is a discrete class and each MW is a discrete variable. The target variable Y takes values 1 to 6, while the input variables X can be: a) 0/1 (MWi polluted or not), b) the first day of MWi's pollution, c) the duration of MWi's pollution, d) the hydraulic drawdown at MWi. For added realism, the two southern rows of 28 MWs and the MWs on or around the PWs are concealed.
CL has the advantage of facilitating Correlation-based Feature Subset Selection (CFSS), which indirectly leads to a pseudo-optimization of the monitoring network: it minimizes the number of MWs (though not the sampling frequency) based solely on the criterion of efficiency in identifying the source. As a downside, the time dimension and the spatial correlation of MWs are not considered. With approach (b) being the best scheme, Random Forests (RFs; 86.5576% accuracy), Multi-Layer Perceptron (MLP; 77.5%), and Nearest Neighbors (NN; 86.5%) were tested. CFSS identified only 8 MWs as important, and training with the optimal subsets gave promising results: RF=85.4%, MLP=73.1%, NN=85.4%. In CV, the MWs' pollution input data on a 10-day basis (days 0-60 and 800 onward concealed) were formulated into 14x14-pixel black/white images, that is, 14x14 binary (0,1) matrices, with the t=0 image being the desired output. A Convolutional Neural Network (CNN; U-Net architecture for image segmentation) achieved 97.1% accuracy. A Convolutional Long/Short-Term Memory Neural Network (CLSTM), which trains a model to back-propagate by predicting each given time step, with unchanged data formulation (60-800 d, step 10), exhibits 82.3% accuracy. CLSTM's performance is timestep-sensitive; the best results (98% accuracy) were obtained with the configuration 5-800 d, step 6. In conclusion, CL's CFSS minimizes the input space, while the CV approaches yield more promising results in terms of accuracy. Each approach has certain constraints on operational applicability, concerning the number of MWs, the sampling resolution and the total elapsed time. This process paves the way for realistic inverse problem solutions, ML-GA monitoring network optimization, and operational systems for real-time pollution detection.
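
The CV data formulation described above (daily yes/no detections turned into 14x14 binary images sampled every 10 days) can be sketched roughly as follows. This is only an illustration and not the authors' software: the function name `detection_frames`, the dictionary inputs keyed by (row, col) grid node, the inclusive 60-800 day window, and the example well positions and days are all assumptions made for the sketch.

```python
GRID = 14  # 14x14 inner grid nodes, as stated in the abstract


def detection_frames(first_day, duration, start=60, stop=800, step=10):
    """Build (day, image) pairs of GRID x GRID binary detection images.

    first_day / duration: dicts keyed by (row, col) of a monitoring well,
    giving the first polluted day and the number of polluted days
    (hypothetical input layout, assumed for this sketch).
    """
    frames = []
    for day in range(start, stop + 1, step):  # window assumed inclusive
        image = [[0] * GRID for _ in range(GRID)]
        for (row, col), t0 in first_day.items():
            # A pixel is 1 while the well reports pollution on that day.
            if t0 <= day < t0 + duration[(row, col)]:
                image[row][col] = 1
        frames.append((day, image))
    return frames


# Hypothetical detections at two monitoring wells.
first_day = {(3, 5): 100, (4, 5): 250}
duration = {(3, 5): 120, (4, 5): 90}
frames = detection_frames(first_day, duration)
print(len(frames), "frames from day", frames[0][0], "to day", frames[-1][0])
```

Under these assumptions, a stack of such images would be the model input, with the t=0 image (source location) as the target. Passing `start=5, step=6` would mimic the 5-800 d, step 6 configuration mentioned for the CLSTM.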

Surface and groundwater pollution potential

International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts ◽

10.1016/0148-9062(87)91948-6 ◽

1987 ◽

Vol 24 (2) ◽

pp. 47

Keyword(s):

Groundwater Pollution ◽

Pollution Potential ◽

Surface And Groundwater

Industrial waste as a source of surface and groundwater pollution for more than half a century in a sector of the Río de la Plata coastal plain (Argentina)

Chemosphere ◽

10.1016/j.chemosphere.2018.05.084 ◽

2018 ◽

Vol 206 ◽

pp. 727-735 ◽

Cited By ~ 8

Author(s):

Lucía Santucci ◽

Eleonora Carol ◽

Carolina Tanjal

Keyword(s):

Coastal Plain ◽

Groundwater Pollution ◽

Industrial Waste ◽

Rio De La Plata ◽

Río De La Plata ◽

La Plata ◽

Surface And Groundwater

Groundwater pollution source identification and apportionment using PMF and PCA-APCA-MLR receptor models in a typical mixed land-use area in Southwestern China

The Science of The Total Environment ◽

10.1016/j.scitotenv.2020.140383 ◽

2020 ◽

Vol 741 ◽

pp. 140383 ◽

Cited By ~ 4

Author(s):

Han Zhang ◽

Siqian Cheng ◽

Hongfei Li ◽

Kang Fu ◽

Yi Xu

Keyword(s):

Land Use ◽

Groundwater Pollution ◽

Source Identification ◽

Pollution Source ◽

Southwestern China ◽

Pollution Source Identification ◽

Receptor Models ◽

Mixed Land Use

Wavelet denoising and cubic spline interpolation for observation data in groundwater pollution source identification problems

Water Science & Technology Water Supply ◽

10.2166/ws.2019.013 ◽

2019 ◽

Vol 19 (5) ◽

pp. 1454-1462 ◽

Cited By ~ 2

Author(s):

Ying Zhao ◽

Qiang Fu ◽

Wenxi Lu ◽

Ji Yi ◽

Haibo Chu

Keyword(s):

Missing Data ◽

Groundwater Pollution ◽

Source Identification ◽

Noise Level ◽

Spline Interpolation ◽

Wavelet Denoising ◽

Pollution Source ◽

Observation Data ◽

Concentration Data ◽

Cubic Spline Interpolation

Abstract: Because the results of groundwater pollution source identification (GPSI) can influence how much the polluter pays to remediate groundwater resources, the accuracy of the estimated result should be as high as possible. However, many factors can influence the result, such as noisy or incomplete concentration data. This paper therefore studies the difference between using observation data before and after denoising and interpolation when solving GPSI problems. Four noise levels and 20 groups of missing data were designed to test the performance of wavelet denoising and cubic spline interpolation, respectively. The results show that denoising improves the estimated result for the GPSI problem, and the higher the noise level, the stronger this effect. As for interpolation, it yields more accurate results if the missing data belong to the period after the source has released the pollutant. If the missing data are from the period when the pollution source is active, interpolation does not improve the estimation performance.
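
As a rough illustration of the interpolation step, the sketch below fills gaps in a concentration time series with a cubic spline fitted through the available observations. The abstract does not state the spline boundary conditions or the data layout, so the natural boundary conditions (zero second derivative at both ends), the function names `natural_cubic_spline` and `fill_missing`, the use of `None` for missing values, and the example series are all assumptions.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing with at least three knots.
    """
    n = len(xs) - 1
    if n < 2:
        raise ValueError("need at least three knots")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives M[1..n-1].
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    for k in range(1, len(diag)):  # forward elimination (Thomas algorithm)
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    M = [0.0] * (n + 1)  # natural ends: M[0] = M[n] = 0
    for k in range(len(diag) - 1, -1, -1):  # back substitution
        M[k + 1] = (rhs[k] - h[k + 1] * M[k + 2]) / diag[k]

    def evaluate(x):
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((M[i] * a ** 3 + M[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - M[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - M[i + 1] * h[i] / 6.0) * b)

    return evaluate


def fill_missing(times, values):
    """Replace None entries with spline estimates from the known samples."""
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    spline = natural_cubic_spline([t for t, _ in known], [v for _, v in known])
    return [spline(t) if v is None else v for t, v in zip(times, values)]


# Hypothetical observation series (time in days, concentration in mg/L).
times = [0, 10, 20, 30, 40, 50, 60]
conc = [0.0, 1.2, None, 2.9, 2.4, None, 0.8]
filled = fill_missing(times, conc)
print(filled)
```

Consistent with the abstract's finding, such gap filling should be expected to help mainly when the missing samples fall after the release period; the sketch itself makes no claim about identification accuracy.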
