Using Real-Time Data and Unsupervised Machine Learning Techniques to Study Large-Scale Spatio–Temporal Characteristics of Wastewater Discharges and their Influence on Surface Water Quality in the Yangtze River Basin

Water ◽  
2019 ◽  
Vol 11 (6) ◽  
pp. 1268 ◽  
Author(s):  
Zhenzhen Di ◽  
Miao Chang ◽  
Peikun Guo ◽  
Yang Li ◽  
Yin Chang

Most industrial wastewater worldwide, including in China, is still discharged directly to aquatic environments without adequate treatment. Because data are scarce and methods are few, the relationships between pollutants discharged in wastewater and those in surface water have not been fully revealed, and unsupervised machine learning techniques, such as clustering algorithms, have been neglected in related research fields. In this study, real-time monitoring data for chemical oxygen demand (COD), ammonia nitrogen (NH3-N), pH, and dissolved oxygen in the wastewater discharged from 2213 factories and in the surface water at 18 monitoring sections (sites) in 7 administrative regions in the Yangtze River Basin from 2016 to 2017 were collected and analyzed using the partitioning around medoids (PAM) and expectation–maximization (EM) clustering algorithms, the Welch t-test, the Wilcoxon test, and Spearman correlation. The results showed that, compared with the spatial cluster comprising unpolluted sites, the spatial cluster comprising heavily polluted sites, where more wastewater was discharged, had relatively high COD (>100 mg L−1) and NH3-N (>6 mg L−1) concentrations and relatively low pH (<6), with discharges from 15 industrial classes that respected the different discharge limits outlined in the pollutant discharge standards. The results also showed that the economic activities generating wastewater and the geographical distribution of the heavily polluted wastewater changed from 2016 to 2017: the concentration ranges of pollutants in discharges widened, and the contributions from some emerging enterprises became more important. The correlations between the quality of the wastewater and that of the surface water strengthened when the whole-year data sets were reduced to the heavily polluted periods by EM clustering and water quality evaluation.
This study demonstrates how unsupervised machine learning algorithms can play an objective and effective role in mining real-time monitoring data and in highlighting spatio–temporal relationships between pollutants in wastewater discharges and surface water, in support of scientific water resource management.
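As a rough illustration of the partitioning around medoids (PAM) step named in the abstract, the sketch below runs a swap-based k-medoids search on a handful of made-up (COD, NH3-N) pairs. The sample values, the Manhattan distance, the naive initialisation, and all function names are assumptions chosen for illustration; the study's own data, distance measure, and implementation are not described here.

```python
from itertools import product


def manhattan(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids, dist):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def pam(points, k, dist=manhattan):
    """Swap-based k-medoids search (simplified PAM, no BUILD phase)."""
    # Assumption: start from the first k points instead of PAM's BUILD step.
    medoids = list(range(k))
    best = total_cost(points, medoids, dist)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid slot with each non-medoid point.
        for slot, cand in product(range(k), range(len(points))):
            if cand in medoids:
                continue
            trial = medoids[:slot] + [cand] + medoids[slot + 1:]
            cost = total_cost(points, trial, dist)
            if cost < best:  # keep only strictly improving swaps
                medoids, best, improved = trial, cost, True
    # Assign each point to the slot of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels, best


# Hypothetical (COD mg/L, NH3-N mg/L) readings: three clean, three polluted.
points = [
    (20.0, 0.5), (25.0, 0.8), (30.0, 1.0),
    (120.0, 7.0), (150.0, 9.0), (110.0, 6.5),
]
medoids, labels, cost = pam(points, k=2)
print("medoid indices:", medoids)
print("labels:", labels)
print("total cost:", cost)
```

On this toy data the search ends with one medoid in each group, so the low-concentration and high-concentration readings receive different labels. Because medoids are actual observations, each cluster centre can be read directly as a representative monitoring record.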

Author(s):  
Melika Sajadian ◽  
Ana Teixeira ◽  
Faraz S. Tehrani ◽  
Mathias Lemmens

Abstract. Built environments developed on compressible soils are susceptible to land deformation. The spatio-temporal monitoring and analysis of these deformations are necessary for the sustainable development of cities. Techniques such as Interferometric Synthetic Aperture Radar (InSAR), or predictions based on soil mechanics using in situ characterization such as Cone Penetration Testing (CPT), can be used to assess such land deformations. Despite the combined advantages of these two methods, the relationship between them has not yet been investigated. Therefore, the major objective of this study is to reconcile InSAR measurements and CPT measurements using machine learning techniques in an attempt to better predict land deformation.


2021 ◽  
Author(s):  
K. Emma Knowland ◽  
Christoph Keller ◽  
Krzysztof Wargan ◽  
Brad Weir ◽  
Pamela Wales ◽  
...  

NASA's Global Modeling and Assimilation Office (GMAO) produces high-resolution global forecasts for weather, aerosols, and air quality. The NASA Global Earth Observing System (GEOS) model has been expanded to provide global near-real-time 5-day forecasts of atmospheric composition at unprecedented horizontal resolution of 0.25 degrees (~25 km). This composition forecast system (GEOS-CF) combines the operational GEOS weather forecasting model with the state-of-the-science GEOS-Chem chemistry module (version 12) to provide detailed analysis of a wide range of air pollutants such as ozone, carbon monoxide, nitrogen oxides, and fine particulate matter (PM2.5). Satellite observations are assimilated into the system for improved representation of weather and smoke. The assimilation system is being expanded to include chemically reactive trace gases. We discuss current capabilities of the GEOS Constituent Data Assimilation System (CoDAS) to improve atmospheric composition modeling and possible future directions, notably incorporating new observations (TROPOMI, geostationary satellites) and machine learning techniques. We show how machine learning techniques can be used to correct for sub-grid-scale variability, which further improves model estimates at a given observation site.


2021 ◽  
Author(s):  
Natacha Galmiche ◽  
Nello Blaser ◽  
Morten Brun ◽  
Helwig Hauser ◽  
Thomas Spengler ◽  
...  

Probability distributions based on ensemble forecasts are commonly used to assess uncertainty in weather prediction. However, interpreting these distributions is not trivial, especially in the case of multimodality with distinct likely outcomes. The conventional summary employs mean and standard deviation across ensemble members, which works well for unimodal, Gaussian-like distributions. In the case of multimodality this misleads, discarding crucial information.
We aim at combining previously developed clustering algorithms in machine learning and topological data analysis to extract useful information such as the number of clusters in an ensemble. Given the chaotic behaviour of the atmosphere, machine learning techniques can provide relevant results even if no, or very little, a priori information about the data is available. In addition, topological methods that analyse the shape of the data can make results explainable.
Given an ensemble of univariate time series, a graph is generated whose edges and vertices represent clusters of members, including additional information for each cluster such as the members belonging to them, their uncertainty, and their relevance according to the graph. In the case of multimodality, this approach provides relevant and quantitative information beyond the commonly used mean and standard deviation approach that helps to further characterise the predictability.
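The point about mean and standard deviation misleading for multimodal ensembles can be seen in a small sketch. The code below is not the graph-based method described in the abstract; it is a simple gap-based grouping of ensemble values at a single time step, with a made-up ensemble, an assumed gap threshold, and hypothetical function names.

```python
from statistics import mean, pstdev


def gap_clusters(values, max_gap):
    """Group sorted 1-D values, starting a new cluster at each large gap."""
    if not values:
        return []
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close to the previous value: same cluster
        else:
            clusters.append([v])     # large gap: start a new cluster
    return clusters


# Hypothetical ensemble at one forecast time (e.g. temperature anomaly in K):
# half the members predict about -2, the other half about +4.
ensemble = [-2.1, -1.9, -2.0, -1.8, 4.0, 4.2, 3.9, 4.1]

ens_mean = mean(ensemble)
ens_std = pstdev(ensemble)
clusters = gap_clusters(ensemble, max_gap=1.0)  # threshold is an assumption

print(f"mean = {ens_mean:.2f}, std = {ens_std:.2f}")
for c in clusters:
    print(f"cluster of {len(c)} members, range [{min(c)}, {max(c)}]")
```

Here the ensemble mean falls between the two groups, at a value that no member predicts, while the grouping recovers two distinct outcomes with four members each. Reporting the number of clusters and their sizes keeps the information that the mean and standard deviation discard.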


2021 ◽  
Author(s):  
Marcelo E. Pellenz ◽  
Rosana Lachowski ◽  
Edgard Jamhour ◽  
Glauber Brante ◽  
Guilherme Luiz Moritz ◽  
...  
