A Comparison of Linear and Non-Linear Machine Learning Techniques (PCA and SOM) for Characterizing Urban Nutrient Runoff

2021 ◽  
Vol 13 (4) ◽  
pp. 2054 ◽  
Author(s):  
Angela Gorgoglione ◽  
Alberto Castro ◽  
Vito Iacobellis ◽  
Andrea Gioia

Urban stormwater runoff represents a significant challenge for the practical assessment of diffuse pollution sources on receiving water bodies. Given the high dimensionality of the problem, the main goal of this study was to compare linear and non-linear machine learning (ML) methods for characterizing urban nutrient runoff from impervious surfaces. Principal component analysis (PCA) was chosen as the linear technique and the self-organizing map (SOM) as the non-linear technique, given their many successful applications in the water quality field. To strengthen this comparison, these techniques were supported by well-known linear and non-linear methods. The techniques were applied to a complete dataset with precipitation, flow rate, and water quality (sediments and nutrients) records of 577 events gathered for a watershed located in Southern Italy. According to the results, both linear and non-linear techniques can represent build-up and wash-off, the two main processes that characterize urban nutrient runoff. In particular, non-linear methods better capture and represent the rainfall-runoff process and the transport of dissolved nutrients in urban runoff (dilution process). However, their computational time is higher than that of the linear technique (0.0054 s vs. 15.24 s, for linear and non-linear, respectively, in our study). The outcomes of this study provide significant insights into the application of ML methods in the water quality field.
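As a rough illustration of the linear technique named in this abstract, the sketch below extracts the first principal component of a small table by power iteration on the sample covariance matrix. It is not the authors' implementation: the function name `first_principal_component`, the column meanings, and the `events` values are all made up for the example, and a real analysis would standardize variables with different units first.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, explained-variance ratio) of the first PC."""
    n = len(rows)
    dims = len(rows[0])
    # Center each column so the covariance is taken about the column means.
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    # Sample covariance matrix (dims x dims).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(dims))
                     for i in range(dims))
    total_variance = sum(cov[i][i] for i in range(dims))
    return v, eigenvalue / total_variance

# Hypothetical events: (rainfall depth, peak flow, nutrient load); invented values.
events = [[5.0, 1.1, 0.8], [12.0, 2.9, 1.9], [20.0, 4.2, 3.5], [8.0, 1.8, 1.2]]
direction, ratio = first_principal_component(events)
print(direction, ratio)
```

The explained-variance ratio indicates how much of the total variance a single linear axis captures. A non-linear method such as a SOM has no direct equivalent of this ratio, which is one reason the two families are compared on interpretability as well as run time.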

2021 ◽  
pp. 1-13
Author(s):  
D. Senthilkumar ◽  
D. George Washington ◽  
A.K. Reshmy ◽  
M. Noornisha

Predicting water quality is an important issue for ecosystems and can help control the increase of water contamination. Water quality prediction is also a complex non-linear multi-target learning problem, and extracting a relevant subset of features from a large number of features with multiple targets is a challenging task. Existing water quality prediction models neither address the multi-target learning process simultaneously nor identify the non-linear relationships between the features and the target variables. Therefore, this study proposes a multi-task learning method that handles multi-target regression using a non-linear machine learning technique. Experiments are conducted to build a prediction model based on the proposed method and to evaluate its accuracy on a water quality dataset. The experimental results indicate that our method improves overall accuracy on the experimental dataset compared with existing methods while using fewer significant features.


2021 ◽  
Author(s):  
Truong-Vinh Hoang ◽  
Sebastian Krumscheid ◽  
Raul Tempone

Filtering is an uncertainty quantification technique that refers to the inference of the states of dynamical systems from noisy observations. This work proposes a machine learning-based filtering method for tracking high-dimensional non-Gaussian state-space models with non-linear dynamics and sparse observations. Our filter method is based on the conditional expectation mean filter and uses machine-learning techniques to approximate the conditional mean (CM). The contribution of this work is twofold: (i) we demonstrate theoretically that the assimilated ensembles obtained using the ensemble conditional mean filter (EnCMF) provide a correct prediction of the posterior mean and have the optimal variance, and (ii) we implement the EnCMF using artificial neural networks, which have a significant advantage in representing non-linear functions that map between high-dimensional domains, such as the CM. We implement the machine learning-based EnCMF for tracking the states of the Lorenz-63 and Lorenz-96 systems under the chaotic regime. Numerical results show that the EnCMF outperforms the ensemble Kalman filter.


Author(s):  
Gonzalo Vergara ◽  
Juan J. Carrasco ◽  
Jesus Martínez-Gómez ◽  
Manuel Domínguez ◽  
José A. Gámez ◽  
...  

The study of energy efficiency in buildings is an active field of research. Modeling and predicting energy-related magnitudes makes it possible to analyze electric power consumption and can yield economic benefits. In this study, classical time series analysis and machine learning techniques, with clustering introduced in some models, are applied to predict active power in buildings. The real data acquired correspond to time, environmental and electrical data of 30 buildings belonging to the University of León (Spain). First, we segmented buildings in terms of their energy consumption using principal component analysis. Afterwards, we applied state-of-the-art machine learning methods and compared them. Finally, we predicted daily electric power consumption profiles and compared them with actual data for different buildings. Our analysis shows that multilayer perceptrons have the lowest error, followed by support vector regression and clustered extreme learning machines. We also analyze daily load profiles on weekdays and weekends for different buildings.


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1927 ◽  
Author(s):  
Han-Shin Jo ◽  
Chanshin Park ◽  
Eunhyoung Lee ◽  
Haing Kun Choi ◽  
Jaedon Park

Although various linear log-distance path loss models have been developed for wireless sensor networks, advanced models are required to represent the path loss more accurately and flexibly in complex environments. This paper proposes a machine learning framework for modeling path loss using a combination of three key techniques: artificial neural network (ANN)-based multi-dimensional regression, Gaussian process-based variance analysis, and principal component analysis (PCA)-aided feature selection. In general, the measured path loss dataset comprises multiple features such as distance, antenna height, etc. First, PCA is adopted to reduce the number of features of the dataset and simplify the learning model accordingly. The ANN then learns the path loss structure from the dataset with reduced dimension, and the Gaussian process learns the shadowing effect. Path loss data measured in a suburban area in Korea are employed. We observe that the proposed combined path loss and shadowing model is more accurate and flexible than the conventional linear path loss plus log-normal shadowing model.
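For context, the conventional baseline mentioned in this abstract is commonly written as PL(d) = PL(d0) + 10 n log10(d / d0), with log-normal shadowing added as a zero-mean Gaussian term in dB. The sketch below evaluates only the mean term. The function name, the reference distance, and the example parameter values are assumptions for illustration and are not taken from the paper.

```python
import math

def log_distance_path_loss(d_m, pl_d0_db, exponent, d0_m=1.0):
    """Mean path loss in dB at distance d_m under the log-distance model."""
    if d_m <= 0 or d0_m <= 0:
        raise ValueError("distances must be positive")
    # Loss grows linearly with log10 of the distance ratio; the slope is 10 * exponent.
    return pl_d0_db + 10.0 * exponent * math.log10(d_m / d0_m)

# Hypothetical parameters: 40 dB loss at the 1 m reference, path loss exponent 3.
for distance in (1.0, 10.0, 100.0):
    print(distance, log_distance_path_loss(distance, 40.0, 3.0))
```

The model is "linear" in the sense that loss is a straight line against log-distance, which is the restriction the ANN-based regression in the abstract is meant to relax.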


Author(s):  
Vidyullatha P ◽  
D. Rajeswara Rao

Curve fitting is a data analysis procedure that supports prediction by showing graphically how data points relate to one another, whether through a linear or a non-linear model. Usually, a curve fit either locates the points along the curve or is used simply to smooth the data and improve the appearance of the plot. Curve fitting examines the relationship between independent and dependent variables with the objective of characterizing a good-fit model, that is, finding the mathematical equation that best fits the given data. In this paper, 150 unorganized data points of environmental variables are used to develop linear and non-linear data models, which are evaluated using the three-dimensional ‘Sftool’ and ‘Labfit’ machine learning techniques. For the linear model, the best coefficient estimates are those for which R-squared approaches one; for the non-linear models, the criterion is the smallest chi-square.
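The two goodness-of-fit criteria named in this abstract can be illustrated with a small sketch. It does not use Sftool or LABFit; it is a plain least-squares line fit with hand-written helpers (`fit_line`, `r_squared`, `chi_square` are names chosen here), and the data points and the unit measurement uncertainty `sigma` are assumptions for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def chi_square(ys, predicted, sigma=1.0):
    """Sum of squared residuals scaled by an assumed uncertainty sigma."""
    return sum(((y - p) / sigma) ** 2 for y, p in zip(ys, predicted))

# Invented data points, roughly linear with a little scatter.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
print(r_squared(ys, predicted), chi_square(ys, predicted))
```

An R-squared close to one and a small chi-square both indicate small residuals. Chi-square is the usual choice for non-linear fits because it can weight each residual by its measurement uncertainty.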


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
O. Obulesu ◽  
Suresh Kallam ◽  
Gaurav Dhiman ◽  
Rizwan Patan ◽  
Ramana Kadiyala ◽  
...  

Cancer is a complicated worldwide health issue with an increasing death rate in recent years. With the rapid growth of high-throughput technology and the machine learning methods that have emerged in recent years, progress in cancer diagnosis has been made based on feature subsets, enabling efficient and precise disease diagnosis. Hence, advanced machine learning techniques that can successfully differentiate lung cancer patients from healthy persons are of great interest. This paper proposes a novel Wilcoxon Signed-Rank Gain Preprocessing combined with Generative Deep Learning, called the Wilcoxon Signed Generative Deep Learning (WS-GDL) method, for lung cancer diagnosis. First, test significance analysis and information gain eliminate redundant and irrelevant attributes and extract the informative and significant ones. Then, using a generator function, the Generative Deep Learning method is used to learn the deep features. Finally, a minimax game (i.e., minimizing error with maximum accuracy) is proposed to diagnose the disease. Numerical experiments on the Thoracic Surgery Data Set are used to test the WS-GDL method's diagnosis performance. The WS-GDL approach can produce relevant and significant attributes and adaptively diagnose the disease by selecting optimal learning model parameters. Quantitative experimental results show that the WS-GDL method achieves better diagnosis performance and higher computing efficiency in terms of computational time, computational complexity, and false-positive rate than state-of-the-art approaches.
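The abstract does not spell out how information gain is computed in WS-GDL, so the following is only a sketch of the standard entropy-based definition for a discrete attribute: the drop in label entropy after splitting on the attribute. The function names and the toy attribute values are hypothetical and are not drawn from the Thoracic Surgery Data Set.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional

# Toy example: one attribute that separates the classes and one that does not.
labels = [1, 1, 0, 0]
print(information_gain(["yes", "yes", "no", "no"], labels))
print(information_gain(["yes", "no", "yes", "no"], labels))
```

Attributes with gain near zero carry little information about the class label and are candidates for removal in a preprocessing step of this kind.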

