Dimension Reduction of Machine Learning-Based Forecasting Models Employing Principal Component Analysis

Mathematics ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. 1233
Author(s):  
Yinghui Meng ◽  
Sultan Noman Qasem ◽  
Manouchehr Shokri ◽  
Shahab S

This study attempts to reduce the dimension of wavelet-ANN/ANFIS (artificial neural network/adaptive neuro-fuzzy inference system) models in order to obtain reliable forecasts at lower computational cost. To this end, principal component analysis (PCA) was performed on the input time series, decomposed by a discrete wavelet transform, before feeding them to the ANN/ANFIS models. The models were applied to forecast dissolved oxygen (DO) in rivers, an important variable affecting aquatic life and water quality. The current values of DO, water surface temperature, salinity, and turbidity were used as input variables to forecast DO three time steps ahead. The results showed that PCA can be employed as a powerful tool for reducing the dimension of the input variables and for detecting inter-correlation among them. The results of the PCA-wavelet-ANN models were compared with those of the wavelet-ANN models; the former have the advantage of requiring less computational time than the latter. For the ANFIS models, PCA is even more beneficial, because it prevents the wavelet-ANFIS models from creating too many rules, which degrades their efficiency. Moreover, applying PCA to the wavelet-ANFIS models leads to a significant decrease in computational time. Finally, the PCA-wavelet-ANN/ANFIS models were found to provide reliable forecasts of dissolved oxygen, an important water quality indicator in rivers.
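
The preprocessing chain described in this abstract (wavelet decomposition of each input series, then PCA on the resulting sub-series) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the one-level Haar wavelet, the 95% variance threshold, the synthetic series, and the helper names `haar_dwt` and `pca_reduce` are all assumptions, and NumPy is assumed to be available.

```python
import numpy as np

def haar_dwt(series):
    """One-level Haar DWT (assumed wavelet): returns (approximation, detail)."""
    x = np.asarray(series, dtype=float)
    if x.size % 2:                      # drop the last sample if the length is odd
        x = x[:-1]
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail

def pca_reduce(features, variance_to_keep=0.95):
    """Project standardized features onto the leading principal components."""
    # Assumes no column is constant (standard deviation must be non-zero).
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # Squared singular values are proportional to the component variances.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    k = min(int(np.searchsorted(cumulative, variance_to_keep)) + 1, len(s))
    return z @ vt[:k].T, explained[:k]

# Synthetic, partly correlated stand-ins for the four inputs (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(256)
base = np.sin(2 * np.pi * t / 32)
do = 8.0 + base + 0.1 * rng.normal(size=t.size)
temperature = 20.0 - 2.0 * base + 0.1 * rng.normal(size=t.size)
salinity = 0.5 + 0.05 * base + 0.01 * rng.normal(size=t.size)
turbidity = 5.0 + rng.normal(size=t.size)

# Each series contributes an approximation and a detail column: 4 x 2 = 8 inputs.
features = np.column_stack(
    [coeffs for s in (do, temperature, salinity, turbidity) for coeffs in haar_dwt(s)]
)
scores, explained = pca_reduce(features)
print(features.shape, "->", scores.shape)
```

The columns of `scores` would then replace the eight wavelet sub-series as model inputs; fewer inputs is what limits the number of rules an ANFIS model has to generate.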

Author(s):  
Yinghui Meng ◽  
Sultan Noman Qasem ◽  
Manouchehr Shokri ◽  
S Shamshirband

This study attempts to reduce the dimension of wavelet-ANN/ANFIS (artificial neural network/adaptive neuro-fuzzy inference system) models in order to obtain reliable forecasts at lower computational cost. To this end, principal component analysis (PCA) was performed on the input time series, decomposed by a discrete wavelet transform, before feeding them to the ANN/ANFIS models. The models were applied to forecast dissolved oxygen (DO) in rivers, an important variable affecting aquatic life and water quality. The current values of DO, water surface temperature, salinity, and turbidity were used as input variables to forecast DO three time steps ahead. The results showed that PCA can be employed as a powerful tool for reducing the dimension of the input variables and for detecting inter-correlation among them. The results of the PCA-wavelet-ANN models were compared with those of the wavelet-ANN models; the former have the advantage of requiring less computational time than the latter. For the ANFIS models, PCA is even more beneficial, because it prevents the wavelet-ANFIS models from creating too many rules, which degrades their efficiency. Moreover, applying PCA to the wavelet-ANFIS models leads to a significant decrease in computational time. Finally, the PCA-wavelet-ANN/ANFIS models were found to provide reliable forecasts of dissolved oxygen, an important water quality indicator in rivers.


2016 ◽  
Vol 5 (1) ◽  
pp. 187 ◽  
Author(s):  
Kok Weng Tan ◽  
Weng Chee Beh

This study applies principal component analysis (PCA) to evaluate and interpret the relationship between water quality and benthic macroinvertebrate fauna data obtained from the Pauh River, Cameron Highlands. Samples were collected once every two months (in February, April, June, August and October 2013) at six sampling stations. Six water quality parameters, namely dissolved oxygen (DO), pH, biological oxygen demand (BOD₅), chemical oxygen demand (COD), ammonia-nitrogen (NH₃-N) and total suspended solids (TSS), as well as heavy metal contents, were analyzed according to the American Public Health Association (APHA) Standard Methods for Examination of Water and Wastewater (1998). Macroinvertebrates were sampled using a Surber sampler and identified to family level. Water Quality Index (WQI) values for all stations were class II, except for Station 6, which was recorded as class III. Both the diversity and biotic indices decreased from upstream (Station 1) to downstream (Station 6). A total of 28 to 31 taxa were found at Stations 1, 2, 3 and 5 (upstream to middle stream), whereas only 7 taxa were found at Station 6 (downstream). Station 4 had 31 taxa, the highest number among the monitoring stations, with an average density of 368.28 ind/m². PCA applied to the dataset explained 72.15% of the total variance of the variables. Three components were extracted in this study. The first component comprised benthic macroinvertebrates that tolerate low water quality and high loading of organic matter. The benthic macroinvertebrate families loaded on the second component were sensitive to water environment factors such as NH₃-N, dissolved oxygen (DO), organic matter and stream flow. The benthic macroinvertebrate families loaded on the third component were recognized as species that might not tolerate low concentrations of dissolved oxygen.


2006 ◽  
Vol 1 (1) ◽  
Author(s):  
K. Katayama ◽  
K. Kimijima ◽  
O. Yamanaka ◽  
A. Nagaiwa ◽  
Y. Ono

This paper proposes a method for stormwater inflow prediction that uses radar rainfall data as the input to a prediction model constructed by system identification. The aim is to construct a compact system by reducing the dimension of the input data. Principal Component Analysis (PCA), which is widely used as a statistical method for data analysis and compression, is applied to pre-process the radar rainfall data. The proposed method is then evaluated using radar rainfall data and inflow data acquired in a combined sewer system. The study shows that a few principal components of the radar rainfall data are suitable input variables for a stormwater inflow prediction model. Consequently, a procedure for stormwater inflow prediction using a few principal components of radar rainfall data has been established.
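
To make the idea of feeding "a few principal components" of gridded rainfall to a prediction model concrete, the sketch below compresses a synthetic rainfall grid with PCA and fits the inflow on the component scores. It is only a sketch under stated assumptions: the paper builds its model by system identification, whereas ordinary least squares is swapped in here for brevity, no lead time is modelled, and the grid size, the two components, and all data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_cells = 200, 100          # hypothetical: 200 time steps, 10 x 10 radar grid

# Synthetic rainfall driven by two latent storm patterns plus small noise.
patterns = np.abs(rng.normal(size=(2, n_cells)))
latent = rng.gamma(shape=2.0, scale=1.0, size=(n_steps, 2))
rain = latent @ patterns + 0.05 * rng.normal(size=(n_steps, n_cells))

# Synthetic inflow that depends on the same latent patterns.
inflow = 3.0 * latent[:, 0] + 1.5 * latent[:, 1] + 0.1 * rng.normal(size=n_steps)

# PCA via SVD of the centered rainfall matrix; keep the first two components.
centered = rain - rain.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
n_components = 2
scores = centered @ vt[:n_components].T   # shape (n_steps, n_components)

# Least-squares fit of inflow on the component scores (with an intercept).
design = np.column_stack([np.ones(n_steps), scores])
coef, *_ = np.linalg.lstsq(design, inflow, rcond=None)
predicted = design @ coef

# In-sample coefficient of determination, used here only as a sanity check.
r2 = 1.0 - np.sum((inflow - predicted) ** 2) / np.sum((inflow - inflow.mean()) ** 2)
print(f"{n_cells} grid cells -> {n_components} inputs, in-sample R^2 = {r2:.3f}")
```

The model sees 2 inputs instead of 100 grid cells, which is the compactness the abstract refers to. A real evaluation would use held-out storm events rather than an in-sample fit.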


Author(s):  
Petr Praus

This chapter presents the principles and applications of principal component analysis (PCA) applied to hydrological data. Four case studies show how PCA can be used to obtain information about a wastewater treatment process and drinking water quality in a city network, and to find similarities in data sets of groundwater quality results and water-related images. In the first case study, the composition of raw and cleaned wastewater was characterised and its temporal changes were displayed. In the second case study, drinking water samples were divided into clusters consistent with their sampling localities. In case study III, similar groundwater samples were identified by calculating the cosine similarity and the Euclidean and Manhattan distances. In case study IV, 32 water-related images were transformed into a large image matrix whose dimensionality was reduced by PCA. The images were clustered using the PCA scatter plots.
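
The three similarity measures named for case study III can be written down directly. The sketch below uses plain Python and two made-up sample vectors; whether the chapter computes them on raw concentrations or on PCA scores is not stated in the abstract, so the inputs here are purely illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two sample vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two sample vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical samples: b has the same composition profile as a at twice the level.
sample_a = [3.0, 4.0, 0.0]
sample_b = [6.0, 8.0, 0.0]

print(cosine_similarity(sample_a, sample_b))   # 1.0: identical profile
print(euclidean_distance(sample_a, sample_b))  # 5.0
print(manhattan_distance(sample_a, sample_b))  # 7.0
```

Cosine similarity ignores overall magnitude and compares only the profile, while the two distances also respond to differences in level. This is one reason for reporting them side by side.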


Water ◽  
2020 ◽  
Vol 12 (2) ◽  
pp. 420 ◽  
Author(s):  
Thuy Hoang Nguyen ◽  
Björn Helm ◽  
Hiroshan Hettiarachchi ◽  
Serena Caucci ◽  
Peter Krebs

Although river water quality monitoring (WQM) networks play an important role in water management, their effectiveness is rarely evaluated. This study aims to evaluate and optimize water quality variables and monitoring sites to explain the spatial and temporal variation of water quality in rivers, using principal component analysis (PCA). A complex water quality dataset from the Freiberger Mulde (FM) river basin in Saxony, Germany was analyzed that included 23 water quality (WQ) parameters monitored at 151 monitoring sites from 2006 to 2016. The subsequent results showed that the water quality of the FM river basin is mainly impacted by weathering processes, historical mining and industrial activities, agriculture, and municipal discharges. The monitoring of 14 critical parameters including boron, calcium, chloride, potassium, sulphate, total inorganic carbon, fluoride, arsenic, zinc, nickel, temperature, oxygen, total organic carbon, and manganese could explain 75.1% of water quality variability. Both sampling locations and time periods were observed, with the resulting mineral contents varying between locations and the organic and oxygen content differing depending on the time period that was monitored. The monitoring sites that were deemed particularly critical were located in the vicinity of the city of Freiberg; the results for the individual months of July and September were determined to be the most significant. In terms of cost-effectiveness, monitoring more parameters at fewer sites would be a more economical approach than the opposite practice. This study illustrates a simple yet reliable approach to support water managers in identifying the optimum monitoring strategies based on the existing monitoring data, when there is a need to reduce the monitoring costs.


2020 ◽  
pp. 1-10 ◽  
Author(s):  
Alexandre Teixeira de Souza ◽  
Lucas Augusto T. X. Carneiro ◽  
Osmar Pereira da Silva Junior ◽  
Sérgio Luís de Carvalho ◽  
Juliana Heloisa Pinê Américo-Pinheiro
