Multi-grained cascade forest for effluent quality prediction of papermaking wastewater treatment processes

2020 ◽  
Vol 81 (5) ◽  
pp. 1090-1098
Author(s):  
Chen Xin ◽  
Xueqing Shi ◽  
Dongsheng Wang ◽  
Chong Yang ◽  
Qian Li ◽  
...  

Abstract The real-time estimation of effluent indices of papermaking wastewater is vital to environmental conservation. Ensemble methods have significant advantages over conventional single models in prediction accuracy. Here, an ensemble method, the multi-grained cascade forest (gcForest), is implemented to predict wastewater indices. Compared with conventional modeling methods, including partial least squares, support vector regression, and artificial neural networks, the gcForest model gives better predictions of effluent suspended solids (SSeff) and effluent chemical oxygen demand (CODeff). For SSeff, gcForest achieves the highest correlation coefficient (0.86) and the lowest root-mean-square error (RMSE) (0.41); its RMSE is approximately 46.05% to 50.60% lower than that of the conventional models. For CODeff, gcForest achieves the highest correlation coefficient (0.83) and the lowest RMSE (4.05); its RMSE is approximately 10.60% to 18.51% lower than that of the conventional models.
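
The two figures of merit quoted in this abstract, RMSE and the correlation coefficient, and the "reduced by X%" comparison can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the correlation coefficient is Pearson's r and that the reduction is taken relative to the baseline model's RMSE. The function names and sample values are made up for the example.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient (assumed meaning of 'correlation coefficient')."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def relative_reduction(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse (assumed definition)."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Made-up illustrative values, not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
error = rmse(observed, predicted)
correlation = pearson_r(observed, predicted)
```

Under this definition, a model whose RMSE is half that of a baseline shows a 50% reduction.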

2020 ◽  
Vol 12 (3) ◽  
pp. 356 ◽  
Author(s):  
Hui Qiu ◽  
Shuanggen Jin

Mean sea surface height (MSSH) is an important parameter in the analysis of the geoid gap and the prediction of ocean dynamics. Traditional measurement methods, such as buoy and ship surveys, have small coverage areas, sparse data, and high cost. Recently, Global Navigation Satellite System-Reflectometry (GNSS-R) and the spaceborne Cyclone Global Navigation Satellite System (CYGNSS) mission, launched on 15 December 2016, have provided a new opportunity to estimate MSSH with all-weather operation, global coverage, high spatial-temporal resolution, rich signal sources, and strong concealability. In this paper, the global MSSH was estimated by using the relationship between the waveform characteristics of the delay waveform (DM) obtained from the delay Doppler map (DDM) of CYGNSS data, and the estimate was validated by satellite altimetry. Compared with the altimetry CNES_CLS2015 product provided by AVISO, the mean absolute error was 1.33 m, the root mean square error was 2.26 m, and the correlation coefficient was 0.97. Compared with the sea surface height model DTU10, the mean absolute error was 1.20 m, the root mean square error was 2.15 m, and the correlation coefficient was 0.97. Furthermore, the sea surface height obtained from CYGNSS was consistent with Jason-2's results, with a mean absolute error of 2.63 m, a root mean square error (RMSE) of 3.56 m, and a correlation coefficient (R) of 0.95.


2020 ◽  
Vol 12 (11) ◽  
pp. 1814
Author(s):  
Phamchimai Phan ◽  
Nengcheng Chen ◽  
Lei Xu ◽  
Zeqiang Chen

Tea is a cash crop that improves the quality of life for people in the Tanuyen District of Laichau Province, Vietnam. Tea yield, however, has stagnated in recent years due to changes in temperature, precipitation, the age of the tea bushes, and diseases. Monitoring tea bushes by remote sensing and Geographic Information Systems (GIS) might be a way to alleviate this problem. Using multi-temporal remote sensing data, the paper investigates changes in tea health and forecasts yield through the normalized difference vegetation index (NDVI). In this study, we used NDVI as a support tool to show temporal and spatial changes by extracting tea NDVI values and calculating the mean NDVI value. The results showed that the minimum NDVI value was 0.42 during January 2013 and February 2015 and 2016. The maximum NDVI value was in August 2015 and June 2017. The linear relationship between NDVI value and mean temperature was strong, with R² = 0.79. Our results confirm that the combination of meteorological data and NDVI data can achieve high yield-prediction performance. Three models were built to predict tea yield: support vector machine (SVM), random forest (RF), and the traditional linear regression model (TLRM). For the period 2009 to 2018, the RF model predicted tea yield best, with R² = 0.73, compared with 0.66 for SVM and 0.57 for the TLRM. Three evaluation indicators were used to assess accuracy: the coefficient of determination (R²), root-mean-square error (RMSE), and percentage error of tea yield (PETY). The highest accuracy for the three models was in 2015, with R² ≥ 0.87, RMSE < 50 kg/ha, and PETY below 3%. In the other years, the prediction accuracy was higher in the SVM and RF models. The RF algorithm performed best, with PETY ≤10% and a significantly lower RMSE (≤80 kg/ha). RMSE and PETY showed relatively good values in the TLRM model, with an RMSE from 80 to 100 kg/ha and a PETY from 8 to 15%.
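
The NDVI values discussed above follow the standard definition NDVI = (NIR − Red) / (NIR + Red). The sketch below computes per-pixel NDVI and a mean over a set of pixels. The PETY formula is not defined in the abstract, so the version here (absolute percentage difference between predicted and actual yield) is an assumption, and all names and numbers are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat a zero-signal pixel as NDVI 0
    return (nir - red) / total

def mean_ndvi(pixels):
    """Mean NDVI over an iterable of (nir, red) reflectance pairs."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)

def pety(actual_yield, predicted_yield):
    """Percentage error of tea yield (assumed definition, not from the paper)."""
    return 100.0 * abs(predicted_yield - actual_yield) / actual_yield

# Made-up reflectance pairs for illustration only.
tea_pixels = [(0.50, 0.10), (0.45, 0.15), (0.40, 0.20)]
average = mean_ndvi(tea_pixels)
```

Dense, healthy vegetation reflects strongly in the near infrared and absorbs red light, so its NDVI is closer to 1 than that of bare soil.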


2017 ◽  
Vol 71 (11) ◽  
pp. 2427-2436 ◽  
Author(s):  
Mi Lei ◽  
Long Chen ◽  
Bisheng Huang ◽  
Keli Chen

In this research paper, a fast, quantitative, analytical model for magnesium oxide (MgO) content in medicinal mineral talcum was explored based on near-infrared (NIR) spectroscopy. MgO content in each sample was determined by ethylenediaminetetraacetic acid (EDTA) titration and taken as the reference value for NIR spectroscopy, and a variety of spectral data processing methods were then compared to establish a good NIR spectroscopy model. To start, 50 batches of talcum samples were split into a training set and a test set using the Kennard–Stone (K-S) algorithm. In a partial least squares regression (PLSR) model, both leave-one-out cross-validation (LOOCV) and training set validation (TSV) were used to screen spectrum preprocessing methods, including multiplicative scatter correction (MSC), and finally the standard normal variate transformation (SNV) was chosen as the optimal pretreatment method. The modeling spectrum bands and ranks were optimized using the PLSR method, and the characteristic spectrum ranges were determined as 11995–10664, 7991–6661, and 4326–3999 cm⁻¹, with four optimal ranks. In the support vector machine (SVM) model, the radial basis function (RBF) kernel was used. The full spectrum data of samples pretreated with SNV, the characteristic spectrum data screened using synergy interval partial least squares (SiPLS), and the scores of the first four ranks obtained by partial least squares (PLS) dimension reduction of the characteristic spectrum were each taken as input variables of the SVM, and the MgO content reference values of the samples were taken as output values. In addition, the SVM internal parameters were optimized using the grid optimization method (GRID), particle swarm optimization (PSO), and a genetic algorithm (GA) to determine the optimal C and g values and establish the validation model. A comprehensive comparison of the validation results of the different models shows that the optimal NIR spectroscopy quantitative model of talc was the PLS-SVM regression model that took the scores of the first four ranks from PLS dimension reduction of the characteristic spectrum as SVM inputs and was tuned using GRID. With this PLS-SVM regression model (rank = 4), the measured MgO content of talcum was in the range of 17.42–33.22%, with a root mean square error of cross validation (RMSECV) of 2.2127%, a root mean square error of calibration (RMSEC) of 0.6057%, and a root mean square error of prediction (RMSEP) of 1.2901%. This model showed high accuracy and strong prediction capacity and can be used for rapid prediction of MgO content in talcum.
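
The SNV pretreatment chosen in this abstract can be illustrated with a short sketch. SNV centers each spectrum on its own mean and scales it by its own standard deviation. The version below assumes the sample standard deviation (n − 1 denominator), which is a common convention; some implementations divide by n instead. The function name and the absorbance values are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate transform of a single spectrum.

    Each spectrum is centered on its own mean and scaled by its own
    sample standard deviation (n - 1 denominator; an assumed convention).
    """
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]

# Made-up absorbance values for illustration only.
raw_spectrum = [0.42, 0.47, 0.55, 0.61, 0.58]
corrected = snv(raw_spectrum)
```

Because the transform is applied spectrum by spectrum, it needs no reference spectrum, unlike multiplicative scatter correction.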


Materials ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 6792
Author(s):  
Jing Liu ◽  
Masoud Mohammadi ◽  
Yubao Zhan ◽  
Pengqiang Zheng ◽  
Maria Rashidi ◽  
...  

Self-consolidating concrete (SCC) is a well-known type of concrete that has been employed in different structural applications because of its desirable properties. Different studies have been performed to obtain a sustainable mix design and enhance the fresh properties of SCC. In this study, an adaptive neuro-fuzzy inference system (ANFIS) algorithm is developed to predict the superplasticizer (SP) demand and select the most significant parameter of the fresh properties of the optimum mix design. For this purpose, a comprehensive database consisting of verified test results of SCC incorporating cement replacement powders, including pumice, slag, and fly ash (FA), has been employed. First, fresh properties tests, including the J-ring, V-funnel, U-box, and slump values at different time intervals, were considered to collect the datasets. In the second stage, five ANFIS models were adjusted and the most precise method for predicting the SP demand was identified. The coefficient of determination (R²), Pearson’s correlation coefficient (r), Nash–Sutcliffe efficiency (NSE), root mean square error (RMSE), mean absolute error (MAE), and Willmott’s index of agreement (WI) were used as the measures of precision. Later, the parameters most effective in predicting SP demand were evaluated by the developed ANFIS. Based on the analytical results, the employed algorithm was able to predict the SP demand of SCC with high accuracy. Finally, it was deduced that the V-funnel test is the most reliable method for estimating the SP demand value and a significant parameter for SCC mix design, as it led to the lowest training RMSE compared to other non-destructive testing methods.
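
Two of the less familiar precision measures listed above, the Nash–Sutcliffe efficiency (NSE) and the index of agreement (WI), can be sketched from their standard definitions. This is an illustration only: the abstract does not say which variant of the index of agreement was used, so the original squared-difference form is assumed, and the sample values are made up.

```python
def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((O - P)^2) / sum((O - mean(O))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def index_of_agreement(observed, predicted):
    """WI = 1 - sum((O - P)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2).

    Assumed variant: the original squared-difference form.
    """
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - residual / potential

# Made-up SP demand values for illustration only.
observed = [3.1, 3.6, 4.2, 4.8]
predicted = [3.0, 3.7, 4.1, 5.0]
nse = nash_sutcliffe(observed, predicted)
wi = index_of_agreement(observed, predicted)
```

Both measures equal 1 for a perfect prediction. An NSE of 0 means the model is no better than predicting the observed mean.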


2013 ◽  
Vol 76 (11) ◽  
pp. 1868-1872 ◽  
Author(s):  
SOOMIN LEE ◽  
HEEYOUNG LEE ◽  
JOO-YEON LEE ◽  
PANAGIOTIS SKANDAMIS ◽  
BEOM-YOUNG PARK ◽  
...  

In this study, mathematical models were developed to predict the growth probability and kinetic behavior of Listeria monocytogenes on fresh pork skin during storage at different temperatures. A 10-strain mixture of L. monocytogenes was inoculated on fresh pork skin (3 by 5 cm) at 4 log CFU/cm². The inoculated samples were stored aerobically at 4, 7, and 10°C for 240 h, at 15 and 20°C for 96 h, and at 25 and 30°C for 12 h. The Baranyi model was fitted to L. monocytogenes growth data on PALCAM agar to calculate the maximum specific growth rate, lag-phase duration, the lower asymptote, and the upper asymptote. The kinetic parameters were then further analyzed as a function of storage temperature. The model simulated growth of L. monocytogenes under constant and changing temperatures, and the performances of the models were evaluated by the root mean square error and bias factor (Bf). Of the 49 combinations (temperature × sampling time), the combinations with significant growth (P < 0.05) of L. monocytogenes were assigned a value of 1, and the combinations with nonsignificant growth (P ≥ 0.05) were given a value of 0. These data were analyzed by logistic regression to develop a model predicting the probabilities of L. monocytogenes growth. At 4 to 10°C, obvious L. monocytogenes growth was observable after 24 h of storage; but, at other temperatures, the pathogen had obvious growth after 12 h of storage. Because the root mean square error value (0.184) and Bf (1.01) were close to 0 and 1, respectively, the performance of the developed model was acceptable, and the probabilistic model also showed good performance. These results indicate that the developed model should be useful in predicting kinetic behavior and calculating growth probabilities of L. monocytogenes as a function of temperature and time.
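
The bias factor (Bf) used above to judge the kinetic model can be sketched from its usual definition in predictive microbiology: ten raised to the mean of log10(predicted/observed). The abstract does not state which quantity the ratio is taken over, so the inputs below are hypothetical positive values and the function name is an assumption.

```python
import math

def bias_factor(observed, predicted):
    """Bf = 10 ** mean(log10(predicted / observed)).

    Inputs must be positive. Bf = 1 means no systematic bias;
    values above or below 1 mean over- or under-prediction on average.
    """
    n = len(observed)
    mean_log_ratio = sum(math.log10(p / o) for o, p in zip(observed, predicted)) / n
    return 10 ** mean_log_ratio

# Made-up observed and predicted values for illustration only.
observed = [4.2, 5.1, 6.3, 7.0]
predicted = [4.3, 5.0, 6.4, 7.1]
bf = bias_factor(observed, predicted)
```

A Bf close to 1, such as the 1.01 reported here, indicates that over- and under-predictions roughly cancel out on a log scale.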


2020 ◽  
Vol 20 (3) ◽  
pp. 1016-1034
Author(s):  
Zhongda Tian

Abstract The accurate prediction of crop water requirement is of great significance for the development of regional agriculture. Based on the wavelet transform, a combined prediction approach for crop water requirement is proposed. First, the Mallat wavelet transform algorithm is used to decompose and reconstruct the crop water requirement series, yielding the approximate and detail components of the original series. The characteristics of the approximate and detail components are analyzed by the Hurst index. Then, according to the different characteristics of the components, a support vector machine optimized by the particle swarm optimization algorithm is used to predict the approximate component, and the autoregressive moving average model is used to predict the detail components. Three-fold cross-validation is used to improve the generalization ability of the forecasting model. Finally, the prediction values of the individual models are combined to obtain the final prediction value of crop water requirement. Crop water requirement data from 1983 to 2018 in Liaoning Province of China are used as the research object. The simulation results indicate that the proposed combined prediction approach has high prediction accuracy for crop water requirement. The comparison of performance indicators shows that, with the proposed approach, the root mean square error is reduced by 45.40% to 57.16%, the mean absolute error by 32.96% to 52.07%, the mean absolute percentage error by 33.02% to 52.37%, the relative root mean square error by 45.26% to 57.38%, the square sum error by 70.18% to 80.42%, and the Theil inequality coefficient by 59.02% to 80.77%. R square increased by 16.46% to 54.77%, and the index of agreement increased by 3.82% to 23.37%. The results of Pearson's test and the DM test show that the association strength between the actual value and the prediction value of the crop water requirement is stronger. Moreover, the proposed prediction approach has higher reliability under the same confidence level. The effectiveness of the proposed prediction approach for crop water requirement is thus verified. The proposed approach is significant for the rational use, planning, and management of water resources and for promoting sustainable social and economic development.
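
The decompose-and-reconstruct step described above can be illustrated with one level of the Mallat scheme. The abstract does not name the wavelet or the number of levels, so this sketch assumes the Haar wavelet and a single level. The function names and the series values are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decompose(series):
    """One-level Haar decomposition of an even-length series.

    Returns (approximation coefficients, detail coefficients).
    """
    pairs = [(series[i], series[i + 1]) for i in range(0, len(series), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose; pass zeros to isolate one component."""
    series = []
    for a, d in zip(approx, detail):
        series.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return series

# Made-up series for illustration only.
series = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_decompose(series)
zeros = [0.0] * len(approx)

# Component series at the original length: smooth trend and fluctuation.
approx_component = haar_reconstruct(approx, zeros)
detail_component = haar_reconstruct(zeros, detail)
```

The two components sum back to the original series, which is why separate forecasts of each component can be added to give the final prediction.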


2018 ◽  
Vol 4 (1) ◽  
Author(s):  
Agustian Noor

Earthquakes are a periodic natural phenomenon that occurs across the globe as a result of tide-generating forces originating mainly from the sun and the moon. The aim of this study is to analyze earthquake results in North Sumatra. The proposed method compares SVM and SVM-PSO using data from the relevant agencies, specifically for the North Sumatra region. Each algorithm is implemented using RapidMiner 5.1. Performance is measured by computing the average error through the Root Mean Square Error (RMSE). The smaller the value of each performance parameter, the closer the predicted value is to the actual value. In this way, the more accurate algorithm can be identified.


Author(s):  
Parveen Bhola ◽  
Saurabh Bhardwaj

Many applications, including power trading and planning, require accurate real-time estimation of solar power. Because the power output of solar panels degrades over time, real-time estimation is difficult without a degradation parameter. In the proposed method, the effect of degradation, expressed as the performance ratio, is incorporated along with other meteorological parameters. The degradation is calculated in real time using a clustering-based technique without physical inspection on site. Initially, the power is estimated using a Support Vector Regression (SVR) model with the meteorological parameters. The estimation is then fine-tuned in sync with the degradation rate. The model is validated on real data (meteorological parameters and solar power) procured from the solar plant. After refinement, the estimation results show significant improvement in terms of statistical measures. The estimation accuracy in terms of the coefficient of determination R² is 92%, and the error metrics normalized root mean square error (NRMSE), mean absolute percentage error (MAPE), and root mean square error (RMSE) are 7.13, 5.92, and 14.54, respectively.


2018 ◽  
Vol 14 (2) ◽  
pp. 225
Author(s):  
Indriyanti Indriyanti ◽  
Agus Subekti

Rising building energy consumption has driven researchers to build prediction models using machine learning methods, but the most accurate model is still unknown. Predictive models for commercial building energy consumption are important for energy conservation. With the right model, we can design buildings that use energy more efficiently. In this paper, we propose a predictive model based on machine learning methods to obtain the best model for predicting total energy consumption. The algorithms used are SMOreg and LibSVM from the Support Vector Machine class, and the models are evaluated by their Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Using a publicly available dataset, we developed models based on support vector machines for regression. Tests of the two algorithms show that SMOreg is more accurate, with an MAE and RMSE of 4.70 and 10.15, whereas the LibSVM model has an MAE and RMSE of 9.37 and 14.45. We propose the method based on the SMOreg algorithm because it performs better.

