A Method for Greenhouse Temperature Prediction Based on XGBoost Algorithm and Linear Residual Model

CONVERTER ◽  
2021 ◽  
pp. 108-121
Author(s):  
Huijin Han, et al.

Temperature prediction is significant for precise control of the greenhouse environment. Traditional machine learning methods usually rely on a large amount of data, so it is difficult to make stable and accurate predictions from a small amount of data. This paper proposes a temperature prediction method for greenhouses. With the prediction target transformed to the logarithmic difference of the temperatures inside and outside the greenhouse, the method first uses the XGBoost algorithm to make a preliminary prediction. Second, a linear model is used to predict the residuals of the prediction target. The predicted temperature is obtained by combining the preliminary prediction and the residuals. Based on 20 days of greenhouse data, the results show that the target transformation applied in our method is better than the others presented in the paper. The MSE (Mean Squared Error) of our method is 0.0844, which is respectively 20.7%, 76.0%, 10.2%, and 95.3% of the MSE of LR (Logistic Regression), SGD (Stochastic Gradient Descent), SVM (Support Vector Machines), and the XGBoost algorithm. The results indicate that our method significantly improves prediction accuracy on small-scale data.
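As a rough illustration of the two-stage idea, the sketch below applies the log-difference target transform, fits a preliminary model, and then fits a linear model to its residuals before inverting the transform. Everything here is hypothetical: the data are synthetic, and a plain least-squares fit on a reduced feature set stands in for XGBoost.

```python
import numpy as np

# Synthetic greenhouse-like data (assumed, for illustration only).
rng = np.random.default_rng(0)
n = 200
T_out = rng.uniform(5.0, 20.0, n)            # outdoor temperature (C)
humidity = rng.uniform(0.3, 0.9, n)          # an assumed extra covariate
T_in = 1.4 * T_out + 5.0 * humidity + rng.normal(0.0, 0.2, n)

# 1) Transform the target to the log-difference of inside/outside temperature.
y = np.log(T_in) - np.log(T_out)

# 2) Preliminary prediction (least-squares stand-in for XGBoost), using
#    only T_out so that structure is deliberately left in the residuals.
X1 = np.column_stack([np.ones(n), T_out])
coef1, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ coef1

# 3) Linear model on the residuals of the transformed target,
#    now including the humidity covariate.
X2 = np.column_stack([np.ones(n), T_out, humidity])
coef2, *_ = np.linalg.lstsq(X2, y - y_hat, rcond=None)
y_final = y_hat + X2 @ coef2

# 4) Invert the transform to recover the indoor-temperature prediction.
T_in_pred = T_out * np.exp(y_final)
mse = float(np.mean((T_in - T_in_pred) ** 2))
```

The residual stage can only reduce the in-sample error of the transformed target, which is the point of the second-stage correction.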

Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 429
Author(s):  
Jose Emmanuel Chacón ◽  
Oldemar Rodríguez

This paper presents new approaches to fit regression models for symbolic interval-valued variables, which are shown to improve and extend the center method suggested by Billard and Diday and the center and range method proposed by Lima-Neto, E.A. and De Carvalho, F.A.T. Like the previously mentioned methods, the proposed regression models consider the midpoints and half of the lengths of the intervals as additional variables. We considered various methods to fit the regression models, including tree-based models, K-nearest neighbors, support vector machines, and neural networks. The approaches proposed in this paper were applied to a real dataset and to synthetic datasets generated with linear and nonlinear relations. The root-mean-squared error and the correlation coefficient were used to evaluate the methods. The methods presented herein are available in the RSDA package written in the R language, which can be installed from CRAN.
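The center-and-range idea that these models build on can be sketched as follows: fit one model to the interval midpoints and another to the half-ranges, then reassemble predicted intervals. The data below are synthetic and the models are plain least-squares fits, not the paper's tree-based or neural variants or the RSDA implementation.

```python
import numpy as np

# Synthetic interval-valued data: each interval is encoded by its
# midpoint and half-range (assumed toy relationship).
rng = np.random.default_rng(1)
n = 100
x_mid = rng.uniform(0.0, 10.0, n)       # midpoint of the predictor interval
x_rng = rng.uniform(0.5, 1.5, n)        # half-range of the predictor interval
y_mid = 2.0 * x_mid + 1.0 + rng.normal(0.0, 0.1, n)
y_rng = 0.5 * x_rng + 0.2 + rng.normal(0.0, 0.02, n)

def lstsq_fit(x, y):
    """Fit intercept + slope by ordinary least squares."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

c_mid = lstsq_fit(x_mid, y_mid)         # model for midpoints
c_rng = lstsq_fit(x_rng, y_rng)         # model for half-ranges

# Predicted interval = [midpoint - half-range, midpoint + half-range].
m_hat = c_mid[0] + c_mid[1] * x_mid
r_hat = np.abs(c_rng[0] + c_rng[1] * x_rng)   # keep half-range non-negative
lower, upper = m_hat - r_hat, m_hat + r_hat

rmse_mid = float(np.sqrt(np.mean((y_mid - m_hat) ** 2)))
```

Clamping the predicted half-range to be non-negative is one simple way to keep every predicted interval well-formed, which is one of the practical issues the interval-regression literature addresses.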


Author(s):  
Ahmed Hassan Mohammed Hassan ◽  
◽  
Arfan Ali Mohammed Qasem ◽  
Walaa Faisal Mohammed Abdalla ◽  
Omer H. Elhassan

Day by day, the cumulative incidence of COVID-19 is rapidly increasing. After the spread of the Corona epidemic and the death of more than a million people around the world, scientists and researchers have turned to research and modern machine learning technologies to help the world get rid of the Coronavirus (COVID-19) epidemic. Machine Learning (ML) can be deployed very effectively to track and predict the disease. ML techniques have been applied in areas that need to identify dangerous negative factors and define their priorities. The aim of the proposed system is to predict the number of people infected with COVID-19 using ML. Four standard models are used for COVID-19 prediction: Neural Network (NN), Support Vector Machines (SVM), Bayesian Network (BN), and Polynomial Regression (PR). The data utilized to test these models consist of the number of deaths, newly infected cases, and recoveries in the next 20 days. Five measures were used to evaluate the performance of each model, namely root mean squared error (RMSE), mean squared error (MSE), mean absolute error (MAE), explained variance score, and r2 score (R2). The proposed system offers a promising mechanism for applying these models to the current scenario of the COVID-19 epidemic. The results showed that NN outperformed the other models, while SVM performed poorly on all predictions for the available dataset. Our results indicate that infections will increase slightly in the coming days, and the low death rate gives rise to hope. For future work, case documentation and data amalgamation must be kept up persistently.
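For reference, the five evaluation measures named above can be computed directly. The true and predicted case counts below are made up purely for illustration.

```python
import numpy as np

# Hypothetical true vs. predicted daily case counts.
y_true = np.array([120.0, 135.0, 150.0, 170.0, 195.0])
y_pred = np.array([118.0, 140.0, 148.0, 175.0, 190.0])

err = y_true - y_pred
mse = float(np.mean(err ** 2))                 # mean squared error
rmse = float(np.sqrt(mse))                     # root mean squared error
mae = float(np.mean(np.abs(err)))              # mean absolute error
# Explained variance score: 1 - Var(residuals) / Var(y_true).
evs = float(1.0 - np.var(err) / np.var(y_true))
# R2 score: 1 - SS_res / SS_tot.
r2 = float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
```

The explained variance score and R2 coincide when the residuals have zero mean; they differ slightly when the model is biased, as in this toy example.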


2017 ◽  
Vol 57 (2) ◽  
pp. 229 ◽  
Author(s):  
Farhad Ghafouri-Kesbi ◽  
Ghodratollah Rahimi-Mianji ◽  
Mahmood Honarvar ◽  
Ardeshir Nejati-Javaremi

Three machine learning algorithms: Random Forests (RF), Boosting and Support Vector Machines (SVM), as well as Genomic Best Linear Unbiased Prediction (GBLUP), were used to predict genomic breeding values (GBV), and their predictive performance was compared across different combinations of heritability (0.1, 0.3, and 0.5), number of quantitative trait loci (QTL) (100, 1000), and distribution of QTL effects (normal, uniform, and gamma). To this end, a genome comprised of five chromosomes, one Morgan each, was simulated, on which 10000 bi-allelic single nucleotide polymorphisms were distributed. Pearson’s correlation between the true and predicted GBV and the Mean Squared Error of GBV prediction were used, respectively, as measures of the predictive accuracy and the overall fit achieved with each method. In all methods, an increase in prediction accuracy was seen following an increase in heritability and a decrease in the number of QTL. GBLUP had better predictive accuracy than the machine learning methods, in particular in the scenarios with a higher number of QTL and normal or uniform distributions of QTL effects, though in most cases the differences were non-significant. In the scenarios with a small number of QTL and a gamma distribution of QTL effects, Boosting outperformed the other methods. Regarding the Mean Squared Error of GBV prediction, in most cases Boosting outperformed the other methods, although the estimates were close to those of GBLUP. Among the methods studied, SVM, with 0.6 gigabytes (GIG), was the most efficient user of memory, followed by RF, GBLUP and Boosting with 1.2-GIG, 1.3-GIG and 2.3-GIG memory requirements, respectively. Regarding computational time, GBLUP, SVM, RF and Boosting ranked first, second, third and last with 10 min, 15 min, 75 min and 600 min, respectively. It was concluded that although stochastic gradient Boosting can predict GBV with high prediction accuracy, its significantly longer computational time and memory requirement can be a serious limitation. Therefore, using other variants of Boosting, such as Random Boosting, was recommended for genomic evaluation.


2020 ◽  
Vol 4 (2) ◽  
pp. 329-335
Author(s):  
Rusydi Umar ◽  
Imam Riadi ◽  
Purwono

The failure of most startups in Indonesia is caused by team performance that is not solid and competent. Programmers are an integral profession in a startup team. The development of social media can be used as a strategic tool for recruiting the best programmer candidates for a company. This strategic tool takes the form of a system that automatically classifies the social media posts of prospective programmers. The classification results are expected to predict the performance patterns of each candidate with a predicate of good or bad performance. The classification method with the best accuracy needs to be chosen in order to get an effective strategic tool, so a comparison of several methods is needed. This study compares classification methods including the Support Vector Machines (SVM) algorithm, Random Forest (RF), and Stochastic Gradient Descent (SGD). The classification results show that the accuracy with k = 10 cross-validation reaches 81.3% for the SVM algorithm, 74.4% for RF, and 80.1% for SGD, so the SVM method was chosen as the model for classifying programmer performance from social media activity.
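The k = 10 cross-validation protocol behind these accuracy figures can be sketched as follows. A trivial nearest-centroid classifier on synthetic two-class data stands in for SVM, RF, and SGD; none of the study's actual text features are used.

```python
import numpy as np

# Synthetic two-class data (assumed), 200 points in 2D.
rng = np.random.default_rng(2)
n, k = 200, 10
X = np.vstack([rng.normal(0, 1, (n // 2, 2)), rng.normal(3, 1, (n // 2, 2))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

# Shuffle indices and split into k folds.
idx = rng.permutation(n)
folds = np.array_split(idx, k)

accs = []
for i in range(k):
    test = folds[i]
    train = np.concatenate([folds[j] for j in range(k) if j != i])
    # "Train": one centroid per class; "predict": nearest centroid.
    c0 = X[train][y[train] == 0].mean(axis=0)
    c1 = X[train][y[train] == 1].mean(axis=0)
    d0 = np.linalg.norm(X[test] - c0, axis=1)
    d1 = np.linalg.norm(X[test] - c1, axis=1)
    pred = (d1 < d0).astype(int)
    accs.append(float(np.mean(pred == y[test])))

mean_acc = float(np.mean(accs))    # the figure reported per method
```

Averaging accuracy over the k held-out folds, exactly as above, is what yields a single comparable percentage per classifier.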


Soil Research ◽  
2015 ◽  
Vol 53 (8) ◽  
pp. 907 ◽  
Author(s):  
David Clifford ◽  
Yi Guo

Given the wide variety of ways one can measure and record soil properties, it is not uncommon to have multiple overlapping predictive maps for a particular soil property. One is then faced with the challenge of choosing the best prediction at a particular point, either by selecting one of the maps or by combining them in some optimal manner. This question was recently examined in detail when Malone et al. (2014) compared four different methods for combining a digital soil mapping product with a disaggregation product based on legacy data. These authors also examined how to compute confidence intervals for the resulting map based on the confidence intervals associated with the original input products. In this paper, we propose a new method to combine models called adaptive gating, which is inspired by the use of gating functions in mixture of experts, a machine learning approach to forming hierarchical classifiers. We compare it here with two standard approaches: inverse-variance weights and a regression-based approach. One of the benefits of the adaptive gating approach is that it allows weights to vary based on covariate information or across geographic space. As such, it explicitly takes full advantage of the spatial nature of the maps we are trying to blend. We also suggest a conservative method for combining confidence intervals. We show that the root mean-squared error of predictions from the adaptive gating approach is similar to that of the other standard approaches under cross-validation. However, under independent validation the adaptive gating approach works better than the alternatives, and as such it warrants further study in other areas of application and further development to reduce its computational complexity.
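Of the two baselines, inverse-variance weighting is easy to state concretely: weight each map's prediction by the reciprocal of its error variance, so the more certain map dominates. The map values and variances below are hypothetical.

```python
import numpy as np

# Two overlapping predictions of a soil property at three map points,
# each with its own error variance (made-up numbers).
p1, var1 = np.array([5.2, 6.1, 4.8]), np.array([0.4, 0.4, 0.4])   # map 1
p2, var2 = np.array([5.6, 5.9, 5.1]), np.array([0.1, 0.1, 0.1])   # map 2

# Inverse-variance weights sum to 1 at each point.
w1 = (1.0 / var1) / (1.0 / var1 + 1.0 / var2)
w2 = 1.0 - w1

blended = w1 * p1 + w2 * p2
# Variance of the inverse-variance-weighted combination.
blended_var = 1.0 / (1.0 / var1 + 1.0 / var2)
```

Unlike these fixed weights, the adaptive gating approach lets the weights vary with covariates or location, which is what allows it to exploit the spatial structure of the maps.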


2018 ◽  
Vol 10 (12) ◽  
pp. 4863 ◽  
Author(s):  
Chao Huang ◽  
Longpeng Cao ◽  
Nanxin Peng ◽  
Sijia Li ◽  
Jing Zhang ◽  
...  

Photovoltaic (PV) modules convert renewable and sustainable solar energy into electricity. However, the uncertainty of PV power production brings challenges for grid operation. Forecasting is an essential technique for facilitating the management and scheduling of PV power plants. In this paper, a robust multilayer perceptron (MLP) neural network was developed for day-ahead forecasting of hourly PV power. A generic MLP is usually trained by minimizing the mean squared loss, but the mean squared error is sensitive to a few particularly large errors, which can lead to a poor estimator. To tackle this problem, the pseudo-Huber loss function, which combines the best properties of squared loss and absolute loss, was adopted in this paper. The effectiveness and efficiency of the proposed method were verified by benchmarking against a generic MLP network with real PV data. Numerical experiments illustrated that the proposed method performed better than the generic MLP network in terms of root mean squared error (RMSE) and mean absolute error (MAE).
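The pseudo-Huber loss has the closed form delta^2 * (sqrt(1 + (e/delta)^2) - 1): approximately quadratic near zero (like squared loss) and asymptotically linear (like absolute loss), so a few large errors no longer dominate training. A minimal sketch, with delta chosen arbitrarily:

```python
import numpy as np

def pseudo_huber(err, delta=1.0):
    """Pseudo-Huber loss: ~err^2/2 for small err, ~delta*|err| for large err.

    delta sets the transition scale between the two regimes.
    """
    return delta ** 2 * (np.sqrt(1.0 + (err / delta) ** 2) - 1.0)

e = np.array([0.1, 1.0, 10.0])
loss = pseudo_huber(e)   # grows quadratically, then roughly linearly
```

For an error of 0.1 the loss is close to the squared-loss value 0.005, while for an error of 10 it is near 9, far below the squared-loss value 50; this down-weighting of outliers is exactly the robustness property the paper exploits.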


Author(s):  
Nur Ariffin Mohd Zin ◽  
Hishammuddin Asmuni ◽  
Haza Nuzly Abdul Hamed ◽  
Razib M. Othman ◽  
Shahreen Kasim ◽  
...  

Recent studies have shown that wearing soft contact lenses may degrade iris-recognition performance by increasing the false reject rate. However, detecting the presence of a soft lens is a non-trivial task, as its texture is almost indiscernible. In this work, we propose a classification method to identify the presence of a soft lens in an iris image. Our proposed method starts by segmenting the lens boundary on top of the sclera region. The segmented boundary is then used as a feature and extracted by local descriptors. These features are used to train and classify with Support Vector Machines. The method was tested on the Notre Dame Cosmetic Contact Lens 2013 database. Experiments showed that the proposed method performed better than state-of-the-art methods.


Author(s):  
Santi Koonkarnkhai ◽  
Phongsak Keeratiwintakorn ◽  
Piya Kovintavewat

In bit-patterned media recording (BPMR) channels, the inter-track interference (ITI) is extremely severe at ultra-high areal densities, which significantly degrades system performance. The partial-response maximum-likelihood (PRML) technique, which uses a one-dimensional (1D) partial-response target, might not be able to cope with this severe ITI, especially in the presence of media noise and track mis-registration (TMR). This paper describes the target and equalizer design for high-density BPMR channels. Specifically, we propose a two-dimensional (2D) cross-track asymmetric target, based on a minimum mean-squared error (MMSE) approach, to combat media noise and TMR. Results indicate that the proposed 2D target performs better than previously proposed 2D targets, especially when media noise and TMR are severe.


2022 ◽  
pp. 62-85
Author(s):  
Carlos N. Bouza-Herrera ◽  
Jose M. Sautto ◽  
Khalid Ul Islam Rather

This chapter introduces basic elements of stratified simple random sampling (SRS) and ranked set sampling (RSS). It extends the results of Singh et al. to sampling a stratified population under both the SRS and RSS designs, with the samples drawn independently from the strata. The bias and mean squared error (MSE) of the developed estimators are derived, and the biases and MSEs obtained under the two sampling designs are compared. Under mild conditions, the comparisons show that each RSS model is better than its SRS alternative.
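The advantage of RSS over SRS at equal sample size can be illustrated with a small simulation. The population below is synthetic and unstratified; this is the basic RSS mean estimator, not the chapter's stratified estimators.

```python
import numpy as np

# Hypothetical population with mean 50 and standard deviation 10.
rng = np.random.default_rng(3)
pop = rng.normal(50.0, 10.0, 100_000)

def srs_mean(m, cycles):
    """SRS estimator: plain mean of m * cycles randomly drawn units."""
    return float(np.mean(rng.choice(pop, m * cycles, replace=True)))

def rss_mean(m, cycles):
    """RSS estimator: per cycle, draw m sets of m units, rank each set,
    and measure the i-th order statistic from the i-th set."""
    vals = []
    for _ in range(cycles):
        for i in range(m):
            s = np.sort(rng.choice(pop, m, replace=True))
            vals.append(s[i])
    return float(np.mean(vals))

# Repeat both estimators (same total sample size, 4 * 5 = 20 units)
# and compare their sampling variability across replications.
reps = 500
srs = np.array([srs_mean(4, 5) for _ in range(reps)])
rss = np.array([rss_mean(4, 5) for _ in range(reps)])
```

Across replications both estimators center on the population mean, but the RSS estimator's variance is smaller: the ranking step spreads the measured units across the distribution, which is the core reason RSS designs beat their SRS counterparts.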

