Demand forecasting at retail stage for selected vegetables: a performance analysis

2019 ◽  
Vol 14 (4) ◽  
pp. 1042-1063 ◽  
Author(s):  
Rahul Priyadarshi ◽  
Akash Panigrahi ◽  
Srikanta Routroy ◽  
Girish Kant Garg

Purpose The purpose of this study is to select the appropriate forecasting model at the retail stage for selected vegetables on the basis of performance analysis. Design/methodology/approach Various forecasting models such as the Box–Jenkins-based auto-regressive integrated moving average (ARIMA) model and machine learning-based algorithms such as long short-term memory (LSTM) networks, support vector regression (SVR), random forest regression, gradient boosting regression (GBR) and extreme GBR (XGBoost/XGBR) were proposed and applied (i.e. modeling, training, testing and predicting) at the retail stage for selected vegetables to forecast demand. The performance analysis (i.e. forecasting error analysis) was carried out to select the appropriate forecasting model at the retail stage for selected vegetables. Findings From the obtained results for a case environment, it was observed that the machine learning algorithms, namely LSTM and SVR, produced better results than the other demand forecasting models. Research limitations/implications The results obtained from the case environment cannot be generalized. However, the approach may be used to forecast demand for different agricultural produce at the retail stage, capturing their demand environment. Practical implications The implementation of LSTM and SVR for the case situation at the retail stage will reduce the forecast error, daily retail inventory and fresh produce wastage and will increase the daily revenue. Originality/value The demand forecasting model selection for agricultural produce at the retail stage on the basis of performance analysis is a unique study where both traditional and non-traditional models were analyzed and compared.

2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Kushalkumar Thakkar ◽  
Suhas Suresh Ambekar ◽  
Manoj Hudnurkar

Purpose Longitudinal facial cracks (LFC) are one of the major defects occurring in the continuous-casting stage of a thin slab caster using funnel molds. Longitudinal cracks occur mainly owing to non-uniform cooling, varying thermal conductivity along the mold length, use of high superheat during casting and improper casting powder characteristics. These defects are difficult to capture and are visible only in the final stages of a process or even at the customer end. Besides, there is a seasonality associated with this defect: defect intensity increases during the winter season. To address the issue, a model based on data analytics is developed. Design/methodology/approach Around six months of data from the steel manufacturing process are taken and around 60 data collection points are analyzed. The model uses different classification machine learning algorithms such as logistic regression, decision tree, ensemble methods of decision trees, support vector machine and Naïve Bayes (for different cut-off levels) to investigate the data. Findings The proposed research framework shows that most of the models give good results between cut-off levels 0.6 and 0.8, and that the random forest, gradient boosting for decision trees and support vector machine models perform better than the other models. Practical implications Based on the model's predictions, steel manufacturing companies can identify the optimal operating range where this defect can be reduced. Originality/value An analytical approach to identifying LFC defects provides objective models for the reduction of LFC defects. By reducing LFC defects, the quality of steel can be improved.
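
The role of a cut-off level can be illustrated with a minimal sketch. It assumes the cut-off is a probability threshold above which a cast is labelled defective; the function name, labels and probabilities below are hypothetical and are not taken from the study.

```python
def accuracy_at_cutoff(y_true, probabilities, cutoff):
    """Label a sample as defective (1) when its predicted probability
    reaches the cut-off, then return the fraction of correct labels."""
    predictions = [1 if p >= cutoff else 0 for p in probabilities]
    correct = sum(1 for t, p in zip(y_true, predictions) if t == p)
    return correct / len(y_true)


# Made-up labels (1 = LFC defect) and model probabilities, for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
probabilities = [0.9, 0.65, 0.75, 0.3, 0.85, 0.55]

for cutoff in (0.5, 0.6, 0.7, 0.8):
    print(cutoff, accuracy_at_cutoff(y_true, probabilities, cutoff))
```

Sweeping the cut-off in this way shows how the same classifier can look better or worse depending on the threshold chosen, which is why results are reported per cut-off level.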


Materials ◽  
2021 ◽  
Vol 14 (15) ◽  
pp. 4068
Author(s):  
Xu Huang ◽  
Mirna Wasouf ◽  
Jessada Sresakoolchai ◽  
Sakdirat Kaewunruen

Cracks typically develop in concrete due to shrinkage, loading actions, and weather conditions, and may occur at any time in its life span. Autogenous healing concrete is a type of self-healing concrete that can automatically heal cracks based on physical or chemical reactions in the concrete matrix. It is imperative to investigate the healing performance that autogenous healing concrete possesses, to assess the extent of the cracking and to predict the extent of healing. In the research of self-healing concrete, testing the healing performance of concrete in a laboratory is costly, and a large number of instances may be needed to explore reliable concrete design. This study is thus the world’s first to establish six types of machine learning algorithms that are capable of predicting the healing performance (HP) of self-healing concrete. These algorithms are an artificial neural network (ANN), k-nearest neighbours (kNN), gradient boosting regression (GBR), decision tree regression (DTR), support vector regression (SVR) and random forest (RF). Parameters of these algorithms are tuned using a grid search algorithm (GSA) and a genetic algorithm (GA). The prediction performance of these algorithms, indicated by the coefficient of determination (R²) and root mean square error (RMSE), is evaluated on the basis of 1417 data sets from the open literature. The results show that GSA-GBR achieves higher prediction performance (R² = 0.958) and stronger robustness (RMSE = 0.202) than the other five types of algorithms employed to predict the healing performance of autogenous healing concrete. Therefore, reliable prediction accuracy of the healing performance and efficient assistance in the design of autogenous healing concrete can be achieved.
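
For readers unfamiliar with the two measures, the following sketch computes the coefficient of determination and RMSE using their standard textbook definitions. The observed and predicted values are made up for illustration and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical healing-performance values, for illustration only.
observed = [0.2, 0.5, 0.7, 0.9]
predicted = [0.25, 0.45, 0.75, 0.85]

print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R² closer to 1 both indicate predictions that track the observations more closely.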


2021 ◽  
Vol 10 (1) ◽  
pp. 42
Author(s):  
Kieu Anh Nguyen ◽  
Walter Chen ◽  
Bor-Shiun Lin ◽  
Uma Seeboonruang

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, and decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split for the training and test data, and the process was repeated three times. The performance of six ensemble methods in three categories was analyzed based on the average of three attempts. It was found that GBM performed the best among the ensemble models with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.
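
The sampling procedure and one of the reported scores can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names, the two made-up strata "A" and "B", and the seeds are hypothetical, and d is assumed to be Willmott's index of agreement.

```python
import random
from collections import defaultdict


def stratified_split(records, stratum_of, train_fraction=0.7, seed=0):
    """Split records into train/test sets, sampling each stratum separately
    so that both sets keep roughly the same stratum proportions."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[stratum_of(rec)].append(rec)
    train, test = [], []
    for members in groups.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement d (1.0 means perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


# Twenty hypothetical records in two strata; repeat the 70/30 split three times.
records = [(i, "A" if i < 10 else "B") for i in range(20)]
splits = [stratified_split(records, lambda r: r[1], seed=s) for s in range(3)]
for train, test in splits:
    print(len(train), len(test))
```

Averaging a score such as d over the three splits, rather than relying on a single split, reduces the influence of any one random partition on the reported result.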


Author(s):  
Gudipally Chandrashakar

In this article, we use historical time-series data on the gold price up to the present day. To predict the gold price, we consider a few correlated factors: silver price, copper price, Standard and Poor’s 500 value, dollar-rupee exchange rate and Dow Jones Industrial Average value. The data cover the prices of every correlated factor and the gold price, with dates ranging from January 2008 to February 2021. The machine learning algorithms used to analyze the time-series data are Random Forest Regression, Support Vector Regressor, Linear Regressor, ExtraTrees Regressor and Gradient Boosting Regression. In the results, the ExtraTrees Regressor predicts gold prices most accurately.


2021 ◽  
Author(s):  
Polash Banerjee

Abstract Wildfires of limited extent and intensity can be a boon for the forest ecosystem. However, the 2019 wildfires in Australia and Brazil are sad reminders of their heavy ecological and economic costs. Understanding the role of environmental factors in the likelihood of wildfires in a spatial context would be instrumental in mitigating them. In this study, 14 environmental features encompassing meteorological, topographical, ecological, in situ and anthropogenic factors have been considered for preparing the wildfire likelihood map of Sikkim Himalaya. A comparative study on the efficiency of machine learning methods like Generalized Linear Model (GLM), Support Vector Machine (SVM), Random Forest (RF) and Gradient Boosting Model (GBM) has been performed to identify the best performing algorithm in wildfire prediction. The study indicates that all the machine learning methods are good at predicting wildfires. However, RF performed best, followed by GBM. Also, environmental features like average temperature, average wind speed, proximity to roadways and tree cover percentage are the most important determinants of wildfires in Sikkim Himalaya. This study can be considered as a decision support tool for preparedness, efficient resource allocation and sensitization of people towards mitigation of wildfires in Sikkim.


Author(s):  
Harsha A K

Abstract: Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional approaches to detecting malware, like packet content analysis, are inefficient in dealing with encrypted data. In the absence of actual packet contents, we can make use of other features like packet size, arrival time, source and destination addresses and other such metadata to detect malware. Such information can be used to train machine learning classifiers to distinguish malicious from benign packets. In this paper, we offer an efficient malware detection approach using classification algorithms in machine learning such as support vector machine, random forest and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set in order to assess their respective performances. We further attempt to tune the hyperparameters of the algorithms in order to achieve better results. Random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, resulting in area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.


2020 ◽  
Vol 9 (9) ◽  
pp. 507
Author(s):  
Sanjiwana Arjasakusuma ◽  
Sandiaga Swahyu Kusuma ◽  
Stuart Phinn

Machine learning has been employed for various mapping and modeling tasks using input variables from different sources of remote sensing data. For feature selection involving data of high spatial and spectral dimensionality, various methods have been developed and incorporated into the machine learning framework to ensure an efficient and optimal computational process. This research aims to assess the accuracy of various feature selection and machine learning methods for estimating forest height using AISA (airborne imaging spectrometer for applications) hyperspectral bands (479 bands) and airborne light detection and ranging (lidar) height metrics (36 metrics), alone and combined. Feature selection and dimensionality reduction using Boruta (BO), principal component analysis (PCA), simulated annealing (SA), and genetic algorithm (GA) in combination with machine learning algorithms such as multivariate adaptive regression spline (MARS), extra trees (ET), support vector regression (SVR) with radial basis function, and extreme gradient boosting (XGB) with trees (XGbtree and XGBdart) and linear (XGBlin) classifiers were evaluated. The results demonstrated that the combinations of BO-XGBdart and BO-SVR delivered the best model performance for estimating tropical forest height by combining lidar and hyperspectral data, with R² = 0.53 and RMSE = 1.7 m (18.4% of nRMSE and 0.046 m of bias) for BO-XGBdart and R² = 0.51 and RMSE = 1.8 m (15.8% of nRMSE and −0.244 m of bias) for BO-SVR. Our study also demonstrated the effectiveness of BO for variable selection; it could reduce the data by 95%, selecting the 29 most important variables from the initial 516 variables from lidar metrics and hyperspectral data.


Kybernetes ◽  
2019 ◽  
Vol 49 (9) ◽  
pp. 2335-2348 ◽  
Author(s):  
Milad Yousefi ◽  
Moslem Yousefi ◽  
Masood Fathi ◽  
Flavio S. Fogliatto

Purpose This study aims to investigate the factors affecting daily demand in an emergency department (ED) and to provide a forecasting tool in a public hospital for horizons of up to seven days. Design/methodology/approach In this study, the important factors influencing demand in EDs were first extracted from the literature, and then the factors relevant to the study were selected. Then, a deep neural network was applied to construct a reliable predictor. Findings Although many statistical approaches have been proposed for tackling this issue, better forecasts are achievable by using the abilities of machine learning algorithms. Results indicate that the proposed approach outperforms statistical alternatives available in the literature such as multiple linear regression, autoregressive integrated moving average (ARIMA), support vector regression, generalized linear models, generalized estimating equations, seasonal ARIMA and combined ARIMA and linear regression. Research limitations/implications The authors applied this study in a single ED to forecast patient visits. Applying the same method in different EDs may give a better understanding of the performance of the model. The same approach can be applied to other demand forecasting problems after some minor modifications. Originality/value To the best of the authors' knowledge, this is the first study to propose the use of long short-term memory for constructing a predictor of the number of patient visits in EDs.


2021 ◽  
Author(s):  
ANKIT GHOSH ◽  
ALOK KOLE

Smart grid is an essential concept in the transformation of the electricity sector into an intelligent digitalized energy network that can deliver optimal energy from the source to the consumers. Smart grids, being self-sufficient systems, are constructed through the integration of information, telecommunication, and advanced power technologies with the existing electricity systems. Artificial Intelligence (AI) is an important technology driver in smart grids. The application of AI techniques in smart grids is becoming more apparent because the traditional modelling, optimization and control techniques have their own limitations. Machine Learning (ML), being a subset of AI, enables intelligent decision-making and response to sudden changes in customer energy demands, unexpected disruption of power supply, sudden variations in renewable energy output or any other catastrophic events in a smart grid. This paper presents a comparison among some of the state-of-the-art ML algorithms for predicting smart grid stability. The dataset that has been selected contains results from simulations of smart grid stability. Enhanced ML algorithms such as Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbour (KNN), Naïve Bayes (NB), Decision Tree (DT), Random Forest (RF), Stochastic Gradient Descent (SGD) classifier, XGBoost and Gradient Boosting classifiers have been implemented to forecast smart grid stability. A comparative analysis among the different ML models has been performed based on the following evaluation metrics: accuracy, precision, recall, F1-score, AUC-ROC, and AUC-PR. The test results have been quite promising, with the XGBoost classifier outperforming all the other models with an accuracy of 97.5%, recall of 98.4%, precision of 97.6%, F1-score of 97.9%, AUC-ROC of 99.8% and AUC-PR of 99.9%.
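
Four of the listed metrics can be derived directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the stable/unstable labels are made up for illustration and do not come from the paper's dataset.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = stable grid, 0 = unstable grid.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

AUC-ROC and AUC-PR are not covered here because they require predicted scores rather than hard labels.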


2019 ◽  
Vol 8 (2) ◽  
pp. 3697-3705 ◽  

Forest fires have become one of the most frequently occurring disasters in recent years. The effects of forest fires have a lasting impact on the environment, as they lead to deforestation and global warming, which is in turn one of the major causes of their occurrence. Forest fires are currently handled by collecting satellite images of forests, and if the fires cause an emergency, the authorities are notified to mitigate its effects. By the time the authorities get to know about it, the fires will already have caused a lot of damage. Data mining and machine learning techniques can provide an efficient prevention approach where data associated with forests can be used for predicting the eventuality of forest fires. This paper uses the dataset available in the UCI Machine Learning Repository, which consists of physical factors and climatic conditions of the Montesinho park in Portugal. Various algorithms like Logistic Regression, Support Vector Machine, Random Forest and K-Nearest Neighbors, in addition to Bagging and Boosting predictors, are used, both with and without Principal Component Analysis (PCA). Among the models to which PCA was applied, Logistic Regression gave the highest F1 score of 68.26, and among the models without PCA, Gradient Boosting gave the highest score of 68.36.

