Using Machine Learning Methods to Solve Problems of Forecasting the Amount and Probability of Purchase Based on E-Commerce Data

2020 ◽  
Vol 10 (4) ◽  
pp. 31-40
Author(s):  
O.A. Mamiev ◽  
N.A. Finogenov ◽  
G.B. Sologub

The study investigates whether machine learning methods can be used to build models that predict the probability and the amount of a purchase by online store customers. The sample consists of user transaction data from the site ponpare.jp for the period from 01.07.2011 to 23.06.2012. The most common methods for solving similar problems are described and compared, along with the metrics used to evaluate forecasts of the fact and the amount of a purchase. The results show that, for predicting the probability of a purchase, gradient boosting, specifically its LGBMClassifier implementation, gives the most accurate estimates. For predicting the amount of a customer’s purchase, gradient boosting also gave the best results.

2020 ◽  
Vol 10 (4) ◽  
pp. 41-50
Author(s):  
A.A. Osin ◽  
A.K. Fomin ◽  
G.B. Sologub ◽  
V.I. Vinogradov

The work investigates whether machine learning methods can be used to build models for forecasting demand for new products in the online store Ozon.ru. Approaches that have not previously been applied to this specific task are proposed for consideration. Data on the sales history and storage of goods at Ozon.ru are used as a sample. The paper describes and analyzes the approximate loss of the Ozon.ru website, the data used, the process of building a base model, and the results obtained. It also describes the metrics used to evaluate the prediction results and compares the predictions of the built model with heuristically selected values.


2020 ◽  
Author(s):  
Olena Piskunova ◽  
Rostyslav Klochko ◽  

Due to the rapid development of e-commerce and increased competition in the retail market of Ukraine, companies are forced to look for new ways to grow their business. One option is to optimize business processes, in particular to increase the efficiency of marketing activities. Predicting consumer behavior is one of the most effective methods of optimizing marketing budgets, because it allows processes to be built around the individual characteristics of each client. The aim of the study was to predict the behavior of online store customers, namely the time before the next order, using machine learning methods, and to compare the effectiveness of different modeling algorithms. Five classification algorithms were implemented: linear discriminant analysis, classification and regression trees, random forest, support vector machine, and k-nearest neighbors, and a comparative analysis of their efficiency was performed. Given the peculiarities of customer behavior, the time to the next order is forecast as one of the following intervals: up to two months, two to six months, six to fifteen months, and no order. Predicting such intervals allows us to identify customers who are more likely to make the next purchase and focus advertising budgets on them, or to build a customer experience management strategy: reactivate customers who have left and offer discounts to customers who are about to leave. The specifics of assessing the quality of classification models on the basis of the confusion matrix, using the indicators “Accuracy”, “F1”, “Recall” and “Precision”, are considered. The study led us to prefer the random forest classification model. Tenfold cross-validation was used to improve the quality of the modeling. The weighted “F1” score in the groups “Up to two months” and “two-six months” reached 62.5% and 64.1%, respectively.
The developed model should reduce the influence of the human factor on the decision-making process in the construction of marketing strategies.
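The indicators named in the abstract above can all be read off a multi-class confusion matrix. The following is a minimal sketch of that calculation, not the authors' code: the function name `per_class_metrics`, the support-weighted averaging of F1, and the counts in `example` are illustrative assumptions (the abstract does not state how its weighting was done, and the numbers are made up).

```python
def per_class_metrics(confusion, labels):
    """Compute accuracy, support-weighted F1, and per-class metrics.

    confusion[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(labels)))
    report = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        support = sum(confusion[k])                   # samples truly in class k
        predicted = sum(row[k] for row in confusion)  # samples predicted as class k
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    accuracy = correct / total
    # Assumption: "weighted" means averaging per-class F1 by class support.
    weighted_f1 = sum(m["f1"] * m["support"] for m in report.values()) / total
    return accuracy, weighted_f1, report


# The four interval classes from the abstract; the counts are invented.
labels = ["up to two months", "two to six months",
          "six to fifteen months", "without order"]
example = [
    [50, 10, 0, 0],
    [10, 30, 10, 0],
    [0, 5, 20, 5],
    [0, 0, 10, 50],
]
accuracy, weighted_f1, report = per_class_metrics(example, labels)
print(f"accuracy = {accuracy:.3f}, weighted F1 = {weighted_f1:.3f}")
for label, m in report.items():
    print(f"{label}: P={m['precision']:.3f} R={m['recall']:.3f} F1={m['f1']:.3f}")
```

Reporting the metrics per class, as the abstract does for two of its groups, shows which intervals the classifier separates well, which a single accuracy figure would hide.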


Author(s):  
Mehmet Şahin ◽  
Murat Uçar

In this study, a comparative analysis for predicting sports attendance demand is presented based on econometric, artificial intelligence, and machine learning methodologies. Data from more than 20,000 games from three major leagues, namely the National Basketball Association (NBA), National Football League (NFL), and Major League Baseball (MLB), were used for training and testing the approaches. The relevant literature was examined to determine the most useful variables as potential regressors in forecasting. To reveal the most effective approach, three scenarios containing seven cases were constructed. In the first scenario, each league was evaluated separately. In the second scenario, the three possible combinations of league pairings were evaluated, while in the third scenario, all three leagues were evaluated together. The performance evaluations of the results suggest that one of the machine learning methods, Gradient Boosting, outperformed the other methods used. However, the Artificial Neural Network, deep Convolutional Neural Network, and Decision Trees also provided productive and competitive predictions for sports games. Based on the results, the predictions for the NBA and NFL are more satisfactory than those for the MLB, which may be caused by the structure of the MLB. The results of the sensitivity analysis indicate that the performance of the home team is the most influential factor for all three leagues.
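The count of seven cases follows from the three scenarios described above: three single leagues, three pairings, and one combined set. A minimal sketch of that enumeration follows; the function name `build_scenarios` and the dictionary keys are hypothetical labels, not taken from the paper.

```python
from itertools import combinations

LEAGUES = ("NBA", "NFL", "MLB")


def build_scenarios(leagues):
    """Group the evaluation cases by scenario, as described in the abstract."""
    return {
        # Scenario 1: each league evaluated separately.
        "scenario_1_single": [(league,) for league in leagues],
        # Scenario 2: every possible pairing of two leagues.
        "scenario_2_pairs": list(combinations(leagues, 2)),
        # Scenario 3: all leagues evaluated together.
        "scenario_3_all": [tuple(leagues)],
    }


scenarios = build_scenarios(LEAGUES)
cases = [case for group in scenarios.values() for case in group]
for name, group in scenarios.items():
    print(name, group)
print("total cases:", len(cases))
```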


2021 ◽  
Author(s):  
Polash Banerjee

Wildfires of limited extent and intensity can be a boon for the forest ecosystem. However, the 2019 wildfires in Australia and Brazil are sad reminders of their heavy ecological and economic costs. Understanding the role of environmental factors in the likelihood of wildfires in a spatial context would be instrumental in mitigating them. In this study, 14 environmental features encompassing meteorological, topographical, ecological, in situ and anthropogenic factors have been considered for preparing the wildfire likelihood map of Sikkim Himalaya. A comparative study of the efficiency of machine learning methods, namely Generalized Linear Model (GLM), Support Vector Machine (SVM), Random Forest (RF) and Gradient Boosting Model (GBM), has been performed to identify the best performing algorithm in wildfire prediction. The study indicates that all the machine learning methods are good at predicting wildfires. However, RF performed best, followed by GBM. Also, environmental features like average temperature, average wind speed, proximity to roadways and tree cover percentage are the most important determinants of wildfires in Sikkim Himalaya. This study can be considered as a decision support tool for preparedness, efficient resource allocation and sensitization of people towards mitigation of wildfires in Sikkim.


2021 ◽  
Vol 3 ◽  
pp. 47-57
Author(s):  
I. N. Myagkova ◽  
V. R. Shirokii ◽  
Yu. S. Shugai ◽  
O. G. Barinov ◽  
...  

Ways to improve the quality of prediction of the time series of hourly mean fluxes and daily total fluxes (fluences) of relativistic electrons in the outer radiation belt of the Earth, 1 to 24 hours ahead and 1 to 4 days ahead, respectively, are studied. The prediction uses an approximation approach based on various machine learning methods, namely, artificial neural networks (ANNs), decision tree (random forest), and gradient boosting. A comparison of the skill scores of short-range forecasts with a lead time of 1 to 24 hours showed that the best results were demonstrated by ANNs. For medium-range forecasting, the accuracy of prediction of the fluences of relativistic electrons in the Earth’s outer radiation belt three to four days ahead increases significantly when the predicted values of the solar wind velocity near the Earth, obtained from the UV images of the Sun taken by the AIA (Atmospheric Imaging Assembly) instrument of the SDO (Solar Dynamics Observatory), are included in the list of input parameters.


2021 ◽  
Vol 184 ◽  
pp. 107639
Author(s):  
Yude Bai ◽  
Zhenchang Xing ◽  
Duoyuan Ma ◽  
Xiaohong Li ◽  
Zhiyong Feng
