Gradient Boosting Machine and Deep Learning Approach in Big Data Analysis

2022, Vol 15 (1), pp. 1-20
Author(s): Ravinder Kumar, Lokesh Kumar Shrivastav

Designing a system for analyzing high-frequency data (big data) is a challenging and crucial task in data science. Big data analytics requires both efficient machine learning algorithms and big data processing techniques or frameworks, and systems that process high-frequency data efficiently are in high demand. This paper proposes processing and analyzing stochastic high-frequency stock market data using a modified Gradient Boosting Machine (GBM). The experimental results are compared with those of deep learning and Auto-Regressive Integrated Moving Average (ARIMA) methods. The modified GBM achieves the highest accuracy (R² = 0.98) and the lowest error (RMSE = 0.85) of the three approaches.
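The two reported metrics can be illustrated with a minimal sketch. The function names and the sample prices below are assumptions made for illustration only; they are not the paper's code or data.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical prices and forecasts, invented for this example.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]

print("RMSE:", rmse(actual, predicted))
print("R2:", r_squared(actual, predicted))
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is the sense in which the abstract ranks the three approaches.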

2020, Vol 13 (12), pp. 309
Author(s): Julien Chevallier

The original contribution of this paper is to empirically document the contagion of Covid-19 on financial markets. We merge databases from the Johns Hopkins Coronavirus Center, the Oxford-Man Institute Realized Library, the NYU Volatility Lab, and the St-Louis Federal Reserve Board. We deploy three types of models throughout our experiments: (i) the Susceptible-Infective-Removed (SIR) model, which predicts the infections’ peak on 2020-03-27; (ii) volatility (GARCH), correlation (DCC), and risk-management (Value-at-Risk (VaR)) models that relate how bears painted Wall Street red; and (iii) data-science tree algorithms with forward pruning, mosaic plots, and Pythagorean forests that crunch the data on confirmed, deaths, and recovered Covid-19 cases and then tie them to high-frequency data for 31 stock markets.
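As a rough illustration of how an SIR model yields an infection peak, the sketch below integrates the standard SIR equations with a forward Euler step. The transmission rate, recovery rate, and population figures are hypothetical values chosen for the example; they are not the paper's calibration and do not reproduce its 2020-03-27 estimate.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler; return daily (S, I, R)."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt  # S -> I flow
            new_removals = gamma * i * dt           # I -> R flow
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history

# Hypothetical parameters (basic reproduction number beta / gamma = 3).
history = simulate_sir(beta=0.3, gamma=0.1, s0=9990, i0=10, r0=0, days=160)

# The peak is the day on which the infective compartment is largest.
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("peak day:", peak_day, "infective at peak:", round(history[peak_day][1]))
```

Fitting beta and gamma to observed case counts, rather than fixing them as here, is what would turn the peak day into a calendar-date prediction.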


2020, Vol 12 (5), pp. 393-406
Author(s): Jindong Zhao, Shouke Wei, Xuebin Wen, Xiuqin Qiu

Large-scale real-time water quality monitoring systems usually produce vast amounts of high-frequency data, and it is difficult for traditional water quality monitoring systems to process such large volumes of high-frequency data generated by wireless sensor networks. A real-time processing and early warning system framework is proposed to solve this problem. Apache Storm is used as the big data processing platform, and a Kafka message queue is applied to classify the sample data into several data streams so as to preserve the time series property of each sensor's data. On the Storm platform, the Daubechies wavelet is used to decompose the data series to obtain its trend, and a Long Short-Term Memory (LSTM) network is then used to model and predict that trend. This paper provides a detailed description of the distribution mechanism of aggregated data in Storm, the data storage format in HBase, the process of wavelet decomposition, model training, and the application of the model for prediction. The application results for the Xin’an River in Yantai City reveal that the proposed system framework models big data well, with high prediction accuracy and robust processing capability.
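The wavelet step can be sketched as a single decomposition level that splits a series into a smooth trend (approximation) and fluctuations (detail). The abstract does not state the wavelet order or boundary handling, so the four-tap Daubechies filter (db2), the periodic extension, and the sensor readings below are all assumptions made for illustration.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Four-tap Daubechies low-pass filter (db2). The order is an assumption.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# High-pass filter from the quadrature-mirror relation g[j] = (-1)^j * h[3 - j].
HIGH = [(-1) ** j * LOW[3 - j] for j in range(4)]

def dwt_level(signal):
    """One decomposition level with periodic extension; needs an even length."""
    n = len(signal)
    if n < 4 or n % 2:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for k in range(n // 2):
        window = [signal[(2 * k + j) % n] for j in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))   # trend
        detail.append(sum(g * x for g, x in zip(HIGH, window)))  # fluctuation
    return approx, detail

# Hypothetical sensor readings, invented for this example.
readings = [7.1, 7.3, 7.2, 7.6, 7.5, 7.9, 7.8, 8.0]
trend, fluctuation = dwt_level(readings)
print("trend coefficients:", [round(a, 3) for a in trend])
```

In a pipeline like the one described, the trend coefficients (possibly after further decomposition levels) would be the series passed to the forecasting model.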


2021, Vol ahead-of-print (ahead-of-print)
Author(s): Vicente Ramos, Woraphon Yamaka, Bartomeu Alorda, Songsak Sriboonchitta

Purpose: This paper aims to illustrate the potential of high-frequency data for tourism and hospitality analysis through two research objectives. First, this study describes and tests a novel high-frequency forecasting methodology applied to big data characterized by fine-grained time and spatial resolution. Second, this paper elaborates on the usefulness of those estimates for visitors and for public and private tourism stakeholders, whose decisions increasingly focus on short time horizons. Design/methodology/approach: This study uses the technical communications between mobile devices and WiFi networks to build a high-frequency, precisely geolocated big data set. The empirical section compares the forecasting accuracy of several artificial intelligence and time series models. Findings: The results robustly indicate the superiority of the long short-term memory network model, both for in-sample and out-of-sample forecasting. Hence, the proposed methodology provides estimates that are remarkably better than basing short-term decisions on the current number of residents and visitors (Naïve I model). Practical implications: A discussion section exemplifies how high-frequency forecasts can be incorporated into tourism information and management tools to improve visitors’ experience and tourism stakeholders’ decision-making. In particular, the paper details its applicability to managing overtourism and Covid-19 mitigation measures. Originality/value: High-frequency forecasting is new in tourism studies, and the discussion sheds light on the relevance of this time horizon for dealing with some current tourism challenges. For many tourism-related issues, what to do next is no longer what to do tomorrow or next week. Plain Language Summary: This research initiates high-frequency forecasting in tourism and hospitality studies. Additionally, we detail several examples of how anticipating urban crowdedness requires high-frequency data and can improve visitors’ experience and public and private decision-making.
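The Naïve I benchmark mentioned in the findings simply carries the current observation forward as the next forecast. The sketch below shows that baseline and its RMSE; the device counts are hypothetical values invented for the example, and the choice of RMSE as the accuracy measure is an assumption, since the abstract does not name the metric used.

```python
import math

def naive_forecast(series):
    """Naive I: the forecast for each period is the previous observation."""
    return series[:-1]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Hypothetical counts of connected devices per time interval.
counts = [120, 135, 150, 140, 160, 155]

predicted = naive_forecast(counts)  # forecasts for intervals 2..6
actual = counts[1:]                 # what was observed in those intervals
baseline_rmse = rmse(actual, predicted)
print("Naive I RMSE:", baseline_rmse)
```

Any candidate model would be scored on the same intervals, and it is useful for short-horizon decisions only if its error falls below this baseline.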

