Civil airline fare prediction with a multi-attribute dual-stage attention mechanism

Author(s):  
Zhichao Zhao ◽  
Jinguo You ◽  
Guoyu Gan ◽  
Xiaowu Li ◽  
Jiaman Ding

Abstract
Airfare price prediction is a core component of decision support systems in civil aviation, drawing on factors such as departure time, days of advance purchase, and airline. Traditional airfare prediction systems are limited by the nonlinear interrelationships among multiple factors and fail to account for the influence of different time steps, resulting in low prediction accuracy. To address these challenges, this paper proposes a novel civil airline fare prediction system with a Multi-Attribute Dual-stage Attention (MADA) mechanism that integrates different types of data extracted from the same dimension. In this method, a Seq2Seq model adds attention mechanisms to both the encoder and the decoder. The encoder attention mechanism extracts multi-attribute data from the time series, which the temporal attention mechanism in the decoder then optimizes and filters to capture the complex time dependence of the ticket price sequence. Extensive experiments on real civil aviation data sets show that MADA outperforms airfare prediction models based on the Auto-Regressive Integrated Moving Average (ARIMA), random forests, or deep learning models on the MSE, RMSE, and MAE indicators. Across a large body of experimental data, the predictions of the proposed MADA model on different routes are at least 2.3% better than those of the compared models.
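As a rough illustration of the two attention stages described above, the numpy sketch below computes attribute-level (encoder-side) weights and temporal (decoder-side) weights. The shapes, scoring functions, and names are illustrative assumptions, not the MADA implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attribute_attention(X, w):
    # first stage: one weight per input attribute, applied across the series
    # X: (T, n_attrs) multi-attribute fare series; w: (n_attrs,) score weights (hypothetical)
    alpha = softmax(X.mean(axis=0) * w)
    return X * alpha

def temporal_attention(H, v):
    # second stage: one weight per time step over the encoder hidden states
    # H: (T, hidden) encoder states; v: (hidden,) scoring vector (hypothetical)
    beta = softmax(H @ v)
    return beta @ H          # context vector summarizing the sequence

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))              # 8 time steps, 4 fare attributes
Xw = attribute_attention(X, rng.normal(size=4))
H = rng.normal(size=(8, 16))             # hypothetical encoder states
context = temporal_attention(H, rng.normal(size=16))
```

In the paper's architecture the attention scores are learned jointly with the Seq2Seq weights; here they are fixed random vectors purely to show the data flow.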

2010 ◽  
Vol 09 (04) ◽  
pp. 547-573 ◽  
Author(s):  
JOSÉ BORGES ◽  
MARK LEVENE

The problem of predicting the next request during a user's navigation session has been extensively studied. In this context, higher-order Markov models have been widely used to model navigation sessions and to predict the next navigation step, while prediction accuracy has mainly been evaluated with the hit and miss score. We claim that this score, although useful, is not sufficient for evaluating next-link prediction models when the aim is to find a sufficient order of the model, to set the size of a recommendation set, or to assess the impact of unexpected events on prediction accuracy. Herein, we make use of a variable length Markov model to compare the usefulness of three alternatives to the hit and miss score: the Mean Absolute Error, the Ignorance Score, and the Brier score. We present an extensive evaluation of the methods on real data sets and a comprehensive comparison of the scoring methods.
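The Brier and Ignorance scores mentioned above are easy to state concretely. The following minimal Python sketch, using a hypothetical three-link prediction, scores a probability distribution over candidate next links against the link the user actually followed:

```python
import math

def brier_score(probs, outcome):
    # probs: dict mapping candidate link -> predicted probability
    # outcome: the link the user actually followed
    return sum((p - (1.0 if link == outcome else 0.0)) ** 2
               for link, p in probs.items())

def ignorance_score(probs, outcome):
    # negative log2 of the probability assigned to the observed link
    return -math.log2(probs[outcome])

probs = {"a": 0.6, "b": 0.3, "c": 0.1}   # hypothetical next-link distribution
b = brier_score(probs, "a")              # lower is better
ig = ignorance_score(probs, "a")         # in bits; lower is better
```

Unlike the hit and miss score, both measures reward well-calibrated probabilities rather than only the top-ranked guess.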


Author(s):  
Laura Rontu ◽  
Emily Gleeson ◽  
Daniel Martin Perez ◽  
Kristian Pagh Nielsen ◽  
Velle Toll

The direct radiative effect of aerosols is taken into account in many limited area numerical weather prediction models using wavelength-dependent aerosol optical depths of a range of aerosol species. We study the impact of aerosol distribution and optical properties on radiative transfer, based on climatological and more realistic near real-time aerosol data. Sensitivity tests were carried out using the single column version of the ALADIN-HIRLAM numerical weather prediction system, set up to use the HLRADIA broadband radiation scheme. The tests were restricted to clear-sky cases to avoid the complication of cloud-radiation-aerosol interactions. The largest differences in radiative fluxes and heating rates were found to be due to different aerosol loads. When the loads are large, the radiative fluxes and heating rates are sensitive to the aerosol inherent optical properties and vertical distribution of the aerosol species. Impacts of aerosols on shortwave radiation dominate longwave impacts. Sensitivity experiments indicated the important effects of highly absorbing black carbon aerosols and strongly scattering desert dust.


2015 ◽  
Vol 61 (2) ◽  
pp. 379-388 ◽  
Author(s):  
Andrej-Nikolai Spiess ◽  
Claudia Deutschmann ◽  
Michał Burdukiewicz ◽  
Ralf Himmelreich ◽  
Katharina Klat ◽  
...  

Abstract
BACKGROUND: Quantification cycle (Cq) and amplification efficiency (AE) are parameters mathematically extracted from raw data to characterize quantitative PCR (qPCR) reactions and quantify the copy number in a sample. Little attention has been paid to the effects of preprocessing and the use of smoothing or filtering approaches to compensate for noisy data. Existing algorithms are largely taken for granted, and it is unclear which of the various methods is most informative. We investigated the effect of smoothing and filtering algorithms on amplification curve data.
METHODS: We obtained published high-replicate qPCR data sets from standard block thermocyclers and other cycler platforms and statistically evaluated the impact of smoothing on Cq and AE.
RESULTS: Our results indicate that selected smoothing algorithms affect estimates of Cq and AE considerably. The commonly used moving average filter performed worst in all qPCR scenarios. The Savitzky–Golay smoother, cubic splines, and the Whittaker smoother resulted overall in the least bias in our setting and exhibited low sensitivity to differences in qPCR AE, whereas other smoothers, such as the running mean, introduced an AE-dependent bias.
CONCLUSIONS: The selection of a smoothing algorithm is an important step in developing data analysis pipelines for real-time PCR experiments. We offer guidelines for selecting an appropriate smoothing algorithm in diagnostic qPCR applications. The findings of our study were implemented in the R packages chipPCR and qpcR as a basis for the implementation of an analytical strategy.
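To illustrate the kind of comparison involved, the following numpy-only sketch contrasts a moving average with a quadratic Savitzky–Golay-style fit on a synthetic sigmoid-shaped amplification curve. The window sizes and noise level are arbitrary assumptions, not the paper's settings:

```python
import numpy as np

def moving_average(y, w=5):
    return np.convolve(y, np.ones(w) / w, mode="same")

def savgol_quadratic(y, w=5):
    # fit a quadratic in each centered window and evaluate at the midpoint;
    # edge points are left untouched for simplicity
    half = w // 2
    out = np.asarray(y, dtype=float).copy()
    x = np.arange(-half, half + 1)
    for i in range(half, len(y) - half):
        coeffs = np.polyfit(x, y[i - half:i + half + 1], 2)
        out[i] = np.polyval(coeffs, 0)
    return out

rng = np.random.default_rng(1)
cycles = np.arange(40)
# synthetic sigmoid-shaped amplification curve plus measurement noise
curve = 1 / (1 + np.exp(-(cycles - 20) / 2)) + rng.normal(0, 0.02, 40)
smoothed_sg = savgol_quadratic(curve)
smoothed_ma = moving_average(curve)
```

The key contrast the paper reports: the local polynomial fit preserves the curve's exponential-phase slope (and hence AE estimates), while a plain moving average flattens it.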


Author(s):  
SUMANTH YENDURI ◽  
S. S. IYENGAR

In this study, we compare the performance of four different imputation strategies, ranging from the commonly used Listwise Deletion to model-based approaches such as Maximum Likelihood, in enhancing the completeness of incomplete software project data sets. We evaluate the impact of each of these methods by applying them to six real-time software project data sets, which are classified into different categories based on their inherent properties. The reliability of the data sets constructed using these techniques is further tested by building prediction models with stepwise regression. The experimental results are reported and the findings discussed.
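Two of the simpler strategies in this range can be sketched directly. The Python illustration below, assuming a small numeric matrix with NaN marking missing entries, contrasts listwise deletion with mean imputation (the study's actual strategies and data sets differ):

```python
import numpy as np

def listwise_deletion(X):
    # drop every row that contains at least one missing value
    return X[~np.isnan(X).any(axis=1)]

def mean_imputation(X):
    # replace each missing value with its column mean
    out = X.copy()
    means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(out))
    out[rows, cols] = means[cols]
    return out

X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan]])
deleted = listwise_deletion(X)    # only the complete first row survives
imputed = mean_imputation(X)      # gaps filled with column means
```

The trade-off visible even here: deletion discards two-thirds of the rows, while imputation keeps every observation at the cost of reduced variance.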


Vestnik MEI ◽  
2020 ◽  
Vol 6 (6) ◽  
pp. 119-128
Author(s):  
Anna V. Shikhina ◽  
Tatyana V. Yagodkina
The article considers problems of predicting the free-market price of electricity by constructing different prediction models. A shift is made from conventional regression and moving-average autoregression models to the proposed combined multifactor models, which also include a time trend and dummy variables. This shift is partly justified by the specific behavior of the electricity price in the free market, which exhibits strictly cyclic changes driven by attributes such as the heating season, day of the week, etc. The techniques for constructing combined prediction models have been developed to the level of effective computational procedures based on the Statistica and OsiSoft PI-System software packages. The application of the autoregressive and combined regression prediction models to the Russian market has demonstrated fairly good effectiveness with an acceptable level of accuracy. A comparison of the accuracy achieved by the competing models has not shown that the shift to combined regression multifactor models yields better prediction accuracy; however, their suitability for analyzing the influence of different factors on the predicted variable may become a decisive advantage when selecting the type of prediction model. Although limited to an analysis of the Belgorod region market, the obtained results demonstrate prediction accuracy at least as good as, and in most cases better than, the majority of results reported in the review for European electricity markets.
The article substantiates the advisability of studying combined regression models as a tool for analyzing the influence of individual factors on electricity price formation over the predicted period, given that the accuracy of the combined regression models corresponds to the currently achieved levels of electricity price prediction accuracy.
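The idea of a combined multifactor model with a time trend and dummy variables can be sketched with ordinary least squares. The synthetic hourly price series, dummy definitions, and coefficient values below are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 24 * 14                                          # two weeks of hourly prices
t = np.arange(n)
hour = t % 24
peak = ((hour >= 8) & (hour <= 20)).astype(float)    # daytime-hours dummy
weekend = ((t // 24) % 7 >= 5).astype(float)         # weekend dummy

# simulated price: intercept + linear trend + cyclic dummy effects + noise
price = 30 + 0.01 * t + 8 * peak - 5 * weekend + rng.normal(0, 1, n)

# combined multifactor model: intercept, time trend, and dummy variables
X = np.column_stack([np.ones(n), t, peak, weekend])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)
```

The fitted coefficients recover the simulated dummy effects, which is exactly the kind of factor-influence analysis the article argues is the main advantage of the combined models.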


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-21
Author(s):  
Sajid Ali ◽  
Naila Altaf ◽  
Ismail Shah ◽  
Lichen Wang ◽  
Syed Muhammad Muslim Raza

Control charts are a popular statistical process control (SPC) technique for monitoring processes and detecting unusual variations. In contrast to the classical charts, control charts have also been modified to include covariates using regression approaches. This study assesses the performance of risk-adjusted control charts under estimation error, considering logistic and negative binomial regression models. More precisely, risk-adjusted Cumulative Sum (CUSUM) and Exponentially Weighted Moving Average (EWMA) charts are used to evaluate the impact of estimation error. To compute the average run length (ARL), Markov chain Monte Carlo simulations are conducted. Furthermore, a bootstrap method is also used to compute the ARL under different Phase-I data sets to minimize the effect of estimation error on risk-adjusted control charts. The results for cardiac surgery and respiratory disease data sets show that the modified control charts improve performance in detecting small shifts.
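A plain (non-risk-adjusted) EWMA chart conveys the basic mechanics. The sketch below estimates control limits from a hypothetical Phase-I sample and flags points outside them; lambda, L, and the simulated shift are illustrative choices, not the study's settings:

```python
import numpy as np

def ewma_chart(x, mu, sigma, lam=0.2, L=3.0):
    # EWMA statistic z_t = lam * x_t + (1 - lam) * z_{t-1}, started at mu,
    # with time-varying control limits mu +/- L * sigma_z(t)
    z = np.empty(len(x))
    prev = mu
    for t, xt in enumerate(x):
        z[t] = lam * xt + (1 - lam) * prev
        prev = z[t]
    t_idx = np.arange(1, len(x) + 1)
    width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t_idx)))
    return z, np.abs(z - mu) > width

rng = np.random.default_rng(3)
phase1 = rng.normal(0, 1, 100)                # in-control data used to set limits
x = np.concatenate([rng.normal(0, 1, 30),     # in-control Phase II ...
                    rng.normal(2, 1, 30)])    # ... then a 2-sigma mean shift
z, signal = ewma_chart(x, phase1.mean(), phase1.std(ddof=1))
```

The study's point about estimation error shows up here directly: the limits depend on the Phase-I mean and standard deviation, so a small or unrepresentative Phase-I sample shifts the signalling behavior.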


2020 ◽  
Vol 12 (18) ◽  
pp. 7520
Author(s):  
Hyunsoo Kim ◽  
Youngwoo Kwon ◽  
Yeol Choi

Providing adequate public rental housing (PRH) of a decent quality at a desirable location is a major challenge in many cities. Often, a prominent opponent of PRH development is its host community, driven by a belief that PRH depreciates nearby property values. While this is a persistent issue in many cities around the world, this study proposed a new approach to assessing the impact of PRH on nearby property values. This study utilized a machine learning technique called long short-term memory (LSTM) to construct a set of housing price prediction models based on 547,740 apartment transaction records from the city of Busan, South Korea. A set of apartment characteristics and proximity measures to PRH were included in the modeling process. Four geographic boundaries were analyzed: the entire region of Busan, all neighborhoods of PRH, and the neighborhoods of PRH in the "favorable" and the "less favorable" local housing markets. The study produced accurate and reliable price predictions, which indicated that proximity to PRH has a meaningful impact on nearby housing prices at both the city and the neighborhood level. The approach taken by the study can facilitate improved decision making for future PRH policies and programs.
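At its core, the LSTM underlying such price models applies one gated update per time step. The numpy sketch below shows a single LSTM cell with random weights and a linear read-out; the feature count and dimensions are invented for illustration and are unrelated to the Busan data:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    # one LSTM update; W: (4H, D), U: (4H, H), b: (4H,)
    H = h.size
    z = W @ x + U @ h + b
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c_new = f * c + i * g                # updated cell state
    h_new = o * np.tanh(c_new)           # updated hidden state
    return h_new, c_new

rng = np.random.default_rng(4)
D, H = 6, 8     # e.g. 6 features: size, age, floor, distance to PRH, ...
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h = c = np.zeros(H)
for x in rng.normal(size=(12, D)):       # 12 sequential observations
    h, c = lstm_step(x, h, c, W, U, b)
price_hat = h @ rng.normal(size=H)       # linear read-out to a price estimate
```

In a real model the weights are trained on the transaction records; the gating is what lets the network carry long-run price trends across many time steps.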


Author(s):  
Vijay Kumar Dwivedi ◽  
Manoj Madhava Gore

Background: Stock price prediction is a challenging task. Social, economic, political, and various other factors cause frequent abrupt changes in stock prices. This article proposes a historical-data-based ensemble system to predict the closing stock price with higher accuracy and consistency than existing stock price prediction systems.
Objective: The primary objective of this article is to predict the closing price of a stock for the next trading day more accurately and consistently than the existing methods employed for stock price prediction.
Method: The proposed system combines various machine learning-based prediction models using the least absolute shrinkage and selection operator (LASSO) regression regularization technique to enhance the accuracy of the stock price prediction system compared to any one of the base prediction models.
Results: The analysis of results for all eleven stocks (listed under the Information Technology sector on the Bombay Stock Exchange, India) reveals that the proposed system performs best, on all defined metrics, for the training and test datasets comprising all the stocks considered.
Conclusion: The proposed ensemble model consistently predicts stock prices with a higher degree of accuracy than the existing prediction methods.
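One common way to combine base models with LASSO is to learn sparse stacking weights over their predictions. The coordinate-descent sketch below, with synthetic base-model predictions and an arbitrary penalty, shows how an uninformative model is shrunk toward zero; the article's actual ensemble construction may differ:

```python
import numpy as np

def soft_threshold(rho, lam):
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam=0.05, n_iter=200):
    # coordinate-descent LASSO: cycle over coefficients, soft-thresholding each
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]          # partial residual
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return w

# synthetic base predictions: two informative models and one pure-noise model
rng = np.random.default_rng(5)
y = rng.normal(size=200)                             # target (e.g. next-day close)
P = np.column_stack([y + rng.normal(0, 0.1, 200),    # accurate base model
                     y + rng.normal(0, 0.2, 200),    # weaker base model
                     rng.normal(size=200)])          # uninformative base model
w = lasso_cd(P, y)
```

The L1 penalty drives the weight on the noise model to (near) zero while the informative models share most of the weight, which is the sense in which LASSO both combines and selects base predictors.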


2020 ◽  
Author(s):  
Eduardo Atem De Carvalho ◽  
Rogerio Atem De Carvalho

BACKGROUND: Since the beginning of the COVID-19 pandemic, researchers and health authorities have sought to identify the parameters that govern its infection and death cycles, in order to make better decisions. In particular, a series of reproduction number estimation models have been presented, with varying practical results.
OBJECTIVE: This article aims to present an effective and efficient model for estimating the reproduction number and to discuss the impact of sub-notification (under-reporting) on these calculations.
METHODS: The Moving Average Method with Initial value (MAMI) is used, and a model for Rt, the reproduction number, is derived from experimental data. The models are applied to real data and their performance is presented.
RESULTS: Analyses of Rt and sub-notification effects for Germany, Italy, Sweden, the United Kingdom, South Korea, and the State of New York are presented to show the performance of the methods introduced here.
CONCLUSIONS: We show that, with relatively simple mathematical tools, it is possible to obtain reliable values for time-dependent, incubation-period-independent reproduction numbers (Rt). We also demonstrate that the impact of sub-notification is relatively low once the initial phase of the epidemic cycle has passed.
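Without reproducing MAMI itself, the general idea of a moving-average-smoothed, ratio-based Rt estimate can be sketched as follows. The window length, serial interval, and initial-value handling here are assumptions for illustration, not the article's formulation:

```python
import numpy as np

def moving_average_init(x, w=7, init=None):
    # moving average whose first window is seeded with an initial value
    # (a MAMI-style idea; the article's exact scheme may differ)
    out = np.empty(len(x))
    out[:w] = x[:w].mean() if init is None else init
    for t in range(w, len(x)):
        out[t] = x[t - w + 1:t + 1].mean()
    return out

def rt_ratio(cases, serial_interval=4, w=7):
    # Rt proxy: smoothed new cases today over smoothed new cases one
    # serial interval earlier
    sm = moving_average_init(np.asarray(cases, dtype=float), w)
    return sm[serial_interval:] / np.maximum(sm[:-serial_interval], 1e-9)

cases = np.array([100 * 1.1 ** t for t in range(30)])   # 10% daily growth
rt = rt_ratio(cases)
```

For steady exponential growth of 10% per day with a 4-day serial interval, the estimate settles at 1.1^4 ≈ 1.46, as expected; smoothing is what keeps day-to-day reporting noise out of the ratio.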


Author(s):  
Richard McCleary ◽  
David McDowall ◽  
Bradley J. Bartos

The general AutoRegressive Integrated Moving Average (ARIMA) model can be written as the sum of noise and exogenous components. If an exogenous impact is trivially small, the noise component can be identified with the conventional modeling strategy. If the impact is nontrivial or unknown, the sample AutoCorrelation Function (ACF) will be distorted in unknown ways. Although this problem is most simply solved when the time series of interest is long and well behaved, such time series are unfortunately uncommon. The preferred alternative requires that the structure of the intervention is known, allowing the noise function to be identified from the residualized time series. Although few substantive theories specify the "true" structure of the intervention, most specify the dichotomous onset and duration of an impact. Chapter 5 describes this strategy for building an ARIMA intervention model and demonstrates its application to example interventions with abrupt and permanent, gradually accruing, gradually decaying, and complex impacts.
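The impact shapes named above (abrupt-permanent and gradually accruing) are easy to simulate on top of AR(1) noise, which is a useful way to see why an unmodeled intervention distorts the sample ACF. The parameter values in this Python sketch are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
n, t0 = 200, 100                                  # series length, onset time

# noise component: AR(1) with coefficient 0.5
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.5 * noise[t - 1] + rng.normal()

step = (np.arange(n) >= t0).astype(float)         # dichotomous indicator I_t

# abrupt and permanent impact: omega * I_t
abrupt_permanent = 5.0 * step

# gradually accruing impact: omega * I_t / (1 - delta * B), here omega=2, delta=0.7
gradual = np.zeros(n)
for t in range(1, n):
    gradual[t] = 0.7 * gradual[t - 1] + 2.0 * step[t]

y = noise + abrupt_permanent                      # observed series with intervention
```

The gradual impact accrues toward its asymptote omega / (1 - delta) = 2 / 0.3 ≈ 6.67; fitting the known intervention structure and modeling the residual is the strategy Chapter 5 develops.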

