A COMPARISON OF SCORING METRICS FOR PREDICTING THE NEXT NAVIGATION STEP WITH MARKOV MODEL-BASED SYSTEMS

The problem of predicting the next request during a user's navigation session has been extensively studied. In this context, higher-order Markov models have been widely used to model navigation sessions and to predict the next navigation step, while prediction accuracy has been mainly evaluated with the hit and miss score. We claim that this score, although useful, is not sufficient for evaluating next link prediction models with the aim of finding a sufficient order of the model, the size of a recommendation set, and assessing the impact of unexpected events on the prediction accuracy. Herein, we make use of a variable length Markov model to compare the usefulness of three alternatives to the hit and miss score: the Mean Absolute Error, the Ignorance Score, and the Brier score. We present an extensive evaluation of the methods on real data sets and a comprehensive comparison of the scoring methods.
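As a rough illustration of how three of the scores named above differ, the following sketch evaluates a single predicted next-step distribution. The page names, the `prediction` dictionary, and the function names are hypothetical. The formulas are the standard textbook definitions of the hit and miss score, the Ignorance Score, and the Brier score, which may not match the paper's exact formulation. The Mean Absolute Error is left out because the abstract does not say how the error is defined.

```python
import math

def hit_score(probs, actual, k=1):
    """Return 1 if the actual next page is among the k most probable pages, else 0."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return 1 if actual in ranked[:k] else 0

def ignorance_score(probs, actual, eps=1e-12):
    """Negative base-2 log of the probability assigned to the actual page.
    eps is an assumed floor that avoids log(0) for unseen pages."""
    return -math.log2(max(probs.get(actual, 0.0), eps))

def brier_score(probs, actual):
    """Sum of squared differences between the forecast and the 0/1 outcome vector."""
    outcomes = set(probs) | {actual}
    return sum(
        (probs.get(o, 0.0) - (1.0 if o == actual else 0.0)) ** 2
        for o in outcomes
    )

# Hypothetical forecast for the next request in a session.
prediction = {"/home": 0.5, "/products": 0.25, "/contact": 0.25}

for page in prediction:
    print(page, hit_score(prediction, page),
          ignorance_score(prediction, page), brier_score(prediction, page))
```

Unlike the hit and miss score, the Ignorance and Brier scores depend on how much probability the model gave the observed page, so two models with the same top-ranked prediction can still be told apart.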

Landscape pattern and economic factors’ effect on prediction accuracy of cellular automata-Markov chain model on county scale

Open Geosciences ◽

10.1515/geo-2020-0162 ◽

2020 ◽

Vol 12 (1) ◽

pp. 626-636

Author(s):

Wang Song ◽

Zhao Yunlin ◽

Xu Zhenggang ◽

Yang Guiyan ◽

Huang Tian ◽

...

Keyword(s):

Land Use ◽

Markov Chain ◽

Cellular Automata ◽

Markov Model ◽

Prediction Accuracy ◽

Driving Mechanism ◽

Economic Factors ◽

Chain Model ◽

The Impact ◽

County Scale

Understanding and modeling land use change is of great significance to environmental protection and land use planning. The cellular automata-Markov chain (CA-Markov) model is a powerful tool for predicting land use change, but its prediction accuracy is limited by many factors. To explore the impact of land use and socio-economic factors on CA-Markov predictions at the county scale, this paper uses the CA-Markov model to simulate the land use of Anren County in 2016, based on the land use of 1996 and 2006. The correlation between land use, socio-economic data, and prediction accuracy was then analyzed. The results show that Shannon’s evenness index and population density have an important impact on the accuracy of model predictions and correlate negatively with the kappa coefficient. The research not only provides a reference for correct use of the model but also helps us to understand the driving mechanism of landscape changes.
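The two quantities named in this abstract can be sketched on a toy example. The code below uses the standard definitions of Cohen's kappa and Shannon's evenness index, applied to flat lists of land-use labels. The class names and the four-cell maps are invented for illustration, and the paper's own computation (for example, its choice of kappa variant) may differ.

```python
import math
from collections import Counter

def cohen_kappa(observed, predicted):
    """Agreement between two label sequences, corrected for chance agreement.
    Undefined (division by zero) when chance agreement equals 1."""
    n = len(observed)
    agreement = sum(o == p for o, p in zip(observed, predicted)) / n
    obs_counts, pred_counts = Counter(observed), Counter(predicted)
    expected = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n ** 2
    return (agreement - expected) / (1 - expected)

def shannon_evenness(cells):
    """Shannon diversity divided by its maximum, ln(number of classes)."""
    counts = Counter(cells)
    if len(counts) < 2:
        return 0.0  # assumed convention: a single-class landscape has no evenness
    n = len(cells)
    diversity = -sum((c / n) * math.log(c / n) for c in counts.values())
    return diversity / math.log(len(counts))

# Hypothetical four-cell maps: observed land use versus a simulated map.
observed_map = ["forest", "forest", "farm", "farm"]
simulated_map = ["forest", "farm", "farm", "farm"]

print(cohen_kappa(observed_map, simulated_map))
print(shannon_evenness(observed_map))
```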

Hidden Semi-Markov Models for Predictive Maintenance

Mathematical Problems in Engineering ◽

10.1155/2015/278120 ◽

2015 ◽

Vol 2015 ◽

pp. 1-23 ◽

Cited By ~ 17

Author(s):

Francesco Cartella ◽

Jan Lemeire ◽

Luca Dimiccoli ◽

Hichem Sahli

Keyword(s):

Markov Models ◽

Real Data ◽

Information Criterion ◽

Absolute Error ◽

Predictive Maintenance ◽

Average Absolute Error ◽

Current State ◽

Automatic Model Selection ◽

State Duration ◽

Useful Lifetime

Realistic predictive maintenance approaches are essential for condition monitoring and predictive maintenance of industrial machines. In this work, we propose Hidden Semi-Markov Models (HSMMs) that (i) place no constraints on the state duration density function and (ii) can be applied to continuous or discrete observations. To deal with this type of HSMM, we also propose modifications to the learning, inference, and prediction algorithms. Finally, automatic model selection has been made possible using the Akaike Information Criterion. This paper describes the theoretical formalization of the model as well as several experiments performed on simulated and real data to validate the methodology. In all performed experiments, the model is able to correctly estimate the current state and to effectively predict the time to a predefined event with a low overall average absolute error. As a consequence, it can be beneficial in real-world settings, especially where the Remaining Useful Lifetime (RUL) of the machine is calculated in real time.
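Model selection with the Akaike Information Criterion amounts to picking the candidate with the lowest value of AIC = 2k - 2 ln L, where k is the number of free parameters and L is the maximized likelihood. The sketch below shows only that selection step. The candidate names, log-likelihoods, and parameter counts are made-up placeholders, not results from the paper, and fitting the HSMMs themselves is out of scope.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off
    between goodness of fit and model complexity."""
    return 2 * n_params - 2 * log_likelihood

def select_by_aic(candidates):
    """candidates maps a model name to (log_likelihood, n_params)."""
    return min(candidates, key=lambda name: aic(*candidates[name]))

# Placeholder values for three hypothetical fitted models.
candidates = {
    "2 states": (-120.0, 8),
    "3 states": (-105.0, 15),
    "4 states": (-103.0, 24),
}

print(select_by_aic(candidates))
```

Here the four-state model fits slightly better than the three-state one, but its extra parameters cost more than the gain in likelihood, so the criterion prefers three states.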

A Markov model of urban evolution: Neighbourhood change as a complex process

PLoS ONE ◽

10.1371/journal.pone.0245357 ◽

2021 ◽

Vol 16 (1) ◽

pp. e0245357

Author(s):

Daniel Silver ◽

Thiago H. Silva

Keyword(s):

Markov Model ◽

Markov Models ◽

Past Research ◽

Hierarchical Approach ◽

Evolutionary Trajectory ◽

Complex Process ◽

Trajectories Of Change ◽

Neighbourhood Change ◽

The Impact ◽

Complexity Theories

This paper seeks to advance neighbourhood change research and complexity theories of cities by developing and exploring a Markov model of socio-spatial neighbourhood evolution in Toronto, Canada. First, we classify Toronto neighbourhoods into distinct groups using established geodemographic segmentation techniques, a relatively novel application in this geographic setting. Extending previous studies, we pursue a hierarchical approach to classifying neighbourhoods that situates many neighbourhood types within the city’s broader structure. Our hierarchical approach is able to incorporate a richer set of types than most past research and allows us to study how neighbourhoods’ positions within this hierarchy shape their trajectories of change. Second, we use Markov models to identify generative processes that produce patterns of change in the city’s distribution of neighbourhood types. Moreover, we add a spatial component to the Markov process to uncover the extent to which change in one type of neighbourhood depends on the character of nearby neighbourhoods. In contrast to the few studies that have explored Markov models in this research tradition, we validate the model’s predictive power. Third, we demonstrate how to use such models in theoretical scenarios considering the impact on the city’s predicted evolutionary trajectory when existing probabilities of neighbourhood transitions or distributions of neighbourhood types would hypothetically change. Markov models of transition patterns prove to be highly accurate in predicting the final distribution of neighbourhood types. Counterfactual scenarios empirically demonstrate urban complexity: small initial changes reverberate throughout the system, and unfold differently depending on their initial geographic distribution. These scenarios show the value of complexity as a framework for interpreting data and guiding scenario-based planning exercises.
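The core mechanics of such a model, estimating transition probabilities from observed type changes and projecting a distribution of types forward, can be sketched as follows. This is a plain first-order Markov chain on invented neighbourhood types "A" and "B". It omits the hierarchical classification and the spatial component described in the abstract, and the function names are illustrative.

```python
from collections import Counter, defaultdict

def estimate_transitions(pairs):
    """Estimate row-normalized transition probabilities from (from_type, to_type) pairs."""
    counts = defaultdict(Counter)
    for src, dst in pairs:
        counts[src][dst] += 1
    return {
        src: {dst: c / sum(row.values()) for dst, c in row.items()}
        for src, row in counts.items()
    }

def project(distribution, transitions, steps=1):
    """Push a distribution over types through the chain for a number of periods.
    Types with no observed transitions are assumed to stay unchanged."""
    current = dict(distribution)
    for _ in range(steps):
        nxt = defaultdict(float)
        for src, mass in current.items():
            for dst, p in transitions.get(src, {src: 1.0}).items():
                nxt[dst] += mass * p
        current = dict(nxt)
    return current

# Invented observations of type changes between two periods.
observed_changes = [("A", "A"), ("A", "B"), ("B", "B"), ("B", "B")]
transitions = estimate_transitions(observed_changes)

print(project({"A": 1.0, "B": 0.0}, transitions, steps=2))
```

A counterfactual scenario of the kind the abstract mentions would correspond to editing `transitions` or the starting distribution and projecting again.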

Prediction of Eye, Hair and Skin Color in Admixed Populations of Latin America

10.1101/2020.12.09.415901 ◽

2020 ◽

Author(s):

Sagnik Palmal ◽

Kaustubh Adhikari ◽

Javier Mendoza-Revilla ◽

Macarena Fuentes-Guajardo ◽

Caio C. Silva de Cerqueira ◽

...

Keyword(s):

Native American ◽

Skin Color ◽

Prediction Accuracy ◽

Prediction Models ◽

Skin Pigmentation ◽

Genetic Ancestry ◽

Learning Approaches ◽

Latin Americans ◽

Eye Color ◽

The Impact

We report an evaluation of prediction accuracy for eye, hair and skin pigmentation based on genomic and phenotypic data for over 6,500 admixed Latin Americans (the CANDELA dataset). We examined the impact on prediction accuracy of three main factors: (i) the methods of prediction, including classical statistical methods and machine learning approaches; (ii) the inclusion of non-genetic predictors, continental genetic ancestry and pigmentation SNPs in the prediction models; and (iii) the choice between two sets of pigmentation SNPs: the commonly-used HIrisPlex-S set (developed in Europeans) and novel SNP sets we defined here based on genome-wide association results in the CANDELA sample. We find that Random Forest or regression are globally the best performing methods. Although continental genetic ancestry has substantial power for prediction of pigmentation in Latin Americans, the inclusion of pigmentation SNPs increases prediction accuracy considerably, particularly for skin color. For hair and eye color, HIrisPlex-S has a similar performance to the CANDELA-specific prediction SNP sets. However, for skin pigmentation the performance of HIrisPlex-S is markedly lower than that of the SNP set defined here, including for predictions in an independent dataset of Native American data. These results reflect the relatively high variation in hair and eye color among Europeans, for whom HIrisPlex-S was developed, whereas their variation in skin pigmentation is comparatively lower. Furthermore, we show that the dataset used in the training of prediction models strongly affects the portability of these models across Europeans and Native Americans.

A Bayesian Mutation–Selection Framework for Detecting Site-Specific Adaptive Evolution in Protein-Coding Genes

Molecular Biology and Evolution ◽

10.1093/molbev/msaa265 ◽

2020 ◽

Author(s):

Nicolas Rodrigue ◽

Thibault Latrille ◽

Nicolas Lartillot

Keyword(s):

Adaptive Evolution ◽

Real Data ◽

Data Sets ◽

Protein Coding ◽

Site Specific ◽

Protein Coding Genes ◽

Codon Substitution ◽

Selection Framework ◽

Dna Alignment ◽

The Impact

In recent years, codon substitution models based on the mutation–selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes, across the entire gene, or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach to a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program.

Validation of Adaptive Gaussian Process Regression Model Used for SIF Prediction

Volume 4: Materials Technology ◽

10.1115/omae2018-78608 ◽

2018 ◽

Author(s):

Arvind Keprate ◽

R. M. Chandima Ratnayake ◽

Shankar Sankararaman

Keyword(s):

Regression Model ◽

Gaussian Process ◽

Prediction Accuracy ◽

Experimental Validation ◽

Gaussian Process Regression ◽

Absolute Error ◽

Coefficient Of Determination ◽

Data Sets ◽

Maximum Absolute Error ◽

Data Points

The main aim of this paper is to perform the validation of the adaptive Gaussian process regression model (AGPRM) developed by the authors for the Stress Intensity Factor (SIF) prediction of a crack propagating in topside piping. For validation purposes, the values of SIF obtained from experiments available in the literature are used. Sixty-six data points (consisting of L, a, c and SIF values obtained by experiments) are used to train the AGPRM, while four independent data sets are used for validation purposes. The experimental validation of the AGPRM also consists of the comparison of the prediction accuracy of AGPRM and Finite Element Method (FEM) relative to the experimentally derived SIF values. Four metrics, namely, Root Mean Square Error (RMSE), Average Absolute Error (AAE), Maximum Absolute Error (MAE), and Coefficient of Determination (R2), are used to compare the accuracy. A case study illustrating the development and experimental validation of the AGPRM is presented. Results indicate that the prediction accuracy of the AGPRM is comparable with and even higher than that of the FEM, provided the training points of the AGPRM are aptly chosen.
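The four accuracy metrics listed in this abstract have standard definitions, sketched below on invented numbers. The observed and predicted values are placeholders rather than SIF data from the paper, and the function names are illustrative. Note that the abstract abbreviates Maximum Absolute Error as MAE, an abbreviation more commonly used for Mean Absolute Error.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def average_absolute_error(observed, predicted):
    """Mean of the absolute differences (AAE)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def maximum_absolute_error(observed, predicted):
    """Largest single absolute difference."""
    return max(abs(o - p) for o, p in zip(observed, predicted))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Placeholder values standing in for experimental and predicted quantities.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(rmse(observed, predicted), average_absolute_error(observed, predicted),
      maximum_absolute_error(observed, predicted), r_squared(observed, predicted))
```

With a single large miss, the maximum absolute error and RMSE react much more strongly than the average absolute error, which is one reason to report several metrics side by side.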

Predication of life cycle cost of equipment base on unbiased grey Markov models

MATEC Web of Conferences ◽

10.1051/matecconf/202030905005 ◽

2020 ◽

Vol 309 ◽

pp. 05005

Author(s):

Yonghong Chen ◽

Ping Hu ◽

Dong Zhang

Keyword(s):

Life Cycle ◽

Markov Model ◽

Life Cycle Cost ◽

Prediction Accuracy ◽

Markov Models ◽

Grey Model ◽

Model Accuracy ◽

Integrated Logistics ◽

And Control ◽

Whole Life Cycle

Life cycle cost (LCC) is an important element of integrated logistics support for equipment. Because the LCC covers the whole life cycle of equipment, from development, production, service, and maintenance to retirement, it must be analyzed and predicted in order to manage and control it effectively and to better develop integrated logistics support. This paper introduces the unbiased grey Markov model (UGMM) into LCC prediction. The posterior difference method (PDM) is used to check model accuracy, and the influence of the number of state intervals in the UGMM on prediction accuracy is analyzed. The results indicate that the UGMM can be used to predict the LCC and has the highest prediction accuracy compared with the unbiased grey model and the grey separating model. To ensure prediction accuracy, the state intervals should be divided according to the number of data points in the sequence.

Global Solar Radiation Prediction Using Hybrid Online Sequential Extreme Learning Machine Model

Energies ◽

10.3390/en11123415 ◽

2018 ◽

Vol 11 (12) ◽

pp. 3415 ◽

Cited By ~ 10

Author(s):

Muzhou Hou ◽

Tianle Zhang ◽

Futian Weng ◽

Mumtaz Ali ◽

Nadhir Al-Ansari ◽

...

Keyword(s):

Solar Radiation ◽

Extreme Learning Machine ◽

Prediction Accuracy ◽

Prediction Models ◽

Renewable Energy Sources ◽

Information Criterion ◽

Absolute Error ◽

Global Solar Radiation ◽

Machine Model ◽

Learning Machine

Accurate global solar radiation prediction is essential for research on renewable energy sources. The cost and expertise required to measure global solar radiation make intelligent prediction models necessary. On the basis of long-term measured daily solar radiation data, this study uses a novel regularized online sequential extreme learning machine, integrated with a variable forgetting factor (FOS-ELM), to predict global solar radiation at Bur Dedougou, in the Burkina Faso region. The Bayesian Information Criterion (BIC) is applied to build seven input combinations based on wind speed (Wspeed), maximum and minimum temperature (Tmax and Tmin), maximum and minimum humidity (Hmax and Hmin), evaporation (Eo) and vapor pressure deficiency (VPD). For the different input parameter magnitudes, seven models were developed and evaluated to find the optimal input combination. Various statistical indicators were computed to examine prediction accuracy. The experimental results demonstrate that the FOS-ELM model predicts daily global solar radiation more reliably than the classical extreme learning machine (ELM) model. Compared to the classical ELM, the FOS-ELM model improved the root mean square error (RMSE) and mean absolute error (MAE) by 68.8–79.8%. In summary, the results confirm the effectiveness of the FOS-ELM model, owing to the fixed internal tuning parameters.

Multistep-Ahead Air Passengers Traffic Prediction with Hybrid ARIMA-SVMs Models

The Scientific World JOURNAL ◽

10.1155/2014/567246 ◽

2014 ◽

Vol 2014 ◽

pp. 1-14 ◽

Cited By ~ 7

Author(s):

Wei Ming ◽

Yukun Bao ◽

Zhongyi Hu ◽

Tao Xiong

Keyword(s):

Prediction Models ◽

Nonlinear Modeling ◽

Real Data ◽

Data Preprocessing ◽

Air Transportation ◽

Data Sets ◽

Term Prediction ◽

Short Term Prediction ◽

Long Term Prediction

Hybrid ARIMA-SVMs prediction models have been established recently; they take advantage of the unique strengths of ARIMA and SVMs models in linear and nonlinear modeling, respectively. Building on these hybrid ARIMA-SVMs models, this study extends them to multistep-ahead prediction of air passenger traffic with the two most commonly used multistep-ahead prediction strategies, that is, the iterated strategy and the direct strategy. Additionally, the effectiveness of data preprocessing approaches, such as deseasonalization and detrending, is investigated and verified along with the two strategies. Real data sets including four selected airlines’ monthly series were collected to justify the effectiveness of the proposed approach. Empirical results demonstrate that the direct strategy performs better than the iterated strategy in long term prediction, while the iterated strategy performs better in short term prediction. Furthermore, both deseasonalization and detrending can significantly improve the prediction accuracy for both strategies, indicating the necessity of data preprocessing. As such, this study serves as a reference for planners in the air transportation industry on how to tackle multistep-ahead prediction tasks when implementing either prediction strategy.
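The difference between the iterated and direct strategies can be shown with a deliberately simple stand-in model. In the sketch below, a one-variable least-squares line replaces the hybrid ARIMA-SVMs model, which is an assumption made purely to keep the example short. The series and the function names are also invented. The iterated strategy fits one one-step-ahead model and feeds its own output back in, while the direct strategy fits a separate model for the target horizon.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x

def fit_horizon(series, h):
    """Fit a model that maps the value at time t to the value at time t + h."""
    return fit_line(series[:-h], series[h:])

def iterated_forecast(series, horizon):
    """Apply the one-step-ahead model repeatedly, reusing its own predictions."""
    slope, intercept = fit_horizon(series, 1)
    value = series[-1]
    for _ in range(horizon):
        value = slope * value + intercept
    return value

def direct_forecast(series, horizon):
    """Use a model trained specifically for the requested horizon."""
    slope, intercept = fit_horizon(series, horizon)
    return slope * series[-1] + intercept

# Invented noise-free series that doubles each period.
series = [1.0, 2.0, 4.0, 8.0, 16.0]

print(iterated_forecast(series, 3), direct_forecast(series, 3))
```

On this noise-free series both strategies agree. On real data they generally differ, because the iterated strategy accumulates one-step errors while the direct strategy has fewer training pairs at long horizons.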

DEFECT PREDICTION USING CASE-BASED REASONING: AN ATTRIBUTE WEIGHTING TECHNIQUE BASED UPON SENSITIVITY ANALYSIS IN NEURAL NETWORKS

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194012400116 ◽

2012 ◽

Vol 22 (06) ◽

pp. 747-768 ◽

Cited By ~ 7

Author(s):

ELHAM PAIKARI ◽

MICHAEL M. RICHTER ◽

GUENTHER RUHE

Keyword(s):

Neural Networks ◽

Sensitivity Analysis ◽

Absolute Error ◽

Defect Prediction ◽

Case Based Reasoning ◽

Data Sets ◽

Attribute Weighting ◽

Weight Calculation ◽

The Impact ◽

Case Based

Software defect prediction is an acknowledged approach used to achieve better product quality and to better utilize resources needed for that purpose. One known method for predicting the number of defects is to apply case-based reasoning (CBR). In this paper, different attribute weighting techniques for CBR-based defect prediction are analyzed. One of the weighting techniques used in this work, Sensitivity Analysis based on Neural Networks (SANN), is based on sensitivity analysis of the impact of attributes as part of neural network analysis. Neural networks are applicable when there are non-linear and complicated relationships among the attributes. Since weighting plays a key role in the CBR model, using an efficient weight calculation method can change the results. The results of SANN are compared with applying uniform weights and weights gained from Multiple Linear Regression (MLR). Evaluation of the accuracy of the overall method for applying the three different weighting techniques is done over five data sets, comprising about 5000 modules from NASA. Two quality measures are applied: Average Absolute Error (AAE) and Average Relative Error (ARE). In addition to the variation of weighting techniques, the impact of varying the number of nearest neighbors is studied. The three main results of the empirical analysis are: (i) In the majority of cases, SANN achieves the most accurate results; (ii) uniform weighting performs better than the MLR-based weighting heuristic; and (iii) there is no significant preference pattern for defining the number of similar objects used for prediction in CBR.
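To see why attribute weights matter in CBR-style prediction, consider the following minimal sketch of a weighted nearest-neighbor predictor. It assumes a weighted Euclidean distance and a plain average over the k most similar cases. The abstract does not specify the paper's similarity measure or adaptation rule, so both are assumptions, and the case base, attribute meanings, and weights are invented.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two attribute vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def predict_defects(query, cases, weights, k=3):
    """Average the defect counts of the k cases closest to the query.
    cases is a list of (attribute_vector, defect_count) pairs."""
    nearest = sorted(
        cases, key=lambda case: weighted_distance(query, case[0], weights)
    )[:k]
    return sum(defects for _, defects in nearest) / len(nearest)

# Invented case base: (size metric, complexity metric) -> number of defects.
cases = [((10.0, 1.0), 2), ((12.0, 9.0), 6), ((30.0, 2.0), 10)]
query = (10.5, 8.0)

uniform = predict_defects(query, cases, weights=(1.0, 1.0), k=1)
size_only = predict_defects(query, cases, weights=(1.0, 0.0), k=1)
print(uniform, size_only)
```

With uniform weights the query is matched to the case with similar complexity, while a weighting that ignores complexity selects a different case and yields a different prediction.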
