BIC Criterion: Latest Research Papers

Abstract. Background: Although data-driven methods for selecting covariates in multivariable models ignore confounding, mediation and collider bias, they are still used in causal inference. This study uses three real-world datasets to show the impact of data-driven methods on causal inference. Methods: A research question leading to a multivariable model was raised for each of three real-world datasets. Three covariate selection methods were compared on how well they answered the question: Augmented Backward Elimination with the BIC criterion and a “change-in-estimate” threshold set at 0.05, Backward Elimination with the BIC criterion, and a knowledge-based method relying on causal diagrams. The covariates were classified as indispensable, prohibited or optional, according to the bias they could introduce into the estimate. For each dataset and sample size (n=75, 300 and 3,000), 10,000 Monte Carlo samples were drawn. The percentage of models that included each covariate was computed. Coverages of Wald’s 95% confidence interval of exposure effects were computed against two theoretical values: that of the analysed method and that of the knowledge-based method. Results: Even with the largest sample size (n=3,000), data-driven methods were not reproducible, with 8.6% to 53% of covariates included in 20% to 80% of experiments. Prohibited covariates could be included in more than 80% of experiments, and indispensable covariates missed in more than 80% of experiments, even with n=3,000. With the largest sample size, coverages of the theoretical knowledge-based value by data-driven methods ranged from 0% to 83.7%; coverages of the theoretical value of the same data-driven method ranged from 73.2% to 91.1% and were asymmetrical. Conclusion: Data-driven methods should not be used in causal inference.
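As a rough illustration of what a data-driven selection step does, the following Python sketch runs plain backward elimination on an ordinary least squares model, dropping a covariate whenever that lowers the BIC. It is not the study's code: the linear model, the function names `bic_ols` and `backward_elimination_bic`, and the `keep` argument are assumptions made for the example, and the augmented variant with its change-in-estimate threshold is not implemented.

```python
import numpy as np

def bic_ols(X, y):
    """BIC of a Gaussian linear model fitted by least squares,
    up to an additive constant that does not depend on the covariates."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)

def backward_elimination_bic(X, y, names, keep=()):
    """Drop one covariate at a time for as long as doing so lowers the BIC.
    Columns whose name is in `keep` (hypothetical argument, e.g. the
    intercept and the exposure) are never candidates for removal."""
    active = list(range(X.shape[1]))
    best = bic_ols(X[:, active], y)
    while True:
        candidates = []
        for j in active:
            if names[j] in keep:
                continue
            trial = [c for c in active if c != j]
            candidates.append((bic_ols(X[:, trial], y), j))
        if not candidates:
            break
        score, j = min(candidates)  # removal giving the lowest BIC
        if score >= best:           # no removal improves the BIC: stop
            break
        best = score
        active.remove(j)
    return [names[j] for j in active], best
```

The procedure sees only fit and penalty. Nothing in it distinguishes a confounder from a mediator or a collider, which is the concern the abstract raises.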


Multi-Partitions Subspace Clustering

Mathematics ◽

10.3390/math8040597 ◽

2020 ◽

Vol 8 (4) ◽

pp. 597 ◽

Cited By ~ 1

Author(s):

Vincent Vandewalle

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Subspace Clustering ◽

Real Data ◽

Model Choice ◽

Model Based Clustering ◽

Model Based ◽

Choice Strategy ◽

Factorial Discriminant Analysis ◽

Bic Criterion

In model-based clustering, it is often supposed that a single latent clustering variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering method is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It allows the multi-partitions and the related subspaces to be found simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters per subspace. The results obtained are thus several projections of the data, each one conveying its own clustering of the data. The model’s behavior is illustrated on simulated and real data.
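For reference, the BIC in its most common form is given below. The abstract does not state which sign or scaling convention the paper adopts, so this is the standard textbook definition rather than the paper's own formula. Here $m$ indexes a candidate model (for instance a given number of subspaces and of clusters per subspace), $\hat{L}_m$ is its maximised likelihood, $\nu_m$ its number of free parameters, and $n$ the number of observations.

```latex
\mathrm{BIC}(m) = -2 \ln \hat{L}_m + \nu_m \ln n
```

Under this convention the candidate with the smallest value is retained. Some model-based clustering work uses the equivalent form $\ln \hat{L}_m - \tfrac{\nu_m}{2} \ln n$ and maximises it instead.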


The Application of Fractional Calculus in Chinese Economic Growth Models

Mathematics ◽

10.3390/math7080665 ◽

2019 ◽

Vol 7 (8) ◽

pp. 665 ◽

Cited By ~ 10

Author(s):

Hao Ming ◽

JinRong Wang ◽

Michal Fečkan

Keyword(s):

Fractional Order ◽

Growth Models ◽

Free Software ◽

Statistical Computing ◽

Order Model ◽

Software Environment ◽

R Software ◽

Fractional Order Calculus ◽

Economic Growth Models ◽

Bic Criterion

In this paper, we apply Caputo-type fractional order calculus to simulate China’s gross domestic product (GDP) growth using R, a free software environment for statistical computing and graphics. Moreover, we compare the results of the fractional model with those of the integer order model. In addition, we show the importance of variables according to the BIC criterion. The study shows that Caputo fractional order calculus can produce a better model and predict the GDP values for 2012–2016 more accurately.
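For readers unfamiliar with the operator, the usual definition of the Caputo fractional derivative of order $\alpha$ is recalled below. The abstract does not give the order range or the lower terminal used in the paper, so the choice of lower terminal $0$ and the condition $n-1 < \alpha < n$ (with $n$ a positive integer) are assumptions taken from the standard definition.

```latex
{}^{C}\!D^{\alpha}_{t} f(t)
  = \frac{1}{\Gamma(n-\alpha)}
    \int_{0}^{t} \frac{f^{(n)}(\tau)}{(t-\tau)^{\alpha+1-n}} \, d\tau ,
  \qquad n-1 < \alpha < n
```

For $0 < \alpha < 1$ this reduces to $\frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} (t-\tau)^{-\alpha} f'(\tau)\, d\tau$, a weighted average of the past values of the ordinary first derivative.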


FORECAST OF TOURIST DEMAND IN UKRAINE ON A FAST-FUTURE PROSPECTS

World Science ◽

10.31435/rsglobal_ws/30082018/6047 ◽

2018 ◽

Vol 1 (8(36)) ◽

pp. 11-16

Author(s):

Viktor Krylov ◽

Christina Lipyanina

Keyword(s):

Time Series ◽

Future Prospects ◽

Good Match ◽

Auto Correlation ◽

Tourist Demand ◽

Bic Criterion

The process by which tourist demand forms was studied, and the autocorrelation and partial autocorrelation functions were calculated. The behavior of the sample ACF and PACF was evaluated to form hypotheses about the values of the parameters p and q. Because of the lack of data, two competing models, ARMA(1,1) and ARMA(2,0), were selected. Both models matched the data well, were adequate and had random errors, so the best model was chosen according to the AIC and BIC criteria. The residuals of the selected model were checked for the absence of autocorrelation using the Ljung-Box test. For the selected model, forecasts were made 5 periods ahead. The time series forecast indicates that tourist demand will decline over the next 5 years.
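The residual check mentioned above can be sketched in a few lines of Python. This is only an illustration of the Ljung-Box statistic, not the authors' code; the function name `ljung_box_q` and its arguments are hypothetical.

```python
import numpy as np

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..h} rho_k^2 / (n - k),
    where rho_k is the lag-k sample autocorrelation of the residuals."""
    x = np.asarray(residuals, dtype=float)
    x = x - x.mean()
    n = x.size
    denom = np.sum(x ** 2)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom  # lag-k sample autocorrelation
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total
```

For residuals of an ARMA(p, q) fit, Q is usually compared with a chi-square quantile with `max_lag - p - q` degrees of freedom. A large value suggests that autocorrelation remains and that the model is inadequate.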
