BIC criterion
Recently Published Documents


2021 ◽  
Author(s):  
Yangyang Xue ◽  
Wenliang Wang ◽  
Hanhao Zhu ◽  
Guangxue Zheng ◽  
Zhiqiang Cui

2021 ◽  
Author(s):  
Thibaut Pressat-Laffouilhère ◽  
Clément Massonnaud ◽  
Hélène Bréard ◽  
Margaux Lefebvre ◽  
Bruno Falissard ◽  
...  

Background: Although data-driven methods for selecting covariates in multivariable models ignore confounding, mediation and collider bias, they are still used in causal inference. This study uses three real-world datasets to show the impact of data-driven methods on causal inference. Methods: A research question leading to a multivariable model was raised for each of three real-world datasets. Three covariate selection methods were compared on their ability to answer the question correctly: Augmented Backward Elimination with the BIC criterion and a “change-in-estimate” threshold set at 0.05, Backward Elimination with the BIC criterion, and a knowledge-based method relying on causal diagrams. The covariates were classified as indispensable, prohibited or optional, according to the bias they could introduce into the estimate. For each dataset and sample size (n=75, 300 and 3,000), 10,000 Monte Carlo samples were drawn. The percentage of models including each covariate was computed. Coverage of the Wald 95% confidence interval of the exposure effect was computed against two theoretical values (that of the analysed method and that of the knowledge-based method). Results: Even with the largest sample size (n=3,000), data-driven methods were not reproducible, with 8.6% to 53% of covariates included in 20% to 80% of experiments. Prohibited covariates could be included in more than 80% of experiments, and indispensable covariates missed in more than 80% of experiments, even with n=3,000. With the largest sample sizes, coverage of the theoretical knowledge-based value by data-driven methods ranged from 0% to 83.7%; coverage of the theoretical value of the same data-driven method ranged from 73.2% to 91.1% and was asymmetrical. Conclusion: Data-driven methods should not be used in causal inference.
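As a rough illustration of the kind of experiment described in this abstract, the sketch below runs plain backward elimination driven by BIC on repeated simulated samples and reports how often each covariate is retained. It is not the authors' code: the simulated data (one true predictor and two noise covariates), the Gaussian linear model, and all function names are assumptions made for the example, and the augmented variant with a change-in-estimate threshold is not implemented.

```python
import math
import random

def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations + Gauss-Jordan)."""
    n, p = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [v - f * w for v, w in zip(A[r], A[col])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def gaussian_bic(X, y, cols):
    """BIC of a Gaussian linear model (up to an additive constant): n*ln(RSS/n) + k*ln(n)."""
    n = len(y)
    design = [[1.0] + [row[c] for c in cols] for row in X]  # intercept + chosen covariates
    return n * math.log(ols_rss(design, y) / n) + (len(cols) + 1) * math.log(n)

def backward_elimination(X, y):
    """Drop one covariate at a time while doing so lowers the BIC."""
    cols = list(range(len(X[0])))
    best = gaussian_bic(X, y, cols)
    while cols:
        score, drop = min((gaussian_bic(X, y, [c for c in cols if c != d]), d) for d in cols)
        if score >= best:
            break
        best = score
        cols.remove(drop)
    return cols, best

def simulate(n, rng):
    """Hypothetical data: y depends on covariate 0 only; covariates 1 and 2 are noise."""
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
    return X, y

def inclusion_rates(n, reps, seed=0):
    """Fraction of simulated samples in which each covariate survives selection."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(reps):
        X, y = simulate(n, rng)
        for c in backward_elimination(X, y)[0]:
            counts[c] += 1
    return [c / reps for c in counts]

if __name__ == "__main__":
    print(inclusion_rates(n=75, reps=200))
```

In a causal analysis the exposure would normally be forced into every model; this sketch treats all covariates alike for brevity.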


Mathematics ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 597 ◽  
Author(s):  
Vincent Vandewalle

In model-based clustering, it is often supposed that only one clustering latent variable explains the heterogeneity of the whole dataset. However, in many cases several latent variables could explain the heterogeneity of the data at hand. Finding such class variables could result in a richer interpretation of the data. In the continuous data setting, a multi-partition model-based clustering is proposed. It assumes the existence of several latent clustering variables, each one explaining the heterogeneity of the data with respect to some clustering subspace. It allows the multi-partitions and the related subspaces to be found simultaneously. Parameters of the model are estimated through an EM algorithm relying on a probabilistic reinterpretation of the factorial discriminant analysis. A model choice strategy relying on the BIC criterion is proposed to select the number of subspaces and the number of clusters per subspace. The obtained results are thus several projections of the data, each one conveying its own clustering of the data. The model’s behavior is illustrated on simulated and real data.


Mathematics ◽  
2019 ◽  
Vol 7 (8) ◽  
pp. 665 ◽  
Author(s):  
Hao Ming ◽  
JinRong Wang ◽  
Michal Fečkan

In this paper, we apply Caputo-type fractional order calculus to simulate China’s gross domestic product (GDP) growth using R, a free software environment for statistical computing and graphics. Moreover, we compare the results of the fractional model with those of the integer order model. In addition, we show the importance of variables according to the BIC criterion. The study shows that Caputo fractional order calculus can produce a better model and predict the GDP values for 2012–2016 more accurately.


World Science ◽  
2018 ◽  
Vol 1 (8(36)) ◽  
pp. 11-16
Author(s):  
Viktor Krylov ◽  
Christina Lipyanina

The formation of tourist demand was studied, and the autocorrelation and partial autocorrelation functions were calculated. The behavior of the sample ACF and PACF was assessed to form a hypothesis about the values of the parameters p and q. Due to the lack of data, two competing models, ARMA(1,1) and ARMA(2,0), were selected. Both models fit the data well, are adequate and have random errors, so the best model is chosen according to the AIC and BIC criteria. The residuals of the selected model are checked for the absence of autocorrelation using the Ljung-Box test. For the selected model, forecasts were made for 5 periods ahead. The forecast of the time series indicates that tourist demand will decline over the next 5 years.
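The comparison step mentioned in this abstract can be sketched with the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where the model with the lower score is preferred. The log-likelihoods, parameter counts and sample size below are hypothetical placeholders, not values from the paper, and the function names are chosen only for this example.

```python
import math

def aic_score(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic_score(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

def pick_model(candidates, n):
    """Return the name of the candidate with the lowest BIC.

    candidates maps a model name to (log-likelihood, number of fitted parameters).
    """
    return min(candidates, key=lambda name: bic_score(*candidates[name], n))

if __name__ == "__main__":
    # Hypothetical fits of two competing models on n = 30 observations.
    fits = {"ARMA(1,1)": (-52.3, 3), "ARMA(2,0)": (-53.0, 3)}
    for name, (ll, k) in fits.items():
        print(name, round(aic_score(ll, k), 2), round(bic_score(ll, k, 30), 2))
    print("preferred by BIC:", pick_model(fits, 30))
```

Because ln n exceeds 2 once n is 8 or more, BIC penalizes extra parameters more heavily than AIC for all but very short series, so the two criteria can disagree when the candidates differ in size.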

