Performance of approaches relying on multidimensional intermediary data to decipher causal relationships between the exposome and health: A simulation study under various causal structures

Exploratory Causal Analysis of Open Data: Explanation Generation and Confounder Identification

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2020.p0142 ◽

2020 ◽

Vol 24 (1) ◽

pp. 142-155

Author(s):

Jing Song ◽

Satoshi Oyama ◽

Masahito Kurihara ◽

◽

...

Keyword(s):

Causal Relationship ◽

Language Processing ◽

Open Data ◽

Causal Analysis ◽

Causal Relationships ◽

Relationship Analysis ◽

Analysis Process ◽

Confounding Variables ◽

Data Environment ◽

Causal Structures

Open data are becoming increasingly available in various domains, and many organizations rely on making decisions according to data. Such decision making requires care to distinguish between correlations and causal relationships. Among data analysis tasks, causal relationship analysis is especially complex because of unobserved confounders. For example, to correctly analyze the causal relationship between two variables, the possible confounding effect of a third variable should be considered. In the open-data environment, however, it is difficult to consider all possible confounders in advance. In this paper, we propose a framework for exploratory causal analysis of open data, in which possible confounding variables are collected and incrementally tested from a large volume of open data. To the extent of the authors’ knowledge, no framework has been proposed to incorporate data for possible confounders in causal analysis process. This paper shows an original way to expand causal structures and generate reasonable causal relationships. The proposed framework accounts for the effect of possible confounding in causal analysis by first using a crowdsourcing platform to collect explanations of the correlation between variables. Keywords are then extracted using natural language processing methods. The framework searches the related open data according to the extracted keywords. Finally, the collected explanations are tested using several automated causal analysis methods. We conducted experiments using open data from the World Bank and the Japanese government. The experimental results confirmed that the proposed framework enables causal analysis while considering the effects of possible confounders.

Download Full-text

396 Validation of ECG indices of ventricular repolarization heterogeneity: a computer simulation study

EP Europace ◽

10.1016/s1099-5129(05)80252-1 ◽

2005 ◽

Vol 7 ◽

pp. 85-85

Keyword(s):

Computer Simulation ◽

Simulation Study ◽

Ventricular Repolarization ◽

Computer Simulation Study

Download Full-text

A Monte Carlo simulation study of associated liquid crystals

Molecular Physics ◽

10.1080/002689799162966 ◽

1999 ◽

Vol 97 (11) ◽

pp. 1173-1184 ◽

Cited By ~ 1

Author(s):

R. Berardi, M. Fehervari, C. Zannoni

Keyword(s):

Monte Carlo Simulation ◽

Monte Carlo ◽

Liquid Crystals ◽

Simulation Study ◽

Monte Carlo Simulation Study

Download Full-text

Molecular Simulation Study of Separation of CO2 from Alkanes Using Metal-Organic Frameworks

The Journal of Physical Chemistry B ◽

10.1021/jp064376w ◽

2006 ◽

Vol 0 (0) ◽

pp. 0-0

Author(s):

S. Wang ◽

Q. Yang ◽

C. Zhong

Keyword(s):

Molecular Simulation ◽

Simulation Study ◽

Metal Organic Frameworks ◽

Metal Organic ◽

Separation Of Co2

Download Full-text

An Alternative to Cohen's κ

European Psychologist ◽

10.1027/1016-9040.11.1.12 ◽

2006 ◽

Vol 11 (1) ◽

pp. 12-24 ◽

Cited By ~ 19

Author(s):

Alexander von Eye

Keyword(s):

Simulation Study ◽

Null Hypothesis ◽

Categorical Variables ◽

Alternative Measure ◽

Rater Agreement ◽

Verbal Processing ◽

Heavy Tailed ◽

Applicant Selection

At the level of manifest categorical variables, a large number of coefficients and models for the examination of rater agreement has been proposed and used. The most popular of these is Cohen's κ. In this article, a new coefficient, κ s , is proposed as an alternative measure of rater agreement. Both κ and κ s allow researchers to determine whether agreement in groups of two or more raters is significantly beyond chance. Stouffer's z is used to test the null hypothesis that κ s = 0. The coefficient κ s allows one, in addition to evaluating rater agreement in a fashion parallel to κ, to (1) examine subsets of cells in agreement tables, (2) examine cells that indicate disagreement, (3) consider alternative chance models, (4) take covariates into account, and (5) compare independent samples. Results from a simulation study are reported, which suggest that (a) the four measures of rater agreement, Cohen's κ, Brennan and Prediger's κ n , raw agreement, and κ s are sensitive to the same data characteristics when evaluating rater agreement and (b) both the z-statistic for Cohen's κ and Stouffer's z for κ s are unimodally and symmetrically distributed, but slightly heavy-tailed. Examples use data from verbal processing and applicant selection.

Download Full-text

Comparison of Principal Component Solutions in Two Populations

Methodology ◽

10.1027/1614-2241/a000099 ◽

2016 ◽

Vol 12 (1) ◽

pp. 11-20 ◽

Cited By ~ 1

Author(s):

Gregor Sočan

Keyword(s):

Simulation Study ◽

Principal Components ◽

Common Factor ◽

Principal Component ◽

Component Model ◽

Bootstrap Procedure ◽

Factor Loadings ◽

Component Loadings ◽

Principal Component Model ◽

Two Populations

Abstract. When principal component solutions are compared across two groups, a question arises whether the extracted components have the same interpretation in both populations. The problem can be approached by testing null hypotheses stating that the congruence coefficients between pairs of vectors of component loadings are equal to 1. Chan, Leung, Chan, Ho, and Yung (1999) proposed a bootstrap procedure for testing the hypothesis of perfect congruence between vectors of common factor loadings. We demonstrate that the procedure by Chan et al. is both theoretically and empirically inadequate for the application on principal components. We propose a modification of their procedure, which constructs the resampling space according to the characteristics of the principal component model. The results of a simulation study show satisfactory empirical properties of the modified procedure.

Download Full-text

Assessing Person Fit With the Information Matrix Test

Methodology ◽

10.1027/1614-2241/a000085 ◽

2015 ◽

Vol 11 (1) ◽

pp. 3-12 ◽

Cited By ~ 2

Author(s):

Jochen Ranger ◽

Jörg-Tobias Kuhn

Keyword(s):

Simulation Study ◽

Type I Error ◽

Information Matrix ◽

Small Samples ◽

Type I ◽

Person Fit ◽

Power Of The Test ◽

Order Expansion ◽

Trait Stability ◽

Information Matrix Test

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982) . This test can be interpreted as a test of trait stability during the measurement situation. The test follows approximately a χ2-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres to the nominal Type-I error rate well, although it tends to be conservative in very short scales. The power of the test is compared to the power of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to the power of the alternative tests. Advantages and areas of application of the information matrix test are discussed.

Download Full-text