Person-centered data analysis with covariates and the R-package confreq

Methodology ◽  
2021 ◽  
Vol 17 (2) ◽  
pp. 149-167
Author(s):  
Mark Stemmler ◽  
Jörg-Henrik Heine ◽  
Susanne Wallner

Configural Frequency Analysis (CFA) is a useful statistical method for the analysis of multiway contingency tables and an appropriate tool for person-oriented or person-centered methods. In complex contingency tables, patterns or configurations are analyzed by comparing observed cell frequencies with expected frequencies. Significant differences between observed and expected frequencies lead to the emergence of Types and Antitypes. Types are configurations that are observed significantly more often than expected; Antitypes are configurations that are observed significantly less often than expected. The R-package confreq is an easy-to-use software package for conducting CFAs; another useful shareware program for running CFAs was developed by Alexander von Eye. Here, CFA is presented based on the log-linear modeling approach. CFA may be used together with interval-level variables, which can be added as covariates to the design matrix. In this article, a real data example and the use of confreq are presented. In sum, the use of a covariate may bring the estimated cell frequencies closer to the observed cell frequencies. In those cases, the number of Types or Antitypes may decrease. However, in rare cases, the Type-Antitype pattern can change, with new Types or Antitypes emerging.
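
The comparison of observed and expected cell frequencies described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the confreq implementation: it assumes a first-order CFA (expected frequencies from the independence model), a z-approximation for each cell, and a Bonferroni-adjusted one-sided significance level. The function name `first_order_cfa` and the example counts are hypothetical.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist


def first_order_cfa(counts, alpha=0.05):
    """Illustrative first-order CFA (assumed z-test with Bonferroni adjustment).

    counts maps each configuration (a tuple of category labels) to its
    observed frequency. Returns, per configuration, a tuple of
    (observed, expected, z, one-sided p, label).
    """
    n = sum(counts.values())
    n_vars = len(next(iter(counts)))

    # Marginal totals for every level of every variable.
    margins = [dict() for _ in range(n_vars)]
    for config, obs in counts.items():
        for v, level in enumerate(config):
            margins[v][level] = margins[v].get(level, 0) + obs

    cells = list(product(*(sorted(m) for m in margins)))
    alpha_adj = alpha / len(cells)  # Bonferroni: one test per cell

    results = {}
    for config in cells:
        obs = counts.get(config, 0)
        # Independence model: n times the product of marginal proportions.
        expected = n
        for v, level in enumerate(config):
            expected *= margins[v][level] / n
        z = (obs - expected) / sqrt(expected)
        p = 1.0 - NormalDist().cdf(abs(z))
        if p < alpha_adj:
            label = "Type" if z > 0 else "Antitype"
        else:
            label = "-"
        results[config] = (obs, expected, z, p, label)
    return results


# Hypothetical 2 x 2 x 2 table, invented purely for illustration.
example_counts = {
    (1, 1, 1): 40, (1, 1, 2): 10, (1, 2, 1): 10, (1, 2, 2): 10,
    (2, 1, 1): 10, (2, 1, 2): 10, (2, 2, 1): 10, (2, 2, 2): 40,
}

for config, (obs, exp, z, p, label) in first_order_cfa(example_counts).items():
    print(config, obs, round(exp, 2), round(z, 2), label)
```

Other test statistics (for example an exact binomial test) and other alpha adjustments are possible; the z-test is used here only because it needs nothing beyond the standard library.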

2016 ◽  
Vol 41 (5) ◽  
pp. 632-646 ◽  
Author(s):  
Mark Stemmler ◽  
Jörg-Henrik Heine

Configural frequency analysis and log-linear modeling are presented as person-centered analytic approaches for the analysis of categorical or categorized data in multi-way contingency tables. Person-centered developmental psychology, based on the holistic interactionistic perspective of the Stockholm working group around David Magnusson and Lars Bergman, is briefly revisited. According to person-centered theory, systems or individuals are seen as a whole and as inseparable units; individuals are embedded and strongly interconnected with their context; the individual and the environment influence each other, and the individual is seen as an active agent or producer of his or her own development. Four models of configural frequency analysis are presented: (1) First-order configural frequency analysis, which is basically the analysis of a main effects log-linear model; (2) prediction configural frequency analysis, which defines one or more dependent variables; (3) two-group configural frequency analysis, which proposes that there is no association between discrimination variables and group membership; and (4) functional configural frequency analysis, which allows us to blank out certain outlier cells in order to test for the quasi-independence of the rest of the cross-table. The use of the open source R-package confreq for computational analysis is demonstrated. The advantages, as well as the limitations, of configural frequency analysis are discussed.


Psych ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 522-541
Author(s):  
Jörg-Henrik Heine ◽  
Mark Stemmler

The person-centered approach in categorical data analysis is introduced as a complementary approach to the variable-centered approach. The former uses persons, animals, or objects on the basis of their combination of characteristics which can be displayed in multiway contingency tables. Configural Frequency Analysis (CFA) and log-linear modeling (LLM) are the two most prominent (and related) statistical methods. Both compare observed frequencies ($f_{o_{i \ldots k}}$) with expected frequencies ($f_{e_{i \ldots k}}$). While LLM uses primarily a model-fitting approach, CFA analyzes residuals of non-fitting models. Residuals with significantly more observed than expected frequencies ($f_{o_{i \ldots k}} > f_{e_{i \ldots k}}$) are called types, while residuals with significantly less observed than expected frequencies ($f_{o_{i \ldots k}} < f_{e_{i \ldots k}}$) are called antitypes. The R package confreq is presented and its use is demonstrated with several data examples. Results of contingency table analyses can be displayed in tables but also in graphics representing the size and type of residual. The expected frequencies represent the null hypothesis and different null hypotheses result in different expected frequencies. Different kinds of CFAs are presented: the first-order CFA based on the null hypothesis of independence, CFA with covariates, and the two-sample CFA. The calculation of the expected frequencies can be controlled through the design matrix which can be easily handled in confreq.
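
To make the role of the design matrix concrete, the following Python sketch builds a main-effects design matrix with effect (deviation) coding and appends one covariate column. It does not reproduce confreq's interface; the function names, the choice of effect coding with the last level as reference, and the covariate values are assumptions made for illustration.

```python
from itertools import product


def effect_codes(level_index, n_levels):
    """Effect coding with n_levels - 1 columns; the last level is coded -1 throughout."""
    if level_index == n_levels - 1:
        return [-1] * (n_levels - 1)
    return [1 if j == level_index else 0 for j in range(n_levels - 1)]


def main_effects_design(levels_per_variable):
    """One row per cell of the table: intercept plus effect-coded main effects."""
    rows = []
    for cell in product(*(range(k) for k in levels_per_variable)):
        row = [1]  # intercept column
        for level_index, n_levels in zip(cell, levels_per_variable):
            row.extend(effect_codes(level_index, n_levels))
        rows.append(row)
    return rows


def append_covariate(design, covariate_per_cell):
    """Return a new design matrix with one covariate value added to each row."""
    if len(design) != len(covariate_per_cell):
        raise ValueError("need exactly one covariate value per cell")
    return [row + [value] for row, value in zip(design, covariate_per_cell)]


# Hypothetical 2 x 2 table with an invented cell-level covariate
# (for example, a mean score of the persons in each cell).
design = main_effects_design([2, 2])
design_with_covariate = append_covariate(design, [3.1, 2.4, 2.9, 1.8])
for row in design_with_covariate:
    print(row)
```

Each column of the matrix corresponds to one parameter of the log-linear model, so adding or removing columns changes the null hypothesis and therefore the expected frequencies.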


2021 ◽  
Vol 15 ◽  
Author(s):  
Sander Lamballais ◽  
Ryan L. Muetzel

The cerebral cortex is fundamental to the functioning of the mind and body. In vivo cortical morphology can be studied through magnetic resonance imaging in several ways, including reconstructing surface-based models of the cortex. However, existing software for surface-based statistical analyses cannot accommodate “big data” or commonly used statistical methods such as the imputation of missing data, extensive bias correction, and non-linear modeling. To address these shortcomings, we developed the QDECR package, a flexible and extensible R package for group-level statistical analysis of cortical morphology. QDECR was written with large population-based epidemiological studies in mind and was designed to fully utilize the extensive modeling options in R. QDECR currently supports vertex-wise linear regression. Design matrix generation can be done through simple, familiar R formula specification, and includes user-friendly extensions for R options such as polynomials, splines, interactions and other terms. QDECR can handle unimputed and imputed datasets with thousands of participants. QDECR has a modular design, and new statistical models can be implemented which utilize several aspects from other generic modules which comprise QDECR. In summary, QDECR provides a framework for vertex-wise surface-based analyses that enables flexible statistical modeling and features commonly used in population-based and clinical studies, which have until now been largely absent from neuroimaging research.


1987 ◽  
Vol 26 (03) ◽  
pp. 104-108
Author(s):  
M. A. A. Moussa

Summary: The paper focuses upon the measurement of association in two-way contingency tables, using the log-linear models and dual scaling approaches. The former comprises [1] the use of pseudo-Bayes estimators to remove zeros, [2] fitting the resulting smoothed array to all possible configurations of log-linear models, [3] fitting the quasi-independence model to detect anomalous cells that caused deviation from the null-independence model. The latter includes [1] estimation of the optimal weights that maximize the canonical correlation between the two categorical variables by an optimization iterative method, [2] testing the discriminability of the estimated scoring scheme. The two approaches were applied to a set of real data for the study of the association between maternal age at marriage and types of reproductive wastage in a sampling survey conducted in the population of female nurses in Kuwait.
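
The quasi-independence step can be illustrated with a short Python sketch. The summary does not say how the model was fitted, so the sketch uses iterative proportional fitting as an assumed technique: blanked cells are treated as structural zeros, and the remaining cells are scaled until their row and column sums match the observed ones. The function name and the example table are hypothetical.

```python
def quasi_independence_fit(observed, blanked, iterations=200, tol=1e-10):
    """Fit quasi-independence to a two-way table by iterative proportional fitting.

    observed is a list of rows; blanked is a set of (row, column) index pairs
    that are excluded from the model and stay at 0 in the fitted table.
    """
    n_rows, n_cols = len(observed), len(observed[0])

    # Start from 1 in every retained cell and 0 in every blanked cell.
    fitted = [[0.0 if (i, j) in blanked else 1.0 for j in range(n_cols)]
              for i in range(n_rows)]

    # Target margins are computed over the retained cells only.
    row_targets = [sum(observed[i][j] for j in range(n_cols) if (i, j) not in blanked)
                   for i in range(n_rows)]
    col_targets = [sum(observed[i][j] for i in range(n_rows) if (i, j) not in blanked)
                   for j in range(n_cols)]

    for _ in range(iterations):
        # Scale each row to its target sum.
        for i in range(n_rows):
            total = sum(fitted[i])
            if total > 0:
                factor = row_targets[i] / total
                fitted[i] = [value * factor for value in fitted[i]]
        # Scale each column to its target sum and track the largest adjustment.
        max_change = 0.0
        for j in range(n_cols):
            total = sum(fitted[i][j] for i in range(n_rows))
            if total > 0:
                factor = col_targets[j] / total
                max_change = max(max_change, abs(factor - 1.0))
                for i in range(n_rows):
                    fitted[i][j] *= factor
        if max_change < tol:
            break
    return fitted


# Hypothetical table: rows and columns are exactly proportional, except for
# one inflated cell at position (0, 0), which is blanked out before fitting.
table = [[100, 20, 30],
         [20, 40, 60],
         [30, 60, 90]]
for row in quasi_independence_fit(table, blanked={(0, 0)}):
    print([round(value, 3) for value in row])
```

If the fitted values agree well with the retained cells once a cell is blanked, that cell is a candidate for having caused the departure from independence.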


2016 ◽  
Vol 36 (2) ◽  
Author(s):  
Patrick Mair

The formulation of log-linear models within the framework of Generalized Linear Models offers new possibilities in modeling categorical data. The resulting models are not restricted to the analysis of contingency tables in terms of ordinary hierarchical interactions. Such models are considered as the family of nonstandard log-linear models. The problem that can arise is an ambiguous interpretation of parameters. In the current paper, this problem is solved by looking at the effects coded in the design matrix and determining the numerical contribution of single effects. Based on these results, stepwise approaches are proposed in order to achieve parsimonious models. In addition, some testing strategies are presented to test such (possibly non-nested) models against each other. As a result, a whole interpretation framework is elaborated to examine nonstandard log-linear models in depth.


2004 ◽  
Vol 8 (2) ◽  
pp. 67-86 ◽  
Author(s):  
Eric J. Beh ◽  
Pamela J. Davy

Log-linear modeling is a popular statistical tool for analysing a contingency table. This presentation focuses on an alternative approach to modeling ordinal categorical data. The technique, based on orthogonal polynomials, provides a much simpler method of model fitting than the conventional approach of maximum likelihood estimation, as it does not require iterative calculations nor the fitting and re-fitting to search for the best model. Another advantage is that quadratic and higher order effects can readily be included, in contrast to conventional log-linear models which incorporate linear terms only. The focus of the discussion is the application of the new parameter estimation technique to multi-way contingency tables with at least one ordered variable. This will also be done by considering singly and doubly ordered two-way contingency tables. It will be shown by example that the resulting parameter estimates are numerically similar to corresponding maximum likelihood estimates for ordinal log-linear models.


2020 ◽  
Vol 36 (14) ◽  
pp. 4222-4224
Author(s):  
Zhong Wang ◽  
Nating Wang ◽  
Zilu Wang ◽  
Libo Jiang ◽  
Yaqun Wang ◽  
...  

Summary: Genome-wide association studies (GWAS), particularly those designed with thousands and thousands of single-nucleotide polymorphisms (SNPs) (big p) genotyped on tens of thousands of subjects (small n), face a major challenge of p ≫ n. Although the integration of longitudinal information can significantly enhance a GWAS’s power to comprehend the genetic architecture of complex traits and diseases, an additional challenge is generated by an autocorrelative process. We have developed several statistical models for addressing these two challenges by implementing dimension reduction methods and longitudinal data analysis. To make these models computationally accessible to applied geneticists, we wrote an R package, HiGwas, designed to analyze longitudinal GWAS datasets. Functions in the package encompass single SNP analyses, significance-level adjustment, preconditioning and model selection for a high-dimensional set of SNPs. HiGwas provides the estimates of genetic parameters and the confidence intervals of these estimates. We demonstrate the features of HiGwas through real data analysis and the vignette document in the package. Availability and implementation: https://github.com/wzhy2000/higwas. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

