Analysis of Categorical Data with the R Package confreq

Psych ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 522-541
Author(s):  
Jörg-Henrik Heine ◽  
Mark Stemmler

The person-centered approach in categorical data analysis is introduced as a complementary approach to the variable-centered approach. The former uses persons, animals, or objects on the basis of their combination of characteristics which can be displayed in multiway contingency tables. Configural Frequency Analysis (CFA) and log-linear modeling (LLM) are the two most prominent (and related) statistical methods. Both compare observed frequencies ($f_{o_{i \ldots k}}$) with expected frequencies ($f_{e_{i \ldots k}}$). While LLM uses primarily a model-fitting approach, CFA analyzes residuals of non-fitting models. Residuals with significantly more observed than expected frequencies ($f_{o_{i \ldots k}} > f_{e_{i \ldots k}}$) are called types, while residuals with significantly fewer observed than expected frequencies ($f_{o_{i \ldots k}} < f_{e_{i \ldots k}}$) are called antitypes. The R package confreq is presented and its use is demonstrated with several data examples. Results of contingency table analyses can be displayed in tables but also in graphics representing the size and type of residual. The expected frequencies represent the null hypothesis and different null hypotheses result in different expected frequencies. Different kinds of CFAs are presented: the first-order CFA based on the null hypothesis of independence, CFA with covariates, and the two-sample CFA. The calculation of the expected frequencies can be controlled through the design matrix which can be easily handled in confreq.
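The first-order CFA described in this abstract can be illustrated with a minimal Python sketch. It is not the confreq implementation: the counts are made up, the function name `first_order_cfa` is hypothetical, and the choice of a standardized residual with a normal approximation and a Bonferroni-adjusted alpha is an assumption made for the example.

```python
import math

# Hypothetical observed frequencies for three dichotomous variables (A, B, C).
observed = {
    (1, 1, 1): 40, (1, 1, 2): 10, (1, 2, 1): 12, (1, 2, 2): 18,
    (2, 1, 1): 8, (2, 1, 2): 14, (2, 2, 1): 9, (2, 2, 2): 39,
}


def first_order_cfa(observed, alpha=0.05):
    """Return {configuration: (expected, z, p, label)} under independence."""
    n = sum(observed.values())
    n_vars = len(next(iter(observed)))

    # One-way marginal totals for every variable.
    margins = [{} for _ in range(n_vars)]
    for config, count in observed.items():
        for v, level in enumerate(config):
            margins[v][level] = margins[v].get(level, 0) + count

    # Bonferroni adjustment: one test per configuration.
    alpha_adj = alpha / len(observed)

    results = {}
    for config, f_o in observed.items():
        # Expected frequency under the null hypothesis of independence.
        f_e = n * math.prod(margins[v][level] / n
                            for v, level in enumerate(config))
        z = (f_o - f_e) / math.sqrt(f_e)          # standardized residual
        p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal p-value
        if p < alpha_adj:
            label = "Type" if f_o > f_e else "Antitype"
        else:
            label = "-"
        results[config] = (f_e, z, p, label)
    return results


results = first_order_cfa(observed)
for config, (f_e, z, p, label) in sorted(results.items()):
    print(config, round(f_e, 3), round(z, 3), label)
```

A configuration is labelled a Type when it is observed significantly more often than expected and an Antitype when it is observed significantly less often; other test statistics and alpha adjustments are possible.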

Methodology ◽  
2021 ◽  
Vol 17 (2) ◽  
pp. 149-167
Author(s):  
Mark Stemmler ◽  
Jörg-Henrik Heine ◽  
Susanne Wallner

Configural Frequency Analysis (CFA) is a useful statistical method for the analysis of multiway contingency tables and an appropriate tool for person-oriented or person-centered methods. In complex contingency tables, patterns or configurations are analyzed by comparing observed cell frequencies with expected frequencies. Significant differences between observed and expected frequencies lead to the emergence of Types and Antitypes. Types are patterns or configurations which are observed significantly more often than expected; Antitypes represent configurations which are observed less frequently than expected. The R-package confreq is an easy-to-use software package for conducting CFAs; another useful shareware program for running CFAs was developed by Alexander von Eye. Here, CFA is presented based on the log-linear modeling approach. CFA may be used together with interval level variables which can be added as covariates into the design matrix. In this article, a real data example and the use of confreq are presented. In sum, the use of a covariate may bring the estimated cell frequencies closer to the observed cell frequencies. In those cases, the number of Types or Antitypes may decrease. However, in rare cases, the Type-Antitype pattern can change with new emerging Types or Antitypes.
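To make the role of the design matrix concrete, the following Python sketch fits a Poisson log-linear model by iteratively reweighted least squares, once with main effects only and once with an added covariate column. It is an illustration under stated assumptions, not the confreq code: the counts, the covariate values (hypothetical cell means of an interval-level variable), the effect coding, and the helper names `solve`, `fit_loglinear`, and `deviance` are all invented for the example, and every count is assumed to be positive.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_loglinear(design, counts, iterations=25):
    """Poisson log-linear fit by IRLS; returns the expected cell frequencies."""
    p = len(design[0])
    mu = [float(c) for c in counts]  # start at the data (needs counts > 0)
    for _ in range(iterations):
        # Working response and weights (weights equal mu for the log link).
        z = [math.log(m) + (y - m) / m for m, y in zip(mu, counts)]
        xtwx = [[sum(w * r[j] * r[k] for w, r in zip(mu, design))
                 for k in range(p)] for j in range(p)]
        xtwz = [sum(w * r[j] * zi for w, r, zi in zip(mu, design, z))
                for j in range(p)]
        beta = solve(xtwx, xtwz)
        mu = [math.exp(sum(b * x for b, x in zip(beta, r))) for r in design]
    return mu


def deviance(counts, expected):
    """Likelihood-ratio statistic G^2 between observed and expected counts."""
    return 2 * sum(o * math.log(o / e) for o, e in zip(counts, expected))


# Hypothetical 2 x 2 x 2 table, cells ordered (1,1,1), (1,1,2), ..., (2,2,2).
counts = [25, 10, 12, 18, 8, 14, 9, 30]
covariate = [3.1, 2.0, 2.4, 2.8, 1.9, 2.5, 2.2, 3.4]  # assumed cell means

# Effect-coded design matrix: intercept plus main effects of A, B, C.
levels = [(a, b, c) for a in (1, -1) for b in (1, -1) for c in (1, -1)]
design_main = [[1.0, a, b, c] for a, b, c in levels]
design_cov = [row + [x] for row, x in zip(design_main, covariate)]

expected_main = fit_loglinear(design_main, counts)
expected_cov = fit_loglinear(design_cov, counts)

print("G2 main effects:", round(deviance(counts, expected_main), 3))
print("G2 with covariate:", round(deviance(counts, expected_cov), 3))
```

Because the covariate model contains the main-effects model as a special case, its deviance cannot be larger, which is the sense in which a covariate may bring the estimated cell frequencies closer to the observed ones.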


2016 ◽  
Vol 41 (5) ◽  
pp. 632-646
Author(s):  
Mark Stemmler ◽  
Jörg-Henrik Heine

Configural frequency analysis and log-linear modeling are presented as person-centered analytic approaches for the analysis of categorical or categorized data in multi-way contingency tables. Person-centered developmental psychology, based on the holistic interactionistic perspective of the Stockholm working group around David Magnusson and Lars Bergman, is briefly revisited. According to person-centered theory, systems or individuals are seen as a whole and as inseparable units; individuals are embedded and strongly interconnected with their context; the individual and the environment influence each other, and the individual is seen as an active agent or producer of his or her own development. Four models of configural frequency analysis are presented: (1) First-order configural frequency analysis, which is basically the analysis of a main effects log-linear model; (2) prediction configural frequency analysis, which defines one or more dependent variables; (3) two-group configural frequency analysis, which proposes that there is no association between discrimination variables and group membership; and (4) functional configural frequency analysis, which allows us to blank out certain outlier cells in order to test for the quasi-independence of the rest of the cross-table. The use of the open source R-package confreq for computational analysis is demonstrated. The advantages, as well as the limitations, of configural frequency analysis are discussed.
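The two-group model mentioned in this abstract can be sketched in Python as follows. For each configuration, a 2 x 2 table (this configuration versus all others, by group) is tested for association with group membership. The counts, the function name `two_group_cfa`, and the use of a Pearson chi-square test with a Bonferroni-adjusted alpha are assumptions made for illustration; confreq may apply a different test.

```python
import math

# Hypothetical frequencies of four configurations in two groups.
group_a = {(1, 1): 30, (1, 2): 10, (2, 1): 12, (2, 2): 8}
group_b = {(1, 1): 9, (1, 2): 14, (2, 1): 11, (2, 2): 26}


def two_group_cfa(group_a, group_b, alpha=0.05):
    """Return {configuration: (chi2, p, discriminates)} for two groups."""
    n_a, n_b = sum(group_a.values()), sum(group_b.values())
    n = n_a + n_b
    alpha_adj = alpha / len(group_a)  # Bonferroni: one test per configuration

    results = {}
    for config in group_a:
        a, b = group_a[config], group_b[config]  # this configuration
        c, d = n_a - a, n_b - b                  # all other configurations
        # Pearson chi-square for a 2 x 2 table (1 degree of freedom).
        chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * n_a * n_b)
        # Upper-tail probability of a chi-square variable with 1 df.
        p = math.erfc(math.sqrt(chi2 / 2))
        results[config] = (chi2, p, p < alpha_adj)
    return results


results = two_group_cfa(group_a, group_b)
for config, (chi2, p, flag) in sorted(results.items()):
    print(config, round(chi2, 3), flag)
```

A configuration flagged here occurs with a significantly different relative frequency in the two groups, which contradicts the null hypothesis of no association between the discrimination variables and group membership.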


1982 ◽  
Vol 19 (4) ◽  
pp. 461-471
Author(s):  
Jay Magidson

Examples of some common pitfalls in the analysis of categorical data are discussed in the context of causal interpretation of the results. Though no statistical technique can replace theory, the author shows that log-linear modeling and chi square automatic interaction detection can provide researchers with powerful tools for gaining valuable causal insights into their data. Examples include the biasing effects of omitted variables, omitted interactions, improper contrast coding, and misspecification of the structure of an hypothesized interaction.


The R Journal ◽  
2018 ◽  
Vol 10 (1) ◽  
pp. 73
Author(s):  
Juhyun Kim ◽  
Yiwen Zhang ◽  
Joshua Day ◽  
Hua Zhou

2019 ◽  
pp. 51-64

The article presents basic algorithms for categorical data analysis using an R package. Algorithms for the analysis of independent and non-independent nominal and ordinal data are presented.

