Exploring Incomplete Rating Designs With Mokken Scale Analysis

2016, Vol 78 (2), pp. 319-342
Author(s): Stefanie A. Wind, Yogendra J. Patil

Recent research has explored the use of models adapted from Mokken scale analysis as a nonparametric approach to evaluating rating quality in educational performance assessments. A potential limiting factor to the widespread use of these techniques is the requirement for complete data, as practical constraints in operational assessment systems often limit the use of complete rating designs. In order to address this challenge, this study explores the use of missing data imputation techniques and their impact on Mokken-based rating quality indicators related to rater monotonicity, rater scalability, and invariant rater ordering. Simulated data and real data from a rater-mediated writing assessment were modified to reflect varying levels of missingness, and four imputation techniques were used to impute missing ratings. Overall, the results indicated that simple imputation techniques based on rater and student means result in generally accurate recovery of rater monotonicity indices and rater scalability coefficients. However, discrepancies between violations of invariant rater ordering in the original and imputed data are somewhat unpredictable across imputation methods. Implications for research and practice are discussed.
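The simplest of the imputation strategies evaluated above, filling a missing rating with the rater's mean or the student's mean, can be sketched in a few lines. This is an illustrative sketch only, not the study's code; the grid layout (rows = raters, columns = students, `None` = missing) is an assumption for the example.

```python
# Illustrative sketch: impute missing ratings in a rater-by-student grid
# using either the rater's mean or the student's mean (rows = raters).
def impute(grid, by="rater"):
    n_raters, n_students = len(grid), len(grid[0])

    def mean(values):
        observed = [v for v in values if v is not None]
        return sum(observed) / len(observed)

    filled = [row[:] for row in grid]
    for r in range(n_raters):
        for s in range(n_students):
            if filled[r][s] is None:
                if by == "rater":
                    filled[r][s] = mean(grid[r])                    # rater mean
                else:
                    filled[r][s] = mean([row[s] for row in grid])   # student mean
    return filled

ratings = [[3, 4, None],
           [2, None, 4],
           [3, 3, 5]]
print(impute(ratings, by="rater"))  # missing cells get each rater's mean
```

With `by="rater"`, the missing cell in row 0 becomes (3 + 4) / 2 = 3.5; with `by="student"`, it becomes the mean of the observed ratings in that column instead.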

2017, Vol 78 (5), pp. 887-904
Author(s): Stefanie A. Wind, Randall E. Schumacker

The interpretation of ratings from educational performance assessments assumes that rating scale categories are ordered as expected (i.e., higher ratings correspond to higher levels of judged student achievement). However, this assumption must be verified empirically using measurement models that do not impose ordering constraints on the rating scale category thresholds, such as item response theory models based on adjacent-categories probabilities. This study considers the application of an adjacent-categories formulation of polytomous Mokken scale analysis (ac-MSA) models as a method for evaluating the degree to which rating scale categories are ordered as expected for individual raters in performance assessments. Using simulated data, this study builds on the preliminary application of ac-MSA models to rater-mediated performance assessments, in which a real data analysis suggested that these models can be used to identify disordered rating scale categories. The results suggested that ac-MSA models are sensitive to disordered categories within individual raters. Implications are discussed as they relate to research, theory, and practice for rater-mediated educational performance assessments.
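A crude diagnostic in the spirit of the category-ordering question, though far simpler than the ac-MSA models themselves: if a rater's categories are ordered as expected, examinees who receive a higher category should, on average, have higher overall achievement scores. The data and threshold logic below are hypothetical.

```python
# Simplified check of rating scale category ordering for one rater:
# mean examinee total score within each category should increase with
# the category; a decrease flags possible category disordering.
def categories_ordered(ratings, totals):
    by_cat = {}
    for r, t in zip(ratings, totals):
        by_cat.setdefault(r, []).append(t)
    means = [sum(v) / len(v) for _, v in sorted(by_cat.items())]
    return all(b > a for a, b in zip(means, means[1:]))

ratings = [1, 1, 2, 2, 3, 3]          # one rater's assigned categories
totals  = [10, 12, 15, 17, 20, 22]    # examinees' overall scores
print(categories_ordered(ratings, totals))  # True
```

Here category means are 11, 16, and 21, so the ordering holds; reversing the totals would flag disordering.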


2018, Vol 49 (4), pp. 839-867
Author(s): Mirko Antino, Jesús M. Alvarado, Rodrigo A. Asún, Paul Bliese

The need to determine the correct dimensionality of theoretical constructs and generate valid measurement instruments when the underlying items are categorical has generated a significant volume of research in the social sciences. This article presents two studies contrasting different categorical exploratory techniques. The first study compares Mokken scale analysis (MSA) with two factor-based exploratory techniques for noncontinuous variables: item factor analysis and the Normal Ogive Harmonic Analysis Robust Method (NOHARM). Comparisons are conducted across techniques and in reference to the common principal component analysis model using simulated data under conditions of two-dimensionality with different degrees of correlation (r = .0 to .6). The second study shows the theoretical and practical results of using MSA and NOHARM (the factorial technique that performed best in the first study) on two nonsimulated data sets. The nonsimulated data are particularly interesting because MSA was used to settle a theoretical debate. Based on the results from both studies, we show that the ability of NOHARM to detect dimensionality and scalability is similar to that of MSA when the data comprise two uncorrelated latent dimensions; however, NOHARM is preferable when the data are drawn from instruments containing weakly or moderately correlated latent dimensions. This article discusses the theoretical and practical implications of these findings.
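The scalability that MSA assesses is usually summarized by Loevinger's H coefficient: the ratio of summed inter-item covariances to their maximum attainable values given the item proportions. The sketch below is a minimal illustration for dichotomous items, not the MSA software used in the study.

```python
# Illustrative computation of Loevinger's scalability coefficient H for
# dichotomous (0/1) items; rows = persons, columns = items.
def scalability_h(data):
    n, k = len(data), len(data[0])
    p = [sum(row[j] for row in data) / n for j in range(k)]  # item proportions
    num = den = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            pij = sum(row[i] * row[j] for row in data) / n
            cov = pij - p[i] * p[j]
            lo, hi = min(p[i], p[j]), max(p[i], p[j])
            cov_max = lo * (1 - hi)   # maximal covariance given the margins
            num += cov
            den += cov_max
    return num / den

# A perfect Guttman pattern yields H = 1 (a perfectly scalable item set).
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(scalability_h(guttman))  # 1.0
```

Conventional rules of thumb treat H ≥ .3 as the minimum for a usable Mokken scale, with higher values indicating stronger scales.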


2018
Author(s): Jackson Loper, Trygve Bakken, Uygar Sumbul, Gabe Murphy, Hongkui Zeng, ...

Abstract This paper studies measurement linkage. An example from cell biology helps explain the problem: imagine that for a given cell we can either sequence the cell’s RNA or examine its morphology, but not both. Given a cell’s morphology, what do we expect to see in its RNA? Given a cell’s RNA, what do we expect in its morphology? More broadly, given a measurement of one type, can we predict measurements of the other type? This measurement linkage problem arises in many scientific and technological fields. To solve this problem, we develop a nonparametric approach we dub the “Markov link method” (MLM). The MLM makes a conditional independence assumption that holds in many multi-measurement contexts and provides a way to estimate the link, the conditional probability of one type of measurement given the other. We derive conditions under which the MLM estimator is consistent, and we use simulated data to show that it provides accurate measures of uncertainty. We evaluate the MLM on real data generated by a pair of single-cell RNA sequencing techniques. The MLM characterizes the link between them and helps connect the two notions of cell type derived from each technique. Further, the MLM reveals that some aspects of the link cannot be determined from the available data, and suggests new experiments that would allow for better estimates.

Significance Statement Novel experimental techniques are developing quickly, and each technique gives new perspectives. Ideally we would build theories that account for many perspectives at once. This is not easy. One challenge is that many experiments use measurement techniques that alter or destroy the subject, making it impossible to measure the same subject with both techniques and difficult to combine data from different experiments. In this paper we develop the Markov link method, a new tool that overcomes this challenge.
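The conditional independence assumption can be made concrete: if measurements X and Y are independent given the underlying state Z, then P(y|x) = Σ_z P(y|z) P(z|x), so the link between the two measurement types factors through Z. A toy sketch of that factorization (the matrices here are invented for illustration, not taken from the paper's data):

```python
# Toy illustration of the factorization behind the Markov link method:
# P(y|x) = sum over z of P(y|z) * P(z|x), with rows as conditioning values.
def compose_link(p_z_given_x, p_y_given_z):
    n_x = len(p_z_given_x)
    n_z = len(p_y_given_z)
    n_y = len(p_y_given_z[0])
    link = [[0.0] * n_y for _ in range(n_x)]
    for x in range(n_x):
        for z in range(n_z):
            for y in range(n_y):
                link[x][y] += p_z_given_x[x][z] * p_y_given_z[z][y]
    return link

p_z_given_x = [[0.9, 0.1], [0.2, 0.8]]   # each row sums to 1
p_y_given_z = [[0.7, 0.3], [0.1, 0.9]]
link = compose_link(p_z_given_x, p_y_given_z)
# each row of the composed link is still a probability distribution
print([round(sum(row), 6) for row in link])  # [1.0, 1.0]
```

Composing stochastic matrices preserves row-stochasticity, which is why the estimated link remains a valid conditional distribution.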


2021, pp. 001316442110453
Author(s): Stefanie A. Wind

Researchers frequently use Mokken scale analysis (MSA), which is a nonparametric approach to item response theory, when they have relatively small samples of examinees. Researchers have provided some guidance regarding the minimum sample size for applications of MSA under various conditions. However, these studies have not focused on item-level measurement problems, such as violations of monotonicity or invariant item ordering (IIO). Moreover, these studies have focused on problems that occur for a complete sample of examinees. The current study uses a simulation study to consider the sensitivity of MSA item analysis procedures to problematic item characteristics that occur within limited ranges of the latent variable. Results generally support the use of MSA with small samples ( N around 100 examinees) as long as multiple indicators of item quality are considered.
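A minimal sketch of the kind of monotonicity check at issue here, simplified relative to the restscore-group procedure in MSA software: for each item, persons are grouped by their rest score (total score minus the item), and the item mean should be non-decreasing across groups.

```python
# Simplified monotonicity check: an item's mean should not decrease as
# the rest score (total minus the item) increases. Dichotomous items.
def monotone(data, item, tol=0.0):
    groups = {}
    for row in data:
        rest = sum(row) - row[item]
        groups.setdefault(rest, []).append(row[item])
    means = [sum(v) / len(v) for _, v in sorted(groups.items())]
    return all(b >= a - tol for a, b in zip(means, means[1:]))

guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(all(monotone(guttman, j) for j in range(3)))  # True
```

With small samples, rest-score groups are sparse, which is exactly why the study recommends consulting multiple indicators of item quality rather than any single check.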


Metabolites, 2021, Vol 11 (4), pp. 214
Author(s): Aneta Sawikowska, Anna Piasecka, Piotr Kachlicki, Paweł Krajewski

Peak overlapping is a common problem in chromatography, particularly for complex biological mixtures such as metabolite extracts. Because different compounds with similar chromatographic properties can co-elute, peak separation becomes challenging. In this paper, two computational methods of separating peaks, applied for the first time to large chromatographic datasets, are described, compared, and experimentally validated. The methods lead from raw observations to data that can serve as input for statistical analysis. First, in both methods, data are normalized by sample mass, the baseline is removed, retention time alignment is conducted, and peaks are detected. Then, in the first method, clustering is used to separate overlapping peaks, whereas in the second method, functional principal component analysis (FPCA) is applied for the same purpose. Simulated data and experimental results are used as examples to present and compare both methods. Real data were obtained in a study of metabolomic changes in barley (Hordeum vulgare) leaves under drought stress. The results suggest that both methods are suitable for separating overlapping peaks, but an additional advantage of FPCA is the possibility of assessing the variability of individual compounds present within the same peaks of different chromatograms.
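As a toy illustration of the clustering idea in the first method (far simpler than the paper's pipeline), detected peaks can be grouped by retention-time proximity with single-linkage clustering in one dimension; the gap threshold here is an arbitrary choice for the example.

```python
# Toy 1-D single-linkage clustering of detected peak retention times:
# consecutive peaks closer than `gap` minutes join the same cluster.
def cluster_peaks(retention_times, gap=0.5):
    ordered = sorted(retention_times)
    clusters = [[ordered[0]]]
    for t in ordered[1:]:
        if t - clusters[-1][-1] <= gap:
            clusters[-1].append(t)   # continues the current cluster
        else:
            clusters.append([t])     # starts a new cluster
    return clusters

times = [5.10, 1.00, 1.02, 5.00, 9.30]
print(cluster_peaks(times))  # [[1.0, 1.02], [5.0, 5.1], [9.3]]
```

Peaks from different chromatograms that land in the same cluster are then treated as the same compound, which is the alignment step the two methods build on.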


2021, Vol 10 (7), pp. 435
Author(s): Yongbo Wang, Nanshan Zheng, Zhengfu Bian

Since pairwise registration is a necessary step for the seamless fusion of point clouds from neighboring stations, a closed-form solution to planar feature-based registration of LiDAR (Light Detection and Ranging) point clouds is proposed in this paper. Based on the Plücker coordinate-based representation of linear features in three-dimensional space, a quad tuple-based representation of planar features is introduced, which makes it possible to directly determine the difference between any two planar features. Dual quaternions are employed to represent spatial transformation and operations between dual quaternions and the quad tuple-based representation of planar features are given, with which an error norm is constructed. Based on L2-norm-minimization, detailed derivations of the proposed solution are explained step by step. Two experiments were designed in which simulated data and real data were both used to verify the correctness and the feasibility of the proposed solution. With the simulated data, the calculated registration results were consistent with the pre-established parameters, which verifies the correctness of the presented solution. With the real data, the calculated registration results were consistent with the results calculated by iterative methods. Conclusions can be drawn from the two experiments: (1) The proposed solution does not require any initial estimates of the unknown parameters in advance, which assures the stability and robustness of the solution; (2) Using dual quaternions to represent spatial transformation greatly reduces the additional constraints in the estimation process.
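The quad tuple mentioned above can be sketched concretely: a plane ax + by + cz + d = 0 becomes the 4-tuple (a, b, c, d) normalized to a unit normal with a fixed sign convention, after which two planar features can be compared by ordinary vector distance. The sign convention and comparison below are a hypothetical illustration, not the paper's derivation.

```python
import math

# Hypothetical quad-tuple representation of the plane a*x + b*y + c*z + d = 0:
# normalize to a unit normal with d >= 0, then compare two planar features
# by the Euclidean distance between their 4-tuples.
def quad_tuple(a, b, c, d):
    norm = math.sqrt(a * a + b * b + c * c)
    t = (a / norm, b / norm, c / norm, d / norm)
    return tuple(-v for v in t) if t[3] < 0 else t

def plane_difference(p, q):
    return math.dist(quad_tuple(*p), quad_tuple(*q))

# The same plane written with scaled coefficients has zero difference.
print(plane_difference((1, 0, 0, -2), (3, 0, 0, -6)))  # 0.0
```

Making the difference between two planar features a plain vector norm is what allows a closed-form, non-iterative estimate of the registration parameters.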


2021, Vol 22 (1)
Author(s): Camilo Broc, Therese Truong, Benoit Liquet

Abstract Background The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) that takes into account groups of variables, and a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at the variable level and at the group level.

Results Our method has the advantage of proposing a globally readable model while coping with the architecture of the data. It can outperform traditional methods and provides wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and give an example of application on real data with the aim of highlighting common susceptibility variants to breast and thyroid cancers.

Conclusion The joint-sgPLS shows interesting properties for detecting a signal. As an extension of the PLS, the method is suited for data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is adapted to any data with a high number of variables and a known a priori group structure in other application fields.
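The group-level sparsity that sgPLS-style penalties produce comes from a groupwise soft-thresholding operator: a whole group of coefficients (e.g., the SNPs of one gene) is either shrunk together or zeroed out together. A generic sketch of that operator, not the joint-sgPLS code itself:

```python
import math

# Generic groupwise soft-thresholding, the building block of group-sparse
# penalties: shrink each group's coefficient vector by lam, zeroing any
# group whose Euclidean norm falls below lam.
def group_soft_threshold(groups, lam):
    out = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * v for v in g])
    return out

beta = [[3.0, 4.0],    # norm 5.0 -> shrunk but kept
        [0.3, 0.4]]    # norm 0.5 -> zeroed out at lam = 1
print(group_soft_threshold(beta, lam=1.0))
```

Because selection happens at the group level, the fitted model stays readable: whole genes or pathways enter or leave rather than scattered individual variants.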


2021, Vol 11 (2), pp. 582
Author(s): Zean Bu, Changku Sun, Peng Wang, Hang Dong

Calibration between multiple sensors is a fundamental procedure for data fusion. To address the problems of large errors and tedious operation, we present a novel method for calibrating a light detection and ranging (LiDAR) sensor and a camera. We designed a calibration target: an arbitrary triangular pyramid with three chessboard patterns on its three planes. The target contains both 3D and 2D information, which can be used to obtain the intrinsic parameters of the camera and the extrinsic parameters of the system. In the proposed method, the world coordinate system is established through the triangular pyramid. We extract the equations of the triangular pyramid planes to find the relative transformation between the two sensors. A single capture from the camera and the LiDAR is sufficient for calibration, and errors are reduced by minimizing the distance between points and planes. Furthermore, the accuracy can be increased with more captures. We carried out experiments on simulated data with varying degrees of noise and numbers of frames. Finally, the calibration results were verified on real data through incremental validation and analysis of the root mean square error (RMSE), demonstrating that our calibration method is robust and provides state-of-the-art performance.
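The error criterion described above, minimizing the distance between points and planes, reduces at evaluation time to an RMSE over point-to-plane residuals. A generic sketch of that residual computation (not the paper's code; the plane and points are invented):

```python
import math

# Generic point-to-plane RMSE: for a plane with unit normal n and offset d
# (n . p + d = 0 on the plane), the residual of a point p is n . p + d.
def point_to_plane_rmse(points, normal, d):
    residuals = [sum(n * c for n, c in zip(normal, p)) + d for p in points]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

plane_normal, plane_d = (0.0, 0.0, 1.0), -1.0   # the plane z = 1
lidar_points = [(0.3, 0.1, 1.0), (2.0, 5.0, 1.1), (7.0, 0.4, 0.9)]
print(point_to_plane_rmse(lidar_points, plane_normal, plane_d))
```

After the estimated extrinsic transformation is applied to the LiDAR points, a small RMSE against the chessboard planes indicates a good calibration, which is the quantity tracked in the incremental validation.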


2021, Vol 13 (5), pp. 2426
Author(s): David Bienvenido-Huertas, Jesús A. Pulido-Arcas, Carlos Rubio-Bellido, Alexis Pérez-Fargallo

In recent years, studies of the accuracy of algorithms for predicting different aspects of energy use in the building sector have flourished, with energy poverty among the issues receiving considerable attention. Previous studies in this field have characterized it using different indicators, but they have failed to develop instruments to predict the risk of low-income households falling into energy poverty. This research explores how six regression algorithms can forecast the risk of energy poverty by means of the fuel poverty potential risk index. Using data from the national survey of socioeconomic conditions of Chilean households and generating data for different typologies of social dwellings (e.g., form ratio or roof surface area), this study simulated 38,880 cases and compared the accuracy of six algorithms. Multilayer perceptron, M5P, and support vector regression delivered the best accuracy, with correlation coefficients over 99.5%. In terms of computing time, M5P outperforms the rest. Although these results suggest that energy poverty can be accurately predicted using simulated data, it remains necessary to test the algorithms against real data. These results can be useful in devising policies to tackle energy poverty in advance.
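The accuracy figure quoted above is a correlation coefficient between predicted and observed index values. A small sketch of that evaluation step, using Pearson's r with hypothetical numbers rather than the study's data:

```python
import math

# Pearson correlation between observed and predicted values, the accuracy
# metric reported for the regression algorithms (hypothetical data).
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

observed  = [0.10, 0.40, 0.35, 0.80, 0.95]   # fuel poverty index values
predicted = [0.12, 0.38, 0.37, 0.78, 0.96]   # a model's predictions
print(round(pearson_r(observed, predicted), 4))
```

A value above 0.995, as reported for the best three algorithms, means the predictions track the index almost perfectly on the simulated cases.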

