The Impact of Q-matrix Misspecification and Model Misuse on Classification Accuracy in the Generalized DINA Model

Author(s):  
Miao Gao ◽  
M. David Miller ◽  
Ren Liu
2019 ◽  
Vol 38 (5) ◽  
pp. 581-598
Author(s):  
Haiyan Wu ◽  
Xinya Liang ◽  
Hülya Yürekli ◽  
Betsy Jane Becker ◽  
Insu Paek ◽  
...  

The demand for diagnostic feedback has triggered extensive research on cognitive diagnostic models (CDMs), such as the deterministic input, noisy output “and” gate (DINA) model. This study explored two Q-matrix specifications with the DINA model in a statewide large-scale mathematics assessment. The first Q-matrix was developed based on five predefined content reporting categories, and the second was based on the post hoc coding of 15 attributes by test-development experts. Total raw scores correlated strongly with the number of skills mastered, using both Q-matrices. Correlations between the DINA-model item statistics and those from the item response theory analyses were moderate to strong, but were always lower for the 15-skill model. Results highlighted the trade-off between finer-grained modeling and less precise model estimation.
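As a reading aid for the DINA model named in the abstract above, here is a minimal sketch of its standard item response function, P(X_j = 1 | alpha) = (1 - s_j)^eta_j * g_j^(1 - eta_j), where eta_j equals 1 only if the examinee has mastered every attribute the Q-matrix requires for item j. The function name `dina_prob` and the Q-matrix, slip, and guessing values are hypothetical and are not taken from the study.

```python
import numpy as np

def dina_prob(alpha, q_matrix, slip, guess):
    """Probability of a correct response to each item under the DINA model.

    alpha    : (K,) 0/1 vector of attribute mastery for one examinee
    q_matrix : (J, K) 0/1 matrix; q[j, k] = 1 if item j requires attribute k
    slip     : (J,) slip parameters s_j
    guess    : (J,) guessing parameters g_j
    """
    alpha = np.asarray(alpha)
    q_matrix = np.asarray(q_matrix)
    # eta_j = 1 only when every required attribute is mastered (the "and" gate).
    eta = np.all(alpha >= q_matrix, axis=1).astype(float)
    return (1.0 - np.asarray(slip)) ** eta * np.asarray(guess) ** (1.0 - eta)

# Hypothetical 3-item, 2-attribute example (illustrative values only).
Q = np.array([[1, 0], [0, 1], [1, 1]])
slip = np.array([0.1, 0.2, 0.1])
guess = np.array([0.2, 0.25, 0.1])

# An examinee who has mastered attribute 1 but not attribute 2.
probs = dina_prob([1, 0], Q, slip, guess)
print(probs)
```

Only the first item has all of its required attributes mastered, so it gets probability 1 - s; the other two items fall back to their guessing parameters. Adding columns to the Q-matrix (as in the 15-attribute specification) increases the number of mastery patterns the model must distinguish.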


2013 ◽  
Vol 44 (4) ◽  
pp. 558-568 ◽  
Author(s):  
Dong-Bo TU ◽  
Yan CAI ◽  
Hai-Qi DAI

2021 ◽  
pp. 014662162110138
Author(s):  
Joseph A. Rios ◽  
James Soland

Suboptimal effort is a major threat to valid score-based inferences. While the effects of such behavior have been frequently examined in the context of mean group comparisons, minimal research has considered its effects on individual score use (e.g., identifying students for remediation). Focusing on the latter context, this study addressed two related questions via simulation and applied analyses. First, we investigated how much including noneffortful responses in scoring using a three-parameter logistic (3PL) model affects person parameter recovery and classification accuracy for noneffortful responders. Second, we explored whether improvements in these individual-level inferences were observed when employing the Effort Moderated IRT (EM-IRT) model under conditions in which its assumptions were met and violated. Results demonstrated that including 10% noneffortful responses in scoring led to average bias in ability estimates and misclassification rates by as much as 0.15 SDs and 7%, respectively. These results were mitigated when employing the EM-IRT model, particularly when model assumptions were met. However, once model assumptions were violated, the EM-IRT model’s performance deteriorated, though still outperforming the 3PL model. Thus, findings from this study show that (a) including noneffortful responses when using individual scores can lead to potential unfounded inferences and potential score misuse, and (b) the negative impact that noneffortful responding has on person ability estimates and classification accuracy can be mitigated by employing the EM-IRT model, particularly when its assumptions are met.
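To make the comparison in the abstract above concrete, here is a rough sketch of ability estimation under the 3PL model with and without effort moderation. It assumes one common description of the effort-moderated approach, in which responses flagged as noneffortful are treated as uninformative about ability and therefore drop out of the likelihood. The item parameters, responses, effort flags, and grid-search estimator are all hypothetical and are not the authors' simulation design.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def log_lik(theta, resp, a, b, c, effortful=None):
    """Log-likelihood of a response vector at ability theta.

    If `effortful` (boolean array) is given, flagged-noneffortful responses
    are dropped, which is the simplifying assumption made in this sketch.
    """
    p = p_3pl(theta, a, b, c)
    ll = resp * np.log(p) + (1 - resp) * np.log(1.0 - p)
    if effortful is not None:
        ll = ll[effortful]
    return ll.sum()

def estimate_theta(resp, a, b, c, effortful=None):
    """Maximum-likelihood ability estimate by grid search on [-4, 4]."""
    grid = np.linspace(-4.0, 4.0, 801)
    lls = [log_lik(t, resp, a, b, c, effortful) for t in grid]
    return grid[int(np.argmax(lls))]

# Hypothetical 10-item test; the last two responses are flagged as rapid guesses.
a = np.ones(10)
b = np.array([-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, -1.0, -0.5])
c = np.full(10, 0.2)
resp = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0, 0])
effortful = np.array([True] * 8 + [False] * 2)

theta_all = estimate_theta(resp, a, b, c)            # all responses scored
theta_em = estimate_theta(resp, a, b, c, effortful)  # flagged responses ignored
print(theta_all, theta_em)
```

Because the two flagged responses are incorrect, scoring them can only pull the estimate downward, so `theta_em` is never lower than `theta_all` in this example.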


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Tawfik Yahya ◽  
Nur Azah Hamzaid ◽  
Sadeeq Ali ◽  
Farahiyah Jasni ◽  
Hanie Nadia Shasmin

A transfemoral prosthesis is required to assist amputees in performing activities of daily living (ADL). The passive prosthesis has some drawbacks, such as high metabolic energy use. In contrast, the active prosthesis consumes less metabolic energy and offers better performance. However, recent active prostheses use surface electromyography as their sensory system, which has weak signals with microvolt-level intensity and requires a lot of computation to extract features. This paper focuses on recognizing different phases of sitting and standing of a transfemoral amputee using in-socket piezoelectric-based sensors. Fifteen piezoelectric film sensors were embedded in the inner socket wall adjacent to the most active regions of the agonist and antagonist knee extensor and flexor muscles, i.e., the regions with the highest level of muscle contraction of the quadriceps and hamstring. A male transfemoral amputee wore the instrumented socket and was instructed to perform several sitting and standing phases using an armless chair. Data were collected from the 15 embedded sensors and passed through signal-conditioning circuits. The overlapping analysis window technique was used to segment the data using different window lengths. Fifteen time-domain and frequency-domain features were extracted, and new feature sets were obtained based on feature performance. Eight common multiclass pattern recognition classifiers were evaluated and compared. Regression analysis was used to investigate the impact of the number of features and the window lengths on the classifiers’ accuracies, and analysis of variance (ANOVA) was used to test for significant differences in the classifiers’ performances. Classification accuracy was calculated using k-fold cross-validation, and 20% of the data set was held out for testing the optimal classifier. The results showed that the feature set (FS-5) consisting of the root mean square (RMS) and the number of peaks (NP) achieved the highest classification accuracy in five classifiers. A support vector machine (SVM) with a cubic kernel proved to be the optimal classifier, achieving a classification accuracy of 98.33% on the test data set. Obtaining high classification accuracy using only two time-domain features would significantly reduce the processing time of controlling a prosthesis and eliminate substantial delay. The proposed in-socket sensors used to detect sit-to-stand and stand-to-sit movements could be further integrated with an active knee joint actuation system to produce powered assistance during energy-demanding activities such as sit-to-stand and stair climbing. In future, the system could also be used to accurately predict the intended movement based on the muscle and mechanical behaviour of the amputee’s residual limb, as detected by the in-socket sensory system.
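The segmentation and feature-extraction step described in the abstract above can be sketched roughly as follows for a single sensor channel. The abstract does not define how the number of peaks is counted, so this sketch assumes strict local maxima. The function names, window length, step size, and toy signal are hypothetical.

```python
import numpy as np

def sliding_windows(signal, window, step):
    """Split a 1-D signal into overlapping windows (overlap = window - step)."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - window + 1, step)
    return np.array([signal[s:s + window] for s in starts])

def rms(w):
    """Root mean square of one window."""
    return np.sqrt(np.mean(w ** 2))

def number_of_peaks(w):
    """Count strict local maxima inside one window (assumed NP definition)."""
    return int(np.sum((w[1:-1] > w[:-2]) & (w[1:-1] > w[2:])))

def extract_features(signal, window, step):
    """Return one [RMS, NP] row per window."""
    return np.array([[rms(w), number_of_peaks(w)]
                     for w in sliding_windows(signal, window, step)])

# Toy signal, window of 4 samples with 50% overlap (step of 2 samples).
toy = [0, 1, 0, 2, 0, 3, 0, 1]
features = extract_features(toy, window=4, step=2)
print(features)
```

Each row of `features` could then be fed to a classifier. Both features need only a single pass over the window, which is consistent with the abstract's point about reduced processing time.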


2021 ◽  
Vol 10 (7) ◽  
pp. 436
Author(s):  
Amerah Alghanim ◽  
Musfira Jilani ◽  
Michela Bertolotto ◽  
Gavin McArdle

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures which compare VGI to authoritative data sources such as National Mapping Agencies are common, but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures which compare the data to heuristics or models built from the VGI data are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality where they can infer and predict the properties of spatial data. In this article we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed novel approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue we have also developed a new measure for this using direct and indirect characteristics of OSM data such as its edit history along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.


2018 ◽  
Vol 43 (7) ◽  
pp. 527-542 ◽  
Author(s):  
Chunhua Kang ◽  
Yakun Yang ◽  
Pingfei Zeng

A Q-matrix, which reflects how attributes are measured for each item, is necessary when applying a cognitive diagnosis model to an assessment. In most cases, the Q-matrix is constructed by experts in the field and may be subjective and incorrect. One efficient method to refine the Q-matrix is to employ a suitable statistic that is calculated using response data. However, this approach is limited by its need to estimate all items in the Q-matrix even if only some are incorrect. To address this challenge, this study proposes an item fit statistic, the root mean square error of approximation (RMSEA), for validating a Q-matrix with the deterministic inputs, noisy, “and” (DINA) model. Using a search algorithm, two simulation studies were performed to evaluate the effectiveness and efficiency of the proposed method at recovering Q-matrices. Results showed that using RMSEA can help define attributes in a Q-matrix. A comparison with the existing Delta method and residual sum of squares (RSS) method revealed that the proposed method had higher mean recovery rates and can be used to identify and correct Q-matrix misspecifications. When no error exists in the Q-matrix, the proposed method does not modify the correct Q-matrix.


2020 ◽  
Author(s):  
Casey L. Trevino ◽  
Jack J. Lin ◽  
Indranil Sen-Gupta ◽  
Beth A. Lopour

High-frequency oscillations (HFOs) are a promising biomarker of epileptogenicity, and automated algorithms are critical tools for their detection. However, previously validated algorithms often exhibit decreased HFO detection accuracy when applied to a new data set if the parameters are not optimized. This likely contributes to decreased seizure localization accuracy, but this has never been tested. Therefore, we evaluated the impact of parameter selection on seizure onset zone (SOZ) localization using automatically detected HFOs. Using an automated algorithm, we detected HFOs in intracranial EEG from twenty medically refractory epilepsy patients with seizure-free surgical outcomes. For each patient, we assessed classification accuracy of channels inside/outside the SOZ using a wide range of detection parameters and identified the parameters associated with maximum classification accuracy. We found that only three out of twenty patients achieved maximal localization accuracy using conventional HFO detection parameters, and optimal parameter ranges varied significantly across patients. The parameters for amplitude threshold and root-mean-square window had the greatest impact on SOZ localization accuracy; minimum event duration and rejection of false-positive events did not significantly affect the results. Using individualized optimal parameters led to substantial improvements in localization accuracy, particularly in reducing false positives from non-SOZ channels. We conclude that optimal HFO detection parameters are patient-specific, often differ from conventional parameters, and have a significant impact on SOZ localization. This suggests that individual variability should be considered when implementing automatic HFO detection as a tool for surgical planning.
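To illustrate how the three parameters named in the abstract above (amplitude threshold, root-mean-square window, minimum event duration) can interact, here is a generic RMS-threshold event detector. It is an assumption-laden sketch, not the authors' algorithm: the input is assumed to be already band-pass filtered, the threshold is assumed to be the mean plus a multiple of the standard deviation of the moving RMS, and the function name, parameter values, and synthetic signal are hypothetical.

```python
import numpy as np

def detect_events(x, fs, rms_window_s, threshold_sd, min_duration_s):
    """Return (start, stop) sample indices of supra-threshold RMS events.

    x              : 1-D band-pass-filtered signal (assumed)
    fs             : sampling rate in Hz
    rms_window_s   : moving RMS window length in seconds
    threshold_sd   : threshold = mean(RMS) + threshold_sd * std(RMS)
    min_duration_s : events shorter than this are discarded
    """
    x = np.asarray(x, dtype=float)
    n = max(1, int(round(rms_window_s * fs)))
    # Moving RMS: square, average over n samples, take the square root.
    moving_rms = np.sqrt(np.convolve(x ** 2, np.ones(n) / n, mode="same"))
    above = moving_rms > moving_rms.mean() + threshold_sd * moving_rms.std()
    # Locate rising and falling edges of the supra-threshold mask.
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, stops = edges[::2], edges[1::2]  # stop index is exclusive
    min_len = int(round(min_duration_s * fs))
    return [(int(s), int(e)) for s, e in zip(starts, stops) if e - s >= min_len]

# Synthetic test signal: low-amplitude noise with one 50 ms, 100 Hz burst.
fs = 1000
rng = np.random.default_rng(0)
signal = rng.normal(0.0, 0.1, 2000)
t = np.arange(50) / fs
signal[1000:1050] += 5.0 * np.sin(2 * np.pi * 100 * t)

events = detect_events(signal, fs, rms_window_s=0.010,
                       threshold_sd=3.0, min_duration_s=0.006)
print(events)
```

Changing `rms_window_s` or `threshold_sd` changes both which events are found and where their boundaries fall, which gives a sense of why detection parameters tuned on one data set may not transfer to another patient.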

