score reliability
Recently Published Documents


TOTAL DOCUMENTS: 122 (FIVE YEARS: 17)

H-INDEX: 23 (FIVE YEARS: 2)

2021 ◽  
Author(s):  
Martin Rempel ◽  
Peter Schaumann ◽  
Ulrich Blahak ◽  
Volker Schmidt

Reliable precipitation forecasts on the very-short-range time scale are essential for precise warnings and can increase the lead time available to decision makers in civil protection and emergency services. In operational weather forecasting, the prediction of and warning against heavy convective precipitation within the first two hours rely on radar-based nowcasting methods, while convection-permitting ensemble prediction systems are used for later lead times.

Within the SINFONY project (Seamless INtegrated FOrecastiNg sYstem) of Deutscher Wetterdienst (DWD), an integrated convective-scale ensemble system for very-short-range forecasting is being developed. To facilitate the optimal combination of the previously independent nowcasting and numerical weather prediction (NWP) systems, STEPS-DWD, an adaptation of the widely used STEPS approach (e.g., Seed, 2003; Bowler et al., 2006), was put into test operation as the nowcasting ensemble. The NWP component is ICON-D2-RUC, which currently provides hourly initialized ensemble forecasts up to +8 h at a horizontal resolution of 2.2 km. Core components of this model version are a two-moment microphysics scheme and the additional assimilation of high-resolution remote sensing data such as 3D radar data and Meteosat SEVIRI data.

Based on these two ensemble systems, STEPS-DWD and ICON-D2-RUC, two methods for combining their forecasts are presented. The first method adapts the approach of Nerini et al. (2019), in which forecasts of reflectivities and rain rates are combined in physical space using an ensemble Kalman filter. With a temporal resolution of five minutes and a spatial resolution of 1 × 1 km, this provides an estimate of the further development up to +6 h while preserving the realistic appearance of the precipitation systems.

In addition, a new statistical method is presented in which forecast precipitation totals are combined in probability space using neural networks (NN; cf. Schaumann et al., 2021). The goal is to obtain, with a single training, forecasts that are both seamless and calibrated, and whose exceedance probabilities are consistent across all thresholds. Three data sets of three months each were used for optimization: data sets A and B contain Ensemble-MOS and RadVOR forecasts at a horizontal resolution of 20 km, while data set C contains forecasts from a three-hourly initialized ICON-D2-RUC as well as STEPS-DWD at a resolution of 2.2 km. The hyperparameters of the NN were optimized on data set A, and the resulting NN were validated on data sets B and C using rolling-origin validation. Forecasts are produced with a temporal resolution of 1 h up to +6 h.

For both methods, several verification metrics (FSS, bias, Brier skill score, reliability, and reliability diagrams) show that the combined forecasts are equal to or better than those of the individual systems at all lead times.
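As a toy illustration of combining forecasts in probability space, the sketch below blends nowcast and NWP exceedance probabilities with a lead-time-dependent weight. The exponential weighting function, its e-folding time, and all numbers are illustrative assumptions; the method described above is a trained neural network, not this fixed rule.

```python
import numpy as np

def blend_exceedance_probs(p_nowcast, p_nwp, lead_time_h, tau=2.0):
    """Blend two exceedance-probability fields for one threshold.

    p_nowcast, p_nwp : arrays of P(precip > threshold) on a common grid
    lead_time_h      : forecast lead time in hours
    tau              : assumed e-folding time (h) of nowcast skill

    The weight decays with lead time, so the blend relaxes from the
    radar-based nowcast toward the NWP ensemble -- a simple stand-in
    for the trained NN combination described in the abstract.
    """
    w = np.exp(-lead_time_h / tau)          # nowcast weight in [0, 1]
    return w * np.asarray(p_nowcast) + (1.0 - w) * np.asarray(p_nwp)

# Example: at +3 h the NWP ensemble already dominates the blend.
p_now = np.array([0.9, 0.7, 0.1])
p_nwp = np.array([0.5, 0.6, 0.2])
print(blend_exceedance_probs(p_now, p_nwp, lead_time_h=3.0))
```

Because the same weight applies to every threshold, a monotone blend like this keeps exceedance probabilities consistent across thresholds, which is the consistency property the NN method is trained to achieve.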


2021 ◽  
Vol 10 (4) ◽  
pp. 2121-2131
Author(s):  
Mustafa Ali ◽  
Mohammed A.

<p style="text-align: justify;">The academic buoyancy scale (ABS) is one of the most widely used instruments for measuring academic buoyancy. To obtain meaningful and valid comparisons across groups using ABS, however, measurement invariance should be ascertained a priori. To that end, we examined its measurement invariance, validity evidence based on relations to other variables, and score reliability using categorical omega across culture and gender among Egyptian and Omani undergraduates. Participants were 345 college students: Egyptian sample (N=191) and Omani sample (N=154). To assess measurement invariance across culture and gender, multiple–group confirmatory factor analysis was performed with four successive invariance models: (a) configural, (b) metric, (c) scalar, and (d) residual. Results revealed that the unidimensional baseline model had adequate fit to the data in the full sample. Moreover, measurement invariance was found to hold across culture but not across gender and consequently the ABS could be used to yield valid cross-cultural comparisons between the Egyptian and Omani students. Conversely, it cannot be used to yield valid inferences related to comparing gender groups within each culture. Validity evidence based on relations to other variables was supported by the significantly moderate correlation between ABS and academic achievement (GPA; r =.435 and r = .457, P < .01) for the Egyptian and Omani samples, respectively. With regard to score reliability, categorical omega coefficients were moderate across both samples. Educational and psychological implications, limitations and suggestions for improving the scale are discussed.</p>


2021 ◽  
Vol 10 (5) ◽  
pp. 1053
Author(s):  
Agnieszka Ćwirlej-Sozańska ◽  
Bernard Sozański ◽  
Mateusz Kupczyk ◽  
Justyna Leszczak ◽  
Andrzej Kwolek ◽  
...  

Background: Huntington’s disease is a progressive neurodegenerative disorder that usually manifests in adulthood and is inherited in an autosomal dominant manner. The main aim of the study was to assess the psychometric properties of the 12-item WHO Disability Assessment Schedule (WHODAS) 2.0 for studying the level of disability in people with Huntington’s disease. Method: This cross-sectional study covered 128 people with Huntington’s disease living in Poland. We examined scale score reliability, internal consistency, convergent validity, and known-group validity. The disability and quality of life of people with Huntington’s disease were also assessed. Results: The score reliability of the entire tool for the research group was high. Cronbach’s α for the whole scale was 0.97, and for individual domains it ranged from 0.79 to 0.95. Temporal consistency was 0.99 for the overall result and ranged from 0.91 to 0.99 for particular domains, confirming that the scale is consistent over time. All of the 12-item WHODAS 2.0 domains correlated negatively with all of the Huntington Quality of Life Instrument (H-QoL-I) domains, and all correlation coefficients were statistically significant at the level of p < 0.001. A linear regression model showed that with each one-point decrease in BMI, the level of disability increases by an average of 0.83 points on the 12-item WHODAS 2.0 scale, and with each additional year of the disease, by an average of 1.39 points. Conclusions: This is the first study assessing disability by means of the WHODAS 2.0 in the HD patient population in Poland, and it is also one of the few studies evaluating the validity of the WHODAS 2.0 scale for assessing the disability of people with HD in accordance with the recommendations of DSM-5. We have confirmed that the 12-item WHODAS 2.0 is an effective tool for assessing disability and changes in functioning among people with Huntington’s disease.
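For reference, a minimal sketch of the Cronbach's α computation reported above; the item responses are made-up values, not study data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)        # variance of each item
    total_var = x.sum(axis=1).var(ddof=1)    # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical responses of 5 people to a 4-item scale (0-4 ratings).
scores = np.array([
    [0, 1, 0, 1],
    [2, 2, 3, 2],
    [1, 1, 1, 2],
    [4, 3, 4, 4],
    [3, 3, 2, 3],
])
print(round(cronbach_alpha(scores), 3))
```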


2021 ◽  
pp. 107699862098694
Author(s):  
Zhengguo Gu ◽  
Wilco H. M. Emons ◽  
Klaas Sijtsma

Clinical, medical, and health psychologists use difference scores obtained from pretest–posttest designs employing the same test to assess intraindividual change possibly caused by an intervention addressing, for example, anxiety, depression, eating disorders, or addiction. The reliability of difference scores is important for interpreting observed change. This article compares the well-documented traditional method and the unfamiliar, rarely used item-level method for estimating difference-score reliability. We simulated data under various conditions typical of change assessment in pretest–posttest designs. The item-level method had smaller bias and greater precision than the traditional method and may be recommended for practical use.
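For context, a minimal sketch of one common form of the traditional classical-test-theory estimate of difference-score reliability that the article compares against; the input values are illustrative assumptions.

```python
def difference_score_reliability(var_pre, rel_pre, var_post, rel_post, r_xy):
    """Traditional CTT reliability of a difference score D = post - pre.

    rho_D = (var_pre*rel_pre + var_post*rel_post - 2*r_xy*sd_pre*sd_post)
            / (var_pre + var_post - 2*r_xy*sd_pre*sd_post)

    A high pretest-posttest correlation shrinks the numerator faster
    than the denominator, which is why difference scores are often
    (but not always) less reliable than their constituent scores.
    """
    sd_pre, sd_post = var_pre ** 0.5, var_post ** 0.5
    cov_term = 2 * r_xy * sd_pre * sd_post
    return (var_pre * rel_pre + var_post * rel_post - cov_term) / \
           (var_pre + var_post - cov_term)

# Hypothetical values: equally reliable tests, moderately correlated.
print(round(difference_score_reliability(100, 0.85, 100, 0.85, 0.6), 3))
```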


2020 ◽  
Author(s):  
Kazuhiro Yamaguchi ◽  
Jonathan Templin

Quantifying the reliability of latent variable estimates in diagnostic classification models has been a difficult topic, complicated by the classification-based nature of these models. In this study, we derive observed score reliability indices based on diagnostic classification models as an extension of classical test theory-based reliability. Additionally, we derive conditional observed sum- and sub-score distributions. In this manner, various conditional expectations and conditional standard error of measurement estimates can be calculated for both total- and sub-scores of a test. The proposed methods provide a variety of expectations and standard errors for attribute estimates, which we demonstrate in an analysis of an empirical test.
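As an illustration of how a conditional sum-score distribution can be built from item-response probabilities given a latent attribute class, here is a minimal sketch using the standard recursion over items; the item probabilities are made-up numbers, not the authors' model.

```python
import numpy as np

def sum_score_distribution(p_correct):
    """Distribution of the total score given per-item success probabilities.

    Uses a Lord-Wingersky-style recursion: after each item, the
    probability mass over possible sum scores is updated by convolving
    with a Bernoulli(p_item). In a diagnostic classification model,
    p_correct would be the item-response probabilities conditional on
    one attribute profile.
    """
    dist = np.array([1.0])                  # P(score = 0) before any item
    for p in p_correct:
        new = np.zeros(len(dist) + 1)
        new[:-1] += dist * (1 - p)          # item answered incorrectly
        new[1:] += dist * p                 # item answered correctly
        dist = new
    return dist

# Hypothetical item probabilities for one attribute class.
dist = sum_score_distribution([0.9, 0.8, 0.6, 0.7])
scores = np.arange(len(dist))
expected = np.dot(scores, dist)                          # conditional expectation
sem = np.sqrt(np.dot((scores - expected) ** 2, dist))    # conditional SEM
print(dist.round(4), round(expected, 3), round(sem, 3))
```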


2020 ◽  
Author(s):  
Peter E Clayson ◽  
Scott Baldwin ◽  
Michael J. Larson

In studies of event-related brain potentials (ERPs), difference scores between conditions in a task are frequently used to isolate neural activity for use as a dependent or independent variable. Adequate score reliability is a prerequisite for studies examining relationships between ERPs and external correlates, but there is a widely held view that difference scores are inherently unreliable and unsuitable for studies of individual differences. This view fails to consider the nuances of difference-score reliability that are relevant to ERP research. In the present study, we provide formulas from classical test theory and generalizability theory for estimating the internal consistency of subtraction-based and residualized difference scores. These formulas are then applied to error-related negativity (ERN) and reward positivity (RewP) difference scores from the same sample of 117 participants. Analyses demonstrate that ERN difference scores can be reliable, which supports their use in studies of individual differences. However, RewP difference scores yielded poor reliability due to the high correlation between the constituent reward and non-reward ERPs. Findings emphasize that difference-score reliability largely depends on the internal consistency of the constituent scores and the correlation between them. Furthermore, generalizability theory yielded higher internal consistency estimates for subtraction-based difference scores than classical test theory did. Despite some beliefs that difference scores are inherently unreliable, ERP difference scores can show adequate reliability and be useful for isolating neural activity in studies of individual differences.
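To make the two difference-score types concrete, below is a minimal sketch of subtraction-based and residualized difference scores for two ERP conditions; the amplitude values are made-up, and the study's reliability formulas themselves are not reproduced here.

```python
import numpy as np

def difference_scores(cond_a, cond_b):
    """Subtraction-based and residualized difference scores.

    Subtraction: simple condition difference per participant.
    Residualized: regress condition B on condition A by ordinary least
    squares and keep the residuals, removing variance shared with A.
    """
    a = np.asarray(cond_a, dtype=float)
    b = np.asarray(cond_b, dtype=float)
    subtraction = b - a
    slope, intercept = np.polyfit(a, b, 1)     # OLS fit of b on a
    residualized = b - (intercept + slope * a)
    return subtraction, residualized

# Hypothetical mean amplitudes (microvolts) for correct vs. error trials.
correct = np.array([2.1, 3.4, 1.8, 4.0, 2.9])
error = np.array([-4.5, -2.1, -5.0, -1.2, -3.3])
sub, res = difference_scores(correct, error)
print(sub.round(2), res.round(2))
```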


2020 ◽  
Author(s):  
Peter E Clayson ◽  
Kaylie Amanda Carbine ◽  
Scott Baldwin ◽  
Joseph A. Olsen ◽  
Michael J. Larson

The reliability of event-related brain potential (ERP) scores depends on the study context and how those scores will be used, and reliability must therefore be routinely evaluated. Many factors can influence ERP score reliability, and generalizability (G) theory provides a multifaceted approach to estimating the internal consistency and temporal stability of scores that is well suited to ERPs. G theory's approach possesses a number of advantages over classical test theory that make it ideal for pinpointing sources of error in scores. The current primer outlines the G-theory approach to estimating internal consistency (coefficients of equivalence) and test-retest reliability (coefficients of stability), and applies it to evaluate the reliability of ERP measurements. The primer outlines how to estimate reliability coefficients that consider the impact of the number of trials, events, occasions, and groups. The uses of two different G-theory reliability coefficients (i.e., generalizability and dependability) in ERP research are elaborated, and a dataset from the companion manuscript, which examines N2 amplitudes to Go/NoGo stimuli, serves as an example of applying these coefficients to ERPs. The developed algorithms are implemented in the ERP Reliability Analysis (ERA) Toolbox, open-source software designed for estimating score reliability using G theory. The toolbox facilitates the application of G theory in an effort to simplify the study-by-study evaluation of ERP score reliability. The formulas provided in this primer should enable researchers to pinpoint the sources of measurement error in ERP scores from multiple recording sessions and subsequently plan studies that optimize score reliability.
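As a simplified illustration of the two coefficients discussed above, here is a sketch of generalizability and dependability coefficients for a fully crossed persons × trials design, with variance components estimated from the standard ANOVA expected-mean-square equations. The single-trial data are simulated, and the designs handled by the ERA Toolbox are more elaborate (multiple facets such as events, occasions, and groups) than this one-facet example.

```python
import numpy as np

def g_theory_coefficients(x):
    """Generalizability (relative) and dependability (absolute)
    coefficients for a fully crossed persons x trials design.

    x is an (n_persons, n_trials) matrix of single-trial scores.
    """
    n_p, n_t = x.shape
    grand = x.mean()
    ss_p = n_t * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_t = n_p * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_res = ((x - grand) ** 2).sum() - ss_p - ss_t
    ms_p = ss_p / (n_p - 1)
    ms_t = ss_t / (n_t - 1)
    ms_res = ss_res / ((n_p - 1) * (n_t - 1))
    var_p = max((ms_p - ms_res) / n_t, 0.0)   # person variance
    var_t = max((ms_t - ms_res) / n_p, 0.0)   # trial variance
    var_res = ms_res                          # person x trial + error
    g = var_p / (var_p + var_res / n_t)                 # relative error only
    phi = var_p / (var_p + (var_t + var_res) / n_t)     # absolute error
    return g, phi

# Simulated single-trial amplitudes: 4 participants x 6 trials.
rng = np.random.default_rng(0)
person_effects = rng.normal(0, 2, size=(4, 1))
data = person_effects + rng.normal(0, 1, size=(4, 6))
g, phi = g_theory_coefficients(data)
print(round(g, 3), round(phi, 3))
```

Dependability (Φ) counts the trial main effect as error while generalizability (Eρ²) does not, which is why Φ is never larger than Eρ² for the same data.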

