Testing the Tests: What Are the Impacts of Incorrect Assumptions When Applying Confidence Intervals or Hypothesis Tests to Compare Competing Forecasts?

2018, Vol 146 (6), pp. 1685-1703
Author(s): Eric Gilleland, Amanda S. Hering, Tressa L. Fowler, Barbara G. Brown

Which of two competing continuous forecasts is better? This question is often asked in forecast verification, as well as climate model evaluation. Traditional statistical tests seem to be well suited to the task of providing an answer. However, most such tests do not account for some of the special underlying circumstances that are prevalent in this domain. For example, model output is seldom independent in time, and the models being compared are geared to predicting the same state of the atmosphere, and thus they could be contemporaneously correlated with each other. These types of violations of the assumptions of independence required for most statistical tests can greatly impact the accuracy and power of these tests. Here, this effect is examined on simulated series for many common testing procedures, including two-sample and paired t and normal approximation z tests, the z test with a first-order variance inflation factor applied, and the newer Hering–Genton (HG) test, as well as several bootstrap methods. While it is known how most of these tests will behave in the face of temporal dependence, it is less clear how contemporaneous correlation will affect them. Moreover, it is worthwhile knowing just how badly the tests can fail so that if they are applied, reasonable conclusions can be drawn. It is found that the HG test is the most robust to both temporal dependence and contemporaneous correlation, as well as the specific type and strength of temporal dependence. Bootstrap procedures that account for temporal dependence stand up well to contemporaneous correlation and temporal dependence, but require large sample sizes to be accurate.
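The z test with a first-order variance inflation factor mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the loss differential between the two forecasts follows an AR(1) process, so that the variance of its mean is inflated by (1 + r1) / (1 - r1), where r1 is the lag-1 sample autocorrelation. The function names `lag1_autocorr` and `inflated_z_test` are hypothetical.

```python
import math
import numpy as np

def lag1_autocorr(d):
    """Lag-1 sample autocorrelation of a 1-D series."""
    d = np.asarray(d, dtype=float)
    c = d - d.mean()
    denom = float(np.dot(c, c))
    if denom == 0.0:
        return 0.0  # constant series: no autocorrelation to estimate
    return float(np.dot(c[:-1], c[1:]) / denom)

def inflated_z_test(loss_a, loss_b):
    """Paired z test on the loss differential with an AR(1) variance inflation factor.

    Assumption (not from the source): the differential d = loss_a - loss_b is
    roughly AR(1), so Var(mean(d)) is approximated by s^2 * (1 + r1) / (1 - r1) / n.
    """
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    n = d.size
    r1 = lag1_autocorr(d)
    vif = (1.0 + r1) / (1.0 - r1)          # first-order variance inflation factor
    se = math.sqrt(d.var(ddof=1) * vif / n)  # inflated standard error of the mean
    z = float(d.mean()) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value under N(0, 1)
    return z, p, vif
```

Pairing the two loss series before testing is what lets the statistic absorb contemporaneous correlation between the forecasts; the inflation factor only addresses temporal dependence, and only to first order.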

2020
Author(s): Gabriely S. Folli, Márcia H.C. Nascimento, Ellisson H. de Paulo, Pedro H.P. da Cunha, Wanderson Romão, ...

2021, Vol 4 (1), pp. 61
Author(s): Muchsin Riviwanto, Darwel Darwel, Defriani Dwiyanti, Juanda Juanda

People with disabilities are vulnerable to disaster risk. Most families that include a person with a disability worry about protecting themselves in the event of a disaster, and they have had little exposure to disaster mitigation efforts. This research provides an overview of the preparedness of families of children with disabilities in increasing disaster resilience. Analytical research was conducted on families of children with disabilities in the city of Padang. Data were collected with a standard questionnaire from LIPI-UNESCO/ISDR, processed by computer, and analyzed using multiple regression. The results on these families' preparedness for disasters were as follows: the knowledge category was ready (42.2%), the preparedness plan category was not ready (37.8%), the disaster warning category was not ready (46.7%), the resource mobilization category was not ready, ready (82.2%), and the tsunami disaster preparedness index value was 57% (ready category). The authors recommend that local governments provide special support for people with disabilities by increasing training, seminars, and disaster simulations.


2002, Vol 14 (3), pp. 391-403
Author(s): Trevor A. Craney, James G. Surles

2020, Vol 18 (1), pp. 43
Author(s): Agung K Henaulu, Sony Ardian

The purpose of this study is to test the service quality of marine tourism operators in the village of Suli using statistical tests. The hypotheses ask whether the independent variables responsiveness, reliability, assurance, empathy, and tangibles each have a positive (significant) effect on service quality, and whether all of these independent variables jointly have a positive effect on service quality. Travel has become an important need, because it provides new experiences, information, and knowledge. All of this can be obtained when the service provided by operators leaves a strong impression, particularly on tourists with disabilities. The results show that the reliability test gave a Spearman-Brown value of 0.9352, which falls in the very high category. The normality tests, using the Kolmogorov-Smirnov and Jarque-Bera methods, gave p-values of 0.779 and 0.809 respectively, both > 0.05, so the data (normality) assumption is satisfied. The multicollinearity test shows variance inflation factor values < 10, so there is no multicollinearity. Homoscedasticity is satisfied, with p-values (sig) > 0.05 for all independent variables. The non-autocorrelation test using Durbin-Watson, with an acceptable range of 1 to 3, gave a value of 2.36. The coefficient of determination is 0.8042, which is close to 1 and far from 0. In the F test, the p-value is significant at < 0.05, so all independent variables jointly affect the dependent variable.
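The two regression diagnostics named above, the variance inflation factor and the Durbin-Watson statistic, can be computed directly from their definitions. The sketch below is illustrative only: it uses plain NumPy least squares, the function names are hypothetical, and it does not reproduce the study's data or software.

```python
import numpy as np

def ols_residuals(X, y):
    """Residuals of an ordinary least squares fit of y on X with an intercept."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        resid = ols_residuals(others, X[:, j])
        tss = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        rss = float(resid @ resid)
        vifs.append(tss / rss)  # algebraically equal to 1 / (1 - R_j^2)
    return np.array(vifs)

def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 suggests no first-order autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))
```

A common rule of thumb, and the one the abstract applies, treats VIF values below 10 as acceptable. The Durbin-Watson statistic ranges from 0 to 4, so a value of 2.36 sits close to the no-autocorrelation midpoint.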


Circulation, 2020, Vol 142 (Suppl_3)
Author(s): Steven R Horbal, Edward Brown, Brian A Derstine, Peng Zhang, Andrea H Rossman, ...

Introduction: Aortic calcification can be used to assess cardiovascular risk. While contrast is useful for vascular enhancement in diagnostic imaging, enhancement creates heterogeneity between post-contrast and non-contrast scans and limits their direct comparability. Hypothesis: We hypothesized that post-contrast and non-contrast aortic calcification measures would correlate, and that a correction score could be developed for statistical comparability. Methods: Retrospective CT scans were obtained from the University of Michigan. Participants (N=330) received abdominal scans with and without contrast enhancement within 120 calendar days. Analytic Morphomics was used to obtain vertebral-indexed measurements of aortic calcium area and aortic wall obfuscation percentage. Calcification was identified as regions with a given morphology and a pixel value five standard deviations above that of the defined central lumen zone. Pearson correlation and multiple linear regression were used to describe the relationship between aortic measurements with and without contrast. Regressions modeled calcification percent (Model 1) and area (Model 2). The dependent variables were non-contrast measurements; the independent variables were contrast measurements, age, and sex. Results: Correlations of calcification percent ranged from 0.86 at T11 to 0.94 at L2. Correlations of calcification area ranged from 0.66 at T12 to 0.84 at L3. In Model 1, for every one-percent increase in post-contrast calcification, non-contrast calcification percent increased by 1.11 percentage points (β=1.11, p<0.001, R²=0.85). In Model 2, for every mm² increase in post-contrast calcification area, non-contrast calcification area increased by 1.45 mm² (β=1.45, p<0.001, R²=0.69). The variance inflation factor was 1.08 for Model 1 and 1.07 for Model 2. Conclusion: This research proposes a correction score for comparing abdominal aortic calcification measurements between post-contrast and non-contrast scans.


2020, Vol 34 (12)
Author(s): Gabriely S. Folli, Márcia H.C. Nascimento, Ellisson H. Paulo, Pedro H.P. Cunha, Wanderson Romão, ...

2019, Vol 20 (7), pp. 1339-1357
Author(s): Peter B. Gibson, Duane E. Waliser, Huikyo Lee, Baijun Tian, Elias Massoud

Climate model evaluation is complicated by the presence of observational uncertainty. In this study we analyze daily precipitation indices and compare multiple gridded observational and reanalysis products with regional climate models (RCMs) from the North American component of the Coordinated Regional Climate Downscaling Experiment (NA-CORDEX) multimodel ensemble. In the context of model evaluation, observational product differences across the contiguous United States (CONUS) are also deemed nontrivial for some indices, especially for annual counts of consecutive wet days and for heavy precipitation indices. Multidimensional scaling (MDS) is used to directly include this observational spread into the model evaluation procedure, enabling visualization and interpretation of model differences relative to a “cloud” of observational uncertainty. Applying MDS to the evaluation of NA-CORDEX RCMs reveals situations of added value from dynamical downscaling, situations of degraded performance from dynamical downscaling, and the sensitivity of model performance to model resolution. On precipitation days, higher-resolution RCMs typically simulate higher mean and extreme precipitation rates than their lower-resolution pairs, sometimes improving model fidelity with observations. These results document the model spread and biases in daily precipitation extremes across the full NA-CORDEX model ensemble. The often-large divergence between in situ observations, satellite data, and reanalysis, shown here for CONUS, is especially relevant for data-sparse regions of the globe where satellite and reanalysis products are extensively relied upon. This highlights the need to carefully consider multiple observational products when evaluating climate models.
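To illustrate how MDS can place models and observational products in a shared low-dimensional space, here is a minimal sketch of classical (Torgerson) MDS. The abstract does not say which MDS variant or distance measure the study uses, so both the variant and the function name `classical_mds` are assumptions made for illustration.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n items in k dimensions from an n-by-n matrix of pairwise distances.

    Classical (Torgerson) MDS: double-center the squared distances, then use
    the top-k eigenvectors scaled by the square roots of their eigenvalues.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix implied by the distances
    w, v = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]         # indices of the k largest eigenvalues
    w_top = np.clip(w[idx], 0.0, None)    # guard against tiny negative values
    return v[:, idx] * np.sqrt(w_top)
```

In a setting like the one described, each row of `D` would correspond to one model or observational product, and the spread of the observational points in the embedding would form the "cloud" against which model positions are read.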

