A Bayesian Hierarchical Modeling Framework for Correcting Reporting Bias in the U.S. Tornado Database

2018 ◽  
Vol 34 (1) ◽  
pp. 15-30 ◽  
Author(s):  
Corey K. Potvin ◽  
Chris Broyles ◽  
Patrick S. Skinner ◽  
Harold E. Brooks ◽  
Erik Rasmussen

Abstract The Storm Prediction Center (SPC) tornado database, generated from NCEI’s Storm Data publication, is indispensable for assessing U.S. tornado risk and investigating tornado–climate connections. Maximizing the value of this database, however, requires accounting for systematically lower reported tornado counts in rural areas owing to a lack of observers. This study uses Bayesian hierarchical modeling to estimate tornado reporting rates and expected tornado counts over the central United States during 1975–2016. Our method addresses a serious solution nonuniqueness issue that may have affected previous studies. The adopted model explains 73% (>90%) of the variance in reported counts at scales of 50 km (>100 km). Population density explains more of the variance in reported tornado counts than other examined geographical covariates, including distance from nearest city, terrain ruggedness index, and road density. The model estimates that approximately 45% of tornadoes within the analysis domain were reported. The estimated tornado reporting rate decreases sharply away from population centers; for example, while >90% of tornadoes that occur within 5 km of a city with population > 100 000 are reported, this rate decreases to <70% at distances of 20–25 km. The method is directly extendable to other events subject to underreporting (e.g., severe hail and wind) and could be used to improve climate studies and tornado and other hazard models for forecasters, planners, and insurance/reinsurance companies, as well as for the development and verification of storm-scale prediction systems.
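The bias-correction idea in this abstract can be sketched in a few lines: if the probability that a tornado is reported rises with population density, then the expected true count in a grid cell is the reported count inflated by that probability. A minimal Python illustration with a logistic link and made-up coefficients (the paper fits a full Bayesian hierarchical model; none of the numbers below are its estimates):

```python
import math

def reporting_rate(pop_density, a=-1.0, b=0.8):
    """Toy logistic model: probability a tornado is reported as a function
    of log10 population density. Coefficients a, b are illustrative
    placeholders, not fitted values from the paper."""
    x = a + b * math.log10(pop_density + 1.0)
    return 1.0 / (1.0 + math.exp(-x))

def expected_true_count(reported, pop_density):
    """Inflate the reported count by the estimated reporting rate."""
    return reported / reporting_rate(pop_density)

# Sparsely populated cell: few observers, low reporting rate, big correction.
rural = expected_true_count(reported=4, pop_density=2.0)
# Urban cell: near-complete reporting, correction is small.
urban = expected_true_count(reported=4, pop_density=5000.0)
assert rural > urban  # the rural correction factor is larger
```

The hierarchical part of the actual model shares information across neighboring grid cells, so sparse rural cells borrow strength instead of relying on their own few reports.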

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253926
Author(s):  
Xiang Zhang ◽  
Taolin Yuan ◽  
Jaap Keijer ◽  
Vincent C. J. de Boer

Background Mitochondrial dysfunction is involved in many complex diseases. Efficient and accurate evaluation of mitochondrial functionality is crucial for understanding pathology as well as facilitating novel therapeutic developments. As a popular platform, the Seahorse extracellular flux (XF) analyzer is widely used for measuring mitochondrial oxygen consumption rate (OCR) in living cells. A hidden feature of Seahorse XF OCR data is its complex structure, caused by nesting and crossing between measurement cycles, wells, and plates. Surprisingly, statistical analysis of Seahorse XF data has not received sufficient attention, and current methods ignore this complex structure entirely, impairing the robustness of statistical inference. Results To rigorously incorporate the complex structure into data analysis, we developed a Bayesian hierarchical modeling framework, OCRbayes, and demonstrated its applicability on published data sets. Conclusions We showed that OCRbayes can analyze Seahorse XF OCR experimental data derived from either single or multiple plates. Moreover, OCRbayes has potential to be used for diagnosing patients with mitochondrial diseases.
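The nesting the authors describe (measurement cycles within wells, wells within plates) matters because it changes the effective sample size. A toy simulation, with invented variance components rather than anything estimated by OCRbayes, shows how treating all readings as independent replicates understates uncertainty:

```python
import random

random.seed(1)

# Toy nested structure of Seahorse XF OCR data: cycles in wells in plates.
# All numbers are illustrative, not parameters estimated by OCRbayes.
TRUE_OCR = 100.0                          # "true" basal OCR (pmol O2/min)
PLATE_SHIFTS = [-12.0, -4.0, 3.0, 13.0]   # fixed between-plate batch effects
SD_WELL, SD_CYCLE = 5.0, 2.0              # well- and cycle-level noise

data = []
for plate, plate_shift in enumerate(PLATE_SHIFTS):
    for well in range(6):
        well_eff = random.gauss(0.0, SD_WELL)
        for cycle in range(3):
            ocr = TRUE_OCR + plate_shift + well_eff + random.gauss(0.0, SD_CYCLE)
            data.append((plate, well, cycle, ocr))

n = len(data)  # 4 plates x 6 wells x 3 cycles = 72 readings

# Naive analysis: treat all 72 readings as independent replicates.
grand_mean = sum(d[3] for d in data) / n
naive_se = (sum((d[3] - grand_mean) ** 2 for d in data) / (n - 1) / n) ** 0.5

# Hierarchy-aware analysis: the plate is the effective unit of replication,
# so uncertainty should be judged against between-plate spread.
plate_means = [sum(d[3] for d in data if d[0] == p) / 18 for p in range(4)]
pm = sum(plate_means) / 4
plate_se = (sum((m - pm) ** 2 for m in plate_means) / 3 / 4) ** 0.5

# Ignoring the nesting makes the standard error look far smaller than it is.
assert naive_se < plate_se
```

A Bayesian hierarchical model formalizes this by giving each level its own variance component instead of forcing the analyst to pick one level of aggregation.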



2022 ◽  
Vol 26 (1) ◽  
pp. 149-166
Author(s):  
Álvaro Ossandón ◽  
Manuela I. Brunner ◽  
Balaji Rajagopalan ◽  
William Kleiber

Abstract. Timely projections of seasonal streamflow extremes can be useful for the early implementation of annual flood risk adaptation strategies. However, predicting seasonal extremes is challenging, particularly under nonstationary conditions and if extremes are correlated in space. The goal of this study is to implement a space–time model for the projection of seasonal streamflow extremes that considers the nonstationarity (interannual variability) and spatiotemporal dependence of high flows. We develop a space–time model to project seasonal streamflow extremes for several lead times up to 2 months, using a Bayesian hierarchical modeling (BHM) framework. This model is based on the assumption that streamflow extremes (3 d maxima) at a set of gauge locations are realizations of a Gaussian elliptical copula and generalized extreme value (GEV) margins with nonstationary parameters. These parameters are modeled as a linear function of suitable covariates describing the previous season selected using the deviance information criterion (DIC). Finally, the copula is used to generate streamflow ensembles, which capture spatiotemporal variability and uncertainty. We apply this modeling framework to predict 3 d maximum streamflow in spring (May–June) at seven gauges in the Upper Colorado River basin (UCRB) with 0- to 2-month lead time. In this basin, almost all extremes that cause severe flooding occur in spring as a result of snowmelt and precipitation. Therefore, we use regional mean snow water equivalent and temperature from the preceding winter season as well as indices of large-scale climate teleconnections – El Niño–Southern Oscillation, Atlantic Multidecadal Oscillation, and Pacific Decadal Oscillation – as potential covariates for 3 d spring maximum streamflow. 
Our model evaluation, which is based on the comparison of different model versions and the energy skill score, indicates that the model can capture the space–time variability in extreme streamflow well and that model skill increases with decreasing lead time. We also find that the use of climate variables slightly enhances skill relative to using only snow information. Median projections and their uncertainties are consistent with observations, thanks to the representation of spatial dependencies through covariates in the margins and a Gaussian copula. This spatiotemporal modeling framework helps in the planning of seasonal adaptation and preparedness measures as predictions of extreme spring streamflows become available 2 months before actual flood occurrence.
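The core sampling step described in this abstract (Gaussian elliptical copula plus GEV margins) can be sketched directly: draw correlated standard normals, map them to uniforms through the normal CDF, and pass the uniforms through the GEV quantile function at each gauge. The gauges, parameter values, and correlation below are illustrative placeholders, not the fitted UCRB values:

```python
import math
import random

random.seed(42)

def gev_quantile(u, mu, sigma, xi):
    """Inverse CDF of the GEV distribution (shape xi != 0 branch)."""
    return mu + sigma / xi * ((-math.log(u)) ** (-xi) - 1.0)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Two hypothetical gauges with illustrative (not fitted) GEV parameters for
# 3 d maximum spring streamflow; rho encodes the spatial dependence carried
# by the Gaussian copula.
params = [(80.0, 20.0, 0.1), (120.0, 30.0, 0.1)]  # (mu, sigma, xi) per gauge
rho = 0.7

def draw_member():
    """One joint draw: correlated Gaussians -> copula uniforms -> GEV margins."""
    z1 = random.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0)
    return [gev_quantile(std_normal_cdf(z), mu, sig, xi)
            for z, (mu, sig, xi) in zip((z1, z2), params)]

ensemble = [draw_member() for _ in range(5000)]

# Spatial dependence survives the transformation to the GEV margins.
x = [m[0] for m in ensemble]
y = [m[1] for m in ensemble]
mx, my = sum(x) / 5000, sum(y) / 5000
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / 5000
sx = (sum((a - mx) ** 2 for a in x) / 5000) ** 0.5
sy = (sum((b - my) ** 2 for b in y) / 5000) ** 0.5
corr = cov / (sx * sy)
```

In the paper the GEV location and scale are themselves functions of the seasonal covariates (snow water equivalent, temperature, climate indices), which is what makes the margins nonstationary from year to year.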


2018 ◽  
Vol 16 (2) ◽  
pp. 142-153 ◽  
Author(s):  
Kristen M Cunanan ◽  
Alexia Iasonos ◽  
Ronglai Shen ◽  
Mithat Gönen

Background: In the era of targeted therapies, clinical trials in oncology are rapidly evolving, wherein patients from multiple diseases are now enrolled and treated according to their genomic mutation(s). In such trials, known as basket trials, the different disease cohorts form the different baskets for inference. Several approaches have been proposed in the literature to efficiently use information from all baskets while simultaneously screening to find individual baskets where the drug works. Most proposed methods are developed in a Bayesian paradigm that requires specifying a prior distribution for a variance parameter, which controls the degree to which information is shared across baskets. Methods: A common approach used to capture the correlated binary endpoints across baskets is Bayesian hierarchical modeling. We evaluate a Bayesian adaptive design in the context of a non-randomized basket trial and investigate three popular prior specifications: an inverse-gamma prior on the basket-level variance, and a uniform prior and a half-t prior on the basket-level standard deviation. Results: Our simulation study shows that the inverse-gamma prior is highly sensitive to its hyperparameters. When the prior mean of the variance parameter is set near zero, the design can produce unacceptably high false-positive rates in some scenarios. Thus, use of this prior requires a fully comprehensive sensitivity analysis before implementation. Alternatively, a prior that places sufficient mass in the tail, such as the uniform or half-t prior, displays desirable and robust operating characteristics over a wide range of prior specifications, with the caveat that the upper bound of the uniform prior and the scale parameter of the half-t prior must be larger than 1.
Conclusion: Based on the simulation results, we recommend that those involved in designing basket trials that implement hierarchical modeling avoid using a prior distribution that places a majority of its density mass near zero for the variance parameter. Priors with this property force the model to share information regardless of the true efficacy configuration of the baskets. Many commonly used inverse-gamma prior specifications have this undesirable property. We recommend instead considering the more robust uniform prior or half-t prior on the standard deviation.
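The prior-mass argument above can be checked numerically: an inverse-gamma prior on the variance with mean near zero piles most of its mass on tiny variances (forcing pooling), while a heavy-tailed half-t prior on the standard deviation does not. A Monte Carlo sketch with illustrative hyperparameters, not the exact settings from the simulation study:

```python
import random

random.seed(0)
N = 20000

def inv_gamma_var(shape, scale):
    """Draw sigma^2 from an inverse-gamma(shape, scale) prior:
    if Y ~ Gamma(shape, 1), then scale / Y ~ InvGamma(shape, scale)."""
    return scale / random.gammavariate(shape, 1.0)

def half_t_sd(df, scale):
    """Draw sigma from a half-t prior: |scale * t_df|, with the t variate
    built as Z / sqrt(chi2_df / df)."""
    z = random.gauss(0.0, 1.0)
    chi2 = random.gammavariate(df / 2.0, 2.0)
    return abs(scale * z / (chi2 / df) ** 0.5)

# Illustrative hyperparameters: an inverse-gamma prior with mean 0.1 on the
# variance vs a half-t(3) prior with scale 1 on the standard deviation.
ig_draws = [inv_gamma_var(2.0, 0.1) for _ in range(N)]
ht_draws = [half_t_sd(3.0, 1.0) ** 2 for _ in range(N)]

# Fraction of prior mass on "near zero" variances, which forces the model
# to pool baskets regardless of the true efficacy configuration.
p_ig = sum(v < 0.1 for v in ig_draws) / N
p_ht = sum(v < 0.1 for v in ht_draws) / N
assert p_ig > p_ht  # the inverse-gamma concentrates its mass near zero
```

With these settings roughly three quarters of the inverse-gamma mass sits below a variance of 0.1, versus under a quarter for the half-t, which is the qualitative behavior the Conclusion warns about.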


Author(s):  
Suguru Yamanaka ◽  
Rei Yamamoto

Recent interest in the financial technology (fintech) lending business has created growing demand for credit scoring models that use bank account activity information. Our work aims to develop a new credit scoring method based on bank account activity information. This method incorporates borrower firms’ segment-level heterogeneity, such as segments defined by sales size and firm age. We employ Bayesian hierarchical modeling, which mitigates the data sparsity issue caused by segmentation. We describe our modeling procedures, including data handling and variable selection. Empirical results show that our model outperforms the traditional logistic model for credit scoring in terms of information criteria. Our model thus enables advanced credit scoring based on bank account activity information in fintech lending businesses, incorporating segment-specific features into credit risk assessment.
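The way hierarchical modeling mitigates segment-level sparsity can be illustrated with a shrinkage toy: sparse segments are pulled toward the portfolio-wide rate, while well-populated segments barely move. The segment names and counts below are invented for illustration, and the closed-form weighting stands in for the full Bayesian hierarchical logistic model described in the abstract:

```python
# Toy partial pooling of segment-level default rates. All names and counts
# are hypothetical; PRIOR_STRENGTH plays the role of the hierarchical
# prior's pull toward the portfolio-wide rate.
segments = {
    "young_small_firms": (3, 40),      # (defaults, firms) -- sparse segment
    "mature_mid_firms": (25, 1000),    # well-populated segment
}

PRIOR_STRENGTH = 50.0  # pseudo-observations added by the prior

total_defaults = sum(d for d, n in segments.values())
total_firms = sum(n for d, n in segments.values())
global_rate = total_defaults / total_firms

def pooled_rate(defaults, firms):
    """Shrink the raw segment rate toward the global rate; segments with
    few firms are pulled harder, mitigating data sparsity."""
    w = firms / (firms + PRIOR_STRENGTH)
    return w * (defaults / firms) + (1.0 - w) * global_rate

raw_sparse = 3 / 40                 # 7.5% raw estimate from 40 firms
pooled_sparse = pooled_rate(3, 40)  # pulled toward the global rate
pooled_dense = pooled_rate(25, 1000)
```

The sparse segment's estimate lands between its noisy raw rate and the global rate, while the 1000-firm segment keeps essentially its own rate, which is exactly the segment-specific-yet-stabilized behavior the abstract claims for the hierarchical model.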

