Prospective Evaluation of Multiplicative Hybrid Earthquake Forecast Models for California

Author(s):  
Jose A. Bayona
William Savran
Maximilian Werner
David A. Rhoades

Developing testable seismicity models is essential for robust seismic hazard assessments and for quantifying the predictive skill of posited hypotheses about seismogenesis. On this premise, the Regional Earthquake Likelihood Models (RELM) group designed a joint forecasting experiment, with associated models, data, and tests, to evaluate earthquake predictability in California over a five-year period. Participating RELM forecast models were based on a range of geophysical datasets, including earthquake catalogs, interseismic strain rates, and geologic fault slip rates. After five years of prospective evaluation, the RELM experiment found that the smoothed-seismicity (HKJ) model by Helmstetter et al. (2007) was the most informative. The diversity of competing forecast hypotheses in RELM made it attractive to combine multiple models into forecasts potentially more informative than HKJ. Thus, Rhoades et al. (2014) created multiplicative hybrid models that involve the HKJ model as a baseline and one or more conjugate models. Specifically, the authors fitted two parameters for each conjugate model and an overall normalizing constant to optimize each hybrid model. Information gain scores per earthquake were then computed using a corrected Akaike Information Criterion that penalizes the number of fitted parameters. According to retrospective analyses, some hybrid models showed significant information gains over the HKJ forecast despite the penalty. Here, we assess in a prospective setting the predictive skills of 16 hybrid and 6 original RELM forecasts, using a suite of tests from the Collaboratory for the Study of Earthquake Predictability (CSEP). The evaluation dataset contains 40 M≥4.95 events recorded within the California CSEP testing region from 1 January 2011 to 31 December 2020, including the 2016 Mw 5.6, 5.6, and 5.5 Hawthorne earthquake swarm, and the Mw 6.4 foreshock and Mw 7.1 mainshock of the 2019 Ridgecrest sequence. We evaluate the consistency between the observed and the expected number, spatial, likelihood, and magnitude distributions of earthquakes, and compare the performance of each forecast to that of HKJ. Our prospective test results show that none of the hybrid models is significantly more informative than the HKJ baseline forecast. These results are mainly due to the occurrence of the 2016 Hawthorne earthquake cluster and of four events from the 2019 Ridgecrest sequence in two forecast bins. These clusters of seismicity are exceptionally unlikely under all models and are insufficiently captured by the Poisson distribution assumed by the likelihood functions of the tests. Therefore, we are currently examining alternative likelihood functions that reduce the sensitivity of the evaluations to clustering, and that could help clarify whether the discrepancies between prospective and retrospective test results for multiplicative hybrid forecasts stem from limitations of the tests or from the methods used to create the hybrid models.
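To make the testing machinery concrete, the sketch below illustrates two ingredients of such an evaluation under simplifying assumptions: a two-sided Poisson consistency test on the total event count (the N-test) and the sample information gain per earthquake of one gridded rate forecast over a baseline such as HKJ. This is a minimal NumPy/SciPy sketch, not a production implementation such as pyCSEP; the array layout and function names are illustrative.

```python
import numpy as np
from scipy.stats import poisson

def poisson_n_test(expected_total, n_observed):
    """Two-sided Poisson consistency (N-) test on the total event count.
    Returns P(N >= n_observed) and P(N <= n_observed); the forecast is
    rejected if either probability is very small."""
    delta_1 = poisson.sf(n_observed - 1, expected_total)  # P(N >= n_obs)
    delta_2 = poisson.cdf(n_observed, expected_total)     # P(N <= n_obs)
    return delta_1, delta_2

def info_gain_per_eq(rate_a, rate_b, event_bins):
    """Sample information gain per earthquake of forecast A over baseline B
    for gridded Poisson rate forecasts (cf. Rhoades et al., 2011).
    rate_a, rate_b : expected event counts per space-magnitude bin
    event_bins     : indices of the bins that hosted the observed events"""
    n = len(event_bins)
    log_ratio = np.log(rate_a[event_bins] / rate_b[event_bins])
    return (log_ratio.sum() - (rate_a.sum() - rate_b.sum())) / n
```

In this notation, a hybrid forecast would simply supply its own rate array as rate_a, with the HKJ rates as rate_b.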

2012
Vol 2 (1)
pp. 2
Author(s):
Christine Smyth
Masumi Yamada
Jim Mori

The Collaboratory for the Study of Earthquake Predictability (CSEP) is a global project aimed at testing earthquake forecast models in a fair environment. Various metrics are currently used to evaluate the submitted forecasts. However, CSEP still lacks easily understandable metrics with which to rank the overall performance of the forecast models. In this research, we adapt a well-known and respected metric from another statistical field, bioinformatics, to make it suitable for evaluating earthquake forecasts such as those submitted to the CSEP initiative. The metric, originally called a "gene-set enrichment score", is based on a Kolmogorov-Smirnov statistic. Our modified metric assesses whether, over a certain time period, the forecast values at locations where earthquakes occurred are significantly increased relative to the values at locations where no earthquakes occurred. Permutation testing allows a significance value to be placed on the score. Unlike the metrics currently employed by CSEP, the score makes no assumption about the distribution of earthquake occurrence, nor does it require an arbitrary reference forecast. We apply the modified metric to simulated data and real forecast data to show that it is a powerful and robust technique, capable of ranking competing earthquake forecasts.
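A minimal sketch of the flavor of such a metric, not the authors' exact enrichment construction: a one-sided two-sample Kolmogorov-Smirnov statistic contrasts forecast values in cells that hosted earthquakes against cells that did not, and a label-permutation test supplies the significance value. The function names and the cell-wise representation of the forecast are assumptions.

```python
import numpy as np

def signed_ks(hits, misses):
    """One-sided two-sample KS statistic: large and positive when the
    forecast values in 'hits' sit above those in 'misses'."""
    grid = np.sort(np.concatenate([hits, misses]))
    f_hit = np.searchsorted(np.sort(hits), grid, side="right") / hits.size
    f_miss = np.searchsorted(np.sort(misses), grid, side="right") / misses.size
    return np.max(f_miss - f_hit)

def enrichment_test(forecast, event_mask, n_perm=10_000, seed=0):
    """Permutation p-value for the enrichment of forecast values in
    cells that hosted earthquakes (event_mask == True)."""
    rng = np.random.default_rng(seed)
    observed = signed_ks(forecast[event_mask], forecast[~event_mask])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(event_mask)  # shuffle event labels over cells
        count += signed_ks(forecast[perm], forecast[~perm]) >= observed
    return observed, (count + 1) / (n_perm + 1)
```

Because both the statistic and its permutation reference are built from the forecast's own values, no occurrence distribution or reference forecast needs to be assumed, which is the property the abstract highlights.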


2017
Vol 59 (6)
Author(s):
Matteo Taroni
Warner Marzocchi
Pamela Roselli

The quantitative assessment of the performance of earthquake prediction and/or forecast models is essential for evaluating their applicability for risk-reduction purposes. Here we assess the earthquake prediction performance of the CN model applied to the Italian territory. This model has been widely publicized in the Italian news media, but a careful assessment of its prediction performance is still lacking. In this paper we evaluate the results obtained so far by the CN algorithm in Italy, adopting testing procedures that are widely used or under development in the Collaboratory for the Study of Earthquake Predictability (CSEP) network. Our results show that the CN prediction performance is comparable to that of a stationary Poisson model; that is, CN predictions do no better than what may be expected from random chance.
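As a hypothetical illustration of what "no better than random chance" means for an alarm-based algorithm such as CN: under a stationary Poisson reference, the number of target events falling inside alarms is binomial, with success probability equal to the fraction of the space-time volume covered by alarms. The sketch below computes the corresponding chance probability; it is an assumed stand-in for standard alarm-based skill tests, not necessarily the exact procedure adopted by the authors.

```python
from scipy.stats import binom

def random_chance_pvalue(n_events, n_hits, alarm_fraction):
    """P-value against a stationary Poisson reference: the probability of
    predicting at least n_hits of n_events purely by chance, when alarms
    cover a fraction alarm_fraction of the space-time volume."""
    return binom.sf(n_hits - 1, n_events, alarm_fraction)  # P(H >= n_hits)
```

For example, random_chance_pvalue(10, 7, 0.6) asks how surprising 7 hits out of 10 events would be when alarms already cover 60% of the space-time volume; a large value means the predictions add nothing beyond chance.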


Entropy
2020
Vol 22 (2)
pp. 258
Author(s):  
Zhihang Xu
Qifeng Liao

Optimal experimental design (OED) is of great significance for efficient Bayesian inversion. A popular class of OED methods is based on maximizing the expected information gain (EIG), which typically involves expensive likelihood evaluations. To reduce the computational cost, in this work a novel double-loop Bayesian Monte Carlo (DLBMC) method is developed to compute the EIG efficiently, and a Bayesian optimization (BO) strategy is proposed to obtain its maximizer using only a small number of samples. For Bayesian Monte Carlo posed on uniform and normal distributions, our analysis provides explicit expressions for the mean estimates and bounds on their variances. The accuracy and efficiency of our DLBMC and BO-based optimal design are validated and demonstrated with numerical experiments.
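For orientation, here is the generic nested (double-loop) Monte Carlo structure that estimators like DLBMC accelerate, applied to a toy linear-Gaussian experiment with a known closed-form EIG for checking. This is a plain nested sampler, not the paper's Bayesian Monte Carlo estimator; the model, parameter names, and sample sizes are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def nested_mc_eig(design, n_outer=2000, n_inner=2000, sigma=0.5, seed=0):
    """Nested (double-loop) Monte Carlo estimate of the expected information
    gain for a toy linear-Gaussian experiment
        y = design * theta + noise,  theta ~ N(0, 1),  noise ~ N(0, sigma^2).
    Closed form for checking: EIG = 0.5 * log(1 + design**2 / sigma**2)."""
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n_outer)                # outer prior draws
    y = design * theta + sigma * rng.standard_normal(n_outer)
    # log-likelihood of each y under the theta that generated it
    # (Gaussian constants cancel between the two terms below)
    log_lik = -0.5 * ((y - design * theta) / sigma) ** 2
    # inner loop: marginal log p(y | design) from fresh prior draws
    theta_in = rng.standard_normal(n_inner)
    resid = (y[:, None] - design * theta_in[None, :]) / sigma
    log_marg = logsumexp(-0.5 * resid ** 2, axis=1) - np.log(n_inner)
    return float(np.mean(log_lik - log_marg))           # EIG estimate
```

Maximizing this estimate over design, whether by a BO strategy as in the paper or by a simple grid search, completes the OED loop; the inner loop over fresh prior draws is exactly the expensive marginal-likelihood evaluation whose cost the paper targets.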



2021
Vol 12
Author(s):  
Zhanshan (Sam) Ma

Using 2,733 longitudinal vaginal microbiome samples (representing local microbial communities) from 79 individuals (representing metacommunities) in healthy, bacterial vaginosis (BV), and pregnant states, we assess and interpret the relative importance of stochastic forces (e.g., stochastic drift in bacterial demography and stochastic dispersal) versus deterministic selection (e.g., host genome and host physiology) in shaping the dynamics of human vaginal microbiome (HVM) diversity, through an integrated analysis with multi-site neutral (MSN) and niche-neutral hybrid (NNH) modeling. It was found that, when the traditional "default" P-value threshold of 0.05 was specified, neutral drift was predominant (≥50% of metacommunities indistinguishable from the MSN prediction), while niche differentiation was moderate (<20% of metacommunities consistent with the NNH prediction). The study also analyzed two challenging uncertainties in testing neutral and/or niche-neutral hybrid models: the lack of full model specificity (non-unique fits of the same datasets to multiple models with potentially different mechanistic assumptions) and the lack of definite rules for setting the P-value threshold (denoted the Pt-value in this article) in testing a null hypothesis (model). Indeed, the two uncertainties can be interdependent, which further complicates the statistical inferences. To deal with these uncertainties, MSN/NNH test results are presented for a series of P-value thresholds ranging from 0.05 to 0.95. Furthermore, the influence of the threshold setting on model specificity, and the effects of a woman's health status on the neutrality level of the HVM, were examined. It was found that as the P-value threshold increased from 0.05 to 0.95, the overlapping (non-unique) fitting of MSN and NNH decreased from 29.1% to 1.3%, whereas the specificity (unique fit to the data) of the MSN model remained between 55.7% and 82.3%. Also, with a rising P-value threshold, the difference between the healthy and BV groups became significant. These findings suggest that a traditional single P-value threshold (such as the de facto standard of 0.05) may be insufficient for testing neutral and/or niche-neutral hybrid models.
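The threshold-sweep bookkeeping described above can be made concrete with a short sketch. The per-metacommunity goodness-of-fit P-values (p_msn, p_nnh) are hypothetical inputs, since the actual MSN/NNH fitting relies on neutral-theory sampling machinery beyond this snippet; a model is retained (not rejected) when its P-value exceeds the threshold.

```python
import numpy as np

def threshold_sweep(p_msn, p_nnh, thresholds=None):
    """Classify each metacommunity's fit across a range of P-value thresholds.
    p_msn, p_nnh : per-metacommunity goodness-of-fit P-values for the MSN
    and NNH models (hypothetical inputs for this sketch)."""
    if thresholds is None:
        thresholds = np.round(np.arange(0.05, 1.00, 0.05), 2)
    rows = []
    for t in thresholds:
        msn_ok, nnh_ok = p_msn > t, p_nnh > t
        rows.append({
            "threshold": float(t),
            "msn_only": float(np.mean(msn_ok & ~nnh_ok)),  # unique MSN fit
            "nnh_only": float(np.mean(~msn_ok & nnh_ok)),  # unique NNH fit
            "overlap":  float(np.mean(msn_ok & nnh_ok)),   # non-unique fit
            "neither":  float(np.mean(~msn_ok & ~nnh_ok)),
        })
    return rows
```

Raising the threshold makes retention harder for both models, so the overlap fraction shrinks, which is the pattern (29.1% down to 1.3%) reported above.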

