Comparing NWS PoP Forecasts to Third-Party Providers

2011 ◽  
Vol 139 (10) ◽  
pp. 3304-3321 ◽  
Author(s):  
J. Eric Bickel ◽  
Eric Floehr ◽  
Seong Dae Kim

Abstract In this paper, the authors verify probability of precipitation (PoP) forecasts provided by the National Weather Service (NWS), The Weather Channel (TWC), and CustomWeather (CW). The n-day-ahead forecasts, where n ranges from 1 to 3 for the NWS, from 1 to 9 for TWC, and from 1 to 14 for CW, are analyzed. The dataset includes almost 13 million PoP forecasts, or about 500 000 PoPs per provider per day of lead time. Data were collected over a 2-yr period (1 November 2008–31 October 2010) at 734 observation stations across the contiguous United States. In addition to verifying these PoP forecasts in an absolute sense, relative comparisons are made between the providers. It is found that, in most cases, TWC does not add skill to NWS forecasts. Perhaps most interestingly, it is also found that CW does have the ability to forecast PoPs at a resolution of 0.01.
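The abstract does not say which verification measures were used. As a hedged illustration only, one widely used measure for probability forecasts is the Brier score, with a skill score taken against a constant base-rate forecast. The function names and sample values below are hypothetical and are not taken from the paper.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def brier_skill_score(forecasts, outcomes):
    """Skill relative to always forecasting the sample base rate (an assumed reference)."""
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("reference score is zero; skill score is undefined")
    return 1 - brier_score(forecasts, outcomes) / reference


# Illustrative values only: four PoP forecasts and whether precipitation occurred.
pops = [0.9, 0.1, 0.8, 0.2]
observed = [1, 0, 1, 0]
print(brier_score(pops, observed), brier_skill_score(pops, observed))
```

A lower Brier score is better, and a positive skill score means the forecasts beat the base-rate reference on this sample.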

Author(s):  
Evan S. Bentley ◽  
Richard L. Thompson ◽  
Barry R. Bowers ◽  
Justin G. Gibbs ◽  
Steven E. Nelson

Abstract Previous work has considered tornado occurrence with respect to radar data, both WSR-88D and mobile research radars, and a few studies have examined techniques to potentially improve tornado warning performance. To date, though, there has been little work focusing on systematic, large-sample evaluation of National Weather Service (NWS) tornado warnings with respect to radar-observable quantities and the near-storm environment. In this work, three full years (2016–2018) of NWS tornado warnings across the contiguous United States were examined, in conjunction with supporting data in the few minutes preceding warning issuance, or tornado formation in the case of missed events. The investigation herein examines WSR-88D and Storm Prediction Center (SPC) mesoanalysis data associated with these tornado warnings with comparisons made to the current Warning Decision Training Division (WDTD) guidance. Combining low-level rotational velocity and the significant tornado parameter (STP), as used in prior work, shows promise as a means to estimate tornado warning performance, as well as relative changes in performance as criteria thresholds vary. For example, low-level rotational velocity peaking in excess of 30 kt (15 m s⁻¹), in a near-storm environment that is not prohibitive for tornadoes (STP > 0), results in an increased probability of detection and reduced false alarms compared to observed NWS tornado warning metrics. Tornado warning false alarms can also be reduced through limiting warnings with weak (<30 kt), broad (>1 n mi) circulations in a poor (STP = 0) environment, careful elimination of velocity data artifacts like sidelobe contamination, and through greater scrutiny of human-based tornado reports in otherwise questionable scenarios.


2018 ◽  
Vol 33 (6) ◽  
pp. 1501-1511 ◽  
Author(s):  
Harold E. Brooks ◽  
James Correia

Abstract Tornado warnings are one of the flagship products of the National Weather Service. We update the time series of various metrics of performance in order to provide baselines over the 1986–2016 period for lead time, probability of detection, false alarm ratio, and warning duration. We have used metrics (mean lead time for tornadoes warned in advance, fraction of tornadoes warned in advance) that work in a consistent way across the official changes in policy for warning issuance, as well as across points in time when unofficial changes took place. The mean lead time for tornadoes warned in advance was relatively constant from 1986 to 2011, while the fraction of tornadoes warned in advance increased through about 2006, and the false alarm ratio slowly decreased. The largest changes in performance take place in 2012 when the default warning duration decreased, and there is an apparent increased emphasis on reducing false alarms. As a result, the lead time, probability of detection, and false alarm ratio all decrease in 2012. Our analysis is based, in large part, on signal detection theory, which separates the quality of the warning system from the threshold for issuing warnings. Threshold changes lead to trade-offs between false alarms and missed detections. Such changes provide further evidence for changes in what the warning system as a whole considers important, as well as highlighting the limitations of measuring performance by looking at metrics independently.
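The trade-off described above can be made concrete with a small sketch. It uses the standard contingency-table definitions of probability of detection (hits over hits plus misses) and false alarm ratio (false alarms over hits plus false alarms). The scored cases, the thresholds, and the function names are hypothetical, and the single 2×2 table is a simplification: operational tornado warning statistics count events and warnings separately.

```python
def contingency_counts(cases, threshold):
    """Count hits, misses, and false alarms when warning on score >= threshold.

    `cases` is a list of (score, event_occurred) pairs; both are illustrative.
    """
    hits = misses = false_alarms = 0
    for score, occurred in cases:
        warned = score >= threshold
        if warned and occurred:
            hits += 1
        elif occurred:
            misses += 1
        elif warned:
            false_alarms += 1
    return hits, misses, false_alarms


def probability_of_detection(hits, misses):
    """Fraction of events that were warned."""
    if hits + misses == 0:
        raise ValueError("no events observed")
    return hits / (hits + misses)


def false_alarm_ratio(hits, false_alarms):
    """Fraction of warnings that did not verify."""
    if hits + false_alarms == 0:
        raise ValueError("no warnings issued")
    return false_alarms / (hits + false_alarms)


# Hypothetical cases: (threat score, whether a tornado occurred).
cases = [(0.9, True), (0.7, False), (0.6, True), (0.4, False), (0.3, True), (0.1, False)]
for threshold in (0.2, 0.5):
    h, m, fa = contingency_counts(cases, threshold)
    print(threshold, probability_of_detection(h, m), false_alarm_ratio(h, fa))
```

On this toy sample, raising the threshold lowers both the probability of detection and the false alarm ratio, which is the pattern the abstract reports for 2012.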


1979 ◽  
Vol 60 (7) ◽  
pp. 788-790 ◽  
Author(s):  
Daniel L. Smith

The significance of Percent Correct Scores for National Weather Service (NWS) probability of precipitation (PoP) forecasts is examined. It is shown that the areal variability of rainfall and the nature of PoP forecasts preclude the achievement of a score of 100%—even for the best possible forecasts. A maximum possible percent correct is defined and radar estimates of rainfall coverage are combined with actual forecasts to determine how closely NWS forecasters approached this limit. Day- and nighttime percent correct scores were 75% and 85%, respectively, for the data examined. These values were close to the respective maximum possible scores of 83% and 90%. Relatively small changes in forecasters' percent correct scores are considered in light of these findings.


2019 ◽  
Vol 36 (1) ◽  
pp. 129-137 ◽  
Author(s):  
Micheal Hicks ◽  
Belay Demoz ◽  
Kevin Vermeesch ◽  
Dennis Atkinson

Abstract A network of automated weather stations (AWS) with ceilometers can be used to detect sky conditions, aerosol dispersion, and mixing layer heights, in addition to the routine surface meteorological parameters (temperature, pressure, humidity, etc.). Currently, a dense network of AWSs that observe all of these parameters does not exist in the United States even though networks of them with ceilometers exist. These networks normally use ceilometers for determining only sky conditions. Updating AWS networks to obtain those nonstandard observations with ceilometers, especially mixing layer height, across the United States would provide valuable information for validating and improving weather/climate forecast models. In this respect, an aerosol-based mixing layer height detection method, called the combined-hybrid method, is developed and evaluated for its uncertainty characteristics for application in the United States. Four years of ceilometer data from the National Weather Service Ceilometer Proof of Concept Project taken in temperate, maritime polar, and hot/arid climate regimes are utilized in this evaluation. Overall, the method proved to be a strong candidate for estimating mixing layer heights with ceilometer data, with averaged uncertainties of 237 ± 398 m in all tested climate regimes and 69 ± 250 m when excluding the hot/arid climate regime.


2018 ◽  
Vol 10 (4) ◽  
pp. 673-691 ◽  
Author(s):  
Michelle E. Saunders ◽  
Kevin D. Ash ◽  
Jennifer M. Collins

Abstract Weather radar is now widely viewed by the general public in the United States via television, computers/tablets, and smartphones. Anyone can consult near-real-time maps and animations of weather radar data when weather conditions are a factor. However, the usefulness of weather radar data for each user depends on a complex interaction of factors. There have been few studies providing conceptual arguments and empirical data to better understand what the most important factors are and to comprehend patterns of public weather radar use across the United States. The first part of this research provides a basic conceptual framework for research investigating the usefulness of weather radar displays as a source of weather information and as a decision aid. The second part aims to uncover several factors that influence the perceived usefulness rating of the National Weather Service (NWS) website’s weather radar display at both national and regional levels using variables gathered from the 2014 NWS customer satisfaction survey alongside relevant geographic and climatological variables. Data analyses include spatial clustering and ordinal regression utilized within a generalized linear model methodology. Overall, respondents who are more familiar with the NWS and their products, as well as those who indicate they are more likely to take action based on information provided by the NWS, are more likely to find the NWS radar display useful. Geographically, the NWS radar display is most useful to persons residing in the southern United States. Lightning is the most important hazard associated with higher radar usefulness ratings.


2005 ◽  
Vol 20 (6) ◽  
pp. 1034-1047 ◽  
Author(s):  
Jeffrey A. Baars ◽  
Clifford F. Mass

Abstract Model output statistics (MOS) guidance has been the central model postprocessing approach used by the National Weather Service since the 1970s. A recent advancement in the use of MOS is the application of “consensus” MOS (CMOS), an average of MOS from two or more models. CMOS has shown additional skill over individual MOS forecasts and has performed well compared to humans in forecasting contests. This study compares MOS, CMOS, and WMOS (weighting component MOS predictions by their past performance) forecasts of temperature and precipitation to those of the National Weather Service (NWS) subjective forecasts. Data from 29 locations throughout the United States from 1 August 2003 through 1 August 2004 are used. MOS forecasts from the Global Forecast System (GMOS), Eta (EMOS), and Nested Grid Model (NMOS) models are included, with CMOS being a simple average of these three forecasts. WMOS is calculated using weights determined from a minimum variance method, with varying training periods for each station and variable. Performance is analyzed at various forecast periods, by region of the United States, and by time/season, as well as for periods of large daily temperature changes or large departures from climatology. The results show that CMOS is competitive with or superior to human forecasts at nearly all locations and that WMOS is superior to CMOS. Human forecasts are most skillful compared to MOS during the first forecast day and for periods when temperatures differ greatly from climatology. The implications of these results regarding the future role of human forecasters are examined in the conclusions.
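The two combination schemes can be sketched as follows. The simple average matches the abstract's description of CMOS. The abstract does not give the WMOS formula, so the weighting shown here is an assumption: inverse-error-variance weights, which are the minimum-variance choice when component errors are unbiased and uncorrelated. All names and numbers are hypothetical.

```python
def consensus(forecasts):
    """Simple average of the component forecasts (CMOS as described in the abstract)."""
    if not forecasts:
        raise ValueError("at least one forecast is required")
    return sum(forecasts) / len(forecasts)


def inverse_variance_weights(error_variances):
    """Weights proportional to 1 / past error variance, normalized to sum to 1.

    Assumption: this stands in for the paper's minimum variance method, whose
    exact form is not given in the abstract.
    """
    if any(v <= 0 for v in error_variances):
        raise ValueError("error variances must be positive")
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [x / total for x in inverse]


def weighted_consensus(forecasts, weights):
    """Weighted combination of the component forecasts."""
    if len(forecasts) != len(weights):
        raise ValueError("forecasts and weights must have the same length")
    return sum(f * w for f, w in zip(forecasts, weights))


# Illustrative temperature forecasts from three models and their assumed
# error variances over a training period.
temps = [70.0, 72.0, 74.0]
weights = inverse_variance_weights([1.0, 2.0, 2.0])
print(consensus(temps), weighted_consensus(temps, weights))
```

The weighted result leans toward the component with the smallest past error variance, whereas the simple average treats all three equally.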


2009 ◽  
Vol 24 (2) ◽  
pp. 504-519 ◽  
Author(s):  
David P. Ruth ◽  
Bob Glahn ◽  
Valery Dagostaro ◽  
Kathryn Gilbert

Abstract Model output statistics (MOS) guidance forecasts have been produced for over three decades. Until recently, MOS guidance was prepared for observing stations and formatted in text bulletins while official National Weather Service (NWS) forecasts for stations and zones were prepared by forecasters typing text. The flagship product of today’s NWS is the National Digital Forecast Database (NDFD). In support of NDFD, MOS is now also produced on grids. This paper compares MOS and gridded MOS (GMOS) to the forecaster-produced NDFD at approximately 1200 station locations in the conterminous United States. Results indicate that GMOS should provide good guidance for preparing the NDFD. In those areas of the country where station observations well represent the grid, GMOS features accuracy comparable to that of NDFD. In areas of complex terrain not well represented by station observations, GMOS appears similar to NDFD in its depiction. A new score is introduced to measure convergence from a long-range forecast to the final short-range forecast. This shows good GMOS forecast continuity when compared to station MOS and NDFD.


Author(s):  
Frederick L. Crosby

I appreciate the opportunity to talk to the 25th Conference for a few minutes today on the procedures and programs used by the National Weather Service to provide a meteorological service to the United States. Paper published with permission.

