Comparing NWS PoP Forecasts to Third-Party Providers

Abstract In this paper, the authors verify probability of precipitation (PoP) forecasts provided by the National Weather Service (NWS), The Weather Channel (TWC), and CustomWeather (CW). The n-day-ahead forecasts, where n ranges from 1 to 3 for the NWS, from 1 to 9 for TWC, and from 1 to 14 for CW, are analyzed. The dataset includes almost 13 million PoP forecasts, or about 500 000 PoPs per provider per day of lead time. Data were collected over a 2-yr period (1 November 2008–31 October 2010) at 734 observation stations across the contiguous United States. In addition to verifying these PoP forecasts in an absolute sense, relative comparisons are made between the providers. It is found that, in most cases, TWC does not add skill to NWS forecasts. Perhaps most interestingly, it is also found that CW does have the ability to forecast PoPs at a resolution of 0.01.
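
The abstract does not say which verification measures the authors used. As an illustration only, the sketch below computes two common checks for PoP forecasts: the Brier score and the observed precipitation frequency for each issued PoP value. The function names and sample numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def brier_score(pops, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    if len(pops) != len(outcomes) or not pops:
        raise ValueError("pops and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(pops, outcomes)) / len(pops)


def observed_frequency_by_pop(pops, outcomes):
    """Observed fraction of wet cases for each distinct PoP value (rounded to 0.01)."""
    totals = defaultdict(lambda: [0, 0])  # PoP value -> [wet count, case count]
    for p, o in zip(pops, outcomes):
        bucket = totals[round(p, 2)]
        bucket[0] += o
        bucket[1] += 1
    return {p: wet / n for p, (wet, n) in sorted(totals.items())}


# Hypothetical sample: five forecasts of 0.2 and five forecasts of 0.8.
sample_pops = [0.2, 0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8, 0.8]
sample_outcomes = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

print(brier_score(sample_pops, sample_outcomes))
print(observed_frequency_by_pop(sample_pops, sample_outcomes))
```

A forecast set is well calibrated when the observed frequency for each PoP value is close to that value; a lower Brier score indicates better probabilistic accuracy.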

An Analysis of 2016-2018 Tornadoes and National Weather Service Tornado Warnings across the contiguous United States

Weather and Forecasting ◽

10.1175/waf-d-20-0241.1 ◽

2021 ◽

Author(s):

Evan S. Bentley ◽

Richard L. Thompson ◽

Barry R. Bowers ◽

Justin G. Gibbs ◽

Steven E. Nelson

Keyword(s):

United States ◽

Rotational Velocity ◽

Radar Data ◽

Probability Of Detection ◽

False Alarms ◽

National Weather Service ◽

Low Level ◽

Tornado Warnings ◽

Tornado Warning ◽

Weather Service

Abstract Previous work has considered tornado occurrence with respect to radar data, both WSR-88D and mobile research radars, and a few studies have examined techniques to potentially improve tornado warning performance. To date, though, there has been little work focusing on systematic, large-sample evaluation of National Weather Service (NWS) tornado warnings with respect to radar-observable quantities and the near-storm environment. In this work, three full years (2016–2018) of NWS tornado warnings across the contiguous United States were examined, in conjunction with supporting data in the few minutes preceding warning issuance, or tornado formation in the case of missed events. The investigation herein examines WSR-88D and Storm Prediction Center (SPC) mesoanalysis data associated with these tornado warnings with comparisons made to the current Warning Decision Training Division (WDTD) guidance. Combining low-level rotational velocity and the significant tornado parameter (STP), as used in prior work, shows promise as a means to estimate tornado warning performance, as well as relative changes in performance as criteria thresholds vary. For example, low-level rotational velocity peaking in excess of 30 kt (15 m s−1), in a near-storm environment which is not prohibitive for tornadoes (STP > 0), results in an increased probability of detection and reduced false alarms compared to observed NWS tornado warning metrics. Tornado warning false alarms can also be reduced through limiting warnings with weak (<30 kt), broad (>1 n mi) circulations in a poor (STP = 0) environment, careful elimination of velocity data artifacts like sidelobe contamination, and through greater scrutiny of human-based tornado reports in otherwise questionable scenarios.

Long-Term Performance Metrics for National Weather Service Tornado Warnings

Weather and Forecasting ◽

10.1175/waf-d-18-0120.1 ◽

2018 ◽

Vol 33 (6) ◽

pp. 1501-1511 ◽

Cited By ~ 21

Author(s):

Harold E. Brooks ◽

James Correia

Keyword(s):

False Alarm ◽

Lead Time ◽

Performance Metrics ◽

Warning System ◽

Probability Of Detection ◽

False Alarms ◽

National Weather Service ◽

False Alarm Ratio ◽

Tornado Warnings ◽

Weather Service

Abstract Tornado warnings are one of the flagship products of the National Weather Service. We update the time series of various metrics of performance in order to provide baselines over the 1986–2016 period for lead time, probability of detection, false alarm ratio, and warning duration. We have used metrics (mean lead time for tornadoes warned in advance, fraction of tornadoes warned in advance) that work in a consistent way across the official changes in policy for warning issuance, as well as across points in time when unofficial changes took place. The mean lead time for tornadoes warned in advance was relatively constant from 1986 to 2011, while the fraction of tornadoes warned in advance increased through about 2006, and the false alarm ratio slowly decreased. The largest changes in performance take place in 2012 when the default warning duration decreased, and there is an apparent increased emphasis on reducing false alarms. As a result, the lead time, probability of detection, and false alarm ratio all decrease in 2012. Our analysis is based, in large part, on signal detection theory, which separates the quality of the warning system from the threshold for issuing warnings. Threshold changes lead to trade-offs between false alarms and missed detections. Such changes provide further evidence for changes in what the warning system as a whole considers important, as well as highlighting the limitations of measuring performance by looking at metrics independently.
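
The trade-off between false alarms and missed detections described above can be made concrete with a small contingency-table calculation. The sketch below is a simplification that assumes each tornado and each warning is counted exactly once; operational verification counts events and warnings separately. The class name and all counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarningCounts:
    hits: int          # tornadoes that were warned in advance
    misses: int        # tornadoes with no warning in advance
    false_alarms: int  # warnings with no tornado


def probability_of_detection(c: WarningCounts) -> float:
    """Fraction of observed events that were warned."""
    events = c.hits + c.misses
    return c.hits / events if events else float("nan")


def false_alarm_ratio(c: WarningCounts) -> float:
    """Fraction of issued warnings that did not verify."""
    warnings = c.hits + c.false_alarms
    return c.false_alarms / warnings if warnings else float("nan")


# Hypothetical counts for the same 100 tornadoes under two warning thresholds.
strict = WarningCounts(hits=50, misses=50, false_alarms=50)
loose = WarningCounts(hits=70, misses=30, false_alarms=210)

for name, counts in (("strict", strict), ("loose", loose)):
    pod = probability_of_detection(counts)
    far = false_alarm_ratio(counts)
    print(f"{name}: POD={pod:.2f} FAR={far:.2f}")
```

In this made-up example, lowering the threshold raises the probability of detection but also raises the false alarm ratio, which is why either metric viewed alone can give a misleading picture of warning system quality.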

Flood Forecasting in the United States NOAA/National Weather Service

Flood Forecasting ◽

10.1016/b978-0-12-801884-2.00010-4 ◽

2016 ◽

pp. 249-310 ◽

Cited By ~ 2

Author(s):

T.E. Adams

Keyword(s):

United States ◽

The United States ◽

Flood Forecasting ◽

National Weather Service ◽

Weather Service

The National Weather Service (NWS) United States Integrated sustained Ocean Observing System and the federal backbone - the next steps, what can the federal backbone do for you?

Oceans 2003. Celebrating the Past ... Teaming Toward the Future (IEEE Cat. No.03CH37492) ◽

10.1109/oceans.2003.178307 ◽

2003 ◽

Author(s):

D. Henderson

Keyword(s):

United States ◽

National Weather Service ◽

Ocean Observing ◽

Weather Service

Eighty-Five Percent and Holding—A Limit to Forecast Accuracy?

Bulletin of the American Meteorological Society ◽

10.1175/1520-0477-60.7.788 ◽

1979 ◽

Vol 60 (7) ◽

pp. 788-790 ◽

Cited By ~ 91

Author(s):

Daniel L. Smith

Keyword(s):

Forecast Accuracy ◽

National Weather Service ◽

Percent Correct ◽

Probability Of Precipitation ◽

Weather Service

The significance of Percent Correct Scores for National Weather Service (NWS) probability of precipitation (PoP) forecasts is examined. It is shown that the areal variability of rainfall and the nature of PoP forecasts preclude the achievement of a score of 100%—even for the best possible forecasts. A maximum possible percent correct is defined and radar estimates of rainfall coverage are combined with actual forecasts to determine how closely NWS forecasters approached this limit. Day- and nighttime percent correct scores were 75% and 85%, respectively, for the data examined. These values were close to the respective maximum possible scores of 83% and 90%. Relatively small changes in forecasters' percent correct scores are considered in light of these findings.

Intercomparison of Mixing Layer Heights from the National Weather Service Ceilometer Test Sites and Collocated Radiosondes

Journal of Atmospheric and Oceanic Technology ◽

10.1175/jtech-d-18-0058.1 ◽

2019 ◽

Vol 36 (1) ◽

pp. 129-137 ◽

Cited By ~ 2

Author(s):

Micheal Hicks ◽

Belay Demoz ◽

Kevin Vermeesch ◽

Dennis Atkinson

Keyword(s):

United States ◽

Mixing Layer ◽

The United States ◽

Arid Climate ◽

National Weather Service ◽

Layer Height ◽

Hot Arid Climate ◽

Mixing Layer Height ◽

Sky Conditions ◽

Weather Service

Abstract A network of automated weather stations (AWS) with ceilometers can be used to detect sky conditions, aerosol dispersion, and mixing layer heights, in addition to the routine surface meteorological parameters (temperature, pressure, humidity, etc.). Currently, a dense network of AWSs that observe all of these parameters does not exist in the United States even though networks of them with ceilometers exist. These networks normally use ceilometers for determining only sky conditions. Updating AWS networks to obtain those nonstandard observations with ceilometers, especially mixing layer height, across the United States would provide valuable information for validating and improving weather/climate forecast models. In this respect, an aerosol-based mixing layer height detection method, called the combined-hybrid method, is developed and evaluated for its uncertainty characteristics for application in the United States. Four years of ceilometer data from the National Weather Service Ceilometer Proof of Concept Project taken in temperate, maritime polar, and hot/arid climate regimes are utilized in this evaluation. Overall, the method proved to be a strong candidate for estimating mixing layer heights with ceilometer data, with averaged uncertainties of 237 ± 398 m in all tested climate regimes and 69 ± 250 m when excluding the hot/arid climate regime.

Usefulness of the United States National Weather Service Radar Display as Rated by Website Users

Weather Climate and Society ◽

10.1175/wcas-d-17-0108.1 ◽

2018 ◽

Vol 10 (4) ◽

pp. 673-691 ◽

Cited By ~ 4

Author(s):

Michelle E. Saunders ◽

Kevin D. Ash ◽

Jennifer M. Collins

Keyword(s):

United States ◽

Spatial Clustering ◽

Weather Radar ◽

Weather Conditions ◽

Radar Data ◽

The United States ◽

Perceived Usefulness ◽

National Weather Service ◽

Weather Information ◽

Weather Service

Abstract Weather radar is now widely viewed by the general public in the United States via television, computers/tablets, and smartphones. Anyone can consult near-real-time maps and animations of weather radar data when weather conditions are a factor. However, the usefulness of weather radar data for each user depends on a complex interaction of factors. There have been few studies providing conceptual arguments and empirical data to better understand what the most important factors are and to comprehend patterns of public weather radar use across the United States. The first part of this research provides a basic conceptual framework for research investigating the usefulness of weather radar displays as a source of weather information and as a decision aid. The second part aims to uncover several factors that influence the perceived usefulness rating of the National Weather Service (NWS) website’s weather radar display at both national and regional levels using variables gathered from the 2014 NWS customer satisfaction survey alongside relevant geographic and climatological variables. Data analyses include spatial clustering and ordinal regression utilized within a generalized linear model methodology. Overall, respondents who are more familiar with the NWS and their products, as well as those who indicate they are more likely to take action based on information provided by the NWS, are more likely to find the NWS radar display useful. Geographically, the NWS radar display is most useful to persons residing in the southern United States. Lightning is the most important hazard associated with higher radar usefulness ratings.

Performance of National Weather Service Forecasts Compared to Operational, Consensus, and Weighted Model Output Statistics

Weather and Forecasting ◽

10.1175/waf896.1 ◽

2005 ◽

Vol 20 (6) ◽

pp. 1034-1047 ◽

Cited By ~ 31

Author(s):

Jeffrey A. Baars ◽

Clifford F. Mass

Keyword(s):

United States ◽

The United States ◽

Minimum Variance ◽

Model Output ◽

National Weather Service ◽

Past Performance ◽

Temperature Changes ◽

Temperature And Precipitation ◽

Model Output Statistics ◽

Weather Service

Abstract Model output statistics (MOS) guidance has been the central model postprocessing approach used by the National Weather Service since the 1970s. A recent advancement in the use of MOS is the application of “consensus” MOS (CMOS), an average of MOS from two or more models. CMOS has shown additional skill over individual MOS forecasts and has performed well compared to humans in forecasting contests. This study compares MOS, CMOS, and WMOS (weighting component MOS predictions by their past performance) forecasts of temperature and precipitation to those of the National Weather Service (NWS) subjective forecasts. Data from 29 locations throughout the United States from 1 August 2003 through 1 August 2004 are used. MOS forecasts from the Global Forecast System (GMOS), Eta (EMOS), and Nested Grid Model (NMOS) models are included, with CMOS being a simple average of these three forecasts. WMOS is calculated using weights determined from a minimum variance method, with varying training periods for each station and variable. Performance is analyzed at various forecast periods, by region of the United States, and by time/season, as well as for periods of large daily temperature changes or large departures from climatology. The results show that CMOS is competitive or superior to human forecasts at nearly all locations and that WMOS is superior to CMOS. Human forecasts are most skillful compared to MOS during the first forecast day and for periods when temperatures differ greatly from climatology. The implications of these results regarding the future role of human forecasters are examined in the conclusions.
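
To illustrate the difference between a simple-average consensus and a performance-weighted combination, the sketch below averages three MOS forecasts and then reweights them by the inverse of each model's past mean squared error. Inverse-error-variance weighting is one common minimum-variance scheme when model errors are assumed uncorrelated; the abstract does not give the paper's exact formula, so this is an assumption. All function names and numbers are hypothetical.

```python
from statistics import fmean


def consensus(forecasts):
    """Simple average of the component forecasts (CMOS-style)."""
    return fmean(forecasts)


def inverse_variance_weights(past_errors):
    """Weights proportional to 1 / mean squared error over a training period.

    past_errors holds one list of (forecast - observed) errors per model.
    Assumes uncorrelated model errors and a nonzero error for every model.
    """
    mse = [fmean(e * e for e in errors) for errors in past_errors]
    if any(m == 0 for m in mse):
        raise ValueError("a model with zero training error has an undefined weight")
    inverse = [1.0 / m for m in mse]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_forecast(forecasts, weights):
    """Weighted combination of the component forecasts (WMOS-style)."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical maximum temperature forecasts (deg F) in the order GMOS, EMOS, NMOS.
todays_forecasts = [60.0, 66.0, 66.0]
# Hypothetical training-period errors for the same three models.
training_errors = [[1.0, -1.0], [2.0, -2.0], [2.0, -2.0]]

weights = inverse_variance_weights(training_errors)
print(consensus(todays_forecasts))
print(weighted_forecast(todays_forecasts, weights))
```

Here the first model had the smallest training error, so the weighted forecast is pulled toward its value relative to the simple average.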

The Performance of MOS in the Digital Age

Weather and Forecasting ◽

10.1175/2008waf2222158.1 ◽

2009 ◽

Vol 24 (2) ◽

pp. 504-519 ◽

Cited By ~ 16

Author(s):

David P. Ruth ◽

Bob Glahn ◽

Valery Dagostaro ◽

Kathryn Gilbert

Keyword(s):

United States ◽

Long Range ◽

Complex Terrain ◽

Short Range ◽

Digital Age ◽

Model Output ◽

National Weather Service ◽

Model Output Statistics ◽

Weather Service

Abstract Model output statistics (MOS) guidance forecasts have been produced for over three decades. Until recently, MOS guidance was prepared for observing stations and formatted in text bulletins while official National Weather Service (NWS) forecasts for stations and zones were prepared by forecasters typing text. The flagship product of today’s NWS is the National Digital Forecast Database (NDFD). In support of NDFD, MOS is now also produced on grids. This paper compares MOS and gridded MOS (GMOS) to the forecaster-produced NDFD at approximately 1200 station locations in the conterminous United States. Results indicate that GMOS should provide good guidance for preparing the NDFD. In those areas of the country where station observations well represent the grid, GMOS features accuracy comparable to that of NDFD. In areas of complex terrain not well represented by station observations, GMOS appears similar to NDFD in its depiction. A new score is introduced to measure convergence from a long-range forecast to the final short-range forecast. This shows good GMOS forecast continuity when compared to station MOS and NDFD.

Weather Analysis and Forecasting

ASME 1979 Citrus Engineering Conference ◽

10.1115/cec1979-2505 ◽

1979 ◽

Author(s):

Frederick L. Crosby

Keyword(s):

United States ◽

The United States ◽

National Weather Service ◽

Weather Service

I appreciate the opportunity to talk to the 25th Conference for a few minutes today on the procedures and programs used by the National Weather Service to provide a meteorological service to the United States. Paper published with permission.
