Aviation Turbulence Forecasting at Upper Levels with Machine Learning Techniques Based on Regression Trees

Abstract. We explore the use of machine learning (ML) techniques, namely, regression trees (RT), for the purpose of aviation turbulence forecasting at upper levels [20–45 kft (~6–14 km) in altitude]. In particular, we develop a series of RT-based algorithms that include random forests (RF) and gradient-boosted regression trees (GBRT) methods. Numerical weather prediction (NWP) model prognostic variables and derived turbulence diagnostics based on 6-h forecasts from the 3-km High-Resolution Rapid Refresh model are used as features to train these data-driven models. Training and evaluation are based on turbulence estimates of eddy dissipation rate (EDR) obtained from automated in situ aircraft reports. Our baseline RF model, consisting of 100 trees with a maximum depth of 30 layers, significantly reduces forecast errors for EDR < 0.1 m^(2/3) s^(−1) (which corresponds roughly to null and light turbulence) when compared with a simple regression model, increasing the probability of detection and in turn reducing the number of false alarms. Model complexity reduction via GBRT and feature-relevance analyses is performed, indicating that considerable execution speedups can be achieved while maintaining the model’s predictive skill. Overall, the ML models exhibit enhanced performance in discriminating the EDR forecast among the light, moderate, and severe turbulence categories. In addition, these artificial intelligence techniques significantly simplify the generation of new NWP and grid-spacing specific turbulence forecast products.

Impact of the assimilation of lightning data on the precipitation forecast at different forecast ranges

Advances in Science and Research ◽

10.5194/asr-14-187-2017 ◽

2017 ◽

Vol 14 ◽

pp. 187-194 ◽

Cited By ~ 5

Author(s):

Stefano Federico ◽

Marco Petracca ◽

Giulia Panegrossi ◽

Claudio Transerici ◽

Stefano Dietrich

Keyword(s):

Data Assimilation ◽

Mesoscale Model ◽

Weather Prediction ◽

Horizontal Resolution ◽

Probability Of Detection ◽

Precipitation Forecast ◽

Time Range ◽

False Alarms ◽

Forecast Time ◽

The Impact

Abstract. This study investigates the impact of the assimilation of total lightning data on the precipitation forecast of a numerical weather prediction (NWP) model. The impact of the lightning data assimilation, which uses water vapour substitution, is investigated at different forecast time ranges, namely 3, 6, 12, and 24 h, to determine how long and to what extent the assimilation affects the precipitation forecast of long-lasting rainfall events (> 24 h). The methodology developed in a previous study is slightly modified here and is applied to twenty case studies that occurred over Italy, using a mesoscale model run at convection-permitting horizontal resolution (4 km). The performance is quantified by dichotomous statistical scores computed using a dense rain gauge network over Italy. Results show the important impact of the lightning assimilation on the precipitation forecast, especially for the 3 and 6 h forecasts. The probability of detection (POD), for example, increases by 10 % for the 3 h forecast using the assimilation of lightning data compared with the simulation without lightning assimilation, for all precipitation thresholds considered. The Equitable Threat Score (ETS) is also improved by the lightning assimilation, especially for thresholds below 40 mm day^(−1). Results show that the forecast time range is very important because the performance decreases steadily and substantially with the forecast time. The POD, for example, is improved by 1–2 % for the 24 h forecast using lightning data assimilation, compared with 10 % for the 3 h forecast. The impact of the false alarms on the model performance is also highlighted by this study.
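The dichotomous scores named in this abstract (POD, ETS) can be illustrated with a short sketch. It uses the standard 2×2 contingency-table definitions rather than the authors' own code; the function names, the sample rainfall values, and the 10 mm threshold are assumptions made only for illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class Contingency:
    hits: int               # forecast yes, observed yes
    false_alarms: int       # forecast yes, observed no
    misses: int             # forecast no, observed yes
    correct_negatives: int  # forecast no, observed no


def contingency(forecast, observed, threshold):
    """Build the 2x2 table for the event 'value >= threshold'."""
    table = Contingency(0, 0, 0, 0)
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            table.hits += 1
        elif f_yes:
            table.false_alarms += 1
        elif o_yes:
            table.misses += 1
        else:
            table.correct_negatives += 1
    return table


def _ratio(num, den):
    # Scores are undefined when the denominator is zero.
    return num / den if den else math.nan


def pod(t):
    """Probability of detection: fraction of observed events that were forecast."""
    return _ratio(t.hits, t.hits + t.misses)


def far(t):
    """False alarm ratio: fraction of forecast events that did not occur."""
    return _ratio(t.false_alarms, t.hits + t.false_alarms)


def ets(t):
    """Equitable threat score: threat score corrected for hits expected by chance."""
    n = t.hits + t.false_alarms + t.misses + t.correct_negatives
    random_hits = _ratio((t.hits + t.false_alarms) * (t.hits + t.misses), n)
    return _ratio(t.hits - random_hits,
                  t.hits + t.false_alarms + t.misses - random_hits)


# Hypothetical daily rainfall (mm) at five gauges; values are made up.
forecast_mm = [5, 50, 0, 45, 10]
observed_mm = [12, 60, 30, 2, 1]
table = contingency(forecast_mm, observed_mm, threshold=10)
print(table, pod(table), far(table), ets(table))
```

Computing the table once per precipitation threshold, and once per forecast range, would give the kind of threshold-by-range comparison the abstract describes.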

Insights into fish-anthropogenic pressures relationships using machine learning techniques: the case of Castilla-La Mancha (Spain)

10.5194/egusphere-egu21-7119 ◽

2021 ◽

Author(s):

Carlotta Valerio ◽

Graciela Gómez Nicola ◽

Rocío Aránzazu Baquero Noriega ◽

Alberto Garrido ◽

Lucia De Stefano

Keyword(s):

Machine Learning ◽

Fish Species ◽

Conservation Status ◽

Boosted Regression Trees ◽

Machine Learning Techniques ◽

Freshwater Species ◽

Anthropogenic Pressures ◽

La Mancha ◽

Learning Techniques ◽

Starting Point

Since 1970 the number of freshwater species has suffered a decline of 83% worldwide, and anthropic activities are considered to be major drivers of ecosystem degradation. Linking the ecological response to the multiple anthropogenic stressors acting in the system is essential to effectively design policy measures to restore riverine ecosystems. However, obtaining quantitative links between stressors and ecological status is still challenging, given the non-linearity of the ecosystem response and the need to consider multiple factors at play. This study applies machine learning techniques to explore the relationships between anthropogenic pressures and the composition of fish communities in the river basins of Castilla-La Mancha, a region covering nearly 79 500 km² in central Spain. During the past two decades, this region has experienced an alarming decline in the conservation status of native fish species. The starting point for the analysis is a 10 × 10 km grid that defines for each cell the presence or absence of several fish species before and after 2001. This database was used to characterize the evolution of several metrics of fish species richness over time, accounting for the species origin (native or alien), species features (e.g. pollution tolerance) and habitat preferences. Random Forest and Gradient Boosted Regression Trees algorithms were used to relate the resulting metrics to the stressor variables describing the anthropogenic pressures acting in the rivers, such as urban wastewater discharges, land use cover, hydro-morphological degradation and the alteration of the river flow regime. The study provides new, quantitative insights into pressures-ecosystem relationships in rivers and reveals the main factors that lead to the decline of fish richness in Castilla-La Mancha, which could help inform environmental policy initiatives.

Exploring multi-modalities in weather prediction using a univariate graph based on machine learning techniques

10.5194/egusphere-egu21-11747 ◽

2021 ◽

Author(s):

Natacha Galmiche ◽

Nello Blaser ◽

Morten Brun ◽

Helwig Hauser ◽

Thomas Spengler ◽

...

Keyword(s):

Machine Learning ◽

Standard Deviation ◽

Probability Distributions ◽

Weather Prediction ◽

A Priori ◽

Clustering Algorithms ◽

Quantitative Information ◽

Machine Learning Techniques ◽

Topological Data Analysis ◽

Learning Techniques

Probability distributions based on ensemble forecasts are commonly used to assess uncertainty in weather prediction. However, interpreting these distributions is not trivial, especially in the case of multimodality with distinct likely outcomes. The conventional summary employs mean and standard deviation across ensemble members, which works well for unimodal, Gaussian-like distributions. In the case of multimodality, this summary is misleading because it discards crucial information. We aim to combine previously developed clustering algorithms in machine learning and topological data analysis to extract useful information such as the number of clusters in an ensemble. Given the chaotic behaviour of the atmosphere, machine learning techniques can provide relevant results even if no, or very little, a priori information about the data is available. In addition, topological methods that analyse the shape of the data can make results explainable. Given an ensemble of univariate time series, a graph is generated whose edges and vertices represent clusters of members, including additional information for each cluster such as the members belonging to them, their uncertainty, and their relevance according to the graph. In the case of multimodality, this approach provides relevant and quantitative information beyond the commonly used mean and standard deviation approach that helps to further characterise the predictability.
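A small numerical sketch can show why the mean and standard deviation mislead for a bimodal ensemble. It does not reproduce the clustering or topological methods of the abstract: `gap_clusters` is a deliberately simple, hypothetical one-dimensional stand-in, and the ensemble values and the `min_gap` setting are made up for illustration.

```python
import statistics


def gap_clusters(values, min_gap):
    """Split sorted 1-D values into groups wherever neighbours differ by more than min_gap.

    A simple stand-in for a real clustering algorithm, used only to count modes.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > min_gap:
            clusters.append([value])      # large gap: start a new cluster
        else:
            clusters[-1].append(value)    # small gap: same cluster
    return clusters


# Hypothetical ensemble at one forecast time: two distinct likely outcomes.
ensemble = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]

mean = statistics.fmean(ensemble)
spread = statistics.pstdev(ensemble)
clusters = gap_clusters(ensemble, min_gap=1.0)

print(f"mean={mean:.2f}, std={spread:.2f}")
for members in clusters:
    print(f"cluster of {len(members)} members centred near {statistics.fmean(members):.2f}")
```

In this made-up case the ensemble mean falls between the two groups, where no member lies, while the per-cluster summary keeps both outcomes visible.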

Quasi-Operational Testing of Real-Time Storm-Longevity Prediction via Machine Learning

Weather and Forecasting ◽

10.1175/waf-d-18-0141.1 ◽

2019 ◽

Vol 34 (5) ◽

pp. 1437-1451 ◽

Cited By ~ 2

Author(s):

Amy McGovern ◽

Christopher D. Karstens ◽

Travis Smith ◽

Ryan Lagerquist

Keyword(s):

Machine Learning ◽

Real Time ◽

Weather Prediction ◽

Warning System ◽

Objective Evaluation ◽

Boosted Regression Trees ◽

Time Prediction ◽

Operational Testing ◽

Trade Offs ◽

Naturalistic Environment

Abstract. Real-time prediction of storm longevity is a critical challenge for National Weather Service (NWS) forecasters. These predictions can guide forecasters when they issue warnings and implicitly inform them about the potential severity of a storm. This paper presents a machine-learning (ML) system that was used for real-time prediction of storm longevity in the Probabilistic Hazard Information (PHI) tool, making it a Research-to-Operations (R2O) project. Currently, PHI provides forecasters with real-time storm variables and severity predictions from the ProbSevere system, but these predictions do not include storm longevity. We specifically designed our system to be tested in PHI during the 2016 and 2017 Hazardous Weather Testbed (HWT) experiments, which are a quasi-operational naturalistic environment. We considered three ML methods that have proven in prior work to be strong predictors for many weather prediction tasks: elastic nets, random forests, and gradient-boosted regression trees. We present experiments comparing the three ML methods with different types of input data, discuss trade-offs between forecast quality and requirements for real-time deployment, and present both subjective (human-based) and objective evaluation of real-time deployment in the HWT. Results demonstrate that the ML system has lower error than human forecasters, which suggests that it could be used to guide future storm-based warnings, enabling forecasters to focus on other aspects of the warning system.

A Precipitation Nowcasting Mechanism for Real-World Data Based on Machine Learning

Mathematical Problems in Engineering ◽

10.1155/2020/8408931 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Yanfei Xiang ◽

Jianbing Ma ◽

Xi Wu

Keyword(s):

Machine Learning ◽

Optical Flow ◽

Weather Prediction ◽

Radar Data ◽

Machine Learning Techniques ◽

Economic Losses ◽

Model Parameters ◽

Real World Data ◽

Flow Method ◽

Optical Flow Method

Unpredicted precipitation, even when mild, may cause severe economic losses to many businesses. Precipitation nowcasting is therefore important for making correct and timely decisions. For traditional methods, such as numerical weather prediction (NWP), accuracy is limited because strong convective weather often occurs at scales smaller than the minimum scale that the model can capture. NWP also often requires a supercomputer. Furthermore, the optical flow method has been shown to be applicable to precipitation nowcasting. However, it is difficult to determine the model parameters because the two steps of tracking and extrapolation are separate. Meanwhile, current machine learning applications are based on well-selected full datasets, ignoring the fact that real datasets quite often contain missing data requiring extra consideration. In this paper, we used a real Hubei dataset in which a few radar echo data are missing and proposed a proper mechanism to deal with the situation. Furthermore, we proposed a novel mechanism for radar reflectivity data with single altitudes or cumulative altitudes using machine learning techniques. From the experimental results, we conclude that our method can predict future precipitation with high accuracy when a few data are missing, and it outperforms the traditional optical flow method. In addition, our model can be used for various types of radar data with type-specific feature extraction, which makes the method more flexible and suitable for most situations.

Verification Results from the 2017 HMT–WPC Flash Flood and Intense Rainfall Experiment

Journal of Applied Meteorology and Climatology ◽

10.1175/jamc-d-19-0097.1 ◽

2019 ◽

Vol 58 (12) ◽

pp. 2591-2604 ◽

Cited By ~ 2

Author(s):

Michael J. Erickson ◽

Joshua S. Kastman ◽

Benjamin Albright ◽

Sarah Perfater ◽

James A. Nelson ◽

...

Keyword(s):

Machine Learning ◽

Flash Flood ◽

Weather Prediction ◽

Stage IV ◽

Skill Score ◽

Machine Learning Techniques ◽

Intense Rainfall ◽

Deterministic Models ◽

Relative Operating Characteristic ◽

Risk Categories

Abstract. The Flash Flood and Intense Rainfall (FFaIR) Experiment developed within the Hydrometeorology Testbed (HMT) of the Weather Prediction Center (WPC) is a pseudo-operational platform for participants from across the weather enterprise to test emerging flash flood forecasting tools and issue experimental forecast products. This study presents the objective verification portion of the 2017 edition of the experiment, which examines the performance from a variety of guidance tools (deterministic models, ensembles, and machine-learning techniques) and the participants’ forecasts, with occasional reference to the participants’ subjective ratings. The skill of the model guidance used in the FFaIR Experiment is evaluated using performance diagrams verified against the Stage IV analysis. The operational and FFaIR Experiment versions of the excessive rainfall outlook (ERO) are evaluated by assessing the frequency of issuances, probabilistic calibration, Brier skill score (BSS), and area under relative operating characteristic (AuROC). An ERO first-guess field called the Colorado State University Machine-Learning Probabilities method (CSU-MLP) is also evaluated in the FFaIR Experiment. Among convection-allowing models, the Met Office Unified Model generally performed optimally throughout the FFaIR Experiment when using performance diagrams (at the 0.5- and 1-in. thresholds; 1 in. = 25.4 mm), whereas the High-Resolution Rapid Refresh (HRRR), version 3, performed best subjectively. In terms of subjective and objective ensemble scores, the HRRR ensemble scored optimally. The CSU-MLP overpredicted lower risk categories and underpredicted higher risk categories, but it shows future promise as an ERO first-guess field. The EROs issued by the FFaIR Experiment forecasters had improved BSS and AuROC relative to the operational ERO, suggesting that the experimental guidance may have aided forecasters.
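As a rough illustration of the Brier skill score mentioned in this abstract, the sketch below uses the textbook definition BSS = 1 − BS / BS_ref. Taking the sample base rate as the reference forecast is an assumption made here; the abstract does not state which reference the study used. The function names and sample probabilities are likewise hypothetical.

```python
import math


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and the 0/1 outcome."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("need equally sized, non-empty sequences")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, outcomes):
    """BSS = 1 - BS / BS_ref, with the sample base rate as reference (an assumption).

    1 is a perfect forecast, 0 matches the reference, negative is worse than it.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        return math.nan  # every outcome identical: skill is undefined
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical probabilities of excessive rainfall and whether it occurred.
forecast_probability = [0.9, 0.1, 0.8, 0.2]
event_occurred = [1, 0, 1, 0]

print(brier_score(forecast_probability, event_occurred))
print(brier_skill_score(forecast_probability, event_occurred))
```

A comparison like the one in the abstract would compute this score separately for the experimental and the operational outlooks against the same observations.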

MACHINE LEARNING AND GERONTOLOGY: BOOSTED REGRESSION TREES PREDICT AGE DIFFERENCES IN STRESSOR EXPERIENCE

The Gerontologist ◽

10.1093/geront/gnv195.10 ◽

2015 ◽

Vol 55 (Suppl_2) ◽

pp. 461-462

Keyword(s):

Machine Learning ◽

Age Differences ◽

Regression Trees ◽

Boosted Regression Trees

Analysis of the Impact of Sustained Load and Temperature on the Performance of the Electromechanical Impedance Technique through Multilevel Machine Learning and FBG Sensors

Sensors ◽

10.3390/s21175755 ◽

2021 ◽

Vol 21 (17) ◽

pp. 5755

Author(s):

Ricardo Perera ◽

Lluis Torres ◽

Francisco J. Díaz ◽

Cristina Barris ◽

Marta Baena

Keyword(s):

Machine Learning ◽

Mechanical Performance ◽

Mechanical Impedance ◽

Machine Learning Techniques ◽

Sustained Load ◽

False Alarms ◽

Engineering Structures ◽

Near Surface ◽

Electromechanical Impedance ◽

The Impact

The electro-mechanical impedance (EMI) technique has been applied successfully to detect minor damage in engineering structures including reinforced concrete (RC). However, in the presence of temperature variations, it can cause false alarms in structural health monitoring (SHM) applications. This paper has developed an innovative approach that integrates the EMI methodology with multilevel hierarchical machine learning techniques and the use of fiber Bragg grating (FBG) temperature and strain sensors to evaluate the mechanical performance of RC beams strengthened with near surface mounted (NSM)-fiber reinforced polymer (FRP) under sustained load and varied temperatures. This problem is a real challenge since the bond behavior at the concrete–FRP interface plays a key role in the performance of this type of structure, and additionally, its failure occurs in a brittle and sudden way. The method was validated in a specimen tested over a period of 1.5 years under different conditions of sustained load and temperature. The analysis of the experimental results in an especially complex problem with the proposed approach demonstrated its effectiveness as an SHM method in a combined EMI–FBG framework.

Can machine learning improve the model representation of TKE dissipation rate in the boundary layer for complex terrain?

10.5194/gmd-2020-16 ◽

2020 ◽

Author(s):

Nicola Bodini ◽

Julie K. Lundquist ◽

Mike Optis

Keyword(s):

Machine Learning ◽

Numerical Weather Prediction ◽

Complex Terrain ◽

Dissipation Rate ◽

Prediction Models ◽

Weather Prediction ◽

Machine Learning Techniques ◽

Learning Techniques ◽

TKE Dissipation Rate ◽

Numerical Weather Prediction Models

Abstract. Current turbulence parameterizations in numerical weather prediction models at the mesoscale assume a local equilibrium between production and dissipation of turbulence. As this assumption does not hold at fine horizontal resolutions, improved ways to represent turbulent kinetic energy (TKE) dissipation rate (ε) are needed. Here, we use a 6-week data set of turbulence measurements from 184 sonic anemometers in complex terrain at the Perdigão field campaign to suggest improved representations of dissipation rate. First, we demonstrate that a widely used Mellor, Yamada, Nakanishi, and Niino (MYNN) parameterization of TKE dissipation rate leads to a large inaccuracy and bias in the representation of ε. Next, we assess the potential of machine-learning techniques to predict TKE dissipation rate from a set of atmospheric and terrain-related features. We train and test several machine-learning algorithms using the data at Perdigão, and we find that multivariate polynomial regressions and random forests can eliminate the bias MYNN currently shows in representing ε, while also reducing the average error by up to 30 %. Of all the variables included in the algorithms, TKE is the variable responsible for most of the variability of ε, and a strong positive correlation exists between the two. These results suggest further consideration of machine-learning techniques to enhance parameterizations of turbulence in numerical weather prediction models.

Can machine learning improve the model representation of turbulent kinetic energy dissipation rate in the boundary layer for complex terrain?

Geoscientific Model Development ◽

10.5194/gmd-13-4271-2020 ◽

2020 ◽

Vol 13 (9) ◽

pp. 4271-4285

Author(s):

Nicola Bodini ◽

Julie K. Lundquist ◽

Mike Optis

Keyword(s):

Machine Learning ◽

Kinetic Energy ◽

Complex Terrain ◽

Dissipation Rate ◽

Prediction Models ◽

Weather Prediction ◽

Machine Learning Techniques ◽

Learning Techniques ◽

TKE Dissipation Rate ◽

Numerical Weather Prediction Models

Abstract. Current turbulence parameterizations in numerical weather prediction models at the mesoscale assume a local equilibrium between production and dissipation of turbulence. As this assumption does not hold at fine horizontal resolutions, improved ways to represent turbulent kinetic energy (TKE) dissipation rate (ϵ) are needed. Here, we use a 6-week data set of turbulence measurements from 184 sonic anemometers in complex terrain at the Perdigão field campaign to suggest improved representations of dissipation rate. First, we demonstrate that the widely used Mellor, Yamada, Nakanishi, and Niino (MYNN) parameterization of TKE dissipation rate leads to a large inaccuracy and bias in the representation of ϵ. Next, we assess the potential of machine-learning techniques to predict TKE dissipation rate from a set of atmospheric and terrain-related features. We train and test several machine-learning algorithms using the data at Perdigão, and we find that the models eliminate the bias MYNN currently shows in representing ϵ, while also reducing the average error by up to almost 40 %. Of all the variables included in the algorithms, TKE is the variable responsible for most of the variability of ϵ, and a strong positive correlation exists between the two. These results suggest further consideration of machine-learning techniques to enhance parameterizations of turbulence in numerical weather prediction models.
