Comparison of wavelet and machine learning methods for regional drought prediction

Author(s):  
Gilbert Hinge ◽  
Ashutosh Sharma

Droughts are considered one of the most catastrophic natural disasters, affecting humans and their surroundings at a larger spatial scale than other disasters. Rajasthan, one of India's semiarid states, is drought-prone and has experienced many drought events in the past. In this study, we evaluated different preprocessing and Machine Learning (ML) approaches for drought prediction in Rajasthan for lead times of up to 6 months. The Standardized Precipitation Index (SPI) was used as the drought-quantifying measure to identify drought events. SPI was calculated for 3-, 6-, and 12-month timescales over the last 115 years using monthly rainfall data at 119 grid stations. Three ML techniques, namely Artificial Neural Network (ANN), Support Vector Regression (SVR), and Linear Regression (LR), were evaluated for their accuracy in drought forecasting over different lead times. Furthermore, two data preprocessing methods, namely the Wavelet Packet Transform (WPT) and the Discrete Wavelet Transform (DWT), were used to enhance the predictability of these ML models. First, the preprocessed SPI data from both methods were used as inputs to LR, SVR, and ANN to form hybrid models. The hybrid models' drought predictability at different lead times was then evaluated and compared with that of the standalone ML models. The forecasting performance of all models at all 119 grid points was assessed with three statistical indices: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Nash-Sutcliffe Efficiency (NSE). RMSE was used to select the optimal model parameters, such as the number of hidden neurons and the number of inputs in the ANN, and the level of decomposition and the mother wavelet in the wavelet analysis. Based on these measures, the coupled models showed better forecasting performance than the standalone ML models. The coupled WPT-ANN model showed better predictability than the other coupled and standalone models at most grid points. All models' performance improved as the SPI timescale increased from 3 to 12 months for all lead times. However, model performance decreased as the lead time increased. These findings indicate the necessity of preprocessing the data before applying any machine learning technique. The hybrid model's prediction performance also shows that it can be used for drought early warning systems in the state.
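
The three indices named in the abstract above have standard definitions, which can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the sample SPI values are hypothetical and chosen only to show the calculation.

```python
import math

def mae(observed, forecast):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)

def rmse(observed, forecast):
    """Root Mean Square Error: penalises large errors more than MAE."""
    return math.sqrt(
        sum((o - f) ** 2 for o, f in zip(observed, forecast)) / len(observed)
    )

def nse(observed, forecast):
    """Nash-Sutcliffe Efficiency: 1 is a perfect forecast, 0 means the
    forecast is no better than always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# Hypothetical SPI values for one grid point (illustrative only).
observed_spi = [-1.2, -0.5, 0.3, 0.8, -0.1]
forecast_spi = [-1.0, -0.6, 0.1, 0.9, 0.0]

print("MAE :", mae(observed_spi, forecast_spi))
print("RMSE:", rmse(observed_spi, forecast_spi))
print("NSE :", nse(observed_spi, forecast_spi))
```

Lower MAE and RMSE and an NSE closer to 1 indicate a better forecast, which is the sense in which the abstract compares the coupled and standalone models.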

2020 ◽  
Author(s):  
Murad Megjhani ◽  
Kalijah Terilli ◽  
Ayham Alkhachroum ◽  
David J. Roh ◽  
Sachin Agarwal ◽  
...  

Abstract. Objective: To develop a machine learning based tool, using routine vital signs, to assess delayed cerebral ischemia (DCI) risk over time. Methods: In this retrospective analysis, physiologic data for 540 consecutive acute subarachnoid hemorrhage patients were collected and annotated as part of a prospective observational cohort study between May 2006 and December 2014. Patients were excluded if (i) no physiologic data was available, (ii) they expired prior to the DCI onset window (< post bleed day 3) or (iii) early angiographic vasospasm was detected on admitting angiogram. DCI was prospectively labeled by consensus of treating physicians. Occurrence of DCI was classified using various machine learning approaches including logistic regression, random forest, support vector machine (linear and kernel), and an ensemble classifier, trained on vitals and subject characteristic features. Hourly risk scores were generated as the posterior probability at time t. We performed five-fold nested cross validation to tune the model parameters and to report the accuracy. All classifiers were evaluated for good discrimination using the area under the receiver operating characteristic curve (AU-ROC) and confusion matrices. Results: Of 310 patients included in our final analysis, 101 (32.6%) patients developed DCI. We achieved maximal classification of 0.81 [0.75-0.82] AU-ROC. We also predicted 74.7% of all DCI events 12 hours before typical clinical detection with a ratio of 3 true alerts for every 2 false alerts. Conclusion: A data-driven machine learning based detection tool offered hourly assessments of DCI risk and incorporated new physiologic information over time.


2021 ◽  
Author(s):  
Edward E. Salakpi ◽  
Peter D. Hurley ◽  
James M. Muthoka ◽  
Adam B. Barrett ◽  
Andrew Bowell ◽  
...  

Abstract. Droughts form a large part of climate/weather-related disasters reported globally. In Africa, pastoralists living in the Arid and Semi-Arid Lands (ASALs) are the worst affected. Prolonged dry spells that cause vegetation stress in these regions have resulted in the loss of income and livelihoods. To curb this, global initiatives like the Paris Agreement and the United Nations recognised the need to establish Early Warning Systems (EWS) to save lives and livelihoods. Existing EWS use a combination of Satellite Earth Observation (EO) based biophysical indicators, like the Vegetation Condition Index (VCI), and socio-economic factors to measure and monitor droughts. Most of these EWS rely on expert knowledge to estimate upcoming drought conditions, without using forecast models. Recent research has shown that robust algorithms like Auto-Regression, Gaussian Processes and Artificial Neural Networks can provide very skilful models for forecasting vegetation condition at short to medium range lead times. However, to enable preparedness for early action, forecasts with a longer lead time are needed. The objective of this research work is to develop models that forecast vegetation conditions at longer lead times, on the premise that vegetation condition is controlled by factors like precipitation and soil moisture. To achieve this, we used a Bayesian Auto-Regressive Distributed Lag (BARDL) modelling approach, which enabled us to factor lagged information from precipitation and soil moisture levels into our VCI forecast model. The results showed a ∼2-week gain in the forecast range compared to the univariate AR model used as a baseline. The R² scores for the Bayesian ARDL model were 0.94, 0.85 and 0.74, compared to the AR model's R² of 0.88, 0.77 and 0.65, for lead times of 6, 8 and 10 weeks, respectively.
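
For orientation, a generic auto-regressive distributed lag model has the textbook form below. It is given here as an assumption about the general model class, not as the authors' exact specification: the lag orders $p$ and $q_k$, the coefficient symbols, and the choice of which lags enter a forecast at a given lead time are all illustrative.

```latex
% Generic ARDL form (illustrative; not the paper's exact specification).
% y_t      : VCI at time t
% x_{k,t}  : k-th exogenous driver at time t (here precipitation, soil moisture)
% p, q_k   : assumed lag orders for the response and for driver k
\[
  y_t = \alpha
      + \sum_{i=1}^{p} \phi_i \, y_{t-i}
      + \sum_{k=1}^{K} \sum_{j=0}^{q_k} \beta_{k,j} \, x_{k,\,t-j}
      + \varepsilon_t ,
  \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2)
\]
```

Setting every $\beta_{k,j}$ to zero recovers a univariate AR model of the kind the abstract uses as a baseline. In a Bayesian treatment, prior distributions are placed on $\alpha$, $\phi_i$, $\beta_{k,j}$ and $\sigma$, and forecasts are reported from the posterior.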


Hydrology ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 183
Author(s):  
Paul Muñoz ◽  
Johanna Orellana-Alvear ◽  
Jörg Bendix ◽  
Jan Feyen ◽  
Rolando Célleri

Worldwide, machine learning (ML) is increasingly being used for developing flood early warning systems (FEWSs). However, previous studies have not focused on establishing a methodology for determining the most efficient ML technique. We assessed FEWSs with three river states (No-alert, Pre-alert and Alert for flooding) for lead times from 1 to 12 h using the most common ML techniques: multi-layer perceptron (MLP), logistic regression (LR), K-nearest neighbors (KNN), naive Bayes (NB), and random forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1 h and 12 h cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. The proposed methodology for selecting the optimal ML technique for a FEWS can be extrapolated to other case studies. Future efforts are recommended to enhance the input data representation and to develop communication applications that raise public awareness of floods.
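
To make the two scores mentioned above concrete, the sketch below computes a macro-averaged F1 and a per-state g-mean for a three-state alert classifier. It assumes one common definition of the g-mean (the square root of sensitivity times specificity, computed one-vs-rest); the abstract does not say which variant the authors used. The labels and predictions are made up for illustration.

```python
import math

STATES = ["No-alert", "Pre-alert", "Alert"]

def one_vs_rest_counts(y_true, y_pred, label):
    """Return (tp, fp, fn, tn) treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    tn = sum(t != label and p != label for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def f1_macro(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores, so rare states
    such as Alert count as much as the frequent No-alert state."""
    scores = []
    for label in labels:
        tp, fp, fn, _ = one_vs_rest_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def g_mean(y_true, y_pred, label):
    """Geometric mean of sensitivity and specificity for one state
    (assumed definition; other variants exist)."""
    tp, fp, fn, tn = one_vs_rest_counts(y_true, y_pred, label)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical observed and forecast river states (illustrative only).
observed = ["No-alert", "No-alert", "Pre-alert", "Alert", "No-alert", "Alert"]
forecast = ["No-alert", "Pre-alert", "Pre-alert", "Alert", "No-alert", "Pre-alert"]

print("f1-macro:", f1_macro(observed, forecast, STATES))
for state in STATES:
    print(f"g-mean [{state}]:", g_mean(observed, forecast, state))
```

Both scores are designed for imbalanced classes, which matters here because flood alerts are much rarer than no-alert conditions.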


2019 ◽  
Vol 51 (1) ◽  
pp. 17-29 ◽  
Author(s):  
Ruixiang Yang ◽  
Baodeng Hou ◽  
Weihua Xiao ◽  
Chuan Liang ◽  
Xuelei Zhang ◽  
...  

Abstract. Improving flood forecasting performance is critical for flood management. Real-time flood forecasting correction techniques (e.g., proportional correction (PC) and Kalman filter (KF)) coupled with the Muskingum method improve the forecasting performance but have limitations (e.g., short lead times and inadequate performance, respectively). Here, particle filter (PF) and combination forecasting (CF) are coupled with the Muskingum method and then applied to 10 flood events along the Shaxi River, China. Two indexes (overall consistency and permissible range) are selected to compare the performances of PC, KF, PF and CF for a 3 h lead time. The changes in overall consistency for different lead times (1–6 h) are used to evaluate the applicability of PC, KF, PF and CF. The main conclusions are as follows: (1) for a 3 h lead time, the two indexes indicate that the PF performance is optimal, followed in order by KF and PC; CF performance is close to that of PF and better than that of KF. (2) The performance of PC decreases faster than that of KF and PF as the lead time increases. PC and PF are applicable for short (1–2 h) and long (3–6 h) lead times, respectively. CF is applicable for 1–6 h lead times; however, it has no advantage over PC and PF for short and long lead times, respectively, which may be due to insufficient training and an increase in cumulative errors.
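
The Muskingum method that the correction techniques above are coupled with can be sketched in a few lines. This shows only the classical routing recursion, not the PC, KF, PF or CF corrections studied in the paper. The parameter values (storage constant K, weighting factor X, time step) and the inflow series are hypothetical.

```python
def muskingum_coefficients(k_hours, x, dt_hours):
    """Classical Muskingum routing coefficients; they always sum to 1."""
    denom = 2.0 * k_hours * (1.0 - x) + dt_hours
    c0 = (dt_hours - 2.0 * k_hours * x) / denom
    c1 = (dt_hours + 2.0 * k_hours * x) / denom
    c2 = (2.0 * k_hours * (1.0 - x) - dt_hours) / denom
    return c0, c1, c2

def route(inflow, k_hours, x, dt_hours, initial_outflow=None):
    """Route an inflow hydrograph through a reach:
    O[t] = c0 * I[t] + c1 * I[t-1] + c2 * O[t-1]."""
    c0, c1, c2 = muskingum_coefficients(k_hours, x, dt_hours)
    # Assume the reach starts in steady state unless told otherwise.
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow

# Hypothetical inflow hydrograph in m^3/s at a 1 h time step.
inflow = [50.0, 80.0, 150.0, 220.0, 180.0, 120.0, 80.0, 60.0]
for hour, q in enumerate(route(inflow, k_hours=2.0, x=0.2, dt_hours=1.0)):
    print(hour, round(q, 1))
```

A real-time correction scheme would adjust these routed flows using the latest observations before issuing the forecast for the next lead time.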


2021 ◽  
Author(s):  
Elizaveta Felsche ◽  
Ralf Ludwig

Abstract. There is strong scientific and social interest in understanding the factors leading to extreme events in order to improve the management of risks associated with hazards like droughts. In this study, artificial neural networks are applied to predict the occurrence of a drought in two contrasting European domains, Munich and Lisbon, with a lead time of one month. The approach takes into account a list of 30 atmospheric and soil variables as input parameters from a single-model initial condition large ensemble (CRCM5-LE). The data were produced in the context of the ClimEx project by Ouranos with the Canadian Regional Climate Model (CRCM5) driven by 50 members of the Canadian Earth System Model (CanESM2). Drought occurrence was defined using the Standardized Precipitation Index. The best performing machine learning algorithms managed to obtain a correct classification of drought or no drought for a lead time of one month for around 55–60 % of the events of each class for both domains. Explainable AI methods like SHapley Additive exPlanations (SHAP) were applied to gain a better understanding of the trained algorithms. Variables like the North Atlantic Oscillation Index and air pressure one month before the event proved to be of high importance for the prediction. The study showed that seasonality has a strong influence on the quality of drought prediction, especially for the Lisbon domain.


Author(s):  
Paul Muñoz ◽  
Johanna Orellana-Alvear ◽  
Jörg Bendix ◽  
Jan Feyen ◽  
Rolando Célleri

Flood Early Warning Systems (FEWSs) using Machine Learning (ML) have gained worldwide popularity. However, determining the most efficient ML technique is still a bottleneck. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 hours using the most common ML techniques: Multi-Layer Perceptron (MLP), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1- and 12-hour cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. Future efforts are recommended to enhance the input data representation and to develop communication applications that raise public awareness of floods.


2021 ◽  
Vol 21 (8) ◽  
pp. 2379-2405
Author(s):  
Luigi Cesarini ◽  
Rui Figueiredo ◽  
Beatrice Monteleone ◽  
Mario L. V. Martina

Abstract. Weather index insurance is an innovative tool in risk transfer for disasters induced by natural hazards. This paper proposes a methodology that uses machine learning algorithms for the identification of extreme flood and drought events, aimed at reducing the basis risk connected to this kind of insurance mechanism. The model types selected for this study were the neural network and the support vector machine, both widely adopted for classification problems, which were built by exploring thousands of possible configurations based on combinations of different model parameters. The models were developed and tested in the Dominican Republic context, based on data from multiple sources covering the period between 2000 and 2019. Using rainfall and soil moisture data, the machine learning algorithms provided a strong improvement when compared to logistic regression models, used as a baseline for both hazards. Furthermore, increasing the amount of information provided during the training of the models proved beneficial to their performance, increasing their classification accuracy and confirming the ability of these algorithms to exploit big data and their potential for application within index insurance products.


2021 ◽  
Author(s):  
Luigi Cesarini ◽  
Rui Figueiredo ◽  
Beatrice Monteleone ◽  
Mario Martina

A steady increase in the frequency and severity of extreme climate events has been observed in recent years, causing losses amounting to billions of dollars. Floods and droughts are responsible for almost half of those losses, severely affecting people’s livelihoods in the form of damaged property, goods and even loss of life. Weather index insurance is an innovative tool in risk transfer for disasters induced by natural hazards. In this type of insurance, payouts are triggered when an index calculated from one or multiple environmental variables exceeds a predefined threshold. Thus, contrary to traditional insurance, it does not require costly and time-consuming post-event loss assessments. Its ease of application makes it an ideal solution for developing countries, where fast payouts in light of a catastrophic event would guarantee the survival of an economic sector, for example, providing the monetary resources necessary for farmers to sustain a prolonged period of extreme temperatures. The main obstacle to a wider application of this type of insurance mechanism stems from the so-called basis risk, which arises when a loss event takes place but a payout is not issued, or vice-versa.
This study proposes and tests the application of machine learning algorithms for the identification of extreme flood and drought events in the context of weather index insurance, with the aim of reducing basis risk. Neural networks and support vector machines, widely adopted for classification problems, are employed, exploring thousands of possible configurations based on combinations of different model parameters. The models were developed and tested in the Dominican Republic context, leveraging low-latency datasets from multiple sources covering the period between 2000 and 2019. Using rainfall (GSMaP, CMORPH, CHIRPS, CCS, PERSIANN and IMERG) and soil moisture (ERA5) data, the machine learning algorithms provided a strong improvement when compared to logistic regression models, used as a baseline for both hazards. Furthermore, increasing the amount of information provided during model training proved beneficial to the models' performance, improving their classification accuracy and confirming the ability of these algorithms to exploit big data. Results highlight the potential of machine learning for application within index insurance products.
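
The payout trigger and the notion of basis risk described above can be illustrated with a short sketch. It assumes a single scalar index and a fixed threshold. The function names, the index values, the loss flags and the threshold are all hypothetical and do not come from the study.

```python
def payout_triggered(index_value, threshold):
    """A payout is issued when the index exceeds the predefined threshold."""
    return index_value > threshold

def basis_risk_events(index_values, loss_occurred, threshold):
    """Count the two kinds of mismatch that make up basis risk:
    losses without a payout, and payouts without a loss."""
    missed_payouts = 0
    unneeded_payouts = 0
    for value, loss in zip(index_values, loss_occurred):
        paid = payout_triggered(value, threshold)
        if loss and not paid:
            missed_payouts += 1
        elif paid and not loss:
            unneeded_payouts += 1
    return missed_payouts, unneeded_payouts

# Hypothetical seasonal index values and whether a loss actually occurred.
index_values = [0.2, 0.9, 0.7, 0.4, 0.95]
loss_occurred = [False, True, False, True, True]

missed, unneeded = basis_risk_events(index_values, loss_occurred, threshold=0.6)
print("losses without payout:", missed)
print("payouts without loss :", unneeded)
```

In the study's framing, a classifier that identifies extreme events more accurately plays the role of the trigger, so fewer seasons fall into either mismatch category.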


Author(s):  
Suhui Li ◽  
Wenkai Qian ◽  
Haoyang Liu ◽  
Min Zhu ◽  
Christos N. Markides

Abstract. For advanced lean-premixed gas turbine combustors that have high inlet air temperatures, autoignition may occur during the fuel/air mixing process, which can cause flame-holding inside the premixing device and burn the hardware. An experimental study was performed using a setup that mimics the fuel/air mixing process of lean-premixed combustors. In the present experiment, preheated air was injected into a quartz tube, and a fuel jet was injected concentrically into the hot turbulent air coflow. The quartz tube allows for direct observation of the autoignition behavior, which develops as the fuel and air mix while flowing inside the tube. This paper presents a study combining machine learning methods and physical analysis that is aimed at predicting autoignition in such flows. A model for the prediction of autoignition of a fuel jet in a flow configuration referred to as a ‘confined turbulent hot coflow’ (CTHC) is developed using machine learning techniques based on binary logistic regression and support vector machine. Key factors that impact the autoignition phenomenon are identified by analyzing the underlying physics and are used to form the feature vector of the model. The model is trained using data from experiments and is validated by an additional set of data, which are selected randomly. The results show that the model predicts the autoignition event with satisfactory accuracy and quick turnaround. The trained model parameters in turn provide insights into the quantitative contribution of different factors that impact the autoignition event. Thus, the machine-learning-based method can form an alternative to CFD modeling in some cases.


Water ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 1612
Author(s):  
Susanna Dazzi ◽  
Renato Vacondio ◽  
Paolo Mignosa

Real-time river flood forecasting models can be useful for issuing flood alerts and reducing or preventing inundations. To this end, machine-learning (ML) methods are becoming increasingly popular thanks to their low computational requirements and to their reliance on observed data only. This work aimed to evaluate the ML models’ capability of predicting flood stages at a critical gauge station, using mainly upstream stage observations, though downstream levels should also be included to consider backwater, if present. The case study selected for this analysis was the lower stretch of the Parma River (Italy), and the forecast horizon was extended up to 9 h. The performances of three ML algorithms, namely Support Vector Regression (SVR), MultiLayer Perceptron (MLP), and Long Short-term Memory (LSTM), were compared herein in terms of accuracy and computational time. Up to 6 h ahead, all models provided sufficiently accurate predictions for practical purposes (e.g., Root Mean Square Error < 15 cm, and Nash-Sutcliffe Efficiency coefficient > 0.99), while peak levels were poorly predicted for longer lead times. Moreover, the results suggest that the LSTM model, despite requiring the longest training time, is the most robust and accurate in predicting peak values, and it should be preferred for setting up an operational forecasting system.

