Comparison of Machine Learning Techniques Powering Flood Early Warning Systems. Application to a catchment located in the Tropical Andes of Ecuador.


10.5194/egusphere-egu2020-4243 ◽

2020 ◽

Author(s):

Paul Munoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Early Warning ◽

Lead Time ◽

Geometric Mean ◽

Classification Problem ◽

Early Warning Systems ◽

Machine Learning Techniques ◽

Warning Systems ◽

Tropical Andes

Flood Early Warning Systems have become an effective tool worldwide for mitigating the adverse effects of floods on society, the economy and the environment. A novel approach for such systems is to forecast flood events rather than merely monitor the evolution of the catchment hydrograph on its way to an inundation site. A wide variety of modelling approaches, from fully physical to data-driven, have been developed depending on the availability of information describing intrinsic catchment characteristics. During the last decades, however, Machine Learning techniques have gained remarkable popularity because they can forecast floods with minimal data requirements and computational cost. Here, we selected the algorithms most commonly employed for flood prediction (K-nearest Neighbors, Logistic Regression, Random Forest, Naïve Bayes and Neural Networks) and used them in a precipitation-runoff classification problem aimed at forecasting the inundation state of a river at a decisive control station. The states are “No-alert”, “Pre-alert” and “Alert” of inundation, forecast with lead times of 1, 4, 8 and 12 hours. The study site is a 300 km² catchment in the tropical Andes draining to Cuenca, the third most populated city of Ecuador. Cuenca is susceptible to annual floods, and the generated alerts will therefore be used by local authorities to inform the population of upcoming flood risks. For an integral comparison between forecasting models, we propose a scheme relying on the F1-score, the Geometric mean and the Log-loss score to account for the resulting data imbalance and the multiclass classification problem. Furthermore, we used the Chi-Squared test to ensure that differences in model results were due to the algorithm applied and not to statistical chance. We find that the most effective model according to the F1-score uses the Neural Networks technique (0.78, 0.62, 0.51 and 0.46 for the test subsets of the 1-, 4-, 8- and 12-hour forecasting scenarios, respectively), followed by the Logistic Regression algorithm. For the remaining algorithms, we found F1-score differences between the best and the worst model inversely proportional to the lead time (i.e., differences between models were more pronounced for shorter lead times). Moreover, the Geometric mean and the Log-loss score showed similar patterns of degradation of the forecast ability with lead time for all algorithms. The overall higher scores found for the Neural Networks technique suggest this algorithm as the engine for the best-performing Early Warning System for the city. For future research, we recommend further analyses of the effect of input data composition and of the architecture of the algorithm to fully exploit its capacity, which would improve model performance and extend the lead time. The usability and effectiveness of the developed systems will depend, however, on the speed of communication to the public after an inundation signal is issued. We suggest complementing our systems with a website and/or mobile application as a tool to boost the preparedness against floods of both decision makers and the public.

Keywords: Flood; forecasting; Early Warning; Machine Learning; Tropical Andes; Ecuador.
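The three scores named in the abstract can be sketched in a few lines of plain Python. This is only an illustration, not the authors' code: it assumes macro averaging for the F1-score, takes the Geometric mean to be the geometric mean of the per-class recalls (a common convention for imbalanced problems), and uses the hypothetical function names `macro_f1`, `g_mean` and `log_loss`.

```python
import math

CLASSES = ["No-alert", "Pre-alert", "Alert"]


def per_class_counts(y_true, y_pred, label):
    """Return true positives, false positives and false negatives for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = []
    for label in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred, classes=CLASSES):
    """Geometric mean of the per-class recalls; zero if any class is never detected."""
    recalls = []
    for label in classes:
        tp, _, fn = per_class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return math.prod(recalls) ** (1 / len(recalls))


def log_loss(y_true, proba, classes=CLASSES, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[classes.index(t)], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)


# Toy, made-up labels: the majority class dominates, as in an imbalanced alert record.
y_true = ["No-alert"] * 4 + ["Pre-alert"] * 2 + ["Alert"] * 2
y_pred = ["No-alert", "No-alert", "No-alert", "Pre-alert",
          "Pre-alert", "No-alert", "Alert", "Alert"]
print(macro_f1(y_true, y_pred), g_mean(y_true, y_pred))
```

Because the g-mean multiplies recalls, a model that never predicts a minority class scores zero, which is why it complements the F1-score on imbalanced data.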


Flood Early Warning Systems Using Machine Learning Techniques: The Case of the Tomebamba Catchment at the Southern Andes of Ecuador

Hydrology ◽

10.3390/hydrology8040183 ◽

2021 ◽

Vol 8 (4) ◽

pp. 183

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Machine Learning Techniques ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Learning Techniques

Worldwide, machine learning (ML) is increasingly being used for developing flood early warning systems (FEWSs). However, previous studies have not focused on establishing a methodology for determining the most efficient ML technique. We assessed FEWSs with three river states (No-alert, Pre-alert and Alert for flooding) for lead times from 1 to 12 h using the most common ML techniques: multi-layer perceptron (MLP), logistic regression (LR), K-nearest neighbors (KNN), naive Bayes (NB), and random forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1 h and 12 h cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. The proposed methodology for selecting the optimal ML technique for a FEWS can be extrapolated to other case studies. Future efforts are recommended to enhance the input data representation and develop communication applications to boost society's awareness of floods.
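To make the classification setup concrete, the sketch below shows one way a three-state, fixed-lead-time problem could be framed from hourly series. It is an assumption-laden illustration, not the study's method: the discharge thresholds, the number of precipitation lags and the names `river_state` and `make_samples` are all hypothetical.

```python
def river_state(discharge, pre_alert=30.0, alert=50.0):
    """Map a discharge value to one of three states (thresholds are hypothetical)."""
    if discharge >= alert:
        return "Alert"
    if discharge >= pre_alert:
        return "Pre-alert"
    return "No-alert"


def make_samples(precip, discharge, lead_time, n_lags=3):
    """Pair the last n_lags hourly precipitation values at time t
    with the river state observed lead_time hours later."""
    samples = []
    for t in range(n_lags - 1, len(precip) - lead_time):
        features = precip[t - n_lags + 1 : t + 1]
        target = river_state(discharge[t + lead_time])
        samples.append((features, target))
    return samples


# Made-up hourly series, only to show the shape of the resulting samples.
precip = [0, 1, 2, 3, 4, 5]
discharge = [10, 20, 30, 40, 50, 60]
for features, target in make_samples(precip, discharge, lead_time=1):
    print(features, target)
```

Longer lead times shift the target further from the most recent inputs, which is consistent with the drop in scores reported between the 1 h and 12 h cases.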


Flood Early Warning Systems using Machine Learning Techniques. Case the Tomebamba Catchment at the Southern Andes of Ecuador

10.20944/preprints202111.0510.v1 ◽

2021 ◽

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Machine Learning Techniques ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Learning Techniques

Flood Early Warning Systems (FEWSs) using Machine Learning (ML) have gained worldwide popularity. However, determining the most efficient ML technique is still a bottleneck. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 hours using the most common ML techniques: Multi-Layer Perceptron (MLP), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1- and 12-hour cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models correctly forecast and show more stability at all states, while the MLP models perform better in the Pre-alert and Alert states. Future efforts are recommended to enhance the input data representation and develop communication applications to boost society's awareness of floods.


Flood Early Warning Systems using Machine Learning Techniques. Application to a Catchment located in the Tropical Andes of Ecuador.

10.21203/rs.3.rs-395457/v1 ◽

2021 ◽

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Research Question ◽

Early Warning Systems ◽

Flood Forecasting ◽

Data Representation ◽

Machine Learning Techniques ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes

Abstract. Short-rain floods, especially flash floods, produce devastating impacts on society, the economy, and ecosystems. A key countermeasure is to develop Flood Early Warning Systems (FEWSs) aimed at issuing flood warnings with sufficient lead time for decision making. Although Machine Learning (ML) techniques have gained popularity among hydrologists, the question of which ML technique is best for flood forecasting remains poorly answered. To answer it, we compare the efficiencies of FEWSs developed with the five most common ML techniques for flood forecasting, for lead times from 1 to 12 hours. We use the Tomebamba catchment in the Ecuadorean Andes as a case study, with three warning classes: No-alert, Pre-alert, and Alert of floods. For all lead times, the Multi-Layer Perceptron (MLP) technique achieves the highest model performances (f1-macro score) followed by Logistic Regression (LR), from 0.82 (1-hour) to 0.46 (12-hour). This ranking was confirmed by the log-loss scores, ranging from 0.09 (1-hour) to 0.20 (12-hour) for the above-mentioned methods. Model performances decreased for the remaining ML techniques (K-Nearest Neighbors, Naive Bayes and Random Forest), but their ranking was highly variable and not conclusive. Moreover, according to the g-mean, LR models show greater stability in correctly classifying all flood classes, whereas MLP models are specialized in the minority (Pre-alert and Alert) classes. To improve the performance and the applicability of FEWSs, we recommend future efforts to enhance input data representation and to develop communication applications between FEWSs and the public as tools to boost society's preparedness against floods.


Enhancing the reliability of landslide early warning systems by machine learning

Landslides ◽

10.1007/s10346-020-01453-z ◽

2020 ◽

Vol 17 (9) ◽

pp. 2231-2246

Author(s):

Hemalatha Thirugnanam ◽

Maneesha Vinodini Ramesh ◽

Venkat P. Rangan

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Warning Systems ◽

Landslide Early Warning


Timely prediction potential of landslide early warning systems with multispectral remote sensing: a conceptual approach tested in the Sattelkar, Austria

Natural Hazards and Earth System Science ◽

10.5194/nhess-21-2753-2021 ◽

2021 ◽

Vol 21 (9) ◽

pp. 2753-2772

Author(s):

Doris Hermle ◽

Markus Keuschnig ◽

Ingo Hartmeyer ◽

Robert Delleske ◽

Michael Krautblatter

Keyword(s):

Remote Sensing ◽

Early Warning ◽

Lead Time ◽

Early Warning Systems ◽

Image Correlation ◽

Optical Data ◽

Optical Remote Sensing ◽

Warning Systems ◽

Conceptual Approach ◽

Landslide Early Warning

Abstract. While optical remote sensing has demonstrated its capabilities for landslide detection and monitoring, spatial and temporal demands for landslide early warning systems (LEWSs) had not been met until recently. We introduce a novel conceptual approach to structure and quantitatively assess lead time for LEWSs. We analysed “time to warning” as a sequence: (i) time to collect, (ii) time to process and (iii) time to evaluate relevant optical data. The difference between the time to warning and “forecasting window” (i.e. time from hazard becoming predictable until event) is the lead time for reactive measures. We tested digital image correlation (DIC) of best-suited spatiotemporal techniques, i.e. 3 m resolution PlanetScope daily imagery and 0.16 m resolution unmanned aerial system (UAS)-derived orthophotos to reveal fast ground displacement and acceleration of a deep-seated, complex alpine mass movement leading to massive debris flow events. The time to warning for the UAS/PlanetScope totals 31/21 h and is comprised of time to (i) collect – 12/14 h, (ii) process – 17/5 h and (iii) evaluate – 2/2 h, which is well below the forecasting window for recent benchmarks and facilitates a lead time for reactive measures. We show optical remote sensing data can support LEWSs with a sufficiently fast processing time, demonstrating the feasibility of optical sensors for LEWSs.
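The timing breakdown in the abstract can be checked with simple arithmetic. The sketch below sums the three reported stages into the time to warning and subtracts it from a forecasting window to obtain the lead time. The stage durations come from the abstract; the 72 h forecasting window is a purely hypothetical value chosen for illustration, and the function names are likewise assumptions.

```python
def time_to_warning(collect_h, process_h, evaluate_h):
    """Total hours needed to collect, process and evaluate the optical data."""
    return collect_h + process_h + evaluate_h


def lead_time(forecasting_window_h, warning_h):
    """Hours left for reactive measures once the warning has been issued."""
    return forecasting_window_h - warning_h


# Stage durations as reported in the abstract (UAS / PlanetScope).
uas = time_to_warning(collect_h=12, process_h=17, evaluate_h=2)
planetscope = time_to_warning(collect_h=14, process_h=5, evaluate_h=2)
print(uas, planetscope)  # 31 21

# Hypothetical forecasting window; the abstract does not give a single value.
window_h = 72
print(lead_time(window_h, uas), lead_time(window_h, planetscope))  # 41 51
```

A positive lead time means the warning arrives before the event; the shorter processing stage is what gives PlanetScope the larger margin in this example.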


Communicating Complex Forecasts for Enhanced Early Warning in Nepal

10.5194/gc-2019-3 ◽

2019 ◽

Cited By ~ 1

Author(s):

Mirianna Budimir ◽

Amy Donovan ◽

Sarah Brown ◽

Puja Shakya ◽

Dilip Gautam ◽

...

Keyword(s):

Early Warning ◽

Lead Time ◽

Warning System ◽

Early Warning Systems ◽

Warning Systems ◽

Time Data ◽

Deterministic Models ◽

Early Action ◽

Probabilistic Forecasts ◽

Barriers And Challenges

Abstract. Early warning systems have the potential to save lives and improve resilience. Simple early warning systems rely on real-time data and deterministic models to generate evacuation warnings; these simple deterministic models enable life-saving action, but provide limited lead time for resilience-building early action. More complex early warning systems supported by forecasts, including probabilistic forecasts, can provide additional lead time for preparation. However, barriers and challenges remain in disseminating and communicating these more complex warnings to community members and individuals at risk. Research was undertaken to analyse and understand the current early warning system in Nepal, considering available data and forecasts, information flows, early warning dissemination and decision making for early action. The research reviewed the availability and utilisation of complex forecasts in Nepal, their integration into dissemination (Department of Hydrology and Meteorology (DHM) bulletins and SMS warnings), and decision support tools (Common Alerting Protocols and Standard Operating Procedures), considering their impact on improving early action to increase the resilience of vulnerable communities to flooding.


Development of the On-Site Earthquake Early Warning Systems for Taiwan Using Neural Networks

Intelligent Engineering Systems through Artificial Neural Networks ◽

10.1115/1.802953.paper14 ◽

2009 ◽

pp. 107-113

Keyword(s):

Neural Networks ◽

Early Warning ◽

Early Warning Systems ◽

Earthquake Early Warning ◽

Warning Systems ◽

Earthquake Early Warning Systems


Using Ground Microtremor Data in Advanced Rockfall Early Warning Systems and Predicting Spatiotemporal Characteristics of Rockfall Hazards

10.21203/rs.3.rs-944523/v1 ◽

2021 ◽

Author(s):

Yi-Rong Yang ◽

Tzu-Tung Lee ◽

Tai-Tien Wang

Keyword(s):

Slope Stability ◽

Early Warning ◽

Local Governments ◽

Lead Time ◽

Early Warning Systems ◽

Slope Instability ◽

Warning Systems ◽

Spatiotemporal Characteristics ◽

Stability Of Slopes ◽

Rockfall Hazards

Abstract. Identifying cliffs that are prone to fall and providing a sufficient lead time for rockfall warning are crucial steps in disaster risk reduction and preventive maintenance work, especially that led by local governments. However, existing rockfall warning systems provide uncertain rockfall location forecasting and short warning times because the deformation and cracking of unstable slopes are not sufficiently detected by sensors before the rock collapses. Here, we introduce ground microtremor signals for early rockfall forecasting and demonstrate that microtremor characteristics can be used to detect unstable rock wedges on slopes, quantitatively describe the stability of slopes and lengthen the lead time for rockfall warning. We show that the change in the energy of ground microtremors can be an early precursor of rockfall and that the signal frequency decreases with slope instability. This finding indicates that ground microtremor signals are remarkably sensitive to slope stability. We conclude that microtremor characteristics can be used as an appropriate slope stability index for early rockfall warning systems and predicting the spatiotemporal characteristics of rockfall hazards. This early warning method has the advantages of providing a long lead time and on-demand monitoring, while increasing slope stability accessibility and prefailure location detectability.


Effective real-time forecasting of inundation maps for early warning systems during typhoons

MATEC Web of Conferences ◽

10.1051/matecconf/201814703014 ◽

2018 ◽

Vol 147 ◽

pp. 03014

Author(s):

Jhih-Huang Wang ◽

Gwo-Fong Lin ◽

Bing-Chen Jhong

Keyword(s):

Early Warning ◽

Lead Time ◽

Early Warning Systems ◽

Reference Points ◽

Support Vector ◽

Spatial Expansion ◽

Warning Systems ◽

Inundation Maps ◽

Inundation Depth ◽

Proposed Model

Accurate forecasts of hourly inundation depths are essential for inundation warning and mitigation during typhoons. In this paper, an effective forecasting model is proposed to yield 1- to 6-h lead-time inundation maps for early warning systems during typhoons. The proposed model based on Support Vector Machine (SVM) is composed of two modules, point forecasting and spatial expansion. In the first module, the rainfall intensity, inundation depth, cumulative rainfall and forecasted inundation depths are considered as model input for point forecasting. In the second module, the geographic information of inundation grids and the inundation forecasts of reference points are used to yield inundation maps for spatial expansion. The results show that the proposed model is able to provide accurate point forecasts at each inundation point. Moreover, the spatial expansion module is capable of producing accurate spatial inundation forecasts. Obviously, the proposed model provides reasonable spatial inundation forecasts, and is able to deal with the nonlinear relationships between inputs and desired output. In conclusion, the proposed model is suitable and useful for inundation forecasting.


Comparison of Resampling Techniques for Imbalanced Datasets in Machine Learning: Application to Epileptogenic Zone Localization From Interictal Intracranial EEG Recordings in Patients With Focal Epilepsy

Frontiers in Neuroinformatics ◽

10.3389/fninf.2021.715421 ◽

2021 ◽

Vol 15 ◽

Author(s):

Giulia Varotto ◽

Gianluca Susi ◽

Laura Tassi ◽

Francesca Gozzo ◽

Silvana Franceschetti ◽

...

Keyword(s):

Machine Learning ◽

Focal Epilepsy ◽

Geometric Mean ◽

Ensemble Methods ◽

Classification Problem ◽

Brain Regions ◽

Classification Method ◽

Machine Learning Techniques ◽

Epileptogenic Zone ◽

Original Dataset

Aim: In neuroscience research, data are quite often characterized by an imbalanced distribution between the majority and minority classes, an issue that can limit or even worsen the prediction performance of machine learning methods. Different resampling procedures have been developed to face this problem and a lot of work has been done in comparing their effectiveness in different scenarios. Notably, the robustness of such techniques has been tested among a wide variety of different datasets, without considering the performance of each specific dataset. In this study, we compare the performances of different resampling procedures for the imbalanced domain in stereo-electroencephalography (SEEG) recordings of the patients with focal epilepsies who underwent surgery. Methods: We considered data obtained by network analysis of interictal SEEG recorded from 10 patients with drug-resistant focal epilepsies, for a supervised classification problem aimed at distinguishing between the epileptogenic and non-epileptogenic brain regions in interictal conditions. We investigated the effectiveness of five oversampling and five undersampling procedures, using 10 different machine learning classifiers. Moreover, six specific ensemble methods for the imbalanced domain were also tested. To compare the performances, Area under the ROC curve (AUC), F-measure, Geometric Mean, and Balanced Accuracy were considered. Results: Both the resampling procedures showed improved performances with respect to the original dataset. The oversampling procedure was found to be more sensitive to the type of classification method employed, with Adaptive Synthetic Sampling (ADASYN) exhibiting the best performances. All the undersampling approaches were more robust than the oversampling among the different classifiers, with Random Undersampling (RUS) exhibiting the best performance despite being the simplest and most basic classification method. Conclusions: The application of machine learning techniques that take into consideration the balance of features by resampling is beneficial and leads to more accurate localization of the epileptogenic zone from interictal periods. In addition, our results highlight the importance of the type of classification method that must be used together with the resampling to maximize the benefit to the outcome.
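As an illustration of the simplest procedure mentioned above, Random Undersampling can be sketched as randomly discarding samples until every class is as small as the rarest one. This is a generic standard-library sketch, not the implementation used in the study; the function name `random_undersample` and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(X, y, seed=0):
    """Randomly drop samples so that every class keeps as many items as the rarest class."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    n_min = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: eight majority-class samples and two minority-class samples.
X = list(range(10))
y = [0] * 8 + [1] * 2
X_res, y_res = random_undersample(X, y)
print(Counter(y_res))
```

The minority class is kept whole while the majority class loses information, which is the trade-off that oversampling methods such as ADASYN try to avoid by synthesizing new minority samples instead.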
