Flood Early Warning Systems using Machine Learning Techniques. Application to a Catchment located in the Tropical Andes of Ecuador.

Abstract. Short-rain floods, especially flash floods, have devastating impacts on society, the economy, and ecosystems. A key countermeasure is to develop Flood Early Warning Systems (FEWSs) that issue flood warnings with sufficient lead time for decision making. Although Machine Learning (ML) techniques have gained popularity among hydrologists, the question of which ML technique is best for flood forecasting remains poorly answered. To address it, we compare the efficiencies of FEWSs developed with the five most common ML techniques for flood forecasting, for lead times from 1 to 12 hours. We use the Tomebamba catchment in the Ecuadorean Andes as a case study, with three warning classes: No-alert, Pre-alert, and Alert. For all lead times, the Multi-Layer Perceptron (MLP) technique achieves the highest model performance (f1-macro score), followed by Logistic Regression (LR), with scores from 0.82 (1-hour) to 0.46 (12-hour). This ranking was confirmed by the log-loss scores, which ranged from 0.09 (1-hour) to 0.20 (12-hour) for these two methods. Model performance decreased for the remaining ML techniques (K-Nearest Neighbors, Naive Bayes, and Random Forest), but their ranking was highly variable and not conclusive. Moreover, according to the g-mean, LR models show greater stability in correctly classifying all flood classes, whereas MLP models are specialized in the minority (Pre-alert and Alert) classes. To improve the performance and applicability of FEWSs, we recommend future efforts to enhance input data representation and to develop communication applications between FEWSs and the public as tools to boost society's preparedness against floods.
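
The three scores named in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions: f1-macro as the unweighted mean of per-class F1, g-mean as the geometric mean of per-class recall, and log-loss as the mean negative natural log of the probability assigned to the true class. The abstract does not spell these definitions out, and the function names below are hypothetical, not taken from the paper.

```python
import math

CLASSES = ["No-alert", "Pre-alert", "Alert"]


def per_class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def f1_macro(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    scores = []
    for label in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred, classes=CLASSES):
    """Geometric mean of per-class recall; it drops to 0 if any class is never hit."""
    recalls = []
    for label in classes:
        tp, _, fn = per_class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return math.prod(recalls) ** (1.0 / len(recalls))


def log_loss(y_true, proba, classes=CLASSES, eps=1e-15):
    """Mean negative log-probability of the true class (lower is better)."""
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[classes.index(t)], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)
```

Because f1-macro and g-mean weight every class equally, a model that ignores the rare Pre-alert and Alert classes scores poorly on both even when its overall accuracy is high, which is presumably why such scores suit an imbalanced warning problem.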

Flood Early Warning Systems Using Machine Learning Techniques: The Case of the Tomebamba Catchment at the Southern Andes of Ecuador

Hydrology ◽

10.3390/hydrology8040183 ◽

2021 ◽

Vol 8 (4) ◽

pp. 183

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Machine Learning Techniques ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Learning Techniques

Worldwide, machine learning (ML) is increasingly being used for developing flood early warning systems (FEWSs). However, previous studies have not focused on establishing a methodology for determining the most efficient ML technique. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 h using the most common ML techniques: multi-layer perceptron (MLP), logistic regression (LR), K-nearest neighbors (KNN), naive Bayes (NB), and random forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1 h and 12 h cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models forecast all states correctly and with more stability, while the MLP models perform better in the Pre-alert and Alert states. The proposed methodology for selecting the optimal ML technique for a FEWS can be extrapolated to other case studies. Future efforts are recommended to enhance the input data representation and to develop communication applications that boost society's awareness of floods.
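
How a three-state forecast with a lead time can be framed as a classification problem is sketched below. This is an illustration under stated assumptions, not the authors' pipeline: hourly precipitation as the only input, discharge thresholds as class boundaries, and the names `river_state` and `build_samples` are all hypothetical and not specified in this abstract.

```python
def river_state(discharge, pre_alert_threshold, alert_threshold):
    """Map a discharge value to one of the three warning classes.

    Both thresholds are hypothetical placeholders, not values from the study.
    """
    if discharge >= alert_threshold:
        return "Alert"
    if discharge >= pre_alert_threshold:
        return "Pre-alert"
    return "No-alert"


def build_samples(precip, discharge, lead, n_lags,
                  pre_alert_threshold, alert_threshold):
    """Pair the last n_lags hourly precipitation values up to hour t
    with the river state observed at hour t + lead."""
    samples = []
    for t in range(n_lags - 1, len(discharge) - lead):
        features = precip[t - n_lags + 1 : t + 1]
        target = river_state(discharge[t + lead],
                             pre_alert_threshold, alert_threshold)
        samples.append((features, target))
    return samples
```

One such sample set would be built per lead time (for example `lead=1` through `lead=12` on hourly data), and each classifier would then be trained and scored separately on each set.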

Flood Early Warning Systems using Machine Learning Techniques. Case the Tomebamba Catchment at the Southern Andes of Ecuador

10.20944/preprints202111.0510.v1 ◽

2021 ◽

Author(s):

Paul Muñoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Jan Feyen ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Data Representation ◽

Machine Learning Techniques ◽

Lead Times ◽

Warning Systems ◽

Tropical Andes ◽

K Nearest Neighbors ◽

Learning Techniques

Flood Early Warning Systems (FEWSs) using Machine Learning (ML) have gained worldwide popularity. However, determining the most efficient ML technique is still a bottleneck. We assessed FEWSs with three river states (No-alert, Pre-alert, and Alert for flooding) for lead times from 1 to 12 hours using the most common ML techniques: Multi-Layer Perceptron (MLP), Logistic Regression (LR), K-Nearest Neighbors (KNN), Naive Bayes (NB), and Random Forest (RF). The Tomebamba catchment in the tropical Andes of Ecuador was selected as a case study. For all lead times, MLP models achieve the highest performance followed by LR, with f1-macro (log-loss) scores of 0.82 (0.09) and 0.46 (0.20) for the 1- and 12-hour cases, respectively. The ranking was highly variable for the remaining ML techniques. According to the g-mean, LR models forecast all states correctly and with more stability, while the MLP models perform better in the Pre-alert and Alert states. Future efforts are recommended to enhance the input data representation and to develop communication applications that boost society's awareness of floods.

Comparison of Machine Learning Techniques Powering Flood Early Warning Systems. Application to a catchment located in the Tropical Andes of Ecuador.

10.5194/egusphere-egu2020-4243 ◽

2020 ◽

Author(s):

Paul Munoz ◽

Johanna Orellana-Alvear ◽

Jörg Bendix ◽

Rolando Célleri

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Early Warning ◽

Lead Time ◽

Geometric Mean ◽

Classification Problem ◽

Early Warning Systems ◽

Machine Learning Techniques ◽

Warning Systems ◽

Tropical Andes

Flood Early Warning Systems have globally become an effective tool to mitigate the adverse effects of this natural hazard on society, the economy, and the environment. A novel approach for such systems is to actually forecast flood events rather than merely monitor the catchment hydrograph evolution on its way to an inundation site. A wide variety of modelling approaches, from fully physical to data-driven, have been developed depending on the availability of information describing intrinsic catchment characteristics. However, during the last decades, Machine Learning techniques have gained remarkable popularity due to their power to forecast floods with minimal data requirements and computational cost. Here, we selected the algorithms most commonly employed for flood prediction (K-nearest Neighbors, Logistic Regression, Random Forest, Naïve Bayes and Neural Networks), and used them in a precipitation-runoff classification problem aimed at forecasting the inundation state of a river at a decisive control station. These states are “No-alert”, “Pre-alert”, and “Alert” of inundation, with varying lead times of 1, 4, 8 and 12 hours. The study site is a 300-km² catchment in the tropical Andes draining to Cuenca, the third most populated city of Ecuador. Cuenca is susceptible to annual floods, and thus the generated alerts will be used by local authorities to inform the population about upcoming flood risks. For an integral comparison between forecasting models, we propose a scheme relying on the F1-score, the Geometric mean and the Log-loss score to account for the resulting data imbalance and the multiclass classification problem. Furthermore, we used the Chi-Squared test to ensure that differences in model results were due to the algorithm applied and not due to statistical chance.
We reveal that the most effective model according to the F1-score uses the Neural Networks technique (0.78, 0.62, 0.51 and 0.46 for the test subsets of the 1, 4, 8 and 12-hour forecasting scenarios, respectively), followed by the Logistic Regression algorithm. For the remaining algorithms, we found F1-score differences between the best and the worst model to be inversely proportional to the lead time (i.e., differences between models were more pronounced for shorter lead times). Moreover, the Geometric mean and the Log-loss score showed similar patterns of degradation of the forecast ability with lead time for all algorithms. The overall higher scores found for the Neural Networks technique suggest this algorithm as the engine for the best forecasting Early Warning Systems of the city. For future research, we recommend further analyses of the effect of input data composition and of the architecture of the algorithm for full exploitation of its capacity, which would lead to an improvement of model performance and an extension of the lead time. The usability and effectiveness of the developed systems will depend, however, on the speed of communication to the public after an inundation signal is indicated. We suggest complementing our systems with a website and/or mobile application as a tool to boost the preparedness against floods for both decision makers and the public.
Keywords: Flood; forecasting; Early Warning; Machine Learning; Tropical Andes; Ecuador.
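
The abstract does not say how the Chi-Squared test was set up. One plausible reading, sketched here purely as an assumption, is a Pearson test of independence on a contingency table whose rows are two models and whose columns are counts of correct and incorrect forecasts. The function names are hypothetical, the counts in the usage comment are invented for illustration only, and no continuity correction is applied.

```python
import math


def chi_squared_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the model and the outcome were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_one_dof(stat):
    """Upper-tail probability of a chi-squared variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Illustrative counts only (not from the study):
# rows = two models, columns = (correct, incorrect) forecasts
stat, dof = chi_squared_independence([[90, 10], [70, 30]])
```

A small p-value would suggest that the difference in hit rates between the two models is unlikely to be due to chance. Since both models are evaluated on the same test samples, a paired test such as McNemar's could also be considered; that is a design choice this sketch leaves open.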

Enhancing the reliability of landslide early warning systems by machine learning

Landslides ◽

10.1007/s10346-020-01453-z ◽

2020 ◽

Vol 17 (9) ◽

pp. 2231-2246

Author(s):

Hemalatha Thirugnanam ◽

Maneesha Vinodini Ramesh ◽

Venkat P. Rangan

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Warning Systems ◽

Landslide Early Warning

Community-based early warning systems for flood risk mitigation in Nepal

Natural Hazards and Earth System Science ◽

10.5194/nhess-17-423-2017 ◽

2017 ◽

Vol 17 (3) ◽

pp. 423-437 ◽

Cited By ~ 22

Author(s):

Paul J. Smith ◽

Sarah Brown ◽

Sumit Dugar

Keyword(s):

Early Warning ◽

Risk Mitigation ◽

Early Warning Systems ◽

Appropriate Technology ◽

Data Availability ◽

Current Status ◽

Lead Times ◽

Warning Systems ◽

Community Based ◽

Probabilistic Forecasts

Abstract. This paper focuses on the use of community-based early warning systems for flood resilience in Nepal. The first part of the work outlines the evolution and current status of these community-based systems, highlighting the limited lead times currently available for early warning. The second part of the paper focuses on the development of a robust operational flood forecasting methodology for use by the Nepal Department of Hydrology and Meteorology (DHM) to enhance early warning lead times. The methodology uses data-based physically interpretable time series models and data assimilation to generate probabilistic forecasts, which are presented in a simple visual tool. The approach is designed to work in situations of limited data availability with an emphasis on sustainability and appropriate technology. The successful application of the forecast methodology to the flood-prone Karnali River basin in western Nepal is outlined, increasing lead times from 2–3 to 7–8 h. The challenges faced in communicating probabilistic forecasts to the last mile of the existing community-based early warning systems across Nepal are discussed. The paper concludes with an assessment of the applicability of this approach in basins and countries beyond Karnali and Nepal and an overview of key lessons learnt from this initiative.

Using Machine Learning to Advance Early Warning Systems: Promise and Pitfalls

Teachers College Record ◽

10.1177/016146812012201403 ◽

2020 ◽

Vol 122 (14) ◽

pp. 1-30

Author(s):

James Soland ◽

Benjamin Domingue ◽

David Lang

Keyword(s):

Machine Learning ◽

High School ◽

At Risk ◽

Early Warning ◽

Early Warning Systems ◽

Machine Learning Techniques ◽

Dropping Out ◽

Learning Methods ◽

Machine Learning Methods ◽

Learning Techniques

Background/Context: Early warning indicators (EWI) are often used by states and districts to identify students who are not on track to finish high school, and provide supports/interventions to increase the odds the student will graduate. While EWI are diverse in terms of the academic behaviors they capture, research suggests that indicators like course failures, chronic absenteeism, and suspensions can help identify students in need of additional supports. In parallel with the expansion of administrative data that have made early versions of EWI possible, new machine learning methods have been developed. These methods are data-driven and often designed to sift through thousands of variables with the purpose of identifying the best predictors of a given outcome. While applications of machine learning techniques to identify students at-risk of high school dropout have obvious appeal, few studies consider the benefits and limitations of applying those models in an EWI context, especially as they relate to questions of fairness and equity.

Focus of Study: In this study, we will provide applied examples of how machine learning can be used to support EWI selection. The purpose is to articulate the broad risks and benefits of using machine learning methods to identify students who may be at risk of dropping out. We focus on dropping out given its salience in the EWI literature, but also anticipate generating insights that will be germane to EWI used for a variety of outcomes.

Research Design: We explore these issues by using several hypothetical examples of how ML techniques might be used to identify EWI. For example, we show results from decision tree algorithms used to identify predictors of dropout that use simulated data.

Conclusions/Recommendations: Generally, we argue that machine learning techniques have several potential benefits in the EWI context. For example, some related methods can help create clear decision rules for which students are a dropout risk, and their predictive accuracy can be higher than for more traditional, regression-based models. At the same time, these methods often require additional statistical and data management expertise to be used appropriately. Further, the black-box nature of machine learning algorithms could invite their users to interpret results through the lens of preexisting biases about students and educational settings.

Forecast of temperature-attributable mortality at lead times of up to 15 days for a very large ensemble of European regions

10.5194/egusphere-egu21-4107 ◽

2021 ◽

Author(s):

Marcos Quijal-Zamorano ◽

Desislava Petrova ◽

Xavier Rodó ◽

Èrica Martinez-Solanas ◽

Joan Ballester

Keyword(s):

Early Warning ◽

Lead Time ◽

Early Warning Systems ◽

Attributable Mortality ◽

Lead Times ◽

Warning Systems ◽

Epidemiological Models ◽

European Regions ◽

Large Ensemble ◽

Weather Forecasts

Implementing adequate preventive health measures is essential for public health decision making, particularly in the current context of rising temperatures. Most early warning systems are based only on climate data, and in very few cases do they truly model the actual impact of the climate phenomena. Here we establish, for the first time, the theoretical basis for the development of operational heat-health early warning systems that combine climate and health data. We studied the predictability of Temperature Attributable Mortality (TAM) at lead times of up to 15 days for a very large ensemble of European regions. To achieve this goal, we analysed daily counts of all-cause mortality for the period 1998-2012 in 147 NUTS2 regions in 16 European countries, representing more than 400 million people, and daily high-resolution weather forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF). We applied epidemiological models for the fitting of the temperature-mortality relationship in each of the regions, accounting for the different vulnerabilities and socio-demographic characteristics existing in Europe. We compared the predictive skill of the temperature and health forecasts on seasons and days with higher mortality risk. We conclude that the predictability of temperature can be used to issue skilful forecasts of TAM. In general, the predictability limit of temperature is similar to that of TAM, which implies that the use of epidemiological models to transform the climate variables into health information does not reduce the lead time limit with significant forecast skill. Nonetheless, the spatial heterogeneity of the predictability lead time for TAM is higher than for temperature, especially in summer, when the complex shape of the temperature-mortality association amplifies the forecast errors. Overall, we find a nearly linear relationship between the predictability of temperature and TAM for different seasons and regions, suggesting that future improvements in the predictability of temperature could automatically lead to improvements in the predictability of TAM.

AN AFFORDABLE ALTERNATIVE TO EARTHQUAKE EARLY WARNING SYSTEMS USING RECENT ADVANCES IN MACHINE LEARNING AND MICROELECTRONICS

10.1130/abs/2020am-359248 ◽

2020 ◽

Author(s):

Patrick LaChapelle ◽

Keyword(s):

Machine Learning ◽

Early Warning ◽

Early Warning Systems ◽

Earthquake Early Warning ◽

Warning Systems ◽

Recent Advances ◽

Earthquake Early Warning Systems

Machine Learning-based Early Warning Systems for Clinical Deterioration: A Systematic Scoping Review (Preprint)

10.2196/preprints.25187 ◽

2020 ◽

Author(s):

Sankavi Muralitharan ◽

Walter Nelson ◽

Shuang Di ◽

Michael McGillion ◽

PJ Devereaux ◽

...

Keyword(s):

Machine Learning ◽

Ambulatory Care ◽

Early Warning ◽

Vital Signs ◽

Early Warning Systems ◽

Clinical Deterioration ◽

Cochrane Library ◽

Warning Systems ◽

Learning Models ◽

Machine Learning Models

BACKGROUND: Timely identification of patients at a high risk of clinical deterioration is key to prioritizing care, allocating resources effectively and preventing adverse outcomes. Vital signs-based aggregate-weighted Early Warning Systems are commonly used to predict the risk of outcomes related to cardiorespiratory instability and sepsis, which are strong predictors of poor outcomes and mortality. Machine learning models, which can incorporate trends and capture relationships among parameters that aggregate-weighted models cannot, have recently been showing promising results.

OBJECTIVE: To identify, summarize, and evaluate the available research, current state of utility and challenges with machine learning based early warning systems using vital signs to predict the risk of physiological deterioration in acutely ill patients, across acute and ambulatory care settings.

METHODS: PubMed, CINAHL, Cochrane Library, Web of Science, Embase, and Google Scholar were searched for peer-reviewed, original studies with keywords related to “vital signs”, “clinical deterioration”, and “machine learning”. Included studies used patient vital signs along with demographics and described a machine learning model for predicting an outcome in acute and ambulatory care settings. Data were extracted following PRISMA, TRIPOD, and Cochrane Collaboration guidelines.

RESULTS: 24 peer-reviewed studies were identified for inclusion from 417 articles. 23 studies were retrospective, while 1 was prospective in nature. Care settings included general wards, ICUs, emergency departments, step-down units, medical assessment units, post-anesthetic wards, and home care. Machine learning models including logistic regression, tree-based methods, kernel-based methods and neural networks were most commonly used to predict the risk of deterioration. The area under the curve for models ranged from 0.57 to 0.97.

CONCLUSIONS: In studies that compared performance, reported results suggest that machine learning based early warning systems can achieve greater accuracy than aggregate weighted early warning systems but several areas for further research were identified. While these models have the potential to provide clinical decision support, there is a need for standardized outcome measures to allow for rigorous evaluation of performance across models. Further research needs to address the interpretability of model outputs by clinicians, clinical efficacy of these systems through prospective study design, and their potential impact in different clinical settings.

After the extreme flood in 2002: changes in preparedness, response and recovery of flood-affected residents in Germany between 2005 and 2011

Natural Hazards and Earth System Sciences Discussions ◽

10.5194/nhessd-2-6397-2014 ◽

2014 ◽

Vol 2 (10) ◽

pp. 6397-6451 ◽

Cited By ~ 2

Author(s):

S. Kienzler ◽

I. Pech ◽

H. Kreibich ◽

M. Müller ◽

A. H. Thieken

Keyword(s):

Early Warning ◽

Early Warning Systems ◽

Lead Times ◽

Warning Systems ◽

Telephone Interviews ◽

Private Households ◽

Extreme Flood ◽

Response And Recovery ◽

Flood Experience ◽

Flood Characteristics

Abstract. In the aftermath of the severe flood in August 2002, a number of changes in flood policies were launched in Germany and other European countries, aiming at improved risk management. The question arises whether these changes already have an impact on residents' capabilities of coping with floods and whether flood-affected private households are now better prepared than in 2002. Therefore, computer-aided telephone interviews with private households in Germany that suffered property damage due to flooding in 2005, 2006, 2010 or 2011 were performed and analysed with respect to flood awareness, precaution, preparedness and recovery. The data were compared to a similar investigation after the flood in 2002. After the flood in 2002, the level of private precaution increased considerably. One contributing factor is that a larger share of people knew that they were at risk of flooding. Yet this knowledge did not necessarily result in building retrofitting or flood proofing measures. The best level of precaution was found before the flood events in 2006 and 2011. This might be explained by more flood experience and overall greater awareness among the residents. Still, the costs and damage-avoiding benefits of these measures have to be communicated in a better way. Early warning and emergency response were substantially influenced by flood characteristics. In contrast to flood-affected people in 2006 or 2011, people affected by flooding in 2005 or 2010 had to deal with shorter lead times and less time to take emergency measures; consequently, they suffered higher losses. Therefore, it is important to further improve early warning systems and communication channels, particularly in hilly areas with fast-onset flooding.
