Flood Early Warning Systems using Machine Learning Techniques. Application to a Catchment located in the Tropical Andes of Ecuador.
Abstract. Short-rain floods, especially flash floods, have devastating impacts on society, the economy, and ecosystems. A key countermeasure is to develop Flood Early Warning Systems (FEWSs) that issue flood warnings with sufficient lead time for decision making. Although Machine Learning (ML) techniques have gained popularity among hydrologists, one research question remains poorly answered: which ML technique is best for flood forecasting? To answer it, we compare the efficiencies of FEWSs developed with the five most common ML techniques for flood forecasting, for lead times from 1 to 12 hours. We use the Tomebamba catchment in the Ecuadorean Andes as a case study, with three warning classes: No-alert, Pre-alert, and Alert. For all lead times, the Multi-Layer Perceptron (MLP) technique achieves the highest model performance (f1-macro score), followed by Logistic Regression (LR), with scores ranging from 0.82 (1-hour) to 0.46 (12-hour). This ranking is confirmed by the log-loss scores, which range from 0.09 (1-hour) to 0.20 (12-hour) for these two techniques. The remaining ML techniques (K-Nearest Neighbors, Naive Bayes, and Random Forest) perform worse, but their ranking is highly variable and inconclusive. Moreover, according to the g-mean, LR models show greater stability in correctly classifying all flood classes, whereas MLP models are specialized in the minority (Pre-alert and Alert) classes. To improve the performance and applicability of FEWSs, we recommend that future efforts enhance input data representation and develop communication applications between FEWSs and the public as tools to boost society's preparedness for floods.
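The three scores named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard definitions: f1-macro as the unweighted mean of per-class F1 scores, log-loss as the mean negative log-probability assigned to the true class, and g-mean as the geometric mean of per-class recalls (a common multi-class definition, assumed here). The function names and the toy labels are hypothetical.

```python
import math

# Warning classes from the abstract; the order fixes the probability columns.
CLASSES = ["No-alert", "Pre-alert", "Alert"]


def class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def f1_macro(y_true, y_pred, labels=CLASSES):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    scores = []
    for label in labels:
        tp, fp, fn = class_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred, labels=CLASSES):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = []
    for label in labels:
        tp, _, fn = class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
    return math.prod(recalls) ** (1.0 / len(recalls))


def log_loss(y_true, proba, labels=CLASSES, eps=1e-15):
    """Mean negative log-probability of the true class; lower is better."""
    total = 0.0
    for t, row in zip(y_true, proba):
        p = min(max(row[labels.index(t)], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


# Toy, made-up labels for illustration only (not data from the study).
y_true = ["No-alert"] * 4 + ["Pre-alert"] * 2 + ["Alert"] * 2
y_pred = ["No-alert", "No-alert", "No-alert", "Pre-alert",
          "Pre-alert", "Pre-alert", "Alert", "No-alert"]

print("f1-macro:", round(f1_macro(y_true, y_pred), 3))
print("g-mean:  ", round(g_mean(y_true, y_pred), 3))
```

Because the g-mean multiplies the per-class recalls, a model that misses one class entirely scores zero, which is why it is often used to judge balance across majority and minority classes.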