Emulation of 2D Hydrodynamic Flood Simulations at Catchment Scale Using ANN and SVR

Water ◽  
2021 ◽  
Vol 13 (20) ◽  
pp. 2858
Author(s):  
Saba Mirza Alipour ◽  
Joao Leal

Two-dimensional (2D) hydrodynamic models are among the most widely used tools for flood modeling and risk estimation. The 2D models provide accurate results; however, they are computationally costly and therefore unsuitable for many real-time applications and for uncertainty analyses that require a large number of model realizations. Therefore, the present study aims to (i) develop emulators based on SVR and ANN as an alternative for predicting the 100-year flood water level, (ii) improve the performance of the emulators through dimensionality reduction techniques, and (iii) assess the training sample size required to develop an accurate emulator. Our results indicate that the SVR-based emulator is a fast and reliable alternative that can predict the water level accurately. Moreover, the performance of the models can be improved by identifying the most influential input variables and eliminating redundant inputs from the training process. The findings of this study suggest that a training data size equal to 70% (or more) of the data results in reliable and accurate predictions.
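To make the emulation idea concrete, the following is a minimal sketch (not the authors' code) of an SVR-based emulator. It assumes the hydrodynamic model inputs (e.g. rainfall, roughness, boundary conditions) have already been collected into a feature matrix and the simulated 100-year water levels into a target vector; all variable names, shapes and hyperparameters are illustrative.

```python
# Minimal sketch of an SVR-based emulator for peak water level.
# Synthetic data stand in for the hydrodynamic model realizations.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))                                   # 500 realizations, 8 input variables
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=500)    # stand-in water levels

# 70% of the realizations used for training, in line with the study's finding
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)

emulator = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
emulator.fit(X_train, y_train)
print("R^2 on held-out realizations:", emulator.score(X_test, y_test))
```

Input selection (point ii of the abstract) would correspond to dropping redundant columns of X before fitting the pipeline.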

2020 ◽  
Vol 14 (1) ◽  
pp. 5
Author(s):  
Adam Adli ◽  
Pascal Tyrrell

Introduction: Advances in computing have allowed for the practical application of increasingly advanced machine learning models to aid healthcare providers with diagnosis and inspection of medical images. Often, a lack of training data and computation time can be a limiting factor in the development of an accurate machine learning model in the domain of medical imaging. As a possible solution, this study investigated whether L2 regularization moderates the overfitting that occurs as a result of small training sample sizes. Methods: This study employed transfer learning experiments on a dental x-ray binary classification model to explore L2 regularization with respect to training sample size in five common convolutional neural network architectures. Model testing performance was investigated, and technical implementation details including computation times and hardware considerations, as well as performance factors and practical feasibility, were described. Results: The experimental results showed a trend that smaller training sample sizes benefitted more from regularization than larger training sample sizes. Further, the results showed that applying L2 regularization did not add significant computational overhead and that the extra rounds of training required by L2 regularization were feasible when training sample sizes were relatively small. Conclusion: Overall, this study found that there is a window of opportunity in which the benefits of employing regularization can be most cost-effective relative to training sample size. It is recommended that training sample size be carefully considered when forming expectations of the generalizability improvements achievable by investing computational resources into model regularization.
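The following is a hedged sketch of what adding L2 (weight-decay) regularization to the trainable head of a transfer-learning model can look like; the backbone choice, image size and regularization strength are assumptions, not the study's configuration.

```python
# Sketch: transfer learning for binary x-ray classification with an L2-regularized head.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

backbone = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                          input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False                              # freeze pretrained weights

model = tf.keras.Sequential([
    backbone,
    layers.Dense(1, activation="sigmoid",
                 kernel_regularizer=regularizers.l2(1e-3)),   # L2 penalty on the new head
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)      # expected to be a small training sample
```

Varying the training sample size while toggling the kernel_regularizer argument reproduces, in spirit, the comparison described in the abstract.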


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Wenyi Lin ◽  
Kyle Hasenstab ◽  
Guilherme Moura Cunha ◽  
Armin Schwartzman

Abstract We propose a random forest classifier for identifying adequacy of liver MR images using handcrafted (HC) features and deep convolutional neural networks (CNNs), and analyze the relative role of these two components in relation to the training sample size. The HC features, specifically developed for this application, include Gaussian mixture models, Euler characteristic curves and texture analysis. The HC features outperform the CNN for smaller sample sizes and offer increased interpretability. On the other hand, with enough training data, the combined classifier outperforms the models trained with HC features or CNN features alone. These results illustrate the added value of HC features with respect to CNNs, especially when insufficient data are available, as is often the case in clinical studies.
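A minimal sketch of the combined-classifier idea follows: handcrafted features are concatenated with CNN-derived features and fed to a random forest. The feature-extraction functions below are placeholders standing in for the paper's Gaussian-mixture, Euler-characteristic and texture features and for a pretrained CNN's activations.

```python
# Sketch: random forest on concatenated handcrafted (HC) and CNN features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def handcrafted_features(images):
    # placeholder for GMM / Euler-characteristic / texture statistics
    return images.reshape(len(images), -1).mean(axis=1, keepdims=True)

def cnn_features(images):
    # placeholder for penultimate-layer activations of a pretrained CNN
    return images.reshape(len(images), -1)[:, :32]

images = np.random.rand(200, 64, 64)                    # stand-in MR slices
labels = np.random.randint(0, 2, size=200)              # adequate vs. non-adequate

X = np.hstack([handcrafted_features(images), cnn_features(images)])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```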


Author(s):  
Fabio Azzalini ◽  
Songle Jin ◽  
Marco Renzi ◽  
Letizia Tanca

Abstract Nowadays, data integration must often manage noisy data, including attribute values written in natural language such as product descriptions or book reviews. In the data integration process, Entity Linkage has the role of identifying records that contain information referring to the same object. Modern Entity Linkage methods, in order to reduce the dimension of the problem, partition the initial search space into “blocks” of records that can be considered similar according to some metric, and then compare only the records belonging to the same block, thus greatly reducing the overall complexity of the algorithm. In this paper, we propose two automatic blocking strategies that, differently from traditional methods, aim at capturing the semantic properties of data by means of recent deep learning frameworks. Both methods, in a first phase, exploit recent research on tuple and sentence embeddings to transform the database records into real-valued vectors; in a second phase, to arrange the tuples inside the blocks, one of them adopts approximate nearest neighbour algorithms, while the other uses dimensionality reduction techniques combined with clustering algorithms. We train our blocking models on an external, independent corpus and then directly apply them to new datasets in an unsupervised fashion. Our choice is motivated by the fact that, in most data integration scenarios, no training data are actually available. We tested our systems on six popular datasets and compared their performances against five traditional blocking algorithms. The test results demonstrate that our deep-learning-based blocking solutions outperform standard blocking algorithms, especially on textual and noisy data.
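A simplified sketch of embedding-based blocking follows: each record is encoded as a vector with a generic sentence-embedding model, and nearby vectors are grouped into candidate blocks. The model name, block size and the use of exact nearest neighbours (rather than an approximate index such as FAISS or Annoy) are assumptions, not the paper's pipeline.

```python
# Sketch: blocking by nearest neighbours in sentence-embedding space.
from sentence_transformers import SentenceTransformer
from sklearn.neighbors import NearestNeighbors

records = [
    "iPhone 12 Pro 128GB silver",
    "Apple iPhone 12 Pro, 128 GB, Silver",
    "Samsung Galaxy S21 Ultra 256GB",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")       # assumed pretrained encoder
vectors = encoder.encode(records)

# each record's candidate block is its k nearest neighbours in embedding space
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(vectors)
_, blocks = nn.kneighbors(vectors)
for record_id, block in enumerate(blocks):
    print(record_id, "->", block.tolist())
```

The paper's second strategy would replace the nearest-neighbour step with dimensionality reduction followed by clustering of the embedded records.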


The objective of this paper is to introduce two linear dimensionality reduction techniques, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). PCA reduces the size of the data while conserving maximum variance in the form of new variables called principal components, whereas LDA works by minimizing within-class distance and maximizing the separation between classes. PCA finds the axes of maximum variance, while LDA finds the axes of class separability. The methods are evaluated on the MNIST handwritten digit data set. Our conclusion is that PCA can outperform LDA when the training data set is small, and it does so with lower computational complexity. The presentation of linear techniques in this paper provides a clear, comparative understanding of the methods.
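To illustrate the comparison, here is a minimal sketch (not the paper's experiment) that uses PCA and LDA as linear reduction steps before a simple classifier. It uses scikit-learn's small digits dataset rather than full MNIST, and a deliberately small training split to mimic the limited-training-data setting.

```python
# Sketch: PCA vs. LDA as dimensionality reduction before classification.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
# small training set (20%) to mimic the limited-data scenario discussed above
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.2, random_state=0)

for name, reducer in [("PCA", PCA(n_components=9)),
                      ("LDA", LinearDiscriminantAnalysis(n_components=9))]:
    model = make_pipeline(reducer, LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    print(name, "test accuracy:", round(model.score(X_test, y_test), 3))
```

Note that LDA can produce at most (number of classes - 1) components, here 9 for ten digit classes, which is why both reducers use 9 components.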


2015 ◽  
Vol 294 ◽  
pp. 553-564 ◽  
Author(s):  
Manuel Domínguez ◽  
Serafín Alonso ◽  
Antonio Morán ◽  
Miguel A. Prada ◽  
Juan J. Fuertes

2021 ◽  
Vol 2021 (4) ◽  
Author(s):  
Jack Y. Araz ◽  
Michael Spannowsky

Abstract Ensemble learning is a technique where multiple component learners are combined through a protocol. We propose an Ensemble Neural Network (ENN) that uses the combined latent-feature space of multiple neural network classifiers to improve the representation of the network hypothesis. We apply this approach to construct an ENN from Convolutional and Recurrent Neural Networks to discriminate top-quark jets from QCD jets. Such an ENN provides the flexibility to improve the classification beyond simple prediction-combining methods by linking different sources of error correlations, hence improving the representation between data and hypothesis. In combination with Bayesian techniques, we show that it can reduce epistemic uncertainties and the entropy of the hypothesis by simultaneously exploiting various kinematic correlations of the system, which also makes the network less susceptible to a limited training sample size.
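The following is a hedged sketch of the latent-feature combination idea: a convolutional branch and a recurrent branch are merged through their latent features rather than their final predictions. Input shapes (jet images, particle sequences), layer sizes and the absence of the Bayesian layers are assumptions made for brevity.

```python
# Sketch: ensemble network combining CNN and RNN latent features for jet tagging.
import tensorflow as tf
from tensorflow.keras import layers

image_in = layers.Input(shape=(32, 32, 1))              # e.g. calorimeter "jet image"
x = layers.Conv2D(16, 3, activation="relu")(image_in)
x = layers.GlobalAveragePooling2D()(x)                  # CNN latent features

seq_in = layers.Input(shape=(20, 4))                    # e.g. sequence of particle four-momenta
r = layers.LSTM(16)(seq_in)                             # RNN latent features

joint = layers.Concatenate()([x, r])                    # combined latent-feature space
out = layers.Dense(1, activation="sigmoid")(joint)      # top-quark vs. QCD jet score

enn = tf.keras.Model([image_in, seq_in], out)
enn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Combining latent features, rather than averaging the branch outputs, is what lets the joint head exploit correlations between the branches' errors.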


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Van Hoan Do ◽  
Stefan Canzar

Abstract Emerging single-cell technologies profile multiple types of molecules within individual cells. A fundamental step in the analysis of the produced high-dimensional data is their visualization using dimensionality reduction techniques such as t-SNE and UMAP. We introduce j-SNE and j-UMAP as their natural generalizations to the joint visualization of multimodal omics data. Our approach automatically learns the relative contribution of each modality to a concise representation of cellular identity that promotes discriminative features but suppresses noise. On eight datasets, j-SNE and j-UMAP produce unified embeddings that better agree with known cell types and that harmonize RNA and protein velocity landscapes.
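For orientation, here is a deliberately naive baseline for joint visualization of two modalities: z-score each modality, concatenate with fixed weights, and embed with standard UMAP. j-SNE and j-UMAP instead learn the per-modality contributions automatically; this sketch only illustrates the multimodal input structure, and the fixed weights and data shapes are assumptions.

```python
# Sketch: naive fixed-weight joint embedding of RNA and protein modalities with UMAP.
import numpy as np
import umap
from sklearn.preprocessing import StandardScaler

n_cells = 300
rna = np.random.rand(n_cells, 2000)          # RNA expression matrix (placeholder)
protein = np.random.rand(n_cells, 30)        # surface-protein matrix (placeholder)

w_rna, w_protein = 0.5, 0.5                  # fixed weights; learned automatically in j-UMAP
joint = np.hstack([w_rna * StandardScaler().fit_transform(rna),
                   w_protein * StandardScaler().fit_transform(protein)])

embedding = umap.UMAP(n_neighbors=15, random_state=0).fit_transform(joint)
print(embedding.shape)                       # (300, 2) coordinates for plotting
```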

