An introduction to distributed training of deep neural networks for segmentation tasks with large seismic data sets

Geophysics ◽  
2021 ◽  
Vol 86 (6) ◽  
pp. KS151-KS160
Author(s):  
Claire Birnie ◽  
Haithem Jarraya ◽  
Fredrik Hansteen

Deep learning applications are drastically progressing in seismic processing and interpretation tasks. However, most approaches subsample data volumes and restrict model sizes to minimize computational requirements. Subsampling the data risks losing vital spatiotemporal information which could aid training, whereas restricting model sizes can impact model performance, or in some extreme cases renders more complicated tasks such as segmentation impossible. We have determined how to tackle the two main issues of training of large neural networks (NNs): memory limitations and impracticably large training times. Typically, training data are preloaded into memory prior to training, a particular challenge for seismic applications in which the data format is typically four times larger than that used for standard image processing tasks (float32 versus uint8). Based on an example from microseismic monitoring, we evaluate how more than 750 GB of data can be used to train a model by using a data generator approach, which only stores in memory the data required for that training batch. Furthermore, efficient training over large models is illustrated through the training of a seven-layer U-Net with input data dimensions of [Formula: see text] (approximately [Formula: see text] million parameters). Through a batch-splitting distributed training approach, the training times are reduced by a factor of four. The combination of data generators and distributed training removes any necessity of data subsampling or restriction of NN sizes, offering the opportunity to use larger networks, higher resolution input data, or move from 2D to 3D problem spaces.
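The two ideas in this abstract, a data generator that holds only the current batch in memory and a batch-splitting form of distributed training, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the one-`.npy`-file-per-sample layout, the function names, and the linear least-squares model used to show gradient averaging are all assumptions made for illustration.

```python
import numpy as np


def batch_generator(file_paths, batch_size, dtype=np.float32):
    """Yield batches, loading from disk only the files needed for each batch.

    Assumes (hypothetically) one sample per .npy file, all with the same shape.
    """
    for start in range(0, len(file_paths), batch_size):
        paths = file_paths[start:start + batch_size]
        # Only the arrays of this batch are resident in memory at once.
        yield np.stack([np.load(p).astype(dtype) for p in paths])


def bytes_per_sample(shape, dtype):
    """Memory footprint in bytes of one sample of the given shape and dtype."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize


def mse_gradient(w, X, y):
    """Gradient of the mean squared error of a linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)


def split_batch_gradient(w, X, y, n_workers):
    """Split one batch across workers and combine the per-shard gradients.

    Each shard's gradient is weighted by its size, so the result matches the
    gradient of the full batch. Assumes n_workers <= number of samples.
    """
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads_and_sizes = [(mse_gradient(w, Xs, ys), len(ys)) for Xs, ys in shards]
    total = sum(n for _, n in grads_and_sizes)
    return sum(g * n for g, n in grads_and_sizes) / total
```

`bytes_per_sample` makes the float32-versus-uint8 factor of four explicit, and `split_batch_gradient` shows why splitting a batch across workers leaves the update unchanged: the size-weighted average of the shard gradients equals the full-batch gradient. In a real framework the shards would be processed in parallel on separate devices.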

1995 ◽  
Vol 7 (3) ◽  
pp. 507-517 ◽  
Author(s):  
Marco Idiart ◽  
Barry Berk ◽  
L. F. Abbott

Model neural networks can perform dimensional reductions of input data sets using correlation-based learning rules to adjust their weights. Simple Hebbian learning rules lead to an optimal reduction at the single unit level but result in highly redundant network representations. More complex rules designed to reduce or remove this redundancy can develop optimal principal component representations, but they are not very compelling from a biological perspective. Neurons in biological networks have restricted receptive fields limiting their access to the input data space. We find that, within this restricted receptive field architecture, simple correlation-based learning rules can produce surprisingly efficient reduced representations. When noise is present, the size of the receptive fields can be optimally tuned to maximize the accuracy of reconstructions of input data from a reduced representation.
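As an illustration of a correlation-based rule performing dimensional reduction at the single-unit level, the sketch below uses Oja's rule, a normalized Hebbian update whose weight vector tends toward the first principal component of zero-mean input data. The abstract does not name a specific rule, so the choice of Oja's rule, the learning rate, and the number of epochs are assumptions.

```python
import numpy as np


def oja_first_component(X, lr=0.005, epochs=50, seed=0):
    """Estimate the leading principal direction of zero-mean data X.

    Uses Oja's rule: w <- w + lr * y * (x - y * w), with y = w . x.
    The Hebbian term lr * y * x grows w along correlated inputs; the
    decay term -lr * y**2 * w keeps the norm of w close to one.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X:
            y = w @ x
            w += lr * y * (x - y * w)
    # Return a unit vector; its sign is arbitrary.
    return w / np.linalg.norm(w)
```

The result can be checked against the eigenvector of the sample covariance matrix with the largest eigenvalue, up to sign.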


2021 ◽  
pp. 1-17
Author(s):  
Luis Sa-Couto ◽  
Andreas Wichert

Abstract Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and removes the focus from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability, a hidden overfitting. More concretely, a supervised, backpropagation based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both models trained and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants like the What-Where model go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. Knowing that, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets of it: MNIST and ETL-1. To try to make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with expectation.


2005 ◽  
Vol 9 (4) ◽  
pp. 313-321 ◽  
Author(s):  
R. R. Shrestha ◽  
S. Theobald ◽  
F. Nestmann

Abstract. Artificial neural networks (ANNs) provide a quick and flexible means of developing flood flow simulation models. An important criterion for the wider applicability of the ANNs is the ability to generalise the events outside the range of training data sets. With respect to flood flow simulation, the ability to extrapolate beyond the range of calibrated data sets is of crucial importance. This study explores methods for improving generalisation of the ANNs using three different flood events data sets from the Neckar River in Germany. An ANN-based model is formulated to simulate flows at certain locations in the river reach, based on the flows at upstream locations. Network training data sets consist of time series of flows from observation stations. Simulated flows from a one-dimensional hydrodynamic numerical model are integrated for network training and validation, at a river section where no measurements are available. Network structures with different activation functions are considered for improving generalisation. The training algorithm involved backpropagation with the Levenberg-Marquardt approximation. The ability of the trained networks to extrapolate is assessed using flow data beyond the range of the training data sets. The results of this study indicate that the ANN in a suitable configuration can extend forecasting capability to a certain extent beyond the range of calibrated data sets.


2019 ◽  
Vol 7 (3) ◽  
pp. SE113-SE122 ◽  
Author(s):  
Yunzhi Shi ◽  
Xinming Wu ◽  
Sergey Fomel

Salt boundary interpretation is important for the understanding of salt tectonics and velocity model building for seismic migration. Conventional methods consist of computing salt attributes and extracting salt boundaries. We have formulated the problem as 3D image segmentation and evaluated an efficient approach based on deep convolutional neural networks (CNNs) with an encoder-decoder architecture. To train the model, we design a data generator that extracts randomly positioned subvolumes from large-scale 3D training data set followed by data augmentation, then feed a large number of subvolumes into the network while using salt/nonsalt binary labels generated by thresholding the velocity model as ground truth labels. We test the model on validation data sets and compare the blind test predictions with the ground truth. Our results indicate that our method is capable of automatically capturing subtle salt features from the 3D seismic image with less or no need for manual input. We further test the model on a field example to indicate the generalization of this deep CNN method across different data sets.
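The data generator described in this abstract can be sketched roughly as follows: binary salt labels are obtained by thresholding the velocity model, and randomly positioned subvolumes are cut from the image and label volumes. The function name, the threshold parameter `salt_velocity`, and the omission of the augmentation step are assumptions; the abstract does not give these details.

```python
import numpy as np


def random_subvolumes(image, velocity, size, n, salt_velocity, seed=0):
    """Yield n (subvolume, label) pairs cut at random positions.

    image, velocity: 3D arrays of identical shape.
    size: subvolume shape, e.g. (64, 64, 64).
    salt_velocity: hypothetical threshold; voxels at or above it count as salt.
    """
    rng = np.random.default_rng(seed)
    # Binary salt / nonsalt labels from the velocity model.
    labels = (velocity >= salt_velocity).astype(np.uint8)
    for _ in range(n):
        # Pick a corner such that the subvolume fits inside the volume.
        corner = [rng.integers(0, dim - s + 1) for dim, s in zip(image.shape, size)]
        window = tuple(slice(c, c + s) for c, s in zip(corner, size))
        yield image[window], labels[window]
```

Because subvolumes are cut on demand, the number of training samples is not limited by how many subvolumes fit in memory at once; an augmentation step such as random flips could be applied to each pair before it is yielded.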


2007 ◽  
Vol 11 (1) ◽  
pp. 647-662 ◽  
Author(s):  
M. G. Hutchins ◽  
C. Dilks ◽  
H. N. Davies ◽  
A. Deflandre

Abstract. Flow and nitrate dynamics were simulated in two catchments, the River Aire in northern England and the River Ythan in north-east Scotland. In the case of the Aire, a diffuse pollution model was coupled with a river quality model (CASCADE-QUESTOR); in the study of the Ythan, an integrated model (SWAT) was used. In each study, model performance was evaluated for differing levels of spatial representation in input data sets (rainfall, soils and land use). In respect of nitrate concentrations, the performance of the models was compared with that of a regression model based on proportions of land cover. The overall objective was to assess the merits of spatially distributed input data sets. In both catchments, specific measures of quantitative performance showed that models using the most detailed available input data contributed, at best, only a marginal improvement over simpler implementations. Hence, the level of complexity used in input data sets has to be determined, not only on multiple criteria of quantitative performance but also on qualitative assessments, reflecting the specific context of the model application and the current and likely future needs of end-users.


2012 ◽  
Vol 490-495 ◽  
pp. 3105-3108
Author(s):  
Kamran Pazand ◽  
Younes Alizadeh

The purpose of this paper is to enable fast determination of the stress distribution around a circular hole in symmetric composite laminates under in-plane loading. To this end, stress values around the hole edge are calculated for different ply positions, for a finite number of input data sets, using the Lekhnitskii expressions and a computer code. The resulting data are then used to train artificial neural networks (ANNs) that can predict those quantities with sufficient accuracy throughout the composite plate for any given input value, ply position, and applied force and stress.


Author(s):  
Mohammad Amin Nabian ◽  
Hadi Meidani

Abstract In this paper, we introduce a physics-driven regularization method for training of deep neural networks (DNNs) for use in engineering design and analysis problems. In particular, we focus on the prediction of a physical system, for which in addition to training data, partial or complete information on a set of governing laws is also available. These laws often appear in the form of differential equations, derived from first principles, empirically validated laws, or domain expertise, and are usually neglected in a data-driven prediction of engineering systems. We propose a training approach that utilizes the known governing laws and regularizes data-driven DNN models by penalizing divergence from those laws. The first two numerical examples are synthetic examples, where we show that in constructing a DNN model that best fits the measurements from a physical system, the use of our proposed regularization results in DNNs that are more interpretable with smaller generalization errors, compared with other common regularization methods. The last two examples concern metamodeling for a random Burgers’ system and for aerodynamic analysis of passenger vehicles, where we demonstrate that the proposed regularization provides superior generalization accuracy compared with other common alternatives.
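The general shape of such a regularized objective can be sketched as a data-fit term plus a penalty on the residual of a governing equation. The example below is an assumption-laden illustration, not the authors' formulation: it takes the governing law to be the ODE du/dx = -k u, evaluates the residual with finite differences on a grid, and weights it with a hypothetical coefficient `lam`.

```python
import numpy as np


def physics_regularized_loss(u_pred_data, u_obs, u_pred_grid, x_grid, k, lam):
    """Data misfit plus a penalty for violating du/dx = -k * u.

    u_pred_data: model predictions at the measurement locations.
    u_obs:       measurements at those locations.
    u_pred_grid: model predictions on x_grid, where the law is enforced.
    lam:         hypothetical weight of the physics penalty.
    """
    data_loss = np.mean((u_pred_data - u_obs) ** 2)
    # Finite-difference estimate of du/dx on the (possibly nonuniform) grid.
    du_dx = np.gradient(u_pred_grid, x_grid)
    # Residual of the assumed governing law; zero when the law is satisfied.
    residual = du_dx + k * u_pred_grid
    return data_loss + lam * np.mean(residual ** 2)
```

A prediction that fits the measurements but violates the assumed law receives a larger loss than one that satisfies both, which is the mechanism by which the penalty steers training toward physically consistent models.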


2020 ◽  
Vol 12 (20) ◽  
pp. 3358
Author(s):  
Vasileios Syrris ◽  
Ondrej Pesek ◽  
Pierre Soille

Automatic supervised classification with complex modelling such as deep neural networks requires the availability of representative training data sets. While there exists a plethora of data sets that can be used for this purpose, they are usually very heterogeneous and not interoperable. In this context, the present work has a twofold objective: (i) to describe procedures of open-source training data management, integration, and data retrieval, and (ii) to demonstrate the practical use of varying source training data for remote sensing image classification. For the former, we propose SatImNet, a collection of open training data, structured and harmonized according to specific rules. For the latter, two modelling approaches based on convolutional neural networks have been designed and configured to deal with satellite image classification and segmentation.


Author(s):  
Frank Padberg

The author uses neural networks to estimate how many defects are hidden in a software document. Input for the models are metrics that get collected when effecting a standard quality assurance technique on the document, a software inspection. For inspections, the empirical data sets typically are small. The author identifies two key ingredients for a successful application of neural networks to small data sets: Adapting the size, complexity, and input dimension of the networks to the amount of information available for training; and using Bayesian techniques instead of cross-validation for determining model parameters and selecting the final model. For inspections, the machine learning approach is highly successful and outperforms the previously existing defect estimation methods in software engineering by a factor of 4 in accuracy on the standard benchmark. The author’s approach is well applicable in other contexts that are subject to small training data sets.

