Deep-Learning-Based Automated Sedimentary Geometry Characterization From Borehole Images

2021 ◽
Author(s):  
Marie Lefranc ◽  
Zikri Bayraktar ◽  
Morten Kristensen ◽  
Hedi Driss ◽  
...  

Sedimentary geometry on borehole images usually summarizes the arrangement of bed boundaries, erosive surfaces, crossbedding, sedimentary dip, and/or deformed beds. The interpretation, very often manual, requires a high level of expertise, is time-consuming, can suffer from user bias, and becomes very challenging in highly deviated wells. Bedform geometry is rarely interpreted from crossbed data on a borehole image. The purpose of this study is to develop an automated method to interpret sedimentary structures from borehole images, including the bedform geometry resulting from changes in flow direction. Automation is achieved in this unique interpretation methodology using deep learning (DL). The first task comprised the creation of a training data set of 2D borehole images. This library of images was then used to train deep neural network models. Testing different convolutional neural network (CNN) architectures showed that the ResNet architecture gave the best performance for the classification of the different sedimentary structures, with a very high validation accuracy in the range of 93 to 96%. To test the developed method, additional logs of synthetic data were created as sequences of different sedimentary structures (i.e., classes) associated with different well deviations, with the addition of gaps. The model was able to predict the proper class in these composite logs and to highlight the transitions accurately.
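
The paper's implementation is not included in the abstract; a minimal sketch of the classification step it describes, fine-tuning a ResNet on 2D borehole-image patches, could look as follows (class names, image size, and training details are illustrative assumptions):

```python
# Minimal sketch (PyTorch / torchvision >= 0.13): fine-tuning a ResNet
# to classify sedimentary structures on 2D borehole-image patches.
# Class names and training details are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

CLASSES = ["bed_boundary", "erosive_surface", "crossbedding", "deformed_bed"]  # assumed

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))  # new classification head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step on a batch of (N, 3, H, W) image patches."""
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Per-depth patch predictions could then be stitched along the well to reproduce the composite-log classification and transition picking described above.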


1995 ◽  
Vol 7 (1) ◽  
pp. 86-107 ◽  
Author(s):  
G. Deco ◽  
W. Finnoff ◽  
H. G. Zimmermann

Controlling network complexity in order to prevent overfitting is one of the major problems encountered when using neural network models to extract structure from small data sets. In this paper we present a network architecture designed for use with a cost function that includes a novel complexity penalty term. In this architecture the outputs of the hidden units are strictly positive and sum to one, and are interpreted as the probability that the actual input belongs to a certain class formed during learning. The penalty term expresses the mutual information between the inputs and the extracted classes. This measure effectively describes the network complexity with respect to the given data in an unsupervised fashion. The efficiency of this architecture and penalty term, combined with backpropagation training, is demonstrated on a real-world economic time series forecasting problem. The model was also applied to the benchmark sunspot data and to a synthetic data set from the statistics community.
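
In our notation (a reconstruction, not the authors' formulas), with p_c(x_n) the output of hidden unit c for input x_n and its average over the N training inputs written as a bar, the described cost takes the form:

```latex
% Hedged reconstruction of the described cost: an error term plus a
% mutual-information complexity penalty weighted by lambda.
E = E_{\mathrm{error}} + \lambda \, I(X; C), \qquad
I(X; C) = H(C) - H(C \mid X)
        = -\sum_{c} \bar{p}_c \log \bar{p}_c
          + \frac{1}{N} \sum_{n=1}^{N} \sum_{c} p_c(x_n) \log p_c(x_n)
```

Minimizing E thus trades prediction error against the amount of information the extracted classes carry about the inputs, which is the unsupervised complexity measure described above.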


2021 ◽  
Author(s):  
Tomochika Fujisawa ◽  
Victor Noguerales ◽  
Emmanouil Meramveliotakis ◽  
Anna Papadopoulou ◽  
Alfried P Vogler

Complex bulk samples of invertebrates from biodiversity surveys present a great challenge for taxonomic identification, especially if obtained from unexplored ecosystems. High-throughput imaging combined with machine learning for rapid classification could overcome this bottleneck. Developing such procedures requires that taxonomic labels from an existing source data set be used for model training and prediction on an unknown target sample. Yet the feasibility of transfer learning for the classification of unknown samples remains to be tested. Here, we assess the efficiency of deep learning and domain transfer algorithms for family-level classification of below-ground bulk samples of Coleoptera from understudied forests of Cyprus. We trained neural network models with images from local surveys versus global databases of above-ground samples from tropical forests and evaluated how prediction accuracy was affected by: (a) the quality and resolution of images, (b) the size and complexity of the training set, and (c) the transferability of identifications across very disparate source-target pairs that do not share any species or genera. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images and on dataset complexity. The accuracy of between-dataset predictions was reduced to a maximum of 82% and depended greatly on the standardisation of the imaging procedure. When the source and target images were of similar quality and resolution, albeit from different faunas, the reduction in accuracy was minimal. Application of algorithms for domain adaptation significantly improved the prediction performance of models trained on non-standardised, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota when the imaging conditions and classification algorithms are carefully considered. Our results also provide guidelines for data acquisition and algorithmic development for high-throughput image-based biodiversity surveys.
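
The abstract does not name the domain adaptation algorithms used; one standard choice, shown here purely as an illustration, is a CORAL-style loss that aligns the covariance of source and target features so that a model trained on one imaging setup transfers to another:

```python
# Illustrative sketch (PyTorch) of one common domain-adaptation loss,
# CORAL, which aligns feature covariances between a labelled source
# domain and an unlabelled target domain. The paper's actual algorithm
# is not specified in the abstract; this is an assumption-laden example.
import torch

def coral_loss(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Distance between covariance matrices of two (N, d) feature batches."""
    d = source.size(1)

    def covariance(x: torch.Tensor) -> torch.Tensor:
        x = x - x.mean(dim=0, keepdim=True)
        return x.t() @ x / (x.size(0) - 1)

    return ((covariance(source) - covariance(target)) ** 2).sum() / (4 * d * d)

# Total loss = classification loss on labelled source images
#            + lambda * coral_loss(source_features, target_features),
# pushing the feature extractor toward representations that transfer
# across imaging conditions (e.g., local surveys vs. global databases).
```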


2021 ◽  
Vol 38 (1) ◽  
pp. 1-11
Author(s):  
Hafzullah İş ◽  
Taner Tuncer

It is highly important to detect malicious account interactions in social networks with regard to political, social, and economic aspects. This paper analyzed the profile structure of social media users using their data interactions. A total of 10 parameters, including diameter, density, reciprocity, centrality, and modularity, were used to comprehensively characterize the interactions of Twitter users. Moreover, a new data set was formed by visualizing the data obtained with these parameters. User profiles were classified with deep convolutional neural network (CNN) models into active, passive, and malicious classes. Success rates for the classification algorithms were estimated based on the hyperparameters and application platforms. The best model had a success rate of 98.67%. The methodology demonstrated that Twitter user profiles can be classified successfully through user interaction-based parameters. It is expected that this paper will contribute to the published literature on behavioral analysis and the determination of malicious accounts in social networks.
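
As an illustration of how such interaction parameters can be computed (the paper's exact definitions and tooling are not given in the abstract), a NetworkX sketch over a directed interaction graph might look like this:

```python
# Sketch: graph-level interaction parameters like those listed above,
# computed with NetworkX on a directed Twitter interaction graph.
# The paper's exact parameter definitions and tooling are assumptions.
import networkx as nx
from networkx.algorithms import community

def profile_features(g: nx.DiGraph) -> dict:
    und = g.to_undirected()
    parts = community.greedy_modularity_communities(und)
    centrality = nx.degree_centrality(g)
    return {
        "density": nx.density(g),
        "reciprocity": nx.reciprocity(g),
        "mean_degree_centrality": sum(centrality.values()) / len(centrality),
        "modularity": community.modularity(und, parts),
        # diameter is only defined when the graph is connected
        "diameter": nx.diameter(und) if nx.is_connected(und) else None,
    }

# These scalar features (visualized as images in the paper) would then
# feed the CNN classifier into active/passive/malicious classes.
```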


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Molham Al-Maleh ◽  
Said Desouki

Natural language processing has witnessed remarkable progress with the advent of deep learning techniques. Text summarization, along with other tasks like text translation and sentiment analysis, uses deep neural network models to enhance results. Recent methods of text summarization follow a sequence-to-sequence encoder–decoder framework, in which neural networks are trained jointly on both input and output. Deep neural networks take advantage of big datasets to improve their results. These networks are supported by the attention mechanism, which deals with long texts more efficiently by identifying focus points in the text, and by the copy mechanism, which allows the model to copy words from the source to the summary directly. In this research, we re-implement the basic summarization model that applies the sequence-to-sequence framework on the Arabic language, to which this model had not previously been applied for text summarization. Initially, we build an Arabic data set of summarized article headlines. This data set consists of approximately 300 thousand entries, each consisting of an article introduction and the headline corresponding to that introduction. We then apply baseline summarization models to this data set and compare the results using the ROUGE metric.
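
The ROUGE comparison can be reproduced with an off-the-shelf scorer; the sketch below uses the rouge-score package with invented example strings, not the authors' evaluation code or data:

```python
# Illustration only: scoring a generated headline against a reference
# with the rouge-score package; the strings are made-up examples.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"])
scores = scorer.score(
    "government announces new economic reform plan",  # reference headline
    "new economic reform plan announced",             # model output
)
print(scores["rougeL"].fmeasure)
```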


2021 ◽  
Vol 3 (4) ◽  
pp. 367-376
Author(s):  
Yasir Babiker Hamdan ◽  
A. Sathesh

Due to the complex and irregular shapes of handwritten text, it is challenging to spot and recognize handwritten words; in low-resource scripts, word retrieval is a particularly difficult and laborious task. Deep learning and neural network models create the need to increase the number of samples and to introduce variations into the extended training datasets, and existing preprocessing strategies cannot cover all possible variations and occurrences efficiently. This paper presents a scalable and elastic methodology for warping the extracted features through an adversarial feature deformation and regularization module. This module is introduced between the intermediate layers of the original deep learning framework and trained in an alternating manner. Compared with conventional models, this setup learns highly informative features efficiently. The proposed model, built on popular frameworks for word recognition and spotting and enhanced with the proposed module, is tested on extensive word datasets. Results recorded while varying the training data size are compared with conventional models, showing improvements in mAP scores and word error rate, particularly in the low-data regime.
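
A minimal sketch of the alternating scheme described above, assuming an additive deformation module placed after an intermediate feature extractor (module shapes, losses, and schedule are our assumptions, not the paper's exact design):

```python
# Sketch (PyTorch) of alternating adversarial training: a small
# deformation module warps intermediate features to *increase* the task
# loss, while the recognizer is updated to stay robust to the warped
# features. All architectural details here are assumptions.
import torch
import torch.nn as nn

class FeatureDeformer(nn.Module):
    """Predicts an additive deformation for intermediate feature maps."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return feats + self.net(feats)

def alternating_step(encoder, head, deformer, opt_task, opt_def, x, y):
    loss_fn = nn.CrossEntropyLoss()
    # 1) adversarial step: only the deformer is updated, to raise the loss
    opt_def.zero_grad()
    (-loss_fn(head(deformer(encoder(x))), y)).backward()
    opt_def.step()
    # 2) task step: only encoder and head are updated, to lower the loss
    opt_task.zero_grad()
    loss = loss_fn(head(deformer(encoder(x))), y)
    loss.backward()
    opt_task.step()
    return loss.item()
```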


Author(s):  
Tianle Ma ◽  
Aidong Zhang

While deep learning has achieved great success in computer vision and many other fields, it currently does not work very well on patient genomic data with the "big p, small N" problem (i.e., a relatively small number of samples with high-dimensional features). In order to make deep learning work with a small amount of training data, we have to design new models that facilitate few-shot learning. Here we present the Affinity Network Model (AffinityNet), a data-efficient deep learning model that can learn from a limited number of training examples and generalize well. The backbone of the AffinityNet model consists of stacked k-Nearest-Neighbor (kNN) attention pooling layers. The kNN attention pooling layer is a generalization of the Graph Attention Model (GAM) and can be applied not only to graphs but to any set of objects, regardless of whether a graph is given. As a new deep learning module, kNN attention pooling layers can be plugged into any neural network model just like convolutional layers. As a simple special case of the kNN attention pooling layer, the feature attention layer can directly select important features that are useful for classification tasks. Experiments on both synthetic data and cancer genomic data from TCGA projects show that our AffinityNet model has better generalization power than conventional neural network models with little training data.
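
A sketch of a kNN attention pooling layer in the spirit of this description (the similarity function, scaling, and layer sizes are our assumptions):

```python
# Sketch (PyTorch): each sample's new representation is an
# attention-weighted average over its k nearest neighbours in feature
# space, so the layer works on any set of objects without a given graph.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KNNAttentionPooling(nn.Module):
    def __init__(self, dim: int, k: int = 5):
        super().__init__()
        self.k = k
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, dim) -- a set of objects; no explicit graph is required
        dists = torch.cdist(x, x)                                    # (N, N)
        knn = dists.topk(self.k + 1, largest=False).indices[:, 1:]   # drop self
        neighbours = x[knn]                                          # (N, k, dim)
        scores = torch.einsum("nd,nkd->nk", self.query(x), self.key(neighbours))
        weights = F.softmax(scores / x.size(1) ** 0.5, dim=1)
        return torch.einsum("nk,nkd->nd", weights, neighbours)
```

Stacking such layers, as the abstract describes, yields the AffinityNet backbone; a feature attention special case would instead learn weights directly over input dimensions.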


2021 ◽  
Author(s):  
Alexander Zizka ◽  
Tobias Andermann ◽  
Daniele Silvestro

Aim: The global Red List (RL) from the International Union for the Conservation of Nature is the most comprehensive global quantification of extinction risk, and is widely used in applied conservation as well as in biogeographic and ecological research. Yet, due to the time-consuming assessment process, the RL is biased taxonomically and geographically, which limits its application at large scales, in particular for understudied areas such as the tropics, or understudied taxa, such as most plants and invertebrates. Here we present IUCNN, an R package implementing deep learning models to predict species RL status from publicly available geographic occurrence records (and other traits, if available). Innovation: We implement a user-friendly workflow to train and validate neural network models and subsequently use them to predict species RL status. IUCNN contains functions to address specific issues related to the RL framework, including a regression-based approach to account for the ordinal nature of RL categories and class imbalance in the training data, a Bayesian approach for improved uncertainty quantification, and a target accuracy threshold approach that limits predictions to only those species whose RL status can be predicted with high confidence. Most analyses can be run with a few lines of code, without prior knowledge of neural network models. We demonstrate the use of IUCNN on an empirical dataset of ~14,000 orchid species, for which IUCNN models predict extinction risk within minutes while outperforming comparable methods. Main conclusions: IUCNN harnesses innovative methodology to estimate the RL status of large numbers of species. By providing estimates of the number and identity of threatened species in custom geographic or taxonomic datasets, IUCNN enables large-scale analyses of the extinction risk of species so far not well represented on the official RL.
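
IUCNN itself is an R package; purely to illustrate the regression-based handling of ordinal RL categories mentioned above, a hypothetical Python sketch might encode the categories as ordered scores, fit a regressor, and round predictions back onto the scale:

```python
# Not IUCNN's API -- a hypothetical sketch of the regression-based idea:
# treat Red List categories as ordered scores, train a regressor on
# occurrence-derived features, and map predictions back to categories.
# Feature choice and model are illustrative assumptions.
from sklearn.neural_network import MLPRegressor

RL_ORDER = ["LC", "NT", "VU", "EN", "CR"]  # least to most threatened
to_score = {c: i for i, c in enumerate(RL_ORDER)}

def train(features, categories):
    y = [to_score[c] for c in categories]
    return MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000).fit(features, y)

def predict_categories(model, features):
    # clip and round the continuous prediction back onto the ordinal scale
    return [RL_ORDER[min(max(round(v), 0), len(RL_ORDER) - 1)]
            for v in model.predict(features)]
```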


2019 ◽  
Vol 21 (2) ◽  
pp. 88-112
Author(s):  
Kamlesh Dutta ◽  
Varun Gupta ◽  
Vachik S. Dave

Prediction of software development effort is a key task for the effective management of any software industry, and the accuracy and reliability of the estimation mechanisms are equally important. A series of experiments was conducted to progress gradually towards more accurate estimation of software development effort. While conducting these experiments, however, it was found that the available training set was too small to train a large and complex artificial neural network (ANN). To overcome the limited size of the available training data set, a novel multilayered architecture based on a neural network model is proposed. The accuracy of the proposed multilayered model is assessed using different criteria, which demonstrates the superiority of the proposed model.


Water ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 107
Author(s):  
Elahe Jamalinia ◽  
Faraz S. Tehrani ◽  
Susan C. Steele-Dunne ◽  
Philip J. Vardon

Climatic conditions and vegetation cover influence water flux in a dike, and potentially the dike stability. A comprehensive numerical simulation is computationally too expensive to be used for the near real-time analysis of a dike network. Therefore, this study investigates a random forest (RF) regressor to build a data-driven surrogate for a numerical model to forecast the temporal macro-stability of dikes. To that end, daily inputs and outputs of a ten-year coupled numerical simulation of an idealised dike (2009–2019) are used to create a synthetic data set, comprising features that can be observed from a dike surface, with the calculated factor of safety (FoS) as the target variable. The data set before 2018 is split into training and testing sets to build and train the RF. The predicted FoS is strongly correlated with the numerical FoS for data that belong to the test set (before 2018). However, the trained model shows lower performance for data in the evaluation set (after 2018) if further surface cracking occurs. This proof-of-concept shows that a data-driven surrogate can be used to determine dike stability for conditions similar to the training data, which could be used to identify vulnerable locations in a dike network for further examination.
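
A sketch of such a surrogate with scikit-learn, following the abstract's setup (daily surface-observable features, the numerically computed factor of safety as target, training on pre-2018 data); the file name, column names, and forest size are assumptions:

```python
# Sketch (scikit-learn): a random forest regressor as a surrogate for
# the numerical dike model, with a time-based train/evaluation split.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

df = pd.read_csv("dike_simulation_daily.csv", parse_dates=["date"])  # hypothetical file
features = df.drop(columns=["date", "FoS"])

train = df["date"] < "2018-01-01"      # build and train on pre-2018 data
rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(features[train], df.loc[train, "FoS"])

# evaluate on the held-out post-2018 period, as in the study
print(rf.score(features[~train], df.loc[~train, "FoS"]))  # R^2
```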

