Unsupervised Mutual Information Criterion for Elimination of Overtraining in Supervised Multilayer Networks

1995 ◽  
Vol 7 (1) ◽  
pp. 86-107 ◽  
Author(s):  
G. Deco ◽  
W. Finnoff ◽  
H. G. Zimmermann

Controlling network complexity in order to prevent overfitting is one of the major problems encountered when using neural network models to extract structure from small data sets. In this paper we present a network architecture designed for use with a cost function that includes a novel complexity penalty term. In this architecture the outputs of the hidden units are strictly positive and sum to one, and each output is defined as the probability that the actual input belongs to a certain class formed during learning. The penalty term expresses the mutual information between the inputs and the extracted classes. This measure effectively describes the network complexity with respect to the given data in an unsupervised fashion. The efficiency of this architecture and penalty term, when combined with backpropagation training, is demonstrated on a real-world economic time series forecasting problem. The model was also applied to the benchmark sunspot data and to a synthetic data set from the statistics community.
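The penalty described above can be illustrated with a minimal sketch. It assumes that the hidden outputs come from a softmax (one way to make them strictly positive and sum to one), that mutual information is estimated empirically as H(C) - H(C|X) with every training input weighted equally, and that the penalty is added to the prediction error with a weight `lam`. The function names and `lam` are hypothetical; the abstract does not give the authors' exact formulation.

```python
import math

def softmax(z):
    """Map raw hidden activations to strictly positive values that sum to one."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def mutual_information(class_probs):
    """Empirical I(X; C) = H(C) - H(C|X) over a batch of inputs.

    class_probs[i][c] is the hidden-unit output p(c | x_i).
    Assumption: each input x_i has equal weight 1/N.
    """
    n = len(class_probs)
    k = len(class_probs[0])
    marginal = [sum(row[c] for row in class_probs) / n for c in range(k)]
    conditional = sum(entropy(row) for row in class_probs) / n
    return entropy(marginal) - conditional

def penalized_cost(prediction_error, class_probs, lam=0.1):
    """Prediction error plus the weighted mutual information penalty (lam is hypothetical)."""
    return prediction_error + lam * mutual_information(class_probs)
```

Under these assumptions the penalty is zero when every input yields the same class distribution, and it reaches log K when the inputs are split evenly and with certainty among K classes, so a larger penalty corresponds to a finer partition of the input space.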

Author(s):  
Marie Lefranc ◽  
Zikri Bayraktar ◽  
Morten Kristensen ◽  
Hedi Driss ◽  
...  

Sedimentary geometry on borehole images usually summarizes the arrangement of bed boundaries, erosive surfaces, crossbedding, sedimentary dip, and/or deformed beds. The interpretation, very often manual, requires a good level of expertise, is time consuming, can suffer from user bias, and becomes very challenging when dealing with highly deviated wells. Bedform geometry interpretation from crossbed data is rarely completed from a borehole image. The purpose of this study is to develop an automated method to interpret sedimentary structures, including the bedform geometry resulting from the change in flow direction from borehole images. Automation is achieved in this unique interpretation methodology using deep learning (DL). The first task comprised the creation of a training data set of 2D borehole images. This library of images was then used to train deep neural network models. Testing different architectures of convolutional neural networks (CNN) showed the ResNet architecture to give the best performance for the classification of the different sedimentary structures. The validation accuracy was very high, in the range of 93 to 96%. To test the developed method, additional logs of synthetic data were created as sequences of different sedimentary structures (i.e., classes) associated with different well deviations, with the addition of gaps. The model was able to predict the proper class in these composite logs and highlight the transitions accurately.


2021 ◽  
Vol 12 (6) ◽  
pp. 1-21
Author(s):  
Jayant Gupta ◽  
Carl Molnar ◽  
Yiqun Xie ◽  
Joe Knight ◽  
Shashi Shekhar

Spatial variability is a prominent feature of various geographic phenomena such as climatic zones, USDA plant hardiness zones, and terrestrial habitat types (e.g., forest, grasslands, wetlands, and deserts). However, current deep learning methods follow a spatial one-size-fits-all (OSFA) approach to train single deep neural network models that do not account for spatial variability. Quantification of spatial variability can be challenging due to the influence of many geophysical factors. In preliminary work, we proposed a spatial-variability-aware neural network (SVANN-I, formerly called SVANN) approach where weights are a function of location but the neural network architecture is location independent. In this work, we explore a more flexible SVANN-E approach where the neural network architecture varies across geographic locations. In addition, we provide a taxonomy of SVANN types and a physics-inspired interpretation model. Experiments with aerial-imagery-based wetland mapping show that SVANN-I outperforms OSFA and that SVANN-E performs the best of all.


2019 ◽  
Vol 53 (1) ◽  
pp. 2-19 ◽  
Author(s):  
Erion Çano ◽  
Maurizio Morisio

Purpose: The strong results of convolutional neural networks in image-related tasks have attracted the attention of researchers in text mining, sentiment analysis and other areas of text analysis. It is, however, difficult to find enough data to feed such networks, to optimize their parameters, and to make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore the use of convolution and max-pooling neural layers on song lyrics, product and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared. Design/methodology/approach: The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices that lead to high-performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations. Findings: The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, the max-pooling region size should be adapted to the length of the text documents to produce the best feature maps. Originality/value: The top results the authors obtained used feature maps of lengths 6–18. Future neural network models for sentiment analysis could be improved by generating the sentiment polarity prediction of a document from an aggregation of predictions on smaller excerpts of the entire text.
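The two findings above (parallel convolutions with filter lengths up to 3, and a max-pooling region size that depends on document length) can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions: each token is represented by a single scalar rather than an embedding vector, the kernels are fixed rather than learned, and pooling regions do not overlap. All function names are hypothetical and not taken from the paper.

```python
def conv1d(seq, kernel):
    """Valid 1D cross-correlation of a scalar sequence with one kernel."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool(feature_map, region):
    """Non-overlapping max-pooling; the final region may be shorter."""
    return [max(feature_map[i:i + region])
            for i in range(0, len(feature_map), region)]

def parallel_conv_features(seq, kernels, region):
    """Apply every kernel to the same input, pool each result, concatenate."""
    features = []
    for kernel in kernels:
        features.extend(max_pool(conv1d(seq, kernel), region))
    return features

# Kernels of lengths 1, 2 and 3 applied in parallel to a six-token input.
tokens = [1, 2, 3, 4, 5, 6]
kernels = [[1], [1, 1], [1, 1, 1]]
features = parallel_conv_features(tokens, kernels, region=2)
```

In this sketch a larger `region` yields a shorter feature map per kernel, which is why the region size would need to grow with document length to keep the pooled feature maps at a comparable size.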


Author(s):  
Aditya Rajbongshi ◽  
Thaharim Khan ◽  
Md. Mahbubur Rahman ◽  
Anik Pramanik ◽  
Shah Md Tanvir Siddiquee ◽  
...  

Recognizing plant diseases plays a vital part in taking disease prevention measures that improve the quality and quantity of crop yield. Automating plant disease detection is highly advantageous because it reduces the monitoring work in large cultivated areas where mango is planted extensively. Because leaves are the food source for plants, early and accurate detection of leaf diseases is important. This work focuses on classifying and identifying diseases of mango leaves using convolutional neural networks (CNNs). Six CNN models (DenseNet201, InceptionResNetV2, InceptionV3, ResNet50, ResNet152V2, and Xception) are used with transfer learning techniques to obtain better accuracy on the target data set. Image acquisition, image segmentation, and feature extraction are the steps involved in disease detection. The leaf diseases considered as classes in this work are anthracnose, gall machi, powdery mildew, and red rust. The data set consists of 1500 images of diseased and healthy mango leaves, with healthy leaves included as an additional class. We also evaluated the overall performance metrics and found that DenseNet201 outperforms the other models, obtaining the highest accuracy of 98.00%.


Healthcare ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 181 ◽  
Author(s):  
Patricia Melin ◽  
Julio Cesar Monica ◽  
Daniela Sanchez ◽  
Oscar Castillo

In this paper, a multiple ensemble neural network model with fuzzy response aggregation for the COVID-19 time series is presented. Ensemble neural networks are composed of a set of modules, which are used to produce several predictions under different conditions. The modules are simple neural networks. Fuzzy logic is then used to aggregate the responses of several predictor modules, thereby improving the final prediction by combining the outputs of the modules in an intelligent way. Fuzzy logic handles the uncertainty in the process of making a final decision about the prediction. The complete model was tested for the case of predicting the COVID-19 time series in Mexico, at the level of the states and the whole country. The simulation results of the multiple ensemble neural network models with fuzzy response integration show very good predicted values in the validation data set. In fact, the prediction errors of the multiple ensemble neural networks are significantly lower than those of traditional monolithic neural networks, which shows the advantages of the proposed approach.
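The idea of aggregating module responses according to how trustworthy each module is can be sketched as follows. This is a deliberately simplified stand-in, not the authors' fuzzy system: it assumes each module comes with a recent error value, maps that error to a degree of membership in a "low error" fuzzy set using a linear ramp, and combines the predictions as a membership-weighted average. The function names and the `max_error` parameter are hypothetical.

```python
def low_error_membership(error, max_error):
    """Degree in [0, 1] to which an error counts as 'low' (linear ramp)."""
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    return max(0.0, 1.0 - error / max_error)

def aggregate(predictions, errors, max_error):
    """Membership-weighted average of the module predictions.

    Falls back to the plain mean when no module has a 'low' error at all.
    """
    weights = [low_error_membership(e, max_error) for e in errors]
    total = sum(weights)
    if total == 0.0:
        return sum(predictions) / len(predictions)
    return sum(w * p for w, p in zip(weights, predictions)) / total
```

For example, `aggregate([100.0, 200.0], [0.0, 10.0], max_error=10.0)` relies entirely on the first module, because the second module's membership in the "low error" set is zero. A full fuzzy inference system would use several membership functions and a rule base instead of this single ramp.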


Author(s):  
Ratish Puduppully ◽  
Li Dong ◽  
Mirella Lapata

Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models which are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order and then generate the document while taking the content plan into account. Automatic and human-based evaluation experiments show that our model outperforms strong baselines, improving the state of the art on the recently released RotoWIRE dataset.


Author(s):  
A. Saravanan ◽  
J. Jerald ◽  
A. Delphin Carolina Rani

The objective of the paper is to develop a new method to model the manufacturing cost–tolerance relation and to optimize the tolerance values along with the associated manufacturing cost. The relation between cost and tolerance is complex and nonlinear. The properties of a neural network make it possible to model this complex relation, and the genetic algorithm (GA) is integrated with the best neural network model to optimize the tolerance values. The proposed method used three types of neural network models (multilayer perceptron, backpropagation network, and radial basis function). These network models were developed separately for prismatic and rotational parts. For the construction of network models, part size and tolerance values were used as input neurons. The reference manufacturing cost was assigned as the output neuron. The qualitative production data set was gathered in a workshop and partitioned into three files for training, testing, and validation, respectively. The architecture of the network model was identified based on the best regression coefficient and the root-mean-square-error value. The best network model was integrated into the GA, and the role of genetic operators was also studied. Finally, two case studies from the literature were demonstrated in order to validate the proposed method. A new methodology based on the neural network model enables design and process planning engineers to propose an intelligent decision irrespective of their experience.


2021 ◽  
Author(s):  
Tomochika Fujisawa ◽  
Victor Noguerales ◽  
Emmanouil Meramveliotakis ◽  
Anna Papadopoulou ◽  
Alfried P Vogler

Complex bulk samples of invertebrates from biodiversity surveys present a great challenge for taxonomic identification, especially if obtained from unexplored ecosystems. High-throughput imaging combined with machine learning for rapid classification could overcome this bottleneck. Developing such procedures requires that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. Yet the feasibility of transfer learning for the classification of unknown samples remains to be tested. Here, we assess the efficiency of deep learning and domain transfer algorithms for family-level classification of below-ground bulk samples of Coleoptera from understudied forests of Cyprus. We trained neural network models with images from local surveys versus global databases of above-ground samples from tropical forests and evaluated how prediction accuracy was affected by: (a) the quality and resolution of images, (b) the size and complexity of the training set and (c) the transferability of identifications across very disparate source-target pairs that do not share any species or genera. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images and on dataset complexity. The accuracy of between-datasets predictions was reduced to a maximum of 82% and depended greatly on the standardisation of the imaging procedure. When the source and target images were of similar quality and resolution, albeit from different faunas, the reduction of accuracy was minimal. Application of algorithms for domain adaptation significantly improved the prediction performance of models trained by non-standardised, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota, when the imaging conditions and classification algorithms are carefully considered. Also, our results provide guidelines for data acquisition and algorithmic development for high-throughput image-based biodiversity surveys.


2021 ◽  
Vol 7 (8) ◽  
pp. 146
Author(s):  
Joshua Ganter ◽  
Simon Löffler ◽  
Ron Metzger ◽  
Katharina Ußling ◽  
Christoph Müller

Collecting real-world data for the training of neural networks is enormously time-consuming and expensive. As such, the concept of virtualizing the domain and creating synthetic data has been analyzed in many instances. This virtualization offers many possibilities of changing the domain, and with that, enabling the relatively fast creation of data. It also offers the chance to enhance necessary augmentations with additional semantic information when compared with conventional augmentation methods. This raises the question of whether such semantic changes, which can be seen as augmentations of the virtual domain, contribute to better results for neural networks, when trained with data augmented this way. In this paper, a virtual dataset is presented, including semantic augmentations and automatically generated annotations, as well as a comparison between semantic and conventional augmentation for image data. It is determined that the results differ only marginally for neural network models trained with the two augmentation approaches.

