A Novel Cooperative Divide-and-Conquer Neural Networks Algorithm

Dynamic modularity is one of the fundamental characteristics of the human brain. Cooperative divide and conquer strategy is a basic problem solving approach. This chapter proposes a new subnet training method for modular neural networks with the inspiration of the principle of “an expert with other capabilities.” The key point of this method is that a subnet learns the neighbor data sets while fulfilling its main task: learning the objective data set. Additionally, a relative distance measure is proposed to replace the absolute distance measure used in the classical method and its advantage is theoretically discussed. Both methodology and empirical study are presented. Two types of experiments respectively related with the approximation problem and the prediction problem in nonlinear dynamic systems are designed to verify the effectiveness of the proposed method. Compared with the classical learning method, the average testing error is dramatically decreased and more stable. The superiority of the relative distance measure is also corroborated. Finally, a mind-gut frame is proposed.

Methodological Research for Modular Neural Networks Based on “an Expert With Other Capabilities”

Journal of Global Information Management ◽

10.4018/jgim.2018040105 ◽

2018 ◽

Vol 26 (2) ◽

pp. 104-126

Author(s):

Pan Wang ◽

Jiasen Wang ◽

Jian Zhang

Keyword(s):

Neural Networks ◽

Distance Measure ◽

Approximation Problem ◽

Relative Distance ◽

Data Sets ◽

Main Task ◽

Learning Method ◽

Data Set ◽

Modular Neural Networks ◽

Testing Error

This article contains a new subnet training method for modular neural networks, proposed with the inspiration of the principle of “an expert with other capabilities”. The key point of this method is that a subnet learns the neighbor data sets while fulfilling its main task: learning the objective data set. Additionally, a relative distance measure is proposed to replace the absolute distance measure used in the classical subnet learning method and its advantage in the general case is theoretically discussed. Both methodology and empirical study of this new method are presented. Two types of experiments respectively related with the approximation problem and the prediction problem in nonlinear dynamic systems are designed to verify the effectiveness of the proposed method. Compared with the classical subnet learning method, the average testing error of the proposed method is dramatically decreased and more stable. The superiority of the relative distance measure is also corroborated.

Generation of geometric interpolations of building types with deep variational autoencoders

Design Science ◽

10.1017/dsj.2020.31 ◽

2020 ◽

Vol 6 ◽

Author(s):

Jaime de Miguel Rodríguez ◽

Maria Eugenia Villafañe ◽

Luka Piškorec ◽

Fernando Sancho Caparrini

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Large Data ◽

Learning Model ◽

Large Data Sets ◽

Data Sets ◽

Connectivity Map ◽

Data Set ◽

3D Objects ◽

Machine Learning Model

This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
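The idea of "interpolated locations" in a latent space can be illustrated with a minimal sketch. It assumes plain linear interpolation between two latent vectors; the abstract does not say which interpolation scheme the authors use, and the function name `interpolate_latents` and the example vectors are hypothetical. In a real pipeline, each interpolated vector would be passed to the trained VAE decoder to reconstruct a geometry.

```python
def interpolate_latents(z_a, z_b, steps):
    """Return `steps` latent vectors evenly spaced on the line from z_a to z_b.

    Assumption: linear interpolation, z(t) = (1 - t) * z_a + t * z_b.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for k in range(steps):
        t = k / (steps - 1)  # t runs from 0.0 to 1.0 inclusive
        path.append([(1 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Hypothetical 2-D latent codes standing in for two building types.
path = interpolate_latents([0.0, 2.0], [1.0, 0.0], 5)
```

The first and last entries of `path` are the two endpoints, and the entries in between are the candidate "hybrid" latent codes.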

Variational inference using approximate likelihood under the coalescent with recombination

Genome Research ◽

10.1101/gr.273631.120 ◽

2021 ◽

pp. gr.273631.120

Author(s):

Xinhao Liu ◽

Huw A Ogilvie ◽

Luay Nakhleh

Keyword(s):

Simulated Data ◽

Variational Inference ◽

Divide And Conquer ◽

Data Sets ◽

Transition Rates ◽

Data Set ◽

Population Sizes ◽

Novel Method ◽

Approximate Likelihood ◽

Promising Avenue

Coalescent methods are proven and powerful tools for population genetics, phylogenetics, epidemiology, and other fields. Coalescent hidden Markov model (coalHMM) methods are a promising avenue for the analysis of large genomic alignments, which are increasingly common, but these methods have lacked general usability and flexibility. We introduce a novel method for automatically learning a coalHMM and inferring the posterior distributions of evolutionary parameters using black-box variational inference, with the transition rates between local genealogies derived empirically by simulation. This derivation enables our method to work directly with three or four taxa and through a divide-and-conquer approach with more taxa. Using a simulated data set resembling a human-chimp-gorilla scenario, we show that our method has accuracy comparable to or better than that of previous coalHMM methods. Both species divergence times and population sizes were accurately inferred. The method also infers local genealogies and we report on their accuracy. Furthermore, we discuss a potential direction for scaling the method to larger data sets through a divide-and-conquer approach. This accuracy means our method is useful now, and by deriving transition rates by simulation it is flexible enough to enable future implementations of all kinds of population models.

Uncertainty-Aware Deep Classifiers Using Generative Models

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6015 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5620-5627 ◽

Cited By ~ 1

Author(s):

Murat Sensoy ◽

Lance Kaplan ◽

Federico Cerutti ◽

Maryam Saleki

Keyword(s):

Neural Networks ◽

Epistemic Uncertainty ◽

Feature Space ◽

Generative Models ◽

Detection Methods ◽

Generative Adversarial Networks ◽

Data Sets ◽

Bayesian Approaches ◽

Data Set ◽

Auxiliary Data

Deep neural networks are often ignorant about what they do not know and overconfident when they make uninformed predictions. Some recent approaches quantify classification uncertainty directly by training the model to output high uncertainty for the data samples close to class boundaries or from the outside of the training distribution. These approaches use an auxiliary data set during training to represent out-of-distribution samples. However, selection or creation of such an auxiliary data set is non-trivial, especially for high dimensional data such as images. In this work we develop a novel neural network model that is able to express both aleatoric and epistemic uncertainty to distinguish decision boundary and out-of-distribution regions of the feature space. To this end, variational autoencoders and generative adversarial networks are incorporated to automatically generate out-of-distribution exemplars for training. Through extensive analysis, we demonstrate that the proposed approach provides better estimates of uncertainty for in- and out-of-distribution samples, and adversarial examples on well-known data sets against state-of-the-art approaches including recent Bayesian approaches for neural networks and anomaly detection methods.

Simple Convolutional-Based Models: Are They Learning the Task or the Data?

Neural Computation ◽

10.1162/neco_a_01446 ◽

2021 ◽

pp. 1-17

Author(s):

Luis Sa-Couto ◽

Andreas Wichert

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Training Data ◽

Model Complexity ◽

Data Sets ◽

Simple Task ◽

Data Set ◽

Knowing That ◽

Handwritten Digit ◽

End To End

Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and removes the focus from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability, a hidden overfitting. More concretely, a supervised, backpropagation-based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both models trained and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants like the What-Where model go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. Knowing that, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets of it: MNIST and ETL-1. To try to make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with expectation.

EXTRACTING RULES FROM TRAINED RBF NEURAL NETWORKS

Environment Technology Resources Proceedings of the International Scientific and Practical Conference ◽

10.17770/etr2005vol1.2128 ◽

2005 ◽

Vol 1 ◽

pp. 33

Author(s):

Peter Grabusts

Keyword(s):

Neural Networks ◽

Rbf Neural Network ◽

Extraction Procedure ◽

Rule Extraction ◽

Data Sets ◽

Rbf Neural Networks ◽

Data Set ◽

Extraction Algorithm ◽

Rule Set ◽

Iris Data

This paper describes a method of rule extraction from trained artificial neural networks. The statement of the problem is given. The aim of the rule extraction procedure and the neural networks suitable for rule extraction are outlined. The RULEX rule extraction algorithm, which is based on the radial basis function (RBF) neural network, is discussed. The extracted rules can help discover and analyze the rule set hidden in data sets. The paper contains an implementation example, which is demonstrated on the standalone IRIS data set.

BraggNet: integrating Bragg peaks using neural networks

Journal of Applied Crystallography ◽

10.1107/s1600576719008665 ◽

2019 ◽

Vol 52 (4) ◽

pp. 854-863 ◽

Cited By ~ 3

Author(s):

Brendan Sullivan ◽

Rick Archibald ◽

Jahaun Azadmanesh ◽

Venu Gopal Vandavasi ◽

Patricia S. Langan ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Negative Control ◽

Protein Crystal ◽

Data Sets ◽

Data Set ◽

X Ray ◽

X Ray Crystallography ◽

Integrated Intensities ◽

Neutron Crystallography

Neutron crystallography offers enormous potential to complement structures from X-ray crystallography by clarifying the positions of low-Z elements, namely hydrogen. Macromolecular neutron crystallography, however, remains limited, in part owing to the challenge of integrating peak shapes from pulsed-source experiments. To advance existing software, this article demonstrates the use of machine learning to refine peak locations, predict peak shapes and yield more accurate integrated intensities when applied to whole data sets from a protein crystal. The artificial neural network, based on the U-Net architecture commonly used for image segmentation, is trained using about 100 000 simulated training peaks derived from strong peaks. After 100 training epochs (a round of training over the whole data set broken into smaller batches), training converges and achieves a Dice coefficient of around 65%, in contrast to just 15% for negative control data sets. Integrating whole peak sets using the neural network yields improved intensity statistics compared with other integration methods, including k-nearest neighbours. These results demonstrate, for the first time, that neural networks can learn peak shapes and be used to integrate Bragg peaks. It is expected that integration using neural networks can be further developed to increase the quality of neutron, electron and X-ray crystallography data.
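The Dice coefficient quoted above is a standard overlap score for segmentation masks, defined as 2|P ∩ T| / (|P| + |T|) for a predicted mask P and a target mask T. The following is a minimal sketch on flat binary masks; the function name `dice_coefficient` and the toy masks are hypothetical, and this is not the BraggNet implementation.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count positions that are set in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Convention chosen here (an assumption): two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 true voxels, 2 shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3) = 0.666...
```

A score of 1.0 means the masks coincide exactly, and 0.0 means they do not overlap at all.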

Evaluation of Deep Learning-Based Neural Network Methods for Cloud Detection and Segmentation

Energies ◽

10.3390/en14196156 ◽

2021 ◽

Vol 14 (19) ◽

pp. 6156

Author(s):

Stefan Hensel ◽

Marin B. Marinov ◽

Michael Koch ◽

Dimitar Arnaudov

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Data Sets ◽

Cloud Detection ◽

Camera System ◽

Data Set ◽

Network Methods ◽

Segmentation Approach ◽

Coverage Prediction ◽

Short Time

This paper presents a systematic approach for accurate short-time cloud coverage prediction based on a machine learning (ML) approach. Based on a newly built omnidirectional ground-based sky camera system, local training and evaluation data sets were created. These were used to train several state-of-the-art deep neural networks for object detection and segmentation. For this purpose, the camera generated a full hemispherical image every 30 min over two months in daylight conditions with a fish-eye lens. From this data set, a subset of images was selected for training and evaluation according to various criteria. Deep neural networks, based on the two-stage R-CNN architecture, were trained and compared with a U-net segmentation approach implemented by CloudSegNet. All chosen deep networks were then evaluated and compared according to the local situation.

Modified Deep Neural Networks for Dog Breeds Identification

10.20944/preprints201812.0232.v1 ◽

2018 ◽

Cited By ~ 1

Author(s):

Aydin Ayanzadeh ◽

Sahand Vahidnia

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

The State ◽

Fine Tuning ◽

Test Accuracy ◽

Data Sets ◽

Data Set

In this paper, we leverage state-of-the-art models trained on ImageNet data sets. We use the pre-trained model and learned weights to extract features from the dog breeds identification data set. Afterwards, we apply fine-tuning and data augmentation to increase test accuracy in the classification of dog breeds data sets. The performance of the proposed approaches is compared with the state-of-the-art models of ImageNet data sets such as ResNet-50, DenseNet-121, DenseNet-169 and GoogleNet. We achieved 89.66%, 85.37%, 84.01% and 82.08% test accuracy respectively, which shows the superior performance of the proposed method over previous works on the Stanford dog breeds data sets.

Supposed Maximum Mutual Information for Improving Generalization and Interpretation of Multi-Layered Neural Networks

Journal of Artificial Intelligence and Soft Computing Research ◽

10.2478/jaiscr-2018-0029 ◽

2019 ◽

Vol 9 (2) ◽

pp. 123-147 ◽

Cited By ~ 5

Author(s):

Ryotaro Kamimura

Keyword(s):

Neural Networks ◽

Mutual Information ◽

Data Sets ◽

Data Set ◽

Information Theoretic ◽

Information Maximization ◽

Maximum Mutual Information ◽

Information Theoretic Method ◽

Mutual Information Maximization ◽

Inputs And Outputs

The present paper aims to propose a new type of information-theoretic method to maximize mutual information between inputs and outputs. The importance of mutual information in neural networks is well known, but the actual implementation of mutual information maximization has been quite difficult to undertake. In addition, mutual information has not extensively been used in neural networks, meaning that its applicability is very limited. To overcome the shortcoming of mutual information maximization, we present it here in a very simplified manner by supposing that mutual information is already maximized before learning, or at least at the beginning of learning. The method was applied to three data sets (crab data set, wholesale data set, and human resources data set) and examined in terms of generalization performance and connection weights. The results showed that by disentangling connection weights, maximizing mutual information made it possible to explicitly interpret the relations between inputs and outputs.
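For reference, the mutual information between two discrete variables is I(X;Y) = Σ p(x,y) log₂[p(x,y) / (p(x)p(y))]. The sketch below computes it from a joint probability table. It illustrates only the textbook quantity, not the "supposed maximum" procedure proposed in the paper; the function name `mutual_information` and the example tables are hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information in bits from a joint probability table.

    joint[i][j] is p(X = i, Y = j); entries are assumed to sum to 1.
    """
    p_x = [sum(row) for row in joint]        # marginal over rows
    p_y = [sum(col) for col in zip(*joint)]  # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_x[i] * p_y[j]))
    return mi


# Output fully determined by input: one bit of shared information.
dependent = mutual_information([[0.5, 0.0], [0.0, 0.5]])
# Output independent of input: no shared information.
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

The two toy tables mark the extremes for a pair of binary variables: 1 bit when the output is a deterministic copy of the input, and 0 bits when the two are independent.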
