Modified Neural Architecture Search (NAS) Using the Chromosome Non-Disjunction

2021 ◽  
Vol 11 (18) ◽  
pp. 8628
Author(s):  
Kang-Moon Park ◽  
Donghoon Shin ◽  
Sung-Do Chi

This paper proposes a deep neural network structuring methodology through a genetic algorithm (GA) using chromosome non-disjunction. The proposed model includes methods for generating and tuning the neural network architecture without the aid of human experts. Since the original neural architecture search (henceforth, NAS) was introduced, NAS techniques such as NASBot, NASGBO and CoDeepNEAT have been widely adopted in order to improve cost- and/or time-effectiveness for human experts. In these models, evolutionary algorithms (EAs) are employed to effectively enhance the accuracy of the neural network architecture. In particular, CoDeepNEAT uses a constructive GA starting from a minimal architecture, which only works quickly if the solution architecture is small. The proposed methodology, by contrast, utilizes chromosome non-disjunction as a new genetic operation. Our approach differs from previous methodologies in that it includes a destructive approach as well as a constructive one, similar to pruning methodologies, which enables tuning of an existing neural network architecture. A case study applied to the sentence word-ordering problem and AlexNet on CIFAR-10 illustrates the applicability of the proposed methodology. The simulation studies show that the accuracy of the model improved by 0.7% over the conventional model, without the aid of a human expert.
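A minimal sketch of how a non-disjunction-style genetic operation might act on a simple layer-list genome (an illustration under assumed encoding, not the authors' implementation): one offspring receives a duplicated copy of a randomly chosen layer gene (constructive), while its sibling loses that gene (destructive), mirroring the n+1 / n-1 chromosome counts produced by biological non-disjunction.

```python
# Toy illustration of a chromosome non-disjunction operator on a genome that
# encodes hidden-layer widths.  Hypothetical encoding, not the paper's code.
import random

def non_disjunction(genome):
    """genome: list of hidden-layer widths, e.g. [128, 64, 32]."""
    if len(genome) < 2:
        return list(genome), list(genome)
    i = random.randrange(len(genome))
    gained = genome[:i] + [genome[i]] + genome[i:]   # offspring with an extra copy of gene i
    lost = genome[:i] + genome[i + 1:]               # offspring missing gene i
    return gained, lost

random.seed(0)
parent = [128, 64, 32]
child_plus, child_minus = non_disjunction(parent)
print(child_plus, child_minus)   # e.g. [128, 64, 64, 32] and [128, 32]
```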

2021 ◽  
Author(s):  
Nathan Buskulic ◽  
Edward Bergman ◽  
Joeran Beel

Neural Architecture Search research has been limited to fixed datasets and as such does not provide the flexibility needed to deal with real-world, constantly evolving data. This is why we propose the basis of Online Neural Architecture Search (ONAS) to deal with complex, evolving data distributions. We formalise ONAS as a minimisation problem in which both the weights and the architecture of the neural network need to be optimised for the data up until a time $t_i$. To solve this problem, we adapt a DARTS optimisation process, combined with an early-stopping scheme, using the supernet optimised on previous data as a warm-start initial state. This allows the architecture of the neural network to evolve as the data distribution evolves while limiting the computational burden. This work aims to build the initial mathematical formalism of the problem and to develop a framework in which NAS methods can be used to solve it. Finally, several possible next steps are presented to show the potential of Online Neural Architecture Search as a field.
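A minimal sketch of the warm-start idea, assuming PyTorch, a synthetic drifting data stream, and a toy two-operation DARTS-style mixed cell. For brevity the architecture parameters and weights are updated jointly rather than with the bi-level split used by DARTS; the point is only that the supernet optimised on data up to $t_{i-1}$ is reused as the initial state for data up to $t_i$, with early stopping.

```python
import torch
import torch.nn as nn

class MixedCell(nn.Module):
    """Toy DARTS-style cell: softmax-weighted mixture of two candidate ops."""
    def __init__(self, dim=8):
        super().__init__()
        self.ops = nn.ModuleList([nn.Linear(dim, dim), nn.Identity()])
        self.alpha = nn.Parameter(torch.zeros(2))     # architecture parameters

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return w[0] * self.ops[0](x) + w[1] * self.ops[1](x)

def stream_batch(t, n=64, dim=8):
    """Synthetic, slowly drifting data distribution."""
    x = torch.randn(n, dim) + 0.1 * t
    y = (x.sum(dim=1, keepdim=True) > 0.1 * t * dim).float()
    return x, y

model = nn.Sequential(MixedCell(), nn.Linear(8, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for t in range(5):                                    # each t_i brings new data
    best, patience = float("inf"), 0
    for step in range(200):                           # warm-started from step t-1
        x, y = stream_batch(t)
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        xv, yv = stream_batch(t)                      # held-out batch for early stopping
        val = loss_fn(model(xv), yv).item()
        if val < best - 1e-4:
            best, patience = val, 0
        else:
            patience += 1
            if patience >= 20:                        # simple early-stopping scheme
                break
    print(f"t={t}  val={best:.3f}  alpha={model[0].alpha.softmax(0).tolist()}")
```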


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV

The problem of applying neural networks to calculate the ratings that banks use when deciding whether to grant loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, its general form must be known in advance; the task then reduces to calculating the parameters that enter the expression for the rating function. In contrast, when neural networks are used, there is no need to specify the general form of the rating function. Instead, a particular neural network architecture is chosen and its parameters are calculated from the statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered in which the borrower's rating is given by a known, non-analytical rating function. A neural network with two hidden layers, containing three and two neurons respectively and using a sigmoid activation function, is used for modeling. It is shown that the neural network restores the borrower's rating function with acceptable accuracy.
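A minimal sketch of the model system described above, assuming PyTorch and a hypothetical "true" rating function used only to generate synthetic training data: two hidden layers with three and two neurons, sigmoid activations, fitted by regression.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical rating function of two borrower features (stands in for the
# non-analytical rating function of the model system); used only to generate data.
def true_rating(x):
    return torch.sigmoid(2.0 * x[:, :1] - x[:, 1:2]) + 0.1 * torch.sin(5 * x[:, :1])

x = torch.rand(500, 2)
y = true_rating(x)

model = nn.Sequential(
    nn.Linear(2, 3), nn.Sigmoid(),   # first hidden layer: 3 neurons
    nn.Linear(3, 2), nn.Sigmoid(),   # second hidden layer: 2 neurons
    nn.Linear(2, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

for epoch in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(f"final MSE of the restored rating function: {loss.item():.5f}")
```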


2021 ◽  
Vol 12 (6) ◽  
pp. 1-21
Author(s):  
Jayant Gupta ◽  
Carl Molnar ◽  
Yiqun Xie ◽  
Joe Knight ◽  
Shashi Shekhar

Spatial variability is a prominent feature of various geographic phenomena such as climatic zones, USDA plant hardiness zones, and terrestrial habitat types (e.g., forest, grasslands, wetlands, and deserts). However, current deep learning methods follow a spatial-one-size-fits-all (OSFA) approach, training single deep neural network models that do not account for spatial variability. Quantification of spatial variability can be challenging due to the influence of many geophysical factors. In preliminary work, we proposed a spatial variability aware neural network (SVANN-I, formerly called SVANN) approach in which the weights are a function of location but the neural network architecture is location independent. In this work, we explore a more flexible SVANN-E approach where the neural network architecture varies across geographic locations. In addition, we provide a taxonomy of SVANN types and a physics-inspired interpretation model. Experiments with aerial-imagery-based wetland mapping show that SVANN-I outperforms OSFA and that SVANN-E performs best of all.
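A minimal sketch of the distinction drawn above, assuming PyTorch and two hypothetical geographic zones: OSFA trains one network for all locations; SVANN-I keeps one architecture but learns separate weights per zone; SVANN-E additionally lets the architecture itself differ across zones.

```python
import torch.nn as nn

def make_net(hidden):
    return nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 2))

# OSFA: a single model shared by every location.
osfa_model = make_net(hidden=16)

# SVANN-I: identical architecture, location-dependent weights.
svann_i = {"zone_a": make_net(16), "zone_b": make_net(16)}

# SVANN-E: both weights and architecture depend on the zone.
svann_e = {
    "zone_a": make_net(8),
    "zone_b": nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                            nn.Linear(32, 32), nn.ReLU(),
                            nn.Linear(32, 2)),
}

def predict(x, zone, models):
    """Route each sample to the model of the zone it falls in."""
    return models[zone](x) if isinstance(models, dict) else models(x)
```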


Author(s):  
Raghuram Mandyam Annasamy ◽  
Katia Sycara

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical and empirical studies of what these networks actually learn lag far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, the results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing with out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
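A minimal sketch of the architectural idea, assuming PyTorch and toy dimensions (not the authors' implementation): the state embedding attends over a learned key-value memory to produce Q-values, and a decoder reconstructs the input from the embedding so that the embedding remains inspectable.

```python
import torch
import torch.nn as nn

class KeyValueQNet(nn.Module):
    def __init__(self, obs_dim=16, emb_dim=32, n_slots=8, n_actions=4):
        super().__init__()
        self.encode = nn.Linear(obs_dim, emb_dim)
        self.keys = nn.Parameter(torch.randn(n_slots, emb_dim))
        self.values = nn.Parameter(torch.randn(n_slots, n_actions))
        self.decode = nn.Linear(emb_dim, obs_dim)        # reconstructible embedding

    def forward(self, obs):
        e = torch.tanh(self.encode(obs))                  # state embedding
        attn = torch.softmax(e @ self.keys.T, dim=-1)     # attention over memory keys
        q = attn @ self.values                            # Q-values read from memory values
        recon = self.decode(e)                            # reconstruction of the observation
        return q, attn, recon

net = KeyValueQNet()
obs = torch.randn(2, 16)
q, attn, recon = net(obs)
print(q.shape, attn.shape, recon.shape)   # (2, 4) (2, 8) (2, 16)
```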


2019 ◽  
Vol 12 (S10) ◽  
Author(s):  
Jie Hao ◽  
Youngsoon Kim ◽  
Tejaswini Mallavarapu ◽  
Jung Hun Oh ◽  
Mingon Kang

Background: Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear and high-dimension, low-sample-size (HDLSS) data cause computational challenges for conventional survival analysis.
Results: We propose a novel, biologically interpretable, pathway-based sparse deep neural network, named Cox-PASNet, which integrates high-dimensional gene expression data and clinical data in a simple neural network architecture for survival analysis. Cox-PASNet is biologically interpretable in that nodes in the neural network correspond to biological genes and pathways, while capturing the nonlinear and hierarchical effects of biological pathways associated with cancer patient survival. We also propose a heuristic optimization solution to train Cox-PASNet with HDLSS data. Cox-PASNet was intensively evaluated by comparing its predictive performance with that of current state-of-the-art methods on glioblastoma multiforme (GBM) and ovarian serous cystadenocarcinoma (OV) cancer. In the experiments, Cox-PASNet outperformed the benchmark methods. Moreover, the neural network architecture of Cox-PASNet was biologically interpreted, and several significant prognostic factors of genes and biological pathways were identified.
Conclusions: Cox-PASNet models biological mechanisms in the neural network by incorporating biological pathway databases and sparse coding. The neural network of Cox-PASNet can identify nonlinear and hierarchical associations of genomic and clinical data with cancer patient survival. The open-source code of Cox-PASNet, implemented in PyTorch for training, evaluation, and model interpretation, is available at: https://github.com/DataX-JieHao/Cox-PASNet.
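A minimal sketch of the two ingredients named in the abstract, assuming PyTorch, a toy gene-to-pathway membership mask, and synthetic survival data: a sparse layer whose connections follow pathway membership, and a Cox partial log-likelihood loss. The authors' actual implementation is the repository linked above.

```python
import torch
import torch.nn as nn

n_genes, n_pathways, n_clinical = 100, 10, 3
mask = (torch.rand(n_pathways, n_genes) < 0.1).float()   # toy pathway membership matrix

class MaskedLinear(nn.Linear):
    """Linear layer whose weights are zeroed outside the pathway mask."""
    def forward(self, x):
        return nn.functional.linear(x, self.weight * mask, self.bias)

class ToyCoxPASNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.gene_to_pathway = MaskedLinear(n_genes, n_pathways)
        self.hidden = nn.Linear(n_pathways + n_clinical, 8)
        self.out = nn.Linear(8, 1)                        # prognostic index

    def forward(self, genes, clinical):
        p = torch.tanh(self.gene_to_pathway(genes))
        h = torch.tanh(self.hidden(torch.cat([p, clinical], dim=1)))
        return self.out(h).squeeze(1)

def cox_partial_nll(risk, time, event):
    """Negative Cox partial log-likelihood (no tie handling, for illustration)."""
    order = torch.argsort(time, descending=True)          # risk set = subjects with longer time
    risk, event = risk[order], event[order]
    log_cumsum = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_cumsum) * event).sum() / event.sum().clamp(min=1)

model = ToyCoxPASNet()
genes, clinical = torch.randn(32, n_genes), torch.randn(32, n_clinical)
time, event = torch.rand(32), (torch.rand(32) < 0.7).float()
loss = cox_partial_nll(model(genes, clinical), time, event)
loss.backward()
print(f"partial NLL: {loss.item():.3f}")
```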


Author(s):  
Manish Kumar ◽  
Devendra P. Garg

Design of an efficient fuzzy logic controller involves the optimization of fuzzy set parameters and a proper choice of rule base. Several techniques reported in the recent literature use neural network architectures and genetic algorithms to learn and optimize a fuzzy logic controller. This paper presents methodologies that use the learning capabilities of neural networks to learn and optimize fuzzy logic controller parameters. Concepts of model predictive control (MPC) have been used to obtain an optimal signal for training the neural network via backpropagation. The strategies developed have been applied to control an inverted pendulum, and the results have been compared for two different fuzzy logic controllers developed with the help of neural networks. The first neural network emulates a PD controller, while the second controller is developed based on MPC. The proposed approach can be applied to learn fuzzy logic controller parameters online via dynamic backpropagation. The results show that the neuro-fuzzy approaches were able to learn the rule base and identify the membership function parameters accurately.
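A minimal sketch of learning fuzzy controller parameters by backpropagation, assuming PyTorch, Gaussian membership functions, and a toy zero-order Sugeno rule base; the training signal here is the output of a PD controller, loosely corresponding to the first of the two controllers above. This is an illustration of the general technique, not the paper's controller.

```python
import torch
import torch.nn as nn

class TinyNeuroFuzzy(nn.Module):
    """Two inputs (error, error rate), 3 Gaussian sets each, 3x3 Sugeno rules."""
    def __init__(self):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-1, 1, 3).repeat(2, 1))  # (2, 3)
        self.widths = nn.Parameter(torch.full((2, 3), 0.5))
        self.rule_out = nn.Parameter(torch.zeros(3, 3))        # consequent of each rule

    def forward(self, x):                                       # x: (N, 2)
        mu = torch.exp(-((x.unsqueeze(2) - self.centers) ** 2)
                       / (2 * self.widths ** 2))                # memberships (N, 2, 3)
        fire = mu[:, 0, :, None] * mu[:, 1, None, :]            # rule firing strengths (N, 3, 3)
        w = fire / fire.sum(dim=(1, 2), keepdim=True).clamp(min=1e-9)
        return (w * self.rule_out).sum(dim=(1, 2))              # defuzzified output

def pd_controller(x, kp=2.0, kd=0.5):                           # training signal
    return kp * x[:, 0] + kd * x[:, 1]

model = TinyNeuroFuzzy()
opt = torch.optim.Adam(model.parameters(), lr=0.05)
for step in range(1000):
    x = torch.rand(128, 2) * 2 - 1                              # sampled (error, error rate)
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), pd_controller(x))
    loss.backward()
    opt.step()
print(f"MSE against the PD target: {loss.item():.4f}")
```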


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Francisco J. Bravo Sanchez ◽  
Md Rahat Hossain ◽  
Nathan B. English ◽  
Steven T. Moore

The use of autonomous recordings of animal sounds to detect species is a popular conservation tool, constantly improving in fidelity as audio hardware and software evolve. Current classification algorithms utilise sound features extracted from the recording rather than the sound itself, with varying degrees of success. Neural networks that learn directly from the raw sound waveform have been implemented in human speech recognition, but the requirement for detailed labelled data has limited their use in bioacoustics. Here we test SincNet, an efficient neural network architecture that learns from the raw waveform using sinc-based filters. Results using an off-the-shelf implementation of SincNet on a publicly available bird sound dataset (NIPS4Bplus) show that the neural network rapidly converged, reaching accuracies of over 65% with limited data. After hyperparameter tuning its performance is comparable with traditional methods, while being more efficient. Learning directly from the raw waveform allows the algorithm to select automatically those elements of the sound that are best suited for the task, bypassing the onerous task of selecting feature extraction techniques and reducing possible biases. We use publicly released code and datasets to encourage others to replicate our results and to apply SincNet to their own datasets, and we review possible enhancements in the hope that algorithms that learn from the raw waveform will become useful bioacoustic tools.
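A minimal sketch of the core SincNet idea, assuming PyTorch with torch.sinc available: each first-layer filter is a band-pass kernel generated from just two learnable cut-off parameters and convolved with the raw waveform. This is a schematic version; the reference implementation is the publicly released SincNet code.

```python
import torch
import torch.nn as nn

class SincConv(nn.Module):
    def __init__(self, n_filters=8, kernel_size=101, sample_rate=16000):
        super().__init__()
        self.k = kernel_size
        # learnable low cut-off and bandwidth, in Hz
        self.low_hz = nn.Parameter(torch.linspace(30, 4000, n_filters))
        self.band_hz = nn.Parameter(torch.full((n_filters,), 400.0))
        t = (torch.arange(kernel_size) - kernel_size // 2) / sample_rate
        self.register_buffer("t", t)
        self.register_buffer("window", torch.hamming_window(kernel_size))

    def forward(self, x):                        # x: (batch, 1, samples)
        low = torch.abs(self.low_hz)
        high = low + torch.abs(self.band_hz)
        # ideal band-pass impulse response = difference of two low-pass sincs
        lp_high = 2 * high[:, None] * torch.sinc(2 * high[:, None] * self.t)
        lp_low = 2 * low[:, None] * torch.sinc(2 * low[:, None] * self.t)
        kernels = ((lp_high - lp_low) * self.window).unsqueeze(1)   # (F, 1, K)
        return nn.functional.conv1d(x, kernels, padding=self.k // 2)

layer = SincConv()
waveform = torch.randn(4, 1, 16000)              # one second of raw audio per example
print(layer(waveform).shape)                      # torch.Size([4, 8, 16000])
```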


2016 ◽  
Vol 12 (2) ◽  
pp. 61-64 ◽  
Author(s):  
Vitaly M Tatyankin

An approach to constructing an efficient pattern recognition algorithm is proposed. Efficiency is understood as zero error in identifying the images of the test sample. The open MNIST database of handwritten digit images is used as the test sample.


This paper describes the use of a novel gradient-based recurrent neural network to perform Capon spectral estimation. Even in the fastest algorithm, proposed by Marple et al., the computational burden of calculating the autoregressive (AR) parameters remains significant. In this paper we propose to use a gradient-based neural network to compute the AR parameters by solving the Yule-Walker equations. Furthermore, to reduce the complexity of the neural network architecture, the product of the weight matrix with the input vector is performed efficiently using the fast Fourier transform. Simulation results show that the proposed neural network and its simplified architecture lead to the same results as the original method, which proves the correctness of the proposed scheme.
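A minimal sketch of the two ideas above, assuming NumPy and a toy AR(2) signal (an illustration of the general technique, not the paper's network): the AR parameters are obtained by iterative gradient steps on the Yule-Walker system, and each product of the Toeplitz autocorrelation matrix with a vector is computed through the FFT via a circulant embedding rather than an explicit matrix multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4096, 2
x = np.zeros(n)
for i in range(2, n):                           # AR(2): x_t = 1.0 x_{t-1} - 0.5 x_{t-2} + e_t
    x[i] = 1.0 * x[i - 1] - 0.5 * x[i - 2] + rng.standard_normal()

r = np.array([x[: n - k] @ x[k:] for k in range(p + 1)]) / n   # autocorrelation lags

def toeplitz_matvec(c, v):
    """Symmetric Toeplitz matrix (first column c) times v, via circulant FFT embedding."""
    m = len(c)
    col = np.concatenate([c, [0.0], c[1:][::-1]])              # first column of the circulant
    vp = np.concatenate([v, np.zeros(m)])
    return np.fft.ifft(np.fft.fft(col) * np.fft.fft(vp)).real[:m]

# Solve the Yule-Walker equations R a = -r[1:] by gradient descent on the quadratic
# 0.5 a^T R a + r^T a, whose gradient is R a + r (R is symmetric positive definite).
a = np.zeros(p)
lr = 0.5 / r[0]
for _ in range(500):
    grad = toeplitz_matvec(r[:p], a) + r[1:]
    a -= lr * grad

# In the convention x_t + a1 x_{t-1} + a2 x_{t-2} = e_t the estimate is close to [-1.0, 0.5].
print("estimated AR parameters:", a)
```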


2021 ◽  
Vol 11 (1) ◽  
pp. 28
Author(s):  
Ana Bárbara Cardoso ◽  
Bruno Martins ◽  
Jacinto Estima

This article describes a novel approach for toponym resolution with deep neural networks. The proposed approach does not involve matching references in the text against entries in a gazetteer, instead directly predicting geo-spatial coordinates. Multiple inputs are considered in the neural network architecture (e.g., the surrounding words are considered in combination with the toponym to be disambiguated), using pre-trained contextual word embeddings (i.e., ELMo or BERT) as well as bi-directional Long Short-Term Memory units, both of which are regularly used for modeling textual data. The intermediate representations are then used to predict a probability distribution over possible geo-spatial regions, and finally to predict the coordinates for the input toponym. The proposed model was tested on three datasets used in previous toponym resolution studies, specifically the (i) War of the Rebellion, (ii) Local–Global Lexicon, and (iii) SpatialML corpora. Moreover, we evaluated the effect of using (i) geophysical terrain properties as external information, including information on elevation or terrain development, among others, and (ii) additional data collected from Wikipedia articles, to further help with the training of the model. The results show improvements of the proposed method over previous approaches, specifically when BERT embeddings and additional data are involved.
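A minimal sketch of the pipeline described above, assuming PyTorch and that pre-computed contextual embeddings (e.g., from BERT or ELMo) are supplied as the input tensor: a bi-directional LSTM over the context window, a probability distribution over coarse geo-spatial regions, and a final regression to latitude/longitude. Dimensions and pooling are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToponymResolver(nn.Module):
    def __init__(self, emb_dim=768, hidden=128, n_regions=64):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.region_head = nn.Linear(2 * hidden, n_regions)       # coarse region distribution
        self.coord_head = nn.Linear(2 * hidden + n_regions, 2)    # (latitude, longitude)

    def forward(self, ctx_embeddings):               # (batch, seq_len, emb_dim)
        h, _ = self.bilstm(ctx_embeddings)
        pooled = h.mean(dim=1)                       # summarise the context window
        regions = torch.softmax(self.region_head(pooled), dim=-1)
        coords = self.coord_head(torch.cat([pooled, regions], dim=-1))
        return regions, coords

model = ToponymResolver()
batch = torch.randn(4, 20, 768)                      # 4 toponym mentions, 20-token windows
regions, coords = model(batch)
print(regions.shape, coords.shape)                   # (4, 64) (4, 2)
```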

