Modified Neural Architecture Search (NAS) Using the Chromosome Non-Disjunction

2021 ◽  
Vol 11 (18) ◽  
pp. 8628
Author(s):  
Kang-Moon Park ◽  
Donghoon Shin ◽  
Sung-Do Chi

This paper proposes a deep neural network structuring methodology through a genetic algorithm (GA) using chromosome non-disjunction. The proposed model includes methods for generating and tuning the neural network architecture without the aid of human experts. Since the original neural architecture search (henceforth, NAS) was introduced, NAS techniques such as NASBot, NASGBO and CoDeepNEAT have been widely adopted in order to improve cost- and/or time-effectiveness for human experts. In these models, evolutionary algorithms (EAs) are employed to effectively enhance the accuracy of the neural network architecture. In particular, CoDeepNEAT uses a constructive GA starting from a minimal architecture, which only works quickly if the solution architecture is small. The proposed methodology, by contrast, utilizes chromosome non-disjunction as a new genetic operation. Our approach differs from previous methodologies in that it includes a destructive approach as well as a constructive one, similar to pruning methodologies, which enables tuning of an existing neural network architecture. A case study applied to the sentence word-ordering problem and AlexNet on CIFAR-10 illustrates the applicability of the proposed methodology. The simulation studies show that the accuracy of the model improved by 0.7% over the conventional model, without the aid of a human expert.
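A minimal sketch of how a non-disjunction-style genetic operation might act on a simple layer-list genome (an illustration under assumed encoding, not the authors' implementation): one offspring receives a duplicated copy of a randomly chosen layer gene (constructive), while its sibling loses that gene (destructive), mirroring the n+1 / n-1 chromosome counts produced by biological non-disjunction.

```python
# Toy illustration of a chromosome non-disjunction operator on a genome that
# encodes hidden-layer widths.  Hypothetical encoding, not the paper's code.
import random

def non_disjunction(genome):
    """genome: list of hidden-layer widths, e.g. [128, 64, 32]."""
    if len(genome) < 2:
        return list(genome), list(genome)
    i = random.randrange(len(genome))
    gained = genome[:i] + [genome[i]] + genome[i:]   # offspring with an extra copy of gene i
    lost = genome[:i] + genome[i + 1:]               # offspring missing gene i
    return gained, lost

random.seed(0)
parent = [128, 64, 32]
child_plus, child_minus = non_disjunction(parent)
print(child_plus, child_minus)   # e.g. [128, 64, 64, 32] and [128, 32]
```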

2021 ◽  
Author(s):  
Nathan Buskulic ◽  
Edward Bergman ◽  
Joeran Beel

Neural Architecture Search research has been limited to fixed datasets and as such does not provide the flexibility needed to deal with real-world, constantly evolving data. This is why we propose the basis of Online Neural Architecture Search (ONAS) to deal with complex, evolving data distributions. We formalise ONAS as a minimisation problem in which both the weights and the architecture of the neural network need to be optimised for the data up until a time $t_i$. To solve this problem, we adapt a DARTS optimisation process, combined with an early-stopping scheme, using the supernet optimised on previous data as a warm-start initial state. This allows the architecture of the neural network to evolve as the data distribution evolves while limiting the computational burden. This work aims to build the initial mathematical formalism of the problem and to develop a framework in which NAS methods can be used to solve it. Finally, several possible next steps are presented to show the potential of Online Neural Architecture Search as a field.
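A minimal sketch of the warm-start idea, assuming PyTorch, a synthetic drifting data stream, and a toy two-operation DARTS-style mixed cell. For brevity the architecture parameters and weights are updated jointly rather than with the bi-level split used by DARTS; the point is only that the supernet optimised on data up to $t_{i-1}$ is reused as the initial state for data up to $t_i$, with early stopping.

```python
import torch
import torch.nn as nn

class MixedCell(nn.Module):
    """Toy DARTS-style cell: softmax-weighted mixture of two candidate ops."""
    def __init__(self, dim=8):
        super().__init__()
        self.ops = nn.ModuleList([nn.Linear(dim, dim), nn.Identity()])
        self.alpha = nn.Parameter(torch.zeros(2))     # architecture parameters

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)
        return w[0] * self.ops[0](x) + w[1] * self.ops[1](x)

def stream_batch(t, n=64, dim=8):
    """Synthetic, slowly drifting data distribution."""
    x = torch.randn(n, dim) + 0.1 * t
    y = (x.sum(dim=1, keepdim=True) > 0.1 * t * dim).float()
    return x, y

model = nn.Sequential(MixedCell(), nn.Linear(8, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for t in range(5):                                    # each t_i brings new data
    best, patience = float("inf"), 0
    for step in range(200):                           # warm-started from step t-1
        x, y = stream_batch(t)
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        xv, yv = stream_batch(t)                      # held-out batch for early stopping
        val = loss_fn(model(xv), yv).item()
        if val < best - 1e-4:
            best, patience = val, 0
        else:
            patience += 1
            if patience >= 20:                        # simple early-stopping scheme
                break
    print(f"t={t}  val={best:.3f}  alpha={model[0].alpha.softmax(0).tolist()}")
```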


2020 ◽  
Vol 2020 (10) ◽  
pp. 54-62
Author(s):  
Oleksii VASYLIEV

The problem of applying neural networks to calculate the ratings that banks use when deciding whether to grant loans to borrowers is considered. The task is to determine the rating function of the borrower based on a set of statistical data on the effectiveness of loans provided by the bank. When constructing a regression model to calculate the rating function, its general form must be known in advance; the task then reduces to calculating the parameters that enter the expression for the rating function. In contrast, when neural networks are used, there is no need to specify the general form of the rating function. Instead, a particular neural network architecture is chosen and its parameters are calculated from the statistical data. Importantly, the same neural network architecture can be used to process different sets of statistical data. The disadvantages of using neural networks include the need to calculate a large number of parameters. There is also no universal algorithm that would determine the optimal neural network architecture. As an example of the use of neural networks to determine the borrower's rating, a model system is considered in which the borrower's rating is given by a known, non-analytical rating function. A neural network with two hidden layers, containing three and two neurons respectively and using a sigmoid activation function, is used for modeling. It is shown that the neural network restores the borrower's rating function with acceptable accuracy.
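A minimal sketch of the model system described above, assuming PyTorch and a hypothetical "true" rating function used only to generate synthetic training data: two hidden layers with three and two neurons, sigmoid activations, fitted by regression.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical rating function of two borrower features (stands in for the
# non-analytical rating function of the model system); used only to generate data.
def true_rating(x):
    return torch.sigmoid(2.0 * x[:, :1] - x[:, 1:2]) + 0.1 * torch.sin(5 * x[:, :1])

x = torch.rand(500, 2)
y = true_rating(x)

model = nn.Sequential(
    nn.Linear(2, 3), nn.Sigmoid(),   # first hidden layer: 3 neurons
    nn.Linear(3, 2), nn.Sigmoid(),   # second hidden layer: 2 neurons
    nn.Linear(2, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

for epoch in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(f"final MSE of the restored rating function: {loss.item():.5f}")
```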


2021 ◽  
Vol 12 (6) ◽  
pp. 1-21
Author(s):  
Jayant Gupta ◽  
Carl Molnar ◽  
Yiqun Xie ◽  
Joe Knight ◽  
Shashi Shekhar

Spatial variability is a prominent feature of various geographic phenomena such as climatic zones, USDA plant hardiness zones, and terrestrial habitat types (e.g., forest, grasslands, wetlands, and deserts). However, current deep learning methods follow a spatial-one-size-fits-all (OSFA) approach, training single deep neural network models that do not account for spatial variability. Quantification of spatial variability can be challenging due to the influence of many geophysical factors. In preliminary work, we proposed a spatial variability aware neural network (SVANN-I, formerly called SVANN) approach in which the weights are a function of location but the neural network architecture is location independent. In this work, we explore a more flexible SVANN-E approach where the neural network architecture varies across geographic locations. In addition, we provide a taxonomy of SVANN types and a physics-inspired interpretation model. Experiments with aerial-imagery-based wetland mapping show that SVANN-I outperforms OSFA and that SVANN-E performs best of all.
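A minimal sketch of the distinction drawn above, assuming PyTorch and two hypothetical geographic zones: OSFA trains one network for all locations; SVANN-I keeps one architecture but learns separate weights per zone; SVANN-E additionally lets the architecture itself differ across zones.

```python
import torch.nn as nn

def make_net(hidden):
    return nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 2))

# OSFA: a single model shared by every location.
osfa_model = make_net(hidden=16)

# SVANN-I: identical architecture, location-dependent weights.
svann_i = {"zone_a": make_net(16), "zone_b": make_net(16)}

# SVANN-E: both weights and architecture depend on the zone.
svann_e = {
    "zone_a": make_net(8),
    "zone_b": nn.Sequential(nn.Linear(4, 32), nn.ReLU(),
                            nn.Linear(32, 32), nn.ReLU(),
                            nn.Linear(32, 2)),
}

def predict(x, zone, models):
    """Route each sample to the model of the zone it falls in."""
    return models[zone](x) if isinstance(models, dict) else models(x)
```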


Author(s):  
Raghuram Mandyam Annasamy ◽  
Katia Sycara

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical and empirical studies of what these networks actually learn lag far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, the results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing with out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
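A minimal sketch of the architectural idea, assuming PyTorch and toy dimensions (not the authors' implementation): the state embedding attends over a learned key-value memory to produce Q-values, and a decoder reconstructs the input from the embedding so that the embedding remains inspectable.

```python
import torch
import torch.nn as nn

class KeyValueQNet(nn.Module):
    def __init__(self, obs_dim=16, emb_dim=32, n_slots=8, n_actions=4):
        super().__init__()
        self.encode = nn.Linear(obs_dim, emb_dim)
        self.keys = nn.Parameter(torch.randn(n_slots, emb_dim))
        self.values = nn.Parameter(torch.randn(n_slots, n_actions))
        self.decode = nn.Linear(emb_dim, obs_dim)        # reconstructible embedding

    def forward(self, obs):
        e = torch.tanh(self.encode(obs))                  # state embedding
        attn = torch.softmax(e @ self.keys.T, dim=-1)     # attention over memory keys
        q = attn @ self.values                            # Q-values read from memory values
        recon = self.decode(e)                            # reconstruction of the observation
        return q, attn, recon

net = KeyValueQNet()
obs = torch.randn(2, 16)
q, attn, recon = net(obs)
print(q.shape, attn.shape, recon.shape)   # (2, 4) (2, 8) (2, 16)
```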


2019 ◽  
Vol 12 (S10) ◽  
Author(s):  
Jie Hao ◽  
Youngsoon Kim ◽  
Tejaswini Mallavarapu ◽  
Jung Hun Oh ◽  
Mingon Kang

Background: Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear and high-dimension, low-sample-size (HDLSS) data cause computational challenges for conventional survival analysis.
Results: We propose a novel, biologically interpretable, pathway-based sparse deep neural network, named Cox-PASNet, which integrates high-dimensional gene expression data and clinical data in a simple neural network architecture for survival analysis. Cox-PASNet is biologically interpretable in that nodes in the neural network correspond to biological genes and pathways, while capturing the nonlinear and hierarchical effects of biological pathways associated with cancer patient survival. We also propose a heuristic optimization solution to train Cox-PASNet with HDLSS data. Cox-PASNet was intensively evaluated by comparing its predictive performance with that of current state-of-the-art methods on glioblastoma multiforme (GBM) and ovarian serous cystadenocarcinoma (OV) cancer. In the experiments, Cox-PASNet outperformed the benchmark methods. Moreover, the neural network architecture of Cox-PASNet was biologically interpreted, and several significant prognostic factors of genes and biological pathways were identified.
Conclusions: Cox-PASNet models biological mechanisms in the neural network by incorporating biological pathway databases and sparse coding. The neural network of Cox-PASNet can identify nonlinear and hierarchical associations of genomic and clinical data with cancer patient survival. The open-source code of Cox-PASNet, implemented in PyTorch for training, evaluation, and model interpretation, is available at: https://github.com/DataX-JieHao/Cox-PASNet.
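A minimal sketch of the two ingredients named in the abstract, assuming PyTorch, a toy gene-to-pathway membership mask, and synthetic survival data: a sparse layer whose connections follow pathway membership, and a Cox partial log-likelihood loss. The authors' actual implementation is the repository linked above.

```python
import torch
import torch.nn as nn

n_genes, n_pathways, n_clinical = 100, 10, 3
mask = (torch.rand(n_pathways, n_genes) < 0.1).float()   # toy pathway membership matrix

class MaskedLinear(nn.Linear):
    """Linear layer whose weights are zeroed outside the pathway mask."""
    def forward(self, x):
        return nn.functional.linear(x, self.weight * mask, self.bias)

class ToyCoxPASNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.gene_to_pathway = MaskedLinear(n_genes, n_pathways)
        self.hidden = nn.Linear(n_pathways + n_clinical, 8)
        self.out = nn.Linear(8, 1)                        # prognostic index

    def forward(self, genes, clinical):
        p = torch.tanh(self.gene_to_pathway(genes))
        h = torch.tanh(self.hidden(torch.cat([p, clinical], dim=1)))
        return self.out(h).squeeze(1)

def cox_partial_nll(risk, time, event):
    """Negative Cox partial log-likelihood (no tie handling, for illustration)."""
    order = torch.argsort(time, descending=True)          # risk set = subjects with longer time
    risk, event = risk[order], event[order]
    log_cumsum = torch.logcumsumexp(risk, dim=0)
    return -((risk - log_cumsum) * event).sum() / event.sum().clamp(min=1)

model = ToyCoxPASNet()
genes, clinical = torch.randn(32, n_genes), torch.randn(32, n_clinical)
time, event = torch.rand(32), (torch.rand(32) < 0.7).float()
loss = cox_partial_nll(model(genes, clinical), time, event)
loss.backward()
print(f"partial NLL: {loss.item():.3f}")
```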


Author(s):  
Manish Kumar ◽  
Devendra P. Garg

Design of an efficient fuzzy logic controller involves the optimization of fuzzy set parameters and a proper choice of rule base. Several techniques reported in the recent literature use neural network architectures and genetic algorithms to learn and optimize a fuzzy logic controller. This paper presents methodologies that use the learning capabilities of neural networks to learn and optimize fuzzy logic controller parameters. Concepts of model predictive control (MPC) have been used to obtain an optimal signal for training the neural network via backpropagation. The strategies developed have been applied to control an inverted pendulum, and the results have been compared for two different fuzzy logic controllers developed with the help of neural networks. The first neural network emulates a PD controller, while the second controller is developed based on MPC. The proposed approach can be applied to learn fuzzy logic controller parameters online via dynamic backpropagation. The results show that the neuro-fuzzy approaches were able to learn the rule base and identify the membership function parameters accurately.
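A minimal sketch of learning fuzzy controller parameters by backpropagation, assuming PyTorch, Gaussian membership functions, and a toy zero-order Sugeno rule base; the training signal here is the output of a PD controller, loosely corresponding to the first of the two controllers above. This is an illustration of the general technique, not the paper's controller.

```python
import torch
import torch.nn as nn

class TinyNeuroFuzzy(nn.Module):
    """Two inputs (error, error rate), 3 Gaussian sets each, 3x3 Sugeno rules."""
    def __init__(self):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-1, 1, 3).repeat(2, 1))  # (2, 3)
        self.widths = nn.Parameter(torch.full((2, 3), 0.5))
        self.rule_out = nn.Parameter(torch.zeros(3, 3))        # consequent of each rule

    def forward(self, x):                                       # x: (N, 2)
        mu = torch.exp(-((x.unsqueeze(2) - self.centers) ** 2)
                       / (2 * self.widths ** 2))                # memberships (N, 2, 3)
        fire = mu[:, 0, :, None] * mu[:, 1, None, :]            # rule firing strengths (N, 3, 3)
        w = fire / fire.sum(dim=(1, 2), keepdim=True).clamp(min=1e-9)
        return (w * self.rule_out).sum(dim=(1, 2))              # defuzzified output

def pd_controller(x, kp=2.0, kd=0.5):                           # training signal
    return kp * x[:, 0] + kd * x[:, 1]

model = TinyNeuroFuzzy()
opt = torch.optim.Adam(model.parameters(), lr=0.05)
for step in range(1000):
    x = torch.rand(128, 2) * 2 - 1                              # sampled (error, error rate)
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), pd_controller(x))
    loss.backward()
    opt.step()
print(f"MSE against the PD target: {loss.item():.4f}")
```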


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Francisco J. Bravo Sanchez ◽  
Md Rahat Hossain ◽  
Nathan B. English ◽  
Steven T. Moore

The use of autonomous recordings of animal sounds to detect species is a popular conservation tool, constantly improving in fidelity as audio hardware and software evolve. Current classification algorithms utilise sound features extracted from the recording rather than the sound itself, with varying degrees of success. Neural networks that learn directly from the raw sound waveform have been implemented in human speech recognition, but the requirement for detailed labelled data has limited their use in bioacoustics. Here we test SincNet, an efficient neural network architecture that learns from the raw waveform using sinc-based filters. Results using an off-the-shelf implementation of SincNet on a publicly available bird sound dataset (NIPS4Bplus) show that the neural network rapidly converged, reaching accuracies of over 65% with limited data. After hyperparameter tuning its performance is comparable with traditional methods, while being more efficient. Learning directly from the raw waveform allows the algorithm to select automatically those elements of the sound that are best suited for the task, bypassing the onerous task of selecting feature extraction techniques and reducing possible biases. We use publicly released code and datasets to encourage others to replicate our results and to apply SincNet to their own datasets, and we review possible enhancements in the hope that algorithms that learn from the raw waveform will become useful bioacoustic tools.
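A minimal sketch of the core SincNet idea, assuming PyTorch with torch.sinc available: each first-layer filter is a band-pass kernel generated from just two learnable cut-off parameters and convolved with the raw waveform. This is a schematic version; the reference implementation is the publicly released SincNet code.

```python
import torch
import torch.nn as nn

class SincConv(nn.Module):
    def __init__(self, n_filters=8, kernel_size=101, sample_rate=16000):
        super().__init__()
        self.k = kernel_size
        # learnable low cut-off and bandwidth, in Hz
        self.low_hz = nn.Parameter(torch.linspace(30, 4000, n_filters))
        self.band_hz = nn.Parameter(torch.full((n_filters,), 400.0))
        t = (torch.arange(kernel_size) - kernel_size // 2) / sample_rate
        self.register_buffer("t", t)
        self.register_buffer("window", torch.hamming_window(kernel_size))

    def forward(self, x):                        # x: (batch, 1, samples)
        low = torch.abs(self.low_hz)
        high = low + torch.abs(self.band_hz)
        # ideal band-pass impulse response = difference of two low-pass sincs
        lp_high = 2 * high[:, None] * torch.sinc(2 * high[:, None] * self.t)
        lp_low = 2 * low[:, None] * torch.sinc(2 * low[:, None] * self.t)
        kernels = ((lp_high - lp_low) * self.window).unsqueeze(1)   # (F, 1, K)
        return nn.functional.conv1d(x, kernels, padding=self.k // 2)

layer = SincConv()
waveform = torch.randn(4, 1, 16000)              # one second of raw audio per example
print(layer(waveform).shape)                      # torch.Size([4, 8, 16000])
```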


2016 ◽  
Vol 12 (2) ◽  
pp. 61-64 ◽  
Author(s):  
Vitaly M Tatyankin

An approach to constructing an efficient pattern recognition algorithm is proposed. Efficiency is understood as zero error in identifying the images of the test sample. The open MNIST database of handwritten digit images is used as the test sample.


This paper describes the use of a novel gradient-based recurrent neural network to perform Capon spectral estimation. Even in the fastest algorithm, proposed by Marple et al., the computational burden of calculating the autoregressive (AR) parameters remains significant. In this paper we propose to use a gradient-based neural network to compute the AR parameters by solving the Yule-Walker equations. Furthermore, to reduce the complexity of the neural network architecture, the product of the weight matrix with the input vector is performed efficiently using the fast Fourier transform. Simulation results show that the proposed neural network and its simplified architecture lead to the same results as the original method, which proves the correctness of the proposed scheme.
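A minimal sketch of the two ideas above, assuming NumPy and a toy AR(2) signal (an illustration of the general technique, not the paper's network): the AR parameters are obtained by iterative gradient steps on the Yule-Walker system, and each product of the Toeplitz autocorrelation matrix with a vector is computed through the FFT via a circulant embedding rather than an explicit matrix multiply.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4096, 2
x = np.zeros(n)
for i in range(2, n):                           # AR(2): x_t = 1.0 x_{t-1} - 0.5 x_{t-2} + e_t
    x[i] = 1.0 * x[i - 1] - 0.5 * x[i - 2] + rng.standard_normal()

r = np.array([x[: n - k] @ x[k:] for k in range(p + 1)]) / n   # autocorrelation lags

def toeplitz_matvec(c, v):
    """Symmetric Toeplitz matrix (first column c) times v, via circulant FFT embedding."""
    m = len(c)
    col = np.concatenate([c, [0.0], c[1:][::-1]])              # first column of the circulant
    vp = np.concatenate([v, np.zeros(m)])
    return np.fft.ifft(np.fft.fft(col) * np.fft.fft(vp)).real[:m]

# Solve the Yule-Walker equations R a = -r[1:] by gradient descent on the quadratic
# 0.5 a^T R a + r^T a, whose gradient is R a + r (R is symmetric positive definite).
a = np.zeros(p)
lr = 0.5 / r[0]
for _ in range(500):
    grad = toeplitz_matvec(r[:p], a) + r[1:]
    a -= lr * grad

# In the convention x_t + a1 x_{t-1} + a2 x_{t-2} = e_t the estimate is close to [-1.0, 0.5].
print("estimated AR parameters:", a)
```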


2021 ◽  
Vol 11 (1) ◽  
pp. 28
Author(s):  
Ana Bárbara Cardoso ◽  
Bruno Martins ◽  
Jacinto Estima

This article describes a novel approach for toponym resolution with deep neural networks. The proposed approach does not involve matching references in the text against entries in a gazetteer, instead directly predicting geo-spatial coordinates. Multiple inputs are considered in the neural network architecture (e.g., the surrounding words are considered in combination with the toponym to be disambiguated), using pre-trained contextual word embeddings (i.e., ELMo or BERT) as well as bi-directional Long Short-Term Memory units, both of which are regularly used for modeling textual data. The intermediate representations are then used to predict a probability distribution over possible geo-spatial regions, and finally to predict the coordinates for the input toponym. The proposed model was tested on three datasets used in previous toponym resolution studies, specifically the (i) War of the Rebellion, (ii) Local–Global Lexicon, and (iii) SpatialML corpora. Moreover, we evaluated the effect of using (i) geophysical terrain properties as external information, including information on elevation or terrain development, among others, and (ii) additional data collected from Wikipedia articles, to further help with the training of the model. The results show improvements of the proposed method over previous approaches, specifically when BERT embeddings and additional data are involved.
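A minimal sketch of the pipeline described above, assuming PyTorch and that pre-computed contextual embeddings (e.g., from BERT or ELMo) are supplied as the input tensor: a bi-directional LSTM over the context window, a probability distribution over coarse geo-spatial regions, and a final regression to latitude/longitude. Dimensions and pooling are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToponymResolver(nn.Module):
    def __init__(self, emb_dim=768, hidden=128, n_regions=64):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.region_head = nn.Linear(2 * hidden, n_regions)       # coarse region distribution
        self.coord_head = nn.Linear(2 * hidden + n_regions, 2)    # (latitude, longitude)

    def forward(self, ctx_embeddings):               # (batch, seq_len, emb_dim)
        h, _ = self.bilstm(ctx_embeddings)
        pooled = h.mean(dim=1)                       # summarise the context window
        regions = torch.softmax(self.region_head(pooled), dim=-1)
        coords = self.coord_head(torch.cat([pooled, regions], dim=-1))
        return regions, coords

model = ToponymResolver()
batch = torch.randn(4, 20, 768)                      # 4 toponym mentions, 20-token windows
regions, coords = model(batch)
print(regions.shape, coords.shape)                   # (4, 64) (4, 2)
```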

