Understanding Memories of the Past in the Context of Different Complex Neural Network Architectures

2022 ◽  
pp. 1-27
Author(s):  
Clifford Bohm ◽  
Douglas Kirkpatrick ◽  
Arend Hintze

Abstract: Deep learning (primarily using backpropagation) and neuroevolution are the preeminent methods of optimizing artificial neural networks. However, they often create black boxes that are as hard to understand as the natural brains they seek to mimic. Previous work has identified an information-theoretic tool, referred to as R, which allows us to quantify and identify mental representations in artificial cognitive systems. The use of such measures has allowed us to make previous black boxes more transparent. Here we extend R to not only identify where complex computational systems store memory about their environment but also to differentiate between different time points in the past. We show how this extended measure can identify the location of memory related to past experiences in neural networks optimized by deep learning as well as by a genetic algorithm.
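As a hedged illustration only (not the authors' measure R, whose exact definition is not reproduced here): one way to localize memory of specific past time points is to estimate the mutual information between each hidden unit and the environmental signal at different lags. All names, shapes, and the binning scheme below are assumptions.

```python
# Illustrative sketch: locating memory of specific past time points by measuring
# mutual information between hidden units and lagged environmental signals.
# This is NOT the authors' exact measure R; shapes and binning are assumptions.
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X;Y) in bits from two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def memory_map(hidden, stimulus, max_lag=5, bins=8):
    """hidden: (T, n_units) activations; stimulus: (T,) environment signal.
    Returns an (n_units, max_lag) array of MI between each unit at time t and
    the stimulus at time t - lag, i.e. where memory of each lag resides."""
    T, n_units = hidden.shape
    out = np.zeros((n_units, max_lag))
    for lag in range(1, max_lag + 1):
        for u in range(n_units):
            out[u, lag - 1] = mutual_information(hidden[lag:, u], stimulus[:-lag], bins)
    return out

# Toy usage: a random "network" whose unit 0 copies the stimulus with a delay of 3 steps.
rng = np.random.default_rng(0)
stim = rng.integers(0, 2, size=2000).astype(float)
hid = rng.normal(size=(2000, 4))
hid[3:, 0] = stim[:-3]                      # unit 0 stores the stimulus from 3 steps ago
print(memory_map(hid, stim, max_lag=5).round(2))   # peak at unit 0, lag 3
```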

Author(s):  
Ruofan Liao ◽  
Paravee Maneejuk ◽  
Songsak Sriboonchitta

In the past, in many areas, the best prediction models were linear and nonlinear parametric models. In the last decade, in many application areas, deep learning has been shown to lead to more accurate predictions than parametric models. Deep learning-based predictions are reasonably accurate, but not perfect. How can we achieve better accuracy? To achieve this objective, we propose to combine neural networks with a parametric model: namely, to train neural networks not on the original data, but on the differences between the actual data and the predictions of the parametric model. Using the example of predicting currency exchange rates, we show that this idea indeed leads to more accurate predictions.
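A minimal sketch of the proposed combination, under assumed stand-ins: a simple linear model plays the role of the parametric model, a small scikit-learn MLP is trained on its residuals, and the two predictions are added. The synthetic series below is not the paper's exchange-rate data.

```python
# Hybrid idea: fit a parametric model first, then train a neural network on its
# residuals and add the two predictions. The linear stand-in, the MLP, and the
# synthetic data are illustrative assumptions, not the paper's exact setup.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
t = np.arange(1000)
rate = 1.1 + 0.0005 * t + 0.05 * np.sin(t / 20) + 0.01 * rng.normal(size=t.size)

# Lagged feature: predict rate[t] from rate[t-1]
X, y = rate[:-1].reshape(-1, 1), rate[1:]
X_tr, X_te, y_tr, y_te = X[:800], X[800:], y[:800], y[800:]

# 1) Parametric (here: linear) model
lin = LinearRegression().fit(X_tr, y_tr)

# 2) Neural network trained on the parametric model's residuals
resid = y_tr - lin.predict(X_tr)
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
net.fit(X_tr, resid)

# 3) Combined prediction = parametric prediction + predicted residual
hybrid = lin.predict(X_te) + net.predict(X_te)
print("linear MSE :", np.mean((lin.predict(X_te) - y_te) ** 2))
print("hybrid MSE :", np.mean((hybrid - y_te) ** 2))
```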


Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

For the past few years, deep learning (DL) robustness (i.e., the ability to maintain the same decision when inputs are subject to perturbations) has become a question of paramount importance, in particular in settings where misclassification can have dramatic consequences. To address this question, authors have proposed different approaches, such as adding regularizers or training with noisy examples. In this paper we introduce a regularizer based on the Laplacian of similarity graphs obtained from the representation of training data at each layer of the DL architecture. This regularizer penalizes large changes (across consecutive layers in the architecture) in the distance between examples of different classes, and as such enforces smooth variations of the class boundaries. We provide theoretical justification for this regularizer and demonstrate its effectiveness in improving robustness on classical supervised learning vision datasets for various types of perturbations. We also show that it can be combined with existing methods to increase overall robustness.
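A minimal sketch of a Laplacian-based regularizer of this kind, with simplifying assumptions (a cosine-similarity batch graph and an absolute-difference penalty between consecutive layers); it is not the authors' exact formulation.

```python
# Sketch (assumptions throughout): build a similarity graph on the batch at each
# layer, measure the Laplacian quadratic form of the one-hot label signal, and
# penalize changes in that quantity between consecutive layers.
import torch
import torch.nn.functional as F

def laplacian_smoothness(feats, labels_onehot):
    """feats: (B, D) batch features at one layer; labels_onehot: (B, C).
    Returns tr(Y^T L Y) for the similarity-graph Laplacian L of the batch."""
    z = F.normalize(feats.flatten(1), dim=1)
    W = torch.relu(z @ z.t())                 # cosine-similarity adjacency
    W = W - torch.diag(torch.diag(W))         # remove self-loops
    L = torch.diag(W.sum(1)) - W              # combinatorial Laplacian
    return torch.einsum("bc,bd,dc->", labels_onehot, L, labels_onehot)

def laplacian_regularizer(layer_feats, labels, num_classes):
    """layer_feats: list of per-layer feature tensors for the same batch."""
    y = F.one_hot(labels, num_classes).float()
    s = [laplacian_smoothness(f, y) for f in layer_feats]
    return sum((s[i + 1] - s[i]).abs() for i in range(len(s) - 1))

# Possible usage inside a training step (features collected, e.g., with forward hooks):
# loss = F.cross_entropy(logits, labels) + lam * laplacian_regularizer(feats, labels, 10)
```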


2019 ◽  
Vol 491 (2) ◽  
pp. 2280-2300 ◽  
Author(s):  
Kaushal Sharma ◽  
Ajit Kembhavi ◽  
Aniruddha Kembhavi ◽  
T Sivarani ◽  
Sheelu Abraham ◽  
...  

Abstract: Due to the ever-expanding volume of observed spectroscopic data from surveys such as SDSS and LAMOST, it has become important to apply artificial intelligence (AI) techniques for analysing stellar spectra to solve spectral classification and regression problems, such as the determination of the stellar atmospheric parameters $T_{\rm eff}$, $\log g$, and [Fe/H]. We propose an automated approach for the classification of stellar spectra in the optical region using convolutional neural networks (CNNs). Traditional machine learning (ML) methods with ‘shallow’ architectures (usually up to two hidden layers) have been trained for these purposes in the past. However, deep learning methods with a larger number of hidden layers allow the use of finer details in the spectrum, which results in improved accuracy and better generalization. Studying finer spectral signatures also enables us to determine accurate differential stellar parameters and find rare objects. We examine various machine and deep learning algorithms, including artificial neural networks, Random Forest, and CNNs, to classify stellar spectra, using the Jacoby Atlas, ELODIE, and MILES spectral libraries as training samples. We test the performance of the trained networks on the Indo-U.S. Library of Coudé Feed Stellar Spectra (CFLIB). We show that using CNNs we are able to lower the error to 1.23 spectral subclasses, compared with the two subclasses achieved in past studies with ML approaches. We further apply the trained model to classify stellar spectra retrieved from the SDSS database with SNR > 20.
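A minimal sketch of a 1-D CNN classifier for optical spectra; the input length, filter sizes, and number of spectral subclasses are illustrative assumptions, not the architecture trained in the paper.

```python
# Sketch of a 1-D CNN that maps a flux array to a spectral subclass.
# Input length, channel counts, and number of subclasses are assumptions.
import torch
import torch.nn as nn

class SpectrumCNN(nn.Module):
    def __init__(self, n_flux_bins=4000, n_subclasses=70):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, n_subclasses)

    def forward(self, flux):                 # flux: (batch, n_flux_bins), continuum-normalized
        x = self.features(flux.unsqueeze(1))
        return self.classifier(x.squeeze(-1))

model = SpectrumCNN()
logits = model(torch.randn(8, 4000))         # 8 dummy spectra
print(logits.shape)                          # torch.Size([8, 70])
```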


Author(s):  
Derya Soydaner

In recent years, we have witnessed the rise of deep learning. Deep neural networks have proved their success in many areas. However, the optimization of these networks has become more difficult as neural networks go deeper and datasets become bigger. Therefore, more advanced optimization algorithms have been proposed over the past years. In this study, widely used optimization algorithms for deep learning are examined in detail. To this end, these algorithms, called adaptive gradient methods, are implemented for both supervised and unsupervised tasks. The behavior of the algorithms during training and their results on four image datasets, namely MNIST, CIFAR-10, Kaggle Flowers, and Labeled Faces in the Wild, are compared, pointing out their differences from basic optimization algorithms.
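As one representative of the adaptive gradient methods examined, a minimal NumPy sketch of the Adam update rule (with the common default hyperparameters, not the study's settings):

```python
# One adaptive gradient method, the Adam update rule, in plain NumPy.
# Hyperparameter values are the usual defaults, assumed for illustration.
import numpy as np

def adam_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. `state` holds the moment estimates and step count."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad           # 1st moment (mean)
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2      # 2nd moment (uncentered variance)
    m_hat = state["m"] / (1 - beta1 ** state["t"])                 # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Toy usage: minimize f(theta) = ||theta||^2, whose gradient is 2 * theta.
theta = np.array([3.0, -2.0])
state = {"t": 0, "m": np.zeros_like(theta), "v": np.zeros_like(theta)}
for _ in range(2000):
    theta = adam_step(theta, 2 * theta, state, lr=0.05)
print(theta)   # close to [0, 0]
```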


2019 ◽  
Author(s):  
Mark Rademaker ◽  
Laurens Hogeweg ◽  
Rutger Vos

Abstract: Knowledge of global biodiversity remains limited by geographic and taxonomic sampling biases. The scarcity of species data restricts our understanding of the underlying environmental factors shaping distributions and our ability to draw comparisons among species. Species distribution models (SDMs) were developed in the early 2000s to address this issue. Although SDMs based on single-layered neural networks were experimented with in the past, they performed poorly. However, the past two decades have seen a strong increase in the use of Deep Learning (DL) approaches, such as Deep Neural Networks (DNNs). Despite the large improvement in predictive capacity that DNNs provide over shallow networks, to our knowledge they have not yet been applied to SDMs. The aim of this research was to provide a proof of concept of a DL-SDM. We used a pre-existing dataset of the world’s ungulates and abiotic environmental predictors that had recently been used in a MaxEnt SDM, to allow a direct comparison of performance between the two methods. Our DL-SDM consisted of a binary classification DNN containing four hidden layers with dropout regularization between each layer. Performance of the DL-SDM was similar to MaxEnt for species with relatively large sample sizes and worse for species with relatively small sample sizes. Increasing the number of occurrences further improved DL-SDM performance for species that already had relatively high sample sizes. We then tried to further improve performance by altering the sampling procedure for negative instances and increasing the number of environmental predictors, including species interactions. This led to a large increase in model performance across the range of sample sizes in the species datasets. We conclude that DL-SDMs provide a suitable alternative to traditional SDMs such as MaxEnt, with the advantage of being able both to directly include species interactions and to handle correlated input features. Further improvements to the model would include increasing its scalability by turning it into a multi-classification model, as well as developing a more user-friendly DL-SDM Python package.
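A minimal sketch of a DL-SDM-style classifier as described (a binary DNN with four hidden layers and dropout between layers); layer widths, the dropout rate, and the number of environmental predictors are assumptions.

```python
# Sketch of a dense binary classifier with four hidden layers and dropout,
# mapping abiotic environmental predictors to an occurrence probability.
# Widths, dropout rate, and predictor count are assumptions.
import torch
import torch.nn as nn

def make_dl_sdm(n_predictors=20, width=128, p_drop=0.3):
    layers, d_in = [], n_predictors
    for _ in range(4):                                   # four hidden layers
        layers += [nn.Linear(d_in, width), nn.ReLU(), nn.Dropout(p_drop)]
        d_in = width
    layers += [nn.Linear(d_in, 1)]                       # logit for presence/absence
    return nn.Sequential(*layers)

model = make_dl_sdm()
env = torch.randn(32, 20)                                # 32 sites x 20 environmental predictors
occurrence_prob = torch.sigmoid(model(env)).squeeze(-1)  # train with BCEWithLogitsLoss on the logits
print(occurrence_prob.shape)                             # torch.Size([32])
```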


2019 ◽  
Vol 2019 ◽  
pp. 1-12
Author(s):  
Xu Yin ◽  
Yan Li ◽  
Byeong-Seok Shin

With the widespread use of deep learning methods, semantic segmentation has achieved great improvements in recent years. However, many researchers have pointed out that repeated convolution and pooling operations cause considerable information loss during feature extraction. To solve this problem, various operations and network architectures have been suggested to make up for the lost information. We observed a trend in many studies toward designing networks symmetrically, with the two parts representing the “encoding” and “decoding” stages. Through “upsampling” operations in the “decoding” stage, feature maps are reconstructed in a way that more or less compensates for the losses in previous layers. In this paper, we focus on upsampling operations, provide a detailed analysis, and compare the methods currently used in several well-known neural networks. We also draw on knowledge from image restoration to design a new upsampling layer (or operation) named the TGV upsampling algorithm. We successfully replaced the upsampling layers used in previous research with our new method. We found that our model better preserves detailed textures and edges of feature maps and can, on average, achieve 1.4–2.3% higher accuracy compared to the original models.
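A minimal sketch of the kind of drop-in replacement of decoder upsampling layers being compared; the TGV upsampling operation itself is not reproduced here, and the decoder block below is an assumption.

```python
# Sketch: swapping the upsampling operation inside a decoder block among
# standard choices. The paper's TGV upsampling is not reproduced here.
import torch
import torch.nn as nn

def make_upsample(kind, channels):
    if kind == "nearest":
        return nn.Upsample(scale_factor=2, mode="nearest")
    if kind == "bilinear":
        return nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
    if kind == "transposed_conv":
        return nn.ConvTranspose2d(channels, channels, kernel_size=2, stride=2)
    raise ValueError(kind)

class DecoderBlock(nn.Module):
    def __init__(self, channels, upsample_kind="bilinear"):
        super().__init__()
        self.up = make_upsample(upsample_kind, channels)
        self.refine = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, x):
        return self.refine(self.up(x))

feat = torch.randn(1, 64, 32, 32)                     # a low-resolution feature map
for kind in ["nearest", "bilinear", "transposed_conv"]:
    print(kind, DecoderBlock(64, kind)(feat).shape)   # all -> (1, 64, 64, 64)
```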


Author(s):  
A. Sokolova ◽  
A. Konushin

In this work we investigate the problem of recognizing people by their gait. For this task, we implement a deep learning approach that uses optical flow as the main source of motion information and combines neural feature extraction with an additional embedding of descriptors to improve the representation. In order to find the best heuristics, we compare several deep neural network architectures and learning and classification strategies. The experiments were conducted on two popular gait recognition datasets, allowing us to investigate their advantages and disadvantages as well as the transferability of the considered methods.
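A minimal sketch of the pipeline's shape, under assumptions: dense optical flow between consecutive frames (OpenCV Farnebäck) is stacked and passed to a small CNN, followed by a linear embedding of the descriptor. The paper's trained architectures and embedding learning are not reproduced.

```python
# Sketch: optical-flow stack -> CNN feature extractor -> descriptor embedding.
# Flow parameters, CNN layers, and the embedding dimension are assumptions.
import cv2
import numpy as np
import torch
import torch.nn as nn

def flow_stack(frames):
    """frames: list of grayscale uint8 images (H, W). Returns a (2*(T-1), H, W) flow stack."""
    flows = []
    for prev, nxt in zip(frames[:-1], frames[1:]):
        f = cv2.calcOpticalFlowFarneback(prev, nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(f.transpose(2, 0, 1))        # (2, H, W): x- and y-displacement
    return np.concatenate(flows, axis=0)

class GaitDescriptor(nn.Module):
    def __init__(self, in_channels, embed_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.embed = nn.Linear(64, embed_dim)     # additional embedding of the descriptor

    def forward(self, x):                          # x: (B, 2*(T-1), H, W)
        return self.embed(self.cnn(x).flatten(1))

frames = [np.random.randint(0, 255, (128, 88), np.uint8) for _ in range(10)]
stack = torch.from_numpy(flow_stack(frames)).float().unsqueeze(0)
print(GaitDescriptor(stack.shape[1])(stack).shape)   # torch.Size([1, 128])
```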


2019 ◽  
Vol 11 (2) ◽  
pp. 28-35
Author(s):  
Tsila Hassine ◽  
Ziv Neeman

In the past few years, deep-learning neural networks have achieved major milestones in artistic image analysis and generation, producing what some refer to as ‘art.’ We reflect critically on some of the artistic shortcomings of a few projects that occupied the spotlight in recent years. We introduce the term ‘Zombie Art’ to describe the generation of new images of dead masters, as well as ‘The AI Reproducibility Test.’ We identify the problems inherent in AI and in its application to art history. In conclusion, we propose new directions for both AI-generated art and art history, in the light of these new, powerful AI technologies of artistic image analysis and generation.


2022 ◽  
Vol 13 (1) ◽  
Author(s):  
Tianyu Wang ◽  
Shi-Yuan Ma ◽  
Logan G. Wright ◽  
Tatsuhiro Onodera ◽  
Brian C. Richard ◽  
...  

Abstract: Deep learning has become a widespread tool in both science and industry. However, continued progress is hampered by the rapid growth in energy costs of ever-larger deep neural networks. Optical neural networks provide a potential means to solve the energy-cost problem faced by deep learning. Here, we experimentally demonstrate an optical neural network based on optical dot products that achieves 99% accuracy on handwritten-digit classification using ~3.1 detected photons per weight multiplication and ~90% accuracy using ~0.66 photons (~2.5 × 10⁻¹⁹ J of optical energy) per weight multiplication. The fundamental principle enabling our sub-photon-per-multiplication demonstration, noise reduction from the accumulation of scalar multiplications in dot-product sums, is applicable to many different optical-neural-network architectures. Our work shows that optical neural networks can achieve accurate results using extremely low optical energies.
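A hedged numerical illustration (not the optical experiment itself) of the stated principle: per-multiplication shot noise averages out when many terms accumulate in a dot-product sum. The photon budget and vector sizes below are assumptions.

```python
# Simulate Poisson (shot) noise on each scalar multiplication and show that the
# relative error of the accumulated dot product shrinks as the sum gets longer,
# even at ~1 detected photon per multiplication. Values are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
photons_per_mult = 1.0                     # mean detected photons per weight multiplication

for n in [10, 100, 1000, 10000]:           # dot-product length
    x = rng.uniform(0.1, 1.0, size=n)
    w = rng.uniform(0.1, 1.0, size=n)
    exact = x @ w
    # Each product x_i * w_i is encoded as a mean photon count, read out with
    # Poisson noise, and the noisy terms are summed.
    scale = photons_per_mult / np.mean(x * w)
    noisy = rng.poisson(scale * x * w).sum() / scale
    rel_err = abs(noisy - exact) / exact
    print(f"n={n:>6}  relative error of noisy dot product: {rel_err:.3%}")
```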


Author(s):  
Théophile Sanchez ◽  
Jean Cury ◽  
Guillaume Charpiat ◽  
Flora Jay

Abstract: For the past decades, simulation-based likelihood-free inference methods have enabled researchers to address numerous population genetics problems. As the richness and amount of simulated and real genetic data keep increasing, the field has a strong opportunity to tackle tasks that current methods hardly solve. However, high data dimensionality forces most methods to summarize large genomic datasets into a relatively small number of handcrafted features (summary statistics). Here we propose an alternative to summary statistics, based on the automatic extraction of relevant information using deep learning techniques. Specifically, we design artificial neural networks (ANNs) that take as input the single nucleotide polymorphic sites (SNPs) found in individuals sampled from a single population and infer the past effective population size history. First, we provide guidelines for constructing artificial neural networks that comply with the intrinsic properties of SNP data, such as invariance to permutation of haplotypes, long-range interactions between SNPs, and variable genomic length. Thanks to a Bayesian hyperparameter optimization procedure, we evaluate the performance of multiple networks and compare them to well-established methods such as Approximate Bayesian Computation (ABC). Even without the expert knowledge of summary statistics, our approach compares fairly well to an ABC based on handcrafted features. Furthermore, we show that combining deep learning and ABC can improve performance while taking advantage of both frameworks. Finally, we apply our approach to reconstruct the effective population size history of cattle breed populations.
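A minimal sketch of a haplotype-permutation-invariant network: a shared encoder is applied to every haplotype row and the results are pooled with a symmetric mean, so reordering haplotypes cannot change the output. Layer sizes and the output parameterization are assumptions, not the architecture selected by the paper's Bayesian hyperparameter search.

```python
# Shared per-haplotype encoder + symmetric pooling => permutation invariance.
# Dimensions and the population-size output parameterization are assumptions.
import torch
import torch.nn as nn

class ExchangeableSNPNet(nn.Module):
    def __init__(self, n_snps=400, n_epochs_out=21, hidden=256):
        super().__init__()
        self.per_haplotype = nn.Sequential(         # shared across haplotype rows
            nn.Linear(n_snps, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Sequential(                  # pooled summary -> population size history
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_epochs_out),
        )

    def forward(self, snp_matrix):                  # (batch, n_haplotypes, n_snps) in {0, 1}
        h = self.per_haplotype(snp_matrix)          # encode each haplotype independently
        pooled = h.mean(dim=1)                      # symmetric pooling => permutation invariance
        return self.head(pooled)

model = ExchangeableSNPNet()
x = torch.randint(0, 2, (4, 100, 400)).float()      # 4 samples, 100 haplotypes, 400 SNPs
perm = x[:, torch.randperm(100), :]
print(torch.allclose(model(x), model(perm), atol=1e-5))   # True: haplotype order is irrelevant
```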

