Generalization in data-driven models of primary visual cortex

2020 ◽  
Author(s):  
Konstantin-Klemens Lurz ◽  
Mohammad Bashiri ◽  
Konstantin Willeke ◽  
Akshay K. Jagadish ◽  
Eric Wang ◽  
...  

Abstract: Deep neural networks (DNNs) have set new standards at predicting responses of neural populations to visual input. Most such DNNs consist of a convolutional network (core) shared across all neurons, which learns a representation of neural computation in visual cortex, and a neuron-specific readout that linearly combines the relevant features of this representation. The goal of this paper is to test whether such a representation is indeed generally characteristic of visual cortex, i.e. generalizes between animals of a species, and what factors contribute to obtaining such a generalizing core. To push all non-linear computations into the core, where the generalizing cortical features should be learned, we devise a novel readout that reduces the number of parameters per neuron in the readout by up to two orders of magnitude compared to the previous state of the art. It does so by taking advantage of retinotopy and learning a Gaussian distribution over each neuron's receptive field position. With this new readout we train our network on neural responses from mouse primary visual cortex (V1) and obtain a 7% gain in performance compared to the previous state-of-the-art network. We then investigate whether the convolutional core indeed captures general cortical features by using it in transfer learning to a different animal. When transferring a core trained on thousands of neurons from various animals and scans, we exceed the performance of training directly on that animal by 12%, and outperform a commonly used VGG16 core pre-trained on ImageNet by 33%. In addition, transfer learning with our data-driven core is more data-efficient than direct training, achieving the same performance with only 40% of the data. Our model with its novel readout thus sets a new state of the art for neural response prediction in mouse visual cortex from natural images, generalizes between animals, and captures characteristic cortical features better than current task-driven pre-training approaches such as VGG16.
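As a rough illustration of the readout idea described above, the following minimal PyTorch sketch learns one 2-D receptive-field position (with positional uncertainty) and one channel-weight vector per neuron on top of a shared core. All names, shapes and the output nonlinearity are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianReadout(nn.Module):
    """Per-neuron readout: a learned Gaussian over receptive-field position
    plus a weight vector over core feature channels (illustrative sketch only)."""
    def __init__(self, n_neurons, n_channels):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_neurons, 2))         # RF centres in [-1, 1]^2
        self.log_sigma = nn.Parameter(torch.zeros(n_neurons, 2))  # positional uncertainty
        self.weights = nn.Parameter(torch.randn(n_neurons, n_channels) * 0.01)
        self.bias = nn.Parameter(torch.zeros(n_neurons))

    def forward(self, features):              # features: (B, C, H, W) from the shared core
        B = features.shape[0]
        if self.training:                     # sample a position per neuron (reparameterised)
            pos = self.mu + torch.randn_like(self.mu) * self.log_sigma.exp()
        else:
            pos = self.mu
        grid = pos.clamp(-1, 1).view(1, -1, 1, 2).expand(B, -1, -1, -1)  # (B, N, 1, 2)
        feats = F.grid_sample(features, grid, align_corners=True)        # (B, C, N, 1)
        feats = feats.squeeze(-1).permute(0, 2, 1)                       # (B, N, C)
        return F.elu((feats * self.weights).sum(-1) + self.bias) + 1     # non-negative rates

readout = GaussianReadout(n_neurons=5000, n_channels=64)
rates = readout(torch.randn(8, 64, 36, 64))   # dummy core output -> (8, 5000) predicted rates
```

With only a 2-D position and a channel-weight vector per neuron, the parameter count per neuron is dominated by the number of core channels rather than the full spatial feature map, which is where the claimed reduction comes from.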

2017 ◽  
Author(s):  
Santiago A. Cadena ◽  
George H. Denfield ◽  
Edgar Y. Walker ◽  
Leon A. Gatys ◽  
Andreas S. Tolias ◽  
...  

Abstract: Despite great efforts over several decades, our best models of primary visual cortex (V1) still predict spiking activity quite poorly when probed with natural stimuli, highlighting our limited understanding of the nonlinear computations in V1. Recently, two approaches based on deep learning have been successfully applied to neural data: on the one hand, transfer learning from networks trained on object recognition worked remarkably well for predicting neural responses in higher areas of the primate ventral stream, but has not yet been used to model spiking activity in early stages such as V1. On the other hand, data-driven models have been used to predict neural responses in the early visual system (retina and V1) of mice, but not primates. Here, we test the ability of both approaches to predict spiking activity in response to natural images in V1 of awake monkeys. Even though V1 sits at an early to intermediate stage of the visual system, we found that the transfer learning approach performed similarly well to the data-driven approach, and both outperformed classical linear-nonlinear and wavelet-based feature representations that build on existing theories of V1. Notably, transfer learning using a pre-trained feature space required substantially less experimental time to achieve the same performance. In conclusion, multi-layer convolutional neural networks (CNNs) set the new state of the art for predicting neural responses to natural images in primate V1, and deep features learned for object recognition are better explanations for V1 computation than all previous filter-bank theories. This finding strengthens the necessity of V1 models that are multiple nonlinearities away from the image domain, and it supports the idea of explaining early visual cortex based on high-level functional goals.
Author summary: Predicting the responses of sensory neurons to arbitrary natural stimuli is of major importance for understanding their function. Arguably the most studied cortical area is primary visual cortex (V1), where many models have been developed to explain its function. However, the most successful models built on neurophysiologists' intuitions still fail to account for spiking responses to natural images. Here, we model spiking activity in primary visual cortex (V1) of monkeys using deep convolutional neural networks (CNNs), which have been successful in computer vision. We both trained CNNs directly to fit the data and used CNNs trained to solve a high-level task (object categorization). With these approaches, we are able to outperform previous models and improve the state of the art in predicting the responses of early visual neurons to natural images. Our results have two important implications. First, since V1 is the result of several nonlinear stages, it should be modeled as such. Second, functional models of entire visual pathways, of which V1 is an early stage, not only account for higher areas of such pathways but also provide useful representations for V1 predictions.
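A hedged sketch of the transfer-learning variant described above: freeze a pre-trained VGG16 up to an intermediate convolutional layer and fit only a linear readout per neuron with a Poisson loss. The layer cut-off, the number of neurons (166) and the unfactorized readout are assumptions for illustration, not the configuration used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16

core = vgg16(pretrained=True).features[:17].eval()        # up to the third pooling stage (assumption)
for p in core.parameters():
    p.requires_grad = False                               # pre-trained features stay fixed

n_neurons = 166                                           # hypothetical recording size
readout = nn.Sequential(nn.Flatten(), nn.Linear(256 * 28 * 28, n_neurons))

def poisson_loss(pred, spikes):
    rate = F.softplus(pred)                               # keep predicted rates positive
    return (rate - spikes * torch.log(rate + 1e-8)).mean()

images = torch.randn(8, 3, 224, 224)                      # dummy natural-image batch
spikes = torch.poisson(torch.ones(8, n_neurons))          # dummy spike counts
loss = poisson_loss(readout(core(images)), spikes)
loss.backward()                                           # gradients reach only the readout
```

Because only the readout is trained, far fewer recorded responses are needed, which is consistent with the abstract's point about reduced experimental time.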


2021 ◽  
Vol 11 (24) ◽  
pp. 11974
Author(s):  
Shijie Zhang ◽  
Gang Wu

Logs, which record system runtime information, are frequently used to ensure software system reliability. As the first and foremost step of typical log analysis, many data-driven methods have been proposed for automated log parsing. Most existing log parsers work offline, requiring a time-consuming training process and retraining as the system upgrades. Meanwhile, the state-of-the-art online log parsers are tree-based, which still have defects in robustness and efficiency. To overcome such limitations, we abandon the tree structure and propose a hash-like method. In this paper, we propose LogPunk, an efficient online log parsing method. The core of LogPunk is a novel log signature method based on log punctuation and length features. According to the signature, we can quickly find a small set of candidate templates. Further, the most suitable template is returned by traversing the candidate set with our log similarity function. We evaluated LogPunk on 16 public datasets from LogHub, comparing it with five other log parsers. LogPunk achieves the best parsing accuracy of 91.9%. Evaluation results also demonstrate its superiority in terms of robustness and efficiency.
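A minimal Python sketch of the signature-then-similarity idea: a hash-like key built from token count and punctuation profile narrows the search to a small candidate set, which is then scanned with a token-level similarity function. The exact signature definition, similarity threshold and template update rule here are assumptions, not LogPunk's published implementation.

```python
import string
from collections import defaultdict

PUNCT = set(string.punctuation)

def signature(log_line):
    tokens = log_line.split()
    puncts = "".join(sorted(c for c in log_line if c in PUNCT))
    return (len(tokens), puncts)                  # hash-like key: length + punctuation profile

def similarity(tokens, template):
    if len(tokens) != len(template):
        return 0.0
    same = sum(1 for a, b in zip(tokens, template) if a == b or b == "<*>")
    return same / len(tokens)

templates = defaultdict(list)                     # signature -> candidate templates

def parse(log_line):
    key, tokens = signature(log_line), log_line.split()
    best, best_sim = None, 0.0
    for tpl in templates[key]:                    # only the small candidate set is traversed
        sim = similarity(tokens, tpl)
        if sim > best_sim:
            best, best_sim = tpl, sim
    if best is None or best_sim < 0.5:            # threshold is an assumption
        templates[key].append(tokens)             # start a new template group
        return " ".join(tokens)
    for i, (a, b) in enumerate(zip(tokens, best)):
        if a != b:
            best[i] = "<*>"                       # generalise variable positions in the template
    return " ".join(best)

print(parse("Connection from 10.0.0.1 closed"))
print(parse("Connection from 10.0.0.2 closed"))  # -> "Connection from <*> closed"
```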


Author(s):  
Seungwhan Moon ◽  
Jaime Carbonell

We study a transfer learning framework where source and target datasets are heterogeneous in both feature and label spaces. Specifically, we do not assume explicit relations between source and target tasks a priori, and thus it is crucial to determine what and what not to transfer from the source knowledge. Towards this goal, we define a new heterogeneous transfer learning approach that (1) selects and attends to an optimized subset of source samples to transfer knowledge from, and (2) builds a unified transfer network that learns from both source and target knowledge. This method, termed "Attentional Heterogeneous Transfer", along with a newly proposed unsupervised transfer loss, improves upon the previous state-of-the-art approaches on extensive simulations as well as a challenging hetero-lingual text classification task.
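The sketch below only illustrates the general idea of attending over source samples after projecting heterogeneous feature spaces into a shared one; the paper's actual attention mechanism, sample-selection step and unsupervised transfer loss are not reproduced, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class SourceAttention(nn.Module):
    def __init__(self, src_dim, tgt_dim, shared_dim):
        super().__init__()
        self.src_proj = nn.Linear(src_dim, shared_dim)    # map heterogeneous feature spaces
        self.tgt_proj = nn.Linear(tgt_dim, shared_dim)    # into a common one

    def forward(self, src_feats, tgt_feats):
        s = self.src_proj(src_feats)                      # (Ns, d) projected source samples
        t = self.tgt_proj(tgt_feats)                      # (Nt, d) projected target samples
        attn = torch.softmax(t @ s.T / s.shape[-1] ** 0.5, dim=-1)   # (Nt, Ns) sample weights
        transferred = attn @ s                            # attended source knowledge per target
        return torch.cat([t, transferred], dim=-1)        # input to a unified transfer network

fused = SourceAttention(300, 50, 128)(torch.randn(1000, 300), torch.randn(64, 50))
print(fused.shape)                                        # torch.Size([64, 256])
```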


2020 ◽  
Author(s):  
Abhinav Sagar ◽  
J Dheeba

Abstract: In this work, we address the problem of skin cancer classification using convolutional neural networks. Many cancer cases are misdiagnosed early on as something else, leading to severe consequences, including the death of the patient. Conversely, there are cases in which patients have some other condition but are suspected of having skin cancer, which leads to unnecessary time and money spent on further diagnosis. In this work, we address both of the above problems using deep neural networks and a transfer learning architecture. We use the publicly available ISIC databases for both training and testing our model. Our work achieves an accuracy of 0.935, precision of 0.94, recall of 0.77, F1 score of 0.85 and ROC-AUC of 0.861, which is better than the previous state-of-the-art approaches.
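For reference, the metrics reported above can be computed as in the following sketch; the predictions and labels here are dummy values, not the paper's results, and the 0.5 decision threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical model outputs on a held-out ISIC-style test split.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.8, 0.35, 0.2, 0.9, 0.05, 0.7])
y_pred = (y_prob >= 0.5).astype(int)          # threshold the predicted probabilities

print("accuracy ", accuracy_score(y_true, y_pred))
print("precision", precision_score(y_true, y_pred))
print("recall   ", recall_score(y_true, y_pred))
print("F1       ", f1_score(y_true, y_pred))
print("ROC-AUC  ", roc_auc_score(y_true, y_prob))   # AUC uses the raw probabilities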


Author(s):  
Joel Dapello ◽  
Tiago Marques ◽  
Martin Schrimpf ◽  
Franziska Geiger ◽  
David D. Cox ◽  
...  

Abstract: Current state-of-the-art object recognition models are largely based on convolutional neural network (CNN) architectures, which are loosely inspired by the primate visual system. However, these CNNs can be fooled by imperceptibly small, explicitly crafted perturbations, and struggle to recognize objects in corrupted images that are easily recognized by humans. Here, by making comparisons with primate neural data, we first observed that CNN models with a neural hidden layer that better matches primate primary visual cortex (V1) are also more robust to adversarial attacks. Inspired by this observation, we developed VOneNets, a new class of hybrid CNN vision models. Each VOneNet contains a fixed weight neural network front-end that simulates primate V1, called the VOneBlock, followed by a neural network back-end adapted from current CNN vision models. The VOneBlock is based on a classical neuroscientific model of V1: the linear-nonlinear-Poisson model, consisting of a biologically-constrained Gabor filter bank, simple and complex cell nonlinearities, and a V1 neuronal stochasticity generator. After training, VOneNets retain high ImageNet performance, but each is substantially more robust, outperforming the base CNNs and state-of-the-art methods by 18% and 3%, respectively, on a conglomerate benchmark of perturbations comprised of white-box adversarial attacks and common image corruptions. Finally, we show that all components of the VOneBlock work in synergy to improve robustness. While current CNN architectures are arguably brain-inspired, the results presented here demonstrate that more precisely mimicking just one stage of the primate visual system leads to new gains in ImageNet-level computer vision applications.
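The following toy sketch shows the shape of such a fixed front-end: a Gabor filter bank, rectified "simple cells", quadrature-pair "complex cells", and Poisson-like stochasticity. The filter grid, sizes and output scaling are toy assumptions and do not reproduce the published, biologically constrained VOneBlock.

```python
import math
import torch
import torch.nn.functional as F

def gabor(ksize, sigma, theta, wavelength, phase):
    """One Gabor filter (toy parameterisation)."""
    ax = torch.arange(ksize) - ksize // 2
    y, x = torch.meshgrid(ax, ax, indexing="ij")
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = torch.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * torch.cos(2 * math.pi * xr / wavelength + phase)

thetas = [i * math.pi / 8 for i in range(8)]
bank = torch.stack([gabor(25, 4.0, th, 8.0, ph)               # fixed (non-trained) weights
                    for th in thetas for ph in (0.0, math.pi / 2)])
weight = bank.unsqueeze(1)                                     # (16, 1, 25, 25)

def vone_block(images):                                        # grayscale (B, 1, H, W) for simplicity
    lin = F.conv2d(images, weight, padding=12)                 # Gabor filter bank
    simple = F.relu(lin[:, 0::2])                              # rectified "simple cells"
    complex_ = torch.sqrt(lin[:, 0::2] ** 2 + lin[:, 1::2] ** 2 + 1e-8)  # quadrature "complex cells"
    rates = torch.cat([simple, complex_], dim=1)
    return torch.poisson(rates)                                # stochastic, spike-count-like output

features = vone_block(torch.rand(2, 1, 64, 64))                # then fed to a trainable CNN back-end
```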


2019 ◽  
Vol 11 (3) ◽  
pp. 280 ◽  
Author(s):  
Yongyong Fu ◽  
Kunkun Liu ◽  
Zhangquan Shen ◽  
Jinsong Deng ◽  
Muye Gan ◽  
...  

Impervious surfaces play an important role in urban planning and sustainable environmental management. High-spatial-resolution (HSR) images containing pure pixels have significant potential for the detailed delineation of land surfaces. However, due to high intraclass variability and low interclass distance, the mapping and monitoring of impervious surfaces in complex town–rural areas using HSR images remains a challenge. The fully convolutional network (FCN) model, a variant of convolutional neural networks (CNNs), recently achieved state-of-the-art performance in HSR image classification applications. However, due to the inherent nature of FCN processing, it is challenging for an FCN to precisely capture the detailed information of classification targets. To solve this problem, we propose an object-based deep CNN framework that integrates object-based image analysis (OBIA) with deep CNNs to accurately extract and estimate impervious surfaces. We also adopted two widely used transfer learning techniques to expedite the training of the deep CNNs. Finally, we compare our approach with conventional OBIA classification and state-of-the-art FCN-based methods, namely FCN-8s and U-Net, both of which are well designed for pixel-wise classification and have achieved great success. Our results show that the proposed approach effectively identified impervious surfaces, with 93.9% overall accuracy, a clear improvement over the existing OBIA, FCN-8s and U-Net methods. Our findings also suggest that the classification performance of our proposed method depends on the training strategy: significantly higher accuracy can be achieved through transfer learning by fine-tuning rather than by feature extraction. Our approach for the automatic extraction and mapping of impervious surfaces also lays a solid foundation for intelligent monitoring and the management of land use and land cover.
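A short sketch of the two transfer strategies being compared, with a hypothetical backbone (the paper's actual CNNs and patch pipeline are not reproduced): feature extraction freezes the pre-trained convolutional layers, while fine-tuning lets them keep adapting to the impervious-surface data.

```python
import torch.nn as nn
from torchvision import models

def build(strategy="fine_tuning", n_classes=2):
    net = models.vgg16(pretrained=True)            # hypothetical backbone choice
    if strategy == "feature_extraction":
        for p in net.features.parameters():
            p.requires_grad = False                # convolutional features stay fixed
    # under "fine_tuning", all layers keep adapting to the new imagery
    net.classifier[6] = nn.Linear(4096, n_classes) # new head: impervious vs pervious
    return net
```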


2018 ◽  
Vol 64 ◽  
pp. 37-53
Author(s):  
Ugo Boscain ◽  
Roman Chertovskih ◽  
Jean-Paul Gauthier ◽  
Dario Prandi ◽  
Alexey Remizov

In this paper we review several algorithms for image inpainting based on the hypoelliptic diffusion naturally associated with a mathematical model of the primary visual cortex. In particular, we present one algorithm that does not exploit information about where the image is corrupted, and others that do. While the first algorithm is able to reconstruct only images that our visual system is still capable of recognizing, we show that those of the second type completely transcend this limitation, providing reconstructions at the state of the art in image inpainting. This can be interpreted as a validation of the fact that our visual cortex actually encodes the first type of algorithm.
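For reference, a hedged sketch of the diffusion such algorithms build on, in a Citti-Petitot-Sarti-style lift of the image to positions and orientations; the exact constants, anisotropy and discretisation used in the reviewed algorithms are not reproduced here.

```latex
% One common formulation of the cortical model (constants omitted):
% the corrupted image is lifted to the space of positions and orientations
% and then evolved by the hypoelliptic heat equation
\[
  \partial_t u = \left(X_1^2 + X_2^2\right) u ,
  \qquad
  X_1 = \cos\theta\,\partial_x + \sin\theta\,\partial_y ,
  \quad
  X_2 = \partial_\theta ,
\]
% where $u(t, x, y, \theta)$ lives on $\mathbb{R}^2 \times S^1$, mimicking
% orientation-selective columns in V1; the operator is hypoelliptic because
% the missing direction is recovered only through the bracket $[X_1, X_2]$.
```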


2021 ◽  
Vol 5 (4) ◽  
pp. 37-53
Author(s):  
Zurana Mehrin Ruhi ◽  
Sigma Jahan ◽  
Jia Uddin

In the fourth industrial revolution, data-driven intelligent fault diagnosis for industrial purposes plays a crucial role. Although deep learning is currently a popular approach for fault diagnosis, it requires massive amounts of labelled samples for training, which are arduous to come by in the real world. Our contribution, a novel comprehensive intelligent fault detection model evaluated on the Case Western Reserve University dataset, is divided into two steps. First, a new hybrid signal decomposition methodology is developed, comprising Empirical Mode Decomposition and Variational Mode Decomposition, to leverage signal information from both processes for effective feature extraction. Second, transfer learning with DenseNet121 is employed to alleviate the constraints of deep learning models. Our proposed technique not only surpassed previous results but also achieved state-of-the-art performance as measured by the F1 score.
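A rough sketch of the two-step pipeline under stated assumptions: the EMD/VMD calls assume the third-party packages PyEMD (EMD-signal) and vmdpy with their usual call signatures, and the number of fault classes and the way the stacked modes are turned into 2-D network inputs are hypothetical choices, not the authors' settings.

```python
import numpy as np
import torch.nn as nn
from torchvision import models
from PyEMD import EMD      # assumed dependency: pip install EMD-signal
from vmdpy import VMD      # assumed dependency: pip install vmdpy

def hybrid_decompose(signal, n_modes=4):
    """Stack modes from EMD and VMD for one vibration segment (hypothetical fusion rule)."""
    imfs = EMD().emd(signal)                                   # empirical mode decomposition
    vmfs, _, _ = VMD(signal, alpha=2000, tau=0.0, K=n_modes,
                     DC=0, init=1, tol=1e-7)                   # variational mode decomposition
    return np.vstack([imfs[:n_modes], vmfs])                   # features from both processes

# Step two: transfer learning with DenseNet121; 10 CWRU fault classes is an assumption.
model = models.densenet121(pretrained=True)
for p in model.features.parameters():
    p.requires_grad = False                                    # keep pre-trained features fixed
model.classifier = nn.Linear(model.classifier.in_features, 10)
```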


2018 ◽  
Author(s):  
Ján Antolík ◽  
Cyril Monier ◽  
Yves Frégnac ◽  
Andrew P. Davison

Abstract: Knowledge integration based on the relationship between structure and function of the neural substrate is one of the main targets of neuroinformatics and data-driven computational modeling. However, the multiplicity of data sources, the diversity of benchmarks, the mixing of observables of different natures, and the necessity of a long-term, systematic approach make such a task challenging. Here we present a first snapshot of a long-term integrative modeling program designed to address this issue: a comprehensive spiking model of cat primary visual cortex satisfying an unprecedented range of anatomical, statistical and functional constraints under a wide range of visual input statistics. In the presence of physiological levels of tonic stochastic bombardment by spontaneous thalamic activity, the modeled cortical reverberations self-generate a sparse asynchronous ongoing activity that quantitatively matches a range of experimentally measured statistics. When integrating feed-forward drive elicited by a high diversity of visual contexts, the simulated network produces a realistic, quantitatively accurate interplay between visually evoked excitatory and inhibitory conductances; contrast-invariant orientation-tuning width; center-surround interactions; and stimulus-dependent changes in the precision of the neural code. This integrative model offers numerous insights into how the studied properties interact, contributing to a better understanding of visual cortical dynamics. It provides a basis for future development towards a comprehensive model of low-level perception.
Significance statement: Computational modeling can integrate fragments of understanding generated by experimental neuroscience. However, most previous models considered only a few features of neural computation at a time, leading to either poorly constrained models with many parameters, or lack of expressiveness in over-simplified models. A solution is to commit to detailed models, but constrain them with a broad range of anatomical and functional data. This requires a long-term systematic approach. Here we present a first snapshot of such an integrative program: a large-scale spiking model of V1, that is constrained by an unprecedented range of anatomical and functional features. Together with the associated modeling infrastructure, this study lays the groundwork for a broad integrative modeling program seeking an in-depth understanding of vision.


2021 ◽  
Vol 10 (4) ◽  
pp. 1-20
Author(s):  
Malcolm Doering ◽  
Dražen Brščić ◽  
Takayuki Kanda

Data-driven imitation learning enables service robots to learn social interaction behaviors, but these systems cannot adapt after training to changes in the environment, such as changing products in a store. To solve this, a novel learning system that uses neural attention and approximate string matching to copy information from a product information database to its output is proposed. A camera shop interaction dataset was simulated for training/testing. The proposed system was found to outperform a baseline and a previous state of the art in an offline, human-judged evaluation.
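As a minimal illustration of the copy-from-database idea, the sketch below uses Python's difflib for approximate string matching against product names; this is our library choice for the sketch, and the product entries, threshold and matching scheme are hypothetical, not the paper's attention-based system.

```python
import difflib

# Hypothetical camera-shop product database.
product_db = {"EOS R6": "full-frame mirrorless, 20 MP",
              "PowerShot G7X": "compact, 20 MP, 4.2x zoom"}

def lookup(mention):
    """Copy the best-matching database entry for a (possibly misspelled) product mention."""
    names = list(product_db)
    match = difflib.get_close_matches(mention, names, n=1, cutoff=0.6)
    return product_db[match[0]] if match else None

print(lookup("EOS R-6"))   # approximate match still copies the right database entry
```

Because the matching is approximate rather than exact, the system can keep working when product names in the database change or are written slightly differently in customer utterances, which is the adaptation problem the paper targets.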

