Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features

Author(s):  
Jessica A.F. Thompson ◽  
Yoshua Bengio ◽  
Elia Formisano ◽  
Marc Schönwiesner

Abstract The correspondence between the activity of artificial neurons in convolutional neural networks (CNNs) trained to recognize objects in images and neural activity collected throughout the primate visual system has been well documented. Shallower layers of CNNs are typically more similar to early visual areas and deeper layers tend to be more similar to later visual areas, providing evidence for a shared representational hierarchy. This phenomenon has not been thoroughly studied in the auditory domain. Here, we compared the representations of CNNs trained to recognize speech (triphone recognition) to 7-Tesla fMRI activity collected throughout the human auditory pathway, including subcortical and cortical regions, while participants listened to speech. We found no evidence for a shared representational hierarchy of acoustic speech features. Instead, all auditory regions of interest were most similar to a single layer of the CNNs: the first fully-connected layer. This layer sits at the boundary between the relatively task-general intermediate layers and the highly task-specific final layers. This suggests that alternative architectural designs and/or training objectives may be needed to achieve fine-grained layer-wise correspondence with the human auditory pathway.

Highlights
- Trained CNNs more similar to auditory fMRI activity than untrained
- No evidence of a shared representational hierarchy for acoustic features
- All ROIs were most similar to the first fully-connected layer
- CNN performance on speech recognition task positively associated with fMRI similarity
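The layer-to-region comparison described in this abstract can be illustrated with a minimal sketch of representational similarity analysis (RSA), one common way to quantify how similar a CNN layer's representation is to an fMRI region of interest. The abstract does not specify the exact similarity metric used, so the function names, array shapes, and the Spearman-correlation choice below are illustrative assumptions.

```python
# Hypothetical sketch: layer-wise representational similarity analysis (RSA).
# Assumes responses to the same stimuli from a CNN layer and an fMRI ROI.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(responses):
    """Representational dissimilarity matrix; responses is (n_stimuli, n_units_or_voxels)."""
    return pdist(responses, metric="correlation")  # condensed upper triangle

def layer_roi_similarity(layer_activations, roi_voxels):
    """Spearman correlation between a CNN layer's RDM and an fMRI ROI's RDM."""
    rho, _ = spearmanr(rdm(layer_activations), rdm(roi_voxels))
    return rho

# Example with random stand-in data for 100 speech stimuli:
layers = {f"layer_{i}": np.random.randn(100, 256) for i in range(8)}
roi = np.random.randn(100, 500)  # voxel responses for one auditory ROI
scores = {name: layer_roi_similarity(act, roi) for name, act in layers.items()}
best_layer = max(scores, key=scores.get)  # layer most similar to this ROI
```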

2020 ◽  
Author(s):  
Kai J. Sandbrink ◽  
Pranav Mamidanna ◽  
Claudio Michaelis ◽  
Mackenzie Weygandt Mathis ◽  
Matthias Bethge ◽  
...  

Biological motor control is versatile and efficient. Muscles are flexible and undergo continuous changes, requiring distributed adaptive control mechanisms. How proprioception solves this problem in the brain is unknown. Here we pursue a task-driven modeling approach that has provided important insights into other sensory systems. However, unlike for vision and audition, where large annotated datasets of raw images or sound are readily available, data of relevant proprioceptive stimuli are not. We generated a large-scale dataset of human arm trajectories as the hand traces the alphabet in 3D space, and then used a musculoskeletal model to derive the spindle firing rates during these movements. We propose an action recognition task that allows training of hierarchical models to classify the character identity from the spindle firing patterns. Artificial neural networks could robustly solve this task, and the networks’ units show directional movement tuning akin to neurons in the primate somatosensory cortex. The same architectures with random weights also show similar kinematic feature tuning but do not reproduce the diversity of preferred directional tuning, nor do they have invariant tuning across 3D space. Taken together, our model is the first to link tuning properties in the proprioceptive system to the behavioral level.

Highlights
- We provide a normative approach to derive neural tuning of proprioceptive features from behaviorally-defined objectives.
- We propose a method for creating a scalable muscle spindles dataset based on kinematic data and define an action recognition task as a benchmark.
- Hierarchical neural networks solve the recognition task from muscle spindle inputs.
- Individual neural network units in middle layers resemble neurons in primate somatosensory cortex and make predictions for neurons along the proprioceptive pathway.
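As a rough illustration of the action-recognition setup described above, here is a hypothetical sketch of a small hierarchical network that classifies character identity (26 classes) from muscle-spindle firing-rate traces. The layer sizes, number of spindles, and sequence length are assumptions for illustration and do not come from the paper.

```python
# Hypothetical sketch: classify character identity from simulated spindle firing rates.
import torch
import torch.nn as nn

class SpindleNet(nn.Module):
    def __init__(self, n_spindles=25, n_classes=26):
        super().__init__()
        self.features = nn.Sequential(            # hierarchical temporal convolutions
            nn.Conv1d(n_spindles, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),               # pool over time
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):                          # x: (batch, n_spindles, time)
        return self.classifier(self.features(x).squeeze(-1))

model = SpindleNet()
dummy = torch.randn(8, 25, 320)                    # batch of simulated firing-rate traces
logits = model(dummy)                              # (8, 26) class scores
```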


2018 ◽  
Vol 23 (2) ◽  
pp. 141-149 ◽  
Author(s):  
Vadim Romanuke

Abstract A complex classification task, scene recognition, is considered in the present research. Scene recognition tasks are successfully solved by the paradigm of transfer learning from pretrained convolutional neural networks, but a problem is that the eventual size of the network is huge even though a common scene recognition task has only up to a few tens of scene categories. Thus, the goal is to ascertain the possibility of a size reduction. The modelling recognition task is a small dataset of 4485 grayscale images broken into 15 image categories. The pretrained network is AlexNet, which deals with much simpler image categories, although their number is 1000. This network has two fully connected layers, which can potentially be reduced or deleted. A regular transfer learning network occupies about 202.6 MB and performs at up to a 92 % accuracy rate for the scene recognition. It is revealed that deleting the layers is not reasonable. The network size is reduced by setting fewer filters in the 17th and 20th layers of the AlexNet-based networks using a dichotomy principle or similar. The best truncated network, with 384 and 192 filters in those layers, performs at a 93.3 % accuracy rate, and its size is 21.63 MB.
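The truncation idea can be sketched with torchvision's AlexNet by shrinking the two fully connected layers to the 384 and 192 units reported above before transfer learning on 15 scene categories. Note that the original work refers to MATLAB's AlexNet layer numbering (layers 17 and 20), and the sketch below rebuilds rather than prunes the classifier, so it is an illustrative approximation, not the study's exact procedure.

```python
# Hypothetical sketch: AlexNet transfer learning with reduced fully connected layers.
import torch.nn as nn
from torchvision import models

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
n_scene_classes = 15

alexnet.classifier = nn.Sequential(
    nn.Dropout(0.5),
    nn.Linear(256 * 6 * 6, 384), nn.ReLU(inplace=True),  # reduced fc6 (384 units)
    nn.Dropout(0.5),
    nn.Linear(384, 192), nn.ReLU(inplace=True),           # reduced fc7 (192 units)
    nn.Linear(192, n_scene_classes),                       # new scene classifier
)

# Freeze the pretrained convolutional features; only the new classifier is trained.
for p in alexnet.features.parameters():
    p.requires_grad = False
```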


2020 ◽  
Author(s):  
Manik Dhingra ◽  
Sarthak Rawat ◽  
Jinan Fiaidhi

The work presented here focuses on achieving higher performance for an image recognition task using convolutional neural networks on the MNIST handwritten digits dataset. A range of techniques are compared for improvements with respect to time and accuracy, such as using one-shot Extreme Learning Machines (ELM) in place of iteratively tuned fully-connected networks for classification, using transfer learning for faster convergence of image classification, and increasing the size of the dataset and making models more robust through image augmentation. The final implementation is hosted on the cloud as a web service for better visualization of the prediction results.
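For concreteness, below is a minimal, hypothetical sketch of the one-shot Extreme Learning Machine idea mentioned above: a fixed random hidden projection whose output weights are solved in closed form rather than tuned iteratively. The hidden size, activation, and the use of raw flattened pixels are illustrative assumptions.

```python
# Hypothetical sketch of a one-shot Extreme Learning Machine (ELM) classifier.
import numpy as np

class ELM:
    def __init__(self, n_inputs, n_hidden=1024, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W = rng.standard_normal((n_inputs, n_hidden))  # fixed random projection
        self.b = rng.standard_normal(n_hidden)
        self.beta = None                                     # output weights (learned)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, n_classes=10):
        H = self._hidden(X)
        T = np.eye(n_classes)[y]                             # one-hot targets
        self.beta = np.linalg.pinv(H) @ T                    # one-shot least-squares solve
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)

# e.g. on flattened 28x28 MNIST images scaled to [0, 1]:
# elm = ELM(784).fit(X_train, y_train); acc = (elm.predict(X_test) == y_test).mean()
```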


1994 ◽  
Vol 37 (3) ◽  
Author(s):  
G. Romeo

Pattern recognition belongs to a class of problems which are easily solved by humans but are difficult for computers. It is sometimes difficult to formalize a problem which a human operator can easily understand by using examples. Neural networks are useful in solving this kind of problem. A neural network may, under certain conditions, simulate a well-trained human operator in recognizing different types of earthquakes or in detecting the presence of a seismic event. It is then shown how a fully connected multilayer perceptron may perform a recognition task. It is also shown how a self-training auto-associative neural network may detect an earthquake occurrence by analysing changes in signal characteristics.
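As a rough illustration of the auto-associative detection idea, the following hypothetical sketch uses an autoencoder trained to reconstruct windows of background seismic noise and flags windows whose reconstruction error indicates a change in signal characteristics. The window length, architecture, and thresholding scheme are assumptions, not details from the article.

```python
# Hypothetical sketch: auto-associative (autoencoder) event detection on seismogram windows.
import torch
import torch.nn as nn

window = 128  # samples per seismogram window (assumed)
autoencoder = nn.Sequential(
    nn.Linear(window, 32), nn.Tanh(),  # bottleneck forces a compact model of noise
    nn.Linear(32, window),
)

def detect_events(windows, threshold):
    """windows: (n_windows, window) tensor of normalized seismogram segments."""
    with torch.no_grad():
        errors = ((autoencoder(windows) - windows) ** 2).mean(dim=1)
    return errors > threshold  # True where signal characteristics changed

# Training the autoencoder on noise-only windows (MSE loss) and choosing the threshold
# from the distribution of noise reconstruction errors are omitted for brevity.
```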


Author(s):  
Mattson Ogg ◽  
L. Robert Slevc

Music and language are uniquely human forms of communication. What neural structures facilitate these abilities? This chapter reviews music and language processing by following these acoustic signals as they ascend the auditory pathway from the brainstem to auditory cortex and on to more specialized cortical regions. Acoustic, neural, and cognitive mechanisms are identified where processing demands from both domains might overlap, with an eye to examples of experience-dependent cortical plasticity, which are taken as strong evidence for common neural substrates. Following an introduction describing how understanding musical processing informs linguistic or auditory processing more generally, findings regarding the major components (and parallels) of music and language research are reviewed: pitch perception, syntax and harmonic structural processing, semantics, timbre and speaker identification, attending in auditory scenes, and rhythm. Overall, the strongest evidence that currently exists for neural overlap (and cross-domain, experience-dependent plasticity) is in the brainstem, followed by auditory cortex, with evidence and the potential for overlap becoming less apparent as the mechanisms involved in music and speech perception become more specialized and distinct at higher levels of processing.


2021 ◽  
Vol 17 (4) ◽  
pp. 1-26
Author(s):  
Md Musabbir Adnan ◽  
Sagarvarma Sayyaparaju ◽  
Samuel D. Brown ◽  
Mst Shamim Ara Shawkat ◽  
Catherine D. Schuman ◽  
...  

Spiking neural networks (SNN) offer a power-efficient, biologically plausible learning paradigm by encoding information into spikes. The discovery of the memristor has accelerated the progress of spiking neuromorphic systems, as the intrinsic plasticity of the device makes it an ideal candidate to mimic a biological synapse. Despite providing a nanoscale form factor, non-volatility, and low-power operation, memristors suffer from device-level non-idealities, which impact system-level performance. To address these issues, this article presents a memristive crossbar-based neuromorphic system using unsupervised learning with twin-memristor synapses, fully digital pulse-width-modulated spike-timing-dependent plasticity, and homeostasis neurons. The implemented single-layer SNN was applied to a pattern-recognition task of classifying handwritten digits. The performance of the system was analyzed by varying design parameters such as the number of training epochs, neurons, and capacitors. Furthermore, the impact of memristor device non-idealities, such as device-switching mismatch, aging, failure, and process variations, was investigated and the resilience of the proposed system was demonstrated.
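For readers unfamiliar with spike-timing-dependent plasticity, the following hypothetical software sketch shows the generic pair-based STDP rule that hardware such as the system above realizes with pulse-width-modulated signals. The constants and exponential form are textbook assumptions, not the paper's circuit-level implementation.

```python
# Hypothetical sketch: pair-based STDP weight update from pre/post spike timing.
import numpy as np

A_PLUS, A_MINUS = 0.01, 0.012  # potentiation / depression magnitudes (assumed)
TAU = 20.0                      # STDP time constant in ms (assumed)

def stdp_update(w, t_pre, t_post, w_min=0.0, w_max=1.0):
    """Update synaptic weight w given pre- and post-synaptic spike times (ms)."""
    dt = t_post - t_pre
    if dt > 0:                                  # pre before post -> potentiate
        w += A_PLUS * np.exp(-dt / TAU)
    else:                                       # post before pre -> depress
        w -= A_MINUS * np.exp(dt / TAU)
    return float(np.clip(w, w_min, w_max))      # keep weight within device limits

w = stdp_update(0.5, t_pre=10.0, t_post=15.0)   # causal pairing strengthens the synapse
```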


Author(s):  
Naoki Matsumura ◽  
Yasuaki Ito ◽  
Koji Nakano ◽  
Akihiko Kasagi ◽  
Tsuguchika Tabaru

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2005
Author(s):  
Veronika Scholz ◽  
Peter Winkler ◽  
Andreas Hornig ◽  
Maik Gude ◽  
Angelos Filippatos

Damage identification of composite structures is a major ongoing challenge for a secure operational life-cycle due to the complex, gradual damage behaviour of composite materials. Especially for composite rotors in aero-engines and wind-turbines, a cost-intensive maintenance service has to be performed in order to avoid critical failure. A major advantage of composite structures is that they are able to safely operate after damage initiation and under ongoing damage propagation. Therefore, a robust, efficient diagnostic damage identification method would allow monitoring the damage process, with intervention occurring only when necessary. This study investigates the structural vibration response of composite rotors using machine learning methods and their ability to identify, localise and quantify the present damage. To this end, multiple fully connected neural networks and convolutional neural networks were trained on vibration response spectra from damaged composite rotors with barely visible damage, mostly matrix cracks and local delaminations, using dimensionality reduction and data augmentation. A databank containing 720 simulated test cases with different damage states is used as a basis for the generation of multiple data sets. The trained models are tested using k-fold cross-validation and are evaluated based on sensitivity, specificity and accuracy. Convolutional neural networks perform slightly better, providing an accuracy of up to 99.3% for damage localisation and quantification.
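The evaluation protocol described above (k-fold cross-validation scored by sensitivity, specificity and accuracy) can be sketched as follows for the binary damage-identification case. The stand-in MLP classifier and spectral feature matrix are assumptions for illustration, not the study's actual models or data.

```python
# Hypothetical sketch: k-fold cross-validation with sensitivity/specificity/accuracy.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix

def evaluate(X, y, n_splits=5):
    """X: (n_cases, n_frequency_bins) vibration spectra; y: binary damaged/undamaged labels."""
    scores = []
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train, test in skf.split(X, y):
        model = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)
        model.fit(X[train], y[train])
        tn, fp, fn, tp = confusion_matrix(y[test], model.predict(X[test])).ravel()
        scores.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / (tp + tn + fp + fn),
        })
    return scores
```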


2016 ◽  
Vol 182 ◽  
pp. 154-164 ◽  
Author(s):  
Junfei Qiao ◽  
Fanjun Li ◽  
Honggui Han ◽  
Wenjing Li
