Attentional Reinforcement Learning in the Brain

2020 ◽  
Vol 38 (1) ◽  
pp. 49-64 ◽  
Author(s):  
Hiroshi Yamakawa

Abstract: Recently, attention mechanisms have significantly boosted the performance of natural language processing using deep learning. An attention mechanism can select the information to be used, for example by conducting a dictionary lookup; this information is then used, for instance, to select the next word to utter in a sentence. In neuroscience, the basis of the function of sequentially selecting words is considered to be the cortico-basal ganglia-thalamocortical loop. Here, we first show that the attention mechanism used in deep learning corresponds to the mechanism by which the basal ganglia suppress thalamic relay cells in the brain. Next, we show that, in neuroscience, the output of the basal ganglia is associated with the action output of the actor in reinforcement learning. On this basis, we show that the aforementioned loop can be generalized as reinforcement learning that controls the transmission of the prediction signal so as to maximize the prediction reward. We call this attentional reinforcement learning (ARL). In ARL, the actor selects the information transmission route according to the attention, and the prediction signal changes according to the context detected by the information source of the route. Hence, ARL enables flexible action selection that depends on the situation, unlike traditional reinforcement learning, in which the actor must directly select an action.
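
To make the dictionary-lookup analogy concrete, here is a minimal sketch (not the authors' implementation) of attention as a soft dictionary lookup: a query is scored against stored keys, and the stored values are mixed in proportion to the match.

```python
# A minimal sketch of attention as a soft dictionary lookup.
import numpy as np

def soft_lookup(query, keys, values):
    """Return a mix of values, weighted by query-key similarity."""
    scores = keys @ query / np.sqrt(query.shape[0])  # one score per key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over keys
    return weights @ values                          # convex mix of values

rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 8))              # 5 dictionary entries, 8-dim keys
values = rng.normal(size=(5, 16))           # 16-dim stored values
query = keys[2] + 0.1 * rng.normal(size=8)  # query close to entry 2
out = soft_lookup(query, keys, values)      # out is dominated by values[2]
```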

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Venkateswara Rao Kota ◽  
Shyamala Devi Munisamy

Purpose: A neural network (NN)-based deep learning (DL) approach is considered for sentiment analysis (SA), incorporating a convolutional neural network (CNN), bi-directional long short-term memory (Bi-LSTM) and attention methods. Unlike conventional supervised machine learning NLP algorithms, the authors use unsupervised deep learning algorithms.

Design/methodology/approach: The method presented for sentiment analysis is designed using CNN, Bi-LSTM and the attention mechanism. Word2vec word embedding is used for natural language processing (NLP). The approach is designed for sentence-level SA and consists of one embedding layer, two convolutional layers with max-pooling, one LSTM layer and two fully connected (FC) layers. Overall, the system training time is 30 minutes.

Findings: The method's performance is analyzed using metrics such as precision, recall, F1 score and accuracy. The CNN helps reduce complexity, and the Bi-LSTM helps process long input text sequences.

Originality/value: The attention mechanism is adopted to decide the significance of every hidden state and produce a weighted sum of all the input features.
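
A minimal PyTorch sketch of the stated stack (embedding, two convolutional layers with max-pooling, an LSTM layer, attention, and two FC layers) might look as follows; all layer sizes are hypothetical, as the abstract does not specify them, and the LSTM is made bidirectional per the stated design.

```python
# A hypothetical rendering of the described architecture, not the authors' code.
import torch
import torch.nn as nn

class SentimentNet(nn.Module):
    def __init__(self, vocab=20000, emb=128, hid=64, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv1 = nn.Conv1d(emb, 128, kernel_size=3, padding=1)
        self.conv2 = nn.Conv1d(128, 128, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(2)
        self.lstm = nn.LSTM(128, hid, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hid, 1)             # scores each time step
        self.fc1 = nn.Linear(2 * hid, 64)
        self.fc2 = nn.Linear(64, classes)

    def forward(self, tokens):                       # tokens: (batch, seq)
        x = self.emb(tokens).transpose(1, 2)         # (batch, emb, seq)
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        h, _ = self.lstm(x.transpose(1, 2))          # (batch, seq', 2*hid)
        w = torch.softmax(self.att(h), dim=1)        # attention weights
        ctx = (w * h).sum(dim=1)                     # weighted sum of states
        return self.fc2(torch.relu(self.fc1(ctx)))

logits = SentimentNet()(torch.randint(0, 20000, (4, 60)))  # shape (4, 2)
```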


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Mengqi Luo ◽  
Zhongyan Li ◽  
Shangfu Li ◽  
Tzong-Yi Lee

Abstract

Background: Ubiquitylation is an important post-translational modification of proteins that not only plays a central role in cellular coding, but is also closely associated with the development of a variety of diseases. The specific selection of substrates by the E3 ligase is the key step in ubiquitylation. As various high-throughput analytical techniques continue to be applied to the study of ubiquitylation, large amounts of ubiquitylation site data and records of E3-substrate interactions continue to be generated. The biomedical literature is an important vehicle for information on E3-substrate interactions in ubiquitylation and related new discoveries, as well as an important channel for researchers to obtain such up-to-date data. The continuing explosion of ubiquitylation-related literature poses a great challenge for researchers in acquiring and analyzing this information. Automatic annotation of E3-substrate interaction sentences in the available literature is therefore urgently needed.

Results: In this research, we propose a model based on representation learning and attention-based deep learning methods to automatically annotate E3-substrate interaction sentences in the biomedical literature. Focusing on sentences containing an E3 protein, we applied several natural language processing methods and a Long Short-Term Memory (LSTM)-based deep learning classifier to train the model. Experimental results demonstrate the effectiveness of the proposed model, and the attention-based deep learning method outperforms other statistical machine learning methods. To construct the model, we also created a manually curated corpus of E3-substrate interaction sentences in which the E3 proteins and substrate proteins are labeled. The corpus and model produced by this research should be a useful and valuable resource for advancing ubiquitylation-related research.

Conclusion: Having the entire manually curated corpus of E3-substrate interaction sentences readily available in electronic form will greatly facilitate subsequent text mining and machine learning analyses. Automatic annotation of sentences stating an E3 ligase-substrate interaction benefits significantly from semantic representation and deep learning. The model enables rapid information access and can assist in further screening of key ubiquitylation ligase substrates for in-depth studies.
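
As a rough illustration of the annotation pipeline, the sketch below filters sentences mentioning an E3 ligase and hands candidates to a classifier; `E3_NAMES` and `classify` are hypothetical stand-ins for the curated protein list and the trained LSTM model.

```python
# Hypothetical pipeline sketch: filter, then classify candidate sentences.
E3_NAMES = {"MDM2", "Parkin", "TRIM21"}   # illustrative subset only

def candidate_sentences(text):
    """Yield sentences that mention at least one known E3 ligase."""
    for sentence in text.split(". "):
        tokens = set(sentence.replace(",", " ").split())
        if tokens & E3_NAMES:
            yield sentence

def classify(sentence):
    # Stand-in for the trained LSTM classifier: does the sentence state
    # an E3-substrate interaction?
    return "ubiquitinates" in sentence or "ubiquitylates" in sentence

abstract = ("MDM2 ubiquitinates p53 and targets it for degradation. "
            "The cell cycle is tightly regulated.")
hits = [s for s in candidate_sentences(abstract) if classify(s)]
```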


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Sunil Kumar Prabhakar ◽  
Harikumar Rajaguru ◽  
Dong-Ok Won

Over the past few decades, text classification has been widely used in many real-time applications, and leveraging text classification methods to develop new applications in text mining and Natural Language Processing (NLP) is very important. To classify tasks accurately in many applications, a deeper insight into deep learning methods is required, as the number of complex documents is growing exponentially. The success of any deep learning algorithm depends on its capacity to capture the nonlinear relationships within complex data. Thus, a major challenge for researchers lies in developing suitable techniques, architectures, and models for text classification. In this paper, hybrid deep learning models are analyzed for text classification, with an emphasis on the positioning of the attention mechanism. The first proposed hybrid model is the convolutional Bidirectional Long Short-Term Memory (Bi-LSTM) with attention mechanism and output (CBAO) model, and the second is the convolutional attention mechanism with Bi-LSTM and output (CABO) model. In the first hybrid model, the attention mechanism is placed after the Bi-LSTM, followed by the output Softmax layer. In the second, the attention mechanism is placed after the convolutional layer and followed by the Bi-LSTM and the output Softmax layer. The proposed hybrid models are tested on three datasets; when the proposed CBAO model is applied to the IMDB dataset it achieves a high classification accuracy of 92.72%, and when the proposed CABO model is applied to the same dataset it achieves 90.51%.
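
The difference between the two placements can be sketched as follows (a hypothetical PyTorch rendering with illustrative sizes, not the authors' code): CBAO applies attention to the Bi-LSTM states, while CABO applies it to the convolutional features before the Bi-LSTM.

```python
# Hypothetical contrast of the CBAO and CABO attention placements.
import torch
import torch.nn as nn

def attend(h, score):                       # h: (batch, seq, dim)
    w = torch.softmax(score(h), dim=1)      # one weight per time step
    return w * h                            # reweighted sequence

conv = nn.Conv1d(128, 128, kernel_size=3, padding=1)
lstm = nn.LSTM(128, 64, batch_first=True, bidirectional=True)
score_cbao = nn.Linear(128, 1)              # scores Bi-LSTM states (2*64)
score_cabo = nn.Linear(128, 1)              # scores conv features
out = nn.Linear(128, 2)                     # output layer before Softmax

x = torch.randn(4, 60, 128)                 # embedded input (batch, seq, emb)
f = torch.relu(conv(x.transpose(1, 2))).transpose(1, 2)

h_cbao, _ = lstm(f)                         # CBAO: conv -> Bi-LSTM -> attention
logits_cbao = out(attend(h_cbao, score_cbao).sum(dim=1))

h_cabo, _ = lstm(attend(f, score_cabo))     # CABO: conv -> attention -> Bi-LSTM
logits_cabo = out(h_cabo.sum(dim=1))
```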


2021 ◽  
Vol 15 ◽  
Author(s):  
Anup Tuladhar ◽  
Jasmine A. Moore ◽  
Zahinoor Ismail ◽  
Nils D. Forkert

Deep neural networks, inspired by information processing in the brain, can achieve human-like performance for various tasks. However, research efforts to use these networks as models of the brain have so far focused primarily on modeling healthy brain function. In this work, we propose a paradigm for modeling neural diseases in silico with deep learning and demonstrate its use in modeling posterior cortical atrophy (PCA), an atypical form of Alzheimer’s disease affecting the visual cortex. We simulated PCA in deep convolutional neural networks (DCNNs) trained for visual object recognition by randomly injuring connections between artificial neurons. Results showed that injured networks progressively lost their object recognition capability. Simulated PCA impacted learned representations hierarchically, as networks lost object-level representations before category-level representations. Incorporating this paradigm in computational neuroscience will be essential for developing in silico models of the brain and neurological diseases. The paradigm can be expanded to incorporate elements of neural plasticity and extended to other cognitive domains, such as motor control, auditory cognition, language processing, and decision making.
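
The injury procedure can be sketched as follows (a hypothetical rendering, with a torchvision classifier standing in for the paper's DCNNs): a growing fraction of connection weights is permanently zeroed, and recognition accuracy is re-measured after each step.

```python
# Hypothetical sketch of progressive injury: each step permanently zeroes a
# random 5% of connection weights (including some already-zeroed ones).
import torch
from torchvision.models import resnet18

model = resnet18(weights=None)  # in practice, load a trained classifier

def injure(model, fraction):
    """Permanently zero a random fraction of each layer's connections."""
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:  # connection weights only, not biases/BN scales
                mask = (torch.rand_like(p) >= fraction).float()
                p.mul_(mask)

for _ in range(10):          # ten injury steps
    injure(model, 0.05)
    # ...re-evaluate object recognition accuracy after each step...
```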


Author(s):  
Gowhar Mohiuddin Dar ◽  
Ashok Sharma ◽  
Parveen Singh

The chapter explores the implications of deep learning in the medical sciences, focusing on natural language processing, computer vision, reinforcement learning, big data, and the influence of blockchain on some areas of medicine, and on the construction of end-to-end systems with the help of these computational techniques. The discussion of computer vision is mainly concerned with medical imaging, with natural language processing further applied to areas such as electronic health record data. The application of deep learning to genetic mapping and DNA sequencing (genomics) and the implications of reinforcement learning for robot-assisted surgery are also reviewed.


Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 292 ◽  
Author(s):  
Md Zahangir Alom ◽  
Tarek M. Taha ◽  
Chris Yakopcic ◽  
Stefan Westberg ◽  
Paheding Sidike ◽  
...  

In recent years, deep learning has achieved tremendous success in a variety of application domains. This field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as to new areas that present further opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and unsupervised learning. Experimental results show state-of-the-art performance for deep learning over traditional machine learning approaches in image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many other fields. This paper presents a brief survey of the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN) including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), the Auto-Encoder (AE), the Deep Belief Network (DBN), the Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we discuss recent developments, such as advanced variants of these DL techniques. This work considers most of the papers published since 2012, when the modern era of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey, along with recently developed frameworks, SDKs, and benchmark datasets used for implementing and evaluating deep learning approaches. Some surveys have been published on DL using neural networks, and there is a survey on Reinforcement Learning (RL); however, those papers have not discussed the individual advanced techniques for training large-scale deep learning models or the recently developed methods for generative models.


Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3327
Author(s):  
Will Serrano

This paper presents a Deep Learning (DL) Cluster Structure for management decisions that emulates the way the brain learns and makes choices by combining different learning algorithms. The proposed model is based on Random Neural Network (RNN) Reinforcement Learning for fast local decisions and Deep Learning for long-term memory. The Deep Learning Cluster Structure has been applied in the Cognitive Packet Network (CPN) for routing decisions based on Quality of Service (QoS) metrics (delay, loss and bandwidth) and cybersecurity keys (user, packet and node), and includes a layer of DL management clusters (QoS, Cyber and CEO) that take the final routing decision based on the inputs from the DL QoS clusters and the RNN Reinforcement Learning algorithm. The model has been validated under different network sizes and scenarios. The simulation results are promising: the presented DL cluster management structure, as a mechanism to transmit, learn and make packet routing decisions, is a step closer to emulating the way the brain transmits information, learns from its environment and makes decisions.
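
A loose, hypothetical sketch of reinforcement-style route selection over QoS metrics (not the paper's Random Neural Network algorithm) is shown below: each route keeps a score, a route is chosen greedily with occasional exploration, and the chosen route's score is nudged toward the observed reward.

```python
# Every name and number here is illustrative; rewards are simulated.
import random

scores = {"route_a": 0.5, "route_b": 0.5, "route_c": 0.5}
ALPHA = 0.2   # learning rate

def reward(delay_ms, loss, bandwidth_mbps):
    # Higher reward for low delay, low loss, and high bandwidth.
    return bandwidth_mbps / (1.0 + delay_ms * (1.0 + loss))

def pick_route(eps=0.1):
    if random.random() < eps:                  # occasional exploration
        return random.choice(list(scores))
    return max(scores, key=scores.get)         # otherwise exploit best score

for _ in range(100):
    route = pick_route()
    # Stand-in for the QoS observed after routing a packet on `route`:
    r = reward(delay_ms=random.uniform(5, 50),
               loss=random.uniform(0.0, 0.1),
               bandwidth_mbps=random.uniform(10, 100))
    scores[route] += ALPHA * (r - scores[route])   # nudge score toward reward
```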


2021 ◽  
Vol 11 (3) ◽  
pp. 736-751
Author(s):  
Donggeon Oh ◽  
Bohyoung Kim ◽  
Jeongjin Lee ◽  
Yeong-Gil Shin

In non-rigid registration for medical image analysis, computation is complicated, and the high accuracy and robustness needed for registration are difficult to obtain. Recently, many studies have addressed non-rigid registration via unsupervised learning networks. This study proposes a method to improve the performance of such unsupervised learning networks through the use of a self-attention mechanism. In this paper, the self-attention mechanism is combined with deep learning networks to identify the most important information within large amounts of data and thereby solve specific tasks. Furthermore, the proposed method extracts both local and non-local information, so that the network can create feature vectors carrying more information. As a result, a limitation of the existing network is addressed: instead of aligning based solely on the overall silhouette of the brain, the network also learns to register the parts of the brain that have internal structural characteristics. To the best of our knowledge, this is the first use of the attention mechanism in an unsupervised learning network for non-rigid registration. The proposed attention network performs registration that takes into account the overall characteristics of the data, yielding more accurate matching results than existing methods. In particular, matching is achieved with especially high accuracy in the gray matter and cortical ventricle areas, since these areas contain many of the structural features of the brain. The experiment was performed on 3D magnetic resonance images of the brains of 50 people. The measured average Dice similarity coefficient after registration was 70.40%, an improvement of 17.48% over that before registration, with the attention block contributing an additional 8.5% improvement relative to the same network without it. Ultimately, through non-rigid registration via the attention block method, the internal structure and overall shape of the brain can be addressed without additional data input. Additionally, attention blocks can easily be connected to existing networks without significant computational overhead. Furthermore, by producing an attention map, the brain areas on which registration concentrated can be visualized. This approach can be used for non-rigid registration with various types of medical imaging data.
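
A self-attention (non-local) block of the general kind described can be sketched as follows for 3D feature maps (hypothetical channel sizes, not the authors' code): each voxel's feature is recomputed as a weighted sum over all voxels, mixing local and non-local information, and the attention matrix itself is what an attention map visualizes.

```python
# Hypothetical self-attention block for 3D feature maps; residual output
# keeps the input shape.
import torch
import torch.nn as nn

class SelfAttention3D(nn.Module):
    def __init__(self, channels):
        super().__init__()
        inner = channels // 2
        self.q = nn.Conv3d(channels, inner, 1)     # query projection
        self.k = nn.Conv3d(channels, inner, 1)     # key projection
        self.v = nn.Conv3d(channels, channels, 1)  # value projection

    def forward(self, x):                          # x: (B, C, D, H, W)
        b, c, *spatial = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (B, N, inner), N = D*H*W
        k = self.k(x).flatten(2)                   # (B, inner, N)
        v = self.v(x).flatten(2).transpose(1, 2)   # (B, N, C)
        att = torch.softmax(q @ k / k.shape[1] ** 0.5, dim=-1)  # (B, N, N)
        out = (att @ v).transpose(1, 2).reshape(b, c, *spatial)
        return x + out                             # residual connection

feats = torch.randn(1, 16, 8, 8, 8)                # a small 3D feature map
mixed = SelfAttention3D(16)(feats)                 # same shape as feats
```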


2017 ◽  
Author(s):  
Toshitake Asabuki ◽  
Naoki Hiratani ◽  
Tomoki Fukai

Abstract: Interpretation and execution of complex sequences is crucial for various cognitive tasks such as language processing and motor control. The brain arguably solves this problem by dividing a sequence into discrete chunks of contiguous items. While chunking has been accounted for by predictive uncertainty, alternative mechanisms have also been suggested, and the mechanism underlying chunking is poorly understood. Here, we propose a class of unsupervised neural networks for learning and identifying repeated patterns in sequence input with various degrees of complexity. In this model, a pair of reservoir computing modules, each comprising a recurrent neural network and readout units, supervise each other to consistently predict each other's responses to frequently recurring segments. Interestingly, this system generates neural responses similar to those formed in the basal ganglia during habit formation. Our model extends reservoir computing to higher cognitive function and demonstrates its resemblance to sequence processing by cortico-basal ganglia loops.
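
The paired-module idea can be loosely sketched with echo-state reservoirs, as below. All sizes, inputs, and the learning rule are hypothetical simplifications; in particular, the paper's model includes mechanisms, omitted here, that prevent the trivial solution in which both readouts output a constant.

```python
# Hypothetical sketch: two fixed random reservoirs read the same stream, and
# each trainable readout is nudged online to predict the partner's readout.
import numpy as np

rng = np.random.default_rng(1)
N, ETA = 100, 0.01

def make_reservoir():
    W = 0.9 * rng.normal(size=(N, N)) / np.sqrt(N)  # fixed recurrent weights
    w_in = rng.normal(size=N)                       # fixed input weights
    return W, w_in, np.zeros(N)

W1, win1, x1 = make_reservoir()
W2, win2, x2 = make_reservoir()
w_out1, w_out2 = np.zeros(N), np.zeros(N)           # trainable readouts

sequence = np.tile([0.0, 1.0, 1.0, 0.0, 1.0], 200)  # repeating "chunk"
for u in sequence:
    x1 = np.tanh(W1 @ x1 + win1 * u)                # reservoir state updates
    x2 = np.tanh(W2 @ x2 + win2 * u)
    y1, y2 = w_out1 @ x1, w_out2 @ x2               # each module's readout
    w_out1 += ETA * (y2 - y1) * x1                  # predict the partner
    w_out2 += ETA * (y1 - y2) * x2
```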


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Zhen Liu ◽  
XiaoQiang Di ◽  
Wei Song ◽  
WeiWu Ren

Relation classification is an important semantic processing task in the field of natural language processing (NLP). Large-scale training data are generally generated automatically through distant supervision, which inevitably introduces label-noise problems. A further challenge is that the important information can appear anywhere in a sentence. This paper presents a sentence-level joint relation classification model with two modules: a reinforcement learning (RL) agent and a joint network model. In particular, we combine bidirectional long short-term memory (Bi-LSTM) with an attention mechanism as the joint model to process the textual features of sentences and classify the relation between two entities, where the attention mechanism discovers hidden information in the sentences. Joint training of the two modules addresses the noise problem in relation extraction, sentence-level information extraction, and relation classification. Experimental results demonstrate that the model can effectively deal with data noise and achieves better relation classification performance at the sentence level.
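
A loose, hypothetical sketch of the joint scheme: an RL selector decides whether each (possibly noisy) sentence is kept, and the classifier's confidence in the distantly supervised label serves as the reward in a REINFORCE update. `encode` and `label_confidence` are stand-ins for the Bi-LSTM-with-attention model.

```python
# Hypothetical selector: a logistic policy over sentence features,
# updated by REINFORCE against the classifier's confidence.
import numpy as np

rng = np.random.default_rng(0)
D = 32
theta = np.zeros(D)                                  # selector parameters

def encode(sentence):
    # Stand-in for Bi-LSTM-with-attention sentence features.
    return rng.normal(size=D)

def label_confidence(feats):
    # Stand-in for the classifier's probability of the distant label.
    return float(np.clip(0.5 + 0.1 * feats[0], 0.0, 1.0))

for sentence in ["noisy sentence"] * 100:
    f = encode(sentence)
    p_keep = 1.0 / (1.0 + np.exp(-theta @ f))        # probability of keeping
    keep = rng.random() < p_keep
    reward = label_confidence(f) if keep else 0.5    # 0.5 acts as a baseline
    grad = (1.0 - p_keep) * f if keep else -p_keep * f   # d log pi / d theta
    theta += 0.01 * (reward - 0.5) * grad            # REINFORCE update
```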

