Double Multi-Head Attention-Based Capsule Network for Relation Classification

Mapping Intimacies ◽

10.5121/csit.2021.110711 ◽

2021 ◽

Author(s):

Hongjun Heng ◽

Renjie Li

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Language Processing ◽

Layer Structure ◽

Single Layer ◽

Network Models ◽

Classification Model ◽

Neural Network Models ◽

Comparable Performance ◽

Relation Classification

Semantic relation classification is an important task in the field of natural language processing. Existing neural network relation classification models introduce attention mechanisms to increase the importance of significant features, but some of these attention models have only one head, which is not enough to capture more distinctive fine-grained features. Models based on RNNs (Recurrent Neural Networks) usually use a single-layer structure and have limited feature extraction capability. Current RNN-based capsule networks handle noise improperly, which increases the complexity of the network. Therefore, we propose a capsule network relation classification model based on double multi-head attention. In this model, we introduce an auxiliary BiGRU (Bidirectional Gated Recurrent Unit) to make up for the limited feature extraction performance of a single BiGRU, improve the bilinear attention through a double multi-head mechanism so that the model obtains more sentence information from different representation subspaces, and instantiate capsules with sentence-level features to alleviate the impact of noise. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model outperforms most previous state-of-the-art neural network models and achieves comparable performance among capsule networks, with an F1 score of 85.3%.
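
The idea of letting several attention heads read different representation subspaces can be sketched as below. This is only an illustration under stated assumptions: it uses plain dot-product attention rather than the paper's bilinear attention, it has no learned projections, and the names `attention_pool` and `multi_head_pool` and the toy inputs are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(states, query):
    """Weighted sum of `states`, with weights softmax(state . query)."""
    scores = [sum(s_i * q_i for s_i, q_i in zip(s, query)) for s in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]

def multi_head_pool(states, queries, num_heads):
    """Split each state into `num_heads` equal slices (subspaces),
    pool each slice with its own query, and concatenate the results."""
    head_dim = len(states[0]) // num_heads
    pooled = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head_states = [s[lo:hi] for s in states]
        pooled.extend(attention_pool(head_states, queries[h]))
    return pooled

# Toy hidden states for a two-token sentence (hypothetical values).
states = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
# All-zero queries give uniform attention weights in every head.
sentence_vector = multi_head_pool(states, [[0.0, 0.0], [0.0, 0.0]], num_heads=2)
```

Each head sees only its own slice of the hidden state, so different heads can weight the same tokens differently.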

Toxic Comments Classification using Neural Network

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g1005.0597s20 ◽

2020 ◽

Vol 9 (7S) ◽

pp. 12-15

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Natural Language Processing ◽

Language Processing ◽

Network Models ◽

Classification Problem ◽

The Internet ◽

Neural Network Models ◽

Feature Extraction Technique ◽

Large Corpus

Humans have developed many ways of expressing their thoughts through a variety of media. The internet has not only become a credible method for expressing one's thoughts, but is also rapidly becoming the single largest means of doing so. In this context, one area of focus is the study of negative online behaviors of users, such as toxic comments that contain threats, obscenity, insults, and abuse. The task of identifying and removing toxic communication from public forums is critical, yet analyzing a large corpus of comments is infeasible for human moderators. Our approach is to use Natural Language Processing (NLP) techniques to provide an efficient and accurate tool to detect online toxicity. We apply the TF-IDF feature extraction technique and neural network models to tackle a toxic comment classification problem with a labeled dataset from Wikipedia Talk Pages.
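
The TF-IDF feature extraction step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the smoothed IDF formula, the tokenizer, and the three example comments are assumptions, and the neural network classifier that would consume these vectors is omitted.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphabetic runs only (a deliberately simple tokenizer).
    return re.findall(r"[a-z']+", text.lower())

def tfidf_vectors(docs):
    """Return (sorted vocabulary, list of {term: tf-idf weight} dicts)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF; the exact formula is an assumption, not from the abstract.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return sorted(df), vectors

# Hypothetical comments, for illustration only.
comments = [
    "Thanks for the helpful edit",
    "You are an idiot",
    "Thanks, you are helpful",
]
vocab, vecs = tfidf_vectors(comments)
```

Terms that appear in few comments (here "idiot") receive a larger weight than terms shared across comments (here "you"), which is what makes the representation useful as classifier input.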

Constructive Learning of Deep Neural Networks for Bigdata Analysis

International Journal of Computer Applications Technology and Research ◽

10.7753/ijcatr0912.1001 ◽

2020 ◽

Vol 9 (12) ◽

pp. 311-322

Author(s):

Soha Abd Mohamed El-Moamen ◽

Marghany Hassan Mohamed ◽

Mohammed F. Farghally

Keyword(s):

Neural Network ◽

Lung Cancer ◽

Binary Classification ◽

Network Models ◽

Classification Model ◽

Neural Network Models ◽

Constructive Learning ◽

The Neural Network ◽

Rapid Pace ◽

Better Than

The need for tracking and evaluation of patients in real time has contributed to an increase in knowing people’s actions to enhance care facilities. Deep learning can both process big healthcare data at a rapid pace and make good predictions for the early detection of lung cancer. In this paper, we propose a constructive deep neural network with Apache Spark to classify images and levels of lung cancer. We developed a binary classification model that uses a threshold technique to classify nodules as benign or malignant. In the proposed framework, the training of the neural network models, defined using the Keras API, is performed using BigDL on a distributed Spark cluster. The proposed algorithm achieves an AUC of 0.9810 and a misclassification rate from which it has been shown that our suggested classifiers perform better than other classifiers.
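
The threshold step of such a binary classifier can be illustrated as below. This is a sketch under assumptions: the 0.5 cut-off, the function names, and the example scores are hypothetical, and the Keras, BigDL, and Spark parts of the described framework are not shown.

```python
def classify_nodules(scores, threshold=0.5):
    """Map predicted malignancy scores to labels using a fixed cut-off.
    The default threshold is an assumption, not a value from the paper."""
    return ["malignant" if s >= threshold else "benign" for s in scores]

def misclassification_rate(predicted, actual):
    # Fraction of predictions that disagree with the reference labels.
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)

# Hypothetical model outputs and reference labels.
scores = [0.10, 0.50, 0.90, 0.40]
labels = ["benign", "malignant", "malignant", "malignant"]
predicted = classify_nodules(scores)
rate = misclassification_rate(predicted, labels)
```

Moving the threshold trades false positives against false negatives, which is why a threshold-independent metric such as AUC is usually reported alongside the misclassification rate.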

The relational processing limits of classic and contemporary neural network models of language processing

10.32470/ccn.2019.1022-0 ◽

2019 ◽

Author(s):

Guillermo Puebla ◽

Andrea Martin ◽

Leonidas Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

The relational processing limits of classic and contemporary neural network models of language processing

Language Cognition and Neuroscience ◽

10.1080/23273798.2020.1821906 ◽

2020 ◽

pp. 1-15

Author(s):

Guillermo Puebla ◽

Andrea E. Martin ◽

Leonidas A. A. Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

Comparison of rule-based and neural network models for negation detection in radiology reports

Natural Language Engineering ◽

10.1017/s1351324920000509 ◽

2020 ◽

pp. 1-22 ◽

Cited By ~ 2

Author(s):

D. Sykes ◽

A. Grivas ◽

C. Grover ◽

R. Tobin ◽

C. Sudlow ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Network Models ◽

Neural Network Models ◽

Test Set ◽

Rule Based ◽

Radiology Reports ◽

The Neural Network ◽

Negation Detection

Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, the accurate distinction between negated and non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated not to be present or only hypothesised, meaning a disease can be mentioned in a report when it is not being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline called EdIE-R (Edinburgh Information Extraction for Radiology reports, https://www.ltg.ed.ac.uk/software/edie-r/) developed to process brain imaging reports, and two machine learning approaches: one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on data from routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009. Journal of Biomedical Informatics 42(5), 839–851), a Python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017. NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the test set of the dataset from which our models were developed and the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods. 
EdIE-R-Neg scored highest on F1 score, particularly on the test set of the Tayside dataset, from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap of the machine learning models to EdIE-R-Neg on the Tayside test set was reduced through adding development Tayside data into the ESS training set, demonstrating the adaptability of the neural network models.
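
To make the rule-based idea concrete, here is a minimal trigger-window sketch in the general spirit of NegEx-style systems. It is not EdIE-R-Neg, pyConText, or NegBio: the trigger list, the window size, and the function name `is_negated` are assumptions chosen for illustration.

```python
import re

# Illustrative trigger phrases only; real systems use much larger lexicons.
NEGATION_TRIGGERS = ("no", "not", "without", "absence of")

def is_negated(sentence, term, window=5):
    """Return True if a trigger occurs within `window` tokens before `term`."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    term_tokens = term.lower().split()
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == term_tokens:
            # Pad with spaces so triggers only match whole tokens.
            preceding = " " + " ".join(tokens[max(0, i - window):i]) + " "
            if any(f" {trigger} " in preceding for trigger in NEGATION_TRIGGERS):
                return True
    return False

negated = is_negated("There is no evidence of acute infarct.", "infarct")
affirmed = is_negated("Old infarct in the left hemisphere.", "infarct")
```

A fixed token window is a crude stand-in for negation scope; systems such as NegBio instead derive the scope from a syntactic representation of the sentence, as the abstract notes.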

Identifying User Suitability in sEMG Based Hand Prosthesis Using Neural Networks

Current Signal Transduction Therapy ◽

10.2174/1574362413666180604100542 ◽

2019 ◽

Vol 14 (2) ◽

pp. 158-164 ◽

Cited By ~ 9

Author(s):

G. Emayavaramban ◽

A. Amudha ◽

T. Rajendran ◽

M. Sivaramkumar ◽

K. Balachandar ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Pattern Recognition ◽

Feature Extraction ◽

Classification Accuracy ◽

Network Models ◽

Prosthetic Hand ◽

Neural Network Models ◽

The Neural Network ◽

Semg Signals

Background: Identifying user suitability plays a vital role in various modalities like neuromuscular system research, rehabilitation engineering and movement biomechanics. This paper analyses user suitability based on neural networks (NN), subjects, age groups and gender for a surface electromyogram (sEMG) pattern recognition system to control the myoelectric hand. Six parametric feature extraction algorithms are used to extract the features from sEMG signals: AR (Autoregressive) Burg, AR Yule Walker, AR Covariance, AR Modified Covariance, Levinson Durbin Recursion and Linear Prediction Coefficient. The sEMG signals are modeled using a Cascade Forward Back propagation Neural Network (CFBNN) and a Pattern Recognition Neural Network. Methods: sEMG signals generated from the forearm muscles of the participants are collected through an sEMG acquisition system. Based on the sEMG signals, the type of movement attempted by the user is identified in the sEMG recognition module using signal processing, feature extraction and machine learning techniques. The information about the identified movement is passed to a microcontroller, wherein a control is developed to command the prosthetic hand to emulate the identified movement. Results: Of the six feature extraction algorithms and two neural network models used in the study, the maximum classification accuracy of 95.13% was obtained using AR Burg with the Pattern Recognition Neural Network. This suggests that the Pattern Recognition Neural Network is best suited for this study, as the neural network model is specially designed for pattern matching problems. Moreover, it has a simple architecture and low computational complexity. AR Burg is found to be the best feature extraction technique in this study due to its high resolution for short data records and its ability to always produce a stable model. 
In all the neural network models, the maximum classification accuracy is obtained for subject 10 as a result of his better muscle fitness and his maximum involvement in training sessions. Subjects in the age group of 26-30 years are best suited for the study due to their better muscle contractions. Better muscle fatigue resistance has contributed to the better performance of female subjects as compared to male subjects. From the single trial analysis, it can be observed that the hand close movement achieved the best recognition rate for all neural network models. Conclusion: In this paper a study was conducted to identify user suitability for designing a hand prosthesis. Data were collected from ten subjects for twelve tasks related to finger movements. The suitability of the user was identified using two neural networks with six parametric features. From the results, it was concluded that fit women aged between 26-30 years who do regular physical exercise are best suited for developing the HMI for designing a prosthetic hand. A Pattern Recognition Neural Network with AR Burg extraction features using extension movements will be a better way to design the HMI. However, signal acquisition based on a wireless method is worth considering for the future.
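
One of the parametric feature extractors listed above, the Levinson-Durbin recursion, can be sketched as follows. This is a textbook form of the recursion, not the authors' implementation; the function name and the toy autocorrelation sequence are assumptions, and the step that estimates autocorrelation from a recorded sEMG window is omitted.

```python
def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for an AR model of the given order.

    `r` holds autocorrelation values r[0..order]. Returns (coeffs, error),
    where x[n] is predicted as sum(coeffs[k] * x[n - 1 - k]) and `error`
    is the final prediction error power.
    """
    coeffs = []
    error = r[0]
    for m in range(order):
        # Reflection coefficient for extending the model from order m to m + 1.
        acc = r[m + 1] - sum(coeffs[k] * r[m - k] for k in range(m))
        reflection = acc / error
        # Update the existing coefficients, then append the new one.
        coeffs = [coeffs[k] - reflection * coeffs[m - 1 - k] for k in range(m)]
        coeffs.append(reflection)
        error *= 1.0 - reflection * reflection
    return coeffs, error

# Toy autocorrelation of a first-order AR process with coefficient 0.5.
ar_coeffs, pred_error = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The resulting coefficient vector is what would be fed to a classifier as the feature vector for one signal window.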

Finding Fuzziness in Neural Network Models of Language Processing

Explainable AI and Other Applications of Fuzzy Techniques - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-82099-2_25 ◽

2021 ◽

pp. 278-290

Author(s):

Kanishka Misra ◽

Julia Taylor Rayz

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Neural Network Models

A Primer on Neural Network Models for Natural Language Processing

Journal of Artificial Intelligence Research ◽

10.1613/jair.4992 ◽

2016 ◽

Vol 57 ◽

pp. 345-420 ◽

Cited By ~ 233

Author(s):

Yoav Goldberg

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Speech Processing ◽

Network Models ◽

Neural Network Models ◽

Convolutional Networks ◽

The Past ◽

Gradient Computation

Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.
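
The computation graph abstraction for automatic gradient computation can be illustrated with a very small reverse-mode example. This is a toy sketch, not code from the tutorial: the `Node` class, the `backward` function, and the choice of supporting only addition and multiplication are assumptions.

```python
class Node:
    """A scalar value in a computation graph, with room for its gradient."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent_node, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

def backward(output):
    """Propagate d(output)/d(node) to every node that feeds into `output`."""
    order, seen = [], set()

    def visit(node):
        # Post-order depth-first search yields a topological ordering.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad  # chain rule

x, w, b = Node(2.0), Node(3.0), Node(1.0)
y = x * w + b
backward(y)
```

After the backward pass, `x.grad` holds the value of `w` and `w.grad` holds the value of `x`, matching the derivatives of `x * w + b` worked out by hand.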

Aspect-Level Sentiment Analysis Based on Position Features Using Multilevel Interactive Bidirectional GRU and Attention Mechanism

Discrete Dynamics in Nature and Society ◽

10.1155/2020/5824873 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Xiaodi Wang ◽

Xiaoliang Chen ◽

Mingwei Tang ◽

Tian Yang ◽

Zhen Wang

Keyword(s):

Neural Network ◽

Sentiment Analysis ◽

Network Models ◽

Attention Mechanism ◽

Sentiment Classification ◽

Classification Model ◽

Neural Network Models ◽

Position Information ◽

Gated Recurrent Unit ◽

Relative Position Information

The aim of aspect-level sentiment analysis is to identify the sentiment polarity of a given target term in sentences. Existing neural network models provide a useful account of how to judge the polarity. However, the relative position of context words with respect to the target terms is often ignored because of the limitations of training datasets. Incorporating position features between words into the models can improve the accuracy of sentiment classification. Hence, this study proposes an improved classification model that combines a multilevel interactive bidirectional Gated Recurrent Unit (GRU), attention mechanisms, and position features (MI-biGRU). Firstly, the position features of words in a sentence are initialized to enrich the word embedding. Secondly, the approach extracts the features of target terms and context by using a well-constructed multilevel interactive bidirectional neural network. Thirdly, an attention mechanism is introduced so that the model can pay greater attention to those words that are important for sentiment analysis. Finally, four classic sentiment classification datasets are used to evaluate the model on aspect-level tasks. Experimental results indicate that there is a correlation between the multilevel interactive attention network and the position features. MI-biGRU can clearly improve classification performance.
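
A relative position feature of the kind described above can be computed as in the sketch below. The abstract does not specify the encoding, so the distance definition, the function name, and the example sentence are assumptions.

```python
def relative_positions(tokens, target_start, target_end):
    """Distance of each token to the target span [target_start, target_end).

    Tokens inside the target get 0; others get their token distance to the
    nearest edge of the target. This encoding is an illustrative choice.
    """
    positions = []
    for i in range(len(tokens)):
        if i < target_start:
            positions.append(target_start - i)
        elif i >= target_end:
            positions.append(i - target_end + 1)
        else:
            positions.append(0)
    return positions

# Hypothetical sentence whose target term is "battery life" (tokens 1 and 2).
tokens = "the battery life is great".split()
positions = relative_positions(tokens, target_start=1, target_end=3)
```

In a model of this type, each distance would typically index a learned position embedding that is combined with the word embedding of the same token.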

Supervised Word Sense Disambiguation on Polysemy with Neural Network Models: A Case Study of BUN in Taiwan Hakka

International Journal of Asian Language Processing ◽

10.1142/s2717554520500113 ◽

2021 ◽

pp. 2050011

Author(s):

Huei-Ling Lai ◽

Hsiao-Ling Hsu ◽

Jyi-Shane Liu ◽

Chia-Hung Lin ◽

Yanhong Chen

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Network Models ◽

Word Sense ◽

Neural Network Models ◽

Low Resource ◽

Sense Disambiguation

While word sense disambiguation (WSD) has been extensively studied in natural language processing, the task still receives little attention in low-resource languages. Findings based on a few dominant languages may lead to narrow applications. Language-specific WSD systems need to be implemented for low-resource languages, for instance, Taiwan Hakka. This study examines the performance of DNN and Bi-LSTM in WSD tasks on polysemous BUN in Taiwan Hakka. Both models are trained and tested on a small amount of hand-crafted labeled data. Two experiments are designed with four kinds of input features and two window spans to explore what information is needed for the models to achieve their best performance. The results show that to achieve the best performance, DNN and Bi-LSTM models prefer different kinds of input features and window spans.
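
The notion of a window span around the ambiguous word can be sketched as follows. The function name, the padding symbol, and the placeholder tokens are assumptions; the actual input features and span sizes used in the study are not given in the abstract.

```python
def context_window(tokens, index, span, pad="<pad>"):
    """Return the `span` tokens on each side of tokens[index], padded at edges."""
    left = [tokens[i] if i >= 0 else pad
            for i in range(index - span, index)]
    right = [tokens[i] if i < len(tokens) else pad
             for i in range(index + 1, index + 1 + span)]
    return left + right

# Placeholder tokens standing in for a segmented sentence that contains BUN.
sentence = ["w1", "w2", "BUN", "w3"]
narrow = context_window(sentence, index=2, span=1)
wide = context_window(sentence, index=2, span=2)
```

Comparing a narrow and a wide window in this way is one simple means of testing how much surrounding context a model needs.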
