scholarly journals Language Semantics Interpretation with an Interaction-Based Recurrent Neural Network

2021 ◽  
Vol 3 (4) ◽  
pp. 922-945
Author(s):  
Shaw-Hwa Lo ◽  
Yiqiao Yin

Text classification is a fundamental language task in Natural Language Processing. A variety of sequential models are capable of making good predictions, yet there is a lack of connection between language semantics and prediction results. This paper proposes a novel influence score (I-score), a greedy search algorithm, called Backward Dropping Algorithm (BDA), and a novel feature engineering technique called the “dagger technique”. First, the paper proposes to use the novel influence score (I-score) to detect and search for the important language semantics in text documents that are useful for making good predictions in text classification tasks. Next, a greedy search algorithm, called the Backward Dropping Algorithm, is proposed to handle long-term dependencies in the dataset. Moreover, the paper proposes a novel engineering technique called the “dagger technique” that fully preserves the relationship between the explanatory variable and the response variable. The proposed techniques can be further generalized into any feed-forward Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs), and any neural network. A real-world application on the Internet Movie Database (IMDB) is used and the proposed methods are applied to improve prediction performance with an 81% error reduction compared to other popular peers if I-score and “dagger technique” are not implemented.

Author(s):  
Dolly Sapra ◽  
Andy D. Pimentel

AbstractThe automated architecture search methodology for neural networks is known as Neural Architecture Search (NAS). In recent times, Convolutional Neural Networks (CNNs) designed through NAS methodologies have achieved very high performance in several fields, for instance image classification and natural language processing. Our work is in the same domain of NAS, where we traverse the search space of neural network architectures with the help of an evolutionary algorithm which has been augmented with a novel approach of piecemeal-training. In contrast to the previously published NAS techniques, wherein the training with given data is considered an isolated task to estimate the performance of neural networks, our work demonstrates that a neural network architecture and the related weights can be jointly learned by combining concepts of the traditional training process and evolutionary architecture search in a single algorithm. The consolidation has been realised by breaking down the conventional training technique into smaller slices and collating them together with an integrated evolutionary architecture search algorithm. The constraints on architecture search space are placed by limiting its various parameters within a specified range of values, consequently regulating the neural network’s size and memory requirements. We validate this concept on two vastly different datasets, namely, the CIFAR-10 dataset in the domain of image classification, and PAMAP2 dataset in the Human Activity Recognition (HAR) domain. Starting from randomly initialized and untrained CNNs, the algorithm discovers models with competent architectures, which after complete training, reach an accuracy of of 92.5% for CIFAR-10 and 94.36% PAMAP2. We further extend the algorithm to include an additional conflicting search objective: the number of parameters of the neural network. Our multi-objective algorithm produces a Pareto optimal set of neural networks, by optimizing the search for both the accuracy and the parameter count, thus emphasizing the versatility of our approach.


Author(s):  
Liqiang Xiao ◽  
Honglun Zhang ◽  
Wenqing Chen ◽  
Yongkun Wang ◽  
Yaohui Jin

Convolutional neural networks (CNNs) have shown their promising performance for natural language processing tasks, which extract n-grams as features to represent the input. However, n-gram based CNNs are inherently limited to fixed geometric structure and cannot proactively adapt to the transformations of features. In this paper, we propose two modules to provide CNNs with the flexibility for complex features and the adaptability for transformation, namely, transformable convolution and transformable pooling. Our method fuses dynamic and static deviations to redistribute the sampling locations, which can capture both current and global transformations. Our modules can be easily integrated by other models to generate new transformable networks. We test proposed modules on two state-of-the-art models, and the results demonstrate that our modules can effectively adapt to the feature transformation in text classification.


2016 ◽  
Vol 25 (06) ◽  
pp. 1650033 ◽  
Author(s):  
Hossam Faris ◽  
Ibrahim Aljarah ◽  
Nailah Al-Madi ◽  
Seyedali Mirjalili

Evolutionary Neural Networks are proven to be beneficial in solving challenging datasets mainly due to the high local optima avoidance. Stochastic operators in such techniques reduce the probability of stagnation in local solutions and assist them to supersede conventional training algorithms such as Back Propagation (BP) and Levenberg-Marquardt (LM). According to the No-Free-Lunch (NFL), however, there is no optimization technique for solving all optimization problems. This means that a Neural Network trained by a new algorithm has the potential to solve a new set of problems or outperform the current techniques in solving existing problems. This motivates our attempts to investigate the efficiency of the recently proposed Evolutionary Algorithm called Lightning Search Algorithm (LSA) in training Neural Network for the first time in the literature. The LSA-based trainer is benchmarked on 16 popular medical diagnosis problems and compared to BP, LM, and 6 other evolutionary trainers. The quantitative and qualitative results show that the LSA algorithm is able to show not only better local solutions avoidance but also faster convergence speed compared to the other algorithms employed. In addition, the statistical test conducted proves that the LSA-based trainer is significantly superior in comparison with the current algorithms on the majority of datasets.


2017 ◽  
Vol 56 (05) ◽  
pp. 377-389 ◽  
Author(s):  
Xingyu Zhang ◽  
Joyce Kim ◽  
Rachel E. Patzer ◽  
Stephen R. Pitts ◽  
Aaron Patzer ◽  
...  

SummaryObjective: To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements.Methods: Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient’s reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model and receiver operating curves (AUC) were then calculated for each model.Results: Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.7310.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN.Conclusions: The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient’s reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.


2020 ◽  
Vol 49 (4) ◽  
pp. 482-494
Author(s):  
Jurgita Kapočiūtė-Dzikienė ◽  
Senait Gebremichael Tesfagergish

Deep Neural Networks (DNNs) have proven to be especially successful in the area of Natural Language Processing (NLP) and Part-Of-Speech (POS) tagging—which is the process of mapping words to their corresponding POS labels depending on the context. Despite recent development of language technologies, low-resourced languages (such as an East African Tigrinya language), have received too little attention. We investigate the effectiveness of Deep Learning (DL) solutions for the low-resourced Tigrinya language of the Northern-Ethiopic branch. We have selected Tigrinya as the testbed example and have tested state-of-the-art DL approaches seeking to build the most accurate POS tagger. We have evaluated DNN classifiers (Feed Forward Neural Network – FFNN, Long Short-Term Memory method – LSTM, Bidirectional LSTM, and Convolutional Neural Network – CNN) on a top of neural word2vec word embeddings with a small training corpus known as Nagaoka Tigrinya Corpus. To determine the best DNN classifier type, its architecture and hyper-parameter set both manual and automatic hyper-parameter tuning has been performed. BiLSTM method was proved to be the most suitable for our solving task: it achieved the highest accuracy equal to 92% that is 65% above the random baseline.


Author(s):  
Noha Ali ◽  
Ahmed H. AbuEl-Atta ◽  
Hala H. Zayed

<span id="docs-internal-guid-cb130a3a-7fff-3e11-ae3d-ad2310e265f8"><span>Deep learning (DL) algorithms achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrases embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. The result of the proposed model overcomes the previous model by achieving an F-score equal to 83.87% using an unsupervised technique that trained on PubMed abstracts called PMC vectors (PMCVec) embedding. Also, we made another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings Google news and PMCVec which achieving F-score equal to 74.9% and 76.26%, respectively.</span></span>


2018 ◽  
Vol 10 (11) ◽  
pp. 113 ◽  
Author(s):  
Yue Li ◽  
Xutao Wang ◽  
Pengjian Xu

Text classification is of importance in natural language processing, as the massive text information containing huge amounts of value needs to be classified into different categories for further use. In order to better classify text, our paper tries to build a deep learning model which achieves better classification results in Chinese text than those of other researchers’ models. After comparing different methods, long short-term memory (LSTM) and convolutional neural network (CNN) methods were selected as deep learning methods to classify Chinese text. LSTM is a special kind of recurrent neural network (RNN), which is capable of processing serialized information through its recurrent structure. By contrast, CNN has shown its ability to extract features from visual imagery. Therefore, two layers of LSTM and one layer of CNN were integrated to our new model: the BLSTM-C model (BLSTM stands for bi-directional long short-term memory while C stands for CNN.) LSTM was responsible for obtaining a sequence output based on past and future contexts, which was then input to the convolutional layer for extracting features. In our experiments, the proposed BLSTM-C model was evaluated in several ways. In the results, the model exhibited remarkable performance in text classification, especially in Chinese texts.


2019 ◽  
Vol 1 (2) ◽  
pp. 575-589 ◽  
Author(s):  
Blaž Škrlj ◽  
Jan Kralj ◽  
Nada Lavrač ◽  
Senja Pollak

Deep neural networks are becoming ubiquitous in text mining and natural language processing, but semantic resources, such as taxonomies and ontologies, are yet to be fully exploited in a deep learning setting. This paper presents an efficient semantic text mining approach, which converts semantic information related to a given set of documents into a set of novel features that are used for learning. The proposed Semantics-aware Recurrent deep Neural Architecture (SRNA) enables the system to learn simultaneously from the semantic vectors and from the raw text documents. We test the effectiveness of the approach on three text classification tasks: news topic categorization, sentiment analysis and gender profiling. The experiments show that the proposed approach outperforms the approach without semantic knowledge, with highest accuracy gain (up to 10%) achieved on short document fragments.


2019 ◽  
Vol 25 (4) ◽  
pp. 543-557 ◽  
Author(s):  
Afra Alishahi ◽  
Grzegorz Chrupała ◽  
Tal Linzen

AbstractThe Empirical Methods in Natural Language Processing (EMNLP) 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner-workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.


2019 ◽  
Vol 9 (11) ◽  
pp. 2347 ◽  
Author(s):  
Hannah Kim ◽  
Young-Seob Jeong

As the number of textual data is exponentially increasing, it becomes more important to develop models to analyze the text data automatically. The texts may contain various labels such as gender, age, country, sentiment, and so forth. Using such labels may bring benefits to some industrial fields, so many studies of text classification have appeared. Recently, the Convolutional Neural Network (CNN) has been adopted for the task of text classification and has shown quite successful results. In this paper, we propose convolutional neural networks for the task of sentiment classification. Through experiments with three well-known datasets, we show that employing consecutive convolutional layers is effective for relatively longer texts, and our networks are better than other state-of-the-art deep learning models.


Sign in / Sign up

Export Citation Format

Share Document