On Guaranteed Optimal Robust Explanations for NLP Models

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/366 ◽

2021 ◽

Author(s):

Emanuele La Malfa ◽

Rhiannon Michelmore ◽

Agnieszka M. Zbrzezny ◽

Nicola Paoletti ◽

Marta Kwiatkowska

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Neural Network Models ◽

Solution Algorithms ◽

Input Text ◽

Bounded Perturbation ◽

Speed Up ◽

Key Features ◽

Prediction Invariance

We build on abduction-based explanations for machine learning and develop a method for computing local explanations for neural network models in natural language processing (NLP). Our explanations comprise a subset of the words of the input text that satisfies two key features: optimality w.r.t. a user-defined cost function, such as the length of explanation, and robustness, in that they ensure prediction invariance for any bounded perturbation in the embedding space of the left-out words. We present two solution algorithms, respectively based on implicit hitting sets and maximum universal subsets, introducing a number of algorithmic improvements to speed up convergence of hard instances. We show how our method can be configured with different perturbation sets in the embedded space and used to detect bias in predictions by enforcing include/exclude constraints on biased terms, as well as to enhance existing heuristic-based NLP explanation frameworks such as Anchors. We evaluate our framework on three widely used sentiment analysis tasks and texts of up to 100 words from SST, Twitter and IMDB datasets, demonstrating the effectiveness of the derived explanations.

Download Full-text

The relational processing limits of classic and contemporary neural network models of language processing

10.32470/ccn.2019.1022-0 ◽

2019 ◽

Author(s):

Guillermo Puebla ◽

Andrea Martin ◽

Leonidas Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

Download Full-text

The relational processing limits of classic and contemporary neural network models of language processing

Language Cognition and Neuroscience ◽

10.1080/23273798.2020.1821906 ◽

2020 ◽

pp. 1-15

Author(s):

Guillermo Puebla ◽

Andrea E. Martin ◽

Leonidas A. A. Doumas

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Relational Processing ◽

Neural Network Models

Download Full-text

Comparison of rule-based and neural network models for negation detection in radiology reports

Natural Language Engineering ◽

10.1017/s1351324920000509 ◽

2020 ◽

pp. 1-22 ◽

Cited By ~ 2

Author(s):

D. Sykes ◽

A. Grivas ◽

C. Grover ◽

R. Tobin ◽

C. Sudlow ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Network Models ◽

Neural Network Models ◽

Test Set ◽

Rule Based ◽

Radiology Reports ◽

The Neural Network ◽

Negation Detection

Abstract Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, the accurate distinction between negated and non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated not to be present or only hypothesised, meaning a disease can be mentioned in a report when it is not being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline called EdIE-R (Edinburgh Information Extraction for Radiology reports), developed to process brain imaging reports, (https://www.ltg.ed.ac.uk/software/edie-r/) and two machine learning approaches; one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on data from routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009. Journal of Biomedical Informatics42(5), 839–851), a python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017. NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the test set of the dataset from which our models were developed, as well as the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods. EdIE-R-Neg scored highest on F1 score, particularly on the test set of the Tayside dataset, from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap of the machine learning models to EdIE-R-Neg on the Tayside test set was reduced through adding development Tayside data into the ESS training set, demonstrating the adaptability of the neural network models.

Download Full-text

Finding Fuzziness in Neural Network Models of Language Processing

Explainable AI and Other Applications of Fuzzy Techniques - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-82099-2_25 ◽

2021 ◽

pp. 278-290

Author(s):

Kanishka Misra ◽

Julia Taylor Rayz

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Models ◽

Neural Network Models

Download Full-text

A Primer on Neural Network Models for Natural Language Processing

Journal of Artificial Intelligence Research ◽

10.1613/jair.4992 ◽

2016 ◽

Vol 57 ◽

pp. 345-420 ◽

Cited By ~ 233

Author(s):

Yoav Goldberg

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Speech Processing ◽

Network Models ◽

Neural Network Models ◽

Convolutional Networks ◽

The Past ◽

Gradient Computation

Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.

Download Full-text

Double Multi-Head Attention-Based Capsule Network for Relation Classification

10.5121/csit.2021.110711 ◽

2021 ◽

Author(s):

Hongjun Heng ◽

Renjie Li

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Language Processing ◽

Layer Structure ◽

Single Layer ◽

Network Models ◽

Classification Model ◽

Neural Network Models ◽

Comparable Performance ◽

Relation Classification

Semantic relation classification is an important task in the field of nature language processing. The existing neural network relation classification models introduce attention mechanism to increase the importance of significant features, but part of these attention models only have one head which is not enough to capture more distinctive fine-grained features. Models based on RNN (Recurrent Neural Network) usually use single-layer structure and have limited feature extraction capability. Current RNN-based capsule networks have problem of improper handling of noise which increase complexity of network. Therefore, we propose a capsule network relation classification model based on double multi-head attention. In this model, we introduce an auxiliary BiGRU (Bidirectional Gated Recurrent Unit) to make up for the lack of feature extraction performance of single BiGRU, improve the bilinear attention through double multihead mechanism to enable the model to obtain more information of sentence from different representation subspace and instantiate capsules with sentence-level features to alleviate noise impact. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model outperforms most of previous state-of-the-art neural network models and achieves the comparable performance with F1 score of 85.3% in capsule network.

Download Full-text

Supervised Word Sense Disambiguation on Polysemy with Neural Network Models: A Case Study of BUN in Taiwan Hakka

International Journal of Asian Language Processing ◽

10.1142/s2717554520500113 ◽

2021 ◽

pp. 2050011

Author(s):

Huei-Ling Lai ◽

Hsiao-Ling Hsu ◽

Jyi-Shane Liu ◽

Chia-Hung Lin ◽

Yanhong Chen

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Language Processing ◽

Word Sense Disambiguation ◽

Network Models ◽

Word Sense ◽

Neural Network Models ◽

Low Resource ◽

Sense Disambiguation

While word sense disambiguation (WSD) has been extensively studied in natural language processing, such a task in low-resource languages still receives little attention. Findings based on a few dominant languages may lead to narrow applications. A language-specific WSD system is in need to implement in low-resource languages, for instance, in Taiwan Hakka. This study examines the performance of DNN and Bi-LSTM in WSD tasks on polysemous BUNin Taiwan Hakka. Both models are trained and tested on a small amount of hand-crafted labeled data. Two experiments are designed with four kinds of input features and two window spans to explore what information is needed for the models to achieve their best performance. The results show that to achieve the best performance, DNN and Bi-LSTM models prefer different kinds of input features and window spans.

Download Full-text

Probing Classifiers: Promises, Shortcomings, and Advances

Computational Linguistics ◽

10.1162/coli_a_00422 ◽

2021 ◽

pp. 1-12

Author(s):

Yonatan Belinkov

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Network ◽

Network Models ◽

Neural Network Models ◽

Linguistic Property

Abstract Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple —a classifier is trained to predict some linguistic property from a model's representations—and has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological limitations of this approach. This article critically reviews the probing classifiers framework, highlighting their promises, shortcomings, and advances.

Download Full-text

Medication-rights detection using incident reports: A natural language processing and deep neural network approach

Health Informatics Journal ◽

10.1177/1460458219889798 ◽

2019 ◽

Vol 26 (3) ◽

pp. 1777-1794

Author(s):

Zoie Shui-Yee Wong ◽

HY So ◽

Belinda SC Kwok ◽

Mavis WS Lai ◽

David TF Sun

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Network ◽

Detection System ◽

Network Models ◽

Incident Reporting ◽

Neural Network Models ◽

The Right

Medication errors often occurred due to the breach of medication rights that are the right patient, the right drug, the right time, the right dose and the right route. The aim of this study was to develop a medication-rights detection system using natural language processing and deep neural networks to automate medication-incident identification using free-text incident reports. We assessed the performance of deep neural network models in classifying the Advanced Incident Reporting System reports and compared the models’ performance with that of other common classification methods (including logistic regression, support vector machines and the decision-tree method). We also evaluated the effects on prediction outcomes of several deep neural network model settings, including number of layers, number of neurons and activation regularisation functions. The accuracy of the models was measured at 0.9 or above across model settings and algorithms. The average values obtained for accuracy and area under the curve were 0.940 (standard deviation: 0.011) and 0.911 (standard deviation: 0.019), respectively. It is shown that deep neural network models were more accurate than the other classifiers across all of the tested class labels (including wrong patient, wrong drug, wrong time, wrong dose and wrong route). The deep neural network method outperformed other binary classifiers and our default base case model, and parameter arguments setting generally performed well for the five medication-rights datasets. The medication-rights detection system developed in this study successfully uses a natural language processing and deep-learning approach to classify patient-safety incidents using the Advanced Incident Reporting System reports, which may be transferable to other mandatory and voluntary incident reporting systems worldwide.

Download Full-text

Analysis of Neural Network Based Language Modeling

Journal of Artificial Intelligence and Capsule Networks - September 2019 ◽

10.36548/jaicn.2020.3.006 ◽

2020 ◽

Vol 2 (1) ◽

pp. 53-63

Author(s):

Dr. Karrupusamy P.

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Network Models ◽

Research Field ◽

Neural Network Models ◽

Natural Languages ◽

The Neural Network ◽

Language Modelling

The fundamental and core process of the natural language processing is the language modelling usually referred as the statistical language modelling. The language modelling is also considered to be vital in the processing the natural languages as the other chores such as the completion of sentences, recognition of speech automatically, translations of the statistical machines, and generation of text and so on. The success of the viable natural language processing totally relies on the quality of the modelling of the language. In the previous spans the research field such as the linguistics, psychology, speech recognition, data compression, neuroscience, machine translation etc. As the neural network are the very good choices for having a quality language modelling the paper presents the analysis of neural networks in the modelling of the language. Utilizing some of the dataset such as the Penn Tree bank, Billion Word Benchmark and the Wiki Test the neural network models are evaluated on the basis of the word error rate, perplexity and the bilingual evaluation under study scores to identify the optimal model.

Download Full-text