Quasi-compositional mapping from form to meaning: a neural network-based approach to capturing neural responses during human language comprehension

2019 ◽  
Vol 375 (1791) ◽  
pp. 20190313 ◽  
Author(s):  
Milena Rabovsky ◽  
James L. McClelland

We argue that natural language can be usefully described as quasi-compositional and we suggest that deep learning-based neural language models bear long-term promise to capture how language conveys meaning. We also note that a successful account of human language processing should explain both the outcome of the comprehension process and the continuous internal processes underlying this performance. These points motivate our discussion of a neural network model of sentence comprehension, the Sentence Gestalt model, which we have used to account for the N400 component of the event-related brain potential (ERP), which tracks meaning processing as it happens in real time. The model, which shares features with recent deep learning-based language models, simulates N400 amplitude as the automatic update of a probabilistic representation of the situation or event described by the sentence, corresponding to a temporal difference learning signal at the level of meaning. We suggest that this process happens relatively automatically, and that sometimes a more-controlled attention-dependent process is necessary for successful comprehension, which may be reflected in the subsequent P600 ERP component. We relate this account to current deep learning models as well as classic linguistic theory, and use it to illustrate a domain general perspective on some specific linguistic operations postulated based on compositional analyses of natural language. This article is part of the theme issue ‘Towards mechanistic models of meaning composition’.
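As a hedged illustration of the quantity the model ties to the N400, the toy sketch below computes the size of the update to a probabilistic event representation after a new word. This is our schematic stand-in, not the authors' Sentence Gestalt implementation; the features and values are invented.

```python
import numpy as np

def n400_proxy(prev_state: np.ndarray, new_state: np.ndarray) -> float:
    """Schematic N400 proxy: the total change in a probabilistic
    representation of the described event after processing one word."""
    return float(np.sum(np.abs(new_state - prev_state)))

before = np.array([0.1, 0.7, 0.2])  # belief over semantic features before a word
after = np.array([0.6, 0.2, 0.2])   # belief after an unexpected word
print(n400_proxy(before, after))    # larger update -> larger simulated N400 (1.0)
```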

2021 ◽  
Author(s):  
Oscar Nils Erik Kjell ◽  
H. Andrew Schwartz ◽  
Salvatore Giorgi

The language that individuals use to express themselves contains rich psychological information. Recent significant advances in Natural Language Processing (NLP) and Deep Learning (DL), namely transformers, have resulted in large performance gains on tasks related to understanding natural language, such as machine translation. However, these state-of-the-art methods have not yet been made easily accessible to psychology researchers, nor designed to be optimal for human-level analyses. This tutorial introduces text (www.r-text.org), a new R-package for analyzing and visualizing human language using transformers, the latest techniques from NLP and DL. Text is both a modular solution for accessing state-of-the-art language models and an end-to-end solution catered to human-level analyses. Hence, text provides user-friendly functions tailored to testing hypotheses in the social sciences on both relatively small and large datasets. This tutorial describes useful methods for analyzing text, providing functions with reliable defaults that can be used off-the-shelf as well as a framework for advanced users to build on for novel techniques and analysis pipelines. The reader learns about six methods: 1) textEmbed: to transform text into traditional or modern transformer-based word embeddings (i.e., numeric representations of words); 2) textTrain: to examine the relationships between text and numeric/categorical variables; 3) textSimilarity and 4) textSimilarityTest: to compute semantic similarity scores between texts and to significance-test differences in meaning between two sets of texts; and 5) textProjection and 6) textProjectionPlot: to examine and visualize text within the embedding space according to latent or specified construct dimensions (e.g., low to high rating scale scores).
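The text package itself is written in R; as a hedged analogue, the Python sketch below mirrors the embed/train/similarity workflow using sentence-transformers and scikit-learn. These calls are not the package's API, and the texts and ratings are invented.

```python
# Analogous workflow in Python, not the R package's own functions.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

texts = ["I feel calm and content", "Everything is overwhelming",
         "Mostly fine, a bit tense", "I cannot stop worrying"]
ratings = [2.0, 8.0, 4.0, 9.0]            # e.g., self-reported anxiety scores

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts)          # like textEmbed: text -> numeric vectors

regressor = Ridge().fit(embeddings, ratings)              # like textTrain
print(cosine_similarity(embeddings[:1], embeddings[1:2])) # like textSimilarity
```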


10.2196/23230 ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. e23230
Author(s):  
Pei-Fu Chen ◽  
Ssu-Ming Wang ◽  
Wei-Chih Liao ◽  
Lu-Cheng Kuo ◽  
Kuan-Chih Chen ◽  
...  

Background The International Classification of Diseases (ICD) code is widely used as a reference in medical systems and for billing purposes. However, classifying diseases into ICD codes still relies mainly on humans reading large amounts of written material as the basis for coding. Coding is both laborious and time-consuming. Since the conversion from ICD-9 to ICD-10, the coding task has become much more complicated, and deep learning– and natural language processing–related approaches have been studied to assist disease coders. Objective This paper aims to construct a deep learning model for ICD-10 coding that automatically determines the corresponding diagnosis and procedure codes based solely on free-text medical notes, in order to improve accuracy and reduce human effort. Methods We used diagnosis records of the National Taiwan University Hospital and applied natural language processing techniques, including global vectors, word to vector, embeddings from language models, bidirectional encoder representations from transformers, and single head attention recurrent neural network, on deep neural network architectures to implement ICD-10 auto-coding. In addition, we introduced an attention mechanism into the classification model to extract keywords from diagnoses and to visualize the coding reference for training new ICD-10 coders. Sixty discharge notes were randomly selected to examine the change in F1-score and coding time before and after coders used our model. Results In experiments on the National Taiwan University Hospital medical dataset, our predictions achieved F1-scores of 0.715 and 0.618 for the ICD-10 Clinical Modification code and Procedure Coding System code, respectively, using a bidirectional encoder representations from transformers embedding approach in a gated recurrent unit classification model. The trained models were deployed as an ICD-10 web service for coding and for training ICD-10 users. With this service, coders' F1-scores increased significantly, from a median of 0.832 to 0.922 (P<.05), although coding time was not significantly reduced. Conclusions The proposed model significantly improved the F1-score but did not decrease the time disease coders spent on coding.
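A simplified sketch of the kind of classifier the paper describes: token embeddings (e.g., produced by a BERT encoder) fed to a GRU with attention for multi-label code prediction. Layer sizes, the number of codes, and the exact structure here are assumptions, not the paper's configuration; the attention weights are the part that supports keyword visualization.

```python
import torch
import torch.nn as nn

class GRUAttentionCoder(nn.Module):
    """Illustrative GRU classifier with attention over token embeddings."""
    def __init__(self, embed_dim=768, hidden=256, n_codes=1000):
        super().__init__()
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)       # scores each token's relevance
        self.out = nn.Linear(2 * hidden, n_codes)  # multi-label ICD-10 logits

    def forward(self, token_embeddings):            # (batch, seq_len, embed_dim)
        states, _ = self.gru(token_embeddings)      # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(states).squeeze(-1), dim=1)
        context = torch.einsum("bs,bsh->bh", weights, states)  # weighted sum
        return torch.sigmoid(self.out(context))     # probability per ICD code

codes = GRUAttentionCoder()(torch.randn(2, 128, 768))  # toy batch of two notes
print(codes.shape)                                     # torch.Size([2, 1000])
```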


News is a routine part of everyone's life. It helps enhance our knowledge of what happens around the world. Fake news is fictional information made up with the intention to delude, so the knowledge acquired from it becomes useless. As fake news spreads extensively, it has a negative impact on society, and fake news detection has therefore become an emerging research area. This paper presents a solution to fake news detection using deep learning and Natural Language Processing methods. The dataset is trained using a deep neural network. The dataset must be well formatted before being given to the network, which is made possible using Natural Language Processing techniques; the network then predicts whether a news item is fake or not.
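The abstract does not specify the architecture, so the pipeline below is only an illustrative sketch of the described approach (NLP formatting followed by a deep neural network); the preprocessing choices, layer sizes, and toy headlines are assumptions.

```python
import numpy as np
from tensorflow.keras import Sequential, layers
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

headlines = ["Scientists confirm water is wet", "Aliens endorse local candidate"]
labels = np.array([0, 1])                 # 0 = real, 1 = fake (toy labels)

tokenizer = Tokenizer(num_words=20000)    # the NLP formatting step: text -> ids
tokenizer.fit_on_texts(headlines)
x = pad_sequences(tokenizer.texts_to_sequences(headlines), maxlen=100)

model = Sequential([
    layers.Embedding(20000, 64),
    layers.GlobalAveragePooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability the item is fake
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, labels, epochs=3)              # real use needs a labelled corpus
```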


Author(s):  
Tamanna Sharma ◽  
Anu Bajaj ◽  
Om Prakash Sangwan

Sentiment analysis is the computational measurement of attitudes, opinions, and emotions (such as positive/negative) with the help of text mining and natural language processing of words and phrases. Incorporating machine learning techniques with natural language processing helps analyse and predict sentiments in a more precise manner. But sometimes, machine learning techniques are incapable of predicting sentiments due to the unavailability of labelled data. To overcome this problem, an advanced computational technique called deep learning comes into play. This chapter highlights the latest studies on the use of deep learning techniques, such as convolutional neural networks and recurrent neural networks, in sentiment analysis.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Venkateswara Rao Kota ◽  
Shyamala Devi Munisamy

Purpose A neural network (NN)-based deep learning (DL) approach is considered for sentiment analysis (SA), incorporating a convolutional neural network (CNN), bi-directional long short-term memory (Bi-LSTM) and attention methods. Unlike conventional supervised machine learning natural language processing algorithms, the authors have used unsupervised deep learning algorithms. Design/methodology/approach The method presented for sentiment analysis is designed using CNN, Bi-LSTM and the attention mechanism. Word2vec word embedding is used for natural language processing (NLP). The approach is designed for sentence-level SA and consists of one embedding layer, two convolutional layers with max pooling, one LSTM layer and two fully connected (FC) layers. The overall system training time is 30 min. Findings The method's performance is analyzed using metrics such as precision, recall, F1 score and accuracy. The CNN helps reduce complexity, and the Bi-LSTM helps process long input text sequences. Originality/value The attention mechanism is adopted to decide the significance of every hidden state and produce a weighted sum of all the features fed as input.
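A sketch of the stated stack follows: embedding, two convolutional layers with max pooling, one Bi-LSTM layer, attention, and two FC layers. Vocabulary size, filter counts, and hidden sizes are our guesses, and the embedding weights would normally be loaded from word2vec rather than learned from scratch.

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttention(nn.Module):
    """Sketch of the described architecture; all sizes are illustrative."""
    def __init__(self, vocab=30000, emb=300, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)   # would be loaded from word2vec
        self.conv1 = nn.Conv1d(emb, 128, 3, padding=1)
        self.conv2 = nn.Conv1d(128, 128, 3, padding=1)
        self.pool = nn.MaxPool1d(2)
        self.lstm = nn.LSTM(128, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.fc1 = nn.Linear(2 * hidden, 64)
        self.fc2 = nn.Linear(64, classes)

    def forward(self, tokens):                    # (batch, seq_len) int ids
        x = self.embed(tokens).transpose(1, 2)    # (batch, emb, seq_len)
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x))).transpose(1, 2)
        states, _ = self.lstm(x)                  # (batch, seq', 2*hidden)
        w = torch.softmax(self.attn(states).squeeze(-1), dim=1)
        ctx = torch.einsum("bs,bsh->bh", w, states)  # attention-weighted sum
        return self.fc2(torch.relu(self.fc1(ctx)))

logits = CNNBiLSTMAttention()(torch.randint(0, 30000, (4, 64)))
print(logits.shape)                               # torch.Size([4, 2])
```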


2021 ◽  
Vol 1 (7) ◽  
pp. 261-268
Author(s):  
Sukma Nindi Listyarini ◽  
Dimas Aryo Anggoro

The 2020 regional head elections (Pilkada) were controversial because they were held in the middle of the COVID-19 pandemic. Comments appeared across social media platforms such as Twitter. Many people agreed that the elections should proceed, but many also argued for postponing them until the pandemic ended. Given these differences of opinion, sentiment analysis is needed to obtain a general picture of public perception of holding the 2020 regional elections during the COVID-19 pandemic. A total of 500 tweets were obtained by crawling data from the Twitter API using the tweepy library, based on predetermined keywords. The resulting dataset was labelled into two classes, negative and positive. This study proposes a deep learning approach using a Convolutional Neural Network (CNN) for classification, an approach that has proven effective for Natural Language Processing (NLP) tasks and achieves good performance in sentence classification. Experiments were carried out by applying 4 convolutional layers and observing the effect of the number of epochs on model accuracy, with epoch settings of 50, 75, and 100. The results show that the CNN method on the pandemic-era election dataset achieved its highest accuracy of 90% with 4 convolutional layers and 100 epochs. It was also found that accuracy tends to increase with the number of epochs used in the model.
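A schematic Keras version of the study's classifier is sketched below; the filter counts, kernel sizes, and vocabulary size are assumptions rather than the paper's values, and random arrays stand in for the tokenized tweets.

```python
import numpy as np
from tensorflow.keras import Sequential, layers

def build_cnn(vocab=10000):
    model = Sequential([layers.Embedding(vocab, 100)])
    for _ in range(4):                       # the paper's 4 convolutional layers
        model.add(layers.Conv1D(64, 3, activation="relu", padding="same"))
    model.add(layers.GlobalMaxPooling1D())
    model.add(layers.Dense(1, activation="sigmoid"))  # positive vs. negative
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

x = np.random.randint(0, 10000, (32, 50))    # stand-in for tokenized tweets
y = np.random.randint(0, 2, 32)
for epochs in (50, 75, 100):                 # the epoch settings compared
    build_cnn().fit(x, y, epochs=epochs, verbose=0)
```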


2021 ◽  
Vol 45 (10) ◽  
Author(s):  
A. W. Olthof ◽  
P. M. A. van Ooijen ◽  
L. J. Cornelissen

Abstract In radiology, natural language processing (NLP) allows the extraction of valuable information from radiology reports. It can be used for various downstream tasks such as quality improvement, epidemiological research, and monitoring guideline adherence. Class imbalance, variation in dataset size, variation in report complexity, and algorithm type all influence NLP performance but have not yet been systematically and interrelatedly evaluated. In this study, we investigate the influence of these factors on the performance of four types of deep learning-based NLP models: a fully connected neural network (Dense), a long short-term memory recurrent neural network (LSTM), a convolutional neural network (CNN), and a Bidirectional Encoder Representations from Transformers (BERT) model. Two datasets consisting of radiologist-annotated reports of trauma radiographs (n = 2469) and of chest radiographs and computed tomography (CT) studies (n = 2255) were split into training sets (80%) and testing sets (20%). The training data were used to train all four model types in 84 experiments (Fracture-data) and 45 experiments (Chest-data) with variation in size and prevalence. Performance was evaluated on sensitivity, specificity, positive predictive value, negative predictive value, area under the curve, and F score. All four model architectures demonstrated high performance on the radiology reports, with metrics reaching above 0.90. CNN, LSTM, and Dense were outperformed by the BERT algorithm because of its stable results despite variation in training size and prevalence. Awareness of variation in prevalence is warranted because it impacts sensitivity and specificity in opposite directions.
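As a minimal sketch of the evaluation the study reports, the snippet below computes sensitivity, specificity, PPV, NPV, AUC, and F score from toy labels and scores with scikit-learn; the numbers are invented.

```python
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.3, 0.8]  # model probabilities
y_pred = [int(s >= 0.5) for s in y_score]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("sensitivity", tp / (tp + fn))   # recall on positive reports
print("specificity", tn / (tn + fp))
print("PPV", tp / (tp + fp))
print("NPV", tn / (tn + fn))
print("AUC", roc_auc_score(y_true, y_score))
print("F", f1_score(y_true, y_pred))
```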


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hao Yang ◽  
Qin He ◽  
Zhenyan Liu ◽  
Qian Zhang

The development of the Internet and network applications has driven the development of encrypted communication technology. At the same time, malicious traffic also uses encryption to evade traditional security protection and detection, which cannot accurately detect encrypted malicious traffic. In recent years, the rise of artificial intelligence has made it possible to use machine learning and deep learning methods to detect encrypted malicious traffic without decryption, with very accurate results. Current research on malicious encrypted traffic detection focuses mainly on the analysis of encrypted traffic characteristics and the selection of machine learning algorithms. In this paper, a method combining natural language processing and machine learning is proposed: a TF-IDF-based detection model. During data preprocessing, the method applies a natural language processing technique, the TF-IDF model, to extract information from the data, obtain the importance of keywords, and reconstruct the data features. A detection method based on the TF-IDF model does not need to analyze each field of the dataset. Compared with the usual machine learning preprocessing approach, namely data encoding, the experimental results show that preprocessing data with natural language processing technology effectively improves detection accuracy. A gradient boosting classifier, a random forest classifier, an AdaBoost classifier, and an ensemble model built from these three classifiers are used to construct the subsequent models. A convolutional neural network (CNN) is also trained, as CNNs can effectively extract information from the data. With identical input data for the classifiers and the neural network, comparison across methods shows that the accuracy of the one-dimensional convolutional network is slightly higher than that of the machine learning classifiers.
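A sketch of the TF-IDF-plus-ensemble setup with scikit-learn follows; the flow records are invented stand-ins for real traffic metadata, and the classifier settings are defaults rather than the paper's.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)

flows = ["tls1.2 cipher0x1301 sni:update.example.com",
         "tls1.0 cipher0x0004 sni:c2.badhost.biz",
         "tls1.2 cipher0x1302 sni:cdn.example.net",
         "tls1.0 cipher0x0005 sni:beacon.badhost.biz"]
labels = [0, 1, 0, 1]                       # 0 = benign, 1 = malicious

# Keyword importance without parsing each field of the records.
X = TfidfVectorizer().fit_transform(flows)

ensemble = VotingClassifier([
    ("gb", GradientBoostingClassifier()),
    ("rf", RandomForestClassifier()),
    ("ada", AdaBoostClassifier()),
], voting="soft")
ensemble.fit(X, labels)
print(ensemble.predict(X))
```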


2021 ◽  
Author(s):  
Yoojoong Kim ◽  
Jeong Moon Lee ◽  
Moon Joung Jang ◽  
Yun Jin Yum ◽  
Jong-Ho Kim ◽  
...  

BACKGROUND With advances in deep learning and natural language processing, analyzing medical texts is becoming increasingly important. Nonetheless, despite the importance of medical texts, medical-specific language models have not yet been studied. OBJECTIVE Korean medical text is very difficult to analyze because of the agglutinative characteristics of the language and the complex terminology of the medical domain. To solve this problem, we collected a Korean medical corpus and used it to train language models. METHODS In this paper, we present a Korean medical language model based on deep learning natural language processing. The proposed model was trained using the pre-training framework of BERT for the medical context, starting from a state-of-the-art Korean language model. RESULTS After pre-training, the proposed method showed increased accuracies of 0.147 and 0.148 for the masked language model with next sentence prediction. In the intrinsic evaluation, the next sentence prediction accuracy improved by 0.258, a remarkable enhancement. In addition, the extrinsic evaluation on Korean medical semantic textual similarity data showed a 0.046 increase in the Pearson correlation. CONCLUSIONS The results demonstrate the superiority of the proposed model for Korean medical natural language processing. We expect that our proposed model can be extended for application to various languages and domains.
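A hedged sketch of continued (domain-adaptive) pre-training with Hugging Face transformers follows. For brevity it runs masked language modeling only, whereas the paper also uses next sentence prediction; the base model name is an example Korean BERT, not necessarily the one the authors used, and the corpus file is a placeholder.

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tok = AutoTokenizer.from_pretrained("klue/bert-base")       # a Korean BERT
model = AutoModelForMaskedLM.from_pretrained("klue/bert-base")

# Placeholder corpus file; one medical sentence per line.
corpus = load_dataset("text", data_files={"train": "medical_corpus.txt"})
tokenized = corpus.map(
    lambda b: tok(b["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="kmed-bert", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()
```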


2017 ◽  
Vol 24 (4) ◽  
pp. 813-821 ◽  
Author(s):  
Anne Cocos ◽  
Alexander G Fiks ◽  
Aaron J Masino

Abstract Objective Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. Human review of social media data is infeasible due to data quantity, thus natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. Our objective is to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media. Materials and Methods We developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training. Results Our best-performing RNN model used pretrained word embeddings created from a large, non–domain-specific Twitter dataset. It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision. Discussion Our model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that our model reached optimal performance with fewer training examples than the other models. Conclusions ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets.
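A sketch of a sequence labeler in the paper's spirit follows: each token is tagged as part of an ADR mention or not. The paper's exact RNN variant may differ; the bidirectional LSTM, tag set, and sizes here are illustrative, and the embedding layer would normally be initialized from the pretrained Twitter word vectors rather than at random.

```python
import torch
import torch.nn as nn

class ADRTagger(nn.Module):
    """Illustrative RNN sequence labeler for ADR mention tagging."""
    def __init__(self, vocab=50000, emb=200, hidden=128, tags=3):  # e.g. B/I/O
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)  # init from pretrained embeddings
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, tags)

    def forward(self, tokens):                    # (batch, seq_len) int ids
        states, _ = self.rnn(self.embed(tokens))  # (batch, seq_len, 2*hidden)
        return self.out(states)                   # per-token tag logits

tags = ADRTagger()(torch.randint(0, 50000, (2, 30)))
print(tags.shape)                                 # torch.Size([2, 30, 3])
```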

