Learning Multi-Level Dependencies for Robust Word Recognition

Robust language processing systems are becoming increasingly important given the recent awareness of dangerous situations in which brittle machine learning models are easily broken by the presence of noise. In this paper, we introduce a robust word recognition framework that captures multi-level sequential dependencies in noised sentences. The proposed framework employs a sequence-to-sequence model over the characters of each word, whose output is fed to a word-level bi-directional recurrent neural network. We conduct extensive experiments to verify the effectiveness of the framework. The results show that the proposed framework outperforms state-of-the-art methods by a large margin, and they also suggest that character-level dependencies can play an important role in word recognition. The code of the proposed framework and the major experiments are publicly available.
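
To make the idea of a noised sentence concrete, the following is a minimal sketch of character-level noise applied word by word. The four edit types, the rule of keeping the first and last characters, and the names `noise_word` and `noise_sentence` are assumptions chosen for illustration; the abstract does not specify how noise is generated.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def noise_word(word, rng):
    """Apply one random interior character edit (hypothetical noise model)."""
    if len(word) < 4:
        return word  # short words are left untouched
    i = rng.randrange(1, len(word) - 2)  # interior index, so i + 1 is not the last char
    op = rng.choice(["swap", "delete", "insert", "substitute"])
    chars = list(word)
    if op == "swap":
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    elif op == "delete":
        del chars[i]
    elif op == "insert":
        chars.insert(i, rng.choice(ALPHABET))
    else:
        chars[i] = rng.choice(ALPHABET)
    return "".join(chars)

def noise_sentence(sentence, seed=0):
    """Noise every whitespace-separated word of a sentence, reproducibly."""
    rng = random.Random(seed)
    return " ".join(noise_word(w, rng) for w in sentence.split())

print(noise_sentence("robust word recognition framework", seed=1))
```

A recognizer of the kind described would take such a noised sentence as input and be scored on how many original words it recovers.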

AUTOMATIC KEYWORD EXTRACTION USING ARTIFICIAL NEURAL NETWORK AND FEATURE EXTRACTION

Journal of Military Science and Technology ◽

10.54939/1859-1043.j.mst.69a.2020.63-74 ◽

2020 ◽

pp. 63-74

Author(s):

Son

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Language Processing ◽

Extraction Methods ◽

Keyword Extraction ◽

Learning Models ◽

New Approach ◽

Word Level ◽

Machine Learning Models

Extracting keywords from documents is an essential task in natural language processing. A challenge of this task is to define a reasonable set of keywords from which we can find all relevant documents. This paper proposes a new approach that exploits word-level handcrafted features and machine learning models to select a single document's most important keywords. To evaluate the proposed solution, we compare our results with the latest supervised and unsupervised automatic keyword extraction methods. Experimental results show that our model achieves the best results on 9 of 20 data corpora, which suggests that the proposed approach is promising.
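
As a rough illustration of word-level handcrafted features, the sketch below computes three simple features per word and ranks words with a hand-set score. The features (`tf`, `first_pos`, `length`), the stopword list, and the scoring rule are assumptions for illustration only; the abstract does not list the features used, and a trained model would replace the hand-set score.

```python
import re
from collections import Counter

STOPWORDS = frozenset({"the", "a", "of", "and", "is", "in", "to"})

def word_features(text):
    """Map each distinct word to a small dict of handcrafted features."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    n = len(tokens)
    features = {}
    for pos, tok in enumerate(tokens):
        if tok not in features:
            features[tok] = {
                "tf": counts[tok] / n,     # relative term frequency
                "first_pos": pos / n,      # relative position of first occurrence
                "length": float(len(tok)), # word length in characters
            }
    return features

def top_keywords(text, k=3):
    """Rank words by a hypothetical score: frequent and early words first."""
    feats = word_features(text)
    scored = {w: f["tf"] * (1.0 - f["first_pos"])
              for w, f in feats.items() if w not in STOPWORDS}
    return sorted(scored, key=lambda w: (-scored[w], w))[:k]

doc = "Keyword extraction finds keyword candidates. Keyword ranking is the final step."
print(top_keywords(doc))
```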

A transformer-based approach to irony and sarcasm detection

Neural Computing and Applications ◽

10.1007/s00521-020-05102-3 ◽

2020 ◽

Vol 32 (23) ◽

pp. 17309-17320

Author(s):

Rolandos Alexandros Potamias ◽

Georgios Siolas ◽

Andreas-Georgios Stafylopatis

Keyword(s):

Neural Network ◽

Language Processing ◽

Network Architecture ◽

Figurative Language ◽

State Of The Art ◽

Unresolved Issue ◽

Discussion Forums ◽

Large Margin ◽

Neural Architecture ◽

Benchmark Datasets

Figurative language (FL) seems ubiquitous in all social media discussion forums and chats, posing extra challenges to sentiment analysis endeavors. Identification of FL schemas in short texts remains largely an unresolved issue in the broader field of natural language processing, mainly due to their contradictory and metaphorical meaning content. The main FL expression forms are sarcasm, irony and metaphor. In the present paper, we employ advanced deep learning methodologies to tackle the problem of identifying the aforementioned FL forms. Significantly extending our previous work (Potamias et al., in: International conference on engineering applications of neural networks, Springer, Berlin, pp 164–175, 2019), we propose a neural network methodology that builds on a recently proposed pre-trained transformer-based network architecture, which is further enhanced by a recurrent convolutional neural network. With this setup, data preprocessing is kept to a minimum. The performance of the devised hybrid neural architecture is tested on four benchmark datasets and contrasted with other relevant state-of-the-art methodologies and systems. Results demonstrate that the proposed methodology achieves state-of-the-art performance on all benchmark datasets, outperforming, even by a large margin, all other methodologies and published studies.

Machine Learning Crowdfunding

International Journal of Knowledge-Based Organizations ◽

10.4018/ijkbo.2020040101 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1-11

Author(s):

Evangelos Katsamakas ◽

Hao Sun

Keyword(s):

Neural Network ◽

Machine Learning ◽

Language Processing ◽

Topic Modeling ◽

Network Models ◽

Unstructured Data ◽

Text Analytics ◽

Learning Models ◽

Neural Network Models ◽

Machine Learning Models

Crowdfunding is a novel and important economic mechanism for funding projects and promoting innovation in the digital economy. This article explores the most recent structured and unstructured data from a crowdfunding platform. It provides an in-depth exploration of the data using text analytics techniques, such as sentiment analysis and topic modeling. It uses novel natural language processing to represent project descriptions, and evaluates machine learning models, including neural network models, to predict project fundraising success. It discusses the findings of the performance evaluation, and summarizes lessons for crowdfunding platforms and their users.

CoCoX: Generating Conceptual and Counterfactual Explanations via Fault-Lines

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i03.5643 ◽

2020 ◽

Vol 34 (03) ◽

pp. 2594-2601

Author(s):

Arjun Akula ◽

Shuai Wang ◽

Song-Chun Zhu

Keyword(s):

Neural Network ◽

State Of The Art ◽

Input Image ◽

Classification Model ◽

Learning Models ◽

Fault Line ◽

Semantic Level ◽

Explainable AI ◽

Fault Lines ◽

Classification Category

We present CoCoX (short for Conceptual and Counterfactual Explanations), a model for explaining decisions made by a deep convolutional neural network (CNN). In Cognitive Psychology, the factors (or semantic-level features) that humans zoom in on when they imagine an alternative to a model prediction are often referred to as fault-lines. Motivated by this, our CoCoX model explains decisions made by a CNN using fault-lines. Specifically, given an input image I for which a CNN classification model M predicts class c_pred, our fault-line based explanation identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class c_alt. We argue that, due to the conceptual and counterfactual nature of fault-lines, our CoCoX explanations are practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, showing that CoCoX significantly outperforms the state-of-the-art explainable AI models. Our implementation is available at https://github.com/arjunakula/CoCoX
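
The notion of a minimal set of concepts to add or delete can be illustrated with a toy example. The sketch below is not the CoCoX method: it replaces the CNN with a hypothetical linear scorer over three named concepts and finds the smallest edit by brute force. The weights, concept names, and function names are all assumptions for illustration.

```python
from itertools import combinations

# Hypothetical per-class concept weights for a toy linear scorer
WEIGHTS = {
    "zebra": {"stripes": 2.0, "hooves": 1.0, "pointed_ears": 0.0},
    "horse": {"stripes": -2.0, "hooves": 1.5, "pointed_ears": 0.0},
    "dog": {"stripes": -1.0, "hooves": -1.0, "pointed_ears": 2.0},
}
CONCEPTS = sorted(WEIGHTS["zebra"])

def predict(concepts):
    """Return the uniquely best-scoring class, or None on a tie."""
    scores = {c: sum(w[k] for k in concepts) for c, w in WEIGHTS.items()}
    best = max(scores.values())
    winners = [c for c, s in scores.items() if s == best]
    return winners[0] if len(winners) == 1 else None

def minimal_edit(concepts, target):
    """Smallest set of concepts to toggle (add or delete) to reach target."""
    for size in range(len(CONCEPTS) + 1):
        for toggled in combinations(CONCEPTS, size):
            edited = set(concepts) ^ set(toggled)  # toggle membership
            if predict(edited) == target:
                return set(toggled)
    return None

image_concepts = {"stripes", "hooves"}
print(predict(image_concepts))               # class predicted for the input
print(minimal_edit(image_concepts, "horse")) # concepts to toggle to reach "horse"
```

Brute-force search grows exponentially with the number of concepts, so it only serves to show what "minimal" means here.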

Triage and diagnosis of COVID-19 from medical social media (Preprint)

10.2196/preprints.30397 ◽

2021 ◽

Author(s):

Abul Hasan ◽

Mark Levene ◽

David Weston ◽

Renate Fromson ◽

Nicolas Koslover ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Rule Based ◽

Additional Information ◽

Processing Pipeline ◽

Machine Learning Models

BACKGROUND The COVID-19 pandemic has created a pressing need for integrating information from disparate sources, in order to assist decision makers. Social media is important in this respect; however, to make sense of the textual information it provides and to automate the processing of large amounts of data, natural language processing methods are needed. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect. OBJECTIVE This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity and prevalence of the disease. METHODS The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients’ posts using conditional random fields. An unsupervised rule-based algorithm is then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations are subsequently used to construct two different vector representations of each post. These vectors are applied separately to build support vector machine learning models to triage patients into three categories and diagnose them for COVID-19. RESULTS We report macro- and micro-averaged F1 scores in the ranges of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19, when the models are trained on human-labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. Also, we highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones. CONCLUSIONS Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
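
For readers unfamiliar with the two averaging schemes reported above, the following sketch computes macro- and micro-averaged F1 for a single-label, multi-class task. The labels "A", "B" and "C" and the sample predictions are made up for illustration and do not come from the study.

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(y_true, y_pred):
    """Return (macro F1, micro F1) over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, so every class weighs the same
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts first, so frequent classes dominate
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

y_true = ["A", "A", "A", "A", "B", "C"]
y_pred = ["A", "A", "A", "B", "B", "A"]
macro, micro = macro_micro_f1(y_true, y_pred)
print(f"macro={macro:.3f} micro={micro:.3f}")
```

In this toy data the rare class "C" is never predicted correctly, which pulls the macro average well below the micro average.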

Predicting Onset of Dementia Using Clinical Notes and Machine Learning: Case-Control Study (Preprint)

10.2196/preprints.17819 ◽

2020 ◽

Author(s):

Christopher A Hane ◽

Vijay S Nori ◽

William H Crown ◽

Darshak M Sanghavi ◽

Paul Bleicher

Keyword(s):

Machine Learning ◽

Language Processing ◽

Disease Onset ◽

Area Under The Curve ◽

Learning Models ◽

Term Care ◽

Clinical Notes ◽

Patients At Risk ◽

Hospital Systems ◽

Machine Learning Models

BACKGROUND Clinical trials need efficient tools to assist in recruiting patients at risk of Alzheimer disease and related dementias (ADRD). Early detection can also assist patients with financial planning for long-term care. Clinical notes are an important, underutilized source of information in machine learning models because of the cost of collection and complexity of analysis. OBJECTIVE This study aimed to investigate the use of deidentified clinical notes from multiple hospital systems collected over 10 years to augment retrospective machine learning models of the risk of developing ADRD. METHODS We used 2 years of data to predict the future outcome of ADRD onset. Clinical notes are provided in a deidentified format with specific terms and sentiments. Terms in clinical notes are embedded into a 100-dimensional vector space to identify clusters of related terms and abbreviations that differ across hospital systems and individual clinicians. RESULTS When using clinical notes, the area under the curve (AUC) improved from 0.85 to 0.94, and positive predictive value (PPV) increased from 45.07% (25,245/56,018) to 68.32% (14,153/20,717) in the model at disease onset. Models with clinical notes improved in both AUC and PPV in years 3-6 when notes’ volume was largest; results are mixed in years 7 and 8 with the smallest cohorts. CONCLUSIONS Although clinical notes helped in the short term, the presence of ADRD symptomatic terms years earlier than onset adds evidence to other studies that clinicians undercode diagnoses of ADRD. De-identified clinical notes increase the accuracy of risk models. Clinical notes collected across multiple hospital systems via natural language processing can be merged using postprocessing techniques to aid model accuracy.
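
The positive predictive values quoted in the results can be checked directly from the counts given in parentheses. The sketch below assumes that each fraction is true positives over all predicted positives, which is the standard definition of PPV; the function name `ppv` is chosen here for illustration.

```python
def ppv(true_positives, predicted_positives):
    """Positive predictive value: share of flagged patients who are true cases."""
    return true_positives / predicted_positives

without_notes = ppv(25_245, 56_018)
with_notes = ppv(14_153, 20_717)
print(f"without notes: {without_notes:.2%}")
print(f"with notes:    {with_notes:.2%}")
```

The counts also show that the model with clinical notes flags far fewer patients (20,717 versus 56,018), so the higher PPV comes with a smaller set of positive predictions.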

Covid-19 detection via deep neural network and occlusion sensitivity maps

10.36227/techrxiv.14100890 ◽

2021 ◽

Author(s):

Noor Ahmad ◽

Muhammad Aminu ◽

Mohd Halim Mohd Noor

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Color Images ◽

Fine Tuning ◽

Training Dataset ◽

Learning Approaches ◽

Learning Models ◽

Sensitivity Maps

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
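
One way to see why the number of input channels matters is to count the parameters of a first convolutional layer. The sketch below uses the standard parameter count for a 2D convolution; the layer size (64 filters of 7x7) is a hypothetical example and is not taken from CovidNet.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in a square-kernel 2D convolution."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# Hypothetical first layer: 64 filters of size 7x7
rgb_first_layer = conv2d_params(in_channels=3, out_channels=64, kernel_size=7)
gray_first_layer = conv2d_params(in_channels=1, out_channels=64, kernel_size=7)
print(rgb_first_layer, gray_first_layer)
```

A model pre-trained on color images expects three input channels, so grayscale inputs usually have to be replicated across channels or the first layer has to be replaced before fine-tuning.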

Enhancing the performance of cancer text classification model based on cancer hallmarks

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v10.i2.pp316-323 ◽

2021 ◽

Vol 10 (2) ◽

pp. 316-323

Author(s):

Noha Ali ◽

Ahmed H. AbuEl-Atta ◽

Hala H. Zayed

Keyword(s):

Neural Network ◽

Language Processing ◽

Text Classification ◽

State Of The Art ◽

Classification Model ◽

Biomedical Text ◽

Cancer Hallmarks ◽

Embedding Technique ◽

Proposed Model ◽

Biomedical Text Classification

Deep learning (DL) algorithms have achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrase embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. It outperforms the previous model, achieving an F-score of 83.87% using PMC vectors (PMCVec), an unsupervised embedding technique trained on PubMed abstracts. We also ran another experiment on the same dataset using a recurrent neural network (RNN) with two different word embeddings, Google News and PMCVec, which achieved F-scores of 74.9% and 76.26%, respectively.

Ensemble and Neural Network Machine Learning Models for Short-Term Load Forecasting of Open Cast Mining Companies

Electrotechnical Systems and Complexes ◽

10.18503/2311-8318-2021-3(52)-57-65 ◽

2021 ◽

pp. 57-65

Author(s):

Dmitry Antonenkov ◽

Pavel Matrenin ◽

Keyword(s):

Neural Network ◽

Machine Learning ◽

Load Forecasting ◽

Learning Models ◽

Short Term ◽

Mining Companies ◽

Open Cast Mining ◽

Short Term Load Forecasting ◽

Open Cast ◽

Machine Learning Models

First-Break Picking Classification Models Using Recurrent Neural Network

10.2118/204862-ms ◽

2021 ◽

Author(s):

Mohammed Ayub ◽

SanLinn Kaka

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Neural Network ◽

Contextual Information ◽

Classification Model ◽

Superior Performance ◽

Learning Models ◽

Neural Network Models ◽

Minimum Number ◽

Machine Learning Models

Manual first-break picking from a large volume of seismic data is extremely tedious and costly. Deploying machine learning models makes the process fast and cost-effective. However, these models require highly representative and effective features for accurate automatic picking. Therefore, a First-Break (FB) picking classification model that uses a minimum number of effective features and promises efficient performance is proposed. Variants of Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) can retain contextual information from many previous time steps. We exploit this advantage for FB picking, as seismic traces are amplitude values of vibration along the time axis. We use the behavioral fluctuation of amplitude as input features for LSTM and GRU. The models are trained on noisy data and tested for generalization on original traces not seen during training and validation. To analyze real-time suitability, performance is benchmarked using accuracy, F1-measure and three other established metrics. We trained two RNN models and two deep neural network models for FB classification using only amplitude values as features. Both LSTM and GRU achieve an accuracy and F1-measure of 94.20%. With the same features, a Convolutional Neural Network (CNN) has an accuracy of 93.58% and an F1-score of 93.63%, and a Deep Neural Network (DNN) model has an accuracy of 92.83% and an F1-measure of 92.59%. From the experimental results, we see significantly superior performance of LSTM and GRU over CNN and DNN when using the same features. To assess the robustness of the LSTM and GRU models, their performance is compared with a DNN model trained using nine features derived from seismic traces, and the RNN models are again observed to perform better. Therefore, it is safe to conclude that RNN models (LSTM and GRU) are capable of classifying FB events efficiently even when using a minimum number of features that are not computationally expensive. The novelty of our work is the capability of automatic FB classification with RNN models that incorporate contextual behavioral information without the need for sophisticated feature extraction or engineering techniques, which in turn can help reduce cost and make the classification model more robust and faster.
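
As one possible reading of "behavioral fluctuation of amplitude", the sketch below turns a seismic trace into overlapping windows of sample-to-sample amplitude changes, the kind of sequence an LSTM or GRU could consume. The use of first differences, the window length, and the function name are assumptions; the abstract does not define the exact features.

```python
def amplitude_windows(trace, window=4):
    """Overlapping windows of first differences of a trace (hypothetical features)."""
    diffs = [b - a for a, b in zip(trace, trace[1:])]
    # Too-short traces yield an empty list because the range is empty
    return [diffs[i:i + window] for i in range(len(diffs) - window + 1)]

# Toy trace: quiet samples followed by a sharp onset
trace = [0, 0, 1, 0, 20, -15, 10]
for w in amplitude_windows(trace):
    print(w)
```

Windows that straddle the onset contain much larger changes than the quiet ones, which is the contrast a first-break classifier would need to pick up.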
