Classifying Radiology Abstracts with Deep Learning

Keyword(s):

Average Error ◽

Ensemble Model ◽

Radiological Society ◽

Natural Language Processing

Background: The Radiological Society of North America (RSNA) receives more than 8000 abstracts yearly for scientific presentations, scientific posters, and scientific papers. Each abstract is manually assigned one of 16 top-level categories (e.g. "Breast Imaging") for workflow purposes. Additionally, each abstract receives a grade from 1 to 10 based on a variety of subjective factors such as style and perceived writing quality. Using machine learning to automate, at least partially, the categorization of abstract submissions could save many hours of manual labor. Methods: A total of 45527 RSNA abstract submissions from 2014 through 2019 were ingested, tokenized, and pre-processed with a standard natural language processing protocol. A bag-of-words (BOW) model was used as a baseline to evaluate two more sophisticated models, convolutional neural networks and recurrent neural networks, as well as an ensemble model combining all three. Results: The ensemble model achieved 73% testing accuracy for classifying the 16 top-level categories, outperforming all other models. The top model for predicting abstract grade was also an ensemble model, achieving a mean absolute error (MAE) of 1.01. Conclusion: While the baseline BOW model was the highest-performing individual classifier, ensemble models that included state-of-the-art neural networks were able to outperform it. Our research shows that machine learning techniques can, to a reasonable degree of accuracy, predict both objective factors such as abstract category and subjective factors such as abstract grade. This work builds upon previous research that applies natural language processing to scientific abstracts to make useful inferences that address a meaningful problem.
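The abstract does not say how the ensemble combines its members, so the following is only a minimal sketch under one common assumption: soft voting, where per-class probabilities from each model are averaged. It also shows how a mean absolute error on grades is computed. All probabilities, grades, and the category names other than "Breast Imaging" are hypothetical.

```python
from typing import Dict, List


def ensemble_predict(model_probs: List[Dict[str, float]]) -> str:
    """Average per-class probabilities across models and return the top class."""
    classes = model_probs[0].keys()
    averaged = {
        c: sum(probs[c] for probs in model_probs) / len(model_probs)
        for c in classes
    }
    return max(averaged, key=averaged.get)


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    """Mean of the absolute differences between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical class probabilities from three models for one abstract.
bow = {"Breast Imaging": 0.5, "Chest": 0.3, "Neuroradiology": 0.2}
cnn = {"Breast Imaging": 0.3, "Chest": 0.6, "Neuroradiology": 0.1}
rnn = {"Breast Imaging": 0.6, "Chest": 0.3, "Neuroradiology": 0.1}
predicted_category = ensemble_predict([bow, cnn, rnn])  # "Breast Imaging"

# Hypothetical reviewer grades (1 to 10) and model predictions.
true_grades = [7, 5, 9, 4]
predicted_grades = [6, 5, 7, 5]
grade_mae = mean_absolute_error(true_grades, predicted_grades)  # 1.0
```

In this toy case the CNN alone would pick "Chest", but the averaged vote follows the other two models. An MAE near 1, as reported in the abstract, means predictions are off by about one grade point on average.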

A Survey on Bias in Deep NLP

Applied Sciences ◽

10.3390/app11073184 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3184

Author(s):

Ismael Garrido-Muñoz ◽

Arturo Montejo-Ráez ◽

Fernando Martínez-Santiago ◽

L. Alfonso Ureña-López

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Probability Distribution ◽

Natural Language ◽

Network Design ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Processes ◽

Relevant Issue

Deep neural networks are the dominant approach in many areas of machine learning, including natural language processing (NLP). Thanks to the availability of large corpora and the capability of deep architectures to shape internal language mechanisms in self-supervised learning processes (also known as “pre-training”), versatile, high-performing models are released continuously for every new network design. These networks learn a probability distribution of words and relations across the training collection used, inheriting the potential flaws, inconsistencies and biases contained in that collection. As pre-trained models have proven very useful for transfer learning, dealing with bias has become a relevant issue in this new scenario. We introduce bias in a formal way and explore how it has been treated in several networks, in terms of detection and correction. In addition, available resources are identified and a strategy to deal with bias in deep NLP is proposed.

Machine Learning Techniques for Biomedical Natural Language Processing: A Comprehensive Review

IEEE Access ◽

10.1109/access.2021.3119621 ◽

2021 ◽

pp. 1-1

Author(s):

Essam H. Houssein ◽

Rehab E. Mohamed ◽

Abdelmgeid A. Ali

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Comprehensive Review ◽

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

Analyzing Behavior of Cancer Patients using Machine Learning Techniques

10.35940/ijitee.i8414.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1547-1556

Keyword(s):

Machine Learning ◽

Natural Language ◽

Cancer Patients ◽

Language Processing ◽

Support Vector ◽

SVM Classifier ◽

Operating Characteristics ◽

Decision Tree Classifier ◽

Tree Classifier

Online discussion forums and blogs are vibrant platforms where cancer patients express their views in the form of stories. These stories sometimes become a source of inspiration for patients who are anxiously searching for similar cases. This paper proposes a method that uses natural language processing and machine learning to analyze unstructured text gathered from patients' reviews and stories. The proposed methodology aims to identify the behavior, emotions, side effects, decisions and demographics associated with cancer patients. The pre-processing phase involves extracting web text, cleaning it by removing special characters and symbols, and finally tagging it with the NLTK (Natural Language Toolkit) POS (part-of-speech) tagger. The post-processing phase trains seven machine learning classifiers (refer to Table 6). The Decision Tree classifier shows the highest precision (0.83) among the classifiers, while the Support Vector Machine (SVM) classifier has the highest area under the receiver operating characteristic curve (AUC) (0.98).
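To make the cleaning step and the two reported metrics concrete, here is a minimal standard-library sketch. It is not the authors' pipeline: the cleaning rule (keep letters, digits, whitespace and apostrophes) is an assumption, the NLTK tagging step is omitted, and the sample text, labels and scores are hypothetical.

```python
import re
from typing import List


def clean_text(raw: str) -> str:
    """Replace special characters and symbols with spaces, then collapse whitespace."""
    no_symbols = re.sub(r"[^A-Za-z0-9\s']", " ", raw)
    return re.sub(r"\s+", " ", no_symbols).strip()


def precision(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predicted positives that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical forum post, gold labels, hard predictions and classifier scores.
cleaned = clean_text("Chemo was hard :( but I'm #hopeful!!")
labels = [1, 0, 1, 1, 0]
preds = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.8, 0.4, 0.1]
prec = precision(labels, preds)  # 2 true positives, 1 false positive
auc = roc_auc(labels, scores)    # 5 of 6 positive/negative pairs ranked correctly
```

Precision depends on a fixed decision threshold, whereas AUC summarizes ranking quality over all thresholds, which is why different classifiers can lead on each metric, as the abstract reports.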

International Journal of Advanced Research in Science, Communication and Technology ◽

A Comparative Analysis of Machine Learning Techniques for Spam Detection

10.48175/ijarsct-1308 ◽

2021 ◽

pp. 657-661

Author(s):

Rashida Ali ◽

Ibrahim Rampurawala ◽

Mayuri Wandhe ◽

Ruchika Shrikhande ◽

Arpita Bhatkar

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Natural Language ◽

Language Processing ◽

High Volume ◽

Machine Learning Algorithms ◽

Spam Detection ◽

The Internet provides a medium for connecting individuals with similar or different interests, creating a hub. Because so many people participate on these platforms, a user can receive a high volume of messages from different individuals, many of them unwanted, which creates chaos. These messages sometimes contain true information and sometimes false information, which confuses users and is a first step towards spam messaging. A spam message is an irrelevant, unsolicited message sent by a known or unknown user, and it may create a sense of insecurity among users. In this paper, different machine learning algorithms were trained and tested with natural language processing (NLP) to classify messages as spam or ham.
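The abstract does not name the algorithms it compares. As one illustrative possibility, the sketch below trains a multinomial Naive Bayes classifier with Laplace smoothing, a common baseline for spam/ham filtering. The training messages are hypothetical and the whitespace tokenizer is a simplifying assumption.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def tokenize(message: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return message.lower().split()


def train_naive_bayes(
    data: List[Tuple[str, str]]
) -> Tuple[Dict[str, Counter], Counter, Set[str]]:
    """Count words per label, messages per label, and collect the vocabulary."""
    word_counts: Dict[str, Counter] = {"spam": Counter(), "ham": Counter()}
    label_counts: Counter = Counter()
    for text, label in data:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab


def classify(message: str, word_counts, label_counts, vocab) -> str:
    """Return the label with the highest log posterior under Naive Bayes."""
    total_docs = sum(label_counts.values())
    best_label, best_score = "", -math.inf
    for label in ("spam", "ham"):
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(message):
            # Laplace (add-one) smoothing keeps unseen words from zeroing the score.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled messages.
training_data = [
    ("win a free prize now", "spam"),
    ("free cash offer click now", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("please review the project report", "ham"),
]
model = train_naive_bayes(training_data)
```

Calling `classify("free prize offer", *model)` returns `"spam"` and `classify("lunch meeting today", *model)` returns `"ham"` on this toy data.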

Machine Learning in Natural Language Processing

Handbook of Research on Machine Learning Applications and Trends ◽

10.4018/978-1-60566-766-9.ch014 ◽

2010 ◽

pp. 302-324

Author(s):

Marina Sokolova ◽

Stan Szpakowicz

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Text Processing ◽

Word Sense Disambiguation ◽

Word Sense ◽

Part Of Speech ◽

Applications Of Machine Learning

This chapter presents applications of machine learning techniques to traditional problems in natural language processing, including part-of-speech tagging, entity recognition and word-sense disambiguation. People usually solve such problems without difficulty or at least do a very good job. Linguistics may suggest labour-intensive ways of manually constructing rule-based systems. It is, however, the easy availability of large collections of texts that has made machine learning a method of choice for processing volumes of data well above the human capacity. One of the main purposes of text processing is all manner of information extraction and knowledge extraction from such large text collections. Machine learning methods discussed in this chapter have stimulated wide-ranging research in natural language processing and helped build applications with serious deployment potential.
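As a small illustration of learning a tagger from annotated text instead of hand-writing rules, the sketch below implements a most-frequent-tag baseline for part-of-speech tagging. It is not taken from the chapter: the tiny corpus, the tag names, and the fallback tag for unknown words are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

TaggedSentence = List[Tuple[str, str]]


def train_tagger(tagged_sentences: List[TaggedSentence]) -> Dict[str, str]:
    """Map each lowercased word to the tag it carries most often in the corpus."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words: List[str], model: Dict[str, str], default_tag: str = "NOUN") -> List[Tuple[str, str]]:
    """Tag each word with its learned tag, falling back to default_tag if unseen."""
    return [(w, model.get(w.lower(), default_tag)) for w in words]


# Hypothetical hand-annotated training corpus.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
]
model = train_tagger(corpus)
tagged = tag(["The", "cat", "barks", "loudly"], model)
```

The unseen word "loudly" receives the fallback tag, which is wrong here. That limitation is the kind of gap that larger corpora and context-aware models are meant to close.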

Communications in Computer and Information Science - Highlights on Practical Applications of Agents and Multi-Agent Systems ◽

Combining Machine Learning Techniques and Natural Language Processing to Infer Emotions Using Spanish Twitter Corpus

10.1007/978-3-642-38061-7_15 ◽

2013 ◽

pp. 149-157 ◽

Cited By ~ 5

Author(s):

Gonzalo Blázquez Gil ◽

Antonio Berlanga de Jesús ◽

José M. Molina López

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Advanced Machine Learning Techniques in Natural Language Processing for Indian Languages

Smart Techniques for a Smarter Planet - Studies in Fuzziness and Soft Computing ◽

10.1007/978-3-030-03131-2_7 ◽

2019 ◽

pp. 117-144 ◽

Cited By ~ 1

Author(s):

Vaishali Gupta ◽

Nisheeth Joshi ◽

Iti Mathur

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Indian Languages ◽

An Analysis of Machine Learning Algorithms and Deep Neural Networks for Email Spam Classification using Natural Language Processing

10.1109/soli54607.2021.9672398 ◽

2021 ◽

Author(s):

Md. Mohidul Hasan ◽

Syed Mahbubuz Zaman ◽

Md. Asif Talukdar ◽

Ayesha Siddika ◽

Md. Golam Rabiul Alam

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Natural Language ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Email Spam

Handbook of Research on Emerging Trends and Applications of Machine Learning - Advances in Computational Intelligence and Robotics ◽

Deep Learning Approaches for Textual Sentiment Analysis

10.4018/978-1-5225-9643-1.ch009 ◽

2020 ◽

pp. 171-182 ◽

Cited By ~ 1

Author(s):

Tamanna Sharma ◽

Anu Bajaj ◽

Om Prakash Sangwan

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Computational Technique ◽

Sentiment analysis is the computational measurement of attitudes, opinions, and emotions (such as positive or negative) with the help of text mining and natural language processing of words and phrases. Combining machine learning techniques with natural language processing helps in analysing and predicting sentiments more precisely. Sometimes, however, machine learning techniques cannot predict sentiment because labelled data is unavailable. To overcome this problem, an advanced computational technique called deep learning comes into play. This chapter highlights the latest studies on the use of deep learning techniques, such as convolutional and recurrent neural networks, in sentiment analysis.

Handbook of Research on Ambient Intelligence and Smart Environments - Advances in Computational Intelligence and Robotics ◽

Opinion Mining and Information Retrieval

10.4018/978-1-61692-857-5.ch030 ◽

2011 ◽

pp. 640-652

Author(s):

Shishir K. Shandilya ◽

Suresh Jain

Keyword(s):

Machine Learning ◽

Natural Language ◽

Language Processing ◽

Ambient Intelligence ◽

Opinion Mining ◽

Training Data ◽

Web Documents ◽

Opinion Extraction ◽

Traditional Natural

The explosive increase in Internet usage has attracted technologies for automatically mining user-generated content (UGC) from Web documents. These UGC-rich resources have raised new opportunities and challenges for carrying out opinion extraction and mining tasks to produce opinion summaries. Opinion extraction technology allows users to retrieve and analyze people’s opinions scattered over Web documents. Opinion mining is a process concerned with the opinions that consumers generate about a product. It aims at understanding, extracting and classifying opinions scattered in the unstructured text of online resources. Search engines perform well when one wants to learn about a product before purchase, but filtering and analyzing the search results is often complex and time-consuming. This has generated the need for intelligent technologies that can process these unstructured online text documents through automatic classification, concept recognition, text summarization, etc. These tools are based on traditional natural language techniques, statistical analysis, and machine learning techniques. Automatic knowledge extraction over large text collections such as the Internet has been a challenging task due to many constraints, such as the need for large annotated training data, the requirement for extensive manual processing of data, and the huge number of domain-specific terms. Ambient Intelligence (AmI) in web-enabled technologies supports and promotes intelligent e-commerce services, enabling personalized, self-configurable, and intuitive applications that make UGC knowledge available to support buying confidence. In this chapter, we discuss various approaches to opinion mining that combine Ambient Intelligence, Natural Language Processing and Machine Learning methods based on textual and grammatical clues.
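One simple way to use textual and grammatical clues for opinion classification is a polarity lexicon combined with negation handling. The sketch below is only an illustration of that idea, not a method described in the chapter. The word lists are hypothetical, and the negation rule (a negator flips the next opinion word it precedes) is a simplifying assumption.

```python
import re

# Hypothetical, hand-picked lexicons for product reviews.
POSITIVE = {"good", "great", "excellent", "reliable"}
NEGATIVE = {"bad", "poor", "slow", "broken"}
NEGATORS = {"not", "never", "no"}


def opinion_score(review: str) -> int:
    """Sum word polarities; a negator flips the next opinion word that follows it."""
    score = 0
    negate = False
    for token in re.findall(r"[a-z']+", review.lower()):
        if token in NEGATORS:
            negate = True
            continue
        polarity = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        if polarity != 0:
            score += -polarity if negate else polarity
            negate = False  # the negation has been consumed
    return score


def classify_opinion(review: str) -> str:
    """Map the numeric score to a coarse opinion label."""
    score = opinion_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `classify_opinion("Excellent camera, reliable battery")` returns `"positive"`, while `classify_opinion("The screen is not good")` returns `"negative"`. A mixed review such as "great battery but not good screen" scores zero and is labelled `"neutral"`, which shows why summary-level opinion mining usually scores each product feature separately.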