False News Recognition Using Machine Learning

Abstract: Now that the internet is used by almost everyone, anyone can share or upload articles regardless of their credibility. False news refers to articles published with the intent of deliberately misleading readers. False news on the internet has recently become more common, and it is a major problem because real and false news are difficult to tell apart. False news and false posts have become more prevalent on social media sites such as Facebook and Twitter, where news spreads like wildfire without any check on its authenticity. False news can be used to sway election outcomes against certain candidates, for clickbait, and to earn revenue by misleading users. In this paper we use natural language processing techniques such as bag of words and TF-IDF, together with classification algorithms such as SVM and the passive-aggressive classifier, to train a model to differentiate false news from real news, and we compare the accuracy of the methods to find the most accurate model.
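
The two text representations named in this abstract can be illustrated with a minimal sketch that uses only the Python standard library. The toy headlines, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration; the abstract does not state which tokenizer, weighting variant, or library the authors used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return one term-count Counter per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log((1 + N) / (1 + df)) + 1   (a smoothed variant; an assumption)
    """
    bags = bag_of_words(docs)
    n_docs = len(bags)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for bag in bags:
        df.update(bag.keys())

    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in bag.items()
        })
    return weights


# Hypothetical headlines, invented for this example.
docs = [
    "shocking miracle cure doctors hate",
    "council approves new budget",
    "shocking budget vote",
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in fewer documents ("miracle") receives a higher weight than one that occurs in several ("shocking"); vectors of this kind would then be passed to a classifier such as an SVM or a passive-aggressive classifier.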

Social Media Content Categorization Using Supervised Based Machine Learning Methods and Natural Language Processing in Bangla Language

2020 11th International Conference on Electrical and Computer Engineering (ICECE) ◽

10.1109/icece51571.2020.9393095 ◽

2020 ◽

Author(s):

Md. Rejaul Alam ◽

Afsana Akter ◽

Minhajul Abedin Shafin ◽

Md. Mehedi Hasan ◽

Antara Mahmud

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Media Content ◽

Learning Methods ◽

Machine Learning Methods

Applying natural language processing and machine learning techniques to patient experience feedback: a systematic review

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2020-100262 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100262

Author(s):

Mustafa Khanbhai ◽

Patrick Anyadi ◽

Joshua Symons ◽

Kelsey Flott ◽

Ara Darzi ◽

...

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Patient Experience ◽

Language Processing ◽

Performance Metrics ◽

Free Text ◽

Patient Feedback

Objectives: Unstructured free-text patient feedback contains rich information, and analysing these data manually would require a lot of personnel resources which are not available in most healthcare organisations. To undertake a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data. Methods: Databases were systematically searched to identify articles published between January 2000 and December 2019 examining NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded. Results: Nineteen articles were included. The majority (80%) of studies applied language analysis techniques on patient feedback from social media sites (unsolicited) followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included the precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers. Conclusion: NLP and ML have emerged as an important tool for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.

Triage and diagnosis of COVID-19 from medical social media (Preprint)

10.2196/preprints.30397 ◽

2021 ◽

Author(s):

Abul Hasan ◽

Mark Levene ◽

David Weston ◽

Renate Fromson ◽

Nicolas Koslover ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Rule Based ◽

Additional Information ◽

Processing Pipeline ◽

Machine Learning Models

BACKGROUND The COVID-19 pandemic has created a pressing need for integrating information from disparate sources, in order to assist decision makers. Social media is important in this respect; however, to make sense of the textual information it provides and be able to automate the processing of large amounts of data, natural language processing methods are needed. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect. OBJECTIVE This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity and prevalence of the disease. METHODS The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients’ posts using conditional random fields. An unsupervised rule-based algorithm is then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations are subsequently used to construct two different vector representations of each post. These vectors are applied separately to build support vector machine learning models to triage patients into three categories and diagnose them for COVID-19. RESULTS We report macro- and micro-averaged F1 scores in the ranges of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19, when the models are trained on human labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. Also, we highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones. CONCLUSIONS Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
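
Macro- and micro-averaged F1 scores, the metrics reported in this abstract, can differ substantially on imbalanced classes. The sketch below shows one common way to compute both for a multi-class task. The three label names and the example predictions are hypothetical; the abstract says only that patients are triaged into three categories and does not name them.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[label], fp[label], fn[label]) for label in labels) / len(labels)
    # Micro: pool the counts over all classes first, so every example counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical triage labels and predictions, invented for this example.
y_true = ["low", "low", "low", "low", "medium", "high"]
y_pred = ["low", "low", "low", "low", "high", "medium"]
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
```

Here the frequent class is predicted perfectly and the two rare classes are always wrong, so the macro average (1/3) falls well below the micro average (2/3).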

A Review of Natural Language Processing and Machine Learning Tools Used to Analyze Arabic Social Media

2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT) ◽

10.1109/jeeit.2019.8717369 ◽

2019 ◽

Cited By ~ 8

Author(s):

Tarek Kanan ◽

Odai Sadaqa ◽

Amal Aldajeh ◽

Hanadi Alshwabka ◽

Wassan AL-dolime ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Tools

Sentiment Analysis on Twitter Data of World Cup Soccer Tournament Using Machine Learning

IoT ◽

10.3390/iot1020014 ◽

2020 ◽

Vol 1 (2) ◽

pp. 218-239 ◽

Cited By ~ 2

Author(s):

Ravikumar Patel ◽

Kalpdrum Passi

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Random Forest ◽

Natural Language ◽

Language Processing ◽

Machine Learning Algorithms ◽

World Cup ◽

Part Of Speech ◽

Twitter Data ◽

Processing Techniques

In the derived approach, Twitter data for the 2014 World Cup soccer tournament held in Brazil is analyzed with machine learning techniques to detect the sentiment of people throughout the world. By filtering and analyzing the data using natural language processing techniques, sentiment polarity was calculated based on the emotion words detected in the user tweets. The dataset is normalized for use by machine learning algorithms and prepared using natural language processing techniques such as word tokenization, stemming and lemmatization, part-of-speech (POS) tagging, named entity recognition (NER), and parsing to extract emotions from the text of each tweet. The approach is implemented in the Python programming language with the Natural Language Toolkit (NLTK). A derived algorithm uses WordNet and POS tags to extract emotional words that are meaningful in the context of the sentence, and assigns each a sentiment polarity using the SentiWordNet dictionary or a lexicon-based method. The resulting polarity is further analyzed using naïve Bayes, support vector machine (SVM), K-nearest neighbor (KNN), and random forest machine learning algorithms and visualized on the Weka platform. Naïve Bayes gives the best accuracy, 88.17%, whereas random forest gives the best area under the receiver operating characteristic curve (AUC), 0.97.
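
The lexicon-based polarity step described in this abstract can be sketched in a few lines of standard-library Python. The lexicon below is a tiny hypothetical stand-in, not SentiWordNet, and the regex tokenizer and mean-score rule are assumptions; the paper's own algorithm additionally uses WordNet and POS tags, which this sketch omits.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); NOT SentiWordNet.
TOY_LEXICON = {
    "great": 1.0,
    "love": 0.8,
    "win": 0.5,
    "lose": -0.5,
    "bad": -0.7,
    "terrible": -1.0,
}


def polarity(tweet, lexicon=TOY_LEXICON):
    """Return (mean polarity, label) over the lexicon words found in a tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    scores = [lexicon[token] for token in tokens if token in lexicon]
    if not scores:
        # No emotion word found: treat the tweet as neutral.
        return 0.0, "neutral"
    mean = sum(scores) / len(scores)
    if mean > 0:
        label = "positive"
    elif mean < 0:
        label = "negative"
    else:
        label = "neutral"
    return mean, label


# Example tweets invented for this sketch.
positive_example = polarity("What a great win for Brazil!")
negative_example = polarity("Terrible defence, bad game")
neutral_example = polarity("Kick-off at nine")
```

Labels produced this way could serve as input to the classifiers the abstract lists (naïve Bayes, SVM, KNN, random forest).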

Information Search Mechanisms for Government Entities using Machine Learning and Natural Language Processing Techniques

International Journal of Computer Applications ◽

10.5120/ijca2020920150 ◽

2020 ◽

Vol 176 (21) ◽

pp. 1-7

Author(s):

Ricardo Ponciano ◽

João Santos ◽

João Isento

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Information Search ◽

Processing Techniques

Computerized Answer Grading

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35044 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 618-619

Author(s):

Anurag Langan

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Computer Technology ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Grade Student ◽

Processing Techniques

Grading student answers is a tedious and time-consuming task. One study found that, on average, around 25% of a teacher's time is spent scoring students' answer sheets. This time could be put to much better use if computer technology could score the answers. The proposed system aims to grade student answers using the natural language processing techniques and machine learning algorithms available today.

A Decision Support System for Predicting Socially Depressed Users Using Bidirectional Encoders Representations from Transformers (BERT)

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst12674 ◽

2021 ◽

Vol 23 (3) ◽

Author(s):

Maharukh Syed ◽

Meera Narvekar

Keyword(s):

Mental Health ◽

Social Media ◽

Natural Language Processing ◽

Decision Support ◽

Decision Support System ◽

Language Processing ◽

21st Century ◽

Support System ◽

Contextual Data ◽

Processing Techniques

Depression is one of the leading causes of suicide in society. The youth of the 21st century are inclined towards social media for all their needs and expressions. Close friends can easily tell whether someone is happy, sad, or depressed from that user's daily social media activity, such as status uploads, shares, reposts, and check-ins. This activity can be analyzed in order to understand the pattern of mental health. Such data is easily available, and suspected cases can be reported to a psychiatrist or psychologist to prevent socially active depressed patients from making harmful decisions about their lives, thus providing a Decision Support System (DSS). Various natural language processing techniques have been used to detect depression, but there is a need for a unified architecture that is based on contextual data and is bidirectional in nature. This can be achieved by using, for example, Bidirectional Encoder Representations from Transformers (BERT), a Google research project.

An Exploration of Impact of COVID 19 on mental health -Analysis of tweets using Natural Language Processing techniques

10.1101/2020.07.30.20165571 ◽

2020 ◽

Author(s):

Sohini Sengupta ◽

Sareeta Mugde ◽

Garima Sharma

Keyword(s):

Mental Health ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Gold Mine ◽

Social Media Platforms ◽

Processing Techniques

Twitter is one of the world's biggest social media platforms, hosting an abundance of user-generated posts, and is considered a gold mine of data. The majority of tweets are public and can therefore be retrieved, unlike posts on other social media platforms. In this paper we analyze the mental health topics that were recently (June 2020) discussed on Twitter. Amid the ongoing pandemic, we also examine whether COVID-19 emerges as one of the factors affecting mental health. Finally, we perform an overall sentiment analysis to better understand the emotions of users.

Extending A Chronological and Geographical Analysis of Personal Reports of COVID-19 on Twitter to England, UK

10.1101/2020.05.05.20083436 ◽

2020 ◽

Cited By ~ 5

Author(s):

S Golder ◽

Ari Z. Klein ◽

Arjun Magge ◽

Karen O’Connor ◽

Haitao Cai ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Geographical Distribution ◽

Language Processing ◽

Social Media Mining ◽

Learning Framework ◽

The US ◽

Early Indication ◽

Media Mining

Abstract: The rapidly evolving COVID-19 pandemic presents challenges for actively monitoring its transmission. In this study, we extend a social media mining approach used in the US to automatically identify personal reports of COVID-19 on Twitter in England, UK. The findings indicate that a natural language processing and machine learning framework could help provide an early indication of the chronological and geographical distribution of COVID-19 in England.
