Identifying and Characterizing the Propagation Scale of COVID-19 Situational Information on Twitter: A Hybrid Text Analytic Approach

2021 ◽  
Vol 11 (14) ◽  
pp. 6526
Author(s):  
Junaid Abdul Wahid ◽  
Lei Shi ◽  
Yufei Gao ◽  
Bei Yang ◽  
Yongcai Tao ◽  
...  

During the recent COVID-19 pandemic, an increasing amount of information has been propagated on social media. This situational information is valuable for public authorities. Therefore, this study characterized the propagation scale of situational information types using natural language processing techniques and machine learning algorithms. We observed that the length of the post has a positive correlation with type 1 information (announcements), and negative words were mostly used in type 5 information (criticizing the government), whereas anxiety-related words have a negative effect on the amount of retweeted type 0 (precautions) and type 2 (donations) information. This study not only contributes to the situational information literature by comprehensively defining categories but also provides data-oriented practical insights so that management authorities can formulate response strategies after the pandemic. Our approach, which is one of a kind, combines Twitter content features, user features and LIWC linguistic features with machine learning algorithms to analyze the propagation scale of situational information, and it achieved 77% accuracy with an SVM when classifying the information categories.

2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Ari Z. Klein ◽  
Abeed Sarker ◽  
Davy Weissenbacher ◽  
Graciela Gonzalez-Hernandez

Abstract Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes—the leading cause of infant mortality—could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms—feature-engineered and deep learning-based classifiers—that automatically distinguish tweets referring to the user’s pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the “defect” class and 0.51 for the “possible defect” class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
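As a rough illustration of the over-sampling idea and the per-class F1-score mentioned in the abstract above, the following is a minimal standard-library sketch. It is not the authors' pipeline: the function names `random_oversample` and `f1_score`, the toy labels, and the choice of simple random duplication are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Randomly duplicate minority-class examples until every class
    has as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw extra copies with replacement from the class's own examples.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): nine "mention" tweets for every "defect" tweet.
tweets = [f"tweet {i}" for i in range(10)]
labels = ["mention"] * 9 + ["defect"]
balanced_x, balanced_y = random_oversample(tweets, labels)
print(Counter(balanced_y))
```

Over-sampling would normally be applied to the training split only, so that duplicated examples do not leak into the evaluation data.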


2019 ◽  
Vol 63 (4) ◽  
pp. 243-252 ◽  
Author(s):  
Jaret Hodges ◽  
Soumya Mohan

Machine learning algorithms are used in language processing, automated driving, and prediction. Though the theory of machine learning has existed since the 1950s, it was not until the advent of advanced computing that its potential began to be realized. Gifted education is a field where machine learning has yet to be utilized, even though one of the underlying problems of gifted education is classification, which is an area where learning algorithms have become exceptionally accurate. We provide a brief overview of machine learning with a focus on neural networks and supervised learning, followed by a demonstration using simulated data and neural networks for classification issues with a practical explanation of the mechanics of the neural network and associated R code. Implications for gifted education are then discussed. Finally, the limitations of supervised learning are discussed. Code used in this article can be found at https://osf.io/4pa3b/


Author(s):  
Anurag Langan

Grading student answers is a tedious and time-consuming task. A study found that, on average, around 25% of a teacher's time is spent scoring students' answer sheets. This time could be put to much better use if computer technology could be used to score answers. This system aims to grade student answers using the various natural language processing techniques and machine learning algorithms available today.


JAMIA Open ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 139-149 ◽  
Author(s):  
Meijian Guan ◽  
Samuel Cho ◽  
Robin Petro ◽  
Wei Zhang ◽  
Boris Pasche ◽  
...  

Abstract Objectives: Natural language processing (NLP) and machine learning approaches were used to build classifiers to identify genomic-related treatment changes in the free-text visit progress notes of cancer patients. Methods: We obtained 5889 deidentified progress reports (2439 words on average) for 755 cancer patients who had undergone clinical next-generation sequencing (NGS) testing at Wake Forest Baptist Comprehensive Cancer Center for our data analyses. An NLP system was implemented to process the free-text data and extract NGS-related information. Three types of recurrent neural network (RNN), namely gated recurrent unit, long short-term memory (LSTM), and bidirectional LSTM (LSTM_Bi), were applied to classify documents into the treatment-change and no-treatment-change groups. Further, we compared the performance of the RNNs to 5 machine learning algorithms: Naive Bayes, K-nearest Neighbor, Support Vector Machine for classification, Random Forest, and Logistic Regression. Results: Our results suggested that, overall, RNNs outperformed traditional machine learning algorithms, and LSTM_Bi showed the best performance among the RNNs in terms of accuracy, precision, recall, and F1 score. In addition, pretrained word embedding can improve the accuracy of LSTM by 3.4% and reduce the training time by more than 60%. Discussion and Conclusion: NLP and RNN-based text mining solutions have demonstrated advantages in information retrieval and document classification tasks for unstructured clinical progress notes.


2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Krishnadas Nanath ◽  
Supriya Kaitheri ◽  
Sonia Malik ◽  
Shahid Mustafa

Purpose: The purpose of this paper is to examine the factors that significantly affect the prediction of fake news from the virality theory perspective. The paper looks at a mix of emotion-driven content, sentimental resonance, topic modeling and linguistic features of news articles to predict the probability of fake news. Design/methodology/approach: A data set of over 12,000 articles was chosen to develop a model for fake news detection. Machine learning algorithms and natural language processing techniques were used to handle big data with efficiency. Lexicon-based emotion analysis provided eight kinds of emotions used in the article text. The cluster of topics was extracted using topic modeling (five topics), while sentiment analysis provided the resonance between the title and the text. Linguistic features were added to the coding outcomes to develop a logistic regression predictive model for testing the significant variables. Other machine learning algorithms were also executed and compared. Findings: The results revealed that positive emotions in a text lower the probability of news being fake. It was also found that sensational content, such as illegal activities and crime-related content, was associated with fake news. News whose title and text exhibit similar sentiments was found to have a lower chance of being fake. News titles with more words and content with fewer words were found to impact fake news detection significantly. Practical implications: Several systems and social media platforms today are trying to implement fake news detection methods to filter the content. This research provides exciting parameters from a viral theory perspective that could help develop automated fake news detectors. Originality/value: While several studies have explored fake news detection, this study uses a new perspective on viral theory. It also introduces new parameters like sentimental resonance that could help predict fake news. This study deals with an extensive data set and uses advanced natural language processing to automate the coding techniques in developing the prediction model.


Author(s):  
Dr. K. Suresh

The current way of checking answer scripts is laborious for colleges: staff must manually check the answers and allocate marks to the students. Our proposed system uses machine learning and natural language processing (NLP) techniques to address this. Machine learning algorithms use computational methods to learn directly from data without relying on predetermined rules. NLP algorithms identify specific entities within the text, search for key elements in a document, run a contextual search for synonyms, detect misspelled words or similar entries, and more. Our algorithm performs similarity checking and also counts the words associated with the question that match exactly between two documents. It also checks whether grammar is used correctly in the student's answer. Our proposed system performs text extraction and evaluation of marks by applying machine learning and natural language processing techniques.
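The similarity checking and exact word matching described in the abstract above could be sketched roughly as follows. The abstract does not say which similarity measure is used, so bag-of-words cosine similarity is an assumption here, and the names `tokenize`, `cosine_similarity` and `matched_keywords` as well as the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def matched_keywords(keywords, reference, answer):
    """Question keywords that appear verbatim in both documents."""
    ref_words, ans_words = set(tokenize(reference)), set(tokenize(answer))
    wanted = {k.lower() for k in keywords}
    return sorted(k for k in wanted if k in ref_words and k in ans_words)


# Invented example: a model answer, a student answer, and question keywords.
reference = "Photosynthesis uses chlorophyll and light"
answer = "Plants do photosynthesis with light"
score = cosine_similarity(reference, answer)
hits = matched_keywords(["photosynthesis", "chlorophyll"], reference, answer)
print(round(score, 2), hits)
```

A grading rule could then combine the similarity score with the share of keywords matched. Any such weighting would be a design choice that the abstract does not specify.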


2021 ◽  
Vol 10 (5) ◽  
pp. 2857-2865
Author(s):  
Moanda Diana Pholo ◽  
Yskandar Hamam ◽  
Abdel Baset Khalaf ◽  
Chunling Du

Available literature reports several lymphoma cases misdiagnosed as tuberculosis, especially in countries with a heavy TB burden. This frequent misdiagnosis is due to the fact that the two diseases can present with similar symptoms. The present study therefore aims to analyse and explore TB as well as lymphoma case reports using Natural Language Processing tools and evaluate the use of machine learning to differentiate between the two diseases. As a starting point in the study, case reports were collected for each disease using web scraping. Natural language processing tools and text clustering were then used to explore the created dataset. Finally, six machine learning algorithms were trained and tested on the collected data, which contained 765 lymphoma and 546 tuberculosis case reports. Each method was evaluated using various performance metrics. The results indicated that the multi-layer perceptron model achieved the best accuracy (93.1%), recall (91.9%) and precision score (93.7%), thus outperforming other algorithms in terms of correctly classifying the different case reports.


Author(s):  
Prof. Prema Sahane

In this paper we introduce a sign language converter that works as a duplex system: it can convert text to sign language and can also convert real-time video to text. The system can be used both by people who know sign language and by those who are not familiar with it. Its main aim is to enable specially abled people to interact with others as much as possible. Our system uses basic natural language processing (NLP) and algorithms such as a CNN classifier to implement this translator. The system focuses on Indian Sign Language so that it can be used by people in our country. Finger gestures are captured by the camera, and the system uses various machine learning algorithms to automatically translate the signs into readable text; similarly, in text-to-sign conversion, the text is converted to sign language based on the data sets and various machine learning algorithms.

