Towards scaling Twitter for digital epidemiology of birth defects

Abstract: Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes (the leading cause of infant mortality) could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms (feature-engineered and deep learning-based classifiers) that automatically distinguish tweets referring to the user’s pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the “defect” class and 0.51 for the “possible defect” class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
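
As a rough illustration of the resampling and per-class scoring mentioned in the abstract, the sketch below balances a toy dataset by random over-sampling or under-sampling and computes a one-vs-rest F1-score for each class. It is not the authors' pipeline: the function names, the "mention" label for the majority class, and the 90/6/4 split are assumptions made for the example, and the abstract does not say which resampling variants were actually used.

```python
import random
from collections import Counter, defaultdict


def resample(texts, labels, strategy="over", seed=0):
    """Balance classes by random over- or under-sampling (hypothetical helper).

    strategy="over": duplicate minority examples up to the largest class size.
    strategy="under": drop majority examples down to the smallest class size.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Enough examples: draw `target` of them without replacement.
            chosen = rng.sample(items, target)
        else:
            # Too few: keep every original, then draw extras with replacement.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


def f1_per_class(gold, pred):
    """Return {label: F1}, computed one-vs-rest for each gold label."""
    scores = {}
    for label in set(gold):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data with a 90% majority class; the minority split is illustrative only.
labels = ["mention"] * 90 + ["defect"] * 6 + ["possible defect"] * 4
texts = [f"tweet {i}" for i in range(len(labels))]

over_texts, over_labels = resample(texts, labels, strategy="over")
under_texts, under_labels = resample(texts, labels, strategy="under")
print(Counter(over_labels))
print(Counter(under_labels))
```

Resampling of this kind would normally be applied to the training split only, so that evaluation scores still reflect the original class distribution.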

Citation Classification Prediction Implying Text Features Using Natural Language Processing and Supervised Machine Learning Algorithms

Communications in Computer and Information Science - Recent Trends in Image Processing and Pattern Recognition, 2021, pp. 540-552. DOI: 10.1007/978-981-16-0507-9_46

Author(s): Priya Porwal, Manoj H. Devare

Keyword(s): Machine Learning, Natural Language Processing, Natural Language, Language Processing, Learning Algorithms, Machine Learning Algorithms, Supervised Machine Learning, Text Features, Classification Prediction

Detection of social media platform insults using Natural language processing and comparative study of machine learning algorithms

2020 24th International Conference on System Theory, Control and Computing (ICSTCC), 2020. DOI: 10.1109/icstcc50638.2020.9259730

Author(s): Sruthi Chiramel, Doina Logofatu, Gheorghe Goldenthal

Keyword(s): Machine Learning, Social Media, Natural Language Processing, Natural Language, Comparative Study, Language Processing, Learning Algorithms, Machine Learning Algorithms, Social Media Platform, Media Platform

Text Polarity Detection using Multiple Supervised Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue, 2020, Vol 9 (3), pp. 1612-1618. DOI: 10.35940/ijitee.c8449.019320

Keyword(s): Machine Learning, Social Media, Sentiment Analysis, Large Scale, Learning Algorithms, Machine Learning Algorithms, Supervised Machine Learning, The Public, Social Media Platforms, Day By Day

Sentiment analysis classifies a review, opinion, or statement into categories, clarifying for businesses and developers the specific sentiments of customers or another group of interest. Such categorized data are critical to business development and to understanding public opinion. The need for accurate, large-scale opinion and sentiment analysis on social media platforms is growing steadily. In this paper, a number of machine learning algorithms are trained and applied to Twitter datasets, and their accuracies are determined separately for each polarity of data, indicating which algorithm performs best and which performs worst.
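
A minimal sketch of the kind of per-polarity comparison described above: given gold labels and each model's predictions, it reports accuracy separately for each polarity class. The abstract does not name the algorithms or the label set, so the model names, the "pos"/"neg"/"neu" labels, and the toy predictions are all assumptions for illustration.

```python
from collections import defaultdict


def accuracy_by_polarity(gold, pred):
    """Return {polarity: share of gold examples of that polarity predicted correctly}."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for g, p in zip(gold, pred):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical gold labels and predictions from two placeholder models.
gold = ["pos", "pos", "neg", "neg", "neu", "neu"]
predictions = {
    "model_a": ["pos", "pos", "neg", "pos", "neu", "neg"],
    "model_b": ["pos", "neg", "neg", "neg", "neu", "neu"],
}

# One accuracy table per model, keyed by polarity.
report = {name: accuracy_by_polarity(gold, pred) for name, pred in predictions.items()}
for name, scores in report.items():
    print(name, scores)
```

Breaking accuracy down by class in this way can reveal a model that looks strong overall but fails on one polarity.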

Application of Supervised Machine Learning Algorithms for Lithofacies Classification.

2019. DOI: 10.2523/19349-ms

Author(s): Subhadeep Sarkar, Chandan Majumdar

Keyword(s): Machine Learning, Learning Algorithms, Machine Learning Algorithms, Supervised Machine Learning, Lithofacies Classification

A Deep Analysis and Efficient Implementation of Supervised Machine Learning Algorithms for Enhancing The Classification Ability of System

International Journal of Computer Sciences and Engineering, 2019, Vol 7 (3), pp. 1094-1101. DOI: 10.26438/ijcse/v7i3.10941101

Author(s): Sandeep Kumar Verma, Turendar Sahu, Manjit Jaiswal

Keyword(s): Machine Learning, Learning Algorithms, Efficient Implementation, Machine Learning Algorithms, Supervised Machine Learning

A Comparative Study of Three Supervised Machine-Learning Algorithms for Classifying Carbonate Vuggy Facies in the Kansas Arbuckle Formation

Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description, 2019, Vol 60 (6), pp. 838-853. DOI: 10.30632/pjv60n6-2019a8

Author(s): Chicheng Xu, Dawn Jobe, Rui Xu, ...

Keyword(s): Machine Learning, Comparative Study, Learning Algorithms, Machine Learning Algorithms, Supervised Machine Learning

Crop price prediction using supervised machine learning algorithms

Journal of Physics Conference Series, 2021, Vol 1916 (1), pp. 012042. DOI: 10.1088/1742-6596/1916/1/012042

Author(s): Ranjani Dhanapal, A AjanRaj, S Balavinayagapragathish, J Balaji

Keyword(s): Machine Learning, Learning Algorithms, Machine Learning Algorithms, Supervised Machine Learning, Price Prediction

Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm

Applied Sciences, 2021, Vol 11 (15), pp. 6728. DOI: 10.3390/app11156728

Author(s): Muhammad Asfand Hafeez, Muhammad Rashid, Hassan Tariq, Zain Ul Abideen, Saud S. Alotaibi, ...

Keyword(s): Machine Learning, Tabu Search, Decision Tree, Decision Trees, Search Algorithm, Learning Algorithms, Performance Comparison, Machine Learning Algorithms, Supervised Machine Learning, Tabu Search Algorithm

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on optimizing the decision tree have been proposed; however, this line of work is still evolving. This paper presents a novel and robust classifier based on decision tree and tabu search algorithms. To improve performance, the proposed algorithm constructs multiple decision trees while employing a tabu search algorithm to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. To train the model, we used clinical data of COVID-19 patients to predict whether a patient is suffering. The experimental results were obtained using our proposed classifier, based on the scikit-learn library in Python. An extensive performance comparison with conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the execution time of 55.6 ms, and the area under the receiver operating characteristic curve (AUROC) of 0.95 for the proposed method show that the proposed classifier is suitable for large datasets.
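
For readers unfamiliar with tabu search, the following is a generic sketch of the technique on a toy one-dimensional landscape. It does not reproduce the paper's classifier and involves no decision trees; the function name, the landscape values, and the neighborhood definition are assumptions chosen only to show how a tabu list lets the search move past a local minimum.

```python
from collections import deque


def tabu_search(objective, start, neighbors, iterations=50, tabu_size=5):
    """Generic tabu search: always move to the best non-tabu neighbor.

    Recently visited states are kept in a fixed-size tabu list so the search
    cannot immediately step back, which lets it climb out of local minima.
    """
    current = best = start
    tabu = deque([start], maxlen=tabu_size)
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break  # every neighbor is tabu; stop early
        current = min(candidates, key=objective)  # may be worse than before
        tabu.append(current)
        if objective(current) < objective(best):
            best = current
    return best


# Toy landscape: index 1 is a local minimum, index 5 is the global minimum.
landscape = [5, 3, 4, 6, 2, 0, 1, 7]


def cost(i):
    return landscape[i]


def adjacent(i):
    # Neighbors are the indices directly to the left and right, within bounds.
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]


best_index = tabu_search(cost, start=0, neighbors=adjacent)
print(best_index, landscape[best_index])
```

A plain greedy descent from index 0 would stop at index 1, since both of its neighbors are worse; the tabu list forces the search onward until it finds index 5.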

Prediction of social media effects on students’ academic performance using Machine Learning Algorithms (MLAs)

Journal of Computers in Education, 2021. DOI: 10.1007/s40692-021-00201-z

Author(s): Isaac Kofi Nti, Samuel Akyeramfo-Sam, Bright Bediako-Kyeremeh, Sylvester Agyemang

Keyword(s): Machine Learning, Social Media, Academic Performance, Media Effects, Learning Algorithms, Machine Learning Algorithms

Spatial Roadway Condition-Assessment Mapping Utilizing Smartphones and Machine Learning Algorithms

Transportation Research Record Journal of the Transportation Research Board, 2021, pp. 036119812110061. DOI: 10.1177/03611981211006105

Author(s): Charalambos Kyriakou, Symeon E. Christodoulou, Loukas Dimitriou

Keyword(s): Machine Learning, Learning Algorithms, Condition Assessment, Machine Learning Algorithms, Supervised Machine Learning, Geographical Information, Related Field, Pavement Surface, Automated Method, Smartphone Technology

The paper presents a data-driven framework and related field studies on the use of supervised machine learning and smartphone technology for the spatial condition-assessment mapping of roadway pavement surface anomalies. The study explores the use of data collected by sensors from a smartphone and a vehicle’s onboard diagnostic device while the vehicle is in motion to detect roadway anomalies. The research proposes a low-cost, automated method for obtaining up-to-date information on roadway pavement surface anomalies using smartphone technology, artificial neural networks, robust regression analysis, and supervised machine learning algorithms for multiclass problems. The technology for the suggested system is readily available and accurate, and it can be used in pavement monitoring systems and geographical information system applications. Further, the proposed methodology has been field-tested, exhibiting accuracy levels higher than 90%, and it is currently being expanded to include larger datasets and a larger number of common roadway pavement surface defect types. The proposed system is of practical importance because it provides continuous information on roadway pavement surface conditions, which can be valuable for pavement engineers and public safety.
