Content Feature Extraction in the Context of Social Media Behavior

Author(s):  
Shai Neumann ◽  
Charles Li ◽  
Chloe Lo ◽  
Corinne Lee ◽  
Shakeel Rajwani ◽  
...  
2021 ◽  
Vol 29 (1) ◽  
pp. 160-171
Author(s):  
Lin-miao HU ◽  
Yong ZHANG ◽  
Chen-feng LOU ◽  
...  

Writer inference systems identify and verify the authorship of handwritten documents. Each writer has a distinctive style that uniquely identifies them, so authorship identification is applied in forensic document analysis. Handwriting is also considered a biometric feature of a person and can therefore help security systems identify an individual. Online writer recognition is applied to detecting identity theft, in which an intruder compromises a person’s social media account and sends messages to others as if they were the legitimate account holder. By discriminating between the writing characteristics of the original user and the intruder, the masquerader can be identified. This survey discusses various works that contribute to feature extraction and writer prediction.


Author(s):  
Monte Hancock ◽  
Charles Li ◽  
Shakeel Rajwani ◽  
Payton Brown ◽  
Olivia Hancock ◽  
...  

Author(s):  
Junanda Patihullah ◽  
Edi Winarko

Social media has changed how people express their thoughts and moods. As the activity of social media users increases, hate speech can spread quickly and widely, so detecting it manually is not feasible. The gated recurrent unit (GRU) is a deep learning method that can learn how information from earlier time steps relates to the current one. In this research, word2vec is used for feature extraction because it can learn semantic relationships between words. GRU performance is compared with other supervised learning methods: support vector machine, naive Bayes, decision tree, and logistic regression. The results show that the best accuracy, 92.96%, is obtained by the GRU model with word2vec feature extraction. For the comparison supervised methods, word2vec features perform worse than tf and tf-idf features.
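To make the recurrence described in this abstract concrete, the following is a minimal sketch of a single GRU update in plain Python. It is not the authors' implementation: the weights are random, the three-dimensional vectors in `toy_vectors` are invented stand-ins for trained word2vec embeddings, and all function names are hypothetical. It uses the update convention h' = (1 - z) * h + z * candidate; some libraries swap the roles of z and 1 - z.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gate(W, U, b, x, h, activation):
    """Compute activation(W x + U h + b) element-wise."""
    return [activation(a + c + d) for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update: x is the current input, h the previous hidden state."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)  # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = gate(p["Wh"], p["Uh"], p["bh"], x, reset_h, math.tanh)
    # Blend the previous state with the candidate state.
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def init_params(input_dim, hidden_dim, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "zrh":
        p["W" + g] = mat(hidden_dim, input_dim)
        p["U" + g] = mat(hidden_dim, hidden_dim)
        p["b" + g] = [0.0] * hidden_dim
    return p


def encode_sequence(vectors, p, hidden_dim):
    """Run the GRU over a sequence and return the final hidden state."""
    h = [0.0] * hidden_dim
    for x in vectors:
        h = gru_step(x, h, p)
    return h


# Invented 3-dimensional stand-ins for word2vec embeddings.
toy_vectors = {
    "i": [0.1, 0.3, -0.2],
    "like": [0.4, -0.1, 0.2],
    "this": [-0.3, 0.2, 0.5],
}
params = init_params(input_dim=3, hidden_dim=4)
sentence = [toy_vectors[w] for w in ["i", "like", "this"]]
final_state = encode_sequence(sentence, params, hidden_dim=4)
print(final_state)
```

In a classifier of the kind the abstract describes, the final hidden state would typically be passed to a trained output layer; that layer and the training loop are omitted here.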


2020 ◽  
Vol 8 (2) ◽  
pp. 169
Author(s):  
Afiyati Afiyati ◽  
Azhari Azhari ◽  
Anny Kartika Sari ◽  
Abdul Karim

Sarcasm recognition and detection have been made easier by knowledge from various domains, including computer science, social science, psychology, and mathematics. This article explains trends in sentiment analysis, especially sarcasm detection, over the last ten years and their future direction. We review journal articles with the keyword “sarcasm” in the title published from 2008 to 2018. The articles were classified by the most frequently discussed topics, including the dataset, pre-processing, annotations, approaches, features, context, and methods used. The significant increase in the number of articles on “sarcasm” in recent years indicates that research in this area still has enormous opportunities. Research on sarcasm is also of interest because only a few researchers offer solutions for unstructured language. Some hybrid approaches combine classification and feature extraction to identify sarcastic sentences using deep learning models. This article further explains the most widely used algorithms for sarcasm detection in social media. Finally, the article shows that a critical direction for future research is the use of datasets in various languages that address the unstructured-data problem; together with contextual information, such datasets should make sarcasm detection more effective and improve on existing performance.


10.2196/24889 ◽  
2021 ◽  
Vol 23 (1) ◽  
pp. e24889
Author(s):  
Shi Chen ◽  
Lina Zhou ◽  
Yunya Song ◽  
Qian Xu ◽  
Ping Wang ◽  
...  

Background: Social media plays a critical role in health communications, especially during global health emergencies such as the current COVID-19 pandemic. However, there is a lack of a universal analytical framework to extract, quantify, and compare content features in public discourse of emerging health issues on different social media platforms across a broad sociocultural spectrum.
Objective: We aimed to develop a novel and universal content feature extraction and analytical framework and contrast how content features differ with sociocultural background in discussions of the emerging COVID-19 global health crisis on major social media platforms.
Methods: We sampled the 1000 most shared viral Twitter and Sina Weibo posts regarding COVID-19, developed a comprehensive coding scheme to identify 77 potential features across six major categories (eg, clinical and epidemiological, countermeasures, politics and policy, responses), quantified feature values (0 or 1, indicating whether or not the content feature is mentioned in the post) in each viral post across social media platforms, and performed subsequent comparative analyses. Machine learning dimension reduction and clustering analysis were then applied to harness the power of social media data and provide more unbiased characterization of web-based health communications.
Results: There were substantially different distributions, prevalence, and associations of content features in public discourse about the COVID-19 pandemic on the two social media platforms. Weibo users were more likely to focus on the disease itself and health aspects, while Twitter users engaged more about policy, politics, and other societal issues.
Conclusions: We extracted a rich set of content features from social media data to accurately characterize public discourse related to COVID-19 in different sociocultural backgrounds. In addition, this universal framework can be adopted to analyze social media discussions of other emerging health issues beyond the COVID-19 pandemic.
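The binary coding step in the Methods (each feature scored 0 or 1 per post, then compared across platforms) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the three feature names are a hypothetical subset of the 77 features, the four posts are invented, and `encode_post` and `prevalence` are hypothetical helper names. The dimension reduction and clustering steps are not shown.

```python
# Hypothetical subset of the coding scheme; the study used 77 features.
FEATURES = ["clinical", "countermeasures", "politics_policy"]

# Invented example posts, each annotated with the features a coder found in it.
posts = [
    {"platform": "twitter", "features": {"politics_policy"}},
    {"platform": "twitter", "features": {"politics_policy", "countermeasures"}},
    {"platform": "weibo", "features": {"clinical"}},
    {"platform": "weibo", "features": {"clinical", "countermeasures"}},
]


def encode_post(post):
    """Return a 0/1 vector: 1 if the feature is mentioned in the post."""
    return [1 if name in post["features"] else 0 for name in FEATURES]


def prevalence(all_posts, platform):
    """Fraction of a platform's posts that mention each feature."""
    rows = [encode_post(p) for p in all_posts if p["platform"] == platform]
    if not rows:
        raise ValueError(f"no posts for platform {platform!r}")
    return {
        name: sum(row[i] for row in rows) / len(rows)
        for i, name in enumerate(FEATURES)
    }


for platform in ("twitter", "weibo"):
    print(platform, prevalence(posts, platform))
```

The resulting 0/1 matrix (one row per post, one column per feature) is the kind of input that dimension reduction and clustering methods would then operate on.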

