Rumor Detection on Social Media with Graph Structured Adversarial Learning

Author(s):  
Xiaoyu Yang ◽  
Yuefei Lyu ◽  
Tian Tian ◽  
Yifei Liu ◽  
Yudong Liu ◽  
...  

The wide spread of rumors on social media has had tremendous effects in both the online and offline worlds. In addition to text information, recent detection methods have begun to exploit the graph structure of the propagation network. However, without a rigorous design, rumors may evade such graph models through various camouflage strategies that perturb the structured data. Our focus in this work is to develop a robust graph-based detector that identifies rumors on social media from an adversarial perspective. We first build a heterogeneous information network to model the rich information among users, posts, and user comments for detection. We then propose a graph adversarial learning framework in which an attacker dynamically adds intentional perturbations to the graph structure to fool the detector, while the detector learns more distinctive structural features to resist such perturbations. In this way, our model is enhanced in both robustness and generalization. Experiments on real-world datasets demonstrate that our model achieves better results than state-of-the-art methods.
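To make the adversarial setup concrete, below is a minimal sketch of graph-structured adversarial training, assuming a simple two-layer GCN detector in PyTorch; all class and variable names are illustrative, not the authors' implementation, and the attacker uses a continuous gradient-based relaxation of discrete edge edits.

```python
import torch
import torch.nn.functional as F

class GCNDetector(torch.nn.Module):
    """Two-layer GCN that classifies posts from a (possibly perturbed) graph."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hid_dim)
        self.w2 = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, adj, x):
        h = F.relu(self.w1(adj @ x))   # propagate features over the graph
        return self.w2(adj @ h)

def adversarial_step(model, opt, adj, x, y, eps=0.05):
    # Attacker: perturb the adjacency matrix along the loss gradient,
    # a continuous stand-in for camouflage edits, clipped back to [0, 1].
    adj_p = adj.clone().requires_grad_(True)
    atk_loss = F.cross_entropy(model(adj_p, x), y)
    grad = torch.autograd.grad(atk_loss, adj_p)[0]
    adj_adv = (adj + eps * grad.sign()).clamp(0, 1).detach()

    # Detector: train on both the clean and the attacked graph, so it
    # learns structural features that survive the perturbations.
    opt.zero_grad()
    loss = F.cross_entropy(model(adj, x), y) + F.cross_entropy(model(adj_adv, x), y)
    loss.backward()
    opt.step()
    return loss.item()
```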

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Meicheng Guo ◽  
Zhiwei Xu ◽  
Limin Liu ◽  
Mengjie Guo ◽  
Yujun Zhang

With the extensive use of social media platforms, spam, and especially rumors, has become a serious problem on social networks. Rumors make it difficult for people to get credible information from the Internet and can cause social panic. Existing detection methods typically rely on a large amount of training data, yet the number of identified rumors is often insufficient for developing a stable detection model. To handle this problem, we propose a deep transfer model to achieve accurate rumor detection on social media platforms. In particular, an adaptive parameter tuning method is proposed to solve the negative transfer problem that arises when parameters are transferred. Experiments based on real-world datasets demonstrate that the proposed model achieves more accurate rumor detection and significantly outperforms state-of-the-art rumor detection models.
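As a rough illustration of parameter transfer, the sketch below penalizes drift from pretrained source weights with per-layer coefficients; the abstract's adaptive tuning would adjust such coefficients during training to soften negative transfer, but since the exact mechanism is not given there, the weights are kept fixed here.

```python
import torch

def transfer_penalty(model, source_state, layer_weights, default=1.0):
    """L2 pull of each parameter tensor toward its pretrained source value."""
    reg = 0.0
    for name, p in model.named_parameters():
        if name in source_state:
            # An adaptive scheme would tune these per-layer weights during
            # training (e.g., from validation feedback) so that layers which
            # transfer poorly are constrained less; fixed here for clarity.
            w = layer_weights.get(name, default)
            reg = reg + w * (p - source_state[name].detach()).pow(2).sum()
    return reg

# Usage: loss = task_loss + lam * transfer_penalty(model, src_state, weights)
```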


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0256039
Author(s):  
Jiho Choi ◽  
Taewook Ko ◽  
Younhyuk Choi ◽  
Hyungho Byun ◽  
Chong-kwon Kim

Social media has become an ideal platform for the propagation of rumors, fake news, and misinformation. Rumors on social media not only mislead online users but also affect the real world immensely. Thus, detecting rumors and preventing their spread has become an essential task. Some recent deep learning-based rumor detection methods, such as Bi-Directional Graph Convolutional Networks (Bi-GCN), represent a rumor using the completed stage of its diffusion and try to learn structural information from it. However, these methods represent rumor propagation only as a static graph, which is not optimal for capturing the dynamic behavior of rumors. In this study, we propose novel graph convolutional networks with attention mechanisms, named Dynamic GCN, for rumor detection. We first represent rumor posts together with their responsive posts as dynamic graphs, using temporal information to generate a sequence of graph snapshots. Representation learning on these snapshots with an attention mechanism captures both the structural and the temporal information of rumor spread. Experiments on three real-world datasets demonstrate the superiority of Dynamic GCN over state-of-the-art methods on the rumor detection task.
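A minimal sketch of the snapshot-plus-attention idea, assuming a shared single-layer GCN per snapshot and mean pooling over nodes; dimensions and names are illustrative rather than the authors' architecture.

```python
import torch
import torch.nn.functional as F

class DynamicGCNSketch(torch.nn.Module):
    """Shared GCN encodes each temporal snapshot; attention weights them."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.gcn = torch.nn.Linear(in_dim, hid_dim)
        self.att = torch.nn.Linear(hid_dim, 1)
        self.out = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, snapshots):
        # snapshots: list of (adj [N, N], feats [N, in_dim]) ordered by time.
        embs = []
        for adj, x in snapshots:
            h = F.relu(self.gcn(adj @ x))      # one GCN layer per snapshot
            embs.append(h.mean(dim=0))         # pool nodes -> graph embedding
        H = torch.stack(embs)                  # [T, hid_dim]
        a = torch.softmax(self.att(H), dim=0)  # attention over time steps
        return self.out((a * H).sum(dim=0))    # weighted temporal summary
```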


10.2196/17650 ◽  
2020 ◽  
Vol 8 (6) ◽  
pp. e17650
Author(s):  
Genghao Li ◽  
Bing Li ◽  
Langlin Huang ◽  
Sibing Hou

Background According to a World Health Organization report in 2017, almost one in every 20 people in China had depression. However, the clinical diagnosis of depression is usually difficult owing to slow observation, high cost, and patient resistance. Meanwhile, with the rapid emergence of social networking sites, people tend to share their daily lives and disclose inner feelings online frequently, making it possible to identify mental conditions effectively from the rich text information. Much has been achieved with English web-based corpora, but research in China on extracting language features from web-based depression signals is still at a relatively early stage. Objective The purpose of this study was to propose an effective approach for constructing a depression-domain lexicon containing language features that could help identify social media users who potentially have depression. Our study also compared detection performance with and without our lexicon. Methods We autoconstructed a depression-domain lexicon using Word2Vec, a semantic relationship graph, and the label propagation algorithm. Combined, these methods performed well on a specific corpus during construction. The lexicon was built from 111,052 Weibo microblogs posted by 1868 users who were depressed or nondepressed. For depression detection, we considered six features and used five classification methods to test detection performance. Results The experimental results showed that, in terms of F1 value, our autoconstruction method performed 1% to 6% better than baseline approaches and was more effective and more stable. When applied to detection models such as logistic regression and support vector machines, our lexicon improved performance by 2% to 9% and raised the final accuracy of potential depression detection. Conclusions Our depression-domain lexicon proved to be a meaningful input for classification algorithms, providing linguistic insights into the depressive status of test subjects. We believe that this lexicon will enhance early depression detection on social media. Future work should be carried out on a larger corpus and with more sophisticated methods.
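The construction pipeline can be approximated as follows: train word vectors, then propagate labels from a small seed lexicon to unlabeled words over a nearest-neighbor similarity graph. gensim and scikit-learn stand in for the paper's own components, and the toy corpus and seed words are illustrative only.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.semi_supervised import LabelPropagation

# Toy tokenized microblogs; the paper used 111,052 Weibo posts.
corpus = [["sad", "tired", "alone"], ["hopeless", "tired", "insomnia"],
          ["happy", "sunshine", "friends"], ["joy", "friends", "party"]] * 50

w2v = Word2Vec(corpus, vector_size=32, min_count=1, seed=0)
vocab = list(w2v.wv.index_to_key)
X = np.stack([w2v.wv[w] for w in vocab])

# Seed labels: 1 = depression-related, 0 = not, -1 = unknown (to be inferred).
seed_pos, seed_neg = {"sad", "hopeless"}, {"happy", "joy"}
y = np.array([1 if w in seed_pos else 0 if w in seed_neg else -1 for w in vocab])

# Label propagation over a kNN graph in embedding space expands the lexicon.
lp = LabelPropagation(kernel="knn", n_neighbors=3).fit(X, y)
lexicon = [w for w, lab in zip(vocab, lp.transduction_) if lab == 1]
print(lexicon)
```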


Symmetry ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1806
Author(s):  
Zunwang Ke ◽  
Zhe Li ◽  
Chenzhi Zhou ◽  
Jiabao Sheng ◽  
Wushour Silamu ◽  
...  

Social media has had a revolutionary impact because it provides an ideal platform for sharing information; however, it also enables the publication and spread of rumors. Existing rumor detection methods rely on cues from user-generated content, user profiles, or wide propagation structures alone, ignoring the organic combination of propagation structure and text semantics. To this end, we propose KZWANG, a rumor detection framework that provides sufficient domain knowledge to classify rumors accurately, in which semantic information and a heterogeneous propagation graph are symmetrically fused. We utilize an attention mechanism to learn a semantic representation of the text and introduce a GCN to capture the global and local relationships among all the source microblogs, reposts, and users. This organic combination of text semantics and the propagation heterogeneous graph is then used to train a rumor detection classifier. Experiments on the Sina Weibo, Twitter15, and Twitter16 rumor detection datasets demonstrate the proposed model's superiority over baseline methods. We also conduct an ablation study to understand the relative contributions of the various components of the proposed method.
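The fusion step might look roughly like the sketch below: a self-attention summary of the source text concatenated with a GCN embedding of the propagation graph. The module names, dimensions, and single-layer depth are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F

class FusionClassifier(torch.nn.Module):
    """Fuses an attention-based text summary with a GCN graph embedding."""
    def __init__(self, txt_dim, g_dim, hid, n_classes):
        super().__init__()
        # txt_dim must be divisible by num_heads.
        self.attn = torch.nn.MultiheadAttention(txt_dim, num_heads=4,
                                                batch_first=True)
        self.gcn = torch.nn.Linear(g_dim, hid)
        self.cls = torch.nn.Linear(txt_dim + hid, n_classes)

    def forward(self, tokens, adj, node_feats, root_idx):
        # tokens: [B, L, txt_dim] word embeddings of the source microblogs;
        # adj/node_feats: propagation graph over posts, reposts, and users;
        # root_idx: [B] node indices of the source posts.
        t, _ = self.attn(tokens, tokens, tokens)    # self-attention over words
        t = t.mean(dim=1)                           # [B, txt_dim] text summary
        g = F.relu(self.gcn(adj @ node_feats))      # one GCN propagation layer
        g = g[root_idx]                             # [B, hid] source-post states
        return self.cls(torch.cat([t, g], dim=-1))  # fuse by concatenation
```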


Author(s):  
Zoleikha Jahanbakhsh-Nagadeh ◽  
Mohammad-Reza Feizi-Derakhshi ◽  
Arash Sharifi

The development of social media has transformed social communication. Despite its positive applications in social interaction and news spread, social media also provides an ideal platform for spreading rumors, which can endanger the security of society in normal or critical situations. It is therefore important to detect and verify rumors in the early stage of their spread. Many research works have focused on social attributes of the network to solve the problems of rumor detection and verification, while less attention has been paid to content features. Moreover, the social and structural features of rumors develop over time and are not available in the early stage of a rumor's spread. This study therefore presents a content-based model for the early verification of Persian rumors on Twitter and Telegram. The proposed model demonstrates the important role of content in spreading rumors and generates a better integrated representation for each source rumor document by fusing its semantic, pragmatic, and syntactic information. First, contextual word embeddings of the source rumor are generated by a hybrid model based on ParsBERT and parallel CapsNets. Then, pragmatic and syntactic features of the rumor are extracted and concatenated with the embeddings to capture rich information for rumor verification. Experimental results on real-world datasets demonstrate that the proposed model significantly outperforms state-of-the-art models on the early rumor verification task, improving classifier performance by 2% to 11% on Twitter and by 5% to 23% on Telegram. These results validate the model's effectiveness when limited content information is available.
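A simplified sketch of the representation fusion: mean-pooled contextual embeddings from a ParsBERT checkpoint concatenated with handcrafted pragmatic/syntactic cues. The parallel CapsNet stage is omitted for brevity, the checkpoint name is one publicly available ParsBERT variant, and the toy feature set is illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# A public ParsBERT checkpoint; the paper's exact variant may differ.
tok = AutoTokenizer.from_pretrained("HooshvareLab/bert-fa-base-uncased")
bert = AutoModel.from_pretrained("HooshvareLab/bert-fa-base-uncased")

def embed(text):
    ids = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = bert(**ids).last_hidden_state     # [1, L, 768]
    return out.mean(dim=1).squeeze(0)           # mean-pooled contextual embedding

def handcrafted(text):
    # Toy pragmatic/syntactic cues: questions, exclamations, length.
    return torch.tensor([float(text.count("?")), float(text.count("!")),
                         float(len(text.split()))])

def represent(text):
    # Fused representation fed to a downstream verification classifier.
    return torch.cat([embed(text), handcrafted(text)])
```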


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mingxi Cheng ◽  
Yizhi Li ◽  
Shahin Nazarian ◽  
Paul Bogdan

Social media have emerged as increasingly popular means and environments for information gathering and propagation. This vigorous growth of social media has contributed not only to a pandemic (fast-spreading and far-reaching) of rumors and misinformation but also to an urgent need for text-based rumor detection strategies. To speed up the detection of misinformation, traditional rumor detection methods based on hand-crafted feature selection need to be replaced by automatic artificial intelligence (AI) approaches, and AI decision-making systems need to provide explanations to assure users of their trustworthiness. Inspired by the thriving development of generative adversarial networks (GANs) for text applications, we propose a GAN-based layered model for rumor detection with explanations. To demonstrate the universality of the proposed approach, we also show its benefits on a gene classification with mutation detection case study, which, like rumor detection, can be formulated as a text-based classification problem. Unlike fake news detection, which needs a previously collected verified news database, our model provides explanations in rumor detection based on tweet-level texts only, without referring to a verified news database. The layered structure of both the generative and discriminative models contributes to the outstanding performance: the layered generators produce rumors by intelligently inserting controversial information into non-rumors, forcing the layered discriminators to detect detailed glitches and deduce exactly which parts of a sentence are problematic. On average, in the rumor detection task, our proposed model outperforms state-of-the-art baselines on the PHEME dataset by 26.85% in terms of macro-F1. The excellent performance of our model on textual sequences is also demonstrated by the gene mutation case study, on which it achieves a 72.69% macro-F1 score.
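Stripped to its core, the generator/discriminator interplay could be sketched as below, working in sentence-embedding space rather than on tokens; the single-layer modules stand in for the paper's layered stacks, and all names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

class Generator(torch.nn.Module):
    """Perturbs non-rumor embeddings to mimic inserted controversial content."""
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(torch.nn.Linear(dim, dim), torch.nn.Tanh())

    def forward(self, x):
        return x + self.net(x)  # residual edit of the input representation

class Discriminator(torch.nn.Module):
    """Scores a sentence embedding: high logit = rumor-like."""
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Linear(dim, 1)

    def forward(self, x):
        return self.net(x).squeeze(-1)

def train_round(G, D, g_opt, d_opt, nonrumors, rumors):
    bce = F.binary_cross_entropy_with_logits
    # Discriminator: flag real rumors and generated fakes, pass non-rumors.
    fake = G(nonrumors).detach()
    d_loss = (bce(D(rumors), torch.ones(len(rumors)))
              + bce(D(fake), torch.ones(len(fake)))
              + bce(D(nonrumors), torch.zeros(len(nonrumors))))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator: edit non-rumors so the discriminator reads them as clean.
    g_loss = bce(D(G(nonrumors)), torch.zeros(len(nonrumors)))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```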


2021 ◽  
pp. 146144482110348
Author(s):  
Kaiping Chen ◽  
June Jeon ◽  
Yanxi Zhou

Diversity in knowledge production is a core challenge facing science communication. Despite extensive work showing how diversity has been undermined in science communication, little is known about the extent to which social media augments or hinders diversity in science communication. This article addresses this gap by examining the profile and network diversities of knowledge producers on a popular social media platform, YouTube. We reveal a pattern of juxtaposed inclusiveness and segregation on this digital platform, which we define as "segregated inclusion." We find that diverse profiles are present in digital knowledge production; however, the network among these knowledge producers reveals a rich-get-richer effect. At the intersection of profile and network diversities, we find a decrease in overall profile diversity as we move toward the center of the core producers. This segregated inclusion phenomenon raises questions about how inequalities in science communication are replicated and amplified on digital platforms.


2020 ◽  
Vol 54 (1) ◽  
pp. 1-2
Author(s):  
Shubhanshu Mishra

Information extraction (IE) aims at extracting structured data from unstructured or semi-structured data. The thesis starts by identifying social media data and scholarly communication data as special cases of digital social trace data (DSTD). This identification allows us to utilize the graph structure of the data (e.g., user connected to a tweet, author connected to a paper, author connected to authors) for developing new information extraction tasks. The thesis focuses on information extraction from DSTD, first using only the text data from tweets and scholarly paper abstracts, and then using the full graph structure of Twitter and scholarly communications datasets. The thesis makes three major contributions. First, new IE tasks based on the DSTD representation of the data are introduced. For scholarly communication data, methods are developed to identify article- and author-level novelty [Mishra and Torvik, 2016] and expertise, and interfaces for examining the extracted information are introduced. A social communication temporal graph (SCTG) is introduced for comparing different communication data, such as tweets tagged with sentiment, tweets about a search query, and Facebook group posts. For social media, new text classification categories are introduced with the aim of identifying enthusiastic and supportive users via their tweets. Additionally, the correlation between sentiment classes and Twitter metadata in public corpora is analyzed, leading to a better model for sentiment classification [Mishra and Diesner, 2018]. Second, methods are introduced for extracting information from social media and scholarly data. For scholarly data, a semi-automatic method relying on the Wikipedia category tree is introduced for constructing a large-scale taxonomy of computer science concepts; the taxonomy is used to identify key computer science phrases in scholarly papers and to track their evolution over time. Similarly, for social media data, machine learning models based on human-in-the-loop learning [Mishra et al., 2015], semi-supervised learning [Mishra and Diesner, 2016], and multi-task learning [Mishra, 2019] are introduced for identifying sentiment, named entities, part-of-speech tags, phrase chunks, and super-sense tags. These models are developed with a focus on leveraging all available data, and the multi-task models achieve competitive performance against other methods for most tasks while reducing inference-time computational costs. Finally, the thesis has resulted in multiple open-source tools and public datasets (see URL below) that can be utilized by the research community. The thesis aims to act as a bridge between the research questions and techniques used for DSTD in different domains, and the methods and tools presented here can help advance work in social media and scholarly data analysis.
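The multi-task setup described above can be sketched as a shared encoder with per-task heads, so that every labeled corpus improves the shared representation; the head set and dimensions here are illustrative assumptions, not the thesis's exact models.

```python
import torch

class MultiTaskTagger(torch.nn.Module):
    """Shared BiLSTM encoder feeding sentiment, NER, and POS heads."""
    def __init__(self, vocab_size, emb=128, hid=256,
                 n_sentiment=3, n_ner=9, n_pos=17):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, emb)
        self.encoder = torch.nn.LSTM(emb, hid // 2, bidirectional=True,
                                     batch_first=True)
        self.sentiment = torch.nn.Linear(hid, n_sentiment)  # sequence-level head
        self.ner = torch.nn.Linear(hid, n_ner)              # token-level head
        self.pos = torch.nn.Linear(hid, n_pos)              # token-level head

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))   # [B, L, hid] shared states
        return {"sentiment": self.sentiment(h.mean(dim=1)),
                "ner": self.ner(h),
                "pos": self.pos(h)}
```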


Author(s):  
Shengsheng Qian ◽  
Jun Hu ◽  
Quan Fang ◽  
Changsheng Xu

In this article, we focus on the fake news detection task and aim to automatically identify fake news among the vast number of social media posts. To date, many approaches have been proposed to detect fake news, including traditional learning methods and deep learning-based models. However, three challenges remain: (i) how to represent social media posts effectively, since post content is varied and highly complicated; (ii) how to design a data-driven method that is flexible enough to handle samples from different contexts and news backgrounds; and (iii) how to fully utilize the additional auxiliary information (background knowledge and multi-modal information) of posts for better representation learning. To tackle these challenges, we propose novel Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks (KMAGCN) that capture semantic representations by jointly modeling textual information, knowledge concepts, and visual information in a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principle for effective feature learning. Compared with existing methods, KMAGCN addresses the challenges from three aspects: (1) it models posts as graphs to capture non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts, and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets, and the superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.
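The "adaptive" graph convolution can be illustrated as a layer that blends a fixed post graph with an adjacency inferred from node-feature similarity, so the effective structure varies per sample; this is a sketch of the general idea, not KMAGCN's exact formulation.

```python
import torch
import torch.nn.functional as F

class AdaptiveGCNLayer(torch.nn.Module):
    """Blends a fixed post graph with a similarity graph learned from features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = torch.nn.Linear(in_dim, out_dim)
        self.mix_logit = torch.nn.Parameter(torch.zeros(1))  # learned blend ratio

    def forward(self, adj, x):
        # adj: [N, N] fixed graph over units of a post; x: [N, in_dim] features.
        sim = torch.softmax(x @ x.transpose(-1, -2), dim=-1)  # data-driven graph
        m = torch.sigmoid(self.mix_logit)                     # keep ratio in (0, 1)
        a = m * adj + (1 - m) * sim                           # adaptive adjacency
        return F.relu(self.proj(a @ x))
```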

