SenseLens: An Efficient Social Signal Conditioning System for True Event Detection

2022 ◽  
Vol 18 (2) ◽  
pp. 1-27
Author(s):  
Hang Cui ◽  
Tarek Abdelzaher

This article narrows the gap between physical sensing systems that measure physical signals and social sensing systems that measure information signals by (i) defining a novel algorithm for extracting information signals (building on results from text embedding) and (ii) showing that it increases the accuracy of truth discovery, that is, the separation of true information from false or manipulated information. The work is applied to separating true and false facts on social media, such as Twitter and Reddit, where users post predominantly short microblogs. The new algorithm decides how to aggregate the signal across words in a microblog for the purpose of clustering microblogs in the latent information signal space, where it is easier to separate true and false posts. Although previous literature has extensively studied the problem of short text embedding/representation, this article improves on previous work in three important respects: (1) Our work constitutes unsupervised truth discovery, requiring no labeled input or prior training. (2) We propose a new distance metric for efficient short text similarity estimation, which we call Semantic Subset Matching, that improves our ability to meaningfully cluster microblog posts in the latent information signal space. (3) We introduce an iterative framework that jointly improves microblog clustering and truth discovery. The evaluation shows that the approach improves the accuracy of truth discovery by 6.3%, 2.5%, and 3.8% (constituting a 38.9%, 14.2%, and 18.7% reduction in error, respectively) on three real Twitter data traces.
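The abstract does not define Semantic Subset Matching, so the following is only an illustrative sketch of one subset-style distance between short texts, not the authors' metric. It assumes each word has an embedding vector, matches every word of the shorter post to its most similar word in the longer post, and averages those best-match similarities. The function names and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def subset_match_distance(words_a, words_b, vectors):
    """Hypothetical distance: 1 minus the mean best-match similarity
    from the shorter post's words into the longer post's words."""
    a = [vectors[w] for w in words_a if w in vectors]
    b = [vectors[w] for w in words_b if w in vectors]
    if not a or not b:
        return 1.0  # nothing to compare: treat as maximally distant
    short, longer = (a, b) if len(a) <= len(b) else (b, a)
    best = [max(cosine(s, t) for t in longer) for s in short]
    return 1.0 - sum(best) / len(best)

# Toy 3-dimensional embeddings, invented for illustration only.
TOY_VECTORS = {
    "bridge": (1.0, 0.0, 0.0),
    "collapsed": (0.0, 1.0, 0.0),
    "fell": (0.1, 0.9, 0.0),
    "downtown": (0.0, 0.0, 1.0),
    "concert": (0.0, 0.2, 0.8),
}

near = subset_match_distance(["bridge", "collapsed"],
                             ["bridge", "fell", "downtown"], TOY_VECTORS)
far = subset_match_distance(["bridge", "collapsed"],
                            ["concert", "downtown"], TOY_VECTORS)
```

In this toy setup `near` is much smaller than `far`, because both words of the first post have close counterparts in the second. A distance of this kind can be passed to any clustering routine that accepts pairwise distances.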

2020 ◽  
Vol 16 (2) ◽  
pp. 1045-1057 ◽  
Author(s):  
Yang Du ◽  
Yu-E Sun ◽  
He Huang ◽  
Liusheng Huang ◽  
Hongli Xu ◽  
...  

2019 ◽  
Vol 15 (1) ◽  
pp. 1-32 ◽  
Author(s):  
Chenglin Miao ◽  
Wenjun Jiang ◽  
Lu Su ◽  
Yaliang Li ◽  
Suxin Guo ◽  
...  

2013 ◽  
Vol 756-759 ◽  
pp. 1309-1313 ◽  
Author(s):  
Bing Jie Sun ◽  
Zhi Chao Liang ◽  
Qing Tian Zeng ◽  
Hua Zhao ◽  
Wei Jian Ni ◽  
...  

Text similarity computing is the core problem that a question-answering system needs to solve. It is mainly used to retrieve existing questions in the database that are similar to the user's question. Because domain keywords have low recall in domain text similarity computing based on a traditional semantic dictionary, this paper proposes a short text similarity computing method for the agriculture domain based on the extended version of Tongyicicilin, referred to as CiLin. The paper proposes considering both similarity and correlation when calculating the final similarity between words. The experimental results show that the proposed method resolves the problem of low recall of domain words in a traditional semantic dictionary and greatly improves similarity calculation for highly related keywords.
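The abstract does not say how similarity and correlation are combined, so the sketch below assumes a simple weighted sum. The weight `alpha`, the lookup tables, and both function names are hypothetical; a real system would take the similarity scores from CiLin and the correlation scores from corpus statistics.

```python
def word_score(w1, w2, similarity, correlation, alpha=0.7):
    """Assumed combination: alpha * similarity + (1 - alpha) * correlation."""
    if w1 == w2:
        return 1.0
    key = frozenset((w1, w2))  # unordered pair, so lookups are symmetric
    return alpha * similarity.get(key, 0.0) + (1 - alpha) * correlation.get(key, 0.0)

def text_similarity(q1, q2, similarity, correlation, alpha=0.7):
    """Average, in both directions, of each word's best score against the other text."""
    if not q1 or not q2:
        return 0.0

    def directed(src, dst):
        return sum(
            max(word_score(s, d, similarity, correlation, alpha) for d in dst)
            for s in src
        ) / len(src)

    return 0.5 * (directed(q1, q2) + directed(q2, q1))

# Toy scores, invented for illustration only.
SIMILARITY = {frozenset(("wheat", "grain")): 0.8}
CORRELATION = {
    frozenset(("wheat", "grain")): 0.6,
    frozenset(("wheat", "rust")): 0.9,  # related in the domain, but not synonyms
}

score = text_similarity(["wheat", "rust"], ["grain", "rust"], SIMILARITY, CORRELATION)
```

With these toy tables, "wheat" and "rust" receive a nonzero score through correlation alone, which is the kind of domain keyword pair a synonym dictionary by itself would miss.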


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Taochun Wang ◽  
Chengmei Lv ◽  
Chengtian Wang ◽  
Fulong Chen ◽  
Yonglong Luo

With the rapid development of portable mobile devices, mobile crowd sensing (MCS) systems have been widely studied. However, the sensing data provided by participants in MCS applications is often unreliable, which affects the service quality of the system; truth discovery can effectively obtain true values from the data provided by multiple users. At the same time, privacy leaks limit users' willingness to participate in MCS. To address both issues, our paper proposes a secure truth discovery scheme for data aggregation in crowd sensing systems, STDDA, which iteratively calculates user weights and true values to obtain real object data. To protect data privacy, STDDA divides users into several clusters, and the users in each cluster protect their data by adding secret random numbers to the sensed data. The cluster head node then uses the secure sum protocol to obtain the aggregation result of the sensing data and uploads it to the server, so that the server cannot obtain the sensing data or weight of any individual user, further ensuring the privacy of each user's sensing data and weight. In addition, using the truth discovery method, STDDA provides corresponding processing mechanisms for users dynamically joining and exiting, which enhances the robustness of the system. Experimental results show that STDDA achieves high accuracy, low communication overhead, and high security.
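The two ideas in this abstract, iterative weight and truth estimation and masked aggregation, can be sketched as follows. This is a generic illustration under stated assumptions, not the STDDA protocol: the weight update is a common log-ratio rule from the truth discovery literature, all users form a single cluster, and the zero-sum masks are generated in one place, whereas a real protocol would have users derive them from shared secrets. All names are hypothetical.

```python
import math
import random

def masked_sum(values, rng):
    """Sum of masked reports. The masks add up to zero, so the total is
    preserved while no single report reveals its underlying value."""
    masks = [rng.uniform(-100.0, 100.0) for _ in values[:-1]]
    masks.append(-sum(masks))  # the last mask cancels all the others
    reports = [v + m for v, m in zip(values, masks)]  # what each user uploads
    return sum(reports)

def truth_discovery(observations, iterations=10, seed=0):
    """observations[user][obj] -> value; every user reports every object.
    Returns (truths, weights) after alternating the two updates."""
    if len(observations) < 2:
        raise ValueError("need at least two users")
    rng = random.Random(seed)
    users = sorted(observations)
    objects = sorted(observations[users[0]])
    weights = {u: 1.0 for u in users}
    truths = {}
    for _ in range(iterations):
        # Truth update: weighted mean, computed only from masked sums.
        total_w = masked_sum([weights[u] for u in users], rng)
        truths = {
            o: masked_sum([weights[u] * observations[u][o] for u in users], rng) / total_w
            for o in objects
        }
        # Weight update: users far from the current truths get lower weight.
        errors = {
            u: sum((observations[u][o] - truths[o]) ** 2 for o in objects) + 1e-9
            for u in users
        }
        total_err = sum(errors.values())
        weights = {u: math.log(total_err / errors[u]) for u in users}
    return truths, weights

# Toy readings, invented for illustration: users "a" and "b" agree, "c" is off.
readings = {
    "a": {"t1": 20.0, "t2": 30.0},
    "b": {"t1": 20.2, "t2": 29.8},
    "c": {"t1": 25.0, "t2": 36.0},
}
truths, weights = truth_discovery(readings)
```

On this toy input the outlying user "c" ends up with the smallest weight, and the estimated truths settle near the values reported by "a" and "b" rather than at the plain average.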


Author(s):  
Kaiping Xue ◽  
Bin Zhu ◽  
Qingyou Yang ◽  
Na Gai ◽  
David S.L. Wei ◽  
...  

2019 ◽  
Vol 68 (4) ◽  
pp. 3854-3865 ◽  
Author(s):  
Guowen Xu ◽  
Hongwei Li ◽  
Sen Liu ◽  
Mi Wen ◽  
Rongxing Lu

Data ◽  
2018 ◽  
Vol 4 (1) ◽  
pp. 6 ◽  
Author(s):  
Sophie Jordan ◽  
Sierra Hovet ◽  
Isaac Fung ◽  
Hai Liang ◽  
King-Wa Fu ◽  
...  

Twitter is a social media platform where over 500 million people worldwide publish their ideas and discuss diverse topics, including their health conditions and public health events. Twitter has proved to be an important source of health-related information on the Internet, given the amount of information that is shared by both citizens and official sources. Twitter provides researchers with a real-time source of public health information on a global scale, and can be very important in public health research. Classifying Twitter data into topics or categories helps to better understand how users react and communicate. A literature review is presented on mining Twitter data or similar short-text datasets for public health applications. Each method is analyzed for ways to use Twitter data in public health surveillance. Papers in which Twitter content was classified according to users or tweets for better surveillance of public health were selected for review. Only papers published from 2010 to 2017 were considered. The reviewed publications are distinguished by the methods that were used to categorize the Twitter content. While comparing studies is difficult due to the number of different methods that have been used for applying Twitter and interpreting data, this state-of-the-art review demonstrates the vast potential of utilizing Twitter for public health surveillance purposes.

