SenseLens: An Efficient Social Signal Conditioning System for True Event Detection

This article narrows the gap between physical sensing systems, which measure physical signals, and social sensing systems, which measure information signals, by (i) defining a novel algorithm for extracting information signals (building on results from text embedding) and (ii) showing that it increases the accuracy of truth discovery, the separation of true information from false or manipulated information. The work is applied to separating true and false facts on social media, such as Twitter and Reddit, where users predominantly post short microblogs. The new algorithm decides how to aggregate the signal across the words in a microblog in order to cluster microblogs in the latent information signal space, where it is easier to separate true and false posts. Although the previous literature has extensively studied short text embedding/representation, this article improves on previous work in three important respects: (1) Our work constitutes unsupervised truth discovery, requiring no labeled input or prior training. (2) We propose a new distance metric for efficient short text similarity estimation, which we call Semantic Subset Matching, that improves our ability to meaningfully cluster microblog posts in the latent information signal space. (3) We introduce an iterative framework that jointly improves microblog clustering and truth discovery. The evaluation shows that the approach improves the accuracy of truth discovery by 6.3%, 2.5%, and 3.8% (constituting a 38.9%, 14.2%, and 18.7% reduction in error, respectively) on three real Twitter data traces.
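The link between the reported accuracy gains and the reported error reductions can be checked with simple arithmetic. The sketch below assumes that the accuracy improvements are absolute percentage points and that error means 100% minus accuracy. The function name `implied_baseline_error` is hypothetical, and the baseline error rates it derives are implied values under those assumptions, not figures stated in the abstract.

```python
def implied_baseline_error(accuracy_gain_pts, error_reduction_pct):
    """Baseline error rate (in percent) implied by an absolute accuracy gain
    (percentage points) and a relative error reduction (percent).

    Assumption: error = 100 - accuracy, so the gain in accuracy equals the
    drop in error, and relative reduction = gain / baseline_error.
    """
    return accuracy_gain_pts / (error_reduction_pct / 100.0)


# (accuracy gain, relative error reduction) pairs quoted in the abstract
reported = [(6.3, 38.9), (2.5, 14.2), (3.8, 18.7)]

for gain, reduction in reported:
    before = implied_baseline_error(gain, reduction)
    after = before - gain
    print(f"gain {gain} pts, reduction {reduction}%: "
          f"implied error {before:.1f}% -> {after:.1f}%")
```

Under these assumptions, each pair of numbers pins down a baseline error rate, which is a quick way to sanity-check that the two sets of percentages describe the same result.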

A Short Text Similarity Algorithm for Finding Similar Police 110 Incidents

2016 7th International Conference on Cloud Computing and Big Data (CCBD) ◽

10.1109/ccbd.2016.058 ◽

2016 ◽

Cited By ~ 2

Author(s):

Lei Duan ◽

Tongge Xu

Keyword(s):

Text Similarity ◽

Short Text ◽

Similarity Algorithm ◽

Short Text Similarity

An effective short text conceptualization based on new short text similarity

Social Network Analysis and Mining ◽

10.1007/s13278-018-0544-8 ◽

2018 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Mohammed Bekkali ◽

Abdelmonaime Lachkar

Keyword(s):

Text Similarity ◽

Short Text ◽

Short Text Similarity

Bayesian Co-Clustering Truth Discovery for Mobile Crowd Sensing Systems

IEEE Transactions on Industrial Informatics ◽

10.1109/tii.2019.2896287 ◽

2020 ◽

Vol 16 (2) ◽

pp. 1045-1057 ◽

Cited By ~ 3

Author(s):

Yang Du ◽

Yu-E Sun ◽

He Huang ◽

Liusheng Huang ◽

Hongli Xu ◽

...

Keyword(s):

Mobile Crowd Sensing ◽

Sensing Systems ◽

Crowd Sensing ◽

Truth Discovery ◽

Mobile Crowd

Privacy-Preserving Truth Discovery in Crowd Sensing Systems

ACM Transactions on Sensor Networks ◽

10.1145/3277505 ◽

2019 ◽

Vol 15 (1) ◽

pp. 1-32 ◽

Cited By ~ 10

Author(s):

Chenglin Miao ◽

Wenjun Jiang ◽

Lu Su ◽

Yaliang Li ◽

Suxin Guo ◽

...

Keyword(s):

Privacy Preserving ◽

Sensing Systems ◽

Crowd Sensing ◽

Truth Discovery

Short Text Similarity Computing Method towards Agriculture Question and Answering Systems

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.756-759.1309 ◽

2013 ◽

Vol 756-759 ◽

pp. 1309-1313 ◽

Cited By ~ 1

Author(s):

Bing Jie Sun ◽

Zhi Chao Liang ◽

Qing Tian Zeng ◽

Hua Zhao ◽

Wei Jian Ni ◽

...

Keyword(s):

Question Answering ◽

Extended Version ◽

Text Similarity ◽

Short Text ◽

Question Answering System ◽

The Core ◽

Similarity Calculation ◽

Core Issue ◽

Short Text Similarity ◽

Computing Method

Text similarity computing is the core issue that a question-answering system needs to solve. It is mainly used to retrieve from the database the existing questions that are similar to a user's question. Because domain keywords have low recall in domain text similarity computing based on a traditional semantic dictionary, this paper proposes a short text similarity computing method for the field of agriculture based on the extended version of <<Tongyicicilin>>, referred to as <<CiLin>>. The paper proposes considering both similarity and correlation when calculating the final similarity of words. The experimental results show that the proposed method resolves the problem of low recall of domain words in the traditional semantic dictionary and greatly improves the similarity calculation performance for highly relevant keywords.

A Secure Truth Discovery for Data Aggregation in Mobile Crowd Sensing

Security and Communication Networks ◽

10.1155/2021/2296386 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Taochun Wang ◽

Chengmei Lv ◽

Chengtian Wang ◽

Fulong Chen ◽

Yonglong Luo

Keyword(s):

Data Aggregation ◽

Rapid Development ◽

Mobile Crowd Sensing ◽

Sense Data ◽

Sensing Systems ◽

Crowd Sensing ◽

Truth Discovery ◽

Head Node ◽

True Values ◽

Mobile Crowd

With the rapid development of portable mobile devices, mobile crowd sensing (MCS) systems have been widely studied. However, the sensing data provided by participants in MCS applications is always unreliable, which affects the service quality of the system, and truth discovery technology can effectively obtain true values from the data provided by multiple users. At the same time, privacy leaks also restrict users’ enthusiasm for participating in MCS. Based on this, our paper proposes a secure truth discovery scheme for data aggregation in crowd sensing systems, STDDA, which iteratively calculates user weights and true values to obtain real object data. To protect the privacy of data, STDDA divides users into several clusters, and users in each cluster ensure the privacy of their data by adding secret random numbers to the perceived data. At the same time, the cluster head node uses the secure sum protocol to obtain the aggregation result of the sense data and uploads it to the server, so that the server cannot obtain the sense data and weight of individual users, further ensuring the privacy of each user’s sense data and weight. In addition, using the truth discovery method, STDDA provides corresponding processing mechanisms for users dynamically joining and exiting, which enhances the robustness of the system. Experimental results show that STDDA offers high accuracy, low communication overhead, and high security.
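The two ideas named in this abstract, iteratively estimating user weights and true values, and hiding individual readings behind random numbers that cancel in a sum, can be illustrated with a generic sketch. The code below is not STDDA: it assumes a common log-ratio weight update for truth discovery and simple pairwise additive masking for the secure sum, and the names `truth_discovery` and `mask_readings` are hypothetical.

```python
import math
import random


def truth_discovery(observations, iterations=10, eps=1e-9):
    """Generic iterative truth discovery (assumed log-ratio weight update).

    observations: dict mapping user -> list of float readings, one per object.
    Returns (estimated truths, user weights). Needs at least two users.
    """
    users = list(observations)
    if len(users) < 2:
        raise ValueError("need at least two users")
    n = len(observations[users[0]])
    # Start from the unweighted mean of all readings for each object.
    truths = [sum(observations[u][j] for u in users) / len(users)
              for j in range(n)]
    weights = {u: 1.0 for u in users}
    for _ in range(iterations):
        # Squared distance of each user's readings from the current truths.
        dist = {u: sum((observations[u][j] - truths[j]) ** 2
                       for j in range(n)) + eps
                for u in users}
        total = sum(dist.values())
        # Users closer to the current truths receive larger weights.
        weights = {u: math.log(total / dist[u]) for u in users}
        wsum = sum(weights.values())
        # Truths become the weighted average of the readings.
        truths = [sum(weights[u] * observations[u][j] for u in users) / wsum
                  for j in range(n)]
    return truths, weights


def mask_readings(readings, rng):
    """Pairwise additive masking (assumed scheme, for illustration only).

    Each pair of users shares one secret random number; one adds it and the
    other subtracts it, so masked values hide the readings but keep the sum.
    """
    users = list(readings)
    masked = dict(readings)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            r = rng.uniform(-1000.0, 1000.0)  # secret known to u and v only
            masked[u] += r
            masked[v] -= r
    return masked


obs = {"a": [10.0, 20.0], "b": [10.2, 19.8], "c": [15.0, 30.0]}
truths, weights = truth_discovery(obs)
print(truths, weights)  # user "c" disagrees with the others and is down-weighted

masked = mask_readings({"a": 10.0, "b": 10.2, "c": 15.0}, random.Random(0))
print(sum(masked.values()))  # matches the unmasked sum up to float rounding
```

In this sketch, a cluster head that only sees the masked values can still compute the cluster sum, which is the property the abstract attributes to the secure sum step.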

InPPTD: A Lightweight Incentive-based Privacy-Preserving Truth Discovery for Crowd Sensing Systems

IEEE Internet of Things Journal ◽

10.1109/jiot.2020.3029294 ◽

2020 ◽

pp. 1-1

Author(s):

Kaiping Xue ◽

Bin Zhu ◽

Qingyou Yang ◽

Na Gai ◽

David S.L. Wei ◽

...

Keyword(s):

Privacy Preserving ◽

Sensing Systems ◽

Crowd Sensing ◽

Truth Discovery

Efficient and Privacy-Preserving Truth Discovery in Mobile Crowd Sensing Systems

IEEE Transactions on Vehicular Technology ◽

10.1109/tvt.2019.2895834 ◽

2019 ◽

Vol 68 (4) ◽

pp. 3854-3865 ◽

Cited By ~ 30

Author(s):

Guowen Xu ◽

Hongwei Li ◽

Sen Liu ◽

Mi Wen ◽

Rongxing Lu

Keyword(s):

Privacy Preserving ◽

Mobile Crowd Sensing ◽

Sensing Systems ◽

Crowd Sensing ◽

Truth Discovery ◽

Mobile Crowd

Short Text Similarity Measurement Using Context from Bag of Word Pairs and Word Co-occurrence

Communications in Computer and Information Science - Data Science ◽

10.1007/978-981-15-2810-1_22 ◽

2020 ◽

pp. 221-231

Author(s):

Shuiqiao Yang ◽

Guangyan Huang ◽

Bahadorreza Ofoghi

Keyword(s):

Similarity Measurement ◽

Text Similarity ◽

Short Text ◽

Short Text Similarity

Using Twitter for Public Health Surveillance from Monitoring and Prediction to Public Response

Data ◽

10.3390/data4010006 ◽

2018 ◽

Vol 4 (1) ◽

pp. 6 ◽

Cited By ~ 16

Author(s):

Sophie Jordan ◽

Sierra Hovet ◽

Isaac Fung ◽

Hai Liang ◽

King-Wa Fu ◽

...

Keyword(s):

Public Health ◽

Public Health Surveillance ◽

Public Health Research ◽

Global Scale ◽

Health Surveillance ◽

Short Text ◽

Related Information ◽

Twitter Data ◽

Public Health Information ◽

Health Related

Twitter is a social media platform where over 500 million people worldwide publish their ideas and discuss diverse topics, including their health conditions and public health events. Twitter has proved to be an important source of health-related information on the Internet, given the amount of information that is shared by both citizens and official sources. Twitter provides researchers with a real-time source of public health information on a global scale, and can be very important in public health research. Classifying Twitter data into topics or categories is helpful to better understand how users react and communicate. A literature review is presented on mining Twitter data or similar short-text datasets for public health applications. Each method is analyzed for ways to use Twitter data in public health surveillance. Papers in which Twitter content was classified according to users or tweets for better surveillance of public health were selected for review. Only papers published between 2010 and 2017 were considered. The reviewed publications are distinguished by the methods that were used to categorize the Twitter content in different ways. While comparing studies is difficult due to the number of different methods that have been used for applying Twitter and interpreting data, this state-of-the-art review demonstrates the vast potential of utilizing Twitter for public health surveillance purposes.
