A Unified Perspective for Disinformation Detection and Truth Discovery in Social Sensing: A Survey

2023 ◽  
Vol 55 (1) ◽  
pp. 1-33
Author(s):  
Fan Xu ◽  
Victor S. Sheng ◽  
Mingwen Wang

With the proliferation of social sensing, large amounts of observations are contributed by people or devices. However, these observations contain disinformation. Disinformation can propagate across online social networks at a relatively low cost, but results in a series of major problems in our society. In this survey, we provide a comprehensive overview of disinformation and truth discovery in social sensing under a unified perspective, including basic concepts and the taxonomy of existing methodologies. Furthermore, we summarize the mechanism of disinformation from four different perspectives (i.e., text only, text with image/multi-modal, text with propagation, and fusion models). In addition, we review existing solutions based on these requirements, compare their pros and cons, and offer usage guidance based on detailed lessons learned. To facilitate future studies in this field, we summarize related publicly accessible real-world data sets and open-source code. Last but most important, we emphasize potential future research topics and challenges in this domain through a deep analysis of the most recent methods.

Sensors ◽  
2019 ◽  
Vol 19 (11) ◽  
pp. 2613 ◽  
Author(s):  
Collins Burton Mwakwata ◽  
Hassan Malik ◽  
Muhammad Mahtab Alam ◽  
Yannick Le Moullec ◽  
Sven Parand ◽  
...  

Narrowband Internet of Things (NB-IoT) is a recent cellular radio access technology based on Long-Term Evolution (LTE), introduced by the Third-Generation Partnership Project (3GPP) for Low-Power Wide-Area Networks (LPWAN). The main aim of NB-IoT is to support massive machine-type communication (mMTC) and enable low-power, low-cost, and low-data-rate communication. NB-IoT is based on the LTE design with some changes to meet the mMTC requirements. For example, in the physical (PHY) layer only single-antenna and low-order modulations are supported, and in the Medium Access Control (MAC) layer only one physical resource block is allocated for resource scheduling. The aim of this survey is to provide a comprehensive overview of the design changes brought in the NB-IoT standardization along with the detailed research developments from the perspectives of the PHY and MAC layers. The survey also includes an overview of Evolved Packet Core (EPC) changes to support the Service Capability Exposure Function (SCEF) to manage both IP and non-IP data packets through the Control Plane (CP) and User Plane (UP), and of the possible deployment scenarios of NB-IoT in future Heterogeneous Wireless Networks (HetNet). Finally, existing and emerging research challenges in this direction are presented to motivate future research activities.


Author(s):  
LOTFI BEN ROMDHANE ◽  
NADIA FADHEL ◽  
BECHIR AYEB

Data mining (DM) is an emerging discipline that aims to extract knowledge from data using several techniques. DM has turned out to be useful in business, where the data describing the customers and their transactions is on the order of terabytes. In this paper, we propose an approach for building customer models (also called profiles in the literature) from business data. Our approach has three steps. In the first step, we use fuzzy clustering to categorize customers, i.e., determine groups of customers. A key feature is that the number of groups (or clusters) is computed automatically from data using the partition entropy as a validity criterion. In the second step, we perform a dimensionality reduction that aims at keeping for each group of customers only the most informative attributes. For this, we define the information loss to quantify the information degree of an attribute. Hence, as a result of this second step, we obtain groups of customers each described by a distinct set of attributes. In the third and final step, we use backpropagation neural networks to extract useful knowledge from these groups. Experimental results on real-world data sets reveal a good performance of our approach and should stimulate future research.
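
The first step above relies on partition entropy to choose the number of clusters. The following is a minimal sketch of that idea, assuming Bezdek's classical definition of partition entropy and assuming that the cluster count with the lowest entropy is preferred; the abstract does not give the authors' exact formulation, and the function names and the `candidates` input are hypothetical.

```python
import math

def partition_entropy(memberships):
    """Bezdek's partition entropy for a fuzzy membership matrix.

    memberships: list of rows, one per data point; each row holds that
    point's membership degree in every cluster and is assumed to sum to 1.
    """
    n = len(memberships)
    total = 0.0
    for row in memberships:
        for u in row:
            if u > 0.0:  # treat 0 * log(0) as 0
                total += u * math.log(u)
    return -total / n

def pick_cluster_count(candidates):
    """Return the cluster count whose partition has the lowest entropy.

    candidates: dict mapping a cluster count to the membership matrix
    produced by some fuzzy clustering run (hypothetical input).
    """
    return min(candidates, key=lambda c: partition_entropy(candidates[c]))
```

A crisp partition (every membership 0 or 1) has entropy zero, while a maximally fuzzy one with c clusters has entropy log(c), so lower values indicate better-separated groups under this assumption.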


Author(s):  
Lincy Mathews ◽  
Seetha Hari

A very challenging issue in real-world data is that in many domains, such as medicine, finance, marketing, web, telecommunication, and management, the distribution of data among classes is inherently imbalanced. A widely recognized problem is that traditional classifier algorithms assume a balanced distribution among the classes. Data imbalance is evident when the number of instances representing the class of concern is much smaller than that of the other classes. Hence, classifiers tend to be biased towards the well-represented class. This leads to a higher misclassification rate for the less-represented class. Hence, there is a need for efficient learners to classify imbalanced data. This chapter aims to address the need, challenges, existing methods, and evaluation metrics identified when learning from imbalanced data sets. Future research challenges and directions are highlighted.
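
The bias towards the well-represented class can be illustrated with a small sketch. The data below is hypothetical (95 majority and 5 minority instances), and the helper functions are illustrative rather than taken from the chapter; the point is only that a classifier which always predicts the majority class scores high accuracy while missing every minority instance.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of instances of class `positive` that were predicted as such."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    actual = sum(1 for t in y_true if t == positive)
    return true_pos / actual if actual else 0.0

# Hypothetical data: 95 majority-class (0) and 5 minority-class (1) instances.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

majority_accuracy = accuracy(y_true, y_pred)      # 0.95
minority_recall = recall(y_true, y_pred, 1)       # 0.0
```

This is why evaluation on imbalanced data usually looks at per-class metrics such as recall rather than overall accuracy alone.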


2019 ◽  
Vol 8 (3) ◽  
pp. 7071-7081

Current-generation real-world data sets processed through machine learning are imbalanced by nature. This imbalance presents researchers with a challenging prediction scenario for both machine learning and data mining algorithms. Past research studies show that most imbalanced data sets consist of major and minor classes, with the major class dominating the minor class. Several standard and hybrid prediction algorithms have been proposed in various application domains, but most of the real-time data sets analyzed in these studies are imbalanced by nature, which affects the accuracy of the prediction. This paper presents a systematic survey of past research studies to analyze intrinsic data characteristics and the techniques used for handling class-imbalanced data. In addition, this study reveals the research gaps, trends, and patterns in existing studies and briefly discusses future research directions.


2021 ◽  
Vol 145 (9) ◽  
pp. 1095-1109
Author(s):  
Kyle Rehder ◽  
Kathryn C. Adair ◽  
J. Bryan Sexton

Context.— Problems with health care worker (HCW) well-being have become a leading concern in medicine given their severity and robust links to outcomes like medical error, mortality, and turnover. Objective.— To describe the state of the science regarding HCW well-being, including how it is measured, what outcomes it predicts, and what institutional and individual interventions appear to improve it. Data Sources.— Peer-reviewed articles as well as multiple large data sets collected within our own research team are used to describe the nature of burnout, associations with institutional resources, and individual tools to improve well-being. Conclusions.— Rates of HCW burnout are alarmingly high, placing the health and safety of patients and HCWs at risk. To help address the urgent need to help HCWs, we summarize some of the most promising early interventions, and point toward future research that uses standardized metrics to evaluate interventions (with a focus on low-cost institutional and personal interventions).


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Xiaodan Xu ◽  
Huawen Liu ◽  
Minghai Yao

Anomaly analysis is of great interest to diverse fields, including data mining and machine learning, and plays a critical role in a wide range of applications, such as medical health, credit card fraud, and intrusion detection. Recently, a significant number of anomaly detection methods of various types have emerged. This paper intends to provide a comprehensive overview of the existing work on anomaly detection, especially for data with high dimensionality and mixed types, where identifying anomalous patterns or behaviours is a nontrivial task. Specifically, we first present recent advances in anomaly detection, discussing the pros and cons of the detection methods. Then we conduct extensive experiments on public datasets to evaluate several typical and popular anomaly detection methods. The purpose of this paper is to offer a better understanding of the state-of-the-art techniques of anomaly detection for practitioners. Finally, we conclude by providing some directions for future research.


Author(s):  
K Sobha Rani

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm, SVD++, which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach, TrustSVD, achieves better accuracy than ten other counterparts and can better handle the issues concerned.


Author(s):  
Mukhil Azhagan M. S ◽  
Dhwani Mehta ◽  
Hangwei Lu ◽  
Sudarshan Agrawal ◽  
Mark Tehranipoor ◽  
...  

Abstract Globalization and the complexity of the PCB supply chain have made hardware assurance a challenging task. An automated system to extract the Bill of Materials (BoM) can save time and resources during the authentication process; however, there are numerous imaging modalities and image analysis techniques that can be used to create such a system. In this paper we review different imaging modalities and their pros and cons for automatic PCB inspection. In addition, image analysis techniques commonly used for such images are reviewed in a systematic way to provide a direction for future research in this area. Index Terms—Component Detection, PCB, Authentication, Image Analysis, Machine Learning


2018 ◽  
Vol 32 (2) ◽  
pp. 103-119
Author(s):  
Colleen M. Boland ◽  
Chris E. Hogan ◽  
Marilyn F. Johnson

SYNOPSIS Mandatory existence disclosure rules require an organization to disclose a policy's existence, but not its content. We examine policy adoption frequencies in the year immediately after the IRS required mandatory existence disclosure by nonprofits of various governance policies. We also examine adoption frequencies in the year of the subsequent change from mandatory existence disclosure to a disclose-and-explain regime that required supplemental disclosures about the content and implementation of conflict-of-interest policies. Our results suggest that in areas where regulatory authority is unclear, mandatory existence disclosure is an effective and low-cost regulatory device for encouraging the adoption of policies desired by regulators, provided those policies are cost-effective for regulated firms to implement. In addition, we find that disclose-and-explain regulatory regimes provide stronger incentives for policy adoption than do mandatory existence disclosure regimes and also discourage “check the box” behavior. Future research should examine the impact of mandatory existence disclosure rules in the year that the regulation is implemented. Data Availability: Data are available from sources cited in the text.

