Quantifying Aspect Bias in Ordinal Ratings using a Bayesian Approach

Author(s):  
Lahari Poddar ◽  
Wynne Hsu ◽  
Mong Li Lee

User opinions expressed in the form of ratings can influence an individual's view of an item. However, the true quality of an item is often obfuscated by user biases, and the observed ratings do not make clear how much importance different users place on different aspects of an item. We propose a probabilistic model of the observed aspect ratings to infer (i) each user's aspect bias and (ii) the latent intrinsic quality of an item. We model multi-aspect ratings as ordered discrete data and encode the dependency between different aspects by using a latent Gaussian structure. We handle the Gaussian-Categorical non-conjugacy using a stick-breaking formulation coupled with Pólya-Gamma auxiliary variable augmentation for simple, fully Bayesian inference. On two real-world datasets, we demonstrate the predictive ability of our model and its effectiveness in learning explainable user biases to provide insights towards a more reliable product quality estimation.
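As a rough illustration of the stick-breaking idea named in this abstract, the sketch below maps K-1 real-valued latent scores to K ordered rating probabilities with a logistic stick-breaking map. It is a generic construction, not the authors' implementation: the function names and the example latent values are assumptions, and the Pólya-Gamma augmentation step is not shown.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def stick_breaking_probs(psi):
    """Map K-1 real-valued latents to K category probabilities.

    Each latent decides what fraction of the remaining probability
    mass ("stick") is assigned to the next rating level; the last
    level receives whatever mass is left.
    """
    probs = []
    remaining = 1.0
    for value in psi:
        piece = sigmoid(value) * remaining
        probs.append(piece)
        remaining -= piece
    probs.append(remaining)
    return probs


# Hypothetical latent scores for a 5-level rating scale (assumed values).
rating_probs = stick_breaking_probs([0.0, 0.0, 0.0, 0.0])
```

Because every category probability is a product of logistic terms, Gaussian latents can be paired with Pólya-Gamma auxiliary variables, which is the general reason this construction is popular when a Gaussian prior meets a categorical likelihood.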

2020 ◽  
Vol 8 ◽  
pp. 539-555
Author(s):  
Marina Fomicheva ◽  
Shuo Sun ◽  
Lisa Yankovskaya ◽  
Frédéric Blain ◽  
Francisco Guzmán ◽  
...  

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user of the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
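One simple way to picture a glass-box signal of this kind is a length-normalized log-probability computed from the per-token probabilities a translation model assigns to its own output. The sketch below is only an assumed illustration: the abstract does not specify which uncertainty measures are used, and the function name and the probability values are hypothetical.

```python
import math


def mean_log_prob(token_probs):
    """Average log-probability of the output tokens.

    Values closer to 0 mean the model was more confident in its output;
    more negative values indicate higher uncertainty.
    """
    return sum(math.log(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities for two translations (assumed values).
confident_score = mean_log_prob([0.9, 0.8, 0.95])
uncertain_score = mean_log_prob([0.4, 0.3, 0.5])
```

Under this assumption, a translation with a higher score would be ranked as more likely to be of good quality, with no QE training data involved.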


2022 ◽  
Vol 6 (GROUP) ◽  
pp. 1-25
Author(s):  
Ziyi Kou ◽  
Lanyu Shang ◽  
Yang Zhang ◽  
Dong Wang

The proliferation of social media has promoted the spread of misinformation that raises many concerns in our society. This paper focuses on a critical problem of explainable COVID-19 misinformation detection that aims to accurately identify and explain misleading COVID-19 claims on social media. Motivated by the lack of COVID-19 relevant knowledge in existing solutions, we construct a novel crowdsource knowledge graph based approach to incorporate COVID-19 knowledge facts by leveraging the collaborative efforts of expert and non-expert crowd workers. Two important challenges exist in developing our solution: i) how to effectively coordinate the crowd efforts from both expert and non-expert workers to generate the relevant knowledge facts for detecting COVID-19 misinformation; ii) how to leverage the knowledge facts from the constructed knowledge graph to accurately explain the detected COVID-19 misinformation. To address these challenges, we develop HC-COVID, a hierarchical crowdsource knowledge graph based framework that explicitly models the COVID-19 knowledge facts contributed by crowd workers with different levels of expertise and accurately identifies the related knowledge facts to explain the detection results. We evaluate HC-COVID using two public real-world datasets on social media. Evaluation results demonstrate that HC-COVID significantly outperforms state-of-the-art baselines in terms of the detection accuracy of misleading COVID-19 claims and the quality of the explanations.


2021 ◽  
Vol 2 (3) ◽  
Author(s):  
Riccardo Dondi ◽  
Mohammad Mehdi Hosseinzadeh

Temporal networks have been successfully applied to analyse dynamics of networks. In this paper we focus on an approach recently introduced to identify dense subgraphs in a temporal network and we present a heuristic, based on the local search technique, for the problem. The experimental results we present on synthetic and real-world datasets show that our heuristic provides mostly better solutions (denser solutions) and that the heuristic is fast (comparable with the fastest method in the literature, which it outperforms in terms of solution quality). We also present experimental results of two variants of our method based on two different subroutines to compute a dense subgraph of a given graph.


2021 ◽  
pp. 1-10
Author(s):  
Jin Yi ◽  
Jiajin Huang ◽  
Jin Qin

Recommender systems have been widely used in recent years, and improving their performance is both important and meaningful. Generally, recommendation methods use users’ historical ratings of items to predict ratings for their unrated items and make recommendations. However, as the number of users and items grows, data sparsity increases and the quality of recommendations decreases sharply. To address the sparsity problem, other auxiliary information is combined with ratings to mine users’ preferences for higher recommendation quality. Like rating data, review data also contain rich information about users’ preferences for items. This paper proposes a novel recommendation model that uses adversarial learning among auto-encoders to improve recommendation quality by minimizing the gap between the rating relation and the review relation between a user and an item. Empirical studies on real-world datasets show that the proposed method improves recommendation performance.


Author(s):  
Kaj Syrjänen ◽  
Luke Maurits ◽  
Unni Leino ◽  
Terhi Honkola ◽  
Jadranka Rota ◽  
...  

In recent years, techniques such as Bayesian inference of phylogeny have become a standard part of the quantitative linguistic toolkit. While these tools successfully model the tree-like component of a linguistic dataset, real-world datasets generally include a combination of tree-like and nontree-like signals. Alongside developing techniques for modeling nontree-like data, an important requirement for future quantitative work is to build a principled understanding of this structural complexity of linguistic datasets. Some techniques exist for exploring the general structure of a linguistic dataset, such as NeighborNets, δ scores, and Q-residuals; however, these methods are not without limitations or drawbacks. In general, the question of what kinds of historical structure a linguistic dataset can contain and how these might be detected or measured remains critically underexplored from an objective, quantitative perspective. In this article, we propose TIGER values, a metric that estimates the internal consistency of a genetic dataset, as an additional metric for assessing how tree-like a linguistic dataset is. We use TIGER values to explore simulated language data ranging from very tree-like to completely unstructured, and also use them to analyze a cognate-coded basic vocabulary dataset of Uralic languages. As a point of comparison for the TIGER values, we also explore the same data using δ scores, Q-residuals, and NeighborNets. Our results suggest that TIGER values are capable of both ranking tree-like datasets according to their degree of treelikeness and distinguishing datasets with tree-like structure from datasets with a nontree-like structure. Consequently, we argue that TIGER values serve as a useful metric for measuring the historical heterogeneity of datasets. Our results also highlight the complexities in measuring treelikeness from linguistic data, and how the metrics approach this question from different perspectives.


Author(s):  
Stephen Verderber

The interdisciplinary field of person-environment relations has, from its origins, addressed the transactional relationship between human behavior and the built environment. This body of knowledge has been based upon qualitative and quantitative assessment of phenomena in the “real world.” This knowledge base has been instrumental in advancing the quality of real, physical environments globally at various scales of inquiry and with myriad user/client constituencies. By contrast, scant attention has been devoted to using simulation as a means to examine and represent person-environment transactions and how what is learned can be applied. The present discussion posits that press-competency theory, with related aspects drawn from functionalist-evolutionary theory, can together help us learn how the medium of film can yield further insights into person-environment (P-E) transactions in the real world. Sampling, combined with extemporary behavior setting analysis, provides the basis for this analysis of healthcare settings as expressed throughout the history of cinema. This method can be of significant aid in examining P-E transactions across diverse historical periods, building types and places, healthcare and otherwise, that would otherwise be logistically, geographically, or temporally unattainable in real time and space.


2020 ◽  
Author(s):  
Laetitia Zmuda ◽  
Charlotte Baey ◽  
Paolo Mairano ◽  
Anahita Basirat

It is well known that individuals can identify novel words in a stream of an artificial language using statistical dependencies. While the underlying computations are thought to be similar from one stream to another (e.g. transitional probabilities between syllables), performance is not. According to the “linguistic entrenchment” hypothesis, this is because individuals have some prior knowledge regarding co-occurrences of elements in speech, which intervenes during verbal statistical learning. The focus of previous studies was on task performance. The goal of the current study is to examine the extent to which prior knowledge impacts metacognition (i.e. the ability to evaluate one’s own cognitive processes). Participants were exposed to two different artificial languages. Using a fully Bayesian approach, we estimated an unbiased measure of metacognitive efficiency and compared the two languages in terms of task performance and metacognition. While task performance was higher in one of the languages, metacognitive efficiency was similar in both languages. In addition, a model assuming no correlation between the two languages accounted for our results better than a model in which correlations were introduced. We discuss the implications of our findings regarding the computations which underlie the interaction between input and prior knowledge during verbal statistical learning.
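The transitional probabilities between syllables mentioned in this abstract can be made concrete with a small sketch: the probability of syllable B given the preceding syllable A is estimated as count(AB) / count(A). The function name and the toy syllable stream below are assumptions for illustration and are not taken from the study's materials.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) from a syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {
        (a, b): n / first_counts[a]
        for (a, b), n in pair_counts.items()
    }


# Hypothetical stream: "tu-pi-ro" recurs as a word, followed by varying syllables.
stream = ["tu", "pi", "ro", "go", "la", "bu",
          "tu", "pi", "ro", "bi", "da", "ku"]
tp = transitional_probabilities(stream)
```

In this toy stream the within-word transition `("tu", "pi")` has a higher estimated probability than the word-boundary transition `("ro", "go")`, which is the kind of contrast listeners are thought to exploit when segmenting words.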


2020 ◽  
Vol 19 (10) ◽  
pp. 943-948
Author(s):  
Peter Lio ◽  
Andreas Wollenberg ◽  
Jacob Thyssen ◽  
Evangeline Pierce ◽  
Maria Rueda ◽  
...  

2020 ◽  
Vol 9 (20) ◽  
Author(s):  
Akshay Pendyal ◽  
Craig Rothenberg ◽  
Jean E. Scofi ◽  
Harlan M. Krumholz ◽  
Basmah Safdar ◽  
...  

Background: Despite investments to improve quality of emergency care for patients with acute myocardial infarction (AMI), few studies have described national, real‐world trends in AMI care in the emergency department (ED). We aimed to describe trends in the epidemiology and quality of AMI care in US EDs over a recent 11‐year period, from 2005 to 2015.

Methods and Results: We conducted an observational study of ED visits for AMI using the National Hospital Ambulatory Medical Care Survey, a nationally representative probability sample of US EDs. AMI visits were classified as ST‐segment–elevation myocardial infarction (STEMI) and non‐STEMI. Outcomes included annual incidence of AMI, median ED length of stay, ED disposition type, and ED administration of evidence‐based medications. Annual ED visits for AMI decreased from 1 493 145 in 2005 to 581 924 in 2015. Estimated yearly incidence of ED visits for STEMI decreased from 1 402 768 to 315 813. The proportion of STEMI sent for immediate, same‐hospital catheterization increased from 12% to 37%. Among patients with STEMI sent directly for catheterization, median ED length of stay decreased from 62 to 37 minutes. ED administration of antithrombotic and nonaspirin antiplatelet agents rose for STEMI (23%–31% and 10%–27%, respectively).

Conclusions: National, real‐world trends in the epidemiology of AMI in the ED parallel those of clinical registries, with decreases in AMI incidence and STEMI proportion. ED care processes for STEMI mirror evolving guidelines that favor high‐intensity antiplatelet therapy, early invasive strategies, and regionalization of care.

