Demarcating Endogenous and Exogenous Opinion Dynamics: An Experimental Design Approach

2021 ◽  
Vol 15 (6) ◽  
pp. 1-25
Author(s):  
Paramita Koley ◽  
Avirup Saha ◽  
Sourangshu Bhattacharya ◽  
Niloy Ganguly ◽  
Abir De

The networked opinion diffusion in online social networks is often governed by two genres of opinions: endogenous opinions, which are driven by the influence of social contacts among users, and exogenous opinions, which are formed by external effects like news and feeds. Accurate demarcation of endogenous and exogenous messages offers an important cue to opinion modeling, thereby enhancing its predictive performance. In this article, we design a suite of unsupervised classification methods based on experimental design approaches, in which we aim to select the subsets of events that minimize different measures of mean estimation error. In more detail, we first show that these subset selection tasks are NP-Hard. Then we show that the associated objective functions are weakly submodular, which allows us to cast efficient approximation algorithms with guarantees. Finally, we validate the efficacy of our proposal on various real-world datasets crawled from Twitter as well as diverse synthetic datasets. Our experiments range from validating prediction performance on unsanitized and sanitized events to checking the effect of selecting optimal subsets of various sizes. Through various experiments, we have found that our method offers a significant improvement in opinion forecasting accuracy over several competitors.
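
The subset selection step can be illustrated with a generic greedy routine. This is a minimal sketch, not the authors' algorithm: it assumes each event is a feature vector and that the error measure is the A-optimal criterion, the trace of the inverse of a ridge-regularized information matrix. The function name `greedy_a_optimal` and the `ridge` parameter are hypothetical.

```python
def greedy_a_optimal(features, k, ridge=1.0):
    """Greedily pick k rows that most reduce trace((ridge*I + sum x x^T)^-1)."""
    dim = len(features[0])
    # Inverse of the information matrix, starting from (ridge * I)^-1.
    inv = [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
           for i in range(dim)]
    selected = []
    remaining = set(range(len(features)))
    for _ in range(min(k, len(features))):
        best = None  # (gain, index, u, denom) of the best candidate so far
        for idx in sorted(remaining):
            x = features[idx]
            # u = inv @ x; inv stays symmetric throughout.
            u = [sum(inv[i][j] * x[j] for j in range(dim)) for i in range(dim)]
            denom = 1.0 + sum(x[i] * u[i] for i in range(dim))
            # Exact drop in trace(inv) if x were added (Sherman-Morrison).
            gain = sum(v * v for v in u) / denom
            if best is None or gain > best[0]:
                best = (gain, idx, u, denom)
        _, idx, u, denom = best
        # Rank-one update: inv <- inv - (u u^T) / (1 + x^T inv x).
        for i in range(dim):
            for j in range(dim):
                inv[i][j] -= u[i] * u[j] / denom
        selected.append(idx)
        remaining.remove(idx)
    trace = sum(inv[i][i] for i in range(dim))
    return selected, trace


# Toy events: two orthogonal directions, a duplicate, and a weak mixed one.
events = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.1, 0.1]]
chosen, trace = greedy_a_optimal(events, k=2)
print(chosen, trace)
```

On this toy input the greedy pass prefers the two orthogonal events over the duplicate, since a repeated direction reduces the trace less once that direction is already covered.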

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Abstract Background Three-way data have started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output.
Results G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters.
Conclusions Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions that produce more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.
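
The idea of planting a tricluster in a background distribution and returning the ground truth can be sketched in a few lines. This is an illustration only and not G-Tric's interface: the function `plant_tricluster`, the Gaussian background, and the constant-valued pattern are all assumptions made for the example.

```python
import random


def plant_tricluster(shape, rows, cols, ctxs, value=5.0, seed=0):
    """Build an observations x features x contexts array with one planted tricluster."""
    rng = random.Random(seed)
    n_obs, n_feat, n_ctx = shape
    # Background: independent standard normal noise in every cell.
    data = [[[rng.gauss(0.0, 1.0) for _ in range(n_ctx)]
             for _ in range(n_feat)]
            for _ in range(n_obs)]
    # Planted pattern: a constant value on the chosen subspace.
    for i in rows:
        for j in cols:
            for k in ctxs:
                data[i][j][k] = value
    # Ground truth that an evaluation could compare a found solution against.
    truth = {
        "observations": sorted(rows),
        "features": sorted(cols),
        "contexts": sorted(ctxs),
    }
    return data, truth


data, truth = plant_tricluster((4, 3, 2), rows=[0, 2], cols=[1], ctxs=[0, 1])
print(truth)
```

Because the planted indices are returned alongside the data, extrinsic metrics (agreement between a found tricluster and `truth`) can be computed in addition to intrinsic ones.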


Author(s):  
ChunYan Yin ◽  
YongHeng Chen ◽  
Wanli Zuo

Abstract Preference-based recommendation systems analyze user-item interactions to reveal latent factors that explain users' latent preferences for items, and form personalized recommendations based on the behavior of others with similar tastes. Most work in the recommendation systems literature has been developed under the assumption that user preference is a static pattern, although user preferences and item attributes may change over time. To capture such changes, we develop an Evolutionary Social Poisson Factorization (EPF_Social) model, a new Bayesian factorization model that can effectively model the smoothly drifting latent factors using conjugate Gamma–Markov chains. In addition, EPF_Social can capture the impact of friends in the social network on users' latent preferences. We studied our models with two large real-world datasets, and demonstrated that our model gives better predictive performance than state-of-the-art static factorization models.


Author(s):  
Luca Pasa ◽  
Nicolò Navarin ◽  
Alessandro Sperduti

Abstract Graph property prediction is becoming more and more popular due to the increasing availability of scientific and social data naturally represented in graph form. Because of that, many researchers are focusing on the development of improved graph neural network models. One of the main components of a graph neural network is the aggregation operator, needed to generate a graph-level representation from a set of node-level embeddings. The aggregation operator is critical since it should, in principle, provide a representation of the graph that is isomorphism invariant, i.e. the graph representation should be a function of graph nodes treated as a set. DeepSets (in: Advances in neural information processing systems, pp 3391–3401, 2017) provides a framework to construct a set-aggregation operator with universal approximation properties. In this paper, we propose a DeepSets aggregation operator, based on Self-Organizing Maps (SOM), to transform a set of node-level representations into a single graph-level one. The adoption of SOMs allows us to compute node representations that embed the information about their mutual similarity. Experimental results on several real-world datasets show that our proposed approach achieves improved predictive performance compared to the commonly adopted sum aggregation and many state-of-the-art graph neural network architectures in the literature.
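
The set-aggregation form that DeepSets builds on, a per-node transform followed by a sum and a readout, can be sketched as follows. This shows only the generic sum-based readout, not the SOM-based operator proposed in the paper; `phi` and `rho` are fixed toy functions here (assumptions for illustration), whereas in practice they would be learned networks.

```python
def phi(node):
    """Toy per-node transform: degree-2 monomials of a 2-d embedding."""
    a, b = node
    return [a * a, a * b, b * b]


def rho(pooled):
    """Toy readout applied to the pooled vector."""
    return sum(pooled)


def deepsets_readout(nodes):
    """Graph-level value rho(sum_i phi(x_i)) over node embeddings x_i."""
    pooled = [0.0, 0.0, 0.0]
    for node in nodes:
        for i, value in enumerate(phi(node)):
            pooled[i] += value
    return rho(pooled)


nodes = [(1.0, 2.0), (3.0, -1.0), (0.5, 4.0)]
# Reordering the nodes leaves the readout unchanged.
print(deepsets_readout(nodes) == deepsets_readout(list(reversed(nodes))))
```

Because the only interaction between nodes is the sum, the output depends on the node embeddings as a set, which is the invariance property the abstract asks of an aggregation operator.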


2020 ◽  
Vol 34 (04) ◽  
pp. 6837-6844
Author(s):  
Xiaojin Zhang ◽  
Honglei Zhuang ◽  
Shengyu Zhang ◽  
Yuan Zhou

We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold. Distinct from the traditional TBP, the threshold is defined as a function of the rewards of all the arms, which is motivated by the criterion for identifying outliers. The learner needs to explore the rewards of the arms as well as the threshold. We refer to this problem as "double exploration for outlier detection". We construct an adaptively updated confidence interval for the threshold, based on the estimated value of the threshold in the previous rounds. Furthermore, by automatically trading off exploring the individual arms and exploring the outlier threshold, we provide an efficient algorithm in terms of the sample complexity. Experimental results on both synthetic datasets and real-world datasets demonstrate the efficiency of our algorithm.
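
To make the reward-dependent threshold concrete, the sketch below flags arms whose mean reward exceeds a threshold computed from all arms. The abstract does not state the threshold function; the form used here, mean plus `k` standard deviations of the arm means, is an assumption, as is the function name `outlier_arms`. The sketch also treats the mean rewards as known and so omits the exploration and confidence intervals that the paper is about.

```python
import statistics


def outlier_arms(mean_rewards, k=2.0):
    """Return (threshold, indices of arms whose mean reward exceeds it)."""
    mu = statistics.mean(mean_rewards)
    sigma = statistics.pstdev(mean_rewards)  # population standard deviation
    # Assumed criterion: the threshold depends on the rewards of all arms.
    threshold = mu + k * sigma
    outliers = [i for i, r in enumerate(mean_rewards) if r > threshold]
    return threshold, outliers


threshold, outliers = outlier_arms([0.1, 0.2, 0.15, 0.12, 0.9], k=1.5)
print(threshold, outliers)
```

In the bandit setting both the per-arm means and the threshold must be estimated from samples, which is why the learner has to explore the arms and the threshold together.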


2016 ◽  
Vol 2016 ◽  
pp. 1-14
Author(s):  
Lin-Ping Song ◽  
Leonard R. Pasion ◽  
Nicolas Lhomme ◽  
Douglas W. Oldenburg

This work, under the optimal experimental design framework, investigates the sensor placement problem that aims to guide electromagnetic induction (EMI) sensing of multiple objects. We use the linearized model covariance matrix as a measure of estimation error to present a sequential experimental design (SED) technique. The technique recursively minimizes data misfit to update model parameters and maximizes an information gain function for a future survey relative to previous surveys. The fundamental process of the SED seeks to increase weighted sensitivities to targets when placing sensors. The synthetic and field experiments demonstrate that SED can be used to guide the sensing process for an effective interrogation. It also can serve as a theoretic basis to improve empirical survey operation. We further study the sensitivity of the SED to the number of objects within the sensing range. The tests suggest that an appropriately overrepresented model about expected anomalies might be a feasible choice.


2020 ◽  
Vol 34 (04) ◽  
pp. 3593-3600
Author(s):  
Jiezhu Cheng ◽  
Kaizhu Huang ◽  
Zibin Zheng

Multivariate time series forecasting is an important yet challenging problem in machine learning. Most existing approaches only forecast the series value at one future moment, ignoring the interactions between predictions of future moments at different temporal distances. Such a deficiency probably prevents the model from getting enough information about the future, thus limiting the forecasting accuracy. To address this problem, we propose the Multi-Level Construal Neural Network (MLCNN), a novel multi-task deep learning framework. Inspired by the Construal Level Theory of psychology, this model aims to improve predictive performance by fusing forecasting information (i.e., future visions) for different future times. We first use a Convolutional Neural Network to extract multi-level abstract representations of the raw data for near and distant future predictions. We then model the interplay between multiple predictive tasks and fuse their future visions through a modified Encoder-Decoder architecture. Finally, we combine a traditional autoregression model with the neural network to solve the scale insensitivity problem. Experiments on three real-world datasets show that our method achieves statistically significant improvements compared to most state-of-the-art baseline methods, with an average 4.59% reduction in RMSE and an average 6.87% reduction in MAE.
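
For readers unfamiliar with the two reported metrics, the following sketch computes RMSE and MAE and the relative reduction of one score against a baseline. The numbers in the usage lines are made up for illustration and are not taken from the paper; the helper `relative_reduction` is a hypothetical name.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred))
print(relative_reduction(2.0, 1.5))
```

RMSE penalizes the single large miss more heavily than MAE does, which is why the two metrics are usually reported together.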


Author(s):  
Sabina-Adriana Floria ◽  
Florin Leon

Online social networks are the main choice of people to maintain their social relationships and share information or opinions. Estimating the actions of a user is not trivial because an individual can act spontaneously or be influenced by external factors. In this paper we propose a novel model for imitating the evolution of the information diffusion in a network as well as possible. Each individual is modeled as a node with two factors (psychological and sociological) that control its probabilistic transmission of information. The psychological factor refers to the node's preference for the topic discussed, i.e. the information diffused. The sociological factor takes into account the influence of the neighbors' activity on the node, i.e. the gregarious behavior. A genetic algorithm is used to automatically tune the parameters of the model in order to fit the evolution of information diffusion observed in two real-world datasets with three topics. The reproduced diffusions show that the proposed model imitates the real diffusions very well.


Author(s):  
Hong Xie ◽  
Yongkun Li ◽  
John C.S. Lui

Online product rating systems have become an indispensable component for numerous web services such as Amazon, eBay, Google Play Store and TripAdvisor. One functionality of such systems is to uncover the product quality via product ratings (or reviews) contributed by consumers. However, a well-known psychological phenomenon called “message-based persuasion” leads to “biased” product ratings in a cascading manner (we call this the persuasion cascade). This paper investigates: (1) How does the persuasion cascade influence the product quality estimation accuracy? (2) Given a real-world product rating dataset, how can we infer the persuasion cascade and analyze it to draw practical insights? We first develop a mathematical model to capture key factors of a persuasion cascade. We formulate a high-order Markov chain to characterize the opinion dynamics of a persuasion cascade and prove the convergence of opinions. We further bound the product quality estimation error for a class of rating aggregation rules, including the average scoring rule, via matrix perturbation theory and the Chernoff bound. We also design a maximum likelihood algorithm to infer parameters of the persuasion cascade. We conduct experiments on data from Amazon and TripAdvisor, and show that persuasion cascades notably exist, but the average scoring rule has a small product quality estimation error under practical scenarios.

