Identification of Micro-blog Opinion Leaders based on User Features and Outbreak Nodes

Author(s):  
Lin Cui ◽  
Dechang Pi

At present, recognition of micro-blog opinion leaders depends mainly on static attributes such as the number of micro-blogs a user has posted, registration time, and the number of friends. However, it is very difficult to obtain ideal recognition results with these methods. This paper puts forward a new method that identifies opinion leaders according to changes in user features and outbreak nodes. After a detailed analysis of users' attributes and behaviors, the user’s attribute features are treated as input variables, and the user's behavior features and outbreak nodes are treated as observed variables. The probability of being an opinion leader is the latent variable linking the input variables and the observed variables, and the resulting probability model is used to recognize micro-blog opinion leaders. Experiments are carried out on two real-world datasets from Sina micro-blog and Twitter, and the comparative results show that the proposed model finds micro-blog opinion leaders more precisely.

Data ◽  
2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Ahmed Elmogy ◽  
Hamada Rizk ◽  
Amany M. Sarhan

In data mining, outlier detection is a major challenge because it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and in a timely manner on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-33
Author(s):  
Wenjun Jiang ◽  
Jing Chen ◽  
Xiaofei Ding ◽  
Jie Wu ◽  
Jiawei He ◽  
...  

In online systems, including e-commerce platforms, many users rely on the reviews or comments written by previous consumers when making decisions, but they have limited time to read many reviews. Therefore, a review summary that contains all the important features in user-generated reviews is desirable. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be implemented by text summarization, which has two main types: extractive and abstractive approaches. Both of these approaches can handle supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter can avoid redundancy but usually can only deal with short sequences. Moreover, both approaches may neglect sentiment information. To address these issues, we propose comprehensive Review Summary Generation frameworks for the supervised and unsupervised scenarios. We design two different preprocessing models, re-ranking and selecting, to identify the important sentences while preserving the users’ sentiment in the original reviews. These sentences can then be used to generate review summaries with text summarization methods. Experimental results on seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets from Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.


2020 ◽  
Vol 34 (04) ◽  
pp. 6837-6844
Author(s):  
Xiaojin Zhang ◽  
Honglei Zhuang ◽  
Shengyu Zhang ◽  
Yuan Zhou

We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold. Distinct from the traditional TBP, the threshold is defined as a function of the rewards of all the arms, which is motivated by the criterion for identifying outliers. The learner needs to explore the rewards of the arms as well as the threshold. We refer to this problem as "double exploration for outlier detection". We construct an adaptively updated confidence interval for the threshold, based on the estimated value of the threshold in the previous rounds. Furthermore, by automatically trading off exploring the individual arms and exploring the outlier threshold, we provide an efficient algorithm in terms of the sample complexity. Experimental results on both synthetic datasets and real-world datasets demonstrate the efficiency of our algorithm.
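To make the idea of a threshold that depends on every arm concrete, here is a minimal sketch. It assumes a k-sigma rule (threshold = mean + k × standard deviation of the arms' empirical mean rewards); the abstract states only that the threshold is a function of all arms' rewards, so the specific rule, the function names, the value of `k` and the sample numbers are illustrative assumptions. The adaptive sampling and confidence intervals described in the abstract are not reproduced.

```python
from statistics import mean, pstdev


def outlier_threshold(arm_means, k=1.5):
    """Assumed k-sigma rule: the threshold depends on every arm's mean reward."""
    return mean(arm_means) + k * pstdev(arm_means)


def outlier_arms(arm_means, k=1.5):
    """Return the indices of arms whose mean reward exceeds the threshold."""
    theta = outlier_threshold(arm_means, k)
    return [i for i, m in enumerate(arm_means) if m > theta]


# Hypothetical empirical mean rewards for five arms.
empirical_means = [0.10, 0.12, 0.11, 0.09, 0.95]
flagged = outlier_arms(empirical_means, k=1.5)
print(flagged)
```

Because the threshold moves whenever any arm's estimate changes, a learner has to refine its estimates of the arms and of the threshold together, which is the "double exploration" the abstract refers to.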


2020 ◽  
Vol 31 (4) ◽  
pp. 24-45
Author(s):  
Mengmeng Shen ◽  
Jun Wang ◽  
Ou Liu ◽  
Haiying Wang

Tags generated in collaborative tagging systems (CTSs) may help users describe, categorize, search, discover, and navigate content, but the difficulty is how to cope with the information explosion and obtain experts and the required information quickly and accurately. This paper proposes an expert detection and recommendation (EDAR) model based on the semantics of tags; the framework consists of community detection and EDAR. Specifically, this paper first mines communities by clustering tags with an improved agglomerative hierarchical clustering (I-AHC) method, then presents a community expert detection (CED) algorithm for identifying community experts, and finally proposes an expert recommendation algorithm based on an improved collaborative filtering (CF) algorithm to recommend relevant experts to the target user. Experiments are carried out on real-world datasets, and the results from data experiments and user evaluations show that the proposed model performs very well compared to the benchmark method.


2013 ◽  
Vol 709 ◽  
pp. 642-645 ◽  
Author(s):  
Yi Jing Fu ◽  
Ting Zhang ◽  
Xiao Chang ◽  
Yu Yu Yuan

Weibo is a dominant Twitter-like micro-blog medium in China that has reflected the trend of social change in the country in recent years. Opinion leaders, in particular, have considerable power to influence the thinking of the masses. In this paper, we propose an innovative model that automatically revises the opinion leader list and analyzes the leaders' influence, taking into account the number of followers and friends as well as topics such as life, campus, government, public welfare and entertainment. Our model applies a two-step strategy: self-revised opinion leader list construction and VSM-based influence analysis. Experimental results reveal that our model performs well in reflecting the relationship between authoritative opinion leaders and the mass media.
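Assuming VSM here stands for the vector space model, the core operation is comparing texts as term vectors with cosine similarity. The sketch below shows only that generic operation: the raw term-count weighting, the function names and the sample texts are assumptions, and the abstract does not say how the authors weight terms or turn similarity into an influence score.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as raw term counts (an assumed, simple weighting)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical post and topic descriptions.
post = term_vector("campus charity run raises funds for public welfare")
topic_welfare = term_vector("public welfare charity donation")
topic_entertainment = term_vector("movie concert celebrity")

print(cosine_similarity(post, topic_welfare))
print(cosine_similarity(post, topic_entertainment))
```

A post that shares vocabulary with a topic scores above zero, while one with no shared terms scores exactly zero, which is what lets a topic-aware analysis separate, say, public welfare posts from entertainment posts.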


2013 ◽  
Vol 24 (04) ◽  
pp. 1350022 ◽  
Author(s):  
DA-CHENG NIE ◽  
MING-JING DING ◽  
YAN FU ◽  
JUN-LIN ZHOU ◽  
ZI-KE ZHANG

Recommender systems have developed rapidly and successfully. Such systems aim to help users find relevant items from a potentially overwhelming set of choices. However, most existing recommender algorithms focus on traditional user-item similarity computation rather than incorporating social interests into the recommender system. Each user has their own preference field, and users may influence their friends' preferences within their field of expertise once the social interest in friends' item collecting is taken into account. To model this social interest, we propose a simple method to compute users' social interest in specific items in a recommender system and then integrate this social interest with similarity preference. Experimental results on two real-world datasets, Epinions and Friendfeed, show that this method can significantly improve not only the algorithmic precision-accuracy but also the diversity-accuracy.


Author(s):  
Yusuke Tanaka ◽  
Tomoharu Iwata ◽  
Takeshi Kurashima ◽  
Hiroyuki Toda ◽  
Naonori Ueda

Analyzing people flows is important for better navigation and location-based advertising. Since the location information of people is often aggregated for protecting privacy, it is not straightforward to estimate transition populations between locations from aggregated data. Here, aggregated data are incoming and outgoing people counts at each location; they do not contain tracking information of individuals. This paper proposes a probabilistic model for estimating unobserved transition populations between locations from only aggregated data. With the proposed model, temporal dynamics of people flows are assumed to be probabilistic diffusion processes over a network, where nodes are locations and edges are paths between locations. By maximizing the likelihood with flow conservation constraints that incorporate travel duration distributions between locations, our model can robustly estimate transition populations between locations. The statistically significant improvement of our model is demonstrated using real-world datasets of pedestrian data in exhibition halls, bike trip data and taxi trip data in New York City.
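The following sketch illustrates why aggregated counts alone do not pin down the transitions. It assumes a simplified setting in which everyone moves within a single time step, whereas the abstract describes constraints that incorporate travel duration distributions; the function name and both flow matrices are made up for illustration.

```python
def aggregate(transitions):
    """Collapse a transition matrix into per-location outgoing and incoming counts.

    transitions[i][j] is the number of people moving from location i to j.
    """
    outgoing = [sum(row) for row in transitions]
    incoming = [sum(col) for col in zip(*transitions)]
    return outgoing, incoming


# Two different hypothetical flow patterns between three locations.
flows_a = [[0, 3, 1],
           [2, 0, 2],
           [1, 1, 0]]
flows_b = [[0, 4, 0],
           [1, 0, 3],
           [2, 0, 0]]

print(aggregate(flows_a))
print(aggregate(flows_b))
```

Both matrices yield the same outgoing and incoming counts, so the aggregated data are consistent with more than one set of transitions. This ambiguity is why an estimator needs extra structure, such as the probabilistic diffusion assumption described above, to choose among the candidates.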


Author(s):  
Guibing Guo ◽  
Enneng Yang ◽  
Li Shen ◽  
Xiaochun Yang ◽  
Xiaodong He

Trust-aware recommender systems have received much attention recently for their ability to capture the influence among connected users. However, they suffer from efficiency issues due to the large amount of data and time-consuming real-valued operations. Although existing discrete collaborative filtering may alleviate this issue to some extent, it is unable to accommodate social influence. In this paper we propose a discrete trust-aware matrix factorization (DTMF) model that combines the advantages of social relations and discrete techniques for fast recommendation. Specifically, we map the latent representations of users and items into a joint Hamming space by recovering the rating and trust interactions between users and items. We adopt a sophisticated discrete coordinate descent (DCD) approach to optimize our proposed model. Experiments on two real-world datasets demonstrate the superiority of our approach over other state-of-the-art approaches in terms of ranking accuracy and efficiency.
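One general reason binary codes make recommendation fast is that, for two codes in {-1, +1}^r, the inner product equals r minus twice the Hamming distance, so ranking by predicted preference reduces to counting differing positions. The sketch below checks that identity on made-up codes; the code values and names are hypothetical, and it does not reproduce the DTMF model or its DCD optimization.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def inner_product(a, b):
    """Real-valued inner product of two +/-1 codes."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical 8-bit codes for one user and two items.
user_code = [1, -1, 1, 1, -1, -1, 1, -1]
item_codes = {
    "item_a": [1, -1, 1, 1, -1, -1, 1, 1],
    "item_b": [-1, 1, -1, 1, -1, 1, 1, -1],
}

r = len(user_code)
for name, code in item_codes.items():
    # Identity for +/-1 codes: <b, d> = r - 2 * Hamming(b, d).
    print(name, inner_product(user_code, code), r - 2 * hamming_distance(user_code, code))

# Smaller Hamming distance means larger inner product, so this ranks best first.
ranking = sorted(item_codes, key=lambda name: hamming_distance(user_code, item_codes[name]))
print(ranking)
```

In practice the codes would be packed into machine words so that the distance becomes an XOR followed by a bit count, which is where the speed-up over real-valued operations comes from.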


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Baocheng Huang ◽  
Guang Yu ◽  
Hamid Reza Karimi

Finding opinion leaders is valuable in the real world. Because different data sources usually have different characteristics, there is no standard algorithm for finding and detecting opinion leaders across data sources. Every data source has its own structural characteristics and its own detection algorithm for finding opinion leaders. Experimental results show that opinion leaders and their characteristics can be found among the comments on Weibo, a Chinese social network similar to Facebook or Twitter in the USA.


Algorithms ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 17 ◽  
Author(s):  
Emmanuel Pintelas ◽  
Ioannis E. Livieris ◽  
Panagiotis Pintelas

Machine learning has emerged as a key factor in many technological and scientific advances and applications. Much research has been devoted to developing high-performance machine learning models that make very accurate predictions and decisions across a wide range of applications. Nevertheless, we still seek to understand and explain how these models work and make decisions. Explainability and interpretability in machine learning are significant issues, since in most real-world problems it is considered essential to understand and explain a model’s prediction mechanism in order to trust it and make decisions on critical issues. In this study, we developed a Grey-Box model based on a semi-supervised methodology utilizing a self-training framework. The main objective of this work is to develop a machine learning model that is both interpretable and accurate, although this is a complex and challenging task. The proposed model was evaluated on a variety of real-world datasets from the crucial application domains of education, finance and medicine. Our results demonstrate the efficiency of the proposed model, which performs comparably to a Black-Box model and considerably outperforms single White-Box models while remaining as interpretable as a White-Box model.
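As a rough illustration of the self-training idea mentioned in the abstract, the sketch below repeatedly fits a model on the labelled data, pseudo-labels the unlabelled points it is confident about, and refits. The 1-D nearest-centroid classifier, the margin-based confidence rule, the `min_margin` value and all names and data are stand-in assumptions; the abstract does not describe how the Grey-Box model combines its components, so that part is not reproduced.

```python
def fit_centroids(points, labels):
    """Return {label: mean of the 1-D points carrying that label}."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_margin(centroids, x):
    """Predict the nearest centroid; the margin is the gap to the runner-up.

    Assumes at least two classes are present.
    """
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(points, labels, unlabeled, min_margin=1.0, max_rounds=10):
    """Grow the labelled set with confident pseudo-labels, then refit."""
    points, labels, pool = list(points), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        scored = [(x, *predict_with_margin(centroids, x)) for x in pool]
        confident = [(x, y) for x, y, m in scored if m >= min_margin]
        if not confident:
            break  # nothing left that the model is sure about
        for x, y in confident:
            points.append(x)
            labels.append(y)
        pool = [x for x, _, m in scored if m < min_margin]
    return fit_centroids(points, labels), pool


# Hypothetical data: two labelled points and four unlabelled ones.
centroids, still_unlabeled = self_train(
    [0.0, 10.0], ["low", "high"], [1.0, 2.0, 9.0, 4.8]
)
print(centroids, still_unlabeled)
```

The ambiguous point near the middle is never pseudo-labelled because its margin stays below the threshold, which shows how the confidence rule limits the propagation of uncertain labels.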

