Bursting the Filter Bubble: Fairness-Aware Network Link Prediction

Farzan Masrour; Tyler Wilson; Heng Yan; Pang-Ning Tan; Abdol Esfahanian

doi:10.1609/aaai.v34i01.5429

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5429 ◽

2020 ◽

Vol 34 (01) ◽

pp. 841-848

Author(s):

Farzan Masrour ◽

Tyler Wilson ◽

Heng Yan ◽

Pang-Ning Tan ◽

Abdol Esfahanian

Keyword(s):

Social Networking ◽

Real World ◽

Link Prediction ◽

Representation Learning ◽

Experimental Results ◽

Adversarial Network ◽

Network Link ◽

Network Modularity ◽

Filter Bubble ◽

Real World Datasets

Link prediction is an important task in online social networking, as it can be used to infer new or previously unknown relationships in a network. However, due to the homophily principle, current algorithms are susceptible to promoting links that may increase the segregation of the network, an effect known as the filter bubble. In this study, we examine the filter bubble problem from the perspective of algorithmic fairness and introduce a dyadic-level fairness criterion based on the network modularity measure. We show how the criterion can be used as a postprocessing step to generate more heterogeneous links in order to overcome the filter bubble problem. We also present a novel framework that combines adversarial network representation learning with supervised link prediction to alleviate the filter bubble problem. Experiments conducted on several real-world datasets show the effectiveness of the proposed methods compared to baseline approaches, which include conventional link prediction and fairness-aware methods for i.i.d. data.
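The abstract ties segregation to network modularity but does not give the exact criterion. As a rough illustration only, the sketch below computes standard Newman modularity with each node's protected-attribute value used as its community label, so that adding a cross-group link lowers the score. The function name `modularity`, the toy graph, and the group labels are assumptions made for this example, not the authors' method.

```python
def modularity(edges, group):
    """Newman modularity of a simple undirected graph, using each node's
    protected-attribute value (group[node]) as its community label.
    Assumes no self-loops and no duplicate edges."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Observed fraction of edges that connect two nodes of the same group.
    intra = sum(1 for u, v in edges if group[u] == group[v]) / m
    # Expected fraction under the degree-preserving (configuration) null model.
    group_degree = {}
    for node, d in degree.items():
        group_degree[group[node]] = group_degree.get(group[node], 0) + d
    expected = sum((d / (2 * m)) ** 2 for d in group_degree.values())
    return intra - expected


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
group = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

before = modularity(edges, group)
after = modularity(edges + [(0, 5)], group)  # recommend one cross-group link
print(before, after)
```

In this toy case the cross-group link reduces modularity, which is the direction a modularity-based postprocessing step would presumably favor when it reranks candidate links.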

OFCOD: On the Fly Clustering Based Outlier Detection Framework

Data ◽

10.3390/data6010001 ◽

2020 ◽

Vol 6 (1) ◽

pp. 1

Author(s):

Ahmed Elmogy ◽

Hamada Rizk ◽

Amany M. Sarhan

Keyword(s):

Data Mining ◽

Image Processing ◽

Intrusion Detection ◽

Real Time ◽

Outlier Detection ◽

Real World ◽

Medical Data ◽

Experimental Results ◽

Real Time Applications ◽

Real World Datasets

In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and in a timely manner upon request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.

Review Summary Generation in Online Systems: Frameworks for Supervised and Unsupervised Scenarios

ACM Transactions on the Web ◽

10.1145/3448015 ◽

2021 ◽

Vol 15 (3) ◽

pp. 1-33

Author(s):

Wenjun Jiang ◽

Jing Chen ◽

Xiaofei Ding ◽

Jie Wu ◽

Jiawei He ◽

...

Keyword(s):

Decision Making ◽

Real World ◽

Text Summarization ◽

Experimental Results ◽

Product Review ◽

Comprehensive Review ◽

Online Systems ◽

Real World Datasets ◽

Different Characteristics

In online systems, including e-commerce platforms, many users rely on the reviews or comments written by previous consumers for decision making, but they have limited time to read many reviews. Therefore, a review summary that contains all the important features of user-generated reviews is desirable. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be implemented by text summarization, which has two main types: extractive and abstractive approaches. Both approaches can deal with both supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter can avoid redundancy but usually can only deal with short sequences. Moreover, both approaches may neglect sentiment information. To address these issues, we propose comprehensive Review Summary Generation frameworks to deal with the supervised and unsupervised scenarios. We design two different preprocessing models, re-ranking and selecting, to identify the important sentences while preserving users’ sentiment in the original reviews. These sentences can then be used to generate review summaries with text summarization methods. Experimental results on seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets from Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.

Adaptive Double-Exploration Tradeoff for Outlier Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6164 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6837-6844

Author(s):

Xiaojin Zhang ◽

Honglei Zhuang ◽

Shengyu Zhang ◽

Yuan Zhou

Keyword(s):

Confidence Interval ◽

Outlier Detection ◽

Real World ◽

Efficient Algorithm ◽

Experimental Results ◽

Sample Complexity ◽

Bandit Problem ◽

Real World Datasets ◽

Synthetic Datasets ◽

The Individual

We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, where the objective is to identify the outliers whose rewards are above a threshold. Distinct from the traditional TBP, the threshold is defined as a function of the rewards of all the arms, which is motivated by the criterion for identifying outliers. The learner needs to explore the rewards of the arms as well as the threshold. We refer to this problem as "double exploration for outlier detection". We construct an adaptively updated confidence interval for the threshold, based on the estimated value of the threshold in the previous rounds. Furthermore, by automatically trading off exploring the individual arms and exploring the outlier threshold, we provide an efficient algorithm in terms of the sample complexity. Experimental results on both synthetic datasets and real-world datasets demonstrate the efficiency of our algorithm.
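The abstract says only that the threshold is a function of the rewards of all arms. As a minimal sketch, assuming the threshold is the mean of the arms' empirical means plus k population standard deviations (an assumption for illustration, not stated in the abstract), the dependence of the threshold on every arm can be shown as follows. The function name `outlier_arms` and the sample data are hypothetical, and the sketch does not implement the adaptive sampling or the confidence intervals that the paper describes.

```python
import statistics


def outlier_arms(samples, k=1.5):
    """Flag arms whose empirical mean reward exceeds a data-dependent threshold.

    samples: one list of observed rewards per arm.
    The threshold rule (mean of arm means + k * population std of arm means)
    is an assumed criterion chosen for this illustration.
    """
    means = [statistics.fmean(s) for s in samples]
    threshold = statistics.fmean(means) + k * statistics.pstdev(means)
    outliers = [i for i, mu in enumerate(means) if mu > threshold]
    return threshold, outliers


# Hypothetical reward samples for five arms; the last arm is unusually good.
samples = [[0.1, 0.1], [0.2, 0.2], [0.15, 0.15], [0.1, 0.2], [0.9, 0.9]]
threshold, outliers = outlier_arms(samples, k=1.5)
print(threshold, outliers)
```

Because the threshold is computed from all arm means, pulling any arm changes both that arm's estimate and the threshold estimate, which is the source of the "double exploration" the abstract refers to.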

SOCIAL INTEREST FOR USER SELECTING ITEMS IN RECOMMENDER SYSTEMS

International Journal of Modern Physics C ◽

10.1142/s0129183113500228 ◽

2013 ◽

Vol 24 (04) ◽

pp. 1350022 ◽

Cited By ~ 7

Author(s):

DA-CHENG NIE ◽

MING-JING DING ◽

YAN FU ◽

JUN-LIN ZHOU ◽

ZI-KE ZHANG

Keyword(s):

Recommender Systems ◽

Real World ◽

Social Interest ◽

Experimental Results ◽

Simple Method ◽

The Social ◽

Social Interests ◽

Similarity Computation ◽

Real World Datasets

Recommender systems have developed rapidly and successfully. Such systems aim to help users find relevant items from a potentially overwhelming set of choices. However, most existing recommender algorithms focus on traditional user-item similarity computation rather than incorporating social interests into the recommender system. Each user has their own field of preference, and in their field of expertise they may influence their friends' preferences, which shows up as social interest in the items their friends collect. In order to model this social interest, in this paper we propose a simple method to compute users' social interest in specific items in a recommender system, and then integrate this social interest with similarity preference. The experimental results on two real-world datasets, Epinions and Friendfeed, show that this method can significantly improve not only the algorithmic precision-accuracy but also the diversity-accuracy.

Multi-Aspect Embedding for Attribute-Aware Trajectories

Symmetry ◽

10.3390/sym11091149 ◽

2019 ◽

Vol 11 (9) ◽

pp. 1149

Author(s):

Thapana Boonchoo ◽

Xiang Ao ◽

Qing He

Keyword(s):

Real World ◽

Execution Time ◽

State Of The Art ◽

Representation Learning ◽

Learning Approach ◽

Trajectory Data ◽

Trajectory Mining ◽

Trajectory Similarity ◽

Effectiveness And Efficiency ◽

Real World Datasets

With the proliferation of trajectory data produced by advanced GPS-enabled devices, trajectories are gaining in complexity and beginning to include additional attributes beyond the coordinates alone. As a consequence, this creates the potential to define the similarity between two attribute-aware trajectories. However, most existing trajectory similarity approaches focus only on location-based proximities and fail to capture the semantic similarities encompassed by these additional asymmetric attributes (aspects) of trajectories. In this paper, we propose multi-aspect embedding for attribute-aware trajectories (MAEAT), a representation learning approach for trajectories that simultaneously models the similarities according to their multiple aspects. MAEAT is built upon a sentence embedding algorithm and directly learns whole-trajectory embeddings by predicting the context aspect tokens given a trajectory. Two kinds of token generation methods are proposed to extract multiple aspects from the raw trajectories, and a regularization is devised to control the relative importance of the aspects. Extensive experiments on benchmark and real-world datasets show the effectiveness and efficiency of the proposed MAEAT compared to state-of-the-art and baseline methods. The results of MAEAT support representative downstream trajectory mining and management tasks well, and the algorithm outperforms the other compared methods in execution time by at least two orders of magnitude.

Heterogeneous Combat Network Link Prediction Based on Representation Learning

IEEE Systems Journal ◽

10.1109/jsyst.2020.3028168 ◽

2020 ◽

pp. 1-9

Author(s):

Wenhao Chen ◽

Jichao Li ◽

Jiang Jiang

Keyword(s):

Link Prediction ◽

Representation Learning ◽

Network Link

Self-weighted Multiview Clustering with Multiple Graphs

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/357 ◽

2017 ◽

Cited By ~ 44

Author(s):

Feiping Nie ◽

Jing Li ◽

Xuelong Li

Keyword(s):

Real World ◽

Spectral Clustering ◽

Experimental Results ◽

Clustering Method ◽

Elegant Method ◽

Multiview Learning ◽

Cluster Label ◽

Real World Datasets ◽

Synthetic Datasets ◽

Multiview Clustering

In multiview learning, it is essential to assign a reasonable weight to each view according to its importance. Thus, for the multiview clustering task, a wise and elegant method should cluster multiview data while learning the view weights. In this paper, we address this problem by exploring a Laplacian rank constrained graph, which can be approximated as the centroid of the graphs built for each view with different confidences. We start our work with the natural thought that the weights can be learned by introducing a hyperparameter. By analyzing the weakness of this approach, we further propose a new multiview clustering method which is totally self-weighted. Furthermore, once the target graph is obtained in our models, we can directly assign the cluster label to each data point and do not need any postprocessing such as $K$-means in standard spectral clustering. Evaluations on two synthetic datasets demonstrate the effectiveness of our methods. Compared with several representative graph-based multiview clustering approaches on four real-world datasets, experimental results demonstrate that the proposed methods achieve better performance and that our new clustering method is more practical to use.

Adaptively Multi-Objective Adversarial Training for Dialogue Generation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/397 ◽

2020 ◽

Author(s):

Xuemiao Zhang ◽

Zhouxing Tan ◽

Xiaoning Zhang ◽

Yang Cao ◽

Rui Yan

Keyword(s):

Real World ◽

Optimization Problem ◽

Experimental Results ◽

Sampling Distribution ◽

Multi Objective Optimization ◽

Generation Task ◽

Multi Objective ◽

Adversarial Models ◽

Adversarial Training ◽

Real World Datasets

Naive neural dialogue generation models tend to produce repetitive and dull utterances. The promising adversarial models train the generator against a well-designed discriminator to push it to improve in the expected direction. However, assessing dialogues requires consideration of many linguistic aspects, which are difficult to cover fully with a single discriminator. To address this, we reframe the dialogue generation task as a multi-objective optimization problem and propose a novel adversarial dialogue generation framework, called AMPGAN, with multiple discriminators that excel in different objectives for multiple linguistic aspects; its feasibility is proved by theoretical derivations. Moreover, we design an adaptively adjusted sampling distribution to balance the discriminators and promote the overall improvement of the generator by continuing to focus on those objectives on which the generator performs relatively poorly. Experimental results on two real-world datasets show a significant improvement over the baselines.

Discrete Embedding for Latent Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/170 ◽

2020 ◽

Cited By ~ 1

Author(s):

Hong Yang ◽

Ling Chen ◽

Minglong Lei ◽

Lingfeng Niu ◽

Chuan Zhou ◽

...

Keyword(s):

Real World ◽

Representation Learning ◽

Mixed Integer ◽

Information Cascades ◽

Integer Optimization ◽

Network Embedding ◽

Structure Information ◽

Proximity Matrix ◽

Real World Datasets ◽

Embedding Methods

Discrete network embedding emerged recently as a new direction of network representation learning. Compared with traditional network embedding models, discrete network embedding aims to compress model size and accelerate model inference by learning a set of short binary codes for network vertices. However, existing discrete network embedding methods usually assume that the network structures (e.g., edge weights) are readily available. In real-world scenarios such as social networks, it is sometimes impossible to collect explicit network structure information, which usually needs to be inferred from implicit data such as information cascades in the networks. To address this issue, we present an end-to-end discrete network embedding model for latent networks (DELN) that can learn binary representations from underlying information cascades. The essential idea is to infer a latent Weisfeiler-Lehman proximity matrix that captures node dependence based on information cascades and then to factorize the latent Weisfeiler-Lehman matrix under the binary node representation constraint. Since the learning problem is a mixed integer optimization problem, an efficient maximum likelihood estimation based cyclic coordinate descent (MLE-CCD) algorithm is used as the solution. Experiments on real-world datasets show that the proposed model outperforms state-of-the-art network embedding methods.
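To make the motivation for binary codes concrete, the following sketch shows how short binary node codes might be stored as integers and compared with Hamming distance, which needs only an XOR and a bit count. It illustrates the general idea of discrete embeddings and is not the DELN model; the helper names `pack_bits`, `hamming`, and `nearest`, and the example codes, are hypothetical.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single integer code."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def nearest(query, codes):
    """Return the node whose code is closest to the query node's code."""
    others = (n for n in codes if n != query)
    return min(others, key=lambda n: hamming(codes[query], codes[n]))


# Hypothetical 4-bit codes for three vertices.
codes = {
    "u": pack_bits([1, 0, 1, 1]),
    "v": pack_bits([1, 0, 1, 0]),
    "w": pack_bits([0, 1, 0, 0]),
}
print(hamming(codes["u"], codes["v"]), nearest("u", codes))
```

Each vertex here occupies a few bits rather than a vector of floating-point values, which is the kind of size and inference-time saving the abstract attributes to discrete embeddings.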

Exploiting POI-Specific Geographical Influence for Point-of-Interest Recommendation

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/539 ◽

2018 ◽

Cited By ~ 34

Author(s):

Hao Wang ◽

Huawei Shen ◽

Wentao Ouyang ◽

Xueqi Cheng

Keyword(s):

Social Networks ◽

Real World ◽

Fundamental Problem ◽

Physical Distance ◽

Experimental Results ◽

Point Of Interest ◽

Poi Recommendation ◽

Movie Recommendation ◽

Real World Datasets ◽

Location Based Social Networks

Point-of-interest (POI) recommendation, i.e., recommending unvisited POIs to users, is a fundamental problem for location-based social networks. POI recommendation distinguishes itself from traditional item recommendation, e.g., movie recommendation, via geographical influence among POIs. Existing methods model the geographical influence between two POIs as the probability or propensity that the two POIs are co-visited by the same user given their physical distance. These methods assume that geographical influence between POIs is determined by their physical distance, failing to capture the asymmetry of geographical influence and the high variation of geographical influence across POIs. In this paper, we exploit POI-specific geographical influence to improve POI recommendation. We model the geographical influence between two POIs using three factors: the geo-influence of a POI, the geo-susceptibility of a POI, and their physical distance. Geo-influence captures a POI's capacity to exert geographical influence on other POIs, and geo-susceptibility reflects a POI's propensity to be geographically influenced by other POIs. Experimental results on two real-world datasets demonstrate that POI-specific geographical influence significantly improves the performance of POI recommendation.
