Trajectory Similarity Learning with Auxiliary Supervision and Optimal Matching

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/444 ◽

2020 ◽

Author(s):

Hanyuan Zhang ◽

Xinyu Zhang ◽

Qize Jiang ◽

Baihua Zheng ◽

Zhenbang Sun ◽

...

Keyword(s):

Real World ◽

Representation Learning ◽

Similarity Learning ◽

Trajectory Data ◽

Optimal Matching ◽

Training Samples ◽

Trajectory Similarity ◽

Similarity Computation ◽

Real World Datasets ◽

Relationship Of

Trajectory similarity computation is a core problem in the field of trajectory data queries. However, the high time complexity of calculating trajectory similarity has always been a bottleneck in real-world applications. Learning-based methods can map trajectories into a uniform embedding space so that the similarity of two trajectories can be calculated from their embeddings in constant time. In this paper, we propose a novel trajectory representation learning framework, Traj2SimVec, that performs scalable and robust trajectory similarity computation. We use a simple and fast trajectory simplification and indexing approach to obtain triplet training samples efficiently. We make the framework more robust by making full use of sub-trajectory similarity information as auxiliary supervision. Furthermore, the framework supports the point matching query by modeling the optimal matching relationship of trajectory points under different distance metrics. Comprehensive experiments on real-world datasets demonstrate that our model substantially outperforms all existing approaches.
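As a rough illustration of why embedding-based comparison is cheap, the sketch below compares two fixed-length vectors and evaluates a generic triplet margin loss of the kind commonly paired with triplet training samples. It is not the Traj2SimVec implementation: the function names, the choice of Euclidean distance, and the margin value are assumptions made for illustration only.

```python
import math


def embedding_distance(u, v):
    """Euclidean distance between two fixed-length embedding vectors.

    The cost depends only on the embedding dimension, not on how many
    points the original trajectories contained.
    """
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic hinge-style triplet loss (an assumed form, not the paper's).

    The loss is zero once the negative sample is at least `margin`
    farther from the anchor than the positive sample is.
    """
    d_pos = embedding_distance(anchor, positive)
    d_neg = embedding_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    similar = [3.0, 4.0]      # hypothetical embedding of a similar trajectory
    dissimilar = [6.0, 8.0]   # hypothetical embedding of a dissimilar one
    print(triplet_margin_loss(anchor, similar, dissimilar))
```

In this sketch a training triplet contributes no loss when the ordering of distances already matches the ordering of the ground-truth similarities by the chosen margin.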

Multi-Aspect Embedding for Attribute-Aware Trajectories

Symmetry ◽

10.3390/sym11091149 ◽

2019 ◽

Vol 11 (9) ◽

pp. 1149

Author(s):

Thapana Boonchoo ◽

Xiang Ao ◽

Qing He

Keyword(s):

Real World ◽

Execution Time ◽

State Of The Art ◽

Representation Learning ◽

Learning Approach ◽

Trajectory Data ◽

Trajectory Mining ◽

Trajectory Similarity ◽

Effectiveness And Efficiency ◽

Real World Datasets

Motivated by the proliferation of trajectory data produced by advanced GPS-enabled devices, trajectories are gaining in complexity and beginning to carry additional attributes beyond the coordinates alone. As a consequence, this creates the potential to define the similarity between two attribute-aware trajectories. However, most existing trajectory similarity approaches focus only on location-based proximities and fail to capture the semantic similarities encompassed by these additional asymmetric attributes (aspects) of trajectories. In this paper, we propose multi-aspect embedding for attribute-aware trajectories (MAEAT), a representation learning approach for trajectories that simultaneously models the similarities according to their multiple aspects. MAEAT is built upon a sentence embedding algorithm and directly learns a whole-trajectory embedding by predicting the context aspect tokens when given a trajectory. Two kinds of token generation methods are proposed to extract multiple aspects from the raw trajectories, and a regularization is devised to control the importance among aspects. Extensive experiments on benchmark and real-world datasets show the effectiveness and efficiency of the proposed MAEAT compared to the state-of-the-art and baseline methods. The results of MAEAT can well support representative downstream trajectory mining and management tasks, and the algorithm outperforms other compared methods in execution time by at least two orders of magnitude.

Evaluating the effect of compressing algorithms for trajectory similarity and classification problems

GeoInformatica ◽

10.1007/s10707-021-00434-1 ◽

2021 ◽

Author(s):

Antonios Makris ◽

Camila Leite da Silva ◽

Vania Bogorny ◽

Luis Otavio Alvares ◽

Jose Antonio Macedo ◽

...

Keyword(s):

Trajectory Analysis ◽

Similarity Measures ◽

Classification Problems ◽

Trajectory Data ◽

Compression Algorithms ◽

Time Ratio ◽

Ratio Speed ◽

Trajectory Similarity ◽

Real World Datasets ◽

The Impact

During the last few years the volumes of the data that synthesize trajectories have expanded to unparalleled quantities. This growth is challenging traditional trajectory analysis approaches and solutions are sought in other domains. In this work, we focus on data compression techniques with the intention to minimize the size of trajectory data, while, at the same time, minimizing the impact on the trajectory analysis methods. To this end, we evaluate five lossy compression algorithms: Douglas-Peucker (DP), Time Ratio (TR), Speed Based (SP), Time Ratio Speed Based (TR_SP) and Speed Based Time Ratio (SP_TR). The comparison is performed using four distinct real-world datasets against six different dynamically assigned thresholds. The effectiveness of the compression is evaluated using classification techniques and similarity measures. The results showed that there is a trade-off between the compression rate and the achieved quality. There is no “best algorithm” for every case and the choice of the proper compression algorithm is an application-dependent process.
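Of the five algorithms named above, Douglas-Peucker is the most widely documented, so a minimal sketch of it may help make the compression idea concrete. The version below is a textbook formulation under stated assumptions, not the code evaluated in the paper: it treats points as planar (x, y) pairs, measures distance to the clamped segment rather than to the infinite line, and the names `point_segment_distance`, `douglas_peucker`, and `epsilon` are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Drop points lying within `epsilon` of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # avoid duplicating the split point


if __name__ == "__main__":
    track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    print(douglas_peucker(track, epsilon=0.5))
```

A larger `epsilon` removes more points, which is one way to see the trade-off between compression rate and retained quality that the abstract reports.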

SOCIAL INTEREST FOR USER SELECTING ITEMS IN RECOMMENDER SYSTEMS

International Journal of Modern Physics C ◽

10.1142/s0129183113500228 ◽

2013 ◽

Vol 24 (04) ◽

pp. 1350022 ◽

Cited By ~ 7

Author(s):

DA-CHENG NIE ◽

MING-JING DING ◽

YAN FU ◽

JUN-LIN ZHOU ◽

ZI-KE ZHANG

Keyword(s):

Recommender Systems ◽

Real World ◽

Social Interest ◽

Experimental Results ◽

Simple Method ◽

The Social ◽

Social Interests ◽

Similarity Computation ◽

Real World Datasets

Recommender systems have developed rapidly and successfully. Such a system aims to help users find relevant items from a potentially overwhelming set of choices. However, most existing recommender algorithms focus on traditional user-item similarity computation rather than incorporating social interests into the recommender system. Each user has their own field of preference, and they may influence their friends' preferences within that field of expertise when the social interest in their friends' item collecting is considered. In order to model this social interest, in this paper we propose a simple method to compute users' social interest in specific items in recommender systems, and then integrate this social interest with similarity preference. The experimental results on two real-world datasets, Epinions and Friendfeed, show that this method can significantly improve not only the algorithmic precision-accuracy but also the diversity-accuracy.

T3S: Effective Representation Learning for Trajectory Similarity Computation

2021 IEEE 37th International Conference on Data Engineering (ICDE) ◽

10.1109/icde51399.2021.00221 ◽

2021 ◽

Author(s):

Peilun Yang ◽

Hanchen Wang ◽

Ying Zhang ◽

Lu Qin ◽

Wenjie Zhang ◽

...

Keyword(s):

Representation Learning ◽

Trajectory Similarity ◽

Similarity Computation ◽

Effective Representation

Online Multitask Relative Similarity Learning

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/253 ◽

2017 ◽

Cited By ~ 2

Author(s):

Shuji Hao ◽

Peilin Zhao ◽

Yong Liu ◽

Steven C. H. Hoi ◽

Chunyan Miao

Keyword(s):

Real World ◽

Learning Algorithm ◽

Learning Problems ◽

Similarity Function ◽

Learning Approaches ◽

Similarity Learning ◽

Real World Data ◽

Real World Datasets ◽

Online Learning Algorithm ◽

Relative Similarity

Relative similarity learning (RSL) aims to learn similarity functions from data with relative constraints. Most previous algorithms developed for RSL are batch-based learning approaches which suffer from poor scalability when dealing with real-world data arriving sequentially. These methods are often designed to learn a single similarity function for a specific task. Therefore, they may be sub-optimal for solving multiple task learning problems. To overcome these limitations, we propose a scalable RSL framework named OMTRSL (Online Multi-Task Relative Similarity Learning). Specifically, we first develop a simple yet effective online learning algorithm for multi-task relative similarity learning. Then, we also propose an active learning algorithm to save the labeling cost. The proposed algorithms not only enjoy theoretical guarantees, but also show high efficacy and efficiency in extensive experiments on real-world datasets.

Discrete Embedding for Latent Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/170 ◽

2020 ◽

Cited By ~ 1

Author(s):

Hong Yang ◽

Ling Chen ◽

Minglong Lei ◽

Lingfeng Niu ◽

Chuan Zhou ◽

...

Keyword(s):

Real World ◽

Representation Learning ◽

Mixed Integer ◽

Information Cascades ◽

Integer Optimization ◽

Network Embedding ◽

Structure Information ◽

Proximity Matrix ◽

Real World Datasets ◽

Embedding Methods

Discrete network embedding emerged recently as a new direction of network representation learning. Compared with traditional network embedding models, discrete network embedding aims to compress model size and accelerate model inference by learning a set of short binary codes for network vertices. However, existing discrete network embedding methods usually assume that the network structures (e.g., edge weights) are readily available. In real-world scenarios such as social networks, sometimes it is impossible to collect explicit network structure information and it usually needs to be inferred from implicit data such as information cascades in the networks. To address this issue, we present an end-to-end discrete network embedding model for latent networks, DELN, that can learn binary representations from underlying information cascades. The essential idea is to infer a latent Weisfeiler-Lehman proximity matrix that captures node dependence based on information cascades and then to factorize the latent Weisfeiler-Lehman matrix under the binary node representation constraint. Since the learning problem is a mixed integer optimization problem, an efficient maximum likelihood estimation based cyclic coordinate descent (MLE-CCD) algorithm is used as the solution. Experiments on real-world datasets show that the proposed model outperforms the state-of-the-art network embedding methods.

Simandro-plus: On computing similarity of android applications

Computer Science and Information Systems ◽

10.2298/csis210208036h ◽

2021 ◽

pp. 36-36

Author(s):

Masoud Hamedani ◽

Sang-Wook Kim

Keyword(s):

Real World ◽

State Of The Art ◽

Similarity Score ◽

The State ◽

The Other ◽

Android Applications ◽

Similarity Computation ◽

Real World Datasets

In this paper, we propose SimAndro-Plus as an improved variant of the state-of-the-art method, SimAndro, to compute the similarity of Android applications (apps) regarding their functionalities. SimAndro-Plus has two major differences from SimAndro: 1) it exploits two features beneficial to similarity computation, which are totally disregarded by SimAndro; 2) to compute the similarity score of an app-pair based on strings and package name features, SimAndro-Plus considers not only those terms co-appearing in both apps but also those terms appearing in one app while missing in the other one. The results of our extensive experiments with three real-world datasets and a dataset constructed by human experts demonstrate that 1) each of the two aforementioned differences is really effective in achieving better accuracy and 2) SimAndro-Plus outperforms SimAndro in similarity computation by 14% on average.

Bursting the Filter Bubble: Fairness-Aware Network Link Prediction

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5429 ◽

2020 ◽

Vol 34 (01) ◽

pp. 841-848

Author(s):

Farzan Masrour ◽

Tyler Wilson ◽

Heng Yan ◽

Pang-Ning Tan ◽

Abdol Esfahanian

Keyword(s):

Social Networking ◽

Real World ◽

Link Prediction ◽

Representation Learning ◽

Experimental Results ◽

Adversarial Network ◽

Network Link ◽

Network Modularity ◽

Filter Bubble ◽

Real World Datasets

Link prediction is an important task in online social networking as it can be used to infer new or previously unknown relationships of a network. However, due to the homophily principle, current algorithms are susceptible to promoting links that may lead to increased segregation of the network, an effect known as the filter bubble. In this study, we examine the filter bubble problem from the perspective of algorithm fairness and introduce a dyadic-level fairness criterion based on a network modularity measure. We show how the criterion can be utilized as a postprocessing step to generate more heterogeneous links in order to overcome the filter bubble problem. In addition, we also present a novel framework that combines adversarial network representation learning with supervised link prediction to alleviate the filter bubble problem. Experimental results on several real-world datasets showed the effectiveness of the proposed methods compared to other baseline approaches, which include conventional link prediction and fairness-aware methods for i.i.d. data.

Learning Sequential Correlation for User Generated Textual Content Popularity Prediction

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/225 ◽

2018 ◽

Cited By ~ 6

Author(s):

Wen Wang ◽

Wei Zhang ◽

Jun Wang ◽

Junchi Yan ◽

Hongyuan Zha

Keyword(s):

Real World ◽

Information Overload ◽

Representation Learning ◽

Text Representation ◽

Sequential Model ◽

Content Popularity ◽

Popularity Prediction ◽

Real World Datasets ◽

Textual Content ◽

The Web

Popularity prediction of user-generated textual content is critical for prioritizing information on the web, which alleviates heavy information overload for ordinary readers. Most previous studies model each content instance separately for prediction and thus overlook the sequential correlations between instances of a specific user. In this paper, we go deeper into this problem based on two observations for each user, i.e., sequential content correlation and sequential popularity correlation. We propose a novel deep sequential model called User Memory-augmented recurrent Attention Network (UMAN). This model encodes the two correlations by updating external user memories, which are further leveraged for target text representation learning and popularity prediction. The experimental results on several real-world datasets validate the benefits of considering these correlations and demonstrate that UMAN achieves the best performance among several strong competitors.

Stochastic Recursive Gradient Support Pursuit and Its Sparse Representation Applications

Sensors ◽

10.3390/s20174902 ◽

2020 ◽

Vol 20 (17) ◽

pp. 4902

Author(s):

Fanhua Shang ◽

Bingkun Wei ◽

Yuanyuan Liu ◽

Hongying Liu ◽

Shuang Wang ◽

...

Keyword(s):

Sparse Representation ◽

Real World ◽

Large Scale ◽

Matching Pursuit ◽

Linear Convergence ◽

Representation Learning ◽

Optimization Methods ◽

Hard Thresholding ◽

Real World Datasets ◽

Norm Constraint

In recent years, a series of matching pursuit and hard thresholding algorithms have been proposed to solve the sparse representation problem with an ℓ0-norm constraint. In addition, some stochastic hard thresholding methods were also proposed, such as stochastic gradient hard thresholding (SG-HT) and stochastic variance reduced gradient hard thresholding (SVRGHT). However, each iteration of these algorithms requires one hard thresholding operation, which leads to a high per-iteration complexity and slow convergence, especially for high-dimensional problems. To address this issue, we propose a new stochastic recursive gradient support pursuit (SRGSP) algorithm, in which only one hard thresholding operation is required in each outer-iteration. Thus, SRGSP has a significantly lower computational complexity than existing methods such as SG-HT and SVRGHT. Moreover, we also provide the convergence analysis of SRGSP, which shows that SRGSP attains a linear convergence rate. Our experimental results on large-scale synthetic and real-world datasets verify that SRGSP outperforms state-of-the-art related methods for tackling various sparse representation problems. Moreover, we conduct many experiments on two real-world sparse representation applications, image denoising and face recognition, and all the results also validate that our SRGSP algorithm obtains much better performance than other sparse representation learning optimization methods in terms of PSNR and recognition rates.
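The hard thresholding operation that the abstract counts per iteration can be sketched in a few lines. The version below is the standard operator (keep the k largest-magnitude entries, zero the rest) and is not the SRGSP algorithm itself; the function name `hard_threshold` and the sort-based selection are assumptions chosen for readability rather than speed.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the others to zero.

    The result has at most k nonzero entries, so it satisfies an
    l0-norm constraint of k.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if k >= len(x):
        return list(x)
    # Indices ordered by decreasing magnitude; the sort makes this step
    # O(n log n), which is part of why calling it often is costly.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


if __name__ == "__main__":
    print(hard_threshold([0.5, -3.0, 2.0, 0.1], k=2))
```

Because the operator touches every coordinate, its cost grows with the problem dimension, which is consistent with the abstract's motivation for invoking it only once per outer iteration.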