UTSM: A Trajectory Similarity Measure Considering Uncertainty Based on an Amended Ellipse Model

2019 ◽  
Vol 8 (11) ◽  
pp. 518 ◽  
Author(s):  
Ning Guo ◽  
Shashi Shekhar ◽  
Wei Xiong ◽  
Luo Chen ◽  
Ning Jing

Measuring the similarity between a pair of trajectories is the basis of many spatiotemporal clustering methods and has wide applications in trajectory pattern mining. However, most measures of trajectory similarity in the literature are based on precise models that ignore the inherent uncertainty in trajectory data recorded by sensors. Traditional computing or mining approaches that assume trajectories are precise and exact therefore risk underperforming or returning incorrect results. To address this problem, we propose an amended ellipse model that takes both interpolation error and positioning error into account by using the motion features of a trajectory to compute the ellipse’s shape parameters. We also propose UTSM, a similarity measure based on this model that accounts for uncertainty. We validate the approach experimentally on both synthetic and real-world data and show that UTSM is not only more robust to noise and outliers but also more tolerant of different sampling frequencies and asynchronous sampling of trajectories.
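For orientation only, the conventional (unamended) ellipse model of interpolation uncertainty can be sketched as below. It assumes that, between two consecutive samples, an object with a known maximum speed must lie inside the ellipse whose foci are the two sampled positions. The function name `in_uncertainty_ellipse` and the `v_max` parameter are illustrative assumptions; the abstract does not specify how the amended model derives its shape parameters, so this is not the paper's model.

```python
import math

def in_uncertainty_ellipse(q, p1, t1, p2, t2, v_max):
    """Return True if point q is reachable between samples (p1, t1) and (p2, t2).

    Assumption: the object never exceeds v_max, so the total path length
    p1 -> q -> p2 cannot exceed v_max * (t2 - t1). The set of such q is an
    ellipse with foci p1 and p2 and major-axis length v_max * (t2 - t1).
    """
    major_axis = v_max * (t2 - t1)
    return math.dist(p1, q) + math.dist(q, p2) <= major_axis
```

A faster object or a longer sampling gap enlarges the ellipse, which is why sparse sampling increases positional uncertainty between fixes.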

Author(s):  
Yingchi Mao ◽  
Haishi Zhong ◽  
Xianjian Xiao ◽  
Xiaofang Li

With the rapid spread of handheld smart devices with built-in GPS, the volume of trajectory data from GPS sensors has grown explosively. Trajectory data has spatio-temporal characteristics and carries rich information. Trajectory data processing techniques can mine the patterns of human activities and the movement patterns of vehicles in intelligent transportation systems. Trajectory similarity measurement is one of the most important issues in trajectory data mining (clustering, classification, frequent pattern mining, etc.). Unfortunately, the main similarity measure algorithms for trajectory data have been found to be inaccurate, highly sensitive to sampling methods, and not robust to noisy data. To solve these problems, three distances and their corresponding computation methods are proposed in this paper. The point-segment distance decreases the sensitivity to point sampling methods. The prediction distance optimizes the temporal distance using the features of trajectory data. The segment-segment distance introduces the trajectory shape factor into the similarity measurement to improve accuracy. The three distances are integrated with the traditional dynamic time warping (DTW) algorithm to produce a new segment-based dynamic time warping algorithm (SDTW). The experimental results show that SDTW achieves about 57%, 86%, and 31% better accuracy than the longest common subsequence (LCSS) algorithm, the edit distance on real sequence (EDR) algorithm, and DTW, respectively, and that its sensitivity to noisy data is lower than that of those algorithms.
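As a point of reference, the traditional point-based DTW that SDTW extends can be sketched as follows. This is the textbook recurrence with a Euclidean point-to-point cost, not the paper's segment-based variant; the function name `dtw_distance` and the choice of cost are assumptions made for illustration.

```python
import math

def dtw_distance(a, b):
    """Classic DTW between two sequences of coordinate tuples.

    cost[i][j] holds the minimal cumulative cost of aligning the first i
    points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point cost (assumed)
            # Extend the cheapest of: repeat b[j-1], repeat a[i-1], or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because every point must be matched to some point on the other trajectory, a single outlier contributes its full distance to the total, which is one reason point-based DTW is sensitive to noise and sampling.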


2021 ◽  
Author(s):  
Antonios Makris ◽  
Camila Leite da Silva ◽  
Vania Bogorny ◽  
Luis Otavio Alvares ◽  
Jose Antonio Macedo ◽  
...  

During the last few years the volumes of the data that synthesize trajectories have expanded to unparalleled quantities. This growth is challenging traditional trajectory analysis approaches and solutions are sought in other domains. In this work, we focus on data compression techniques with the intention of minimizing the size of trajectory data while, at the same time, minimizing the impact on trajectory analysis methods. To this end, we evaluate five lossy compression algorithms: Douglas-Peucker (DP), Time Ratio (TR), Speed Based (SP), Time Ratio Speed Based (TR_SP) and Speed Based Time Ratio (SP_TR). The comparison is performed using four distinct real-world datasets against six different dynamically assigned thresholds. The effectiveness of the compression is evaluated using classification techniques and similarity measures. The results show that there is a trade-off between the compression rate and the achieved quality. There is no “best algorithm” for every case, and the choice of the proper compression algorithm is an application-dependent process.
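Of the five algorithms named above, Douglas-Peucker is the most widely documented, and a minimal sketch of its standard recursive form is given here. It assumes planar (x, y) points and a fixed distance threshold `epsilon`; the dynamically assigned thresholds used in the study are not described in the abstract and are not reproduced. The names `douglas_peucker` and `_perp_distance` are illustrative.

```python
import math

def _perp_distance(p, start, end):
    """Distance from p to the line through start and end (to start if they coincide)."""
    (x, y), (x1, y1), (x2, y2) = p, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length

def douglas_peucker(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the chord joining the endpoints.
    max_d, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _perp_distance(points[i], points[0], points[-1])
        if d > max_d:
            max_d, idx = d, i
    if max_d <= epsilon:
        return [points[0], points[-1]]
    # Otherwise keep that point and simplify both halves recursively.
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right  # drop the duplicated split point
```

A larger `epsilon` discards more points, which illustrates the trade-off between compression rate and retained quality that the abstract reports.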


2021 ◽  
Vol 10 (2) ◽  
pp. 90
Author(s):  
Jin Zhu ◽  
Dayu Cheng ◽  
Weiwei Zhang ◽  
Ci Song ◽  
Jie Chen ◽  
...  

People spend more than 80% of their time in indoor spaces, such as shopping malls and office buildings. Indoor trajectories collected by indoor positioning devices, such as WiFi and Bluetooth devices, can reflect human movement behaviors in indoor spaces. Insightful indoor movement patterns can be discovered from indoor trajectories using various clustering methods. These methods are based on a measure that reflects the degree of similarity between indoor trajectories. Researchers have proposed many trajectory similarity measures. However, existing trajectory similarity measures ignore the indoor movement constraints imposed by the indoor space and the characteristics of indoor positioning sensors, which leads to an inaccurate measure of indoor trajectory similarity. Additionally, most of these works focus on the spatial and temporal dimensions of trajectories and pay less attention to indoor semantic information. Integrating indoor semantic information, such as indoor points of interest, into the indoor trajectory similarity measurement helps to discover pedestrians with similar intentions. In this paper, we propose an accurate and reasonable indoor trajectory similarity measure called the indoor semantic trajectory similarity measure (ISTSM), which considers the features of indoor trajectories and indoor semantic information simultaneously. The ISTSM is modified from the edit distance, which is a measure of the distance between string sequences. The key component of the ISTSM is an indoor navigation graph that is transformed from an indoor floor plan representing the indoor space and is used to compute accurate indoor walking distances. The indoor walking distances and indoor semantic information are fused into the edit distance seamlessly. The ISTSM is evaluated using a synthetic dataset and a real dataset from a shopping mall. The experiment with the synthetic dataset reveals that the ISTSM is more accurate and reasonable than three other popular trajectory similarity measures, namely the longest common subsequence (LCSS), edit distance on real sequence (EDR), and the multidimensional similarity measure (MSM). The case study of a shopping mall shows that the ISTSM effectively reveals the movement patterns of indoor customers.
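The edit distance from which the ISTSM is derived can be sketched in its standard unit-cost form as follows. This is the ordinary Levenshtein distance, not the ISTSM: the abstract states that walking distances and semantic information are fused into the costs, but it does not give the cost functions, so none are assumed here. The function name `edit_distance` is illustrative.

```python
def edit_distance(s, t):
    """Unit-cost edit distance between two sequences (strings or lists)."""
    # prev[j] is the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # deleting the first i items of s yields the empty prefix of t
        for j, b in enumerate(t, start=1):
            substitute = prev[j - 1] + (a != b)  # 0 if the items match, else 1
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

Because the function only compares items for equality, it works on sequences of visited places as well as on strings, for example `edit_distance(["shop", "cafe"], ["shop", "exit"])`.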


2017 ◽  
Vol 71 (1) ◽  
pp. 100-116 ◽  
Author(s):  
Kai Sheng ◽  
Zhong Liu ◽  
Dechao Zhou ◽  
Ailin He ◽  
Chengxu Feng

It is important for maritime authorities to effectively classify and identify unknown types of ships in historical trajectory data. This paper uses a logistic regression model to construct a ship classifier by utilising the features extracted from ship trajectories. First of all, three basic movement patterns are proposed according to ship sailing characteristics, with related sub-trajectory partitioning algorithms. Subsequently, three categories of trajectory features with their extraction methods are presented. Finally, a case study on building a model for classifying fishing boats and cargo ships based on real Automatic Identification System (AIS) data is given. Experimental results indicate that the proposed classification method can meet the needs of recognising uncertain types of targets in historical trajectory data, laying a foundation for further research on camouflaged ship identification, behaviour pattern mining, outlier behaviour detection and other applications.


2021 ◽  
Vol 10 (11) ◽  
pp. 757
Author(s):  
Pin Nie ◽  
Zhenjie Chen ◽  
Nan Xia ◽  
Qiuhao Huang ◽  
Feixue Li

Automatic Identification System (AIS) data have been widely used in many fields, such as collision detection, navigation, and maritime traffic management. Similarity analysis is an important process for most AIS trajectory analysis topics. However, most traditional AIS trajectory similarity analysis methods calculate the distance between trajectory points, which requires complex and time-consuming calculations, often leading to substantial errors when processing AIS trajectory data characterized by substantial differences in length or uneven trajectory points. Therefore, we propose a cell-based similarity analysis method that combines the weight of the direction and k-neighborhood (WDN-SIM). This method quantifies the similarity between trajectories based on the degree of proximity and differences in motion direction. In terms of its effectiveness and efficiency, WDN-SIM outperformed seven traditional methods for trajectory similarity analysis. Particularly, WDN-SIM has a high robustness to noise and can distinguish the similarities between trajectories under complex situations, such as when there are opposing directions of motion, large differences in length, and uneven point distributions.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Xin Wang ◽  
Xinzheng Niu ◽  
Jiahui Zhu ◽  
Zuoyan Liu

Nowadays, large volumes of multimodal data have been collected for analysis. An important type of data is trajectory data, which contains both time and space information. Trajectory analysis and clustering are essential to learn the patterns of moving objects. Computing trajectory similarity is a key aspect of trajectory analysis, but it is very time consuming. To address this issue, this paper presents an improved branch and bound strategy based on time slice segmentation, which reduces the time to obtain the similarity matrix by decreasing the number of distance calculations required to compute similarity. Then, the similarity matrix is transformed into a trajectory graph and a community detection algorithm is applied to it for clustering. Extensive experiments were conducted to compare the proposed algorithms with existing similarity measures and clustering algorithms. Results show that the proposed method can effectively mine trajectory cluster information from spatiotemporal trajectories.


Author(s):  
Jacopo Vanoli ◽  
Consuelo Rubina Nava ◽  
Chiara Airoldi ◽  
Andrealuna Ucciero ◽  
Virginio Salvi ◽  
...  

While state sequence analysis (SSA) has been long used in social sciences, its use in pharmacoepidemiology is still in its infancy. Indeed, this technique is relatively easy to use, and its intrinsic visual nature may help investigators to untangle the latent information within prescription data, facilitating the individuation of specific patterns and possible inappropriate use of medications. In this paper, we provide an educational primer of the most important learning concepts and methods of SSA, including measurement of dissimilarities between sequences, the application of clustering methods to identify sequence patterns, the use of complexity measures for sequence patterns, the graphical visualization of sequences, and the use of SSA in predictive models. As a worked example, we present an application of SSA to opioid prescription patterns in patients with non-cancer pain, using real-world data from Italy. We show how SSA allows the identification of patterns in prescriptions in these data that might not be evident using standard statistical approaches and how these patterns are associated with future discontinuation of opioid therapy.


Author(s):  
Xinning Zhu ◽  
Tianyue Sun ◽  
Hao Yuan ◽  
Zheng Hu ◽  
Jiansong Miao

Identifying group movement patterns of crowds and understanding group behaviors is valuable for urban planners, especially when the groups are special, such as tourist groups. In this paper, we present a framework to discover tourist groups and investigate tourist behaviors using mobile phone call detail records (CDRs). Unlike GPS data, CDRs have relatively poor spatial resolution and low sampling rates, which makes it a big challenge to identify group members from thousands of tourists. Moreover, since touristic trips do not occur on a regular basis, no historical data of the specific group can be used to reduce the uncertainty of trajectories. To address such challenges, we propose a method called group movement pattern mining based on similarity (GMPMS) to discover tourist groups. To avoid large numbers of trajectory similarity measurements, snapshots of the trajectories are first generated to extract candidate groups containing co-occurring tourists. Then, considering that different groups may follow the same itineraries, additional traveling behavioral features are defined to identify the group members. Finally, with Hainan province as an example, we provide a number of interesting insights into the travel behaviors of group tours as well as individual tours, which will be helpful for tourism planning and management.


Author(s):  
Hiroto Saigo ◽  
Koji Tsuda

A graph is a mathematical framework that allows us to represent and manage many kinds of real-world data, such as relational data, multimedia data and biomedical data. When each data point is represented as a graph and we are given a number of graphs, one task is to extract a few common patterns that capture the properties of each population. Frequent graph mining algorithms such as AGM, gSpan and Gaston can enumerate all the frequent patterns in graph data; however, the number of patterns grows exponentially, so it is essential to output only discriminative patterns. There is much existing research on this topic, but this chapter focuses on the use of matrix decomposition techniques and explains the two general cases where either i) no target label is available, or ii) a target label is available for each data point. The resulting method is a branch and bound pattern mining algorithm with an efficient pruning condition, and we evaluate its effectiveness on cheminformatics data.


Author(s):  
Longbing Cao ◽  
Chengqi Zhang

Traditional data mining based on quantitative intelligence is facing grand challenges from real-world enterprise and cross-organization applications. For instance, the usual demonstration of specific algorithms cannot support business users in taking actions that serve their advantage and needs. We think this is due to a data-driven philosophy focused on quantitative intelligence. It either views data mining as an autonomous data-driven, trial-and-error process, or only analyzes business issues in an isolated, case-by-case manner. Based on experience and lessons learnt from real-world data mining and complex systems, this article proposes a practical data mining methodology referred to as Domain-Driven Data Mining. On top of quantitative intelligence and hidden knowledge in data, domain-driven data mining aims to meta-synthesize quantitative intelligence and qualitative intelligence in mining complex applications in which humans are in the loop. It targets actionable knowledge discovery in a constrained environment to satisfy user preferences. The domain-driven methodology consists of key components including understanding the constrained environment, a business-technical questionnaire, representing and involving domain knowledge, human-mining cooperation and interaction, constructing next-generation mining infrastructure, in-depth pattern mining and postprocessing, business interestingness and actionability enhancement, and loop-closed human-cooperated iterative refinement. Domain-driven data mining complements the data-driven methodology; the metasynthesis of qualitative and quantitative intelligence has the potential to discover knowledge from complex systems and to enhance knowledge actionability for practical use by industry and business.

