A DBSCAN based Algorithm for Ship Spot Area Detection in AIS Trajectory Data

2019 ◽  
Vol 291 ◽  
pp. 01008 ◽  
Author(s):  
Bao Lei

The big data acquired by the AIS system contain abundant maritime traffic information. With the wide application of data mining in various fields in recent years, the mining of AIS data has drawn the attention of related researchers. Based on ship AIS location data, this paper studies a spot area detection algorithm. First, the sample data are pre-processed from the original data, and the residence points of each ship are identified according to the ship's speed and course change. Then a DBSCAN-based clustering algorithm is used to cluster several latitude and longitude lattices, that is, spot areas. Experiments on real AIS data sets show that the algorithm is efficient and correct.
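The two steps named in this abstract (residence point identification, then density-based clustering) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the 0.5-knot speed threshold, and the use of plain Euclidean distance on latitude/longitude degrees are all assumptions, and the course-change criterion mentioned in the abstract is omitted.

```python
import math

def stay_points(track, max_speed_knots=0.5):
    """Keep AIS fixes whose reported speed is at or below a threshold.

    track: list of (lat, lon, speed_knots) tuples. The threshold is a
    hypothetical value chosen for illustration only.
    """
    return [(lat, lon) for lat, lon, sog in track if sog <= max_speed_knots]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force range query; includes the point itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels
```

Each resulting cluster of low-speed fixes would correspond to one candidate spot area. For real data, a haversine distance and a spatial index would likely replace the brute-force Euclidean query.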

2021 ◽  
Vol 2078 (1) ◽  
pp. 012008
Author(s):  
Hui Liu ◽  
Keyang Cheng

Aiming at the problem of false detection and missed detection of small targets and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, because PANet, the YOLOv4 multi-scale feature fusion module, does not consider the interaction between scales, PANet is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion; then, dilated convolution is introduced to reduce information loss during downsampling; finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified to detect a single category. The experimental results show that, on the INRIA and WiderPerson data sets under different congestion conditions, the improved pedestrian detection algorithm reaches an AP of 96.83% and 59.67%, respectively. Compared with the pedestrian detection results of the YOLOv4 model, this is an improvement of 2.41% and 1.03%, respectively. False detection and missed detection of small and occluded targets are significantly reduced.
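The anchor box redesign step can be illustrated with a small K-means over box widths and heights. The abstract does not state the distance measure, so the sketch below assumes the convention common in the YOLO family, where a box is assigned to the anchor with which it has the highest IoU. The function names and sample boxes are hypothetical.

```python
import random

def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Highest IoU is equivalent to smallest (1 - IoU) distance.
            best = max(range(k), key=lambda a: iou_wh(b, anchors[a]))
            groups[best].append(b)
        new_anchors = []
        for a, g in enumerate(groups):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[a])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

For a single-category detector such as a pedestrian detector, the boxes passed in would be the labeled widths and heights of that one class, which tends to produce tall, narrow anchors.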


2011 ◽  
Vol 34 (7) ◽  
pp. 850-861 ◽  
Author(s):  
Guan Yuan ◽  
Shixiong Xia ◽  
Lei Zhang ◽  
Yong Zhou ◽  
Cheng Ji

With the development of location-based services, such as the Global Positioning System and Radio Frequency Identification, a great deal of trajectory data can be collected. Therefore, how to mine knowledge from these data has become an attractive topic. In this paper, we propose an efficient trajectory-clustering algorithm based on an index tree. Firstly, an index tree is proposed to store trajectories and their similarity matrix, with which trajectories can be retrieved efficiently; secondly, a new concept of trajectory structure is introduced to analyse both the internal and external features of trajectories; then, trajectories are partitioned into trajectory segments according to their corners; furthermore, the similarity between every pair of trajectory segments is computed with a structural similarity function; finally, trajectory segments are grouped into different clusters according to their location in the different levels of the index tree. Experimental results on real data sets demonstrate not only the efficiency and effectiveness of our algorithm, but also its flexibility: feature sensitivity can be adjusted through different parameters, and the cluster results are more practically significant.
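The partitioning step ("trajectories are partitioned into trajectory segments according to their corners") might look like the sketch below. The abstract does not define a corner, so this example assumes a corner is any point where the heading changes by at least a threshold angle. The function names and the 30-degree default are illustrative assumptions.

```python
import math

def turning_angle(a, b, c):
    """Absolute change of heading in degrees at b when moving a -> b -> c."""
    h1 = math.atan2(b[1] - a[1], b[0] - a[0])
    h2 = math.atan2(c[1] - b[1], c[0] - b[0])
    d = abs(h2 - h1)
    return math.degrees(min(d, 2 * math.pi - d))  # wrap into [0, 180]

def partition_at_corners(traj, min_angle_deg=30.0):
    """Split a list of (x, y) points into segments at sharp turns.

    The corner point is shared: it ends one segment and starts the next.
    """
    segments, start = [], 0
    for i in range(1, len(traj) - 1):
        if turning_angle(traj[i - 1], traj[i], traj[i + 1]) >= min_angle_deg:
            segments.append(traj[start:i + 1])
            start = i
    segments.append(traj[start:])
    return segments
```

Raising the threshold makes the partition less sensitive to small direction changes, which is one simple way a sensitivity parameter of the kind the abstract mentions could be exposed.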


2020 ◽  
Author(s):  
Fatima Zahra Errounda ◽  
Yan Liu

Location and trajectory data are routinely collected to generate valuable knowledge about users' behavior patterns. However, releasing location data may jeopardize the privacy of the involved individuals. Differential privacy is a powerful technique that prevents an adversary from inferring the presence or absence of an individual in the original data solely based on the observed data. The first challenge in applying differential privacy to location data is that it usually involves a single user. This shifts the adversary's target to the user's locations instead of presence or absence in the original data. The second challenge is that the inherent correlation between location data, due to people's movement regularity and predictability, gives the adversary an advantage in inferring information about individuals. In this paper, we review the differentially private approaches to tackle these challenges. Our goal is to help newcomers to the field better understand the state of the art by providing a research map that highlights the different challenges in designing differentially private frameworks that tackle the characteristics of location data. We find that in protecting an individual's location privacy, the attention of differential privacy mechanisms shifts to preventing the adversary from inferring the original location based on the observed one. Moreover, we find that the privacy-preserving mechanisms make use of the predictability and regularity of users' movements to design and protect the users' privacy in trajectory data. Finally, we explore how well the presented frameworks succeed in protecting users' locations and trajectories against well-known privacy attacks.
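As background for the presence-or-absence guarantee described here, the basic Laplace mechanism can be sketched on a histogram of visits per grid cell. This is the textbook mechanism, not any specific framework reviewed in the paper. It assumes each user contributes exactly one cell, so adding or removing a user changes the histogram by at most 1. The function and parameter names are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_cell_counts(visits, cells, epsilon, rng):
    """Release noisy visit counts for every cell in a fixed, public domain.

    visits: one cell id per user (assumption: one contribution per user).
    cells:  the full list of cell ids; empty cells also get noise so that
            the set of released keys does not depend on the data.
    """
    scale = 1.0 / epsilon  # sensitivity 1 under add/remove of one user
    counts = {c: 0 for c in cells}
    for c in visits:
        counts[c] += 1
    return {c: n + laplace_noise(scale, rng) for c, n in counts.items()}
```

A smaller `epsilon` gives stronger privacy and noisier counts. This count-based setting does not address the two challenges the abstract raises (single-user location release and correlated locations), which is why dedicated location mechanisms are surveyed.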


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Jun Li ◽  
Qingqi Li ◽  
Yan Zhu ◽  
Yan Ma ◽  
Yubin Xu ◽  
...  

Quality of travel service for road transport relies heavily on richness of transport operation data. Currently, most types of data including coach operation data are collected by manual investigation which is time-consuming and labor-intensive, and this significantly hinders the realization of intelligent traffic information service. In view of the above problems, this paper is aimed at introducing a method of automatically extracting coach operation information using historical GPS trajectory data of massive coaches. The method first analyzes trajectory characteristics of coaches within stations and identifies the highly dense point clusters as coach stations using the DBSCAN clustering algorithm. Then the schedule information is obtained by conducting error adjustment on the actual arrival and departure time series of multiple shifts, and the name of coach station is queried from point of interest (POI) and geographical name database provided by online map. Finally, the regular driving route of coaches is extracted by an incremental trajectory merging method. The proposed method is applied in handling historical trajectory data in the Beijing-Tianjin-Hebei region in China, and experimental results show that the extraction accuracy is 84% and verify its effectiveness and feasibility. The proposed method makes use of data mining techniques to extract coach operation information from big trajectory data and saves a lot of labor work, time, and economic cost required by on-site investigation.


2015 ◽  
Vol 2015 ◽  
pp. 1-18 ◽  
Author(s):  
Dawen Xia ◽  
Binfeng Wang ◽  
Yantao Li ◽  
Zhuobo Rong ◽  
Zili Zhang

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on the widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ the MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide the traffic subareas of Beijing based on real-world trajectory data sets generated by 12,000 taxis over a period of one month using the proposed approach. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, more accuracy, and better scalability and can effectively divide traffic subareas with big taxi trajectory data.
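How K-Means decomposes into map and reduce phases can be shown with an in-process sketch. It uses plain Euclidean K-Means and ordinary Python functions in place of Hadoop jobs; the modified distance metric, the initialization strategy, and the boundary identification step of Par3PKM are not reproduced. All function names are illustrative.

```python
import math
from collections import defaultdict

def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, (point, 1)) for each point."""
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        yield idx, (p, 1)

def reduce_phase(pairs, centroids):
    """Reduce: sum coordinates and counts per key, then average."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for idx, ((x, y), n) in pairs:
        acc = sums[idx]
        acc[0] += x
        acc[1] += y
        acc[2] += n
    # A centroid that received no points keeps its previous position.
    return [
        (sums[i][0] / sums[i][2], sums[i][1] / sums[i][2]) if i in sums else c
        for i, c in enumerate(centroids)
    ]

def kmeans_mapreduce(points, centroids, iters=20):
    """Run map/reduce rounds until the centroids stop moving."""
    for _ in range(iters):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids
```

Because the reducer only needs per-key sums and counts, partial sums could be combined on each worker before the shuffle, which is what makes the assignment step straightforward to parallelize.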


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Lili Pei ◽  
Zhaoyun Sun ◽  
Yuxi Han ◽  
Wei Li ◽  
Huaixin Zhao

Aiming at the mining of traffic events based on large amounts of highway data, this paper proposes an improved fast peak clustering algorithm to process highway toll data. The highway toll data are first analyzed, and a data cleaning method based on the sum of similar coefficients is proposed to process the original data. Next, to avoid the shortcomings of the excessive subjectivity of the original algorithm, an improved fast peak clustering algorithm is proposed. Finally, the improved algorithm is applied to highway traffic condition analysis and abnormal event mining to obtain more accurate and intuitive clustering results. Compared with two classical algorithms, namely, the k-means and density-based spatial clustering of applications with noise (DBSCAN) algorithms, as well as the unimproved original fast peak clustering algorithm, the proposed algorithm is faster and more accurate and can reveal the complex relationships among massive data more efficiently. During the process of reforming the toll system, the algorithm can automatically and more efficiently analyze massive toll data and detect abnormal events, thereby providing a theoretical basis and data support for the operation monitoring and maintenance of highways.


2018 ◽  
Vol 173 ◽  
pp. 03086 ◽  
Author(s):  
Zhen Yang ◽  
Wang Hong-jun

As an emerging spatial trajectory data, mobile terminal location data can be widely used to analyze the behavior characteristics and interests of individuals or groups in smart cities, transportation planning and other civil fields. It can also be used to track suspects in anti-terrorism security and public opinion management. Aiming at the problem that it is difficult to determine suitable input parameters of clustering caused by different subscriber location data size and distribution difference, an improved density peak clustering algorithm is proposed and the performance of the improved algorithm is verified on the UCI data set. Firstly the important location is identified by the proposed algorithm, and the personal location is further inferred by the algorithm based on the subscriber's schedule and maximum cluster. Then, the algorithm adopts Google's inverse geocoding technology to obtain the semantic names corresponding to the coordinate points, and introduces the natural language processing technology to achieve word frequency statistics and keyword extraction. The simulation results based on the Geolife data set show that the algorithm is feasible for identifying important locations and inferring personal locations.
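For reference, the baseline density peak clustering idea that this abstract improves on can be sketched as below: each point gets a local density rho (neighbors within a cutoff distance) and a distance delta to its nearest denser point, and cluster centers are points where both are large. The paper's improvement is not reproduced. The cutoff `d_c`, the selection of centers by the product rho * delta, and the function name are assumptions made for illustration.

```python
import math

def density_peaks(points, d_c, k):
    """Baseline density peak clustering. Returns (labels, center indices)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points strictly within the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    # Decreasing density, ties broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta, parent = [0.0] * n, [-1] * n
    delta[order[0]] = max(dist[order[0]])  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        # Nearest point among those ranked denser than i.
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    # The densest point is always a center; the rest are ranked by rho * delta.
    rest = sorted(order[1:], key=lambda i: rho[i] * delta[i], reverse=True)
    centers = [order[0]] + rest[:k - 1]
    labels = [-1] * n
    for c_id, c in enumerate(centers):
        labels[c] = c_id
    for i in order:  # a parent is always labeled before its children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The result depends strongly on `d_c` and on how many centers are taken, which is the parameter selection difficulty the abstract refers to.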


2019 ◽  
Vol 10 (2) ◽  
pp. 105-115
Author(s):  
Rong Wen ◽  
Wenjing Yan

The goal of maritime traffic management is to provide a safe and efficient maritime environment for different types of vessels, facilitating port logistics and supply chain business. However, current maritime traffic management mainly relies on massive data about individual vessels for decision making. The lack of a macro-level understanding of vessel crowd movement around ports challenges maritime safety and traffic efficiency. In this paper, we describe a spatio-temporal data mining method to discover crowd movement patterns of vessels from their short-term history data. The method first captures vessels’ crowd movement features by building vessels’ tracklets with their speed and location. A movement vector clustering algorithm is developed to find different travel behaviors for different groups of vessels. With nonparametric regression on the classified vessel movement vectors, which represent the crowd travel behaviors, an overall vessel movement pattern can then be discovered. In this research, we tested real trajectory data of vessels near Singapore ports. Compared with the actual massive vessel movement data, we found that this method was able to extract vessels’ crowd movement information. Hotspots of risk areas in terms of vessel traffic and speed can be identified. The method can be used to provide decision-making support for maritime traffic management.


2021 ◽  
Vol 30 (1) ◽  
pp. 763-773
Author(s):  
Yang Zhang ◽  
Abhinav Asthana ◽  
Sudeep Asthana ◽  
Shaweta Khanna ◽  
Ioan-Cosmin Mihai

In order to study an intelligent collection system for moving object trajectory data under cloud computing, information useful to passengers and taxi drivers is collected from massive trajectory data. This paper uses cloud computing technology, combining a clustering algorithm and the density-based DBSCAN algorithm with the MapReduce programming model to design a trajectory clustering algorithm. Based on 8 days of data from 15,000 taxis in Shenzhen, the characteristic time periods are determined. The passenger hot spot areas are obtained by clustering the passenger load points in each time period, which verifies the feasibility of the passenger load point recommendation application based on trajectory clustering. In the absence of holidays, the number of passenger hotspots tends to be stable, so cluster analysis is reliable. The recommendation application has been demonstrated through experiments, and the implementation results show the rationality of its design and the feasibility of its use.


Author(s):  
Michel Bruynooghe

The clustering of large data sets is of great interest in fields such as pattern recognition, numerical taxonomy, and image or speech processing. The traditional Ascendant Hierarchical Algorithm (AHC) cannot be run for sets of more than a few thousand elements. The reducible neighborhoods clustering algorithm, which is presented in this paper, overcomes the limits of the traditional hierarchical clustering algorithm by generating an exact hierarchy on a large data set. The theoretical justification of this algorithm is the so-called Bruynooghe reducibility principle, which lays down the condition under which the exact hierarchy may be constructed locally, by carrying out aggregations in restricted regions of the representation space. As with the Day and Edelsbrunner algorithm, the maximum theoretical time complexity of the reducible neighborhoods clustering algorithm is O(n² log n), regardless of the chosen clustering strategy. But the reducible neighborhoods clustering algorithm uses the original data table, and its practical performance is far better than that of Day and Edelsbrunner’s algorithm, thus allowing the hierarchical clustering of large data sets, i.e. those composed of more than 10 000 objects.

