Mining Taxi Pick-Up Hotspots Based on Grid Information Entropy Clustering Algorithm

2021 ◽  
Vol 2021 ◽  
pp. 1-25
Author(s):  
Shuoben Bi ◽  
Ruizhuang Xu ◽  
Aili Liu ◽  
Luye Wang ◽  
Lei Wan

Density-based clustering algorithms are sensitive to the input data, which limits them in computing space and timeliness. To address this, a new method based on a grid information entropy clustering algorithm is proposed for mining taxi passenger hotspots. This paper selects representative geographical areas of Nanjing and Beijing as the research areas and uses information entropy and aggregation degree to analyze the distribution of passenger pick-up points. The algorithm performs its calculations on a grid instead of the original trajectory data to mine taxi passenger hotspots. A comparison and analysis of taxi pick-up point data in Nanjing and Beijing shows that the experimental results are consistent with the actual urban passenger hotspots, which verifies the effectiveness of the algorithm. It overcomes the computing-space and timeliness limitations of density-based clustering algorithms, reduces the amount of data that needs to be processed, and offers greater flexibility for processing and analyzing massive data. The research results can provide an important scientific basis for urban traffic guidance and urban management.
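
The grid idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact entropy or aggregation-degree formulas, so the code below only assumes a uniform square grid and the ordinary Shannon entropy of the per-cell point counts; the function names, the `cell_size` parameter, and the `min_count` threshold are hypothetical.

```python
import math
from collections import Counter


def grid_counts(points, origin, cell_size):
    """Count how many (x, y) points fall into each square grid cell.

    Assumption: a uniform grid anchored at `origin`; cells are keyed by (row, col).
    """
    ox, oy = origin
    counts = Counter()
    for x, y in points:
        col = math.floor((x - ox) / cell_size)
        row = math.floor((y - oy) / cell_size)
        counts[(row, col)] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the distribution of points over occupied cells."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def hot_cells(counts, min_count):
    """Hypothetical hotspot rule: keep cells holding at least `min_count` points."""
    return sorted(cell for cell, c in counts.items() if c >= min_count)


if __name__ == "__main__":
    pickups = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.4), (1.5, 0.2)]
    cells = grid_counts(pickups, origin=(0.0, 0.0), cell_size=1.0)
    print(cells, shannon_entropy(cells), hot_cells(cells, min_count=2))
```

Working on cell counts rather than raw points is what shrinks the data volume: later steps touch one record per occupied cell, regardless of how many pick-ups it contains.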

2014 ◽  
Vol 48 (6) ◽  
pp. 74-85 ◽  
Author(s):  
Jiacai Pan ◽  
Qingshan Jiang ◽  
Zheping Shao

The trajectory data of moving objects contain huge amounts of information pertaining to traffic flow. It is incredibly important to extract valuable knowledge from this particular kind of data. Trajectory clustering is one of the most widely used approaches to complete this extraction. However, the current practice of trajectory clustering always groups similar subtrajectories that are partitioned from the trajectories; these methods would thus lose important information of the trajectory as a whole. To deal with this problem, this paper introduces a new trajectory-clustering algorithm based on sampling and density, which groups similar traffic movement tracks (car, ship, airplane, etc.) for further analysis of the characteristics of traffic flow. In particular, this paper proposes a novel technique of measuring distances between trajectories using point sampling. This distance measure does not divide the trajectory and thus conserves the integrated knowledge of these trajectories. This trajectory clustering approach is a new adaptation of a density-based clustering algorithm to the trajectories of moving objects. This paper then adopts the entropy theory as the heuristic for selecting the parameter values of this algorithm and the sum of the squared error method for measuring the clustering quality. Experiments on real ship trajectory data have shown that this algorithm is superior to the classical method TRACLUS in run time and that this method works well in discovering traffic flow patterns.
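
A whole-trajectory distance based on point sampling might look like the sketch below. The abstract does not specify the sampling scheme or the distance formula, so this assumes resampling each trajectory to the same number of points equally spaced by arc length and averaging the Euclidean distance between corresponding samples; `resample`, `sampled_distance`, and `n_samples` are hypothetical names.

```python
import math


def resample(traj, n):
    """Return n points equally spaced by arc length along a 2-D polyline."""
    if n < 2 or len(traj) < 2:
        raise ValueError("need at least two samples and two trajectory points")
    # Cumulative arc length at every vertex of the polyline.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(traj) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else min(1.0, (target - cum[seg]) / seg_len)
        (x0, y0), (x1, y1) = traj[seg], traj[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def sampled_distance(traj_a, traj_b, n_samples=20):
    """Mean Euclidean distance between corresponding samples of two trajectories."""
    a = resample(traj_a, n_samples)
    b = resample(traj_b, n_samples)
    return sum(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b)) / n_samples


if __name__ == "__main__":
    print(sampled_distance([(0, 0), (10, 0)], [(0, 3), (10, 3)]))
```

Because both trajectories are compared end to end, no trajectory is cut into subtrajectories, which is the property the abstract emphasizes. The resulting pairwise distances could then feed a density-based clusterer.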


2021 ◽  
Vol 40 (6) ◽  
pp. 10781-10796
Author(s):  
Xin Yu ◽  
Feng Zeng ◽  
Deborah Simon Mwakapesa ◽  
Y.A. Nanehkaran ◽  
Yi-Min Mao ◽  
...  

The main goal of this paper is to design a density-based clustering algorithm using the weighted grid and information entropy based on MapReduce, denoted DBWGIE-MR, to deal with the problems of unreasonable grid division, low accuracy of clustering results, and low parallelization efficiency in density-based big data clustering algorithms. The algorithm is implemented in three stages: data partitioning, local clustering, and global clustering. For each stage, we propose several strategies to improve the algorithm. In the first stage, based on the spatial distribution of data points, we propose an adaptive division strategy (ADG) to divide the grid adaptively. In the second stage, we design a weighted grid construction strategy (NE) that strengthens the relevance between grids to improve the accuracy of clustering. Meanwhile, based on the weighted grid and information entropy, we design a density calculation strategy (WGIE) to calculate the density of the grid. Finally, to improve the parallel efficiency, a core clusters computing algorithm based on MapReduce (COMCORE-MR) is proposed to compute the core clusters of the clustering algorithm in parallel. In the third stage, based on disjoint-set, we propose a core cluster merging algorithm (MECORE) to speed up the convergence of merged local clusters. Furthermore, based on MapReduce, a core clusters parallel merging algorithm (MECORE-MR) is proposed to obtain the clustering results faster, which improves the core cluster merging efficiency of the density-based clustering algorithm. We conduct the experiments on four synthetic clusters. Compared with H-DBSCAN, DBSCAN-MR and MR-VDBSCAN, the experimental results show that the DBWGIE-MR algorithm has higher stability and accuracy, and it takes less time in parallel clustering.
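
The disjoint-set merging step in the third stage can be sketched as follows. This is not the MECORE algorithm itself, whose details the abstract does not give; it only assumes that two local clusters belong together when they share at least one grid cell. The `DisjointSet` class and `merge_local_clusters` function are hypothetical names.

```python
class DisjointSet:
    """Union-find with path halving and union by size."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def merge_local_clusters(local_clusters):
    """Merge local clusters (sets of cell ids) that share at least one cell."""
    ds = DisjointSet(len(local_clusters))
    owner = {}  # cell id -> index of the first local cluster that contained it
    for idx, cells in enumerate(local_clusters):
        for cell in cells:
            if cell in owner:
                ds.union(owner[cell], idx)
            else:
                owner[cell] = idx
    merged = {}
    for idx, cells in enumerate(local_clusters):
        merged.setdefault(ds.find(idx), set()).update(cells)
    return sorted(sorted(group) for group in merged.values())


if __name__ == "__main__":
    print(merge_local_clusters([{1, 2}, {2, 3}, {7}, {8, 7}, {10}]))
```

Each cell is visited once and every union is close to constant time, so merging scales roughly linearly with the total number of cells across local clusters.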


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Zhihan Liu ◽  
Yi Jia ◽  
Xiaolu Zhu

Car sharing is a type of car rental service in which consumers rent cars for short periods of time, often charged by the hour. Analyzing urban traffic big data is important for determining the locations of depots for a car-sharing system. Taxi OD (Origin-Destination) data is a typical urban traffic dataset. The volume of the data is so large that traditional data processing applications do not work well. In this paper, an optimization method is presented that determines depot locations by clustering taxi OD points with the AP (Affinity Propagation) clustering algorithm. By analyzing the characteristics of the AP clustering algorithm, AP clustering has been optimized hierarchically based on administrative region segmentation. Considering the sparse similarity matrix of taxi OD points, the input parameters of AP clustering have been adapted. In the case study, we choose the OD pair information from Beijing’s taxi GPS trajectory data. The number and locations of depots are determined by clustering the OD points with the optimized AP clustering. We describe the experimental results of our approach and compare it with the standard K-means method using quantitative and stationarity indices. Experiments on the real datasets show that the proposed method for determining car-sharing depots has superior performance.


2021 ◽  
Vol 10 (7) ◽  
pp. 473
Author(s):  
Qingying Yu ◽  
Chuanming Chen ◽  
Liping Sun ◽  
Xiaoyao Zheng

Urban hotspot area detection is an important issue that needs to be explored for urban planning and traffic management. It is of great significance to mine hotspots from taxi trajectory data, which reflect residents’ travel characteristics and the operational status of urban traffic. The existing clustering methods mainly concentrate on the number of objects contained in an area within a specified size, neglecting the impact of the local density and the tightness between objects. Hence, a novel algorithm is proposed for detecting urban hotspots from taxi trajectory data based on nearest neighborhood-related quality clustering techniques. The proposed spatial clustering algorithm not only considers the maximum clustering in a limited range but also considers the relationship between each cluster center and its nearest neighborhood, effectively addressing the clustering issue of unevenly distributed datasets. As a result, the proposed algorithm obtains high-quality clustering results. The visual representation and simulated experimental results on a real-life cab trajectory dataset show that the proposed algorithm is suitable for inferring urban hotspot areas, and that it obtains better accuracy than traditional density-based methods.


2021 ◽  
Vol 10 (2) ◽  
pp. 77
Author(s):  
Yitong Gan ◽  
Hongchao Fan ◽  
Wei Jiao ◽  
Mengqi Sun

In China, the traditional taxi industry is conforming to the trend of the times, with taxi drivers working with e-hailing applications. This reform is of great significance, not only for the taxi industry, but also for the transportation industry, cities, and society as a whole. Our goal was to analyze the changes in driving behavior since taxi drivers joined e-hailing platforms. Therefore, this paper mined taxi trajectory data from Shanghai and compared the data of May 2015 with those of May 2017 to represent the before-app stage and the full-use stage, respectively. By extracting two-trip events (i.e., vacant trip and occupied trip) and two-spot events (i.e., pick-up spot and drop-off spot), taxi driving behavior changes were analyzed temporally, spatially, and efficiently. The results reveal that e-hailing applications mine more long-distance rides and new pick-up locations for drivers. Moreover, driver initiative has increased at night since drivers began using e-hailing applications. Furthermore, mobile payment facilities save time that would otherwise be taken sorting out change. Although e-hailing apps can help citizens get taxis faster, from the driver’s perspective, the apps do not reduce their cruising time. In general, e-hailing software reduces the unoccupied ratio of taxis and improves the operating ratio. Ultimately, new driving behaviors can increase the driver’s revenue. This work is meaningful for the formulation of reasonable traffic laws and for urban traffic decision-making.


2011 ◽  
Vol 291-294 ◽  
pp. 344-348
Author(s):  
Lin Lin ◽  
Shu Yan ◽  
Yi Nian

The hierarchical topology of wireless sensor networks can effectively reduce the consumption in communication. Clustering algorithms are the foundation for realizing a hierarchical structure, so they have been extensively researched. On the basis of the LEACH algorithm, a distance density based clustering algorithm (DDBC) is proposed, which jointly considers the distribution density of surrounding nodes and the remaining energy of the node to dynamically balance the energy usage of nodes when selecting cluster heads. We analyzed the performance of DDBC by comparing it with other existing clustering algorithms in simulation experiments. Results show that the proposed method can generate a stable number of cluster heads and balance the energy load effectively.
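
For context, the baseline that DDBC builds on can be sketched in a few lines. The code below shows only the standard LEACH election threshold, T = p / (1 - p (r mod 1/p)), for nodes that have not yet served as cluster head in the current cycle; how DDBC weights this by neighbor density and residual energy is not stated in the abstract and is not modeled here. The function and parameter names are hypothetical.

```python
import random


def leach_threshold(p, round_index, eligible=True):
    """Standard LEACH threshold for one node in a given round.

    p           -- desired fraction of cluster heads per round (0 < p <= 1)
    round_index -- current round number r (0-based)
    eligible    -- False if the node already served as head in this cycle
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not eligible:
        return 0.0
    period = round(1.0 / p)  # number of rounds in one election cycle
    return p / (1.0 - p * (round_index % period))


def elects_itself(p, round_index, eligible=True, rng=random):
    """A node becomes cluster head if a uniform draw falls below the threshold."""
    return rng.random() < leach_threshold(p, round_index, eligible)


if __name__ == "__main__":
    for r in range(10):
        print(r, leach_threshold(0.1, r))
```

The threshold rises through the cycle and reaches 1 in its last round, so every node that is still eligible is elected by the end of the cycle. A density- or energy-aware variant would scale this value per node.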


2021 ◽  
Author(s):  
Shenfei Pei ◽  
Feiping Nie ◽  
Rong Wang ◽  
Xuelong Li

2021 ◽  
Vol 25 (6) ◽  
pp. 1453-1471
Author(s):  
Chunhua Tang ◽  
Han Wang ◽  
Zhiwen Wang ◽  
Xiangkun Zeng ◽  
Huaran Yan ◽  
...  

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering of datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), which is a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) from the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. It overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance of the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity, and outperforms other algorithms in parameter setting and noise recognition.
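
The k-nearest-neighbor distance mentioned in this abstract is easy to illustrate. The sketch below is a brute-force computation for small inputs, not FOP-OPTICS; how the paper uses these distances to find demarcation points is not described in the abstract. The function name `k_distances` is hypothetical.

```python
import math


def k_distances(points, k):
    """Return, for each point, the distance to its k-th nearest other point.

    Brute force, O(n^2 log n); a spatial index would be needed for large data.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
    print(k_distances(pts, k=2))
```

Points in dense regions get small k-distances, while isolated points such as `(10, 0)` get large ones, which is why this quantity is a common starting point for choosing a neighborhood radius and flagging noise.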


Transport ◽  
2018 ◽  
Vol 33 (4) ◽  
pp. 959-970 ◽  
Author(s):  
Tamás Tettamanti ◽  
Alfréd Csikós ◽  
Krisztián Balázs Kis ◽  
Zsolt János Viharos ◽  
István Varga

A full methodology of short-term traffic prediction is proposed for urban road traffic networks via an Artificial Neural Network (ANN). The goal of the forecasting is to provide speed estimates 5, 15 and 30 min ahead. Unlike similar research results in this field, the investigated method aims to predict traffic speed for signalized urban road links and not for highways or arterial roads. The methodology contains an efficient feature selection algorithm to determine the appropriate input parameters required for neural network training. As another contribution of the paper, built-in handling of incomplete data is provided, as input data (originating from traffic sensors or Floating Car Data (FCD)) might be absent or biased in practice. This input data handling therefore ensures robust speed forecasting even when data are missing. The proposed algorithm is trained, tested and analysed in a test network built in a microscopic traffic simulator using the daily course of real-world traffic.

