Mining Taxi Pick-Up Hotspots Based on Grid Information Entropy Clustering Algorithm

Journal of Advanced Transportation ◽

10.1155/2021/5814879 ◽

2021 ◽

Vol 2021 ◽

pp. 1-25

Author(s):

Shuoben Bi ◽

Ruizhuang Xu ◽

Aili Liu ◽

Luye Wang ◽

Lei Wan

Keyword(s):

Information Entropy ◽

Input Data ◽

Clustering Algorithm ◽

Scientific Basis ◽

Urban Traffic ◽

Massive Data ◽

Trajectory Data ◽

Research Areas ◽

Density Based Clustering ◽

Traffic Guidance

Density-based clustering algorithms are sensitive to their input data, which leaves them limited by computing space and with poor timeliness. To address this, a new method for mining taxi pick-up hotspots is proposed, based on a grid information entropy clustering algorithm. This paper selects representative geographical areas of Nanjing and Beijing as the research areas and uses information entropy and aggregation degree to analyze the distribution of passenger pick-up points. The algorithm computes on grid cells instead of the original trajectory data to mine taxi pick-up hotspots. A comparative analysis of taxi pick-up point data from Nanjing and Beijing shows that the experimental results are consistent with the actual urban passenger hotspots, which verifies the effectiveness of the algorithm. The method overcomes the computing-space and timeliness limitations of density-based clustering, reduces the amount of data to be processed, and offers greater flexibility for processing and analyzing massive data. The research results can provide an important scientific basis for urban traffic guidance and urban management.
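The grid idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the cell size, the `min_count` threshold, the function names, and the use of Shannon entropy over per-cell counts are all assumptions made for illustration, since the abstract does not give the exact definitions of information entropy or aggregation degree.

```python
import math
from collections import Counter


def grid_counts(points, cell_size):
    """Assign each (x, y) point to a square grid cell and count points per cell."""
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )


def shannon_entropy(counts):
    """Shannon entropy (bits) of how the points are spread over the occupied cells."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def hot_cells(counts, min_count):
    """Cells holding at least min_count points (hypothetical hotspot rule)."""
    return sorted(cell for cell, c in counts.items() if c >= min_count)


# Toy pick-up coordinates in arbitrary planar units (made up for the example).
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9), (1.5, 0.5), (2.2, 2.7)]
counts = grid_counts(points, cell_size=1.0)
entropy = shannon_entropy(counts)
hotspots = hot_cells(counts, min_count=2)
```

Working on per-cell counts rather than raw points means later steps scale with the number of occupied cells, which is the data reduction the abstract describes. A lower entropy indicates that pick-ups are concentrated in fewer cells.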

Trajectory Clustering by Sampling and Density

Marine Technology Society Journal ◽

10.4031/mtsj.48.6.8 ◽

2014 ◽

Vol 48 (6) ◽

pp. 74-85 ◽

Cited By ~ 5

Author(s):

Jiacai Pan ◽

Qingshan Jiang ◽

Zheping Shao

Keyword(s):

Traffic Flow ◽

Clustering Algorithm ◽

Moving Objects ◽

Distance Measure ◽

Classical Method ◽

Trajectory Clustering ◽

Trajectory Data ◽

Density Based Clustering ◽

Clustering Quality ◽

Parameter Values

The trajectory data of moving objects contain a large amount of information about traffic flow, and extracting useful knowledge from this kind of data is important. Trajectory clustering is one of the most widely used approaches to this extraction. However, current trajectory clustering practice groups similar subtrajectories partitioned from the trajectories, and these methods therefore lose important information about each trajectory as a whole. To deal with this problem, this paper introduces a new trajectory-clustering algorithm based on sampling and density, which groups similar traffic movement tracks (car, ship, airplane, etc.) for further analysis of the characteristics of traffic flow. In particular, this paper proposes a novel technique for measuring distances between trajectories using point sampling. This distance measure does not divide the trajectory and thus preserves the knowledge carried by the whole trajectory. The clustering approach is a new adaptation of a density-based clustering algorithm to the trajectories of moving objects. The paper then adopts entropy theory as the heuristic for selecting the parameter values of the algorithm and the sum of squared errors for measuring clustering quality. Experiments on real ship trajectory data show that the algorithm is superior to the classical method TRACLUS in run time and that it works well in discovering traffic flow patterns.
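A distance between whole trajectories based on point sampling can be sketched as follows. This is one plausible reading, not the paper's definition: it assumes each trajectory is resampled to the same number of points, evenly spaced by arc length, and that the distance is the mean Euclidean distance between corresponding samples. The names `resample` and `sampled_distance` are hypothetical.

```python
import math


def resample(traj, n):
    """Return n points evenly spaced by arc length along the polyline traj."""
    cum = [0.0]  # cumulative length at each vertex
    for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0 or n == 1:
        return [traj[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(traj) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = traj[seg], traj[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def sampled_distance(a, b, n=10):
    """Mean distance between corresponding samples of two whole trajectories."""
    pa, pb = resample(a, n), resample(b, n)
    return sum(math.dist(p, q) for p, q in zip(pa, pb)) / n


# Two parallel toy tracks one unit apart, with different vertex counts.
track_a = [(0.0, 0.0), (10.0, 0.0)]
track_b = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]
d = sampled_distance(track_a, track_b, n=5)
```

Because neither trajectory is split into subtrajectories, the measure compares tracks end to end, and the resulting pairwise distances could be passed to any density-based clustering routine that accepts a custom metric.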

DBWGIE-MR: A density-based clustering algorithm by using the weighted grid and information entropy based on MapReduce

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201792 ◽

2021 ◽

Vol 40 (6) ◽

pp. 10781-10796

Author(s):

Xin Yu ◽

Feng Zeng ◽

Deborah Simon Mwakapesa ◽

Y.A. Nanehkaran ◽

Yi-Min Mao ◽

...

Keyword(s):

Information Entropy ◽

Clustering Algorithm ◽

Data Partitioning ◽

The Core ◽

Merging Algorithm ◽

Density Based Clustering ◽

Data Points ◽

Speed Up ◽

Low Efficiency ◽

Stability And Accuracy

The main goal of this paper is to design a density-based clustering algorithm using the weighted grid and information entropy based on MapReduce, denoted DBWGIE-MR, to address three problems of density-based clustering on big data: unreasonable grid division, low accuracy of clustering results, and low parallelization efficiency. The algorithm is implemented in three stages: data partitioning, local clustering, and global clustering. For each stage, we propose several strategies to improve the algorithm. In the first stage, based on the spatial distribution of data points, we propose an adaptive division strategy (ADG) to divide the grid adaptively. In the second stage, we design a weighted grid construction strategy (NE) that strengthens the relevance between grids to improve clustering accuracy. Meanwhile, based on the weighted grid and information entropy, we design a density calculation strategy (WGIE) to calculate the density of the grid. Finally, to improve parallel efficiency, a core cluster computing algorithm based on MapReduce (COMCORE-MR) is proposed to compute the core clusters in parallel. In the third stage, based on the disjoint-set data structure, we propose a core cluster merging algorithm (MECORE) to speed up the convergence of merged local clusters. Furthermore, based on MapReduce, a parallel core cluster merging algorithm (MECORE-MR) is proposed to obtain the clustering results faster, which improves the core cluster merging efficiency of the density-based clustering algorithm. We conduct the experiments on four synthetic clusters. Compared with H-DBSCAN, DBSCAN-MR and MR-VDBSCAN, the experimental results show that the DBWGIE-MR algorithm has higher stability and accuracy, and it takes less time in parallel clustering.
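The disjoint-set merging used in the third stage can be illustrated with a small single-machine sketch. This is not MECORE or MECORE-MR: the cluster identifiers, the `merge_clusters` helper, and the assumption that an earlier step has already produced pairs of local clusters to be joined are all hypothetical.

```python
class DisjointSet:
    """Union-find over hashable ids, with path compression in find()."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def merge_clusters(cluster_ids, linked_pairs):
    """Group local cluster ids into global clusters from pairwise links."""
    ds = DisjointSet()
    for cid in cluster_ids:
        ds.find(cid)
    for a, b in linked_pairs:
        ds.union(a, b)
    groups = {}
    for cid in cluster_ids:
        groups.setdefault(ds.find(cid), []).append(cid)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical local clusters from three partitions; links join A1-B3 and B3-C2.
local_ids = ["A1", "B3", "C2", "D7"]
links = [("A1", "B3"), ("B3", "C2")]
merged = merge_clusters(local_ids, links)
```

With path compression, each link is processed in near-constant amortized time, so merging cost grows roughly linearly with the number of links between local clusters.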

Deployment Strategy for Car-Sharing Depots by Clustering Urban Traffic Big Data Based on Affinity Propagation

Scientific Programming ◽

10.1155/2018/3907513 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Zhihan Liu ◽

Yi Jia ◽

Xiaolu Zhu

Keyword(s):

Big Data ◽

Clustering Algorithm ◽

Optimization Method ◽

Urban Traffic ◽

Affinity Propagation ◽

Superior Performance ◽

Trajectory Data ◽

Car Sharing ◽

Gps Trajectory Data ◽

Ap Clustering

Car sharing is a type of car rental service in which consumers rent cars for short periods of time, often charged by the hour. Analyzing urban traffic big data is important for determining the locations of depots for a car-sharing system. Taxi OD (Origin-Destination) data is a typical urban traffic dataset. The volume of the data is so large that traditional data processing applications do not work well. In this paper, an optimization method is presented that determines depot locations by clustering taxi OD points with the AP (Affinity Propagation) clustering algorithm. Based on an analysis of the characteristics of the AP clustering algorithm, AP clustering is optimized hierarchically using administrative region segmentation. To account for the sparse similarity matrix of taxi OD points, the input parameters of AP clustering are adapted. In the case study, we use OD pair information from Beijing’s taxi GPS trajectory data. The number and locations of depots are determined by clustering the OD points with the optimized AP clustering. We describe experimental results of our approach and compare it with the standard K-means method using a quantitative index and a stationarity index. Experiments on the real datasets show that the proposed method for determining car-sharing depots has superior performance.

Urban Hotspot Area Detection Using Nearest-Neighborhood-Related Quality Clustering on Taxi Trajectory Data

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10070473 ◽

2021 ◽

Vol 10 (7) ◽

pp. 473

Author(s):

Qingying Yu ◽

Chuanming Chen ◽

Liping Sun ◽

Xiaoyao Zheng

Keyword(s):

Clustering Algorithm ◽

Local Density ◽

Real Life ◽

Urban Traffic ◽

Cluster Center ◽

Trajectory Data ◽

Related Quality ◽

Nearest Neighborhood ◽

The Impact ◽

Taxi Trajectory

Urban hotspot area detection is an important issue that needs to be explored for urban planning and traffic management. It is of great significance to mine hotspots from taxi trajectory data, which reflect residents’ travel characteristics and the operational status of urban traffic. The existing clustering methods mainly concentrate on the number of objects contained in an area within a specified size, neglecting the impact of the local density and the tightness between objects. Hence, a novel algorithm is proposed for detecting urban hotspots from taxi trajectory data based on nearest neighborhood-related quality clustering techniques. The proposed spatial clustering algorithm not only considers the maximum clustering in a limited range but also considers the relationship between each cluster center and its nearest neighborhood, effectively addressing the clustering issue of unevenly distributed datasets. As a result, the proposed algorithm obtains high-quality clustering results. The visual representation and simulated experimental results on a real-life cab trajectory dataset show that the proposed algorithm is suitable for inferring urban hotspot areas, and that it obtains better accuracy than traditional density-based methods.

Exploring the Influence of E-Hailing Applications on the Taxi Industry—From the Perspective of the Drivers

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020077 ◽

2021 ◽

Vol 10 (2) ◽

pp. 77

Author(s):

Yitong Gan ◽

Hongchao Fan ◽

Wei Jiao ◽

Mengqi Sun

Keyword(s):

Driving Behavior ◽

Urban Traffic ◽

Behavior Changes ◽

Transportation Industry ◽

Trajectory Data ◽

Long Distance ◽

Taxi Drivers ◽

Driving Behaviors ◽

Taxi Industry ◽

The Times

In China, the traditional taxi industry is conforming to the trend of the times, with taxi drivers working with e-hailing applications. This reform is of great significance, not only for the taxi industry, but also for the transportation industry, cities, and society as a whole. Our goal was to analyze the changes in driving behavior since taxi drivers joined e-hailing platforms. Therefore, this paper mined taxi trajectory data from Shanghai and compared the data of May 2015 with those of May 2017 to represent the before-app stage and the full-use stage, respectively. By extracting two-trip events (i.e., vacant trip and occupied trip) and two-spot events (i.e., pick-up spot and drop-off spot), taxi driving behavior changes were analyzed in terms of time, space, and efficiency. The results reveal that e-hailing applications bring drivers more long-distance rides and new pick-up locations. Moreover, driver initiative has increased at night since drivers began using e-hailing applications. Furthermore, mobile payment facilities save time that would otherwise be spent sorting out change. Although e-hailing apps can help citizens get taxis faster, from the driver’s perspective, the apps do not reduce their cruising time. In general, e-hailing software reduces the unoccupied ratio of taxis and improves the operating ratio. Ultimately, new driving behaviors can increase the driver’s revenue. This work is meaningful for the formulation of reasonable traffic laws and for urban traffic decision-making.

Adaptive Density-Based Clustering Algorithm with Shared KNN Conflict Game

Information Sciences ◽

10.1016/j.ins.2021.02.017 ◽

2021 ◽

Author(s):

Rui Zhang ◽

Tao Du ◽

Shouning Qu ◽

Hongwei Sun

Keyword(s):

Clustering Algorithm ◽

Density Based Clustering

Distance Density Based Clustering Algorithm in Wireless Sensor Network

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.291-294.344 ◽

2011 ◽

Vol 291-294 ◽

pp. 344-348

Author(s):

Lin Lin ◽

Shu Yan ◽

Yi Nian

Keyword(s):

Clustering Algorithm ◽

Distribution Density ◽

Simulation Experiment ◽

Clustering Algorithms ◽

Wireless Sensor ◽

Energy Usage ◽

Cluster Heads ◽

Hierarchical Topology ◽

Energy Factors ◽

Density Based Clustering

The hierarchical topology of wireless sensor networks can effectively reduce energy consumption in communication. Clustering algorithms are the foundation for realizing a hierarchical structure, so they have been extensively researched. Building on the LEACH algorithm, a distance density based clustering algorithm (DDBC) is proposed, which jointly considers the distribution density of surrounding nodes and the remaining energy of each node to dynamically balance the energy usage of nodes when selecting cluster heads. We analyzed the performance of DDBC by comparing it with other existing clustering algorithms in a simulation experiment. Results show that the proposed method can generate a stable number of cluster heads and balance the energy load effectively.

An Efficient Density-based Clustering Algorithm for Face Groping

Neurocomputing ◽

10.1016/j.neucom.2021.07.074 ◽

2021 ◽

Author(s):

Shenfei Pei ◽

Feiping Nie ◽

Rong Wang ◽

Xuelong Li

Keyword(s):

Clustering Algorithm ◽

Density Based Clustering

An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽

10.3233/ida-205497 ◽

2021 ◽

Vol 25 (6) ◽

pp. 1453-1471

Author(s):

Chunhua Tang ◽

Han Wang ◽

Zhiwen Wang ◽

Xiangkun Zeng ◽

Huaran Yan ◽

...

Keyword(s):

Time Complexity ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Substantial Improvement ◽

Experimental Results ◽

High Time ◽

Parameter Setting ◽

K Nearest Neighbor ◽

Density Based Clustering

Most density-based clustering algorithms have the problems of difficult parameter setting, high time complexity, poor noise recognition, and weak clustering for datasets with uneven density. To solve these problems, this paper proposes FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), which is a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) from the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of DP as the radius of neighborhood eps of its corresponding cluster. It overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance of the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity, and outperforms other algorithms in parameter setting and noise recognition.
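The k-nearest-neighbor distance mentioned in this abstract can be computed with a brute-force sketch like the one below. It shows only that one quantity, not FOP-OPTICS itself; the function name `k_distances` and the toy points are assumptions, and a practical implementation would likely use a spatial index instead of comparing all pairs.

```python
import math


def k_distances(points, k):
    """For each point, the distance to its k-th nearest other point (brute force)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


# Three close points and one distant point (toy data).
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
kd = k_distances(pts, k=2)
```

Points whose k-distance is much larger than that of their neighbors, such as the last point here, are natural candidates for noise, which is why the quantity is commonly used in density-based methods.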

PATTERN RECOGNITION BASED SPEED FORECASTING METHODOLOGY FOR URBAN TRAFFIC NETWORK

Transport ◽

10.3846/16484142.2017.1352027 ◽

2018 ◽

Vol 33 (4) ◽

pp. 959-970 ◽

Cited By ~ 3

Author(s):

Tamás Tettamanti ◽

Alfréd Csikós ◽

Krisztián Balázs Kis ◽

Zsolt János Viharos ◽

István Varga

Keyword(s):

Neural Network ◽

Input Data ◽

Road Traffic ◽

Urban Traffic ◽

Traffic Network ◽

Data Handling ◽

Urban Road ◽

Traffic Sensors ◽

Network Training ◽

Artificial Neural Network Ann

A full methodology for short-term traffic prediction on urban road traffic networks is proposed, based on an Artificial Neural Network (ANN). The goal of the forecasting is to estimate speed 5, 15 and 30 min ahead. Unlike similar research in this field, the investigated method aims to predict traffic speed for signalized urban road links rather than for highways or arterial roads. The methodology contains an efficient feature selection algorithm to determine the appropriate input parameters required for neural network training. As another contribution of the paper, built-in handling of incomplete data is provided, because input data (originating from traffic sensors or Floating Car Data (FCD)) might be absent or biased in practice. This input data handling ensures robust operation of speed forecasting even when data are missing. The proposed algorithm is trained, tested and analysed in a test network built in a microscopic traffic simulator, using the daily course of real-world traffic.
