A grid-based clustering method for mining frequent trips from large-scale, event-based telematics datasets

Author(s):  
Qing Cao ◽  
Bouchra Bouqata ◽  
Patricia D. Mackenzie ◽  
Daniel Messier ◽  
Joseph J. Salvo
PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262499
Author(s):  
Negin Alisoltani ◽  
Mostafa Ameli ◽  
Mahdi Zargayouna ◽  
Ludovic Leclercq

Real-time ride-sharing has become popular in recent years. However, the underlying optimization problem for this service is highly complex. Among the most critical challenges in solving it are solution quality and computation time, especially in large-scale problems where the number of received requests is huge. In this paper, we rely on an exact solving method to ensure the quality of the solution, while using AI-based techniques to limit the number of requests fed to the solver. More precisely, we propose a clustering method based on a new shareability function that places the most shareable trips inside separate clusters. Previous studies consider only spatio-temporal dependencies when clustering mobility service requests, which is not efficient for finding shareable trips. Here, we define the shareability function to consider all the different sharing states for each pair of trips. Each cluster is then handled by a proposed heuristic framework that solves the matching problem inside that cluster. As the method favors sharing, we introduce a number-of-sharing constraint that allows the service to choose the number of shared trips. To validate our proposal, we apply the method to the network of the city of Lyon, France, with half a million requests in the morning peak from 6 to 10 AM. The results demonstrate that the algorithm can provide high-quality solutions in a short time for large-scale problems. The proposed clustering method can also be used for other mobility service problems such as car-sharing and bike-sharing.
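
The abstract does not define the shareability function, so the following is only a minimal sketch of the general idea of scoring a pair of trips over its possible sharing states. Everything in it is an assumption made for illustration: the name `shareability`, the use of straight-line distance, and the "fraction of distance saved" score are hypothetical and are not taken from the paper.

```python
from math import dist


def route_length(points):
    """Total straight-line length of a route visiting the points in order."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def shareability(trip_a, trip_b):
    """Hypothetical pairwise score (an assumption, not the paper's function).

    Each trip is (origin, destination), both given as (x, y) tuples.
    Returns the fraction of distance saved by the best shared route,
    or 0.0 when no shared ordering beats serving the trips separately.
    """
    (oa, da), (ob, db) = trip_a, trip_b
    separate = dist(oa, da) + dist(ob, db)
    if separate == 0:
        return 0.0
    # The four sharing states: who is picked up first, who is dropped off first.
    orders = [
        (oa, ob, da, db),
        (oa, ob, db, da),
        (ob, oa, da, db),
        (ob, oa, db, da),
    ]
    best = min(route_length(order) for order in orders)
    return max(0.0, (separate - best) / separate)


# A short trip nested inside a longer one shares well;
# two trips in opposite directions do not.
nested = shareability(((0, 0), (10, 0)), ((1, 0), (9, 0)))
opposite = shareability(((0, 0), (10, 0)), ((10, 0), (0, 0)))
print(nested, opposite)
```

A score of this kind depends on the order of pickups and drop-offs, not only on how close the requests are in space and time, which is the contrast the abstract draws with purely spatio-temporal clustering.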


2012 ◽  
Vol 433-440 ◽  
pp. 4297-4301
Author(s):  
Hui Ru Wang ◽  
Jing Ding

For large-scale distributed interactive simulation, exchanging data among thousands of objects is both important and difficult. The purpose of the Data Distribution Management (DDM) service is to filter data and reduce the irrelevant data exchanged between federations. Grid-based algorithms can filter only part of the irrelevant data. Experimental results show that, compared with normal grid-based algorithms, the dynamic multicast method can minimize.
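
Why grid-based matching removes only part of the irrelevant data can be seen in a small sketch. It assumes axis-aligned rectangular regions given as `(xmin, ymin, xmax, ymax)` and a uniform cell size; the function names are hypothetical, and the sketch shows plain grid-based matching only, not the dynamic multicast method of the paper.

```python
from math import floor


def cells_of(region, cell_size):
    """Return the set of grid cells (i, j) that a rectangular region touches."""
    xmin, ymin, xmax, ymax = region
    return {
        (i, j)
        for i in range(floor(xmin / cell_size), floor(xmax / cell_size) + 1)
        for j in range(floor(ymin / cell_size), floor(ymax / cell_size) + 1)
    }


def grid_match(update, subscription, cell_size):
    """Grid-based test: the regions match if they share at least one cell."""
    return bool(cells_of(update, cell_size) & cells_of(subscription, cell_size))


def exact_match(update, subscription):
    """Exact test: the two rectangles actually overlap."""
    ux0, uy0, ux1, uy1 = update
    sx0, sy0, sx1, sy1 = subscription
    return ux0 <= sx1 and sx0 <= ux1 and uy0 <= sy1 and sy0 <= uy1


update = (1, 1, 2, 2)
near_miss = (7, 7, 8, 8)       # same cell as the update, but no overlap
far_away = (15, 15, 16, 16)    # different cell

print(grid_match(update, near_miss, 10), exact_match(update, near_miss))
print(grid_match(update, far_away, 10), exact_match(update, far_away))
```

The `near_miss` subscription shares a cell with the update region without overlapping it, so the grid test still delivers data that the subscriber does not need; only regions in different cells, such as `far_away`, are filtered out.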


Author(s):  
Ming Cao ◽  
Qinke Peng ◽  
Ze-Gang Wei ◽  
Fei Liu ◽  
Yi-Fan Hou

The development of high-throughput technologies has produced increasing amounts of sequence data and an increasing need for efficient clustering algorithms that can process massive volumes of sequencing data for downstream analysis. Heuristic clustering methods are widely applied for sequence clustering because of their low computational complexity. Although numerous heuristic clustering methods have been developed, they suffer from two limitations: overestimation of inferred clusters and low clustering sensitivity. To address these issues, we present a new sequence clustering method (edClust) based on Edlib, a C/C++ library for fast, exact semi-global sequence alignment, to group similar sequences. The new method edClust was tested on three large-scale sequence databases, and we compared edClust to several classic heuristic clustering methods, such as UCLUST, CD-HIT, and VSEARCH. Evaluations based on the metrics of cluster number and seed sensitivity (SS) demonstrate that edClust can produce fewer clusters than other methods and that its SS is higher than that of other methods. The source code of edClust is available from https://github.com/zhang134/EdClust.git under the GNU GPL license.
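
As a rough illustration of seed-based heuristic clustering in general, the sketch below assigns each sequence to the first seed within a distance threshold and otherwise makes it a new seed. It is not the edClust algorithm: it uses a plain-Python global Levenshtein distance in place of Edlib's semi-global alignment, and `greedy_cluster` and `max_dist_ratio` are hypothetical names chosen for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def greedy_cluster(sequences, max_dist_ratio=0.1):
    """Greedy seed clustering (a sketch under assumptions, not edClust).

    Sequences are visited from longest to shortest. Each one joins the
    first seed whose edit distance is at most max_dist_ratio * len(seed);
    otherwise it becomes the seed of a new cluster.
    """
    clusters = {}  # seed sequence -> list of member sequences
    for seq in sorted(sequences, key=len, reverse=True):
        for seed in clusters:
            if edit_distance(seq, seed) <= max_dist_ratio * len(seed):
                clusters[seed].append(seq)
                break
        else:
            clusters[seq] = [seq]
    return clusters


reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
for seed, members in greedy_cluster(reads).items():
    print(seed, members)
```

Because each sequence is compared only against seeds and joins the first acceptable one, the result depends on the visiting order and the threshold, which is one reason heuristic methods of this kind can differ in the number of clusters they infer.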

