A Novel High-Dimensional Trajectories Construction Network based on Multi-Clustering Algorithm

Author(s):  
Feiyang Ren ◽  
Yi Han ◽  
Shaohan Wang ◽  
He Jiang

Abstract A novel marine transportation network based on high-dimensional AIS data and a multi-level clustering algorithm is proposed to discover important waypoints in trajectories from selected navigation features. The network comprises two parts: the calculation of major nodes with the CLIQUE and BIRCH clustering methods, and navigation network construction with edge construction theory. Unlike state-of-the-art work on navigation clustering, which uses only ship coordinates, the proposed method incorporates higher-dimensional features such as draught, weather, and fuel consumption. From historical AIS data, more than 220,133 records covering 30 days were used to extract 440 major nodal points in under 4 minutes on an ordinary PC (Intel i5 processor). The proposed method can be applied to higher-dimensional data for better ship path planning or even national economic analysis. The current work shows good performance in distinguishing complex ship trajectories and great potential for future shipping-market analytical predictions.
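The two-stage major-node extraction described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the AIS positions are synthetic, a simple grid-density filter stands in for CLIQUE, and scikit-learn's Birch provides the second stage.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
# Synthetic AIS-like positions (lon, lat) scattered around three busy waypoints
centers = np.array([[103.8, 1.25], [104.1, 1.40], [103.6, 1.10]])
points = np.vstack([c + 0.02 * rng.standard_normal((300, 2)) for c in centers])

# CLIQUE-style step: keep only points falling in grid cells above a density threshold
cell = 0.02
idx = np.floor(points / cell).astype(int)
_, inverse, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)
dense = points[counts[inverse] >= 5]

# BIRCH step: merge the dense points into a fixed number of major nodes
model = Birch(threshold=0.01, n_clusters=3).fit(dense)
labels = model.labels_
nodes = np.array([dense[labels == k].mean(axis=0) for k in np.unique(labels)])
```

Here `nodes` plays the role of the major nodal points; in the paper these would feed the edge-construction stage to form the navigation network.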

Author(s):  
Carlotta Domeniconi

In an effort to achieve improved classifier accuracy, extensive research has been conducted in classifier ensembles. Very recently, cluster ensembles have emerged. It is well known that off-the-shelf clustering methods may discover different structures in a given set of data. This is because each clustering algorithm has its own bias resulting from the optimization of different criteria. Furthermore, there is no ground truth against which the clustering result can be validated. Thus, no cross-validation technique can be carried out to tune input parameters involved in the clustering process. As a consequence, the user is not equipped with any guidelines for choosing the proper clustering method for a given dataset. Cluster ensembles offer a solution to challenges inherent to clustering arising from its ill-posed nature. Cluster ensembles can provide more robust and stable solutions by leveraging the consensus across multiple clustering results, while averaging out spurious structures that arise due to the various biases to which each participating algorithm is tuned. In this chapter, we discuss the problem of combining multiple weighted clusters, discovered by a locally adaptive algorithm (Domeniconi, Papadopoulos, Gunopulos, & Ma, 2004), which detects clusters in different subspaces of the input space. We believe that our approach is the first attempt to design a cluster ensemble for subspace clustering (Al-Razgan & Domeniconi, 2006). Recently, several subspace clustering methods have been proposed (Parsons, Haque, & Liu, 2004). They all attempt to dodge the curse of dimensionality, which affects any algorithm in high-dimensional spaces. In high-dimensional spaces, it is highly likely that, for any given pair of points within the same cluster, there exist at least a few dimensions on which the points are far apart from each other. As a consequence, distance functions that use all input features equally may not be effective.
Furthermore, several clusters may exist in different subspaces comprised of different combinations of features. In many real-world problems, some points are correlated with respect to a given set of dimensions, while others are correlated with respect to different dimensions. Each dimension could be relevant to at least one of the clusters. Global dimensionality reduction techniques are unable to capture local correlations of data. Thus, a proper feature selection procedure should operate locally in input space. Local feature selection allows one to embed different distance measures in different regions of the input space; such distance metrics reflect local correlations of data. In (Domeniconi, Papadopoulos, Gunopulos, & Ma, 2004) we proposed a soft feature selection procedure (called LAC) that assigns weights to features according to the local correlations of data along each dimension. Dimensions along which data are loosely correlated receive a small weight, which has the effect of elongating distances along that dimension. Features along which data are strongly correlated receive a large weight, which has the effect of constricting distances along that dimension. Thus the learned weights perform a directional local reshaping of distances which allows a better separation of clusters, and therefore the discovery of different patterns in different subspaces of the original input space.
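The LAC-style local feature weighting described above can be illustrated with a small sketch. This is an assumption-laden simplification, not the authors' exact algorithm: per-dimension dispersion around a cluster centroid is mapped through an exponential to a weight, so tightly correlated dimensions count more in the distance.

```python
import numpy as np

rng = np.random.default_rng(1)
# One cluster that is tight along dimension 0 and spread along dimension 1
cluster = rng.standard_normal((200, 2)) * np.array([0.1, 2.0])
centroid = cluster.mean(axis=0)

h = 1.0  # bandwidth parameter controlling how sharply weights decay
spread = ((cluster - centroid) ** 2).mean(axis=0)  # per-dimension dispersion
w = np.exp(-spread / h)
w /= np.sqrt((w ** 2).sum())  # normalize the weight vector

# Weighted distance: large weights constrict distances along tight dimensions,
# small weights effectively elongate distances along loose ones
def weighted_dist(x, c, w):
    return np.sqrt((w * (x - c) ** 2).sum())

d = weighted_dist(cluster[0], centroid, w)
```

The weight on the tightly correlated dimension 0 ends up larger than on the loose dimension 1, which is the directional local reshaping of distances the abstract describes.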


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
JingDong Tan ◽  
RuJing Wang

Shared nearest neighbor (SNN) is a novel similarity measure that overcomes two difficulties: low similarity between samples and differing densities across classes. At present, there are two popular SNN-similarity-based clustering methods: JP clustering and SNN density-based clustering. Their clustering results depend heavily on the weight of a single edge, which makes them fragile. Motivated by the idea of smooth splicing in computational geometry, the authors design a novel SNN-similarity-based clustering algorithm within the framework of graph theory. Because it inherits the complementary intensity-smoothness principle, its generalization ability surpasses that of the two aforementioned methods. Experiments on text datasets show its effectiveness.
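The SNN similarity underlying both JP clustering and the method above can be sketched minimally: the similarity of two points is the number of neighbors their k-nearest-neighbor lists share. This is an illustrative definition of the measure only, not the authors' smooth-splicing algorithm.

```python
import numpy as np

def snn_similarity(X, k):
    """Shared-nearest-neighbor similarity matrix for points X."""
    # pairwise squared distances
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]     # k nearest neighbors of each point
    sets = [set(row) for row in knn]
    n = len(X)
    S = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            S[i, j] = len(sets[i] & sets[j])  # count of shared neighbors
    return S

# Two tight groups of points
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
S = snn_similarity(X, k=2)
```

Because similarity is defined by neighbor-list overlap rather than raw distance, SNN remains informative when classes have very different densities.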


2021 ◽  
Author(s):  
Mebarka Allaoui ◽  
Mohammed Lamine Kherfi ◽  
Abdelhakim Cheriet ◽  
Abdelhamid Bouchachia

In this paper, we introduce a novel algorithm that unifies manifold embedding and clustering (UEC), which efficiently predicts clustering assignments of high-dimensional data points in a new embedding space. The algorithm is based on a bi-objective optimisation problem combining embedding and clustering loss functions. This formulation simultaneously preserves the original structure of the data in the embedding space and produces better clustering assignments. Experimental results on a number of real-world datasets show that UEC is competitive with state-of-the-art clustering methods.
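The bi-objective idea can be made concrete with a toy objective. This is a hypothetical simplification in the spirit of UEC, not the authors' algorithm: a linear embedding is scored by a reconstruction (structure-preservation) term plus a k-means (clustering) term, with a trade-off parameter `lam`.

```python
import numpy as np

def total_loss(X, W, centers, labels, lam=0.5):
    """Toy bi-objective: embedding loss + lam * clustering loss."""
    Z = X @ W                                          # points in the embedding space
    embed_loss = ((X - Z @ W.T) ** 2).sum()            # reconstruction term
    cluster_loss = ((Z - centers[labels]) ** 2).sum()  # k-means term in embedding space
    return embed_loss + lam * cluster_loss

# Tiny worked example: project 2-D points onto the first axis
X = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.array([[1.0], [0.0]])
centers = np.array([[0.0]])
labels = np.array([0, 0])
loss = total_loss(X, W, centers, labels)  # 1.0 (embedding) + 0.5 * 1.0 (clustering)
```

Optimising both terms jointly, rather than embedding first and clustering afterwards, is what lets this style of formulation shape the embedding so that it is also easy to cluster.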


2016 ◽  
Vol 9 (4) ◽  
pp. 89 ◽  
Author(s):  
Nyoman Nyoman Budiartha RM ◽  
Ida Bagus Putu Adnyana

Bali is a well-known tourist destination worldwide. Several small islands near Bali, such as Nusa Penida, Nusa Lembongan, and Nusa Ceningan, also have good potential as tourist destinations. The inter-island transportation infrastructure located in southern Bali needs to be improved to support the economy of the region. Various issues arise in improving infrastructure provision related to the existing harbor infrastructure, such as a lack of support from the relevant institutions and port site selection. This study reviewed the factors considered in the development of marine transportation infrastructure. Support from relevant institutions, improved infrastructure, transportation network construction, and the support and participation of local communities are factors that can be used as strategies and recommendations for strengthening marine transportation.


2016 ◽  
Vol 2016 ◽  
pp. 1-9
Author(s):  
Mingming Liu ◽  
Bing Liu ◽  
Chen Zhang ◽  
Wei Sun

As is well known, traditional spectral clustering (SC) methods are developed based on the manifold assumption, namely, that two nearby data points in a high-density region of a low-dimensional data manifold share the same cluster label. However, for some high-dimensional and sparse data, this assumption may be invalid, and the clustering performance of SC degrades sharply in such cases. To solve this problem, in this paper we propose a general spectral embedded framework that embeds the true cluster assignment matrix for high-dimensional data into a nonlinear space via a predefined embedding function. Based on this framework, several algorithms are presented using different embedding functions, which aim to learn the final cluster assignment matrix and a transformation into a low-dimensional space simultaneously. More importantly, the proposed method naturally handles the out-of-sample extension problem. Experimental results on benchmark datasets demonstrate that the proposed method significantly outperforms existing clustering methods.
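For reference, the baseline SC pipeline that the framework above builds on can be sketched in a few lines: Gaussian affinity, normalized Laplacian, spectral embedding, and a simple two-way split by the sign of the second eigenvector. This is plain spectral clustering on synthetic blobs, not the paper's embedded framework.

```python
import numpy as np

rng = np.random.default_rng(2)
# Two well-separated 2-D blobs of 30 points each
X = np.vstack([rng.standard_normal((30, 2)) * 0.2,
               rng.standard_normal((30, 2)) * 0.2 + 4.0])

d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
A = np.exp(-d2 / 8.0)                      # Gaussian affinity
np.fill_diagonal(A, 0.0)
D = A.sum(1)
# Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
L = np.eye(len(X)) - A / np.sqrt(D)[:, None] / np.sqrt(D)[None, :]

vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]                       # eigenvector of the second-smallest eigenvalue
labels = (fiedler > 0).astype(int)         # two-way sign split
```

When the data violate the manifold assumption, the affinity graph no longer reflects cluster membership, which is the failure mode the proposed spectral embedded framework addresses.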


2009 ◽  
Vol 35 (7) ◽  
pp. 859-866
Author(s):  
Ming LIU ◽  
Xiao-Long WANG ◽  
Yuan-Chao LIU
