Application of Cluster Analysis to Define Level of Service Criteria of U-turns at Median Openings

2021 ◽  
Vol 81 ◽  
pp. 1-17
Author(s):  
Smruti Sourava Mohapatra

Defining Level of Service (LOS) criteria for U-turns is important for the planning and design of transportation projects and for allocating resources. The present study attempts to establish a framework for defining LOS criteria of U-turns, keeping in mind the peculiar behavior of drivers and the heterogeneity in the urban Indian context. U-turns at uncontrolled median openings (no traffic sign, no signal, no traffic personnel) are very risky. Upon arrival at the median opening, the U-turning vehicle looks for a suitable gap in the approaching traffic stream before initiating the merging process. While waiting for a suitable gap, the U-turning vehicle experiences service delay. This service delay has been studied to quantify the delay ranges for different LOS categories. In this study, service delay data were collected from 7 different sections, and a microscopic analysis procedure was adopted to extract data from the recorded video. Subsequently, clustering was used to define the delay ranges of the different LOS categories. Four clustering methods, namely K-means, K-medoid, Affinity Propagation (AP), and Fuzzy C-means (FCM), were used. Four validation parameters were applied to determine the most suitable clustering algorithm for the study and the optimal number of clusters. AP was found to be the most suitable clustering method and 6 the optimal number of clusters; accordingly, the collected delay data were clustered into 6 categories using AP. The delay range was found to be less than 4 s for LOS A and greater than 35 s for LOS F.
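The step from clustered delays to LOS delay ranges can be illustrated with a minimal sketch. It is not the study's procedure: the study selected Affinity Propagation and 6 clusters, whereas this sketch uses a plain 1-D k-means with 3 clusters because it fits in a few lines. The delay values, the function names kmeans_1d and los_thresholds, and the midpoint rule for boundaries are all assumptions made for illustration.

```python
# Hypothetical sketch: 1-D k-means on made-up service delays (seconds).
# The study chose Affinity Propagation; k-means is swapped in for brevity.

def kmeans_1d(values, k, iters=100):
    """Cluster 1-D values into k groups, returned in ascending order."""
    data = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        # An empty group keeps its previous centre.
        new_centres = [sum(g) / len(g) if g else centres[j]
                       for j, g in enumerate(groups)]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted((g for g in groups if g), key=min)


def los_thresholds(groups):
    """Assumed boundary rule: midpoint of the gap between adjacent clusters."""
    return [(max(a) + min(b)) / 2 for a, b in zip(groups, groups[1:])]


delays = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 30.0, 31.0, 32.0]  # made-up data
groups = kmeans_1d(delays, k=3)
bounds = los_thresholds(groups)
print(groups)
print(bounds)
```

With real data, each boundary would separate two adjacent LOS categories; how the study placed its boundaries is not stated in the abstract.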

2005 ◽  
Vol 15 (05) ◽  
pp. 391-401 ◽  
Author(s):  
DIMITRIOS S. FROSSYNIOTIS ◽  
CHRISTOS PATERITSAS ◽  
ANDREAS STAFYLOPATIS

A multi-clustering fusion method is presented based on combining several runs of a clustering algorithm resulting in a common partition. More specifically, the results of several independent runs of the same clustering algorithm are appropriately combined to obtain a distinct partition of the data which is not affected by initialization and overcomes the instabilities of clustering methods. Subsequently, a fusion procedure is applied to the clusters generated during the previous phase to determine the optimal number of clusters in the data set according to some predefined criteria.


2009 ◽  
Vol 2009 ◽  
pp. 1-16 ◽  
Author(s):  
David J. Miller ◽  
Carl A. Nelson ◽  
Molly Boeka Cannon ◽  
Kenneth P. Cannon

Fuzzy clustering algorithms are helpful when there exists a dataset with subgroupings of points having indistinct boundaries and overlap between the clusters. Traditional methods have been extensively studied and used on real-world data, but require users to have some knowledge of the outcome a priori in order to determine how many clusters to look for. Additionally, iterative algorithms choose the optimal number of clusters based on one of several performance measures. In this study, the authors compare the performance of three algorithms (fuzzy c-means, Gustafson-Kessel, and an iterative version of Gustafson-Kessel) when clustering a traditional data set as well as real-world geophysics data that were collected from an archaeological site in Wyoming. Areas of interest were identified using a crisp cutoff value as well as a fuzzy α-cut to determine which provided better elimination of noise and non-relevant points. Results indicate that the α-cut method eliminates more noise than the crisp cutoff values and that the iterative version of the fuzzy clustering algorithm is able to select an optimum number of subclusters within a point set (in both the traditional and real-world data), leading to proper indication of regions of interest for further expert analysis.
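As a rough illustration of a fuzzy α-cut, the sketch below computes standard fuzzy c-means memberships for 1-D points against fixed cluster centres and keeps only points whose strongest membership reaches α. The points, centres, the value α = 0.8, and the function names are assumptions; the study's own data are 2-D geophysics readings and its exact cut rule is not given in the abstract.

```python
# Hypothetical sketch: fuzzy c-means memberships (fixed centres) and an alpha-cut.

def fcm_memberships(points, centres, m=2.0):
    """Membership of each 1-D point in each cluster; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    table = []
    for x in points:
        dists = [abs(x - c) for c in centres]
        if 0.0 in dists:
            # A point sitting exactly on a centre belongs fully to it.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists)
                   for d_i in dists]
        table.append(row)
    return table


def alpha_cut(points, memberships, alpha):
    """Keep points whose highest membership is at least alpha."""
    return [x for x, row in zip(points, memberships) if max(row) >= alpha]


points = [0.0, 1.0, 5.0, 9.0, 10.0]   # made-up data
centres = [0.0, 10.0]                 # assumed, not fitted
u = fcm_memberships(points, centres)
kept = alpha_cut(points, u, alpha=0.8)
print(kept)
```

The point at 5.0 is equidistant from both centres, so its memberships are 0.5 each and it is dropped as ambiguous, which is the kind of filtering an α-cut provides.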


1993 ◽  
Vol 5 (1) ◽  
pp. 75-88 ◽  
Author(s):  
Joachim Buhmann ◽  
Hans Kühnel

Data clustering is a complex optimization problem with applications ranging from vision and speech processing to data transmission and data storage in technical as well as in biological systems. We discuss a clustering strategy that explicitly reflects the tradeoff between simplicity and precision of a data representation. The resulting clustering algorithm jointly optimizes distortion errors and complexity costs. A maximum entropy estimation of the clustering cost function yields an optimal number of clusters, their positions, and their cluster probabilities. Our approach establishes a unifying framework for different clustering methods like K-means clustering, fuzzy clustering, entropy constrained vector quantization, or topological feature maps and competitive neural networks.


Transport ◽  
2014 ◽  
Vol 32 (2) ◽  
pp. 221-232 ◽  
Author(s):  
Rima Sahani ◽  
Prasanta Kumar Bhuyan

Level of Service (LOS) evaluation criteria for off-street pedestrian facilities are not well defined in the urban Indian context; hence in-depth research is carried out in this regard. Defining Pedestrian Level of Service (PLOS) criteria is basically a classification problem; therefore a comparative study is made using three clustering methods: Affinity Propagation (AP), Self-Organizing Map (SOM) in Artificial Neural Network (ANN), and Genetic Algorithm-Fuzzy (GA-Fuzzy) clustering. Pedestrian data are used with clustering validation measures to obtain the optimal number of clusters used in defining PLOS categories. To decide the most suitable algorithm for defining PLOS criteria for urban off-street facilities in the Indian context, Wilk’s Lambda is applied to the results of the three clustering methods. The analysis shows that GA-Fuzzy is the most suitable of the three methods. Using GA-Fuzzy clustering, the ranges of the four measuring parameters (average pedestrian space, flow rate, speed of pedestrians, and volume to capacity ratio) are defined from data collected in two mid-sized cities in the state of Odisha, India. It is also observed that at >16.53 m²/ped average space, ≤0.061 ped/sec/m flow rate, >1.21 speed and ≤0.34 v/c ratio, pedestrians can move in their desired path at LOS ‘A’ without changing movements, and this is the best condition for off-street facilities. In a pedestrian facility having ≤4.48 m²/ped average space, >0.146 ped/sec/m flow rate, ≤0.62 average speed and >1.00 v/c ratio, pedestrian movement is severely restricted and frequent collisions among users occur. The ranges of the parameters for LOS categories found in this study for Indian cities differ from those given in the HCM (Highway Capacity Manual 2010) because of differences in population density, traffic flow condition, geometric structure and some other factors.
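To make the role of Wilk’s Lambda concrete, the sketch below computes it for a single variable, where it reduces to the within-cluster sum of squares divided by the total sum of squares; values near 0 indicate well-separated clusters. The study uses four parameters, which requires the multivariate form (a ratio of determinants), so the one-variable reduction, the data, and the function name are assumptions for illustration only.

```python
# Hypothetical sketch: univariate Wilks' lambda = SS_within / SS_total.

def wilks_lambda_1d(groups):
    """Lower values mean the grouping explains more of the variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_within += sum((v - mean) ** 2 for v in g)
    return ss_within / ss_total


# Two made-up partitions of the same six values.
tight = wilks_lambda_1d([[1, 2, 3], [11, 12, 13]])
loose = wilks_lambda_1d([[1, 2, 11], [3, 12, 13]])
print(tight < loose)
```

Comparing the statistic across partitions produced by different clustering methods, as the abstract describes, then amounts to preferring the method with the smaller value.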


2000 ◽  
Vol 09 (04) ◽  
pp. 509-526 ◽  
Author(s):  
OLFA NASRAOUI ◽  
HICHEM FRIGUI ◽  
RAGHU KRISHNAPURAM ◽  
ANUPAM JOSHI

The proliferation of information on the World Wide Web has made the personalization of this information space a necessity. An important component of Web personalization is to mine typical user profiles from the vast amount of historical data stored in access logs. In the absence of any a priori knowledge, unsupervised classification or clustering methods seem to be ideally suited to analyze the semi-structured log data of user accesses. In this paper, we define the notion of a "user session" as being a temporally compact sequence of Web accesses by a user. We also define a new distance measure between two Web sessions that captures the organization of a Web site. The Competitive Agglomeration clustering algorithm which can automatically cluster data into the optimal number of components is extended so that it can work on relational data. The resulting Competitive Agglomeration for Relational Data (CARD) algorithm can deal with complex, non-Euclidean, distance/similarity measures. This algorithm was used to analyze Web server access logs successfully and obtain typical session profiles of users.
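The idea of a temporally compact user session can be sketched as splitting one user's access times wherever the gap between consecutive requests exceeds a threshold. The abstract does not state how compactness is decided, so the gap rule, the 1800-second threshold, the timestamps, and the function name are assumptions.

```python
# Hypothetical sketch: split one user's access times (seconds) into sessions.

def split_sessions(timestamps, max_gap):
    """Start a new session whenever the gap to the previous access exceeds max_gap."""
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


hits = [0, 40, 95, 4000, 4030, 9000]   # made-up access times
sessions = split_sessions(hits, max_gap=1800)
print(sessions)
```

Each resulting session would then be compared with others using a distance measure such as the one the paper defines, which is not reproduced here.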


2008 ◽  
Vol 06 (02) ◽  
pp. 261-282 ◽  
Author(s):  
AO YUAN ◽  
WENQING HE

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data generating mechanism, the parametric methods perform well, but not so when there is nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay the price of efficiency loss. In an attempt to utilize the known mixture form to increase efficiency, and to free assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach possesses the form of a parametric mixture, with no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.


2007 ◽  
Vol 16 (06) ◽  
pp. 919-934
Author(s):  
YONGGUO LIU ◽  
XIAORONG PU ◽  
YIDONG SHEN ◽  
ZHANG YI ◽  
XIAOFENG LIAO

In this article, a new genetic clustering algorithm called the Improved Hybrid Genetic Clustering Algorithm (IHGCA) is proposed to deal with the clustering problem under the criterion of minimum sum of squares clustering. In IHGCA, the improvement operation including five local iteration methods is developed to tune the individual and accelerate the convergence speed of the clustering algorithm, and the partition-absorption mutation operation is designed to reassign objects among different clusters. By experimental simulations, its superiority over some known genetic clustering methods is demonstrated.


Genetics ◽  
2001 ◽  
Vol 159 (2) ◽  
pp. 699-713
Author(s):  
Noah A Rosenberg ◽  
Terry Burke ◽  
Kari Elo ◽  
Marcus W Feldman ◽  
Paul J Freidlin ◽  
...  

We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
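A success rate of this kind can be approximated with a purity-style score: label each inferred cluster with the breed that dominates it, then count the individuals whose breed matches their cluster's label. The majority-label rule is an assumption (the abstract does not say how clusters were matched to breeds, and this rule lets two clusters map to the same breed); the data and function name are made up.

```python
# Hypothetical sketch: purity-style clustering success rate.
from collections import Counter


def clustering_success(breeds, clusters):
    """Fraction of individuals whose breed is the majority breed of their cluster."""
    by_cluster = {}
    for breed, cluster in zip(breeds, clusters):
        by_cluster.setdefault(cluster, Counter())[breed] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(breeds)


breeds = ["A", "A", "A", "B", "B", "B"]   # known breed of each individual
clusters = [0, 0, 1, 1, 1, 1]             # inferred cluster of each individual
rate = clustering_success(breeds, clusters)
print(rate)
```

Here one individual of breed A lands in the cluster dominated by breed B, so five of six individuals count as correctly placed.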

